
Fix/hms flapping notifications (#444)

* Debounce HMS error tracking to prevent flapping notifications

HMS errors on newer printers (e.g. H2S) can flicker on and off
every few seconds as conditions fluctuate around thresholds. For
example, chamber temperature regulation during PETG prints sends
HMS code 0300A70000030001 repeatedly as temps oscillate. Previously
each reappearance after a brief hms:[] gap was treated as a new
error, triggering duplicate notifications. This adds a 30-second
grace period before clearing the notification tracking state so
flapping errors are only notified once per episode.
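
The debounce described above can be sketched in isolation. This is a
minimal illustration, not the code in backend/app/main.py: the
`HmsDebouncer` class, its `update` method, and the injectable `clock`
are hypothetical names, the sketch assumes new errors are the codes not
already tracked for the printer, and it uses `time.monotonic` instead of
the `time.time()` call in the actual change.

```python
import time
from typing import Callable

GRACE_SECONDS = 30.0  # mirrors the 30-second grace period in the commit


class HmsDebouncer:
    """Hypothetical helper illustrating the grace-period idea."""

    def __init__(self, grace: float = GRACE_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._grace = grace
        self._clock = clock
        self._notified: dict[int, set[str]] = {}  # printer_id -> notified codes
        self._last_seen: dict[int, float] = {}    # printer_id -> last non-empty report

    def update(self, printer_id: int, codes: set[str]) -> set[str]:
        """Return the codes that should trigger a notification now."""
        now = self._clock()
        if codes:
            # Assumption: "new" means not already tracked for this printer.
            new_codes = codes - self._notified.get(printer_id, set())
            self._notified[printer_id] = set(codes)
            self._last_seen[printer_id] = now
            return new_codes
        # Empty report: forget the printer only after the grace period.
        if printer_id in self._notified:
            if now - self._last_seen.get(printer_id, 0.0) >= self._grace:
                del self._notified[printer_id]
                self._last_seen.pop(printer_id, None)
        return set()


# Usage with a fake clock so the timeline is deterministic.
_now = [0.0]
debouncer = HmsDebouncer(clock=lambda: _now[0])
CODE = "0300A70000030001"

first = debouncer.update(1, {CODE})  # first appearance: notify
_now[0] = 5.0
debouncer.update(1, set())           # brief hms:[] gap, state is kept
_now[0] = 8.0
flap = debouncer.update(1, {CODE})   # reappears within 30 s: no notification
_now[0] = 40.0
debouncer.update(1, set())           # 32 s since last seen: state cleared
_now[0] = 41.0
again = debouncer.update(1, {CODE})  # new episode: notify again
```

Passing the clock in keeps the sketch testable without sleeping. Note
that tracking is only cleared when an empty report arrives after the
grace period has elapsed, which matches the diff in this commit.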

* build(deps): bump aquasecurity/trivy-action

Bumps the github_actions group with 1 update in the /.github/workflows directory: [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action).


Updates `aquasecurity/trivy-action` from 0.33.1 to 0.34.0
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.33.1...0.34.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-version: 0.34.0
  dependency-type: direct:production
  dependency-group: github_actions
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updated CI

* Updated CI

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MartinNYHC <mz@v8w.de>
sbcrumb 3 months ago
parent
commit
b8d48b0362
1 changed file with 14 additions and 2 deletions

backend/app/main.py (+14 -2)

@@ -1,5 +1,6 @@
 import asyncio
 import logging
+import time
 from contextlib import asynccontextmanager
 from datetime import datetime, timedelta, timezone
 from logging.handlers import RotatingFileHandler
@@ -260,6 +261,10 @@ _last_progress_milestone: dict[int, int] = {}
 # Track HMS errors that have been notified: {printer_id: set of error codes}
 # This prevents sending duplicate notifications for the same error
 _notified_hms_errors: dict[int, set[str]] = {}
+# Track when HMS errors were last seen: {printer_id: timestamp}
+# Used to debounce clearing — prevents flapping errors from re-triggering notifications
+_hms_last_seen: dict[int, float] = {}
+_HMS_CLEAR_GRACE_SECONDS = 30.0
 
 # Track timelapse file baselines at print start: {printer_id: set of video filenames}
 # Used for snapshot-diff detection at print completion
@@ -432,6 +437,7 @@ async def on_printer_status_change(printer_id: int, state: PrinterState):
 
         # Update tracking immediately to prevent duplicate notifications from concurrent callbacks
         _notified_hms_errors[printer_id] = current_error_codes
+        _hms_last_seen[printer_id] = time.time()
 
         if new_error_codes:
             # Get the actual new errors for the notification
@@ -504,9 +510,15 @@ async def on_printer_status_change(printer_id: int, state: PrinterState):
                 logging.getLogger(__name__).warning(f"HMS error notification failed: {e}")
 
     else:
-        # No HMS errors - clear tracking so future errors get notified
+        # No HMS errors — only clear tracking after a grace period to prevent
+        # flapping errors (brief hms:[] gaps) from re-triggering notifications.
+        # Some HMS codes (e.g. chamber temp regulation during PETG prints) toggle
+        # on/off every few seconds as conditions fluctuate around thresholds.
         if printer_id in _notified_hms_errors:
-            _notified_hms_errors.pop(printer_id, None)
+            last_seen = _hms_last_seen.get(printer_id, 0)
+            if time.time() - last_seen >= _HMS_CLEAR_GRACE_SECONDS:
+                _notified_hms_errors.pop(printer_id, None)
+                _hms_last_seen.pop(printer_id, None)
 
     await ws_manager.send_printer_status(
         printer_id,