
fix: cancel = layer shift, stuck "1 problem", and dropped child-logger logs

  Three bugs that surfaced together while debugging an H2D cancel:

  1. Cancelling a print stamped failure_reason="Layer shift" in archives
     AND left the printer card stuck on "1 problem" forever. Four causes:
     (a) POST /printers/{id}/print/stop never set the user-stopped flag, so
         on_print_complete couldn't override "failed" -> "cancelled".
     (b) HMS-derived failure_reason heuristic mapped any module-0x0C HMS to
         "Layer shift". Module 0x0C is "Motion Controller" broadly (includes
         cameras, markers, AND the cancel-sequence echo 0C00_001B). Real
         layer-shift codes live in module 0x03. Same false-positive class
         existed for "Filament runout" (any 0x07) and "Clogged nozzle" (any
         0x05). Replaced with a 23-code curated short-code map; unknowns
         leave failure_reason=None.
     (c) Cancel-echo HMS codes (0300_400C "The task was canceled.",
         0500_400E "Printing was cancelled.") were polluting state.hms_errors
         via both the hms[] and print_error parse paths. Filter them at
         parse time so the frontend never sees them.
     (d) Frontend bucketed gcode_state="FAILED" as a problem unconditionally.
         Real failures attach an HMS error; user-cancels don't — so FAILED-
         without-HMS now buckets as "finished" and only escalates to "error"
         when there's an active known HMS.

  2. logs/bambuddy.log was silently dropping records from named child
     loggers. TraceIDFilter was attached to root_logger, but Python's
     logging only invokes a Logger's filters on records originating at that
     logger — propagated child-logger records skipped it, formatter raised
     KeyError, handler.handleError dropped the record. Moved the filter
     from root_logger.addFilter() to handler.addFilter() on each handler,
     matching the filter's own docstring guidance.
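  The propagation behaviour behind bug 2 can be reproduced in isolation. The following is a minimal sketch, not the project's code: `TraceFilter`, `emit_via_child`, and the `demo_parent_*` logger names are hypothetical stand-ins, and `logging.raiseExceptions` is switched off only to keep the failing case quiet.

```python
import io
import logging


class TraceFilter(logging.Filter):
    """Hypothetical stand-in for TraceIDFilter: stamps a trace_id on each record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = "-"
        return True


def emit_via_child(attach_to_handler: bool) -> str:
    """Log once from a child logger and return what the parent's handler wrote."""
    name = f"demo_parent_{'handler' if attach_to_handler else 'logger'}"
    parent = logging.getLogger(name)
    parent.setLevel(logging.INFO)
    parent.propagate = False  # keep the demo isolated from the real root logger

    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(trace_id)s|%(message)s"))
    parent.addHandler(handler)

    trace_filter = TraceFilter()
    if attach_to_handler:
        handler.addFilter(trace_filter)  # runs for every record the handler sees
    else:
        parent.addFilter(trace_filter)  # runs only for records created on `parent`

    previous = logging.raiseExceptions
    logging.raiseExceptions = False  # suppress handleError's stderr traceback
    try:
        logging.getLogger(f"{name}.child").info("hi from child")
    finally:
        logging.raiseExceptions = previous
        parent.removeHandler(handler)
    return stream.getvalue()


print(repr(emit_via_child(attach_to_handler=False)))  # record is lost
print(repr(emit_via_child(attach_to_handler=True)))   # record is formatted
```

  With the filter on the logger, the child's record reaches the handler without a `trace_id` attribute, the formatter fails on the missing field, and nothing is written. With the filter on the handler, the same record is annotated just before formatting.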

  derive_failure_reason() extracted as a pure function for testability.
  status="cancelled" now symmetrically yields "User cancelled" alongside
  "aborted".
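  As a rough illustration of matching by full short code rather than by module, the sketch below rebuilds the "MMMM_CCCC" form from the high 16 bits of `attr` and the low 16 bits of `code`. The names `short_code`, `reason_for`, and `KNOWN_REASONS` are hypothetical, and the map is only a three-entry subset of the curated table in backend/app/main.py.

```python
from typing import Optional, Union

# Illustrative subset only; the real curated map has 23 entries.
KNOWN_REASONS = {
    "0300_4057": "Layer shift",
    "07FF_8011": "Filament runout",
    "0300_4006": "Clogged nozzle",
}


def short_code(attr: int, code: Union[int, str]) -> str:
    """Join the high 16 bits of attr and the low 16 bits of code as MMMM_CCCC."""
    code_int = int(code, 16) if isinstance(code, str) else code
    return f"{(attr >> 16) & 0xFFFF:04X}_{code_int & 0xFFFF:04X}"


def reason_for(status: str, hms_errors: Optional[list]) -> Optional[str]:
    """Return a failure reason, or None when no known short code matches."""
    if status in ("aborted", "cancelled"):
        return "User cancelled"
    if status != "failed":
        return None
    for err in hms_errors or []:
        key = short_code(err.get("attr", 0), err.get("code", 0))
        if key in KNOWN_REASONS:
            return KNOWN_REASONS[key]
    return None


# The H2D cancel echo lives in module 0x0C but is not a layer shift.
print(short_code(0x0C000C00, "0x2001b"))  # 0C00_001B
print(reason_for("failed", [{"attr": 0x0C000C00, "code": "0x2001b"}]))  # None
print(reason_for("failed", [{"attr": 0x03000000, "code": "0x4057"}]))  # Layer shift
```

  Keeping the lookup a pure function of `status` and the HMS list is what makes cases like the H2D echo cheap to pin down in unit tests.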

  20 regression tests across:
  - backend/tests/unit/test_failure_reason_derivation.py (11)
  - backend/tests/unit/services/test_bambu_mqtt.py::TestHMSUserActionFiltering (4)
  - backend/tests/unit/test_trace.py::TestFilterMustBeAttachedToHandlerNotLogger (1)
  - frontend/src/__tests__/pages/PrintersPageBucketing.test.ts (5; includes
    the H2D-cancel-echo "FAILED + only unknown HMS" case)
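  The bucketing rule from item 1(d) is implemented in TypeScript on the frontend. The sketch below restates it in Python purely for illustration; `classify`, `has_known_hms`, and the one-entry `KNOWN_HMS_CODES` set are assumptions, not the project's identifiers or data.

```python
from typing import Optional

# Hypothetical stand-in for the frontend's set of known HMS short codes.
KNOWN_HMS_CODES = {"0300_4057"}

STATE_BUCKETS = {
    "RUNNING": "printing",
    "PAUSE": "paused",
    "FINISH": "finished",
    "FAILED": "finished",  # FAILED without a known HMS is the post-cancel state
}


def has_known_hms(hms_errors: Optional[list]) -> bool:
    """True when at least one HMS entry maps to a known short code."""
    for err in hms_errors or []:
        module = (err["attr"] >> 16) & 0xFFFF
        code = int(err["code"], 16) & 0xFFFF
        if f"{module:04X}_{code:04X}" in KNOWN_HMS_CODES:
            return True
    return False


def classify(connected: bool, state: Optional[str], hms_errors: Optional[list] = None) -> str:
    """Bucket a printer: offline first, then known HMS, then gcode_state."""
    if not connected:
        return "offline"
    if has_known_hms(hms_errors):
        return "error"
    return STATE_BUCKETS.get(state, "idle")


print(classify(True, "FAILED", []))  # finished
print(classify(True, "FAILED", [{"code": "0x4057", "attr": 0x03000000}]))  # error
print(classify(True, "FAILED", [{"code": "0x2001b", "attr": 0x0C000C00}]))  # finished
```

  Checking HMS before the state switch means a FAILED printer with a real fault is counted once, as an error, and never again as finished.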
maziggy, 1 month ago
Parent
commit
88b5f56eb2

File diff suppressed because it is too large
+ 0 - 0
CHANGELOG.md


+ 11 - 0
backend/app/api/routes/printers.py

@@ -2323,6 +2323,17 @@ async def stop_print(
     if not success:
         raise HTTPException(500, "Failed to stop print")
 
+    # Mark this printer as user-stopped so on_print_complete reclassifies
+    # the resulting "failed"/"aborted" MQTT status as "cancelled" — otherwise
+    # the HMS heuristic in _dispatch_archive_update mislabels user-cancels
+    # (e.g. the H2D's cancel-sequence module-0x0C HMS) as "Layer shift".
+    try:
+        from backend.app.main import mark_printer_stopped_by_user
+
+        mark_printer_stopped_by_user(printer_id)
+    except Exception as _mark_err:
+        logger.warning("Failed to mark printer %s as user-stopped: %s", printer_id, _mark_err)
+
     return {"success": True, "message": "Print stop command sent"}
 
 

+ 92 - 33
backend/app/main.py

@@ -244,20 +244,26 @@ root_logger.setLevel(log_level)
 
 # Trace-ID injection: this filter populates record.trace_id from the
 # per-request ContextVar so the format string above can reference it.
-# Attached to the root logger so EVERY record (application, uvicorn,
-# third-party) gets the field — without it, the format string would
-# raise KeyError on records that don't naturally carry a trace_id
-# attribute. See backend/app/core/trace.py for the ContextVar that
-# the filter reads.
+# Attached to each HANDLER (not the root logger) because Python's
+# logging semantics only invoke a logger's filters on records that
+# *originated* at that logger — records propagated up from child
+# loggers (every named logger in the app) never trigger root's filter.
+# Putting it on the handlers means every record any handler emits gets
+# trace_id injected just before the formatter runs, regardless of which
+# logger created the record. Without this, the formatter raises
+# KeyError on every child-logger record and the record is silently
+# dropped — which is exactly the "logs/bambuddy.log only shows logs
+# partially" bug we hit. See backend/app/core/trace.py for the
+# ContextVar the filter reads.
 from backend.app.core.trace import TraceIDFilter
 
 _trace_id_filter = TraceIDFilter()
-root_logger.addFilter(_trace_id_filter)
 
 # Console handler - always enabled
 console_handler = logging.StreamHandler()
 console_handler.setLevel(log_level)
 console_handler.setFormatter(logging.Formatter(log_format))
+console_handler.addFilter(_trace_id_filter)
 root_logger.addHandler(console_handler)
 
 # File handler - only in production or if explicitly enabled
@@ -271,6 +277,7 @@ if app_settings.log_to_file:
     )
     file_handler.setLevel(log_level)
     file_handler.setFormatter(logging.Formatter(log_format))
+    file_handler.addFilter(_trace_id_filter)
     root_logger.addHandler(file_handler)
     logging.info("Logging to file: %s", log_file)
 
@@ -342,6 +349,77 @@ _bed_cool_waiters: dict[int, dict] = {}
 # as "cancelled" (stopped by user) so the correct notification email is sent.
 _user_stopped_printers: set[int] = set()
 
+
+# HMS short-code → human-readable failure reason. Used by _dispatch_archive_update
+# when status="failed" to label the print's failure_reason in archives.
+#
+# Earlier code matched on `module` alone (e.g. "any module 0x0C HMS → Layer shift"),
+# which is wrong on two counts:
+#   1. Real layer-shift codes live in module 0x03 (see Bambu wiki), not 0x0C.
+#   2. Module 0x0C is "Motion Controller" — broad category that also covers cameras
+#      and visual markers, AND the H2D firmware emits a 0x0C HMS (0C00_001B, not in
+#      the public wiki) as part of its user-cancel sequence. Matching on the module
+#      alone caused user-cancellations to be archived as "Layer shift" failures.
+# We now match by full short code only — anything not in this map leaves
+# failure_reason=None rather than guessing.
+_HMS_FAILURE_REASONS: dict[str, str] = {
+    # Layer shift / step loss
+    "0300_4057": "Layer shift",
+    "0300_4068": "Layer shift",
+    "0300_800C": "Layer shift",
+    # Filament runout (printer-side & per-AMS-slot)
+    "0300_8004": "Filament runout",
+    "0700_8011": "Filament runout",
+    "0701_8011": "Filament runout",
+    "0702_8011": "Filament runout",
+    "0703_8011": "Filament runout",
+    "0704_8011": "Filament runout",
+    "0705_8011": "Filament runout",
+    "0706_8011": "Filament runout",
+    "0707_8011": "Filament runout",
+    "07FF_8011": "Filament runout",
+    # Clogged nozzle / extruder
+    "0300_4006": "Clogged nozzle",
+    "0300_8016": "Clogged nozzle",
+    "0300_801C": "Clogged nozzle",
+    "0700_8003": "Clogged nozzle",
+    "0700_8007": "Clogged nozzle",
+    "0700_8013": "Clogged nozzle",
+    "0701_8003": "Clogged nozzle",
+    "0701_8007": "Clogged nozzle",
+    "0701_8013": "Clogged nozzle",
+    "0702_8003": "Clogged nozzle",
+}
+
+
+def _hms_short_code(attr: int, code: int | str) -> str:
+    """Build the canonical "MMMM_CCCC" HMS short code from raw attr/code values."""
+    if isinstance(code, str):
+        code_int = int(code.replace("0x", ""), 16) if code else 0
+    else:
+        code_int = int(code or 0)
+    attr_int = int(attr or 0)
+    return f"{(attr_int >> 16) & 0xFFFF:04X}_{code_int & 0xFFFF:04X}"
+
+
+def derive_failure_reason(status: str, hms_errors: list[dict] | None) -> str | None:
+    """Derive a human-readable failure_reason for an archived print.
+
+    Returns "User cancelled" for cancelled/aborted prints; for failed prints,
+    returns the first matching reason from _HMS_FAILURE_REASONS, or None when
+    no HMS code matches (don't guess — null is honest).
+    """
+    if status in ("aborted", "cancelled"):
+        return "User cancelled"
+    if status != "failed":
+        return None
+    for err in hms_errors or []:
+        short_code = _hms_short_code(err.get("attr", 0), err.get("code", 0))
+        if short_code in _HMS_FAILURE_REASONS:
+            return _HMS_FAILURE_REASONS[short_code]
+    return None
+
+
 # Track created_by_id for expected prints so the user email can be sent even when
 # the archive itself doesn't have created_by_id set (e.g. library-file-based prints).
 # {(printer_id, filename): created_by_id}
@@ -3052,33 +3130,14 @@ async def on_print_complete(printer_id: int, data: dict):
             service = ArchiveService(db)
             status = data.get("status", "completed")
 
-            # Auto-detect failure reason
-            failure_reason = None
-            if status == "aborted":
-                failure_reason = "User cancelled"
-                logger.info("[ARCHIVE] Print was aborted by user, setting failure_reason='User cancelled'")
-            elif status == "failed":
-                # Try to determine failure reason from HMS errors
-                hms_errors = data.get("hms_errors", [])
-                if hms_errors:
-                    logger.info("[ARCHIVE] HMS errors at failure: %s", hms_errors)
-                    # Map known HMS error modules to failure reasons
-                    # Module 0x07 = Filament, 0x0C = MC (Motion Controller), etc.
-                    for err in hms_errors:
-                        module = err.get("module", 0)
-                        if module == 0x07:  # Filament module
-                            failure_reason = "Filament runout"
-                            break
-                        elif module == 0x0C:  # Motion controller
-                            failure_reason = "Layer shift"
-                            break
-                        elif module == 0x05:  # Nozzle/extruder
-                            failure_reason = "Clogged nozzle"
-                            break
-                    if failure_reason:
-                        logger.info("[ARCHIVE] Detected failure_reason from HMS: %s", failure_reason)
-                else:
-                    logger.info("[ARCHIVE] No HMS errors available to determine failure reason")
+            hms_errors = data.get("hms_errors", []) if status == "failed" else None
+            if hms_errors:
+                logger.info("[ARCHIVE] HMS errors at failure: %s", hms_errors)
+            failure_reason = derive_failure_reason(status, hms_errors)
+            if failure_reason:
+                logger.info("[ARCHIVE] failure_reason=%r (status=%s)", failure_reason, status)
+            elif status == "failed" and hms_errors:
+                logger.info("[ARCHIVE] HMS errors present but none matched a known failure-reason short code")
 
             await service.update_archive_status(
                 archive_id,

+ 43 - 16
backend/app/services/bambu_mqtt.py

@@ -52,6 +52,19 @@ class HMSError:
     message: str = ""
 
 
+# HMS short codes the firmware emits during normal user-cancel sequences.
+# These aren't faults — they're status echoes that confirm the cancel happened.
+# Filtering them at parse-time keeps them out of state.hms_errors entirely,
+# so they don't drive the printer card's "X problem" badge, the red pip, or
+# any other consumer that treats hms_errors as the active-fault list.
+_HMS_USER_ACTION_CODES: frozenset[str] = frozenset(
+    {
+        "0300_400C",  # "The task was canceled."
+        "0500_400E",  # "Printing was cancelled."
+    }
+)
+
+
 @dataclass
 class KProfile:
     """Pressure advance (K) calibration profile from printer."""
@@ -2212,6 +2225,14 @@ class BambuMQTTClient:
                         # indicators that some firmware sends during normal printing.
                         if code < 0x4000:
                             continue
+                        # Skip user-action echoes — the printer firmware emits these
+                        # as part of normal user-cancel sequences. They're not faults
+                        # and shouldn't count toward "X problem" badges or surface as
+                        # red pips on the printer card. Backend's notification path
+                        # already suppresses 0500_400E for the same reason.
+                        short_code = f"{(attr >> 16) & 0xFFFF:04X}_{code & 0xFFFF:04X}"
+                        if short_code in _HMS_USER_ACTION_CODES:
+                            continue
                         self.state.hms_errors.append(
                             HMSError(
                                 code=f"0x{code:x}" if code else "0x0",
@@ -2248,23 +2269,29 @@ class BambuMQTTClient:
                         f"[{self.serial_number}] print_error: {print_error} (0x{print_error:08x}) -> short_code={short_code}"
                     )
 
-                    # Only add if not already in HMS errors (avoid duplicates)
-                    existing_short_codes = set()
-                    for e in self.state.hms_errors:
-                        # Extract short code from existing errors
-                        e_module = (e.attr >> 16) & 0xFFFF
-                        e_error = int(e.code.replace("0x", ""), 16) if e.code else 0
-                        existing_short_codes.add(f"{e_module:04X}_{e_error:04X}")
-
-                    if short_code not in existing_short_codes:
-                        self.state.hms_errors.append(
-                            HMSError(
-                                code=f"0x{error:x}",
-                                attr=print_error,  # Store full value for display
-                                module=module >> 8,  # High byte of module (e.g., 0x05)
-                                severity=3,  # Warning level for print_error
+                    # Same user-action filter as the hms[] branch above — print_error
+                    # carries the same cancel echoes (e.g. 0500_400E) and they must
+                    # not surface as faults on the printer card.
+                    if short_code in _HMS_USER_ACTION_CODES:
+                        pass  # cancel echo — silently drop
+                    else:
+                        # Only add if not already in HMS errors (avoid duplicates)
+                        existing_short_codes = set()
+                        for e in self.state.hms_errors:
+                            # Extract short code from existing errors
+                            e_module = (e.attr >> 16) & 0xFFFF
+                            e_error = int(e.code.replace("0x", ""), 16) if e.code else 0
+                            existing_short_codes.add(f"{e_module:04X}_{e_error:04X}")
+
+                        if short_code not in existing_short_codes:
+                            self.state.hms_errors.append(
+                                HMSError(
+                                    code=f"0x{error:x}",
+                                    attr=print_error,  # Store full value for display
+                                    module=module >> 8,  # High byte of module (e.g., 0x05)
+                                    severity=3,  # Warning level for print_error
+                                )
                             )
-                        )
 
         # Parse home_flag first so SD-card detection below can prefer it.
         # Bit 8 = HAS_SDCARD_NORMAL, bit 9 = HAS_SDCARD_ABNORMAL, bit 11 = store-to-SD,

+ 65 - 0
backend/tests/unit/services/test_bambu_mqtt.py

@@ -4158,3 +4158,68 @@ class TestZombieSessionDetection:
         mqtt_client._update_state({"gcode_state": "IDLE"})
         assert mqtt_client._ams_cmd_unanswered == 0
         assert mqtt_client._last_ams_cmd_time > 0  # still pending
+
+
+class TestHMSUserActionFiltering:
+    """HMS short codes the printer firmware emits during user-cancel sequences
+    must not appear in state.hms_errors — they're status echoes, not faults,
+    and shouldn't drive the printer card's "X problem" badge or red pip."""
+
+    @pytest.fixture
+    def mqtt_client(self):
+        from backend.app.services.bambu_mqtt import BambuMQTTClient
+
+        return BambuMQTTClient(
+            ip_address="192.168.1.100",
+            serial_number="TEST_HMS",
+            access_code="12345678",
+        )
+
+    def test_task_cancelled_echo_0300_400c_filtered(self, mqtt_client):
+        """0300_400C ("The task was canceled.") is the user-cancel echo that was
+        leaving the printer card stuck on "1 problem" after every stop."""
+        mqtt_client._update_state({"hms": [{"attr": 0x03000300, "code": 0x400C}]})
+        assert mqtt_client.state.hms_errors == []
+
+    def test_printing_cancelled_echo_0500_400e_filtered(self, mqtt_client):
+        """0500_400E ("Printing was cancelled.") — the corresponding nozzle-module
+        echo that the backend notification path was already suppressing for the
+        same reason."""
+        mqtt_client._update_state({"hms": [{"attr": 0x05000300, "code": 0x400E}]})
+        assert mqtt_client.state.hms_errors == []
+
+    def test_real_layer_shift_still_passes_through(self, mqtt_client):
+        """0300_4057 (Z-axis step loss) is a real fault and must NOT be filtered."""
+        mqtt_client._update_state({"hms": [{"attr": 0x03000100, "code": 0x4057}]})
+        assert len(mqtt_client.state.hms_errors) == 1
+        assert mqtt_client.state.hms_errors[0].code == "0x4057"
+
+    def test_filter_only_drops_user_action_codes_keeps_concurrent_real_faults(self, mqtt_client):
+        """When the user cancels mid-fault, the firmware sends the real fault HMS
+        alongside the cancel echo. Drop only the echo, keep the real fault."""
+        mqtt_client._update_state(
+            {
+                "hms": [
+                    {"attr": 0x03000300, "code": 0x400C},  # cancel echo — drop
+                    {"attr": 0x07FF0200, "code": 0x8011},  # filament runout — keep
+                ]
+            }
+        )
+        codes = [e.code for e in mqtt_client.state.hms_errors]
+        assert "0x8011" in codes
+        assert "0x400c" not in codes
+        assert len(mqtt_client.state.hms_errors) == 1
+
+    def test_print_error_path_also_filters_cancel_echo(self, mqtt_client):
+        """`print_error` is a second route that appends into state.hms_errors. The
+        same user-action codes (e.g. 0500_400E "Printing was cancelled") must be
+        filtered there too — otherwise the printer card stays on "1 problem"
+        when the firmware reports the cancel via print_error rather than hms[]."""
+        mqtt_client._update_state({"print_error": 0x0500_400E})
+        assert mqtt_client.state.hms_errors == []
+
+    def test_print_error_path_passes_real_errors_through(self, mqtt_client):
+        """Real print_error codes still reach state.hms_errors."""
+        mqtt_client._update_state({"print_error": 0x0500_8061})
+        assert len(mqtt_client.state.hms_errors) == 1
+        assert mqtt_client.state.hms_errors[0].code == "0x8061"

+ 99 - 0
backend/tests/unit/test_failure_reason_derivation.py

@@ -0,0 +1,99 @@
+"""Regression tests for derive_failure_reason in backend.app.main.
+
+Ensures user-cancelled prints don't get archived as "Layer shift" — the bug
+seen on H2D where the firmware's cancel-sequence module-0x0C HMS was being
+matched by the old broad heuristic (`module == 0x0C → Layer shift`).
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from backend.app.main import derive_failure_reason
+
+# ---------------------------------------------------------------------------
+# Status-based reasons (no HMS lookup needed)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parametrize("status", ["aborted", "cancelled"])
+def test_user_cancel_status_yields_user_cancelled(status: str) -> None:
+    assert derive_failure_reason(status, None) == "User cancelled"
+    assert derive_failure_reason(status, []) == "User cancelled"
+
+
+def test_completed_status_returns_none() -> None:
+    assert derive_failure_reason("completed", None) is None
+
+
+# ---------------------------------------------------------------------------
+# H2D regression: cancel-sequence HMS must not be labelled "Layer shift"
+# ---------------------------------------------------------------------------
+
+
+def test_h2d_cancel_module_0x0c_is_not_layer_shift() -> None:
+    """0C00_001B is the H2D cancel-sequence echo, not a real layer-shift code.
+
+    The old `module == 0x0C → Layer shift` heuristic mislabeled every user-cancel
+    on H2D as a layer-shift failure. This pins that code to None.
+    """
+    h2d_cancel_hms = [
+        {"code": "0x2001b", "attr": 0x0C000C00, "module": 0x0C, "severity": 1},
+        {"code": "0x400c", "attr": 0x03002C0C, "module": 0x03, "severity": 3},
+    ]
+    assert derive_failure_reason("failed", h2d_cancel_hms) is None
+
+
+def test_unknown_module_0x0c_code_returns_none() -> None:
+    """Any module-0x0C code we don't have an explicit short-code mapping for must
+    leave failure_reason=None — being honest beats guessing."""
+    unknown_hms = [{"code": "0x4099", "attr": 0x0C00_0000, "module": 0x0C, "severity": 2}]
+    assert derive_failure_reason("failed", unknown_hms) is None
+
+
+# ---------------------------------------------------------------------------
+# Genuine failure modes still classified correctly
+# ---------------------------------------------------------------------------
+
+
+def test_real_layer_shift_short_code_detected() -> None:
+    """0300_4057 ("Z-axis step loss") is a real layer-shift code from the wiki."""
+    hms = [{"code": "0x4057", "attr": 0x0300_0000, "module": 0x03, "severity": 1}]
+    assert derive_failure_reason("failed", hms) == "Layer shift"
+
+
+def test_real_filament_runout_short_code_detected() -> None:
+    """07FF_8011 = external filament runout."""
+    hms = [{"code": "0x8011", "attr": 0x07FF_0000, "module": 0x07, "severity": 2}]
+    assert derive_failure_reason("failed", hms) == "Filament runout"
+
+
+def test_real_clogged_nozzle_short_code_detected() -> None:
+    """0300_4006 = "The nozzle is clogged"."""
+    hms = [{"code": "0x4006", "attr": 0x0300_0000, "module": 0x03, "severity": 1}]
+    assert derive_failure_reason("failed", hms) == "Clogged nozzle"
+
+
+def test_first_matching_code_wins() -> None:
+    """When multiple known codes are present, the first one in the list wins."""
+    hms = [
+        {"code": "0x4057", "attr": 0x0300_0000, "module": 0x03, "severity": 1},  # layer shift
+        {"code": "0x8011", "attr": 0x07FF_0000, "module": 0x07, "severity": 2},  # filament runout
+    ]
+    assert derive_failure_reason("failed", hms) == "Layer shift"
+
+
+def test_failed_with_no_hms_returns_none() -> None:
+    assert derive_failure_reason("failed", None) is None
+    assert derive_failure_reason("failed", []) is None
+
+
+# ---------------------------------------------------------------------------
+# Code-format tolerance (MQTT may send int or hex string)
+# ---------------------------------------------------------------------------
+
+
+def test_int_code_field_accepted() -> None:
+    """The MQTT parser sometimes leaves `code` as an int rather than a hex string."""
+    hms = [{"code": 0x4057, "attr": 0x0300_0000, "module": 0x03, "severity": 1}]
+    assert derive_failure_reason("failed", hms) == "Layer shift"

+ 34 - 0
backend/tests/unit/test_trace.py

@@ -183,3 +183,37 @@ class TestNormaliseInboundTraceId:
         chars; one off-by-one would silently reject UUID-like IDs that
         happen to land at the boundary."""
         assert normalise_inbound_trace_id("a" * 64) == "a" * 64
+
+
+class TestFilterMustBeAttachedToHandlerNotLogger:
+    """A filter on a Logger only fires for records that *originate* at that
+    logger — records propagated up from child loggers (every backend.* logger
+    in the app) never trigger it. Attaching TraceIDFilter to root_logger meant
+    child-logger records arrived at the file handler with no trace_id
+    attribute, the formatter raised KeyError, and the record was silently
+    dropped — manifesting as "logs/bambuddy.log only shows logs partially".
+    The filter must live on each *handler* so every record passing through it
+    gets annotated regardless of which logger emitted it."""
+
+    def test_handler_level_filter_fires_on_child_logger_propagation(self):
+        import io
+
+        root = logging.getLogger("test_trace_filter_handler_path")
+        root.setLevel(logging.DEBUG)
+        root.handlers.clear()
+        root.filters.clear()
+
+        captured = io.StringIO()
+        handler = logging.StreamHandler(captured)
+        handler.setFormatter(logging.Formatter("%(trace_id)s|%(message)s"))
+        handler.addFilter(TraceIDFilter())
+        root.addHandler(handler)
+
+        child = logging.getLogger("test_trace_filter_handler_path.child")
+        try:
+            child.info("hi from child")
+            handler.flush()
+            assert f"{TRACE_ID_PLACEHOLDER}|hi from child" in captured.getvalue()
+        finally:
+            root.handlers.clear()
+            root.filters.clear()

+ 93 - 0
frontend/src/__tests__/pages/PrintersPageBucketing.test.ts

@@ -0,0 +1,93 @@
+/**
+ * Regression tests for the printer-status bucketing logic in PrintersPage.tsx.
+ *
+ * The bug: a printer in gcode_state="FAILED" with no active HMS errors was
+ * being counted as a "problem" in the header badge — this is the post-cancel
+ * terminal state, not a real fault. After cancelling a print on h2d-1 the
+ * printer card kept showing "1 problem" forever even after the HMS list was
+ * empty, until the next print started.
+ *
+ * The fix: FAILED-without-HMS is bucketed as "finished" (same operator
+ * meaning: print ended, plate may need clearing). FAILED-with-HMS still
+ * counts as a problem because there's a real fault to investigate.
+ *
+ * Mirrors the logic at PrintersPage.tsx:917-948 and the classifyPrinterStatus
+ * helper at PrintersPage.tsx:1028 — kept as inline copies so this test
+ * doesn't need the helpers to be exported.
+ */
+import { describe, it, expect } from 'vitest';
+
+type Status = {
+  connected: boolean;
+  state: string | null;
+  hms_errors?: { code: string; attr: number; severity: number }[];
+};
+
+type Bucket = 'printing' | 'paused' | 'finished' | 'idle' | 'offline' | 'error';
+
+const KNOWN_HMS_CODES = new Set(['0300_4057', '0500_4038']);
+
+function filterKnownHMSErrors(errors: Status['hms_errors']): NonNullable<Status['hms_errors']> {
+  return (errors ?? []).filter((e) => {
+    const codeNum = parseInt(e.code.replace('0x', ''), 16) || 0;
+    const module = ((e.attr >> 16) & 0xFFFF).toString(16).padStart(4, '0').toUpperCase();
+    const code = (codeNum & 0xFFFF).toString(16).padStart(4, '0').toUpperCase();
+    return KNOWN_HMS_CODES.has(`${module}_${code}`);
+  });
+}
+
+function classifyPrinterStatus(status: Status | undefined): Bucket {
+  if (!status?.connected) return 'offline';
+  const knownHms = filterKnownHMSErrors(status.hms_errors);
+  if (knownHms.length > 0) return 'error';
+  switch (status.state) {
+    case 'RUNNING': return 'printing';
+    case 'PAUSE': return 'paused';
+    case 'FINISH': return 'finished';
+    case 'FAILED': return 'finished';
+    default: return 'idle';
+  }
+}
+
+describe('FAILED-without-HMS bucketing', () => {
+  it('classifies FAILED with no HMS errors as "finished" (post-cancel terminal state, not a problem)', () => {
+    const cancelledPrinter: Status = {
+      connected: true,
+      state: 'FAILED',
+      hms_errors: [],
+    };
+    expect(classifyPrinterStatus(cancelledPrinter)).toBe('finished');
+  });
+
+  it('classifies FAILED + active known HMS as "error"', () => {
+    const reallyFailedPrinter: Status = {
+      connected: true,
+      state: 'FAILED',
+      hms_errors: [{ code: '0x4057', attr: 0x0300_0000, severity: 1 }],
+    };
+    expect(classifyPrinterStatus(reallyFailedPrinter)).toBe('error');
+  });
+
+  it('classifies FAILED + only unknown HMS as "finished" (unknown codes are not "real" problems by our taxonomy)', () => {
+    const cancelEcho: Status = {
+      connected: true,
+      state: 'FAILED',
+      hms_errors: [{ code: '0x2001b', attr: 0x0C00_0C00, severity: 1 }], // 0C00_001B not in known set
+    };
+    expect(classifyPrinterStatus(cancelEcho)).toBe('finished');
+  });
+
+  it('classifies FINISH as "finished" (unchanged baseline)', () => {
+    const completedPrinter: Status = { connected: true, state: 'FINISH' };
+    expect(classifyPrinterStatus(completedPrinter)).toBe('finished');
+  });
+
+  it('classifies disconnected printer as "offline" (HMS / state irrelevant)', () => {
+    const offline: Status = {
+      connected: false,
+      state: 'FAILED',
+      hms_errors: [{ code: '0x4057', attr: 0x0300_0000, severity: 1 }],
+    };
+    expect(classifyPrinterStatus(offline)).toBe('offline');
+  });
+});

+ 6 - 2
frontend/src/components/BulkPrinterToolbar.tsx

@@ -94,12 +94,16 @@ export function BulkPrinterToolbar({
   printers.forEach(p => {
     const status = queryClient.getQueryData<PrinterStatus>(['printerStatus', p.id]);
     if (!status || !status.connected) { stateCounts.offline++; return; }
-    if (status.hms_errors && filterKnownHMSErrors(status.hms_errors).length > 0) stateCounts.error++;
+    const hasKnownHms = status.hms_errors ? filterKnownHMSErrors(status.hms_errors).length > 0 : false;
+    if (hasKnownHms) stateCounts.error++;
     switch (status.state) {
       case 'RUNNING': stateCounts.printing++; break;
       case 'PAUSE': stateCounts.paused++; break;
       case 'FINISH': stateCounts.finished++; break;
-      case 'FAILED': stateCounts.error++; break;
+      // FAILED without an active HMS error is the post-cancel terminal state —
+      // group with FINISH. When HMS is active the error bucket is already
+      // incremented above; don't double-count.
+      case 'FAILED': if (!hasKnownHms) stateCounts.finished++; break;
       default: stateCounts.idle++; break;
     }
   });
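
The counting rule in this hunk can be sketched as a pure function. This is an illustration only: `HmsEntry`, `PrinterStatusLike`, `Counts`, `countStates` and the `isKnown` predicate are hypothetical names. The component itself reads statuses from the query cache and calls `filterKnownHMSErrors`.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
type HmsEntry = { code: string; attr: number; severity: number };
type PrinterStatusLike = { connected: boolean; state?: string; hms_errors?: HmsEntry[] };
type Counts = {
  offline: number; error: number; printing: number;
  paused: number; finished: number; idle: number;
};

function countStates(
  statuses: Array<PrinterStatusLike | undefined>,
  isKnown: (e: HmsEntry) => boolean, // stand-in for the known-HMS filter
): Counts {
  const counts: Counts = { offline: 0, error: 0, printing: 0, paused: 0, finished: 0, idle: 0 };
  for (const status of statuses) {
    // Missing or disconnected status counts as offline and nothing else.
    if (!status || !status.connected) { counts.offline++; continue; }
    const hasKnownHms = (status.hms_errors ?? []).some(isKnown);
    if (hasKnownHms) counts.error++;
    switch (status.state) {
      case 'RUNNING': counts.printing++; break;
      case 'PAUSE': counts.paused++; break;
      case 'FINISH': counts.finished++; break;
      // FAILED with a known HMS was already counted as an error above;
      // FAILED without one is grouped with finished prints.
      case 'FAILED': if (!hasKnownHms) counts.finished++; break;
      default: counts.idle++; break;
    }
  }
  return counts;
}
```

As in the hunk, a `RUNNING` or `PAUSE` printer with an active known HMS is still counted in both its state bucket and `error`. Only the `FAILED` case is guarded against double-counting.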

+ 19 - 4
frontend/src/pages/PrintersPage.tsx

@@ -915,8 +915,10 @@ function StatusSummaryBar({ printers }: { printers: Printer[] | undefined }) {
       } else if (!status.connected) {
         offline++;
       } else {
-        // Count printers with HMS errors
-        if (status.hms_errors && filterKnownHMSErrors(status.hms_errors).length > 0) {
+        // Count printers with active HMS errors as problems
+        const knownHmsCount =
+          status.hms_errors ? filterKnownHMSErrors(status.hms_errors).length : 0;
+        if (knownHmsCount > 0) {
           error++;
         }
         switch (status.state) {
@@ -937,7 +939,16 @@ function StatusSummaryBar({ printers }: { printers: Printer[] | undefined }) {
             finished++;
             break;
           case 'FAILED':
-            error++;
+            // FAILED is the printer's terminal gcode_state after a print stops —
+            // including user cancellations, where there's no actual fault. Only
+            // count it as a "problem" when an HMS error is also active; otherwise
+            // it's just a print that ended unsuccessfully and the plate needs
+            // clearing (same as FINISH from the operator's perspective).
+            if (knownHmsCount > 0) {
+              // Already counted above
+            } else {
+              finished++;
+            }
             break;
           default:
             idle++;
@@ -1024,7 +1035,11 @@ function classifyPrinterStatus(
     case 'RUNNING': return 'printing';
     case 'PAUSE':   return 'paused';
     case 'FINISH':  return 'finished';
-    case 'FAILED':  return 'error';
+    // FAILED without an active HMS error is the printer's terminal state after
+    // any unsuccessful end — including user-cancellations. Treat the same as
+    // FINISH for grouping/badging purposes; only escalate to "error" when an
+    // HMS code is actually attached (handled by the early-return above).
+    case 'FAILED':  return 'finished';
     default:        return 'idle';
   }
 }

File diff suppressed because it is too large
+ 0 - 0
static/assets/index-DJk_kz74.js


+ 1 - 1
static/index.html

@@ -26,7 +26,7 @@
 
 
     <!-- Splash screens for iOS -->
     <link rel="apple-touch-startup-image" href="/img/android-chrome-512x512.png" />
-    <script type="module" crossorigin src="/assets/index-DRS5LiBG.js"></script>
+    <script type="module" crossorigin src="/assets/index-DJk_kz74.js"></script>
     <link rel="stylesheet" crossorigin href="/assets/index-telVPl_h.css">
   </head>
   <body>

Some files were not shown because too many files were changed in this diff