Jelajahi Sumber

chore(vp): per-minute MQTT status-push counter for idle-disconnect triage (#1548 follow-up)

  The 1 Hz status push was silent at INFO, so support bundles couldn't show
  whether the push task was actually reaching a specific slicer connection.
  Now ``_periodic_status_push`` emits one line per minute per connected
  slicer ("1Hz status push: N pushes/min to <client>") and stays silent when
  no slicer is attached. No behaviour change to the push itself — counters
  are local to the task and reset every 60 ticks.

  Motivated by the open follow-up on #1548: keepalive parser shipped (b6636053)
  and the symptom moved 60 s → 90 s, but OrcaSlicer still disconnects on idle.
  This adds the missing observability so the reporter's next support bundle
  shows whether our outbound 1 Hz push is alive for the disconnecting
  connection.
maziggy 4 hari lalu
induk
melakukan
4ffefa60f4
2 mengubah file dengan 24 tambahan dan 0 penghapusan
  1. 1 0
      CHANGELOG.md
  2. 23 0
      backend/app/services/virtual_printer/mqtt_server.py

+ 1 - 0
CHANGELOG.md

@@ -19,6 +19,7 @@ All notable changes to Bambuddy will be documented in this file.
 - **VP target-printer DHCP IP / serial refresh now restarts proxy VPs** — Pre-fix: when a target printer's IP changed (DHCP renewal, network reconfiguration), the running proxy VP kept forwarding to the stale IP forever because ``sync_from_db``'s "changed" predicate didn't compare ``proxy_ips`` against the running instance's ``target_printer_ip`` / ``target_printer_serial``. The user had to manually toggle the VP to refresh. Now ``sync_from_db`` re-evaluates the proxy target each cycle and restarts the VP when the IP or serial actually changes — same code path as a config change. If the target printer's DHCP lease cycles frequently this means more proxy restarts, but the alternative was silent breakage; documented in the release-notes for users on flaky-DHCP networks.
 - **VP queue_force_color_match setting takes effect immediately** — Pre-fix: toggling the per-VP ``Force exact color match`` setting via the UI silently no-op'd because ``sync_from_db``'s "changed" predicate didn't include the field. The user had to restart the process for the new value to land. The predicate now also checks ``queue_force_color_match`` so the running instance gets restarted on toggle.
 - **VP MQTT client session errors elevated from DEBUG to WARNING** — The outer ``except Exception`` in ``SimpleMQTTServer._handle_client`` was logging at DEBUG, which production deployments default to suppressing. Users reporting "slicer disconnects randomly" then had no signal to pass us. WARNING surfaces it. Inner handlers' expected parser/IO failures stay at DEBUG — only unexpected errors that would otherwise reach the outer catch get visibility.
+- **VP MQTT periodic status push now logs a one-line per-minute counter per active slicer connection (#1548 follow-up)** — ``_periodic_status_push`` emits ``1Hz status push: N pushes/min to <client>`` at INFO level once per minute per connected slicer (silent when no slicer is attached). The 1 Hz status push was previously silent at INFO; when a reporter sent a support bundle showing an idle disconnect, there was no way to tell whether the push task was actually pushing to that connection or being eaten silently. The counter both confirms the task is healthy for a given client and gives us a concrete data point (N < 60 means pushes were dropped) when triaging future "slicer disconnects on idle" reports. No behaviour change to the push itself.
 
 
 ### Security

+ 23 - 0
backend/app/services/virtual_printer/mqtt_server.py

@@ -399,9 +399,19 @@ class SimpleMQTTServer:
     async def _periodic_status_push(self) -> None:
         """Send periodic status updates to all connected clients (1 Hz, exact pre-bridge behaviour)."""
         logger.info("Starting periodic status push task")
+        # Per-client push counters reset every 60 ticks. Lets us confirm from
+        # logs whether the 1Hz push is actually reaching a specific slicer
+        # connection (#1548 keepalive follow-up: keepalive parser shipped but
+        # OrcaSlicer still disconnects on idle, and the periodic push is
+        # otherwise silent at INFO level so it can't be observed in the
+        # support bundle). One log line per minute per active connection —
+        # nothing when no slicer is attached.
+        push_counts: dict[str, int] = {}
+        ticks_since_summary = 0
         while self._running:
             try:
                 await asyncio.sleep(1)  # Push every 1 second like real printers
+                ticks_since_summary += 1
 
                 disconnected = []
                 for client_id, writer in list(self._clients.items()):
@@ -411,6 +421,7 @@ class SimpleMQTTServer:
                             continue
                         serial = self._client_serials.get(client_id, self.serial)
                         await self._send_status_report(writer, serial=serial)
+                        push_counts[client_id] = push_counts.get(client_id, 0) + 1
                     except OSError as e:
                         logger.debug("Failed to push status to %s: %s", client_id, e)
                         disconnected.append(client_id)
@@ -419,6 +430,18 @@ class SimpleMQTTServer:
                 for client_id in disconnected:
                     self._clients.pop(client_id, None)
                     self._client_serials.pop(client_id, None)
+                    push_counts.pop(client_id, None)
+
+                if ticks_since_summary >= 60:
+                    for cid, count in push_counts.items():
+                        logger.info(
+                            "%s1Hz status push: %d pushes/min to %s",
+                            self._log_prefix,
+                            count,
+                            cid,
+                        )
+                    push_counts.clear()
+                    ticks_since_summary = 0
 
             except asyncio.CancelledError:
                 break