Browse Source

Fix queue 500 error when cancelled print exists (#558)

  The MQTT completion handler stored "aborted" as the queue item status
  when a print was cancelled mid-print, but the response schema only
  allows "cancelled". Pydantic validation failed on the invalid status
  when listing all queue items, returning 500. Filtering by specific
  status excluded the bad row so those still worked.

  Normalise "aborted" → "cancelled" before storing. A startup fixup
  also converts any existing "aborted" rows in the database.
maziggy 2 months ago
parent
commit
8d9894793a

+ 2 - 0
CHANGELOG.md

@@ -14,6 +14,8 @@ All notable changes to Bambuddy will be documented in this file.
 - **SpoolBuddy Kiosk Auth Bypass via API Key** — When Bambuddy auth is enabled, the SpoolBuddy kiosk (Chromium on RPi) was redirected to the login page because the `ProtectedRoute` requires a user object from `GET /auth/me`, which only accepted JWT tokens. The `/auth/me` endpoint now also accepts API keys (via `Authorization: Bearer bb_xxx` or `X-API-Key` header) and returns a synthetic admin user with all permissions. The frontend's `AuthContext` reads an optional `?token=` URL parameter on first load, stores it in localStorage, and strips it from the URL to prevent leakage via browser history or referrer. The install script now includes the API key in the kiosk URL (`/spoolbuddy?token=${API_KEY}`), so the device authenticates automatically on boot without manual login.
 
 ### Fixed
+- **Queue Returns 500 When Cancelled Print Exists** ([#558](https://github.com/maziggy/bambuddy/issues/558)) — When a print was cancelled mid-print, the MQTT completion handler stored status `"aborted"` on the queue item, but the response schema only accepts `"pending"`, `"printing"`, `"completed"`, `"failed"`, `"skipped"`, or `"cancelled"`. Listing all queue items hit a Pydantic validation error on the invalid status, returning a 500 error. Filtering by a specific status (e.g. "pending") excluded the bad row and worked fine. Now normalises `"aborted"` to `"cancelled"` before storing. A startup fixup also converts any existing `"aborted"` rows.
+- **Tests Send Real Maintenance Notifications** — Tests that call `on_print_complete(status="completed")` created background `asyncio` tasks (maintenance check, smart plug, notifications) that outlived the test's mock context. When the event loop processed these orphaned tasks, `async_session` was no longer patched and they queried the real production database — finding real printers with maintenance due and real notification providers, then sending real notifications. Tests now cancel spawned background tasks before the mock context exits.
 - **Virtual Printer Config Changes Ignored Until Toggle Off/On** — Changing a virtual printer's mode (e.g. proxy → archive), model, access code, bind IP, remote interface IP, or target printer via the UI updated the database but the running VP instance was never restarted. `sync_from_db()` skipped any VP whose ID was already in the running instances dict without checking if config had changed. Now compares critical fields between the running instance and DB record and restarts the VP when a difference is detected.
 - **Sidebar Navigation Ignores User Permissions** — All sidebar navigation items (Archives, Queue, Stats, Profiles, Maintenance, Projects, Inventory, Files) were visible to every user regardless of their role's permissions. Only the Settings item was permission-gated. Now each nav item is hidden when the user lacks the corresponding read permission (e.g., `archives:read`, `queue:read`, `library:read`). The Printers item remains always visible as the home page. Also added the missing `inventory:read|create|update|delete` permissions to the frontend Permission type (they existed in the backend but were absent from the frontend type definition).
 - **Camera Button Clickable Without Permission & ffmpeg Process Leak** ([#550](https://github.com/maziggy/bambuddy/issues/550)) — Two camera issues in multi-user environments (e.g., classrooms with multiple printers). First, the camera button on the printer card was clickable even when the user's role lacked `camera:view` permission. Now disabled with a permission tooltip, matching the existing pattern for `printers:control` on the chamber light button. Second, ffmpeg processes (~240MB each) were never cleaned up after closing a camera stream. The `stop_camera_stream` endpoint called `terminate()` but never `wait()`ed or `kill()`ed, and HTTP disconnect detection in the streaming response only checked between frames — if the generator was blocked reading from ffmpeg stdout, disconnect was never detected (due to TCP send buffer masking the closed connection). Three fixes: (1) the stop endpoint now uses `terminate()` → `wait(2s)` → `kill()` → `wait()`; (2) each stream gets a background disconnect monitor task that polls `request.is_disconnected()` every 2 seconds independently of the frame loop, directly killing the ffmpeg process on disconnect; (3) a periodic cleanup (every 60s) scans `/proc` for any ffmpeg process with a Bambu RTSP URL (`rtsps://bblp:`) that isn't in an active stream and `SIGKILL`s it — catching orphans that survive app restarts or generator abandonment.

+ 20 - 0
backend/app/main.py

@@ -2208,6 +2208,10 @@ async def on_print_complete(printer_id: int, data: dict):
             queue_item = printing_items[0] if printing_items else None
             if queue_item:
                 queue_status = data.get("status", "completed")
+                # MQTT sends "aborted" for cancelled prints; normalise to
+                # "cancelled" so it matches the queue schema Literal.
+                if queue_status == "aborted":
+                    queue_status = "cancelled"
                 queue_item.status = queue_status
                 queue_item.completed_at = datetime.now(timezone.utc)
                 await db.commit()
@@ -3256,6 +3260,22 @@ async def lifespan(app: FastAPI):
     # Startup
     await init_db()
 
+    # Fix queue items stuck with invalid "aborted" status (should be "cancelled").
+    # This can happen when a print was cancelled mid-print on versions before this fix.
+    try:
+        async with async_session() as db:
+            from backend.app.models.print_queue import PrintQueueItem
+
+            result = await db.execute(select(PrintQueueItem).where(PrintQueueItem.status == "aborted"))
+            aborted_items = result.scalars().all()
+            if aborted_items:
+                for item in aborted_items:
+                    item.status = "cancelled"
+                await db.commit()
+                logging.info("Fixed %d queue item(s) with invalid 'aborted' status → 'cancelled'", len(aborted_items))
+    except Exception as e:
+        logging.warning("Failed to fix aborted queue items: %s", e)
+
     # Restore debug logging state from previous session
     await init_debug_logging()
 

+ 21 - 0
backend/tests/integration/test_print_lifecycle.py

@@ -65,15 +65,26 @@ class TestPrintCompleteLogic:
     @pytest.mark.asyncio
     async def test_print_complete_no_import_errors(self, capture_logs):
         """Verify on_print_complete doesn't have import shadowing issues."""
+        # Snapshot tasks before the call so we can cancel orphans afterwards.
+        # on_print_complete fires background tasks (maintenance check, notifications,
+        # smart-plug) via asyncio.create_task.  If those tasks outlive the mock
+        # context they use the *real* async_session and can send real notifications.
+        tasks_before = set(asyncio.all_tasks())
+
         with (
             patch("backend.app.main.async_session") as mock_session_maker,
             patch("backend.app.main.notification_service") as mock_notif,
             patch("backend.app.main.smart_plug_manager") as mock_plug,
             patch("backend.app.main.ws_manager") as mock_ws,
+            patch("backend.app.main.mqtt_relay") as mock_relay,
+            patch("backend.app.main.printer_manager") as mock_pm,
         ):
             mock_notif.on_print_complete = AsyncMock()
             mock_plug.on_print_complete = AsyncMock()
             mock_ws.send_print_complete = AsyncMock()
+            mock_ws.broadcast = AsyncMock()
+            mock_relay.on_print_complete = AsyncMock()
+            mock_pm.get_printer.return_value = None
 
             # Mock the database session
             mock_session = AsyncMock()
@@ -94,6 +105,16 @@ class TestPrintCompleteLogic:
                 },
             )
 
+            # Cancel background tasks spawned by on_print_complete before
+            # leaving the mock context — prevents them from running with
+            # the real async_session and sending real notifications.
+            for task in asyncio.all_tasks() - tasks_before:
+                task.cancel()
+                try:
+                    await task
+                except (asyncio.CancelledError, Exception):
+                    pass
+
         # Verify no import shadowing errors - this would have caught the ArchiveService bug
         errors = [r for r in capture_logs.get_errors() if "cannot access local variable" in str(r.message)]
         assert not errors, f"Import shadowing error: {capture_logs.format_errors()}"

+ 247 - 0
backend/tests/integration/test_print_queue_api.py

@@ -1176,3 +1176,250 @@ class TestTargetLocationFeature:
         assert response.status_code == 200
         result = response.json()
         assert result["target_location"] is None
+
+
+class TestAbortedStatusNormalisation:
+    """Tests for issue #558: 'aborted' queue status causes 500 error."""
+
+    @pytest.fixture
+    async def printer_factory(self, db_session):
+        """Factory to create test printers."""
+        _counter = [0]
+
+        async def _create_printer(**kwargs):
+            from backend.app.models.printer import Printer
+
+            _counter[0] += 1
+            counter = _counter[0]
+
+            defaults = {
+                "name": f"Abort Test Printer {counter}",
+                "ip_address": f"192.168.1.{60 + counter}",
+                "serial_number": f"TESTABORT{counter:04d}",
+                "access_code": "12345678",
+                "model": "P1S",
+            }
+            defaults.update(kwargs)
+
+            printer = Printer(**defaults)
+            db_session.add(printer)
+            await db_session.commit()
+            await db_session.refresh(printer)
+            return printer
+
+        return _create_printer
+
+    @pytest.fixture
+    async def archive_factory(self, db_session):
+        """Factory to create test archives."""
+        _counter = [0]
+
+        async def _create_archive(**kwargs):
+            from backend.app.models.archive import PrintArchive
+
+            _counter[0] += 1
+            counter = _counter[0]
+
+            defaults = {
+                "filename": f"abort_test_{counter}.3mf",
+                "print_name": f"Abort Test Print {counter}",
+                "file_path": f"/tmp/abort_test_{counter}.3mf",
+                "file_size": 1024,
+                "content_hash": f"aborthash{counter:06d}",
+                "status": "completed",
+            }
+            defaults.update(kwargs)
+
+            archive = PrintArchive(**defaults)
+            db_session.add(archive)
+            await db_session.commit()
+            await db_session.refresh(archive)
+            return archive
+
+        return _create_archive
+
+    @pytest.fixture
+    async def queue_item_factory(self, db_session, printer_factory, archive_factory):
+        """Factory to create test queue items."""
+        _counter = [0]
+
+        async def _create_queue_item(**kwargs):
+            from backend.app.models.print_queue import PrintQueueItem
+
+            _counter[0] += 1
+            counter = _counter[0]
+
+            if "printer_id" not in kwargs:
+                printer = await printer_factory()
+                kwargs["printer_id"] = printer.id
+            if "archive_id" not in kwargs:
+                archive = await archive_factory()
+                kwargs["archive_id"] = archive.id
+
+            defaults = {
+                "status": "pending",
+                "position": counter,
+            }
+            defaults.update(kwargs)
+
+            item = PrintQueueItem(**defaults)
+            db_session.add(item)
+            await db_session.commit()
+            await db_session.refresh(item)
+            return item
+
+        return _create_queue_item
+
+    @pytest.mark.asyncio
+    @pytest.mark.integration
+    async def test_on_print_complete_normalises_aborted_to_cancelled(self, queue_item_factory, db_session):
+        """Verify the completion handler maps 'aborted' → 'cancelled' for queue items."""
+        import asyncio
+        from unittest.mock import AsyncMock, MagicMock, patch
+
+        item = await queue_item_factory(status="printing")
+
+        # Build a mock session whose execute returns our item
+        mock_result = MagicMock()
+        mock_result.scalars.return_value.all.return_value = [item]
+
+        mock_session = AsyncMock()
+        mock_session.__aenter__ = AsyncMock(return_value=mock_session)
+        mock_session.__aexit__ = AsyncMock(return_value=False)
+        mock_session.execute = AsyncMock(return_value=mock_result)
+        mock_session.commit = AsyncMock()
+
+        tasks_before = set(asyncio.all_tasks())
+
+        with (
+            patch("backend.app.main.async_session", return_value=mock_session),
+            patch("backend.app.main.ws_manager") as mock_ws,
+            patch("backend.app.main.mqtt_relay") as mock_relay,
+            patch("backend.app.main.notification_service") as mock_notif,
+            patch("backend.app.main.smart_plug_manager") as mock_plug,
+            patch("backend.app.main.printer_manager") as mock_pm,
+        ):
+            mock_ws.send_print_complete = AsyncMock()
+            mock_ws.broadcast = AsyncMock()
+            mock_relay.on_print_complete = AsyncMock()
+            mock_relay.on_queue_job_completed = AsyncMock()
+            mock_notif.on_print_complete = AsyncMock()
+            mock_plug.on_print_complete = AsyncMock()
+            mock_pm.get_printer.return_value = None
+
+            from backend.app.main import on_print_complete
+
+            await on_print_complete(
+                item.printer_id,
+                {
+                    "status": "aborted",
+                    "filename": "test.gcode",
+                    "subtask_name": "Test",
+                    "timelapse_was_active": False,
+                },
+            )
+
+            # Cancel background tasks before leaving mock context
+            for task in asyncio.all_tasks() - tasks_before:
+                task.cancel()
+                try:
+                    await task
+                except (asyncio.CancelledError, Exception):
+                    pass
+
+        # The item status should be normalised to 'cancelled', not 'aborted'
+        assert item.status == "cancelled"
+
+    @pytest.mark.asyncio
+    @pytest.mark.integration
+    async def test_startup_fixup_converts_aborted_to_cancelled(self, queue_item_factory, db_session):
+        """Verify the startup fixup converts existing 'aborted' rows to 'cancelled'."""
+        from sqlalchemy import select
+
+        from backend.app.models.print_queue import PrintQueueItem
+
+        # Create items with various statuses including 'aborted'
+        item_aborted = await queue_item_factory(status="pending")
+        item_pending = await queue_item_factory(status="pending")
+
+        # Manually set the invalid status
+        item_aborted.status = "aborted"
+        db_session.add(item_aborted)
+        await db_session.commit()
+
+        # Run the fixup query (same logic as lifespan)
+        result = await db_session.execute(select(PrintQueueItem).where(PrintQueueItem.status == "aborted"))
+        aborted_items = result.scalars().all()
+        for i in aborted_items:
+            i.status = "cancelled"
+        await db_session.commit()
+
+        # Verify: no more 'aborted' items
+        result = await db_session.execute(select(PrintQueueItem).where(PrintQueueItem.status == "aborted"))
+        assert len(result.scalars().all()) == 0
+
+        # The previously aborted item should now be 'cancelled'
+        await db_session.refresh(item_aborted)
+        assert item_aborted.status == "cancelled"
+
+        # The pending item should be unchanged
+        await db_session.refresh(item_pending)
+        assert item_pending.status == "pending"
+
+    @pytest.mark.asyncio
+    @pytest.mark.integration
+    async def test_completed_status_passes_through_unchanged(self, queue_item_factory, db_session):
+        """Verify normal statuses like 'completed' are not affected by normalisation."""
+        import asyncio
+        from unittest.mock import AsyncMock, MagicMock, patch
+
+        item = await queue_item_factory(status="printing")
+
+        mock_result = MagicMock()
+        mock_result.scalars.return_value.all.return_value = [item]
+
+        mock_session = AsyncMock()
+        mock_session.__aenter__ = AsyncMock(return_value=mock_session)
+        mock_session.__aexit__ = AsyncMock(return_value=False)
+        mock_session.execute = AsyncMock(return_value=mock_result)
+        mock_session.commit = AsyncMock()
+
+        tasks_before = set(asyncio.all_tasks())
+
+        with (
+            patch("backend.app.main.async_session", return_value=mock_session),
+            patch("backend.app.main.ws_manager") as mock_ws,
+            patch("backend.app.main.mqtt_relay") as mock_relay,
+            patch("backend.app.main.notification_service") as mock_notif,
+            patch("backend.app.main.smart_plug_manager") as mock_plug,
+            patch("backend.app.main.printer_manager") as mock_pm,
+        ):
+            mock_ws.send_print_complete = AsyncMock()
+            mock_ws.broadcast = AsyncMock()
+            mock_relay.on_print_complete = AsyncMock()
+            mock_relay.on_queue_job_completed = AsyncMock()
+            mock_notif.on_print_complete = AsyncMock()
+            mock_plug.on_print_complete = AsyncMock()
+            mock_pm.get_printer.return_value = None
+
+            from backend.app.main import on_print_complete
+
+            await on_print_complete(
+                item.printer_id,
+                {
+                    "status": "completed",
+                    "filename": "test.gcode",
+                    "subtask_name": "Test",
+                    "timelapse_was_active": False,
+                },
+            )
+
+            # Cancel background tasks before leaving mock context
+            for task in asyncio.all_tasks() - tasks_before:
+                task.cancel()
+                try:
+                    await task
+                except (asyncio.CancelledError, Exception):
+                    pass
+
+        assert item.status == "completed"