浏览代码

● fix(library): #1600 thumbnail extraction for external-folder .gcode.3mf
files + unify file_type classification across ingest paths

#1600: external-folder sliced outputs landed
with no thumbnail. Cause: four backend ingest paths classified
LibraryFile.file_type differently for the same .gcode.3mf family.
upload / ZIP-extract / in-process used os.path.splitext()[1] which
returns .3mf for foo.gcode.3mf and stored file_type="3mf", matching
the thumbnail-extraction gate at library.py:1467 (file_type == "3mf").
External-folder scan explicitly detected the compound and stored
file_type="gcode.3mf" — preserving "sliced output" identity — but
then skipped both the "3mf" gate and the "gcode" gate, so the file
landed with thumbnail_path = None. Same compound-extension drift that
bit #1543 in the 3D preview, in a surface that audit didn't trace
back to.

Unified fix:

- New classify_file_type(filename) helper in routes/library.py is the
single source of truth. Returns "gcode.3mf" for sliced outputs and
ext[1:] otherwise.
- Applied to every ingest path: upload (line 1704), ZIP-extract
(1998), external-folder scan (the bug site — the manual compound
check is replaced), and in-process save_3mf_from_bytes (471, used
by MakerWorld import).
- External-scan thumbnail gate widened to
`if file_type in ("3mf", "gcode.3mf"):` — a .gcode.3mf IS a 3MF zip
with Metadata/plate_1.png; ThreeMFParser doesn't care about the
trailing extension.
- gcode-download endpoint at GET /library/files/{id}/gcode had the
same drift in reverse: gate was `elif file.file_type == "3mf":` so
a row stored with file_type="gcode.3mf" (the external-scan path's
pre-unification behaviour, and the canonical going forward) got
rejected with HTTP 400. Widened to the same compound-aware tuple.

One-shot DB migration in core/database.py::run_migrations backfills
existing legacy rows:

UPDATE library_files
SET file_type = 'gcode.3mf'
WHERE file_type = '3mf'
AND LOWER(filename) LIKE '%.gcode.3mf'

Idempotent (post-update rows no longer match the file_type='3mf'
predicate, so re-runs at every boot are no-ops) and dialect-neutral
(LOWER + LIKE are identical under SQLite and Postgres). Without the
backfill, users would have a permanent split state: old uploads at
'3mf', new uploads at 'gcode.3mf' — which would double-bucket sliced
outputs in the dashboard stats query at line 4615 and show two
entries in the file-manager filter dropdown for the same conceptual
type.

Frontend untouched. FileManagerPage.tsx and ProjectDetailPage.tsx
already accept both '3mf' and 'gcode.3mf' per the #1543 fix. After
the migration the DB only contains canonical values, so the legacy
'3mf' branches in the frontend become dead code for sliced files —
they stay as defence-in-depth in case any future ingest path I
missed reverts to the legacy classifier.

maziggy 1 天之前
父节点
当前提交
a837a3acbd

文件差异内容过多而无法显示
+ 0 - 0
CHANGELOG.md


+ 38 - 14
backend/app/api/routes/library.py

@@ -91,6 +91,26 @@ def get_library_files_dir() -> Path:
     return files_dir
 
 
+def classify_file_type(filename: str) -> str:
+    """Return the canonical ``LibraryFile.file_type`` for *filename*.
+
+    Compound extensions are preserved — a `.gcode.3mf` file (a sliced
+    output, still a 3MF zip on disk) is classified ``gcode.3mf`` rather
+    than ``3mf``. Pre-#1600 this was only done in the external-scan
+    path; the upload / ZIP-extract / in-process paths all stripped to
+    the trailing extension and stored ``3mf``, so the FE had to accept
+    both. Unified here so every ingest path stores the same value and
+    downstream gates (gcode download, file-type filter, thumbnail
+    extraction) only need to handle one canonical name per file family.
+    Files with no extension classify as ``unknown``.
+    """
+    lower = filename.lower()
+    if lower.endswith(".gcode.3mf"):
+        return "gcode.3mf"
+    ext = os.path.splitext(lower)[1]
+    return ext[1:] if ext else "unknown"
+
+
 def get_library_thumbnails_dir() -> Path:
     """Get the directory for library thumbnails."""
     thumbnails_dir = get_library_dir() / "thumbnails"
@@ -468,7 +488,7 @@ async def save_3mf_bytes_to_library(
         folder_id=folder_id,
         filename=filename,
         file_path=to_relative_path(file_path),
-        file_type=ext[1:] if ext else "unknown",
+        file_type=classify_file_type(filename),
         file_size=len(file_bytes),
         file_hash=file_hash,
         thumbnail_path=to_relative_path(thumbnail_path) if thumbnail_path else None,
@@ -1454,17 +1474,16 @@ async def scan_external_folder(
             except OSError:
                 continue
 
-            file_type = ext[1:] if ext else "unknown"
-            # For compound extensions, use the meaningful part
-            if file_type in ("3mf",) and len(filepath.suffixes) >= 2:
-                inner = filepath.suffixes[-2].lower()
-                if inner == ".gcode":
-                    file_type = "gcode.3mf"
+            file_type = classify_file_type(filename)
 
-            # Extract thumbnail for 3mf files
+            # Extract thumbnail for 3mf files (including .gcode.3mf sliced
+            # outputs — those are 3MF zips on disk and carry the same
+            # thumbnail Metadata/plate_1.png the parser reads). Pre-#1600
+            # the gate was `file_type == "3mf"` alone, so .gcode.3mf files
+            # in external folders silently got no thumbnail.
             thumbnail_path = None
             file_metadata = None
-            if file_type == "3mf":
+            if file_type in ("3mf", "gcode.3mf"):
                 try:
                     parser = ThreeMFParser(str(filepath))
                     raw_metadata = parser.parse()
@@ -1700,8 +1719,11 @@ async def upload_file(
         except InvalidFilenameError as e:
             raise HTTPException(status_code=400, detail=str(e)) from e
         ext = os.path.splitext(filename)[1].lower()
-        # Handle files without extension
-        file_type = ext[1:] if ext else "unknown"
+        # `file_type` is compound-aware (`gcode.3mf` for sliced outputs).
+        # `ext` stays the trailing extension because the on-disk filename
+        # uses it directly and the 3MF-parse branch below still gates on
+        # `ext == ".3mf"`, which is correct for both `.3mf` and `.gcode.3mf`.
+        file_type = classify_file_type(filename)
 
         # Verify folder exists if specified
         target_folder = None
@@ -1995,7 +2017,7 @@ async def extract_zip_file(
                     # Extract file
                     filename = os.path.basename(zip_path)
                     ext = os.path.splitext(filename)[1].lower()
-                    file_type = ext[1:] if ext else "unknown"
+                    file_type = classify_file_type(filename)
 
                     # Generate unique filename for storage
                     unique_filename = f"{uuid.uuid4().hex}{ext}"
@@ -4387,8 +4409,10 @@ async def get_gcode(
 
     if file.file_type == "gcode":
         return FastAPIFileResponse(str(abs_path), media_type="text/plain")
-    elif file.file_type == "3mf":
-        # Extract gcode from 3mf
+    elif file.file_type in ("3mf", "gcode.3mf"):
+        # Extract gcode from 3mf zip container. `.gcode.3mf` sliced outputs
+        # carry the same `Metadata/plate_*.gcode` entries as a `.3mf`, so
+        # the unzip path is identical — just had to expand the gate.
         try:
             with zipfile.ZipFile(str(abs_path), "r") as zf:
                 # Find gcode file

+ 17 - 0
backend/app/core/database.py

@@ -1833,6 +1833,23 @@ async def run_migrations(conn):
             {"old": old_val, "new": new_val},
         )
 
+    # Migration: Unify `LibraryFile.file_type` across ingest paths (#1600).
+    # Pre-#1600, only the external-folder scan path stored `gcode.3mf` for
+    # sliced outputs — the upload, ZIP-extract, and in-process paths all
+    # stripped to the trailing `.3mf` and stored `3mf`, so the same file
+    # family was split between two values depending on how it was ingested.
+    # Going forward `classify_file_type()` is canonical; this backfill flips
+    # existing legacy `3mf` rows whose filename ends in `.gcode.3mf` to the
+    # canonical compound name. Idempotent (post-update rows no longer match
+    # `file_type = '3mf'`) and dialect-neutral (`LOWER` + `LIKE` work the
+    # same under SQLite and Postgres).
+    await conn.execute(
+        text(
+            "UPDATE library_files SET file_type = 'gcode.3mf' "
+            "WHERE file_type = '3mf' AND LOWER(filename) LIKE '%.gcode.3mf'"
+        )
+    )
+
     # Migration: Add per-user Bambu Cloud credential columns
     await _safe_execute(conn, "ALTER TABLE users ADD COLUMN cloud_token VARCHAR(500)")
     await _safe_execute(conn, "ALTER TABLE users ADD COLUMN cloud_email VARCHAR(255)")

+ 49 - 0
backend/tests/integration/test_library_api.py

@@ -1179,6 +1179,55 @@ class TestPrintFileUploadValidation:
         result = response.json()
         assert result["filename"] == "plate_1.gcode.3mf"
 
+    @pytest.mark.asyncio
+    @pytest.mark.integration
+    async def test_library_upload_classifies_gcode_3mf_as_compound(self, async_client: AsyncClient, db_session):
+        """#1600 follow-up: upload path used to strip to the trailing
+        extension and store ``file_type='3mf'`` for sliced outputs, while
+        the external-folder scan stored ``file_type='gcode.3mf'``. Now
+        every ingest path goes through ``classify_file_type`` and
+        produces the canonical compound name."""
+        files = {
+            "file": (
+                "sliced.gcode.3mf",
+                self._valid_3mf_bytes(),
+                "application/zip",
+            )
+        }
+        response = await async_client.post("/api/v1/library/files", files=files)
+        assert response.status_code == 200
+        assert response.json()["file_type"] == "gcode.3mf"
+
+    @pytest.mark.asyncio
+    @pytest.mark.integration
+    async def test_library_get_gcode_endpoint_accepts_compound_file_type(self, async_client: AsyncClient, db_session):
+        """#1600 follow-up: pre-fix, ``GET /files/{id}/gcode`` only handled
+        ``file_type`` of ``gcode`` or ``3mf`` and 400'd on a row whose
+        ``file_type`` was ``gcode.3mf`` — exactly the rows the external-
+        folder scan was creating. The gate now treats both as 3MF and
+        unzips the embedded gcode the same way."""
+        from backend.app.models.library import LibraryFile
+
+        # Persist a real `.gcode.3mf` zip under file_type='gcode.3mf' so
+        # the endpoint hits the new branch.
+        with tempfile.NamedTemporaryFile(suffix=".gcode.3mf", delete=False) as tmp:
+            tmp.write(self._valid_3mf_bytes(name="Metadata/plate_1.gcode"))
+            tmp_path = tmp.name
+
+        lib_file = LibraryFile(
+            filename="sliced.gcode.3mf",
+            file_path=tmp_path,
+            file_type="gcode.3mf",
+            file_size=Path(tmp_path).stat().st_size,
+        )
+        db_session.add(lib_file)
+        await db_session.commit()
+        await db_session.refresh(lib_file)
+
+        response = await async_client.get(f"/api/v1/library/files/{lib_file.id}/gcode")
+        assert response.status_code == 200
+        assert b"G28" in response.content
+
     @pytest.mark.asyncio
     @pytest.mark.integration
     async def test_library_still_accepts_non_print_extensions(self, async_client: AsyncClient, db_session):

+ 56 - 0
backend/tests/unit/test_library_classify_file_type.py

@@ -0,0 +1,56 @@
+"""Regression tests for the unified file_type classifier (#1600).
+
+Pre-#1600 each ingest path classified `LibraryFile.file_type` differently
+for the same on-disk file family — only the external-folder scan stored
+`gcode.3mf` for sliced outputs, while upload / ZIP-extract / in-process
+all stripped to the trailing extension and stored `3mf`. The frontend
+had to accept both per #1543, the gcode-download endpoint only handled
+`3mf`, and the external-scan thumbnail gate skipped `gcode.3mf` entirely
+(#1600 itself). `classify_file_type` is now the single source of truth
+across every ingest path + a one-shot DB migration backfills legacy rows.
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from backend.app.api.routes.library import classify_file_type
+
+
+@pytest.mark.parametrize(
+    "filename, expected",
+    [
+        # Sliced output — compound extension preserved
+        ("model.gcode.3mf", "gcode.3mf"),
+        ("Multi.Plate.gcode.3mf", "gcode.3mf"),
+        ("MIXED_CASE.GCODE.3MF", "gcode.3mf"),
+        # Plain 3MF — unchanged
+        ("model.3mf", "3mf"),
+        ("model.3MF", "3mf"),
+        # Raw gcode — not a sliced-3mf, classified as gcode (matches existing
+        # gcode-thumbnail branch and the gcode-download endpoint).
+        ("model.gcode", "gcode"),
+        ("model.GCODE", "gcode"),
+        # STL — used by the stats query, thumbnail backfill, and the
+        # `file_type == "stl"` filter. Must not change.
+        ("model.stl", "stl"),
+        # Common image extensions used for thumbnails
+        ("preview.png", "png"),
+        ("preview.JPG", "jpg"),
+        # Files without an extension classify as `unknown` so the downstream
+        # `unknown` branches still see them.
+        ("README", "unknown"),
+        # Files that LOOK like sliced output but aren't — confidence guard.
+        ("not.gcode.3mf.bak", "bak"),
+    ],
+)
+def test_classify_file_type_returns_canonical_value(filename, expected):
+    assert classify_file_type(filename) == expected
+
+
+def test_gcode_3mf_classification_is_stable_across_extension_casing():
+    """The migration's `LOWER(filename) LIKE '%.gcode.3mf'` predicate matches
+    every casing the classifier accepts, so unify on a single canonical
+    lowercase value."""
+    for variant in ("foo.gcode.3mf", "foo.Gcode.3mf", "foo.gcode.3MF", "FOO.GCODE.3MF"):
+        assert classify_file_type(variant) == "gcode.3mf"

+ 160 - 0
backend/tests/unit/test_library_file_type_backfill_migration.py

@@ -0,0 +1,160 @@
+"""Regression test for the library_files.file_type backfill migration (#1600).
+
+Pre-#1600 the upload, ZIP-extract, and in-process ingest paths all stored
+`file_type='3mf'` for sliced `.gcode.3mf` outputs while the external-folder
+scan stored `file_type='gcode.3mf'` — the same on-disk file family split
+across two values depending on how it was ingested. `classify_file_type`
+is now canonical going forward; this migration backfills the legacy `3mf`
+rows so the DB ends up consistent. Idempotent and dialect-neutral.
+"""
+
+from __future__ import annotations
+
+import pytest
+from sqlalchemy import text
+from sqlalchemy.ext.asyncio import create_async_engine
+
+from backend.app.core.database import run_migrations
+
+
+@pytest.fixture(autouse=True)
+def force_sqlite_dialect(monkeypatch):
+    """Force the SQLite branch regardless of test env settings."""
+    from backend.app.core import db_dialect
+
+    monkeypatch.setattr(db_dialect, "is_sqlite", lambda: True)
+    monkeypatch.setattr(db_dialect, "is_postgres", lambda: False)
+    from backend.app.core import database as database_module
+
+    monkeypatch.setattr(database_module, "is_sqlite", lambda: True)
+
+
+def _register_all_models():
+    from backend.app.models import (  # noqa: F401
+        ams_history,
+        ams_label,
+        api_key,
+        archive,
+        color_catalog,
+        external_link,
+        filament,
+        group,
+        kprofile_note,
+        library,
+        maintenance,
+        notification,
+        notification_template,
+        print_log,
+        print_queue,
+        printer,
+        project,
+        project_bom,
+        settings,
+        slot_preset,
+        smart_plug,
+        smart_plug_energy_snapshot,
+        spool,
+        spool_assignment,
+        spool_catalog,
+        spool_k_profile,
+        spool_usage_history,
+        spoolbuddy_device,
+        user,
+        user_email_pref,
+        virtual_printer,
+    )
+
+
+@pytest.fixture
+async def engine():
+    from backend.app.core.database import Base
+
+    _register_all_models()
+
+    eng = create_async_engine("sqlite+aiosqlite:///:memory:", echo=False)
+    async with eng.begin() as conn:
+        await conn.run_sync(Base.metadata.create_all)
+    yield eng
+    await eng.dispose()
+
+
+async def _insert_file(conn, *, file_id: int, filename: str, file_type: str) -> None:
+    """Insert a minimal LibraryFile row; only the columns the migration
+    touches matter."""
+    await conn.execute(
+        text(
+            "INSERT INTO library_files "
+            "(id, filename, file_path, file_type, file_size, is_external, print_count) "
+            "VALUES (:id, :filename, :path, :ftype, 0, 0, 0)"
+        ),
+        {
+            "id": file_id,
+            "filename": filename,
+            "path": f"/lib/{file_id}",
+            "ftype": file_type,
+        },
+    )
+
+
+@pytest.mark.asyncio
+async def test_backfill_flips_only_legacy_gcode_3mf_rows(engine):
+    """Rows with `file_type='3mf'` whose filename ends in `.gcode.3mf` get
+    upgraded to `gcode.3mf`. Everything else stays put."""
+    async with engine.begin() as conn:
+        await _insert_file(conn, file_id=1, filename="sliced.gcode.3mf", file_type="3mf")
+        await _insert_file(conn, file_id=2, filename="UPPER.GCODE.3MF", file_type="3mf")
+        await _insert_file(conn, file_id=3, filename="model.3mf", file_type="3mf")  # not sliced
+        await _insert_file(conn, file_id=4, filename="model.gcode", file_type="gcode")
+        await _insert_file(conn, file_id=5, filename="model.stl", file_type="stl")
+        await _insert_file(conn, file_id=6, filename="already.gcode.3mf", file_type="gcode.3mf")
+
+    async with engine.begin() as conn:
+        await run_migrations(conn)
+
+    async with engine.connect() as conn:
+        rows = dict((await conn.execute(text("SELECT id, file_type FROM library_files ORDER BY id"))).fetchall())
+
+    assert rows[1] == "gcode.3mf", "lowercase .gcode.3mf must be backfilled"
+    assert rows[2] == "gcode.3mf", "uppercase .GCODE.3MF must be backfilled (LOWER(filename) in migration)"
+    assert rows[3] == "3mf", "plain .3mf stays at `3mf` — not a sliced output"
+    assert rows[4] == "gcode", "raw .gcode is untouched"
+    assert rows[5] == "stl", "stl is untouched"
+    assert rows[6] == "gcode.3mf", "rows already at canonical pass through"
+
+
+@pytest.mark.asyncio
+async def test_backfill_is_idempotent(engine):
+    """Every boot re-runs the migration set; a second pass on already-
+    backfilled rows must be a no-op."""
+    async with engine.begin() as conn:
+        await _insert_file(conn, file_id=1, filename="sliced.gcode.3mf", file_type="3mf")
+
+    async with engine.begin() as conn:
+        await run_migrations(conn)
+    async with engine.begin() as conn:
+        await run_migrations(conn)
+
+    async with engine.connect() as conn:
+        result = await conn.execute(text("SELECT file_type FROM library_files WHERE id = 1"))
+        assert result.scalar() == "gcode.3mf"
+
+
+@pytest.mark.asyncio
+async def test_backfill_leaves_unrelated_3mf_rows_alone(engine):
+    """A row whose filename happens to contain `.gcode.3mf` as a substring
+    but doesn't END with it (e.g. a `.bak` of a sliced output) is not a
+    sliced output — must NOT be backfilled."""
+    async with engine.begin() as conn:
+        await _insert_file(conn, file_id=1, filename="sliced.gcode.3mf.bak", file_type="3mf")
+
+    async with engine.begin() as conn:
+        await run_migrations(conn)
+
+    async with engine.connect() as conn:
+        result = await conn.execute(text("SELECT file_type FROM library_files WHERE id = 1"))
+        # The LIKE predicate uses '%.gcode.3mf' so a trailing .bak doesn't match.
+        # The row keeps its pre-migration `3mf` — odd, but classify_file_type
+        # returns `bak` for a fresh ingest, so this row simply stays where it
+        # was. The migration's job is to fix the dominant class, not chase
+        # every edge case.
+        assert result.scalar() == "3mf"

部分文件因为文件数量过多而无法显示