ソースを参照

feat(observability): trace ID column on every log line + X-Trace-Id header

  Builds on the recent uvicorn-access-log-into-bambuddy.log change.
  Until now the access line told us who called an endpoint, but there
  was no way to tie that line to the application records emitted on the
  server side while handling that request. The rogue stop_print mystery
  on 2026-04-26 left exactly that gap: even with access logs piped in,
  correlating "this POST landed" with "this MQTT publish went out 6 ms
  later" required eyeball-matching timestamps across different loggers.

  A new ContextVar + middleware + logging filter wire a trace ID through
  every record:

    * trace_id_middleware mints an 8-char hex ID per request (or honours
      a sane inbound X-Trace-Id for cross-system correlation), stores it
      in trace_id_var (ContextVar), echoes it on the response as
      X-Trace-Id, and resets the var in finally.
    * TraceIDFilter, attached to root + uvicorn.access, copies the
      current trace_id_var value onto every LogRecord so the format
      string [%(trace_id)s] resolves to the right ID per record.
    * Records emitted outside any request scope (startup, MQTT
      callbacks, scheduler) get a stable "-" placeholder so the column
      stays visually aligned and grep stays simple.

  ContextVars are the right plumbing because asyncio copies the current
  context into every asyncio.create_task, so background work spawned
  from inside a request inherits the same ID without explicit threading.
  request.state can't make that hop. The logging filter also has no
  access to the FastAPI request object — it runs synchronously inside
  the stdlib logging machinery — and the ContextVar is the only
  mechanism that bridges async request scope to sync log emission.

  Inbound X-Trace-Id is hard-validated against [A-Za-z0-9_-]+ (max 64
  chars) before being honoured — a hostile/buggy caller cannot smuggle
  log-injection payloads (newlines, control chars, megabyte blobs) into
  bambuddy.log via the trace ID column; values that fail the gate
  silently trigger a freshly minted server-side ID rather than failing
  the request.

  Middleware is decorated AFTER auth_middleware on purpose: Starlette
  stacks @app.middleware decorators LIFO so the last-decorated runs
  first inbound, making trace stamp the OUTERMOST layer — auth log
  lines and every record emitted on the way down to and back from the
  route handler all carry the same ID.

  Output now correlates as:

    2026-04-26 09:51:39,152 INFO [uvicorn.access] [a4f3b1e7] - "POST
      /api/v1/printers/1/print/stop HTTP/1.1" 200
    2026-04-26 09:51:39,158 INFO [bambu_mqtt] [a4f3b1e7] [SERIAL] Sent
      stop print command

  One grep a4f3b1e7 returns the full causality chain.

  30 new tests: 22 unit (ContextVar placeholder, filter copies value,
  asyncio task propagation, concurrent-request isolation, hex generator
  uniqueness, hostile-payload validator, max-length boundary, all four
  write verbs survive, GET/HEAD/OPTIONS dropped, URL-substring false-
  match guards, edge cases) and 8 integration (X-Trace-Id round-trips,
  body matches header, hostile inbound replaced, overlong inbound
  replaced, ContextVar resets after request, generator format stable,
  each request gets unique ID).
maziggy 1 ヶ月 前
コミット
1878d2aab5

ファイルの差分が大きいため隠しています
+ 1 - 0
CHANGELOG.md


+ 46 - 0
backend/app/core/logging_filters.py

@@ -0,0 +1,46 @@
+"""Logging filters for the Bambuddy log pipeline.
+
+Currently houses a single filter that keeps only state-changing HTTP methods
+in the file-side uvicorn access log. See ``WriteRequestsOnlyFilter`` for the
+why; this lives in its own module so the test suite can import it without
+pulling in ``backend.app.main``'s entire startup graph.
+"""
+
+from __future__ import annotations
+
+import logging
+
+
+class WriteRequestsOnlyFilter(logging.Filter):
+    """Keep uvicorn access log records for state-changing HTTP methods only.
+
+    Uvicorn's access logger emits one record per HTTP request, formatted as
+
+        ``<client_addr> - "<METHOD> <path> HTTP/<ver>" <status>``
+
+    On a typical Bambuddy install the bulk of that traffic is GETs — the
+    frontend status-polling loop, the camera stream, snapshots, websocket
+    upgrades. None of those can change server state on their own, so for
+    incident triage ("who hit ``/print/stop`` at 09:23?") they're noise that
+    just rotates the log file faster.
+
+    This filter accepts only POST / PUT / PATCH / DELETE — the verbs that
+    actually mutate state — and drops everything else. Match anchors on the
+    surrounding ``" `` and trailing space so an unrelated literal substring
+    in a URL (e.g. ``GET /api/posts/POST``) cannot false-match.
+
+    Attach to ``logging.getLogger("uvicorn.access")`` (and only there — the
+    pattern is uvicorn's specific format string and would silently drop
+    everything if applied to a generic logger).
+    """
+
+    _WRITE_VERB_TOKENS: tuple[str, ...] = (
+        ' "POST ',
+        ' "PUT ',
+        ' "PATCH ',
+        ' "DELETE ',
+    )
+
+    def filter(self, record: logging.LogRecord) -> bool:  # noqa: A003 — stdlib API name
+        message = record.getMessage()
+        return any(token in message for token in self._WRITE_VERB_TOKENS)

+ 118 - 0
backend/app/core/trace.py

@@ -0,0 +1,118 @@
+"""Per-request trace ID plumbing.
+
+Each HTTP request gets a short hex ID set in a ``ContextVar``; downstream
+log records (application *and* uvicorn access) read the same context and
+include the ID in their output. The result is that one ``grep <trace_id>``
+on ``bambuddy.log`` returns the access line + every line emitted on the
+server side while that request was being handled — closing the loop
+opened by piping uvicorn access into the file: the access line tells you
+*who* called the endpoint, the trace ID tells you *what else happened*
+on the server because of it.
+
+Why a ContextVar instead of e.g. ``request.state``:
+
+* asyncio copies the current context into every ``asyncio.create_task``,
+  so background work spawned from within a request inherits the same
+  trace ID without having to be passed it explicitly. ``request.state``
+  doesn't survive that hop.
+* The logging filter has no access to the FastAPI request object — it
+  runs synchronously inside the stdlib logging machinery — and the
+  ContextVar is the only mechanism that bridges async request scope to
+  sync log emission.
+
+Why no fancy structured-logging schema: this is a small project. The
+existing log format is a single line per record; we add a single
+bracketed token for the trace ID and stop there. If structured logging
+is wanted later, it can layer on top — the ContextVar carries an opaque
+string regardless of what consumes it downstream.
+"""
+
+from __future__ import annotations
+
+import logging
+import re
+import secrets
+from contextvars import ContextVar
+
+# Default ``"-"`` (instead of None or empty string) so the format string
+# always produces a stable visual width; a bare empty bracket pair would
+# read as "no trace ID at all" which is hard to grep for. ``-`` reads as
+# "no value in this column" the way it does in HTTP access logs already.
+TRACE_ID_PLACEHOLDER = "-"
+
+trace_id_var: ContextVar[str] = ContextVar("trace_id", default=TRACE_ID_PLACEHOLDER)
+
+# Length of a freshly minted trace ID in hex chars. 8 chars = 32 bits of
+# entropy = ~4 billion possibilities; collisions are astronomically
+# unlikely within a single rotation window of bambuddy.log and grep stays
+# easy at this length. Increase later if it proves too short for a busy
+# install — the filter and format don't care about width.
+_GENERATED_LENGTH = 8
+
+# Bound on how long an *inbound* trace ID can be when echoed from the
+# X-Trace-Id request header. Without a cap a malicious / buggy client
+# could push 1 MB of garbage into every log line for a request. 64 chars
+# comfortably accommodates UUIDs (32 hex), Datadog-style 64-bit IDs,
+# OpenTelemetry's 32-hex spans — anything longer is almost certainly
+# wrong and we'd rather mint our own than honour it.
+_MAX_INBOUND_LENGTH = 64
+
+# Whitelist of characters allowed in an inbound trace ID. Restricted to
+# the alphanumerics + a small set of separators that real-world
+# correlation IDs use, so newlines / quotes / control chars cannot be
+# smuggled into log lines via the X-Trace-Id header. A request with an
+# unacceptable header just gets a freshly minted server-side ID — we
+# never reject the request for it.
+_VALID_INBOUND = re.compile(r"^[A-Za-z0-9_\-]+$")
+
+
+def get_trace_id() -> str:
+    """Return the current trace ID, or the placeholder if none is set."""
+    return trace_id_var.get()
+
+
+def generate_trace_id() -> str:
+    """Mint a fresh server-side trace ID."""
+    return secrets.token_hex(_GENERATED_LENGTH // 2)
+
+
+def normalise_inbound_trace_id(raw: str | None) -> str | None:
+    """Validate and return a caller-supplied trace ID, or ``None`` to mint fresh.
+
+    Accepts only short alphanumeric + ``_-`` strings so a hostile or buggy
+    client can't smuggle log-injection payloads through the X-Trace-Id
+    header. Returns ``None`` for any input that fails the gate, signalling
+    to the middleware that it should generate one instead.
+    """
+    if raw is None:
+        return None
+    if not raw or len(raw) > _MAX_INBOUND_LENGTH:
+        return None
+    if not _VALID_INBOUND.match(raw):
+        return None
+    return raw
+
+
+class TraceIDFilter(logging.Filter):
+    """Inject the current ``trace_id_var`` value into every LogRecord.
+
+    Attach to the file handler (or any handler whose format string
+    references ``%(trace_id)s``) so that every line written through that
+    handler carries the request scope it was generated under. The filter
+    always returns ``True`` — it never drops records, only annotates
+    them.
+
+    Records emitted outside any HTTP request (startup, MQTT callbacks,
+    scheduled tasks not chained from a request) get the placeholder
+    string, so the format column stays aligned and absent values are
+    obviously visible as ``[-]`` rather than blanks.
+    """
+
+    def filter(self, record: logging.LogRecord) -> bool:  # noqa: A003 — stdlib API name
+        # Use direct attribute set (not setdefault-style) so the value is
+        # always taken from the *current* context — a record formatted on
+        # a different task than where it was created (rare but possible
+        # via QueueHandler or async-throttled handlers) still picks up
+        # the right ID.
+        record.trace_id = trace_id_var.get()
+        return True

+ 85 - 1
backend/app/main.py

@@ -232,12 +232,28 @@ check_dependencies()
 # DEBUG=true -> DEBUG level, else use LOG_LEVEL setting
 log_level_str = "DEBUG" if app_settings.debug else app_settings.log_level.upper()
 log_level = getattr(logging, log_level_str, logging.INFO)
-log_format = "%(asctime)s %(levelname)s [%(name)s] %(message)s"
+# Trace ID column ([-] when no request scope is active — startup, MQTT
+# callbacks, scheduled tasks not chained from a request — so the column
+# stays visually aligned and missing values are obvious in grep). See
+# backend/app/core/trace.py for the ContextVar that feeds this slot.
+log_format = "%(asctime)s %(levelname)s [%(name)s] [%(trace_id)s] %(message)s"
 
 # Create root logger
 root_logger = logging.getLogger()
 root_logger.setLevel(log_level)
 
+# Trace-ID injection: this filter populates record.trace_id from the
+# per-request ContextVar so the format string above can reference it.
+# Attached to the root logger so EVERY record (application, uvicorn,
+# third-party) gets the field — without it, the format string would
+# raise KeyError on records that don't naturally carry a trace_id
+# attribute. See backend/app/core/trace.py for the ContextVar that
+# the filter reads.
+from backend.app.core.trace import TraceIDFilter
+
+_trace_id_filter = TraceIDFilter()
+root_logger.addFilter(_trace_id_filter)
+
 # Console handler - always enabled
 console_handler = logging.StreamHandler()
 console_handler.setLevel(log_level)
@@ -258,6 +274,24 @@ if app_settings.log_to_file:
     root_logger.addHandler(file_handler)
     logging.info("Logging to file: %s", log_file)
 
+    # Pipe uvicorn's HTTP access log to bambuddy.log too. Uvicorn ships its
+    # access logger with propagate=False by default, so without this attach
+    # there is no on-disk record of which endpoint triggered a server-state
+    # change — the rogue stop_print mystery on 2026-04-26 was untraceable
+    # for exactly this reason. Filtered to write methods only
+    # (POST/PUT/PATCH/DELETE) so the high-volume status-poll GETs from the
+    # frontend don't churn the rotation window faster than it's useful.
+    from backend.app.core.logging_filters import WriteRequestsOnlyFilter
+
+    uvicorn_access_logger = logging.getLogger("uvicorn.access")
+    uvicorn_access_logger.addHandler(file_handler)
+    uvicorn_access_logger.addFilter(WriteRequestsOnlyFilter())
+    # Uvicorn's access logger has propagate=False (its own default), so the
+    # root-attached TraceIDFilter never sees these records. Attach a
+    # second instance directly so HTTP access lines carry the same trace
+    # ID column as the application logs they correlate with.
+    uvicorn_access_logger.addFilter(TraceIDFilter())
+
 # Reduce noise from third-party libraries in production
 if not app_settings.debug:
     logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)
@@ -4609,6 +4643,56 @@ async def auth_middleware(request, call_next):
     return await call_next(request)
 
 
+@app.middleware("http")
+async def trace_id_middleware(request, call_next):
+    """Stamp every HTTP request with a trace ID and echo it back.
+
+    Decorated AFTER auth_middleware on purpose: Starlette stacks
+    @app.middleware decorators LIFO, so the last-decorated runs first
+    inbound. Putting the trace stamp last makes it the OUTERMOST layer,
+    which means auth-middleware log lines (and every line emitted on the
+    way down to and back from the route handler) all carry the same
+    trace ID. If we put it before auth, auth's logs would be stamped
+    with the *previous* request's ID — useless for correlation.
+
+    Honours an inbound ``X-Trace-Id`` header so callers running their
+    own tracing can correlate their span IDs with our log lines, but
+    only if the value passes the whitelist gate in
+    ``backend.app.core.trace.normalise_inbound_trace_id`` — anything
+    rejected (too long, contains control chars, etc.) silently triggers
+    a freshly minted server-side ID rather than failing the request.
+
+    The minted (or echoed) ID is set on a ContextVar so that every log
+    record emitted during the request — application logs *and* uvicorn's
+    access log — carries it via TraceIDFilter, and is also written to
+    the ``X-Trace-Id`` response header so clients can pin a server-side
+    log search to the exact request they made.
+    """
+    from backend.app.core.trace import (
+        generate_trace_id,
+        normalise_inbound_trace_id,
+        trace_id_var,
+    )
+
+    inbound = normalise_inbound_trace_id(request.headers.get("X-Trace-Id"))
+    trace_id = inbound if inbound is not None else generate_trace_id()
+
+    token = trace_id_var.set(trace_id)
+    try:
+        response = await call_next(request)
+    finally:
+        # Reset the ContextVar so a record emitted in a totally
+        # unrelated background task that just happens to inherit this
+        # context doesn't keep referencing this request's ID forever.
+        # In practice ContextVar.reset is best-effort under asyncio
+        # task-spawn semantics, but the cost is one attribute write so
+        # we may as well do it.
+        trace_id_var.reset(token)
+
+    response.headers["X-Trace-Id"] = trace_id
+    return response
+
+
 # API routes
 app.include_router(auth.router, prefix=app_settings.api_prefix)
 app.include_router(mfa.router, prefix=app_settings.api_prefix)

+ 144 - 0
backend/tests/integration/test_trace_middleware.py

@@ -0,0 +1,144 @@
+"""Integration test for the trace-ID middleware contract.
+
+Tests focus on observable surface — what headers go out, what
+ContextVar value the route handler sees — rather than re-testing the
+ContextVar / filter primitives (those are covered in
+``tests/unit/test_trace.py``).
+
+A minimal FastAPI app is used instead of the production ``backend.app.main``
+app: importing main.py would pull in the entire startup graph (DB
+migrations, MQTT subscribers, scheduler, etc.) just to assert "the
+middleware sets a header", and that overhead would dwarf the test value.
+The middleware function is copied inline so the test pins the exact
+contract expected of it.
+"""
+
+from __future__ import annotations
+
+import re
+
+import pytest
+from fastapi import FastAPI
+from fastapi.testclient import TestClient
+
+from backend.app.core.trace import (
+    generate_trace_id,
+    get_trace_id,
+    normalise_inbound_trace_id,
+    trace_id_var,
+)
+
+
+def _build_app_with_trace_middleware() -> FastAPI:
+    """Construct a minimal FastAPI app with the trace middleware wired up
+    the same way main.py does it."""
+    app = FastAPI()
+
+    @app.middleware("http")
+    async def trace_id_middleware(request, call_next):
+        inbound = normalise_inbound_trace_id(request.headers.get("X-Trace-Id"))
+        trace_id = inbound if inbound is not None else generate_trace_id()
+        token = trace_id_var.set(trace_id)
+        try:
+            response = await call_next(request)
+        finally:
+            trace_id_var.reset(token)
+        response.headers["X-Trace-Id"] = trace_id
+        return response
+
+    @app.get("/echo-trace")
+    async def echo_trace():
+        # Read the ContextVar from inside the request handler so the
+        # test can assert that what's in the header matches what
+        # downstream code sees. If these ever diverge, application
+        # logs would be stamped with a different ID than the one the
+        # client gets back — useless for correlation.
+        return {"trace_id": get_trace_id()}
+
+    return app
+
+
+@pytest.fixture
+def client() -> TestClient:
+    return TestClient(_build_app_with_trace_middleware())
+
+
+class TestGeneratedTraceId:
+    def test_response_carries_x_trace_id_header(self, client):
+        """Every response must echo X-Trace-Id so a client can paste it
+        into a server-side log search later — without it, the trace ID
+        column in bambuddy.log is one-way only."""
+        response = client.get("/echo-trace")
+        assert response.status_code == 200
+        assert "X-Trace-Id" in response.headers
+        assert response.headers["X-Trace-Id"]
+
+    def test_generated_id_matches_handler_view(self, client):
+        """The X-Trace-Id header value must equal what the route handler
+        saw in its ContextVar — otherwise client-side and server-side
+        log searches use different keys and never join up."""
+        response = client.get("/echo-trace")
+        body_id = response.json()["trace_id"]
+        header_id = response.headers["X-Trace-Id"]
+        assert body_id == header_id
+
+    def test_each_request_gets_a_unique_id(self, client):
+        """Two consecutive requests should produce two different IDs —
+        otherwise the column in the log file is useless for telling
+        requests apart."""
+        first = client.get("/echo-trace").headers["X-Trace-Id"]
+        second = client.get("/echo-trace").headers["X-Trace-Id"]
+        assert first != second
+
+    def test_generated_id_format_is_short_hex(self, client):
+        """Bound the visible width and shape of the column. If the
+        generator ever switches format (e.g. UUID-with-dashes) the
+        format-string column width changes and grep patterns that
+        downstream tooling might rely on break — make the change
+        deliberate by failing this test instead."""
+        tid = client.get("/echo-trace").headers["X-Trace-Id"]
+        assert re.fullmatch(r"[0-9a-f]+", tid), tid
+        assert 4 <= len(tid) <= 32
+
+
+class TestInboundTraceIdRespected:
+    def test_safe_inbound_id_is_echoed(self, client):
+        """When the caller sends a sane X-Trace-Id, we honour it — this
+        is the cross-system correlation case (caller's tracing system
+        wants its span ID propagated)."""
+        response = client.get("/echo-trace", headers={"X-Trace-Id": "client-sent-abc123"})
+        assert response.headers["X-Trace-Id"] == "client-sent-abc123"
+        assert response.json()["trace_id"] == "client-sent-abc123"
+
+    def test_hostile_inbound_id_is_replaced(self, client):
+        """A header that fails the validator (control chars,
+        log-injection-shaped chars, etc.) must NOT reach the response
+        header or the log column — silently mint fresh and carry on,
+        so a hostile/buggy caller can't break our log file but also
+        can't break their own request by sending a bad header."""
+        response = client.get("/echo-trace", headers={"X-Trace-Id": "abc\ndef rm -rf /"})
+        echoed = response.headers["X-Trace-Id"]
+        assert echoed != "abc\ndef rm -rf /"
+        assert "\n" not in echoed
+        assert " " not in echoed
+
+    def test_overlong_inbound_id_is_replaced(self, client):
+        """The cap protects bambuddy.log from a 1KB-per-line blowup if
+        a caller sends a huge X-Trace-Id."""
+        too_long = "a" * 100
+        response = client.get("/echo-trace", headers={"X-Trace-Id": too_long})
+        assert response.headers["X-Trace-Id"] != too_long
+
+
+class TestContextResetAfterRequest:
+    def test_trace_id_var_resets_after_request_completes(self, client):
+        """The middleware must reset the ContextVar in its ``finally``
+        block. Without this, a record emitted in a totally unrelated
+        background task that happens to inherit the test client's
+        context would keep referencing a long-gone request's ID."""
+        from backend.app.core.trace import TRACE_ID_PLACEHOLDER
+
+        client.get("/echo-trace")
+        # After the request returns, the test fixture's context should
+        # no longer hold the request's ID.
+        assert get_trace_id() == TRACE_ID_PLACEHOLDER

+ 114 - 0
backend/tests/unit/test_logging_filters.py

@@ -0,0 +1,114 @@
+"""Tests for ``backend.app.core.logging_filters.WriteRequestsOnlyFilter``.
+
+The filter is attached to uvicorn's ``access`` logger so we get an on-disk
+record of state-changing HTTP requests in ``bambuddy.log`` (incident-triage
+need: trace which endpoint fired a state change, e.g. ``stop_print``)
+without churning the rotation window with the frontend's high-volume GET
+status polls.
+
+Tests use ``logging.LogRecord`` directly rather than spinning up a real
+uvicorn server — the contract under test is purely "given this record, do
+we keep it" and going through uvicorn would be a heavyweight integration
+test for a one-line predicate.
+"""
+
+from __future__ import annotations
+
+import logging
+
+import pytest
+
+from backend.app.core.logging_filters import WriteRequestsOnlyFilter
+
+
+def _record(message: str) -> logging.LogRecord:
+    """Build a LogRecord whose message matches uvicorn's access-log shape."""
+    return logging.LogRecord(
+        name="uvicorn.access",
+        level=logging.INFO,
+        pathname="",
+        lineno=0,
+        msg=message,
+        args=None,
+        exc_info=None,
+    )
+
+
+@pytest.fixture
+def filter_under_test() -> WriteRequestsOnlyFilter:
+    return WriteRequestsOnlyFilter()
+
+
+class TestWriteVerbsKept:
+    @pytest.mark.parametrize(
+        "verb",
+        ["POST", "PUT", "PATCH", "DELETE"],
+    )
+    def test_state_changing_verb_passes(self, filter_under_test, verb):
+        """The four verbs that can mutate server state must all be kept —
+        POST is by far the most common (the rogue ``stop_print`` we couldn't
+        trace was a POST), but PATCH/PUT/DELETE are equally important when
+        triaging "who edited / replaced / deleted X at 09:23?"."""
+        record = _record(f'192.168.1.42:54812 - "{verb} /api/v1/printers/1/print/stop HTTP/1.1" 200')
+        assert filter_under_test.filter(record) is True
+
+
+class TestReadOnlyVerbsDropped:
+    @pytest.mark.parametrize(
+        "request_line",
+        [
+            "GET /api/v1/printers/1/status HTTP/1.1",
+            "GET /api/v1/printers/1/camera/stream HTTP/1.1",
+            "HEAD /api/v1/health HTTP/1.1",
+            "OPTIONS /api/v1/printers/1/status HTTP/1.1",
+        ],
+    )
+    def test_read_only_verb_dropped(self, filter_under_test, request_line):
+        """GET / HEAD / OPTIONS account for the bulk of access traffic on a
+        running install (status polls, camera streams, CORS preflights) and
+        none of them can change server state, so dropping them keeps
+        bambuddy.log focused on lines that matter for incident triage."""
+        record = _record(f'192.168.1.42:54812 - "{request_line}" 200')
+        assert filter_under_test.filter(record) is False
+
+
+class TestNoFalseMatchInUrl:
+    """The matcher anchors on ``" `` + verb + space so an unrelated literal
+    substring inside a URL can't false-match. Important because URLs in
+    Bambuddy include things like ``/print/stop``, ``/print/pause`` — words
+    that happen to contain the verb names as substrings."""
+
+    def test_url_containing_verb_substring_does_not_match(self, filter_under_test):
+        """A literal "POST" inside the URL of a GET request must not flip
+        the filter — the verb must appear as the actual HTTP method, with
+        the surrounding quote+space anchors uvicorn's format guarantees."""
+        record = _record('192.168.1.42:54812 - "GET /api/posts/POST_123 HTTP/1.1" 200')
+        assert filter_under_test.filter(record) is False
+
+    def test_path_segment_named_delete_does_not_match(self, filter_under_test):
+        record = _record('192.168.1.42:54812 - "GET /api/v1/library/files/DELETE_ME HTTP/1.1" 200')
+        assert filter_under_test.filter(record) is False
+
+
+class TestEdgeCases:
+    def test_empty_message_dropped(self, filter_under_test):
+        """Defensive: a malformed access record with no body should not
+        accidentally pass through."""
+        assert filter_under_test.filter(_record("")) is False
+
+    def test_unrelated_log_line_dropped(self, filter_under_test):
+        """If the filter ever ends up attached to the wrong logger by
+        mistake, it must not leak unrelated records — silent fallthrough
+        on application logs would defeat the whole point."""
+        record = _record("Bambuddy starting - debug=False, log_level=INFO")
+        assert filter_under_test.filter(record) is False
+
+    def test_filter_is_idempotent_across_records(self, filter_under_test):
+        """Sanity: filter has no internal state, so repeated calls with the
+        same record return the same answer."""
+        kept = _record('192.168.1.42:54812 - "POST /api/v1/print/stop HTTP/1.1" 200')
+        dropped = _record('192.168.1.42:54812 - "GET /api/v1/printers HTTP/1.1" 200')
+
+        for _ in range(3):
+            assert filter_under_test.filter(kept) is True
+            assert filter_under_test.filter(dropped) is False

+ 185 - 0
backend/tests/unit/test_trace.py

@@ -0,0 +1,185 @@
+"""Tests for ``backend.app.core.trace`` — the per-request trace ID
+plumbing that ties uvicorn HTTP access lines to the application log
+records produced while handling that request.
+
+These tests stay at the unit level: the ContextVar / filter / inbound-
+ID validator can each be exercised directly without spinning up a real
+FastAPI app, and going through Starlette's TestClient just to assert
+"the middleware sets a header" would obscure rather than illuminate the
+contract.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import logging
+
+import pytest
+
+from backend.app.core.trace import (
+    TRACE_ID_PLACEHOLDER,
+    TraceIDFilter,
+    generate_trace_id,
+    get_trace_id,
+    normalise_inbound_trace_id,
+    trace_id_var,
+)
+
+
+@pytest.fixture(autouse=True)
+def _reset_trace_id():
+    """Each test gets a fresh ``trace_id_var`` — without the reset, a
+    test that sets the var would leak its value into the next test
+    running on the same event loop, producing surprising 'why is this
+    other test seeing my ID?' failures."""
+    token = trace_id_var.set(TRACE_ID_PLACEHOLDER)
+    try:
+        yield
+    finally:
+        trace_id_var.reset(token)
+
+
+def _record(message: str = "irrelevant") -> logging.LogRecord:
+    """Build a vanilla log record — the filter doesn't care about its
+    contents, only the surrounding ContextVar value at filter time."""
+    return logging.LogRecord(
+        name="test", level=logging.INFO, pathname="", lineno=0, msg=message, args=None, exc_info=None
+    )
+
+
+class TestPlaceholderWhenUnset:
+    def test_get_trace_id_returns_placeholder_outside_request(self):
+        """Code paths with no HTTP request scope (startup, MQTT
+        callbacks, scheduled tasks) must see the placeholder rather
+        than ``None``, so the format-string column is always populated
+        and missing values stay greppable."""
+        assert get_trace_id() == TRACE_ID_PLACEHOLDER
+
+    def test_filter_sets_placeholder_when_no_request_context(self):
+        """The filter must annotate every record, including those
+        emitted when no request is in flight — the format string would
+        otherwise raise KeyError on those records."""
+        record = _record()
+        assert TraceIDFilter().filter(record) is True
+        assert record.trace_id == TRACE_ID_PLACEHOLDER
+
+
+class TestRequestScopePropagation:
+    def test_filter_picks_up_active_request_id(self):
+        """Inside a request, the ContextVar holds that request's ID and
+        the filter copies it onto the record — this is the whole point
+        of the plumbing."""
+        trace_id_var.set("abc12345")
+        record = _record()
+        TraceIDFilter().filter(record)
+        assert record.trace_id == "abc12345"
+
+    @pytest.mark.asyncio
+    async def test_id_propagates_into_spawned_task(self):
+        """asyncio copies the current context into ``create_task``, so
+        background work spawned from inside a request inherits the same
+        trace ID without explicit threading. This is why a ContextVar
+        beats ``request.state``: state doesn't survive the hop."""
+        trace_id_var.set("parent01")
+
+        captured: list[str] = []
+
+        async def _child():
+            captured.append(get_trace_id())
+
+        await asyncio.create_task(_child())
+        assert captured == ["parent01"]
+
+    @pytest.mark.asyncio
+    async def test_concurrent_requests_do_not_leak_ids_into_each_other(self):
+        """Two concurrent requests each see only their own trace ID —
+        if the filter ever started reading from the wrong context (e.g.
+        a process-global) this test would catch it immediately."""
+        seen: dict[str, str] = {}
+
+        async def _request(label: str, tid: str):
+            trace_id_var.set(tid)
+            # Yield to the scheduler so the other coroutine has a chance
+            # to overwrite a poorly-scoped global if one existed.
+            await asyncio.sleep(0)
+            seen[label] = get_trace_id()
+
+        await asyncio.gather(
+            _request("a", "aaaaaaaa"),
+            _request("b", "bbbbbbbb"),
+        )
+        assert seen == {"a": "aaaaaaaa", "b": "bbbbbbbb"}
+
+
+class TestGenerateTraceId:
+    def test_generated_ids_are_hex(self):
+        tid = generate_trace_id()
+        int(tid, 16)  # raises ValueError if not hex
+        assert tid
+
+    def test_generated_ids_are_unique_across_calls(self):
+        """secrets.token_hex; collisions across a handful of calls would
+        signal a generator regression rather than statistical bad luck."""
+        ids = {generate_trace_id() for _ in range(200)}
+        assert len(ids) == 200
+
+
+class TestNormaliseInboundTraceId:
+    """Hostile / buggy callers sending ``X-Trace-Id`` must NOT be able
+    to push log-injection payloads (newlines, control chars, megabyte
+    blobs) into bambuddy.log via the trace ID column. Anything that
+    fails the gate gets ``None`` so the middleware mints fresh."""
+
+    def test_none_input_returns_none(self):
+        assert normalise_inbound_trace_id(None) is None
+
+    def test_empty_string_returns_none(self):
+        """An explicit ``X-Trace-Id:`` header with empty value is
+        indistinguishable from no header for our purposes — mint fresh.
+        """
+        assert normalise_inbound_trace_id("") is None
+
+    def test_short_alphanumeric_accepted(self):
+        assert normalise_inbound_trace_id("abc123") == "abc123"
+
+    def test_uuid_format_accepted(self):
+        """32-char hex (UUID-style without dashes) is the most common
+        real-world correlation ID format — must round-trip unchanged."""
+        uuid_like = "0123456789abcdef0123456789abcdef"
+        assert normalise_inbound_trace_id(uuid_like) == uuid_like
+
+    def test_dash_and_underscore_accepted(self):
+        """Datadog / OpenTelemetry frequently use dashes between span
+        components; underscores show up in some Bambu-internal IDs we
+        might want to echo. Both stay in the whitelist."""
+        assert normalise_inbound_trace_id("trace-abc_123") == "trace-abc_123"
+
+    @pytest.mark.parametrize(
+        "hostile",
+        [
+            "abc def",  # space — could split log-line columns
+            "abc\ndef",  # newline — log injection
+            "abc\rdef",  # carriage return — log injection
+            "abc\tdef",  # tab — column drift
+            'abc"def',  # quote — could break grep-friendly delimiters
+            "abc;def",  # semicolon — script-injection-shaped
+            "abc<def",  # angle bracket — XSS-shaped
+            "abc/def",  # slash — looks like a path
+        ],
+    )
+    def test_hostile_payloads_rejected(self, hostile):
+        """Each rejected character is one the regex whitelist intentionally
+        excludes; this parametrised set documents the threat model and
+        will fail loud if the regex ever drifts."""
+        assert normalise_inbound_trace_id(hostile) is None
+
+    def test_overlong_input_rejected(self):
+        """A 1KB X-Trace-Id should never end up in every log line for
+        the duration of a request — bound it strictly."""
+        assert normalise_inbound_trace_id("a" * 65) is None
+
+    def test_max_length_boundary_accepted(self):
+        """The configured cap (currently 64) must accept exactly 64
+        chars; one off-by-one would silently reject UUID-like IDs that
+        happen to land at the boundary."""
+        assert normalise_inbound_trace_id("a" * 64) == "a" * 64

この差分においてかなりの量のファイルが変更されているため、一部のファイルを表示していません