
Fix SpoolBuddy update Docker failure — asyncssh local-username lookup

  Follow-up to the previous commit that swapped the `ssh`/`ssh-keygen`
  subprocesses for asyncssh. asyncssh.connect() internally calls
  getpass.getuser() to resolve the *local* username for ~/.ssh/config
  host matching, regardless of the explicit `username=` we pass for the
  remote login. Under an arbitrary Docker PUID with no /etc/passwd
  entry, getpass.getuser() tries LOGNAME/USER/LNAME/USERNAME (all unset
  in python:3.13-slim) and falls back to pwd.getpwuid(), which raises
  KeyError. asyncssh rewraps that as "Unknown local username: set one
  of LOGNAME, USER, LNAME, or USERNAME in the environment" — which
  surfaced in the UI as "ssh connection failed: no username set in the
  environment".
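
  The lookup order described above can be modelled roughly as follows. This
  is a simplified sketch, not CPython's source: `resolve_local_username`,
  `slim_image_lookup`, and the one-entry passwd table are hypothetical names
  that stand in for the environment and /etc/passwd of the container as this
  commit describes it.

```python
from typing import Callable, Mapping

# Environment variables consulted before the passwd database, in order.
USERNAME_VARS = ("LOGNAME", "USER", "LNAME", "USERNAME")


def resolve_local_username(
    environ: Mapping[str, str],
    uid: int,
    passwd_lookup: Callable[[int], str],
) -> str:
    """Model of the lookup: first non-empty env var wins, else passwd."""
    for name in USERNAME_VARS:
        value = environ.get(name)
        if value:
            return value
    # No env var is set: fall back to the passwd database.
    # The lookup raises KeyError when the UID has no entry.
    return passwd_lookup(uid)


def slim_image_lookup(uid: int) -> str:
    """Hypothetical passwd table containing only root (assumption)."""
    passwd = {0: "root"}
    return passwd[uid]


if __name__ == "__main__":
    # root resolves through the passwd table
    print(resolve_local_username({}, 0, slim_image_lookup))
    # an env var short-circuits the passwd lookup entirely
    print(resolve_local_username({"LOGNAME": "bambuddy"}, 1000, slim_image_lookup))
    # an arbitrary PUID with no env vars fails
    try:
        resolve_local_username({}, 1000, slim_image_lookup)
    except KeyError:
        print("uid 1000 has no passwd entry and no username env var")
```

  Setting any one of the four variables is enough to bypass the passwd
  lookup, which is why a single LOGNAME default unblocks the connection.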

  Fix is two-part:

  - _ensure_local_username_env() runs at module import. If
    getpass.getuser() already works, or any of LOGNAME/USER/LNAME/
    USERNAME is set, it is a no-op. Otherwise it sets LOGNAME=bambuddy
    so asyncssh can proceed. Native installs are untouched.

  - asyncssh.connect() is now called with config=[] to skip the
    default ~/.ssh/config load, which relies on a resolvable home
    directory that may not exist under arbitrary Docker PUIDs.
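
  The home-directory concern in the second bullet can be checked with a
  small diagnostic. This is a minimal sketch, not part of this commit:
  `home_directory_resolves` is a hypothetical helper. It relies on the POSIX
  behaviour of `os.path.expanduser("~")`, which returns the path unchanged
  when HOME is unset and the passwd lookup for the current UID fails.

```python
import os
from typing import Optional


def home_directory_resolves(expanded: Optional[str] = None) -> bool:
    """Return True when "~" expands to an absolute path.

    `expanded` lets a caller (or a test) supply an already-expanded value;
    by default the current process is inspected.
    """
    if expanded is None:
        expanded = os.path.expanduser("~")
    # An unexpanded "~" means neither HOME nor the passwd entry was usable.
    return expanded != "~" and os.path.isabs(expanded)


if __name__ == "__main__":
    if home_directory_resolves():
        print("home directory resolves:", os.path.expanduser("~"))
    else:
        print("no resolvable home directory for uid", os.getuid())
```

  Running this inside the container under the configured PUID shows whether
  a default ~/.ssh/config load would have had a home directory to read from.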

  Three new unit tests cover the env-var fallback, including the case
  where the operator has set USER but the passwd lookup still fails.
maziggy 1 month ago
parent
commit
60d034d40c

+ 1 - 1
CHANGELOG.md

@@ -13,7 +13,7 @@ All notable changes to Bambuddy will be documented in this file.
 ### Fixed
 - **External Sidebar Link Icon Not Showing** ([#878](https://github.com/maziggy/bambuddy/issues/878)) — Custom icons uploaded for external sidebar links rendered correctly in the edit dialog but were missing from the sidebar itself, and opening the icon URL directly returned `{"detail":"Valid camera stream token required..."}`. The sidebar `<img>` tag in `Layout.tsx` used a raw `/api/v1/external-links/{id}/icon` URL, but that endpoint is protected by a query-string stream token (the same mechanism used for camera streams and archive thumbnails, because `<img>` tags cannot send Authorization headers). The edit dialog already routed through `api.getExternalLinkIconUrl()`, which wraps the URL via `withStreamToken()`; the sidebar now does the same, so icons appear when auth is enabled.
 - **Shortest Job First Toggle Disappears After Clicking** ([#879](https://github.com/maziggy/bambuddy/issues/879)) — The SJF toggle badge on the queue page was rendered inside the Pending Queue section header, which is only shown when there is at least one pending item and the list view is active. Clicking the toggle often coincided with the scheduler starting the only pending print, at which point the Pending section unmounted and the toggle vanished along with it — making it look like the button had disappeared after clicking. The toggle has been moved to the top of the queue page, next to the list/timeline view switcher, so it stays reachable regardless of pending-item count, active filters, or the selected view mode.
-- **SpoolBuddy Update Fails in Docker with "no user exists for uid 1000"** — The SpoolBuddy remote-update flow shelled out to the OpenSSH `ssh-keygen` and `ssh` binaries for keypair creation and command execution. Both of those binaries call `getpwuid(getuid())` during startup and abort with `No user exists for uid <N>` when the container runs under an arbitrary PUID that is not listed in `/etc/passwd` (the default `python:3.13-slim` image only has an entry for root, so running with `user: "1000:1000"` — or any non-root user — tripped the same error). The entire SpoolBuddy update path is now subprocess-free: keypairs are generated in-process via the `cryptography` library (already a dependency), SSH commands run through the pure-Python `asyncssh` client, and git-branch detection reads `.git/HEAD` directly instead of shelling out to `git`. Native installs behave identically — they already worked because the running user was always in `/etc/passwd`. A regression test asserts that neither keypair creation nor command execution spawns any subprocess.
+- **SpoolBuddy Update Fails in Docker with "no user exists for uid 1000"** — The SpoolBuddy remote-update flow shelled out to the OpenSSH `ssh-keygen` and `ssh` binaries for keypair creation and command execution. Both of those binaries call `getpwuid(getuid())` during startup and abort with `No user exists for uid <N>` when the container runs under an arbitrary PUID that is not listed in `/etc/passwd` (the default `python:3.13-slim` image only has an entry for root, so running with `user: "1000:1000"` — or any non-root user — tripped the same error). The entire SpoolBuddy update path is now subprocess-free: keypairs are generated in-process via the `cryptography` library (already a dependency), SSH commands run through the pure-Python `asyncssh` client, and git-branch detection reads `.git/HEAD` directly instead of shelling out to `git`. asyncssh internally calls `getpass.getuser()` to resolve the *local* username for `~/.ssh/config` host matching, which hit the same missing-passwd-entry failure (asyncssh reported it as `SSH connection failed: no username set in the environment`); this is now worked around by setting `LOGNAME=bambuddy` at module import when neither `LOGNAME`/`USER`/`LNAME`/`USERNAME` nor the passwd lookup resolves a name, and by explicitly passing `config=[]` to skip the `~/.ssh/config` load (which also needs a resolvable home directory). Native installs behave identically — they already worked because the running user was always in `/etc/passwd`. Regression tests assert that neither keypair creation nor command execution spawns any subprocess, and that the local-username fallback fires only when the passwd lookup actually fails.
 - **Camera Stream "6 of 5" Reconnect Counter + ffmpeg Log Flood** ([#925](https://github.com/maziggy/bambuddy/issues/925)) — Two bugs surfaced while investigating camera reconnect behaviour. First, the camera page briefly displayed "Reconnecting attempt 6 of 5" before giving up, because the attempt counter could be incremented to the maximum while the reconnect banner was still rendering. The displayed value is now clamped to the configured maximum. Second, every failed ffmpeg spawn logged the full ~20-line ffmpeg version/configuration banner, producing hundreds of lines of noise per failed camera click (one reported click produced 555 log lines across 30 retries). A new stderr summarizer strips the ffmpeg banner before logging so only the actual error lines remain. The underlying "camera service stops accepting new connections after prolonged uptime" behaviour in the X1C firmware is still under investigation.
 - **LDAP POSIX Primary Group Ignored** — LDAP authentication only looked at groups that listed the user explicitly via `memberUid` (supplementary group membership). A user's POSIX primary group — referenced by the `gidNumber` attribute on the user object and matching the `gidNumber` on a `posixGroup` — was ignored entirely, so users whose role came from their primary group landed without the expected permissions. The authenticator now also searches for `posixGroup` entries whose `gidNumber` matches the user's primary `gidNumber`, and dedupes DNs case-insensitively before resolving the group mapping (LDAP DNs are case-insensitive by spec).
 - **Support Bundle Leaks Virtual Printer IP Address** — The debug support bundle included the `virtual_printer_remote_interface_ip` setting value unmasked in `support-info.json`. The setting key didn't match any of the existing sensitive-key filters, so the raw IP address was included in the bundle. Added `_ip` to the sensitive key filter so IP address settings are excluded from support bundles. Log file content was already covered by the existing IPv4 regex redaction.

+ 35 - 0
backend/app/services/spoolbuddy_ssh.py

@@ -13,6 +13,7 @@ entries for root). asyncssh does all of its work in-process.
 """
 
 import asyncio
+import getpass
 import logging
 import os
 from pathlib import Path
@@ -29,6 +30,39 @@ SSH_USER = "spoolbuddy"
 DEFAULT_INSTALL_PATH = "/opt/bambuddy"
 
 
+def _ensure_local_username_env() -> None:
+    """Make `getpass.getuser()` succeed even when the process runs under a UID
+    that is not listed in /etc/passwd.
+
+    asyncssh.connect() unconditionally calls `getpass.getuser()` to resolve
+    the *local* username (used for `~/.ssh/config` host matching, not the
+    remote login name). `getpass.getuser()` reads `LOGNAME`/`USER`/`LNAME`/
+    `USERNAME` first and falls back to `pwd.getpwuid(os.getuid())`. Inside a
+    Docker container with an arbitrary PUID (e.g. 1000 on `python:3.13-slim`,
+    which only has a root passwd entry), none of those env vars are set and
+    the pwd lookup raises `KeyError`, causing asyncssh to abort with
+    "Unknown local username: set one of LOGNAME, USER, LNAME, or USERNAME in
+    the environment".
+
+    If the lookup already works, or the operator has any of those env vars
+    set, this is a no-op. Otherwise we set a harmless `LOGNAME` default so
+    asyncssh can proceed. This only affects the resolution of the *local*
+    username; the SSH login user is always passed explicitly as `SSH_USER`.
+    """
+    try:
+        getpass.getuser()
+        return
+    except (KeyError, OSError):  # Python 3.13+ re-raises the failed pwd lookup as OSError
+        pass
+
+    if not any(os.environ.get(k) for k in ("LOGNAME", "USER", "LNAME", "USERNAME")):
+        os.environ["LOGNAME"] = "bambuddy"
+        logger.debug("Set LOGNAME=bambuddy for asyncssh (container UID has no /etc/passwd entry)")
+
+
+_ensure_local_username_env()
+
+
 def _get_ssh_key_dir() -> Path:
     """Return (and create if needed) the directory for SpoolBuddy SSH keys."""
     key_dir = settings.base_dir / "spoolbuddy" / "ssh"
@@ -139,6 +173,7 @@ async def _run_ssh_command(
                 username=SSH_USER,
                 client_keys=[str(private_key)],
                 known_hosts=None,  # equivalent to StrictHostKeyChecking=no + UserKnownHostsFile=/dev/null
+                config=[],  # do not load ~/.ssh/config — HOME may not resolve under arbitrary Docker PUIDs
                 connect_timeout=10,
             ) as conn:
                 result = await conn.run(command, check=False)

+ 47 - 0
backend/tests/unit/services/test_spoolbuddy_ssh.py

@@ -7,6 +7,7 @@ from unittest.mock import AsyncMock, MagicMock, patch
 import pytest
 
 from backend.app.services.spoolbuddy_ssh import (
+    _ensure_local_username_env,
     _get_ssh_key_dir,
     _run_ssh_command,
     detect_current_branch,
@@ -179,6 +180,49 @@ def test_detect_branch_default_main(tmp_path):
         assert detect_current_branch() == "main"
 
 
+# -- _ensure_local_username_env ------------------------------------------------
+
+
+def test_ensure_local_username_env_noop_when_getuser_works():
+    """When getpass.getuser() succeeds, the env must not be mutated."""
+    # Stash the current values so we can detect mutation.
+    before = {k: os.environ.get(k) for k in ("LOGNAME", "USER", "LNAME", "USERNAME")}
+    with patch("backend.app.services.spoolbuddy_ssh.getpass.getuser", return_value="realuser"):
+        _ensure_local_username_env()
+    after = {k: os.environ.get(k) for k in ("LOGNAME", "USER", "LNAME", "USERNAME")}
+    assert before == after
+
+
+def test_ensure_local_username_env_sets_logname_when_getuser_fails(monkeypatch):
+    """When getpass.getuser() raises KeyError AND no USER/LOGNAME/etc is set,
+    LOGNAME must be populated so asyncssh.connect() can proceed.
+
+    Regression guard for the Docker/PUID failure mode: asyncssh's connect()
+    calls getpass.getuser() unconditionally for ~/.ssh/config host matching,
+    and raises 'Unknown local username: set one of LOGNAME, USER, LNAME, or
+    USERNAME in the environment' when pwd.getpwuid() fails under an
+    arbitrary PUID not listed in /etc/passwd.
+    """
+    for key in ("LOGNAME", "USER", "LNAME", "USERNAME"):
+        monkeypatch.delenv(key, raising=False)
+    with patch("backend.app.services.spoolbuddy_ssh.getpass.getuser", side_effect=KeyError("no passwd entry")):
+        _ensure_local_username_env()
+    assert os.environ.get("LOGNAME") == "bambuddy"
+
+
+def test_ensure_local_username_env_respects_existing_env(monkeypatch):
+    """If the operator has set USER (or any of the fallback vars) but the
+    passwd lookup still fails, we must leave their value alone."""
+    monkeypatch.delenv("LOGNAME", raising=False)
+    monkeypatch.delenv("LNAME", raising=False)
+    monkeypatch.delenv("USERNAME", raising=False)
+    monkeypatch.setenv("USER", "operator")
+    with patch("backend.app.services.spoolbuddy_ssh.getpass.getuser", side_effect=KeyError):
+        _ensure_local_username_env()
+    assert os.environ.get("USER") == "operator"
+    assert os.environ.get("LOGNAME") is None
+
+
 # -- _run_ssh_command ----------------------------------------------------------
 #
 # _run_ssh_command uses asyncssh (pure Python) rather than the OpenSSH `ssh`
@@ -215,6 +259,9 @@ async def test_run_ssh_command_success(tmp_path):
     assert kwargs["client_keys"] == [str(key_file)]
     # Host-key verification is disabled (equivalent to StrictHostKeyChecking=no)
     assert kwargs["known_hosts"] is None
+    # ~/.ssh/config loading is disabled — HOME may not resolve under arbitrary
+    # Docker PUIDs.
+    assert kwargs["config"] == []
     mock_conn.run.assert_awaited_once()
     run_args = mock_conn.run.call_args
     assert run_args.args[0] == "echo hello"