moonlock/DECISIONS.md
nevaforget 3e610bdb4b
2026-04-24 14:05:17 +02:00


Decisions

Architectural and design decisions for Moonlock, in reverse chronological order.

2026-04-24 Audit LOW fixes: docs, rustdoc, check_account scope, debug gating, lto fat (v0.6.12)

  • Who: ClaudeCode, Dom
  • Why: Six LOW findings cleared in a single pass. (1) Docs referenced the old [0,100] blur range; the code has clamped to [0,200] since v0.6.8. (2) The MAX_BLUR_DIMENSION doc comment was split by a // SYNC: block, producing a truncated sentence in rustdoc. (3) check_account was pub and relied on callers only ever passing getuid()-derived usernames; the contract was not enforced by the type system. (4) The MOONLOCK_DEBUG env var flipped log verbosity to Debug in release builds, letting a compromised session script escalate journal noise about fprintd / D-Bus. (5) The absence of pam_setcred was undocumented. (6) [profile.release] used lto = "thin", which is fine for most crates, but for a latency-critical auth binary compiled once per release, fat LTO's extra cross-crate inlining is worth a release build that grows to roughly 1 min.
  • Tradeoffs: lto = "fat" roughly doubles release build time (~30 s → ~60 s) for slightly better inlining across PAM FFI wrappers and the i18n/status paths. #[cfg(debug_assertions)] on the debug-level selector means you have to run a debug build to raise log level — inconvenient for live troubleshooting, but aligned with the security-first posture.
  • How: (1) CLAUDE.md + README.md updated to [0,200]. (2) // SYNC: block moved above the /// doc so rustdoc renders one coherent paragraph. (3) check_account visibility narrowed to pub(crate) with a Precondition paragraph explaining the username contract. (4) Debug-level selection wrapped in #[cfg(debug_assertions)]; release builds always run at LevelFilter::Info. (5) Added a comment block in authenticate() documenting why pam_setcred is deliberately absent and where it would hook in if needed. (6) lto = "fat" in Cargo.toml.
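
The debug gate in item (4) can be sketched as two compile-time variants of one function. This is a minimal illustration, not the project's code: `select_log_level` and its `debug_requested` parameter (standing for "MOONLOCK_DEBUG is set") are hypothetical, and the local `LevelFilter` enum is a stand-in for `log::LevelFilter` so the snippet compiles without external crates.

```rust
/// Stand-in for `log::LevelFilter`, defined locally so the sketch has no
/// external dependencies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LevelFilter {
    Info,
    Debug,
}

/// Debug builds: the (hypothetical) flag may raise verbosity.
#[cfg(debug_assertions)]
pub fn select_log_level(debug_requested: bool) -> LevelFilter {
    if debug_requested {
        LevelFilter::Debug
    } else {
        LevelFilter::Info
    }
}

/// Release builds: the flag is ignored, so nothing at runtime can raise
/// the level above Info.
#[cfg(not(debug_assertions))]
pub fn select_log_level(_debug_requested: bool) -> LevelFilter {
    LevelFilter::Info
}
```

Because the release variant never reads its argument, the environment variable has no code path left to influence, which is the property the entry is after.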

2026-04-24 Audit MEDIUM fixes: D-Bus cleanup race, TOCTOU open, FP reset, GTK entry clear (v0.6.11)

  • Who: ClaudeCode, Dom
  • Why: Second round after the HIGH fixes, addressing the four MEDIUM findings. (1) cleanup_dbus spawned VerifyStop + Release as fire-and-forget, then resume_async called Claim after only a 2 s timeout — shorter than the 3 s D-Bus timeout, so on a slow bus the Claim could race the Release and fprintd would reject it, leaving the FP listener permanently dead. (2) load_background_texture relied on the caller's symlink_metadata check, re-opening the path via gdk::Texture::from_file — a classic TOCTOU window. (3) resume_async unconditionally reset failed_attempts, allowing an attacker with sensor control to evade the 10-attempt cap by cycling verify-match → check_account fail → resume. (4) The GTK PasswordEntry buffer was only cleared on timeout or auth failure, leaving the password in GLib malloc'd memory longer than necessary.
  • Tradeoffs: The D-Bus cleanup is now split into a synchronous helper (take_cleanup_proxy — signal disconnect + flag clear) and an async helper (perform_dbus_cleanup — VerifyStop + Release), so resume_async can await the release while stop() stays fire-and-forget. Dropping the failed_attempts reset means a flaky sensor could reach 10 failures faster, but the correct remedy is a new lock session (construction) rather than a reset that also helps attackers.
  • How: (1) Split cleanup_dbus into take_cleanup_proxy() (sync) + perform_dbus_cleanup(proxy) (async). resume_async now awaits perform_dbus_cleanup before begin_verification. stop() still spawns the cleanup fire-and-forget. (2) load_background_texture opens with O_NOFOLLOW via std::fs::OpenOptions::custom_flags, reads to bytes, and builds the texture via gdk::Texture::from_bytes. (3) Removed listener.borrow_mut().failed_attempts = 0 from resume_async. (4) password_entry.set_text("") now fires right after the Zeroizing::new(entry.text().to_string()) extraction, shortening the GTK-side window.
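
The O_NOFOLLOW open in item (2) might look roughly like the sketch below. It assumes Linux and hard-codes the flag value per architecture only to stay dependency-free; real code would normally use `libc::O_NOFOLLOW`. The function name `read_no_follow` is hypothetical, and the final step of handing the bytes to `gdk::Texture::from_bytes` is omitted.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

// Assumption: hard-coded Linux values so the sketch needs no external crate.
// Production code would take this constant from `libc::O_NOFOLLOW`.
#[cfg(all(target_os = "linux", any(target_arch = "x86_64", target_arch = "x86")))]
const O_NOFOLLOW: i32 = 0o400_000;
#[cfg(all(target_os = "linux", any(target_arch = "aarch64", target_arch = "arm")))]
const O_NOFOLLOW: i32 = 0o100_000;

/// Reads a file into memory, failing if the final path component is a
/// symlink. The check and the open are a single syscall, so there is no
/// window between "inspect the path" and "open the path".
pub fn read_no_follow(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = OpenOptions::new()
        .read(true)
        .custom_flags(O_NOFOLLOW)
        .open(path)?;
    let mut bytes = Vec::new();
    file.read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

Note that O_NOFOLLOW only guards the last path component; symlinks in parent directories are still followed.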

2026-04-24 Audit fixes: RefCell borrow across await, async avatar decode

  • Who: ClaudeCode, Dom
  • Why: Triple audit found two HIGH issues. (1) init_fingerprint_async held a RefCell immutable borrow across is_available_async().await; a concurrent connect_monitor signal (hotplug / suspend-resume) invoking borrow_mut() during the await would panic. (2) set_avatar_from_file decoded avatars synchronously via Pixbuf::from_file_at_scale, blocking the GTK main thread inside the connect_monitor handler. With MAX_AVATAR_FILE_SIZE at 10 MB the worst-case stall was 200-500 ms on monitor hotplug.
  • Tradeoffs: Avatar is shown as the symbolic default icon for a brief window while decoding completes. Wallpaper stays synchronous because connect_monitor fires during lock() and needs the texture already present (see 2026-04-09).
  • How: (1) Extract username into a local String in init_fingerprint_async, drop the borrow before the await, re-borrow in a new scope after — no awaits inside the second borrow, so hotplug during signal setup is safe. (2) set_avatar_from_file now uses gio::File::read_future + Pixbuf::from_stream_at_scale_future for async I/O and decode. The default icon is shown immediately; the decoded texture replaces it when ready. Pixbuf itself is !Send, so gio::spawn_blocking does not apply — the GIO async stream loader keeps the Pixbuf on the main thread while the kernel does the I/O asynchronously.
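
The borrow pattern from item (1) can be shown without an async runtime, since the hazard is simply a `Ref` guard that outlives the point where other code may call `borrow_mut()`. In this sketch `SharedState`, `snapshot_username`, and `finish_init` are hypothetical names; the suspension point would sit between the two calls.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for the state shared with signal handlers.
pub struct SharedState {
    pub username: String,
    pub fingerprint_ready: bool,
}

/// Copies out what is needed across the suspension point. The `Ref` guard
/// is a temporary that is dropped at the end of the `let` statement, so no
/// borrow is held once this function returns.
pub fn snapshot_username(state: &RefCell<SharedState>) -> String {
    let name = state.borrow().username.clone();
    name
}

/// Runs after the await: a fresh, short-lived mutable borrow with no
/// suspension point inside it.
pub fn finish_init(state: &RefCell<SharedState>, available: bool) {
    state.borrow_mut().fingerprint_ready = available;
}
```

While a guard from `state.borrow()` is alive, `state.try_borrow_mut()` returns an error and `borrow_mut()` panics; after `snapshot_username` returns, a mutable borrow from a signal handler succeeds.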

2026-04-09 Monitor hotplug via connect_monitor signal

  • Who: ClaudeCode, Dom
  • Why: moonlock crashed with a segfault in libgtk-4.so after suspend/resume: an HDMI monitor disconnect/reconnect invalidated GDK monitor objects, and the statically created windows referenced destroyed surfaces. The crash hit a consistent GTK4 offset (0x278 NULL dereference) and was reproduced three times.
  • Tradeoffs: Wallpaper texture now loaded before lock() instead of after (connect_monitor fires during lock() and needs the texture). Local JPEG loading is fast enough that the delay is negligible. Shared state moved to Rc's for the signal closure — slightly more indirection but necessary for dynamic window creation.
  • How: (1) Bump gtk4-session-lock feature from v1_1 to v1_2 to enable Instance::connect_monitor. (2) Replace manual monitor iteration with lock.connect_monitor() signal handler that creates windows on demand. (3) Signal fires once per existing monitor at lock() and again on hotplug. (4) Windows auto-unmap when their monitor disappears (ext-session-lock-v1 guarantee). (5) Fingerprint listener published to shared Rc so hotplugged monitors get FP labels.

2026-03-31 Fourth audit: peek icon, blur limit, GResource compression, sync markers

  • Who: ClaudeCode, Dom
  • Why: Fourth triple audit found blur limit inconsistency (moonlock 0-100 vs moongreet/moonset 0-200), missing GResource compression, peek icon inconsistency, and duplicated code without sync markers.
  • Tradeoffs: Peek icon enabled in lockscreen — user decision favoring UX consistency over shoulder-surfing protection. Acceptable for single-user desktop. Blur limit raised to 200 for ecosystem consistency.
  • How: (1) show_peek_icon(true) in lockscreen password entry. (2) clamp(0.0, 200.0) for blur in config.rs. (3) compressed="true" on CSS/SVG GResource entries. (4) SYNC comments on duplicated blur/background functions pointing to moongreet and moonset.
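
The clamp from item (2) amounts to a one-line normalization of the optional config value. A minimal sketch, assuming the value arrives as `Option<f32>`; the names `MAX_BLUR` and `clamp_blur` are hypothetical:

```rust
/// Upper bound from the entry above; the constant name is hypothetical.
pub const MAX_BLUR: f32 = 200.0;

/// Normalizes the optional blur value read from the config file.
/// `None` (option absent) is passed through unchanged.
pub fn clamp_blur(raw: Option<f32>) -> Option<f32> {
    raw.map(|sigma| sigma.clamp(0.0, MAX_BLUR))
}
```

One caveat worth knowing: `f32::clamp` returns NaN when the input is NaN, so a config format that can express NaN would need a separate check before the value reaches the renderer.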

2026-03-30 Third audit: blur offset, lock-before-IO, FP signal lifecycle, TOCTOU

  • Who: ClaudeCode, Dom
  • Why: Third triple audit (quality, performance, security) found: blur padding offset rendering texture at (0,0) instead of (-pad,-pad) causing edge darkening on left/top (BUG), wallpaper disk I/O blocking before lock() extending the unsecured window (PERF/SEC), signal handler duplication on resume_async (SEC), failed_attempts not reset on FP resume (SEC), unknown VerifyStatus with done=false hanging FP listener (SEC), TOCTOU in is_file+is_symlink checks (SEC), dead code in faillock_warning (QUALITY), unbounded blur sigma (SEC).
  • Tradeoffs: Wallpaper loads after lock() — screen briefly shows without wallpaper until texture is ready. Acceptable: security > aesthetics. Blur sigma clamped to [0.0, 100.0] — arbitrary upper bound but prevents GPU memory exhaustion.
  • How: (1) Texture offset to (-pad, -pad) in render_blurred_texture. (2) lock.lock() before resolve_background_path. (3) begin_verification disconnects old signal_id before registering new. (4) resume_async resets failed_attempts. (5) Unknown VerifyStatus with done=true triggers restart. (6) symlink_metadata() for atomic file+symlink check. (7) faillock_warning dead code removed, saturating_sub. (8) background_blur clamped. (9) Redundant Zeroizing<Vec> removed. (10) Default impl for FingerprintListener. (11) on_verify_status restricted to pub(crate). (12) Warn logging for non-UTF-8 GECOS and avatar paths.
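
Item (6) replaces two separate path checks with one metadata lookup. A minimal sketch of that idea, with the hypothetical helper name `is_regular_non_symlink`:

```rust
use std::fs;
use std::path::Path;

/// Returns true only for a regular file that is not a symlink.
/// `symlink_metadata` does not follow the final symlink, so a link reports
/// the file type "symlink" and `is_file()` is false for it. One lookup
/// answers both questions, instead of two calls that could observe
/// different filesystem states.
pub fn is_regular_non_symlink(path: &Path) -> bool {
    match fs::symlink_metadata(path) {
        Ok(meta) => meta.file_type().is_file(),
        Err(_) => false,
    }
}
```

This still leaves a gap between the check and a later open of the same path, which is the window the v0.6.11 entry closes with O_NOFOLLOW.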

2026-03-30 Second audit: zeroize CString, FP account check, PAM timeout, blur downscale

  • Who: ClaudeCode, Dom
  • Why: Second triple audit (quality, performance, security) found: CString password copy not zeroized (HIGH), fingerprint unlock bypassing pam_acct_mgmt (MEDIUM), no PAM timeout leaving user locked out on hanging modules (MEDIUM), GPU blur on full wallpaper resolution (MEDIUM), no-monitor edge case doing return instead of exit(1) (MEDIUM).
  • Tradeoffs: PAM timeout (30s) uses a generation counter to avoid stale result interference — adds complexity but prevents parallel PAM sessions. FP restart after failed account check re-claims the device, adding a D-Bus round-trip, but prevents permanent FP death on transient failures. Blur downscale to 1920px cap trades negligible quality for ~4x less GPU work on 4K wallpapers.
  • How: (1) Zeroizing<CString> wraps password in auth.rs, zeroize/std feature enabled. (2) check_account() calls pam_acct_mgmt after FP match; resume_async() restarts FP on transient failure. (3) auth_generation counter invalidates stale PAM results; 30s timeout re-enables UI. (4) MAX_BLUR_DIMENSION caps blur input at 1920px, sigma scaled proportionally. (5) exit(1) on no-monitor after lock.lock().
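
The generation counter in item (3) can be sketched as a ticket scheme: every attempt takes a ticket, and a result is applied only if its ticket is still current. This is an illustration under assumptions, not the project's implementation; the `AuthGeneration` type and its methods are hypothetical.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical generation counter shared between the UI and the callbacks
/// that deliver PAM results. Cloning shares the same counter.
#[derive(Clone, Default)]
pub struct AuthGeneration(Rc<Cell<u64>>);

impl AuthGeneration {
    /// Starts a new attempt and returns its ticket.
    pub fn begin(&self) -> u64 {
        let next = self.0.get().wrapping_add(1);
        self.0.set(next);
        next
    }

    /// Invalidates the attempt in flight, e.g. when the timeout fires and
    /// the UI is re-enabled.
    pub fn invalidate(&self) {
        self.begin();
    }

    /// A late result is applied only if its ticket is still the current one.
    pub fn is_current(&self, ticket: u64) -> bool {
        self.0.get() == ticket
    }
}
```

A result callback would capture its ticket at spawn time and return early when `is_current(ticket)` is false, so a PAM call that finishes after the timeout cannot overwrite the state of a newer attempt.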

2026-03-28 Remove embedded wallpaper from binary

  • Who: ClaudeCode, Dom
  • Why: Wallpaper is installed by moonarch to /usr/share/moonarch/wallpaper.jpg. Embedding a 374K JPEG in the binary is redundant. GTK background color (Catppuccin Mocha base) is a clean fallback.
  • Tradeoffs: Without moonarch installed AND without config, lockscreen shows plain dark background instead of wallpaper. Acceptable — that's the expected minimal state.
  • How: Remove wallpaper.jpg from GResources, return None from resolve_background_path when no file found, skip background picture creation when no texture available.

2026-03-28 Audit-driven security and lifecycle fixes (v0.6.0)

  • Who: ClaudeCode, Dom
  • Why: Triple audit (quality, performance, security) revealed a critical D-Bus signal spoofing vector, fingerprint lifecycle bugs, and multi-monitor performance issues.
  • Tradeoffs: cleanup_dbus() extraction adds a method but clarifies the stop/match ownership; running_flag: Rc<Cell<bool>> adds a field but prevents race between async restart and stop; sender validation adds a check per signal but closes the only known auth bypass.
  • How: (1) Validate D-Bus VerifyStatus sender against fprintd's unique bus name. (2) Extract cleanup_dbus() from stop(), call it on verify-match. (3) Rc<Cell<bool>> running flag checked after await in restart_verify_async. (4) Consistent 3s D-Bus timeouts. (5) Panic hook before logging. (6) Blur and avatar caches shared across monitors. (7) Peek icon disabled. (8) Symlink rejection for background_path. (9) TOML parse errors logged.
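
The sender check in item (1) reduces to comparing the signal's sender with the unique bus name previously resolved for fprintd. A minimal sketch with hypothetical names; how the sender and the unique name are obtained from the D-Bus binding is deliberately left out.

```rust
/// Returns true only when the signal was emitted by the connection that
/// owns fprintd's unique bus name. Both parameters are hypothetical: the
/// first is the sender reported with the signal, the second is the unique
/// name (such as ":1.42") resolved for fprintd beforehand.
pub fn is_trusted_sender(signal_sender: Option<&str>, fprintd_unique_name: &str) -> bool {
    match signal_sender {
        // An unresolved (empty) owner name must never match anything.
        Some(sender) => !fprintd_unique_name.is_empty() && sender == fprintd_unique_name,
        // A signal without a sender cannot be attributed and is rejected.
        None => false,
    }
}
```

Unique names are assigned by the bus daemon and are not reused during its lifetime, which is why comparing against one is stronger than trusting the interface or member name carried in the signal.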

2026-03-28 GPU blur via GskBlurNode replaces CPU blur

  • Who: ClaudeCode, Dom
  • Why: CPU-side Gaussian blur (image crate) blocked the GTK main thread for 500 ms to 2 s on 4K wallpapers at cold cache. Disk cache mitigated repeat starts but added ~100 lines of complexity.
  • Tradeoffs: GPU blur quality is slightly different (box-blur approximation vs true Gaussian), acceptable for wallpaper. Removes image and dirs dependencies entirely. No disk cache needed.
  • How: Snapshot::push_blur() + GskRenderer::render_texture() on connect_realize. Blur happens once on the GPU when the widget gets its renderer, producing a concrete gdk::Texture. Zero startup latency.

2026-03-28 Optional background blur via image crate (superseded)

  • Who: ClaudeCode, Dom
  • Why: Consistent with moonset/moongreet; a blurred wallpaper as lockscreen background is a common UX pattern.
  • Tradeoffs: Adds image crate dependency (~15 transitive crates); CPU-side Gaussian blur at load time adds startup latency proportional to image size and sigma. Acceptable because blur runs once and the texture is shared across monitors.
  • How: load_background_texture(bg_path, blur_radius) loads texture, optionally applies imageops::blur(), returns gdk::Texture. Config option background_blur: Option<f32> in TOML.

2026-03-28 Shared wallpaper texture pattern (aligned with moonset/moongreet)

  • Who: ClaudeCode, Dom
  • Why: Previously loaded wallpaper per-window via Picture::for_filename(). Multi-monitor setups decoded the JPEG redundantly. Blur feature requires texture pixel access anyway.
  • Tradeoffs: Slightly more code in main.rs (texture loaded before window creation), but avoids redundant decoding and enables the blur feature.
  • How: load_background_texture() in lockscreen.rs decodes once, create_background_picture() wraps shared gdk::Texture in gtk::Picture. Same pattern as moonset/moongreet.