DB Worker Node Owner-Aware Process Management Implementation Plan
Goal: Add owner-aware lock metadata and orphan-process recovery so CLI and Electron can safely share one graph daemon without cross-managing each other.
Architecture: Keep one db-worker.lock per graph directory, but extend lock schema with owner-source so lifecycle actions can enforce owner boundaries.
Architecture: Keep read and write traffic reusable across clients, while restricting stop and restart to the side that originally started the daemon.
Architecture: Add orphan-process detection for lock-missing cases so logseq server restart does not hang on timeout when a legacy process is still alive.
Tech Stack: ClojureScript, Node.js child process APIs, promesa, logseq.cli.server, logseq.db-worker.daemon, Electron main-process daemon manager, db-worker-node lock helpers.
Related: Builds on docs/agent-guide/033-desktop-db-worker-node-backend.md.
Related: Relates to docs/agent-guide/030-logseq-cli-db-graph-default-dir-locking.md.
Related: Relates to docs/agent-guide/003-db-worker-node-cli-orchestration.md.
Problem statement
Current lock payload only records repo, pid, host, and port, so ownership is implicit and lifecycle commands cannot distinguish CLI-started and Electron-started daemons.
stop-server! and restart-server! can currently terminate any alive daemon if the lock exists, which violates the requirement that each client should manage only its own process.
If a db-worker-node process remains alive but lock file is missing, server restart can wait until timeout because startup relies on lock appearance and has no orphan recovery path.
When CLI starts a daemon first, Electron may treat the runtime as managed-by-self and attempt stop or restart logic, which can break graph open flow and produce user-facing errors.
Testing Plan
I will follow @test-driven-development and add failing tests before each implementation change.
I will add lock schema and owner compatibility tests in src/test/frontend/worker/db_worker_node_lock_test.cljs.
I will add daemon-owner lifecycle and orphan-recovery tests in src/test/logseq/cli/server_test.cljs and src/test/logseq/db_worker/daemon_test.cljs.
I will add Electron manager tests for external-runtime attachment and no-cross-stop behavior in src/test/electron/db_worker_manager_test.cljs.
I will add db-worker-node argument and lock-write tests in src/test/frontend/worker/db_worker_node_test.cljs.
I will run focused red-green loops first, then run bb dev:lint-and-test, and finish with a review pass against @prompts/review.md.
NOTE: I will write all tests before I add any implementation behavior.
Current behavior map
| Area | Current behavior | Target behavior |
|---|---|---|
| Lock metadata | No owner field in db-worker.lock. | Lock includes owner-source as cli or electron, plus versioned metadata. |
| Lifecycle authority | Any caller can stop or restart lock-owned daemon. | stop and restart are allowed only for matching owner-source. |
| Runtime reuse | Reuse happens, but manager cannot tell owned vs external runtime. | Reuse still happens, and runtime state tracks owned? to prevent cross-stop. |
| Lock missing + orphan process | Startup can time out with no clear recovery path. | Orphan detection and cleanup runs before spawn when the lock is missing; if the startup wait still fails, the error is :server-start-timeout-orphan with the discovered pids. |
| Compatibility | Legacy lock without owner is ambiguous. | Legacy lock is treated as owner-source: unknown with explicit policy. |
Integration sketch
CLI or Electron request ensure-server(repo, requester-owner)
  -> read lock
  -> if lock exists and healthy:
       return runtime + ownership(owned/external)
  -> if lock missing:
       scan orphan db-worker-node process for same repo/data-dir
       if orphan found:
         terminate orphan
       spawn new daemon with --owner-source <cli|electron>
  -> db-worker-node writes lock {repo,pid,host,port,owner-source,lock-id,...}

stop/restart(requester-owner)
  -> read lock owner-source
  -> if owner-source matches requester-owner: allow
  -> else: deny with :server-owned-by-other
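The ensure-server half of the sketch can be expressed as a pure decision function. This is a minimal illustration in TypeScript rather than the project's ClojureScript. The names planEnsureServer, EnsurePlan, and the camelCase field ownerSource are hypothetical. The stale-lock branch assumes the existing stale-lock cleanup is delegated to elsewhere.

```typescript
// All names here are hypothetical; only the lock fields come from the plan.
type Requester = "cli" | "electron";
type OwnerSource = Requester | "unknown";

interface Lock {
  repo: string;
  pid: number;
  host: string;
  port: number;
  ownerSource: OwnerSource;
}

type EnsurePlan =
  | { action: "reuse"; owned: boolean }
  | { action: "cleanup-stale-lock" }
  | { action: "spawn"; ownerSource: Requester; terminateFirst: number[] };

// Decide what ensure-server should do next, without performing side effects.
function planEnsureServer(
  lock: Lock | null,
  healthy: boolean,
  requester: Requester,
  orphanPids: number[],
): EnsurePlan {
  if (lock !== null) {
    // Healthy lock: reuse the daemon and record whether the requester owns it.
    // Unhealthy lock: assumed to be handled by the existing stale-lock cleanup.
    return healthy
      ? { action: "reuse", owned: lock.ownerSource === requester }
      : { action: "cleanup-stale-lock" };
  }
  // Lock missing: terminate any discovered orphans, then spawn as the requester.
  return { action: "spawn", ownerSource: requester, terminateFirst: orphanPids };
}
```

Keeping the decision separate from process spawning and signalling makes each branch testable without starting a real daemon.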
Implementation plan
Phase 1: Add failing tests for owner-aware lock schema and policies.
- Add a failing test in src/test/frontend/worker/db_worker_node_lock_test.cljs that lock serialization includes owner-source.
- Add a failing test in src/test/frontend/worker/db_worker_node_lock_test.cljs that missing owner metadata is normalized to :unknown.
- Add a failing test in src/test/frontend/worker/db_worker_node_test.cljs that --owner-source cli is written into the lock.
- Add a failing test in src/test/frontend/worker/db_worker_node_test.cljs that --owner-source electron is written into the lock.
- Add a failing test in src/test/logseq/cli/server_test.cljs that stop-server! returns :server-owned-by-other on owner mismatch.
- Add a failing test in src/test/logseq/cli/server_test.cljs that restart-server! does not SIGTERM external-owner daemon.
- Add a failing test in src/test/electron/db_worker_manager_test.cljs that external runtime release does not call stop-daemon!.
- Run bb dev:test -v 'frontend.worker.db-worker-node-lock-test' and confirm failures match new assertions.
- Run bb dev:test -v 'logseq.cli.server-test' and confirm failures match new assertions.
Phase 2: Extend lock schema and daemon startup arguments.
- Update argument parsing in src/main/frontend/worker/db_worker_node.cljs to accept --owner-source.
- Add owner-source validation in src/main/frontend/worker/db_worker_node.cljs with allowed values cli, electron, and fallback unknown.
- Thread owner-source through start-daemon! in src/main/frontend/worker/db_worker_node.cljs into lock creation.
- Update create-lock! in src/main/frontend/worker/db_worker_node_lock.cljs to persist owner-source.
- Update read-lock normalization path in src/main/frontend/worker/db_worker_node_lock.cljs to inject :owner-source :unknown for legacy files.
- Keep update-lock! in src/main/frontend/worker/db_worker_node_lock.cljs from mutating existing owner-source during port updates.
- Add targeted tests for lock read-write roundtrip in src/test/frontend/worker/db_worker_node_lock_test.cljs.
- Run bb dev:test -v 'frontend.worker.db-worker-node-test' and make sure lock owner assertions pass.
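The validation, legacy normalization, and port-update rules in this phase could look roughly like the following. It is a sketch in TypeScript, not the ClojureScript that would land in db_worker_node.cljs or db_worker_node_lock.cljs. The function names and the assumption that the raw lock stores the owner under an "owner-source" key are illustrative only.

```typescript
type OwnerSource = "cli" | "electron" | "unknown";

// Hypothetical shape of a lock as read from disk; legacy files lack the owner key.
interface RawLock {
  repo: string;
  pid: number;
  host: string;
  port: number;
  "owner-source"?: unknown;
}

interface Lock {
  repo: string;
  pid: number;
  host: string;
  port: number;
  ownerSource: OwnerSource;
}

// Anything other than the two allowed values falls back to "unknown".
function normalizeOwnerSource(value: unknown): OwnerSource {
  return value === "cli" || value === "electron" ? value : "unknown";
}

// Read the value following --owner-source, if the flag is present at all.
function parseOwnerSourceArg(argv: readonly string[]): OwnerSource {
  const i = argv.indexOf("--owner-source");
  return i >= 0 ? normalizeOwnerSource(argv[i + 1]) : "unknown";
}

// Legacy locks without an owner are normalized to "unknown" on read.
function normalizeLock(raw: RawLock): Lock {
  return {
    repo: raw.repo,
    pid: raw.pid,
    host: raw.host,
    port: raw.port,
    ownerSource: normalizeOwnerSource(raw["owner-source"]),
  };
}

// A port update copies every other field, so the owner is never rewritten here.
function withUpdatedPort(lock: Lock, port: number): Lock {
  return { ...lock, port };
}
```

Normalizing on read means the rest of the code only ever sees one of three owner values, so the legacy case needs no special handling downstream.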
Phase 3: Make CLI server orchestration owner-aware.
- Add requester owner config to ensure-server! in src/main/logseq/cli/server.cljs.
- Pass owner-source to daemon spawn args from spawn-server! in src/main/logseq/db_worker/daemon.cljs.
- Update ensure-server-started! in src/main/logseq/cli/server.cljs to return ownership metadata for caller state tracking.
- Update stop-server! in src/main/logseq/cli/server.cljs to deny stop when lock owner-source differs from requester owner.
- Update restart-server! in src/main/logseq/cli/server.cljs to preserve the same owner check semantics as stop-server!.
- Update server-status and list-servers in src/main/logseq/cli/server.cljs to include owner-source in output payload.
- Update server command response formatting so logseq server list human output includes an OWNER column mapped from owner-source (and preserves owner metadata in structured output).
- Add regression tests in src/test/logseq/cli/server_test.cljs for owner-aware start, stop, and restart.
- Run bb dev:test -v 'logseq.cli.server-test' and verify no timeout-based flaky failure remains.
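One way to keep stop-server! and restart-server! on identical semantics is to route both through a single authorization check. The TypeScript below is a hedged sketch of that check, and authorizeLifecycle and LifecycleDecision are hypothetical names. The CLI takeover of an unknown owner follows the v1 decision recorded in this plan. Denying Electron on an unknown owner is an assumption, since the plan does not state that case.

```typescript
type Requester = "cli" | "electron";
type OwnerSource = Requester | "unknown";

type LifecycleDecision =
  | { allowed: true; takeover: boolean }
  | { allowed: false; code: "server-owned-by-other"; owner: OwnerSource };

// Shared gate for stop and restart: only the matching owner may proceed.
function authorizeLifecycle(lockOwner: OwnerSource, requester: Requester): LifecycleDecision {
  if (lockOwner === requester) {
    return { allowed: true, takeover: false };
  }
  // v1 decision from the plan: CLI may take over a legacy lock with unknown owner.
  if (lockOwner === "unknown" && requester === "cli") {
    return { allowed: true, takeover: true };
  }
  // Assumption: every other mismatch, including Electron vs unknown, is denied.
  return { allowed: false, code: "server-owned-by-other", owner: lockOwner };
}
```

Returning the lock's owner alongside the error code lets the CLI report who owns the daemon without a second lock read.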
Phase 4: Add orphan-process detection and recovery for lock-missing start.
- Add process listing helper in src/main/logseq/db_worker/daemon.cljs that discovers db-worker-node processes by repo and data-dir arguments.
- Add parser helpers in src/main/logseq/db_worker/daemon.cljs to read repo, data-dir, and owner-source from command args.
- Add cleanup-orphan-process! in src/main/logseq/db_worker/daemon.cljs to SIGTERM matched orphan pids before new spawn.
- Call orphan cleanup path in ensure-server-started! in src/main/logseq/cli/server.cljs when lock is missing before spawn.
- Add timeout fallback in ensure-server-started! in src/main/logseq/cli/server.cljs to emit :server-start-timeout-orphan with discovered pids.
- Add unit tests in src/test/logseq/db_worker/daemon_test.cljs for process-arg parsing and orphan match logic.
- Add CLI regression test in src/test/logseq/cli/server_test.cljs for lock-missing orphan scenario to avoid raw timeout.
- Run bb dev:test -v 'logseq.db-worker.daemon-test' and bb dev:test -v 'logseq.cli.server-test'.
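The process-arg parsing and orphan match logic can be kept pure by working on process-listing text rather than calling the OS directly. The TypeScript sketch below assumes input lines of the form "pid command args...", such as `ps -eo pid=,args=` prints on macOS and Linux. It also assumes the daemon takes --repo and --data-dir flags and that its command line contains the string db-worker-node. Those spellings and all function names are assumptions, not confirmed by the plan.

```typescript
interface ProcInfo {
  pid: number;
  args: string[];
}

// Parse one "pid command args..." line; returns null for lines that do not fit.
function parsePsLine(line: string): ProcInfo | null {
  const m = /^\s*(\d+)\s+(.*)$/.exec(line);
  if (m === null) return null;
  const [, pidText = "", cmd = ""] = m;
  // Whitespace splitting is a simplification: paths containing spaces would break.
  return { pid: Number(pidText), args: cmd.split(/\s+/).filter((s) => s.length > 0) };
}

// Return the token that follows a flag, or undefined when the flag is absent.
function flagValue(args: readonly string[], flag: string): string | undefined {
  const i = args.indexOf(flag);
  return i >= 0 ? args[i + 1] : undefined;
}

// Find db-worker-node processes for the same repo and data-dir, excluding ourselves.
function findOrphanPids(
  psOutput: string,
  repo: string,
  dataDir: string,
  selfPid: number,
): number[] {
  return psOutput
    .split("\n")
    .map(parsePsLine)
    .filter((p): p is ProcInfo => p !== null)
    .filter((p) => p.pid !== selfPid)
    .filter((p) => p.args.some((a) => a.includes("db-worker-node")))
    .filter(
      (p) => flagValue(p.args, "--repo") === repo && flagValue(p.args, "--data-dir") === dataDir,
    )
    .map((p) => p.pid);
}
```

Because the matcher takes plain text, the unit tests planned for daemon_test.cljs can feed it fixture strings instead of spawning real processes.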
Phase 5: Make Electron manager attach external daemon without cross-management.
- Pass requester owner as electron from start-managed-daemon! in src/electron/electron/db_worker.cljs.
- Save ownership flag in manager runtime state in src/electron/electron/db_worker.cljs.
- Update stop flow in src/electron/electron/db_worker.cljs so stop-daemon! runs only when owned? is true.
- Update unhealthy-runtime branch in src/electron/electron/db_worker.cljs to avoid stopping external owner daemon and re-resolve runtime instead.
- Add tests in src/test/electron/db_worker_manager_test.cljs for external runtime reuse plus no-stop-on-release.
- Run bb dev:test -v 'electron.db-worker-manager-test' and confirm lifecycle behavior.
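The manager rules above reduce to two small guards on the ownership flag. The TypeScript below is an illustrative sketch, not the Electron manager's actual ClojureScript. ManagedRuntime, releaseRuntime, and recoverUnhealthy are hypothetical names, and stopDaemon is injected as a stand-in for stop-daemon!.

```typescript
// Hypothetical runtime state; `owned` plays the role of the plan's owned? flag.
interface ManagedRuntime {
  repo: string;
  owned: boolean;
}

type StopDaemon = (repo: string) => void;

// On release (for example window close), stop only daemons this manager started.
function releaseRuntime(runtime: ManagedRuntime, stopDaemon: StopDaemon): "stopped" | "detached" {
  if (!runtime.owned) return "detached";
  stopDaemon(runtime.repo);
  return "stopped";
}

// On a failed health check, never stop an external daemon; re-resolve the runtime instead.
function recoverUnhealthy(
  runtime: ManagedRuntime,
  stopDaemon: StopDaemon,
): "restart" | "re-resolve" {
  if (!runtime.owned) return "re-resolve";
  stopDaemon(runtime.repo);
  return "restart";
}
```

Injecting the stop function makes the no-stop-on-release test a matter of checking that the callback was never invoked.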
Phase 6: Update docs and error surfaces.
- Update CLI docs in docs/cli/logseq-cli.md to document owner-aware server stop and server restart behavior.
- Update desktop lifecycle docs in docs/developers/desktop-db-worker-node.md to explain external runtime attachment semantics.
- Add explicit error messages for :server-owned-by-other and :server-start-timeout-orphan in src/main/logseq/cli/format.cljs.
- Add one integration note in docs/agent-guide/033-desktop-db-worker-node-backend.md linking to owner-aware behavior.
Phase 7: Full verification and review gate.
- Run bb dev:test -v 'frontend.worker.db-worker-node-test'.
- Run bb dev:test -v 'frontend.worker.db-worker-node-lock-test'.
- Run bb dev:test -v 'logseq.db-worker.daemon-test'.
- Run bb dev:test -v 'logseq.cli.server-test'.
- Run bb dev:test -v 'electron.db-worker-manager-test'.
- Run bb dev:lint-and-test and confirm zero failures and zero errors.
- Perform final review checklist pass against @prompts/review.md.
Edge cases
| Scenario | Expected behavior |
|---|---|
| Lock file missing but old CLI daemon still alive. | CLI restart detects orphan by repo and data-dir, cleans it, then starts cleanly. |
| Lock owner is cli and Electron calls ensure runtime. | Electron reuses runtime with owned? false and never stops it on window close. |
| Lock owner is electron and CLI calls server stop. | CLI returns :server-owned-by-other with owner metadata and no process kill. |
| Legacy lock file has no owner-source field. | System treats owner as unknown and allows CLI takeover with owner metadata rewrite. |
| Two owners race to start same graph. | First lock wins and second caller reuses healthy daemon without extra spawn. |
| Owner process crashes and lock remains stale. | Stale lock cleanup still works, and next owner can start daemon normally. |
Verification commands and expected outputs
bb dev:test -v 'frontend.worker.db-worker-node-lock-test'
bb dev:test -v 'frontend.worker.db-worker-node-test'
bb dev:test -v 'logseq.db-worker.daemon-test'
bb dev:test -v 'logseq.cli.server-test'
bb dev:test -v 'electron.db-worker-manager-test'
bb dev:lint-and-test
Each command should finish with 0 failures, 0 errors.
The owner-mismatch tests should return :server-owned-by-other instead of timeout or forced stop behavior.
logseq server list human output should include an OWNER column.
The orphan-recovery tests should return deterministic cleanup behavior instead of waiting until generic timeout.
Testing Details
The tests validate behavior by asserting lifecycle authority boundaries, successful runtime reuse across clients, and orphan recovery outcomes.
The tests avoid mock-only success criteria by asserting returned error codes and process-management side effects observable from public APIs.
The critical regressions are lock-missing orphan restart and CLI-first then Electron-open graph flow, and both are explicitly covered.
Implementation Details
- Extend lock payload with owner-source and preserve it across lock updates.
- Pass --owner-source when spawning db-worker-node from both CLI and Electron pathways.
- Return ownership metadata from server orchestration so callers can track owned? state.
- Enforce owner check for stop and restart while keeping read and write invoke reuse unchanged.
- Add orphan process discovery by command args for lock-missing recovery.
- Scope orphan process discovery in v1 to macOS and Linux, and use a Windows-safe no-op fallback.
- Keep stale-lock cleanup logic and layer orphan recovery without changing healthy lock reuse flow.
- Add explicit CLI error codes for owner mismatch and orphan timeout contexts.
- Prevent Electron manager from stopping external-owner runtime on release or health fallback.
- Document operator-visible behavior changes in CLI and desktop developer docs.
- Execute full suite and @prompts/review.md checks before merge.
Question
Decision: CLI is allowed to take over owner-source: unknown and rewrite ownership metadata in v1.
Decision: when lock file is missing, orphan cleanup terminates all matching repo and data-dir processes.
Decision: v1 scopes orphan process-scan support to macOS and Linux only, with a Windows-safe no-op fallback.
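The platform scoping in the last decision could be implemented as a small selector that swaps in a no-op scanner on unsupported platforms. The TypeScript below is a sketch under that assumption, and orphanScannerFor and OrphanScanner are hypothetical names. The platform strings are the values Node.js reports in process.platform: darwin for macOS, linux, and win32 for Windows.

```typescript
// A scanner returns the pids of matching orphan processes.
type OrphanScanner = () => number[];

// Use the real scanner on macOS and Linux; everywhere else report no orphans.
function orphanScannerFor(platform: string, posixScanner: OrphanScanner): OrphanScanner {
  if (platform === "darwin" || platform === "linux") {
    return posixScanner;
  }
  // Windows-safe no-op fallback: never scans, never finds anything to terminate.
  return () => [];
}
```

With this shape, callers run the same cleanup path on every platform, and on Windows it terminates nothing.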