diff --git a/AGENTS.md b/AGENTS.md index a0bea1d6e9..3f1d7db043 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -47,3 +47,4 @@ - Avoid creating new class or property unless you have to. - Avoid shadow var, e.g. `bytes` should be named as `payload`. - Avoid using `js/Buffer` in browser related code. +- Git commits should include a concise summary. diff --git a/docs/agent-guide/036-db-worker-node-ncc-bundling.md b/docs/agent-guide/036-db-worker-node-ncc-bundling.md deleted file mode 100644 index dccc7f4929..0000000000 --- a/docs/agent-guide/036-db-worker-node-ncc-bundling.md +++ /dev/null @@ -1,199 +0,0 @@ -# db-worker-node ncc Standalone Bundle Implementation Plan - -Goal: Build `db-worker-node.js` with `@vercel/ncc` so the runtime can run without `node_modules` present next to the executable. - -Architecture: Keep `shadow-cljs` as the source compiler for `:db-worker-node`, then run `ncc` on the generated entry and publish a single runtime artifact in `dist/` that is used by CLI daemon orchestration. -Architecture: Preserve local development ergonomics by keeping `static/db-worker-node.js` for fast dev loops, while production and package paths resolve to the ncc artifact first. - -Tech Stack: ClojureScript, `shadow-cljs` `:node-script`, `@vercel/ncc`, Node.js 22, `pnpm` scripts in `package.json`, existing CLI daemon and doctor checks. - -Related: Builds on `docs/agent-guide/031-logseq-cli-doctor-command.md`. -Related: Relates to `docs/agent-guide/033-desktop-db-worker-node-backend.md`. -Related: Relates to `docs/agent-guide/035-logseq-cli-db-worker-deps-cli-decoupling.md`. - -## Problem statement - -`/Users/rcmerci/gh-repos/logseq/static/db-worker-node.js` is currently generated by `shadow-cljs`, and runtime behavior assumes dependencies are available from `node_modules`. - -The CLI server startup path in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` points to `../dist/db-worker-node.js`, which is currently a thin wrapper that forwards to `../static/db-worker-node.js`. - -This wrapper model keeps runtime coupled to workspace layout and to `node_modules`, so the daemon script is not independently portable. - -We need a deterministic packaging path that produces one runnable artifact plus copied native assets, and we need to verify that artifact works when `node_modules` is absent. - -Electron release packaging also needs to include the db-worker standalone bundle step, so `pnpm release-electron` always ships the same packaged runtime artifact. - -The solution must keep existing CLI and Electron daemon orchestration behavior unchanged, including lock-file semantics, owner-source semantics, and health endpoint behavior. - -## Current packaging map - -| Area | Current behavior | Limitation | -| --- | --- | --- | -| Build output | `pnpm db-worker-node:compile` writes `/Users/rcmerci/gh-repos/logseq/static/db-worker-node.js`. | Output is not a standalone distribution artifact. | -| Dist entry | `/Users/rcmerci/gh-repos/logseq/dist/db-worker-node.js` only `require`s `../static/db-worker-node.js`. | Runtime still depends on static output and installed dependencies. | -| Daemon spawn | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` spawns `../dist/db-worker-node.js`. | Spawn path is stable, but executable is not standalone. | -| Doctor check | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/command/doctor.cljs` checks `../static/db-worker-node.js` by default. | Diagnostic target does not match the intended distributable runtime. | -| Package manifest | `/Users/rcmerci/gh-repos/logseq/package.json` includes `static/db-worker-node.js` in `files`. | Published package does not guarantee standalone daemon artifact contract. | -| Electron release | `pnpm release-electron` does not guarantee db-worker bundle refresh before packaging. | Desktop release artifact can drift from standalone db-worker bundle contract. | - -## Target packaging map - -| Area | Target behavior | Verification signal | -| --- | --- | --- | -| Bundle output | `ncc` emits a standalone `db-worker-node` runtime in `/Users/rcmerci/gh-repos/logseq/dist/` with required runtime assets copied adjacent to entrypoint. | Daemon starts and serves `/healthz` and `/readyz` without `node_modules`. | -| Spawn path | CLI server keeps spawning `/Users/rcmerci/gh-repos/logseq/dist/db-worker-node.js` as canonical runtime. | Existing `logseq.cli.server-test` assertions remain green with updated contract. | -| Doctor check | Doctor defaults to the same packaged runtime path used for spawn, and does not auto-fallback to static runtime. | Doctor check path matches runtime path in tests and manual runs. | -| Dev flow | Fast local dev command remains available using `static/db-worker-node.js` for watch and debug workflows. | `pnpm db-worker-node:compile` and `node ./static/db-worker-node.js` still work during development. | -| Publish flow | Package `files` include standalone runtime assets required by ncc output. | Installed package can execute daemon without extra dependency install. | -| Electron release | `pnpm release-electron` runs db-worker bundle build before Electron packaging steps. | Electron release artifact includes the same standalone db-worker runtime contract. | - -## Integration sketch - -```text -shadow-cljs (:db-worker-node) - -> /static/db-worker-node.js - -> ncc build step - -> /dist/db-worker-node.js - -> /dist/ - -logseq-cli runtime - -> logseq.cli.server/spawn-server! - -> /dist/db-worker-node.js - -> db-worker daemon HTTP + SSE API -``` - -## Testing Plan - -I will follow `@test-driven-development` and add failing tests before implementation changes in each phase. - -I will add behavior tests for runtime path resolution so spawn and doctor point to the same canonical bundle target. - -I will add a standalone smoke test that launches the bundled daemon from a temporary directory without `node_modules` and verifies `/healthz`, `/readyz`, and shutdown behavior. - -I will keep existing daemon lifecycle tests green to ensure no regression in lock cleanup, owner checks, and timeout error semantics. - -I will run focused tests first, then full validation with `pnpm cljs:lint && pnpm test`, and if any unexpected failures appear I will use `@clojure-debug` before changing behavior. - -NOTE: I will write *all* tests before I add any implementation behavior. - -## Implementation plan - -### Phase 1: RED for runtime artifact contract. - -1. Add a failing test in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/server_test.cljs` asserting canonical db-worker runtime path resolves to `dist/db-worker-node.js` as the production target. -2. Add a failing test in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/command/doctor_test.cljs` asserting default doctor script check points to the same canonical runtime target as server spawn. -3. Add a failing test in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/command/doctor_test.cljs` asserting optional dev-mode check can still validate `static/db-worker-node.js` when explicitly requested. -4. Add a new failing bundle smoke test file at `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/ncc_bundle_test.cljs` that expects daemon startup success from a bundle-only temp directory. -5. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.server-test'` and confirm failures occur on the new path-contract assertions. -6. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.command.doctor-test'` and confirm failures occur on the new default-path expectations. -7. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.db-worker.ncc-bundle-test'` and confirm standalone smoke test fails before implementation. - -### Phase 2: Add ncc build pipeline. - -8. Add `@vercel/ncc` to `/Users/rcmerci/gh-repos/logseq/package.json` as a dev dependency. -9. Add a dedicated script in `/Users/rcmerci/gh-repos/logseq/package.json` to build `:db-worker-node` in release mode before bundling. -10. Add a dedicated script in `/Users/rcmerci/gh-repos/logseq/package.json` to run `ncc` against `/Users/rcmerci/gh-repos/logseq/static/db-worker-node.js`. -11. Add a dedicated script in `/Users/rcmerci/gh-repos/logseq/package.json` to normalize ncc output into `/Users/rcmerci/gh-repos/logseq/dist/db-worker-node.js` with adjacent assets preserved. -12. If script complexity is non-trivial, add `/Users/rcmerci/gh-repos/logseq/scripts/build-db-worker-node-bundle.mjs` to encapsulate output normalization and deterministic cleanup. -13. Add or update `pnpm` scripts in `/Users/rcmerci/gh-repos/logseq/package.json` for one-command bundle build and optional local run of the bundled artifact. -14. Update `pnpm release-electron` in `/Users/rcmerci/gh-repos/logseq/package.json` so it includes `db-worker-node:release:bundle` before Electron packaging. -15. Run `pnpm db-worker-node:release:bundle` and verify `dist/db-worker-node.js` is regenerated with executable permissions preserved. - -### Phase 3: Align runtime path and diagnostics. - -15. Refactor runtime path helpers in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` so there is one canonical function for packaged runtime and one explicit dev fallback function. -16. Keep `spawn-server!` in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` bound to the packaged runtime path to avoid ambiguity. -17. Update `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/command/doctor.cljs` to default-check the same packaged runtime path used by spawn. -18. Add an explicit action option in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/command/doctor.cljs` for fallback static-path diagnostics used only for development troubleshooting. -19. Ensure doctor failure codes remain stable as `:doctor-script-missing` and `:doctor-script-unreadable`. -21. Re-run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.server-test'` and `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.command.doctor-test'` to make the new path contract green. - -### Phase 4: Package manifest and docs alignment. - -22. Update `files` in `/Users/rcmerci/gh-repos/logseq/package.json` so packaged runtime includes `dist/db-worker-node.js` and ncc-emitted adjacent assets. -23. Keep `static/db-worker-node.js` inclusion only if required for development workflows, and document that distinction explicitly. -24. Update `/Users/rcmerci/gh-repos/logseq/docs/cli/logseq-cli.md` build instructions to include the ncc bundle command and standalone runtime expectations. -25. Update daemon runtime notes in `/Users/rcmerci/gh-repos/logseq/docs/cli/logseq-cli.md` so `doctor` references packaged runtime as primary target. -26. Add troubleshooting notes for native module asset copy behavior from ncc output. - -### Phase 5: Standalone runtime behavior verification. - -27. Implement bundle smoke test setup in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/ncc_bundle_test.cljs` to copy only bundle artifacts into a temp directory with no `node_modules`. -28. In that test, spawn `node ./db-worker-node.js --repo --data-dir ` from the bundle-only directory. -29. In that test, poll `/healthz` and `/readyz` and assert both return HTTP 200 after startup. -30. In that test, invoke `/v1/shutdown` and assert process exits and lock file is cleaned or becomes stale-removable. -31. In that test, assert failure output is actionable if native binary asset is missing, to guard accidental packaging regressions. -32. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.db-worker.ncc-bundle-test'` and make it green. - -### Phase 6: Final validation and review checklist. - -33. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.server-test'` and confirm zero failures and zero errors. -34. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.command.doctor-test'` and confirm zero failures and zero errors. -35. Run `pnpm cljs:test && pnpm cljs:run-test -v 'logseq.db-worker.ncc-bundle-test'` and confirm zero failures and zero errors. -36. Run `pnpm db-worker-node:compile` and verify `static/db-worker-node.js` remains valid for local dev flow. -37. Run `pnpm db-worker-node:release:bundle` and verify `dist/db-worker-node.js` starts successfully with `node dist/db-worker-node.js --help`. -38. Run `pnpm release-electron` and verify the script execution includes `db-worker-node:release:bundle` before Electron packaging steps. -39. Run `pnpm cljs:lint && pnpm test` and confirm repository review checklist passes. -40. Validate changed code against `@prompts/review.md` before merge. - -## Edge cases to validate during implementation - -| Scenario | Expected behavior | -| --- | --- | -| `ncc` emits native `.node` assets for `better-sqlite3`. | Bundle output keeps those assets adjacent to entrypoint and runtime loads without `node_modules`. | -| Bundle is copied to another directory without static files. | Daemon still starts because packaged runtime no longer `require`s `../static/db-worker-node.js`. | -| Developer runs doctor in source workspace before bundle build. | Doctor reports missing packaged artifact by default, and only checks static runtime when explicitly requested. | -| `dist/db-worker-node.js` exists but is not readable or is a directory. | Doctor returns `:doctor-script-unreadable` with path detail. | -| Bundle build is run twice. | Build output remains deterministic and stale ncc artifacts are cleaned safely. | -| `pnpm release-electron` is run directly. | Release flow still builds db-worker standalone bundle before Electron packaging artifacts are produced. | -| CLI and Electron share lock for same repo under bundled runtime. | Existing ownership and lock semantics remain unchanged from current behavior. | - -## Verification commands and expected outputs - -```bash -pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.server-test' -pnpm cljs:test && pnpm cljs:run-test -v 'logseq.cli.command.doctor-test' -pnpm cljs:test && pnpm cljs:run-test -v 'logseq.db-worker.ncc-bundle-test' -pnpm db-worker-node:compile -pnpm db-worker-node:release:bundle -pnpm release-electron -node dist/db-worker-node.js --help -pnpm cljs:lint && pnpm test -``` - -All test commands should finish with `0 failures, 0 errors`. - -The bundle command should finish without `MODULE_NOT_FOUND` for runtime dependencies. - -`node dist/db-worker-node.js --help` should print daemon help text and exit with code `0`. - -## Testing Details - -Behavior-focused tests will validate that runtime path resolution, doctor diagnostics, and daemon startup behavior match user-visible expectations. - -The standalone smoke test will verify real process startup and HTTP readiness in a bundle-only filesystem layout, rather than asserting internal helper calls. - -Regression safety is provided by existing CLI server and doctor tests to ensure lock lifecycle and error-code contracts remain stable. - -## Implementation Details - -- Keep packaged runtime path as one canonical helper shared by spawn and doctor. -- Keep dev runtime path explicit and opt-in for local diagnostics only. -- Introduce ncc build scripts with deterministic output normalization into `dist/`. -- Preserve local `shadow-cljs` development flow and avoid slowing watch mode. -- Add a dedicated standalone bundle smoke test namespace for runtime validation. -- Keep error code contracts stable for doctor and server command callers. -- Ensure package `files` include all ncc runtime assets required at execution time. -- Document build and troubleshooting steps in CLI docs for contributors and release workflows. -- Use `@test-driven-development` for every behavior change and follow `@clojure-debug` for unexpected failures. -- Finish with full lint and test validation and checklist review from `@prompts/review.md`. - -## Question - -Resolved: choose option 1. - -`doctor` defaults to strict packaged-runtime validation only. - -`doctor` does not auto-fallback to static runtime without an explicit flag. - ---- diff --git a/docs/agent-guide/db-worker/003-server-list-write-lock.md b/docs/agent-guide/db-worker/003-server-list-write-lock.md new file mode 100644 index 0000000000..06d7406cc6 --- /dev/null +++ b/docs/agent-guide/db-worker/003-server-list-write-lock.md @@ -0,0 +1,534 @@ +# db-worker-node Server List Write Lock Implementation Plan + +Goal: Make `db-worker-node` server-list reads and writes safe across multiple CLI and Electron processes without blocking readers. + +Architecture: Keep `server-list` as the shared discovery file under the configured root-dir, and serialize every mutation with a sibling `server-list.lock` file. +Architecture: Reads remain lock-free and parse the latest complete `server-list` content, while writes acquire the lock, read the current file inside the critical section, write through a temporary file, and atomically rename it into place. + +Tech Stack: ClojureScript, Node.js `fs` and `path`, Promesa, existing `logseq.db-worker.server-list`, existing `logseq.cli.server`, existing `frontend.worker.db-worker-node` lifecycle code. + +Related: Builds on `docs/agent-guide/db-worker/001-db-worker-node-restart-on-version-mismatch.md`. +Related: Relates to `docs/agent-guide/db-worker/002-desktop-db-worker-request-cap-switch-graph.md`. +Related: Relates to `docs/cli/logseq-cli.md` server lifecycle documentation. + +## Problem statement + +The current `server-list` file is a root-dir level text file used to discover running `db-worker-node` servers. + +The current file path is derived by `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` through `logseq.db-worker.server-list/path`, which returns `/server-list`. + +Each line currently contains one `pid port` pair. + +`logseq.db-worker.server-list/read-entries` currently checks existence, reads the whole file with `fs/readFileSync`, splits lines, parses valid `pid port` pairs, ignores invalid lines, and returns a vector. + +`logseq.db-worker.server-list/append-entry!` currently creates the parent directory and calls `fs/appendFileSync` with one line. + +`logseq.db-worker.server-list/remove-entry!` currently reads all entries, removes the matching `pid port`, and calls `rewrite-entries!`. + +`logseq.db-worker.server-list/rewrite-entries!` currently creates the parent directory and overwrites the entire file with `fs/writeFileSync`. + +`/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` appends the current process entry after the HTTP server starts listening. + +`/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` removes the current process entry when the daemon runtime clears state during stop. + +`/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` discovers servers by reading `server-list`, checking process and `/healthz` state for each entry, then rewriting the list with retained entries. + +The current read-modify-write cleanup path can lose data when another process appends or removes entries between the initial read and the later rewrite. + +A concrete lost-update sequence is possible today. + +1. Process A calls `discover-servers` and reads stale entry `S` from `server-list`. +2. Process B starts a new `db-worker-node` and appends live entry `B` to `server-list`. +3. Process A finishes stale-entry cleanup using its old snapshot and rewrites `server-list` without entry `B`. +4. Later discovery cannot see process B even though process B is alive. + +The requested fix is to add a lock when writing `server-list`. + +The requested lock file name is exactly `server-list.lock`. + +The requested lock file is only used by write operations. + +Read operations must not acquire, wait for, or delete `server-list.lock`. + +## Testing Plan + +I will use @Test-Driven Development (TDD) before changing implementation behavior. + +I will add unit tests in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs` for lock path derivation and lock-free reads. + +I will add a unit test in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs` proving `read-entries` can read `server-list` while `server-list.lock` already exists. + +I will add behavior tests in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs` for `append-entry!` and `remove-entry!` preserving unrelated entries through the new locked update path. + +I will add a regression test in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/server_test.cljs` that simulates stale cleanup racing with a concurrent append. + +The CLI regression test should create an initial stale entry, make the stale check append a live entry before cleanup finishes, call `logseq.cli.server/list-servers`, and assert that the live entry remains in `server-list` after stale cleanup. + +This test should fail on the current implementation because `discover-servers` rewrites from a stale snapshot. + +I will add a regression test in `/Users/rcmerci/gh-repos/logseq/src/test/frontend/worker/db_worker_node_test.cljs` only if the public `append-entry!` or `remove-entry!` return contract changes. + +The preferred implementation should keep those return contracts stable, so existing daemon registration tests should continue to cover daemon integration. + +I will run the focused RED tests before implementation and confirm they fail because of missing write locking or stale-snapshot cleanup behavior. + +I will run the same focused tests after implementation and confirm they pass. + +I will run the existing daemon and CLI server-list tests to ensure the lifecycle still registers, unregisters, lists, and lazily cleans stale entries. + +NOTE: I will write *all* tests before I add any implementation behavior. + +## Current implementation snapshot + +| Concern | Current location | Current behavior | Risk | +|---------|------------------|------------------|------| +| Path derivation | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` | `path` returns `/server-list`. | No lock path exists. | +| Reads | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` | `read-entries` reads without locking and ignores invalid lines. | Existence check plus read has a small TOCTOU window. | +| Append | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` | `append-entry!` appends one line without coordination. | Append can race with a full rewrite. | +| Remove | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` | `remove-entry!` reads and rewrites without coordination. | Remove can overwrite a concurrent append. | +| Full cleanup rewrite | `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` | `discover-servers` rewrites retained entries after async checks. | Cleanup can overwrite changes made after the initial read. | +| Daemon publish | `/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` | A started daemon calls `append-entry!`. | Publish can be lost by concurrent cleanup. | +| Daemon unpublish | `/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` | A stopped daemon calls `remove-entry!`. | Unpublish can lose unrelated entries. | + +## Target behavior + +All writes to `/server-list` must be serialized by `/server-list.lock`. + +The lock file basename must be exactly `server-list.lock`. + +The lock file must live in the same directory as `server-list`. + +Reads must remain lock-free. + +Reads must not block when `server-list.lock` exists. + +Reads must not remove stale `server-list.lock` files. + +Writes must acquire `server-list.lock` before reading current entries for a mutation. + +Writes must release `server-list.lock` in a `finally` path after the file mutation finishes or fails. + +Writes must fail fast with a structured error after a bounded lock acquisition timeout. + +Writes must not silently skip server-list publication or cleanup when lock acquisition fails. + +Stale cleanup must remove only entries that the cleanup pass proved stale. + +Stale cleanup must not rewrite the file from an old snapshot that can exclude newer entries. + +Readers must only ever observe either the previous complete `server-list` file or the next complete `server-list` file. + +The implementation must not add backward compatibility for old lock names or old `--server-list-file` arguments. + +## Design + +### 1. Add server-list lock path derivation + +Add `logseq.db-worker.server-list/lock-path` in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs`. + +`lock-path` should accept the existing `server-list` file path and return `(node-path/join (node-path/dirname file-path) "server-list.lock")`. + +`lock-path` should reject a missing or blank file path with a clear `js/Error` or `ex-info` error. + +Do not derive the lock path as `server-list.cljs.lock`, `server-list.locked`, or `.lock` if that can produce a basename other than `server-list.lock`. + +### 2. Add a write-lock helper + +Add a private helper such as `with-write-lock!` in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs`. + +The helper should create the parent directory before acquiring the lock. + +The helper should acquire `server-list.lock` with exclusive file creation through Node `fs.openSync` using an exclusive mode such as `"wx"`. + +The helper should write small metadata into the lock file after acquisition. + +The metadata should include the owning process pid and a unique lock id. + +The metadata makes stale-lock cleanup safer and prevents a releasing writer from deleting a lock owned by a later writer. + +The helper should close the lock file descriptor after writing metadata. + +The helper should run the supplied write operation only after acquiring the lock. + +The helper should release the lock in `finally` by reading the lock metadata and unlinking only when the lock id still matches the current writer. + +The helper should tolerate `ENOENT` while releasing because another process may have removed a stale lock after a crash. + +The helper should treat `EEXIST` as normal contention while waiting for the lock. + +The helper should retry acquisition on a short fixed interval until timeout. + +Use a small poll interval such as `25` milliseconds. + +Use a bounded timeout of `2000` milliseconds. + +If timeout expires and the lock owner process is still alive or cannot be inspected, throw `ex-info` with `:code :server-list-lock-timeout` and include `:file-path`, `:lock-path`, and lock metadata when available. + +If timeout sees a stale lock whose metadata pid is not alive, remove that stale lock and retry acquisition. + +If the lock file exists but metadata cannot be parsed, treat it as active until timeout and then fail fast rather than guessing. + +Do not use `js/Buffer` in this helper. + +### 3. Make writes atomic + +Add a private `write-entries-unlocked!` helper in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs`. + +The helper should write the next payload to a unique temporary file in the same directory as `server-list`. + +The helper should then call `fs/renameSync` from the temporary file to `server-list`. + +The temporary file name should include `server-list`, the current pid, and a unique suffix. + +The temporary file must not be named `server-list.lock`. + +The helper should remove the temporary file in a best-effort `catch` path if writing or renaming fails. + +Atomic rename keeps lock-free readers from observing a partially truncated `server-list` file. + +### 4. Centralize mutations through an update helper + +Add a public helper such as `update-entries!` in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs`. + +`update-entries!` should acquire `server-list.lock`. + +`update-entries!` should call `read-entries` after the lock is acquired. + +`update-entries!` should apply a pure transform to the current entries. + +`update-entries!` should normalize the result before writing. + +Normalization should keep only valid positive integer `:pid` and `:port` pairs. + +Normalization should remove exact duplicate `pid port` entries while preserving first-seen order. + +`update-entries!` should write with `write-entries-unlocked!`. + +`update-entries!` should return the normalized entries that were written. + +`append-entry!` should become a thin wrapper around `update-entries!`. + +`append-entry!` should append the entry only when both `pid` and `port` are positive integers. + +`append-entry!` should avoid adding a duplicate exact `pid port` entry. + +`append-entry!` should keep returning the appended entry for valid input to preserve current callers. + +`remove-entry!` should become a thin wrapper around `update-entries!`. + +`remove-entry!` should remove only exact matches for the target `pid port` pair. + +`remove-entry!` should preserve unrelated entries added by other writers. + +`rewrite-entries!` should remain public only as a locked full replacement helper for compatibility with explicit full-replacement callers. + +Production code must not use `rewrite-entries!` with a stale snapshot from before asynchronous work. + +Prefer mutation-specific helpers such as `update-entries!` and `remove-entries!` for normal production server-list changes. + +### 5. Replace stale cleanup rewrite in discovery + +Update `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs`. + +In `logseq.cli.server/discover-servers`, keep the existing lock-free read for discovery input. + +Keep health checks outside the write lock because health checks are asynchronous and can take up to the request timeout. + +Compute the set of entries that should be removed because they are proven stale. + +After checks finish, call a server-list helper such as `remove-entries!` or `update-entries!` to remove only that stale set from the current file under `server-list.lock`. + +Do not rewrite `server-list` from `cleaned-entries` computed from the initial snapshot. + +Do not hold `server-list.lock` while calling `pid-status` or `/healthz`. + +Return discovered server payloads from the original discovery pass as today. + +The purpose of the write lock is to protect file mutations, not to serialize discovery network work. + +### 6. Keep daemon call sites simple + +Keep `/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` call sites using `server-list/append-entry!` on publish and `server-list/remove-entry!` on stop if the public signatures stay the same. + +Do not add a separate lock concern in the daemon namespace. + +The server-list namespace should own all lock-file details. + +### 7. Improve lock-free reads without changing semantics + +Update `read-entries` in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` to avoid a separate `fs/existsSync` check before reading. + +Read the file directly and catch only missing-file errors such as `ENOENT` to return `[]`. + +Continue returning `[]` for blank or missing paths only if that is the existing accepted behavior. + +Continue ignoring invalid lines unless implementation review decides to fail fast for corrupt server-list content. + +Do not inspect `server-list.lock` from `read-entries`. + +This removes one filesystem call from the common read path and avoids an existence-check TOCTOU race. + +## Integration overview + +```text +CLI or Electron process + | + | lock-free read + v +/server-list + | + | parse pid port entries + v +pid-status and /healthz checks + | + | write mutation only + v +acquire /server-list.lock + | + | read current server-list inside lock + | apply append/remove/prune transform + | write temp file + | rename temp file to server-list + v +release /server-list.lock +``` + +## Affected files + +| File | Planned change | +|------|----------------| +| `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs` | Add `lock-path`, write-lock helper, atomic write helper, update helper, locked append/remove/prune behavior, and optimized read behavior. | +| `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` | Replace stale full rewrite in `discover-servers` with locked removal of only proven stale entries. | +| `/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` | Prefer no code change, but verify existing publish and unpublish call sites use the updated locked helpers. | +| `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs` | Add focused server-list lock, read, append, remove, dedupe, and stale-lock tests. | +| `/Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/server_test.cljs` | Add a regression test for discovery cleanup preserving entries added during cleanup. | +| `/Users/rcmerci/gh-repos/logseq/src/test/frontend/worker/db_worker_node_test.cljs` | Keep existing daemon registration tests green, and add tests only if signatures or return contracts change. | +| `/Users/rcmerci/gh-repos/logseq/docs/cli/logseq-cli.md` | Update only if user-facing server-list troubleshooting needs to mention `server-list.lock`. | + +## Implementation tasks + +### Phase 0: Preparation + +1. Read `/Users/rcmerci/gh-repos/logseq/AGENTS.md`. +2. Check for directory-specific `AGENTS.md` files before editing each target file. +3. Load @Test-Driven Development (TDD) before implementation. +4. Re-read `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs`. +5. Re-read `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs` around `discover-servers`. +6. Re-read `/Users/rcmerci/gh-repos/logseq/src/main/frontend/worker/db_worker_node.cljs` around server-list append and remove call sites. + +### Phase 1: Write RED tests + +1. Add `server-list-lock-path-derives-sibling-lock-file` in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs`. +2. Assert `(server-list/lock-path "/tmp/logseq-root/server-list")` returns `/tmp/logseq-root/server-list.lock`. +3. Add `read-entries-ignores-server-list-lock` in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs`. +4. Create both `server-list` and `server-list.lock` in a temporary root-dir. +5. Assert `server-list/read-entries` returns parsed entries without waiting for or deleting `server-list.lock`. +6. Add `append-entry-deduplicates-valid-entry-under-update` in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs`. +7. Assert two identical appends leave one `pid port` line. +8. Add `remove-entry-preserves-unrelated-current-entry` in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/db_worker/server_list_test.cljs`. +9. Assert removing one entry does not remove another existing entry. +10. Add `list-servers-preserves-concurrent-server-list-writes` in `/Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/server_test.cljs`. +11. In that test, write one stale entry to `server-list`. +12. In that test, redefine `daemon/pid-status` so checking the stale pid appends a second live entry to `server-list` before returning `:not-found`. +13. In that test, call `cli-server/list-servers`. +14. In that test, assert the final `server-list` still contains the second live entry. +15. Run the focused tests and confirm the new race regression test fails on current behavior. + +Expected RED command examples: + +```shell +bb dev:test -v logseq.db-worker.server-list-test/server-list-lock-path-derives-sibling-lock-file +bb dev:test -v logseq.db-worker.server-list-test/read-entries-ignores-server-list-lock +bb dev:test -v logseq.cli.server-test/list-servers-preserves-concurrent-server-list-writes +``` + +The first command should fail until `lock-path` exists. + +The race regression command should fail until discovery cleanup stops rewriting stale snapshots. + +### Phase 2: Implement locked server-list mutations + +1. Add `lock-path` to `/Users/rcmerci/gh-repos/logseq/src/main/logseq/db_worker/server_list.cljs`. +2. Add private constants for write-lock timeout and poll interval. +3. Set the write-lock timeout constant to `2000` milliseconds. +4. Set the write-lock poll interval constant to `25` milliseconds unless tests show this is too aggressive. +5. Add a private lock metadata writer. +6. Add a private lock metadata reader. +7. Add a private stale-lock predicate using current lock metadata and process status. +8. Use a local process-status helper rather than depending on `logseq.db-worker.daemon` if adding the dependency would create a cycle. +9. Add `acquire-write-lock!` with exclusive file creation. +10. Add `release-write-lock!` with lock-id verification. +11. Add `with-write-lock!` that wraps acquire, body execution, and release. +12. Add `read-entries-unlocked` only if it keeps the public `read-entries` simple and testable. +13. Update `read-entries` to read directly and catch missing-file errors. +14. Add `entry-key` or equivalent exact `pid port` identity helper. +15. Add `normalize-entries` to validate entries and remove exact duplicate `pid port` entries while preserving order. +16. Add `payload-for-entries` or equivalent formatting helper. +17. Add `write-entries-unlocked!` that writes through a temporary file and renames to `server-list`. +18. Add `update-entries!` that reads current entries inside the lock and writes normalized transformed entries. +19. Reimplement `append-entry!` through `update-entries!`. +20. Reimplement `remove-entry!` through `update-entries!`. +21. Keep `rewrite-entries!` public as a locked full-replacement helper. +22. Add `remove-entries!` or `prune-entries!` if this makes `discover-servers` clearer. +23. Keep public return values compatible for current daemon call sites unless a test proves a change is needed. + +### Phase 3: Replace stale discovery cleanup + +1. Update `logseq.cli.server/discover-servers` in `/Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs`. +2. Keep `entries` as the discovery input read before health checks. +3. Keep the existing `p/all` health-check structure. +4. Compute `stale-entries` as entries whose result has `:retain? false`. +5. Call `server-list/remove-entries!` or `server-list/update-entries!` only when `stale-entries` is non-empty. +6. Remove the current `cleaned-entries` full rewrite based on the initial snapshot. +7. Preserve existing `retained-results` and returned `:server` values. +8. Verify `list-servers-lazily-cleans-stale-server-list-entries` still passes. +9. Verify `ensure-server-preserves-live-server-list-entry-after-transient-healthz-failure` still passes. + +### Phase 4: Verify integration behavior + +1. Run the new focused server-list tests. +2. Run the new focused CLI race regression test. +3. Run existing CLI server-list cleanup tests. +4. Run existing db-worker-node daemon registration tests. +5. Run the full lint and unit test suite when the focused tests are green. + +Expected GREEN command examples: + +```shell +bb dev:test -v logseq.db-worker.server-list-test +bb dev:test -v logseq.cli.server-test/list-servers-preserves-concurrent-server-list-writes +bb dev:test -v logseq.cli.server-test/list-servers-lazily-cleans-stale-server-list-entries +bb dev:test -v logseq.cli.server-test/ensure-server-preserves-live-server-list-entry-after-transient-healthz-failure +bb dev:test -v frontend.worker.db-worker-node-test/db-worker-node-start-daemon-registers-and-unregisters-derived-server-list-entry +bb dev:lint-and-test +``` + +Each focused command should pass without warnings. + +`bb dev:lint-and-test` should pass before opening a PR. + +## Edge cases + +A reader may run while a writer holds `server-list.lock`. + +The reader must ignore the lock and return entries from the current complete `server-list` file. + +A reader may run while a writer is replacing `server-list` through atomic rename. + +The reader should see either the old complete file or the new complete file. + +A writer may crash after acquiring `server-list.lock`. + +A later writer should remove the stale lock only when metadata proves the owner pid is gone. + +A writer may see `server-list.lock` from a live process. + +The writer should wait for the `2000` millisecond bounded timeout and then fail fast with `:server-list-lock-timeout`. + +A process may not have permission to inspect the pid in the lock file. + +Treat that lock as active until timeout rather than deleting it. + +A lock file may contain malformed metadata. + +Treat malformed metadata as active until timeout and then fail fast rather than deleting it speculatively. + +Two processes may append the same `pid port` entry. + +The normalized mutation path should avoid duplicate exact entries. + +Discovery may prove entry `S` stale while another process appends entry `B`. + +The cleanup mutation must remove only `S` from the current file and preserve `B`. + +Discovery may fail `/healthz` transiently for a live process. + +The current behavior retains entries after transient health failure, and the new implementation must preserve that behavior. + +A daemon may stop and remove its entry while discovery is checking health. + +The locked remove path should remove only the daemon entry and preserve unrelated entries. + +The root-dir directory may not exist before the first write. + +The write path should create the directory before lock acquisition. + +The root-dir may be missing during read. + +The read path should return an empty vector for a missing `server-list` file. + +## Acceptance criteria + +`server-list.lock` is the only lock filename used for `server-list` writes. + +Every production mutation of `server-list` acquires `server-list.lock` first. + +No production read of `server-list` acquires or waits for `server-list.lock`. + +Write-lock acquisition times out after `2000` milliseconds with a structured `:server-list-lock-timeout` error. + +Malformed `server-list.lock` metadata is not deleted speculatively and fails fast after the timeout. + +`append-entry!` does not lose entries when another process is cleaning stale entries. + +`remove-entry!` does not lose unrelated entries. + +`discover-servers` no longer rewrites the list from a stale pre-health-check snapshot. + +A stale cleanup pass preserves entries appended after the pass started. + +Exact duplicate `pid port` entries are normalized away while preserving first-seen order. + +`rewrite-entries!` remains public only as a locked full-replacement helper. + +Readers never intentionally observe temporary files. + +Focused tests for `logseq.db-worker.server-list-test` pass. + +Focused tests for `logseq.cli.server-test` server-list discovery behavior pass. + +Existing `frontend.worker.db-worker-node-test` daemon registration behavior passes. + +`bb dev:lint-and-test` passes before PR submission. + +## Testing Details + +The server-list unit tests should verify externally visible file behavior rather than private helper details wherever possible. + +The lock path test is acceptable because the requested lock filename is part of the behavior contract. + +The lock-free read test should create `server-list.lock` and prove reads still return entries without deleting the lock. + +The CLI race regression should test the actual `list-servers` or `discover-servers` behavior and prove a live entry appended during stale cleanup survives. + +The daemon registration test should verify the existing start and stop behavior still publishes and unpublishes the current process entry through the updated helpers. + +## Implementation Details + +- Keep the lock implementation inside `logseq.db-worker.server-list` so callers cannot bypass it accidentally. +- Use `/server-list.lock` and no other lock filename. +- Use exclusive file creation to acquire the lock. +- Use a `2000` millisecond bounded wait and fail fast with structured error data when the lock cannot be acquired. +- Use lock metadata with pid and lock id to support safe release and stale cleanup. +- Treat malformed `server-list.lock` metadata as active until timeout and then fail fast. +- Keep reads lock-free and independent from `server-list.lock`. +- Read current entries inside the write lock before applying append, remove, or prune transforms. +- Normalize exact duplicate `pid port` entries while preserving first-seen order. +- Keep `rewrite-entries!` public only as a locked full-replacement helper. +- Replace full stale-snapshot rewrites in discovery with targeted stale-entry removal. +- Write through a temporary file and atomic rename so readers do not see partial writes. +- Preserve current public call-site simplicity in `frontend.worker.db-worker-node`. + +## Question + +No open questions remain for this plan. + +The write-lock timeout is `2000` milliseconds. + +Malformed `server-list.lock` metadata fails fast after the normal timeout and is not deleted speculatively. + +`rewrite-entries!` remains public only as a locked full-replacement helper, while production code should prefer mutation-specific helpers. + +Exact duplicate `pid port` entries are normalized away while preserving first-seen order. + +--- diff --git a/docs/agent-guide/001-db-sync-nodejs-adapter.md b/docs/archive/adr/001-db-sync-nodejs-adapter.md similarity index 100% rename from docs/agent-guide/001-db-sync-nodejs-adapter.md rename to docs/archive/adr/001-db-sync-nodejs-adapter.md diff --git a/docs/agent-guide/002-reactions.md b/docs/archive/adr/002-reactions.md similarity index 100% rename from docs/agent-guide/002-reactions.md rename to docs/archive/adr/002-reactions.md diff --git a/src/main/frontend/components/rtc/indicator.cljs b/src/main/frontend/components/rtc/indicator.cljs index 4eb9387834..813d20cc4c 100644 --- a/src/main/frontend/components/rtc/indicator.cljs +++ b/src/main/frontend/components/rtc/indicator.cljs @@ -78,11 +78,46 @@ (keep #(get % repo)) (dedupe)))) +(defn asset-transfer-counts + [progress] + (reduce-kv + (fn [counts _id {:keys [direction loaded total]}] + (if (and (contains? #{:upload :download} direction) + (number? loaded) + (number? total) + (not= loaded total)) + (update counts direction inc) + counts)) + {:upload 0 + :download 0} + (or progress {}))) + +(defn asset-status-rows + [{:keys [pending-asset-ops asset-transfer-counts]}] + (let [{:keys [upload download]} asset-transfer-counts] + (cond-> [] + (pos? (or pending-asset-ops 0)) + (conj {:count pending-asset-ops + :label-key :sync/pending-asset-uploads}) + + (pos? (or upload 0)) + (conj {:count upload + :label-key :sync/assets-uploading}) + + (pos? (or download 0)) + (conj {:count download + :label-key :sync/assets-downloading})))) + +(defn- asset-status-label + [label-key] + (case label-key + :sync/pending-asset-uploads (t :sync/pending-asset-uploads) + :sync/assets-uploading (t :sync/assets-uploading) + :sync/assets-downloading (t :sync/assets-downloading))) + (rum/defc assets-progressing - [] - (let [repo (state/get-current-repo) - progress (hooks/use-flow-state (asset-upload-download-progress-flow repo)) - downloading (->> + [progress] + (let [downloading (->> (keep (fn [[id {:keys [direction loaded total]}]] (when (and (= direction :download) (not= loaded total) @@ -122,17 +157,25 @@ (rum/defc details [] (let [online? (hooks/use-flow-state flows/network-online-event-flow) + repo (state/get-current-repo) + asset-progress (hooks/use-flow-state (asset-upload-download-progress-flow repo)) [expand-debug? set-expand-debug!] (hooks/use-state false) show-checksums? (or config/dev? util/node-test?) {:keys [graph-uuid local-tx remote-tx local-checksum remote-checksum rtc-state - download-logs upload-logs misc-logs pending-local-ops pending-server-ops]} - (hooks/use-flow-state (m/watch *detail-info))] + download-logs upload-logs misc-logs pending-local-ops pending-asset-ops pending-server-ops]} + (hooks/use-flow-state (m/watch *detail-info)) + asset-rows (asset-status-rows {:pending-asset-ops pending-asset-ops + :asset-transfer-counts (asset-transfer-counts asset-progress)})] [:div.rtc-info.flex.flex-col.gap-1.p-2.text-gray-11 [:div.font-medium.mb-2 (t (if online? :sync/online :sync/offline))] [:div [:span.font-medium.mr-1 (or pending-local-ops 0)] (t :sync/pending-local-changes)] + (for [{:keys [count label-key]} asset-rows] + [:div {:key (name label-key)} + [:span.font-medium.mr-1 count] + (asset-status-label label-key)]) ;; FIXME: pending-server-ops [:div [:span.font-medium.mr-1 (or pending-server-ops 0)] (t :sync/pending-server-changes)] - (assets-progressing) + (assets-progressing asset-progress) ;; FIXME: What's the type for downloaded log? (when-let [latest-log (some (fn [l] (when (contains? #{:rtc.log/push-local-update} (:type l)) l)) misc-logs)] (when-let [time (:created-at latest-log)] diff --git a/src/main/frontend/worker/platform/browser.cljs b/src/main/frontend/worker/platform/browser.cljs index e3b065cc7f..ed004ab617 100644 --- a/src/main/frontend/worker/platform/browser.cljs +++ b/src/main/frontend/worker/platform/browser.cljs @@ -106,7 +106,9 @@ (defn- graph-assets-dir [repo] (when-let [graph-name (some-> repo common-config/strip-leading-db-version-prefix)] - (str "/" graph-name "/assets"))) + (path/url-to-path + (path/path-join (str "memory:///" graph-name) + common-config/local-assets-dir)))) (defn- ensure-pfs-dir! [^js pfs dir] diff --git a/src/main/frontend/worker/sync.cljs b/src/main/frontend/worker/sync.cljs index e660544f0d..587ba312ca 100644 --- a/src/main/frontend/worker/sync.cljs +++ b/src/main/frontend/worker/sync.cljs @@ -201,6 +201,8 @@ (let [delay (reconnect-delay-ms attempt) timeout-id (js/setTimeout (fn [] + (log/info :db-sync/ws-reconnect {:repo repo + :db-sync-client-exists? (some? @worker-state/*db-sync-client)}) (swap! reconnect assoc :timer nil) (when-let [current @worker-state/*db-sync-client] (when (and (= (:repo current) repo) @@ -296,6 +298,8 @@ [repo client url token] (when (:ws client) (stop-client! client)) + (log/info :db-sync/connect! {:repo repo + :token-exists? (some? (or token (auth-token)))}) (when-let [token' (or token (auth-token))] (let [ws (platform/websocket-connect (platform/current) (sync-transport/append-token url token')) updated (assoc client :ws ws)] diff --git a/src/main/frontend/worker/sync/apply_txs.cljs b/src/main/frontend/worker/sync/apply_txs.cljs index 1fe98081ad..53ff2745be 100644 --- a/src/main/frontend/worker/sync/apply_txs.cljs +++ b/src/main/frontend/worker/sync/apply_txs.cljs @@ -13,7 +13,7 @@ [frontend.worker.sync.large-title :as sync-large-title] [frontend.worker.sync.presence :as sync-presence] [frontend.worker.sync.transport :as sync-transport] - [frontend.worker.sync.util :refer [fail-fast]] + [frontend.worker.sync.util :as sync-util :refer [fail-fast]] [frontend.worker.undo-redo :as worker-undo-redo] [lambdaisland.glogi :as log] [logseq.common.util :as common-util] @@ -373,6 +373,7 @@ (when-not (upload-stopped? repo) (let [{:keys [tx-entries drop-tx-ids]} (prepare-upload-tx-entries conn batch)] (when (seq drop-tx-ids) + (log/info :db-sync/drop-tx-ids {:tx-ids drop-tx-ids}) (mark-pending-txs-false! repo drop-tx-ids)) (when (seq tx-entries) (-> (p/let [aes-key (when (sync-crypt/graph-e2ee? repo) @@ -404,7 +405,10 @@ :t-before local-tx :txs payload})) (p/catch (fn [error] - (js/console.error error))))))))))))))) + (sync-util/set-last-sync-error! client error) + (log/error :db-sync/flush-pending-failed + {:repo repo + :error error}))))))))))))))) (defn enqueue-flush-pending! [repo client] diff --git a/src/main/frontend/worker/sync/assets.cljs b/src/main/frontend/worker/sync/assets.cljs index d0fc21b2a3..12c06fadf6 100644 --- a/src/main/frontend/worker/sync/assets.cljs +++ b/src/main/frontend/worker/sync/assets.cljs @@ -109,14 +109,14 @@ (throw (ex-info (name tag) data))) ) asset-id (str asset-uuid) put-url (sync-large-title/asset-url base graph-id asset-id asset-type) - asset-file (try + asset-bytes (-> (uint8 asset-file) asset-file) + (p/catch (fn [e] + (log/error :read-asset-failed e) + (throw (ex-info "read-asset failed" + {:type :rtc.exception/read-asset-failed} + e))))) + asset-bytes (if aes-key (->uint8 asset-bytes) asset-bytes) payload (if (not aes-key) asset-bytes (p/let [encrypted-bytes (crypt/str attr) - value - remote-t - now])))))) + (let [attr-str (qualified-kw->str attr)] + (sqlite-run! store + "delete from sync_conflicts where block_uuid = ? and attr = ?" + [(str block-uuid) attr-str]) + (when-not (string/blank? value) + (sqlite-run! store + (str "insert into sync_conflicts " + "(block_uuid, attr, value, remote_t, created_at) " + "values (?, ?, ?, ?, ?) " + "on conflict(block_uuid, attr, value) do update set " + "remote_t = excluded.remote_t, " + "created_at = excluded.created_at") + [(str block-uuid) + attr-str + value + remote-t + now])))))))) (defn get-sync-conflicts [repo block-uuid] diff --git a/src/main/frontend/worker/sync/handle_message.cljs b/src/main/frontend/worker/sync/handle_message.cljs index c83e8bd9e7..30db076c8c 100644 --- a/src/main/frontend/worker/sync/handle_message.cljs +++ b/src/main/frontend/worker/sync/handle_message.cljs @@ -238,6 +238,12 @@ :current-client-f current-client :broadcast-rtc-state!-f broadcast-rtc-state! :fail-fast-f fail-fast}) + (log/info :db-sync/handle-hello + {:empty-inflight? (empty? @(:inflight client)) + :online? (worker-state/online?) + :ws-open? (when-let [ws (:ws client)] + (ws-open? ws)) + :pending-txs-count (count (sync-apply/pending-txs repo {:limit 50}))}) (sync-apply/enqueue-flush-pending! repo client)) (defn- handle-online-users! @@ -325,6 +331,7 @@ "pull/ok" (handle-pull-ok! repo client local-tx remote-tx remote-checksum message) "changed" (handle-changed! repo client local-tx remote-tx) "tx/reject" (handle-tx-reject! repo client message local-tx) + "pong" nil (fail-fast :db-sync/invalid-field {:repo repo :type (:type message)}))))) diff --git a/src/main/logseq/cli/server.cljs b/src/main/logseq/cli/server.cljs index 87ba28bc49..d073a7d991 100644 --- a/src/main/logseq/cli/server.cljs +++ b/src/main/logseq/cli/server.cljs @@ -247,9 +247,12 @@ (update :owner-source normalize-owner-source))})) (p/catch (fn [_] {:entry entry :retain? true}))))))) - retained-results (filter :retain? results) - cleaned-entries (mapv :entry retained-results) - _ (server-list/rewrite-entries! path cleaned-entries)] + retained-results (filterv :retain? results) + stale-entries (->> results + (remove :retain?) + (mapv :entry)) + _ (when (seq stale-entries) + (server-list/remove-entries! path stale-entries))] (->> retained-results (keep :server) vec)))) diff --git a/src/main/logseq/db_worker/server_list.cljs b/src/main/logseq/db_worker/server_list.cljs index c4cbd0bb18..cf469b3df1 100644 --- a/src/main/logseq/db_worker/server_list.cljs +++ b/src/main/logseq/db_worker/server_list.cljs @@ -4,12 +4,21 @@ ["fs" :as fs] ["path" :as node-path])) +(def ^:private write-lock-timeout-ms 2000) +(def ^:private write-lock-poll-interval-ms 25) + (defn path [root-dir-path] (when-not (seq root-dir-path) (throw (js/Error. "root-dir is required"))) (node-path/join root-dir-path "server-list")) +(defn lock-path + [file-path] + (when-not (seq file-path) + (throw (js/Error. "server-list file path is required"))) + (node-path/join (node-path/dirname file-path) "server-list.lock")) + (defn- parse-int [value] (when (re-matches #"\d+" value) @@ -28,38 +37,223 @@ (defn read-entries [file-path] - (if (and (seq file-path) (fs/existsSync file-path)) - (->> (.toString (fs/readFileSync file-path) "utf8") - string/split-lines - (keep parse-line) - vec) + (if (seq file-path) + (try + (->> (.toString (fs/readFileSync file-path) "utf8") + string/split-lines + (keep parse-line) + vec) + (catch :default e + (if (= "ENOENT" (.-code e)) + [] + (throw e)))) [])) +(defn- entry-key + [{:keys [pid port]}] + [pid port]) + +(defn- valid-entry? + [{:keys [pid port]}] + (and (pos-int? pid) (pos-int? port))) + +(defn- normalize-entries + [entries] + (:entries + (reduce (fn [{:keys [seen] :as acc} entry] + (let [key (entry-key entry)] + (if (and (valid-entry? entry) + (not (contains? seen key))) + (-> acc + (update :seen conj key) + (update :entries conj {:pid (:pid entry) + :port (:port entry)})) + acc))) + {:seen #{} + :entries []} + (or entries [])))) + +(defn- payload-for-entries + [entries] + (if (seq entries) + (str (string/join "\n" (map (fn [{:keys [pid port]}] + (str pid " " port)) + entries)) + "\n") + "")) + +(defn- sleep-sync! + [ms] + (let [shared (js/SharedArrayBuffer. 4) + view (js/Int32Array. shared)] + (js/Atomics.wait view 0 0 ms))) + +(defn- process-status + [pid] + (try + (.kill js/process pid 0) + :alive + (catch :default e + (case (.-code e) + "ESRCH" :not-found + "EPERM" :no-permission + :unknown)))) + +(defn- read-lock-metadata + [lock-file] + (try + (let [raw (.toString (fs/readFileSync lock-file) "utf8")] + (try + {:raw raw + :metadata (js->clj (js/JSON.parse raw) :keywordize-keys true)} + (catch :default e + {:raw raw + :parse-error e}))) + (catch :default e + (if (= "ENOENT" (.-code e)) + nil + {:read-error e})))) + +(defn- lock-stale? + [lock-info] + (let [pid (get-in lock-info [:metadata :pid])] + (and (pos-int? pid) + (= :not-found (process-status pid))))) + +(defn- unlink-if-exists! + [file-path] + (try + (fs/unlinkSync file-path) + (catch :default e + (when-not (= "ENOENT" (.-code e)) + (throw e))))) + +(defn- lock-timeout-error + [file-path lock-file lock-info] + (ex-info (str "Timed out acquiring server-list lock: " lock-file) + (cond-> {:code :server-list-lock-timeout + :file-path file-path + :lock-path lock-file} + (:metadata lock-info) (assoc :lock-metadata (:metadata lock-info)) + (:raw lock-info) (assoc :lock-raw (:raw lock-info))))) + +(defn- write-lock-metadata! + [fd metadata] + (fs/writeFileSync fd (js/JSON.stringify (clj->js metadata)) "utf8")) + +(defn- acquire-write-lock! + [file-path] + (fs/mkdirSync (node-path/dirname file-path) #js {:recursive true}) + (let [lock-file (lock-path file-path) + deadline (+ (js/Date.now) write-lock-timeout-ms)] + (loop [] + (let [lock-id (str (random-uuid)) + metadata {:pid (.-pid js/process) + :lock-id lock-id + :created-at (.toISOString (js/Date.))} + result (try + (let [fd (fs/openSync lock-file "wx") + write-error (try + (write-lock-metadata! fd metadata) + nil + (catch :default e + e))] + (fs/closeSync fd) + (when write-error + (unlink-if-exists! lock-file) + (throw write-error)) + {:file-path file-path + :lock-path lock-file + :metadata metadata}) + (catch :default e + (if (= "EEXIST" (.-code e)) + (let [lock-info (read-lock-metadata lock-file)] + (when (lock-stale? lock-info) + (unlink-if-exists! lock-file)) + (if (>= (js/Date.now) deadline) + (throw (lock-timeout-error file-path lock-file lock-info)) + ::retry)) + (throw e))))] + (if (= ::retry result) + (do + (sleep-sync! write-lock-poll-interval-ms) + (recur)) + result))))) + +(defn- release-write-lock! + [{lock-file :lock-path metadata :metadata}] + (let [lock-info (read-lock-metadata lock-file)] + (when (= (:lock-id metadata) (get-in lock-info [:metadata :lock-id])) + (unlink-if-exists! lock-file)))) + +(defn- with-write-lock! + [file-path f] + (let [lock (acquire-write-lock! file-path)] + (try + (f) + (finally + (release-write-lock! lock))))) + +(defn- write-entries-unlocked! + [file-path entries] + (let [dir (node-path/dirname file-path) + tmp-file (node-path/join dir (str "server-list." + (.-pid js/process) + "." + (random-uuid) + ".tmp")) + payload (payload-for-entries entries)] + (try + (fs/writeFileSync tmp-file payload "utf8") + (fs/renameSync tmp-file file-path) + (catch :default e + (try + (fs/rmSync tmp-file #js {:force true}) + (catch :default _)) + (throw e))))) + +(defn update-entries! + [file-path transform] + (when (seq file-path) + (with-write-lock! + file-path + (fn [] + (let [entries (read-entries file-path) + next-entries (normalize-entries (transform entries))] + (write-entries-unlocked! file-path next-entries) + next-entries))))) + (defn rewrite-entries! [file-path entries] (when (seq file-path) - (fs/mkdirSync (node-path/dirname file-path) #js {:recursive true}) - (let [payload (if (seq entries) - (str (string/join "\n" (map (fn [{:keys [pid port]}] - (str pid " " port)) - entries)) - "\n") - "")] - (fs/writeFileSync file-path payload "utf8")))) + (update-entries! file-path (constantly entries)) + nil)) (defn append-entry! [file-path {:keys [pid port] :as entry}] (when (and (seq file-path) (pos-int? pid) (pos-int? port)) - (fs/mkdirSync (node-path/dirname file-path) #js {:recursive true}) - (fs/appendFileSync file-path (str pid " " port "\n") "utf8") + (update-entries! file-path #(conj (vec %) {:pid pid :port port})) entry)) (defn remove-entry! [file-path {:keys [pid port]}] (when (seq file-path) - (let [entries (->> (read-entries file-path) - (remove (fn [entry] - (and (= pid (:pid entry)) - (= port (:port entry))))) - vec)] - (rewrite-entries! file-path entries)))) + (update-entries! + file-path + (fn [entries] + (remove (fn [entry] + (and (= pid (:pid entry)) + (= port (:port entry)))) + entries))) + nil)) + +(defn remove-entries! + [file-path entries-to-remove] + (when (seq file-path) + (let [keys-to-remove (set (map entry-key (normalize-entries entries-to-remove)))] + (update-entries! + file-path + (fn [entries] + (remove (fn [entry] + (contains? keys-to-remove (entry-key entry))) + entries)))))) diff --git a/src/resources/dicts/en.edn b/src/resources/dicts/en.edn index 701f3163ee..24aaf22ab3 100644 --- a/src/resources/dicts/en.edn +++ b/src/resources/dicts/en.edn @@ -1724,7 +1724,9 @@ :storage.recycle/retention-desc "Deleted pages and blocks stay here until restored or automatically garbage collected after 30 days." :storage.recycle/title "Recycle" + :sync/assets-downloading "downloading assets" :sync/assets-downloading-count "Downloading assets ({1})" + :sync/assets-uploading "uploading assets" :sync/assets-uploading-count "Uploading assets ({1})" :sync/conflicts-description "These server versions conflicted with local edits. Copy the text you want and edit the block manually." :sync/conflicts-title "Sync conflicts" @@ -1738,6 +1740,7 @@ :sync/more-debug-info "More debug info" :sync/offline "Offline" :sync/online "Online" + :sync/pending-asset-uploads "pending asset uploads" :sync/pending-local-changes "pending local changes" :sync/pending-server-changes "pending server changes" :sync/show-conflicts "Show sync conflicts" diff --git a/src/resources/dicts/zh-cn.edn b/src/resources/dicts/zh-cn.edn index 5401f3c73c..7cca3407c1 100644 --- a/src/resources/dicts/zh-cn.edn +++ b/src/resources/dicts/zh-cn.edn @@ -1707,7 +1707,9 @@ :storage.recycle/retention-desc "已删除的页面和块会保留在此处,直到被恢复或在 30 天后被自动垃圾回收。" :storage.recycle/title "回收站" + :sync/assets-downloading "正在下载资源" :sync/assets-downloading-count "正在下载资源({1})" + :sync/assets-uploading "正在上传资源" :sync/assets-uploading-count "正在上传资源({1})" :sync/conflicts-description "这些服务器版本与本地编辑冲突。复制你需要的文本,然后手动编辑该块。" :sync/conflicts-title "同步冲突" @@ -1721,6 +1723,7 @@ :sync/more-debug-info "更多调试信息" :sync/offline "离线" :sync/online "在线" + :sync/pending-asset-uploads "待上传的资源" :sync/pending-local-changes "待同步的本地更改" :sync/pending-server-changes "待同步的服务器更改" :sync/show-conflicts "显示同步冲突" diff --git a/src/test/frontend/components/rtc/indicator_test.cljs b/src/test/frontend/components/rtc/indicator_test.cljs new file mode 100644 index 0000000000..a4a878b080 --- /dev/null +++ b/src/test/frontend/components/rtc/indicator_test.cljs @@ -0,0 +1,30 @@ +(ns frontend.components.rtc.indicator-test + (:require [cljs.test :refer [deftest is]] + [frontend.components.rtc.indicator :as indicator])) + +(deftest asset-transfer-counts-counts-active-uploads-and-downloads + (is (= {:upload 2 + :download 1} + (indicator/asset-transfer-counts + {"upload-1" {:direction :upload :loaded 0 :total 10} + "upload-2" {:direction :upload :loaded 5 :total 10} + "upload-done" {:direction :upload :loaded 10 :total 10} + "download-1" {:direction :download :loaded 1 :total 10} + "missing-total" {:direction :download :loaded 1} + "other" {:direction :other :loaded 0 :total 10}})))) + +(deftest asset-status-rows-shows-pending-upload-and-active-transfer-info + (is (= [{:count 3 :label-key :sync/pending-asset-uploads} + {:count 1 :label-key :sync/assets-uploading} + {:count 2 :label-key :sync/assets-downloading}] + (indicator/asset-status-rows + {:pending-asset-ops 3 + :asset-transfer-counts {:upload 1 + :download 2}})))) + +(deftest asset-status-rows-hides-zero-counts + (is (= [] + (indicator/asset-status-rows + {:pending-asset-ops 0 + :asset-transfer-counts {:upload 0 + :download 0}})))) diff --git a/src/test/frontend/worker/db_sync_test.cljs b/src/test/frontend/worker/db_sync_test.cljs index 7a191e61a5..9b5293dcd8 100644 --- a/src/test/frontend/worker/db_sync_test.cljs +++ b/src/test/frontend/worker/db_sync_test.cljs @@ -4266,6 +4266,44 @@ (mapv #(select-keys % [:block-uuid :attr :value :remote-t]) (client-op/get-sync-conflicts test-repo block-uuid))))))))) +(deftest sync-title-conflict-store-keeps-only-latest-non-empty-value-test + (testing "newer title conflicts replace previous title conflicts for the block" + (let [client-ops-conn (new-client-ops-db) + block-uuid (random-uuid)] + (with-datascript-conns (db-test/create-conn-with-blocks + {:pages-and-blocks + [{:page {:block/title "page 1"} + :blocks [{:block/title "target"}]}]}) + client-ops-conn + (fn [] + (client-op/add-sync-conflicts! + test-repo + [{:block-uuid block-uuid + :attr :block/title + :value "first remote title" + :remote-t 42} + {:block-uuid block-uuid + :attr :block/title + :value "" + :remote-t 43} + {:block-uuid block-uuid + :attr :block/title + :value "latest remote title" + :remote-t 44}]) + (is (= [{:block-uuid block-uuid + :attr :block/title + :value "latest remote title" + :remote-t 44}] + (mapv #(select-keys % [:block-uuid :attr :value :remote-t]) + (client-op/get-sync-conflicts test-repo block-uuid)))) + (client-op/add-sync-conflicts! + test-repo + [{:block-uuid block-uuid + :attr :block/title + :value "" + :remote-t 45}]) + (is (empty? (client-op/get-sync-conflicts test-repo block-uuid)))))))) + (deftest sync-conflict-clear-test (testing "resolved sync conflicts are removed for the block" (let [client-ops-conn (new-client-ops-db) diff --git a/src/test/frontend/worker/platform_browser_test.cljs b/src/test/frontend/worker/platform_browser_test.cljs new file mode 100644 index 0000000000..90311b1170 --- /dev/null +++ b/src/test/frontend/worker/platform_browser_test.cljs @@ -0,0 +1,50 @@ +(ns frontend.worker.platform-browser-test + (:require [cljs.test :refer [async deftest is]] + [frontend.worker.platform.browser :as platform-browser] + [goog.object :as gobj] + [promesa.core :as p])) + +(defn- fake-pfs + [files] + #js {:readFile (fn [path] + (if (contains? @files path) + (p/resolved (get @files path)) + (p/rejected (js/Error. (str "ENOENT: " path))))) + :writeFile (fn [path payload] + (swap! files assoc path payload) + (p/resolved nil)) + :stat (fn [path] + (if-let [payload (get @files path)] + (p/resolved #js {:size (.-byteLength payload) + :type "file"}) + (p/rejected (js/Error. (str "ENOENT: " path))))) + :mkdir (fn [path] + (swap! files assoc path (js/Uint8Array. #js [])) + (p/resolved nil))}) + +(deftest browser-platform-asset-read-uses-renderer-memory-path-test + (async done + (let [repo "logseq_db_Lambda RTC" + file-name "69fb07e1-852f-4896-9d30-761843368fdb.png" + payload (js/Uint8Array. #js [1 2 3]) + files (atom {"/Lambda%20RTC/assets/69fb07e1-852f-4896-9d30-761843368fdb.png" payload}) + original-pfs (gobj/get js/globalThis "pfs") + original-location (gobj/get js/globalThis "location")] + (gobj/set js/globalThis "pfs" (fake-pfs files)) + (gobj/set js/globalThis "location" #js {:href "http://localhost/" :search ""}) + (-> (p/let [platform (platform-browser/browser-platform) + read-payload ((get-in platform [:storage :asset-read-bytes!]) + repo + file-name)] + (is (= payload read-payload))) + (p/catch (fn [error] + (is false (str "unexpected error: " error)))) + (p/finally + (fn [] + (if (some? original-pfs) + (gobj/set js/globalThis "pfs" original-pfs) + (gobj/remove js/globalThis "pfs")) + (if (some? original-location) + (gobj/set js/globalThis "location" original-location) + (gobj/remove js/globalThis "location")) + (done))))))) diff --git a/src/test/logseq/cli/server_test.cljs b/src/test/logseq/cli/server_test.cljs index 30c5b9f772..e1572a48d3 100644 --- a/src/test/logseq/cli/server_test.cljs +++ b/src/test/logseq/cli/server_test.cljs @@ -13,6 +13,7 @@ [logseq.cli.test-helper :as test-helper] [logseq.common.version :as version] [logseq.db-worker.daemon :as daemon] + [logseq.db-worker.server-list :as server-list] [promesa.core :as p])) (deftest spawn-server-omits-host-and-port-flags @@ -969,6 +970,34 @@ (is false (str "unexpected error: " e)))) (p/finally done))))) +(deftest list-servers-preserves-concurrent-server-list-writes + (async done + (let [root-dir (node-helper/create-tmp-dir "cli-server-list-race") + config-path (node-path/join root-dir "cli.edn") + server-list-file (cli-config/server-list-path root-dir) + stale-entry {:pid 999999 :port 65535} + live-entry {:pid (.-pid js/process) :port 65432} + appended? (atom false)] + (fs/writeFileSync server-list-file + (str (:pid stale-entry) " " (:port stale-entry) "\n") + "utf8") + (-> (p/with-redefs [daemon/pid-status (fn [pid] + (when (and (= (:pid stale-entry) pid) + (not @appended?)) + (reset! appended? true) + (server-list/append-entry! server-list-file live-entry)) + :not-found)] + (cli-server/list-servers {:root-dir root-dir + :config-path config-path})) + (p/then (fn [servers] + (is (empty? servers)) + (is @appended?) + (is (= [live-entry] + (server-list/read-entries server-list-file))))) + (p/catch (fn [e] + (is false (str "unexpected error: " e)))) + (p/finally done))))) + (deftest cleanup-revision-mismatched-servers-kills-only-cli-owned-targets (async done (let [stop-calls (atom [])] diff --git a/src/test/logseq/db_worker/server_list_test.cljs b/src/test/logseq/db_worker/server_list_test.cljs index 2f79364c82..7188a7e7fc 100644 --- a/src/test/logseq/db_worker/server_list_test.cljs +++ b/src/test/logseq/db_worker/server_list_test.cljs @@ -1,5 +1,7 @@ (ns logseq.db-worker.server-list-test - (:require [cljs.test :refer [deftest is]] + (:require ["fs" :as fs] + [cljs.test :refer [deftest is]] + [frontend.test.node-helper :as node-helper] [logseq.db-worker.server-list :as server-list])) (deftest path-derives-server-list-from-root-dir @@ -10,3 +12,69 @@ (is (thrown-with-msg? js/Error #"root-dir is required" (server-list/path nil)))) + +(deftest server-list-lock-path-derives-sibling-lock-file + (is (= "/tmp/logseq-root/server-list.lock" + (server-list/lock-path "/tmp/logseq-root/server-list")))) + +(deftest read-entries-ignores-server-list-lock + (let [root-dir (node-helper/create-tmp-dir "server-list-read-lock-free") + file-path (server-list/path root-dir) + lock-file (server-list/lock-path file-path)] + (fs/writeFileSync file-path "123 456\n" "utf8") + (fs/writeFileSync lock-file "locked" "utf8") + (is (= [{:pid 123 :port 456}] + (server-list/read-entries file-path))) + (is (fs/existsSync lock-file)) + (is (= "locked" + (.toString (fs/readFileSync lock-file) "utf8"))))) + +(deftest append-entry-deduplicates-valid-entry-under-update + (let [root-dir (node-helper/create-tmp-dir "server-list-append-dedupe") + file-path (server-list/path root-dir) + entry {:pid 123 :port 456}] + (is (= entry (server-list/append-entry! file-path entry))) + (is (= entry (server-list/append-entry! file-path entry))) + (is (= "123 456\n" + (.toString (fs/readFileSync file-path) "utf8"))) + (is (= [entry] + (server-list/read-entries file-path))))) + +(deftest remove-entry-preserves-unrelated-current-entry + (let [root-dir (node-helper/create-tmp-dir "server-list-remove-preserve") + file-path (server-list/path root-dir)] + (server-list/rewrite-entries! file-path [{:pid 111 :port 222} + {:pid 333 :port 444}]) + (server-list/remove-entry! file-path {:pid 111 :port 222}) + (is (= "333 444\n" + (.toString (fs/readFileSync file-path) "utf8"))) + (is (= [{:pid 333 :port 444}] + (server-list/read-entries file-path))))) + +(deftest append-entry-repairs-stale-server-list-lock + (let [root-dir (node-helper/create-tmp-dir "server-list-stale-lock") + file-path (server-list/path root-dir) + lock-file (server-list/lock-path file-path)] + (fs/writeFileSync lock-file + (js/JSON.stringify (clj->js {:pid 999999 + :lock-id "stale-lock"})) + "utf8") + (server-list/append-entry! file-path {:pid 123 :port 456}) + (is (= [{:pid 123 :port 456}] + (server-list/read-entries file-path))) + (is (not (fs/existsSync lock-file))))) + +(deftest append-entry-times-out-on-malformed-server-list-lock + (let [root-dir (node-helper/create-tmp-dir "server-list-malformed-lock") + file-path (server-list/path root-dir) + lock-file (server-list/lock-path file-path)] + (fs/writeFileSync lock-file "not-json" "utf8") + (try + (server-list/append-entry! file-path {:pid 123 :port 456}) + (is false "expected server-list lock timeout") + (catch :default e + (is (= :server-list-lock-timeout (:code (ex-data e)))) + (is (= file-path (:file-path (ex-data e)))) + (is (= lock-file (:lock-path (ex-data e)))) + (is (= "not-json" + (.toString (fs/readFileSync lock-file) "utf8")))))))