CLI E2E Sync Suite Implementation Plan
Goal: Add a dedicated sync-focused cli-e2e suite that is isolated from non-sync coverage and validates MVP upload and download behavior using two independent db-worker-node processes driven only through CLI commands.
Architecture: Keep the existing non-sync suite as the default cli-e2e path and introduce a separate sync suite with its own manifest files, runner entrypoint, and preconditions.
Architecture: Model sync behavior with two distinct data directories that run two different db-worker-node processes against the same graph name, then assert health via sync status and data convergence via CLI queries.
Architecture: Ship MVP coverage for upload and download first, while leaving realtime sync-start convergence tests for a follow-up phase.
Tech Stack: Babashka, EDN case manifests, logseq-cli, db-worker-node, JSON parsing via Python 3 in shell helpers, existing CLI sync commands.
Related: Builds on /Users/rcmerci/gh-repos/logseq/docs/agent-guide/064-logseq-cli-integration-test-shell-refactor.md.
Related: Relates to /Users/rcmerci/gh-repos/logseq/docs/agent-guide/047-logseq-cli-sync-command.md.
Related: Relates to /Users/rcmerci/gh-repos/logseq/docs/agent-guide/048-sync-download-start-reliability.md.
Related: Relates to /Users/rcmerci/gh-repos/logseq/docs/agent-guide/051-logseq-cli-sync-upload-fix.md.
Problem statement
cli-e2e currently excludes all sync commands by design.
/Users/rcmerci/gh-repos/logseq/cli-e2e/spec/non_sync_inventory.edn explicitly excludes the sync prefix, and /Users/rcmerci/gh-repos/logseq/cli-e2e/src/logseq/cli/e2e/manifests.clj only loads non_sync_* manifests.
This keeps non-sync coverage clean, but there is currently no shell-first cli-e2e coverage for sync upload and download behavior.
Current sync integration checks in /Users/rcmerci/gh-repos/logseq/src/test/logseq/cli/integration_test.cljs are mostly mocked transport-level tests and do not validate real shell command orchestration in a two-process setup.
The requested test architecture requires two independent db-worker-node processes in different directories, both operating on the same graph name, with CLI-only operations and status-driven verification.
Current implementation snapshot
| Area | Current file | Current behavior | Gap for this plan |
|---|---|---|---|
| Suite manifests | /Users/rcmerci/gh-repos/logseq/cli-e2e/src/logseq/cli/e2e/manifests.clj | Loads only non_sync_inventory.edn and non_sync_cases.edn. | No sync suite loading path. |
| Non-sync inventory policy | /Users/rcmerci/gh-repos/logseq/cli-e2e/spec/non_sync_inventory.edn | Excludes sync, login, and logout. | Sync tests must live in separate manifests to avoid policy conflict. |
| CLI runner tasks | /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn | Exposes test, list-cases, and build for one suite. | Need dedicated sync tasks and clearer suite-level ergonomics. |
| Case execution model | /Users/rcmerci/gh-repos/logseq/cli-e2e/src/logseq/cli/e2e/runner.clj | Supports shell-first setup and command chains with templating. | No built-in wait helper for polling sync status until pending queues are empty. |
| Sync command behavior | /Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/command/sync.cljs | sync upload, sync download, and sync status are implemented and return structured JSON. | E2E harness does not yet assert these behaviors with two independent workers. |
| Server process isolation | /Users/rcmerci/gh-repos/logseq/src/main/logseq/cli/server.cljs | Data-dir and graph path determine lock ownership and process identity. | Need explicit test cases that prove two data dirs produce two independent worker processes for the same graph name. |
Scope and MVP boundaries
MVP in this plan covers upload and download flows only.
MVP does not add coverage for long-running sync start websocket convergence behavior.
MVP requires CLI-only graph operations during test execution, including graph creation, mutation, upload, download, status checks, and data verification queries.
MVP keeps non-sync test behavior unchanged and isolated.
Testing Plan
I will add runner-level unit tests that fail first when sync suite manifests and tasks are missing, and pass only after suite separation is implemented.
I will add sync suite manifest coverage tests that fail first when required sync command options are not covered by MVP cases.
I will add shell-first sync E2E cases that fail first and validate the two-data-dir architecture, sync status health checks, pending queue convergence checks, and graph data parity assertions.
I will validate command ergonomics by running non-sync and sync suites independently and ensuring their outputs and selection logic remain deterministic.
I will follow @test-driven-development for every behavior slice in this plan.
NOTE: I will write all tests before I add any implementation behavior.
Target sync suite architecture
+----------------------------------+      +----------------------------------+
| data-dir A                       |      | data-dir B                       |
| graph: sync-e2e-mvp              |      | graph: sync-e2e-mvp              |
| db-worker-node process A         |      | db-worker-node process B         |
+----------------+-----------------+      +----------------+-----------------+
                 |                                         |
                 | CLI commands only                       | CLI commands only
                 v                                         v
        logseq sync upload                       logseq sync download
                 |                                         |
                 +---------> remote sync backend <---------+
Verification path:
1) mutate graph via CLI in A.
2) run sync upload via CLI in A.
3) poll sync status via CLI until pending queues settle and last-error remains nil.
4) run sync download via CLI in B.
5) compare graph data via CLI queries in A and B.
Detailed implementation plan
Phase 1. Add explicit sync suite separation in cli-e2e.
- Add a failing unit test in /Users/rcmerci/gh-repos/logseq/cli-e2e/test/logseq/cli/e2e/main_test.clj that expects a dedicated sync test entrypoint to load sync manifests instead of non-sync manifests.
- Add a failing unit test in /Users/rcmerci/gh-repos/logseq/cli-e2e/test/logseq/cli/e2e/main_test.clj that expects non-sync test to keep current behavior unchanged.
- Add sync_inventory.edn loading support in /Users/rcmerci/gh-repos/logseq/cli-e2e/src/logseq/cli/e2e/manifests.clj with a suite selector API.
- Add sync_cases.edn loading support in /Users/rcmerci/gh-repos/logseq/cli-e2e/src/logseq/cli/e2e/manifests.clj with the same suite selector API.
- Add suite-aware run helpers in /Users/rcmerci/gh-repos/logseq/cli-e2e/src/logseq/cli/e2e/main.clj so non-sync and sync share execution plumbing but load different manifests.
- Add new tasks in /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn for test-sync and list-sync-cases.
- Keep existing test and list-cases mapped to non-sync manifests.
- Run bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn unit-test and confirm failures turn green for the new suite selection behavior.
Phase 2. Define sync inventory and MVP case manifests.
- Create /Users/rcmerci/gh-repos/logseq/cli-e2e/spec/sync_inventory.edn with MVP required commands sync upload, sync download, and sync status.
- Include only MVP-required sync options in sync_inventory.edn to avoid over-scoping phase one.
- Add a failing coverage test in /Users/rcmerci/gh-repos/logseq/cli-e2e/test/logseq/cli/e2e/coverage_test.clj for missing sync command coverage.
- Create /Users/rcmerci/gh-repos/logseq/cli-e2e/spec/sync_cases.edn with initial empty or placeholder MVP case definitions that intentionally fail coverage.
- Run bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn test-sync --skip-build and confirm coverage failure is clear and actionable.
Phase 3. Add reusable sync status and graph parity helper scripts.
- Add /Users/rcmerci/gh-repos/logseq/cli-e2e/scripts/wait_sync_status.py that repeatedly executes CLI sync status --output json until pending queues reach zero or timeout.
- Make wait_sync_status.py fail immediately when status is not ok or when data.last-error is not null.
- Add /Users/rcmerci/gh-repos/logseq/cli-e2e/scripts/compare_graph_queries.py that executes two CLI query commands and compares normalized payloads.
- Keep helper scripts CLI-only by calling node static/logseq-cli.js commands rather than reading DB files directly.
- Add shell-level tests for these helper scripts in /Users/rcmerci/gh-repos/logseq/cli-e2e/test/logseq/cli/e2e/runner_test.clj or a new helper test namespace using mocked command execution.
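The polling behavior described for wait_sync_status.py could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the final script: the status and data.last-error checks come from this plan, but the pending counter key names are not specified here, so the caller passes them in as pending_keys. The fetch_status callable, the SyncStatusError class, and the sleep and clock parameters are hypothetical names introduced so the loop can be tested with mocked command execution.

```python
import json
import subprocess
import time
from typing import Callable, Sequence


class SyncStatusError(RuntimeError):
    """Raised when sync status reports a failure or never settles."""


def run_json_command(argv: Sequence[str]) -> dict:
    """Run a CLI command and parse its stdout as JSON.

    The caller supplies the full argv (for example the node
    static/logseq-cli.js invocation for sync status --output json).
    """
    completed = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


def wait_for_sync_settled(
    fetch_status: Callable[[], dict],
    pending_keys: Sequence[str],  # assumption: counter names are supplied by the caller
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until every pending counter is zero, failing fast on errors."""
    deadline = clock() + timeout_s
    while True:
        payload = fetch_status()
        # Fail immediately when the command itself did not succeed.
        if payload.get("status") != "ok":
            raise SyncStatusError("status is not ok: " + json.dumps(payload))
        data = payload.get("data") or {}
        # Fail on last-error even if the pending counters already reached zero.
        if data.get("last-error") is not None:
            raise SyncStatusError("last-error is set: " + json.dumps(payload))
        # A missing counter is treated as not settled, so typos surface as timeouts.
        if all(data.get(key) == 0 for key in pending_keys):
            return payload
        if clock() >= deadline:
            # Include the last seen payload to make timeouts debuggable.
            raise SyncStatusError("timed out; last payload: " + json.dumps(payload))
        sleep(interval_s)
```

Injecting fetch_status keeps the loop independent of how the CLI is invoked, so the same function can be driven by run_json_command in a real run and by canned payloads in a unit test.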
Phase 4. Implement MVP sync upload/download test case with two worker processes.
- Add one MVP case in /Users/rcmerci/gh-repos/logseq/cli-e2e/spec/sync_cases.edn that provisions two data dirs under one temp root.
- In setup, create two separate config files for directory A and directory B with explicit sync endpoint keys and auth placeholders sourced from environment variables.
- In setup, create the graph in directory A via CLI and add deterministic marker data via CLI upsert commands.
- In setup, start db-worker-node for graph A via CLI server start.
- In main commands, run CLI sync upload in directory A.
- In main commands, run wait_sync_status.py against directory A to ensure last-error remains empty and pending counters settle.
- In main commands, run CLI sync download in directory B for the same graph name.
- In main commands, start db-worker-node for graph B via CLI server start.
- In main commands, use compare_graph_queries.py to compare deterministic query outputs between A and B.
- In cleanup, stop servers for both directory A and directory B via CLI server stop.
- Ensure the case :covers map marks sync upload, sync download, and sync status coverage in /Users/rcmerci/gh-repos/logseq/cli-e2e/spec/sync_cases.edn.
- Run bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn test-sync --skip-build and verify the MVP case passes.
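The cleanup step for both directories has to keep going when one server never started. One way to express that best-effort behavior is sketched below, assuming cleanup is driven from a small Python helper; the plan does not say where this logic lives, and stop_all, its labels, and the injected run callable are hypothetical names. The argv lists stand in for whatever server stop invocation each directory needs and are deliberately not spelled out.

```python
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple


def _run_exit_code(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(argv), capture_output=True, text=True).returncode


def stop_all(
    stop_commands: Sequence[Tuple[str, Sequence[str]]],
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> List[Tuple[str, str]]:
    """Attempt every stop command and collect failures instead of aborting.

    Each entry is (label, argv). A failure in one entry never prevents
    the remaining entries from being attempted.
    """
    runner = _run_exit_code if run is None else run
    failures: List[Tuple[str, str]] = []
    for label, argv in stop_commands:
        try:
            code = runner(argv)
        except OSError as exc:
            # For example, the executable is missing in a partially set up state.
            failures.append((label, str(exc)))
            continue
        if code != 0:
            failures.append((label, "exit code " + str(code)))
    return failures
```

Returning the failures rather than raising lets the caller decide whether a failed stop should fail the case or only be logged.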
Phase 5. Keep non-sync suite stable and document operator workflow.
- Run bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn test --skip-build and verify non-sync behavior is unchanged.
- Update /Users/rcmerci/gh-repos/logseq/cli-e2e/README.md with separate commands for non-sync and sync suites.
- Add required environment variable documentation in /Users/rcmerci/gh-repos/logseq/cli-e2e/README.md for sync suite execution.
- Optionally add dev:cli-e2e-sync task in /Users/rcmerci/gh-repos/logseq/bb.edn that delegates to bb -f cli-e2e/bb.edn test-sync.
- Run bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn list-sync-cases and ensure the new case is discoverable.
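The documented environment requirements could also be enforced by a small precondition check that runs before any sync case. The sketch below is only an illustration: the variable names in REQUIRED_SYNC_ENV are hypothetical placeholders, since this plan does not yet name the actual variables, and MissingSyncEnvError and require_sync_env are likewise invented for the example.

```python
import os
from typing import Dict, Mapping, Optional, Sequence

# Hypothetical variable names, used only for illustration.
REQUIRED_SYNC_ENV = ("LOGSEQ_E2E_SYNC_ENDPOINT", "LOGSEQ_E2E_SYNC_AUTH_TOKEN")


class MissingSyncEnvError(RuntimeError):
    """Raised when the sync suite is started without its required variables."""


def require_sync_env(
    names: Sequence[str] = REQUIRED_SYNC_ENV,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return the required variables, or fail with one clear message."""
    env = os.environ if environ is None else environ
    # Empty strings count as missing so blank placeholders are caught early.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise MissingSyncEnvError(
            "sync suite requires environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}
```

Reporting every missing variable in one message avoids a fix-and-rerun loop when several are absent.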
Verification commands and expected outcomes
| Command | Expected outcome |
|---|---|
| bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn test --skip-build | Runs non-sync suite only and remains green. |
| bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn list-cases | Lists non-sync case ids only. |
| bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn list-sync-cases | Lists sync case ids only. |
| bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn test-sync --skip-build | Runs sync suite only and validates MVP upload and download behavior. |
| bb -f /Users/rcmerci/gh-repos/logseq/cli-e2e/bb.edn test-sync --skip-build --case sync-upload-download-mvp | Runs one sync MVP case with deterministic status and parity checks. |
Edge cases to include in MVP case design
The sync suite must fail with a clear message when required auth or endpoint environment variables are missing.
The status polling helper must fail on timeout and print the last seen status payload for debugging.
The status polling helper must fail when last-error appears even if pending counters reach zero.
The graph parity helper must compare normalized query results, not raw command output strings that can differ by formatting.
Cleanup must tolerate partially started state and still attempt to stop both servers.
The sync suite must not mutate or depend on non_sync_* manifest files.
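To illustrate the normalized comparison required of the graph parity helper, the sketch below parses both outputs as JSON and compares canonical forms instead of raw strings. It assumes the query commands emit JSON and that result order is not significant for the chosen deterministic queries; if order matters, the list sorting step should be dropped. The names normalize and payloads_match are hypothetical.

```python
import json
from typing import Any


def normalize(value: Any) -> Any:
    """Return a canonical form that ignores key order and list order."""
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, list):
        items = [normalize(item) for item in value]
        # Sort by a stable JSON rendering so mixed element types stay comparable.
        return sorted(items, key=lambda item: json.dumps(item, sort_keys=True))
    return value


def payloads_match(raw_a: str, raw_b: str) -> bool:
    """Compare two CLI outputs by parsed content rather than formatting."""
    return normalize(json.loads(raw_a)) == normalize(json.loads(raw_b))
```

With this shape, whitespace, key order, and row order differences between directory A and directory B do not cause false failures, while a missing or changed record still does.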
Open clarifications to resolve before implementation
MVP sync suite will target local db-sync by default (http://localhost:8080 plus local websocket).
CI integration is intentionally out of scope for this phase and will be decided after MVP stabilizes.
Confirm the minimum auth material for sync MVP in test environments, including whether a refresh token is strictly required or whether pre-seeded runtime tokens in config are sufficient.
Testing Details
The new tests validate real shell behavior through compiled logseq-cli commands and real db-worker-node process lifecycle handling across two independent data directories.
The MVP sync case verifies behavior outcomes by checking sync health status, pending queue convergence, and cross-directory graph data parity for deterministic query payloads.
The suite separation tests ensure sync coverage does not destabilize non-sync command coverage expectations.
Implementation Details
- Keep non-sync manifests and command coverage unchanged.
- Add sync manifests as a separate suite, not an extension of non-sync inventory.
- Reuse existing main/run! and runner infrastructure with suite-aware manifest loading.
- Keep all graph mutations and validations CLI-driven in case commands and helper scripts.
- Use two explicit data directories per sync case to guarantee two independent db-worker-node processes.
- Poll sync status until pending counters settle and fail on last-error.
- Compare graph parity through deterministic CLI query outputs.
- Document sync suite environment requirements in cli-e2e/README.md.
- Keep sync suite runnable independently with test-sync and list-sync-cases tasks.
- Defer sync start realtime scenarios to a follow-up plan after MVP upload and download stabilization.
Question
Should the MVP include only the one-direction flow (A upload -> B download), or should it also include the reverse-direction snapshot refresh in the same phase?