Align graph dir encoding between logseq-cli and desktop app
Summary
Align logseq-cli, db-worker-node, and desktop app handling of graph dir / graph-name so special characters are encoded and decoded with one shared, reversible contract.
The authoritative contract would be the existing encode-graph-dir-name / decode-graph-dir-name pair in src/main/frontend/worker_common/util.cljc, which is already used by db-worker-node and logseq-cli server-side graph directory resolution.
This plan keeps user-facing graph names unchanged and only aligns their on-disk directory representation.
Background
Current code paths do not agree on how a graph name maps to a graph directory on disk:
- db-worker-node and logseq-cli server/runtime paths use a reversible graph-dir encoding.
- desktop app contains paths that join the raw graph name directly into a filesystem path.
- some Electron and CLI-adjacent helpers still use lossy sanitize-db-name behavior.
- shared graph discovery still contains legacy decoding logic for older naming conventions, but not the current reversible encoding.
This mismatch becomes visible when graph names contain special characters such as /, :, %, ~, or spaces.
Goals
- Use one shared graph-dir encoding/decoding contract across CLI and desktop app.
- Preserve current user-facing graph-name semantics.
- Keep logseq_db_ prefix canonicalization separate from graph-dir encoding.
- Define compatibility behavior for legacy graph directory names.
- Add tests that cover special-character graph names across all affected entry points.
Non-goals
- Redesign the user-visible graph naming model.
- Change the existing logseq_db_ display normalization rules.
- Remove all legacy compatibility in one step without an explicit migration strategy.
Current behavior
Shared reversible encoding already exists
Authoritative implementation today:
- src/main/frontend/worker_common/util.cljc
  - encode-graph-dir-name
  - decode-graph-dir-name
Current behavior:
- encodeURIComponent is applied.
- literal ~ is rewritten to %7E.
- % is rewritten to ~.
- decoding reverses ~ -> % and then applies decodeURIComponent.
This yields a reversible, filesystem-safe directory key that contains no / or \ path separators.
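The four steps above can be sketched in TypeScript as follows. This is only an illustration of the described behavior, not the ClojureScript source: the names encodeGraphDirName and decodeGraphDirName are hypothetical stand-ins for encode-graph-dir-name and decode-graph-dir-name.

```typescript
// Hypothetical TypeScript stand-ins for encode-graph-dir-name /
// decode-graph-dir-name; a sketch of the behavior listed above.
function encodeGraphDirName(graphDirKey: string): string {
  return encodeURIComponent(graphDirKey)
    .replace(/~/g, "%7E") // encodeURIComponent leaves "~" untouched, so escape it first
    .replace(/%/g, "~"); // then "~" takes over from "%" as the escape character
}

function decodeGraphDirName(dirName: string): string {
  // Restore "%" as the escape character, then undo the percent-encoding.
  return decodeURIComponent(dirName.replace(/~/g, "%"));
}

const encodedSlash = encodeGraphDirName("foo/bar"); // "foo~2Fbar"
const encodedTilde = encodeGraphDirName("til~de"); // "til~7Ede"
const decodedSlash = decodeGraphDirName(encodedSlash); // "foo/bar"
```

Escaping the literal ~ before swapping % for ~ is what keeps the mapping reversible: after encoding, every ~ in the directory name is an escape marker and never a literal character.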
db-worker-node follows the shared contract
Relevant files:
- src/main/frontend/worker/db_worker_node_lock.cljs
- src/main/frontend/worker/platform/node.cljs
- src/main/frontend/worker/db_worker_node.cljs
- src/main/frontend/worker/graph_dir.cljs
Current behavior:
- repo identity strips one leading logseq_db_ to produce a graph-dir key.
- graph-dir key is encoded with encode-graph-dir-name.
- list-graphs decodes on-disk directory names back to graph-dir keys.
- worker log paths and lock paths are stored under the encoded graph directory.
CLI is partially aligned
Relevant files:
- src/main/logseq/cli/server.cljs
- src/main/logseq/cli/command/core.cljs
- src/main/logseq/cli/command/graph.cljs
- src/main/logseq/cli/common.cljs
- deps/cli/src/logseq/cli/common/graph.cljs
- deps/cli/src/logseq/cli/util.cljs
Current behavior:
- cli.server already uses the same canonical graph-dir path contract as db-worker-node.
- graph display/input normalization strips or restores one logseq_db_ prefix as needed.
- unlink-graph! still derives directory names with sanitize-db-name, which is lossy.
- shared discovery in deps/cli still decodes only older directory naming patterns such as ++ and +3A+.
Desktop app is not aligned
Relevant files:
- src/electron/electron/utils.cljs
- src/electron/electron/db.cljs
- src/electron/electron/handler.cljs
- src/electron/electron/url.cljs
- src/main/frontend/config.cljs
Current behavior:
- electron.utils/get-graph-dir joins the raw graph name into the graph path after db-prefix stripping.
- if the graph name contains /, the resulting path becomes nested directories.
- electron.db still uses sanitize-db-name in some db path creation logic.
- frontend local-dir helpers also treat graph name as a raw path segment.
Problem statement
The same logical graph name can map to different on-disk paths depending on which subsystem touches it:
- reversible encoded path in db-worker-node
- raw path join in Electron/frontend
- lossy underscore replacement in sanitize-based helpers
- legacy decode-only behavior in shared graph discovery
As a result:
- a graph may be listable but not removable
- a graph may be resolvable in CLI but not in desktop app
- a graph name containing / may accidentally create path nesting in one flow but not another
- existing tests do not enforce cross-subsystem parity
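To make the divergence concrete, the sketch below maps one logical graph name through three of the strategies listed above. Everything in it is illustrative: the /graphs root, the helper names, and especially the sanitize rule are assumptions, since this plan only says that sanitize-db-name replaces characters with underscores and does not give its exact rule.

```typescript
// Illustrative only: helper names and the "/graphs" root are hypothetical.
const GRAPHS_ROOT = "/graphs";

// Raw join: the name is used as-is, so "/" silently creates nested directories.
function rawJoinDir(graphName: string): string {
  return GRAPHS_ROOT + "/" + graphName;
}

// Assumed stand-in for sanitize-db-name: every character outside
// [A-Za-z0-9_-] becomes "_". The real rule may differ; it is lossy either way.
function sanitizedDir(graphName: string): string {
  return GRAPHS_ROOT + "/" + graphName.replace(/[^A-Za-z0-9_-]/g, "_");
}

// Reversible encoding, as described for encode-graph-dir-name.
function encodedDir(graphName: string): string {
  const encoded = encodeURIComponent(graphName)
    .replace(/~/g, "%7E")
    .replace(/%/g, "~");
  return GRAPHS_ROOT + "/" + encoded;
}

const sampleName = "foo/bar";
const rawPath = rawJoinDir(sampleName); // "/graphs/foo/bar" (nested)
const lossyPath = sanitizedDir(sampleName); // "/graphs/foo_bar"
const canonicalPath = encodedDir(sampleName); // "/graphs/foo~2Fbar"

// Under the assumed sanitize rule, distinct graph names collide on disk.
const namesCollide = sanitizedDir("a:b") === sanitizedDir("a/b"); // true
```

Three subsystems following these three rules would look for the same graph in three different places, which is the mismatch the proposed contract is meant to remove.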
Proposed contract
1. Separate graph identity from graph directory representation
The plan would explicitly distinguish:
- graph-name / repo: user-facing identifier, subject to existing logseq_db_ canonicalization rules
- graph-dir key: graph-name with exactly one leading db prefix stripped
- encoded graph-dir: on-disk directory name produced only by encode-graph-dir-name
This separation would make it clear that special-character handling belongs to the graph-dir layer, not the user-facing name layer.
2. Make the db-worker-node contract authoritative
The repository would standardize on:
- repo -> graph-dir key: strip one leading logseq_db_
- graph-dir key -> encoded graph-dir: encode-graph-dir-name
- encoded graph-dir -> graph-dir key: decode-graph-dir-name
Any code path that needs an on-disk db graph directory would route through this contract rather than reimplementing path logic.
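A minimal sketch of that three-step contract, written in TypeScript for illustration. The function names and the graphsRoot parameter are hypothetical, and real code would presumably build the path with a platform path-join helper rather than string concatenation.

```typescript
// Sketch of the proposed contract; all names here are hypothetical.
const DB_PREFIX = "logseq_db_";

// repo -> graph-dir key: strip exactly one leading prefix, if present.
function repoToGraphDirKey(repo: string): string {
  return repo.indexOf(DB_PREFIX) === 0 ? repo.slice(DB_PREFIX.length) : repo;
}

// graph-dir key -> encoded graph-dir (behavior of encode-graph-dir-name).
function encodeGraphDirName(graphDirKey: string): string {
  return encodeURIComponent(graphDirKey)
    .replace(/~/g, "%7E")
    .replace(/%/g, "~");
}

// encoded graph-dir -> graph-dir key (behavior of decode-graph-dir-name).
function decodeGraphDirName(dirName: string): string {
  return decodeURIComponent(dirName.replace(/~/g, "%"));
}

// repo -> on-disk graph directory path, as a single entry point.
function repoToGraphDirPath(graphsRoot: string, repo: string): string {
  return graphsRoot + "/" + encodeGraphDirName(repoToGraphDirKey(repo));
}

const graphDirKey = repoToGraphDirKey("logseq_db_a:b"); // "a:b"
const graphDirPath = repoToGraphDirPath("/data/graphs", "logseq_db_a:b"); // "/data/graphs/a~3Ab"
const onePrefixOnly = repoToGraphDirKey("logseq_db_logseq_db_x"); // "logseq_db_x"
```

Routing every path-authoritative caller through one function like repoToGraphDirPath is the point of the contract: prefix stripping and encoding then cannot drift apart between subsystems.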
3. Keep user-visible graph names unchanged
The plan would preserve current user-visible behavior:
- CLI graph names remain prefix-free for display and config storage where already intended.
- desktop app continues to display logical graph names, not encoded directory names.
- URL-level graph identification continues to resolve to logical graph names, not on-disk encoded names.
Proposed code changes
A. Consolidate path-authoritative helpers
Add or reuse one shared helper layer for:
- converting repo to graph-dir key
- converting graph-dir key to encoded graph directory
- converting repo directly to on-disk graph directory path
Target files likely involved:
- src/main/frontend/worker/graph_dir.cljs
- src/main/frontend/worker/db_worker_node_lock.cljs
- src/electron/electron/utils.cljs
- src/main/frontend/config.cljs
- deps/cli/src/logseq/cli/util.cljs
Expected outcome:
- no raw path join for logical graph names in path-authoritative code
- no duplicate graph-dir encoding implementations
B. Align Electron graph-dir resolution
Replace raw graph path derivation in Electron with the shared encoded graph-dir contract.
Target files:
- src/electron/electron/utils.cljs
- src/electron/electron/handler.cljs
- src/electron/electron/db.cljs
Expected outcome:
- desktop app resolves the same on-disk graph dir as db-worker-node
- graph names containing /, :, %, ~, or spaces behave predictably
- sanitize-db-name is no longer used for authoritative db graph-dir mapping
C. Align CLI remove/unlink behavior
Update CLI removal/unlink flows to resolve graph directories via the same encoded contract used by list/start/lock behavior.
Target file:
src/main/logseq/cli/common.cljs
Expected outcome:
- a graph that can be listed or switched to can also be removed through the same path mapping
D. Align shared graph discovery
Update shared discovery helpers so current encoded graph dirs are decoded correctly, while preserving deliberate support for legacy names where needed.
Target file:
deps/cli/src/logseq/cli/common/graph.cljs
Expected outcome:
- desktop/CLI discovery would recognize encoded graph dirs produced by current db-worker-node logic
- legacy decode branches would be explicitly documented as compatibility behavior
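One possible shape for such a discovery helper is sketched below. It rests on two assumptions that this plan does not confirm: that the legacy patterns mean +3A+ for : and ++ for /, and that a literal + is an acceptable signal for a legacy name. The second assumption follows from encodeURIComponent escaping +, so a name written by the current encoder never contains a literal +. All function names are hypothetical.

```typescript
// Assumed legacy rules (not confirmed by this plan): "+3A+" meant ":" and
// "++" meant "/". Kept in one place so the compatibility branch is explicit.
function decodeLegacyDirName(dirName: string): string {
  return dirName.replace(/\+3A\+/g, ":").replace(/\+\+/g, "/");
}

// Current reversible decoding, as described for decode-graph-dir-name.
function decodeCurrentDirName(dirName: string): string {
  return decodeURIComponent(dirName.replace(/~/g, "%"));
}

// Hypothetical discovery helper: on-disk directory name -> graph-dir key.
function dirNameToGraphDirKey(dirName: string): string {
  // encodeURIComponent escapes "+", so a literal "+" can only come from a
  // directory that the current encoder did not write.
  if (dirName.indexOf("+") !== -1) {
    return decodeLegacyDirName(dirName);
  }
  try {
    return decodeCurrentDirName(dirName);
  } catch {
    // Malformed escape sequence (decodeURIComponent throws URIError):
    // fall back to the literal directory name instead of failing discovery.
    return dirName;
  }
}

const fromCurrent = dirNameToGraphDirKey("foo~2Fbar"); // "foo/bar"
const fromLegacy = dirNameToGraphDirKey("a+3A+b"); // "a:b"
const fromMalformed = dirNameToGraphDirKey("bad~zz"); // "bad~zz"
```

The try/catch matters for older directories: a pre-existing name that happens to contain ~ followed by non-hex characters would otherwise abort the whole listing.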
E. Audit frontend local-dir helpers
Review helpers that expose graph-related directories to ensure they are either:
- display-only helpers, or
- path-authoritative helpers using the shared encoded contract
Target file:
src/main/frontend/config.cljs
Expected outcome:
- no ambiguous helper remains that appears safe for filesystem use while still using raw graph names
Compatibility and migration
This plan should explicitly decide how to handle already-existing graph directories created by older logic.
Option 1: Read legacy names, write canonical encoded names
Behavior:
- discovery accepts legacy directory names and current encoded names
- all newly created or rewritten paths use the canonical encoded form
- optional one-time migration may rename legacy directories
Pros:
- safer rollout
- less risk of immediately losing access to existing graphs
Cons:
- mixed formats may coexist temporarily
Option 2: Auto-migrate on access
Behavior:
- when a legacy graph directory is detected, code renames it to the canonical encoded path before continuing
Pros:
- converges quickly to one format
Cons:
- higher operational risk
- rename behavior must be designed carefully for active workers and lock files
Option 3: Strict cutover
Behavior:
- only encoded graph dirs are supported after the change
Pros:
- simplest long-term contract
Cons:
- too risky without explicit migration tooling
Recommended direction
Prefer Option 1 for the first rollout:
- read compatibility for legacy directory names
- canonical writes to encoded graph dirs
- add explicit migration follow-up only after parity tests pass
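Option 1 could look roughly like the sketch below: reads try the canonical encoded name first and then any legacy namers, while writes only ever produce the canonical name. The names resolveForRead, resolveForWrite, and LegacyNamer are hypothetical, and the example legacy namer assumes the ++ and +3A+ patterns stood for / and :, which this plan does not spell out.

```typescript
// Hypothetical sketch of "read legacy names, write canonical encoded names".
type LegacyNamer = (graphDirKey: string) => string;

interface ReadResolution {
  dirName: string;
  legacy: boolean; // true when the match came from a legacy naming rule
}

function encodeGraphDirName(graphDirKey: string): string {
  return encodeURIComponent(graphDirKey)
    .replace(/~/g, "%7E")
    .replace(/%/g, "~");
}

// Reads: prefer the canonical directory, then fall back to legacy candidates.
function resolveForRead(
  graphDirKey: string,
  existingDirs: string[],
  legacyNamers: LegacyNamer[]
): ReadResolution | null {
  const canonical = encodeGraphDirName(graphDirKey);
  if (existingDirs.indexOf(canonical) !== -1) {
    return { dirName: canonical, legacy: false };
  }
  for (let i = 0; i < legacyNamers.length; i++) {
    const candidate = legacyNamers[i](graphDirKey);
    if (existingDirs.indexOf(candidate) !== -1) {
      return { dirName: candidate, legacy: true };
    }
  }
  return null;
}

// Writes: always the canonical encoded name, never a legacy one.
function resolveForWrite(graphDirKey: string): string {
  return encodeGraphDirName(graphDirKey);
}

// Assumed legacy rule, for illustration only: "/" -> "++", ":" -> "+3A+".
const assumedLegacyNamer: LegacyNamer = function (key) {
  return key.replace(/\//g, "++").replace(/:/g, "+3A+");
};

const legacyHit = resolveForRead("a:b", ["a+3A+b"], [assumedLegacyNamer]);
const canonicalHit = resolveForRead("a:b", ["a+3A+b", "a~3Ab"], [assumedLegacyNamer]);
const writeTarget = resolveForWrite("a:b"); // "a~3Ab"
```

Returning the legacy flag lets callers log or count legacy hits, which would help decide later whether a dedicated migration step is worth doing.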
Test plan
Unit tests
Extend or add tests for:
- src/test/frontend/worker/worker_common_util_test.cljs
- src/test/frontend/worker/db_worker_node_lock_test.cljs
- src/test/logseq/cli/server_test.cljs
- src/test/logseq/cli/common/graph_test.cljs
- Electron-specific tests if available for graph-dir resolution
Special-character test matrix
All subsystems should use the same examples:
- foo/bar
- a:b
- space name
- 100% legit
- til~de
- mix/of:many %chars~here
Behavior to verify
- encode/decode roundtrip is lossless
- CLI list-graphs returns the same logical graph name that was encoded on disk
- CLI switch/remove resolve the same graph directory
- desktop app resolves the same graph directory as CLI/db-worker-node
- graph names remain user-visible without encoded substitutions
- legacy discovery behavior remains intentional and documented
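The first check in this list, together with the separator guarantee, can be expressed as a small table-driven test. The sketch below is TypeScript for illustration rather than the project's ClojureScript test style. The matrix entries are the six names as read from this plan, and the helper names are hypothetical.

```typescript
// Special-character matrix, as read from the test plan in this document.
const MATRIX = [
  "foo/bar",
  "a:b",
  "space name",
  "100% legit",
  "til~de",
  "mix/of:many %chars~here",
];

function encodeGraphDirName(graphDirKey: string): string {
  return encodeURIComponent(graphDirKey)
    .replace(/~/g, "%7E")
    .replace(/%/g, "~");
}

function decodeGraphDirName(dirName: string): string {
  return decodeURIComponent(dirName.replace(/~/g, "%"));
}

interface MatrixFailure {
  graphName: string;
  reason: string;
}

// Returns every violated expectation instead of stopping at the first one,
// so a single run reports all problematic names.
function checkMatrix(graphNames: string[]): MatrixFailure[] {
  const failures: MatrixFailure[] = [];
  graphNames.forEach(function (graphName) {
    const encoded = encodeGraphDirName(graphName);
    if (decodeGraphDirName(encoded) !== graphName) {
      failures.push({ graphName: graphName, reason: "roundtrip is not lossless" });
    }
    if (/[\/\\]/.test(encoded)) {
      failures.push({ graphName: graphName, reason: "encoded name contains a path separator" });
    }
  });
  return failures;
}

const matrixFailures = checkMatrix(MATRIX);
```

A parity test for each subsystem would run the same matrix through that subsystem's own path helper and compare the result with the encoded name produced here.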
Missing coverage today
The repository currently appears to lack end-to-end parity tests for:
- CLI create/switch/remove with special-character graph names
- Electron graph-name -> graph-dir resolution with special characters
- desktop and CLI agreement on one on-disk graph directory for the same logical graph
Rollout sequence
- Make the shared graph-dir contract explicit in code and docs.
- Update Electron path-authoritative helpers to use encoded graph dirs.
- Update CLI unlink/remove behavior to use the same mapping.
- Update shared graph discovery for encoded graph dirs and legacy compatibility.
- Add parity tests across worker, CLI, and desktop-related helpers.
- Evaluate whether legacy directory migration should be a separate follow-up.
Risks
- Existing graphs may already exist under lossy or raw directory naming rules.
- Desktop-specific compatibility code may rely on current path layout assumptions.
- URL/deeplink flows may resolve graph identifiers separately from filesystem mapping and should not accidentally expose encoded names to users.
- Removing sanitize-db-name from authoritative paths may surface hidden assumptions in older db bootstrap code.
Open questions
- Should legacy raw/sanitized graph directories remain writable, or only readable?
- Should migration happen automatically, manually, or in a later dedicated change?
- Which helper should become the single exported entry point for graph-name -> on-disk graph-dir path resolution?
- Should docs/cli/logseq-cli.md be updated in the same change to clarify that on-disk graph directories are encoded, not always literal graph names?
Expected files to change in implementation
Likely implementation targets:
- src/main/frontend/worker_common/util.cljc
- src/main/frontend/worker/graph_dir.cljs
- src/main/frontend/worker/db_worker_node_lock.cljs
- src/main/logseq/cli/server.cljs
- src/main/logseq/cli/common.cljs
- deps/cli/src/logseq/cli/common/graph.cljs
- deps/cli/src/logseq/cli/util.cljs
- src/electron/electron/utils.cljs
- src/electron/electron/db.cljs
- src/electron/electron/handler.cljs
- src/main/frontend/config.cljs
- related tests under src/test/
Acceptance criteria
This plan would be complete when:
- one shared graph-dir encoding contract is identified as authoritative
- all affected subsystems and files are enumerated
- compatibility strategy for legacy graph directories is documented
- a concrete test matrix for special-character graph names is defined
- the plan preserves current user-facing graph-name semantics while aligning on-disk graph-dir behavior