mirror of
https://github.com/logseq/logseq.git
synced 2026-05-19 02:12:41 +00:00
378 lines
13 KiB
Markdown
378 lines
13 KiB
Markdown
# Align graph dir encoding between logseq-cli and desktop app
|
|
|
|
## Summary
|
|
|
|
Align `logseq-cli`, `db-worker-node`, and desktop app handling of `graph dir` / `graph-name` so special characters are encoded and decoded with one shared, reversible contract.
|
|
|
|
The authoritative contract would be the existing `encode-graph-dir-name` / `decode-graph-dir-name` pair in `src/main/frontend/worker_common/util.cljc`, which is already used by `db-worker-node` and `logseq-cli` server-side graph directory resolution.
|
|
|
|
This plan keeps user-facing graph names unchanged and only aligns their on-disk directory representation.
|
|
|
|
## Background
|
|
|
|
Current code paths do not agree on how a graph name maps to a graph directory on disk:
|
|
|
|
- `db-worker-node` and `logseq-cli` server/runtime paths use a reversible graph-dir encoding.
|
|
- desktop app contains paths that join the raw graph name directly into a filesystem path.
|
|
- some Electron and CLI-adjacent helpers still use lossy `sanitize-db-name` behavior.
|
|
- shared graph discovery still contains legacy decoding logic for older naming conventions, but not the current reversible encoding.
|
|
|
|
This mismatch becomes visible when graph names contain special characters such as `/`, `:`, `%`, `~`, or spaces.
|
|
|
|
## Goals
|
|
|
|
- Use one shared graph-dir encoding/decoding contract across CLI and desktop app.
|
|
- Preserve current user-facing graph-name semantics.
|
|
- Keep `logseq_db_` prefix canonicalization separate from graph-dir encoding.
|
|
- Define compatibility behavior for legacy graph directory names.
|
|
- Add tests that cover special-character graph names across all affected entry points.
|
|
|
|
## Non-goals
|
|
|
|
- Redesign the user-visible graph naming model.
|
|
- Change the existing `logseq_db_` display normalization rules.
|
|
- Remove all legacy compatibility in one step without an explicit migration strategy.
|
|
|
|
## Current behavior
|
|
|
|
### Shared reversible encoding already exists
|
|
|
|
Authoritative implementation today:
|
|
|
|
- `src/main/frontend/worker_common/util.cljc`
|
|
- `encode-graph-dir-name`
|
|
- `decode-graph-dir-name`
|
|
|
|
Current behavior:
|
|
|
|
1. `encodeURIComponent` is applied.
|
|
2. literal `~` is rewritten to `%7E`.
|
|
3. `%` is rewritten to `~`.
|
|
4. decoding reverses `~ -> %` and then applies `decodeURIComponent`.
|
|
|
|
This gives a reversible filesystem-safe directory key without `/` or `\\` path separators.
|
|
|
|
### db-worker-node follows the shared contract
|
|
|
|
Relevant files:
|
|
|
|
- `src/main/frontend/worker/db_worker_node_lock.cljs`
|
|
- `src/main/frontend/worker/platform/node.cljs`
|
|
- `src/main/frontend/worker/db_worker_node.cljs`
|
|
- `src/main/frontend/worker/graph_dir.cljs`
|
|
|
|
Current behavior:
|
|
|
|
- repo identity strips one leading `logseq_db_` to produce a graph-dir key.
|
|
- graph-dir key is encoded with `encode-graph-dir-name`.
|
|
- list-graphs decodes on-disk directory names back to graph-dir keys.
|
|
- worker log paths and lock paths are stored under the encoded graph directory.
|
|
|
|
### CLI is partially aligned
|
|
|
|
Relevant files:
|
|
|
|
- `src/main/logseq/cli/server.cljs`
|
|
- `src/main/logseq/cli/command/core.cljs`
|
|
- `src/main/logseq/cli/command/graph.cljs`
|
|
- `src/main/logseq/cli/common.cljs`
|
|
- `deps/cli/src/logseq/cli/common/graph.cljs`
|
|
- `deps/cli/src/logseq/cli/util.cljs`
|
|
|
|
Current behavior:
|
|
|
|
- `cli.server` already uses the same canonical graph-dir path contract as `db-worker-node`.
|
|
- graph display/input normalization strips or restores one `logseq_db_` prefix as needed.
|
|
- `unlink-graph!` still derives directory names with `sanitize-db-name`, which is lossy.
|
|
- shared discovery in `deps/cli` still decodes only older directory naming patterns such as `++` and `+3A+`.
|
|
|
|
### Desktop app is not aligned
|
|
|
|
Relevant files:
|
|
|
|
- `src/electron/electron/utils.cljs`
|
|
- `src/electron/electron/db.cljs`
|
|
- `src/electron/electron/handler.cljs`
|
|
- `src/electron/electron/url.cljs`
|
|
- `src/main/frontend/config.cljs`
|
|
|
|
Current behavior:
|
|
|
|
- `electron.utils/get-graph-dir` joins the raw graph name into the graph path after db-prefix stripping.
|
|
- if the graph name contains `/`, the resulting path becomes nested directories.
|
|
- `electron.db` still uses `sanitize-db-name` in some db path creation logic.
|
|
- frontend local-dir helpers also treat graph name as a raw path segment.
|
|
|
|
## Problem statement
|
|
|
|
The same logical graph name can map to different on-disk paths depending on which subsystem touches it:
|
|
|
|
- reversible encoded path in `db-worker-node`
|
|
- raw path join in Electron/frontend
|
|
- lossy underscore replacement in sanitize-based helpers
|
|
- legacy decode-only behavior in shared graph discovery
|
|
|
|
As a result:
|
|
|
|
- a graph may be listable but not removable
|
|
- a graph may be resolvable in CLI but not in desktop app
|
|
- a graph name containing `/` may accidentally create path nesting in one flow but not another
|
|
- existing tests do not enforce cross-subsystem parity
|
|
|
|
## Proposed contract
|
|
|
|
### 1. Separate graph identity from graph directory representation
|
|
|
|
The plan would explicitly distinguish:
|
|
|
|
- **graph-name / repo**: user-facing identifier, subject to existing `logseq_db_` canonicalization rules
|
|
- **graph-dir key**: graph-name with exactly one leading db prefix stripped
|
|
- **encoded graph-dir**: on-disk directory name produced only by `encode-graph-dir-name`
|
|
|
|
This separation would make it clear that special-character handling belongs to the graph-dir layer, not the user-facing name layer.
|
|
|
|
### 2. Make the db-worker-node contract authoritative
|
|
|
|
The repository would standardize on:
|
|
|
|
- `repo -> graph-dir key`: strip one leading `logseq_db_`
|
|
- `graph-dir key -> encoded graph-dir`: `encode-graph-dir-name`
|
|
- `encoded graph-dir -> graph-dir key`: `decode-graph-dir-name`
|
|
|
|
Any code path that needs an on-disk db graph directory would route through this contract rather than reimplementing path logic.
|
|
|
|
### 3. Keep user-visible graph names unchanged
|
|
|
|
The plan would preserve current user-visible behavior:
|
|
|
|
- CLI graph names remain prefix-free for display and config storage where already intended.
|
|
- desktop app continues to display logical graph names, not encoded directory names.
|
|
- URL-level graph identification continues to resolve to logical graph names, not on-disk encoded names.
|
|
|
|
## Proposed code changes
|
|
|
|
### A. Consolidate path-authoritative helpers
|
|
|
|
Add or reuse one shared helper layer for:
|
|
|
|
- converting repo to graph-dir key
|
|
- converting graph-dir key to encoded graph directory
|
|
- converting repo directly to on-disk graph directory path
|
|
|
|
Target files likely involved:
|
|
|
|
- `src/main/frontend/worker/graph_dir.cljs`
|
|
- `src/main/frontend/worker/db_worker_node_lock.cljs`
|
|
- `src/electron/electron/utils.cljs`
|
|
- `src/main/frontend/config.cljs`
|
|
- `deps/cli/src/logseq/cli/util.cljs`
|
|
|
|
Expected outcome:
|
|
|
|
- no raw path join for logical graph names in path-authoritative code
|
|
- no duplicate graph-dir encoding implementations
|
|
|
|
### B. Align Electron graph-dir resolution
|
|
|
|
Replace raw graph path derivation in Electron with the shared encoded graph-dir contract.
|
|
|
|
Target files:
|
|
|
|
- `src/electron/electron/utils.cljs`
|
|
- `src/electron/electron/handler.cljs`
|
|
- `src/electron/electron/db.cljs`
|
|
|
|
Expected outcome:
|
|
|
|
- desktop app resolves the same on-disk graph dir as `db-worker-node`
|
|
- graph names containing `/`, `:`, `%`, `~`, or spaces behave predictably
|
|
- `sanitize-db-name` is no longer used for authoritative db graph-dir mapping
|
|
|
|
### C. Align CLI remove/unlink behavior
|
|
|
|
Update CLI removal/unlink flows to resolve graph directories via the same encoded contract used by list/start/lock behavior.
|
|
|
|
Target file:
|
|
|
|
- `src/main/logseq/cli/common.cljs`
|
|
|
|
Expected outcome:
|
|
|
|
- a graph that can be listed or switched to can also be removed through the same path mapping
|
|
|
|
### D. Align shared graph discovery
|
|
|
|
Update shared discovery helpers so current encoded graph dirs are decoded correctly, while preserving deliberate support for legacy names where needed.
|
|
|
|
Target file:
|
|
|
|
- `deps/cli/src/logseq/cli/common/graph.cljs`
|
|
|
|
Expected outcome:
|
|
|
|
- desktop/CLI discovery would recognize encoded graph dirs produced by current db-worker-node logic
|
|
- legacy decode branches would be explicitly documented as compatibility behavior
|
|
|
|
### E. Audit frontend local-dir helpers
|
|
|
|
Review helpers that expose graph-related directories to ensure they are either:
|
|
|
|
- display-only helpers, or
|
|
- path-authoritative helpers using the shared encoded contract
|
|
|
|
Target file:
|
|
|
|
- `src/main/frontend/config.cljs`
|
|
|
|
Expected outcome:
|
|
|
|
- no ambiguous helper remains that appears safe for filesystem use while still using raw graph names
|
|
|
|
## Compatibility and migration
|
|
|
|
This plan should explicitly decide how to handle already-existing graph directories created by older logic.
|
|
|
|
### Option 1: Read legacy names, write canonical encoded names
|
|
|
|
Behavior:
|
|
|
|
- discovery accepts legacy directory names and current encoded names
|
|
- all newly created or rewritten paths use the canonical encoded form
|
|
- optional one-time migration may rename legacy directories
|
|
|
|
Pros:
|
|
|
|
- safer rollout
|
|
- less risk of immediately losing access to existing graphs
|
|
|
|
Cons:
|
|
|
|
- mixed formats may coexist temporarily
|
|
|
|
### Option 2: Auto-migrate on access
|
|
|
|
Behavior:
|
|
|
|
- when a legacy graph directory is detected, code renames it to the canonical encoded path before continuing
|
|
|
|
Pros:
|
|
|
|
- converges quickly to one format
|
|
|
|
Cons:
|
|
|
|
- higher operational risk
|
|
- rename behavior must be designed carefully for active workers and lock files
|
|
|
|
### Option 3: Strict cutover
|
|
|
|
Behavior:
|
|
|
|
- only encoded graph dirs are supported after the change
|
|
|
|
Pros:
|
|
|
|
- simplest long-term contract
|
|
|
|
Cons:
|
|
|
|
- too risky without explicit migration tooling
|
|
|
|
### Recommended direction
|
|
|
|
Prefer **Option 1** for the first rollout:
|
|
|
|
- read compatibility for legacy directory names
|
|
- canonical writes to encoded graph dirs
|
|
- add explicit migration follow-up only after parity tests pass
|
|
|
|
## Test plan
|
|
|
|
### Unit tests
|
|
|
|
Extend or add tests for:
|
|
|
|
- `src/test/frontend/worker/worker_common_util_test.cljs`
|
|
- `src/test/frontend/worker/db_worker_node_lock_test.cljs`
|
|
- `src/test/logseq/cli/server_test.cljs`
|
|
- `src/test/logseq/cli/common/graph_test.cljs`
|
|
- Electron-specific tests if available for graph-dir resolution
|
|
|
|
### Special-character test matrix
|
|
|
|
All subsystems should use the same examples:
|
|
|
|
- `foo/bar`
|
|
- `a:b`
|
|
- `space name`
|
|
- `100% legit`
|
|
- `til~de`
|
|
- `mix/of:many %chars~here`
|
|
|
|
### Behavior to verify
|
|
|
|
1. encode/decode roundtrip is lossless
|
|
2. CLI list-graphs returns the same logical graph name that was encoded on disk
|
|
3. CLI switch/remove resolve the same graph directory
|
|
4. desktop app resolves the same graph directory as CLI/db-worker-node
|
|
5. graph names remain user-visible without encoded substitutions
|
|
6. legacy discovery behavior remains intentional and documented
|
|
|
|
### Missing coverage today
|
|
|
|
The repository currently appears to lack end-to-end parity tests for:
|
|
|
|
- CLI create/switch/remove with special-character graph names
|
|
- Electron graph-name -> graph-dir resolution with special characters
|
|
- desktop and CLI agreement on one on-disk graph directory for the same logical graph
|
|
|
|
## Rollout sequence
|
|
|
|
1. Make the shared graph-dir contract explicit in code and docs.
|
|
2. Update Electron path-authoritative helpers to use encoded graph dirs.
|
|
3. Update CLI unlink/remove behavior to use the same mapping.
|
|
4. Update shared graph discovery for encoded graph dirs and legacy compatibility.
|
|
5. Add parity tests across worker, CLI, and desktop-related helpers.
|
|
6. Evaluate whether legacy directory migration should be a separate follow-up.
|
|
|
|
## Risks
|
|
|
|
- Existing graphs may already exist under lossy or raw directory naming rules.
|
|
- Desktop-specific compatibility code may rely on current path layout assumptions.
|
|
- URL/deeplink flows may resolve graph identifiers separately from filesystem mapping and should not accidentally expose encoded names to users.
|
|
- Removing `sanitize-db-name` from authoritative paths may surface hidden assumptions in older db bootstrap code.
|
|
|
|
## Open questions
|
|
|
|
1. Should legacy raw/sanitized graph directories remain writable, or only readable?
|
|
2. Should migration happen automatically, manually, or in a later dedicated change?
|
|
3. Which helper should become the single exported entry point for graph-name -> on-disk graph-dir path resolution?
|
|
4. Should `docs/cli/logseq-cli.md` be updated in the same change to clarify that on-disk graph directories are encoded, not always literal graph names?
|
|
|
|
## Expected files to change in implementation
|
|
|
|
Likely implementation targets:
|
|
|
|
- `src/main/frontend/worker_common/util.cljc`
|
|
- `src/main/frontend/worker/graph_dir.cljs`
|
|
- `src/main/frontend/worker/db_worker_node_lock.cljs`
|
|
- `src/main/logseq/cli/server.cljs`
|
|
- `src/main/logseq/cli/common.cljs`
|
|
- `deps/cli/src/logseq/cli/common/graph.cljs`
|
|
- `deps/cli/src/logseq/cli/util.cljs`
|
|
- `src/electron/electron/utils.cljs`
|
|
- `src/electron/electron/db.cljs`
|
|
- `src/electron/electron/handler.cljs`
|
|
- `src/main/frontend/config.cljs`
|
|
- related tests under `src/test/`
|
|
|
|
## Acceptance criteria
|
|
|
|
This plan would be complete when:
|
|
|
|
- one shared graph-dir encoding contract is identified as authoritative
|
|
- all affected subsystems and files are enumerated
|
|
- compatibility strategy for legacy graph directories is documented
|
|
- a concrete test matrix for special-character graph names is defined
|
|
- the plan preserves current user-facing graph-name semantics while aligning on-disk graph-dir behavior
|