mirror of
https://github.com/openai/codex.git
synced 2026-05-17 01:32:32 +00:00
d91201ef995935dd67a247845fc401ac4e43df55
10 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
8a5306ff88 |
app-server: use permission ids and runtime workspace roots (#22611)
## Why This PR builds on [#22610](https://github.com/openai/codex/pull/22610) and is the app-server side of the migration from mutable per-turn `SandboxPolicy` replacement toward selecting immutable permission profiles by id plus mutable runtime workspace roots. Once permission profiles can carry their own immutable `workspace_roots`, app-server no longer needs to mutate the selected `PermissionProfile` just to represent thread-specific filesystem context. The mutable part now lives on the thread as explicit `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until the sandbox is realized for a turn. ## What Changed - Replaced the v2 permission-selection wrapper surface with plain profile ids for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Removed the API surface for profile modifications (`PermissionProfileSelectionParams`, `PermissionProfileModificationParams`, `ActivePermissionProfileModification`). - Added experimental `runtimeWorkspaceRoots` fields to the thread lifecycle and turn-start APIs. - Threaded runtime workspace roots through core session/thread snapshots, turn overrides, app-server request handling, and command execution permission resolution. - Kept session permission state symbolic so later runtime root updates and cwd-only implicit-root retargeting rebind `:workspace_roots` correctly. - Updated the embedded clients just enough to send and restore the new thread state. - Refreshed the generated schema/TypeScript artifacts and the app-server README to match the new contract. ## Verification Targeted coverage for this layer lives in: - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs` - `codex-rs/app-server/tests/suite/v2/thread_start.rs` - `codex-rs/app-server/tests/suite/v2/thread_resume.rs` - `codex-rs/app-server/tests/suite/v2/turn_start.rs` - `codex-rs/core/src/session/tests.rs` The key regression checks exercise that: - `runtimeWorkspaceRoots` resolve against the effective cwd on thread start. - Profile-declared workspace roots are excluded from the runtime workspace roots returned by app-server. - A turn-level runtime workspace-root update persists onto the thread and is returned by `thread/resume`. - A named permission profile selected on one turn remains symbolic so a later runtime-root-only turn update changes the actual sandbox writes. - A cwd-only turn update retargets the implicit runtime cwd root while preserving additional runtime roots. - The protocol fixtures and generated client artifacts stay in sync with the string-based permission selection contract. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611). * #22612 * __->__ #22611 |
||
|
|
2b3b220605 |
revert: mark Feature::RemoteControl as removed (#22520)
reverts: https://github.com/openai/codex/pull/22386 |
||
|
|
889ee018e7 |
config: add strict config parsing (#20559)
## Why Codex intentionally ignores unknown `config.toml` fields by default so older and newer config files keep working across versions. That leniency also makes typo detection hard because misspelled or misplaced keys disappear silently. This change adds an opt-in strict config mode so users and tooling can fail fast on unrecognized config fields without changing the default permissive behavior. This feature is possible because `serde_ignored` exposes the exact signal Codex needs: it lets Codex run ordinary Serde deserialization while recording fields Serde would otherwise ignore. That avoids requiring `#[serde(deny_unknown_fields)]` across every config type and keeps strict validation opt-in around the existing config model. ## What Changed ### Added strict config validation - Added `serde_ignored`-based validation for `ConfigToml` in `codex-rs/config/src/strict_config.rs`. - Combined `serde_ignored` with `serde_path_to_error` so strict mode preserves typed config error paths while also collecting fields Serde would otherwise ignore. - Added strict-mode validation for unknown `[features]` keys, including keys that would otherwise be accepted by `FeaturesToml`'s flattened boolean map. - Kept typed config errors ahead of ignored-field reporting, so malformed known fields are reported before unknown-field diagnostics. - Added source-range diagnostics for top-level and nested unknown config fields, including non-file managed preference source names. ### Kept parsing single-pass per source - Reworked file and managed-config loading so strict validation reuses the already parsed `TomlValue` for that source. - For actual config files and managed config strings, the loader now reads once, parses once, and validates that same parsed value instead of deserializing multiple times. - Validated `-c` / `--config` override layers with the same base-directory context used for normal relative-path resolution, so unknown override keys are still reported when another override contains a relative path. ### Scoped `--strict-config` to config-heavy entry points - Added support for `--strict-config` on the main config-loading entry points where it is most useful: - `codex` - `codex resume` - `codex fork` - `codex exec` - `codex review` - `codex mcp-server` - `codex app-server` when running the server itself - the standalone `codex-app-server` binary - the standalone `codex-exec` binary - Commands outside that set now reject `--strict-config` early with targeted errors instead of accepting it everywhere through shared CLI plumbing. - `codex app-server` subcommands such as `proxy`, `daemon`, and `generate-*` are intentionally excluded from the first rollout. - When app-server strict mode sees invalid config, app-server exits with the config error instead of logging a warning and continuing with defaults. - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli` instead of extending shared `ReviewArgs`, so `--strict-config` stays on the outer config-loading command surface and does not become part of the reusable review payload used by `codex exec review`. ### Coverage - Added tests for top-level and nested unknown config fields, unknown `[features]` keys, typed-error precedence, source-location reporting, and non-file managed preference source names. - Added CLI coverage showing invalid `--enable`, invalid `--disable`, and unknown `-c` overrides still error when `--strict-config` is present, including compound-looking feature names such as `multi_agent_v2.subagent_usage_hint_text`. - Added integration coverage showing both `codex app-server --strict-config` and standalone `codex-app-server --strict-config` exit with an error for unknown config fields instead of starting with fallback defaults. - Added coverage showing unsupported command surfaces reject `--strict-config` with explicit errors. ## Example Usage Run Codex with strict config validation enabled: ```shell codex --strict-config ``` Strict config mode is also available on the supported config-heavy subcommands: ```shell codex --strict-config exec "explain this repository" codex review --strict-config --uncommitted codex mcp-server --strict-config codex app-server --strict-config --listen off codex-app-server --strict-config --listen off ``` For example, if `~/.codex/config.toml` contains a typo in a key name: ```toml model = "gpt-5" approval_polic = "on-request" ``` then `codex --strict-config` reports the misspelled key instead of silently ignoring it. The path is shortened to `~` here for readability: ```text $ codex --strict-config Error loading config.toml: ~/.codex/config.toml:2:1: unknown configuration field `approval_polic` | 2 | approval_polic = "on-request" | ^^^^^^^^^^^^^^ ``` Without `--strict-config`, Codex keeps the existing permissive behavior and ignores the unknown key. Strict config mode also validates ad-hoc `-c` / `--config` overrides: ```text $ codex --strict-config -c foo=bar Error: unknown configuration field `foo` in -c/--config override $ codex --strict-config -c features.foo=true Error: unknown configuration field `features.foo` in -c/--config override ``` Invalid feature toggles are rejected too, including values that look like nested config paths: ```text $ codex --strict-config --enable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --disable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text ``` Unsupported commands reject the flag explicitly: ```text $ codex --strict-config cloud list Error: `--strict-config` is not supported for `codex cloud` ``` ## Verification The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text` case, unknown `-c` overrides, app-server strict startup failure through `codex app-server`, and rejection for unsupported commands such as `codex cloud`, `codex mcp`, `codex remote-control`, and `codex app-server proxy`. The config and config-loader tests cover unknown top-level fields, unknown nested fields, unknown `[features]` keys, source-location reporting, non-file managed config sources, and `-c` validation for keys such as `features.foo`. The app-server test suite covers standalone `codex-app-server --strict-config` startup failure for an unknown config field. ## Documentation The Codex CLI docs on developers.openai.com/codex should mention `--strict-config` as an opt-in validation mode for supported config-heavy entry points once this ships. |
||
|
|
2237a13cf1 |
mark Feature::RemoteControl as removed (#22386)
## Why
`remote_control` can appear in `config.toml`, CLI feature overrides, and
the app-server config APIs. Before this PR, app-server startup treated
`config.features.enabled(Feature::RemoteControl)` as the signal to start
remote control ([base
code](
|
||
|
|
e64a8979b0 | device-key: clean up unused crate (#21487) | ||
|
|
a8488fec5e |
Revert state DB injection and agent graph store (#21481)
## Why Reverts #20689 to restore the previous optional state DB plumbing. The conflict resolution keeps the newer installation ID and session/thread identity changes that landed after #20689, while removing the mandatory state DB and agent graph store dependency from ThreadManager construction. ## What changed - Restored `Option<StateDbHandle>` through app-server, MCP server, prompt debug, and test entry points. - Removed the `codex-core` dependency on `codex-agent-graph-store` and reverted descendant lookup back to the existing state DB path when available. - Kept newer `installation_id` forwarding by passing it beside the optional DB handle. - Kept local thread-name updates working when the optional state DB handle is absent. ## Validation - `git diff --check` - `cargo test -p codex-thread-store` - `cargo test -p codex-state -p codex-rollout -p codex-app-server-protocol` - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p codex-app-server -p codex-app-server-client -p codex-mcp-server -p codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607 2026-01-19)` on `aarch64-apple-darwin`. |
||
|
|
8f3bb355f4 |
Move installation ID resolution out of core startup (#21182)
## Summary - resolve or inject the installation ID before core startup and pass it through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain `String` - keep child sessions on the parent installation ID instead of rediscovering it inside core - propagate installation ID startup failures in `mcp-server` instead of panicking ## Why Core was still touching the filesystem on the session startup path to discover `installation_id`. This moves that work to the outer host boundary so core no longer depends on `codex_home` reads during session construction. --------- Co-authored-by: Codex <noreply@openai.com> |
||
|
|
7e310bc7f3 |
Inject state DB, agent graph store (#20689)
## Why We want the agent graph store to be passed down the stack as a real dependency, the same way we already treat the thread store. This will let us inject the agent graph store as a real dependency and support implementations other than the local SQLite-backed one. Right now most code instantiates a state DB and an agent graph store just-in-time. Ideally, we would not depend on the state DB directly but only read through the higher-level interfaces. This change makes the dependency boundaries explicit and moves state DB initialization to process bootstrap instead of hiding it inside local store implementations. ## What changed - `ThreadManager` now requires a `StateDbHandle` and an `AgentGraphStore` at construction time instead of treating them as optional internals. - The local store constructors no longer lazily initialize SQLite. Callers now initialize the state DB once per process and use that shared handle to build: - `LocalThreadStore` - `LocalAgentGraphStore` - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the thread-manager sample) now initialize the state DB up front and inject the resulting handle down the stack. - `app-server` now consistently uses its process-scoped state DB handle instead of reopening SQLite or trying to recover it from loaded threads. - Device-key storage now reuses the shared state DB handle instead of maintaining its own lazy opener. - The thread archive / descendant traversal paths now use the injected `AgentGraphStore` instead of reaching through local thread-store-specific state. ## Verification - `cargo check -p codex-core -p codex-thread-store -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample --tests` - `cargo test -p codex-thread-store` - `cargo test -p codex-core thread_manager_accepts_separate_agent_graph_store_and_thread_store -- --nocapture` - `cargo test -p codex-app-server thread_archive_archives_spawned_descendants -- --nocapture` |
||
|
|
4d201e340e |
state: pass state db handles through consumers (#20561)
## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container. |
||
|
|
33b19bcfde |
[codex] Split app-server request processors (#20940)
## Why The app-server request path had grown around a large `CodexMessageProcessor` plus separate API wrapper/helper modules. That made the dependency graph hard to see and forced unrelated request families to share broad processor state. This PR makes the split mechanical and command-prefix oriented so request families own only the dependencies they use. ## What changed - Replaced `CodexMessageProcessor` with command-prefix request processors under `app-server/src/request_processors/`. - Removed the old config, device-key, external-agent-config, and fs API wrapper files by moving their API handling into processors. - Split apps, plugins, marketplace, catalog, account, MCP, command exec, fs, git, feedback, thread, turn, thread goals, and Windows sandbox handling into dedicated processors. - Kept shared lifecycle, summary conversion, token usage replay, and shared error mapping only where multiple processors use them; single-use helpers were inlined into their owning processor. - Removed the fallback processor path and moved processor tests to `_tests` files. ## Validation - `cargo test -p codex-app-server` - `cargo check -p codex-app-server` - `just fix -p codex-app-server` |