codex

mirror of https://github.com/openai/codex.git synced 2026-05-25 21:45:22 +00:00

Author	SHA1	Message	Date
Ruslan Nigmatullin	4d201e340e	state: pass state db handles through consumers (#20561 ) ## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container.	2026-05-04 11:46:03 -07:00
Eric Traut	fa0e2ba87c	Avoid false shell snapshot cleanup warnings (#18441 ) ## Why Fresh app-server thread startup can create a shell snapshot through a temp file and then promote it to the final snapshot path. The previous implementation briefly wrapped the temp path in `ShellSnapshot`, so after a successful rename its `Drop` attempted to delete the old temp path and could log a false `ENOENT` warning. Fixes #17549. ## What changed - Validate the temp snapshot path directly before promotion. - Rename the temp path directly to the final snapshot path. - Keep explicit cleanup of the temp path on validation or finalization failures.	2026-04-20 15:15:05 +01:00
pakrym-oai	dd1321d11b	Spread AbsolutePathBuf (#17792 ) Mechanical change to promote absolute paths through code.	2026-04-14 14:26:10 -07:00
jif-oai	d484bb57d9	feat: add suffix to shell snapshot name (#14938 ) https://github.com/openai/codex/issues/14906	2026-03-17 17:59:27 +00:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. ## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
Michael Bolin	0c8a36676a	fix: move inline codex-rs/core unit tests into sibling files (#14444 ) ## Why PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This applies the same extraction pattern across the rest of `codex-rs/core` so the production modules stay focused on runtime code instead of large inline test blocks. Keeping the tests in sibling files also makes follow-up edits easier to review because product changes no longer have to share a file with hundreds or thousands of lines of test scaffolding. ## What changed - replaced each inline `mod tests { ... }` in `codex-rs/core/src/*` with a path-based module declaration - moved each extracted unit test module into a sibling `_tests.rs` file, using `mod_tests.rs` for `mod.rs` modules - preserved the existing `cfg(...)` guards and module-local structure so the refactor remains structural rather than behavioral ## Testing - `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`) - `just fix -p codex-core` - `cargo fmt --check` - `cargo shear`	2026-03-12 08:16:36 -07:00
Ahmed Ibrahim	4a0e6dc916	Serialize shell snapshot stdin test (#13878 ) ## What changed - `snapshot_shell_does_not_inherit_stdin` now runs under its own serial key. - The change isolates it from other Unix shell-snapshot tests that also interact with stdin. ## Why this fixes the flake - The failure was not a shell-snapshot logic bug. It was shared-stdin interference between concurrently executing tests. - When multiple tests compete for inherited stdin at the same time, one test can observe EOF or consumed input that actually belongs to a different test. - Running this specific test in a dedicated serial bucket guarantees exclusive ownership of stdin, which makes the assertion deterministic without weakening coverage. ## Scope - Test-only change.	2026-03-09 10:44:13 -07:00
Owen Lin	289ed549cf	chore(otel): rename OtelManager to SessionTelemetry (#13808 ) ## Summary This is a purely mechanical refactor of `OtelManager` -> `SessionTelemetry` to better convey what the struct is doing. No behavior change. ## Why `OtelManager` ended up sounding much broader than what this type actually does. It doesn't manage OTEL globally; it's the session-scoped telemetry surface for emitting log/trace events and recording metrics with consistent session metadata (`app_version`, `model`, `slug`, `originator`, etc.). `SessionTelemetry` is a more accurate name, and updating the call sites makes that boundary a lot easier to follow. ## Validation - `just fmt` - `cargo test -p codex-otel` - `cargo test -p codex-core`	2026-03-06 16:23:30 -08:00
Dylan Hurd	e10df4ba10	fix(core) shell_snapshot multiline exports (#12642 ) ## Summary Codex discovered this one - shell_snapshot tests were breaking on my machine because I had a multiline env var. We should handle these! ## Testing - [x] existing tests pass - [x] Updated unit tests	2026-03-02 12:08:17 -07:00
daveaitel-openai	dcab40123f	Agent jobs (spawn_agents_on_csv) + progress UI (#10935 ) ## Summary - Add agent job support: spawn a batch of sub-agents from CSV, auto-run, auto-export, and store results in SQLite. - Simplify workflow: remove run/resume/get-status/export tools; spawn is deterministic and completes in one call. - Improve exec UX: stable, single-line progress bar with ETA; suppress sub-agent chatter in exec. ## Why Enables map-reduce style workflows over arbitrarily large repos using the existing Codex orchestrator. This addresses review feedback about overly complex job controls and non-deterministic monitoring. ## Demo (progress bar) ``` ./codex-rs/target/debug/codex exec \ --enable collab \ --enable sqlite \ --full-auto \ --progress-cursor \ -c agents.max_threads=16 \ -C /Users/daveaitel/code/codex \ - <<'PROMPT' Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows: path = item-01..item-30, area = test. Then call spawn_agents_on_csv with: - csv_path: /tmp/agent_job_progress_demo.csv - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1." - output_csv_path: /tmp/agent_job_progress_demo_out.csv PROMPT ``` ## Review feedback addressed - Auto-start jobs on spawn; removed run/resume/status/export tools. - Auto-export on success. - More descriptive tool spec + clearer prompts. - Avoid deadlocks on spawn failure; pending/running handled safely. - Progress bar no longer scrolls; stable single-line redraw. ## Tests - `cd codex-rs && cargo test -p codex-exec` - `cd codex-rs && cargo build -p codex-cli`	2026-02-24 21:00:19 +00:00
jif-oai	743caea3a6	feat: add shell snapshot failure reason (#12233 )	2026-02-19 13:49:12 +00:00
jif-oai	ffd4bd345c	feat: tie shell snapshot to cwd (#11231 ) Fix for this: https://github.com/openai/codex/issues/11223 Basically we tie the shell snapshot to a `cwd` to handle `cwd`-based env setups	2026-02-09 22:14:39 +00:00
jif-oai	4971e96a98	nit: shell snapshot retention to 3 days (#10382 )	2026-02-02 12:52:45 +00:00
Skylar Graika	9008a0eff9	core: prevent shell_snapshot from inheriting stdin (#9735 ) Fixes #9559. When `shell_snapshot` runs, it may execute user startup files (e.g. `.bashrc`). If those files read from stdin (or if stdin is an interactive TTY under job control), the snapshot subprocess can block or receive `SIGTTIN` (as reported over SSH). This change explicitly sets `stdin` to `Stdio::null()` for the snapshot subprocess, so it can't read from the terminal. Regression test added that would hang/timeout without this change. Tests: `ulimit -n 4096 && cargo test -p codex-core`. cc @dongdongbh @etraut-openai --------- Co-authored-by: Skylar Graika <sgraika127@gmail.com>	2026-01-30 13:47:10 -08:00
jif-oai	89c5f3c4d4	feat: adding thread ID to logs + filter in the client (#10150 )	2026-01-29 16:53:30 +01:00
gt-oai	fdc69df454	Fix flakey shell snapshot test (#9919 ) Sometimes fails with: ``` failures: ---- shell_snapshot::tests::timed_out_snapshot_shell_is_terminated stdout ---- thread 'shell_snapshot::tests::timed_out_snapshot_shell_is_terminated' panicked at codex-rs/core/src/shell_snapshot.rs:588:9: expected timeout error, got Failed to execute sh Caused by: Text file busy (os error 26) failures: shell_snapshot::tests::timed_out_snapshot_shell_is_terminated test result: FAILED. 815 passed; 1 failed; 4 ignored; 0 measured; 0 filtered out; finished in 18.00s ```	2026-01-26 18:05:30 +00:00
jif-oai	afa08570f2	nit: exclude PWD for rc sourcing (#9753 )	2026-01-23 13:35:48 +01:00
jif-oai	3355adad1d	chore: defensive shell snapshot (#9609 ) This PR adds 2 defensive mechanisms for shell snapshotting: * Filter out invalid env variables (containing `-` for example) without dropping the whole snapshot * Validate the snapshot before considering it as valid by running a mock command with a shell snapshot	2026-01-21 18:41:58 +00:00
jif-oai	b75024c465	feat: async shell snapshot (#9600 )	2026-01-21 10:41:13 +00:00
jif-oai	dc1b62acbd	feat: detach non-tty childs (#9477 ) Thanks to the investigations made by * @frantic-openai https://github.com/openai/codex/pull/9403 * @kfiramar https://github.com/openai/codex/pull/9388	2026-01-19 11:35:34 +00:00
jif-oai	6fbb89e858	fix: shell snapshot clean-up (#9155 ) Clean all shell snapshot files corresponding to sessions that have not been updated in 7 days Those files should never leak. The only known cases were it can leak are during non graceful interrupt of the process (`kill -9, `panic`, OS crash, ...)	2026-01-14 09:05:46 +00:00
jif-oai	258fc4b401	feat: add sourcing of rc files to shell snapshot (#9150 )	2026-01-14 08:58:10 +00:00
jif-oai	29381ba5c2	feat: add shell snapshot for shell command (#7786 )	2025-12-11 13:46:43 +00:00
jif-oai	463249eff3	fix: flaky test 2 (#7818 )	2025-12-10 16:35:28 +00:00
jif-oai	7836aeddae	feat: shell snapshotting (#7641 )	2025-12-09 18:36:58 +00:00

25 Commits