codex

mirror of https://github.com/openai/codex.git synced 2026-05-20 11:12:43 +00:00

Author	SHA1	Message	Date
Akshay Nathan	ea735b69da	Merge branch 'main' into codex/stateful-apply-patch-streaming	2026-05-04 15:18:11 -07:00
Ruslan Nigmatullin	4d201e340e	state: pass state db handles through consumers (#20561 ) ## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container.	2026-05-04 11:46:03 -07:00
Akshay Nathan	b459ba0f4d	Gate streaming apply_patch parser	2026-05-04 11:15:50 -07:00
starr-openai	905987c08f	Prepare selected environment plumbing (#20669 ) ## Why This is a prep PR in the multi-environment process-tool stack. It separates ownership/config cleanup from the behavior change that teaches process tools to route by selected environment, so the follow-up PR can focus on model-facing `environment_id` behavior. ## Stack 1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext` rendering for selected environments 2. https://github.com/openai/codex/pull/20669 - selected-environment ownership and tool config prep (this PR) 3. https://github.com/openai/codex/pull/20647 - process-tool `environment_id` routing ## What Changed - keep the resolved turn environment list wrapped in `ResolvedTurnEnvironments` through `TurnContext` instead of unwrapping it back to a raw `Vec` - add `TurnContext::resolve_path_against` so cwd-relative path resolution has one shared helper - replace the old tool config boolean with `ToolEnvironmentMode::{None, Single, Multiple}` ## Testing - Tests not run locally; this prep refactor is covered by GitHub CI for the stack. Co-authored-by: Codex <noreply@openai.com>	2026-05-04 17:55:49 +00:00
Matthew Zeng	1b900bee8a	Unify skip-review handling for approval_mode = "approve" (#20750 ) ## Summary - Treat `approval_mode = "approve"` as skip-review across all permission modes. - Remove the mode-specific split in the MCP auto-approval gate so approved tools bypass review consistently. - Expand regression coverage in the shared MCP helper and the core tool-call flow. ## Testing - `just fmt` - `cargo test -p codex-mcp` - `cargo test -p codex-core approve_mode_skips_arc_and_guardian_in_every_permission_mode` - `git diff --check` - Full `cargo test -p codex-core` was also attempted, but the suite hit an unrelated pre-existing stack overflow in an existing multi-agent test	2026-05-04 10:30:47 -07:00
jif-oai	e3451ce6be	core: share responses request builder with compact requests (#20989 ) ## Why `ModelClientSession` and `compact_conversation_history()` were still rebuilding the same `ResponsesApiRequest` fields separately. That duplication makes it easy for normal `/responses` turns and compact requests to drift when request-shape changes land later, which is exactly the kind of cache-affecting divergence we want to avoid. This follow-up keeps the scope small by extracting the shared request-construction logic into one helper and using it from both paths. ## What changed - move `ResponsesApiRequest` construction into a shared `ModelClient::build_responses_request(...)` helper in `core/src/client.rs` - update the normal `/responses` streaming path to call that helper instead of the old `ModelClientSession`-local implementation - update `compact_conversation_history()` to derive its compact payload from the same helper so `model`, `instructions`, `input`, `tools`, `parallel_tool_calls`, `reasoning`, and `text` stay aligned with normal request building - add a unit test covering the shared helper's prompt cache key, installation metadata, and `service_tier` behavior ## Verification - `cargo test -p codex-core build_responses_request_sets_shared_cache_and_metadata_fields` - `cargo test -p codex-core --test all remote_compact_v2_reuses_context_compaction_for_followups` ## Docs No docs update needed.	2026-05-04 17:18:38 +00:00
Eric Traut	3c2dcbef85	Keep paused goals paused on thread resume (#20790 ) ## Summary Early adopters of the `/goal` feature have provided feedback that they expect a goal they explicitly paused to remain paused when they resume a thread. Previously, resuming a thread would reactivate a paused goal. This PR keeps persisted goal status unchanged during thread resume. This honors the user feedback while also simplifying the core goal logic. Rather than have the core logic automatically resume a paused goal, that responsibility is transferred to the client. The TUI now detects a resumed thread with a paused goal and asks the user whether to `Resume goal` or `Leave paused`. The prompt appears only for quiet resume flows, so users who resume with an immediate prompt are not interrupted. <img width="544" height="111" alt="image" src="https://github.com/user-attachments/assets/0ac9de1c-6ee6-47ba-b223-c03c8eb4c192" />	2026-05-04 08:58:07 -07:00
jif-oai	d927f61208	feat: add remote compaction v2 Responses client path (#20773 ) ## Why This adds the `remote_compaction_v2` client path so remote compaction can run through the normal Responses stream and install a `context_compaction` item that trigger a compaction. The goal is to migrate some of the compaction logic on the client side We keeps the v2 transport behind a feature flag while letting follow-up requests reuse the compacted context instead of falling back to the legacy compaction item shape. ## What changed - add `ResponseItem::ContextCompaction` and refresh the generated app-server / schema / TypeScript fixtures that expose response items on the wire - add `core/src/compact_remote_v2.rs` to send compaction through the standard streamed Responses client, require exactly one `context_compaction` output item, and install that item into compacted history - route manual compact and auto-compaction through the v2 path when `remote_compaction_v2` is enabled, while keeping the existing remote compaction path as the fallback - preserve the new item type across history retention, follow-up request construction, telemetry, rollout persistence, and rollout-trace normalization - add targeted coverage for the feature flag, `context_compaction` serialization, rollout-trace normalization, and remote-compaction follow-up behavior ## Verification - added protocol tests for `context_compaction` serialization/deserialization in `protocol/src/models.rs` - added rollout-trace coverage for `context_compaction` normalization in `rollout-trace/src/reducer/conversation_tests.rs` - added remote compaction integration coverage for v2 follow-up reuse and mixed compaction output streams in `core/tests/suite/compact_remote.rs` --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-04 14:15:01 +02:00
jif-oai	f48b777717	feat: support template interpolation in multi-agent usage hints (#20973 ) ## Why `multi_agent_v2` usage hints sometimes need to reference resolved config values such as the effective thread limit. Those values only exist after config layering, defaulting, and feature materialization, so the raw TOML alone was not enough to render them. ## What changed - allow `features.multi_agent_v2.{usage_hint_text,root_agent_usage_hint_text,subagent_usage_hint_text}` to use `{{ ... }}` placeholders backed by the materialized effective config - fail config loading with a targeted error when a referenced placeholder does not exist or does not resolve to a scalar value - move resolved-config materialization into a shared helper so config interpolation and config-lock export/replay both serialize the same resolved feature, memory, and agent settings ## Example ``` [features.multi_agent_v2] enabled = true usage_hint_text = "lorem {{ features.multi_agent_v2.max_concurrent_threads_per_session }} ipsum" ``` gets rendered as ``` "description": String("... \lorem 4 ipsum"), ```	2026-05-04 11:50:01 +02:00
pakrym-oai	c8c30d9d75	[codex] Emit MCP tool calls as turn items (#20677 ) ## Why `McpToolCall` was still an app-server item synthesized from deprecated legacy begin/end events. Recent item migrations moved this ownership into core `TurnItem`s, so MCP tool calls now follow the same canonical lifecycle and leave legacy events as compatibility fanout. Keeping the core item close to the v2 `ThreadItem::McpToolCall` shape also avoids spreading MCP result semantics across app-server conversion code. Core now owns whether a completed call is `completed` or `failed`, and whether the payload is a tool result or an error. ## What changed - Added core `TurnItem::McpToolCall` with flattened `server`, `tool`, `arguments`, `status`, `result`, and `error` fields. - Updated MCP tool call emitters, including MCP resource tools, to emit `ItemStarted`/`ItemCompleted` around directly constructed core MCP items. - Updated app-server v2 conversion to project the core MCP item into `ThreadItem::McpToolCall` without deriving status or splitting `Result` locally. - Ignored live deprecated MCP legacy fanout in app-server v2 to avoid duplicate item notifications, while keeping thread history replay on the legacy event path. ## Verification - `cargo test -p codex-protocol` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core --lib mcp_tool_call` - `cargo check -p codex-app-server` - `cargo test -p codex-app-server mcp_tool_call_completion_notification_contains_truncated_large_result`	2026-05-03 22:50:13 -07:00
Matthew Zeng	f88701f5c8	[tool_suggest] More prompt polishes. (#20566 ) Tool suggest still misfires when model needs tool_search, updating the prompts to further disambiguate it: - [x] rename it from `tool_suggest` to `request_plugin_install` - [x] rephrase "suggestion" to "install" in the tool descriptions. - [x] disambiguate "the tool" vs "the plugin/connector". Tested with the Codex App and verified it still works.	2026-05-02 04:22:12 +00:00
starr-openai	2952beb009	Surface multi-environment choices in environment context (#20646 ) ## Why The model needs a way to see which environments are available during a multi-environment turn without changing the legacy single-environment prompt surface or pulling replay/persistence changes into the same review. ## Stack 1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext` rendering for selected environments (this PR) 2. https://github.com/openai/codex/pull/20669 - selected-environment ownership and tool config prep 3. https://github.com/openai/codex/pull/20647 - process-tool `environment_id` routing ## What Changed - extend `environment_context` so multi-environment turns render an `<environments>` block with the selected environment ids and cwd values - keep zero- and single-environment turns on the existing cwd-only render path - keep replay and persistence paths on the legacy surface for now so this PR stays scoped to live prompt rendering - add focused coverage in `codex-rs/core/src/context/environment_context_tests.rs` ## Testing - CI --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 22:11:06 +00:00
pakrym-oai	aed74e5ee4	[codex] Emit image view as core item (#20512 ) ## Why Image-view results should be represented as a core-produced turn item instead of being reconstructed by app-server. At the same time, existing rollout/history paths still understand the legacy `ViewImageToolCall` event, so this keeps that event as compatibility output generated from the new item lifecycle. ## What changed - Added `TurnItem::ImageView` to `codex-protocol`. - Emitted image-view item start/completion directly from the core `view_image` handler. - Kept `ViewImageToolCall` as a legacy event and generate it from completed `TurnItem::ImageView` items. - Kept `thread_history.rs` on the legacy `ViewImageToolCall` replay path, with `ImageView` item lifecycle events ignored there. - Updated app-server protocol conversion, rollout persistence, and affected exhaustive event matches for the new item plus legacy fan-out shape. ## Verification - `cargo test -p codex-protocol -p codex-app-server-protocol -p codex-rollout -p codex-rollout-trace -p codex-mcp-server -p codex-app-server --lib` - `cargo test -p codex-core --test all view_image_tool_attaches_local_image` - `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `git diff --check`	2026-05-01 11:28:30 -07:00
jif-oai	2817866a32	fix: reduce ConfigBuilder::build stack usage (#20650 ) ## Why `ConfigBuilder::build` performs a large amount of async config loading. Leaving that entire future on the caller stack makes config startup more fragile on small runtime worker stacks. ## What changed - keep `ConfigBuilder::build` as a thin wrapper that boxes the config-loading future before awaiting it - move the existing implementation into a private `build_inner` method so the large async state machine lives on the heap instead of the runtime thread stack ## Testing - Not run locally	2026-05-01 20:24:17 +02:00
starr-openai	be71b6fcd1	Use selected turn environments for runtime context (#20281 ) ## Summary - make selected turn environments the source of truth for session runtime cwd and MCP runtime environment selection - keep local/no-selection fallback behavior intact - add coverage for duplicate selected environments, cwd resolution, and MCP runtime environment selection ## Validation - git diff --check - rustfmt was run on touched Rust files during the implementation workflow CI should provide the full Bazel/test signal. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 11:00:14 -07:00
Abhinav	78baa20780	deprecate legacy notify (#20524 ) # Why `notify` is the remaining compatibility surface from the legacy hook implementation. The newer lifecycle hook engine now owns the active hook system, so we should start steering users away from adding new `notify` configs before removing the old path entirely. This also adds a lightweight watchpoint for the deprecation so we can see how much legacy usage remains before the clean drop. # What - emit a startup deprecation notice when a non-empty `notify` command is configured - emit `codex.notify.configured` when a session starts with legacy `notify` configured - emit `codex.notify.run` when the legacy notify path fires after a completed turn - mark `notify` as deprecated in the config schema and repo docs - remove the orphaned `codex-rs/hooks/src/user_notification.rs` file that is no longer compiled - add regression coverage for the new deprecation notice # Next steps A follow-up PR can remove the legacy notify path entirely once we are ready for the clean drop. Before then, we can watch `codex.notify.configured` and `codex.notify.run` to understand the deprecation impact and remaining active usage. The cleanup PR should then delete the `notify` config field, the `legacy_notify` implementation, the old compatibility dispatch types and callsites that only exist for the legacy path, and the remaining compatibility docs/tests. # Testing - `cargo test -p codex-hooks` - `cargo test -p codex-config` - `cargo test -p codex-core emits_deprecation_notice_for_notify`	2026-05-01 17:35:21 +00:00
Eric Traut	3d1d164aee	Remove no-tool goal continuation suppression (#20523 ) ## Why `/goal` is supposed to keep Codex working until the goal is actually done. The previous continuation logic had two ways to stop early: the continuation prompt told the model to wait for new input when it felt blocked, and the runtime suppressed another continuation turn after a continuation finished without any tool calls. That made goals stop short even when the agent could still keep making progress (I received a few reports of this from users). It also relied on a brittle heuristic that treated "no registry tool calls" as equivalent to "should stop." ## What changed - removed the continuation prompt sentence that told the model to stop and wait for new input when it could not continue productively - removed the goal runtime suppression heuristic that stopped auto-continuation after a no-tool continuation turn - deleted the continuation-activity bookkeeping and left `tool_calls` as telemetry only - added focused regressions for the two intended behaviors: completed no-tool continuation turns still continue, while `request_user_input` keeps the existing turn open instead of spawning a new continuation	2026-05-01 09:09:55 -07:00
pakrym-oai	f476338f93	Move apply-patch file changes into turn items (#20540 ) ## Why Apply-patch file changes are now part of the core turn item stream, so v2 clients can consume the same first-class item lifecycle path used by other turn items instead of relying on app-server-specific remapping from legacy patch events. ## What changed - Added a core `TurnItem::FileChange` carrying apply-patch changes and completion metadata. - Updated the apply-patch tool emitter to send `ItemStarted` / `ItemCompleted` with the new `FileChange` item while preserving legacy `PatchApplyBegin` / `PatchApplyEnd` fan-out. - Updated app-server v2 conversion to render the new core item directly and stopped `event_mapping` from remapping old patch begin/end events into item notifications. - Kept thread history reconstruction based on the existing old apply-patch events for rollout compatibility. ## Verification - `cargo test -p codex-protocol -p codex-app-server-protocol` - `cargo test -p codex-core --test all apply_patch_tool_executes_and_emits_patch_events` - `cargo test -p codex-app-server bespoke_event_handling`	2026-05-01 08:47:18 -07:00
jif-oai	0b04d1b3cc	feat: export and replay effective config locks (#20405 ) ## Why For reproducibility. A hand-written `config.toml` is not enough to recreate what a Codex session actually ran with because layered config, CLI overrides, defaults, feature aliases, resolved feature config, prompt setup, and model-catalog/session values can all affect the final runtime behavior. This PR adds an effective config lockfile path: one run can export the resolved session config, and a later run can replay that lockfile and fail early if the regenerated effective config drifts. ## What Changed - Add a dedicated `ConfigLockfileToml` wrapper with top-level lockfile metadata plus the replayable config: ```toml version = 1 codex_version = "..." [config] # effective ConfigToml fields ``` - Keep lockfile metadata out of regular `ConfigToml`; replay loads `ConfigLockfileToml` and then uses its nested `config` as the authoritative config layer. - Add `debug.config_lockfile.export_dir` to write `<thread_id>.config.lock.toml` when a root session starts. - Add `debug.config_lockfile.load_path` to replay a saved lockfile and validate the regenerated session lockfile against it. - Add `debug.config_lockfile.allow_codex_version_mismatch` to optionally tolerate Codex binary version drift while still comparing the rest of the lockfile. - Add `debug.config_lockfile.save_fields_resolved_from_model_catalog` so lock creation can either save model-catalog/session-resolved fields or intentionally leave those fields dynamic. - Build lockfiles from the effective config plus resolved runtime values such as model selection, reasoning settings, prompts, service tier, web search mode, feature states/config, memories config, skill instructions, and agent limits. - Materialize feature aliases and custom feature config into the lockfile so replay compares canonical resolved behavior instead of user-authored alias shape. - Strip profile/debug/file-include/environment-specific inputs from generated lockfiles so they contain replayable values rather than the inputs that produced those values. - Surface JSON-RPC server error code/data in app-server client and TUI bootstrap errors so config-lock replay failures include the actual TOML diff. - Regenerate the config schema for the new debug config keys. ## Review Notes The main flow is split across these files: - `config/src/config_toml.rs`: lockfile/debug TOML shapes. - `core/src/config/mod.rs`: loading `debug.config_lockfile.*`, replaying a lockfile as a config layer, and preserving the expected lockfile for validation. - `core/src/session/config_lock.rs`: exporting the current session lockfile and materializing resolved session/config values. - `core/src/config_lock.rs`: lockfile parsing, metadata/version checks, replay comparison, and diff formatting. ## Usage Export a lockfile from a normal session: ```sh codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"' ``` Export a lockfile without saving model-catalog/session-resolved fields: ```sh codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"' \ -c 'debug.config_lockfile.save_fields_resolved_from_model_catalog=false' ``` Replay a saved lockfile in a later session: ```sh codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"' ``` If replay resolves to a different effective config, startup fails with a TOML diff. To tolerate Codex binary version drift during replay: ```sh codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"' \ -c 'debug.config_lockfile.allow_codex_version_mismatch=true' ``` ## Limitations This does not support custom rules/network policies. ## Verification - `cargo test -p codex-core config_lock` - `cargo test -p codex-config` - `cargo test -p codex-thread-manager-sample`	2026-05-01 17:46:02 +02:00
Eric Traut	a93c89f497	Color TUI statusline from active theme (#19631 ) ## Why Users have shared that the TUI can feel too visually flat because themes mostly show up in code syntax highlighting. The configurable statusline is a natural place to make the active theme more visible, while still letting users keep the existing monotone statusline if they prefer it. ## What Changed - Added a statusline styling helper that builds the rendered statusline from `(StatusLineItem, text)` segments, preserving item identity while keeping the plain text output unchanged. - Derived foreground accent colors from the active syntax theme by looking up TextMate scopes through the existing syntax highlighter, with conservative ANSI fallbacks when a scope does not provide a foreground. - Tuned theme-derived colors to keep the accents visible without making the statusline feel overly bright. - Added `[tui].status_line_use_colors`, defaulting to `true`, plus a separated `/statusline` toggle so users can enable or disable theme-derived statusline colors from the setup UI. - Updated the live statusline and `/statusline` preview to use the same styled builder, while keeping terminal-title preview text plain. - Kept statusline separators and active-agent add-ons subdued while removing blanket dimming from the whole passive statusline. ## Verification - `cargo test -p codex-tui status_line` - `cargo test -p codex-tui theme_picker` - `cargo test -p codex-tui foreground_style_for_scopes` - `cargo test -p codex-tui` - `cargo test -p codex-config` - `cargo test -p codex-core status_line_use_colors` - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml` ## Visual <img width="369" height="23" alt="Screenshot 2026-04-30 at 6 16 08 PM" src="https://github.com/user-attachments/assets/11d03efb-8e4f-4450-8f4d-00a9659ef4cd" /> <img width="385" height="23" alt="Screenshot 2026-04-30 at 6 16 02 PM" src="https://github.com/user-attachments/assets/a3d89f36-bdc1-42e8-8e84-61350e3999e2" />	2026-04-30 22:42:48 -07:00
Tom	fe05acad23	Make thread store process-scoped (#19474 ) - Build one app-server process ThreadStore from startup config and share it with ThreadManager and CodexMessageProcessor. - Remove per-thread/fork store reconstruction so effective thread config cannot switch the persistence backend. - Add params to ThreadStore create/resume for specifying thread metadata, since otherwise the metadata from store creation would be used (incorrectly).	2026-04-30 21:24:59 -07:00
pakrym-oai	f50c02d7bc	[codex] Remove unused event messages (#20511 ) ## Why Several legacy `EventMsg` variants were still emitted or mapped even though clients either ignored them or had moved to item/lifecycle events. `Op::Undo` had also degraded to an unavailable shim, so this removes that dead task path instead of preserving a command that cannot do useful work. `McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are intentionally kept because useful consumers still depend on them: MCP startup completion drives readiness behavior, and the begin events let app-server/core consumers surface in-progress web-search and image-generation items before the final payload arrives. ## What Changed - Removed weak legacy event variants and payloads from `codex-protocol`, including legacy agent deltas, background events, and undo lifecycle events. - Kept/restored `EventMsg::McpStartupComplete`, `EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with serializer and emission coverage. - Updated core, rollout, MCP server, app-server thread history, review/delegate filtering, and tests to rely on the useful replacement events that remain. - Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI slash-command comments. - Stopped agent job/background progress and compaction retry notices from emitting `BackgroundEvent` payloads. ## Verification - `cargo check -p codex-protocol -p codex-app-server-protocol -p codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-protocol -p codex-app-server-protocol -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-core --test all suite::items` - `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - Earlier coverage on this PR also included `codex-mcp`, `codex-tui`, core library tests, MCP/plugin/delegate/review/agent job tests, and MCP startup TUI tests.	2026-04-30 20:03:26 -07:00
Dylan Hurd	af089fb21d	fix(exec_policy) heredoc parsing file_redirect (#20113 ) ## Summary Fixes a regression introduced in #10941 so that heredocs do not permit file redirects to be approved by rules, and adds scenario tests to cover this behavior. Previously, heredoc command parsing would allow redirects and environment variables: ```bash # commands_for_exec_policy() would parse this via parse_shell_lc_single_command_prefix PATH=/tmp/bad:$PATH cat <<'EOF' > /tmp/bad/hello.txt hello EOF ``` This conflicts with the Codex Rules documentation; heredoc parsing logic should abide by the same strictness of parsing. ## Tests - [x] Updated unit tests accordingly - [x] Added scenario tests for these cases --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 01:05:02 +00:00
iceweasel-oai	4f96001fa7	execpolicy: unwrap PowerShell -Command wrappers on Windows (#20336 ) ## Why On Windows, Codex runs shell commands through a top-level `powershell.exe -NoProfile -Command ...` wrapper. `execpolicy` was matching that wrapper instead of the inner command, so prefix rules like `["git", "push"]` did not fire for PowerShell-wrapped commands even though the same normalization already happens for `bash -lc` on Unix. This change makes the Windows shell wrapper transparent to rule matching while preserving the existing Windows unmatched-command safelist and dangerous-command heuristics. ## What changed - add `parse_powershell_command_plain_commands()` in `shell-command/src/powershell.rs` to unwrap the top-level PowerShell `-Command` body with `extract_powershell_command()` and parse it with the existing PowerShell AST parser - update `core/src/exec_policy.rs` so `commands_for_exec_policy()` treats top-level PowerShell wrappers like `bash -lc` and evaluates rules against the parsed inner commands - carry a small `ExecPolicyCommandOrigin` through unmatched-command evaluation and expose `is_safe_powershell_words()` / `is_dangerous_powershell_words()` so Windows safelist and dangerous-command checks still work after unwrap - add Windows-focused tests for wrapped PowerShell prompt/allow matches, wrapper parsing, and unmatched safe/dangerous inner commands, and re-enable the end-to-end `execpolicy_blocks_shell_invocation` test on Windows ## Testing - `cargo test -p codex-shell-command`	2026-05-01 00:56:20 +00:00
Felipe Coury	b6f81257f8	feat(tui): add vim composer mode (#18595 ) ## Why Codex now has configurable TUI keymaps, but the composer still behaves like a plain text field. Users who prefer modal editing need a way to keep Vim muscle memory while drafting prompts, and the keymap picker needs to expose Vim-specific actions if those bindings are configurable instead of hardcoded. ## What Changed - Adds composer Vim mode with insert/normal state, common normal-mode movement and editing commands, `d`/`y` operator-pending flows, and mode-aware footer and cursor indicators. - Adds `/vim`, an optional global `toggle_vim_mode` binding, and `tui.vim_mode_default` so Vim mode can be toggled per session or enabled as the default composer state. - Extends runtime and config keymaps with `vim_normal` and `vim_operator` contexts, exposes those contexts in `/keymap`, refreshes the config schema, and validates Vim bindings separately. - Integrates Vim normal mode with existing composer behavior: `/` opens slash command entry, `!` enters shell mode, `j`/`k` navigate history at history boundaries, successful submissions reset back to normal mode, and paste burst handling remains insert-mode only. - Teaches the TUI render path to apply and restore cursor style so Vim insert mode can use a bar cursor without leaving the terminal in that state after exit. ## Validation - `cargo test -p codex-tui keymap -- --nocapture` on the keymap/Vim coverage - `cargo insta pending-snapshots` ## Docs This introduces user-facing `/vim`, `tui.vim_mode_default`, and Vim keymap contexts under `tui.keymap`, so the public CLI configuration and slash-command docs should be updated before the feature ships.	2026-04-30 17:20:51 -07:00
maja-openai	a5ebedef67	Bypass review for always-allow MCP tools in auto-review (#20069 ) ## Why When an MCP or app tool is configured with approval mode `approve` (always allow), users expect that decision to be authoritative. In guardian auto-review mode, ARC could still return `ask-user`, which then routed the approval question into guardian with the ARC reason as context. That meant a tool explicitly configured as always allowed still went through both safety monitors before running. This change keeps the existing ARC behavior for non-auto-review sessions, but avoids the ARC-to-guardian sequence when `approvals_reviewer = auto_review` and the tool approval mode is `approve`. ## What changed - Short-circuit MCP tool approval handling when `approval_mode == approve` and `approvals_reviewer == auto_review`. - Updated the MCP approval regression test so the auto-review case asserts neither ARC nor guardian is called. - Preserved existing tests that verify ARC can still block always-allow MCP tools outside guardian auto-review mode. ## Verification - `cargo test -p codex-core --lib mcp_tool_call`	2026-04-30 16:44:09 -07:00
Owen Lin	9ddb267e9c	fix: ignore dangerous project-level config keys (#20098 ) ## Description Ignore these top-level config keys when loading project-scoped config.toml files: ``` "openai_base_url", "chatgpt_base_url", "model_provider", "model_providers", "profile", "profiles", "experimental_realtime_ws_base_url", ``` ## What changed - Add a project-local config denylist for credential-routing fields such as `openai_base_url`, `chatgpt_base_url`, `model_provider`, `model_providers`, `profile`, `profiles`, and `experimental_realtime_ws_base_url`. - Strip those fields from project config layers before they participate in effective config merging, while leaving safe project-local settings intact. - Track ignored project-local keys on config layers and surface a startup warning telling users to move those settings to user-level `config.toml` if they intentionally need them. - Update profile behavior coverage so project-local `profile` / `profiles` entries are ignored instead of overriding user-level profile selection. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core project_layer_ignores_unsupported_config_keys` - `cargo test -p codex-core project_profiles_are_ignored` - `cargo test -p codex-core config::config_loader_tests`	2026-04-30 23:03:01 +00:00
Akshay Nathan	8426edf71e	Stateful streaming apply_patch parser	2026-04-30 21:41:15 +00:00
xl-openai	7b3de63041	Move plugin out of core. (#20348 )	2026-04-30 14:26:14 -07:00
Tom	127be0612c	[codex] Migrate thread turns list to thread store (#19280 ) - migrate `thread/turns/list` to ThreadStore. Uses ThreadStore for most data now but merges in the in-memory state from thread manager - keep v2 `thread/list` pathless-store friendly by converting `StoredThread` directly to API `Thread` - add regression coverage for pathless store history/listing	2026-04-30 14:16:42 -07:00
pakrym-oai	b52083146c	Stop emitting item/fileChange/outputDelta output delta notifications (#20471 ) ## Why `item/fileChange/outputDelta` text output was only the tool's summary or error text and not used by client surfaces. We keep `item/fileChange/outputDelta` in the app-server protocol as a deprecated compatibility entry, but the server no longer emits it. ## What changed - stop the `apply_patch` runtime from emitting `ExecCommandOutputDelta` events - simplify `item_event_to_server_notification` so command output deltas always map to `item/commandExecution/outputDelta` - remove the app-server bookkeeping that tried to detect whether an output delta belonged to a file change - mark `item/fileChange/outputDelta` as a deprecated legacy protocol entry in the v2 types, schema, and README - simplify the file-change approval tests so they only wait for completion instead of expecting output-delta notifications ## Testing - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-thread-manager-sample` - `cargo test -p codex-app-server-protocol protocol::event_mapping::tests::exec_command_output_delta_maps_to_command_execution_output_delta -- --exact` - `cargo test -p codex-app-server turn_start_file_change_approval_accept_for_session_persists_v2 -- --exact` (failed before the test assertions because the wiremock `/responses` mock received 0 requests in setup)	2026-04-30 11:42:07 -07:00
Owen Lin	3516cb9751	fix(core): truncate large mcp tool outputs in rollouts (#20260 ) ## Why Large MCP tool call outputs can make rollout JSONL files enormous. In the session that motivated this change, the biggest JSONL records were: - `event_msg/mcp_tool_call_end` - `response_item/function_call_output` both containing the same unbounded MCP payloads - just 3 MCP tool calls that each were multi-hundred MBs 😱 This PR truncates both of those JSONL records. ## How #### For `response_item/function_call_output` Unified exec already bounds tool output before it is injected into model-facing history, which also keeps the corresponding rollout `response_item/function_call_output` records small. MCP should follow the same pattern: truncate the model-facing tool output at the tool-output boundary, while leaving code-mode/raw hook consumers alone. #### For `event_msg/mcp_tool_call_end` `McpToolCallEnd` also needs its own bounded event copy because it is the app-server/replay/UI event shape that backs `ThreadItem::McpToolCall`. Unfortunately this is _not_ downstream of the `ToolOutput` trait. ## Model behavior Model behavior is actually unchanged as a result of this PR. Before this PR, MCP output was: 1. Converted to `FunctionCallOutput`. 2. Recorded into in-memory history. 3. Truncated by `ContextManager::record_items()` before later model turns saw it. After this branch, MCP output is truncated earlier, in `McpToolOutput::response_payload()`, using the same helper. Then `ContextManager::record_items()` sees an already-truncated output and effectively has little/no additional work to do. So the model should still see the same kind of truncated function-call output. The practical difference is where truncation happens: earlier, before rollout persistence/app-server emission can see the giant payload. ## Verification - `cargo test -p codex-core mcp_tool_output` - `cargo test -p codex-core mcp_tool_call::tests::truncate_mcp_tool_result_for_event` - `cargo test -p codex-core mcp_post_tool_use_payload_uses_model_tool_name_args_and_result` - `just fmt` - `just fix -p codex-core` - `git diff --check`	2026-04-30 16:30:43 +00:00
Ahmed Ibrahim	8a97f3cf03	realtime: rename provider session ids (#20361 ) ## Summary Codex is repurposing `session` to mean a thread group, so the realtime provider session id should no longer use `session_id` / `sessionId` in Codex-facing protocol payloads. This PR renames that provider-specific field to `realtime_session_id` / `realtimeSessionId` and intentionally breaks clients that still send the old field names. ## What Changed - Renamed realtime provider session fields in `ConversationStartParams`, `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`. - Renamed app-server v2 realtime request and notification fields to `realtimeSessionId`. - Removed legacy serde aliases for `session_id` / `sessionId`; clients must send the new names. - Propagated the rename through core realtime startup, app-server adapters, codex-api websocket handling, and TUI realtime state. - Regenerated app-server protocol schema/TypeScript outputs and updated app-server README examples. - Kept upstream Realtime API concepts unchanged: provider `session.id` parsing and `x-session-id` headers still use the upstream wire names. ## Testing - CI is running on the latest pushed commit. - Earlier local verification on this PR: - `cargo test -p codex-protocol` - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core realtime_conversation` - `cargo test -p codex-app-server-protocol` - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server realtime_conversation` - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local linker bus error while linking the test binary) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-30 13:39:48 +03:00
jif-oai	c37f7434ba	Gate multi-agent v2 tools independently of collab (#20246 ) ## Why `multi_agents_v2` is meant to be independently gated from the older `collab` feature. The tool registry still treated the collaboration-style agent tools as `collab`-only, so enabling `multi_agents_v2` without `collab` omitted the v2 agent tools. Review and guardian sub-sessions also need to keep agent spawning disabled even when the outer session has `multi_agents_v2` enabled. ## What changed - Include the collab-backed agent tools when either `multi_agents_v2` or `collab` is enabled. - Explicitly disable `multi_agents_v2` for review and guardian review sub-sessions, matching the existing `spawn_csv` and `collab` restrictions. - Add a registry test that enables `multi_agents_v2`, disables `collab`, and verifies the v2 agent tools are present while legacy `send_input` and `resume_agent` remain hidden. ## Testing - Added `test_build_specs_multi_agent_v2_does_not_require_collab_feature`.	2026-04-30 10:23:31 +02:00
Abhinav	8f3c06cc97	Add persisted hook enablement state (#19840 ) ## Why After `hooks/list` exposes the hook inventory, clients need a way to persist user hook preferences, make those changes effective in already-open sessions, and distinguish user-controllable hooks from managed requirements without adding another bespoke app-server write API. ## What - Extends `hooks/list` entries with effective `enabled` state. - Persists user-level hook state under `hooks.state.<hook-id>` so the model can grow beyond a single boolean over time. - Uses the existing `config/batchWrite` path for hook state updates instead of introducing a dedicated hook write RPC. - Refreshes live session hook engines after config writes so already-open threads observe updated enablement without a restart. ## Stack 1. openai/codex#19705 2. openai/codex#19778 3. This PR - openai/codex#19840 4. openai/codex#19882 ## Reviewer Notes The generated schema files account for much of the raw diff. The core behavior is in: - `hooks/src/config_rules.rs`, which resolves per-hook user state from the config layer stack. - `hooks/src/engine/discovery.rs`, which projects effective enablement into `hooks/list` from source-derived managedness. - `config/src/hook_config.rs`, which defines the new `hooks.state` representation. - `core/src/session/mod.rs`, which rebuilds live hook state after user config reloads. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-30 04:46:32 +00:00
Michael Bolin	ac4332c05b	permissions: expose active profile metadata (#20095 )	2026-04-29 20:54:59 -07:00
Matthew Zeng	ebe602d005	[plugins] Allow MSFT curated plugins in tool_suggest (#20304 ) ## Summary - [x] Move the allowlist out of core crate - [x] Add Teams, SharePoint, Outlook Email, and Outlook Calendar to the tool_suggest discoverable plugin allowlist - [x] Add focused coverage for Microsoft curated plugin discovery ## Testing - just fmt - cargo test -p codex-core-plugins - cargo test -p codex-core list_tool_suggest_discoverable_plugins_returns_	2026-04-29 19:45:52 -07:00
rhan-oai	bb536d65bd	[codex-analytics] prevent stale guardian events from satisfying reused reviews (#20080 ) ## Why Reused Guardian review trunks can still have older child-turn events queued when a later review starts. The review waiter currently accepts the first terminal event it sees from the shared child session, so a stale `TurnComplete` can be attributed to the new review. That produces impossible analytics combinations such as non-null TTFT with sub-10 ms completion latency and zero token deltas on `trunk_reused` reviews. ## What changed - Preserve the child turn id returned by the Guardian review `Op::UserTurn` submission. - Restrict Guardian review waiting to events correlated with that submitted child turn. - Restrict timeout/abort draining to terminal events for the same child turn. - Add regression coverage for stale prior-turn completions, stale prior-turn errors, and interrupt draining in `codex-rs/core/src/guardian/review_session.rs`. ## Verification - `cargo test -p codex-core guardian::review_session::tests::` - `cargo clippy -p codex-core --tests -- -D warnings`	2026-04-29 18:26:39 -07:00
pakrym-oai	fedcefe9da	Reduce the surface of collaboration modes (#20149 ) Collaboration modes were slightly invasive both into ThreadManager construction and ModelProvider	2026-04-29 17:22:41 -07:00
Abhinav	8774229a89	Add hooks/list app-server RPC (#19778 ) ## Why We need a way to list the available hooks to expose via the TUI and App so users can view and manage their hooks ## What - Adds `hooks/list` for one or more `cwd` values that returns discovered hook metadata ## Stack 1. openai/codex#19705 2. This PR - openai/codex#19778 3. openai/codex#19840 4. openai/codex#19882 ## Review Notes The generated schema files account for most of the raw diff, these files have the core change: - `hooks/src/engine/discovery.rs` builds the inventory entries during hook discovery while leaving runtime handlers focused on execution. - `app-server/src/codex_message_processor.rs` wires `hooks/list` into the app-server flow for each requested `cwd`. - `app-server-protocol/src/protocol/v2.rs` defines the new v2 request/response payloads exposed on the wire. ### Core Changes `core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so `skills/list` and `hooks/list`can resolve plugin state for each requested `cwd` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-29 23:39:57 +00:00
iceweasel-oai	13dbcda28f	stop blocking unified_exec on Windows (#19435 ) ## Summary - remove the Windows-specific unified-exec environment block from tool selection - keep `unified_exec` default-off on Windows unless the feature is explicitly enabled - normalize model-provided `shell_type = unified_exec` to `shell_command` when the feature is disabled - drop obsolete tests tied to the removed environment gate and keep the feature-flag regression coverage ## Why Now that the session/long-lived process backend is implemented for the Windows sandbox, we don't need to hard disable it anymore. We will be rolling out slowly using a feature gate. ## Impact This allows manual Windows opt-in in CLI and app-backed flows while preserving the existing default-off behavior for Windows users. --------- Co-authored-by: canvrno-oai <kbond@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-29 16:06:33 -07:00
pakrym-oai	8de2a7a16d	Add codex-core public API listing (#20243 ) Summary: - Add a checked-in codex-core public API listing generated by cargo-public-api. - Add scripts/regen-public-api.sh with an embedded crate list, auto-install for cargo-public-api 0.51.0, pinned nightly, and --check mode. - Add Rust CI jobs on the codex Linux x64 runner pool to verify the listing stays up to date. Testing: - bash -n scripts/regen-public-api.sh - just regen-public-api --check - yq '.' .github/workflows/rust-ci.yml .github/workflows/rust-ci-full.yml - git diff --check	2026-04-29 22:58:08 +00:00
Matthew Zeng	e20391e567	[mcp] Fix plugin MCP approval policy. (#19537 ) Plugin MCP servers are loaded from plugin manifests rather than top-level `[mcp_servers]`, so their tool approval preferences need to be stored and applied through the owning plugin config. Without this, choosing "Always allow" for a plugin MCP tool could write a preference that was not reliably used on later tool calls. ## Summary - Add plugin-scoped MCP policy config under `plugins.<plugin>.mcp_servers`, including server enablement, tool allow/deny lists, server defaults, and per-tool approval modes. - Overlay plugin MCP policy onto manifest-provided server configs when plugins are loaded. - Route persistent "Always allow" writes for plugin MCP tools back to the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config entry. - Reload user config after persisting an approval and make the plugin load cache config-aware so stale plugin MCP policy is not reused after `config.toml` changes. - Regenerate the config schema and add coverage for plugin MCP policy loading, approval lookup, persistence, and stale-cache prevention. ## Testing - `cargo test -p codex-config` - `cargo test -p codex-core-plugins` - `cargo test -p codex-core --lib plugin_mcp`	2026-04-29 15:40:03 -07:00
Eric Traut	4241df4d79	Escape turn metadata headers as ASCII JSON (#19620 ) ## Why `x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex was serializing the metadata JSON with raw UTF-8 string contents. When a workspace path contains non-ASCII characters, common HTTP stacks can reject or corrupt that header before the request reaches the provider. Fixes #17468. Also addresses the duplicate WebSocket report in #19581. ## What changed - Added `codex_utils_string::to_ascii_json_string`, a shared helper that serializes JSON normally while escaping non-ASCII string content as `\uXXXX`. - Switched turn metadata header serialization, including merged Responses API client metadata, to use the ASCII-safe JSON helper. - Added coverage for non-ASCII workspace paths and non-ASCII client metadata while preserving the same parsed JSON values. ## Verification - `cargo test -p codex-utils-string` - `cargo test -p codex-core turn_metadata` - `just bazel-lock-check`	2026-04-29 15:35:33 -07:00
Alex Daley	f63b19bedd	[apps] Add apps MCP path override (#20231 ) Summary - Add `[features.apps_mcp_path_override]` config with a `path` field for overriding only the built-in apps MCP path. - Keep existing host/base URL derivation unchanged and append the configured path after that base. - Regenerate the config schema with the custom feature-config case. Test Plan - Not run for latest revision; only `just fmt` and `just write-config-schema` were run. - Earlier revision: `cargo test -p codex-features` - Earlier revision: `cargo test -p codex-mcp`	2026-04-29 18:08:06 -04:00
viyatb-oai	07c8b8c77c	fix: handle deferred network proxy denials (#19184 ) ## Why This bug is exposed by Guardian/auto-review approvals. With the managed network proxy enabled, a blocked network request can be reported back through the network approval service as an approval denial after the command has already started. Before this change, the shell and unified exec runtimes registered those network approval calls, but did not have a way to observe an async proxy denial as a cancellation/failure signal for the running process. The result was confusing: Guardian/auto-review could correctly deny network access, but the command path could keep running or unregister the approval without surfacing the denial as the command failure. ## What Changed - `NetworkApprovalService` now attaches a cancellation token to active and deferred network approvals. - Proxy-denial outcomes are recorded only for active registrations, cancel the owning token, and are consumed when the approval is finalized. - The shell runtime combines the normal command timeout with the network-denial cancellation token. - Unified exec stores the deferred network approval object, terminates tracked processes when the proxy denial arrives, and returns the denial as a process failure while polling or completing the process. - Tool orchestration passes the active network approval cancellation token into the sandbox attempt and preserves deferred approval errors instead of silently unregistering them. - App-server `command/exec` now handles the combined timeout-or-cancellation expiration variant used by the runtime. ## Verification - `cargo test -p codex-core network_approval --lib` - `cargo clippy -p codex-app-server --all-targets -- -D warnings` - `cargo clippy -p codex-core --all-targets -- -D warnings` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-29 19:13:57 +00:00
xl-openai	73cd831952	feat: Use remote installed plugin cache for skills and MCP (#20096 ) - Fetches and caches remote /installed plugin state - Lets skills/list load skills from remote-installed cached plugins without requiring a local marketplace entry - Routes plugin list/startup/install/uninstall changes through async plugin cache invalidation and MCP refresh	2026-04-29 12:09:49 -07:00
Won Park	5cf0adba93	Include auto-review rollout in feedback uploads (#20064 ) ## Summary - include the live auto-review trunk rollout when `/feedback` uploads logs - upload that attachment as `auto-review-rollout-<parent-thread-id>.jsonl` so it is distinguishable from the parent rollout - show the same auto-review attachment name in the TUI consent popup ## Scope - this only covers the live cached auto-review trunk for the current parent thread - it does not add durable historical parent->auto-review lookup - it does not add persisted rollout support for ephemeral parallel review forks ## UI <img width="599" height="185" alt="Screenshot 2026-04-28 at 1 17 18 PM" src="https://github.com/user-attachments/assets/6a0e79c2-5d21-4702-8a89-f765778bc9e9" /> ## Validation - `cargo test -p codex-core cached_guardian_subagent_exposes_its_rollout_path` - `cargo test -p codex-feedback` - `cargo test -p codex-app-server` - `cargo test -p codex-tui feedback_upload_consent_popup_snapshot` - `cargo test -p codex-tui feedback_good_result_consent_popup_includes_connectivity_diagnostics_filename` ## Known unrelated local failures - `cargo test -p codex-core` currently fails in the pre-existing proxy env snapshot test `tools::runtimes::tests::maybe_wrap_shell_lc_with_snapshot_keeps_user_proxy_env_when_proxy_inactive` - `cargo test -p codex-tui` currently hits pre-existing `status::*` snapshot drift unrelated to this change ## Follow-Up - persist parallel auto-review fork sessions so /feedback can include their rollout history too - attach each persisted fork as its own clearly named file, for example auto-review-rollout-<parent-thread-id>-fork <n>.jsonl, instead of merging multiple Guardian sessions into one attachment - keep the same live-session-only scope initially; durable historical parent -> auto-review lookup can remain a separate decision if we later need feedback from resumed sessions	2026-04-29 11:44:55 -07:00
pakrym-oai	8356806fc9	Add ThreadManager sample crate (#20141 ) Summary: - Add codex-thread-manager-sample, a one-shot binary that starts a ThreadManager thread, submits a prompt, and prints the final assistant output. - Pass ThreadStore into ThreadManager::new and expose thread_store_from_config for existing callsites. - Build the sample Config directly with only --model and prompt inputs. Verification: - just fmt - cargo check -p codex-thread-manager-sample -p codex-app-server -p codex-mcp-server - git diff --check Tests: Not run per request.	2026-04-29 11:21:06 -07:00
jif-oai	70ac0f123c	Make multi-agent v2 ignore agents.max_depth (#20180 ) ## Why `agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses task-path routing and its own session/thread limits, so v2 should not reject nested `spawn_agent` calls just because the thread-spawn depth has reached the v1 maximum. Keeping the v1 depth guard active in v2 prevents deeper task trees even though the v2 path still needs the depth value only for lineage and task-path metadata. ## What Changed - Removed the depth-limit rejection from the multi-agent v2 `spawn_agent` handler while still computing child depth for lineage/path metadata. - Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools apply only when `Feature::MultiAgentV2` is disabled. - Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to cover a v2 child spawning another agent when `agent_max_depth = 1`, while the existing v1 depth-limit tests continue to enforce the legacy behavior. ## Verification - `cargo test -p codex-core multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture` - `cargo test -p codex-core depth_limit -- --nocapture` - `cargo test -p codex-core tools::handlers::multi_agents::tests -- --nocapture`	2026-04-29 12:23:00 +02:00

1 2 3 4 5 ...

2729 Commits