codex

mirror of https://github.com/openai/codex.git synced 2026-05-21 03:33:41 +00:00

Author	SHA1	Message	Date
Eric Traut	51895767b1	codex: address PR review feedback (#22508 )	2026-05-13 09:39:08 -07:00
Eric Traut	63a3a26013	Add core turn context update helpers	2026-05-13 09:08:02 -07:00
cassirer-openai	702e6a3c64	[rollout-trace] Add a trace ID to MCP calls. (#22326 ) This allows us to connect individual tool calls to the logs of the invocations.	2026-05-13 16:03:33 +00:00
jif-oai	441c2f818f	fix: main (#22503 ) Fix main due to conflicting merge	2026-05-13 17:28:37 +02:00
jif-oai	34bb85519f	feat: add config-change extension contributor (#22488 ) ## Why Extensions can observe thread and turn lifecycle events today, but there was no single host-owned hook for changes to the effective thread configuration. That makes features that need to react to model, permission, or tool-suggest updates either depend on individual mutation paths or risk going stale after runtime config refreshes. This adds a typed config-change contributor so extension-owned state can stay synchronized with the effective thread config while the host remains responsible for deciding when config changed. ## What Changed - Added `ConfigContributor<C>` to `codex_extension_api`, with before/after immutable snapshots of the effective config plus session/thread extension stores. - Added registry builder/accessor support through `config_contributor` and `config_contributors`. - Emits config-change callbacks after committed updates from session settings, per-turn setting updates, and `refresh_runtime_config`. - Builds effective config snapshots only when config contributors are registered, and suppresses no-op callbacks when the before/after snapshots are equal. - Added a core session regression test that verifies contributors observe both model changes and user-layer runtime config changes, including access to session and thread extension stores. ## Validation Added `config_change_contributor_observes_effective_config_changes` in `codex-rs/core/src/session/tests.rs` to cover the new contributor path.	2026-05-13 17:13:34 +02:00
Ahmed Ibrahim	87de4e3290	Add service tier overrides to spawned agents (#22139 ) ## Why Spawned agents can already override `model` and `reasoning_effort`, but they have no equivalent way to opt into a model-supported service tier. That makes it impossible to preserve or intentionally select tiered execution behavior when delegating work to a sub-agent, even though the model catalog already advertises supported `service_tiers`. ## What changed - Add optional `service_tier` to both legacy and `MultiAgentV2` `spawn_agent` tool inputs. - Show each picker-visible model's supported service tier ids and descriptions in the `spawn_agent` tool guidance. - Resolve service tier selection after the child agent's effective model is known. - Inherit the parent tier when omitted and still supported by the final child model; otherwise clear it. - Reject explicit unsupported tier requests with a model-facing error. - Keep explicit `service_tier` usable on full-history forks, while still honoring the existing model/reasoning fork restrictions. - Hide `service_tier` alongside other spawn metadata when `hide_spawn_agent_metadata` is enabled. ## Verification Added focused coverage for: - v1/v2 `spawn_agent` schema exposure for `service_tier` - tier descriptions in spawn guidance - hidden-metadata suppression - explicit supported tier selection - explicit unknown and unsupported tier rejection - inherited tier preservation or clearing based on child-model support - full-history fork acceptance for explicit service tiers in both v1 and v2 Local Rust tests were not run in this workspace per repo guidance; the new coverage is included for CI.	2026-05-13 18:11:50 +03:00
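The tier-resolution rules in the commit above (explicit request, inheritance, clearing, rejection) can be sketched as a small pure function. This is a minimal illustration, not the code from the PR: `TierResolution`, `resolve_service_tier`, and the tier ids in the usage note are hypothetical names chosen for this example.

```rust
/// Hypothetical outcome of resolving a child agent's service tier.
#[derive(Debug, PartialEq)]
pub enum TierResolution {
    /// Run the child with this tier.
    Use(String),
    /// No tier applies to the child (nothing requested, nothing inheritable).
    Cleared,
    /// An explicit request named a tier the child model does not support.
    Rejected(String),
}

/// Resolve the tier once the child's effective model, and therefore its
/// supported tier ids, is known.
pub fn resolve_service_tier(
    requested: Option<&str>,
    parent_tier: Option<&str>,
    child_supported: &[&str],
) -> TierResolution {
    match requested {
        // Explicit and supported: honor the request.
        Some(tier) if child_supported.contains(&tier) => TierResolution::Use(tier.to_string()),
        // Explicit but unsupported: surface an error instead of silently dropping it.
        Some(tier) => TierResolution::Rejected(format!(
            "service tier `{tier}` is not supported by the selected model"
        )),
        // Omitted: inherit the parent's tier only if the child model still supports it.
        None => match parent_tier {
            Some(tier) if child_supported.contains(&tier) => TierResolution::Use(tier.to_string()),
            _ => TierResolution::Cleared,
        },
    }
}
```

Under these assumptions, `resolve_service_tier(None, Some("priority"), &["flex"])` clears the tier, while an explicit `Some("priority")` against the same child model is rejected.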
Felipe Coury	6f77b70ff3	feat(tui): remove Zellij TUI workarounds (#22214 ) ## Why We added Zellij-specific TUI workarounds because older Zellij behavior did not work with Codex's normal terminal model: - #8555 made `tui.alternate_screen = "auto"` disable alternate screen in Zellij so transcript history stayed available. - #16578 avoided scroll-region operations in Zellij by emitting raw newlines and using a separate composer styling path. This PR removes both workarounds because the latest Zellij release tested locally (`zellij 0.44.1`) works correctly with Codex's standard TUI behavior: normal alternate-screen handling, redraw, and history insertion. ## What Changed - Removed the `InsertHistoryMode::Zellij` path and the Zellij-only newline scrollback insertion behavior. - Removed cached `is_zellij` state from the TUI and composer. - Removed Zellij-specific composer styling, the helper snapshot, and the `TerminalInfo::is_zellij()` convenience method that only served this workaround. - Changed `tui.alternate_screen = "auto"` to use alternate screen for Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still preserve the inline mode escape hatch. - Updated the generated config schema description for `tui.alternate_screen`. ## How to Test Manual smoke path used with `zellij 0.44.1`: 1. Build and run this branch inside a Zellij `0.44.1` session with default config. 2. Start Codex normally and produce enough assistant/tool output to create scrollback. 3. Confirm the transcript remains readable, the composer renders normally, and scrolling through terminal history works. 4. Resize the Zellij pane while output exists and confirm the TUI redraws without duplicated, missing, or stale rows. 5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if you want to verify the inline fallback still works. 
Targeted tests: - `just write-config-schema` - `just fmt` - `just fix -p codex-tui` - `cargo test -p codex-terminal-detection` - `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen` Attempted but did not complete locally: - `cargo test -p codex-tui` built and ran the new test successfully, then failed later on unrelated local failures in `status_permissions_full_disk_managed_*` and a stack overflow in `tests::fork_last_filters_latest_session_by_cwd_unless_show_all`. ## Documentation No developers.openai.com Codex documentation update is needed for this revert.	2026-05-13 12:11:15 -03:00
jif-oai	68e045a631	Make context contributors async (#22491 ) ## Summary - make ContextContributor return a boxed Send future - await context contributors during initial context assembly - update existing contributors and extension-api examples for the async contract ## Testing - cargo test -p codex-extension-api --examples - cargo test -p codex-git-attribution - cargo test -p codex-core build_initial_context_includes_git_attribution_from_extensions -- --nocapture - cargo test -p codex-core build_initial_context_omits_git_attribution_when_feature_is_disabled -- --nocapture - cargo test -p codex-core (fails in unrelated agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns stack overflow) - just fix -p codex-extension-api - just fix -p codex-git-attribution - just fix -p codex-core - cargo clippy -p codex-extension-api --examples	2026-05-13 16:43:28 +02:00
jif-oai	1dcc89f1d4	feat: move extension scope ids into ExtensionData (#22490 ) ## Summary - add a scoped level_id to ExtensionData and expose it through level_id() - remove thread_id/turn_id parameters from extension contributor inputs where the scoped ExtensionData already carries that identity - move turn-scoped extension data onto TurnContext so token usage and lifecycle contributors can share the same turn store ## Testing - cargo check -p codex-extension-api -p codex-core --tests - cargo test -p codex-extension-api - cargo test -p codex-guardian - cargo test -p codex-core --lib record_token_usage_info_notifies_extension_contributors - cargo test -p codex-core --lib submission_loop_channel_close_emits_thread_stop_lifecycle - cargo test -p codex-core --lib submission_loop_channel_close_aborts_active_turn_before_thread_stop_lifecycle - just fix -p codex-extension-api - just fix -p codex-guardian - just fix -p codex-core - just fmt ## Note - Attempted cargo test -p codex-core; it aborted in agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns with the existing stack overflow before the full suite completed.	2026-05-13 16:13:16 +02:00
jif-oai	083c1962f9	feat: add token usage contributor hook (#22485 ) ## Why Extensions need a stable place to observe token accounting after Codex folds model-provider usage into the session's cached `TokenUsageInfo`. Without a contributor hook, extension-owned features that need last-turn or cumulative token usage have to duplicate session plumbing or infer state from client-facing `TokenCount` notifications. ## What changed - Added `TokenUsageContributor` to `codex-extension-api`, passing session/thread `ExtensionData`, `ThreadId`, turn id, and the current `TokenUsageInfo`. - Added registry builder/storage support for token-usage contributors. - Invoked registered contributors from `Session::record_token_usage_info` after the session token cache is updated and before the client `TokenCount` notification is emitted. ## Testing - Added `record_token_usage_info_notifies_extension_contributors`, covering cumulative token usage updates and access to both extension stores.	2026-05-13 14:32:23 +02:00
jif-oai	fcc2a92743	fix: emit thread stop lifecycle on implicit shutdown (#22482 ) ## Why The thread lifecycle contributor hooks from #22476 should observe every session teardown. The explicit `Op::Shutdown` path already emitted `on_thread_stop`, but when `submission_loop` exited because its submission channel closed, it only tore down runtime services. That meant extensions could miss the thread-stop lifecycle signal on implicit runtime shutdown. ## What Changed - Split shared runtime teardown into `shutdown_runtime_services(...)`. - Split thread-stop lifecycle emission into `emit_thread_stop_lifecycle(...)`. - Reused those helpers from both explicit shutdown and the channel-close shutdown path. - Tracked whether `Op::Shutdown` was received so the explicit path does not double-emit lifecycle events after it exits the loop. - Added a regression test that closes the submission channel and asserts `ThreadLifecycleContributor::on_thread_stop` runs once with the expected thread/session stores. ## Testing - `cargo test -p codex-core submission_loop_channel_close_emits_thread_stop_lifecycle`	2026-05-13 14:19:57 +02:00
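The "emit exactly once" behavior described in the commit above can be modeled with a flag that records whether the explicit shutdown path already ran. This is a simplified, synchronous sketch under stated assumptions: `Op`, `Lifecycle`, and `run_submission_loop` are hypothetical stand-ins, and an exhausted iterator plays the role of the closed submission channel.

```rust
/// Hypothetical stand-in for submitted operations.
pub enum Op {
    Shutdown,
    Other,
}

/// Hypothetical stand-in for the thread lifecycle contributors; it only counts calls.
pub struct Lifecycle {
    pub stop_count: u32,
}

impl Lifecycle {
    pub fn emit_thread_stop(&mut self) {
        self.stop_count += 1;
    }
}

/// Drain operations until an explicit shutdown arrives or the source ends.
/// The end of the iterator models the submission channel being closed.
pub fn run_submission_loop(ops: impl IntoIterator<Item = Op>, lifecycle: &mut Lifecycle) {
    let mut shutdown_received = false;
    for op in ops {
        match op {
            Op::Shutdown => {
                // Explicit path: emit here and remember that it happened.
                lifecycle.emit_thread_stop();
                shutdown_received = true;
                break;
            }
            Op::Other => {}
        }
    }
    // Implicit path: emit only if the explicit path has not already done so.
    if !shutdown_received {
        lifecycle.emit_thread_stop();
    }
}
```

In this model both an explicit `Op::Shutdown` and a silently closed channel leave `stop_count` at 1, which is the property the regression test in the commit is described as asserting.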
jif-oai	27e67a8c2a	feat: add turn lifecycle contributors (#22480 ) ## Why Extensions can already contribute prompt, tool, turn-item, and thread-lifecycle behavior, but there was no explicit host-owned hook for per-turn setup and cleanup. That makes extension-private turn state awkward: an extension either has to stash it outside the turn lifecycle or depend on core runtime objects. This adds a small turn lifecycle boundary. Extensions receive stable identifiers plus the existing session, thread, and turn `ExtensionData` stores, while core keeps owning task scheduling, cancellation, and turn teardown. ## What Changed - Added `TurnLifecycleContributor` with `on_turn_start`, `on_turn_stop`, and `on_turn_abort` callbacks in `codex-rs/ext/extension-api`. - Added typed `TurnStartInput`, `TurnStopInput`, and `TurnAbortInput` payloads that expose `thread_id`, `turn_id`, `session_store`, `thread_store`, and `turn_store`. - Registered and re-exported turn lifecycle contributors through `ExtensionRegistry` and `ExtensionRegistryBuilder`. - Wired `Session` to emit turn start, stop, and abort callbacks from the existing turn/task lifecycle paths. - Carried the turn-scoped `ExtensionData` through `RunningTask` and `RemovedTask` so stop/abort callbacks receive the same turn store created at turn start. ## Verification - Not run locally.	2026-05-13 13:47:27 +02:00
jif-oai	5ab7e6b4c6	feat: add thread lifecycle contributor hooks (#22476 ) ## Why Extensions that need thread-scoped state currently only get a start-time callback. That is enough for seeding stores, but it leaves the host without a shared extension seam for later thread rehydrate and flush work as thread ownership evolves. This PR turns that start-only seam into a host-owned thread lifecycle contributor contract so extension-private state can stay behind the extension API instead of leaking extra orchestration through core. ## What changed - Replaced `ThreadStartContributor` with `ThreadLifecycleContributor` and added typed lifecycle inputs for thread start, resume, and stop. The contract lives in [`contributors/thread_lifecycle.rs`](`d0e9211f70/codex-rs/ext/extension-api/src/contributors/thread_lifecycle.rs (L1-L64)`). - Kept the existing start-time behavior intact by routing session construction through `on_thread_start`. - Invoked `on_thread_stop` during session shutdown before thread-scoped extension state is dropped, while isolating contributor failures behind warning logs. - Migrated `git-attribution` and `guardian` onto the lifecycle registration path. - Renamed the extension registry plumbing from start-specific contributors to lifecycle-specific contributors. ## Notes `on_thread_resume` is introduced at the API boundary here so extensions can target the final lifecycle shape; host resume dispatch can be wired where that runtime path is finalized.	2026-05-13 13:11:30 +02:00
jif-oai	9c5dfa7b1a	Refactor extension tools onto shared ToolExecutor (#22369 ) ## Why Extension tools were split across two public runtime contracts: `codex-tool-api` exposed `ToolBundle` plus its own call/spec/error types, while core native tools used `codex_tools::ToolExecutor`. That made contributed tool specs and execution behavior easy to drift apart and added another crate boundary for what should be one executable-tool seam. This PR makes `ToolExecutor` the single runtime contract and keeps extension-specific pinning in `codex-extension-api`. ## Remaining todo https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5 Either generic `Invocation` or sub-extract the `ToolCall` and clean `ToolInvocation` ## What changed - Removed the `codex-tool-api` workspace crate and its dependencies from core and `codex-extension-api`. - Made `codex_tools::ToolExecutor` object-safe with `async_trait` so extension contributors can return a dyn executor. - Added the extension-facing aliases under `ext/extension-api/src/contributors/tools.rs`, including `ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output = ExtensionToolOutput>`. - Changed `ToolContributor::tools` to return extension executors directly instead of `ToolBundle`s. - Updated core’s extension tool handler/registry/router path to adapt those extension executors into the existing native `ToolInvocation` runtime path. - Added focused coverage for extension tools being registered, model-visible, dispatchable, and not replacing built-in tools. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-extension-api`	2026-05-13 12:12:06 +02:00
jif-oai	1824685a00	feat: extract shared tool executor interface (#22359 ) ## Why Codex still models model-visible tools and executable behavior largely inside `codex-core`, which makes it harder to evolve the tool system toward a single reusable abstraction for built-ins, MCP-backed tools, dynamic tools, and later tools injected from outside core. This PR takes the next incremental step in that direction by moving the common execution-facing pieces out of core and separating them from core-only orchestration. The intent is to let shared tool abstractions improve in one place, while `codex-core` keeps the parts that are still inherently host-specific today, such as `ToolInvocation`, dispatch wiring, and hook integration. This PR is mostly moving things around. The only interesting piece is this abstraction: https://github.com/openai/codex/pull/22359/changes#diff-81af519002548ba51ed102bdaaf77e081d40a1e73a6e5f9b104bbbc96a6f1b3dR13 ## What changed - Added `codex_tools::ToolExecutor<Invocation>` as the shared execution trait for model-visible tools. - Moved the reusable execution support types from `codex-core` into `codex-tools`: - `FunctionCallError` - `ToolPayload` - `ToolOutput` - Refactored core tool implementations so that execution behavior lives on `ToolExecutor<ToolInvocation>`, while `ToolHandler` remains the core-local extension point for hook payloads, telemetry tags, diff consumers, and other orchestration concerns. - Kept the registry and dispatch flow behaviorally unchanged while making the shared/extracted boundary explicit across built-in, MCP, dynamic, extension-backed, shell, and multi-agent tool handlers. 
## Verification - `cargo test -p codex-tools` - `just fix -p codex-tools` - `just fix -p codex-core` - `cargo test -p codex-core` progressed through the updated tool surfaces and then hit the existing unrelated multi-agent stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.	2026-05-13 11:31:27 +02:00
jif-oai	7e97da7c13	chore: Keep view_image sandbox test in temp dir (#22355 ) ## Summary - move the `view_image` sandbox filesystem-read unit test onto a temporary cwd - keep the turn cwd and selected turn environment cwd aligned inside the test - avoid leaving `core/image.png` behind in the repo checkout after the test runs ## Root cause The test wrote `image.png` beneath `turn.cwd`, and the shared session test helper defaults that cwd to the current repo directory when no override is provided. ## Validation - `just fmt` - `cargo test -p codex-core tools::handlers::view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads`	2026-05-13 10:39:07 +02:00
Abhinav	392e94e9ea	add --dangerously-bypass-hook-trust CLI flag (#21768 ) # Why Hook trust is granted through the TUI in `/hooks`, so it can block non-interactive use cases. This flag lets users who run Codex headlessly bypass hook trust when they want to. # What This adds one invocation-scoped escape hatch. - the CLI flag sets a runtime-only `bypass_hook_trust` override; there is no durable `config.toml` setting - hook discovery still respects normal enablement, so explicitly disabled hooks remain disabled - we show a `--dangerously-bypass-hook-trust is enabled. Enabled hooks may run without review for this invocation.` message on startup so accidental use is visible in both interactive and exec flows This keeps “enabled” and “trusted” as separate concepts in the normal path, while giving CI/E2E callers a stable way to opt into the exceptional path when they already control the hook set.	2026-05-13 07:13:57 +00:00
Abhinav	934a40c7d9	Use root repo hooks in linked worktrees (#21969 ) # Why Linked worktrees currently load their own project hook declarations, so the same repo can present different hook definitions depending on which checkout is active. https://github.com/openai/codex/pull/21762 tried to share trust by giving matching worktree hooks a shared synthetic key, but review pointed out that divergent worktree hook definitions would then fight over one `trusted_hash`. Instead of introducing a second trust model, this makes linked worktrees use the root checkout as the single source of truth for project hook declarations. Worktree-local project config can still diverge for unrelated settings, but project hooks now keep one real source path and one trust state per repo. # What - Teach project config loading to remember the matching root-checkout `.codex/` folder for actual linked-worktree project layers. - Keep ordinary project config sourced from the worktree, but replace project hook declarations with the root checkout's matching layer before hook discovery runs, including linked-worktree layers with `.codex/` but no local `config.toml`. - Make hook discovery use that authoritative hook folder for both `hooks.json` and TOML hook source paths, so linked worktrees produce the same hook key and trust state as the root checkout. - Cover the linked-worktree path plus regressions for missing worktree `config.toml` and nested non-worktree project roots.	2026-05-13 06:58:58 +00:00
sayan-oai	2304ec45ca	Remove unavailable MCP placeholder tool backfill (#22439 ) ## Why `UnavailableDummyTools` kept synthetic placeholder tools alive for historical tool calls whose backing MCP tool was no longer available. That path adds stale model-visible tool specs and special routing at the point where unavailable MCP calls should use ordinary current-tool handling. This removes the runtime backfill instead of preserving a second compatibility lane. ## Is it safe to remove? The unavailable tools were added in #17853 after a CS issue when a previously-called MCP tool failed to load and was omitted from the CS spec. Now that we have tool search, I think this is resolved: - API merges tools from previous TST output into the effective tool set so they're always in the CS spec - if an MCP tool surfaced by TST later becomes unavailable, the model can still call it and it will just return a model-visible error - both TST output and function call output are dropped on compaction, so the model will not remember old calls to MCP post-compaction ## What changed - Delete unavailable-tool collection, placeholder handler, router/spec plumbing, and obsolete placeholder coverage. - Keep `features.unavailable_dummy_tools` as a removed no-op feature tombstone so existing configs still parse cleanly. - Add an integration-style `tool_search` regression test showing that a deferred MCP tool surfaced through `tool_search` still routes through MCP and returns a model-visible tool-call error rather than `unsupported call`. ## Verification - `cargo test -p codex-core tool_search`	2026-05-12 23:30:13 -07:00
pakrym-oai	104fc14956	Encapsulate tool search entries in handlers (#22261 ) ## Why This builds on the handler-owned spec refactor by moving deferred tool-search metadata to the same handlers that already own tool specs. The registry builder no longer needs a separate prebuilt `tool_search_entries` path; it can collect searchable entries from deferred handlers directly. ## What changed - Added `search_info()` to tool handlers and implemented it for MCP and dynamic handlers. - Reused handler `spec()` output when constructing tool-search entries, adapting it into the deferred `LoadableToolSpec` shape expected by `tool_search`. - Simplified `build_tool_registry_builder(...)` so `tool_search` registration is based on deferred handlers with search info. - Removed the old standalone search-entry builders and now-unused `codex-tools` discovery helper exports. ## Verification - `cargo test -p codex-core tools::handlers::tool_search::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests::search_tool -- --nocapture` - `cargo test -p codex-core tools::spec::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture` - `cargo test -p codex-tools` - `just fix -p codex-core` - `just fix -p codex-tools`	2026-05-12 20:48:02 -07:00
pakrym-oai	67c8486462	tools: infer code-mode namespace descriptions from specs (#22406 ) ## Why Code mode already builds the merged nested `ToolSpec`s that feed the `exec` prompt. Keeping a separate `tool_namespaces` map in the planning path duplicated that metadata and left extra wrapper plumbing in `spec.rs`. ## What changed - derive code-mode namespace descriptions from the merged `ToolSpec::Namespace` entries before building the code-mode handlers - extract `build_code_mode_handlers(...)` so the code-mode-specific planning stays in one place - remove `tool_namespaces` from `ToolRegistryBuildParams` - delete the now-unused `McpToolPlanInputs` wrapper and related test helper plumbing ## Testing - `cargo test -p codex-core spec_plan`	2026-05-12 20:47:50 -07:00
pakrym-oai	96833c5b15	Remove CODEX_RS_SSE_FIXTURE test hook (#22413 ) ## Why `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests bypass the normal Responses transport by reading SSE from local files. That kept test-only behavior wired through production client code. The affected tests can stay hermetic by using the existing `core_test_support::responses` mock server and passing `openai_base_url` instead. ## What Changed - Removed the `CODEX_RS_SSE_FIXTURE` flag, `codex_api::stream_from_fixture`, the `env-flags` dependency, and the checked-in SSE fixture files. - Repointed the affected core, exec, and TUI tests at `MockServer` with the existing SSE event constructors. - Removed the Bazel test data plumbing for the deleted fixtures and refreshed cargo/Bazel lock state. ## Verification - `cargo build -p codex-cli` - `cargo test -p codex-api` - `cargo test -p codex-core --test all responses_api_stream_cli` - `cargo test -p codex-core --test all integration_creates_and_checks_session_file` - `cargo test -p codex-exec --test all ephemeral` - `cargo test -p codex-exec --test all resume` - `cargo test -p codex-tui --test all resume_startup_does_not_consume_model_availability_nux_count` - `just bazel-lock-update` - `just bazel-lock-check` - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui` - `git diff --check`	2026-05-13 03:08:01 +00:00
Andrei Eternal	913aad4d3c	Add allow_managed_hooks_only hook requirement (#20319 ) ## Why Enterprise-managed hook policy needs a narrow way to require Codex to ignore user-controlled lifecycle hooks without adopting the broader trust-precedence model from earlier hook work. This keeps the policy anchored in `requirements.toml`, so admins can opt into managed hooks only while normal `config.toml` files cannot enable the restriction themselves. ## What changed - Added `allow_managed_hooks_only` to the requirements data flow and preserved explicit `false` values. - Also adds it to /debug-config - Marked MDM, system, and legacy managed config layers as managed for hook discovery. - Updated hook discovery so `allow_managed_hooks_only = true`: - keeps managed requirements hooks and managed config-layer hooks, - skips user/project/session `hooks.json` and `[hooks]` entries with concise startup warnings, - skips current unmanaged plugin hooks, - ignores any `allow_managed_hooks_only` key placed in ordinary `config.toml` layers.	2026-05-12 19:05:25 -07:00
Andrei Eternal	fbfbfe5fc5	hooks: use new session IDs instead of thread IDs for hooks, apply parent's session ID to subagents' hooks (#22268 ) ## Why Hook semantics treat `session_id` as shared across a root session and its subagents. Codex hooks were still emitting the current thread ID, which made spawned agents look like independent sessions and made it harder for hook integrations to correlate work across a root thread and its spawned helpers. This change makes hooks use Codex's existing shared session identity so hook `session_id` matches the root-thread session across spawned subagents. ## What Changed - switch hook payloads to use the existing shared session identity from core instead of the current thread ID - cover all hook surfaces that expose `session_id`, including `SessionStart`, tool hooks, compact hooks, prompt-submit hooks, stop hooks, and legacy after-agent dispatch	2026-05-12 19:05:10 -07:00
Celia Chen	e2eb7c30fe	feat: route guardian review model selection through providers (#22258 ) ## Why Guardian review selection was hard-coded in `core`, which worked for the default OpenAI path but did not give provider implementations a way to choose backend-specific reviewer model IDs. That matters for Amazon Bedrock: guardian review should run through the Bedrock/Mantle provider using Bedrock's `openai.gpt-5.4` model ID, instead of accidentally selecting a reviewer model that implies the OpenAI backend. ## What Changed - Added provider-owned approval review model selection via `ModelProvider::approval_review_model_selection`. - Moved the existing default selection policy into the provider abstraction: prefer the requested reviewer model when it is available, otherwise fall back to the active turn model, preferring `Low` reasoning when supported. - Added an Amazon Bedrock override that pins guardian review to `openai.gpt-5.4` with `Low` reasoning.	2026-05-13 01:55:46 +00:00
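The default selection policy described in the commit above (requested reviewer model if available, else the active turn model, with `Low` reasoning when supported) might look roughly like the following. This is a sketch under assumptions, not the provider trait from the PR: `Effort`, `ModelInfo`, `ReviewSelection`, and `select_review_model` are hypothetical, and applying the `Low` preference to whichever model is chosen is one reading of the commit text.

```rust
/// Hypothetical reasoning-effort levels.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Effort {
    Low,
    Medium,
}

/// Hypothetical catalog entry: a model id plus the efforts it supports.
pub struct ModelInfo<'a> {
    pub id: &'a str,
    pub efforts: &'a [Effort],
}

/// Hypothetical result of choosing the reviewer model.
#[derive(Debug, PartialEq)]
pub struct ReviewSelection {
    pub model: String,
    pub effort: Option<Effort>,
}

pub fn select_review_model(
    requested: &str,
    active: &str,
    catalog: &[ModelInfo<'_>],
) -> ReviewSelection {
    // Prefer the requested reviewer model; otherwise fall back to the active turn model.
    let chosen = catalog
        .iter()
        .find(|m| m.id == requested)
        .or_else(|| catalog.iter().find(|m| m.id == active));
    match chosen {
        Some(m) => ReviewSelection {
            model: m.id.to_string(),
            // Ask for `Low` reasoning only when the chosen model supports it.
            effort: m.efforts.contains(&Effort::Low).then_some(Effort::Low),
        },
        // Neither model is in the catalog: keep the active model, leave effort unset.
        None => ReviewSelection {
            model: active.to_string(),
            effort: None,
        },
    }
}
```

A provider-specific override, such as the Bedrock one the commit mentions, would replace this function's result with a pinned model id and effort rather than consulting the catalog.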
sayan-oai	1ae9867296	[codex] Remove tool search bucket limit override (#22381 ) ## Why `tool_search` still carries the server-specific result-cap path added in #17684 for `computer-use`: when the model omitted `limit`, a matching result expanded the search to 20 and then `limit_results_by_bucket` applied per-bucket caps. That makes default result handling depend on a one-off server exception instead of the single `TOOL_SEARCH_DEFAULT_LIMIT` path. This PR removes that custom branch so omitted `limit` values use the ordinary global default consistently. The implementation being retired is the pre-change bucketed search path in [`tool_search.rs`](`5e3ee5eddf/codex-rs/core/src/tools/handlers/tool_search.rs (L121-L190)`). ## What changed - Collapse `ToolSearchHandler::search` back to one BM25 search with the resolved limit. - Remove `limit_results_by_bucket`, the `computer-use` constants, and the omitted-limit plumbing that only existed for the override. - Drop dead `ToolSearchEntry::limit_bucket` metadata from deferred MCP and dynamic search entries. - Remove tests and helpers that only asserted the deleted override behavior. - Add direct handler-level unit coverage for omitted/default and explicit `tool_search` result limits. ## Validation - `cargo test -p codex-core tool_search` - The matching unit tests passed, including the new omitted/default and explicit result-limit coverage. - The broader `--test all` search-tool fixture phase then failed before sending mocked response requests in `tool_search_indexes_only_enabled_non_app_mcp_tools` and `tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`. - `cargo test -p codex-core` - The touched tool-search coverage passed before the run later aborted in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` with a stack overflow.	2026-05-13 00:46:07 +00:00
Tom	c51c65ad09	Unify thread metadata updates above store (#22236 ) - make ThreadStore::update_thread_metadata accept a broad range of metadata patches - keep ThreadStore::append_items as raw canonical history append (no metadata side effects) - in the local store, write these metadata updates to a combination of sqlite and rollout jsonl files for backwards-compat. It special cases which fields need to go into jsonl vs sqlite vs whatever, confining the awkwardness to just this implementation - in remote stores we can simply persist the metadata directly to a database, no special casing required. - move the "implicit metadata updates triggered by appending rollout items" from the RolloutRecorder (which is local-threadstore-specific) to the LiveThread layer above the ThreadStore, inside of a private helper utility called ThreadMetadataSync. LiveThread calls ThreadStore append_items and update_metadata separately. - Add a generic update metadata method to ThreadManager that works on both live threads and "cold" threads - Call that ThreadManager method from app server code, so app server doesn't need to worry about whether the thread is live or not	2026-05-13 00:28:15 +00:00
Michael Bolin	9e7cdbd0d2	core: box multi-agent handler futures (#22266 ) ## Why This is the base PR in the split stack for the permissions migration. It isolates stack-safety work that had been mixed into the larger permissions PR, so reviewers can evaluate the async-future changes separately from the permissions model changes in #22267. The main risk this addresses is large or recursive multi-agent futures overflowing smaller runner stacks. A follow-up review also called out that `shutdown_live_agent` must remain quiescent: callers should not remove a live agent from tracking or release its spawn slot until the worker loop has actually terminated. ## What Changed - Boxes the large async futures in the multi-agent spawn, resume, and close tool handlers. - Boxes the `AgentControl` spawn and recursive close/shutdown paths that can otherwise build very deep futures. - Keeps `shutdown_live_agent` waiting for thread termination before removing/releasing the live agent, preserving the previous shutdown ordering while still boxing the recursive close path. ## Verification Strategy The focused local coverage was `cargo test -p codex-core multi_agents`, which exercises the multi-agent spawn/resume/close handlers, cascade close/resume behavior, and the shutdown path touched by this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22266). * #22330 * #22329 * #22328 * #22327 * __->__ #22266	2026-05-12 17:22:25 -07:00
pakrym-oai	0173f71143	Refactor namespaced tool spec registration (#22256 ) ## Summary This refactor makes tool handlers the owner of the specs they can publish, so registry construction can register handlers once and separately publish only the specs that should be model-visible. The main motivation is deferred tools: MCP and dynamic tools still need handlers registered up front, but deferred tools should be discoverable through `tool_search` rather than emitted in the initial tool spec list. ## What changed - `McpHandler` and `DynamicToolHandler` can return their own `ToolSpec`. - `build_tool_registry_builder` now collects handlers, registers them through the no-spec path, and publishes only non-deferred handler specs. - Deferred MCP and dynamic tool names are combined into one `all_deferred_tools` set that drives spec filtering, code-mode deferred-tool signaling, and `tool_search` registration. - `tool_search` registration now requires both deferred tools and `namespace_tools`. - Namespace specs are merged in `spec_plan`, preserving top-level spec order, sorting tools within each namespace, and backfilling empty namespace descriptions. - Hosted web search and image-generation specs are included in the collected spec vector before namespace merge/publication, and tool-name tests that should not care about hosted relative order now compare sets. 
## Testing - `cargo test -p codex-core tools::spec::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture` - `cargo test -p codex-core tools::router::tests::specs_filter_deferred_dynamic_tools -- --nocapture` - `cargo test -p codex-core suite::prompt_caching::prompt_tools_are_consistent_across_requests -- --nocapture` - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-core -- --skip tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` passed the library suite after skipping the known stack-overflowing unit test. Full `cargo test -p codex-core` currently hits a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`; the same focused test reproduces on `origin/main`.	2026-05-12 17:09:14 -07:00
pakrym-oai	960d42ddae	code-mode: carry nested tool kind through runtime (#22377 ) ## Why Code mode only used nested spec lookup at execution time to rediscover whether a nested tool should be invoked as a function tool or a freeform tool. That information is already present in the enabled tool metadata that code mode builds to expose `tools.*` and `ALL_TOOLS`, so re-looking it up from the router was redundant and kept execution coupled to a separate spec lookup path. ## What Changed - thread `CodeModeToolKind` through the code-mode runtime `ToolCall` event and `CodeModeNestedToolCall` - emit the nested tool kind directly from the V8 callback using the already-enabled tool metadata - build nested tool payloads from the propagated kind instead of calling `find_spec` - remove the now-unused `find_spec` plumbing from the router and parallel runtime helpers - add unit coverage for function vs freeform payload shaping and update affected router tests ## Testing - `cargo test -p codex-code-mode` - `cargo test -p codex-core code_mode::tests` - `cargo test -p codex-core extension_tool_bundles_are_model_visible_and_dispatchable` - `cargo test -p codex-core model_visible_specs_filter_deferred_dynamic_tools`	2026-05-12 23:34:37 +00:00
Dylan Hurd	8123bddb16	chore(config) include_collaboration_mode_instructions (#22383 ) ## Summary Adds `include_collaboration_mode_instructions`, the config equivalent of `include_permissions_instructions` for collaboration modes. Useful when the collaboration-mode instructions should be kept out of the context. ## Testing - [x] Added unit test	2026-05-12 15:50:10 -07:00
pakrym-oai	862b2122ee	tools: remove is_mutating dispatch gating (#22382 ) ## Why Tool dispatch had two serialization mechanisms: - `supports_parallel_tool_calls` decides whether a tool participates in the shared parallel-execution lock. - `is_mutating` separately gated some calls inside dispatch. That second hook no longer carried its weight. The remaining parallel-support flag is already the per-tool concurrency policy, so keeping a second mutating gate made dispatch harder to follow and left behind extra session plumbing that only existed for that path. ## What changed - Removed `is_mutating` from tool handlers and deleted the `tool_call_gate` path that existed only to support it. - Simplified dispatch and routing to rely on the existing per-tool `supports_parallel_tool_calls` boolean. - Dropped the now-unused handler overrides and related session/test scaffolding. - Kept the router/parallel tests focused on the surviving per-tool behavior. - Removed the unused `codex-utils-readiness` dependency from `codex-core` as a follow-up fix for `cargo shear`. ## Testing - `cargo test -p codex-core parallel_support_does_not_match_namespaced_local_tool_names` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel`	2026-05-12 22:44:54 +00:00
Chris Bookholt	5e3ee5eddf	[codex] Tighten unified exec sandbox setup (#22207 ) ## Summary - tighten unified exec sandbox initialization - preserve the requested process workdir independently from sandbox setup - add regression coverage for the updated invariant ## Validation - Ran `/tmp/cargo-tools/bin/just fmt`. - Ran the targeted `codex-core` regression test successfully. - Ran `cargo test -p codex-core`; it did not complete cleanly because unrelated existing agent/config-loader tests failed and the run later aborted on a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`. Co-authored-by: Codex <noreply@openai.com>	2026-05-12 08:41:00 -07:00
Felipe Coury	95b332c820	feat(tui): add ambient terminal pets (#21206 ) ## Why The Codex App has animated pets, but the TUI had no equivalent ambient companion surface. This brings that experience into terminal Codex while keeping the main chat flow usable: the pet should feel present, but it cannot cover transcript text, composer input, approvals, or picker content. The feature also needs to be terminal-aware. Different terminals support different image protocols, tmux can interfere with image rendering, and some users will want pets disabled entirely or anchored differently depending on their layout. <table> <tr><td> <img width="4110" height="2584" alt="CleanShot 2026-05-05 at 12 41 45@2x" src="https://github.com/user-attachments/assets/68a1fcbc-2104-48d6-b834-69c6aaa95cdf" /> <p align="center">macOS - Ghostty, iTerm2 and WezTerm with Custom Pet</p> </td></tr> <tr><td> ![Uploading CleanShot 2026-05-10 at 20.28.30.png…]() <p align="center">Windows Terminal</p> </td></tr> <tr><td> <img width="3902" height="2752" alt="CleanShot 2026-05-05 at 12 39 02@2x" src="https://github.com/user-attachments/assets/300e2931-6b00-467e-91cb-ab8e28470500" /> <p align="center">Linux - WezTerm and Ghostty</p> </td></tr> </table> ## What Changed - Add a TUI ambient pet renderer in `codex-rs/tui/src/pets/`. - Port the app-style pet animation states so the sprite changes with task status, waiting-for-input states, review/ready states, and failures. - Add `/pets` selection UI with a preview pane, loading state, built-in pet choices, and a first-row `Disable terminal pets` option. - Download built-in pet spritesheets on demand from the same public CDN path already used by Android, under `https://persistent.oaistatic.com/codex/pets/v1/...`, and cache them locally under `~/.codex/cache/tui-pets/`. - Keep custom pets local. - Add config support for pet selection, disabling pets, and choosing whether the pet follows the composer bottom or anchors to the terminal bottom. 
- Reserve layout space around the pet so transcript wrapping, live responses, and composer input do not render underneath the sprite. - Gate image rendering by terminal capability, disable image pets under tmux, and support both Kitty Graphics and SIXEL terminals. - Add redraw cleanup for terminal image artifacts, including sixel cell clearing. ## Current Scope - This is an initial TUI version of ambient pets, not full App parity. - It focuses on ambient sprite rendering, `/pets` selection, custom pets, terminal capability gating, and on-demand CDN-backed built-in assets. - The ambient text overlay is currently disabled, so the TUI renders the pet sprite without extra status text beside it. ## How to Test 1. Start Codex TUI in a terminal with image support. 2. Run `/pets`. 3. Confirm the picker shows built-in pets plus custom pets, and the first item is `Disable terminal pets`. 4. On a fresh `~/.codex/cache/tui-pets/`, move onto a built-in pet and confirm the first preview downloads the spritesheet from the shared Codex pets CDN and renders successfully. 5. Move through the pet list and confirm subsequent built-in previews use the local cache. 6. Select a pet, then send and receive messages. Confirm transcript and composer text wrap before the pet instead of rendering underneath the sprite. 7. Change the pet anchor setting and confirm the pet can either follow the composer bottom or sit at the terminal bottom. 8. Return to `/pets`, choose `Disable terminal pets`, and confirm the sprite disappears cleanly. Targeted tests: - `cargo test -p codex-tui ambient_pet_` - `cargo test -p codex-tui resize_reflow_wraps_transcript_early_when_pet_is_enabled` - `cargo insta pending-snapshots`	2026-05-12 10:43:17 -03:00
cassirer-openai	cb55b769d1	[rollout-trace] Add x-codex-inference-call-id header to inference calls. (#22311 ) This allows us to attach call logs to inference requests in traces.	2026-05-12 05:55:11 -07:00
jif-oai	d996f5366f	feat: guardian as an extension (contributors part) (#22216 ) Part 1 of guardian as an extension. This binds together the logic needed to spawn another agent from an extension and adds `ThreadId` to the start thread collaborator.	2026-05-12 14:41:45 +02:00
viyatb-oai	46f30d0282	feat(sandbox): add Windows deny-read parity (#18202 ) ## Why The split filesystem policy stack already supports exact and glob `access = none` read restrictions on macOS and Linux. Windows still needed subprocess handling for those deny-read policies without claiming enforcement from a backend that cannot provide it. ## Key finding The unelevated restricted-token backend cannot safely enforce deny-read overlays. Its `WRITE_RESTRICTED` token model is authoritative for write checks, not read denials, so this PR intentionally fails that backend closed when deny-read overrides are present instead of claiming unsupported enforcement. ## What changed This PR adds the Windows deny-read enforcement layer and makes the backend split explicit: - Resolves Windows deny-read filesystem policy entries into concrete ACL targets. - Preserves exact missing paths so they can be materialized and denied before an enforceable sandboxed process starts. - Snapshot-expands existing glob matches into ACL targets for Windows subprocess enforcement. - Honors `glob_scan_max_depth` when expanding Windows deny-read globs. - Plans both the configured lexical path and the canonical target for existing paths so reparse-point aliases are covered. - Threads deny-read overrides through the elevated/logon-user Windows sandbox backend and unified exec. - Applies elevated deny-read ACLs synchronously before command launch rather than delegating them to the background read-grant helper. - Reconciles persistent deny-read ACEs per sandbox principal so policy changes do not leave stale deny-read ACLs behind. - Fails closed on the unelevated restricted-token backend when deny-read overrides are present, because its `WRITE_RESTRICTED` token model is not authoritative for read denials. ## Landed prerequisites These prerequisite PRs are already on `main`: 1. #15979 `feat(permissions): add glob deny-read policy support` 2. #18096 `feat(sandbox): add glob deny-read platform enforcement` 3. 
#17740 `feat(config): support managed deny-read requirements` This PR targets `main` directly and contains only the Windows deny-read enforcement layer. ## Implementation notes - Exact deny-read paths remain enforceable on the elevated path even when they do not exist yet: Windows materializes the missing path before applying the deny ACE, so the sandboxed command cannot create and read it during the same run. - Existing exact deny paths are preserved lexically until the ACL planner, which then adds the canonical target as a second ACL target when needed. That keeps both the configured alias and the resolved object covered. - Windows ACLs do not consume Codex glob syntax directly, so glob deny-read entries are expanded to the concrete matches that exist before process launch. - Glob traversal deduplicates directory visits within each pattern walk to avoid cycles, without collapsing distinct lexical roots that happen to resolve to the same target. - Persistent deny-read ACL state is keyed by sandbox principal SID, so cleanup only removes ACEs owned by the same backend principal. - Deny-read ACEs are fail-closed on the elevated path: setup aborts if mandatory deny-read ACL application fails. - Unelevated restricted-token sessions reject deny-read overrides early instead of running with a silently unenforceable read policy. ## Verification - `cargo test -p codex-core windows_restricted_token_rejects_unreadable_split_carveouts` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-windows-sandbox` - GitHub Actions rerun is in progress on the pushed head. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 23:04:28 -07:00
pakrym-oai	c9e46ed639	[codex] Make handlers own parallel tool support (#22254 ) ## Why `ToolRouter::tool_supports_parallel()` was still consulting configured specs when a handler lookup missed, even though parallel schedulability is really a property of the executable handler. Keeping that metadata on `ConfiguredToolSpec` duplicated state between the model-visible spec layer and the runtime handler layer. This change makes handlers the sole source of truth for parallel tool support and removes the extra spec wrapper that only existed to carry duplicated metadata. ## What changed - removed `ConfiguredToolSpec` and store plain `ToolSpec` values in the registry/router builder path - changed `ToolRouter::tool_supports_parallel()` to consult only the handler registry and fall back to `false` - simplified spec collection and test helpers to operate directly on `ToolSpec` - updated router/spec tests to cover handler-owned parallel behavior and the no-handler fallback ## Validation - `cargo test -p codex-tools` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core deferred_responses_api_tool_serializes_with_defer_loading` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel` - `cargo test -p codex-core request_plugin_install_can_be_registered_without_search_tool` ## Docs No documentation updates needed.	2026-05-11 22:26:33 -07:00
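The handler-owned lookup described in #22254 can be sketched roughly as follows. This is a minimal illustration, not the actual `ToolRouter` code: the `Handler` and `Registry` types, the `sample_registry` helper, and the tool names are hypothetical stand-ins. Only the rule "consult the handler registry and fall back to `false`" comes from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a registered tool handler.
struct Handler {
    supports_parallel_tool_calls: bool,
}

/// Hypothetical registry keyed by tool name.
struct Registry {
    handlers: HashMap<String, Handler>,
}

impl Registry {
    /// Only the handler decides; a tool without a handler is never parallel.
    fn tool_supports_parallel(&self, name: &str) -> bool {
        self.handlers
            .get(name)
            .map(|h| h.supports_parallel_tool_calls)
            .unwrap_or(false)
    }
}

/// Builds a small registry with made-up tool names for illustration.
fn sample_registry() -> Registry {
    let mut handlers = HashMap::new();
    handlers.insert(
        "mcp_echo".to_string(),
        Handler { supports_parallel_tool_calls: true },
    );
    handlers.insert(
        "shell".to_string(),
        Handler { supports_parallel_tool_calls: false },
    );
    Registry { handlers }
}
```

With a single source of truth there is no spec-side flag that can drift from the handler's own setting.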
pakrym-oai	79c65f816c	[codex] Filter legacy warning messages during compaction (#22243 ) ## Why Older sessions can contain model-warning records persisted as `user` messages, including the unified exec process-limit warning, the `apply_patch`-via-`exec_command` warning, and the model-mismatch high-risk cyber fallback warning. Those warnings are no longer produced as conversation history items, but when old sessions compact they should still be recognized as injected context rather than preserved as real user turns. ## What changed - Removed `record_model_warning` and the production paths that emitted these warning messages into conversation history. - Added `LegacyUnifiedExecProcessLimitWarning`, `LegacyApplyPatchExecCommandWarning`, and `LegacyModelMismatchWarning` contextual fragments that are used only for matching old persisted messages. - Registered the legacy fragments with contextual user message detection so compaction filters them through the existing fragment path. - Added focused compaction coverage for old warning messages being dropped during compacted-history processing. ## Testing - `cargo test -p codex-core warning` - `just fix -p codex-core`	2026-05-11 19:51:51 -07:00
Abhinav	d08906a944	Support PreToolUse updatedInput rewrites (#20527 ) ## Why `PreToolUse` already exposes `updatedInput` in its hook output schema, but Codex currently rejects it instead of applying the rewrite. That leaves hook authors unable to make the documented pre-execution adjustment to a tool call before it runs. ## What - Accept `updatedInput` from `PreToolUse` hooks when paired with `permissionDecision: "allow"`. - Apply the rewritten input before dispatch so the tool executes the updated payload, not the original one. - Preserve the stable hook-facing compatibility shapes that participating tool handlers expose: - Bash-like tools (`shell`, `container.exec`, `local_shell`, `shell_command`, `exec_command`) use `{ "command": ... }`. - `apply_patch` exposes its patch body through the same command-shaped hook contract. - MCP tools expose their JSON argument object directly. - Keep each participating tool handler responsible for translating hook-facing `updatedInput` back into its concrete invocation shape. ## Verification Direct Bash-like rewrite coverage: - `pre_tool_use_rewrites_shell_before_execution` - `pre_tool_use_rewrites_container_exec_before_execution` - `pre_tool_use_rewrites_local_shell_before_execution` - `pre_tool_use_rewrites_shell_command_before_execution` - `pre_tool_use_rewrites_exec_command_before_execution` These cases assert that each supported Bash-like surface runs only the rewritten command while the hook still observes the original `{ "command": ... }` input. `pre_tool_use_rewrites_apply_patch_before_execution` - Model emits one patch. - Hook swaps in a different patch. - Asserts only the rewritten file is created, and the hook saw the original patch. `pre_tool_use_rewrites_code_mode_nested_exec_command_before_execution` - Model runs one nested shell command from code mode. - Hook rewrites it. - Asserts only the rewritten command runs, and the hook saw the original nested input. 
`pre_tool_use_rewrites_mcp_tool_before_execution` - Model calls the RMCP echo tool. - Hook rewrites the MCP arguments. - Asserts the MCP server receives and returns the rewritten message, not the original one.	2026-05-11 22:27:24 -04:00
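The acceptance rule for `updatedInput` in #20527 might look something like the sketch below. `HookOutput` and `effective_input` are hypothetical names, and inputs are modeled as plain strings rather than JSON values. The commit message only states that a rewrite is accepted when paired with `permissionDecision: "allow"`; leaving the input unchanged in every other case is an assumption of this sketch.

```rust
/// Hypothetical, simplified view of a PreToolUse hook result.
struct HookOutput {
    permission_decision: Option<String>,
    updated_input: Option<String>,
}

/// Returns the input the tool should run with.
/// Assumption: without an "allow" decision the rewrite is ignored.
fn effective_input(original: &str, hook: &HookOutput) -> String {
    match (hook.permission_decision.as_deref(), &hook.updated_input) {
        (Some("allow"), Some(updated)) => updated.clone(),
        _ => original.to_string(),
    }
}
```

The hook still observes the original input; only the dispatched payload changes.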
starr-openai	17ed5ad0b0	Apply sandbox context to local view_image reads (#21861 ) ## Summary - create a selected-cwd filesystem sandbox context for view_image metadata and file reads in both local and remote environments - add a local restricted-profile regression test for the previously unsandboxed read path ## Validation - just fmt - bazel test --bes_backend= --bes_results_url= --test_output=errors --test_filter=view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads //codex-rs/core:core-unit-tests --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 18:48:43 -07:00
pakrym-oai	ed5944ba1d	Simplify MCP tool handler plumbing (#21595 ) ## Why The MCP tool path had accumulated a few core-owned special cases: a dedicated payload variant, resolver plumbing, a legacy `AfterToolUse` translation path, and a side channel for parallel-call metadata. That made `ToolRegistry` and the spec builder know more about MCP than they needed to. This change moves MCP-specific execution details back onto `ToolInfo` and `McpHandler` so `codex-core` can treat MCP calls like normal function calls while still preserving MCP-specific dispatch and telemetry behavior where it belongs. ## What changed - removed `resolve_mcp_tool_info`, `ToolPayload::Mcp`, `ToolKind`, and the remaining registry-side MCP resolver path - stored MCP routing metadata directly on `McpHandler` and `ToolInfo`, including `supports_parallel_tool_calls` - deleted the legacy `AfterToolUse` consumer in `core`, which removes the need for handler-specific `after_tool_use_payload` implementations - switched tool-result telemetry to handler-provided tags and kept MCP-specific dispatch payload construction inside the handler - simplified tool spec planning/building by passing `ToolInfo` directly and dropping the direct/deferred MCP wrapper structs and the parallel-server side table ## Testing - `cargo check -p codex-core -p codex-mcp -p codex-otel` - `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server` - `cargo test -p codex-core direct_mcp_tools_register_namespaced_handlers` - `cargo test -p codex-core search_tool_description_lists_each_mcp_source_once` - `cargo test -p codex-mcp list_all_tools_uses_startup_snapshot_while_client_is_pending` - `just fix -p codex-core -p codex-mcp -p codex-otel`	2026-05-12 00:11:31 +00:00
Matthew Zeng	e15ecc9c35	Add production startup and TTFT telemetry (#22198 ) ## Why While investigating `codex exec hi` startup latency, the useful questions were not "is startup slow?" but "which durable bucket is slow in production?" The path we observed has a few distinct stages: 1. `thread/start` creates the session 2. startup prewarm builds the turn context, tools, and prompt 3. startup prewarm warms the websocket 4. the first real turn resolves the prewarm 5. the model produces the first token Before this PR, production telemetry had some of the raw measurements already: - aggregate startup-prewarm duration / age-at-first-turn metrics - TTFT as a metric - websocket request telemetry But there was no coherent production event stream for the startup breakdown itself, and TTFT was metric-only. That made it hard to answer the same latency questions from OpenTelemetry-backed logs without adding one-off local instrumentation. ## What changed Add durable production telemetry on the existing `SessionTelemetry` path: - new `codex.startup_phase` OTel log/trace events plus `codex.startup.phase.duration_ms` - new `codex.turn_ttft` OTel log/trace events while preserving the existing TTFT metric The startup phase event is emitted for the coarse buckets we actually observed while running `exec hi`: - `thread_start_create_thread` - `startup_prewarm_total` - `startup_prewarm_create_turn_context` - `startup_prewarm_build_tools` - `startup_prewarm_build_prompt` - `startup_prewarm_websocket_warmup` - `startup_prewarm_resolve` These phases are intentionally low-cardinality so they remain safe as production telemetry tags. ## Why this shape This keeps the instrumentation on the same production path as the rest of the session telemetry instead of adding a local debug-only trace mode. 
It also avoids changing startup behavior: - prewarm still runs - no control flow changes - no extra remote calls - no user-visible behavior changes One boundary is intentional: very early process bootstrap that happens before a session exists is not included here, because this PR uses session-scoped production telemetry. The expensive buckets we were trying to understand after `thread/start` are now covered durably. ## Verification - `cargo test -p codex-otel` - `cargo test -p codex-core turn_timing` - `cargo test -p codex-core regular_turn_emits_turn_started_without_waiting_for_startup_prewarm` - `cargo test -p codex-core interrupting_regular_turn_waiting_on_startup_prewarm_emits_turn_aborted` - `cargo test -p codex-app-server thread_start` - `just fix -p codex-otel -p codex-core -p codex-app-server` I also ran `cargo test -p codex-core`; it built successfully and then hit an existing unrelated stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.	2026-05-11 23:58:36 +00:00
starr-openai	22e84c49d0	Support multi-environment apply_patch selection (#21617 ) ## Summary - add multi-environment apply_patch routing for both freeform and function-call tool flows - parse and reconcile the optional environment selector in the main apply_patch parser, then verify against the selected environment in the handler - carry environment_id through runtime and approval surfaces so remote-targeted patches stay explicit end to end ## Testing - just fmt - remote exec-server e2e: `cargo test -p codex-core --test all apply_patch_multi_environment_uses_remote_executor -- --nocapture` on dev via `scripts/test-remote-env.sh` --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 16:33:44 -07:00
Abhinav	9ab7f4e6ac	Add Windows hook command overrides (#22159 ) # Why Managed hook configs need a shared cross-platform shape without making the existing `command` field polymorphic. The common case is still one command string, with Windows needing a different entrypoint only when the runtime is actually Windows. Keeping `command` as the portable/default path and adding an optional Windows override keeps the config easier to read, preserves the existing scalar shape for non-Windows users, and avoids forcing every caller into a `{ unix, windows }` object when only one platform needs special handling. # What - Add optional `command_windows` / `commandWindows` alongside the existing hook `command` field. - Resolve `command_windows` only on Windows during hook discovery; other platforms continue to use `command` unchanged. - Keep trust hashing aligned to the effective command selected for the current runtime. # Docs The Codex hooks/config reference should document `command_windows` as the Windows-only override for command hooks.	2026-05-11 22:22:29 +00:00
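The override resolution in #22159 reduces to a small selection step, sketched here under assumptions. The function name `effective_command` and the explicit `is_windows` flag are hypothetical; real code would more likely detect the platform itself, and the flag is passed in only so the sketch can be exercised on any OS.

```rust
/// Picks the command a hook should run on the current platform.
/// `command` is the portable default; `command_windows` is the optional override.
fn effective_command<'a>(
    command: &'a str,
    command_windows: Option<&'a str>,
    is_windows: bool,
) -> &'a str {
    if is_windows {
        // The override applies only on Windows, and only when it is set.
        command_windows.unwrap_or(command)
    } else {
        command
    }
}
```

Trust hashing would then be computed over the returned value, so the hash matches the command that is actually run.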
rhan-oai	a175ddacc0	[codex-analytics] emit terminal review events (#18748 ) ## Why Review telemetry should describe reviews as first-class events, not only as counters denormalized onto terminal tool-item events. That lets us analyze guardian and user reviews consistently across command execution, file changes, permissions, and network access, while still preserving the terminal item summaries that existing tool analytics need. To make those review events accurate, analytics also needs the observed completion time for each review and enough command metadata to distinguish `shell` from `unified_exec` reviews. ## What changed - emit generic `codex_review_event` rows for completed user and guardian reviews, with review subjects, reviewer, trigger, terminal status, resolution, and observed duration - reduce approval request / response / abort facts into review events for command execution, file change, and permissions flows - keep denormalized review counts, final approval outcome, and permission-request flags on terminal tool-item events for item-associated reviews - plumb review completion timing so user-review responses and aborts use app-server-observed completion times, while guardian analytics reuse the same terminal timestamps emitted on guardian assessment events - carry command approval `source` through the protocol and app-server layers so review analytics can distinguish `shell` from `unified_exec` - add analytics coverage for user-review emission, guardian-review emission, permission reviews that should not denormalize onto tool items, item-summary isolation across threads, and the serialized review-event shape ## Verification - `cargo test -p codex-analytics` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18748). * __->__ #18748 * #21434 * #18747 * #17090 * #17089 * #20514	2026-05-11 22:13:32 +00:00
viyatb-oai	c7b55cdc46	feat: add network proxy feature flag (#20147 ) ## Why The permissions migration is making `permissions.<profile>.network.enabled` the canonical sandbox network bit, while proxy startup is a separate concern. Enabling network access should not implicitly start the proxy, and users who are still on legacy sandbox modes need a separate place to opt into proxy startup and provide proxy-specific settings. This follow-up to #19900 gives the network proxy its own feature surface instead of overloading permission-profile network semantics. ## What changed - Add an experimental `network_proxy` feature with a configurable `[features.network_proxy]` table. - Overlay `features.network_proxy` settings onto the configured proxy state after permission-profile selection, so the proxy only starts when the active `NetworkSandboxPolicy` already allows network access. - Preserve `[experimental_network]` startup behavior independently of the new feature flag. ## Behavior and examples There are now three related knobs: - `permissions.<profile>.network.enabled` controls whether the active permission profile has network access at all. - `features.network_proxy` enables proxy restrictions for an already-network-enabled profile. - Legacy `sandbox_mode` plus `[sandbox_workspace_write].network_access` still control whether legacy `workspace-write` has network access at all. 
The rule is:

- network off + proxy flag on -> network stays off, proxy is a no-op
- network on + proxy flag off -> unrestricted direct network
- network on + proxy flag on -> network stays on, with proxy restrictions applied

For permission profiles, the feature toggle adds proxy restrictions only when network access is already enabled:

```toml
default_permissions = "workspace"

[permissions.workspace.filesystem]
":minimal" = "read"

[permissions.workspace.network]
enabled = true

[features]
network_proxy = true
```

If `network.enabled = false`, the same feature flag is a no-op: network remains off and the proxy does not start. For legacy sandbox config, `network_access` remains the master switch:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
network_access = true

[features]
network_proxy = true
```

That keeps legacy `workspace-write` network access on, but routes it through the proxy policy. If `network_access = false`, the proxy feature is a no-op and legacy `workspace-write` remains offline.
The same proxy opt-in can be supplied from the CLI:

```bash
codex -c 'features.network_proxy=true'
```

Additional proxy settings can be supplied when a table is needed:

```bash
codex \
  -c 'features.network_proxy.enabled=true' \
  -c 'features.network_proxy.enable_socks5=false'
```

The intended behavior matrix is:

| Config surface | Network setting | `features.network_proxy` | Direct sandbox network | Proxy |
| --- | --- | --- | --- | --- |
| Permission profile | `network.enabled = false` | off | restricted | off |
| Permission profile | `network.enabled = false` | on | restricted | off |
| Permission profile | `network.enabled = true` | off | enabled | off |
| Permission profile | `network.enabled = true` | on | enabled | on |
| Legacy `workspace-write` | `network_access = false` | off | restricted | off |
| Legacy `workspace-write` | `network_access = false` | on | restricted | off |
| Legacy `workspace-write` | `network_access = true` | off | enabled | off |
| Legacy `workspace-write` | `network_access = true` | on | enabled | on |

`[experimental_network]` requirements remain separate from the user feature toggle and still start the proxy on their own. Relevant code: - [`features/src/feature_configs.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/features/src/feature_configs.rs#L58-L117) defines the feature-specific proxy config. - [`core/src/config/mod.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L1959-L1964) reads the feature table, and [later applies it only when network access is already enabled](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L2448-L2458).
## Verification Added focused coverage for: - keeping the proxy off when `features.network_proxy` is enabled but sandbox network access is disabled - the full permission-profile and legacy `workspace-write` matrix above - preserving `[experimental_network]` startup without the feature - reusing profile-supplied proxy settings when the feature is enabled Ran: - `cargo test -p codex-features` - `cargo test -p codex-core network_proxy_feature` - `cargo test -p codex-core experimental_network_requirements_enable_proxy_without_feature`	2026-05-11 14:12:00 -07:00
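The behavior matrix in #20147 can be condensed into one function, sketched below. `NetworkOutcome` and `resolve` are hypothetical names, not the real config types. The sketch covers only the user-facing feature flag and does not model `[experimental_network]`, which the commit message says starts the proxy on its own.

```rust
/// Hypothetical result type: what the sandbox ends up with.
#[derive(Debug, PartialEq)]
struct NetworkOutcome {
    direct_network_enabled: bool,
    proxy_on: bool,
}

/// `network_enabled` is `network.enabled` (permission profile) or
/// `network_access` (legacy workspace-write); both rows of the matrix agree.
fn resolve(network_enabled: bool, proxy_feature: bool) -> NetworkOutcome {
    NetworkOutcome {
        // The feature flag never turns network access on by itself.
        direct_network_enabled: network_enabled,
        // The proxy only starts when network access is already allowed.
        proxy_on: network_enabled && proxy_feature,
    }
}
```

Because both config surfaces feed the same two booleans, the eight table rows collapse to four distinct cases.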
Matthew Zeng	192481d1a1	[elicitation] Advertise new url elicitation capability when auth_elicitation is enabled. (#22188)

## Why

We've added support for auth elicitation behind the `auth_elicitation` flag, but to stay backward compatible, servers need to explicitly check the capability before they decide to send elicitations. This PR advertises the capability, conditioned on the flag.

## What changed

- Build `client_elicitation_capability` from the `AuthElicitation` feature state.
- Thread that capability through MCP config, session startup, and `McpConnectionManager` so RMCP initialization advertises the correct elicitation support.
- Advertise both `form` and `url` elicitation when the feature is enabled, and preserve the empty default capability when it is disabled.
- Add coverage for the feature-derived config shape and the advertised initialization payload.

## Testing

- `cargo test -p codex-mcp`
- `cargo test -p codex-core to_mcp_config_preserves_auth_elicitation_feature_from_config`
- `cargo test -p codex-core` (currently fails outside this change in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` with a stack overflow after unrelated tests have started running)	2026-05-11 12:23:55 -07:00
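The capability rule described in the elicitation commit above (both `form` and `url` when the feature is on, the empty default otherwise) can be sketched as follows. This is an illustration only: `ElicitationCapability` and `build_elicitation_capability` are hypothetical stand-ins, and the actual RMCP capability types are likely shaped differently.

```rust
/// Hypothetical stand-in for the advertised client elicitation capability.
/// `Default` yields the empty capability (nothing advertised).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct ElicitationCapability {
    pub form: bool,
    pub url: bool,
}

/// Derives the capability from the `auth_elicitation` feature state.
pub fn build_elicitation_capability(auth_elicitation_enabled: bool) -> ElicitationCapability {
    if auth_elicitation_enabled {
        // Feature on: advertise both elicitation kinds.
        ElicitationCapability { form: true, url: true }
    } else {
        // Feature off: preserve the empty default capability.
        ElicitationCapability::default()
    }
}
```

Keeping the disabled case equal to `Default` means a server that checks the capability before sending elicitations sees no change unless the flag is set.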
viyatb-oai	d0fa2d81d8	feat(connectors): support managed app tool approval requirements (#21061)

## Why

Managed requirements can already centrally disable apps, but they could not express the per-tool app approval rules that normal config already supports. That left admins without a way to enforce connector tool approvals through `/etc/codex/requirements.toml` or cloud requirements.

## What changed

- Extend app requirements with per-tool `approval_mode` entries.
- Merge managed app tool requirements across managed sources while preserving higher-precedence exact tool settings.
- Apply managed tool approvals separately from user app config so managed policy is matched only on raw MCP `tool.name`, while user config keeps the existing raw-name-then-title convenience fallback.
- Add coverage for local requirements, cloud requirements parsing, managed-over-user precedence, and a title-collision case that must not widen managed auto-approval.

## Configuration shape

Local `/etc/codex/requirements.toml` and cloud requirements use the same TOML shape:

```toml
[apps.connector_123123.tools."calendar/list_events"]
approval_mode = "approve"
```

This is a per-tool approval rule keyed by app ID and raw MCP tool name, not an app-level boolean such as `apps.connector_123123.approve = true`.	2026-05-11 19:08:26 +00:00
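The matching rule in the connectors commit above (managed policy matches only the raw MCP `tool.name`, while user config falls back from raw name to title) can be sketched like this. All names here (`ApprovalMode`, `Tool`, `managed_approval`, `user_approval`) are hypothetical, the `Prompt` variant is an assumed placeholder for any non-approve mode, and the real implementation also merges several managed sources, which this sketch omits.

```rust
use std::collections::HashMap;

/// Hypothetical approval modes; only "approve" appears in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalMode {
    Approve,
    Prompt,
}

/// Hypothetical view of an MCP tool: a raw name plus an optional display title.
pub struct Tool<'a> {
    pub name: &'a str,
    pub title: Option<&'a str>,
}

/// Managed rules match the raw tool name only, so a tool whose *title*
/// happens to equal a managed key is never auto-approved by accident.
pub fn managed_approval(
    rules: &HashMap<String, ApprovalMode>,
    tool: &Tool<'_>,
) -> Option<ApprovalMode> {
    rules.get(tool.name).copied()
}

/// User rules keep the convenience fallback: raw name first, then title.
pub fn user_approval(
    rules: &HashMap<String, ApprovalMode>,
    tool: &Tool<'_>,
) -> Option<ApprovalMode> {
    rules
        .get(tool.name)
        .copied()
        .or_else(|| tool.title.and_then(|title| rules.get(title).copied()))
}
```

The title-collision case mentioned in the commit corresponds to a tool named `calendar/delete_events` whose title is `calendar/list_events`: in this sketch `managed_approval` returns `None` for it, while `user_approval` would still match through the title fallback.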
viyatb-oai	6506765168	fix(permissions): preserve managed deny-read during escalation (#15977)

## Why

Managed filesystem `deny_read` requirements are administrator-enforced restrictions on specific paths. Once those requirements are active, Codex should not drop them just because an execution path would otherwise leave the sandbox. Before this change, an explicit escalation, a prefix-rule allow, a sandbox-denial retry, or an app-server legacy sandbox override could rebuild the runtime policy without those managed read-deny entries and expose a path the administrator had marked unreadable.

This is narrower than general sandbox-mode constraints. If an enterprise only sets `allowed_sandbox_modes`, a trusted `prefix_rule(..., decision = "allow")` can still run its matching command unsandboxed; this PR only preserves managed filesystem `deny_read` restrictions across those paths.

## What Changed

- Mark filesystem policies built from managed `deny_read` requirements so callers can tell when those deny entries must survive escalation.
- Preserve managed deny-read entries when runtime permission profiles are rebuilt through protocol, app-server, or legacy sandbox-policy compatibility paths.
- Keep managed deny-read attempts inside the selected sandbox on the first attempt and after sandbox-denial retries.
- Preserve the same behavior in the zsh-fork escalation path, including prefix-rule-driven escalation.
- Add a regression test showing the opposite case too: without managed deny-read, a prefix-rule allow still chooses unsandboxed execution.

## Verification

Targeted automated verification:

```shell
cargo test -p codex-core shell_request_escalation_execution_is_explicit -- --nocapture
cargo test -p codex-core prefix_rule_uses_unsandboxed_execution_without_managed_deny_read -- --nocapture
cargo test -p codex-core prefix_rule_preserves_managed_deny_read_escalation -- --nocapture
cargo test -p codex-protocol permission_profile_round_trip_preserves_filesystem_policy_metadata -- --nocapture
cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable -- --nocapture
cargo test -p codex-app-server-protocol permission_profile_file_system_permissions_preserves_policy_metadata -- --nocapture
cargo check -p codex-app-server -p codex-tui
```

Smoke-test invocations:

```shell
# macOS exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",":project_roots"={"."="write","secrets"="none","future-secret"="none","*/.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'

# Linux exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",glob_scan_max_depth=3,":project_roots"={"."="write","secrets"="none","future-secret"="none","*/.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'
```

Observed manual smoke matrix:

| Case | macOS Seatbelt | Linux bubblewrap |
| --- | --- | --- |
| `cat allowed.txt` | Pass | Pass |
| `cat secrets/exact-secret.txt` | Blocked | Blocked |
| `cat envs/root.env` | Blocked | Blocked |
| `cat envs/nested/one.env` | Blocked | Blocked |
| `cat envs/nested/two.env` | Blocked | Blocked |
| `cat alias-to-secrets/exact-secret.txt` | Blocked | Blocked |
| Missing denied path | A file created after sandbox setup remained unreadable | Creation was blocked by the reserved missing-path placeholder, and the placeholder was cleaned up after exit |
| Real `codex exec` shell turn | Pass | Pass |

Notes:

- The Linux smoke run used the fallback glob walker because the devbox did not have `rg` installed.
- The smoke matrix verifies the end-to-end filesystem behavior on macOS and Linux; the escalation-specific behavior is covered by the focused tests above.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Charlie Marsh <charliemarsh@openai.com>	2026-05-11 11:49:44 -07:00
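The escalation rule in the deny-read commit above can be summarized as: a command that would otherwise leave the sandbox stays sandboxed whenever managed `deny_read` entries are active. The sketch below illustrates only that decision; `Execution` and `choose_execution` are hypothetical names, and the real code additionally carries the deny entries through rebuilt permission profiles, which is not shown here.

```rust
/// Hypothetical outcome of the execution decision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Execution {
    Sandboxed,
    Unsandboxed,
}

/// `escalation_allowed` stands for any path that would normally leave the
/// sandbox (explicit escalation, prefix-rule allow, sandbox-denial retry).
/// `has_managed_deny_read` is true when administrator-enforced `deny_read`
/// entries are part of the filesystem policy.
pub fn choose_execution(escalation_allowed: bool, has_managed_deny_read: bool) -> Execution {
    if escalation_allowed && !has_managed_deny_read {
        // No managed read-deny entries to protect: the allow may run unsandboxed.
        Execution::Unsandboxed
    } else {
        // Either no escalation was granted, or managed deny-read must survive it.
        Execution::Sandboxed
    }
}
```

The two regression tests named in the commit map onto the two interesting inputs: `choose_execution(true, false)` is the prefix-rule allow without managed deny-read, and `choose_execution(true, true)` is the case where the deny entries keep the command inside the sandbox.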


2868 Commits