codex

mirror of https://github.com/openai/codex.git synced 2026-05-16 01:02:48 +00:00

Author	SHA1	Message	Date
celia-oai	eb448bed95	test	2026-05-13 11:56:49 -07:00
pakrym-oai	4454e1411b	Deprecate TurnContext cwd and resolve_path (#22519 ) ## Why `TurnContext::cwd` and `TurnContext::resolve_path` are being phased out in favor of using the selected turn environment cwd directly. Deprecating both APIs makes any new direct dependency visible while preserving the existing migration path for current callers. ## What Changed - Marked `TurnContext::cwd` and `TurnContext::resolve_path` as deprecated with guidance to use the selected turn environment cwd instead. - Added exact `#[allow(deprecated)]` suppressions at each existing direct usage site, including tests, rather than adding crate-wide suppression. - Kept the change behavior-preserving: current cwd reads, writes, and path resolution continue to use the same values. ## Verification - `just fmt` - `cargo check -p codex-core` - `cargo check -p codex-core --tests` - `git diff --check`	2026-05-13 11:15:25 -07:00
jif-oai	fc26af377f	feat: expose multi-agent v2 as model-only tools (#22514 ) ## Why `code_mode_only` filters code-mode nested tools out of the top-level tool list. For multi-agent v2, we need a rollout shape where the collaboration tools remain callable as normal model tools without also being embedded into the code-mode `exec` tool declaration. Related to this: https://openai-corpws.slack.com/archives/C0AQLHB4U75/p1778660267922549 ## What Changed - Adds `features.multi_agent_v2.non_code_mode_only`, including config resolution, profile override handling, and generated schema coverage. - Introduces `ToolExposure::DirectModelOnly` so a tool can be included in the initial model-visible list while staying out of the nested code-mode tool surface. - Applies that exposure to the multi-agent v2 tools when the new flag is set: `spawn_agent`, `send_message`, `followup_task`, `wait_agent`, `close_agent`, and `list_agents`. - Updates code-mode-only filtering so direct-model-only tools remain visible while ordinary nested code-mode tools are still hidden. ## Verification - Added config parsing/profile tests for `non_code_mode_only`. - Added tool spec coverage for the code-mode-only multi-agent v2 exposure behavior.	2026-05-13 19:49:47 +02:00
pakrym-oai	83decfa300	[codex] Remove unused legacy shell tools (#22246 ) ## Why Recent session history showed no active use of the raw `shell`, `local_shell`, or `container.exec` execution surfaces. Keeping those handlers/specs wired into core leaves duplicate shell execution paths alongside the supported `shell_command` and unified exec tools. ## What changed - Removed the raw `shell` handler/spec and its `ShellToolCallParams` protocol helper. - Removed the legacy `local_shell` and `container.exec` handler/spec plumbing while preserving persisted-history compatibility for old response items. - Normalized model/config `default` and `local` shell selections to `shell_command`. - Pruned tests that exercised removed raw-shell/local-shell/apply-patch variants and kept coverage on `shell_command`, unified exec, and freeform `apply_patch`. ## Verification - `git diff --check` - `cargo test -p codex-protocol` - `cargo test -p codex-tools` - `cargo test -p codex-core tools::handlers::shell` - `cargo test -p codex-core tools::spec` - `cargo test -p codex-core tools::router` - `cargo test -p codex-core active_call_preserves_triggering_command_context` - `cargo test -p codex-core guardian_tests` - `cargo test -p codex-core --test all shell_serialization` - `cargo test -p codex-core --test all apply_patch_cli` - `cargo test -p codex-core --test all shell_command_` - `cargo test -p codex-core --test all local_shell` - `cargo test -p codex-core --test all otel::` - `cargo test -p codex-core --test all hooks::` - `just fix -p codex-core` - `just fix -p codex-tools`	2026-05-13 16:43:25 +00:00
jif-oai	7c7b4861d8	fix: drop underscored id headers (#22193 ) ## Why Stop sending duplicate `session_id`/`thread_id` headers. We only want the hyphenated forms as `_` is rejected by some proxies Related discussion here: https://openai.slack.com/archives/C095U48JNL9/p1778508316923179 ## What - Keep `session-id` and `thread-id` - Remove the underscore aliases	2026-05-13 18:21:02 +02:00
jif-oai	fdda59c00b	Introduce tool exposure for deferred registration (#22489 ) ## Why Deferred tools were tracked with separate side-channel filtering after tool specs had already been assembled. That made the registry responsible for executing tools while the router/spec planner separately decided whether those same tools should be exposed to the model up front. This PR makes exposure part of the tool handler contract so direct versus deferred availability travels with the executable tool registration. Next step will be to simplify registration ## What Changed - Adds `ToolExposure` to `codex-tools` and exposes it through `ToolExecutor`, defaulting tools to `Direct`. - Teaches dynamic tools and MCP handlers to mark deferred tools as `Deferred` at construction time. - Renames the registry object-safe wrapper from `AnyToolHandler` to `RegisteredTool` and uses `ToolExposure` when deciding whether to include a handler's spec in the initial model-visible tool list. - Refactors tool spec planning to derive direct specs and deferred search entries from registered handlers, removing the router's special-case deferred dynamic tool filtering. ## Verification - Not run.	2026-05-13 18:16:51 +02:00
Michael Bolin	889ee018e7	config: add strict config parsing (#20559 ) ## Why Codex intentionally ignores unknown `config.toml` fields by default so older and newer config files keep working across versions. That leniency also makes typo detection hard because misspelled or misplaced keys disappear silently. This change adds an opt-in strict config mode so users and tooling can fail fast on unrecognized config fields without changing the default permissive behavior. This feature is possible because `serde_ignored` exposes the exact signal Codex needs: it lets Codex run ordinary Serde deserialization while recording fields Serde would otherwise ignore. That avoids requiring `#[serde(deny_unknown_fields)]` across every config type and keeps strict validation opt-in around the existing config model. ## What Changed ### Added strict config validation - Added `serde_ignored`-based validation for `ConfigToml` in `codex-rs/config/src/strict_config.rs`. - Combined `serde_ignored` with `serde_path_to_error` so strict mode preserves typed config error paths while also collecting fields Serde would otherwise ignore. - Added strict-mode validation for unknown `[features]` keys, including keys that would otherwise be accepted by `FeaturesToml`'s flattened boolean map. - Kept typed config errors ahead of ignored-field reporting, so malformed known fields are reported before unknown-field diagnostics. - Added source-range diagnostics for top-level and nested unknown config fields, including non-file managed preference source names. ### Kept parsing single-pass per source - Reworked file and managed-config loading so strict validation reuses the already parsed `TomlValue` for that source. - For actual config files and managed config strings, the loader now reads once, parses once, and validates that same parsed value instead of deserializing multiple times. - Validated `-c` / `--config` override layers with the same base-directory context used for normal relative-path resolution, so unknown override keys are still reported when another override contains a relative path. ### Scoped `--strict-config` to config-heavy entry points - Added support for `--strict-config` on the main config-loading entry points where it is most useful: - `codex` - `codex resume` - `codex fork` - `codex exec` - `codex review` - `codex mcp-server` - `codex app-server` when running the server itself - the standalone `codex-app-server` binary - the standalone `codex-exec` binary - Commands outside that set now reject `--strict-config` early with targeted errors instead of accepting it everywhere through shared CLI plumbing. - `codex app-server` subcommands such as `proxy`, `daemon`, and `generate-*` are intentionally excluded from the first rollout. - When app-server strict mode sees invalid config, app-server exits with the config error instead of logging a warning and continuing with defaults. - Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli` instead of extending shared `ReviewArgs`, so `--strict-config` stays on the outer config-loading command surface and does not become part of the reusable review payload used by `codex exec review`. ### Coverage - Added tests for top-level and nested unknown config fields, unknown `[features]` keys, typed-error precedence, source-location reporting, and non-file managed preference source names. - Added CLI coverage showing invalid `--enable`, invalid `--disable`, and unknown `-c` overrides still error when `--strict-config` is present, including compound-looking feature names such as `multi_agent_v2.subagent_usage_hint_text`. - Added integration coverage showing both `codex app-server --strict-config` and standalone `codex-app-server --strict-config` exit with an error for unknown config fields instead of starting with fallback defaults. - Added coverage showing unsupported command surfaces reject `--strict-config` with explicit errors. ## Example Usage Run Codex with strict config validation enabled: ```shell codex --strict-config ``` Strict config mode is also available on the supported config-heavy subcommands: ```shell codex --strict-config exec "explain this repository" codex review --strict-config --uncommitted codex mcp-server --strict-config codex app-server --strict-config --listen off codex-app-server --strict-config --listen off ``` For example, if `~/.codex/config.toml` contains a typo in a key name: ```toml model = "gpt-5" approval_polic = "on-request" ``` then `codex --strict-config` reports the misspelled key instead of silently ignoring it. The path is shortened to `~` here for readability: ```text $ codex --strict-config Error loading config.toml: ~/.codex/config.toml:2:1: unknown configuration field `approval_polic` \| 2 \| approval_polic = "on-request" \| ^^^^^^^^^^^^^^ ``` Without `--strict-config`, Codex keeps the existing permissive behavior and ignores the unknown key. Strict config mode also validates ad-hoc `-c` / `--config` overrides: ```text $ codex --strict-config -c foo=bar Error: unknown configuration field `foo` in -c/--config override $ codex --strict-config -c features.foo=true Error: unknown configuration field `features.foo` in -c/--config override ``` Invalid feature toggles are rejected too, including values that look like nested config paths: ```text $ codex --strict-config --enable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --disable does_not_exist Error: Unknown feature flag: does_not_exist $ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text ``` Unsupported commands reject the flag explicitly: ```text $ codex --strict-config cloud list Error: `--strict-config` is not supported for `codex cloud` ``` ## Verification The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid `--disable`, the compound `multi_agent_v2.subagent_usage_hint_text` case, unknown `-c` overrides, app-server strict startup failure through `codex app-server`, and rejection for unsupported commands such as `codex cloud`, `codex mcp`, `codex remote-control`, and `codex app-server proxy`. The config and config-loader tests cover unknown top-level fields, unknown nested fields, unknown `[features]` keys, source-location reporting, non-file managed config sources, and `-c` validation for keys such as `features.foo`. The app-server test suite covers standalone `codex-app-server --strict-config` startup failure for an unknown config field. ## Documentation The Codex CLI docs on developers.openai.com/codex should mention `--strict-config` as an opt-in validation mode for supported config-heavy entry points once this ships.	2026-05-13 16:08:05 +00:00
cassirer-openai	702e6a3c64	[rollout-trace] Add a trace ID to MCP calls. (#22326 ) This allows us to connect individual tool calls to the logs of the invocations.	2026-05-13 16:03:33 +00:00
Felipe Coury	3d517fbd00	feat(tui): standardize picker navigation keys (#22347 ) ## Why Picker-style UI in the TUI has accumulated a mix of hardcoded navigation keys. Some lists supported page movement, some did not; some accepted Vim-like keys, while others only accepted arrows; and tabbed or horizontally adjustable pickers had no shared keymap action for left/right movement. This PR makes picker/list navigation consistent and configurable so users can rely on the same defaults across the TUI. ## What Changed - Adds shared list keymap actions for: - vertical movement: `move_up`, `move_down` - horizontal movement: `move_left`, `move_right` - paging and jumps: `page_up`, `page_down`, `jump_top`, `jump_bottom` - Adds defaults: - Up/down: arrows, `Ctrl+P/N`, `Ctrl+K/J`, and plain `k/j` where text input is not active - Page up/down: `PageUp/PageDown` and `Ctrl+B/F` - First/last: `Home/End` - Left/right: `Left/Right` and `Ctrl+H/L` - Wires the shared list keymap through picker and list surfaces including session resume, multi-select, tabbed selection lists, settings-style lists, app-link selection, MCP elicitation, request-user-input, and the OSS selection wizard. - Keeps search behavior intact by reserving printable characters for query text in searchable pickers. - Updates keymap setup actions, config schema, snapshots, and focused coverage for the new list actions. ## How to Test 1. Start Codex from this branch and open the session picker, for example with an existing session history. 2. In the session list, verify that `Ctrl+J/K` moves the selection down/up. 3. Verify that `Ctrl+F/B` pages down/up and `Home/End` jumps to the first/last visible session. 4. Type printable search text such as `j` or `k` and confirm it updates the query instead of navigating. 5. Focus a picker control that changes values horizontally, such as a session picker toolbar control, and verify `Ctrl+H/L` changes the focused value like left/right arrows. Targeted tests run: - `cargo test -p codex-tui keymap::tests::` - `cargo test -p codex-tui keymap_setup::tests::` - `cargo test -p codex-tui horizontal_list_keys` - `cargo test -p codex-tui page_and_jump_navigation_use_list_keymap` - `cargo test -p codex-tui ctrl_h_l_move_provider_selection` - `cargo test -p codex-tui scroll_state::tests` - `cargo test -p codex-tui switching_tabs_changes_visible_items_and_clears_search` - `cargo test -p codex-tui toggle_sort_key_reloads_with_new_sort` Also ran `just write-config-schema`, `just fmt`, `just fix -p codex-tui`, `just argument-comment-lint`, and `git diff --check`. Note: `cargo test -p codex-tui` was attempted and still aborts in the pre-existing `tests::fork_last_filters_latest_session_by_cwd_unless_show_all` stack overflow, which is unrelated to this branch.	2026-05-13 15:33:27 +00:00
jif-oai	441c2f818f	fix: main (#22503 ) Fix main due to conflicting merge	2026-05-13 17:28:37 +02:00
jif-oai	34bb85519f	feat: add config-change extension contributor (#22488 ) ## Why Extensions can observe thread and turn lifecycle events today, but there was no single host-owned hook for changes to the effective thread configuration. That makes features that need to react to model, permission, or tool-suggest updates either depend on individual mutation paths or risk going stale after runtime config refreshes. This adds a typed config-change contributor so extension-owned state can stay synchronized with the effective thread config while the host remains responsible for deciding when config changed. ## What Changed - Added `ConfigContributor<C>` to `codex_extension_api`, with before/after immutable snapshots of the effective config plus session/thread extension stores. - Added registry builder/accessor support through `config_contributor` and `config_contributors`. - Emits config-change callbacks after committed updates from session settings, per-turn setting updates, and `refresh_runtime_config`. - Builds effective config snapshots only when config contributors are registered, and suppresses no-op callbacks when the before/after snapshots are equal. - Added a core session regression test that verifies contributors observe both model changes and user-layer runtime config changes, including access to session and thread extension stores. ## Validation Added `config_change_contributor_observes_effective_config_changes` in `codex-rs/core/src/session/tests.rs` to cover the new contributor path.	2026-05-13 17:13:34 +02:00
Ahmed Ibrahim	87de4e3290	Add service tier overrides to spawned agents (#22139 ) ## Why Spawned agents can already override `model` and `reasoning_effort`, but they have no equivalent way to opt into a model-supported service tier. That makes it impossible to preserve or intentionally select tiered execution behavior when delegating work to a sub-agent, even though the model catalog already advertises supported `service_tiers`. ## What changed - Add optional `service_tier` to both legacy and `MultiAgentV2` `spawn_agent` tool inputs. - Show each picker-visible model's supported service tier ids and descriptions in the `spawn_agent` tool guidance. - Resolve service tier selection after the child agent's effective model is known. - Inherit the parent tier when omitted and still supported by the final child model; otherwise clear it. - Reject explicit unsupported tier requests with a model-facing error. - Keep explicit `service_tier` usable on full-history forks, while still honoring the existing model/reasoning fork restrictions. - Hide `service_tier` alongside other spawn metadata when `hide_spawn_agent_metadata` is enabled. ## Verification Added focused coverage for: - v1/v2 `spawn_agent` schema exposure for `service_tier` - tier descriptions in spawn guidance - hidden-metadata suppression - explicit supported tier selection - explicit unknown and unsupported tier rejection - inherited tier preservation or clearing based on child-model support - full-history fork acceptance for explicit service tiers in both v1 and v2 Local Rust tests were not run in this workspace per repo guidance; the new coverage is included for CI.	2026-05-13 18:11:50 +03:00
Felipe Coury	6f77b70ff3	feat(tui): remove Zellij TUI workarounds (#22214 ) ## Why We added Zellij-specific TUI workarounds because older Zellij behavior did not work with Codex's normal terminal model: - #8555 made `tui.alternate_screen = "auto"` disable alternate screen in Zellij so transcript history stayed available. - #16578 avoided scroll-region operations in Zellij by emitting raw newlines and using a separate composer styling path. This PR removes both workarounds because the latest Zellij release tested locally (`zellij 0.44.1`) works correctly with Codex's standard TUI behavior: normal alternate-screen handling, redraw, and history insertion. ## What Changed - Removed the `InsertHistoryMode::Zellij` path and the Zellij-only newline scrollback insertion behavior. - Removed cached `is_zellij` state from the TUI and composer. - Removed Zellij-specific composer styling, the helper snapshot, and the `TerminalInfo::is_zellij()` convenience method that only served this workaround. - Changed `tui.alternate_screen = "auto"` to use alternate screen for Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still preserve the inline mode escape hatch. - Updated the generated config schema description for `tui.alternate_screen`. ## How to Test Manual smoke path used with `zellij 0.44.1`: 1. Build and run this branch inside a Zellij `0.44.1` session with default config. 2. Start Codex normally and produce enough assistant/tool output to create scrollback. 3. Confirm the transcript remains readable, the composer renders normally, and scrolling through terminal history works. 4. Resize the Zellij pane while output exists and confirm the TUI redraws without duplicated, missing, or stale rows. 5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if you want to verify the inline fallback still works. Targeted tests: - `just write-config-schema` - `just fmt` - `just fix -p codex-tui` - `cargo test -p codex-terminal-detection` - `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen` Attempted but did not complete locally: - `cargo test -p codex-tui` built and ran the new test successfully, then failed later on unrelated local failures in `status_permissions_full_disk_managed_*` and a stack overflow in `tests::fork_last_filters_latest_session_by_cwd_unless_show_all`. ## Documentation No developers.openai.com Codex documentation update is needed for this revert.	2026-05-13 12:11:15 -03:00
jif-oai	68e045a631	Make context contributors async (#22491 ) ## Summary - make ContextContributor return a boxed Send future - await context contributors during initial context assembly - update existing contributors and extension-api examples for the async contract ## Testing - cargo test -p codex-extension-api --examples - cargo test -p codex-git-attribution - cargo test -p codex-core build_initial_context_includes_git_attribution_from_extensions -- --nocapture - cargo test -p codex-core build_initial_context_omits_git_attribution_when_feature_is_disabled -- --nocapture - cargo test -p codex-core (fails in unrelated agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns stack overflow) - just fix -p codex-extension-api - just fix -p codex-git-attribution - just fix -p codex-core - cargo clippy -p codex-extension-api --examples	2026-05-13 16:43:28 +02:00
jif-oai	1dcc89f1d4	feat: move extension scope ids into ExtensionData (#22490 ) ## Summary - add a scoped level_id to ExtensionData and expose it through level_id() - remove thread_id/turn_id parameters from extension contributor inputs where the scoped ExtensionData already carries that identity - move turn-scoped extension data onto TurnContext so token usage and lifecycle contributors can share the same turn store ## Testing - cargo check -p codex-extension-api -p codex-core --tests - cargo test -p codex-extension-api - cargo test -p codex-guardian - cargo test -p codex-core --lib record_token_usage_info_notifies_extension_contributors - cargo test -p codex-core --lib submission_loop_channel_close_emits_thread_stop_lifecycle - cargo test -p codex-core --lib submission_loop_channel_close_aborts_active_turn_before_thread_stop_lifecycle - just fix -p codex-extension-api - just fix -p codex-guardian - just fix -p codex-core - just fmt ## Note - Attempted cargo test -p codex-core; it aborted in agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns with the existing stack overflow before the full suite completed.	2026-05-13 16:13:16 +02:00
jif-oai	083c1962f9	feat: add token usage contributor hook (#22485 ) ## Why Extensions need a stable place to observe token accounting after Codex folds model-provider usage into the session's cached `TokenUsageInfo`. Without a contributor hook, extension-owned features that need last-turn or cumulative token usage have to duplicate session plumbing or infer state from client-facing `TokenCount` notifications. ## What changed - Added `TokenUsageContributor` to `codex-extension-api`, passing session/thread `ExtensionData`, `ThreadId`, turn id, and the current `TokenUsageInfo`. - Added registry builder/storage support for token-usage contributors. - Invoked registered contributors from `Session::record_token_usage_info` after the session token cache is updated and before the client `TokenCount` notification is emitted. ## Testing - Added `record_token_usage_info_notifies_extension_contributors`, covering cumulative token usage updates and access to both extension stores.	2026-05-13 14:32:23 +02:00
jif-oai	fcc2a92743	fix: emit thread stop lifecycle on implicit shutdown (#22482 ) ## Why The thread lifecycle contributor hooks from #22476 should observe every session teardown. The explicit `Op::Shutdown` path already emitted `on_thread_stop`, but when `submission_loop` exited because its submission channel closed, it only tore down runtime services. That meant extensions could miss the thread-stop lifecycle signal on implicit runtime shutdown. ## What Changed - Split shared runtime teardown into `shutdown_runtime_services(...)`. - Split thread-stop lifecycle emission into `emit_thread_stop_lifecycle(...)`. - Reused those helpers from both explicit shutdown and the channel-close shutdown path. - Tracked whether `Op::Shutdown` was received so the explicit path does not double-emit lifecycle events after it exits the loop. - Added a regression test that closes the submission channel and asserts `ThreadLifecycleContributor::on_thread_stop` runs once with the expected thread/session stores. ## Testing - `cargo test -p codex-core submission_loop_channel_close_emits_thread_stop_lifecycle`	2026-05-13 14:19:57 +02:00
jif-oai	27e67a8c2a	feat: add turn lifecycle contributors (#22480 ) ## Why Extensions can already contribute prompt, tool, turn-item, and thread-lifecycle behavior, but there was no explicit host-owned hook for per-turn setup and cleanup. That makes extension-private turn state awkward: an extension either has to stash it outside the turn lifecycle or depend on core runtime objects. This adds a small turn lifecycle boundary. Extensions receive stable identifiers plus the existing session, thread, and turn `ExtensionData` stores, while core keeps owning task scheduling, cancellation, and turn teardown. ## What Changed - Added `TurnLifecycleContributor` with `on_turn_start`, `on_turn_stop`, and `on_turn_abort` callbacks in `codex-rs/ext/extension-api`. - Added typed `TurnStartInput`, `TurnStopInput`, and `TurnAbortInput` payloads that expose `thread_id`, `turn_id`, `session_store`, `thread_store`, and `turn_store`. - Registered and re-exported turn lifecycle contributors through `ExtensionRegistry` and `ExtensionRegistryBuilder`. - Wired `Session` to emit turn start, stop, and abort callbacks from the existing turn/task lifecycle paths. - Carried the turn-scoped `ExtensionData` through `RunningTask` and `RemovedTask` so stop/abort callbacks receive the same turn store created at turn start. ## Verification - Not run locally.	2026-05-13 13:47:27 +02:00
jif-oai	5ab7e6b4c6	feat: add thread lifecycle contributor hooks (#22476 ) ## Why Extensions that need thread-scoped state currently only get a start-time callback. That is enough for seeding stores, but it leaves the host without a shared extension seam for later thread rehydrate and flush work as thread ownership evolves. This PR turns that start-only seam into a host-owned thread lifecycle contributor contract so extension-private state can stay behind the extension API instead of leaking extra orchestration through core. ## What changed - Replaced `ThreadStartContributor` with `ThreadLifecycleContributor` and added typed lifecycle inputs for thread start, resume, and stop. The contract lives in [`contributors/thread_lifecycle.rs`](`d0e9211f70/codex-rs/ext/extension-api/src/contributors/thread_lifecycle.rs (L1-L64)`). - Kept the existing start-time behavior intact by routing session construction through `on_thread_start`. - Invoked `on_thread_stop` during session shutdown before thread-scoped extension state is dropped, while isolating contributor failures behind warning logs. - Migrated `git-attribution` and `guardian` onto the lifecycle registration path. - Renamed the extension registry plumbing from start-specific contributors to lifecycle-specific contributors. ## Notes `on_thread_resume` is introduced at the API boundary here so extensions can target the final lifecycle shape; host resume dispatch can be wired where that runtime path is finalized.	2026-05-13 13:11:30 +02:00
jif-oai	9c5dfa7b1a	Refactor extension tools onto shared ToolExecutor (#22369 ) ## Why Extension tools were split across two public runtime contracts: `codex-tool-api` exposed `ToolBundle` plus its own call/spec/error types, while core native tools used `codex_tools::ToolExecutor`. That made contributed tool specs and execution behavior easy to drift apart and added another crate boundary for what should be one executable-tool seam. This PR makes `ToolExecutor` the single runtime contract and keeps extension-specific pinning in `codex-extension-api`. ## Remaining todo https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5 Either generic `Invocation` or sub-extract the `ToolCall` and clean `ToolInvocation` ## What changed - Removed the `codex-tool-api` workspace crate and its dependencies from core and `codex-extension-api`. - Made `codex_tools::ToolExecutor` object-safe with `async_trait` so extension contributors can return a dyn executor. - Added the extension-facing aliases under `ext/extension-api/src/contributors/tools.rs`, including `ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output = ExtensionToolOutput>`. - Changed `ToolContributor::tools` to return extension executors directly instead of `ToolBundle`s. - Updated core’s extension tool handler/registry/router path to adapt those extension executors into the existing native `ToolInvocation` runtime path. - Added focused coverage for extension tools being registered, model-visible, dispatchable, and not replacing built-in tools. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-extension-api`	2026-05-13 12:12:06 +02:00
jif-oai	1824685a00	feat: extract shared tool executor interface (#22359 ) ## Why Codex still models model-visible tools and executable behavior largely inside `codex-core`, which makes it harder to evolve the tool system toward a single reusable abstraction for built-ins, MCP-backed tools, dynamic tools, and later tools injected from outside core. This PR takes the next incremental step in that direction by moving the common execution-facing pieces out of core and separating them from core-only orchestration. The intent is to let shared tool abstractions improve in one place, while `codex-core` keeps the parts that are still inherently host-specific today, such as `ToolInvocation`, dispatch wiring, and hook integration. This PR is mostly moving things around. The only interesting piece is this abstraction: https://github.com/openai/codex/pull/22359/changes#diff-81af519002548ba51ed102bdaaf77e081d40a1e73a6e5f9b104bbbc96a6f1b3dR13 ## What changed - Added `codex_tools::ToolExecutor<Invocation>` as the shared execution trait for model-visible tools. - Moved the reusable execution support types from `codex-core` into `codex-tools`: - `FunctionCallError` - `ToolPayload` - `ToolOutput` - Refactored core tool implementations so that execution behavior lives on `ToolExecutor<ToolInvocation>`, while `ToolHandler` remains the core-local extension point for hook payloads, telemetry tags, diff consumers, and other orchestration concerns. - Kept the registry and dispatch flow behaviorally unchanged while making the shared/extracted boundary explicit across built-in, MCP, dynamic, extension-backed, shell, and multi-agent tool handlers. ## Verification - `cargo test -p codex-tools` - `just fix -p codex-tools` - `just fix -p codex-core` - `cargo test -p codex-core` progressed through the updated tool surfaces and then hit the existing unrelated multi-agent stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.	2026-05-13 11:31:27 +02:00
jif-oai	7e97da7c13	chore: Keep view_image sandbox test in temp dir (#22355 ) ## Summary - move the `view_image` sandbox filesystem-read unit test onto a temporary cwd - keep the turn cwd and selected turn environment cwd aligned inside the test - avoid leaving `core/image.png` behind in the repo checkout after the test runs ## Root cause The test wrote `image.png` beneath `turn.cwd`, and the shared session test helper defaults that cwd to the current repo directory when no override is provided. ## Validation - `just fmt` - `cargo test -p codex-core tools::handlers::view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads`	2026-05-13 10:39:07 +02:00
Abhinav	392e94e9ea	add --dangerously-bypass-hook-trust CLI flag (#21768 ) # Why Hook trust happens through the TUI in `/hooks` so it can block non-interactive use cases. This flag will allow users that are using codex headlessly to bypass hooks when they want to. # What This adds one invocation-scoped escape hatch. - the CLI flag sets a runtime-only `bypass_hook_trust` override; there is no durable `config.toml` setting - hook discovery still respects normal enablement, so explicitly disabled hooks remain disabled - we show a `--dangerously-bypass-hook-trust is enabled. Enabled hooks may run without review for this invocation.` message on startup so accidental use is visible in both interactive and exec flows This keeps “enabled” and “trusted” as separate concepts in the normal path, while giving CI/E2E callers a stable way to opt into the exceptional path when they already control the hook set.	2026-05-13 07:13:57 +00:00
Abhinav	934a40c7d9	Use root repo hooks in linked worktrees (#21969 ) # Why Linked worktrees currently load their own project hook declarations, so the same repo can present different hook definitions depending on which checkout is active. https://github.com/openai/codex/pull/21762 tried to share trust by giving matching worktree hooks a shared synthetic key, but review pointed out that divergent worktree hook definitions would then fight over one `trusted_hash`. Instead of introducing a second trust model, this makes linked worktrees use the root checkout as the single source of truth for project hook declarations. Worktree-local project config can still diverge for unrelated settings, but project hooks now keep one real source path and one trust state per repo. # What - Teach project config loading to remember the matching root-checkout `.codex/` folder for actual linked-worktree project layers. - Keep ordinary project config sourced from the worktree, but replace project hook declarations with the root checkout's matching layer before hook discovery runs, including linked-worktree layers with `.codex/` but no local `config.toml`. - Make hook discovery use that authoritative hook folder for both `hooks.json` and TOML hook source paths, so linked worktrees produce the same hook key and trust state as the root checkout. - Cover the linked-worktree path plus regressions for missing worktree `config.toml` and nested non-worktree project roots.	2026-05-13 06:58:58 +00:00
sayan-oai	2304ec45ca	Remove unavailable MCP placeholder tool backfill (#22439 ) ## Why `UnavailableDummyTools` kept synthetic placeholder tools alive for historical tool calls whose backing MCP tool was no longer available. That path adds stale model-visible tool specs and special routing at the point where unavailable MCP calls should use ordinary current-tool handling. This removes the runtime backfill instead of preserving a second compatibility lane. ## Is it safe to remove? The unavailable tools were added in #17853 after a CS issue when a previously-called MCP tool failed to load and was omitted from the CS spec. Now that we have tool search, I think this is resolved: - API merges tools from previous TST output into effective tool set so theyre always in CS spec - if an MCP tool surfaced by TST later becomes unavailable, the model can still call it and it will just return model-visible error - both TST output and function call output are dropped on compaction so model will not remember old calls to MCP post compaction ## What changed - Delete unavailable-tool collection, placeholder handler, router/spec plumbing, and obsolete placeholder coverage. - Keep `features.unavailable_dummy_tools` as a removed no-op feature tombstone so existing configs still parse cleanly. - Add an integration-style `tool_search` regression test showing that a deferred MCP tool surfaced through `tool_search` still routes through MCP and returns a model-visible tool-call error rather than `unsupported call`. ## Verification - `cargo test -p codex-core tool_search`	2026-05-12 23:30:13 -07:00
pakrym-oai	104fc14956	Encapsulate tool search entries in handlers (#22261 ) ## Why This builds on the handler-owned spec refactor by moving deferred tool-search metadata to the same handlers that already own tool specs. The registry builder no longer needs a separate prebuilt `tool_search_entries` path; it can collect searchable entries from deferred handlers directly. ## What changed - Added `search_info()` to tool handlers and implemented it for MCP and dynamic handlers. - Reused handler `spec()` output when constructing tool-search entries, adapting it into the deferred `LoadableToolSpec` shape expected by `tool_search`. - Simplified `build_tool_registry_builder(...)` so `tool_search` registration is based on deferred handlers with search info. - Removed the old standalone search-entry builders and now-unused `codex-tools` discovery helper exports. ## Verification - `cargo test -p codex-core tools::handlers::tool_search::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests::search_tool -- --nocapture` - `cargo test -p codex-core tools::spec::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture` - `cargo test -p codex-tools` - `just fix -p codex-core` - `just fix -p codex-tools`	2026-05-12 20:48:02 -07:00
pakrym-oai	67c8486462	tools: infer code-mode namespace descriptions from specs (#22406 ) ## Why Code mode already builds the merged nested `ToolSpec`s that feed the `exec` prompt. Keeping a separate `tool_namespaces` map in the planning path duplicated that metadata and left extra wrapper plumbing in `spec.rs`. ## What changed - derive code-mode namespace descriptions from the merged `ToolSpec::Namespace` entries before building the code-mode handlers - extract `build_code_mode_handlers(...)` so the code-mode-specific planning stays in one place - remove `tool_namespaces` from `ToolRegistryBuildParams` - delete the now-unused `McpToolPlanInputs` wrapper and related test helper plumbing ## Testing - `cargo test -p codex-core spec_plan`	2026-05-12 20:47:50 -07:00
pakrym-oai	96833c5b15	Remove CODEX_RS_SSE_FIXTURE test hook (#22413 ) ## Why `CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests bypass the normal Responses transport by reading SSE from local files. That kept test-only behavior wired through production client code. The affected tests can stay hermetic by using the existing `core_test_support::responses` mock server and passing `openai_base_url` instead. ## What Changed - Removed the `CODEX_RS_SSE_FIXTURE` flag, `codex_api::stream_from_fixture`, the `env-flags` dependency, and the checked-in SSE fixture files. - Repointed the affected core, exec, and TUI tests at `MockServer` with the existing SSE event constructors. - Removed the Bazel test data plumbing for the deleted fixtures and refreshed cargo/Bazel lock state. ## Verification - `cargo build -p codex-cli` - `cargo test -p codex-api` - `cargo test -p codex-core --test all responses_api_stream_cli` - `cargo test -p codex-core --test all integration_creates_and_checks_session_file` - `cargo test -p codex-exec --test all ephemeral` - `cargo test -p codex-exec --test all resume` - `cargo test -p codex-tui --test all resume_startup_does_not_consume_model_availability_nux_count` - `just bazel-lock-update` - `just bazel-lock-check` - `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui` - `git diff --check`	2026-05-13 03:08:01 +00:00
Andrei Eternal	913aad4d3c	Add allow_managed_hooks_only hook requirement (#20319 ) ## Why Enterprise-managed hook policy needs a narrow way to require Codex to ignore user-controlled lifecycle hooks without adopting the broader trust-precedence model from earlier hook work. This keeps the policy anchored in `requirements.toml`, so admins can opt into managed hooks only while normal `config.toml` files cannot enable the restriction themselves. ## What changed - Added `allow_managed_hooks_only` to the requirements data flow and preserved explicit `false` values. - Also adds it to /debug-config - Marked MDM, system, and legacy managed config layers as managed for hook discovery. - Updated hook discovery so `allow_managed_hooks_only = true`: - keeps managed requirements hooks and managed config-layer hooks, - skips user/project/session `hooks.json` and `[hooks]` entries with concise startup warnings, - skips current unmanaged plugin hooks, - ignores any `allow_managed_hooks_only` key placed in ordinary `config.toml` layers.	2026-05-12 19:05:25 -07:00
Andrei Eternal	fbfbfe5fc5	hooks: use new session IDs instead of thread IDs for hooks, apply parent's session ID to subagents' hooks (#22268 ) ## Why hook semantics treat `session_id` as shared across a root session and its subagents. Codex hooks were still emitting the current thread ID, which made spawned agents look like independent sessions and made it harder for hook integrations to correlate work across a root thread and its spawned helpers This change makes hooks use Codex's existing shared session identity so hook `session_id` matches the root-thread session across spawned subagents. ## What Changed - switch hook payloads to use the existing shared session identity from core instead of the current thread ID - cover all hook surfaces that expose `session_id`, including `SessionStart`, tool hooks, compact hooks, prompt-submit hooks, stop hooks, and legacy after-agent dispatch	2026-05-12 19:05:10 -07:00
Celia Chen	e2eb7c30fe	feat: route guardian review model selection through providers (#22258 ) ## Why Guardian review selection was hard-coded in `core`, which worked for the default OpenAI path but did not give provider implementations a way to choose backend-specific reviewer model IDs. That matters for Amazon Bedrock: guardian review should run through the Bedrock/Mantle provider using Bedrock's `openai.gpt-5.4` model ID, instead of accidentally selecting a reviewer model that implies the OpenAI backend. ## What Changed - Added provider-owned approval review model selection via `ModelProvider::approval_review_model_selection`. - Moved the existing default selection policy into the provider abstraction: prefer the requested reviewer model when it is available, otherwise fall back to the active turn model, preferring `Low` reasoning when supported. - Added an Amazon Bedrock override that pins guardian review to `openai.gpt-5.4` with `Low` reasoning.	2026-05-13 01:55:46 +00:00
xl-openai	d1430fd61e	feat: Expose plugin versions and gate plugin sharing (#22397 ) - Adds localVersion to plugin summaries and remoteVersion to share context, including generated API schemas. - Hydrates local and remote plugin versions from manifests and remote release metadata. - Adds default-on plugin_sharing gate for shared-with-me listing and plugin/share/save, with disabled-path errors and focused coverage.	2026-05-12 17:56:30 -07:00
sayan-oai	1ae9867296	[codex] Remove tool search bucket limit override (#22381 ) ## Why `tool_search` still carries the server-specific result-cap path added in #17684 for `computer-use`: when the model omitted `limit`, a matching result expanded the search to 20 and then `limit_results_by_bucket` applied per-bucket caps. That makes default result handling depend on a one-off server exception instead of the single `TOOL_SEARCH_DEFAULT_LIMIT` path. This PR removes that custom branch so omitted `limit` values use the ordinary global default consistently. The implementation being retired is the pre-change bucketed search path in [`tool_search.rs`](`5e3ee5eddf/codex-rs/core/src/tools/handlers/tool_search.rs (L121-L190)`). ## What changed - Collapse `ToolSearchHandler::search` back to one BM25 search with the resolved limit. - Remove `limit_results_by_bucket`, the `computer-use` constants, and the omitted-limit plumbing that only existed for the override. - Drop dead `ToolSearchEntry::limit_bucket` metadata from deferred MCP and dynamic search entries. - Remove tests and helpers that only asserted the deleted override behavior. - Add direct handler-level unit coverage for omitted/default and explicit `tool_search` result limits. ## Validation - `cargo test -p codex-core tool_search` - The matching unit tests passed, including the new omitted/default and explicit result-limit coverage. - The broader `--test all` search-tool fixture phase then failed before sending mocked response requests in `tool_search_indexes_only_enabled_non_app_mcp_tools` and `tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`. - `cargo test -p codex-core` - The touched tool-search coverage passed before the run later aborted in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` with a stack overflow.	2026-05-13 00:46:07 +00:00
Tom	c51c65ad09	Unify thread metadata updates above store (#22236 ) - make ThreadStore::update_thread_metadata accept a broad range of metadata patches - keep ThreadStore::append_items as raw canonical history append (no metadata side effects) - in the local store, write these metadata updates to a combination of sqlite and rollout jsonl files for backwards-compat. It special cases which fields need to go into jsonl vs sqlite vs whatever, confining the awkwardness to just this implementation - in remote stores we can simply persist the metadata directly to a database, no special casing required. - move the "implicit metadata updates triggered by appending rollout items" from the RolloutRecorder (which is local-threadstore-specific) to the LiveThread layer above the ThreadStore, inside of a private helper utility called ThreadMetadataSync. LiveThread calls ThreadStore append_items and update_metadata separately. - Add a generic update metadata method to ThreadManager that works on both live threads and "cold" threads - Call that ThreadManager method from app server code, so app server doesn't need to worry about whether the thread is live or not	2026-05-13 00:28:15 +00:00
pakrym-oai	f11ad1eacb	[codex] Add search term coverage for tool_search (#22398 ) ## Why `tool_search` already had solid end-to-end coverage for discovery and follow-up execution, but it did not prove that distinct pieces of indexed search text actually work in integration. In particular, we were not exercising whether unique tool names, descriptions, namespaces, underscore-expanded dynamic names, and schema-property terms were sufficient to surface the expected deferred tools. This change adds focused integration coverage for those term sources so regressions in search text construction are caught by a real `TestCodex` flow instead of only by lower-level unit tests. ## What changed - added a small helper in `core/tests/suite/search_tool.rs` to assert that a `tool_search_output` contains an expected namespace child tool - added an MCP integration test that issues several `tool_search_call`s and verifies distinct query terms match the expected app tools: - exact tool name: `calendar_timezone_option_99` - tool description phrase: `uploaded document` - top-level schema property: `starts_at` - added a dynamic-tool integration test that verifies distinct query terms match the expected deferred dynamic tool: - exact name: `quasar_ping_beacon` - underscore-expanded name: `quasar ping beacon` - description phrase: `saffron metronome` - namespace: `orbit_ops` - schema property: `chrono_spec` ## Validation - `cargo test -p codex-core tool_search_matches_` ## Docs No documentation update needed.	2026-05-13 00:24:07 +00:00
Michael Bolin	9e7cdbd0d2	core: box multi-agent handler futures (#22266 ) ## Why This is the base PR in the split stack for the permissions migration. It isolates stack-safety work that had been mixed into the larger permissions PR, so reviewers can evaluate the async-future changes separately from the permissions model changes in #22267. The main risk this addresses is large or recursive multi-agent futures overflowing smaller runner stacks. A follow-up review also called out that `shutdown_live_agent` must remain quiescent: callers should not remove a live agent from tracking or release its spawn slot until the worker loop has actually terminated. ## What Changed - Boxes the large async futures in the multi-agent spawn, resume, and close tool handlers. - Boxes the `AgentControl` spawn and recursive close/shutdown paths that can otherwise build very deep futures. - Keeps `shutdown_live_agent` waiting for thread termination before removing/releasing the live agent, preserving the previous shutdown ordering while still boxing the recursive close path. ## Verification Strategy The focused local coverage was `cargo test -p codex-core multi_agents`, which exercises the multi-agent spawn/resume/close handlers, cascade close/resume behavior, and the shutdown path touched by this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22266). * #22330 * #22329 * #22328 * #22327 * __->__ #22266	2026-05-12 17:22:25 -07:00
pakrym-oai	0173f71143	Refactor namespaced tool spec registration (#22256 ) ## Summary This refactor makes tool handlers the owner of the specs they can publish, so registry construction can register handlers once and separately publish only the specs that should be model-visible. The main motivation is deferred tools: MCP and dynamic tools still need handlers registered up front, but deferred tools should be discoverable through `tool_search` rather than emitted in the initial tool spec list. ## What changed - `McpHandler` and `DynamicToolHandler` can return their own `ToolSpec`. - `build_tool_registry_builder` now collects handlers, registers them through the no-spec path, and publishes only non-deferred handler specs. - Deferred MCP and dynamic tool names are combined into one `all_deferred_tools` set that drives spec filtering, code-mode deferred-tool signaling, and `tool_search` registration. - `tool_search` registration now requires both deferred tools and `namespace_tools`. - Namespace specs are merged in `spec_plan`, preserving top-level spec order, sorting tools within each namespace, and backfilling empty namespace descriptions. - Hosted web search and image-generation specs are included in the collected spec vector before namespace merge/publication, and tool-name tests that should not care about hosted relative order now compare sets. ## Testing - `cargo test -p codex-core tools::spec::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture` - `cargo test -p codex-core tools::router::tests::specs_filter_deferred_dynamic_tools -- --nocapture` - `cargo test -p codex-core suite::prompt_caching::prompt_tools_are_consistent_across_requests -- --nocapture` - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-core -- --skip tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` passed the library suite after skipping the known stack-overflowing unit test. Full `cargo test -p codex-core` currently hits a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`; the same focused test reproduces on `origin/main`.	2026-05-12 17:09:14 -07:00
pakrym-oai	960d42ddae	code-mode: carry nested tool kind through runtime (#22377 ) ## Why Code mode only used nested spec lookup at execution time to rediscover whether a nested tool should be invoked as a function tool or a freeform tool. That information is already present in the enabled tool metadata that code mode builds to expose `tools.*` and `ALL_TOOLS`, so re-looking it up from the router was redundant and kept execution coupled to a separate spec lookup path. ## What Changed - thread `CodeModeToolKind` through the code-mode runtime `ToolCall` event and `CodeModeNestedToolCall` - emit the nested tool kind directly from the V8 callback using the already-enabled tool metadata - build nested tool payloads from the propagated kind instead of calling `find_spec` - remove the now-unused `find_spec` plumbing from the router and parallel runtime helpers - add unit coverage for function vs freeform payload shaping and update affected router tests ## Testing - `cargo test -p codex-code-mode` - `cargo test -p codex-core code_mode::tests` - `cargo test -p codex-core extension_tool_bundles_are_model_visible_and_dispatchable` - `cargo test -p codex-core model_visible_specs_filter_deferred_dynamic_tools`	2026-05-12 23:34:37 +00:00
Dylan Hurd	8123bddb16	chore(config) include_collaboration_mode_instructions (#22383 ) ## Summary Adds include_collaboration_mode_instructions, which is a config equivalent to include_permissions_instructions for collaboration modes. Desired for situations where we want to disable this instruction from entering the context ## Testing - [x] Added unit test	2026-05-12 15:50:10 -07:00
pakrym-oai	862b2122ee	tools: remove is_mutating dispatch gating (#22382 ) ## Why Tool dispatch had two serialization mechanisms: - `supports_parallel_tool_calls` decides whether a tool participates in the shared parallel-execution lock. - `is_mutating` separately gated some calls inside dispatch. That second hook no longer carried its weight. The remaining parallel-support flag is already the per-tool concurrency policy, so keeping a second mutating gate made dispatch harder to follow and left behind extra session plumbing that only existed for that path. ## What changed - Removed `is_mutating` from tool handlers and deleted the `tool_call_gate` path that existed only to support it. - Simplified dispatch and routing to rely on the existing per-tool `supports_parallel_tool_calls` boolean. - Dropped the now-unused handler overrides and related session/test scaffolding. - Kept the router/parallel tests focused on the surviving per-tool behavior. - Removed the unused `codex-utils-readiness` dependency from `codex-core` as a follow-up fix for `cargo shear`. ## Testing - `cargo test -p codex-core parallel_support_does_not_match_namespaced_local_tool_names` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel`	2026-05-12 22:44:54 +00:00
Chris Bookholt	5e3ee5eddf	[codex] Tighten unified exec sandbox setup (#22207 ) ## Summary - tighten unified exec sandbox initialization - preserve the requested process workdir independently from sandbox setup - add regression coverage for the updated invariant ## Validation - Ran `/tmp/cargo-tools/bin/just fmt`. - Ran the targeted `codex-core` regression test successfully. - Ran `cargo test -p codex-core`; it did not complete cleanly because unrelated existing agent/config-loader tests failed and the run later aborted on a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`. Co-authored-by: Codex <noreply@openai.com>	2026-05-12 08:41:00 -07:00
Felipe Coury	95b332c820	feat(tui): add ambient terminal pets (#21206 ) ## Why The Codex App has animated pets, but the TUI had no equivalent ambient companion surface. This brings that experience into terminal Codex while keeping the main chat flow usable: the pet should feel present, but it cannot cover transcript text, composer input, approvals, or picker content. The feature also needs to be terminal-aware. Different terminals support different image protocols, tmux can interfere with image rendering, and some users will want pets disabled entirely or anchored differently depending on their layout. <table> <tr><td> <img width="4110" height="2584" alt="CleanShot 2026-05-05 at 12 41 45@2x" src="https://github.com/user-attachments/assets/68a1fcbc-2104-48d6-b834-69c6aaa95cdf" /> <p align="center">macOS - Ghostty, iTerm2 and WezTerm with Custom Pet</p> </td></tr> <tr><td> ![Uploading CleanShot 2026-05-10 at 20.28.30.png…]() <p align="center">Windows Terminal</p> </td></tr> <tr><td> <img width="3902" height="2752" alt="CleanShot 2026-05-05 at 12 39 02@2x" src="https://github.com/user-attachments/assets/300e2931-6b00-467e-91cb-ab8e28470500" /> <p align="center">Linux - WezTerm and Ghostty</p> </td></tr> </table> ## What Changed - Add a TUI ambient pet renderer in `codex-rs/tui/src/pets/`. - Port the app-style pet animation states so the sprite changes with task status, waiting-for-input states, review/ready states, and failures. - Add `/pets` selection UI with a preview pane, loading state, built-in pet choices, and a first-row `Disable terminal pets` option. - Download built-in pet spritesheets on demand from the same public CDN path already used by Android, under `https://persistent.oaistatic.com/codex/pets/v1/...`, and cache them locally under `~/.codex/cache/tui-pets/`. - Keep custom pets local. - Add config support for pet selection, disabling pets, and choosing whether the pet follows the composer bottom or anchors to the terminal bottom. - Reserve layout space around the pet so transcript wrapping, live responses, and composer input do not render underneath the sprite. - Gate image rendering by terminal capability, disable image pets under tmux, and support both Kitty Graphics and SIXEL terminals. - Add redraw cleanup for terminal image artifacts, including sixel cell clearing. ## Current Scope - This is an initial TUI version of ambient pets, not full App parity. - It focuses on ambient sprite rendering, `/pets` selection, custom pets, terminal capability gating, and on-demand CDN-backed built-in assets. - The ambient text overlay is currently disabled, so the TUI renders the pet sprite without extra status text beside it. ## How to Test 1. Start Codex TUI in a terminal with image support. 2. Run `/pets`. 3. Confirm the picker shows built-in pets plus custom pets, and the first item is `Disable terminal pets`. 4. On a fresh `~/.codex/cache/tui-pets/`, move onto a built-in pet and confirm the first preview downloads the spritesheet from the shared Codex pets CDN and renders successfully. 5. Move through the pet list and confirm subsequent built-in previews use the local cache. 6. Select a pet, then send and receive messages. Confirm transcript and composer text wrap before the pet instead of rendering underneath the sprite. 7. Change the pet anchor setting and confirm the pet can either follow the composer bottom or sit at the terminal bottom. 8. Return to `/pets`, choose `Disable terminal pets`, and confirm the sprite disappears cleanly. Targeted tests: - `cargo test -p codex-tui ambient_pet_` - `cargo test -p codex-tui resize_reflow_wraps_transcript_early_when_pet_is_enabled` - `cargo insta pending-snapshots`	2026-05-12 10:43:17 -03:00
cassirer-openai	cb55b769d1	[rollout-trace] Add x-codex-inference-call-id header to inference calls. (#22311 ) This allows us to attach call logs to inference requests in traces.	2026-05-12 05:55:11 -07:00
jif-oai	d996f5366f	feat: guardian as an extension (contributors part) (#22216 ) Part 1 of guardian as extension. This bind all the logic to spawn another agent from an extension and it adds `ThreadId` in the start thread collaborator	2026-05-12 14:41:45 +02:00
viyatb-oai	46f30d0282	feat(sandbox): add Windows deny-read parity (#18202 ) ## Why The split filesystem policy stack already supports exact and glob `access = none` read restrictions on macOS and Linux. Windows still needed subprocess handling for those deny-read policies without claiming enforcement from a backend that cannot provide it. ## Key finding The unelevated restricted-token backend cannot safely enforce deny-read overlays. Its `WRITE_RESTRICTED` token model is authoritative for write checks, not read denials, so this PR intentionally fails that backend closed when deny-read overrides are present instead of claiming unsupported enforcement. ## What changed This PR adds the Windows deny-read enforcement layer and makes the backend split explicit: - Resolves Windows deny-read filesystem policy entries into concrete ACL targets. - Preserves exact missing paths so they can be materialized and denied before an enforceable sandboxed process starts. - Snapshot-expands existing glob matches into ACL targets for Windows subprocess enforcement. - Honors `glob_scan_max_depth` when expanding Windows deny-read globs. - Plans both the configured lexical path and the canonical target for existing paths so reparse-point aliases are covered. - Threads deny-read overrides through the elevated/logon-user Windows sandbox backend and unified exec. - Applies elevated deny-read ACLs synchronously before command launch rather than delegating them to the background read-grant helper. - Reconciles persistent deny-read ACEs per sandbox principal so policy changes do not leave stale deny-read ACLs behind. - Fails closed on the unelevated restricted-token backend when deny-read overrides are present, because its `WRITE_RESTRICTED` token model is not authoritative for read denials. ## Landed prerequisites These prerequisite PRs are already on `main`: 1. #15979 `feat(permissions): add glob deny-read policy support` 2. #18096 `feat(sandbox): add glob deny-read platform enforcement` 3. #17740 `feat(config): support managed deny-read requirements` This PR targets `main` directly and contains only the Windows deny-read enforcement layer. ## Implementation notes - Exact deny-read paths remain enforceable on the elevated path even when they do not exist yet: Windows materializes the missing path before applying the deny ACE, so the sandboxed command cannot create and read it during the same run. - Existing exact deny paths are preserved lexically until the ACL planner, which then adds the canonical target as a second ACL target when needed. That keeps both the configured alias and the resolved object covered. - Windows ACLs do not consume Codex glob syntax directly, so glob deny-read entries are expanded to the concrete matches that exist before process launch. - Glob traversal deduplicates directory visits within each pattern walk to avoid cycles, without collapsing distinct lexical roots that happen to resolve to the same target. - Persistent deny-read ACL state is keyed by sandbox principal SID, so cleanup only removes ACEs owned by the same backend principal. - Deny-read ACEs are fail-closed on the elevated path: setup aborts if mandatory deny-read ACL application fails. - Unelevated restricted-token sessions reject deny-read overrides early instead of running with a silently unenforceable read policy. ## Verification - `cargo test -p codex-core windows_restricted_token_rejects_unreadable_split_carveouts` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-windows-sandbox` - GitHub Actions rerun is in progress on the pushed head. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 23:04:28 -07:00
pakrym-oai	c9e46ed639	[codex] Make handlers own parallel tool support (#22254 ) ## Why `ToolRouter::tool_supports_parallel()` was still consulting configured specs when a handler lookup missed, even though parallel schedulability is really a property of the executable handler. Keeping that metadata on `ConfiguredToolSpec` duplicated state between the model-visible spec layer and the runtime handler layer. This change makes handlers the sole source of truth for parallel tool support and removes the extra spec wrapper that only existed to carry duplicated metadata. ## What changed - removed `ConfiguredToolSpec` and store plain `ToolSpec` values in the registry/router builder path - changed `ToolRouter::tool_supports_parallel()` to consult only the handler registry and fall back to `false` - simplified spec collection and test helpers to operate directly on `ToolSpec` - updated router/spec tests to cover handler-owned parallel behavior and the no-handler fallback ## Validation - `cargo test -p codex-tools` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core deferred_responses_api_tool_serializes_with_defer_loading` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel` - `cargo test -p codex-core request_plugin_install_can_be_registered_without_search_tool` ## Docs No documentation updates needed.	2026-05-11 22:26:33 -07:00
pakrym-oai	79c65f816c	[codex] Filter legacy warning messages during compaction (#22243 ) ## Why Older sessions can contain model-warning records persisted as `user` messages, including the unified exec process-limit warning, the `apply_patch`-via-`exec_command` warning, and the model-mismatch high-risk cyber fallback warning. Those warnings are no longer produced as conversation history items, but when old sessions compact they should still be recognized as injected context rather than preserved as real user turns. ## What changed - Removed `record_model_warning` and the production paths that emitted these warning messages into conversation history. - Added `LegacyUnifiedExecProcessLimitWarning`, `LegacyApplyPatchExecCommandWarning`, and `LegacyModelMismatchWarning` contextual fragments that are used only for matching old persisted messages. - Registered the legacy fragments with contextual user message detection so compaction filters them through the existing fragment path. - Added focused compaction coverage for old warning messages being dropped during compacted-history processing. ## Testing - `cargo test -p codex-core warning` - `just fix -p codex-core`	2026-05-11 19:51:51 -07:00
Abhinav	d08906a944	Support PreToolUse updatedInput rewrites (#20527 ) ## Why `PreToolUse` already exposes `updatedInput` in its hook output schema, but Codex currently rejects it instead of applying the rewrite. That leaves hook authors unable to make the documented pre-execution adjustment to a tool call before it runs. ## What - Accept `updatedInput` from `PreToolUse` hooks when paired with `permissionDecision: "allow"`. - Apply the rewritten input before dispatch so the tool executes the updated payload, not the original one. - Preserve the stable hook-facing compatibility shapes that participating tool handlers expose: - Bash-like tools (`shell`, `container.exec`, `local_shell`, `shell_command`, `exec_command`) use `{ "command": ... }`. - `apply_patch` exposes its patch body through the same command-shaped hook contract. - MCP tools expose their JSON argument object directly. - Keep each participating tool handler responsible for translating hook-facing `updatedInput` back into its concrete invocation shape. ## Verification Direct Bash-like rewrite coverage: - `pre_tool_use_rewrites_shell_before_execution` - `pre_tool_use_rewrites_container_exec_before_execution` - `pre_tool_use_rewrites_local_shell_before_execution` - `pre_tool_use_rewrites_shell_command_before_execution` - `pre_tool_use_rewrites_exec_command_before_execution` These cases assert that each supported Bash-like surface runs only the rewritten command while the hook still observes the original `{ "command": ... }` input. `pre_tool_use_rewrites_apply_patch_before_execution` - Model emits one patch. - Hook swaps in a different patch. - Asserts only the rewritten file is created, and the hook saw the original patch. `pre_tool_use_rewrites_code_mode_nested_exec_command_before_execution` - Model runs one nested shell command from code mode. - Hook rewrites it. - Asserts only the rewritten command runs, and the hook saw the original nested input. `pre_tool_use_rewrites_mcp_tool_before_execution` - Model calls the RMCP echo tool. - Hook rewrites the MCP arguments. - Asserts the MCP server receives and returns the rewritten message, not the original one.	2026-05-11 22:27:24 -04:00
starr-openai	17ed5ad0b0	Apply sandbox context to local view_image reads (#21861 ) ## Summary - create a selected-cwd filesystem sandbox context for view_image metadata and file reads in both local and remote environments - add a local restricted-profile regression test for the previously unsandboxed read path ## Validation - just fmt - bazel test --bes_backend= --bes_results_url= --test_output=errors --test_filter=view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads //codex-rs/core:core-unit-tests --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 18:48:43 -07:00
pakrym-oai	ed5944ba1d	Simplify MCP tool handler plumbing (#21595 ) ## Why The MCP tool path had accumulated a few core-owned special cases: a dedicated payload variant, resolver plumbing, a legacy `AfterToolUse` translation path, and a side channel for parallel-call metadata. That made `ToolRegistry` and the spec builder know more about MCP than they needed to. This change moves MCP-specific execution details back onto `ToolInfo` and `McpHandler` so `codex-core` can treat MCP calls like normal function calls while still preserving MCP-specific dispatch and telemetry behavior where it belongs. ## What changed - removed `resolve_mcp_tool_info`, `ToolPayload::Mcp`, `ToolKind`, and the remaining registry-side MCP resolver path - stored MCP routing metadata directly on `McpHandler` and `ToolInfo`, including `supports_parallel_tool_calls` - deleted the legacy `AfterToolUse` consumer in `core`, which removes the need for handler-specific `after_tool_use_payload` implementations - switched tool-result telemetry to handler-provided tags and kept MCP-specific dispatch payload construction inside the handler - simplified tool spec planning/building by passing `ToolInfo` directly and dropping the direct/deferred MCP wrapper structs and the parallel-server side table ## Testing - `cargo check -p codex-core -p codex-mcp -p codex-otel` - `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server` - `cargo test -p codex-core direct_mcp_tools_register_namespaced_handlers` - `cargo test -p codex-core search_tool_description_lists_each_mcp_source_once` - `cargo test -p codex-mcp list_all_tools_uses_startup_snapshot_while_client_is_pending` - `just fix -p codex-core -p codex-mcp -p codex-otel`	2026-05-12 00:11:31 +00:00

1 2 3 4 5 ...

3255 Commits