codex

mirror of https://github.com/openai/codex.git synced 2026-05-25 05:24:37 +00:00

Author	SHA1	Message	Date
Matthew Zeng	ebdf3a878c	Support disabling tool suggest for specific tools. (#20072 ) ## Summary - Add `disable_tool_suggest` to app and plugin config, schema, and TypeScript output - Exclude disabled connectors and plugins from tool suggestion discovery - Persist "never show again" tool-suggestion choices back into `config.toml` - Update config docs and add coverage for connector and plugin suppression ## Testing - Added and updated unit tests for config persistence and tool-suggest filtering - Not run (not requested)	2026-04-29 00:19:34 +00:00
Michael Bolin	1211a90a35	core tests: migrate hook turns to profiles (#20041 ) ## Summary - Removes `SandboxPolicy` from the hooks test suite. - Submits hook-related turns with explicit `PermissionProfile` values for disabled, read-only, and workspace-write cases. - Preserves the managed-network hook test by configuring and submitting a workspace-write profile with enabled network, allowing the existing requirements-backed proxy path to remain covered. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:18:45 -07:00
Michael Bolin	1fed948c66	core tests: migrate apply patch turns to profiles (#20040 ) ## Summary - Removes `SandboxPolicy` from the apply-patch CLI test suite. - Uses the harness' profile-backed submit helper for danger/no-sandbox turns instead of constructing `Op::UserTurn` manually with legacy fields. - Converts the workspace-write traversal cases to submit `PermissionProfile::workspace_write_with(...)` directly. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:18:19 -07:00
Michael Bolin	1dae5788e1	core tests: migrate rmcp turns to profiles (#20037 ) ## Summary - Removes `SandboxPolicy` from the RMCP client test suite. - Adds shared read-only user-turn helpers that submit `PermissionProfile::read_only()` plus the legacy compatibility projection required by the current `Op::UserTurn` shape. - Keeps sandbox metadata assertions intact by deriving the expected legacy `sandboxPolicy` value from the same read-only profile used for the turn. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:17:47 -07:00
Michael Bolin	6662c0f312	core tests: migrate compact turns to profiles (#20035 ) ## Summary - Removes the remaining `SandboxPolicy` usage from the compaction test suite. - Adds a small local helper for direct `Op::UserTurn` construction so these tests send `PermissionProfile::Disabled` plus the legacy compatibility projection required by the protocol field. - Keeps the existing danger/full-access behavior while exercising the canonical permission profile path. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:17:12 -07:00
Michael Bolin	026df712cc	core tests: migrate zsh-fork permissions to profiles (#20034 ) ## Summary - Updates the zsh-fork test helper to configure `PermissionProfile` directly instead of constructing a legacy `SandboxPolicy`. - Sends permission-profile-backed turns from the skill approval zsh-fork tests so the runtime and request path exercise the canonical permissions model. - Leaves the broader approvals suite on legacy policies for now, except for the zsh-fork test that shares this helper. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:15:58 -07:00
Michael Bolin	1ea90410e1	core tests: migrate request permissions tool turns to profiles (#20033 ) ## Summary This migrates the macOS request-permissions tool tests from legacy `SandboxPolicy` setup to `PermissionProfile` setup. The tests still exercise the same workspace-write baseline and request-permission grants, but the canonical permissions value is now the profile. ## Changes - Replaces the `workspace_write_excluding_tmp()` helper with a `PermissionProfile::workspace_write_with()` helper. - Applies test config through `Permissions::set_permission_profile()`. - Uses `turn_permission_fields()` for `Op::UserTurn` compatibility fields. - Removes the `SandboxPolicy` import from `request_permissions_tool.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:15:13 -07:00
Michael Bolin	af39e488bc	core tests: migrate prompt caching turns to profiles (#20032 ) ## Summary This removes the explicit `SandboxPolicy` constructors from `core/tests/suite/prompt_caching.rs`. The tests still exercise the same prompt-cache invariants across permission and turn-context changes, but the permission source is now `PermissionProfile`. ## Changes - Uses `PermissionProfile::workspace_write_with()` for workspace-write override scenarios. - Uses `PermissionProfile::Disabled` for the no-sandbox per-turn override. - Projects profiles through `turn_permission_fields()` or `to_legacy_sandbox_policy()` only to populate compatibility fields on existing ops. - Removes the `SandboxPolicy` import from `prompt_caching.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:13:53 -07:00
Michael Bolin	5d08315c00	core tests: migrate exec policy turns to profiles (#20030 ) ## Summary This migrates `core/tests/suite/exec_policy.rs` away from legacy `SandboxPolicy` turn construction. These tests all use no-sandbox turns to exercise exec-policy behavior, so `PermissionProfile::Disabled` is the canonical representation. ## Changes - Replaces direct `SandboxPolicy::DangerFullAccess` turn fields with `PermissionProfile::Disabled`. - Uses `turn_permission_fields()` to populate the compatibility `sandbox_policy` field required by `Op::UserTurn`. - Removes the `SandboxPolicy` import from `exec_policy.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:12:48 -07:00
Michael Bolin	b599849d86	core tests: migrate permissions message tests to profiles (#20028 ) ## Summary This removes another test-only `SandboxPolicy` dependency by configuring `permissions_messages.rs` with a `PermissionProfile` directly. The test still verifies the rendered compatibility permissions text, but now obtains the legacy projection from the loaded `Config` rather than using `SandboxPolicy` as the source of truth. ## Changes - Builds the workspace-write test setup with `PermissionProfile::workspace_write_with()`. - Applies that profile through `Permissions::set_permission_profile()`. - Uses `Config::legacy_sandbox_policy()` only for the expected `PermissionsInstructions` compatibility rendering. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:12:10 -07:00
Michael Bolin	3ef09c71d3	core tests: migrate tools tests to permission profiles (#20027 ) ## Summary This continues the test-side migration away from `SandboxPolicy` by removing the remaining legacy policy setup in `core/tests/suite/tools.rs`. The affected test was already modeling a profile-backed filesystem policy with a deny-read glob, so configuring the test through `Permissions::set_permission_profile()` is a better match for the behavior being exercised. ## Changes - Drops the `SandboxPolicy` import from `core/tests/suite/tools.rs`. - Configures the glob deny-read shell test directly with a `PermissionProfile` instead of creating a legacy read-only policy first. - Submits the test turn with the session permission profile so the deny-read glob remains active for the command under test. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:11:43 -07:00
Michael Bolin	8d3992d830	core tests: migrate plan item turns to profiles (#20026 ) ## Why The core item tests still had a cluster of plan-mode `Op::UserTurn` literals that used `SandboxPolicy::DangerFullAccess` and omitted `permission_profile`. These tests are validating emitted item lifecycle events, so keeping them on the legacy sandbox-only turn shape adds noise to the broader permissions migration without testing legacy behavior. ## What Changed - Adds a local `disabled_plan_turn()` helper that preserves the existing `std::env::current_dir()` turn cwd behavior. - Uses `turn_permission_fields(PermissionProfile::Disabled, cwd)` to populate both the compatibility `sandbox_policy` and canonical `permission_profile` fields. - Replaces the plan-mode hand-built turns in `codex-rs/core/tests/suite/items.rs`, removing all `SandboxPolicy` references from that file and reducing remaining `codex-rs/core/tests` `SandboxPolicy` files from 16 to 15. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:11:17 -07:00
Michael Bolin	162f4e3183	core tests: migrate safety check turns to profiles (#20024 ) ## Why This stack is retiring direct `SandboxPolicy` construction from tests so core coverage exercises the same `PermissionProfile` turn path used by runtime code. `safety_check_downgrade.rs` still submitted each test turn as `SandboxPolicy::DangerFullAccess` with no permission profile, even though the tests are about model verification/reroute behavior rather than legacy sandbox conversion. ## What Changed - Adds a local `disabled_text_turn()` helper that derives both the compatibility `sandbox_policy` and canonical `permission_profile` from `PermissionProfile::Disabled`. - Replaces repeated hand-built `Op::UserTurn` literals in `codex-rs/core/tests/suite/safety_check_downgrade.rs` with that helper. - Removes all `SandboxPolicy` references from the safety-check suite, reducing the remaining `codex-rs/core/tests` files that mention `SandboxPolicy` from 17 to 16. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:10:42 -07:00
Michael Bolin	2a8ce9b319	core tests: migrate view image turns to profiles (#20021 ) ## Why This stack is removing direct `SandboxPolicy` usage from test code so new tests exercise the same `PermissionProfile` path that runtime code now treats as canonical. `view_image.rs` still built `Op::UserTurn` requests with `SandboxPolicy::DangerFullAccess` and no permission profile, which kept another core test module on the legacy turn shape. ## What Changed - Adds a small `disabled_user_turn()` helper for the view-image suite that derives the compatibility `sandbox_policy` and canonical `permission_profile` from `PermissionProfile::Disabled`. - Replaces repeated direct `Op::UserTurn` literals in `codex-rs/core/tests/suite/view_image.rs` with that helper. - Removes all `SandboxPolicy` references from `view_image.rs`, reducing the remaining `codex-rs/core/tests` files that mention `SandboxPolicy` from 18 to 17. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:09:48 -07:00
Michael Bolin	d77d23da2e	core tests: migrate model/personality turns to profiles (#20018 ) ## Summary - Migrates `model_switching.rs` and `personality.rs` direct `Op::UserTurn` construction from legacy `SandboxPolicy` literals to `PermissionProfile`-backed turn fields. - Adds small local helpers in each file so tests keep asserting model/personality behavior without repeating permission plumbing. - Reduces `rg -l '\bSandboxPolicy\b' codex-rs/core/tests` from 20 files to 18; `codex-rs/tui` remains at zero `SandboxPolicy` references. ## Testing - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:09:12 -07:00
Abhinav	5b0d9df1d0	Increase plugin hook env test timeout (#20100 ) # Why `plugin_hook_sources_run_with_plugin_env_and_plugin_source` can still fail on Windows after the earlier file-based assertion cleanup because the hook process itself occasionally exceeds the old 5s timeout under CI load. When that happens, the hook run ends as `Failed` before the test can inspect its structured output. The Windows Bazel failure showed the hook run itself failing after nearly 8 seconds: ```text ---- engine::tests::plugin_hook_sources_run_with_plugin_env_and_plugin_source stdout ---- thread 'engine::tests::plugin_hook_sources_run_with_plugin_env_and_plugin_source' panicked at hooks/src\engine\mod_tests.rs:428:5: assertion failed: `(left == right)` Diff < left / right > : <Failed >Completed ... test result: FAILED. 78 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.96s ``` # What - raise the flaky plugin hook env test timeout from 5s to 10s so it matches the other executed hook tests in this module # Validation - `cargo test -p codex-hooks`	2026-04-28 17:08:12 -07:00
Michael Bolin	d6d79ffcc7	core tests: send model turns with permission profiles (#20016 ) ## Summary - Migrate direct `Op::UserTurn` construction in remote-model tests from legacy `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled` via `turn_permission_fields()`. - Migrate the Responses API proxy header helper from an inline workspace-write `SandboxPolicy` to `PermissionProfile::workspace_write()`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 22 files after #20015 to 20 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20016). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * __->__ #20016	2026-04-28 17:08:04 -07:00
Michael Bolin	158b2a4201	core tests: configure profiles directly (#20015 ) ## Summary - Replace legacy sandbox config setup in delegate and telemetry tests with direct `PermissionProfile` configuration. - Move no-sandbox and read-only test turns in `tools.rs`, `code_mode.rs`, `user_shell_cmd.rs`, and `model_visible_layout.rs` from legacy `SandboxPolicy` values to `PermissionProfile` helpers, while leaving the deny-glob read-only compatibility case for a later targeted cleanup. - Use `PermissionProfile::read_only()` where tests need managed read-only behavior and `PermissionProfile::Disabled` where they intentionally need no sandbox. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 27 files after #20013 to 22 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:06:59 -07:00
Michael Bolin	52e79ee49a	core tests: migrate more turns to permission profiles (#20013 ) ## Summary - Migrate another batch of direct `Op::UserTurn` test construction from legacy `SandboxPolicy` values to `PermissionProfile` inputs via `turn_permission_fields()`. - Replace a one-off read-only `SandboxPolicy` bridge in the macOS exec test with `PermissionProfile::read_only()`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 32 files at the start of the cleanup stack to 27 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p codex-core`	2026-04-28 17:05:53 -07:00
Michael Bolin	7d15936e69	core tests: build user turns from permission profiles (#20011 ) ## Summary - Add `turn_permission_fields()` so tests that construct `Op::UserTurn` directly can provide a canonical `PermissionProfile` while still filling the required legacy `sandbox_policy` compatibility field. - Migrate direct user-turn construction in core integration tests from `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`. - Continue reducing direct `SandboxPolicy` usage in `codex-rs/core/tests`, from 41 files after #20010 to 32 files in this PR. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`	2026-04-28 17:03:20 -07:00
Ruslan Nigmatullin	c6465c1ec2	app-server: notify clients of remote-control status changes (#19919 ) ## Why Remote-control app-server enrollments have both an internal server id and the environment id exposed to remote-control clients. App-server clients need one current status snapshot that says whether remote control is usable and which environment id, if any, is exposed. A temporary websocket disconnect is not itself an identity change. Account changes, stale enrollment invalidation, successful re-enrollment, and missing ChatGPT auth are meaningful status changes. Disabled remote control remains `disabled` regardless of auth or SQLite state. SQLite startup failure disablement and enrollment persistence failures are handled in #20068; this PR reports the resulting effective status to clients. ## What changed - Adds v2 `remoteControl/status/changed` carrying `state` and `environmentId`. - Adds `RemoteControlConnectionState` values: `disabled`, `connecting`, `connected`, and `errored`. - Exposes remote-control status updates through `RemoteControlHandle` using a Tokio watch channel. - Always sends the current remote-control status snapshot to newly initialized app-server clients. - Broadcasts status changes to initialized app-server clients when state or environment id changes. - Treats missing ChatGPT auth as an `errored` status while leaving it retryable because auth can change at runtime. - Clears `environmentId` when enrollment is cleared for account changes, auth loss, stale backend invalidation, or disabled remote control. - Updates app-server protocol schema fixtures, generated TypeScript, app-server README, remote-control tests, and TUI exhaustive notification matches. ## Stack - Builds on #20068. 
## Verification - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server transport::remote_control --lib` - `cargo check -p codex-tui` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` - `just fix -p codex-tui`	2026-04-28 23:52:14 +00:00
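The status-change rule described in #19919 can be sketched roughly as follows. This is only an illustration: the four state names come from the commit message, while `RemoteControlStatus`, its `new` constructor, and `status_change` are hypothetical names, not the app-server's actual types, and the real implementation publishes updates through a Tokio watch channel rather than a plain function.

```rust
/// Connection states listed in the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RemoteControlConnectionState {
    Disabled,
    Connecting,
    Connected,
    Errored,
}

/// Hypothetical snapshot type: the state plus the exposed environment id, if any.
#[derive(Clone, Debug, PartialEq, Eq)]
struct RemoteControlStatus {
    state: RemoteControlConnectionState,
    environment_id: Option<String>,
}

impl RemoteControlStatus {
    /// Build a snapshot. Disabled remote control never exposes an environment id.
    fn new(state: RemoteControlConnectionState, environment_id: Option<String>) -> Self {
        let environment_id = if state == RemoteControlConnectionState::Disabled {
            None
        } else {
            environment_id
        };
        Self { state, environment_id }
    }
}

/// Return the snapshot to broadcast, or `None` when neither the state nor the
/// environment id changed.
fn status_change(
    prev: &RemoteControlStatus,
    next: &RemoteControlStatus,
) -> Option<RemoteControlStatus> {
    (prev != next).then(|| next.clone())
}
```

Comparing whole snapshots keeps the broadcast rule in one place: a notification goes out only when the pair (state, environment id) differs from the previous one.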
Gabriel Peal	5e6cbbadf7	Return None when auth refresh fails (#20092) Right now, if Codex ends up in a state where it has auth but cannot refresh the token, the user is left with an unhelpful message telling them to log out and log back in. Ultimately we should prevent that from happening, but if it does, returning None lets the caller redirect the user back to the login page.	2026-04-28 16:15:47 -07:00
Michael Bolin	891722849d	core tests: submit turns with permission profiles (#20010 ) ## Summary - Add `PermissionProfile`-based turn submission helpers to `core_test_support`, while keeping the legacy `SandboxPolicy` helper for tests that intentionally exercise legacy fallback behavior. - Switch the default `TestCodex::submit_turn()` path to send a real `PermissionProfile` plus the required legacy compatibility projection in `Op::UserTurn`. - Migrate straightforward app/search/shell/truncation tests from `SandboxPolicy::{DangerFullAccess, ReadOnly}` to `PermissionProfile::{Disabled, read_only}`. - Add a TUI compatibility projection helper for legacy app-server fields so non-legacy writable roots are preserved instead of being downgraded to read-only. - Fix remote start/resume/fork sandbox-mode projection to classify any managed profile with writable roots as workspace-write, not only profiles that can write `cwd`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47 files to 41 files without changing production behavior. ## Testing - `cargo check -p codex-core --tests` - `cargo test -p codex-tui compatibility_profile_preserves_unbridgeable_write_roots` - `cargo test -p codex-tui sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`	2026-04-28 23:01:40 +00:00
viyatb-oai	2dbde94aa9	fix(network-proxy): normalize network proxy host matching (#19995 ) ## Why The proxy matches allow and deny rules against normalized host strings. Scoped IPv6 literals can arrive in equivalent forms, such as `fd00::1%eth0`, `[fd00::1%eth0]`, or `[fd00::1%25eth0]`. Policy should canonicalize those spellings without erasing scope granularity: an unscoped rule like `fd00::1` should still cover scoped requests for that address, while a scoped rule like `fd00::1%eth0` should remain exact to that scope. ## What changed - preserve IPv6 scope IDs during host normalization and canonicalize `%25scope` to `%scope` - match policy against the exact normalized host plus the unscoped IP base for scoped literals - keep local-address explicit allow checks aligned with the same scoped/unscoped semantics - add focused coverage for scoped IPv6 normalization, scoped allow rules, and scoped deny rules in `network-proxy` ## Security impact A request cannot bypass a broad deny rule by adding an IPv6 scope suffix. At the same time, scoped policy remains precise: `deny=fd00::1%eth0` affects that scoped spelling without collapsing `fd00::1%eth1` onto the same key, and `allow=fe80::1%eth0` does not implicitly allow other scopes. ## Verification - `just fmt` - `cargo test -p codex-network-proxy` - `just fix -p codex-network-proxy` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: evawong-oai <evawong@openai.com>	2026-04-28 15:50:00 -07:00
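The scoped/unscoped matching semantics described in #19995 can be illustrated with a minimal sketch. It uses only the standard library, and `normalize_host`, `match_keys`, and `is_denied` are hypothetical names, not the functions in `codex-network-proxy`. It assumes a leading `25` after `%` is the URL-encoded form of `%`, which is ambiguous for a numeric scope that itself begins with `25`.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: normalize a host string while keeping any IPv6 scope ID.
fn normalize_host(host: &str) -> String {
    // Drop surrounding brackets, e.g. "[fd00::1%25eth0]" -> "fd00::1%25eth0".
    let trimmed = host
        .strip_prefix('[')
        .and_then(|h| h.strip_suffix(']'))
        .unwrap_or(host);
    // Split off an optional scope and canonicalize "%25scope" to "%scope".
    let (base, scope) = match trimmed.split_once('%') {
        Some((base, rest)) => {
            let scope = match rest.strip_prefix("25") {
                Some(decoded) if !decoded.is_empty() => decoded,
                _ => rest,
            };
            (base, Some(scope))
        }
        None => (trimmed, None),
    };
    match base.parse::<Ipv6Addr>() {
        // IPv6 literal: canonical address text plus the preserved scope.
        Ok(ip) => match scope {
            Some(scope) => format!("{ip}%{scope}"),
            None => ip.to_string(),
        },
        // Not an IPv6 literal: treat it as a hostname.
        Err(_) => trimmed.to_ascii_lowercase(),
    }
}

/// Keys to match against policy: the exact normalized host, plus the unscoped
/// base address when the host carries a scope.
fn match_keys(host: &str) -> Vec<String> {
    let normalized = normalize_host(host);
    let unscoped = normalized.split_once('%').map(|(base, _)| base.to_string());
    let mut keys = vec![normalized];
    keys.extend(unscoped);
    keys
}

/// A host is denied when any of its match keys equals a normalized deny rule.
fn is_denied(host: &str, deny_rules: &[&str]) -> bool {
    let rules: Vec<String> = deny_rules.iter().map(|r| normalize_host(r)).collect();
    match_keys(host).iter().any(|key| rules.contains(key))
}
```

With this shape, a rule for `fd00::1` covers `[fd00::1%25eth0]`, while a rule for `fd00::1%eth0` matches neither `fd00::1%eth1` nor the unscoped `fd00::1`.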
Abhinav	3291463ff1	Fix flaky plugin hook env test (#20088 ) The test was flaky because it was checking the right thing in a roundabout way. What it wanted to prove: - plugin hooks receive the right environment variables. What it actually did: 1. Run a plugin hook. 2. Have that hook write those env vars into a temporary `env.json` file. 3. After the hook finished, read `env.json` back from disk. On Windows, that last file was sometimes not there when the test tried to read it, so the test failed with `read env log: file not found`. The hook system itself was not what the test failure was directly proving; the test was failing on the extra filesystem side effect it introduced. The fix is to stop using a temp file as the proof mechanism. The hook now prints the env values in its normal structured output, and the test asserts on the output that the hook engine already captures. So we still verify the same behavior, but without depending on a separate file being created and read back correctly on Windows.	2026-04-28 15:45:26 -07:00
Owen Lin	2e598df6fc	fix: don't auto approve git -C ... (#20085) It's safer to make sure these commands go through approval flows.	2026-04-28 22:06:55 +00:00
canvrno-oai	66b0781502	/plugins: add marketplace install flow (#18704 ) This PR adds a new feature to the `/plugins` menu that gives users the ability to add new plugin marketplaces. It introduces an Add Marketplace tab to the right of installed marketplaces, a source prompt, loading and error states, and the app-server request flow needed to perform the install. After a successful `marketplace/add`, the popup refreshes back into the newly added marketplace tab so the new plugins are immediately visible. - Add an Add Marketplace tab to the `/plugins` menu - Prompt for marketplace source input from git repo, URL, or local path - Show loading and error states during `marketplace/add` - Refresh plugin data after success and switch into the newly added marketplace tab - Add tests and snapshot updates	2026-04-28 14:22:39 -07:00
Abhinav	c6e7d564c3	Discover hooks bundled with plugins (#19705 ) ## Why Plugins can bundle lifecycle hooks, but Codex previously only discovered hooks from user, project, and managed config layers. This adds the plugin discovery and runtime plumbing needed for plugin-bundled hooks while keeping execution behind the `plugin_hooks` feature flag. ## What - Discovers plugin hook sources from each plugin's default `hooks/hooks.json`. - Supports `plugin.json` manifest `hooks` entries as either relative paths or inline hook objects. - Plumbs discovered plugin hook sources through plugin loading into the hook runtime when `plugin_hooks` is enabled. - Marks plugin-originated hook runs as `HookSource::Plugin`. - Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook command environments. - Updates generated schemas and hook source metadata for the plugin hook source. ## Stack 1. This PR - openai/codex#19705 2. openai/codex#19778 3. openai/codex#19840 4. openai/codex#19882 ## Reviewer Notes - Core logic is in `codex-rs/core-plugins/src/loader.rs` and `codex-rs/hooks/src/engine/discovery.rs` - Moved existing / adding new tests to `codex-rs/core-plugins/src/loader_tests.rs` hence the large diff there - Otherwise mostly plumbing and minor schema updates ### Core Changes The `codex-rs/core` changes are limited to wiring plugin hook support into existing core flows: - `core/src/session/session.rs` conditionally pulls effective plugin hook sources and plugin hook load warnings from `PluginsManager` when `plugin_hooks` is enabled, then passes them into `HooksConfig`. - `core/src/hook_runtime.rs` adds the `plugin` metric tag for `HookSource::Plugin`. - `core/config.schema.json` picks up the new `plugin_hooks` feature flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the added plugin hook fields. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 14:17:18 -07:00
cassirer-openai	89698ad1c3	[rollout-trace] Include x-request-id in rollout trace. (#20066 ) ## Why Rollout traces need an identifier that can be used to correlate a Codex inference with upstream Responses API, proxy, and engine logs. The reduced trace model already exposed `upstream_request_id`, but it was being populated from the Responses API `response.id`. That value is useful for `previous_response_id` chaining, but it is not the transport request id that upstream systems key on. This PR separates those concepts so trace consumers can reliably answer both questions: - which Responses API response did this inference produce? - which upstream request handled it? ## Structure The change keeps the upstream request id at the same lifecycle level as the provider stream: - `codex-api` captures the `x-request-id` HTTP response header when the SSE stream is created and exposes it on `ResponseStream`. Fixture and websocket streams set the field to `None` because they do not have that HTTP response header. - `codex-core` carries that stream-level id into `InferenceTraceAttempt` when recording terminal stream outcomes. Completed, failed, cancelled, dropped-stream, and pre-response error paths all record the id when it is available. - `rollout-trace` now records both identifiers in raw terminal inference events and response payloads: `response_id` for the Responses API `response.id`, and `upstream_request_id` for `x-request-id`. - The reducer stores both fields on `InferenceCall`. It also uses `response_id` for `previous_response_id` conversation linking, which removes the old accidental dependency on the misnamed `upstream_request_id` field. - Terminal inference reduction now consumes the full terminal payload (`InferenceCompleted`, `InferenceFailed`, or `InferenceCancelled`) in one place. That keeps status, partial payloads, response ids, and upstream request ids consistent across success, failure, cancellation, and late stream-mapper events. 
## Why This Shape `x-request-id` is a property of the HTTP/provider response envelope, not an SSE event. Capturing it once in `codex-api` and plumbing it through terminal trace recording avoids trying to infer the value from stream contents, and it preserves the id even when the stream fails or is cancelled after only partial output. Keeping `response_id` separate from `upstream_request_id` also makes the reduced trace model less surprising: `response_id` remains the conversation-continuation id, while `upstream_request_id` is the operational correlation id for upstream debugging. ## Validation The PR updates trace and reducer coverage for: - reading `x-request-id` from SSE response headers; - storing the true upstream request id on completed inference calls; - preserving upstream request ids for cancelled and late-cancelled inference streams; - keeping `previous_response_id` reconstruction tied to `response_id` rather than transport request ids.	2026-04-28 21:11:17 +00:00
Ruslan Nigmatullin	10e2a73b3c	app-server: disable remote control without sqlite (#20068 ) ## Why Remote control depends on the app-server SQLite state DB for persisted enrollment identity. If the state DB cannot be opened at startup, continuing with remote control enabled leaves the process in a misleading state where enrollment identity cannot be read or persisted. Feature-disabled remote control remains disabled regardless of SQLite state. This only changes the case where remote control is requested but the SQLite state DB is unavailable. ## What changed - Logs SQLite state DB initialization failures instead of dropping the error silently. - Treats remote control as effectively disabled when the SQLite state DB is unavailable. - Prevents `RemoteControlHandle::set_enabled(true)` from enabling remote control later in the same process if the state DB was unavailable at startup. - Keeps the existing behavior that disabled remote control does not validate or connect to the remote-control URL. - Makes persisted enrollment load/update failures propagate as remote-control errors instead of silently falling back to in-memory state. - Makes the direct websocket connection path fail when called without a SQLite state DB. - Adds coverage for startup without a state DB, later handle enablement with no state DB, and direct websocket connection without a state DB. ## Verification - `cargo test -p codex-app-server transport::remote_control --lib` - `just fix -p codex-app-server`	2026-04-28 13:49:00 -07:00
Michael Bolin	3b74a4d3b1	tui: use permission profiles for sandbox state (#20008 ) ## Summary - Move TUI permission state from legacy `SandboxPolicy` values to canonical `PermissionProfile` values across presets, app events, chat widget state, app commands, thread routing, and cached thread session state. - Keep app-server compatibility boundaries explicit: embedded sessions send `permissionProfile`, while remote sessions send only a legacy `sandbox` projection and fall back to read-only when a custom profile cannot be projected. - Update status/add-dir UI summaries and snapshots to render the active permission profile, including workspace profiles selected by the new built-in defaults. ## Verification - `rg '\bSandboxPolicy\b' codex-rs/tui -n` returns no matches. - `cargo test -p codex-tui` - `cargo check -p codex-tui --tests` - `cargo test -p codex-tui additional_dirs` - `just fmt` - `just fix -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20008). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * #20016 * #20015 * #20013 * #20011 * #20010 * __->__ #20008	2026-04-28 20:36:48 +00:00
jif-oai	34d71d43eb	Make MultiAgentV2 wait minimum configurable (#20052 ) ## Why MultiAgentV2 `wait_agent` currently clamps short waits to a fixed 10 second minimum. That default is still useful for preventing tight polling loops, but it is too rigid for environments that need faster mailbox wake-up checks or a larger minimum to discourage frequent polling. This PR makes the minimum wait timeout configurable from the existing MultiAgentV2 feature config section, so operators can tune the behavior without changing the legacy multi-agent tool surface. ## What Changed - Added `features.multi_agent_v2.min_wait_timeout_ms`. - Defaulted the new setting to the existing 10 second floor. - Validated the configured value as `1..=3600000`, matching the existing one hour maximum wait bound. - Applied the configured minimum to MultiAgentV2 `wait_agent` runtime clamping. - Plumbed the configured minimum into the `wait_agent` tool schema, including the effective default when the minimum is above the normal 30 second default. - Regenerated `core/config.schema.json`. ## Verification - `cargo test -p codex-features` - `cargo test -p codex-tools` - `cargo test -p codex-core --lib multi_agent_v2` - `just fix -p codex-core`	2026-04-28 22:36:44 +02:00
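The clamping behaviour described in #20052 might look roughly like the sketch below. The numeric bounds (10 second default floor, 30 second normal default, one hour maximum, valid range `1..=3600000`) come from the commit message; the constant and function names are hypothetical and are not taken from `codex-core`.

```rust
/// Default floor for `wait_agent`, in milliseconds (the existing 10 second minimum).
const DEFAULT_MIN_WAIT_TIMEOUT_MS: u64 = 10_000;
/// Normal default wait when the caller gives no timeout (30 seconds).
const DEFAULT_WAIT_TIMEOUT_MS: u64 = 30_000;
/// Existing one hour maximum wait bound.
const MAX_WAIT_TIMEOUT_MS: u64 = 3_600_000;

/// Validate a configured `min_wait_timeout_ms` against the range `1..=3600000`.
fn validate_min_wait_timeout_ms(value: u64) -> Result<u64, String> {
    if (1..=MAX_WAIT_TIMEOUT_MS).contains(&value) {
        Ok(value)
    } else {
        Err(format!(
            "min_wait_timeout_ms must be in 1..={MAX_WAIT_TIMEOUT_MS}, got {value}"
        ))
    }
}

/// Compute the effective wait. `min_ms` must already be validated, otherwise
/// `clamp` would panic when the minimum exceeds the maximum.
fn effective_wait_timeout_ms(requested_ms: Option<u64>, min_ms: u64) -> u64 {
    // The effective default rises with the minimum when it exceeds 30 seconds.
    let default_ms = DEFAULT_WAIT_TIMEOUT_MS.max(min_ms);
    requested_ms.unwrap_or(default_ms).clamp(min_ms, MAX_WAIT_TIMEOUT_MS)
}
```

For example, with the default floor a requested 500 ms wait is raised to 10 seconds, while an operator who sets the minimum to 100 ms would see the 500 ms request honored unchanged.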
Ruslan Nigmatullin	1de7a9bf69	app-server: allow remote_control runtime feature override (#20047)	2026-04-28 13:36:12 -07:00
viyatb-oai	e1ba87ccb2	fix(network-proxy): recheck network proxy connect targets (#19999 ) ## Why The proxy checks the requested host before opening the upstream connection, but DNS can resolve an allowed hostname to a loopback, private, or other non-public address after that first decision. Without a final check on the actual socket target, a request that looks acceptable at the hostname layer can still connect to a local service once resolution completes. ## What changed - add a shared TCP connector check for direct proxy egress - use that path for HTTP, `CONNECT`, SOCKS5, and MITM upstream connections - keep configured upstream proxy hops on the existing proxy path - add direct-connector coverage for allowed and rejected local targets ## Security impact Direct proxy egress now rechecks the resolved socket address before connecting, closing the gap between hostname policy evaluation and the final network target. ## Verification - `cargo test -p codex-network-proxy` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 12:51:43 -07:00
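The idea behind #19999, rechecking the resolved socket address just before connecting, can be sketched with the standard library alone. `is_non_public`, `check_connect_target`, and the `allow_local` flag are hypothetical names for illustration; the actual connector in `codex-network-proxy` may classify addresses differently.

```rust
use std::net::{IpAddr, SocketAddr};

/// Rough classification of addresses that should not be reachable by default.
fn is_non_public(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback()
                || v4.is_private()
                || v4.is_link_local()
                || v4.is_unspecified()
                || v4.is_broadcast()
        }
        IpAddr::V6(v6) => {
            // Treat IPv4-mapped addresses (::ffff:a.b.c.d) like their IPv4 form.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_non_public(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}

/// Final check on the address the socket will actually connect to, run after
/// DNS resolution and independent of the earlier hostname decision.
fn check_connect_target(resolved: SocketAddr, allow_local: bool) -> Result<SocketAddr, String> {
    if !allow_local && is_non_public(resolved.ip()) {
        Err(format!("blocked non-public connect target {resolved}"))
    } else {
        Ok(resolved)
    }
}
```

Running the check on the resolved `SocketAddr` rather than on the hostname is what closes the gap: an allowed name that resolves to `127.0.0.1` is rejected at connect time.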
Shijie Rao	25ac0e4527	Load cloud requirements for agent identity (#19708 ) ## Why Agent Identity sessions can represent Business and Enterprise ChatGPT workspaces, but cloud requirements were skipped before fetch. That meant workspace-managed requirements were not loaded for Agent Identity even when the JWT carried the same account identity and plan information that normal ChatGPT token auth exposes. This PR now sits on top of the Agent Identity stack through [#19764](https://github.com/openai/codex/pull/19764). Because [#19763](https://github.com/openai/codex/pull/19763) moved task registration into Agent Identity auth loading, cloud requirements no longer needs a separate runtime-initialization step before building the backend client. ## What changed - Stop skipping `CodexAuth::AgentIdentity` in the cloud requirements loader. - Share the cloud requirements eligibility check between startup load and background cache refresh. - Rely on eagerly loaded Agent Identity auth so backend requests can attach task-scoped `AgentAssertion` headers. - Decode Agent Identity JWT `plan_type` as the auth-layer plan type, then convert it through a shared `auth::PlanType` -> `account::PlanType` mapping. - Add the missing serde alias for the `education` plan string and add coverage for raw Agent Identity plan aliases such as `hc` and `education`. ## Testing - `cargo test -p codex-agent-identity -p codex-login -p codex-cloud-requirements -p codex-protocol`	2026-04-28 12:35:00 -07:00
Ruslan Nigmatullin	0700f979ba	app-server: run initialized rpcs with keyed serialization (#17373 ) ## Why Initialized app-server RPCs no longer need to bottleneck behind one request processor path. Running them concurrently improves responsiveness, but several request families still mutate shared state or depend on ordered side effects. Those stateful families need an auditable serialization contract so concurrency does not reorder thread, config, auth, command, watcher, MCP, or similar state transitions. This PR keeps that boundary explicit: stateful work is serialized by the smallest useful key, while intentionally read-only or externally concurrent work remains unkeyed. In particular, `thread/list` and `thread/turns/list` explicitly have no serialization because they primarily read append-only rollout storage and should continue to be served concurrently. ## What changed - Adds `ClientRequest::serialization_scope()` in `app-server-protocol` and requires every client request definition to declare its serialization behavior. - Introduces keyed request scopes for thread, thread path, command exec process, fuzzy search session, fs watch, MCP OAuth, and global state buckets such as config, account auth, memory, and device keys. - Routes initialized app-server RPCs through per-key FIFO serialization while allowing unkeyed initialized requests to run concurrently. - Cancels in-flight initialized RPC work when the connection disconnects or the app-server exits so spawned request tasks do not outlive their session. - Adds focused coverage for representative keyed and unkeyed serialization scopes, including explicitly concurrent `thread/turns/list` behavior. ## Validation - Added protocol tests for representative keyed serialization scopes and intentionally unkeyed request families. - Added app-server request serialization tests covering per-key FIFO behavior, concurrent unkeyed execution, disconnect shutdown, and config read-after-write ordering. 
- Local focused protocol validation after the latest rebase is currently blocked by packageproxy failing to resolve locked `rustls-webpki 0.103.13`; CI is expected to provide the full validation signal.	2026-04-28 12:23:34 -07:00
Dylan Hurd	7f7c7c2c07	Fix log db batch flush flake (#19959 ) ## Why The log DB writer batches tracing events before inserting them into SQLite, but `tokio::time::interval` produces an immediate first tick. That meant the inserter could flush the first accepted log entry before `batch_size` was reached, making `configured_batch_size_flushes_without_explicit_flush` timing-sensitive in CI. ## What Changed - Consume the interval's startup tick before entering the inserter loop, so interval flushing starts after the configured delay. - Remove the test's startup sleep, which was masking the race instead of proving the batch-size behavior. ## Validation - `cargo test -p codex-state` - `cargo test -p codex-state configured_batch_size_flushes_without_explicit_flush` passed 3 consecutive focused runs - PR checks passed across `rust-ci`, Bazel, `ci`, `sdk`, `cargo-deny`, Codespell, blob-size policy, and CLA	2026-04-28 12:08:41 -07:00
viyatb-oai	3377afd84a	fix(network-proxy): harden linux proxy bridge helpers (#20001 ) ## Why The Linux managed-proxy bridge helpers are long-lived child processes in the sandbox networking path. Before this change they stayed dumpable and the network seccomp profile did not block cross-process memory syscalls, so another same-user process could potentially inspect or modify bridge memory instead of interacting only through the intended proxy interface. ## What changed - reuse the shared `codex-process-hardening` helper to mark bridge helper children non-dumpable before they begin serving - deny `process_vm_readv` and `process_vm_writev` in the existing network seccomp filter ## Security impact Bridge helpers are less exposed to same-user cross-process inspection or memory writes, which reduces the chance that sandboxed code can interfere with proxy support processes outside the intended IPC path. ## Verification - `cargo test -p codex-process-hardening` - `cargo test -p codex-linux-sandbox` - attempted `cargo check -p codex-linux-sandbox --target x86_64-unknown-linux-gnu`; blocked on missing `x86_64-linux-gnu-gcc` on this macOS host --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 11:52:50 -07:00
charley-openai	de2ccf9473	[codex] Add token usage to turn tracing spans (#19432 ) ## Why Slow Codex turns are easier to debug when token usage is visible in the trace itself, without joining against separate analytics. This adds token usage to existing turn-handling spans for regular user turns only. [Example turn](https://openai.datadoghq.com/apm/trace/9d353efa2cb5de1f4c5b93dc33c3df04?colorBy=service&graphType=flamegraph&shouldShowLegend=true&sort=time&spanID=3555541504891512675&spanViewType=metadata&traceQuery=) ## What Changed - Added response-level token fields on completed `handle_responses` spans: `gen_ai.usage.input_tokens`, `gen_ai.usage.cache_read.input_tokens`, `gen_ai.usage.output_tokens`, `codex.usage.reasoning_output_tokens`, `codex.usage.total_tokens` - Added aggregate token fields on regular turn spans: `codex.turn.token_usage.*` - Added an explicit regular-turn opt-in via `SessionTask::records_turn_token_usage_on_span()` so this is not coupled to span-name strings. ## Testing - `cargo test -p codex-otel` - `cargo test -p codex-core turn_and_completed_response_spans_record_token_usage` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-otel` - Manual local Electron/app-server smoke test: regular user turn emits the new span fields Known status: `cargo test -p codex-core` was attempted and failed in unrelated existing areas: config approvals, request-permissions, git-info ordering, and subagent metadata persistence.	2026-04-28 11:41:32 -07:00
canvrno-oai	640a1b23ea	Fix plan mode nudge test after task completion signature change (#20045 ) Updates the plan mode nudge test to pass the new `duration_ms` argument to task completion. Co-authored-by: Codex <noreply@openai.com>	2026-04-28 11:24:22 -07:00
Michael Bolin	9e26613657	permissions: add built-in default profiles (#19900 ) ## Why The migration away from `SandboxPolicy` needs new configs to start from permissions profiles instead of deriving profiles from legacy sandbox modes. Existing users can have empty `config.toml` files, and we should not rewrite user-owned config files that may live in shared repositories. This PR introduces built-in profile names so an empty config can resolve to a canonical `PermissionProfile`, while explicit named `[permissions]` profiles still behave predictably. ## What changed - Adds built-in `default_permissions` profile names: - `:read-only` maps to `PermissionProfile::read_only()`. - `:workspace` maps to the workspace-write profile, including project-root metadata carveouts. - `:danger-no-sandbox` maps to `PermissionProfile::Disabled`, preserving the distinction between no sandbox and a broad managed sandbox. - Reserves the `:` prefix for built-in profiles so user-defined `[permissions]` profiles cannot collide with future built-ins. - Allows `default_permissions` to reference a built-in profile without requiring a `[permissions]` table. - Makes an otherwise empty config choose a built-in profile by trust/platform context: trusted or untrusted project roots use `:workspace` when the platform supports that sandbox, while roots without a trust decision use `:read-only`. - Keeps legacy `sandbox_mode` configs on the legacy path, and still rejects user-defined `[permissions]` profiles that omit `default_permissions` so we do not silently guess among custom profiles. - Preserves compatibility behavior for implicit defaults: bare `network.enabled = true` allows runtime network without starting the managed proxy, explicit profile proxy policy still starts the proxy, and implicit workspace/add-dir roots keep legacy metadata carveouts. 
## Verification - `cargo test -p codex-core builtin --lib` - `cargo test -p codex-core profile_network_proxy_config` - `cargo test -p codex-core implicit_builtin_workspace_profile_preserves_add_dir_metadata_carveouts` - `cargo test -p codex-core permissions_profiles_network_enabled_allows_runtime_network_without_proxy` - `cargo test -p codex-core permissions_profiles_proxy_policy_starts_managed_network_proxy` ## Documentation Public Codex config docs should mention these built-in names when the `[permissions]` config format is ready to document as stable. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19900). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * #20016 * #20015 * #20013 * #20011 * #20010 * #20008 * __->__ #19900	2026-04-28 11:21:39 -07:00
viyatb-oai	3afb185a4f	fix(network-proxy): tighten network proxy bypass defaults (#20002 ) ## Why Managed sessions use `NO_PROXY` to keep a small set of destinations on the direct path by default. The old default also bypassed all IPv4 link-local addresses in `169.254.0.0/16`, which includes metadata endpoints such as `169.254.169.254`. Because `NO_PROXY` is evaluated by the client before the request reaches the managed proxy, requests to that range could skip proxy-side allowlist and local-binding checks entirely. On hosts where a link-local metadata service is reachable, that creates a path to sensitive environment metadata or credentials outside the intended enforcement point. ## What changed - remove the default IPv4 link-local `169.254.0.0/16` bypass from the managed proxy environment - keep the existing loopback and private-network defaults unchanged - update the regression assertion to lock in the narrower default ## Security impact Link-local requests now stay on the managed-proxy path by default, so the proxy can apply configured policy before they reach metadata-style endpoints or other link-local services. ## Verification - `cargo test -p codex-network-proxy` Co-authored-by: Codex <noreply@openai.com>	2026-04-28 10:51:43 -07:00
stefanstokic-oai	4c68bd728f	External agent session support (#19895 ) ## Summary This extends external agent detection/import beyond config artifacts so Codex can detect recent session files from the external agent home and import them into Codex rollout history. ## What changed - Added a focused `external_agent_sessions` module for: - session discovery - source-record parsing - rollout construction - import ledger tracking - Wired session detection/import into the app-server external agent config API. - Added compaction handling so large imported sessions can be resumed safely before the first follow-up turn. ## Testing Added coverage for: - recent-session detection - custom-title handling - recency filtering - dedupe and re-detect-after-source-change behavior - visible imported turn construction - backward-compatible import payload deserialization - end-to-end RPC import flow - rejection of undetected session paths - repeat-import behavior - large-session compaction before first follow-up Ran: - `cargo test -p codex-app-server external_agent_config_import_ --test all`	2026-04-28 17:42:36 +00:00
Felipe Coury	a036584104	fix(tui): let esc exit empty shell mode (#19986 ) ## Summary - exit shell mode when `Esc` is pressed while the absorbed `!` is the only input - add direct regression coverage plus a composer snapshot for the restored normal prompt state ## Root cause Shell mode stores the leading `!` outside the editable textarea. After typing only `!`, the textarea is empty but the composer is still in bash mode, so the existing empty-composer `Esc` handling never runs. ## Validation - `just fmt` - `cargo test -p codex-tui bottom_pane::chat_composer::tests::esc_exits_empty_shell_mode` - `cargo test -p codex-tui bottom_pane::chat_composer::tests::footer_mode_snapshots` - `cargo insta pending-snapshots` `cargo test -p codex-tui` still reports unrelated existing `/status` snapshot drift in this local environment because the rendered permissions text is `workspace-write with network access` instead of the older `read-only` fixture text.	2026-04-28 14:35:24 -03:00
canvrno-oai	bc5a1b961e	Move local /resume cwd filtering into thread/list (#19931 ) Move local resume and fork cwd filtering to `thread/list` instead of filtering in the TUI. This makes the `/resume` menu feel slightly faster to load when working in repos with many historical threads, and centralizes the cwd filtering in app-server. Affected: - /resume from inside the TUI. - codex resume with no session ID and without --last - codex resume --all - codex fork with no session ID and without --last - codex fork --all Not affected: - codex resume <id> - codex fork <id> - codex resume --last - codex fork --last Steps to test performance improvement in a real Codex environment: - Launch `codex resume` using compiled binary in a directory that has seen many threads. - Launch `codex resume` using release binary in same directory. - Observe difference in time-to-full-page as threads load.	2026-04-28 10:35:10 -07:00
Felipe Coury	c6bcd27832	feat(tui): suggest plan mode from composer drafts (#19901 ) ## Summary - suggest Plan mode when the current composer draft contains the standalone word `plan` - shares the Codex App heuristics for detection - excludes things like `/plan` and the word plan in shell mode - reuse the existing `Shift+Tab` mode cycle and add thread-scoped dismissal with `Esc` - replace the normal footer hint while the reminder is visible so the statusline stays anchored https://github.com/user-attachments/assets/01123ae8-cee6-4e95-b563-44655c071cde ## Why The desktop app already nudges users toward Plan mode when their draft clearly signals planning intent. The TUI had the underlying `/plan` and `Shift+Tab` flows, but no equivalent reminder at the moment the user was most likely to benefit from them. ## Details The reminder is shown only when Plan mode is available, the draft contains standalone `plan`, the user is not already in Plan mode, the composer is actionable, and the current thread has not dismissed the reminder. Slash-command and shell-command drafts are excluded. The first implementation used an extra composer row, but that moved the statusline whenever the heuristic fired. This version keeps the layout stable by rendering the reminder in the existing footer row instead. ## Validation - `INSTA_UPDATE=always cargo test -p codex-tui chatwidget::tests::plan_mode::plan_mode_nudge -- --nocapture` - `just fmt` - `just fix -p codex-tui` - `./tools/argument-comment-lint/run.py -p codex-tui` - `cargo insta pending-snapshots` - `git diff --check`	2026-04-28 14:34:10 -03:00
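The draft heuristic in #19901 (standalone `plan`, with slash-command and shell-mode drafts excluded) might look roughly like the sketch below. It is an illustration only: `suggests_plan_mode` is a hypothetical name, and the case-insensitive match and punctuation trimming are assumptions, since the commit message does not spell out the shared Codex App rules.

```rust
/// Hypothetical check for the Plan mode nudge.
/// `shell_mode` is passed separately because the leading `!` is stored
/// outside the editable text, per the related shell-mode fix (#19986).
fn suggests_plan_mode(draft: &str, shell_mode: bool) -> bool {
    // Shell-command drafts never trigger the reminder.
    if shell_mode {
        return false;
    }
    // Slash-command drafts such as `/plan ...` are excluded as well.
    if draft.trim_start().starts_with('/') {
        return false;
    }
    draft
        .split_whitespace()
        // Skip slash-prefixed tokens anywhere in the draft (assumption).
        .filter(|token| !token.starts_with('/'))
        // Strip surrounding punctuation so "plan?" still counts.
        .map(|token| token.trim_matches(|c: char| !c.is_alphanumeric()))
        // Only the standalone word matches; "planning" does not.
        .any(|word| word.eq_ignore_ascii_case("plan"))
}
```

In the TUI, the remaining conditions from the commit message (Plan mode available, not already in Plan mode, composer actionable, reminder not dismissed for the thread) would gate the result of a check like this.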
maja-openai	273c2e21a9	Clarify network approval auto-review prompts (#19907 ) ## Why Network access approval prompts were showing the generic retry reason, which made auto-review focus on the blocked connection instead of the command that caused it. This makes network approvals easier to assess by telling the reviewer to evaluate whether the triggering command was authorised by the user and within policy, and to treat the network call as acceptable when it is a reasonable consequence of that command. ## What changed - Split guardian approval request prompt rendering so `NetworkAccess` has a dedicated branch. - For network requests, show `Network approval context` and `Network access JSON` instead of `Retry reason` / `Planned action JSON`. - Added regression coverage for the network approval prompt wording and for omitting retry reason in this case. ## Verification - `cargo test -p codex-core guardian::tests::build_guardian_prompt_items_explains_network_access_review_scope`	2026-04-28 10:25:37 -07:00
mchen-oai	01de13b7e6	Record MCP result telemetry on mcp.tools.call spans (#19509 ) ## Why - Without change: MCP tool call spans include request-side details such as server, tool, call ID, connector, session, and turn. - Issue: Some useful telemetry is only known by the MCP server after it handles the tool call, such as target identity or whether the call triggered a user-facing flow. ## What Changed - With change: Codex reads allowlisted telemetry from `_meta["codex/telemetry"]["span"]` and records it on the `mcp.tools.call` span. - Adds span fields for `codex.mcp.target.id` and `codex.mcp.user_flow.triggered`, with strict type checks and bounded target ID length. ## Verification `codex-rs/core/src/mcp_tool_call_tests.rs`	2026-04-28 17:20:38 +00:00
evawong-oai	0670d8971a	Enforce workspace metadata protections in Seatbelt (#19847 ) ## Summary Translate FileSystemSandboxPolicy project root metadata carveouts into macOS Seatbelt rules. ## Scope 1. Thread protected metadata names into Seatbelt access roots. 2. Ask FileSystemSandboxPolicy whether each metadata carveout is writable. 3. Emit Seatbelt deny rules that block creating or replacing protected metadata names under writable roots. 4. Add coverage for first time metadata creation and read only carveouts. ## Reviewer Focus 1. This PR only covers the macOS sandbox adapter. 2. The policy decision comes from FileSystemSandboxPolicy. 3. Read only subpath carveouts and metadata protection checks should compose cleanly. ## Stack 1. Policy primitive: #19846 2. macOS Seatbelt adapter: this PR 3. Shell preflight UX: #19848 4. Runtime profile propagation: #19849 5. Linux bubblewrap adapter: #19852 ## Validation 1. formatting for codex sandboxing 2. codex sandboxing package tests	2026-04-28 10:13:00 -07:00
efrazer-oai	f6797c3ac6	feat: verify agent identity JWTs with JWKS (#19764 )	2026-04-28 09:56:20 -07:00

5468 Commits