codex

mirror of https://github.com/openai/codex.git synced 2026-05-03 02:46:39 +00:00

Author	SHA1	Message	Date
alexsong-oai	cde2fbf4e3	Release 0.126.0-alpha.13	2026-04-28 20:05:32 -07:00
starr-openai	e1ec9e63a0	Add environment provider snapshot (#20058 ) ## Summary - Change `EnvironmentProvider` to return concrete `Environment` instances instead of `EnvironmentConfigurations`. - Make `DefaultEnvironmentProvider` provide the provider-visible `local` environment plus optional `remote` environment from `CODEX_EXEC_SERVER_URL`. - Keep `EnvironmentManager` as the concrete cache while exposing its own explicit local environment for `local_environment()` fallback paths. ## Validation - `just fmt` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 20:05:18 -07:00
xl-openai	6f328d5e02	Soften skill description budget warnings (#20112) Updates the skill description budget messaging to be less alarming.	2026-04-28 19:56:25 -07:00
Michael Bolin	e6db1a9442	linux-sandbox: switch helper plumbing to PermissionProfile (#20106 ) ## Why `PermissionProfile` is the canonical runtime permission model in the Rust workspace, but the Linux sandbox helper still accepted a legacy `SandboxPolicy` plus separate filesystem and network policy flags. That translation layer made the helper interface harder to reason about and left `linux-sandbox`-specific callers and tests coupled to the legacy policy representation. This change moves the helper onto `PermissionProfile` directly so the Linux sandbox plumbing matches the rest of the permission stack. ## What changed - changed `codex-linux-sandbox` to accept `--permission-profile` and derive the runtime filesystem and network policies internally - updated the in-process seccomp and legacy Landlock path in `codex-rs/linux-sandbox` to operate on `PermissionProfile` - updated Linux sandbox argv construction in `codex-rs/sandboxing`, `codex-rs/core`, and the CLI debug sandbox path to pass the canonical profile instead of serializing compatibility policy projections - simplified the Linux sandbox tests to build the exact permission profile under test, including the managed-proxy path and direct-runtime-enforcement carveout coverage - removed helper-local `SandboxPolicy` usage from `bwrap` tests where `FileSystemSandboxPolicy` is already the value being exercised ## Testing - `cargo test -p codex-sandboxing` - `cargo test -p codex-linux-sandbox` (on this macOS host, the crate compiled cleanly and its Linux-only tests were cfg-gated) - `cargo test -p codex-core --no-run` - `cargo test -p codex-cli --no-run`	2026-04-28 19:43:44 -07:00
Celia Chen	80fb0704ee	feat: update Bedrock Mantle endpoint and GPT-5.4 model ID (#20109 ) ## Summary Amazon Bedrock Mantle's OpenAI-compatible endpoint now lives under `/openai/v1`, and the GPT-5.4 Mantle model ID no longer uses the `-cmb` suffix. This updates Codex's built-in Bedrock provider configuration so generated providers and the static Bedrock catalog use the current endpoint and model ID. ## Changes - Update the Bedrock Mantle base URL from `https://bedrock-mantle.{region}.api.aws/v1` to `https://bedrock-mantle.{region}.api.aws/openai/v1`. - Update the Amazon Bedrock default base URL in `codex-model-provider-info`. - Change the Bedrock GPT-5.4 catalog slug from `openai.gpt-5.4-cmb` to `openai.gpt-5.4`. - Align provider and catalog tests with the new URL and model ID. ## Test Plan - Manual smoke test: ```shell target/debug/codex \ -m openai.gpt-5.4 \ -c 'model_provider="amazon-bedrock"' \ -c 'model_providers.amazon-bedrock.aws.region="us-west-2"' ```	2026-04-29 01:37:21 +00:00
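
The endpoint change in the Bedrock Mantle commit above can be pictured as a small formatting helper. This is only a sketch: `bedrock_mantle_base_url` is a hypothetical name, not the function Codex uses, and the only facts taken from the commit message are the URL template and the `/openai/v1` path.

```rust
/// Hypothetical helper (not Codex's actual code): builds the Bedrock Mantle
/// base URL for a region, using the `/openai/v1` path named in the commit.
fn bedrock_mantle_base_url(region: &str) -> String {
    format!("https://bedrock-mantle.{region}.api.aws/openai/v1")
}
```

For example, passing `us-west-2` (the region used in the commit's smoke test) yields a URL ending in `/openai/v1` rather than the older `/v1`.
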
Celia Chen	8c47e36504	feat: expose provider capability bounds to app server clients (#20049) Follow-up to #19442. The app server now exposes provider-derived bounds through a new v2 `modelProvider/read` method. The response reports the configured provider map key as `modelProvider` and returns the effective capability booleans so clients can align their UI with the same provider-owned limits used by core.	2026-04-29 01:36:19 +00:00
canvrno-oai	4c39ad33cb	Fix plugin list workspace settings test isolation (#20086) Fixes a test that often fails locally when running `cargo test`. - Add an app-server test helper that combines managed-config isolation with custom env overrides. - Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests so host home marketplaces do not affect results.	2026-04-28 18:34:38 -07:00
canvrno-oai	24be9ac0a4	Restore TUI working status after steer message is set (#19939) Fix for #19925. Restore the `Working` indicator after a streamed final answer finishes when a user steer message is sent. Add regression coverage for long output plus a mid-stream steer: `cargo test -p codex-tui final_answer_completion_restores_status_indicator_for_pending_steer` Reproduction/testing steps: 1. Start a new thread and ask for a long response. 2. While the response is streaming, submit a steer message. 3. When the first response finishes, observe whether `Working...` is shown while waiting for the steer message response.	2026-04-28 18:10:40 -07:00
Michael Bolin	c9f7c88f3d	fix: restore live event submit path for apply patch tests (#20108 ) ## Summary This fixes the CI regression introduced by [#20040](https://github.com/openai/codex/pull/20040). That PR migrated several `apply_patch_cli` tests from direct `codex.submit(Op::UserTurn { ... })` calls to `harness.submit(...)`. `harness.submit()` waits for `TurnComplete` before returning, which drains the same event stream that these tests use to assert `TurnDiff`, `PatchApplyUpdated`, and related live events. The regressed tests then timed out waiting for events that had already been consumed. This change restores a no-wait submit path for the event-observing `apply_patch_cli` tests so they can watch the turn stream directly again. ## What Changed - added a local `submit_without_wait(...)` helper in `codex-rs/core/tests/suite/apply_patch_cli.rs` - switched the `apply_patch_cli` tests that assert live turn events back to that helper - left the profile-backed `harness.submit(...)` migration in place for tests that only care about final filesystem or tool output state ## Why macOS Looked Green In the failing run [25084487331](https://github.com/openai/codex/actions/runs/25084487331), `//codex-rs/core:core-all-test` was cached on macOS, so the regressed tests were not rerun there. The Linux GNU, Linux MUSL, and Windows Bazel jobs reran the target and exposed the failure. ## Verification - `cargo test -p codex-core apply_patch_ -- --nocapture` - previously failing local cases now pass again: - `apply_patch_cli_move_without_content_change_has_no_turn_diff` - `apply_patch_turn_diff_for_rename_with_content_change` - `apply_patch_aggregates_diff_across_multiple_tool_calls`	2026-04-28 18:09:20 -07:00
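
The regression described in the commit above (a waiting submit helper draining the events a test later wants to assert on) can be reproduced with a toy model. This sketch uses a standard-library channel instead of the real Codex harness; `Event`, `run_turn`, `submit_and_wait`, and `saw_turn_diff` are hypothetical stand-ins, and only the name `submit_without_wait` comes from the commit message.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Toy event type; the real protocol events carry payloads.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    PatchApplyUpdated,
    TurnDiff,
    TurnComplete,
}

/// Stand-in for the agent: emits the live events for one turn.
fn run_turn(tx: &Sender<Event>) {
    for ev in [Event::PatchApplyUpdated, Event::TurnDiff, Event::TurnComplete] {
        tx.send(ev).unwrap();
    }
}

/// Models a waiting submit: it reads the stream until `TurnComplete`,
/// consuming every event that arrived before it.
fn submit_and_wait(tx: &Sender<Event>, rx: &Receiver<Event>) {
    run_turn(tx);
    while let Ok(ev) = rx.recv() {
        if ev == Event::TurnComplete {
            break;
        }
    }
}

/// Models the no-wait path: events stay queued for the test to observe.
fn submit_without_wait(tx: &Sender<Event>) {
    run_turn(tx);
}

/// Checks whether a `TurnDiff` is still available on the stream.
fn saw_turn_diff(rx: &Receiver<Event>) -> bool {
    rx.try_iter().any(|ev| ev == Event::TurnDiff)
}
```

After `submit_and_wait`, `saw_turn_diff` returns `false` because the event was already consumed; after `submit_without_wait`, it returns `true`. In the real tests the symptom was a timeout rather than a `false` result, since the tests block while waiting for the missing event.
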
Celia Chen	f8fe96d548	feat: disable capabilities by model provider (#19442) ## Why Unsupported features must fail closed and Codex must not expose OpenAI-hosted fallback paths when the active provider cannot support them. In practice, Bedrock should not surface app connectors, MCP servers, tool search/suggestions, image generation, web search, or JS REPL until those paths are explicitly supported for that provider. This PR moves that decision into provider-owned capability metadata instead of scattering Bedrock-specific checks across callers. ## What changed - Adds `ProviderCapabilities` to `codex-model-provider`, with default support for existing providers and a Bedrock override that disables unsupported launch surfaces. - Adds `ToolCapabilityBounds` to `codex-tools` so provider capability limits can clamp otherwise-enabled tool config. - Applies capability bounds when building session and review-thread tool config. - Routes MCP/app connector configuration through `McpManager::mcp_config`, which filters configured MCP servers and app connectors based on the active provider. - Updates app-server MCP list/read paths to use the filtered MCP config. - Adds coverage for default provider capabilities, Bedrock disabled capabilities, and optional tool-surface clamping. ## Testing Built locally and verified that Bedrock Responses API requests now return without errors caused by calls to unsupported tools.	2026-04-28 17:51:30 -07:00
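
The "fail closed" clamping described in the commit above can be sketched as a logical AND between what the user enabled and what the provider supports. The type name `ToolCapabilityBounds` appears in the commit message, but its fields, the `ToolConfig` struct, and the `clamp` method here are assumptions made for illustration and are not the actual Codex definitions.

```rust
/// Hypothetical tool configuration as requested by user settings.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToolConfig {
    web_search: bool,
    image_generation: bool,
    js_repl: bool,
}

/// Assumed shape of provider-owned bounds: `true` means the provider
/// supports the surface, `false` means it must stay disabled.
#[derive(Debug, Clone, Copy)]
struct ToolCapabilityBounds {
    web_search: bool,
    image_generation: bool,
    js_repl: bool,
}

impl ToolCapabilityBounds {
    /// Bounds can only turn features off; they never enable a tool
    /// that the requested config left disabled.
    fn clamp(self, requested: ToolConfig) -> ToolConfig {
        ToolConfig {
            web_search: requested.web_search && self.web_search,
            image_generation: requested.image_generation && self.image_generation,
            js_repl: requested.js_repl && self.js_repl,
        }
    }
}
```

With all bounds set to `false` (as a Bedrock-style override might do), every tool ends up disabled regardless of user config; with all bounds `true`, the requested config passes through unchanged.
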
alexsong-oai	cb8b1bbcd6	Support detect and import MCP, Subagents, hooks, commands from external (#19949 ) ## Why This PR expands the migration path so Codex can detect and import MCP server config, hooks, commands, and subagents configs in a Codex-native shape. ## What changed - Added a `codex-external-agent-migration` crate that owns conversion logic for external-agent MCP servers, hooks, commands, and subagents. - Extended the app-server external-agent config detection/import API with migration item types for MCP server config, hooks, commands, and subagents. ## Migration strategy The migration is intentionally conservative: Codex only imports external-agent config that can be represented safely in Codex today. Unsupported or ambiguous config is skipped instead of being partially translated into behavior that may not match the source system. - MCP servers: import supported stdio and HTTP MCP server definitions into `mcp_servers`. Disabled servers and servers filtered out by source `enabledMcpjsonServers` / `disabledMcpjsonServers` are skipped. Project-scoped MCP entries from `.claude.json` are included when they match the repo path. - Hooks: import only supported command hooks into `.codex/hooks.json`. Unsupported hook features such as conditional groups, async handlers, prompt/http hooks, or unknown fields are skipped. Referenced hook scripts are copied into `.codex/hooks/`, preserving any existing target scripts. - Commands: import supported external commands as Codex skills under `.agents/skills/source-command-`. Commands that rely on source runtime expansion such as `$ARGUMENTS`, `$1`, `@file` references, shell interpolation, or colliding generated names are skipped. - Subagents: import valid subagent Markdown files into `.codex/agents/.toml` when they have the minimum Codex agent fields. Source model names are not migrated, so imported agents keep the user’s Codex default model; compatible reasoning effort and sandbox mode are migrated when present. 
- Skills and project guidance: copy missing skill directories into `.agents/skills` and migrate `CLAUDE.md` guidance into `AGENTS.md`, rewriting source-agent terminology to Codex terminology where appropriate. - Detection details: detected migration items include lightweight details for UI preview, such as MCP server names, hook event names, generated command skill names, and subagent names. Import still recomputes from disk instead of trusting details as the source of truth. - Adds focused coverage for the new migration behavior and app-server import flow. ## Verification - `cargo test -p codex-external-agent-migration` - `cargo test -p codex-hooks` - `cargo test -p codex-app-server external_agent_config` - `just bazel-lock-check`	2026-04-29 00:45:24 +00:00
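
The conservative command-import rule in the migration commit above (skip anything that relies on source runtime expansion) might look roughly like the following check. This is a heuristic sketch, not the logic in `codex-external-agent-migration`: the function name is hypothetical, and it covers only the `$ARGUMENTS`, `$1`, and `@file` cases named in the commit message, not shell interpolation or name collisions.

```rust
/// Hypothetical heuristic: returns true when a command body appears to
/// depend on runtime expansion and should therefore be skipped.
fn relies_on_runtime_expansion(body: &str) -> bool {
    // `$ARGUMENTS` placeholder.
    if body.contains("$ARGUMENTS") {
        return true;
    }
    // `$1`-style positional arguments: a `$` immediately followed by a digit.
    if body
        .as_bytes()
        .windows(2)
        .any(|w| w[0] == b'$' && w[1].is_ascii_digit())
    {
        return true;
    }
    // `@file` references: a whitespace-separated token that starts with `@`
    // and has at least one more character.
    body.split_whitespace()
        .any(|tok| tok.len() > 1 && tok.starts_with('@'))
}
```

A body such as `Review @src/main.rs` would be skipped, while plain instructions like `Run the linter and report` would be eligible for import under this sketch.
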
Matthew Zeng	ebdf3a878c	Support disabling tool suggest for specific tools. (#20072 ) ## Summary - Add `disable_tool_suggest` to app and plugin config, schema, and TypeScript output - Exclude disabled connectors and plugins from tool suggestion discovery - Persist "never show again" tool-suggestion choices back into `config.toml` - Update config docs and add coverage for connector and plugin suppression ## Testing - Added and updated unit tests for config persistence and tool-suggest filtering - Not run (not requested)	2026-04-29 00:19:34 +00:00
Michael Bolin	1211a90a35	core tests: migrate hook turns to profiles (#20041 ) ## Summary - Removes `SandboxPolicy` from the hooks test suite. - Submits hook-related turns with explicit `PermissionProfile` values for disabled, read-only, and workspace-write cases. - Preserves the managed-network hook test by configuring and submitting a workspace-write profile with enabled network, allowing the existing requirements-backed proxy path to remain covered. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:18:45 -07:00
Michael Bolin	1fed948c66	core tests: migrate apply patch turns to profiles (#20040 ) ## Summary - Removes `SandboxPolicy` from the apply-patch CLI test suite. - Uses the harness' profile-backed submit helper for danger/no-sandbox turns instead of constructing `Op::UserTurn` manually with legacy fields. - Converts the workspace-write traversal cases to submit `PermissionProfile::workspace_write_with(...)` directly. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:18:19 -07:00
Michael Bolin	1dae5788e1	core tests: migrate rmcp turns to profiles (#20037 ) ## Summary - Removes `SandboxPolicy` from the RMCP client test suite. - Adds shared read-only user-turn helpers that submit `PermissionProfile::read_only()` plus the legacy compatibility projection required by the current `Op::UserTurn` shape. - Keeps sandbox metadata assertions intact by deriving the expected legacy `sandboxPolicy` value from the same read-only profile used for the turn. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:17:47 -07:00
Michael Bolin	6662c0f312	core tests: migrate compact turns to profiles (#20035 ) ## Summary - Removes the remaining `SandboxPolicy` usage from the compaction test suite. - Adds a small local helper for direct `Op::UserTurn` construction so these tests send `PermissionProfile::Disabled` plus the legacy compatibility projection required by the protocol field. - Keeps the existing danger/full-access behavior while exercising the canonical permission profile path. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:17:12 -07:00
Michael Bolin	026df712cc	core tests: migrate zsh-fork permissions to profiles (#20034 ) ## Summary - Updates the zsh-fork test helper to configure `PermissionProfile` directly instead of constructing a legacy `SandboxPolicy`. - Sends permission-profile-backed turns from the skill approval zsh-fork tests so the runtime and request path exercise the canonical permissions model. - Leaves the broader approvals suite on legacy policies for now, except for the zsh-fork test that shares this helper. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:15:58 -07:00
Michael Bolin	1ea90410e1	core tests: migrate request permissions tool turns to profiles (#20033 ) ## Summary This migrates the macOS request-permissions tool tests from legacy `SandboxPolicy` setup to `PermissionProfile` setup. The tests still exercise the same workspace-write baseline and request-permission grants, but the canonical permissions value is now the profile. ## Changes - Replaces the `workspace_write_excluding_tmp()` helper with a `PermissionProfile::workspace_write_with()` helper. - Applies test config through `Permissions::set_permission_profile()`. - Uses `turn_permission_fields()` for `Op::UserTurn` compatibility fields. - Removes the `SandboxPolicy` import from `request_permissions_tool.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:15:13 -07:00
Michael Bolin	af39e488bc	core tests: migrate prompt caching turns to profiles (#20032 ) ## Summary This removes the explicit `SandboxPolicy` constructors from `core/tests/suite/prompt_caching.rs`. The tests still exercise the same prompt-cache invariants across permission and turn-context changes, but the permission source is now `PermissionProfile`. ## Changes - Uses `PermissionProfile::workspace_write_with()` for workspace-write override scenarios. - Uses `PermissionProfile::Disabled` for the no-sandbox per-turn override. - Projects profiles through `turn_permission_fields()` or `to_legacy_sandbox_policy()` only to populate compatibility fields on existing ops. - Removes the `SandboxPolicy` import from `prompt_caching.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:13:53 -07:00
Michael Bolin	5d08315c00	core tests: migrate exec policy turns to profiles (#20030 ) ## Summary This migrates `core/tests/suite/exec_policy.rs` away from legacy `SandboxPolicy` turn construction. These tests all use no-sandbox turns to exercise exec-policy behavior, so `PermissionProfile::Disabled` is the canonical representation. ## Changes - Replaces direct `SandboxPolicy::DangerFullAccess` turn fields with `PermissionProfile::Disabled`. - Uses `turn_permission_fields()` to populate the compatibility `sandbox_policy` field required by `Op::UserTurn`. - Removes the `SandboxPolicy` import from `exec_policy.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:12:48 -07:00
Michael Bolin	b599849d86	core tests: migrate permissions message tests to profiles (#20028 ) ## Summary This removes another test-only `SandboxPolicy` dependency by configuring `permissions_messages.rs` with a `PermissionProfile` directly. The test still verifies the rendered compatibility permissions text, but now obtains the legacy projection from the loaded `Config` rather than using `SandboxPolicy` as the source of truth. ## Changes - Builds the workspace-write test setup with `PermissionProfile::workspace_write_with()`. - Applies that profile through `Permissions::set_permission_profile()`. - Uses `Config::legacy_sandbox_policy()` only for the expected `PermissionsInstructions` compatibility rendering. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:12:10 -07:00
Michael Bolin	3ef09c71d3	core tests: migrate tools tests to permission profiles (#20027 ) ## Summary This continues the test-side migration away from `SandboxPolicy` by removing the remaining legacy policy setup in `core/tests/suite/tools.rs`. The affected test was already modeling a profile-backed filesystem policy with a deny-read glob, so configuring the test through `Permissions::set_permission_profile()` is a better match for the behavior being exercised. ## Changes - Drops the `SandboxPolicy` import from `core/tests/suite/tools.rs`. - Configures the glob deny-read shell test directly with a `PermissionProfile` instead of creating a legacy read-only policy first. - Submits the test turn with the session permission profile so the deny-read glob remains active for the command under test. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:11:43 -07:00
Michael Bolin	8d3992d830	core tests: migrate plan item turns to profiles (#20026 ) ## Why The core item tests still had a cluster of plan-mode `Op::UserTurn` literals that used `SandboxPolicy::DangerFullAccess` and omitted `permission_profile`. These tests are validating emitted item lifecycle events, so keeping them on the legacy sandbox-only turn shape adds noise to the broader permissions migration without testing legacy behavior. ## What Changed - Adds a local `disabled_plan_turn()` helper that preserves the existing `std::env::current_dir()` turn cwd behavior. - Uses `turn_permission_fields(PermissionProfile::Disabled, cwd)` to populate both the compatibility `sandbox_policy` and canonical `permission_profile` fields. - Replaces the plan-mode hand-built turns in `codex-rs/core/tests/suite/items.rs`, removing all `SandboxPolicy` references from that file and reducing remaining `codex-rs/core/tests` `SandboxPolicy` files from 16 to 15. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:11:17 -07:00
Michael Bolin	162f4e3183	core tests: migrate safety check turns to profiles (#20024 ) ## Why This stack is retiring direct `SandboxPolicy` construction from tests so core coverage exercises the same `PermissionProfile` turn path used by runtime code. `safety_check_downgrade.rs` still submitted each test turn as `SandboxPolicy::DangerFullAccess` with no permission profile, even though the tests are about model verification/reroute behavior rather than legacy sandbox conversion. ## What Changed - Adds a local `disabled_text_turn()` helper that derives both the compatibility `sandbox_policy` and canonical `permission_profile` from `PermissionProfile::Disabled`. - Replaces repeated hand-built `Op::UserTurn` literals in `codex-rs/core/tests/suite/safety_check_downgrade.rs` with that helper. - Removes all `SandboxPolicy` references from the safety-check suite, reducing the remaining `codex-rs/core/tests` files that mention `SandboxPolicy` from 17 to 16. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:10:42 -07:00
Michael Bolin	2a8ce9b319	core tests: migrate view image turns to profiles (#20021 ) ## Why This stack is removing direct `SandboxPolicy` usage from test code so new tests exercise the same `PermissionProfile` path that runtime code now treats as canonical. `view_image.rs` still built `Op::UserTurn` requests with `SandboxPolicy::DangerFullAccess` and no permission profile, which kept another core test module on the legacy turn shape. ## What Changed - Adds a small `disabled_user_turn()` helper for the view-image suite that derives the compatibility `sandbox_policy` and canonical `permission_profile` from `PermissionProfile::Disabled`. - Replaces repeated direct `Op::UserTurn` literals in `codex-rs/core/tests/suite/view_image.rs` with that helper. - Removes all `SandboxPolicy` references from `view_image.rs`, reducing the remaining `codex-rs/core/tests` files that mention `SandboxPolicy` from 18 to 17. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:09:48 -07:00
Michael Bolin	d77d23da2e	core tests: migrate model/personality turns to profiles (#20018 ) ## Summary - Migrates `model_switching.rs` and `personality.rs` direct `Op::UserTurn` construction from legacy `SandboxPolicy` literals to `PermissionProfile`-backed turn fields. - Adds small local helpers in each file so tests keep asserting model/personality behavior without repeating permission plumbing. - Reduces `rg -l '\bSandboxPolicy\b' codex-rs/core/tests` from 20 files to 18; `codex-rs/tui` remains at zero `SandboxPolicy` references. ## Testing - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:09:12 -07:00
Abhinav	5b0d9df1d0	Increase plugin hook env test timeout (#20100 ) # Why `plugin_hook_sources_run_with_plugin_env_and_plugin_source` can still fail on Windows after the earlier file-based assertion cleanup because the hook process itself occasionally exceeds the old 5s timeout under CI load. When that happens, the hook run ends as `Failed` before the test can inspect its structured output. The Windows Bazel failure showed the hook run itself failing after nearly 8 seconds: ```text ---- engine::tests::plugin_hook_sources_run_with_plugin_env_and_plugin_source stdout ---- thread 'engine::tests::plugin_hook_sources_run_with_plugin_env_and_plugin_source' panicked at hooks/src\engine\mod_tests.rs:428:5: assertion failed: `(left == right)` Diff < left / right > : <Failed >Completed ... test result: FAILED. 78 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.96s ``` # What - raise the flaky plugin hook env test timeout from 5s to 10s so it matches the other executed hook tests in this module # Validation - `cargo test -p codex-hooks`	2026-04-28 17:08:12 -07:00
Michael Bolin	d6d79ffcc7	core tests: send model turns with permission profiles (#20016 ) ## Summary - Migrate direct `Op::UserTurn` construction in remote-model tests from legacy `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled` via `turn_permission_fields()`. - Migrate the Responses API proxy header helper from an inline workspace-write `SandboxPolicy` to `PermissionProfile::workspace_write()`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 22 files after #20015 to 20 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20016). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * __->__ #20016	2026-04-28 17:08:04 -07:00
Michael Bolin	158b2a4201	core tests: configure profiles directly (#20015 ) ## Summary - Replace legacy sandbox config setup in delegate and telemetry tests with direct `PermissionProfile` configuration. - Move no-sandbox and read-only test turns in `tools.rs`, `code_mode.rs`, `user_shell_cmd.rs`, and `model_visible_layout.rs` from legacy `SandboxPolicy` values to `PermissionProfile` helpers, while leaving the deny-glob read-only compatibility case for a later targeted cleanup. - Use `PermissionProfile::read_only()` where tests need managed read-only behavior and `PermissionProfile::Disabled` where they intentionally need no sandbox. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 27 files after #20013 to 22 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:06:59 -07:00
Michael Bolin	52e79ee49a	core tests: migrate more turns to permission profiles (#20013 ) ## Summary - Migrate another batch of direct `Op::UserTurn` test construction from legacy `SandboxPolicy` values to `PermissionProfile` inputs via `turn_permission_fields()`. - Replace a one-off read-only `SandboxPolicy` bridge in the macOS exec test with `PermissionProfile::read_only()`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 32 files at the start of the cleanup stack to 27 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p codex-core`	2026-04-28 17:05:53 -07:00
Michael Bolin	7d15936e69	core tests: build user turns from permission profiles (#20011 ) ## Summary - Add `turn_permission_fields()` so tests that construct `Op::UserTurn` directly can provide a canonical `PermissionProfile` while still filling the required legacy `sandbox_policy` compatibility field. - Migrate direct user-turn construction in core integration tests from `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`. - Continue reducing direct `SandboxPolicy` usage in `codex-rs/core/tests`, from 41 files after #20010 to 32 files in this PR. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`	2026-04-28 17:03:20 -07:00
Ruslan Nigmatullin	c6465c1ec2	app-server: notify clients of remote-control status changes (#19919 ) ## Why Remote-control app-server enrollments have both an internal server id and the environment id exposed to remote-control clients. App-server clients need one current status snapshot that says whether remote control is usable and which environment id, if any, is exposed. A temporary websocket disconnect is not itself an identity change. Account changes, stale enrollment invalidation, successful re-enrollment, and missing ChatGPT auth are meaningful status changes. Disabled remote control remains `disabled` regardless of auth or SQLite state. SQLite startup failure disablement and enrollment persistence failures are handled in #20068; this PR reports the resulting effective status to clients. ## What changed - Adds v2 `remoteControl/status/changed` carrying `state` and `environmentId`. - Adds `RemoteControlConnectionState` values: `disabled`, `connecting`, `connected`, and `errored`. - Exposes remote-control status updates through `RemoteControlHandle` using a Tokio watch channel. - Always sends the current remote-control status snapshot to newly initialized app-server clients. - Broadcasts status changes to initialized app-server clients when state or environment id changes. - Treats missing ChatGPT auth as an `errored` status while leaving it retryable because auth can change at runtime. - Clears `environmentId` when enrollment is cleared for account changes, auth loss, stale backend invalidation, or disabled remote control. - Updates app-server protocol schema fixtures, generated TypeScript, app-server README, remote-control tests, and TUI exhaustive notification matches. ## Stack - Builds on #20068. 
## Verification - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server transport::remote_control --lib` - `cargo check -p codex-tui` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` - `just fix -p codex-tui`	2026-04-28 23:52:14 +00:00
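
The broadcast rule in the remote-control commit above (notify only when `state` or `environmentId` changes) can be sketched without any async runtime. The four state names come from the commit message; `RemoteControlStatus`, `StatusTracker`, and `update` are hypothetical, and the real implementation is described as using a Tokio watch channel rather than this plain struct.

```rust
/// The four connection states listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RemoteControlConnectionState {
    Disabled,
    Connecting,
    Connected,
    Errored,
}

/// Assumed snapshot shape: a state plus an optional environment id.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RemoteControlStatus {
    state: RemoteControlConnectionState,
    environment_id: Option<String>,
}

/// Hypothetical tracker that remembers the last reported snapshot.
struct StatusTracker {
    current: RemoteControlStatus,
}

impl StatusTracker {
    fn new() -> Self {
        Self {
            current: RemoteControlStatus {
                state: RemoteControlConnectionState::Disabled,
                environment_id: None,
            },
        }
    }

    /// Stores `next` and returns it only when it differs from the previous
    /// snapshot, so repeated identical updates produce no notification.
    fn update(&mut self, next: RemoteControlStatus) -> Option<RemoteControlStatus> {
        if next == self.current {
            return None;
        }
        self.current = next.clone();
        Some(next)
    }
}
```

A caller would broadcast whatever `update` returns and stay silent on `None`; newly initialized clients would instead be sent `current` unconditionally, matching the "always send the current snapshot" bullet.
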
Gabriel Peal	5e6cbbadf7	Return None when auth refresh fails (#20092) Right now, if Codex ends up in a state where it has auth but cannot refresh the token, the user is left with an unhelpful message telling them to log out and log back in. Ultimately, we should prevent that state from occurring, but if it does, returning `None` lets the caller redirect the user back to the login page.	2026-04-28 16:15:47 -07:00
Michael Bolin	891722849d	core tests: submit turns with permission profiles (#20010 ) ## Summary - Add `PermissionProfile`-based turn submission helpers to `core_test_support`, while keeping the legacy `SandboxPolicy` helper for tests that intentionally exercise legacy fallback behavior. - Switch the default `TestCodex::submit_turn()` path to send a real `PermissionProfile` plus the required legacy compatibility projection in `Op::UserTurn`. - Migrate straightforward app/search/shell/truncation tests from `SandboxPolicy::{DangerFullAccess, ReadOnly}` to `PermissionProfile::{Disabled, read_only}`. - Add a TUI compatibility projection helper for legacy app-server fields so non-legacy writable roots are preserved instead of being downgraded to read-only. - Fix remote start/resume/fork sandbox-mode projection to classify any managed profile with writable roots as workspace-write, not only profiles that can write `cwd`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47 files to 41 files without changing production behavior. ## Testing - `cargo check -p codex-core --tests` - `cargo test -p codex-tui compatibility_profile_preserves_unbridgeable_write_roots` - `cargo test -p codex-tui sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`	2026-04-28 23:01:40 +00:00
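
The sandbox-mode projection fix in the commit above (any managed profile with writable roots counts as workspace-write, even if `cwd` is not among them) can be illustrated with simplified types. `Profile`, `SandboxMode`, and `project_sandbox_mode` are hypothetical stand-ins; the real `PermissionProfile` is richer than this two-variant enum.

```rust
use std::path::PathBuf;

/// Simplified legacy sandbox modes used for the compatibility projection.
#[derive(Debug, PartialEq)]
enum SandboxMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

/// Hypothetical, reduced stand-in for a permission profile.
enum Profile {
    /// No sandbox at all.
    Disabled,
    /// Managed sandbox with zero or more writable roots.
    Managed { writable_roots: Vec<PathBuf> },
}

/// Classifies a profile by whether it has any writable root, not by
/// whether the current working directory happens to be writable.
fn project_sandbox_mode(profile: &Profile) -> SandboxMode {
    match profile {
        Profile::Disabled => SandboxMode::DangerFullAccess,
        Profile::Managed { writable_roots } if !writable_roots.is_empty() => {
            SandboxMode::WorkspaceWrite
        }
        Profile::Managed { .. } => SandboxMode::ReadOnly,
    }
}
```

Under this sketch, a managed profile whose only writable root is somewhere outside `cwd` still projects to `WorkspaceWrite` instead of being downgraded to `ReadOnly`.
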
viyatb-oai	2dbde94aa9	fix(network-proxy): normalize network proxy host matching (#19995 ) ## Why The proxy matches allow and deny rules against normalized host strings. Scoped IPv6 literals can arrive in equivalent forms, such as `fd00::1%eth0`, `[fd00::1%eth0]`, or `[fd00::1%25eth0]`. Policy should canonicalize those spellings without erasing scope granularity: an unscoped rule like `fd00::1` should still cover scoped requests for that address, while a scoped rule like `fd00::1%eth0` should remain exact to that scope. ## What changed - preserve IPv6 scope IDs during host normalization and canonicalize `%25scope` to `%scope` - match policy against the exact normalized host plus the unscoped IP base for scoped literals - keep local-address explicit allow checks aligned with the same scoped/unscoped semantics - add focused coverage for scoped IPv6 normalization, scoped allow rules, and scoped deny rules in `network-proxy` ## Security impact A request cannot bypass a broad deny rule by adding an IPv6 scope suffix. At the same time, scoped policy remains precise: `deny=fd00::1%eth0` affects that scoped spelling without collapsing `fd00::1%eth1` onto the same key, and `allow=fe80::1%eth0` does not implicitly allow other scopes. ## Verification - `just fmt` - `cargo test -p codex-network-proxy` - `just fix -p codex-network-proxy` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: evawong-oai <evawong@openai.com>	2026-04-28 15:50:00 -07:00
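
The scoped IPv6 semantics in the network-proxy commit above can be sketched as two steps: canonicalize the host spelling, then match a rule against both the exact host and its unscoped base. This is a simplified illustration, not the `network-proxy` implementation; `normalize_host`, `match_keys`, and `rule_matches` are hypothetical names, and the `%25` handling assumes the scope itself is not literally a name beginning with `25`.

```rust
use std::net::Ipv6Addr;

/// Strips surrounding brackets and canonicalizes `%25scope` to `%scope`
/// for IPv6 literals, preserving the scope id itself.
fn normalize_host(raw: &str) -> String {
    let trimmed = raw.trim();
    let inner = trimmed
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .unwrap_or(trimmed);
    if let Some((addr, scope)) = inner.split_once('%') {
        if addr.parse::<Ipv6Addr>().is_ok() {
            // Treat a leading "25" as the URL-encoded percent sign, unless
            // removing it would leave an empty scope.
            let scope = match scope.strip_prefix("25") {
                Some(rest) if !rest.is_empty() => rest,
                _ => scope,
            };
            return format!("{}%{}", addr.to_ascii_lowercase(), scope);
        }
    }
    inner.to_ascii_lowercase()
}

/// Keys a request host is matched under: the exact normalized host, plus
/// the unscoped address when the host carries a scope id.
fn match_keys(normalized: &str) -> Vec<&str> {
    let mut keys = vec![normalized];
    if let Some((addr, _scope)) = normalized.split_once('%') {
        keys.push(addr);
    }
    keys
}

/// True when `rule` applies to `request_host` under these semantics.
fn rule_matches(rule: &str, request_host: &str) -> bool {
    let rule = normalize_host(rule);
    let host = normalize_host(request_host);
    match_keys(&host).iter().any(|key| *key == rule.as_str())
}
```

With this sketch, an unscoped rule `fd00::1` covers a request for `[fd00::1%25eth0]`, while a scoped rule `fd00::1%eth0` does not match `fd00::1%eth1`, which mirrors the security properties the commit describes.
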
Abhinav	3291463ff1	Fix flaky plugin hook env test (#20088 ) The test was flaky because it was checking the right thing in a roundabout way. What it wanted to prove: - plugin hooks receive the right environment variables. What it actually did: 1. Run a plugin hook. 2. Have that hook write those env vars into a temporary `env.json` file. 3. After the hook finished, read `env.json` back from disk. On Windows, that last file was sometimes not there when the test tried to read it, so the test failed with `read env log: file not found`. The hook system itself was not what the test failure was directly proving; the test was failing on the extra filesystem side effect it introduced. The fix is to stop using a temp file as the proof mechanism. The hook now prints the env values in its normal structured output, and the test asserts on the output that the hook engine already captures. So we still verify the same behavior, but without depending on a separate file being created and read back correctly on Windows.	2026-04-28 15:45:26 -07:00
Owen Lin	2e598df6fc	fix: don't auto approve git -C ... (#20085 ) It's safer to make sure these commands go through approval flows.	2026-04-28 22:06:55 +00:00
canvrno-oai	66b0781502	/plugins: add marketplace install flow (#18704 ) This PR adds a new feature to the `/plugins` menu that gives users the ability to add new plugin marketplaces. It introduces an Add Marketplace tab to the right of installed marketplaces, a source prompt, loading and error states, and the app-server request flow needed to perform the install. After a successful `marketplace/add`, the popup refreshes back into the newly added marketplace tab so the new plugins are immediately visible. - Add an Add Marketplace tab to the `/plugins` menu - Prompt for marketplace source input from git repo, URL, or local path - Show loading and error states during `marketplace/add` - Refresh plugin data after success and switch into the newly added marketplace tab - Add tests and snapshot updates	2026-04-28 14:22:39 -07:00
Abhinav	c6e7d564c3	Discover hooks bundled with plugins (#19705 ) ## Why Plugins can bundle lifecycle hooks, but Codex previously only discovered hooks from user, project, and managed config layers. This adds the plugin discovery and runtime plumbing needed for plugin-bundled hooks while keeping execution behind the `plugin_hooks` feature flag. ## What - Discovers plugin hook sources from each plugin's default `hooks/hooks.json`. - Supports `plugin.json` manifest `hooks` entries as either relative paths or inline hook objects. - Plumbs discovered plugin hook sources through plugin loading into the hook runtime when `plugin_hooks` is enabled. - Marks plugin-originated hook runs as `HookSource::Plugin`. - Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook command environments. - Updates generated schemas and hook source metadata for the plugin hook source. ## Stack 1. This PR - openai/codex#19705 2. openai/codex#19778 3. openai/codex#19840 4. openai/codex#19882 ## Reviewer Notes - Core logic is in `codex-rs/core-plugins/src/loader.rs` and `codex-rs/hooks/src/engine/discovery.rs` - Moved existing / adding new tests to `codex-rs/core-plugins/src/loader_tests.rs` hence the large diff there - Otherwise mostly plumbing and minor schema updates ### Core Changes The `codex-rs/core` changes are limited to wiring plugin hook support into existing core flows: - `core/src/session/session.rs` conditionally pulls effective plugin hook sources and plugin hook load warnings from `PluginsManager` when `plugin_hooks` is enabled, then passes them into `HooksConfig`. - `core/src/hook_runtime.rs` adds the `plugin` metric tag for `HookSource::Plugin`. - `core/config.schema.json` picks up the new `plugin_hooks` feature flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the added plugin hook fields. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 14:17:18 -07:00
cassirer-openai	89698ad1c3	[rollout-trace] Include x-request-id in rollout trace. (#20066 ) ## Why Rollout traces need an identifier that can be used to correlate a Codex inference with upstream Responses API, proxy, and engine logs. The reduced trace model already exposed `upstream_request_id`, but it was being populated from the Responses API `response.id`. That value is useful for `previous_response_id` chaining, but it is not the transport request id that upstream systems key on. This PR separates those concepts so trace consumers can reliably answer both questions: - which Responses API response did this inference produce? - which upstream request handled it? ## Structure The change keeps the upstream request id at the same lifecycle level as the provider stream: - `codex-api` captures the `x-request-id` HTTP response header when the SSE stream is created and exposes it on `ResponseStream`. Fixture and websocket streams set the field to `None` because they do not have that HTTP response header. - `codex-core` carries that stream-level id into `InferenceTraceAttempt` when recording terminal stream outcomes. Completed, failed, cancelled, dropped-stream, and pre-response error paths all record the id when it is available. - `rollout-trace` now records both identifiers in raw terminal inference events and response payloads: `response_id` for the Responses API `response.id`, and `upstream_request_id` for `x-request-id`. - The reducer stores both fields on `InferenceCall`. It also uses `response_id` for `previous_response_id` conversation linking, which removes the old accidental dependency on the misnamed `upstream_request_id` field. - Terminal inference reduction now consumes the full terminal payload (`InferenceCompleted`, `InferenceFailed`, or `InferenceCancelled`) in one place. That keeps status, partial payloads, response ids, and upstream request ids consistent across success, failure, cancellation, and late stream-mapper events. 
## Why This Shape `x-request-id` is a property of the HTTP/provider response envelope, not an SSE event. Capturing it once in `codex-api` and plumbing it through terminal trace recording avoids trying to infer the value from stream contents, and it preserves the id even when the stream fails or is cancelled after only partial output. Keeping `response_id` separate from `upstream_request_id` also makes the reduced trace model less surprising: `response_id` remains the conversation-continuation id, while `upstream_request_id` is the operational correlation id for upstream debugging. ## Validation The PR updates trace and reducer coverage for: - reading `x-request-id` from SSE response headers; - storing the true upstream request id on completed inference calls; - preserving upstream request ids for cancelled and late-cancelled inference streams; - keeping `previous_response_id` reconstruction tied to `response_id` rather than transport request ids.	2026-04-28 21:11:17 +00:00
Ruslan Nigmatullin	10e2a73b3c	app-server: disable remote control without sqlite (#20068 ) ## Why Remote control depends on the app-server SQLite state DB for persisted enrollment identity. If the state DB cannot be opened at startup, continuing with remote control enabled leaves the process in a misleading state where enrollment identity cannot be read or persisted. Feature-disabled remote control remains disabled regardless of SQLite state. This only changes the case where remote control is requested but the SQLite state DB is unavailable. ## What changed - Logs SQLite state DB initialization failures instead of dropping the error silently. - Treats remote control as effectively disabled when the SQLite state DB is unavailable. - Prevents `RemoteControlHandle::set_enabled(true)` from enabling remote control later in the same process if the state DB was unavailable at startup. - Keeps the existing behavior that disabled remote control does not validate or connect to the remote-control URL. - Makes persisted enrollment load/update failures propagate as remote-control errors instead of silently falling back to in-memory state. - Makes the direct websocket connection path fail when called without a SQLite state DB. - Adds coverage for startup without a state DB, later handle enablement with no state DB, and direct websocket connection without a state DB. ## Verification - `cargo test -p codex-app-server transport::remote_control --lib` - `just fix -p codex-app-server`	2026-04-28 13:49:00 -07:00
Michael Bolin	3b74a4d3b1	tui: use permission profiles for sandbox state (#20008 ) ## Summary - Move TUI permission state from legacy `SandboxPolicy` values to canonical `PermissionProfile` values across presets, app events, chat widget state, app commands, thread routing, and cached thread session state. - Keep app-server compatibility boundaries explicit: embedded sessions send `permissionProfile`, while remote sessions send only a legacy `sandbox` projection and fall back to read-only when a custom profile cannot be projected. - Update status/add-dir UI summaries and snapshots to render the active permission profile, including workspace profiles selected by the new built-in defaults. ## Verification - `rg '\bSandboxPolicy\b' codex-rs/tui -n` returns no matches. - `cargo test -p codex-tui` - `cargo check -p codex-tui --tests` - `cargo test -p codex-tui additional_dirs` - `just fmt` - `just fix -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20008). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * #20016 * #20015 * #20013 * #20011 * #20010 * __->__ #20008	2026-04-28 20:36:48 +00:00
jif-oai	34d71d43eb	Make MultiAgentV2 wait minimum configurable (#20052 ) ## Why MultiAgentV2 `wait_agent` currently clamps short waits to a fixed 10 second minimum. That default is still useful for preventing tight polling loops, but it is too rigid for environments that need faster mailbox wake-up checks or a larger minimum to discourage frequent polling. This PR makes the minimum wait timeout configurable from the existing MultiAgentV2 feature config section, so operators can tune the behavior without changing the legacy multi-agent tool surface. ## What Changed - Added `features.multi_agent_v2.min_wait_timeout_ms`. - Defaulted the new setting to the existing 10 second floor. - Validated the configured value as `1..=3600000`, matching the existing one hour maximum wait bound. - Applied the configured minimum to MultiAgentV2 `wait_agent` runtime clamping. - Plumbed the configured minimum into the `wait_agent` tool schema, including the effective default when the minimum is above the normal 30 second default. - Regenerated `core/config.schema.json`. ## Verification - `cargo test -p codex-features` - `cargo test -p codex-tools` - `cargo test -p codex-core --lib multi_agent_v2` - `just fix -p codex-core`	2026-04-28 22:36:44 +02:00
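The clamping behavior described in this commit might look roughly like the sketch below. Only the numeric bounds (10 second floor, 30 second default, `1..=3600000` validation range) come from the commit message; the constant and function names are hypothetical and are not taken from the Codex source.

```rust
/// Bounds taken from the commit message; the names are assumptions.
const DEFAULT_MIN_WAIT_MS: u64 = 10_000; // existing 10 second floor
const DEFAULT_WAIT_MS: u64 = 30_000; // normal 30 second default
const MAX_WAIT_MS: u64 = 3_600_000; // one hour maximum wait bound

/// Hypothetical validation for `features.multi_agent_v2.min_wait_timeout_ms`.
fn validate_min_wait(min_ms: u64) -> Result<u64, String> {
    if (1..=MAX_WAIT_MS).contains(&min_ms) {
        Ok(min_ms)
    } else {
        Err(format!(
            "min_wait_timeout_ms must be in 1..={MAX_WAIT_MS}, got {min_ms}"
        ))
    }
}

/// Hypothetical runtime clamp: a missing timeout falls back to the default
/// (raised to the minimum if the minimum is larger), and any requested
/// timeout is held between the configured minimum and the one hour maximum.
fn effective_wait(requested_ms: Option<u64>, min_ms: u64) -> u64 {
    let default_ms = DEFAULT_WAIT_MS.max(min_ms);
    requested_ms
        .unwrap_or(default_ms)
        .max(min_ms)
        .min(MAX_WAIT_MS)
}
```

With the default floor, `effective_wait(Some(500), DEFAULT_MIN_WAIT_MS)` yields 10000; with a configured minimum of 250 the same request is honored as 500, and with a minimum of 60000 an unspecified timeout becomes 60000 rather than 30000.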
Ruslan Nigmatullin	1de7a9bf69	app-server: allow remote_control runtime feature override (#20047 )	2026-04-28 13:36:12 -07:00
viyatb-oai	e1ba87ccb2	fix(network-proxy): recheck network proxy connect targets (#19999 ) ## Why The proxy checks the requested host before opening the upstream connection, but DNS can resolve an allowed hostname to a loopback, private, or other non-public address after that first decision. Without a final check on the actual socket target, a request that looks acceptable at the hostname layer can still connect to a local service once resolution completes. ## What changed - add a shared TCP connector check for direct proxy egress - use that path for HTTP, `CONNECT`, SOCKS5, and MITM upstream connections - keep configured upstream proxy hops on the existing proxy path - add direct-connector coverage for allowed and rejected local targets ## Security impact Direct proxy egress now rechecks the resolved socket address before connecting, closing the gap between hostname policy evaluation and the final network target. ## Verification - `cargo test -p codex-network-proxy` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 12:51:43 -07:00
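The idea of rechecking the resolved socket address, rather than trusting the hostname decision alone, can be illustrated with the standard library. This is a sketch under stated assumptions: `is_non_public` and `check_connect_target` are hypothetical names, the `allow_local` flag stands in for whatever explicit local-address policy the proxy applies, and the set of ranges treated as non-public here is illustrative rather than the proxy's actual list.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical classifier for addresses that should not be reachable
/// through direct proxy egress unless policy explicitly allows them.
fn is_non_public(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            // Treat IPv4-mapped addresses like the IPv4 address they wrap.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_non_public(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}

/// Hypothetical final check, run on the address DNS actually returned,
/// immediately before the upstream TCP connection is opened.
fn check_connect_target(addr: SocketAddr, allow_local: bool) -> Result<SocketAddr, String> {
    if is_non_public(addr.ip()) && !allow_local {
        Err(format!("refusing to connect to non-public address {addr}"))
    } else {
        Ok(addr)
    }
}
```

A caller would resolve the hostname, then pass each candidate `SocketAddr` through `check_connect_target` and connect only to an address that returns `Ok`, so a hostname that passed policy but resolved to `127.0.0.1` is still rejected.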
Shijie Rao	25ac0e4527	Load cloud requirements for agent identity (#19708 ) ## Why Agent Identity sessions can represent Business and Enterprise ChatGPT workspaces, but cloud requirements were skipped before fetch. That meant workspace-managed requirements were not loaded for Agent Identity even when the JWT carried the same account identity and plan information that normal ChatGPT token auth exposes. This PR now sits on top of the Agent Identity stack through [#19764](https://github.com/openai/codex/pull/19764). Because [#19763](https://github.com/openai/codex/pull/19763) moved task registration into Agent Identity auth loading, cloud requirements no longer needs a separate runtime-initialization step before building the backend client. ## What changed - Stop skipping `CodexAuth::AgentIdentity` in the cloud requirements loader. - Share the cloud requirements eligibility check between startup load and background cache refresh. - Rely on eagerly loaded Agent Identity auth so backend requests can attach task-scoped `AgentAssertion` headers. - Decode Agent Identity JWT `plan_type` as the auth-layer plan type, then convert it through a shared `auth::PlanType` -> `account::PlanType` mapping. - Add the missing serde alias for the `education` plan string and add coverage for raw Agent Identity plan aliases such as `hc` and `education`. ## Testing - `cargo test -p codex-agent-identity -p codex-login -p codex-cloud-requirements -p codex-protocol`	2026-04-28 12:35:00 -07:00
Ruslan Nigmatullin	0700f979ba	app-server: run initialized rpcs with keyed serialization (#17373 ) ## Why Initialized app-server RPCs no longer need to bottleneck behind one request processor path. Running them concurrently improves responsiveness, but several request families still mutate shared state or depend on ordered side effects. Those stateful families need an auditable serialization contract so concurrency does not reorder thread, config, auth, command, watcher, MCP, or similar state transitions. This PR keeps that boundary explicit: stateful work is serialized by the smallest useful key, while intentionally read-only or externally concurrent work remains unkeyed. In particular, `thread/list` and `thread/turns/list` explicitly have no serialization because they primarily read append-only rollout storage and should continue to be served concurrently. ## What changed - Adds `ClientRequest::serialization_scope()` in `app-server-protocol` and requires every client request definition to declare its serialization behavior. - Introduces keyed request scopes for thread, thread path, command exec process, fuzzy search session, fs watch, MCP OAuth, and global state buckets such as config, account auth, memory, and device keys. - Routes initialized app-server RPCs through per-key FIFO serialization while allowing unkeyed initialized requests to run concurrently. - Cancels in-flight initialized RPC work when the connection disconnects or the app-server exits so spawned request tasks do not outlive their session. - Adds focused coverage for representative keyed and unkeyed serialization scopes, including explicitly concurrent `thread/turns/list` behavior. ## Validation - Added protocol tests for representative keyed serialization scopes and intentionally unkeyed request families. - Added app-server request serialization tests covering per-key FIFO behavior, concurrent unkeyed execution, disconnect shutdown, and config read-after-write ordering. 
- Local focused protocol validation after the latest rebase is currently blocked by packageproxy failing to resolve locked `rustls-webpki 0.103.13`; CI is expected to provide the full validation signal.	2026-04-28 12:23:34 -07:00
Dylan Hurd	7f7c7c2c07	Fix log db batch flush flake (#19959 ) ## Why The log DB writer batches tracing events before inserting them into SQLite, but `tokio::time::interval` produces an immediate first tick. That meant the inserter could flush the first accepted log entry before `batch_size` was reached, making `configured_batch_size_flushes_without_explicit_flush` timing-sensitive in CI. ## What Changed - Consume the interval's startup tick before entering the inserter loop, so interval flushing starts after the configured delay. - Remove the test's startup sleep, which was masking the race instead of proving the batch-size behavior. ## Validation - `cargo test -p codex-state` - `cargo test -p codex-state configured_batch_size_flushes_without_explicit_flush` passed 3 consecutive focused runs - PR checks passed across `rust-ci`, Bazel, `ci`, `sdk`, `cargo-deny`, Codespell, blob-size policy, and CLA	2026-04-28 12:08:41 -07:00
viyatb-oai	3377afd84a	fix(network-proxy): harden linux proxy bridge helpers (#20001 ) ## Why The Linux managed-proxy bridge helpers are long-lived child processes in the sandbox networking path. Before this change they stayed dumpable and the network seccomp profile did not block cross-process memory syscalls, so another same-user process could potentially inspect or modify bridge memory instead of interacting only through the intended proxy interface. ## What changed - reuse the shared `codex-process-hardening` helper to mark bridge helper children non-dumpable before they begin serving - deny `process_vm_readv` and `process_vm_writev` in the existing network seccomp filter ## Security impact Bridge helpers are less exposed to same-user cross-process inspection or memory writes, which reduces the chance that sandboxed code can interfere with proxy support processes outside the intended IPC path. ## Verification - `cargo test -p codex-process-hardening` - `cargo test -p codex-linux-sandbox` - attempted `cargo check -p codex-linux-sandbox --target x86_64-unknown-linux-gnu`; blocked on missing `x86_64-linux-gnu-gcc` on this macOS host --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 11:52:50 -07:00
charley-openai	de2ccf9473	[codex] Add token usage to turn tracing spans (#19432 ) ## Why Slow Codex turns are easier to debug when token usage is visible in the trace itself, without joining against separate analytics. This adds token usage to existing turn-handling spans for regular user turns only. [Example turn](https://openai.datadoghq.com/apm/trace/9d353efa2cb5de1f4c5b93dc33c3df04?colorBy=service&graphType=flamegraph&shouldShowLegend=true&sort=time&spanID=3555541504891512675&spanViewType=metadata&traceQuery=) ## What Changed - Added response-level token fields on completed `handle_responses` spans: `gen_ai.usage.input_tokens`, `gen_ai.usage.cache_read.input_tokens`, `gen_ai.usage.output_tokens`, `codex.usage.reasoning_output_tokens`, and `codex.usage.total_tokens`. - Added aggregate token fields on regular turn spans: `codex.turn.token_usage.*`. - Added an explicit regular-turn opt-in via `SessionTask::records_turn_token_usage_on_span()` so this is not coupled to span-name strings. ## Testing - `cargo test -p codex-otel` - `cargo test -p codex-core turn_and_completed_response_spans_record_token_usage` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-otel` - Manual local Electron/app-server smoke test: regular user turn emits the new span fields Known status: `cargo test -p codex-core` was attempted and failed in unrelated existing areas: config approvals, request-permissions, git-info ordering, and subagent metadata persistence.	2026-04-28 11:41:32 -07:00

5179 Commits