codex

mirror of https://github.com/openai/codex.git synced 2026-05-01 01:47:18 +00:00

Author	SHA1	Message	Date
rhan-oai	756c45ec61	[codex-analytics] add protocol-native turn timestamps (#16638) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638). * #16870 * #16706 * #16659 * #16641 * #16640 * __->__ #16638	2026-04-06 16:22:59 -07:00
Thibault Sottiaux	624c69e840	[codex] add response proxy subagent header test (#16876) This adds end-to-end coverage for `responses-api-proxy` request dumps when Codex spawns a subagent and validates that the `x-codex-window-id` and `x-openai-subagent` headers are properly set.	2026-04-06 08:18:46 -07:00
Thibault Sottiaux	9e19004bc2	[codex] add context-window lineage headers (#16758) This change adds client-owned context-window and parent thread id headers to all requests to the Responses API.	2026-04-04 05:54:31 +00:00
Michael Bolin	3a22e10172	test: avoid PowerShell startup in Windows auth fixture (#16737 ) ## Why `provider_auth_command_supplies_bearer_token` and `provider_auth_command_refreshes_after_401` were still flaky under Windows Bazel because the generated fixture used `powershell.exe`, whose startup can be slow enough to trip the provider-auth timeout in CI. ## What Replace the generated Windows auth fixture script in `codex-rs/core/tests/suite/client.rs` with a small `.cmd` script executed by `cmd.exe /D /Q /C`, and advance `tokens.txt` one line at a time so the refresh-after-401 test still gets the second token on the second invocation. Also align the fixture timeout with the provider-auth default (`5_000` ms) to avoid introducing a test-only timing budget that is stricter than production behavior. ## Testing Left to CI, specifically the Windows Bazel `//codex-rs/core:core-all-test` coverage for the two provider-auth command tests.	2026-04-03 20:05:39 -07:00
Ahmed Ibrahim	8a19dbb177	Add spawn context for MultiAgentV2 children (#16746)	2026-04-03 19:56:59 -07:00
Ahmed Ibrahim	e4f1b3a65e	Preempt mailbox mail after reasoning/commentary items (#16725 ) Send pending mailbox mail after completed reasoning or commentary items so follow-up requests can pick it up mid-turn. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-03 18:29:05 -07:00
Thibault Sottiaux	91ca49e53c	[codex] allow disabling environment context injection (#16745) This adds an `include_environment_context` config/profile flag that defaults to on and guards both the initial injection and later environment updates, so injection of `<environment_context>` can be skipped.	2026-04-03 18:06:52 -07:00
Thibault Sottiaux	8d19646861	[codex] allow disabling prompt instruction blocks (#16735 ) This PR adds root and profile config switches to omit the generated `<permissions instructions>` and `<apps_instructions>` prompt blocks while keeping both enabled by default, and it gates both the initial developer-context injection and later permissions diff injection so turning the permissions block off stays effective across turn-context overrides. Also added a prompt debug tool that can be used as `codex debug prompt-input "hello"` and dumps the constructed items list.	2026-04-03 23:47:56 +00:00
Eric Traut	4b8bab6ad3	Remove OPENAI_BASE_URL config fallback (#16720) The `OPENAI_BASE_URL` environment variable has been a significant support issue, so we decided to deprecate it in favor of an `openai_base_url` config key. We've had the deprecation warning in place for about a month, so users have had time to migrate to the new mechanism. This PR removes support for `OPENAI_BASE_URL` entirely.	2026-04-03 15:03:21 -07:00
Ahmed Ibrahim	567d2603b8	Sanitize forked child history (#16709) - Keep only parent system/developer/user messages plus assistant final-answer messages in forked child history. - Strip parent tool/reasoning items and remove the unmatched synthetic spawn output.	2026-04-03 21:13:34 +00:00
Michael Bolin	faab4d39e1	fix: preserve platform-specific core shell env vars (#16707) ## Why We were seeing failures in the following tests as part of trying to get all the tests running under Bazel on Windows in CI (https://github.com/openai/codex/pull/16528): ``` suite::shell_command::unicode_output::with_login suite::shell_command::unicode_output::without_login ``` Certainly `PATHEXT` should have been included in the extra `CORE_VARS` list, so we fix that up here, but also take things a step further for now by forcibly ensuring it is set on Windows in the return value of `create_env()`. Once we get the Windows Bazel build working reliably (i.e., after #16528 is merged), we should come back to this and confirm we can remove the special case in `create_env()`. ## What - Split core env inheritance into `COMMON_CORE_VARS` plus platform-specific allowlists for Windows and Unix in `exec_env.rs` (`1b55c88fbf/codex-rs/core/src/exec_env.rs`, L45-L81). - Preserve `PATHEXT`, `USERNAME`, and `USERPROFILE` on Windows, and `HOME` / locale vars on Unix. - Backfill a default `PATHEXT` in `create_env()` on Windows if the parent env does not provide one, so child process launch still works in stripped-down Bazel environments. - Extend the Windows exec-env test to assert mixed-case `PathExt` survives case-insensitive core filtering, and document why the shell-command Unicode test goes through a child process. ## Verification - `cargo test -p codex-core exec_env::tests`	2026-04-03 12:07:07 -07:00
Ahmed Ibrahim	af8a9d2d2b	remove temporary ownership re-exports (#16626 ) Stacked on #16508. This removes the temporary `codex-core` / `codex-login` re-export shims from the ownership split and rewrites callsites to import directly from `codex-model-provider-info`, `codex-models-manager`, `codex-api`, `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`. No behavior change intended; this is the mechanical import cleanup layer split out from the ownership move. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-03 00:33:34 -07:00
Michael Bolin	b15c918836	fix: use cmd.exe in Windows unicode shell test (#16668 ) ## Why This is a follow-up to #16665. The Windows `unicode_output` test should still exercise a child process so it verifies PowerShell's UTF-8 output configuration, but `$env:COMSPEC` depends on that environment variable surviving the curated Bazel test environment. Using `cmd.exe` keeps the child-process coverage while avoiding both bare `cmd` + `PATHEXT` lookup and `$env:COMSPEC` env passthrough assumptions. ## What - Run `cmd.exe /c echo naïve_café` in the Windows branch of `unicode_output`. ## Verification - `cargo test -p codex-core unicode_output`	2026-04-03 00:32:08 -07:00
Michael Bolin	14f95db57b	fix: use COMSPEC in Windows unicode shell test (#16665 ) ## Why Windows Bazel shell tests launch PowerShell with a curated environment, so `PATHEXT` may be absent. The existing `unicode_output` test invokes bare `cmd`, which can fail before the test exercises UTF-8 child-process output. ## What - Use `$env:COMSPEC /c echo naïve_café` in the Windows branch of `unicode_output`. - Preserve the external child-process path instead of switching the test to a PowerShell builtin. ## Verification - `cargo test -p codex-core unicode_output`	2026-04-02 23:54:02 -07:00
Michael Bolin	b4787bf4c0	fix: changes to test that should help them pass on Windows under Bazel (#16662 ) https://github.com/openai/codex/pull/16460 was a large PR created by Codex to try to get the tests to pass under Bazel on Windows. Indeed, it successfully ran all of the tests under `//codex-rs/core:` with its changes to `codex-rs/core/`, though the full set of changes seems to be too broad. This PR tries to port the key changes, which are: - Under Bazel, the `USERNAME` environment variable is not guaranteed to be set on Windows, so for tests that need a non-empty env var as a convenient substitute for an env var containing an API key, just use `PATH`. Note that `PATH` is unlikely to contain characters that are not allowed in an HTTP header value. - Specify `"powershell.exe"` instead of just `"powershell"` in case the `PATHEXT` env var gets lost in the shuffle.	2026-04-02 23:06:36 -07:00
Ahmed Ibrahim	6fff9955f1	extract models manager and related ownership from core (#16508 ) ## Summary - split `models-manager` out of `core` and add `ModelsManagerConfig` plus `Config::to_models_manager_config()` so model metadata paths stop depending on `core::Config` - move login-owned/auth-owned code out of `core` into `codex-login`, move model provider config into `codex-model-provider-info`, move API bridge mapping into `codex-api`, move protocol-owned types/impls into `codex-protocol`, and move response debug helpers into a dedicated `response-debug-context` crate - move feedback tag emission into `codex-feedback`, relocate tests to the crates that now own the code, and keep broad temporary re-exports so this PR avoids a giant import-only rewrite ## Major moves and decisions - created `codex-models-manager` as the owner for model cache/catalog/config/model info logic, including the new `ModelsManagerConfig` struct - created `codex-model-provider-info` as the owner for provider config parsing/defaults and kept temporary `codex-login`/`codex-core` re-exports for old import paths - moved `api_bridge` error mapping + `CoreAuthProvider` into `codex-api`, while `codex-login::api_bridge` temporarily re-exports those symbols and keeps the `auth_provider_from_auth` wrapper - moved `auth_env_telemetry` and `provider_auth` ownership to `codex-login` - moved `CodexErr` ownership to `codex-protocol::error`, plus `StreamOutput`, `bytes_to_string_smart`, and network policy helpers to protocol-owned modules - created `codex-response-debug-context` for `extract_response_debug_context`, `telemetry_transport_error_message`, and related response-debug plumbing instead of leaving that behavior in `core` - moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and `emit_feedback_request_tags_with_auth_env` to `codex-feedback` - deferred removal of temporary re-exports and the mechanical import rewrites to a stacked follow-up PR so this PR stays reviewable ## Test moves - moved 
auth refresh coverage from `core/tests/suite/auth_refresh.rs` to `login/tests/suite/auth_refresh.rs` - moved text encoding coverage from `core/tests/suite/text_encoding_fix.rs` to `protocol/src/exec_output_tests.rs` - moved model info override coverage from `core/tests/suite/model_info_overrides.rs` to `models-manager/src/model_info_overrides_tests.rs` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-02 23:00:02 -07:00
Michael Bolin	f894c3f687	fix: add more detail to test assertion (#16606) In https://github.com/openai/codex/pull/16528, I am trying to get tests running under Bazel on Windows, but currently I see: ``` thread 'suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var' (10220) panicked at core/tests\suite\user_shell_cmd.rs:358:5: assertion failed: `(left == right)` Diff < left / right > : <1 >0 ``` This PR updates the `assert_eq!()` to provide more information to help diagnose the failure. Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16606). * #16608 * __->__ #16606	2026-04-02 12:34:42 -07:00
Michael Bolin	c1d18ceb6f	[codex] Remove codex-core config type shim (#16529 ) ## Why This finishes the config-type move out of `codex-core` by removing the temporary compatibility shim in `codex_core::config::types`. Callers now depend on `codex-config` directly, which keeps these config model types owned by the config crate instead of re-expanding `codex-core` as a transitive API surface. ## What Changed - Removed the `codex-rs/core/src/config/types.rs` re-export shim and the `core::config::ApprovalsReviewer` re-export. - Updated `codex-core`, `codex-cli`, `codex-tui`, `codex-app-server`, `codex-mcp-server`, and `codex-linux-sandbox` call sites to import `codex_config::types` directly. - Added explicit `codex-config` dependencies to downstream crates that previously relied on the `codex-core` re-export. - Regenerated `codex-rs/core/config.schema.json` after updating the config docs path reference.	2026-04-02 01:19:44 -07:00
Michael Bolin	aa2403e2eb	core: remove cross-crate re-exports from lib.rs (#16512 ) ## Why `codex-core` was re-exporting APIs owned by sibling `codex-` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.	2026-04-01 23:06:24 -07:00
Michael Bolin	f83f3fa2a6	login: treat provider auth refresh_interval_ms=0 as no auto-refresh (#16480 ) ## Why Follow-up to #16288: the new dynamic provider auth token flow currently defaults `refresh_interval_ms` to a non-zero value and rejects `0` entirely. For command-backed bearer auth, `0` should mean "never auto-refresh". That lets callers keep using the cached token until the backend actually returns `401 Unauthorized`, at which point Codex can rerun the auth command as part of the existing retry path. ## What changed - changed `ModelProviderAuthInfo.refresh_interval_ms` to accept `0` and documented that value as disabling proactive refresh - updated the external bearer token refresher to treat `refresh_interval_ms = 0` as an indefinitely reusable cached token, while still rerunning the auth command during unauthorized recovery - regenerated `core/config.schema.json` so the schema minimum is `0` and the new behavior is described in the field docs - added coverage for both config deserialization and the no-auto-refresh plus `401` recovery behavior ## How tested - `cargo test -p codex-protocol` - `cargo test -p codex-login` - `cargo test -p codex-core test_deserialize_provider_auth_config_`	2026-04-01 15:30:10 -07:00
Michael Bolin	04ec9ef8af	Fix Windows external bearer refresh test (#16366 ) ## Why https://github.com/openai/codex/pull/16287 introduced a change to `codex-rs/login/src/auth/auth_tests.rs` that uses a PowerShell helper to read the next token from `tokens.txt` and rewrite the remainder back to disk. On Windows, `Get-Content` can return a scalar when the file has only one remaining line, so `$lines[0]` reads the first character instead of the full token. That breaks the external bearer refresh test once the token list is nearly exhausted. https://github.com/openai/codex/pull/16288 introduced similar changes to `codex-rs/core/src/models_manager/manager_tests.rs` and `codex-rs/core/tests/suite/client.rs`. These went unnoticed because the failures showed up when the test was run via Cargo on Windows, but not in our Bazel harness. Figuring out that Cargo-vs-Bazel delta will happen in a follow-up PR. ## Verification On my Windows machine, I verified `cargo test` passes when run in `codex-rs/login` and `codex-rs/core`. Once this PR is merged, I will keep an eye on https://github.com/openai/codex/actions/workflows/rust-ci-full.yml to verify it goes green. ## What changed - Wrap `Get-Content -Path tokens.txt` in `@(...)` so the script always gets array semantics before counting, indexing, and rewriting the remaining lines.	2026-03-31 14:44:54 -07:00
rhan-oai	e8de4ea953	[codex-analytics] thread events (#15690) - add event for thread initialization - thread/start, thread/fork, thread/resume - feature flagged behind `FeatureFlag::GeneralAnalytics` - does not yet support threads started by subagents PR stack: - --> [[telemetry] thread events #15690](https://github.com/openai/codex/pull/15690) - [[telemetry] subagent events #15915](https://github.com/openai/codex/pull/15915) - [[telemetry] turn events #15591](https://github.com/openai/codex/pull/15591) - [[telemetry] steer events #15697](https://github.com/openai/codex/pull/15697) - [[telemetry] queued prompt data #15804](https://github.com/openai/codex/pull/15804) Sample extracted logs in Codex-backend ``` INFO | 2026-03-29 16:39:37 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3bf7-9f5f-7f82-9877-6d48d1052531 product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=new subagent_source=None parent_thread_id=None created_at=1774827577 | INFO | 2026-03-29 16:45:46 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3b84-5731-79d0-9b3b-9c6efe5f5066 product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=resumed subagent_source=None parent_thread_id=None created_at=1774820022 | INFO | 2026-03-29 16:45:49 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3bfd-4cd6-7c12-a13e-48cef02e8c4d product_surface=codex product_client_id=CODEX_CLI client_name=codex-tui client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=forked subagent_source=None parent_thread_id=None created_at=1774827949 | INFO | 2026-03-29 17:20:29 | codex_backend.routers.analytics_events | analytics_events.track_analytics_events:398 | Tracked analytics event codex_thread_initialized thread_id=019d3c1d-0412-7ed2-ad24-c9c0881a36b0 product_surface=codex product_client_id=CODEX_SERVICE_EXEC client_name=codex_exec client_version=0.0.0 rpc_transport=in_process experimental_api_enabled=True codex_rs_version=0.0.0 runtime_os=macos runtime_os_version=26.4.0 runtime_arch=aarch64 model=gpt-5.3-codex ephemeral=False thread_source=user initialization_mode=new subagent_source=None parent_thread_id=None created_at=1774830027 | ``` Notes - `product_client_id` gets canonicalized in codex-backend - subagent threads are addressed in a following pr	2026-03-31 12:16:44 -07:00
Michael Bolin	03b2465591	fix: fix clippy issue caught by cargo but not bazel (#16345 ) I noticed that https://github.com/openai/codex/actions/workflows/rust-ci-full.yml started failing on my own PR, https://github.com/openai/codex/pull/16288, even though CI was green when I merged it. Apparently, it introduced a lint violation that was [correctly!] caught by our Cargo-based clippy runner, but not our Bazel-based one. My next step is to figure out the reason for the delta between the two setups, but I wanted to get us green again quickly, first.	2026-03-31 16:01:06 +00:00
Michael Bolin	20f43c1e05	core: support dynamic auth tokens for model providers (#16288 ) ## Summary Fixes #15189. Custom model providers that set `requires_openai_auth = false` could only use static credentials via `env_key` or `experimental_bearer_token`. That is not enough for providers that mint short-lived bearer tokens, because Codex had no way to run a command to obtain a bearer token, cache it briefly in memory, and retry with a refreshed token after a `401`. This PR adds that provider config and wires it through the existing auth design: request paths still go through `AuthManager.auth()` and `UnauthorizedRecovery`, with `core` only choosing when to use a provider-backed bearer-only `AuthManager`. ## Scope To keep this PR reviewable, `/models` only uses provider auth for the initial request in this change. It does not add a dedicated `401` retry path for `/models`; that can be follow-up work if we still need it after landing the main provider-token support. ## Example Usage ```toml model_provider = "corp-openai" [model_providers.corp-openai] name = "Corp OpenAI" base_url = "https://gateway.example.com/openai" requires_openai_auth = false [model_providers.corp-openai.auth] command = "gcloud" args = ["auth", "print-access-token"] timeout_ms = 5000 refresh_interval_ms = 300000 ``` The command contract is intentionally small: - write the bearer token to `stdout` - exit `0` - any leading or trailing whitespace is trimmed before the token is used ## What Changed - add `model_providers.<id>.auth` to the config model and generated schema - validate that command-backed provider auth is mutually exclusive with `env_key`, `experimental_bearer_token`, and `requires_openai_auth` - build a bearer-only `AuthManager` for `ModelClient` and `ModelsManager` when a provider configures `auth` - let normal Responses requests and realtime websocket connects use the provider-backed bearer source through the same `AuthManager.auth()` path - allow `/models` online refresh for 
command-auth providers and attach the provider token to the initial `/models` request - keep `auth.cwd` available as an advanced escape hatch and include it in the generated config schema ## Testing - `cargo test -p codex-core provider_auth_command` - `cargo test -p codex-core refresh_available_models_uses_provider_auth_token` - `cargo test -p codex-core test_deserialize_provider_auth_config_defaults` ## Docs - `developers.openai.com/codex` should document the new `[model_providers.<id>.auth]` block and the token-command contract	2026-03-31 01:37:27 -07:00
Michael Bolin	61dfe0b86c	chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054 ) ## Why `argument-comment-lint` was green in CI even though the repo still had many uncommented literal arguments. The main gap was target coverage: the repo wrapper did not force Cargo to inspect test-only call sites, so examples like the `latest_session_lookup_params(true, ...)` tests in `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path. This change cleans up the existing backlog, makes the default repo lint path cover all Cargo targets, and starts rolling that stricter CI enforcement out on the platform where it is currently validated. ## What changed - mechanically fixed existing `argument-comment-lint` violations across the `codex-rs` workspace, including tests, examples, and benches - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to `--all-targets` unless the caller explicitly narrows the target set - fixed both wrappers so forwarded cargo arguments after `--` are preserved with a single separator - documented the new default behavior in `tools/argument-comment-lint/README.md` - updated `rust-ci` so the macOS lint lane keeps the plain wrapper invocation and therefore enforces `--all-targets`, while Linux and Windows temporarily pass `-- --lib --bins` That temporary CI split keeps the stricter all-targets check where it is already cleaned up, while leaving room to finish the remaining Linux- and Windows-specific target-gated cleanup before enabling `--all-targets` on those runners. The Linux and Windows failures on the intermediate revision were caused by the wrapper forwarding bug, not by additional lint findings in those lanes. 
## Validation - `bash -n tools/argument-comment-lint/run.sh` - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh` - shell-level wrapper forwarding check for `-- --lib --bins` - shell-level wrapper forwarding check for `-- --tests` - `just argument-comment-lint` - `cargo test` in `tools/argument-comment-lint` - `cargo test -p codex-terminal-detection` ## Follow-up - Clean up remaining Linux-only target-gated callsites, then switch the Linux lint lane back to the plain wrapper invocation. - Clean up remaining Windows-only target-gated callsites, then switch the Windows lint lane back to the plain wrapper invocation.	2026-03-27 19:00:44 -07:00
viyatb-oai	ec089fd22a	fix(sandbox): fix bwrap lookup for multi-entry PATH (#15973 ) ## Summary - split the joined `PATH` before running system `bwrap` lookup - keep the existing workspace-local `bwrap` skip behavior intact - add regression tests that exercise real multi-entry search paths ## Why The PATH-based lookup added in #15791 still wrapped the raw `PATH` environment value as a single `PathBuf` before passing it through `join_paths()`. On Unix, a normal multi-entry `PATH` contains `:`, so that wrapper path is invalid as one path element and the lookup returns `None`. That made Codex behave as if no system `bwrap` was installed even when `bwrap` was available on `PATH`, which is what users in #15340 were still hitting on `0.117.0-alpha.25`. ## Impact System `bwrap` discovery now works with normal multi-entry `PATH` values instead of silently falling back to the vendored binary. Fixes #15340. ## Validation - `just fmt` - `cargo test -p codex-sandboxing` - `cargo test -p codex-linux-sandbox` - `just fix -p codex-sandboxing` - `just argument-comment-lint`	2026-03-27 08:41:06 -07:00
Michael Bolin	e6e2999209	permissions: remove macOS seatbelt extension profiles (#15918 ) ## Why `PermissionProfile` should only describe the per-command permissions we still want to grant dynamically. Keeping `MacOsSeatbeltProfileExtensions` in that surface forced extra macOS-only approval, protocol, schema, and TUI branches for a capability we no longer want to expose. ## What changed - Removed the macOS-specific permission-profile types from `codex-protocol`, the app-server v2 API, and the generated schema/TypeScript artifacts. - Deleted the core and sandboxing plumbing that threaded `MacOsSeatbeltProfileExtensions` through execution requests and seatbelt construction. - Simplified macOS seatbelt generation so it always includes the fixed read-only preferences allowlist instead of carrying a configurable profile extension. - Removed the macOS additional-permissions UI/docs/test coverage and deleted the obsolete macOS permission modules. - Tightened `request_permissions` intersection handling so explicitly empty requested read lists are preserved only when that field was actually granted, avoiding zero-grant responses being stored as active permissions.	2026-03-26 17:12:45 -07:00
Michael Bolin	b23789b770	[codex] import token_data from codex-login directly (#15903 ) ## Why `token_data` is owned by `codex-login`, but `codex-core` was still re-exporting it. That let callers pull auth token types through `codex-core`, which keeps otherwise unrelated crates coupled to `codex-core` and makes `codex-core` more of a build-graph bottleneck. ## What changed - remove the `codex-core` re-export of `codex_login::token_data` - update the remaining `codex-core` internals that used `crate::token_data` to import `codex_login::token_data` directly - update downstream callers in `codex-rs/chatgpt`, `codex-rs/tui_app_server`, `codex-rs/app-server/tests/common`, and `codex-rs/core/tests` to import `codex_login::token_data` directly - add explicit `codex-login` workspace dependencies and refresh lock metadata for crates that now depend on it directly ## Validation - `cargo test -p codex-chatgpt --locked` - `just argument-comment-lint` - `just bazel-lock-update` - `just bazel-lock-check` ## Notes - attempted `cargo test -p codex-core --locked` and `cargo test -p codex-core auth_refresh --locked`, but both ran out of disk while linking `codex-core` test binaries in the local environment	2026-03-26 13:34:02 -07:00
Michael Bolin	e36ebaa3da	fix: box apply_patch test harness futures (#15835 ) ## Why `#[large_stack_test]` made the `apply_patch_cli` tests pass by giving them more stack, but it did not address why those tests needed the extra stack in the first place. The real problem is the async state built by the `apply_patch_cli` harness path. Those tests await three helper boundaries directly: harness construction, turn submission, and apply-patch output collection. If those helpers inline their full child futures, the test future grows to include the whole harness startup and request/response path. This change replaces the workaround from #12768 with the same basic approach used in #13429, but keeps the fix narrower: only the helper boundaries awaited directly by `apply_patch_cli` stay boxed. ## What Changed - removed `#[large_stack_test]` from `core/tests/suite/apply_patch_cli.rs` - restored ordinary `#[tokio::test(flavor = "multi_thread", worker_threads = 2)]` annotations in that suite - deleted the now-unused `codex-test-macros` crate and removed its workspace wiring - boxed only the three helper boundaries that the suite awaits directly: - `apply_patch_harness_with(...)` - `TestCodexHarness::submit(...)` - `TestCodexHarness::apply_patch_output(...)` - added comments at those boxed boundaries explaining why they remain boxed ## Testing - `cargo test -p codex-core --test all suite::apply_patch_cli -- --nocapture` ## References - #12768 - #13429	2026-03-26 17:32:04 +00:00
jif-oai	26c66f3ee1	fix: flaky (#15869)	2026-03-26 16:07:32 +01:00
Michael Bolin	01fa4f0212	core: remove special execve handling for skill scripts (#15812)	2026-03-26 07:46:04 -07:00
viyatb-oai	937cb5081d	fix: fix old system bubblewrap compatibility without falling back to vendored bwrap (#15693 ) Fixes #15283. ## Summary Older system bubblewrap builds reject `--argv0`, which makes our Linux sandbox fail before the helper can re-exec. This PR keeps using system `/usr/bin/bwrap` whenever it exists and only falls back to vendored bwrap when the system binary is missing. That matters on stricter AppArmor hosts, where the distro bwrap package also provides the policy setup needed for user namespaces. For old system bwrap, we avoid `--argv0` instead of switching binaries: - pass the sandbox helper a full-path `argv0`, - keep the existing `current_exe() + --argv0` path when the selected launcher supports it, - otherwise omit `--argv0` and re-exec through the helper's own `argv[0]` path, whose basename still dispatches as `codex-linux-sandbox`. Also updates the launcher/warning tests and docs so they match the new behavior: present-but-old system bwrap uses the compatibility path, and only absent system bwrap falls back to vendored. ### Validation 1. Install Ubuntu 20.04 in a VM 2. Compile codex and run without bubblewrap installed - see a warning about falling back to the vendored bwrap 3. Install bwrap and verify version is 0.4.0 without `argv0` support 4. 
Run codex and use the apply_patch tool without errors --------- Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-03-25 23:51:39 -07:00
Andrei Eternal	c4d9887f9a	[hooks] add non-streaming (non-stdin style) shell-only PostToolUse support (#15531 ) CHAINED PR - note that base is eternal/hooks-pretooluse-bash, not main -- so the following PR should be first Matching post-tool hook to the pre-tool functionality here: https://github.com/openai/codex/pull/15211 So, PreToolUse calls for plain shell calls, allows blocking. This PostToolUse call runs after the command executed example run: ``` › as a test, run in parallel the following commands: - echo 'one' - echo '[block-pre-tool-use]' - echo '[block-post-tool-use]' ⚠ MCP startup incomplete (failed: notion, linear) • Cruising through those three commands in parallel now, and I’ll share the exact outputs right after they land. • Running PreToolUse hook: checking the observatory runes • Running PreToolUse hook: checking the observatory runes • Running PreToolUse hook: checking the observatory runes PreToolUse hook (blocked) warning: wizard-tower PreToolUse demo blocked a Bash command on purpose. feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue. PreToolUse hook (completed) warning: wizard-tower PreToolUse demo inspected Bash: echo 'one' PreToolUse hook (completed) warning: wizard-tower PreToolUse demo inspected Bash: echo '[block-post-tool-use]' • Ran echo '[block-post-tool-use]' └ [block-post-tool-use] • Ran echo 'one' └ one • Running PostToolUse hook: reading the observatory aftermath • Running PostToolUse hook: reading the observatory aftermath PostToolUse hook (blocked) warning: wizard-tower PostToolUse demo blocked the Bash follow-through on purpose. feedback: PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to continue. PostToolUse hook (completed) warning: wizard-tower PostToolUse demo inspected Bash: echo 'one' hook context: PostToolUse demo saw the Bash result. 
Response preview: one ────────────────────────────────────────────────────────────────────────────────────────────────────── • Ran them in parallel. Little wave report: - echo 'one' completed and returned one. - echo '[block-pre-tool-use]' was blocked before execution by the PreToolUse hook. - echo '[block-post-tool-use]' ran, but its result was blocked afterward by the PostToolUse hook. The hook messages were: - PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue. - PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to continue. › for the blocked messages (correctly blocked in this test) did the tool output land back into context? • Yeah, from this run: - PreToolUse block: the command itself did not execute, so no command output landed in context. Only the block message did. - PostToolUse block: the command did execute, but the actual stdout did not come back into context. Only the post-tool block message did. So the blocked notifications landed in context, but the blocked command payloads themselves did not. Nice clean guardrail, aloha. ```	2026-03-25 19:18:03 -07:00
Matthew Zeng	78799c1bcf	[mcp] Improve custom MCP elicitation (#15800 ) - [x] Support don't ask again for custom MCP tool calls. - [x] Don't run arc in yolo mode. - [x] Run arc for custom MCP tools in always allow mode.	2026-03-26 01:02:37 +00:00
pakrym-oai	8fa88fa8ca	Add cached environment manager for exec server URL (#15785 ) Add environment manager that is a singleton and is created early in app-server (before skill manager, before config loading). Use an environment variable to point to a running exec server.	2026-03-25 16:14:36 -07:00
Matthew Zeng	91337399fe	[apps][tool_suggest] Remove tool_suggest's dependency on tool search. (#14856 ) - [x] Remove tool_suggest's dependency on tool search.	2026-03-25 12:26:02 -07:00
pakrym-oai	504aeb0e09	Use AbsolutePathBuf for cwd state (#15710 ) Migrate `cwd` and related session/config state to `AbsolutePathBuf` so downstream consumers consistently see absolute working directories. Add test-only `.abs()` helpers for `Path`, `PathBuf`, and `TempDir`, and update branch-local tests to use them instead of `AbsolutePathBuf::try_from(...)`. For the remaining TUI/app-server snapshot coverage that renders absolute cwd values, keep the snapshots unchanged and skip the Windows-only cases where the platform-specific absolute path layout differs.	2026-03-25 16:02:22 +00:00
jif-oai	178c3b15b4	chore: remove grep_files handler (#15775 ) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-25 16:01:45 +00:00
Matthew Zeng	e590fad50b	[plugins] Add a flag for tool search. (#15722 ) - [x] Add a flag for tool search.	2026-03-25 07:00:25 +00:00
Charley Cunningham	d72fa2a209	[codex] Defer fork context injection until first turn (#15699 ) ## Summary - remove the fork-startup `build_initial_context` injection - keep the reconstructed `reference_context_item` as the fork baseline until the first real turn - update fork-history tests and the request snapshot, and add a `TODO(ccunningham)` for remaining nondiffable initial-context inputs ## Why Fork startup was appending current-session initial context immediately after reconstructing the parent rollout, then the first real turn could emit context updates again. That duplicated model-visible context in the child rollout. ## Impact Forked sessions now behave like resume for context seeding: startup reconstructs history and preserves the prior baseline, and the first real turn handles any current-session context emission. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-24 18:34:44 -07:00
Ahmed Ibrahim	0f957a93cd	Move git utilities into a dedicated crate (#15564 ) - create `codex-git-utils` and move the shared git helpers into it with file moves preserved for diff readability - move the `GitInfo` helpers out of `core` so stacked rollout work can depend on the shared crate without carrying its own git info module --------- Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-24 13:26:23 -07:00
Charley Cunningham	2d61357c76	Trim pre-turn context updates during rollback (#15577 ) ## Summary - trim contiguous developer/contextual-user pre-turn updates when rollback cuts back to a user turn - add a focused history regression test for the trim behavior - update the rollback request-boundary snapshots to show the fixed non-duplicating context shape --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-24 12:43:53 -07:00
Celia Chen	88694e8417	chore: stop app-server auth refresh storms after permanent token failure (#15530 ) built from #14256. PR description from @etraut-openai: This PR addresses a hole in [PR 11802](https://github.com/openai/codex/pull/11802). The previous PR assumed that app server clients would respond to token refresh failures by presenting the user with an error ("you must log in again") and then not making further attempts to call network endpoints using the expired token. While they do present the user with this error, they don't prevent further attempts to call network endpoints and can repeatedly call `getAuthStatus(refreshToken=true)`, resulting in many failed calls to the token refresh endpoint. There are three solutions I considered here: 1. Change the getAuthStatus app server call to return a null auth if the caller specified "refreshToken" on input and the refresh attempt fails. This will cause clients to immediately log out the user and return them to the log in screen. This is a really bad user experience. It's also a breaking change in the app server contract that could break third-party clients. 2. Augment the getAuthStatus app server call to return an additional field that indicates the state of "token could not be refreshed". This is a non-breaking change to the app server API, but it requires non-trivial changes for all clients to handle this new field properly. 3. Change the getAuthStatus implementation to handle the case where a token refresh fails by marking the AuthManager's in-memory access and refresh tokens as "poisoned" so they are no longer used. This is the simplest fix that requires no client changes. I chose option 3. Here's Codex's explanation of this change: When an app-server client asks `getAuthStatus(refreshToken=true)`, we may try to refresh a stale ChatGPT access token. If that refresh fails permanently (for example `refresh_token_reused`, expired, or revoked), the old behavior was bad in two ways: 1. 
We kept the in-memory auth snapshot alive as if it were still usable. 2. Later auth checks could retry refresh again and again, creating a storm of doomed `/oauth/token` requests and repeatedly surfacing the same failure. This is especially painful for app-server clients because they poll auth status and can keep driving the refresh path without any real chance of recovery. This change makes permanent refresh failures terminal for the current managed auth snapshot without changing the app-server API contract. What changed: - `AuthManager` now poisons the current managed auth snapshot in memory after a permanent refresh failure, keyed to the unchanged `AuthDotJson`. - Once poisoned, later refresh attempts for that same snapshot fail fast locally without calling the auth service again. - The poison is cleared automatically when auth materially changes, such as a new login, logout, or reload of different auth state from storage. - `getAuthStatus(includeToken=true)` now omits `authToken` after a permanent refresh failure instead of handing out the stale cached bearer token. This keeps the current auth method visible to clients, avoids forcing an immediate logout flow, and stops repeated refresh attempts for credentials that cannot recover. --------- Co-authored-by: Eric Traut <etraut@openai.com>	2026-03-24 12:39:58 -07:00
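The poisoning behavior described in the commit above can be sketched as below. This is a simplified illustration under stated assumptions, not the real `AuthManager`: `AuthSnapshot` stands in for `AuthDotJson`, the `remote` closure stands in for the call to the auth service, and `remote_calls` exists only to make the fail-fast path observable.

```rust
/// Hypothetical stand-in for the persisted auth state (`AuthDotJson` in the commit).
#[derive(Clone, Debug, PartialEq)]
struct AuthSnapshot {
    refresh_token: String,
}

#[derive(Debug, PartialEq)]
enum RefreshError {
    /// Permanent failures such as `refresh_token_reused`, expired, or revoked.
    Permanent,
    /// Failures that might succeed on a later retry.
    Transient,
}

struct AuthManager {
    current: AuthSnapshot,
    /// The snapshot that failed permanently, if any.
    poisoned: Option<AuthSnapshot>,
    /// Number of calls that actually reached the (simulated) auth service.
    remote_calls: u32,
}

impl AuthManager {
    fn new(current: AuthSnapshot) -> Self {
        AuthManager { current, poisoned: None, remote_calls: 0 }
    }

    /// Try to refresh. A poisoned, unchanged snapshot fails locally
    /// without invoking `remote` again.
    fn refresh(
        &mut self,
        remote: impl FnOnce(&AuthSnapshot) -> Result<AuthSnapshot, RefreshError>,
    ) -> Result<(), RefreshError> {
        if self.poisoned.as_ref() == Some(&self.current) {
            return Err(RefreshError::Permanent);
        }
        self.remote_calls += 1;
        match remote(&self.current) {
            Ok(next) => {
                self.current = next;
                self.poisoned = None;
                Ok(())
            }
            Err(RefreshError::Permanent) => {
                // Remember which snapshot is unrecoverable.
                self.poisoned = Some(self.current.clone());
                Err(RefreshError::Permanent)
            }
            Err(other) => Err(other),
        }
    }

    /// New login, logout, or reload: materially different auth clears the poison.
    fn set_auth(&mut self, next: AuthSnapshot) {
        if next != self.current {
            self.poisoned = None;
        }
        self.current = next;
    }
}
```

Keying the poison to the snapshot itself, rather than to a boolean flag, is what lets a later login or reload recover automatically in this sketch: once the stored auth differs, the comparison no longer matches and refresh attempts reach the service again.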
Celia Chen	7dc2cd2ebe	chore: use access token expiration for proactive auth refresh (#15545 ) Follow up to #15357 by making proactive ChatGPT auth refresh depend on the access token's JWT expiration instead of treating `last_refresh` age as the primary source of truth.	2026-03-24 19:34:48 +00:00
Charley Cunningham	910cf49269	[codex] Stabilize second compaction history test (#15605 ) ## Summary - replace the second-compaction test fixtures with a single ordered `/responses` sequence - assert against the real recorded request order instead of aggregating per-mock captures - realign the second-summary assertion to the first post-compaction user turn where the summary actually appears ## Root cause `compact_resume_after_second_compaction_preserves_history` collected requests from multiple `mount_sse_once_match` recorders. Overlapping matchers could record the same HTTP request more than once, so the test indexed into a duplicated synthetic list rather than the true request stream. That made the summary assertion depend on matcher evaluation order and platform-specific behavior. ## Impact - makes the flaky test deterministic by removing duplicate request capture from the assertion path - keeps the change scoped to the test only ## Validation - `just fmt` - `just argument-comment-lint` - `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - repeated the same targeted test 10 times --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-24 10:14:21 -07:00
pakrym-oai	f49eb8e9d7	Extract sandbox manager and transforms into codex-sandboxing (#15603 ) Extract sandbox manager	2026-03-24 08:20:57 -07:00
dhruvgupta-oai	c2410060ea	[codex-cli][app-server] Update self-serve business usage limit copy in error returned (#15478 ) ## Summary - update the self-serve business usage-based limit message to direct users to their admin for additional credits - add a focused unit test for the self_serve_business_usage_based plan branch Also added: if you hit a rate limit but still have credits, Codex CLI would tell you to switch the model. We shouldn't do this if you have credits, so this is now fixed. ## Test - launched the source-built CLI and verified the updated message is shown for the self-serve business usage-based plan ![Test screenshot](https://raw.githubusercontent.com/openai/codex/5cc3c013ef17ac5c66dfd9395c0d3c4837602231/docs/images/self-serve-business-usage-limit.png)	2026-03-24 04:41:38 +00:00
Charley Cunningham	f547b79bd0	Add fork snapshot modes (#15239 ) ## Summary - add `ForkSnapshotMode` to `ThreadManager::fork_thread` so callers can request either a committed snapshot or an interrupted snapshot - share the model-visible `<turn_aborted>` history marker between the live interrupt path and interrupted forks - update the small set of direct fork callsites to pass `ForkSnapshotMode::Committed` Note: this enables /btw to work similarly to Esc to interrupt (hopefully somewhat in distribution) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-23 19:05:42 -07:00
Charley Cunningham	0f34b14b41	[codex] Add rollback context duplication snapshot (#15562 ) ## What changed - adds a targeted snapshot test for rollback with contextual diffs in `codex_tests.rs` - snapshots the exact model-visible request input before the rolled-back turn and on the follow-up request after rollback - shows the duplicate developer and environment context pair appearing again before the follow-up user message ## Why Rollback currently rewinds the reference context baseline without rewinding the live session overrides. On the next turn, the same contextual diff is emitted again and duplicated in the request sent to the model. ## Impact - makes the regression visible in a canonical snapshot test - keeps the snapshot on the shared `context_snapshot` path without adding new formatting helpers - gives a direct repro for future fixes to rollback/context reconstruction --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-23 15:36:23 -07:00
Dylan Hurd	67c1c7c054	chore(core) Add approvals reviewer to UserTurn (#15426 ) ## Summary Adds support for approvals_reviewer to `Op::UserTurn` so we can migrate `[CodexMessageProcessor::turn_start]` to use Op::UserTurn ## Testing - [x] Adds quick test for the new field Co-authored-by: Codex <noreply@openai.com>	2026-03-23 15:19:01 -07:00


900 Commits