codex

mirror of https://github.com/openai/codex.git synced 2026-05-27 14:34:24 +00:00

Author	SHA1	Message	Date
Casey Chow	99dcf63956	Add sandboxed file transfer tools Co-authored-by: Codex <noreply@openai.com>	2026-03-19 01:34:45 +00:00
pakrym-oai	83a60fdb94	Add FS abstraction and use in view_image (#14960) Adds an environment crate and an environment + file system abstraction. An environment is a combination of attributes and services specific to the environment the agent is connected to: file system, process management, OS, default shell. The goal is to move most of the agent logic that assumes an environment to work through the environment abstraction.	2026-03-17 17:36:23 -07:00
xl-openai	a5d3114e97	feat: Add product-aware plugin policies and clean up manifest naming (#14993) - Add shared Product support to marketplace plugin policy and skill policy (not enforced yet). - Move marketplace installation/authentication under policy and model it as MarketplacePluginPolicy. - Rename plugin/marketplace local manifest types to separate raw serde shapes from resolved in-memory models.	2026-03-17 17:01:34 -07:00
viyatb-oai	0d1539e74c	fix(linux-sandbox): prefer system /usr/bin/bwrap when available (#14963 ) ## Problem Ubuntu/AppArmor hosts started failing in the default Linux sandbox path after the switch to vendored/default bubblewrap in `0.115.0`. The clearest report is in [#14919](https://github.com/openai/codex/issues/14919), especially [this investigation comment](https://github.com/openai/codex/issues/14919#issuecomment-4076504751): on affected Ubuntu systems, `/usr/bin/bwrap` works, but a copied or vendored `bwrap` binary fails with errors like `bwrap: setting up uid map: Permission denied` or `bwrap: loopback: Failed RTM_NEWADDR: Operation not permitted`. The root cause is Ubuntu's `/etc/apparmor.d/bwrap-userns-restrict` profile, which grants `userns` access specifically to `/usr/bin/bwrap`. Once Codex started using a vendored/internal bubblewrap path, that path was no longer covered by the distro AppArmor exception, so sandbox namespace setup could fail even when user namespaces were otherwise enabled and `uidmap` was installed. ## What this PR changes - prefer system `/usr/bin/bwrap` whenever it is available - keep vendored bubblewrap as the fallback when `/usr/bin/bwrap` is missing - when `/usr/bin/bwrap` is missing, surface a Codex startup warning through the app-server/TUI warning path instead of printing directly from the sandbox helper with `eprintln!` - use the same launcher decision for both the main sandbox execution path and the `/proc` preflight path - document the updated Linux bubblewrap behavior in the Linux sandbox and core READMEs ## Why this fix This still fixes the Ubuntu/AppArmor regression from [#14919](https://github.com/openai/codex/issues/14919), but it keeps the runtime rule simple and platform-agnostic: if the standard system bubblewrap is installed, use it; otherwise fall back to the vendored helper. The warning now follows that same simple rule. 
If Codex cannot find `/usr/bin/bwrap`, it tells the user that it is falling back to the vendored helper, and it does so through the existing startup warning plumbing that reaches the TUI and app-server instead of low-level sandbox stderr. ## Testing - `cargo test -p codex-linux-sandbox` - `cargo test -p codex-app-server --lib` - `cargo test -p codex-tui-app-server tests::embedded_app_server_start_failure_is_returned` - `cargo clippy -p codex-linux-sandbox --all-targets` - `cargo clippy -p codex-app-server --all-targets` - `cargo clippy -p codex-tui-app-server --all-targets`	2026-03-17 23:05:34 +00:00
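The launcher rule described in the commit above (use the system bubblewrap when present, otherwise fall back to the vendored helper and hand a warning to the caller) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `BwrapLauncher` and `choose_bwrap` are hypothetical names, and the warning text is invented for the example.

```rust
use std::path::{Path, PathBuf};

/// Which bubblewrap binary to launch.
/// Hypothetical type for illustration; not taken from the Codex source.
#[derive(Debug, PartialEq)]
enum BwrapLauncher {
    /// The distro-provided binary, e.g. `/usr/bin/bwrap`.
    System(PathBuf),
    /// The vendored helper, plus a warning the caller can surface at startup.
    Vendored { warning: String },
}

/// Prefer the system binary when it exists as a regular file; otherwise
/// fall back to the vendored helper and return a warning instead of
/// printing it directly.
fn choose_bwrap(system_path: &Path) -> BwrapLauncher {
    if system_path.is_file() {
        BwrapLauncher::System(system_path.to_path_buf())
    } else {
        BwrapLauncher::Vendored {
            warning: format!(
                "{} not found; falling back to vendored bubblewrap",
                system_path.display()
            ),
        }
    }
}
```

Returning the warning as data, rather than writing to stderr inside the helper, is what lets one decision feed both the sandbox execution path and whatever UI reports startup warnings.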
Ahmed Ibrahim	98be562fd3	Unify realtime shutdown in core (#14902 ) - route realtime startup, input, and transport failures through a single shutdown path - emit one realtime error/closed lifecycle while clearing session state once --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>	2026-03-17 15:58:52 -07:00
Ahmed Ibrahim	c6ab4ee537	Gate realtime audio interruption logic to v2 (#14984 ) - thread the realtime version into conversation start and app-server notifications - keep playback-aware mic gating and playback interruption behavior on v2 only, leaving v1 on the legacy path	2026-03-17 15:24:37 -07:00
xl-openai	1a9555eda9	Cleanup skills/remote/xxx endpoints. (#14977) Remove the skills/remote/xxx endpoints, as they are not in use for now.	2026-03-17 15:22:36 -07:00
Colin Young	0d2ff40a58	Add auth env observability (#14905 ) CXC-410 Emit Env Var Status with `/feedback` report Add more observability on top of #14611 [Unset](https://openai.sentry.io/issues/7340419168/?project=4510195390611458&query=019cfa8d-c1ba-7002-96fa-e35fc340551d&referrer=issue-stream) [Set](https://openai.sentry.io/issues/7340426331/?project=4510195390611458&query=019cfa91-aba1-7823-ab7e-762edfbc0ed4&referrer=issue-stream) <img width="1063" height="610" alt="image" src="https://github.com/user-attachments/assets/937ab026-1c2d-4757-81d5-5f31b853113e" /> ###### Summary - Adds auth-env telemetry that records whether key auth-related env overrides were present on session start and request paths. - Threads those auth-env fields through `/responses`, websocket, and `/models` telemetry and feedback metadata. - Buckets custom provider `env_key` configuration to a safe `"configured"` value instead of emitting raw config text. - Keeps the slice observability-only: no raw token values or raw URLs are emitted. ###### Rationale (from spec findings) - 401 and auth-path debugging needs a way to distinguish env-driven auth paths from sessions with no auth env override. - Startup and model-refresh failures need the same auth-env diagnostics as normal request failures. - Feedback and Sentry tags need the same auth-env signal as OTel events so reports can be triaged consistently. - Custom provider config is user-controlled text, so the telemetry contract must stay presence-only / bucketed. ###### Scope - Adds a small `AuthEnvTelemetry` bundle for env presence collection and threads it through the main request/session telemetry paths. - Does not add endpoint/base-url/provider-header/geo routing attribution or broader telemetry API redesign. ###### Trade-offs - `provider_env_key_name` is bucketed to `"configured"` instead of preserving the literal configured env var name. 
- `/models` is included because startup/model-refresh auth failures need the same diagnostics, but broader parity work remains out of scope. - This slice keeps the existing telemetry APIs and layers auth-env fields onto them rather than redesigning the metadata model. ###### Client follow-up - Add the separate endpoint/base-url attribution slice if routing-source diagnosis is still needed. - Add provider-header or residency attribution only if auth-env presence proves insufficient in real reports. - Revisit whether any additional auth-related env inputs need safe bucketing after more 401 triage data. ###### Testing - `cargo test -p codex-core emit_feedback_request_tags -- --nocapture` - `cargo test -p codex-core collect_auth_env_telemetry_buckets_provider_env_key_name -- --nocapture` - `cargo test -p codex-core models_request_telemetry_emits_auth_env_feedback_tags_on_failure -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability -- --nocapture` - `cargo test -p codex-core --no-run --message-format short` - `cargo test -p codex-otel --no-run --message-format short` --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 14:26:27 -07:00
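The presence-only bucketing mentioned in the commit above can be illustrated with a small sketch. The function name `bucket_env_key` is hypothetical, and treating an unset or blank value as `None` is an assumption made for the example; the commit only states that the configured value is reported as `"configured"` rather than as raw text.

```rust
/// Map a user-controlled `env_key` setting to a presence-only bucket.
/// Hypothetical helper: the literal env var name is never returned.
fn bucket_env_key(env_key: Option<&str>) -> Option<&'static str> {
    match env_key {
        // Any non-blank value collapses to the same safe bucket.
        Some(name) if !name.trim().is_empty() => Some("configured"),
        // Assumption: unset or blank values are reported as absent.
        _ => None,
    }
}
```

With this shape, telemetry can distinguish "an override was configured" from "no override" without ever carrying user-supplied text.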
pakrym-oai	ee756eb80f	Rename exec_wait tool to wait (#14983 ) Summary - document that code mode only exposes `exec` and the renamed `wait` tool - update code mode tool spec and descriptions to match the new tool name - rename tests and helper references from `exec_wait` to `wait` Testing - Not run (not requested)	2026-03-17 14:22:26 -07:00
Ahmed Ibrahim	4d9d4b7b0f	Stabilize approval matrix write-file command (#14968 ) ## What is flaky The approval-matrix `WriteFile` scenario is flaky. It sometimes fails in CI even though the approval logic is unchanged, because the test delegates the file write and readback to shell parsing instead of deterministic file I/O. ## Why it was flaky The test generated a command shaped like `printf ... > file && cat file`. That means the scenario depended on shell quoting, redirection, newline handling, and encoding behavior in addition to the approval system it was actually trying to validate. If the shell interpreted the payload differently, the test would report an approval failure even though the product logic was fine. That also made failures hard to diagnose, because the test did not log the exact generated command or the parsed result payload. ## How this PR fixes it This PR replaces the shell-redirection path with a deterministic `python3 -c` script that writes the file with `Path.write_text(..., encoding='utf-8')` and then reads it back with the same UTF-8 path. It also logs the generated command and the resulting exit code/stdout for the approval scenario so any future failure is directly attributable. ## Why this fix fixes the flakiness The scenario no longer depends on shell parsing and redirection semantics. The file contents are produced and read through explicit UTF-8 file I/O, so the approval test is measuring approval behavior instead of shell behavior. The added diagnostics mean a future failure will show the exact command/result pair instead of looking like a generic intermittent mismatch. Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-17 13:52:36 -07:00
Ahmed Ibrahim	b02388672f	Stabilize Windows cmd-based shell test harnesses (#14958 ) ## What is flaky The Windows shell-driven integration tests in `codex-rs/core` were intermittently unstable, especially: - `apply_patch_cli_can_use_shell_command_output_as_patch_input` - `websocket_test_codex_shell_chain` - `websocket_v2_test_codex_shell_chain` ## Why it was flaky These tests were exercising real shell-tool flows through whichever shell Codex selected on Windows, and the `apply_patch` test also nested a PowerShell read inside `cmd /c`. There were multiple independent sources of nondeterminism in that setup: - The test harness depended on the model-selected Windows shell instead of pinning the shell it actually meant to exercise. - `cmd.exe /c powershell.exe -Command "..."` is quoting-sensitive; on CI that could leave the read command wrapped as a literal string instead of executing it. - Even after getting the quoting right, PowerShell could emit CLIXML progress records like module-initialization output onto stdout. - The `apply_patch` test was building a patch directly from shell stdout, so any quoting artifact or progress noise corrupted the patch input. So the failures were driven by shell startup and output-shape variance, not by the `apply_patch` or websocket logic themselves. ## How this PR fixes it - Add a test-only `user_shell_override` path so Windows integration tests can pin `cmd.exe` explicitly. - Use that override in the websocket shell-chain tests and in the `apply_patch` harness. - Change the nested Windows file read in `apply_patch_cli_can_use_shell_command_output_as_patch_input` to a UTF-8 PowerShell `-EncodedCommand` script. - Run that nested PowerShell process with `-NonInteractive`, set `$ProgressPreference = 'SilentlyContinue'`, and read the file with `[System.IO.File]::ReadAllText(...)`. 
## Why this fix fixes the flakiness The outer harness now runs under a deterministic shell, and the inner PowerShell read no longer depends on fragile `cmd` quoting or on progress output staying quiet by accident. The shell tool returns only the file contents, so patch construction and websocket assertions depend on stable test inputs instead of on runner-specific shell behavior. --------- Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-17 20:21:46 +00:00
Matthew Zeng	683c37ce75	[plugins] Support plugin installation elicitation. (#14896) It now supports: - Connectors that are not installed yet but come from installed and enabled plugins. - Plugins on the allowlist that are not installed yet.	2026-03-17 13:19:28 -07:00
Ahmed Ibrahim	0d531c05f2	Fix code mode yield startup race (#14959 )	2026-03-17 11:09:12 -07:00
jif-oai	d484bb57d9	feat: add suffix to shell snapshot name (#14938 ) https://github.com/openai/codex/issues/14906	2026-03-17 17:59:27 +00:00
Shijie Rao	8e258eb3f5	Feat: CXA-1831 Persist latest model and reasoning effort in sqlite (#14859) ### Summary The goal is to get the latest turn model and reasoning effort on thread/resume if no override is provided in the thread/resume call. This is part 1, in which we write the model and reasoning effort for a thread to the sqlite db; a follow-up PR will consume the two new fields on thread/resume. [The part 2 PR is currently WIP](https://github.com/openai/codex/pull/14888), and this one can be merged independently.	2026-03-17 10:14:34 -07:00
Owen Lin	6ea041032b	fix(core): prevent hanging turn/start due to websocket warming issues (#14838) ## Description This PR fixes a bad first-turn failure mode in app-server when the startup websocket prewarm hangs. Before this change, `initialize -> thread/start -> turn/start` could sit behind the prewarm for up to five minutes, so the client would not see `turn/started`, and even `turn/interrupt` would block because the turn had not actually started yet. Now: - websocket startup has a (configurable) timeout of 15s, exposed as `websocket_startup_timeout_ms` in config.toml - `turn/started` is sent immediately on `turn/start` even if the websocket is still connecting - `turn/interrupt` can be used to cancel a turn that is still waiting on the websocket warmup - the turn task waits for the full 15s websocket warming timeout before falling back ## Why The old behavior made app-server feel stuck at exactly the moment the client expects turn lifecycle events to start flowing. That was especially painful for external clients, because from their point of view the server had accepted the request but then went silent for minutes. ## Configuring the websocket startup timeout Set it in config.toml like this: ``` [model_providers.openai] supports_websockets = true websocket_connect_timeout_ms = 15000 ```	2026-03-17 10:07:46 -07:00
jif-oai	e8add54e5d	feat: show effective model in spawn agent event (#14944 ) Show effective model after the full config layering for the sub agent	2026-03-17 16:58:58 +00:00
daveaitel-openai	ef36d39199	Fix agent jobs finalization race and reduce status polling churn (#14843 ) ## Summary - make `report_agent_job_result` atomically transition an item from running to completed while storing `result_json` - remove brittle finalization grace-sleep logic and make finished-item cleanup idempotent - replace blind fixed-interval waiting with status-subscription-based waiting for active worker threads - add state runtime tests for atomic completion and late-report rejection ## Why This addresses the race and polling concerns in #13948 by removing timing-based correctness assumptions and reducing unnecessary status polling churn. ## Validation - `cd codex-rs && just fmt` - `cd codex-rs && cargo test -p codex-state` - `cd codex-rs && cargo test -p codex-core --test all suite::agent_jobs` - `cd codex-rs && cargo test` - fails in an unrelated app-server tracing test: `message_processor::tracing_tests::thread_start_jsonrpc_span_exports_server_span_and_parents_children` timed out waiting for response ## Notes - This PR supersedes #14129 with the same agent-jobs fix on a clean branch from `main`. - The earlier PR branch was stacked on unrelated history, which made the review diff include unrelated commits. Fixes #13948	2026-03-17 10:40:14 -04:00
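The "atomically transition an item from running to completed" idea in the commit above can be sketched with an in-memory store. This is only an illustration under stated assumptions: `JobStore`, `ItemStatus`, `start`, and the `bool` return value are hypothetical, and the real `report_agent_job_result` may be implemented quite differently (the commit does not show its storage layer).

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Status of one agent job item (hypothetical model for illustration).
#[derive(Debug, Clone, PartialEq)]
enum ItemStatus {
    Running,
    Completed { result_json: String },
}

/// In-memory stand-in for the state runtime.
struct JobStore {
    items: Mutex<HashMap<String, ItemStatus>>,
}

impl JobStore {
    fn new() -> Self {
        JobStore { items: Mutex::new(HashMap::new()) }
    }

    /// Mark an item as running.
    fn start(&self, id: &str) {
        self.items.lock().unwrap().insert(id.to_string(), ItemStatus::Running);
    }

    /// Store the result only if the item is currently running.
    /// The status check and the write happen under one lock, so a late or
    /// duplicate report cannot overwrite a completed item.
    fn report_result(&self, id: &str, result_json: &str) -> bool {
        let mut items = self.items.lock().unwrap();
        if items.get(id) != Some(&ItemStatus::Running) {
            return false; // unknown item or already completed: reject
        }
        items.insert(
            id.to_string(),
            ItemStatus::Completed { result_json: result_json.to_string() },
        );
        true
    }
}
```

Because correctness comes from the single check-and-write step, no grace sleep is needed to wait for stragglers: a second report for the same item simply returns `false`.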
jif-oai	4ed19b0766	feat: rename to get more explicit close agent (#14935 ) https://github.com/openai/codex/issues/14907	2026-03-17 14:37:20 +00:00
jif-oai	31648563c8	feat: centralize package manager version (#14920 )	2026-03-17 12:03:07 +00:00
viyatb-oai	db7e02c739	fix: canonicalize symlinked Linux sandbox cwd (#14849 ) ## Problem On Linux, Codex can be launched from a workspace path that is a symlink (for example, a symlinked checkout or a symlinked parent directory). Our sandbox policy intentionally canonicalizes writable/readable roots to the real filesystem path before building the bubblewrap mounts. That part is correct and needed for safety. The remaining bug was that bubblewrap could still inherit the helper process's logical cwd, which might be the symlinked alias instead of the mounted canonical path. In that case, the sandbox starts in a cwd that does not exist inside the sandbox namespace even though the real workspace is mounted. This can cause sandboxed commands to fail in symlinked workspaces. ## Fix This PR keeps the sandbox policy behavior the same, but separates two concepts that were previously conflated: - the canonical cwd used to define sandbox mounts and permissions - the caller's logical cwd used when launching the command On the Linux bubblewrap path, we now thread the logical command cwd through the helper explicitly and only add `--chdir <canonical path>` when the logical cwd differs from the mounted canonical path. That means: - permissions are still computed from canonical paths - bubblewrap starts the command from a cwd that definitely exists inside the sandbox - we do not widen filesystem access or undo the earlier symlink hardening ## Why This Is Safe This is a narrow Linux-only launch fix, not a policy change. - Writable/readable root canonicalization stays intact. - Protected metadata carveouts still operate on canonical roots. - We only override bubblewrap's inherited cwd when the logical path would otherwise point at a symlink alias that is not mounted in the sandbox. 
## Tests - kept the existing protocol/core regression coverage for symlink canonicalization - added regression coverage for symlinked cwd handling in the Linux bubblewrap builder/helper path Local validation: - `just fmt` - `cargo test -p codex-protocol` - `cargo test -p codex-core normalize_additional_permissions_canonicalizes_symlinked_write_paths` - `cargo clippy -p codex-linux-sandbox -p codex-protocol -p codex-core --tests -- -D warnings` - `cargo build --bin codex` ## Context This is related to #14694. The earlier writable-root symlink fix addressed the mount/permission side; this PR fixes the remaining symlinked-cwd launch mismatch in the Linux sandbox path.	2026-03-16 22:39:18 -07:00
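The `--chdir` rule described in the commit above (add it only when the caller's logical cwd differs from the mounted canonical path) can be sketched as a small argument builder. The function name `chdir_args` is hypothetical and the sketch takes both paths as inputs rather than resolving symlinks itself; only the `--chdir` flag comes from bubblewrap.

```rust
use std::path::Path;

/// Extra bubblewrap arguments for the working directory.
/// Hypothetical helper: returns `--chdir <canonical path>` only when the
/// logical cwd (possibly a symlink alias) is not the canonical path that
/// was mounted into the sandbox.
fn chdir_args(logical_cwd: &Path, canonical_cwd: &Path) -> Vec<String> {
    if logical_cwd == canonical_cwd {
        // The inherited cwd already exists inside the sandbox.
        Vec::new()
    } else {
        vec!["--chdir".to_string(), canonical_cwd.display().to_string()]
    }
}
```

Permissions are unaffected by this step: the mounts are still derived from canonical paths, and the helper only decides where the command starts.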
Ahmed Ibrahim	79f476e47d	[stack 3/4] Add current thread context to realtime startup (#14829 ) ## Stack Position 3/4. Top-of-stack sibling built on #14830. ## Base - #14830 ## Sibling - #14827 ## Scope - Extend the realtime startup context with a bounded summary of the latest thread turns for continuity. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 05:11:05 +00:00
Thibault Sottiaux	8e34caffcc	[codex] add Jason as a predefined subagent name (#14881 ) This change adds Jason to codex-core's built-in subagent nickname pool so spawned agents can pick it without any custom role configuration. The default list was simply missing that predefined name (a grave mistake).	2026-03-16 22:01:14 -07:00
xl-openai	e5a28ba0c2	fix: align marketplace display name with existing interface conventions (#14886 ) 1. camelCase for displayName; 2. move displayName under interface.	2026-03-16 21:52:19 -07:00
Ahmed Ibrahim	fbd7f9b986	[stack 2/4] Align main realtime v2 wire and runtime flow (#14830 ) ## Stack Position 2/4. Built on top of #14828. ## Base - #14828 ## Unblocks - #14829 - #14827 ## Scope - Port the realtime v2 wire parsing, session, app-server, and conversation runtime behavior onto the split websocket-method base. - Branch runtime behavior directly on the current realtime session kind instead of parser-derived flow flags. - Keep regression coverage in the existing e2e suites. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-16 21:38:07 -07:00
xl-openai	1d85fe79ed	feat: support remote_sync for plugin install/uninstall. (#14878 ) - Added forceRemoteSync to plugin/install and plugin/uninstall. - With forceRemoteSync=true, we update the remote plugin status first, then apply the local change only if the backend call succeeds. - Kept plugin/list(forceRemoteSync=true) as the main recon path, and for now it treats remote enabled=false as uninstall. We will eventually migrate to plugin/installed for more precise state handling.	2026-03-16 21:37:27 -07:00
xl-openai	49c2b66ece	Add marketplace display names to plugin/list (#14861 ) Add display_name support to marketplace.json.	2026-03-16 19:04:40 -07:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. 
## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
pakrym-oai	a3ba10b44b	Add exit helper to code mode scripts (#14851 ) - Summary - expose `exit` through the code mode bridge and module so scripts can stop mid-flight - surface the helper in the description documentation - add a regression test ensuring `exit()` terminates execution cleanly - Testing - Not run (not requested)	2026-03-16 22:07:58 +00:00
Andi Liu	4c9dbc1f88	memories: exclude AGENTS and skills from stage1 input (#14268 ) ###### Why/Context/Summary - Exclude injected AGENTS.md instructions and standalone skill payloads from memory stage 1 inputs so memory generation focuses on conversation content instead of prompt scaffolding. - Strip only the AGENTS fragment from mixed contextual user messages during stage-1 serialization, which preserves environment context in the same message. - Keep subagent notifications in the memory input, and add focused unit coverage for the fragment classifier, rollout policy, and stage-1 serialization path. ###### Test plan - `just fmt` - `cargo test -p codex-core --lib contextual_user_message` - `cargo test -p codex-core --lib rollout::policy` - `cargo test -p codex-core --lib memories::phase1`	2026-03-16 19:30:38 +00:00
Anton Panasenko	663dd3f935	fix(core): fix sanitize name to use '_' everywhere (#14833 )	2026-03-16 12:22:10 -07:00
Eric Traut	db89b73a9c	Move TUI on top of app server (parallel code) (#14717 ) This PR replicates the `tui` code directory and creates a temporary parallel `tui_app_server` directory. It also implements a new feature flag `tui_app_server` to select between the two tui implementations. Once the new app-server-based TUI is stabilized, we'll delete the old `tui` directory and feature flag.	2026-03-16 10:49:19 -06:00
jif-oai	3f266bcd68	feat: make interrupt state not final for multi-agents (#13850) Make `interrupted` an agent state and make it not final. As a result, a `wait` won't return on an interrupted agent and no notification will be sent to the parent agent. The rationales are: * If a user interrupts a sub-agent for any reason, you don't want the parent agent to instantaneously ask the sub-agent to restart * If a parent agent interrupts a sub-agent, there is no need to add a noisy notification in the parent agent	2026-03-16 16:39:40 +00:00
jif-oai	18ad67549c	feat: improve skills cache key to take into account config layering (#14806 ) Fix https://github.com/openai/codex/issues/14161 This fixes sub-agent [[skills.config]] overrides being ignored when parent and child share the same cwd. The root cause was that turn skill loading rebuilt from cwd-only state and reused a cwd-scoped cache, so role-local skill enable/disable overrides did not reliably affect the spawned agent's effective skill set. This change switches turn construction to use the effective per-turn config and adds a config-aware skills cache keyed by skill roots plus final disabled paths.	2026-03-16 16:12:44 +00:00
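The cache-key change in the commit above (key on skill roots plus final disabled paths instead of cwd alone) can be sketched as follows. `SkillsCacheKey` and `skills_cache_key` are hypothetical names, and using sorted sets so that ordering does not affect the key is a design choice made for this example, not something the commit states.

```rust
use std::collections::BTreeSet;
use std::path::PathBuf;

/// Cache key built from the effective per-turn config rather than cwd.
/// Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SkillsCacheKey {
    skill_roots: BTreeSet<PathBuf>,
    disabled_paths: BTreeSet<PathBuf>,
}

/// Build a key from skill roots and the final set of disabled skill paths.
/// Sorted sets make the key independent of the order the paths arrive in.
fn skills_cache_key(skill_roots: &[&str], disabled_paths: &[&str]) -> SkillsCacheKey {
    SkillsCacheKey {
        skill_roots: skill_roots.iter().map(|p| PathBuf::from(*p)).collect(),
        disabled_paths: disabled_paths.iter().map(|p| PathBuf::from(*p)).collect(),
    }
}
```

A parent and a sub-agent that share a cwd but differ in disabled skills now produce different keys, so the child no longer reuses the parent's cached skill set.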
jif-oai	33acc1e65f	fix: sub-agent role when using profiles (#14807) Fix the layering conflict when a project profile is used with agents. This PR cleans up the config layering and makes sure the agent config takes precedence over the project profile. Fixes https://github.com/openai/codex/issues/13849, https://github.com/openai/codex/issues/14671	2026-03-16 16:08:16 +00:00
Matthew Zeng	029aab5563	fix(core): preserve tool_params for elicitations (#14769 ) - [x] Preserve tool_params keys.	2026-03-15 23:15:52 -07:00
Charley Cunningham	6fdeb1d602	Reuse guardian session across approvals (#14668 ) ## Summary - reuse a guardian subagent session across approvals so reviews keep a stable prompt cache key and avoid one-shot startup overhead - clear the guardian child history before each review so prior guardian decisions do not leak into later approvals - include the `smart_approvals` -> `guardian_approval` feature flag rename in the same PR to minimize release latency on a very tight timeline - add regression coverage for prompt-cache-key reuse without prior-review prompt bleed ## Request - Bug/enhancement request: internal guardian prompt-cache and latency improvement request --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-15 22:56:18 -07:00
friel-openai	ba463a9dc7	Preserve background terminals on interrupt and rename cleanup command to /stop (#14602 ) ### Motivation - Interrupting a running turn (Ctrl+C / Esc) currently also terminates long‑running background shells, which is surprising for workflows like local dev servers or file watchers. - The existing cleanup command name was confusing; callers expect an explicit command to stop background terminals rather than a UI clear action. - Make background‑shell termination explicit and surface a clearer command name while preserving backward compatibility. ### Description - Renamed the background‑terminal cleanup slash command from `Clean` (`/clean`) to `Stop` (`/stop`) and kept `clean` as an alias in the command parsing/visibility layer, updated the user descriptions and command popup wiring accordingly. - Updated the unified‑exec footer text and snapshots to point to `/stop` (and trimmed corresponding snapshot output to match the new label). - Changed interrupt behavior so `Op::Interrupt` (Ctrl+C / Esc interrupt) no longer closes or clears tracked unified exec / background terminal processes in the TUI or core cleanup path; background shells are now preserved after an interrupt. - Updated protocol/docs to clarify that `turn/interrupt` (or `Op::Interrupt`) interrupts the active turn but does not terminate background terminals, and that `thread/backgroundTerminals/clean` is the explicit API to stop those shells. - Updated unit/integration tests and insta snapshots in the TUI and core unified‑exec suites to reflect the new semantics and command name. ### Testing - Ran formatting with `just fmt` in `codex-rs` (succeeded). - Ran `cargo test -p codex-protocol` (succeeded). 
- Attempted `cargo test -p codex-tui` but the build could not complete in this environment due to a native build dependency that requires `libcap` development headers (the `codex-linux-sandbox` vendored build step); install `libcap-dev` / make `libcap.pc` available in `PKG_CONFIG_PATH` to run the TUI test suite locally. - Updated and accepted the affected `insta` snapshots for the TUI changes so visual diffs reflect the new `/stop` wording and preserved interrupt behavior. ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69b39c44b6dc8323bd133ae206310fae)	2026-03-15 22:17:25 -07:00
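The rename-with-alias behavior described in the commit above (`/stop` is the new name, `/clean` still resolves to the same command) can be sketched as a tiny parser. `SlashCommand` and `parse_slash_command` are hypothetical names for this illustration; the actual command parsing layer in the TUI is not shown in the commit.

```rust
/// Slash commands relevant to this example (hypothetical enum).
#[derive(Debug, PartialEq)]
enum SlashCommand {
    /// Stop background terminals.
    Stop,
}

/// Parse a slash command, accepting the legacy `clean` name as an alias
/// for `stop` so existing habits keep working.
fn parse_slash_command(input: &str) -> Option<SlashCommand> {
    match input.trim().strip_prefix('/')? {
        "stop" | "clean" => Some(SlashCommand::Stop),
        _ => None,
    }
}
```

Keeping the alias in the parser, while only advertising `/stop` in user-facing text, is one way to preserve backward compatibility without showing two names for the same action.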
Matthew Zeng	d4af6053e2	[apps] Improve search tool fallback. (#14732 ) - [x] Bypass tool search and stuff tool specs directly into model context when either a. Tool search is not available for the model or b. There are not that many tools to search for.	2026-03-15 21:41:55 -07:00
Matthew Zeng	49edf311ac	[apps] Add tool call meta. (#14647 ) - [x] Add resource_uri and other things to _meta to shortcut resource lookup and speed things up.	2026-03-14 22:24:13 -07:00
Colin Young	d692b74007	Add auth 401 observability to client bug reports (#14611 ) CXC-392 [With 401](https://openai.sentry.io/issues/7333870443/?project=4510195390611458&query=019ce8f8-560c-7f10-a00a-c59553740674&referrer=issue-stream) <img width="1909" height="555" alt="401 auth tags in Sentry" src="https://github.com/user-attachments/assets/412ea950-61c4-4780-9697-15c270971ee3" /> - auth_401_: preserved facts from the latest unauthorized response snapshot - auth_: latest auth-related facts from the latest request attempt - auth_recovery_: unauthorized recovery state and follow-up result Without 401 <img width="1917" height="522" alt="happy-path auth tags in Sentry" src="https://github.com/user-attachments/assets/3381ed28-8022-43b0-b6c0-623a630e679f" /> ###### Summary - Add client-visible 401 diagnostics for auth attachment, upstream auth classification, and 401 request id / cf-ray correlation. - Record unauthorized recovery mode, phase, outcome, and retry/follow-up status without changing auth behavior. - Surface the highest-signal auth and recovery fields on uploaded client bug reports so they are usable in Sentry. - Preserve original unauthorized evidence under `auth_401_` while keeping follow-up result tags separate. ###### Rationale (from spec findings) - The dominant bucket needed proof of whether the client attached auth before send or upstream still classified the request as missing auth. - Client uploads needed to show whether unauthorized recovery ran and what the client tried next. - Request id and cf-ray needed to be preserved on the unauthorized response so server-side correlation is immediate. - The bug-report path needed the same auth evidence as the request telemetry path, otherwise the observability would not be operationally useful. ###### Scope - Add auth 401 and unauthorized-recovery observability in `codex-rs/core`, `codex-rs/codex-api`, and `codex-rs/otel`, including feedback-tag surfacing. 
- Keep auth semantics, refresh behavior, retry behavior, endpoint classification, and geo-denial follow-up work out of this PR. ###### Trade-offs - This exports only safe auth evidence: header presence/name, upstream auth classification, request ids, and recovery state. It does not export token values or raw upstream bodies. - This keeps websocket connection reuse as a transport clue because it can help distinguish stale reused sessions from fresh reconnects. - Misroute/base-url classification and geo-denial are intentionally deferred to a separate follow-up PR so this review stays focused on the dominant auth 401 bucket. ###### Client follow-up - PR 2 will add misroute/provider and geo-denial observability plus the matching feedback-tag surfacing. - A separate host/app-server PR should log auth-decision inputs so pre-send host auth state can be correlated with client request evidence. - `device_id` remains intentionally separate until there is a safe existing source on the feedback upload path. ###### Testing - `cargo test -p codex-core refresh_available_models_sorts_by_priority` - `cargo test -p codex-core emit_feedback_request_tags_` - `cargo test -p codex-core emit_feedback_auth_recovery_tags_` - `cargo test -p codex-core auth_request_telemetry_context_tracks_attached_auth_and_retry_phase` - `cargo test -p codex-core extract_response_debug_context_decodes_identity_headers` - `cargo test -p codex-core identity_auth_details` - `cargo test -p codex-core telemetry_error_messages_preserve_non_http_details` - `cargo test -p codex-core --all-features --no-run` - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability`	2026-03-14 15:38:51 -07:00
viyatb-oai	9060dc7557	fix: fix symlinked writable roots in sandbox policies (#14674 ) ## Summary - normalize effective readable, writable, and unreadable sandbox roots after resolving special paths so symlinked roots use canonical runtime paths - add a protocol regression test for a symlinked writable root with a denied child and update protocol expectations to canonicalized effective paths - update macOS seatbelt tests to assert against effective normalized roots produced by the shared policy helpers ## Testing - just fmt - cargo test -p codex-protocol - cargo test -p codex-core explicit_unreadable_paths_are_excluded_ - cargo clippy -p codex-protocol -p codex-core --tests -- -D warnings ## Notes - This is intended to fix the symlinked TMPDIR bind failure in bubblewrap described in #14672. Fixes #14672	2026-03-14 13:24:43 -07:00
Channing Conger	70eddad6b0	dynamic tool calls: add param `exposeToContext` to optionally hide tool (#14501 ) This extends dynamic_tool_calls to allow us to hide a tool from the model context but still use it as part of the general tool calling runtime (for example, from js_repl/code_mode).	2026-03-14 01:58:43 -07:00
sayan-oai	e389091042	make defaultPrompt an array, keep backcompat (#14649 ) make plugins' `defaultPrompt` an array, but keep backcompat for strings. the array is limited by app-server to 3 entries of up to 128 chars (drops extra entries, `None`s-out ones that are too long) without erroring if those invariants are violated. added tests, tested locally.	2026-03-14 06:13:51 +00:00
Eric Traut	ae0a6510e1	Enforce errors on overriding built-in model providers (#12024 ) We receive bug reports from users who attempt to override one of the three built-in model providers (openai, ollama, or lmstudio). Currently, these overrides are silently ignored. This PR makes it an error to override them. ## Summary - add validation for `model_providers` so `openai`, `ollama`, and `lmstudio` keys now produce clear configuration errors instead of being silently ignored	2026-03-13 22:10:13 -06:00
sayan-oai	d272f45058	move plugin/skill instructions into dev msg and reorder (#14609 ) Move the general `Apps`, `Skills` and `Plugins` instructions blocks out of `user_instructions` and into the developer message, with new `Apps -> Skills -> Plugins` order for better clarity. Also wrap those sections in stable XML-style instruction tags (like other sections) and update prompt-layout tests/snapshots. This makes the tests less brittle in snapshot output (we can parse the sections), and it consolidates the capability instructions in one place. #### Tests Updated snapshots, added tests. `<AGENTS_MD>` disappearing in snapshots is expected: before this change, the wrapped user-instructions message was kept alive by `Skills` content. Now that `Skills` and `Plugins` are in the developer message, that wrapper only appears when there is real project-doc/user-instructions content. --------- Co-authored-by: Charley Cunningham <ccunningham@openai.com>	2026-03-13 20:51:01 -07:00
viyatb-oai	7f571396c8	fix: sync split sandbox policies for spawned subagents (#14650 ) ## Summary - reapply the live split filesystem and network sandbox policies when building spawned subagent configs - keep spawned child sessions aligned with the parent turn after role-layer config reloads - add regression coverage for both config construction and spawned child-turn inheritance	2026-03-14 03:03:49 +00:00
viyatb-oai	6dc04df5e6	fix: persist future network host approvals across sessions (#14619 ) ## Summary - apply persisted execpolicy network rules when booting the managed network proxy - pass the current execpolicy into managed proxy startup so host approvals selected with "allow this host in the future" survive new sessions	2026-03-14 02:46:10 +00:00
Charley Cunningham	bbd329a812	Fix turn context reconstruction after backtracking (#14616 ) ## Summary - reuse rollout reconstruction when applying a backtrack rollback so `reference_context_item` is restored from persisted rollout state - build rollback replay from the flushed rollout items plus the rollback marker, avoiding the extra reread/fallback path - add regression coverage for rollback after compaction so turn-context diffing stays aligned after backtracking Co-authored-by: Codex <noreply@openai.com>	2026-03-13 19:28:31 -07:00
Ahmed Ibrahim	69c8a1ef9e	Fix Windows CI assertions for guardian and Smart Approvals (#14645 ) - Normalize guardian assessment path serialization to use forward slashes for cross-platform stability. - Seed workspace-write defaults in the Smart Approvals override-turn-context test so Windows and non-Windows selection flows are consistent. --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-14 02:15:58 +00:00
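
The forward-slash normalization mentioned in commit 69c8a1ef9e above can be illustrated with a minimal Rust sketch. The function name `to_forward_slashes` is hypothetical and is not taken from the Codex source; it only shows the general idea of serializing a path with a single separator so that the same assertion holds on Windows and non-Windows hosts.

```rust
/// Hypothetical helper (not from the Codex codebase): render a path string
/// with forward slashes so serialized output is identical across platforms.
///
/// Caveat: on Unix a backslash is a legal file-name character, so this
/// rewrite is lossy there. It is meant for display and test snapshots,
/// not for paths that are later opened on disk.
pub fn to_forward_slashes(path: &str) -> String {
    // `str::replace` accepts a `char` pattern and returns a new `String`.
    path.replace('\\', "/")
}
```

Calling `to_forward_slashes(r"C:\work\repo\src")` yields `C:/work/repo/src`, while a path that already uses forward slashes is returned unchanged.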


2302 Commits
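
The `defaultPrompt` handling described in commit e389091042 above (a string or an array, capped at 3 entries of up to 128 chars, extra entries dropped, over-long entries turned into `None`, no error raised) can be sketched roughly as follows. This is an illustration under stated assumptions, not the app-server implementation: the type `RawDefaultPrompt`, the function `normalize_default_prompt`, and the constant names are hypothetical; the sketch also assumes that truncation to 3 entries happens before the length check and that "chars" means Unicode scalar values.

```rust
// Limits quoted in the commit message; the constant names are hypothetical.
const MAX_PROMPTS: usize = 3;
const MAX_PROMPT_CHARS: usize = 128;

/// Hypothetical raw shape: the legacy single string or the new array form.
pub enum RawDefaultPrompt {
    Single(String),
    Many(Vec<String>),
}

/// Normalize without erroring: keep at most `MAX_PROMPTS` entries and
/// replace any entry longer than `MAX_PROMPT_CHARS` with `None`.
pub fn normalize_default_prompt(raw: RawDefaultPrompt) -> Vec<Option<String>> {
    // Back-compat: a bare string is treated as a one-element array.
    let entries = match raw {
        RawDefaultPrompt::Single(s) => vec![s],
        RawDefaultPrompt::Many(v) => v,
    };
    entries
        .into_iter()
        .take(MAX_PROMPTS) // assumption: extras are dropped before length checks
        .map(|p| {
            // assumption: length is counted in Unicode scalar values, not bytes
            if p.chars().count() <= MAX_PROMPT_CHARS {
                Some(p)
            } else {
                None
            }
        })
        .collect()
}
```

Under these assumptions, an input of four entries whose second entry is 129 characters long produces three results, with `None` in the second position and the fourth entry dropped.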