codex

mirror of https://github.com/openai/codex.git synced 2026-05-02 18:37:01 +00:00

Author	SHA1	Message	Date
nicholasclark-openai	bab2a28f26	Revert "Forward session and turn headers to MCP HTTP requests (#15011 )" This reverts commit `b14689df3b`.	2026-03-18 21:40:32 -07:00
nicholasclark-openai	b14689df3b	Forward session and turn headers to MCP HTTP requests (#15011 ) ## Summary - forward request-scoped task headers through MCP tool metadata lookups and tool calls - apply those headers to streamable HTTP initialize, tools/list, and tools/call requests - update affected rmcp/core tests for the new request_headers plumbing ## Testing - cargo test -p codex-rmcp-client - cargo test -p codex-core (fails on pre-existing unrelated error in core/src/auth_env_telemetry.rs: missing websocket_connect_timeout_ms in ModelProviderInfo initializer) - just fix -p codex-rmcp-client - just fix -p codex-core (hits the same unrelated auth_env_telemetry.rs error) - just fmt --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-18 21:29:37 -07:00
Owen Lin	20f2a216df	feat(core, tracing): create turn spans over websockets (#14632 ) ## Description Dependent on: - [responsesapi] https://github.com/openai/openai/pull/760991 - [codex-backend] https://github.com/openai/openai/pull/760985 `codex app-server -> codex-backend -> responsesapi` now reuses a persistent websocket connection across many turns. This PR updates tracing when using websockets so that each `response.create` websocket request propagates the current tracing context, so we can get a holistic end-to-end trace for each turn. Tracing is propagated via special keys (`ws_request_header_traceparent`, `ws_request_header_tracestate`) set in the `client_metadata` param in Responses API. Currently tracing on websockets is a bit broken because we only set tracing context on ws connection time, so it's detached from a `turn/start` request.	2026-03-19 03:41:06 +00:00
alexsong-oai	825d09373d	Support featured plugins (#15042 )	2026-03-18 17:45:30 -07:00
xl-openai	dcd5e08269	fix: harden plugin feature gating (#15104 ) Resubmit https://github.com/openai/codex/pull/15020 with correct content. 1. Use requirement-resolved config.features as the plugin gate. 2. Guard plugin/list, plugin/read, and related flows behind that gate. 3. Skip bad marketplace.json files instead of failing the whole list. 4. Simplify plugin state and caching.	2026-03-19 00:03:37 +00:00
pakrym-oai	56d0c6bf67	Add apply_patch code mode result (#15100 ) It's empty!	2026-03-18 16:11:10 -07:00
pakrym-oai	3590e181fa	Add update_plan code mode result (#15103 ) It's empty!	2026-03-18 16:10:51 -07:00
Charley Cunningham	ebbbc52ce4	Align SQLite feedback logs with feedback formatter (#13494 ) ## Summary - store a pre-rendered `feedback_log_body` in SQLite so `/feedback` exports keep span prefixes and structured event fields - render SQLite feedback exports with timestamps and level prefixes to match the old in-memory feedback formatter, while preserving existing trailing newlines - count `feedback_log_body` in the SQLite retention budget so structured or span-prefixed rows still prune correctly - bound `/feedback` row loading in SQL with the retention estimate, then apply exact whole-line truncation in Rust so uploads stay capped without splitting lines ## Details - add a `feedback_log_body` column to `logs` and backfill it from `message` for existing rows - capture span names plus formatted span and event fields at write time, since SQLite does not retain enough structure to reconstruct the old formatter later - keep SQLite feedback queries scoped to the requested thread plus same-process threadless rows - restore a SQL-side cumulative `estimated_bytes` cap for feedback export queries so over-retained partitions do not load every matching row before truncation - add focused formatting coverage for exported feedback lines and parity coverage against `tracing_subscriber` ## Testing - cargo test -p codex-state - just fix -p codex-state - just fmt codex author: `codex resume 019ca1b0-0ecc-78b1-85eb-6befdd7e4f1f` --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-18 22:44:31 +00:00
Ahmed Ibrahim	7b37a0350f	Add final message prefix to realtime handoff output (#15077 ) - prefix realtime handoff output with the agent final message label for both realtime v1 and v2 - update realtime websocket and core expectations to match	2026-03-18 15:19:49 -07:00
xl-openai	86982ca1f9	Revert "fix: harden plugin feature gating" (#15102 ) Reverts openai/codex#15020 I messed up the commit in my PR and accidentally merged changes that were still under review.	2026-03-18 15:19:29 -07:00
pakrym-oai	5cada46ddf	Return image URL from view_image tool (#15072 ) Clean up image semantics in code mode. `view_image` now returns `{image_url:string, details?: string}`. `image()` now accepts either a string parameter or `{image_url:string, details?: string}`.	2026-03-18 13:58:20 -07:00
pakrym-oai	88e5382fc4	Propagate tool errors to code mode (#15075 ) Clean up error flow to push the FunctionCallError all the way up to dispatcher and allow code mode to surface as exception.	2026-03-18 13:57:55 -07:00
Felipe Coury	334164a6f7	feat(tui): restore composer history in app-server tui (#14945 ) ## Problem The app-server TUI (`tui_app_server`) lacked composer history support. Pressing Up/Down to recall previous prompts hit a stub that logged a warning and displayed "Not available in app-server TUI yet." New submissions were silently dropped from the shared history file, so nothing persisted for future sessions. ## Mental model Codex maintains a single, append-only history file (`$CODEX_HOME/history.jsonl`) shared across all TUI processes on the same machine. The legacy (in-process) TUI already reads/writes this file through `codex_core::message_history`. The app-server TUI delegates most operations to a separate process over RPC, but history is intentionally not an RPC concern — it's a client-local file. This PR makes the app-server TUI access the same history file directly, bypassing the app-server process entirely. The composer's Up/Down navigation and submit-time persistence now follow the same code paths as the legacy TUI, with the only difference being where the call is dispatched (locally in `App`, rather than inside `CodexThread`). The branch is rebuilt directly on top of `upstream/main`, so it keeps the existing app-server restore architecture intact. `AppServerStartedThread` still restores transcript history from the server `Thread` snapshot via `thread_snapshot_events`; this PR only adds composer-history support. ## Non-goals - Adding history support to the app-server protocol. History remains client-local. - Changing the on-disk format or location of `history.jsonl`. - Surfacing history I/O errors to the user (failures are logged and silently swallowed, matching the legacy TUI). ## Tradeoffs \| Decision \| Why \| Risk \| \|----------\|-----\|------\| \| Widen `message_history` from `pub(crate)` to `pub` \| Avoids duplicating file I/O logic; the module already has a clean, minimal API surface. 
\| Other workspace crates can now call these functions — the contract is no longer crate-private. However, this is consistent with recent precedent: `590cfa617` exposed `mention_syntax` for TUI consumption, `752402c4f` exposed plugin APIs (`PluginsManager`), and `14fcb6645`/`edacbf7b6` widened internal core APIs for other crates. These were all narrow, intentional exposures of specific APIs — not broad "make internals public" moves. `1af2a37ad` even went the other direction, reducing broad re-exports to tighten boundaries. This change follows the same pattern: a small, deliberate API surface (3 functions) rather than a wholesale visibility change. \| \| Intercept `AddToHistory` / `GetHistoryEntryRequest` in `App` before RPC fallback \| Keeps history ops out of the "unsupported op" error path without changing app-server protocol. \| This now routes through a single `submit_thread_op` entry point, which is safer than the original duplicated dispatch. The remaining risk is organizational: future thread-op submission paths need to keep using that shared entry point. \| \| `session_configured_from_thread_response` is now `async` \| Needs `await` on `history_metadata()` to populate real `history_log_id` / `history_entry_count`. \| Adds an async file-stat + full-file newline scan to the session bootstrap path. The scan is bounded by `history.max_bytes` and matches the legacy TUI's cost profile, but startup latency still scales with file size. 
\| ## Architecture ``` User presses Up User submits a prompt │ │ ▼ ▼ ChatComposerHistory ChatWidget::do_submit_turn navigate_up() encode_history_mentions() │ │ ▼ ▼ AppEvent::CodexOp Op::AddToHistory { text } (GetHistoryEntryRequest) │ │ ▼ ▼ App::try_handle_local_history_op App::try_handle_local_history_op message_history::append_entry() spawn_blocking { │ message_history::lookup() ▼ } $CODEX_HOME/history.jsonl │ ▼ AppEvent::ThreadEvent (GetHistoryEntryResponse) │ ▼ ChatComposerHistory::on_entry_response() ``` ## Observability - `tracing::warn` on `append_entry` failure (includes thread ID). - `tracing::warn` on `spawn_blocking` lookup join error. - `tracing::warn` from `message_history` internals on file-open, lock, or parse failures. ## Tests - `chat_composer_history::tests::navigation_with_async_fetch` — verifies that Up emits `Op::GetHistoryEntryRequest` (was: checked for stub error cell). - `app::tests::history_lookup_response_is_routed_to_requesting_thread` — verifies multi-thread composer recall routes the lookup result back to the originating thread. - `app_server_session::tests::resume_response_relies_on_snapshot_replay_not_initial_messages` — verifies app-server session restore still uses the upstream thread-snapshot path. - `app_server_session::tests::session_configured_populates_history_metadata` — verifies bootstrap sets nonzero `history_log_id` / `history_entry_count` from the shared local history file.	2026-03-18 11:54:11 -06:00
xl-openai	580f32ad2a	fix: harden plugin feature gating (#15020 ) 1. Use requirement-resolved config.features as the plugin gate. 2. Guard plugin/list, plugin/read, and related flows behind that gate. 3. Skip bad marketplace.json files instead of failing the whole list. 4. Simplify plugin state and caching.	2026-03-18 10:11:43 -07:00
pakrym-oai	606d85055f	Add notify to code-mode (#14842 ) Allows model to send an out-of-band notification. The notification is injected as another tool call output for the same call_id.	2026-03-18 09:37:13 -07:00
jif-oai	7ae99576a6	chore: disable memory read path for morpheus (#15059 ) Because we don't want prompt collisions	2026-03-18 15:42:56 +00:00
jif-oai	58ac2a8773	nit: disable live memory editing (#15058 )	2026-03-18 14:49:57 +00:00
jif-oai	a265d6043e	feat: add memory citation to agent message (#14821 ) Client side to come	2026-03-18 10:03:38 +00:00
jif-oai	0f9484dc8a	feat: adapt artifacts to new packaging and 2.5.6 (#14947 )	2026-03-18 09:17:44 +00:00
Matthew Zeng	40a7d1d15b	[plugins] Support configuration tool suggest allowlist. (#15022 ) - [x] Support configuration tool suggest allowlist. Supports both plugins and connectors.	2026-03-17 23:58:27 -07:00
Dylan Hurd	84f4e7b39d	fix(subagents) share execpolicy by default (#13702 ) ## Summary If a subagent requests approval, and the user persists that approval to the execpolicy, it should (by default) propagate. We'll need to rethink this a bit in light of coming Permissions changes, though I think this is closer to the end state that we'd want, which is that execpolicy changes to one permissions profile should be synced across threads. ## Testing - [x] Added integration test --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-18 06:42:26 +00:00
Andrei Eternal	6fef421654	[hooks] userpromptsubmit - hook before user's prompt is executed (#14626 ) - this allows blocking the user's prompts from executing, and also prevents them from entering history - handles the edge case where you can both prevent the user's prompt AND add n amount of additionalContexts - refactors some old code into common.rs where hooks overlap functionality - refactors additionalContext being previously added to user messages, instead we use developer messages for them - handles queued messages correctly Sample hook for testing - if you write "[block-user-submit]" this hook will stop the thread: example run ``` › sup • Running UserPromptSubmit hook: reading the observatory notes UserPromptSubmit hook (completed) warning: wizard-tower UserPromptSubmit demo inspected: sup hook context: Wizard Tower UserPromptSubmit demo fired. For this reply only, include the exact phrase 'observatory lanterns lit' exactly once near the end. • Just riding the cosmic wave and ready to help, my friend. What are we building today? observatory lanterns lit › and [block-user-submit] • Running UserPromptSubmit hook: reading the observatory notes UserPromptSubmit hook (stopped) warning: wizard-tower UserPromptSubmit demo blocked the prompt on purpose. stop: Wizard Tower demo block: remove [block-user-submit] to continue. 
``` .codex/config.toml ``` [features] codex_hooks = true ``` .codex/hooks.json ``` { "hooks": { "UserPromptSubmit": [ { "hooks": [ { "type": "command", "command": "/usr/bin/python3 .codex/hooks/user_prompt_submit_demo.py", "timeoutSec": 10, "statusMessage": "reading the observatory notes" } ] } ] } } ``` .codex/hooks/user_prompt_submit_demo.py ``` #!/usr/bin/env python3 import json import sys from pathlib import Path def prompt_from_payload(payload: dict) -> str: prompt = payload.get("prompt") if isinstance(prompt, str) and prompt.strip(): return prompt.strip() event = payload.get("event") if isinstance(event, dict): user_prompt = event.get("user_prompt") if isinstance(user_prompt, str): return user_prompt.strip() return "" def main() -> int: payload = json.load(sys.stdin) prompt = prompt_from_payload(payload) cwd = Path(payload.get("cwd", ".")).name or "wizard-tower" if "[block-user-submit]" in prompt: print( json.dumps( { "systemMessage": ( f"{cwd} UserPromptSubmit demo blocked the prompt on purpose." ), "decision": "block", "reason": ( "Wizard Tower demo block: remove [block-user-submit] to continue." ), } ) ) return 0 prompt_preview = prompt or "(empty prompt)" if len(prompt_preview) > 80: prompt_preview = f"{prompt_preview[:77]}..." print( json.dumps( { "systemMessage": ( f"{cwd} UserPromptSubmit demo inspected: {prompt_preview}" ), "hookSpecificOutput": { "hookEventName": "UserPromptSubmit", "additionalContext": ( "Wizard Tower UserPromptSubmit demo fired. " "For this reply only, include the exact phrase " "'observatory lanterns lit' exactly once near the end." ), }, } ) ) return 0 if __name__ == "__main__": raise SystemExit(main()) ```	2026-03-17 22:09:22 -07:00
Charley Cunningham	226241f035	Use workspace requirements for guardian prompt override (#14727 ) ## Summary - move `guardian_developer_instructions` from managed config into workspace-managed `requirements.toml` - have guardian continue using the override when present and otherwise fall back to the bundled local guardian prompt - keep the generalized prompt-quality improvements in the shared guardian default prompt - update requirements parsing, layering, schema, and tests for the new source of truth ## Context This replaces the earlier managed-config / MDM rollout plan. The intended rollout path is workspace-managed requirements, including cloud enterprise policies, rather than backend model metadata, Statsig, or Jamf-managed config. That keeps the default/fallback behavior local to `codex-rs` while allowing faster policy updates through the enterprise requirements plane. This is intentionally an admin-managed policy input, not a user preference: the guardian prompt should come either from the bundled `codex-rs` default or from enterprise-managed `requirements.toml`, and normal user/project/session config should not override it. ## Updating The OpenAI Prompt After this lands, the OpenAI-specific guardian prompt should be updated through the workspace Policies UI at `/codex/settings/policies` rather than through Jamf or codex-backend model metadata. Operationally: - open the workspace Policies editor as a Codex admin - edit the default `requirements.toml` policy, or a higher-precedence group-scoped override if we ever want different behavior for a subset of users - set `guardian_developer_instructions = """..."""` to the full OpenAI-specific guardian prompt text - save the policy; codex-backend stores the raw TOML and `codex-rs` fetches the effective requirements file from `/wham/config/requirements` When updating the OpenAI-specific prompt, keep it aligned with the shared default guardian policy in `codex-rs` except for intentional OpenAI-only additions. 
## Testing - `cargo check --tests -p codex-core -p codex-config -p codex-cloud-requirements --message-format short` - `cargo run -p codex-core --bin codex-write-config-schema` - `cargo fmt` - `git diff --check` Co-authored-by: Codex <noreply@openai.com>	2026-03-17 22:05:41 -07:00
Ahmed Ibrahim	3ce879c646	Handle realtime conversation end in the TUI (#14903 ) - close live realtime sessions on errors, ctrl-c, and active meter removal - centralize TUI realtime cleanup and avoid duplicate follow-up close info --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>	2026-03-17 21:04:58 -07:00
pakrym-oai	770616414a	Prefer websockets when providers support them (#13592 ) Remove all flags and model settings. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 19:46:44 -07:00
viyatb-oai	d950543e65	feat: support restricted ReadOnlyAccess in elevated Windows sandbox (#14610 ) ## Summary - support legacy `ReadOnlyAccess::Restricted` on Windows in the elevated setup/runner backend - keep the unelevated restricted-token backend on the legacy full-read model only, and fail closed for restricted read-only policies there - keep the legacy full-read Windows path unchanged while deriving narrower read roots only for elevated restricted-read policies - honor `include_platform_defaults` by adding backend-managed Windows system roots only when requested, while always keeping helper roots and the command `cwd` readable - preserve `workspace-write` semantics by keeping writable roots readable when restricted read access is in use in the elevated backend - document the current Windows boundary: legacy `SandboxPolicy` is supported on both backends, while richer split-only carveouts still fail closed instead of running with weaker enforcement ## Testing - `cargo test -p codex-windows-sandbox` - `cargo check -p codex-windows-sandbox --tests --target x86_64-pc-windows-msvc` - `cargo clippy -p codex-windows-sandbox --tests --target x86_64-pc-windows-msvc -- -D warnings` - `cargo test -p codex-core windows_restricted_token_` ## Notes - local `cargo test -p codex-windows-sandbox` on macOS only exercises the non-Windows stubs; the Windows-targeted compile and clippy runs provide the local signal, and GitHub Windows CI exercises the runtime path	2026-03-17 19:08:50 -07:00
viyatb-oai	6fe8a05dcb	fix: honor active permission profiles in sandbox debug (#14293 ) ## Summary - stop `codex sandbox` from forcing legacy `sandbox_mode` when active `[permissions]` profiles are configured - keep the legacy `read-only` / `workspace-write` fallback for legacy configs and reject `--full-auto` for profile-based configs - use split filesystem and network policies in the macOS/Linux debug sandbox helpers and add regressions for the config-loading behavior assuming "codex/docs/private/secret.txt" = "none" ``` codex -c 'default_permissions="limited-read-test"' sandbox macos -- <command> ... codex sandbox macos -- cat codex/docs/private/secret.txt >/dev/null; echo EXIT:$? cat: codex/docs/private/secret.txt: Operation not permitted EXIT:1 ``` --------- Co-authored-by: celia-oai <celia@openai.com>	2026-03-18 01:52:02 +00:00
pakrym-oai	83a60fdb94	Add FS abstraction and use in view_image (#14960 ) Adds an environment crate and an environment + file system abstraction. An environment is a combination of attributes and services specific to the environment the agent is connected to: file system, process management, OS, default shell. The goal is to move most of the agent logic that assumes a particular environment to work through the environment abstraction.	2026-03-17 17:36:23 -07:00
xl-openai	a5d3114e97	feat: Add product-aware plugin policies and clean up manifest naming (#14993 ) - Add shared Product support to marketplace plugin policy and skill policy (not enforced yet). - Move marketplace installation/authentication under policy and model it as MarketplacePluginPolicy. - Rename plugin/marketplace local manifest types to separate raw serde shapes from resolved in-memory models.	2026-03-17 17:01:34 -07:00
viyatb-oai	0d1539e74c	fix(linux-sandbox): prefer system /usr/bin/bwrap when available (#14963 ) ## Problem Ubuntu/AppArmor hosts started failing in the default Linux sandbox path after the switch to vendored/default bubblewrap in `0.115.0`. The clearest report is in [#14919](https://github.com/openai/codex/issues/14919), especially [this investigation comment](https://github.com/openai/codex/issues/14919#issuecomment-4076504751): on affected Ubuntu systems, `/usr/bin/bwrap` works, but a copied or vendored `bwrap` binary fails with errors like `bwrap: setting up uid map: Permission denied` or `bwrap: loopback: Failed RTM_NEWADDR: Operation not permitted`. The root cause is Ubuntu's `/etc/apparmor.d/bwrap-userns-restrict` profile, which grants `userns` access specifically to `/usr/bin/bwrap`. Once Codex started using a vendored/internal bubblewrap path, that path was no longer covered by the distro AppArmor exception, so sandbox namespace setup could fail even when user namespaces were otherwise enabled and `uidmap` was installed. ## What this PR changes - prefer system `/usr/bin/bwrap` whenever it is available - keep vendored bubblewrap as the fallback when `/usr/bin/bwrap` is missing - when `/usr/bin/bwrap` is missing, surface a Codex startup warning through the app-server/TUI warning path instead of printing directly from the sandbox helper with `eprintln!` - use the same launcher decision for both the main sandbox execution path and the `/proc` preflight path - document the updated Linux bubblewrap behavior in the Linux sandbox and core READMEs ## Why this fix This still fixes the Ubuntu/AppArmor regression from [#14919](https://github.com/openai/codex/issues/14919), but it keeps the runtime rule simple and platform-agnostic: if the standard system bubblewrap is installed, use it; otherwise fall back to the vendored helper. The warning now follows that same simple rule. 
If Codex cannot find `/usr/bin/bwrap`, it tells the user that it is falling back to the vendored helper, and it does so through the existing startup warning plumbing that reaches the TUI and app-server instead of low-level sandbox stderr. ## Testing - `cargo test -p codex-linux-sandbox` - `cargo test -p codex-app-server --lib` - `cargo test -p codex-tui-app-server tests::embedded_app_server_start_failure_is_returned` - `cargo clippy -p codex-linux-sandbox --all-targets` - `cargo clippy -p codex-app-server --all-targets` - `cargo clippy -p codex-tui-app-server --all-targets`	2026-03-17 23:05:34 +00:00
Ahmed Ibrahim	98be562fd3	Unify realtime shutdown in core (#14902 ) - route realtime startup, input, and transport failures through a single shutdown path - emit one realtime error/closed lifecycle while clearing session state once --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>	2026-03-17 15:58:52 -07:00
Ahmed Ibrahim	c6ab4ee537	Gate realtime audio interruption logic to v2 (#14984 ) - thread the realtime version into conversation start and app-server notifications - keep playback-aware mic gating and playback interruption behavior on v2 only, leaving v1 on the legacy path	2026-03-17 15:24:37 -07:00
xl-openai	1a9555eda9	Cleanup skills/remote/xxx endpoints. (#14977 ) Remove the skills/remote/xxx endpoints, as they are not in use for now.	2026-03-17 15:22:36 -07:00
Colin Young	0d2ff40a58	Add auth env observability (#14905 ) CXC-410 Emit Env Var Status with `/feedback` report Add more observability on top of #14611 [Unset](https://openai.sentry.io/issues/7340419168/?project=4510195390611458&query=019cfa8d-c1ba-7002-96fa-e35fc340551d&referrer=issue-stream) [Set](https://openai.sentry.io/issues/7340426331/?project=4510195390611458&query=019cfa91-aba1-7823-ab7e-762edfbc0ed4&referrer=issue-stream) ###### Summary - Adds auth-env telemetry that records whether key auth-related env overrides were present on session start and request paths. - Threads those auth-env fields through `/responses`, websocket, and `/models` telemetry and feedback metadata. - Buckets custom provider `env_key` configuration to a safe `"configured"` value instead of emitting raw config text. - Keeps the slice observability-only: no raw token values or raw URLs are emitted. ###### Rationale (from spec findings) - 401 and auth-path debugging needs a way to distinguish env-driven auth paths from sessions with no auth env override. - Startup and model-refresh failures need the same auth-env diagnostics as normal request failures. - Feedback and Sentry tags need the same auth-env signal as OTel events so reports can be triaged consistently. - Custom provider config is user-controlled text, so the telemetry contract must stay presence-only / bucketed. ###### Scope - Adds a small `AuthEnvTelemetry` bundle for env presence collection and threads it through the main request/session telemetry paths. - Does not add endpoint/base-url/provider-header/geo routing attribution or broader telemetry API redesign. ###### Trade-offs - `provider_env_key_name` is bucketed to `"configured"` instead of preserving the literal configured env var name. 
- `/models` is included because startup/model-refresh auth failures need the same diagnostics, but broader parity work remains out of scope. - This slice keeps the existing telemetry APIs and layers auth-env fields onto them rather than redesigning the metadata model. ###### Client follow-up - Add the separate endpoint/base-url attribution slice if routing-source diagnosis is still needed. - Add provider-header or residency attribution only if auth-env presence proves insufficient in real reports. - Revisit whether any additional auth-related env inputs need safe bucketing after more 401 triage data. ###### Testing - `cargo test -p codex-core emit_feedback_request_tags -- --nocapture` - `cargo test -p codex-core collect_auth_env_telemetry_buckets_provider_env_key_name -- --nocapture` - `cargo test -p codex-core models_request_telemetry_emits_auth_env_feedback_tags_on_failure -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability -- --nocapture` - `cargo test -p codex-core --no-run --message-format short` - `cargo test -p codex-otel --no-run --message-format short` --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 14:26:27 -07:00
pakrym-oai	ee756eb80f	Rename exec_wait tool to wait (#14983 ) Summary - document that code mode only exposes `exec` and the renamed `wait` tool - update code mode tool spec and descriptions to match the new tool name - rename tests and helper references from `exec_wait` to `wait` Testing - Not run (not requested)	2026-03-17 14:22:26 -07:00
Ahmed Ibrahim	4d9d4b7b0f	Stabilize approval matrix write-file command (#14968 ) ## What is flaky The approval-matrix `WriteFile` scenario is flaky. It sometimes fails in CI even though the approval logic is unchanged, because the test delegates the file write and readback to shell parsing instead of deterministic file I/O. ## Why it was flaky The test generated a command shaped like `printf ... > file && cat file`. That means the scenario depended on shell quoting, redirection, newline handling, and encoding behavior in addition to the approval system it was actually trying to validate. If the shell interpreted the payload differently, the test would report an approval failure even though the product logic was fine. That also made failures hard to diagnose, because the test did not log the exact generated command or the parsed result payload. ## How this PR fixes it This PR replaces the shell-redirection path with a deterministic `python3 -c` script that writes the file with `Path.write_text(..., encoding='utf-8')` and then reads it back with the same UTF-8 path. It also logs the generated command and the resulting exit code/stdout for the approval scenario so any future failure is directly attributable. ## Why this fix fixes the flakiness The scenario no longer depends on shell parsing and redirection semantics. The file contents are produced and read through explicit UTF-8 file I/O, so the approval test is measuring approval behavior instead of shell behavior. The added diagnostics mean a future failure will show the exact command/result pair instead of looking like a generic intermittent mismatch. Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-17 13:52:36 -07:00
Ahmed Ibrahim	b02388672f	Stabilize Windows cmd-based shell test harnesses (#14958 ) ## What is flaky The Windows shell-driven integration tests in `codex-rs/core` were intermittently unstable, especially: - `apply_patch_cli_can_use_shell_command_output_as_patch_input` - `websocket_test_codex_shell_chain` - `websocket_v2_test_codex_shell_chain` ## Why it was flaky These tests were exercising real shell-tool flows through whichever shell Codex selected on Windows, and the `apply_patch` test also nested a PowerShell read inside `cmd /c`. There were multiple independent sources of nondeterminism in that setup: - The test harness depended on the model-selected Windows shell instead of pinning the shell it actually meant to exercise. - `cmd.exe /c powershell.exe -Command "..."` is quoting-sensitive; on CI that could leave the read command wrapped as a literal string instead of executing it. - Even after getting the quoting right, PowerShell could emit CLIXML progress records like module-initialization output onto stdout. - The `apply_patch` test was building a patch directly from shell stdout, so any quoting artifact or progress noise corrupted the patch input. So the failures were driven by shell startup and output-shape variance, not by the `apply_patch` or websocket logic themselves. ## How this PR fixes it - Add a test-only `user_shell_override` path so Windows integration tests can pin `cmd.exe` explicitly. - Use that override in the websocket shell-chain tests and in the `apply_patch` harness. - Change the nested Windows file read in `apply_patch_cli_can_use_shell_command_output_as_patch_input` to a UTF-8 PowerShell `-EncodedCommand` script. - Run that nested PowerShell process with `-NonInteractive`, set `$ProgressPreference = 'SilentlyContinue'`, and read the file with `[System.IO.File]::ReadAllText(...)`. 
## Why this fix fixes the flakiness The outer harness now runs under a deterministic shell, and the inner PowerShell read no longer depends on fragile `cmd` quoting or on progress output staying quiet by accident. The shell tool returns only the file contents, so patch construction and websocket assertions depend on stable test inputs instead of on runner-specific shell behavior. --------- Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-17 20:21:46 +00:00
Matthew Zeng	683c37ce75	[plugins] Support plugin installation elicitation. (#14896 ) It now supports: - Connectors that are from installed and enabled plugins that are not installed yet - Plugins that are on the allowlist that are not installed yet.	2026-03-17 13:19:28 -07:00
Ahmed Ibrahim	0d531c05f2	Fix code mode yield startup race (#14959 )	2026-03-17 11:09:12 -07:00
jif-oai	d484bb57d9	feat: add suffix to shell snapshot name (#14938 ) https://github.com/openai/codex/issues/14906	2026-03-17 17:59:27 +00:00
Shijie Rao	8e258eb3f5	Feat: CXA-1831 Persist latest model and reasoning effort in sqlite (#14859 ) ### Summary The goal is to use the latest turn's model and reasoning effort on thread/resume if no override is provided in the thread/resume call. This is part 1, which writes the model and reasoning effort for a thread to the sqlite db; a follow-up PR will consume the two new fields on thread/resume. [The part 2 PR is currently WIP](https://github.com/openai/codex/pull/14888), and this one can be merged independently.	2026-03-17 10:14:34 -07:00
Owen Lin	6ea041032b	fix(core): prevent hanging turn/start due to websocket warming issues (#14838 ) ## Description This PR fixes a bad first-turn failure mode in app-server when the startup websocket prewarm hangs. Before this change, `initialize -> thread/start -> turn/start` could sit behind the prewarm for up to five minutes, so the client would not see `turn/started`, and even `turn/interrupt` would block because the turn had not actually started yet. Now: - there is a (configurable) timeout of 15s for websocket startup time, exposed as `websocket_connect_timeout_ms` in config.toml - `turn/started` is sent immediately on `turn/start` even if the websocket is still connecting - `turn/interrupt` can be used to cancel a turn that is still waiting on the websocket warmup - the turn task will wait for the full 15s websocket warming timeout before falling back ## Why The old behavior made app-server feel stuck at exactly the moment the client expects turn lifecycle events to start flowing. That was especially painful for external clients, because from their point of view the server had accepted the request but then went silent for minutes. ## Configuring the websocket startup timeout It can be set in config.toml like this: ``` [model_providers.openai] supports_websockets = true websocket_connect_timeout_ms = 15000 ```	2026-03-17 10:07:46 -07:00
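The bounded-warmup behavior described in the commit above can be illustrated with a minimal sketch. This is not the codex implementation: `start_turn`, the `connect` callable, and the `"websocket"` / `"fallback"` return values are hypothetical names chosen for illustration, and interrupt handling is omitted.

```python
import threading
import time


def start_turn(connect, timeout_s, events):
    """Announce the turn immediately, then wait a bounded time for warmup.

    `connect` is a hypothetical blocking callable standing in for the
    websocket prewarm; `events` collects lifecycle notifications.
    """
    # The lifecycle event is emitted before the connection is ready,
    # so the client is never left waiting in silence.
    events.append("turn/started")

    ready = threading.Event()

    def warm():
        connect()
        ready.set()

    # A daemon thread keeps a hung warmup from blocking process exit.
    threading.Thread(target=warm, daemon=True).start()

    # Event.wait returns True if the flag was set, False on timeout.
    return "websocket" if ready.wait(timeout_s) else "fallback"
```

With a 15000 ms setting, the caller would pass `timeout_s=15.0`; a warmup that never completes then costs at most that long instead of minutes.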
jif-oai	e8add54e5d	feat: show effective model in spawn agent event (#14944 ) Show effective model after the full config layering for the sub agent	2026-03-17 16:58:58 +00:00
daveaitel-openai	ef36d39199	Fix agent jobs finalization race and reduce status polling churn (#14843 ) ## Summary - make `report_agent_job_result` atomically transition an item from running to completed while storing `result_json` - remove brittle finalization grace-sleep logic and make finished-item cleanup idempotent - replace blind fixed-interval waiting with status-subscription-based waiting for active worker threads - add state runtime tests for atomic completion and late-report rejection ## Why This addresses the race and polling concerns in #13948 by removing timing-based correctness assumptions and reducing unnecessary status polling churn. ## Validation - `cd codex-rs && just fmt` - `cd codex-rs && cargo test -p codex-state` - `cd codex-rs && cargo test -p codex-core --test all suite::agent_jobs` - `cd codex-rs && cargo test` - fails in an unrelated app-server tracing test: `message_processor::tracing_tests::thread_start_jsonrpc_span_exports_server_span_and_parents_children` timed out waiting for response ## Notes - This PR supersedes #14129 with the same agent-jobs fix on a clean branch from `main`. - The earlier PR branch was stacked on unrelated history, which made the review diff include unrelated commits. Fixes #13948	2026-03-17 10:40:14 -04:00
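As an illustration of the atomic running-to-completed transition described in the commit above, here is a minimal sketch assuming a SQL-backed store. The `items` table, its columns, and the helper names are hypothetical and not taken from the codex-state schema; the point is only that a conditional `UPDATE` makes the transition and the result write a single step, so a late or duplicate report is rejected rather than overwriting the stored result.

```python
import sqlite3


def open_store():
    """Create an in-memory store with a hypothetical job-item table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, result_json TEXT)"
    )
    return conn


def add_running_item(conn, item_id):
    conn.execute(
        "INSERT INTO items (id, status) VALUES (?, 'running')", (item_id,)
    )
    conn.commit()


def report_result(conn, item_id, result_json):
    """Return True only if this call moved the item from running to completed."""
    cur = conn.execute(
        "UPDATE items SET status = 'completed', result_json = ? "
        "WHERE id = ? AND status = 'running'",
        (result_json, item_id),
    )
    conn.commit()
    # rowcount is 0 when the item is unknown or no longer running.
    return cur.rowcount == 1
```

Because correctness depends on the `WHERE status = 'running'` guard rather than on timing, no grace sleep is needed before finalization in this sketch.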
jif-oai	4ed19b0766	feat: rename to get more explicit close agent (#14935 ) https://github.com/openai/codex/issues/14907	2026-03-17 14:37:20 +00:00
jif-oai	31648563c8	feat: centralize package manager version (#14920 )	2026-03-17 12:03:07 +00:00
viyatb-oai	db7e02c739	fix: canonicalize symlinked Linux sandbox cwd (#14849 ) ## Problem On Linux, Codex can be launched from a workspace path that is a symlink (for example, a symlinked checkout or a symlinked parent directory). Our sandbox policy intentionally canonicalizes writable/readable roots to the real filesystem path before building the bubblewrap mounts. That part is correct and needed for safety. The remaining bug was that bubblewrap could still inherit the helper process's logical cwd, which might be the symlinked alias instead of the mounted canonical path. In that case, the sandbox starts in a cwd that does not exist inside the sandbox namespace even though the real workspace is mounted. This can cause sandboxed commands to fail in symlinked workspaces. ## Fix This PR keeps the sandbox policy behavior the same, but separates two concepts that were previously conflated: - the canonical cwd used to define sandbox mounts and permissions - the caller's logical cwd used when launching the command On the Linux bubblewrap path, we now thread the logical command cwd through the helper explicitly and only add `--chdir <canonical path>` when the logical cwd differs from the mounted canonical path. That means: - permissions are still computed from canonical paths - bubblewrap starts the command from a cwd that definitely exists inside the sandbox - we do not widen filesystem access or undo the earlier symlink hardening ## Why This Is Safe This is a narrow Linux-only launch fix, not a policy change. - Writable/readable root canonicalization stays intact. - Protected metadata carveouts still operate on canonical roots. - We only override bubblewrap's inherited cwd when the logical path would otherwise point at a symlink alias that is not mounted in the sandbox. 
## Tests - kept the existing protocol/core regression coverage for symlink canonicalization - added regression coverage for symlinked cwd handling in the Linux bubblewrap builder/helper path Local validation: - `just fmt` - `cargo test -p codex-protocol` - `cargo test -p codex-core normalize_additional_permissions_canonicalizes_symlinked_write_paths` - `cargo clippy -p codex-linux-sandbox -p codex-protocol -p codex-core --tests -- -D warnings` - `cargo build --bin codex` ## Context This is related to #14694. The earlier writable-root symlink fix addressed the mount/permission side; this PR fixes the remaining symlinked-cwd launch mismatch in the Linux sandbox path.	2026-03-16 22:39:18 -07:00
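The cwd rule from the commit above (add `--chdir <canonical path>` only when the logical cwd differs from the canonical one) can be sketched as follows. The function name `chdir_args` is hypothetical, and the real helper lives in the Rust Linux sandbox code; this only shows the comparison, assuming the canonical path is what gets mounted.

```python
import os


def chdir_args(logical_cwd):
    """Return extra bubblewrap arguments for a possibly symlinked cwd.

    Permissions are assumed to be computed elsewhere from the canonical
    path; this only decides where the command should start.
    """
    canonical = os.path.realpath(logical_cwd)
    if os.path.abspath(logical_cwd) != canonical:
        # The logical path is an alias that may not exist inside the
        # sandbox namespace, so start from the mounted canonical path.
        return ["--chdir", canonical]
    # The inherited cwd already matches the mounted path.
    return []
```

For a checkout at `/srv/real` reached through a symlink `/home/user/link`, this yields `["--chdir", "/srv/real"]`, while calling it with `/srv/real` directly yields no extra arguments.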
Ahmed Ibrahim	79f476e47d	[stack 3/4] Add current thread context to realtime startup (#14829 ) ## Stack Position 3/4. Top-of-stack sibling built on #14830. ## Base - #14830 ## Sibling - #14827 ## Scope - Extend the realtime startup context with a bounded summary of the latest thread turns for continuity. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 05:11:05 +00:00
Thibault Sottiaux	8e34caffcc	[codex] add Jason as a predefined subagent name (#14881 ) This change adds Jason to codex-core's built-in subagent nickname pool so spawned agents can pick it without any custom role configuration. The default list was simply missing that predefined name (a grave mistake).	2026-03-16 22:01:14 -07:00
xl-openai	e5a28ba0c2	fix: align marketplace display name with existing interface conventions (#14886 ) 1. camelCase for displayName; 2. move displayName under interface.	2026-03-16 21:52:19 -07:00

2328 Commits