codex

mirror of https://github.com/openai/codex.git synced 2026-05-27 06:25:48 +00:00

Author	SHA1	Message	Date
Felipe Coury	3b5996f988	fix(tui): promote windows terminal diff ansi16 to truecolor (#13016 ) ## Summary - Promote ANSI-16 to truecolor for diff rendering when running inside Windows Terminal - Respect explicit `FORCE_COLOR` override, skipping promotion when set - Extract a pure `diff_color_level_for_terminal` function for testability - Strip background tints from ANSI-16 diff output, rendering add/delete lines with foreground color only - Introduce `RichDiffColorLevel` to type-safely restrict background fills to truecolor and ansi256 ## Problem Windows Terminal fully supports 24-bit (truecolor) rendering but often does not provide the usual TERM metadata (`TERM`, `TERM_PROGRAM`, `COLORTERM`) in `cmd.exe`/PowerShell sessions. In those environments, `supports-color` can report only ANSI-16 support. The diff renderer therefore falls back to a 16-color palette, producing washed-out, hard-to-read diffs. The screenshots below demonstrate that both PowerShell and cmd.exe don't set any `TERM` environment variables. \| PowerShell \| cmd.exe \| \|---\|---\| \| <img width="2032" height="1162" alt="SCR-20260226-nfvy" src="https://github.com/user-attachments/assets/59e968cc-4add-4c7b-a415-07163297e86a" /> \| <img width="2032" height="1162" alt="SCR-20260226-nfyc" src="https://github.com/user-attachments/assets/d06b3e39-bf91-4ce3-9705-82bf9563a01b" /> \| ## Mental model `StdoutColorLevel` (from `supports-color`) is the _detected_ capability. `DiffColorLevel` is the _intended_ capability for diff rendering. A new intermediary — `diff_color_level_for_terminal` — maps one to the other and is the single place where terminal-specific overrides live. Windows Terminal is detected two independent ways: the `TerminalName` parsed by `terminal_info()` and the raw presence of `WT_SESSION`. When `WT_SESSION` is present and `FORCE_COLOR` is not set, we promote unconditionally to truecolor. 
When `WT_SESSION` is absent but `TerminalName::WindowsTerminal` is detected, we promote only the ANSI-16 level (not `Unknown`). A single override helper — `has_force_color_override()` — checks whether `FORCE_COLOR` is set. When it is, both the `WT_SESSION` fast-path and the `TerminalName`-based promotion are suppressed, preserving explicit user intent. \| PowerShell \| cmd.exe \| WSL \| Bash for Windows \| \|---\|---\|---\|---\| \| ![SCR-20260226-msrh](https://github.com/user-attachments/assets/0f6297a6-4241-4dbf-b7ff-cf02da8941b0) \| ![SCR-20260226-nbao](https://github.com/user-attachments/assets/bb5ff8a9-903c-4677-a2de-1f6e1f34b18e) \| ![SCR-20260226-nbej](https://github.com/user-attachments/assets/26ecec2c-a7e9-410a-8702-f73995b490a6) \| ![SCR-20260226-nbkz](https://github.com/user-attachments/assets/80c4bf9a-3b41-40e1-bc87-f5c565f96075) \| ## Non-goals - This does not change color detection for anything outside the diff renderer (e.g. the chat widget, markdown rendering). - This does not add a user-facing config knob; `FORCE_COLOR` already serves that role. ## Tradeoffs - The `has_wt_session` signal is intentionally kept separate from `TerminalName::WindowsTerminal`. `terminal_info()` is derived with `TERM_PROGRAM` precedence, so it can differ from raw `WT_SESSION`. - Real-world validation in this issue: in both `cmd.exe` and PowerShell, `TERM`/`TERM_PROGRAM`/`COLORTERM` were absent, so TERM-based capability hints were unavailable in those sessions. - Checking `FORCE_COLOR` for presence rather than parsing its value is a simplification. In practice `supports-color` has already parsed it, so our check is a coarse "did the user set _anything_?" gate. The effective color level still comes from `supports-color`. - When `WT_SESSION` is present without `FORCE_COLOR`, we promote to truecolor regardless of `stdout_level` (including `Unknown`). This is aggressive but correct: `WT_SESSION` is a strong signal that we're in Windows Terminal. 
- ANSI-16 add/delete backgrounds (bright green/red) overpower syntax-highlighted token colors, making diffs harder to read. Foreground-only cues (colored text, gutter signs) preserve readability on low-color terminals. ## Architecture ``` stdout_color_level() ──┐ terminal_info().name ──┤ WT_SESSION presence ──┼──▶ diff_color_level_for_terminal() ──▶ DiffColorLevel FORCE_COLOR presence ──┘ │ ▼ RichDiffColorLevel::from_diff_color_level() │ ┌──────────┴──────────┐ │ Some(TrueColor\|256) │ → bg tints │ None (Ansi16) │ → fg only └─────────────────────┘ ``` `diff_color_level()` is the environment-reading entry point; it gathers the four runtime signals and delegates to the pure, testable `diff_color_level_for_terminal()`. ## Observability No new logs or metrics. Incorrect color selection is immediately visible as broken diff rendering; the test suite covers the decision matrix exhaustively. ## Tests Six new unit tests exercise every branch of `diff_color_level_for_terminal`: \| Test \| Inputs \| Expected \| \|------\|--------\|----------\| \| `windows_terminal_promotes_ansi16_to_truecolor_for_diffs` \| Ansi16 + WindowsTerminal name \| TrueColor \| \| `wt_session_promotes_ansi16_to_truecolor_for_diffs` \| Ansi16 + WT_SESSION only \| TrueColor \| \| `non_windows_terminal_keeps_ansi16_diff_palette` \| Ansi16 + WezTerm \| Ansi16 \| \| `wt_session_promotes_unknown_color_level_to_truecolor` \| Unknown + WT_SESSION \| TrueColor \| \| `explicit_force_override_keeps_ansi16_on_windows_terminal` \| Ansi16 + WindowsTerminal + FORCE_COLOR \| Ansi16 \| \| `explicit_force_override_keeps_ansi256_on_windows_terminal` \| Ansi256 + WT_SESSION + FORCE_COLOR \| Ansi256 \| \| `ansi16_add_style_uses_foreground_only` \| Dark + Ansi16 \| fg=Green, bg=None \| \| (and any other new snapshot/assertion tests from commits `d757fee` and `d7c78b3`) \| \| \| ## Test plan - [x] Verify all new unit tests pass (`cargo test -p codex-tui --lib`) - [x] On Windows Terminal: confirm diffs render with truecolor 
backgrounds - [x] On Windows Terminal with `FORCE_COLOR` set: confirm promotion is disabled and output follows the forced `supports-color` level - [x] On macOS/Linux terminals: confirm no behavior change Fixes https://github.com/openai/codex/issues/12904 Fixes https://github.com/openai/codex/issues/12890 Fixes https://github.com/openai/codex/issues/12912 Fixes https://github.com/openai/codex/issues/12840	2026-02-27 10:45:59 -07:00
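The decision matrix described in the commit above can be sketched as a pure function. This is a minimal illustration, not the actual `codex-tui` code: the enum definitions, the parameter order, and the choice to map `Unknown` to the ANSI-16 palette when no promotion applies are all assumptions made for the sketch.

```rust
// Hypothetical stand-ins for the types named in the commit message.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StdoutColorLevel { TrueColor, Ansi256, Ansi16, Unknown }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiffColorLevel { TrueColor, Ansi256, Ansi16 }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TerminalName { WindowsTerminal, WezTerm }

/// Pure mapping from the detected capability to the intended diff capability.
fn diff_color_level_for_terminal(
    stdout_level: StdoutColorLevel,
    terminal: TerminalName,
    has_wt_session: bool,
    has_force_color: bool,
) -> DiffColorLevel {
    // Assumption: without any promotion, Unknown falls back to the 16-color palette.
    let base = match stdout_level {
        StdoutColorLevel::TrueColor => DiffColorLevel::TrueColor,
        StdoutColorLevel::Ansi256 => DiffColorLevel::Ansi256,
        StdoutColorLevel::Ansi16 | StdoutColorLevel::Unknown => DiffColorLevel::Ansi16,
    };
    // An explicit FORCE_COLOR suppresses both promotion paths.
    if has_force_color {
        return base;
    }
    // WT_SESSION is treated as a strong signal: promote regardless of the detected level.
    if has_wt_session {
        return DiffColorLevel::TrueColor;
    }
    // Name-based detection promotes only the ANSI-16 level, not Unknown.
    if terminal == TerminalName::WindowsTerminal && stdout_level == StdoutColorLevel::Ansi16 {
        return DiffColorLevel::TrueColor;
    }
    base
}
```

Keeping the environment reads outside this function is what lets each row of the test table be written as a single call with plain arguments.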
Michael Bolin	d09a7535ed	fix: use AbsolutePathBuf for permission profile file roots (#12970 ) ## Why `PermissionProfile` should describe filesystem roots as absolute paths at the type level. Using `PathBuf` in `FileSystemPermissions` made the shared type too permissive and blurred together three different deserialization cases: - skill metadata in `agents/openai.yaml`, where relative paths should resolve against the skill directory - app-server API payloads, where callers should have to send absolute paths - local tool-call payloads for commands like `shell_command` and `exec_command`, where `additional_permissions.file_system` may legitimately be relative to the command `workdir` This change tightens the shared model without regressing the existing local command flow. ## What Changed - changed `protocol::models::FileSystemPermissions` and the app-server `AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf` - wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so relative permission roots in `agents/openai.yaml` resolve against the containing skill directory - kept app-server/API deserialization strict, so relative `additionalPermissions.fileSystem.*` paths are rejected at the boundary - restored cwd/workdir-relative deserialization for local tool-call payloads by parsing `shell`, `shell_command`, and `exec_command` arguments under an `AbsolutePathBufGuard` rooted at the resolved command working directory - simplified runtime additional-permission normalization so it only canonicalizes and deduplicates absolute roots instead of trying to recover relative ones later - updated the app-server schema fixtures, `app-server/README.md`, and the affected transport/TUI tests to match the final behavior	2026-02-27 17:42:52 +00:00
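The difference between the strict API boundary and the base-relative cases in the commit above can be sketched with the standard library alone. `AbsPath`, `parse_strict`, and `resolve_against` are hypothetical names for illustration; the real change uses `AbsolutePathBuf` with an `AbsolutePathBufGuard` during deserialization, which this sketch replaces with an explicit base argument. The example paths assume a Unix-style filesystem.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical newtype that only ever holds an absolute path.
#[derive(Debug, Clone, PartialEq, Eq)]
struct AbsPath(PathBuf);

impl AbsPath {
    /// Strict boundary (app-server style): relative input is rejected.
    fn parse_strict(raw: &str) -> Result<Self, String> {
        let path = Path::new(raw);
        if path.is_absolute() {
            Ok(AbsPath(path.to_path_buf()))
        } else {
            Err(format!("relative path rejected: {}", raw))
        }
    }

    /// Base-relative boundary (skill directory or command workdir):
    /// relative input is joined onto an absolute base.
    fn resolve_against(base: &Path, raw: &str) -> Result<Self, String> {
        if !base.is_absolute() {
            return Err(format!("base must be absolute: {}", base.display()));
        }
        // `Path::join` returns `raw` unchanged when `raw` is already absolute.
        Ok(AbsPath(base.join(raw)))
    }
}
```

With this split, code downstream of either boundary only sees absolute roots and never has to recover relative ones later.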
jif-oai	8cf5b00aef	fix: more stable notify script (#13011 )	2026-02-27 16:05:44 +01:00
jif-oai	fe439afb81	chore: tmp remove awaiter (#13001 )	2026-02-27 13:22:17 +01:00
jif-oai	bbd237348d	feat: gen memories config (#12999 )	2026-02-27 12:38:47 +01:00
jif-oai	a63d8bd569	feat: add use memories config (#12997 )	2026-02-27 11:40:54 +01:00
Michael Bolin	e6cd75a684	notify: include client in legacy hook payload (#12968 ) ## Why The `notify` hook payload did not identify which Codex client started the turn. That meant downstream notification hooks could not distinguish between completions coming from the TUI and completions coming from app-server clients such as VS Code or Xcode. Now that the Codex App provides its own desktop notifications, it would be nice to be able to filter those out. This change adds that context without changing the existing payload shape for callers that do not know the client name, and keeps the new end-to-end test cross-platform. ## What changed - added an optional top-level `client` field to the legacy `notify` JSON payload - threaded that value through `core` and `hooks`; the internal session and turn state now carries it as `app_server_client_name` - set the field to `codex-tui` for TUI turns - captured `initialize.clientInfo.name` in the app server and applied it to subsequent turns before dispatching hooks - replaced the notify integration test hook with a `python3` script so the test does not rely on Unix shell permissions or `bash` - documented the new field in `docs/config.md` ## Testing - `cargo test -p codex-hooks` - `cargo test -p codex-tui` - `cargo test -p codex-app-server suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name -- --exact --nocapture` - `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs` still has unrelated existing failures in this environment) ## Docs The public config reference on `developers.openai.com/codex` should mention that the legacy `notify` payload may include a top-level `client` field. The TUI reports `codex-tui`, and the app server reports `initialize.clientInfo.name` when it is available.	2026-02-26 22:27:34 -08:00
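A downstream hook could use the new field roughly as follows. This is a sketch under assumptions: `NotifyPayload` is a hypothetical, trimmed-down view that models only the optional top-level `client` field, and the muted client name used in any call is made up. Only `codex-tui` is a name given by the commit message.

```rust
/// Hypothetical view of the legacy `notify` payload; all other fields are omitted.
struct NotifyPayload {
    /// Absent when the sender does not know the client name.
    client: Option<String>,
}

/// Decide whether a hook should raise a notification for this payload.
fn should_notify(payload: &NotifyPayload, muted_clients: &[&str]) -> bool {
    match payload.client.as_deref() {
        // Known client: notify unless the user has muted it.
        Some(name) => !muted_clients.contains(&name),
        // Older payload shape without `client`: keep the previous behavior.
        None => true,
    }
}
```

Treating a missing `client` as "notify" keeps existing hooks working with payloads from callers that do not set the field.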
Ahmed Ibrahim	53e28f18cf	Add realtime websocket tracing (#12981 ) - add transport and conversation logs around connect, close, and parse flow - log realtime transport failures as errors for easier debugging	2026-02-26 22:15:18 -08:00
Ahmed Ibrahim	4d180ae428	Add model availability NUX metadata (#12972 ) - replace show_nux with structured availability_nux model metadata - expose availability NUX data through the app-server model API - update shared fixtures and tests for the new field	2026-02-26 22:02:57 -08:00
Eric Traut	cee009d117	Add oauth_resource handling for MCP login flows (#12866 ) Addresses bug https://github.com/openai/codex/issues/12589 Builds on community PR #12763. This adds `oauth_resource` support for MCP `streamable_http` servers and wires it through the relevant config and login paths. It fixes the bug where the configured OAuth resource was not reliably included in the authorization request, causing MCP login to omit the expected `resource` parameter.	2026-02-26 20:10:12 -08:00
Curtis 'Fjord' Hawthorne	7e980d7db6	Support multimodal custom tool outputs (#12948 ) ## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. 
Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948	2026-02-26 18:17:46 -08:00
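The output shape described in the commit above can be sketched as an enum that is either plain text or a list of content items. The type names, the `image_url` field, and the rule that a result without images stays plain text are assumptions for illustration; the real payload type is `FunctionCallOutputPayload`.

```rust
/// Hypothetical content item; field names are assumed.
#[derive(Debug, Clone, PartialEq)]
enum ContentItem {
    InputText { text: String },
    InputImage { image_url: String },
}

/// Either the legacy plain string or structured items.
#[derive(Debug, Clone, PartialEq)]
enum OutputBody {
    Text(String),
    Items(Vec<ContentItem>),
}

/// Combine the text produced by the JS with image items from nested tool calls.
fn build_js_repl_output(text: &str, nested: Vec<ContentItem>) -> OutputBody {
    let images: Vec<ContentItem> = nested
        .into_iter()
        .filter(|item| matches!(item, ContentItem::InputImage { .. }))
        .collect();
    // Assumption: with no images, the output stays a plain string.
    if images.is_empty() {
        return OutputBody::Text(text.to_string());
    }
    let mut items = Vec::new();
    // An `input_text` item is included only if the JS produced text.
    if !text.is_empty() {
        items.push(ContentItem::InputText { text: text.to_string() });
    }
    items.extend(images);
    OutputBody::Items(items)
}
```

Keeping a `Text` variant alongside `Items` mirrors the compatibility goal: histories that stored a plain string still have a natural representation.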
Ahmed Ibrahim	a11da86b37	Make realtime audio test deterministic (#12959 ) ## Summary - add a websocket test-server request waiter so tests can synchronize on recorded client messages - use that waiter in the realtime delegation test instead of a fixed audio timeout - add temporary timing logs in the test and websocket mock to inspect where the flake stalls	2026-02-26 16:09:00 -08:00
Celia Chen	90cc4e79a2	feat: add local date/timezone to turn environment context (#12947 ) ## Summary This PR includes the session's local date and timezone in the model-visible environment context and persists that data in `TurnContextItem`. ## What changed - captures the current local date and IANA timezone when building a turn context, with a UTC fallback if the timezone lookup fails - includes `current_date` and `timezone` in the serialized `<environment_context>` payload - stores those fields on `TurnContextItem` so they survive rollout/history handling, subagent review threads, and resume flows - treats date/timezone changes as environment updates, so prompt caching and context refresh logic do not silently reuse stale time context - updates tests to validate the new environment fields without depending on a single hardcoded environment-context string ## Test Built locally and confirmed the new fields in the rollout file: ``` {"timestamp":"2026-02-26T21:39:50.737Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"<environment_context>\n <shell>zsh</shell>\n <current_date>2026-02-26</current_date>\n <timezone>America/Los_Angeles</timezone>\n</environment_context>"}]}} ```	2026-02-26 23:17:35 +00:00
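The environment-context behavior in the commit above (UTC fallback, serialized fields, change detection) can be sketched as follows. The struct, its constructor, and the two-space indentation are assumptions for illustration; only the tag names and the UTC fallback come from the commit message, and the sketch takes the date and timezone as arguments rather than reading the system clock.

```rust
/// Hypothetical model of the fields that end up in `<environment_context>`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct EnvironmentContext {
    shell: String,
    current_date: String,
    timezone: String,
}

impl EnvironmentContext {
    /// `timezone` is `None` when the IANA lookup failed; fall back to UTC.
    fn new(shell: &str, current_date: &str, timezone: Option<&str>) -> Self {
        EnvironmentContext {
            shell: shell.to_string(),
            current_date: current_date.to_string(),
            timezone: timezone.unwrap_or("UTC").to_string(),
        }
    }

    /// Serialize to the XML-like payload shown in the rollout sample.
    fn render(&self) -> String {
        format!(
            "<environment_context>\n  <shell>{}</shell>\n  <current_date>{}</current_date>\n  <timezone>{}</timezone>\n</environment_context>",
            self.shell, self.current_date, self.timezone
        )
    }
}

/// A date or timezone change counts as an environment update.
fn environment_changed(prev: &EnvironmentContext, next: &EnvironmentContext) -> bool {
    prev != next
}
```

Comparing whole structs means a midnight rollover or a timezone change triggers a context refresh without any date-specific logic.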
Michael Bolin	4cb086d96f	test: move unix_escalation tests into sibling file (#12957 ) ## Why `unix_escalation.rs` had a large inline `mod tests` block that made the implementation harder to scan. This change moves those tests into a sibling file while keeping them as a child module, so they can still exercise private items without widening visibility. ## What Changed - replaced the inline `#[cfg(test)] mod tests` block in `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` with a path-based test module declaration - moved the existing unit tests into `codex-rs/core/src/tools/runtimes/shell/unix_escalation_tests.rs` - kept the extracted tests using `super::...` imports so they continue to access private helpers and types from `unix_escalation.rs` ## Testing - `cargo test -p codex-core unix_escalation::tests`	2026-02-26 23:15:28 +00:00
Ahmed Ibrahim	a0e86c69fe	Add realtime audio device config (#12849 ) ## Summary - add top-level realtime audio config for microphone and speaker selection - apply configured devices when starting realtime capture and playback - keep missing-device behavior on the system default fallback path ## Validation - just write-config-schema - cargo test -p codex-core realtime_audio - cargo test -p codex-tui - just fix -p codex-core - just fix -p codex-tui - just fmt --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-26 15:08:21 -08:00
pakrym-oai	951a389654	Allow clients not to send summary as an option (#12950 ) Summary is a required parameter on UserTurn. Ideally we'd like the core to decide the appropriate summary level. Make the summary optional and don't send it when not needed.	2026-02-26 14:37:38 -08:00
jif-oai	a6065d30f4	feat: add git info to memories (#12940 )	2026-02-26 20:14:13 +00:00
Michael Bolin	7fa9d9ae35	feat: include sandbox config with escalation request (#12839 ) ## Why Before this change, an escalation approval could say that a command should be rerun, but it could not carry the sandbox configuration that should still apply when the escalated command is actually spawned. That left an unsafe gap in the `zsh-fork` skill path: skill scripts under `scripts/` that did not declare permissions could be escalated without a sandbox, and scripts that did declare permissions could lose their bounded sandbox on rerun or cached session approval. This PR extends the escalation protocol so approvals can optionally carry sandbox configuration all the way through execution. That lets the shell runtime preserve the intended sandbox instead of silently widening access. We likely want a single permissions type for this codepath eventually, probably centered on `Permissions`. For now, the protocol needs to represent both the existing `PermissionProfile` form and the fuller `Permissions` form, so this introduces a temporary disjoint union, `EscalationPermissions`, to carry either one. Further, this means that today, a skill either: - does not declare any permissions, in which case it is run using the default sandbox for the turn - specifies permissions, in which case the skill is run using that exact sandbox, which might be more restrictive than the default sandbox for the turn We will likely change the skill's permissions to be additive to the existing permissions for the turn. ## What Changed - Added `EscalationPermissions` to `codex-protocol` so escalation requests can carry either a `PermissionProfile` or a full `Permissions` payload. - Added an explicit `EscalationExecution` mode to the shell escalation protocol so reruns distinguish between `Unsandboxed`, `TurnDefault`, and `Permissions(...)` instead of overloading `None`. 
- Updated `zsh-fork` shell reruns to resolve `TurnDefault` at execution time, which keeps ordinary `UseDefault` commands on the turn sandbox and preserves turn-level macOS seatbelt profile extensions. - Updated the `zsh-fork` skill path so a skill with no declared permissions inherits the conversation's effective sandbox instead of escalating unsandboxed. - Updated the `zsh-fork` skill path so a skill with declared permissions reruns with exactly those permissions, including when a cached session approval is reused. ## Testing - Added unit coverage in `core/src/tools/runtimes/shell/unix_escalation.rs` for the explicit `UseDefault` / `RequireEscalated` / `WithAdditionalPermissions` execution mapping. - Added unit coverage in `core/src/tools/runtimes/shell/unix_escalation.rs` for macOS seatbelt extension preservation in both the `TurnDefault` and explicit-permissions rerun paths. - Added integration coverage in `core/tests/suite/skill_approval.rs` for permissionless skills inheriting the turn sandbox and explicit skill permissions remaining bounded across cached approval reuse.	2026-02-26 12:00:18 -08:00
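The three execution modes described in the commit above can be sketched like this. `SandboxPolicy` is a hypothetical placeholder for whatever `EscalationPermissions` carries, and both helper functions are illustrative names, not the shell runtime's API.

```rust
/// Hypothetical placeholder for a concrete sandbox configuration.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SandboxPolicy(String);

/// Explicit rerun mode, so that "no sandbox" is never implied by an absent value.
#[derive(Debug, Clone, PartialEq, Eq)]
enum EscalationExecution {
    Unsandboxed,
    TurnDefault,
    Permissions(SandboxPolicy),
}

/// Skill path: declared permissions are used exactly; otherwise inherit the turn sandbox.
fn skill_execution(declared: Option<SandboxPolicy>) -> EscalationExecution {
    match declared {
        Some(policy) => EscalationExecution::Permissions(policy),
        None => EscalationExecution::TurnDefault,
    }
}

/// Resolve the mode at execution time; `None` means the command runs without a sandbox.
fn resolve_sandbox(
    execution: &EscalationExecution,
    turn_default: &SandboxPolicy,
) -> Option<SandboxPolicy> {
    match execution {
        EscalationExecution::Unsandboxed => None,
        EscalationExecution::TurnDefault => Some(turn_default.clone()),
        EscalationExecution::Permissions(policy) => Some(policy.clone()),
    }
}
```

Because `TurnDefault` is resolved only when the command is spawned, a cached approval reuses whatever sandbox is in effect for the current turn rather than a stale copy.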
jif-oai	3404ecff15	feat: add post-compaction sub-agent infos (#12774 ) Co-authored-by: Codex <noreply@openai.com>	2026-02-26 18:55:34 +00:00
Curtis 'Fjord' Hawthorne	eb77db2957	Log js_repl nested tool responses in rollout history (#12837 ) ## Summary - add tracing-based diagnostics for nested `codex.tool(...)` calls made from `js_repl` - emit a bounded, sanitized summary at `info!` - emit the exact raw serialized response object or error string seen by JavaScript at `trace!` - document how to enable these logs and where to find them, especially for `codex app-server` ## Why Nested `codex.tool(...)` calls inside `js_repl` are a debugging boundary: JavaScript sees the tool result, but that result is otherwise hard to inspect from outside the kernel. This change adds explicit tracing for that path using the repo’s normal observability pattern: - `info` for compact summaries - `trace` for exact raw payloads when deep debugging is needed ## What changed - `js_repl` now summarizes nested tool-call results across the response shapes it can receive: - message content - function-call outputs - custom tool outputs - MCP tool results and MCP error results - direct error strings - each nested `codex.tool(...)` completion logs: - `exec_id` - `tool_call_id` - `tool_name` - `ok` - a bounded summary struct describing the payload shape - at `trace`, the same path also logs the exact serialized response object or error string that JavaScript received - docs now include concrete logging examples for `codex app-server` - unit coverage was added for multimodal function output summaries and error summaries ## How to use it ### Summary-only logging Set: ```sh RUST_LOG=codex_core::tools::js_repl=info ``` For `codex app-server`, tracing output is written to the server process `stderr`. Example: ```sh RUST_LOG=codex_core::tools::js_repl=info \ LOG_FORMAT=json \ codex app-server \ 2> /tmp/codex-app-server.log ``` This emits bounded summary lines for nested `codex.tool(...)` calls. 
### Full raw debugging Set: ```sh RUST_LOG=codex_core::tools::js_repl=trace ``` Example: ```sh RUST_LOG=codex_core::tools::js_repl=trace \ LOG_FORMAT=json \ codex app-server \ 2> /tmp/codex-app-server.log ``` At `trace`, you get: - the same `info` summary line - a `trace` line with the exact serialized response object seen by JavaScript - or the exact error string if the nested tool call failed ### Where the logs go For `codex app-server`, these logs go to process `stderr`, so redirect or capture `stderr` to inspect them. Example: ```sh RUST_LOG=codex_core::tools::js_repl=trace \ LOG_FORMAT=json \ /Users/fjord/code/codex/codex-rs/target/debug/codex app-server \ 2> /tmp/codex-app-server.log ``` Then inspect: ```sh rg "js_repl nested tool call" /tmp/codex-app-server.log ``` Without an explicit `RUST_LOG` override, these `js_repl` nested tool-call logs are typically not visible.	2026-02-26 10:12:28 -08:00
jif-oai	d3603ae5d3	feat: fork thread multi agent (#12499 )	2026-02-26 18:01:53 +00:00
jif-oai	c53c08f8f9	chore: calm down awaiter (#12925 )	2026-02-26 17:54:48 +00:00
pakrym-oai	ba41e84a50	Use model catalog default for reasoning summary fallback (#12873 ) ## Summary - make `Config.model_reasoning_summary` optional so unset means use model default - resolve the optional config value to a concrete summary when building `TurnContext` - add protocol support for `default_reasoning_summary` in model metadata ## Validation - `cargo test -p codex-core --lib client::tests -- --nocapture` --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-26 09:31:13 -08:00
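The fallback in the commit above amounts to resolving an optional config value against a per-model default. A minimal sketch, assuming hypothetical variant names and a hypothetical `resolve_reasoning_summary` helper:

```rust
/// Hypothetical summary levels; the variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReasoningSummary {
    Auto,
    Concise,
    Detailed,
}

/// `configured` is `None` when the user left the setting unset,
/// in which case the model catalog default wins.
fn resolve_reasoning_summary(
    configured: Option<ReasoningSummary>,
    model_default: ReasoningSummary,
) -> ReasoningSummary {
    configured.unwrap_or(model_default)
}
```

Resolving once, when the turn context is built, means the rest of the turn can work with a concrete value instead of re-checking the option.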
jif-oai	739d4b52de	fix: do not apply turn cwd to metadata (#12887 ) Details here: https://openai.slack.com/archives/C09NZ54M4KY/p1772056758227339	2026-02-26 17:05:58 +00:00
jif-oai	c528f32acb	feat: use memory usage for selection (#12909 )	2026-02-26 16:44:02 +00:00
daveaitel-openai	79cbca324a	Skip history metadata scan for subagents (#12918 ) Summary - Skip `history_metadata` scanning when spawning subagents to avoid expensive per-spawn history scans. - Keeps behavior unchanged for normal sessions. Testing - `cd codex-rs && cargo test -p codex-core` - Failing in this environment (apparently pre-existing and likely unrelated to this change): - `suite::cli_stream::responses_mode_stream_cli` (SIGKILL + OTEL export error to http://localhost:14318/v1/logs) - `suite::grep_files::grep_files_tool_collects_matches` (unsupported call: grep_files) - `suite::grep_files::grep_files_tool_reports_empty_results` (unsupported call: grep_files) Co-authored-by: Codex <noreply@openai.com>	2026-02-26 16:21:26 +00:00
jif-oai	382fa338b3	feat: memories forgetting (#12900 ) Add diff-based memory forgetting.	2026-02-26 13:19:57 +00:00
jif-oai	81ce645733	chore: better awaiter description (#12901 )	2026-02-26 12:07:13 +00:00
Wendy Jiao	52aa49db1b	Add rollout path to memory files and search for them during read (#12684 ) Co-authored-by: jif-oai <jif@openai.com>	2026-02-26 10:57:01 +00:00
jif-oai	51cf3977d4	chore: new agents name (#12884 )	2026-02-26 09:36:09 +00:00
Charley Cunningham	07aefffb1f	core: bundle settings diff updates into one dev/user envelope (#12417 ) ## Summary - bundle contextual prompt injection into at most one developer message plus one contextual user message in both: - per-turn settings updates - initial context insertion - preserve `<model_switch>` across compaction by rebuilding it through canonical initial-context injection, instead of relying on strip/reattach hacks - centralize contextual user fragment detection in one shared definition table and reuse it for parsing/compaction logic - keep `AGENTS.md` in its natural serialized format: - `# AGENTS.md instructions for {dirname}` - `<INSTRUCTIONS>...</INSTRUCTIONS>` - simplify related tests/helpers and accept the expected snapshot/layout updates from bundled multi-part messages ## Why The goal is to converge toward a simpler, more intentional prompt shape where contextual updates are consistently represented as one developer envelope plus one contextual user envelope, while keeping parsing and compaction behavior aligned with that representation. 
## Notable details - the temporary `SettingsUpdateEnvelope` wrapper was removed; these paths now return `Vec<ResponseItem>` directly - local/remote compaction no longer rely on model-switch strip/restore helpers - contextual user detection is now driven by shared fragment definitions instead of ad hoc matcher assembly - AGENTS/user instructions are still the same logical context; only the synthetic `<user_instructions>` wrapper was replaced by the natural AGENTS text format ## Testing - `just fmt` - `cargo test -p codex-app-server codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages -- --exact` - `cargo test -p codex-core compact::tests::collect_user_messages_filters_session_prefix_entries --lib -- --exact` - `cargo test -p codex-core --test all 'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_apps_guidance_as_developer_message_when_enabled' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_developer_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_user_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::resume_includes_initial_messages_and_sends_prior_items' -- --exact` - `cargo test -p codex-core --test all 'suite::review::review_input_isolated_from_parent_history' -- --exact` - `cargo test -p codex-exec --test all 'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' -- --exact` - `cargo test -p core_test_support context_snapshot::tests::full_text_mode_preserves_unredacted_text -- --exact` ## Notes - I also ran several targeted `compact`, `compact_remote`, `prompt_caching`, `model_visible_layout`, and 
`event_mapping` tests while iterating on prompt-shape changes. - I have not claimed a clean full-workspace `cargo test` from this environment because local sandbox/resource conditions have previously produced unrelated failures in large workspace runs.	2026-02-26 00:12:08 -08:00
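The bundling rule from the commit above, at most one developer message plus one contextual user message, can be sketched as a grouping step. `Role`, `Fragment`, `Message`, and `bundle` are hypothetical names; the real paths return `Vec<ResponseItem>`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Developer,
    User,
}

/// Hypothetical contextual fragment (for example a settings update or AGENTS.md text).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Fragment {
    role: Role,
    text: String,
}

/// Hypothetical multi-part message.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Message {
    role: Role,
    parts: Vec<String>,
}

/// Collapse any number of fragments into at most one message per role,
/// developer first, preserving fragment order within each role.
fn bundle(fragments: Vec<Fragment>) -> Vec<Message> {
    let mut developer = Vec::new();
    let mut user = Vec::new();
    for fragment in fragments {
        match fragment.role {
            Role::Developer => developer.push(fragment.text),
            Role::User => user.push(fragment.text),
        }
    }
    let mut messages = Vec::new();
    if !developer.is_empty() {
        messages.push(Message { role: Role::Developer, parts: developer });
    }
    if !user.is_empty() {
        messages.push(Message { role: Role::User, parts: user });
    }
    messages
}
```

Skipping empty envelopes keeps a turn with no contextual changes from emitting placeholder messages.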
Curtis 'Fjord' Hawthorne	7326c097e3	Reduce js_repl Node version requirement to 22.22.0 (#12857 ) ## Summary Lower the `js_repl` minimum Node version from `24.13.1` to `22.22.0`. This updates the enforced minimum in `codex-rs/node-version.txt` and the corresponding user-facing `/experimental` description for the JavaScript REPL feature. ## Rationale The previous `24.13.1` floor was stricter than necessary for `js_repl`. I validated that the REPL kernel still works under Node `22.22.0`. ## Why `22.22.0` `22.22.0` is a current, widely packaged Node 22 release across common developer environments and distros, including Homebrew `node@22`, Fedora `nodejs22`, Arch `nodejs-lts-jod`, and Debian testing. That makes it a better exact floor than guessing at an older `22.x` patch we have not validated. `22.x` is also a maintenance branch that will be supported through April 2027, whereas the previous maintenance branch, `20.x`, is only supported through April of this year. ## Changes - Update `codex-rs/node-version.txt` from `24.13.1` to `22.22.0` - Update the `/experimental` JavaScript REPL description to say `Requires Node >= v22.22.0 installed.`	2026-02-26 04:09:30 +00:00
xl-openai	8cdee988f9	Skip system skills for extra roots (#12744 ) When extra roots are set, do not load system skills.	2026-02-25 19:55:28 -08:00
Curtis 'Fjord' Hawthorne	40ab71a985	Disable js_repl when Node is incompatible at startup (#12824 ) ## Summary - validate `js_repl` Node compatibility during session startup when the experiment is enabled - if Node is missing or too old, disable `js_repl` and `js_repl_tools_only` for the session before tools and instructions are built - surface that startup disablement to users through the existing startup warning flow instead of only logging it - reuse the same compatibility check in js_repl kernel startup so startup gating and runtime behavior stay aligned - add a regression test that verifies the warning is emitted and that the first advertised tool list omits `js_repl` and `js_repl_reset` when Node is incompatible ## Why Today `js_repl` can be advertised based only on the feature flag, then fail later when the kernel starts. That makes the available tool list inaccurate at the start of a conversation, and users do not get a clear explanation for why the tool is unavailable. This change makes tool availability reflect real startup checks, keeps the advertised tool set stable for the lifetime of the session, and gives users a visible warning when `js_repl` is disabled. ## Testing - `just fmt` - `cargo test -p codex-core --test all js_repl_is_not_advertised_when_startup_node_is_incompatible`	2026-02-26 01:14:51 +00:00
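The startup gate in the commit above boils down to comparing a detected Node version against a minimum. The sketch below is an assumption-laden illustration: `Version`, `parse_version`, and `check_node` are hypothetical names, and the detected version is passed in as a string rather than obtained by running `node --version`.

```rust
/// Field order matters: the derived ordering compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

/// Parse strings such as "v22.22.0" or "22.22.0"; anything else yields `None`.
fn parse_version(raw: &str) -> Option<Version> {
    let trimmed = raw.trim().trim_start_matches('v');
    let mut parts = trimmed.split('.');
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = parts.next()?.parse::<u32>().ok()?;
    let patch = parts.next()?.parse::<u32>().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// `detected` is `None` when Node is not installed.
/// An `Err` carries the warning text to surface at startup.
fn check_node(detected: Option<&str>, minimum: Version) -> Result<Version, String> {
    let raw = detected.ok_or_else(|| "Node is not installed".to_string())?;
    let found = parse_version(raw)
        .ok_or_else(|| format!("could not parse Node version: {}", raw))?;
    if found < minimum {
        return Err(format!(
            "Node {}.{}.{} is older than the required {}.{}.{}",
            found.major, found.minor, found.patch,
            minimum.major, minimum.minor, minimum.patch
        ));
    }
    Ok(found)
}
```

Running one check both at session startup and at kernel startup, as the commit describes, keeps the advertised tool list consistent with what the runtime will accept.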
Michael Bolin	14116ade8d	feat: include available decisions in command approval requests (#12758 ) Command-approval clients currently infer which choices to show from side-channel fields like `networkApprovalContext`, `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes the request shape harder to evolve, and it forces each client to replicate the server's heuristics instead of receiving the exact decision list for the prompt. This PR introduces a mapping between `CommandExecutionApprovalDecision` and `codex_protocol::protocol::ReviewDecision`: ```rust impl From<CoreReviewDecision> for CommandExecutionApprovalDecision { fn from(value: CoreReviewDecision) -> Self { match value { CoreReviewDecision::Approved => Self::Accept, CoreReviewDecision::ApprovedExecpolicyAmendment { proposed_execpolicy_amendment, } => Self::AcceptWithExecpolicyAmendment { execpolicy_amendment: proposed_execpolicy_amendment.into(), }, CoreReviewDecision::ApprovedForSession => Self::AcceptForSession, CoreReviewDecision::NetworkPolicyAmendment { network_policy_amendment, } => Self::ApplyNetworkPolicyAmendment { network_policy_amendment: network_policy_amendment.into(), }, CoreReviewDecision::Abort => Self::Cancel, CoreReviewDecision::Denied => Self::Decline, } } } ``` It also updates `CommandExecutionRequestApprovalParams` to have a new field: ```rust available_decisions: Option<Vec<CommandExecutionApprovalDecision>> ``` which, if specified, should make it easier for clients to display an appropriate list of options in the UI. This makes it possible for `CoreShellActionProvider::prompt()` in `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly, adding support for `ApprovedForSession` when approving a skill script, which was previously missing in the TUI. Note that this results in a significant change to `exec_options()` in `approval_overlay.rs`, as the displayed options are now derived from `available_decisions: &[ReviewDecision]`. 
## What Changed - Add `available_decisions` to [`ExecApprovalRequestEvent`](`de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)`), including helpers to derive the legacy default choices when older senders omit the field. - Map `codex_protocol::protocol::ReviewDecision` to app-server `CommandExecutionApprovalDecision` and expose the ordered list as experimental `availableDecisions` in [`CommandExecutionRequestApprovalParams`](`de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)`). - Thread optional `available_decisions` through the core approval path so Unix shell escalation can explicitly request `ApprovedForSession` for session-scoped approvals instead of relying on client heuristics. [`unix_escalation.rs`](`de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214)`) - Update the TUI approval overlay to build its buttons from the ordered decision list, while preserving the legacy fallback when `available_decisions` is missing. - Update the app-server README, test client output, and generated schema artifacts to document and surface the new field. ## Testing - Add `approval_overlay.rs` coverage for explicit decision lists, including the generic `ApprovedForSession` path and network approval options. - Update `chatwidget/tests.rs` and app-server protocol tests to populate the new optional field and keep older event shapes working. ## Developers Docs - If we document `item/commandExecution/requestApproval` on [developers.openai.com/codex](https://developers.openai.com/codex), add experimental `availableDecisions` as the preferred source of approval choices and note that older servers may omit it.	2026-02-26 01:10:46 +00:00
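The legacy fallback described in this commit can be pictured with a small sketch. This is not the code from the PR: the `ReviewDecision` enum below is a cut-down stand-in that models only some unit variants named in the message, and both `legacy_default_decisions` and `effective_decisions` are hypothetical helper names. The contents of the legacy default list are an assumption as well.

```rust
/// Cut-down stand-in for the review decision type (assumption: only a few
/// unit variants named in the commit message are modeled).
#[derive(Debug, Clone, PartialEq)]
enum ReviewDecision {
    Approved,
    ApprovedForSession,
    Denied,
    Abort,
}

/// Hypothetical legacy default list used when an older sender omits the
/// field. The actual defaults in the repository may differ.
fn legacy_default_decisions() -> Vec<ReviewDecision> {
    vec![
        ReviewDecision::Approved,
        ReviewDecision::Denied,
        ReviewDecision::Abort,
    ]
}

/// Prefer the explicit, ordered list from the sender; fall back to the
/// legacy defaults only when the field is missing.
fn effective_decisions(available: Option<Vec<ReviewDecision>>) -> Vec<ReviewDecision> {
    available.unwrap_or_else(legacy_default_decisions)
}
```

With this shape, a client builds its buttons from `effective_decisions(...)` in order and never has to guess from side-channel fields.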
Celia Chen	4f45668106	Revert "Add skill approval event/response (#12633 )" (#12811 ) This reverts commit https://github.com/openai/codex/pull/12633. We no longer need this PR because we now prefer sending a normal exec command approval server request whose `additional_permissions` carries the skill permissions instead.	2026-02-26 01:02:42 +00:00
pakrym-oai	4fedef88e0	Use websocket v2 as model-preferred websocket protocol (#12838 )	2026-02-25 16:35:53 -08:00
Ahmed Ibrahim	e76b1a2853	Remove steer feature flag (#12026 ) All code should go in the direction that steer is enabled --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 15:41:42 -08:00
Michael Bolin	a6a5976c5a	feat: scope execve session approvals by approved skill metadata (#12814 ) Previous to this change, `determine_action()` would 1. check if `program` is associated with a skill 2. if so, check if `program` is in `execve_session_approvals` to see whether the user needs to be prompted This PR flips the order of these checks to try to set us up so that "session approvals" are always consulted first (which should soon extend to include session approvals derived from `prefix_rule()`s, as well). Though to make the new ordering work, we need to record any relevant metadata to associate with the approval, which in the case of a skill-based approval is the `SkillMetadata` so that we can derive the `PermissionProfile` to include with the escalation. (Though as noted by the `TODO`, this `PermissionProfile` is not honored yet.) The new `ExecveSessionApproval` struct is used to retain the necessary metadata. ## What Changed - Replace the `execve_session_approvals` `HashSet` with a map that stores an `ExecveSessionApproval` alongside each approved `program`. - When a user chooses `ApprovedForSession` for a skill script, capture the matched `SkillMetadata` in the session approval entry. - Consult that cache before re-running `find_skill()`, and reuse the originally approved skill metadata and permission profile when allowing later execve callbacks in the same session.	2026-02-25 15:30:24 -08:00
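The reordered lookup in this commit (session approvals first, skill lookup second) might look roughly like the sketch below. It is a simplified illustration, not the repository code: the struct fields, the `Action` enum, the `determine_action_sketch` name, and the closure-based `find_skill` parameter are all assumptions made so the example is self-contained.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the skill metadata (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct SkillMetadata {
    name: String,
}

/// Metadata retained alongside each session-approved program.
#[derive(Debug, Clone, PartialEq)]
struct ExecveSessionApproval {
    skill: Option<SkillMetadata>,
}

/// Hypothetical outcome type for the sketch.
#[derive(Debug, PartialEq)]
enum Action {
    AllowFromSession(ExecveSessionApproval),
    PromptForSkill(SkillMetadata),
    NoSkill,
}

/// Consult the session-approval map before running the skill lookup, and
/// reuse the metadata captured at approval time on a cache hit.
fn determine_action_sketch<F>(
    program: &str,
    approvals: &HashMap<String, ExecveSessionApproval>,
    find_skill: F,
) -> Action
where
    F: Fn(&str) -> Option<SkillMetadata>,
{
    if let Some(approval) = approvals.get(program) {
        return Action::AllowFromSession(approval.clone());
    }
    match find_skill(program) {
        Some(skill) => Action::PromptForSkill(skill),
        None => Action::NoSkill,
    }
}
```

Keying the map by program and storing the approval struct as the value is what lets a later execve callback skip the skill lookup while still recovering the originally approved metadata.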
Charley Cunningham	2f4d6ded1d	Enable request_user_input in Default mode (#12735 ) ## Summary - allow `request_user_input` in Default collaboration mode as well as Plan - update the Default-mode instructions to prefer assumptions first and use `request_user_input` only when a question is unavoidable - update request_user_input and app-server tests to match the new Default-mode behavior - refactor collaboration-mode availability plumbing into `CollaborationModesConfig` for future mode-related flags ## Codex author `codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`	2026-02-25 15:20:46 -08:00
Ahmed Ibrahim	2bd87d1a75	only use preambles for realtime (#12831 ) Reverts openai/codex#12830	2026-02-25 14:54:54 -08:00
Celia Chen	b6d20748e0	Revert "Ensure shell command skills trigger approval (#12697 )" (#12721 ) This reverts commit `daf0f03ac8`. # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-02-25 22:49:53 +00:00
Ahmed Ibrahim	f86087eaa8	Revert "only use preambles for realtime" (#12830 ) Reverts openai/codex#12806	2026-02-25 14:30:48 -08:00
Ahmed Ibrahim	c1851be1ed	only use preambles for realtime (#12806 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 13:41:54 -08:00
sayan-oai	d45ffd5830	make 5.3-codex visible in cli for api users (#12808 ) 5.3-codex released in api, mark it visible for API users via bundled `models.json`.	2026-02-25 13:01:40 -08:00
Ahmed Ibrahim	3f30746237	Add simple realtime text logs (#12807 ) Update realtime debug logs to include the actual text payloads in both input and output paths. - In `core/src/realtime_conversation.rs`: - `handle_start`: add extracted assistant text output to the `[realtime-text]` debug log. - `handle_text`: add incoming text input (`params.text`) to the `[realtime-text]` debug log. No tests were run (per request).	2026-02-25 12:01:48 -08:00
Owen Lin	a0fd94bde6	feat(app-server): add ThreadItem::DynamicToolCall (#12732 ) Previously, clients would call `thread/start` with dynamic_tools set, and when a model invokes a dynamic tool, it would just make the server->client `item/tool/call` request and wait for the client's response to complete the tool call. This works, but it doesn't have an `item/started` or `item/completed` event. Now we are doing this: - [new] emit `item/started` with `DynamicToolCall` populated with the call arguments - send an `item/tool/call` server request - [new] once the client responds, emit `item/completed` with `DynamicToolCall` populated with the response. Also, with `persistExtendedHistory: true`, dynamic tool calls are now reconstructable in `thread/read` and `thread/resume` as `ThreadItem::DynamicToolCall`.	2026-02-25 12:00:10 -08:00
Rasmus Rygaard	73eaebbd1c	Propagate session ID when compacting (#12802 ) We propagate the session ID when sending requests for inference, but we don't do the same for compaction requests. This makes it hard to link compaction requests to their session for debugging purposes.	2026-02-25 19:17:38 +00:00
Michael Bolin	648a420cbf	fix: enforce sandbox envelope for zsh fork execution (#12800 ) ## Why Zsh fork execution was still able to bypass the `WorkspaceWrite` model in edge cases because the fork path reconstructed command execution without preserving sandbox wrappers, and command extraction only accepted shell invocations in a narrow positional shape. This can allow commands to run with broader filesystem access than expected, which breaks the sandbox safety model. ## What changed - Preserved the sandboxed `ExecRequest` produced by `attempt.env_for(...)` when entering the zsh fork path in [`unix_escalation.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs). - Updated `CoreShellCommandExecutor` to execute the sandboxed command and working directory captured from `attempt.env_for(...)`, instead of re-running a freshly reconstructed shell command. - Made zsh-fork script extraction robust to wrapped invocations by scanning command arguments for `-c`/`-lc` rather than only matching the first positional form. - Added unit tests in `unix_escalation.rs` to lock in wrapper-tolerant parsing behavior and keep unsupported shell forms rejected. - Tightened the regression in [`skill_approval.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/tests/suite/skill_approval.rs): - `shell_zsh_fork_still_enforces_workspace_write_sandbox` now uses an explicit `WorkspaceWrite` policy with `exclude_tmpdir_env_var: true` and `exclude_slash_tmp: true`. - The test attempts to write to `/tmp/...`, which is only reliably outside writable roots with those explicit exclusions set. ## Verification - Added and passed the new unit tests around `extract_shell_script` parsing behavior with wrapped command shapes. 
- `extract_shell_script_supports_wrapped_command_prefixes` - `extract_shell_script_rejects_unsupported_shell_invocation` - Verified the regression with the focused integration test: `shell_zsh_fork_still_enforces_workspace_write_sandbox`. ## Manual Testing Prior to this change, if I ran Codex via: ``` just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork ``` and asked: ``` what is the output of /bin/ps ``` it would run it, even though the default sandbox should prevent the agent from running `/bin/ps` because it is setuid on macOS. But with this change, I now see the expected failure because it is blocked by the sandbox: ``` /bin/ps exited with status 1 and produced no output in this environment. ```	2026-02-25 11:05:27 -08:00
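The wrapper-tolerant parsing mentioned in this commit (scanning the arguments for `-c`/`-lc` instead of matching a fixed position) can be sketched as follows. This is an illustration under stated assumptions, not the actual `extract_shell_script` from `unix_escalation.rs`: the function name `extract_script_sketch`, the accepted shell names, and the rule that the flag must directly follow the shell binary are all hypothetical choices.

```rust
use std::path::Path;

/// Scan `argv` for a `-c` or `-lc` flag that directly follows a recognized
/// shell binary and return the script argument after it.
/// Assumption: `zsh`, `bash`, and `sh` are the accepted shells here; the
/// real implementation may accept a different set.
fn extract_script_sketch<'a>(argv: &[&'a str]) -> Option<&'a str> {
    for (i, arg) in argv.iter().enumerate() {
        if *arg != "-c" && *arg != "-lc" {
            continue;
        }
        if i == 0 {
            // A flag with no preceding program cannot be a shell invocation.
            continue;
        }
        // Compare the basename so both `zsh` and `/bin/zsh` match.
        let shell = Path::new(argv[i - 1]).file_name().and_then(|n| n.to_str());
        if matches!(shell, Some("zsh" | "bash" | "sh")) {
            // The script, if present, is the argument right after the flag.
            return argv.get(i + 1).copied();
        }
    }
    None
}
```

Because the scan starts from the flag rather than from position zero, a sandbox wrapper prefix in front of the shell does not hide the script, while invocations of non-shell programs with a `-c` flag are still rejected.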
pakrym-oai	9d7013eab0	Handle websocket timeout (#12791 ) Sometimes websockets time out with a 400 error; ensure we retry the request.	2026-02-25 10:31:37 -08:00

1965 Commits