codex

mirror of https://github.com/openai/codex.git synced 2026-05-30 07:50:17 +00:00

Author	SHA1	Message	Date
jif-oai	b40ad0d84d	Remove stale rollout TODO tests (#25106 ) ## Summary Remove a stale `TODO(jif)` block of commented-out rollout listing tests that still referenced an older listing API. The current rollout listing behavior is covered by the active state DB and filesystem fallback tests, so keeping the dead commented tests just adds noise. ## Validation - `just fmt` - `just test -p codex-rollout`	2026-05-29 17:09:00 +02:00
jif-oai	27e256bc40	Handle goal usage limits from turn errors (#25095 ) ## Summary - handle goal usage-limit turn errors in the goal extension - exercise the extension path in the goal backend test ## Tests - just fmt - just test -p codex-goal-extension - just fix -p codex-goal-extension	2026-05-29 15:39:05 +02:00
jif-oai	1c55bb2702	[codex] Improve built-in tool schema docs (#24794 ) ## Summary - Clarify default, omission, and bounded behavior across built-in tool schemas, including unified exec, classic shell, Code Mode exec/wait, multi-agent, agent job, MCP resource, image, goal, plan, tool_search, and test-sync fields. - Convert update_plan status to an enum and add short field descriptions where the schema previously relied on surrounding context. - Remove the dedicated permission-approval schema test and keep only updates to existing expected-spec tests. ## Validation - Ran `just fmt`. - Ran `git diff --check`. - Did not run clippy or tests, per request. The change was evaluated for regressions [here](https://openai.slack.com/archives/C09GDSP1J9X/p1779905065496949), and none were found.	2026-05-29 13:32:19 +02:00
jif-oai	3deda3116c	fix: main (#25075 )	2026-05-29 12:53:31 +02:00
jif-oai	191c39aa75	Drop debug-client prompt state tracking (#25070 ) Deletes `codex-rs/debug-client/src/state.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:23 +02:00
jif-oai	43fa4e5d25	Remove debug-client server event reader (#25069 ) Deletes `codex-rs/debug-client/src/reader.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:19 +02:00
jif-oai	5c1387846d	Delete debug-client JSONL output helper (#25068 ) Deletes `codex-rs/debug-client/src/output.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:16 +02:00
jif-oai	e2b8ec616a	Remove the debug-client CLI entrypoint (#25067 ) Deletes `codex-rs/debug-client/src/main.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:12 +02:00
jif-oai	3d3cc5a953	Retire debug-client interactive command parsing (#25066 ) Deletes `codex-rs/debug-client/src/commands.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:09 +02:00
jif-oai	1197c7d654	Delete debug-client app-server process plumbing (#25065 ) Deletes `codex-rs/debug-client/src/client.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:05 +02:00
jif-oai	a9a92cbb0a	Remove the generated debug-client README (#25064 ) Deletes `codex-rs/debug-client/README.md` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:01 +02:00
jif-oai	fc8c723553	Drop the stale debug-client manifest (#25063 ) Deletes `codex-rs/debug-client/Cargo.toml` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:50:58 +02:00
jif-oai	8f6a945ec9	Use inject_if_running for active goal steering (#24924 ) ## Why This PR is stacked on #24918, which moves goal steering onto source-labeled internal model context fragments. Active-turn goal steering should use the same running-turn injection path as other runtime steering, so those fragments enter the pending input queue as `ResponseItem`s through the existing `Session::inject_if_running` behavior (`codex-rs/core/src/session/inject.rs`, L12-L27 at `8d6f6cdf69`) instead of through a goal-specific conversion wrapper. ## What Changed - Exposes a narrow `CodexThread::inject_if_running` bridge for callers that only hold a thread handle. - Changes `ext/goal` active-turn steering to pass `ResponseItem`s directly. - Builds goal steering prompts as contextual internal model context `ResponseItem`s before injecting them into the running turn. ## Testing Not run locally; PR metadata update only.	2026-05-29 11:24:39 +02:00
jif-oai	740d942f90	Use internal model context fragments for goal steering (#24918 ) ## Why Goal steering is one form of runtime-owned model context, but the old `<goal_context>` wrapper made the contextual-fragment hiding path goal-specific. Using a source-labeled internal context fragment gives core and extensions a shared shape for hidden model steering while keeping those prompts out of visible turn history. The change also keeps legacy `<goal_context>` messages recognized as hidden contextual input so existing stored history does not start rendering old goal-steering prompts as user-visible turn items. ## What Changed - Replaces `GoalContext` with `InternalModelContextFragment` plus a validated `InternalContextSource`. - Renders goal steering as `<codex_internal_context source="goal">...</codex_internal_context>`. - Updates core goal steering and `ext/goal` steering to inject the new internal-context fragment. - Updates contextual-fragment, event-mapping, goal, and session tests for the new wrapper. ## Test Coverage - Adds coverage for detecting the new internal model context fragment. - Preserves coverage for hiding legacy `<goal_context>` fragments. - Verifies invalid internal context sources are rejected and arbitrary context tags are not hidden. - Updates goal steering/session assertions to expect the new `source="goal"` wrapper.	2026-05-29 10:28:25 +02:00
Eric Traut	522f549922	Fix fs/watch debounce batching (#24716 ) ## Summary `fs/watch` was using a local debounce wrapper whose deadline was initialized once and then reused after the first batch. Once that stale deadline was in the past, later file changes could bypass the intended 200ms debounce and send noisier `fs/changed` notifications. This moves the debounce wrapper into `codex-file-watcher` as `DebouncedWatchReceiver`, resets the debounce deadline for each event batch, preserves pending paths across cancelled receives, and updates app-server `fs/watch` to use the shared wrapper. Fixes #24692.	2026-05-28 23:09:55 -07:00
Michael Bolin	6e10142199	fix: preserve deny-read sandboxing for safe commands (#23943 ) ## Why Permission profiles can mark filesystem entries as unreadable with `deny` rules, including glob patterns. Several shell execution paths treated known-safe commands or execpolicy `allow` rules as sufficient to run outside the filesystem sandbox. That is not valid for read-capable commands: for example, `cat` or `ls` may be reasonable to allow generally, but dropping the sandbox would also drop deny-read constraints such as `*/.env`. ## What changed - Added a shared check that treats active deny-read restrictions as incompatible with unsandboxed execution. - Kept first-attempt execution sandboxed for explicit escalation and execpolicy allow bypasses when deny-read entries are present. - Prevented no-sandbox retry after a sandbox denial when the active filesystem policy contains deny-read entries. - Updated the zsh-fork execve path so prefix-rule `allow` decisions continue inside the current sandbox when deny-read restrictions are active. ## Verification - `cargo test -p codex-core tools::sandboxing::tests` - `cargo test -p codex-core tools::runtimes::shell::unix_escalation::tests` - `cargo test -p codex-core shell_command_enforces_glob_deny_read_policy`	2026-05-28 22:49:37 -07:00
Eric Traut	56958f2512	Seed prompt history from resumed messages (#24298 ) ## Why When the TUI resumes a thread, transcript replay renders prior user messages but did not seed the composer history. That leaves the resumed session with empty in-memory prompt history, so pressing Up can fall through to persisted global history and surface a prompt from another thread. The expected behavior is that prompts from the resumed thread are recalled first, with global history only as a fallback. ## What changed - Record replayed user messages into the composer history during resume replay. - Preserve the existing persisted history format and avoid any startup history scan. - Add focused TUI coverage showing replayed prompts are recalled before persisted global history. ## Validation - Added `replayed_user_messages_seed_composer_history` in `codex-rs/tui/src/chatwidget/tests/history_replay.rs`. - `just test -p codex-tui replayed_user_messages_seed_composer_history` passed.	2026-05-28 22:08:05 -07:00
xl-openai	f0a839ea0c	Add runtime extra skill roots API (#24977 ) ## Summary - Add v2 `skills/extraRoots/set` to replace app-server process-local standalone skill roots. The setting is not persisted, accepts missing roots, and `extraRoots: []` clears the runtime set. - Wire runtime roots into core skill discovery for `skills/list` and turn loads, clear skill caches on set, and register the roots with the skills watcher so later filesystem changes emit `skills/changed`. - Update app-server docs, generated JSON/TypeScript schemas, and coverage for serialization, missing roots, empty clears, and restart behavior. ## Testing - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core-skills` - `cargo test -p codex-app-server skills_extra_roots_set_updates_process_runtime_roots` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core-skills` - `just fix -p codex-app-server`	2026-05-28 21:14:34 -07:00
Adrian	42c80385cd	[codex] Avoid PowerShell safety parsing off Windows (#24946 ) ## Summary This fixes BUGB-17567 by preventing non-Windows command safety classification from invoking the Windows PowerShell safelist/parser path. Previously, `is_known_safe_command` called the Windows PowerShell classifier on every platform. That classifier recognizes `pwsh`/`powershell` by basename and delegates script parsing to the PowerShell AST parser. The parser starts the supplied executable, so on macOS/Linux a repository-controlled `pwsh` path could execute during safety parsing before the normal sandboxed command execution path. The change gates the Windows PowerShell classifier and module behind `#[cfg(windows)]`. On macOS/Linux, PowerShell-looking commands are no longer auto-approved by the Windows classifier and instead fall through to the normal non-Windows safe-command logic. ## Validation - `/private/tmp/codex-tools/bin/just fmt` - `PATH=/private/tmp/codex-tools/bin:$PATH /private/tmp/codex-tools/bin/just test -p codex-shell-command` The focused test run passed 135 tests with 0 skipped and completed the crate bench-smoke step. ## Notes This PR is scoped to the BUGB-17567 macOS/Linux path. Windows still uses the PowerShell classifier; a separate hardening follow-up should ensure Windows safety parsing only executes a trusted PowerShell parser binary and does not spawn the command's `argv[0]` when that path may be repository-controlled.	2026-05-29 03:00:35 +00:00
viyatb-oai	bf72be5927	fix(config): use deny for Unix socket permissions (#24970 ) ## Why Unix socket permissions still accepted and displayed `"none"` while file permissions use the clearer `"deny"` spelling. This keeps network Unix socket policy vocabulary consistent with filesystem policy vocabulary. ## What changed - Replace the Unix socket permission variant and serialized spelling from `none` to `deny` across config, feature configuration, and network proxy types. - Update app-server v2 serialization, TUI debug output, focused tests, and generated schemas to expose `"deny"`. - Add coverage for denied Unix socket entries in managed requirements and profile overlay behavior. ## Security This is a vocabulary change for explicit Unix socket rejection, not a network access expansion. Denied entries continue to be omitted from the effective allowlist. ## Validation - `just fmt` - `just write-config-schema` - `just write-app-server-schema` - `just test -p codex-config -p codex-core -p codex-app-server-protocol -p codex-tui -E 'test(network_requirements_are_preserved_as_constraints_with_source) \| test(network_permission_containers_project_allowed_and_denied_entries) \| test(network_toml_overlays_unix_socket_permissions_by_path) \| test(permissions_profiles_resolve_extends_parent_first_with_child_overrides) \| test(network_requirements_serializes_canonical_and_legacy_fields) \| test(debug_config_output_formats_unix_socket_permissions)'` - Automatic `bench-smoke` follow-up from `just test` - `cargo clippy -p codex-config -p codex-core -p codex-features -p codex-network-proxy -p codex-app-server-protocol -p codex-app-server -p codex-tui --all-targets -- -D warnings`	2026-05-28 23:53:26 +00:00
Anton Panasenko	912d7d4f75	feat(app-server): migrate remote control to server tokens (#24141 ) ## Why `codex-backend` now authenticates remote-control server websocket connections with short-lived server tokens instead of the user's ChatGPT access token. `app-server` needs to mint and refresh those server tokens without persisting them, so a restart can reconnect from durable enrollment identity while keeping the bearer token memory-only. ## What Changed Updated the remote-control transport to consume `remote_control_token` and `expires_at` from server enroll responses and added `/server/refresh` support for persisted enrollments or expiring cached tokens. Websocket handshakes now send `Authorization: Bearer <remote_control_token>` with the existing server identity headers, and no longer send the ChatGPT bearer token or `chatgpt-account-id` on that websocket path. The in-memory enrollment state now owns the ephemeral server token cache, while SQLite still persists only `server_id`, `environment_id`, and `server_name`. Websocket `401`/`403` clears only the cached token for refresh on reconnect; websocket or refresh `404` clears stale persisted enrollment and re-enrolls. Response body previews redact `remote_control_token` before surfacing parse errors. ## Verification - `just test -p codex-app-server-transport` - Manual prod smoke with an isolated `CODEX_HOME`: `codex remote-control --json -c 'chatgpt_base_url="https://chatgpt.com/backend-api"'` reached `status:"connected"` with `environmentId:"env_i_6a17d9f1d764832986da2e80f4554f1b"`.	2026-05-28 15:57:08 -07:00
Abhinav	a576be2b73	Tighten hook output event schemas (#24962 ) # Why Fixes #23993. Hook command output schemas are published as the contract for hook authors and schema-driven tooling. The event-specific output schemas previously described `hookSpecificOutput.hookEventName` as the global `HookEventNameWire` enum, so a `pre-tool-use.command.output` schema would validate mismatched values like `PostToolUse`. That made the schemas less precise than the intended event-specific contract. # What Constrain each hook-specific output schema to the matching literal `hookEventName` value, mirroring the existing input-schema shape. Also split `SubagentStartHookSpecificOutputWire` from the session-start output wire so `subagent-start.command.output.schema.json` can emit `const: "SubagentStart"` instead of sharing the session-start definition. # Verification - `cargo nextest run -p codex-hooks` - `just fix -p codex-hooks` - `just argument-comment-lint -p codex-hooks -- --all-targets`	2026-05-28 15:55:40 -07:00
Michael Bolin	bcf2b55957	windows-sandbox: fix capture cancellation test roots (#24974 ) ## Why The Windows Bazel job on `main` started failing after #24108 because one Windows-only capture test still passed `cwd.as_path()` to `run_windows_sandbox_capture`. That helper now expects the explicit `workspace_roots` slice introduced by #24108, so the Windows test target no longer compiled. ## What Changed - Updates `legacy_capture_cancellation_is_not_reported_as_timeout` to pass `workspace_roots_for(cwd.as_path()).as_slice()`, matching the adjacent capture test and the new runner signature. ## Verification - GitHub Actions CI is the important validation for this Windows-only compile path. - Created quickly to get Windows CI running while the separate Ubuntu `compact_resume_fork` timeout is still under investigation.	2026-05-28 15:51:27 -07:00
Michael Bolin	986c60467b	windows-sandbox: pass workspace roots to runner (#24108 ) ## Why #23813 switches the Windows sandbox runner path to `PermissionProfile`, but it still left one runtime anchor for resolving symbolic `:workspace_roots` entries. That is not enough once a turn has multiple effective workspace roots: exact entries and deny globs under `:workspace_roots` need to be materialized for every runtime root before the command runner chooses token mode or builds ACL plans. ## What Changed - Replaces the Windows runner/setup `permission_profile_cwd` plumbing with `workspace_roots: Vec<AbsolutePathBuf>`. - Resolves Windows-local `PermissionProfile` data with `materialize_project_roots_with_workspace_roots(...)` instead of the single-cwd helper. - Threads `Config::effective_workspace_roots()` through core execution, unified exec, TUI setup/read-grant flows, app-server setup, app-server `command/exec`, and `debug sandbox` on Windows. - Preserves those workspace roots through the zsh-fork escalation executor instead of rebuilding them from `sandbox_policy_cwd`. - Makes `ExecRequest::new(...)` and the remaining `build_exec_request(...)` helper path take `windows_sandbox_workspace_roots` explicitly so new call sites cannot silently fall back to `vec![cwd]`. - Clarifies the `debug sandbox` non-Windows comment: remaining cwd-dependent resolution still uses `sandbox_policy_cwd`, while `:workspace_roots` entries are already materialized from config roots. - Updates elevated runner IPC `SpawnRequest` to send `workspace_roots` and bumps the framed IPC protocol version to `3` for the payload shape change. - Adds Windows-local resolver coverage for expanding exact and glob `:workspace_roots` entries across multiple roots, plus core helper coverage proving explicit roots are preserved. 
## Verification - `cargo check -p codex-windows-sandbox -p codex-core -p codex-tui -p codex-cli -p codex-app-server` - `cargo test -p codex-windows-sandbox` - `cargo test -p codex-core windows_sandbox` - `cargo test -p codex-core unix_escalation` - `cargo test -p codex-app-server windows_sandbox` - `cargo test -p codex-tui windows_sandbox` - `cargo test -p codex-cli debug_sandbox` - `just test -p codex-core unified_exec` - `just test -p codex-core build_exec_request_preserves_windows_workspace_roots` - `env -u CODEX_NETWORK_PROXY_ACTIVE -u CODEX_NETWORK_ALLOW_LOCAL_BINDING just test -p codex-app-server --lib command_exec` - `just test -p codex-windows-sandbox` - `just test -p codex-exec sandbox` - `just fix -p codex-core -p codex-app-server -p codex-windows-sandbox` A local macOS cross-check with `cargo check --target x86_64-pc-windows-msvc ...` did not reach crate Rust code because native dependencies require Windows SDK headers (`windows.h` / `assert.h`) in this environment; Windows CI remains the real target validation. Two local targeted filters compile but do not run assertions on macOS: `env -u CODEX_NETWORK_PROXY_ACTIVE -u CODEX_NETWORK_ALLOW_LOCAL_BINDING just test -p codex-app-server --lib command_exec_processor` matched zero tests, and `just test -p codex-linux-sandbox landlock` matched zero tests because the landlock suite is Linux-only.	2026-05-28 15:26:55 -07:00
Michael Bolin	e7dda8070e	Surface filesystem permission profiles in prompt context (#23924 ) ## Summary Some permission profiles can encode filesystem reads that should remain unavailable to the agent. Before this change, the model-visible context and automatic approval review prompt summarized the effective permissions as a legacy sandbox mode, which can omit permission-profile filesystem entries from escalation decisions. For example, a profile can grant workspace access while denying a private subtree across every workspace root: ```toml default_permissions = "restricted-workspace" [permissions.restricted-workspace.workspace_roots] "/Users/alice/project" = true "/Users/alice/other-project" = true [permissions.restricted-workspace.filesystem] ":minimal" = "read" [permissions.restricted-workspace.filesystem.":workspace_roots"] "." = "write" "private" = "deny" "private/" = "deny" ``` The context window now describes the workspace roots and effective filesystem side of the `PermissionProfile` directly, with deny entries marked as non-escalatable: ```xml <environment_context> <cwd>/Users/alice/project</cwd> <shell>zsh</shell> <filesystem><workspace_roots><root>/Users/alice/project</root><root>/Users/alice/other-project</root></workspace_roots><permission_profile type="managed"><file_system type="restricted"><entry access="read"><special>:minimal</special></entry><entry access="write"><path>/Users/alice/project</path></entry><entry access="write"><path>/Users/alice/other-project</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/project/private</path></entry><entry access="deny" escalatable="false"><path>/Users/alice/other-project/private</path></entry><entry access="deny" escalatable="false"><glob>/Users/alice/project/private/</glob></entry><entry access="deny" escalatable="false"><glob>/Users/alice/other-project/private/</glob></entry></file_system></permission_profile></filesystem> </environment_context> ``` Managed requirements can impose the 
same kind of deny-read restriction: ```toml [permissions.filesystem] deny_read = [ "/Users/alice/project/private", "/Users/alice/project/private/", ] ``` The automatic approval review prompt also receives the parent turn's denied-read context, so review decisions can account for the active permission profile. ## What Changed - Render the effective filesystem profile in `<environment_context>`, including profile type, filesystem entries, workspace roots, and non-escalatable deny entries. - Persist effective `workspace_roots` in `TurnContextItem` so resumed/replayed context does not have to bind `:workspace_roots` through legacy `cwd` fallback. - Add explicit permission instructions that denied reads are policy restrictions, not escalation targets. - Pass the parent turn's denied-read context into automatic approval reviews. - Add targeted coverage for prompt rendering, workspace-root materialization, replay context, and review prompt context. - Keep the prompt-context test expectations platform-aware so the same filesystem rendering assertions pass on Unix and Windows paths. ## Testing - `just test -p codex-core context::environment_context::tests::serialize_environment_context_with_full_filesystem_profile` - `just test -p codex-core context::environment_context::tests::turn_context_item_filesystem_uses_workspace_roots_instead_of_cwd` - `just test -p codex-core context::permissions_instructions::permissions_instructions_tests::builds_permissions_from_profile_with_denied_reads` - `just fix -p codex-core` I also attempted `just test -p codex-core`; the changed prompt-context tests passed, but the full local run did not complete cleanly in this sandboxed macOS environment due to unrelated user-shell `CODEX_SANDBOX*` expectations and integration-test timeouts.	2026-05-28 14:56:53 -07:00
Alexi Christakis	e92c952b2e	[codex] Add user input client ids (#24653 ) ## Summary Adds an optional `clientId` field to app-server v2 `UserInput` and carries it through the core `UserInput` model so clients can correlate echoed user input items without relying on payload equality. ## Details - Adds `client_id: Option<String>` to core `UserInput` variants. - Exposes the v2 app-server field as `clientId` on the wire and in generated TypeScript. - Preserves the id when converting between app-server v2 and core protocol types. - Regenerates app-server schema fixtures. ## Validation - `just fmt` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-protocol` - `git diff --check`	2026-05-28 14:54:39 -07:00
viyatb-oai	a027135bc6	fix(exec-server): reject websocket requests with Origin headers (#24947 ) ## Why `codex exec-server` has a local WebSocket listener, but it did not apply the same browser-origin request handling as the `app-server` WebSocket transport. Requests that carry an `Origin` header should not be upgraded by this local transport, keeping both local WebSocket servers consistent and avoiding unexpected browser-initiated connections. ## What changed - Added an Axum middleware guard in `codex-rs/exec-server/src/server/transport.rs` that returns `403 Forbidden` for requests carrying an `Origin` header. - Added an integration test in `codex-rs/exec-server/tests/websocket.rs` that covers rejection of an `Origin`-bearing WebSocket handshake. - Kept ordinary WebSocket clients unchanged: existing no-`Origin` initialization and process behavior remains covered by the crate tests. ## Validation - `just test -p codex-exec-server` test phase (`186 passed`; run outside the parent macOS sandbox so nested sandbox tests can execute) - `just clippy -p codex-exec-server`	2026-05-28 14:44:14 -07:00
viyatb-oai	3cf737e4e3	fix: cancel Windows sandbox on network denial (#19880 ) ## Why When Guardian or the sandbox network proxy detects and denies a network attempt, core cancels the associated execution through `ExecExpiration`. The Windows sandbox capture path was only forwarding the timeout component of that expiration state. As a result, a sandboxed Windows command whose network attempt had already been denied could keep running until its timeout elapsed rather than terminating promptly in response to the denial. This change closes that cancellation-propagation gap for Windows sandbox execution. ## What changed - Added `WindowsSandboxCancellationToken` as the cancellation hook exposed to Windows capture backends. - Extracted the cancellation token from `ExecExpiration` in core and passed it to both the direct and elevated Windows sandbox capture paths alongside the existing timeout. - Updated direct capture to poll for either process exit, timeout, or cancellation and to terminate cancelled processes without reporting them as timed out. - Updated elevated capture to watch for cancellation and send the existing `Terminate` IPC frame to the elevated runner. The watcher parks for 50 ms between checks to bound response latency without a tight busy wait. - Added Windows regression coverage for a long-running PowerShell command: cancellation ends capture before its timeout and does not set `timed_out`. - Added a visible skip diagnostic when that PowerShell-dependent regression test cannot execute, and consolidated the duplicated expiration-policy branch identified in review. ## Security This improves enforcement after a denied network attempt has been attributed to a Windows sandboxed execution: the command no longer remains alive simply because Windows capture lost the cancellation signal. This PR does not claim to make Windows offline mode an airtight no-network or no-exfiltration boundary. 
It does not introduce AppContainer or change how network denial is detected; it makes an already-detected denial promptly stop the affected sandboxed command. ## Validation ### Commands run - `just fmt` - `cargo test -p codex-windows-sandbox` - `cargo test -p codex-core network_denial` - `cargo clippy -p codex-core -p codex-windows-sandbox --tests --no-deps -- -D warnings` - `just argument-comment-lint -p codex-windows-sandbox -p codex-core` The new capture regression is `cfg(target_os = "windows")`, so Windows CI is the execution coverage for that test path. The local macOS test runs validate the host-runnable crate and core network-denial behavior. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-28 21:28:06 +00:00
Michael Bolin	bc10e5b390	runtime: prepend zsh fork bin dir to PATH (#23768 ) ## Why #23756 makes packaged Codex builds include and default to the bundled zsh fork. The important reason to put that fork's directory at the front of `PATH` is to keep executable-level escalation working after a command leaves the original shell and later re-enters zsh through `env`. The expected chain is: 1. The zsh fork runs the top-level shell command. 2. That command launches another program, such as `python3`, while inheriting the `EXEC_WRAPPER` environment and the escalation socket fd. 3. That program spawns a shell script whose shebang is `#!/usr/bin/env zsh` rather than `#!/bin/zsh`, and it does not close the escalation fd. 4. `/usr/bin/env` resolves `zsh` through `PATH`, so it must find the packaged zsh fork before the system zsh. 5. Commands inside that nested script are intercepted by the zsh fork and can still request escalation from Codex. If `PATH` resolves `zsh` to the system shell instead, the nested script loses zsh-fork exec interception. Commands that should request escalation can then run only in the original sandbox, or fail there, without Codex ever receiving the approval request. Shell snapshots make this slightly more subtle: a snapshot can restore an older `PATH` after the child shell starts. This PR treats the zsh fork `PATH` prepend as an explicit environment override so snapshot wrapping preserves it. ## What Changed - Added shared zsh-fork runtime helpers that prepend the configured zsh executable parent directory to `PATH` without duplicate entries. - Applied the zsh fork `PATH` prepend to both zsh-fork `shell_command` launches and unified-exec zsh-fork launches before sandbox command construction. - Kept the shell-command zsh-fork backend API narrow: it derives the configured zsh path from session services and rebuilds its sandbox environment from `req.env`, rather than accepting a second, competing environment map or a separately threaded bin dir. 
- Kept Unix-only zsh-fork `PATH` mutation out of Windows clippy-visible mutability. - Added coverage for duplicate `PATH` entries, for preserving the zsh fork prepend through shell snapshot wrapping, and for the nested `python3` -> `#!/usr/bin/env zsh` escalation flow. ## Testing - `just fmt` - `just fix -p codex-core` I left final test validation to CI after the latest review-comment cleanup. Before that cleanup, `just test -p codex-core zsh_fork` passed locally for the zsh-fork-focused tests.	2026-05-28 14:10:40 -07:00
Celia Chen	0a8c835845	[codex] Remove Bedrock OSS models from catalog (#24960 ) Remove the GPT OSS 120B and 20B entries from the Amazon Bedrock static model catalog, as they are no longer supported.	2026-05-28 14:10:26 -07:00
iceweasel-oai	d9f53128b7	[codex] Handle PowerShell UTF-8 setup failures (#24949 ) Fixes #12496. ## Why Windows sandboxed PowerShell commands can run under `ConstrainedLanguage` on some machines, especially enterprise-managed Windows environments. In that mode, our PowerShell command prelude could fail before every command because it directly assigned `[Console]::OutputEncoding` to UTF-8. The actual user command still ran, but Codex surfaced noisy `Cannot set property. Property setting is supported only on core types in this language mode.` output for every shell call. ## What Changed - Makes the PowerShell UTF-8 output encoding prelude best-effort by wrapping the assignment in `try { ... } catch {}`. - Keeps the existing UTF-8 behavior when PowerShell allows the assignment. - Adds focused tests for adding the prelude and avoiding duplicate prelude insertion. ## Validation - `cargo fmt -p codex-shell-command` - `cargo check -p codex-shell-command` - `git diff --check` - Verified a local `ConstrainedLanguage` PowerShell probe prints only the command output with no property-setting error. - Verified `codex exec` from a temporary `chcp 437` context reports `utf-8` / `65001` and preserves non-ASCII output (`café`, `漢字`).	2026-05-28 13:58:20 -07:00
Felipe Coury	2e0c4f4977	fix(tui): prevent repository-configured code execution in /diff (#24954 ) ## Why `/diff` is intended to display working-tree changes, but its Git invocations honored repository-selected executable helpers. A repository could configure diff/text conversion helpers, clean/process filters, `core.fsmonitor`, or `post-index-change` hooks that execute when a user runs `/diff`. Fixes [PSEC-4395](https://linear.app/openai/issue/PSEC-4395/codex-cli-diff-executes-repository-selected-diff-helpers). ## What Changed - Pass `--no-textconv` and `--no-ext-diff` for tracked and untracked diff generation. - Discover configured `filter.<driver>.clean` and `.process` entries, then neutralize the selected drivers through structured `GIT_CONFIG_KEY_` / `GIT_CONFIG_VALUE_` overrides, including driver names containing `=`. - Run all `/diff` Git probes with `core.fsmonitor=false` and a null `core.hooksPath`. - Use short submodule reporting while ignoring dirty submodule worktrees, since inspecting a checked-out submodule for dirtiness can execute filters from that child repository. This intentionally omits dirty-only submodule markers in order to preserve the non-executing security boundary. - Add real-Git marker tests covering filters, fsmonitor, hooks, and configured helpers inside checked-out submodules. ## How to Test 1. In a repository with ordinary tracked and untracked edits, run `/diff`. 2. Confirm the normal working-tree diff is shown for top-level files. 3. Run the targeted tests below; they configure executable marker helpers for repository filters, fsmonitor, hooks, and a checked-out submodule, then verify `/diff` does not invoke them. 4. Confirm a dirty-only submodule does not cause Codex to enter the submodule and execute its configured helper. 
Targeted tests: - `just test -p codex-tui get_git_diff_` Validation note: `just test -p codex-tui` runs the new coverage, but this worktree currently also has two unrelated failing guardian tests: `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` and `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`.	2026-05-28 16:53:59 -03:00
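The hardening steps listed in this commit can be sketched as a command builder. This is a minimal Python illustration under stated assumptions, not the `codex-tui` implementation: the function name `hardened_diff_command` is hypothetical, using the platform null device for `core.hooksPath` is an assumption, and so is neutralizing a filter driver by overriding its `clean` and `process` commands with an empty string. The Git flags and the `GIT_CONFIG_COUNT` / `GIT_CONFIG_KEY_<n>` / `GIT_CONFIG_VALUE_<n>` environment variables are standard Git features.

```python
import os


def hardened_diff_command(filter_drivers):
    """Build (argv, env_overrides) for a `git diff` that avoids repo-selected helpers.

    `filter_drivers` is the list of driver names discovered in the repository
    configuration (discovery itself is not shown here).
    """
    argv = [
        "git",
        "-c", "core.fsmonitor=false",            # do not launch an fsmonitor helper
        "-c", f"core.hooksPath={os.devnull}",    # assumption: null device as hooks path
        "diff",
        "--no-textconv",                         # skip textconv conversion helpers
        "--no-ext-diff",                         # skip external diff drivers
        "--submodule=short",                     # short submodule reporting
        "--ignore-submodules=dirty",             # do not inspect submodule worktrees
    ]

    # Structured overrides keep key and value separate, so a driver name that
    # contains "=" cannot be misparsed the way a `-c key=value` string could be.
    pairs = []
    for name in filter_drivers:
        pairs.append((f"filter.{name}.clean", ""))    # assumption: empty disables it
        pairs.append((f"filter.{name}.process", ""))

    env = {"GIT_CONFIG_COUNT": str(len(pairs))}
    for index, (key, value) in enumerate(pairs):
        env[f"GIT_CONFIG_KEY_{index}"] = key
        env[f"GIT_CONFIG_VALUE_{index}"] = value
    return argv, env


argv, env = hardened_diff_command(["lfs", "a=b"])
```

A caller would merge `env` into the child process environment, for example `subprocess.run(argv, env={**os.environ, **env})`. The builder itself runs nothing, which keeps the sketch easy to inspect.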
Adam Perry @ OpenAI	b90ec46387	Add `codex app-server --stdio` alias (#24940 ) ## Summary - Add `--stdio` as a direct alias for `codex app-server --listen stdio://`. - Keep `--stdio` and `--listen` mutually exclusive. - Update the app-server README to document both forms.	2026-05-28 12:43:30 -07:00
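The alias semantics in this commit can be illustrated with a small analogue. This is a Python `argparse` sketch, not the project's actual command-line definition: the parser name, help strings, and the absence of a default listen address are assumptions. It shows `--stdio` resolving to the same value as `--listen stdio://` while the two flags remain mutually exclusive.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the `--stdio` / `--listen` relationship."""
    parser = argparse.ArgumentParser(prog="app-server")
    transport = parser.add_mutually_exclusive_group()
    transport.add_argument(
        "--listen",
        metavar="URL",
        default=None,
        help="address to listen on, e.g. stdio://",
    )
    # The alias writes into the same destination, so downstream code only
    # ever reads `args.listen`.
    transport.add_argument(
        "--stdio",
        dest="listen",
        action="store_const",
        const="stdio://",
        help="alias for --listen stdio://",
    )
    return parser


alias_args = build_parser().parse_args(["--stdio"])
explicit_args = build_parser().parse_args(["--listen", "stdio://"])
```

Passing both flags together makes `argparse` exit with a usage error, which corresponds to the mutual exclusivity the commit describes.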
Adam Perry @ OpenAI	9dd39f334e	Move Bazel Windows jobs onto codex-runners (#24952 ) The codex-windows runner group should be much faster than the default GHA runners. Since Bazel jobs on Windows are frequently the long pole for PR checks, this will hopefully get people landing a bit faster.	2026-05-28 12:43:04 -07:00
Won Park	ecb41fcb64	Add feature-gated standalone image generation extension (#24723 ) ## Why Add a standalone image generation path that can be exercised independently of hosted Responses image generation, while retaining the hosted tool as fallback unless the extension is actually available to the model. ## What changed - Added the `codex-image-generation-extension` crate with standalone generate/edit execution, prior-image selection for edits, model-visible image output, and local generated-image persistence. - Installed the extension in app-server behind the disabled-by-default `imagegenext` feature and backend eligibility checks. - Updated core tool planning so eligible `image_gen.imagegen` exposure replaces hosted `image_generation`, while unavailable configurations retain hosted fallback. - Added coverage for extension behavior, edit history reuse, feature gating, auth eligibility, and hosted-tool replacement. - The extension is installed through app-server only in this PR; other execution paths retain hosted image generation because hosted replacement occurs only when the standalone executor is actually registered and model-visible. - The initial extension contract intentionally fixes the image model to `gpt-image-2` and uses automatic image parameters. - Native generated-image history/card parity and rollout persistence cleanup are intentionally deferred follow-up work. ## Validation - `just test -p codex-image-generation-extension` - `just test -p codex-features` - `just test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates` - `just test -p codex-app-server` - `just fix -p codex-image-generation-extension -p codex-features -p codex-core -p codex-app-server` - `just fmt` - `just bazel-lock-update` - `just bazel-lock-check` --------- Co-authored-by: jif-oai <jif@openai.com>	2026-05-28 11:44:55 -07:00
jif-oai	462deb0426	Wire task completion into thread-idle lifecycle (#24928 ) ## Why #24744 introduced the thread idle lifecycle hook so idle continuation can be owned by lifecycle contributors instead of hard-coded goal runtime plumbing. Task completion still called `goal_runtime_apply(GoalRuntimeEvent::MaybeContinueIfIdle)` directly, so the post-turn idle transition remained goal-specific and did not notify generic thread lifecycle contributors. ## What Changed - Add `Session::emit_thread_idle_lifecycle_if_idle()` to gate idle emission on both no active turn and no queued trigger-turn mailbox work. - Call that helper when a task clears the active turn, replacing the direct `GoalRuntimeEvent::MaybeContinueIfIdle` path. - Cover the behavior with `codex-core` session tests for emitting after task completion and suppressing idle emission while trigger-turn mailbox work is pending. ## Verification - New tests in `core/src/session/tests.rs` exercise the idle lifecycle emission and trigger-turn mailbox guard.	2026-05-28 20:05:41 +02:00
Adam Perry @ OpenAI	c2508db60d	Revert "Add app-server startup benchmark crate" (#24937 ) Reverts openai/codex#24651, which broke the musl job: https://github.com/openai/codex/actions/runs/26585495205/job/78330166927	2026-05-28 17:49:41 +00:00
canvrno-oai	6c1215dac6	TUI: Unified mentions tweaks + polish mentions rendering (#23363 ) This change keeps unified @mentions behind the mentions_v2 gate, moves the flag to under-development, and polishes mention rendering/history behavior. It also adds a few small improvements to the mentions feature around mention rendering and history round-tripping for plugin/tool mentions in message edit scenarios. Plugin selections now insert `@` mentions with better casing, and saved history preserves the visible sigil so recalled messages look the same as what the user typed. - Preserves `@` sigils when encoding/decoding mention history for tool/plugin paths. - Improves plugin mention insertion so display names/casing are reflected more cleanly in the composer. - Updates the composer to render user-entered plugin mentions in the same color as the mentions menu. This also applies to recalled/edited messages. - Left/right arrows no longer switch unified-mention search modes after an @mention has already been accepted (e.g., arrowing left through a composed message that contains @mentions). - Keeps bound mentions stable around punctuation, so accepted `@` mentions do not reopen the popup and punctuated `$` mentions still persist to cross-session history. Steps to test - Ensure mentions_v2 is enabled through configuration or `--enable mentions_v2`. - Type `@` in the TUI composer and verify filesystem/plugin/skill results are displayed in the unified mentions menu. - Select a plugin mention from the `@` popup and confirm the inserted text is an `@...` mention with the expected casing, then recall/edit the message and confirm it still renders as `@...`. - Mention a skill and verify that skills still insert as `$skill` mentions rather than `@` mentions. - Verify punctuated mentions such as `@plugin.` and `($skill)` keep their bound mention behavior across editing and history recall.	2026-05-28 10:30:15 -07:00
Celia Chen	489bf38658	chore: add GPT-5.5 to the Amazon Bedrock catalog (#24701 ) ## Summary Amazon Bedrock should expose GPT-5.5 alongside GPT-5.4, and the Bedrock GPT entries should stay aligned with the canonical bundled OpenAI model metadata instead of carrying a separate hand-written copy that can drift over time. This change will be merged when the model is online. This change: - Adds the Bedrock Mantle model id for `openai.gpt-5.5`. - Builds the Bedrock GPT-5.5 and GPT-5.4 catalog entries from the bundled OpenAI model catalog, then overrides the Bedrock-facing slug, explicit priority, and Bedrock-specific context windows. - Hardcodes both `context_window` and `max_context_window` to `272000` for Bedrock GPT-5.5 and GPT-5.4. - Keeps `openai.gpt-5.5` as the default Bedrock model ahead of `openai.gpt-5.4` and the Bedrock OSS models.	2026-05-28 10:29:06 -07:00
Gabriel Peal	577ec03bf8	[codex] Support ui visibility meta for tools (#24700 ) ## Summary Adds support for the same ui.visibility metadata as resources [spec](https://github.com/modelcontextprotocol/ext-apps/blob/main/specification/draft/apps.mdx#resource-discovery)	2026-05-28 10:24:03 -07:00
Michael Bolin	2264fdd4a2	Fix extension turn item emitter test event ordering (#24936 ) ## Why PR #24813 added extension `TurnItemEmitter` coverage and introduced a test that records a conversation history item before asserting extension-emitted turn item events. `record_conversation_items()` also emits a `RawResponseItem` event to observers. The test was reading from the same event receiver and expected the next event to be `ItemStarted`, so the test failed reliably once the setup history item was present. ## What Changed Update `passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call` to consume and assert the expected setup `RawResponseItem` before checking the extension `ItemStarted`, `WebSearchBegin`, `ItemCompleted`, and `WebSearchEnd` events. This is test-only and does not change extension runtime behavior. ## Verification - `cargo nextest run --no-fail-fast -p codex-core tools::handlers::extension_tools::tests::passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call`	2026-05-28 09:59:34 -07:00
jif-oai	e2551a5e36	Reap stale multi-agent slots (#24903 ) ## Summary - Let `close_agent` clean up an agent that is still registered in `AgentRegistry` even when its underlying thread is already missing. - Preserve the explicit-close boundary: for known stale thread-spawn agents, mark the persisted spawn edge `Closed`, then treat `ThreadNotFound` / `InternalAgentDied` as a successful close so the registry slot can be released. - Add a regression for MultiAgentV2 task-name targets where `close_agent("worker")` succeeds after the worker thread has already disappeared. ## Motivation A worker can disappear from `ThreadManager` while its metadata still exists in the root `AgentRegistry`. Before this change, the close tool failed while trying to subscribe to the missing thread status, so it never reached the cleanup path that releases the registered agent slot. With `agents.max_threads = 1`, an explicit close of that stale task-name agent could fail and leave the session unable to spawn a replacement. ## Scope This PR intentionally does not add automatic stale-agent reaping to `spawn_agent`, `resume_agent`, or `list_agents`. A thread being missing from `ThreadManager` is not the same as an explicit close: persisted open spawn edges are still the durable source of truth for resume and task-name ownership until `close_agent` is called. ## Validation - `just test -p codex-core -E 'test(multi_agent_v2_close_agent_reaps_stale_task_name_target) \| test(resume_agent_from_rollout_reopens_open_descendants_after_manager_shutdown)'` - `just fix -p codex-core`	2026-05-28 18:48:43 +02:00
Gabriel Peal	8a827d6426	Expose MCP server info as part of server status (#24698 ) # Summary Expose MCP server info via App Server (when available) so apps can render a richer MCP experience	2026-05-28 09:38:34 -07:00
Brent Traut	2a1158b8e2	feat(app-server): include turns page on thread resume (#23534 ) ## Summary The client currently calls `thread/resume` to establish live updates and immediately follows it with `thread/turns/list` to hydrate recent turns. This lets `thread/resume` return that page directly, eliminating a round trip and the ordering/deduplication gap between the two calls. Experimental clients opt in with `initialTurnsPage: { limit, sortDirection, itemsView }`. The response returns `initialTurnsPage` as a `TurnsPage`, including cursors for paging further back in history. Keeping the controls in a nested opt-in object provides the useful `thread/turns/list` knobs without spreading page-specific parameters across `thread/resume`. ## Verification - `just fmt` - `just write-app-server-schema --experimental` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_resume_initial_turns_page_matches_requested_turns_list_page --tests` - `cargo test -p codex-app-server thread_resume_rejoins_running_thread_even_with_override_mismatch --tests` - `just fix -p codex-app-server-protocol -p codex-app-server`	2026-05-28 09:18:13 -07:00
sayan-oai	2066874415	extension-api: add TurnItemEmitter to tool calls (#24813 ) ## Why Extension-contributed tools need to emit visible turn items through Codex's normal event and persistence pipeline. ## What - Add `TurnItemEmitter` to extension `ToolCall`s and route the core implementation through `Session::emit_turn_item_*`. - Hold weak session and turn references so retained tool calls cannot keep host state alive. - Provide a no-op emitter for extension test callers. ## Test Plan - `just test -p codex-core -E 'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'` --------- Co-authored-by: jif-oai <jif@openai.com>	2026-05-28 09:13:43 -07:00
Adam Perry @ OpenAI	a061befb46	Remove libubsan CI workaround (#24782 ) It seems that this was added to allow rustc to load proc macros that had been compiled with UBSan enabled, which zig does for debug and `ReleaseSafe` builds. When zig drives the link of the final binary it knows to include the ubsan runtime, but our zig-built artifacts are being linked into a binary whose linking rustc drives. This removes the libubsan workaround we have and replaces it with `-fno-sanitize=undefined` passed to zig. The new argument is passed at the end of zig's args so should take precedence over any earlier arguments from the script's caller.	2026-05-28 15:49:01 +00:00
jif-oai	e426d48f6d	Gate goal tools by thread eligibility (#24925 ) ## Why Goal tools create and update goal state for a persistent thread. The extension was only checking whether goals were enabled before advertising those tools, which meant they could be surfaced in contexts that should not receive thread goal controls: ephemeral threads without persistent thread state and review subagents. Those sessions can still run the goal extension lifecycle, but the thread tools should only be visible when the current thread can safely use them. ## What changed - Adds a `GoalRuntimeConfig` that separates goal enablement from whether goal tools are available for the current thread. - Computes tool eligibility on thread start from `persistent_thread_state_available` and `SessionSource`, hiding tools for review subagents. - Uses `GoalRuntimeHandle::tools_visible()` when contributing thread tools so enabled runtime state does not automatically imply tool exposure. - Adds backend coverage for hiding goal tools on ephemeral threads and review subagents. ## Testing - Added `goal_tools_hidden_for_ephemeral_threads`. - Added `goal_tools_hidden_for_review_subagents`.	2026-05-28 17:47:17 +02:00
Adam Perry @ OpenAI	bd2a732923	Add app-server startup benchmark crate (#24651 ) ## Summary - Add a new `app-server-start-bench` crate to measure app-server startup performance - Wire the benchmark into the workspace and Bazel build so it can be run consistently - Update lockfiles and repo automation to account for the new package	2026-05-28 08:46:30 -07:00
Vaibhav Srivastav	a4ed6c5aa0	[codex] Update OpenAI Docs skill (#24914 ) ## Summary - update the bundled `openai-docs` system skill to match the latest `openai-docs-plus` content from `skills-internal` - add the cached Codex manual fetch helper and expand the skill routing for Codex self-knowledge - keep the stable local skill identity and labels as `openai-docs` ## Why The built-in OpenAI Docs skill needed to reflect the current upstream guidance from `skills-internal` while preserving the local system-skill name used by Codex. ## Impact Codex now ships the newer OpenAI Docs skill behavior for Codex self-knowledge and manual-first documentation lookups. ## Validation - `just test -p codex-skills` - exact directory diff against transformed `skills-internal` `origin/main` was clean	2026-05-28 15:11:11 +00:00
pakrym-oai	1c7832ffa3	[codex] Store pending response items directly (#24865 )	2026-05-28 07:23:08 -07:00

6956 Commits