Use the MCP server environment setting to choose local stdio or executor-backed stdio at client startup time.
Co-authored-by: Codex <noreply@openai.com>
Keep the shared launcher API before the local implementation and move local launch helpers onto LocalStdioServerLauncher.
Co-authored-by: Codex <noreply@openai.com>
Rename the stdio runtime abstraction around launching a server so the PR boundary is about process placement, not a parallel stdio/executor runtime shape.
Co-authored-by: Codex <noreply@openai.com>
Move local stdio process startup behind a runtime trait while preserving the existing local MCP stdio behavior.
Co-authored-by: Codex <noreply@openai.com>
Add an explicit stdin mode to process/start so non-tty processes can either keep stdin closed or expose a writable pipe.
Co-authored-by: Codex <noreply@openai.com>
Introduce the MCP server environment setting and thread the default local value through config serialization, schema generation, and existing config fixtures.
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Promote `Feature::ToolSearch` to `Stable` and enable it in the default
feature set
- Update feature tests and tool registry coverage to match the new
default
- Adjust the search-tool integration test to assert the default-on path
and explicit disable fallback
## Testing
- `just fmt`
- `cargo test -p codex-features`
- `cargo test -p codex-core --test all search_tool`
- `cargo test -p codex-tools`
- When launching the TUI client, if YOLO mode is enabled, display this
in the header.
- Eligibility is determined by `approval_policy = "never"` and
`sandbox_mode = "danger-full-access"`
<img width="886" height="230" alt="image"
src="https://github.com/user-attachments/assets/d7064778-e32c-4123-8e44-ca0c9016ab09"
/>
## Motivation
Codex needs a repeatable workflow for updating PR metadata after a pull
request already exists. This is more specific than generic GitHub
handling: the assistant needs to preserve author-provided body content,
explain why the PR exists before listing implementation details, and
describe only the net change under review, including when Sapling stacks
put a PR on top of another PR instead of `main`.
## Changes
- Adds `.codex/skills/codex-pr-body/SKILL.md`.
- Documents how to infer the target PR from the current branch or
commit, including Sapling-specific PR metadata and `sl sl` output.
- Defines the expected PR body update behavior: inspect the existing
body, preserve key content such as images, avoid local absolute paths,
use Markdown formatting, include relevant issue/PR references, and call
out developer docs follow-up only when applicable.
- Captures stacked-PR handling so generated PR text describes the change
between the PR's base and head, rather than unrelated ancestor changes.
## Verification
Not run; this is a Codex skill documentation addition.
## Summary
- Make command/exec output-delta tests accumulate streamed chunks
instead of assuming complete logical output in a single notification.
- Collect stdout and stderr independently so stream interleaving does
not fail the pipe streaming test.
## Why
The command/exec protocol exposes output as deltas, so tests should not
rely on chunk boundaries being stable. A line like `out-start\n` may
arrive split across multiple notifications, and stdout/stderr
notifications may interleave.
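A minimal sketch of the accumulation approach described above, assuming deltas arrive as text chunks tagged with their stream. `Stream` and `OutputAccumulator` are hypothetical names for illustration, not the types used by the actual tests.

```rust
use std::collections::HashMap;

/// Which output stream a delta belongs to (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Stream {
    Stdout,
    Stderr,
}

/// Collects streamed chunks per stream so assertions can run on the
/// accumulated text instead of on individual notifications.
#[derive(Default)]
struct OutputAccumulator {
    buffers: HashMap<Stream, String>,
}

impl OutputAccumulator {
    /// Appends one delta to the buffer of its stream.
    fn push(&mut self, stream: Stream, chunk: &str) {
        self.buffers.entry(stream).or_default().push_str(chunk);
    }

    /// Returns everything received so far on `stream`.
    fn collected(&self, stream: Stream) -> &str {
        self.buffers.get(&stream).map(String::as_str).unwrap_or("")
    }
}
```

With this shape, a test can push `"out-"` and `"start\n"` as separate stdout deltas, with a stderr delta in between, and still assert on the full `out-start\n` line.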
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-app-server suite::v2::command_exec`
**Summary**
- prevent managed requirements.toml network settings from leaking into
DangerFullAccess / yolo turns by gating managed proxy attachment on
sandbox mode
- keep guardian/sandboxed modes on the managed proxy path, while making
true yolo bypass the proxy entirely, including /shell full-access
commands
## Summary
- wrap realtime startup context in
`<startup_context>...</startup_context>` tags
- prefix V2 mirrored user text and relayed backend text with `[USER]` /
`[BACKEND]`
- remove the V2 progress suffix and replace the final V2 handoff output
with a short completion acknowledgement while preserving the existing V1
wrapper
## Testing
- cargo test -p codex-api
realtime_v2_session_update_includes_background_agent_tool_and_handoff_output_item
-- --exact
- cargo test -p codex-app-server webrtc_v2_background_agent_
- cargo test -p codex-app-server webrtc_v2_text_input_is_
- cargo test -p codex-core conversation_user_text_turn_is_
## Summary
This PR significantly improves the standalone installer experience.
The main changes are:
1. We now install the codex binary and other dependencies in a
subdirectory under CODEX_HOME.
(`CODEX_HOME/packages/standalone/releases/...`)
2. We replace the `codex.js` launcher that npm/bun rely on with logic in
the Rust binary that automatically resolves its dependencies (like
ripgrep)
## Motivation
A few design constraints pushed this work.
1. Currently, the entrypoint to codex is through `codex.js`, which
forces a node dependency to kick off our rust app. We want to move away
from this so that the entrypoint to codex does not rely on node or
external package managers.
2. Right now, the native script adds codex and its dependencies directly
to user PATH. Given that codex is likely to add more binary dependencies
than ripgrep, we want a solution which does not add arbitrary binaries
to user PATH -- the only one we want to add is the `codex` command
itself.
3. We want upgrades to be atomic. We do not want scenarios where
interrupting an upgrade command can move codex into undefined state (for
example, having a new codex binary but an old ripgrep binary). This was
possible with the old script.
4. Currently, the Rust binary uses heuristics to determine which
installer created it. These heuristics are flaky and are tied to the
`codex.js` launcher. We need a more stable/deterministic way to
determine how the binary was installed for standalone.
5. We do not want conflicting codex installations on PATH. For example,
the user installing via npm, then installing via brew, then installing
via standalone would make it unclear which version of codex is being
launched and make it tough for us to determine the right upgrade
command.
## Design
### Standalone package layout
Standalone installs now live under `CODEX_HOME/packages/standalone`:
```text
$CODEX_HOME/
packages/
standalone/
current -> releases/0.111.0-x86_64-unknown-linux-musl
releases/
0.111.0-x86_64-unknown-linux-musl/
codex
codex-resources/
rg
```
where `standalone/current` is a symlink to a release directory.
On Windows, the release directory has the same shape, with `.exe` names
and Windows helpers in `codex-resources`:
```text
%CODEX_HOME%\
packages\
standalone\
current -> releases\0.111.0-x86_64-pc-windows-msvc
releases\
0.111.0-x86_64-pc-windows-msvc\
codex.exe
codex-resources\
rg.exe
codex-command-runner.exe
codex-windows-sandbox-setup.exe
```
This gives us:
- atomic upgrades because we can fully stage a release before switching
`standalone/current`
- a stable way for the binary to recognize a standalone install from its
canonical `current_exe()` path under CODEX_HOME
- a clean place for binary dependencies like `rg`, Windows sandbox
helpers, and, in the future, our custom `zsh` etc
### Command location
On Unix, we add a symlink at `~/.local/bin/codex` which points directly
to the `$CODEX_HOME/packages/standalone/current/codex` binary. This
becomes the main entrypoint for the CLI.
On Windows, we store the link at
`%LOCALAPPDATA%\Programs\OpenAI\Codex\bin`.
### PATH persistence
This is a tricky part of the PR, as there is no fully reliable way to
ensure that we end up on PATH without significant tradeoffs.
Most Unix variants will have `~/.local/bin` on PATH already, which means
we *should* be fine simply registering the command there in most cases.
When `~/.local/bin` is not on PATH, we directly edit the profile for the
shell we're in:
- macOS zsh: `~/.zprofile`
- macOS bash: `~/.bash_profile`
- Linux zsh: `~/.zshrc`
- Linux bash: `~/.bashrc`
- fallback: `~/.profile`
On Windows, we update the User `Path` environment variable directly and
we don't need to worry about shell profiles.
### Standalone runtime detection
This PR adds a new shared crate, `codex-install-context`, which computes
install ownership once per process and caches it in a `OnceLock`.
That context includes:
- install manager (`Standalone`, `Npm`, `Bun`, `Brew`, `Other`)
- the managed standalone release directory, when applicable
- the managed standalone `codex-resources` directory, when present
- the resolved `rg_command`
The standalone path is detected by canonicalizing `current_exe()`,
canonicalizing CODEX_HOME via `find_codex_home()`, and checking whether
the binary is running from under
`$CODEX_HOME/packages/standalone/releases`.
We intentionally do not use a release metadata file. The binary path is
the source of truth.
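The path check described above can be sketched as follows. This is an illustration only: `standalone_release_dir` is a hypothetical helper, and it assumes the caller has already canonicalized both paths (in the PR, via `current_exe()` and `find_codex_home()`).

```rust
use std::path::{Path, PathBuf};

/// Returns the release directory when `exe` lives inside
/// `<codex_home>/packages/standalone/releases/<release>/`.
/// Both paths are assumed to be canonicalized by the caller.
fn standalone_release_dir(exe: &Path, codex_home: &Path) -> Option<PathBuf> {
    let releases = codex_home
        .join("packages")
        .join("standalone")
        .join("releases");
    // `strip_prefix` fails when the binary is not under `releases/`.
    let relative = exe.strip_prefix(&releases).ok()?;
    let mut components = relative.components();
    // The first component below `releases/` names the release directory.
    let release = components.next()?;
    // Require at least one more component: the binary itself.
    components.next()?;
    Some(releases.join(release))
}
```

Because the result depends only on the two paths, it can be computed once per process and cached, which matches the `OnceLock` usage the description mentions.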
### Dependency resolution
For standalone installs, `grep_files` now resolves bundled `rg` from
`codex-resources` next to the Codex binary.
For npm/bun/brew/other installs, `grep_files` falls back to resolving
`rg` from PATH.
For Windows standalone installs, Windows sandbox helpers are still found
as direct siblings when present. If they are not direct siblings, the
lookup also checks the sibling `codex-resources` directory.
### TUI update path
The TUI now has `UpdateAction::StandaloneUnix` and
`UpdateAction::StandaloneWindows`, which rerun the standalone install
commands.
Unix update command:
```sh
sh -c "curl -fsSL https://chatgpt.com/codex/install.sh | sh"
```
Windows update command:
```powershell
powershell -c "irm https://chatgpt.com/codex/install.ps1|iex"
```
The Windows updater runs PowerShell directly. We do this because `cmd
/C` would parse the `|iex` as a cmd pipeline instead of passing it to
PowerShell.
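As a rough illustration of invoking PowerShell directly, the command can be built with `std::process::Command` so the whole `irm ...|iex` string is passed as a single argument and never goes through `cmd /C`. The function name is hypothetical; the actual `UpdateAction::StandaloneWindows` code may construct the command differently.

```rust
use std::process::Command;

/// Builds (but does not run) the Windows standalone update command.
/// The pipeline is a single argument, so only PowerShell parses `|iex`.
fn standalone_windows_update_command() -> Command {
    let mut cmd = Command::new("powershell");
    cmd.arg("-c")
        .arg("irm https://chatgpt.com/codex/install.ps1|iex");
    cmd
}
```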
## Additional installer behavior
- standalone installs now warn about conflicting npm/bun/brew-managed
`codex` installs and offer to uninstall them
- same-version reruns do not redownload the release if it is already
staged locally
## Testing
Installer smoke tests run:
- macOS: fresh install into isolated `HOME` and `CODEX_HOME` with
`scripts/install/install.sh --release latest`
- macOS: reran the installer against the same isolated install to verify
the same-version/update path and PATH block idempotence
- macOS: verified the installed `codex --version` and bundled
`codex-resources/rg --version`
- Windows: parsed `scripts/install/install.ps1` with PowerShell via
`[scriptblock]::Create(...)`
- Windows: verified the standalone update action builds a direct
PowerShell command and does not route the `irm ...|iex` command through
`cmd /C`
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- honor `_meta["codex/imageDetail"] == "original"` on MCP image content
and map it to `detail: "original"` where supported
- strip that detail back out when the active model does not support
original-detail image inputs
- update code-mode `image(...)` to accept individual MCP image blocks
- teach `js_repl` / `codex.emitImage(...)` to preserve the same hint
from raw MCP image outputs
- document the new `_meta` contract and add generic RMCP-backed coverage
across protocol, core, code-mode, and js_repl paths
## Summary
1. Revert https://github.com/openai/codex/pull/17848 so the Bazel and
`BUILD` file changes leave `main`.
2. Prepare for a narrower follow up that restores only `SECURITY.md`.
## Validation
1. Reviewed the revert diff against `main`.
2. Ran a clean diff check before push.
- Discover marketplace manifests from different supported layout paths
instead of only .agents/plugins/marketplace.json.
- Accept local plugin sources written either as { source: "local", path:
... } or as a direct string path.
- Skip unsupported or invalid plugin source entries without failing the
entire marketplace, and keep valid local plugins loadable.
Dismiss stale TUI app-server approvals after remote resolution
When an approval, user-input prompt, or elicitation request is resolved
by another client, the TUI now dismisses the matching local UI instead
of leaving stale prompts behind and emitting a misleading local
cancellation.
This change teaches pending app-server request tracking to map
`serverRequest/resolved` notifications back to the concrete request type
and stable request key, then propagates that resolved request into TUI
prompt state. Approval, request-user-input, and MCP elicitation overlays
now drop the resolved current or queued request quietly, advance to the
next queued request when present, and avoid emitting abort/cancel events
for stale UI.
The latest update also retires matching prompts while they are still
deferred behind active streaming and suppresses buffered active-thread
requests whose app-server request id has already been resolved before
drain. `ChatWidget` removes a resolved request from both the deferred
interrupt queue and the materialized bottom-pane stack, while
active-thread request handling verifies the app-server request is still
pending before showing a prompt. Lifecycle events such as exec begin/end
remain queued so approved work can still render normally.
Tests cover resolved-request mapping, overlay dismissal behavior,
deferred prompt pruning for same-turn user input, exec approval IDs,
lifecycle-event retention, and the buffered active-thread ordering
regression.
Validation:
- `just fmt`
- `git diff --check`
- `cargo test -p codex-tui
resolved_buffered_approval_does_not_become_actionable_after_drain`
- `cargo test -p codex-tui
enqueue_primary_thread_session_replays_buffered_approval_after_attach`
- `cargo test -p codex-tui chatwidget::interrupts`
- `just fix -p codex-tui`
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Re-enable remote variants for the exec-server filesystem
sandbox/symlink tests that were made local-only in PR #17671.
- Restore `use_remote` parameterization for the readable-root,
normalized symlink escape, symlink removal, and symlink
copy-preservation cases.
- Preserve `mode={use_remote}` context on key async filesystem failures
so CI failures point at the local or remote lane.
## Validation
- `cd codex-rs && just fmt`
- Not run: `bazel test
//codex-rs/exec-server:exec-server-file_system-test` per local Codex
development guidance to avoid test runs unless explicitly requested.
Co-authored-by: Codex <noreply@openai.com>
# Summary
- implement local ThreadStore archive/unarchive operations
- implement local ThreadStore read_thread operation
- break up the various ThreadStore local method implementations into
separate files
- migrate app-server archive/unarchive and core archive fixture to use
ThreadStore (but not all read operations yet!)
- use the ThreadStore's read operation as a proxy check for thread
persistence/existence in the app server code
- move all other filesystem operations related to archive (path
validation etc) into the local thread store.
# Tests
- add dedicated local store archive/unarchive tests
## Summary
1. Add a Security Boundaries section to `SECURITY.md`.
2. Point readers to the Codex Agent approvals and security documentation
for sandboxing, approvals, and network controls.
## Validation
1. Reviewed the `SECURITY.md` diff in a clean worktree.
2. No tests run. Docs only change.
Azure Responses providers were still falling back to local compaction
because the compaction gate only checked
`ModelProviderInfo::is_openai()`.
Move the capability check onto `ModelProviderInfo` with
`supports_remote_compaction()`, backed by the existing Azure Responses
endpoint detection used in `codex-api`, and have `core::compact`
delegate to that helper.
Add regression coverage for:
- OpenAI providers using remote compaction
- Azure providers using remote compaction
- non-OpenAI/non-Azure providers staying on the local path
Resolves #17773
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
## Why
#17763 moved sandbox-state delivery for MCP tool calls to request
`_meta` via the `codex/sandbox-state-meta` experimental capability.
Keeping the older `codex/sandbox-state` capability meant Codex still
maintained a second transport that pushed updates with the custom
`codex/sandbox-state/update` request at server startup and when the
session sandbox policy changed.
That duplicate MCP path is redundant with the per-tool-call metadata
path and makes the sandbox-state contract larger than needed. The
existing managed network proxy refresh on sandbox-policy changes is
still needed, so this keeps that behavior separate from the removed MCP
notification.
## What Changed
- Removed the exported `MCP_SANDBOX_STATE_CAPABILITY` and
`MCP_SANDBOX_STATE_METHOD` constants.
- Removed detection of `codex/sandbox-state` during MCP initialization
and stopped sending `codex/sandbox-state/update` at server startup.
- Removed the `McpConnectionManager::notify_sandbox_state_change`
plumbing while preserving the managed network proxy refresh when a user
turn changes sandbox policy.
- Slimmed `McpConnectionManager::new` so startup paths pass only the
initial `SandboxPolicy` needed for MCP elicitation state.
- Kept `codex/sandbox-state-meta` support intact; servers that opt in
still receive the current `SandboxState` on tool-call request `_meta`
([remaining call
path](ff2d3c1e72/codex-rs/core/src/mcp_tool_call.rs (L487-L526))).
- Added regression coverage for refreshing the live managed network
proxy on a per-turn sandbox-policy change.
## Verification
- `cargo test -p codex-core
new_turn_refreshes_managed_network_proxy_for_sandbox_change`
- `cargo test -p codex-mcp`
## Summary
- Track outbound remote-control sequence IDs independently for each
client stream.
- Retain unacked outbound messages per stream using FIFO buffers.
- Require stream-scoped acks and update tests for contiguous per-stream
sequencing.
## Why
The remote-control peer uses outbound sequence gaps to detect lost
messages and re-initialize. A single global outbound sequence counter
can create apparent gaps on an individual stream when another stream
receives an interleaved message.
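A minimal sketch of per-stream sequencing, under stated assumptions: sequence IDs start at 0, acks are cumulative, and `StreamId`, `StreamState`, and `Outbound` are hypothetical names rather than the types in `codex-app-server`.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stream identifier for this sketch.
type StreamId = u32;

/// Per-stream counter plus a FIFO of messages that have not been acked yet.
#[derive(Default)]
struct StreamState {
    next_seq: u64,
    unacked: VecDeque<(u64, String)>,
}

#[derive(Default)]
struct Outbound {
    streams: HashMap<StreamId, StreamState>,
}

impl Outbound {
    /// Assigns the next contiguous sequence ID for `stream` and retains the
    /// message until it is acked.
    fn send(&mut self, stream: StreamId, msg: &str) -> u64 {
        let state = self.streams.entry(stream).or_default();
        let seq = state.next_seq;
        state.next_seq += 1;
        state.unacked.push_back((seq, msg.to_string()));
        seq
    }

    /// Drops retained messages on `stream` up to and including `seq`
    /// (cumulative acks are an assumption of this sketch).
    fn ack(&mut self, stream: StreamId, seq: u64) {
        if let Some(state) = self.streams.get_mut(&stream) {
            while let Some((front, _)) = state.unacked.front() {
                if *front > seq {
                    break;
                }
                state.unacked.pop_front();
            }
        }
    }

    /// Number of messages still awaiting an ack on `stream`.
    fn unacked_len(&self, stream: StreamId) -> usize {
        self.streams.get(&stream).map_or(0, |s| s.unacked.len())
    }
}
```

Because each stream owns its counter, a message sent on one stream never creates a gap in the sequence observed on another.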
## Validation
- `just fmt`
- `cargo test -p codex-app-server remote_control`
- `just fix -p codex-app-server`
- `git diff --check`
## Summary
- Move auth header construction into the
`AuthProvider::add_auth_headers` contract.
- Inline `CoreAuthProvider` header mutation in its provider impl and
remove the shared header-map helper.
- Update HTTP, websocket, file upload, sideband websocket, and test auth
callsites to use the provider method.
- Add direct coverage for `CoreAuthProvider` auth header mutation.
## Testing
- `just fmt`
- `cargo test -p codex-api`
- `cargo test -p codex-core
client::tests::auth_request_telemetry_context_tracks_attached_auth_and_retry_phase`
- `cargo test -p codex-core` failed on unrelated/reproducible
`tools::handlers::multi_agents::tests::multi_agent_v2_followup_task_interrupts_busy_child_without_losing_message`
---------
Co-authored-by: Celia Chen <celia@openai.com>
## Summary
- Keep the existing local-build test announcement as the first
announcement entry
- Add the CLI update reminder for versions below `0.120.0`
- Remove expired onboarding and gpt-5.3-codex announcement entries
<img width="1576" height="276" alt="Screenshot 2026-04-15 at 1 32 53 PM"
src="https://github.com/user-attachments/assets/10b55d0b-09cd-4de0-ab51-4293d811b80c"
/>
Builds on top of #17659
Move the filesystem + sqlite thread listing-related operations inside of
a local ThreadStore implementation and call ThreadStore from the places
that used to perform these filesystem/sqlite operations.
This is the first of a series of PRs that will implement the rest of the
local ThreadStore.
Testing:
- added unit tests for the thread store implementation
- adjusted some unit tests in the realtime + personality packages whose
callsites changed. Specifically I'm trying to hide ThreadMetadata inside
of the local implementation and make ThreadMetadata a sqlite
implementation detail concern rather than a public interface, preferring
the more general StoredThread interface instead
- added a corner case test for the personality migration package that
wasn't covered by the existing test suite
- adjust the behavior of searched thread listing to run the existing
local rollout repair/backfill pass _before_ querying SQLite results, so
callers using ThreadStore::list_threads do not miss matches after a
partial metadata warm-up
## Summary
- Ensure direct namespaced MCP tool groups are emitted with a non-empty
namespace description even when namespace metadata is missing or blank.
- Add regression coverage for missing MCP namespace descriptions.
## Cause
Latest `main` can serialize a direct namespaced MCP tool group with an
empty top-level `description`. The namespace description path used
`unwrap_or_default()` when `tool_namespaces` did not include metadata
for that namespace, so the outbound Responses API payload could contain
a tool like `{"type":"namespace","description":""}`. The Responses API
rejects that because namespace tool descriptions must be a non-empty
string.
## Fix
- Add a fallback namespace description: `Tools in the <namespace>
namespace.`
- Preserve provided namespace descriptions after trimming, but treat
blank descriptions as missing.
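The fallback rule can be sketched like this. The function name and signature are assumptions for illustration; only the fallback text and the trim-then-check behavior come from the description above.

```rust
/// Returns the trimmed provided description, or a generated fallback when the
/// description is missing or blank.
fn namespace_description(namespace: &str, provided: Option<&str>) -> String {
    match provided.map(str::trim) {
        Some(text) if !text.is_empty() => text.to_string(),
        _ => format!("Tools in the {} namespace.", namespace),
    }
}
```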
### Issue I am seeing
This is what I am seeing on the local build.
<img width="1593" height="488" alt="Screenshot 2026-04-15 at 10 55 55
AM"
src="https://github.com/user-attachments/assets/bab668ba-bf17-4c71-be4e-b102202fce57"
/>
---------
Co-authored-by: Sayan Sisodiya <sayan@openai.com>
## Why
While reviewing https://github.com/openai/codex/pull/17958, the helper
name `is_azure_responses_wire_base_url` looked misleading because the
helper returns true for either the `azure` provider name or an Azure
Responses `base_url`. The new name makes both inputs part of the
contract.
## What
- Rename `is_azure_responses_wire_base_url` to
`is_azure_responses_provider`.
- Move the `openai.azure.` marker into
`matches_azure_responses_base_url` so all base URL marker matching is
centralized.
- Keep `Provider::is_azure_responses_endpoint()` behavior unchanged.
## Verification
- Compared the parent and current implementations.
`name.eq_ignore_ascii_case("azure")` still returns true before
consulting `base_url`, `None` still returns false, base URLs are still
lowercased before marker matching, and the same Azure marker set is
checked.
- Ran `cargo test -p codex-api`.
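For reference, the contract listed under Verification can be sketched as below. The signatures are assumptions, and the marker list contains only `openai.azure.` because that is the one marker named here; the real marker set may contain more entries.

```rust
/// Lowercases the base URL and checks it against the known markers.
/// Only `openai.azure.` is named in the PR text; other markers are omitted.
fn matches_azure_responses_base_url(base_url: &str) -> bool {
    const MARKERS: [&str; 1] = ["openai.azure."];
    let lowered = base_url.to_ascii_lowercase();
    MARKERS.iter().any(|marker| lowered.contains(marker))
}

/// True for the `azure` provider name (case-insensitive) or an Azure
/// Responses base URL; a missing base URL is not a match on its own.
fn is_azure_responses_provider(name: &str, base_url: Option<&str>) -> bool {
    if name.eq_ignore_ascii_case("azure") {
        return true;
    }
    base_url.map_or(false, matches_azure_responses_base_url)
}
```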
## Summary
- Skip directory entries whose metadata lookup fails during
`fs/readDirectory`
- Add an exec-server regression test covering a broken symlink beside
valid entries
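A hedged sketch of the skip-on-failure behavior using only `std::fs`. It assumes the metadata lookup follows symlinks, which is why a broken symlink fails the lookup; `DirEntryInfo` and `read_directory` are hypothetical names, and the real exec-server handler may be structured differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical entry shape for this sketch.
struct DirEntryInfo {
    name: String,
    is_directory: bool,
}

/// Lists a directory, skipping entries whose metadata cannot be read
/// (for example, a broken symlink) instead of failing the whole request.
fn read_directory(path: &Path) -> io::Result<Vec<DirEntryInfo>> {
    let mut entries = Vec::new();
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        // `fs::metadata` follows symlinks, so a dangling link yields an error.
        let metadata = match fs::metadata(entry.path()) {
            Ok(metadata) => metadata,
            Err(_) => continue,
        };
        entries.push(DirEntryInfo {
            name: entry.file_name().to_string_lossy().into_owned(),
            is_directory: metadata.is_dir(),
        });
    }
    Ok(entries)
}
```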
## Testing
- `just fmt`
- `cargo test -p codex-exec-server` (started, but dependency/network
updates stalled before completion in this environment)
## Summary
Stack PR 2 of 4 for feature-gated agent identity support.
This PR adds agent identity registration behind
`features.use_agent_identity`. It keeps the app-server protocol
unchanged and starts registration after ChatGPT auth exists rather than
requiring a client restart.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - this PR
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR4: https://github.com/openai/codex/pull/17388 - use `AgentAssertion`
downstream when enabled
## Validation
Covered as part of the local stack validation pass:
- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`
## Notes
The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
## Summary
- Remove the exec-server-side manual filesystem request path preflight
before invoking the sandbox helper.
- Keep sandbox helper policy construction and platform sandbox
enforcement as the access boundary.
- Add a portable local+remote regression for writing through an
explicitly configured alias root.
- Remove the metadata symlink-escape assertion that depended on the
deleted manual preflight; no replacement metadata-specific access probe
is added.
## Tests
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-exec-server --test file_system`
- `git diff --check`
Stacked on #17402.
MCP tools returned by `tool_search` (deferred tools) get registered in
our `ToolRegistry` with a different format than directly available
tools. This leads to two different ways of accessing MCP tools from our
tool catalog, only one of which works for each tool. Fix this by
registering all MCP tools with the namespace format, since this info is
already available.
Also, direct MCP tools are registered with the Responses API without a
namespace, while deferred MCP tools have a namespace. This means we can
receive MCP `FunctionCall`s in both formats (with and without a
namespace). Fix this by always registering MCP tools with a namespace,
regardless of deferral status.
Make code mode track the `ToolName` provenance of tools so it can map the
literal JS function name string to the correct `ToolName` for
invocation, rather than supporting both in core.
This lets us unify on a single canonical `ToolName` representation for
each MCP tool and force every caller to use that one, without supporting
fallbacks.
## Summary
- Fix marketplace-add local path detection on Windows by using
`Path::is_absolute()`.
- Make marketplace-add local-source tests parse/write TOML through the
same helpers instead of raw string matching.
- Update `rand` 0.9.x to 0.9.3 and document the remaining audited `rand`
0.8.5 advisory exception.
- Refresh `MODULE.bazel.lock` after the Cargo.lock update.
## Why
Latest `main` had two independent CI blockers: marketplace-add tests
were not portable to Windows path/TOML escaping, and cargo-deny still
reported `RUSTSEC-2026-0097` after the recent rustls-webpki fix.
## Validation
- `cargo test -p codex-core marketplace_add -- --nocapture`
- `cargo deny --all-features check`
- `just bazel-lock-check`
- `just fix -p codex-core`
- `just fmt`
- `git diff --check`
## Changes
Allows sandboxes to restrict overall network access while granting
access to specific Unix sockets on macOS.
## Details
- `codex sandbox macos`: adds a repeatable `--allow-unix-socket` option.
- `codex-sandboxing`: threads explicit Unix socket roots into the macOS
Seatbelt profile generation.
- Preserves restricted network behavior when only Unix socket IPC is
requested, and preserves full network behavior when full network is
already enabled.
## Verification
- `cargo test -p codex-cli -p codex-sandboxing`
- `cargo build -p codex-cli --bin codex`
- verified that `codex sandbox macos --allow-unix-socket /tmp/test.sock
-- test-client` grants access as expected
## Changes
Allows MCPs to opt in to receiving sandbox config info through `_meta`
on model-initiated tool calls. This lets MCPs adhere to the thread's
sandbox if they choose to.
## Details
- Adds the `codex/sandbox-state-meta` experimental MCP capability.
- Tracks whether each MCP server advertises that capability.
- When a server opts in, `codex-core` injects the current `SandboxState`
into model-initiated MCP tool-call request `_meta`.
## Verification
- added an integration test for the capability
`exec()` had a number of arguments that were unused, making the function
signature misleading. This PR aims to clean things up to clarify the
role of this function and to clarify which fields of `ExecParams` are
unused and why.
## Why
`spawn_command_under_seatbelt()` in `codex-rs/core/src/seatbelt.rs` had
fallen out of production use and was only referenced by test-only
wrappers. That left us with sandbox tests that could stay green even if
the actual seatbelt exec path regressed, because production shell
execution now flows through `SandboxManager::transform()` and
`ExecRequest::from_sandbox_exec_request()` instead of that helper.
Removing the dead helper also exposed one downstream `codex-exec`
integration test that still imported it, which broke `just clippy`.
## What Changed
- Removed `codex-rs/core/src/seatbelt.rs` and stopped exporting
`codex_core::seatbelt`.
- Removed the redundant `codex-rs/core/tests/suite/seatbelt.rs` coverage
that only exercised the dead helper.
- Kept the `openpty` regression check, but moved it into
`codex-rs/core/tests/suite/exec.rs` so it now runs through
`process_exec_tool_call()`.
- Fixed the seatbelt denial test in `codex-rs/core/tests/suite/exec.rs`
to use `/usr/bin/touch`, so it actually exercises the sandbox instead of
a nonexistent path.
- Updated `codex-rs/exec/tests/suite/sandbox.rs` on macOS to build the
sandboxed command through `build_exec_request()` and spawn the
transformed command, instead of importing the removed helper.
- Left the lower-level seatbelt policy coverage in
`codex-rs/sandboxing/src/seatbelt_tests.rs`, where the policy generator
is still covered directly.
## Verification
- `cargo test -p codex-core suite::exec::`
- `cargo test -p codex-exec`
- `cargo clippy -p codex-exec --tests -- -D warnings`
## Summary
- reuse a shared remote exec-server for remote-aware codex-core
integration tests within a test binary process
- keep per-test remote cwd creation and cleanup so tests retain
workspace isolation
- leave codex_self_exe, codex_linux_sandbox_exe, cwd_path(), and
workspace_path() behavior unchanged
## Validation
- rustfmt codex-rs/core/tests/common/test_codex.rs
- git diff --check
- CI is running on the updated branch
Fix clippy warnings in external agent config migration
```
error: this expression creates a reference which is immediately dereferenced by the compiler
--> core/src/external_agent_config.rs:188:55
|
188 | let migrated = build_config_from_external(&settings)?;
| ^^^^^^^^^ help: change this to: `settings`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/rust-1.93.0/index.html#needless_borrow
= note: requested on the command line with `-D clippy::needless-borrow`
error: useless conversion to the same type: `codex_utils_absolute_path::AbsolutePathBuf`
--> core/src/external_agent_config.rs:355:27
|
355 | match AbsolutePathBuf::try_from(
| ___________________________^
356 | | add_marketplace_outcome
357 | | .installed_root
358 | | .join(INSTALLED_MARKETPLACE_MANIFEST_RELATIVE_PATH),
359 | | ) {
| |_____________________^
|
= help: consider removing `AbsolutePathBuf::try_from()`
= help: for further information visit https://rust-lang.github.io/rust-clippy/rust-1.93.0/index.html#useless_conversion
= note: `-D clippy::useless-conversion` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(clippy::useless_conversion)]`
error: aborting due to 2 previous errors
```
## Summary
- wrap routed delegation text in a small XML envelope before submitting
it as a user turn
- escape XML text content so the envelope stays well formed
- update focused coverage for the wrapper and the affected routed-turn
expectations
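A small sketch of the escaping step, assuming only text content (not attributes) needs escaping. The `<delegation>` element name is a placeholder; the actual envelope tag is not stated in this description.

```rust
/// Escapes the characters that would break XML text content.
fn escape_xml_text(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Wraps routed text in an envelope. `delegation` is a hypothetical tag name.
fn wrap_delegation(text: &str) -> String {
    format!("<delegation>{}</delegation>", escape_xml_text(text))
}
```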
## What
Disable `Feature::CodexHooks` when building guardian review session
config
## Why
Guardian review sessions were respecting the Stop hook and could ingest
synthetic `<hook_prompt>` user turns. Guardian should ignore hooks, while
the main session and regular subagents continue to respect them.
In other words, Guardian was getting ralph-looped.
Co-authored-by: Codex <noreply@openai.com>
### **Issue**
guardian_parallel_reviews_fork_from_last_committed_trunk_history was
failing on Windows/Bazel with a stack overflow:
`thread
'guardian::tests::guardian_parallel_reviews_fork_from_last_committed_trunk_history'
has overflowed its stack`
- The root cause was insufficient stack headroom
### **Solution**
Reduced stack pressure in the guardian async path by boxing thin wrapper
futures, and ran the affected test on a dedicated 2 MiB thread stack.
Concretely:
- added Box::pin(...) around thin async wrapper hops in the guardian
review/delegate path
- changed
guardian_parallel_reviews_fork_from_last_committed_trunk_history to run
inside an explicitly sized thread stack so it has enough headroom in
low-stack environments
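The two techniques above can be sketched with the standard library alone. This is an illustrative sketch, not the guardian code: `inner_review`, `review_wrapper`, `run_with_stack`, and the tiny `block_on` executor are all hypothetical stand-ins (the real tests presumably run on an async runtime).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Waker that does nothing; enough for futures that never park.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch needs no async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        thread::yield_now();
    }
}

/// Stand-in for the real async work.
async fn inner_review(input: u32) -> u32 {
    input + 1
}

/// Thin wrapper hop: `Box::pin` stores the inner future's state on the
/// heap instead of embedding it in the caller's (stack-resident) future.
fn review_wrapper(input: u32) -> Pin<Box<dyn Future<Output = u32> + Send>> {
    Box::pin(inner_review(input))
}

/// Run `f` on a thread with an explicitly sized stack and return its result.
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```

A test body would then be wrapped as `run_with_stack(2 * 1024 * 1024, || block_on(review_wrapper(41)))`, which makes the headroom independent of the platform's default thread stack size.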
## Summary
- Port marketplace source support into the shared core marketplace-add
flow
- Support local marketplace directory sources
- Support direct `marketplace.json` URL sources
- Persist the new source types in config/schema and cover them in CLI
and app-server tests
## Validation
- `cargo test -p codex-core marketplace_add`
- `cargo test -p codex-cli marketplace_add`
- `cargo test -p codex-app-server marketplace_add`
- `just write-config-schema`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-cli`
## Context
Current `main` moved marketplace-add behavior into shared core code and
still assumed only git-backed sources. This change keeps that structure
but restores support for local directories and direct manifest URLs in
the shared path.
## Why
`main` recently needed
[#17691](https://github.com/openai/codex/pull/17691) because code behind
`cfg(not(debug_assertions))` was not being compiled by the Bazel PR
workflow. Our existing CI only built the fast/debug configuration, so
PRs could stay green while release-only Rust code still failed to
compile. This PR adds a release-style compile check that is cheap enough
to run on every PR.
## What Changed
- Added a `verify-release-build` job to `.github/workflows/bazel.yml`.
- Represented each supported OS once in that job's matrix: x64 Linux,
arm64 macOS, and x64 Windows.
- Kept the build close to fastbuild cost by using
`--compilation_mode=fastbuild` while forcing Rust to compile with
`-Cdebug-assertions=no`, which makes `cfg(not(debug_assertions))` true
without also turning on release optimizations or debug-info generation.
- Added comments in `.github/workflows/bazel.yml` and
`scripts/list-bazel-release-targets.sh` to make the job's intent and
target scope explicit.
- Restored the Bazel repository cache save behavior to run after every
non-cancelled job, matching
[#16926](https://github.com/openai/codex/pull/16926), and removed the
now-unused `repository-cache-hit` output from `prepare-bazel-ci`.
- Reused the shared `prepare-bazel-ci` action from the parent PR so the
new job does not duplicate Bazel setup boilerplate.
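For context on why a debug-only lane can miss these failures, here is a minimal sketch of the `cfg` split. `build_flavor` is an illustrative name, not something from the repo.

```rust
/// Compiled only when debug assertions are enabled (the usual dev build).
#[cfg(debug_assertions)]
fn build_flavor() -> &'static str {
    "debug-assertions-on"
}

/// Compiled only when debug assertions are disabled, for example with
/// `-Cdebug-assertions=no`. A compile error in this body is invisible to
/// any build that keeps debug assertions on.
#[cfg(not(debug_assertions))]
fn build_flavor() -> &'static str {
    "debug-assertions-off"
}
```

Only one of the two bodies is ever type-checked in a given build, which is why the new job has to flip the flag to exercise the second one.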
## Verification
- Used `bazel aquery` on `//codex-rs/tui:codex-tui` to confirm the Rust
compile still uses `opt-level=0` and `debuginfo=0` while passing
`-Cdebug-assertions=no`.
- Parsed `.github/workflows/bazel.yml` as YAML locally.
- Ran `bash -n scripts/list-bazel-release-targets.sh`.
## Summary
- Allows selected MCP results to return a larger default result set.
- Keeps the existing default cap for other MCP results.
- Applies the cap consistently when higher explicit limits are
requested.
## Testing
- `cargo test -p codex-core tool_search`
- Ran a local CLI smoke test with two stdio MCP servers exposing 100
tools each; the selected-server query returned 20 tools and the
regular-server query returned 8.
- Add trace-only wire logging for realtime websocket request/event text
payloads and the WebRTC call SDP request.
- Gate raw realtime logs behind
`RUST_LOG=codex_api::realtime_websocket::wire=trace` so normal logs stay
quiet.
---------
Co-authored-by: Codex <noreply@openai.com>
Introduce a ThreadStore interface for mediating access to the
filesystem-based thread storage (rollout jsonl files + sqlite db).
In later PRs we'll move the existing fs code behind a "local"
implementation of this ThreadStore interface.
This PR should be a no-op behaviorally; it only introduces the
interface.
Problem: PR #17372 moved initialized request handling into
`dispatch_initialized_client_request`, leaving analytics code that uses
`connection_id` without a local binding and breaking `codex-app-server`
builds.
Solution: Restore the `connection_id` binding from
`connection_request_id` before initialized request validation and
analytics tracking.
## Summary
Fix the TUI `$` skill popup so personal skills appear reliably when
Codex is connected to a remote app-server.
## What changed
- load skills on TUI startup with an explicit forced refresh
- refresh skills using the actual current cwd instead of an empty `cwds`
list
- resync an already-open `$` popup when skill mentions are updated
- add a regression test for refreshing an open mention popup
## Root cause
The TUI was sometimes sending `list_skills` with `cwds: []` after
`SessionConfigured`.
For the launchd app-server flow, the server resolved that empty cwd list
to its own process cwd, which was `/`. The response therefore came back
tagged with `cwd: "/"`, and the TUI later filtered skills by exact cwd
match against the actual project cwd such as `/Users/starr/code/dream`.
That dropped all personal skills from the mention list, so `$` only
showed plugins/apps.
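The failure mode can be reproduced with a small sketch of exact-cwd filtering. `SkillEntry` and `skills_for_cwd` are hypothetical names used only for illustration; the real TUI types differ.

```rust
/// A skill as returned by the server, tagged with the cwd it was resolved for.
struct SkillEntry {
    cwd: String,
    name: String,
}

/// Keep only skills whose cwd tag exactly matches the project cwd.
fn skills_for_cwd<'a>(entries: &'a [SkillEntry], project_cwd: &str) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|entry| entry.cwd == project_cwd)
        .map(|entry| entry.name.as_str())
        .collect()
}
```

With a response tagged `cwd: "/"` and a project cwd of `/Users/starr/code/dream`, the filter returns nothing, which matches the empty personal-skills list described above.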
## Verification
Built successfully with remote cache disabled:
```bash
cd /Users/starr/code/codex-worktrees/starr-skill-popup-20260413130509
bazel --output_base=/tmp/codex-bazel-verify-starr-skill-popup build //codex-rs/cli:codex --noremote_accept_cached --noremote_upload_local_results --disk_cache=
```
Also verified interactively in a PTY against the live app-server at
`ws://127.0.0.1:4511`:
- launched the built TUI
- typed `$`
- confirmed personal skills appeared in the popup, including entries
such as `Applied Devbox`, `CI Debug`, `Channel Summarization`, `Codex PR
Review`, and `Daily Digest`
## Files changed
- `codex-rs/tui/src/app.rs`
- `codex-rs/tui/src/chatwidget.rs`
- `codex-rs/tui/src/bottom_pane/chat_composer.rs`
Co-authored-by: Codex <noreply@openai.com>
## Summary
- route apply_patch runtime execution through the selected Environment
filesystem instead of the local self-exec path
- keep the standalone apply_patch command surface intact while restoring
its launcher/test/docs contract
- add focused apply_patch filesystem sandbox regression coverage
## Validation
- remote devbox Bazel run in progress
- passed: //codex-rs/apply-patch:apply-patch-unit-tests
--test_filter=test_read_file_utf8_with_context_reports_invalid_utf8
- in progress / follow-up: focused core and exec Bazel test slices on
dev
## Follow-up under review
- remote pre-verification and approval/retry behavior still need
explicit scrutiny for delete/update flows
- runtime sandbox-denial classification may need a tighter assertion
path than rendered stderr matching
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
This stack adds a new Bazel CI lane that verifies Rust code behind
`cfg(not(debug_assertions))`, but adding that job directly to
`.github/workflows/bazel.yml` would duplicate the same setup in multiple
places. Extracting the shared setup first keeps the follow-up change
easier to review and reduces the chance that future Bazel workflow edits
drift apart.
## What Changed
- Added `.github/actions/prepare-bazel-ci/action.yml` as a composite
action for the Bazel job bootstrap shared by multiple workflow jobs.
- Moved the existing Bazel setup, repository-cache restore, and
execution-log setup behind that action.
- Updated the `test` and `clippy` jobs in `.github/workflows/bazel.yml`
to call `prepare-bazel-ci`.
- Exposed `repository-cache-hit` and `repository-cache-path` outputs so
callers can keep the existing cache-save behavior without duplicating
the restore step.
## Verification
- Parsed `.github/workflows/bazel.yml` as YAML locally after rebasing
the stack.
- CI will exercise the refactored jobs end to end.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17704).
* #17705
* __->__ #17704
## Summary
- Refactors `MessageProcessor` and per-connection session state so
initialized service RPC handling can be moved into spawned tasks in a
follow-up PR.
- Shares the processor and initialized session data with
`Arc`/`OnceLock` instead of mutable borrowed connection state.
- Keeps initialized request handling synchronous in this PR; it does
**not** call `tokio::spawn` for service RPCs yet.
## Testing
- `just fmt`
- `cargo test -p codex-app-server` *(fails on existing hardening gaps
covered by #17375, #17376, and #17377; the pipelined config regression
passed before the unrelated failures)*
- `just fix -p codex-app-server`
In the app-server debug client, allow redirecting output to a file in
addition to just stdout. Shell redirecting works OK but is a bit weird
with the interactive mode of the debug client since a bunch of newlines
get dumped into the shell. With async messages from MCPs starting it's
also tricky to actually type in a prompt.
To allow the ability to have guaranteed-unique cursors, we make two
important updates:
* Add new updated_at_ms and created_at_ms columns that are in
millisecond precision
* Guarantee uniqueness -- if multiple items are inserted at the same
millisecond, bump the new one by one millisecond until it becomes unique
This lets us use single-number cursors for forward and backward paging
through result sets and guarantees that the cursor is a fixed point:
querying `timestamp > cursor` returns only new items.
This updated implementation is backwards-compatible since multiple
appservers can be running and won't handle the previous method well.
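A minimal in-memory sketch of the bump-until-unique rule and the cursor query. The real change operates on database columns; the `BTreeSet` and both function names here are assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Reserve a unique millisecond timestamp: if `now_ms` is already taken,
/// bump by one millisecond until a free value is found.
fn insert_unique_ms(taken: &mut BTreeSet<i64>, now_ms: i64) -> i64 {
    let mut ts = now_ms;
    // `insert` returns false when the value is already present.
    while !taken.insert(ts) {
        ts += 1;
    }
    ts
}

/// Everything strictly newer than the cursor (the `timestamp > cursor` query).
fn items_after(taken: &BTreeSet<i64>, cursor: i64) -> Vec<i64> {
    taken.range((cursor + 1)..).copied().collect()
}
```

Because no two rows share a timestamp, a single number identifies a position unambiguously and paging never skips or repeats a row at a boundary.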
## Summary
Adds a `thread_source` field to the existing Codex turn metadata sent to
the Responses API:
- Sends `thread_source: "user"` for user-initiated sessions: CLI, VS
Code, and Exec
- Sends `thread_source: "subagent"` for subagent sessions
- Omits `thread_source` for MCP, custom, and unknown session sources
- Uses the existing turn metadata transport:
- HTTP requests send through the `x-codex-turn-metadata` header
- WebSocket `response.create` requests send through
`client_metadata["x-codex-turn-metadata"]`
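The classification above could look roughly like the following. The enum and its variant names are a hypothetical simplification of the real session-source type; only the `"user"` / `"subagent"` / omitted mapping comes from this description.

```rust
/// Simplified, hypothetical stand-in for the session source type.
enum SessionSource {
    Cli,
    VsCode,
    Exec,
    SubAgent,
    Mcp,
    Custom(String),
    Unknown,
}

/// Returns the `thread_source` value to send, or `None` to omit the field.
fn thread_source(source: &SessionSource) -> Option<&'static str> {
    match source {
        SessionSource::Cli | SessionSource::VsCode | SessionSource::Exec => Some("user"),
        SessionSource::SubAgent => Some("subagent"),
        SessionSource::Mcp | SessionSource::Custom(_) | SessionSource::Unknown => None,
    }
}
```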
## Testing
- `cargo test -p codex-protocol
session_source_thread_source_name_classifies_user_and_subagent_sources`
- `cargo test -p codex-core turn_metadata_state`
- `cargo test -p codex-core --test responses_headers
responses_stream_includes_turn_metadata_header_for_git_workspace_e2e --
--nocapture`
## Summary
This PR removes `image_detail_original` as a runtime experiment and
makes original image detail available whenever the selected model
supports it.
Concretely, this change:
- drops the `image_detail_original` feature flag from the feature
registry and generated config schema
- makes tool-emitted image detail depend only on
`ModelInfo.supports_image_detail_original`
- updates `view_image` and `code_mode`/`js_repl` image emission to use
that capability check directly
- removes now-redundant experiment-specific tests and instruction
coverage
- keeps backward compatibility for existing configs by silently ignoring
a stale `features.image_detail_original` entry
The net effect is that `detail: "original"` is always available on
supported models, without requiring an experiment toggle.
- Add outputModality to thread/realtime/start and wire text/audio output
selection through app-server, core, API, and TUI.
- Rename the realtime transcript delta notification and add a separate
transcript done notification that forwards final text from item done
without correlating it with deltas.
This changes multi-agent v2 mailbox handling so incoming inter-agent
messages no longer preempt an in-flight sampling stream at reasoning or
commentary output-item boundaries.
## Summary
Move `codex marketplace add` onto a shared core implementation so the
CLI and app-server path can use one source of truth.
This change:
- adds shared marketplace-add orchestration in `codex-core`
- switches the CLI command to call that shared implementation
- removes duplicated CLI-only marketplace add helpers
- preserves focused parser and add-path coverage while moving the shared
behavior into core tests
## Why
The new `marketplace/add` RPC should reuse the same underlying
marketplace-add flow as the CLI. This refactor lands that consolidation
first so the follow-up app-server PR can be mostly protocol and handler
wiring.
## Validation
- `cargo test -p codex-core marketplace_add`
- `cargo test -p codex-cli marketplace_cmd`
- `just fix -p codex-core`
- `just fix -p codex-cli`
- `just fmt`
## Summary
- Pin Rust git patch dependencies to immutable revisions and make
cargo-deny reject unknown git and registry sources unless explicitly
allowlisted.
- Add checked-in SHA-256 coverage for the current rusty_v8 release
assets, wire those hashes into Bazel, and verify CI override downloads
before use.
- Add rusty_v8 MODULE.bazel update/check tooling plus a Bazel CI guard
so future V8 bumps cannot drift from the checked-in checksum manifest.
- Pin release/lint cargo installs and all external GitHub Actions refs
to immutable inputs.
## Future V8 bump flow
Run these after updating the resolved `v8` crate version and checksum
manifest:
```bash
python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
```
The update command rewrites the matching `rusty_v8_<crate_version>`
`http_file` SHA-256 values in `MODULE.bazel` from
`third_party/v8/rusty_v8_<crate_version>.sha256`. The check command is
also wired into Bazel CI to block drift.
## Notes
- This intentionally excludes RustSec dependency upgrades and
bubblewrap-related changes per request.
- The branch was rebased onto the latest origin/main before opening the
PR.
## Validation
- cargo fetch --locked
- cargo deny check advisories
- cargo deny check
- cargo deny check sources
- python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
- python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
- python3 -m unittest discover -s .github/scripts -p
'test_rusty_v8_bazel.py'
- python3 -m py_compile .github/scripts/rusty_v8_bazel.py
.github/scripts/rusty_v8_module_bazel.py
.github/scripts/test_rusty_v8_bazel.py
- repo-wide GitHub Actions `uses:` audit: all external action refs are
pinned to 40-character SHAs
- yq eval on touched workflows and local actions
- git diff --check
- just bazel-lock-check
## Hash verification
- Confirmed `MODULE.bazel` hashes match
`third_party/v8/rusty_v8_146_4_0.sha256`.
- Confirmed GitHub release asset digests for denoland/rusty_v8
`v146.4.0` and openai/codex `rusty-v8-v146.4.0` match the checked-in
hashes.
- Streamed and SHA-256 hashed all 10 `MODULE.bazel` rusty_v8 asset URLs
locally; every downloaded byte stream matched both `MODULE.bazel` and
the checked-in manifest.
## Pin verification
- Confirmed signing-action pins match the peeled commits for their tag
comments: `sigstore/cosign-installer@v3.7.0`, `azure/login@v2`, and
`azure/trusted-signing-action@v0`.
- Pinned the remaining tag-based action refs in Bazel CI/setup:
`actions/setup-node@v6`, `facebook/install-dotslash@v2`,
`bazelbuild/setup-bazelisk@v3`, and `actions/cache/restore@v5`.
- Normalized all `bazelbuild/setup-bazelisk@v3` refs to the peeled
commit behind the annotated tag.
- Audited Cargo git dependencies: every manifest git dependency uses
`rev` only, every `Cargo.lock` git source has `?rev=<sha>#<same-sha>`,
and `cargo deny check sources` passes with `required-git-spec = "rev"`.
- Shallow-fetched each distinct git dependency repo at its pinned SHA
and verified Git reports each object as a commit.
This PR teaches the TUI to render guardian review timeouts as explicit
terminal history entries instead of dropping them from the live
timeline.
It adds timeout-specific history cells for command, patch, MCP tool, and
network approval reviews.
It also adds snapshot tests covering both the direct guardian event path
and the app-server notification path.
## Summary
- add an exec-server package-local test helper binary that can run
exec-server and fs-helper flows
- route exec-server filesystem tests through that helper instead of
cross-crate codex helper binaries
- stop relying on Bazel-only extra binary wiring for these tests
## Testing
- not run (per repo guidance for codex changes)
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Add `turn/inject_items` app-server v2 request support for appending
raw Responses API items to a loaded thread history without starting a
turn.
- Generate JSON schema and TypeScript protocol artifacts for the new
params and empty response.
- Document the new endpoint and include a request/response example.
- Preserve compatibility with the typo alias `turn/injet_items` while
returning the canonical method name.
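One way to picture the alias handling is a small normalization step. This is a sketch under assumptions: `canonical_method` is a hypothetical helper, and the actual protocol layer may implement the alias differently (for example through deserialization attributes).

```rust
const CANONICAL: &str = "turn/inject_items";
const TYPO_ALIAS: &str = "turn/injet_items";

/// Map an incoming method name to the canonical one, accepting the typo alias.
/// Returns `None` for methods this sketch does not handle.
fn canonical_method(method: &str) -> Option<&'static str> {
    match method {
        CANONICAL | TYPO_ALIAS => Some(CANONICAL),
        _ => None,
    }
}
```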
## Testing
- Not run (not requested)
## Why
For more advanced MCP usage, we want the model to be able to emit
parallel MCP tool calls and have Codex execute eligible ones
concurrently, instead of forcing all MCP calls through the serial block.
The main design choice was where to thread the config. I made this
server-level because parallel safety depends on the MCP server
implementation. Codex reads the flag from `mcp_servers`, threads the
opted-in server names into `ToolRouter`, and checks the parsed
`ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
on model-visible tool names, which can be incomplete in
deferred/search-tool paths or ambiguous for similarly named
servers/tools.
## What was added
Added `supports_parallel_tool_calls` for MCP servers.
Before:
```toml
[mcp_servers.docs]
command = "docs-server"
```
After:
```toml
[mcp_servers.docs]
command = "docs-server"
supports_parallel_tool_calls = true
```
MCP calls remain serial by default. Only tools from opted-in servers are
eligible to run in parallel. Docs also now warn to enable this only when
the server’s tools are safe to run concurrently, especially around
shared state or read/write races.
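The execution-time check can be sketched as follows. Only `ToolPayload::Mcp { server, .. }` is named in this description; the `tool` field, the `Function` variant, and the helper name are assumptions for illustration.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the parsed tool payload.
enum ToolPayload {
    Mcp { server: String, tool: String },
    Function { name: String },
}

/// An MCP call is eligible for parallel execution only when its exact
/// server name was opted in via `supports_parallel_tool_calls = true`.
/// This sketch only decides MCP eligibility, so other payloads return false.
fn mcp_call_may_run_in_parallel(
    payload: &ToolPayload,
    parallel_servers: &HashSet<String>,
) -> bool {
    match payload {
        ToolPayload::Mcp { server, .. } => parallel_servers.contains(server),
        _ => false,
    }
}
```

Keying the decision on the parsed server name, rather than on the model-visible tool name, avoids the ambiguity described above.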
## Testing
Tested with a local stdio MCP server exposing real delay tools. The
model/Responses side was mocked only to deterministically emit two MCP
calls in the same turn.
Each test called `query_with_delay` and `query_with_delay_2` with `{
"seconds": 25 }`.
| Build/config | Observed | Wall time |
| --- | --- | --- |
| main with flag enabled | serial | `58.79s` |
| PR with flag enabled | parallel | `31.73s` |
| PR without flag | serial | `56.70s` |
PR with flag enabled showed both tools start before either completed;
main and PR-without-flag completed the first delay before starting the
second.
Also added an integration test.
Additional checks:
- `cargo test -p codex-tools` passed
- `cargo test -p codex-core
mcp_parallel_support_uses_exact_payload_server` passed
- `git diff --check` passed
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
Cap mirrored user text sent to realtime with the existing 300-token turn
budget while preserving the full model turn.
Adds integration coverage for capped realtime mirror payloads.
---------
Co-authored-by: Codex <noreply@openai.com>
### Motivation
- Switch the default model used for memory Phase 2 (consolidation) to
the newer `gpt-5.4` model.
### Description
- Change the Phase 2 model constant from `"gpt-5.3-codex"` to
`"gpt-5.4"` in `codex-rs/core/src/memories/mod.rs`.
### Testing
- Ran `just fmt`, which completed successfully.
- Attempted `cargo test -p codex-core`, but the build failed in this
environment because the `codex-linux-sandbox` crate requires the system
`libcap` pkg-config entry and the required system packages could not be
installed, so the test run was blocked.
------
[Codex
Task](https://chatgpt.com/codex/cloud/tasks/task_i_69d977693b48832a967e78d73c66dc8e)
The recent release broke; Codex suggested this as the fix.
Source failure:
https://github.com/openai/codex/actions/runs/24362949066/job/71147202092
Probably from
ac82443d07
For why it got in:
```
The relevant setup:
.github/workflows/rust-ci.yml (line 1) runs on PRs, but for codex-rs it only does:
cargo fmt --check
cargo shear
argument-comment lint via Bazel
no cargo check, no cargo clippy over the workspace, no cargo test over codex-tui
.github/workflows/rust-ci-full.yml (line 1) runs on pushes to main and branches matching **full-ci**. That one does compile TUI because:
codex-rs/Cargo.toml includes "tui" as a workspace member
lint_build runs cargo clippy --target ... --tests --profile ...
the matrix includes both dev and release profiles
tests runs cargo nextest run ..., but only dev-profile tests
Release CI also compiles it indirectly. .github/workflows/rust-release.yml (line 235) builds --bin codex, and cli/Cargo.toml (line 46) depends on codex-tui.
```
Codex tested locally with `cargo check -p codex-tui --release`,
reproduced the failure, and verified that this change fixes it.
Currently app-server may unload actively running threads once the last
connection disconnects, which is not expected.
Instead, track when the last turn was active and when the thread last
had subscribers, and add a 30-minute idle/no-subscriber timer to reduce
churn.
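A rough sketch of what such an unload decision might look like. The struct, its fields, and the exact rule (both timers must have expired) are assumptions; only the 30-minute window comes from this description.

```rust
use std::time::Duration;

/// Idle window from the description above.
const IDLE_TIMEOUT: Duration = Duration::from_secs(30 * 60);

/// Hypothetical snapshot of a loaded thread's activity.
struct ThreadActivity {
    has_active_turn: bool,
    subscriber_count: usize,
    since_last_turn: Duration,
    since_last_subscriber: Duration,
}

/// Unload only when nothing is running, nobody is subscribed, and both
/// the last turn and the last subscriber are older than the idle window.
fn should_unload(activity: &ThreadActivity) -> bool {
    !activity.has_active_turn
        && activity.subscriber_count == 0
        && activity.since_last_turn >= IDLE_TIMEOUT
        && activity.since_last_subscriber >= IDLE_TIMEOUT
}
```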
- stop `list_tool_suggest_discoverable_plugins()` from reloading the
curated marketplace for each discoverable plugin
- reuse a direct plugin-detail loader against the already-resolved
marketplace entry
The trigger was to stop log spam like this:
```
d=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.402Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
2026-04-13T12:27:30.402Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.405Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
2026-04-13T12:27:30.406Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.408Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
```
## Summary
This updates the Windows elevated sandbox setup/refresh path to include
the legacy `compute_allow_paths(...).deny` protected children in the
same deny-write payload pipe added for split filesystem carveouts.
Concretely, elevated setup and elevated refresh now both build
deny-write payload paths from:
- explicit split-policy deny-write paths, preserving missing paths so
setup can materialize them before applying ACLs
- legacy `compute_allow_paths(...).deny`, which includes existing
`.git`, `.codex`, and `.agents` children under writable roots
This lets the elevated backend protect `.git` consistently with the
unelevated/restricted-token path, and removes the old janky hard-coded
`.codex` / `.agents` elevated setup helpers in favor of the shared
payload path.
## Root Cause
The landed split-carveout PR threaded a `deny_write_paths` pipe through
elevated setup/refresh, but the legacy workspace-write deny set from
`compute_allow_paths(...).deny` was not included in that payload. As a
result, elevated workspace-write did not apply the intended deny-write
ACLs for existing protected children like `<cwd>/.git`.
## Notes
The legacy protected children still only enter the deny set if they
already exist, because `compute_allow_paths` filters `.git`, `.codex`,
and `.agents` with `exists()`. Missing explicit split-policy deny paths
are preserved separately because setup intentionally materializes those
before applying ACLs.
## Validation
- `cargo fmt --check -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox`
- `cargo build -p codex-cli -p codex-windows-sandbox --bins`
- Elevated `codex exec` smoke with `windows.sandbox='elevated'`: fresh
git repo, attempted append to `.git/config`, observed `Access is
denied`, marker not written, Deny ACE present on `.git`
- Unelevated `codex exec` smoke with `windows.sandbox='unelevated'`:
fresh git repo, attempted append to `.git/config`, observed `Access is
denied`, marker not written, Deny ACE present on `.git`
Addresses #17593
Problem: A regression introduced in
https://github.com/openai/codex/pull/16492 made thread/start fail when
Codex could not persist trusted project state, which crashes startup for
users with read-only config.toml.
Solution: Treat trusted project persistence as best effort and keep the
current thread's config trusted in memory when writing config.toml
fails.
Problem: PR #17601 updated context-compaction replay to call a new
ChatWidget handler, but the handler was never implemented, breaking
codex-tui compilation on main.
Solution: Render context-compaction replay through the existing
info-message path, preserving the intended `Context compacted` UI marker
without adding a one-off handler.
Addresses #17514
Problem: PR #16966 made the TUI render the deprecated context-compaction
notification, while v2 could also receive legacy unified-exec
interaction items alongside terminal-interaction notifications, causing
duplicate "Context compacted" and "Waited for background terminal"
messages.
Solution: Suppress deprecated context-compaction notifications and
legacy unified-exec interaction command items from the app-server v2
projection, and render canonical context-compaction items through the
existing TUI info-event path.
Addresses #17453
Problem: /status rate-limit reset timestamps can be truncated in narrow
layouts, leaving users with partial times or dates.
Solution: Let narrow rate-limit rows drop the fixed progress bar to
preserve the percent summary, and wrap reset timestamps onto
continuation lines instead of truncating them.
Addresses #17252
Problem: Plan-mode clarification questionnaires used the generic
user-input notification type, so configs listening for plan-mode-prompt
did not fire when request_user_input waited for an answer.
Solution: Map request_user_input prompts to the plan-mode-prompt
notification and remove the obsolete user-input TUI notification
variant.
Addresses #16255
Problem: Incomplete Responses streams could leave completed custom tool
outputs out of cleanup and retry prompts, making persisted history
inconsistent and retries stale.
Solution: Route stream and output-item errors through shared cleanup,
and rebuild retry prompts from fresh session history after the first
attempt.
## Summary
When a `spawn_agent` call does a full-history fork, keep the parent's
effective agent type and model configuration instead of applying child
role/model overrides.
This is the minimal config-inheritance slice of #16055. Prompt-cache key
inheritance and MCP tool-surface stability are split into follow-up PRs.
## Design
- Reject `agent_type`, `model`, and `reasoning_effort` for v1
`fork_context` spawns.
- Reject `agent_type`, `model`, and `reasoning_effort` for v2
`fork_turns = "all"` spawns.
- Keep v2 partial-history forks (`fork_turns = "N"`) configurable;
requested model/reasoning overrides and role config still apply there.
- Keep non-forked spawn behavior unchanged.
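The v2 rules above can be sketched as a small validation function. The types, names, and error text are hypothetical; v1 `fork_context` is not modeled here.

```rust
/// Hypothetical model of the v2 `fork_turns` argument.
enum ForkTurns {
    NoFork,
    All,
    Last(u32),
}

/// Hypothetical subset of a spawn request.
struct SpawnRequest {
    fork_turns: ForkTurns,
    agent_type: Option<String>,
    model: Option<String>,
    reasoning_effort: Option<String>,
}

/// Reject child overrides on full-history forks; allow them otherwise.
fn validate_spawn(req: &SpawnRequest) -> Result<(), String> {
    let has_override =
        req.agent_type.is_some() || req.model.is_some() || req.reasoning_effort.is_some();
    match req.fork_turns {
        ForkTurns::All if has_override => Err(
            "agent_type, model, and reasoning_effort are not allowed with fork_turns = \"all\""
                .to_string(),
        ),
        _ => Ok(()),
    }
}
```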
## Tests
- `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib`
- `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns
--lib`
- `cargo +1.93.1 test -p codex-core
multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override
--lib`
## Summary
- add an exec-server `envPolicy` field; when present, the server starts
from its own process env and applies the shell environment policy there
- keep `env` as the exact environment for local/embedded starts, but
make it an overlay for remote unified-exec starts
- move the shell-environment-policy builder into `codex-config` so Core
and exec-server share the inherit/filter/set/include behavior
- overlay only runtime/sandbox/network deltas from Core onto the
exec-server-derived env
## Why
Remote unified exec was materializing the shell env inside Core and
forwarding the whole map to exec-server, so remote processes could
inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
base env on the executor while preserving Core-owned runtime additions
like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
sandbox marker env.
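The overlay idea can be sketched as a plain map merge. `overlay_env` and the sample values are illustrative assumptions; the real policy builder also handles inherit/filter/set/include rules that are not modeled here.

```rust
use std::collections::BTreeMap;

/// Start from the executor-derived base env and apply Core's deltas on top.
/// Keys present in `overlay` win; everything else keeps the executor's value.
fn overlay_env(
    base: &BTreeMap<String, String>,
    overlay: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = base.clone();
    for (key, value) in overlay {
        merged.insert(key.clone(), value.clone());
    }
    merged
}
```

With this shape, `PATH` and `HOME` come from the executor machine, while entries such as `CODEX_THREAD_ID` are added by Core.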
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-core --lib unified_exec::process_manager::tests`
- `cargo test -p codex-core --lib exec_env::tests`
- `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
matched 0 tests)
- `cargo test -p codex-config --lib shell_environment` (compile-only;
filter matched 0 tests)
- `just bazel-lock-update`
## Known local validation issue
- `just bazel-lock-check` is not runnable in this checkout: it invokes
`./scripts/check-module-bazel-lock.sh`, which is missing.
---------
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
Problem: After #17294 switched exec-server tests to launch the top-level
`codex exec-server` command, parallel remote exec-process cases can
flake while waiting for the child server's listen URL or transport
shutdown.
Solution: Serialize remote exec-server-backed process tests and harden
the harness so spawned servers are killed on drop and shutdown waits for
the child process to exit.
## Summary
Stop counting elicitation time towards MCP tool call time. There are
some tradeoffs here, but in general I don't think time spent waiting for
elicitations should count towards tool call time, or at least not
directly towards timeouts.
Elicitations are not exactly like exec_command escalation requests, but
I would argue they are roughly equivalent.
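A minimal sketch of the accounting idea, assuming the time spent waiting on each elicitation is recorded as a `Duration`. The helper name `billable_tool_time` is hypothetical and does not reflect how the change is actually implemented.

```rust
use std::time::Duration;

/// Hypothetical accounting helper: the time counted against a tool call is
/// the wall-clock time minus the time spent waiting on elicitations.
fn billable_tool_time(wall_clock: Duration, elicitation_waits: &[Duration]) -> Duration {
    let waited: Duration = elicitation_waits.iter().sum();
    // Saturate at zero so inconsistent measurements never underflow.
    wall_clock.saturating_sub(waited)
}
```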
## Testing
- [x] Added unit tests
- [x] Tested locally
Addresses #17498
Problem: The TUI derived /status instruction source paths from the local
client environment, which could show stale <none> output or incorrect
paths when connected to a remote app server.
Solution: Add an app-server v2 instructionSources snapshot to thread
start/resume/fork responses, default it to an empty list when older
servers omit it, and render TUI /status from that server-provided
session data.
Additional context: The app-server field is intentionally named
instructionSources rather than AGENTS.md-specific terminology because
the loaded instruction sources can include global instructions, project
AGENTS.md files, AGENTS.override.md, user-defined instruction files, and
future dynamic sources.
Addresses #17313
Problem: The visual context meter in the status line was confusing and
continued to draw negative feedback, and context reporting should remain
an explicit opt-in rather than part of the default footer.
Solution: Remove the visual meter, restore opt-in context remaining/used
percentage items that explicitly say "Context", keep existing
context-usage configs working as a hidden alias, and update the setup
text and snapshots.
## Problem
The TUI had shell-style Up/Down history recall, but `Ctrl+R` did not
provide the reverse incremental search workflow users expect from
shells. Users needed a way to search older prompts without immediately
replacing the current draft, and the interaction needed to handle async
persistent history, repeated navigation keys, duplicate prompt text,
footer hints, and preview highlighting without making the main composer
file even harder to review.
https://github.com/user-attachments/assets/5165affd-4c9a-46e9-adbd-89088f5f7b6b
<img width="1227" height="722" alt="image"
src="https://github.com/user-attachments/assets/8bc83289-eeca-47c7-b0c3-8975101901af"
/>
## Mental model
`Ctrl+R` opens a temporary search session owned by the composer. The
footer line becomes the search input, the composer body previews the
current match only after the query has text, and `Enter` accepts that
preview as an editable draft while `Esc` restores the draft that existed
before search started. The history layer provides a combined offset
space over persistent and local history, but search navigation exposes
unique prompt text rather than every physical history row.
## Non-goals
This change does not rewrite stored history, change normal Up/Down
browsing semantics, add fuzzy matching, or add persistent metadata for
attachments in cross-session history. Search deduplication is
deliberately scoped to the active Ctrl+R search session and uses exact
prompt text, so case, whitespace, punctuation, and attachment-only
differences are not normalized.
## Tradeoffs
The implementation keeps search state in the existing composer and
history state machines instead of adding a new cross-module controller.
That keeps ownership local and testable, but it means the composer still
coordinates visible search status, draft restoration, footer rendering,
cursor placement, and match highlighting while `ChatComposerHistory`
owns traversal, async fetch continuation, boundary clamping, and
unique-result caching. Unique-result caching stores cloned
`HistoryEntry` values so known matches can be revisited without cache
lookups; this is simple and robust for interactive search sizes, but it
is not a global history index.
## Architecture
`ChatComposer` detects `Ctrl+R`, snapshots the current draft, switches
the footer to `FooterMode::HistorySearch`, and routes search-mode keys
before normal editing. Query edits call `ChatComposerHistory::search`
with `restart = true`, which starts from the newest combined-history
offset. Repeated `Ctrl+R` or Up searches older; Down searches newer
through already discovered unique matches or continues the scan.
Persistent history entries still arrive asynchronously through
`on_entry_response`, where a pending search either accepts the response,
skips a duplicate, or requests the next offset.
The composer-facing pieces now live in
`codex-rs/tui/src/bottom_pane/chat_composer/history_search.rs`, leaving
`chat_composer.rs` responsible for routing and rendering integration
instead of owning every search helper inline.
`codex-rs/tui/src/bottom_pane/chat_composer_history.rs` remains the
owner of stored history, combined offsets, async fetch state, boundary
semantics, and duplicate suppression. Match highlighting is computed
from the current composer text while search is active and disappears
when the match is accepted.
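The traversal described above can be sketched roughly as follows, using a plain in-memory slice instead of the real combined persistent/local history. The function `search_older`, its signature, and the case-insensitive substring match are assumptions for illustration. Deduplication uses exact prompt text, as the Non-goals section states.

```rust
use std::collections::HashSet;

/// Hypothetical search step: walk history from `start` toward older entries
/// and return the next entry that contains `query` and has not been shown
/// yet in this search session. `history` is ordered oldest-first.
fn search_older<'a>(
    history: &'a [String],
    query: &str,
    start: usize,
    seen: &mut HashSet<String>,
) -> Option<(usize, &'a str)> {
    if history.is_empty() {
        return None;
    }
    let newest = history.len() - 1;
    let needle = query.to_lowercase();
    for offset in (0..=start.min(newest)).rev() {
        let text = history[offset].as_str();
        // Assumption: matching ignores case; deduplication uses exact text.
        if text.to_lowercase().contains(&needle) && seen.insert(text.to_string()) {
            return Some((offset, text));
        }
    }
    None
}
```

A query edit would correspond to clearing `seen` and restarting from the newest offset.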
## Observability
There are no new logs or telemetry. The practical debug path is state
inspection: `ChatComposer.history_search` tells whether the footer query
is idle, searching, matched, or unmatched; `ChatComposerHistory.search`
tracks selected raw offsets, pending persistent fetches, exhausted
directions, and unique match cache state. If a user reports skipped or
repeated results, first inspect the exact stored prompt text, the
selected offset, whether an async persistent response is still pending,
and whether a query edit restarted the search session.
## Tests
The change is covered by focused `codex-tui` unit tests for opening
search without previewing the latest entry, accepting and canceling
search, no-match restoration, boundary clamping, footer hints,
case-insensitive highlighting, local duplicate skipping, and persistent
duplicate skipping through async responses. Snapshot coverage captures
the footer-mode visual changes. Local verification used `just fmt`,
`cargo test -p codex-tui history_search`, `cargo test -p codex-tui`, and
`just fix -p codex-tui`.
- Let typed user messages submit while realtime is active and mirror
accepted text into the realtime text stream.
- Add integration coverage and snapshot for outbound realtime text.
## Summary
- detect WSL1 before Codex probes or invokes the Linux bubblewrap
sandbox
- fail early with a clear unsupported-operation message when a command
would require bubblewrap on WSL1
- document that WSL2 follows the normal Linux bubblewrap path while WSL1
is unsupported
## Why
Codex 0.115.0 made bubblewrap the default Linux sandbox. WSL1 cannot
create the user namespaces that bubblewrap needs, so shell commands
currently fail later with a raw bwrap namespace error. This makes the
unsupported environment explicit and keeps non-bubblewrap paths
unchanged.
The WSL detection reads /proc/version, lets an explicit WSL<version>
marker decide WSL1 vs WSL2+, and only treats a bare Microsoft marker as
WSL1 when no explicit WSL version is present.
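The detection rule above could look roughly like the sketch below. The enum `WslKind`, the function `classify_wsl`, and the sample `/proc/version` strings in the usage note are illustrative assumptions, not the code added by this change.

```rust
#[derive(Debug, PartialEq, Eq)]
enum WslKind {
    NotWsl,
    Wsl1,
    Wsl2OrNewer,
}

/// Hypothetical classifier for the contents of /proc/version.
fn classify_wsl(proc_version: &str) -> WslKind {
    let lower = proc_version.to_ascii_lowercase();
    // An explicit `WSL<version>` marker decides the result when present.
    let mut rest = lower.as_str();
    while let Some(idx) = rest.find("wsl") {
        let after = &rest[idx + 3..];
        let digits: String = after.chars().take_while(|c| c.is_ascii_digit()).collect();
        if let Ok(version) = digits.parse::<u32>() {
            return if version >= 2 {
                WslKind::Wsl2OrNewer
            } else {
                WslKind::Wsl1
            };
        }
        rest = after;
    }
    // A bare Microsoft marker without an explicit version is treated as WSL1.
    if lower.contains("microsoft") {
        WslKind::Wsl1
    } else {
        WslKind::NotWsl
    }
}
```

For example, a kernel string containing `microsoft-standard-WSL2` would classify as `Wsl2OrNewer`, while one containing only `Microsoft` would classify as `Wsl1`.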
addresses https://github.com/openai/codex/issues/16076
---------
Co-authored-by: Codex <noreply@openai.com>
## Description
Enable pnpm's reviewed build-script gate for this repo.
## What changed
- added `strictDepBuilds: true` to `pnpm-workspace.yaml`
## Why
The repo already uses pinned pnpm and frozen installs in CI. This adds
the remaining guard so dependency build scripts do not run unless they
are explicitly reviewed.
## Validation
- ran `pnpm install --frozen-lockfile`
Co-authored-by: Codex <noreply@openai.com>
## Summary
- register flattened handler aliases for deferred MCP tools
- cover the node_repl-shaped deferred MCP call path in tool registry
tests
## Root Cause
Deferred MCP tools were registered only under their namespaced handler
key, e.g. `mcp__node_repl__:js`. If the model/bridge emitted the
flattened qualified name `mcp__node_repl__js`, core parsed it as an MCP
payload but dispatch looked up the flattened handler key and returned
`unsupported call` before reaching the MCP handler.
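To make the fix concrete, here is a small sketch that registers both key shapes for one tool. The registry type and the function `register_deferred_mcp_tool` are hypothetical stand-ins; the real tool registry stores handlers rather than strings.

```rust
use std::collections::HashMap;

/// Hypothetical registry: maps a handler key to the namespaced target it
/// dispatches to. The String value stands in for a real handler.
fn register_deferred_mcp_tool(
    registry: &mut HashMap<String, String>,
    namespace: &str,
    tool: &str,
) {
    let target = format!("{namespace}:{tool}");
    // Namespaced key, e.g. `mcp__node_repl__:js`.
    registry.insert(target.clone(), target.clone());
    // Flattened alias, e.g. `mcp__node_repl__js`, pointing at the same target.
    registry.insert(format!("{namespace}{tool}"), target);
}
```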
## Validation
- `just fmt`
- `cargo test -p codex-tools
search_tool_registers_deferred_mcp_flattened_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `git diff --check`
Select Current Thread startup context by budget from newest turns, cap
each rendered turn at 300 approximate tokens, and add formatter plus
integration snapshot coverage.
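One way to picture budget-based selection is sketched below. Everything here is an assumption for illustration: the four-bytes-per-token estimate, the helper names, and the choice to stop at the first turn that does not fit so the selected context stays contiguous. Only the 300-token per-turn cap comes from the description above.

```rust
const MAX_TURN_TOKENS: usize = 300;

/// Assumption: approximate a token as four bytes of text.
fn approx_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Truncate a rendered turn so it costs at most MAX_TURN_TOKENS.
fn cap_turn(text: &str) -> String {
    let max_bytes = MAX_TURN_TOKENS * 4;
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Back off to a character boundary so the slice stays valid UTF-8.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    text[..end].to_string()
}

/// Pick turns from newest to oldest until the budget is exhausted, then
/// return them in chronological order. `turns` is ordered oldest-first.
fn select_turns(turns: &[&str], budget_tokens: usize) -> Vec<String> {
    let mut remaining = budget_tokens;
    let mut selected = Vec::new();
    for turn in turns.iter().rev() {
        let capped = cap_turn(turn);
        let cost = approx_tokens(&capped);
        if cost > remaining {
            break;
        }
        remaining -= cost;
        selected.push(capped);
    }
    selected.reverse();
    selected
}
```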
## Summary
- leave the default contributor devcontainer on its lightweight
platform-only Docker runtime
- install bubblewrap in setuid mode only in the secure devcontainer
image for running Codex inside Docker
- add Docker run args to the secure profile for bubblewrap's required
capabilities
- use explicit `seccomp=unconfined` and `apparmor=unconfined` in the
secure profile instead of shipping a custom seccomp profile
- document that the relaxed Docker security options are scoped to the
secure profile
## Why
Docker's default seccomp profile blocks bubblewrap with `pivot_root:
Operation not permitted`, even when the container has `CAP_SYS_ADMIN`.
Docker's default AppArmor profile also blocks bubblewrap with `Failed to
make / slave: Permission denied`.
A custom seccomp profile works, but it is hard for customers to audit
and understand. Using Docker's standard `seccomp=unconfined` option is
clearer: the secure profile intentionally relaxes Docker's outer sandbox
just enough for Codex to construct its own bubblewrap/seccomp sandbox
inside the container. The default contributor profile does not get these
expanded runtime settings.
## Validation
- `sed '/\/\*/,/\*\//d' .devcontainer/devcontainer.json | jq empty`
- `jq empty .devcontainer/devcontainer.secure.json`
- `git diff --check`
- `docker build --platform=linux/arm64 -t
codex-devcontainer-bwrap-test-arm64 ./.devcontainer`
- `docker build --platform=linux/arm64 -f
.devcontainer/Dockerfile.secure -t
codex-devcontainer-secure-bwrap-test-arm64 .`
- interactive `docker run -it` smoke tests:
- verified non-root users `ubuntu` and `vscode`
- verified secure image `/usr/bin/bwrap` is setuid
- verified user/pid namespace, user/network namespace, and preserved-fd
`--ro-bind-data` bwrap commands
- reran secure-image smoke test with simplified `seccomp=unconfined`
setup:
- `bwrap-basic-ok`
- `bwrap-netns-ok`
- `codex-ok`
- ran Codex inside the secure image:
- `codex --version` -> `codex-cli 0.120.0`
- `codex sandbox linux --full-auto -- /bin/sh -lc '...'` -> exited 0 and
printed `codex-inner-ok`
Note: direct `bwrap --proc /proc` is still denied by this Docker
runtime, and Codex's existing proc-mount preflight fallback handles that
by retrying without `--proc`.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- update the guardian timeout guidance to say permission approval review
timed out
- simplify the retry guidance to say retry once or ask the user for
guidance or explicit approval
## Testing
- cargo test -p codex-core
guardian_timeout_message_distinguishes_timeout_from_policy_denial
- cargo test -p codex-core
guardian_review_decision_maps_to_mcp_tool_decision
**Summary**
This PR treats Guardian timeouts as distinct from explicit denials in
the core approval paths.
Timeouts now return timeout-specific guidance instead of Guardian
policy-rejection messaging.
It updates the command, shell, network, and MCP approval flows and adds
focused test coverage.
Addresses #17303
Problem: The standalone codex-tui entrypoint only printed token usage on
exit, so resumable sessions could omit the codex resume footer even when
thread metadata was available.
Solution: Format codex-tui exit output from AppExitInfo so it includes
the same resume hint as the main CLI and reports fatal exits
consistently.
Addresses #17311
Problem: `/stop` stops background terminals, but `/ps` can still show
stale entries because the TUI process cache is cleared only after later
exec end events arrive.
Solution: Clear the TUI's tracked unified exec process list and footer
immediately when `/stop` submits background terminal cleanup.
Addresses #17353
Problem: Codex rate-limit fetching failed when the backend returned the
new `prolite` subscription plan type.
Solution: Add `prolite` to the backend/account/auth plan mappings, keep
unknown WHAM plan values decodable, and regenerate app-server plan
schemas.
Problem: The automatic issue labeler still treated agent-related issues
as one broad category, even though more specific agent-area labels now
exist.
Solution: Update the issue labeler prompt to prefer the new agent-area
labels and keep "agent" as the fallback for uncategorized core agent
issues.
Addresses #17276
Problem: Closing the terminal while the TUI input stream is pending
could leave the app outside the normal shutdown path, which is risky
when an approval prompt is active.
Solution: Treat a closed TUI input stream as ShutdownFirst so existing
thread shutdown behavior cancels pending work and approvals before exit.
# TL;DR
- Adds recognized slash commands to the TUI's local in-session recall
history.
- This is the MVP of the whole feature. It keeps slash-command recall
local only: nothing is written to persistent history, app-server
history, or core history storage.
- Treats slash commands like submitted text once they parse as a known
built-in command, regardless of whether command dispatch later succeeds.
# Problem
Slash commands are handled outside the normal message submission path,
so they could clear the composer without becoming part of the local
Up-arrow recall list. That made command-heavy workflows awkward: after
running `/diff`, `/rename Better title`, `/plan investigate this`, or
even a valid command that reports a usage error, users had to retype the
command instead of recalling and editing it like a normal prompt.
The goal of this PR is to make slash commands feel like submitted input
inside the current TUI session while keeping the change deliberately
local. This is not persistent history yet; it only affects the
composer's in-memory recall behavior.
# Mental model
The composer owns draft state and local recall. When slash input parses
as a recognized built-in command, the composer stages the submitted
command text before returning `InputResult::Command` or
`InputResult::CommandWithArgs`. `ChatWidget` then dispatches the command
and records the staged entry once dispatch returns to the input-result
path.
Command-name recognition is the only validation before local recall. A
valid slash command is recallable whether it succeeds, fails with a
usage error, no-ops, is unavailable while a task is running, or is
skipped by command-specific logic. An unrecognized slash command is
different: it is restored as a draft, surfaces the existing
unrecognized-command message, and is not added to recall.
Bare commands recalled from typed text use the trimmed submitted draft.
Commands selected from the popup record the canonical command text, such
as `/diff`, rather than the partial filter text the user typed. Inline
commands with arguments keep the original command invocation available
locally even when their arguments are later prepared through the normal
submission pipeline.
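The recognition rule can be sketched as a small function: input is staged for recall only when its command name is known, regardless of what dispatch does later. The command list and the function `stage_for_recall` are hypothetical; the real composer stages a full `HistoryEntry`, not a plain string.

```rust
/// Hypothetical subset of built-in command names.
const KNOWN_COMMANDS: &[&str] = &["diff", "rename", "plan", "init"];

/// Returns the trimmed text to stage for local recall, or `None` when the
/// input is not a slash command with a recognized name.
fn stage_for_recall(input: &str) -> Option<String> {
    let trimmed = input.trim();
    let rest = trimmed.strip_prefix('/')?;
    let name = rest.split_whitespace().next()?;
    // Recognition is the only check; dispatch success is irrelevant here.
    KNOWN_COMMANDS.contains(&name).then(|| trimmed.to_string())
}
```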
# Non-goals
Persisting slash commands across sessions is intentionally out of scope.
This change does not modify app-server history, core history storage,
protocol events, or message submission semantics.
This does not change command availability, command side effects, popup
filtering, command parsing, or the semantics of unsupported commands. It
only changes whether recognized slash-command invocations are available
through local Up-arrow recall after the user submits them.
# Tradeoffs
The main tradeoff is that recall is based on command recognition, not
command outcome. This intentionally favors a simpler user model: if the
TUI accepted the input as a slash command, the user can recall and edit
that input just like plain text. That means valid-but-unsuccessful
invocations such as usage errors are recallable, which is useful when
the next action is usually to edit and retry.
The previous accept/reject design required command dispatch to report a
boolean outcome, which made the dispatcher API noisier and forced every
branch to decide history behavior. This version keeps the dispatch APIs
as side-effect-only methods and localizes history recording to the
slash-command input path.
Inline command handling still avoids double-recording by preparing
inline arguments without using the normal message-submission history
path. The staged slash-command entry remains the single local recall
record for the command invocation.
# Architecture
`ChatComposer` stages a pending `HistoryEntry` when recognized
slash-command input is promoted into an input result. The pending entry
mirrors the existing local history payload shape so recall can restore
text elements, local images, remote images, mention bindings, and
pending paste state when those are present.
`BottomPane` exposes a narrow method for recording that staged command
entry because it owns the composer. `ChatWidget` records the staged
entry after dispatching a recognized command from the input-result
match. Valid commands rejected before they reach `ChatWidget`, such as
commands unavailable while a task is running, are staged and recorded in
the composer path that detects the rejection.
Slash-command dispatch itself now lives in
`chatwidget/slash_dispatch.rs` so the behavior is reviewable without
adding more weight to `chatwidget.rs`. The extraction is
behavior-preserving: the dispatch match arms stay intact, while the
input flow in `chatwidget.rs` remains the single place that connects
submitted slash-command input to dispatch.
# Observability
There is no new logging because this is a local UI recall behavior and
the result is directly visible through Up-arrow recall. The practical
debug path is to trace Enter through
`ChatComposer::try_dispatch_bare_slash_command`,
`ChatComposer::try_dispatch_slash_command_with_args`, or popup Enter/Tab
handling, then confirm the recognized command is staged before dispatch
and recorded exactly once afterward.
If a valid command unexpectedly does not appear in recall, check whether
the input path staged slash history before clearing the composer and
whether it used the `ChatWidget` slash-dispatch wrapper. If an
unrecognized command unexpectedly appears in recall, check the parser
branch that should restore the draft instead of staging history.
# Tests
Composer-level tests cover staging and recording for a bare typed slash
command, a popup-selected command, and an inline command with arguments.
Chat-widget tests cover valid commands being recallable after normal
dispatch, inline dispatch, usage errors, task-running unavailability,
no-op stub dispatch, and command-specific skip behavior such as `/init`
when an instructions file already exists. They also cover the negative
case: unrecognized slash commands are not added to local recall.
## Summary
- Add an optional `tags` dictionary to feedback upload params.
- Capture the active app-server turn id in the TUI and submit it as
`tags.turn_id` with `/feedback` uploads.
- Merge client-provided feedback tags into Sentry feedback tags while
preserving reserved system fields like `thread_id`, `classification`,
`cli_version`, `session_source`, and `reason`.
## Behavior / impact
Existing feedback upload callers remain compatible because `tags` is
optional and nullable. The wire shape is still a normal JSON object /
TypeScript dictionary, so adding future feedback metadata will not
require a new top-level protocol field each time. This change only adds
feedback metadata for Codex CLI/TUI uploads; it does not affect existing
pipelines, DAGs, exports, or downstream consumers unless they choose to
read the new `turn_id` feedback tag.
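A minimal sketch of the merge behavior, assuming tags are plain string maps. The function `merge_feedback_tags` is illustrative and not the actual `codex-feedback` API; the reserved key list is taken from the summary above.

```rust
use std::collections::BTreeMap;

/// Reserved system fields that client-provided tags must not override.
const RESERVED: &[&str] = &[
    "thread_id",
    "classification",
    "cli_version",
    "session_source",
    "reason",
];

/// Hypothetical merge: add client tags to the system tags, skipping any
/// client tag that uses a reserved key. `None` models an omitted `tags`.
fn merge_feedback_tags(
    system: BTreeMap<String, String>,
    client: Option<BTreeMap<String, String>>,
) -> BTreeMap<String, String> {
    let mut merged = system;
    for (key, value) in client.unwrap_or_default() {
        if RESERVED.contains(&key.as_str()) {
            continue;
        }
        merged.insert(key, value);
    }
    merged
}
```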
## Tests
- `cargo fmt -- --config imports_granularity=Item` passed; stable
rustfmt warned that `imports_granularity` is nightly-only.
- `cargo run -p codex-app-server-protocol --bin write_schema_fixtures`
- `cargo test -p codex-feedback
upload_tags_include_client_tags_and_preserve_reserved_fields`
- `cargo test -p codex-app-server-protocol
schema_fixtures_match_generated`
- `cargo test -p codex-tui build_feedback_upload_params`
- `cargo test -p codex-tui
live_app_server_turn_started_sets_feedback_turn_id`
- `cargo check -p codex-app-server --tests`
- `git diff --check`
---------
Co-authored-by: Codex <noreply@openai.com>
## Description
Keeps the existing Codex contributor devcontainer in place and adds a
separate secure profile for customer use.
## What changed
- leaves `.devcontainer/devcontainer.json` and the contributor
`Dockerfile` aligned with `main`
- adds `.devcontainer/devcontainer.secure.json` and
`.devcontainer/Dockerfile.secure`
- adds secure-profile bootstrap scripts:
- `post_install.py`
- `post-start.sh`
- `init-firewall.sh`
- updates `.devcontainer/README.md` to explain when to use each path
## Secure profile behavior
The new secure profile is opt-in and is meant for running Codex in a
stricter project container:
- preinstalls the Codex CLI plus common build tools
- uses persistent volumes for Codex state, Cargo, Rustup, and GitHub
auth
- applies an allowlist-driven outbound firewall at startup
- blocks IPv6 by default so the allowlist cannot be bypassed via AAAA
routes
- keeps the stricter networking isolated from the default contributor
workflow
## Resulting behavior
- `devcontainer.json` remains the low-friction Codex contributor setup
- `devcontainer.secure.json` is the customer-facing secure option
- the repo supports both workflows without forcing the secure profile on
Codex contributors
Addresses #17302
Problem: `thread/list` compared cwd filters with raw path equality, so
`resume --last` could miss Windows sessions when the saved cwd used a
verbatim path form and the current cwd did not.
Solution: Normalize cwd comparisons through the existing path comparison
utilities before falling back to direct equality, and add Windows
regression coverage for verbatim paths. I made this a general utility
function and replaced all of the duplicated instances of it across the
code base.
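To show why raw equality misses verbatim paths, the sketch below compares two cwd strings after stripping the Windows verbatim prefix. The helper names and the ASCII case-insensitive comparison are assumptions; the actual change reuses existing path comparison utilities that are not shown here.

```rust
/// Hypothetical normalization: drop the Windows verbatim prefix
/// (`\\?\` or `\\?\UNC\`) and lowercase ASCII for comparison.
fn normalize_windows_path(path: &str) -> String {
    let stripped = if let Some(unc) = path.strip_prefix(r"\\?\UNC\") {
        format!(r"\\{unc}")
    } else if let Some(rest) = path.strip_prefix(r"\\?\") {
        rest.to_string()
    } else {
        path.to_string()
    };
    stripped.to_ascii_lowercase()
}

/// Compare normalized forms first, then fall back to direct equality.
fn same_cwd(saved: &str, current: &str) -> bool {
    normalize_windows_path(saved) == normalize_windows_path(current) || saved == current
}
```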
## Summary
- Update the marketplace add local-source integration test to pass an
explicit relative local path.
- Keep the change test-only; no CLI source parsing behavior changes.
## Tests
- cargo fmt -p codex-cli
- cargo test -p codex-cli --test marketplace_add
## Impact
- Production behavior is unchanged.
- No impact to feedback upload logic, DAGs, exports, or downstream
pipelines.
Co-authored-by: Codex <noreply@openai.com>
## Summary
- keep hostname targets proxied by default by removing hostname suffixes
from the managed `NO_PROXY` value while preserving private/link-local
CIDRs
- make the macOS `allow_local_binding` sandbox rules match the local
socket shape used by DNS tools by allowing wildcard local binds
- allow raw DNS egress to remote port 53 only when `allow_local_binding`
is enabled, without opening blanket outbound network access
## Root cause
Raw DNS tools do not honor `HTTP_PROXY` or `ALL_PROXY`, so the
proxy-only Seatbelt policy blocked their resolver traffic before it
could reach host DNS. In the affected managed config,
`allow_local_binding = true`, but the existing rule only allowed
`localhost:*` binds; `dig`/BIND can bind sockets in a way that needs
wildcard local binding. Separately, hostname suffixes in `NO_PROXY`
could force internal hostnames to resolve locally instead of through the
proxy path.
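The `NO_PROXY` part of this change can be pictured as a filter that keeps IP and CIDR entries and drops everything else. This is a sketch under that assumption; `strip_hostname_entries` is a hypothetical name and the managed value is built differently in the real code.

```rust
use std::net::IpAddr;

/// Returns true for a bare IP address or an IP/prefix CIDR entry.
fn is_ip_or_cidr(entry: &str) -> bool {
    let (addr, prefix) = match entry.split_once('/') {
        Some((addr, prefix)) => (addr, Some(prefix)),
        None => (entry, None),
    };
    let Ok(ip) = addr.parse::<IpAddr>() else {
        return false;
    };
    let max_bits = if ip.is_ipv4() { 32 } else { 128 };
    match prefix {
        None => true,
        Some(p) => p.parse::<u8>().map_or(false, |bits| bits <= max_bits),
    }
}

/// Hypothetical filter: drop hostname and suffix entries so those targets
/// stay proxied, while private and link-local CIDRs are preserved.
fn strip_hostname_entries(no_proxy: &str) -> String {
    no_proxy
        .split(',')
        .map(str::trim)
        .filter(|entry| is_ip_or_cidr(entry))
        .collect::<Vec<_>>()
        .join(",")
}
```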
---------
Co-authored-by: Codex <noreply@openai.com>
Problem: The TUI still depended on `codex-core` directly in a number of
places, and we had no enforcement to keep this problem from getting
worse.
Solution: Route TUI core access through
`codex-app-server-client::legacy_core`, add CI enforcement for that
boundary, and re-export this legacy bridge inside the TUI as
`crate::legacy_core` so the remaining call sites stay readable. There is
no functional change in this PR — just changes to import targets.
Over time, we can whittle away at the remaining symbols in this legacy
namespace with the eventual goal of removing them all. In the meantime,
this linter rule will prevent us from inadvertently importing new
symbols from core.
## Summary
- Add `TimedOut` to Guardian/review carrier types:
- `ReviewDecision::TimedOut`
- `GuardianAssessmentStatus::TimedOut`
- app-server v2 `GuardianApprovalReviewStatus::TimedOut`
- Regenerate app-server JSON/TypeScript schemas for the new wire shape.
- Wire the new status through core/app-server/TUI mappings with
conservative fail-closed handling.
- Keep `TimedOut` non-user-selectable in the approval UI.
**Does not change runtime behavior yet; emitting `TimedOut` and
parent-model timeout messaging will come in follow-up PRs**
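As a simplified picture of the conservative fail-closed handling, the sketch below uses a stand-in enum with only three variants. The real `ReviewDecision` has a different variant set, and the helper functions here are hypothetical.

```rust
/// Simplified stand-in for a review decision; not the real `ReviewDecision`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Approved,
    Denied,
    TimedOut,
}

/// Fail closed: a timeout never allows execution.
fn allows_execution(decision: Decision) -> bool {
    match decision {
        Decision::Approved => true,
        Decision::Denied | Decision::TimedOut => false,
    }
}

/// `TimedOut` is a system outcome, so the approval UI does not offer it.
fn user_selectable(decision: Decision) -> bool {
    !matches!(decision, Decision::TimedOut)
}
```

Matching exhaustively, without a wildcard arm, means a newly added variant forces every mapping site to make an explicit choice.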
Problem: The Windows exec-server test command could let separator
whitespace become part of `echo` output, making the exact
retained-output assertion flaky.
Solution: Tighten the Windows `cmd.exe` command by placing command
separators directly after the echoed tokens so stdout remains
deterministic while preserving the exact assertion.
Added a new top-level `codex marketplace add` command for installing
plugin marketplaces into Codex’s local marketplace cache.
This change adds source parsing for local directories, GitHub shorthand,
and git URLs, supports optional `--ref` and git-only `--sparse` checkout
paths, stages the source in a temp directory, validates the marketplace
manifest, and installs it under
`$CODEX_HOME/marketplaces/<marketplace-name>`.
Included tests cover local install behavior in the CLI and marketplace
discovery from installed roots in core. Scoped formatting and fix passes
were run, and targeted CLI/core tests passed.
## Summary
- preserve logical symlink paths during permission normalization and
config cwd handling
- bind real targets for symlinked readable/writable roots in bwrap and
remap carveouts and unreadable roots there
- add regressions for symlinked carveouts and nested symlink escape
masking
## Root cause
Permission normalization canonicalized symlinked writable roots and cwd
to their real targets too early. That drifted policy checks away from
the logical paths the sandboxed process can actually address, while
bwrap still needed the real targets for mounts. The mismatch caused
shell and apply_patch failures on symlinked writable roots.
## Impact
Fixes #15781.
Also fixes #17079:
- #17079 is the protected symlinked carveout side: bwrap now binds the
real symlinked writable-root target and remaps carveouts before masking.
Related to #15157:
- #15157 is the broader permission-check side of this path-identity
problem. This PR addresses the shared logical-vs-canonical normalization
issue, but the reported Darwin prompt behavior should be validated
separately before auto-closing it.
This should also fix #14672, #14694, #14715, and #15725:
- #14672, #14694, and #14715 are the same Linux
symlinked-writable-root/bwrap family as #15781.
- #15725 is the protected symlinked workspace path variant; the PR
preserves the protected logical path in policy space while bwrap applies
read-only or unreadable treatment to the resolved target so
file-vs-directory bind mismatches do not abort sandbox setup.
## Notes
- Added Linux-only regressions for symlinked writable ancestors and
protected symlinked directory targets, including nested symlink escape
masking without rebinding the escape target writable.
---------
Co-authored-by: Codex <noreply@openai.com>
## Description
This PR introduces `review_id` as the stable identifier for guardian
reviews and exposes it in app-server `item/autoApprovalReview/started`
and `item/autoApprovalReview/completed` events.
Internally, guardian rejection state is now keyed by `review_id` instead
of the reviewed tool item ID. `target_item_id` is still included when a
review maps to a concrete thread item, but it is no longer overloaded as
the review lifecycle identifier.
## Motivation
We'd like to give users the ability to preempt a guardian review while
it's running (approve or decline).
However, we can't implement the API that allows the user to override a
running guardian review because we didn't have a unique `review_id` per
guardian review. Using `target_item_id` is not correct since:
- with execve reviews, there can be multiple execve calls (and therefore
guardian reviews) per shell command
- with network policy reviews, there is no target item ID
The PR that actually implements user overrides will use `review_id` as
the stable identifier.
## Motivation
The `SessionStart` hook already receives `startup` and `resume` sources,
but sessions created from `/clear` previously looked like normal startup
sessions. That made it impossible for hook authors to distinguish
between the two with the matcher.
## Summary
- Add `InitialHistory::Cleared` so `/clear`-created sessions can be
distinguished from ordinary startup sessions.
- Add `SessionStartSource::Clear` and wire it through core, app-server
thread start params, and TUI clear-session flow.
- Update app-server protocol schemas, generated TypeScript, docs, and
related tests.
https://github.com/user-attachments/assets/9cae3cb4-41c7-4d06-b34f-966252442e5c
# Motivation
Make hook display less noisy and more useful by keeping transient hook
activity out of permanent history unless there is useful output,
preserving visibility for meaningful hook work, and making completed
hook severity easier to scan.
Also addresses some of the concerns in
https://github.com/openai/codex/issues/15497
# Changes
## Demo
https://github.com/user-attachments/assets/9d8cebd4-a502-4c95-819c-c806c0731288
Reverse spec for the behavior changes in this branch:
## Hook Lifecycle Rendering
- Hook start events no longer write permanent history rows like `Running
PreToolUse hook`.
- Running hooks now render in a dedicated live hook area above the
composer. It's similar to the active cell we use for tool calls, but
it's a separate lane.
- Running hook rows use the existing animation setting.
## Hook Reveal Timing
- We wait 300ms before showing running hook rows and linger for up to
600ms once visible.
- This keeps fast hooks from flashing a transient `Running hook` row
that disappears before the user can read it.
- If a fast hook completes with meaningful output, only the completed
hook result is written to history.
- If a fast hook completes successfully with no output, it leaves no
visible trace.
## Completed Hook Output
- Completed hooks with output are sticky, for example `• SessionStart
hook (completed)`.
- Hook output entries are rendered under that row with stable prefixes:
`warning:`, `stop:`, `feedback:`, `hook context:`, and `error:`.
- Blocked hooks show feedback entries, for example `• PreToolUse hook
(blocked)` followed by `feedback: ...`.
- Failed hooks show error entries, for example `• PostToolUse hook
(failed)` followed by `error: ...`.
- Stopped hooks show stop entries and remain visually treated as
non-success.
## Parallel Hook Behavior
- Multiple simultaneously running hooks can be tracked in one live hook
cell.
- Adjacent running hooks with the same hook event name and same status
message collapse into a count, for example `• Running 3 PreToolUse
hooks: checking command policy`.
- Running hooks with different event names or different status messages
remain separate rows.
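The collapsing rule for adjacent running hooks could be sketched as follows. The `RunningHook` struct, the function `collapse_rows`, and the single-hook row format are assumptions; only the counted format mirrors the example above.

```rust
/// Hypothetical view of a running hook: its event name and status message.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RunningHook {
    event: String,
    status: String,
}

/// Collapse adjacent hooks that share an event name and status message
/// into a single row with a count.
fn collapse_rows(hooks: &[RunningHook]) -> Vec<String> {
    let mut groups: Vec<(&RunningHook, usize)> = Vec::new();
    for hook in hooks {
        if let Some((prev, count)) = groups.last_mut() {
            if *prev == hook {
                *count += 1;
                continue;
            }
        }
        groups.push((hook, 1));
    }
    groups
        .into_iter()
        .map(|(hook, count)| {
            if count == 1 {
                // Assumed format for a single running hook.
                format!("• Running {} hook: {}", hook.event, hook.status)
            } else {
                format!("• Running {} {} hooks: {}", count, hook.event, hook.status)
            }
        })
        .collect()
}
```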
## Hook Run Identity
- `PreToolUse` and `PostToolUse` hook run IDs now include the tool call
ID which prevents concurrent tool-use hooks from sharing a run ID and
clobbering each other in the UI.
- This ID scoping applies to tool-use hooks only; other hook event types
keep their existing run identity behavior.
## App-Server Hook Notifications
- App-server `HookStarted` and `HookCompleted` notifications use the
same live hook rendering path as core hook events.
- `UserPromptSubmit` hook notifications now render through the same
completed hook output format, including warning and stop entries.
- Add thread-title as an optional TUI status line item, omitted unless
the user has set a custom name (`ChatWidget.thread_name`).
- Refresh the status line when threads are renamed.
- Add snapshot coverage for renamed-thread footer behavior.
Encourages realtime prompt handling to delegate user requests to the
backend agent by default when repo inspection, commands, implementation,
or validation may help.
Co-authored-by: Codex <noreply@openai.com>
Builds on #17264.
- queues Realtime V2 `response.create` while an active response is open,
then flushes it after `response.done` or `response.cancelled`
- requests `response.create` after background agent final output and
steering acknowledgements
- adds app-server integration coverage for all `response.create` paths
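A rough sketch of the queue-then-flush behavior, assuming at most one
`response.create` needs to be queued at a time. `ResponseCreateGate`
and `ServerEvent` are illustrative names, not the types used in the
change.
```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ServerEvent {
    ResponseDone,
    ResponseCancelled,
    Other,
}

#[derive(Default)]
struct ResponseCreateGate {
    response_open: bool,
    create_queued: bool,
}

impl ResponseCreateGate {
    /// Returns true if `response.create` should be sent right now;
    /// otherwise it is remembered until the active response closes.
    fn request_create(&mut self) -> bool {
        if self.response_open {
            self.create_queued = true;
            false
        } else {
            self.response_open = true;
            true
        }
    }

    /// Returns true if a queued `response.create` should be flushed.
    fn on_server_event(&mut self, event: ServerEvent) -> bool {
        match event {
            ServerEvent::ResponseDone | ServerEvent::ResponseCancelled => {
                self.response_open = false;
                if self.create_queued {
                    // Flushing opens a new response immediately.
                    self.create_queued = false;
                    self.response_open = true;
                    true
                } else {
                    false
                }
            }
            ServerEvent::Other => false,
        }
    }
}
```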
Validation:
- `just fmt`
- `cargo check -p codex-app-server --tests`
- `git diff --check`
- CI green
---------
Co-authored-by: Codex <noreply@openai.com>
## Description
We reuse a guardian thread for a given user thread when we can. However,
we had always sent the full transcript history every time we made a
followup review request to an existing guardian thread.
This is especially bad for long guardian threads since we keep
re-appending old transcript entries instead of just what has changed.
The fix is to just send what's new.
**Caveat**: Whenever a thread is compacted or rolled back, we fall back
to sending the full transcript to guardian again since the thread's
history has been modified. However in the happy path we get a nice
optimization.
## Before
Initial guardian review sends the full parent transcript:
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
[1] user: Please check the repo visibility and push the docs fix if needed.
[2] tool gh_repo_view call: {"repo":"openai/codex"}
[3] tool gh_repo_view result: repo visibility: public
[4] assistant: The repo is public; I now need approval to push the docs fix.
>>> TRANSCRIPT END
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
...
>>> APPROVAL REQUEST END
```
And a followup to the same guardian thread would send the full
transcript again (including items 1-4 we already sent):
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
[1] user: Please check the repo visibility and push the docs fix if needed.
[2] tool gh_repo_view call: {"repo":"openai/codex"}
[3] tool gh_repo_view result: repo visibility: public
[4] assistant: The repo is public; I now need approval to push the docs fix.
[5] user: Please push the second docs fix too.
[6] assistant: I need approval for the second docs fix.
>>> TRANSCRIPT END
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
...
>>> APPROVAL REQUEST END
```
## After
Initial guardian review sends the full parent transcript (this is
unchanged):
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
[1] user: Please check the repo visibility and push the docs fix if needed.
[2] tool gh_repo_view call: {"repo":"openai/codex"}
[3] tool gh_repo_view result: repo visibility: public
[4] assistant: The repo is public; I now need approval to push the docs fix.
>>> TRANSCRIPT END
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
...
>>> APPROVAL REQUEST END
```
But a followup now sends:
```
The following is the Codex agent history added since your last approval assessment. Continue the same review conversation...
>>> TRANSCRIPT DELTA START
[5] user: Please push the second docs fix too.
[6] assistant: I need approval for the second docs fix.
>>> TRANSCRIPT DELTA END
The Codex agent has requested the following next action:
>>> APPROVAL REQUEST START
...
>>> APPROVAL REQUEST END
```
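The full-versus-delta choice can be sketched as follows. This is an
illustration only: `GuardianCursor`, `ReviewPayload`, and the
`history_version` counter (assumed to change whenever the parent thread
is compacted or rolled back) are hypothetical.
```rust
#[derive(Debug, PartialEq)]
enum ReviewPayload<'a> {
    Full(&'a [String]),
    Delta(&'a [String]),
}

/// What the guardian thread has already seen (hypothetical shape).
#[derive(Debug, Clone, Copy)]
struct GuardianCursor {
    sent_len: usize,
    history_version: u64,
}

/// Pick the payload for the next review request and return the cursor
/// to remember for the following one.
fn next_review_payload<'a>(
    transcript: &'a [String],
    history_version: u64,
    cursor: Option<GuardianCursor>,
) -> (ReviewPayload<'a>, GuardianCursor) {
    let next_cursor = GuardianCursor { sent_len: transcript.len(), history_version };
    let payload = match cursor {
        // Happy path: history is unchanged, so send only the new entries.
        Some(c) if c.history_version == history_version && c.sent_len <= transcript.len() => {
            ReviewPayload::Delta(&transcript[c.sent_len..])
        }
        // First review, or history was rewritten: send everything again.
        _ => ReviewPayload::Full(transcript),
    };
    (payload, next_cursor)
}
```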
The disconnect path now reuses the same teardown flow as explicit
unsubscribe, and the thread-state bookkeeping consistently reports only
threads that lost their last subscriber
https://github.com/openai/codex/issues/16895
The rollout writer now keeps an owned/monitored task handle, returns
real Result acks for flush/persist/shutdown, retries failed flushes by
reopening the rollout file, and keeps buffered items until they are
successfully written. Session flushes are now real durability barriers
for fork/rollback/read-after-write paths, while turn completion surfaces
a warning if the rollout still cannot be saved after recovery.
This introduces session-scoped ownership for exec-server so ws
disconnects no longer immediately kill running remote exec processes,
and it prepares the protocol for reconnect-based resume.
- add session_id / resume_session_id to the exec-server initialize
handshake
- move process ownership under a shared session registry
- detach sessions on websocket disconnect and expire them after a TTL
instead of killing processes immediately (we will resume based on this)
- allow a new connection to resume an existing session and take over
notifications/ownership
- use UUIDs for session IDs so they are not predictable, as we don't
have auth for now
- make detached-session expiry authoritative at resume time so teardown
wins at the TTL boundary
- reject long-poll process/read calls that get resumed out from under an
older attachment
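A simplified sketch of detach, expiry, and resume. `SessionRegistry`
and its methods are hypothetical, session IDs are plain strings here
rather than UUIDs, and process ownership and notifications are left
out.
```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

enum SessionState {
    Attached,
    Detached { since: Instant },
}

#[derive(Debug, PartialEq)]
enum ResumeError {
    Unknown,
    Expired,
}

struct SessionRegistry {
    ttl: Duration,
    sessions: HashMap<String, SessionState>,
}

impl SessionRegistry {
    fn new(ttl: Duration) -> Self {
        Self { ttl, sessions: HashMap::new() }
    }

    fn attach_new(&mut self, id: &str) {
        self.sessions.insert(id.to_string(), SessionState::Attached);
    }

    /// Websocket disconnect: keep the session, start the TTL clock.
    fn detach(&mut self, id: &str, now: Instant) {
        if let Some(state) = self.sessions.get_mut(id) {
            *state = SessionState::Detached { since: now };
        }
    }

    /// Resume from a new connection. Expiry is checked here, and
    /// teardown wins exactly at the TTL boundary (`>=`).
    fn resume(&mut self, id: &str, now: Instant) -> Result<(), ResumeError> {
        let expired = match self.sessions.get(id) {
            None => return Err(ResumeError::Unknown),
            Some(SessionState::Detached { since }) => now.duration_since(*since) >= self.ttl,
            Some(SessionState::Attached) => false,
        };
        if expired {
            self.sessions.remove(id);
            Err(ResumeError::Expired)
        } else {
            // The new connection takes over the session.
            self.sessions.insert(id.to_string(), SessionState::Attached);
            Ok(())
        }
    }
}
```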
---------
Co-authored-by: Codex <noreply@openai.com>
Stream Realtime V2 background agent updates while the background agent
task is still running, then send the final tool output when it
completes. User input during an active V2 handoff is acknowledged back
to realtime as a steering update.
Stack:
- Depends on #17278 for the background_agent rename.
- Depends on #17280 for the input task handler refactor.
Coverage:
- Adds an app-server integration regression test that verifies V2
progress is sent before the final function-call output.
Validation:
- just fmt
- cargo check -p codex-core
- cargo check -p codex-app-server --tests
- git diff --check
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
This PR adds the parent conversation/session id to the subagent-start
analytics event for Guardian subagents.
Previously, Guardian sessions were emitted as subagent
thread-initialized events, but their `parent_thread_id` was serialized
as `null`. After this change, the `codex_thread_initialized` analytics
event for a Guardian child session includes the parent user conversation
id.
Refactor the realtime input task select loop into named handlers for
user text, background agent output, realtime server events, and user
audio without changing the V2 behavior.
Stack:
- Depends on #17278 for the background_agent rename.
Validation:
- just fmt
- cargo check -p codex-core
- git diff --check
---------
Co-authored-by: Codex <noreply@openai.com>
Rename the Realtime V2 delegation tool and parser constant to
background_agent, and update the tool description and fixtures to match.
Validation: just fmt; cargo check -p codex-api; git diff --check
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Replace the manual `/notify-owner` flow with an inline confirmation
prompt when a usage-based workspace member hits a credits-depleted
limit.
- Fetch the current workspace role from the live ChatGPT
`accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
the desktop and web clients.
- Keep owner, member, and spend-cap messaging distinct so we only offer
the owner nudge when the workspace is actually out of credits.
## What Changed
- `backend-client`
- Added a typed fetch for the current account role from
`accounts/check`.
- Mapped backend role values into a Rust workspace-role enum.
- `app-server` and protocol
- Added `workspaceRole` to `account/read` and `account/updated`.
- Derived `isWorkspaceOwner` from the live role, with a fallback to the
cached token claim when the role fetch is unavailable.
- `tui`
- Removed the explicit `/notify-owner` slash command.
- When a member is blocked because the workspace is out of credits, the
error now prompts:
- `Your workspace is out of credits. Request more from your workspace
owner? [y/N]`
- Choosing `y` sends the existing owner-notification request.
- Choosing `n`, pressing `Esc`, or accepting the default selection
dismisses the prompt without sending anything.
- Selection popups now honor explicit item shortcuts, which is how the
`y` / `n` interaction is wired.
## Reviewer Notes
- The main behavior change is scoped to usage-based workspace members
whose workspace credits are depleted.
- Spend-cap reached should not show the owner-notification prompt.
- Owners and admins should continue to see `/usage` guidance instead of
the member prompt.
- The live role fetch is best-effort; if it fails, we fall back to the
existing token-derived ownership signal.
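The gating described in these notes can be sketched as one predicate.
The enum and function names are hypothetical, and the check for whether
the workspace is usage-based is omitted for brevity.
```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkspaceRole {
    Owner,
    Admin,
    Member,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum BlockReason {
    CreditsDepleted,
    SpendCapReached,
}

/// Should the inline "request more from your workspace owner" prompt
/// be offered? `live_role` is `None` when the best-effort role fetch
/// failed, in which case the cached token claim is used instead.
fn should_offer_owner_nudge(
    live_role: Option<WorkspaceRole>,
    token_is_owner: bool,
    reason: BlockReason,
) -> bool {
    let is_owner_or_admin = match live_role {
        Some(WorkspaceRole::Owner) | Some(WorkspaceRole::Admin) => true,
        Some(WorkspaceRole::Member) => false,
        None => token_is_owner,
    };
    reason == BlockReason::CreditsDepleted && !is_owner_or_admin
}
```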
## Testing
- Manual verification
- Workspace owner does not see the member prompt.
- Workspace member with depleted credits sees the confirmation prompt
and can send the nudge with `y`.
- Workspace member with spend cap reached does not see the
owner-notification prompt.
### Workspace member out of usage
https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1
### Workspace owner
<img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48
22 AM"
src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6"
/>
Addresses #17283
Problem: `codex --remote wss://...` could panic because
app-server-client did not install rustls' process-level crypto provider
before opening TLS websocket connections.
Solution: Add the existing rustls provider utility dependency and
install it before the remote websocket connect.
# What
Project raw Stop-hook prompt response items into typed v2 hookPrompt
item-completed notifications before applying the raw-response-event
filter. Keep ordinary raw response items filtered for normal
subscribers; only the existing hookPrompt bridge runs on the filtered
raw-item path.
# Why
Blocked Stop hooks record their continuation instruction as a raw
model-history user item. Normal v2 desktop subscribers do not opt into
raw response events, so the app-server listener filtered that raw item
before the existing hookPrompt translator could emit the typed live
item/completed notification. As a result, the hook-prompt bubble only
appeared after thread history was reloaded.
We used to alpha-sort tool search results because we stored them in a
`BTreeMap`, which discarded the actual search result ordering.
Now we use a `Vec` to preserve it.
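A small illustration of the ordering difference, using made-up tool
names and scores. `BTreeMap` iterates in key order, so collecting
ranked results into one re-sorts them alphabetically, while a `Vec`
keeps the ranking it was given.
```rust
use std::collections::BTreeMap;

/// Old shape: keys come back alphabetically, losing the ranking.
fn names_via_btreemap(ranked: &[(&str, f32)]) -> Vec<String> {
    let map: BTreeMap<&str, f32> = ranked.iter().copied().collect();
    map.into_keys().map(String::from).collect()
}

/// New shape: the ranked order of the search results is preserved.
fn names_via_vec(ranked: &[(&str, f32)]) -> Vec<String> {
    ranked.iter().map(|(name, _)| name.to_string()).collect()
}
```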
### Tests
Updated tests
## Summary
- preserve legacy Windows elevated sandbox behavior for existing
policies
- add elevated-only support for split filesystem policies that can be
represented as readable-root overrides, writable-root overrides, and
extra deny-write carveouts
- resolve those elevated filesystem overrides during sandbox transform
and thread them through setup and policy refresh
- keep failing closed for explicit unreadable (`none`) carveouts and
reopened writable descendants under read-only carveouts
- for explicit read-only-under-writable-root carveouts, materialize
missing carveout directories during elevated setup before applying the
deny-write ACL
- document the elevated vs restricted-token support split in the core
README
## Example
Given a split filesystem policy like:
```toml
":root" = "read"
":cwd" = "write"
"./docs" = "read"
"C:/scratch" = "write"
```
the elevated backend now provisions the readable-root overrides,
writable-root overrides, and extra deny-write carveouts during setup and
refresh instead of collapsing back to the legacy workspace-only shape.
If a read-only carveout under a writable root is missing at setup time,
elevated setup creates that carveout as an empty directory before
applying its deny-write ACE; otherwise the sandboxed command could
create it later and bypass the carveout. This is only for explicit
policy carveouts. Best-effort workspace protections like `.codex/` and
`.agents/` still skip missing directories.
A policy like:
```toml
"/workspace" = "write"
"/workspace/docs" = "read"
"/workspace/docs/tmp" = "write"
```
still fails closed, because the elevated backend does not reopen
writable descendants under read-only carveouts yet.
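A sketch of the fail-closed check for this case, assuming a policy is a
flat list of absolute paths with an access level. The names are
hypothetical, and special tokens such as `:root` and `:cwd` are not
handled.
```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    Read,
    Write,
}

/// True when `child` is a proper descendant of `parent`.
fn strictly_under(child: &str, parent: &str) -> bool {
    child != parent && Path::new(child).starts_with(Path::new(parent))
}

/// True when some writable entry sits beneath a read-only carveout,
/// where a carveout is a read entry nested under a writable root.
fn must_fail_closed(policy: &[(&str, Access)]) -> bool {
    let is_carveout = |path: &str| {
        policy
            .iter()
            .any(|&(root, access)| access == Access::Write && strictly_under(path, root))
    };
    policy.iter().any(|&(carveout, access)| {
        access == Access::Read
            && is_carveout(carveout)
            && policy.iter().any(|&(child, child_access)| {
                child_access == Access::Write && strictly_under(child, carveout)
            })
    })
}
```
A writable directory under a plain read-only root (such as a writable
cwd under a readable filesystem root) is not flagged; only a writable
entry reopened inside a carveout is.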
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- omit serialized Responses instructions when an app-server base
instruction override is empty
- skip empty developer instruction messages and add v2 coverage for the
empty-override request shape
## Validation
- just fmt
- git diff --check
## TL;DR
- New `Ctrl+O` shortcut on top of the existing `/copy` command, allowing
users to copy the latest agent response without having to cancel a plan
or type `/copy`
- Copy server clipboard to the client over SSH (OSC 52)
- Fixes Linux copy behavior: a clipboard handle has to be kept alive
while the paste happens for the contents to be preserved
- Uses arboard as primary mechanism on Windows, falling back to
PowerShell copy clipboard function
- Works with resumes, rolling back during a session, etc.
Tested on macOS, Linux/X11, Windows WSL2, Windows cmd.exe, Windows
PowerShell, Windows VSCode PowerShell, Windows VSCode WSL2, SSH (macOS
-> macOS).
## Problem
The TUI's `/copy` command was fragile. It relied on a single
`last_copyable_output` field that was bluntly cleared on every rollback
and thread reconfiguration, making copied content unavailable after
common operations like backtracking. It also had no keyboard shortcut,
requiring users to type `/copy` each time. The previous clipboard
backend mixed platform selection policy with low-level I/O in a way that
was hard to test, and it did not keep the Linux clipboard owner alive —
meaning pasted content could vanish once the process that wrote it
dropped its `arboard::Clipboard`.
This addresses the text-copy failure modes reported in #12836, #15452,
and #15663: native Linux clipboard access failing in remote or
unreachable-display environments, copy state going blank even after
visible assistant output, and local Linux X11 reporting success while
leaving the clipboard empty.
## Shortcut rationale
The copy hotkey is `Ctrl+O` rather than `Alt+C` because Alt/Option
combinations are not delivered consistently by macOS terminal emulators.
Terminal.app and iTerm2 can treat Option as text input or as a
configurable Meta/Esc prefix, and Option+C may be consumed or
transformed before the TUI sees an `Alt+C` key event. `Ctrl+O` is a
stable control-key chord in Terminal.app, iTerm2, SSH, and the existing
cross-platform terminal stack.
## Mental model
Agent responses are now tracked as a bounded, ordinal-indexed history
(`agent_turn_markdowns: Vec<AgentTurnMarkdown>`) rather than a single
nullable string. Each completed agent turn appends an entry keyed by its
ordinal (the number of user turns seen so far). Rollbacks pop entries
whose ordinal exceeds the remaining turn count, then use the visible
transcript cells as a best-effort fallback if the ordinal history no
longer has a surviving entry. This means `/copy` and `Ctrl+O` reflect
the most recent surviving agent response after a backtrack, instead of
going blank.
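A simplified sketch of the bounded, ordinal-indexed history. The
`CopyHistory` wrapper and its method names are hypothetical (in the PR
the state lives on `ChatWidget`), and the transcript fallback is not
shown.
```rust
const MAX_AGENT_COPY_HISTORY: usize = 256;

struct AgentTurnMarkdown {
    ordinal: usize,
    markdown: String,
}

#[derive(Default)]
struct CopyHistory {
    entries: Vec<AgentTurnMarkdown>,
}

impl CopyHistory {
    /// Append a new entry, or update the last one if it belongs to the
    /// same turn ordinal. The oldest entry is dropped past the cap.
    fn record(&mut self, ordinal: usize, markdown: &str) {
        if let Some(last) = self.entries.last_mut() {
            if last.ordinal == ordinal {
                last.markdown = markdown.to_string();
                return;
            }
        }
        self.entries.push(AgentTurnMarkdown { ordinal, markdown: markdown.to_string() });
        if self.entries.len() > MAX_AGENT_COPY_HISTORY {
            self.entries.remove(0);
        }
    }

    /// Rollback: drop entries whose ordinal exceeds the remaining turns.
    fn truncate_to_turn_count(&mut self, turns: usize) {
        self.entries.retain(|entry| entry.ordinal <= turns);
    }

    /// What `/copy` or `Ctrl+O` would copy, if anything survives.
    fn last(&self) -> Option<&str> {
        self.entries.last().map(|entry| entry.markdown.as_str())
    }
}
```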
The clipboard backend was rewritten as `clipboard_copy.rs` with a
strategy-injection design: `copy_to_clipboard_with` accepts closures for
the OSC 52, arboard, and WSL PowerShell paths, making the selection
logic fully unit-testable without touching real clipboards. On Linux,
the `Clipboard` handle is returned as a `ClipboardLease` stored on
`ChatWidget`, keeping X11/Wayland clipboard ownership alive for the
lifetime of the TUI. When native copy fails under WSL, the backend now
tries the Windows clipboard through PowerShell before falling back to
OSC 52.
## Non-goals
- This change does not introduce rich-text (HTML) clipboard support; the
copied content is raw markdown.
- It does not add a paste-from-history picker or multi-entry clipboard
ring.
- WSL support remains a best-effort fallback, not a new configuration
surface or guarantee for every terminal/host combination.
## Tradeoffs
- **Bounded history (256 entries)**: `MAX_AGENT_COPY_HISTORY` caps
memory. For sessions with thousands of turns this silently drops the
oldest entries. The cap is generous enough for realistic sessions.
- **`saw_copy_source_this_turn` flag**: Prevents double-recording when
both `AgentMessage` and `TurnComplete.last_agent_message` fire for the
same turn. The flag is reset on turn start and on turn complete,
creating a narrow window where a race between the two events could
theoretically skip recording. In practice the protocol delivers them
sequentially.
- **Transcript fallback on rollback**:
`last_agent_markdown_from_transcript` walks the visible transcript cells
to reconstruct plain text when the ordinal history has been fully
truncated. This path uses `AgentMessageCell::plain_text()` which joins
rendered spans, so it reconstructs display text rather than the original
raw markdown. It keeps visible text copyable after rollback, but
responses with markdown-specific syntax can diverge from the original
source.
- **Clipboard fallback ordering**: SSH still uses OSC 52 exclusively
because native/PowerShell clipboard access would target the wrong
machine. Local sessions try native clipboard first, then WSL PowerShell
when running under WSL, then OSC 52. This adds one process-spawn
fallback for WSL users but keeps the normal desktop and SSH paths
simple.
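The fallback ordering with injected strategies can be sketched like
this. It is not the signature of `copy_to_clipboard_with`: the sketch
returns the name of the path that succeeded instead of a
`ClipboardLease`, and errors are plain strings.
```rust
/// Try clipboard strategies in the order described above. Each
/// strategy is a closure, so tests can inject fakes.
fn copy_with(
    text: &str,
    is_ssh: bool,
    is_wsl: bool,
    osc52: impl Fn(&str) -> Result<(), String>,
    native: impl Fn(&str) -> Result<(), String>,
    wsl_powershell: impl Fn(&str) -> Result<(), String>,
) -> Result<&'static str, String> {
    // Over SSH, native clipboards would target the wrong machine.
    if is_ssh {
        return osc52(text).map(|_| "osc52");
    }
    let mut errors = Vec::new();
    match native(text) {
        Ok(()) => return Ok("native"),
        Err(e) => errors.push(format!("native: {e}")),
    }
    if is_wsl {
        match wsl_powershell(text) {
            Ok(()) => return Ok("wsl-powershell"),
            Err(e) => errors.push(format!("wsl-powershell: {e}")),
        }
    }
    match osc52(text) {
        Ok(()) => Ok("osc52"),
        Err(e) => {
            errors.push(format!("osc52: {e}"));
            // Report every failed path, not just the last one.
            Err(errors.join("; "))
        }
    }
}
```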
## Architecture
```
chatwidget.rs
├── agent_turn_markdowns: Vec<AgentTurnMarkdown> // ordinal-indexed history
├── last_agent_markdown: Option<String> // always == last entry's markdown
├── completed_turn_count: usize // incremented when user turns enter history
├── saw_copy_source_this_turn: bool // dedup guard
├── clipboard_lease: Option<ClipboardLease> // keeps Linux clipboard owner alive
│
├── record_agent_markdown(&str) // append/update history entry
├── truncate_agent_turn_markdowns_to_turn_count() // rollback support
├── copy_last_agent_markdown() // public entry point (slash + hotkey)
└── copy_last_agent_markdown_with(fn) // testable core
clipboard_copy.rs
├── copy_to_clipboard(text) -> Result<Option<ClipboardLease>>
├── copy_to_clipboard_with(text, ssh, wsl, osc52_fn, arboard_fn, wsl_fn)
├── ClipboardLease { _clipboard on linux }
├── arboard_copy(text) // platform-conditional native clipboard path
├── wsl_clipboard_copy(text) // WSL PowerShell fallback
├── osc52_copy(text) // /dev/tty -> stdout fallback
├── SuppressStderr // macOS stderr redirect guard
├── is_ssh_session()
└── is_wsl_session()
app_backtrack.rs
├── last_agent_markdown_from_transcript() // reconstruct from visible cells
└── truncate call sites in trim/apply_confirmed_rollback
```
## Observability
- `tracing::warn!` on native clipboard failure before OSC 52 fallback.
- `tracing::debug!` on `/dev/tty` open/write failure before stdout
fallback.
- History cell messages: "Copied last message to clipboard", "Copy
failed: {error}", "No agent response to copy" appear in the TUI
transcript.
## Tests
- `clipboard_copy.rs`: Unit tests cover OSC 52 encoding roundtrip,
payload size rejection, writer output, SSH-only OSC52 routing, non-WSL
native-to-OSC52 fallback, WSL native-to-PowerShell fallback, WSL
PowerShell-to-OSC52 fallback, and all-error reporting via strategy
injection.
- `chatwidget/tests/slash_commands.rs`: Updated existing `/copy` tests
to use `last_agent_markdown_text()` accessor. Added coverage for the
Linux clipboard lease lifecycle, missing
`TurnComplete.last_agent_message` fallback through completed assistant
items, replayed legacy agent messages, stale-output prevention after
rollback, and the `Ctrl+O` no-output hotkey path.
- `app_backtrack.rs`: Added
`agent_group_count_ignores_context_compacted_marker` verifying that
info-event cells don't inflate the agent group count.
---------
Co-authored-by: Felipe Coury <felipe.coury@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- [x] Expand tool search to custom MCPs.
- [x] Rename several variables/fields to be more generic.
Updated tool & server name lifecycles:
**Raw Identity**
ToolInfo.server_name is raw MCP server name.
ToolInfo.tool.name is raw MCP tool name.
MCP calls route back to raw via parse_tool_name() returning
(tool.server_name, tool.tool.name).
mcpServerStatus/list now groups by raw server and keys tools by
Tool.name: mod.rs:599
App-server just forwards that grouped raw snapshot:
codex_message_processor.rs:5245
**Callable Names**
On list-tools, we create provisional callable_namespace / callable_name:
mcp_connection_manager.rs:1556
For non-app MCP, provisional callable name starts as raw tool name.
For codex-apps, provisional callable name is sanitized and strips
connector name/id prefix; namespace includes connector name.
Then qualify_tools() sanitizes callable namespace + name to ASCII alnum
/ _ only: mcp_tool_names.rs:128
Note: this is stricter than Responses API. Hyphen is currently replaced
with _ for code-mode compatibility.
**Collision Handling**
We do initially collapse example-server and example_server to the same
base.
Then qualify_tools() detects distinct raw namespace identities behind
the same sanitized namespace and appends a hash to the callable
namespace: mcp_tool_names.rs:137
Same idea for tool-name collisions: hash suffix goes on callable tool
name.
Final list_all_tools() map key is callable_namespace + callable_name:
mcp_connection_manager.rs:769
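A rough sketch of the sanitize-then-disambiguate idea for namespaces.
This is not the `qualify_tools()` implementation: the function names,
the hash function, and the suffix format are all assumptions made for
illustration.
```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, BTreeSet};
use std::hash::{Hash, Hasher};

/// Keep ASCII alphanumerics and `_`; replace everything else with `_`.
fn sanitize(raw: &str) -> String {
    raw.chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect()
}

/// Assumed suffix: hex of a std hash of the raw name.
fn short_hash(raw: &str) -> String {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Map each raw namespace to a callable namespace, appending a hash
/// when distinct raw names collapse to the same sanitized base.
fn qualify_namespaces(raw_names: &[&str]) -> BTreeMap<String, String> {
    let mut by_base: BTreeMap<String, BTreeSet<&str>> = BTreeMap::new();
    for raw in raw_names {
        by_base.entry(sanitize(raw)).or_default().insert(*raw);
    }
    let mut out = BTreeMap::new();
    for (base, raws) in &by_base {
        for raw in raws {
            let callable = if raws.len() > 1 {
                format!("{base}_{}", short_hash(raw))
            } else {
                base.clone()
            };
            out.insert(raw.to_string(), callable);
        }
    }
    out
}
```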
**Direct Model Tools**
Direct MCP tool declarations use the full qualified sanitized key as the
Responses function name.
The raw rmcp Tool is converted but renamed for model exposure.
**Tool Search / Deferred**
Tool search result namespace = final ToolInfo.callable_namespace:
tool_search.rs:85
Tool search result nested name = final ToolInfo.callable_name:
tool_search.rs:86
Deferred tool handler is registered as "{namespace}:{name}":
tool_registry_plan.rs:248
When a function call comes back, core recombines namespace + name, looks
up the full qualified key, and gets the raw server/tool for MCP
execution: codex.rs:4353
**Separate Legacy Snapshot**
collect_mcp_snapshot_from_manager_with_detail() still returns a map
keyed by qualified callable name.
mcpServerStatus/list no longer uses that; it uses
McpServerStatusSnapshot, which is raw-inventory shaped.
## Summary
App-server v2 already receives turn-scoped `clientMetadata`, but the
Rust app-server was dropping it before the outbound Responses request.
This change keeps the fix lightweight by threading that metadata through
the existing turn-metadata path rather than inventing a new transport.
## What we're trying to do and why
We want turn-scoped metadata from the app-server protocol layer,
especially fields like Hermes/GAAS run IDs, to survive all the way to
the actual Responses API request so it is visible in downstream
websocket request logging and analytics.
The specific bug was:
- app-server protocol uses camelCase `clientMetadata`
- Responses transport already has an existing turn metadata carrier:
`x-codex-turn-metadata`
- websocket transport already rewrites that header into
`request.request_body.client_metadata["x-codex-turn-metadata"]`
- but the Rust app-server never parsed or stored `clientMetadata`, so
nothing from the app-server request was making it into that existing
path
This PR fixes that without adding a new header or a second metadata
channel.
## How we did it
### Protocol surface
- Add optional `clientMetadata` to v2 `TurnStartParams` and
`TurnSteerParams`
- Regenerate the JSON schema / TypeScript fixtures
- Update app-server docs to describe the field and its behavior
### Runtime plumbing
- Add a dedicated core op for app-server user input carrying turn-scoped
metadata: `Op::UserInputWithClientMetadata`
- Wire `turn/start` and `turn/steer` through that op / signature path
instead of dropping the metadata at the message-processor boundary
- Store the metadata in `TurnMetadataState`
### Transport behavior
- Reuse the existing serialized `x-codex-turn-metadata` payload
- Merge the new app-server `clientMetadata` into that JSON additively
- Do **not** replace built-in reserved fields already present in the
turn metadata payload
- Keep websocket behavior unchanged at the outer shape level: it still
sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON
string now contains the merged fields
- Keep HTTP fallback behavior unchanged except that the existing
`x-codex-turn-metadata` header now includes the merged fields too
### Request shape before / after
Before, a websocket `response.create` looked like:
```json
{
"type": "response.create",
"client_metadata": {
"x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}"
}
}
```
Even if the app-server caller supplied `clientMetadata`, it was not
represented there.
After, the same request shape is preserved, but the serialized payload
now includes the new turn-scoped fields:
```json
{
"type": "response.create",
"client_metadata": {
"x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}"
}
}
```
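The additive merge can be sketched over string maps. This is only an
illustration: the real payload is a JSON object, and
`merge_client_metadata` is a hypothetical name, not the method on
`TurnMetadataState`.
```rust
use std::collections::BTreeMap;

/// Add client-supplied fields to the built-in turn metadata without
/// replacing any key that is already present.
fn merge_client_metadata(
    built_in: &BTreeMap<String, String>,
    client: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = built_in.clone();
    for (key, value) in client {
        // `or_insert_with` leaves existing (reserved) entries untouched.
        merged.entry(key.clone()).or_insert_with(|| value.clone());
    }
    merged
}
```
A caller that sends `session_id` in `clientMetadata` would therefore
not override the built-in value, while new keys such as `fiber_run_id`
are carried through.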
## Validation
### Targeted tests added / updated
- protocol round-trip coverage for `clientMetadata` on `turn/start` and
`turn/steer`
- protocol round-trip coverage for `Op::UserInputWithClientMetadata`
- `TurnMetadataState` merge test proving client metadata is added
without overwriting reserved built-in fields
- websocket request-shape test proving outbound `response.create`
contains merged metadata inside
`client_metadata["x-codex-turn-metadata"]`
- app-server integration tests proving:
- `turn/start` forwards `clientMetadata` into the outbound Responses
request path
- websocket warmup + real turn request both behave correctly
- `turn/steer` updates the follow-up request metadata
### Commands run
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-protocol`
- `cargo test -p codex-core
turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields
--lib`
- `cargo test -p codex-core --test all
responses_websocket_preserves_custom_turn_metadata_fields`
- `cargo test -p codex-app-server --test all client_metadata`
- `cargo test -p codex-app-server --test all
turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2
-- --nocapture`
- `just fmt`
- `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol
-p codex-app-server`
- `just fix -p codex-exec -p codex-tui-app-server`
- `just argument-comment-lint`
### Full suite note
`cargo test` in `codex-rs` still fails in:
-
`suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request`
I verified that same failure on a clean detached `HEAD` worktree with an
isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.
## Summary
- bridge Codex Apps tools that declare `_meta["openai/fileParams"]`
through the OpenAI file upload flow
- mask those file params in model-visible tool schemas so the model
provides absolute local file paths instead of raw file payload objects
- rewrite those local file path arguments client-side into
`ProvidedFilePayload`-shaped objects before the normal MCP tool call
## Details
- applies to scalar and array file params declared in
`openai/fileParams`
- Codex uploads local files directly to the backend and uses the
uploaded file metadata to build the MCP tool arguments locally
- this PR is input-only
## Verification
- `just fmt`
- `cargo test -p codex-core mcp_tool_call -- --nocapture`
---------
Co-authored-by: Codex <noreply@openai.com>
## What
Show an inline `ctrl + t to view transcript` hint when exec output is
truncated in the main TUI chat view.
## Why
Today, truncated exec output shows `… +N lines`, but it does not tell
users that the full content is already available through the existing
transcript overlay. That makes hidden output feel lost instead of
discoverable.
This change closes that discoverability gap without introducing a new
interaction model.
Fixes: CLI-5740
## How
- added an output-specific truncation hint in `ExecCell` rendering
- applied that hint in both exec-output truncation paths:
- logical head/tail truncation before wrapping
- row-budget truncation after wrapping
- preserved the existing row-budget behavior on narrow terminals by
reserving space for the longer hint line
- updated the relevant snapshot and added targeted regression coverage
## Intentional design decisions
- **Aligned shortcut styling with the visible footer UI**
The inline hint uses `ctrl + t`, not `Ctrl+T`, to match the TUI’s
rendered key-hint style.
- **Kept the noun `transcript`**
The product already exposes this flow as the transcript overlay, so the
hint points at the existing concept instead of inventing a new label.
- **Preserved narrow-terminal behavior**
The longer hint text is accounted for in the row-budget truncation path
so the visible output still respects the existing viewport cap.
- **Did not add the hint to long command truncation**
This PR only changes hidden **output** truncation. Long command
truncation still uses the plain ellipsis form because `ctrl + t` is not
the same kind of “show hidden output” escape hatch there.
- **Did not widen scope to other truncation surfaces**
This does not change MCP/tool-call truncation in `history_cell.rs`, and
it does not change transcript-overlay behavior itself.
## Validation
### Automated
- `just fmt`
- `cargo test -p codex-tui`
### Manual
- ran `just tui-with-exec-server`
- executed `!seq 1 200`
- confirmed the main view showed the new `ctrl + t to view transcript`
truncation hint
- pressed `ctrl + t` and confirmed the transcript overlay still exposed
the full output
- closed the overlay and returned to the main view
## Visual proof
Screenshot/video attached in the PR UI showing:
- the truncated exec output row with the new hint
- the transcript overlay after `ctrl + t`
## Summary
- Replace Codex-branded network-proxy block responses with concise
reason text
- Mention sandbox policy for local/private network and deny-policy
wording
- Remove “managed” from the proxy-disabled denial detail
## Summary
- detect remote exec-server sessions in the unified-exec runtime
- bypass the local shell-snapshot bootstrap only for those remote
sessions
- preserve existing local snapshot wrapping, PowerShell UTF-8 prefixing,
sandbox orchestration, and zsh-fork handling
## Why
The shell snapshot file is currently captured and stored next to Core.
If Core wraps a remote command with `. /path/to/local/snapshot`, the
process starts on the executor and tries to source a path from the
orchestrator filesystem. This keeps remote commands from receiving that
known-local path until shell snapshots are captured/restored on the
executor side.
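A minimal sketch of the bypass, assuming the snapshot wrapper simply
sources the file before running the command. `build_shell_command` and
the exact wrapper shape are hypothetical, and path quoting is ignored.
```rust
/// Prefix the command with the local shell snapshot unless the
/// process will start on a remote exec-server, where that path does
/// not exist.
fn build_shell_command(command: &str, snapshot_path: Option<&str>, is_remote: bool) -> String {
    match snapshot_path {
        Some(path) if !is_remote => format!(". {path} && {command}"),
        // Remote session, or no snapshot captured: run the command as-is.
        _ => command.to_string(),
    }
}
```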
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-core --lib tools::runtimes::tests`
Co-authored-by: Codex <noreply@openai.com>
Problem: The statusline reported context as an “X% left” value, which
could be mistaken for quota, and context usage was included in the
default footer.
Solution: Render configured context status items as a filling context
meter, preserve `context-used` as a legacy alias while hiding it from
the setup menu, and remove context from the default statusline. It will
still be available as an opt-in option for users who want to see it.
<img width="317" height="39" alt="image"
src="https://github.com/user-attachments/assets/3aeb39bb-f80d-471f-88fe-d55e25b31491"
/>
## Summary
- keep pending steered input buffered until the active user prompt has
received a model response
- keep steering pending across auto-compact when there is real
model/tool continuation to resume
- allow queued steering to follow compaction immediately when the prior
model response was already final
- keep pending-input follow-up owned by `run_turn` instead of folding it
into `SamplingRequestResult`
- add regression coverage for mid-turn compaction, final-response
compaction, and compaction triggered before the next request after tool
output
## Root Cause
Steered input was drained at the top of every `run_turn` loop. After
auto-compaction, the loop continued and immediately appended any pending
steer after the compact summary, making a queued prompt look like the
newest task instead of letting the model first resume interrupted
model/tool work.
## Implementation Notes
This patch keeps the follow-up signals separated:
- `SamplingRequestResult.needs_follow_up` means model/tool continuation
is needed
- `sess.has_pending_input().await` means queued user steering exists
- `run_turn` computes the combined loop condition from those two signals
In `run_turn`:
```rust
let has_pending_input = sess.has_pending_input().await;
let needs_follow_up = model_needs_follow_up || has_pending_input;
```
After auto-compact we choose whether the next request may drain
steering:
```rust
can_drain_pending_input = !model_needs_follow_up;
```
That means:
- model/tool continuation + pending steer: compact -> resume once
without draining steer
- completed model answer + pending steer: compact -> drain/send the
steer immediately
- fresh user prompt: do not drain steering before the model has answered
the prompt once
The drain is still only `sess.get_pending_input().await`; when
`can_drain_pending_input` is false, core uses an empty local vec and
leaves the steer pending in session state.
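As a self-contained sketch of the drain rule above (the real logic lives in `run_turn` and is async; `PriorResponse` and `next_request_input` are hypothetical names introduced only for illustration):

```rust
/// Outcome of the sampling request that preceded auto-compaction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PriorResponse {
    /// The model or a tool still has work to resume.
    NeedsFollowUp,
    /// The model already produced its final answer.
    Final,
}

/// Whether the next request after auto-compact may drain queued steering.
fn can_drain_pending_input(prior: PriorResponse) -> bool {
    prior == PriorResponse::Final
}

/// Returns the input to attach to the next request. Anything that is not
/// drained stays in `pending`, mirroring the "empty local vec" behavior.
fn next_request_input(pending: &mut Vec<String>, can_drain: bool) -> Vec<String> {
    if can_drain {
        std::mem::take(pending)
    } else {
        Vec::new()
    }
}
```

With `PriorResponse::NeedsFollowUp` the steer stays queued for a later iteration; with `PriorResponse::Final` it is sent immediately after the compact summary.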
## Validation
- PASS `cargo test -p codex-core --test all steered_user_input --
--nocapture`
- PASS `just fmt`
- PASS `git diff --check`
- NOT PASSING HERE `just fix -p codex-core` currently stops before
linting this change on an unrelated mainline test-build error:
`core/src/tools/spec_tests.rs` initializes `ToolsConfigParams` without
`image_generation_tool_auth_allowed`; this PR does not touch that file.
## Summary
- make realtime default to the v2 WebRTC path
- keep partial realtime config tables inheriting
`RealtimeConfig::default()`
## Validation
- CI found a stale config-test expectation; fixed in 974ba51bb3
- just fmt
- git diff --check
---------
Co-authored-by: Codex <noreply@openai.com>
Addresses #17166
Problem: Source builds report version 0.0.0, so the TUI update path can
treat any released Codex version as upgradeable and show startup or
popup prompts.
Solution: Skip both TUI update prompt entry points when the running CLI
version is the source-build sentinel 0.0.0.
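A minimal sketch of the sentinel check, assuming plain `major.minor.patch` version strings. The function names are hypothetical and the TUI's real version comparison may differ.

```rust
/// Version string reported by source builds, per the description above.
const SOURCE_BUILD_VERSION: &str = "0.0.0";

/// Parse "major.minor.patch" into a comparable tuple; `None` for anything else.
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.trim().split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Shared guard for both prompt entry points (startup and popup).
fn should_prompt_for_update(current: &str, latest: &str) -> bool {
    // Source builds would otherwise look older than every release.
    if current == SOURCE_BUILD_VERSION {
        return false;
    }
    match (parse_version(current), parse_version(latest)) {
        (Some(cur), Some(new)) => new > cur,
        _ => false,
    }
}
```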
- Default realtime sessions to v2 and gpt-realtime-1.5 when no override
is configured.
- Add Op::RealtimeConversationStart integration coverage and keep
v1-specific tests explicit.
---------
Co-authored-by: Codex <noreply@openai.com>
Problem: TUI desktop notifications are hard-gated on terminal focus, so
terminal/IDE hosts that want in-focus notifications cannot opt in.
Solution: Add a flat `[tui] notification_condition` setting (`unfocused`
by default, `always` opt-in), carry grouped TUI notification settings
through runtime config, apply method + condition together in the TUI,
and regenerate the config schema.
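The condition can be modeled as a small enum. The sketch below is illustrative only: the type and function names are assumptions, and only the `unfocused` and `always` values come from the description above.

```rust
/// When the TUI may emit a desktop notification.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
enum NotificationCondition {
    /// Notify only while the terminal is not focused (the default).
    #[default]
    Unfocused,
    /// Notify regardless of focus (opt-in).
    Always,
}

/// Parse the `notification_condition` config value; unknown values are rejected.
fn parse_condition(value: &str) -> Option<NotificationCondition> {
    match value {
        "unfocused" => Some(NotificationCondition::Unfocused),
        "always" => Some(NotificationCondition::Always),
        _ => None,
    }
}

/// Apply the condition to the current focus state.
fn should_notify(condition: NotificationCondition, terminal_focused: bool) -> bool {
    match condition {
        NotificationCondition::Always => true,
        NotificationCondition::Unfocused => !terminal_focused,
    }
}
```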
- Adds a core-owned realtime backend prompt template and preparation
path.
- Makes omitted realtime start prompts use the core default, while null
or empty prompts intentionally send empty instructions.
- Covers the core realtime path and app-server v2 path with integration
coverage.
---------
Co-authored-by: Codex <noreply@openai.com>
Addresses #15943
Problem: Name-based resume could stop on a newer session_index entry
whose rollout was never persisted, shadowing an older saved thread with
the same name.
Solution: Materialize rollouts before indexing thread names and make
name lookup skip unresolved entries until it finds a persisted rollout.
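A sketch of the lookup rule, assuming the index is ordered oldest to newest. `IndexEntry` and its fields are hypothetical stand-ins for the real session_index records.

```rust
/// One entry in the name index. Field names are assumptions for this sketch.
struct IndexEntry {
    name: String,
    thread_id: String,
    /// Whether the rollout for this thread was actually written to disk.
    rollout_persisted: bool,
}

/// Walk from newest to oldest and return the first entry with a matching
/// name whose rollout exists, skipping unresolved entries along the way.
fn resolve_by_name<'a>(entries: &'a [IndexEntry], name: &str) -> Option<&'a IndexEntry> {
    entries
        .iter()
        .rev()
        .filter(|entry| entry.name == name)
        .find(|entry| entry.rollout_persisted)
}
```

An unpersisted newer entry no longer shadows an older saved thread, because the search continues past it instead of stopping at the first name match.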
Problem: Warp supports OSC 9 notifications, but the TUI's automatic
notification backend selection did not recognize its
`TERM_PROGRAM=WarpTerminal` environment value.
Solution: Treat `TERM_PROGRAM=WarpTerminal` as OSC 9-capable when
choosing the TUI desktop notification backend.
Currently, when an MCP server sends an elicitation to Codex running in
Full Access (`sandbox_policy: DangerFullAccess` + `approval_policy:
Never`), the elicitations are auto-cancelled.
This PR updates the automatic handling of MCP elicitations to be
consistent with other approvals in full-access, where they are
auto-approved. Because MCP elicitations may actually require user input,
this mechanism is limited to empty form elicitations.
## Changeset
- Add policy helper shared with existing MCP tool call approval
auto-approve
- Update `ElicitationRequestManager` to auto-approve elicitations in
full access when `can_auto_accept_elicitation` is true.
- Add tests
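The policy check can be sketched as below. This is an assumption-laden illustration: the enums are trimmed to two variants each, `FormElicitation` is a hypothetical type, and the real `can_auto_accept_elicitation` helper may have a different signature.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum SandboxPolicy {
    DangerFullAccess,
    WorkspaceWrite,
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum ApprovalPolicy {
    Never,
    OnRequest,
}

/// Hypothetical view of a form elicitation: just the requested field names.
struct FormElicitation {
    field_names: Vec<String>,
}

/// Full Access is the combination of both settings, as described above.
fn is_full_access(sandbox: SandboxPolicy, approval: ApprovalPolicy) -> bool {
    sandbox == SandboxPolicy::DangerFullAccess && approval == ApprovalPolicy::Never
}

/// Auto-accept only when nothing has to be typed by the user.
fn auto_accepts(sandbox: SandboxPolicy, approval: ApprovalPolicy, request: &FormElicitation) -> bool {
    is_full_access(sandbox, approval) && request.field_names.is_empty()
}
```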
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add a top-level `codex exec-server` subcommand, marked experimental in
CLI help
- launch an adjacent or PATH-provided `codex-exec-server`, with a
source-tree `cargo run -p codex-exec-server --` fallback
- cover the new subcommand parser path
## Validation
- `just fmt`
- `git diff --check`
- not run: Rust test suite
Co-authored-by: Codex <noreply@openai.com>
---------
Co-authored-by: Codex <noreply@openai.com>
Summary:
- parse the realtime call Location header and join that call over the
direct realtime WebSocket
- keep WebRTC starts alive on the existing realtime conversation path
Validation:
- just fmt
- git diff --check
- cargo check -p codex-api
- cargo check -p codex-core --tests
- local cargo tests not run; relying on PR CI
- Builds codex-realtime-webrtc through the normal Bazel Rust macro so
native macOS WebRTC sources are included.
- Shares the macOS -ObjC link flag with Bazel targets that can link
libwebrtc.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add the missing `image_generation_tool_auth_allowed` field to the new
tool registry plan test initializer
## Validation
- `just fmt`
- `cargo test -p codex-tools image_generation`
- `cargo test -p codex-tools --no-run`
When running with a remote executor, the cwd is a remote path. Today we
check for the existence of a local directory on startup and attempt to
load config from it.
For remote executors, skip both the check and the config load.
## Summary
- add optional `sandboxPolicy` support to the app-server filesystem
request surface
- thread sandbox-aware filesystem options through app-server and
exec-server adapters
- enforce sandboxed read/write access in the filesystem abstraction with
focused local and remote coverage
## Validation
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-exec-server file_system`
- `cargo test -p codex-app-server suite::v2::fs`
---------
Co-authored-by: Codex <noreply@openai.com>
**Disabling Image-Gen for Non-SIWC Codex Users**
We are enabling the image-gen feature only for SIWC Codex users until
the Responses API is fixed to omit output from responses.completed. This
prevents the following issues:
1. The websocket blows up under a heavier load (images) than before
(text).
2. The HTTP parser streams through roughly n^2 bytes for n bytes of
base64 (the sum of the base64 payloads of all images generated in the
turn), which causes long delays in turn_completion.
## Description
This PR fixes `/debug-config` so it shows more of the active
requirements state, including reviewer requirements and managed feature
pins. This made it clear that legacy MDM config was setting
`approvals_reviewer = "guardian_subagent"` and that we were translating
that into a requirements constraint.
Also, translate `approvals_reviewer = "guardian_subagent"` (from legacy
managed_config.toml) to `allowed_approvals_reviewers: guardian_subagent,
user` instead of `allowed_approvals_reviewers: guardian_subagent`.
Example `/debug-config`:
```
Config layer stack (lowest precedence first):
1. system (/etc/codex/config.toml) (enabled)
2. user (/Users/owen/.codex/config.toml) (enabled)
3. project (/Users/owen/repos/codex/.codex/config.toml) (enabled)
4. legacy managed_config.toml (MDM) (enabled)
MDM value:
...
# Enable Guardian Mode
features.guardian_approval = true
approvals_reviewer = "guardian_subagent"
Requirements:
- allowed_approvals_reviewers: guardian_subagent, user (source: MDM managed_config.toml (legacy))
- features: apps=true, plugins=true (source: cloud requirements)
```
Before this PR, the `Requirements` section showed None.
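The legacy translation can be sketched as a small mapping. Only the `guardian_subagent` case is described in this PR, so the sketch returns `None` for every other value rather than guessing; the function name is hypothetical.

```rust
/// Translate a legacy `approvals_reviewer` value into the
/// `allowed_approvals_reviewers` requirement.
fn allowed_reviewers_from_legacy(approvals_reviewer: &str) -> Option<Vec<&'static str>> {
    match approvals_reviewer {
        // Keep `user` allowed alongside the guardian reviewer.
        "guardian_subagent" => Some(vec!["guardian_subagent", "user"]),
        // Other values are not covered by the description above.
        _ => None,
    }
}
```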
## Summary
- Carry `AbsolutePathBuf` through tool cwd parsing/resolution instead of
resolving workdirs to raw `PathBuf`s.
- Type exec/sandbox request cwd fields as `AbsolutePathBuf` through
`ExecParams`, `ExecRequest`, `SandboxCommand`, and unified exec runtime
requests.
- Keep `PathBuf` conversions at external/event boundaries and update
existing tests/fixtures for the typed cwd.
## Validation
- `cargo check -p codex-core --tests`
- `cargo check -p codex-sandboxing --tests`
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core --lib tools::handlers::`
- `just fix -p codex-sandboxing`
- `just fix -p codex-core`
- `just fmt`
Full `codex-core` test suite was not run locally; per repo guidance I
kept local validation targeted.
Adds the `[realtime].transport = "webrtc"` TUI media path using a new
`codex-realtime-webrtc` crate, while leaving app-server as the
signaling/event source.
Local checks: fmt, diff-check, dependency tree only; test signal should
come from CI.
---------
Co-authored-by: Codex <noreply@openai.com>
Addresses #16971
Problem: Disabled MCP servers were still queried for streamable HTTP
auth status during MCP inventory, so unreachable disabled entries could
add startup latency.
Solution: Return `Unsupported` immediately for disabled MCP server
configs before bearer token/OAuth status discovery.
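A sketch of the early return, assuming status discovery can be passed in as a closure. `AuthStatus::Discovered` and `McpServerConfig` are simplified, hypothetical stand-ins; only the `Unsupported` result comes from the description above.

```rust
#[derive(Debug, PartialEq, Eq)]
enum AuthStatus {
    Unsupported,
    /// Placeholder for whatever the real discovery step reports.
    Discovered(String),
}

struct McpServerConfig {
    enabled: bool,
}

/// Skip the (potentially slow) discovery step entirely for disabled servers.
fn auth_status(config: &McpServerConfig, discover: impl FnOnce() -> AuthStatus) -> AuthStatus {
    if !config.enabled {
        return AuthStatus::Unsupported;
    }
    discover()
}
```

Because `discover` is never invoked for a disabled entry, an unreachable server adds no startup latency.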
Problem: Resuming the live TUI thread through `/resume` could
unsubscribe and reconnect the same app-server thread, leaving the UI
crashed or disconnected.
Solution: No-op `/resume` only when the selected thread is the currently
attached active thread; keep the normal resume path for
stale/displayed-only threads so recovery and reattach still work.
Addresses #3793
Problem: /status only reported project-level AGENTS files, so sessions
with a loaded global $CODEX_HOME/AGENTS.md still showed Agents.md as
<none>.
Solution: Track the global instructions file loaded during config
initialization and prepend that path to the /status Agents.md summary,
with coverage for AGENTS.md, AGENTS.override.md, and global-plus-project
ordering.
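A rough sketch of the summary ordering. The function name and the `", "` separator are assumptions; the `<none>` placeholder and the global-before-project order come from the description above.

```rust
/// Build the Agents.md summary shown by /status.
fn agents_summary(global: Option<&str>, project: &[&str]) -> String {
    let mut paths: Vec<&str> = Vec::new();
    // The global instructions file, when loaded, is listed first.
    if let Some(path) = global {
        paths.push(path);
    }
    paths.extend_from_slice(project);
    if paths.is_empty() {
        "<none>".to_string()
    } else {
        paths.join(", ")
    }
}
```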
Allow the `multi_agent_v2` feature to have its own temporary
configuration under `[features.multi_agent_v2]`:
```
[features.multi_agent_v2]
enabled = true
usage_hint_enabled = false
usage_hint_text = "Custom delegation guidance."
hide_spawn_agent_metadata = true
```
Absent `usage_hint_text` means use the default hint.
```
[features]
multi_agent_v2 = true
```
still works as the boolean shorthand.
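One way to model the two accepted shapes is an enum with a shorthand variant and a table variant. This is a sketch, not the actual config type: the names, the placeholder default hint, and the behavior of the shorthand form with respect to hints are all assumptions.

```rust
/// Placeholder text; the real default hint is not shown in this PR.
const DEFAULT_USAGE_HINT: &str = "(default hint)";

enum MultiAgentV2Setting {
    /// `multi_agent_v2 = true` under `[features]`.
    Shorthand(bool),
    /// The `[features.multi_agent_v2]` table.
    Table {
        enabled: bool,
        usage_hint_enabled: bool,
        usage_hint_text: Option<String>,
    },
}

impl MultiAgentV2Setting {
    fn enabled(&self) -> bool {
        match self {
            Self::Shorthand(enabled) => *enabled,
            Self::Table { enabled, .. } => *enabled,
        }
    }

    /// Hint to show, if any. An absent `usage_hint_text` falls back to the default.
    fn usage_hint(&self) -> Option<&str> {
        match self {
            // Assumption: the shorthand form keeps the default hint.
            Self::Shorthand(_) => Some(DEFAULT_USAGE_HINT),
            Self::Table { usage_hint_enabled: false, .. } => None,
            Self::Table { usage_hint_text, .. } => {
                Some(usage_hint_text.as_deref().unwrap_or(DEFAULT_USAGE_HINT))
            }
        }
    }
}
```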
Before this, the TUI started two app-servers: one to check the login
status and one to actually start the session.
This PR starts only one app-server and defers the login check to an
async task, outside of the frame rendering path.
---------
Co-authored-by: Codex <noreply@openai.com>
Problem: codex-cli/README.md is obsolete and confusing to keep around.
Solution: Delete codex-cli/README.md so the stale README is no longer
present in the repository.
Addresses #16677
Problem: Paid-plan startup tooltips still advertised 2x rate limits
until April 2nd after that promo had expired.
Solution: Remove the stale expiry copy and use evergreen Codex App /
Codex startup tips instead.
## Summary
Fix network proxy sessions so changing sandbox mode recomputes the
effective managed network policy and applies it to the already-running
per-session proxy.
## Root Cause
`danger_full_access_denylist_only` injects `"*"` only while building the
proxy spec for Full Access. Sessions built that spec once at startup, so
a later permission switch to Full Access left the live proxy in its
original restricted policy. Switching back needed the same recompute
path to remove the synthetic wildcard again.
## What Changed
- Preserve the original managed network proxy config/requirements so the
effective spec can be recomputed for a new sandbox policy.
- Refresh the current session proxy when sandbox settings change, then
reapply exec-policy network overlays.
- Add an in-place proxy state update path while rejecting
listener/port/SOCKS changes that cannot be hot-reloaded.
- Keep runtime proxy settings cheap to snapshot and update.
- Add regression coverage for workspace-write -> Full Access ->
workspace-write.
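The recompute path can be illustrated with a pure function over the preserved config. This sketch assumes the synthetic `"*"` is added to an allowed-domains list; `ManagedNetworkConfig`, its fields, and the two-variant `SandboxMode` are hypothetical simplifications.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SandboxMode {
    WorkspaceWrite,
    DangerFullAccess,
}

/// Managed config preserved unchanged for the lifetime of the session.
struct ManagedNetworkConfig {
    allowed_domains: Vec<String>,
    danger_full_access_denylist_only: bool,
}

/// Derive the effective list from the preserved base every time the sandbox
/// mode changes, so the wildcard never leaks into the stored config.
fn effective_allowed_domains(base: &ManagedNetworkConfig, mode: SandboxMode) -> Vec<String> {
    let mut allowed = base.allowed_domains.clone();
    if base.danger_full_access_denylist_only && mode == SandboxMode::DangerFullAccess {
        allowed.push("*".to_string());
    }
    allowed
}
```

Because the base is never mutated, switching back to workspace-write drops the wildcard without any explicit removal step.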
Tests added for existing JsonSchema in
`codex-rs/tools/src/json_schema_tests.rs`:
- `parse_tool_input_schema_coerces_boolean_schemas`
- `parse_tool_input_schema_infers_object_shape_and_defaults_properties`
- `parse_tool_input_schema_normalizes_integer_and_missing_array_items`
- `parse_tool_input_schema_sanitizes_additional_properties_schema`
- `parse_tool_input_schema_infers_object_shape_from_boolean_additional_properties_only`
- `parse_tool_input_schema_infers_number_from_numeric_keywords`
- `parse_tool_input_schema_infers_number_from_multiple_of`
- `parse_tool_input_schema_infers_string_from_enum_const_and_format_keywords`
- `parse_tool_input_schema_defaults_empty_schema_to_string`
- `parse_tool_input_schema_infers_array_from_prefix_items`
- `parse_tool_input_schema_preserves_boolean_additional_properties_on_inferred_object`
- `parse_tool_input_schema_infers_object_shape_from_schema_additional_properties_only`
Tests that we expect to fail on the baseline normalizer, but pass with
the new JsonSchema:
- `parse_tool_input_schema_preserves_nested_nullable_type_union`
- `parse_tool_input_schema_preserves_nested_any_of_property`
## TL;DR
- Fetches account/rateLimits/read asynchronously so the TUI can continue
starting without waiting for the rate-limit response.
- Fixes the /status card so it no longer leaves a stale “refreshing
cached limits...” notice in terminal history.
## Problem
The TUI bootstrap path fetched account rate limits synchronously
(`account/rateLimits/read`) before the event loop started for
ChatGPT/OpenAI-authenticated startups. This added ~670 ms of blocking
latency in the measured hot-start case, even though rate-limit data is
not needed to render the initial UI or accept user input. The delay was
especially noticeable on hot starts where every other RPC
(`account/read`, `model/list`, `thread/start`) completed in under 70 ms
total.
Moving that fetch to the background also exposed a `/status` UI bug: the
status card is flattened into terminal scrollback when it is inserted. A
transient "refreshing limits in background..." line could not be cleared
later, because the async completion updated the retained `HistoryCell`,
not the already-written terminal history.
## Mental model
Before this change, `AppServerSession::bootstrap()` performed three
sequential RPCs: `account/read` → `model/list` →
`account/rateLimits/read`. The result of the third call was baked into
`AppServerBootstrap` and applied to the chat widget before the event
loop began.
After this change, `bootstrap()` only performs two RPCs (`account/read`
+ `model/list`), and rate-limit fetching is kicked off as an async
background task immediately after the first frame is scheduled. A new
enum, `RateLimitRefreshOrigin`, tags each fetch so the event handler
knows whether the result came from the startup prefetch or from a
user-initiated `/status` command; they have different completion
side-effects.
The `get_login_status()` helper (used outside the main app flow) was
also decoupled: it previously called the full `bootstrap()` just to
check auth mode, wasting model-list and rate-limit work. It now calls
the narrower `read_account()` directly.
For `/status`, this PR keeps the background refresh request but stops
printing transient refresh notices into status history when cached
limits are already available. If a refresh updates the cache, the next
`/status` command will render the new values.
## Non-goals
- This change does not alter the rate-limit data itself.
- This change does not introduce caching, retries, or staleness
management for rate limits.
- This change does not affect the `model/list` or `thread/start` RPCs;
they remain on the critical startup path.
## Tradeoffs
- **Stale-on-first-render**: The status bar will briefly show no
rate-limit info until the background fetch completes; observed
background fetches landed roughly in the 400-900 ms range after the UI
appeared. This is acceptable because the user cannot meaningfully act on
rate-limit data in the first fraction of a second.
- **Error silence on startup prefetch**: If the startup prefetch fails,
the error is logged but the UI is not notified (unlike `/status` refresh
failures, which go through the status-command completion path). This
avoids surfacing transient network errors as a startup blocker.
- **Static `/status` history**: `/status` output is terminal history,
not a live widget. The card now avoids progress-style language that
would appear stuck in scrollback; users can run `/status` again to see
newly cached values.
- **`account_auth_mode` field removed from `AppServerBootstrap`**: The
only consumer was `get_login_status()`, which no longer goes through
`bootstrap()`. The field was dead weight.
## Architecture
### New types
- `RateLimitRefreshOrigin` (in `app_event.rs`): A `Copy` enum
distinguishing `StartupPrefetch` from `StatusCommand { request_id }`.
Carried through `RefreshRateLimits` and `RateLimitsLoaded` events so the
handler applies the right completion behavior.
### Modified types
- `AppServerBootstrap`: Lost `account_auth_mode` and
`rate_limit_snapshots`; gained `requires_openai_auth: bool` (passed
through from the account response so the caller can decide whether to
fire the prefetch).
### Control flow
1. `bootstrap()` returns with `requires_openai_auth` and
`has_chatgpt_account`.
2. After scheduling the first frame, `App::run_inner` fires
`refresh_rate_limits(StartupPrefetch)` if both flags are true.
3. When `RateLimitsLoaded { StartupPrefetch, Ok(..) }` arrives,
snapshots are applied and a frame is scheduled to repaint the status
bar.
4. When `RateLimitsLoaded { StartupPrefetch, Err(..) }` arrives, the
error is logged and no UI update occurs.
5. `/status`-initiated refreshes continue to use `StatusCommand {
request_id }` and call `finish_status_rate_limit_refresh` on completion
(success or failure).
6. `/status` history cells with cached rate-limit rows no longer render
an additional "refreshing limits" notice; the async refresh updates the
cache for future status output.
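The control flow above can be condensed into a small dispatch sketch. `Completion`, `on_rate_limits_loaded`, and `should_prefetch` are hypothetical names, `request_id` is assumed to be a `u64`, and the payload is reduced to `Result<(), String>` for brevity.

```rust
/// Why a rate-limit fetch was started.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RateLimitRefreshOrigin {
    StartupPrefetch,
    StatusCommand { request_id: u64 },
}

/// What the event handler does once a fetch completes (illustrative only).
#[derive(Debug, PartialEq, Eq)]
enum Completion {
    ApplySnapshotsAndRepaint,
    LogWarningOnly,
    FinishStatusRefresh { request_id: u64 },
}

/// Step 2: only fire the startup prefetch when both flags are set.
fn should_prefetch(requires_openai_auth: bool, has_chatgpt_account: bool) -> bool {
    requires_openai_auth && has_chatgpt_account
}

/// Steps 3 to 5: pick the completion side effect from the origin and result.
fn on_rate_limits_loaded(origin: RateLimitRefreshOrigin, result: Result<(), String>) -> Completion {
    match (origin, result) {
        (RateLimitRefreshOrigin::StartupPrefetch, Ok(())) => Completion::ApplySnapshotsAndRepaint,
        (RateLimitRefreshOrigin::StartupPrefetch, Err(_)) => Completion::LogWarningOnly,
        // /status refreshes finish the same way on success or failure.
        (RateLimitRefreshOrigin::StatusCommand { request_id }, _) => {
            Completion::FinishStatusRefresh { request_id }
        }
    }
}
```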
### Extracted method
- `AppServerSession::read_account()`: Factored out of `bootstrap()` so
that `get_login_status()` can call it independently without triggering
model-list or rate-limit work.
## Observability
- The existing `tracing::warn!` for rate-limit fetch failures is
preserved for the startup path.
- No new metrics or spans are introduced. The startup-time improvement
is observable via the existing `ready` timestamp in TUI startup logs.
## Tests
- Existing tests in `status_command_tests.rs` are updated to match on
`RateLimitRefreshOrigin::StatusCommand { request_id }` instead of a bare
`request_id`.
- Focused `/status` tests now assert that status history avoids
transient refresh text, continues to request an async refresh, and uses
refreshed cached limits in future status output.
- No new tests are added for the startup prefetch path because it is a
fire-and-forget spawn with no observable side-effect other than the
widget state update, which is already covered by the
snapshot-application tests.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fast Mode status was still tied to one model name in the TUI and
model-list plumbing. This changes the model metadata shape so a model
can advertise additional speed tiers, carries that field through the
app-server model list, and uses it to decide when to show Fast Mode
status.
For people using Codex, the behavior is intended to stay the same for
existing models. Fast Mode still requires the existing signed-in /
feature-gated path; the difference is that the UI can now recognize any
model the model list marks as Fast-capable, instead of requiring a new
client-side slug check.
## Summary
- run apply_patch through the executor filesystem when a remote
environment is present instead of shelling out to the local process
- thread the executor FileSystem into apply_patch interception and keep
existing local behavior for non-remote turns
- make the apply_patch integration harness use the executor filesystem
for setup/assertions
- add remote-aware skips for turn-diff coverage that still reads the
test-runner filesystem
## Why
Remote apply_patch needed to mutate the remote workspace instead of the
local checkout. The tests also needed to seed and assert workspace state
through the same filesystem abstraction so local and remote runs
exercise the same behavior.
## Validation
- `just fmt`
- `git diff --check`
- `cargo check -p core_test_support --tests`
- `cargo test -p codex-core --test all
suite::shell_serialization::apply_patch_custom_tool_call -- --nocapture`
- `cargo test -p codex-core --test all
suite::apply_patch_cli::apply_patch_cli_updates_file_appends_trailing_newline
-- --nocapture`
- remote `cargo test -p codex-core --test all apply_patch_cli --
--nocapture` (229 passed)
Adds WebRTC startup to the experimental app-server
`thread/realtime/start` method with an optional transport enum. The
websocket path remains the default; WebRTC offers create the realtime
session through the shared start flow and emit the answer SDP via
`thread/realtime/sdp`.
---------
Co-authored-by: Codex <noreply@openai.com>
- introduces `ServerResponse` as the symmetrical typed response union to
`ServerRequest` for app-server-protocol
- enables scalable event stream ingestion for use cases such as
analytics, particularly for tools/approvals
- no runtime behavior changes, protocol/schema plumbing only
- mirrors #15921
- Migrate apply-patch verification and application internals to use the
async `ExecutorFileSystem` abstraction from `exec-server`.
- Convert apply-patch `cwd` handling to `AbsolutePathBuf` through the
verifier/parser/handler boundary.
Doesn't change how the tool itself works.
## Summary
https://github.com/openai/codex/pull/13860 changed the serialized output
format of Unified Exec. This PR reverts those changes and some related
test changes.
## Testing
- [x] Update tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Remove the stale `?` after `AbsolutePathBuf::join` in the unified exec
integration test helper.
## Root Cause
- `AbsolutePathBuf::join` was made infallible, but
`core/tests/suite/unified_exec.rs` still treated it as a `Result`, which
broke the Windows test build for the `all` integration test target.
## Validation
- `just fmt`
- `cargo test -p codex-core --test all
unified_exec_resolves_relative_workdir`
## Summary
- Convert unified exec integration tests that can run against the remote
executor to use the remote-aware test harness.
- Create workspace directories through the executor filesystem for
remote runs.
- Install `python3` and `zsh` in the remote test container so restored
Python/zsh-based test commands work in fresh Ubuntu containers.
## Validation
- `just fmt`
- `cargo test -p codex-core --test all unified_exec_defaults_to_pipe`
- `cargo test -p codex-core --test all unified_exec_can_enable_tty`
- `cargo test -p codex-core --test all unified_exec`
- Remote on `codex-remote`: `source scripts/test-remote-env.sh && cd
codex-rs && cargo test -p codex-core --test all unified_exec`
- `just fix -p codex-core`
## Description
This PR changes guardian transcript compaction so oversized
conversations no longer collapse into a nearly empty placeholder.
Before this change, if the retained user history alone exceeded the
message budget, guardian would replace the entire transcript with
`<transcript omitted to preserve budget for planned action>`.
That meant approvals, especially network approvals, could lose the
recent tool call and tool result that explained what guardian was
actually reviewing. Now we keep a compact but usable transcript instead
of dropping it all.
### Before
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
<transcript omitted to preserve budget for planned action>
>>> TRANSCRIPT END
Conversation transcript omitted due to size.
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
Retry reason:
Sandbox blocked outbound network access.
Assess the exact planned action below. Use read-only tool checks when local state matters.
Planned action JSON:
{
"tool": "network_access",
"target": "https://example.com:443",
"host": "example.com",
"protocol": "https",
"port": 443
}
>>> APPROVAL REQUEST END
```
### After
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
[1] user: Please investigate why uploads to example.com are failing and retry if needed.
[8] user: If the request looks correct, go ahead and try again with network access.
[9] tool shell call: {"command":["curl","-X","POST","https://example.com/upload"],"cwd":"/repo"}
[10] tool shell result: sandbox blocked outbound network access
>>> TRANSCRIPT END
Some conversation entries were omitted.
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
Retry reason:
Sandbox blocked outbound network access.
Assess the exact planned action below. Use read-only tool checks when local state matters.
Planned action JSON:
{
"tool": "network_access",
"target": "https://example.com:443",
"host": "example.com",
"protocol": "https",
"port": 443
}
>>> APPROVAL REQUEST END
```
## Summary
This adds a stable Codex installation ID and includes it on Responses
API requests via `x-codex-installation-id` passed in via the
`client_metadata` field for analytics/debugging.
The main pieces are:
- persist a UUID in `$CODEX_HOME/installation_id`
- thread the installation ID into `ModelClient`
- send it in `client_metadata` on Responses requests so it works
consistently across HTTP and WebSocket transports
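Persisting the ID could look roughly like this. It is a standard-library-only sketch: the function name is hypothetical, and UUID generation is passed in as a closure so the example does not depend on a particular crate.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read `<codex_home>/installation_id`, creating it on first use.
/// `generate` is expected to return a fresh UUID string.
fn load_or_create_installation_id(
    codex_home: &Path,
    generate: impl FnOnce() -> String,
) -> io::Result<String> {
    let path = codex_home.join("installation_id");
    match fs::read_to_string(&path) {
        // Reuse the stored ID so it stays stable across runs.
        Ok(existing) if !existing.trim().is_empty() => return Ok(existing.trim().to_string()),
        // Empty file: fall through and regenerate.
        Ok(_) => {}
        // Missing file: fall through and create it.
        Err(err) if err.kind() == io::ErrorKind::NotFound => {}
        Err(err) => return Err(err),
    }
    let id = generate();
    fs::create_dir_all(codex_home)?;
    fs::write(&path, &id)?;
    Ok(id)
}
```

This sketch does not guard against two processes creating the file at the same time; a real implementation may need an atomic create.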
Addresses #16421
Problem: Resumed interactive sessions that exited before any new token
usage skipped all footer lines, hiding the `codex resume` continuation
command.
It's not clear whether this was an intentional design choice, but I
think it's reasonable to expect this message under these circumstances.
Solution: Compose token usage and resume hints independently so
resumable sessions still print the continuation command with zero usage.
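Composing the two footer lines independently might look like the sketch below. The function name and the exact wording of both lines are hypothetical.

```rust
/// Build the lines printed when an interactive session exits.
fn exit_footer_lines(total_tokens: u64, resume_id: Option<&str>) -> Vec<String> {
    let mut lines = Vec::new();
    // Token usage is only worth printing when something was used.
    if total_tokens > 0 {
        lines.push(format!("Token usage: {total_tokens}"));
    }
    // The resume hint does not depend on token usage.
    if let Some(id) = resume_id {
        lines.push(format!("To continue this session, run codex resume {id}"));
    }
    lines
}
```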
Addresses #15527
Problem: Nested `codex exec` commands could source a shell snapshot that
re-exported the parent `CODEX_THREAD_ID`, so commands inside the nested
session were attributed to the wrong thread.
Solution: Reapply the live command env's `CODEX_THREAD_ID` after
sourcing the snapshot.
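The ordering matters more than the mechanism, so the sketch below models it with plain maps instead of a shell. `effective_env` is a hypothetical helper: it applies the snapshot's exports over the live environment and then restores the live `CODEX_THREAD_ID`.

```rust
use std::collections::HashMap;

/// Environment a command sees after the snapshot has been sourced.
fn effective_env(
    live: &HashMap<String, String>,
    snapshot_exports: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut env = live.clone();
    // Sourcing the snapshot overwrites any variable it re-exports.
    env.extend(snapshot_exports.clone());
    // Reapply the live thread ID so a parent session's value cannot win.
    if let Some(thread_id) = live.get("CODEX_THREAD_ID") {
        env.insert("CODEX_THREAD_ID".to_string(), thread_id.clone());
    }
    env
}
```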
Addresses #15532
Problem: Nested read-only `apply_patch` rejections report in-project
files as outside the project.
Solution: Choose the rejection message based on sandbox mode so
read-only sessions report a read-only-specific reason, and add focused
safety coverage.
Problem: The multi-agent followup interrupt test polled history before
interrupt cleanup and mailbox wakeup were guaranteed to settle, which
made it flaky under CI scheduling variance.
Solution: Wait for the child turn's `TurnAborted(Interrupted)` event
before asserting that the redirected assistant envelope is recorded and
no plain user message is left behind.
## Summary
- reduce public module visibility across Rust crates, preferring private
or crate-private modules with explicit crate-root public exports
- update external call sites and tests to use the intended public crate
APIs instead of reaching through module trees
- add the module visibility guideline to AGENTS.md
## Validation
- `cargo check --workspace --all-targets --message-format=short` passed
before the final fix/format pass
- `just fix` completed successfully
- `just fmt` completed successfully
- `git diff --check` passed
2026-04-07 10:17:31 +01:00
1097 changed files with 73602 additions and 18932 deletions
description: Diagnose GitHub bug reports in openai/codex. Use when given a GitHub issue URL from openai/codex and asked to decide next steps such as verifying against the repo, requesting more info, or explaining why it is not a bug; follow any additional user-provided instructions.
---
# Codex Bug
## Overview
Diagnose a Codex GitHub bug report and decide the next action: verify against sources, request more info, or explain why it is not a bug.
## Workflow
1. Confirm the input
- Require a GitHub issue URL that points to `github.com/openai/codex/issues/…`.
- If the URL is missing or not in the right repo, ask the user for the correct link.
2. Network access
- Always access the issue over the network immediately, even if you think access is blocked or unavailable.
- Prefer the GitHub API over HTML pages because the HTML is noisy:
- If the environment requires explicit approval, request it on demand via the tool and continue without additional user prompting.
- Only if the network attempt fails after requesting approval, explain what you can do offline (e.g., draft a response template) and ask how to proceed.
3. Read the issue
- Use the GitHub API responses (issue + comments) as the source of truth rather than scraping the HTML issue page.
- Extract: title, body, repro steps, expected vs actual, environment, logs, and any attachments.
- Note whether the report already includes logs or session details.
- If the report includes a thread ID, mention it in the summary and use it to look up the logs and session details if you have access to them.
4. Summarize the bug before investigating
- Before inspecting code, docs, or logs in depth, write a short summary of the report in your own words.
- Include the reported behavior, expected behavior, repro steps, environment, and what evidence is already attached or missing.
5. Decide the course of action
- **Verify with sources** when the report is specific and likely reproducible. Inspect relevant Codex files (or mention the files to inspect if access is unavailable).
- **Request more information** when the report is vague, missing repro steps, or lacks logs/environment.
- **Explain not a bug** when the report contradicts current behavior or documented constraints (cite the evidence from the issue and any local sources you checked).
6. Respond
- Provide a concise report of your findings and next steps.
description: Update the title and body of one or more pull requests.
---
## Determining the PR(s)
When this skill is invoked, the PR(s) to update may be specified explicitly, but in the common case, the PR(s) to update will be inferred from the branch / commit that the user is currently working on. For ordinary Git usage (i.e., not Sapling as discussed below), you may have to use a combination of `git branch` and `gh pr view <branch> --repo openai/codex --json number --jq '.number'` to determine the PR associated with the current branch / commit.
## PR Body Contents
When invoked, use `gh` to edit the pull request body and title to reflect the contents of the specified PR. Make sure to check the existing pull request body to see if there is key information that should be preserved. For example, NEVER remove an image in the existing pull request body, as the author may have no way to recover it if you remove it.
It is critically important to explain _why_ the change is being made. If the current conversation in which this skill is invoked has discussed the motivation, be sure to capture this in the pull request body.
The body should also explain _what_ changed, but this should appear after the _why_.
Limit discussion to the _net change_ of the commit. It is generally frowned upon to discuss changes that were attempted but later undone in the course of the development of the pull request. When rewriting the pull request body, you may need to eliminate details such as these when they are no longer appropriate / of interest to future readers.
Avoid references to absolute paths on my local disk. When talking about a path that is within the repository, simply use the repo-relative path.
It is generally helpful to discuss how the change was verified. That said, it is unnecessary to mention things that CI checks automatically, e.g., do not include "ran `just fmt`" as part of the test plan. Identifying the new tests that were purposely introduced to verify the behavior added by the pull request, however, is often appropriate.
Make use of Markdown to format the pull request professionally. Ensure "code things" appear in single backticks when referenced inline. Fenced code blocks are useful when referencing code or showing a shell transcript. Also, make use of GitHub permalinks when citing existing pieces of code that are relevant to the change.
Make sure to reference any relevant pull requests or issues, though there should be no need to reference the pull request in its own PR body.
If there is documentation that should be updated on https://developers.openai.com/codex as a result of this change, please note that in a separate section near the end of the pull request. Omit this section if there is no documentation that needs to be updated.
## Working with Stacks
Sometimes a pull request is composed of a stack of commits that build on one another. In these cases, the PR body should reflect the _net_ change introduced by the stack as a whole, rather than the individual commits that make up the stack.
Similarly, sometimes a user may be using a tool like Sapling to leverage _stacked pull requests_, in which case the `base` of the PR may be a branch that is the `head` of another PR in the stack rather than `main`. In this case, be sure to discuss only the net change between the `base` and `head` of the PR that is being opened against that stacked base, rather than the changes relative to `main`.
## Sapling
If `.git/sl/store` is present, then this Git repository is governed by Sapling SCM (https://sapling-scm.com).
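As a small sketch of that check, the function below tests for the marker directory. The name `is_sapling_repo` is hypothetical, and the sketch assumes the caller already knows the repository root.

```python
from pathlib import Path


def is_sapling_repo(repo_root: Path) -> bool:
    """Return True when `.git/sl/store` exists under repo_root."""
    return (repo_root / ".git" / "sl" / "store").exists()
```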
In Sapling, run the following to see if there is a GitHub pull request associated with the current revision:
Alternatively, you can run `sl sl` to see the current development branch and whether there is a GitHub pull request associated with the current commit. For example, if the output were:
```
@ cb032b31cf 72 minutes ago mbolin #11412
╭─╯ tui: show non-file layer content in /debug-config
│
o fdd0cd1de9 Today at 20:09 origin/main
│
~
```
- `@` indicates the current commit is `cb032b31cf`
- it is a development branch containing a single commit branched off of `origin/main`
- it is associated with GitHub pull request #11412
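The reading of that sample output can be sketched in Python as below. This is an illustration only: the regular expressions are fitted to the sample shown above rather than to any documented Sapling output format, and `CurrentCommit` and `parse_current_commit` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class CurrentCommit(NamedTuple):
    commit: str
    pr_number: Optional[int]


# Assumption: the current commit's line starts with `@` followed by a hex hash.
_CURRENT_LINE = re.compile(r"^\s*@\s+(?P<commit>[0-9a-f]+)\b(?P<rest>.*)$")
# Assumption: an associated pull request appears as a trailing `#<number>`.
_PR_REF = re.compile(r"#(?P<number>\d+)\s*$")


def parse_current_commit(sl_output: str) -> Optional[CurrentCommit]:
    """Return the `@` commit and its PR number, or None if no `@` line exists."""
    for line in sl_output.splitlines():
        match = _CURRENT_LINE.match(line)
        if match is None:
            continue
        pr = _PR_REF.search(match.group("rest"))
        number = int(pr.group("number")) if pr else None
        return CurrentCommit(match.group("commit"), number)
    return None
```

A `pr_number` of `None` means the current commit has no pull request marker, in which case the PR must be determined some other way.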
We provide the following options to facilitate Codex development in a container. This is particularly useful for verifying the Linux build when working on a macOS host.
We provide two container paths:
- `devcontainer.json` keeps the existing Codex contributor setup for working on this repository.
- `devcontainer.secure.json` adds a customer-oriented profile with stricter outbound network controls.
## Codex contributor profile
Use `devcontainer.json` when you are developing Codex itself. This is the same lightweight arm64 container that already exists in the repo.
## Secure customer profile
Use `devcontainer.secure.json` when you want a stricter runtime profile for running Codex inside a project container:
- installs the Codex CLI plus common build tools
- installs bubblewrap in setuid mode for Codex's Linux sandbox
- disables Docker's outer seccomp and AppArmor profiles so bubblewrap can construct Codex's inner sandbox
- enables firewall startup with an allowlist-driven outbound policy
- blocks IPv6 by default so the allowlist cannot be bypassed over AAAA routes
- requires `NET_ADMIN` and `NET_RAW` so the firewall can be installed at startup
This profile keeps the stricter networking isolated to the customer path instead of changing the default Codex contributor container.
Start it from the CLI with:
```bash
devcontainer up --workspace-folder . --config .devcontainer/devcontainer.secure.json
```
In VS Code, choose **Dev Containers: Open Folder in Container...** and select `.devcontainer/devcontainer.secure.json`.
## Docker
To build the contributor image locally for x64 and then run it with the repo mounted under `/workspace`:
Note that `/workspace/target` will contain the binaries built for your host platform, so we include `-e CARGO_TARGET_DIR=/workspace/codex-rs/target-amd64` in the `docker run` command so that the binaries built inside your container are written to a separate directory.
For arm64, specify `--platform=linux/arm64` instead for both `docker build` and `docker run`.
Currently, the contributor `Dockerfile` works for both x64 and arm64 Linux, though you need to run `rustup target add x86_64-unknown-linux-musl` yourself to install the musl toolchain for x64.
## VS Code
VS Code recognizes the `devcontainer.json` file and gives you the option to develop Codex in a container. Currently, `devcontainer.json` builds and runs the `arm64` flavor of the container.
From the integrated terminal in VS Code, you can build either flavor of the `arm64` build (GNU or musl):
```shell
cargo build --target aarch64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu
```
The secure profile's capability, seccomp, and AppArmor options are required when you want Codex's bubblewrap sandbox to run inside Docker as the non-root devcontainer user. Without them, Docker's default runtime profile can block bubblewrap's namespace setup before Codex's own seccomp filter is installed. This keeps the Docker relaxation explicit in the profile that is meant to run Codex inside a project container, while the default contributor profile stays lightweight.
- Additionally add zero or more of the following labels that are relevant to the issue content. Prefer a small set of precise labels over many broad ones.
- For agent-area issues, prefer the most specific applicable label. Use "agent" only as a fallback for agent-related issues that do not fit a more specific agent-area label. Prefer "app-server" over "session" or "config" when the issue is about app-server protocol, API, RPC, schema, launch, or bridge behavior.
1. windows-os — Bugs or friction specific to Windows environments (always when PowerShell is mentioned, path handling, copy/paste, OS-specific auth or tooling failures).
2. mcp — Topics involving Model Context Protocol servers/clients.
3. mcp-server — Problems related to the codex mcp-server command, where codex runs as an MCP server.
@@ -61,6 +62,13 @@ jobs:
15. sandbox - Issues related to local sandbox environments or tool call approvals to override sandbox restrictions.
16. tool-calls - Problems related to specific tool call invocations including unexpected errors, failures, or hangs.
17. TUI - Problems with the terminal user interface (TUI) including keyboard shortcuts, copy & pasting, menus, or screen update issues.
18. app-server - Issues involving the app-server protocol or interfaces, including SDK/API payloads, thread/* and turn/* RPCs, app-server launch behavior, external app/controller bridges, and app-server protocol/schema behavior.
19. connectivity - Network connectivity or endpoint issues, including reconnecting messages, stream dropped/disconnected errors, websocket/SSE/transport failures, timeout/network/VPN/proxy/API endpoint failures, and related retry behavior.
20. subagent - Issues involving subagents, sub-agents, or multi-agent behavior, including spawn_agent, wait_agent, close_agent, worker/explorer roles, delegation, agent teams, lifecycle, model/config inheritance, quotas, and orchestration.
21. session - Issues involving session or thread management, including resume, fork, archive, rename/title, thread history, rollout persistence, compaction, checkpoints, retention, and cross-session state.
@@ -21,7 +21,9 @@ In the codex-rs folder where the rust code lives:
- Newly added traits should include doc comments that explain their role and how implementations are expected to use them.
- When writing tests, prefer comparing the equality of entire objects over fields one by one.
- When making a change that adds or changes an API, ensure that the documentation in the `docs/` folder is up to date if applicable.
- Prefer private modules and explicitly exported public crate API.
- If you change `ConfigToml` or nested config types, run `just write-config-schema` to update `codex-rs/core/config.schema.json`.
- When working with MCP tool calls, prefer using `codex-rs/codex-mcp/src/mcp_connection_manager.rs` to handle mutation of tools and tool calls. Aim to minimize the footprint of changes and leverage existing abstractions rather than plumbing code through multiple levels of function calls.
- If you change Rust dependencies (`Cargo.toml` or `Cargo.lock`), run `just bazel-lock-update` from the
repo root to refresh `MODULE.bazel.lock`, and include that lockfile update in the same change.
- After dependency changes, run `just bazel-lock-check` from the repo root so lockfile drift is caught
@@ -46,7 +46,7 @@ Each archive contains a single entry with the platform baked into the name (e.g.
### Using Codex with your ChatGPT plan
Run `codex` and select **Sign in with ChatGPT**. We recommend signing into your ChatGPT account to use Codex as part of your Plus, Pro, Business, Edu, or Enterprise plan. [Learn more about what's included in your ChatGPT plan](https://help.openai.com/en/articles/11369540-codex-in-chatgpt).
You can also use Codex with an API key, but this requires [additional setup](https://developers.openai.com/codex/auth#sign-in-with-an-api-key).
@@ -11,3 +11,7 @@ Our security program is managed through Bugcrowd, and we ask that any validated
## Vulnerability Disclosure Program
Our Vulnerability Program Guidelines are defined on our [Bugcrowd program page](https://bugcrowd.com/engagements/openai).
## How to operate CODEX safely
For details on Codex security boundaries, including sandboxing, approvals, and network controls, see [Agent approvals & security](https://developers.openai.com/codex/agent-approvals-security).
content="Update Required - This version will no longer be supported starting May 8th. Please upgrade to the latest version (https://github.com/openai/codex/releases/latest) using your preferred package manager."
# Matches 0.x.y versions from 0.0.y through 0.119.y; excludes 0.120.0 and newer.
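For illustration, one way to express the range described in that comment is sketched below in Python. The pattern here is an assumption written from the comment alone, not the pattern actually used, and `is_unsupported` is a hypothetical helper name.

```python
import re

# Hypothetical pattern: major 0, minor 0 through 119, any numeric patch, no suffix.
#   \d       -> minor 0-9
#   [1-9]\d  -> minor 10-99
#   1[01]\d  -> minor 100-119
UNSUPPORTED_VERSION = re.compile(r"^0\.(?:\d|[1-9]\d|1[01]\d)\.\d+$")


def is_unsupported(version: str) -> bool:
    """Return True for 0.0.y through 0.119.y, False for 0.120.0 and newer."""
    return UNSUPPORTED_VERSION.fullmatch(version) is not None
```

Versions with pre-release suffixes are not matched by this sketch; whether they should be is not stated in the comment.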
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
> [!IMPORTANT]
> This is the documentation for the _legacy_ TypeScript implementation of the Codex CLI. It has been superseded by the _Rust_ implementation. See the [README in the root of the Codex repository](https://github.com/openai/codex/blob/main/README.md) for details.

---
<details>
<summary><strong>Table of contents</strong></summary>
Codex CLI is an experimental project under active development. It is not yet stable and may contain bugs, have incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
## Quickstart
Install globally:
```shell
npm install -g @openai/codex
```
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it only for the session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - azure
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - arceeai
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
---
## Memory & project docs
You can give Codex extra instructions and guidance using `AGENTS.md` files. Codex looks for `AGENTS.md` files in the following places, and merges them top-down:
1. `~/.codex/AGENTS.md` - personal global guidance
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
Disable loading of these files with `--no-project-doc` or the environment variable `CODEX_DISABLE_PROJECT_DOC=1`.
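The lookup order above can be sketched in Python as follows. This is not the CLI's implementation: it assumes that "merging" means concatenating the files in the listed order, and that either opt-out skips all three locations. The function names `agents_md_candidates` and `load_project_docs` are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def agents_md_candidates(home: Path, repo_root: Path, cwd: Path) -> List[Path]:
    """Return the three lookup locations in the top-down order listed above."""
    ordered = [
        home / ".codex" / "AGENTS.md",  # personal global guidance
        repo_root / "AGENTS.md",        # shared project notes
        cwd / "AGENTS.md",              # sub-folder/feature specifics
    ]
    unique: List[Path] = []
    for path in ordered:
        if path not in unique:  # cwd may be the repo root itself
            unique.append(path)
    return unique


def load_project_docs(
    home: Path,
    repo_root: Path,
    cwd: Path,
    env: Optional[Mapping[str, str]] = None,
    no_project_doc: bool = False,
) -> str:
    """Concatenate the AGENTS.md files that exist, unless loading is disabled."""
    env = os.environ if env is None else env
    if no_project_doc or env.get("CODEX_DISABLE_PROJECT_DOC") == "1":
        return ""
    parts = [
        path.read_text(encoding="utf-8")
        for path in agents_md_candidates(home, repo_root, cwd)
        if path.is_file()
    ]
    return "\n\n".join(parts)
```

Because later files are appended after earlier ones, sub-folder guidance lands last in the combined text under this assumption.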
---
## Non-interactive / CI mode
Run Codex headless in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
  run: |
    npm install -g @openai/codex
    export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
    codex -a auto-edit --quiet "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
## Tracing / verbose logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
```shell
DEBUG=true codex
```
---
## Recipes
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
<summary>OpenAI released a model called Codex in 2021 - is this related?</summary>
In 2021, OpenAI released Codex, an AI system designed to generate code from natural language prompts. That original Codex model was deprecated as of March 2023 and is separate from the CLI tool.
</details>
<details>
<summary>Which models are supported?</summary>
Any model available with the [Responses API](https://platform.openai.com/docs/api-reference/responses). The default is `o4-mini`, but pass `--model gpt-4.1` or set `model: gpt-4.1` in your config file to override.
</details>
<details>
<summary>Why does <code>o3</code> or <code>o4-mini</code> not work for me?</summary>
It's possible that your [API account needs to be verified](https://help.openai.com/en/articles/10910291-api-organization-verification) in order to start streaming responses and seeing chain of thought summaries from the API. If you're still running into issues, please let us know!
</details>
<details>
<summary>How do I stop Codex from editing my files?</summary>
Codex runs model-generated commands in a sandbox. If a proposed command or file change doesn't look right, you can simply type **n** to deny the command or give the model feedback.
</details>
<details>
<summary>Does it work on Windows?</summary>
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex is regularly tested on macOS and Linux with Node 20+, and also supports Node 16.
</details>
---
## Zero data retention (ZDR) usage
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
---
## Codex open source fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Grants of up to **$25,000** in API credits are awarded.
- Applications are reviewed **on a rolling basis**.
This project is under active development and the code will likely change pretty significantly. We'll update this message once that's complete!
More broadly we welcome contributions - whether you are opening your very first pull request or you're a seasoned maintainer. At the same time we care about reliability and long-term maintainability, so the bar for merging code is intentionally **high**. The guidelines below spell out what "high-quality" means in practice and should make the whole process transparent and friendly.
### Development workflow
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./HUSKY.md).
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
  - In VS Code, choose **Debug: Attach to Node Process** from the command palette and choose the option in the dropdown with debug port `9229` (likely the first option)
  - Go to <chrome://inspect> in Chrome and find **localhost:9229** and click **trace**
### Writing high-impact code changes
1. **Start with an issue.** Open a new one or comment on an existing discussion so we can agree on the solution before code is written.
2. **Add or update tests.** Every new feature or bug-fix should come with test coverage that fails before your change and passes afterwards. 100% coverage is not required, but aim for meaningful assertions.
3. **Document behaviour.** If your change affects user-facing behaviour, update the README, inline help (`codex --help`), or relevant example projects.
4. **Keep commits atomic.** Each commit should compile and the tests should pass. This makes reviews and potential rollbacks easier.
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
### Review process
1. One maintainer will be assigned as a primary reviewer.
2. We may ask for changes - please do not take this personally. We value the work, we just also value consistency and long-term maintainability.
3. When there is consensus that the PR meets the bar, a maintainer will squash-and-merge.
### Community values
- **Be kind and inclusive.** Treat others with respect; we follow the [Contributor Covenant](https://www.contributor-covenant.org/).
- **Assume good intent.** Written communication is hard - err on the side of generosity.
- **Teach & learn.** If you spot something confusing, open an issue or PR with improvements.
### Getting help
If you run into problems setting up the project, would like feedback on an idea, or just want to say _hi_ - please open a Discussion or jump into the relevant issue. We are happy to help.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor license agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
1. Open your pull request.
2. Paste the following comment (or reply `recheck` if you've signed before):
```text
I have read the CLA Document and I hereby sign the CLA
```
3. The CLA-Assistant bot records your signature in the repo and marks the status check as passed.
No special Git commands, email attachments, or commit footers required.
"description":"SDP offer generated by a WebRTC RTCPeerConnection after configuring audio and the realtime events data channel.",
"type":"string"
},
"type":{
"enum":[
"webrtc"
],
"title":"WebrtcThreadRealtimeStartTransportType",
"type":"string"
}
},
"required":[
"sdp",
"type"
],
"title":"WebrtcThreadRealtimeStartTransport",
"type":"object"
}
]
},
"ThreadResumeParams":{
"description":"There are three ways to resume a thread: 1. By thread_id: load the thread from disk by thread_id and resume it. 2. By history: instantiate the thread from memory and resume it. 3. By path: load the thread from disk by path and resume it.\n\nThe precedence is: history > path > thread_id. If using history or path, the thread_id param will be ignored.\n\nPrefer using thread_id whenever possible.",
"properties":{
@@ -3139,10 +3340,27 @@
"type":"null"
}
]
},
"sessionStartSource":{
"anyOf":[
{
"$ref":"#/definitions/ThreadStartSource"
},
{
"type":"null"
}
]
}
},
"type":"object"
},
"ThreadStartSource":{
"enum":[
"startup",
"clear"
],
"type":"string"
},
"ThreadUnarchiveParams":{
"properties":{
"threadId":{
@@ -3834,6 +4052,31 @@
"title":"Thread/readRequest",
"type":"object"
},
{
"description":"Append raw Responses API items to the thread history without starting a user turn.",
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.\n\nTODO(ccunningham): Attach guardian review state to the reviewed tool item's lifecycle instead of sending separate standalone review notifications so the app-server API can persist and replay review state via `thread/read`.",
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.",
"description":"Stable identifier for this review.",
"type":"string"
},
"targetItemId":{
"description":"Identifier for the reviewed item or tool call when one exists.\n\nIn most cases, one review maps to one target item. The exceptions are - execve reviews, where a single command may contain multiple execve calls to review (only possible when using the shell_zsh_fork feature) - network policy reviews, where there is no target item\n\nA network call is triggered by a CommandExecution item, so having a target_item_id set to the CommandExecution item would be misleading because the review is about the network call, not the command execution. Therefore, target_item_id is set to None for network policy reviews.",
"type":[
"string",
"null"
]
},
"threadId":{
"type":"string"
},
@@ -1595,15 +1627,16 @@
},
"required":[
"action",
"decisionSource",
"review",
"targetItemId",
"reviewId",
"threadId",
"turnId"
],
"type":"object"
},
"ItemGuardianApprovalReviewStartedNotification":{
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.\n\nTODO(ccunningham): Attach guardian review state to the reviewed tool item's lifecycle instead of sending separate standalone review notifications so the app-server API can persist and replay review state via `thread/read`.",
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.",
"description":"Stable identifier for this review.",
"type":"string"
},
"targetItemId":{
"description":"Identifier for the reviewed item or tool call when one exists.\n\nIn most cases, one review maps to one target item. The exceptions are - execve reviews, where a single command may contain multiple execve calls to review (only possible when using the shell_zsh_fork feature) - network policy reviews, where there is no target item\n\nA network call is triggered by a CommandExecution item, so having a target_item_id set to the CommandExecution item would be misleading because the review is about the network call, not the command execution. Therefore, target_item_id is set to None for network policy reviews.",
"type":[
"string",
"null"
]
},
"threadId":{
"type":"string"
},
@@ -1624,7 +1665,7 @@
"required":[
"action",
"review",
"targetItemId",
"reviewId",
"threadId",
"turnId"
],
@@ -1736,6 +1777,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -1968,6 +2010,7 @@
"go",
"plus",
"pro",
"prolite",
"team",
"self_serve_business_usage_based",
"business",
@@ -2418,8 +2461,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -2726,8 +2773,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
@@ -3056,7 +3107,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -3089,9 +3140,13 @@
]
},
"savedPath":{
"type":[
"string",
"null"
"anyOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
},
{
"type":"null"
}
]
},
"status":{
@@ -3303,6 +3358,22 @@
],
"type":"object"
},
"ThreadRealtimeSdpNotification":{
"description":"EXPERIMENTAL - emitted with the remote SDP for a WebRTC realtime session.",
"properties":{
"sdp":{
"type":"string"
},
"threadId":{
"type":"string"
}
},
"required":[
"sdp",
"threadId"
],
"type":"object"
},
"ThreadRealtimeStartedNotification":{
"description":"EXPERIMENTAL - emitted when thread realtime startup is accepted.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"ByteRange":{
"properties":{
"end":{
@@ -78,7 +82,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -294,6 +298,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -664,8 +669,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AutoReviewDecisionSource":{
"description":"[UNSTABLE] Source that produced a terminal guardian approval review decision.",
"enum":[
"agent"
],
"type":"string"
},
"GuardianApprovalReview":{
"description":"[UNSTABLE] Temporary guardian approval review payload used by `item/autoApprovalReview/*` notifications. This shape is expected to change soon.",
"description":"[UNSTABLE] Risk level assigned by guardian approval review.",
"enum":[
"low",
"medium",
"high",
"critical"
],
"type":"string"
},
"GuardianUserAuthorization":{
"description":"[UNSTABLE] Authorization level assigned by guardian approval review.",
"enum":[
"unknown",
"low",
"medium",
"high"
@@ -243,17 +268,28 @@
"type":"string"
}
},
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.\n\nTODO(ccunningham): Attach guardian review state to the reviewed tool item's lifecycle instead of sending separate standalone review notifications so the app-server API can persist and replay review state via `thread/read`.",
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.",
"description":"Stable identifier for this review.",
"type":"string"
},
"targetItemId":{
"description":"Identifier for the reviewed item or tool call when one exists.\n\nIn most cases, one review maps to one target item. The exceptions are - execve reviews, where a single command may contain multiple execve calls to review (only possible when using the shell_zsh_fork feature) - network policy reviews, where there is no target item\n\nA network call is triggered by a CommandExecution item, so having a target_item_id set to the CommandExecution item would be misleading because the review is about the network call, not the command execution. Therefore, target_item_id is set to None for network policy reviews.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"GuardianApprovalReview":{
"description":"[UNSTABLE] Temporary guardian approval review payload used by `item/autoApprovalReview/*` notifications. This shape is expected to change soon.",
"description":"[UNSTABLE] Risk level assigned by guardian approval review.",
"enum":[
"low",
"medium",
"high",
"critical"
],
"type":"string"
},
"GuardianUserAuthorization":{
"description":"[UNSTABLE] Authorization level assigned by guardian approval review.",
"enum":[
"unknown",
"low",
"medium",
"high"
@@ -243,7 +261,7 @@
"type":"string"
}
},
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.\n\nTODO(ccunningham): Attach guardian review state to the reviewed tool item's lifecycle instead of sending separate standalone review notifications so the app-server API can persist and replay review state via `thread/read`.",
"description":"[UNSTABLE] Temporary notification payload for guardian automatic approval review. This shape is expected to change soon.",
"description":"Stable identifier for this review.",
"type":"string"
},
"targetItemId":{
"description":"Identifier for the reviewed item or tool call when one exists.\n\nIn most cases, one review maps to one target item. The exceptions are - execve reviews, where a single command may contain multiple execve calls to review (only possible when using the shell_zsh_fork feature) - network policy reviews, where there is no target item\n\nA network call is triggered by a CommandExecution item, so having a target_item_id set to the CommandExecution item would be misleading because the review is about the network call, not the command execution. Therefore, target_item_id is set to None for network policy reviews.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"ByteRange":{
"properties":{
"end":{
@@ -78,7 +82,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -294,6 +298,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -664,8 +669,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"ByteRange":{
"properties":{
"end":{
@@ -214,7 +218,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -430,6 +434,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -807,8 +812,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AgentPath":{
"type":"string"
},
@@ -217,7 +221,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -456,6 +460,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -793,8 +798,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -1079,8 +1088,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AgentPath":{
"type":"string"
},
@@ -217,7 +221,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -456,6 +460,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -793,8 +798,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -1079,8 +1088,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AgentPath":{
"type":"string"
},
@@ -217,7 +221,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -456,6 +460,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -793,8 +798,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -1079,8 +1088,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AgentPath":{
"type":"string"
},
@@ -217,7 +221,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -456,6 +460,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -793,8 +798,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -1079,8 +1088,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AgentPath":{
"type":"string"
},
@@ -217,7 +221,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -456,6 +460,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -793,8 +798,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -1079,8 +1088,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"AgentPath":{
"type":"string"
},
@@ -217,7 +221,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -456,6 +460,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -793,8 +798,12 @@
"type":"integer"
},
"cwd":{
"description":"Working directory captured for the thread.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"Working directory captured for the thread."
},
"ephemeral":{
"description":"Whether the thread is ephemeral and should not be materialized on disk.",
@@ -1079,8 +1088,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"ByteRange":{
"properties":{
"end":{
@@ -214,7 +218,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -430,6 +434,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -807,8 +812,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"ByteRange":{
"properties":{
"end":{
@@ -214,7 +218,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -430,6 +434,7 @@
},
"McpToolCallResult":{
"properties":{
"_meta":true,
"content":{
"items":true,
"type":"array"
@@ -807,8 +812,12 @@
"type":"array"
},
"cwd":{
"description":"The command's working directory.",
"type":"string"
"allOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
}
],
"description":"The command's working directory."
},
"durationMs":{
"description":"The duration of the command execution in milliseconds.",
@@ -1137,7 +1146,7 @@
"type":"string"
},
"path":{
"type":"string"
"$ref":"#/definitions/AbsolutePathBuf"
},
"type":{
"enum":[
@@ -1170,9 +1179,13 @@
]
},
"savedPath":{
"type":[
"string",
"null"
"anyOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
},
{
"type":"null"
}
]
},
"status":{
Some files were not shown because too many files have changed in this diff
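
The `AbsolutePathBuf` description repeated in the schema hunks above says that a relative value only deserializes when a base path has been set, and that the result is absolute and normalized but not canonicalized. A minimal sketch of that rule follows, using only the standard library and assuming Unix-style paths. `resolve_absolute` is a hypothetical helper written for illustration; it is not the `AbsolutePathBuf` or `AbsolutePathBufGuard` implementation from this change.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical illustration of the rule stated in the schema description:
/// an absolute input is accepted as-is, a relative input needs a base path,
/// and the result is normalized lexically without touching the filesystem.
fn resolve_absolute(raw: &str, base: Option<&Path>) -> Result<PathBuf, String> {
    let raw_path = Path::new(raw);

    // Join relative input onto the base, or fail when no usable base exists.
    let joined = if raw_path.is_absolute() {
        raw_path.to_path_buf()
    } else {
        match base {
            Some(b) if b.is_absolute() => b.join(raw_path),
            Some(_) => return Err("base path must itself be absolute".to_string()),
            None => return Err(format!("relative path {:?} needs a base path", raw)),
        }
    };

    // Lexical normalization: drop `.` and resolve `..` against what has been
    // collected so far. Symlinks are not resolved and existence is not checked.
    let mut normalized = PathBuf::new();
    for component in joined.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                normalized.pop();
            }
            other => normalized.push(other.as_os_str()),
        }
    }
    Ok(normalized)
}
```

Under these assumptions, a client that sends `cwd` or `path` as an already-absolute string would pass through unchanged apart from normalization, while a relative string would be rejected unless the receiving side supplied a base directory.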