## Summary
- add the guardian reviewer flow for `on-request` approvals in command,
patch, sandbox-retry, and managed-network approval paths
- keep guardian behind `features.guardian_approval` instead of exposing
a public `approval_policy = guardian` mode
- route ordinary `OnRequest` approvals to the guardian subagent when the
feature is enabled, without changing the public approval-mode surface
## Public model
- public approval modes stay unchanged
- guardian is enabled via `features.guardian_approval`
- when that feature is on, `approval_policy = on-request` keeps the same
approval boundaries but sends those approval requests to the guardian
reviewer instead of the user
- `/experimental` only persists the feature flag; it does not rewrite
`approval_policy`
- CLI and app-server no longer expose a separate `guardian` approval
mode in this PR
## Guardian reviewer
- the reviewer runs as a normal subagent and reuses the existing
subagent/thread machinery
- it is locked to a read-only sandbox and `approval_policy = never`
- it does not inherit user/project exec-policy rules
- it prefers `gpt-5.4` when the current provider exposes it, otherwise
falls back to the parent turn's active model
- it fail-closes on timeout, startup failure, malformed output, or any
other review error
- it currently auto-approves only when `risk_score < 80`
## Review context and policy
- guardian mirrors `OnRequest` approval semantics rather than
introducing a separate approval policy
- explicit `require_escalated` requests follow the same approval surface
as `OnRequest`; the difference is only who reviews them
- managed-network allowlist misses that enter the approval flow are also
reviewed by guardian
- the review prompt includes bounded recent transcript history plus
recent tool call/result evidence
- transcript entries and planned-action strings are truncated with
explicit `<guardian_truncated ... />` markers so large payloads stay
bounded
- apply-patch reviews include the full patch content (without
duplicating the structured `changes` payload)
- the guardian request layout is snapshot-tested using the same
model-visible Responses request formatter used elsewhere in core
## Guardian network behavior
- the guardian subagent inherits the parent session's managed-network
allowlist when one exists, so it can use the same approved network
surface while reviewing
- exact session-scoped network approvals are copied into the guardian
session with protocol/port scope preserved
- those copied approvals are now seeded before the guardian's first turn
is submitted, so inherited approvals are available during any immediate
review-time checks
## Out of scope / follow-ups
- the sandbox-permission validation split was pulled into a separate PR
and is not part of this diff
- a future follow-up can enable `serde_json` preserve-order in
`codex-core` and then simplify the guardian action rendering further
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`apply_patch` safety approval was still checking writable paths through
the legacy `SandboxPolicy` projection.
That can hide explicit `none` carveouts when a split filesystem policy
projects back to compatibility `ExternalSandbox`, which leaves one more
approval path that can auto-approve writes inside paths that are
intentionally blocked.
## What changed
- passed `turn.file_system_sandbox_policy` into `assess_patch_safety`
- changed writable-path checks to derive effective access from
`FileSystemSandboxPolicy` instead of the legacy `SandboxPolicy`
- made those checks reject explicit unreadable roots before considering
broad write access or writable roots
- added regression coverage showing that an `ExternalSandbox`
compatibility projection still asks for approval when the split
filesystem policy blocks a subpath
## Verification
- `cargo test -p codex-core safety::tests::`
- `cargo test -p codex-core test_sandbox_config_parsing`
- `cargo clippy -p codex-core --all-targets -- -D warnings`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13445).
* #13453
* #13452
* #13451
* #13449
* #13448
* __->__ #13445
* #13440
* #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
Addresses feature request #13660
Adds new option to `/statusline` so the status line can display "fast
on" or "fast off"
Summary
- introduce a `FastMode` status-line item so `/statusline` can render
explicit `Fast on`/`Fast off` text for the service tier
- wire the item into the picker metadata and resolve its string from
`ChatWidget` without adding any unrelated `thread-name` logic or storage
changes
- ensure the refresh paths keep the cached footer in sync when the
service tier (fast mode) changes
Testing
- Manually tested
Here's what it looks like when enabled:
<img width="366" height="75" alt="image"
src="https://github.com/user-attachments/assets/7f992d2b-6dab-49ed-aa43-ad496f56f193"
/>
## Summary
- require windowsSandbox/setupStart.cwd to be an AbsolutePathBuf
- reject relative cwd values at request parsing instead of normalizing
them later in the setup flow
- add RPC-layer coverage for relative cwd rejection and update the
checked-in protocol schemas/docs
## Why
windowsSandbox/setupStart was carrying the client-provided cwd as a raw
PathBuf for command_cwd while config derivation normalized the same
value into an absolute policy_cwd.
That left room for relative-path ambiguity in the setup path, especially
for inputs like cwd: "repo". Making the RPC accept only absolute paths
removes that split entirely: the handler now receives one
already-validated absolute path and uses it for both config derivation
and setup.
This keeps the trust model unchanged. Trusted clients could already
choose the session cwd; this change is only about making the setup RPC
reject relative paths so command_cwd and policy_cwd cannot diverge.
## Testing
- cargo test -p codex-app-server windows_sandbox_setup (run locally by
user)
- cargo test -p codex-app-server-protocol windows_sandbox (run locally
by user)
## Summary
- distinguish reject-policy handling for prefix-rule approvals versus
sandbox approvals in Unix shell escalation
- keep prompting for skill-script execution when `rules=true` but
`sandbox_approval=false`, instead of denying the command up front
- add regression coverage for both skill-script reject-policy paths in
`codex-rs/core/tests/suite/skill_approval.rs`
## Why
`#13434` and `#13439` introduce split filesystem and network policies,
but the only code that could answer basic filesystem questions like "is
access effectively unrestricted?" or "which roots are readable and
writable for this cwd?" still lived on the legacy `SandboxPolicy` path.
That would force later backends to either keep projecting through
`SandboxPolicy` or duplicate path-resolution logic. This PR moves those
queries onto `FileSystemSandboxPolicy` itself so later runtime and
platform changes can consume the split policy directly.
## What changed
- added `FileSystemSandboxPolicy` helpers for full-read/full-write
checks, platform-default reads, readable roots, writable roots, and
explicit unreadable roots resolved against a cwd
- added a shared helper for the default read-only carveouts under
writable roots so the legacy and split-policy paths stay aligned
- added protocol coverage for full-access detection and derived
readable, writable, and unreadable roots
## Verification
- added protocol coverage in `protocol/src/protocol.rs` and
`protocol/src/permissions.rs` for full-root access and derived
filesystem roots
- verified the current PR state with `just clippy`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13440).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* __->__ #13440
* #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
## Why
`#13434` introduces split `FileSystemSandboxPolicy` and
`NetworkSandboxPolicy`, but the runtime still made most execution-time
sandbox decisions from the legacy `SandboxPolicy` projection.
That projection loses information about combinations like unrestricted
filesystem access with restricted network access. In practice, that
means the runtime can choose the wrong platform sandbox behavior or set
the wrong network-restriction environment for a command even when config
has already separated those concerns.
This PR carries the split policies through the runtime so sandbox
selection, process spawning, and exec handling can consult the policy
that actually matters.
## What changed
- threaded `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` through
`TurnContext`, `ExecRequest`, sandbox attempts, shell escalation state,
unified exec, and app-server exec overrides
- updated sandbox selection in `core/src/sandboxing/mod.rs` and
`core/src/exec.rs` to key off `FileSystemSandboxPolicy.kind` plus
`NetworkSandboxPolicy`, rather than inferring behavior only from the
legacy `SandboxPolicy`
- updated process spawning in `core/src/spawn.rs` and the platform
wrappers to use `NetworkSandboxPolicy` when deciding whether to set
`CODEX_SANDBOX_NETWORK_DISABLED`
- kept additional-permissions handling and legacy `ExternalSandbox`
compatibility projections aligned with the split policies, including
explicit user-shell execution and Windows restricted-token routing
- updated callers across `core`, `app-server`, and `linux-sandbox` to
pass the split policies explicitly
## Verification
- added regression coverage in `core/tests/suite/user_shell_cmd.rs` to
verify `RunUserShellCommand` does not inherit
`CODEX_SANDBOX_NETWORK_DISABLED` from the active turn
- added coverage in `core/src/exec.rs` for Windows restricted-token
sandbox selection when the legacy projection is `ExternalSandbox`
- updated Linux sandbox coverage in
`linux-sandbox/tests/suite/landlock.rs` to exercise the split-policy
exec path
- verified the current PR state with `just clippy`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13439).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* #13440
* __->__ #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
## Summary
- treat `requirements.toml` `allowed_domains` and `denied_domains` as
managed network baselines for the proxy
- in restricted modes by default, build the effective runtime policy
from the managed baseline plus user-configured allowlist and denylist
entries, so common hosts can be pre-approved without blocking later user
expansion
- add `experimental_network.managed_allowed_domains_only = true` to pin
the effective allowlist to managed entries, ignore user allowlist
additions, and hard-deny non-managed domains without prompting
- apply `managed_allowed_domains_only` anywhere managed network
enforcement is active, including full access, while continuing to
respect denied domains from all sources
- add regression coverage for merged-baseline behavior, managed-only
behavior, and full-access managed-only enforcement
## Behavior
Assuming `requirements.toml` defines both
`experimental_network.allowed_domains` and
`experimental_network.denied_domains`.
### Default mode
- By default, the effective allowlist is
`experimental_network.allowed_domains` plus user or persisted allowlist
additions.
- By default, the effective denylist is
`experimental_network.denied_domains` plus user or persisted denylist
additions.
- Allowlist misses can go through the network approval flow.
- Explicit denylist hits and local or private-network blocks are still
hard-denied.
- When `experimental_network.managed_allowed_domains_only = true`, only
managed `allowed_domains` are respected, user allowlist additions are
ignored, and non-managed domains are hard-denied without prompting.
- Denied domains continue to be respected from all sources.
### Full access
- With managed requirements present, the effective allowlist is pinned
to `experimental_network.allowed_domains`.
- With managed requirements present, the effective denylist is pinned to
`experimental_network.denied_domains`.
- There is no allowlist-miss approval path in full access.
- Explicit denylist hits are hard-denied.
- `experimental_network.managed_allowed_domains_only = true` now also
applies in full access, so managed-only behavior remains in effect
anywhere managed network enforcement is active.
## Summary
- resolve trust roots by inspecting `.git` entries on disk instead of
spawning `git rev-parse --git-common-dir`
- keep regular repo and linked-worktree trust inheritance behavior
intact
- add a synthetic regression test that proves worktree trust resolution
works without a real git command
## Testing
- `just fmt`
- `cargo test -p codex-core resolve_root_git_project_for_trust`
- `cargo clippy -p codex-core --all-targets -- -D warnings`
- `cargo test -p codex-core` (fails in this environment on unrelated
managed-config `DangerFullAccess` tests in `codex::tests`,
`tools::js_repl::tests`, and `unified_exec::tests`)
This fixes a schema export bug where two different `WebSearchAction`
types were getting merged under the same name in the app-server v2 JSON
schema bundle.
The problem was that v2 thread items use the app-server API's
`WebSearchAction` with camelCase variants like `openPage`, while
`ThreadResumeParams.history` and
`RawResponseItemCompletedNotification.item` pull in the upstream
`ResponseItem` graph, which uses the Responses API snake_case shape like
`open_page`. During bundle generation we were flattening nested
definitions into the v2 namespace by plain name, so the later definition
could silently overwrite the earlier one.
That meant clients generating code from the bundled schema could end up
with the wrong `WebSearchAction` definition for v2 thread history. In
practice this shows up on web search items reconstructed from rollout
files with persisted extended history.
This change does two things:
- Gives the upstream Responses API schema a distinct JSON schema name:
`ResponsesApiWebSearchAction`
- Makes namespace-level schema definition collisions fail loudly instead
of silently overwriting
* Add an ability to stream stdin, stdout, and stderr
* Streaming of stdout and stderr has a configurable cap for total amount
of transmitted bytes (with an ability to disable it)
* Add support for overriding environment variables
* Add an ability to terminate running applications (using
`command/exec/terminate`)
* Add TTY/PTY support, with an ability to resize the terminal (using
`command/exec/resize`)
Previously, we could only configure whether web search was on/off.
This PR enables sending along a web search config, which includes all
the stuff responsesapi supports: filters, location, etc.
## Summary
- Treat skill scripts with no permission profile, or an explicitly empty
one, as permissionless and run them with the turn's existing sandbox
instead of forcing an exec approval prompt.
- Keep the approval flow unchanged for skills that do declare additional
permissions.
- Update the skill approval tests to assert that permissionless skill
scripts do not prompt on either the initial run or a rerun.
## Why
Permissionless skills should inherit the current turn sandbox directly.
Prompting for exec approval in that case adds friction without granting
any additional capability.
1. Add a synced curated plugin marketplace and include it in marketplace
discovery.
2. Expose optional plugin.json interface metadata in plugin/list
3. Tighten plugin and marketplace path handling using validated absolute
paths.
4. Let manifests override skill, MCP, and app config paths.
5. Restrict plugin enablement/config loading to the user config layer so
plugin enablement is at global level
## Summary
This is a purely mechanical refactor of `OtelManager` ->
`SessionTelemetry` to better convey what the struct is doing. No
behavior change.
## Why
`OtelManager` ended up sounding much broader than what this type
actually does. It doesn't manage OTEL globally; it's the session-scoped
telemetry surface for emitting log/trace events and recording metrics
with consistent session metadata (`app_version`, `model`, `slug`,
`originator`, etc.).
`SessionTelemetry` is a more accurate name, and updating the call sites
makes that boundary a lot easier to follow.
## Validation
- `just fmt`
- `cargo test -p codex-otel`
- `cargo test -p codex-core`
I mainly use the devcontainer to be able to run `cargo clippy --tests`
locally for Linux.
We still need to make it possible to run clippy from Bazel so I don't
need to do this!
- add experimental_realtime_ws_startup_context to override or disable
realtime websocket startup context
- preserve generated startup context when unset and cover the new
override paths in tests
## Why
`SandboxPolicy` currently mixes together three separate concerns:
- parsing layered config from `config.toml`
- representing filesystem sandbox state
- carrying basic network policy alongside filesystem choices
That makes the existing config awkward to extend and blocks the new TOML
proposal where `[permissions]` becomes a table of named permission
profiles selected by `default_permissions`. (The idea is that if
`default_permissions` is not specified, we assume the user is opting
into the "traditional" way to configure the sandbox.)
This PR adds the config-side plumbing for those profiles while still
projecting back to the legacy `SandboxPolicy` shape that the current
macOS and Linux sandbox backends consume.
It also tightens the filesystem profile model so scoped entries only
exist for `:project_roots`, and so nested keys must stay within a
project root instead of using `.` or `..` traversal.
This drops support for the short-lived `[permissions.network]` in
`config.toml` because now that would be interpreted as a profile named
`network` within `[permissions]`.
## What Changed
- added `PermissionsToml`, `PermissionProfileToml`,
`FilesystemPermissionsToml`, and `FilesystemPermissionToml` so config
can parse named profiles under `[permissions.<profile>.filesystem]`
- added top-level `default_permissions` selection, validation for
missing or unknown profiles, and compilation from a named profile into
split `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` values
- taught config loading to choose between the legacy `sandbox_mode` path
and the profile-based path without breaking legacy users
- introduced `codex-protocol::permissions` for the split filesystem and
network sandbox types, and stored those alongside the legacy projected
`sandbox_policy` in runtime `Permissions`
- modeled `FileSystemSpecialPath` so only `ProjectRoots` can carry a
nested `subpath`, matching the intended config syntax instead of
allowing invalid states for other special paths
- restricted scoped filesystem maps to `:project_roots`, with validation
that nested entries are non-empty descendant paths and cannot use `.` or
`..` to escape the project root
- kept existing runtime consumers working by projecting
`FileSystemSandboxPolicy` back into `SandboxPolicy`, with an explicit
error for profiles that request writes outside the workspace root
- loaded proxy settings from top-level `[network]`
- regenerated `core/config.schema.json`
## Verification
- added config coverage for profile deserialization,
`default_permissions` selection, top-level `[network]` loading, network
enablement, rejection of writes outside the workspace root, rejection of
nested entries for non-`:project_roots` special paths, and rejection of
parent-directory traversal in `:project_roots` maps
- added protocol coverage for the legacy bridge rejecting non-workspace
writes
## Docs
- update the Codex config docs on developers.openai.com/codex to
document named `[permissions.<profile>]` entries, `default_permissions`,
scoped `:project_roots` syntax, the descendant-path restriction for
nested `:project_roots` entries, and top-level `[network]` proxy
configuration
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13434).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* #13440
* #13439
* __->__ #13434
## Summary
Remove `docs/auth-login-logging-plan.md`.
## Why
The document was a temporary planning artifact. The durable rationale
for the
auth-login diagnostics work now lives in the code comments, tests, PR
context,
and existing implementation notes, so keeping the standalone plan doc
adds
duplicate maintenance surface.
## Testing
- not run (docs-only deletion)
Co-authored-by: Codex <noreply@openai.com>
## Summary
Clarify the `js_repl` prompt guidance around persistent bindings and
redeclaration recovery.
This updates the generated `js_repl` instructions in
`core/src/project_doc.rs` to prefer this order when a name is already
bound:
1. Reuse the existing binding
2. Reassign a previously declared `let`
3. Pick a new descriptive name
4. Use `{ ... }` only for short-lived scratch scope
5. Reset the kernel only when a clean state is actually needed
The prompt now also explicitly warns against wrapping an entire cell in
block scope when the goal is to reuse names across later cells.
## Why
The previous wording still left too much room for low-value workarounds
like whole-cell block wrapping. In downstream browser rollouts, that
pattern was adding tokens and preventing useful state reuse across
`js_repl` cells.
This change makes the preferred behavior more explicit without changing
runtime semantics.
## Scope
- Prompt/documentation change only
- No runtime behavior changes
- Updates the matching string-backed `project_doc` tests
Enhance pty utils:
* Support closing stdin
* Separate stderr and stdout streams to allow consumers differentiate them
* Provide compatibility helper to merge both streams back into combined one
* Support specifying terminal size for pty, including on-demand resizes while process is already running
* Support terminating the process while still consuming its outputs
## Problem
Browser login failures historically leave support with an incomplete
picture. HARs can show that the browser completed OAuth and reached the
localhost callback, but they do not explain why the native client failed
on the final `/oauth/token` exchange. Direct `codex login` also relied
mostly on terminal stderr and the browser error page, so even when the
login crate emitted better sign-in diagnostics through TUI or app-server
flows, the one-shot CLI path still did not leave behind an easy artifact
to collect.
## Mental model
This implementation treats the browser page, the returned `io::Error`,
and the normal structured log as separate surfaces with different safety
requirements. The browser page and returned error preserve the detail
that operators need to diagnose failures. The structured log stays
narrower: it records reviewed lifecycle events, parsed safe fields, and
redacted transport errors without becoming a sink for secrets or
arbitrary backend bodies.
Direct `codex login` now adds a fourth support surface: a small
file-backed log at `codex-login.log` under the configured `log_dir`.
That artifact carries the same login-target events as the other
entrypoints without changing the existing stderr/browser UX.
## Non-goals
This does not add auth logging to normal runtime requests, and it does
not try to infer precise transport root causes from brittle string
matching. The scope remains the browser-login callback flow in the
`login` crate plus a direct-CLI wrapper that persists those events to
disk.
This also does not try to reuse the TUI logging stack wholesale. The TUI
path initializes feedback, OpenTelemetry, and other session-oriented
layers that are useful for an interactive app but unnecessary for a
one-shot login command.
## Tradeoffs
The implementation favors fidelity for caller-visible errors and
restraint for persistent logs. Parsed JSON token-endpoint errors are
logged safely by field. Non-JSON token-endpoint bodies remain available
to the returned error so CLI and browser surfaces still show backend
detail. Transport errors keep their real `reqwest` message, but attached
URLs are surgically redacted. Custom issuer URLs are sanitized before
logging.
On the CLI side, the code intentionally duplicates a narrow slice of the
TUI file-logging setup instead of sharing the full initializer. That
keeps `codex login` easy to reason about and avoids coupling it to
interactive-session layers that the command does not need.
## Architecture
The core auth behavior lives in `codex-rs/login/src/server.rs`. The
callback path now logs callback receipt, callback validation,
token-exchange start, token-exchange success, token-endpoint non-2xx
responses, and transport failures. App-server consumers still use this
same login-server path via `run_login_server(...)`, so the same
instrumentation benefits TUI, Electron, and VS Code extension flows.
The direct CLI path in `codex-rs/cli/src/login.rs` now installs a small
file-backed tracing layer for login commands only. That writes
`codex-login.log` under `log_dir` with login-specific targets such as
`codex_cli::login` and `codex_login::server`.
## Observability
The main signals come from the `login` crate target and are
intentionally scoped to sign-in. Structured logs include redacted issuer
URLs, redacted transport errors, HTTP status, and parsed token-endpoint
fields when available. The callback-layer log intentionally avoids
`%err` on token-endpoint failures so arbitrary backend bodies do not get
copied into the normal log file.
Direct `codex login` now leaves a durable artifact for both failure and
success cases. Example output from the new file-backed CLI path:
Failing callback:
```text
2026-03-06T22:08:54.143612Z INFO codex_cli::login: starting browser login flow
2026-03-06T22:09:03.431699Z INFO codex_login::server: received login callback path=/auth/callback has_code=false has_state=true has_error=true state_valid=true
2026-03-06T22:09:03.431745Z WARN codex_login::server: oauth callback returned error error_code="access_denied" has_error_description=true
```
Succeeded callback and token exchange:
```text
2026-03-06T22:09:14.065559Z INFO codex_cli::login: starting browser login flow
2026-03-06T22:09:36.431678Z INFO codex_login::server: received login callback path=/auth/callback has_code=true has_state=true has_error=false state_valid=true
2026-03-06T22:09:36.436977Z INFO codex_login::server: starting oauth token exchange issuer=https://auth.openai.com/ redirect_uri=http://localhost:1455/auth/callback
2026-03-06T22:09:36.685438Z INFO codex_login::server: oauth token exchange succeeded status=200 OK
```
## Tests
- `cargo test -p codex-login`
- `cargo clippy -p codex-login --tests -- -D warnings`
- `cargo test -p codex-cli`
- `just bazel-lock-update`
- `just bazel-lock-check`
- manual direct `codex login` smoke tests for both a failing callback
and a successful browser login
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
This is a structural cleanup of `codex-otel` to make the ownership
boundaries a lot clearer.
For example, previously it was quite confusing that `OtelManager` which
emits log + trace event telemetry lived under
`codex-rs/otel/src/traces/`. Also, there were two places that defined
methods on OtelManager via `impl OtelManager` (`lib.rs` and
`otel_manager.rs`).
What changed:
- move the `OtelProvider` implementation into `src/provider.rs`
- move `OtelManager` and session-scoped event emission into
`src/events/otel_manager.rs`
- collapse the shared log/trace event helpers into
`src/events/shared.rs`
- pull target classification into `src/targets.rs`
- move `traceparent_context_from_env()` into `src/trace_context.rs`
- keep `src/otel_provider.rs` as a compatibility shim for existing
imports
- update the `codex-otel` README to reflect the new layout
## Why
`lib.rs` and `otel_provider.rs` were doing too many different jobs at
once: provider setup, export routing, trace-context helpers, and session
event emission all lived together.
This refactor separates those concerns without trying to change the
behavior of the crate. The goal is to make future OTEL work easier to
reason about and easier to review.
## Notes
- no intended behavior change
- `OtelManager` remains the session-scoped event emitter in this PR
- the `otel_provider` shim keeps downstream churn low while the
internals move around
## Validation
- `just fmt`
- `cargo test -p codex-otel`
- `just fix -p codex-otel`
## Summary
- reject the global `*` domain pattern in proxy allow/deny lists and
managed constraints introduced for testing earlier
- keep exact hosts plus scoped wildcards like `*.example.com` and
`**.example.com`
- update docs and regression tests for the new invalid-config behavior
At over 7,000 lines, `codex-rs/core/src/config/mod.rs` was getting a bit
unwieldy.
This PR does the same type of move as
https://github.com/openai/codex/pull/12957 to put unit tests in their
own file, though I decided `config_tests.rs` is a more intuitive name
than `mod_tests.rs`.
Ultimately, I'll codemod the rest of the codebase to follow suit, but I
want to do it in stages to reduce merge conflicts for people.
## Summary
- reduce the SQLite-backed log retention window from 90 days to 10 days
## Testing
- just fmt
- cargo test -p codex-state
Co-authored-by: Codex <noreply@openai.com>
#### What
Add structured `@plugin` parsing and TUI support for plugin mentions.
- Core: switch from plain-text `@display_name` parsing to structured
`plugin://...` mentions via `UserInput::Mention` and
`[$...](plugin://...)` links in text, same pattern as apps/skills.
- TUI: add plugin mention popup, autocomplete, and chips when typing
`$`. Load plugin capability summaries and feed them into the composer;
plugin mentions appear alongside skills and apps.
- Generalize mention parsing to a sigil parameter, still defaults to `$`
<img width="797" height="119" alt="image"
src="https://github.com/user-attachments/assets/f0fe2658-d908-4927-9139-73f850805ceb"
/>
Builds on #13510. Currently clients have to build their own `id` via
`plugin@marketplace` and filter plugins to show by `enabled`, but we
will add `id` and `available` as fields returned from `plugin/list`
soon.
####Tests
Added tests, verified locally.
This branch:
* Avoid flushing DB when not necessary
* Filter events for which we perfom an `upsert` into the DB
* Add a dedicated update function of the `thread:updated_at` that is
lighter
This should significantly reduce the DB lock contention. If it is not
sufficient, we can de-sync the flush of the DB for `updated_at`
## Summary
- move sqlite log reads and writes onto a dedicated `logs_1.sqlite`
database to reduce lock contention with the main state DB
- add a dedicated logs migrator and route `codex-state-logs` to the new
database path
- leave the old `logs` table in the existing state DB untouched for now
## Testing
- just fmt
- cargo test -p codex-state
---------
Co-authored-by: Codex <noreply@openai.com>
### Summary
This adds turn-level latency metrics for the first model output and the
first completed agent message.
- `codex.turn.ttft.duration_ms` starts at turn start and records on the
first output signal we see from the model. That includes normal
assistant text, reasoning deltas, and non-text outputs like tool-call
items.
- `codex.turn.ttfm.duration_ms` also starts at turn start, but it
records when the first agent message finishes streaming rather than when
its first delta arrives.
### Implementation notes
The timing is tracked in codex-core, not app-server, so the definition
stays consistent across CLI, TUI, and app-server clients.
I reused the existing turn lifecycle boundary that already drives
`codex.turn.e2e_duration_ms`, stored the turn start timestamp in turn
state, and record each metric once per turn.
I also wired the new metric names into the OTEL runtime metrics summary
so they show up in the same in-memory/debug snapshot path as the
existing timing metrics.
This fixes a flaky `turn_start_shell_zsh_fork_executes_command_v2` test.
The interrupt path can race with the follow-up `/responses` request that
reports the aborted tool call, so the test now allows that extra no-op
response instead of assuming there will only ever be one request. The
assertions still stay focused on the behavior the test actually cares
about: starting the zsh-forked command correctly.
Testing:
- `just fmt`
- `cargo test -p codex-app-server --test all
suite::v2::turn_start_zsh_fork::turn_start_shell_zsh_fork_executes_command_v2
-- --exact --nocapture`
## Summary
Today `SandboxPermissions::requires_additional_permissions()` does not
actually mean "is `WithAdditionalPermissions`". It returns `true` for
any non-default sandbox override, including `RequireEscalated`. That
broad behavior is relied on in multiple `main` callsites.
The naming is security-sensitive because `SandboxPermissions` is used on
shell-like tool calls to tell the executor how a single command should
relate to the turn sandbox:
- `UseDefault`: run with the turn sandbox unchanged
- `RequireEscalated`: request execution outside the sandbox
- `WithAdditionalPermissions`: stay sandboxed but widen permissions for
that command only
## Problem
The old helper name reads as if it only applies to the
`WithAdditionalPermissions` variant. In practice it means "this command
requested any explicit sandbox override."
That ambiguity made it easy to read production checks incorrectly and
made the guardian change look like a standalone `main` fix when it is
not.
On `main` today:
- `shell` and `unified_exec` intentionally reject any explicit
`sandbox_permissions` request unless approval policy is `OnRequest`
- `exec_policy` intentionally treats any explicit sandbox override as
prompt-worthy in restricted sandboxes
- tests intentionally serialize both `RequireEscalated` and
`WithAdditionalPermissions` as explicit sandbox override requests
So changing those callsites from the broad helper to a narrow
`WithAdditionalPermissions` check would be a behavior change, not a pure
cleanup.
## What This PR Does
- documents `SandboxPermissions` as a per-command sandbox override, not
a generic permissions bag
- adds `requests_sandbox_override()` for the broad meaning: anything
except `UseDefault`
- adds `uses_additional_permissions()` for the narrow meaning: only
`WithAdditionalPermissions`
- keeps `requires_additional_permissions()` as a compatibility alias to
the broad meaning for now
- updates the current broad callsites to use the accurately named broad
helper
- adds unit coverage that locks in the semantics of all three helpers
## What This PR Does Not Do
This PR does not change runtime behavior. That is intentional.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add one-time session recovery in `RmcpClient` for streamable HTTP MCP
`404` session expiry
- rebuild the transport and retry the failed operation once after
reinitializing the client state
- extend the test server and integration coverage for `404`, `401`,
single-retry, and non-session failure scenarios
## Testing
- just fmt
- cargo test -p codex-rmcp-client (the post-rebase run lost its final
summary in the terminal; the suite had passed earlier before the rebase)
- just fix -p codex-rmcp-client
`/feedback` uploads can include `codex-logs.log` from the in-memory
feedback logger path. That logger was emitting level + message without a
timestamp, which made some uploaded logs much harder to inspect. This
change makes the feedback logger use an explicit timer so
feedback-captured log lines include timestamps consistently.
This is not Windows-specific code. The bug showed up in Windows reports
because those uploads were hitting the feedback-buffer path more often,
while Linux/macOS reports were typically coming from the SQLite feedback
export, which already prefixes timestamps.
Here's an example of a log that is missing the timestamps:
```
TRACE app-server request: getAuthStatus
TRACE app-server request: model/list
INFO models cache: evaluating cache eligibility
INFO models cache: attempting load_fresh
INFO models cache: loaded cache file
INFO models cache: cache version mismatch
INFO models cache: no usable cache entry
DEBUG
INFO models cache: cache miss, fetching remote models
TRACE windows::current_platform is called
TRACE Returning Info { os_type: Windows, version: Semantic(10, 0, 26200), edition: Some("Windows 11 Professional"), codename: None, bitness: X64, architecture: Some("x86_64") }
```
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
#### What
on `plugin/install`, check if installed apps are already authed on
chatgpt, and return list of all apps that are not. clients can use this
list to trigger auth workflows as needed.
checks are best effort based on `codex_apps` loading, much like
`app/list`.
#### Tests
Added integration tests, tested locally.
## Summary
Simplify the trusted directory flow. This logic was originally designed
several months ago, to determine if codex should start in read-only or
workspace-write mode. However, that's no longer the purpose of directory
trust - and therefore we should get rid of this logic.
## Testing
- [x] Unit tests pass
We do this for codex-command-runner.exe as well for the same reason.
Windows sandbox users cannot execute binaries in the WindowsApp/
installed directory for the Codex App. This causes apply-patch to fail
because it tries to execute codex.exe as the sandbox user.
## Summary
- delete the network proxy admin server and its runtime listener/task
plumbing
- remove the admin endpoint config, runtime, requirement, protocol,
schema, and debug-surface fields
- update proxy docs to reflect the remaining HTTP and SOCKS listeners
only
## Summary
This PR:
1. fixes a deserialization mismatch for macOS automation permissions in
approval payloads by making core parsing accept both supported wire
shapes for bundle IDs.
2. added `#[serde(default)]` to `MacOsSeatbeltProfileExtensions` so
omitted fields deserialize to secure defaults.
## Why this change is needed
`MacOsAutomationPermission` uses `#[serde(try_from =
"MacOsAutomationPermissionDe")]`, so deserialization is controlled by
`MacOsAutomationPermissionDe`. After we aligned v2
`additionalPermissions.macos.automations` to the core shape, approval
payloads started including `{ "bundle_ids": [...] }` in some paths.
`MacOsAutomationPermissionDe` previously accepted only `"none" | "all"`
or a plain array, so object-shaped bundle IDs failed with `data did not
match any variant of untagged enum MacOsAutomationPermissionDe`. This
change restores compatibility by accepting both forms while preserving
existing normalization behavior (trim values and map empty bundle lists
to `None`).
## Validation
saw this error went away when running
```
cargo run -p codex-app-server-test-client -- \
--codex-bin ./target/debug/codex \
-c 'approval_policy="on-request"' \
-c 'features.shell_zsh_fork=true' \
-c 'zsh_path="/tmp/codex-zsh-fork/package/vendor/aarch64-apple-darwin/zsh/macos-15/zsh"' \
send-message-v2 --experimental-api \
'Use $apple-notes and run scripts/notes_info now.'
```
:
```
Error: failed to deserialize ServerRequest from JSONRPCRequest
Caused by:
data did not match any variant of untagged enum MacOsAutomationPermissionDe
```
## Summary
This PR removes legacy macOS permission model types from
`codex-rs/protocol/src/models.rs`:
- `MacOsPermissions`
- `MacOsPreferencesValue`
- `MacOsAutomationValue`
The protocol now relies on the current `MacOsSeatbeltProfileExtensions`
model for macOS permission data.
## Summary
- default the resume picker sort key to UpdatedAt instead of CreatedAt
- keep Tab sort toggling behavior and update the test expectation for
the new default
## Testing
- just fmt
- cargo test -p codex-tui
Co-authored-by: Codex <noreply@openai.com>
## Summary
- keep the SQLite schema unchanged (no migrations)
- add timestamps to SQLite-backed `/feedback` log exports
- keep the existing SQL-side byte cap behavior and newline handling
- document the remaining fidelity gap (span prefixes + structured
fields) with TODOs
## Details
- update `query_feedback_logs` to format each exported line as:
- `YYYY-MM-DDTHH:MM:SS.ffffffZ {level} {message}`
- continue scoping rows to requested-thread + same-process threadless
logs
- continue capping in SQL before returning rows
- keep the existing fallback behavior unchanged when SQLite returns no
rows
- update parity tests to normalize away the new timestamp prefix while
we still only store `message`
## Follow-up
- TODO already in code: persist enough span/event metadata in SQLite to
reproduce span prefixes and structured fields in `/feedback` exports
## Testing
- `cargo test -p codex-state`
- `just fmt`
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- render pending steer previews with a single `pending steer:` prefix
instead of repeating it for each source line
- reuse the same truncation path for pending steers and queued drafts so
multiline previews behave consistently
- add snapshot coverage for the multiline pending steer case
Before
<img width="969" height="219" alt="Screenshot 2026-03-05 at 3 55 11 PM"
src="https://github.com/user-attachments/assets/b062c9c8-43d3-4a52-98e0-3c7643d1697b"
/>
After
<img width="965" height="203" alt="Screenshot 2026-03-05 at 3 56 08 PM"
src="https://github.com/user-attachments/assets/40935863-55b3-444f-9e14-1ac63126b2e1"
/>
## Codex author
`codex resume 019cc054-385e-79a3-bb85-ec9499623bd8`
Co-authored-by: Codex <noreply@openai.com>
- Replay thread rollback from the persisted rollout history instead of
truncating in-memory state.\n- Add rollback coverage, including
rollback-behind-compaction snapshot coverage.
## Summary
- group recent work by git repo when available, otherwise by directory
- render recent work as bounded user asks with per-thread cwd context
- exclude hidden files and directories from workspace trees
### Motivation
Today config.toml has three different OTEL knobs under `[otel]`:
- `exporter` controls where OTEL logs go
- `trace_exporter` controls where OTEL traces go
- `metrics_exporter` controls where metrics go
Those often (pretty much always?) serve different purposes.
For example, for OpenAI internal usage, the **log exporter** is already
being used for IT/security telemetry, and that use case is intentionally
content-rich: tool calls, arguments, outputs, MCP payloads, and in some
cases user content are all useful there. `log_user_prompt` is a good
example of that distinction. When it’s enabled, we include raw prompt
text in OTEL logs, which is acceptable for the security use case.
The **trace exporter** is a different story. The goal there is to give
OpenAI engineers visibility into latency and request behavior when they
run Codex locally, without sending sensitive prompt or tool data as
trace event data. In other words, traces should help answer “what was
slow?” or “where did time go?”, not “what did the user say?” or “what
did the tool return?”
The complication is that Rust’s `tracing` crate does not make a hard
distinction between “logs” and “trace events.” It gives us one
instrumentation API for logs and trace events (via `tracing::event!`),
and subscribers decide what gets treated as logs, trace events, or both.
Before this change, our OTEL trace layer was effectively attached to the
general tracing stream, which meant turning on `trace_exporter` could
pick up content-rich events that were originally written with logging
(and the `log_exporter`) in mind. That made it too easy for sensitive
data to end up in exported traces by accident.
### Concrete example
In `otel_manager.rs`, this `tracing::event!` call would be exported in
both logs AND traces (as a trace event).
```
pub fn user_prompt(&self, items: &[UserInput]) {
let prompt = items
.iter()
.flat_map(|item| match item {
UserInput::Text { text, .. } => Some(text.as_str()),
_ => None,
})
.collect::<String>();
let prompt_to_log = if self.metadata.log_user_prompts {
prompt.as_str()
} else {
"[REDACTED]"
};
tracing::event!(
tracing::Level::INFO,
event.name = "codex.user_prompt",
event.timestamp = %timestamp(),
// ...
prompt = %prompt_to_log,
);
}
```
Instead of `tracing::event!`, we should now be using `log_event!` and
`trace_event!` instead to more clearly indicate which sink (logs vs.
traces) that event should be exported to.
### What changed
This PR makes the log and trace export distinct instead of treating them
as two sinks for the same data.
On the provider side, OTEL logs and traces now have separate
routing/filtering policy. The log exporter keeps receiving the existing
`codex_otel` events, while trace export is limited to spans and trace
events.
On the event side, `OtelManager` now emits two flavors of telemetry
where needed:
- a log-only event with the current rich payloads
- a tracing-safe event with summaries only
It also has a convenience `log_and_trace_event!` macro for emitting to
both logs and traces when it's safe to do so, as well as log- and
trace-specific fields.
That means prompts, tool args, tool output, account email, MCP metadata,
and similar content stay in the log lane, while traces get the pieces
that are actually useful for performance work: durations, counts, sizes,
status, token counts, tool origin, and normalized error classes.
This preserves current IT/security logging behavior while making it safe
to turn on trace export for employees.
### Full list of things removed from trace export
- raw user prompt text from `codex.user_prompt`
- raw tool arguments and output from `codex.tool_result`
- MCP server metadata from `codex.tool_result` (mcp_server,
mcp_server_origin)
- account identity fields like `user.email` and `user.account_id` from
trace-safe OTEL events
- `host.name` from trace resources
- generic `codex.tool_decision` events from traces
- generic `codex.sse_event` events from traces
- the full ToolCall debug payload from the `handle_tool_call` span
What traces now keep instead is mostly:
- spans
- trace-safe OTEL events
- counts, lengths, durations, status, token counts, and tool origin
summaries
- Update `models.json` to surface the new model entry.
- Refresh the TUI model picker snapshot to match the updated catalog
ordering.
---------
Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
## Note-- added plugin mentions via @, but that conflicts with file
mentions
depends and builds upon #13433.
- introduces explicit `@plugin` mentions. this injects the plugin's mcp
servers, app names, and skill name format into turn context as a dev
message.
- we do not yet have UI for these mentions, so we currently parse raw
text (as opposed to skills and apps which have UI chips, autocomplete,
etc.) this depends on a `plugins/list` app-server endpoint we can feed
the UI with, which is upcoming
- also annotate mcp and app tool descriptions with the plugin(s) they
come from. this gives the model a first class way of understanding what
tools come from which plugins, which will help implicit invocation.
### Tests
Added and updated tests, unit and integration. Also confirmed locally a
raw `@plugin` injects the dev message, and the model knows about its
apps, mcps, and skills.
## Summary
This updates the `js_repl` prompt and docs to make the image guidance
less confusing.
## What changed
- Clarified that `codex.emitImage(...)` adds one image per call and can
be called multiple times to emit multiple images.
- Reworded the image-encoding guidance to be general `js_repl` advice
instead of `ImageDetailOriginal`-specific behavior.
- Updated the guidance to recommend JPEG at about quality 85 when lossy
compression is acceptable, and PNG when transparency or lossless detail
matters.
- Mirrored the same wording in the public `js_repl` docs.
This improves macOS Seatbelt handling for sandboxed tool processes.
## Changes
- Allow dual-stack local binding in proxy-managed sessions, while still
keeping traffic limited to loopback and configured proxy endpoints.
- Replace the old generic unix-socket path rule with explicit AF_UNIX
permissions for socket creation, bind, and outbound connect.
- Keep explicitly approved wrapper sockets connect-only.
Local helper servers are less likely to fail when binding on macOS.
Tools using local unix-socket IPC should work more reliably under the
sandbox.
Full-network sessions, proxy fail-closed behavior, and proxy lifecycle
are unchanged.
## Summary
- always pass `--unshare-user` in the Linux bubblewrap argv builders
- stop relying on bubblewrap's auto-userns behavior, which is skipped
for `uid 0`
- update argv expectations in tests and document the explicit user
namespace behavior
The installed Codex binary reproduced the same issue with:
- `codex -c features.use_linux_sandbox_bwrap=true sandbox linux -- true`
- `bwrap: Creating new namespace failed: Operation not permitted`
This happens because Codex asked bubblewrap for mount/pid/network
namespaces without explicitly asking for a user namespace. In a
root-inside-container environment without ambient `CAP_SYS_ADMIN`, that
fails. Adding `--unshare-user` makes bubblewrap create the user
namespace first and then the remaining namespaces succeed.
This PR adds a durable trace linkage for each turn by storing the active
trace ID on the rollout TurnContext record stored in session rollout
files.
Before this change, we propagated trace context at runtime but didn’t
persist a stable per-turn trace key in rollout history. That made
after-the-fact debugging harder (for example, mapping a historical turn
to the corresponding trace in datadog). This sets us up for much easier
debugging in the future.
### What changed
- Added an optional `trace_id` to TurnContextItem (rollout schema).
- Added a small OTEL helper to read the current span trace ID.
- Captured `trace_id` when creating `TurnContext` and included it in
`to_turn_context_item()`.
- Updated tests and fixtures that construct TurnContextItem so
older/no-trace cases still work.
### Why this approach
TurnContext is already the canonical durable per-turn metadata in
rollout. This keeps ownership clean: trace linkage lives with other
persisted turn metadata.
### Motivation
- Prevent untrusted js_repl code from supplying arbitrary external URLs
that the host would forward into model input and cause external fetches
/ data exfiltration. This change narrows the emitImage contract to safe,
self-contained data URLs.
### Description
- Kernel: added `normalizeEmitImageUrl` and enforce that string-valued
`codex.emitImage(...)` inputs and `input_image`/content-item paths only
accept non-empty `data:` URLs; byte-based paths still produce data URLs
as before (`kernel.js`).
- Host: added `validate_emitted_image_url` and check `EmitImage`
requests before creating `FunctionCallOutputContentItem::InputImage`,
returning an error to the kernel if the URL is not a `data:` URL
(`mod.rs`).
- Tests/docs: added a runtime test
`js_repl_emit_image_rejects_non_data_url` to assert rejection of
non-data URLs and updated user-facing docs/instruction text to state
`data URL` support instead of generic direct image URLs (`mod.rs`,
`docs/js_repl.md`, `project_doc.rs`).
### Testing
- Ran `just fmt` in `codex-rs`; it completed successfully.
- Added a runtime test (`cargo test -p codex-core
js_repl_emit_image_rejects_non_data_url`) but executing the test in this
environment failed due to a missing system dependency required by
`codex-linux-sandbox` (the vendored `bubblewrap` build requires
`libcap.pc` via `pkg-config`), so the test could not be run here.
- Attempted a focused `cargo test` invocation with and without default
features; both compile/test attempts were blocked by the same missing
system `libcap` dependency in this environment.
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69a7837bce98832d91db92d5f76d6cbe)
## Summary
This changes the Unix shell escalation path for skill-matched
executables to apply a skill's `PermissionProfile` as additive
permissions on top of the existing turn/request sandbox policy.
Previously, skill-matched executables compiled the skill permission
profile into a standalone sandbox policy and executed against that
replacement policy. Now they go through the same
`additional_permissions` merge path used elsewhere in shell sandbox
preparation.
## What Changed
- Changed `skill_escalation_execution()` to return
`EscalationPermissions::PermissionProfile(...)` for non-empty skill
permission profiles.
- Kept empty or missing skill permission profiles on the `TurnDefault`
path.
- Added tests covering the new additive skill-permission behavior.
- Added inline comments in `prepare_escalated_exec()` clarifying the
difference between additive permission merging and fully specified
replacement sandbox policies.
- Removed the now-unused skill permission compiler module after
switching this path away from standalone compiled skill sandbox
policies.
## Testing
- Ran `just fmt` in `codex-rs`
- Ran `cargo test -p codex-core`
`cargo test -p codex-core` still hits an unrelated existing failure:
`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`
## Follow-up
This change intentionally does not merge skill-specific macOS seatbelt
profile extensions through the `additional_permissions` path yet.
Filesystem and network permissions now follow the additive merge path,
but seatbelt extension permissions still need separate handling in a
follow-up PR.
## Summary
- Change `js_repl` failed-cell persistence so later cells keep prior
bindings plus only the current-cell bindings whose initialization
definitely completed before the throw.
- Preserve initialized lexical bindings across failed cells via
module-namespace readability, including top-level destructuring that
partially succeeds before a later throw.
- Preserve hoisted `var` and `function` bindings only when execution
clearly reached their declaration site, and preserve direct top-level
pre-declaration `var` writes and updates through explicit write-site
markers.
- Preserve top-level `for...in` / `for...of` `var` bindings when the
loop body executes at least once, using a first-iteration guard to avoid
per-iteration bookkeeping overhead.
- Keep prior module state intact across link-time failures and
evaluation failures before the prelude runs, while still allowing failed
cells that already recreated prior bindings to persist updates to those
existing bindings.
- Hide internal commit hooks from user `js_repl` code after the prelude
aliases them, so snippets cannot spoof committed bindings by calling the
raw `import.meta` hooks directly.
- Add focused regression coverage for the supported failed-cell
behaviors and the intentionally unsupported boundaries.
- Update `js_repl` docs and generated instructions to describe the new,
narrower failed-cell persistence model.
## Motivation
We saw `js_repl` drop bindings that had already been initialized
successfully when a later statement in the same cell threw, for example:
const { context: liveContext, session } =
await initializeGoogleSheetsLiveForTab(tab);
// later statement throws
That was surprising in practice because successful earlier work
disappeared from the next cell.
This change makes failed-cell persistence more useful without trying to
model every possible partially executed JavaScript edge case. The
resulting behavior is narrower and easier to reason about:
- prior bindings are always preserved
- lexical bindings persist when their initialization completed before
the throw
- hoisted `var` / `function` bindings persist only when execution
clearly reached their declaration or a supported top-level `var` write
site
- failed cells that already recreated prior bindings can persist writes
to those existing bindings even if they introduce no new bindings
The detailed edge-case matrix stays in `docs/js_repl.md`. The
model-facing `project_doc` guidance is intentionally shorter and focused
on generation-relevant behavior.
## Supported Failed-Cell Behavior
- Prior bindings remain available after a failed cell.
- Initialized lexical bindings remain available after a failed cell.
- Top-level destructuring like `const { a, b } = ...` preserves names
whose initialization completed before a later throw.
- Hoisted `function` bindings persist when execution reached the
declaration statement before the throw.
- Direct top-level pre-declaration `var` writes and updates persist, for
example:
- `x = 1`
- `x += 1`
- `x++`
- short-circuiting logical assignments only persist when the write
branch actually runs
- Non-empty top-level `for...in` / `for...of` `var` loops persist their
loop bindings.
- Failed cells can persist updates to existing carried bindings after
the prelude has run, even when the cell commits no new bindings.
- Link failures and eval failures before the prelude do not poison
`@prev`.
## Intentionally Unsupported Failed-Cell Cases
- Hoisted function reads before the declaration, such as `foo(); ...;
function foo() {}`
- Aliasing or inference-based recovery from reads before declaration
- Nested writes inside already-instrumented assignment RHS expressions
- Destructuring-assignment recovery for hoisted `var`
- Partial `var` destructuring recovery
- Pre-declaration `undefined` reads for hoisted `var`
- Empty top-level `for...in` / `for...of` loop vars
- Nested or scope-sensitive pre-declaration `var` writes outside direct
top-level expression statements
This adds a first-class server request for MCP server elicitations:
`mcpServer/elicitation/request`.
Until now, MCP elicitation requests only showed up as a raw
`codex/event/elicitation_request` event from core. That made it hard for
v2 clients to handle elicitations using the same request/response flow
as other server-driven interactions (like shell and `apply_patch`
tools).
This also updates the underlying MCP elicitation request handling in
core to pass through the full MCP request (including URL and form data)
so we can expose it properly in app-server.
### Why not `item/mcpToolCall/elicitationRequest`?
This is because MCP elicitations are related to MCP servers first, and
only optionally to a specific MCP tool call.
In the MCP protocol, elicitation is a server-to-client capability: the
server sends `elicitation/create`, and the client replies with an
elicitation result. RMCP models it that way as well.
In practice an elicitation is often triggered by an MCP tool call, but
not always.
### What changed
- add `mcpServer/elicitation/request` to the v2 app-server API
- translate core `codex/event/elicitation_request` events into the new
v2 server request
- map client responses back into `Op::ResolveElicitation` so the MCP
server can continue
- update app-server docs and generated protocol schema
- add an end-to-end app-server test that covers the full round trip
through a real RMCP elicitation flow
- The new test exercises a realistic case where an MCP tool call
triggers an elicitation, the app-server emits
mcpServer/elicitation/request, the client accepts it, and the tool call
resumes and completes successfully.
### app-server API flow
- Client starts a thread with `thread/start`.
- Client starts a turn with `turn/start`.
- App-server sends `item/started` for the `mcpToolCall`.
- While that tool call is in progress, app-server sends
`mcpServer/elicitation/request`.
- Client responds to that request with `{ action: "accept" | "decline" |
"cancel" }`.
- App-server sends `serverRequest/resolved`.
- App-server sends `item/completed` for the mcpToolCall.
- App-server sends `turn/completed`.
- If the turn is interrupted while the elicitation is pending,
app-server still sends `serverRequest/resolved` before the turn
finishes.
## Why
`shell_zsh_fork` already provides stronger guarantees around which
executables receive elevated permissions. To reuse that machinery from
unified exec without pushing Unix-specific escalation details through
generic runtime code, the escalation bootstrap and session lifetime
handling need a cleaner boundary.
That boundary also needs to be safe for long-lived sessions: when an
intercepted shell session is closed or pruned, any in-flight approval
workers and any already-approved escalated child they spawned must be
torn down with the session, and the inherited escalation socket must not
leak into unrelated subprocesses.
## What Changed
- Extracted a reusable `EscalationSession` and
`EscalateServer::start_session(...)` in `shell-escalation` so callers
can get the wrapper/socket env overlay and keep the escalation server
alive without immediately running a one-shot command.
- Documented that `EscalationSession::env()` and
`ShellCommandExecutor::run(...)` exchange only that env overlay, which
callers must merge into their own base shell environment.
- Clarified the prepared-exec helper boundary in `core` by naming the
new helper APIs around `ExecRequest`, while keeping the legacy
`execute_env(...)` entrypoints as thin compatibility wrappers for
existing callers that still use the older naming.
- Added a small post-spawn hook on the prepared execution path so the
parent copy of the inheritable escalation socket is closed immediately
after both the existing one-shot shell-command spawn and the
unified-exec spawn.
- Made session teardown explicit with session-scoped cancellation:
dropping an `EscalationSession` or canceling its parent request now
stops intercept workers, and the server-spawned escalated child uses
`kill_on_drop(true)` so teardown cannot orphan an already-approved
child.
- Added `UnifiedExecBackendConfig` plumbing through `ToolsConfig`, a
`shell::zsh_fork_backend` facade, and an opaque unified-exec
spawn-lifecycle hook so unified exec can prepare a wrapped `zsh -c/-lc`
request without storing `EscalationSession` directly in generic
process/runtime code.
- Kept the existing `shell_command` zsh-fork behavior intact on top of
the new bootstrap path. Tool selection is unchanged in this PR: when
`shell_zsh_fork` is enabled, `ShellCommand` still wins over
`exec_command`.
## Verification
- `cargo test -p codex-shell-escalation`
- includes coverage for `start_session_exposes_wrapper_env_overlay`
- includes coverage for `exec_closes_parent_socket_after_shell_spawn`
- includes coverage for
`dropping_session_aborts_intercept_workers_and_kills_spawned_child`
- `cargo test -p codex-core
shell_zsh_fork_prefers_shell_command_over_unified_exec`
- `cargo test -p codex-core --test all
shell_zsh_fork_prompts_for_skill_script_execution`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13392).
* #13432
* __->__ #13392
- add a speed row to the startup/session header under the model row
- render the speed row with the same styling pattern as the model row,
using /fast to change
- show only Fast or Standard to users and update the affected snapshots
---------
Co-authored-by: Codex <noreply@openai.com>
add `web_search_tool_type` on model_info that can be populated from
backend. will be used to filter which models can use `web_search` with
images and which cant.
added small unit test.
- lower `submission_dispatch` span logging to debug for realtime audio
submissions only
- keep other submission spans at info and add a targeted test for the
level selection
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add `js_repl` support for dynamic imports of relative and absolute
local ESM `.js` / `.mjs` files
- keep bare package imports on the native Node path and resolved from
REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then `cwd`),
even when they originate from imported local files
- restrict static imports inside imported local files to other local
relative/absolute `.js` / `.mjs` files, and surface a clear error for
unsupported top-level static imports in the REPL cell
- run imported local files inside the REPL VM context so they can access
`codex.tmpDir`, `codex.tool`, captured `console`, and Node-like
`import.meta` helpers
- reload local files between execs so later `await import("./file.js")`
calls pick up edits and fixed failures, while preserving package/builtin
caching and persistent top-level REPL bindings
- make `import.meta.resolve()` self-consistent by allowing the returned
`file://...` URLs to round-trip through `await import(...)`
- update both public and injected `js_repl` docs to clarify the narrowed
contract, including global bare-import resolution behavior for local
absolute files
## Testing
- `cargo test -p codex-core js_repl_`
- built codex binary and verified behavior
---------
Co-authored-by: Codex <noreply@openai.com>
- rotate the paid-plan startup promo slot 50/50 between the existing
Codex App promo and a new Fast mode promo
- keep the Fast mode call to action platform-neutral so Windows can show
the same tip
- add a focused unit test to ensure the paid promo pool actually rotates
---------
Co-authored-by: Codex <noreply@openai.com>
### first half of changes, followed by #13510
Track plugin capabilities as derived summaries on `PluginLoadOutcome`
for enabled plugins with at least one skill/app/mcp.
Also add `Plugins` section to `user_instructions` injected on session
start. These introduce the plugins concept and list enabled plugins, but
do NOT currently include paths to enabled plugins or details on what
apps/mcps the plugins contain (current plan is to inject this on
@-mention). that can be adjusted in a follow up and based on evals.
### tests
Added/updated tests, confirmed locally that new `Plugins` section +
currently enabled plugins show up in `user_instructions`.
## Summary
- ensure `thread.resume` reuses the stored `gitInfo` instead of
rebuilding it from the live working tree
- persist and apply thread git metadata through the resume flow and add
a regression test covering branch mismatch cases
## Testing
- Not run (not requested)
Bumps [serde_with](https://github.com/jonasbb/serde_with) from 3.16.1 to
3.17.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jonasbb/serde_with/releases">serde_with's
releases</a>.</em></p>
<blockquote>
<h2>serde_with v3.17.0</h2>
<h3>Added</h3>
<ul>
<li>Support <code>OneOrMany</code> with <code>smallvec</code> v1 (<a
href="https://redirect.github.com/jonasbb/serde_with/issues/920">#920</a>,
<a
href="https://redirect.github.com/jonasbb/serde_with/issues/922">#922</a>)</li>
</ul>
<h3>Changed</h3>
<ul>
<li>Switch to <code>yaml_serde</code> for a maintained yaml dependency
by <a href="https://github.com/kazan417"><code>@kazan417</code></a> (<a
href="https://redirect.github.com/jonasbb/serde_with/issues/921">#921</a>)</li>
<li>Bump MSRV to 1.82, since that is required for
<code>yaml_serde</code> dev-dependency.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4031878a4c"><code>4031878</code></a>
Bump version to v3.17.0 (<a
href="https://redirect.github.com/jonasbb/serde_with/issues/924">#924</a>)</li>
<li><a
href="204ae56f8b"><code>204ae56</code></a>
Bump version to v3.17.0</li>
<li><a
href="7812b5a006"><code>7812b5a</code></a>
serde_yaml 0.9 to yaml_serde 0.10 (<a
href="https://redirect.github.com/jonasbb/serde_with/issues/921">#921</a>)</li>
<li><a
href="614bd8950b"><code>614bd89</code></a>
Bump MSRV to 1.82 as required by yaml_serde</li>
<li><a
href="518d0ed787"><code>518d0ed</code></a>
Suppress RUSTSEC-2026-0009 since we don't have untrusted time input in
tests ...</li>
<li><a
href="a6579a8984"><code>a6579a8</code></a>
Suppress RUSTSEC-2026-0009 since we don't have untrusted time input in
tests</li>
<li><a
href="9d4d0696e6"><code>9d4d069</code></a>
Implement OneOrMany for smallvec_1::SmallVec (<a
href="https://redirect.github.com/jonasbb/serde_with/issues/922">#922</a>)</li>
<li><a
href="fc78243e8c"><code>fc78243</code></a>
Add changelog</li>
<li><a
href="2b8c30bf67"><code>2b8c30b</code></a>
Implement OneOrMany for smallvec_1::SmallVec</li>
<li><a
href="2d9b9a1815"><code>2d9b9a1</code></a>
Carg.lock update</li>
<li>Additional commits viewable in <a
href="https://github.com/jonasbb/serde_with/compare/v3.16.1...v3.17.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Bumps [strum_macros](https://github.com/Peternator7/strum) from 0.27.2
to 0.28.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum_macros's
changelog</a>.</em></p>
<blockquote>
<h2>0.28.0</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/461">#461</a>:
Allow any kind of passthrough attributes on
<code>EnumDiscriminants</code>.</p>
<ul>
<li>Previously only list-style attributes (e.g.
<code>#[strum_discriminants(derive(...))]</code>) were supported. Now
path-only
(e.g. <code>#[strum_discriminants(non_exhaustive)]</code>) and
name/value (e.g. <code>#[strum_discriminants(doc =
"foo")]</code>)
attributes are also supported.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/462">#462</a>:
Add missing <code>#[automatically_derived]</code> to generated impls not
covered by <a
href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/466">#466</a>:
Bump MSRV to 1.71, required to keep up with updated <code>syn</code> and
<code>windows-sys</code> dependencies. This is a breaking change if
you're on an old version of rust.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/469">#469</a>:
Use absolute paths in generated proc macro code to avoid
potential name conflicts.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/465">#465</a>:
Upgrade <code>phf</code> dependency to v0.13.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/473">#473</a>:
Fix <code>cargo fmt</code> / <code>clippy</code> issues and add GitHub
Actions CI.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/477">#477</a>:
<code>strum::ParseError</code> now implements
<code>core::fmt::Display</code> instead
<code>std::fmt::Display</code> to make it <code>#[no_std]</code>
compatible. Note the <code>Error</code> trait wasn't available in core
until <code>1.81</code>
so <code>strum::ParseError</code> still only implements that in std.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/476">#476</a>:
<strong>Breaking Change</strong> - <code>EnumString</code> now
implements <code>From<&str></code>
(infallible) instead of <code>TryFrom<&str></code> when the
enum has a <code>#[strum(default)]</code> variant. This more accurately
reflects that parsing cannot fail in that case. If you need the old
<code>TryFrom</code> behavior, you can opt back in using
<code>parse_error_ty</code> and <code>parse_error_fn</code>:</p>
<pre lang="rust"><code>#[derive(EnumString)]
#[strum(parse_error_ty = strum::ParseError, parse_error_fn =
make_error)]
pub enum Color {
Red,
#[strum(default)]
Other(String),
}
<p>fn make_error(x: &str) -> strum::ParseError {
strum::ParseError::VariantNotFound
}
</code></pre></p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/431">#431</a>:
Fix bug where <code>EnumString</code> ignored the
<code>parse_err_ty</code>
attribute when the enum had a <code>#[strum(default)]</code>
variant.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/474">#474</a>:
EnumDiscriminants will now copy <code>default</code> over from the
original enum to the Discriminant enum.</p>
<pre lang="rust"><code>#[derive(Debug, Default, EnumDiscriminants)]
#[strum_discriminants(derive(Default))] // <- Remove this in 0.28.
enum MyEnum {
#[default] // <- Will be the #[default] on the MyEnumDiscriminant
#[strum_discriminants(default)] // <- Remove this in 0.28
Variant0,
Variant1 { a: NonDefault },
}
</code></pre>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7376771128"><code>7376771</code></a>
Peternator7/0.28 (<a
href="https://redirect.github.com/Peternator7/strum/issues/475">#475</a>)</li>
<li><a
href="26e63cd964"><code>26e63cd</code></a>
Display exists in core (<a
href="https://redirect.github.com/Peternator7/strum/issues/477">#477</a>)</li>
<li><a
href="9334c728ee"><code>9334c72</code></a>
Make TryFrom and FromStr infallible if there's a default (<a
href="https://redirect.github.com/Peternator7/strum/issues/476">#476</a>)</li>
<li><a
href="0ccbbf823c"><code>0ccbbf8</code></a>
Honor parse_err_ty attribute when the enum has a default variant (<a
href="https://redirect.github.com/Peternator7/strum/issues/431">#431</a>)</li>
<li><a
href="2c9e5a9259"><code>2c9e5a9</code></a>
Automatically add Default implementation to EnumDiscriminant if it
exists on ...</li>
<li><a
href="e241243e48"><code>e241243</code></a>
Fix existing cargo fmt + clippy issues and add GH actions (<a
href="https://redirect.github.com/Peternator7/strum/issues/473">#473</a>)</li>
<li><a
href="639b67fefd"><code>639b67f</code></a>
feat: allow any kind of passthrough attributes on
<code>EnumDiscriminants</code> (<a
href="https://redirect.github.com/Peternator7/strum/issues/461">#461</a>)</li>
<li><a
href="0ea1e2d0fd"><code>0ea1e2d</code></a>
docs: Fix typo (<a
href="https://redirect.github.com/Peternator7/strum/issues/463">#463</a>)</li>
<li><a
href="36c051b910"><code>36c051b</code></a>
Upgrade <code>phf</code> to v0.13 (<a
href="https://redirect.github.com/Peternator7/strum/issues/465">#465</a>)</li>
<li><a
href="9328b38617"><code>9328b38</code></a>
Use absolute paths in proc macro (<a
href="https://redirect.github.com/Peternator7/strum/issues/469">#469</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Bumps
[actions/download-artifact](https://github.com/actions/download-artifact)
from 7 to 8.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/download-artifact/releases">actions/download-artifact's
releases</a>.</em></p>
<blockquote>
<h2>v8.0.0</h2>
<h2>v8 - What's new</h2>
<h3>Direct downloads</h3>
<p>To support direct uploads in <code>actions/upload-artifact</code>,
the action will no longer attempt to unzip all downloaded files.
Instead, the action checks the <code>Content-Type</code> header ahead of
unzipping and skips non-zipped files. Callers wishing to download a
zipped file as-is can also set the new <code>skip-decompress</code>
parameter to <code>false</code>.</p>
<h3>Enforced checks (breaking)</h3>
<p>A previous release introduced digest checks on the download. If a
download hash didn't match the expected hash from the server, the action
would log a warning. Callers can now configure the behavior on mismatch
with the <code>digest-mismatch</code> parameter. To be secure by
default, we are now defaulting the behavior to <code>error</code> which
will fail the workflow run.</p>
<h3>ESM</h3>
<p>To support new versions of the @actions/* packages, we've upgraded
the package to ESM.</p>
<h2>What's Changed</h2>
<ul>
<li>Don't attempt to un-zip non-zipped downloads by <a
href="https://github.com/danwkennedy"><code>@danwkennedy</code></a> in
<a
href="https://redirect.github.com/actions/download-artifact/pull/460">actions/download-artifact#460</a></li>
<li>Add a setting to specify what to do on hash mismatch and default it
to <code>error</code> by <a
href="https://github.com/danwkennedy"><code>@danwkennedy</code></a> in
<a
href="https://redirect.github.com/actions/download-artifact/pull/461">actions/download-artifact#461</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/download-artifact/compare/v7...v8.0.0">https://github.com/actions/download-artifact/compare/v7...v8.0.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="70fc10c6e5"><code>70fc10c</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/download-artifact/issues/461">#461</a>
from actions/danwkennedy/digest-mismatch-behavior</li>
<li><a
href="f258da9a50"><code>f258da9</code></a>
Add change docs</li>
<li><a
href="ccc058e5fb"><code>ccc058e</code></a>
Fix linting issues</li>
<li><a
href="bd7976ba57"><code>bd7976b</code></a>
Add a setting to specify what to do on hash mismatch and default it to
<code>error</code></li>
<li><a
href="ac21fcf45e"><code>ac21fcf</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/download-artifact/issues/460">#460</a>
from actions/danwkennedy/download-no-unzip</li>
<li><a
href="15999bff51"><code>15999bf</code></a>
Add note about package bumps</li>
<li><a
href="974686ed50"><code>974686e</code></a>
Bump the version to <code>v8</code> and add release notes</li>
<li><a
href="fbe48b1d27"><code>fbe48b1</code></a>
Update test names to make it clearer what they do</li>
<li><a
href="96bf374a61"><code>96bf374</code></a>
One more test fix</li>
<li><a
href="b8c4819ef5"><code>b8c4819</code></a>
Fix skip decompress test</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/download-artifact/compare/v7...v8">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Improve observability of realtime conversation event handling by logging
non-audio events with payload details in the event loop, while skipping
audio-out events to reduce noise.
Support marketplace.json that points to a local file, with
```
"source":
{
"source": "local",
"path": "./plugin-1"
},
```
Add a new plugin/install endpoint which add the plugin to the cache folder and enable it in config.toml.
Addresses #13478
Summary
- Add two new scopes for `tui.notifications` config: `plan-mode-prompt`
and `user-input-requested`.
- Add Plan Mode prompt and user-input-requested notifications to the TUI
so these events surface consistently outside of plan mode
- Add helpers and tests to ensure the new notification types publish the
right titles, summaries, and type tags for filtering
- Add prioritization mechanism to fix an existing bug where one
notification event could arbitrarily overwrite others
Testing
- Manually tested plan mode to ensure that notification appeared
### Overview
This PR:
- Updates `app-server-test-client` to load OTEL settings from
`$CODEX_HOME/config.toml` and initializes its own OTEL provider.
- Add real client root spans to app-server test client traces.
This updates `codex-app-server-test-client` so its Datadog traces
reflect the full client-driven flow instead of a set of server spans
stitched together under a synthetic parent.
Before this change, the test client generated a fake `traceparent` once
and reused it for every JSON-RPC request. That kept the requests in one
trace, but there was no real client span at the top, so Datadog ended up
showing the sequence in a slightly misleading way, where all RPCs were
anchored under `initialize`.
Now the test client:
- loads OTEL settings from the normal Codex config path, including
`$CODEX_HOME/config.toml` and existing --config overrides
- initializes tracing the same way other Codex binaries do when trace
export is enabled
- creates a real client root span for each scripted command
- creates per-request client spans for JSON-RPC methods like
`initialize`, `thread/start`, and `turn/start`
- injects W3C trace context from the current client span into
request.trace instead of reusing a fabricated carrier
This gives us a cleaner trace shape in Datadog:
- one trace URL for the whole scripted flow
- a visible client root span
- proper client/server parent-child relationships for each app-server
request
## Problem
The `ansi`, `base16`, and `base16-256` syntax themes are designed to
emit ANSI palette colors so that highlighted code respects the user's
terminal color scheme. Syntect encodes this intent in the alpha channel
of its `Color` struct — a convention shared with `bat` — but
`convert_style` was ignoring it entirely, treating every foreground
color as raw RGB. This caused ANSI-family themes to produce hard-coded
RGB values (e.g. `Rgb(0x02, 0, 0)` instead of `Green`), defeating their
purpose and rendering them as near-invisible dark colors on most
terminals.
Reported in #12890.
## Mental model
Syntect themes use a compact encoding in their `Color` struct:
| `alpha` | Meaning of `r` | Mapped to |
|---------|----------------|-----------|
| `0x00` | ANSI palette index (0–255) | `RtColor::Black`…`Gray` for 0–7,
`Indexed(n)` for 8–255 |
| `0x01` | Unused (sentinel) | `None` — inherit terminal default fg/bg |
| `0xFF` | True RGB red channel | `RtColor::Rgb(r, g, b)` |
| other | Unexpected | `RtColor::Rgb(r, g, b)` (silent fallback) |
This encoding is a bat convention that three bundled themes rely on. The
new `convert_syntect_color` function decodes it; `ansi_palette_color`
maps indices 0–7 to ratatui's named ANSI variants.
| macOS - Dark | macOS - Light | Windows - ansi | Windows - base16 |
|---|---|---|---|
| <img width="1064" height="1205" alt="macos-dark"
src="https://github.com/user-attachments/assets/f03d92fb-b44b-4939-b2b9-503fde133811"
/> | <img width="1073" height="1227" alt="macos-light"
src="https://github.com/user-attachments/assets/2ecb2089-73b5-4676-bed8-e4e6794250b4"
/> |

|

|
## Non-goals
- Background color decoding — we intentionally skip backgrounds to
preserve the terminal's own background. The decoder supports it, but
`convert_style` does not apply it.
- Italic/underline changes — those remain suppressed as before.
- Custom `.tmTheme` support for ANSI encoding — only the bundled themes
use this convention.
## Tradeoffs
- The alpha-channel encoding is an undocumented bat/syntect convention,
not a formal spec. We match bat's behavior exactly, trading formality
for ecosystem compatibility.
- Indices 0–7 are mapped to ratatui's named variants (`Black`, `Red`, …,
`Gray`) rather than `Indexed(0)`…`Indexed(7)`. This lets terminals apply
bold/bright semantics to named colors, which is the expected behavior
for ANSI themes, but means the two representations are not perfectly
round-trippable.
## Architecture
All changes are in `codex-rs/tui/src/render/highlight.rs`, within the
style-conversion layer between syntect and ratatui:
```
syntect::highlighting::Color
└─ convert_syntect_color(color) [NEW — alpha-dispatch]
├─ a=0x00 → ansi_palette_color() [NEW — index→named/indexed]
├─ a=0x01 → None (terminal default)
├─ a=0xFF → Rgb(r,g,b) (standard opaque path)
└─ other → Rgb(r,g,b) (silent fallback)
```
`convert_style` delegates foreground mapping to `convert_syntect_color`
instead of inlining the `Rgb(r,g,b)` conversion. The core highlighter is
refactored into `highlight_to_line_spans_with_theme` (accepts an
explicit theme reference) so tests can highlight against specific themes
without mutating process-global state.
### ANSI-family theme contract
The ANSI-family themes (`ansi`, `base16`, `base16-256`) rely on upstream
alpha-channel encoding from two_face/syntect. We intentionally do
**not** validate this contract at runtime — if the upstream format
changes, the `ansi_themes_use_only_ansi_palette_colors` test catches it
at build time, long before it reaches users. A runtime warning would be
unactionable noise.
### Warning copy cleanup
User-facing warning messages were rewritten for clarity:
- Removed internal jargon ("alpha-encoded ANSI color markers", "RGB
fallback semantics", "persisted override config")
- Dropped "syntax" prefix from "syntax theme" — users just think "theme"
- Downgraded developer-only diagnostics (duplicate override, resolve
fallback) from `warn` to `debug`
## Observability
- The `ansi_themes_use_only_ansi_palette_colors` test enforces the
ANSI-family contract at build time.
- The snapshot test provides a regression tripwire for palette color
output.
- User-facing warnings are limited to actionable issues: unknown theme
names and invalid custom `.tmTheme` files.
## Tests
- **Unit tests for each alpha branch:** `alpha=0x00` with low index
(named color), `alpha=0x00` with high index (`Indexed`), `alpha=0x01`
(terminal default), unexpected alpha (falls back to RGB), ANSI white →
Gray mapping.
- **Integration test:**
`ansi_family_themes_use_terminal_palette_colors_not_rgb` — highlights a
Rust snippet with each ANSI-family theme and asserts zero `Rgb`
foreground colors appear.
- **Snapshot test:** `ansi_family_foreground_palette_snapshot` — records
the exact set of unique foreground colors each ANSI-family theme
produces, guarding against regressions.
- **Warning validation tests:** verify user-facing warnings for missing
custom themes, invalid `.tmTheme` files, and bundled theme resolution.
## Test plan
- [ ] `cargo test -p codex-tui` passes all new and existing tests
- [ ] Select `ansi`, `base16`, or `base16-256` theme and verify code
blocks render with terminal palette colors (not near-black RGB)
- [ ] Select a standard RGB theme (e.g. `dracula`) and verify no
regression in color output
## Summary
- update the /fast slash command description to mention fastest
inference
- mention the 3X plan usage tradeoff in the help copy
## Testing
- cargo test -p codex-tui slash_command (currently blocked by an
unrelated latest-main codex-tui compile error in chatwidget.rs:
refresh_queued_user_messages missing)
---------
Co-authored-by: Codex <noreply@openai.com>
This is PR 3 of the app-server tracing rollout.
PRs https://github.com/openai/codex/pull/13285 and
https://github.com/openai/codex/pull/13368 gave us inbound request spans
in app-server and propagated trace context through Submission. This
change finishes the next piece in core: when a request actually starts a
turn, we now create a core-owned long-lived span that stays open for the
real lifetime of the turn.
What changed:
- `Session::spawn_task` can now optionally create a long-lived turn span
and run the spawned task inside it
- `turn/start` uses that path, so normal turn execution stays under a
single core-owned span after the async handoff
- `review/start` uses the same pattern
- added a unit test that verifies the spawned turn task inherits the
submission dispatch trace ancestry
**Why**
The app-server request span is intentionally short-lived. Once work
crosses into core, we still want one span that covers the actual
execution window until completion or interruption. This keeps that
ownership where it belongs: in the layer that owns the runtime
lifecycle.
The electron app doesn't start up the app-server in a particular
workspace directory.
So sandbox setup happens in the app-installed directory instead of the
project workspace.
This allows the app do specify the workspace cwd so that the sandbox
setup actually sets up the ACLs instead of exiting fast and then having
the first shell command be slow.
Validated login + refresh flows. Removing scopes from the refresh
request until we have upgrade flow in place. Confirmed that tokens
refresh with existing scopes.
Follow-up to [#13388](https://github.com/openai/codex/pull/13388). This
uses the same general fix pattern as
[#12421](https://github.com/openai/codex/pull/12421), but in the
`codex-core` compact/resume/fork path.
## Why
`compact_resume_after_second_compaction_preserves_history` started
overflowing the stack on Windows CI after `#13388`.
The important part is that this was not a compaction-recursion bug. The
test exercises a path with several thin `async fn` wrappers around much
larger thread-spawn, resume, and fork futures. When one `async fn`
awaits another inline, the outer future stores the callee future as part
of its own state machine. In a long wrapper chain, that means a caller
can accidentally inline a lot more state than the source code suggests.
That is exactly what was happening here:
- `ThreadManager` convenience methods such as `start_thread`,
`resume_thread_from_rollout`, and `fork_thread` were inlining the larger
spawn/resume futures beneath them.
- `core_test_support::test_codex` added another wrapper layer on top of
those same paths.
- `compact_resume_fork` adds a few more helpers, and this particular
test drives the resume/fork path multiple times.
On Windows, that was enough to push both the libtest thread and Tokio
worker threads over the edge. The previous 8 MiB test-thread workaround
proved the failure was stack-related, but it did not address the
underlying future size.
## How This Was Debugged
The useful debugging pattern here was to turn the CI-only failure into a
local low-stack repro.
1. First, remove the explicit large-stack harness so the test runs on
the normal `#[tokio::test]` path.
2. Build the test binary normally.
3. Re-run the already-built `tests/all` binary directly with
progressively smaller `RUST_MIN_STACK` values.
Running the built binary directly matters: it keeps the reduced stack
size focused on the test process instead of also applying it to `cargo`
and `rustc`.
That made it possible to answer two questions quickly:
- Does the failure still reproduce without the workaround? Yes.
- Does boxing the wrapper futures actually buy back stack headroom? Also
yes.
After this change, the built test binary passes with
`RUST_MIN_STACK=917504` and still overflows at `786432`, which is enough
evidence to justify removing the explicit 8 MiB override while keeping a
deterministic low-stack repro for future debugging.
If we hit a similar issue again, the first places to inspect are thin
`async fn` wrappers that mostly forward into a much larger async
implementation.
## `Box::pin()` Primer
`async fn` compiles into a state machine. If a wrapper does this:
```rust
async fn wrapper() {
inner().await;
}
```
then `wrapper()` stores the full `inner()` future inline as part of its
own state.
If the wrapper instead does this:
```rust
async fn wrapper() {
Box::pin(inner()).await;
}
```
then the child future lives on the heap, and the outer future only
stores a pinned pointer to it. That usually trades one allocation for a
substantially smaller outer future, which is exactly the tradeoff we
want when the problem is stack pressure rather than raw CPU time.
Useful references:
-
[`Box::pin`](https://doc.rust-lang.org/std/boxed/struct.Box.html#method.pin)
- [Async book:
Pinning](https://rust-lang.github.io/async-book/04_pinning/01_chapter.html)
## What Changed
- Boxed the wrapper futures in `core/src/thread_manager.rs` around
`start_thread`, `resume_thread_from_rollout`, `fork_thread`, and the
corresponding `ThreadManagerState` spawn helpers so callers no longer
inline the full spawn/resume state machine through multiple layers.
- Boxed the matching test-only wrapper futures in
`core/tests/common/test_codex.rs` and
`core/tests/suite/compact_resume_fork.rs`, which sit directly on top of
the same path.
- Restored `compact_resume_after_second_compaction_preserves_history` in
`core/tests/suite/compact_resume_fork.rs` to a normal `#[tokio::test]`
and removed the explicit `TEST_STACK_SIZE_BYTES` thread/runtime sizing.
- Simplified a tiny helper in `compact_resume_fork` by making
`fetch_conversation_path()` synchronous, which removes one more
unnecessary future layer from the test path.
## Verification
- `cargo test -p codex-core --test all
suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
-- --exact --nocapture`
- `cargo test -p codex-core --test all suite::compact_resume_fork --
--nocapture`
- Re-ran the built `codex-core` `tests/all` binary directly with reduced
stack sizes:
- `RUST_MIN_STACK=917504` passes
- `RUST_MIN_STACK=786432` still overflows
- `cargo test -p codex-core`
- Still fails locally in unrelated existing integration areas that
expect the `codex` / `test_stdio_server` binaries or hit the existing
`search_tool` wiremock mismatches.
## Summary
Changes the permission profile shape from a bare network boolean to a
nested object.
Before:
```yaml
permissions:
network: true
```
After:
```yaml
permissions:
network:
enabled: true
```
This also updates the shared Rust and app-server protocol types so
`PermissionProfile.network` is no longer `Option<bool>`, but
`Option<NetworkPermissions>` with `enabled: Option<bool>`.
## What Changed
- Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`:
- `pub network: Option<bool>` -> `pub network:
Option<NetworkPermissions>`
- Added `NetworkPermissions` with:
- `pub enabled: Option<bool>`
- Changed emptiness semantics so `network` is only considered empty when
`enabled` is `None`
- Updated skill metadata parsing to accept `permissions.network.enabled`
- Updated core permission consumers to read
`network.enabled.unwrap_or(false)` where a concrete boolean is needed
- Updated app-server v2 protocol types and regenerated schema/TypeScript
outputs
- Updated docs to mention `additionalPermissions.network.enabled`
## Why
Enterprises can already constrain approvals, sandboxing, and web search
through `requirements.toml` and MDM, but feature flags were still only
configurable as managed defaults. That meant an enterprise could suggest
feature values, but it could not actually pin them.
This change closes that gap and makes enterprise feature requirements
behave like the other constrained settings. The effective feature set
now stays consistent with enterprise requirements during config load,
when config writes are validated, and when runtime code mutates feature
flags later in the session.
It also tightens the runtime API for managed features. `ManagedFeatures`
now follows the same constraint-oriented shape as `Constrained<T>`
instead of exposing panic-prone mutation helpers, and production code
can no longer construct it through an unconstrained `From<Features>`
path.
The PR also hardens the `compact_resume_fork` integration coverage on
Windows. After the feature-management changes,
`compact_resume_after_second_compaction_preserves_history` was
overflowing the libtest/Tokio thread stacks on Windows, so the test now
uses an explicit larger-stack harness as a pragmatic mitigation. That
may not be the ideal root-cause fix, and it merits a parallel
investigation into whether part of the async future chain should be
boxed to reduce stack pressure instead.
## What Changed
Enterprises can now pin feature values in `requirements.toml` with the
requirements-side `features` table:
```toml
[features]
personality = true
unified_exec = false
```
Only canonical feature keys are allowed in the requirements `features`
table; omitted keys remain unconstrained.
- Added a requirements-side pinned feature map to
`ConfigRequirementsToml`, threaded it through source-preserving
requirements merge and normalization in `codex-config`, and made the
TOML surface use `[features]` (while still accepting legacy
`[feature_requirements]` for compatibility).
- Exposed `featureRequirements` from `configRequirements/read`,
regenerated the JSON/TypeScript schema artifacts, and updated the
app-server README.
- Wrapped the effective feature set in `ManagedFeatures`, backed by
`ConstrainedWithSource<Features>`, and changed its API to mirror
`Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`,
and result-returning `enable` / `disable` / `set_enabled` helpers.
- Removed the legacy-usage and bulk-map passthroughs from
`ManagedFeatures`; callers that need those behaviors now mutate a plain
`Features` value and reapply it through `set(...)`, so the constrained
wrapper remains the enforcement boundary.
- Removed the production loophole for constructing unconstrained
`ManagedFeatures`. Non-test code now creates it through the configured
feature-loading path, and `impl From<Features> for ManagedFeatures` is
restricted to `#[cfg(test)]`.
- Rejected legacy feature aliases in enterprise feature requirements,
and return a load error when a pinned combination cannot survive
dependency normalization.
- Validated config writes against enterprise feature requirements before
persisting changes, including explicit conflicting writes and
profile-specific feature states that normalize into invalid
combinations.
- Updated runtime and TUI feature-toggle paths to use the constrained
setter API and to persist or apply the effective post-constraint value
rather than the requested value.
- Updated the `core_test_support` Bazel target to include the bundled
core model-catalog fixtures in its runtime data, so helper code that
resolves `core/models.json` through runfiles works in remote Bazel test
environments.
- Renamed the core config test coverage to emphasize that effective
feature values are normalized at runtime, while conflicting persisted
config writes are rejected.
- Ran `compact_resume_after_second_compaction_preserves_history` inside
an explicit 8 MiB test thread and Tokio runtime worker stack, following
the existing larger-stack integration-test pattern, to keep the Windows
`compact_resume_fork` test slice from aborting while a parallel
investigation continues into whether some of the underlying async
futures should be boxed.
## Verification
- `cargo test -p codex-config`
- `cargo test -p codex-core feature_requirements_ -- --nocapture`
- `cargo test -p codex-core
load_requirements_toml_produces_expected_constraints -- --nocapture`
- `cargo test -p codex-core
compact_resume_after_second_compaction_preserves_history -- --nocapture`
- `cargo test -p codex-core compact_resume_fork -- --nocapture`
- Re-ran the built `codex-core` `tests/all` binary with
`RUST_MIN_STACK=262144` for
`compact_resume_after_second_compaction_preserves_history` to confirm
the explicit-stack harness fixes the deterministic low-stack repro.
- `cargo test -p codex-core`
- This still fails locally in unrelated integration areas that expect
the `codex` / `test_stdio_server` binaries or hit existing `search_tool`
wiremock mismatches.
## Docs
`developers.openai.com/codex` should document the requirements-side
`[features]` table for enterprise and MDM-managed configuration,
including that it only accepts canonical feature keys and that
conflicting config writes are rejected.
## Summary
`PermissionProfile.network` could not be preserved when additional or
compiled permissions resolved to
`SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access
field. This change makes read-only + network
enabled representable directly and threads that through the protocol,
app-server v2 mirror, and permission-
merging logic.
## What changed
- Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core
protocol and app-server v2 protocol.
- Kept backward compatibility by defaulting the new field to false, so
legacy read-only payloads still
deserialize unchanged.
- Updated `has_full_network_access()` and sandbox summaries to respect
read-only network access.
- Preserved PermissionProfile.network when:
- compiling skill permission profiles into sandbox policies
- normalizing additional permissions
- merging additional permissions into existing sandbox policies
- Updated the approval overlay to show network in the rendered
permission rule when requested.
- Regenerated app-server schema fixtures for the new v2 wire shape.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
• Keep Windows sandbox runner launches working from packaged installs by
running the helper from a user-owned runtime location.
On some Windows installs, the packaged helper location is difficult to
use reliably for sandboxed runner launches even though the binaries are
present. This change works around that by copying codex-
command-runner.exe into CODEX_HOME/.sandbox-bin/, reusing that copy
across launches, and falling back to the existing packaged-path lookup
if anything goes wrong.
The runtime copy lives in a dedicated directory with tighter ACLs than
.sandbox: sandbox users can read and execute the runner there, but they
cannot modify it. This keeps the workaround focused on the
command runner, leaves the setup helper on its trusted packaged path,
and adds logging so it is clear which runner path was selected at
launch.
### Summary
Propagate trace context originating at app-server RPC method handlers ->
codex core submission loop (so this includes spans such as `run_turn`!).
This implements PR 2 of the app-server tracing rollout.
This also removes the old lower-level env-based reparenting in core so
explicit request/submission ancestry wins instead of being overridden by
ambient `TRACEPARENT` state.
### What changed
- Added `trace: Option<W3cTraceContext>` to codex_protocol::Submission
- Taught `Codex::submit()` / `submit_with_id()` to automatically capture
the current span context when constructing or forwarding a submission
- Wrapped the core submission loop in a submission_dispatch span
parented from Submission.trace
- Warn on invalid submission trace carriers and ignore them cleanly
- Removed the old env-based downstream reparenting path in core task
execution
- Stopped OTEL provider init from implicitly attaching env trace context
process-wide
- Updated mcp-server Submission call sites for the new field
Added focused unit tests for:
- capturing trace context into Submission
- preferring `Submission.trace` when building the core dispatch span
### Why
PR 1 gave us consistent inbound request spans in app-server, but that
only covered the transport boundary. For long-running work like turns
and reviews, the important missing piece was preserving ancestry after
the request handler returns and core continues work on a different async
path.
This change makes that handoff explicit and keeps the parentage rules
simple:
- app-server request span sets the current context
- `Submission.trace` snapshots that context
- core restores it once, at the submission boundary
- deeper core spans inherit naturally
That also lets us stop relying on env-based reparenting for this path,
which was too ambient and could override explicit ancestry.
This adds a first-class app-server v2 `skills/changed` notification for
the existing skills live-reload signal.
Before this change, clients only had the legacy raw
`codex/event/skills_update_available` event. With this PR, v2 clients
can listen for a typed JSON-RPC notification instead of depending on the
legacy `codex/event/*` stream, which we want to remove soon.
load plugin-apps from `.app.json`.
make apps runtime-mentionable iff `codex_apps` MCP actually exposes
tools for that `connector_id`.
if the app isn't available, it's filtered out of runtime connector set,
so no tools are added and no app-mentions resolve.
right now we don't have a clean cli-side error for an app not being
installed. can look at this after.
### Tests
Added tests, tested locally that using a plugin that bundles an app
picks up the app.
## Summary
Instead of always adding inner function call outputs to the model
context, let js code decide which ones to return.
- Stop auto-hoisting nested tool outputs from `codex.tool(...)` into the
outer `js_repl` function output.
- Keep `codex.tool(...)` return values unchanged as structured JS
objects.
- Add `codex.emitImage(...)` as the explicit path for attaching an image
to the outer `js_repl` function output.
- Support emitting from a direct image URL, a single `input_image` item,
an explicit `{ bytes, mimeType }` object, or a raw tool response object
containing exactly one image.
- Preserve existing `view_image` original-resolution behavior when JS
emits the raw `view_image` tool result.
- Suppress the special `ViewImageToolCall` event for `js_repl`-sourced
`view_image` calls so nested inspection stays side-effect free until JS
explicitly emits.
- Update the `js_repl` docs and generated project instructions with both
recommended patterns:
- `await codex.emitImage(codex.tool("view_image", { path }))`
- `await codex.emitImage({ bytes: await page.screenshot({ type: "jpeg",
quality: 85 }), mimeType: "image/jpeg" })`
#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/13050
- 👉 `2` https://github.com/openai/codex/pull/13331
- ⏳ `3` https://github.com/openai/codex/pull/13049
## Summary
Add original-resolution support for `view_image` behind the
under-development `view_image_original_resolution` feature flag.
When the flag is enabled and the target model is `gpt-5.3-codex` or
newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends
`detail: "original"` to the Responses API instead of using the legacy
resize/compress path.
## What changed
- Added `view_image_original_resolution` as an under-development feature
flag.
- Added `ImageDetail` to the protocol models and support for serializing
`detail: "original"` on tool-returned images.
- Added `PromptImageMode::Original` to `codex-utils-image`.
- Preserves original PNG/JPEG/WebP bytes.
- Keeps legacy behavior for the resize path.
- Updated `view_image` to:
- use the shared `local_image_content_items_with_label_number(...)`
helper in both code paths
- select original-resolution mode only when:
- the feature flag is enabled, and
- the model slug parses as `gpt-5.3-codex` or newer
- Kept local user image attachments on the existing resize path; this
change is specific to `view_image`.
- Updated history/image accounting so only `detail: "original"` images
use the docs-based GPT-5 image cost calculation; legacy images still use
the old fixed estimate.
- Added JS REPL guidance, gated on the same feature flag, to prefer JPEG
at 85% quality unless lossless is required, while still allowing other
formats when explicitly requested.
- Updated tests and helper code that construct
`FunctionCallOutputContentItem::InputImage` to carry the new `detail`
field.
## Behavior
### Feature off
- `view_image` keeps the existing resize/re-encode behavior.
- History estimation keeps the existing fixed-cost heuristic.
### Feature on + `gpt-5.3-codex+`
- `view_image` sends original-resolution images with `detail:
"original"`.
- PNG/JPEG/WebP source bytes are preserved when possible.
- History estimation uses the GPT-5 docs-based image-cost calculation
for those `detail: "original"` images.
#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/13050
- ⏳ `2` https://github.com/openai/codex/pull/13331
- ⏳ `3` https://github.com/openai/codex/pull/13049
## Summary
- add the v2 `thread/metadata/update` API, including
protocol/schema/TypeScript exports and app-server docs
- patch stored thread `gitInfo` in sqlite without resuming the thread,
with validation plus support for explicit `null` clears
- repair missing sqlite thread rows from rollout data before patching,
and make those repairs safe by inserting only when absent and updating
only git columns so newer metadata is not clobbered
- keep sqlite authoritative for mutable thread git metadata by
preserving existing sqlite git fields during reconcile/backfill and only
using rollout `SessionMeta` git fields to fill gaps
- add regression coverage for the endpoint, repair paths, concurrent
sqlite writes, clearing git fields, and rollout/backfill reconciliation
- fix the login server shutdown race so cancelling before the waiter
starts still terminates `block_until_done()` correctly
## Testing
- `cargo test -p codex-state
apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields`
- `cargo test -p codex-state
update_thread_git_info_preserves_newer_non_git_metadata`
- `cargo test -p codex-core
backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields`
- `cargo test -p codex-app-server thread_metadata_update`
- `cargo test`
- currently fails in existing `codex-core` grep-files tests with
`unsupported call: grep_files`:
- `suite::grep_files::grep_files_tool_collects_matches`
- `suite::grep_files::grep_files_tool_reports_empty_results`
## Summary
- submit `Enter` steers immediately while a turn is already running
instead of routing them through `queued_user_messages`
- keep those submitted steers visible in the footer as `pending_steers`
until core records them as a user message or aborts the turn
- reconcile pending steers on `ItemCompleted(UserMessage)`, not
`RawResponseItem`
- emit user-message item lifecycle for leftover pending input at task
finish, then remove the TUI `TurnComplete` fallback
- keep `queued_user_messages` for actual queued drafts, rendered below
pending steers
## Problem
While the assistant was generating, pressing `Enter` could send the
input into `queued_user_messages`. That queue only drains after the turn
ends, so ordinary steers behaved like queued drafts instead of landing
at the next core sampling boundary.
The first version of this fix also used `RawResponseItem` to decide when
a steer had landed. Review feedback was that this is the wrong
abstraction for client behavior.
There was also a late edge case in core: if pending steer input was
accepted after the final sampling decision but before `TurnComplete`,
core would record that user message into history at task finish without
emitting `ItemStarted(UserMessage)` / `ItemCompleted(UserMessage)`. TUI
had a fallback to paper over that gap locally.
## Approach
- `Enter` during an active turn now submits a normal `Op::UserTurn`
immediately
- TUI keeps a local pending-steer preview instead of rendering that user
message into history immediately
- when core records the steer as `ItemCompleted(UserMessage)`, TUI
matches and removes the corresponding pending preview, then renders the
committed user message
- core now emits the same user-message lifecycle when
`on_task_finished(...)` drains leftover pending user input, before
`TurnComplete`
- with that lifecycle gap closed in core, TUI no longer needs to flush
pending steers into history on `TurnComplete`
- if the turn is interrupted, pending steers and queued drafts are both
restored into the composer, with pending steers first
## Notes
- `Tab` still uses the real queued-message path
- `queued_user_messages` and `pending_steers` are separate state with
separate semantics
- the pending-steer matching key is built directly from `UserInput`
- this removes the new TUI dependency on `RawResponseItem`
## Validation
- `just fmt`
- `cargo test -p codex-core
task_finish_emits_turn_item_lifecycle_for_leftover_pending_user_input --
--nocapture`
- `cargo test -p codex-tui`
Update config.toml plugin entries to use
<plugin_name>@<marketplace_name> as the key.
Plugin now stays in
[plugins/cache/marketplace-name/plugin-name/$version/]
Clean up the plugin code structure.
Add plugin install functionality (not used yet).
## Summary
- Route delegated realtime handoff turns from all handoff message texts,
preserving order
- Fallback to input_transcript only when no messages are present
- Add regression coverage for multi-message handoff requests
Realized EventMsg generated types were unintentionally removed as part
of this PR: https://github.com/openai/codex/pull/13375
Turns out our TypeScript export pipeline relied on transitively reaching
`EventMsg`. We should still export `EventMsg` explicitly since we're
still emitting `codex/event/*` events (for now, but getting dropped soon
as well).
## Summary
This removes the old app-server v1 methods and notifications we no
longer need, while keeping the small set the main codex app client still
depends on for now.
The remaining legacy surface is:
- `initialize`
- `getConversationSummary`
- `getAuthStatus`
- `gitDiffToRemote`
- `fuzzyFileSearch`
- `fuzzyFileSearch/sessionStart`
- `fuzzyFileSearch/sessionUpdate`
- `fuzzyFileSearch/sessionStop`
And the raw `codex/event/*` notifications emitted from core. These
notifications will be removed in a followup PR.
## What changed
- removed deprecated v1 request variants from the protocol and
app-server dispatcher
- removed deprecated typed notifications: `authStatusChange`,
`loginChatGptComplete`, and `sessionConfigured`
- updated the app-server test client to use v2 flows instead of deleted
v1 flows
- deleted legacy-only app-server test suites and added focused coverage
for `getConversationSummary`
- regenerated app-server schema fixtures and updated the MCP interface
docs to match the remaining compatibility surface
## Testing
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
## Summary
- collapse parsed command output to a single `Unknown` whenever the
normal parse includes any unknown entry
- preserve the existing parsing flow and existing `cd` handling,
including the current `cd && ...` collapse behavior
- trim redundant tests and add focused coverage for collapse-on-unknown
cases
## Testing
- `cargo test -p codex-shell-command`
## Summary
- write app-server SQLite logs at TRACE level when SQLite is enabled
- source app-server `/feedback` log attachments from SQLite for the
requested thread when available
- flush buffered SQLite log writes before `/feedback` queries them so
newly emitted events are not lost behind the async inserter
- include same-process threadless SQLite rows in those `/feedback` logs
so the attachment matches the process-wide feedback buffer more closely
- keep the existing in-memory ring buffer fallback unchanged, including
when the SQLite query returns no rows
## Details
- add a byte-bounded `query_feedback_logs` helper in `codex-state` so
`/feedback` does not fetch all rows before truncating
- scope SQLite feedback logs to the requested thread plus threadless
rows from the same `process_uuid`
- format exported SQLite feedback lines with the log level prefix to
better match the in-memory feedback formatter
- add an explicit `LogDbLayer::flush()` control path and await it in
app-server before querying SQLite for feedback logs
- pass optional SQLite log bytes through `codex-feedback` as the
`codex-logs.log` attachment override
- leave TUI behavior unchanged apart from the updated `upload_feedback`
call signature
- add regression coverage for:
- newest-within-budget ordering
- excluding oversized newest rows
- including same-process threadless rows
- keeping the newest suffix across mixed thread and threadless rows
- matching the feedback formatter shape aside from span prefixes
- falling back to the in-memory snapshot when SQLite returns no logs
- flushing buffered SQLite rows before querying
## Follow-up
- SQLite feedback exports still do not reproduce span prefixes like
`feedback-thread{thread_id=...}:`; there is a `TODO(ccunningham)` in
`codex-rs/state/src/log_db.rs` for that follow-up.
## Testing
- `cd codex-rs && cargo test -p codex-state`
- `cd codex-rs && cargo test -p codex-app-server`
- `cd codex-rs && just fmt`
## Summary
- add an `--experimental` flag to the export binary and thread the
option through TypeScript and JSON schema generation
- flatten the v2 schema bundle into a datamodel-code-generator-friendly
`codex_app_server_protocol.v2.schemas.json` export
- retarget shared helper refs to namespaced v2 definitions, add coverage
for the new export behavior, and vendor the generated schema fixtures
## Validation
- `cargo test -p codex-app-server-protocol` (71 unit tests and bin
targets passed locally; the final schema fixture integration target was
revalidated via fresh schema regeneration and a tree diff)
- `./target/debug/write_schema_fixtures --schema-root <tmpdir>`
- `diff -rq app-server-protocol/schema <tmpdir>`
## Tickets
- None
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
## Summary
- add a direct install script for Windows at
`scripts/install/install.ps1`
- extend release staging so `install.ps1` is published alongside
`install.sh`
- install the Windows runtime payload (`codex.exe`, `rg.exe`, and helper
binaries) from the existing platform npm package
## Dependencies
- Depends on https://github.com/openai/codex/pull/12740
## Testing
- Smoke-tested with powershell
2026-03-03 09:25:50 -08:00
670 changed files with 82034 additions and 46076 deletions
"description":"Run a standalone command (argv vector) in the server sandbox without creating a thread or turn.\n\nThe final `command/exec` response is deferred until the process exits and is sent only after all `command/exec/outputDelta` notifications for that connection have been emitted.",
"properties":{
"command":{
"description":"Command argv vector. Empty arrays are rejected.",
"items":{
"type":"string"
},
"type":"array"
},
"cwd":{
"description":"Optional working directory. Defaults to the server cwd.",
"type":[
"string",
"null"
]
},
"disableOutputCap":{
"description":"Disable stdout/stderr capture truncation for this request.\n\nCannot be combined with `outputBytesCap`.",
"type":"boolean"
},
"disableTimeout":{
"description":"Disable the timeout entirely for this request.\n\nCannot be combined with `timeoutMs`.",
"type":"boolean"
},
"env":{
"additionalProperties":{
"type":[
"string",
"null"
]
},
"description":"Optional environment overrides merged into the server-computed environment.\n\nMatching names override inherited values. Set a key to `null` to unset an inherited variable.",
"type":[
"object",
"null"
]
},
"outputBytesCap":{
"description":"Optional per-stream stdout/stderr capture cap in bytes.\n\nWhen omitted, the server default applies. Cannot be combined with `disableOutputCap`.",
"format":"uint",
"minimum":0.0,
"type":[
"integer",
"null"
]
},
"processId":{
"description":"Optional client-supplied, connection-scoped process id.\n\nRequired for `tty`, `streamStdin`, `streamStdoutStderr`, and follow-up `command/exec/write`, `command/exec/resize`, and `command/exec/terminate` calls. When omitted, buffered execution gets an internal id that is not exposed to the client.",
"type":[
"string",
"null"
@@ -169,14 +209,39 @@
{
"type":"null"
}
]
],
"description":"Optional sandbox policy for this command.\n\nUses the same shape as thread/turn execution sandbox configuration and defaults to the user's configured policy when omitted."
},
"size":{
"anyOf":[
{
"$ref":"#/definitions/CommandExecTerminalSize"
},
{
"type":"null"
}
],
"description":"Optional initial PTY size in character cells. Only valid when `tty` is true."
},
"streamStdin":{
"description":"Allow follow-up `command/exec/write` requests to write stdin bytes.\n\nRequires a client-supplied `processId`.",
"type":"boolean"
},
"streamStdoutStderr":{
"description":"Stream stdout/stderr via `command/exec/outputDelta` notifications.\n\nStreamed bytes are not duplicated into the final response and require a client-supplied `processId`.",
"type":"boolean"
},
"timeoutMs":{
"description":"Optional timeout in milliseconds.\n\nWhen omitted, the server default applies. Cannot be combined with `disableTimeout`.",
"format":"int64",
"type":[
"integer",
"null"
]
},
"tty":{
"description":"Enable PTY mode.\n\nThis implies `streamStdin` and `streamStdoutStderr`.",
"type":"boolean"
}
},
"required":[
@@ -184,6 +249,87 @@
],
"type":"object"
},
"CommandExecResizeParams":{
"description":"Resize a running PTY-backed `command/exec` session.",
"properties":{
"processId":{
"description":"Client-supplied, connection-scoped `processId` from the original `command/exec` request.",
"type":"string"
},
"size":{
"allOf":[
{
"$ref":"#/definitions/CommandExecTerminalSize"
}
],
"description":"New PTY size in character cells."
}
},
"required":[
"processId",
"size"
],
"type":"object"
},
"CommandExecTerminalSize":{
"description":"PTY size in character cells for `command/exec` PTY sessions.",
"properties":{
"cols":{
"description":"Terminal width in character cells.",
"format":"uint16",
"minimum":0.0,
"type":"integer"
},
"rows":{
"description":"Terminal height in character cells.",
"format":"uint16",
"minimum":0.0,
"type":"integer"
}
},
"required":[
"cols",
"rows"
],
"type":"object"
},
"CommandExecTerminateParams":{
"description":"Terminate a running `command/exec` session.",
"properties":{
"processId":{
"description":"Client-supplied, connection-scoped `processId` from the original `command/exec` request.",
"type":"string"
}
},
"required":[
"processId"
],
"type":"object"
},
"CommandExecWriteParams":{
"description":"Write stdin bytes to a running `command/exec` session, close stdin, or both.",
"properties":{
"closeStdin":{
"description":"Close stdin after writing `deltaBase64`, if present.",
"type":"boolean"
},
"deltaBase64":{
"description":"Optional base64-encoded stdin bytes to write.",
"type":[
"string",
"null"
]
},
"processId":{
"description":"Client-supplied, connection-scoped `processId` from the original `command/exec` request.",
"type":"string"
}
},
"required":[
"processId"
],
"type":"object"
},
"ConfigBatchWriteParams":{
"properties":{
"edits":{
@@ -514,6 +660,16 @@
},
{
"properties":{
"detail":{
"anyOf":[
{
"$ref":"#/definitions/ImageDetail"
},
{
"type":"null"
}
]
},
"image_url":{
"type":"string"
},
@@ -627,6 +783,15 @@
],
"type":"string"
},
"ImageDetail":{
"enum":[
"auto",
"low",
"high",
"original"
],
"type":"string"
},
"InitializeCapabilities":{
"description":"Client-declared capabilities negotiated during initialize.",
"properties":{
@@ -932,6 +1097,36 @@
],
"type":"string"
},
"PluginInstallParams":{
"properties":{
"marketplacePath":{
"$ref":"#/definitions/AbsolutePathBuf"
},
"pluginName":{
"type":"string"
}
},
"required":[
"marketplacePath",
"pluginName"
],
"type":"object"
},
"PluginListParams":{
"properties":{
"cwds":{
"description":"Optional working directories used to discover repo marketplaces. When omitted, only home-scoped marketplaces and the official curated marketplace are considered.",
"description":"Patch the stored Git metadata for this thread. Omit a field to leave it unchanged, set it to `null` to clear it, or provide a string to replace the stored value."
},
"threadId":{
"type":"string"
}
},
"required":[
"threadId"
],
"type":"object"
},
"ThreadReadParams":{
"properties":{
"includeTurns":{
@@ -2648,107 +3031,6 @@
}
]
},
"WebSearchAction":{
"oneOf":[
{
"properties":{
"queries":{
"items":{
"type":"string"
},
"type":[
"array",
"null"
]
},
"query":{
"type":[
"string",
"null"
]
},
"type":{
"enum":[
"search"
],
"title":"SearchWebSearchActionType",
"type":"string"
}
},
"required":[
"type"
],
"title":"SearchWebSearchAction",
"type":"object"
},
{
"properties":{
"type":{
"enum":[
"open_page"
],
"title":"OpenPageWebSearchActionType",
"type":"string"
},
"url":{
"type":[
"string",
"null"
]
}
},
"required":[
"type"
],
"title":"OpenPageWebSearchAction",
"type":"object"
},
{
"properties":{
"pattern":{
"type":[
"string",
"null"
]
},
"type":{
"enum":[
"find_in_page"
],
"title":"FindInPageWebSearchActionType",
"type":"string"
},
"url":{
"type":[
"string",
"null"
]
}
},
"required":[
"type"
],
"title":"FindInPageWebSearchAction",
"type":"object"
},
{
"properties":{
"type":{
"enum":[
"other"
],
"title":"OtherWebSearchActionType",
"type":"string"
}
},
"required":[
"type"
],
"title":"OtherWebSearchAction",
"type":"object"
}
]
},
"WindowsSandboxSetupMode":{
"enum":[
"elevated",
@@ -2758,6 +3040,16 @@
},
"WindowsSandboxSetupStartParams":{
"properties":{
"cwd":{
"anyOf":[
{
"$ref":"#/definitions/AbsolutePathBuf"
},
{
"type":"null"
}
]
},
"mode":{
"$ref":"#/definitions/WindowsSandboxSetupMode"
}
@@ -2939,6 +3231,30 @@
"title":"Thread/name/setRequest",
"type":"object"
},
{
"properties":{
"id":{
"$ref":"#/definitions/RequestId"
},
"method":{
"enum":[
"thread/metadata/update"
],
"title":"Thread/metadata/updateRequestMethod",
"type":"string"
},
"params":{
"$ref":"#/definitions/ThreadMetadataUpdateParams"
}
},
"required":[
"id",
"method",
"params"
],
"title":"Thread/metadata/updateRequest",
"type":"object"
},
{
"properties":{
"id":{
@@ -3107,6 +3423,30 @@
"title":"Skills/listRequest",
"type":"object"
},
{
"properties":{
"id":{
"$ref":"#/definitions/RequestId"
},
"method":{
"enum":[
"plugin/list"
],
"title":"Plugin/listRequestMethod",
"type":"string"
},
"params":{
"$ref":"#/definitions/PluginListParams"
}
},
"required":[
"id",
"method",
"params"
],
"title":"Plugin/listRequest",
"type":"object"
},
{
"properties":{
"id":{
@@ -3203,6 +3543,30 @@
"title":"Skills/config/writeRequest",
"type":"object"
},
{
"properties":{
"id":{
"$ref":"#/definitions/RequestId"
},
"method":{
"enum":[
"plugin/install"
],
"title":"Plugin/installRequestMethod",
"type":"string"
},
"params":{
"$ref":"#/definitions/PluginInstallParams"
}
},
"required":[
"id",
"method",
"params"
],
"title":"Plugin/installRequest",
"type":"object"
},
{
"properties":{
"id":{
@@ -3561,7 +3925,7 @@
"type":"object"
},
{
"description":"Execute a command (argv vector) under the server's sandbox.",
"description":"Execute a standalone command (argv vector) under the server's sandbox.",
"properties":{
"id":{
"$ref":"#/definitions/RequestId"
@@ -3585,6 +3949,81 @@
"title":"Command/execRequest",
"type":"object"
},
{
"description":"Write stdin bytes to a running `command/exec` session or close stdin.",
"properties":{
"id":{
"$ref":"#/definitions/RequestId"
},
"method":{
"enum":[
"command/exec/write"
],
"title":"Command/exec/writeRequestMethod",
"type":"string"
},
"params":{
"$ref":"#/definitions/CommandExecWriteParams"
}
},
"required":[
"id",
"method",
"params"
],
"title":"Command/exec/writeRequest",
"type":"object"
},
{
"description":"Terminate a running `command/exec` session by client-supplied `processId`.",
"properties":{
"id":{
"$ref":"#/definitions/RequestId"
},
"method":{
"enum":[
"command/exec/terminate"
],
"title":"Command/exec/terminateRequestMethod",
"type":"string"
},
"params":{
"$ref":"#/definitions/CommandExecTerminateParams"
}
},
"required":[
"id",
"method",
"params"
],
"title":"Command/exec/terminateRequest",
"type":"object"
},
{
"description":"Resize a running PTY-backed `command/exec` session by client-supplied `processId`.",
"description":"Explicit mention selected by the user (name + app://connectorid).",
"description":"Explicit structured mention selected by the user.\n\n`path` identifies the exact mention target, for example `app://<connector-id>` or `plugin://<plugin-name>@<marketplace-name>`.",
"properties":{
"name":{
"type":"string"
@@ -6074,107 +6407,6 @@
"type":"object"
}
]
},
"WebSearchAction":{
"oneOf":[
{
"properties":{
"queries":{
"items":{
"type":"string"
},
"type":[
"array",
"null"
]
},
"query":{
"type":[
"string",
"null"
]
},
"type":{
"enum":[
"search"
],
"title":"SearchWebSearchActionType",
"type":"string"
}
},
"required":[
"type"
],
"title":"SearchWebSearchAction",
"type":"object"
},
{
"properties":{
"type":{
"enum":[
"open_page"
],
"title":"OpenPageWebSearchActionType",
"type":"string"
},
"url":{
"type":[
"string",
"null"
]
}
},
"required":[
"type"
],
"title":"OpenPageWebSearchAction",
"type":"object"
},
{
"properties":{
"pattern":{
"type":[
"string",
"null"
]
},
"type":{
"enum":[
"find_in_page"
],
"title":"FindInPageWebSearchActionType",
"type":"string"
},
"url":{
"type":[
"string",
"null"
]
}
},
"required":[
"type"
],
"title":"FindInPageWebSearchAction",
"type":"object"
},
{
"properties":{
"type":{
"enum":[
"other"
],
"title":"OtherWebSearchActionType",
"type":"string"
}
},
"required":[
"type"
],
"title":"OtherWebSearchAction",
"type":"object"
}
]
}
},
"description":"Response event from the agent NOTE: Make sure none of these values have optional types, as it will mess up the extension code-gen.",
"description":"Typed form schema for MCP `elicitation/create` requests.\n\nThis matches the `requestedSchema` shape from the MCP 2025-11-25 `ElicitRequestFormParams` schema.",
"description":"Active Codex turn when this elicitation was observed, if app-server could correlate one.\n\nThis is nullable because MCP models elicitation as a standalone server-to-client request identified by the MCP server request id. It may be triggered during a turn, but turn context is app-server correlation rather than part of the protocol identity of the elicitation itself.",
"description":"Optional client metadata for form-mode action handling."
},
"action":{
"$ref":"#/definitions/McpServerElicitationAction"
},
"content":{
"description":"Structured user input for accepted elicitations, mirroring RMCP `CreateElicitationResult`.\n\nThis is nullable because decline/cancel responses have no content."
"description":"Base64-encoded output chunk emitted for a streaming `command/exec` request.\n\nThese notifications are connection-scoped. If the originating connection closes, the server terminates the process.",
"properties":{
"capReached":{
"description":"`true` on the final streamed chunk for a stream when `outputBytesCap` truncated later output on that stream.",
"type":"boolean"
},
"deltaBase64":{
"description":"Base64-encoded output bytes.",
"type":"string"
},
"processId":{
"description":"Client-supplied, connection-scoped `processId` from the original `command/exec` request.",
"type":"string"
},
"stream":{
"allOf":[
{
"$ref":"#/definitions/CommandExecOutputStream"
}
],
"description":"Output stream for this chunk."
}
},
"required":[
"capReached",
"deltaBase64",
"processId",
"stream"
],
"type":"object"
},
"CommandExecOutputStream":{
"description":"Stream label for `command/exec/outputDelta` notifications.",
"description":"Notification emitted when watched local skill files change.\n\nTreat this as an invalidation signal and re-run `skills/list` with the client's current parameters when refreshed skill metadata is needed.",
"type":"object"
},
"SubAgentSource":{
"oneOf":[
{
@@ -2247,6 +2309,40 @@
"title":"ImageViewThreadItem",
"type":"object"
},
{
"properties":{
"id":{
"type":"string"
},
"result":{
"type":"string"
},
"revisedPrompt":{
"type":[
"string",
"null"
]
},
"status":{
"type":"string"
},
"type":{
"enum":[
"imageGeneration"
],
"title":"ImageGenerationThreadItemType",
"type":"string"
}
},
"required":[
"id",
"result",
"status",
"type"
],
"title":"ImageGenerationThreadItem",
"type":"object"
},
{
"properties":{
"id":{
@@ -3202,6 +3298,26 @@
"title":"Thread/closedNotification",
"type":"object"
},
{
"properties":{
"method":{
"enum":[
"skills/changed"
],
"title":"Skills/changedNotificationMethod",
"type":"string"
},
"params":{
"$ref":"#/definitions/SkillsChangedNotification"
}
},
"required":[
"method",
"params"
],
"title":"Skills/changedNotification",
"type":"object"
},
{
"properties":{
"method":{
@@ -3403,6 +3519,27 @@
"title":"Item/plan/deltaNotification",
"type":"object"
},
{
"description":"Stream base64-encoded stdout/stderr chunks for a running `command/exec` session.",
"description":"Typed form schema for MCP `elicitation/create` requests.\n\nThis matches the `requestedSchema` shape from the MCP 2025-11-25 `ElicitRequestFormParams` schema.",
"description":"Active Codex turn when this elicitation was observed, if app-server could correlate one.\n\nThis is nullable because MCP models elicitation as a standalone server-to-client request identified by the MCP server request id. It may be triggered during a turn, but turn context is app-server correlation rather than part of the protocol identity of the elicitation itself.",
"type":[
"string",
"null"
]
}
},
"required":[
"serverName",
"threadId"
],
"type":"object"
},
"NetworkApprovalContext":{
"properties":{
"host":{
@@ -966,6 +1584,31 @@
"title":"Item/tool/requestUserInputRequest",
"type":"object"
},
{
"description":"Request input for an MCP server elicitation.",
"description":"Base64-encoded output chunk emitted for a streaming `command/exec` request.\n\nThese notifications are connection-scoped. If the originating connection closes, the server terminates the process.",
"properties":{
"capReached":{
"description":"`true` on the final streamed chunk for a stream when `outputBytesCap` truncated later output on that stream.",
"type":"boolean"
},
"deltaBase64":{
"description":"Base64-encoded output bytes.",
"type":"string"
},
"processId":{
"description":"Client-supplied, connection-scoped `processId` from the original `command/exec` request.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
},
"CommandExecTerminalSize":{
"description":"PTY size in character cells for `command/exec` PTY sessions.",
"properties":{
"cols":{
"description":"Terminal width in character cells.",
"format":"uint16",
"minimum":0.0,
"type":"integer"
},
"rows":{
"description":"Terminal height in character cells.",
"format":"uint16",
"minimum":0.0,
"type":"integer"
}
},
"required":[
"cols",
"rows"
],
"type":"object"
},
"NetworkAccess":{
"enum":[
"restricted",
@@ -89,6 +111,10 @@
"type":"fullAccess"
}
},
"networkAccess":{
"default":false,
"type":"boolean"
},
"type":{
"enum":[
"readOnly"
@@ -175,14 +201,54 @@
]
}
},
"description":"Run a standalone command (argv vector) in the server sandbox without creating a thread or turn.\n\nThe final `command/exec` response is deferred until the process exits and is sent only after all `command/exec/outputDelta` notifications for that connection have been emitted.",
"properties":{
"command":{
"description":"Command argv vector. Empty arrays are rejected.",
"items":{
"type":"string"
},
"type":"array"
},
"cwd":{
"description":"Optional working directory. Defaults to the server cwd.",
"type":[
"string",
"null"
]
},
"disableOutputCap":{
"description":"Disable stdout/stderr capture truncation for this request.\n\nCannot be combined with `outputBytesCap`.",
"type":"boolean"
},
"disableTimeout":{
"description":"Disable the timeout entirely for this request.\n\nCannot be combined with `timeoutMs`.",
"type":"boolean"
},
"env":{
"additionalProperties":{
"type":[
"string",
"null"
]
},
"description":"Optional environment overrides merged into the server-computed environment.\n\nMatching names override inherited values. Set a key to `null` to unset an inherited variable.",
"type":[
"object",
"null"
]
},
"outputBytesCap":{
"description":"Optional per-stream stdout/stderr capture cap in bytes.\n\nWhen omitted, the server default applies. Cannot be combined with `disableOutputCap`.",
"format":"uint",
"minimum":0.0,
"type":[
"integer",
"null"
]
},
"processId":{
"description":"Optional client-supplied, connection-scoped process id.\n\nRequired for `tty`, `streamStdin`, `streamStdoutStderr`, and follow-up `command/exec/write`, `command/exec/resize`, and `command/exec/terminate` calls. When omitted, buffered execution gets an internal id that is not exposed to the client.",
"type":[
"string",
"null"
@@ -196,14 +262,39 @@
{
"type":"null"
}
]
],
"description":"Optional sandbox policy for this command.\n\nUses the same shape as thread/turn execution sandbox configuration and defaults to the user's configured policy when omitted."
},
"size":{
"anyOf":[
{
"$ref":"#/definitions/CommandExecTerminalSize"
},
{
"type":"null"
}
],
"description":"Optional initial PTY size in character cells. Only valid when `tty` is true."
},
"streamStdin":{
"description":"Allow follow-up `command/exec/write` requests to write stdin bytes.\n\nRequires a client-supplied `processId`.",
"type":"boolean"
},
"streamStdoutStderr":{
"description":"Stream stdout/stderr via `command/exec/outputDelta` notifications.\n\nStreamed bytes are not duplicated into the final response and require a client-supplied `processId`.",
"type":"boolean"
},
"timeoutMs":{
"description":"Optional timeout in milliseconds.\n\nWhen omitted, the server default applies. Cannot be combined with `disableTimeout`.",
"format":"int64",
"type":[
"integer",
"null"
]
},
"tty":{
"description":"Enable PTY mode.\n\nThis implies `streamStdin` and `streamStdoutStderr`.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"type":"string"
}
},
"properties":{
"cwds":{
"description":"Optional working directories used to discover repo marketplaces. When omitted, only home-scoped marketplaces and the official curated marketplace are considered.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
"description":"Notification emitted when watched local skill files change.\n\nTreat this as an invalidation signal and re-run `skills/list` with the client's current parameters when refreshed skill metadata is needed.",
"description":"Patch the stored Git metadata for this thread. Omit a field to leave it unchanged, set it to `null` to clear it, or provide a string to replace the stored value."
"description":"There are three ways to resume a thread: 1. By thread_id: load the thread from disk by thread_id and resume it. 2. By history: instantiate the thread from memory and resume it. 3. By path: load the thread from disk by path and resume it.\n\nThe precedence is: history > path > thread_id. If using history or path, the thread_id param will be ignored.\n\nPrefer using thread_id whenever possible.",
"description":"A path that is guaranteed to be absolute and normalized (though it is not guaranteed to be canonicalized or exist on the filesystem).\n\nIMPORTANT: When deserializing an `AbsolutePathBuf`, a base path must be set using [AbsolutePathBufGuard::new]. If no base path is set, the deserialization will fail unless the path being deserialized is already absolute.",
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.