Add a temporary internal remote_plugin feature flag that merges remote
marketplaces into plugin/list and routes plugin/read through the remote
APIs when needed, while keeping pure local marketplaces working as
before.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Add first-class Amazon Bedrock Mantle provider support so Codex can keep
using its existing Responses API transport with OpenAI-compatible
AWS-hosted endpoints such as AOA/Mantle.
This is needed for the AWS launch path, where provider traffic should
authenticate with AWS credentials instead of OpenAI bearer credentials.
Requests are authenticated immediately before transport send, so SigV4
signs the final method, URL, headers, and body bytes that `reqwest` will
send.
## What Changed
- Added a new `codex-aws-auth` crate for loading AWS SDK config,
resolving credentials, and signing finalized HTTP requests with AWS
SigV4.
- Added a built-in `amazon-bedrock` provider that targets Bedrock Mantle
Responses endpoints, defaults to `us-east-1`, supports region/profile
overrides, disables WebSockets, and does not require OpenAI auth.
- Added Amazon Bedrock auth resolution in `codex-model-provider`: prefer
`AWS_BEARER_TOKEN_BEDROCK` when set, otherwise use AWS SDK credentials
and SigV4 signing.
- Added `AuthProvider::apply_auth` and `Request::prepare_body_for_send`
so request-signing providers can sign the exact outbound request after
JSON serialization/compression.
- Determine the region by taking the `aws.region` config first (required
for bearer token codepath), and fallback to SDK default region.
## Testing
Amazon Bedrock Mantle Responses paths:
- Built the local Codex binary with `cargo build`.
- Verified the custom proxy-backed `aws` provider using `env_key =
"AWS_BEARER_TOKEN_BEDROCK"` streamed raw `responses` output with
`response.output_text.delta`, `response.completed`, and `mantle-env-ok`.
- Verified a full `codex exec --profile aws` turn returned
`mantle-env-ok`.
- Confirmed the custom provider used the bearer env var, not AWS profile
auth: bogus `AWS_PROFILE` still passed, empty env var failed locally,
and malformed env var reached Mantle and failed with `401
invalid_api_key`.
- Verified built-in `amazon-bedrock` with `AWS_BEARER_TOKEN_BEDROCK` set
passed despite bogus AWS profiles, returning `amazon-bedrock-env-ok`.
- Verified built-in `amazon-bedrock` SDK/SigV4 auth passed with
`AWS_BEARER_TOKEN_BEDROCK` unset and temporary AWS session env
credentials, returning `amazon-bedrock-sdk-env-ok`.
## Summary
Adds the standalone `codex-rollout-trace` crate, which defines the raw
trace event format, replay/reduction model, writer, and reducer logic
for reconstructing model-visible conversation/runtime state from
recorded rollout data.
The crate-level design is documented in
[`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md).
## Stack
This is PR 1/5 in the rollout trace stack.
- [#18876](https://github.com/openai/codex/pull/18876): Add rollout
trace crate
- [#18877](https://github.com/openai/codex/pull/18877): Record core
session rollout traces
- [#18878](https://github.com/openai/codex/pull/18878): Trace tool and
code-mode boundaries
- [#18879](https://github.com/openai/codex/pull/18879): Trace sessions
and multi-agent edges
- [#18880](https://github.com/openai/codex/pull/18880): Add debug trace
reduction command
## Review Notes
This PR intentionally does not wire tracing into live Codex execution.
It establishes the data model and reducer contract first, with
crate-local tests covering conversation reconstruction, compaction
boundaries, tool/session edges, and code-cell lifecycle reduction. Later
PRs emit into this model.
The README is the best entry point for reviewing the intended trace
format and reduction semantics before diving into the reducer modules.
## Summary
- Adds a process-local, in-memory cookie store for ChatGPT HTTP clients.
- Limits cookie storage and replay to a shared ChatGPT host allowlist.
- Wires the shared store into the default Codex reqwest client and
backend client.
- Shares the ChatGPT host allowlist with remote-control URL validation
to avoid drift.
- Enables reqwest cookie support and updates lockfiles.
## Summary
This PR fully reverts the previously merged Agent Identity runtime
integration from the old stack:
https://github.com/openai/codex/pull/17387/changes
It removes the Codex-side task lifecycle wiring, rollout/session
persistence, feature flag plumbing, lazy `auth.json` mutation,
background task auth paths, and request callsite changes introduced by
that stack.
This leaves the repo in a clean pre-AgentIdentity integration state so
the follow-up PRs can reintroduce the pieces in smaller reviewable
layers.
## Stack
1. This PR: full revert
2. https://github.com/openai/codex/pull/18871: move Agent Identity
business logic into a crate
3. https://github.com/openai/codex/pull/18785: add explicit
AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate auth callsites
through AuthProvider
## Testing
Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
## Why
The device-key protocol needs an app-server implementation that keeps
local key operations behind the same request-processing boundary as
other v2 APIs.
app-server owns request dispatch, transport policy, documentation, and
JSON-RPC error shaping. `codex-device-key` owns key binding, validation,
platform provider selection, and signing mechanics. Keeping the adapter
thin makes the boundary easier to review and avoids moving local
key-management details into thread orchestration code.
## What changed
- Added `DeviceKeyApi` as the app-server adapter around
`DeviceKeyStore`.
- Converted protocol protection policies, payload variants, algorithms,
and protection classes to and from the device-key crate types.
- Encoded SPKI public keys and DER signatures as base64 protocol fields.
- Routed `device/key/create`, `device/key/public`, and `device/key/sign`
through `MessageProcessor`.
- Rejected remote transports before provider access while allowing local
`stdio` and in-process callers to reach the device-key API.
- Added stdio, in-process, and websocket tests for device-key validation
and transport policy.
- Documented the device-key methods in the app-server v2 method list.
## Test coverage
- `device_key_create_rejects_empty_account_user_id`
- `in_process_allows_device_key_requests_to_reach_device_key_api`
- `device_key_methods_are_rejected_over_websocket`
## Stack
This is PR 3 of 4 in the device-key app-server stack. It is stacked on
#18429.
## Validation
- `cargo test -p codex-app-server device_key`
- `just fix -p codex-app-server`
## Why
Device-key storage and signing are local security-sensitive operations
with platform-specific behavior. Keeping the core API in
`codex-device-key` keeps app-server focused on routing and business
logic instead of owning key-management details.
The crate keeps the signing surface intentionally narrow: callers can
create a bound key, fetch its public key, or sign one of the structured
payloads accepted by the crate. It does not expose a generic
arbitrary-byte signing API.
Key IDs cross into platform-specific labels, tags, and metadata paths,
so externally supplied IDs are constrained to the same auditable
namespace created by the crate: `dk_` followed by unpadded base64url for
32 bytes. Remote-control target paths are also tied to each signed
payload shape so connection proofs cannot be reused for enrollment
endpoints, or vice versa.
## What changed
- Added the `codex-device-key` workspace crate.
- Added account/client-bound key creation with stable `dk_` key IDs.
- Added strict `key_id` validation before public-key lookup or signing
reaches a provider.
- Added public-key lookup and structured signing APIs.
- Split remote-control client endpoint allowlists by connection vs
enrollment payload shape.
- Added validation for key bindings, accepted payload fields, token
expiration, and payload/key binding mismatches.
- Added flow-oriented docs on the validation helpers that gate provider
signing.
- Added protection policy and protection-class types without wiring a
platform provider yet.
- Added an unsupported default provider so platforms without an
implementation fail explicitly instead of silently falling back to
software-backed keys.
- Updated Cargo and Bazel lock metadata for the new crate and
non-platform-specific dependencies.
## Stack
This is stacked on #18428.
## Validation
- `cargo test -p codex-device-key`
- Added unit coverage for strict `key_id` validation before provider
use.
- Added unit coverage that rejects remote-control paths from the wrong
signed payload shape.
- `just bazel-lock-update`
- `just bazel-lock-check`
This add with 2 entry point:
* `reset_git_repository` that takes a directory and set it as a new git
root
* `diff_since_latest_init` this returns the diff for a given directory
since the last `reset_git_repository`
## Why
Customers need finer-grained control over allowed sandbox modes based on
the host Codex is running on. For example, they may want stricter
sandbox limits on devboxes while keeping a different default elsewhere.
Our current cloud requirements can target user/account groups, but they
cannot vary sandbox requirements by host. That makes remote development
environments awkward because the same top-level `allowed_sandbox_modes`
has to apply everywhere.
## What
Adds a new `remote_sandbox_config` section to `requirements.toml`:
```toml
allowed_sandbox_modes = ["read-only"]
[[remote_sandbox_config]]
hostname_patterns = ["*.org"]
allowed_sandbox_modes = ["read-only", "workspace-write"]
[[remote_sandbox_config]]
hostname_patterns = ["*.sh", "runner-*.ci"]
allowed_sandbox_modes = ["read-only", "danger-full-access"]
```
During requirements resolution, Codex resolves the local host name once,
preferring the machine FQDN when available and falling back to the
cleaned kernel hostname. This host classification is best effort rather
than authenticated device proof.
Each requirements source applies its first matching
`remote_sandbox_config` entry before it is merged with other sources.
The shared merge helper keeps that `apply_remote_sandbox_config` step
paired with requirements merging so new requirements sources do not have
to remember the extra call.
That preserves source precedence: a lower-precedence requirements file
with a matching `remote_sandbox_config` cannot override a
higher-precedence source that already set `allowed_sandbox_modes`.
This also wires the hostname-aware resolution through app-server,
CLI/TUI config loading, config API reads, and config layer metadata so
they all evaluate remote sandbox requirements consistently.
## Verification
- `cargo test -p codex-config remote_sandbox_config`
- `cargo test -p codex-config host_name`
- `cargo test -p codex-core
load_config_layers_applies_matching_remote_sandbox_config`
- `cargo test -p codex-core
system_remote_sandbox_config_keeps_cloud_sandbox_modes`
- `cargo test -p codex-config`
- `cargo test -p codex-core` unit tests passed; `tests/all.rs`
integration matrix was intentionally stopped after the relevant focused
tests passed
- `just fix -p codex-config`
- `just fix -p codex-core`
- `cargo check -p codex-app-server`
## Why
Cloud-hosted sessions need a way for the service that starts or manages
a thread to provide session-owned config without treating all config as
if it came from the same user/project/workspace TOML stack.
The important boundary is ownership: some values should be controlled by
the session/orchestrator, some by the authenticated user, and later some
may come from the executor. The earlier broad config-store shape made
that boundary too fuzzy and overlapped heavily with the existing
filesystem-backed config loader. This PR starts with the smaller piece
we need now: a typed session config loader that can feed the existing
config layer stack while preserving the normal precedence and merge
behavior.
## What Changed
- Added `ThreadConfigLoader` and related typed payloads in
`codex-config`.
- `SessionThreadConfig` currently supports `model_provider`,
`model_providers`, and feature flags.
- `UserThreadConfig` is present as an ownership boundary, but does not
yet add TOML-backed fields.
- `NoopThreadConfigLoader` preserves existing behavior when no external
loader is configured.
- `StaticThreadConfigLoader` supports tests and simple callers.
- Taught thread config sources to produce ordinary `ConfigLayerEntry`
values so the existing `ConfigLayerStack` remains the place where
precedence and merging happen.
- Wired the loader through `ConfigBuilder`, the config loader, and
app-server startup paths so app-server can provide session-owned config
before deriving a thread config.
- Added coverage for:
- translating typed thread config into config layers,
- inserting thread config layers into the stack at the right precedence,
- applying session-provided model provider and feature settings when
app-server derives config from thread params.
## Follow-Ups
This intentionally stops short of adding the remote/service transport.
The next pieces are expected to be:
1. Define the proto/API shape for this interface.
2. Add a client implementation that can source session config from the
service side.
## Verification
- Added unit coverage in `codex-config` for the loader and layer
conversion.
- Added `codex-core` config loader coverage for thread config layer
precedence.
- Added app-server coverage that verifies session thread config wins
over request-provided config for model provider and feature settings.
## Summary
- add a codex-uds crate with async UnixListener and UnixStream wrappers
- expose helpers for private socket directory setup and stale socket
path checks
- migrate codex-stdio-to-uds onto codex-uds and Tokio-based stdio/socket
relaying
- update the CLI stdio-to-uds command path for the async runner
## Tests
- cargo test -p codex-uds -p codex-stdio-to-uds
- cargo test -p codex-cli
- just fmt
- just fix -p codex-uds
- just fix -p codex-stdio-to-uds
- just fix -p codex-cli
- just bazel-lock-check
- git diff --check
## Summary
- Pin vulnerable npm dependencies through the existing root
`resolutions` mechanism so the lockfile moves only to patched versions.
- Refresh `pnpm-lock.yaml` for `@modelcontextprotocol/sdk`,
`handlebars`, `path-to-regexp`, `picomatch`, `minimatch`, `flatted`,
`rollup`, and `glob`.
- Bump `quinn-proto` from `0.11.13` to `0.11.14` and refresh
`MODULE.bazel.lock`.
## Testing
- `corepack pnpm --store-dir .pnpm-store install --frozen-lockfile
--ignore-scripts`
- `corepack pnpm audit --audit-level high` (passes; remaining advisories
are low/moderate)
- `corepack pnpm -r --filter ./sdk/typescript run build`
- `corepack pnpm exec eslint 'src/**/*.ts' 'tests/**/*.ts'`
- `cargo check --locked`
- `cargo build -p codex-cli`
- `bazel --output_user_root=/tmp/bazel-codex-dependabot
--ignore_all_rc_files mod deps --lockfile_mode=error`
- `just fmt`
Note: `corepack pnpm -r --filter ./sdk/typescript run test` was also
attempted after building `codex`; it is blocked on this workstation by
host-managed Codex MDM/auth state (`approval_policy` restrictions and
ChatGPT/API-key mismatch), not by this dependency change.
## Problem
The TUI still imported path utilities and config-loader symbols through
app-server-client's legacy_core facade even though those APIs already
exist in utility/config crates. This is part of our ongoing effort to
whittle away at these old dependencies.
## Solution
Rewire imports to avoid the TUI directly importing from the core crate
and instead import from common lower-level crates. This PR doesn't
include any functional changes; it's just a simple rewiring.
## Summary
Introduces a single background/control-plane agent task for ChatGPT
backend requests that do not have a thread-scoped task, with
`AuthManager` owning the default ChatGPT backend authorization decision.
Callers now ask `AuthManager` for the default ChatGPT backend
authorization header. `AuthManager` decides whether that is bearer or
background AgentAssertion based on config/internal state, while
low-level bootstrap paths can explicitly request bearer-only auth.
This PR is stacked on PR4 and focuses on the shared background task auth
plumbing plus the first tranche of backend/control-plane consumers. The
remaining callsite wiring is split into PR4.2 to keep review size down.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: https://github.com/openai/codex/pull/17980 - use task-scoped
`AgentAssertion` for downstream calls
- PR4.1: this PR - introduce AuthManager-owned background/control-plane
`AgentAssertion` auth
- PR4.2: https://github.com/openai/codex/pull/18260 - use background
task auth for additional backend/control-plane calls
## What Changed
- add background task registration and assertion minting inside
`codex-login`
- persist `agent_identity.background_task_id` separately from
per-session task state
- make `BackgroundAgentTaskManager` private to `codex-login`; call sites
do not instantiate or pass it around
- teach `AuthManager` the ChatGPT backend base URL and feature-derived
background auth mode from resolved config
- expose bearer-only helpers for bootstrap/registration/refresh-style
paths that must not use AgentAssertion
- wire `AuthManager` default ChatGPT authorization through app listing,
connector directory listing, remote plugins, MCP status/listing,
analytics, and core-skills remote calls
- preserve bearer fallback when the feature is disabled, the backend
host is unsupported, or background task registration is not available
## Validation
- `just fmt`
- `cargo check -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `cargo test -p codex-login agent_identity`
- `cargo test -p codex-model-provider bearer_auth_provider`
- `cargo test -p codex-core agent_assertion`
- `cargo test -p codex-app-server remote_control`
- `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
- `cargo test -p codex-models-manager manager::tests`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-cloud-tasks`
- `just fix -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `just fix -p codex-app-server`
- `git diff --check`
## Summary
This is the AgentAssertion downstream slice for feature-gated agent
identity support, replacing the oversized AgentAssertion slice from PR
#17807.
It isolates task-scoped downstream AgentAssertion wiring on top of the
merged PR3.1 work without re-carrying the earlier agent registration,
task registration, or task-state history.
This PR includes the task-scoped bug-fix call sites from the review:
generic file upload auth, MCP OpenAI file upload auth, and ARC monitor
auth. Broader user/control-plane calls move to PR4.1 and PR4.2.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: this PR - use task-scoped `AgentAssertion` downstream when
enabled
- PR4.1: https://github.com/openai/codex/pull/18094 - introduce
AuthManager-owned background/control-plane `AgentAssertion` auth
- PR4.2: https://github.com/openai/codex/pull/18260 - use background
task auth for additional backend/control-plane calls
## What Changed
- add AgentAssertion envelope generation in `codex-core`
- route downstream HTTP and websocket auth through AgentAssertion when
an agent task is present
- extend the model-provider auth provider so non-bearer authorization
schemes can be passed through cleanly
- make generic file uploads attach the full authorization header value
- make MCP OpenAI file uploads use the cached thread agent task
assertion when present
- make ARC monitor calls use the cached thread agent task assertion when
present
## Why
The original PR had drifted ancestry and showed a much larger diff than
the semantic change actually required. Restacking it onto PR3.1 keeps
the reviewable surface down to the downstream assertion slice.
## Validation
- `just fmt`
- `cargo check -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `cargo test -p codex-model-provider bearer_auth_provider`
- `cargo test -p codex-core agent_assertion`
- `cargo test -p codex-app-server remote_control`
- `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
- `cargo test -p codex-models-manager manager::tests`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-cloud-tasks`
- `cargo test -p codex-login agent_identity`
- `just fix -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `just fix -p codex-app-server`
- `git diff --check`
## Summary
The TUI still imported several symbols through the transitional
app-server-client `legacy_core` facade even though those symbols are
already owned by smaller crates. This PR narrows that facade by rewiring
those imports directly to their owner crates.
## Changes
No functional changes, just import rewiring. This is part of our ongoing
effort to whittle away at the `legacy_core` namespace, which represents
all of the remaining symbols that the TUI imports from the core.
## Stack
1. Base PR: #18443 stops granting ACLs on `USERPROFILE`.
2. This PR: filters additional SSH-owned profile roots discovered from
SSH config.
## Bug
The base PR removes the broadest bad grant: `USERPROFILE` itself.
That still leaves one important case. A user profile child can be
SSH-owned even when its name is not one of our fixed exclusions.
For example:
```sshconfig
Host devbox
IdentityFile ~/.keys/devbox
CertificateFile ~/.certs/devbox-cert.pub
UserKnownHostsFile ~/.known_hosts_custom
Include ~/.ssh/conf.d/*.conf
```
After profile expansion, the sandbox might see these as normal profile
children:
```text
C:\Users\me\.keys
C:\Users\me\.certs
C:\Users\me\.known_hosts_custom
C:\Users\me\.ssh
```
Those paths have another owner: OpenSSH and the tools that manage SSH
identity and host-key state. Codex should not add sandbox ACLs to them.
OpenSSH describes this dependency tree in
[`ssh_config(5)`](https://man.openbsd.org/ssh_config.5), and the client
parser follows the same shape in `readconf.c`:
- `Include` recursively reads more config files and expands globs
- `IdentityFile` and `CertificateFile` name authentication files
- `UserKnownHostsFile`, `GlobalKnownHostsFile`, and `RevokedHostKeys`
name host-key files
- `ControlPath` and `IdentityAgent` can name profile-owned sockets or
control files
- these path directives can use forms such as `~`, `%d`, and `${HOME}`
## Change
This PR adds a small SSH config dependency scanner.
It starts at:
```text
~/.ssh/config
```
Then it returns concrete paths named by `Include` and by path-valued SSH
config directives:
```text
IdentityFile
CertificateFile
UserKnownHostsFile
GlobalKnownHostsFile
RevokedHostKeys
ControlPath
IdentityAgent
```
For example:
```sshconfig
IdentityFile ~/.keys/devbox
CertificateFile ~/.certs/devbox-cert.pub
Include ~/.ssh/conf.d/*.conf
```
returns paths like:
```text
C:\Users\me\.keys\devbox
C:\Users\me\.certs\devbox-cert.pub
C:\Users\me\.ssh\conf.d\devbox.conf
```
The setup code then maps those paths back to their top-level
`USERPROFILE` child and filters matching sandbox roots out of both the
writable and readable root lists.
## Why this shape
The parser reports what SSH config references. The sandbox setup code
decides which `USERPROFILE` roots are unsafe to grant.
That keeps the policy simple:
1. expand broad profile grants
2. remove the profile root
3. remove fixed sensitive profile folders
4. remove profile folders referenced by SSH config dependencies
If a path has two possible owners, the sandbox steps back. SSH keeps
control of SSH config, keys, certificates, known-hosts files, sockets,
and included config files.
## Tests
- `cargo test -p codex-windows-sandbox --lib`
- `just bazel-lock-check`
- `just fix -p codex-windows-sandbox`
- `git diff --check`
## Summary
- Add the executor-backed RMCP stdio transport.
- Wire MCP stdio placement through the executor environment config.
- Cover local and executor-backed stdio paths with the existing MCP test
helpers.
## Stack
```text
o #18027 [6/6] Fail exec client operations after disconnect
│
@ #18212 [5/6] Wire executor-backed MCP stdio
│
o #18087 [4/6] Abstract MCP stdio server launching
│
o #18020 [3/6] Add pushed exec process events
│
o #18086 [2/6] Support piped stdin in exec process API
│
o #18085 [1/6] Add MCP server environment config
│
o main
```
---------
Co-authored-by: Codex <noreply@openai.com>
Cap the model-visible skills section to a small share of the context
window, with a fallback character budget, and keep only as many implicit
skills as fit within that budget.
Emit a non-fatal warning when enabled skills are omitted, and add a new
app-server warning notification
Record thread-start skill metrics for total enabled skills, kept skills,
and whether truncation happened
---------
Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: Codex <noreply@openai.com>
This is the first mechanical cleanup in a stack whose higher-level goal
is to enable Clippy coverage for async guards held across `.await`
points.
The follow-up commits enable Clippy's
[`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock)
lint and the configurable
[`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type)
lint for Tokio guard types. This PR handles the cases where the
underlying issue is not protected shared mutable state, but a
`tokio::sync::mpsc::UnboundedReceiver` wrapped in `Arc<Mutex<_>>` so
cloned owners can call `recv().await`.
Using a mutex for that shape forces the receiver lock guard to live
across `.await`. Switching these paths to `async-channel` gives us
cloneable `Receiver`s, so each owner can hold a receiver handle directly
and await messages without an async mutex guard.
## What changed
- In `codex-rs/code-mode`, replace the turn-message
`mpsc::UnboundedSender`/`UnboundedReceiver` plus `Arc<Mutex<Receiver>>`
with `async_channel::Sender`/`Receiver`.
- In `codex-rs/codex-api`, replace the realtime websocket event receiver
with an `async_channel::Receiver`, allowing `RealtimeWebsocketEvents`
clones to receive without locking.
- Add `async-channel` as a dependency for `codex-code-mode` and
`codex-api`, and update `Cargo.lock`.
## Verification
- The split stack was verified at the final lint-enabling head with
`just clippy`.
## Summary
- Add `codex-model-provider` as the runtime home for model-provider
behavior that does not belong in `codex-core`, `codex-login`, or
`codex-api`.
- The new crate wraps configured `ModelProviderInfo` in a
`ModelProvider` trait object that can resolve the API provider config,
provider-scoped auth manager, and request auth provider for each call.
- This centralizes provider auth behavior in one place today, and gives
us an extension point for future provider-specific auth, model listing,
request setup, and related runtime behavior.
## Tests
Ran tests manually to make sure that provider auth under different
configs still work as expected.
---------
Co-authored-by: pakrym-oai <pakrym@openai.com>
## Summary
- adds macOS Seatbelt deny rules for unreadable glob patterns
- expands unreadable glob matches on Linux and masks them in bwrap,
including canonical symlink targets
- keeps Linux glob expansion robust when `rg` is unavailable in minimal
or Bazel test environments
- adds sandbox integration coverage that runs `shell` and `exec_command`
with a `**/*.env = none` policy and verifies the secret contents do not
reach the model
## Linux glob expansion
```text
Prefer: rg --files --hidden --no-ignore --glob <pattern> -- <search-root>
Fallback: internal globset walker when rg is not installed
Failure: any other rg failure aborts sandbox construction
```
```
[permissions.workspace.filesystem]
glob_scan_max_depth = 2
[permissions.workspace.filesystem.":project_roots"]
"**/*.env" = "none"
```
This keeps the common path fast without making sandbox construction
depend on an ambient `rg` binary. If `rg` is present but fails for
another reason, the sandbox setup fails closed instead of silently
omitting deny-read masks.
## Platform support
- macOS: subprocess sandbox enforcement is handled by Seatbelt regex
deny rules
- Linux: subprocess sandbox enforcement is handled by expanding existing
glob matches and masking them in bwrap
- Windows: policy/config/direct-tool glob support is already on `main`
from #15979; Windows subprocess sandbox paths continue to fail closed
when unreadable split filesystem carveouts require runtime enforcement,
rather than silently running unsandboxed
## Stack
1. #15979 - merged: cross-platform glob deny-read
policy/config/direct-tool support for macOS, Linux, and Windows
2. This PR - macOS/Linux subprocess sandbox enforcement plus Windows
fail-closed clarification
3. #17740 - managed deny-read requirements
## Verification
- Added integration coverage for `shell` and `exec_command` glob
deny-read enforcement
- `cargo check -p codex-sandboxing -p codex-linux-sandbox --tests`
- `cargo check -p codex-core --test all`
- `cargo clippy -p codex-linux-sandbox -p codex-sandboxing --tests`
- `just bazel-lock-check`
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Stack PR3 for feature-gated agent identity support.
This PR adds per-thread agent task registration behind
`features.use_agent_identity`. Tasks are minted on the first real user
turn and cached in thread runtime state for later turns.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - this PR, original
task registration slice
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: https://github.com/openai/codex/pull/17980 - use `AgentAssertion`
downstream when enabled
## Validation
Covered as part of the local stack validation pass:
- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`
## Notes
The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
## Summary
- Add best-effort auto-upgrade for user-configured Git marketplaces
recorded in `config.toml`.
- Track the last activated Git revision with `last_revision` so
unchanged marketplace sources skip clone work.
- Trigger the upgrade from plugin startup and `plugin/list`, while
preserving existing fail-open plugin behavior with warning logs rather
than new user-visible errors.
## Details
- Remote configured marketplaces use `git ls-remote` to compare the
source/ref against the recorded revision.
- Upgrades clone into a staging directory, validate that
`.agents/plugins/marketplace.json` exists and that the manifest name
matches the configured marketplace key, then atomically activate the new
root.
- Local `.agents/plugins/marketplace.json` marketplaces remain live
filesystem state and are not auto-pulled.
- Existing non-curated plugin cache refresh is kicked after successful
marketplace root upgrades.
## Validation
- `just write-config-schema`
- `cargo test -p codex-core marketplace_upgrade`
- `cargo check -p codex-cli -p codex-app-server`
- `just fix -p codex-core`
Did not run the complete `cargo test` suite because the repo
instructions require asking before a full core workspace run.
- Add a "remote" thread store implementation
- Implement the remote thread store as a thin wrapper that makes grpc
calls to a configurable service endpoint
- Implement only the thread/list method to start
- Encode the grpc method/param shape as protobufs in the remote
implementation
A wart: the proto generation script is an "example" binary target. This
is an example target only because Cargo lets examples use
dev-dependencies, which keeps tonic-prost-build out of the normal
codex-thread-store dependency surface. A regular bin would either need
to add proto generation deps as normal runtime deps, or use a
feature-gated optional dep, which this repo’s manifest checks explicitly
reject.
Split plugin loading, marketplace, and related infrastructure out of
core into codex-core-plugins, while keeping the core-facing
configuration and orchestration flow in codex-core.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
This PR significantly improves the standalone installer experience.
The main changes are:
1. We now install the codex binary and other dependencies in a
subdirectory under CODEX_HOME.
(`CODEX_HOME/packages/standalone/releases/...`)
2. We replace the `codex.js` launcher that npm/bun rely on with logic in
the Rust binary that automatically resolves its dependencies (like
ripgrep)
## Motivation
A few design constraints pushed this work.
1. Currently, the entrypoint to codex is through `codex.js`, which
forces a node dependency to kick off our rust app. We want to move away
from this so that the entrypoint to codex does not rely on node or
external package managers.
2. Right now, the native script adds codex and its dependencies directly
to user PATH. Given that codex is likely to add more binary dependencies
than ripgrep, we want a solution which does not add arbitrary binaries
to user PATH -- the only one we want to add is the `codex` command
itself.
3. We want upgrades to be atomic. We do not want scenarios where
interrupting an upgrade command can move codex into undefined state (for
example, having a new codex binary but an old ripgrep binary). This was
~possible with the old script.
4. Currently, the Rust binary uses heuristics to determine which
installer created it. These heuristics are flaky and are tied to the
`codex.js` launcher. We need a more stable/deterministic way to
determine how the binary was installed for standalone.
5. We do not want conflicting codex installations on PATH. For example,
the user installing via npm, then installing via brew, then installing
via standalone would make it unclear which version of codex is being
launched and make it tough for us to determine the right upgrade
command.
## Design
### Standalone package layout
Standalone installs now live under `CODEX_HOME/packages/standalone`:
```text
$CODEX_HOME/
packages/
standalone/
current -> releases/0.111.0-x86_64-unknown-linux-musl
releases/
0.111.0-x86_64-unknown-linux-musl/
codex
codex-resources/
rg
```
where `standalone/current` is a symlink to a release directory.
On Windows, the release directory has the same shape, with `.exe` names
and Windows helpers in `codex-resources`:
```text
%CODEX_HOME%\
packages\
standalone\
current -> releases\0.111.0-x86_64-pc-windows-msvc
releases\
0.111.0-x86_64-pc-windows-msvc\
codex.exe
codex-resources\
rg.exe
codex-command-runner.exe
codex-windows-sandbox-setup.exe
```
This gives us:
- atomic upgrades because we can fully stage a release before switching
`standalone/current`
- a stable way for the binary to recognize a standalone install from its
canonical `current_exe()` path under CODEX_HOME
- a clean place for binary dependencies like `rg`, Windows sandbox
helpers, and, in the future, our custom `zsh` etc
### Command location
On Unix, we add a symlink at `~/.local/bin/codex` which points directly
to the `$CODEX_HOME/packages/standalone/current/codex` binary. This
becomes the main entrypoint for the CLI.
On Windows, we store the link at
`%LOCALAPPDATA%\Programs\OpenAI\Codex\bin`.
### PATH persistence
This is a tricky part of the PR, as there's no ~super reliable way to
ensure that we end up on PATH without significant tradeoffs.
Most Unix variants will have `~/.local/bin` on PATH already, which means
we *should* be fine simply registering the command there in most cases.
However, there are cases where this is not the case. In these cases, we
directly edit the profile depending on the shell we're in.
- macOS zsh: `~/.zprofile`
- macOS bash: `~/.bash_profile`
- Linux zsh: `~/.zshrc`
- Linux bash: `~/.bashrc`
- fallback: `~/.profile`
On Windows, we update the User `Path` environment variable directly and
we don't need to worry about shell profiles.
### Standalone runtime detection
This PR adds a new shared crate, `codex-install-context`, which computes
install ownership once per process and caches it in a `OnceLock`.
That context includes:
- install manager (`Standalone`, `Npm`, `Bun`, `Brew`, `Other`)
- the managed standalone release directory, when applicable
- the managed standalone `codex-resources` directory, when present
- the resolved `rg_command`
The standalone path is detected by canonicalizing `current_exe()`,
canonicalizing CODEX_HOME via `find_codex_home()`, and checking whether
the binary is running from under
`$CODEX_HOME/packages/standalone/releases`.
We intentionally do not use a release metadata file. The binary path is
the source of truth.
### Dependency resolution
For standalone installs, `grep_files` now resolves bundled `rg` from
`codex-resources` next to the Codex binary.
For npm/bun/brew/other installs, `grep_files` falls back to resolving
`rg` from PATH.
For Windows standalone installs, Windows sandbox helpers are still found
as direct siblings when present. If they are not direct siblings, the
lookup also checks the sibling `codex-resources` directory.
### TUI update path
The TUI now has `UpdateAction::StandaloneUnix` and
`UpdateAction::StandaloneWindows`, which rerun the standalone install
commands.
Unix update command:
```sh
sh -c "curl -fsSL https://chatgpt.com/codex/install.sh | sh"
```
Windows update command:
```powershell
powershell -c "irm https://chatgpt.com/codex/install.ps1|iex"
```
The Windows updater runs PowerShell directly. We do this because `cmd
/C` would parse the `|iex` as a cmd pipeline instead of passing it to
PowerShell.
## Additional installer behavior
- standalone installs now warn about conflicting npm/bun/brew-managed
`codex` installs and offer to uninstall them
- same-version reruns do not redownload the release if it is already
staged locally
## Testing
Installer smoke tests run:
- macOS: fresh install into isolated `HOME` and `CODEX_HOME` with
`scripts/install/install.sh --release latest`
- macOS: reran the installer against the same isolated install to verify
the same-version/update path and PATH block idempotence
- macOS: verified the installed `codex --version` and bundled
`codex-resources/rg --version`
- Windows: parsed `scripts/install/install.ps1` with PowerShell via
`[scriptblock]::Create(...)`
- Windows: verified the standalone update action builds a direct
PowerShell command and does not route the `irm ...|iex` command through
`cmd /C`
---------
Co-authored-by: Codex <noreply@openai.com>
Builds on top of #17659
Move the filesystem + sqlite thread listing-related operations inside of
a local ThreadStore implementation and call ThreadStore from the places
that used to perform these filesystem/sqlite operations.
This is the first of a series of PRs that will implement the rest of the
local ThreadStore.
Testing:
- added unit tests for the thread store implementation
- adjusted some unit tests in the realtime + personality packages whose
callsites changed. Specifically I'm trying to hide ThreadMetadata inside
of the local implementation and make ThreadMetadata a sqlite
implementation detail concern rather than a public interface, preferring
the more generate StoredThread interface instead
- added a corner case test for the personality migration package that
wasn't covered by the existing test suite
- adjust the behavior of searched thread listing to run the existing
local rollout repair/backfill pass _before_ querying SQLite results, so
callers using ThreadStore::list_threads do not miss matches after a
partial metadata warm-up
## Summary
Stack PR 2 of 4 for feature-gated agent identity support.
This PR adds agent identity registration behind
`features.use_agent_identity`. It keeps the app-server protocol
unchanged and starts registration after ChatGPT auth exists rather than
requiring a client restart.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - this PR
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR4: https://github.com/openai/codex/pull/17388 - use `AgentAssertion`
downstream when enabled
## Validation
Covered as part of the local stack validation pass:
- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`
## Notes
The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
stacked on #17402.
MCP tools returned by `tool_search` (deferred tools) get registered in
our `ToolRegistry` with a different format than directly available
tools. this leads to two different ways of accessing MCP tools from our
tool catalog, only one of which works for each. fix this by registering
all MCP tools with the namespace format, since this info is already
available.
also, direct MCP tools are registered to responsesapi without a
namespace, while deferred MCP tools have a namespace. this means we can
receive MCP `FunctionCall`s in both formats from namespaces. fix this by
always registering MCP tools with namespace, regardless of deferral
status.
make code mode track `ToolName` provenance of tools so it can map the
literal JS function name string to the correct `ToolName` for
invocation, rather than supporting both in core.
this lets us unify to a single canonical `ToolName` representation for
each MCP tool and force everywhere to use that one, without supporting
fallbacks.
## Summary
- Fix marketplace-add local path detection on Windows by using
`Path::is_absolute()`.
- Make marketplace-add local-source tests parse/write TOML through the
same helpers instead of raw string matching.
- Update `rand` 0.9.x to 0.9.3 and document the remaining audited `rand`
0.8.5 advisory exception.
- Refresh `MODULE.bazel.lock` after the Cargo.lock update.
## Why
Latest `main` had two independent CI blockers: marketplace-add tests
were not portable to Windows path/TOML escaping, and cargo-deny still
reported `RUSTSEC-2026-0097` after the recent rustls-webpki fix.
## Validation
- `cargo test -p codex-core marketplace_add -- --nocapture`
- `cargo deny --all-features check`
- `just bazel-lock-check`
- `just fix -p codex-core`
- `just fmt`
- `git diff --check`
## Changes
Allows sandboxes to restrict overall network access while granting
access to specific unix sockets on mac.
## Details
- `codex sandbox macos`: adds a repeatable `--allow-unix-socket` option.
- `codex-sandboxing`: threads explicit Unix socket roots into the macOS
Seatbelt profile generation.
- Preserves restricted network behavior when only Unix socket IPC is
requested, and preserves full network behavior when full network is
already enabled.
## Verification
- `cargo test -p codex-cli -p codex-sandboxing`
- `cargo build -p codex-cli --bin codex`
- verified that `codex sandbox macos --allow-unix-socket /tmp/test.sock
-- test-client` grants access as expected
Introduce a ThreadStore interface for mediating access to the filesystem
(rollout jsonl files + sqlite db) based thread storage.
In later PRs we'll move the existing fs code behind a "local"
implementation of this ThreadStore interface.
This PR should be a no-op behaviorally, it only introduces the
interface.
## Summary
- Pin Rust git patch dependencies to immutable revisions and make
cargo-deny reject unknown git and registry sources unless explicitly
allowlisted.
- Add checked-in SHA-256 coverage for the current rusty_v8 release
assets, wire those hashes into Bazel, and verify CI override downloads
before use.
- Add rusty_v8 MODULE.bazel update/check tooling plus a Bazel CI guard
so future V8 bumps cannot drift from the checked-in checksum manifest.
- Pin release/lint cargo installs and all external GitHub Actions refs
to immutable inputs.
## Future V8 bump flow
Run these after updating the resolved `v8` crate version and checksum
manifest:
```bash
python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
```
The update command rewrites the matching `rusty_v8_<crate_version>`
`http_file` SHA-256 values in `MODULE.bazel` from
`third_party/v8/rusty_v8_<crate_version>.sha256`. The check command is
also wired into Bazel CI to block drift.
## Notes
- This intentionally excludes RustSec dependency upgrades and
bubblewrap-related changes per request.
- The branch was rebased onto the latest origin/main before opening the
PR.
## Validation
- cargo fetch --locked
- cargo deny check advisories
- cargo deny check
- cargo deny check sources
- python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
- python3 .github/scripts/rusty_v8_bazel.py update-module-bazel
- python3 -m unittest discover -s .github/scripts -p
'test_rusty_v8_bazel.py'
- python3 -m py_compile .github/scripts/rusty_v8_bazel.py
.github/scripts/rusty_v8_module_bazel.py
.github/scripts/test_rusty_v8_bazel.py
- repo-wide GitHub Actions `uses:` audit: all external action refs are
pinned to 40-character SHAs
- yq eval on touched workflows and local actions
- git diff --check
- just bazel-lock-check
## Hash verification
- Confirmed `MODULE.bazel` hashes match
`third_party/v8/rusty_v8_146_4_0.sha256`.
- Confirmed GitHub release asset digests for denoland/rusty_v8
`v146.4.0` and openai/codex `rusty-v8-v146.4.0` match the checked-in
hashes.
- Streamed and SHA-256 hashed all 10 `MODULE.bazel` rusty_v8 asset URLs
locally; every downloaded byte stream matched both `MODULE.bazel` and
the checked-in manifest.
## Pin verification
- Confirmed signing-action pins match the peeled commits for their tag
comments: `sigstore/cosign-installer@v3.7.0`, `azure/login@v2`, and
`azure/trusted-signing-action@v0`.
- Pinned the remaining tag-based action refs in Bazel CI/setup:
`actions/setup-node@v6`, `facebook/install-dotslash@v2`,
`bazelbuild/setup-bazelisk@v3`, and `actions/cache/restore@v5`.
- Normalized all `bazelbuild/setup-bazelisk@v3` refs to the peeled
commit behind the annotated tag.
- Audited Cargo git dependencies: every manifest git dependency uses
`rev` only, every `Cargo.lock` git source has `?rev=<sha>#<same-sha>`,
and `cargo deny check sources` passes with `required-git-spec = "rev"`.
- Shallow-fetched each distinct git dependency repo at its pinned SHA
and verified Git reports each object as a commit.
## Summary\n- add an exec-server package-local test helper binary that
can run exec-server and fs-helper flows\n- route exec-server filesystem
tests through that helper instead of cross-crate codex helper
binaries\n- stop relying on Bazel-only extra binary wiring for these
tests\n\n## Testing\n- not run (per repo guidance for codex changes)
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add an exec-server `envPolicy` field; when present, the server starts
from its own process env and applies the shell environment policy there
- keep `env` as the exact environment for local/embedded starts, but
make it an overlay for remote unified-exec starts
- move the shell-environment-policy builder into `codex-config` so Core
and exec-server share the inherit/filter/set/include behavior
- overlay only runtime/sandbox/network deltas from Core onto the
exec-server-derived env
## Why
Remote unified exec was materializing the shell env inside Core and
forwarding the whole map to exec-server, so remote processes could
inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
base env on the executor while preserving Core-owned runtime additions
like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
sandbox marker env.
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-core --lib unified_exec::process_manager::tests`
- `cargo test -p codex-core --lib exec_env::tests`
- `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
matched 0 tests)
- `cargo test -p codex-config --lib shell_environment` (compile-only;
filter matched 0 tests)
- `just bazel-lock-update`
## Known local validation issue
- `just bazel-lock-check` is not runnable in this checkout: it invokes
`./scripts/check-module-bazel-lock.sh`, which is missing.
---------
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
Problem: After #17294 switched exec-server tests to launch the top-level
`codex exec-server` command, parallel remote exec-process cases can
flake while waiting for the child server's listen URL or transport
shutdown.
Solution: Serialize remote exec-server-backed process tests and harden
the harness so spawned servers are killed on drop and shutdown waits for
the child process to exit.