## Why
`thread/fork` responses intentionally include copied history so the
caller can render the fork immediately, but `thread/started` is a
lifecycle notification. The v2 `Thread` contract says notifications
should return `turns: []`, and the fork path was reusing the response
thread directly, causing copied turns to be emitted through
`thread/started` as well.
## What Changed
- Route app-server `thread/started` notification construction through a
helper that clears `thread.turns` before sending.
- Keep `thread/fork` responses unchanged so callers still receive copied
history.
- Add persistent and ephemeral fork coverage that asserts
`thread/started` emits an empty `turns` array while the response retains
fork history.
## Testing
- `just fmt`
- `cargo test -p codex-app-server`
Fix a bug where the `turn/interrupt` RPC hangs when interrupting a turn
that has already completed.
Before this change, `turn/interrupt` requests were queued in app-server
and only answered when a later TurnAborted event arrived. If the target
turn was already complete, core treated Op::Interrupt as a no-op, so no
abort event was emitted and the RPC could hang indefinitely.
This change fixes that in two places:
* Reject turn/interrupt immediately with `INVALID_REQUEST` when the
requested turn is no longer the active turn.
* Resolve any already-accepted pending interrupt requests when the turn
reaches TurnComplete, covering the case where a turn finishes naturally
after the interrupt request is accepted but before it aborts.
I tested this by adding a failing test in
707487c063. You may view the results here:
https://github.com/openai/codex/actions/runs/24585182419/
<img width="1512" height="310" alt="CleanShot 2026-04-17 at 16 33 30@2x"
src="https://github.com/user-attachments/assets/f4a88228-b2a4-41f4-9aaa-ec82814096af"
/>
Supersedes #18735.
The scheduled rust-release-prepare workflow force-pushed
`bot/update-models-json` back to the generated models.json-only diff,
which dropped the test and snapshot updates needed for CI.
This PR keeps the latest generated `models.json` from #18735 and adds
the corresponding fixture updates:
- preserve model availability NUX in the app-server model cache fixture
- update core/TUI expectations for the new `gpt-5.4` `xhigh` default
reasoning
- refresh affected TUI chatwidget snapshots for the `gpt-5.5`
default/model copy changes
Validation run locally while preparing the fix:
- `just fmt`
- `cargo test -p codex-app-server model_list`
- `cargo test -p codex-core includes_no_effort_in_request`
- `cargo test -p codex-core
includes_default_reasoning_effort_in_request_when_defined_by_model_info`
- `cargo test -p codex-tui --lib chatwidget::tests`
- `cargo insta pending-snapshots`
---------
Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
## Why
`PermissionProfile` is becoming the canonical permissions abstraction,
but the old shape only carried optional filesystem and network fields.
It could describe allowed access, but not who is responsible for
enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy
when profiles were exported, cached, or round-tripped through app-server
APIs.
The important model change is that active permissions are now a disjoint
union over the enforcement mode. Conceptually:
```rust
pub enum PermissionProfile {
Managed {
file_system: FileSystemSandboxPolicy,
network: NetworkSandboxPolicy,
},
Disabled,
External {
network: NetworkSandboxPolicy,
},
}
```
This distinction matters because `Disabled` means Codex should apply no
outer sandbox at all, while `External` means filesystem isolation is
owned by an outside caller. Those are not equivalent to a broad managed
sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an
inner sandbox may require the outer Codex layer to use no sandbox rather
than a permissive one.
## How Existing Modeling Maps
Legacy `SandboxPolicy` remains a boundary projection, but it now maps
into the higher-fidelity profile model:
- `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed`
with restricted filesystem entries plus the corresponding network
policy.
- `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving
the “no outer sandbox” intent instead of treating it as a lax managed
sandbox.
- `ExternalSandbox { network_access }` maps to
`PermissionProfile::External { network }`, preserving external
filesystem enforcement while still carrying the active network policy.
- Split runtime policies that legacy `SandboxPolicy` cannot faithfully
express, such as managed unrestricted filesystem plus restricted
network, stay `Managed` instead of being collapsed into
`ExternalSandbox`.
- Per-command/session/turn grants remain partial overlays via
`AdditionalPermissionProfile`; full `PermissionProfile` is reserved for
complete active runtime permissions.
## What Changed
- Change active `PermissionProfile` into a tagged union: `managed`,
`disabled`, and `external`.
- Keep partial permission grants separate with
`AdditionalPermissionProfile` for command/session/turn overlays.
- Represent managed filesystem permissions as either `restricted`
entries or `unrestricted`; `glob_scan_max_depth` is non-zero when
present.
- Preserve old rollout compatibility by accepting the pre-tagged `{
network, file_system }` profile shape during deserialization.
- Preserve fidelity for important edge cases: `DangerFullAccess`
round-trips as `disabled`, `ExternalSandbox` round-trips as `external`,
and managed unrestricted filesystem + restricted network stays managed
instead of being mistaken for external enforcement.
- Preserve configured deny-read entries and bounded glob scan depth when
full profiles are projected back into runtime policies, including
unrestricted replacements that now become `:root = write` plus deny
entries.
- Regenerate the experimental app-server v2 JSON/TypeScript schema and
update the `command/exec` README example for the tagged
`permissionProfile` shape.
## Compatibility
Legacy `SandboxPolicy` remains available at config/API boundaries as the
compatibility projection. Existing rollout lines with the old
`PermissionProfile` shape continue to load. The app-server
`permissionProfile` field is experimental, so its v2 wire shape is
intentionally updated to match the higher-fidelity model.
## Verification
- `just write-app-server-schema`
- `cargo check --tests`
- `cargo test -p codex-protocol permission_profile`
- `cargo test -p codex-protocol
preserving_deny_entries_keeps_unrestricted_policy_enforceable`
- `cargo test -p codex-app-server-protocol
permission_profile_file_system_permissions`
- `cargo test -p codex-app-server-protocol serialize_client_response`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox`
- `just fix`
- `just fix -p codex-protocol`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
## Summary
- Add a remote plugin install write call that POSTs the selected remote
plugin to the ChatGPT cloud plugin API.
- Align remote install with the latest remote read contract:
`pluginName` carries the backend remote plugin id directly, for example
`plugins~Plugin_linear`, and install no longer synthesizes
`<name>@<marketplace>` ids.
- Validate remote install ids with the same character rules as remote
read, return the same install response shape as local installs, and
include mocked app-server coverage for the write path.
## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all plugin_install`
- `cargo test -p codex-core-plugins`
- `just fix -p codex-app-server`
- `just fix -p codex-core-plugins`
## Why
Device-key providers should only own platform key material. The
account/client binding used to authorize a signing payload is app-server
state, and keeping that state in provider-specific metadata makes the
same check harder to audit and harder to share across platform
implementations.
Persisting the binding in the shared state database gives the device-key
crate a platform-neutral source of truth before it asks a provider to
sign. It also lets app-server move potentially blocking key operations
off the main message processor path, which matters once providers may
wait for OS authentication prompts.
## What changed
- Add a `device_key_bindings` state migration plus `StateRuntime`
helpers keyed by `key_id`.
- Add an async `DeviceKeyBindingStore` abstraction to `codex-device-key`
and use it from `DeviceKeyStore::create` and `DeviceKeyStore::sign`.
- Keep provider calls behind async store methods and run the synchronous
provider work through `spawn_blocking`.
- Wire app-server device-key RPC handling to the SQLite-backed binding
store and spawn response/error delivery tasks for device-key requests.
- Run the turn-start tracing test on the existing larger current-thread
test harness after the larger async surface made the default test stack
too small locally.
## Validation
- `cargo test -p codex-device-key`
- `cargo test -p codex-state device_key`
- `cargo test -p codex-state`
- `cargo test -p codex-app-server device_key`
- `cargo test -p codex-app-server
message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spans`
- `cargo test -p codex-app-server`
- `just fix -p codex-device-key`
- `just fix -p codex-state`
- `just fix -p codex-app-server`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `git diff --check`
## Why
`codex-models-manager` had grown to own provider-specific concerns:
constructing OpenAI-compatible `/models` requests, resolving provider
auth, emitting request telemetry, and deciding how provider catalogs
should be sourced. That made the manager harder to reuse for providers
whose model catalog is not fetched from the OpenAI `/models` endpoint,
such as Amazon Bedrock.
This change moves provider-specific model discovery behind
provider-owned implementations, so the models manager can focus on
refresh policy, cache behavior, picker ordering, and model metadata
merging.
## What Changed
- Introduced a `ModelsManager` trait with separate `OpenAiModelsManager`
and `StaticModelsManager` implementations.
- Added `ModelsEndpointClient` so OpenAI-compatible HTTP fetching lives
outside `codex-models-manager`.
- Moved `/models` request construction, provider auth resolution,
timeout handling, and request telemetry into `codex-model-provider` via
`OpenAiModelsEndpoint`.
- Added provider-owned `models_manager(...)` construction so configured
OpenAI-compatible providers use `OpenAiModelsManager`, while
static/catalog-backed providers can return `StaticModelsManager`.
- Added an Amazon Bedrock static model catalog for the GPT OSS Bedrock
model IDs.
- Updated core/session/thread manager code and tests to depend on
`Arc<dyn ModelsManager>`.
- Moved offline model test helpers into
`codex_models_manager::test_support`.
## Metadata References
The Bedrock catalog metadata is based on the official Amazon Bedrock
OpenAI model documentation:
- [Amazon Bedrock OpenAI
models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-openai.html)
lists the Bedrock model IDs, text input/output modalities, and `128,000`
token context window for `gpt-oss-20b` and `gpt-oss-120b`.
- [Amazon Bedrock `gpt-oss-120b` model
card](https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-openai-gpt-oss-120b.html)
lists the `bedrock-runtime` model ID `openai.gpt-oss-120b-1:0`, the
`bedrock-mantle` model ID `openai.gpt-oss-120b`, text-only modalities,
and `128K` context window.
- [OpenAI `gpt-oss-120b` model
docs](https://developers.openai.com/api/docs/models/gpt-oss-120b)
document configurable reasoning effort with `low`, `medium`, and `high`,
plus text input/output modality.
The display names, default reasoning effort, and priority ordering are
Codex-local catalog choices.
## Test Plan
- Manually verified app-server model listing with an AWS profile:
```shell
CODEX_HOME="$(mktemp -d)" cargo run -p codex-app-server-test-client -- \
--codex-bin ./target/debug/codex \
-c 'model_provider="amazon-bedrock"' \
-c 'model_providers.amazon-bedrock.aws.profile="codex-bedrock"' \
-c 'model_providers.amazon-bedrock.aws.region="us-west-2"' \
model-list
```
The response returned the Bedrock catalog with `openai.gpt-oss-120b-1:0`
as the default model and `openai.gpt-oss-20b-1:0` as the second listed
model, both text-only and supporting low/medium/high reasoning effort.
## Why
AWS/Bedrock mode currently reports `account: null` with
`requiresOpenaiAuth: false` from `account/read`. That suppresses the
OpenAI-auth requirement, but it does not let app clients distinguish AWS
auth from any other non-OpenAI custom provider. For the prototype AWS
provider UX, clients need a simple provider-derived signal so they can
suppress ChatGPT/API-key login and token-refresh paths without
hardcoding Bedrock checks.
## What changed
- Adds an `aws` variant to the v2 `Account` protocol union.
- Adds `ProviderAccountKind` to `codex-model-provider` so the runtime
provider owns the app-visible account classification.
- Makes Amazon Bedrock return `ProviderAccountKind::Aws` from the
model-provider layer.
- Updates app-server `account/read` to map `ProviderAccountKind` to the
existing `GetAccountResponse` wire shape.
- Preserves the existing `account: null, requiresOpenaiAuth: false`
behavior for other non-OpenAI providers.
- Regenerates the app-server protocol schema fixtures.
- Adds coverage for provider account classification and for the Amazon
Bedrock `account/read` response.
## Testing
- `cargo test -p codex-model-provider`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server get_account_with_aws_provider`
## Notes
I attempted `just bazel-lock-update` and `just bazel-lock-check`, but
both are blocked in my local environment because `bazel` is not
installed.
Fixes#18203.
## Why
Remote TUI clients connected through `codex app-server --listen
ws://...` can receive short bursts of outbound turn and tool-output
notifications. The WebSocket transport previously used the shared
128-message channel capacity for its outbound writer queue, so a healthy
client that briefly lagged during normal output streaming could fill the
queue and be disconnected immediately.
This is a smaller mitigation than #18265: instead of adding a new
overflow/backpressure pipeline, keep the existing non-blocking router
behavior and give WebSocket clients enough bounded headroom for
realistic bursts.
## What Changed
- Added a WebSocket-only outbound writer capacity of `64 * 1024`
messages.
- Used that larger capacity only for the WebSocket data writer queue in
`codex-rs/app-server/src/transport/websocket.rs`.
- Left the shared `CHANNEL_CAPACITY` and the existing disconnect-on-full
behavior unchanged for internal/control channels and genuinely stuck
clients.
## Verification
- `cargo test -p codex-app-server
transport::tests::broadcast_does_not_block_on_slow_connection`
- Manually retried the #18203 repro prompt against the remote TUI and
confirmed it stayed connected.
## Why
App-server needs a way to fetch thread-scoped config from the remote
thread config service when the user config opts into that behavior. This
mirrors the existing experimental remote thread store endpoint while
keeping local/noop behavior as the default.
Startup paths also need to avoid silently dropping the remote config
endpoint after the first config load. The stdio app-server path
discovers the endpoint from the initial config and installs the real
thread config loader for later config builds, while in-process clients
used by TUI/exec now select the same remote loader directly from their
provided config.
## What changed
- Added `experimental_thread_config_endpoint` to `ConfigToml`, `Config`,
and `core/config.schema.json`.
- Added config parsing coverage for the new setting.
- Updated app-server startup to select `RemoteThreadConfigLoader` from
the initially loaded config, falling back to `NoopThreadConfigLoader`
when unset.
- Let `ConfigManager` replace its thread config loader after startup
discovery so later config loads use the selected loader.
- Updated in-process app-server client startup to pass
`RemoteThreadConfigLoader` when its config has
`experimental_thread_config_endpoint` set.
## Verification
- Added `experimental_thread_config_endpoint_loads_from_config_toml`.
- Added
`runtime_start_args_use_remote_thread_config_loader_when_configured`.
- Ran `cargo check -p codex-app-server --lib`.
- Ran `cargo test -p codex-app-server-client`.
## Summary
- add unix:// app-server transport backed by the shared codex-uds crate
- reuse the websocket connection loop for axum and tungstenite-backed
streams
- add codex app-server proxy to bridge stdio clients to the control
socket
- tolerate Windows UDS backends that report a missing rendezvous path as
connection refused before binding
## Tests
- cargo test -p codex-app-server
control_socket_acceptor_forwards_websocket_text_messages_and_pings
- cargo test -p codex-app-server
- just fmt
- just fix -p codex-app-server
- git -c core.fsmonitor=false diff --check
## Why
Fixes#18475. A `-c` override such as `projects.<cwd>.trust_level =
"untrusted"` is meant to be a runtime config override, but app-server
thread startup treated any non-trusted project as eligible for automatic
trust persistence when a permissive sandbox/cwd was requested. That
meant an explicit `untrusted` session override could still cause
`config.toml` to be updated with `trusted`.
## What changed
The app-server auto-trust path now runs only when the active project
trust level is unknown. Explicit `trusted` and explicit `untrusted`
values are both respected, regardless of whether they came from
persisted config or session flags.
A focused `thread/start` test now covers the explicit `untrusted` case
with a permissive sandbox request.
## Verification
- `cargo test -p codex-app-server`
- `just fix -p codex-app-server`
For callers who expect to be paginating the results for the UI, they can
now call thread/resume or thread/fork with excludeturns:true so it will
not fetch any pages of turns, and instead only set up the subscription.
That call can be immediately followed by pagination requests to
thread/turns/list to fetch pages of turns according to the UI's current
interactions.
## Summary
Lifecycle hooks currently treat `PreToolUse`, `PostToolUse`, and
`PermissionRequest` as Bash-only flows
- hook schema constrains `tool_name` to `Bash`
- hook input assumes a command-shaped `tool_input`
- core hook dispatch path passes only shell command strings
That means hooks cannot target MCP tools even though MCP tool names are
model-visible and stable
This change generalizes those hook paths so they can match and receive
payloads for MCP tools while preserving the existing Bash behavior.
## Reviewer Notes
I think these are the key files
- `codex-rs/core/src/tools/handlers/mcp.rs`
- `codex-rs/core/src/mcp_tool_call.rs`
Otherwise the changes across apply_patch, shell, and unified_exec are
mainly to rewire everything to be `tool_input` based instead of just
`command` so that it'll make sense for MCP tools.
## Changes
- Allow `PreToolUse`, `PostToolUse`, and `PermissionRequest` hook inputs
to carry arbitrary `tool_name` and `tool_input` values instead of
hard-coding `Bash` and command-only payloads.
- Add MCP hook payload support through `McpHandler`, using the
model-visible tool name from `ToolInvocation` and the raw MCP arguments
as `tool_input`.
- Include MCP tool responses in `PostToolUse` by serializing
`McpToolOutput` into the hook response payload.
- Run `PermissionRequest` hooks for MCP approval requests after
remembered approval checks and before falling back to user-facing MCP
elicitation.
- Preserve exact matching for literal hook matchers like `Bash` and
`mcp__memory__create_entities`, while keeping regex matcher support for
patterns like `mcp__memory__.*` and `mcp__.*__write.*`.
---------
Co-authored-by: Andrei Eternal <eternal@openai.com>
Co-authored-by: Codex <noreply@openai.com>
## Why
`item/permissions/requestApproval` sends a requested permission profile
to app-server clients. The core profile already stores filesystem
permissions as `entries`, but the v2 compatibility conversion used the
legacy `read`/`write` projection whenever possible and left `entries`
unset.
That made the request ambiguous for clients that consume the canonical
v2 shape: `permissions.fileSystem.entries` was missing even though
filesystem access was being requested. A client that rendered or echoed
grants from `entries` could treat the request as having no filesystem
permission entries, then return an empty or incomplete grant. The
app-server intersects responses with the original request, so omitted
filesystem permissions are denied.
## What Changed
- Populate `AdditionalFileSystemPermissions.entries` when converting
legacy read/write roots for request permission payloads, while
preserving `read` and `write` for compatibility.
- Mark `read` and `write` as transitional schema fields in the generated
app-server schema.
- Add regression coverage for the v2 conversion, the app-server
`item/permissions/requestApproval` round trip, and TUI app-server
approval conversion expectations.
- Refresh generated JSON and TypeScript schema fixtures.
## Verification
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server request_permissions_round_trip`
- `cargo test -p codex-tui
converts_request_permissions_into_granted_permissions`
- `cargo test -p codex-tui
resolves_permissions_and_user_input_through_app_server_request_id`
## Why
The current cloud-requirements failures say `workspace-managed config`,
which is ambiguous and can read like it refers to local managed config
such as `managed_config.toml`.
This code path only applies to cloud requirements, so the user-facing
message should name that source directly.
## What changed
- Updated the load failure in
[`codex-rs/cloud-requirements/src/lib.rs`](46e704d1f9/codex-rs/cloud-requirements/src/lib.rs)
to say `failed to load cloud requirements (workspace-managed policies)`.
- Updated the parse failure in the same file to use the same `cloud
requirements (workspace-managed policies)` terminology.
- Kept `workspace-managed` hyphenated because it is used as a compound
modifier.
- Updated the matching assertion in
[`codex-rs/app-server/src/codex_message_processor.rs`](46e704d1f9/codex-rs/app-server/src/codex_message_processor.rs).
- Reused `CLOUD_REQUIREMENTS_LOAD_FAILED_MESSAGE` in the
`codex-cloud-requirements` test where the test is asserting that
crate-local contract directly.
## Testing
`cargo test -p codex-cloud-requirements`
## Why
`command/exec` is another app-server entry point that can run under
caller-provided permissions. It needs to accept `PermissionProfile`
directly so command execution is not left behind on `SandboxPolicy`
while thread APIs move forward.
Command-level profiles also need to preserve the semantics clients
expect from profile-relative paths. `:cwd` and cwd-relative deny globs
should be anchored to the resolved command cwd for a command-specific
profile, while configured deny-read restrictions such as `**/*.env =
none` still need to be enforced because they can come from config or
requirements rather than the command override itself.
## What Changed
This adds `permissionProfile` to `CommandExecParams`, rejects requests
that combine it with `sandboxPolicy`, and converts accepted profiles
into the runtime filesystem/network permissions used for command
execution.
When a command supplies a profile, the app-server resolves that profile
against the command cwd instead of the thread/server cwd. It also
preserves configured deny-read entries and `globScanMaxDepth` on the
effective filesystem policy so one-off command overrides cannot drop
those read protections. The PR also updates app-server docs/schema
fixtures and adds command-exec coverage for accepted, rejected,
cwd-scoped, and deny-read-preserving profile paths.
## Verification
- `cargo test -p codex-app-server
command_exec_permission_profile_cwd_uses_command_cwd`
- `cargo test -p codex-app-server
command_profile_preserves_configured_deny_read_restrictions`
- `cargo test -p codex-app-server
command_exec_accepts_permission_profile`
- `cargo test -p codex-app-server
command_exec_rejects_sandbox_policy_with_permission_profile`
- `just fix -p codex-app-server`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18283).
* #18288
* #18287
* #18286
* #18285
* #18284
* __->__ #18283
## Summary
Support the existing hooks schema in inline TOML so hooks can be
configured from both `config.toml` and enterprise-managed
`requirements.toml` without requiring a separate `hooks.json` payload.
This gives enterprise admins a way to ship managed hook policy through
the existing requirements channel while still leaving script delivery to
MDM or other device-management tooling, and it keeps `hooks.json`
working unchanged for existing users.
This also lays the groundwork for follow-on managed filtering work such
as #15937, while continuing to respect project trust gating from #14718.
It does **not** implement `allow_managed_hooks_only` itself.
NOTE: yes, it's a bit unfortunate that the toml isn't formatted as
closely as normal to our default styling. This is because we're trying
to stay compatible with the spec for plugins/hooks that we'll need to
support & the main usecase here is embedding into requirements.toml
## What changed
- moved the shared hook serde model out of `codex-rs/hooks` into
`codex-rs/config` so the same schema can power `hooks.json`, inline
`config.toml` hooks, and managed `requirements.toml` hooks
- added `hooks` support to both `ConfigToml` and
`ConfigRequirementsToml`, including requirements-side `managed_dir` /
`windows_managed_dir`
- treated requirements-managed hooks as one constrained value via
`Constrained`, so managed hook policy is merged atomically and cannot
drift across requirement sources
- updated hook discovery to load requirements-managed hooks first, then
per-layer `hooks.json`, then per-layer inline TOML hooks, with a warning
when a single layer defines both representations
- threaded managed hook metadata through discovered handlers and exposed
requirements hooks in app-server responses, generated schemas, and
`/debug-config`
- added hook/config coverage in `codex-rs/config`, `codex-rs/hooks`,
`codex-rs/core/src/config_loader/tests.rs`, and
`codex-rs/core/tests/suite/hooks.rs`
## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-hooks`
- `cargo test -p codex-app-server config_api`
## Documentation
Companion updates are needed in the developers website repo for:
- the hooks guide
- the config reference, sample, basic, and advanced pages
- the enterprise managed configuration guide
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
## Summary
Allow the user to approve a request_permissions_tool request with the
condition that all commands in the rest of the turn are reviewed by
guardian, regardless of sandbox status.
## Testing
- [x] Added unit tests
- [x] Ran locally
## Why
`approvals_reviewer` now uses `auto_review` as the canonical config/API
value after #18504, but the Rust enum variant and nearby helper/test
names still used `GuardianSubagent` / guardian approval wording. That
made follow-up code and reviews confusing even though the external value
had already moved to Auto-review.
## What changed
- Renamed `ApprovalsReviewer::GuardianSubagent` to
`ApprovalsReviewer::AutoReview`.
- Updated protocol, app-server, config, core, TUI, exec, and analytics
test callsites.
- Renamed nearby helper/test names from guardian approval wording to
Auto-review wording where they refer to the approvals reviewer mode.
- Preserved wire compatibility:
- `auto_review` remains the canonical serialized value.
- `guardian_subagent` remains accepted as a legacy alias.
This intentionally does not rename the `[features].guardian_approval`
key, `Feature::GuardianApproval`, `core/src/guardian`, analytics event
names, or app-server Guardian review event types.
## Verification
- `cargo test -p codex-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
- `cargo test -p codex-app-server-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
- `cargo test -p codex-config approvals_reviewer`
- `cargo test -p codex-tui update_feature_flags`
- `cargo test -p codex-core permissions_instructions`
- `cargo test -p codex-tui permissions_selection`
### Why
Auto-review is the user-facing name for the approvals reviewer, but the
config/API value still exposed the old `guardian_subagent` name. That
made new configs and generated schemas point users at Guardian
terminology even though the intended product surface is Auto-review.
This PR updates the external `approvals_reviewer` value while preserving
compatibility for existing configs and clients.
### What changed
- Makes `auto_review` the canonical serialized value for
`approvals_reviewer`.
- Keeps `guardian_subagent` accepted as a legacy alias.
- Keeps `user` accepted and serialized as `user`.
- Updates generated config and app-server schemas so
`approvals_reviewer` includes:
- `user`
- `auto_review`
- `guardian_subagent`
- Updates app-server README docs for the reviewer value.
- Updates analytics and config requirements tests for the canonical
auto_review value.
### Compatibility
Existing configs and API payloads using:
```toml
approvals_reviewer = "guardian_subagent"
```
continue to load and map to the Auto-review reviewer behavior.
New serialization emits:
```toml
approvals_reviewer = "auto_review"
```
This PR intentionally does not rename the [features].guardian_approval
key or broad internal Guardian symbols. Those are split out for a
follow-up PR to keep this migration small and avoid touching large
TUI/internal surfaces.
**Verification**
cargo test -p codex-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
cargo test -p codex-app-server-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
## Why
`PermissionProfile` is becoming the canonical permissions shape shared
by core and app-server. After app-server responses expose the active
profile, clients need to be able to send that same shape back when
starting, resuming, forking, or overriding a turn instead of translating
through the legacy `sandbox`/`sandboxPolicy` shorthands.
This still needs to preserve the existing requirements/platform
enforcement model. A profile-shaped request can be downgraded or
rejected by constraints, but the server should keep the user's
elevated-access intent for project trust decisions. Turn-level profile
overrides also need to retain existing read protections, including
deny-read entries and bounded glob-scan metadata, so a permission
override cannot accidentally drop configured protections such as
`**/*.env = deny`.
## What changed
- Adds optional `permissionProfile` request fields to `thread/start`,
`thread/resume`, `thread/fork`, and `turn/start`.
- Rejects ambiguous requests that specify both `permissionProfile` and
the legacy `sandbox`/`sandboxPolicy` fields, including running-thread
resume requests.
- Converts profile-shaped overrides into core runtime filesystem/network
permissions while continuing to derive the constrained legacy sandbox
projection used by existing execution paths.
- Preserves project-trust intent for profile overrides that are
equivalent to workspace-write or full-access sandbox requests.
- Preserves existing deny-read entries and `globScanMaxDepth` when
applying turn-level `permissionProfile` overrides.
- Updates app-server docs plus generated JSON/TypeScript schema fixtures
and regression coverage.
## Verification
- `cargo test -p codex-app-server-protocol schema_fixtures`
- `cargo test -p codex-core
session_configuration_apply_permission_profile_preserves_existing_deny_read_entries`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18279).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* __->__ #18279
## Summary
Short circuit the convo if auto-review hits too many denials
## Testing
- [x] Added unit tests
---------
Co-authored-by: Codex <noreply@openai.com>
Preserve skill name/path entries whenever possible and trim descriptions
first, using round-robin character allocation so short descriptions do
not waste budget.
## Summary
This adds the structural plumbing needed for an app-server client to
approve a previously denied Guardian review and carry that approval
context into the next model turn.
This PR does not add the actual `/auto-review-denials` tool
## What Changed
- Added app-server v2 RPC `thread/approveGuardianDeniedAction`.
- Added generated JSON schema and TypeScript fixtures for
`ThreadApproveGuardianDeniedAction*`.
- Added core `Op::ApproveGuardianDeniedAction`.
- Added a core handler that validates the event is a denied Guardian
assessment and injects a developer message containing the stored denial
event JSON.
- Queues the approval context for the next turn if there is no active
turn yet.
- Added the TUI app-server bridge so `Op::ApproveGuardianDeniedAction {
event }` is routed to the app-server request.
## What This Does Not Do
- Does not add `/auto-review-denials`.
- Does not add chat widget recent-denial state.
- Does not add popup/list UI.
- Does not add a product-facing denial lookup/store.
- Does not change where Guardian denials are originally emitted or
persisted.
## Verification
- `cargo test -p codex-tui thread_approve_guardian_denied_action`
## Summary
- Teach app-server `thread/list` to accept either a single `cwd` or an
array of cwd filters, returning threads whose recorded session cwd
matches any requested path
- Add `useStateDbOnly` as an explicit opt-in fast path for callers that
want to answer `thread/list` from SQLite without scanning JSONL rollout
files
- Preserve backwards compatibility: by default, `thread/list` still
scans JSONL rollouts and repairs SQLite state
- Wire the new cwd array and SQLite-only options through app-server,
local/remote thread-store, rollout listing, generated TypeScript/schema
fixtures, proto output, and docs
## Test Plan
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-rollout`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-app-server thread_list`
- `just fmt`
- `just fix -p codex-app-server-protocol -p codex-rollout -p
codex-thread-store -p codex-app-server`
- `cargo build -p codex-cli --bin codex`
## Why
Guardian analytics includes time-to-first-token, but the Guardian
reviewer runs as a normal Codex session and `TurnCompleteEvent` did not
expose TTFT. The timing needs to flow through the standard
turn-completion protocol so Guardian review analytics can consume the
same value as the rest of the session machinery.
## What changed
Adds optional `time_to_first_token_ms` to `TurnCompleteEvent` and
populates it from `TurnTiming`. The value is carried through app-server
thread history, rollout reconstruction, TUI/app-server adapters, and
Guardian review session handling.
Guardian review analytics now captures TTFT from the reviewer
turn-complete event when available. Existing tests and fixtures are
updated to set the new optional field to `None` where TTFT is not
relevant.
## Verification
- `cargo clippy -p codex-tui --tests -- -D warnings`
- `cargo clippy -p codex-core --lib --tests -- -D warnings`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17696).
* __->__ #17696
* #17695
* #17693
* #18278
* #18953
## Why
The `PermissionProfile` migration needs app-server clients to see the
same constrained permission model that core is using at runtime. Before
this PR, thread lifecycle responses only exposed the legacy
`SandboxPolicy` shape, so clients still had to infer active permissions
from sandbox fields. That makes downstream resume, fork, and override
flows harder to make `PermissionProfile`-first.
External sandbox policies are intentionally excluded from this canonical
view. External enforcement cannot be round-tripped as a
`PermissionProfile`, and exposing a lossy root-write profile would let
clients accidentally change sandbox semantics if they echo the profile
back later.
## What changed
- Adds the app-server v2 `PermissionProfile` wire shape, including
filesystem permissions and glob scan depth metadata.
- Adds `PermissionProfileNetworkPermissions` so the profile response
does not expose active network state through the older
additional-permissions naming.
- Returns `permissionProfile` from thread start, resume, and fork
responses when the active sandbox can be represented as a
`PermissionProfile`.
- Keeps legacy `sandbox` in those responses for compatibility and
documents `permissionProfile` as canonical when present.
- Makes lifecycle `permissionProfile` nullable and returns `null` for
`ExternalSandbox` to avoid exposing a lossy profile.
- Regenerates the app-server JSON schema and TypeScript fixtures.
## Verification
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server
thread_response_permission_profile_omits_external_sandbox --
--nocapture`
- `cargo check --tests -p codex-analytics -p codex-exec -p codex-tui`
- `just fix -p codex-app-server-protocol -p codex-app-server -p
codex-analytics -p codex-exec -p codex-tui`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18278).
* #18279
* __->__ #18278
## Summary
This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode.
An AgentIdentity auth record is a standalone `auth.json` mode. When
`AuthManager::auth().await` loads that mode, it registers one
process-scoped task and stores it in runtime-only state on the auth
value. Header creation stays synchronous after that because the task is
initialized before callers receive the auth object.
This PR also removes the old feature flag path. AgentIdentity is
selected by explicit auth mode, not by a hidden flag or lazy mutation of
ChatGPT auth records.
Reference old stack: https://github.com/openai/codex/pull/17387/changes
## Design Decisions
- AgentIdentity is a real auth enum variant because it can be the only
credential in `auth.json`.
- The process task is ephemeral runtime state. It is not serialized and
is not stored in rollout/session data.
- Account/user metadata needed by existing Codex backend checks lives on
the AgentIdentity record for now.
- `is_chatgpt_auth()` remains token-specific.
- `uses_codex_backend()` is the broader predicate for ChatGPT-token auth
and AgentIdentity auth.
## Stack
1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. This PR: explicit AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate Codex backend
auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`
## Testing
Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
## Why
`Permissions` should not store a separate `PermissionProfile` that can
drift from the constrained `SandboxPolicy` and network settings. The
active profile needs to be derived from the same constrained values that
already honor `requirements.toml`.
## What changed
This adds derivation of the active `PermissionProfile` from the
constrained runtime permission settings and exposes that derived value
through config snapshots and thread state. The app-server can then
report the active profile without introducing a second source of truth.
## Verification
- `cargo test -p codex-core --test all permissions_messages --
--nocapture`
- `cargo test -p codex-core --test all request_permissions --
--nocapture`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18277).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* #18279
* #18278
* __->__ #18277
Add a temporary internal remote_plugin feature flag that merges remote
marketplaces into plugin/list and routes plugin/read through the remote
APIs when needed, while keeping pure local marketplaces working as
before.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add experimental turn/start.environments params for per-turn
environment id + cwd selections
- pass selections through core protocol ops and resolve them with
EnvironmentManager before TurnContext creation
- treat omitted selections as default behavior, empty selections as no
environment, and non-empty selections as first environment/cwd as the
turn primary
## Testing
- ran `just fmt`
- ran `just write-app-server-schema`
- not run: unit tests for this stacked PR
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Adds a process-local, in-memory cookie store for ChatGPT HTTP clients.
- Limits cookie storage and replay to a shared ChatGPT host allowlist.
- Wires the shared store into the default Codex reqwest client and
backend client.
- Shares the ChatGPT host allowlist with remote-control URL validation
to avoid drift.
- Enables reqwest cookie support and updates lockfiles.
## Summary
This PR fully reverts the previously merged Agent Identity runtime
integration from the old stack:
https://github.com/openai/codex/pull/17387/changes
It removes the Codex-side task lifecycle wiring, rollout/session
persistence, feature flag plumbing, lazy `auth.json` mutation,
background task auth paths, and request callsite changes introduced by
that stack.
This leaves the repo in a clean pre-AgentIdentity integration state so
the follow-up PRs can reintroduce the pieces in smaller reviewable
layers.
## Stack
1. This PR: full revert
2. https://github.com/openai/codex/pull/18871: move Agent Identity
business logic into a crate
3. https://github.com/openai/codex/pull/18785: add explicit
AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate auth callsites
through AuthProvider
## Testing
Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
## Why
The device-key protocol needs an app-server implementation that keeps
local key operations behind the same request-processing boundary as
other v2 APIs.
app-server owns request dispatch, transport policy, documentation, and
JSON-RPC error shaping. `codex-device-key` owns key binding, validation,
platform provider selection, and signing mechanics. Keeping the adapter
thin makes the boundary easier to review and avoids moving local
key-management details into thread orchestration code.
## What changed
- Added `DeviceKeyApi` as the app-server adapter around
`DeviceKeyStore`.
- Converted protocol protection policies, payload variants, algorithms,
and protection classes to and from the device-key crate types.
- Encoded SPKI public keys and DER signatures as base64 protocol fields.
- Routed `device/key/create`, `device/key/public`, and `device/key/sign`
through `MessageProcessor`.
- Rejected remote transports before provider access while allowing local
`stdio` and in-process callers to reach the device-key API.
- Added stdio, in-process, and websocket tests for device-key validation
and transport policy.
- Documented the device-key methods in the app-server v2 method list.
## Test coverage
- `device_key_create_rejects_empty_account_user_id`
- `in_process_allows_device_key_requests_to_reach_device_key_api`
- `device_key_methods_are_rejected_over_websocket`
## Stack
This is PR 3 of 4 in the device-key app-server stack. It is stacked on
#18429.
## Validation
- `cargo test -p codex-app-server device_key`
- `just fix -p codex-app-server`
## Why
PR #18431 exposed a Bazel clippy failure in the app-server unit-test
target across Linux, macOS, and Windows. The failing lint was
`clippy::await_holding_invalid_type`: two tracing tests serialized
access to global tracing state by holding a `tokio::sync::MutexGuard`
across awaited test work.
That serialization is still needed because the tests share
process-global tracing setup and exporter state, but it should not
require holding an async mutex guard through the whole test body.
## What changed
- Replaced the bespoke async `tracing_test_guard` helper with
`serial_test` on the two tracing tests that need global tracing
serialization.
- Removed the `#[expect(clippy::await_holding_invalid_type)]`
annotations and the lock guard callsites that Bazel clippy rejected.
## Validation
- `cargo test -p codex-app-server jsonrpc_span`
- `just fix -p codex-app-server`
- `git diff --check`
I also attempted the exact failing Bazel clippy target locally with
BuildBuddy disabled: `bazel --noexperimental_remote_repo_contents_cache
build --config=clippy --bes_backend= --remote_cache=
--experimental_remote_downloader= --
//codex-rs/app-server:app-server-unit-tests-bin`. That run did not reach
clippy because Bazel timed out downloading `libcap-2.27.tar.gz` from
`kernel.org`.
## Why
Permission approval responses must not be able to grant more access than
the tool requested. Moving this flow to `PermissionProfile` means the
comparison must be profile-shaped instead of `SandboxPolicy`-shaped, and
cwd-relative special paths such as `:cwd` and `:project_roots` must stay
anchored to the turn that produced the request.
## What changed
This implements semantic `PermissionProfile` intersection in
`codex-sandboxing` for file-system and network permissions. The
intersection accepts narrower path grants, rejects broader grants,
preserves deny-read carve-outs and glob scan depth, and materializes
cwd-dependent special-path grants to absolute paths before they can be
recorded for reuse.
The request-permissions response paths now use that intersection
consistently. App-server captures the request turn cwd before waiting
for the client response, includes that cwd in the v2 approval params,
and core stores the requested profile plus cwd for direct TUI/client
responses and Guardian decisions before recording turn- or
session-scoped grants. The TUI app-server bridge now preserves the
app-server request cwd when converting permission approval params into
core events.
## Verification
- `cargo test -p codex-sandboxing intersect_permission_profiles --
--nocapture`
- `cargo test -p codex-app-server request_permissions_response --
--nocapture`
- `cargo test -p codex-core
request_permissions_response_materializes_session_cwd_grants_before_recording
-- --nocapture`
- `cargo check -p codex-tui --tests`
- `cargo check --tests`
- `cargo test -p codex-tui
app_server_request_permissions_preserves_file_system_permissions`
## Summary
- attach the authoritative Codex thread id to MCP tool request
`_meta.threadId` for model-initiated tool calls
- attach the same thread id for manual `mcpServer/tool/call` requests
before invoking the MCP server
- cover both metadata helper behavior and the manual app-server MCP path
in tests
needed because the Rust app-server is the last place that still has
authoritative knowledge of “this model-generated MCP tool call belongs
to conversation/thread X” before the request leaves Codex and reaches
Hoopa. It adds threadId to MCP request metadata in the model-generated
tool-call path, using sess.conversation_id, and also does the same for
the manual mcpServer/tool/call path.
## Test plan
- `cargo test -p codex-core
mcp_tool_call_thread_id_meta_is_added_to_request_meta --lib`
- `cargo test -p codex-app-server
mcp_server_tool_call_returns_tool_result`
Paired Hoopa consumer PR: https://github.com/openai/openai/pull/833263