Give saturated remote full-ci enough time for turn startup, tool output polling, and rollback events to reach their existing assertions.
Co-authored-by: Codex <noreply@openai.com>
Honor pre-cancelled delegate spawns before starting work, wait for TurnStarted in the remote no-wait helper, and give serialized rollback integration tests matching nextest headroom.
Co-authored-by: Codex <noreply@openai.com>
Keep remote routing tests from stalling before the first model request by consuming Codex events while waiting for the captured tool output.
Co-authored-by: Codex <noreply@openai.com>
Keep the assertions intact while allowing rollback and remote tool-output events longer to arrive under the remote full-ci load.
Co-authored-by: Codex <noreply@openai.com>
Add an optional captured-response helper for function call output values so remote-routing tests can poll without panicking on earlier requests.
Co-authored-by: Codex <noreply@openai.com>
function_call_output already returns an owned JSON value, so the remote-routing wait helper should return it directly instead of calling cloned().
Co-authored-by: Codex <noreply@openai.com>
The remote full-ci failures are now concentrated in tests that wait for unrelated completion signals under saturated CI. The remote routing tests only assert the tool output, so submit the turn without waiting for final turn completion and poll the captured function_call_output directly. Rollback-heavy tests are serialized and use an explicit event timeout for their rollback notification.
Co-authored-by: Codex <noreply@openai.com>
Remote full-ci runs these Docker-backed environment tests against one shared exec-server, so serialize the remote-env cases and make the rollback/event assertions tolerate CI startup latency. The unified-exec network-denial tests now still require a failed end event but also accept the remote exec-server startup failure observed when the minimal container cannot start the probe command.
Co-authored-by: Codex <noreply@openai.com>
The remote full-ci stabilization changed TestCodex turn-start waiting to use wait_for_event_with_timeout directly, leaving the wait_for_event_match import unused under -D warnings.
Co-authored-by: Codex <noreply@openai.com>
Remote full-ci was past the project-root failures but still exposed remote-test startup assumptions. Emit failed unified-exec end events when process creation is rejected, make the network-denial tests use /bin/sh in minimal remote containers, align TurnStarted waits with the existing submit-turn completion timeout, and make the delayed realtime sideband handshake deterministic.
Co-authored-by: Codex <noreply@openai.com>
## Summary
In https://github.com/openai/codex/pull/21584, we disabled doctests for
crates that lack any doctests. We can enforce that property via `cargo
shear --deny-warnings`: crates that lack doctests will be flagged if
doctests are enabled, and crates with doctests will be flagged if
doctests are disabled.
A few additional notes:
- By adding `--deny-warnings`, `cargo shear` also flagged a number of
modules that were not reachable at all. Some of those have been removed.
- This PR removes a usage of `windows_modules!` (since `cargo shear` and
`rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os =
"windows")]` macros. As a consequence, many of these files exhibit churn
in this PR, since they weren't being formatted by `rustfmt` at all on
main.
- Again, to make the code more analyzable, this PR also removes some
usages of `#[path = "cwd_junction.rs"]` in favor of a more standard
module structure. The bin sidecar structure is still retained, but,
e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to
`windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`apply_patch` is now a freeform/custom tool. Keeping the old
JSON/function-style registration and parsing path left another way for
models and tests to invoke `apply_patch`, which made the tool surface
harder to reason about.
## What changed
- Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
spec, and handler support for function payloads.
- Kept `apply_patch_tool_type = freeform` as the supported model
metadata path, including Bedrock catalog metadata.
- Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
calls.
## Verification
- `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
- `cargo test -p codex-core tools::handlers::apply_patch --lib`
- `cargo test -p codex-core --test all
apply_patch_tool_executes_and_emits_patch_events`
- `cargo test -p codex-core --test all
apply_patch_reports_parse_diagnostics`
- `cargo test -p codex-exec test_apply_patch_tool`
- `just fix -p codex-core`
- `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
codex-exec`
## Summary
TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
attestation token and attach it as `x-oai-attestation` on the scoped
ChatGPT Codex request paths.

## Details
This PR teaches the Codex app-server runtime how to request and attach
an attestation token. It does not generate DeviceCheck tokens directly;
instead, it relies on the connected desktop app to advertise that it can
generate attestation and then asks that app for a fresh header value
when needed.
The flow is:
1. The Codex desktop app connects to app-server.
2. During `initialize`, the app can advertise that it supports
`requestAttestation`.
3. Before app-server calls selected ChatGPT Codex endpoints, it sends
the internal server request `attestation/generate` to the app.
4. app-server receives a pre-encoded header value back.
5. app-server forwards that value as `x-oai-attestation` on the scoped
outbound requests.
The code in this repo is mostly protocol and runtime plumbing: it adds
the app-server request/response shape, introduces an attestation
provider in core, wires that provider into Responses / compaction /
realtime setup paths, and covers the intended scoping with tests. The
signed macOS DeviceCheck generation remains owned by the desktop app PR.
## Related PR
- Codex desktop app implementation:
https://github.com/openai/openai/pull/878649
## Validation
<details>
<summary>Tests run</summary>
```sh
cargo test -p codex-app-server-protocol
cargo test -p codex-core attestation --lib
cargo test -p codex-app-server --lib attestation
```
Also ran:
```sh
just fix -p codex-core
just fix -p codex-app-server
just fix -p codex-app-server-protocol
just fmt
just write-app-server-schema
```
</details>
<details>
<summary>E2E DeviceCheck validation</summary>
First validated the signed desktop app boundary directly: launched a
packaged signed `Codex.app`, sent `attestation/generate`, decoded the
returned `v1.` attestation header, and validated the extracted
DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
`is_ok: true`.
Then ran the fuller app + app-server flow. The packaged `Codex.app`
launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
requested `attestation/generate` from the real Electron app process, and
the intercepted `/backend-api/codex/responses` traffic included
`x-oai-attestation` on both routes:
```text
GET /backend-api/codex/responses Upgrade: websocket x-oai-attestation: present
POST /backend-api/codex/responses Upgrade: none x-oai-attestation: present
```
The captured header decoded to a DeviceCheck token that also validated
with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
team `2DC432GLL2`).
</details>
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`/fast` was wired as a one-off slash command even though model metadata
now exposes service tiers as catalog data. That meant adding another
tier, such as a slower/cheaper tier, would require more hardcoded TUI
plumbing instead of letting the model catalog drive the available
commands.
This change makes service-tier commands data-driven: each advertised
`service_tiers` entry becomes a `/name` command using the catalog
description, while the request path sends the tier `id` only when the
selected model supports it.
## What Changed
- Removed the hardcoded `/fast` slash-command variant and introduced
dynamic service-tier command items in the composer and command popup.
- Added toggle behavior for service-tier commands: invoking `/name`
selects that tier, and invoking it again clears the selection.
- Preserved the existing Fast-mode keybinding/status affordances by
resolving the current model tier whose name is `fast`, while still
sending the tier request value such as `priority`.
- Persisted service-tier selections as raw request strings so non-fast
tiers can round-trip through config.
- Updated the Bedrock catalog entry to advertise fast support through
`service_tiers` with `id: "priority"` and `name: "fast"`.
- Added defensive filtering in core so unsupported selected service
tiers are omitted from `/responses` requests.
## Validation
- Added/updated coverage for dynamic service-tier slash command lookup,
popup descriptions, composer dispatch, TUI fast toggling, and
unsupported-tier omission in core request construction.
- Local tests were not run per request.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Some consumers expect conventional hyphenated HTTP headers. Codex
already sends the session and thread IDs on outbound Responses requests,
but it only uses the underscore spellings today, which makes those IDs
harder to consume in systems that normalize or reject underscore header
names.
Full context here:
https://openai.slack.com/archives/C08KCGLSPSQ/p1778248578422369
## What changed
- `build_session_headers` now emits both `session_id` and `session-id`
when a session ID is present.
- It does the same for `thread_id` and `thread-id`.
- Added regression coverage in `codex-api/tests/clients.rs` and
`core/tests/suite/client.rs` so both the lower-level client tests and
the end-to-end request tests assert the two header spellings are
present.
## Test plan
- Added header assertions in `codex-api/tests/clients.rs`.
- Added request-header assertions in `core/tests/suite/client.rs` for
both the `/v1/responses` and `/api/codex/responses` request paths.
## Summary
- enable `apply_patch_freeform` by default in the feature registry
## Why
- make the freeform `apply_patch` tool available by default when model
metadata does not explicitly opt into another mode
## Validation
- `just fmt`
- did not run tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
API-key-auth remote compaction requests should not inherit
`service_tier` from normal `/responses` turns. This path needs to match
API auth expectations, while ChatGPT-auth remote compaction should keep
reusing the shared request fields that still apply there.
This change keeps the decision inline in
`codex-rs/core/src/compact_remote.rs` only. Under API key auth, the
classic remote `/responses/compact` path now omits `service_tier`; under
ChatGPT auth, it keeps reusing the configured tier.
`codex-rs/core/src/compact_remote_v2.rs` is unchanged. The remote
compaction parity coverage and snapshots were updated to assert the
API-key omission and preserve the ChatGPT-auth behavior.
## Testing
- Updated remote compaction parity coverage in
`codex-rs/core/tests/suite/compact_remote.rs` and the corresponding
snapshots.
## Why
Remote compaction v2 consumes a normal Responses stream, but that
compaction-specific stream consumer dropped the `response.completed` id.
As a result, the `responses_websocket_response_processed` lifecycle
notification was emitted for normal turn sampling but not after a v2
remote compaction response was fully processed.
## What changed
- Return the completed response id alongside the v2 `context_compaction`
output item.
- After v2 compacted history is installed, send `response.processed`
through the same websocket session when the feature is enabled.
- Add websocket regression coverage for a remote compaction v2 request
followed by `response.processed`.
## Verification
- `cargo test -p codex-core --test all
responses_websocket_sends_response_processed_after_remote_compaction_v2
-- --nocapture`
- `cargo test -p codex-core
collect_context_compaction_output_accepts_additional_output_items --
--nocapture`
## Why
After stdio transports and provider-owned defaults exist, Codex needs a
config-backed provider that can describe more than the single legacy
`CODEX_EXEC_SERVER_URL` remote. This PR adds that provider without
activating it in product entrypoints yet, keeping parser/validation
review separate from runtime wiring.
**Stack position:** this is PR 4 of 5. It builds on PR 3's
provider/default model and adds the `environments.toml` provider used by
PR 5.
## What Changed
- Add `environment_toml.rs` as the TOML-specific home for parsing,
validation, and provider construction.
- Keep the TOML schema/provider structs private; the public constructor
added here is `EnvironmentManager::from_codex_home(...)`.
- Add `TomlEnvironmentProvider`, including validation for:
- reserved ids such as `local` and `none`
- duplicate ids
- unknown explicit defaults
- empty programs or URLs
- exactly one of `url` or `program` per configured environment
- Support websocket environments with `url = "ws://..."` / `wss://...`.
- Support stdio-command environments with `program = "..."`.
- Add helpers to load `environments.toml` from `CODEX_HOME`, but do not
wire entrypoints to call them yet.
- Add the `toml` dependency for parsing.
## Stack
- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- **4. This PR:** https://github.com/openai/codex/pull/20666 - Add
CODEX_HOME environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME
Split from original draft: https://github.com/openai/codex/pull/20508
## Validation
Not run locally; this was split out of the original draft stack.
## Documentation
This introduces the config shape for `environments.toml`; user-facing
documentation should be added before this stack is treated as a
documented public workflow.
---------
Co-authored-by: Codex <noreply@openai.com>
Route view_image through selected environments so image reads use the selected turn environment and cwd, with schema exposure limited to multi-environment toolsets.\n\nCo-authored-by: Codex <noreply@openai.com>
## Summary
`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.
This PR disables doctests with `doctest = false` for crates that lack
any doctests.
For the collection of crates below, this speeds up test execution by
>4x.
E.g., before this PR:
```
Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml
Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s]
Range (min … max): 0.418 s … 14.529 s 10 runs
```
And after:
```
Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml
Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms]
Range (min … max): 418.0 ms … 436.8 ms 10 runs
```
For a single crate, with >2x speedup, before:
```
Benchmark 1: cargo test -p codex-utils-string
Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms]
Range (min … max): 480.9 ms … 512.0 ms 10 runs
```
And after:
```
Benchmark 1: cargo test -p codex-utils-string
Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms]
Range (min … max): 206.8 ms … 221.0 ms 13 runs
```
Co-authored-by: Codex <noreply@openai.com>
## Why
Follow-up to #21180: turn diffs are operation-backed now, but a failed
`apply_patch` can still leave exact filesystem mutations behind. For
example, a move can write the destination file before failing to remove
the source. Treating the whole call as unknowable then drops a change
that Codex actually knows happened, so the emitted turn diff can drift
from the workspace.
## What changed
-
[`apply-patch`](f55724e027/codex-rs/apply-patch/src/lib.rs (L248-L345))
now returns `ApplyPatchFailure` with the exact committed prefix
accumulated before an error. If a write failure may already have mutated
the target, the delta is marked inexact instead of being reused blindly.
- Move handling now records the destination write before attempting
source removal, so a partially failed move can still report the
destination file that definitely landed
([code](f55724e027/codex-rs/apply-patch/src/lib.rs (L463-L521))).
-
[`ApplyPatchRuntime`](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L49-L67))
now accumulates committed deltas across attempts and forwards them even
when the visible tool result is failed or sandbox-denied ([runtime
path](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L223-L250)),
[event
path](f55724e027/codex-rs/core/src/tools/events.rs (L215-L225))).
- `TurnDiffTracker` now consumes committed exact deltas rather than only
fully successful patches; exact-empty failures leave the aggregate
unchanged, while inexact deltas still invalidate it.
## Verification
- Added a regression test covering a failed move that still emits the
committed destination diff:
[`apply_patch_failed_move_preserves_committed_destination_diff`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1517-L1586)).
- Kept explicit coverage that an inexact delta clears the aggregate
instead of publishing a guessed diff:
[`apply_patch_clears_aggregated_diff_after_inexact_delta`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1589-L1655)).
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- replace filesystem-based turn diff tracking with an operation-backed
accumulator
- preserve enough verified apply_patch state to render move-overwrite
cases correctly
- keep the turn/diff/updated contract intact while removing remote-only
turn-diff test skips
This takes the assumption that no 3P services rely on the output format
of `apply_patch`
## Why
For the CCA file system isolation push
---------
Co-authored-by: Codex <noreply@openai.com>
## DISCLAIMER
This is experimental and no production service must rely on this
## Why
Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.
## What changed
- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.
## Test plan
- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.
## What changed
- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.
## Validation
- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.
Based on work from Vincent K -
https://github.com/openai/codex/pull/19060
<img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
/>
## Why
Compaction rewrites the conversation context that future model turns
receive, but hooks currently have no deterministic lifecycle point
around that rewrite. This adds compact lifecycle hooks so users can
audit manual and automatic compaction, surface hook messages in the UI,
and run post-compaction follow-up without overloading tool or prompt
hooks.
## What Changed
- Added `PreCompact` and `PostCompact` hook events across hook config,
discovery, dispatch, generated schemas, app-server notifications,
analytics, and TUI hook rendering.
- Added trigger matching for compact hooks with the documented `manual`
and `auto` matcher values.
- Wired `PreCompact` before both local and remote compaction, and
`PostCompact` after successful local or remote compaction.
- Kept compact hook command input to lifecycle metadata: session id,
Codex turn id, transcript path, cwd, hook event name, model, and
trigger.
- Made compact stdout handling consistent with other hooks: plain stdout
is ignored as debug output, while malformed JSON-looking stdout is
reported as failed hook output.
- Added integration coverage for compact hook dispatch, trigger
matching, post-compact execution, and the audited behavior that
`decision:"block"` does not block compaction.
## Out of Scope
- Hook-specific compaction blocking is not implemented;
`decision:"block"` and exit-code-2 blocking semantics are intentionally
unsupported for `PreCompact`.
- Custom compaction instructions are not exposed to compact hooks in
this PR.
- Compact summaries, summary character counts, and summary previews are
not exposed to compact hooks in this PR.
## Verification
- `cargo test -p codex-hooks`
- `cargo test -p codex-core
manual_pre_compact_block_decision_does_not_block_compaction`
- `cargo test -p codex-app-server hooks_list`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-tui hooks_browser`
## Docs
The developer documentation for Codex hooks should be updated alongside
this feature to document `PreCompact` and `PostCompact`, the
`manual`/`auto` matcher values, and the compact hook payload fields.
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
## Why
Skills update notifications are app-server API behavior, but the watcher
lived in `codex-core` and surfaced through
`EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
focused on thread execution and lets app-server own both cache
invalidation and the `skills/changed` notification.
## What changed
- Added an app-server-owned skills watcher that watches local skill
roots, clears the shared skills cache, and emits `skills/changed`
directly.
- Registers skill watches from the common app-server thread listener
attach path, including direct starts, resumes, and app-server-observed
child or forked threads.
- Stores the `WatchRegistration` on `ThreadState`, so listener
replacement, thread teardown, idle unload, and app-server shutdown
deregister by dropping the RAII guard.
- Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
old core live-reload test.
- Extended the app-server skills change test to verify a cached skills
list is refreshed after a filesystem change without forcing reload.
## Validation
- `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-rollout -p codex-rollout-trace`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`
## Summary
- make resolved turn environment shell metadata optional instead of
hard-coding bash
- render environment context shells from explicit environment metadata
when present, falling back to the existing session shell
- update environment context tests for inherited PowerShell-style
fallback and explicit per-environment shell override
## Testing
- Not run (not requested; formatted with `just fmt`).
Co-authored-by: Codex <noreply@openai.com>
## Why
The core `Op::ListMcpTools` request path is no longer needed. Keeping it
around left a dead request/response surface alongside the app-server MCP
inventory APIs that own current server status listing.
## What Changed
- Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the
core handler that built the MCP snapshot response.
- Removed the now-unused `codex-mcp` snapshot wrapper/export and passive
event handling arms in rollout and MCP-server consumers.
- Updated tests that used the old op as a synchronization hook to wait
on existing startup/skills events, and deleted the plugin test that only
exercised the removed listing op.
## Validation
- `cargo test -p codex-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `cargo test -p codex-core --test all
pending_input::queued_inter_agent_mail`
- `cargo test -p codex-core --test all
rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta`
- `cargo test -p codex-core --test all
rmcp_client::stdio_image_responses`
- `just fix -p codex-core -p codex-protocol -p codex-mcp -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
## Summary
- Add a `response.processed` websocket request payload and sender for
Responses API websockets.
- Send `response.processed` from `try_run_sampling_request` after a
response completes, local turn processing succeeds, and the
session-owned feature flag is enabled.
- Add websocket coverage for both enabled and disabled feature-flag
behavior.
## Validation
- `just fmt`
- `cargo test -p codex-core response_processed`
- `cargo test -p codex-api responses_websocket`
- `cargo test -p codex-features
responses_websocket_response_processed_is_under_development`
- `git diff --check`
- `just fix -p codex-api -p codex-core -p codex-features`
- `git diff --check origin/main...HEAD`
## Summary
- resolve or inject the installation ID before core startup and pass it
through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain
`String`
- keep child sessions on the parent installation ID instead of
rediscovering it inside core
- propagate installation ID startup failures in `mcp-server` instead of
panicking
## Why
Core was still touching the filesystem on the session startup path to
discover `installation_id`. This moves that work to the outer host
boundary so core no longer depends on `codex_home` reads during session
construction.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`/responses/compact` should preserve the request-affinity fields that
apply to the active auth mode. ChatGPT-auth compact requests need the
effective `service_tier`, and compact requests for every auth mode need
the stable `prompt_cache_key`, so compaction does not quietly lose
routing or cache behavior that normal sampling already has.
This follows the request-parity direction from #20719, but keeps the net
change focused on the compact payload fields needed here.
## What changed
- Add `service_tier` and `prompt_cache_key` to the compact endpoint
input payload.
- Build the remote compact payload from the existing responses request
builder output so `Fast` still maps to `priority` when compact sends a
service tier.
- Pass the turn service tier into remote compaction, but only include it
in compact payloads for ChatGPT-backed auth.
- Keep `prompt_cache_key` on compact payloads for all auth modes.
- Add request-body diff snapshot coverage in
`core/tests/suite/compact_remote.rs` for:
- API-key auth reusing `prompt_cache_key` while omitting `service_tier`
even when `Fast` is configured.
- ChatGPT auth reusing both `service_tier` and `prompt_cache_key`.
- Drive the snapshot coverage through five varied turns: plain text,
multi-part text, tool-call continuation, image+text input, local-shell
continuation, and final-turn reasoning output.
## Verification
- Added insta snapshots for compact request-body parity against the last
normal `/responses` request after five varied turns.
- Not run locally per repo guidance; relying on GitHub CI for test
execution.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
MCP tool calls already include `session_id` in `x-codex-turn-metadata`,
but descendant threads intentionally share that value with the root
thread. Consumers that need to correlate work at the concrete thread
level also need the current `thread_id`.
## What changed
- add `thread_id` to `x-codex-turn-metadata` while preserving
`session_id` as the shared session identity
- thread the two identities separately through normal turns and spawned
review threads
- add regression coverage for resumed sessions, reserved metadata
fields, and deferred MCP tool calls
## Verification
- added focused coverage in `core/src/session/tests.rs`,
`core/src/turn_metadata_tests.rs`, and `core/tests/suite/search_tool.rs`
## Summary
Related to
https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
TLDR:
We update the meaning of session ids and thread ids:
* thread_id stays as now
* session_id become a shared id between every thread under a /root
thread (i.e. every sub-agent share the same session id)
This PR introduces an explicit `SessionId` and threads it through the
protocol/client boundary so `session_id` and `thread_id` can diverge
when they need to, while preserving compatibility for older serialized
`session_configured` events.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification
## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.
Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.
## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.
## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
## Why
`skills/list` is already exposed through app-server v2 and covered by
the app-server test suite. Keeping the separate core `Op::ListSkills`
path leaves a duplicate legacy protocol surface that no longer needs to
be maintained.
## What Changed
- Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the
core protocol.
- Deleted the corresponding core session handler and stale core
integration tests.
- Removed rollout/MCP ignore branches and protocol v1 docs references
for the deleted event/op.
- Left app-server `skills/list` and its existing coverage intact.
## Validation
- `cargo test -p codex-protocol`
- `cargo test -p codex-core --test all suite::skills`
- `cargo check -p codex-mcp-server -p codex-rollout -p
codex-rollout-trace`
- `just fix -p codex-core`
## Why
- Similar change as https://github.com/openai/codex/pull/19473.
- Without change: MCP tool calls receive
`_meta["x-codex-turn-metadata"]` with `session_id`, `turn_id`, and
`turn_started_at_unix_ms`.
- Issue: MCP servers may want the model and reasoning effort to better
understand tool-call behavior and latency relative to turn start.
## What Changed
- With change: MCP turn metadata now includes `model` and
`reasoning_effort`, propagated in `_meta["x-codex-turn-metadata"]`.
- Normal `/responses` turn metadata headers are unchanged.
## Verification
- `codex-rs/core/src/mcp_tool_call_tests.rs`
- `codex-rs/core/src/turn_metadata_tests.rs`
- `codex-rs/core/tests/suite/search_tool.rs`
## Why
We want the agent graph store to be passed down the stack as a real
dependency, the same way we already treat the thread store.
This will let us inject the agent graph store as a real dependency and
support implementations other than the local SQLite-backed one. Right
now most code instantiates a state DB and an agent graph store
just-in-time. Ideally, we would not depend on the state DB directly but
only read through the higher-level interfaces.
This change makes the dependency boundaries explicit and moves state DB
initialization to process bootstrap instead of hiding it inside local
store implementations.
## What changed
- `ThreadManager` now requires a `StateDbHandle` and an
`AgentGraphStore` at construction time instead of treating them as
optional internals.
- The local store constructors no longer lazily initialize SQLite.
Callers now initialize the state DB once per process and use that shared
handle to build:
- `LocalThreadStore`
- `LocalAgentGraphStore`
- App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
thread-manager sample) now initialize the state DB up front and inject
the resulting handle down the stack.
- `app-server` now consistently uses its process-scoped state DB handle
instead of reopening SQLite or trying to recover it from loaded threads.
- Device-key storage now reuses the shared state DB handle instead of
maintaining its own lazy opener.
- The thread archive / descendant traversal paths now use the injected
`AgentGraphStore` instead of reaching through local
thread-store-specific state.
## Verification
- `cargo check -p codex-core -p codex-thread-store -p codex-app-server
-p codex-mcp-server -p codex-thread-manager-sample --tests`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-core
thread_manager_accepts_separate_agent_graph_store_and_thread_store --
--nocapture`
- `cargo test -p codex-app-server
thread_archive_archives_spawned_descendants -- --nocapture`