## Why
`--profile-v2 <name>` gives launchers and runtime entry points a named
profile config without making each profile duplicate the base user
config. The base `$CODEX_HOME/config.toml` still loads first, then
`$CODEX_HOME/<name>.config.toml` layers above it and becomes the active
writable user config for that session.
That keeps shared defaults, plugin/MCP setup, and managed/user
constraints in one place while letting a named profile override only the
pieces that need to differ.
## What Changed
- Added the shared `--profile-v2 <name>` runtime option with validated
plain names, now represented by `ProfileV2Name`.
- Extended config layer state so the base user config and selected
profile config are both `User` layers; APIs expose the active user layer
and merged effective user config.
- Threaded profile selection through runtime entry points: `codex`,
`codex exec`, `codex review`, `codex resume`, `codex fork`, and `codex
debug prompt-input`.
- Made user-facing config writes go to the selected profile file when
active, including TUI/settings persistence, app-server config writes,
and MCP/app tool approval persistence.
- Made plugin, marketplace, MCP, hooks, and config reload paths read
from the merged user config so base and profile layers both participate.
- Updated app-server config layer schemas to mark profile-backed user
layers.
## Limits
`--profile-v2` is still rejected for config-management subcommands such
as feature, MCP, and marketplace edits. Those paths remain tied to the
base `config.toml` until they have explicit profile-selection semantics.
Some adjacent background writes may still update base or global state
rather than the selected profile:
- marketplace auto-upgrade metadata
- automatic MCP dependency installs from skills
- remote plugin sync or uninstall config edits
- personality migration marker/default writes
## Verification
Added targeted coverage for profile name validation, layer
ordering/merging, selected-profile writes, app-server config writes,
session hot reload, plugin config merging, hooks/config fixture updates,
and MCP/app approval persistence.
---------
Co-authored-by: Codex <noreply@openai.com>
# Why
`PreToolUse.additionalContext` became model-visible after #20692, but
the hook-output spilling path from #21069 never picked up that newer
lane. As a result, oversized `PreToolUse` context could bypass the
truncation/spill treatment that already applies to the other hook
outputs Codex forwards to the model.
# What
- Run `PreToolUseOutcome.additional_contexts` through
`maybe_spill_texts(...)`
- Add an integration test proving a large `PreToolUse.additionalContext`
is replaced with a truncated preview plus spill-file pointer, while the
full text is preserved on disk.
## Why
Recent session history showed no active use of the raw `shell`,
`local_shell`, or `container.exec` execution surfaces. Keeping those
handlers/specs wired into core leaves duplicate shell execution paths
alongside the supported `shell_command` and unified exec tools.
## What changed
- Removed the raw `shell` handler/spec and its `ShellToolCallParams`
protocol helper.
- Removed the legacy `local_shell` and `container.exec` handler/spec
plumbing while preserving persisted-history compatibility for old
response items.
- Normalized model/config `default` and `local` shell selections to
`shell_command`.
- Pruned tests that exercised removed raw-shell/local-shell/apply-patch
variants and kept coverage on `shell_command`, unified exec, and
freeform `apply_patch`.
## Verification
- `git diff --check`
- `cargo test -p codex-protocol`
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::handlers::shell`
- `cargo test -p codex-core tools::spec`
- `cargo test -p codex-core tools::router`
- `cargo test -p codex-core
active_call_preserves_triggering_command_context`
- `cargo test -p codex-core guardian_tests`
- `cargo test -p codex-core --test all shell_serialization`
- `cargo test -p codex-core --test all apply_patch_cli`
- `cargo test -p codex-core --test all shell_command_`
- `cargo test -p codex-core --test all local_shell`
- `cargo test -p codex-core --test all otel::`
- `cargo test -p codex-core --test all hooks::`
- `just fix -p codex-core`
- `just fix -p codex-tools`
## Why
Stop sending duplicate `session_id`/`thread_id` headers. We only want
the hyphenated forms as `_` is rejected by some proxies
Related discussion here:
https://openai.slack.com/archives/C095U48JNL9/p1778508316923179
## What
- Keep `session-id` and `thread-id`
- Remove the underscore aliases
## Why
Spawned agents can already override `model` and `reasoning_effort`, but
they have no equivalent way to opt into a model-supported service tier.
That makes it impossible to preserve or intentionally select tiered
execution behavior when delegating work to a sub-agent, even though the
model catalog already advertises supported `service_tiers`.
## What changed
- Add optional `service_tier` to both legacy and `MultiAgentV2`
`spawn_agent` tool inputs.
- Show each picker-visible model's supported service tier ids and
descriptions in the `spawn_agent` tool guidance.
- Resolve service tier selection after the child agent's effective model
is known.
- Inherit the parent tier when omitted and still supported by the final
child model; otherwise clear it.
- Reject explicit unsupported tier requests with a model-facing error.
- Keep explicit `service_tier` usable on full-history forks, while still
honoring the existing model/reasoning fork restrictions.
- Hide `service_tier` alongside other spawn metadata when
`hide_spawn_agent_metadata` is enabled.
## Verification
Added focused coverage for:
- v1/v2 `spawn_agent` schema exposure for `service_tier`
- tier descriptions in spawn guidance
- hidden-metadata suppression
- explicit supported tier selection
- explicit unknown and unsupported tier rejection
- inherited tier preservation or clearing based on child-model support
- full-history fork acceptance for explicit service tiers in both v1 and
v2
Local Rust tests were not run in this workspace per repo guidance; the
new coverage is included for CI.
## Why
`UnavailableDummyTools` kept synthetic placeholder tools alive for
historical tool calls whose backing MCP tool was no longer available.
That path adds stale model-visible tool specs and special routing at the
point where unavailable MCP calls should use ordinary current-tool
handling. This removes the runtime backfill instead of preserving a
second compatibility lane.
## Is it safe to remove?
The unavailable tools were added in #17853 after a CS issue when a
previously-called MCP tool failed to load and was omitted from the CS
spec. Now that we have tool search, I think this is resolved:
- API merges tools from previous TST output into effective tool set so
theyre always in CS spec
- if an MCP tool surfaced by TST later becomes unavailable, the model
can still call it and it will just return model-visible error
- both TST output and function call output are dropped on compaction so
model will not remember old calls to MCP post compaction
## What changed
- Delete unavailable-tool collection, placeholder handler, router/spec
plumbing, and obsolete placeholder coverage.
- Keep `features.unavailable_dummy_tools` as a removed no-op feature
tombstone so existing configs still parse cleanly.
- Add an integration-style `tool_search` regression test showing that a
deferred MCP tool surfaced through `tool_search` still routes through
MCP and returns a model-visible tool-call error rather than `unsupported
call`.
## Verification
- `cargo test -p codex-core tool_search`
## Why
`CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
bypass the normal Responses transport by reading SSE from local files.
That kept test-only behavior wired through production client code. The
affected tests can stay hermetic by using the existing
`core_test_support::responses` mock server and passing `openai_base_url`
instead.
## What Changed
- Removed the `CODEX_RS_SSE_FIXTURE` flag,
`codex_api::stream_from_fixture`, the `env-flags` dependency, and the
checked-in SSE fixture files.
- Repointed the affected core, exec, and TUI tests at `MockServer` with
the existing SSE event constructors.
- Removed the Bazel test data plumbing for the deleted fixtures and
refreshed cargo/Bazel lock state.
## Verification
- `cargo build -p codex-cli`
- `cargo test -p codex-api`
- `cargo test -p codex-core --test all responses_api_stream_cli`
- `cargo test -p codex-core --test all
integration_creates_and_checks_session_file`
- `cargo test -p codex-exec --test all ephemeral`
- `cargo test -p codex-exec --test all resume`
- `cargo test -p codex-tui --test all
resume_startup_does_not_consume_model_availability_nux_count`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
- `git diff --check`
- make ThreadStore::update_thread_metadata accept a broad range of
metadata patches
- keep ThreadStore::append_items as raw canonical history append (no
metadata side effects)
- in the local store, write these metadata updates to a combination of
sqlite and rollout jsonl files for backwards-compat. It special cases
which fields need to go into jsonl vs sqlite vs whatever, confining the
awkwardness to just this implementation
- in remote stores we can simply persist the metadata directly to a
database, no special casing required.
- move the "implicit metadata updates triggered by appending rollout
items" from the RolloutRecorder (which is local-threadstore-specific) to
the LiveThread layer above the ThreadStore, inside of a private helper
utility called ThreadMetadataSync. LiveThread calls ThreadStore
append_items and update_metadata separately.
- Add a generic update metadata method to ThreadManager that works on
both live threads and "cold" threads
- Call that ThreadManager method from app server code, so app server
doesn't need to worry about whether the thread is live or not
## Why
`tool_search` already had solid end-to-end coverage for discovery and
follow-up execution, but it did not prove that distinct pieces of
indexed search text actually work in integration. In particular, we were
not exercising whether unique tool names, descriptions, namespaces,
underscore-expanded dynamic names, and schema-property terms were
sufficient to surface the expected deferred tools.
This change adds focused integration coverage for those term sources so
regressions in search text construction are caught by a real `TestCodex`
flow instead of only by lower-level unit tests.
## What changed
- added a small helper in `core/tests/suite/search_tool.rs` to assert
that a `tool_search_output` contains an expected namespace child tool
- added an MCP integration test that issues several `tool_search_call`s
and verifies distinct query terms match the expected app tools:
- exact tool name: `calendar_timezone_option_99`
- tool description phrase: `uploaded document`
- top-level schema property: `starts_at`
- added a dynamic-tool integration test that verifies distinct query
terms match the expected deferred dynamic tool:
- exact name: `quasar_ping_beacon`
- underscore-expanded name: `quasar ping beacon`
- description phrase: `saffron metronome`
- namespace: `orbit_ops`
- schema property: `chrono_spec`
## Validation
- `cargo test -p codex-core tool_search_matches_`
## Docs
No documentation update needed.
## Summary
This refactor makes tool handlers the owner of the specs they can
publish, so registry construction can register handlers once and
separately publish only the specs that should be model-visible.
The main motivation is deferred tools: MCP and dynamic tools still need
handlers registered up front, but deferred tools should be discoverable
through `tool_search` rather than emitted in the initial tool spec list.
## What changed
- `McpHandler` and `DynamicToolHandler` can return their own `ToolSpec`.
- `build_tool_registry_builder` now collects handlers, registers them
through the no-spec path, and publishes only non-deferred handler specs.
- Deferred MCP and dynamic tool names are combined into one
`all_deferred_tools` set that drives spec filtering, code-mode
deferred-tool signaling, and `tool_search` registration.
- `tool_search` registration now requires both deferred tools and
`namespace_tools`.
- Namespace specs are merged in `spec_plan`, preserving top-level spec
order, sorting tools within each namespace, and backfilling empty
namespace descriptions.
- Hosted web search and image-generation specs are included in the
collected spec vector before namespace merge/publication, and tool-name
tests that should not care about hosted relative order now compare sets.
## Testing
- `cargo test -p codex-core tools::spec::tests:: -- --nocapture`
- `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture`
- `cargo test -p codex-core
tools::router::tests::specs_filter_deferred_dynamic_tools --
--nocapture`
- `cargo test -p codex-core
suite::prompt_caching::prompt_tools_are_consistent_across_requests --
--nocapture`
- `just fmt`
- `just fix -p codex-core`
- `cargo test -p codex-core -- --skip
tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`
passed the library suite after skipping the known stack-overflowing unit
test.
Full `cargo test -p codex-core` currently hits a stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`;
the same focused test reproduces on `origin/main`.
## Summary
Adds include_collaboration_mode_instructions, which is a config
equivalent to include_permissions_instructions for collaboration modes.
Desired for situations where we want to disable this instruction from
entering the context
## Testing
- [x] Added unit test
## Why
The split filesystem policy stack already supports exact and glob
`access = none` read restrictions on macOS and Linux. Windows still
needed subprocess handling for those deny-read policies without claiming
enforcement from a backend that cannot provide it.
## Key finding
The unelevated restricted-token backend cannot safely enforce deny-read
overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
checks, not read denials, so this PR intentionally fails that backend
closed when deny-read overrides are present instead of claiming
unsupported enforcement.
## What changed
This PR adds the Windows deny-read enforcement layer and makes the
backend split explicit:
- Resolves Windows deny-read filesystem policy entries into concrete ACL
targets.
- Preserves exact missing paths so they can be materialized and denied
before an enforceable sandboxed process starts.
- Snapshot-expands existing glob matches into ACL targets for Windows
subprocess enforcement.
- Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
- Plans both the configured lexical path and the canonical target for
existing paths so reparse-point aliases are covered.
- Threads deny-read overrides through the elevated/logon-user Windows
sandbox backend and unified exec.
- Applies elevated deny-read ACLs synchronously before command launch
rather than delegating them to the background read-grant helper.
- Reconciles persistent deny-read ACEs per sandbox principal so policy
changes do not leave stale deny-read ACLs behind.
- Fails closed on the unelevated restricted-token backend when deny-read
overrides are present, because its `WRITE_RESTRICTED` token model is not
authoritative for read denials.
## Landed prerequisites
These prerequisite PRs are already on `main`:
1. #15979 `feat(permissions): add glob deny-read policy support`
2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
3. #17740 `feat(config): support managed deny-read requirements`
This PR targets `main` directly and contains only the Windows deny-read
enforcement layer.
## Implementation notes
- Exact deny-read paths remain enforceable on the elevated path even
when they do not exist yet: Windows materializes the missing path before
applying the deny ACE, so the sandboxed command cannot create and read
it during the same run.
- Existing exact deny paths are preserved lexically until the ACL
planner, which then adds the canonical target as a second ACL target
when needed. That keeps both the configured alias and the resolved
object covered.
- Windows ACLs do not consume Codex glob syntax directly, so glob
deny-read entries are expanded to the concrete matches that exist before
process launch.
- Glob traversal deduplicates directory visits within each pattern walk
to avoid cycles, without collapsing distinct lexical roots that happen
to resolve to the same target.
- Persistent deny-read ACL state is keyed by sandbox principal SID, so
cleanup only removes ACEs owned by the same backend principal.
- Deny-read ACEs are fail-closed on the elevated path: setup aborts if
mandatory deny-read ACL application fails.
- Unelevated restricted-token sessions reject deny-read overrides early
instead of running with a silently unenforceable read policy.
## Verification
- `cargo test -p codex-core
windows_restricted_token_rejects_unreadable_split_carveouts`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-windows-sandbox`
- GitHub Actions rerun is in progress on the pushed head.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Older sessions can contain model-warning records persisted as `user`
messages, including the unified exec process-limit warning, the
`apply_patch`-via-`exec_command` warning, and the model-mismatch
high-risk cyber fallback warning. Those warnings are no longer produced
as conversation history items, but when old sessions compact they should
still be recognized as injected context rather than preserved as real
user turns.
## What changed
- Removed `record_model_warning` and the production paths that emitted
these warning messages into conversation history.
- Added `LegacyUnifiedExecProcessLimitWarning`,
`LegacyApplyPatchExecCommandWarning`, and `LegacyModelMismatchWarning`
contextual fragments that are used only for matching old persisted
messages.
- Registered the legacy fragments with contextual user message detection
so compaction filters them through the existing fragment path.
- Added focused compaction coverage for old warning messages being
dropped during compacted-history processing.
## Testing
- `cargo test -p codex-core warning`
- `just fix -p codex-core`
## Why
`PreToolUse` already exposes `updatedInput` in its hook output schema,
but Codex currently rejects it instead of applying the rewrite. That
leaves hook authors unable to make the documented pre-execution
adjustment to a tool call before it runs.
## What
- Accept `updatedInput` from `PreToolUse` hooks when paired with
`permissionDecision: "allow"`.
- Apply the rewritten input before dispatch so the tool executes the
updated payload, not the original one.
- Preserve the stable hook-facing compatibility shapes that
participating tool handlers expose:
- Bash-like tools (`shell`, `container.exec`, `local_shell`,
`shell_command`, `exec_command`) use `{ "command": ... }`.
- `apply_patch` exposes its patch body through the same command-shaped
hook contract.
- MCP tools expose their JSON argument object directly.
- Keep each participating tool handler responsible for translating
hook-facing `updatedInput` back into its concrete invocation shape.
## Verification
Direct Bash-like rewrite coverage:
- `pre_tool_use_rewrites_shell_before_execution`
- `pre_tool_use_rewrites_container_exec_before_execution`
- `pre_tool_use_rewrites_local_shell_before_execution`
- `pre_tool_use_rewrites_shell_command_before_execution`
- `pre_tool_use_rewrites_exec_command_before_execution`
These cases assert that each supported Bash-like surface runs only the
rewritten command while the hook still observes the original `{
"command": ... }` input.
`pre_tool_use_rewrites_apply_patch_before_execution`
- Model emits one patch.
- Hook swaps in a different patch.
- Asserts only the rewritten file is created, and the hook saw the
original patch.
`pre_tool_use_rewrites_code_mode_nested_exec_command_before_execution`
- Model runs one nested shell command from code mode.
- Hook rewrites it.
- Asserts only the rewritten command runs, and the hook saw the original
nested input.
`pre_tool_use_rewrites_mcp_tool_before_execution`
- Model calls the RMCP echo tool.
- Hook rewrites the MCP arguments.
- Asserts the MCP server receives and returns the rewritten message, not
the original one.
## Summary
- create a selected-cwd filesystem sandbox context for view_image
metadata and file reads in both local and remote environments
- add a local restricted-profile regression test for the previously
unsandboxed read path
## Validation
- just fmt
- bazel test --bes_backend= --bes_results_url= --test_output=errors
--test_filter=view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads
//codex-rs/core:core-unit-tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add multi-environment apply_patch routing for both freeform and
function-call tool flows
- parse and reconcile the optional environment selector in the main
apply_patch parser, then verify against the selected environment in the
handler
- carry environment_id through runtime and approval surfaces so
remote-targeted patches stay explicit end to end
## Testing
- just fmt
- remote exec-server e2e: `cargo test -p codex-core --test all
apply_patch_multi_environment_uses_remote_executor -- --nocapture` on
dev via `scripts/test-remote-env.sh`
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
The permissions migration is making
`permissions.<profile>.network.enabled` the canonical sandbox network
bit, while proxy startup is a separate concern. Enabling network access
should not implicitly start the proxy, and users who are still on legacy
sandbox modes need a separate place to opt into proxy startup and
provide proxy-specific settings.
This follow-up to #19900 gives the network proxy its own feature surface
instead of overloading permission-profile network semantics.
## What changed
- Add an experimental `network_proxy` feature with a configurable
`[features.network_proxy]` table.
- Overlay `features.network_proxy` settings onto the configured proxy
state after permission-profile selection, so the proxy only starts when
the active `NetworkSandboxPolicy` already allows network access.
- Preserve `[experimental_network]` startup behavior independently of
the new feature flag.
## Behavior and examples
There are now three related knobs:
- `permissions.<profile>.network.enabled` controls whether the active
permission profile has network access at all.
- `features.network_proxy` enables proxy restrictions for an
already-network-enabled profile.
- Legacy `sandbox_mode` plus `[sandbox_workspace_write].network_access`
still control whether legacy `workspace-write` has network access at
all.
The rule is:
- network off + proxy flag on -> network stays off, proxy is a no-op
- network on + proxy flag off -> unrestricted direct network
- network on + proxy flag on -> network stays on, with proxy
restrictions applied
For permission profiles, the feature toggle adds proxy restrictions only
when network access is already enabled:
```toml
default_permissions = "workspace"
[permissions.workspace.filesystem]
":minimal" = "read"
[permissions.workspace.network]
enabled = true
[features]
network_proxy = true
```
If `network.enabled = false`, the same feature flag is a no-op: network
remains off and the proxy does not start.
For legacy sandbox config, `network_access` remains the master switch:
```toml
sandbox_mode = "workspace-write"
[sandbox_workspace_write]
network_access = true
[features]
network_proxy = true
```
That keeps legacy `workspace-write` network access on, but routes it
through the proxy policy. If `network_access = false`, the proxy feature
is a no-op and legacy `workspace-write` remains offline.
The same proxy opt-in can be supplied from the CLI:
```bash
codex -c 'features.network_proxy=true'
```
Additional proxy settings can be supplied when a table is needed:
```bash
codex \
-c 'features.network_proxy.enabled=true' \
-c 'features.network_proxy.enable_socks5=false'
```
The intended behavior matrix is:
| Config surface | Network setting | `features.network_proxy` | Direct
sandbox network | Proxy |
| --- | --- | --- | --- | --- |
| Permission profile | `network.enabled = false` | off | restricted |
off |
| Permission profile | `network.enabled = false` | on | restricted | off
|
| Permission profile | `network.enabled = true` | off | enabled | off |
| Permission profile | `network.enabled = true` | on | enabled | on |
| Legacy `workspace-write` | `network_access = false` | off | restricted
| off |
| Legacy `workspace-write` | `network_access = false` | on | restricted
| off |
| Legacy `workspace-write` | `network_access = true` | off | enabled |
off |
| Legacy `workspace-write` | `network_access = true` | on | enabled | on
|
`[experimental_network]` requirements remain separate from the user
feature toggle and still start the proxy on their own.
Relevant code:
-
[`features/src/feature_configs.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/features/src/feature_configs.rs#L58-L117)
defines the feature-specific proxy config.
-
[`core/src/config/mod.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L1959-L1964)
reads the feature table, and [later applies it only when network access
is already
enabled](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L2448-L2458).
## Verification
Added focused coverage for:
- keeping the proxy off when `features.network_proxy` is enabled but
sandbox network access is disabled
- the full permission-profile and legacy `workspace-write` matrix above
- preserving `[experimental_network]` startup without the feature
- reusing profile-supplied proxy settings when the feature is enabled
Ran:
- `cargo test -p codex-features`
- `cargo test -p codex-core network_proxy_feature`
- `cargo test -p codex-core
experimental_network_requirements_enable_proxy_without_feature`
## Summary
Restricts behavior of `is_known_safe_command` only to modes where it is
explicitly part of the documented behavior:
- when `environment_lacks_sandbox_protections`
- in `AskForApproval::UnlessTrusted`
Notably, as a result of this, escalations for commands that pass
`is_known_safe_commands` are no longer auto-approved in
AskForApproval::OnRequest or AskForApproval::Granular.
## Testing
- [x] Updated unit tests
- [x] Updated approvals scenario tests.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Dogfooder feedback exposed two correctness gaps in normal-loop overflow
recovery:
1. a sampling request that hit `ContextWindowExceeded` could keep
re-entering auto-compaction indefinitely if the compacted retry still
did not fit, and
2. local compact-history rebuilds flattened user messages down to text,
so an overflowing `[image, "what is this?"]` turn could be retried
without the image after compaction.
That means recovery could either fail to terminate cleanly or proceed
with a materially weakened version of the user request.
## What changed
- Move normal-loop `ContextWindowExceeded` handling into the sampling
retry loop, so successful rescue compaction consumes the provider retry
budget instead of creating an unbounded outer-turn loop.
- Keep compacted user-history rebuilds structured:
`collect_user_messages` now carries user `UserInput` content rather than
flattened strings, and `build_compacted_history` reconstructs full user
messages from that structured representation.
- Preserve image inputs while retaining the existing text-budget
truncation behavior for compacted user history.
- Preserve existing compaction-task failure handling and client-session
reset behavior while bounding repeated overflow retries.
- Add focused regression coverage for:
- recovery after a normal-loop overflow,
- retry-budget exhaustion after repeated overflow,
- local recovery preserving image + text input,
- remote recovery preserving image + text input,
- remote compaction v2 preserving image + text input, and
- compaction failure still terminating cleanly.
The main behavior changes are in `codex-rs/core/src/session/turn.rs` and
`codex-rs/core/src/compact.rs`.
## Verification
- Not run locally; relying on PR CI for this update.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
[#21736](https://github.com/openai/codex/pull/21736) introduces the
typed extension API, but the runtime does not yet carry a registry
through thread/session startup or give contributors host-owned stores to
read from. This PR wires that host-side path so later feature migrations
can move product-specific behavior behind typed contributions without
adding another bespoke seam directly to `codex-core`.
## What changed
- Thread `ExtensionRegistry<Config>` through `ThreadManager`,
`CodexSpawnArgs`, `Session`, and sub-agent spawn paths.
- Wire `ThreadStartContributor` and `ContextContributor`
- Expose the small supporting surface needed by non-core callers that
construct threads directly, including `empty_extension_registry()`
through `codex-core-api`.
This PR lands the host plumbing only: the app-server registry is still
empty, and concrete feature migrations are intended to follow
separately.
## Why
[PR #1705](https://github.com/openai/codex/pull/1705) moved
`apply_patch` execution under the configured sandbox and called out the
need for integration coverage. We already covered textual `../` escapes,
but did not have coverage for link aliases that live inside a writable
workspace while pointing at, or aliasing, files visible outside it.
This PR locks in the current sandbox boundary without changing
production write semantics. Symlink escapes into a read-only outside
root should fail and leave the outside file unchanged. Existing hard
links are characterized separately: if a user-created hard link already
exists inside the writable root, sandboxed writes preserve normal
hard-link semantics rather than replacing the link and silently breaking
that relationship.
## What Changed
- Added
`apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
to verify `apply_patch` cannot update a symlink that targets a file
outside the writable workspace.
- Added `apply_patch_cli_preserves_existing_hard_link_outside_workspace`
to verify `apply_patch` intentionally writes through an existing hard
link and does not unlink or replace it.
- Added `file_system_sandboxed_write_preserves_existing_hard_link` to
verify sandboxed `fs/writeFile` preserves an existing hard link and
writes the shared inode.
## Testing
- `cargo test -p codex-exec-server file_system_sandboxed_write`
- `cargo test -p codex-core
apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
- `cargo test -p codex-core
apply_patch_cli_preserves_existing_hard_link_outside_workspace`
- `just fix -p codex-exec-server -p codex-core`
- `just fix -p codex-core`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21819).
* #21845
* __->__ #21819
## Why
PR #21460 reverted the earlier move of skills change watching from
`codex-core` into app-server. This reapplies that boundary change so
app-server owns client-facing `skills/changed` notifications and core no
longer carries the watcher.
## What
- Restore the app-server `SkillsWatcher` and register it from thread
listener setup.
- Remove the core-owned skills watcher and its core live-reload
integration surface.
- Restore app-server coverage for `skills/changed` notifications after a
watched skill file changes.
## Validation
- `cargo test -p codex-app-server --test all
suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change
-- --exact --nocapture`
- `cargo test -p codex-core --lib --no-run`
## Summary
In https://github.com/openai/codex/pull/21584, we disabled doctests for
crates that lack any doctests. We can enforce that property via `cargo
shear --deny-warnings`: crates that lack doctests will be flagged if
doctests are enabled, and crates with doctests will be flagged if
doctests are disabled.
A few additional notes:
- By adding `--deny-warnings`, `cargo shear` also flagged a number of
modules that were not reachable at all. Some of those have been removed.
- This PR removes a usage of `windows_modules!` (since `cargo shear` and
`rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os =
"windows")]` macros. As a consequence, many of these files exhibit churn
in this PR, since they weren't being formatted by `rustfmt` at all on
main.
- Again, to make the code more analyzable, this PR also removes some
usages of `#[path = "cwd_junction.rs"]` in favor of a more standard
module structure. The bin sidecar structure is still retained, but,
e.g., `windows-sandbox-rs/src/bin/command_runner.rs` was moved to
`windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`apply_patch` is now a freeform/custom tool. Keeping the old
JSON/function-style registration and parsing path left another way for
models and tests to invoke `apply_patch`, which made the tool surface
harder to reason about.
## What changed
- Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
spec, and handler support for function payloads.
- Kept `apply_patch_tool_type = freeform` as the supported model
metadata path, including Bedrock catalog metadata.
- Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
calls.
## Verification
- `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
- `cargo test -p codex-core tools::handlers::apply_patch --lib`
- `cargo test -p codex-core --test all
apply_patch_tool_executes_and_emits_patch_events`
- `cargo test -p codex-core --test all
apply_patch_reports_parse_diagnostics`
- `cargo test -p codex-exec test_apply_patch_tool`
- `just fix -p codex-core`
- `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
codex-exec`
## Summary
TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
attestation token and attach it as `x-oai-attestation` on the scoped
ChatGPT Codex request paths.

## Details
This PR teaches the Codex app-server runtime how to request and attach
an attestation token. It does not generate DeviceCheck tokens directly;
instead, it relies on the connected desktop app to advertise that it can
generate attestation and then asks that app for a fresh header value
when needed.
The flow is:
1. The Codex desktop app connects to app-server.
2. During `initialize`, the app can advertise that it supports
`requestAttestation`.
3. Before app-server calls selected ChatGPT Codex endpoints, it sends
the internal server request `attestation/generate` to the app.
4. app-server receives a pre-encoded header value back.
5. app-server forwards that value as `x-oai-attestation` on the scoped
outbound requests.
The code in this repo is mostly protocol and runtime plumbing: it adds
the app-server request/response shape, introduces an attestation
provider in core, wires that provider into Responses / compaction /
realtime setup paths, and covers the intended scoping with tests. The
signed macOS DeviceCheck generation remains owned by the desktop app PR.
## Related PR
- Codex desktop app implementation:
https://github.com/openai/openai/pull/878649
## Validation
<details>
<summary>Tests run</summary>
```sh
cargo test -p codex-app-server-protocol
cargo test -p codex-core attestation --lib
cargo test -p codex-app-server --lib attestation
```
Also ran:
```sh
just fix -p codex-core
just fix -p codex-app-server
just fix -p codex-app-server-protocol
just fmt
just write-app-server-schema
```
</details>
<details>
<summary>E2E DeviceCheck validation</summary>
First validated the signed desktop app boundary directly: launched a
packaged signed `Codex.app`, sent `attestation/generate`, decoded the
returned `v1.` attestation header, and validated the extracted
DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
`is_ok: true`.
Then ran the fuller app + app-server flow. The packaged `Codex.app`
launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
requested `attestation/generate` from the real Electron app process, and
the intercepted `/backend-api/codex/responses` traffic included
`x-oai-attestation` on both routes:
```text
GET /backend-api/codex/responses Upgrade: websocket x-oai-attestation: present
POST /backend-api/codex/responses Upgrade: none x-oai-attestation: present
```
The captured header decoded to a DeviceCheck token that also validated
with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
team `2DC432GLL2`).
</details>
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`/fast` was wired as a one-off slash command even though model metadata
now exposes service tiers as catalog data. That meant adding another
tier, such as a slower/cheaper tier, would require more hardcoded TUI
plumbing instead of letting the model catalog drive the available
commands.
This change makes service-tier commands data-driven: each advertised
`service_tiers` entry becomes a `/name` command using the catalog
description, while the request path sends the tier `id` only when the
selected model supports it.
## What Changed
- Removed the hardcoded `/fast` slash-command variant and introduced
dynamic service-tier command items in the composer and command popup.
- Added toggle behavior for service-tier commands: invoking `/name`
selects that tier, and invoking it again clears the selection.
- Preserved the existing Fast-mode keybinding/status affordances by
resolving the current model tier whose name is `fast`, while still
sending the tier request value such as `priority`.
- Persisted service-tier selections as raw request strings so non-fast
tiers can round-trip through config.
- Updated the Bedrock catalog entry to advertise fast support through
`service_tiers` with `id: "priority"` and `name: "fast"`.
- Added defensive filtering in core so unsupported selected service
tiers are omitted from `/responses` requests.
## Validation
- Added/updated coverage for dynamic service-tier slash command lookup,
popup descriptions, composer dispatch, TUI fast toggling, and
unsupported-tier omission in core request construction.
- Local tests were not run per request.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Some consumers expect conventional hyphenated HTTP headers. Codex
already sends the session and thread IDs on outbound Responses requests,
but it only uses the underscore spellings today, which makes those IDs
harder to consume in systems that normalize or reject underscore header
names.
Full context here:
https://openai.slack.com/archives/C08KCGLSPSQ/p1778248578422369
## What changed
- `build_session_headers` now emits both `session_id` and `session-id`
when a session ID is present.
- It does the same for `thread_id` and `thread-id`.
- Added regression coverage in `codex-api/tests/clients.rs` and
`core/tests/suite/client.rs` so both the lower-level client tests and
the end-to-end request tests assert the two header spellings are
present.
## Test plan
- Added header assertions in `codex-api/tests/clients.rs`.
- Added request-header assertions in `core/tests/suite/client.rs` for
both the `/v1/responses` and `/api/codex/responses` request paths.
## Summary
- enable `apply_patch_freeform` by default in the feature registry
## Why
- make the freeform `apply_patch` tool available by default when model
metadata does not explicitly opt into another mode
## Validation
- `just fmt`
- did not run tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
API-key-auth remote compaction requests should not inherit
`service_tier` from normal `/responses` turns. This path needs to match
API auth expectations, while ChatGPT-auth remote compaction should keep
reusing the shared request fields that still apply there.
This change keeps the decision inline in
`codex-rs/core/src/compact_remote.rs` only. Under API key auth, the
classic remote `/responses/compact` path now omits `service_tier`; under
ChatGPT auth, it keeps reusing the configured tier.
`codex-rs/core/src/compact_remote_v2.rs` is unchanged. The remote
compaction parity coverage and snapshots were updated to assert the
API-key omission and preserve the ChatGPT-auth behavior.
## Testing
- Updated remote compaction parity coverage in
`codex-rs/core/tests/suite/compact_remote.rs` and the corresponding
snapshots.
## Why
Remote compaction v2 consumes a normal Responses stream, but that
compaction-specific stream consumer dropped the `response.completed` id.
As a result, the `responses_websocket_response_processed` lifecycle
notification was emitted for normal turn sampling but not after a v2
remote compaction response was fully processed.
## What changed
- Return the completed response id alongside the v2 `context_compaction`
output item.
- After v2 compacted history is installed, send `response.processed`
through the same websocket session when the feature is enabled.
- Add websocket regression coverage for a remote compaction v2 request
followed by `response.processed`.
## Verification
- `cargo test -p codex-core --test all
responses_websocket_sends_response_processed_after_remote_compaction_v2
-- --nocapture`
- `cargo test -p codex-core
collect_context_compaction_output_accepts_additional_output_items --
--nocapture`
## Why
After stdio transports and provider-owned defaults exist, Codex needs a
config-backed provider that can describe more than the single legacy
`CODEX_EXEC_SERVER_URL` remote. This PR adds that provider without
activating it in product entrypoints yet, keeping parser/validation
review separate from runtime wiring.
**Stack position:** this is PR 4 of 5. It builds on PR 3's
provider/default model and adds the `environments.toml` provider used by
PR 5.
## What Changed
- Add `environment_toml.rs` as the TOML-specific home for parsing,
validation, and provider construction.
- Keep the TOML schema/provider structs private; the public constructor
added here is `EnvironmentManager::from_codex_home(...)`.
- Add `TomlEnvironmentProvider`, including validation for:
- reserved ids such as `local` and `none`
- duplicate ids
- unknown explicit defaults
- empty programs or URLs
- exactly one of `url` or `program` per configured environment
- Support websocket environments with `url = "ws://..."` / `wss://...`.
- Support stdio-command environments with `program = "..."`.
- Add helpers to load `environments.toml` from `CODEX_HOME`, but do not
wire entrypoints to call them yet.
- Add the `toml` dependency for parsing.
## Stack
- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- **4. This PR:** https://github.com/openai/codex/pull/20666 - Add
CODEX_HOME environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME
Split from original draft: https://github.com/openai/codex/pull/20508
## Validation
Not run locally; this was split out of the original draft stack.
## Documentation
This introduces the config shape for `environments.toml`; user-facing
documentation should be added before this stack is treated as a
documented public workflow.
---------
Co-authored-by: Codex <noreply@openai.com>
Route view_image through selected environments so image reads use the selected turn environment and cwd, with schema exposure limited to multi-environment toolsets.\n\nCo-authored-by: Codex <noreply@openai.com>
## Summary
`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.
This PR disables doctests with `doctest = false` for crates that lack
any doctests.
For the collection of crates below, this speeds up test execution by
>4x.
E.g., before this PR:
```
Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml
Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s]
Range (min … max): 0.418 s … 14.529 s 10 runs
```
And after:
```
Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml
Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms]
Range (min … max): 418.0 ms … 436.8 ms 10 runs
```
For a single crate, with >2x speedup, before:
```
Benchmark 1: cargo test -p codex-utils-string
Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms]
Range (min … max): 480.9 ms … 512.0 ms 10 runs
```
And after:
```
Benchmark 1: cargo test -p codex-utils-string
Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms]
Range (min … max): 206.8 ms … 221.0 ms 13 runs
```
Co-authored-by: Codex <noreply@openai.com>
## Why
Follow-up to #21180: turn diffs are operation-backed now, but a failed
`apply_patch` can still leave exact filesystem mutations behind. For
example, a move can write the destination file before failing to remove
the source. Treating the whole call as unknowable then drops a change
that Codex actually knows happened, so the emitted turn diff can drift
from the workspace.
## What changed
-
[`apply-patch`](f55724e027/codex-rs/apply-patch/src/lib.rs (L248-L345))
now returns `ApplyPatchFailure` with the exact committed prefix
accumulated before an error. If a write failure may already have mutated
the target, the delta is marked inexact instead of being reused blindly.
- Move handling now records the destination write before attempting
source removal, so a partially failed move can still report the
destination file that definitely landed
([code](f55724e027/codex-rs/apply-patch/src/lib.rs (L463-L521))).
-
[`ApplyPatchRuntime`](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L49-L67))
now accumulates committed deltas across attempts and forwards them even
when the visible tool result is failed or sandbox-denied ([runtime
path](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L223-L250)),
[event
path](f55724e027/codex-rs/core/src/tools/events.rs (L215-L225))).
- `TurnDiffTracker` now consumes committed exact deltas rather than only
fully successful patches; exact-empty failures leave the aggregate
unchanged, while inexact deltas still invalidate it.
## Verification
- Added a regression test covering a failed move that still emits the
committed destination diff:
[`apply_patch_failed_move_preserves_committed_destination_diff`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1517-L1586)).
- Kept explicit coverage that an inexact delta clears the aggregate
instead of publishing a guessed diff:
[`apply_patch_clears_aggregated_diff_after_inexact_delta`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1589-L1655)).
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- replace filesystem-based turn diff tracking with an operation-backed
accumulator
- preserve enough verified apply_patch state to render move-overwrite
cases correctly
- keep the turn/diff/updated contract intact while removing remote-only
turn-diff test skips
This takes the assumption that no 3P services rely on the output format
of `apply_patch`
## Why
For the CCA file system isolation push
---------
Co-authored-by: Codex <noreply@openai.com>
## DISCLAIMER
This is experimental and no production service must rely on this
## Why
Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.
## What changed
- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.
## Test plan
- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.
## What changed
- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.
## Validation
- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.
Based on work from Vincent K -
https://github.com/openai/codex/pull/19060
<img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
/>
## Why
Compaction rewrites the conversation context that future model turns
receive, but hooks currently have no deterministic lifecycle point
around that rewrite. This adds compact lifecycle hooks so users can
audit manual and automatic compaction, surface hook messages in the UI,
and run post-compaction follow-up without overloading tool or prompt
hooks.
## What Changed
- Added `PreCompact` and `PostCompact` hook events across hook config,
discovery, dispatch, generated schemas, app-server notifications,
analytics, and TUI hook rendering.
- Added trigger matching for compact hooks with the documented `manual`
and `auto` matcher values.
- Wired `PreCompact` before both local and remote compaction, and
`PostCompact` after successful local or remote compaction.
- Kept compact hook command input to lifecycle metadata: session id,
Codex turn id, transcript path, cwd, hook event name, model, and
trigger.
- Made compact stdout handling consistent with other hooks: plain stdout
is ignored as debug output, while malformed JSON-looking stdout is
reported as failed hook output.
- Added integration coverage for compact hook dispatch, trigger
matching, post-compact execution, and the audited behavior that
`decision:"block"` does not block compaction.
## Out of Scope
- Hook-specific compaction blocking is not implemented;
`decision:"block"` and exit-code-2 blocking semantics are intentionally
unsupported for `PreCompact`.
- Custom compaction instructions are not exposed to compact hooks in
this PR.
- Compact summaries, summary character counts, and summary previews are
not exposed to compact hooks in this PR.
## Verification
- `cargo test -p codex-hooks`
- `cargo test -p codex-core
manual_pre_compact_block_decision_does_not_block_compaction`
- `cargo test -p codex-app-server hooks_list`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-tui hooks_browser`
## Docs
The developer documentation for Codex hooks should be updated alongside
this feature to document `PreCompact` and `PostCompact`, the
`manual`/`auto` matcher values, and the compact hook payload fields.
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
## Why
Skills update notifications are app-server API behavior, but the watcher
lived in `codex-core` and surfaced through
`EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
focused on thread execution and lets app-server own both cache
invalidation and the `skills/changed` notification.
## What changed
- Added an app-server-owned skills watcher that watches local skill
roots, clears the shared skills cache, and emits `skills/changed`
directly.
- Registers skill watches from the common app-server thread listener
attach path, including direct starts, resumes, and app-server-observed
child or forked threads.
- Stores the `WatchRegistration` on `ThreadState`, so listener
replacement, thread teardown, idle unload, and app-server shutdown
deregister by dropping the RAII guard.
- Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
old core live-reload test.
- Extended the app-server skills change test to verify a cached skills
list is refreshed after a filesystem change without forcing reload.
## Validation
- `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-rollout -p codex-rollout-trace`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`
## Summary
- make resolved turn environment shell metadata optional instead of
hard-coding bash
- render environment context shells from explicit environment metadata
when present, falling back to the existing session shell
- update environment context tests for inherited PowerShell-style
fallback and explicit per-environment shell override
## Testing
- Not run (not requested; formatted with `just fmt`).
Co-authored-by: Codex <noreply@openai.com>
## Why
The core `Op::ListMcpTools` request path is no longer needed. Keeping it
around left a dead request/response surface alongside the app-server MCP
inventory APIs that own current server status listing.
## What Changed
- Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the
core handler that built the MCP snapshot response.
- Removed the now-unused `codex-mcp` snapshot wrapper/export and passive
event handling arms in rollout and MCP-server consumers.
- Updated tests that used the old op as a synchronization hook to wait
on existing startup/skills events, and deleted the plugin test that only
exercised the removed listing op.
## Validation
- `cargo test -p codex-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `cargo test -p codex-core --test all
pending_input::queued_inter_agent_mail`
- `cargo test -p codex-core --test all
rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta`
- `cargo test -p codex-core --test all
rmcp_client::stdio_image_responses`
- `just fix -p codex-core -p codex-protocol -p codex-mcp -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
## Summary
- Add a `response.processed` websocket request payload and sender for
Responses API websockets.
- Send `response.processed` from `try_run_sampling_request` after a
response completes, local turn processing succeeds, and the
session-owned feature flag is enabled.
- Add websocket coverage for both enabled and disabled feature-flag
behavior.
## Validation
- `just fmt`
- `cargo test -p codex-core response_processed`
- `cargo test -p codex-api responses_websocket`
- `cargo test -p codex-features
responses_websocket_response_processed_is_under_development`
- `git diff --check`
- `just fix -p codex-api -p codex-core -p codex-features`
- `git diff --check origin/main...HEAD`
## Summary
- resolve or inject the installation ID before core startup and pass it
through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain
`String`
- keep child sessions on the parent installation ID instead of
rediscovering it inside core
- propagate installation ID startup failures in `mcp-server` instead of
panicking
## Why
Core was still touching the filesystem on the session startup path to
discover `installation_id`. This moves that work to the outer host
boundary so core no longer depends on `codex_home` reads during session
construction.
---------
Co-authored-by: Codex <noreply@openai.com>