3164 Commits

Author SHA1 Message Date
rhan-oai
99016ec732 [codex-analytics] plumb protocol-native review timing (#21434)
## Why

We want terminal tool review analytics, but the reducer should not stamp
review timing from its own wall clock.

This PR plumbs review timing through the real protocol and app-server
seams so downstream analytics can consume the emitter's timestamps
directly. Guardian reviews keep their enriched `started_at` /
`completed_at` analytics fields by deriving those legacy second-based
values from the same protocol-native millisecond lifecycle timestamps,
rather than sampling a separate analytics clock.

## What changed

- add `started_at_ms` to user approval request payloads
- add `started_at_ms` / `completed_at_ms` to guardian review
notifications
- preserve Guardian review `started_at` / `completed_at` enrichment from
the protocol-native timing source
- stamp typed `ServerResponse` analytics facts with app-server-observed
`completed_at_ms`
- thread the new timing fields through core, protocol, app-server, TUI,
and analytics fixtures
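
A minimal sketch of how the legacy second-based fields could be derived from the millisecond timestamps. The struct names and the floor-division rounding are assumptions for illustration, not the types or rounding used in the PR.

```rust
/// Hypothetical lifecycle timestamps in Unix milliseconds, as carried by the protocol.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReviewTimingMs {
    pub started_at_ms: i64,
    pub completed_at_ms: i64,
}

/// Hypothetical legacy analytics fields in whole Unix seconds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReviewTimingSecs {
    pub started_at: i64,
    pub completed_at: i64,
}

impl From<ReviewTimingMs> for ReviewTimingSecs {
    fn from(ms: ReviewTimingMs) -> Self {
        // Both second values come from the same millisecond source, so no
        // second clock is sampled. Floor division is an assumed rounding rule.
        Self {
            started_at: ms.started_at_ms.div_euclid(1000),
            completed_at: ms.completed_at_ms.div_euclid(1000),
        }
    }
}
```

With this shape, a review that started at 1500 ms and completed at 2999 ms would report `started_at = 1` and `completed_at = 2`.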

## Verification

- `cargo test -p codex-app-server outgoing_message --manifest-path
codex-rs/Cargo.toml`
- `cargo test -p codex-app-server-protocol guardian --manifest-path
codex-rs/Cargo.toml`
- `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml`
- `cargo test -p codex-analytics analytics_client_tests --manifest-path
codex-rs/Cargo.toml`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434).
* #18748
* __->__ #21434
* #18747
* #17090
* #17089
* #20514
2026-05-07 20:31:41 -07:00
pakrym-oai
dfa1e864a2 Send response.processed after remote compaction v2 (#21642)
## Why

Remote compaction v2 consumes a normal Responses stream, but that
compaction-specific stream consumer dropped the `response.completed` id.
As a result, the `responses_websocket_response_processed` lifecycle
notification was emitted for normal turn sampling but not after a v2
remote compaction response was fully processed.

## What changed

- Return the completed response id alongside the v2 `context_compaction`
output item.
- After v2 compacted history is installed, send `response.processed`
through the same websocket session when the feature is enabled.
- Add websocket regression coverage for a remote compaction v2 request
followed by `response.processed`.

## Verification

- `cargo test -p codex-core --test all
responses_websocket_sends_response_processed_after_remote_compaction_v2
-- --nocapture`
- `cargo test -p codex-core
collect_context_compaction_output_accepts_additional_output_items --
--nocapture`
2026-05-07 19:57:36 -07:00
starr-openai
07b695190f Add CODEX_HOME environments TOML provider (#20666)
## Why

After stdio transports and provider-owned defaults exist, Codex needs a
config-backed provider that can describe more than the single legacy
`CODEX_EXEC_SERVER_URL` remote. This PR adds that provider without
activating it in product entrypoints yet, keeping parser/validation
review separate from runtime wiring.

**Stack position:** this is PR 4 of 5. It builds on PR 3's
provider/default model and adds the `environments.toml` provider used by
PR 5.

## What Changed

- Add `environment_toml.rs` as the TOML-specific home for parsing,
validation, and provider construction.
- Keep the TOML schema/provider structs private; the public constructor
added here is `EnvironmentManager::from_codex_home(...)`.
- Add `TomlEnvironmentProvider`, including validation for:
  - reserved ids such as `local` and `none`
  - duplicate ids
  - unknown explicit defaults
  - empty programs or URLs
  - exactly one of `url` or `program` per configured environment
- Support websocket environments with `url = "ws://..."` / `wss://...`.
- Support stdio-command environments with `program = "..."`.
- Add helpers to load `environments.toml` from `CODEX_HOME`, but do not
wire entrypoints to call them yet.
- Add the `toml` dependency for parsing.
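
The validation rules listed above can be sketched roughly as follows. `EnvironmentEntry` and `validate` are hypothetical names (the PR keeps its schema structs private), and checking the default only against configured ids is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical parsed entry; not the PR's private schema struct.
pub struct EnvironmentEntry {
    pub id: String,
    pub url: Option<String>,
    pub program: Option<String>,
}

/// Reserved ids named in the PR description.
const RESERVED_IDS: [&str; 2] = ["local", "none"];

pub fn validate(entries: &[EnvironmentEntry], default: Option<&str>) -> Result<(), String> {
    let mut seen = HashSet::new();
    for entry in entries {
        if RESERVED_IDS.contains(&entry.id.as_str()) {
            return Err(format!("environment id `{}` is reserved", entry.id));
        }
        if !seen.insert(entry.id.as_str()) {
            return Err(format!("duplicate environment id `{}`", entry.id));
        }
        // Exactly one of `url` or `program` must be present and non-empty.
        match (entry.url.as_deref(), entry.program.as_deref()) {
            (Some(url), None) => {
                if url.trim().is_empty() {
                    return Err(format!("environment `{}` has an empty url", entry.id));
                }
                if !(url.starts_with("ws://") || url.starts_with("wss://")) {
                    return Err(format!("environment `{}` needs a ws:// or wss:// url", entry.id));
                }
            }
            (None, Some(program)) => {
                if program.trim().is_empty() {
                    return Err(format!("environment `{}` has an empty program", entry.id));
                }
            }
            _ => {
                return Err(format!(
                    "environment `{}` must set exactly one of `url` or `program`",
                    entry.id
                ))
            }
        }
    }
    // Assumption: an explicit default must name one of the configured ids.
    if let Some(id) = default {
        if !seen.contains(id) {
            return Err(format!("unknown default environment `{id}`"));
        }
    }
    Ok(())
}
```

Returning the first error keeps the sketch short; a real parser might collect all problems before reporting.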

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- **4. This PR:** https://github.com/openai/codex/pull/20666 - Add
CODEX_HOME environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack.

## Documentation

This introduces the config shape for `environments.toml`; user-facing
documentation should be added before this stack is treated as a
documented public workflow.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 01:37:47 +00:00
starr-openai
1bfc3d9773 Route view_image through selected environments
Route view_image through selected environments so image reads use the selected turn environment and cwd, with schema exposure limited to multi-environment toolsets.

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 01:29:03 +00:00
Tom
79ad209ce6 [codex] Remove remote thread store implementation (#21596)
Remove the remote thread-store backend and checked-in protobuf
artifacts. We've moved these into another crate that links against this
one.

Also remove the config settings for thread store backend selection,
since we'll instead pass an instantiated thread store into the core-api
crate's main entrypoint.
2026-05-08 00:02:46 +00:00
bbrown-oai
31b233c7c6 codex-otel: add configurable trace metadata (#21556)
Add Codex config for static trace span attributes and structured W3C
tracestate field upserts. The config flows through OtelSettings so
callers can attach trace metadata without touching every span call site.

Apply span attributes with an SDK span processor so every exported
trace span carries the configured metadata. Model tracestate as nested
member fields so configured keys can be upserted while unrelated
propagated state in the same member is preserved.
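
To illustrate the upsert behavior described above, here is a small sketch over a raw `tracestate` header string. The function name is hypothetical, and the `key:value;key:value` encoding of fields inside a member value is an assumption borrowed from vendor conventions, not something this commit specifies.

```rust
/// Upserts one field inside one tracestate member, leaving other members and
/// other fields of the same member untouched.
/// Assumption: member values encode fields as `key:value` pairs joined by `;`.
pub fn upsert_tracestate_field(
    header: &str,
    member_key: &str,
    field_key: &str,
    field_value: &str,
) -> String {
    // Parse `k1=v1,k2=v2` into ordered members. Entries without `=` are
    // dropped in this sketch for brevity.
    let mut members: Vec<(String, String)> = header
        .split(',')
        .map(str::trim)
        .filter(|m| !m.is_empty())
        .filter_map(|m| m.split_once('=').map(|(k, v)| (k.to_string(), v.to_string())))
        .collect();

    // Take the existing member value out of the list, if there is one.
    let existing = members
        .iter()
        .position(|(k, _)| k == member_key)
        .map(|i| members.remove(i).1)
        .unwrap_or_default();

    let mut fields: Vec<(String, String)> = existing
        .split(';')
        .filter(|f| !f.is_empty())
        .filter_map(|f| f.split_once(':').map(|(k, v)| (k.to_string(), v.to_string())))
        .collect();

    // Overwrite the configured key or append it; unrelated fields are kept.
    match fields.iter_mut().find(|(k, _)| k == field_key) {
        Some((_, v)) => *v = field_value.to_string(),
        None => fields.push((field_key.to_string(), field_value.to_string())),
    }

    let value = fields
        .iter()
        .map(|(k, v)| format!("{k}:{v}"))
        .collect::<Vec<_>>()
        .join(";");

    // W3C Trace Context asks that a modified member move to the front.
    members.insert(0, (member_key.to_string(), value));
    members
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join(",")
}
```

For example, upserting `b` to `9` in `vendor=abc,codex=a:1;b:2` keeps field `a` and the `vendor` member while rewriting only `b`.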

Validate configured tracestate before installing provider-global state,
including header-unsafe values the SDK does not reject by itself. This
keeps Codex from propagating malformed trace context from config.

Update the config schema, public docs, and OTLP loopback coverage for
config parsing, span export, propagation, and invalid-header rejection.
2026-05-07 16:06:57 -07:00
Charlie Marsh
54ef99a365 Disable empty Cargo test targets (#21584)
## Summary

`cargo test` entails running both standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
more than 4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 15:44:17 -07:00
Tom
4242bba2eb Route ThreadManager rollout path reads through thread store (#21265)
- Route ThreadManager rollout-path resume/fork through ThreadStore
history reads.
- Add in-memory store coverage proving path-addressed reads are used.

This isn't strictly necessary for the ThreadStore migration, since these
ThreadManager methods _only_ work for path-based lookups, but I'm trying
to migrate all the rollout recorder call sites to use the `ThreadStore`
where possible for consistency.
2026-05-07 11:25:25 -07:00
Tom
56823ec46b Move thread name edits to ThreadStore (#21264)
- Route live thread renames through `ThreadStore` metadata updates.
- Read resumed thread names from store metadata with legacy local
fallback preserved in the store.
2026-05-07 11:12:22 -07:00
pakrym-oai
566f2cb612 [codex] Move tool specs onto handlers (#21461)
## Why

This is the next stacked step after deleting the tool-handler kind
indirection. Specs should come from the registered handlers themselves
so registry construction has a single source of truth for handler
behavior and exposed tool definitions.

## What changed

- Added `ToolHandler::spec()` plus handler-provided parallel/code-mode
metadata, and made `ToolRegistryBuilder::register_handler` automatically
collect specs from registered handlers.
- Moved builtin tool spec construction into the corresponding handlers
and their adjacent `_spec` modules, including shell, unified exec, apply
patch, view image, request plugin install, tool search, MCP resource,
goals, planning, permissions, agent jobs, and multi-agent tools.
- Reworked configurable handlers to receive their tool-building options
through constructors, with non-optional handler options where the
handler is always spec-backed. Shell fallback handlers keep an explicit
no-spec mode because they are also registered as hidden dispatch
aliases.
- Kept `CodeModeExecuteHandler` on the explicit configured wrapper so
the code-mode exec spec can still be built from the nested registry.
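
A simplified sketch of the "handlers provide their own specs" idea. The trait and builder names follow the PR description, but every signature here, including `spec()` returning an `Option`, is an assumption made for illustration.

```rust
/// Hypothetical, heavily simplified stand-in for a tool definition.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolSpec {
    pub name: String,
}

pub trait ToolHandler {
    /// Returns the spec to expose, or `None` for a hidden dispatch alias.
    fn spec(&self) -> Option<ToolSpec>;
}

#[derive(Default)]
pub struct ToolRegistryBuilder {
    handlers: Vec<Box<dyn ToolHandler>>,
    specs: Vec<ToolSpec>,
}

impl ToolRegistryBuilder {
    /// Registering a handler also collects its spec, so handler behavior and
    /// exposed definitions come from a single place.
    pub fn register_handler(&mut self, handler: Box<dyn ToolHandler>) {
        if let Some(spec) = handler.spec() {
            self.specs.push(spec);
        }
        self.handlers.push(handler);
    }

    pub fn specs(&self) -> &[ToolSpec] {
        &self.specs
    }

    pub fn handler_count(&self) -> usize {
        self.handlers.len()
    }
}

/// Example handler that can run in an explicit no-spec mode.
pub struct ShellHandler {
    pub expose_spec: bool,
}

impl ToolHandler for ShellHandler {
    fn spec(&self) -> Option<ToolSpec> {
        self.expose_spec.then(|| ToolSpec { name: "shell".to_string() })
    }
}
```

Registering one exposed and one hidden `ShellHandler` would yield two handlers but only one collected spec.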

## Verification

- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `cargo test -p codex-core tools::handlers::multi_agents_spec::tests`
- `RUST_MIN_STACK=16777216 cargo test -p codex-core
tools::handlers::multi_agents::tests`
- `cargo test -p codex-core tools::handlers::apply_patch::tests`
- `cargo test -p codex-core tools::handlers::unified_exec::tests`
- `just fix -p codex-core`
- `git diff --check`
2026-05-07 10:48:36 -07:00
jif-oai
eb0462f2af app-server: refresh live threads from latest config snapshot (#21187)
## Why

App-server config writes were leaving existing threads partially stale.
After a config mutation, the app-server told each live thread to run
`Op::ReloadUserConfig`, but that path only re-read the user
`config.toml` layer. Settings that came from the app-server's
materialized config snapshot did not propagate to existing threads until
restart.

This change prevents a filesystem access from `core` for CCA.

## What changed

- add `CodexThread::refresh_runtime_config()` and
`Session::refresh_runtime_config()` so the app-server can push a freshly
rebuilt config snapshot into a live thread
- rebuild the latest config with each thread's `cwd` after config
mutations, then refresh the thread from that snapshot instead of asking
it to reload only `config.toml`
- keep session-static settings unchanged during refresh, while updating
runtime-refreshable state such as the config layer stack,
`tool_suggest`, and derived hook/plugin/skill state
- keep `reload_user_config_layer()` as the file-backed fallback for
legacy local reload flows, but route the shared refresh logic through
the new runtime refresh path

## Testing

- add a session test that verifies `refresh_runtime_config()` rebuilds
hooks from refreshed config
- add a session test that verifies runtime-refreshable fields update
while session-static settings like `model` and `notify` stay unchanged

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 19:22:04 +02:00
pakrym-oai
857e731478 [codex] Remove string-keyed MCP tool maps (#21454)
## Summary

This PR removes the synthetic `HashMap<String, ToolInfo>` keys from MCP
tool discovery. `McpConnectionManager::list_all_tools()` now returns
normalized `Vec<ToolInfo>`, and downstream code derives identity from
`ToolInfo::canonical_tool_name()`.

The motivation is to keep model-visible tool identity on
`ToolName`/`ToolInfo` instead of parallel string map keys, so future
namespace changes do not have to preserve otherwise-unused lookup keys.

## Changes

- Rename the MCP normalization path from `qualify_tools` to
`normalize_tools_for_model` and return tool values directly.
- Flow MCP tool lists through connectors, plugin injection, router/spec
building, code mode, and tool search as vectors/slices.
- Keep direct/deferred subtraction local to `mcp_tool_exposure`, using
`ToolName` values.
- Update tests to compare `ToolName` instances where MCP identity
matters.
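
The shift from string-keyed maps to value-derived identity might look like the sketch below. The field layout of `ToolName` and `ToolInfo` and the helper `subtract_deferred` are hypothetical; only the type and method names come from the description.

```rust
use std::collections::HashSet;

/// Hypothetical shape of a model-visible tool name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ToolName {
    pub namespace: String,
    pub name: String,
}

/// Hypothetical, trimmed-down tool record.
#[derive(Debug, Clone)]
pub struct ToolInfo {
    pub server: String,
    pub tool: String,
}

impl ToolInfo {
    /// Identity is derived from the value itself, not from a parallel map key.
    pub fn canonical_tool_name(&self) -> ToolName {
        ToolName {
            namespace: self.server.clone(),
            name: self.tool.clone(),
        }
    }
}

/// Returns the direct tools that are not also deferred, comparing by `ToolName`.
pub fn subtract_deferred(direct: Vec<ToolInfo>, deferred: &[ToolInfo]) -> Vec<ToolInfo> {
    let deferred_names: HashSet<ToolName> =
        deferred.iter().map(ToolInfo::canonical_tool_name).collect();
    direct
        .into_iter()
        .filter(|tool| !deferred_names.contains(&tool.canonical_tool_name()))
        .collect()
}
```

Because the set is built on demand from the tool values, no synthetic lookup keys need to be kept in sync when namespacing changes.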

## Validation

- `cargo test -p codex-mcp test_normalize_tools`
- `cargo test -p codex-core mcp_tool_exposure`
- `cargo test -p codex-core
direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
2026-05-07 10:16:10 -07:00
jif-oai
9b6c6f7a01 fix: preserve exact turn diffs after partial apply_patch failures (#21518)
## Why

Follow-up to #21180: turn diffs are operation-backed now, but a failed
`apply_patch` can still leave exact filesystem mutations behind. For
example, a move can write the destination file before failing to remove
the source. Treating the whole call as unknowable then drops a change
that Codex actually knows happened, so the emitted turn diff can drift
from the workspace.

## What changed

- [`apply-patch`](f55724e027/codex-rs/apply-patch/src/lib.rs (L248-L345))
now returns `ApplyPatchFailure` with the exact committed prefix
accumulated before an error. If a write failure may already have mutated
the target, the delta is marked inexact instead of being reused blindly.
- Move handling now records the destination write before attempting
source removal, so a partially failed move can still report the
destination file that definitely landed
([code](f55724e027/codex-rs/apply-patch/src/lib.rs (L463-L521))).
- [`ApplyPatchRuntime`](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L49-L67))
now accumulates committed deltas across attempts and forwards them even
when the visible tool result is failed or sandbox-denied ([runtime
path](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L223-L250)),
[event
path](f55724e027/codex-rs/core/src/tools/events.rs (L215-L225))).
- `TurnDiffTracker` now consumes committed exact deltas rather than only
fully successful patches; exact-empty failures leave the aggregate
unchanged, while inexact deltas still invalidate it.
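
As a rough model of the tracker behavior described above: exact deltas are folded into the aggregate, exact-empty deltas change nothing, and an inexact delta invalidates it. `CommittedDelta` and its fields are hypothetical, paths stand in for real diff content, and keeping the aggregate invalid for the rest of the turn is an assumption.

```rust
/// Hypothetical summary of what a (possibly failed) apply_patch call committed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CommittedDelta {
    pub changed_paths: Vec<String>,
    /// False when a failed write may have mutated the target in unknown ways.
    pub exact: bool,
}

#[derive(Debug)]
pub struct TurnDiffTracker {
    /// `None` once an inexact delta has invalidated the aggregate.
    aggregate: Option<Vec<String>>,
}

impl TurnDiffTracker {
    pub fn new() -> Self {
        Self { aggregate: Some(Vec::new()) }
    }

    pub fn record(&mut self, delta: &CommittedDelta) {
        if !delta.exact {
            // Publishing a guessed diff would be worse than publishing none.
            self.aggregate = None;
            return;
        }
        // An exact delta with no paths leaves the aggregate unchanged.
        if let Some(paths) = self.aggregate.as_mut() {
            paths.extend(delta.changed_paths.iter().cloned());
        }
    }

    pub fn aggregate(&self) -> Option<&[String]> {
        self.aggregate.as_deref()
    }
}
```

A failed move that still wrote its destination would be recorded as an exact delta containing only the destination path.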

## Verification

- Added a regression test covering a failed move that still emits the
committed destination diff:
[`apply_patch_failed_move_preserves_committed_destination_diff`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1517-L1586)).
- Kept explicit coverage that an inexact delta clears the aggregate
instead of publishing a guessed diff:
[`apply_patch_clears_aggregated_diff_after_inexact_delta`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1589-L1655)).

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 18:05:45 +02:00
jif-oai
f7e8ff8e50 Make turn diff tracking operation backed (#21180)
## Summary
- replace filesystem-based turn diff tracking with an operation-backed
accumulator
- preserve enough verified apply_patch state to render move-overwrite
cases correctly
- keep the turn/diff/updated contract intact while removing remote-only
turn-diff test skips

This assumes that no third-party services rely on the output format of
`apply_patch`.

## Why
For the CCA file system isolation push

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 11:33:47 +02:00
jif-oai
b2268999fe feat: make built-in MCPs first-class runtime servers (#21356)
## DISCLAIMER
This is experimental; no production service should rely on it.

## Why

Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.

## What changed

- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.

## Test plan

- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 10:36:32 +02:00
pakrym-oai
a8488fec5e Revert state DB injection and agent graph store (#21481)
## Why

Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.

## What changed

- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.

## Validation

- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.
2026-05-06 22:48:29 -07:00
mifan-oai
001363188a [codex] Add OpenAI Developers to tool suggest allowlist (#21423)
## Summary

Add `openai-developers@openai-curated` to
`TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` so the OpenAI Developers
plugin can be surfaced through tool suggestions once it is available in
the Built by OpenAI marketplace.

Update the discoverable plugin test fixture to assert the plugin is
returned from the curated marketplace allowlist path.

## Validation

- `cargo fmt --check` passed; rustfmt emitted the existing
stable-channel warnings about `imports_granularity`.
- `cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_uninstalled_curated_plugins`
passed.
2026-05-06 23:49:15 -04:00
pakrym-oai
e394625ea2 [codex] Delete tool handler plan indirection (#21427)
## Why

The spec split in the parent PR still left an intermediate registry plan
that recorded `ToolHandlerKind` values and translated them into concrete
handlers later. That kept tool registration dependent on static enum
bookkeeping instead of registering handlers from the same code that
assembles their specs.

## What Changed

- Make `build_tool_registry_builder` register concrete handlers directly
while adding specs.
- Add small `ToolRegistryBuilder` helpers for spec augmentation and
nested code-mode inspection.
- Remove `ToolHandlerKind`, `ToolHandlerSpec`, and `ToolRegistryPlan`.
- Update spec-plan tests to assert against the built `ToolRegistry`
instead of static handler descriptors.

## Validation

- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `just fix -p codex-core`
2026-05-06 20:36:24 -07:00
pakrym-oai
103dc2b6ae Revert "Move skills watcher to app-server" (#21460)
Reverts openai/codex#21287
2026-05-07 02:24:20 +00:00
Andrei Eternal
527d52df03 Add compact lifecycle hooks (started by vincentkoc - external contrib) (#19905)
Based on work from Vincent K -
https://github.com/openai/codex/pull/19060

<img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
/>

## Why

Compaction rewrites the conversation context that future model turns
receive, but hooks currently have no deterministic lifecycle point
around that rewrite. This adds compact lifecycle hooks so users can
audit manual and automatic compaction, surface hook messages in the UI,
and run post-compaction follow-up without overloading tool or prompt
hooks.

## What Changed

- Added `PreCompact` and `PostCompact` hook events across hook config,
discovery, dispatch, generated schemas, app-server notifications,
analytics, and TUI hook rendering.
- Added trigger matching for compact hooks with the documented `manual`
and `auto` matcher values.
- Wired `PreCompact` before both local and remote compaction, and
`PostCompact` after successful local or remote compaction.
- Kept compact hook command input to lifecycle metadata: session id,
Codex turn id, transcript path, cwd, hook event name, model, and
trigger.
- Made compact stdout handling consistent with other hooks: plain stdout
is ignored as debug output, while malformed JSON-looking stdout is
reported as failed hook output.
- Added integration coverage for compact hook dispatch, trigger
matching, post-compact execution, and the audited behavior that
`decision:"block"` does not block compaction.

## Out of Scope

- Hook-specific compaction blocking is not implemented;
`decision:"block"` and exit-code-2 blocking semantics are intentionally
unsupported for `PreCompact`.
- Custom compaction instructions are not exposed to compact hooks in
this PR.
- Compact summaries, summary character counts, and summary previews are
not exposed to compact hooks in this PR.

## Verification

- `cargo test -p codex-hooks`
- `cargo test -p codex-core
manual_pre_compact_block_decision_does_not_block_compaction`
- `cargo test -p codex-app-server hooks_list`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-tui hooks_browser`

## Docs

The developer documentation for Codex hooks should be updated alongside
this feature to document `PreCompact` and `PostCompact`, the
`manual`/`auto` matcher values, and the compact hook payload fields.

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-05-06 18:08:31 -07:00
pakrym-oai
9417cf9696 [codex] Move tool specs into core handlers (#21416)
## Why

This is the first mechanical slice of moving tool spec ownership toward
the handlers. `codex-tools` should keep shared primitives and conversion
helpers, while builtin tool specs and registration planning live in
`codex-core` with the handlers that own those tools.

Keeping this PR to relocation and import updates isolates the copy/move
review from the later logic change that wires specs through registered
handlers.

## What changed

- Moved builtin tool spec constructors from `codex-rs/tools/src` into
`codex-rs/core/src/tools/handlers/*_spec.rs` or nearby core tool
modules.
- Moved the registry planning code into
`codex-rs/core/src/tools/spec_plan.rs` and its associated types/tests
into core.
- Kept shared primitives in `codex-tools`, including `ToolSpec`,
schema/types, discovery/config primitives, dynamic/MCP conversion
helpers, and code-mode collection helpers.
- Updated handlers that referenced moved argument types or tool-name
constants to use the core spec modules.
- Moved spec tests next to the moved spec modules.

## Verification

- `cargo check -p codex-tools`
- `cargo check -p codex-core`
- `cargo test -p codex-tools`
- `cargo test -p codex-core _spec::tests`
- `cargo test -p codex-core tools::spec_plan::tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`

Note: I also tried the broader `cargo test -p codex-core tools::`; it
reached the moved spec-plan/spec tests successfully, then aborted with a
stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`,
which is outside this spec relocation.
2026-05-06 15:40:50 -07:00
pakrym-oai
d5eea229cc Move skills watcher to app-server (#21287)
## Why

Skills update notifications are app-server API behavior, but the watcher
lived in `codex-core` and surfaced through
`EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
focused on thread execution and lets app-server own both cache
invalidation and the `skills/changed` notification.

## What changed

- Added an app-server-owned skills watcher that watches local skill
roots, clears the shared skills cache, and emits `skills/changed`
directly.
- Registers skill watches from the common app-server thread listener
attach path, including direct starts, resumes, and app-server-observed
child or forked threads.
- Stores the `WatchRegistration` on `ThreadState`, so listener
replacement, thread teardown, idle unload, and app-server shutdown
deregister by dropping the RAII guard.
- Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
old core live-reload test.
- Extended the app-server skills change test to verify a cached skills
list is refreshed after a filesystem change without forcing reload.
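
The RAII deregistration pattern mentioned above can be sketched as follows. `Watcher`, its methods, and the `skills_watch` field are hypothetical; only `WatchRegistration` and `ThreadState` are names from the description, and their real definitions will differ.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical watcher that tracks which skill roots are being watched.
#[derive(Default)]
pub struct Watcher {
    roots: Mutex<HashSet<String>>,
}

/// Guard whose lifetime defines how long a root stays watched.
pub struct WatchRegistration {
    watcher: Arc<Watcher>,
    root: String,
}

impl Watcher {
    pub fn register(self: &Arc<Self>, root: &str) -> WatchRegistration {
        self.roots.lock().unwrap().insert(root.to_string());
        WatchRegistration {
            watcher: Arc::clone(self),
            root: root.to_string(),
        }
    }

    pub fn watched(&self) -> usize {
        self.roots.lock().unwrap().len()
    }
}

impl Drop for WatchRegistration {
    fn drop(&mut self) {
        // Deregister on drop; a poisoned lock is ignored instead of panicking.
        if let Ok(mut roots) = self.watcher.roots.lock() {
            roots.remove(&self.root);
        }
    }
}

/// Storing the guard on per-thread state ties the watch to that state's lifetime.
pub struct ThreadState {
    pub skills_watch: Option<WatchRegistration>,
}
```

Any path that drops or replaces the `ThreadState` field (teardown, listener replacement, shutdown) then deregisters without an explicit call. This sketch does not reference-count roots shared by several registrations.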

## Validation

- `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-rollout -p codex-rollout-trace`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`
2026-05-06 15:38:11 -07:00
Brian Henzelmann
8f5d68f9d2 Document Codex git commit attribution config (#21379)
## Summary
- document that commit attribution for generated git commit messages is
gated by the `codex_git_commit` feature flag
- add an example `config.toml` snippet showing `commit_attribution` with
`[features].codex_git_commit = true`
- update the config schema description so the reference docs explain
that `commit_attribution` only takes effect when the feature is enabled

Fixes #19799.

## Validation
- `cargo run -p codex-core --bin codex-write-config-schema`
- `cargo test -p codex-config`
- `cargo test -p codex-features`
- `cargo fmt --check`
- `git diff --check`

## Notes
- `cargo test -p codex-core config_schema_matches_fixture` currently
fails before reaching the schema test because `core_test_support`
imports `similar` without a linked crate in this checkout. The narrower
package checks above avoid that unrelated test-support build failure.
2026-05-06 16:14:50 -05:00
pakrym-oai
b9c50a53d7 [codex] Split tool handlers into separate files (#21395)
## Why

Several tool handler modules still bundled multiple `ToolHandler`
implementations in one file. That made the handler directory harder to
navigate and made otherwise local handler edits land in large shared
modules.

## What

- Split grouped tool handlers into one handler file each for agent jobs,
goals, MCP resources, shell tools, and unified exec.
- Kept shared parsing, payload, and runtime helpers in the existing
parent modules, with re-exports preserving the existing handler import
paths.
- Updated the shell handler tests to construct `ShellCommandHandler`
through the existing `ShellCommandBackendConfig` conversion now that the
backend detail lives with the shell-command handler.

## Validation

- `cargo check -p codex-core`
- `cargo clippy -p codex-core --lib -- -D warnings`
- `git diff --check -- codex-rs/core/src/tools/handlers`

Targeted `codex-core` handler tests did not run locally because
`core_test_support` currently fails to compile before reaching these
tests due to an unresolved `similar` import.
2026-05-06 13:12:24 -07:00
starr-openai
63a27ad6c6 Avoid hard-coded environment context shell (#21390)
## Summary
- make resolved turn environment shell metadata optional instead of
hard-coding bash
- render environment context shells from explicit environment metadata
when present, falling back to the existing session shell
- update environment context tests for inherited PowerShell-style
fallback and explicit per-environment shell override
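
The fallback rule above reduces to a one-line choice. The function name and the string-based shell representation are hypothetical simplifications.

```rust
/// Picks the shell to render in the environment context: explicit
/// per-environment metadata when present, otherwise the session shell.
pub fn context_shell<'a>(environment_shell: Option<&'a str>, session_shell: &'a str) -> &'a str {
    environment_shell.unwrap_or(session_shell)
}
```

Modelling the environment shell as an `Option` makes the "no metadata" case explicit instead of silently assuming bash.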

## Testing
- Not run (not requested; formatted with `just fmt`).

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 19:54:26 +00:00
Christoph Paasch (OpenAI)
f9063045e1 Avoid noisy OTEL diagnostics in codex exec (#21107)
`codex exec` should not print OpenTelemetry exporter self-diagnostics to
stderr by default. Suppress the SDK and OTLP exporter targets unless
callers explicitly opt in with `RUST_LOG`.

Also stop defaulting the trace exporter to the log exporter, since OTLP
HTTP endpoints are signal-specific and a logs endpoint is not valid for
spans.

Co-authored-by: Codex <noreply@openai.com>

2026-05-06 12:49:13 -07:00
Clark DuVall
346070a424 Route opted-in MCP elicitations through Guardian (#19431)
# Motivation

Browser Use origin-access prompts are MCP elicitations, not direct
tool-call approval prompts, so they were bypassing the Guardian approval
path. We need a generic opt-in that lets eligible MCP elicitations use
Guardian when the current turn already routes approvals there.

# Description

Add a generic elicitation reviewer hook in codex-mcp and wire codex-core
to pass a Guardian reviewer callback when creating the MCP connection
manager. The reviewer validates explicit mcp_tool_call opt-in metadata,
builds a Guardian MCP tool-call review request from
server/tool/connector metadata and tool params, and maps Guardian
approval, denial, timeout, and cancellation decisions back to MCP
elicitation responses.

The new option to trigger this in the `_meta` object is:
```
"codex_request_type": "approval_request",
```

# Testing

- RUST_MIN_STACK=8388608 NEXTEST_STATUS_LEVEL=leak cargo nextest run
--no-fail-fast --cargo-profile ci-test --test-threads 2
- cargo clippy --tests -- -D warnings
- cargo fmt -- --config imports_granularity=Item --check
- cargo shear
- pnpm run format
- python3 .github/scripts/verify_cargo_workspace_manifests.py
- python3 .github/scripts/verify_tui_core_boundary.py
- python3 .github/scripts/verify_bazel_clippy_lints.py
- git diff --check
2026-05-06 19:42:45 +00:00
pakrym-oai
712305be47 Remove core MCP list tools op (#21281)
## Why

The core `Op::ListMcpTools` request path is no longer needed. Keeping it
around left a dead request/response surface alongside the app-server MCP
inventory APIs that own current server status listing.

## What Changed

- Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the
core handler that built the MCP snapshot response.
- Removed the now-unused `codex-mcp` snapshot wrapper/export and passive
event handling arms in rollout and MCP-server consumers.
- Updated tests that used the old op as a synchronization hook to wait
on existing startup/skills events, and deleted the plugin test that only
exercised the removed listing op.

## Validation

- `cargo test -p codex-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `cargo test -p codex-core --test all
pending_input::queued_inter_agent_mail`
- `cargo test -p codex-core --test all
rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta`
- `cargo test -p codex-core --test all
rmcp_client::stdio_image_responses`
- `just fix -p codex-core -p codex-protocol -p codex-mcp -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
2026-05-06 11:20:34 -07:00
pakrym-oai
2070d5bfd3 [codex] Add response.processed websocket request (#21284)
## Summary

- Add a `response.processed` websocket request payload and sender for
Responses API websockets.
- Send `response.processed` from `try_run_sampling_request` after a
response completes, local turn processing succeeds, and the
session-owned feature flag is enabled.
- Add websocket coverage for both enabled and disabled feature-flag
behavior.

## Validation

- `just fmt`
- `cargo test -p codex-core response_processed`
- `cargo test -p codex-api responses_websocket`
- `cargo test -p codex-features
responses_websocket_response_processed_is_under_development`
- `git diff --check`
- `just fix -p codex-api -p codex-core -p codex-features`
- `git diff --check origin/main...HEAD`
2026-05-06 09:58:46 -07:00
pakrym-oai
2004173cd7 Move message history out of core (#21278)
## Why

Message history was implemented inside `codex-core` and surfaced through
core protocol ops and `SessionConfiguredEvent` fields even though the
current consumer is TUI-local prompt recall. That made core own UI
history persistence and exposed `history_log_id` / `history_entry_count`
through surfaces that app-server and other clients do not need.

This change moves message history persistence out of core and keeps the
recall plumbing local to the TUI.

## What changed

- Added a new `codex-message-history` crate for appending, looking up,
trimming, and reading metadata from `history.jsonl`.
- Removed core protocol history ops/events: `AddToHistory`,
`GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
- Removed `history_log_id` and `history_entry_count` from
`SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
- Updated the TUI to dispatch local app events for message-history
append/lookup and keep its persistent-history metadata in TUI session
state.

## Validation

- `cargo test -p codex-message-history -p codex-protocol`
- `cargo test -p codex-exec event_processor_with_json_output`
- `cargo test -p codex-mcp-server outgoing_message`
- `cargo test -p codex-tui`
- `just fix -p codex-message-history -p codex-protocol -p codex-core -p
codex-tui -p codex-exec -p codex-mcp-server`
2026-05-06 08:35:42 -07:00
Ahmed Ibrahim
be1d3cff93 2- Use string service tiers in session protocol (#20971)
## Summary
- break service tier session/op/app-server protocol fields from the
closed enum to string tier ids
- send the service tier string directly through model requests, prewarm,
compaction, memories, and TUI/app-server turn starts
- regenerate app-server protocol JSON/TypeScript schemas, removing the
standalone ServiceTier TS enum

## Verification
- just fmt
- cargo check -p codex-core -p codex-app-server -p codex-tui
- just write-app-server-schema

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 18:00:21 +03:00
jif-oai
ca257b6ce5 chore: spawn MCP for memories (#21214)
Co-authored-by: Codex <noreply@openai.com>
2026-05-06 15:05:54 +02:00
jif-oai
8f3bb355f4 Move installation ID resolution out of core startup (#21182)
## Summary

- resolve or inject the installation ID before core startup and pass it
through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain
`String`
- keep child sessions on the parent installation ID instead of
rediscovering it inside core
- propagate installation ID startup failures in `mcp-server` instead of
panicking

## Why

Core was still touching the filesystem on the session startup path to
discover `installation_id`. This moves that work to the outer host
boundary so core no longer depends on `codex_home` reads during session
construction.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 10:48:54 +00:00
Ahmed Ibrahim
5d6f23a27b Propagate cache key and service tiers in compact (#21249)
## Why

`/responses/compact` should preserve the request-affinity fields that
apply to the active auth mode. ChatGPT-auth compact requests need the
effective `service_tier`, and compact requests for every auth mode need
the stable `prompt_cache_key`, so compaction does not quietly lose
routing or cache behavior that normal sampling already has.

This follows the request-parity direction from #20719, but keeps the net
change focused on the compact payload fields needed here.

## What changed

- Add `service_tier` and `prompt_cache_key` to the compact endpoint
input payload.
- Build the remote compact payload from the existing responses request
builder output so `Fast` still maps to `priority` when compact sends a
service tier.
- Pass the turn service tier into remote compaction, but only include it
in compact payloads for ChatGPT-backed auth.
- Keep `prompt_cache_key` on compact payloads for all auth modes.
- Add request-body diff snapshot coverage in
`core/tests/suite/compact_remote.rs` for:
  - API-key auth reusing `prompt_cache_key` while omitting `service_tier`
even when `Fast` is configured.
  - ChatGPT auth reusing both `service_tier` and `prompt_cache_key`.
- Drive the snapshot coverage through five varied turns: plain text,
multi-part text, tool-call continuation, image+text input, local-shell
continuation, and final-turn reasoning output.
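
The field rules above can be sketched roughly as follows. `AuthMode`, `CompactAffinity`, and both function names are hypothetical; only the `Fast` to `priority` mapping and the per-auth-mode inclusion rules come from this description, and passing other tier names through unchanged is an assumption.

```rust
/// Hypothetical auth modes; the real type in the codebase may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AuthMode {
    ApiKey,
    ChatGpt,
}

/// Request-affinity fields carried on a compact payload (illustrative shape).
#[derive(Debug, PartialEq, Eq)]
pub struct CompactAffinity {
    pub service_tier: Option<String>,
    pub prompt_cache_key: String,
}

/// Maps a configured tier name to the value sent on the wire.
/// Only `Fast` -> `priority` is stated in the description; other tiers
/// are passed through unchanged as an assumption.
pub fn wire_service_tier(configured: &str) -> String {
    if configured.eq_ignore_ascii_case("fast") {
        "priority".to_string()
    } else {
        configured.to_string()
    }
}

/// Builds the affinity fields: the cache key is always kept, while the
/// service tier is only included for ChatGPT-backed auth.
pub fn compact_affinity(
    auth: AuthMode,
    configured_tier: Option<&str>,
    prompt_cache_key: &str,
) -> CompactAffinity {
    let service_tier = match auth {
        AuthMode::ChatGpt => configured_tier.map(wire_service_tier),
        AuthMode::ApiKey => None,
    };
    CompactAffinity {
        service_tier,
        prompt_cache_key: prompt_cache_key.to_string(),
    }
}
```

With these definitions, API-key auth with `Fast` configured yields no `service_tier` but keeps the cache key, which matches the first snapshot case listed above.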

## Verification

- Added insta snapshots for compact request-body parity against the last
normal `/responses` request after five varied turns.
- Not run locally per repo guidance; relying on GitHub CI for test
execution.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 13:38:43 +03:00
jif-oai
cc84e6bc6d Revert "feat: support template interpolation in multi-agent usage hints" (#21337)
Reverts openai/codex#20973
2026-05-06 12:33:37 +02:00
jif-oai
fe24a180ab feat: include thread ID in MCP turn metadata (#21329)
## Why

MCP tool calls already include `session_id` in `x-codex-turn-metadata`,
but descendant threads intentionally share that value with the root
thread. Consumers that need to correlate work at the concrete thread
level also need the current `thread_id`.

## What changed

- add `thread_id` to `x-codex-turn-metadata` while preserving
`session_id` as the shared session identity
- thread the two identities separately through normal turns and spawned
review threads
- add regression coverage for resumed sessions, reserved metadata
fields, and deferred MCP tool calls

## Verification

- added focused coverage in `core/src/session/tests.rs`,
`core/src/turn_metadata_tests.rs`, and `core/tests/suite/search_tool.rs`
2026-05-06 11:36:15 +02:00
jif-oai
a98623511b feat: add session_id (#20437)
## Summary

Related to
https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
TLDR:
We update the meaning of session ids and thread ids:
* thread_id stays as now
* session_id becomes a shared id between every thread under a /root
thread (i.e. all sub-agents share the same session id)

This PR introduces an explicit `SessionId` and threads it through the
protocol/client boundary so `session_id` and `thread_id` can diverge
when they need to, while preserving compatibility for older serialized
`session_configured` events.
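
A minimal sketch of the intended relationship, assuming simple string-backed ids: `SessionId` is named in this PR, but the `ThreadId` wrapper, `ThreadInfo`, and both constructors below are hypothetical, and seeding the root session id from the root thread id is an assumption made only for the example.

```rust
/// String-backed ids for illustration; the real types may carry UUIDs.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SessionId(pub String);

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ThreadId(pub String);

/// Hypothetical pairing of the two identities for one thread.
#[derive(Clone, Debug)]
pub struct ThreadInfo {
    pub thread_id: ThreadId,
    pub session_id: SessionId,
}

impl ThreadInfo {
    /// A root thread starts a new session. Reusing the thread id string
    /// as the session id is an assumption of this sketch.
    pub fn root(thread_id: &str) -> Self {
        ThreadInfo {
            thread_id: ThreadId(thread_id.to_string()),
            session_id: SessionId(thread_id.to_string()),
        }
    }

    /// A sub-agent gets its own thread id but inherits the session id.
    pub fn spawn_child(&self, child_thread_id: &str) -> Self {
        ThreadInfo {
            thread_id: ThreadId(child_thread_id.to_string()),
            session_id: self.session_id.clone(),
        }
    }
}
```

Every descendant created through `spawn_child` then reports the root's `session_id` while keeping a distinct `thread_id`.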

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 10:48:37 +02:00
Matthew Zeng
f9a907aebe Support Codex Apps auth elicitations (#19193)
## Summary

- request URL-mode MCP elicitations when Codex Apps tool calls fail with
connector auth metadata
- route Codex Apps auth URL elicitations into the TUI app-link flow

## Test plan

- `just fmt`
- `cargo test -p codex-core mcp_tool_call::tests`
- `cargo test -p codex-mcp`
- `cargo test -p codex-tui bottom_pane::app_link_view::tests`
- `just fix -p codex-core`
- `just fix -p codex-mcp`
- `just fix -p codex-tui`

Also attempted broader local runs:

- `cargo test -p codex-core` fails in unrelated
config/request-permission/proxy-sensitive tests under the current Codex
Desktop environment.
- `cargo test -p codex-tui` fails in unrelated status
snapshots/trust-default tests because the ambient environment renders
workspace-write/network permission defaults.
2026-05-06 07:18:00 +00:00
Matthew Zeng
41505bcea2 [mcp] Return Accept early per feedback. (#21277)
- [x] Return Accept early when auto_deny is enabled per feedback.
2026-05-05 21:23:42 -07:00
aaronl-openai
9f06d171e2 Preserve session MCP config on refresh (#21055)
# Overview
MCP refreshes were rebuilding active threads from fresh disk-backed
config only, which dropped thread-start session overlays such as
app-injected MCP servers. This keeps refreshes current with disk config
while preserving the thread-local config that only the active thread
knows about.

# Changes
- Rebuild refreshed config per active thread using that thread's current
`cwd`, rather than fanning out one app-server config to every thread.
- Preserve each thread's `SessionFlags` layer while replacing reloadable
config layers with freshly loaded config, then derive the MCP refresh
payload from the rebuilt result.
- Move MCP refresh orchestration into app-server so manual refreshes
fail loudly while background refreshes remain best-effort, and route
plugin-triggered refreshes through the same per-thread reload path.
- Add regression coverage for session overlays, fresh project config,
plugin-derived MCP config, current requirements, and strict vs
best-effort refresh behavior.
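
The per-thread rebuild can be pictured roughly as below. `LayerKind`, `Layer`, and `rebuild_layers` are hypothetical names, and placing the preserved session layers after the freshly loaded ones is an assumption about precedence, not something this description states.

```rust
/// Hypothetical classification of config layers for this sketch.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum LayerKind {
    /// Thread-local overlay set at thread start (kept across refreshes).
    SessionFlags,
    /// Anything reloaded from disk on refresh.
    Reloadable,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Layer {
    pub kind: LayerKind,
    pub name: String,
}

/// Replaces reloadable layers with freshly loaded ones while carrying
/// the thread's existing `SessionFlags` layers over unchanged.
/// Ordering (session layers last) is an assumption of this sketch.
pub fn rebuild_layers(current: &[Layer], fresh_reloadable: Vec<Layer>) -> Vec<Layer> {
    let mut rebuilt = fresh_reloadable;
    rebuilt.extend(
        current
            .iter()
            .filter(|layer| layer.kind == LayerKind::SessionFlags)
            .cloned(),
    );
    rebuilt
}
```

The MCP refresh payload would then be derived from the rebuilt layers, so an app-injected server held in a session layer survives the refresh.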

# Verification
- Passed focused Rust coverage for the thread-config rebuild behavior
and deferred MCP refresh flow, plus `cargo test -p codex-app-server
--lib`.
- Verified end to end in the Codex dev app against the locally built
CLI: registered an MCP via thread config, verified that it could be used
successfully before refresh, manually triggered MCP refresh, and
verified that it continued to be available afterward.
2026-05-05 21:09:28 -07:00
rhan-oai
b3d4f1a9f0 [codex-analytics] rework thread_source for thread analytics (#20949)
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification

## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.

Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.

## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.

## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
2026-05-06 02:12:31 +00:00
pakrym-oai
136e442e95 [codex] Remove legacy ListSkills op (#21282)
## Why

`skills/list` is already exposed through app-server v2 and covered by
the app-server test suite. Keeping the separate core `Op::ListSkills`
path leaves a duplicate legacy protocol surface that no longer needs to
be maintained.

## What Changed

- Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the
core protocol.
- Deleted the corresponding core session handler and stale core
integration tests.
- Removed rollout/MCP ignore branches and protocol v1 docs references
for the deleted event/op.
- Left app-server `skills/list` and its existing coverage intact.

## Validation

- `cargo test -p codex-protocol`
- `cargo test -p codex-core --test all suite::skills`
- `cargo check -p codex-mcp-server -p codex-rollout -p
codex-rollout-trace`
- `just fix -p codex-core`
2026-05-05 18:58:18 -07:00
mchen-oai
794c240f25 Add model and reasoning effort to MCP turn metadata (#21219)
## Why
- Similar change to https://github.com/openai/codex/pull/19473.
- Without change: MCP tool calls receive
`_meta["x-codex-turn-metadata"]` with `session_id`, `turn_id`, and
`turn_started_at_unix_ms`.
- Issue: MCP servers may want the model and reasoning effort to better
understand tool-call behavior and latency relative to turn start.

## What Changed
- With change: MCP turn metadata now includes `model` and
`reasoning_effort`, propagated in `_meta["x-codex-turn-metadata"]`.
- Normal `/responses` turn metadata headers are unchanged.

## Verification
- `codex-rs/core/src/mcp_tool_call_tests.rs`
- `codex-rs/core/src/turn_metadata_tests.rs`
- `codex-rs/core/tests/suite/search_tool.rs`
2026-05-05 17:37:48 -07:00
pakrym-oai
2c1a361a2e [codex] Move thread naming to app server (#21260)
## Why

Thread names are app-server metadata now, backed by the thread store and
sqlite state database. Keeping a core `SetThreadName` op plus a rollout
`thread_name_updated` event made rename persistence live in the wrong
layer and required historical replay support for an event that new
app-server flows should not write.

## What changed

- Removed `Op::SetThreadName` and `EventMsg::ThreadNameUpdated` from the
core protocol and deleted the core handler path that appended rename
events to rollouts.
- Updated app-server `thread/name/set` so both loaded and unloaded
threads write through thread-store metadata and app-server emits
`thread/name/updated` notifications.
- Updated local thread-store name metadata updates to write sqlite title
metadata and the legacy thread-name index without appending rollout
events.
- Removed state extraction and rollout handling for the deleted
thread-name event.

## Validation

- `cargo test -p codex-app-server thread_name_updated_broadcasts`
- `cargo test -p codex-app-server
thread_name_set_is_reflected_in_read_list_and_resume`
- `cargo test -p codex-thread-store
update_thread_metadata_sets_name_on_active_rollout_and_indexes_name`
- `cargo test -p codex-state`
- `cargo check -p codex-mcp-server -p codex-rollout-trace`
- `just fix -p codex-app-server -p codex-thread-store -p codex-state -p
codex-mcp-server -p codex-rollout-trace`

## Docs

No external documentation update is expected for this internal ownership
change.
2026-05-05 17:16:06 -07:00
Michael Bolin
26f355b67b linux-sandbox: use standalone bundled bwrap (#21255)
**Summary**
- Add `codex-bwrap`, a standalone `bwrap` binary built from the existing
vendored bubblewrap sources.
- Remove the linked vendored bwrap path from `codex-linux-sandbox`;
runtime now prefers system `bwrap` and falls back to bundled
`codex-resources/bwrap`.
- Add SHA-256 verification of the bundled binary, treating a missing or
all-zero digest as the dev-mode skip value, then exec the verified file
through `/proc/self/fd`.
- Keep `launcher.rs` focused on choosing and dispatching the preferred
launcher. Bundled lookup, digest verification, and bundled exec now live
in `linux-sandbox/src/bundled_bwrap.rs`; Bazel runfiles lookup lives in
`linux-sandbox/src/bazel_bwrap.rs`; shared argv/fd exec helpers live in
`linux-sandbox/src/exec_util.rs`.
- Teach Bazel tests to surface the Bazel-built `//codex-rs/bwrap:bwrap`
through `CARGO_BIN_EXE_bwrap`; `codex-linux-sandbox` only honors that
fallback in debug Bazel runfiles environments so release/user runtime
lookup stays tied to `codex-resources/bwrap`.
- Allow `codex-exec-server` filesystem helpers to preserve just the
Bazel bwrap/runfiles variables they need in debug Bazel builds, since
those helpers intentionally rebuild a small environment before spawning
`codex-linux-sandbox`.
- Verify the Bazel bwrap target in Linux release CI with a build-only
check. Running `bwrap --version` is too strong for GitHub runners
because bubblewrap still attempts namespace setup there.
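
The launcher preference and the digest skip rule from the summary could look roughly like this. All type and function names here are hypothetical, and the sketch only decides whether to verify and which binary to prefer; hashing the file and the `/proc/self/fd` exec are left out.

```rust
use std::path::PathBuf;

pub const DIGEST_LEN: usize = 32; // SHA-256 output size in bytes

/// What to do with the expected digest embedded at build time.
#[derive(Debug, PartialEq, Eq)]
pub enum DigestCheck {
    /// Missing or all-zero digest: dev mode, skip verification.
    Skip,
    /// A real digest that the bundled file must match.
    Verify([u8; DIGEST_LEN]),
}

/// Treats `None` and an all-zero digest as the dev-mode skip value.
pub fn digest_check(expected: Option<[u8; DIGEST_LEN]>) -> DigestCheck {
    match expected {
        Some(digest) if digest.iter().any(|byte| *byte != 0) => DigestCheck::Verify(digest),
        _ => DigestCheck::Skip,
    }
}

/// Hypothetical launcher choice for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Launcher {
    System(PathBuf),
    Bundled(PathBuf),
}

/// Prefers a system `bwrap` and falls back to the bundled copy.
pub fn choose_launcher(
    system_bwrap: Option<PathBuf>,
    bundled_bwrap: Option<PathBuf>,
) -> Option<Launcher> {
    system_bwrap
        .map(Launcher::System)
        .or(bundled_bwrap.map(Launcher::Bundled))
}
```

In this sketch only the bundled path would go through `digest_check`; how the system binary is discovered is outside its scope.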

**Verification**
- Latest update: `cargo test -p codex-linux-sandbox`
- Latest update: `just fix -p codex-linux-sandbox`
- `cargo check --target x86_64-unknown-linux-gnu -p codex-linux-sandbox`
could not run locally because this macOS machine does not have
`x86_64-linux-gnu-gcc`; GitHub Linux Bazel CI is expected to cover the
Linux-only modules.
- Earlier in this PR: `cargo test -p codex-bwrap`
- Earlier in this PR: `cargo test -p codex-exec-server`
- Earlier in this PR: `cargo check --release -p codex-exec-server`
- Earlier in this PR: `just fix -p codex-linux-sandbox -p
codex-exec-server`
- Earlier in this PR: `bazel test --nobuild
//codex-rs/linux-sandbox:linux-sandbox-all-test
//codex-rs/core:core-all-test
//codex-rs/exec-server:exec-server-file_system-test
//codex-rs/app-server:app-server-all-test` (analysis completed; Bazel
then refuses to run tests under `--nobuild`)
- Earlier in this PR: `bazel build --nobuild //codex-rs/bwrap:bwrap`
- Prior to this update: `just bazel-lock-update`, `just
bazel-lock-check`, and YAML parse check for
`.github/workflows/bazel.yml`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21255).
* #21257
* #21256
* __->__ #21255
2026-05-05 17:14:29 -07:00
Tom
ee02cf26d6 codex: use ThreadStore history for core review forks (#20577)
- fork loaded parent threads from `ThreadStore` history in core agent
control paths
- migrate guardian review fork history to loaded session history instead
of rereading rollout files

## Verification
- `cargo test -p codex-core spawn_agent_fork`
2026-05-05 15:25:19 -07:00
Rasmus Rygaard
7e310bc7f3 Inject state DB, agent graph store (#20689)
## Why

We want the agent graph store to be passed down the stack as a real
dependency, the same way we already treat the thread store.

This will let us inject the agent graph store as a real dependency and
support implementations other than the local SQLite-backed one. Right
now most code instantiates a state DB and an agent graph store
just-in-time. Ideally, we would not depend on the state DB directly but
only read through the higher-level interfaces.

This change makes the dependency boundaries explicit and moves state DB
initialization to process bootstrap instead of hiding it inside local
store implementations.

## What changed

- `ThreadManager` now requires a `StateDbHandle` and an
`AgentGraphStore` at construction time instead of treating them as
optional internals.
- The local store constructors no longer lazily initialize SQLite.
Callers now initialize the state DB once per process and use that shared
handle to build:
  - `LocalThreadStore`
  - `LocalAgentGraphStore`
- App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
thread-manager sample) now initialize the state DB up front and inject
the resulting handle down the stack.
- `app-server` now consistently uses its process-scoped state DB handle
instead of reopening SQLite or trying to recover it from loaded threads.
- Device-key storage now reuses the shared state DB handle instead of
maintaining its own lazy opener.
- The thread archive / descendant traversal paths now use the injected
`AgentGraphStore` instead of reaching through local
thread-store-specific state.

## Verification

- `cargo check -p codex-core -p codex-thread-store -p codex-app-server
-p codex-mcp-server -p codex-thread-manager-sample --tests`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-core
thread_manager_accepts_separate_agent_graph_store_and_thread_store --
--nocapture`
- `cargo test -p codex-app-server
thread_archive_archives_spawned_descendants -- --nocapture`
2026-05-05 21:45:29 +00:00
Eric Traut
8c88f9a304 Auto-deny MCP elicitations for Xcode 26.4 clients (#21113)
## Summary

Xcode 26.4 was built against app-server behavior from before MCP
elicitation requests became client-visible in CLI 0.120.0 via #17043.
That client line does not expect the new events/messages, so this PR
restores the old behavior for exactly that client/version combination.

The compatibility handling stays in the app-server layer: when the
initialized client is `Xcode` and its version starts with `26.4`, the
app server marks the live Codex thread so MCP elicitations are
auto-denied. The flag is applied on thread start/resume/fork/turn
attachment, carried through `Codex`/`CodexThread`, and stored on
`McpConnectionManager` so refreshed MCP managers preserve the behavior.
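
The client match described above amounts to a small predicate. The function name and parameter names are assumptions for illustration; the `Xcode` name and `26.4` version prefix are taken from this description.

```rust
/// Returns true for the one client line that predates client-visible
/// MCP elicitations. The name must be exactly `Xcode` and the version
/// must start with `26.4`, as described above.
pub fn should_auto_deny_elicitations(client_name: &str, client_version: &str) -> bool {
    client_name == "Xcode" && client_version.starts_with("26.4")
}
```

Note that a plain prefix check like this one would also match a hypothetical `26.40` version string.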

## Notes

This is intentionally narrow and includes a TODO to remove the
compatibility path once Xcode 26.4 ages out.
2026-05-05 14:05:42 -07:00
pakrym-oai
f593323ef1 [codex] Split tool handlers by tool name (#20687)
## Why

Tool registration used to bind a tool name to a handler externally,
which left ownership split between the registry plan and the handler
implementation. Some built-in handlers also multiplexed multiple in-core
tools by switching on the invoked tool name internally.

This moves the registry identity onto the handler itself and makes
built-in multi-tool areas use separate concrete handlers, so each
registered handler instance owns exactly one tool name and one dispatch
path.

## What Changed

- Added `ToolHandler::tool_name()` and changed
`ToolRegistryBuilder::register_handler` to derive the registry key from
the handler.
- Split built-in multiplexed handlers into concrete per-tool handlers
for unified exec, shell/local shell/container exec, MCP resources, goal
tools, and agent job tools.
- Kept name-carrying handler instances only where the runtime target is
inherently external or dynamic, such as MCP tools, dynamic tools, and
unavailable placeholders.
- Updated `ToolHandlerKind` and registry-plan construction so plan
entries map directly to concrete handler registrations.
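
A simplified sketch of the registration shape, assuming a synchronous string-in, string-out handler: `ToolHandler::tool_name()` and `ToolRegistryBuilder::register_handler` are named in this PR, but the signatures, the `handle` and `dispatch` methods, the duplicate check, and both example handlers are hypothetical.

```rust
use std::collections::HashMap;

/// Each handler reports the single tool name it owns.
pub trait ToolHandler {
    fn tool_name(&self) -> String;
    fn handle(&self, input: &str) -> String;
}

#[derive(Default)]
pub struct ToolRegistryBuilder {
    handlers: HashMap<String, Box<dyn ToolHandler>>,
}

impl ToolRegistryBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Derives the registry key from the handler instead of taking it
    /// as a separate argument. Rejecting duplicates is an assumption.
    pub fn register_handler(&mut self, handler: Box<dyn ToolHandler>) -> Result<(), String> {
        let name = handler.tool_name();
        if self.handlers.contains_key(&name) {
            return Err(format!("duplicate tool handler: {}", name));
        }
        self.handlers.insert(name, handler);
        Ok(())
    }

    pub fn dispatch(&self, tool_name: &str, input: &str) -> Option<String> {
        self.handlers.get(tool_name).map(|h| h.handle(input))
    }
}

/// Built-in handler: one concrete type, one fixed tool name.
pub struct ShellHandler;

impl ToolHandler for ShellHandler {
    fn tool_name(&self) -> String {
        "shell".to_string()
    }
    fn handle(&self, input: &str) -> String {
        format!("shell ran: {}", input)
    }
}

/// Dynamic handler: the name is carried on the instance because the
/// target is external (for example an MCP tool).
pub struct McpToolHandler {
    pub name: String,
}

impl ToolHandler for McpToolHandler {
    fn tool_name(&self) -> String {
        self.name.clone()
    }
    fn handle(&self, input: &str) -> String {
        format!("mcp {} called with: {}", self.name, input)
    }
}
```

Because the key comes from the handler, the registry plan and the handler implementation cannot disagree about which name a handler serves.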

## Verification

- `cargo test -p codex-tools tool_registry_plan`
- `cargo test -p codex-core --lib tools::registry_tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`
2026-05-05 13:46:45 -07:00
Felipe Coury
3b2ebb368e feat(tui): redesign session picker (#20065)
## Why

The resume/fork picker is becoming the main way users recover previous
work, but the old fixed table made sessions hard to scan once thread
names, branches, working directories, and timestamps all mattered. This
redesign makes the picker denser by default, easier to search, and safer
to inspect before resuming or forking.

<table>
<tr>
<td>
<img width="1660" height="1103" alt="CleanShot 2026-05-03 at 12 34 10"
src="https://github.com/user-attachments/assets/313ede1d-1da4-4863-acd2-56b3e27e9703"
/>
</td>
<td>
<img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 34 15"
src="https://github.com/user-attachments/assets/cfde7d5c-bab0-4994-a807-254e53f344ea"
/>
</td>
</tr>
<tr>
<td>
<img width="1664" height="1107" alt="CleanShot 2026-05-03 at 12 39 22"
src="https://github.com/user-attachments/assets/e1ee58ca-4dc5-4a35-ae0f-47562da3974c"
/>
</td>
<td>
<img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 35 09"
src="https://github.com/user-attachments/assets/9c888072-eedf-4f45-985c-0c14df28bcc7"
/>
</td>
</tr>
</table>

## What Changed

- Replaces the old session table with responsive session rows that
prioritize the session name or preview, then show timestamp, cwd, and
branch metadata.
- Makes dense view the default while keeping comfortable view available
through `Ctrl+O`.
- Persists the picker view preference in `[tui].session_picker_view`,
including active profile-scoped config.
- Adds sort/filter controls for updated time, created time, cwd, and all
sessions.
- Expands search matching across session name, preview, thread id,
branch, and cwd.
- Makes `Esc` safer in search mode: it clears an active query before
starting a new session.
- Adds lazy transcript inspection:
  - `Space` expands recent transcript context inline.
  - `Ctrl+T` opens a transcript overlay.
  - raw reasoning visibility follows `show_raw_agent_reasoning`.
- Keeps remote cwd filtering server-side for remote app-server sessions
so local path normalization does not incorrectly hide remote results.
- Updates snapshots and config schema for the new picker states and
config option.

## How to Test

1. Start Codex in a repo with several saved sessions.
2. Press `Ctrl+R` or use another resume picker entry point.
3. Confirm the picker opens in dense mode and shows session name or
preview, timestamp, cwd, and branch metadata.
4. Press `Ctrl+O` and confirm it switches between dense and comfortable
views.
5. Restart Codex and confirm the selected view persists.
6. Type a query that matches a branch, cwd, thread id, or session name;
confirm matching sessions appear.
7. Press `Esc` while the query is non-empty and confirm it clears search
instead of starting a new session.
8. Select a session and press `Space`; confirm recent transcript context
expands inline.
9. Press `Ctrl+T`; confirm the transcript overlay opens and respects
raw-reasoning visibility settings.

Targeted tests:
- `cargo test -p codex-tui resume_picker --no-fail-fast`
- `cargo test -p codex-core
runtime_config_resolves_session_picker_view_default_and_override`
- `cargo test -p codex-core profile_tui_rejects_unsupported_settings`
- `cargo check -p codex-thread-manager-sample`
- `cargo insta pending-snapshots`
2026-05-05 13:32:54 -07:00