Commit Graph

6281 Commits

Author SHA1 Message Date
Edward Frazer
4666da6c7a refactor: remove queued follow-up leftovers 2026-05-11 16:34:30 -07:00
Edward Frazer
ad7f66a284 refactor: simplify queued turn start plumbing 2026-05-11 15:52:22 -07:00
Edward Frazer
6ec5e2626d refactor: move queued turn dispatch into core 2026-05-11 13:08:51 -07:00
Edward Frazer
0c81f7874c feat: add durable app-server queued turns 2026-05-11 01:03:23 -07:00
Tom
79ad209ce6 [codex] Remove remote thread store implementation (#21596)
Remove the remote thread-store backend and checked-in protobuf
artifacts. We've moved these into another crate that link against this
one.

Also remove the config settings for thread store backend selection,
since we'll instead pass an instantiated thread store into the core-api
crate's main entrypoint.
2026-05-08 00:02:46 +00:00
starr-openai
a3de5bde6e Add stdio exec-server client transport (#20664)
## Why

Configured environments need to connect to exec-server instances that
are not necessarily already listening on a websocket URL. A
command-backed stdio transport lets Codex start an exec-server process,
speak JSON-RPC over its stdio streams, and clean up that child process
with the client lifetime.

**Stack position:** this is PR 2 of 5. It builds on the server-side
stdio listener from PR 1 and provides the client transport used by later
environment/config PRs.

## What Changed

- Add `ExecServerTransport` variants for websocket URLs and stdio shell
commands.
- Add stdio command connection support for `ExecServerClient`.
- Move websocket/stdio transport setup into `client_transport.rs` so
`client.rs` stays focused on shared JSON-RPC client, session, HTTP, and
notification behavior.
- Tie stdio child process cleanup to the JSON-RPC connection lifetime
with a RAII lifetime guard.
- Keep existing websocket environment behavior by adapting URL-backed
remotes to `ExecServerTransport::WebSocketUrl`.

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- **2. This PR:** https://github.com/openai/codex/pull/20664 - Add stdio
exec-server client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack and then
refactored to separate transport setup from the base client.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 23:48:50 +00:00
Zanie Blue
79154e6952 Use --locked in cargo build and lint invocations (#21602)
This ensures CI fails if the committed lockfile is outdated
2026-05-07 23:14:18 +00:00
William Woodruff
893038f77c [codex] Apply a Dependabot cooldown of 7 days (#21599)
This adds 7-day cooldowns to all of our Dependabot ecosystem blocks. Our
Dependabot runs will continue at the same cadence as before, but the
scheduled PRs will no suggest updates that are fewer than 7 days old
themselves. This serves two purposes: to let dependencies "bake" for a
bit in terms of stability before we adopt them, and to give third-party
security services/tooling a chance to detect and revoke malware.

This should have no functional changes/consequences besides how rapidly
we get (non-security) updates. Dependabot security PRs can still be
scheduled and will bypass the cooldown.
2026-05-07 16:07:46 -07:00
bbrown-oai
31b233c7c6 codex-otel: add configurable trace metadata (#21556)
Add Codex config for static trace span attributes and structured W3C
tracestate field upserts. The config flows through OtelSettings so
callers can attach trace metadata without touching every span call site.

Apply span attributes with an SDK span processor so every exported
trace span carries the configured metadata. Model tracestate as nested
member fields so configured keys can be upserted while unrelated
propagated state in the same member is preserved.

Validate configured tracestate before installing provider-global state,
including header-unsafe values the SDK does not reject by itself. This
keeps Codex from propagating malformed trace context from config.

Update the config schema, public docs, and OTLP loopback coverage for
config parsing, span export, propagation, and invalid-header rejection.
2026-05-07 16:06:57 -07:00
Owen Lin
0d0835dd53 feat(app-server, threadstore): Thread pagination APIs and ThreadStore contract (#21566)
## Why
The goal of this PR is to align on app-server and `ThreadStore` API
updates for paginating through large threads.


#### app-server
##### `thread/turns/list`
- Updates `thread/turns/list` to support `itemsView?: "notLoaded" |
"summary" | "full" | null`, defaulting to `summary`.
- Implements the current `thread/turns/list` behavior over the existing
persisted rollout-history fallback:
  - `notLoaded` returns turn envelopes with empty `items`.
- `summary` returns the first user message and final assistant message
when available.
  - `full` preserves the existing full item behavior.

Note that this method still uses the naive approach of loading the
entire rollout file, and returns just the filtered slice of the data.
Real pagination will come later by leveraging SQLite.

##### `thread/turns/items/list`
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.

#### ThreadStore
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.
- Adds `ThreadStore` contract types and stubbed methods for listing
thread turns and listing items within a turn.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.

This also sketches the storage abstraction we expect to need once turns
are indexed/stored. In particular, `notLoaded` is useful only if
ThreadStore can eventually list turn metadata without loading every
persisted item for each turn.

## Validation

- Added/updated protocol serialization coverage for the new request and
response shapes.
- Added app-server integration coverage for `thread/turns/list` default
summary behavior and all three `itemsView` modes.
- Added app-server integration coverage that `thread/turns/items/list`
returns the expected unsupported JSON-RPC error when experimental APIs
are enabled.
- Added thread-store coverage that the default trait methods return
`ThreadStoreError::Unsupported`.

No developers.openai.com documentation update is needed for this
internal experimental app-server API surface.
2026-05-07 15:44:43 -07:00
Charlie Marsh
54ef99a365 Disable empty Cargo test targets (#21584)
## Summary

`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 15:44:17 -07:00
Aria Desires
80a8563e48 Ensure all mentions of cargo-install are --locked (#21592)
There's already a preference for this in the codebase, but a few of them
have drifted away. Generally `--locked` is preferred to reduce exposure
to supply-chain attacks (and just generally improve reproducibility).

In an ideal world these dependencies would maybe even be pinned to
versions but Cargo is kinda bad at that for devtools. Still better to
use --locked than not.
2026-05-07 15:30:37 -07:00
William Woodruff
8abcc5357d [codex] Fully qualify hash-pins in GitHub Actions (#21436)
This builds on top of https://github.com/openai/codex/pull/15828 by
ensuring that hash-pinned actions with version comments are fully
qualified, rather than referencing floating/mutable comments like "v7".
This makes actions management tools behave more consistently.

This shouldn't break anything, since it's comment only. But if it does,
ping ww@ 🙂
2026-05-07 14:31:20 -07:00
Zanie Blue
27ec488ad5 Add a Cargo build profile for benchmarking (#21574)
A clean release build takes ~18m and an incremental build takes ~12m.
This is far too slow to iterate on performance related changes and the
build time is dominated by LTO.

This pull request adds a `profiling` profile for Cargo which takes ~13m
clean and ~6m incremental, the primary change is that LTO is disabled.
This matches a profile used in uv and follows the great work at
https://github.com/astral-sh/uv/pull/5955 — there's a bit of commentary
there about the trade-offs this implies.

We've found that this does not inhibit the ability to accurately
benchmark as measurements with LTO disabled are generally consistent
with the results with LTO enabled and it makes it much faster (~2x) to
rebuild after making a change.

This is motivated by my interest in improving Codex TUI performance,
which is blocked by the tragically builds right now.

I tested incremental build times by making a no-op change to the
`codex-cli` crate.
2026-05-07 14:30:35 -07:00
Zanie Blue
8367ef4522 Use descriptive names for Cargo profile options (#21582)
These are equivalent and their intent is clearer, e.g., I was confused
if `debug = 1` meant the same thing as `debug = true` (it does not).
2026-05-07 14:19:32 -07:00
iceweasel-oai
163eac9306 Grant sandbox users access to desktop runtime bin (#21564)
## Why

Codex desktop copies bundled Windows binaries out of `WindowsApps` into
a LocalAppData runtime cache before launching `codex.exe`. Sandboxed
commands can then need to execute helpers from that cache, but the
sandbox user group may not have read/execute access to the runtime bin
directory.

This makes the Windows sandbox refresh path repair that access directly
so the packaged desktop runtime remains usable from sandboxed sessions.

## What changed

- Added `setup_runtime_bin` to locate `%LOCALAPPDATA%\OpenAI\Codex\bin`,
matching the desktop bundled-binaries destination path, with the same
`USERPROFILE\AppData\Local` fallback shape.
- During refresh setup, check whether `CodexSandboxUsers` already has
read/execute access to the runtime bin directory.
- If access is missing, grant `CodexSandboxUsers` `OI/CI/RX` inheritance
on that directory.
- If the runtime bin directory does not exist, no-op cleanly.

## Verification

- `cargo build -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- `cargo test -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- Manual Windows ACL exercise against the installed packaged runtime
bin:
- existing inherited `CodexSandboxUsers:(I)(OI)(CI)(RX)` no-ops without
changing SDDL
- after disabling inheritance and removing the group ACE, setup adds
`CodexSandboxUsers:(OI)(CI)(RX)`
- with `LOCALAPPDATA` pointed at a fake location without
`OpenAI\Codex\bin`, setup exits successfully and does not create the
directory
- restored the real runtime bin with inherited ACLs and confirmed the
final SDDL matched the baseline exactly
2026-05-07 11:38:10 -07:00
Tom
4242bba2eb Route ThreadManager rollout path reads through thread store (#21265)
- Route ThreadManager rollout-path resume/fork through ThreadStore
history reads.
- Add in-memory store coverage proving path-addressed reads are used.

This isn't strictly necessary for the ThreadStore migration, since these
ThreadManager methods _only_ work for path-based lookups, but I'm trying
to migrate all the rollout recorder callsites to use the threadstore
were possible for consistency.
2026-05-07 11:25:25 -07:00
Tom
0274398901 [codex] Fix pathless thread summaries (#21266)
## Summary

Fix `getConversationSummary` so thread-id summaries work for stored
threads that do not have a local rollout path, such as remote thread
stores.

The root cause was that `summary_from_stored_thread` returned `None`
when `StoredThread.rollout_path` was absent, and
`get_thread_summary_response_inner` treated that as an internal error.
This made conversation-id lookups depend on a local-only field even
though the thread store can address the thread by id.
2026-05-07 11:18:16 -07:00
Tom
56823ec46b Move thread name edits to ThreadStore (#21264)
- Route live thread renames through `ThreadStore` metadata updates.
- Read resumed thread names from store metadata with legacy local
fallback preserved in the store.
2026-05-07 11:12:22 -07:00
Charlie Marsh
0dc1885a5c Upgrade cargo-shear to 1.11.2 (#21547)
## Summary

Catches a few additional dependencies (`sha2`, `url`) that should be in
`dev-dependencies`.
2026-05-07 11:07:18 -07:00
pakrym-oai
566f2cb612 [codex] Move tool specs onto handlers (#21461)
## Why

This is the next stacked step after deleting the tool-handler kind
indirection. Specs should come from the registered handlers themselves
so registry construction has a single source of truth for handler
behavior and exposed tool definitions.

## What changed

- Added `ToolHandler::spec()` plus handler-provided parallel/code-mode
metadata, and made `ToolRegistryBuilder::register_handler` automatically
collect specs from registered handlers.
- Moved builtin tool spec construction into the corresponding handlers
and their adjacent `_spec` modules, including shell, unified exec, apply
patch, view image, request plugin install, tool search, MCP resource,
goals, planning, permissions, agent jobs, and multi-agent tools.
- Reworked configurable handlers to receive their tool-building options
through constructors, with non-optional handler options where the
handler is always spec-backed. Shell fallback handlers keep an explicit
no-spec mode because they are also registered as hidden dispatch
aliases.
- Kept `CodeModeExecuteHandler` on the explicit configured wrapper so
the code-mode exec spec can still be built from the nested registry.

## Verification

- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `cargo test -p codex-core tools::handlers::multi_agents_spec::tests`
- `RUST_MIN_STACK=16777216 cargo test -p codex-core
tools::handlers::multi_agents::tests`
- `cargo test -p codex-core tools::handlers::apply_patch::tests`
- `cargo test -p codex-core tools::handlers::unified_exec::tests`
- `just fix -p codex-core`
- `git diff --check`
2026-05-07 10:48:36 -07:00
jif-oai
eb0462f2af app-server: refresh live threads from latest config snapshot (#21187)
## Why

App-server config writes were leaving existing threads partially stale.
After a config mutation, the app-server told each live thread to run
`Op::ReloadUserConfig`, but that path only re-read the user
`config.toml` layer. Settings that came from the app-server's
materialized config snapshot did not propagate to existing threads until
restart.

This change prevent a FS access from `core` for CCA.

## What changed

- add `CodexThread::refresh_runtime_config()` and
`Session::refresh_runtime_config()` so the app-server can push a freshly
rebuilt config snapshot into a live thread
- rebuild the latest config with each thread's `cwd` after config
mutations, then refresh the thread from that snapshot instead of asking
it to reload only `config.toml`
- keep session-static settings unchanged during refresh, while updating
runtime-refreshable state such as the config layer stack,
`tool_suggest`, and derived hook/plugin/skill state
- keep `reload_user_config_layer()` as the file-backed fallback for
legacy local reload flows, but route the shared refresh logic through
the new runtime refresh path

## Testing

- add a session test that verifies `refresh_runtime_config()` rebuilds
hooks from refreshed config
- add a session test that verifies runtime-refreshable fields update
while session-static settings like `model` and `notify` stay unchanged

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 19:22:04 +02:00
Owen Lin
129401df43 add top-level remote-control command (#21424)
## Summary

`codex --enable remote_control app-server --listen off` is the current
way to start a headless, remote-controllable app-server, but it is hard
to remember and exposes implementation details.

This adds `codex remote-control` as a friendly top-level wrapper for
that flow. The command starts a foreground app-server with local
transports disabled and enables `remote_control` only for that
invocation.

## Changes

- Add a visible `codex remote-control` CLI subcommand.
- Launch app-server with `AppServerTransport::Off`.
- Append `features.remote_control=true` after root feature toggles so
the explicit command wins over `--disable remote_control`.
- Reject root `--remote` / `--remote-auth-token-env`, matching other
non-TUI subcommands.
- Add tests for parsing, launch defaults, override ordering, and remote
flag rejection.

## Verification

- `cargo test -p codex-cli`
- `just fix -p codex-cli`
2026-05-07 10:17:07 -07:00
pakrym-oai
857e731478 [codex] Remove string-keyed MCP tool maps (#21454)
## Summary

This PR removes the synthetic `HashMap<String, ToolInfo>` keys from MCP
tool discovery. `McpConnectionManager::list_all_tools()` now returns
normalized `Vec<ToolInfo>`, and downstream code derives identity from
`ToolInfo::canonical_tool_name()`.

The motivation is to keep model-visible tool identity on
`ToolName`/`ToolInfo` instead of parallel string map keys, so future
namespace changes do not have to preserve otherwise-unused lookup keys.

## Changes

- Rename the MCP normalization path from `qualify_tools` to
`normalize_tools_for_model` and return tool values directly.
- Flow MCP tool lists through connectors, plugin injection, router/spec
building, code mode, and tool search as vectors/slices.
- Keep direct/deferred subtraction local to `mcp_tool_exposure`, using
`ToolName` values.
- Update tests to compare `ToolName` instances where MCP identity
matters.

## Validation

- `cargo test -p codex-mcp test_normalize_tools`
- `cargo test -p codex-core mcp_tool_exposure`
- `cargo test -p codex-core
direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
2026-05-07 10:16:10 -07:00
xl-openai
114bac1409 feat: Expose plugin share metadata in shareContext (#21495)
Extends PluginSummary.shareContext with shareUrl and reader shareTargets
2026-05-07 10:07:03 -07:00
rhan-oai
3444b0d60a [codex-analytics] add tool review event schema (#18747)
## Why

We want to emit terminal review analytics for tool-related approval
flows, but the event contract needs to exist before the reducer can
publish anything.

This PR is the schema-only slice for the Codex review event family.

## What changed

- add the `ReviewEvent` analytics envelope in
`codex-rs/analytics/src/events.rs`
- define the review subject kind, reviewer, trigger, terminal status,
and post-review resolution enums
- define the review event payload with thread, turn, item, lineage,
tool, and timing fields that the emitter stack will populate

## Verification

- stacked verification in dependent PRs: `cargo test -p codex-analytics
analytics_client_tests --manifest-path codex-rs/Cargo.toml`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18747).
* #18748
* #21434
* __->__ #18747
* #17090
* #17089
* #20514
2026-05-07 09:46:46 -07:00
jif-oai
9b6c6f7a01 fix: preserve exact turn diffs after partial apply_patch failures (#21518)
## Why

Follow-up to #21180: turn diffs are operation-backed now, but a failed
`apply_patch` can still leave exact filesystem mutations behind. For
example, a move can write the destination file before failing to remove
the source. Treating the whole call as unknowable then drops a change
that Codex actually knows happened, so the emitted turn diff can drift
from the workspace.

## What changed

-
[`apply-patch`](f55724e027/codex-rs/apply-patch/src/lib.rs (L248-L345))
now returns `ApplyPatchFailure` with the exact committed prefix
accumulated before an error. If a write failure may already have mutated
the target, the delta is marked inexact instead of being reused blindly.
- Move handling now records the destination write before attempting
source removal, so a partially failed move can still report the
destination file that definitely landed
([code](f55724e027/codex-rs/apply-patch/src/lib.rs (L463-L521))).
-
[`ApplyPatchRuntime`](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L49-L67))
now accumulates committed deltas across attempts and forwards them even
when the visible tool result is failed or sandbox-denied ([runtime
path](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L223-L250)),
[event
path](f55724e027/codex-rs/core/src/tools/events.rs (L215-L225))).
- `TurnDiffTracker` now consumes committed exact deltas rather than only
fully successful patches; exact-empty failures leave the aggregate
unchanged, while inexact deltas still invalidate it.

## Verification

- Added a regression test covering a failed move that still emits the
committed destination diff:
[`apply_patch_failed_move_preserves_committed_destination_diff`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1517-L1586)).
- Kept explicit coverage that an inexact delta clears the aggregate
instead of publishing a guessed diff:
[`apply_patch_clears_aggregated_diff_after_inexact_delta`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1589-L1655)).

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 18:05:45 +02:00
Ruslan Nigmatullin
e64a8979b0 device-key: clean up unused crate (#21487) 2026-05-07 09:01:44 -07:00
pakrym-oai
acac786d91 [codex] add account id to feedback uploads (#21498)
## Why

Feedback uploads already carry auth-derived context like
`chatgpt_user_id`, but they do not include the authenticated
workspace/account id. Adding `account_id` makes feedback triage easier
when a user can operate across multiple ChatGPT workspaces.

## What changed

- emit auth-derived `account_id` into feedback tags in `app-server`
before the feedback snapshot is uploaded
- preserve that tag through `codex-feedback` upload tag assembly
alongside the existing merge behavior for other tags
- extend `codex-feedback` coverage to assert that snapshot-derived
`account_id` is present in uploaded tags

## Verification

- `cargo test -p codex-feedback
upload_tags_include_client_tags_and_preserve_reserved_fields`
- `cargo test -p codex-app-server --lib feedback_processor`
2026-05-07 08:45:16 -07:00
jif-oai
f7e8ff8e50 Make turn diff tracking operation backed (#21180)
## Summary
- replace filesystem-based turn diff tracking with an operation-backed
accumulator
- preserve enough verified apply_patch state to render move-overwrite
cases correctly
- keep the turn/diff/updated contract intact while removing remote-only
turn-diff test skips

This takes the assumption that no 3P services rely on the output format
of `apply_patch`

## Why
For the CCA file system isolation push

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 11:33:47 +02:00
jif-oai
b2268999fe feat: make built-in MCPs first-class runtime servers (#21356)
## DISCLAIMER
This is experimental and no production service must rely on this

## Why

Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.

## What changed

- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.

## Test plan

- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 10:36:32 +02:00
Abhinav
40e282849c Show plugin hooks in plugin details (#21447)
Supersedes the abandoned #19859, rebuilt on latest `main`.

# Why

PR #19705 adds discovery for hooks bundled with plugins, but `/plugins`
still only shows skills, apps, and MCP servers. This follow-up makes
bundled hooks visible in the same plugin detail view so users can
inspect the full plugin surface in one place.

We also need `PluginHookSummary` to populate Plugin Hooks in the app;
`hooks/list` is not enough there because plugin detail needs to show
hooks for disabled plugins too.

# What

- extend `plugin/read` with `PluginHookSummary` entries for bundled
hooks
- summarize plugin hooks while loading plugin details
- render a `Hooks` row in the `/plugins` detail popup

<img width="3456" height="848" alt="CleanShot 2026-04-27 at 11 45 34@2x"
src="https://github.com/user-attachments/assets/fe3a38d6-a260-4351-8513-fb04c93d725b"
/>
2026-05-07 00:21:14 -07:00
xli-oai
898f5bfeaa [codex] fix PluginListParams test initializer (#21494)
## Summary
- update the app-server protocol test fixture to include the required
`marketplace_kinds` field on `PluginListParams`

## Why
`PluginListParams` now requires `marketplace_kinds`, but a later-added
test fixture in `common.rs` still constructed the older shape with only
`cwds`. That stale initializer breaks the main build with `missing field
marketplace_kinds`.

## Impact
This is a test-only repair. It restores compilation without changing the
JSON-RPC schema or runtime behavior.

## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
2026-05-06 23:58:26 -07:00
pakrym-oai
a8488fec5e Revert state DB injection and agent graph store (#21481)
## Why

Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.

## What changed

- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.

## Validation

- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.
2026-05-06 22:48:29 -07:00
xli-oai
5bc33fe31f [codex] Parallelize skills list cwd loading (#21441)
## Summary
- process `skills/list` cwd entries with bounded concurrency of 5
- preserve the caller's requested cwd order in the response
- add coverage that verifies response ordering remains stable

## Why
Cold-start desktop traces showed that `skills/list` can dominate the
shared config queue when it scans many workspace roots serially. The
expensive work is largely independent per cwd, so the request was paying
the sum of all cwd costs instead of the cost of the slowest bounded
batch.

## Impact
This keeps current request semantics intact while reducing the
wall-clock time of large multi-root `skills/list` calls. That should
also reduce how long later config-family requests, such as
`plugin/list`, wait behind `skills/list` during startup.

## Validation
- `just fmt`
- `cargo test -p codex-app-server`
- `cargo test -p codex-app-server
skills_list_preserves_requested_cwd_order`
2026-05-06 21:25:24 -07:00
xli-oai
05cd5c313e [codex] allow shared config reads in app-server queue (#21340)
## Summary
- add a shared-read serialization mode for global app-server request
families
- let consecutive leading shared reads for the same family run together
while keeping exclusive requests ordered
- mark only `skills/list`, `config/read` and `plugin/list` as shared
reads for now

## Why
`skills/list` and `plugin/list` are read-only config-family requests,
but the app-server queue currently treats every config request as
exclusive. That means one long `skills/list` can make a later
`plugin/list` wait even though the two requests do not mutate config.

This change keeps the existing queue order but lets adjacent reads
overlap. If a write is already waiting, later reads still stay behind
it, so writes do not starve.

## Scope
This intentionally keeps the first pass narrow:
- shared reads: `skills/list`, `plugin/list`
- still exclusive: `plugin/install`, `marketplace/*`,
`skills/config/write`, `config/*write`, `config/read`, and the rest of
the config family

## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`

## Desktop verification
I ran the dev desktop app against this branch's built binary with the
existing UI timing logs enabled. The app did use
`/Users/xli/code/codex_6/codex-rs/target/debug/codex`.

The new scheduler behavior works, but this narrow change does not remove
every cold-start delay: in the observed trace, an earlier exclusive
`config/read` was already queued ahead of the later `skills/list` and
`plugin/list` requests, so the page-open plugin requests still waited
behind that earlier exclusive config-family request before they could
run together.

That means this PR is the scheduler primitive needed for shared reads,
not the complete end-to-end latency fix by itself.

## Not run
- full workspace test suite, because repo policy requires explicit
approval before running it after touching `app-server-protocol`
2026-05-06 21:16:31 -07:00
mifan-oai
001363188a [codex] Add OpenAI Developers to tool suggest allowlist (#21423)
## Summary

Add `openai-developers@openai-curated` to
`TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` so the OpenAI Developers
plugin can be surfaced through tool suggestions once it is available in
the Built by OpenAI marketplace.

Update the discoverable plugin test fixture to assert the plugin is
returned from the curated marketplace allowlist path.

## Validation

- `cargo fmt --check` passed; rustfmt emitted the existing
stable-channel warnings about `imports_granularity`.
- `cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_uninstalled_curated_plugins`
passed.
2026-05-06 23:49:15 -04:00
pakrym-oai
e394625ea2 [codex] Delete tool handler plan indirection (#21427)
## Why

The spec split in the parent PR still left an intermediate registry plan
that recorded `ToolHandlerKind` values and translated them into concrete
handlers later. That kept tool registration dependent on static enum
bookkeeping instead of registering handlers from the same code that
assembles their specs.

## What Changed

- Make `build_tool_registry_builder` register concrete handlers directly
while adding specs.
- Add small `ToolRegistryBuilder` helpers for spec augmentation and
nested code-mode inspection.
- Remove `ToolHandlerKind`, `ToolHandlerSpec`, and `ToolRegistryPlan`.
- Update spec-plan tests to assert against the built `ToolRegistry`
instead of static handler descriptors.

## Validation

- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `just fix -p codex-core`
2026-05-06 20:36:24 -07:00
Felipe Coury
5a4b2702f2 fix(tui): clear first inline viewport render (#21450)
## Why

The alpha TUI can render the initial trust-directory prompt with stale
terminal text showing through spaces when startup begins below existing
shell output. The first inline viewport transition can happen while the
previous viewport is still empty, so the old clear path no-ops before
Ratatui draws the prompt. Ratatui then skips blank cells because its
previous buffer also thinks those cells are blank, leaving old terminal
contents visible inside the prompt.

## What Changed

- Clear from the new inline viewport top when the previous viewport is
empty during a viewport transition.
- Keep the existing clear-from-old-viewport behavior for normal viewport
updates.
- Add a VT100-backed regression test that pre-fills terminal contents,
performs the first viewport clear, and verifies stale text inside the
new viewport is removed while shell content above the viewport remains.

## How to Test

1. Start Codex alpha in a terminal that already has visible shell output
above the cursor.
2. Use a fresh untrusted project directory so the trust-directory prompt
appears.
3. Confirm the prompt text renders cleanly, with spaces staying blank
instead of showing fragments of previous shell output.
4. As a regression check, confirm content above the inline viewport is
still preserved in terminal scrollback.

Targeted tests:
- `cargo test -p codex-tui
first_viewport_change_clears_from_new_viewport_when_old_viewport_is_empty
-- --nocapture`
- `cargo test -p codex-tui`
2026-05-07 02:48:49 +00:00
pakrym-oai
103dc2b6ae Revert "Move skills watcher to app-server" (#21460)
Reverts openai/codex#21287
2026-05-07 02:24:20 +00:00
Andrei Eternal
527d52df03 Add compact lifecycle hooks (started by vincentkoc - external contrib) (#19905)
Based on work from Vincent K -
https://github.com/openai/codex/pull/19060

<img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
/>

## Why

Compaction rewrites the conversation context that future model turns
receive, but hooks currently have no deterministic lifecycle point
around that rewrite. This adds compact lifecycle hooks so users can
audit manual and automatic compaction, surface hook messages in the UI,
and run post-compaction follow-up without overloading tool or prompt
hooks.

## What Changed

- Added `PreCompact` and `PostCompact` hook events across hook config,
discovery, dispatch, generated schemas, app-server notifications,
analytics, and TUI hook rendering.
- Added trigger matching for compact hooks with the documented `manual`
and `auto` matcher values.
- Wired `PreCompact` before both local and remote compaction, and
`PostCompact` after successful local or remote compaction.
- Kept compact hook command input to lifecycle metadata: session id,
Codex turn id, transcript path, cwd, hook event name, model, and
trigger.
- Made compact stdout handling consistent with other hooks: plain stdout
is ignored as debug output, while malformed JSON-looking stdout is
reported as failed hook output.
- Added integration coverage for compact hook dispatch, trigger
matching, post-compact execution, and the audited behavior that
`decision:"block"` does not block compaction.

## Out of Scope

- Hook-specific compaction blocking is not implemented;
`decision:"block"` and exit-code-2 blocking semantics are intentionally
unsupported for `PreCompact`.
- Custom compaction instructions are not exposed to compact hooks in
this PR.
- Compact summaries, summary character counts, and summary previews are
not exposed to compact hooks in this PR.

## Verification

- `cargo test -p codex-hooks`
- `cargo test -p codex-core
manual_pre_compact_block_decision_does_not_block_compaction`
- `cargo test -p codex-app-server hooks_list`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-tui hooks_browser`

## Docs

The developer documentation for Codex hooks should be updated alongside
this feature to document `PreCompact` and `PostCompact`, the
`manual`/`auto` matcher values, and the compact hook payload fields.

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-05-06 18:08:31 -07:00
xl-openai
11106016ff feat: Add marketplace source filtering and plugin share context (#21419)
Adds marketplaceKinds to plugin/list for local, workspace-directory, and
shared-with-me; omitted params keep default local plus gated global
behavior, while explicit kinds are exact.

Exposes shareContext on plugin summaries from local share mappings and
remote workspace/shared responses, including remotePluginId and nullable
creator metadata.

Adds shared-with-me listing through /ps/plugins/workspace/shared,
renames the workspace remote namespace to workspace-directory, and keeps
direct remote read/share/install/update/delete paths gated by plugins
rather than remote_plugin.
2026-05-06 16:12:23 -07:00
pakrym-oai
9417cf9696 [codex] Move tool specs into core handlers (#21416)
## Why

This is the first mechanical slice of moving tool spec ownership toward
the handlers. `codex-tools` should keep shared primitives and conversion
helpers, while builtin tool specs and registration planning live in
`codex-core` with the handlers that own those tools.

Keeping this PR to relocation and import updates isolates the copy/move
review from the later logic change that wires specs through registered
handlers.

## What changed

- Moved builtin tool spec constructors from `codex-rs/tools/src` into
`codex-rs/core/src/tools/handlers/*_spec.rs` or nearby core tool
modules.
- Moved the registry planning code into
`codex-rs/core/src/tools/spec_plan.rs` and its associated types/tests
into core.
- Kept shared primitives in `codex-tools`, including `ToolSpec`,
schema/types, discovery/config primitives, dynamic/MCP conversion
helpers, and code-mode collection helpers.
- Updated handlers that referenced moved argument types or tool-name
constants to use the core spec modules.
- Moved spec tests next to the moved spec modules.

## Verification

- `cargo check -p codex-tools`
- `cargo check -p codex-core`
- `cargo test -p codex-tools`
- `cargo test -p codex-core _spec::tests`
- `cargo test -p codex-core tools::spec_plan::tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`

Note: I also tried the broader `cargo test -p codex-core tools::`; it
reached the moved spec-plan/spec tests successfully, then aborted with a
stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`,
which is outside this spec relocation.
2026-05-06 15:40:50 -07:00
pakrym-oai
d5eea229cc Move skills watcher to app-server (#21287)
## Why

Skills update notifications are app-server API behavior, but the watcher
lived in `codex-core` and surfaced through
`EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
focused on thread execution and lets app-server own both cache
invalidation and the `skills/changed` notification.

## What changed

- Added an app-server-owned skills watcher that watches local skill
roots, clears the shared skills cache, and emits `skills/changed`
directly.
- Registers skill watches from the common app-server thread listener
attach path, including direct starts, resumes, and app-server-observed
child or forked threads.
- Stores the `WatchRegistration` on `ThreadState`, so listener
replacement, thread teardown, idle unload, and app-server shutdown
deregister by dropping the RAII guard.
- Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
old core live-reload test.
- Extended the app-server skills change test to verify a cached skills
list is refreshed after a filesystem change without forcing reload.

## Validation

- `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-rollout -p codex-rollout-trace`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`
2026-05-06 15:38:11 -07:00
Brian Henzelmann
8f5d68f9d2 Document Codex git commit attribution config (#21379)
## Summary
- document that commit attribution for generated git commit messages is
gated by the `codex_git_commit` feature flag
- add an example `config.toml` snippet showing `commit_attribution` with
`[features].codex_git_commit = true`
- update the config schema description so the reference docs explain
that `commit_attribution` only takes effect when the feature is enabled

Fixes #19799.

## Validation
- `cargo run -p codex-core --bin codex-write-config-schema`
- `cargo test -p codex-config`
- `cargo test -p codex-features`
- `cargo fmt --check`
- `git diff --check`

## Notes
- `cargo test -p codex-core config_schema_matches_fixture` currently
fails before reaching the schema test because `core_test_support`
imports `similar` without a linked crate in this checkout. The narrower
package checks above avoid that unrelated test-support build failure.
2026-05-06 16:14:50 -05:00
iceweasel-oai
123e78b97b [codex] Fix Windows sandbox git safe.directory for worktrees (#21409)
## Why

Windows sandboxed commands run as a sandbox user, while workspace
repositories are usually owned by the real user. The sandbox compensates
by injecting a temporary Git `safe.directory` entry into the child
environment.

That injection was still broken for linked worktrees because the helper
followed the `.git` file's `gitdir:` pointer and injected the internal
`.git/worktrees/...` location. Git's dubious-ownership check expects the
worktree root instead, so sandboxed Git commands still failed in
worktree-based Codex checkouts.

## What changed

- Treat any `.git` marker, directory or file, as the worktree root for
`safe.directory` injection.
- Keep the safe-directory logic in
`windows-sandbox-rs/src/sandbox_utils.rs` and have the one-shot elevated
path reuse it.
- Add regression coverage for both normal `.git` directories and
gitfile-based worktrees.

## Validation

- `cargo test -p codex-windows-sandbox sandbox_utils::tests`
- `cargo test -p codex-windows-sandbox` built and ran; the new
`sandbox_utils` tests passed, while two pre-existing legacy sandbox
tests failed locally with `Access is denied`:
`session::tests::legacy_non_tty_cmd_emits_output` and
`spawn_prep::tests::legacy_spawn_env_applies_offline_network_rewrite`.
2026-05-06 14:08:45 -07:00
rhan-oai
fbdbc6b2fe [codex-analytics] emit tool item events from item lifecycle (#17090)
## Why

After the tool-item schemas are in place, analytics needs to emit them
from the app-server item lifecycle rather than requiring bespoke
tracking at each callsite. The reducer should also reuse the shared
thread analytics context introduced below it in the stack so later event
families do not repeat the same reducer joins or missing-state ladder.

## What changed

- Tracks tool-item completion notifications and emits the matching tool
analytics event when a terminal item arrives.
- Derives event-specific payload details for command execution, file
changes, MCP calls, dynamic tools, collaboration tools, web search, and
image generation.
- Denormalizes thread, app-server client, runtime, and subagent
provenance metadata through the shared thread analytics context.
- Adds reducer coverage for item lifecycle emission and subagent
metadata inheritance.

## Duration semantics

`duration_ms` is computed from the app-server item lifecycle timestamps:
`completed_at_ms - started_at_ms`. That makes it the duration of the
lifecycle Codex observed locally, not necessarily the upstream
provider's full execution time.

- Web search usually has a meaningful observed lifecycle because
Responses can send `response.output_item.added` before
`response.output_item.done`; in that case `started_at_ms` comes from the
added event and `completed_at_ms` comes from the done event.
- Image generation can be much less precise. In the current observed
stream, image generation often arrives only as a completed
`response.output_item.done`; when there is no earlier added event, Codex
synthesizes the started item immediately before completion, so
`duration_ms` can be `0` even though upstream image generation took
longer.
- Standalone web search and standalone image generation work is expected
to land after this stack. Those paths may introduce more direct
lifecycle events or timing points, so the current
web-search/image-generation duration semantics should be treated as the
best available item-lifecycle approximation, not the final latency
contract for those tool families.
- `execution_duration_ms` is populated only where the completed item
already carries a native execution duration; otherwise it remains `null`
while `duration_ms` still reflects the local lifecycle interval.

## Currently placeholder / partial fields

Some fields are included in the schema for the intended steady-state
contract, but this PR does not yet populate them from real
approval/review state:

- `review_count`, `guardian_review_count`, and `user_review_count`
currently default to `0`.
- `final_approval_outcome` currently defaults to `unknown`.
- `requested_additional_permissions` and `requested_network_access`
currently default to `false`.

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17090).
* #18748
* #18747
* __->__ #17090
* #17089
* #20514
2026-05-06 20:27:41 +00:00
rhan-oai
21295f47e2 [codex-tui] pass thread source for tui threads (#21401)
## Summary
- mark TUI-created thread starts and forks with explicit `thread_source
= user`
- add focused coverage for embedded and remote lifecycle request
builders

## Why
Thread analytics now consume an explicit thread-level source
classification instead of inferring it from `session_source`. The TUI
still omitted that field, so TUI-created interactive threads would
continue to land as `null` even after the new analytics plumbing
shipped.

## Validation
- `cargo test -p codex-tui app_server_session --lib`
2026-05-06 13:18:41 -07:00
pakrym-oai
b9c50a53d7 [codex] Split tool handlers into separate files (#21395)
## Why

Several tool handler modules still bundled multiple `ToolHandler`
implementations in one file. That made the handler directory harder to
navigate and made otherwise local handler edits land in large shared
modules.

## What

- Split grouped tool handlers into one handler file each for agent jobs,
goals, MCP resources, shell tools, and unified exec.
- Kept shared parsing, payload, and runtime helpers in the existing
parent modules, with re-exports preserving the existing handler import
paths.
- Updated the shell handler tests to construct `ShellCommandHandler`
through the existing `ShellCommandBackendConfig` conversion now that the
backend detail lives with the shell-command handler.

## Validation

- `cargo check -p codex-core`
- `cargo clippy -p codex-core --lib -- -D warnings`
- `git diff --check -- codex-rs/core/src/tools/handlers`

Targeted `codex-core` handler tests did not run locally because
`core_test_support` currently fails to compile before reaching these
tests due to an unresolved `similar` import.
2026-05-06 13:12:24 -07:00
canvrno-oai
d5f0b6d63a [codex] Dedupe fallback model metadata warnings (#21090)
Fixes #21070.

This is a small cleanup around model metadata handling for
gateway/provider model names. It follows the report and proposed
direction from @dkbush by keeping the fallback metadata warning useful
without repeating it every turn, and by tightening the existing
provider-prefix lookup path.

- Track fallback metadata warning slugs in session state so each
unresolved model warns once per session.
- Keep warning emission outside the session-state lock and preserve the
existing warning text.
- Allow one-segment provider prefixes with hyphenated provider IDs,
while preserving the multi-segment rejection behavior.
- Add focused coverage for warning dedupe and hyphenated provider-prefix
metadata matching.

Testing:

- Ran `just fmt`.
- Ran `git diff --check`.
- Added tests for the new warning dedupe and provider-prefix lookup
behavior.
2026-05-06 13:11:44 -07:00