Commit Graph

167 Commits

Author SHA1 Message Date
Owen Lin
0d0835dd53 feat(app-server, threadstore): Thread pagination APIs and ThreadStore contract (#21566)
## Why
The goal of this PR is to align on app-server and `ThreadStore` API
updates for paginating through large threads.


#### app-server
##### `thread/turns/list`
- Updates `thread/turns/list` to support `itemsView?: "notLoaded" |
"summary" | "full" | null`, defaulting to `summary`.
- Implements the current `thread/turns/list` behavior over the existing
persisted rollout-history fallback:
  - `notLoaded` returns turn envelopes with empty `items`.
- `summary` returns the first user message and final assistant message
when available.
  - `full` preserves the existing full item behavior.

Note that this method still uses the naive approach of loading the
entire rollout file, and returns just the filtered slice of the data.
Real pagination will come later by leveraging SQLite.

##### `thread/turns/items/list`
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.

#### ThreadStore
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.
- Adds `ThreadStore` contract types and stubbed methods for listing
thread turns and listing items within a turn.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.

This also sketches the storage abstraction we expect to need once turns
are indexed/stored. In particular, `notLoaded` is useful only if
ThreadStore can eventually list turn metadata without loading every
persisted item for each turn.

## Validation

- Added/updated protocol serialization coverage for the new request and
response shapes.
- Added app-server integration coverage for `thread/turns/list` default
summary behavior and all three `itemsView` modes.
- Added app-server integration coverage that `thread/turns/items/list`
returns the expected unsupported JSON-RPC error when experimental APIs
are enabled.
- Added thread-store coverage that the default trait methods return
`ThreadStoreError::Unsupported`.

No developers.openai.com documentation update is needed for this
internal experimental app-server API surface.
2026-05-07 15:44:43 -07:00
Charlie Marsh
54ef99a365 Disable empty Cargo test targets (#21584)
## Summary

`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 15:44:17 -07:00
rhan-oai
b3d4f1a9f0 [codex-analytics] rework thread_source for thread analytics (#20949)
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification

## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.

Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.

## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.

## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
2026-05-06 02:12:31 +00:00
Ahmed Ibrahim
9d579813bb 1- Add model service tiers metadata (#20969)
## Why

The model list needs to carry display-ready service tier metadata so
clients can render tier choices with stable IDs, names, and
descriptions. A raw speed-tier string list is not enough for richer UI
copy or future tier labels.

## What changed

- Added `ModelServiceTier` to shared model metadata with string `id`,
`name`, and `description` fields.
- Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
empty defaults for older cached model payloads.
- Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
it through TUI app-server model conversion.
- Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
metadata as deprecated in source and generated schema output.
- Regenerated app-server protocol JSON schema and TypeScript fixtures,
including `ModelServiceTier.ts`.

## Verification

- Ran `just write-app-server-schema`.
- Did not run local tests per repo instruction; relying on PR CI.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-05 09:51:18 +03:00
Ruslan Nigmatullin
4950e7d8a6 [codex] Add unsandboxed process exec API (#19040)
## Why

App-server clients sometimes need argv-based local process execution
while sandbox policy is controlled outside Codex. Those environments can
reject sandbox-disabling paths before a command ever starts, even when
the caller intentionally wants unsandboxed execution.

This PR adds a distinct `process/*` API for that use case instead of
extending `command/exec` with another sandbox-disabling shape. Keeping
the new surface separate also makes the future removal of `command/exec`
simpler: clients that need explicit process lifecycle control can move
to the newer handle-based API without depending on `command/exec`
business logic.

## What changed

- Added v2 process lifecycle methods: `process/spawn`,
`process/writeStdin`, `process/resizePty`, and `process/kill`.
- Added process notifications: `process/outputDelta` for streamed
stdout/stderr chunks and `process/exited` for final exit status and
buffered output.
- Made `process/spawn` intentionally unsandboxed and omitted
sandbox-selection fields such as `sandboxPolicy` and
`permissionProfile`.
- Added client-supplied, connection-scoped `processHandle` values for
follow-up control requests and notification routing.
- Supported cwd, environment overrides, PTY mode and size, stdin
streaming, stdout/stderr streaming, per-stream output caps, and timeout
controls.
- Killed active process sessions when the originating app-server
connection closes.
- Wired the implementation through the modular `request_processors/`
app-server layout, with process-handle request serialization for
follow-up control calls.
- Updated generated JSON/TypeScript schema fixtures and documented the
new API in `codex-rs/app-server/README.md`.
- Added v2 app-server integration coverage in
`codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn
acknowledgement before exit, buffered output caps, and process
termination.

## Verification

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`

---------

Co-authored-by: Owen Lin <owen@openai.com>
2026-05-04 16:43:58 -07:00
xli-oai
96d2ea9058 Add remote plugin skill read API (#20150)
## Summary

Adds an app-server `plugin/skill/read` method for remote plugin skill
markdown. The new method calls the plugin-service skill detail endpoint
and returns `skill_md_contents`, so clients can preview skills for
remote plugins before the bundle is installed locally.

## Why

Uninstalled remote plugin skills do not have local `SKILL.md` files.
Without an on-demand remote read, the desktop plugin details UI cannot
render the skill details modal for those skills.

## Validation

- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server --test all --
suite::v2::plugin_read::plugin_skill_read_reads_remote_skill_contents_when_remote_plugin_enabled
--exact`
- `just fix -p codex-app-server-protocol -p codex-core-plugins -p
codex-app-server`
2026-05-01 00:16:25 -07:00
xli-oai
2686873e77 Sync remote installed plugin bundles (#20268)
## Summary
- Download missing remote installed plugin bundles during app-server
startup and plugin/list refresh.
- Upgrade cached remote installed bundles when the backend installed
version changes.
- Remove stale remote installed bundle caches without writing remote
plugin state into config.toml.

## Review note
This is a clean PR branch cut from the current diff on top of latest
`origin/main`. The diff intentionally has no `codex-rs/core/**` files,
so CODEOWNERS should not request the core-directory owner review from
stale PR history.

## Validation
Already run on the source branch before creating this clean PR:
- `just fmt`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-app-server --test all
app_server_startup_sync_downloads_remote_installed_plugin_bundles --
--nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_sync_upgrades_and_removes_remote_installed_plugin_bundles --
--nocapture`
- `cargo test -p codex-app-server --test all
app_server_startup_remote_plugin_sync_runs_once -- --nocapture`
- `just fix -p codex-core-plugins`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-30 16:05:14 -07:00
Abhinav
8774229a89 Add hooks/list app-server RPC (#19778)
## Why

We need a way to list the available hooks to expose via the TUI and App
so users can view and manage their hooks

## What

- Adds `hooks/list` for one or more `cwd` values that returns discovered
hook metadata

## Stack

1. openai/codex#19705
2. This PR - openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Review Notes

The generated schema files account for most of the raw diff, these files
have the core change:

- `hooks/src/engine/discovery.rs` builds the inventory entries during
hook discovery while leaving runtime handlers focused on execution.
- `app-server/src/codex_message_processor.rs` wires `hooks/list` into
the app-server flow for each requested `cwd`.
- `app-server-protocol/src/protocol/v2.rs` defines the new v2
request/response payloads exposed on the wire.

### Core Changes

`core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so
`skills/list` and `hooks/list`can resolve plugin state for each
requested `cwd`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 23:39:57 +00:00
Celia Chen
8c47e36504 feat: expose provider capability bounds to app server clients (#20049)
follow up of #19442. The app server now exposes provider-derived bounds
through a new v2 `modelProvider/read` method. The response reports the
configured provider map key as `modelProvider` and returns the effective
capability booleans so clients can align their UI with the same
provider-owned limits used by core.
2026-04-29 01:36:19 +00:00
canvrno-oai
4c39ad33cb Fix plugin list workspace settings test isolation (#20086)
Fixes test that often fails locally when running `cargo test`
- Add an app-server test helper that combines managed-config isolation
with custom env overrides.
- Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests
so host home marketplaces do not affect results.
2026-04-28 18:34:38 -07:00
Michael Bolin
ac2bffa443 test: harden app-server integration tests (#19683)
## Why

Windows Bazel runs in the permissions stack exposed that app-server
integration tests were launching normal plugin startup warmups in every
subprocess. Those warmups can call
`https://chatgpt.com/backend-api/plugins/featured` when a test is not
specifically exercising plugin startup, which adds slow background work,
noisy stderr, and dependence on external network state. The relevant
startup/featured-plugin behavior was introduced across #15042 and
#15264.

A few app-server tests also had long optional waits or unbounded cleanup
paths, making failures expensive to diagnose and contributing to slow
Windows shards. One external-agent config test from #18246 used a
GitHub-style marketplace source, which was enough to exercise the
pending remote-import path but also meant the background completion task
could attempt a real clone.

## What Changed

- Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
plumbing and a hidden debug-only
`--disable-plugin-startup-tasks-for-tests` app-server flag, so
integration tests can suppress startup plugin warmups without adding a
production env-var gate.
- Has the app-server test harness pass that hidden flag by default,
while opting plugin-startup coverage back in for tests that
intentionally exercise startup sync and featured-plugin warmup behavior.
- Lowers normal app-server subprocess logging from `info`/`debug` to
`warn` to avoid multi-megabyte stderr output in Bazel logs.
- Prevents the external-agent config test from attempting a real
marketplace clone by using an invalid non-local source while still
exercising the pending-import completion path.
- Bounds optional filesystem/realtime waits and fake WebSocket
test-server shutdown so failures produce targeted timeouts instead of
hanging a shard.
- Fixes the Unix script-resolution test in `rmcp-client` to exercise
PATH resolution directly and include the actual spawn error in failures.

## Verification

- `cargo check -p codex-app-server`
- `cargo clippy -p codex-app-server --tests -- -D warnings`
- `cargo test -p codex-rmcp-client
program_resolver::tests::test_unix_executes_script_without_extension`
- `cargo test -p codex-app-server --test all
external_agent_config_import_sends_completion_notification_after_pending_plugins_finish
-- --nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request --
--nocapture`
- Windows Local Bazel passed with this test-hardening bundle before it
was extracted from #19606.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19683
2026-04-26 12:43:16 -07:00
sayan-oai
c10f95ddac Update models.json and related fixtures (#19323)
Supersedes #18735.

The scheduled rust-release-prepare workflow force-pushed
`bot/update-models-json` back to the generated models.json-only diff,
which dropped the test and snapshot updates needed for CI.

This PR keeps the latest generated `models.json` from #18735 and adds
the corresponding fixture updates:
- preserve model availability NUX in the app-server model cache fixture
- update core/TUI expectations for the new `gpt-5.4` `xhigh` default
reasoning
- refresh affected TUI chatwidget snapshots for the `gpt-5.5`
default/model copy changes

Validation run locally while preparing the fix:
- `just fmt`
- `cargo test -p codex-app-server model_list`
- `cargo test -p codex-core includes_no_effort_in_request`
- `cargo test -p codex-core
includes_default_reasoning_effort_in_request_when_defined_by_model_info`
- `cargo test -p codex-tui --lib chatwidget::tests`
- `cargo insta pending-snapshots`

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
2026-04-24 11:14:13 +02:00
xli-oai
0d6a90cd6b Add app-server marketplace upgrade RPC (#19074)
## Summary
- add a v2 `marketplace/upgrade` app-server RPC that mirrors the
existing configured Git marketplace upgrade path
- expose typed request/response/error payloads and regenerate
JSON/TypeScript schema fixtures
- add app-server integration coverage for all, named, already
up-to-date, and invalid marketplace upgrade requests

## Tests
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server marketplace_upgrade`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fmt`
2026-04-23 13:00:46 -07:00
efrazer-oai
69c8913e24 feat: add explicit AgentIdentity auth mode (#18785)
## Summary

This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode.

An AgentIdentity auth record is a standalone `auth.json` mode. When
`AuthManager::auth().await` loads that mode, it registers one
process-scoped task and stores it in runtime-only state on the auth
value. Header creation stays synchronous after that because the task is
initialized before callers receive the auth object.

This PR also removes the old feature flag path. AgentIdentity is
selected by explicit auth mode, not by a hidden flag or lazy mutation of
ChatGPT auth records.

Reference old stack: https://github.com/openai/codex/pull/17387/changes

## Design Decisions

- AgentIdentity is a real auth enum variant because it can be the only
credential in `auth.json`.
- The process task is ephemeral runtime state. It is not serialized and
is not stored in rollout/session data.
- Account/user metadata needed by existing Codex backend checks lives on
the AgentIdentity record for now.
- `is_chatgpt_auth()` remains token-specific.
- `uses_codex_backend()` is the broader predicate for ChatGPT-token auth
and AgentIdentity auth.

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. This PR: explicit AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate Codex backend
auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
2026-04-21 22:33:24 -07:00
efrazer-oai
be75785504 fix: fully revert agent identity runtime wiring (#18757)
## Summary

This PR fully reverts the previously merged Agent Identity runtime
integration from the old stack:
https://github.com/openai/codex/pull/17387/changes

It removes the Codex-side task lifecycle wiring, rollout/session
persistence, feature flag plumbing, lazy `auth.json` mutation,
background task auth paths, and request callsite changes introduced by
that stack.

This leaves the repo in a clean pre-AgentIdentity integration state so
the follow-up PRs can reintroduce the pieces in smaller reviewable
layers.

## Stack

1. This PR: full revert
2. https://github.com/openai/codex/pull/18871: move Agent Identity
business logic into a crate
3. https://github.com/openai/codex/pull/18785: add explicit
AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate auth callsites
through AuthProvider

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
2026-04-21 14:30:55 -07:00
xli-oai
1dc3535e17 [codex] Add marketplace/remove app-server RPC (#17751)
## Summary

Add a new app-server `marketplace/remove` RPC on top of the shared
marketplace-remove implementation.

This change:
- adds `MarketplaceRemoveParams` / `MarketplaceRemoveResponse` to the
app-server protocol
- wires the new request through `codex_message_processor`
- reuses the shared core marketplace-remove flow from the stacked
refactor PR
- updates generated schema files and adds focused app-server coverage

## Validation

- `just write-app-server-schema`
- `just fmt`
- heavy compile/test coverage deferred to GitHub CI per request
2026-04-19 23:22:49 -07:00
Ahmed Ibrahim
5bb193aa88 Add max context window model metadata (#18382)
Adds max_context_window to model metadata and routes core context-window
reads through resolved model info. Config model_context_window overrides
are clamped to max_context_window when present; without an override, the
model context_window is used.
2026-04-17 21:48:14 -07:00
richardopenai
6b39d0c657 [codex] Add owner nudge app-server API (#18220)
## Summary

Second PR in the split from #17956. Stacked on #18227.

- adds app-server v2 protocol/schema support for
`account/sendAddCreditsNudgeEmail`
- adds the backend-client `send_add_credits_nudge_email` request and
request body mapping
- handles the app-server request with auth checks, backend call, and
cooldown mapping
- adds the disabled `workspace_owner_usage_nudge` feature flag and
focused app-server/backend tests

## Validation

- `cargo test -p codex-backend-client`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server rate_limits`
- `cargo test -p codex-tui workspace_`
- `cargo test -p codex-tui status_`
- `just fmt`
- `just fix -p codex-backend-client`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
2026-04-17 21:41:57 -07:00
David de Regt
eaf78e43f2 Add sorting/backwardsCursor to thread/list and new thread/turns/list api (#17305)
To improve performance of UI loads from the app, add two main
improvements:
1. The `thread/list` api now gets a `sortDirection` request field and a
`backwardsCursor` to the response, which lets you paginate forwards and
backwards from a window. This lets you fetch the first few items to
display immediately while you paginate to fill in history, then can
paginate "backwards" on future loads to catch up with any changes since
the last UI load without a full reload of the entire data set.
2. Added a new `thread/turns/list` api which also has sortDirection and
backwardsCursor for the same behavior as `thread/list`, allowing you the
same small-fetch for immediate display followed by background fill-in
and resync catchup.
2026-04-17 11:49:02 -07:00
Felipe Coury
ec8d4bfc77 fix(app-server): replay token usage after resume and fork (#18023)
## Problem

When a user resumed or forked a session, the TUI could render the
restored thread history immediately, but it did not receive token usage
until a later model turn emitted a fresh usage event. That left the
context/status UI blank or stale during the exact window where the user
expects resumed state to look complete. Core already reconstructed token
usage from the rollout; the missing behavior was app-server lifecycle
replay to the client that just attached.

## Mental model

Token usage has two representations. The rollout is the durable source
of historical `TokenCount` events, and the core session cache is the
in-memory snapshot reconstructed from that rollout on resume or fork.
App-server v2 clients do not read core state directly; they learn about
usage through `thread/tokenUsage/updated`. The fix keeps those roles
separate: core exposes the restored `TokenUsageInfo`, and app-server
sends one targeted notification after a successful `thread/resume` or
`thread/fork` response when that restored snapshot exists.

This notification is not a new model event. It is a replay of
already-persisted state for the client that just attached. That
distinction matters because using the normal core event path here would
risk duplicating `TokenCount` entries in the rollout and making future
resumes count historical usage twice.

## Non-goals

This change does not add a new protocol method or payload shape. It
reuses the existing v2 `thread/tokenUsage/updated` notification and the
TUI’s existing handler for that notification.

This change does not alter how token usage is computed, accumulated,
compacted, or written during turns. It only exposes the token usage that
resume and fork reconstruction already restored.

This change does not broadcast historical usage replay to every
subscribed client. The replay is intentionally scoped to the connection
that requested resume or fork so already-attached clients are not
surprised by an old usage update while they may be rendering live
activity.

## Tradeoffs

Sending the usage notification after the JSON-RPC response preserves a
clear lifecycle order: the client first receives the thread object, then
receives restored usage for that thread. The tradeoff is that usage is
still a notification rather than part of the `thread/resume` or
`thread/fork` response. That keeps the protocol shape stable and avoids
duplicating usage fields across response types, but clients must
continue listening for notifications after receiving the response.

The helper selects the latest non-in-progress turn id for the replayed
usage notification. This is conservative because restored usage belongs
to completed persisted accounting, not to newly attached in-flight work.
The fallback to the last turn preserves a stable wire payload for
unusual histories, but histories with no meaningful completed turn still
have a weak attribution story.

## Architecture

Core already seeds `Session` token state from the last persisted rollout
`TokenCount` during `InitialHistory::Resumed` and
`InitialHistory::Forked`. The new core accessor exposes the complete
`TokenUsageInfo` through `CodexThread` without giving app-server direct
session mutation authority.

App-server calls that accessor from three lifecycle paths: cold
`thread/resume`, running-thread resume/rejoin, and `thread/fork`. In
each path, the server sends the normal response first, then calls a
shared helper that converts core usage into
`ThreadTokenUsageUpdatedNotification` and sends it only to the
requesting connection.

The tests build fake rollouts with a user turn plus a persisted token
usage event. They then exercise `thread/resume` and `thread/fork`
without starting another model turn, proving that restored usage arrives
before any next-turn token event could be produced.

## Observability

The primary debug path is the app-server JSON-RPC stream. After
`thread/resume` or `thread/fork`, a client should see the response
followed by `thread/tokenUsage/updated` when the source rollout includes
token usage. If the notification is absent, check whether the rollout
contains an `event_msg` payload of type `token_count`, whether core
reconstruction seeded `Session::token_usage_info`, and whether the
connection stayed attached long enough to receive the targeted
notification.

The notification is sent through the existing
`OutgoingMessageSender::send_server_notification_to_connections` path,
so existing app-server tracing around server notifications still
applies. Because this is a replay, not a model turn event, debugging
should start at the resume/fork handlers rather than the turn event
translation in `bespoke_event_handling`.

## Tests

The focused regression coverage is `cargo test -p codex-app-server
emits_restored_token_usage`, which covers both resume and fork. The core
reconstruction guard is `cargo test -p codex-core
record_initial_history_seeds_token_info_from_rollout`.

Formatting and lint/fix passes were run with `just fmt`, `just fix -p
codex-core`, and `just fix -p codex-app-server`. Full crate test runs
surfaced pre-existing unrelated failures in command execution and plugin
marketplace tests; the new token usage tests passed in focused runs and
within the app-server suite before the unrelated command execution
failure.
2026-04-16 17:29:34 -03:00
Adrian
8e784bba2f Register agent identities behind use_agent_identity (#17386)
## Summary

Stack PR 2 of 4 for feature-gated agent identity support.

This PR adds agent identity registration behind
`features.use_agent_identity`. It keeps the app-server protocol
unchanged and starts registration after ChatGPT auth exists rather than
requiring a client restart.

## Stack

- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - this PR
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR4: https://github.com/openai/codex/pull/17388 - use `AgentAssertion`
downstream when enabled

## Validation

Covered as part of the local stack validation pass:

- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`

## Notes

The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
2026-04-15 10:08:27 -07:00
pakrym-oai
dd1321d11b Spread AbsolutePathBuf (#17792)
Mechanical change to promote absolute paths through code.
2026-04-14 14:26:10 -07:00
rhan-oai
d6b13276c7 [codex-analytics] enable general analytics by default (#17389)
## Summary
- Make GeneralAnalytics stable and enabled by default.
- Update feature tests and app-server lifecycle fixtures for explicit
general_analytics=false.
- Keep app-server integration tests isolated from host managed config so
explicit feature fixtures are deterministic.

## Validation
- cargo test -p codex-features
- cargo test -p codex-app-server general_analytics (matched 0 tests)
- cargo test -p codex-app-server thread_start_
- cargo test -p codex-app-server thread_fork_
- cargo test -p codex-app-server thread_resume_
- cargo test -p codex-app-server
config_read_includes_system_layer_and_overrides
2026-04-14 13:20:46 -07:00
rhan-oai
b704df85b8 [codex-analytics] feature plumbing and emittance (#16640)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16640).
* #16870
* #16706
* #16641
* __->__ #16640
2026-04-13 23:11:49 -07:00
xli-oai
ff584c5a4b [codex] Refactor marketplace add into shared core flow (#17717)
## Summary

Move `codex marketplace add` onto a shared core implementation so the
CLI and app-server path can use one source of truth.

This change:
- adds shared marketplace-add orchestration in `codex-core`
- switches the CLI command to call that shared implementation
- removes duplicated CLI-only marketplace add helpers
- preserves focused parser and add-path coverage while moving the shared
behavior into core tests

## Why

The new `marketplace/add` RPC should reuse the same underlying
marketplace-add flow as the CLI. This refactor lands that consolidation
first so the follow-up app-server PR can be mostly protocol and handler
wiring.

## Validation

- `cargo test -p codex-core marketplace_add`
- `cargo test -p codex-cli marketplace_cmd`
- `just fix -p codex-core`
- `just fix -p codex-cli`
- `just fmt`
2026-04-13 20:37:11 -07:00
pakrym-oai
d4be06adea Add turn item injection API (#17703)
## Summary
- Add `turn/inject_items` app-server v2 request support for appending
raw Responses API items to a loaded thread history without starting a
turn.
- Generate JSON schema and TypeScript protocol artifacts for the new
params and empty response.
- Document the new endpoint and include a request/response example.
- Preserve compatibility with the typo alias `turn/injet_items` while
returning the canonical method name.

## Testing
- Not run (not requested)
2026-04-13 16:11:05 -07:00
jif-oai
46a266cd6a feat: disable memory endpoint (#17626) 2026-04-13 18:29:49 +01:00
Matthew Zeng
b7139a7e8f [mcp] Support MCP Apps part 3 - Add mcp tool call support. (#17364)
- [x] Add a new app-server method so that MCP Apps can call their own
MCP server directly.
2026-04-11 04:39:19 +00:00
Shijie Rao
930e5adb7e Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391)
Reverts openai/codex#16969

#sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300
2026-04-10 23:33:13 +00:00
richardopenai
9f2a585153 Option to Notify Workspace Owner When Usage Limit is Reached (#16969)
## Summary
- Replace the manual `/notify-owner` flow with an inline confirmation
prompt when a usage-based workspace member hits a credits-depleted
limit.
- Fetch the current workspace role from the live ChatGPT
`accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
the desktop and web clients.
- Keep owner, member, and spend-cap messaging distinct so we only offer
the owner nudge when the workspace is actually out of credits.

## What Changed
- `backend-client`
- Added a typed fetch for the current account role from
`accounts/check`.
  - Mapped backend role values into a Rust workspace-role enum.
- `app-server` and protocol
  - Added `workspaceRole` to `account/read` and `account/updated`.
- Derived `isWorkspaceOwner` from the live role, with a fallback to the
cached token claim when the role fetch is unavailable.
- `tui`
  - Removed the explicit `/notify-owner` slash command.
- When a member is blocked because the workspace is out of credits, the
error now prompts:
- `Your workspace is out of credits. Request more from your workspace
owner? [y/N]`
  - Choosing `y` sends the existing owner-notification request.
- Choosing `n`, pressing `Esc`, or accepting the default selection
dismisses the prompt without sending anything.
- Selection popups now honor explicit item shortcuts, which is how the
`y` / `n` interaction is wired.

## Reviewer Notes
- The main behavior change is scoped to usage-based workspace members
whose workspace credits are depleted.
- Spend-cap reached should not show the owner-notification prompt.
- Owners and admins should continue to see `/usage` guidance instead of
the member prompt.
- The live role fetch is best-effort; if it fails, we fall back to the
existing token-derived ownership signal.

## Testing
- Manual verification
  - Workspace owner does not see the member prompt.
- Workspace member with depleted credits sees the confirmation prompt
and can send the nudge with `y`.
- Workspace member with spend cap reached does not see the
owner-notification prompt.

### Workspace member out of usage

https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1

### Workspace owner
<img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48
22 AM"
src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6"
/>
2026-04-09 21:15:17 -07:00
Ahmed Ibrahim
2f9090be62 Add realtime voice selection (#17176)
- Add realtime voice selection for realtime/start.
- Expose the supported v1/v2 voice lists and cover explicit, configured,
default, and invalid voice paths.
2026-04-08 20:19:15 -07:00
pash-openai
80ebc80be5 Use model metadata for Fast Mode status (#16949)
Fast Mode status was still tied to one model name in the TUI and
model-list plumbing. This changes the model metadata shape so a model
can advertise additional speed tiers, carries that field through the
app-server model list, and uses it to decide when to show Fast Mode
status.

For people using Codex, the behavior is intended to stay the same for
existing models. Fast Mode still requires the existing signed-in /
feature-gated path; the difference is that the UI can now recognize any
model the model list marks as Fast-capable, instead of requiring a new
client-side slug check.
2026-04-07 17:55:40 -07:00
Dylan Hurd
6c36e7d688 fix(app-server) revert null instructions changes (#17047) 2026-04-07 15:18:34 -07:00
Ahmed Ibrahim
cd591dc457 Preserve null developer instructions (#16976)
Preserve explicit null developer-instruction overrides across app-server
resume and fork flows.
2026-04-07 09:32:14 -07:00
Matthew Zeng
5fe9ef06ce [mcp] Support MCP Apps part 1. (#16082)
- [x] Add `mcpResource/read` method to read mcp resource.
2026-04-06 19:17:14 -07:00
pakrym-oai
1f2411629f Refactor config types into a separate crate (#16962)
Move config types into a separate crate because their macros expand into
a lot of new code.
2026-04-07 00:32:41 +00:00
Eric Traut
a3b3e7a6cc Fix MCP tool listing for hyphenated server names (#16674)
Addresses #16671 and #14927

Problem: `mcpServerStatus/list` rebuilt MCP tool groups from sanitized
tool prefixes but looked them up by unsanitized server names, so
hyphenated servers rendered as having no tools in `/mcp`. This was
reported as a regression when the TUI switched to use the app server.

Solution: Build each server's tool map using the original server name's
sanitized prefix, include effective runtime MCP servers in the status
response, and add a regression test for hyphenated server names.
2026-04-03 09:05:50 -07:00
Ahmed Ibrahim
af8a9d2d2b remove temporary ownership re-exports (#16626)
Stacked on #16508.

This removes the temporary `codex-core` / `codex-login` re-export shims
from the ownership split and rewrites callsites to import directly from
`codex-model-provider-info`, `codex-models-manager`, `codex-api`,
`codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.

No behavior change intended; this is the mechanical import cleanup layer
split out from the ownership move.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-03 00:33:34 -07:00
Michael Bolin
aa2403e2eb core: remove cross-crate re-exports from lib.rs (#16512)
## Why

`codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
which made downstream crates depend on `codex-core` as a proxy module
instead of the actual owner crate.

Removing those forwards makes crate boundaries explicit and lets leaf
crates drop unnecessary `codex-core` dependencies. In this PR, this
reduces the dependency on `codex-core` to `codex-login` in the following
files:

```
codex-rs/backend-client/Cargo.toml
codex-rs/mcp-server/tests/common/Cargo.toml
```

## What

- Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
`codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
`codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
`codex-tools`, and `codex-utils-path`.
- Delete the `default_client` forwarding shim in `codex-rs/core`.
- Update in-crate and downstream callsites to import directly from the
owning `codex-*` crate.
- Add direct Cargo dependencies where callsites now target the owner
crate, and remove `codex-core` from `codex-rs/backend-client`.
2026-04-01 23:06:24 -07:00
Michael Bolin
9a8730f31e ci: verify codex-rs Cargo manifests inherit workspace settings (#16353)
## Why

Bazel clippy now catches lints that `cargo clippy` can still miss when a
crate under `codex-rs` forgets to opt into workspace lints. The concrete
example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
flagged a clippy violation in `models_cache.rs`, but Cargo did not
because that crate inherited workspace package metadata without
declaring `[lints] workspace = true`.

We already mirror the workspace clippy deny list into Bazel after
[#15955](https://github.com/openai/codex/pull/15955), so we also need a
repo-side check that keeps every `codex-rs` manifest opted into the same
workspace settings.

## What changed

- add `.github/scripts/verify_cargo_workspace_manifests.py`, which
parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
  - `version.workspace = true`
  - `edition.workspace = true`
  - `license.workspace = true`
  - `[lints] workspace = true`
- top-level crate names follow the `codex-*` / `codex-utils-*`
conventions, with explicit exceptions for `windows-sandbox-rs` and
`utils/path-utils`
- run that script in `.github/workflows/ci.yml`
- update the current outlier manifests so the check is enforceable
immediately
- fix the newly exposed clippy violations in the affected crates
(`app-server/tests/common`, `file-search`, `feedback`,
`shell-escalation`, and `debug-client`)






---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
* #16351
* __->__ #16353
2026-03-31 21:59:28 +00:00
daniel-oai
47a9e2e084 Add ChatGPT device-code login to app server (#15525)
## Problem

App-server clients could only initiate ChatGPT login through the browser
callback flow, even though the shared login crate already supports
device-code auth. That left VS Code, Codex App, and other app-server
clients without a first-class way to use the existing device-code
backend when browser redirects are brittle or when the client UX wants
to own the login ceremony.

## Mental model

This change adds a second ChatGPT login start path to app-server:
clients can now call `account/login/start` with `type:
"chatgptDeviceCode"`. App-server immediately returns a `loginId` plus
the device-code UX payload (`verificationUrl` and `userCode`), then
completes the login asynchronously in the background using the existing
`codex_login` polling flow. Successful device-code login still resolves
to ordinary `chatgpt` auth, and completion continues to flow through the
existing `account/login/completed` and `account/updated` notifications.

## Non-goals

This does not introduce a new auth mode, a new account shape, or a
device-code eligibility discovery API. It also does not add automatic
fallback to browser login in core; clients remain responsible for
choosing when to request device code and whether to retry with a
different UX if the backend/admin policy rejects it.

## Tradeoffs

We intentionally keep `login_chatgpt_common` as a local validation
helper instead of turning it into a capability probe. Device-code
eligibility is checked by actually calling `request_device_code`, which
means policy-disabled cases surface as an immediate request error rather
than an async completion event. We also keep the active-login state
machine minimal: browser and device-code logins share the same public
cancel contract, but device-code cancellation is implemented with a
local cancel token rather than a larger cross-crate refactor.

## Architecture

The protocol grows a new `chatgptDeviceCode` request/response variant in
app-server v2. On the server side, the new handler reuses the existing
ChatGPT login precondition checks, calls `request_device_code`, returns
the device-code payload, and then spawns a background task that waits on
either cancellation or `complete_device_code_login`. On success, it
reuses the existing auth reload and cloud-requirements refresh path
before emitting `account/login/completed` success and `account/updated`.
On failure or cancellation, it emits only `account/login/completed`
failure. The existing `account/login/cancel { loginId }` contract
remains unchanged and now works for both browser and device-code
attempts.


## Tests

Added protocol serialization coverage for the new request/response
variant, plus app-server tests for device-code success, failure, cancel,
and start-time rejection behavior. Existing browser ChatGPT login
coverage remains in place to show that the callback-based flow is
unchanged.
2026-03-27 00:27:15 -07:00
Michael Bolin
b23789b770 [codex] import token_data from codex-login directly (#15903)
## Why
`token_data` is owned by `codex-login`, but `codex-core` was still
re-exporting it. That let callers pull auth token types through
`codex-core`, which keeps otherwise unrelated crates coupled to
`codex-core` and makes `codex-core` more of a build-graph bottleneck.

## What changed
- remove the `codex-core` re-export of `codex_login::token_data`
- update the remaining `codex-core` internals that used
`crate::token_data` to import `codex_login::token_data` directly
- update downstream callers in `codex-rs/chatgpt`,
`codex-rs/tui_app_server`, `codex-rs/app-server/tests/common`, and
`codex-rs/core/tests` to import `codex_login::token_data` directly
- add explicit `codex-login` workspace dependencies and refresh lock
metadata for crates that now depend on it directly

## Validation
- `cargo test -p codex-chatgpt --locked`
- `just argument-comment-lint`
- `just bazel-lock-update`
- `just bazel-lock-check`

## Notes
- attempted `cargo test -p codex-core --locked` and `cargo test -p
codex-core auth_refresh --locked`, but both ran out of disk while
linking `codex-core` test binaries in the local environment
2026-03-26 13:34:02 -07:00
Matthew Zeng
0b08d89304 [app-server] Add a method to override feature flags. (#15601)
- [x] Add a method to override feature flags globally and not just
thread level.
2026-03-25 02:27:00 +00:00
Ruslan Nigmatullin
301b17c2a1 app-server: add filesystem watch support (#14533)
### Summary
Add the v2 app-server filesystem watch RPCs and notifications, wire them
through the message processor, and implement connection-scoped watches
with notify-backed change delivery. This also updates the schema
fixtures, app-server documentation, and the v2 integration coverage for
watch and unwatch behavior.

This allows clients to efficiently watch for filesystem updates, e.g. to
react on branch changes.

### Testing
- exercise watch lifecycles for directory changes, atomic file
replacement, missing-file targets, and unwatch cleanup
2026-03-24 15:52:13 -07:00
jif-oai
79ad7b247b feat: change multi-agent to use path-like system instead of uuids (#15313)
This PR add an URI-based system to reference agents within a tree. This
comes from a sync between research and engineering.

The main agent (the one manually spawned by a user) is always called
`/root`. Any sub-agent spawned by it will be `/root/agent_1` for example
where `agent_1` is chosen by the model.

Any agent can contact any agents using the path.

Paths can be used either in absolute or relative to the calling agents

Resume is not supported for now on this new path
2026-03-20 18:23:48 +00:00
Ahmed Ibrahim
2e22885e79 Split features into codex-features crate (#15253)
- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.

Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
2026-03-19 20:12:07 -07:00
xl-openai
db5781a088 feat: support product-scoped plugins. (#15041)
1. Added SessionSource::Custom(String) and --session-source.
  2. Enforced plugin and skill products by session_source.
  3. Applied the same filtering to curated background refresh.
2026-03-19 00:46:15 -07:00
Eric Traut
01df50cf42 Add thread/shellCommand to app server API surface (#14988)
This PR adds a new `thread/shellCommand` app server API so clients can
implement `!` shell commands. These commands are executed within the
sandbox, and the command text and output are visible to the model.

The internal implementation mirrors the current TUI `!` behavior.
- persist shell command execution as `CommandExecution` thread items,
including source and formatted output metadata
- bridge live and replayed app-server command execution events back into
the existing `tui_app_server` exec rendering path

This PR also wires `tui_app_server` to submit `!` commands through the
new API.
2026-03-18 23:42:40 -06:00
xl-openai
86982ca1f9 Revert "fix: harden plugin feature gating" (#15102)
Reverts openai/codex#15020

I messed up the commit in my PR and accidentally merged changes that
were still under review.
2026-03-18 15:19:29 -07:00
xl-openai
580f32ad2a fix: harden plugin feature gating (#15020)
1. Use requirement-resolved config.features as the plugin gate.
2. Guard plugin/list, plugin/read, and related flows behind that gate.
3. Skip bad marketplace.json files instead of failing the whole list.
4. Simplify plugin state and caching.
2026-03-18 10:11:43 -07:00