codex

mirror of https://github.com/openai/codex.git synced 2026-05-03 19:06:58 +00:00

Author	SHA1	Message	Date
Ruslan Nigmatullin	48f82ca7c5	app-server: define device key v2 protocol (#18428 ) ## Why Clients need a stable app-server protocol surface for enrolling a local device key, retrieving its public key, and producing a device-bound proof. The protocol reports `protectionClass` explicitly so clients can distinguish hardware-backed keys from an explicitly allowed OS-protected fallback. Signing uses a tagged `DeviceKeySignPayload` enum rather than arbitrary bytes so each signed statement is auditable at the API boundary. ## What changed - Added v2 JSON-RPC methods for `device/key/create`, `device/key/public`, and `device/key/sign`. - Added request/response types for device-key metadata, SPKI public keys, protection classes, and ECDSA signatures. - Added `DeviceKeyProtectionPolicy` with hardware-only default behavior and an explicit `allow_os_protected_nonextractable` option. - Added the initial `remoteControlClientConnection` signing payload variant. - Regenerated JSON Schema and TypeScript fixtures for app-server clients. ## Stack This is PR 1 of 4 in the device-key app-server stack. ## Validation - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol`	2026-04-21 10:08:42 -07:00
pakrym-oai	833212115e	Move external agent config out of core (#18850 ) ## Summary - Move external agent config migration logic and tests from `codex-core` into `app-server/src/config`. - Keep the migration service crate-private to app-server and update the API adapter imports. - Remove stale core re-exports and expose only the needed marketplace source helper. ## Testing - `cargo test -p codex-app-server config::external_agent_config` - `just fmt` - `just fix -p codex-app-server` - `just fix -p codex-core` - `git diff --check`	2026-04-21 08:33:58 -07:00
pash-openai	dc1a8f2190	[tool search] support namespaced deferred dynamic tools (#18413 ) Deferred dynamic tools need to round-trip a namespace so a tool returned by `tool_search` can be called through the same registry key that core uses for dispatch. This change adds namespace support for dynamic tool specs/calls, persists it through app-server thread state, and routes dynamic tool calls by full `ToolName` while still sending the app the leaf tool name. Deferred dynamic tools must provide a namespace; non-deferred dynamic tools may remain top-level. It also introduces `LoadableToolSpec` as the shared function-or-namespace Responses shape used by both `tool_search` output and dynamic tool registration, so dynamic tools use the same wrapping logic in both paths. Validation: - `cargo test -p codex-tools` - `cargo test -p codex-core tool_search` --------- Co-authored-by: Sayan Sisodiya <sayan@openai.com>	2026-04-21 14:13:08 +08:00
Michael Bolin	d62421d322	chore: document intentional await-holding cases (#18423 ) ## Why This PR prepares the stack to enable Clippy await-holding lints that were left disabled in #18178. The mechanical lock-scope cleanup is handled separately; this PR is the documentation/configuration layer for the remaining await-across-guard sites. Without explicit annotations, reviewers and future maintainers cannot tell whether an await-holding warning is a real concurrency smell or an intentional serialization boundary. ## What changed - Configures `clippy.toml` so `await_holding_invalid_type` also covers `tokio::sync::{MutexGuard,RwLockReadGuard,RwLockWriteGuard}`. - Adds targeted `#[expect(clippy::await_holding_invalid_type, reason = ...)]` annotations for intentional async guard lifetimes. - Documents the main categories of intentional cases: active-turn state transitions that must remain atomic, session-owned MCP manager accesses, remote-control websocket serialization, JS REPL kernel/process serialization, OAuth persistence, external bearer token refresh serialization, and tests that intentionally serialize shared global or session-owned state. - For external bearer token refresh, documents the existing serialization boundary: holding `cached_token` across the provider command prevents concurrent cache misses from starting duplicate refresh commands, and the current behavior is small enough that an explicit expectation is easier to maintain than adding another synchronization primitive. ## Verification - `cargo clippy -p codex-login --all-targets` - `cargo clippy -p codex-connectors --all-targets` - `cargo clippy -p codex-core --all-targets` - The follow-up PR #18698 enables `await_holding_invalid_type` and `await_holding_lock` as workspace `deny` lints, so any undocumented remaining offender will fail Clippy. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18423). * #18698 * __->__ #18423	2026-04-20 22:41:54 -07:00
Abhinav	ab26554a3a	Add remote_sandbox_config to our config requirements (#18763 ) ## Why Customers need finer-grained control over allowed sandbox modes based on the host Codex is running on. For example, they may want stricter sandbox limits on devboxes while keeping a different default elsewhere. Our current cloud requirements can target user/account groups, but they cannot vary sandbox requirements by host. That makes remote development environments awkward because the same top-level `allowed_sandbox_modes` has to apply everywhere. ## What Adds a new `remote_sandbox_config` section to `requirements.toml`: ```toml allowed_sandbox_modes = ["read-only"] [[remote_sandbox_config]] hostname_patterns = [".org"] allowed_sandbox_modes = ["read-only", "workspace-write"] [[remote_sandbox_config]] hostname_patterns = [".sh", "runner-*.ci"] allowed_sandbox_modes = ["read-only", "danger-full-access"] ``` During requirements resolution, Codex resolves the local host name once, preferring the machine FQDN when available and falling back to the cleaned kernel hostname. This host classification is best effort rather than authenticated device proof. Each requirements source applies its first matching `remote_sandbox_config` entry before it is merged with other sources. The shared merge helper keeps that `apply_remote_sandbox_config` step paired with requirements merging so new requirements sources do not have to remember the extra call. That preserves source precedence: a lower-precedence requirements file with a matching `remote_sandbox_config` cannot override a higher-precedence source that already set `allowed_sandbox_modes`. This also wires the hostname-aware resolution through app-server, CLI/TUI config loading, config API reads, and config layer metadata so they all evaluate remote sandbox requirements consistently. ## Verification - `cargo test -p codex-config remote_sandbox_config` - `cargo test -p codex-config host_name` - `cargo test -p codex-core load_config_layers_applies_matching_remote_sandbox_config` - `cargo test -p codex-core system_remote_sandbox_config_keeps_cloud_sandbox_modes` - `cargo test -p codex-config` - `cargo test -p codex-core` unit tests passed; `tests/all.rs` integration matrix was intentionally stopped after the relevant focused tests passed - `just fix -p codex-config` - `just fix -p codex-core` - `cargo check -p codex-app-server`	2026-04-21 05:05:02 +00:00
Matthew Zeng	1132ef887c	Make MCP resource read threadless (#18292 ) ## Summary Making thread id optional so that we can better cache resources for MCPs for connectors since their resource templates is universal and not particular to projects. - Make `mcpServer/resource/read` accept an optional `threadId` - Read resources from the current MCP config when no thread is supplied - Keep the existing thread-scoped path when `threadId` is present - Update the generated schemas, README, and integration coverage ## Testing - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-mcp` - `cargo test -p codex-app-server --test all mcp_resource` - `just fix -p codex-mcp` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server`	2026-04-20 19:59:36 -07:00
Michael Bolin	3d2f123895	protocol: preserve glob scan depth in permission profiles (#18713 ) ## Why #18274 made `PermissionProfile` the canonical file-system permissions shape, but the round-trip from `FileSystemSandboxPolicy` to `PermissionProfile` still dropped one piece of policy metadata: `glob_scan_max_depth`. That field is security-relevant for deny-read globs such as `*/.env`. On Linux, bubblewrap sandbox construction uses it to bound unreadable glob expansion. If a profile copied from active runtime permissions loses this value and is submitted back as an override, the resulting `FileSystemSandboxPolicy` can behave differently even though the visible permission entries look equivalent. ## What changed - Add `glob_scan_max_depth` to protocol `FileSystemPermissions` and preserve it when converting to/from `FileSystemSandboxPolicy`. - Keep legacy `read`/`write` JSON for simple path-only permissions, but force canonical JSON when glob scan depth is present so the metadata is not silently dropped. - Carry `globScanMaxDepth` through app-server `AdditionalFileSystemPermissions`, generated JSON/TypeScript schemas, and app-server/TUI conversion call sites. - Preserve the metadata through sandboxing permission normalization, merging, and intersection. - Carry the merged scan depth into the effective `FileSystemSandboxPolicy` used for command execution, so bounded deny-read globs reach Linux bubblewrap materialization. ## Verification - `cargo test -p codex-sandboxing glob_scan -- --nocapture` - `cargo test -p codex-sandboxing policy_transforms -- --nocapture` - `just fix -p codex-sandboxing` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18713). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * #18277 * #18276 * #18275 * __->__ #18713	2026-04-20 19:42:45 -07:00
Celia Chen	cefcfe43b9	feat: add a built-in Amazon Bedrock model provider (#18744 ) ## Why Codex needs a first-class `amazon-bedrock` model provider so users can select Bedrock without copying a full provider definition into `config.toml`. The provider has Codex-owned defaults for the pieces that should stay consistent across users: the display `name`, Bedrock `base_url`, and `wire_api`. At the same time, users still need a way to choose the AWS credential profile used by their local environment. This change makes `amazon-bedrock` a partially modifiable built-in provider: code owns the provider identity and endpoint defaults, while user config can set `model_providers.amazon-bedrock.aws.profile`. For example: ```toml model_provider = "amazon-bedrock" [model_providers.amazon-bedrock.aws] profile = "codex-bedrock" ``` ## What Changed - Added `amazon-bedrock` to the built-in model provider map with: - `name = "Amazon Bedrock"` - `base_url = "https://bedrock-mantle.us-east-1.api.aws/v1"` - `wire_api = "responses"` - Added AWS provider auth config with a profile-only shape: `model_providers.<id>.aws.profile`. - Kept AWS auth config restricted to `amazon-bedrock`; custom providers that set `aws` are rejected. - Allowed `model_providers.amazon-bedrock` through reserved-provider validation so it can act as a partial override. - During config loading, only `aws.profile` is copied from the user-provided `amazon-bedrock` entry onto the built-in provider. Other Bedrock provider fields remain hard-coded by the built-in definition. - Updated the generated config schema for the new provider AWS profile config.	2026-04-21 00:54:05 +00:00
Rasmus Rygaard	7b994100b3	Add session config loader interface (#18208 ) ## Why Cloud-hosted sessions need a way for the service that starts or manages a thread to provide session-owned config without treating all config as if it came from the same user/project/workspace TOML stack. The important boundary is ownership: some values should be controlled by the session/orchestrator, some by the authenticated user, and later some may come from the executor. The earlier broad config-store shape made that boundary too fuzzy and overlapped heavily with the existing filesystem-backed config loader. This PR starts with the smaller piece we need now: a typed session config loader that can feed the existing config layer stack while preserving the normal precedence and merge behavior. ## What Changed - Added `ThreadConfigLoader` and related typed payloads in `codex-config`. - `SessionThreadConfig` currently supports `model_provider`, `model_providers`, and feature flags. - `UserThreadConfig` is present as an ownership boundary, but does not yet add TOML-backed fields. - `NoopThreadConfigLoader` preserves existing behavior when no external loader is configured. - `StaticThreadConfigLoader` supports tests and simple callers. - Taught thread config sources to produce ordinary `ConfigLayerEntry` values so the existing `ConfigLayerStack` remains the place where precedence and merging happen. - Wired the loader through `ConfigBuilder`, the config loader, and app-server startup paths so app-server can provide session-owned config before deriving a thread config. - Added coverage for: - translating typed thread config into config layers, - inserting thread config layers into the stack at the right precedence, - applying session-provided model provider and feature settings when app-server derives config from thread params. ## Follow-Ups This intentionally stops short of adding the remote/service transport. The next pieces are expected to be: 1. Define the proto/API shape for this interface. 2. Add a client implementation that can source session config from the service side. ## Verification - Added unit coverage in `codex-config` for the loader and layer conversion. - Added `codex-core` config loader coverage for thread config layer precedence. - Added app-server coverage that verifies session thread config wins over request-provided config for model provider and feature settings.	2026-04-20 23:05:49 +00:00
guinness-oai	1029742cf7	Add realtime silence tool (#18635 ) ## Summary Adds a second realtime v2 function tool, `remain_silent`, so the realtime model has an explicit non-speaking action when the collaboration mode or latest context says it should not answer aloud. This is stacked on #18597. ## Design - Advertise `remain_silent` alongside `background_agent` in realtime v2 conversational sessions. - Parse `remain_silent` function calls into a typed `RealtimeEvent::NoopRequested` event. - Have core answer that function call with an empty `function_call_output` and deliberately avoid `response.create`, so no follow-up realtime response is requested. - Keep the event hidden from app-server/TUI surfaces; it is operational plumbing, not user-visible conversation content.	2026-04-20 15:43:20 -07:00
Tom	a718b6fd47	Read conversation summaries through thread store (#18716 ) Migrate the conversation summary App Server methods to ThreadStore Because this app server api allows explicitly fetching the thread by rollout path, intercept that case in the app server code and (a) route directly to underlying local thread store methods if we're using a local thread store, or (b) throw an unsupported error if we're using a remote thread store. This keeps the thread store API clean and all filesystem operations inside of the local thread store, which pushing the "fundamental incompatibility" check as early as possible.	2026-04-20 22:39:10 +00:00
jif-oai	660153b6de	feat: cascade thread archive (#18112 ) Cascade the thread archive endpoint to all the sub-agents in the agent tree Fix: https://github.com/openai/codex/issues/17867 --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 23:38:18 +01:00
Tom	46e5814f77	Add experimental remote thread store config (#18714 ) Add experimental config to use remote thread store rather than local thread store implementation in app server	2026-04-20 22:20:39 +00:00
guinness-oai	126bd6e7a8	Update realtime handoff transcript handling (#18597 ) ## Summary This PR aims to improve integration between the realtime model and the codex agent by sharing more context with each other. In particular, we now share full realtime conversation transcript deltas in addition to the delegation message. realtime_conversation.rs now turns a handoff into: ``` <realtime_delegation> <input>...</input> <transcript_delta>...</transcript_delta> </realtime_delegation> ``` ## Implementation notes The transcript is accumulated in the realtime websocket layer as parsed realtime events arrive. When a background-agent handoff is requested, the current transcript snapshot is copied onto the handoff event and then serialized by `realtime_conversation.rs` into the hidden realtime delegation envelope that Codex receives as user-turn context. For Realtime V2, the session now explicitly enables input audio transcription, and the parser handles the relevant input/output transcript completion events so the snapshot includes both user speech and realtime model responses. The delegation `<input>` remains the actual handoff request, while `<transcript_delta>` carries the surrounding conversation history for context. Reviewers should note that the transcript payload is intended for Codex context sharing, not UI rendering. The realtime delegation envelope should stay hidden from the user-facing transcript surface, while still being included in the background-agent turn so Codex can answer with the same conversational context the realtime model had.	2026-04-20 14:04:09 -07:00
Akshay Nathan	34a3e85fcd	Wire the PatchUpdated events through app_server (#18289 ) Wires patch_updated events through app_server. These events are parsed and streamed while apply_patch is being written by the model. Also adds 500ms of buffering to the patch_updated events in the diff_consumer. The eventual goal is to use this to display better progress indicators in the codex app.	2026-04-20 10:44:03 -07:00
Ahmed Ibrahim	316cf0e90b	Update models.json (#18586 ) - Replace the active models-manager catalog with the deleted core catalog contents. - Replace stale hardcoded test model slugs with current bundled model slugs. - Keep this as a stacked change on top of the cleanup PR.	2026-04-20 10:27:01 -07:00
Michael Bolin	dcec516313	protocol: canonicalize file system permissions (#18274 ) ## Why `PermissionProfile` needs stable, canonical file-system semantics before it can become the primary runtime permissions abstraction. Without a canonical form, callers have to keep re-deriving legacy sandbox maps and profile comparisons remain lossy or order-dependent. ## What changed This adds canonicalization helpers for `FileSystemPermissions` and `PermissionProfile`, expands special paths into explicit sandbox entries, and updates permission request/conversion paths to consume those canonical entries. It also tightens the legacy bridge so root-wide write profiles with narrower carveouts are not silently projected as full-disk legacy access. ## Verification - `cargo test -p codex-protocol root_write_with_read_only_child_is_not_full_disk_write -- --nocapture` - `cargo test -p codex-sandboxing permission -- --nocapture` - `cargo test -p codex-tui permissions -- --nocapture`	2026-04-20 09:57:03 -07:00
Tom	ac7c9a685f	codex: move unloaded thread writes into store (#18361 ) - Migrates unloaded `thread/name/set` and `thread/memoryModeSet` app-server writes behind the generic `ThreadStore::update_thread_metadata` API rather than adding one-off store methods for setting thread name or memory mode. - Implements the local ThreadStore metadata patch path for thread name and memory mode, including rollout append, legacy name index updates, SessionMeta validation/update, SQLite reconciliation, and re-reading the stored thread. - Adds focused local thread-store unit coverage plus app-server integration coverage for the migrated unloaded write paths.	2026-04-20 09:50:01 -07:00
Adrian	19e2f21827	[codex] Use background task auth for additional backend calls (#18260 ) ## Summary Splits the larger PR4.1 background task auth rollout by moving additional backend/control-plane call sites into this downstream PR. This PR keeps callers on the same design as PR4.1: most code asks `AuthManager` for the default ChatGPT backend authorization header, and `AuthManager` decides bearer vs background AgentAssertion internally. Task-pinned inference auth remains separate because it needs the thread's registered task id. ## Stack - PR1: https://github.com/openai/codex/pull/17385 - add `features.use_agent_identity` - PR2: https://github.com/openai/codex/pull/17386 - register agent identities when enabled - PR3: https://github.com/openai/codex/pull/17387 - register agent tasks when enabled - PR3.1: https://github.com/openai/codex/pull/17978 - persist and prewarm registered tasks per thread - PR4: https://github.com/openai/codex/pull/17980 - use task-scoped `AgentAssertion` for downstream calls - PR4.1: https://github.com/openai/codex/pull/18094 - introduce AuthManager-owned background/control-plane `AgentAssertion` auth - PR4.2: this PR - use background task auth for additional backend/control-plane calls ## What Changed - pass full authorization header values through backend-client and cloud-tasks-client call paths where needed - move ChatGPT client, cloud requirements, cloud tasks, thread-manager, and models-manager background auth usage into this downstream slice - make app-server remote control enrollment/websocket auth ask `AuthManager` for the local backend authorization header instead of threading a background auth mode through transport options - keep the same feature-gated bearer fallback behavior from PR4.1 ## Validation - `just fmt` - `cargo check -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `cargo test -p codex-login agent_identity` - `cargo test -p codex-model-provider bearer_auth_provider` - `cargo test -p codex-core agent_assertion` - `cargo test -p codex-app-server remote_control` - `cargo test -p codex-cloud-requirements fetch_cloud_requirements` - `cargo test -p codex-models-manager manager::tests` - `cargo test -p codex-chatgpt` - `cargo test -p codex-cloud-tasks` - `just fix -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `just fix -p codex-app-server` - `git diff --check`	2026-04-20 07:24:29 -07:00
Adrian	904c751a40	[codex] Use background agent task auth for backend calls (#18094 ) ## Summary Introduces a single background/control-plane agent task for ChatGPT backend requests that do not have a thread-scoped task, with `AuthManager` owning the default ChatGPT backend authorization decision. Callers now ask `AuthManager` for the default ChatGPT backend authorization header. `AuthManager` decides whether that is bearer or background AgentAssertion based on config/internal state, while low-level bootstrap paths can explicitly request bearer-only auth. This PR is stacked on PR4 and focuses on the shared background task auth plumbing plus the first tranche of backend/control-plane consumers. The remaining callsite wiring is split into PR4.2 to keep review size down. ## Stack - PR1: https://github.com/openai/codex/pull/17385 - add `features.use_agent_identity` - PR2: https://github.com/openai/codex/pull/17386 - register agent identities when enabled - PR3: https://github.com/openai/codex/pull/17387 - register agent tasks when enabled - PR3.1: https://github.com/openai/codex/pull/17978 - persist and prewarm registered tasks per thread - PR4: https://github.com/openai/codex/pull/17980 - use task-scoped `AgentAssertion` for downstream calls - PR4.1: this PR - introduce AuthManager-owned background/control-plane `AgentAssertion` auth - PR4.2: https://github.com/openai/codex/pull/18260 - use background task auth for additional backend/control-plane calls ## What Changed - add background task registration and assertion minting inside `codex-login` - persist `agent_identity.background_task_id` separately from per-session task state - make `BackgroundAgentTaskManager` private to `codex-login`; call sites do not instantiate or pass it around - teach `AuthManager` the ChatGPT backend base URL and feature-derived background auth mode from resolved config - expose bearer-only helpers for bootstrap/registration/refresh-style paths that must not use AgentAssertion - wire `AuthManager` default ChatGPT authorization through app listing, connector directory listing, remote plugins, MCP status/listing, analytics, and core-skills remote calls - preserve bearer fallback when the feature is disabled, the backend host is unsupported, or background task registration is not available ## Validation - `just fmt` - `cargo check -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `cargo test -p codex-login agent_identity` - `cargo test -p codex-model-provider bearer_auth_provider` - `cargo test -p codex-core agent_assertion` - `cargo test -p codex-app-server remote_control` - `cargo test -p codex-cloud-requirements fetch_cloud_requirements` - `cargo test -p codex-models-manager manager::tests` - `cargo test -p codex-chatgpt` - `cargo test -p codex-cloud-tasks` - `just fix -p codex-core -p codex-login -p codex-analytics -p codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p codex-models-manager -p codex-chatgpt -p codex-model-provider -p codex-mcp -p codex-core-skills` - `just fix -p codex-app-server` - `git diff --check`	2026-04-20 06:50:28 -07:00
xli-oai	2a17b32dfa	Stabilize marketplace/remove installedRoot test (#17721 ) ## Why This addresses the review comment from #17751 about `marketplace/remove` app-server test portability: https://github.com/openai/codex/pull/17751#discussion_r3104378613 The API returns the removed installed root using the app-server's effective `CODEX_HOME`. On macOS, temporary directory paths can appear as either `/var/...` or `/private/var/...`, so comparing one raw path against another can fail even when `marketplace/remove` behaves correctly. ## What changed - Removed the direct whole-response equality assertion for the installed root path. - Asserted the stable response field, `marketplace_name`, directly. - Compared the expected and returned installed-root paths after canonicalizing their existing parent directories, which avoids requiring the removed leaf directory to still exist. ## Verification - `cargo test -p codex-app-server marketplace_remove_deletes_config_and_installed_root` - `cargo test -p codex-app-server marketplace_remove`	2026-04-20 03:11:45 -07:00
xli-oai	1dc3535e17	[codex] Add marketplace/remove app-server RPC (#17751 ) ## Summary Add a new app-server `marketplace/remove` RPC on top of the shared marketplace-remove implementation. This change: - adds `MarketplaceRemoveParams` / `MarketplaceRemoveResponse` to the app-server protocol - wires the new request through `codex_message_processor` - reuses the shared core marketplace-remove flow from the stacked refactor PR - updates generated schema files and adds focused app-server coverage ## Validation - `just write-app-server-schema` - `just fmt` - heavy compile/test coverage deferred to GitHub CI per request	2026-04-19 23:22:49 -07:00
Adrian	e5b52a3caa	Persist and prewarm agent tasks per thread (#17978 ) ## Summary - persist registered agent tasks in the session state update stream so the thread can reuse them - prewarm task registration once identity registration succeeds, while keeping startup failures best-effort - isolate the session-side task lifecycle into a dedicated module so AgentIdentityManager and RegisteredAgentTask do not leak across as many core layers ## Testing - cargo test -p codex-core startup_agent_task_prewarm - cargo test -p codex-core cached_agent_task_for_current_identity_clears_stale_task - cargo test -p codex-core record_initial_history_	2026-04-19 15:45:28 -07:00
Ahmed Ibrahim	996aa23e4c	[5/6] Wire executor-backed MCP stdio (#18212 ) ## Summary - Add the executor-backed RMCP stdio transport. - Wire MCP stdio placement through the executor environment config. - Cover local and executor-backed stdio paths with the existing MCP test helpers. ## Stack ```text o #18027 [6/6] Fail exec client operations after disconnect │ @ #18212 [5/6] Wire executor-backed MCP stdio │ o #18087 [4/6] Abstract MCP stdio server launching │ o #18020 [3/6] Add pushed exec process events │ o #18086 [2/6] Support piped stdin in exec process API │ o #18085 [1/6] Add MCP server environment config │ o main ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-18 21:47:43 -07:00
pakrym-oai	53b1570367	Update image outputs to default to high detail (#18386 ) Do not assume the default `detail`.	2026-04-18 11:01:12 -07:00
Ahmed Ibrahim	5bb193aa88	Add max context window model metadata (#18382 ) Adds max_context_window to model metadata and routes core context-window reads through resolved model info. Config model_context_window overrides are clamped to max_context_window when present; without an override, the model context_window is used.	2026-04-17 21:48:14 -07:00
richardopenai	6b39d0c657	[codex] Add owner nudge app-server API (#18220 ) ## Summary Second PR in the split from #17956. Stacked on #18227. - adds app-server v2 protocol/schema support for `account/sendAddCreditsNudgeEmail` - adds the backend-client `send_add_credits_nudge_email` request and request body mapping - handles the app-server request with auth checks, backend call, and cooldown mapping - adds the disabled `workspace_owner_usage_nudge` feature flag and focused app-server/backend tests ## Validation - `cargo test -p codex-backend-client` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server rate_limits` - `cargo test -p codex-tui workspace_` - `cargo test -p codex-tui status_` - `just fmt` - `just fix -p codex-backend-client` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` - `just fix -p codex-tui`	2026-04-17 21:41:57 -07:00
xli-oai	def6467d2b	[codex] Describe uninstalled cross-repo plugin reads (#18449 ) ## Summary - Populate `PluginDetail.description` in core for uninstalled cross-repo plugins when detailed fields are unavailable until install. - Include the source Git URL plus optional path/ref/sha details in that fallback description. - Keep `details_unavailable_reason` as the structured signal while app-server forwards the description normally. - Add plugin-read coverage proving the response does not clone the remote source just to show the message. ## Why Uninstalled cross-repo plugins intentionally return sparse detail data so listing/reading does not clone the plugin source. Without a description, Desktop and TUI detail pages look like an ordinary empty plugin. This gives users a concrete explanation and source pointer while keeping the existing structured reason available for callers. ## Validation - `just fmt` - `cargo test -p codex-core read_plugin_for_config_uninstalled_git_source_requires_install_without_cloning` - `cargo test -p codex-app-server plugin_read --test all` - `just fix -p codex-core` - `just fix -p codex-app-server` Note: `cargo test -p codex-app-server` was also attempted before the latest refactor and failed broadly in unrelated v2 thread/realtime/review/skills suites; the new plugin-read test passed in that run as well.	2026-04-17 20:31:13 -07:00
xl-openai	3f7222ec76	feat: Budget skill metadata and surface trimming as a warning (#18298 ) Cap the model-visible skills section to a small share of the context window, with a fallback character budget, and keep only as many implicit skills as fit within that budget. Emit a non-fatal warning when enabled skills are omitted, and add a new app-server warning notification Record thread-start skill metrics for total enabled skills, kept skills, and whether truncation happened --------- Co-authored-by: Matthew Zeng <mzeng@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-17 18:11:47 -07:00
viyatb-oai	370bed4bf4	fix: trust-gate project hooks and exec policies (#14718 ) ## Summary - trust-gate project `.codex` layers consistently, including repos that have `.codex/hooks.json` or `.codex/execpolicy/*.rules` but no `.codex/config.toml` - keep disabled project layers in the config stack so nested trusted project layers still resolve correctly, while preventing hooks and exec policies from loading until the project is trusted - update app-server/TUI onboarding copy to make the trust boundary explicit and add regressions for loader, hooks, exec-policy, and onboarding coverage ## Security Before this change, an untrusted repo could auto-load project hooks or exec policies from `.codex/` as long as `config.toml` was absent. This makes trust the single gate for project-local config, hooks, and exec policies. ## Stack - Parent of #15936 ## Test - cargo test -p codex-core without_config_toml --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 17:56:58 -07:00
xl-openai	26d9894a27	feat: Add remote plugin fields to plugin API (#17277 ) ## Summary Update the plugin API for the new remote plugin model. The mental model is no longer “keep local plugin state in sync with remote.” Instead, local and remote plugins are becoming separate sources. Remote catalog entries can be shown directly from the remote API before installation; after installation they are still downloaded into the local cache for execution, but remote installed state will come from the API and be held in memory rather than being read from config. • ## API changes - Remove `forceRemoteSync` from `plugin/list`, `plugin/install`, and `plugin/uninstall`. - Remove `remoteSyncError` from `plugin/list`. - Add remote-capable metadata to `plugin/list` / `plugin/read`: - nullable `marketplaces[].path` - `source: { type: "remote", downloadUrl }` - URL asset fields alongside local path fields: `composerIconUrl`, `logoUrl`, `screenshotUrls` - Make `plugin/read` and `plugin/install` source-compatible: - `marketplacePath?: AbsolutePathBuf \| null` - `remoteMarketplaceName?: string \| null` - exactly one source is required at runtime	2026-04-17 16:47:58 -07:00
Michael Bolin	96d35dd640	bazel: use native rust test sharding (#18082 ) ## Why The large Rust test suites are slow and include some of our flakiest tests, so we want to run them with Bazel native sharding while keeping shard membership stable between runs. This is the simpler follow-up to the explicit-label experiment in #17998. Since #18397 upgraded Codex to `rules_rs` `0.0.58`, which includes the stable test-name hashing support from hermeticbuild/rules_rust#14, this PR only needs to wire Codex's Bazel macros into that support. Using native sharding preserves BuildBuddy's sharded-test UI and Bazel's per-shard test action caching. Using stable name hashing avoids reshuffling every test when one test is added or removed. ## What Changed `codex_rust_crate` now accepts `test_shard_counts` and applies the right Bazel/rules_rust attributes to generated unit and integration test rules. Matched tests are also marked `flaky = True`, giving them Bazel's default three attempts. This PR shards these labels 8 ways: ```text //codex-rs/core:core-all-test //codex-rs/core:core-unit-tests //codex-rs/app-server:app-server-all-test //codex-rs/app-server:app-server-unit-tests //codex-rs/tui:tui-unit-tests ``` ## Verification `bazel query --output=build` over the selected public labels and their inner unit-test binaries confirmed the expected `shard_count = 8`, `flaky = True`, and `experimental_enable_sharding = True` attributes. Also verified that we see the shards as expected in BuildBuddy so they can be analyzed independently. Co-authored-by: Codex <noreply@openai.com>	2026-04-17 23:14:11 +00:00
xli-oai	0e111e08d0	[codex] Add cross-repo plugin sources to marketplace manifests (#18017 ) ## Summary - add first-class marketplace support for git-backed plugin sources - keep the newer marketplace parsing behavior from `main`, including alternate manifest locations and string local sources - materialize remote plugin sources during install, detail reads, and non-curated cache refresh - expose git plugin source metadata through the app-server protocol ## Details This teaches the marketplace parser to accept all of the following: - local string sources such as `"source": "./plugins/foo"` - local object sources such as `{"source":"local","path":"./plugins/foo"}` - remote repo-root sources such as `{"source":"url","url":"https://github.com/org/repo.git"}` - remote subdir sources such as `{"source":"git-subdir","url":"owner/repo","path":"plugins/foo","ref":"main","sha":"..."}` It also preserves the newer tolerant behavior from `main`: invalid or unsupported plugin entries are skipped instead of breaking the whole marketplace. ## Validation - `cargo test -p codex-core plugins::marketplace::tests` - `just fix -p codex-core` - `just fmt` ## Notes - A full `cargo test -p codex-core` run still hit unrelated existing failures in agent and multi-agent tests during this session; the marketplace-focused suite passed after the rebase resolution.	2026-04-17 15:11:42 -07:00
Michael Bolin	1265df0ec2	refactor: narrow async lock guard lifetimes (#18211 ) Follow-up to https://github.com/openai/codex/pull/18178, where we called out enabling the await-holding lint as a follow-up. The long-term goal is to enable Clippy coverage for async guards held across awaits. This PR is intentionally only the first, low-risk cleanup pass: it narrows obvious lock guard lifetimes and leaves `codex-rs/Cargo.toml` unchanged so the lint is not enabled until the remaining cases are fixed or explicitly justified. It intentionally leaves the active-turn/turn-state locking pattern alone because those checks and mutations need to stay atomic. ## Common fixes used here These are the main patterns reviewers should expect in this PR, and they are also the patterns to reach for when fixing future `await_holding_` findings: - Scope the guard to the synchronous work.* If the code only needs data from a locked value, move the lock into a small block, clone or compute the needed values, and do the later `.await` after the block. - Use direct one-line mutations when there is no later await. Cases like `map.lock().await.remove(&id)` are acceptable when the guard is only needed for that single mutation and the statement ends before any async work. - Drain or clone work out of the lock before notifying or awaiting. For example, the JS REPL drains pending exec senders into a local vector and the websocket writer clones buffered envelopes before it serializes or sends them. - Use a `Semaphore` only when serialization is intentional across async work. The test serialization guards intentionally span awaited setup or execution, so using a semaphore communicates "one at a time" without holding a mutex guard. - Remove the mutex when there is only one owner. The PTY stdin writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not protect shared access because nothing else had access to the writer. - Do not split locks that protect an atomic invariant. This PR deliberately leaves active-turn/turn-state paths alone because those checks and mutations need to stay atomic. Those cases should be fixed separately with a design change or documented with `#[expect]`. ## What changed - Narrow scoped async mutex guards in app-server, JS REPL, network approval, remote-control websocket, and the RMCP test server. - Replace test-only async mutex serialization guards with semaphores where the guard intentionally lives across async work. - Let the PTY pipe writer task own stdin directly instead of wrapping it in an async mutex. ## Verification - `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` - `just clippy -p codex-core` - `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was run; the app-server suite passed, and `codex-core` failed in the local sandbox on six otel approval tests plus `suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`, which appear to depend on local command approval/default rules and `CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.	2026-04-17 14:06:50 -07:00
richardopenai	139fa8b8f2	[codex] Propagate rate limit reached type (#18227 ) ## Summary First PR in the split from #17956. - adds the core/app-server `RateLimitReachedType` shape - maps backend `rate_limit_reached_type` into Codex rate-limit snapshots - carries the field through app-server notifications/responses and generated schemas - updates existing constructors/tests for the new optional field ## Validation - `cargo test -p codex-backend-client` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server rate_limits` - `cargo test -p codex-tui workspace_` - `cargo test -p codex-tui status_` - `just fmt` - `just fix -p codex-backend-client` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` - `just fix -p codex-tui`	2026-04-17 13:37:25 -07:00
David de Regt	eaf78e43f2	Add sorting/backwardsCursor to thread/list and new thread/turns/list api (#17305 ) To improve performance of UI loads from the app, add two main improvements: 1. The `thread/list` api now gets a `sortDirection` request field and a `backwardsCursor` to the response, which lets you paginate forwards and backwards from a window. This lets you fetch the first few items to display immediately while you paginate to fill in history, then can paginate "backwards" on future loads to catch up with any changes since the last UI load without a full reload of the entire data set. 2. Added a new `thread/turns/list` api which also has sortDirection and backwardsCursor for the same behavior as `thread/list`, allowing you the same small-fetch for immediate display followed by background fill-in and resync catchup.	2026-04-17 11:49:02 -07:00
sayan-oai	6991be7ead	enable tool search over dynamic tools (#18263 ) ## Summary - Normalize deferred MCP and dynamic tools into `ToolSearchEntry` values before constructing `ToolSearchHandler`. - Move the tool-search entry adapter out of `tools/handlers` and into `tools/tool_search_entry.rs` so the handlers directory stays focused on handlers. - Keep `ToolSearchHandler` operating over one generic entry list for BM25 search, namespace grouping, and per-bucket default limits. ## Why Follow-up cleanup for #17849. The dynamic tool-search support made the handler juggle source-specific MCP and dynamic tool lists, index arithmetic, output conversion, and namespace emission. This keeps source adaptation outside the handler so the search loop itself is smaller and source-agnostic. ## Validation - `just fmt` - `cargo test -p codex-core tools::handlers::tool_search::tests` - `git diff --check` - `cargo test -p codex-core` currently fails in unrelated `plugins::manager::tests::list_marketplaces_ignores_installed_roots_missing_from_config`; rerunning that single test fails the same way at `core/src/plugins/manager_tests.rs:1692`. --------- Co-authored-by: pash <pash@openai.com>	2026-04-18 02:07:59 +08:00
Tom	fad3d0f1d0	codex: route thread/read persistence through thread store (#18352 ) Summary - replace the thread/read persisted-load helper with ThreadStore::read_thread - move SQLite/rollout summary, name, fork metadata, and history loading for persisted reads into LocalThreadStore - leave getConversationSummary unchanged for a later PR Context - Replaces closed stacked PR #18232 after PR #18231 merged and its base branch was deleted.	2026-04-17 10:31:30 -07:00
Won Park	af7b8d551c	Guardian -> Auto-Review (#18021 ) This PR is a user-facing change for our rebranding of guardian to auto-review.	2026-04-17 09:56:24 -07:00
viyatb-oai	dae0608c06	feat(config): support managed deny-read requirements (#17740 ) ## Summary - adds managed requirements support for deny-read filesystem entries - constrains config layers so managed deny-read requirements cannot be widened by user-controlled config - surfaces managed deny-read requirements through debug/config plumbing This PR lets managed requirements inject deny-read filesystem constraints into the effective filesystem sandbox policy. User-controlled config can still choose the surrounding permission profile, but it cannot remove or weaken the managed deny-read entries. ## Managed deny-read shape A managed requirements file can declare exact paths and glob patterns under `[permissions.filesystem]`: ```toml # /etc/codex/requirements.toml [permissions.filesystem] deny_read = [ "/Users/alice/.gitconfig", "/Users/alice/.ssh", "./managed-private/*/.env", ] ``` Those entries are compiled into the effective filesystem policy as `access = none` rules, equivalent in shape to filesystem permission entries like: ```toml [permissions.workspace.filesystem] "/Users/alice/.gitconfig" = "none" "/Users/alice/.ssh" = "none" "/absolute/path/to/managed-private/*/.env" = "none" ``` The important difference is that the managed entries come from requirements, so lower-precedence user config cannot remove them or make those paths readable again. Relative managed `deny_read` entries are resolved relative to the directory containing the managed requirements file. Glob entries keep their glob suffix after the non-glob prefix is normalized. ## Runtime behavior - Managed `deny_read` entries are appended to the effective `FileSystemSandboxPolicy` after the selected permission profile is resolved. - Exact paths become `FileSystemPath::Path { access: None }`; glob patterns become `FileSystemPath::GlobPattern { access: None }`. - When managed deny-read entries are present, `sandbox_mode` is constrained to `read-only` or `workspace-write`; `danger-full-access` and `external-sandbox` cannot silently bypass the managed read-deny policy. - On Windows, the managed deny-read policy is enforced for direct file tools, but shell subprocess reads are not sandboxed yet, so startup emits a warning for that platform. - `/debug-config` shows the effective managed requirement as `permissions.filesystem.deny_read` with its source. ## Stack 1. #15979 - glob deny-read policy/config/direct-tool support 2. #18096 - macOS and Linux sandbox enforcement 3. This PR - managed deny-read requirements --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 08:40:09 -07:00
alexsong-oai	20b4b80426	Sync local plugin imports, async remote imports, refresh caches after… (#18246 ) … import ## Why `externalAgentConfig/import` used to spawn plugin imports in the background and return immediately. That meant local marketplace imports could still be in flight when the caller refreshed plugin state, so newly imported plugins would not show up right away. This change makes local marketplace imports complete before the RPC returns, while keeping remote marketplace imports asynchronous so we do not block on remote fetches. ## What changed - split plugin migration details into local and remote marketplace imports based on the external config source - import local marketplaces synchronously during `externalAgentConfig/import` - return pending remote plugin imports to the app-server so it can finish them in the background - clear the plugin and skills caches before responding to plugin imports, and again after background remote imports complete, so the next `plugin/list` reloads fresh state - keep marketplace source parsing encapsulated behind `is_local_marketplace_source(...)` instead of re-exporting the internal enum - add core and app-server coverage for the synchronous local import path and the pending remote import path ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core` (currently fails an existing unrelated test: `config_loader::tests::cli_override_can_update_project_local_mcp_server_when_project_is_trusted`) - `cargo test` (currently fails existing `codex-app-server` integration tests in MCP/skills/thread-start areas, plus the unrelated `codex-core` failure above)	2026-04-17 09:34:55 +00:00
sashank-oai	22f7ef1cb7	[codex] Revoke ChatGPT tokens on logout (#17825 ) ## Summary This changes Codex logout so managed ChatGPT auth is revoked against AuthAPI before local auth state is removed. CLI logout, TUI `/logout`, and the app-server account logout path now use the token-revoking logout flow instead of only deleting `auth.json` / credential store state. ## Root Cause Logout previously cleared only local auth storage. That removed Codex's local credentials but did not ask the backend to invalidate the refresh/access token state associated with a managed ChatGPT login. ## Behavior For managed ChatGPT auth, logout sends the stored refresh token to `https://auth.openai.com/oauth/revoke` with `token_type_hint: refresh_token` and the Codex OAuth client id, then deletes all local auth stores after revocation succeeds. If only an access token is available, it falls back to revoking that access token. API key auth and externally supplied `chatgptAuthTokens` are still only cleared locally because Codex does not own a refresh token for those modes. Revocation failures are fail-closed: if Codex cannot load stored auth or the backend revoke call fails, logout returns an error and leaves local auth in place so the user can retry instead of silently clearing local state while backend tokens remain valid. ## Validation ran local version of `codex-cli` with staging overrides/harness for auth ran `codex login` then `codex logout`: saw auth.json clear and backend revocation endpoints were called ``` POST /oauth/revoke status: 200 revoking access token should clear auth session clearing auth session due to token revocation successfully revoked session and access token CANONICAL-API-LINE Response: status='200' method='POST' path='/oauth/revoke ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 22:51:21 -07:00
Tom	9d6f4f2e2e	codex: split thread/read view loading (#18231 ) Summary - refactor thread/read into explicit persisted-load, live-load, and merge steps - preserve existing SQLite/filesystem/live-thread behavior exactly - keep ThreadStore migration out of this PR so the next PR is easier to review Validation - this one's a pure reorganization that relies on existing test coverage	2026-04-16 21:06:03 -07:00
xl-openai	37161bc76e	feat: Handle alternate plugin manifest paths (#18182 ) Load plugin manifests through a shared discoverable-path helper so manifest reads, installs, and skill names all see the same alternate manifest location.	2026-04-16 19:43:19 -07:00
pakrym-oai	9effa0509f	Refactor config loading to use filesystem abstraction (#18209 ) Initial pass propagating FileSystem through config loading.	2026-04-17 00:51:21 +00:00
Matthew Zeng	bf6e7e12aa	Use in-process app-server for unknown-thread MCP read test (#18196 ) ## Summary - Switch the unknown-thread MCP resource read test from the stdio subprocess to the in-process app-server path. - Keep the assertion focused on the returned error message while avoiding child-process teardown timing issues in nextest. ## Testing - Not run (not requested)	2026-04-16 23:46:15 +00:00
bxie-openai	6a1ddfc366	[codex] Update realtime V2 VAD silence delay and 1.5 prompt (#18092 ) ## Summary - set the realtime v2 server VAD silence delay to 500ms - update the default realtime 1.5 backend prompt to the v4 text - keep the session payload and prompt rendering tests aligned with those changes ## Why - the VAD change gives the voice path a longer pause before ending the user's turn - the prompt change makes the default bundled realtime prompt match the current v4 content ## Validation - `cargo +1.93.0 test -p codex-core realtime_prompt --manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml` - `CARGO_TARGET_DIR=/tmp/codex-pr-v4-target cargo +1.93.0 test -p codex-api realtime_v2_session_update_includes_background_agent_tool_and_handoff_output_item --manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml` - `CARGO_TARGET_DIR=/tmp/codex-pr-v4-target cargo +1.93.0 test -p codex-app-server --test all 'suite::v2::realtime_conversation::realtime_webrtc_start_emits_sdp_notification' --manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml -- --exact`	2026-04-16 14:30:57 -07:00
Adrian	55c3de75cb	Register agent tasks behind use_agent_identity (#17387 ) ## Summary Stack PR3 for feature-gated agent identity support. This PR adds per-thread agent task registration behind `features.use_agent_identity`. Tasks are minted on the first real user turn and cached in thread runtime state for later turns. ## Stack - PR1: https://github.com/openai/codex/pull/17385 - add `features.use_agent_identity` - PR2: https://github.com/openai/codex/pull/17386 - register agent identities when enabled - PR3: https://github.com/openai/codex/pull/17387 - this PR, original task registration slice - PR3.1: https://github.com/openai/codex/pull/17978 - persist and prewarm registered tasks per thread - PR4: https://github.com/openai/codex/pull/17980 - use `AgentAssertion` downstream when enabled ## Validation Covered as part of the local stack validation pass: - `just fmt` - `cargo test -p codex-core --lib agent_identity` - `cargo test -p codex-core --lib agent_assertion` - `cargo test -p codex-core --lib websocket_agent_task` - `cargo test -p codex-api api_bridge` - `cargo build -p codex-cli --bin codex` ## Notes The full local app-server E2E path is still being debugged after PR creation. The current branch stack is directionally ready for review while that follow-up continues.	2026-04-16 14:30:02 -07:00
Felipe Coury	ec8d4bfc77	fix(app-server): replay token usage after resume and fork (#18023 ) ## Problem When a user resumed or forked a session, the TUI could render the restored thread history immediately, but it did not receive token usage until a later model turn emitted a fresh usage event. That left the context/status UI blank or stale during the exact window where the user expects resumed state to look complete. Core already reconstructed token usage from the rollout; the missing behavior was app-server lifecycle replay to the client that just attached. ## Mental model Token usage has two representations. The rollout is the durable source of historical `TokenCount` events, and the core session cache is the in-memory snapshot reconstructed from that rollout on resume or fork. App-server v2 clients do not read core state directly; they learn about usage through `thread/tokenUsage/updated`. The fix keeps those roles separate: core exposes the restored `TokenUsageInfo`, and app-server sends one targeted notification after a successful `thread/resume` or `thread/fork` response when that restored snapshot exists. This notification is not a new model event. It is a replay of already-persisted state for the client that just attached. That distinction matters because using the normal core event path here would risk duplicating `TokenCount` entries in the rollout and making future resumes count historical usage twice. ## Non-goals This change does not add a new protocol method or payload shape. It reuses the existing v2 `thread/tokenUsage/updated` notification and the TUI’s existing handler for that notification. This change does not alter how token usage is computed, accumulated, compacted, or written during turns. It only exposes the token usage that resume and fork reconstruction already restored. This change does not broadcast historical usage replay to every subscribed client. The replay is intentionally scoped to the connection that requested resume or fork so already-attached clients are not surprised by an old usage update while they may be rendering live activity. ## Tradeoffs Sending the usage notification after the JSON-RPC response preserves a clear lifecycle order: the client first receives the thread object, then receives restored usage for that thread. The tradeoff is that usage is still a notification rather than part of the `thread/resume` or `thread/fork` response. That keeps the protocol shape stable and avoids duplicating usage fields across response types, but clients must continue listening for notifications after receiving the response. The helper selects the latest non-in-progress turn id for the replayed usage notification. This is conservative because restored usage belongs to completed persisted accounting, not to newly attached in-flight work. The fallback to the last turn preserves a stable wire payload for unusual histories, but histories with no meaningful completed turn still have a weak attribution story. ## Architecture Core already seeds `Session` token state from the last persisted rollout `TokenCount` during `InitialHistory::Resumed` and `InitialHistory::Forked`. The new core accessor exposes the complete `TokenUsageInfo` through `CodexThread` without giving app-server direct session mutation authority. App-server calls that accessor from three lifecycle paths: cold `thread/resume`, running-thread resume/rejoin, and `thread/fork`. In each path, the server sends the normal response first, then calls a shared helper that converts core usage into `ThreadTokenUsageUpdatedNotification` and sends it only to the requesting connection. The tests build fake rollouts with a user turn plus a persisted token usage event. They then exercise `thread/resume` and `thread/fork` without starting another model turn, proving that restored usage arrives before any next-turn token event could be produced. ## Observability The primary debug path is the app-server JSON-RPC stream. After `thread/resume` or `thread/fork`, a client should see the response followed by `thread/tokenUsage/updated` when the source rollout includes token usage. If the notification is absent, check whether the rollout contains an `event_msg` payload of type `token_count`, whether core reconstruction seeded `Session::token_usage_info`, and whether the connection stayed attached long enough to receive the targeted notification. The notification is sent through the existing `OutgoingMessageSender::send_server_notification_to_connections` path, so existing app-server tracing around server notifications still applies. Because this is a replay, not a model turn event, debugging should start at the resume/fork handlers rather than the turn event translation in `bespoke_event_handling`. ## Tests The focused regression coverage is `cargo test -p codex-app-server emits_restored_token_usage`, which covers both resume and fork. The core reconstruction guard is `cargo test -p codex-core record_initial_history_seeds_token_info_from_rollout`. Formatting and lint/fix passes were run with `just fmt`, `just fix -p codex-core`, and `just fix -p codex-app-server`. Full crate test runs surfaced pre-existing unrelated failures in command execution and plugin marketplace tests; the new token usage tests passed in focused runs and within the app-server suite before the unrelated command execution failure.	2026-04-16 17:29:34 -03:00
starr-openai	62847e7554	Make thread unsubscribe test deterministic (#18000 ) ## Summary - replace the unsubscribe-during-turn test's sleep/polling flow with a gated streaming SSE response - add request-count notification support to the streaming SSE test server so the test can wait for the in-flight Responses request deterministically ## Scope - codex-rs/app-server/tests/suite/v2/thread_unsubscribe.rs - codex-rs/core/tests/common/streaming_sse.rs ## Validation - Not run locally; this is a narrow extraction from the prior CI-green branch. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 19:34:04 +00:00

1 2 3 4 5 ...

938 Commits