codex

mirror of https://github.com/openai/codex.git synced 2026-05-15 08:42:34 +00:00

Author	SHA1	Message	Date
jif-oai	b401666ca5	Add process-scoped SQLite telemetry (#22154 ) ## Summary - add SQLite init, backfill-gate, and fallback telemetry without introducing a cross-cutting state-db access wrapper - install one process-scoped telemetry sink after OTEL startup and let low-level state/rollout paths emit through it directly - add process-start metrics for the process owners that initialize SQLite --------- Co-authored-by: Owen Lin <owen@openai.com>	2026-05-11 11:32:40 -07:00
jif-oai	436c0df658	extension: wire extension registries into sessions (#21737 ) ## Why [#21736](https://github.com/openai/codex/pull/21736) introduces the typed extension API, but the runtime does not yet carry a registry through thread/session startup or give contributors host-owned stores to read from. This PR wires that host-side path so later feature migrations can move product-specific behavior behind typed contributions without adding another bespoke seam directly to `codex-core`. ## What changed - Thread `ExtensionRegistry<Config>` through `ThreadManager`, `CodexSpawnArgs`, `Session`, and sub-agent spawn paths. - Wire `ThreadStartContributor` and `ContextContributor` - Expose the small supporting surface needed by non-core callers that construct threads directly, including `empty_extension_registry()` through `codex-core-api`. This PR lands the host plumbing only: the app-server registry is still empty, and concrete feature migrations are intended to follow separately.	2026-05-11 11:38:18 +02:00
pakrym-oai	408e6218ab	Reapply "Move skills watcher to app-server" (#21652 ) ## Why PR #21460 reverted the earlier move of skills change watching from `codex-core` into app-server. This reapplies that boundary change so app-server owns client-facing `skills/changed` notifications and core no longer carries the watcher. ## What - Restore the app-server `SkillsWatcher` and register it from thread listener setup. - Remove the core-owned skills watcher and its core live-reload integration surface. - Restore app-server coverage for `skills/changed` notifications after a watched skill file changes. ## Validation - `cargo test -p codex-app-server --test all suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change -- --exact --nocapture` - `cargo test -p codex-core --lib --no-run`	2026-05-08 17:41:15 -07:00
Michael Zeng	8f4020846e	[codex] support executor registry remote environments (#21323 ) ## Summary Support registry-backed remote executors end to end so downstream services can resolve an executor id into an exec-server URL and make that environment available to Codex without relying on the legacy cloud environments flow. ## What changed - switch remote executor registration to the executor registry bootstrap contract - allow named remote environments to be inserted into `EnvironmentManager` at runtime - add the experimental app-server RPC `environment/add` so initialized experimental clients can register those remote environments for later `thread/start` and `turn/start` selection ## Validation Ran focused validation locally: - `cargo test -p codex-exec-server environment_manager_` - `cargo test -p codex-exec-server register_executor_posts_with_bearer_token_header` - `cargo test -p codex-app-server-protocol`	2026-05-08 16:30:07 -07:00
Jiaming Zhang	5f4d0ec343	[codex] request desktop attestation from app (#20619 ) ## Summary TL;DR: teaches `codex-rs` / app-server to request a desktop-provided attestation token and attach it as `x-oai-attestation` on the scoped ChatGPT Codex request paths. ![DeviceCheck attestation interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png) ## Details This PR teaches the Codex app-server runtime how to request and attach an attestation token. It does not generate DeviceCheck tokens directly; instead, it relies on the connected desktop app to advertise that it can generate attestation and then asks that app for a fresh header value when needed. The flow is: 1. The Codex desktop app connects to app-server. 2. During `initialize`, the app can advertise that it supports `requestAttestation`. 3. Before app-server calls selected ChatGPT Codex endpoints, it sends the internal server request `attestation/generate` to the app. 4. app-server receives a pre-encoded header value back. 5. app-server forwards that value as `x-oai-attestation` on the scoped outbound requests. The code in this repo is mostly protocol and runtime plumbing: it adds the app-server request/response shape, introduces an attestation provider in core, wires that provider into Responses / compaction / realtime setup paths, and covers the intended scoping with tests. The signed macOS DeviceCheck generation remains owned by the desktop app PR. ## Related PR - Codex desktop app implementation: https://github.com/openai/openai/pull/878649 ## Validation <details> <summary>Tests run</summary> ```sh cargo test -p codex-app-server-protocol cargo test -p codex-core attestation --lib cargo test -p codex-app-server --lib attestation ``` Also ran: ```sh just fix -p codex-core just fix -p codex-app-server just fix -p codex-app-server-protocol just fmt just write-app-server-schema ``` </details> <details> <summary>E2E DeviceCheck validation</summary> First validated the signed desktop app boundary directly: launched a packaged signed `Codex.app`, sent `attestation/generate`, decoded the returned `v1.` attestation header, and validated the extracted DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using bundle ID `com.openai.codex`. Apple returned `status_code: 200` and `is_ok: true`. Then ran the fuller app + app-server flow. The packaged `Codex.app` launched a current-branch app-server via `CODEX_CLI_PATH`, and a local MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server requested `attestation/generate` from the real Electron app process, and the intercepted `/backend-api/codex/responses` traffic included `x-oai-attestation` on both routes: ```text GET /backend-api/codex/responses Upgrade: websocket x-oai-attestation: present POST /backend-api/codex/responses Upgrade: none x-oai-attestation: present ``` The captured header decoded to a DeviceCheck token that also validated with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`, team `2DC432GLL2`). </details> --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-08 12:36:02 -07:00
Owen Lin	0d0835dd53	feat(app-server, threadstore): Thread pagination APIs and ThreadStore contract (#21566 ) ## Why The goal of this PR is to align on app-server and `ThreadStore` API updates for paginating through large threads. #### app-server ##### `thread/turns/list` - Updates `thread/turns/list` to support `itemsView?: "notLoaded" \| "summary" \| "full" \| null`, defaulting to `summary`. - Implements the current `thread/turns/list` behavior over the existing persisted rollout-history fallback: - `notLoaded` returns turn envelopes with empty `items`. - `summary` returns the first user message and final assistant message when available. - `full` preserves the existing full item behavior. Note that this method still uses the naive approach of loading the entire rollout file, and returns just the filtered slice of the data. Real pagination will come later by leveraging SQLite. ##### `thread/turns/items/list` - Adds the experimental `thread/turns/items/list` protocol, schema, dispatcher, and processor stub. The app-server currently returns JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`. #### ThreadStore - Adds the experimental `thread/turns/items/list` protocol, schema, dispatcher, and processor stub. The app-server currently returns JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`. - Adds `ThreadStore` contract types and stubbed methods for listing thread turns and listing items within a turn. - Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking app-server API enums or lossy string status values into the store-facing turn contract. - Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking app-server API enums or lossy string status values into the store-facing turn contract. This also sketches the storage abstraction we expect to need once turns are indexed/stored. In particular, `notLoaded` is useful only if ThreadStore can eventually list turn metadata without loading every persisted item for each turn. ## Validation - Added/updated protocol serialization coverage for the new request and response shapes. - Added app-server integration coverage for `thread/turns/list` default summary behavior and all three `itemsView` modes. - Added app-server integration coverage that `thread/turns/items/list` returns the expected unsupported JSON-RPC error when experimental APIs are enabled. - Added thread-store coverage that the default trait methods return `ThreadStoreError::Unsupported`. No developers.openai.com documentation update is needed for this internal experimental app-server API surface.	2026-05-07 15:44:43 -07:00
Ruslan Nigmatullin	e64a8979b0	device-key: clean up unused crate (#21487 )	2026-05-07 09:01:44 -07:00
pakrym-oai	a8488fec5e	Revert state DB injection and agent graph store (#21481 ) ## Why Reverts #20689 to restore the previous optional state DB plumbing. The conflict resolution keeps the newer installation ID and session/thread identity changes that landed after #20689, while removing the mandatory state DB and agent graph store dependency from ThreadManager construction. ## What changed - Restored `Option<StateDbHandle>` through app-server, MCP server, prompt debug, and test entry points. - Removed the `codex-core` dependency on `codex-agent-graph-store` and reverted descendant lookup back to the existing state DB path when available. - Kept newer `installation_id` forwarding by passing it beside the optional DB handle. - Kept local thread-name updates working when the optional state DB handle is absent. ## Validation - `git diff --check` - `cargo test -p codex-thread-store` - `cargo test -p codex-state -p codex-rollout -p codex-app-server-protocol` - Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p codex-app-server -p codex-app-server-client -p codex-mcp-server -p codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607 2026-01-19)` on `aarch64-apple-darwin`.	2026-05-06 22:48:29 -07:00
xli-oai	05cd5c313e	[codex] allow shared config reads in app-server queue (#21340 ) ## Summary - add a shared-read serialization mode for global app-server request families - let consecutive leading shared reads for the same family run together while keeping exclusive requests ordered - mark only `skills/list`, `config/read` and `plugin/list` as shared reads for now ## Why `skills/list` and `plugin/list` are read-only config-family requests, but the app-server queue currently treats every config request as exclusive. That means one long `skills/list` can make a later `plugin/list` wait even though the two requests do not mutate config. This change keeps the existing queue order but lets adjacent reads overlap. If a write is already waiting, later reads still stay behind it, so writes do not starve. ## Scope This intentionally keeps the first pass narrow: - shared reads: `skills/list`, `plugin/list` - still exclusive: `plugin/install`, `marketplace/`, `skills/config/write`, `config/write`, `config/read`, and the rest of the config family ## Validation - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` ## Desktop verification I ran the dev desktop app against this branch's built binary with the existing UI timing logs enabled. The app did use `/Users/xli/code/codex_6/codex-rs/target/debug/codex`. The new scheduler behavior works, but this narrow change does not remove every cold-start delay: in the observed trace, an earlier exclusive `config/read` was already queued ahead of the later `skills/list` and `plugin/list` requests, so the page-open plugin requests still waited behind that earlier exclusive config-family request before they could run together. That means this PR is the scheduler primitive needed for shared reads, not the complete end-to-end latency fix by itself. ## Not run - full workspace test suite, because repo policy requires explicit approval before running it after touching `app-server-protocol`	2026-05-06 21:16:31 -07:00
pakrym-oai	103dc2b6ae	Revert "Move skills watcher to app-server" (#21460 ) Reverts openai/codex#21287	2026-05-07 02:24:20 +00:00
pakrym-oai	d5eea229cc	Move skills watcher to app-server (#21287 ) ## Why Skills update notifications are app-server API behavior, but the watcher lived in `codex-core` and surfaced through `EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core focused on thread execution and lets app-server own both cache invalidation and the `skills/changed` notification. ## What changed - Added an app-server-owned skills watcher that watches local skill roots, clears the shared skills cache, and emits `skills/changed` directly. - Registers skill watches from the common app-server thread listener attach path, including direct starts, resumes, and app-server-observed child or forked threads. - Stores the `WatchRegistration` on `ThreadState`, so listener replacement, thread teardown, idle unload, and app-server shutdown deregister by dropping the RAII guard. - Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the old core live-reload test. - Extended the app-server skills change test to verify a cached skills list is refreshed after a filesystem change without forcing reload. ## Validation - `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p codex-rollout -p codex-rollout-trace` - `cargo test -p codex-app-server skills_changed_notification_is_emitted_after_skill_change`	2026-05-06 15:38:11 -07:00
rhan-oai	fbdbc6b2fe	[codex-analytics] emit tool item events from item lifecycle (#17090 ) ## Why After the tool-item schemas are in place, analytics needs to emit them from the app-server item lifecycle rather than requiring bespoke tracking at each callsite. The reducer should also reuse the shared thread analytics context introduced below it in the stack so later event families do not repeat the same reducer joins or missing-state ladder. ## What changed - Tracks tool-item completion notifications and emits the matching tool analytics event when a terminal item arrives. - Derives event-specific payload details for command execution, file changes, MCP calls, dynamic tools, collaboration tools, web search, and image generation. - Denormalizes thread, app-server client, runtime, and subagent provenance metadata through the shared thread analytics context. - Adds reducer coverage for item lifecycle emission and subagent metadata inheritance. ## Duration semantics `duration_ms` is computed from the app-server item lifecycle timestamps: `completed_at_ms - started_at_ms`. That makes it the duration of the lifecycle Codex observed locally, not necessarily the upstream provider's full execution time. - Web search usually has a meaningful observed lifecycle because Responses can send `response.output_item.added` before `response.output_item.done`; in that case `started_at_ms` comes from the added event and `completed_at_ms` comes from the done event. - Image generation can be much less precise. In the current observed stream, image generation often arrives only as a completed `response.output_item.done`; when there is no earlier added event, Codex synthesizes the started item immediately before completion, so `duration_ms` can be `0` even though upstream image generation took longer. - Standalone web search and standalone image generation work is expected to land after this stack. Those paths may introduce more direct lifecycle events or timing points, so the current web-search/image-generation duration semantics should be treated as the best available item-lifecycle approximation, not the final latency contract for those tool families. - `execution_duration_ms` is populated only where the completed item already carries a native execution duration; otherwise it remains `null` while `duration_ms` still reflects the local lifecycle interval. ## Currently placeholder / partial fields Some fields are included in the schema for the intended steady-state contract, but this PR does not yet populate them from real approval/review state: - `review_count`, `guardian_review_count`, and `user_review_count` currently default to `0`. - `final_approval_outcome` currently defaults to `unknown`. - `requested_additional_permissions` and `requested_network_access` currently default to `false`. ## Verification - `cargo test -p codex-analytics` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17090). * #18748 * #18747 * __->__ #17090 * #17089 * #20514	2026-05-06 20:27:41 +00:00
jif-oai	8f3bb355f4	Move installation ID resolution out of core startup (#21182 ) ## Summary - resolve or inject the installation ID before core startup and pass it through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain `String` - keep child sessions on the parent installation ID instead of rediscovering it inside core - propagate installation ID startup failures in `mcp-server` instead of panicking ## Why Core was still touching the filesystem on the session startup path to discover `installation_id`. This moves that work to the outer host boundary so core no longer depends on `codex_home` reads during session construction. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-06 10:48:54 +00:00
aaronl-openai	9f06d171e2	Preserve session MCP config on refresh (#21055 ) # Overview MCP refreshes were rebuilding active threads from fresh disk-backed config only, which dropped thread-start session overlays such as app-injected MCP servers. This keeps refreshes current with disk config while preserving the thread-local config that only the active thread knows about. # Changes - Rebuild refreshed config per active thread using that thread's current `cwd`, rather than fanning out one app-server config to every thread. - Preserve each thread's `SessionFlags` layer while replacing reloadable config layers with freshly loaded config, then derive the MCP refresh payload from the rebuilt result. - Move MCP refresh orchestration into app-server so manual refreshes fail loudly while background refreshes remain best-effort, and route plugin-triggered refreshes through the same per-thread reload path. - Add regression coverage for session overlays, fresh project config, plugin-derived MCP config, current requirements, and strict vs best-effort refresh behavior. # Verification - Passed focused Rust coverage for the thread-config rebuild behavior and deferred MCP refresh flow, plus `cargo test -p codex-app-server --lib`. - Verified end to end in the Codex dev app against the locally built CLI: registered an MCP via thread config, verified that it could be used successfully before refresh, manually triggered MCP refresh, and verified that it continued to be available afterward.	2026-05-05 21:09:28 -07:00
xl-openai	5119680f85	feat: Add plugin share access controls (#21124 ) Extends `plugin/share/save` to accept optional discoverability and shareTargets while uploading plugin contents, and adds `plugin/share/updateTargets` for share-only target updates without re-uploading.	2026-05-05 20:14:18 -07:00
Rasmus Rygaard	7e310bc7f3	Inject state DB, agent graph store (#20689 ) ## Why We want the agent graph store to be passed down the stack as a real dependency, the same way we already treat the thread store. This will let us inject the agent graph store as a real dependency and support implementations other than the local SQLite-backed one. Right now most code instantiates a state DB and an agent graph store just-in-time. Ideally, we would not depend on the state DB directly but only read through the higher-level interfaces. This change makes the dependency boundaries explicit and moves state DB initialization to process bootstrap instead of hiding it inside local store implementations. ## What changed - `ThreadManager` now requires a `StateDbHandle` and an `AgentGraphStore` at construction time instead of treating them as optional internals. - The local store constructors no longer lazily initialize SQLite. Callers now initialize the state DB once per process and use that shared handle to build: - `LocalThreadStore` - `LocalAgentGraphStore` - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the thread-manager sample) now initialize the state DB up front and inject the resulting handle down the stack. - `app-server` now consistently uses its process-scoped state DB handle instead of reopening SQLite or trying to recover it from loaded threads. - Device-key storage now reuses the shared state DB handle instead of maintaining its own lazy opener. - The thread archive / descendant traversal paths now use the injected `AgentGraphStore` instead of reaching through local thread-store-specific state. ## Verification - `cargo check -p codex-core -p codex-thread-store -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample --tests` - `cargo test -p codex-thread-store` - `cargo test -p codex-core thread_manager_accepts_separate_agent_graph_store_and_thread_store -- --nocapture` - `cargo test -p codex-app-server thread_archive_archives_spawned_descendants -- --nocapture`	2026-05-05 21:45:29 +00:00
Eric Traut	8c88f9a304	Auto-deny MCP elicitations for Xcode 26.4 clients (#21113 ) ## Summary Xcode 26.4 was built against app-server behavior from before MCP elicitation requests became client-visible in CLI 0.120.0 via #17043. That client line does not expect the new events/messages, so this PR restores the old behavior for exactly that client/version combination. The compatibility handling stays in the app-server layer: when the initialized client is `Xcode` and its version starts with `26.4`, the app server marks the live Codex thread so MCP elicitations are auto-denied. The flag is applied on thread start/resume/fork/turn attachment, carried through `Codex`/`CodexThread`, and stored on `McpConnectionManager` so refreshed MCP managers preserve the behavior. ## Notes This is intentionally narrow and includes a TODO to remove the compatibility path once Xcode 26.4 ages out.	2026-05-05 14:05:42 -07:00
iceweasel-oai	f35285dc78	Add Windows sandbox readiness RPC (#20708 ) ## Why The desktop app on Windows needs a read-only way to tell, before the next tool call, whether the local Windows sandbox setup is in a state that should block the user and ask for setup again. The main case we want to cover is the elevated sandbox setup version bump. Today, if the app is configured for elevated Windows sandboxing and the installed setup is stale, the next sandboxed shell/exec path can end up triggering the elevated setup flow directly. That means the user can see an unexpected UAC prompt with no UI explanation. This change adds a small app-server preflight so the desktop app can ask “is Windows sandbox ready, not configured, or update-required?” during startup and show the appropriate blocking UI before the user hits a tool call. ## What changed - Added a new read-only app-server RPC: `windowsSandbox/readiness` - Added a new protocol enum and response type: - `WindowsSandboxReadiness` - `WindowsSandboxReadinessResponse` - Added core readiness logic in `core/src/windows_sandbox.rs`: - `ready` - `notConfigured` - `updateRequired` - Wired the new request through `codex_message_processor` - Regenerated the vendored app-server schema fixtures ## Readiness semantics This is intentionally a coarse startup/version-bump readiness check, not a full predictor of every runtime repair case. For now, readiness is determined from: - the configured Windows sandbox level - `sandbox_setup_is_complete()` for elevated mode That means: - `disabled` maps to `notConfigured` - `restricted token` maps to `ready` - `elevated` maps to `ready` or `updateRequired` depending on `sandbox_setup_is_complete()` This is deliberate for the first UI integration because the common case we want to catch is “the app updated, the elevated setup version bumped, and the user should see an update-required blocker instead of a surprise UAC prompt”. It does not attempt to model every case where the deeper runtime path might decide to repair or re-run setup. ## Testing - Ran `cargo fmt --all -- app-server-protocol/src/protocol/common.rs app-server-protocol/src/protocol/v2.rs app-server/src/codex_message_processor.rs core/src/windows_sandbox.rs core/src/windows_sandbox_tests.rs` - Added unit tests for the pure readiness mapping in `core/src/windows_sandbox_tests.rs` - Regenerated vendored schema fixtures with `cargo run -p codex-app-server-protocol --bin write_schema_fixtures -- --schema-root app-server-protocol/schema` - Did not run the full cargo test suite	2026-05-05 09:58:23 -07:00
Tom	33d24b0df5	codex: migrate (more) app-server thread history reads to ThreadStore (#20575 ) Migrate token usage replay, rollback responses, and detached review setup (a special case of forking) to be served from ThreadStore reads rather direct rollout files. - replay restored token usage from already-loaded `RolloutItem` history instead of reopening `Thread.path` - rebuild rollback responses from loaded `ThreadStore` snapshots and history - start detached reviews from store-backed parent history and stored review-thread metadata - remove obsolete app-server rollout-summary helper code that became dead after the store-backed migration - preserve response/notification ordering for resume, fork, rollback, and detached review flows - add integration test coverage for the affected paths	2026-05-04 21:16:50 -07:00
Ruslan Nigmatullin	4950e7d8a6	[codex] Add unsandboxed process exec API (#19040 ) ## Why App-server clients sometimes need argv-based local process execution while sandbox policy is controlled outside Codex. Those environments can reject sandbox-disabling paths before a command ever starts, even when the caller intentionally wants unsandboxed execution. This PR adds a distinct `process/*` API for that use case instead of extending `command/exec` with another sandbox-disabling shape. Keeping the new surface separate also makes the future removal of `command/exec` simpler: clients that need explicit process lifecycle control can move to the newer handle-based API without depending on `command/exec` business logic. ## What changed - Added v2 process lifecycle methods: `process/spawn`, `process/writeStdin`, `process/resizePty`, and `process/kill`. - Added process notifications: `process/outputDelta` for streamed stdout/stderr chunks and `process/exited` for final exit status and buffered output. - Made `process/spawn` intentionally unsandboxed and omitted sandbox-selection fields such as `sandboxPolicy` and `permissionProfile`. - Added client-supplied, connection-scoped `processHandle` values for follow-up control requests and notification routing. - Supported cwd, environment overrides, PTY mode and size, stdin streaming, stdout/stderr streaming, per-stream output caps, and timeout controls. - Killed active process sessions when the originating app-server connection closes. - Wired the implementation through the modular `request_processors/` app-server layout, with process-handle request serialization for follow-up control calls. - Updated generated JSON/TypeScript schema fixtures and documented the new API in `codex-rs/app-server/README.md`. - Added v2 app-server integration coverage in `codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn acknowledgement before exit, buffered output caps, and process termination. ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server` --------- Co-authored-by: Owen Lin <owen@openai.com>	2026-05-04 16:43:58 -07:00
Ruslan Nigmatullin	4d201e340e	state: pass state db handles through consumers (#20561 ) ## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container.	2026-05-04 11:46:03 -07:00
pakrym-oai	33b19bcfde	[codex] Split app-server request processors (#20940 ) ## Why The app-server request path had grown around a large `CodexMessageProcessor` plus separate API wrapper/helper modules. That made the dependency graph hard to see and forced unrelated request families to share broad processor state. This PR makes the split mechanical and command-prefix oriented so request families own only the dependencies they use. ## What changed - Replaced `CodexMessageProcessor` with command-prefix request processors under `app-server/src/request_processors/`. - Removed the old config, device-key, external-agent-config, and fs API wrapper files by moving their API handling into processors. - Split apps, plugins, marketplace, catalog, account, MCP, command exec, fs, git, feedback, thread, turn, thread goals, and Windows sandbox handling into dedicated processors. - Kept shared lifecycle, summary conversion, token usage replay, and shared error mapping only where multiple processors use them; single-use helpers were inlined into their owning processor. - Removed the fallback processor path and moved processor tests to `_tests` files. ## Validation - `cargo test -p codex-app-server` - `cargo check -p codex-app-server` - `just fix -p codex-app-server`	2026-05-04 09:34:11 -07:00
pakrym-oai	9ddfda9db7	[codex] Refactor app-server dispatch result flow (#20897 ) ## Why App-server request handling had response sending spread across many individual handlers, which made it harder to see which requests return payloads, which methods send their own delayed response, and which branches emit notifications after a response. ## What changed - Centralized normal `ClientResponsePayload` sending in the dispatch path. - Kept explicit-response methods explicit where they need custom ordering or delayed delivery. - Removed forward-only handler wrappers and immediate `async { ... }.await` bodies where they were not needed. - Moved branch-specific post-response notifications into the branches that own the response ordering. - Replaced unreachable delegated request-family error arms with explicit `unreachable!` cases. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server thread_goal` - `just fix -p codex-app-server`	2026-05-03 18:57:46 -07:00
Tom	fe05acad23	Make thread store process-scoped (#19474 ) - Build one app-server process ThreadStore from startup config and share it with ThreadManager and CodexMessageProcessor. - Remove per-thread/fork store reconstruction so effective thread config cannot switch the persistence backend. - Add params to ThreadStore create/resume for specifying thread metadata, since otherwise the metadata from store creation would be used (incorrectly).	2026-04-30 21:24:59 -07:00
xl-openai	7b3de63041	Move plugin out of core. (#20348 )	2026-04-30 14:26:14 -07:00
alexsong-oai	9121132c8f	Send external import completion for sync imports (#20379 )	2026-04-30 13:03:21 -07:00
pakrym-oai	fedcefe9da	Reduce the surface of collaboration modes (#20149 ) Collaboration modes were slightly invasive both into ThreadManager construction and ModelProvider	2026-04-29 17:22:41 -07:00
stefanstokic-oai	c8abcbf925	Import external agent sessions in background (#20284 ) Summary: - Return from external agent import before session history import finishes - Run session import work in the background and emit the existing completion notification when it is done - Serialize session imports so duplicate requests do not create duplicate imported threads Verification: - cargo test -p codex-app-server external_agent_config_ - cargo test -p codex-external-agent-sessions - just fix -p codex-app-server - just fix -p codex-external-agent-sessions - git diff --check	2026-04-30 00:00:41 +00:00
rhan-oai	973c5c823e	[app-server] type client response payloads (#20050 ) ## Why `pr17088` adds typed server-originated request/response plumbing, but successful client responses are still erased into bare JSON-RPC `result` values before app-server can make any typed decision about them. This precursor PR keeps successful client responses typed until the outgoing response seam. It is intentionally limited to protocol/app-server plumbing so the analytics behavior change can review separately on top. ## What changed - Add `ClientResponsePayload` as the pre-serialization client response body type. - Route app-server successful response paths through the typed payload seam while preserving existing handler-local analytics behavior. - Keep `InterruptConversation` JSON-RPC-only because it has no `ClientResponse` variant. - Move the new payload conversion tests into a dedicated protocol test module. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server-protocol`	2026-04-29 20:50:47 +00:00
rhan-oai	0690ab0842	[codex-analytics] ingest server requests and responses (#17088 ) ## Why Codex analytics needs a typed seam for app-server-originated request/response traffic so future tool-approval analytics can consume those facts without adding bespoke callsite tracking each time. Server responses arrive as JSON-RPC `id + result` payloads, so analytics has to reconstruct the matching typed response from the original typed request while that request context still exists in app-server. This also puts analytics on the app-server outbound path, which needs to avoid keeping the runtime alive during shutdown. The final ownership fix keeps the normal strong auth-manager retention in analytics and makes the external-auth refresh bridge hold a weak back-reference to `OutgoingMessageSender`, breaking the runtime cycle at the bridge boundary instead of exposing retention policy through the analytics client API. ## What changed - Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus `AnalyticsEventsClient::track_server_request` and `track_server_response`. - Renames the existing client-side facts to `ClientRequest` and `ClientResponse` so reducers can distinguish client-to-server traffic from server-to-client traffic. - Adds `ServerRequest::response_from_result`, allowing a stored typed request to decode the matching typed server response from a raw JSON-RPC result payload. - Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and records targeted server requests, replayed targeted requests, and matching targeted responses with the responding connection id needed for correlation. - Intentionally leaves broadcast server requests/responses out of analytics for now because the current model is per connection, while broadcasts fan one logical request out across multiple connections. - Breaks the app-server shutdown cycle by storing `Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and upgrading it only when an external-auth refresh is actually requested. - Keeps reducer ingestion of the new server-side facts as no-ops for now; this PR is plumbing for later tool-approval analytics work. ## Verification - `cargo test -p codex-analytics` - `cargo test -p codex-app-server outgoing_message::tests::` - Covers typed-response reconstruction plus the targeted, replayed, broadcast-exclusion, and response-attribution analytics paths. ## Follow-up This PR intentionally stops at ingestion plumbing, so `ServerRequest` and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR adds real downstream analytics output for those facts: - replace the temporary pre-reducer observation seam with reducer tests for the emitted event shape; - add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs` for the real app-server workflow and captured analytics payload; - remove the temporary sender-level observer tests added here in favor of the real-output coverage above. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088). * #18748 * #18747 * #17090 * #17089 * #20241 * #20239 * __->__ #17088	2026-04-29 19:56:41 +00:00
xl-openai	73cd831952	feat: Use remote installed plugin cache for skills and MCP (#20096 ) - Fetches and caches remote /installed plugin state - Lets skills/list load skills from remote-installed cached plugins without requiring a local marketplace entry - Routes plugin list/startup/install/uninstall changes through async plugin cache invalidation and MCP refresh	2026-04-29 12:09:49 -07:00
Celia Chen	8c47e36504	feat: expose provider capability bounds to app server clients (#20049 ) follow up of #19442. The app server now exposes provider-derived bounds through a new v2 `modelProvider/read` method. The response reports the configured provider map key as `modelProvider` and returns the effective capability booleans so clients can align their UI with the same provider-owned limits used by core.	2026-04-29 01:36:19 +00:00
alexsong-oai	cb8b1bbcd6	Support detect and import MCP, Subagents, hooks, commands from external (#19949 ) ## Why This PR expands the migration path so Codex can detect and import MCP server config, hooks, commands, and subagents configs in a Codex-native shape. ## What changed - Added a `codex-external-agent-migration` crate that owns conversion logic for external-agent MCP servers, hooks, commands, and subagents. - Extended the app-server external-agent config detection/import API with migration item types for MCP server config, hooks, commands, and subagents. ## Migration strategy The migration is intentionally conservative: Codex only imports external-agent config that can be represented safely in Codex today. Unsupported or ambiguous config is skipped instead of being partially translated into behavior that may not match the source system. - MCP servers: import supported stdio and HTTP MCP server definitions into `mcp_servers`. Disabled servers and servers filtered out by source `enabledMcpjsonServers` / `disabledMcpjsonServers` are skipped. Project-scoped MCP entries from `.claude.json` are included when they match the repo path. - Hooks: import only supported command hooks into `.codex/hooks.json`. Unsupported hook features such as conditional groups, async handlers, prompt/http hooks, or unknown fields are skipped. Referenced hook scripts are copied into `.codex/hooks/`, preserving any existing target scripts. - Commands: import supported external commands as Codex skills under `.agents/skills/source-command-`. Commands that rely on source runtime expansion such as `$ARGUMENTS`, `$1`, `@file` references, shell interpolation, or colliding generated names are skipped. - Subagents: import valid subagent Markdown files into `.codex/agents/.toml` when they have the minimum Codex agent fields. Source model names are not migrated, so imported agents keep the user’s Codex default model; compatible reasoning effort and sandbox mode are migrated when present. - Skills and project guidance: copy missing skill directories into `.agents/skills` and migrate `CLAUDE.md` guidance into `AGENTS.md`, rewriting source-agent terminology to Codex terminology where appropriate. - Detection details: detected migration items include lightweight details for UI preview, such as MCP server names, hook event names, generated command skill names, and subagent names. Import still recomputes from disk instead of trusting details as the source of truth. - Adds focused coverage for the new migration behavior and app-server import flow. ## Verification - `cargo test -p codex-external-agent-migration` - `cargo test -p codex-hooks` - `cargo test -p codex-app-server external_agent_config` - `just bazel-lock-check`	2026-04-29 00:45:24 +00:00
Ruslan Nigmatullin	0700f979ba	app-server: run initialized rpcs with keyed serialization (#17373 ) ## Why Initialized app-server RPCs no longer need to bottleneck behind one request processor path. Running them concurrently improves responsiveness, but several request families still mutate shared state or depend on ordered side effects. Those stateful families need an auditable serialization contract so concurrency does not reorder thread, config, auth, command, watcher, MCP, or similar state transitions. This PR keeps that boundary explicit: stateful work is serialized by the smallest useful key, while intentionally read-only or externally concurrent work remains unkeyed. In particular, `thread/list` and `thread/turns/list` explicitly have no serialization because they primarily read append-only rollout storage and should continue to be served concurrently. ## What changed - Adds `ClientRequest::serialization_scope()` in `app-server-protocol` and requires every client request definition to declare its serialization behavior. - Introduces keyed request scopes for thread, thread path, command exec process, fuzzy search session, fs watch, MCP OAuth, and global state buckets such as config, account auth, memory, and device keys. - Routes initialized app-server RPCs through per-key FIFO serialization while allowing unkeyed initialized requests to run concurrently. - Cancels in-flight initialized RPC work when the connection disconnects or the app-server exits so spawned request tasks do not outlive their session. - Adds focused coverage for representative keyed and unkeyed serialization scopes, including explicitly concurrent `thread/turns/list` behavior. ## Validation - Added protocol tests for representative keyed serialization scopes and intentionally unkeyed request families. - Added app-server request serialization tests covering per-key FIFO behavior, concurrent unkeyed execution, disconnect shutdown, and config read-after-write ordering. - Local focused protocol validation after the latest rebase is currently blocked by packageproxy failing to resolve locked `rustls-webpki 0.103.13`; CI is expected to provide the full validation signal.	2026-04-28 12:23:34 -07:00
stefanstokic-oai	4c68bd728f	External agent session support (#19895 ) ## Summary This extends external agent detection/import beyond config artifacts so Codex can detect recent sessions files from the external agent home and import them into Codex rollout history. ## What changed - Added a focused `external_agent_sessions` module for: - session discovery - source-record parsing - rollout construction - import ledger tracking - Wired session detection/import into the app-server external agent config API. - Added compaction handling so large imported sessions can be resumed safely before the first follow-up turn. ## Testing Added coverage for: - recent-session detection - custom-title handling - recency filtering - dedupe and re-detect-after-source-change behavior - visible imported turn construction - backward-compatible import payload deserialization - end-to-end RPC import flow - rejection of undetected session paths - repeat-import behavior - large-session compaction before first follow-up Ran: - `cargo test -p codex-app-server external_agent_config_import_ --test all`	2026-04-28 17:42:36 +00:00
rhan-oai	215d5a8f7c	[codex-analytics] remove ga flag (#19863 )	2026-04-27 19:29:19 +00:00
pakrym-oai	2a020f1a0a	Lift app-server JSON-RPC error handling to request boundary (#19484 ) ## Why App-server request handling had a lot of repeated JSON-RPC error construction and one-off `send_error`/`return` branches. This made small handlers noisy and pushed error response details into leaf code that otherwise only needed to validate input or call the underlying API. ## What Changed - Added shared JSON-RPC error constructors in `codex-rs/app-server/src/error_code.rs`. - Lifted straightforward request result emission into `codex-rs/app-server/src/message_processor.rs` so response/error dispatch happens at the request boundary. - Reused the result helpers across command exec, config, filesystem, device-key, external-agent config, fs-watch, and outgoing-message paths. - Removed leaf wrapper handlers where the method body was only forwarding to a response helper. - Returned request validation errors upward in the simple cases instead of sending an error locally and immediately returning. ## Verification - `cargo test -p codex-app-server --lib command_exec::tests` - `cargo test -p codex-app-server --lib outgoing_message::tests` - `cargo test -p codex-app-server --lib in_process::tests` - `cargo test -p codex-app-server --test all v2::fs` - `cargo test -p codex-app-server --test all v2::config_rpc` - `cargo test -p codex-app-server --test all v2::external_agent_config` - `cargo test -p codex-app-server --test all v2::initialize` - `just fix -p codex-app-server` - `git diff --check` Note: full `cargo test -p codex-app-server` was attempted and stopped in `message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spans` with a stack overflow after unrelated tests had already passed.	2026-04-26 15:10:35 -07:00
Michael Bolin	ac2bffa443	test: harden app-server integration tests (#19683 ) ## Why Windows Bazel runs in the permissions stack exposed that app-server integration tests were launching normal plugin startup warmups in every subprocess. Those warmups can call `https://chatgpt.com/backend-api/plugins/featured` when a test is not specifically exercising plugin startup, which adds slow background work, noisy stderr, and dependence on external network state. The relevant startup/featured-plugin behavior was introduced across #15042 and #15264. A few app-server tests also had long optional waits or unbounded cleanup paths, making failures expensive to diagnose and contributing to slow Windows shards. One external-agent config test from #18246 used a GitHub-style marketplace source, which was enough to exercise the pending remote-import path but also meant the background completion task could attempt a real clone. ## What Changed - Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks` plumbing and a hidden debug-only `--disable-plugin-startup-tasks-for-tests` app-server flag, so integration tests can suppress startup plugin warmups without adding a production env-var gate. - Has the app-server test harness pass that hidden flag by default, while opting plugin-startup coverage back in for tests that intentionally exercise startup sync and featured-plugin warmup behavior. - Lowers normal app-server subprocess logging from `info`/`debug` to `warn` to avoid multi-megabyte stderr output in Bazel logs. - Prevents the external-agent config test from attempting a real marketplace clone by using an invalid non-local source while still exercising the pending-import completion path. - Bounds optional filesystem/realtime waits and fake WebSocket test-server shutdown so failures produce targeted timeouts instead of hanging a shard. - Fixes the Unix script-resolution test in `rmcp-client` to exercise PATH resolution directly and include the actual spawn error in failures. ## Verification - `cargo check -p codex-app-server` - `cargo clippy -p codex-app-server --tests -- -D warnings` - `cargo test -p codex-rmcp-client program_resolver::tests::test_unix_executes_script_without_extension` - `cargo test -p codex-app-server --test all external_agent_config_import_sends_completion_notification_after_pending_plugins_finish -- --nocapture` - `cargo test -p codex-app-server --test all plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request -- --nocapture` - Windows Local Bazel passed with this test-hardening bundle before it was extracted from #19606. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683). * #19395 * #19394 * #19393 * #19392 * #19606 * __->__ #19683	2026-04-26 12:43:16 -07:00
Ruslan Nigmatullin	19badb0be2	app-server: persist device key bindings in sqlite (#19206 ) ## Why Device-key providers should only own platform key material. The account/client binding used to authorize a signing payload is app-server state, and keeping that state in provider-specific metadata makes the same check harder to audit and harder to share across platform implementations. Persisting the binding in the shared state database gives the device-key crate a platform-neutral source of truth before it asks a provider to sign. It also lets app-server move potentially blocking key operations off the main message processor path, which matters once providers may wait for OS authentication prompts. ## What changed - Add a `device_key_bindings` state migration plus `StateRuntime` helpers keyed by `key_id`. - Add an async `DeviceKeyBindingStore` abstraction to `codex-device-key` and use it from `DeviceKeyStore::create` and `DeviceKeyStore::sign`. - Keep provider calls behind async store methods and run the synchronous provider work through `spawn_blocking`. - Wire app-server device-key RPC handling to the SQLite-backed binding store and spawn response/error delivery tasks for device-key requests. - Run the turn-start tracing test on the existing larger current-thread test harness after the larger async surface made the default test stack too small locally. ## Validation - `cargo test -p codex-device-key` - `cargo test -p codex-state device_key` - `cargo test -p codex-state` - `cargo test -p codex-app-server device_key` - `cargo test -p codex-app-server message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spans` - `cargo test -p codex-app-server` - `just fix -p codex-device-key` - `just fix -p codex-state` - `just fix -p codex-app-server` - `just bazel-lock-update` - `just bazel-lock-check` - `git diff --check`	2026-04-23 21:55:56 -07:00
efrazer-oai	5882f3f95e	refactor: route Codex auth through AuthProvider (#18811 ) ## Summary This PR moves Codex backend request authentication from direct bearer-token handling to `AuthProvider`. The new `codex-auth-provider` crate defines the shared request-auth trait. `CodexAuth::provider()` returns a provider that can apply all headers needed for the selected auth mode. This lets ChatGPT token auth and AgentIdentity auth share the same callsite path: - ChatGPT token auth applies bearer auth plus account/FedRAMP headers where needed. - AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers where needed. Reference old stack: https://github.com/openai/codex/pull/17387/changes ## Callsite Migration \| Area \| Change \| \| --- \| --- \| \| backend-client \| accepts an `AuthProvider` instead of a raw token/header \| \| chatgpt client/connectors \| applies auth through `CodexAuth::provider()` \| \| cloud tasks \| keeps Codex-backend gating, applies auth through provider \| \| cloud requirements \| uses Codex-backend auth checks and provider headers \| \| app-server remote control \| applies provider headers for backend calls \| \| MCP Apps/connectors \| gates on `uses_codex_backend()` and keys caches from generic account getters \| \| model refresh \| treats AgentIdentity as Codex-backend auth \| \| OpenAI file upload path \| rejects non-Codex-backend auth before applying headers \| \| core client setup \| keeps model-provider auth flow and allows AgentIdentity through provider-backed OpenAI auth \| ## Stack 1. https://github.com/openai/codex/pull/18757: full revert 2. https://github.com/openai/codex/pull/18871: isolated Agent Identity crate 3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity auth mode and startup task allocation 4. This PR: migrate Codex backend auth callsites through AuthProvider 5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs and load `CODEX_AGENT_IDENTITY` ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-23 17:14:02 -07:00
Ruslan Nigmatullin	8a0ab3fc13	app-server: add Unix socket transport (#18255 ) ## Summary - add unix:// app-server transport backed by the shared codex-uds crate - reuse the websocket connection loop for axum and tungstenite-backed streams - add codex app-server proxy to bridge stdio clients to the control socket - tolerate Windows UDS backends that report a missing rendezvous path as connection refused before binding ## Tests - cargo test -p codex-app-server control_socket_acceptor_forwards_websocket_text_messages_and_pings - cargo test -p codex-app-server - just fmt - just fix -p codex-app-server - git -c core.fsmonitor=false diff --check	2026-04-23 11:09:25 -07:00
starr-openai	ddbe2536be	Support multiple managed environments (#18401 ) ## Summary - refactor EnvironmentManager to own keyed environments with default/local lookup helpers - keep remote exec-server client creation lazy until exec/fs use - preserve disabled agent environment access separately from internal local environment access ## Validation - not run (per Codex worktree instruction to avoid tests/builds unless requested) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 15:29:35 -07:00
Ruslan Nigmatullin	69c3d12274	app-server: implement device key v2 methods (#18430 ) ## Why The device-key protocol needs an app-server implementation that keeps local key operations behind the same request-processing boundary as other v2 APIs. app-server owns request dispatch, transport policy, documentation, and JSON-RPC error shaping. `codex-device-key` owns key binding, validation, platform provider selection, and signing mechanics. Keeping the adapter thin makes the boundary easier to review and avoids moving local key-management details into thread orchestration code. ## What changed - Added `DeviceKeyApi` as the app-server adapter around `DeviceKeyStore`. - Converted protocol protection policies, payload variants, algorithms, and protection classes to and from the device-key crate types. - Encoded SPKI public keys and DER signatures as base64 protocol fields. - Routed `device/key/create`, `device/key/public`, and `device/key/sign` through `MessageProcessor`. - Rejected remote transports before provider access while allowing local `stdio` and in-process callers to reach the device-key API. - Added stdio, in-process, and websocket tests for device-key validation and transport policy. - Documented the device-key methods in the app-server v2 method list. ## Test coverage - `device_key_create_rejects_empty_account_user_id` - `in_process_allows_device_key_requests_to_reach_device_key_api` - `device_key_methods_are_rejected_over_websocket` ## Stack This is PR 3 of 4 in the device-key app-server stack. It is stacked on #18429. ## Validation - `cargo test -p codex-app-server device_key` - `just fix -p codex-app-server`	2026-04-21 14:07:08 -07:00
pakrym-oai	ffa6944587	Load app-server config through ConfigManager (#18870 ) ## Summary - Load app-server startup config through `ConfigManager` instead of direct `ConfigBuilder` calls. - Move `ConfigManager` constructor-owned state (`cli_overrides`, runtime feature map, cloud requirements loader) behind internal manager fields. - Pass `ConfigManager` into `MessageProcessor` directly instead of reconstructing it from raw args. ## Tests - `cargo check -p codex-app-server` - `cargo test -p codex-app-server` - `just fix -p codex-app-server` - `just fmt`	2026-04-21 14:01:02 -07:00
pakrym-oai	5fe767e8e1	Refactor app-server config loading into ConfigManager (#18442 ) Localize app-server configuration loading in one place.	2026-04-21 10:22:26 -07:00
Rasmus Rygaard	7b994100b3	Add session config loader interface (#18208 ) ## Why Cloud-hosted sessions need a way for the service that starts or manages a thread to provide session-owned config without treating all config as if it came from the same user/project/workspace TOML stack. The important boundary is ownership: some values should be controlled by the session/orchestrator, some by the authenticated user, and later some may come from the executor. The earlier broad config-store shape made that boundary too fuzzy and overlapped heavily with the existing filesystem-backed config loader. This PR starts with the smaller piece we need now: a typed session config loader that can feed the existing config layer stack while preserving the normal precedence and merge behavior. ## What Changed - Added `ThreadConfigLoader` and related typed payloads in `codex-config`. - `SessionThreadConfig` currently supports `model_provider`, `model_providers`, and feature flags. - `UserThreadConfig` is present as an ownership boundary, but does not yet add TOML-backed fields. - `NoopThreadConfigLoader` preserves existing behavior when no external loader is configured. - `StaticThreadConfigLoader` supports tests and simple callers. - Taught thread config sources to produce ordinary `ConfigLayerEntry` values so the existing `ConfigLayerStack` remains the place where precedence and merging happen. - Wired the loader through `ConfigBuilder`, the config loader, and app-server startup paths so app-server can provide session-owned config before deriving a thread config. - Added coverage for: - translating typed thread config into config layers, - inserting thread config layers into the stack at the right precedence, - applying session-provided model provider and feature settings when app-server derives config from thread params. ## Follow-Ups This intentionally stops short of adding the remote/service transport. The next pieces are expected to be: 1. Define the proto/API shape for this interface. 2. Add a client implementation that can source session config from the service side. ## Verification - Added unit coverage in `codex-config` for the loader and layer conversion. - Added `codex-core` config loader coverage for thread config layer precedence. - Added app-server coverage that verifies session thread config wins over request-provided config for model provider and feature settings.	2026-04-20 23:05:49 +00:00
alexsong-oai	20b4b80426	Sync local plugin imports, async remote imports, refresh caches after… (#18246 ) … import ## Why `externalAgentConfig/import` used to spawn plugin imports in the background and return immediately. That meant local marketplace imports could still be in flight when the caller refreshed plugin state, so newly imported plugins would not show up right away. This change makes local marketplace imports complete before the RPC returns, while keeping remote marketplace imports asynchronous so we do not block on remote fetches. ## What changed - split plugin migration details into local and remote marketplace imports based on the external config source - import local marketplaces synchronously during `externalAgentConfig/import` - return pending remote plugin imports to the app-server so it can finish them in the background - clear the plugin and skills caches before responding to plugin imports, and again after background remote imports complete, so the next `plugin/list` reloads fresh state - keep marketplace source parsing encapsulated behind `is_local_marketplace_source(...)` instead of re-exporting the internal enum - add core and app-server coverage for the synchronous local import path and the pending remote import path ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core` (currently fails an existing unrelated test: `config_loader::tests::cli_override_can_update_project_local_mcp_server_when_project_is_trusted`) - `cargo test` (currently fails existing `codex-app-server` integration tests in MCP/skills/thread-start areas, plus the unrelated `codex-core` failure above)	2026-04-17 09:34:55 +00:00
David de Regt	6adba99f4d	Stabilize Bazel tests (timeout tweaks and flake fixes) (#17791 )	2026-04-16 07:57:51 -07:00
Eric Traut	1fd9c33207	[codex] Fix app-server initialized request analytics build (#17830 ) Problem: PR #17372 moved initialized request handling into `dispatch_initialized_client_request`, leaving analytics code that uses `connection_id` without a local binding and breaking `codex-app-server` builds. Solution: Restore the `connection_id` binding from `connection_request_id` before initialized request validation and analytics tracking.	2026-04-14 13:11:04 -07:00
Ruslan Nigmatullin	23d4098c0f	app-server: prepare to run initialized rpcs concurrently (#17372 ) ## Summary - Refactors `MessageProcessor` and per-connection session state so initialized service RPC handling can be moved into spawned tasks in a follow-up PR. - Shares the processor and initialized session data with `Arc`/`OnceLock` instead of mutable borrowed connection state. - Keeps initialized request handling synchronous in this PR; it does not call `tokio::spawn` for service RPCs yet. ## Testing - `just fmt` - `cargo test -p codex-app-server` (fails on existing hardening gaps covered by #17375, #17376, and #17377; the pipelined config regression passed before the unrelated failures) - `just fix -p codex-app-server`	2026-04-14 11:24:34 -07:00

1 2 3

145 Commits