codex

mirror of https://github.com/openai/codex.git synced 2026-04-28 08:34:54 +00:00

Author	SHA1	Message	Date
neil-oai	a92a5085bd	Forward app-server turn clientMetadata to Responses (#16009 ) ## Summary App-server v2 already receives turn-scoped `clientMetadata`, but the Rust app-server was dropping it before the outbound Responses request. This change keeps the fix lightweight by threading that metadata through the existing turn-metadata path rather than inventing a new transport. ## What we're trying to do and why We want turn-scoped metadata from the app-server protocol layer, especially fields like Hermes/GAAS run IDs, to survive all the way to the actual Responses API request so it is visible in downstream websocket request logging and analytics. The specific bug was: - app-server protocol uses camelCase `clientMetadata` - Responses transport already has an existing turn metadata carrier: `x-codex-turn-metadata` - websocket transport already rewrites that header into `request.request_body.client_metadata["x-codex-turn-metadata"]` - but the Rust app-server never parsed or stored `clientMetadata`, so nothing from the app-server request was making it into that existing path This PR fixes that without adding a new header or a second metadata channel. ## How we did it ### Protocol surface - Add optional `clientMetadata` to v2 `TurnStartParams` and `TurnSteerParams` - Regenerate the JSON schema / TypeScript fixtures - Update app-server docs to describe the field and its behavior ### Runtime plumbing - Add a dedicated core op for app-server user input carrying turn-scoped metadata: `Op::UserInputWithClientMetadata` - Wire `turn/start` and `turn/steer` through that op / signature path instead of dropping the metadata at the message-processor boundary - Store the metadata in `TurnMetadataState` ### Transport behavior - Reuse the existing serialized `x-codex-turn-metadata` payload - Merge the new app-server `clientMetadata` into that JSON additively - Do not replace built-in reserved fields already present in the turn metadata payload - Keep websocket behavior unchanged at the outer shape level: it still sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON string now contains the merged fields - Keep HTTP fallback behavior unchanged except that the existing `x-codex-turn-metadata` header now includes the merged fields too ### Request shape before / after Before, a websocket `response.create` looked like: ```json { "type": "response.create", "client_metadata": { "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}" } } ``` Even if the app-server caller supplied `clientMetadata`, it was not represented there. After, the same request shape is preserved, but the serialized payload now includes the new turn-scoped fields: ```json { "type": "response.create", "client_metadata": { "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}" } } ``` ## Validation ### Targeted tests added / updated - protocol round-trip coverage for `clientMetadata` on `turn/start` and `turn/steer` - protocol round-trip coverage for `Op::UserInputWithClientMetadata` - `TurnMetadataState` merge test proving client metadata is added without overwriting reserved built-in fields - websocket request-shape test proving outbound `response.create` contains merged metadata inside `client_metadata["x-codex-turn-metadata"]` - app-server integration tests proving: - `turn/start` forwards `clientMetadata` into the outbound Responses request path - websocket warmup + real turn request both behave correctly - `turn/steer` updates the follow-up request metadata ### Commands run - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `cargo test -p codex-core turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields --lib` - `cargo test -p codex-core --test all responses_websocket_preserves_custom_turn_metadata_fields` - `cargo test -p codex-app-server --test all client_metadata` - `cargo test -p codex-app-server --test all turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2 -- --nocapture` - `just fmt` - `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol -p codex-app-server` - `just fix -p codex-exec -p codex-tui-app-server` - `just argument-comment-lint` ### Full suite note `cargo test` in `codex-rs` still fails in: - `suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request` I verified that same failure on a clean detached `HEAD` worktree with an isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.	2026-04-09 11:52:37 -07:00
Charley Cunningham	bc24017d64	Add Smart Approvals guardian review across core, app-server, and TUI (#13860 ) ## Summary - add `approvals_reviewer = "user" \| "guardian_subagent"` as the runtime control for who reviews approval requests - route Smart Approvals guardian review through core for command execution, file changes, managed-network approvals, MCP approvals, and delegated/subagent approval flows - expose guardian review in app-server with temporary unstable `item/autoApprovalReview/{started,completed}` notifications carrying `targetItemId`, `review`, and `action` - update the TUI so Smart Approvals can be enabled from `/experimental`, aligned with the matching `/approvals` mode, and surfaced clearly while reviews are pending or resolved ## Runtime model This PR does not introduce a new `approval_policy`. Instead: - `approval_policy` still controls when approval is needed - `approvals_reviewer` controls who reviewable approval requests are routed to: - `user` - `guardian_subagent` `guardian_subagent` is a carefully prompted reviewer subagent that gathers relevant context and applies a risk-based decision framework before approving or denying the request. The `smart_approvals` feature flag is a rollout/UI gate. Core runtime behavior keys off `approvals_reviewer`. When Smart Approvals is enabled from the TUI, it also switches the current `/approvals` settings to the matching Smart Approvals mode so users immediately see guardian review in the active thread: - `approval_policy = on-request` - `approvals_reviewer = guardian_subagent` - `sandbox_mode = workspace-write` Users can still change `/approvals` afterward. Config-load behavior stays intentionally narrow: - plain `smart_approvals = true` in `config.toml` remains just the rollout/UI gate and does not auto-set `approvals_reviewer` - the deprecated `guardian_approval = true` alias migration does backfill `approvals_reviewer = "guardian_subagent"` in the same scope when that reviewer is not already configured there, so old configs preserve their original guardian-enabled behavior ARC remains a separate safety check. For MCP tool approvals, ARC escalations now flow into the configured reviewer instead of always bypassing guardian and forcing manual review. ## Config stability The runtime reviewer override is stable, but the config-backed app-server protocol shape is still settling. - `thread/start`, `thread/resume`, and `turn/start` keep stable `approvalsReviewer` overrides - the config-backed `approvals_reviewer` exposure returned via `config/read` (including profile-level config) is now marked `[UNSTABLE]` / experimental in the app-server protocol until we are more confident in that config surface ## App-server surface This PR intentionally keeps the guardian app-server shape narrow and temporary. It adds generic unstable lifecycle notifications: - `item/autoApprovalReview/started` - `item/autoApprovalReview/completed` with payloads of the form: - `{ threadId, turnId, targetItemId, review, action? }` `review` is currently: - `{ status, riskScore?, riskLevel?, rationale? }` - where `status` is one of `inProgress`, `approved`, `denied`, or `aborted` `action` carries the guardian action summary payload from core when available. This lets clients render temporary standalone pending-review UI, including parallel reviews, even when the underlying tool item has not been emitted yet. These notifications are explicitly documented as `[UNSTABLE]` and expected to change soon. This PR does not persist guardian review state onto `thread/read` tool items. The intended follow-up is to attach guardian review state to the reviewed tool item lifecycle instead, which would improve consistency with manual approvals and allow thread history / reconnect flows to replay guardian review state directly. ## TUI behavior - `/experimental` exposes the rollout gate as `Smart Approvals` - enabling it in the TUI enables the feature and switches the current session to the matching Smart Approvals `/approvals` mode - disabling it in the TUI clears the persisted `approvals_reviewer` override when appropriate and returns the session to default manual review when the effective reviewer changes - `/approvals` still exposes the reviewer choice directly - the TUI renders: - pending guardian review state in the live status footer, including parallel review aggregation - resolved approval/denial state in history ## Scope notes This PR includes the supporting core/runtime work needed to make Smart Approvals usable end-to-end: - shell / unified-exec / apply_patch / managed-network / MCP guardian review - delegated/subagent approval routing into guardian review - guardian review risk metadata and action summaries for app-server/TUI - config/profile/TUI handling for `smart_approvals`, `guardian_approval` alias migration, and `approvals_reviewer` - a small internal cleanup of delegated approval forwarding to dedupe fallback paths and simplify guardian-vs-parent approval waiting (no intended behavior change) Out of scope for this PR: - redesigning the existing manual approval protocol shapes - persisting guardian review state onto app-server `ThreadItem`s - delegated MCP elicitation auto-review (the current delegated MCP guardian shim only covers the legacy `RequestUserInput` path) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 15:27:00 -07:00
Ahmed Ibrahim	fefd01b9e0	Stabilize resumed rollout messages (#14060 ) ## What changed - add a bounded `resume_until_initial_messages` helper in `core/tests/suite/resume.rs` - retry the resume call until `initial_messages` contains the fully persisted final turn shape before asserting ## Why this fixes flakiness The old test resumed once immediately after `TurnComplete` and sometimes read rollout state before the final turn had been persisted. That made the assertion race persistence timing instead of checking the resumed message shape. The new helper polls for up to two seconds in 10ms steps and only asserts once the expected message sequence is actually present, so the test waits for the real readiness condition instead of depending on a lucky timing window. ## Scope - test-only - no production logic change	2026-03-09 11:48:13 -07:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
Michael Bolin	1af2a37ada	chore: remove codex-core public protocol/shell re-exports (#12432 ) ## Why `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules from `codex-protocol` and `codex-shell-command`. That made it easy for workspace crates to import those APIs through `codex-core`, which in turn hides dependency edges and makes it harder to reduce compile-time coupling over time. This change removes those public re-exports so call sites must import from the source crates directly. Even when a crate still depends on `codex-core` today, this makes dependency boundaries explicit and unblocks future work to drop `codex-core` dependencies where possible. ## What Changed - Removed public re-exports from `codex-rs/core/src/lib.rs` for: - `codex_protocol::protocol` and related protocol/model types (including `InitialHistory`) - `codex_protocol::config_types` (`protocol_config_types`) - `codex_shell_command::{bash, is_dangerous_command, is_safe_command, parse_command, powershell}` - Migrated workspace Rust call sites to import directly from: - `codex_protocol::protocol` - `codex_protocol::config_types` - `codex_protocol::models` - `codex_shell_command` - Added explicit `Cargo.toml` dependencies (`codex-protocol` / `codex-shell-command`) in crates that now import those crates directly. - Kept `codex-core` internal modules compiling by using `pub(crate)` aliases in `core/src/lib.rs` (internal-only, not part of the public API). - Updated the two utility crates that can already drop a `codex-core` dependency edge entirely: - `codex-utils-approval-presets` - `codex-utils-cli` ## Verification - `cargo test -p codex-utils-approval-presets` - `cargo test -p codex-utils-cli` - `cargo check --workspace --all-targets` - `just clippy`	2026-02-20 23:45:35 -08:00
Celia Chen	641d5268fa	chore: persist turn_id in rollout session and make turn_id uuid based (#11246 ) Problem: 1. turn id is constructed in-memory; 2. on resuming threads, turn_id might not be unique; 3. client cannot no the boundary of a turn from rollout files easily. This PR does three things: 1. persist `task_started` and `task_complete` events; 1. persist `turn_id` in rollout turn events; 5. generate turn_id as unique uuids instead of incrementing it in memory. This helps us resolve the issue of clients wanting to have unique turn ids for resuming a thread, and knowing the boundry of each turn in rollout files. example debug logs ``` 2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None } 2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None } 2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None } ``` backward compatibility: if you try to resume an old session without task_started and task_complete event populated, the following happens: - If you resume and do nothing: those reconstructed historical IDs can differ next time you resume. - If you resume and send a new turn: the new turn gets a fresh UUID from live submission flow and is persisted, so that new turn’s ID is stable on later resumes. I think this behavior is fine, because we only care about deterministic turn id once a turn is triggered.	2026-02-11 03:56:01 +00:00
Dylan Hurd	fe8b474acd	fix(core,app-server) resume with different model (#10719 ) ## Summary When resuming with a different model, we should also append a developer message with the model instructions ## Testing - [x] Added unit tests	2026-02-05 00:40:05 -08:00
jif-oai	83775f4df1	feat: ephemeral threads (#9765 ) Add ephemeral threads capabilities. Only exposed through the `app-server` v2 The idea is to disable the rollout recorder for those threads.	2026-01-24 14:57:40 +00:00
charley-oai	be9e55c5fc	Add total (non-partial) TextElement placeholder accessors (#9545 ) ## Summary - Make `TextElement` placeholders private and add a text-backed accessor to avoid assuming `Some`. - Since they are optional in the protocol, we want to make sure any accessors properly handle the None case (getting the placeholder using the byte range in the text) - Preserve placeholders during protocol/app-server conversions using the accessor fallback. - Update TUI composer/remap logic and tests to use the new constructor/accessor.	2026-01-20 14:04:11 -08:00
Dylan Hurd	675f165c56	fix(core) Preserve base_instructions in SessionMeta (#9427 ) ## Summary This PR consolidates base_instructions onto SessionMeta / SessionConfiguration, so we ensure `base_instructions` is set once per session and should be (mostly) immutable, unless: - overridden by config on resume / fork - sub-agent tasks, like review or collab In a future PR, we should convert all references to `base_instructions` to consistently used the typed struct, so it's less likely that we put other strings there. See #9423. However, this PR is already quite complex, so I'm deferring that to a follow-up. ## Testing - [x] Added a resume test to assert that instructions are preserved. In particular, `resume_switches_models_preserves_base_instructions` fails against main. Existing test coverage thats assert base instructions are preserved across multiple requests in a session: - Manual compact keeps baseline instructions: core/tests/suite/compact.rs:199 - Auto-compact keeps baseline instructions: core/tests/suite/compact.rs:1142 - Prompt caching reuses the same instructions across two requests: core/tests/suite/prompt_caching.rs:150 and core/tests/suite/prompt_caching.rs:157 - Prompt caching with explicit expected string across two requests: core/tests/suite/prompt_caching.rs:213 and core/tests/suite/prompt_caching.rs:222 - Resume with model switch keeps original instructions: core/tests/suite/resume.rs:136 - Compact/resume/fork uses request 0 instructions for later expected payloads: core/tests/suite/compact_resume_fork.rs:215	2026-01-19 21:59:36 -08:00
charley-oai	1fa8350ae7	Add text element metadata to protocol, app server, and core (#9331 ) The second part of breaking up PR https://github.com/openai/codex/pull/9116 Summary: - Add `TextElement` / `ByteRange` to protocol user inputs and user message events with defaults. - Thread `text_elements` through app-server v1/v2 request handling and history rebuild. - Preserve UI metadata only in user input/events (not `ContentItem`) while keeping local image attachments in user events for rehydration. Details: - Protocol: `UserInput::Text` carries `text_elements`; `UserMessageEvent` carries `text_elements` + `local_images`. Serialization includes empty vectors for backward compatibility. - app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in camelCase with conversions; v2 uses its own camelCase wrapper. - app-server: v1/v2 input mapping includes `text_elements`; thread history rebuilds include them. - Core: user event emission preserves UI metadata while model history stays clean; history replay round-trips the metadata.	2026-01-15 17:26:41 -08:00
charley-oai	4a9c2bcc5a	Add text element metadata to types (#9235 ) Initial type tweaking PR to make the diff of https://github.com/openai/codex/pull/9116 smaller This should not change any behavior, just adds some fields to types	2026-01-14 16:41:50 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
Anton Panasenko	807f8a43c2	feat: expose outputSchema to user_turn/turn_start app_server API (#8377 ) What changed - Added `outputSchema` support to the app-server APIs, mirroring `codex exec --output-schema` behavior. - V1 `sendUserTurn` now accepts `outputSchema` and constrains the final assistant message for that turn. - V2 `turn/start` now accepts `outputSchema` and constrains the final assistant message for that turn (explicitly per-turn only). Core behavior - `Op::UserTurn` already supported `final_output_json_schema`; now V1 `sendUserTurn` forwards `outputSchema` into that field. - `Op::UserInput` now carries `final_output_json_schema` for per-turn settings updates; core maps it into `SessionSettingsUpdate.final_output_json_schema` so it applies to the created turn context. - V2 `turn/start` does NOT persist the schema via `OverrideTurnContext` (it’s applied only for the current turn). Other overrides (cwd/model/etc) keep their existing persistent behavior. API / docs - `codex-rs/app-server-protocol/src/protocol/v1.rs`: add `output_schema: Option<serde_json::Value>` to `SendUserTurnParams` (serialized as `outputSchema`). - `codex-rs/app-server-protocol/src/protocol/v2.rs`: add `output_schema: Option<JsonValue>` to `TurnStartParams` (serialized as `outputSchema`). - `codex-rs/app-server/README.md`: document `outputSchema` for `turn/start` and clarify it applies only to the current turn. - `codex-rs/docs/codex_mcp_interface.md`: document `outputSchema` for v1 `sendUserTurn` and v2 `turn/start`. Tests added/updated - New app-server integration tests asserting `outputSchema` is forwarded into outbound `/responses` requests as `text.format`: - `codex-rs/app-server/tests/suite/output_schema.rs` - `codex-rs/app-server/tests/suite/v2/output_schema.rs` - Added per-turn semantics tests (schema does not leak to the next turn): - `send_user_turn_output_schema_is_per_turn_v1` - `turn_start_output_schema_is_per_turn_v2` - Added protocol wire-compat tests for the merged op: - serialize omits `final_output_json_schema` when `None` - deserialize works when field is missing - serialize includes `final_output_json_schema` when `Some(schema)` Call site updates (high level) - Updated all `Op::UserInput { .. }` constructions to include `final_output_json_schema`: - `codex-rs/app-server/src/codex_message_processor.rs` - `codex-rs/core/src/codex_delegate.rs` - `codex-rs/mcp-server/src/codex_tool_runner.rs` - `codex-rs/tui/src/chatwidget.rs` - `codex-rs/tui2/src/chatwidget.rs` - plus impacted core tests. Validation - `just fmt` - `cargo test -p codex-core` - `cargo test -p codex-app-server` - `cargo test -p codex-mcp-server` - `cargo test -p codex-tui` - `cargo test -p codex-tui2` - `cargo test -p codex-protocol` - `cargo clippy --all-features --tests --profile dev --fix -- -D warnings`	2026-01-05 10:27:00 -08:00
pakrym-oai	6c384eb9c6	tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )	2025-11-13 18:04:05 -08:00
pakrym-oai	3c90728a29	Add new thread items and rewire event parsing to use them (#5418 ) 1. Adds AgentMessage, Reasoning, WebSearch items. 2. Switches the ResponseItem parsing to use new items and then also emit 3. Removes user-item kind and filters out "special" (environment) user items when returning to clients.	2025-10-22 10:14:50 -07:00
pakrym-oai	5cd8803998	Add a baseline test for resume initial messages (#5466 )	2025-10-21 11:45:01 -07:00

17 Commits