codex

mirror of https://github.com/openai/codex.git synced 2026-05-18 10:12:59 +00:00

Author	SHA1	Message	Date
pakrym-oai	2070d5bfd3	[codex] Add response.processed websocket request (#21284 ) ## Summary - Add a `response.processed` websocket request payload and sender for Responses API websockets. - Send `response.processed` from `try_run_sampling_request` after a response completes, local turn processing succeeds, and the session-owned feature flag is enabled. - Add websocket coverage for both enabled and disabled feature-flag behavior. ## Validation - `just fmt` - `cargo test -p codex-core response_processed` - `cargo test -p codex-api responses_websocket` - `cargo test -p codex-features responses_websocket_response_processed_is_under_development` - `git diff --check` - `just fix -p codex-api -p codex-core -p codex-features` - `git diff --check origin/main...HEAD`	2026-05-06 09:58:46 -07:00
Ahmed Ibrahim	5d6f23a27b	Propagate cache key and service tiers in compact (#21249 ) ## Why `/responses/compact` should preserve the request-affinity fields that apply to the active auth mode. ChatGPT-auth compact requests need the effective `service_tier`, and compact requests for every auth mode need the stable `prompt_cache_key`, so compaction does not quietly lose routing or cache behavior that normal sampling already has. This follows the request-parity direction from #20719, but keeps the net change focused on the compact payload fields needed here. ## What changed - Add `service_tier` and `prompt_cache_key` to the compact endpoint input payload. - Build the remote compact payload from the existing responses request builder output so `Fast` still maps to `priority` when compact sends a service tier. - Pass the turn service tier into remote compaction, but only include it in compact payloads for ChatGPT-backed auth. - Keep `prompt_cache_key` on compact payloads for all auth modes. - Add request-body diff snapshot coverage in `core/tests/suite/compact_remote.rs` for: - API-key auth reusing `prompt_cache_key` while omitting `service_tier` even when `Fast` is configured. - ChatGPT auth reusing both `service_tier` and `prompt_cache_key`. - Drive the snapshot coverage through five varied turns: plain text, multi-part text, tool-call continuation, image+text input, local-shell continuation, and final-turn reasoning output. ## Verification - Added insta snapshots for compact request-body parity against the last normal `/responses` request after five varied turns. - Not run locally per repo guidance; relying on GitHub CI for test execution. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-06 13:38:43 +03:00
cassirer-openai	89698ad1c3	[rollout-trace] Include x-request-id in rollout trace. (#20066 ) ## Why Rollout traces need an identifier that can be used to correlate a Codex inference with upstream Responses API, proxy, and engine logs. The reduced trace model already exposed `upstream_request_id`, but it was being populated from the Responses API `response.id`. That value is useful for `previous_response_id` chaining, but it is not the transport request id that upstream systems key on. This PR separates those concepts so trace consumers can reliably answer both questions: - which Responses API response did this inference produce? - which upstream request handled it? ## Structure The change keeps the upstream request id at the same lifecycle level as the provider stream: - `codex-api` captures the `x-request-id` HTTP response header when the SSE stream is created and exposes it on `ResponseStream`. Fixture and websocket streams set the field to `None` because they do not have that HTTP response header. - `codex-core` carries that stream-level id into `InferenceTraceAttempt` when recording terminal stream outcomes. Completed, failed, cancelled, dropped-stream, and pre-response error paths all record the id when it is available. - `rollout-trace` now records both identifiers in raw terminal inference events and response payloads: `response_id` for the Responses API `response.id`, and `upstream_request_id` for `x-request-id`. - The reducer stores both fields on `InferenceCall`. It also uses `response_id` for `previous_response_id` conversation linking, which removes the old accidental dependency on the misnamed `upstream_request_id` field. - Terminal inference reduction now consumes the full terminal payload (`InferenceCompleted`, `InferenceFailed`, or `InferenceCancelled`) in one place. That keeps status, partial payloads, response ids, and upstream request ids consistent across success, failure, cancellation, and late stream-mapper events. ## Why This Shape `x-request-id` is a property of the HTTP/provider response envelope, not an SSE event. Capturing it once in `codex-api` and plumbing it through terminal trace recording avoids trying to infer the value from stream contents, and it preserves the id even when the stream fails or is cancelled after only partial output. Keeping `response_id` separate from `upstream_request_id` also makes the reduced trace model less surprising: `response_id` remains the conversation-continuation id, while `upstream_request_id` is the operational correlation id for upstream debugging. ## Validation The PR updates trace and reducer coverage for: - reading `x-request-id` from SSE response headers; - storing the true upstream request id on completed inference calls; - preserving upstream request ids for cancelled and late-cancelled inference streams; - keeping `previous_response_id` reconstruction tied to `response_id` rather than transport request ids.	2026-04-28 21:11:17 +00:00
Andrey Mishchenko	355c40ad7e	Support end_turn in response.completed (#19610 ) Some Responses API providers forward a model-defined `end_turn` boolean that explicitly indicates whether the model wants to end the turn or be sampled again. This PR updates the sampling loop to honor that field when it is set. If the provider does not set the field, we fall back to the existing sampling logic.	2026-04-25 21:57:42 -07:00
Eric Traut	bbff4ee61a	Add safety check notification and error handling (#19055 ) Adds a new app-server notification that fires when a user account has been flagged for potential safety reasons.	2026-04-22 22:24:12 -07:00
maja-openai	ef00014a46	Allow guardian bare allow output (#18797 ) ## Summary Allow guardian to skip other fields and output only `{"outcome":"allow"}` when the command is low risk. This change lets guardian reviews use a non-strict text format while keeping the JSON schema itself as plain user-visible schema data, so transport strictness is carried out-of-band instead of through a schema marker key. ## What changed - Add an explicit `output_schema_strict` flag to model prompts and pass it into `codex-api` text formatting. - Set guardian reviewer prompts to non-strict schema validation while preserving strict-by-default behavior for normal callers. - Update the guardian output contract so definitely-low-risk decisions may return only `{"outcome":"allow"}`. - Treat bare allow responses as low-risk approvals in the guardian parser. - Add tests and snapshots covering the non-strict guardian request and optional guardian output fields. ## Verification - `cargo test -p codex-core guardian::tests::guardian` - `cargo test -p codex-core guardian::tests::` - `cargo test -p codex-core client_common::tests::` - `cargo test -p codex-protocol user_input_serialization_includes_final_output_json_schema` - `cargo test -p codex-api` - `git diff --check` Note: `cargo test -p codex-core` was also attempted, but this desktop environment injects ambient config/proxy state that causes unrelated config/session tests expecting pristine defaults to fail. --------- Co-authored-by: Dylan Hurd <dylan.hurd@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-21 15:37:12 -07:00
Akshay Nathan	7995c66032	Stream apply_patch changes (#17862 ) Adds new events for streaming apply_patch changes from the Responses API, so that clients can show progress during file writes. Caveat: this does not work with apply_patch in function call mode, since that would require adding streaming JSON parsing.	2026-04-16 18:12:19 -07:00
Ahmed Ibrahim	ecca34209d	Omit empty app-server instruction overrides (#17258 ) ## Summary - omit serialized Responses instructions when an app-server base instruction override is empty - skip empty developer instruction messages and add v2 coverage for the empty-override request shape ## Validation - just fmt - git diff --check	2026-04-09 15:29:35 -07:00
Dylan Hurd	6c36e7d688	fix(app-server) revert null instructions changes (#17047 )	2026-04-07 15:18:34 -07:00
Owen Lin	5d1671ca70	feat(analytics): generate an installation_id and pass it in responsesapi client_metadata (#16912 ) ## Summary This adds a stable Codex installation ID and includes it on Responses API requests via `x-codex-installation-id` passed in via the `client_metadata` field for analytics/debugging. The main pieces are: - persist a UUID in `$CODEX_HOME/installation_id` - thread the installation ID into `ModelClient` - send it in `client_metadata` on Responses requests so it works consistently across HTTP and WebSocket transports	2026-04-07 09:52:17 -07:00
Ahmed Ibrahim	24c598e8a9	Honor null thread instructions (#16964 ) - Treat explicit null thread instructions as a blank-slate override while preserving omitted-field fallback behavior. - Preserve null through rollout resume/fork and keep explicit empty strings distinct. - Add app-server v2 start/fork coverage for the tri-state instruction params.	2026-04-07 04:10:19 +00:00
Owen Lin	20f2a216df	feat(core, tracing): create turn spans over websockets (#14632 ) ## Description Dependent on: - [responsesapi] https://github.com/openai/openai/pull/760991 - [codex-backend] https://github.com/openai/openai/pull/760985 `codex app-server -> codex-backend -> responsesapi` now reuses a persistent websocket connection across many turns. This PR updates tracing when using websockets so that each `response.create` websocket request propagates the current tracing context, so we can get a holistic end-to-end trace for each turn. Tracing is propagated via special keys (`ws_request_header_traceparent`, `ws_request_header_tracestate`) set in the `client_metadata` param in the Responses API. Currently tracing on websockets is a bit broken because we only set tracing context at ws connection time, so it's detached from a `turn/start` request.	2026-03-19 03:41:06 +00:00
Rasmus Rygaard	53d5972226	Reapply "Pass more params to compaction" (#14298 ) (#14521 ) This reverts commit `8af97ce4b0`. Confirmed that this runs locally without the previous issues with tool use	2026-03-12 23:27:21 +00:00
Rasmus Rygaard	7f22329389	Revert "Pass more params to compaction" (#14298 )	2026-03-11 12:33:10 -07:00
Rasmus Rygaard	2621ba17e3	Pass more params to compaction (#14247 ) Pass more params to /compact. This should give us parity with the /responses endpoint to improve caching. I'm torn about the MCP await. Blocking will give us parity but it seems like we explicitly don't block on MCPs. Happy either way	2026-03-11 12:33:09 -07:00
pakrym-oai	69df12efb3	Remove Responses V1 websocket implementation (#13364 ) V2 is the way to go!	2026-03-03 11:32:53 -07:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
pakrym-oai	97d0068658	Send warmup request (#11258 ) Send a request with `generate: false` but a full set of tools and instructions to pre-warm inference. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-24 08:15:47 -08:00
pash-openai	429cc4860e	ws turn metadata via client_metadata (#11953 )	2026-02-19 12:28:15 -08:00
Fouad Matin	02e9006547	add(core): safety check downgrade warning (#11964 ) Add per-turn notice when a request is downgraded to a fallback model due to cyber safety checks. Changes - codex-api: Emit a ServerModel event based on the openai-model response header and/or response payload (SSE + WebSocket), including when the model changes mid-stream. - core: When the server-reported model differs from the requested model, emit a single per-turn warning explaining the reroute to gpt-5.2 and directing users to Trusted Access verification and the cyber safety explainer. - app-server (v2): Surface these cyber model-routing warnings as synthetic userMessage items with text prefixed by Warning: (and document this behavior).	2026-02-16 22:13:36 -08:00
pakrym-oai	eac5473114	Do not attempt to append after response.completed (#11402 ) Completed responses are fully done, and a new response must be created.	2026-02-11 07:45:17 -08:00
pakrym-oai	0639c33892	Compare full request for websockets incrementality (#11343 ) Tools can dynamically change mid-turn now. We need to be more thorough about reusing incremental connections.	2026-02-10 19:14:36 +00:00
pakrym-oai	3322b99900	Remove ApiPrompt (#11265 ) Keep things simple and build a full Responses API request right in the model client	2026-02-10 16:12:31 +00:00
jif-oai	6049ff02a0	memories: add extraction and prompt module foundation (#11200 ) ## Summary - add the new `core/src/memories` module (phase-one parsing, rollout filtering, storage, selection, prompts) - add Askama-backed memory templates for stage-one input/system and consolidation prompts - add module tests for parsing, filtering, path bucketing, and summary maintenance ## Testing - just fmt - cargo test -p codex-core --lib memories::	2026-02-10 10:10:24 +00:00
Brian Yu	1fbf5ed06f	Support alternative websocket API (#10861 ) Test plan ``` cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \ ./target/debug/codex \ --enable responses_websockets_v2 \ --profile byok \ --full-auto ```	2026-02-06 14:40:50 -08:00
jif-oai	e9335374b9	feat: add phase 1 mem client (#10629 ) Adding a client on top of https://github.com/openai/openai/pull/672176	2026-02-04 17:59:36 +00:00
jif-oai	d2394a2494	chore: nuke chat/completions API (#10157 )	2026-02-03 11:31:57 +00:00
Ahmed Ibrahim	b11e96fb04	Act on reasoning-included per turn (#9402 ) - Reset reasoning-included flag each turn and update compaction test	2026-01-19 11:23:25 -08:00
pakrym-oai	e726a82c8a	Websocket append support (#9128 ) Support an incremental append request in websocket transport.	2026-01-13 06:07:13 +00:00
Ahmed Ibrahim	66b7c673e9	Refresh on models etag mismatch (#8491 ) - Send models etag - Refresh models on 412 - This wires `ModelsManager` to `ModelFamily` so we don't mutate it mid-turn	2026-01-01 11:41:16 -08:00
Ahmed Ibrahim	71504325d3	Migrate model preset (#7542 ) - Introduce `openai_models` in `/core` - Move `PRESETS` under it - Move `ModelPreset`, `ModelUpgrade`, and `ReasoningEffortPreset` to `protocol` - Introduce `Op::ListModels` and `EventMsg::AvailableModels` Next steps: - migrate `app-server` and `tui` to use the introduced Operation	2025-12-03 20:30:43 +00:00
jif-oai	4502b1b263	chore: proper client extraction (#6996 )	2025-11-25 18:06:12 +00:00

32 Commits
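
A rough sketch of the compact-payload rule described in the #21249 and #13212 entries above: `Fast` maps to `service_tier=priority`, the tier is only sent on compact requests for ChatGPT-backed auth, and `prompt_cache_key` is kept for every auth mode. All type and function names here (`AuthMode`, `ServiceTier`, `CompactFields`, `compact_fields`) are hypothetical and chosen for illustration; they are not taken from the Codex codebase, and the real payload builder likely differs.

```rust
// Hypothetical types for illustration only; not the names used in Codex.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AuthMode {
    ApiKey,
    ChatGpt,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum ServiceTier {
    Fast,
}

// The two request-affinity fields the #21249 entry adds to the compact payload.
#[derive(Debug, PartialEq)]
struct CompactFields {
    service_tier: Option<&'static str>,
    prompt_cache_key: String,
}

// Per the #13212 entry, Fast mode is sent on the wire as `priority`.
fn wire_service_tier(tier: ServiceTier) -> &'static str {
    match tier {
        ServiceTier::Fast => "priority",
    }
}

// Assumed shape of the rule: the tier is included only for ChatGPT-backed
// auth, while the cache key is always included.
fn compact_fields(
    auth: AuthMode,
    tier: Option<ServiceTier>,
    prompt_cache_key: &str,
) -> CompactFields {
    let service_tier = match auth {
        AuthMode::ChatGpt => tier.map(wire_service_tier),
        AuthMode::ApiKey => None,
    };
    CompactFields {
        service_tier,
        prompt_cache_key: prompt_cache_key.to_string(),
    }
}

fn main() {
    // API-key auth: the tier is omitted even though Fast is configured.
    let api_key = compact_fields(AuthMode::ApiKey, Some(ServiceTier::Fast), "thread-123");
    // ChatGPT auth: both fields are carried over.
    let chatgpt = compact_fields(AuthMode::ChatGpt, Some(ServiceTier::Fast), "thread-123");
    println!("{api_key:?}");
    println!("{chatgpt:?}");
}
```

The two calls in `main` mirror the two snapshot cases listed in the #21249 entry: API-key auth reusing `prompt_cache_key` while omitting `service_tier`, and ChatGPT auth reusing both.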