Mirror of https://github.com/openai/codex.git, synced 2026-05-01 18:06:47 +00:00.
## Why

Rollout traces need an identifier that can be used to correlate a Codex inference with upstream Responses API, proxy, and engine logs. The reduced trace model already exposed `upstream_request_id`, but it was being populated from the Responses API `response.id`. That value is useful for `previous_response_id` chaining, but it is not the transport request id that upstream systems key on. This PR separates those concepts so trace consumers can reliably answer both questions:

- which Responses API response did this inference produce?
- which upstream request handled it?

## Structure

The change keeps the upstream request id at the same lifecycle level as the provider stream:

- `codex-api` captures the `x-request-id` HTTP response header when the SSE stream is created and exposes it on `ResponseStream`. Fixture and websocket streams set the field to `None` because they do not have that HTTP response header.
- `codex-core` carries that stream-level id into `InferenceTraceAttempt` when recording terminal stream outcomes. Completed, failed, cancelled, dropped-stream, and pre-response error paths all record the id when it is available.
- `rollout-trace` now records both identifiers in raw terminal inference events and response payloads: `response_id` for the Responses API `response.id`, and `upstream_request_id` for `x-request-id`.
- The reducer stores both fields on `InferenceCall`. It also uses `response_id` for `previous_response_id` conversation linking, which removes the old accidental dependency on the misnamed `upstream_request_id` field.
- Terminal inference reduction now consumes the full terminal payload (`InferenceCompleted`, `InferenceFailed`, or `InferenceCancelled`) in one place. That keeps status, partial payloads, response ids, and upstream request ids consistent across success, failure, cancellation, and late stream-mapper events.

## Why This Shape

`x-request-id` is a property of the HTTP/provider response envelope, not an SSE event. Capturing it once in `codex-api` and plumbing it through terminal trace recording avoids trying to infer the value from stream contents, and it preserves the id even when the stream fails or is cancelled after only partial output. Keeping `response_id` separate from `upstream_request_id` also makes the reduced trace model less surprising: `response_id` remains the conversation-continuation id, while `upstream_request_id` is the operational correlation id for upstream debugging.

## Validation

The PR updates trace and reducer coverage for:

- reading `x-request-id` from SSE response headers;
- storing the true upstream request id on completed inference calls;
- preserving upstream request ids for cancelled and late-cancelled inference streams;
- keeping `previous_response_id` reconstruction tied to `response_id` rather than transport request ids.
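The separation described above can be sketched in miniature. Everything below (the `ResponseStreamIds` struct, the helper function, and the header representation) is an illustrative assumption, not the crate's actual code; it only shows the core idea of reading `x-request-id` once from the HTTP response headers and keeping it apart from `response.id`.

```rust
/// Hypothetical sketch: the two identifiers a trace consumer cares about,
/// kept as distinct fields rather than overloading one name.
#[derive(Debug, Clone, Default)]
struct ResponseStreamIds {
    /// Responses API `response.id`, used for `previous_response_id` chaining.
    response_id: Option<String>,
    /// `x-request-id` HTTP response header; the operational correlation id.
    upstream_request_id: Option<String>,
}

/// Read `x-request-id` from a (name, value) header list, case-insensitively.
/// Fixture and websocket streams would simply pass an empty list and get `None`.
fn capture_upstream_request_id(headers: &[(String, String)]) -> Option<String> {
    headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("x-request-id"))
        .map(|(_, value)| value.clone())
}
```

Capturing the id at stream-creation time, rather than deriving it from stream contents, is what lets it survive failure and cancellation paths unchanged.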
# codex-api
Typed clients for Codex/OpenAI APIs built on top of the generic transport in `codex-client`.

- Hosts the request/response models and request builders for Responses and Compact APIs.
- Owns provider configuration (base URLs, headers, query params), auth header injection, retry tuning, and stream idle settings.
- Parses SSE streams into `ResponseEvent`/`ResponseStream`, including rate-limit snapshots and API-specific error mapping.
- Serves as the wire-level layer consumed by `codex-core`; higher layers handle auth refresh and business logic.
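As a rough illustration of the SSE framing the parser deals with, here is a minimal, self-contained sketch. The function name and the simplified field handling are assumptions for this example; the crate's real parser additionally handles rate-limit snapshots, error mapping, and multi-block streams.

```rust
/// Split a single raw SSE event block into its `event:` name and joined
/// `data:` payload. Simplified: real SSE also defines `id:`, `retry:`,
/// comment lines, and exact leading-space rules.
fn parse_sse_block(block: &str) -> (Option<String>, String) {
    let mut event = None;
    let mut data = String::new();
    for line in block.lines() {
        if let Some(rest) = line.strip_prefix("event:") {
            event = Some(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("data:") {
            // Multiple data lines in one block are joined with newlines.
            if !data.is_empty() {
                data.push('\n');
            }
            data.push_str(rest.trim_start());
        }
    }
    (event, data)
}
```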
## Core interface
The public interface of this crate is intentionally small and uniform:
- Responses endpoint
  - Input:
    - `ResponsesApiRequest` for the request body (`model`, `instructions`, `input`, `tools`, `parallel_tool_calls`, reasoning/text controls).
    - `ResponsesOptions` for transport/header concerns (`conversation_id`, `session_source`, `extra_headers`, `compression`, `turn_state`).
  - Output: a `ResponseStream` of `ResponseEvent` (both re-exported from `common`).
- Compaction endpoint
  - Input: `CompactionInput<'a>` (re-exported as `codex_api::CompactionInput`):
    - `model: &str`.
    - `input: &[ResponseItem]` – history to compact.
    - `instructions: &str` – fully-resolved compaction instructions.
  - Output: `Vec<ResponseItem>`. `CompactClient::compact_input(&CompactionInput, extra_headers)` wraps the JSON encoding and retry/telemetry wiring.
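The compaction input shape above can be mirrored in a standalone sketch. `ResponseItem` below is a stand-in for the real protocol type, and the summary helper is invented purely for illustration; only the three field names and types come from the README.

```rust
/// Stand-in for the protocol's `ResponseItem`; the real type is richer.
#[derive(Debug, Clone, PartialEq)]
struct ResponseItem {
    text: String,
}

/// Borrowed request shape, mirroring the documented `CompactionInput<'a>`.
struct CompactionInput<'a> {
    model: &'a str,
    input: &'a [ResponseItem],
    instructions: &'a str,
}

/// Hypothetical helper: summarize a request for logging.
fn compaction_request_summary(req: &CompactionInput<'_>) -> String {
    format!(
        "model={} items={} instructions_len={}",
        req.model,
        req.input.len(),
        req.instructions.len()
    )
}
```

The borrowed (`&'a`) fields match the documented lifetime parameter: callers keep ownership of the history and instructions while the request is encoded.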
- Memory summarize endpoint
  - Input: `MemorySummarizeInput` (re-exported as `codex_api::MemorySummarizeInput`):
    - `model: String`.
    - `raw_memories: Vec<RawMemory>` (serialized as `traces` for wire compatibility). `RawMemory` includes `id`, `metadata.source_path`, and normalized `items`.
    - `reasoning: Option<Reasoning>`.
  - Output: `Vec<MemorySummarizeOutput>`. `MemoriesClient::summarize_input(&MemorySummarizeInput, extra_headers)` wraps JSON encoding and retry/telemetry wiring.
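The one wire-compatibility quirk worth seeing concretely is that `raw_memories` travels under a `traces` key. The manual JSON assembly below is purely illustrative (the crate presumably does this with serde field renaming), and every name except `raw_memories`, `traces`, and `id` is an assumption for the example.

```rust
/// Cut-down stand-in for `RawMemory`; only the documented `id` field is shown.
struct RawMemory {
    id: String,
}

/// Build a request body where the `raw_memories` field is emitted as `traces`,
/// matching the documented wire-compatibility rename.
fn memory_summarize_body(model: &str, raw_memories: &[RawMemory]) -> String {
    let traces: Vec<String> = raw_memories
        .iter()
        .map(|m| format!("{{\"id\":\"{}\"}}", m.id))
        .collect();
    format!(
        "{{\"model\":\"{}\",\"traces\":[{}]}}",
        model,
        traces.join(",")
    )
}
```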
All HTTP details (URLs, headers, retry/backoff policies, SSE framing) are encapsulated in `codex-api` and `codex-client`. Callers construct prompts/inputs using protocol types and work with typed streams of `ResponseEvent` or compacted `ResponseItem` values.