codex/codex-rs/codex-api
Ahmed Ibrahim 2eb396deb5 Propagate cache key and service tiers in compact (#21249)
## Why

`/responses/compact` should preserve the request-affinity fields that
apply to the active auth mode. ChatGPT-auth compact requests need the
effective `service_tier`, and compact requests for every auth mode need
the stable `prompt_cache_key`, so compaction does not quietly lose
routing or cache behavior that normal sampling already has.

This follows the request-parity direction from #20719, but keeps the net
change focused on the compact payload fields needed here.

## What changed

- Add `service_tier` and `prompt_cache_key` to the compact endpoint
input payload.
- Build the remote compact payload from the existing responses request
builder output so `Fast` still maps to `priority` when compact sends a
service tier.
- Pass the turn service tier into remote compaction, but only include it
in compact payloads for ChatGPT-backed auth.
- Keep `prompt_cache_key` on compact payloads for all auth modes.
- Add request-body diff snapshot coverage in
`core/tests/suite/compact_remote.rs` for:
  - API-key auth reusing `prompt_cache_key` while omitting `service_tier`
    even when `Fast` is configured.
  - ChatGPT auth reusing both `service_tier` and `prompt_cache_key`.
- Drive the snapshot coverage through five varied turns (plain text,
multi-part text, tool-call continuation, image+text input, and local-shell
continuation), with reasoning output on the final turn.
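
The gating rule described in the list above can be sketched as follows. This is a minimal illustration, not the code from the commit: `AuthMode`, `ServiceTier`, `CompactAffinity`, and `compact_affinity` are hypothetical names, and the `Flex` variant with its `flex` wire value is an assumption added only so the enum has more than one case. The one mapping taken from the text is `Fast` to `priority`.

```rust
// Hypothetical stand-ins; none of these names are taken from the crate.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AuthMode {
    ApiKey,
    ChatGpt,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum ServiceTier {
    Fast,
    Flex, // assumed variant, for illustration only
}

impl ServiceTier {
    // The commit message states that `Fast` maps to `priority` on the wire.
    // The `flex` value is an assumption.
    fn wire_value(self) -> &'static str {
        match self {
            ServiceTier::Fast => "priority",
            ServiceTier::Flex => "flex",
        }
    }
}

// The two request-affinity fields a compact payload may carry.
#[derive(Debug, PartialEq)]
struct CompactAffinity {
    prompt_cache_key: String,
    service_tier: Option<&'static str>,
}

// `prompt_cache_key` is kept for every auth mode; `service_tier` is only
// included for ChatGPT-backed auth, even if a tier is configured.
fn compact_affinity(
    auth: AuthMode,
    tier: Option<ServiceTier>,
    cache_key: &str,
) -> CompactAffinity {
    CompactAffinity {
        prompt_cache_key: cache_key.to_string(),
        service_tier: match auth {
            AuthMode::ChatGpt => tier.map(ServiceTier::wire_value),
            AuthMode::ApiKey => None,
        },
    }
}
```

Calling `compact_affinity(AuthMode::ApiKey, Some(ServiceTier::Fast), "conv-123")` yields a payload with the cache key and no service tier, which is the case the API-key snapshot is described as covering.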

## Verification

- Added insta snapshots for compact request-body parity against the last
normal `/responses` request after five varied turns.
- Not run locally per repo guidance; relying on GitHub CI for test
execution.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 12:23:18 -07:00

codex-api

Typed clients for Codex/OpenAI APIs built on top of the generic transport in codex-client.

  • Hosts the request/response models and request builders for Responses and Compact APIs.
  • Owns provider configuration (base URLs, headers, query params), auth header injection, retry tuning, and stream idle settings.
  • Parses SSE streams into ResponseEvent/ResponseStream, including rate-limit snapshots and API-specific error mapping.
  • Serves as the wire-level layer consumed by codex-core; higher layers handle auth refresh and business logic.
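
To make the SSE parsing responsibility concrete, here is a small, generic sketch of splitting a buffered Server-Sent Events body into frames. It is not the crate's parser: `SseFrame` and `parse_sse` are hypothetical names, the real code works on a live stream and maps frames to ResponseEvent, and the event names used in the usage note are sample values only.

```rust
// Hypothetical frame type; the crate's real ResponseEvent is richer.
#[derive(Debug, PartialEq)]
struct SseFrame {
    event: Option<String>,
    data: String,
}

// Split a fully buffered SSE body into frames. Frames are separated by a
// blank line; multiple `data:` lines in one frame are joined with '\n'.
fn parse_sse(body: &str) -> Vec<SseFrame> {
    let mut frames = Vec::new();
    let mut event: Option<String> = None;
    let mut data: Vec<String> = Vec::new();

    // The trailing "" flushes the last frame even without a final blank line.
    for line in body.lines().chain(std::iter::once("")) {
        if line.is_empty() {
            // End of frame: emit it only if it carried data, then reset.
            let ev = event.take();
            if !data.is_empty() {
                frames.push(SseFrame { event: ev, data: data.join("\n") });
                data.clear();
            }
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        // Split "field: value"; a single leading space in the value is dropped.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event = Some(value.to_string()),
            "data" => data.push(value.to_string()),
            _ => {} // `id`, `retry`, and unknown fields are ignored here
        }
    }
    frames
}
```

Feeding it a body such as `"event: response.created\ndata: {}\n\n"` returns one frame whose `event` is `Some("response.created")`. A production parser would additionally handle partial chunks and idle timeouts, which this sketch leaves out.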

Core interface

The public interface of this crate is intentionally small and uniform:

  • Responses endpoint

    • Input:
      • ResponsesApiRequest for the request body (model, instructions, input, tools, parallel_tool_calls, reasoning/text controls).
      • ResponsesOptions for transport/header concerns (conversation_id, session_source, extra_headers, compression, turn_state).
    • Output: a ResponseStream of ResponseEvent (both re-exported from common).
  • Compaction endpoint

    • Input: CompactionInput<'a> (re-exported as codex_api::CompactionInput):
      • model: &str.
      • input: &[ResponseItem], the history to compact.
      • instructions: &str, the fully resolved compaction instructions.
    • Output: Vec<ResponseItem>.
    • CompactClient::compact_input(&CompactionInput, extra_headers) wraps the JSON encoding and retry/telemetry wiring.
  • Memory summarize endpoint

    • Input: MemorySummarizeInput (re-exported as codex_api::MemorySummarizeInput):
      • model: String.
      • raw_memories: Vec<RawMemory> (serialized as traces for wire compatibility).
        • RawMemory includes id, metadata.source_path, and normalized items.
      • reasoning: Option<Reasoning>.
    • Output: Vec<MemorySummarizeOutput>.
    • MemoriesClient::summarize_input(&MemorySummarizeInput, extra_headers) wraps JSON encoding and retry/telemetry wiring.

All HTTP details (URLs, headers, retry/backoff policies, SSE framing) are encapsulated in codex-api and codex-client. Callers construct prompts/inputs using protocol types and work with typed streams of ResponseEvent or compacted ResponseItem values.