mirror of
https://github.com/openai/codex.git
synced 2026-05-23 12:34:25 +00:00
## Why Model catalog responses can now advertise a nullable `default_service_tier` for each model. Codex needs to preserve three distinct states all the way from config/app-server inputs to inference: - no explicit service tier, so the client may apply the current model catalog default when FastMode is enabled - explicit `default`, meaning the user intentionally wants standard routing - explicit catalog tier ids such as `priority`, `flex`, or future tiers Keeping those states distinct prevents the UI from showing one tier while core sends another, especially after model switches or app-server `thread/start` / `turn/start` updates. ## What Changed - Plumbed `default_service_tier` through model catalog protocol types, app-server model responses, generated schemas, model cache fixtures, and provider/model-manager conversions. - Added the request-only `default` service tier sentinel and normalized legacy config spelling so `fast` in `config.toml` still materializes as the runtime/request id `priority`. - Moved catalog default resolution to the TUI/client side, including recomputing the effective service tier when model/FastMode-dependent surfaces change. - Updated app-server thread lifecycle config construction so `serviceTier: null` preserves explicit standard-routing intent by mapping to `default` instead of internal `None`. - Kept core responsible for validating explicit tiers against the current model and stripping `default` before `/v1/responses`, without applying catalog defaults itself. ## Validation - `CARGO_INCREMENTAL=0 cargo build -p codex-cli` - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list` - `cargo test -p codex-tui service_tier` - `cargo test -p codex-protocol service_tier_for_request` - `cargo test -p codex-core get_service_tier` - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core service_tier`
codex-api
Typed clients for Codex/OpenAI APIs built on top of the generic transport in codex-client.
- Hosts the request/response models and request builders for Responses and Compact APIs.
- Owns provider configuration (base URLs, headers, query params), auth header injection, retry tuning, and stream idle settings.
- Parses SSE streams into
ResponseEvent/ResponseStream, including rate-limit snapshots and API-specific error mapping. - Serves as the wire-level layer consumed by
codex-core; higher layers handle auth refresh and business logic.
Core interface
The public interface of this crate is intentionally small and uniform:
-
Responses endpoint
- Input:
ResponsesApiRequestfor the request body (model,instructions,input,tools,parallel_tool_calls, reasoning/text controls).ResponsesOptionsfor transport/header concerns (conversation_id,session_source,extra_headers,compression,turn_state).
- Output: a
ResponseStreamofResponseEvent(both re-exported fromcommon).
- Input:
-
Compaction endpoint
- Input:
CompactionInput<'a>(re-exported ascodex_api::CompactionInput):model: &str.input: &[ResponseItem]– history to compact.instructions: &str– fully-resolved compaction instructions.
- Output:
Vec<ResponseItem>. CompactClient::compact_input(&CompactionInput, extra_headers)wraps the JSON encoding and retry/telemetry wiring.
- Input:
-
Memory summarize endpoint
- Input:
MemorySummarizeInput(re-exported ascodex_api::MemorySummarizeInput):model: String.raw_memories: Vec<RawMemory>(serialized astracesfor wire compatibility).RawMemoryincludesid,metadata.source_path, and normalizeditems.
reasoning: Option<Reasoning>.
- Output:
Vec<MemorySummarizeOutput>. MemoriesClient::summarize_input(&MemorySummarizeInput, extra_headers)wraps JSON encoding and retry/telemetry wiring.
- Input:
All HTTP details (URLs, headers, retry/backoff policies, SSE framing) are encapsulated in codex-api and codex-client. Callers construct prompts/inputs using protocol types and work with typed streams of ResponseEvent or compacted ResponseItem values.