Files
codex/codex-rs/app-server/src/models.rs
Shijie Rao 370b13afc9 Honor client-resolved service tier defaults (#23537)
## Why

Model catalog responses can now advertise a nullable
`default_service_tier` for each model. Codex needs to preserve three
distinct states all the way from config/app-server inputs to inference:

- no explicit service tier, so the client may apply the current model
catalog default when FastMode is enabled
- explicit `default`, meaning the user intentionally wants standard
routing
- explicit catalog tier ids such as `priority`, `flex`, or future tiers

Keeping those states distinct prevents the UI from showing one tier
while core sends another, especially after model switches or app-server
`thread/start` / `turn/start` updates.

## What Changed

- Plumbed `default_service_tier` through model catalog protocol types,
app-server model responses, generated schemas, model cache fixtures, and
provider/model-manager conversions.
- Added the request-only `default` service tier sentinel and normalized
legacy config spelling so `fast` in `config.toml` still materializes as
the runtime/request id `priority`.
- Moved catalog default resolution to the TUI/client side, including
recomputing the effective service tier when model/FastMode-dependent
surfaces change.
- Updated app-server thread lifecycle config construction so
`serviceTier: null` preserves explicit standard-routing intent by
mapping to `default` instead of internal `None`.
- Kept core responsible for validating explicit tiers against the
current model and stripping `default` before `/v1/responses`, without
applying catalog defaults itself.

## Validation

- `CARGO_INCREMENTAL=0 cargo build -p codex-cli`
- `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list`
- `cargo test -p codex-tui service_tier`
- `cargo test -p codex-protocol service_tier_for_request`
- `cargo test -p codex-core get_service_tier`
- `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core
service_tier`
2026-05-20 15:57:50 -07:00

72 lines
2.5 KiB
Rust

use std::sync::Arc;
use codex_app_server_protocol::Model;
use codex_app_server_protocol::ModelServiceTier;
use codex_app_server_protocol::ModelUpgradeInfo;
use codex_app_server_protocol::ReasoningEffortOption;
use codex_core::ThreadManager;
use codex_models_manager::manager::RefreshStrategy;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::openai_models::ReasoningEffortPreset;
pub async fn supported_models(
thread_manager: Arc<ThreadManager>,
include_hidden: bool,
) -> Vec<Model> {
thread_manager
.list_models(RefreshStrategy::OnlineIfUncached)
.await
.into_iter()
.filter(|preset| include_hidden || preset.show_in_picker)
.map(model_from_preset)
.collect()
}
fn model_from_preset(preset: ModelPreset) -> Model {
Model {
id: preset.id.to_string(),
model: preset.model.to_string(),
upgrade: preset.upgrade.as_ref().map(|upgrade| upgrade.id.clone()),
upgrade_info: preset.upgrade.as_ref().map(|upgrade| ModelUpgradeInfo {
model: upgrade.id.clone(),
upgrade_copy: upgrade.upgrade_copy.clone(),
model_link: upgrade.model_link.clone(),
migration_markdown: upgrade.migration_markdown.clone(),
}),
availability_nux: preset.availability_nux.map(Into::into),
display_name: preset.display_name.to_string(),
description: preset.description.to_string(),
hidden: !preset.show_in_picker,
supported_reasoning_efforts: reasoning_efforts_from_preset(
preset.supported_reasoning_efforts,
),
default_reasoning_effort: preset.default_reasoning_effort,
input_modalities: preset.input_modalities,
supports_personality: preset.supports_personality,
additional_speed_tiers: preset.additional_speed_tiers,
service_tiers: preset
.service_tiers
.into_iter()
.map(|service_tier| ModelServiceTier {
id: service_tier.id,
name: service_tier.name,
description: service_tier.description,
})
.collect(),
default_service_tier: preset.default_service_tier,
is_default: preset.is_default,
}
}
fn reasoning_efforts_from_preset(
efforts: Vec<ReasoningEffortPreset>,
) -> Vec<ReasoningEffortOption> {
efforts
.iter()
.map(|preset| ReasoningEffortOption {
reasoning_effort: preset.effort,
description: preset.description.to_string(),
})
.collect()
}