core: preconnect Responses websocket for first turn (#10698)

## Problem
The first user turn can pay websocket handshake latency even when a
session has already started. We want to reduce that initial delay while
preserving turn semantics and avoiding any prompt send during startup.

Reviewer feedback also called out duplicated connect/setup paths and
unnecessary preconnect state complexity.

## Mental model
`ModelClient` owns session-scoped transport state. During session
startup, it can opportunistically warm one websocket handshake slot. A
turn-scoped `ModelClientSession` adopts that slot once if available,
restores captured sticky turn-state, and otherwise opens a websocket
through the same shared connect path.

If startup preconnect is still in flight, first turn setup awaits that
task and treats it as the first connection attempt for the turn.

Preconnect is handshake-only. The first `response.create` is still sent
only when a turn starts.
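
The adopt-once behavior described above can be sketched with a single-slot state machine. This is a minimal, synchronous illustration using only the standard library, not the code in `core/src/client.rs`: `Connection`, `Session`, `pre_establish`, `connection_for_turn`, and `activate_fallback` are hypothetical names, and a thread stands in for the async preconnect task.

```rust
use std::mem;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a warmed websocket connection.
#[derive(Debug, PartialEq)]
pub struct Connection {
    pub id: u32,
    /// Sticky turn-state captured during the handshake, if any.
    pub turn_state: Option<String>,
}

/// Session-scoped, single-slot preconnect state (assumed shape).
pub enum Preconnect {
    Idle,
    InFlight(JoinHandle<Option<Connection>>),
    Ready(Connection),
}

pub struct Session {
    slot: Preconnect,
}

impl Session {
    pub fn new() -> Self {
        Session { slot: Preconnect::Idle }
    }

    /// Best-effort warmup: runs the handshake only, never sends a prompt.
    pub fn pre_establish<F>(&mut self, handshake: F)
    where
        F: FnOnce() -> Option<Connection> + Send + 'static,
    {
        if matches!(self.slot, Preconnect::Idle) {
            self.slot = Preconnect::InFlight(thread::spawn(handshake));
        }
    }

    /// Turn setup: adopt the slot once, await it if still in flight,
    /// otherwise fall through to the regular connect path.
    pub fn connection_for_turn<F>(&mut self, connect: F) -> Option<Connection>
    where
        F: FnOnce() -> Option<Connection>,
    {
        // Taking the slot leaves `Idle` behind, so it is used at most once.
        match mem::replace(&mut self.slot, Preconnect::Idle) {
            Preconnect::Ready(conn) => Some(conn),
            // The in-flight handshake counts as the turn's first attempt.
            Preconnect::InFlight(handle) => handle.join().ok().flatten(),
            Preconnect::Idle => connect(),
        }
    }

    /// Fallback activation drops any warmed or pending connection.
    pub fn activate_fallback(&mut self) {
        self.slot = Preconnect::Idle;
    }
}
```

Because `connection_for_turn` always leaves the slot `Idle`, only the immediate next turn can benefit from the warmup, which matches the single-slot tradeoff noted below.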

## Non-goals
This change does not make preconnect required for correctness and does
not change prompt/turn payload semantics. It also does not expand
fallback behavior beyond clearing preconnect state when fallback
activates.

## Tradeoffs
The implementation prioritizes simpler ownership and shared connection
code over gating reuse on a header match. The single-slot cache keeps
the lifecycle straightforward but only benefits the immediate next turn.

Awaiting an in-flight preconnect has the same app-level connect-timeout
semantics as existing websocket connect behavior; this PR introduces no
new timeout class.
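
One way to read this timeout note is that the wait on an in-flight preconnect is bounded by the same connect timeout a fresh connect would use. The sketch below illustrates that reading with standard-library channels. It is an assumption about the shape of the behavior, not the actual implementation, and `ConnectError`, `start_preconnect`, and `await_preconnect` are hypothetical names.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ConnectError {
    TimedOut,
    Failed,
}

/// Starts a handshake on a background thread and returns a receiver for
/// its result. The `u32` stands in for a connection handle.
pub fn start_preconnect<F>(handshake: F) -> Receiver<Result<u32, ConnectError>>
where
    F: FnOnce() -> Result<u32, ConnectError> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone; preconnect is best-effort.
        let _ = tx.send(handshake());
    });
    rx
}

/// Waits for the in-flight handshake, bounded by the regular connect timeout.
pub fn await_preconnect(
    rx: &Receiver<Result<u32, ConnectError>>,
    connect_timeout: Duration,
) -> Result<u32, ConnectError> {
    match rx.recv_timeout(connect_timeout) {
        Ok(result) => result,
        Err(RecvTimeoutError::Timeout) => Err(ConnectError::TimedOut),
        Err(RecvTimeoutError::Disconnected) => Err(ConnectError::Failed),
    }
}
```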

## Architecture
`core/src/client.rs`:
- Added session-level preconnect lifecycle state (`Idle` / `InFlight` /
`Ready`) carrying one warmed websocket plus optional captured
turn-state.
- Added `pre_establish_connection()` startup warmup and `preconnect()`
handshake-only setup.
- Deduped auth/provider resolution into `current_client_setup()` and
websocket handshake wiring into `connect_websocket()` /
`build_websocket_headers()`.
- Updated turn websocket path to adopt preconnect first, await in-flight
preconnect when present, then create a new websocket only when needed.
- Ensured fallback activation clears warmed preconnect state.
- Added documentation for lifecycle, ownership, sticky-routing
invariants, and timeout semantics.

`core/src/codex.rs`:
- Session startup invokes `model_client.pre_establish_connection(...)`.
- Turn metadata resolution uses the shared timeout helper.

`core/src/turn_metadata.rs`:
- Centralized shared timeout helper used by both turn-time metadata
resolution and startup preconnect metadata building.

`core/tests/common/responses.rs` + websocket test suites:
- Added deterministic handshake waiting helper (`wait_for_handshakes`)
with bounded polling.
- Added startup preconnect and in-flight preconnect reuse coverage.
- Fallback expectations now assert exactly two websocket attempts in
covered scenarios (startup preconnect + turn attempt before fallback
sticks).
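
A bounded-polling wait of the kind the test helper describes might look like the sketch below. The real `wait_for_handshakes` signature is not shown in this commit message, so the parameters here (an atomic counter, an expected count, and a timeout) are assumptions, as is the 5 ms poll interval.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Polls `handshakes` until it reaches `expected` or `timeout` elapses.
/// Returns `true` if the expected count was observed in time.
pub fn wait_for_handshakes(
    handshakes: &AtomicUsize,
    expected: usize,
    timeout: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if handshakes.load(Ordering::SeqCst) >= expected {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Short sleep keeps the loop cheap while staying responsive.
        thread::sleep(Duration::from_millis(5));
    }
}
```

Bounding the wait with a deadline lets a test fail with a clear assertion instead of hanging when an expected handshake never arrives.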

## Observability
Preconnect remains best-effort and non-fatal. Existing
websocket/fallback telemetry remains in place, and debug logs now make
preconnect-await behavior and preconnect failures easier to reason
about.

## Tests
Validated with:
1. `just fmt`
2. `cargo test -p codex-core websocket_preconnect -- --nocapture`
3. `cargo test -p codex-core websocket_fallback -- --nocapture`
4. `cargo test -p codex-core websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`
Josh McKinney
2026-02-06 11:08:24 -08:00
committed by GitHub
parent 8896ca0ee6
commit e416e578bb
8 changed files with 718 additions and 116 deletions

@@ -46,7 +46,10 @@ async fn websocket_fallback_switches_to_http_after_retries_exhausted() -> Result
 .filter(|req| req.method == Method::POST && req.url.path().ends_with("/responses"))
 .count();
-assert_eq!(websocket_attempts, 1);
+// One websocket attempt comes from startup preconnect and one from the first turn's stream
+// attempt before fallback activates; after fallback, transport is HTTP. This matches the
+// retry-budget tradeoff documented in [`codex_core::client`] module docs.
+assert_eq!(websocket_attempts, 2);
 assert_eq!(http_attempts, 1);
 assert_eq!(response_mock.requests().len(), 1);
@@ -92,7 +95,10 @@ async fn websocket_fallback_is_sticky_across_turns() -> Result<()> {
 .filter(|req| req.method == Method::POST && req.url.path().ends_with("/responses"))
 .count();
-assert_eq!(websocket_attempts, 1);
+// The first turn issues exactly two websocket attempts (startup preconnect + first stream
+// attempt). After fallback becomes sticky, subsequent turns stay on HTTP. This mirrors the
+// retry-budget tradeoff documented in [`codex_core::client`] module docs.
+assert_eq!(websocket_attempts, 2);
 assert_eq!(http_attempts, 2);
 assert_eq!(response_mock.requests().len(), 2);