Commit Graph

46 Commits

Author SHA1 Message Date
Ahmed Ibrahim
2e22885e79 Split features into codex-features crate (#15253)
- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.

Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
2026-03-19 20:12:07 -07:00
Owen Lin
20f2a216df feat(core, tracing): create turn spans over websockets (#14632)
## Description

Dependent on:
- [responsesapi] https://github.com/openai/openai/pull/760991 
- [codex-backend] https://github.com/openai/openai/pull/760985

`codex app-server -> codex-backend -> responsesapi` now reuses a
persistent websocket connection across many turns. This PR updates
tracing when using websockets so that each `response.create` websocket
request propagates the current tracing context, so we can get a holistic
end-to-end trace for each turn.

Tracing is propagated via special keys (`ws_request_header_traceparent`,
`ws_request_header_tracestate`) set in the `client_metadata` param in
Responses API.

Currently tracing on websockets is a bit broken because we only set
tracing context on ws connection time, so it's detached from a
`turn/start` request.
2026-03-19 03:41:06 +00:00
pakrym-oai
770616414a Prefer websockets when providers support them (#13592)
Remove all flags and model settings.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-17 19:46:44 -07:00
Owen Lin
6ea041032b fix(core): prevent hanging turn/start due to websocket warming issues (#14838)
## Description

This PR fixes a bad first-turn failure mode in app-server when the
startup websocket prewarm hangs. Before this change, `initialize ->
thread/start -> turn/start` could sit behind the prewarm for up to five
minutes, so the client would not see `turn/started`, and even
`turn/interrupt` would block because the turn had not actually started
yet.

Now, we:
- set a (configurable) timeout of 15s for websocket startup time,
exposed as `websocket_startup_timeout_ms` in config.toml
- `turn/started` is sent immediately on `turn/start` even if the
websocket is still connecting
- `turn/interrupt` can be used to cancel a turn that is still waiting on
the websocket warmup
- the turn task will wait for the full 15s websocket warming timeout
before falling back

## Why

The old behavior made app-server feel stuck at exactly the moment the
client expects turn lifecycle events to start flowing. That was
especially painful for external clients, because from their point of
view the server had accepted the request but then went silent for
minutes.

## Configuring the websocket startup timeout
Can set it in config.toml like this:
```
[model_providers.openai]
supports_websockets = true
websocket_connect_timeout_ms = 15000
```
2026-03-17 10:07:46 -07:00
Channing Conger
fd4a673525 Responses: set x-client-request-id as convesration_id when talking to responses (#14312)
Right now we're sending the header session_id to responses which is
ignored/dropped. This sets a useful x-client-request-id to the
conversation_id.
2026-03-11 12:33:10 -07:00
Ahmed Ibrahim
c8446d7cf3 Stabilize websocket response.failed error delivery (#14017)
## What changed
- Drop failed websocket connections immediately after a terminal stream
error instead of awaiting a graceful close handshake before forwarding
the error to the caller.
- Keep the success path and the closed-connection guard behavior
unchanged.

## Why this fixes the flake
- The failing integration test waits for the second websocket stream to
surface the model error before issuing a follow-up request.
- On slower runners, the old error path awaited
`ws_stream.close().await` before sending the error downstream. If that
close handshake stalled, the test kept waiting for an error that had
already happened server-side and nextest timed it out.
- Dropping the failed websocket immediately makes the terminal error
observable right away and marks the session closed so the next request
reconnects cleanly instead of depending on a best-effort close
handshake.

## Code or test?
- This is a production logic fix in `codex-api`. The existing websocket
integration test already exercises the regression path.
2026-03-11 12:33:09 -07:00
Owen Lin
289ed549cf chore(otel): rename OtelManager to SessionTelemetry (#13808)
## Summary
This is a purely mechanical refactor of `OtelManager` ->
`SessionTelemetry` to better convey what the struct is doing. No
behavior change.

## Why

`OtelManager` ended up sounding much broader than what this type
actually does. It doesn't manage OTEL globally; it's the session-scoped
telemetry surface for emitting log/trace events and recording metrics
with consistent session metadata (`app_version`, `model`, `slug`,
`originator`, etc.).

`SessionTelemetry` is a more accurate name, and updating the call sites
makes that boundary a lot easier to follow.

## Validation

- `just fmt`
- `cargo test -p codex-otel`
- `cargo test -p codex-core`
2026-03-06 16:23:30 -08:00
Michael Bolin
bfff0c729f config: enforce enterprise feature requirements (#13388)
## Why

Enterprises can already constrain approvals, sandboxing, and web search
through `requirements.toml` and MDM, but feature flags were still only
configurable as managed defaults. That meant an enterprise could suggest
feature values, but it could not actually pin them.

This change closes that gap and makes enterprise feature requirements
behave like the other constrained settings. The effective feature set
now stays consistent with enterprise requirements during config load,
when config writes are validated, and when runtime code mutates feature
flags later in the session.

It also tightens the runtime API for managed features. `ManagedFeatures`
now follows the same constraint-oriented shape as `Constrained<T>`
instead of exposing panic-prone mutation helpers, and production code
can no longer construct it through an unconstrained `From<Features>`
path.

The PR also hardens the `compact_resume_fork` integration coverage on
Windows. After the feature-management changes,
`compact_resume_after_second_compaction_preserves_history` was
overflowing the libtest/Tokio thread stacks on Windows, so the test now
uses an explicit larger-stack harness as a pragmatic mitigation. That
may not be the ideal root-cause fix, and it merits a parallel
investigation into whether part of the async future chain should be
boxed to reduce stack pressure instead.

## What Changed

Enterprises can now pin feature values in `requirements.toml` with the
requirements-side `features` table:

```toml
[features]
personality = true
unified_exec = false
```

Only canonical feature keys are allowed in the requirements `features`
table; omitted keys remain unconstrained.

- Added a requirements-side pinned feature map to
`ConfigRequirementsToml`, threaded it through source-preserving
requirements merge and normalization in `codex-config`, and made the
TOML surface use `[features]` (while still accepting legacy
`[feature_requirements]` for compatibility).
- Exposed `featureRequirements` from `configRequirements/read`,
regenerated the JSON/TypeScript schema artifacts, and updated the
app-server README.
- Wrapped the effective feature set in `ManagedFeatures`, backed by
`ConstrainedWithSource<Features>`, and changed its API to mirror
`Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`,
and result-returning `enable` / `disable` / `set_enabled` helpers.
- Removed the legacy-usage and bulk-map passthroughs from
`ManagedFeatures`; callers that need those behaviors now mutate a plain
`Features` value and reapply it through `set(...)`, so the constrained
wrapper remains the enforcement boundary.
- Removed the production loophole for constructing unconstrained
`ManagedFeatures`. Non-test code now creates it through the configured
feature-loading path, and `impl From<Features> for ManagedFeatures` is
restricted to `#[cfg(test)]`.
- Rejected legacy feature aliases in enterprise feature requirements,
and return a load error when a pinned combination cannot survive
dependency normalization.
- Validated config writes against enterprise feature requirements before
persisting changes, including explicit conflicting writes and
profile-specific feature states that normalize into invalid
combinations.
- Updated runtime and TUI feature-toggle paths to use the constrained
setter API and to persist or apply the effective post-constraint value
rather than the requested value.
- Updated the `core_test_support` Bazel target to include the bundled
core model-catalog fixtures in its runtime data, so helper code that
resolves `core/models.json` through runfiles works in remote Bazel test
environments.
- Renamed the core config test coverage to emphasize that effective
feature values are normalized at runtime, while conflicting persisted
config writes are rejected.
- Ran `compact_resume_after_second_compaction_preserves_history` inside
an explicit 8 MiB test thread and Tokio runtime worker stack, following
the existing larger-stack integration-test pattern, to keep the Windows
`compact_resume_fork` test slice from aborting while a parallel
investigation continues into whether some of the underlying async
futures should be boxed.

## Verification

- `cargo test -p codex-config`
- `cargo test -p codex-core feature_requirements_ -- --nocapture`
- `cargo test -p codex-core
load_requirements_toml_produces_expected_constraints -- --nocapture`
- `cargo test -p codex-core
compact_resume_after_second_compaction_preserves_history -- --nocapture`
- `cargo test -p codex-core compact_resume_fork -- --nocapture`
- Re-ran the built `codex-core` `tests/all` binary with
`RUST_MIN_STACK=262144` for
`compact_resume_after_second_compaction_preserves_history` to confirm
the explicit-stack harness fixes the deterministic low-stack repro.
- `cargo test -p codex-core`
- This still fails locally in unrelated integration areas that expect
the `codex` / `test_stdio_server` binaries or hit existing `search_tool`
wiremock mismatches.

## Docs

`developers.openai.com/codex` should document the requirements-side
`[features]` table for enterprise and MDM-managed configuration,
including that it only accepts canonical feature keys and that
conflicting config writes are rejected.
2026-03-04 04:40:22 +00:00
pakrym-oai
69df12efb3 Remove Responses V1 websocket implementation (#13364)
V2 is the way to go!
2026-03-03 11:32:53 -07:00
pash-openai
2f5b01abd6 add fast mode toggle (#13212)
- add a local Fast mode setting in codex-core (similar to how model id
is currently stored on disk locally)
- send `service_tier=priority` on requests when Fast is enabled
- add `/fast` in the TUI and persist it locally
- feature flag
2026-03-02 20:29:33 -08:00
pakrym-oai
4fedef88e0 Use websocket v2 as model-preferred websocket protocol (#12838) 2026-02-25 16:35:53 -08:00
pakrym-oai
9d7013eab0 Handle websocket timeout (#12791)
Sometimes websockets will timeout with 400 error, ensure we retry it.
2026-02-25 10:31:37 -08:00
pakrym-oai
97d0068658 Send warmup request (#11258)
Send a request with `generate: falls` but a full set of tools and
instructions to pre-warm inference.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-24 08:15:47 -08:00
pakrym-oai
b17148f13a Prefer v2 websockets if available (#12428)
And also cleanup settings flow to avoid reading many separate flags.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-21 20:08:04 +00:00
Michael Bolin
1af2a37ada chore: remove codex-core public protocol/shell re-exports (#12432)
## Why

`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.

This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.

## What Changed

- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
  - `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
  - `codex_protocol::protocol`
  - `codex_protocol::config_types`
  - `codex_protocol::models`
  - `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
  - `codex-utils-approval-presets`
  - `codex-utils-cli`

## Verification

- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
2026-02-20 23:45:35 -08:00
pakrym-oai
86803ca9bf Reuse connection between turns (#12294)
Add a pool of one to the model client to reuse connections across turns.
2026-02-20 10:09:46 -08:00
pash-openai
429cc4860e ws turn metadata via client_metadata (#11953) 2026-02-19 12:28:15 -08:00
pakrym-oai
fc810ba045 Use V2 websockets if feature enabled (#12071) 2026-02-17 18:32:16 -08:00
pash-openai
6c0a924203 turn metadata: per-turn non-blocking (#11677) 2026-02-13 12:48:29 -08:00
pakrym-oai
eac5473114 Do not attempt to append after response.completed (#11402)
Completed responses are fully done, and new response must be created.
2026-02-11 07:45:17 -08:00
Michael Bolin
476c1a7160 Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
## Why

`codex-core` was being built in multiple feature-resolved permutations
because test-only behavior was modeled as crate features. For a large
crate, those permutations increase compile cost and reduce cache reuse.

## Net Change

- Removed the `test-support` crate feature and related feature wiring so
`codex-core` no longer needs separate feature shapes for test consumers.
- Standardized cross-crate test-only access behind
`codex_core::test_support`.
- External test code now imports helpers from
`codex_core::test_support`.
- Underlying implementation hooks are kept internal (`pub(crate)`)
instead of broadly public.

## Outcome

- Fewer `codex-core` build permutations.
- Better incremental cache reuse across test targets.
- No intended production behavior change.
2026-02-10 22:44:02 -08:00
xl-openai
fdd0cd1de9 feat: support multiple rate limits (#11260)
Added multi-limit support end-to-end by carrying limit_name in
rate-limit snapshots and handling multiple buckets instead of only
codex.
Extended /usage client parsing to consume additional_rate_limits
Updated TUI /status and in-memory state to store/render per-limit
snapshots
Extended app-server rate-limit read response: kept rate_limits and added
rate_limits_by_name.
Adjusted usage-limit error messaging for non-default codex limit buckets
2026-02-10 20:09:31 -08:00
pakrym-oai
4473147985 Do not resend output items in incremental websockets connections (#11383)
In the incremental websocket output items are already part of the
context, no need to send them again and duplicate.
2026-02-10 19:38:08 -08:00
pakrym-oai
c68999ee6d Prefer websocket transport when model opts in (#11386)
Summary
- add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
in all fixtures and constructors
- wire the new flag into websocket selection so models that opt in
always use websocket transport even when the feature gate is off

Testing
- Not run (not requested)
2026-02-10 18:50:48 -08:00
pakrym-oai
0639c33892 Compare full request for websockets incrementality (#11343)
Tools can dynamically change mid-turn now. We need to be more thorough
about reusing incremental connections.
2026-02-10 19:14:36 +00:00
pakrym-oai
ccd17374cb Move warmup to the task level (#11216)
Instead of storing a special connection on the client level make the
regular task responsible for establishing a normal client session and
open a connection on it.

Then when the turn is started we pass in a pre-established session.
2026-02-09 10:57:52 -08:00
Rasmus Rygaard
b2d3843109 Translate websocket errors (#10937)
When getting errors over a websocket connection, translate the error
into our regular API error format
2026-02-09 17:53:09 +00:00
pakrym-oai
8fe5066bcc Simplify pre-connect (#11040) 2026-02-07 15:52:03 -08:00
alexsong-oai
daeef06bec add originator to otel (#10826) 2026-02-06 15:13:56 -08:00
Brian Yu
1fbf5ed06f Support alternative websocket API (#10861)
**Test plan**

```
cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \
  ./target/debug/codex \
    --enable responses_websockets_v2 \
    --profile byok \
    --full-auto
```
2026-02-06 14:40:50 -08:00
Josh McKinney
e416e578bb core: preconnect Responses websocket for first turn (#10698)
## Problem
The first user turn can pay websocket handshake latency even when a
session has already started. We want to reduce that initial delay while
preserving turn semantics and avoiding any prompt send during startup.

Reviewer feedback also called out duplicated connect/setup paths and
unnecessary preconnect state complexity.

## Mental model
`ModelClient` owns session-scoped transport state. During session
startup, it can opportunistically warm one websocket handshake slot. A
turn-scoped `ModelClientSession` adopts that slot once if available,
restores captured sticky turn-state, and otherwise opens a websocket
through the same shared connect path.

If startup preconnect is still in flight, first turn setup awaits that
task and treats it as the first connection attempt for the turn.

Preconnect is handshake-only. The first `response.create` is still sent
only when a turn starts.

## Non-goals
This change does not make preconnect required for correctness and does
not change prompt/turn payload semantics. It also does not expand
fallback behavior beyond clearing preconnect state when fallback
activates.

## Tradeoffs
The implementation prioritizes simpler ownership and shared connection
code over header-match gating for reuse. The single-slot cache keeps
lifecycle straightforward but only benefits the immediate next turn.

Awaiting in-flight preconnect has the same app-level connect-timeout
semantics as existing websocket connect behavior (no new timeout class
introduced by this PR).

## Architecture
`core/src/client.rs`:
- Added session-level preconnect lifecycle state (`Idle` / `InFlight` /
`Ready`) carrying one warmed websocket plus optional captured
turn-state.
- Added `pre_establish_connection()` startup warmup and `preconnect()`
handshake-only setup.
- Deduped auth/provider resolution into `current_client_setup()` and
websocket handshake wiring into `connect_websocket()` /
`build_websocket_headers()`.
- Updated turn websocket path to adopt preconnect first, await in-flight
preconnect when present, then create a new websocket only when needed.
- Ensured fallback activation clears warmed preconnect state.
- Added documentation for lifecycle, ownership, sticky-routing
invariants, and timeout semantics.

`core/src/codex.rs`:
- Session startup invokes `model_client.pre_establish_connection(...)`.
- Turn metadata resolution uses the shared timeout helper.

`core/src/turn_metadata.rs`:
- Centralized shared timeout helper used by both turn-time metadata
resolution and startup preconnect metadata building.

`core/tests/common/responses.rs` + websocket test suites:
- Added deterministic handshake waiting helper (`wait_for_handshakes`)
with bounded polling.
- Added startup preconnect and in-flight preconnect reuse coverage.
- Fallback expectations now assert exactly two websocket attempts in
covered scenarios (startup preconnect + turn attempt before fallback
sticks).

## Observability
Preconnect remains best-effort and non-fatal. Existing
websocket/fallback telemetry remains in place, and debug logs now make
preconnect-await behavior and preconnect failures easier to reason
about.

## Tests
Validated with:
1. `just fmt`
2. `cargo test -p codex-core websocket_preconnect -- --nocapture`
3. `cargo test -p codex-core websocket_fallback -- --nocapture`
4. `cargo test -p codex-core
websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`
2026-02-06 19:08:24 +00:00
Anton Panasenko
4ee039744e feat: expose detailed metrics to runtime metrics (#10699) 2026-02-05 18:22:30 -08:00
pakrym-oai
dbe47ea01a Send beta header with websocket connects (#10727) 2026-02-05 15:05:02 -08:00
sayan-oai
5fdf6f5efa chore: rm web-search-eligible header (#10660)
default-enablement of web_search is now client-side, no need to send
eligibility headers to backend.

Tested locally, headers no longer sent.

will wait for corresponding backend change to deploy before merging
2026-02-05 11:48:34 -08:00
Owen Lin
3582b74d01 fix(auth): isolate chatgptAuthTokens concept to auth manager and app-server (#10423)
So that the rest of the codebase (like TUI) don't need to be concerned
whether ChatGPT auth was handled by Codex itself or passed in via
app-server's external auth mode.
2026-02-05 10:46:06 -08:00
pakrym-oai
0e8d359da9 Session-level model client (#10664)
Make ModelClient a session-scoped object.
Move state that is session level onto the client, and make state that is
per-turn explicit on corresponding methods.
Stop taking a huge Config object, instead only pass in values that are
actually needed.

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-04 16:58:48 -08:00
Rasmus Rygaard
df000da917 Add a codex.rate_limits event for websockets (#10324)
When communicating over websockets, we can't rely on headers to deliver
rate limit information. This PR adds a `codex.rate_limits` event that
the server can pass to the client to inform them about rate limit usage.
The client parses this data the same way we parse rate limit headers in
HTTP mode.

This PR also wires up the etag and reasoning headers for websockets
2026-02-04 06:01:47 -08:00
Anton Panasenko
fcaed4cb88 feat: log webscocket timing into runtime metrics (#10577) 2026-02-03 18:04:07 -08:00
sayan-oai
fc05374344 chore: add phase to message responseitem (#10455)
### What

add wiring for `phase` field on `ResponseItem::Message` to lay
groundwork for differentiating model preambles and final messages.
currently optional.

follows pattern in #9698.

updated schemas with `just write-app-server-schema` so we can see type
changes.

### Tests
Updated existing tests for SSE parsing and hydrating from history
2026-02-03 02:52:26 +00:00
pash-openai
019d89ff86 make codex better at git (#10145)
adds basic git context to the session prefix so the model can anchor git
actions and be a bit more version-aware. structured it in a
multiroot-friendly shape even though we only have one root today
2026-02-02 16:57:29 -08:00
Anton Panasenko
101d359cd7 Add websocket telemetry metrics and labels (#10316)
Summary
- expose websocket telemetry hooks through the responses client so
request durations and event processing can be reported
- record websocket request/event metrics and emit runtime telemetry
events that the history UI now surfaces
- improve tests to cover websocket telemetry reporting and guard runtime
summary updates


<img width="824" height="79" alt="Screenshot 2026-01-31 at 5 28 12 PM"
src="https://github.com/user-attachments/assets/ea9a7965-d8b4-4e3c-a984-ef4fdc44c81d"
/>
2026-01-31 19:16:44 -08:00
pakrym-oai
fbb3a30953 Remove WebSocket wire format (#10179)
I'd like WireApi to go away (when chat is removed) and WebSockets is
still responses API just over a different transport.
2026-01-29 13:50:53 -08:00
pakrym-oai
3b1cddf001 Fall back to http when websockets fail (#10139)
I expect not all proxies work with websockets, fall back to http if
websockets fail.
2026-01-29 10:36:21 -08:00
pakrym-oai
b511c38ddb Support end_turn flag (#9698)
Experimental flag that signals the end of the turn.
2026-01-22 17:27:48 +00:00
Ahmed Ibrahim
b11e96fb04 Act on reasoning-included per turn (#9402)
- Reset reasoning-included flag each turn and update compaction test
2026-01-19 11:23:25 -08:00
pakrym-oai
2d56519ecd Support response.done and add integration tests (#9129)
The agent loop using a persistent incremental web socket connection.
2026-01-13 16:12:30 +00:00