Move the realtime model transport implementation to WebRTC while keeping the core session/event interface intact for the TUI layer.
Co-authored-by: Codex <noreply@openai.com>
## Stack Position
1/4. Base PR in the realtime stack.
## Base
- `main`
## Unblocks
- #14830
## Scope
- Split the realtime websocket request builders into `common`, `v1`, and
`v2` modules.
- Keep runtime behavior unchanged in this PR.
---------
Co-authored-by: Codex <noreply@openai.com>
- add experimental_realtime_ws_mode (conversational/transcription) and
plumb it into realtime conversation session config
- switch realtime websocket intent and session.update payload shape
based on mode
- update config schema and realtime/config tests
---------
Co-authored-by: Codex <noreply@openai.com>
- Advertise a `codex` function tool in realtime v2 session updates.
- Emit handoff replies as `function_call_output` items while keeping v1
behavior unchanged.
- Split realtime event parsing into explicit v1/v2 modules with shared
common helpers.
---------
Co-authored-by: Codex <noreply@openai.com>
- Add a feature-flagged realtime v2 parser on the existing
websocket/session pipeline.
- Wire parser selection from core feature flags and map the codex
handoff tool-call path into existing handoff events.
---------
Co-authored-by: Codex <noreply@openai.com>
- Introduce `RealtimeConversationManager` for realtime API management
- Add `op::conversation` to start conversation, insert audio, insert
text, and close conversation.
- emit conversation lifecycle and realtime events.
- Move shared realtime payload types into codex-protocol and add core
e2e websocket tests for start/replace/transport-close paths.
Things to consider:
- Should we use the same `op::` and `Events` channel to carry audio? I
think we should try this simple approach and later we can create
separate one if the channels got congested.
- Sending text updates to the client: we can start simple and later
restrict that.
- Provider auth isn't wired for now intentionally
## Summary
- add realtime websocket client transport in codex-api
- send session.create on connect with backend prompt and optional
conversation_id
- keep session.update for prompt changes after connect
- switch inbound event parsing to a tagged enum (typed variants instead
of optional field bag)
- add a websocket e2e integration test in
codex-rs/codex-api/tests/realtime_websocket_e2e.rs
## Why
This moves the realtime transport to an explicit session-create
handshake and improves protocol safety with typed inbound events.
## Testing
- Added e2e integration test coverage for session create + event flow in
the API crate.