Adds WebRTC startup to the experimental app-server
`thread/realtime/start` method with an optional transport enum. The
websocket path remains the default; WebRTC offers create the realtime
session through the shared start flow and emit the answer SDP via
`thread/realtime/sdp`.
---------
Co-authored-by: Codex <noreply@openai.com>
Expose workspace-owner account metadata and an app-server nudge endpoint so users who run out of credits can inspect usage and notify their workspace owner directly from the TUI.
Made-with: Cursor
Addresses #16560
Problem: `/status` stopped showing the source thread id in forked TUI
sessions after the app-server migration.
Solution: Carry fork source ids through app-server v2 thread data and
the TUI session adapter, and update TUI fixtures so `/status` matches
the old TUI behavior.
## Description
Previously the `action` field on `EventMsg::GuardianAssessment`, which
describes what Guardian is reviewing, was typed as an arbitrary JSON
blob. This PR cleans it up and defines a sum type representing all the
various actions that Guardian can review.
This is a breaking change (on purpose), which is fine because:
- the Codex app / VSCE does not actually use `action` at the moment
- the TUI code that consumes `action` is updated in this PR as well
- rollout files that serialized old `EventMsg::GuardianAssessment` will
just silently drop these guardian events
- the contract is defined as unstable, so other clients have a fair
warning :)
This will make things much easier for followup Guardian work.
## Why
The old guardian review payloads worked, but they pushed too much shape
knowledge into downstream consumers. The TUI had custom JSON parsing
logic for commands, patches, network requests, and MCP calls, and the
app-server protocol was effectively just passing through an opaque blob.
Typing this at the protocol boundary makes the contract clearer.
## Summary
- add `self_serve_business_usage_based` and `enterprise_cbp_usage_based`
to the public/internal plan enums and regenerate the app-server + Python
SDK artifacts
- map both plans through JWT login and backend rate-limit payloads, then
bucket them with the existing Team/Business entitlement behavior in
cloud requirements, usage-limit copy, tooltips, and status display
- keep the earlier display-label remap commit on this branch so the new
Team-like and Business-like plans render consistently in the UI
## Testing
- `just write-app-server-schema`
- `uv run --project sdk/python python
sdk/python/scripts/update_sdk_artifacts.py generate-types`
- `just fix -p codex-protocol -p codex-login -p codex-core -p
codex-backend-client -p codex-cloud-requirements -p codex-tui -p
codex-tui-app-server -p codex-backend-openapi-models`
- `just fmt`
- `just argument-comment-lint`
- `cargo test -p codex-protocol
usage_based_plan_types_use_expected_wire_names`
- `cargo test -p codex-login usage_based`
- `cargo test -p codex-backend-client usage_based`
- `cargo test -p codex-cloud-requirements usage_based`
- `cargo test -p codex-core usage_limit_reached_error_formats_`
- `cargo test -p codex-tui plan_type_display_name_remaps_display_labels`
- `cargo test -p codex-tui remapped`
- `cargo test -p codex-tui-app-server
plan_type_display_name_remaps_display_labels`
- `cargo test -p codex-tui-app-server remapped`
- `cargo test -p codex-tui-app-server
preserves_usage_based_plan_type_wire_name`
## Notes
- a broader multi-crate `cargo test` run still hits unrelated existing
guardian-approval config failures in
`codex-rs/core/src/config/config_tests.rs`
CHAINED PR - note that base is eternal/hooks-pretooluse-bash, not main
-- so the following PR should be first
Matching post-tool hook to the pre-tool functionality here:
https://github.com/openai/codex/pull/15211
So, PreToolUse calls for plain shell calls, allows blocking. This
PostToolUse call runs after the command executed
example run:
```
› as a test, run in parallel the following commands:
- echo 'one'
- echo '[block-pre-tool-use]'
- echo '[block-post-tool-use]'
⚠ MCP startup incomplete (failed: notion, linear)
• Cruising through those three commands in parallel now, and I’ll share the exact outputs right after
they land.
• Running PreToolUse hook: checking the observatory runes
• Running PreToolUse hook: checking the observatory runes
• Running PreToolUse hook: checking the observatory runes
PreToolUse hook (blocked)
warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
PreToolUse hook (completed)
warning: wizard-tower PreToolUse demo inspected Bash: echo 'one'
PreToolUse hook (completed)
warning: wizard-tower PreToolUse demo inspected Bash: echo '[block-post-tool-use]'
• Ran echo '[block-post-tool-use]'
└ [block-post-tool-use]
• Ran echo 'one'
└ one
• Running PostToolUse hook: reading the observatory aftermath
• Running PostToolUse hook: reading the observatory aftermath
PostToolUse hook (blocked)
warning: wizard-tower PostToolUse demo blocked the Bash follow-through on purpose.
feedback: PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to
continue.
PostToolUse hook (completed)
warning: wizard-tower PostToolUse demo inspected Bash: echo 'one'
hook context: PostToolUse demo saw the Bash result. Response preview: one
──────────────────────────────────────────────────────────────────────────────────────────────────────
• Ran them in parallel. Little wave report:
- echo 'one' completed and returned one.
- echo '[block-pre-tool-use]' was blocked before execution by the PreToolUse hook.
- echo '[block-post-tool-use]' ran, but its result was blocked afterward by the PostToolUse hook.
The hook messages were:
- PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
- PostToolUse demo blocked the result after execution. Remove [block-post-tool-use] to continue.
› for the blocked messages (correctly blocked in this test) did the tool output land back into
context?
• Yeah, from this run:
- PreToolUse block: the command itself did not execute, so no command output landed in context. Only
the block message did.
- PostToolUse block: the command did execute, but the actual stdout did not come back into context.
Only the post-tool block message did.
So the blocked notifications landed in context, but the blocked command payloads themselves did not.
Nice clean guardrail, aloha.
```
### Summary
Add the v2 app-server filesystem watch RPCs and notifications, wire them
through the message processor, and implement connection-scoped watches
with notify-backed change delivery. This also updates the schema
fixtures, app-server documentation, and the v2 integration coverage for
watch and unwatch behavior.
This allows clients to efficiently watch for filesystem updates, e.g. to
react on branch changes.
### Testing
- exercise watch lifecycles for directory changes, atomic file
replacement, missing-file targets, and unwatch cleanup
- add `PreToolUse` hook for bash-like tool execution only at first
- block shell execution before dispatch with deny-only hook behavior
- introduces common.rs matcher framework for matching when hooks are run
example run:
```
› run three parallel echo commands, and the second one should echo "[block-pre-tool-use]" as a test
• Running the three echo commands in parallel now and I’ll report the output directly.
• Running PreToolUse hook: name for demo pre tool use hook
• Running PreToolUse hook: name for demo pre tool use hook
• Running PreToolUse hook: name for demo pre tool use hook
PreToolUse hook (completed)
warning: wizard-tower PreToolUse demo inspected Bash: echo "first parallel echo"
PreToolUse hook (blocked)
warning: wizard-tower PreToolUse demo blocked a Bash command on purpose.
feedback: PreToolUse demo blocked the command. Remove [block-pre-tool-use] to continue.
PreToolUse hook (completed)
warning: wizard-tower PreToolUse demo inspected Bash: echo "third parallel echo"
• Ran echo "first parallel echo"
└ first parallel echo
• Ran echo "third parallel echo"
└ third parallel echo
• Three little waves went out in parallel.
1. printed first parallel echo
2. was blocked before execution because it contained the exact test string [block-pre-tool-use]
3. printed third parallel echo
There was also an unrelated macOS defaults warning around the successful commands, but the echoes
themselves worked fine. If you want, I can rerun the second one with a slightly modified string so
it passes cleanly.
```
## Summary
- queue input after the user submits `/compact` until that manual
compact turn ends
- mirror the same behavior in the app-server TUI
- add regressions for input queued before compact starts and while it is
running
Co-authored-by: Codex <noreply@openai.com>
- emit a typed `thread/realtime/transcriptUpdated` notification from
live realtime transcript deltas
- expose that notification as flat `threadId`, `role`, and `text` fields
instead of a nested transcript array
- continue forwarding raw `handoff_request` items on
`thread/realtime/itemAdded`, including the accumulated
`active_transcript`
- update app-server docs, tests, and generated protocol schema artifacts
to match the delta-based payloads
---------
Co-authored-by: Codex <noreply@openai.com>
This PR add an URI-based system to reference agents within a tree. This
comes from a sync between research and engineering.
The main agent (the one manually spawned by a user) is always called
`/root`. Any sub-agent spawned by it will be `/root/agent_1` for example
where `agent_1` is chosen by the model.
Any agent can contact any agents using the path.
Paths can be used either in absolute or relative to the calling agents
Resume is not supported for now on this new path
## Summary
Persist Stop-hook continuation prompts as `user` messages instead of
hidden `developer` messages + some requested integration tests
This is a followup to @pakrym 's comment in
https://github.com/openai/codex/pull/14532 to make sure stop-block
continuation prompts match training for turn loops
- Stop continuation now writes `<hook_prompt hook_run_id="...">stop
hook's user prompt<hook_prompt>`
- Introduces quick-xml dependency, though we already indirectly depended
on it anyway via syntect
- This PR only has about 500 lines of actual logic changes, the rest is
tests/schema
## Testing
Example run (with a sessionstart hook and 3 stop hooks) - this shows
context added by session start, then two stop hooks sending their own
additional prompts in a new turn. The model responds with a single
message addressing both. Then when that turn ends, the hooks detect that
they just ran using `stop_hook_active` and decide not to infinite loop
test files for this (unzip, move codex -> .codex):
[codex.zip](https://github.com/user-attachments/files/26075806/codex.zip)
```
› cats
• Running SessionStart hook: lighting the observatory
SessionStart hook (completed)
warning: Hi, I'm a session start hook for wizard-tower (startup).
hook context: A wimboltine stonpet is an exotic cuisine from hyperspace
• Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk
cat facts, cat breeds, cat names, or build something cat-themed in this repo.
• Running Stop hook: checking the tower wards
• Running Stop hook: sacking the guards
• Running Stop hook: hiring the guards
Stop hook (completed)
warning: Wizard Tower Stop hook reviewed the completed reply (177 chars).
Stop hook (blocked)
warning: Wizard Tower Stop hook continuing conversation
feedback: cook the stonpet
Stop hook (blocked)
warning: Wizard Tower Stop hook continuing conversation
feedback: eat the cooked stonpet
• Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and
rested until the hyperspace juices settle.
Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf:
smoky, bright, and totally out of this dimension.
• Running Stop hook: checking the tower wards
• Running Stop hook: sacking the guards
• Running Stop hook: hiring the guards
Stop hook (completed)
warning: Wizard Tower Stop hook reviewed the completed reply (285 chars).
Stop hook (completed)
warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
Stop hook (completed)
warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
```
1. Added SessionSource::Custom(String) and --session-source.
2. Enforced plugin and skill products by session_source.
3. Applied the same filtering to curated background refresh.
This PR adds a new `thread/shellCommand` app server API so clients can
implement `!` shell commands. These commands are executed within the
sandbox, and the command text and output are visible to the model.
The internal implementation mirrors the current TUI `!` behavior.
- persist shell command execution as `CommandExecution` thread items,
including source and formatted output metadata
- bridge live and replayed app-server command execution events back into
the existing `tui_app_server` exec rendering path
This PR also wires `tui_app_server` to submit `!` commands through the
new API.
1. Use requirement-resolved config.features as the plugin gate.
2. Guard plugin/list, plugin/read, and related flows behind that gate.
3. Skip bad marketplace.json files instead of failing the whole list.
4. Simplify plugin state and caching.
- this allows blocking the user's prompts from executing, and also
prevents them from entering history
- handles the edge case where you can both prevent the user's prompt AND
add n amount of additionalContexts
- refactors some old code into common.rs where hooks overlap
functionality
- refactors additionalContext being previously added to user messages,
instead we use developer messages for them
- handles queued messages correctly
Sample hook for testing - if you write "[block-user-submit]" this hook
will stop the thread:
example run
```
› sup
• Running UserPromptSubmit hook: reading the observatory notes
UserPromptSubmit hook (completed)
warning: wizard-tower UserPromptSubmit demo inspected: sup
hook context: Wizard Tower UserPromptSubmit demo fired. For this reply only, include the exact
phrase 'observatory lanterns lit' exactly once near the end.
• Just riding the cosmic wave and ready to help, my friend. What are we building today? observatory
lanterns lit
› and [block-user-submit]
• Running UserPromptSubmit hook: reading the observatory notes
UserPromptSubmit hook (stopped)
warning: wizard-tower UserPromptSubmit demo blocked the prompt on purpose.
stop: Wizard Tower demo block: remove [block-user-submit] to continue.
```
.codex/config.toml
```
[features]
codex_hooks = true
```
.codex/hooks.json
```
{
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "command",
"command": "/usr/bin/python3 .codex/hooks/user_prompt_submit_demo.py",
"timeoutSec": 10,
"statusMessage": "reading the observatory notes"
}
]
}
]
}
}
```
.codex/hooks/user_prompt_submit_demo.py
```
#!/usr/bin/env python3
import json
import sys
from pathlib import Path
def prompt_from_payload(payload: dict) -> str:
prompt = payload.get("prompt")
if isinstance(prompt, str) and prompt.strip():
return prompt.strip()
event = payload.get("event")
if isinstance(event, dict):
user_prompt = event.get("user_prompt")
if isinstance(user_prompt, str):
return user_prompt.strip()
return ""
def main() -> int:
payload = json.load(sys.stdin)
prompt = prompt_from_payload(payload)
cwd = Path(payload.get("cwd", ".")).name or "wizard-tower"
if "[block-user-submit]" in prompt:
print(
json.dumps(
{
"systemMessage": (
f"{cwd} UserPromptSubmit demo blocked the prompt on purpose."
),
"decision": "block",
"reason": (
"Wizard Tower demo block: remove [block-user-submit] to continue."
),
}
)
)
return 0
prompt_preview = prompt or "(empty prompt)"
if len(prompt_preview) > 80:
prompt_preview = f"{prompt_preview[:77]}..."
print(
json.dumps(
{
"systemMessage": (
f"{cwd} UserPromptSubmit demo inspected: {prompt_preview}"
),
"hookSpecificOutput": {
"hookEventName": "UserPromptSubmit",
"additionalContext": (
"Wizard Tower UserPromptSubmit demo fired. "
"For this reply only, include the exact phrase "
"'observatory lanterns lit' exactly once near the end."
),
},
}
)
)
return 0
if __name__ == "__main__":
raise SystemExit(main())
```
- thread the realtime version into conversation start and app-server
notifications
- keep playback-aware mic gating and playback interruption behavior on
v2 only, leaving v1 on the legacy path
## Stack Position
2/4. Built on top of #14828.
## Base
- #14828
## Unblocks
- #14829
- #14827
## Scope
- Port the realtime v2 wire parsing, session, app-server, and
conversation runtime behavior onto the split websocket-method base.
- Branch runtime behavior directly on the current realtime session kind
instead of parser-derived flow flags.
- Keep regression coverage in the existing e2e suites.
---------
Co-authored-by: Codex <noreply@openai.com>
Make `interrupted` an agent state and make it not final. As a result, a
`wait` won't return on an interrupted agent and no notification will be
send to the parent agent.
The rationals are:
* If a user interrupt a sub-agent for any reason, you don't want the
parent agent to instantaneously ask the sub-agent to restart
* If a parent agent interrupt a sub-agent, no need to add a noisy
notification in the parent agen
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved
## Runtime model
This PR does not introduce a new `approval_policy`.
Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
- `user`
- `guardian_subagent`
`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.
The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.
When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`
Users can still change `/approvals` afterward.
Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior
ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.
## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.
- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface
## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.
It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`
`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`
`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.
These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.
This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.
## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
- resolved approval/denial state in history
## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)
Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)
---------
Co-authored-by: Codex <noreply@openai.com>
- add model and reasoning effort to app-server collab spawn items and
notifications
- regenerate app-server protocol schemas for the new fields
---------
Co-authored-by: Codex <noreply@openai.com>
(Experimental)
This PR adds a first MVP for hooks, with SessionStart and Stop
The core design is:
- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary
On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:
- new live notifications: hook/started, hook/completed
- persisted/replayed hook results live on Turn.hookRuns
- we intentionally did not add hook-specific ThreadItem variants
Hooks messages are not persisted, they remain ephemeral. The context
changes they add are (they get appended to the user's prompt)
* Add an ability to stream stdin, stdout, and stderr
* Streaming of stdout and stderr has a configurable cap for total amount
of transmitted bytes (with an ability to disable it)
* Add support for overriding environment variables
* Add an ability to terminate running applications (using
`command/exec/terminate`)
* Add TTY/PTY support, with an ability to resize the terminal (using
`command/exec/resize`)
## Note-- added plugin mentions via @, but that conflicts with file
mentions
depends and builds upon #13433.
- introduces explicit `@plugin` mentions. this injects the plugin's mcp
servers, app names, and skill name format into turn context as a dev
message.
- we do not yet have UI for these mentions, so we currently parse raw
text (as opposed to skills and apps which have UI chips, autocomplete,
etc.) this depends on a `plugins/list` app-server endpoint we can feed
the UI with, which is upcoming
- also annotate mcp and app tool descriptions with the plugin(s) they
come from. this gives the model a first class way of understanding what
tools come from which plugins, which will help implicit invocation.
### Tests
Added and updated tests, unit and integration. Also confirmed locally a
raw `@plugin` injects the dev message, and the model knows about its
apps, mcps, and skills.
This adds a first-class app-server v2 `skills/changed` notification for
the existing skills live-reload signal.
Before this change, clients only had the legacy raw
`codex/event/skills_update_available` event. With this PR, v2 clients
can listen for a typed JSON-RPC notification instead of depending on the
legacy `codex/event/*` stream, which we want to remove soon.
This change fixes a Codex app account-state sync bug where clients could
know the user was signed in but still miss the ChatGPT subscription
tier, which could lead to incorrect upgrade messaging for paid users.
The root cause was that `account/updated` only carried `authMode` while
plan information was available separately via `account/read` and
rate-limit snapshots, so this update adds `planType` to
`account/updated`, populates it consistently across login and refresh
paths.
Replay pending client requests after `thread/resume` and emit resolved
notifications when those requests clear so approval/input UI state stays
in sync after reconnects and across subscribed clients.
Affected RPCs:
- `item/commandExecution/requestApproval`
- `item/fileChange/requestApproval`
- `item/tool/requestUserInput`
Motivation:
- Resumed clients need to see pending approval/input requests that were
already outstanding before the reconnect.
- Clients also need an explicit signal when a pending request resolves
or is cleared so stale UI can be removed on turn start, completion, or
interruption.
Implementation notes:
- Use pending client requests from `OutgoingMessageSender` in order to
replay them after `thread/resume` attaches the connection, using
original request ids.
- Emit `serverRequest/resolved` when pending requests are answered
or cleared by lifecycle cleanup.
- Update the app-server protocol schema, generated TypeScript bindings,
and README docs for the replay/resolution flow.
High-level test plan:
- Added automated coverage for replaying pending command execution and
file change approval requests on `thread/resume`.
- Added automated coverage for resolved notifications in command
approval, file change approval, request_user_input, turn start, and turn
interrupt flows.
- Verified schema/docs updates in the relevant protocol and app-server
tests.
Manual testing:
- Tested reconnect/resume with multiple connections.
- Confirmed state stayed in sync between connections.
Adds a new v2 app-server API for a client to be able to unsubscribe to a
thread:
- New RPC method: `thread/unsubscribe`
- New server notification: `thread/closed`
Today clients can start/resume/archive threads, but there wasn’t a way
to explicitly unload a live thread from memory without archiving it.
With `thread/unsubscribe`, a client can indicate it is no longer
actively working with a live Thread. If this is the only client
subscribed to that given thread, the thread will be automatically closed
by app-server, at which point the server will send `thread/closed` and
`thread/status/changed` with `status: notLoaded` notifications.
This gives clients a way to prevent long-running app-server processes
from accumulating too many thread (and related) objects in memory.
Closed threads will also be removed from `thread/loaded/list`.
Previously, clients would call `thread/start` with dynamic_tools set,
and when a model invokes a dynamic tool, it would just make the
server->client `item/tool/call` request and wait for the client's
response to complete the tool call. This works, but it doesn't have an
`item/started` or `item/completed` event.
Now we are doing this:
- [new] emit `item/started` with `DynamicToolCall` populated with the
call arguments
- send an `item/tool/call` server request
- [new] once the client responds, emit `item/completed` with
`DynamicToolCall` populated with the response.
Also, with `persistExtendedHistory: true`, dynamic tool calls are now
reconstructable in `thread/read` and `thread/resume` as
`ThreadItem::DynamicToolCall`.
Add experimental `thread/realtime/*` v2 requests and notifications, then
route app-server realtime events through that thread-scoped surface with
integration coverage.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
The generated unnamespaced JSON envelope schemas (`ClientRequest` and
`ServerNotification`) still contained both v1 and v2 variants, which
pulled legacy v1/core types and v2 types into the same `definitions`
graph. That caused `schemars` to produce numeric suffix names (for
example `AskForApproval2`, `ByteRange2`, `MessagePhase2`).
This PR moves JSON codegen toward v2-only output while preserving the
unnamespaced envelope artifacts, and avoids reintroducing numeric-suffix
tolerance by removing the v1/internal-only variants that caused the
collisions in those envelope schemas.
## What Changed
- In `codex-rs/app-server-protocol/src/export.rs`, JSON generation now
excludes v1 schema artifacts (`v1/*`) while continuing to emit
unnamespaced/root JSON schemas and the JSON bundle.
- Added a narrow JSON v1 allowlist (`JSON_V1_ALLOWLIST`) so
`InitializeParams` and `InitializeResponse` are still emitted.
- Added JSON-only post-processing for the mixed envelope schemas before
collision checks run:
- `ClientRequest`: strips v1 request variants from the generated `oneOf`
using the temporary `V1_CLIENT_REQUEST_METHODS` list
- `ServerNotification`: strips v1 notifications plus the internal-only
`rawResponseItem/completed` notification using the temporary
`EXCLUDED_SERVER_NOTIFICATION_METHODS_FOR_JSON` list
- Added a temporary local-definition pruning pass for those envelope
schemas so now-unreferenced v1/core definitions are removed from
`definitions` after method filtering.
- Updated the variant-title naming heuristic for single-property literal
object variants to use the literal value (when available), avoiding
collisions like multiple `state`-only variants all deriving the same
title.
- Collision handling remains fail-fast (no numeric suffix fallback map
in this PR path).
## Verification
- `just write-app-server-schema`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12408).
* __->__ #12408
* #12406
https://github.com/openai/codex/pull/10455 introduced the `phase` field,
and then https://github.com/openai/codex/pull/12072 introduced a
`MessagePhase` type in `v2.rs` that paralleled the `MessagePhase` type
in `codex-rs/protocol/src/models.rs`.
The app server protocol prefers `camelCase` while the Responses API uses
`snake_case`, so this meant we had two versions of `MessagePhase` with
different serialization rules. When the app server protocol refers to
types from the Responses API, we use the wire format of the the
Responses API even though it is inconsistent with the app server API.
This PR deletes `MessagePhase` from `v2.rs` and consolidates on the
Responses API version to eliminate confusion.
- Introduce `RealtimeConversationManager` for realtime API management
- Add `op::conversation` to start conversation, insert audio, insert
text, and close conversation.
- emit conversation lifecycle and realtime events.
- Move shared realtime payload types into codex-protocol and add core
e2e websocket tests for start/replace/transport-close paths.
Things to consider:
- Should we use the same `op::` and `Events` channel to carry audio? I
think we should try this simple approach and later we can create
separate one if the channels got congested.
- Sending text updates to the client: we can start simple and later
restrict that.
- Provider auth isn't wired for now intentionally
Exposes through the app server updated names set for a thread. This
enables other surfaces to use the core as the source of truth for thread
naming. `threadName` is gathered using the helper functions used to
interact with `session_index.jsonl`, and is hydrated in:
- `thread/list`
- `thread/read`
- `thread/resume`
- `thread/unarchive`
- `thread/rollback`
We don't do this for `thread/start` and `thread/fork`.