Commit Graph

32 Commits

Author SHA1 Message Date
pakrym-oai
dd1321d11b Spread AbsolutePathBuf (#17792)
Mechanical change to promote absolute paths through code.
2026-04-14 14:26:10 -07:00
Won Park
37aac89a6d representing guardian review timeouts in protocol types (#17381)
## Summary

- Add `TimedOut` to Guardian/review carrier types:
  - `ReviewDecision::TimedOut`
  - `GuardianAssessmentStatus::TimedOut`
  - app-server v2 `GuardianApprovalReviewStatus::TimedOut`
- Regenerate app-server JSON/TypeScript schemas for the new wire shape.
- Wire the new status through core/app-server/TUI mappings with
conservative fail-closed handling.
- Keep `TimedOut` non-user-selectable in the approval UI.

**Does not change runtime behavior yet; emitting `TimeOut` and
parent-model timeout messaging will come in followup PRs**
2026-04-10 20:02:33 -07:00
Owen Lin
a3be74143a fix(guardian, app-server): introduce guardian review ids (#17298)
## Description

This PR introduces `review_id` as the stable identifier for guardian
reviews and exposes it in app-server `item/autoApprovalReview/started`
and `item/autoApprovalReview/completed` events.

Internally, guardian rejection state is now keyed by `review_id` instead
of the reviewed tool item ID. `target_item_id` is still included when a
review maps to a concrete thread item, but it is no longer overloaded as
the review lifecycle identifier.

## Motivation

We'd like to give users the ability to preempt a guardian review while
it's running (approve or decline).

However, we can't implement the API that allows the user to override a
running guardian review because we didn't have a unique `review_id` per
guardian review. Using `target_item_id` is not correct since:
- with execve reviews, there can be multiple execve calls (and therefore
guardian reviews) per shell command
- with network policy reviews, there is no target item ID

The PR that actually implements user overrides will use `review_id` as
the stable identifier.
2026-04-10 16:21:02 -07:00
maja-openai
dcbc91fd39 Update guardian output schema (#17061)
## Summary
- Update guardian output schema to separate risk, authorization,
outcome, and rationale.
- Feed guardian rationale into rejection messages.
- Split the guardian policy into template and tenant-config sections.

## Validation
- `cargo test -p codex-core mcp_tool_call`
- `env -u CODEX_SANDBOX_NETWORK_DISABLED INSTA_UPDATE=always cargo test
-p codex-core guardian::`

---------

Co-authored-by: Owen Lin <owen@openai.com>
2026-04-08 15:47:29 -07:00
Matthew Zeng
252d79f5eb [mcp] Support MCP Apps part 2 - Add meta to mcp tool call result. (#16465)
- [x] Add meta to mcp tool call result.
2026-04-07 11:10:21 -07:00
rhan-oai
756c45ec61 [codex-analytics] add protocol-native turn timestamps (#16638)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
* #16870
* #16706
* #16659
* #16641
* #16640
* __->__ #16638
2026-04-06 16:22:59 -07:00
Owen Lin
bd30bad96f fix(guardian): fix ordering of guardian events (#16462)
Guardian events were emitted a bit out of order for CommandExecution
items. This would make it hard for the frontend to render a guardian
auto-review, which has this payload:
```
pub struct ItemGuardianApprovalReviewStartedNotification {
    pub thread_id: String,
    pub turn_id: String,
    pub target_item_id: String,
    pub review: GuardianApprovalReview,
    // FYI this is no longer a json blob
    pub action: Option<JsonValue>,
}
```

There is a `target_item_id` the auto-approval review is referring to,
but the actual item had not been emitted yet.

Before this PR:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`, and if approved...
- `item/started`
- `item/completed`

After this PR:
- `item/started`
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
- `item/completed`

This lines up much better with existing patterns (i.e. human review in
`Default mode`, where app-server would send a server request to prompt
for user approval after `item/started`), and makes it easier for clients
to render what guardian is actually reviewing.

We do this following a similar pattern as `FileChange` (aka apply patch)
items, where we create a FileChange item and emit `item/started` if we
see the apply patch approval request, before the actual apply patch call
runs.
2026-04-06 19:14:27 +00:00
Charley Cunningham
f547b79bd0 Add fork snapshot modes (#15239)
## Summary
- add `ForkSnapshotMode` to `ThreadManager::fork_thread` so callers can
request either a committed snapshot or an interrupted snapshot
- share the model-visible `<turn_aborted>` history marker between the
live interrupt path and interrupted forks
- update the small set of direct fork callsites to pass
`ForkSnapshotMode::Committed`

Note: this enables /btw to work similarly as Esc to interrupt (hopefully
somewhat in distribution)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-23 19:05:42 -07:00
Won Park
461ba012fc Feat/restore image generation history (#15223)
Restore image generation items in resumed thread history
2026-03-19 22:57:16 -07:00
Andrei Eternal
267499bed8 [hooks] use a user message > developer message for prompt continuation (#14867)
## Summary

Persist Stop-hook continuation prompts as `user` messages instead of
hidden `developer` messages + some requested integration tests

This is a followup to @pakrym 's comment in
https://github.com/openai/codex/pull/14532 to make sure stop-block
continuation prompts match training for turn loops

- Stop continuation now writes `<hook_prompt hook_run_id="...">stop
hook's user prompt<hook_prompt>`
- Introduces quick-xml dependency, though we already indirectly depended
on it anyway via syntect
- This PR only has about 500 lines of actual logic changes, the rest is
tests/schema

## Testing

Example run (with a sessionstart hook and 3 stop hooks) - this shows
context added by session start, then two stop hooks sending their own
additional prompts in a new turn. The model responds with a single
message addressing both. Then when that turn ends, the hooks detect that
they just ran using `stop_hook_active` and decide not to infinite loop

test files for this (unzip, move codex -> .codex):
[codex.zip](https://github.com/user-attachments/files/26075806/codex.zip)

```
› cats


• Running SessionStart hook: lighting the observatory

SessionStart hook (completed)
  warning: Hi, I'm a session start hook for wizard-tower (startup).
  hook context: A wimboltine stonpet is an exotic cuisine from hyperspace

• Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk
  cat facts, cat breeds, cat names, or build something cat-themed in this repo.

• Running Stop hook: checking the tower wards

• Running Stop hook: sacking the guards

• Running Stop hook: hiring the guards

Stop hook (completed)
  warning: Wizard Tower Stop hook reviewed the completed reply (177 chars).

Stop hook (blocked)
  warning: Wizard Tower Stop hook continuing conversation
  feedback: cook the stonpet

Stop hook (blocked)
  warning: Wizard Tower Stop hook continuing conversation
  feedback: eat the cooked stonpet

• Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and
  rested until the hyperspace juices settle.

  Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf:
  smoky, bright, and totally out of this dimension.

• Running Stop hook: checking the tower wards

• Running Stop hook: sacking the guards

• Running Stop hook: hiring the guards

Stop hook (completed)
  warning: Wizard Tower Stop hook reviewed the completed reply (285 chars).

Stop hook (completed)
  warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.

Stop hook (completed)
  warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
```
2026-03-19 10:53:08 -07:00
Eric Traut
01df50cf42 Add thread/shellCommand to app server API surface (#14988)
This PR adds a new `thread/shellCommand` app server API so clients can
implement `!` shell commands. These commands are executed within the
sandbox, and the command text and output are visible to the model.

The internal implementation mirrors the current TUI `!` behavior.
- persist shell command execution as `CommandExecution` thread items,
including source and formatted output metadata
- bridge live and replayed app-server command execution events back into
the existing `tui_app_server` exec rendering path

This PR also wires `tui_app_server` to submit `!` commands through the
new API.
2026-03-18 23:42:40 -06:00
jif-oai
a265d6043e feat: add memory citation to agent message (#14821)
Client side to come
2026-03-18 10:03:38 +00:00
Michael Bolin
b77fe8fefe Apply argument comment lint across codex-rs (#14652)
## Why

Once the repo-local lint exists, `codex-rs` needs to follow the
checked-in convention and CI needs to keep it from drifting. This commit
applies the fallback `/*param*/` style consistently across existing
positional literal call sites without changing those APIs.

The longer-term preference is still to avoid APIs that require comments
by choosing clearer parameter types and call shapes. This PR is
intentionally the mechanical follow-through for the places where the
existing signatures stay in place.

After rebasing onto newer `main`, the rollout also had to cover newly
introduced `tui_app_server` call sites. That made it clear the first cut
of the CI job was too expensive for the common path: it was spending
almost as much time installing `cargo-dylint` and re-testing the lint
crate as a representative test job spends running product tests. The CI
update keeps the full workspace enforcement but trims that extra
overhead from ordinary `codex-rs` PRs.

## What changed

- keep a dedicated `argument_comment_lint` job in `rust-ci`
- mechanically annotate remaining opaque positional literals across
`codex-rs` with exact `/*param*/` comments, including the rebased
`tui_app_server` call sites that now fall under the lint
- keep the checked-in style aligned with the lint policy by using
`/*param*/` and leaving string and char literals uncommented
- cache `cargo-dylint`, `dylint-link`, and the relevant Cargo
registry/git metadata in the lint job
- split changed-path detection so the lint crate's own `cargo test` step
runs only when `tools/argument-comment-lint/*` or `rust-ci.yml` changes
- continue to run the repo wrapper over the `codex-rs` workspace, so
product-code enforcement is unchanged

Most of the code changes in this commit are intentionally mechanical
comment rewrites or insertions driven by the lint itself.

## Verification

- `./tools/argument-comment-lint/run.sh --workspace`
- `cargo test -p codex-tui-app-server -p codex-tui`
- parsed `.github/workflows/rust-ci.yml` locally with PyYAML

---

* -> #14652
* #14651
2026-03-16 16:48:15 -07:00
jif-oai
3f266bcd68 feat: make interrupt state not final for multi-agents (#13850)
Make `interrupted` an agent state and make it not final. As a result, a
`wait` won't return on an interrupted agent and no notification will be
send to the parent agent.

The rationals are:
* If a user interrupt a sub-agent for any reason, you don't want the
parent agent to instantaneously ask the sub-agent to restart
* If a parent agent interrupt a sub-agent, no need to add a noisy
notification in the parent agen
2026-03-16 16:39:40 +00:00
Ahmed Ibrahim
bf5e997b31 Include spawn agent model metadata in app-server items (#14410)
- add model and reasoning effort to app-server collab spawn items and
notifications
- regenerate app-server protocol schemas for the new fields

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 19:25:21 -07:00
Andrei Eternal
244b2d53f4 start of hooks engine (#13276)
(Experimental)

This PR adds a first MVP for hooks, with SessionStart and Stop

The core design is:

- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary

On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:

- new live notifications: hook/started, hook/completed
- persisted/replayed hook results live on Turn.hookRuns
- we intentionally did not add hook-specific ThreadItem variants

Hooks messages are not persisted, they remain ephemeral. The context
changes they add are (they get appended to the user's prompt)
2026-03-10 04:11:31 +00:00
Won Park
229e6d0347 image-gen-event/client_processing (#13512)
enabling client-side to process with image-generation capabilities
(setting app-server)
2026-03-04 16:54:38 -08:00
Ruslan Nigmatullin
69d7a456bb app-server: Replay pending item requests on thread/resume (#12560)
Replay pending client requests after `thread/resume` and emit resolved
notifications when those requests clear so approval/input UI state stays
in sync after reconnects and across subscribed clients.

Affected RPCs:
- `item/commandExecution/requestApproval`
- `item/fileChange/requestApproval`
- `item/tool/requestUserInput`

Motivation:
- Resumed clients need to see pending approval/input requests that were
already outstanding before the reconnect.
- Clients also need an explicit signal when a pending request resolves
or is cleared so stale UI can be removed on turn start, completion, or
interruption.

Implementation notes:
- Use pending client requests from `OutgoingMessageSender` in order to
replay them after `thread/resume` attaches the connection, using
original request ids.
- Emit `serverRequest/resolved` when pending requests are answered
or cleared by lifecycle cleanup.
- Update the app-server protocol schema, generated TypeScript bindings,
and README docs for the replay/resolution flow.

High-level test plan:
- Added automated coverage for replaying pending command execution and
file change approval requests on `thread/resume`.
- Added automated coverage for resolved notifications in command
approval, file change approval, request_user_input, turn start, and turn
interrupt flows.
- Verified schema/docs updates in the relevant protocol and app-server
tests.

Manual testing:
- Tested reconnect/resume with multiple connections.
- Confirmed state stayed in sync between connections.
2026-02-27 12:45:59 -08:00
Owen Lin
a0fd94bde6 feat(app-server): add ThreadItem::DynamicToolCall (#12732)
Previously, clients would call `thread/start` with dynamic_tools set,
and when a model invokes a dynamic tool, it would just make the
server->client `item/tool/call` request and wait for the client's
response to complete the tool call. This works, but it doesn't have an
`item/started` or `item/completed` event.

Now we are doing this:
- [new] emit `item/started` with `DynamicToolCall` populated with the
call arguments
- send an `item/tool/call` server request
- [new] once the client responds, emit `item/completed` with
`DynamicToolCall` populated with the response.

Also, with `persistExtendedHistory: true`, dynamic tool calls are now
reconstructable in `thread/read` and `thread/resume` as
`ThreadItem::DynamicToolCall`.
2026-02-25 12:00:10 -08:00
Michael Bolin
48af93399e feat: use OAI Responses API MessagePhase type directly in App Server v2 (#12422)
https://github.com/openai/codex/pull/10455 introduced the `phase` field,
and then https://github.com/openai/codex/pull/12072 introduced a
`MessagePhase` type in `v2.rs` that paralleled the `MessagePhase` type
in `codex-rs/protocol/src/models.rs`.

The app server protocol prefers `camelCase` while the Responses API uses
`snake_case`, so this meant we had two versions of `MessagePhase` with
different serialization rules. When the app server protocol refers to
types from the Responses API, we use the wire format of the the
Responses API even though it is inconsistent with the app server API.

This PR deletes `MessagePhase` from `v2.rs` and consolidates on the
Responses API version to eliminate confusion.
2026-02-20 20:43:36 -08:00
jif-oai
4d60c803ba feat: cleaner TUI for sub-agents (#12327)
<img width="760" height="496" alt="Screenshot 2026-02-20 at 14 31 25"
src="https://github.com/user-attachments/assets/1983b825-bb47-417e-9925-6f727af56765"
/>
2026-02-20 15:26:33 +00:00
Max Johnson
b06f91c4fe app-server: improve thread resume rejoin flow (#11776)
thread/resume response includes latest turn with all items, in band so
no events are stale or lost

Testing
- e2e tested using app-server-test-client using flow described in
"Testing Thread Rejoin Behavior" in
codex-rs/app-server-test-client/README.md
- e2e tested in codex desktop by reconnecting to a running turn
2026-02-20 05:29:05 +00:00
Jack Mousseau
3a951f8096 Restore phase when loading from history (#12244) 2026-02-19 09:56:56 -08:00
Jack Mousseau
486e60bb55 Add message phase to agent message thread item (#12072) 2026-02-17 20:46:53 -08:00
Owen Lin
efc8d45750 feat(app-server): experimental flag to persist extended history (#11227)
This PR adds an experimental `persist_extended_history` bool flag to
app-server thread APIs so rollout logs can retain a richer set of
EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e.
on `thread/resume`).

### Motivation
Today, our rollout recorder only persists a small subset (e.g. user
message, reasoning, assistant message) of `EventMsg` types, dropping a
good number (like command exec, file change, etc.) that are important
for reconstructing full item history for `thread/resume`, `thread/read`,
and `thread/fork`.

Some clients want to be able to resume a thread without lossiness. This
lossiness is primarily a UI thing, since what the model sees are
`ResponseItem` and not `EventMsg`.

### Approach
This change introduces an opt-in `persist_full_history` flag to preserve
those events when you start/resume/fork a thread (defaults to `false`).

This is done by adding an `EventPersistenceMode` to the rollout
recorder:
- `Limited` (existing behavior, default)
- `Extended` (new opt-in behavior)

In `Extended` mode, persist additional `EventMsg` variants needed for
non-lossy app-server `ThreadItem` reconstruction. We now store the
following ThreadItems that we didn't before:
- web search
- command execution
- patch/file changes
- MCP tool calls
- image view calls
- collab tool outcomes
- context compaction
- review mode enter/exit

For **command executions** in particular, we truncate the output using
the existing `truncate_text` from core to store an upper bound of 10,000
bytes, which is also the default value for truncating tool outputs shown
to the model. This keeps the size of the rollout file and command
execution items returned over the wire reasonable.

And we also persist `EventMsg::Error` which we can now map back to the
Turn's status and populates the Turn's error metadata.

#### Updates to EventMsgs
To truly make `thread/resume` non-lossy, we also needed to persist the
`status` on `EventMsg::CommandExecutionEndEvent` and
`EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a
command failed or was declined (similar for apply_patch). These
EventMsgs were never persisted before so I made it a required field.
2026-02-12 19:34:22 +00:00
Celia Chen
641d5268fa chore: persist turn_id in rollout session and make turn_id uuid based (#11246)
Problem:
1. turn id is constructed in-memory;
2. on resuming threads, turn_id might not be unique;
3. client cannot no the boundary of a turn from rollout files easily.

This PR does three things:
1. persist `task_started` and `task_complete` events;
1. persist `turn_id` in rollout turn events;
5. generate turn_id as unique uuids instead of incrementing it in
memory.

This helps us resolve the issue of clients wanting to have unique turn
ids for resuming a thread, and knowing the boundry of each turn in
rollout files.

example debug logs
```
2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
```

backward compatibility:
if you try to resume an old session without task_started and
task_complete event populated, the following happens:
- If you resume and do nothing: those reconstructed historical IDs can
differ next time you resume.
- If you resume and send a new turn: the new turn gets a fresh UUID from
live submission flow and is persisted, so that new turn’s ID is stable
on later resumes.
I think this behavior is fine, because we only care about deterministic
turn id once a turn is triggered.
2026-02-11 03:56:01 +00:00
Charley Cunningham
ec4a2d07e4 Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)
## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, including the “Implement this
plan?” prompt only when a plan item is present.

## Changes

### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can disprove a tag prefix; implements auto-close on `finish()` for
unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
  - **Normal text** → `AgentMessageContentDelta`
  - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
  (`codex-rs/core/src/codex.rs`)
- Final plan item content is derived from the completed assistant
message (authoritative), not necessarily the concatenated deltas.
- Strips `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persist `ItemCompleted` events only for plan items for rollout replay.
(`codex-rs/core/src/rollout/policy.rs`)
- Guard `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:  
  - keep `<proposed_plan>` out of non-final reasoning/preambles  
  - require exact tag formatting  
  - allow only one `<proposed_plan>` block per turn  
  (`codex-rs/core/templates/collaboration_mode/plan.md`)

### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
EXPERIMENTAL markers and note that deltas may not match the final plan
item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)

### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)

<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>

### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

## Tests
- Added core integration tests:
  - Plan mode strips plan from agent messages.
  - Missing `</proposed_plan>` closes at end-of-message.  
  (`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- Existing app-server plan item tests in v2.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; prompt instructs the model not
to include `<proposed_plan>` outside the final plan message.

## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
2026-01-30 18:59:30 +00:00
charley-oai
1fa8350ae7 Add text element metadata to protocol, app server, and core (#9331)
The second part of breaking up PR
https://github.com/openai/codex/pull/9116

Summary:

- Add `TextElement` / `ByteRange` to protocol user inputs and user
message events with defaults.
- Thread `text_elements` through app-server v1/v2 request handling and
history rebuild.
- Preserve UI metadata only in user input/events (not `ContentItem`)
while keeping local image attachments in user events for rehydration.

Details:

- Protocol: `UserInput::Text` carries `text_elements`;
`UserMessageEvent` carries `text_elements` + `local_images`.
Serialization includes empty vectors for backward compatibility.
- app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
camelCase with conversions; v2 uses its own camelCase wrapper.
- app-server: v1/v2 input mapping includes `text_elements`; thread
history rebuilds include them.
- Core: user event emission preserves UI metadata while model history
stays clean; history replay round-trips the metadata.
2026-01-15 17:26:41 -08:00
charley-oai
4a9c2bcc5a Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
Owen Lin
8b7ec31ba7 feat(app-server): thread/rollback API (#8454)
Add `thread/rollback` to app-server to support IDEs undo-ing the last N
turns of a thread.

For context, an IDE partner will be supporting an "undo" capability
where the IDE (the app-server client) will be responsible for reverting
the local changes made during the last turn. To support this well, we
also need a way to drop the last turn (or more generally, the last N
turns) from the agent's context. This is what `thread/rollback` does.

**Core idea**: A Thread rollback is represented as a persisted event
message (EventMsg::ThreadRollback) in the rollout JSONL file, not by
rewriting history. On resume, both the model's context (core replay) and
the UI turn list (app-server v2's thread history builder) apply these
markers so the pruned history is consistent across live conversations
and `thread/resume`.

Implementation notes:
- Rollback only affects agent context and appends to the rollout file;
clients are responsible for reverting files on disk.
- If a thread rollback is currently in progress, subsequent
`thread/rollback` calls are rejected.
- Because we use `CodexConversation::submit` and codex core tracks
active turns, returning an error on concurrent rollbacks is communicated
via an `EventMsg::Error` with a new variant
`CodexErrorInfo::ThreadRollbackFailed`. app-server watches for that and
sends the BAD_REQUEST RPC response.

Tests cover thread rollbacks in both core and app-server, including when
`num_turns` > existing turns (which clears all turns).

**Note**: this explicitly does **not** behave like `/undo` which we just
removed from the CLI, which does the opposite of what `thread/rollback`
does. `/undo` reverts local changes via ghost commits/snapshots and does
not modify the agent's context / conversation history.
2026-01-06 21:23:48 +00:00
Owen Lin
77c457121e fix: remove serde(flatten) annotation for TurnError (#7499)
The problem with using `serde(flatten)` on Turn status is that it
conditionally serializes the `error` field, which is not the pattern we
want in API v2 where all fields on an object should always be returned.

```
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct Turn {
    pub id: String,
    /// Only populated on a `thread/resume` response.
    /// For all other responses and notifications returning a Turn,
    /// the items field will be an empty list.
    pub items: Vec<ThreadItem>,
    #[serde(flatten)]
    pub status: TurnStatus,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "status", rename_all = "camelCase")]
#[ts(tag = "status", export_to = "v2/")]
pub enum TurnStatus {
    Completed,
    Interrupted,
    Failed { error: TurnError },
    InProgress,
}
```

serializes to:
```
{
  "id": "turn-123",
  "items": [],
  "status": "completed"
}

{
  "id": "turn-123",
  "items": [],
  "status": "failed",
  "error": {
    "message": "Tool timeout",
    "codexErrorInfo": null
  }
}
```

Instead we want:
```
{
  "id": "turn-123",
  "items": [],
  "status": "completed",
  "error": null
}

{
  "id": "turn-123",
  "items": [],
  "status": "failed",
  "error": {
    "message": "Tool timeout",
    "codexErrorInfo": null
  }
}
```
2025-12-02 21:39:10 +00:00
Owen Lin
1924500250 [app-server] populate thread>turns>items on thread/resume (#6848)
This PR allows clients to render historical messages when resuming a
thread via `thread/resume` by reading from the list of `EventMsg`
payloads loaded from the rollout, and then transforming them into Turns
and ThreadItems to be returned on the `Thread` object.

This is implemented by leveraging `SessionConfiguredNotification` which
returns this list of `EventMsg` objects when resuming a conversation,
and then applying a stateful `ThreadHistoryBuilder` that parses from
this EventMsg log and transforms it into Turns and ThreadItems.

Note that we only persist a subset of `EventMsg`s in a rollout as
defined in `policy.rs`, so we lose fidelity whenever we resume a thread
compared to when we streamed the thread's turns originally. However,
this behavior is at parity with the legacy API.
2025-11-19 15:58:09 +00:00