Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)

## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, showing the “Implement this
plan?” prompt only when a plan item is present.

## Changes

### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can rule out a tag prefix, and auto-closes unterminated tags on
`finish()`. (`codex-rs/core/src/tagged_block_parser.rs`)
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
  - **Normal text** → `AgentMessageContentDelta`
  - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
  (`codex-rs/core/src/codex.rs`)
- Final plan item content is derived from the completed assistant
message (authoritative), not necessarily the concatenated deltas.
- Stripped `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persisted `ItemCompleted` events for rollout replay only when the item
is a plan item. (`codex-rs/core/src/rollout/policy.rs`)
- Guarded the `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:  
  - keep `<proposed_plan>` out of non-final reasoning/preambles  
  - require exact tag formatting  
  - allow only one `<proposed_plan>` block per turn  
  (`codex-rs/core/templates/collaboration_mode/plan.md`)
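
The line-based buffering described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in `tagged_block_parser.rs`: the `Segment` enum, the `LineTagParser` type, and its method names are hypothetical. It also assumes that a tag must occupy a whole line with no surrounding whitespace and that the newline following a tag is swallowed.

```rust
/// Output of the sketch parser (hypothetical names, not the crate's API).
#[derive(Debug, Clone, PartialEq)]
pub enum Segment {
    Normal(String),
    PlanStart,
    PlanDelta(String),
    PlanEnd,
}

const OPEN: &str = "<proposed_plan>";
const CLOSE: &str = "</proposed_plan>";

#[derive(Default)]
pub struct LineTagParser {
    line: String,      // current line, held back while it could still be a tag
    in_plan: bool,     // true between an open tag and its close
    passthrough: bool, // current line is already known not to be a tag
}

impl LineTagParser {
    /// Feed one streamed chunk; returns whatever can be emitted so far.
    pub fn push(&mut self, chunk: &str) -> Vec<Segment> {
        let mut out = Vec::new();
        for ch in chunk.chars() {
            if self.passthrough {
                self.emit_text(&mut out, &ch.to_string());
                self.passthrough = ch != '\n';
            } else if ch == '\n' {
                let line = std::mem::take(&mut self.line);
                self.end_line(&mut out, line, true);
            } else {
                self.line.push(ch);
                let tag = if self.in_plan { CLOSE } else { OPEN };
                if !tag.starts_with(self.line.as_str()) {
                    // Tag prefix disproved: flush the buffered text.
                    let text = std::mem::take(&mut self.line);
                    self.emit_text(&mut out, &text);
                    self.passthrough = true;
                }
            }
        }
        out
    }

    /// Flush at end of message; an unterminated block is auto-closed.
    pub fn finish(&mut self) -> Vec<Segment> {
        let mut out = Vec::new();
        let line = std::mem::take(&mut self.line);
        self.end_line(&mut out, line, false);
        if self.in_plan {
            out.push(Segment::PlanEnd);
            self.in_plan = false;
        }
        self.passthrough = false;
        out
    }

    fn end_line(&mut self, out: &mut Vec<Segment>, mut line: String, had_newline: bool) {
        if !self.in_plan && line == OPEN {
            out.push(Segment::PlanStart);
            self.in_plan = true;
        } else if self.in_plan && line == CLOSE {
            out.push(Segment::PlanEnd);
            self.in_plan = false;
        } else {
            if had_newline {
                line.push('\n');
            }
            self.emit_text(out, &line);
        }
    }

    fn emit_text(&self, out: &mut Vec<Segment>, text: &str) {
        if text.is_empty() {
            return;
        }
        // Every state change pushes a marker first, so a trailing text
        // segment always has the same kind as the current state.
        if let Some(Segment::Normal(s) | Segment::PlanDelta(s)) = out.last_mut() {
            s.push_str(text);
            return;
        }
        out.push(if self.in_plan {
            Segment::PlanDelta(text.to_string())
        } else {
            Segment::Normal(text.to_string())
        });
    }
}
```

In this sketch a caller would map `Normal` to `AgentMessageContentDelta`, `PlanDelta` to `PlanDelta`, and `PlanStart`/`PlanEnd` to the start and completion of a plan item. Holding back only the current line keeps latency low, since ordinary text is released as soon as its first characters stop matching a tag.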

### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
EXPERIMENTAL markers and a note that deltas may not match the final plan
item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)
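
As a rough illustration of how persistence and resume fit together, the sketch below filters completed items down to plan items and later rebuilds the plan list from what was persisted. The types here (`PlanItem`, `TurnItem`, `ItemCompleted`) are simplified stand-ins with assumed fields, and both function names are hypothetical. How the real rollout policy treats other event kinds is not modeled.

```rust
/// Simplified stand-ins for the protocol types; field layout is assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct PlanItem {
    pub id: String,
    pub text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum TurnItem {
    AgentMessage(String),
    Plan(PlanItem),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ItemCompleted {
    pub item: TurnItem,
}

/// Hypothetical policy check: an `ItemCompleted` event is written to the
/// rollout only when it carries a plan item.
pub fn persist_for_replay(event: &ItemCompleted) -> bool {
    matches!(event.item, TurnItem::Plan(_))
}

/// Hypothetical resume step: recover plan items, in order, from the
/// persisted `ItemCompleted` events.
pub fn rebuild_plan_items(persisted: &[ItemCompleted]) -> Vec<PlanItem> {
    persisted
        .iter()
        .filter_map(|event| match &event.item {
            TurnItem::Plan(plan) => Some(plan.clone()),
            _ => None,
        })
        .collect()
}
```

Because only the completed item is replayed, a resumed thread shows the final plan text rather than whatever the streamed deltas happened to contain.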

### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)
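
The prompt gating can be pictured with a small sketch. `TurnCell` and `should_offer_implement_prompt` are hypothetical names chosen for illustration; the actual check in `chatwidget.rs` may be structured differently.

```rust
/// Hypothetical view of what a finished turn produced in the TUI.
#[derive(Debug, Clone, PartialEq)]
pub enum TurnCell {
    AgentMessage(String),
    ProposedPlan(String),
}

/// Offer "Implement this plan?" only if the turn produced a plan item.
pub fn should_offer_implement_prompt(cells: &[TurnCell]) -> bool {
    cells
        .iter()
        .any(|cell| matches!(cell, TurnCell::ProposedPlan(_)))
}
```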

<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>

### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

## Tests
- Added core integration tests:
  - Plan mode strips plan from agent messages.
  - Missing `</proposed_plan>` closes at end-of-message.  
  (`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- Existing app-server plan item tests in v2.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; prompt instructs the model not
to include `<proposed_plan>` outside the final plan message.
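
To make the "authoritative final item" behavior concrete, here is a minimal sketch of deriving both the plan text and the stripped assistant text from a completed message in one pass. The function name `split_completed_message` is hypothetical. The sketch assumes tags sit on their own lines, that an unterminated block runs to the end of the message, and that any additional blocks are appended to the same plan text.

```rust
const OPEN: &str = "<proposed_plan>";
const CLOSE: &str = "</proposed_plan>";

/// Split a completed assistant message into (assistant text, plan text).
/// Tag lines are dropped together with their trailing newline.
pub fn split_completed_message(message: &str) -> (String, Option<String>) {
    let mut agent = String::new();
    let mut plan: Option<String> = None;
    let mut in_plan = false;

    for line in message.split_inclusive('\n') {
        let bare = line.strip_suffix('\n').unwrap_or(line);
        match (in_plan, bare) {
            (false, OPEN) => {
                in_plan = true;
                plan.get_or_insert_with(String::new);
            }
            (true, CLOSE) => in_plan = false,
            (true, _) => plan.get_or_insert_with(String::new).push_str(line),
            (false, _) => agent.push_str(line),
        }
    }
    (agent, plan)
}
```

For the message used in the integration tests, `"Intro\n<proposed_plan>\n- Step 1\n- Step 2\n</proposed_plan>\nOutro"`, this yields the assistant text `"Intro\nOutro"` and the plan text `"- Step 1\n- Step 2\n"`.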

## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
Charley Cunningham
2026-01-30 10:59:30 -08:00
committed by GitHub
parent 40bf11bd52
commit ec4a2d07e4
36 changed files with 2021 additions and 42 deletions

@@ -5,6 +5,10 @@ use codex_core::protocol::EventMsg;
use codex_core::protocol::ItemCompletedEvent;
use codex_core::protocol::ItemStartedEvent;
use codex_core::protocol::Op;
use codex_protocol::config_types::CollaborationMode;
use codex_protocol::config_types::ModeKind;
use codex_protocol::config_types::Settings;
use codex_protocol::items::AgentMessageContent;
use codex_protocol::items::TurnItem;
use codex_protocol::models::WebSearchAction;
use codex_protocol::user_input::ByteRange;
@@ -27,6 +31,7 @@ use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_match;
use pretty_assertions::assert_eq;
@@ -327,6 +332,268 @@ async fn agent_message_content_delta_has_item_metadata() -> anyhow::Result<()> {
Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn plan_mode_emits_plan_item_from_proposed_plan_block() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let TestCodex {
        codex,
        session_configured,
        ..
    } = test_codex().build(&server).await?;

    let plan_block = "<proposed_plan>\n- Step 1\n- Step 2\n</proposed_plan>\n";
    let full_message = format!("Intro\n{plan_block}Outro");
    let stream = sse(vec![
        ev_response_created("resp-1"),
        ev_message_item_added("msg-1", ""),
        ev_output_text_delta(&full_message),
        ev_assistant_message("msg-1", &full_message),
        ev_completed("resp-1"),
    ]);
    mount_sse_once(&server, stream).await;

    let collaboration_mode = CollaborationMode {
        mode: ModeKind::Plan,
        settings: Settings {
            model: session_configured.model.clone(),
            reasoning_effort: None,
            developer_instructions: None,
        },
    };

    codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "please plan".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: std::env::current_dir()?,
            approval_policy: codex_core::protocol::AskForApproval::Never,
            sandbox_policy: codex_core::protocol::SandboxPolicy::DangerFullAccess,
            model: session_configured.model.clone(),
            effort: None,
            summary: codex_protocol::config_types::ReasoningSummary::Auto,
            collaboration_mode: Some(collaboration_mode),
            personality: None,
        })
        .await?;

    let plan_delta = wait_for_event_match(&codex, |ev| match ev {
        EventMsg::PlanDelta(event) => Some(event.clone()),
        _ => None,
    })
    .await;

    let plan_completed = wait_for_event_match(&codex, |ev| match ev {
        EventMsg::ItemCompleted(ItemCompletedEvent {
            item: TurnItem::Plan(item),
            ..
        }) => Some(item.clone()),
        _ => None,
    })
    .await;

    assert_eq!(
        plan_delta.thread_id,
        session_configured.session_id.to_string()
    );
    assert_eq!(plan_delta.delta, "- Step 1\n- Step 2\n");
    assert_eq!(plan_completed.text, "- Step 1\n- Step 2\n");

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn plan_mode_strips_plan_from_agent_messages() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let TestCodex {
        codex,
        session_configured,
        ..
    } = test_codex().build(&server).await?;

    let plan_block = "<proposed_plan>\n- Step 1\n- Step 2\n</proposed_plan>\n";
    let full_message = format!("Intro\n{plan_block}Outro");
    let stream = sse(vec![
        ev_response_created("resp-1"),
        ev_message_item_added("msg-1", ""),
        ev_output_text_delta(&full_message),
        ev_assistant_message("msg-1", &full_message),
        ev_completed("resp-1"),
    ]);
    mount_sse_once(&server, stream).await;

    let collaboration_mode = CollaborationMode {
        mode: ModeKind::Plan,
        settings: Settings {
            model: session_configured.model.clone(),
            reasoning_effort: None,
            developer_instructions: None,
        },
    };

    codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "please plan".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: std::env::current_dir()?,
            approval_policy: codex_core::protocol::AskForApproval::Never,
            sandbox_policy: codex_core::protocol::SandboxPolicy::DangerFullAccess,
            model: session_configured.model.clone(),
            effort: None,
            summary: codex_protocol::config_types::ReasoningSummary::Auto,
            collaboration_mode: Some(collaboration_mode),
            personality: None,
        })
        .await?;

    let mut agent_deltas = Vec::new();
    let mut plan_delta = None;
    let mut agent_item = None;
    let mut plan_item = None;

    while plan_delta.is_none() || agent_item.is_none() || plan_item.is_none() {
        let ev = wait_for_event(&codex, |_| true).await;
        match ev {
            EventMsg::AgentMessageContentDelta(event) => {
                agent_deltas.push(event.delta);
            }
            EventMsg::PlanDelta(event) => {
                plan_delta = Some(event.delta);
            }
            EventMsg::ItemCompleted(ItemCompletedEvent {
                item: TurnItem::AgentMessage(item),
                ..
            }) => {
                agent_item = Some(item);
            }
            EventMsg::ItemCompleted(ItemCompletedEvent {
                item: TurnItem::Plan(item),
                ..
            }) => {
                plan_item = Some(item);
            }
            _ => {}
        }
    }

    let agent_text = agent_deltas.concat();
    assert_eq!(agent_text, "Intro\nOutro");
    assert_eq!(plan_delta.unwrap(), "- Step 1\n- Step 2\n");
    assert_eq!(plan_item.unwrap().text, "- Step 1\n- Step 2\n");

    let agent_text_from_item: String = agent_item
        .unwrap()
        .content
        .iter()
        .map(|entry| match entry {
            AgentMessageContent::Text { text } => text.as_str(),
        })
        .collect();
    assert_eq!(agent_text_from_item, "Intro\nOutro");

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn plan_mode_handles_missing_plan_close_tag() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let TestCodex {
        codex,
        session_configured,
        ..
    } = test_codex().build(&server).await?;

    let full_message = "Intro\n<proposed_plan>\n- Step 1\n";
    let stream = sse(vec![
        ev_response_created("resp-1"),
        ev_message_item_added("msg-1", ""),
        ev_output_text_delta(full_message),
        ev_assistant_message("msg-1", full_message),
        ev_completed("resp-1"),
    ]);
    mount_sse_once(&server, stream).await;

    let collaboration_mode = CollaborationMode {
        mode: ModeKind::Plan,
        settings: Settings {
            model: session_configured.model.clone(),
            reasoning_effort: None,
            developer_instructions: None,
        },
    };

    codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "please plan".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
            cwd: std::env::current_dir()?,
            approval_policy: codex_core::protocol::AskForApproval::Never,
            sandbox_policy: codex_core::protocol::SandboxPolicy::DangerFullAccess,
            model: session_configured.model.clone(),
            effort: None,
            summary: codex_protocol::config_types::ReasoningSummary::Auto,
            collaboration_mode: Some(collaboration_mode),
            personality: None,
        })
        .await?;

    let mut plan_delta = None;
    let mut plan_item = None;
    let mut agent_item = None;

    while plan_delta.is_none() || plan_item.is_none() || agent_item.is_none() {
        let ev = wait_for_event(&codex, |_| true).await;
        match ev {
            EventMsg::PlanDelta(event) => {
                plan_delta = Some(event.delta);
            }
            EventMsg::ItemCompleted(ItemCompletedEvent {
                item: TurnItem::Plan(item),
                ..
            }) => {
                plan_item = Some(item);
            }
            EventMsg::ItemCompleted(ItemCompletedEvent {
                item: TurnItem::AgentMessage(item),
                ..
            }) => {
                agent_item = Some(item);
            }
            _ => {}
        }
    }

    assert_eq!(plan_delta.unwrap(), "- Step 1\n");
    assert_eq!(plan_item.unwrap().text, "- Step 1\n");

    let agent_text_from_item: String = agent_item
        .unwrap()
        .content
        .iter()
        .map(|entry| match entry {
            AgentMessageContent::Text { text } => text.as_str(),
        })
        .collect();
    assert_eq!(agent_text_from_item, "Intro\n");

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reasoning_content_delta_has_item_metadata() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));