Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)

## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, showing the “Implement this
plan?” prompt only when a plan item is present.

## Changes

### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can rule out a tag prefix, and auto-closes unterminated tags on
`finish()`; see the usage sketch after this list.
`codex-rs/core/src/tagged_block_parser.rs`
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
  - **Normal text** → `AgentMessageContentDelta`
  - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
  (`codex-rs/core/src/codex.rs`)
- The final plan item's text is derived from the completed assistant
message (authoritative), not from the concatenated deltas, which may
differ.
- Strips `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persist `ItemCompleted` events for plan items only, so proposed plans
can be replayed from rollouts on resume.
(`codex-rs/core/src/rollout/policy.rs`)
- Guard `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:  
  - keep `<proposed_plan>` out of non-final reasoning/preambles  
  - require exact tag formatting  
  - allow only one `<proposed_plan>` block per turn  
  (`codex-rs/core/templates/collaboration_mode/plan.md`)
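
For orientation, a condensed usage sketch of the new parser (the types are
crate-internal, so this only compiles inside `codex-core`; it mirrors the
parser unit tests added further down):

```rust
use crate::proposed_plan_parser::{ProposedPlanParser, ProposedPlanSegment};

// Deltas may split a tag across chunks; the parser buffers each line until it
// can decide whether the line is a tag line.
let mut parser = ProposedPlanParser::new();
let mut segments = Vec::new();
for chunk in ["Intro\n<propos", "ed_plan>\n- Step 1\n</proposed_plan>\nOutro"] {
    segments.extend(parser.parse(chunk));
}
segments.extend(parser.finish()); // would auto-close an unterminated block

assert_eq!(
    segments,
    vec![
        ProposedPlanSegment::Normal("Intro\n".to_string()),
        ProposedPlanSegment::ProposedPlanStart,
        ProposedPlanSegment::ProposedPlanDelta("- Step 1\n".to_string()),
        ProposedPlanSegment::ProposedPlanEnd,
        ProposedPlanSegment::Normal("Outro".to_string()),
    ]
);
```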

### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification`, marked
EXPERIMENTAL, with a note that deltas may not exactly match the final
plan item; see the shape sketch after this list.
(`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)
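
For reference, a rough sketch of the new v2 shapes, modeled on the core
`PlanDeltaEvent`/`PlanItem` fields added in this PR; the exact v2 field names
and serialization live in `protocol/v2.rs` and may differ:

```rust
// Sketch only (EXPERIMENTAL surface), not the actual v2 definitions.
pub struct PlanDeltaNotification {
    pub thread_id: String,
    pub turn_id: String,
    pub item_id: String,
    /// Streamed plan text; may not exactly match the final plan item.
    pub delta: String,
}

pub enum ThreadItem {
    // ...existing variants...
    /// Completed proposed plan; `text` is the authoritative plan content.
    Plan { id: String, text: String },
}
```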

### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)

<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>

### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

## Tests
- Added core integration tests:
  - Plan mode strips plan from agent messages.
  - Missing `</proposed_plan>` closes at end-of-message.  
  (`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- New app-server plan item tests in v2 (added in this PR).
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan` (see the
handling sketch below).
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; the prompt instructs the model
not to include `<proposed_plan>` outside the final plan message.
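
A minimal client-side handling sketch, assuming a consumer of the core
`EventMsg` stream (v2 clients receive the mapped notifications instead);
`plan_buffer` and `render_proposed_plan` are hypothetical:

```rust
match event {
    // Plan text streams separately from regular assistant deltas in Plan Mode.
    EventMsg::PlanDelta(delta) => plan_buffer.push_str(&delta.delta),
    EventMsg::ItemCompleted(completed) => {
        // Prefer the completed item's text over the concatenated deltas;
        // it is derived from the final assistant message and is authoritative.
        if let TurnItem::Plan(plan) = completed.item {
            render_proposed_plan(&plan.text);
        }
    }
    _ => {}
}
```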

## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
Commit `ec4a2d07e4` (parent `40bf11bd52`), authored by Charley Cunningham
and committed via GitHub on 2026-01-30 10:59:30 -08:00. 36 changed files
with 2021 additions and 42 deletions.


@@ -146,6 +146,7 @@ mod tests {
use crate::config::Config;
use crate::config::ConfigBuilder;
use assert_matches::assert_matches;
use codex_protocol::config_types::ModeKind;
use codex_protocol::protocol::ErrorEvent;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::TurnAbortReason;
@@ -231,6 +232,7 @@ mod tests {
async fn on_event_updates_status_from_task_started() {
let status = agent_status_from_event(&EventMsg::TurnStarted(TurnStartedEvent {
model_context_window: None,
collaboration_mode_kind: ModeKind::Custom,
}));
assert_eq!(status, Some(AgentStatus::Running));
}


@@ -30,6 +30,7 @@ use crate::rollout::session_index;
use crate::stream_events_utils::HandleOutputCtx;
use crate::stream_events_utils::handle_non_tool_response_item;
use crate::stream_events_utils::handle_output_item_done;
use crate::stream_events_utils::last_assistant_message_from_item;
use crate::terminal;
use crate::transport_manager::TransportManager;
use crate::truncate::TruncationPolicy;
@@ -44,6 +45,7 @@ use codex_protocol::config_types::Settings;
use codex_protocol::config_types::WebSearchMode;
use codex_protocol::dynamic_tools::DynamicToolResponse;
use codex_protocol::dynamic_tools::DynamicToolSpec;
use codex_protocol::items::PlanItem;
use codex_protocol::items::TurnItem;
use codex_protocol::items::UserMessageItem;
use codex_protocol::models::BaseInstructions;
@@ -127,6 +129,9 @@ use crate::mentions::collect_explicit_app_paths;
use crate::mentions::collect_tool_mentions_from_messages;
use crate::model_provider_info::CHAT_WIRE_API_DEPRECATION_SUMMARY;
use crate::project_doc::get_user_instructions;
use crate::proposed_plan_parser::ProposedPlanParser;
use crate::proposed_plan_parser::ProposedPlanSegment;
use crate::proposed_plan_parser::extract_proposed_plan_text;
use crate::protocol::AgentMessageContentDeltaEvent;
use crate::protocol::AgentReasoningSectionBreakEvent;
use crate::protocol::ApplyPatchApprovalRequestEvent;
@@ -139,6 +144,7 @@ use crate::protocol::EventMsg;
use crate::protocol::ExecApprovalRequestEvent;
use crate::protocol::McpServerRefreshConfig;
use crate::protocol::Op;
use crate::protocol::PlanDeltaEvent;
use crate::protocol::RateLimitSnapshot;
use crate::protocol::ReasoningContentDeltaEvent;
use crate::protocol::ReasoningRawContentDeltaEvent;
@@ -482,6 +488,7 @@ pub(crate) struct TurnContext {
pub(crate) developer_instructions: Option<String>,
pub(crate) compact_prompt: Option<String>,
pub(crate) user_instructions: Option<String>,
pub(crate) collaboration_mode_kind: ModeKind,
pub(crate) personality: Option<Personality>,
pub(crate) approval_policy: AskForApproval,
pub(crate) sandbox_policy: SandboxPolicy,
@@ -682,6 +689,7 @@ impl Session {
developer_instructions: session_configuration.developer_instructions.clone(),
compact_prompt: session_configuration.compact_prompt.clone(),
user_instructions: session_configuration.user_instructions.clone(),
collaboration_mode_kind: session_configuration.collaboration_mode.mode,
personality: session_configuration.personality,
approval_policy: session_configuration.approval_policy.value(),
sandbox_policy: session_configuration.sandbox_policy.get().clone(),
@@ -3196,6 +3204,7 @@ async fn spawn_review_thread(
developer_instructions: None,
user_instructions: None,
compact_prompt: parent_turn_context.compact_prompt.clone(),
collaboration_mode_kind: parent_turn_context.collaboration_mode_kind,
personality: parent_turn_context.personality,
approval_policy: parent_turn_context.approval_policy,
sandbox_policy: parent_turn_context.sandbox_policy.clone(),
@@ -3310,6 +3319,7 @@ pub(crate) async fn run_turn(
let total_usage_tokens = sess.get_total_token_usage().await;
let event = EventMsg::TurnStarted(TurnStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
collaboration_mode_kind: turn_context.collaboration_mode_kind,
});
sess.send_event(&turn_context, event).await;
if total_usage_tokens >= auto_compact_limit {
@@ -3759,6 +3769,381 @@ struct SamplingRequestResult {
last_agent_message: Option<String>,
}
/// Ephemeral per-response state for streaming a single proposed plan.
/// This is intentionally not persisted or stored in session/state since it
/// only exists while a response is actively streaming. The final plan text
/// is extracted from the completed assistant message.
/// Tracks a single proposed plan item across a streaming response.
struct ProposedPlanItemState {
item_id: String,
started: bool,
completed: bool,
}
/// Per-item plan parsers so we can buffer text while detecting `<proposed_plan>`
/// tags without ever mixing buffered lines across item ids.
struct PlanParsers {
assistant: HashMap<String, ProposedPlanParser>,
}
impl PlanParsers {
fn new() -> Self {
Self {
assistant: HashMap::new(),
}
}
fn assistant_parser_mut(&mut self, item_id: &str) -> &mut ProposedPlanParser {
self.assistant
.entry(item_id.to_string())
.or_insert_with(ProposedPlanParser::new)
}
fn take_assistant_parser(&mut self, item_id: &str) -> Option<ProposedPlanParser> {
self.assistant.remove(item_id)
}
fn drain_assistant_parsers(&mut self) -> Vec<(String, ProposedPlanParser)> {
self.assistant.drain().collect()
}
}
/// Aggregated state used only while streaming a plan-mode response.
/// Includes per-item parsers, deferred agent message bookkeeping, and the plan item lifecycle.
struct PlanModeStreamState {
/// Per-item parsers for assistant streams in plan mode.
plan_parsers: PlanParsers,
/// Agent message items started by the model but deferred until we see non-plan text.
pending_agent_message_items: HashMap<String, TurnItem>,
/// Agent message items whose start notification has been emitted.
started_agent_message_items: HashSet<String>,
/// Leading whitespace buffered until we see non-whitespace text for an item.
leading_whitespace_by_item: HashMap<String, String>,
/// Tracks plan item lifecycle while streaming plan output.
plan_item_state: ProposedPlanItemState,
}
impl PlanModeStreamState {
fn new(turn_id: &str) -> Self {
Self {
plan_parsers: PlanParsers::new(),
pending_agent_message_items: HashMap::new(),
started_agent_message_items: HashSet::new(),
leading_whitespace_by_item: HashMap::new(),
plan_item_state: ProposedPlanItemState::new(turn_id),
}
}
}
impl ProposedPlanItemState {
fn new(turn_id: &str) -> Self {
Self {
item_id: format!("{turn_id}-plan"),
started: false,
completed: false,
}
}
async fn start(&mut self, sess: &Session, turn_context: &TurnContext) {
if self.started || self.completed {
return;
}
self.started = true;
let item = TurnItem::Plan(PlanItem {
id: self.item_id.clone(),
text: String::new(),
});
sess.emit_turn_item_started(turn_context, &item).await;
}
async fn push_delta(&mut self, sess: &Session, turn_context: &TurnContext, delta: &str) {
if self.completed {
return;
}
if delta.is_empty() {
return;
}
let event = PlanDeltaEvent {
thread_id: sess.conversation_id.to_string(),
turn_id: turn_context.sub_id.clone(),
item_id: self.item_id.clone(),
delta: delta.to_string(),
};
sess.send_event(turn_context, EventMsg::PlanDelta(event))
.await;
}
async fn complete_with_text(
&mut self,
sess: &Session,
turn_context: &TurnContext,
text: String,
) {
if self.completed || !self.started {
return;
}
self.completed = true;
let item = TurnItem::Plan(PlanItem {
id: self.item_id.clone(),
text,
});
sess.emit_turn_item_completed(turn_context, item).await;
}
}
/// In plan mode we defer agent message starts until the parser emits non-plan
/// text. The parser buffers each line until it can rule out a tag prefix, so
/// plan-only outputs never show up as empty assistant messages.
async fn maybe_emit_pending_agent_message_start(
sess: &Session,
turn_context: &TurnContext,
state: &mut PlanModeStreamState,
item_id: &str,
) {
if state.started_agent_message_items.contains(item_id) {
return;
}
if let Some(item) = state.pending_agent_message_items.remove(item_id) {
sess.emit_turn_item_started(turn_context, &item).await;
state
.started_agent_message_items
.insert(item_id.to_string());
}
}
/// Agent messages are text-only today; concatenate all text entries.
fn agent_message_text(item: &codex_protocol::items::AgentMessageItem) -> String {
item.content
.iter()
.map(|entry| match entry {
codex_protocol::items::AgentMessageContent::Text { text } => text.as_str(),
})
.collect()
}
/// Split the stream into normal assistant text vs. proposed plan content.
/// Normal text becomes AgentMessage deltas; plan content becomes PlanDelta +
/// TurnItem::Plan.
async fn handle_plan_segments(
sess: &Session,
turn_context: &TurnContext,
state: &mut PlanModeStreamState,
item_id: &str,
segments: Vec<ProposedPlanSegment>,
) {
for segment in segments {
match segment {
ProposedPlanSegment::Normal(delta) => {
if delta.is_empty() {
continue;
}
let has_non_whitespace = delta.chars().any(|ch| !ch.is_whitespace());
if !has_non_whitespace && !state.started_agent_message_items.contains(item_id) {
let entry = state
.leading_whitespace_by_item
.entry(item_id.to_string())
.or_default();
entry.push_str(&delta);
continue;
}
let delta = if !state.started_agent_message_items.contains(item_id) {
if let Some(prefix) = state.leading_whitespace_by_item.remove(item_id) {
format!("{prefix}{delta}")
} else {
delta
}
} else {
delta
};
maybe_emit_pending_agent_message_start(sess, turn_context, state, item_id).await;
let event = AgentMessageContentDeltaEvent {
thread_id: sess.conversation_id.to_string(),
turn_id: turn_context.sub_id.clone(),
item_id: item_id.to_string(),
delta,
};
sess.send_event(turn_context, EventMsg::AgentMessageContentDelta(event))
.await;
}
ProposedPlanSegment::ProposedPlanStart => {
if !state.plan_item_state.completed {
state.plan_item_state.start(sess, turn_context).await;
}
}
ProposedPlanSegment::ProposedPlanDelta(delta) => {
if !state.plan_item_state.completed {
if !state.plan_item_state.started {
state.plan_item_state.start(sess, turn_context).await;
}
state
.plan_item_state
.push_delta(sess, turn_context, &delta)
.await;
}
}
ProposedPlanSegment::ProposedPlanEnd => {}
}
}
}
/// Flush any buffered proposed-plan segments when a specific assistant message ends.
async fn flush_proposed_plan_segments_for_item(
sess: &Session,
turn_context: &TurnContext,
state: &mut PlanModeStreamState,
item_id: &str,
) {
let Some(mut parser) = state.plan_parsers.take_assistant_parser(item_id) else {
return;
};
let segments = parser.finish();
if segments.is_empty() {
return;
}
handle_plan_segments(sess, turn_context, state, item_id, segments).await;
}
/// Flush any remaining assistant plan parsers when the response completes.
async fn flush_proposed_plan_segments_all(
sess: &Session,
turn_context: &TurnContext,
state: &mut PlanModeStreamState,
) {
for (item_id, mut parser) in state.plan_parsers.drain_assistant_parsers() {
let segments = parser.finish();
if segments.is_empty() {
continue;
}
handle_plan_segments(sess, turn_context, state, &item_id, segments).await;
}
}
/// Emit completion for plan items by parsing the finalized assistant message.
async fn maybe_complete_plan_item_from_message(
sess: &Session,
turn_context: &TurnContext,
state: &mut PlanModeStreamState,
item: &ResponseItem,
) {
if let ResponseItem::Message { role, content, .. } = item
&& role == "assistant"
{
let mut text = String::new();
for entry in content {
if let ContentItem::OutputText { text: chunk } = entry {
text.push_str(chunk);
}
}
if let Some(plan_text) = extract_proposed_plan_text(&text) {
if !state.plan_item_state.started {
state.plan_item_state.start(sess, turn_context).await;
}
state
.plan_item_state
.complete_with_text(sess, turn_context, plan_text)
.await;
}
}
}
/// Emit a completed agent message in plan mode, respecting deferred starts.
async fn emit_agent_message_in_plan_mode(
sess: &Session,
turn_context: &TurnContext,
agent_message: codex_protocol::items::AgentMessageItem,
state: &mut PlanModeStreamState,
) {
let agent_message_id = agent_message.id.clone();
let text = agent_message_text(&agent_message);
if text.trim().is_empty() {
state.pending_agent_message_items.remove(&agent_message_id);
state.started_agent_message_items.remove(&agent_message_id);
return;
}
maybe_emit_pending_agent_message_start(sess, turn_context, state, &agent_message_id).await;
if !state
.started_agent_message_items
.contains(&agent_message_id)
{
let start_item = state
.pending_agent_message_items
.remove(&agent_message_id)
.unwrap_or_else(|| {
TurnItem::AgentMessage(codex_protocol::items::AgentMessageItem {
id: agent_message_id.clone(),
content: Vec::new(),
})
});
sess.emit_turn_item_started(turn_context, &start_item).await;
state
.started_agent_message_items
.insert(agent_message_id.clone());
}
sess.emit_turn_item_completed(turn_context, TurnItem::AgentMessage(agent_message))
.await;
state.started_agent_message_items.remove(&agent_message_id);
}
/// Emit completion for a plan-mode turn item, handling agent messages specially.
async fn emit_turn_item_in_plan_mode(
sess: &Session,
turn_context: &TurnContext,
turn_item: TurnItem,
previously_active_item: Option<&TurnItem>,
state: &mut PlanModeStreamState,
) {
match turn_item {
TurnItem::AgentMessage(agent_message) => {
emit_agent_message_in_plan_mode(sess, turn_context, agent_message, state).await;
}
_ => {
if previously_active_item.is_none() {
sess.emit_turn_item_started(turn_context, &turn_item).await;
}
sess.emit_turn_item_completed(turn_context, turn_item).await;
}
}
}
/// Handle a completed assistant response item in plan mode, returning true if handled.
async fn handle_assistant_item_done_in_plan_mode(
sess: &Session,
turn_context: &TurnContext,
item: &ResponseItem,
state: &mut PlanModeStreamState,
previously_active_item: Option<&TurnItem>,
last_agent_message: &mut Option<String>,
) -> bool {
if let ResponseItem::Message { role, .. } = item
&& role == "assistant"
{
maybe_complete_plan_item_from_message(sess, turn_context, state, item).await;
if let Some(turn_item) = handle_non_tool_response_item(item, true).await {
emit_turn_item_in_plan_mode(
sess,
turn_context,
turn_item,
previously_active_item,
state,
)
.await;
}
sess.record_conversation_items(turn_context, std::slice::from_ref(item))
.await;
if let Some(agent_message) = last_assistant_message_from_item(item, true) {
*last_agent_message = Some(agent_message);
}
return true;
}
false
}
async fn drain_in_flight(
in_flight: &mut FuturesOrdered<BoxFuture<'static, CodexResult<ResponseInputItem>>>,
sess: Arc<Session>,
@@ -3795,10 +4180,6 @@ async fn try_run_sampling_request(
prompt: &Prompt,
cancellation_token: CancellationToken,
) -> CodexResult<SamplingRequestResult> {
// TODO: If we need to guarantee the persisted mode always matches the prompt used for this
// turn, capture it in TurnContext at creation time. Using SessionConfiguration here avoids
// duplicating model settings on TurnContext, but a later Op could update the session config
// before this write occurs.
let collaboration_mode = sess.current_collaboration_mode().await;
let rollout_item = RolloutItem::TurnContext(TurnContextItem {
cwd: turn_context.cwd.clone(),
@@ -3843,6 +4224,8 @@ async fn try_run_sampling_request(
let mut last_agent_message: Option<String> = None;
let mut active_item: Option<TurnItem> = None;
let mut should_emit_turn_diff = false;
let plan_mode = turn_context.collaboration_mode_kind == ModeKind::Plan;
let mut plan_mode_state = plan_mode.then(|| PlanModeStreamState::new(&turn_context.sub_id));
let receiving_span = trace_span!("receiving_stream");
let outcome: CodexResult<SamplingRequestResult> = loop {
let handle_responses = trace_span!(
@@ -3881,6 +4264,33 @@ async fn try_run_sampling_request(
ResponseEvent::Created => {}
ResponseEvent::OutputItemDone(item) => {
let previously_active_item = active_item.take();
if let Some(state) = plan_mode_state.as_mut() {
if let Some(previous) = previously_active_item.as_ref() {
let item_id = previous.id();
if matches!(previous, TurnItem::AgentMessage(_)) {
flush_proposed_plan_segments_for_item(
&sess,
&turn_context,
state,
&item_id,
)
.await;
}
}
if handle_assistant_item_done_in_plan_mode(
&sess,
&turn_context,
&item,
state,
previously_active_item.as_ref(),
&mut last_agent_message,
)
.await
{
continue;
}
}
let mut ctx = HandleOutputCtx {
sess: sess.clone(),
turn_context: turn_context.clone(),
@@ -3900,8 +4310,17 @@ async fn try_run_sampling_request(
needs_follow_up |= output_result.needs_follow_up;
}
ResponseEvent::OutputItemAdded(item) => {
if let Some(turn_item) = handle_non_tool_response_item(&item).await {
sess.emit_turn_item_started(&turn_context, &turn_item).await;
if let Some(turn_item) = handle_non_tool_response_item(&item, plan_mode).await {
if let Some(state) = plan_mode_state.as_mut()
&& matches!(turn_item, TurnItem::AgentMessage(_))
{
let item_id = turn_item.id();
state
.pending_agent_message_items
.insert(item_id, turn_item.clone());
} else {
sess.emit_turn_item_started(&turn_context, &turn_item).await;
}
active_item = Some(turn_item);
}
}
@@ -3925,6 +4344,9 @@ async fn try_run_sampling_request(
response_id: _,
token_usage,
} => {
if let Some(state) = plan_mode_state.as_mut() {
flush_proposed_plan_segments_all(&sess, &turn_context, state).await;
}
sess.update_token_usage_info(&turn_context, token_usage.as_ref())
.await;
should_emit_turn_diff = true;
@@ -3940,14 +4362,25 @@ async fn try_run_sampling_request(
// In review child threads, suppress assistant text deltas; the
// UI will show a selection popup from the final ReviewOutput.
if let Some(active) = active_item.as_ref() {
let event = AgentMessageContentDeltaEvent {
thread_id: sess.conversation_id.to_string(),
turn_id: turn_context.sub_id.clone(),
item_id: active.id(),
delta: delta.clone(),
};
sess.send_event(&turn_context, EventMsg::AgentMessageContentDelta(event))
.await;
let item_id = active.id();
if let Some(state) = plan_mode_state.as_mut()
&& matches!(active, TurnItem::AgentMessage(_))
{
let segments = state
.plan_parsers
.assistant_parser_mut(&item_id)
.parse(&delta);
handle_plan_segments(&sess, &turn_context, state, &item_id, segments).await;
} else {
let event = AgentMessageContentDeltaEvent {
thread_id: sess.conversation_id.to_string(),
turn_id: turn_context.sub_id.clone(),
item_id,
delta,
};
sess.send_event(&turn_context, EventMsg::AgentMessageContentDelta(event))
.await;
}
} else {
error_or_panic("OutputTextDelta without active item".to_string());
}


@@ -61,6 +61,7 @@ pub(crate) async fn run_compact_task(
) {
let start_event = EventMsg::TurnStarted(TurnStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
collaboration_mode_kind: turn_context.collaboration_mode_kind,
});
sess.send_event(&turn_context, start_event).await;
run_compact_task_inner(sess.clone(), turn_context, input).await;


@@ -22,6 +22,7 @@ pub(crate) async fn run_inline_remote_auto_compact_task(
pub(crate) async fn run_remote_compact_task(sess: Arc<Session>, turn_context: Arc<TurnContext>) {
let start_event = EventMsg::TurnStarted(TurnStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
collaboration_mode_kind: turn_context.collaboration_mode_kind,
});
sess.send_event(&turn_context, start_event).await;


@@ -49,9 +49,11 @@ mod model_provider_info;
pub mod parse_command;
pub mod path_utils;
pub mod powershell;
mod proposed_plan_parser;
pub mod sandboxing;
mod session_prefix;
mod stream_events_utils;
mod tagged_block_parser;
mod text_encoding;
pub mod token_data;
mod truncate;


@@ -0,0 +1,185 @@
use crate::tagged_block_parser::TagSpec;
use crate::tagged_block_parser::TaggedLineParser;
use crate::tagged_block_parser::TaggedLineSegment;
const OPEN_TAG: &str = "<proposed_plan>";
const CLOSE_TAG: &str = "</proposed_plan>";
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PlanTag {
ProposedPlan,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) enum ProposedPlanSegment {
Normal(String),
ProposedPlanStart,
ProposedPlanDelta(String),
ProposedPlanEnd,
}
/// Parser for `<proposed_plan>` blocks emitted in plan mode.
///
/// This is a thin wrapper around the generic line-based tag parser. It maps
/// tag-aware segments into plan-specific segments for downstream consumers.
#[derive(Debug)]
pub(crate) struct ProposedPlanParser {
parser: TaggedLineParser<PlanTag>,
}
impl ProposedPlanParser {
pub(crate) fn new() -> Self {
Self {
parser: TaggedLineParser::new(vec![TagSpec {
open: OPEN_TAG,
close: CLOSE_TAG,
tag: PlanTag::ProposedPlan,
}]),
}
}
pub(crate) fn parse(&mut self, delta: &str) -> Vec<ProposedPlanSegment> {
self.parser
.parse(delta)
.into_iter()
.map(map_plan_segment)
.collect()
}
pub(crate) fn finish(&mut self) -> Vec<ProposedPlanSegment> {
self.parser
.finish()
.into_iter()
.map(map_plan_segment)
.collect()
}
}
fn map_plan_segment(segment: TaggedLineSegment<PlanTag>) -> ProposedPlanSegment {
match segment {
TaggedLineSegment::Normal(text) => ProposedPlanSegment::Normal(text),
TaggedLineSegment::TagStart(PlanTag::ProposedPlan) => {
ProposedPlanSegment::ProposedPlanStart
}
TaggedLineSegment::TagDelta(PlanTag::ProposedPlan, text) => {
ProposedPlanSegment::ProposedPlanDelta(text)
}
TaggedLineSegment::TagEnd(PlanTag::ProposedPlan) => ProposedPlanSegment::ProposedPlanEnd,
}
}
pub(crate) fn strip_proposed_plan_blocks(text: &str) -> String {
let mut parser = ProposedPlanParser::new();
let mut out = String::new();
for segment in parser.parse(text).into_iter().chain(parser.finish()) {
if let ProposedPlanSegment::Normal(delta) = segment {
out.push_str(&delta);
}
}
out
}
pub(crate) fn extract_proposed_plan_text(text: &str) -> Option<String> {
let mut parser = ProposedPlanParser::new();
let mut plan_text = String::new();
let mut saw_plan_block = false;
for segment in parser.parse(text).into_iter().chain(parser.finish()) {
match segment {
ProposedPlanSegment::ProposedPlanStart => {
saw_plan_block = true;
plan_text.clear();
}
ProposedPlanSegment::ProposedPlanDelta(delta) => {
plan_text.push_str(&delta);
}
ProposedPlanSegment::ProposedPlanEnd | ProposedPlanSegment::Normal(_) => {}
}
}
saw_plan_block.then_some(plan_text)
}
#[cfg(test)]
mod tests {
use super::ProposedPlanParser;
use super::ProposedPlanSegment;
use super::strip_proposed_plan_blocks;
use pretty_assertions::assert_eq;
#[test]
fn streams_proposed_plan_segments() {
let mut parser = ProposedPlanParser::new();
let mut segments = Vec::new();
for chunk in [
"Intro text\n<prop",
"osed_plan>\n- step 1\n",
"</proposed_plan>\nOutro",
] {
segments.extend(parser.parse(chunk));
}
segments.extend(parser.finish());
assert_eq!(
segments,
vec![
ProposedPlanSegment::Normal("Intro text\n".to_string()),
ProposedPlanSegment::ProposedPlanStart,
ProposedPlanSegment::ProposedPlanDelta("- step 1\n".to_string()),
ProposedPlanSegment::ProposedPlanEnd,
ProposedPlanSegment::Normal("Outro".to_string()),
]
);
}
#[test]
fn preserves_non_tag_lines() {
let mut parser = ProposedPlanParser::new();
let mut segments = parser.parse(" <proposed_plan> extra\n");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![ProposedPlanSegment::Normal(
" <proposed_plan> extra\n".to_string()
)]
);
}
#[test]
fn closes_unterminated_plan_block_on_finish() {
let mut parser = ProposedPlanParser::new();
let mut segments = parser.parse("<proposed_plan>\n- step 1\n");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![
ProposedPlanSegment::ProposedPlanStart,
ProposedPlanSegment::ProposedPlanDelta("- step 1\n".to_string()),
ProposedPlanSegment::ProposedPlanEnd,
]
);
}
#[test]
fn closes_tag_line_without_trailing_newline() {
let mut parser = ProposedPlanParser::new();
let mut segments = parser.parse("<proposed_plan>\n- step 1\n</proposed_plan>");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![
ProposedPlanSegment::ProposedPlanStart,
ProposedPlanSegment::ProposedPlanDelta("- step 1\n".to_string()),
ProposedPlanSegment::ProposedPlanEnd,
]
);
}
#[test]
fn strips_proposed_plan_blocks_from_text() {
let text = "before\n<proposed_plan>\n- step\n</proposed_plan>\nafter";
assert_eq!(strip_proposed_plan_blocks(text), "before\nafter");
}
}


@@ -48,6 +48,12 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::ThreadRolledBack(_)
| EventMsg::UndoCompleted(_)
| EventMsg::TurnAborted(_) => true,
EventMsg::ItemCompleted(event) => {
// Plan items are derived from streaming tags and are not part of the
// raw ResponseItem history, so we persist their completion to replay
// them on resume without bloating rollouts with every item lifecycle.
matches!(event.item, codex_protocol::items::TurnItem::Plan(_))
}
EventMsg::Error(_)
| EventMsg::Warning(_)
| EventMsg::TurnStarted(_)
@@ -89,8 +95,8 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::ViewImageToolCall(_)
| EventMsg::DeprecationNotice(_)
| EventMsg::ItemStarted(_)
| EventMsg::ItemCompleted(_)
| EventMsg::AgentMessageContentDelta(_)
| EventMsg::PlanDelta(_)
| EventMsg::ReasoningContentDelta(_)
| EventMsg::ReasoningRawContentDelta(_)
| EventMsg::SkillsUpdateAvailable


@@ -1,6 +1,7 @@
use std::pin::Pin;
use std::sync::Arc;
use codex_protocol::config_types::ModeKind;
use codex_protocol::items::TurnItem;
use tokio_util::sync::CancellationToken;
@@ -10,6 +11,7 @@ use crate::error::CodexErr;
use crate::error::Result;
use crate::function_tool::FunctionCallError;
use crate::parse_turn_item;
use crate::proposed_plan_parser::strip_proposed_plan_blocks;
use crate::tools::parallel::ToolCallRuntime;
use crate::tools::router::ToolRouter;
use codex_protocol::models::FunctionCallOutputPayload;
@@ -46,6 +48,7 @@ pub(crate) async fn handle_output_item_done(
previously_active_item: Option<TurnItem>,
) -> Result<OutputItemResult> {
let mut output = OutputItemResult::default();
let plan_mode = ctx.turn_context.collaboration_mode_kind == ModeKind::Plan;
match ToolRouter::build_tool_call(ctx.sess.as_ref(), item.clone()).await {
// The model emitted a tool call; log it, persist the item immediately, and queue the tool execution.
@@ -74,7 +77,7 @@ pub(crate) async fn handle_output_item_done(
}
// No tool call: convert messages/reasoning into turn items and mark them as complete.
Ok(None) => {
if let Some(turn_item) = handle_non_tool_response_item(&item).await {
if let Some(turn_item) = handle_non_tool_response_item(&item, plan_mode).await {
if previously_active_item.is_none() {
ctx.sess
.emit_turn_item_started(&ctx.turn_context, &turn_item)
@@ -89,7 +92,7 @@ pub(crate) async fn handle_output_item_done(
ctx.sess
.record_conversation_items(&ctx.turn_context, std::slice::from_ref(&item))
.await;
let last_agent_message = last_assistant_message_from_item(&item);
let last_agent_message = last_assistant_message_from_item(&item, plan_mode);
output.last_agent_message = last_agent_message;
}
@@ -155,13 +158,31 @@ pub(crate) async fn handle_output_item_done(
Ok(output)
}
pub(crate) async fn handle_non_tool_response_item(item: &ResponseItem) -> Option<TurnItem> {
pub(crate) async fn handle_non_tool_response_item(
item: &ResponseItem,
plan_mode: bool,
) -> Option<TurnItem> {
debug!(?item, "Output item");
match item {
ResponseItem::Message { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. } => parse_turn_item(item),
| ResponseItem::WebSearchCall { .. } => {
let mut turn_item = parse_turn_item(item)?;
if plan_mode && let TurnItem::AgentMessage(agent_message) = &mut turn_item {
let combined = agent_message
.content
.iter()
.map(|entry| match entry {
codex_protocol::items::AgentMessageContent::Text { text } => text.as_str(),
})
.collect::<String>();
let stripped = strip_proposed_plan_blocks(&combined);
agent_message.content =
vec![codex_protocol::items::AgentMessageContent::Text { text: stripped }];
}
Some(turn_item)
}
ResponseItem::FunctionCallOutput { .. } | ResponseItem::CustomToolCallOutput { .. } => {
debug!("unexpected tool output from stream");
None
@@ -170,14 +191,29 @@ pub(crate) async fn handle_non_tool_response_item(item: &ResponseItem) -> Option
}
}
pub(crate) fn last_assistant_message_from_item(item: &ResponseItem) -> Option<String> {
pub(crate) fn last_assistant_message_from_item(
item: &ResponseItem,
plan_mode: bool,
) -> Option<String> {
if let ResponseItem::Message { role, content, .. } = item
&& role == "assistant"
{
return content.iter().rev().find_map(|ci| match ci {
codex_protocol::models::ContentItem::OutputText { text } => Some(text.clone()),
_ => None,
});
let combined = content
.iter()
.filter_map(|ci| match ci {
codex_protocol::models::ContentItem::OutputText { text } => Some(text.as_str()),
_ => None,
})
.collect::<String>();
if combined.is_empty() {
return None;
}
return if plan_mode {
let stripped = strip_proposed_plan_blocks(&combined);
(!stripped.trim().is_empty()).then_some(stripped)
} else {
Some(combined)
};
}
None
}


@@ -0,0 +1,314 @@
//! Line-based tag block parsing for streamed text.
//!
//! The parser buffers each line until it can disprove that the line is a tag,
//! which is required for tags that must appear alone on a line. For example,
//! Proposed Plan output uses `<proposed_plan>` and `</proposed_plan>` tags
//! on their own lines so clients can stream plan content separately.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) struct TagSpec<T> {
pub(crate) open: &'static str,
pub(crate) close: &'static str,
pub(crate) tag: T,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) enum TaggedLineSegment<T> {
Normal(String),
TagStart(T),
TagDelta(T, String),
TagEnd(T),
}
/// Stateful line parser that splits input into normal text vs tag blocks.
///
/// How it works:
/// - While reading a line, we buffer characters until the line either finishes
/// (`\n`) or stops matching any tag prefix (after `trim_start`).
/// - If it stops matching a tag prefix, the buffered line is immediately
/// emitted as text and we continue in "plain text" mode until the next
/// newline.
/// - When a full line is available, we compare it to the open/close tags; tag
/// lines emit TagStart/TagEnd, otherwise the line is emitted as text.
/// - `finish()` flushes any buffered line and auto-closes an unterminated tag,
/// which keeps streaming resilient to missing closing tags.
#[derive(Debug, Default)]
pub(crate) struct TaggedLineParser<T>
where
T: Copy + Eq,
{
specs: Vec<TagSpec<T>>,
active_tag: Option<T>,
detect_tag: bool,
line_buffer: String,
}
impl<T> TaggedLineParser<T>
where
T: Copy + Eq,
{
pub(crate) fn new(specs: Vec<TagSpec<T>>) -> Self {
Self {
specs,
active_tag: None,
detect_tag: true,
line_buffer: String::new(),
}
}
/// Parse a streamed delta into line-aware segments.
pub(crate) fn parse(&mut self, delta: &str) -> Vec<TaggedLineSegment<T>> {
let mut segments = Vec::new();
let mut run = String::new();
for ch in delta.chars() {
if self.detect_tag {
if !run.is_empty() {
self.push_text(std::mem::take(&mut run), &mut segments);
}
self.line_buffer.push(ch);
if ch == '\n' {
self.finish_line(&mut segments);
continue;
}
let slug = self.line_buffer.trim_start();
if slug.is_empty() || self.is_tag_prefix(slug) {
continue;
}
// This line cannot be a tag line, so flush it immediately.
let buffered = std::mem::take(&mut self.line_buffer);
self.detect_tag = false;
self.push_text(buffered, &mut segments);
continue;
}
run.push(ch);
if ch == '\n' {
self.push_text(std::mem::take(&mut run), &mut segments);
self.detect_tag = true;
}
}
if !run.is_empty() {
self.push_text(run, &mut segments);
}
segments
}
/// Flush any buffered text and close an unterminated tag block.
pub(crate) fn finish(&mut self) -> Vec<TaggedLineSegment<T>> {
let mut segments = Vec::new();
if !self.line_buffer.is_empty() {
let buffered = std::mem::take(&mut self.line_buffer);
let without_newline = buffered.strip_suffix('\n').unwrap_or(&buffered);
let slug = without_newline.trim_start().trim_end();
if let Some(tag) = self.match_open(slug)
&& self.active_tag.is_none()
{
push_segment(&mut segments, TaggedLineSegment::TagStart(tag));
self.active_tag = Some(tag);
} else if let Some(tag) = self.match_close(slug)
&& self.active_tag == Some(tag)
{
push_segment(&mut segments, TaggedLineSegment::TagEnd(tag));
self.active_tag = None;
} else {
// The buffered line never proved to be a tag line.
self.push_text(buffered, &mut segments);
}
}
if let Some(tag) = self.active_tag.take() {
push_segment(&mut segments, TaggedLineSegment::TagEnd(tag));
}
self.detect_tag = true;
segments
}
fn finish_line(&mut self, segments: &mut Vec<TaggedLineSegment<T>>) {
let line = std::mem::take(&mut self.line_buffer);
let without_newline = line.strip_suffix('\n').unwrap_or(&line);
let slug = without_newline.trim_start().trim_end();
if let Some(tag) = self.match_open(slug)
&& self.active_tag.is_none()
{
push_segment(segments, TaggedLineSegment::TagStart(tag));
self.active_tag = Some(tag);
self.detect_tag = true;
return;
}
if let Some(tag) = self.match_close(slug)
&& self.active_tag == Some(tag)
{
push_segment(segments, TaggedLineSegment::TagEnd(tag));
self.active_tag = None;
self.detect_tag = true;
return;
}
self.detect_tag = true;
self.push_text(line, segments);
}
fn push_text(&self, text: String, segments: &mut Vec<TaggedLineSegment<T>>) {
if let Some(tag) = self.active_tag {
push_segment(segments, TaggedLineSegment::TagDelta(tag, text));
} else {
push_segment(segments, TaggedLineSegment::Normal(text));
}
}
fn is_tag_prefix(&self, slug: &str) -> bool {
let slug = slug.trim_end();
self.specs
.iter()
.any(|spec| spec.open.starts_with(slug) || spec.close.starts_with(slug))
}
fn match_open(&self, slug: &str) -> Option<T> {
self.specs
.iter()
.find(|spec| spec.open == slug)
.map(|spec| spec.tag)
}
fn match_close(&self, slug: &str) -> Option<T> {
self.specs
.iter()
.find(|spec| spec.close == slug)
.map(|spec| spec.tag)
}
}
fn push_segment<T>(segments: &mut Vec<TaggedLineSegment<T>>, segment: TaggedLineSegment<T>)
where
T: Copy + Eq,
{
match segment {
TaggedLineSegment::Normal(delta) => {
if delta.is_empty() {
return;
}
if let Some(TaggedLineSegment::Normal(existing)) = segments.last_mut() {
existing.push_str(&delta);
return;
}
segments.push(TaggedLineSegment::Normal(delta));
}
TaggedLineSegment::TagDelta(tag, delta) => {
if delta.is_empty() {
return;
}
if let Some(TaggedLineSegment::TagDelta(existing_tag, existing)) = segments.last_mut()
&& *existing_tag == tag
{
existing.push_str(&delta);
return;
}
segments.push(TaggedLineSegment::TagDelta(tag, delta));
}
TaggedLineSegment::TagStart(tag) => {
segments.push(TaggedLineSegment::TagStart(tag));
}
TaggedLineSegment::TagEnd(tag) => {
segments.push(TaggedLineSegment::TagEnd(tag));
}
}
}
#[cfg(test)]
mod tests {
use super::TagSpec;
use super::TaggedLineParser;
use super::TaggedLineSegment;
use pretty_assertions::assert_eq;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tag {
Block,
}
fn parser() -> TaggedLineParser<Tag> {
TaggedLineParser::new(vec![TagSpec {
open: "<tag>",
close: "</tag>",
tag: Tag::Block,
}])
}
#[test]
fn buffers_prefix_until_tag_is_decided() {
let mut parser = parser();
let mut segments = parser.parse("<t");
segments.extend(parser.parse("ag>\nline\n</tag>\n"));
segments.extend(parser.finish());
assert_eq!(
segments,
vec![
TaggedLineSegment::TagStart(Tag::Block),
TaggedLineSegment::TagDelta(Tag::Block, "line\n".to_string()),
TaggedLineSegment::TagEnd(Tag::Block),
]
);
}
#[test]
fn rejects_tag_lines_with_extra_text() {
let mut parser = parser();
let mut segments = parser.parse("<tag> extra\n");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![TaggedLineSegment::Normal("<tag> extra\n".to_string())]
);
}
#[test]
fn closes_unterminated_tag_on_finish() {
let mut parser = parser();
let mut segments = parser.parse("<tag>\nline\n");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![
TaggedLineSegment::TagStart(Tag::Block),
TaggedLineSegment::TagDelta(Tag::Block, "line\n".to_string()),
TaggedLineSegment::TagEnd(Tag::Block),
]
);
}
#[test]
fn accepts_tags_with_trailing_whitespace() {
let mut parser = parser();
let mut segments = parser.parse("<tag> \nline\n</tag> \n");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![
TaggedLineSegment::TagStart(Tag::Block),
TaggedLineSegment::TagDelta(Tag::Block, "line\n".to_string()),
TaggedLineSegment::TagEnd(Tag::Block),
]
);
}
#[test]
fn passes_through_plain_text() {
let mut parser = parser();
let mut segments = parser.parse("plain text\n");
segments.extend(parser.finish());
assert_eq!(
segments,
vec![TaggedLineSegment::Normal("plain text\n".to_string())]
);
}
}


@@ -67,6 +67,7 @@ impl SessionTask for UserShellCommandTask {
let event = EventMsg::TurnStarted(TurnStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
collaboration_mode_kind: turn_context.collaboration_mode_kind,
});
let session = session.clone_session();
session.send_event(turn_context.as_ref(), event).await;


@@ -10,6 +10,7 @@ use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::tools::spec::JsonSchema;
use async_trait::async_trait;
use codex_protocol::config_types::ModeKind;
use codex_protocol::plan_tool::UpdatePlanArgs;
use codex_protocol::protocol::EventMsg;
use std::collections::BTreeMap;
@@ -103,6 +104,11 @@ pub(crate) async fn handle_update_plan(
arguments: String,
_call_id: String,
) -> Result<String, FunctionCallError> {
if turn_context.collaboration_mode_kind == ModeKind::Plan {
return Err(FunctionCallError::RespondToModel(
"update_plan is a TODO/checklist tool and is not allowed in Plan mode".to_string(),
));
}
let args = parse_update_plan_arguments(&arguments)?;
session
.send_event(turn_context, EventMsg::PlanUpdate(args))


@@ -8,6 +8,12 @@ You are in **Plan Mode** until a developer message explicitly ends it.
Plan Mode is not changed by user intent, tone, or imperative language. If a user asks for execution while still in Plan Mode, treat it as a request to **plan the execution**, not perform it.
## Plan Mode vs update_plan tool
Plan Mode is a collaboration mode that can involve requesting user input and eventually issuing a `<proposed_plan>` block.
Separately, `update_plan` is a checklist/progress/TODOs tool; it does not enter or exit Plan Mode. Do not confuse it with Plan mode or try to use it while in Plan mode. If you try to use `update_plan` in Plan mode, it will return an error.
## Execution vs. mutation in Plan Mode
You may explore and execute **non-mutating** actions that improve the plan. You must not perform **mutating** actions.
@@ -96,6 +102,22 @@ Use the `request_user_input` tool only for decisions that materially change the
Only output the final plan when it is decision complete and leaves no decisions to the implementer.
When you present the official plan, wrap it in a `<proposed_plan>` block so the client can render it specially:
1) The opening tag must be on its own line.
2) Start the plan content on the next line (no text on the same line as the tag).
3) The closing tag must be on its own line.
4) Use Markdown inside the block.
5) Keep the tags exactly as `<proposed_plan>` and `</proposed_plan>` (do not translate or rename them), even if the plan content is in another language.
Example:
<proposed_plan>
# Plan title
- Step 1
- Step 2
</proposed_plan>
The final plan must be plan-only and include:
* A clear title
@@ -106,6 +128,6 @@ The final plan must be plan-only and include:
* Test cases
* Explicit assumptions and defaults chosen where needed
Do not ask "should I proceed?" in the final output.
Do not ask "should I proceed?" in the final output. The user can easily switch out of Plan mode and request implementation if you have included a `<proposed_plan>` block in your response. Alternatively, they can decide to stay in Plan mode and continue refining the plan.
Only produce the final answer when you are presenting the complete spec.
Only produce at most one `<proposed_plan>` block per turn, and only when you are presenting a complete spec.


@@ -5,6 +5,10 @@ use codex_core::protocol::EventMsg;
use codex_core::protocol::ItemCompletedEvent;
use codex_core::protocol::ItemStartedEvent;
use codex_core::protocol::Op;
use codex_protocol::config_types::CollaborationMode;
use codex_protocol::config_types::ModeKind;
use codex_protocol::config_types::Settings;
use codex_protocol::items::AgentMessageContent;
use codex_protocol::items::TurnItem;
use codex_protocol::models::WebSearchAction;
use codex_protocol::user_input::ByteRange;
@@ -27,6 +31,7 @@ use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_match;
use pretty_assertions::assert_eq;
@@ -327,6 +332,268 @@ async fn agent_message_content_delta_has_item_metadata() -> anyhow::Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn plan_mode_emits_plan_item_from_proposed_plan_block() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
session_configured,
..
} = test_codex().build(&server).await?;
let plan_block = "<proposed_plan>\n- Step 1\n- Step 2\n</proposed_plan>\n";
let full_message = format!("Intro\n{plan_block}Outro");
let stream = sse(vec![
ev_response_created("resp-1"),
ev_message_item_added("msg-1", ""),
ev_output_text_delta(&full_message),
ev_assistant_message("msg-1", &full_message),
ev_completed("resp-1"),
]);
mount_sse_once(&server, stream).await;
let collaboration_mode = CollaborationMode {
mode: ModeKind::Plan,
settings: Settings {
model: session_configured.model.clone(),
reasoning_effort: None,
developer_instructions: None,
},
};
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "please plan".into(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: std::env::current_dir()?,
approval_policy: codex_core::protocol::AskForApproval::Never,
sandbox_policy: codex_core::protocol::SandboxPolicy::DangerFullAccess,
model: session_configured.model.clone(),
effort: None,
summary: codex_protocol::config_types::ReasoningSummary::Auto,
collaboration_mode: Some(collaboration_mode),
personality: None,
})
.await?;
let plan_delta = wait_for_event_match(&codex, |ev| match ev {
EventMsg::PlanDelta(event) => Some(event.clone()),
_ => None,
})
.await;
let plan_completed = wait_for_event_match(&codex, |ev| match ev {
EventMsg::ItemCompleted(ItemCompletedEvent {
item: TurnItem::Plan(item),
..
}) => Some(item.clone()),
_ => None,
})
.await;
assert_eq!(
plan_delta.thread_id,
session_configured.session_id.to_string()
);
assert_eq!(plan_delta.delta, "- Step 1\n- Step 2\n");
assert_eq!(plan_completed.text, "- Step 1\n- Step 2\n");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn plan_mode_strips_plan_from_agent_messages() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
session_configured,
..
} = test_codex().build(&server).await?;
let plan_block = "<proposed_plan>\n- Step 1\n- Step 2\n</proposed_plan>\n";
let full_message = format!("Intro\n{plan_block}Outro");
let stream = sse(vec![
ev_response_created("resp-1"),
ev_message_item_added("msg-1", ""),
ev_output_text_delta(&full_message),
ev_assistant_message("msg-1", &full_message),
ev_completed("resp-1"),
]);
mount_sse_once(&server, stream).await;
let collaboration_mode = CollaborationMode {
mode: ModeKind::Plan,
settings: Settings {
model: session_configured.model.clone(),
reasoning_effort: None,
developer_instructions: None,
},
};
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "please plan".into(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: std::env::current_dir()?,
approval_policy: codex_core::protocol::AskForApproval::Never,
sandbox_policy: codex_core::protocol::SandboxPolicy::DangerFullAccess,
model: session_configured.model.clone(),
effort: None,
summary: codex_protocol::config_types::ReasoningSummary::Auto,
collaboration_mode: Some(collaboration_mode),
personality: None,
})
.await?;
let mut agent_deltas = Vec::new();
let mut plan_delta = None;
let mut agent_item = None;
let mut plan_item = None;
while plan_delta.is_none() || agent_item.is_none() || plan_item.is_none() {
let ev = wait_for_event(&codex, |_| true).await;
match ev {
EventMsg::AgentMessageContentDelta(event) => {
agent_deltas.push(event.delta);
}
EventMsg::PlanDelta(event) => {
plan_delta = Some(event.delta);
}
EventMsg::ItemCompleted(ItemCompletedEvent {
item: TurnItem::AgentMessage(item),
..
}) => {
agent_item = Some(item);
}
EventMsg::ItemCompleted(ItemCompletedEvent {
item: TurnItem::Plan(item),
..
}) => {
plan_item = Some(item);
}
_ => {}
}
}
let agent_text = agent_deltas.concat();
assert_eq!(agent_text, "Intro\nOutro");
assert_eq!(plan_delta.unwrap(), "- Step 1\n- Step 2\n");
assert_eq!(plan_item.unwrap().text, "- Step 1\n- Step 2\n");
let agent_text_from_item: String = agent_item
.unwrap()
.content
.iter()
.map(|entry| match entry {
AgentMessageContent::Text { text } => text.as_str(),
})
.collect();
assert_eq!(agent_text_from_item, "Intro\nOutro");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn plan_mode_handles_missing_plan_close_tag() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
session_configured,
..
} = test_codex().build(&server).await?;
let full_message = "Intro\n<proposed_plan>\n- Step 1\n";
let stream = sse(vec![
ev_response_created("resp-1"),
ev_message_item_added("msg-1", ""),
ev_output_text_delta(full_message),
ev_assistant_message("msg-1", full_message),
ev_completed("resp-1"),
]);
mount_sse_once(&server, stream).await;
let collaboration_mode = CollaborationMode {
mode: ModeKind::Plan,
settings: Settings {
model: session_configured.model.clone(),
reasoning_effort: None,
developer_instructions: None,
},
};
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "please plan".into(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: std::env::current_dir()?,
approval_policy: codex_core::protocol::AskForApproval::Never,
sandbox_policy: codex_core::protocol::SandboxPolicy::DangerFullAccess,
model: session_configured.model.clone(),
effort: None,
summary: codex_protocol::config_types::ReasoningSummary::Auto,
collaboration_mode: Some(collaboration_mode),
personality: None,
})
.await?;
let mut plan_delta = None;
let mut plan_item = None;
let mut agent_item = None;
while plan_delta.is_none() || plan_item.is_none() || agent_item.is_none() {
let ev = wait_for_event(&codex, |_| true).await;
match ev {
EventMsg::PlanDelta(event) => {
plan_delta = Some(event.delta);
}
EventMsg::ItemCompleted(ItemCompletedEvent {
item: TurnItem::Plan(item),
..
}) => {
plan_item = Some(item);
}
EventMsg::ItemCompleted(ItemCompletedEvent {
item: TurnItem::AgentMessage(item),
..
}) => {
agent_item = Some(item);
}
_ => {}
}
}
assert_eq!(plan_delta.unwrap(), "- Step 1\n");
assert_eq!(plan_item.unwrap().text, "- Step 1\n");
let agent_text_from_item: String = agent_item
.unwrap()
.content
.iter()
.map(|entry| match entry {
AgentMessageContent::Text { text } => text.as_str(),
})
.collect();
assert_eq!(agent_text_from_item, "Intro\n");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reasoning_content_delta_has_item_metadata() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));