TurnState Refactor Plan
Motivation
- The current State inside core/src/codex.rs mixes long-lived session data (e.g. history, approved commands) with per-turn information (pending_input, transient readiness flags).
- Readiness is delivered over a side channel and threaded through many call sites as an Option<Arc<ReadinessFlag>>, making it hard to reason about when the flag is consumed, reused, or replaced.
- Follow-up user messages while a turn is streaming are pushed into a session-level vector; the lifecycle for those items is implicit inside run_task.
- Introducing an explicit TurnState object lets us capture everything that belongs to one user turn, ensures it is dropped when the turn finishes, and gives us a single place to own the readiness flag.
Proposed data structures
TurnState
pub(crate) struct TurnState {
    /// Submission id that started this turn.
    pub sub_id: String,
    /// Per-turn context (model overrides, cwd, schema…).
    pub turn_context: Arc<TurnContext>,
    /// The initial user input bundled for the model.
    pub initial_input: ResponseInputItem,
    /// Mailbox for follow-up user inputs and readiness handoffs.
    mailbox: Mutex<TurnMailbox>,
    /// Tracks the latest agent output / review artefacts for completion events.
    last_agent_message: Mutex<Option<String>>,
    /// When in review mode we keep an isolated history to feed the child model.
    review_thread_history: Mutex<Vec<ResponseItem>>,
    /// Tracks the diff across the whole turn so we can emit `TurnDiff` events once.
    diff_tracker: Mutex<TurnDiffTracker>,
    /// Whether we already tried auto-compaction in this turn.
    auto_compact_recently_attempted: AtomicBool,
}
TurnMailbox is a helper that keeps the queue of pending turn inputs and the most recent readiness flag:
struct TurnMailbox {
    latest_readiness: Option<Arc<ReadinessFlag>>,
    pending: VecDeque<PendingTurnInput>,
}
PendingTurnInput keeps the shape it already has today (ResponseInputItem plus the readiness flag that was active when it was enqueued).
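As a rough illustration of how these two helper types could fit together, the sketch below queues inputs and hands out the newest non-None readiness flag on drain. ReadinessFlag and ResponseInputItem are simplified stand-ins, and the enqueue/drain method names and the TurnDrain field names are assumptions rather than the final API.

```rust
use std::collections::VecDeque;
use std::sync::Arc;

// Stand-ins for the real types, used only to keep the sketch self-contained.
struct ReadinessFlag(&'static str);
type ResponseInputItem = String;

/// One queued input plus the readiness flag that was active when it arrived.
struct PendingTurnInput {
    item: ResponseInputItem,
    readiness: Option<Arc<ReadinessFlag>>,
}

/// Result of draining the mailbox (field names are assumed).
struct TurnDrain {
    items: Vec<ResponseInputItem>,
    readiness: Option<Arc<ReadinessFlag>>,
}

#[derive(Default)]
struct TurnMailbox {
    latest_readiness: Option<Arc<ReadinessFlag>>,
    pending: VecDeque<PendingTurnInput>,
}

impl TurnMailbox {
    /// Queue an input; a `None` flag leaves the previous readiness in place.
    fn enqueue(&mut self, item: ResponseInputItem, readiness: Option<Arc<ReadinessFlag>>) {
        if let Some(flag) = &readiness {
            self.latest_readiness = Some(Arc::clone(flag));
        }
        self.pending.push_back(PendingTurnInput { item, readiness });
    }

    /// Hand out every pending item (oldest first) and the newest non-`None` flag.
    fn drain(&mut self) -> TurnDrain {
        let items = self.pending.drain(..).map(|p| p.item).collect();
        TurnDrain {
            items,
            readiness: self.latest_readiness.clone(),
        }
    }
}
```

Keeping latest_readiness separate from the queue means a drain on an empty mailbox still reports the flag that is currently in force.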
Handle vs. runtime
- TurnState is reference-counted (Arc<TurnState>) so both the session (for injection) and the running task can access it.
- Runtime-only helpers (prompt building, retry counters) remain inside run_task; they borrow data from TurnState instead of keeping their own copies.
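The split between the shared handle and runtime-only helpers might look like the sketch below. The TurnState here is cut down to two fields, and build_prompt and share_turn are hypothetical names chosen for illustration; the retry counter stays a plain local passed by value.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

// Simplified stand-in for the real TurnState; only two fields are modelled.
struct TurnState {
    sub_id: String,
    mailbox: Mutex<VecDeque<String>>,
}

/// Runtime-only helper: borrows from the turn state instead of copying it.
/// `attempt` is a retry counter that stays local to the task.
fn build_prompt(turn: &TurnState, attempt: u32) -> String {
    let pending = turn.mailbox.lock().unwrap().len();
    format!("[{} attempt {}] {} pending input(s)", turn.sub_id, attempt, pending)
}

/// The session keeps one handle for injection; the task thread gets a clone.
fn share_turn() -> String {
    let session_handle = Arc::new(TurnState {
        sub_id: "sub-1".to_string(),
        mailbox: Mutex::new(VecDeque::new()),
    });

    // Session side: inject a follow-up message through its handle.
    session_handle
        .mailbox
        .lock()
        .unwrap()
        .push_back("follow-up".to_string());

    // Task side: a second handle to the same state, moved into the worker.
    let task_handle = Arc::clone(&session_handle);
    let task = thread::spawn(move || build_prompt(&task_handle, 1));
    task.join().unwrap()
}
```

Because both handles point at the same allocation, the worker observes the message the session injected without any copy of the queue being made.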
Lifecycle management
- Creation – When submission_loop receives Op::UserInput or Op::UserTurn and there is no active task, it constructs a TurnState:
  - Build/resolve the TurnContext (either reuse the persistent one or apply the per-turn overrides).
  - Collect the initial readiness flag by peeking at the readiness receiver. Instead of losing it on try_recv failure we push the flag into a queue owned by the session; TurnState::new pops from that queue. If nothing is available we store None and the flag defaults to ready semantics.
  - Convert the submitted items into a ResponseInputItem and seed the mailbox with that entry. The same TurnState::enqueue_initial_input helper is used for review threads so every task goes through the same path.
  - Wrap the whole struct in an Arc and pass it to AgentTask::spawn (or .review).
- Session bookkeeping – State gains two fields:
      current_task: Option<AgentTask>,
      current_turn: Option<Arc<TurnState>>,
  Session::set_task stores both, aborting the previous task if needed. Session::remove_task clears current_turn in addition to current_task.
- Injecting more user input – Session::inject_input becomes a thin wrapper:
  - Grab the session mutex.
  - If there is an active TurnState, call turn_state.enqueue_user_input(items, readiness) and return Ok(()).
  - If not, return Err((items, readiness)) so the submission loop knows it needs to start a fresh turn (same behaviour as today).
  The enqueue helper converts the new items into a PendingTurnInput, pushes it into the mailbox, and updates latest_readiness when a flag accompanies the message.
- Turn execution – run_task now receives Arc<TurnState> instead of a raw Vec<InputItem> / readiness pair. It:
  - Grabs the initial input via turn_state.take_initial_input() to seed history and the review mailbox.
  - On each iteration, calls turn_state.drain_mailbox(), which returns a TurnDrain bundling the pending ResponseItems and the latest readiness flag so the loop no longer needs to manipulate the readiness flag manually. TurnMailbox ensures we always hand out the most recent readiness flag (the newest non-None entry wins).
  - Accesses the diff tracker, review history, and auto-compaction flag through the TurnState rather than local variables. This keeps the single source of truth tied to the turn’s lifetime and makes debugging easier.
  - Writes the last assistant message into turn_state before signalling TaskComplete so listeners can retrieve it even if the task is aborted elsewhere.
- Completion – When the loop finishes (success, interruption, or error) we drop Arc<TurnState> by clearing current_turn. All readiness waiters associated with the turn naturally drop because the only owner lives on the turn state.
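The session-side part of this lifecycle can be sketched as follows, under heavy simplification: State holds only current_turn, TurnState holds only a pending queue, and the readiness flag is an empty stand-in. The Rejected alias is a hypothetical name for the Err payload; the sketch shows inject_input handing the input back when no turn is active and remove_task releasing the session's handle.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Stand-ins for the real types; field and alias names are illustrative.
struct ReadinessFlag;
type InputItem = String;
type Rejected = (Vec<InputItem>, Option<Arc<ReadinessFlag>>);

#[derive(Default)]
struct TurnState {
    pending: Mutex<VecDeque<InputItem>>,
}

#[derive(Default)]
struct State {
    current_turn: Option<Arc<TurnState>>,
}

#[derive(Default)]
struct Session {
    state: Mutex<State>,
}

impl Session {
    /// Install a new turn; any previous turn handle is dropped here.
    fn set_task(&self, turn: Arc<TurnState>) {
        self.state.lock().unwrap().current_turn = Some(turn);
    }

    /// Clear the turn when the task finishes, releasing the session's handle.
    fn remove_task(&self) {
        self.state.lock().unwrap().current_turn = None;
    }

    /// Route input to the live turn, or hand it back so the caller can start one.
    /// This sketch ignores the readiness flag on the success path.
    fn inject_input(
        &self,
        items: Vec<InputItem>,
        readiness: Option<Arc<ReadinessFlag>>,
    ) -> Result<(), Rejected> {
        let state = self.state.lock().unwrap();
        match &state.current_turn {
            Some(turn) => {
                turn.pending.lock().unwrap().extend(items);
                Ok(())
            }
            None => Err((items, readiness)),
        }
    }
}
```

Returning the rejected items and flag by value lets the submission loop reuse them to build the next TurnState without cloning.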
Readiness handling
- TurnReadinessBridge in the TUI continues to send Arc<ReadinessFlag> values over the readiness channel; the session stores them in a short queue (VecDeque<Arc<ReadinessFlag>>) protected by the same mutex that guards State.
- TurnState::new pops the next flag when constructing the mailbox. If the queue is empty we log (with rate limiting) and store None so the turn stays unblocked.
- TurnState::enqueue_user_input accepts an optional flag. When present we update latest_readiness before pushing the input so subsequent drain_mailbox calls hand the new flag to run_turn.
- run_turn and handle_response_item only see turn_state.current_readiness(), eliminating the need for an ad-hoc current_turn_readiness variable scattered through the loop.
- Because the readiness flag lives on the TurnState, tool handlers that are spawned outside the loop (e.g. background exec streams) can clone the flag from the turn state if they need to delay until the user confirms.
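One way to picture the turn-side readiness accessors is the sketch below. The ReadinessFlag here is a stand-in built on an AtomicBool, and mark_ready, is_ready, and update_readiness are assumed names; only current_readiness comes from the plan. It shows a missing flag being treated as ready and a handler's cloned flag observing a later confirmation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

// Stand-in for the real ReadinessFlag; its API here is assumed.
#[derive(Default)]
struct ReadinessFlag {
    ready: AtomicBool,
}

impl ReadinessFlag {
    fn mark_ready(&self) {
        self.ready.store(true, Ordering::Release);
    }

    fn is_ready(&self) -> bool {
        self.ready.load(Ordering::Acquire)
    }
}

#[derive(Default)]
struct TurnState {
    latest_readiness: Mutex<Option<Arc<ReadinessFlag>>>,
}

impl TurnState {
    /// Replace the flag only when the new input carries one.
    fn update_readiness(&self, flag: Option<Arc<ReadinessFlag>>) {
        if flag.is_some() {
            *self.latest_readiness.lock().unwrap() = flag;
        }
    }

    /// Cheap clone that the loop or a background handler can hold on to.
    fn current_readiness(&self) -> Option<Arc<ReadinessFlag>> {
        self.latest_readiness.lock().unwrap().clone()
    }

    /// A turn without a flag is treated as ready, so it never blocks.
    fn is_ready(&self) -> bool {
        self.current_readiness().map_or(true, |flag| flag.is_ready())
    }
}
```

A background handler would call current_readiness() once, keep the returned Arc, and poll or await it without holding the turn's mutex.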
Changes to submission loop
- Replace the existing turn_readiness_rx.try_recv() calls with a helper on the session such as Session::next_turn_readiness() that returns the oldest queued flag (or None). TurnState::new receives that value and stores it in its mailbox.
- The submission loop no longer passes readiness into AgentTask::spawn; instead it constructs the TurnState (with readiness embedded) and hands the state to the task constructor.
- For review turns and compaction tasks, we construct a TurnState with None readiness. The helper works for both flows so we can remove the separate code paths that bypass readiness today.
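A minimal sketch of the queue-backed helper, assuming the queue sits directly on a simplified Session rather than inside State: push_turn_readiness and readiness_for_turn are hypothetical names, and the id field exists only to make the first-in, first-out order visible.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Stand-in flag carrying an id so the ordering is observable.
struct ReadinessFlag {
    id: u32,
}

#[derive(Default)]
struct Session {
    // In the plan this queue lives inside State behind the session mutex.
    readiness_queue: Mutex<VecDeque<Arc<ReadinessFlag>>>,
}

impl Session {
    /// Called by the UI-facing sender to queue a flag for the next turn.
    fn push_turn_readiness(&self, flag: Arc<ReadinessFlag>) {
        self.readiness_queue.lock().unwrap().push_back(flag);
    }

    /// Oldest queued flag, or `None` when the UI has not sent one.
    fn next_turn_readiness(&self) -> Option<Arc<ReadinessFlag>> {
        self.readiness_queue.lock().unwrap().pop_front()
    }
}

/// What the submission loop would pass to TurnState::new for each turn kind.
fn readiness_for_turn(session: &Session, is_user_turn: bool) -> Option<Arc<ReadinessFlag>> {
    if is_user_turn {
        session.next_turn_readiness()
    } else {
        // Review and compaction turns start without a flag.
        None
    }
}
```

Unlike a try_recv on a channel, an empty queue here is an ordinary None, so there is no failure path on which a flag can be lost.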
Implementation plan
- Introduce the turn_state module with TurnState, TurnMailbox, and helpers to enqueue / drain inputs and expose readiness.
- Extend State with current_turn and a VecDeque<Arc<ReadinessFlag>> used to store unread readiness flags pushed by the UI.
- Update the readiness sender plumbing so Codex::turn_readiness_sender() pushes into that queue; remove the direct try_recv usage.
- Refactor AgentTask::spawn / run_task to accept Arc<TurnState> and use the new helper methods for initial input, pending input, diff tracking, and readiness.
- Simplify Session::inject_input to route through the active TurnState instead of manipulating state.pending_input directly. Drop the PendingTurnInput vector from State once all call sites are migrated.
- Move per-turn temporaries (last_agent_message, review mailbox, diff tracker, auto-compact flag) into TurnState; this lets us delete the bespoke locals in run_task and make the turn lifecycle self-contained.
- After the refactor, audit call sites to ensure readiness is consistently fetched from the turn state, delete the now-unused turn_readiness parameters, and clean up warnings.
Follow-up considerations
- With TurnState owning the readiness flag we can extend it later to expose richer readiness semantics (e.g. multiple tokens, logging) without touching the submission loop again.
- This refactor lays the groundwork for queuing multiple TurnStates if we later want to support full multiturn buffering instead of mutating the live turn.
- Once TurnState is in place, the session-level mutex guards much less data, which could be split further if concurrency becomes a bottleneck.