Fix compaction context reinjection and model baselines (#12252)

## Summary
- move regular-turn context diff/full-context persistence into
`run_turn` so pre-turn compaction runs before incoming context updates
are recorded
- after successful pre-turn compaction, rely on a cleared
`reference_context_item` to trigger full context reinjection on the
follow-up regular turn (manual `/compact` keeps replacement history
summary-only and also clears the baseline)
- preserve `<model_switch>` when full context is reinjected, and inject
it *before* the rest of the full-context items
- scope `reference_context_item` and `previous_model` to regular user
turns only so standalone tasks (`/compact`, shell, review, undo) cannot
suppress future reinjection or `<model_switch>` behavior
- make context-diff persistence + `reference_context_item` updates
explicit in the regular-turn path, with clearer docs/comments around the
invariant
- stop persisting local `/compact` `RolloutItem::TurnContext` snapshots
(only regular turns persist `TurnContextItem` now)
- simplify resume/fork previous-model/reference-baseline hydration by
looking up the last surviving turn context from rollout lifecycle
events, including rollback and compaction-crossing handling
- remove the legacy fallback that guessed from bare `TurnContext`
rollouts without lifecycle events
- update compaction/remote-compaction/model-visible snapshots and
compact test assertions (including remote compaction mock response
shape)
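
As a rough illustration of the remote compaction mock response shape mentioned in the last bullet, the sketch below models the behavior described in the test helper: keep user and developer messages from the request, drop assistant/tool entries, and append one compaction item carrying the summary text. The `Item` enum and `mock_compact_output` are hypothetical stand-ins for illustration only; the actual helper operates on `serde_json` values.

```rust
// Hypothetical, simplified stand-in for response items (not the real types).
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Message { role: String, text: String },
    ToolCall { name: String },
    Compaction { encrypted_content: String },
}

/// Keep user/developer messages, drop everything else, then append a single
/// compaction item as the newest entry.
fn mock_compact_output(input: &[Item], summary_text: &str) -> Vec<Item> {
    let mut output: Vec<Item> = input
        .iter()
        .filter(|item| match item {
            Item::Message { role, .. } => matches!(role.as_str(), "user" | "developer"),
            _ => false,
        })
        .cloned()
        .collect();
    output.push(Item::Compaction {
        encrypted_content: summary_text.to_string(),
    });
    output
}

fn msg(role: &str, text: &str) -> Item {
    Item::Message { role: role.to_string(), text: text.to_string() }
}

fn main() {
    let input = vec![
        msg("developer", "instructions"),
        msg("user", "hello"),
        msg("assistant", "hi there"),
        Item::ToolCall { name: "shell".to_string() },
    ];
    let output = mock_compact_output(&input, "SUMMARY");
    // Two kept messages plus the trailing compaction item.
    assert_eq!(output.len(), 3);
    println!("{output:?}");
}
```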

## Why
We were persisting incoming context items before spawning the regular
turn task, which let pre-turn compaction requests accidentally include
incoming context diffs without the new user message. Fixing that exposed
follow-on baseline issues around `/compact`, resume/fork, and standalone
tasks that could cause duplicate context injection or suppress
`<model_switch>` instructions.

This PR re-centers the invariants around regular turns:
- regular turns persist model-visible context diffs/full reinjection and
update the `reference_context_item`
- standalone tasks do not advance those regular-turn baselines
- compaction clears the baseline when replacement history may have
stripped the referenced context diffs
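
The invariants above can be sketched as a small state machine. This is a minimal model under stated assumptions, not the actual implementation: `Session`, `ContextSnapshot`, the string-based history, and the method bodies are all hypothetical, and only the names `run_turn`, `reference_context_item`, and `previous_model` come from this PR. How a model switch interacts with the diff path is not spelled out here, so the sketch simply emits `<model_switch>` first in both cases.

```rust
// Hypothetical model of the regular-turn baseline invariants.
#[derive(Debug, Clone, PartialEq)]
struct ContextSnapshot {
    model: String,
    cwd: String,
}

#[derive(Debug, Default)]
struct Session {
    reference_context_item: Option<ContextSnapshot>,
    previous_model: Option<String>,
    history: Vec<String>, // model-visible items, simplified to strings
}

impl Session {
    /// Compaction replaces history with a summary and clears the baseline,
    /// because the referenced context diffs may have been stripped.
    fn compact(&mut self) {
        self.history = vec!["<summary>".to_string()];
        self.reference_context_item = None;
    }

    /// Standalone tasks (shell, review, undo) never advance the baselines.
    fn run_standalone_task(&mut self, name: &str) {
        self.history.push(format!("task: {name}"));
    }

    /// Regular turn: pre-turn compaction runs first, then context is
    /// recorded, then the user message.
    fn run_turn(&mut self, incoming: ContextSnapshot, user_message: &str, needs_compaction: bool) {
        if needs_compaction {
            self.compact();
        }
        let model_switched = self
            .previous_model
            .as_deref()
            .is_some_and(|m| m != incoming.model.as_str());
        if model_switched {
            // Injected before the rest of the context items.
            self.history.push("<model_switch>".to_string());
        }
        match &self.reference_context_item {
            // Cleared baseline: reinject the full context.
            None => self.history.push(format!(
                "<full_context model={} cwd={}>",
                incoming.model, incoming.cwd
            )),
            // Baseline present but stale: record only a diff.
            Some(reference) if *reference != incoming => self.history.push(format!(
                "<context_diff model={} cwd={}>",
                incoming.model, incoming.cwd
            )),
            Some(_) => {}
        }
        self.history.push(format!("user: {user_message}"));
        // Only regular turns update these baselines.
        self.previous_model = Some(incoming.model.clone());
        self.reference_context_item = Some(incoming);
    }
}

fn ctx(model: &str) -> ContextSnapshot {
    ContextSnapshot { model: model.to_string(), cwd: "/repo".to_string() }
}

fn main() {
    let mut s = Session::default();
    s.run_turn(ctx("model-a"), "first", false);
    s.run_turn(ctx("model-a"), "second", false); // unchanged context: nothing reinjected
    s.run_standalone_task("shell"); // baseline untouched
    s.run_turn(ctx("model-b"), "third", true); // compaction, then full reinjection
    println!("{:#?}", s.history);
}
```

In this model, the final turn leaves the history as the summary, then `<model_switch>`, then the full context, then the user message, which mirrors the ordering this PR aims for after pre-turn compaction.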

## Follow-ups (TODOs left in code)
- `TODO(ccunningham)`: fix rollback/backtracking baseline handling more
comprehensively
- `TODO(ccunningham)`: include pending incoming context items in
pre-turn compaction threshold estimation
- `TODO(ccunningham)`: inject updated personality spec alongside
`<model_switch>` so some model-switch paths can avoid forced full
reinjection
- `TODO(ccunningham)`: review task turn lifecycle
(`TurnStarted`/`TurnComplete`) behavior and emit task-start context
diffs for task types that should have them (excluding `/compact`)

## Validation
- `just fmt`
- CI should cover the updated compaction/resume/model-visible snapshot
expectations and rollout-hydration behavior
- I did **not** rerun the full local test suite after the latest
resume-lookup / rollout-persistence simplifications
Charley Cunningham
2026-02-20 23:13:08 -08:00
committed by GitHub
parent 264fc444b6
commit bb0ac5be70
31 changed files with 1289 additions and 1206 deletions


@@ -867,7 +867,7 @@ pub async fn mount_compact_json_once(server: &MockServer, body: serde_json::Valu
/// Mount a `/responses/compact` mock that mirrors the default remote compaction shape:
/// keep user+developer messages from the request, drop assistant/tool artifacts, and append one
-/// summary user message.
+/// compaction item carrying the provided summary text.
pub async fn mount_compact_user_history_with_summary_once(
server: &MockServer,
summary_text: &str,
@@ -911,6 +911,9 @@ pub async fn mount_compact_user_history_with_summary_sequence(
.cloned()
.unwrap_or_default()
.into_iter()
+// TODO(ccunningham): Update this mock to match future compaction model behavior:
+// return user/developer/assistant messages since the last compaction item, then
+// append a single newest compaction item.
// Match current remote compaction behavior: keep user/developer messages and
// omit assistant/tool history entries.
.filter(|item| {
@@ -921,11 +924,10 @@ pub async fn mount_compact_user_history_with_summary_sequence(
)
})
.collect::<Vec<Value>>();
-// Append the synthetic summary message as the newest user item.
+// Append a synthetic compaction item as the newest item.
output.push(serde_json::json!({
-"type": "message",
-"role": "user",
-"content": [{"type": "input_text", "text": summary_text}],
+"type": "compaction",
+"encrypted_content": summary_text,
}));
ResponseTemplate::new(200)
.insert_header("content-type", "application/json")