Fix compaction context reinjection and model baselines (#12252)

## Summary
- move regular-turn context diff/full-context persistence into
`run_turn` so pre-turn compaction runs before incoming context updates
are recorded
- after successful pre-turn compaction, rely on a cleared
`reference_context_item` to trigger full context reinjection on the
follow-up regular turn (manual `/compact` keeps replacement history
summary-only and also clears the baseline)
- preserve `<model_switch>` when full context is reinjected, and inject
it *before* the rest of the full-context items
- scope `reference_context_item` and `previous_model` to regular user
turns only so standalone tasks (`/compact`, shell, review, undo) cannot
suppress future reinjection or `<model_switch>` behavior
- make context-diff persistence + `reference_context_item` updates
explicit in the regular-turn path, with clearer docs/comments around the
invariant
- stop persisting local `/compact` `RolloutItem::TurnContext` snapshots
(only regular turns persist `TurnContextItem` now)
- simplify resume/fork previous-model/reference-baseline hydration by
looking up the last surviving turn context from rollout lifecycle
events, including rollback and compaction-crossing handling
- remove the legacy fallback that guessed from bare `TurnContext`
rollouts without lifecycle events
- update compaction/remote-compaction/model-visible snapshots and
compact test assertions (including remote compaction mock response
shape)

## Why
We were persisting incoming context items before spawning the regular
turn task, which let pre-turn compaction requests accidentally include
incoming context diffs without the new user message. Fixing that exposed
follow-on baseline issues around `/compact`, resume/fork, and standalone
tasks that could cause duplicate context injection or suppress
`<model_switch>` instructions.

This PR re-centers the invariants around regular turns:
- regular turns persist model-visible context diffs/full reinjection and
update the `reference_context_item`
- standalone tasks do not advance those regular-turn baselines
- compaction clears the baseline when replacement history may have
stripped the referenced context diffs

## Follow-ups (TODOs left in code)
- `TODO(ccunningham)`: fix rollback/backtracking baseline handling more
comprehensively
- `TODO(ccunningham)`: include pending incoming context items in
pre-turn compaction threshold estimation
- `TODO(ccunningham)`: inject updated personality spec alongside
`<model_switch>` so some model-switch paths can avoid forced full
reinjection
- `TODO(ccunningham)`: review task turn lifecycle
(`TurnStarted`/`TurnComplete`) behavior and emit task-start context
diffs for task types that should have them (excluding `/compact`)

## Validation
- `just fmt`
- CI should cover the updated compaction/resume/model-visible snapshot
expectations and rollout-hydration behavior
- I did **not** rerun the full local test suite after the latest
resume-lookup / rollout-persistence simplifications
This commit is contained in:
Charley Cunningham
2026-02-20 23:13:08 -08:00
committed by GitHub
parent 264fc444b6
commit bb0ac5be70
31 changed files with 1289 additions and 1206 deletions

View File

@@ -672,12 +672,7 @@ async fn per_turn_overrides_keep_cached_prefix_and_key_constant() -> anyhow::Res
});
let expected_permissions_msg = body1["input"][0].clone();
let body1_input = body1["input"].as_array().expect("input array");
let expected_permissions_msg_2 = body2["input"][body1_input.len() + 1].clone();
assert_ne!(
expected_permissions_msg_2, expected_permissions_msg,
"expected updated permissions message after per-turn override"
);
let expected_model_switch_msg = body2["input"][body1_input.len() + 2].clone();
let expected_model_switch_msg = body2["input"][body1_input.len()].clone();
assert_eq!(
expected_model_switch_msg["role"].as_str(),
Some("developer")
@@ -688,10 +683,15 @@ async fn per_turn_overrides_keep_cached_prefix_and_key_constant() -> anyhow::Res
.is_some_and(|text| text.contains("<model_switch>")),
"expected model switch message after model override: {expected_model_switch_msg:?}"
);
let expected_permissions_msg_2 = body2["input"][body1_input.len() + 2].clone();
assert_ne!(
expected_permissions_msg_2, expected_permissions_msg,
"expected updated permissions message after per-turn override"
);
let mut expected_body2 = body1_input.to_vec();
expected_body2.push(expected_model_switch_msg);
expected_body2.push(expected_env_msg_2);
expected_body2.push(expected_permissions_msg_2);
expected_body2.push(expected_model_switch_msg);
expected_body2.push(expected_user_message_2);
assert_eq!(body2["input"], serde_json::Value::Array(expected_body2));
@@ -900,12 +900,7 @@ async fn send_user_turn_with_changes_sends_environment_context() -> anyhow::Resu
assert_eq!(body1["input"], expected_input_1);
let body1_input = body1["input"].as_array().expect("input array");
let expected_permissions_msg_2 = body2["input"][body1_input.len()].clone();
assert_ne!(
expected_permissions_msg_2, expected_permissions_msg,
"expected updated permissions message after policy change"
);
let expected_model_switch_msg = body2["input"][body1_input.len() + 1].clone();
let expected_model_switch_msg = body2["input"][body1_input.len()].clone();
assert_eq!(
expected_model_switch_msg["role"].as_str(),
Some("developer")
@@ -916,14 +911,19 @@ async fn send_user_turn_with_changes_sends_environment_context() -> anyhow::Resu
.is_some_and(|text| text.contains("<model_switch>")),
"expected model switch message after model override: {expected_model_switch_msg:?}"
);
let expected_permissions_msg_2 = body2["input"][body1_input.len() + 1].clone();
assert_ne!(
expected_permissions_msg_2, expected_permissions_msg,
"expected updated permissions message after policy change"
);
let expected_user_message_2 = text_user_input("hello 2".to_string());
let expected_input_2 = serde_json::Value::Array(vec![
expected_permissions_msg,
expected_ui_msg,
expected_env_msg_1,
expected_user_message_1,
expected_permissions_msg_2,
expected_model_switch_msg,
expected_permissions_msg_2,
expected_user_message_2,
]);
assert_eq!(body2["input"], expected_input_2);