Fix compaction context reinjection and model baselines (#12252)

## Summary
- move regular-turn context diff/full-context persistence into
`run_turn` so pre-turn compaction runs before incoming context updates
are recorded
- after successful pre-turn compaction, rely on a cleared
`reference_context_item` to trigger full context reinjection on the
follow-up regular turn (manual `/compact` keeps replacement history
summary-only and also clears the baseline)
- preserve `<model_switch>` when full context is reinjected, and inject
it *before* the rest of the full-context items
- scope `reference_context_item` and `previous_model` to regular user
turns only so standalone tasks (`/compact`, shell, review, undo) cannot
suppress future reinjection or `<model_switch>` behavior
- make context-diff persistence + `reference_context_item` updates
explicit in the regular-turn path, with clearer docs/comments around the
invariant
- stop persisting local `/compact` `RolloutItem::TurnContext` snapshots
(only regular turns persist `TurnContextItem` now)
- simplify resume/fork previous-model/reference-baseline hydration by
looking up the last surviving turn context from rollout lifecycle
events, including rollback and compaction-crossing handling
- remove the legacy fallback that guessed from bare `TurnContext`
rollouts without lifecycle events
- update compaction/remote-compaction/model-visible snapshots and
compact test assertions (including remote compaction mock response
shape)
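
The resume/fork hydration bullet above can be pictured as a single pass over the rollout. The sketch below is only an illustration under stated assumptions: `RolloutEvent`, `Hydrated`, and `hydrate` are hypothetical names, not the types in this PR. It also assumes that one `TurnContext` is persisted per regular turn and that a compaction invalidates the reference baseline while leaving the previous model known.

```rust
/// Hypothetical, simplified rollout events (not the real `RolloutItem` type).
#[derive(Clone, Debug, PartialEq)]
enum RolloutEvent {
    /// A regular turn persisted its context (model name only, for brevity).
    TurnContext { model: String },
    /// The last `turns` regular turns were rolled back.
    Rollback { turns: usize },
    /// History was replaced by a compaction.
    Compacted,
}

/// What resume/fork would restore in this sketch.
#[derive(Debug, PartialEq)]
struct Hydrated {
    previous_model: Option<String>,
    /// False when a compaction happened after the surviving turn context,
    /// so the next regular turn has to reinject full context.
    baseline_valid: bool,
}

fn hydrate(events: &[RolloutEvent]) -> Hydrated {
    // Stack of surviving turn contexts: (model, compacted_since_recorded).
    let mut surviving: Vec<(String, bool)> = Vec::new();
    for event in events {
        match event {
            RolloutEvent::TurnContext { model } => surviving.push((model.clone(), false)),
            RolloutEvent::Rollback { turns } => {
                // Drop the rolled-back turns; never underflow on short histories.
                let keep = surviving.len().saturating_sub(*turns);
                surviving.truncate(keep);
            }
            RolloutEvent::Compacted => {
                // Assumption: every earlier baseline is stale after compaction.
                for entry in surviving.iter_mut() {
                    entry.1 = true;
                }
            }
        }
    }
    match surviving.pop() {
        Some((model, compacted_since)) => Hydrated {
            previous_model: Some(model),
            baseline_valid: !compacted_since,
        },
        None => Hydrated { previous_model: None, baseline_valid: false },
    }
}
```

In this sketch, a rollback of the most recent turn makes the earlier turn's model the restored `previous_model`, and any compaction after the surviving turn context marks the baseline as unusable.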

## Why
We were persisting incoming context items before spawning the regular
turn task, which let pre-turn compaction requests accidentally include
incoming context diffs without the new user message. Fixing that exposed
follow-on baseline issues around `/compact`, resume/fork, and standalone
tasks that could cause duplicate context injection or suppress
`<model_switch>` instructions.
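
As a rough illustration of the ordering problem described above, the sketch below models history as a list of strings. `Session`, `run_turn_old`, and `run_turn_new` are hypothetical names chosen for this example and are not the real types or signatures in the codebase.

```rust
/// Hypothetical, simplified session: each history entry is one model-visible item.
#[derive(Default)]
struct Session {
    history: Vec<String>,
    /// What the most recent compaction request was asked to summarize.
    last_compaction_input: Vec<String>,
}

impl Session {
    /// Stand-in for pre-turn compaction: summarizes whatever is in history now.
    fn compact(&mut self) {
        self.last_compaction_input = self.history.clone();
        self.history = vec!["<summary>".to_string()];
    }

    /// Old order (the bug): the context update is recorded before the turn
    /// task runs, so pre-turn compaction sees it without the user message.
    fn run_turn_old(&mut self, context_update: &str, user_message: &str, needs_compaction: bool) {
        self.history.push(context_update.to_string());
        if needs_compaction {
            self.compact();
        }
        self.history.push(user_message.to_string());
    }

    /// New order: compaction runs first, then the context update and the
    /// user message are recorded together inside the turn.
    fn run_turn_new(&mut self, context_update: &str, user_message: &str, needs_compaction: bool) {
        if needs_compaction {
            self.compact();
        }
        self.history.push(context_update.to_string());
        self.history.push(user_message.to_string());
    }
}
```

With the old order, the context update is absorbed into the compaction input and no longer appears next to the user message. With the new order, it is recorded after the summary, alongside the message it belongs to.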

This PR re-centers the invariants around regular turns:
- regular turns persist model-visible context diffs/full reinjection and
update the `reference_context_item`
- standalone tasks do not advance those regular-turn baselines
- compaction clears the baseline when replacement history may have
stripped the referenced context diffs
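
The three invariants above can be sketched as a small state machine. This is a minimal illustration and not the code in this PR: `Baselines`, `TaskKind`, `ContextSnapshot`, `Injection`, and the string item format are all hypothetical. Emitting `<model_switch>` on the diff path is likewise an assumption made for this example.

```rust
/// Hypothetical stand-in for a persisted per-turn context snapshot.
#[derive(Clone, Debug, PartialEq)]
struct ContextSnapshot {
    model: String,
    cwd: String,
}

/// Kinds of work a session can run; names are illustrative only.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum TaskKind {
    RegularTurn,
    Compact,
    Shell,
    Review,
    Undo,
}

/// What a regular turn makes visible to the model.
#[derive(Debug, PartialEq)]
enum Injection {
    /// No baseline: reinject everything, `<model_switch>` first if present.
    Full(Vec<String>),
    /// Baseline exists: only emit what changed since it.
    Diff(Vec<String>),
}

#[derive(Default)]
struct Baselines {
    reference_context_item: Option<ContextSnapshot>,
    previous_model: Option<String>,
}

impl Baselines {
    /// Compaction may have stripped the diffs the baseline referred to.
    fn on_compaction(&mut self) {
        self.reference_context_item = None;
    }

    /// Returns the injection for a regular turn, or `None` for standalone tasks.
    fn start_task(&mut self, kind: TaskKind, current: &ContextSnapshot) -> Option<Injection> {
        if kind != TaskKind::RegularTurn {
            return None; // standalone tasks never advance the baselines
        }
        let model_switched = self
            .previous_model
            .as_ref()
            .map_or(false, |prev| *prev != current.model);
        let mut items = Vec::new();
        if model_switched {
            // The model-switch notice goes ahead of every other context item.
            items.push(format!("<model_switch> {}", current.model));
        }
        let injection = match &self.reference_context_item {
            None => {
                items.push(format!("cwd={}", current.cwd));
                Injection::Full(items)
            }
            Some(baseline) => {
                if baseline.cwd != current.cwd {
                    items.push(format!("cwd={}", current.cwd));
                }
                Injection::Diff(items)
            }
        };
        // Only regular turns reach this point and update the baselines.
        self.reference_context_item = Some(current.clone());
        self.previous_model = Some(current.model.clone());
        Some(injection)
    }
}
```

In this sketch, a `/compact` task that runs between two regular turns returns `None` and leaves `previous_model` untouched. The next regular turn therefore still detects the model switch and, because the baseline was cleared, emits it at the head of a full reinjection.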

## Follow-ups (TODOs left in code)
- `TODO(ccunningham)`: fix rollback/backtracking baseline handling more
comprehensively
- `TODO(ccunningham)`: include pending incoming context items in
pre-turn compaction threshold estimation
- `TODO(ccunningham)`: inject updated personality spec alongside
`<model_switch>` so some model-switch paths can avoid forced full
reinjection
- `TODO(ccunningham)`: review task turn lifecycle
(`TurnStarted`/`TurnComplete`) behavior and emit task-start context
diffs for task types that should have them (excluding `/compact`)

## Validation
- `just fmt`
- CI should cover the updated compaction/resume/model-visible snapshot
expectations and rollout-hydration behavior
- I did **not** rerun the full local test suite after the latest
resume-lookup / rollout-persistence simplifications
Charley Cunningham
2026-02-20 23:13:08 -08:00
committed by GitHub
parent 264fc444b6
commit bb0ac5be70
31 changed files with 1289 additions and 1206 deletions

@@ -93,12 +93,9 @@ async fn remote_compact_replaces_history_for_followups() -> Result<()> {
)
.await;
-let compacted_history = vec![
-responses::user_message_item("REMOTE_COMPACTED_SUMMARY"),
-ResponseItem::Compaction {
-encrypted_content: "ENCRYPTED_COMPACTION_SUMMARY".to_string(),
-},
-];
+let compacted_history = vec![ResponseItem::Compaction {
+encrypted_content: "ENCRYPTED_COMPACTION_SUMMARY".to_string(),
+}];
let compact_mock = responses::mount_compact_json_once(
harness.server(),
serde_json::json!({ "output": compacted_history.clone() }),
@@ -159,7 +156,7 @@ async fn remote_compact_replaces_history_for_followups() -> Result<()> {
let follow_up_request = response_requests.last().expect("follow-up request missing");
let follow_up_body = follow_up_request.body_json().to_string();
assert!(
-follow_up_body.contains("REMOTE_COMPACTED_SUMMARY"),
+follow_up_body.contains("\"type\":\"compaction\""),
"expected follow-up request to use compacted history"
);
assert!(
@@ -178,7 +175,7 @@ async fn remote_compact_replaces_history_for_followups() -> Result<()> {
insta::assert_snapshot!(
"remote_manual_compact_with_history_shapes",
format_labeled_requests_snapshot(
-"Remote manual /compact where remote compact output is summary-only: follow-up layout uses returned summary plus new user message.",
+"Remote manual /compact where remote compact output is compaction-only: follow-up layout uses the returned compaction item plus new user message.",
&[
("Remote Compaction Request", &compact_request),
("Remote Post-Compaction History Layout", follow_up_request),
@@ -958,7 +955,6 @@ async fn remote_compact_persists_replacement_history_in_rollout() -> Result<()>
.await;
let compacted_history = vec![
-responses::user_message_item("COMPACTED_USER_SUMMARY"),
ResponseItem::Compaction {
encrypted_content: "ENCRYPTED_COMPACTION_SUMMARY".to_string(),
},
@@ -1012,17 +1008,6 @@ async fn remote_compact_persists_replacement_history_in_rollout() -> Result<()>
&& compacted.message.is_empty()
&& let Some(replacement_history) = compacted.replacement_history.as_ref()
{
-let has_compacted_user_summary = replacement_history.iter().any(|item| {
-matches!(
-item,
-ResponseItem::Message { role, content, .. }
-if role == "user"
-&& content.iter().any(|part| matches!(
-part,
-ContentItem::InputText { text } if text == "COMPACTED_USER_SUMMARY"
-))
-)
-});
let has_compaction_item = replacement_history.iter().any(|item| {
matches!(
item,
@@ -1054,7 +1039,7 @@ async fn remote_compact_persists_replacement_history_in_rollout() -> Result<()>
)
});
-if has_compacted_user_summary && has_compaction_item && has_compacted_assistant_note {
+if has_compaction_item && has_compacted_assistant_note {
assert!(
!has_permissions_developer_message,
"manual remote compact rollout replacement history should not inject permissions context"
@@ -1110,7 +1095,6 @@ async fn remote_compact_and_resume_refresh_stale_developer_instructions() -> Res
.await;
let compacted_history = vec![
-responses::user_message_item("REMOTE_COMPACTED_SUMMARY"),
ResponseItem::Message {
id: None,
role: "developer".to_string(),
@@ -1196,8 +1180,8 @@ async fn remote_compact_and_resume_refresh_stale_developer_instructions() -> Res
"fresh developer instructions should be present after compaction"
);
assert!(
-after_compact_body.contains("REMOTE_COMPACTED_SUMMARY"),
-"compacted summary should be present after compaction"
+after_compact_body.contains("ENCRYPTED_COMPACTION_SUMMARY"),
+"compaction item should be present after compaction"
);
let after_resume_body = after_resume_request.body_json().to_string();
@@ -1210,8 +1194,8 @@ async fn remote_compact_and_resume_refresh_stale_developer_instructions() -> Res
"fresh developer instructions should be present after resume"
);
assert!(
-after_resume_body.contains("REMOTE_COMPACTED_SUMMARY"),
-"compacted summary should persist after resume"
+after_resume_body.contains("ENCRYPTED_COMPACTION_SUMMARY"),
+"compaction item should persist after resume"
);
Ok(())
@@ -1243,7 +1227,6 @@ async fn remote_compact_refreshes_stale_developer_instructions_without_resume()
.await;
let compacted_history = vec![
-responses::user_message_item("REMOTE_COMPACTED_SUMMARY"),
ResponseItem::Message {
id: None,
role: "developer".to_string(),
@@ -1302,8 +1285,8 @@ async fn remote_compact_refreshes_stale_developer_instructions_without_resume()
"fresh developer instructions should be present after compaction"
);
assert!(
-after_compact_body.contains("REMOTE_COMPACTED_SUMMARY"),
-"compacted summary should be present after compaction"
+after_compact_body.contains("ENCRYPTED_COMPACTION_SUMMARY"),
+"compaction item should be present after compaction"
);
Ok(())
@@ -1706,7 +1689,7 @@ async fn snapshot_request_shape_remote_mid_turn_continuation_compaction() -> Res
insta::assert_snapshot!(
"remote_mid_turn_compaction_shapes",
format_labeled_requests_snapshot(
-"Remote mid-turn continuation compaction after tool output: compact request includes tool artifacts and follow-up request includes the summary.",
+"Remote mid-turn continuation compaction after tool output: compact request includes tool artifacts and the follow-up request includes the returned compaction item.",
&[
("Remote Compaction Request", &compact_request),
("Remote Post-Compaction History Layout", &requests[1]),
@@ -1749,9 +1732,9 @@ async fn snapshot_request_shape_remote_mid_turn_compaction_summary_only_reinject
)
.await;
-let compacted_history = vec![responses::user_message_item(&summary_with_prefix(
-"REMOTE_SUMMARY_ONLY",
-))];
+let compacted_history = vec![ResponseItem::Compaction {
+encrypted_content: summary_with_prefix("REMOTE_SUMMARY_ONLY"),
+}];
let compact_mock = responses::mount_compact_json_once(
harness.server(),
serde_json::json!({ "output": compacted_history }),
@@ -1786,7 +1769,7 @@ async fn snapshot_request_shape_remote_mid_turn_compaction_summary_only_reinject
insta::assert_snapshot!(
"remote_mid_turn_compaction_summary_only_reinjects_context_shapes",
format_labeled_requests_snapshot(
-"Remote mid-turn compaction where compact output has only summary user content: continuation layout reinjects canonical context before that summary.",
+"Remote mid-turn compaction where compact output has only a compaction item: continuation layout reinjects context before that compaction item.",
&[
("Remote Compaction Request", &compact_request),
(
@@ -1893,13 +1876,13 @@ async fn snapshot_request_shape_remote_mid_turn_compaction_multi_summary_reinjec
insta::assert_snapshot!(
"remote_mid_turn_compaction_multi_summary_reinjects_above_last_summary_shapes",
format_labeled_requests_snapshot(
-"Remote mid-turn compaction after an earlier summary compaction: the older summary remains in model-visible history and round-trips into the next compact request.",
+"After a prior manual /compact produced an older remote compaction item, the next turn hits remote auto-compaction before the next sampling request. The compact request carries forward that earlier compaction item, and the next sampling request shows the latest compaction item with context reinjected before USER_TWO.",
&[
+("Remote Compaction Request", &compact_request),
(
-"Second Turn Request (Before Mid-Turn Compaction)",
+"Second Turn Request (After Compaction)",
&second_turn_request
),
-("Remote Compaction Request", &compact_request),
]
)
);