Compare commits

..

11 Commits

Author SHA1 Message Date
celia-oai
907411be57 changes 2025-12-16 14:30:38 -08:00
Ahmed Ibrahim
c0a12b3952 feat: merge remote models instead of destructing (#7997)
- merge remote models instead of destructively replacing them
- give config values precedence over remote values
2025-12-15 18:02:35 -08:00
Ahmed Ibrahim
d802b18716 fix parallel tool calls (#7956) 2025-12-16 01:28:27 +00:00
Josh McKinney
b093565bfb WIP: Rework TUI viewport, history printing, and selection/copy (#7601)
> Large behavior change to how the TUI owns its viewport, history, and suspend behavior.
> Core model is in place; a few items are still being polished before this is ready to merge.

We've moved this from the tui crate into a new tui2 crate.
To enable it, use `--enable tui2` (or the equivalent in your config.toml). See
https://developers.openai.com/codex/local-config#feature-flags

Note that this serves as a baseline for the changes we're making so they
can be applied rapidly. tui2 may not track later changes in the main tui.
It's experimental and may not be where we land on things.

---

## Summary

This PR moves the Codex TUI off of “cooperating” with the terminal’s
scrollback and onto a model
where the in‑memory transcript is the single source of truth. The TUI
now owns scrolling, selection,
copy, and suspend/exit printing based on that transcript, and only
writes to terminal scrollback in
append‑only fashion on suspend/exit. It also fixes streaming wrapping so
streamed responses reflow
with the viewport, and introduces configuration to control whether we
print history on suspend or
only on exit.

High‑level goals:

- Ensure history is complete, ordered, and never silently dropped.
- Print each logical history cell at most once into scrollback, even
with resizes and suspends.
- Make scrolling, selection, and copy match the visible transcript, not
the terminal’s notion of
  scrollback.
- Keep suspend/alt‑screen behavior predictable across terminals.

---

## Core Design Changes

### Transcript & viewport ownership

- Treat the transcript as a list of **cells** (user prompts, agent
messages, system/info rows,
  streaming segments).
- On each frame (sketched in the code below):
  - Compute a **transcript region** as “full terminal frame minus the bottom input area”.
  - Flatten all cells into visual lines plus metadata (which cell + which line within that cell).
  - Use scroll state to choose which visual line is at the top of the region.
  - Clear that region and draw just the visible slice of lines.
- The terminal’s scrollback is no longer part of the live layout
algorithm; it is only ever written
  to when we decide to print history.
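
To make the model concrete, here is a minimal sketch of the per‑frame flattening and slicing, with hypothetical `VisualLine`/`ScrollState` types (the real tui2 types differ):

```rust
// Illustrative types only; not the actual tui2 API.
struct VisualLine {
    cell_index: usize,   // which transcript cell this line came from
    line_in_cell: usize, // which wrapped line within that cell
    text: String,
}

enum ScrollState {
    Follow,        // pinned to the latest output
    AtLine(usize), // this flattened line sits at the top of the region
}

/// Flatten all cells (already wrapped to the current viewport width) and
/// pick the slice of visual lines that fits the transcript region.
fn visible_slice(
    cells: &[Vec<String>],
    scroll: &ScrollState,
    region_height: usize,
) -> Vec<VisualLine> {
    let flattened: Vec<VisualLine> = cells
        .iter()
        .enumerate()
        .flat_map(|(ci, lines)| {
            lines.iter().enumerate().map(move |(li, text)| VisualLine {
                cell_index: ci,
                line_in_cell: li,
                text: text.clone(),
            })
        })
        .collect();
    let top = match scroll {
        ScrollState::Follow => flattened.len().saturating_sub(region_height),
        ScrollState::AtLine(n) => (*n).min(flattened.len().saturating_sub(1)),
    };
    flattened.into_iter().skip(top).take(region_height).collect()
}
```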

### User message styling

- User prompts now render as clear blocks with:
  - A blank padding line above and below.
  - A full‑width background for every line in the block (including the prompt line itself).
- The same block styling is used when we print history into scrollback, so the transcript looks consistent whether you are in the TUI or scrolling back after exit/suspend.

---

## Scrolling, Mouse, Selection, and Copy

### Scrolling

- Scrolling is defined in terms of the flattened transcript lines:
  - Mouse wheel scrolls up/down by fixed line increments.
  - PgUp/PgDn/Home/End operate on the same scroll model.
- The footer shows:
  - Whether you are “following live output” vs “scrolled up”.
  - Current scroll position (line / total).
- When there is no history yet, the bottom pane is **pegged high** and
gradually moves down as the
  transcript fills, matching the existing UX.
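
As a rough sketch of how wheel and page input could drive that scroll model (reusing the hypothetical `ScrollState` above; the line increment is illustrative):

```rust
// Illustrative input handling; the real key/wheel bindings live in tui2.
const WHEEL_STEP: usize = 3;

fn wheel_up(scroll: &mut ScrollState, current_top: usize) {
    // Scrolling up always leaves "follow live output" mode.
    *scroll = ScrollState::AtLine(current_top.saturating_sub(WHEEL_STEP));
}

fn wheel_down(scroll: &mut ScrollState, current_top: usize, max_top: usize) {
    let new_top = current_top + WHEEL_STEP;
    // Reaching the bottom re-enters "follow" mode.
    *scroll = if new_top >= max_top {
        ScrollState::Follow
    } else {
        ScrollState::AtLine(new_top)
    };
}
```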

### Selection

- Click‑and‑drag defines a **linear selection** over transcript
line/column coordinates, not raw
  screen rows.
- Selection is **content‑anchored** (see the sketch below):
  - When you scroll, the selection moves with the underlying lines instead of sticking to a fixed Y position.
  - This holds both when scrolling manually and when new content streams in, as long as you are in “follow” mode.
- The selection only covers the “transcript text” area:
  - Left gutter/prefix (bullets, markers) is intentionally excluded.
  - This keeps copy/paste cleaner and avoids including structural margin characters.
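
A plausible shape for such a content‑anchored selection, under the same hypothetical types (endpoints live in transcript coordinates, so they travel with the content):

```rust
/// A position in the flattened transcript: (line, column) within the text
/// area, gutter excluded. Columns are byte offsets here for simplicity; real
/// code must respect character boundaries and widths.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct TranscriptPos {
    line: usize,
    col: usize,
}

struct Selection {
    anchor: TranscriptPos, // where the drag started
    head: TranscriptPos,   // where the cursor currently is
}

impl Selection {
    /// Normalized (start, end) regardless of drag direction.
    fn range(&self) -> (TranscriptPos, TranscriptPos) {
        if self.anchor <= self.head {
            (self.anchor, self.head)
        } else {
            (self.head, self.anchor)
        }
    }
}
```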

### Copy (`Ctrl+Y`)

- Introduce a small clipboard abstraction (`ClipboardManager`‑style) and
use a cross‑platform
  clipboard crate under the hood.
- When `Ctrl+Y` is pressed and a non‑empty selection exists (sketched below):
  - Re‑render the transcript region off‑screen using the same wrapping as the visible viewport.
  - Walk the selected line/column range over that buffer to reconstruct the exact text:
    - Includes spaces between words.
    - Preserves empty lines within the selection.
  - Send the resulting text to the system clipboard.
  - Show a short status message in the footer indicating success/failure.
- Copy is **best‑effort**:
  - Clipboard failures (headless environment, sandbox, remote sessions) are handled gracefully via status messages; they do not crash the TUI.
- Copy does *not* insert a new history entry; it only affects the status bar.
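
A sketch of the extraction step over the off‑screen buffer, plus a best‑effort clipboard write (the `arboard` crate here is an assumption; the PR only says “a cross‑platform clipboard crate”):

```rust
// Walk the selected (line, column) range and reconstruct the exact text.
// Uses the Selection/TranscriptPos sketch above; columns are byte offsets.
fn selected_text(lines: &[String], sel: &Selection) -> String {
    let (start, end) = sel.range();
    let mut out = String::new();
    for idx in start.line..=end.line {
        let Some(line) = lines.get(idx) else { break };
        let from = if idx == start.line { start.col } else { 0 };
        let to = if idx == end.line { end.col.min(line.len()) } else { line.len() };
        out.push_str(line.get(from..to).unwrap_or(""));
        if idx != end.line {
            out.push('\n'); // preserves empty lines within the selection
        }
    }
    out
}

// Best-effort: failures become a footer status message, never a crash.
fn copy_to_clipboard(text: String) -> Result<(), arboard::Error> {
    arboard::Clipboard::new()?.set_text(text)
}
```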

---

## Streaming and Wrapping

### Previous behavior

Previously, streamed markdown:

- Was wrapped at a fixed width **at commit time** inside the streaming
collector.
- Those wrapped `Line<'static>` values were then wrapped again at
display time.
- As a result, streamed paragraphs could not “un‑wrap” when the terminal
width increased; they were
  permanently split according to the width at the start of the stream.

### New behavior

This PR implements the first step from
`codex-rs/tui/streaming_wrapping_design.md`:

- Streaming collector is constructed **without** a fixed width for
wrapping.
  - It still:
    - Buffers the full markdown source for the current stream.
    - Commits only at newline boundaries.
    - Emits logical lines as new content becomes available.
- Agent message cells now wrap streamed content only at **display
time**, based on the current
  viewport width, just like non‑streaming messages.
- Consequences:
  - Streamed responses reflow correctly when the terminal is resized.
  - Animation steps are per logical line instead of per “pre‑wrapped” visual line; this makes some commits slightly larger but keeps the behavior simple and predictable.

Streaming responses are still represented as a sequence of logical
history entries (first line +
continuations) and integrate with the same scrolling, selection, and
printing model.
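
A sketch of the split between commit‑time collection and display‑time wrapping (the collector names are illustrative; `textwrap` stands in for the actual wrapping code):

```rust
// Commit logical lines at newline boundaries; never wrap at commit time.
struct StreamCollector {
    buffer: String,         // full markdown source for the current stream
    committed: Vec<String>, // logical lines emitted so far
}

impl StreamCollector {
    fn push_delta(&mut self, delta: &str) {
        self.buffer.push_str(delta);
        while let Some(pos) = self.buffer.find('\n') {
            let line: String = self.buffer.drain(..=pos).collect();
            self.committed.push(line.trim_end_matches('\n').to_string());
        }
    }
}

/// Wrapping happens only here, at display time, for the current width, so a
/// resize simply re-wraps the same logical lines.
fn wrap_for_display(logical: &str, width: usize) -> Vec<String> {
    textwrap::wrap(logical, width)
        .into_iter()
        .map(|cow| cow.into_owned())
        .collect()
}
```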

---

## Printing History on Suspend and Exit

### High‑water mark and append‑only scrollback

- Introduce a **cell‑based high‑water mark** (`printed_history_cells`) on the transcript:
  - Represents “how many cells at the front of the transcript have already been printed”.
  - Completely independent of wrapped line counts or terminal geometry.
- Whenever we print history (suspend or exit), as sketched below:
  - Take the suffix of `transcript_cells` beyond `printed_history_cells`.
  - Render just that suffix into styled lines at the **current** width.
  - Write those lines to stdout.
  - Advance `printed_history_cells` to cover all cells we just printed.
- Older cells are never re‑rendered for scrollback. They stay in whatever wrapping they had when printed, which is acceptable as long as the logical content is present once.
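
In sketch form (reusing `wrap_for_display` from above; the real cells carry styling, not plain strings):

```rust
use std::io::Write;

struct Transcript {
    cells: Vec<String>,           // logical history cells, simplified to text
    printed_history_cells: usize, // cell-based high-water mark
}

impl Transcript {
    /// Append-only print of the not-yet-printed suffix at the current width.
    fn print_unprinted(&mut self, width: usize, out: &mut impl Write) -> std::io::Result<()> {
        for cell in &self.cells[self.printed_history_cells..] {
            for line in wrap_for_display(cell, width) {
                writeln!(out, "{line}")?;
            }
        }
        // Advance the mark; earlier cells are never re-rendered.
        self.printed_history_cells = self.cells.len();
        Ok(())
    }
}
```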

### Suspend (`Ctrl+Z`)

- On suspend:
  - Leave alt screen if active and restore normal terminal modes.
  - Render the not‑yet‑printed suffix of the transcript and append it to normal scrollback.
  - Advance the high‑water mark.
  - Suspend the process.
- On resume (`fg`):
  - Re‑enter the TUI mode (alt screen + input modes).
  - Clear the viewport region and fully redraw from in‑memory transcript and state.

This gives predictable behavior across terminals without trying to
maintain scrollback live.
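
For illustration, the suspend ordering might look like this with crossterm (a sketch, not the tui2 wiring; `libc::raise(SIGTSTP)` stands in for however the process actually stops itself):

```rust
use crossterm::terminal::{
    disable_raw_mode, enable_raw_mode, EnterAlternateScreen, LeaveAlternateScreen,
};

fn suspend(transcript: &mut Transcript, width: usize) -> std::io::Result<()> {
    let mut stdout = std::io::stdout();
    // 1. Leave the alt screen and restore normal terminal modes.
    crossterm::execute!(stdout, LeaveAlternateScreen)?;
    disable_raw_mode()?;
    // 2. Append-only print of the unprinted suffix, advancing the mark.
    transcript.print_unprinted(width, &mut stdout)?;
    // 3. Stop; execution resumes here after `fg`.
    unsafe { libc::raise(libc::SIGTSTP) };
    // 4. Re-enter TUI mode; the next frame fully redraws from memory.
    enable_raw_mode()?;
    crossterm::execute!(stdout, EnterAlternateScreen)?;
    Ok(())
}
```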

### Exit

- On exit:
  - Render any remaining unprinted cells once and write them to stdout.
  - Add an extra blank line after the final Codex history cell before printing token usage, so the transcript and usage info are visually separated.
- If you never suspended, exit prints the entire transcript exactly once.
- If you suspended one or more times, exit prints only the cells appended after the last suspend.

---

## Configuration: Suspend Printing

This PR also adds configuration to control **when** we print history:

- New TUI config option to gate printing on suspend:
  - At minimum:
    - `print_on_suspend = true` – current behavior: print new history at each suspend *and* on exit.
    - `print_on_suspend = false` – only print on exit.
- Default is tuned to preserve current behavior, but this can be revisited based on feedback.
- The config is respected in the suspend path:
  - If disabled, suspend only restores terminal modes and stops rendering but does not print new history.
  - Exit still prints the full not‑yet‑printed suffix once.

This keeps the core viewport logic agnostic to preference, while letting
users who care about
quiet scrollback opt out of suspend printing.
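
A minimal sketch of how such an option might deserialize, assuming serde (the field name and default are exactly what this PR is still finalizing):

```rust
#[derive(serde::Deserialize)]
struct TuiConfig {
    /// true: print new history at each suspend *and* on exit (current
    /// behavior). false: only print on exit.
    #[serde(default = "default_print_on_suspend")]
    print_on_suspend: bool,
}

fn default_print_on_suspend() -> bool {
    true // tuned to preserve current behavior
}
```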

---

## Tradeoffs

What we gain:

- A single authoritative history model (the in‑memory transcript).
- Deterministic viewport rendering independent of terminal quirks.
- Suspend/exit flows that:
  - Print each logical history cell exactly once.
  - Work across resizes and different terminals.
  - Interact cleanly with alt screen and raw‑mode toggling.
- Consistent, content‑anchored scrolling, selection, and copy.
- Streaming messages that reflow correctly with the viewport width.

What we accept:

- Scrollback may contain older cells wrapped differently than newer
ones.
- Streaming responses appear in scrollback as a sequence of blocks
corresponding to their streaming
  structure, not as a single retroactively reflowed paragraph.
- We do not attempt to rewrite or reflow already‑printed scrollback.

For deeper rationale and diagrams, see
`docs/tui_viewport_and_history.md` and
`codex-rs/tui/streaming_wrapping_design.md`.

---

## Still to Do Before This PR Is Ready

These are scoped to this PR (not long‑term future work):

- [ ] **Streaming wrapping polish**
  - Double‑check all streaming paths use display‑time wrapping only.
  - Ensure tests cover resizing after streaming has started.

- [ ] **Suspend printing config**
  - Finalize config shape and default (keep existing behavior vs opt‑out).
  - Wire config through TUI startup and document it in the appropriate config docs.

- [x] **Bottom pane positioning**
  - Ensure the bottom pane is pegged high when there’s no history and smoothly moves down as the transcript fills, matching the current behavior across startup and resume.

- [x] **Transcript mouse scrolling**
  - Re‑enable wheel‑based transcript scrolling on top of the new scroll model.
  - Make sure mouse scroll does not get confused with “alternate scroll” modes from terminals.

- [x] **Mouse selection vs streaming**
  - When selection is active, stop auto‑scrolling on streaming so the selection remains stable on the selected content.
  - Ensure that when streaming continues after selection is cleared, “follow latest output” mode resumes correctly.

- [ ] **Auto‑scroll during drag**
  - While the user is dragging a selection, auto‑scroll when the cursor is at/near the top or bottom of the transcript viewport to allow selecting beyond the current visible window.

- [ ] **Feature flag / rollout**
  - Investigate gating the new viewport/history behavior behind a feature flag for initial rollout, so we can fall back to the old behavior if needed during early testing.

- [ ] **Before/after videos**
  - Capture short clips showing:
    - Scrolling (mouse + keys).
    - Selection and copy.
    - Streaming behavior under resize.
    - Suspend/resume and exit printing.
  - Use these to validate UX and share context in the PR discussion.
2025-12-15 17:20:53 -08:00
Owen Lin
412dd37956 chore(app-server): remove stubbed thread/compact API (#8086)
We want to rely on server-side auto-compaction instead of having the
client trigger context compaction manually. This API was stubbed as a
placeholder and never implemented.
2025-12-16 01:11:01 +00:00
Eric Traut
d9554c8191 Fixes mcp elicitation test that fails for me when run locally (#8020) 2025-12-15 16:23:04 -08:00
jif-oai
3ee5c40261 chore: persist comments in edit (#7931)
This PR makes sure that inline comments are preserved for MCP server
config and arbitrary key/value setPath config edits.

---------

Co-authored-by: celia-oai <celia@openai.com>
2025-12-15 16:05:49 -08:00
miraclebakelaser
f754b19e80 Fix: Detect Bun global install via path check (#8004)
## Summary
Restores the ability to detect when Codex is installed globally via
**Bun**, which was broken by c3e4f920b4. Fixes
#8003.

Instead of relying on `npm_config_user_agent` (which is only set when
running via `bunx` or `bun run`), this adds a path-based check to see if
the CLI wrapper is located in Bun's global installation directory.

## Regression Context
Commit `c3e4f920b4e965085164d6ee0249a873ef96da77` removed the
`BUN_INSTALL` environment variable checks to prevent false positives.
However, this caused false negatives for genuine Bun global installs
because `detectPackageManager()` defaults to NPM when no signal is
found.

## Changes
- Updated `codex-cli/bin/codex.js` to check if `__dirname` contains
`.bun/install/global` (handles both POSIX and Windows paths).

## Verification
Verified by performing a global install of the patched CLI (v0.69.0 to
trigger the update prompt):

1. Packed the CLI using `npm pack` in `codex-cli/` to create a release
tarball.
2. Installed globally via Bun: `bun install -g
$(pwd)/openai-codex-0.0.0-dev.tgz`.
3. Ran `codex`, confirmed it detected Bun (banner showed `bun install -g
@openai/codex`), selected "Update now", and verified it correctly
spawned `bun install -g` instead of `npm`.
4. Confirmed the upgrade completed successfully using Bun.
<img width="1038" height="813" alt="verifying installation via bun"
src="https://github.com/user-attachments/assets/00c9301a-18f1-4440-aa95-82ccffba896c"
/>
5. Verified installations via npm are unaffected.
<img width="2090" height="842" alt="verifying installation via npm"
src="https://github.com/user-attachments/assets/ccb3e031-b85c-4bbe-bac7-23b087c5b844"
/>
2025-12-15 15:30:06 -08:00
Victor Vannara
fbeb7d47a9 chore(ci): drop Homebrew origin/main workaround for macOS runners (#8084)
## Notes

GitHub Actions macOS runners now ship a Homebrew version (5.0.5) that
includes the fix we needed, so it's possible to remove the temporary CI
step that forced using brew from origin/main (added in #7680).

Proof that macOS GitHub Actions runners now ship 5.0.5: latest commit
on `main`
(https://github.com/openai/codex/actions/runs/20245177832/job/58123247999)
- <img width="1286" height="136" alt="image"
src="https://github.com/user-attachments/assets/8b25fd57-dad5-45c5-907c-4f4da6a36c3f"
/>

`actions/runner-images` upgraded the macOS 14 image from pre-release to
release today
(https://github.com/actions/runner-images/releases/tag/macos-14-arm64%2F20251210.0045)

- <img width="1076" height="793" alt="image"
src="https://github.com/user-attachments/assets/357ea4bd-40b0-49c3-a6cd-e7d87ba6766d"
/>
2025-12-15 15:29:43 -08:00
Lucas Kim
54def78a22 docs: fix gpt-5.2 typo in config.md (#8079)
Fix small typo in docs/config.md: `gpt5-2` -> `gpt-5.2`
2025-12-15 15:15:14 -08:00
Jeremy Rose
2c6995ca4d exec-server: additional context for errors (#7935)
Add a .context() on some exec-server errors for debugging CI flakes.

Also set `"login": false` in the test so it is not affected by the user
profile.
2025-12-15 11:40:40 -08:00
49 changed files with 3559 additions and 596 deletions


@@ -385,28 +385,6 @@ jobs:
/opt/ghc
sudo apt-get remove -y docker.io docker-compose podman buildah
# Ensure brew includes this fix so that brew's shellenv.sh loads
# cleanly in the Codex sandbox (it is frequently eval'd via .zprofile
# for Brew users, including the macOS runners on GitHub):
#
# https://github.com/Homebrew/brew/pull/21157
#
# Once brew 5.0.5 is released and is the default on macOS runners, this
# step can be removed.
- name: Upgrade brew
if: ${{ startsWith(matrix.runner, 'macos') }}
shell: bash
run: |
set -euo pipefail
brew --version
git -C "$(brew --repo)" fetch origin
git -C "$(brew --repo)" checkout main
git -C "$(brew --repo)" reset --hard origin/main
export HOMEBREW_UPDATE_TO_TAG=0
brew update
brew upgrade
brew --version
# Some integration tests rely on DotSlash being installed.
# See https://github.com/openai/codex/pull/7617.
- name: Install DotSlash


@@ -262,6 +262,7 @@ jobs:
local binary="$1"
local source_path="target/${{ matrix.target }}/release/${binary}"
local archive_path="${RUNNER_TEMP}/${binary}.zip"
local ticket_path="target/${{ matrix.target }}/release/${binary}.notarization-ticket.json"
if [[ ! -f "$source_path" ]]; then
echo "Binary $source_path not found"
@@ -292,6 +293,22 @@ jobs:
echo "Notarization failed for ${binary} (submission ${submission_id}, status ${status})"
exit 1
fi
log_json=$(xcrun notarytool log "$submission_id" \
--key "$notary_key_path" \
--key-id "$APPLE_NOTARIZATION_KEY_ID" \
--issuer "$APPLE_NOTARIZATION_ISSUER_ID" \
--output-format json)
jq -n \
--arg binary "$binary" \
--arg target "${{ matrix.target }}" \
--arg id "$submission_id" \
--arg status "$status" \
--argjson submission "$submission_json" \
--argjson log "$log_json" \
'{binary: $binary, target: $target, id: $id, status: $status, submission: $submission, log: $log}' \
> "$ticket_path"
}
notarize_binary "codex"
@@ -313,6 +330,16 @@ jobs:
cp target/${{ matrix.target }}/release/codex-responses-api-proxy "$dest/codex-responses-api-proxy-${{ matrix.target }}"
fi
if [[ "${{ matrix.runner }}" == macos* ]]; then
for binary in codex codex-responses-api-proxy; do
ticket_src="target/${{ matrix.target }}/release/${binary}.notarization-ticket.json"
ticket_dest="$dest/${binary}-${{ matrix.target }}.notarization-ticket.json"
if [[ -f "$ticket_src" ]]; then
cp "$ticket_src" "$ticket_dest"
fi
done
fi
if [[ "${{ matrix.target }}" == *linux* ]]; then
cp target/${{ matrix.target }}/release/codex.sigstore "$dest/codex-${{ matrix.target }}.sigstore"
cp target/${{ matrix.target }}/release/codex-responses-api-proxy.sigstore "$dest/codex-responses-api-proxy-${{ matrix.target }}.sigstore"
@@ -341,10 +368,10 @@ jobs:
# For compatibility with environments that lack the `zstd` tool we
# additionally create a `.tar.gz` for all platforms and `.zip` for
# Windows alongside every single binary that we publish. The end result is:
# Windows and macOS alongside every single binary that we publish. The end result is:
# codex-<target>.zst (existing)
# codex-<target>.tar.gz (new)
# codex-<target>.zip (only for Windows)
# codex-<target>.zip (Windows/macOS)
# 1. Produce a .tar.gz for every file in the directory *before* we
# run `zstd --rm`, because that flag deletes the original files.
@@ -361,14 +388,31 @@ jobs:
continue
fi
# Notarization ticket sidecars are bundled into the per-binary
# archives; don't generate separate archives for them.
if [[ "$base" == *.notarization-ticket.json ]]; then
continue
fi
# Create per-binary tar.gz
tar -C "$dest" -czf "$dest/${base}.tar.gz" "$base"
tar_inputs=("$base")
ticket_sidecar="${base}.notarization-ticket.json"
if [[ -f "$dest/$ticket_sidecar" ]]; then
tar_inputs+=("$ticket_sidecar")
fi
tar -C "$dest" -czf "$dest/${base}.tar.gz" "${tar_inputs[@]}"
# Create zip archive for Windows binaries
# Must run from inside the dest dir so 7z won't
# embed the directory path inside the zip.
if [[ "${{ matrix.runner }}" == windows* ]]; then
(cd "$dest" && 7z a "${base}.zip" "$base")
elif [[ "${{ matrix.runner }}" == macos* ]]; then
if [[ -f "$dest/$ticket_sidecar" ]]; then
(cd "$dest" && zip -q "${base}.zip" "$base" "$ticket_sidecar")
else
(cd "$dest" && zip -q "${base}.zip" "$base")
fi
fi
# Also create .zst (existing behaviour) *and* remove the original
@@ -380,6 +424,10 @@ jobs:
zstd "${zstd_args[@]}" "$dest/$base"
done
if [[ "${{ matrix.runner }}" == macos* ]]; then
rm -f "$dest"/*.notarization-ticket.json
fi
- name: Remove signing keychain
if: ${{ always() && matrix.runner == 'macos-15-xlarge' }}
shell: bash


@@ -95,6 +95,14 @@ function detectPackageManager() {
return "bun";
}
if (
__dirname.includes(".bun/install/global") ||
__dirname.includes(".bun\\install\\global")
) {
return "bun";
}
return userAgent ? "npm" : null;
}

codex-rs/Cargo.lock (generated)

@@ -1395,7 +1395,6 @@ dependencies = [
"tokio-util",
"tracing",
"tracing-subscriber",
"which",
]
[[package]]
@@ -2056,7 +2055,9 @@ dependencies = [
"codex-protocol",
"codex-utils-absolute-path",
"notify",
"pretty_assertions",
"regex-lite",
"reqwest",
"serde_json",
"shlex",
"tempfile",


@@ -117,10 +117,6 @@ client_request_definitions! {
params: v2::ThreadListParams,
response: v2::ThreadListResponse,
},
ThreadCompact => "thread/compact" {
params: v2::ThreadCompactParams,
response: v2::ThreadCompactResponse,
},
SkillsList => "skills/list" {
params: v2::SkillsListParams,
response: v2::SkillsListResponse,


@@ -958,18 +958,6 @@ pub struct ThreadListResponse {
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ThreadCompactParams {
pub thread_id: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ThreadCompactResponse {}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]


@@ -368,13 +368,6 @@ impl CodexMessageProcessor {
ClientRequest::ThreadList { request_id, params } => {
self.thread_list(request_id, params).await;
}
ClientRequest::ThreadCompact {
request_id,
params: _,
} => {
self.send_unimplemented_error(request_id, "thread/compact")
.await;
}
ClientRequest::SkillsList { request_id, params } => {
self.skills_list(request_id, params).await;
}
@@ -515,15 +508,6 @@ impl CodexMessageProcessor {
}
}
async fn send_unimplemented_error(&self, request_id: RequestId, method: &str) {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("{method} is not implemented yet"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
}
async fn login_v2(&mut self, request_id: RequestId, params: LoginAccountParams) {
match params {
LoginAccountParams::ApiKey { api_key } => {


@@ -32,8 +32,10 @@ use crate::token_data::parse_id_token;
use crate::util::try_parse_error_message;
use codex_client::CodexHttpClient;
use codex_protocol::account::PlanType as AccountPlanType;
#[cfg(any(test, feature = "test-support"))]
use once_cell::sync::Lazy;
use serde_json::Value;
#[cfg(any(test, feature = "test-support"))]
use tempfile::TempDir;
use thiserror::Error;


@@ -90,7 +90,7 @@ mod document_helpers {
}
}
pub(super) fn serialize_mcp_server(config: &McpServerConfig) -> TomlItem {
fn serialize_mcp_server_table(config: &McpServerConfig) -> TomlTable {
let mut entry = TomlTable::new();
entry.set_implicit(false);
@@ -161,7 +161,29 @@ mod document_helpers {
entry["disabled_tools"] = array_from_iter(disabled_tools.iter().cloned());
}
TomlItem::Table(entry)
entry
}
pub(super) fn serialize_mcp_server(config: &McpServerConfig) -> TomlItem {
TomlItem::Table(serialize_mcp_server_table(config))
}
pub(super) fn serialize_mcp_server_inline(config: &McpServerConfig) -> InlineTable {
serialize_mcp_server_table(config).into_inline_table()
}
pub(super) fn merge_inline_table(existing: &mut InlineTable, replacement: InlineTable) {
existing.retain(|key, _| replacement.get(key).is_some());
for (key, value) in replacement.iter() {
if let Some(existing_value) = existing.get_mut(key) {
let mut updated_value = value.clone();
*updated_value.decor_mut() = existing_value.decor().clone();
*existing_value = updated_value;
} else {
existing.insert(key.to_string(), value.clone());
}
}
}
fn table_from_inline(inline: &InlineTable) -> TomlTable {
@@ -317,15 +339,52 @@ impl ConfigDocument {
return self.clear(Scope::Global, &["mcp_servers"]);
}
let mut table = TomlTable::new();
table.set_implicit(true);
for (name, config) in servers {
table.insert(name, document_helpers::serialize_mcp_server(config));
let root = self.doc.as_table_mut();
if !root.contains_key("mcp_servers") {
root.insert(
"mcp_servers",
TomlItem::Table(document_helpers::new_implicit_table()),
);
}
let item = TomlItem::Table(table);
self.write_value(Scope::Global, &["mcp_servers"], item)
let Some(item) = root.get_mut("mcp_servers") else {
return false;
};
if document_helpers::ensure_table_for_write(item).is_none() {
*item = TomlItem::Table(document_helpers::new_implicit_table());
}
let Some(table) = item.as_table_mut() else {
return false;
};
let keys_to_remove: Vec<String> = table
.iter()
.map(|(key, _)| key.to_string())
.filter(|key| !servers.contains_key(key.as_str()))
.collect();
for key in keys_to_remove {
table.remove(&key);
}
for (name, config) in servers {
if let Some(existing) = table.get_mut(name.as_str()) {
if let TomlItem::Value(value) = existing
&& let Some(inline) = value.as_inline_table_mut()
{
let replacement = document_helpers::serialize_mcp_server_inline(config);
document_helpers::merge_inline_table(inline, replacement);
} else {
*existing = document_helpers::serialize_mcp_server(config);
}
} else {
table.insert(name, document_helpers::serialize_mcp_server(config));
}
}
true
}
fn scoped_segments(&self, scope: Scope, segments: &[&str]) -> Vec<String> {
@@ -357,6 +416,10 @@ impl ConfigDocument {
return false;
};
let mut value = value;
if let Some(existing) = parent.get(last) {
Self::preserve_decor(existing, &mut value);
}
parent[last] = value;
true
}
@@ -398,6 +461,37 @@ impl ConfigDocument {
Some(current)
}
fn preserve_decor(existing: &TomlItem, replacement: &mut TomlItem) {
match (existing, replacement) {
(TomlItem::Table(existing_table), TomlItem::Table(replacement_table)) => {
replacement_table
.decor_mut()
.clone_from(existing_table.decor());
for (key, existing_item) in existing_table.iter() {
if let (Some(existing_key), Some(mut replacement_key)) =
(existing_table.key(key), replacement_table.key_mut(key))
{
replacement_key
.leaf_decor_mut()
.clone_from(existing_key.leaf_decor());
replacement_key
.dotted_decor_mut()
.clone_from(existing_key.dotted_decor());
}
if let Some(replacement_item) = replacement_table.get_mut(key) {
Self::preserve_decor(existing_item, replacement_item);
}
}
}
(TomlItem::Value(existing_value), TomlItem::Value(replacement_value)) => {
replacement_value
.decor_mut()
.clone_from(existing_value.decor());
}
_ => {}
}
}
}
/// Persist edits using a blocking strategy.
@@ -691,6 +785,68 @@ profiles = { fast = { model = "gpt-4o", sandbox_mode = "strict" } }
);
}
#[test]
fn batch_write_table_upsert_preserves_inline_comments() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
let original = r#"approval_policy = "never"
[mcp_servers.linear]
name = "linear"
# ok
url = "https://linear.example"
[mcp_servers.linear.http_headers]
foo = "bar"
[sandbox_workspace_write]
# ok 3
network_access = false
"#;
std::fs::write(codex_home.join(CONFIG_TOML_FILE), original).expect("seed config");
apply_blocking(
codex_home,
None,
&[
ConfigEdit::SetPath {
segments: vec![
"mcp_servers".to_string(),
"linear".to_string(),
"url".to_string(),
],
value: value("https://linear.example/v2"),
},
ConfigEdit::SetPath {
segments: vec![
"sandbox_workspace_write".to_string(),
"network_access".to_string(),
],
value: value(true),
},
],
)
.expect("apply");
let updated =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"approval_policy = "never"
[mcp_servers.linear]
name = "linear"
# ok
url = "https://linear.example/v2"
[mcp_servers.linear.http_headers]
foo = "bar"
[sandbox_workspace_write]
# ok 3
network_access = true
"#;
assert_eq!(updated, expected);
}
#[test]
fn blocking_clear_model_removes_inline_table_entry() {
let tmp = tempdir().expect("tmpdir");
@@ -1028,6 +1184,178 @@ B = \"2\"
assert_eq!(raw, expected);
}
#[test]
fn blocking_replace_mcp_servers_preserves_inline_comments() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
std::fs::write(
codex_home.join(CONFIG_TOML_FILE),
r#"[mcp_servers]
# keep me
foo = { command = "cmd" }
"#,
)
.expect("seed");
let mut servers = BTreeMap::new();
servers.insert(
"foo".to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: "cmd".to_string(),
args: Vec::new(),
env: None,
env_vars: Vec::new(),
cwd: None,
},
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
apply_blocking(codex_home, None, &[ConfigEdit::ReplaceMcpServers(servers)])
.expect("persist");
let contents =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"[mcp_servers]
# keep me
foo = { command = "cmd" }
"#;
assert_eq!(contents, expected);
}
#[test]
fn blocking_replace_mcp_servers_preserves_inline_comment_suffix() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
std::fs::write(
codex_home.join(CONFIG_TOML_FILE),
r#"[mcp_servers]
foo = { command = "cmd" } # keep me
"#,
)
.expect("seed");
let mut servers = BTreeMap::new();
servers.insert(
"foo".to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: "cmd".to_string(),
args: Vec::new(),
env: None,
env_vars: Vec::new(),
cwd: None,
},
enabled: false,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
apply_blocking(codex_home, None, &[ConfigEdit::ReplaceMcpServers(servers)])
.expect("persist");
let contents =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"[mcp_servers]
foo = { command = "cmd" , enabled = false } # keep me
"#;
assert_eq!(contents, expected);
}
#[test]
fn blocking_replace_mcp_servers_preserves_inline_comment_after_removing_keys() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
std::fs::write(
codex_home.join(CONFIG_TOML_FILE),
r#"[mcp_servers]
foo = { command = "cmd", args = ["--flag"] } # keep me
"#,
)
.expect("seed");
let mut servers = BTreeMap::new();
servers.insert(
"foo".to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: "cmd".to_string(),
args: Vec::new(),
env: None,
env_vars: Vec::new(),
cwd: None,
},
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
apply_blocking(codex_home, None, &[ConfigEdit::ReplaceMcpServers(servers)])
.expect("persist");
let contents =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"[mcp_servers]
foo = { command = "cmd"} # keep me
"#;
assert_eq!(contents, expected);
}
#[test]
fn blocking_replace_mcp_servers_preserves_inline_comment_prefix_on_update() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
std::fs::write(
codex_home.join(CONFIG_TOML_FILE),
r#"[mcp_servers]
# keep me
foo = { command = "cmd" }
"#,
)
.expect("seed");
let mut servers = BTreeMap::new();
servers.insert(
"foo".to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: "cmd".to_string(),
args: Vec::new(),
env: None,
env_vars: Vec::new(),
cwd: None,
},
enabled: false,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
apply_blocking(codex_home, None, &[ConfigEdit::ReplaceMcpServers(servers)])
.expect("persist");
let contents =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"[mcp_servers]
# keep me
foo = { command = "cmd" , enabled = false }
"#;
assert_eq!(contents, expected);
}
#[test]
fn blocking_clear_path_noop_when_missing() {
let tmp = tempdir().expect("tmpdir");


@@ -932,4 +932,91 @@ remote_compaction = true
assert_eq!(overridden.overriding_layer.name, ConfigLayerName::System);
assert_eq!(overridden.effective_value, serde_json::json!("never"));
}
#[tokio::test]
async fn upsert_merges_tables_replace_overwrites() -> Result<()> {
let tmp = tempdir().expect("tempdir");
let path = tmp.path().join(CONFIG_TOML_FILE);
let base = r#"[mcp_servers.linear]
bearer_token_env_var = "TOKEN"
name = "linear"
url = "https://linear.example"
[mcp_servers.linear.env_http_headers]
existing = "keep"
[mcp_servers.linear.http_headers]
alpha = "a"
"#;
let overlay = serde_json::json!({
"bearer_token_env_var": "NEW_TOKEN",
"http_headers": {
"alpha": "updated",
"beta": "b"
},
"name": "linear",
"url": "https://linear.example"
});
std::fs::write(&path, base)?;
let service = ConfigService::new(tmp.path().to_path_buf(), vec![]);
service
.write_value(ConfigValueWriteParams {
file_path: Some(path.display().to_string()),
key_path: "mcp_servers.linear".to_string(),
value: overlay.clone(),
merge_strategy: MergeStrategy::Upsert,
expected_version: None,
})
.await
.expect("upsert succeeds");
let upserted: TomlValue = toml::from_str(&std::fs::read_to_string(&path)?)?;
let expected_upsert: TomlValue = toml::from_str(
r#"[mcp_servers.linear]
bearer_token_env_var = "NEW_TOKEN"
name = "linear"
url = "https://linear.example"
[mcp_servers.linear.env_http_headers]
existing = "keep"
[mcp_servers.linear.http_headers]
alpha = "updated"
beta = "b"
"#,
)?;
assert_eq!(upserted, expected_upsert);
std::fs::write(&path, base)?;
service
.write_value(ConfigValueWriteParams {
file_path: Some(path.display().to_string()),
key_path: "mcp_servers.linear".to_string(),
value: overlay,
merge_strategy: MergeStrategy::Replace,
expected_version: None,
})
.await
.expect("replace succeeds");
let replaced: TomlValue = toml::from_str(&std::fs::read_to_string(&path)?)?;
let expected_replace: TomlValue = toml::from_str(
r#"[mcp_servers.linear]
bearer_token_env_var = "NEW_TOKEN"
name = "linear"
url = "https://linear.example"
[mcp_servers.linear.http_headers]
alpha = "updated"
beta = "b"
"#,
)?;
assert_eq!(replaced, expected_replace);
Ok(())
}
}


@@ -80,7 +80,6 @@ pub mod spawn;
pub mod terminal;
mod tools;
pub mod turn_diff_tracker;
pub mod version;
pub use rollout::ARCHIVED_SESSIONS_SUBDIR;
pub use rollout::INTERACTIVE_SESSION_SOURCES;
pub use rollout::RolloutRecorder;
@@ -95,7 +94,6 @@ pub use rollout::list::read_head_for_summary;
mod function_tool;
mod state;
mod tasks;
pub mod update_action;
mod user_notification;
mod user_shell_command;
pub mod util;


@@ -6,6 +6,7 @@ use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::openai_models::ModelsResponse;
use http::HeaderMap;
use std::collections::HashSet;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
@@ -35,7 +36,7 @@ const CODEX_AUTO_BALANCED_MODEL: &str = "codex-auto-balanced";
#[derive(Debug)]
pub struct ModelsManager {
// todo(aibrahim) merge available_models and model family creation into one struct
available_models: RwLock<Vec<ModelPreset>>,
local_models: Vec<ModelPreset>,
remote_models: RwLock<Vec<ModelInfo>>,
auth_manager: Arc<AuthManager>,
etag: RwLock<Option<String>>,
@@ -49,7 +50,7 @@ impl ModelsManager {
pub fn new(auth_manager: Arc<AuthManager>) -> Self {
let codex_home = auth_manager.codex_home().to_path_buf();
Self {
available_models: RwLock::new(builtin_model_presets(auth_manager.get_auth_mode())),
local_models: builtin_model_presets(auth_manager.get_auth_mode()),
remote_models: RwLock::new(Vec::new()),
auth_manager,
etag: RwLock::new(None),
@@ -64,7 +65,7 @@ impl ModelsManager {
pub fn with_provider(auth_manager: Arc<AuthManager>, provider: ModelProviderInfo) -> Self {
let codex_home = auth_manager.codex_home().to_path_buf();
Self {
available_models: RwLock::new(builtin_model_presets(auth_manager.get_auth_mode())),
local_models: builtin_model_presets(auth_manager.get_auth_mode()),
remote_models: RwLock::new(Vec::new()),
auth_manager,
etag: RwLock::new(None),
@@ -107,13 +108,13 @@ impl ModelsManager {
if let Err(err) = self.refresh_available_models(config).await {
error!("failed to refresh available models: {err}");
}
self.available_models.read().await.clone()
let remote_models = self.remote_models.read().await.clone();
self.build_available_models(remote_models)
}
pub fn try_list_models(&self) -> Result<Vec<ModelPreset>, TryLockError> {
self.available_models
.try_read()
.map(|models| models.clone())
let remote_models = self.remote_models.try_read()?.clone();
Ok(self.build_available_models(remote_models))
}
fn find_family_for_model(slug: &str) -> ModelFamily {
@@ -123,8 +124,8 @@ impl ModelsManager {
/// Look up the requested model family while applying remote metadata overrides.
pub async fn construct_model_family(&self, model: &str, config: &Config) -> ModelFamily {
Self::find_family_for_model(model)
.with_config_overrides(config)
.with_remote_overrides(self.remote_models.read().await.clone())
.with_config_overrides(config)
}
pub async fn get_model(&self, model: &Option<String>, config: &Config) -> String {
@@ -136,11 +137,10 @@ impl ModelsManager {
}
// if codex-auto-balanced exists & signed in with chatgpt mode, return it, otherwise return the default model
let auth_mode = self.auth_manager.get_auth_mode();
let remote_models = self.remote_models.read().await.clone();
if auth_mode == Some(AuthMode::ChatGPT)
&& self
.available_models
.read()
.await
.build_available_models(remote_models)
.iter()
.any(|m| m.model == CODEX_AUTO_BALANCED_MODEL)
{
@@ -163,7 +163,6 @@ impl ModelsManager {
/// Replace the cached remote models and rebuild the derived presets list.
async fn apply_remote_models(&self, models: Vec<ModelInfo>) {
*self.remote_models.write().await = models;
self.build_available_models().await;
}
/// Attempt to satisfy the refresh from the cache when it matches the provider and TTL.
@@ -203,22 +202,55 @@ impl ModelsManager {
}
}
/// Convert remote model metadata into picker-ready presets, marking defaults.
async fn build_available_models(&self) {
let mut available_models = self.remote_models.read().await.clone();
available_models.sort_by(|a, b| a.priority.cmp(&b.priority));
let mut model_presets: Vec<ModelPreset> = available_models
.into_iter()
.map(Into::into)
.filter(|preset: &ModelPreset| preset.show_in_picker)
.collect();
if let Some(default) = model_presets.first_mut() {
/// Merge remote model metadata into picker-ready presets, preserving existing entries.
fn build_available_models(&self, mut remote_models: Vec<ModelInfo>) -> Vec<ModelPreset> {
remote_models.sort_by(|a, b| a.priority.cmp(&b.priority));
let remote_presets: Vec<ModelPreset> = remote_models.into_iter().map(Into::into).collect();
let existing_presets = self.local_models.clone();
let mut merged_presets = Self::merge_presets(remote_presets, existing_presets);
merged_presets = Self::filter_visible_models(merged_presets);
let has_default = merged_presets.iter().any(|preset| preset.is_default);
if let Some(default) = merged_presets.first_mut()
&& !has_default
{
default.is_default = true;
}
{
let mut available_models_guard = self.available_models.write().await;
*available_models_guard = model_presets;
merged_presets
}
fn filter_visible_models(models: Vec<ModelPreset>) -> Vec<ModelPreset> {
models
.into_iter()
.filter(|model| model.show_in_picker)
.collect()
}
fn merge_presets(
remote_presets: Vec<ModelPreset>,
existing_presets: Vec<ModelPreset>,
) -> Vec<ModelPreset> {
if remote_presets.is_empty() {
return existing_presets;
}
let remote_slugs: HashSet<&str> = remote_presets
.iter()
.map(|preset| preset.model.as_str())
.collect();
let mut merged_presets = remote_presets.clone();
for mut preset in existing_presets {
if remote_slugs.contains(preset.model.as_str()) {
continue;
}
preset.is_default = false;
merged_presets.push(preset);
}
merged_presets
}
fn cache_path(&self) -> PathBuf {
@@ -261,11 +293,21 @@ mod tests {
use crate::model_provider_info::WireApi;
use codex_protocol::openai_models::ModelsResponse;
use core_test_support::responses::mount_models_once;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::tempdir;
use wiremock::MockServer;
fn remote_model(slug: &str, display: &str, priority: i32) -> ModelInfo {
remote_model_with_visibility(slug, display, priority, "list")
}
fn remote_model_with_visibility(
slug: &str,
display: &str,
priority: i32,
visibility: &str,
) -> ModelInfo {
serde_json::from_value(json!({
"slug": slug,
"display_name": display,
@@ -273,7 +315,7 @@ mod tests {
"default_reasoning_level": "medium",
"supported_reasoning_levels": [{"effort": "low", "description": "low"}, {"effort": "medium", "description": "medium"}],
"shell_type": "shell_command",
"visibility": "list",
"visibility": visibility,
"minimal_client_version": [0, 1, 0],
"supported_in_api": true,
"priority": priority,
@@ -347,14 +389,23 @@ mod tests {
assert_eq!(cached_remote, remote_models);
let available = manager.list_models(&config).await;
assert_eq!(available.len(), 2);
assert_eq!(available[0].model, "priority-high");
let high_idx = available
.iter()
.position(|model| model.model == "priority-high")
.expect("priority-high should be listed");
let low_idx = available
.iter()
.position(|model| model.model == "priority-low")
.expect("priority-low should be listed");
assert!(
available[0].is_default,
high_idx < low_idx,
"higher priority should be listed before lower priority"
);
assert!(
available[high_idx].is_default,
"highest priority should be default"
);
assert_eq!(available[1].model, "priority-low");
assert!(!available[1].is_default);
assert!(!available[low_idx].is_default);
assert_eq!(
models_mock.requests().len(),
1,
@@ -493,4 +544,94 @@ mod tests {
"stale cache refresh should fetch /models once"
);
}
#[tokio::test]
async fn refresh_available_models_drops_removed_remote_models() {
let server = MockServer::start().await;
let initial_models = vec![remote_model("remote-old", "Remote Old", 1)];
let initial_mock = mount_models_once(
&server,
ModelsResponse {
models: initial_models,
etag: String::new(),
},
)
.await;
let codex_home = tempdir().expect("temp dir");
let mut config = Config::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
codex_home.path().to_path_buf(),
)
.expect("load default test config");
config.features.enable(Feature::RemoteModels);
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let provider = provider_for(server.uri());
let mut manager = ModelsManager::with_provider(auth_manager, provider);
manager.cache_ttl = Duration::ZERO;
manager
.refresh_available_models(&config)
.await
.expect("initial refresh succeeds");
server.reset().await;
let refreshed_models = vec![remote_model("remote-new", "Remote New", 1)];
let refreshed_mock = mount_models_once(
&server,
ModelsResponse {
models: refreshed_models,
etag: String::new(),
},
)
.await;
manager
.refresh_available_models(&config)
.await
.expect("second refresh succeeds");
let available = manager
.try_list_models()
.expect("models should be available");
assert!(
available.iter().any(|preset| preset.model == "remote-new"),
"new remote model should be listed"
);
assert!(
!available.iter().any(|preset| preset.model == "remote-old"),
"removed remote model should not be listed"
);
assert_eq!(
initial_mock.requests().len(),
1,
"initial refresh should only hit /models once"
);
assert_eq!(
refreshed_mock.requests().len(),
1,
"second refresh should only hit /models once"
);
}
#[test]
fn build_available_models_picks_default_after_hiding_hidden_models() {
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let provider = provider_for("http://example.test".to_string());
let mut manager = ModelsManager::with_provider(auth_manager, provider);
manager.local_models = Vec::new();
let hidden_model = remote_model_with_visibility("hidden", "Hidden", 0, "hide");
let visible_model = remote_model_with_visibility("visible", "Visible", 1, "list");
let mut expected = ModelPreset::from(visible_model.clone());
expected.is_default = true;
let available = manager.build_available_models(vec![hidden_model, visible_model]);
assert_eq!(available, vec![expected]);
}
}


@@ -16,7 +16,6 @@ use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use futures::Future;
use tracing::Instrument;
use tracing::debug;
use tracing::instrument;
@@ -59,16 +58,10 @@ pub(crate) async fn handle_output_item_done(
.await;
let cancellation_token = ctx.cancellation_token.child_token();
let tool_runtime = ctx.tool_runtime.clone();
let tool_future: InFlightFuture<'static> = Box::pin(
async move {
let response_input = tool_runtime
.handle_tool_call(call, cancellation_token)
.await?;
Ok(response_input)
}
.in_current_span(),
ctx.tool_runtime
.clone()
.handle_tool_call(call, cancellation_token),
);
output.needs_follow_up = true;


@@ -47,7 +47,7 @@ impl ToolCallRuntime {
#[instrument(skip_all, fields(call = ?call))]
pub(crate) fn handle_tool_call(
&self,
self,
call: ToolCall,
cancellation_token: CancellationToken,
) -> impl std::future::Future<Output = Result<ResponseInputItem, CodexErr>> {


@@ -1,247 +0,0 @@
use std::path::Path;
use chrono::DateTime;
use chrono::Utc;
use serde::Deserialize;
use serde::Serialize;
pub const VERSION_FILENAME: &str = "version.json";
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct VersionInfo {
pub latest_version: String,
// ISO-8601 timestamp (RFC3339)
pub last_checked_at: DateTime<Utc>,
#[serde(default)]
pub dismissed_version: Option<String>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Version {
major: u64,
minor: u64,
patch: u64,
pre: Option<Vec<PrereleaseIdent>>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
enum PrereleaseIdent {
Numeric(u64),
Alpha(String),
}
impl Version {
pub fn parse(input: &str) -> Option<Self> {
let mut input = input.trim();
if let Some(stripped) = input.strip_prefix("rust-v") {
input = stripped;
}
if let Some(stripped) = input.strip_prefix('v') {
input = stripped;
}
let input = input.split('+').next().unwrap_or(input);
let mut parts = input.splitn(2, '-');
let core = parts.next()?;
let pre = parts.next();
let mut nums = core.split('.');
let major = nums.next()?.parse::<u64>().ok()?;
let minor = nums.next()?.parse::<u64>().ok()?;
let patch = nums.next()?.parse::<u64>().ok()?;
if nums.next().is_some() {
return None;
}
let pre = match pre {
None => None,
Some("") => None,
Some(value) => {
let mut idents = Vec::new();
for ident in value.split('.') {
if ident.is_empty() {
return None;
}
let parsed = if ident.chars().all(|c| c.is_ascii_digit()) {
ident.parse::<u64>().ok().map(PrereleaseIdent::Numeric)
} else {
Some(PrereleaseIdent::Alpha(ident.to_string()))
};
idents.push(parsed?);
}
Some(idents)
}
};
Some(Self {
major,
minor,
patch,
pre,
})
}
}
impl Ord for Version {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
match self.major.cmp(&other.major) {
std::cmp::Ordering::Equal => {}
ordering => return ordering,
}
match self.minor.cmp(&other.minor) {
std::cmp::Ordering::Equal => {}
ordering => return ordering,
}
match self.patch.cmp(&other.patch) {
std::cmp::Ordering::Equal => {}
ordering => return ordering,
}
match (&self.pre, &other.pre) {
(None, None) => std::cmp::Ordering::Equal,
(None, Some(_)) => std::cmp::Ordering::Greater,
(Some(_), None) => std::cmp::Ordering::Less,
(Some(left), Some(right)) => compare_prerelease_idents(left, right),
}
}
}
impl PartialOrd for Version {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}
pub fn is_newer(latest: &str, current: &str) -> Option<bool> {
let latest = Version::parse(latest)?;
if latest.pre.is_some() {
return Some(false);
}
let current = Version::parse(current)?;
let current = Version {
pre: None,
..current
};
Some(latest > current)
}
pub fn is_up_to_date(latest: &str, current: &str) -> Option<bool> {
let latest = Version::parse(latest)?;
if latest.pre.is_some() {
return Some(true);
}
let current = Version::parse(current)?;
let current = Version {
pre: None,
..current
};
Some(current >= latest)
}
pub fn read_version_info(version_file: &Path) -> anyhow::Result<VersionInfo> {
let contents = std::fs::read_to_string(version_file)?;
Ok(serde_json::from_str(&contents)?)
}
pub fn read_latest_version(version_file: &Path) -> Option<String> {
read_version_info(version_file)
.ok()
.map(|info| info.latest_version)
}
pub fn extract_version_from_cask(cask_contents: &str) -> anyhow::Result<String> {
cask_contents
.lines()
.find_map(|line| {
let line = line.trim();
line.strip_prefix("version \"")
.and_then(|rest| rest.strip_suffix('"'))
.map(ToString::to_string)
})
.ok_or_else(|| anyhow::anyhow!("Failed to find version in Homebrew cask file"))
}
pub fn extract_version_from_latest_tag(latest_tag_name: &str) -> anyhow::Result<String> {
latest_tag_name
.strip_prefix("rust-v")
.map(str::to_owned)
.ok_or_else(|| anyhow::anyhow!("Failed to parse latest tag name '{latest_tag_name}'"))
}
fn compare_prerelease_idents(
left: &[PrereleaseIdent],
right: &[PrereleaseIdent],
) -> std::cmp::Ordering {
for (l, r) in left.iter().zip(right.iter()) {
let ordering = match (l, r) {
(PrereleaseIdent::Numeric(a), PrereleaseIdent::Numeric(b)) => a.cmp(b),
(PrereleaseIdent::Alpha(a), PrereleaseIdent::Alpha(b)) => a.cmp(b),
(PrereleaseIdent::Numeric(_), PrereleaseIdent::Alpha(_)) => std::cmp::Ordering::Less,
(PrereleaseIdent::Alpha(_), PrereleaseIdent::Numeric(_)) => std::cmp::Ordering::Greater,
};
if ordering != std::cmp::Ordering::Equal {
return ordering;
}
}
left.len().cmp(&right.len())
}
#[cfg(test)]
mod tests {
use pretty_assertions::assert_eq;
use super::*;
#[test]
fn prerelease_current_is_ignored() {
assert_eq!(is_newer("1.2.3", "1.2.3-alpha.1"), Some(false));
assert_eq!(is_up_to_date("1.2.3", "1.2.3-alpha.1"), Some(true));
}
#[test]
fn prerelease_latest_is_ignored() {
assert_eq!(is_newer("1.2.4-alpha.1", "1.2.3"), Some(false));
assert_eq!(is_up_to_date("1.2.4-alpha.1", "1.2.3"), Some(true));
}
#[test]
fn prerelease_latest_is_not_considered_newer() {
assert_eq!(is_newer("0.11.0-beta.1", "0.11.0"), Some(false));
assert_eq!(is_newer("1.0.0-rc.1", "1.0.0"), Some(false));
}
#[test]
fn plain_semver_comparisons_work() {
assert_eq!(is_newer("0.11.1", "0.11.0"), Some(true));
assert_eq!(is_newer("0.11.0", "0.11.1"), Some(false));
assert_eq!(is_newer("1.0.0", "0.9.9"), Some(true));
assert_eq!(is_newer("0.9.9", "1.0.0"), Some(false));
}
#[test]
fn whitespace_is_ignored() {
assert_eq!(Version::parse(" 1.2.3 \n").is_some(), true);
assert_eq!(is_newer(" 1.2.3 ", "1.2.2"), Some(true));
}
#[test]
fn parses_version_from_cask_contents() {
let cask = r#"
cask "codex" do
version "0.55.0"
end
"#;
assert_eq!(
extract_version_from_cask(cask).expect("failed to parse version"),
"0.55.0"
);
}
#[test]
fn extracts_version_from_latest_tag() {
assert_eq!(
extract_version_from_latest_tag("rust-v1.5.0").expect("failed to parse version"),
"1.5.0"
);
}
#[test]
fn latest_tag_without_prefix_is_invalid() {
assert!(extract_version_from_latest_tag("v1.5.0").is_err());
}
}


@@ -22,3 +22,7 @@ tokio = { workspace = true, features = ["time"] }
walkdir = { workspace = true }
wiremock = { workspace = true }
shlex = { workspace = true }
[dev-dependencies]
pretty_assertions = { workspace = true }
reqwest = { workspace = true }


@@ -14,6 +14,7 @@ use std::path::PathBuf;
use assert_cmd::cargo::cargo_bin;
pub mod responses;
pub mod streaming_sse;
pub mod test_codex;
pub mod test_codex_exec;


@@ -0,0 +1,680 @@
use std::collections::VecDeque;
use std::sync::Arc;
use std::time::SystemTime;
use std::time::UNIX_EPOCH;
use tokio::io::AsyncReadExt;
use tokio::io::AsyncWriteExt;
use tokio::net::TcpListener;
use tokio::sync::Mutex as TokioMutex;
use tokio::sync::oneshot;
/// Streaming SSE chunk payload gated by a per-chunk signal.
#[derive(Debug)]
pub struct StreamingSseChunk {
pub gate: Option<oneshot::Receiver<()>>,
pub body: String,
}
/// Minimal streaming SSE server for tests that need gated per-chunk delivery.
pub struct StreamingSseServer {
uri: String,
shutdown: oneshot::Sender<()>,
task: tokio::task::JoinHandle<()>,
}
impl StreamingSseServer {
pub fn uri(&self) -> &str {
&self.uri
}
pub async fn shutdown(self) {
let _ = self.shutdown.send(());
let _ = self.task.await;
}
}
/// Starts a lightweight HTTP server that supports:
/// - GET /v1/models -> empty models response
/// - POST /v1/responses -> SSE stream gated per-chunk, served in order
///
/// Returns the server handle and a list of receivers that fire when each
/// response stream finishes sending its final chunk.
pub async fn start_streaming_sse_server(
responses: Vec<Vec<StreamingSseChunk>>,
) -> (StreamingSseServer, Vec<oneshot::Receiver<i64>>) {
let listener = TcpListener::bind("127.0.0.1:0")
.await
.expect("bind streaming SSE server");
let addr = listener.local_addr().expect("streaming SSE server address");
let uri = format!("http://{addr}");
let mut completion_senders = Vec::with_capacity(responses.len());
let mut completion_receivers = Vec::with_capacity(responses.len());
for _ in 0..responses.len() {
let (tx, rx) = oneshot::channel();
completion_senders.push(tx);
completion_receivers.push(rx);
}
let state = Arc::new(TokioMutex::new(StreamingSseState {
responses: VecDeque::from(responses),
completions: VecDeque::from(completion_senders),
}));
let (shutdown_tx, mut shutdown_rx) = oneshot::channel();
let task = tokio::spawn(async move {
loop {
tokio::select! {
_ = &mut shutdown_rx => break,
accept_res = listener.accept() => {
let (mut stream, _) = accept_res.expect("accept streaming SSE connection");
let state = Arc::clone(&state);
tokio::spawn(async move {
let (request, body_prefix) = read_http_request(&mut stream).await;
let Some((method, path)) = parse_request_line(&request) else {
let _ = write_http_response(&mut stream, 400, "bad request", "text/plain").await;
return;
};
if method == "GET" && path == "/v1/models" {
if drain_request_body(&mut stream, &request, body_prefix)
.await
.is_err()
{
let _ = write_http_response(&mut stream, 400, "bad request", "text/plain").await;
return;
}
let body = serde_json::json!({
"data": [],
"object": "list"
})
.to_string();
let _ = write_http_response(&mut stream, 200, &body, "application/json").await;
return;
}
if method == "POST" && path == "/v1/responses" {
if drain_request_body(&mut stream, &request, body_prefix)
.await
.is_err()
{
let _ = write_http_response(&mut stream, 400, "bad request", "text/plain").await;
return;
}
let Some((chunks, completion)) = take_next_stream(&state).await else {
let _ = write_http_response(&mut stream, 500, "no responses queued", "text/plain").await;
return;
};
if write_sse_headers(&mut stream).await.is_err() {
return;
}
for chunk in chunks {
if let Some(gate) = chunk.gate
&& gate.await.is_err() {
return;
}
if stream.write_all(chunk.body.as_bytes()).await.is_err() {
return;
}
let _ = stream.flush().await;
}
let _ = completion.send(unix_ms_now());
let _ = stream.shutdown().await;
return;
}
let _ = write_http_response(&mut stream, 404, "not found", "text/plain").await;
});
}
}
}
});
(
StreamingSseServer {
uri,
shutdown: shutdown_tx,
task,
},
completion_receivers,
)
}
struct StreamingSseState {
responses: VecDeque<Vec<StreamingSseChunk>>,
completions: VecDeque<oneshot::Sender<i64>>,
}
async fn take_next_stream(
state: &TokioMutex<StreamingSseState>,
) -> Option<(Vec<StreamingSseChunk>, oneshot::Sender<i64>)> {
let mut guard = state.lock().await;
let chunks = guard.responses.pop_front()?;
let completion = guard.completions.pop_front()?;
Some((chunks, completion))
}
async fn read_http_request(stream: &mut tokio::net::TcpStream) -> (String, Vec<u8>) {
let mut buf = Vec::new();
let mut scratch = [0u8; 1024];
loop {
let read = stream.read(&mut scratch).await.unwrap_or(0);
if read == 0 {
break;
}
buf.extend_from_slice(&scratch[..read]);
if let Some(end) = header_terminator_index(&buf) {
let header_end = end + 4;
let header = String::from_utf8_lossy(&buf[..header_end]).into_owned();
let rest = buf[header_end..].to_vec();
return (header, rest);
}
}
(String::from_utf8_lossy(&buf).into_owned(), Vec::new())
}
fn parse_request_line(request: &str) -> Option<(&str, &str)> {
let line = request.lines().next()?;
let mut parts = line.split_whitespace();
let method = parts.next()?;
let path = parts.next()?;
Some((method, path))
}
fn header_terminator_index(buf: &[u8]) -> Option<usize> {
buf.windows(4).position(|w| w == b"\r\n\r\n")
}
fn content_length(headers: &str) -> Option<usize> {
headers.lines().skip(1).find_map(|line| {
let mut parts = line.splitn(2, ':');
let name = parts.next()?.trim();
let value = parts.next()?.trim();
if name.eq_ignore_ascii_case("content-length") {
value.parse::<usize>().ok()
} else {
None
}
})
}
async fn drain_request_body(
stream: &mut tokio::net::TcpStream,
headers: &str,
mut body_prefix: Vec<u8>,
) -> std::io::Result<()> {
let Some(content_len) = content_length(headers) else {
return Ok(());
};
if body_prefix.len() > content_len {
body_prefix.truncate(content_len);
}
let remaining = content_len.saturating_sub(body_prefix.len());
if remaining == 0 {
return Ok(());
}
let mut rest = vec![0u8; remaining];
stream.read_exact(&mut rest).await?;
Ok(())
}
async fn write_sse_headers(stream: &mut tokio::net::TcpStream) -> std::io::Result<()> {
let headers = "HTTP/1.1 200 OK\r\ncontent-type: text/event-stream\r\ncache-control: no-cache\r\nconnection: close\r\n\r\n";
stream.write_all(headers.as_bytes()).await
}
async fn write_http_response(
stream: &mut tokio::net::TcpStream,
status: i64,
body: &str,
content_type: &str,
) -> std::io::Result<()> {
// Map the status code to a matching reason phrase instead of hard-coding "OK"
// for every response (the old format string produced e.g. "HTTP/1.1 404 OK").
let reason = match status {
200 => "OK",
400 => "Bad Request",
404 => "Not Found",
500 => "Internal Server Error",
_ => "",
};
let body_len = body.len();
let headers = format!(
"HTTP/1.1 {status} {reason}\r\ncontent-type: {content_type}\r\ncontent-length: {body_len}\r\nconnection: close\r\n\r\n"
);
stream.write_all(headers.as_bytes()).await?;
stream.write_all(body.as_bytes()).await?;
stream.shutdown().await
}
fn unix_ms_now() -> i64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_default()
.as_millis() as i64
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
use reqwest::StatusCode;
use tokio::net::TcpStream;
use tokio::time::Duration;
use tokio::time::timeout;
fn split_response(response: &str) -> (&str, &str) {
response
.split_once("\r\n\r\n")
.expect("response missing header separator")
}
fn status_code(headers: &str) -> u16 {
let line = headers.lines().next().expect("status line");
let mut parts = line.split_whitespace();
let _ = parts.next();
let status = parts.next().expect("status code");
status.parse().expect("parse status code")
}
fn header_value<'a>(headers: &'a str, name: &str) -> Option<&'a str> {
headers.lines().skip(1).find_map(|line| {
let mut parts = line.splitn(2, ':');
let key = parts.next()?.trim();
let value = parts.next()?.trim();
if key.eq_ignore_ascii_case(name) {
Some(value)
} else {
None
}
})
}
async fn connect(uri: &str) -> TcpStream {
let addr = uri.strip_prefix("http://").expect("uri should be http");
TcpStream::connect(addr)
.await
.expect("connect to streaming SSE server")
}
async fn read_to_end(stream: &mut TcpStream) -> String {
let mut buf = Vec::new();
stream.read_to_end(&mut buf).await.expect("read response");
String::from_utf8_lossy(&buf).into_owned()
}
async fn read_until(stream: &mut TcpStream, needle: &str) -> (String, String) {
let mut buf = Vec::new();
let mut scratch = [0u8; 256];
let needle_bytes = needle.as_bytes();
loop {
let read = stream.read(&mut scratch).await.expect("read response");
if read == 0 {
break;
}
buf.extend_from_slice(&scratch[..read]);
if let Some(pos) = buf
.windows(needle_bytes.len())
.position(|window| window == needle_bytes)
{
let end = pos + needle_bytes.len();
let headers = String::from_utf8_lossy(&buf[..end]).into_owned();
let remainder = String::from_utf8_lossy(&buf[end..]).into_owned();
return (headers, remainder);
}
}
(String::from_utf8_lossy(&buf).into_owned(), String::new())
}
async fn send_request(stream: &mut TcpStream, request: &str) {
stream
.write_all(request.as_bytes())
.await
.expect("write request");
}
#[tokio::test]
async fn get_models_returns_empty_list() {
let (server, _) = start_streaming_sse_server(Vec::new()).await;
let mut stream = connect(server.uri()).await;
send_request(
&mut stream,
"GET /v1/models HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n",
)
.await;
let response = read_to_end(&mut stream).await;
let (headers, body) = split_response(&response);
assert_eq!(status_code(headers), 200);
assert_eq!(
header_value(headers, "content-type"),
Some("application/json")
);
let parsed: serde_json::Value = serde_json::from_str(body).expect("parse json body");
assert_eq!(
parsed,
serde_json::json!({
"data": [],
"object": "list"
})
);
server.shutdown().await;
}
#[tokio::test]
async fn post_responses_streams_in_order_and_closes() {
let chunks = vec![
StreamingSseChunk {
gate: None,
body: "event: one\n\n".to_string(),
},
StreamingSseChunk {
gate: None,
body: "event: two\n\n".to_string(),
},
];
let (server, mut completions) = start_streaming_sse_server(vec![chunks]).await;
let mut stream = connect(server.uri()).await;
send_request(
&mut stream,
"POST /v1/responses HTTP/1.1\r\nHost: 127.0.0.1\r\nContent-Length: 0\r\n\r\n",
)
.await;
let response = read_to_end(&mut stream).await;
let (headers, body) = split_response(&response);
assert_eq!(status_code(headers), 200);
assert_eq!(
header_value(headers, "content-type"),
Some("text/event-stream")
);
assert_eq!(body, "event: one\n\nevent: two\n\n");
let mut extra = [0u8; 1];
let read = stream.read(&mut extra).await.expect("read after eof");
assert_eq!(read, 0);
let completion = completions.pop().expect("completion receiver");
let timestamp = completion.await.expect("completion timestamp");
assert!(timestamp > 0);
server.shutdown().await;
}
#[tokio::test]
async fn none_gate_streams_immediately() {
let chunks = vec![StreamingSseChunk {
gate: None,
body: "event: immediate\n\n".to_string(),
}];
let (server, _) = start_streaming_sse_server(vec![chunks]).await;
let mut stream = connect(server.uri()).await;
send_request(
&mut stream,
"POST /v1/responses HTTP/1.1\r\nHost: 127.0.0.1\r\nContent-Length: 0\r\n\r\n",
)
.await;
let (headers, remainder) = read_until(&mut stream, "\r\n\r\n").await;
let (headers, _) = split_response(&headers);
assert_eq!(status_code(headers), 200);
let immediate = format!("{remainder}{}", read_to_end(&mut stream).await);
assert_eq!(immediate, "event: immediate\n\n");
server.shutdown().await;
}
#[tokio::test]
async fn post_responses_with_no_queue_returns_500() {
let (server, _) = start_streaming_sse_server(Vec::new()).await;
let mut stream = connect(server.uri()).await;
send_request(
&mut stream,
"POST /v1/responses HTTP/1.1\r\nHost: 127.0.0.1\r\nContent-Length: 0\r\n\r\n",
)
.await;
let response = read_to_end(&mut stream).await;
let (headers, body) = split_response(&response);
assert_eq!(status_code(headers), 500);
assert_eq!(header_value(headers, "content-type"), Some("text/plain"));
assert_eq!(body, "no responses queued");
server.shutdown().await;
}
#[tokio::test]
async fn gated_chunks_wait_for_signal_and_preserve_order() {
let (gate_one_tx, gate_one_rx) = oneshot::channel();
let (gate_two_tx, gate_two_rx) = oneshot::channel();
let chunks = vec![
StreamingSseChunk {
gate: Some(gate_one_rx),
body: "event: one\n\n".to_string(),
},
StreamingSseChunk {
gate: Some(gate_two_rx),
body: "event: two\n\n".to_string(),
},
];
let (server, _) = start_streaming_sse_server(vec![chunks]).await;
let mut stream = connect(server.uri()).await;
send_request(
&mut stream,
"POST /v1/responses HTTP/1.1\r\nHost: 127.0.0.1\r\nContent-Length: 0\r\n\r\n",
)
.await;
let (headers, remainder) = read_until(&mut stream, "\r\n\r\n").await;
let (headers, _) = split_response(&headers);
assert_eq!(status_code(headers), 200);
assert_eq!(
header_value(headers, "content-type"),
Some("text/event-stream")
);
assert!(
remainder.is_empty(),
"unexpected body before gate: {remainder:?}"
);
let mut scratch = [0u8; 32];
let pending = timeout(Duration::from_millis(200), stream.read(&mut scratch)).await;
assert!(pending.is_err());
let _ = gate_one_tx.send(());
let mut first_chunk = vec![0u8; "event: one\n\n".len()];
stream
.read_exact(&mut first_chunk)
.await
.expect("read first chunk");
assert_eq!(String::from_utf8_lossy(&first_chunk), "event: one\n\n");
let pending = timeout(Duration::from_millis(200), stream.read(&mut scratch)).await;
assert!(pending.is_err());
let _ = gate_two_tx.send(());
let remaining = read_to_end(&mut stream).await;
assert_eq!(remaining, "event: two\n\n");
server.shutdown().await;
}
#[tokio::test]
async fn multiple_responses_are_fifo_and_completion_timestamps_monotonic() {
let first_chunks = vec![StreamingSseChunk {
gate: None,
body: "event: first\n\n".to_string(),
}];
let second_chunks = vec![StreamingSseChunk {
gate: None,
body: "event: second\n\n".to_string(),
}];
let (server, mut completions) =
start_streaming_sse_server(vec![first_chunks, second_chunks]).await;
let mut first_stream = connect(server.uri()).await;
send_request(
&mut first_stream,
"POST /v1/responses HTTP/1.1\r\nHost: 127.0.0.1\r\nContent-Length: 0\r\n\r\n",
)
.await;
let first_response = read_to_end(&mut first_stream).await;
let (_, first_body) = split_response(&first_response);
assert_eq!(first_body, "event: first\n\n");
let mut second_stream = connect(server.uri()).await;
send_request(
&mut second_stream,
"POST /v1/responses HTTP/1.1\r\nHost: 127.0.0.1\r\nContent-Length: 0\r\n\r\n",
)
.await;
let second_response = read_to_end(&mut second_stream).await;
let (_, second_body) = split_response(&second_response);
assert_eq!(second_body, "event: second\n\n");
let first_completion = completions.remove(0);
let second_completion = completions.remove(0);
let first_timestamp = first_completion.await.expect("first completion");
let second_timestamp = second_completion.await.expect("second completion");
assert!(first_timestamp > 0);
assert!(second_timestamp > 0);
assert!(first_timestamp <= second_timestamp);
assert!(completions.is_empty());
server.shutdown().await;
}
#[tokio::test]
async fn unknown_route_returns_404() {
let (server, _) = start_streaming_sse_server(Vec::new()).await;
let mut stream = connect(server.uri()).await;
send_request(
&mut stream,
"GET /v1/unknown HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n",
)
.await;
let response = read_to_end(&mut stream).await;
let (headers, body) = split_response(&response);
assert_eq!(status_code(headers), 404);
assert_eq!(header_value(headers, "content-type"), Some("text/plain"));
assert_eq!(body, "not found");
server.shutdown().await;
}
#[tokio::test]
async fn malformed_request_returns_400() {
let (server, _) = start_streaming_sse_server(Vec::new()).await;
let mut stream = connect(server.uri()).await;
send_request(&mut stream, "BAD\r\n\r\n").await;
let response = read_to_end(&mut stream).await;
let (headers, body) = split_response(&response);
assert_eq!(status_code(headers), 400);
assert_eq!(header_value(headers, "content-type"), Some("text/plain"));
assert_eq!(body, "bad request");
server.shutdown().await;
}
#[tokio::test]
async fn responses_post_drains_request_body() {
let response_body = r#"event: response.completed
data: {"type":"response.completed","response":{"id":"resp-1"}}
"#;
let (server, mut completions) = start_streaming_sse_server(vec![vec![StreamingSseChunk {
gate: None,
body: response_body.to_string(),
}]])
.await;
let url = format!("{}/v1/responses", server.uri());
let payload = serde_json::json!({
"model": "gpt-5.1",
"instructions": "test",
"input": [{"type": "message", "role": "user", "content": [{"type": "input_text", "text": "hello"}]}],
"stream": true
});
let resp = reqwest::Client::new()
.post(url)
.json(&payload)
.send()
.await
.expect("send request");
assert_eq!(resp.status(), StatusCode::OK);
let bytes = resp.bytes().await.expect("read response body");
assert_eq!(bytes, response_body.as_bytes());
let completion = completions.remove(0);
let completed_at = completion.await.expect("completion timestamp");
assert!(completed_at > 0);
server.shutdown().await;
}
#[tokio::test]
async fn read_http_request_returns_after_header_terminator() {
let listener = TcpListener::bind("127.0.0.1:0")
.await
.expect("bind test listener");
let addr = listener.local_addr().expect("listener address");
let (tx, rx) = oneshot::channel();
let server_task = tokio::spawn(async move {
let (mut stream, _) = listener.accept().await.expect("accept client");
let (request, body) = read_http_request(&mut stream).await;
let _ = tx.send((request, body));
});
let mut client = TcpStream::connect(addr)
.await
.expect("connect to test listener");
let request = "GET / HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n";
client
.write_all(request.as_bytes())
.await
.expect("write request");
let (received, body) = timeout(Duration::from_millis(200), rx)
.await
.expect("read_http_request timed out")
.expect("receive request");
assert_eq!(received, request);
assert!(body.is_empty());
drop(client);
let _ = server_task.await;
}
#[test]
fn parse_request_line_handles_valid_and_invalid() {
assert_eq!(parse_request_line(""), None);
assert_eq!(parse_request_line("BAD"), None);
assert_eq!(
parse_request_line("GET /v1/models HTTP/1.1"),
Some(("GET", "/v1/models"))
);
}
#[tokio::test]
async fn take_next_stream_consumes_in_lockstep() {
let (first_tx, first_rx) = oneshot::channel();
let (second_tx, second_rx) = oneshot::channel();
let state = TokioMutex::new(StreamingSseState {
responses: VecDeque::from(vec![
vec![StreamingSseChunk {
gate: None,
body: "first".to_string(),
}],
vec![StreamingSseChunk {
gate: None,
body: "second".to_string(),
}],
]),
completions: VecDeque::from(vec![first_tx, second_tx]),
});
let (first_chunks, first_completion) =
take_next_stream(&state).await.expect("first stream");
assert_eq!(first_chunks[0].body, "first");
let _ = first_completion.send(11);
assert_eq!(first_rx.await.expect("first completion"), 11);
let (second_chunks, second_completion) =
take_next_stream(&state).await.expect("second stream");
assert_eq!(second_chunks[0].body, "second");
let _ = second_completion.send(22);
assert_eq!(second_rx.await.expect("second completion"), 22);
let third = take_next_stream(&state).await;
assert!(third.is_none());
}
#[tokio::test]
async fn shutdown_terminates_accept_loop() {
let (server, _) = start_streaming_sse_server(Vec::new()).await;
let shutdown = timeout(Duration::from_millis(200), server.shutdown()).await;
assert!(shutdown.is_ok());
}
}

View File

@@ -25,6 +25,7 @@ use wiremock::MockServer;
use crate::load_default_config_for_test;
use crate::responses::get_responses_request_bodies;
use crate::responses::start_mock_server;
use crate::streaming_sse::StreamingSseServer;
use crate::wait_for_event;
type ConfigMutator = dyn FnOnce(&mut Config) + Send;
@@ -89,6 +90,16 @@ impl TestCodexBuilder {
self.build_with_home(server, home, None).await
}
pub async fn build_with_streaming_server(
&mut self,
server: &StreamingSseServer,
) -> anyhow::Result<TestCodex> {
let base_url = server.uri();
let home = Arc::new(TempDir::new()?);
self.build_with_home_and_base_url(format!("{base_url}/v1"), home, None)
.await
}
pub async fn resume(
&mut self,
server: &wiremock::MockServer,
@@ -104,8 +115,28 @@ impl TestCodexBuilder {
home: Arc<TempDir>,
resume_from: Option<PathBuf>,
) -> anyhow::Result<TestCodex> {
let (config, cwd) = self.prepare_config(server, &home).await?;
let base_url = format!("{}/v1", server.uri());
let (config, cwd) = self.prepare_config(base_url, &home).await?;
self.build_from_config(config, cwd, home, resume_from).await
}
async fn build_with_home_and_base_url(
&mut self,
base_url: String,
home: Arc<TempDir>,
resume_from: Option<PathBuf>,
) -> anyhow::Result<TestCodex> {
let (config, cwd) = self.prepare_config(base_url, &home).await?;
self.build_from_config(config, cwd, home, resume_from).await
}
async fn build_from_config(
&mut self,
config: Config,
cwd: Arc<TempDir>,
home: Arc<TempDir>,
resume_from: Option<PathBuf>,
) -> anyhow::Result<TestCodex> {
let auth = self.auth.clone();
let conversation_manager = ConversationManager::with_models_provider_and_home(
auth.clone(),
@@ -139,11 +170,11 @@ impl TestCodexBuilder {
async fn prepare_config(
&mut self,
server: &wiremock::MockServer,
base_url: String,
home: &TempDir,
) -> anyhow::Result<(Config, Arc<TempDir>)> {
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
base_url: Some(base_url),
..built_in_model_providers()["openai"].clone()
};
let cwd = Arc::new(TempDir::new()?);

View File

@@ -1,6 +1,7 @@
#![cfg(not(target_os = "windows"))]
#![allow(clippy::unwrap_used)]
use std::fs;
use std::time::Duration;
use std::time::Instant;
@@ -13,16 +14,22 @@ use codex_protocol::user_input::UserInput;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::ev_shell_command_call_with_args;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::streaming_sse::StreamingSseChunk;
use core_test_support::streaming_sse::start_streaming_sse_server;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use pretty_assertions::assert_eq;
use serde_json::Value;
use serde_json::json;
use tokio::sync::oneshot;
async fn run_turn(test: &TestCodex, prompt: &str) -> anyhow::Result<()> {
let session_model = test.session_configured.model.clone();
@@ -280,3 +287,123 @@ async fn tool_results_grouped() -> anyhow::Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn shell_tools_start_before_response_completed_when_stream_delayed() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let output_file = tempfile::NamedTempFile::new()?;
let output_path = output_file.path();
let first_response_id = "resp-1";
let second_response_id = "resp-2";
let command = format!(
"perl -MTime::HiRes -e 'print int(Time::HiRes::time()*1000), \"\\n\"' >> \"{}\"",
output_path.display()
);
let args = json!({
"command": command,
"timeout_ms": 1_000,
});
let first_chunk = sse(vec![
ev_response_created(first_response_id),
ev_shell_command_call_with_args("call-1", &args),
ev_shell_command_call_with_args("call-2", &args),
ev_shell_command_call_with_args("call-3", &args),
ev_shell_command_call_with_args("call-4", &args),
]);
let second_chunk = sse(vec![ev_completed(first_response_id)]);
let follow_up = sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed(second_response_id),
]);
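// Each gate is a oneshot that releases the corresponding SSE chunk, letting the test control stream timing.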
let (first_gate_tx, first_gate_rx) = oneshot::channel();
let (completion_gate_tx, completion_gate_rx) = oneshot::channel();
let (follow_up_gate_tx, follow_up_gate_rx) = oneshot::channel();
let (streaming_server, completion_receivers) = start_streaming_sse_server(vec![
vec![
StreamingSseChunk {
gate: Some(first_gate_rx),
body: first_chunk,
},
StreamingSseChunk {
gate: Some(completion_gate_rx),
body: second_chunk,
},
],
vec![StreamingSseChunk {
gate: Some(follow_up_gate_rx),
body: follow_up,
}],
])
.await;
let mut builder = test_codex().with_model("gpt-5.1");
let test = builder
.build_with_streaming_server(&streaming_server)
.await?;
let session_model = test.session_configured.model.clone();
test.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "stream delayed completion".into(),
}],
final_output_json_schema: None,
cwd: test.cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let _ = first_gate_tx.send(());
let _ = follow_up_gate_tx.send(());
let timestamps = tokio::time::timeout(Duration::from_secs(1), async {
loop {
let contents = fs::read_to_string(output_path)?;
let timestamps = contents
.lines()
.filter(|line| !line.trim().is_empty())
.map(|line| {
line.trim()
.parse::<i64>()
.map_err(|err| anyhow::anyhow!("invalid timestamp {line:?}: {err}"))
})
.collect::<Result<Vec<_>, _>>()?;
if timestamps.len() == 4 {
return Ok::<_, anyhow::Error>(timestamps);
}
tokio::time::sleep(Duration::from_millis(10)).await;
}
})
.await??;
let _ = completion_gate_tx.send(());
wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let mut completion_iter = completion_receivers.into_iter();
let completed_at = completion_iter
.next()
.expect("completion receiver missing")
.await
.expect("completion timestamp missing");
let count = i64::try_from(timestamps.len()).expect("timestamp count fits in i64");
assert_eq!(count, 4);
for timestamp in timestamps {
assert!(
timestamp < completed_at,
"timestamp {timestamp} should be before completed {completed_at}"
);
}
streaming_server.shutdown().await;
Ok(())
}

View File

@@ -61,4 +61,3 @@ exec_server_test_support = { workspace = true }
maplit = { workspace = true }
pretty_assertions = { workspace = true }
tempfile = { workspace = true }
which = { workspace = true }

View File

@@ -51,7 +51,10 @@ pub(crate) async fn run(file: String, argv: Vec<String>) -> anyhow::Result<i32>
})
.await
.context("failed to send EscalateRequest")?;
let message = client.receive::<EscalateResponse>().await?;
let message = client
.receive::<EscalateResponse>()
.await
.context("failed to receive EscalateResponse")?;
match message.action {
EscalateAction::Escalate => {
// TODO: maybe we should send ALL open FDs (except the escalate client)?

View File

@@ -24,6 +24,7 @@ use serde_json::json;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::fs::symlink;
use tempfile::TempDir;
use tokio::process::Command;
/// Verify that when using a read-only sandbox and an execpolicy that prompts,
/// the proper elicitation is sent. Upon auto-approving the elicitation, the
@@ -53,11 +54,11 @@ prefix_rule(
// Create an MCP client that approves expected elicitation messages.
let project_root = TempDir::new()?;
let git = which::which("git")?;
let project_root_path = project_root.path().canonicalize().unwrap();
let git_path = resolve_git_path().await?;
let expected_elicitation_message = format!(
"Allow agent to run `{} init .` in `{}`?",
git.display(),
git_path,
project_root_path.display()
);
let elicitation_requests: Arc<Mutex<Vec<CreateElicitationRequestParam>>> = Default::default();
@@ -104,6 +105,7 @@ prefix_rule(
name: Cow::Borrowed("shell"),
arguments: Some(object(json!(
{
"login": false,
"command": "git init .",
"workdir": project_root_path.to_string_lossy(),
}
@@ -174,3 +176,23 @@ fn ensure_codex_cli() -> Result<PathBuf> {
Ok(codex_cli)
}
async fn resolve_git_path() -> Result<String> {
let git = Command::new("bash")
.arg("-lc")
.arg("command -v git")
.output()
.await
.context("failed to resolve git via login shell")?;
ensure!(
git.status.success(),
"failed to resolve git via login shell: {}",
String::from_utf8_lossy(&git.stderr)
);
let git_path = String::from_utf8(git.stdout)
.context("git path was not valid utf8")?
.trim()
.to_string();
ensure!(!git_path.is_empty(), "git path should not be empty");
Ok(git_path)
}

View File

@@ -118,15 +118,6 @@ use crate::text_formatting::truncate_text;
use crate::tui::FrameRequester;
mod interrupts;
use self::interrupts::InterruptManager;
#[cfg(test)]
use crate::version::CODEX_CLI_VERSION;
#[cfg(test)]
use codex_core::version::VERSION_FILENAME;
#[cfg(test)]
use codex_core::version::is_newer;
#[cfg(test)]
use codex_core::version::read_version_info;
mod agent;
use self::agent::spawn_agent;
use self::agent::spawn_agent_from_existing;
@@ -685,45 +676,7 @@ impl ChatWidget {
self.model_family.clone()
}
fn maybe_append_update_nudge(&self, message: String) -> String {
if !self.should_show_update_nudge() {
return message;
}
let nudge = crate::update_action::update_available_nudge();
if message.is_empty() {
nudge
} else {
format!("{message}\n{nudge}")
}
}
#[cfg(not(debug_assertions))]
fn should_show_update_nudge(&self) -> bool {
if env!("CARGO_PKG_VERSION") == "0.0.0" {
return false;
}
crate::updates::get_upgrade_version(&self.config).is_some()
}
#[cfg(test)]
fn should_show_update_nudge(&self) -> bool {
if !self.config.check_for_update_on_startup {
return false;
}
let version_file = self.config.codex_home.join(VERSION_FILENAME);
read_version_info(&version_file)
.ok()
.and_then(|info| is_newer(&info.latest_version, CODEX_CLI_VERSION))
.unwrap_or(false)
}
#[cfg(all(debug_assertions, not(test)))]
fn should_show_update_nudge(&self) -> bool {
false
}
fn on_error(&mut self, message: String) {
let message = self.maybe_append_update_nudge(message);
self.finalize_turn();
self.add_to_history(history_cell::new_error_event(message));
self.request_redraw();
@@ -1002,7 +955,6 @@ impl ChatWidget {
}
fn on_stream_error(&mut self, message: String) {
let message = self.maybe_append_update_nudge(message);
if self.retry_status_header.is_none() {
self.retry_status_header = Some(self.current_status_header.clone());
}

View File

@@ -1,6 +0,0 @@
---
source: tui/src/chatwidget/tests.rs
expression: last
---
■ Something failed.
Update available. See https://github.com/openai/codex for installation options.

View File

@@ -1,6 +0,0 @@
---
source: tui/src/chatwidget/tests.rs
expression: status.header()
---
Reconnecting... 2/5
Update available. See https://github.com/openai/codex for installation options.

View File

@@ -4,7 +4,6 @@ use crate::app_event_sender::AppEventSender;
use crate::test_backend::VT100Backend;
use crate::tui::FrameRequester;
use assert_matches::assert_matches;
use chrono::Utc;
use codex_common::approval_presets::builtin_approval_presets;
use codex_core::AuthManager;
use codex_core::CodexAuth;
@@ -19,7 +18,6 @@ use codex_core::protocol::AgentReasoningEvent;
use codex_core::protocol::ApplyPatchApprovalRequestEvent;
use codex_core::protocol::BackgroundEventEvent;
use codex_core::protocol::CreditsSnapshot;
use codex_core::protocol::ErrorEvent;
use codex_core::protocol::Event;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecApprovalRequestEvent;
@@ -51,8 +49,6 @@ use codex_core::protocol::UndoCompletedEvent;
use codex_core::protocol::UndoStartedEvent;
use codex_core::protocol::ViewImageToolCallEvent;
use codex_core::protocol::WarningEvent;
use codex_core::version::VERSION_FILENAME;
use codex_core::version::VersionInfo;
use codex_protocol::ConversationId;
use codex_protocol::account::PlanType;
use codex_protocol::openai_models::ModelPreset;
@@ -68,7 +64,6 @@ use crossterm::event::KeyEvent;
use crossterm::event::KeyModifiers;
use insta::assert_snapshot;
use pretty_assertions::assert_eq;
use serde_json;
use std::collections::HashSet;
use std::path::PathBuf;
use tempfile::NamedTempFile;
@@ -497,24 +492,6 @@ fn lines_to_single_string(lines: &[ratatui::text::Line<'static>]) -> String {
s
}
fn set_update_available(config: &mut Config) -> tempfile::TempDir {
let codex_home = tempdir().expect("tempdir");
config.codex_home = codex_home.path().to_path_buf();
config.check_for_update_on_startup = true;
let info = VersionInfo {
latest_version: "9999.0.0".to_string(),
last_checked_at: Utc::now(),
dismissed_version: None,
};
let json_line = format!(
"{}\n",
serde_json::to_string(&info).expect("serialize version info")
);
std::fs::write(codex_home.path().join(VERSION_FILENAME), json_line)
.expect("write version info");
codex_home
}
fn make_token_info(total_tokens: i64, context_window: i64) -> TokenUsageInfo {
fn usage(total_tokens: i64) -> TokenUsage {
TokenUsage {
@@ -2947,7 +2924,6 @@ fn plan_update_renders_history_cell() {
#[test]
fn stream_error_updates_status_indicator() {
let (mut chat, mut rx, _op_rx) = make_chatwidget_manual(None);
let _tempdir = set_update_available(&mut chat.config);
chat.bottom_pane.set_task_running(true);
let msg = "Reconnecting... 2/5";
chat.handle_codex_event(Event {
@@ -2967,10 +2943,7 @@ fn stream_error_updates_status_indicator() {
.bottom_pane
.status_widget()
.expect("status indicator should be visible");
let nudge = crate::update_action::update_available_nudge();
let expected = format!("{msg}\n{nudge}");
assert_eq!(status.header(), expected);
assert_snapshot!("stream_error_status_header", status.header());
assert_eq!(status.header(), msg);
}
#[test]
@@ -2992,23 +2965,6 @@ fn warning_event_adds_warning_history_cell() {
);
}
#[test]
fn error_event_renders_history_snapshot() {
let (mut chat, mut rx, _op_rx) = make_chatwidget_manual(None);
let _tempdir = set_update_available(&mut chat.config);
chat.handle_codex_event(Event {
id: "sub-1".into(),
msg: EventMsg::Error(ErrorEvent {
message: "Something failed.".to_string(),
codex_error_info: Some(CodexErrorInfo::Other),
}),
});
let cells = drain_insert_history(&mut rx);
let last = lines_to_single_string(cells.last().expect("error history cell"));
assert_snapshot!("error_event_history", last);
}
#[test]
fn stream_recovery_restores_previous_status_header() {
let (mut chat, mut rx, _op_rx) = make_chatwidget_manual(None);

View File

@@ -78,7 +78,7 @@ mod text_formatting;
mod tooltips;
mod tui;
mod ui_consts;
pub use codex_core::update_action;
pub mod update_action;
mod update_prompt;
mod updates;
mod version;

View File

@@ -1,6 +1,4 @@
#[cfg(any(not(debug_assertions), test))]
use std::path::Path;
/// Update action the CLI should perform after the TUI exits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpdateAction {
/// Update via `npm install -g @openai/codex@latest`.
@@ -30,7 +28,7 @@ impl UpdateAction {
}
#[cfg(not(debug_assertions))]
pub fn get_update_action() -> Option<UpdateAction> {
pub(crate) fn get_update_action() -> Option<UpdateAction> {
let exe = std::env::current_exe().unwrap_or_default();
let managed_by_npm = std::env::var_os("CODEX_MANAGED_BY_NPM").is_some();
let managed_by_bun = std::env::var_os("CODEX_MANAGED_BY_BUN").is_some();
@@ -43,27 +41,10 @@ pub fn get_update_action() -> Option<UpdateAction> {
)
}
#[cfg(debug_assertions)]
pub fn get_update_action() -> Option<UpdateAction> {
None
}
/// Returns the standard update-available message for clients to display.
pub fn update_available_nudge() -> String {
match get_update_action() {
Some(action) => {
let command = action.command_str();
format!("Update available. Run `{command}` to update.")
}
None => "Update available. See https://github.com/openai/codex for installation options."
.to_string(),
}
}
#[cfg(any(not(debug_assertions), test))]
fn detect_update_action(
is_macos: bool,
current_exe: &Path,
current_exe: &std::path::Path,
managed_by_npm: bool,
managed_by_bun: bool,
) -> Option<UpdateAction> {
@@ -87,23 +68,33 @@ mod tests {
#[test]
fn detects_update_action_without_env_mutation() {
assert_eq!(
detect_update_action(false, Path::new("/any/path"), false, false),
detect_update_action(false, std::path::Path::new("/any/path"), false, false),
None
);
assert_eq!(
detect_update_action(false, Path::new("/any/path"), true, false),
detect_update_action(false, std::path::Path::new("/any/path"), true, false),
Some(UpdateAction::NpmGlobalLatest)
);
assert_eq!(
detect_update_action(false, Path::new("/any/path"), false, true),
detect_update_action(false, std::path::Path::new("/any/path"), false, true),
Some(UpdateAction::BunGlobalLatest)
);
assert_eq!(
detect_update_action(true, Path::new("/opt/homebrew/bin/codex"), false, false),
detect_update_action(
true,
std::path::Path::new("/opt/homebrew/bin/codex"),
false,
false
),
Some(UpdateAction::BrewUpgrade)
);
assert_eq!(
detect_update_action(true, Path::new("/usr/local/bin/codex"), false, false),
detect_update_action(
true,
std::path::Path::new("/usr/local/bin/codex"),
false,
false
),
Some(UpdateAction::BrewUpgrade)
);
}

View File

@@ -2,17 +2,13 @@
use crate::update_action;
use crate::update_action::UpdateAction;
use chrono::DateTime;
use chrono::Duration;
use chrono::Utc;
use codex_core::config::Config;
use codex_core::default_client::create_client;
use codex_core::version::VERSION_FILENAME;
use codex_core::version::VersionInfo;
use codex_core::version::extract_version_from_cask;
use codex_core::version::extract_version_from_latest_tag;
use codex_core::version::is_newer;
use codex_core::version::read_version_info;
use serde::Deserialize;
use serde::Serialize;
use std::path::Path;
use std::path::PathBuf;
@@ -49,6 +45,16 @@ pub fn get_upgrade_version(config: &Config) -> Option<String> {
})
}
#[derive(Serialize, Deserialize, Debug, Clone)]
struct VersionInfo {
latest_version: String,
// ISO-8601 timestamp (RFC3339)
last_checked_at: DateTime<Utc>,
#[serde(default)]
dismissed_version: Option<String>,
}
const VERSION_FILENAME: &str = "version.json";
// We use the latest version from the cask if installation is via homebrew - homebrew does not immediately pick up the latest release and can lag behind.
const HOMEBREW_CASK_URL: &str =
"https://raw.githubusercontent.com/Homebrew/homebrew-cask/HEAD/Casks/c/codex.rb";
@@ -63,6 +69,11 @@ fn version_filepath(config: &Config) -> PathBuf {
config.codex_home.join(VERSION_FILENAME)
}
fn read_version_info(version_file: &Path) -> anyhow::Result<VersionInfo> {
let contents = std::fs::read_to_string(version_file)?;
Ok(serde_json::from_str(&contents)?)
}
async fn check_for_update(version_file: &Path) -> anyhow::Result<()> {
let latest_version = match update_action::get_update_action() {
Some(UpdateAction::BrewUpgrade) => {
@@ -105,6 +116,32 @@ async fn check_for_update(version_file: &Path) -> anyhow::Result<()> {
Ok(())
}
fn is_newer(latest: &str, current: &str) -> Option<bool> {
match (parse_version(latest), parse_version(current)) {
(Some(l), Some(c)) => Some(l > c),
_ => None,
}
}
fn extract_version_from_cask(cask_contents: &str) -> anyhow::Result<String> {
cask_contents
.lines()
.find_map(|line| {
let line = line.trim();
line.strip_prefix("version \"")
.and_then(|rest| rest.strip_suffix('"'))
.map(ToString::to_string)
})
.ok_or_else(|| anyhow::anyhow!("Failed to find version in Homebrew cask file"))
}
fn extract_version_from_latest_tag(latest_tag_name: &str) -> anyhow::Result<String> {
latest_tag_name
.strip_prefix("rust-v")
.map(str::to_owned)
.ok_or_else(|| anyhow::anyhow!("Failed to parse latest tag name '{latest_tag_name}'"))
}
/// Returns the latest version to show in a popup, if it should be shown.
/// This respects the user's dismissal choice for the current latest version.
pub fn get_upgrade_version_for_popup(config: &Config) -> Option<String> {
@@ -140,14 +177,48 @@ pub async fn dismiss_version(config: &Config, version: &str) -> anyhow::Result<(
Ok(())
}
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
let mut iter = v.trim().split('.');
let maj = iter.next()?.parse::<u64>().ok()?;
let min = iter.next()?.parse::<u64>().ok()?;
let pat = iter.next()?.parse::<u64>().ok()?;
Some((maj, min, pat))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parses_version_from_cask_contents() {
let cask = r#"
cask "codex" do
version "0.55.0"
end
"#;
assert_eq!(
extract_version_from_cask(cask).expect("failed to parse version"),
"0.55.0"
);
}
#[test]
fn extracts_version_from_latest_tag() {
assert_eq!(
extract_version_from_latest_tag("rust-v1.5.0").expect("failed to parse version"),
"1.5.0"
);
}
#[test]
fn latest_tag_without_prefix_is_invalid() {
assert!(extract_version_from_latest_tag("v1.5.0").is_err());
}
#[test]
fn prerelease_version_is_not_considered_newer() {
assert_eq!(is_newer("0.11.0-beta.1", "0.11.0"), Some(false));
assert_eq!(is_newer("1.0.0-rc.1", "1.0.0"), Some(false));
assert_eq!(is_newer("0.11.0-beta.1", "0.11.0"), None);
assert_eq!(is_newer("1.0.0-rc.1", "1.0.0"), None);
}
#[test]
@@ -160,6 +231,7 @@ mod tests {
#[test]
fn whitespace_is_ignored() {
assert_eq!(parse_version(" 1.2.3 \n"), Some((1, 2, 3)));
assert_eq!(is_newer(" 1.2.3 ", "1.2.2"), Some(true));
}
}

View File

@@ -106,8 +106,8 @@ arboard = { workspace = true }
[dev-dependencies]
codex-core = { workspace = true, features = ["test-support"] }
assert_matches = { workspace = true }
codex-core = { workspace = true, features = ["test-support"] }
chrono = { workspace = true, features = ["serde"] }
insta = { workspace = true }
pretty_assertions = { workspace = true }

View File

@@ -0,0 +1,85 @@
# Streaming Markdown Wrapping & Animation: TUI2 Notes
This document mirrors the original `tui/streaming_wrapping_design.md` and
captures how the same concerns apply to the new `tui2` crate. It exists so that
future viewport and streaming work in TUI2 can rely on the same context without
having to cross-reference the legacy TUI implementation.
At a high level, the design constraints are the same:
- Streaming agent responses are rendered incrementally, with an animation loop
that reveals content over time.
- Non-streaming history cells are rendered width-agnostically and wrapped only
at display time, so they reflow correctly when the terminal is resized.
- Streaming content should eventually follow the same “wrap on display” model so
the transcript reflows consistently across width changes, without regressing
animation or markdown semantics.
## 1. Where streaming is implemented in TUI2
TUI2 keeps the streaming pipeline conceptually aligned with the legacy TUI but
in a separate crate:
- `tui2/src/markdown_stream.rs` implements the markdown streaming collector and
animation controller for agent deltas.
- `tui2/src/chatwidget.rs` integrates streamed content into the transcript via
`HistoryCell` implementations.
- `tui2/src/history_cell.rs` provides the concrete history cell types used by
the inline transcript and overlays.
- `tui2/src/wrapping.rs` contains the shared text wrapping utilities used by
both streaming and non-streaming render paths:
- `RtOptions` describes viewport-aware wrapping (width, indents, algorithm).
- `word_wrap_line`, `word_wrap_lines`, and `word_wrap_lines_borrowed` provide
span-aware wrapping that preserves markdown styling and emoji width.
As in the original TUI, the key tension is between:
- **Pre-wrapping streamed content at commit time** (simpler animation, but
baked-in splits that don't reflow), and
- **Deferring wrapping to render time** (better reflow, but requires a more
sophisticated streaming cell model or recomputation on each frame).
## 2. Current behavior and limitations
TUI2 is intentionally conservative for now:
- Streaming responses use the same markdown streaming and wrapping utilities as
the legacy TUI, with width decisions made near the streaming collector.
- The transcript viewport (`App::render_transcript_cells` in
`tui2/src/app.rs`) always uses `word_wrap_lines_borrowed` against the
current `Rect` width, so:
- Non-streaming cells reflow naturally on resize.
- Streamed cells respect whatever wrapping was applied when their lines were
constructed, and may not fully “unwrap” if that work happened at a fixed
width earlier in the pipeline.
This means TUI2 shares the same fundamental limitation documented in the
original design note: streamed paragraphs can retain historical wrap decisions
made at the time they were streamed, even if the viewport later grows wider.
## 3. Design directions (forward-looking)
The options outlined in the legacy document apply here as well:
1. **Keep the current behavior but clarify tests and documentation.**
- Ensure tests in `tui2/src/markdown_stream.rs`, `tui2/src/markdown_render.rs`,
`tui2/src/history_cell.rs`, and `tui2/src/wrapping.rs` encode the current
expectations around streaming, wrapping, and emoji / markdown styling.
2. **Move towards width-agnostic streaming cells.**
- Introduce a dedicated streaming history cell that stores the raw markdown
buffer and lets `HistoryCell::display_lines(width)` perform both markdown
rendering and wrapping based on the current viewport width.
- Keep the commit animation logic expressed in terms of “logical” positions
(e.g., number of tokens or lines committed) rather than pre-wrapped visual
lines at a fixed width.
3. **Hybrid “visual line count” model.**
- Track committed visual lines as a scalar and re-render the streamed prefix
at the current width, revealing only the first `N` visual lines on each
animation tick (a rough sketch of this model follows below).
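As a concrete illustration of option 3, here is a minimal sketch of a streaming
cell that stores the raw buffer plus a single scalar of animation progress. It
uses the `textwrap` crate as a stand-in for the real markdown render/wrap
pipeline; the actual tui2 types are richer than this:

```rust
/// Width-agnostic streaming cell: stores the raw source and a visual-line
/// commit count, never any pre-wrapped lines.
struct StreamingCell {
    raw_markdown: String,
    committed_visual_lines: usize,
}

impl StreamingCell {
    /// Re-wrap at the current width on every call and reveal only the
    /// committed prefix; no wrap decision survives across frames.
    fn display_lines(&self, width: u16) -> Vec<String> {
        textwrap::wrap(&self.raw_markdown, width as usize)
            .into_iter()
            .take(self.committed_visual_lines)
            .map(|line| line.into_owned())
            .collect()
    }

    /// One animation tick: reveal one more visual line if any remain.
    fn tick(&mut self, width: u16) {
        let total = textwrap::wrap(&self.raw_markdown, width as usize).len();
        if self.committed_visual_lines < total {
            self.committed_visual_lines += 1;
        }
    }
}
```

The important property is that only the scalar `committed_visual_lines`
persists across frames, so a resize simply re-wraps the same prefix.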
TUI2 does not yet implement these refactors; it intentionally stays close to
the legacy behavior while the viewport work (scrolling, selection, exit
transcripts) is being ported. This document exists to make that trade-off
explicit for TUI2 and to provide a natural home for any TUI2-specific streaming
wrapping notes as the design evolves.

View File

@@ -0,0 +1,454 @@
# TUI2 Viewport, Transcript, and History Design Notes
This document describes the viewport and history model we are implementing in the new
`codex-rs/tui2` crate. It builds on lessons from the legacy TUI and explains why we moved away
from directly writing history into terminal scrollback.
The target audience is Codex developers and curious contributors who want to understand or
critique how TUI2 owns its viewport, scrollback, and suspend behavior.
Unless stated otherwise, references to “the TUI” in this document mean the TUI2 implementation;
when we mean the legacy TUI specifically, we call it out explicitly.
---
## 1. Problem Overview
Historically, the legacy TUI tried to “cooperate” with the terminal's own scrollback:
- The inline viewport sat somewhere above the bottom of the screen.
- When new history arrived, we tried to insert it directly into the terminal scrollback above the
viewport.
- On certain transitions (e.g. switching sessions, overlays), we cleared and rewrote portions of
the screen from scratch.
This had several failure modes:
- **Terminal-dependent behavior.**
- Different terminals handle scroll regions, clears, and resize semantics differently.
- What looked correct in one terminal could drop or duplicate content in another.
- **Resizes and layout churn.**
- The TUI reacts to resizes, focus changes, and overlay transitions.
- When the viewport moved or its size changed, our attempts to keep scrollback “aligned” with the
in-memory history could go out of sync.
- In practice this meant:
- Some lines were lost or overwritten.
- Others were duplicated or appeared in unexpected places.
- **“Clear and rewrite everything” didn't save us.**
- We briefly tried a strategy of clearing large regions (or the full screen) and re-rendering
history when the layout changed.
- This ran into two issues:
- Terminals treat full clears differently. For example, Terminal.app often leaves the cleared
screen as a “page” at the top of scrollback, some terminals interpret only a subset of the
ANSI clear/scrollback codes, and others (like iTerm2) gate “clear full scrollback” behind
explicit user consent.
- Replaying a long session is expensive and still subject to timing/race conditions with user
output (e.g. shell prompts) when we weren't in alt screen.
The net result: the legacy TUI could not reliably guarantee “the history you see on screen is complete, in
order, and appears exactly once” across terminals, resizes, suspend/resume, and overlay transitions.
---
## 2. Goals
The redesign is guided by a few explicit goals:
1. **Codex, not the terminal, owns the viewport.**
- The in-memory transcript (a list of history entries) is the single source of truth for what's
on screen.
- The TUI decides how to map that transcript into the current viewport; scrollback becomes an
output target, not an extra data structure we try to maintain.
2. **History must be correct, ordered, and never silently dropped.**
- Every logical history cell should either:
- Be visible in the TUI, or
- Have been printed into scrollback as part of a suspend/exit flow.
- We would rather (rarely) duplicate content than risk losing it.
3. **Avoid unnecessary duplication.**
- When emitting history to scrollback (on suspend or exit), print each logical cell's content at
most once.
- Streaming cells are allowed to be “re-seen” as they grow, but finished cells should not keep
reappearing.
4. **Behave sensibly under resizes.**
- TUI rendering should reflow to the current width on every frame.
- History printed to scrollback may have been wrapped at different widths over time; that is
acceptable, but it must not cause missing content or unbounded duplication.
5. **Suspend/alt-screen interaction is predictable.**
- `Ctrl+Z` should:
- Cleanly exit alt screen, if active.
- Print a consistent transcript prefix into normal scrollback.
- Resume with the TUI fully redrawn, without stale artifacts.
---
## 3. New Viewport & Transcript Model
### 3.1 Transcript as a logical sequence of cells
At a high level, the TUI transcript is a list of “cells”, each representing one logical thing in
the conversation:
- A user prompt (with padding and a distinct background).
- An agent response (which may arrive in multiple streaming chunks).
- System or info rows (session headers, migration banners, reasoning summaries, etc.).
Each cell knows how to draw itself for a given width: how many lines it needs, what prefixes to
use, how to style its content. The transcript itself is purely logical:
- It has no scrollback coordinates or terminal state baked into it.
- It can be rerendered for any viewport width.
The TUI's job is to take this logical sequence and decide how much of it fits into the current
viewport, and how it should be wrapped and styled on screen.
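A minimal sketch of this cell model, assuming a deliberately simplified trait
with plain `String` lines (the real tui2 `HistoryCell` works with styled lines
and carries more metadata), using the `textwrap` crate as a stand-in wrapper:

```rust
/// Simplified stand-in for the tui2 history-cell trait.
trait HistoryCell {
    /// Render this cell's lines for the given viewport width.
    fn display_lines(&self, width: u16) -> Vec<String>;
}

struct UserPrompt {
    text: String,
}

impl HistoryCell for UserPrompt {
    fn display_lines(&self, width: u16) -> Vec<String> {
        // Blank padding line above and below; the full-width background
        // styling described above is elided here.
        let mut lines = vec![String::new()];
        lines.extend(
            textwrap::wrap(&self.text, width as usize)
                .into_iter()
                .map(|line| line.into_owned()),
        );
        lines.push(String::new());
        lines
    }
}

/// The transcript is purely logical: no terminal coordinates, and it can be
/// re-rendered at any width.
type Transcript = Vec<Box<dyn HistoryCell>>;
```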
### 3.2 Building viewport lines from the transcript
To render the main transcript area above the composer, the TUI:
1. Defines a “transcript region” as the full frame minus the height of the bottom input area.
2. Flattens all cells into a list of visual lines, remembering for each visual line which cell it
came from and which line within that cell it corresponds to.
3. Uses this flattened list plus a scroll position to decide which visual line should appear at the
top of the region.
4. Clears the transcript region and draws the visible slice of lines into it.
5. For user messages, paints the entire row background (including padding lines) so the user block
stands out even when it does not fill the whole width.
6. Applies selection styling and other overlays on top of the rendered lines.
Scrolling (mouse wheel, PgUp/PgDn, Home/End) operates entirely in terms of these flattened lines
and the current scroll anchor. The terminal's own scrollback is not part of this calculation; it
only ever sees fully rendered frames.
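In sketch form, against the simplified trait from section 3.1, the flattening
and slicing steps look roughly like this (the real implementation tracks styled
lines plus selection metadata):

```rust
/// One wrapped visual line, tagged with the cell it came from.
struct VisualLine {
    cell_index: usize,   // which transcript cell produced this line
    line_in_cell: usize, // which of that cell's lines this is
    text: String,
}

/// Flatten every cell into visual lines at the current width.
fn build_transcript_lines(cells: &[Box<dyn HistoryCell>], width: u16) -> Vec<VisualLine> {
    let mut out = Vec::new();
    for (cell_index, cell) in cells.iter().enumerate() {
        for (line_in_cell, text) in cell.display_lines(width).into_iter().enumerate() {
            out.push(VisualLine { cell_index, line_in_cell, text });
        }
    }
    out
}

/// The scroll state picks which flattened line sits at the top of the region;
/// the viewport then shows at most `region_height` lines from there.
fn visible_slice(lines: &[VisualLine], top: usize, region_height: usize) -> &[VisualLine] {
    let end = (top + region_height).min(lines.len());
    &lines[top.min(end)..end]
}
```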
### 3.3 Alternate screen, overlays, and redraw guarantees
The TUI uses the terminal's alternate screen for:
- The main interactive chat session (so the viewport can cover the full terminal).
- Full-screen overlays such as the transcript pager, diff view, model migration screen, and
onboarding.
Conceptually:
- Entering alt screen:
- Switches the terminal into alt screen and expands the viewport to cover the full terminal.
- Clears that alt-screen buffer.
- Leaving alt screen:
- Disables “alternate scroll” so mouse wheel events behave predictably.
- Returns to the normal screen.
- On leaving overlays and on resuming from suspend, the TUI viewport is explicitly cleared and fully
redrawn:
- This prevents stale overlay content or shell output from lingering in the TUI area.
- The next frame reconstructs the UI entirely from the in-memory transcript and other state, not
from whatever the terminal happened to remember.
Alt screen is therefore treated as a temporary render target. The only authoritative copy of the UI
is the in-memory state.
---
## 4. Mouse, Selection, and Scrolling
Mouse interaction is a first-class part of the new design:
- **Scrolling.**
- Mouse wheel scrolls the transcript in fixed line increments.
- Keyboard shortcuts (PgUp/PgDn/Home/End) use the same scroll model, so the footer can show
consistent hints regardless of input device.
- **Selection.**
- A click-and-drag gesture defines a linear text selection in terms of the flattened transcript
lines (not raw buffer coordinates).
- Selection tracks the _content_ rather than a fixed screen row. When the transcript scrolls, the
selection moves along with the underlying lines instead of staying glued to a particular Y
position.
- The selection only covers the “transcript text” area; it intentionally skips the left gutter
that we use for bullets/prefixes.
- **Copy.**
- When the user triggers copy, the TUI re-renders just the transcript region off-screen using the
same wrapping as the visible view.
- It then walks the selected lines and columns in that off-screen buffer to reconstruct the exact
text region the user highlighted (including internal spaces and empty lines).
- That text is sent to the system clipboard and a status footer indicates success or failure.
Because scrolling, selection, and copy all operate on the same flattened transcript representation,
they remain consistent even as the viewport resizes or the chat composer grows/shrinks. Owning our
own scrolling also means we must own mouse interactions end-to-end: if we left scrolling entirely
to the terminal, we could not reliably line up selections with transcript content or avoid
accidentally copying gutter/margin characters instead of just the conversation text.
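The copy step can be pictured as walking a region of an off-screen character
grid. The sketch below uses a hypothetical `Selection` type (start inclusive,
end column exclusive) and assumes the selection has already been clamped to the
grid; the real code works over styled buffers rather than `char` grids:

```rust
/// Hypothetical selection over flattened transcript lines:
/// (row, col) start is inclusive; the end column is exclusive.
struct Selection {
    start: (usize, usize),
    end: (usize, usize),
}

/// Reconstruct the selected text, preserving internal spaces and empty lines.
fn extract_selection(grid: &[Vec<char>], sel: &Selection) -> String {
    let mut out = String::new();
    for row in sel.start.0..=sel.end.0 {
        let line = &grid[row];
        let from = if row == sel.start.0 { sel.start.1 } else { 0 };
        let to = if row == sel.end.0 { sel.end.1.min(line.len()) } else { line.len() };
        out.extend(line[from.min(to)..to].iter());
        if row != sel.end.0 {
            out.push('\n');
        }
    }
    out
}
```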
---
## 5. Printing History to Scrollback
We still want the final session (and suspend points) to appear in the user's normal scrollback, but
we no longer try to maintain scrollback in lockstep with the TUI frame. Instead, we treat
scrollback as an **append-only log** of logical transcript cells.
In practice this means:
- The TUI may print history both when you suspend (`Ctrl+Z`) and when you exit.
- Some users may prefer to only print on exit (for example to keep scrollback quieter during long
sessions). The current design anticipates gating suspend-time printing behind a config toggle so
that this behavior can be made opt-in or opt-out without touching the core viewport logic, but
that switch has not been implemented yet.
### 5.1 Cell-based high-water mark
Internally, the TUI keeps a simple “high-water mark” for history printing:
- Think of this as “how many cells at the front of the transcript have already been sent to
scrollback.”
- It is just a counter over the logical transcript, not over wrapped lines.
- It moves forward only when we have actually printed more history.
This means we never try to guess “how many terminal lines have already been printed”; we only
remember that “the first N logical entries are done.”
### 5.2 Rendering new cells for scrollback
When we need to print history (on suspend or exit), we:
1. Take the suffix of the transcript that lies beyond the high-water mark.
2. Render just that suffix into styled lines at the **current** terminal width.
3. Write those lines to stdout.
4. Advance the high-water mark to include all cells we just printed.
Older cells are never re-rendered for scrollback; they remain in whatever wrapping they had when
they were first printed. This avoids the line-count-based bugs we had before while still allowing
the on-screen TUI to reflow freely.
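Using the simplified `HistoryCell` sketch from section 3.1, the high-water mark
and the four steps above reduce to something like the following
(`print_history` is illustrative, not the real tui2 API):

```rust
/// Tracks how many leading transcript cells are already in scrollback.
struct HistoryPrinter {
    printed_cells: usize,
}

impl HistoryPrinter {
    fn print_history(&mut self, cells: &[Box<dyn HistoryCell>], width: u16) {
        // 1. Take only the suffix beyond the high-water mark.
        for cell in &cells[self.printed_cells..] {
            // 2 + 3. Render at the *current* width and append to stdout.
            for line in cell.display_lines(width) {
                println!("{line}");
            }
        }
        // 4. Advance the mark; older cells are never re-rendered.
        self.printed_cells = cells.len();
    }
}
```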
### 5.3 Suspend (`Ctrl+Z`) flow
On suspend (typically `Ctrl+Z` on Unix):
- Before yielding control back to the shell, the TUI:
- Leaves alt screen if it is active and restores normal terminal modes.
- Determines which transcript cells have not yet been printed and renders them for the current
width.
- Prints those new lines once into normal scrollback.
- Marks those cells as printed in the high-water mark.
- Finally, sends the process to the background.
On `fg`, the process resumes, re-enters TUI modes, and redraws the viewport from the in-memory
transcript. The history printed during suspend stays in scrollback and is not touched again.
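On Unix the whole sequence can be sketched as below, reusing the
`HistoryPrinter` sketch from section 5.2. `leave_alt_screen` and
`enter_alt_screen` are hypothetical stand-ins for the real terminal-mode
helpers, and the stop signal comes from the `libc` crate:

```rust
fn leave_alt_screen() { /* hypothetical: restore normal screen + cooked mode */ }
fn enter_alt_screen() { /* hypothetical: re-enter alt screen + raw mode */ }

fn suspend(printer: &mut HistoryPrinter, cells: &[Box<dyn HistoryCell>], width: u16) {
    leave_alt_screen();
    // Print only the not-yet-printed transcript suffix into normal scrollback.
    printer.print_history(cells, width);
    // Stop ourselves; the shell regains control until the user runs `fg`.
    let _ = unsafe { libc::raise(libc::SIGTSTP) };
    // Execution resumes here after `fg`: redraw everything from memory.
    enter_alt_screen();
}
```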
### 5.4 Exit flow
When the TUI exits, we follow the same principle:
- We compute the suffix of the transcript that has not yet been printed (taking into account any
prior suspends).
- We render just that suffix to styled lines at the current width.
- The outer `main` function leaves alt screen, restores the terminal, and prints those lines, plus a
blank line and token usage summary.
If you never suspended, exit prints the entire transcript once. If you did suspend one or more
times, exit prints only the cells appended after the last suspend. In both cases, each logical
conversation entry reaches scrollback exactly once.
---
## 6. Streaming, Width Changes, and Trade-offs
### 6.1 Streaming cells
Streaming agent responses are represented as a sequence of history entries:
- The first chunk produces a “first line” entry for the message.
- Subsequent chunks produce continuation entries that extend that message.
From the history/scrollback perspective:
- Each streaming chunk is just another entry in the logical transcript.
- The high-water mark is a simple count of how many entries at the _front_ of the transcript have
already been printed.
- As new streaming chunks arrive, they are appended as new entries and will be included the next
time we print history on suspend or exit.
We do **not** attempt to reprint or retroactively merge older chunks. In scrollback you will see the
streaming response as a series of discrete blocks, matching the internal history structure.
Today, streaming rendering still “bakes in” some width at the time chunks are committed: line breaks
for the streaming path are computed using the width that was active at the time, and stored in the
intermediate representation. This is a known limitation and is called out in more detail in
`codex-rs/tui2/docs/streaming_wrapping_design.md`; a follow-up change will make streaming behavior
match the rest of the transcript more closely (wrap only at display time, not at commit time).
### 6.2 Width changes over time
Because we now use a **cell-level** high-water mark instead of a visual line count, width changes
are handled gracefully:
- On every suspend/exit, we render the not-yet-printed suffix of the transcript at the **current**
width and append those lines.
- Previously printed entries remain in scrollback with whatever wrapping they had at the time they
were printed.
- We no longer rely on “N lines printed before, therefore skip N lines of the newly wrapped
transcript,” which was the source of dropped and duplicated content when widths changed.
This does mean scrollback can contain older cells wrapped for narrower or wider widths than the
final terminal size, but:
- Each logical cell's content appears exactly once.
- New cells are append-only and never overwrite or implicitly “shrink” earlier content.
- The on-screen TUI always reflows to the current width independently of scrollback.
If we later choose to also re-emit the “currently streaming” cell when printing on suspend (to make
sure the latest chunk of a long answer is always visible in scrollback), that would intentionally
duplicate a small number of lines at the boundary of that cell. The design assumes any such behavior
would be controlled by configuration (for example, by disabling suspend-time printing entirely for
users who prefer only exit-time output).
### 6.3 Why not reflow scrollback?
In theory we could try to reflow already-printed content when widths change by:
- Recomputing the entire transcript at the new width, and
- Printing diffs that “rewrite” old regions in scrollback.
In practice, this runs into the same issues that motivated the redesign:
- Terminals treat full clears and scroll regions differently.
- There is no portable way to “rewrite” arbitrary portions of scrollback above the visible buffer.
- Interleaving user output (e.g. shell prompts after suspend) makes it impossible to reliably
reconstruct the original scrollback structure.
We therefore deliberately accept that scrollback is **append-only** and not subject to reflow;
correctness is measured in terms of logical transcript content, not pixel-perfect layout.
---
## 7. Backtrack and Overlays (Context)
While this document is focused on viewport and history, it's worth mentioning a few related
behaviors that rely on the same model.
### 7.1 Transcript overlay and backtrack
The transcript overlay (pager) is a full-screen view of the same logical transcript:
- When opened, it takes a snapshot of the current transcript and renders it in an alt-screen
overlay.
- Backtrack mode (`Esc` sequences) walks backwards through user messages in that snapshot and
highlights the candidate “edit from here” point.
- Confirming a backtrack request forks the conversation on the server and trims the in-memory
transcript so that only history up to the chosen user message remains, then re-renders that prefix
in the main view.
The overlay is purely a different _view_ of the same transcript; it never infers anything from
scrollback.
---
## 8. Summary of Trade-offs
**What we gain:**
- The TUI has a clear, single source of truth for history (the in-memory transcript).
- Viewport rendering is deterministic and independent of scrollback.
- Suspend and exit flows:
- Print each logical history cell exactly once.
- Are robust to terminal width changes.
- Interact cleanly with alt screen and raw-mode toggling.
- Streaming, overlays, selection, and backtrack all share the same logical history model.
- Because cells are always re-rendered live from the transcript, per-cell interactions can become
richer over time. Instead of treating the transcript as “dead text”, we can make individual
entries interactive after they are rendered: expanding or contracting tool calls, diffs, or
reasoning summaries in place, jum…truncated…
---
## 9. TUI2 Implementation Notes
This section maps the design above onto the `codex-rs/tui2` crate so future viewport work has
concrete code pointers.
### 9.1 Transcript state and layout
The main app struct (`codex-rs/tui2/src/app.rs`) tracks the transcript and viewport state with:
- `transcript_cells: Vec<Arc<dyn HistoryCell>>`: the logical history.
- `transcript_scroll: TranscriptScroll`: whether the viewport is pinned to the bottom or
anchored at a specific cell/line pair (sketched below).
- `transcript_selection: TranscriptSelection`: a selection expressed in screen coordinates over
the flattened transcript region.
- `transcript_view_top` / `transcript_total_lines`: the current viewport's top line index and
total number of wrapped lines for the inline transcript area.
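For orientation, `TranscriptScroll` can be pictured as a two-state enum
(approximate shape; the real type may carry more detail):

```rust
enum TranscriptScroll {
    /// Following live output; the viewport stays glued to the newest line.
    PinnedToBottom,
    /// Scrolled up, anchored at a specific line within a specific cell.
    Anchored { cell: usize, line_in_cell: usize },
}
```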
### 9.2 Rendering, wrapping, and selection
`App::render_transcript_cells` defines the transcript region, builds flattened lines via
`App::build_transcript_lines`, wraps them with `word_wrap_lines_borrowed` from
`codex-rs/tui2/src/wrapping.rs`, and applies selection via `apply_transcript_selection` before
writing to the frame buffer.
Streaming wrapping details live in `codex-rs/tui2/docs/streaming_wrapping_design.md`.
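Schematically, that per-frame pipeline reduces to the following (cells collapsed to plain strings
and the word wrapping replaced by naive character chunking; the real code operates on styled
lines via `word_wrap_lines_borrowed` and writes through the frame buffer):

```rust
/// Sketch of the flatten -> wrap -> slice pipeline described above.
fn render_transcript(
    cells: &[String], // stand-in for transcript_cells
    width: usize,     // transcript region width in columns
    height: usize,    // transcript region height in rows
    view_top: usize,  // transcript_view_top
) -> Vec<String> {
    // 1. Flatten cells into logical lines (build_transcript_lines).
    let lines: Vec<&str> = cells.iter().flat_map(|c| c.lines()).collect();
    // 2. Wrap to the current width so resizes reflow the transcript.
    let wrapped: Vec<String> = lines
        .iter()
        .flat_map(|l| {
            l.chars()
                .collect::<Vec<char>>()
                .chunks(width.max(1))
                .map(|chunk| chunk.iter().collect::<String>())
                .collect::<Vec<String>>()
        })
        .collect();
    // 3. Slice out the visible region chosen by the scroll state; selection
    //    overlay (apply_transcript_selection) would run on this slice next.
    let end = (view_top + height).min(wrapped.len());
    wrapped[view_top.min(end)..end].to_vec()
}
```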
### 9.3 Input, selection, and footer state
Mouse handling lives in `App::handle_mouse_event`, keyboard scrolling in
`App::handle_key_event`, selection rendering in `App::apply_transcript_selection`, and copy in
`App::copy_transcript_selection` plus `codex-rs/tui2/src/clipboard_copy.rs`. Scroll/selection UI
state is forwarded through `ChatWidget::set_transcript_ui_state`,
`BottomPane::set_transcript_ui_state`, and `ChatComposer::footer_props`, with footer text
assembled in `codex-rs/tui2/src/bottom_pane/footer.rs`.
### 9.4 Exit transcript output
`App::run` returns `session_lines` in `AppExitInfo` after flattening with
`App::build_transcript_lines` and converting to ANSI via `App::render_lines_to_ansi`. The CLI
prints those lines before the token usage and resume hints.
## 10. Future Work and Open Questions
This section collects design questions that follow naturally from the current model and are worth
explicit discussion before we commit to further UI changes.
- **“Scroll mode” vs “live follow” UI.**
- We already distinguish “scrolled away from bottom” vs “following the latest output” in the
footer and scroll state. Do we need a more explicit “scroll mode vs live mode” affordance (e.g.,
a dedicated indicator or toggle), or is the current behavior sufficient, such that more chrome
would just be noise?
- **Ephemeral scroll indicator.**
- For long sessions, a more visible sense of “where am I?” could help. One option is a minimalist
scrollbar that appears while the user is actively scrolling and fades out when idle. A full
“minimap” is probably too heavy for a TUI given the limited vertical space, but we could
imagine adding simple markers along the scrollbar to show where prior prompts occurred, or
where text search matches are, without trying to render a full preview of the buffer.
- **Selection affordances.**
- Today, the primary hint that selection is active is the reversed text and the “Ctrl+Y copy
selection” footer text. Do we want an explicit “Selecting… (Esc to cancel)” status while a drag
is in progress, or would that be redundant/clutter for most users?
- **Suspend banners in scrollback.**
- When printing history on suspend, should we also emit a small banner such as
`--- codex suspended; history up to here ---` to make those boundaries obvious in scrollback?
This would slightly increase noise but could make multi-suspend sessions easier to read.
- **Configuring suspend printing behavior.**
- The design already assumes that suspend-time printing can be gated by config. Questions to
resolve:
- Should printing on suspend be on or off by default?
- Should we support multiple modes (e.g., “off”, “print all new cells”, “print streaming cell
tail only”) or keep it binary? See the sketch after this list.
- **Streaming duplication at the edges.**
- If we later choose to always re-emit the “currently streaming” message when printing on suspend,
we would intentionally allow a small amount of duplication at the boundary of that message (for
example, its last line appearing twice across suspends). Is that acceptable if it improves the
readability of long streaming answers in scrollback, and should the ability to disable
suspend-time printing be our escape hatch for users who care about exact de-duplication?
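If suspend-time printing grows beyond a binary switch, the knob for the mode question above could
look like this (hypothetical; no such config option exists today):

```rust
/// Hypothetical config values for suspend-time history printing.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SuspendPrintMode {
    /// Never print on suspend; history appears only on exit.
    Off,
    /// Print every cell completed since the last suspend/exit.
    NewCells,
    /// Like `NewCells`, but also re-emit the in-progress streaming cell.
    StreamingTail,
}
```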
---

File diff suppressed because it is too large


@@ -118,6 +118,9 @@ pub(crate) struct ChatComposer {
footer_hint_override: Option<Vec<(String, String)>>,
context_window_percent: Option<i64>,
context_window_used_tokens: Option<i64>,
transcript_scrolled: bool,
transcript_selection_active: bool,
transcript_scroll_position: Option<(usize, usize)>,
skills: Option<Vec<SkillMetadata>>,
dismissed_skill_popup_token: Option<String>,
}
@@ -166,6 +169,9 @@ impl ChatComposer {
footer_hint_override: None,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
skills: None,
dismissed_skill_popup_token: None,
};
@@ -1531,6 +1537,9 @@ impl ChatComposer {
is_task_running: self.is_task_running,
context_window_percent: self.context_window_percent,
context_window_used_tokens: self.context_window_used_tokens,
transcript_scrolled: self.transcript_scrolled,
transcript_selection_active: self.transcript_selection_active,
transcript_scroll_position: self.transcript_scroll_position,
}
}
@@ -1551,6 +1560,23 @@ impl ChatComposer {
.map(|items| if items.is_empty() { 0 } else { 1 })
}
/// Update the footer's view of transcript scroll state for the inline viewport.
///
/// This state is derived from the main `App`'s transcript viewport and passed
/// through the bottom pane so the footer can indicate when the transcript is
/// scrolled away from the bottom, whether a selection is active, and the
/// current `(visible_top, total)` position.
pub(crate) fn set_transcript_ui_state(
&mut self,
scrolled: bool,
selection_active: bool,
scroll_position: Option<(usize, usize)>,
) {
self.transcript_scrolled = scrolled;
self.transcript_selection_active = selection_active;
self.transcript_scroll_position = scroll_position;
}
fn sync_popups(&mut self) {
let file_token = Self::current_at_token(&self.textarea);
let skill_token = self.current_skill_token();


@@ -22,6 +22,9 @@ pub(crate) struct FooterProps {
pub(crate) is_task_running: bool,
pub(crate) context_window_percent: Option<i64>,
pub(crate) context_window_used_tokens: Option<i64>,
pub(crate) transcript_scrolled: bool,
pub(crate) transcript_selection_active: bool,
pub(crate) transcript_scroll_position: Option<(usize, usize)>,
}
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
@@ -94,6 +97,27 @@ fn footer_lines(props: FooterProps) -> Vec<Line<'static>> {
key_hint::plain(KeyCode::Char('?')).into(),
" for shortcuts".dim(),
]);
if props.transcript_scrolled {
line.push_span(" · ".dim());
line.push_span(key_hint::plain(KeyCode::PageUp));
line.push_span("/");
line.push_span(key_hint::plain(KeyCode::PageDown));
line.push_span(" scroll".dim());
line.push_span(" · ".dim());
line.push_span(key_hint::plain(KeyCode::Home));
line.push_span("/");
line.push_span(key_hint::plain(KeyCode::End));
line.push_span(" jump".dim());
if let Some((current, total)) = props.transcript_scroll_position {
line.push_span(" · ".dim());
line.push_span(Span::from(format!("{current}/{total}")).dim());
}
}
if props.transcript_selection_active {
line.push_span(" · ".dim());
line.push_span(key_hint::ctrl(KeyCode::Char('y')));
line.push_span(" copy selection".dim());
}
vec![line]
}
FooterMode::ShortcutOverlay => {
@@ -440,6 +464,24 @@ mod tests {
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
snapshot_footer(
"footer_shortcuts_transcript_scrolled_and_selection",
FooterProps {
mode: FooterMode::ShortcutSummary,
esc_backtrack_hint: false,
use_shift_enter_hint: false,
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: true,
transcript_selection_active: true,
transcript_scroll_position: Some((3, 42)),
},
);
@@ -452,6 +494,9 @@ mod tests {
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
@@ -464,6 +509,9 @@ mod tests {
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
@@ -476,6 +524,9 @@ mod tests {
is_task_running: true,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
@@ -488,6 +539,9 @@ mod tests {
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
@@ -500,6 +554,9 @@ mod tests {
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
@@ -512,6 +569,9 @@ mod tests {
is_task_running: true,
context_window_percent: Some(72),
context_window_used_tokens: None,
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
@@ -524,6 +584,9 @@ mod tests {
is_task_running: false,
context_window_percent: None,
context_window_used_tokens: Some(123_456),
transcript_scrolled: false,
transcript_selection_active: false,
transcript_scroll_position: None,
},
);
}


@@ -381,6 +381,17 @@ impl BottomPane {
self.request_redraw();
}
pub(crate) fn set_transcript_ui_state(
&mut self,
scrolled: bool,
selection_active: bool,
scroll_position: Option<(usize, usize)>,
) {
self.composer
.set_transcript_ui_state(scrolled, selection_active, scroll_position);
self.request_redraw();
}
/// Show a generic list selection view with the provided items.
pub(crate) fn show_selection_view(&mut self, params: list_selection_view::SelectionViewParams) {
let view = list_selection_view::ListSelectionView::new(params, self.app_event_tx.clone());


@@ -0,0 +1,5 @@
---
source: tui2/src/bottom_pane/footer.rs
expression: terminal.backend()
---
" 100% context left · ? for shortcuts · pgup/pgdn scroll · home/end jump · 3/42 "


@@ -3073,6 +3073,30 @@ impl ChatWidget {
pub(crate) fn clear_esc_backtrack_hint(&mut self) {
self.bottom_pane.clear_esc_backtrack_hint();
}
/// Return true when the bottom pane currently has an active task.
///
/// This is used by the viewport to decide when mouse selections should
/// disengage auto-follow behavior while responses are streaming.
pub(crate) fn is_task_running(&self) -> bool {
self.bottom_pane.is_task_running()
}
/// Inform the bottom pane about the current transcript scroll state.
///
/// This is used by the footer to surface when the inline transcript is
/// scrolled away from the bottom and to display the current
/// `(visible_top, total)` scroll position alongside other shortcuts.
pub(crate) fn set_transcript_ui_state(
&mut self,
scrolled: bool,
selection_active: bool,
scroll_position: Option<(usize, usize)>,
) {
self.bottom_pane
.set_transcript_ui_state(scrolled, selection_active, scroll_position);
}
/// Forward an `Op` directly to codex.
pub(crate) fn submit_op(&self, op: Op) {
// Record outbound operation for session replay fidelity.


@@ -0,0 +1,79 @@
use tracing::error;
#[derive(Debug)]
pub enum ClipboardError {
ClipboardUnavailable(String),
WriteFailed(String),
}
impl std::fmt::Display for ClipboardError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
ClipboardError::ClipboardUnavailable(msg) => {
write!(f, "clipboard unavailable: {msg}")
}
ClipboardError::WriteFailed(msg) => write!(f, "failed to write to clipboard: {msg}"),
}
}
}
impl std::error::Error for ClipboardError {}
pub trait ClipboardManager {
fn set_text(&mut self, text: String) -> Result<(), ClipboardError>;
}
#[cfg(not(target_os = "android"))]
pub struct ArboardClipboardManager {
inner: Option<arboard::Clipboard>,
}
#[cfg(not(target_os = "android"))]
impl ArboardClipboardManager {
pub fn new() -> Self {
match arboard::Clipboard::new() {
Ok(cb) => Self { inner: Some(cb) },
Err(err) => {
error!(error = %err, "failed to initialize clipboard");
Self { inner: None }
}
}
}
}
#[cfg(not(target_os = "android"))]
impl ClipboardManager for ArboardClipboardManager {
fn set_text(&mut self, text: String) -> Result<(), ClipboardError> {
let Some(cb) = &mut self.inner else {
return Err(ClipboardError::ClipboardUnavailable(
"clipboard is not available in this environment".to_string(),
));
};
cb.set_text(text)
.map_err(|e| ClipboardError::WriteFailed(e.to_string()))
}
}
#[cfg(target_os = "android")]
pub struct ArboardClipboardManager;
#[cfg(target_os = "android")]
impl ArboardClipboardManager {
pub fn new() -> Self {
ArboardClipboardManager
}
}
#[cfg(target_os = "android")]
impl ClipboardManager for ArboardClipboardManager {
fn set_text(&mut self, _text: String) -> Result<(), ClipboardError> {
Err(ClipboardError::ClipboardUnavailable(
"clipboard text copy is unsupported on Android".to_string(),
))
}
}
pub fn copy_text(text: String) -> Result<(), ClipboardError> {
let mut manager = ArboardClipboardManager::new();
manager.set_text(text)
}


@@ -241,7 +241,7 @@ impl ModifierDiff {
}
}
fn write_spans<'a, I>(mut writer: &mut impl Write, content: I) -> io::Result<()>
pub(crate) fn write_spans<'a, I>(mut writer: &mut impl Write, content: I) -> io::Result<()>
where
I: IntoIterator<Item = &'a Span<'a>>,
{


@@ -40,6 +40,7 @@ mod ascii_animation;
mod bottom_pane;
mod chatwidget;
mod cli;
mod clipboard_copy;
mod clipboard_paste;
mod color;
pub mod custom_terminal;
@@ -369,6 +370,7 @@ async fn run_ratatui_app(
token_usage: codex_core::protocol::TokenUsage::default(),
conversation_id: None,
update_action: Some(action),
session_lines: Vec::new(),
});
}
}
@@ -408,6 +410,7 @@ async fn run_ratatui_app(
token_usage: codex_core::protocol::TokenUsage::default(),
conversation_id: None,
update_action: None,
session_lines: Vec::new(),
});
}
// if the user acknowledged windows or made an explicit decision to trust the directory, reload the config accordingly
@@ -443,6 +446,7 @@ async fn run_ratatui_app(
token_usage: codex_core::protocol::TokenUsage::default(),
conversation_id: None,
update_action: None,
session_lines: Vec::new(),
});
}
}
@@ -481,6 +485,7 @@ async fn run_ratatui_app(
token_usage: codex_core::protocol::TokenUsage::default(),
conversation_id: None,
update_action: None,
session_lines: Vec::new(),
});
}
other => other,
@@ -491,6 +496,12 @@ async fn run_ratatui_app(
let Cli { prompt, images, .. } = cli;
// Run the main chat + transcript UI on the terminal's alternate screen so
// the entire viewport can be used without polluting normal scrollback. This
// mirrors the behavior of the legacy TUI but keeps inline mode available
// for smaller prompts like onboarding and model migration.
let _ = tui.enter_alt_screen();
let app_result = App::run(
&mut tui,
auth_manager,
@@ -504,7 +515,17 @@ async fn run_ratatui_app(
)
.await;
let _ = tui.leave_alt_screen();
restore();
if let Ok(exit_info) = &app_result {
let mut stdout = std::io::stdout();
for line in exit_info.session_lines.iter() {
let _ = writeln!(stdout, "{line}");
}
if !exit_info.session_lines.is_empty() {
let _ = writeln!(stdout);
}
}
// Mark the end of the recorded session.
session_log::log_session_end();
// ignore error when collecting usage; report underlying error instead


@@ -114,6 +114,7 @@ pub(crate) async fn run_model_migration_prompt(
if let Some(event) = events.next().await {
match event {
TuiEvent::Key(key_event) => screen.handle_key(key_event),
TuiEvent::Mouse(_) => {}
TuiEvent::Paste(_) => {}
TuiEvent::Draw => {
let _ = alt.tui.draw(u16::MAX, |frame| {


@@ -393,6 +393,7 @@ pub(crate) async fn run_onboarding_app(
while !onboarding_screen.is_done() {
if let Some(event) = tui_events.next().await {
match event {
TuiEvent::Mouse(_) => {}
TuiEvent::Key(key_event) => {
onboarding_screen.handle_key_event(key_event);
}


@@ -14,6 +14,8 @@ use crate::tui;
use crate::tui::TuiEvent;
use crossterm::event::KeyCode;
use crossterm::event::KeyEvent;
use crossterm::event::MouseEvent;
use crossterm::event::MouseEventKind;
use ratatui::buffer::Buffer;
use ratatui::buffer::Cell;
use ratatui::layout::Rect;
@@ -283,6 +285,24 @@ impl PagerView {
Ok(())
}
fn handle_mouse_scroll(&mut self, tui: &mut tui::Tui, event: MouseEvent) -> Result<()> {
let step: usize = 3;
match event.kind {
MouseEventKind::ScrollUp => {
self.scroll_offset = self.scroll_offset.saturating_sub(step);
}
MouseEventKind::ScrollDown => {
self.scroll_offset = self.scroll_offset.saturating_add(step);
}
_ => {
return Ok(());
}
}
tui.frame_requester()
.schedule_frame_in(Duration::from_millis(16));
Ok(())
}
/// Returns the height of one page in content rows.
///
/// Prefers the last rendered content height (excluding header/footer chrome);
@@ -506,6 +526,7 @@ impl TranscriptOverlay {
}
other => self.view.handle_key_event(tui, other),
},
TuiEvent::Mouse(mouse_event) => self.view.handle_mouse_scroll(tui, mouse_event),
TuiEvent::Draw => {
tui.draw(u16::MAX, |frame| {
self.render(frame.area(), frame.buffer);
@@ -565,6 +586,7 @@ impl StaticOverlay {
}
other => self.view.handle_key_event(tui, other),
},
TuiEvent::Mouse(mouse_event) => self.view.handle_mouse_scroll(tui, mouse_event),
TuiEvent::Draw => {
tui.draw(u16::MAX, |frame| {
self.render(frame.area(), frame.buffer);


@@ -58,6 +58,7 @@ pub(crate) async fn run_skill_error_prompt(
if let Some(event) = events.next().await {
match event {
TuiEvent::Key(key_event) => screen.handle_key(key_event),
TuiEvent::Mouse(_) => {}
TuiEvent::Paste(_) => {}
TuiEvent::Draw => {
let _ = alt.tui.draw(u16::MAX, |frame| {


@@ -1,4 +1,3 @@
use std::fmt;
use std::io::IsTerminal;
use std::io::Result;
use std::io::Stdout;
@@ -10,12 +9,13 @@ use std::sync::Arc;
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering;
use crossterm::Command;
use crossterm::SynchronizedUpdate;
use crossterm::event::DisableBracketedPaste;
use crossterm::event::DisableFocusChange;
use crossterm::event::DisableMouseCapture;
use crossterm::event::EnableBracketedPaste;
use crossterm::event::EnableFocusChange;
use crossterm::event::EnableMouseCapture;
use crossterm::event::Event;
use crossterm::event::KeyEvent;
use crossterm::event::KeyboardEnhancementFlags;
@@ -24,7 +24,6 @@ use crossterm::event::PushKeyboardEnhancementFlags;
use crossterm::terminal::EnterAlternateScreen;
use crossterm::terminal::LeaveAlternateScreen;
use crossterm::terminal::supports_keyboard_enhancement;
use ratatui::backend::Backend;
use ratatui::backend::CrosstermBackend;
use ratatui::crossterm::execute;
use ratatui::crossterm::terminal::disable_raw_mode;
@@ -74,56 +73,18 @@ pub fn set_modes() -> Result<()> {
);
let _ = execute!(stdout(), EnableFocusChange);
// Enable application mouse mode so scroll events are delivered as
// Mouse events instead of arrow keys.
let _ = execute!(stdout(), EnableMouseCapture);
Ok(())
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EnableAlternateScroll;
impl Command for EnableAlternateScroll {
fn write_ansi(&self, f: &mut impl fmt::Write) -> fmt::Result {
write!(f, "\x1b[?1007h")
}
#[cfg(windows)]
fn execute_winapi(&self) -> Result<()> {
Err(std::io::Error::other(
"tried to execute EnableAlternateScroll using WinAPI; use ANSI instead",
))
}
#[cfg(windows)]
fn is_ansi_code_supported(&self) -> bool {
true
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DisableAlternateScroll;
impl Command for DisableAlternateScroll {
fn write_ansi(&self, f: &mut impl fmt::Write) -> fmt::Result {
write!(f, "\x1b[?1007l")
}
#[cfg(windows)]
fn execute_winapi(&self) -> Result<()> {
Err(std::io::Error::other(
"tried to execute DisableAlternateScroll using WinAPI; use ANSI instead",
))
}
#[cfg(windows)]
fn is_ansi_code_supported(&self) -> bool {
true
}
}
/// Restore the terminal to its original state.
/// Inverse of `set_modes`.
pub fn restore() -> Result<()> {
// Pop may fail on platforms that didn't support the push; ignore errors.
let _ = execute!(stdout(), PopKeyboardEnhancementFlags);
let _ = execute!(stdout(), DisableMouseCapture);
execute!(stdout(), DisableBracketedPaste)?;
let _ = execute!(stdout(), DisableFocusChange);
disable_raw_mode()?;
@@ -161,6 +122,7 @@ pub enum TuiEvent {
Key(KeyEvent),
Paste(String),
Draw,
Mouse(crossterm::event::MouseEvent),
}
pub struct Tui {
@@ -297,6 +259,9 @@ impl Tui {
Event::Paste(pasted) => {
yield TuiEvent::Paste(pasted);
}
Event::Mouse(mouse_event) => {
yield TuiEvent::Mouse(mouse_event);
}
Event::FocusGained => {
terminal_focused.store(true, Ordering::Relaxed);
crate::terminal_palette::requery_default_colors();
@@ -305,7 +270,6 @@ impl Tui {
Event::FocusLost => {
terminal_focused.store(false, Ordering::Relaxed);
}
_ => {}
}
}
Some(Err(_)) | None => {
@@ -341,8 +305,6 @@ impl Tui {
/// inline viewport for restoration when leaving.
pub fn enter_alt_screen(&mut self) -> Result<()> {
let _ = execute!(self.terminal.backend_mut(), EnterAlternateScreen);
// Enable "alternate scroll" so terminals may translate wheel to arrows
let _ = execute!(self.terminal.backend_mut(), EnableAlternateScroll);
if let Ok(size) = self.terminal.size() {
self.alt_saved_viewport = Some(self.terminal.viewport_area);
self.terminal.set_viewport_area(ratatui::layout::Rect::new(
@@ -359,8 +321,6 @@ impl Tui {
/// Leave alternate screen and restore the previously saved inline viewport, if any.
pub fn leave_alt_screen(&mut self) -> Result<()> {
// Disable alternate scroll when leaving alt-screen
let _ = execute!(self.terminal.backend_mut(), DisableAlternateScroll);
let _ = execute!(self.terminal.backend_mut(), LeaveAlternateScreen);
if let Some(saved) = self.alt_saved_viewport.take() {
self.terminal.set_viewport_area(saved);
@@ -404,30 +364,13 @@ impl Tui {
let size = terminal.size()?;
let mut area = terminal.viewport_area;
area.height = height.min(size.height);
area.width = size.width;
// If the viewport has expanded, scroll everything else up to make room.
if area.bottom() > size.height {
terminal
.backend_mut()
.scroll_region_up(0..area.top(), area.bottom() - size.height)?;
area.y = size.height - area.height;
}
let area = Rect::new(0, 0, size.width, height.min(size.height));
if area != terminal.viewport_area {
// TODO(nornagon): probably this could be collapsed with the clear + set_viewport_area above.
terminal.clear()?;
terminal.set_viewport_area(area);
}
if !self.pending_history_lines.is_empty() {
crate::insert_history::insert_history_lines(
terminal,
self.pending_history_lines.clone(),
)?;
self.pending_history_lines.clear();
}
// Update the y position for suspending so Ctrl-Z can place the cursor correctly.
#[cfg(unix)]
{


@@ -18,8 +18,6 @@ use ratatui::layout::Rect;
use crate::key_hint;
use super::DisableAlternateScroll;
use super::EnableAlternateScroll;
use super::Terminal;
pub const SUSPEND_KEY: key_hint::KeyBinding = key_hint::ctrl(KeyCode::Char('z'));
@@ -63,8 +61,7 @@ impl SuspendContext {
/// - Trigger SIGTSTP so the process can be resumed and continue drawing with the saved state.
pub(crate) fn suspend(&self, alt_screen_active: &Arc<AtomicBool>) -> Result<()> {
if alt_screen_active.load(Ordering::Relaxed) {
// Leave alt-screen so the terminal returns to the normal buffer while suspended; also turn off alt-scroll.
let _ = execute!(stdout(), DisableAlternateScroll);
// Leave alt-screen so the terminal returns to the normal buffer while suspended.
let _ = execute!(stdout(), LeaveAlternateScreen);
self.set_resume_action(ResumeAction::RestoreAlt);
} else {
@@ -157,11 +154,10 @@ impl PreparedResumeAction {
match self {
PreparedResumeAction::RealignViewport(area) => {
terminal.set_viewport_area(area);
terminal.clear()?;
}
PreparedResumeAction::RestoreAltScreen => {
execute!(terminal.backend_mut(), EnterAlternateScreen)?;
// Enable "alternate scroll" so terminals may translate wheel to arrows
execute!(terminal.backend_mut(), EnableAlternateScroll)?;
if let Ok(size) = terminal.size() {
terminal.set_viewport_area(Rect::new(0, 0, size.width, size.height));
terminal.clear()?;


@@ -190,7 +190,7 @@ model = "mistral"
### model_reasoning_effort
If the selected model is known to support reasoning (for example: `o3`, `o4-mini`, `codex-*`, `gpt-5.1-codex-max`, `gpt-5.1`, `gpt-5.1-codex`, `gpt5-2`), reasoning is enabled by default when using the Responses API. As explained in the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning), this can be set to:
If the selected model is known to support reasoning (for example: `o3`, `o4-mini`, `codex-*`, `gpt-5.1-codex-max`, `gpt-5.1`, `gpt-5.1-codex`, `gpt-5.2`), reasoning is enabled by default when using the Responses API. As explained in the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning), this can be set to:
- `"minimal"`
- `"low"`