Compare commits

...

9 Commits

Author SHA1 Message Date
Felipe Coury
a00d72279f fix(skills): make budget warning actionable 2026-04-25 22:42:37 -03:00
Shijie Rao
4e30281a13 Guard npm update readiness (#19389)
## Why
For npm/Bun-managed installs, the update prompt was treating the latest
GitHub release as ready to install. During the `0.124.0` release, GitHub
and npm visibility were not atomic: the root npm wrapper could become
visible before the npm registry marked that version as the package
`latest`. That left a window where users could be prompted to upgrade
before npm was ready for the release.

## What changed
- Keep GitHub Releases as the candidate latest-version source for
npm/Bun installs, but only write the existing `version.json` cache after
npm registry metadata proves that same root version is ready.
- Add `codex-rs/tui/src/npm_registry.rs` to validate npm readiness by
checking `dist-tags.latest` and root package `dist` metadata for the
GitHub candidate version (see the sketch after this list).
- Move version parsing helpers into
`codex-rs/tui/src/update_versions.rs` so that logic can be tested
without compiling the release-only `updates.rs` module under tests.
- Update `.github/workflows/rust-release.yml` to publish in dependency
order: the six known platform tarballs first, then any other npm
tarballs, then the root `@openai/codex` wrapper, and finally the SDK,
which depends on the root package.
2026-04-25 17:09:29 -07:00
Michael Bolin
9881dc7306 fix: restore 30-minute timeout for Bazel builds (#19609)
I think raising it to 45 minutes in
https://github.com/openai/codex/pull/19578 was a mistake for the reasons
explained in the comments in the code. Instead, we defend against
timeouts by increasing the number of shards in `app-server-all-test` so
that a "true failure" that gets rerun 3x does not consume as much
wall-clock time.
2026-04-25 16:34:06 -07:00
Michael Bolin
d54493ba1c test: stabilize app-server path assertions on Windows (#19604)
## Why

Windows can represent the same canonical local path with either a normal
drive path or a verbatim device path prefix. The failure pattern that
motivated this PR was an assertion diff like `C:\...` versus
`\\?\C:\...`: different spellings, same file.

That became visible while validating the permissions stack above this
PR. The stack increasingly routes paths through `AbsolutePathBuf`, which
normalizes supported Windows device prefixes, while several existing
tests still built expected values directly with
`std::fs::canonicalize()` or compared `AbsolutePathBuf::as_path()` to a
raw `PathBuf`. On Windows, that can make tests fail because the two
sides choose different textual forms for an otherwise equivalent
canonical path.
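
As a minimal illustration of that mismatch, assuming only the `\\?\`
verbatim prefix (one of the device prefixes `AbsolutePathBuf` normalizes),
both spellings can be reduced to one textual form before comparing:

```rust
use std::path::{Path, PathBuf};

/// Strip the Windows verbatim prefix so `C:\...` and `\\?\C:\...` compare
/// equal. Illustrative only; the tests use `AbsolutePathBuf`, which also
/// handles the other supported device prefixes.
fn strip_verbatim(path: &Path) -> PathBuf {
    match path.to_string_lossy().strip_prefix(r"\\?\") {
        Some(rest) => PathBuf::from(rest),
        None => path.to_path_buf(),
    }
}

/// Two spellings of the same canonical local path compare equal.
fn same_canonical_file(a: &Path, b: &Path) -> bool {
    strip_verbatim(a) == strip_verbatim(b)
}
```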

This PR is intentionally split out as the bottom PR below #19606. The
runtime permissions migration should not carry unrelated Windows test
stabilization, and reviewers should be able to verify this as a
test-only change before looking at the larger permissions changes.

## Failure Modes Covered

- `conversation_summary` expected rollout paths were built from raw
canonicalized `PathBuf`s, while app-server responses could carry
`AbsolutePathBuf`-normalized paths.
- `thread_resume` compared returned thread paths directly to previously
stored or fixture paths, so a verbatim-prefix spelling could fail an
otherwise correct resume.
- `marketplace_add` compared plugin install roots through `as_path()`
against raw canonicalized paths, reproducing the same `C:\...` versus
`\\?\C:\...` mismatch in both app-server and core-plugin coverage.

## What Changed

- In `app-server/tests/suite/conversation_summary.rs`, normalize both
expected rollout paths and received `ConversationSummary.path` values
through `AbsolutePathBuf` before comparing the full summary object.
- In `app-server/tests/suite/v2/thread_resume.rs`, normalize both sides
of thread path comparisons before asserting equality. This keeps the
tests focused on whether resume returned the same existing path, not
whether Windows used the same string spelling.
- In `app-server/tests/suite/v2/marketplace_add.rs` and
`core-plugins/src/marketplace_add.rs`, compare install roots as
`AbsolutePathBuf` values instead of comparing an absolute-path wrapper
to a raw canonicalized `PathBuf`.

## Behavior

This PR does not change production app-server or marketplace behavior.
It only changes tests to assert semantic path identity across Windows
path spelling variants. It also leaves API response values untouched;
the normalization happens inside assertions only.

## Verification

Targeted local checks run while extracting this fix:

- `cargo test -p codex-app-server get_conversation_summary_by_thread_id_reads_rollout`
- `cargo test -p codex-app-server get_conversation_summary_by_relative_rollout_path_resolves_from_codex_home`
- `cargo test -p codex-app-server thread_resume_prefers_path_over_thread_id`

Windows-specific confidence comes from the Bazel Windows CI job for this
PR, since the failure is platform-specific.

## Docs

No docs update is needed because this is test-only infrastructure
stabilization.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19604).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19604
2026-04-25 16:25:28 -07:00
viyatb-oai
9aaa5d9358 [codex] Bypass managed network for escalated exec (#19595)
## Why

`sandbox_permissions = "require_escalated"` is treated as an explicit
request to approve the command and run it outside the
filesystem/platform sandbox. Before this change, shell and unified exec
still registered managed network approval context and could inject
Codex-managed proxy state into the child process, which meant an
approved escalated command could still hit a second network approval
path.

This PR makes that escalation boundary consistent: once a command is
explicitly approved to run outside the sandbox, Codex does not also
route that process through the managed network proxy.
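
In sketch form, with a type parameter standing in for the real
`NetworkProxy`, that boundary reduces to the single gate added as
`managed_network_for_sandbox_permissions` in the diff below:

```rust
/// Explicitly escalated requests run outside the sandbox, so they never
/// receive the Codex-managed proxy; all other requests keep it as-is.
fn managed_network_for<T>(network: Option<&T>, requires_escalation: bool) -> Option<&T> {
    if requires_escalation { None } else { network }
}
```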

## Security impact

Command/filesystem sandbox approval now implies network approval for
that command. If an untrusted command or script is allowed to run with
`require_escalated`, its network calls are unsandboxed: Codex-managed
network allowlists and denylists are not respected for that process, so
the command can exfiltrate any data it can read.

## What changed

- Skip managed network approval specs for
`SandboxPermissions::RequireEscalated`.
- Pass `network: None` into shell, zsh-fork shell, and unified exec
sandbox preparation for explicitly escalated requests.
- Strip Codex-managed proxy environment variables when
`CODEX_NETWORK_PROXY_ACTIVE` is present, while preserving user proxy env
when the Codex marker is absent (sketched after this list).
- Add regression coverage for the prepared exec request so the old
behavior cannot silently reappear.
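
A minimal sketch of the env filtering; the key list here is a stand-in,
since the real `PROXY_ENV_KEYS` constant lives in `codex-network-proxy`:

```rust
use std::collections::HashMap;

const PROXY_ACTIVE_ENV_KEY: &str = "CODEX_NETWORK_PROXY_ACTIVE";
// Stand-in list; the real constant is PROXY_ENV_KEYS in codex-network-proxy.
const PROXY_ENV_KEYS: &[&str] = &["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY"];

/// Strip Codex-managed proxy state only when the Codex marker is present,
/// so user-provided proxy settings survive an escalated run untouched.
fn exec_env_for_escalated(env: &HashMap<String, String>) -> HashMap<String, String> {
    let mut env = env.clone();
    if env.contains_key(PROXY_ACTIVE_ENV_KEY) {
        for key in PROXY_ENV_KEYS {
            env.remove(*key);
        }
    }
    env
}
```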

## Verification

- `cargo test -p codex-core explicit_escalation`
- `cargo clippy -p codex-core --all-targets -- -D warnings`
2026-04-25 23:23:58 +00:00
Eric Traut
0c785598b3 Keep slash command popup columns stable while scrolling (#19511)
## Why

Fixes #19499.

The slash-command popup recalculated the command-name column from only
the rows visible in the current viewport. That made the description
column shift horizontally while scrolling through `/` commands whenever
longer command names entered or left the visible window.

## What Changed

`codex-rs/tui/src/bottom_pane/command_popup.rs` now uses the shared
selection-popup `AutoAllRows` column-width mode for both height
measurement and rendering. This keeps the command description column
based on the full filtered slash-command list instead of the current
viewport.
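
Conceptually, `AutoAllRows` measures the name column over every filtered
row rather than the visible slice, along the lines of this hypothetical
helper:

```rust
/// Width of the command-name column, measured over the full filtered list.
/// Because the result no longer depends on which rows are scrolled into
/// view, the description column stays put while scrolling.
fn name_column_width(filtered_names: &[&str]) -> usize {
    filtered_names
        .iter()
        .map(|name| name.chars().count())
        .max()
        .unwrap_or(0)
}
```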

## Verification

- `cargo test -p codex-tui bottom_pane::command_popup`
2026-04-25 14:25:58 -07:00
Michael Bolin
f41306b4f3 test: isolate remote thread store regression from plugin warmups (#19593)
Follow-up to #19266.

## Why


`thread_start_with_non_local_thread_store_does_not_create_local_persistence`
is meant to catch accidental local thread persistence when a non-local
thread store is configured. The Windows flake reported in [this
BuildBuddy
invocation](https://app.buildbuddy.io/invocation/0b75dde4-6828-4e7b-a35b-e45b73fb005d)
showed that the assertion was tripping on an unexpected top-level `.tmp`
entry:

```diff
 {
+    ".tmp",
     "config.toml",
     "installation_id",
     "memories",
     "skills",
 }
```

That `.tmp` does not appear to come from `tempfile::TempDir`; it comes
from unrelated plugin startup work that can legitimately materialize
`codex_home/.tmp`, including the startup remote plugin sync marker in
[`core/src/plugins/startup_sync.rs`](bce74c70ce/codex-rs/core/src/plugins/startup_sync.rs (L13-L15))
and the curated plugin snapshot under
[`.tmp/plugins`](bce74c70ce/codex-rs/core-plugins/src/startup_sync.rs (L25-L26)).

That leaves the regression test racing unrelated background startup
tasks instead of validating the thread-store invariant it was added to
cover.
Rather than weakening the assertion to allow arbitrary `.tmp` entries,
this change isolates the test from plugin warmups so it can stay strict
about unexpected local thread persistence artifacts.

## What changed

- disable plugins in the generated config used by
`app-server/tests/suite/v2/remote_thread_store.rs` (see the snippet
after this list)
- keep the existing `codex_home` assertions unchanged so the test still
fails if local session or sqlite persistence is introduced
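
Concretely, the generated config now appends the feature flag shown in
the diff below:

```toml
[features]
plugins = false
```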

## Verification

- `cargo test -p codex-app-server suite::v2::remote_thread_store::thread_start_with_non_local_thread_store_does_not_create_local_persistence -- --exact`
2026-04-25 20:45:31 +00:00
Eric Traut
bce74c70ce Restore persisted model provider on thread resume (#19287)
Fixes #15219.

## Why

`thread/resume` should continue a persisted thread with the same model
provider that created the thread. The app server already restores the
persisted model and reasoning effort before resuming, but it was leaving
`model_provider` unset. If a user created a thread with one provider and
later switched their active profile to another provider, resumed
encrypted history could be sent to the wrong endpoint and fail with
`invalid_encrypted_content`.

The thread metadata already records the original provider, so resume
should apply it when the caller has not explicitly requested a different
model/provider/reasoning configuration.

## What changed

This updates `merge_persisted_resume_metadata` in
`app-server/src/codex_message_processor.rs` to copy
`ThreadMetadata::model_provider` into `ConfigOverrides::model_provider`
alongside the persisted model.
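
A simplified sketch with pared-down stand-ins for `ThreadMetadata` and
`ConfigOverrides` (the real merge also handles reasoning-effort
overrides):

```rust
struct ThreadMetadata {
    model: Option<String>,
    model_provider: String,
}

#[derive(Default)]
struct ConfigOverrides {
    model: Option<String>,
    model_provider: Option<String>,
}

/// When the caller supplied no explicit model or provider override, resume
/// with the model and provider persisted in the thread metadata.
fn merge_persisted_resume_metadata(meta: &ThreadMetadata, overrides: &mut ConfigOverrides) {
    if overrides.model.is_some() || overrides.model_provider.is_some() {
        return; // explicit overrides win
    }
    overrides.model = meta.model.clone();
    overrides.model_provider = Some(meta.model_provider.clone());
}
```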

The existing resume metadata tests now also assert that:

- the persisted provider is restored for normal resume
- explicit model, provider, or reasoning-effort overrides still prevent
persisted resume metadata from being applied
- a thread with no persisted model or reasoning effort still resumes
with its persisted provider

## Verification

- `cargo test -p codex-app-server` passed the app-server unit tests,
including the updated resume metadata coverage. The broader integration
portion of that command failed in an unrelated environment-sensitive
skills-budget warning assertion, where this run saw 8 omitted skills
instead of the expected 7.
- `just fix -p codex-app-server` completed successfully.
2026-04-25 12:40:00 -07:00
Michael Bolin
88f300d74d fix: increase Bazel timeout to 45 minutes (#19578)
Unfortunately, if most of the build graph is invalidated such that there
are few cache hits, the Windows Bazel build for all the tests often
takes more than `30` minutes, so this PR increases the timeout to `45`
minutes until we set up distributed builds.
2026-04-25 10:03:01 -07:00
24 changed files with 674 additions and 189 deletions

View File

@@ -17,6 +17,13 @@ concurrency:
cancel-in-progress: ${{ github.ref_name != 'main' }}
jobs:
test:
# Even though a no-cache-hit Windows build seems to exceed the 30-minute
# limit on occasion, the more common reason for exceeding the limit is a
# true test failure in a rust_test() marked "flaky" that gets run 3x.
# In that case, extra time generally does not give us more signal.
#
# Ultimately we need true distributed builds (e.g.,
# https://www.buildbuddy.io/docs/rbe-setup/) to speed things up.
timeout-minutes: 30
strategy:
fail-fast: false

View File

@@ -651,11 +651,59 @@ jobs:
prefix="${NPM_TAG}-"
fi
root_tarball="dist/npm/codex-npm-${VERSION}.tgz"
sdk_tarball="dist/npm/codex-sdk-npm-${VERSION}.tgz"
# Keep this list in sync with CODEX_PLATFORM_PACKAGES in
# codex-cli/scripts/build_npm_package.py. The root wrapper advances
# @openai/codex@latest as soon as it publishes, so every platform
# package it aliases must already exist in the registry first.
platform_tarballs=(
"dist/npm/codex-npm-linux-x64-${VERSION}.tgz"
"dist/npm/codex-npm-linux-arm64-${VERSION}.tgz"
"dist/npm/codex-npm-darwin-x64-${VERSION}.tgz"
"dist/npm/codex-npm-darwin-arm64-${VERSION}.tgz"
"dist/npm/codex-npm-win32-x64-${VERSION}.tgz"
"dist/npm/codex-npm-win32-arm64-${VERSION}.tgz"
)
for required_tarball in "${platform_tarballs[@]}" "${root_tarball}"; do
if [[ ! -f "${required_tarball}" ]]; then
echo "Missing npm tarball: ${required_tarball}"
exit 1
fi
done
shopt -s nullglob
tarballs=(dist/npm/*-"${VERSION}".tgz)
if [[ ${#tarballs[@]} -eq 0 ]]; then
echo "No npm tarballs found in dist/npm for version ${VERSION}"
exit 1
other_tarballs=()
for tarball in dist/npm/*-"${VERSION}".tgz; do
if [[ "${tarball}" == "${root_tarball}" || "${tarball}" == "${sdk_tarball}" ]]; then
continue
fi
is_platform_tarball=false
for platform_tarball in "${platform_tarballs[@]}"; do
if [[ "${tarball}" == "${platform_tarball}" ]]; then
is_platform_tarball=true
break
fi
done
if [[ "${is_platform_tarball}" == true ]]; then
continue
fi
other_tarballs+=("${tarball}")
done
# Publish the platform packages before the root CLI wrapper. The root
# wrapper advances @openai/codex@latest, so it should only publish
# after the optional dependency versions it references exist.
tarballs=(
"${platform_tarballs[@]}"
"${other_tarballs[@]}"
"${root_tarball}"
)
if [[ -f "${sdk_tarball}" ]]; then
tarballs+=("${sdk_tarball}")
fi
for tarball in "${tarballs[@]}"; do

View File

@@ -5,7 +5,13 @@ codex_rust_crate(
crate_name = "codex_app_server",
integration_test_timeout = "long",
test_shard_counts = {
"app-server-all-test": 8,
# Note app-server-all-test has a large number of integration tests, so
# even a single shard can be quite slow. When there is a legitimate
# test failure in a shard, it will still get run 3x in total, which
# can cause us to exhaust our CI timeout if the shard happens to run
# long. Using a higher shard count for app-server-all-test should help
# mitigate this risk.
"app-server-all-test": 16,
"app-server-unit-tests": 8,
},
test_tags = ["no-sandbox"],

View File

@@ -9294,6 +9294,7 @@ fn merge_persisted_resume_metadata(
}
typesafe_overrides.model = persisted_metadata.model.clone();
typesafe_overrides.model_provider = Some(persisted_metadata.model_provider.clone());
if let Some(reasoning_effort) = persisted_metadata.reasoning_effort {
request_overrides.get_or_insert_with(HashMap::new).insert(
@@ -10983,6 +10984,10 @@ mod tests {
typesafe_overrides.model,
Some("gpt-5.1-codex-max".to_string())
);
assert_eq!(
typesafe_overrides.model_provider,
Some("mock_provider".to_string())
);
assert_eq!(
request_overrides,
Some(HashMap::from([(
@@ -11013,6 +11018,7 @@ mod tests {
);
assert_eq!(typesafe_overrides.model, Some("gpt-5.2-codex".to_string()));
assert_eq!(typesafe_overrides.model_provider, None);
assert_eq!(
request_overrides,
Some(HashMap::from([(
@@ -11041,6 +11047,7 @@ mod tests {
);
assert_eq!(typesafe_overrides.model, None);
assert_eq!(typesafe_overrides.model_provider, None);
assert_eq!(
request_overrides,
Some(HashMap::from([(
@@ -11092,6 +11099,7 @@ mod tests {
);
assert_eq!(typesafe_overrides.model, None);
assert_eq!(typesafe_overrides.model_provider, None);
assert_eq!(
request_overrides,
Some(HashMap::from([(
@@ -11116,6 +11124,10 @@ mod tests {
);
assert_eq!(typesafe_overrides.model, None);
assert_eq!(
typesafe_overrides.model_provider,
Some("mock_provider".to_string())
);
assert_eq!(request_overrides, None);
Ok(())
}

View File

@@ -11,7 +11,9 @@ use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_protocol::ThreadId;
use codex_protocol::protocol::SessionSource;
use codex_utils_absolute_path::AbsolutePathBuf;
use pretty_assertions::assert_eq;
use std::path::Path;
use std::path::PathBuf;
use tempfile::TempDir;
use tokio::time::timeout;
@@ -40,6 +42,15 @@ fn expected_summary(conversation_id: ThreadId, path: PathBuf) -> ConversationSum
}
}
fn normalized_canonical_path(path: impl AsRef<Path>) -> Result<PathBuf> {
Ok(AbsolutePathBuf::from_absolute_path(path.as_ref().canonicalize()?)?.into_path_buf())
}
fn normalized_summary_path(mut summary: ConversationSummary) -> Result<ConversationSummary> {
summary.path = normalized_canonical_path(&summary.path)?;
Ok(summary)
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn get_conversation_summary_by_thread_id_reads_rollout() -> Result<()> {
let codex_home = TempDir::new()?;
@@ -54,7 +65,7 @@ async fn get_conversation_summary_by_thread_id_reads_rollout() -> Result<()> {
let thread_id = ThreadId::from_string(&conversation_id)?;
let expected = expected_summary(
thread_id,
std::fs::canonicalize(rollout_path(
normalized_canonical_path(rollout_path(
codex_home.path(),
FILENAME_TS,
&conversation_id,
@@ -76,7 +87,7 @@ async fn get_conversation_summary_by_thread_id_reads_rollout() -> Result<()> {
.await??;
let received: GetConversationSummaryResponse = to_response(response)?;
assert_eq!(received.summary, expected);
assert_eq!(normalized_summary_path(received.summary)?, expected);
Ok(())
}
@@ -126,7 +137,7 @@ async fn get_conversation_summary_by_relative_rollout_path_resolves_from_codex_h
let thread_id = ThreadId::from_string(&conversation_id)?;
let rollout_path = rollout_path(codex_home.path(), FILENAME_TS, &conversation_id);
let relative_path = rollout_path.strip_prefix(codex_home.path())?.to_path_buf();
let expected = expected_summary(thread_id, std::fs::canonicalize(rollout_path)?);
let expected = expected_summary(thread_id, normalized_canonical_path(rollout_path)?);
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
@@ -143,6 +154,6 @@ async fn get_conversation_summary_by_relative_rollout_path_resolves_from_codex_h
.await??;
let received: GetConversationSummaryResponse = to_response(response)?;
assert_eq!(received.summary, expected);
assert_eq!(normalized_summary_path(received.summary)?, expected);
Ok(())
}

View File

@@ -5,6 +5,7 @@ use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::MarketplaceAddParams;
use codex_app_server_protocol::MarketplaceAddResponse;
use codex_app_server_protocol::RequestId;
use codex_utils_absolute_path::AbsolutePathBuf;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use tokio::time::Duration;
@@ -48,10 +49,10 @@ async fn marketplace_add_local_directory_source() -> Result<()> {
installed_root,
already_added,
} = to_response(response)?;
let expected_root = source.canonicalize()?;
let expected_root = AbsolutePathBuf::from_absolute_path(source.canonicalize()?)?;
assert_eq!(marketplace_name, "debug");
assert_eq!(installed_root.as_path(), expected_root.as_path());
assert_eq!(installed_root, expected_root);
assert!(!already_added);
assert_eq!(
std::fs::read_to_string(installed_root.as_path().join("plugins/sample/marker.txt"))?,

View File

@@ -54,6 +54,8 @@ async fn thread_start_with_non_local_thread_store_does_not_create_local_persiste
let server = create_mock_responses_server_repeating_assistant("Done").await;
let codex_home = TempDir::new()?;
let store_id = Uuid::new_v4().to_string();
// Plugin startup warmups may create `.tmp` under codex_home. Disable them
// here so this regression stays focused on thread persistence artifacts.
create_config_toml_with_thread_store(codex_home.path(), &server.uri(), &store_id)?;
let loader_overrides = LoaderOverrides::without_managed_config_for_tests();
@@ -248,6 +250,9 @@ base_url = "{server_uri}/v1"
wire_api = "responses"
request_max_retries = 0
stream_max_retries = 0
[features]
plugins = false
"#
),
)

View File

@@ -65,6 +65,7 @@ use codex_protocol::protocol::TurnStartedEvent;
use codex_protocol::user_input::ByteRange;
use codex_protocol::user_input::TextElement;
use codex_state::StateRuntime;
use codex_utils_absolute_path::AbsolutePathBuf;
use core_test_support::responses;
use core_test_support::skip_if_no_network;
use pretty_assertions::assert_eq;
@@ -94,6 +95,10 @@ const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs
const INTERNAL_ERROR_CODE: i64 = -32603;
const CODEX_5_2_INSTRUCTIONS_TEMPLATE_DEFAULT: &str = "You are Codex, a coding agent based on GPT-5. You and the user share the same workspace and collaborate to achieve the user's goals.";
fn normalized_existing_path(path: impl AsRef<Path>) -> Result<PathBuf> {
Ok(AbsolutePathBuf::from_absolute_path(path.as_ref().canonicalize()?)?.into_path_buf())
}
async fn wait_for_responses_request_count(
server: &wiremock::MockServer,
expected_count: usize,
@@ -2537,7 +2542,12 @@ async fn thread_resume_prefers_path_over_thread_id() -> Result<()> {
thread: resumed, ..
} = to_response::<ThreadResumeResponse>(resume_resp)?;
assert_eq!(resumed.id, thread.id);
assert_eq!(resumed.path, thread.path);
let resumed_path = resumed.path.as_ref().expect("resumed thread path");
let original_path = thread.path.as_ref().expect("original thread path");
assert_eq!(
normalized_existing_path(resumed_path)?,
normalized_existing_path(original_path)?
);
assert_eq!(resumed.status, ThreadStatus::Idle);
Ok(())
@@ -2577,9 +2587,12 @@ async fn thread_resume_can_load_source_by_external_path() -> Result<()> {
let ThreadResumeResponse {
thread: resumed, ..
} = to_response::<ThreadResumeResponse>(resume_resp)?;
let expected_thread_path = std::fs::canonicalize(&thread_path)?;
assert_eq!(resumed.id, thread_id);
assert_eq!(resumed.path, Some(expected_thread_path));
let resumed_path = resumed.path.as_ref().expect("resumed thread path");
assert_eq!(
normalized_existing_path(resumed_path)?,
normalized_existing_path(&thread_path)?
);
assert_eq!(resumed.preview, "external path history");
assert_eq!(resumed.status, ThreadStatus::Idle);

View File

@@ -334,7 +334,7 @@ async fn turn_start_emits_thread_scoped_warning_notification_for_trimmed_skills(
assert_eq!(warning.thread_id.as_deref(), Some(thread.id.as_str()));
assert_eq!(
warning.message,
"Warning: Exceeded skills context budget of 2%. All skill descriptions were removed and 7 additional skills were not included in the model-visible skills list."
"Skills metadata exceeded the 2% context budget. Some skills were omitted from the model-visible skills list. Omitted skills: imagegen, openai-docs, plugin-creator, skill-creator, skill-installer, and 2 more."
);
timeout(

View File

@@ -278,10 +278,9 @@ mod tests {
let expected_source = source_root.path().canonicalize()?.display().to_string();
assert_eq!(result.marketplace_name, "debug");
assert_eq!(result.source_display, expected_source);
assert_eq!(
result.installed_root.as_path(),
source_root.path().canonicalize()?
);
let expected_installed_root =
AbsolutePathBuf::from_absolute_path(source_root.path().canonicalize()?)?;
assert_eq!(result.installed_root, expected_installed_root);
assert!(!result.already_added);
assert!(
!marketplace_install_root(codex_home.path())

View File

@@ -17,10 +17,8 @@ use codex_utils_output_truncation::approx_token_count;
const DEFAULT_SKILL_METADATA_CHAR_BUDGET: usize = 8_000;
const SKILL_METADATA_CONTEXT_WINDOW_PERCENT: usize = 2;
const SKILL_DESCRIPTION_TRUNCATION_WARNING_THRESHOLD_CHARS: usize = 10;
const MAX_WARNING_SKILL_NAMES: usize = 5;
const APPROX_BYTES_PER_TOKEN: usize = 4;
pub const SKILL_DESCRIPTION_TRUNCATED_WARNING_PREFIX: &str = "Warning: Exceeded skills context budget. Loaded skill descriptions were truncated by an average of";
pub const SKILL_DESCRIPTIONS_REMOVED_WARNING_PREFIX: &str =
"Warning: Exceeded skills context budget. All skill descriptions were removed and";
pub const SKILLS_INTRO_WITH_ABSOLUTE_PATHS: &str = "A skill is a set of local instructions to follow that is stored in a `SKILL.md` file. Below is the list of skills that can be used. Each entry includes a name, description, and file path so you can open the source for full instructions when using a specific skill.";
pub const SKILLS_INTRO_WITH_ALIASES: &str = "A skill is a set of local instructions to follow that is stored in a `SKILL.md` file. Below is the list of skills that can be used. Each entry includes a name, description, and a short path that can be expanded into an absolute path using the skill roots table.";
pub const SKILLS_HOW_TO_USE_WITH_ABSOLUTE_PATHS: &str = r###"- Discovery: The list above is the skills available in this session (name + description + file path). Skill bodies live on disk at the listed paths.
@@ -121,6 +119,8 @@ pub struct SkillRenderReport {
pub omitted_count: usize,
pub truncated_description_chars: usize,
pub truncated_description_count: usize,
pub largest_truncated_description_names: Vec<String>,
pub omitted_skill_names: Vec<String>,
}
#[derive(Clone, Copy)]
@@ -210,30 +210,26 @@ fn build_available_skills_from_lines(
let (skill_lines, report) = render_skill_lines_from_lines(skill_lines, total_count, budget);
let warning_message = if report.omitted_count > 0 {
let skill_word = if report.omitted_count == 1 {
"skill"
} else {
"skills"
};
let verb = if report.omitted_count == 1 {
"was"
} else {
"were"
};
Some(format!(
"{} {} additional {} {} not included in the model-visible skills list.",
budget_warning_prefix(budget, SKILL_DESCRIPTIONS_REMOVED_WARNING_PREFIX),
report.omitted_count,
skill_word,
verb
"{} Some skills were omitted from the model-visible skills list.{}",
budget_warning_prefix(budget),
warning_skill_names_suffix(
"Omitted skills",
&report.omitted_skill_names,
report.omitted_count
)
))
} else if report.average_truncated_description_chars()
> SKILL_DESCRIPTION_TRUNCATION_WARNING_THRESHOLD_CHARS
{
Some(format!(
"{} {} characters per skill.",
budget_warning_prefix(budget, SKILL_DESCRIPTION_TRUNCATED_WARNING_PREFIX),
report.average_truncated_description_chars()
"{} All skills are still available, but some descriptions were shortened.{}",
budget_warning_prefix(budget),
warning_skill_names_suffix(
"Largest truncated descriptions",
&report.largest_truncated_description_names,
report.largest_truncated_description_names.len()
)
))
} else {
None
@@ -273,17 +269,26 @@ fn record_available_skills_side_effects(
}
}
fn budget_warning_prefix(budget: SkillMetadataBudget, prefix: &str) -> String {
fn budget_warning_prefix(budget: SkillMetadataBudget) -> &'static str {
match budget {
SkillMetadataBudget::Tokens(_) => prefix.replacen(
"Exceeded skills context budget.",
"Exceeded skills context budget of 2%.",
1,
),
SkillMetadataBudget::Characters(_) => prefix.to_string(),
SkillMetadataBudget::Tokens(_) => "Skills metadata exceeded the 2% context budget.",
SkillMetadataBudget::Characters(_) => "Skills metadata exceeded the context budget.",
}
}
fn warning_skill_names_suffix(label: &str, names: &[String], total_count: usize) -> String {
if names.is_empty() {
return String::new();
}
let mut names_text = names.join(", ");
let hidden_count = total_count.saturating_sub(names.len());
if hidden_count > 0 {
names_text.push_str(&format!(", and {hidden_count} more"));
}
format!(" {label}: {names_text}.")
}
fn record_skill_render_side_effects(
side_effects: SkillRenderSideEffects<'_>,
total_count: usize,
@@ -340,6 +345,8 @@ fn render_skill_lines_from_lines(
/*omitted_count*/ 0,
/*truncated_description_chars*/ 0,
/*truncated_description_count*/ 0,
Vec::new(),
Vec::new(),
),
);
}
@@ -353,8 +360,11 @@ fn render_skill_lines_from_lines(
&skill_lines,
budget.limit().saturating_sub(minimum_cost),
);
let (truncated_description_chars, truncated_description_count) =
sum_description_truncation(&rendered);
let (
truncated_description_chars,
truncated_description_count,
largest_truncated_description_names,
) = summarize_description_truncation(&rendered);
let included = rendered
.into_iter()
.map(|rendered| rendered.line)
@@ -368,6 +378,8 @@ fn render_skill_lines_from_lines(
/*omitted_count*/ 0,
truncated_description_chars,
truncated_description_count,
largest_truncated_description_names,
Vec::new(),
),
);
}
@@ -383,8 +395,10 @@ fn render_minimum_skill_lines_until_budget(
let mut included = Vec::new();
let mut used = 0usize;
let mut omitted_count = 0usize;
let mut omitted_skill_names = Vec::new();
let mut truncated_description_chars = 0usize;
let mut truncated_description_count = 0usize;
let mut truncated_descriptions = Vec::new();
for line in skill_lines {
let line_cost = line.minimum_cost(budget);
let description_char_count = line.description_char_count();
@@ -393,12 +407,16 @@ fn render_minimum_skill_lines_until_budget(
included.push(line.render_minimum());
} else {
omitted_count = omitted_count.saturating_add(1);
if omitted_skill_names.len() < MAX_WARNING_SKILL_NAMES {
omitted_skill_names.push(line.name.to_string());
}
}
truncated_description_chars =
truncated_description_chars.saturating_add(description_char_count);
if description_char_count > 0 {
truncated_description_count = truncated_description_count.saturating_add(1);
truncated_descriptions.push((description_char_count, line.name.to_string()));
}
}
@@ -408,6 +426,8 @@ fn render_minimum_skill_lines_until_budget(
omitted_count,
truncated_description_chars,
truncated_description_count,
top_skill_names_by_truncated_chars(truncated_descriptions),
omitted_skill_names,
);
(included, report)
@@ -419,6 +439,8 @@ fn skill_render_report(
omitted_count: usize,
truncated_description_chars: usize,
truncated_description_count: usize,
largest_truncated_description_names: Vec<String>,
omitted_skill_names: Vec<String>,
) -> SkillRenderReport {
SkillRenderReport {
total_count,
@@ -426,6 +448,8 @@ fn skill_render_report(
omitted_count,
truncated_description_chars,
truncated_description_count,
largest_truncated_description_names,
omitted_skill_names,
}
}
@@ -449,6 +473,7 @@ struct SkillLine<'a> {
struct RenderedSkillLine {
line: String,
skill_name: String,
truncated_chars: usize,
}
@@ -458,19 +483,38 @@ struct DescriptionBudgetLine<'a> {
extra_costs: Vec<usize>,
}
fn sum_description_truncation(rendered: &[RenderedSkillLine]) -> (usize, usize) {
rendered
.iter()
.fold((0usize, 0usize), |(chars, count), line| {
if line.truncated_chars == 0 {
(chars, count)
} else {
(
chars.saturating_add(line.truncated_chars),
count.saturating_add(1),
)
}
})
fn summarize_description_truncation(rendered: &[RenderedSkillLine]) -> (usize, usize, Vec<String>) {
let mut truncated_description_chars = 0usize;
let mut truncated_description_count = 0usize;
let mut truncated_descriptions = Vec::new();
for line in rendered {
if line.truncated_chars == 0 {
continue;
}
truncated_description_chars =
truncated_description_chars.saturating_add(line.truncated_chars);
truncated_description_count = truncated_description_count.saturating_add(1);
truncated_descriptions.push((line.truncated_chars, line.skill_name.clone()));
}
(
truncated_description_chars,
truncated_description_count,
top_skill_names_by_truncated_chars(truncated_descriptions),
)
}
fn top_skill_names_by_truncated_chars(mut entries: Vec<(usize, String)>) -> Vec<String> {
entries.sort_by(|(left_chars, left_name), (right_chars, right_name)| {
right_chars
.cmp(left_chars)
.then_with(|| left_name.cmp(right_name))
});
entries
.into_iter()
.take(MAX_WARNING_SKILL_NAMES)
.map(|(_chars, name)| name)
.collect()
}
impl<'a> SkillLine<'a> {
@@ -626,6 +670,7 @@ fn render_lines_with_description_budget(
.saturating_sub(description_chars);
RenderedSkillLine {
line: line.line.render_with_description_chars(description_chars),
skill_name: line.line.name.to_string(),
truncated_chars,
}
})
@@ -1066,7 +1111,7 @@ mod tests {
assert_eq!(
rendered.warning_message,
Some(
"Warning: Exceeded skills context budget. Loaded skill descriptions were truncated by an average of 14 characters per skill."
"Skills metadata exceeded the context budget. All skills are still available, but some descriptions were shortened. Largest truncated descriptions: alpha-skill, beta-skill."
.to_string()
)
);
@@ -1116,7 +1161,7 @@ mod tests {
assert_eq!(
rendered.warning_message,
Some(
"Warning: Exceeded skills context budget. All skill descriptions were removed and 2 additional skills were not included in the model-visible skills list."
"Skills metadata exceeded the context budget. Some skills were omitted from the model-visible skills list. Omitted skills: repo-skill, user-skill."
.to_string()
)
);
@@ -1145,7 +1190,7 @@ mod tests {
assert_eq!(
rendered.warning_message,
Some(
"Warning: Exceeded skills context budget. All skill descriptions were removed and 1 additional skill was not included in the model-visible skills list."
"Skills metadata exceeded the context budget. Some skills were omitted from the model-visible skills list. Omitted skills: oversized-system-skill."
.to_string()
)
);

View File

@@ -5219,7 +5219,7 @@ async fn build_initial_context_trims_skill_metadata_from_context_window_budget()
assert!(
developer_texts
.iter()
.all(|text| !text.contains("Exceeded skills context budget")),
.all(|text| !text.contains("Skills metadata exceeded")),
"expected skill budget warning to stay out of the initial context, got {developer_texts:?}"
);
assert!(
@@ -5256,7 +5256,7 @@ fn emit_thread_start_skill_metrics_records_enabled_kept_and_truncated_values() {
assert_eq!(
rendered.warning_message,
Some(
"Warning: Exceeded skills context budget. All skill descriptions were removed and 1 additional skill was not included in the model-visible skills list."
"Skills metadata exceeded the context budget. Some skills were omitted from the model-visible skills list. Omitted skills: repo-skill."
.to_string()
)
);
@@ -5367,7 +5367,7 @@ async fn build_initial_context_emits_thread_start_skill_warning_on_repeated_buil
assert!(matches!(
warning_event.msg,
EventMsg::Warning(WarningEvent { message })
if message == "Warning: Exceeded skills context budget of 2%. All skill descriptions were removed and 2 additional skills were not included in the model-visible skills list."
if message == "Skills metadata exceeded the 2% context budget. Some skills were omitted from the model-visible skills list. Omitted skills: admin-skill, repo-skill."
));
let _ = session.build_initial_context(&turn_context).await;
@@ -5378,7 +5378,7 @@ async fn build_initial_context_emits_thread_start_skill_warning_on_repeated_buil
assert!(matches!(
warning_event.msg,
EventMsg::Warning(WarningEvent { message })
if message == "Warning: Exceeded skills context budget of 2%. All skill descriptions were removed and 2 additional skills were not included in the model-visible skills list."
if message == "Skills metadata exceeded the 2% context budget. Some skills were omitted from the model-visible skills list. Omitted skills: admin-skill, repo-skill."
));
}

View File

@@ -6,6 +6,7 @@ small and focused and reuses the orchestrator for approvals + sandbox + retry.
*/
use crate::exec_env::CODEX_THREAD_ID_ENV_VAR;
use crate::path_utils;
use crate::sandboxing::SandboxPermissions;
use crate::shell::Shell;
use crate::tools::sandboxing::ToolError;
#[cfg(target_os = "macos")]
@@ -43,6 +44,29 @@ pub(crate) fn build_sandbox_command(
})
}
pub(crate) fn exec_env_for_sandbox_permissions(
env: &HashMap<String, String>,
sandbox_permissions: SandboxPermissions,
) -> HashMap<String, String> {
let mut env = env.clone();
if sandbox_permissions.requires_escalated_permissions()
&& env.contains_key(PROXY_ACTIVE_ENV_KEY)
{
for key in PROXY_ENV_KEYS {
env.remove(*key);
}
// Only macOS injects a Codex-owned SSH wrapper for the managed SOCKS proxy.
#[cfg(target_os = "macos")]
if env
.get(PROXY_GIT_SSH_COMMAND_ENV_KEY)
.is_some_and(|command| command.starts_with(CODEX_PROXY_GIT_SSH_COMMAND_MARKER))
{
env.remove(PROXY_GIT_SSH_COMMAND_ENV_KEY);
}
}
env
}
/// POSIX-only helper: for commands produced by `Shell::derive_exec_args`
/// for Bash/Zsh/sh of the form `[shell_path, "-lc", "<script>"]`, and
/// when a snapshot is configured on the session shell, rewrite the argv

View File

@@ -1,11 +1,29 @@
use super::*;
use crate::exec::ExecCapturePolicy;
use crate::exec::ExecExpiration;
use crate::sandboxing::ExecOptions;
use crate::shell::ShellType;
use crate::shell_snapshot::ShellSnapshot;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::managed_network_for_sandbox_permissions;
#[cfg(target_os = "macos")]
use codex_network_proxy::CODEX_PROXY_GIT_SSH_COMMAND_MARKER;
use codex_network_proxy::ConfigReloader;
use codex_network_proxy::ConfigState;
use codex_network_proxy::NetworkProxy;
use codex_network_proxy::NetworkProxyConfig;
use codex_network_proxy::NetworkProxyConstraints;
use codex_network_proxy::NetworkProxyState;
use codex_network_proxy::PROXY_ACTIVE_ENV_KEY;
use codex_network_proxy::PROXY_ENV_KEYS;
#[cfg(target_os = "macos")]
use codex_network_proxy::PROXY_GIT_SSH_COMMAND_ENV_KEY;
use codex_protocol::config_types::WindowsSandboxLevel;
use codex_protocol::permissions::FileSystemSandboxPolicy;
use codex_protocol::permissions::NetworkSandboxPolicy;
use codex_protocol::protocol::SandboxPolicy;
use codex_sandboxing::SandboxManager;
use codex_sandboxing::SandboxType;
use codex_utils_absolute_path::AbsolutePathBuf;
use core_test_support::PathBufExt;
use core_test_support::PathExt;
@@ -16,6 +34,23 @@ use std::sync::Arc;
use tempfile::tempdir;
use tokio::sync::watch;
struct StaticReloader;
#[async_trait::async_trait]
impl ConfigReloader for StaticReloader {
fn source_label(&self) -> String {
"test config state".to_string()
}
async fn maybe_reload(&self) -> anyhow::Result<Option<ConfigState>> {
Ok(None)
}
async fn reload_now(&self) -> anyhow::Result<ConfigState> {
Err(anyhow::anyhow!("force reload is not supported in tests"))
}
}
fn shell_with_snapshot(
shell_type: ShellType,
shell_path: &str,
@@ -33,6 +68,104 @@ fn shell_with_snapshot(
}
}
async fn test_network_proxy() -> anyhow::Result<NetworkProxy> {
let state = codex_network_proxy::build_config_state(
NetworkProxyConfig::default(),
NetworkProxyConstraints::default(),
)?;
NetworkProxy::builder()
.state(Arc::new(NetworkProxyState::with_reloader(
state,
Arc::new(StaticReloader),
)))
.managed_by_codex(/*managed_by_codex*/ false)
.http_addr("127.0.0.1:43128".parse()?)
.socks_addr("127.0.0.1:48081".parse()?)
.build()
.await
}
#[tokio::test]
async fn explicit_escalation_prepares_exec_without_managed_network() -> anyhow::Result<()> {
let proxy = test_network_proxy().await?;
let dir = tempdir().expect("create temp dir");
let cwd = dir.path().abs();
let mut env = HashMap::from([("CUSTOM_ENV".to_string(), "kept".to_string())]);
proxy.apply_to_env(&mut env);
let command = vec!["/bin/echo".to_string(), "ok".to_string()];
let command = build_sandbox_command(
&command,
&cwd,
&exec_env_for_sandbox_permissions(&env, SandboxPermissions::RequireEscalated),
/*additional_permissions*/ None,
)
.expect("build sandbox command");
let options = ExecOptions {
expiration: ExecExpiration::DefaultTimeout,
capture_policy: ExecCapturePolicy::ShellTool,
};
let sandbox_policy = SandboxPolicy::DangerFullAccess;
let file_system_policy = FileSystemSandboxPolicy::from(&sandbox_policy);
let manager = SandboxManager::new();
let attempt = SandboxAttempt {
sandbox: SandboxType::None,
policy: &sandbox_policy,
file_system_policy: &file_system_policy,
network_policy: NetworkSandboxPolicy::Enabled,
enforce_managed_network: false,
manager: &manager,
sandbox_cwd: &cwd,
codex_linux_sandbox_exe: None,
use_legacy_landlock: false,
windows_sandbox_level: WindowsSandboxLevel::Disabled,
windows_sandbox_private_desktop: false,
};
let exec_request = attempt
.env_for(
command,
options,
managed_network_for_sandbox_permissions(
Some(&proxy),
SandboxPermissions::RequireEscalated,
),
)
.expect("prepare exec request");
assert_eq!(exec_request.network, None);
for key in PROXY_ENV_KEYS {
assert_eq!(exec_request.env.get(*key), None, "{key} should be unset");
}
#[cfg(target_os = "macos")]
assert_eq!(exec_request.env.get(PROXY_GIT_SSH_COMMAND_ENV_KEY), None);
assert_eq!(
exec_request.env.get("CUSTOM_ENV"),
Some(&"kept".to_string())
);
Ok(())
}
#[test]
fn explicit_escalation_keeps_user_proxy_env_without_codex_marker() {
let env = HashMap::from([
(
"HTTP_PROXY".to_string(),
"http://user.proxy:8080".to_string(),
),
("CUSTOM_ENV".to_string(), "kept".to_string()),
]);
let env = exec_env_for_sandbox_permissions(&env, SandboxPermissions::RequireEscalated);
assert_eq!(
env.get("HTTP_PROXY"),
Some(&"http://user.proxy:8080".to_string())
);
assert_eq!(env.get("CUSTOM_ENV"), Some(&"kept".to_string()));
}
#[test]
fn maybe_wrap_shell_lc_with_snapshot_bootstraps_in_user_shell() {
let dir = tempdir().expect("create temp dir");

View File

@@ -20,6 +20,7 @@ use crate::shell::ShellType;
use crate::tools::network_approval::NetworkApprovalMode;
use crate::tools::network_approval::NetworkApprovalSpec;
use crate::tools::runtimes::build_sandbox_command;
use crate::tools::runtimes::exec_env_for_sandbox_permissions;
use crate::tools::runtimes::maybe_wrap_shell_lc_with_snapshot;
use crate::tools::sandboxing::Approvable;
use crate::tools::sandboxing::ApprovalCtx;
@@ -31,6 +32,7 @@ use crate::tools::sandboxing::Sandboxable;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
use crate::tools::sandboxing::ToolRuntime;
use crate::tools::sandboxing::managed_network_for_sandbox_permissions;
use crate::tools::sandboxing::sandbox_override_for_first_attempt;
use crate::tools::sandboxing::with_cached_approval;
use codex_network_proxy::NetworkProxy;
@@ -218,9 +220,10 @@ impl ToolRuntime<ShellRequest, ExecToolCallOutput> for ShellRuntime {
req: &ShellRequest,
ctx: &ToolCtx,
) -> Option<NetworkApprovalSpec> {
req.network.as_ref()?;
let network =
managed_network_for_sandbox_permissions(req.network.as_ref(), req.sandbox_permissions)?;
Some(NetworkApprovalSpec {
network: req.network.clone(),
network: Some(network.clone()),
mode: NetworkApprovalMode::Immediate,
trigger: GuardianNetworkAccessTrigger {
call_id: ctx.call_id.clone(),
@@ -243,12 +246,15 @@ impl ToolRuntime<ShellRequest, ExecToolCallOutput> for ShellRuntime {
ctx: &ToolCtx,
) -> Result<ExecToolCallOutput, ToolError> {
let session_shell = ctx.session.user_shell();
let managed_network =
managed_network_for_sandbox_permissions(req.network.as_ref(), req.sandbox_permissions);
let env = exec_env_for_sandbox_permissions(&req.env, req.sandbox_permissions);
let command = maybe_wrap_shell_lc_with_snapshot(
&req.command,
session_shell.as_ref(),
&req.cwd,
&req.explicit_env_overrides,
&req.env,
&env,
);
let command = if matches!(session_shell.shell_type, ShellType::PowerShell) {
prefix_powershell_script_with_utf8(&command)
@@ -267,18 +273,14 @@ impl ToolRuntime<ShellRequest, ExecToolCallOutput> for ShellRuntime {
}
}
let command = build_sandbox_command(
&command,
&req.cwd,
&req.env,
req.additional_permissions.clone(),
)?;
let command =
build_sandbox_command(&command, &req.cwd, &env, req.additional_permissions.clone())?;
let options = ExecOptions {
expiration: req.timeout_ms.into(),
capture_policy: ExecCapturePolicy::ShellTool,
};
let env = attempt
.env_for(command, options, req.network.as_ref())
.env_for(command, options, managed_network)
.map_err(|err| ToolError::Codex(err.into()))?;
let out = execute_env(env, Self::stdout_stream(ctx))
.await

View File

@@ -14,10 +14,12 @@ use crate::sandboxing::ExecRequest;
use crate::sandboxing::SandboxPermissions;
use crate::shell::ShellType;
use crate::tools::runtimes::build_sandbox_command;
use crate::tools::runtimes::exec_env_for_sandbox_permissions;
use crate::tools::sandboxing::PermissionRequestPayload;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
use crate::tools::sandboxing::managed_network_for_sandbox_permissions;
use codex_execpolicy::Decision;
use codex_execpolicy::Evaluation;
use codex_execpolicy::MatchOptions;
@@ -114,18 +116,19 @@ pub(super) async fn try_run_zsh_fork(
return Ok(None);
}
let command = build_sandbox_command(
command,
&req.cwd,
&req.env,
req.additional_permissions.clone(),
)?;
let env = exec_env_for_sandbox_permissions(&req.env, req.sandbox_permissions);
let command =
build_sandbox_command(command, &req.cwd, &env, req.additional_permissions.clone())?;
let options = ExecOptions {
expiration: req.timeout_ms.into(),
capture_policy: ExecCapturePolicy::ShellTool,
};
let sandbox_exec_request = attempt
.env_for(command, options, req.network.as_ref())
.env_for(
command,
options,
managed_network_for_sandbox_permissions(req.network.as_ref(), req.sandbox_permissions),
)
.map_err(|err| ToolError::Codex(err.into()))?;
let crate::sandboxing::ExecRequest {
command,

View File

@@ -17,6 +17,7 @@ use crate::shell::ShellType;
use crate::tools::network_approval::NetworkApprovalMode;
use crate::tools::network_approval::NetworkApprovalSpec;
use crate::tools::runtimes::build_sandbox_command;
use crate::tools::runtimes::exec_env_for_sandbox_permissions;
use crate::tools::runtimes::maybe_wrap_shell_lc_with_snapshot;
use crate::tools::runtimes::shell::zsh_fork_backend;
use crate::tools::sandboxing::Approvable;
@@ -29,6 +30,7 @@ use crate::tools::sandboxing::Sandboxable;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
use crate::tools::sandboxing::ToolRuntime;
use crate::tools::sandboxing::managed_network_for_sandbox_permissions;
use crate::tools::sandboxing::sandbox_override_for_first_attempt;
use crate::tools::sandboxing::with_cached_approval;
use crate::unified_exec::NoopSpawnLifecycle;
@@ -203,9 +205,10 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRunt
req: &UnifiedExecRequest,
ctx: &ToolCtx,
) -> Option<NetworkApprovalSpec> {
req.network.as_ref()?;
let network =
managed_network_for_sandbox_permissions(req.network.as_ref(), req.sandbox_permissions)?;
Some(NetworkApprovalSpec {
network: req.network.clone(),
network: Some(network.clone()),
mode: NetworkApprovalMode::Deferred,
trigger: GuardianNetworkAccessTrigger {
call_id: ctx.call_id.clone(),
@@ -229,6 +232,12 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRunt
) -> Result<UnifiedExecProcess, ToolError> {
let base_command = &req.command;
let session_shell = ctx.session.user_shell();
let managed_network =
managed_network_for_sandbox_permissions(req.network.as_ref(), req.sandbox_permissions);
let mut env = exec_env_for_sandbox_permissions(&req.env, req.sandbox_permissions);
if let Some(network) = managed_network {
network.apply_to_env(&mut env);
}
let environment_is_remote = ctx
.turn
.environment
@@ -242,7 +251,7 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRunt
session_shell.as_ref(),
&req.cwd,
&req.explicit_env_overrides,
&req.env,
&env,
)
};
let command = if matches!(session_shell.shell_type, ShellType::PowerShell) {
@@ -251,10 +260,6 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRunt
command
};
let mut env = req.env.clone();
if let Some(network) = req.network.as_ref() {
network.apply_to_env(&mut env);
}
if let UnifiedExecShellMode::ZshFork(zsh_fork_config) = &self.shell_mode {
let command =
build_sandbox_command(&command, &req.cwd, &env, req.additional_permissions.clone())
@@ -264,7 +269,7 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRunt
capture_policy: ExecCapturePolicy::ShellTool,
};
let mut exec_env = attempt
.env_for(command, options, req.network.as_ref())
.env_for(command, options, managed_network)
.map_err(|err| ToolError::Codex(err.into()))?;
exec_env.exec_server_env_config = req.exec_server_env_config.clone();
match zsh_fork_backend::maybe_prepare_unified_exec(
@@ -322,7 +327,7 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRunt
capture_policy: ExecCapturePolicy::ShellTool,
};
let mut exec_env = attempt
.env_for(command, options, req.network.as_ref())
.env_for(command, options, managed_network)
.map_err(|err| ToolError::Codex(err.into()))?;
exec_env.exec_server_env_config = req.exec_server_env_config.clone();
let Some(environment) = ctx.turn.environment.as_ref() else {

View File

@@ -265,6 +265,17 @@ pub(crate) fn sandbox_override_for_first_attempt(
}
}
pub(crate) fn managed_network_for_sandbox_permissions(
network: Option<&NetworkProxy>,
sandbox_permissions: SandboxPermissions,
) -> Option<&NetworkProxy> {
if sandbox_permissions.requires_escalated_permissions() {
None
} else {
network
}
}
pub(crate) trait Approvable<Req> {
type ApprovalKey: Hash + Eq + Clone + Debug + Serialize;

View File

@@ -4,8 +4,11 @@ use ratatui::widgets::WidgetRef;
use super::popup_consts::MAX_POPUP_ROWS;
use super::scroll_state::ScrollState;
use super::selection_popup_common::ColumnWidthConfig;
use super::selection_popup_common::ColumnWidthMode;
use super::selection_popup_common::GenericDisplayRow;
use super::selection_popup_common::render_rows;
use super::selection_popup_common::measure_rows_height_with_col_width_mode;
use super::selection_popup_common::render_rows_with_col_width_mode;
use super::slash_commands;
use crate::render::Insets;
use crate::render::RectExt;
@@ -15,6 +18,10 @@ use crate::slash_command::SlashCommand;
// `quit` is an alias of `exit`, so we skip `quit` here.
// `approvals` is an alias of `permissions`.
const ALIAS_COMMANDS: &[SlashCommand] = &[SlashCommand::Quit, SlashCommand::Approvals];
const COMMAND_COLUMN_WIDTH: ColumnWidthConfig = ColumnWidthConfig::new(
ColumnWidthMode::AutoAllRows,
/*name_column_width*/ None,
);
/// A selectable item in the popup.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
@@ -109,10 +116,15 @@ impl CommandPopup {
/// Determine the preferred height of the popup for a given width.
/// Accounts for wrapped descriptions so that long tooltips don't overflow.
pub(crate) fn calculate_required_height(&self, width: u16) -> u16 {
use super::selection_popup_common::measure_rows_height;
let rows = self.rows_from_matches(self.filtered());
measure_rows_height(&rows, &self.state, MAX_POPUP_ROWS, width)
measure_rows_height_with_col_width_mode(
&rows,
&self.state,
MAX_POPUP_ROWS,
width,
COMMAND_COLUMN_WIDTH,
)
}
/// Compute exact/prefix matches over built-in commands and user prompts,
@@ -223,7 +235,7 @@ impl CommandPopup {
impl WidgetRef for CommandPopup {
fn render_ref(&self, area: Rect, buf: &mut Buffer) {
let rows = self.rows_from_matches(self.filtered());
render_rows(
render_rows_with_col_width_mode(
area.inset(Insets::tlbr(
/*top*/ 0, /*left*/ 2, /*bottom*/ 0, /*right*/ 0,
)),
@@ -232,6 +244,7 @@ impl WidgetRef for CommandPopup {
&self.state,
MAX_POPUP_ROWS,
"no matches",
COMMAND_COLUMN_WIDTH,
);
}
}

View File

@@ -191,7 +191,7 @@ async fn live_app_server_warning_notification_renders_message() {
chat.handle_server_notification(
ServerNotification::Warning(WarningNotification {
thread_id: None,
message: "Warning: Exceeded skills context budget of 2%. All skill descriptions were removed and 2 additional skills were not included in the model-visible skills list.".to_string(),
message: "Skills metadata exceeded the 2% context budget. All skills are still available, but some descriptions were shortened. Largest truncated descriptions: imagegen, code-replay.".to_string(),
}),
/*replay_kind*/ None,
);
@@ -201,12 +201,12 @@ async fn live_app_server_warning_notification_renders_message() {
let rendered = lines_to_single_string(&cells[0]);
let normalized = rendered.split_whitespace().collect::<Vec<_>>().join(" ");
assert!(
normalized.contains("Warning: Exceeded skills context budget of 2%."),
normalized.contains("Skills metadata exceeded the 2% context budget."),
"expected warning notification message, got {rendered}"
);
assert!(
normalized.contains(
"All skill descriptions were removed and 2 additional skills were not included in the model-visible skills list."
"All skills are still available, but some descriptions were shortened. Largest truncated descriptions: imagegen, code-replay."
),
"expected warning guidance, got {rendered}"
);

View File

@@ -142,6 +142,8 @@ mod model_catalog;
mod model_migration;
mod multi_agents;
mod notifications;
#[cfg(any(not(debug_assertions), test))]
mod npm_registry;
pub(crate) mod onboarding;
mod oss_selection;
mod pager_overlay;
@@ -167,6 +169,8 @@ mod ui_consts;
pub(crate) mod update_action;
pub use update_action::UpdateAction;
mod update_prompt;
#[cfg(any(not(debug_assertions), test))]
mod update_versions;
mod updates;
mod version;
#[cfg(not(target_os = "linux"))]

View File

@@ -0,0 +1,130 @@
use serde::Deserialize;
use std::collections::HashMap;
#[cfg(not(debug_assertions))]
pub(crate) const PACKAGE_URL: &str = "https://registry.npmjs.org/@openai%2fcodex";
#[derive(Deserialize, Debug, Clone)]
pub(crate) struct NpmPackageInfo {
#[serde(rename = "dist-tags")]
dist_tags: HashMap<String, String>,
versions: HashMap<String, NpmPackageVersionInfo>,
}
#[derive(Deserialize, Debug, Clone)]
struct NpmPackageVersionInfo {
dist: Option<NpmPackageDist>,
}
#[derive(Deserialize, Debug, Clone)]
struct NpmPackageDist {
tarball: Option<String>,
integrity: Option<String>,
}
pub(crate) fn ensure_version_ready(
package_info: &NpmPackageInfo,
version: &str,
) -> anyhow::Result<()> {
let version = version.trim();
match package_info.dist_tags.get("latest").map(String::as_str) {
Some(latest) if latest == version => {}
Some(latest) => anyhow::bail!(
"npm latest dist-tag points to {latest}, expected GitHub release {version}"
),
None => anyhow::bail!("npm package is missing latest dist-tag"),
}
version_info_with_dist(package_info, version)?;
Ok(())
}
fn version_info_with_dist<'a>(
package_info: &'a NpmPackageInfo,
version: &str,
) -> anyhow::Result<&'a NpmPackageVersionInfo> {
let info = package_info
.versions
.get(version)
.ok_or_else(|| anyhow::anyhow!("npm package version {version} is missing"))?;
let Some(dist) = info.dist.as_ref() else {
anyhow::bail!("npm package version {version} is missing dist metadata");
};
let has_tarball = dist
.tarball
.as_deref()
.is_some_and(|tarball| !tarball.is_empty());
if !has_tarball {
anyhow::bail!("npm package version {version} is missing dist.tarball");
}
let has_integrity = dist
.integrity
.as_ref()
.is_some_and(|integrity| !integrity.is_empty());
if !has_integrity {
anyhow::bail!("npm package version {version} is missing dist.integrity");
}
Ok(info)
}
#[cfg(test)]
mod tests {
use super::*;
fn version_json(version: &str) -> serde_json::Value {
serde_json::json!({
"dist": {
"integrity": format!("sha512-{version}"),
"tarball": format!("https://registry.npmjs.org/@openai/codex/-/codex-{version}.tgz"),
}
})
}
fn package_info(github_latest: &str, npm_latest: &str) -> NpmPackageInfo {
let mut versions = serde_json::Map::new();
versions.insert(github_latest.to_string(), version_json(github_latest));
serde_json::from_value(serde_json::json!({
"dist-tags": { "latest": npm_latest },
"versions": serde_json::Value::Object(versions),
}))
.expect("valid npm package metadata")
}
#[test]
fn ready_version_requires_latest_dist_tag_and_root_dist() {
let latest = "1.2.3";
let package_info = package_info(latest, latest);
ensure_version_ready(&package_info, latest).expect("npm package is ready");
}
#[test]
fn ready_version_rejects_stale_latest_dist_tag() {
let package_info = package_info("1.2.3", "1.2.2");
let err = ensure_version_ready(&package_info, "1.2.3")
.expect_err("npm latest dist-tag must match GitHub latest");
assert!(
err.to_string().contains("latest dist-tag"),
"error should name stale latest dist-tag: {err}"
);
}
#[test]
fn ready_version_rejects_missing_root_dist() {
let package_info: NpmPackageInfo = serde_json::from_value(serde_json::json!({
"dist-tags": { "latest": "1.2.3" },
"versions": { "1.2.3": {} },
}))
.expect("valid npm package metadata");
let err = ensure_version_ready(&package_info, "1.2.3")
.expect_err("root package must have dist metadata");
assert!(
err.to_string().contains("missing dist metadata"),
"error should name missing dist metadata: {err}"
);
}
}

View File

@@ -0,0 +1,70 @@
pub(crate) fn is_newer(latest: &str, current: &str) -> Option<bool> {
match (parse_version(latest), parse_version(current)) {
(Some(l), Some(c)) => Some(l > c),
_ => None,
}
}
pub(crate) fn extract_version_from_latest_tag(latest_tag_name: &str) -> anyhow::Result<String> {
latest_tag_name
.strip_prefix("rust-v")
.map(str::to_owned)
.ok_or_else(|| anyhow::anyhow!("Failed to parse latest tag name '{latest_tag_name}'"))
}
pub(crate) fn is_source_build_version(version: &str) -> bool {
parse_version(version) == Some((0, 0, 0))
}
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
let mut iter = v.trim().split('.');
let maj = iter.next()?.parse::<u64>().ok()?;
let min = iter.next()?.parse::<u64>().ok()?;
let pat = iter.next()?.parse::<u64>().ok()?;
Some((maj, min, pat))
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
#[test]
fn extracts_version_from_latest_tag() {
assert_eq!(
extract_version_from_latest_tag("rust-v1.5.0").expect("failed to parse version"),
"1.5.0"
);
}
#[test]
fn latest_tag_without_prefix_is_invalid() {
assert!(extract_version_from_latest_tag("v1.5.0").is_err());
}
#[test]
fn prerelease_version_is_not_considered_newer() {
assert_eq!(is_newer("0.11.0-beta.1", "0.11.0"), None);
assert_eq!(is_newer("1.0.0-rc.1", "1.0.0"), None);
}
#[test]
fn plain_semver_comparisons_work() {
assert_eq!(is_newer("0.11.1", "0.11.0"), Some(true));
assert_eq!(is_newer("0.11.0", "0.11.1"), Some(false));
assert_eq!(is_newer("1.0.0", "0.9.9"), Some(true));
assert_eq!(is_newer("0.9.9", "1.0.0"), Some(false));
}
#[test]
fn source_build_version_is_not_checked() {
assert!(is_source_build_version("0.0.0"));
assert!(!is_source_build_version("0.1.0"));
}
#[test]
fn whitespace_is_ignored() {
assert_eq!(parse_version(" 1.2.3 \n"), Some((1, 2, 3)));
assert_eq!(is_newer(" 1.2.3 ", "1.2.2"), Some(true));
}
}

View File

@@ -1,8 +1,13 @@
#![cfg(not(debug_assertions))]
use crate::legacy_core::config::Config;
use crate::npm_registry;
use crate::npm_registry::NpmPackageInfo;
use crate::update_action;
use crate::update_action::UpdateAction;
use crate::update_versions::extract_version_from_latest_tag;
use crate::update_versions::is_newer;
use crate::update_versions::is_source_build_version;
use chrono::DateTime;
use chrono::Duration;
use chrono::Utc;
@@ -19,6 +24,7 @@ pub fn get_upgrade_version(config: &Config) -> Option<String> {
return None;
}
let action = update_action::get_update_action();
let version_file = version_filepath(config);
let info = read_version_info(&version_file).ok();
@@ -30,7 +36,7 @@ pub fn get_upgrade_version(config: &Config) -> Option<String> {
// isnt blocked by a network call. The UI reads the previously cached
// value (if any) for this run; the next run shows the banner if needed.
tokio::spawn(async move {
check_for_update(&version_file)
check_for_update(&version_file, action)
.await
.inspect_err(|e| tracing::error!("Failed to update version: {e}"))
});
@@ -78,8 +84,8 @@ fn read_version_info(version_file: &Path) -> anyhow::Result<VersionInfo> {
Ok(serde_json::from_str(&contents)?)
}
async fn check_for_update(version_file: &Path) -> anyhow::Result<()> {
let latest_version = match update_action::get_update_action() {
async fn check_for_update(version_file: &Path, action: Option<UpdateAction>) -> anyhow::Result<()> {
let latest_version = match action {
Some(UpdateAction::BrewUpgrade) => {
let HomebrewCaskInfo { version } = create_client()
.get(HOMEBREW_CASK_API_URL)
@@ -90,17 +96,20 @@ async fn check_for_update(version_file: &Path) -> anyhow::Result<()> {
.await?;
version
}
_ => {
let ReleaseInfo {
tag_name: latest_tag_name,
} = create_client()
.get(LATEST_RELEASE_URL)
Some(UpdateAction::NpmGlobalLatest) | Some(UpdateAction::BunGlobalLatest) => {
let latest_version = fetch_latest_github_release_version().await?;
let package_info = create_client()
.get(npm_registry::PACKAGE_URL)
.send()
.await?
.error_for_status()?
.json::<ReleaseInfo>()
.json::<NpmPackageInfo>()
.await?;
extract_version_from_latest_tag(&latest_tag_name)?
npm_registry::ensure_version_ready(&package_info, &latest_version)?;
latest_version
}
Some(UpdateAction::StandaloneUnix) | Some(UpdateAction::StandaloneWindows) | None => {
fetch_latest_github_release_version().await?
}
};
@@ -120,18 +129,17 @@ async fn check_for_update(version_file: &Path) -> anyhow::Result<()> {
Ok(())
}
fn is_newer(latest: &str, current: &str) -> Option<bool> {
match (parse_version(latest), parse_version(current)) {
(Some(l), Some(c)) => Some(l > c),
_ => None,
}
}
fn extract_version_from_latest_tag(latest_tag_name: &str) -> anyhow::Result<String> {
latest_tag_name
.strip_prefix("rust-v")
.map(str::to_owned)
.ok_or_else(|| anyhow::anyhow!("Failed to parse latest tag name '{latest_tag_name}'"))
async fn fetch_latest_github_release_version() -> anyhow::Result<String> {
let ReleaseInfo {
tag_name: latest_tag_name,
} = create_client()
.get(LATEST_RELEASE_URL)
.send()
.await?
.error_for_status()?
.json::<ReleaseInfo>()
.await?;
extract_version_from_latest_tag(&latest_tag_name)
}
/// Returns the latest version to show in a popup, if it should be shown.
@@ -168,68 +176,3 @@ pub async fn dismiss_version(config: &Config, version: &str) -> anyhow::Result<(
tokio::fs::write(version_file, json_line).await?;
Ok(())
}
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
let mut iter = v.trim().split('.');
let maj = iter.next()?.parse::<u64>().ok()?;
let min = iter.next()?.parse::<u64>().ok()?;
let pat = iter.next()?.parse::<u64>().ok()?;
Some((maj, min, pat))
}
fn is_source_build_version(version: &str) -> bool {
parse_version(version) == Some((0, 0, 0))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn extract_version_from_brew_api_json() {
//
// https://formulae.brew.sh/api/cask/codex.json
let cask_json = r#"{
"token": "codex",
"full_token": "codex",
"tap": "homebrew/cask",
"version": "0.96.0",
}"#;
let HomebrewCaskInfo { version } = serde_json::from_str::<HomebrewCaskInfo>(cask_json)
.expect("failed to parse version from cask json");
assert_eq!(version, "0.96.0");
}
#[test]
fn extracts_version_from_latest_tag() {
assert_eq!(
extract_version_from_latest_tag("rust-v1.5.0").expect("failed to parse version"),
"1.5.0"
);
}
#[test]
fn latest_tag_without_prefix_is_invalid() {
assert!(extract_version_from_latest_tag("v1.5.0").is_err());
}
#[test]
fn prerelease_version_is_not_considered_newer() {
assert_eq!(is_newer("0.11.0-beta.1", "0.11.0"), None);
assert_eq!(is_newer("1.0.0-rc.1", "1.0.0"), None);
}
#[test]
fn plain_semver_comparisons_work() {
assert_eq!(is_newer("0.11.1", "0.11.0"), Some(true));
assert_eq!(is_newer("0.11.0", "0.11.1"), Some(false));
assert_eq!(is_newer("1.0.0", "0.9.9"), Some(true));
assert_eq!(is_newer("0.9.9", "1.0.0"), Some(false));
}
#[test]
fn whitespace_is_ignored() {
assert_eq!(parse_version(" 1.2.3 \n"), Some((1, 2, 3)));
assert_eq!(is_newer(" 1.2.3 ", "1.2.2"), Some(true));
}
}