Compare commits

7 Commits

Author SHA1 Message Date
Michael Bolin
b99e9c3c74 Release 0.100.0-alpha.9 2026-02-12 00:57:41 -08:00
Michael Bolin
26d9bddc52 rust-release: exclude cargo-timing.html from release assets (#11564)
## Why
The `release` job in `.github/workflows/rust-release.yml` uploads
`files: dist/**` via `softprops/action-gh-release`. The downloaded
timing artifacts include multiple files with the same basename,
`cargo-timing.html` (one per target), which causes release asset
collisions/races and can fail with GitHub release-assets API `404 Not
Found` errors.

## What Changed
- Updated the existing cleanup step before `Create GitHub Release` to
remove all `cargo-timing.html` files from `dist/`.
- Removed any now-empty directories after deleting those timing files.

Relevant change:
- daba003d32/.github/workflows/rust-release.yml (L423)

## Verification
- Confirmed from failing release logs that multiple `cargo-timing.html`
files were being included in `dist/**` and that the release step failed
while operating on duplicate-named assets.
- Verified the workflow now deletes those files before the release
upload step, so `cargo-timing.html` is no longer part of the release
asset set.
2026-02-12 00:56:47 -08:00
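The collision described in this commit can be illustrated with a small sketch. It assumes nothing about the real workflow beyond what the message states: several files under `dist/` share the basename `cargo-timing.html`. The function name `duplicate_basenames` and the directory names in the usage note are hypothetical.

```rust
use std::collections::BTreeMap;
use std::path::Path;

/// Groups paths by their final component and keeps only the names that
/// occur more than once. These are the names that would collide if every
/// file were uploaded as a flat release asset.
/// (Hypothetical helper for illustration; not part of the workflow.)
fn duplicate_basenames(paths: &[&str]) -> BTreeMap<String, Vec<String>> {
    let mut by_name: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for &path in paths {
        // `file_name` yields the last path component, e.g. "cargo-timing.html".
        if let Some(name) = Path::new(path).file_name().and_then(|n| n.to_str()) {
            by_name
                .entry(name.to_string())
                .or_default()
                .push(path.to_string());
        }
    }
    // Drop names that appear only once; they cannot collide.
    by_name.retain(|_, group| group.len() > 1);
    by_name
}
```

Called with, say, `dist/target-a/cargo-timing.html`, `dist/target-b/cargo-timing.html`, and `dist/target-a/codex` (made-up paths), only `cargo-timing.html` is reported, which is the file the cleanup step now deletes before upload.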
xl-openai
6ca9b4327b fix: stop inheriting rate-limit limit_name (#11557)
When we carry over values from a partial rate-limit update, we should only
do so for the same `limit_id`.
2026-02-12 00:17:48 -08:00
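A minimal sketch of the merge behavior after this change, following the `merge_rate_limit_fields` hunk shown later in this diff. The `Snapshot` struct is a simplified stand-in for `RateLimitSnapshot` (the real type has more fields, and `credits` is reduced to a plain number here as an assumption for illustration).

```rust
/// Simplified stand-in for `RateLimitSnapshot`; field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    limit_id: Option<String>,
    limit_name: Option<String>,
    credits: Option<u32>,
}

/// Mirrors the post-change logic in the diff: a missing `limit_id` falls
/// back to the default "codex" bucket instead of the previous snapshot's
/// id, and `limit_name` is never inherited.
fn merge(previous: Option<&Snapshot>, mut snapshot: Snapshot) -> Snapshot {
    if snapshot.limit_id.is_none() {
        snapshot.limit_id = Some("codex".to_string());
    }
    // `credits` is still preserved from the previous snapshot when missing.
    if snapshot.credits.is_none() {
        snapshot.credits = previous.and_then(|prior| prior.credits);
    }
    snapshot
}
```

With a previous `codex` snapshot and a new `codex_other` snapshot that omits `limit_name`, the merged result keeps `limit_name: None`, which matches the updated test expectations in the diff.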
pakrym-oai
fd7f2aedc7 Handle response.incomplete (#11558)
Treat it the same as an error.
2026-02-12 00:11:38 -08:00
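The idea can be sketched without the real event types. The `Outcome` enum and `classify_event` function below are hypothetical simplifications; only the event names and the two message strings are taken from the `process_responses_event` hunk in this diff.

```rust
/// Hypothetical, simplified result of handling one stream event.
#[derive(Debug, PartialEq)]
enum Outcome {
    Completed,
    Error(String),
}

/// Maps an event type to an outcome. `incomplete_reason` stands in for the
/// `incomplete_details.reason` field of the response payload.
fn classify_event(event_type: &str, incomplete_reason: Option<&str>) -> Option<Outcome> {
    match event_type {
        "response.failed" => Some(Outcome::Error(
            "response.failed event received".to_string(),
        )),
        // New in this change: an incomplete response is surfaced as an error.
        "response.incomplete" => {
            let reason = incomplete_reason.unwrap_or("unknown");
            Some(Outcome::Error(format!(
                "Incomplete response returned, reason: {reason}"
            )))
        }
        "response.completed" => Some(Outcome::Completed),
        // Other event types are not terminal in this sketch.
        _ => None,
    }
}
```

When the payload carries no reason, the message falls back to `unknown`, as in the diff.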
Michael Bolin
08a000866f Fix linux-musl release link failures caused by glibc-only libcap artifacts (#11556)
Problem:
The `aarch64-unknown-linux-musl` release build was failing at link time with
`/usr/bin/ld: cannot find -lcap` while building binaries that transitively
pull in `codex-linux-sandbox`.

Why this is the right fix:
`codex-linux-sandbox` compiles vendored bubblewrap and links `libcap`. In the
musl jobs, we were installing distro `libcap-dev`, which provides host/glibc
artifacts. That is not a valid source of target-compatible static libcap for
musl cross-linking, so the fix is to produce a target-compatible libcap inside
the musl tool bootstrap and point pkg-config at it.

This also closes the CI coverage gap that allowed this to slip through: the
`rust-ci.yml` matrix did not exercise `aarch64-unknown-linux-musl` in `release`
mode. Adding that target/profile combination to CI is the right regression
barrier for this class of failure.

What changed:
- Updated `.github/scripts/install-musl-build-tools.sh` to install tooling
  needed to fetch/build libcap sources (`curl`, `xz-utils`, certs).
- Added deterministic libcap bootstrap in the musl tool root:
  - download `libcap-2.75` from kernel.org
  - verify SHA256
  - build with the target musl compiler (`*-linux-musl-gcc`)
  - stage `libcap.a` and headers under the target tool root
  - generate a target-scoped `libcap.pc`
- Exported target `PKG_CONFIG_PATH` so builds resolve the staged musl libcap
  instead of host pkg-config/lib paths.
- Updated `.github/workflows/rust-ci.yml` to add a `release` matrix entry for
  `aarch64-unknown-linux-musl` on the ARM runner.
- Updated `.github/workflows/rust-ci.yml` to set
  `CARGO_PROFILE_RELEASE_LTO=thin` for `release` matrix entries (and keep `fat`
  for non-release entries), matching the release-build tradeoff already used in
  `rust-release.yml` while reducing CI runtime.

Verification:
- Reproduced the original failure in CI-like containers:
  - `aarch64-unknown-linux-musl` failed with `cannot find -lcap`.
- Verified the underlying mismatch by forcing host libcap into the link:
  - link then failed with glibc-specific unresolved symbols
    (`__isoc23_*`, `__*_chk`), confirming host libcap was unsuitable.
- Verified the fix in CI-like containers after this change:
  - `cargo build -p codex-linux-sandbox --target aarch64-unknown-linux-musl --release` -> pass
  - `cargo build -p codex-linux-sandbox --target x86_64-unknown-linux-musl --release` -> pass
- Triggered `rust-ci` on this branch and confirmed the new job appears:
  - `Lint/Build — ubuntu-24.04-arm - aarch64-unknown-linux-musl (release)`
2026-02-12 08:08:32 +00:00
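The pkg-config export step in the install script derives a target-scoped variable name and prepends the staged libcap directory to any existing search path. The shell logic is in the diff below; the following is only a Rust restatement of the same string handling for readers who want to check it in isolation. The function names are hypothetical.

```rust
/// Builds the target-scoped variable name the script exports, e.g.
/// `PKG_CONFIG_PATH_aarch64_unknown_linux_musl`: hyphens in the target
/// triple are replaced with underscores.
fn target_pkg_config_var(target: &str) -> String {
    format!("PKG_CONFIG_PATH_{}", target.replace('-', "_"))
}

/// Puts the staged libcap pkg-config directory first, keeping any existing
/// `PKG_CONFIG_PATH` entries after it (colon-separated).
fn prepend_pkg_config_path(staged_dir: &str, existing: Option<&str>) -> String {
    match existing {
        Some(rest) if !rest.is_empty() => format!("{staged_dir}:{rest}"),
        _ => staged_dir.to_string(),
    }
}
```

Placing the staged directory first is what lets the musl build of `libcap.pc` win over any host copy that is also on the search path.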
Ahmed Ibrahim
21ceefc0d1 Add logs to model cache (#11551) 2026-02-11 23:25:31 -08:00
pakrym-oai
d391f3e2f9 Hide the first websocket retry (#11548)
Sometimes the connection needs to be quickly reestablished; don't produce an
error for that.
2026-02-11 22:48:13 -08:00
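The reporting condition introduced by this commit can be read as a pure predicate. The sketch below restates the `report_error` expression from the `run_sampling_request` hunk in this diff. The function name is hypothetical, and `debug_build` is passed in as a parameter (the real code uses `cfg!(debug_assertions)`) so the logic can be exercised for both build modes.

```rust
/// Returns true when a "Reconnecting..." notification should be shown.
/// The first retry is hidden only when all three hold: it is retry 1,
/// this is a release build, and the websocket path is enabled.
fn should_report_retry(retries: u32, websocket_enabled: bool, debug_build: bool) -> bool {
    retries > 1 || debug_build || !websocket_enabled
}
```

This matches the new test's expectations: in a release build with websockets enabled and two retries, only `Reconnecting... 2/2` is emitted, while a debug build emits both messages.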
12 changed files with 287 additions and 27 deletions

View File

@@ -17,7 +17,7 @@ if [[ -n "${APT_INSTALL_ARGS:-}" ]]; then
fi
sudo apt-get update "${apt_update_args[@]}"
-sudo apt-get install -y "${apt_install_args[@]}" musl-tools pkg-config libcap-dev g++ clang libc++-dev libc++abi-dev lld
+sudo apt-get install -y "${apt_install_args[@]}" ca-certificates curl musl-tools pkg-config libcap-dev g++ clang libc++-dev libc++abi-dev lld xz-utils
case "${TARGET}" in
x86_64-unknown-linux-musl)
@@ -32,6 +32,11 @@ case "${TARGET}" in
;;
esac
libcap_version="2.75"
libcap_sha256="de4e7e064c9ba451d5234dd46e897d7c71c96a9ebf9a0c445bc04f4742d83632"
libcap_tarball_name="libcap-${libcap_version}.tar.xz"
libcap_download_url="https://mirrors.edge.kernel.org/pub/linux/libs/security/linux-privs/libcap2/${libcap_tarball_name}"
# Use the musl toolchain as the Rust linker to avoid Zig injecting its own CRT.
if command -v "${arch}-linux-musl-gcc" >/dev/null; then
musl_linker="$(command -v "${arch}-linux-musl-gcc")"
@@ -47,6 +52,43 @@ runner_temp="${RUNNER_TEMP:-/tmp}"
tool_root="${runner_temp}/codex-musl-tools-${TARGET}"
mkdir -p "${tool_root}"
libcap_root="${tool_root}/libcap-${libcap_version}"
libcap_src_root="${libcap_root}/src"
libcap_prefix="${libcap_root}/prefix"
libcap_pkgconfig_dir="${libcap_prefix}/lib/pkgconfig"
if [[ ! -f "${libcap_prefix}/lib/libcap.a" ]]; then
mkdir -p "${libcap_src_root}" "${libcap_prefix}/lib" "${libcap_prefix}/include/sys" "${libcap_prefix}/include/linux" "${libcap_pkgconfig_dir}"
libcap_tarball="${libcap_root}/${libcap_tarball_name}"
curl -fsSL "${libcap_download_url}" -o "${libcap_tarball}"
echo "${libcap_sha256} ${libcap_tarball}" | sha256sum -c -
tar -xJf "${libcap_tarball}" -C "${libcap_src_root}"
libcap_source_dir="${libcap_src_root}/libcap-${libcap_version}"
make -C "${libcap_source_dir}/libcap" -j"$(nproc)" \
CC="${musl_linker}" \
AR=ar \
RANLIB=ranlib
cp "${libcap_source_dir}/libcap/libcap.a" "${libcap_prefix}/lib/libcap.a"
cp "${libcap_source_dir}/libcap/include/uapi/linux/capability.h" "${libcap_prefix}/include/linux/capability.h"
cp "${libcap_source_dir}/libcap/../libcap/include/sys/capability.h" "${libcap_prefix}/include/sys/capability.h"
cat > "${libcap_pkgconfig_dir}/libcap.pc" <<EOF
prefix=${libcap_prefix}
exec_prefix=\${prefix}
libdir=\${prefix}/lib
includedir=\${prefix}/include
Name: libcap
Description: Linux capabilities
Version: ${libcap_version}
Libs: -L\${libdir} -lcap
Cflags: -I\${includedir}
EOF
fi
sysroot=""
if command -v zig >/dev/null; then
zig_bin="$(command -v zig)"
@@ -220,6 +262,14 @@ echo "CMAKE_ARGS=-DCMAKE_HAVE_THREADS_LIBRARY=1 -DCMAKE_USE_PTHREADS_INIT=1 -DCM
# Allow pkg-config resolution during cross-compilation.
echo "PKG_CONFIG_ALLOW_CROSS=1" >> "$GITHUB_ENV"
pkg_config_path="${libcap_pkgconfig_dir}"
if [[ -n "${PKG_CONFIG_PATH:-}" ]]; then
pkg_config_path="${pkg_config_path}:${PKG_CONFIG_PATH}"
fi
echo "PKG_CONFIG_PATH=${pkg_config_path}" >> "$GITHUB_ENV"
pkg_config_path_var="PKG_CONFIG_PATH_${TARGET}"
pkg_config_path_var="${pkg_config_path_var//-/_}"
echo "${pkg_config_path_var}=${libcap_pkgconfig_dir}" >> "$GITHUB_ENV"
if [[ -n "${sysroot}" && "${sysroot}" != "/" ]]; then
echo "PKG_CONFIG_SYSROOT_DIR=${sysroot}" >> "$GITHUB_ENV"

View File

@@ -99,6 +99,8 @@ jobs:
USE_SCCACHE: ${{ startsWith(matrix.runner, 'windows') && 'false' || 'true' }}
CARGO_INCREMENTAL: "0"
SCCACHE_CACHE_SIZE: 10G
# In rust-ci, representative release-profile checks use thin LTO for faster feedback.
CARGO_PROFILE_RELEASE_LTO: ${{ matrix.profile == 'release' && 'thin' || 'fat' }}
strategy:
fail-fast: false
@@ -160,6 +162,12 @@ jobs:
runs_on:
group: codex-runners
labels: codex-linux-x64
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-musl
profile: release
runs_on:
group: codex-runners
labels: codex-linux-arm64
- runner: windows-x64
target: x86_64-pc-windows-msvc
profile: release

View File

@@ -424,6 +424,11 @@ jobs:
run: |
rm -rf dist/shell-tool-mcp*
rm -rf dist/windows-binaries*
# cargo-timing.html appears under multiple target-specific directories.
# If included in files: dist/**, release upload races on duplicate
# asset names and can fail with 404s.
find dist -type f -name 'cargo-timing.html' -delete
find dist -type d -empty -delete
ls -R dist/

View File

@@ -64,7 +64,7 @@ members = [
resolver = "2"
[workspace.package]
-version = "0.0.0"
+version = "0.100.0-alpha.9"
# Track the edition for all workspace crates in one place. Individual
# crates can still override this value, but keeping it here means new
# crates created with `cargo new -w ...` automatically inherit the 2024

View File

@@ -254,6 +254,17 @@ pub fn process_responses_event(
"response.failed event received".into(),
)));
}
"response.incomplete" => {
let reason = event.response.as_ref().and_then(|response| {
response
.get("incomplete_details")
.and_then(|details| details.get("reason"))
.and_then(Value::as_str)
});
let reason = reason.unwrap_or("unknown");
let message = format!("Incomplete response returned, reason: {reason}");
return Err(ResponsesEventError::Api(ApiError::Stream(message)));
}
"response.completed" => {
if let Some(resp_val) = event.response {
match serde_json::from_value::<ResponseCompleted>(resp_val) {

View File

@@ -346,7 +346,7 @@ impl ModelClient {
///
/// This combines provider capability and feature gating; both must be true for websocket paths
/// to be eligible.
-fn responses_websocket_enabled(&self, model_info: &ModelInfo) -> bool {
+pub fn responses_websocket_enabled(&self, model_info: &ModelInfo) -> bool {
self.state.provider.supports_websockets
&& (self.state.enable_responses_websockets || model_info.prefer_websockets)
}

View File

@@ -4482,16 +4482,26 @@ async fn run_sampling_request(
"stream disconnected - retrying sampling request ({retries}/{max_retries} in {delay:?})...",
);
-// Surface retry information to any UI/frontend so the
-// user understands what is happening instead of staring
-// at a seemingly frozen screen.
-sess.notify_stream_error(
-&turn_context,
-format!("Reconnecting... {retries}/{max_retries}"),
-err,
-)
-.await;
+// In release builds, hide the first websocket retry notification to reduce noisy
+// transient reconnect messages. In debug builds, keep full visibility for diagnosis.
+let report_error = retries > 1
+|| cfg!(debug_assertions)
+|| !sess
+.services
+.model_client
+.responses_websocket_enabled(&turn_context.model_info);
+if report_error {
+// Surface retry information to any UI/frontend so the
+// user understands what is happening instead of staring
+// at a seemingly frozen screen.
+sess.notify_stream_error(
+&turn_context,
+format!("Reconnecting... {retries}/{max_retries}"),
+err,
+)
+.await;
+}
tokio::time::sleep(delay).await;
} else {
return Err(err);

View File

@@ -9,6 +9,7 @@ use std::path::PathBuf;
use std::time::Duration;
use tokio::fs;
use tracing::error;
use tracing::info;
/// Manages loading and saving of models cache to disk.
#[derive(Debug)]
@@ -28,6 +29,11 @@ impl ModelsCacheManager {
/// Attempt to load a fresh cache entry. Returns `None` if the cache doesn't exist or is stale.
pub(crate) async fn load_fresh(&self, expected_version: &str) -> Option<ModelsCache> {
info!(
cache_path = %self.cache_path.display(),
expected_version,
"models cache: attempting load_fresh"
);
let cache = match self.load().await {
Ok(cache) => cache?,
Err(err) => {
@@ -35,12 +41,35 @@ impl ModelsCacheManager {
return None;
}
};
info!(
cache_path = %self.cache_path.display(),
cached_version = ?cache.client_version,
fetched_at = %cache.fetched_at,
"models cache: loaded cache file"
);
if cache.client_version.as_deref() != Some(expected_version) {
info!(
cache_path = %self.cache_path.display(),
expected_version,
cached_version = ?cache.client_version,
"models cache: cache version mismatch"
);
return None;
}
if !cache.is_fresh(self.cache_ttl) {
info!(
cache_path = %self.cache_path.display(),
cache_ttl_secs = self.cache_ttl.as_secs(),
fetched_at = %cache.fetched_at,
"models cache: cache is stale"
);
return None;
}
info!(
cache_path = %self.cache_path.display(),
cache_ttl_secs = self.cache_ttl.as_secs(),
"models cache: cache hit"
);
Some(cache)
}

View File

@@ -26,6 +26,7 @@ use tokio::sync::RwLock;
use tokio::sync::TryLockError;
use tokio::time::timeout;
use tracing::error;
use tracing::info;
const MODEL_CACHE_FILE: &str = "models_cache.json";
const DEFAULT_MODEL_CACHE_TTL: Duration = Duration::from_secs(300);
@@ -216,8 +217,10 @@ impl ModelsManager {
RefreshStrategy::OnlineIfUncached => {
// Try cache first, fall back to online if unavailable
if self.try_load_cache().await {
info!("models cache: using cached models for OnlineIfUncached");
return Ok(());
}
info!("models cache: cache miss, fetching remote models");
self.fetch_and_update_models().await
}
RefreshStrategy::Online => {
@@ -285,13 +288,22 @@ impl ModelsManager {
let _timer =
codex_otel::start_global_timer("codex.remote_models.load_cache.duration_ms", &[]);
let client_version = crate::models_manager::client_version_to_whole();
+info!(client_version, "models cache: evaluating cache eligibility");
let cache = match self.cache_manager.load_fresh(&client_version).await {
Some(cache) => cache,
-None => return false,
+None => {
+info!("models cache: no usable cache entry");
+return false;
+}
};
let models = cache.models.clone();
*self.etag.write().await = cache.etag.clone();
self.apply_remote_models(models.clone()).await;
+info!(
+models_count = models.len(),
+etag = ?cache.etag,
+"models cache: cache entry applied"
+);
true
}

View File

@@ -177,20 +177,15 @@ impl SessionState {
}
}
-// Merge partial rate-limit updates: new fields overwrite existing values;
-// missing fields retain prior values. If `limit_id` is absent everywhere,
-// default it to `"codex"`.
+// Sometimes new snapshots don't include credits or plan information.
+// Preserve those from the previous snapshot when missing. For `limit_id`, treat
+// missing values as the default `"codex"` bucket.
fn merge_rate_limit_fields(
previous: Option<&RateLimitSnapshot>,
mut snapshot: RateLimitSnapshot,
) -> RateLimitSnapshot {
if snapshot.limit_id.is_none() {
-snapshot.limit_id = previous
-.and_then(|prior| prior.limit_id.clone())
-.or_else(|| Some("codex".to_string()));
-}
-if snapshot.limit_name.is_none() {
-snapshot.limit_name = previous.and_then(|prior| prior.limit_name.clone());
+snapshot.limit_id = Some("codex".to_string());
}
if snapshot.credits.is_none() {
snapshot.credits = previous.and_then(|prior| prior.credits.clone());
@@ -305,7 +300,7 @@ mod tests {
}
#[tokio::test]
-async fn set_rate_limits_preserves_previous_limit_id_when_missing() {
+async fn set_rate_limits_defaults_to_codex_when_limit_id_missing_after_other_bucket() {
let session_configuration = make_session_configuration_for_tests().await;
let mut state = SessionState::new(session_configuration);
@@ -339,12 +334,12 @@ mod tests {
.latest_rate_limits
.as_ref()
.and_then(|v| v.limit_id.clone()),
-Some("codex_other".to_string())
+Some("codex".to_string())
);
}
#[tokio::test]
-async fn set_rate_limits_accepts_new_limit_id_bucket() {
+async fn set_rate_limits_carries_credits_and_plan_type_from_codex_to_codex_other() {
let session_configuration = make_session_configuration_for_tests().await;
let mut state = SessionState::new(session_configuration);
@@ -367,7 +362,7 @@ mod tests {
state.set_rate_limits(RateLimitSnapshot {
limit_id: Some("codex_other".to_string()),
-limit_name: Some("codex_other".to_string()),
+limit_name: None,
primary: Some(RateLimitWindow {
used_percent: 30.0,
window_minutes: Some(120),
@@ -382,7 +377,7 @@ mod tests {
state.latest_rate_limits,
Some(RateLimitSnapshot {
limit_id: Some("codex_other".to_string()),
-limit_name: Some("codex_other".to_string()),
+limit_name: None,
primary: Some(RateLimitWindow {
used_percent: 30.0,
window_minutes: Some(120),

View File

@@ -37,6 +37,8 @@ use codex_protocol::user_input::UserInput;
use core_test_support::load_default_config_for_test;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_completed_with_tokens;
use core_test_support::responses::ev_message_item_added;
use core_test_support::responses::ev_output_text_delta;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::mount_sse_once_match;
@@ -1787,6 +1789,64 @@ async fn context_window_error_sets_total_tokens_to_model_window() -> anyhow::Res
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn incomplete_response_emits_content_filter_error_message() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = MockServer::start().await;
let incomplete_response = sse(vec![
ev_response_created("resp_incomplete"),
ev_message_item_added("msg_incomplete", "partial content"),
ev_output_text_delta("continued chunk"),
json!({
"type": "response.incomplete",
"response": {
"id": "resp_incomplete",
"object": "response",
"status": "incomplete",
"error": null,
"incomplete_details": {
"reason": "content_filter"
}
}
}),
]);
let responses_mock = mount_sse_once(&server, incomplete_response).await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await?;
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "trigger incomplete".into(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
})
.await?;
let error_event = wait_for_event(&codex, |ev| matches!(ev, EventMsg::Error(_))).await;
assert!(
matches!(
error_event,
EventMsg::Error(ref err)
if err.message
== "stream disconnected before completion: Incomplete response returned, reason: content_filter"
),
"expected incomplete content filter error; got {error_event:?}"
);
assert_eq!(responses_mock.requests().len(), 1);
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn azure_overrides_assign_properties_used_for_responses_url() {
skip_if_no_network!();

View File

@@ -1,5 +1,11 @@
use anyhow::Result;
use codex_core::features::Feature;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::user_input::UserInput;
use core_test_support::responses;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_response_created;
@@ -7,8 +13,11 @@ use core_test_support::responses::mount_sse_once;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use pretty_assertions::assert_eq;
use tokio::time::Duration;
use tokio::time::timeout;
use wiremock::Mock;
use wiremock::ResponseTemplate;
use wiremock::http::Method;
@@ -113,6 +122,77 @@ async fn websocket_fallback_switches_to_http_after_retries_exhausted() -> Result
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn websocket_fallback_hides_first_websocket_retry_stream_error() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let response_mock = mount_sse_once(
&server,
sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
)
.await;
let mut builder = test_codex().with_config({
let base_url = format!("{}/v1", server.uri());
move |config| {
config.model_provider.base_url = Some(base_url);
config.model_provider.wire_api = codex_core::WireApi::Responses;
config.features.enable(Feature::ResponsesWebsockets);
config.model_provider.stream_max_retries = Some(2);
config.model_provider.request_max_retries = Some(0);
}
});
let TestCodex {
codex,
session_configured,
cwd,
..
} = builder.build(&server).await?;
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "hello".into(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_configured.model.clone(),
effort: None,
summary: ReasoningSummary::Auto,
collaboration_mode: None,
personality: None,
})
.await?;
let mut stream_error_messages = Vec::new();
loop {
let event = timeout(Duration::from_secs(10), codex.next_event())
.await
.expect("timeout waiting for event")
.expect("event stream ended unexpectedly")
.msg;
match event {
EventMsg::StreamError(e) => stream_error_messages.push(e.message),
EventMsg::TurnComplete(_) => break,
_ => {}
}
}
let expected_stream_errors = if cfg!(debug_assertions) {
vec!["Reconnecting... 1/2", "Reconnecting... 2/2"]
} else {
vec!["Reconnecting... 2/2"]
};
assert_eq!(stream_error_messages, expected_stream_errors);
assert_eq!(response_mock.requests().len(), 1);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn websocket_fallback_is_sticky_across_turns() -> Result<()> {
skip_if_no_network!(Ok(()));