Commit Graph

28 Commits

Author SHA1 Message Date
Michael Bolin
4b2d86c360 Refactor Bazel CI job setup 2026-04-13 15:25:00 -07:00
Ruslan Nigmatullin
0225479f0d bazel: Always save bazel repository cache (#16926)
This should improve the cache hit ratio for external deps and such
2026-04-06 12:06:58 -07:00
Michael Bolin
39097ab65d ci: align Bazel repo cache and Windows clippy target handling (#16740)
## Why

Bazel CI had two independent Windows issues:

- The workflow saved/restored `~/.cache/bazel-repo-cache`, but
`.bazelrc` configured `common:ci-windows
--repository_cache=D:/a/.cache/bazel-repo-cache`, so `actions/cache` and
Bazel could point at different directories.
- The Windows `Bazel clippy` job passed the full explicit target list
from `//codex-rs/...`, but some of those explicit targets are
intentionally incompatible with `//:local_windows`.
`run-argument-comment-lint-bazel.sh` already handles that with
`--skip_incompatible_explicit_targets`; the clippy workflow path did
not.

I also tried switching the workflow cache path to
`D:\a\.cache\bazel-repo-cache`, but the Windows clippy job repeatedly
failed with `Failed to restore: Cache service responded with 400`, so
the final change standardizes on `$HOME/.cache/bazel-repo-cache` and
makes cache restore non-fatal.

## What Changed

- Expose one repository-cache path from
`.github/actions/setup-bazel-ci/action.yml` and export that path as
`BAZEL_REPOSITORY_CACHE` so `run-bazel-ci.sh` passes it to Bazel after
`--config=ci-*`.
- Move `actions/cache/restore` out of the composite action into
`.github/workflows/bazel.yml`, and make restore failures non-fatal
there.
- Save exactly the exported cache path in `.github/workflows/bazel.yml`.
- Remove `common:ci-windows
--repository_cache=D:/a/.cache/bazel-repo-cache` from `.bazelrc` so the
Windows CI config no longer disagrees with the workflow cache path.
- Pass `--skip_incompatible_explicit_targets` in the Windows `Bazel
clippy` job so incompatible explicit targets do not fail analysis while
the lint aspect still traverses compatible Rust dependencies.

## Verification

- Parsed `.github/actions/setup-bazel-ci/action.yml` and
`.github/workflows/bazel.yml` with Ruby's YAML loader.
- Resubmitted PR `#16740`; CI is rerunning on the amended commit.
2026-04-03 20:18:33 -07:00
Michael Bolin
c9e706f8b6 Back out "bazel: lint rust_test targets in clippy workflow (#16450)" (#16757)
This backs out https://github.com/openai/codex/pull/16450 because it was
not good to go yet.
2026-04-03 20:01:26 -07:00
Michael Bolin
f263607c60 bazel: lint rust_test targets in clippy workflow (#16450)
## Why

`cargo clippy --tests` was catching warnings in inline `#[cfg(test)]`
code that the Bazel PR Clippy lane missed. The existing Bazel invocation
linted `//codex-rs/...`, but that did not apply Clippy to the generated
manual `rust_test` binaries, so warnings in targets such as
`//codex-rs/state:state-unit-tests-bin` only surfaced as plain compile
warnings instead of failing the lint job.

## What Changed

- added `scripts/list-bazel-clippy-targets.sh` to expand the Bazel
Clippy target set with the generated manual `rust_test` rules while
still excluding `//codex-rs/v8-poc:all`
- updated `.github/workflows/bazel.yml` to use that expanded target list
in the Bazel Clippy PR job
- updated `just bazel-clippy` to use the same target expansion locally
- updated `.github/workflows/README.md` to document that the Bazel PR
lint lane now covers inline `#[cfg(test)]` code

## Verification

- `./scripts/list-bazel-clippy-targets.sh` includes
`//codex-rs/state:state-unit-tests-bin`
- `bazel build --config=clippy -- //codex-rs/state:state-unit-tests-bin`
now fails with the same unused import in `state/src/runtime/logs.rs`
that `cargo clippy --tests` reports
2026-04-03 22:44:53 +00:00
Michael Bolin
eaf12beacf Codex/windows bazel rust test coverage no rs (#16528)
# Why this PR exists

This PR is trying to fix a coverage gap in the Windows Bazel Rust test
lane.

Before this change, the Windows `bazel test //...` job was nominally
part of PR CI, but a non-trivial set of `//codex-rs/...` Rust test
targets did not actually contribute test signal on Windows. In
particular, targets such as `//codex-rs/core:core-unit-tests`,
`//codex-rs/core:core-all-test`, and `//codex-rs/login:login-unit-tests`
were incompatible during Bazel analysis on the Windows gnullvm platform,
so they never reached test execution there. That is why the
Cargo-powered Windows CI job could surface Windows-only failures that
the Bazel-powered job did not report: Cargo was executing those tests,
while Bazel was silently dropping them from the runnable target set.

The main goal of this PR is to make the Windows Bazel test lane execute
those Rust test targets instead of skipping them during analysis, while
still preserving `windows-gnullvm` as the target configuration for the
code under test. In other words: use an MSVC host/exec toolchain where
Bazel helper binaries and build scripts need it, but continue compiling
the actual crate targets with the Windows gnullvm cfgs that our current
Bazel matrix is supposed to exercise.

# Important scope note

This branch intentionally removes the non-resource-loading `.rs` test
and production-code changes from the earlier
`codex/windows-bazel-rust-test-coverage` branch. The only Rust source
changes kept here are runfiles/resource-loading fixes in TUI tests:

- `codex-rs/tui/src/chatwidget/tests.rs`
- `codex-rs/tui/tests/manager_dependency_regression.rs`

That is deliberate. Since the corresponding tests already pass under
Cargo, this PR is meant to test whether Bazel infrastructure/toolchain
fixes alone are enough to get a healthy Windows Bazel test signal,
without changing test behavior for Windows timing, shell output, or
SQLite file-locking.

# How this PR changes the Windows Bazel setup

## 1. Split Windows host/exec and target concerns in the Bazel test lane

The core change is that the Windows Bazel test job now opts into an MSVC
host platform for Bazel execution-time tools, but only for `bazel test`,
not for the Bazel clippy build.

Files:

- `.github/workflows/bazel.yml`
- `.github/scripts/run-bazel-ci.sh`
- `MODULE.bazel`

What changed:

- `run-bazel-ci.sh` now accepts `--windows-msvc-host-platform`.
- When that flag is present on Windows, the wrapper appends
`--host_platform=//:local_windows_msvc` unless the caller already
provided an explicit `--host_platform`.
- `bazel.yml` passes that wrapper flag only for the Windows `bazel test
//...` job.
- The Bazel clippy job intentionally does **not** pass that flag, so
clippy stays on the default Windows gnullvm host/exec path and continues
linting against the target cfgs we care about.
- `run-bazel-ci.sh` also now forwards `CODEX_JS_REPL_NODE_PATH` on
Windows and normalizes the `node` executable path with `cygpath -w`, so
tests that need Node resolve the runner's Node installation correctly
under the Windows Bazel test environment.

Why this helps:

- The original incompatibility chain was mostly on the **exec/tool**
side of the graph, not in the Rust test code itself. Moving host tools
to MSVC lets Bazel resolve helper binaries and generators that were not
viable on the gnullvm exec platform.
- Keeping the target platform on gnullvm preserves cfg coverage for the
crates under test, which is important because some Windows behavior
differs between `msvc` and `gnullvm`.

## 2. Teach the repo's Bazel Rust macro about Windows link flags and
integration-test knobs

Files:

- `defs.bzl`
- `codex-rs/core/BUILD.bazel`
- `codex-rs/otel/BUILD.bazel`
- `codex-rs/tui/BUILD.bazel`

What changed:

- Replaced the old gnullvm-only linker flag block with
`WINDOWS_RUSTC_LINK_FLAGS`, which now handles both Windows ABIs:
  - gnullvm gets `-C link-arg=-Wl,--stack,8388608`
- MSVC gets `-C link-arg=/STACK:8388608`, `-C
link-arg=/NODEFAULTLIB:libucrt.lib`, and `-C link-arg=ucrt.lib`
- Threaded those Windows link flags into generated `rust_binary`,
unit-test binaries, and integration-test binaries.
- Extended `codex_rust_crate(...)` with:
  - `integration_test_args`
  - `integration_test_timeout`
- Used those new knobs to:
- mark `//codex-rs/core:core-all-test` as a long-running integration
test
  - serialize `//codex-rs/otel:otel-all-test` with `--test-threads=1`
- Added `src/**/*.rs` to `codex-rs/tui` test runfiles, because one
regression test scans source files at runtime and Bazel does not expose
source-tree directories unless they are declared as data.

Why this helps:

- Once host-side MSVC tools are available, we still need the generated
Rust test binaries to link correctly on Windows. The MSVC-side
stack/UCRT flags make those binaries behave more like their Cargo-built
equivalents.
- The integration-test macro knobs avoid hardcoding one-off test
behavior in ad hoc BUILD rules and make the generated test targets more
expressive where Bazel and Cargo have different runtime defaults.

## 3. Patch `rules_rs` / `rules_rust` so Windows MSVC exec-side Rust and
build scripts are actually usable

Files:

- `MODULE.bazel`
- `patches/rules_rs_windows_exec_linker.patch`
- `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch`
- `patches/rules_rust_windows_build_script_runner_paths.patch`
- `patches/rules_rust_windows_exec_msvc_build_script_env.patch`
- `patches/rules_rust_windows_msvc_direct_link_args.patch`
- `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch`
- `patches/BUILD.bazel`

What these patches do:

- `rules_rs_windows_exec_linker.patch`
- Adds a `rust-lld` filegroup for Windows Rust toolchain repos,
symlinked to `lld-link.exe` from `PATH`.
  - Marks Windows toolchains as using a direct linker driver.
  - Supplies Windows stdlib link flags for both gnullvm and MSVC.
- `rules_rust_windows_bootstrap_process_wrapper_linker.patch`
- For Windows MSVC Rust targets, prefers the Rust toolchain linker over
an inherited C++ linker path like `clang++`.
- This specifically avoids the broken mixed-mode command line where
rustc emits MSVC-style `/NOLOGO` / `/LIBPATH:` / `/OUT:` arguments but
Bazel still invokes `clang++.exe`.
- `rules_rust_windows_build_script_runner_paths.patch`
- Normalizes forward-slash execroot-relative paths into Windows path
separators before joining them on Windows.
- Uses short Windows paths for `RUSTC`, `OUT_DIR`, and the build-script
working directory to avoid path-length and quoting issues in third-party
build scripts.
- Exposes `RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER=1` to build scripts so
crate-local patches can detect "this is running under Bazel's
build-script runner".
- Fixes the Windows runfiles cleanup filter so generated files with
retained suffixes are actually retained.
- `rules_rust_windows_exec_msvc_build_script_env.patch`
- For exec-side Windows MSVC build scripts, stops force-injecting
Bazel's `CC`, `CXX`, `LD`, `CFLAGS`, and `CXXFLAGS` when that would send
GNU-flavored tool paths/flags into MSVC-oriented Cargo build scripts.
- Rewrites or strips GNU-only `--sysroot`, MinGW include/library paths,
stack-protector, and `_FORTIFY_SOURCE` flags on the MSVC exec path.
- The practical effect is that build scripts can fall back to the Visual
Studio toolchain environment already exported by CI instead of crashing
inside Bazel's hermetic `clang.exe` setup.
- `rules_rust_windows_msvc_direct_link_args.patch`
- When using a direct linker on Windows, stops forwarding GNU driver
flags such as `-L...` and `--sysroot=...` that `lld-link.exe` does not
understand.
- Passes non-`.lib` native artifacts as explicit `-Clink-arg=<path>`
entries when needed.
- Filters C++ runtime libraries to `.lib` artifacts on the Windows
direct-driver path.
- `rules_rust_windows_process_wrapper_skip_temp_outputs.patch`
- Excludes transient `*.tmp*` and `*.rcgu.o` files from process-wrapper
dependency search-path consolidation, so unstable compiler outputs do
not get treated as real link search-path inputs.

Why this helps:

- The host-platform split alone was not enough. Once Bazel started
analyzing/running previously incompatible Rust tests on Windows, the
next failures were in toolchain plumbing:
- MSVC-targeted Rust tests were being linked through `clang++` with
MSVC-style arguments.
- Cargo build scripts running under Bazel's Windows MSVC exec platform
were handed Unix/GNU-flavored path and flag shapes.
- Some generated paths were too long or had path-separator forms that
third-party Windows build scripts did not tolerate.
- These patches make that mixed Bazel/Cargo/Rust/MSVC path workable
enough for the test lane to actually build and run the affected crates.

## 4. Patch third-party crate build scripts that were not robust under
Bazel's Windows MSVC build-script path

Files:

- `MODULE.bazel`
- `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch`
- `patches/ring_windows_msvc_include_dirs.patch`
- `patches/zstd-sys_windows_msvc_include_dirs.patch`

What changed:

- `aws-lc-sys`
- Detects Bazel's Windows MSVC build-script runner via
`RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER` or a `bazel-out` manifest-dir
path.
- Uses `clang-cl` for Bazel Windows MSVC builds when no explicit
`CC`/`CXX` is set.
- Allows prebuilt NASM on the Bazel Windows MSVC path even when `nasm`
is not available directly in the runner environment.
- Avoids canonicalizing `CARGO_MANIFEST_DIR` in the Bazel Windows MSVC
case, because that path may point into Bazel output/runfiles state where
preserving the given path is more reliable than forcing a local
filesystem canonicalization.
- `ring`
- Under the Bazel Windows MSVC build-script runner, copies the
pregenerated source tree into `OUT_DIR` and uses that as the
generated-source root.
- Adds include paths needed by MSVC compilation for
Fiat/curve25519/P-256 generated headers.
- Rewrites a few relative includes in C sources so the added include
directories are sufficient.
- `zstd-sys`
- Adds MSVC-only include directories for `compress`, `decompress`, and
feature-gated dictionary/legacy/seekable sources.
- Skips `-fvisibility=hidden` on MSVC targets, where that
GCC/Clang-style flag is not the right mechanism.

Why this helps:

- After the `rules_rust` plumbing started running build scripts on the
Windows MSVC exec path, some third-party crates still failed for
crate-local reasons: wrong compiler choice, missing include directories,
build-script assumptions about manifest paths, or Unix-only C compiler
flags.
- These crate patches address those crate-local assumptions so the
larger toolchain change can actually reach first-party Rust test
execution.

## 5. Keep the only `.rs` test changes to Bazel/Cargo runfiles parity

Files:

- `codex-rs/tui/src/chatwidget/tests.rs`
- `codex-rs/tui/tests/manager_dependency_regression.rs`

What changed:

- Instead of asking `find_resource!` for a directory runfile like
`src/chatwidget/snapshots` or `src`, these tests now resolve one known
file runfile first and then walk to its parent directory.

Why this helps:

- Bazel runfiles are more reliable for explicitly declared files than
for source-tree directories that happen to exist in a Cargo checkout.
- This keeps the tests working under both Cargo and Bazel without
changing their actual assertions.

# What we tried before landing on this shape, and why those attempts did
not work

## Attempt 1: Force `--host_platform=//:local_windows_msvc` for all
Windows Bazel jobs

This did make the previously incompatible test targets show up during
analysis, but it also pushed the Bazel clippy job and some unrelated
build actions onto the MSVC exec path.

Why that was bad:

- Windows clippy started running third-party Cargo build scripts with
Bazel's MSVC exec settings and crashed in crates such as `tree-sitter`
and `libsqlite3-sys`.
- That was a regression in a job that was previously giving useful
gnullvm-targeted lint signal.

What this PR does instead:

- The wrapper flag is opt-in, and `bazel.yml` uses it only for the
Windows `bazel test` lane.
- The clippy lane stays on the default Windows gnullvm host/exec
configuration.

## Attempt 2: Broaden the `rules_rust` linker override to all Windows
Rust actions

This fixed the MSVC test-lane failure where normal `rust_test` targets
were linked through `clang++` with MSVC-style arguments, but it broke
the default gnullvm path.

Why that was bad:

-
`@@rules_rs++rules_rust+rules_rust//util/process_wrapper:process_wrapper`
on the gnullvm exec platform started linking with `lld-link.exe` and
then failed to resolve MinGW-style libraries such as `-lkernel32`,
`-luser32`, and `-lmingw32`.

What this PR does instead:

- The linker override is restricted to Windows MSVC targets only.
- The gnullvm path keeps its original linker behavior, while MSVC uses
the direct Windows linker.

## Attempt 3: Keep everything on pure Windows gnullvm and patch the V8 /
Python incompatibility chain instead

This would have preserved a single Windows ABI everywhere, but it is a
much larger project than this PR.

Why that was not the practical first step:

- The original incompatibility chain ran through exec-side generators
and helper tools, not only through crate code.
- `third_party/v8` is already special-cased on Windows gnullvm because
`rusty_v8` only publishes Windows prebuilts under MSVC names.
- Fixing that path likely means deeper changes in
V8/rules_python/rules_rust toolchain resolution and generator execution,
not just one local CI flag.

What this PR does instead:

- Keep gnullvm for the target cfgs we want to exercise.
- Move only the Windows test lane's host/exec platform to MSVC, then
patch the build-script/linker boundary enough for that split
configuration to work.

## Attempt 4: Validate compatibility with `bazel test --nobuild ...`

This turned out to be a misleading local validation command.

Why:

- `bazel test --nobuild ...` can successfully analyze targets and then
still exit 1 with "Couldn't start the build. Unable to run tests"
because there are no runnable test actions after `--nobuild`.

Better local check:

```powershell
bazel build --nobuild --keep_going --host_platform=//:local_windows_msvc //codex-rs/login:login-unit-tests //codex-rs/core:core-unit-tests //codex-rs/core:core-all-test
```

# Which patches probably deserve upstream follow-up

My rough take is that the `rules_rs` / `rules_rust` patches are the
highest-value upstream candidates, because they are fixing generic
Windows host/exec + MSVC direct-linker behavior rather than
Codex-specific test logic.

Strong upstream candidates:

- `patches/rules_rs_windows_exec_linker.patch`
- `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch`
- `patches/rules_rust_windows_build_script_runner_paths.patch`
- `patches/rules_rust_windows_exec_msvc_build_script_env.patch`
- `patches/rules_rust_windows_msvc_direct_link_args.patch`
- `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch`

Why these seem upstreamable:

- They address general-purpose problems in the Windows MSVC exec path:
  - missing direct-linker exposure for Rust toolchains
  - wrong linker selection when rustc emits MSVC-style args
- Windows path normalization/short-path issues in the build-script
runner
  - forwarding GNU-flavored CC/link flags into MSVC Cargo build scripts
  - unstable temp outputs polluting process-wrapper search-path state

Potentially upstreamable crate patches, but likely with more care:

- `patches/zstd-sys_windows_msvc_include_dirs.patch`
- `patches/ring_windows_msvc_include_dirs.patch`
- `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch`

Notes on those:

- The `zstd-sys` and `ring` include-path fixes look fairly generic for
MSVC/Bazel build-script environments and may be straightforward to
propose upstream after we confirm CI stability.
- The `aws-lc-sys` patch is useful, but it includes a Bazel-specific
environment probe and CI-specific compiler fallback behavior. That
probably needs a cleaner upstream-facing shape before sending it out, so
upstream maintainers are not forced to adopt Codex's exact CI
assumptions.

Probably not worth upstreaming as-is:

- The repo-local Starlark/test target changes in `defs.bzl`,
`codex-rs/*/BUILD.bazel`, and `.github/scripts/run-bazel-ci.sh` are
mostly Codex-specific policy and CI wiring, not generic rules changes.

# Validation notes for reviewers

On this branch, I ran the following local checks after dropping the
non-resource-loading Rust edits:

```powershell
cargo test -p codex-tui
just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc -- fix -p codex-tui
python .\tools\argument-comment-lint\run-prebuilt-linter.py -p codex-tui
just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc fmt
```

One local caveat:

- `just argument-comment-lint` still fails on this Windows machine for
an unrelated Bazel toolchain-resolution issue in
`//codex-rs/exec:exec-all-test`, so I used the direct prebuilt linter
for `codex-tui` as the local fallback.

# Expected reviewer takeaway

If this PR goes green, the important conclusion is that the Windows
Bazel test coverage gap was primarily a Bazel host/exec toolchain
problem, not a need to make the Rust tests themselves Windows-specific.
That would be a strong signal that the deleted non-resource-loading Rust
test edits from the earlier branch should stay out, and that future work
should focus on upstreaming the generic `rules_rs` / `rules_rust`
Windows fixes and reducing the crate-local patch surface.
2026-04-03 15:34:03 -07:00
Michael Bolin
a098834148 ci: upload compact Bazel execution logs for bazel.yml (#16577)
## Why

The main Bazel CI lanes need compact execution logs to investigate cache
misses and unexpected rebuilds, but local users of the shared wrapper
should not pay that log-generation cost by default.

## What Changed

-
[`.github/scripts/run-bazel-ci.sh`](a6ec239a24/.github/scripts/run-bazel-ci.sh (L149-L153))
now appends `--execution_log_compact_file=...` only when
`CODEX_BAZEL_EXECUTION_LOG_COMPACT_DIR` is set; the caller owns creating
that directory.
-
[`.github/workflows/bazel.yml`](a6ec239a24/.github/workflows/bazel.yml (L66-L174))
enables that env var only for the main `test` and `clippy` jobs, creates
the temp log directory in each job, and uploads the resulting `*.zst`
files from `runner.temp`.

## Verification

- `bash -n .github/scripts/run-bazel-ci.sh`
- Parsed `.github/workflows/bazel.yml` as YAML.
- Ran a local opt-in wrapper smoke test and confirmed it writes
`execution-log-cquery-local-*.zst` when the caller pre-creates
`CODEX_BAZEL_EXECUTION_LOG_COMPACT_DIR`.
2026-04-02 08:41:04 -07:00
Michael Bolin
fce0f76d57 build: migrate argument-comment-lint to a native Bazel aspect (#16106)
## Why

`argument-comment-lint` had become a PR bottleneck because the repo-wide
lane was still effectively running a `cargo dylint`-style flow across
the workspace instead of reusing Bazel's Rust dependency graph. That
kept the lint enforced, but it threw away the main benefit of moving
this job under Bazel in the first place: metadata reuse and cacheable
per-target analysis in the same shape as Clippy.

This change moves the repo-wide lint onto a native Bazel Rust aspect so
Linux and macOS can lint `codex-rs` without rebuilding the world
crate-by-crate through the wrapper path.

## What Changed

- add a nightly Rust toolchain with `rustc-dev` for Bazel and a
dedicated crate-universe repo for `tools/argument-comment-lint`
- add `tools/argument-comment-lint/driver.rs` and
`tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint
as a custom `rustc_driver`
- switch repo-wide `just argument-comment-lint` and the Linux/macOS
`rust-ci` lanes to `bazel build --config=argument-comment-lint
//codex-rs/...`
- keep the Python/DotSlash wrappers as the package-scoped fallback path
and as the current Windows CI path
- gate the Dylint entrypoint behind a `bazel_native` feature so the
Bazel-native library avoids the `dylint_*` packaging stack
- update the aspect runtime environment so the driver can locate
`rustc_driver` correctly under remote execution
- keep the dedicated `tools/argument-comment-lint` package tests and
wrapper unit tests in CI so the source and packaged entrypoints remain
covered

## Verification

- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `cargo test` in `tools/argument-comment-lint`
- `bazel build
//tools/argument-comment-lint:argument-comment-lint-driver
--@rules_rust//rust/toolchain/channel=nightly`
- `bazel build --config=argument-comment-lint
//codex-rs/utils/path-utils:all`
- `bazel build --config=argument-comment-lint
//codex-rs/rollout:rollout`







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16106).
* #16120
* __->__ #16106
2026-03-28 12:41:56 -07:00
Michael Bolin
f4d0cbfda6 ci: run Bazel clippy on Windows gnullvm (#16067)
## Why

We want more of the pre-merge Rust signal to come from `bazel.yml`,
especially on Windows. The Bazel test workflow already exercises
`x86_64-pc-windows-gnullvm`, but the Bazel clippy job still only ran on
Linux x64 and macOS arm64. That left a gap where Windows-only Bazel lint
breakages could slip through until the Cargo-based workflow ran.

This change keeps the fix narrow. Rather than expanding the Bazel clippy
target set or changing the shared setup logic, it extends the existing
clippy matrix to the same Windows GNU toolchain that the Bazel test job
already uses.

## What Changed

- add `windows-latest` / `x86_64-pc-windows-gnullvm` to the `clippy` job
matrix in `.github/workflows/bazel.yml`
- update the nearby workflow comment to explain that the goal is to get
Bazel-native Windows lint coverage on the same toolchain as the Bazel
test lane
- leave the Bazel clippy scope unchanged at `//codex-rs/...
-//codex-rs/v8-poc:all`

## Verification

- parsed `.github/workflows/bazel.yml` successfully with Ruby
`YAML.load_file`
2026-03-27 20:47:22 -07:00
Michael Bolin
343d1af3da bazel: enable the full Windows gnullvm CI path (#15952)
## Why

This PR is the current, consolidated follow-up to the earlier Windows
Bazel attempt in #11229. The goal is no longer just to get a tiny
Windows smoke job limping along: it is to make the ordinary Bazel CI
path usable on `windows-latest` for `x86_64-pc-windows-gnullvm`, with
the same broad `//...` test shape that macOS and Linux already use.

The earlier smoke-list version of this work was useful as a foothold,
but it was not a good long-term landing point. Windows Bazel kept
surfacing real issues outside that allowlist:

- GitHub's Windows runner exposed runfiles-manifest bugs such as
`FINDSTR: Cannot open D:MANIFEST`, which broke Bazel test launchers even
when the manifest file existed.
- `rules_rs`, `rules_rust`, LLVM extraction, and Abseil still needed
`windows-gnullvm`-specific fixes for our hermetic toolchain.
- the V8 path needed more work than just turning the Windows matrix
entry back on: `rusty_v8` does not ship Windows GNU artifacts in the
same shape we need, and Bazel's in-tree V8 build needed a set of Windows
GNU portability fixes.

Windows performance pressure also pushed this toward a full solution
instead of a permanent smoke suite. During this investigation we hit
targets such as `//codex-rs/shell-command:shell-command-unit-tests` that
were much more expensive on Windows because they repeatedly spawn real
PowerShell parsers (see #16057 for one concrete example of that
pressure). That made it much more valuable to get the real Windows Bazel
path working than to keep iterating on a narrowly curated subset.

The net result is that this PR now aims for the same CI contract on
Windows that we already expect elsewhere: keep standalone
`//third_party/v8:all` out of the ordinary Bazel lane, but allow V8
consumers under `//codex-rs/...` to build and test transitively through
`//...`.

## What Changed

### CI and workflow wiring

- re-enable the `windows-latest` / `x86_64-pc-windows-gnullvm` Bazel
matrix entry in `.github/workflows/bazel.yml`
- move the Windows Bazel output root to `D:\b` and enable `git config
--global core.longpaths true` in
`.github/actions/setup-bazel-ci/action.yml`
- keep the ordinary Bazel target set on Windows aligned with macOS and
Linux by running `//...` while excluding only standalone
`//third_party/v8:all` targets from the normal lane

### Toolchain and module support for `windows-gnullvm`

- patch `rules_rs` so `windows-gnullvm` is modeled as a distinct Windows
exec/toolchain platform instead of collapsing into the generic Windows
shape
- patch `rules_rust` build-script environment handling so llvm-mingw
build-script probes do not inherit unsupported `-fstack-protector*`
flags
- patch the LLVM module archive so it extracts cleanly on Windows and
provides the MinGW libraries this toolchain needs
- patch Abseil so its thread-local identity path matches the hermetic
`windows-gnullvm` toolchain instead of taking an incompatible MinGW
pthread path
- keep both MSVC and GNU Windows targets in the generated Cargo metadata
because the current V8 release-asset story still uses MSVC-shaped names
in some places while the Bazel build targets the GNU ABI

### Windows test-launch and binary-behavior fixes

- update `workspace_root_test_launcher.bat.tpl` to read the runfiles
manifest directly instead of shelling out to `findstr`, which was the
source of the `D:MANIFEST` failures on the GitHub Windows runner
- thread a larger Windows GNU stack reserve through `defs.bzl` so
Bazel-built binaries that pull in V8 behave correctly both under normal
builds and under `bazel test`
- remove the no-longer-needed Windows bootstrap sh-toolchain override
from `.bazelrc`

### V8 / `rusty_v8` Windows GNU support

- export and apply the new Windows GNU patch set from
`patches/BUILD.bazel` / `MODULE.bazel`
- patch the V8 module/rules/source layers so the in-tree V8 build can
produce Windows GNU archives under Bazel
- teach `third_party/v8/BUILD.bazel` to build Windows GNU static
archives in-tree instead of aliasing them to the MSVC prebuilts
- reuse the Linux release binding for the experimental Windows GNU path
where `rusty_v8` does not currently publish a Windows GNU binding
artifact

## Testing

- the primary end-to-end validation for this work is the `Bazel`
workflow plus `v8-canary`, since the hard parts are Windows-specific and
depend on real GitHub runner behavior
- before consolidation back onto this PR, the same net change passed the
full Bazel matrix in [run
23675590471](https://github.com/openai/codex/actions/runs/23675590471)
and passed `v8-canary` in [run
23675590453](https://github.com/openai/codex/actions/runs/23675590453)
- those successful runs included the `windows-latest` /
`x86_64-pc-windows-gnullvm` Bazel job with the ordinary `//...` path,
not the earlier Windows smoke allowlist

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15952).
* #16067
* __->__ #15952
2026-03-27 20:37:03 -07:00
Drew Hintz
f4f6eca871 [codex] Pin GitHub Actions workflow references (#15828)
Pin floating external GitHub Actions workflow refs to immutable SHAs.

Why are we doing this? Please see the rationale doc:
https://docs.google.com/document/d/1qOURCNx2zszQ0uWx7Fj5ERu4jpiYjxLVWBWgKa2wTsA/edit?tab=t.0

Did this break you? Please roll back and let hintz@ know
2026-03-27 23:00:05 +00:00
Michael Bolin
2616c7cf12 ci: add Bazel clippy workflow for codex-rs (#15955)
## Why
`bazel.yml` already builds and tests the Bazel graph, but `rust-ci.yml`
still runs `cargo clippy` separately. This PR starts the transition to a
Bazel-backed lint lane for `codex-rs` so we can eventually replace the
duplicate Rust build, test, and lint work with Bazel while explicitly
keeping the V8 Bazel path out of scope for now.

To make that lane practical, the workflow also needs to look like the
Bazel job we already trust. That means sharing the common Bazel setup
and invocation logic instead of hand-copying it, and covering the arm64
macOS path in addition to Linux.

Landing the workflow green also required fixing the first lint findings
that Bazel surfaced and adding the matching local entrypoint.

## What changed
- add a reusable `build:clippy` config to `.bazelrc` and export
`codex-rs/clippy.toml` from `codex-rs/BUILD.bazel` so Bazel can run the
repository's existing Clippy policy
- add `just bazel-clippy` so the local developer entrypoint matches the
new CI lane
- extend `.github/workflows/bazel.yml` with a dedicated Bazel clippy job
for `codex-rs`, scoped to `//codex-rs/... -//codex-rs/v8-poc:all`
- run that clippy job on Linux x64 and arm64 macOS
- factor the shared Bazel workflow setup into
`.github/actions/setup-bazel-ci/action.yml` and the shared Bazel
invocation logic into `.github/scripts/run-bazel-ci.sh` so the clippy
and build/test jobs stay aligned
- fix the first Bazel-clippy findings needed to keep the lane green,
including the cross-target `cmsghdr::cmsg_len` normalization in
`codex-rs/shell-escalation/src/unix/socket.rs` and the no-`voice-input`
dead-code warnings in `codex-rs/tui` and `codex-rs/tui_app_server`

## Verification
- `just bazel-clippy`
- `RUNNER_OS=macOS ./.github/scripts/run-bazel-ci.sh -- build
--config=clippy --build_metadata=COMMIT_SHA=local-check
--build_metadata=TAG_job=clippy -- //codex-rs/...
-//codex-rs/v8-poc:all`
- `bazel build --config=clippy
//codex-rs/shell-escalation:shell-escalation`
- `CARGO_TARGET_DIR=/tmp/codex4-shell-escalation-test cargo test -p
codex-shell-escalation`
- `ruby -e 'require "yaml";
YAML.load_file(".github/workflows/bazel.yml");
YAML.load_file(".github/actions/setup-bazel-ci/action.yml")'`

## Notes
- `CARGO_TARGET_DIR=/tmp/codex4-tui-app-server-test cargo test -p
codex-tui-app-server` still hits existing guardian-approvals test and
snapshot failures unrelated to this PR's Bazel-clippy changes.

Related: #15954
2026-03-27 12:02:41 -07:00
Michael Bolin
d838c23867 fix: use matrix.target instead of matrix.os for actions/cache build action (#15933)
This seems like a more precise cache key.
2026-03-27 01:32:13 +00:00
Son Luong Ngoc
a27cd2d281 bazel: re-organize bazelrc (#15522)
Replaced ci.bazelrc and v8-ci.bazelrc by custom configs inside the main
.bazelrc file. As a result, github workflows setup is simplified down to
a single '--config=<foo>' flag usage.

Moved the build metadata flags to config=ci.
Added custom tags metadata to help differentiate invocations based on
workflow (bazel vs v8) and os (linux/macos/windows).

Enabled users to override the default values in .bazelrc by using a
user.bazelrc file locally.
Added user.bazelrc to gitignore.
2026-03-26 16:50:07 -07:00
Siggi Simonarson
c264c6eef9 Preserve bazel repository cache in github actions (#14495)
Highlights:

- Trimmed down to just the repository cache for faster upload / download
- Made the cache key only include files that affect external
dependencies (since that's what the repository cache caches) -
MODULE.bazel, codex-rs/Cargo.lock, codex-rs/Cargo.toml
- Split the caching action in to explicit restore / save steps (similar
to your rust CI) which allows us to skip uploads on cache hit, and not
fail the build if upload fails

This should get rid of 842 network fetches that are happening on every
Bazel CI run, while also reducing the Github flakiness @bolinfest
reported. Uploading should be faster (since we're not caching many small
files), and will only happen when MODULE.bazel or Cargo.lock /
Cargo.toml files change.

In my testing, it [took 3s to save the repository
cache](https://github.com/siggisim/codex/actions/runs/23014186143/job/66832859781).
2026-03-26 16:41:15 -07:00
Channing Conger
ded7854f09 V8 Bazel Build (#15021)
Alternative approach, we use rusty_v8 for all platforms that its
predefined, but lets build from source a musl v8 version with bazel for
x86 and aarch64 only. We would need to release this on github and then
use the release.
2026-03-19 18:05:23 -07:00
Michael Bolin
6e0f1e9469 fix: disable Bazel builds in CI on ubuntu-24.04-arm until we can stabilize them (#13055)
The other three Bazel builds have experienced low flakiness in my
experience whereas I find myself re-running the `ubuntu-24.04-arm` jobs
often to shake out the flakes. Disabling for now.
2026-02-27 12:49:13 -08:00
Curtis 'Fjord' Hawthorne
8f3f2c3c02 tests(js_repl): stabilize CI runtime test execution (#12407)
## Summary

Stabilize `js_repl` runtime test setup in CI and move tool-facing
`js_repl` behavior coverage into integration tests.

This is a test/CI change only. No production `js_repl` behavior change
is intended.

## Why

- Bazel test sandboxes (especially on macOS) could resolve a different
`node` than the one installed by `actions/setup-node`, which caused
`js_repl` runtime/version failures.
- `js_repl` runtime tests depend on platform-specific
sandbox/test-harness behavior, so they need explicit gating in a
base-stability commit.
- Several tests in the `js_repl` unit test module were actually
black-box/tool-level behavior tests and fit better in the integration
suite.

## Changes

- Add `actions/setup-node` to the Bazel and Rust `Tests` workflows,
using the exact version pinned in the repo’s Node version file.
- In Bazel (non-Windows), pass `CODEX_JS_REPL_NODE_PATH=$(which node)`
into test env so `js_repl` uses the `actions/setup-node` runtime inside
Bazel tests.
- Add a new integration test suite for `js_repl` tool behavior and
register it in the core integration test suite module.
- Move black-box `js_repl` behavior tests into the integration suite
(persistence/TLA, builtin tool invocation, recursive self-call
rejection, `process` isolation, blocked builtin imports).
- Keep white-box manager/kernel tests in the `js_repl` unit test module.
- Gate `js_repl` runtime tests to run only on macOS and only when a
usable Node runtime is available (skip on other platforms / missing Node
in this commit).

## Impact

- Reduces `js_repl` CI failures caused by Node resolution drift in
Bazel.
- Improves test organization by separating tool-facing behavior tests
from white-box manager/kernel tests.
- Keeps the base commit stable while expanding `js_repl` runtime
coverage.


#### [git stack](https://github.com/magus/git-stack-cli)
-  `1` https://github.com/openai/codex/pull/12372
- 👉 `2` https://github.com/openai/codex/pull/12407
-  `3` https://github.com/openai/codex/pull/12185
-  `4` https://github.com/openai/codex/pull/10673
2026-02-24 21:04:34 -08:00
jif-oai
3b6c50d925 chore: better bazel test logs (#12576)
## Summary

Improve Bazel CI failure diagnostics by printing the tail of each failed
target’s test.log directly in the GitHub Actions output.

Today, when a large Bazel test target fails (for example tests of
`codex-core`), the workflow often only shows a target-level Exit 101
plus a path to Bazel’s test.log. That makes it hard to see the actual
failing Rust test and panic without digging into artifacts or
reproducing locally.

This change makes the workflow automatically surface that information
inline.

  ## What Changed

In .github/workflows/bazel.yml:

  - Capture Bazel console output via tee
  - Preserve the Bazel exit code when piping (PIPESTATUS[0])
  - On failure:
      - Parse failed Bazel test targets from FAIL: //... lines
      - Resolve Bazel test log directory via bazel info bazel-testlogs
      - Print tail -n 200 for each failed target’s test.log
      - Group each target’s output in GitHub Actions logs (::group::)

## Bonus
Disable `experimental_remote_repo_contents_cache` to prevent "Permission
Denied"
2026-02-23 08:13:29 -08:00
Josh McKinney
de93cef5b7 bazel: enforce MODULE.bazel.lock sync with Cargo.lock (#11790)
## Why this change

When Cargo dependencies change, it is easy to end up with an unexpected
local diff in
`MODULE.bazel.lock` after running Bazel. That creates noisy working
copies and pushes lockfile fixes
later in the cycle. This change addresses that pain point directly.

## What this change enforces

The expected invariant is: after dependency updates, `MODULE.bazel.lock`
is already in sync with
Cargo resolution. In practice, running `bazel mod deps` should not
mutate the lockfile in a clean
state. If it does, the dependency update is incomplete.

## How this is enforced

This change adds a single lockfile check script that snapshots
`MODULE.bazel.lock`, runs
`bazel mod deps`, and fails if the file changes. The same check is wired
into local workflow
commands (`just bazel-lock-update` and `just bazel-lock-check`) and into
Bazel CI (Linux x86_64 job)
so drift is caught early and consistently. The developer documentation
is updated in
`codex-rs/docs/bazel.md` and `AGENTS.md` to make the expected flow
explicit.

`MODULE.bazel.lock` is also refreshed in this PR to match the current
Cargo dependency resolution.

## Expected developer workflow

After changing `Cargo.toml` or `Cargo.lock`, run `just
bazel-lock-update`, then run
`just bazel-lock-check`, and include any resulting `MODULE.bazel.lock`
update in the same change.

## Testing

Ran `just bazel-lock-check` locally.
2026-02-14 02:11:19 +00:00
viyatb-oai
923f931121 build(linux-sandbox): always compile vendored bubblewrap on Linux; remove CODEX_BWRAP_ENABLE_FFI (#11498)
## Summary
This PR removes the temporary `CODEX_BWRAP_ENABLE_FFI` flag and makes
Linux builds always compile vendored bubblewrap support for
`codex-linux-sandbox`.

## Changes
- Removed `CODEX_BWRAP_ENABLE_FFI` gating from
`codex-rs/linux-sandbox/build.rs`.
- Linux builds now fail fast if vendored bubblewrap compilation fails
(instead of warning and continuing).
- Updated fallback/help text in
`codex-rs/linux-sandbox/src/vendored_bwrap.rs` to remove references to
`CODEX_BWRAP_ENABLE_FFI`.
- Removed `CODEX_BWRAP_ENABLE_FFI` env wiring from:
  - `.github/workflows/rust-ci.yml`
  - `.github/workflows/bazel.yml`
  - `.github/workflows/rust-release.yml`

---------

Co-authored-by: David Zbarsky <zbarsky@openai.com>
2026-02-11 21:30:41 -08:00
Michael Bolin
9722567a80 fix: add --test_verbose_timeout_warnings to bazel.yml (#11522)
This is in response to seeing this on BuildBuddy:

> There were tests whose specified size is too big. Use the
--test_verbose_timeout_warnings command line option to see which ones
these are.
2026-02-11 17:52:06 -08:00
Josh McKinney
34fb4b6e63 ci: fall back to local Bazel on forks without BuildBuddy key (#11359)
## Summary
- detect whether BUILDBUDDY_API_KEY is present in Bazel CI
- keep existing remote BuildBuddy path when key is available
- add a local fallback path for fork PRs without secrets by clearing
remote cache/executor/BES endpoints
- document each fallback flag inline with links to Bazel docs

## Testing
- ruby -e 'require "yaml";
YAML.load_file(".github/workflows/bazel.yml"); puts "ok"'
- verified Bazel docs/flag references used in workflow comments
2026-02-10 23:19:55 +00:00
viyatb-oai
ae4de43ccc feat(linux-sandbox): add bwrap support (#9938)
## Summary
This PR introduces a gated Bubblewrap (bwrap) Linux sandbox path. The
curent Linux sandbox path relies on in-process restrictions (including
Landlock). Bubblewrap gives us a more uniform filesystem isolation
model, especially explicit writable roots with the option to make some
directories read-only and granular network controls.

This is behind a feature flag so we can validate behavior safely before
making it the default.

- Added temporary rollout flag:
  - `features.use_linux_sandbox_bwrap`
- Preserved existing default path when the flag is off.
- In Bubblewrap mode:
- Added internal retry without /proc when /proc mount is not permitted
by the host/container.
2026-02-04 11:13:17 -08:00
Salman Chishti
eca365cf8c Upgrade GitHub Actions for Node 24 compatibility (#9722)
## Summary

Upgrade GitHub Actions to their latest versions to ensure compatibility
with Node 24, as Node 20 will reach end-of-life in April 2026.

## Changes

| Action | Old Version(s) | New Version | Release | Files |
|--------|---------------|-------------|---------|-------|
| `actions/cache` |
[`v4`](https://github.com/actions/cache/releases/tag/v4) |
[`v5`](https://github.com/actions/cache/releases/tag/v5) |
[Release](https://github.com/actions/cache/releases/tag/v5) | bazel.yml
|

## Context

Per [GitHub's
announcement](https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/),
Node 20 is being deprecated and runners will begin using Node 24 by
default starting March 4th, 2026.

### Why this matters

- **Node 20 EOL**: April 2026
- **Node 24 default**: March 4th, 2026
- **Action**: Update to latest action versions that support Node 24

### Security Note

Actions that were previously pinned to commit SHAs remain pinned to SHAs
(updated to the latest release SHA) to maintain the security benefits of
immutable references.

### Testing

These changes only affect CI/CD workflow configurations and should not
impact application functionality. The workflows should be tested by
running them on a branch before merging.

Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
2026-01-23 12:06:04 -08:00
zbarsky-openai
6a57d7980b fix: support remote arm64 builds, as well (#9018) 2026-01-10 18:41:08 -08:00
Michael Bolin
cf515142b0 fix: include AGENTS.md as repo root marker for integration tests (#9010)
As explained in `codex-rs/core/BUILD.bazel`, including the repo's own
`AGENTS.md` is a hack to get some tests passing. We should fix this
properly, but I wanted to put stake in the ground ASAP to get `just
bazel-remote-test` working and then add a job to `bazel.yml` to ensure
it keeps working.
2026-01-09 17:09:59 -08:00
zbarsky-openai
2a06d64bc9 feat: add support for building with Bazel (#8875)
This PR configures Codex CLI so it can be built with
[Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
includes configuration so that remote builds can be done using
[BuildBuddy](https://www.buildbuddy.io).

If you are familiar with Bazel, things should work as you expect, e.g.,
run `bazel test //... --keep-going` to run all the tests in the repo,
but we have also added some new aliases in the `justfile` for
convenience:

- `just bazel-test` to run tests locally
- `just bazel-remote-test` to run tests remotely (currently, the remote
build is for x86_64 Linux regardless of your host platform). Note we are
currently seeing the following test failures in the remote build, so we
still need to figure out what is happening here:

```
failures:
    suite::compact::manual_compact_twice_preserves_latest_user_messages
    suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
    suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
```

- `just build-for-release` to build release binaries for all
platforms/architectures remotely

To setup remote execution:
- [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
employees should also request org access at
https://openai.buildbuddy.io/join/ with their `@openai.com` email
address.)
- [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
`~/.bazelrc` (add the line `build
--remote_header=x-buildbuddy-api-key=YOUR_KEY`)
- Use `--config=remote` in your `bazel` invocations (or add `common
--config=remote` to your `~/.bazelrc`, or use the `just` commands)

## CI

In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
(we are working on supporting Windows, but that is not ready yet). Note
that the failures we are seeing in `just bazel-remote-test` do not occur
on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
is green right now.

The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
that macOS CI jobs build _remotely_ on Linux hosts (using the
`docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
root `BUILD.bazel`) using cross-compilation to build the macOS
artifacts. Then these artifacts are downloaded locally to GitHub's macOS
runner so the tests can be executed natively. This is the relevant
config that enables this:

```
common:macos --config=remote
common:macos --strategy=remote
common:macos --strategy=TestRunner=darwin-sandbox,local
```

Because of the remote caching benefits we get from BuildBuddy, these new
CI jobs can be extremely fast! For example, consider these two jobs that
ran all the tests on Linux x86_64:

- Bazel 1m37s
https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
- Cargo 9m20s
https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875

For now, we will continue to run both the Bazel and Cargo jobs for PRs,
but once we add support for Windows and running Clippy, we should be
able to cutover to using Bazel exclusively for PRs, which should still
speed things up considerably. We will probably continue to run the Cargo
jobs post-merge for commits that land on `main` as a sanity check.

Release builds will also continue to be done by Cargo for now.

Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
Earlier attempt to add support for Buck2, now abandoned:
https://github.com/openai/codex/pull/8504

---------

Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-09 11:09:43 -08:00