Files
codex/.github/workflows/bazel.yml
starr-openai d666238b40 Shard Bazel Windows tests across jobs (#22408)
## Summary
- split the single PR-blocking Bazel Windows test leg into four Windows
shard jobs
- preserve the existing required Windows Bazel check name with a
lightweight aggregate gate
- keep Linux/macOS Bazel test jobs and the separate Windows
clippy/release jobs unchanged

## Why
The ordinary PR Windows Bazel test leg was one GitHub Actions job, so
Bazel only had in-job parallelism. This gives that lane real job-level
fanout across separate Windows hosts while keeping the target set
disjoint via stable label hashing.

## Evidence
- final pre-rebase green run: `25774733562`
- Windows shard target counts: `61/212`, `48/212`, `52/212`, `51/212`
- Windows test fanout completed in about 7m29s versus a recent
monolithic median around 22m26s

## Notes
- this is scoped to the Bazel Windows test leg only
- each shard keeps the existing Windows cross-compile/RBE path and
restores the former monolithic Windows test cache
- shard jobs do not upload duplicate repository caches after test work,
keeping cache cleanup off the PR-blocking shard path
- no local validation run; relying on GitHub Actions for the
workflow-shaped check

Co-authored-by: Codex <noreply@openai.com>
2026-05-13 12:46:51 -07:00

519 lines
20 KiB
YAML

name: Bazel
# Note this workflow was originally derived from:
# https://github.com/cerisier/toolchains_llvm_bootstrapped/blob/main/.github/workflows/ci.yaml
on:
pull_request: {}
push:
branches:
- main
workflow_dispatch:
concurrency:
# Cancel previous actions from the same PR or branch except 'main' branch.
# See https://docs.github.com/en/actions/using-jobs/using-concurrency and https://docs.github.com/en/actions/learn-github-actions/contexts for more info.
group: concurrency-group::${{ github.workflow }}::${{ github.event.pull_request.number > 0 && format('pr-{0}', github.event.pull_request.number) || github.ref_name }}${{ github.ref_name == 'main' && format('::{0}', github.run_id) || ''}}
cancel-in-progress: ${{ github.ref_name != 'main' }}
jobs:
test:
# PRs use the sharded Windows cross-compiled test jobs below. Post-merge
# pushes to main also run the native Windows test job for broader Windows
# signal without putting PR latency back on the critical path. Cargo CI
# owns V8/code-mode test coverage for now.
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
include:
# macOS
- os: macos-15-xlarge
target: aarch64-apple-darwin
- os: macos-15-xlarge
target: x86_64-apple-darwin
# Linux
- os: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- os: ubuntu-24.04
target: x86_64-unknown-linux-musl
# 2026-02-27 Bazel tests have been flaky on arm in CI.
# Disable until we can investigate and stabilize them.
# - os: ubuntu-24.04-arm
# target: aarch64-unknown-linux-musl
# - os: ubuntu-24.04-arm
# target: aarch64-unknown-linux-gnu
runs-on: ${{ matrix.os }}
# Configure a human readable name for each job
name: Bazel test on ${{ matrix.os }} for ${{ matrix.target }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Check rusty_v8 MODULE.bazel checksums
if: matrix.os == 'ubuntu-24.04' && matrix.target == 'x86_64-unknown-linux-gnu'
shell: bash
run: |
python3 .github/scripts/rusty_v8_bazel.py check-module-bazel
python3 -m unittest discover -s .github/scripts -p test_rusty_v8_bazel.py
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: ${{ matrix.target }}
cache-scope: bazel-${{ github.job }}
install-test-prereqs: "true"
- name: Check MODULE.bazel.lock is up to date
if: matrix.os == 'ubuntu-24.04' && matrix.target == 'x86_64-unknown-linux-gnu'
shell: bash
run: ./scripts/check-module-bazel-lock.sh
- name: bazel test //...
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
bazel_targets=(
//...
# Keep standalone V8 library targets out of the ordinary Bazel CI
# path. V8 consumers under `//codex-rs/...` still participate
# transitively through `//...`.
-//third_party/v8:all
# V8-backed code-mode tests are covered by Cargo CI. Bazel CI
# cross-compiles in several legs, and those tests are not stable in
# that setup yet.
-//codex-rs/code-mode:code-mode-unit-tests
-//codex-rs/v8-poc:v8-poc-unit-tests
)
bazel_wrapper_args=(
--print-failed-action-summary
--print-failed-test-logs
)
bazel_test_args=(
test
--test_tag_filters=-argument-comment-lint
--test_verbose_timeout_warnings
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
)
./.github/scripts/run-bazel-ci.sh \
"${bazel_wrapper_args[@]}" \
-- \
"${bazel_test_args[@]}" \
-- \
"${bazel_targets[@]}"
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-test-${{ matrix.target }}
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
# Save the job-scoped Bazel repository cache after cache misses. Keep the
# upload non-fatal so cache service issues never fail the job itself.
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}
test-windows-shard:
# Split the Windows Bazel test leg across separate Windows
# hosts. Each shard still uses Linux RBE for build actions, but the test
# execution itself happens on its own Windows runner.
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
shard:
- 1
- 2
- 3
- 4
runs-on: windows-latest
name: Bazel test on windows-latest for x86_64-pc-windows-gnullvm shard ${{ matrix.shard }}/4
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: x86_64-pc-windows-gnullvm
# Reuse the former monolithic Windows test cache for restores. Do
# not save it from every shard below; duplicate uploads would sit on
# the PR-blocking critical path after the useful test work is done.
cache-scope: bazel-test
install-test-prereqs: "true"
- name: bazel test shard
env:
BAZEL_TEST_SHARD: ${{ matrix.shard }}
BAZEL_TEST_SHARD_COUNT: 4
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
set -euo pipefail
bazel_test_query='tests(//...) except tests(//third_party/v8:all) except //codex-rs/code-mode:code-mode-unit-tests except //codex-rs/v8-poc:v8-poc-unit-tests except attr(tags, "manual", tests(//...))'
mapfile -t bazel_targets < <(
MSYS2_ARG_CONV_EXCL='*' bazel query --output=label "${bazel_test_query}" \
| LC_ALL=C sort
)
selected_targets=()
for bazel_target in "${bazel_targets[@]}"; do
target_bucket="$(
printf '%s\n' "${bazel_target}" \
| cksum \
| awk -v shard_count="${BAZEL_TEST_SHARD_COUNT}" '{ print ($1 % shard_count) + 1 }'
)"
if [[ "${target_bucket}" == "${BAZEL_TEST_SHARD}" ]]; then
selected_targets+=("${bazel_target}")
fi
done
if [[ ${#selected_targets[@]} -eq 0 ]]; then
echo "No Bazel test targets selected for Windows shard ${BAZEL_TEST_SHARD}/${BAZEL_TEST_SHARD_COUNT}." >&2
exit 1
fi
echo "Selected ${#selected_targets[@]} of ${#bazel_targets[@]} Bazel test targets for Windows shard ${BAZEL_TEST_SHARD}/${BAZEL_TEST_SHARD_COUNT}."
bazel_test_args=(
test
--skip_incompatible_explicit_targets
--test_tag_filters=-argument-comment-lint
--test_verbose_timeout_warnings
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
--build_metadata=TAG_windows_test_shard=${BAZEL_TEST_SHARD}
)
./.github/scripts/run-bazel-ci.sh \
--print-failed-action-summary \
--print-failed-test-logs \
--windows-cross-compile \
--remote-download-toplevel \
-- \
"${bazel_test_args[@]}" \
-- \
"${selected_targets[@]}"
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-test-x86_64-pc-windows-gnullvm-shard-${{ matrix.shard }}
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
test-windows:
# Preserve the existing required-check surface while the real work happens
# in the sharded Windows jobs above.
if: always()
needs: test-windows-shard
runs-on: ubuntu-24.04
name: Bazel test on windows-latest for x86_64-pc-windows-gnullvm
steps:
- name: Confirm Windows Bazel test shards passed
shell: bash
run: |
if [[ "${{ needs.test-windows-shard.result }}" != "success" ]]; then
echo "Windows Bazel test shards finished with result: ${{ needs.test-windows-shard.result }}" >&2
exit 1
fi
test-windows-native-main:
# Native Windows Bazel tests are slower and frequently approach the
# 30-minute PR budget. Run this only for post-merge commits to main and give
# it a larger timeout.
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
timeout-minutes: 40
runs-on: windows-latest
name: Bazel test on windows-latest for x86_64-pc-windows-gnullvm (native main)
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: x86_64-pc-windows-gnullvm
cache-scope: bazel-${{ github.job }}
install-test-prereqs: "true"
- name: bazel test //...
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
bazel_targets=(
//...
# Keep standalone V8 library targets out of the ordinary Bazel CI
# path. V8 consumers under `//codex-rs/...` still participate
# transitively through `//...`.
-//third_party/v8:all
# Keep this aligned with the main Bazel job. The native Windows
# job preserves broad post-merge coverage, but code-mode/V8 tests
# are covered by Cargo CI rather than Bazel for now.
-//codex-rs/code-mode:code-mode-unit-tests
-//codex-rs/v8-poc:v8-poc-unit-tests
)
bazel_test_args=(
test
--test_tag_filters=-argument-comment-lint
--test_verbose_timeout_warnings
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
--build_metadata=TAG_windows_native_main=true
)
./.github/scripts/run-bazel-ci.sh \
--print-failed-action-summary \
--print-failed-test-logs \
-- \
"${bazel_test_args[@]}" \
-- \
"${bazel_targets[@]}"
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-test-windows-native-x86_64-pc-windows-gnullvm
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
# Save the job-scoped Bazel repository cache after cache misses. Keep the
# upload non-fatal so cache service issues never fail the job itself.
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}
clippy:
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
include:
# Keep Linux lint coverage on x64 and add the arm64 macOS path that
# the Bazel test job already exercises. Add Windows gnullvm as well
# so PRs get Bazel-native lint signal on the same Windows toolchain
# that the Bazel test job uses.
- os: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- os: macos-15-xlarge
target: aarch64-apple-darwin
- os: windows-latest
target: x86_64-pc-windows-gnullvm
runs-on: ${{ matrix.os }}
name: Bazel clippy on ${{ matrix.os }} for ${{ matrix.target }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: ${{ matrix.target }}
cache-scope: bazel-${{ github.job }}
- name: bazel build --config=clippy lint targets
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
bazel_clippy_args=(
--config=clippy
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
--build_metadata=TAG_job=clippy
)
bazel_wrapper_args=()
bazel_target_list_args=()
if [[ "${RUNNER_OS}" == "Windows" ]]; then
# Keep this aligned with the fast Windows Bazel test job: use
# Linux RBE for clippy build actions while targeting Windows
# gnullvm. Fork/community PRs without the BuildBuddy secret fall
# back inside `run-bazel-ci.sh` to the previous local Windows MSVC
# host-platform shape.
bazel_wrapper_args+=(--windows-cross-compile)
bazel_target_list_args+=(--windows-cross-compile)
if [[ -z "${BUILDBUDDY_API_KEY:-}" ]]; then
# The fork fallback can see incompatible explicit Windows-cross
# internal test binaries in the generated target list. Preserve
# the old local-fallback behavior there.
bazel_clippy_args+=(--skip_incompatible_explicit_targets)
fi
fi
bazel_target_lines="$(./scripts/list-bazel-clippy-targets.sh "${bazel_target_list_args[@]}")"
bazel_targets=()
while IFS= read -r target; do
bazel_targets+=("${target}")
done <<< "${bazel_target_lines}"
./.github/scripts/run-bazel-ci.sh \
--print-failed-action-summary \
"${bazel_wrapper_args[@]}" \
-- \
build \
"${bazel_clippy_args[@]}" \
-- \
"${bazel_targets[@]}"
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-clippy-${{ matrix.target }}
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
# Save the job-scoped Bazel repository cache after cache misses. Keep the
# upload non-fatal so cache service issues never fail the job itself.
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}
verify-release-build:
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
include:
- os: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- os: macos-15-xlarge
target: aarch64-apple-darwin
- os: windows-latest
target: x86_64-pc-windows-gnullvm
runs-on: ${{ matrix.os }}
name: Verify release build on ${{ matrix.os }} for ${{ matrix.target }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: ${{ matrix.target }}
cache-scope: bazel-${{ github.job }}
- name: bazel build verify-release-build targets
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
# This job exists to compile Rust code behind
# `cfg(not(debug_assertions))` so PR CI catches failures that would
# otherwise show up only in a release build. We do not need the full
# optimizer and debug-info work that normally comes with a release
# build to get that signal, so keep Bazel in `fastbuild` and disable
# Rust debug assertions explicitly.
bazel_wrapper_args=()
if [[ "${RUNNER_OS}" == "Windows" ]]; then
# This is build-only signal, so use the same Linux-RBE
# cross-compile path as the fast Windows test and clippy jobs.
# Fork/community PRs without the BuildBuddy secret fall back
# inside `run-bazel-ci.sh` to the previous local Windows MSVC
# host-platform shape.
bazel_wrapper_args+=(--windows-cross-compile)
fi
bazel_build_args=(
--compilation_mode=fastbuild
--@rules_rust//rust/settings:extra_rustc_flag=-Cdebug-assertions=no
--@rules_rust//rust/settings:extra_exec_rustc_flag=-Cdebug-assertions=no
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
--build_metadata=TAG_job=verify-release-build
--build_metadata=TAG_rust_debug_assertions=off
)
bazel_target_lines="$(bash ./scripts/list-bazel-release-targets.sh)"
bazel_targets=()
while IFS= read -r target; do
bazel_targets+=("${target}")
done <<< "${bazel_target_lines}"
./.github/scripts/run-bazel-ci.sh \
"${bazel_wrapper_args[@]}" \
-- \
build \
"${bazel_build_args[@]}" \
-- \
"${bazel_targets[@]}"
- name: Verify Bazel builds bwrap
if: runner.os == 'Linux'
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
./.github/scripts/run-bazel-ci.sh \
--remote-download-toplevel \
--print-failed-action-summary \
-- \
build \
--build_metadata=COMMIT_SHA=${GITHUB_SHA} \
--build_metadata=TAG_job=verify-bwrap \
-- \
//codex-rs/bwrap:bwrap
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-verify-release-build-${{ matrix.target }}
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
# Save the job-scoped Bazel repository cache after cache misses. Keep the
# upload non-fatal so cache service issues never fail the job itself.
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}