mirror of
https://github.com/openai/codex.git
synced 2026-05-05 11:57:33 +00:00
## Why Bazel CI was not actually exercising some sharded Rust integration-test targets on macOS. The `rules_rust` sharding wrapper expects a symlink runfiles tree, but this repo runs Bazel with `--noenable_runfiles`. In that configuration the wrapper could fail to find the generated test binary, produce an empty test list, and exit successfully. That made targets such as `//codex-rs/core:core-all-test` look green even when Cargo CI could still catch failures in the same Rust tests. The coverage gap appears to have been introduced by [#18082](https://github.com/openai/codex/pull/18082), which enabled rules_rust native sharding on `//codex-rs/core:core-all-test` and the other large Rust test labels. The manifest-runfiles setup itself predates that change in [#10098](https://github.com/openai/codex/pull/10098), but #18082 is where the affected integration tests started running through the incompatible rules_rust sharding wrapper. [#18913](https://github.com/openai/codex/pull/18913) fixed the same class of issue for wrapped unit-test shards, but integration-test shards were still going through the rules_rust wrapper until this PR. We still do not have the V8/code-mode pieces stable under the Bazel CI cross-compile setup, so this keeps those tests out of Bazel while restoring coverage for the rest of the sharded Rust integration suites. Cargo CI remains responsible for V8/code-mode coverage for now. This change did uncover a real failing core test on `main`: `approved_folder_write_request_permissions_unblocks_later_apply_patch`. That fix is split into [#21060](https://github.com/openai/codex/pull/21060), which enables the `apply_patch` tool in the test, teaches the aggregate core test binary to dispatch the sandboxed filesystem helper, canonicalizes the macOS temp patch target, and isolates the core test harness from managed local/enterprise config. Keeping that fix separate lets this PR stay focused on restoring Bazel coverage while documenting the first failure it exposed. ## What changed - Build sharded Rust integration tests as manual `*-bin` binaries and run them through the existing manifest-aware `workspace_root_test` launcher. - Keep Bazel sharding on the launcher target so Rust test cases are still distributed by stable test-name hashing. - Configure Bazel CI to skip Rust tests whose names contain `suite::code_mode::`. - Exclude the standalone `codex-rs/code-mode` and `codex-rs/v8-poc` unit-test targets from `bazel.yml`. ## Verification - `bazel query --output=build //codex-rs/core:core-all-test` now shows `workspace_root_test` wrapping `//codex-rs/core:core-all-test-bin`. - `bazel test --test_output=all --nocache_test_results --test_sharding_strategy=disabled //codex-rs/core:core-all-test --test_filter=suite::request_permissions_tool::approved_folder_write_request_permissions_unblocks_later_apply_patch` runs the actual Rust test body and passes. - `bazel test --test_output=errors --nocache_test_results --test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode:: //codex-rs/core:core-all-test` runs the sharded target with code-mode skipped and passes overall locally, with one flaky attempt retried by the existing `flaky = True` setting.
63 lines
2.3 KiB
Bash
Executable File
63 lines
2.3 KiB
Bash
Executable File
#!/usr/bin/env bash
|
|
|
|
set -euo pipefail
|
|
|
|
repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
|
|
cd "${repo_root}"
|
|
|
|
windows_cross_compile=0
|
|
while [[ $# -gt 0 ]]; do
|
|
case "$1" in
|
|
--windows-cross-compile)
|
|
windows_cross_compile=1
|
|
shift
|
|
;;
|
|
*)
|
|
echo "Usage: $0 [--windows-cross-compile]" >&2
|
|
exit 1
|
|
;;
|
|
esac
|
|
done
|
|
|
|
# Resolve the dynamic targets before printing anything so callers do not
|
|
# continue with a partial list if `bazel query` fails. Reuse the same CI Bazel
|
|
# server settings as the subsequent build so Windows jobs do not cold-start a
|
|
# second Bazel server just for target discovery.
|
|
if [[ $windows_cross_compile -eq 1 ]]; then
|
|
manual_rust_test_targets="$(
|
|
./.github/scripts/run-bazel-query-ci.sh \
|
|
--windows-cross-compile \
|
|
--output=label \
|
|
-- 'kind("rust_test rule", attr(tags, "manual", //codex-rs/... except //codex-rs/v8-poc/...))'
|
|
)"
|
|
else
|
|
manual_rust_test_targets="$(
|
|
./.github/scripts/run-bazel-query-ci.sh \
|
|
--output=label \
|
|
-- 'kind("rust_test rule", attr(tags, "manual", //codex-rs/... except //codex-rs/v8-poc/...))'
|
|
)"
|
|
fi
|
|
if [[ "${RUNNER_OS:-}" != "Windows" ]]; then
|
|
# Non-Windows clippy jobs lint the native test binaries; the
|
|
# Windows-cross binaries exist only for the fast Windows test leg.
|
|
manual_rust_test_targets="$(printf '%s\n' "${manual_rust_test_targets}" | grep -v -- '-windows-cross-bin$' || true)"
|
|
elif [[ $windows_cross_compile -eq 1 ]]; then
|
|
# `bazel query` is intentionally pre-analysis and does not remove targets
|
|
# made incompatible by `target_compatible_with`. Sharded integration tests
|
|
# add native-only manual helpers such as `core-all-test-bin`, plus separate
|
|
# `core-all-test-windows-cross-bin` helpers for the Windows cross leg. Keep
|
|
# the Windows helpers and unit-test helpers, but do not pass the native-only
|
|
# sharded integration helpers as explicit clippy targets.
|
|
manual_rust_test_targets="$(printf '%s\n' "${manual_rust_test_targets}" | grep -v -- '-test-bin$' || true)"
|
|
fi
|
|
|
|
printf '%s\n' \
|
|
"//codex-rs/..." \
|
|
"-//codex-rs/v8-poc:all"
|
|
|
|
# `--config=clippy` on the `workspace_root_test` wrappers does not lint the
|
|
# underlying `rust_test` binaries. Add the internal manual `*-unit-tests-bin`
|
|
# targets explicitly so inline `#[cfg(test)]` code is linted like
|
|
# `cargo clippy --tests`.
|
|
printf '%s\n' "${manual_rust_test_targets}"
|