Files
codex/defs.bzl
Michael Bolin 30de54da36 bazel: run sharded rust integration tests (#21057)
## Why

Bazel CI was not actually exercising some sharded Rust integration-test
targets on macOS. The `rules_rust` sharding wrapper expects a symlink
runfiles tree, but this repo runs Bazel with `--noenable_runfiles`. In
that configuration the wrapper could fail to find the generated test
binary, produce an empty test list, and exit successfully. That made
targets such as `//codex-rs/core:core-all-test` look green even when
Cargo CI could still catch failures in the same Rust tests.

The coverage gap appears to have been introduced by
[#18082](https://github.com/openai/codex/pull/18082), which enabled
rules_rust native sharding on `//codex-rs/core:core-all-test` and the
other large Rust test labels. The manifest-runfiles setup itself
predates that change in
[#10098](https://github.com/openai/codex/pull/10098), but #18082 is
where the affected integration tests started running through the
incompatible rules_rust sharding wrapper.
[#18913](https://github.com/openai/codex/pull/18913) fixed the same
class of issue for wrapped unit-test shards, but integration-test shards
were still going through the rules_rust wrapper until this PR.

We still do not have the V8/code-mode pieces stable under the Bazel CI
cross-compile setup, so this keeps those tests out of Bazel while
restoring coverage for the rest of the sharded Rust integration suites.
Cargo CI remains responsible for V8/code-mode coverage for now.

This change did uncover a real failing core test on `main`:
`approved_folder_write_request_permissions_unblocks_later_apply_patch`.
That fix is split into
[#21060](https://github.com/openai/codex/pull/21060), which enables the
`apply_patch` tool in the test, teaches the aggregate core test binary
to dispatch the sandboxed filesystem helper, canonicalizes the macOS
temp patch target, and isolates the core test harness from managed
local/enterprise config. Keeping that fix separate lets this PR stay
focused on restoring Bazel coverage while documenting the first failure
it exposed.

## What changed

- Build sharded Rust integration tests as manual `*-bin` binaries and
run them through the existing manifest-aware `workspace_root_test`
launcher.
- Keep Bazel sharding on the launcher target so Rust test cases are
still distributed by stable test-name hashing.
- Configure Bazel CI to skip Rust tests whose names contain
`suite::code_mode::`.
- Exclude the standalone `codex-rs/code-mode` and `codex-rs/v8-poc`
unit-test targets from `bazel.yml`.

## Verification

- `bazel query --output=build //codex-rs/core:core-all-test` now shows
`workspace_root_test` wrapping `//codex-rs/core:core-all-test-bin`.
- `bazel test --test_output=all --nocache_test_results
--test_sharding_strategy=disabled //codex-rs/core:core-all-test
--test_filter=suite::request_permissions_tool::approved_folder_write_request_permissions_unblocks_later_apply_patch`
runs the actual Rust test body and passes.
- `bazel test --test_output=errors --nocache_test_results
--test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode::
//codex-rs/core:core-all-test` runs the sharded target with code-mode
skipped and passes overall locally, with one flaky attempt retried by
the existing `flaky = True` setting.
2026-05-04 13:33:14 -07:00

544 lines
23 KiB
Python

load("@crates//:data.bzl", "DEP_DATA")
load("@crates//:defs.bzl", "all_crate_deps")
load("@rules_platform//platform_data:defs.bzl", "platform_data")
load("@rules_rust//cargo/private:cargo_build_script_wrapper.bzl", "cargo_build_script")
load("@rules_rust//rust:defs.bzl", "rust_binary", "rust_library", "rust_proc_macro", "rust_test")
PLATFORMS = [
"linux_arm64_musl",
"linux_amd64_musl",
"macos_amd64",
"macos_arm64",
"windows_amd64",
"windows_arm64",
]
# Match Cargo's Windows linker behavior so Bazel-built binaries and tests use
# the same stack reserve on both Windows ABIs and resolve UCRT imports on MSVC.
WINDOWS_RUSTC_LINK_FLAGS = select({
"@rules_rs//rs/experimental/platforms/constraints:windows_gnullvm": [
"-C",
"link-arg=-Wl,--stack,8388608", # 8 MiB
],
"@rules_rs//rs/experimental/platforms/constraints:windows_msvc": [
"-C",
"link-arg=/STACK:8388608", # 8 MiB
"-C",
"link-arg=/NODEFAULTLIB:libucrt.lib",
"-C",
"link-arg=ucrt.lib",
],
"//conditions:default": [],
})
WINDOWS_GNULLVM_INCOMPATIBLE = select({
"@rules_rs//rs/experimental/platforms/constraints:windows_gnullvm": ["@platforms//:incompatible"],
"//conditions:default": [],
})
WINDOWS_GNULLVM_ONLY = select({
"@rules_rs//rs/experimental/platforms/constraints:windows_gnullvm": [],
"//conditions:default": ["@platforms//:incompatible"],
})
# libwebrtc uses Objective-C categories from native archives. Any Bazel-linked
# macOS binary/test that can pull it in must keep category symbols alive.
MACOS_WEBRTC_RUSTC_LINK_FLAGS = select({
"@platforms//os:macos": [
"-C",
"link-arg=-ObjC",
"-C",
"link-arg=-lc++",
],
"//conditions:default": [],
})
def multiplatform_binaries(name, platforms = PLATFORMS):
for platform in platforms:
platform_data(
name = name + "_" + platform,
platform = "@llvm//platforms:" + platform,
target = name,
tags = ["manual"],
)
native.filegroup(
name = "release_binaries",
srcs = [name + "_" + platform for platform in platforms],
tags = ["manual"],
)
def _workspace_root_test_impl(ctx):
is_windows = ctx.target_platform_has_constraint(ctx.attr._windows_constraint[platform_common.ConstraintValueInfo])
launcher = ctx.actions.declare_file(ctx.label.name + ".bat" if is_windows else ctx.label.name)
test_bin = ctx.executable.test_bin
workspace_root_marker = ctx.file.workspace_root_marker
launcher_template = ctx.file._windows_launcher_template if is_windows else ctx.file._bash_launcher_template
runfile_env_exports = _windows_runfile_env_exports(ctx) if is_windows else _bash_runfile_env_exports(ctx)
workspace_root_setup = _windows_workspace_root_setup(ctx) if is_windows else _bash_workspace_root_setup(ctx)
ctx.actions.expand_template(
template = launcher_template,
output = launcher,
is_executable = True,
substitutions = {
"__RUNFILE_ENV_EXPORTS__": runfile_env_exports,
"__TEST_BIN__": test_bin.short_path,
"__WORKSPACE_ROOT_SETUP__": workspace_root_setup,
"__WORKSPACE_ROOT_MARKER__": workspace_root_marker.short_path,
},
)
runfiles = ctx.runfiles(files = [test_bin, workspace_root_marker]).merge(ctx.attr.test_bin[DefaultInfo].default_runfiles)
for data_dep in ctx.attr.data:
runfiles = runfiles.merge(ctx.runfiles(files = data_dep[DefaultInfo].files.to_list()))
runfiles = runfiles.merge(data_dep[DefaultInfo].default_runfiles)
for runfile_dep in ctx.attr.runfile_env:
executable = runfile_dep[DefaultInfo].files_to_run.executable
if executable == None:
fail("{} does not provide an executable for runfile_env".format(runfile_dep.label))
runfiles = runfiles.merge(ctx.runfiles(files = [executable]))
runfiles = runfiles.merge(runfile_dep[DefaultInfo].default_runfiles)
location_targets = (
ctx.attr.data +
[ctx.attr.test_bin, ctx.attr.workspace_root_marker] +
ctx.attr.runfile_env.keys()
)
env = {
key: ctx.expand_location(value, targets = location_targets)
for key, value in ctx.attr.env.items()
}
return [
DefaultInfo(
executable = launcher,
files = depset([launcher]),
runfiles = runfiles,
),
RunEnvironmentInfo(
environment = env,
),
]
def _bash_runfile_env_exports(ctx):
lines = []
for runfile_dep, env_var in ctx.attr.runfile_env.items():
executable = runfile_dep[DefaultInfo].files_to_run.executable
if executable == None:
fail("{} does not provide an executable for runfile_env".format(runfile_dep.label))
lines.append('RUNFILE_ENV_ARGS+=("{}=$(resolve_runfile "{}")")'.format(env_var, executable.short_path))
return "\n".join(lines)
def _windows_runfile_env_exports(ctx):
lines = []
for runfile_dep, env_var in ctx.attr.runfile_env.items():
executable = runfile_dep[DefaultInfo].files_to_run.executable
if executable == None:
fail("{} does not provide an executable for runfile_env".format(runfile_dep.label))
lines.append('call :resolve_runfile {} "{}"'.format(env_var, executable.short_path))
lines.append("if errorlevel 1 exit /b 1")
return "\n".join(lines)
def _bash_workspace_root_setup(ctx):
if not ctx.attr.chdir_workspace_root:
return ""
return 'export INSTA_WORKSPACE_ROOT="${workspace_root}"\ncd "${workspace_root}"'
def _windows_workspace_root_setup(ctx):
if not ctx.attr.chdir_workspace_root:
return ""
return """set "INSTA_WORKSPACE_ROOT=%workspace_root%"
cd /d "%workspace_root%" || exit /b 1"""
workspace_root_test = rule(
implementation = _workspace_root_test_impl,
test = True,
toolchains = ["@bazel_tools//tools/test:default_test_toolchain_type"],
attrs = {
"chdir_workspace_root": attr.bool(
default = True,
),
"data": attr.label_list(
allow_files = True,
),
"env": attr.string_dict(),
"runfile_env": attr.label_keyed_string_dict(
cfg = "target",
),
"test_bin": attr.label(
cfg = "target",
executable = True,
mandatory = True,
),
"workspace_root_marker": attr.label(
allow_single_file = True,
mandatory = True,
),
"_windows_constraint": attr.label(
default = "@platforms//os:windows",
providers = [platform_common.ConstraintValueInfo],
),
"_bash_launcher_template": attr.label(
allow_single_file = True,
default = "//:workspace_root_test_launcher.sh.tpl",
),
"_windows_launcher_template": attr.label(
allow_single_file = True,
default = "//:workspace_root_test_launcher.bat.tpl",
),
},
)
def codex_rust_crate(
name,
crate_name,
crate_features = [],
crate_srcs = None,
crate_edition = None,
proc_macro = False,
build_script_enabled = True,
build_script_data = [],
compile_data = [],
lib_data_extra = [],
rustc_flags_extra = [],
rustc_env = {},
deps_extra = [],
integration_compile_data_extra = [],
integration_test_args = [],
integration_test_timeout = None,
test_data_extra = [],
test_shard_counts = {},
test_tags = [],
unit_test_timeout = None,
extra_binaries = []):
"""Defines a Rust crate with library, binaries, and tests wired for Bazel + Cargo parity.
The macro mirrors Cargo conventions: it builds a library when `src/` exists,
wires build scripts, exports `CARGO_BIN_EXE_*` for integration tests, and
creates unit + integration test targets. Dependency buckets map to the
Cargo.lock resolution in `@crates`.
Args:
name: Bazel target name for the library, should be the directory name.
Example: `app-server`.
crate_name: Cargo crate name from Cargo.toml
Example: `codex_app_server`.
crate_features: Cargo features to enable for this crate.
Crates are only compiled in a single configuration across the workspace, i.e.
with all features in this list enabled. So use sparingly, and prefer to refactor
optional functionality to a separate crate.
crate_srcs: Optional explicit srcs; defaults to `src/**/*.rs`.
crate_edition: Rust edition override, if not default.
You probably don't want this, it's only here for a single caller.
proc_macro: Whether this crate builds a proc-macro library.
build_script_data: Data files exposed to the build script at runtime.
compile_data: Non-Rust compile-time data for the library target.
lib_data_extra: Extra runtime data for the library target.
rustc_env: Extra rustc_env entries to merge with defaults.
deps_extra: Extra normal deps beyond @crates resolution.
Typically only needed when features add additional deps.
integration_compile_data_extra: Extra compile_data for integration tests.
integration_test_args: Optional args for integration test binaries.
integration_test_timeout: Optional Bazel timeout for integration test
targets generated from `tests/*.rs`.
test_data_extra: Extra runtime data for tests.
test_shard_counts: Mapping from generated test target name to Bazel
shard count. Matching tests use native Bazel sharding on the outer
workspace-root launcher, not rules_rust's inner sharding wrapper.
The launcher resolves the real Rust test binary through runfiles
and then assigns each libtest case to a stable bucket by hashing
the test name. Matching tests are also marked flaky, which gives
them Bazel's default three attempts.
test_tags: Tags applied to unit + integration test targets.
Typically used to disable the sandbox, but see https://bazel.build/reference/be/common-definitions#common.tags
unit_test_timeout: Optional Bazel timeout for the unit-test target
generated from `src/**/*.rs`.
extra_binaries: Additional binary labels to surface as test data and
`CARGO_BIN_EXE_*` environment variables. These are only needed for binaries from a different crate.
"""
test_env = {
# The launcher resolves an absolute workspace root at runtime so
# manifest-only platforms like macOS still point Insta at the real
# `codex-rs` checkout.
"INSTA_WORKSPACE_ROOT": ".",
"INSTA_SNAPSHOT_PATH": "src",
}
native.filegroup(
name = "package-files",
srcs = native.glob(
["**"],
exclude = [
"**/BUILD.bazel",
"BUILD.bazel",
"target/**",
],
allow_empty = True,
),
visibility = ["//visibility:public"],
)
rustc_env = {
"BAZEL_PACKAGE": native.package_name(),
} | rustc_env
manifest_relpath = native.package_name()
if manifest_relpath.startswith("codex-rs/"):
manifest_relpath = manifest_relpath[len("codex-rs/"):]
manifest_path = manifest_relpath + "/Cargo.toml"
binaries = DEP_DATA.get(native.package_name())["binaries"]
lib_srcs = crate_srcs or native.glob(["src/**/*.rs"], exclude = binaries.values(), allow_empty = True)
maybe_deps = []
if build_script_enabled and native.glob(["build.rs"], allow_empty = True):
cargo_build_script(
name = name + "-build-script",
srcs = ["build.rs"],
deps = all_crate_deps(build = True),
data = build_script_data,
# Some build script deps sniff version-related env vars...
version = "0.0.0",
)
maybe_deps += [name + "-build-script"]
if lib_srcs:
lib_rule = rust_proc_macro if proc_macro else rust_library
lib_rule(
name = name,
crate_name = crate_name,
crate_features = crate_features,
deps = all_crate_deps() + maybe_deps + deps_extra,
compile_data = compile_data,
data = lib_data_extra,
srcs = lib_srcs,
edition = crate_edition,
rustc_flags = rustc_flags_extra,
rustc_env = rustc_env,
visibility = ["//visibility:public"],
)
unit_test_name = name + "-unit-tests"
unit_test_binary = name + "-unit-tests-bin"
unit_test_shard_count = _test_shard_count(test_shard_counts, unit_test_name)
# Shard at the workspace_root_test layer. rules_rust's sharding wrapper
# expects to run from its own runfiles cwd, while workspace_root_test
# deliberately changes cwd so Insta sees Cargo-like snapshot paths.
rust_test(
name = unit_test_binary,
crate = name,
deps = all_crate_deps(normal = True, normal_dev = True) + maybe_deps + deps_extra,
# Unit tests also compile to standalone Windows executables, so
# keep their stack reserve aligned with binaries and integration
# tests under gnullvm.
# Bazel has emitted both `codex-rs/<crate>/...` and
# `../codex-rs/<crate>/...` paths for `file!()`. Strip either
# prefix so the workspace-root launcher sees Cargo-like metadata
# such as `tui/src/...`.
rustc_flags = rustc_flags_extra + WINDOWS_RUSTC_LINK_FLAGS + [
"--remap-path-prefix=../codex-rs=",
"--remap-path-prefix=codex-rs=",
],
rustc_env = rustc_env,
data = test_data_extra,
tags = test_tags + ["manual"],
)
unit_test_kwargs = {}
if unit_test_timeout:
unit_test_kwargs["timeout"] = unit_test_timeout
if unit_test_shard_count:
unit_test_kwargs["shard_count"] = unit_test_shard_count
unit_test_kwargs["flaky"] = True
workspace_root_test(
name = unit_test_name,
env = test_env,
test_bin = ":" + unit_test_binary,
workspace_root_marker = "//codex-rs/utils/cargo-bin:repo_root.marker",
tags = test_tags,
**unit_test_kwargs
)
maybe_deps += [name]
sanitized_binaries = []
cargo_env = {}
cargo_env_runfiles = {}
for binary, main in binaries.items():
#binary = binary.replace("-", "_")
sanitized_binaries.append(binary)
cargo_env_runfiles[":" + binary] = "CARGO_BIN_EXE_" + binary
cargo_env["CARGO_BIN_EXE_" + binary] = "$(rlocationpath :%s)" % binary
rust_binary(
name = binary,
crate_name = binary.replace("-", "_"),
crate_root = main,
deps = all_crate_deps() + maybe_deps + deps_extra,
edition = crate_edition,
rustc_flags = rustc_flags_extra + WINDOWS_RUSTC_LINK_FLAGS,
srcs = native.glob(["src/**/*.rs"]),
visibility = ["//visibility:public"],
)
for binary_label in extra_binaries:
sanitized_binaries.append(binary_label)
binary = Label(binary_label).name
cargo_env_runfiles[binary_label] = "CARGO_BIN_EXE_" + binary
cargo_env["CARGO_BIN_EXE_" + binary] = "$(rlocationpath %s)" % binary_label
integration_test_kwargs = {}
if integration_test_args:
integration_test_kwargs["args"] = integration_test_args
if integration_test_timeout:
integration_test_kwargs["timeout"] = integration_test_timeout
for test in native.glob(["tests/*.rs"], allow_empty = True):
test_file_stem = test.removeprefix("tests/").removesuffix(".rs")
test_crate_name = test_file_stem.replace("-", "_")
test_name = name + "-" + test_file_stem.replace("/", "-")
if not test_name.endswith("-test"):
test_name += "-test"
windows_cross_test_binary = test_name + "-windows-cross-bin"
test_kwargs = {}
test_kwargs.update(integration_test_kwargs)
test_shard_count = _test_shard_count(test_shard_counts, test_name)
if test_shard_count:
# Put Bazel sharding on the label users/CI invoke. Do not set
# rules_rust's experimental_enable_sharding on the Rust test
# binary: that creates an intermediate wrapper that expects a
# symlink runfiles tree, while this repo intentionally runs with
# --noenable_runfiles and usually has only a runfiles manifest.
test_kwargs["shard_count"] = test_shard_count
test_kwargs["flaky"] = True
integration_test_binary = test_name + "-bin"
# There are three generated integration-test shapes:
#
# 1. Unsharded native tests keep the plain rust_test label for minimal
# churn and the usual rules_rust Cargo-like environment.
# 2. Sharded native tests split into a manual rust_test binary plus an
# outer workspace_root_test. The outer test action receives Bazel's
# sharding environment, resolves the real binary through the
# runfiles manifest, and implements stable libtest sharding itself.
# 3. Windows cross tests always use the workspace_root_test wrapper so
# runfile env vars become Windows-native absolute paths before the
# Rust process starts.
if test_shard_count:
# This target is intentionally a binary-like helper, not the public
# test target. The wrapper below owns cwd setup, runfile env
# materialization, sharding, and flaky retry behavior.
rust_test(
name = integration_test_binary,
crate_name = test_crate_name,
crate_root = test,
srcs = [test],
data = native.glob(["tests/**"], allow_empty = True) + sanitized_binaries + test_data_extra,
compile_data = native.glob(["tests/**"], allow_empty = True) + integration_compile_data_extra,
deps = all_crate_deps(normal = True, normal_dev = True) + maybe_deps + deps_extra,
# Bazel has emitted both `codex-rs/<crate>/...` and
# `../codex-rs/<crate>/...` paths for `file!()`. Strip either
# prefix so Insta records Cargo-like metadata such as `core/tests/...`.
rustc_flags = rustc_flags_extra + WINDOWS_RUSTC_LINK_FLAGS + [
"--remap-path-prefix=../codex-rs=",
"--remap-path-prefix=codex-rs=",
],
rustc_env = rustc_env,
target_compatible_with = WINDOWS_GNULLVM_INCOMPATIBLE,
tags = test_tags + ["manual"],
)
workspace_root_test(
name = test_name,
env = test_env,
# CARGO_BIN_EXE_* values are rlocation paths at analysis time.
# The launcher rewrites them to absolute paths at execution
# time so tests keep working after chdir_workspace_root and on
# manifest-only platforms.
runfile_env = cargo_env_runfiles,
test_bin = ":" + integration_test_binary,
workspace_root_marker = "//codex-rs/utils/cargo-bin:repo_root.marker",
target_compatible_with = WINDOWS_GNULLVM_INCOMPATIBLE,
tags = test_tags,
**test_kwargs
)
else:
# For unsharded tests, the direct rust_test rule is still fine:
# there is no rules_rust sharding wrapper to bypass, and env can
# use rlocation paths directly because the test starts under
# Bazel's normal test environment.
rust_test(
name = test_name,
crate_name = test_crate_name,
crate_root = test,
srcs = [test],
data = native.glob(["tests/**"], allow_empty = True) + sanitized_binaries + test_data_extra,
compile_data = native.glob(["tests/**"], allow_empty = True) + integration_compile_data_extra,
deps = all_crate_deps(normal = True, normal_dev = True) + maybe_deps + deps_extra,
# Bazel has emitted both `codex-rs/<crate>/...` and
# `../codex-rs/<crate>/...` paths for `file!()`. Strip either
# prefix so Insta records Cargo-like metadata such as `core/tests/...`.
rustc_flags = rustc_flags_extra + WINDOWS_RUSTC_LINK_FLAGS + [
"--remap-path-prefix=../codex-rs=",
"--remap-path-prefix=codex-rs=",
],
rustc_env = rustc_env,
env = cargo_env,
target_compatible_with = WINDOWS_GNULLVM_INCOMPATIBLE,
tags = test_tags,
**test_kwargs
)
windows_cross_test_kwargs = {}
windows_cross_test_kwargs.update(integration_test_kwargs)
if test_shard_count:
windows_cross_test_kwargs["shard_count"] = test_shard_count
windows_cross_test_kwargs["flaky"] = True
rust_test(
name = windows_cross_test_binary,
crate_name = test_crate_name,
crate_root = test,
srcs = [test],
data = native.glob(["tests/**"], allow_empty = True) + sanitized_binaries + test_data_extra,
compile_data = native.glob(["tests/**"], allow_empty = True) + integration_compile_data_extra,
deps = all_crate_deps(normal = True, normal_dev = True) + maybe_deps + deps_extra,
rustc_flags = rustc_flags_extra + WINDOWS_RUSTC_LINK_FLAGS + [
"--remap-path-prefix=../codex-rs=",
"--remap-path-prefix=codex-rs=",
],
rustc_env = rustc_env,
env = cargo_env,
target_compatible_with = WINDOWS_GNULLVM_ONLY,
tags = test_tags + ["manual"],
)
workspace_root_test(
name = test_name + "-windows-cross",
chdir_workspace_root = False,
env = cargo_env,
runfile_env = cargo_env_runfiles,
test_bin = ":" + windows_cross_test_binary,
workspace_root_marker = "//codex-rs/utils/cargo-bin:repo_root.marker",
target_compatible_with = WINDOWS_GNULLVM_ONLY,
tags = test_tags,
**windows_cross_test_kwargs
)
def _test_shard_count(test_shard_counts, test_name):
shard_count = test_shard_counts.get(test_name)
if shard_count == None:
return None
if shard_count < 1:
fail("test_shard_counts[{}] must be a positive integer".format(test_name))
return shard_count