From d53e68954acee2eb50303970498ffebddff393ed Mon Sep 17 00:00:00 2001
From: anp-oai
Date: Fri, 22 May 2026 09:58:14 -0700
Subject: [PATCH] Prefer `just test` over `cargo test` in docs (#23910)

`cargo test` for the core and other crates fails on a fresh macOS checkout without the right stack size variable. This change encourages using the just test command that sets the environment up correctly. As a bonus, this should encourage agents to get more benefit out of nextest's parallel execution.
---
 AGENTS.md                             | 11 ++++++-----
 codex-rs/app-server/README.md         |  2 +-
 codex-rs/core/tests/suite/live_cli.rs |  3 ++-
 codex-rs/utils/pty/README.md          |  2 +-
 docs/contributing.md                  |  2 +-
 docs/install.md                       |  8 +++-----
 justfile                              |  7 +++----
 scripts/test-remote-env.sh            |  2 +-
 8 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index c13fdea641..9906d3039a 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -52,12 +52,13 @@ In the codex-rs folder where the rust code lives:
 the new implementation so the invariants stay close to the code that owns them.
 - Avoid adding new standalone methods to `codex-rs/tui/src/chatwidget.rs` unless the change is trivial; prefer new modules/files and keep `chatwidget.rs` focused on orchestration.
-- When running Rust commands (e.g. `just fix` or `cargo test`) be patient with the command and never try to kill them using the PID. Rust lock can make the execution slow, this is expected.
+- When running Rust commands (e.g. `just fix` or `just test`) be patient with the command and never try to kill them using the PID. Rust lock can make the execution slow, this is expected.
 
 Run `just fmt` (in `codex-rs` directory) automatically after you have finished making Rust code changes; do not ask for approval to run it. Additionally, run the tests:
 
-1. Run the test for the specific project that was changed. For example, if changes were made in `codex-rs/tui`, run `cargo test -p codex-tui`.
-2. Once those pass, if any changes were made in common, core, or protocol, run the complete test suite with `cargo test` (or `just test` if `cargo-nextest` is installed). Avoid `--all-features` for routine local runs because it expands the build matrix and can significantly increase `target/` disk usage; use it only when you specifically need full feature coverage. project-specific or individual tests can be run without asking the user, but do ask the user before running the complete test suite.
+1. Do not run `cargo test` directly. Use `just test` so test execution follows the repo defaults.
+2. Run the test for the specific project that was changed. For example, if changes were made in `codex-rs/tui`, run `just test -p codex-tui`.
+3. Once those pass, if any changes were made in common, core, or protocol, run the complete test suite with `just test`. Avoid `--all-features` for routine local runs because it expands the build matrix and can significantly increase `target/` disk usage; use it only when you specifically need full feature coverage. project-specific or individual tests can be run without asking the user, but do ask the user before running the complete test suite.
 
 Before finalizing a large change to `codex-rs`, run `just fix -p ` (in `codex-rs` directory) to fix any linter issues in the code. Prefer scoping with `-p` to avoid slow workspace‑wide Clippy builds; only run `just fix` without `-p` if you changed shared crates.
 
 Do not re-run tests after running `fix` or `fmt`.
@@ -120,7 +121,7 @@ is easy to review and future diffs stay visual.
 When UI or text output changes intentionally, update the snapshots as follows:
 
 - Run tests to generate any updated snapshots:
-  - `cargo test -p codex-tui`
+  - `just test -p codex-tui`
 - Check what’s pending:
   - `cargo insta pending-snapshots -p codex-tui`
 - Review changes by reading the generated `*.snap.new` files directly in the repo, or preview a specific file:
@@ -214,6 +215,6 @@ These guidelines apply to app-server protocol work in `codex-rs`, especially:
 - Regenerate schema fixtures when API shapes change: `just write-app-server-schema` (and `just write-app-server-schema --experimental` when experimental API fixtures are affected).
-- Validate with `cargo test -p codex-app-server-protocol`.
+- Validate with `just test -p codex-app-server-protocol`.
 - Avoid boilerplate tests that only assert experimental field markers for individual request fields in `common.rs`; rely on schema generation/tests and behavioral coverage instead.
diff --git a/codex-rs/app-server/README.md b/codex-rs/app-server/README.md
index 2ceffc86fe..71b068c93f 100644
--- a/codex-rs/app-server/README.md
+++ b/codex-rs/app-server/README.md
@@ -1950,5 +1950,5 @@ For server-initiated request payloads, annotate the field the same way so schema
 5. Verify the protocol crate:
 
    ```bash
-   cargo test -p codex-app-server-protocol
+   just test -p codex-app-server-protocol
    ```
diff --git a/codex-rs/core/tests/suite/live_cli.rs b/codex-rs/core/tests/suite/live_cli.rs
index 5e2c0415ea..6273cd15e4 100644
--- a/codex-rs/core/tests/suite/live_cli.rs
+++ b/codex-rs/core/tests/suite/live_cli.rs
@@ -2,7 +2,8 @@
 
 //! Optional smoke tests that hit the real OpenAI /v1/responses endpoint. They are `#[ignore]` by
 //! default so CI stays deterministic and free. Developers can run them locally with
-//! `cargo test --test live_cli -- --ignored` provided they set a valid `OPENAI_API_KEY`.
+//! `just test -p codex-core --test all --run-ignored only live_cli` provided they set a valid
+//! `OPENAI_API_KEY`.
 
 use assert_cmd::prelude::*;
 use predicates::prelude::*;
diff --git a/codex-rs/utils/pty/README.md b/codex-rs/utils/pty/README.md
index e70d7bc6af..7b9df30d0a 100644
--- a/codex-rs/utils/pty/README.md
+++ b/codex-rs/utils/pty/README.md
@@ -60,5 +60,5 @@ Use `spawn_pipe_process_no_stdin` to force stdin closed (commands that read stdi
 Unit tests live in `src/lib.rs` and cover both backends (PTY Python REPL and pipe-based stdin roundtrip). Run with:
 
 ```
-cargo test -p codex-utils-pty -- --nocapture
+just test -p codex-utils-pty --no-capture
 ```
diff --git a/docs/contributing.md b/docs/contributing.md
index 19b31073e9..aeae1f10d3 100644
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -54,7 +54,7 @@ When a change updates model catalogs or model metadata (`/models` payloads, pres
 
 - Fill in the PR template (or include similar information) - **What? Why? How?**
 - Include a link to a bug report or enhancement request in the issue tracker
-- Run **all** checks locally. Use the root `just` helpers so you stay consistent with the rest of the workspace: `just fmt`, `just fix -p ` for the crate you touched, and the relevant tests (e.g., `cargo test -p codex-tui` or `just test` if you need a full sweep). CI failures that could have been caught locally slow down the process.
+- Run **all** checks locally. Use the root `just` helpers so you stay consistent with the rest of the workspace: `just fmt`, `just fix -p ` for the crate you touched, and the relevant tests (e.g., `just test -p codex-tui` or `just test` if you need a full sweep). CI failures that could have been caught locally slow down the process.
 - Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
 - Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
diff --git a/docs/install.md b/docs/install.md
index 0991e7d16c..7c762c4c50 100644
--- a/docs/install.md
+++ b/docs/install.md
@@ -26,7 +26,7 @@ rustup component add rustfmt
 rustup component add clippy
 # Install helper tools used by the workspace justfile:
 cargo install --locked just
-# Optional: install nextest for the `just test` helper
+# Install nextest for the `just test` helper.
 cargo install --locked cargo-nextest
 
 # Build Codex.
@@ -40,13 +40,11 @@ just fmt
 just fix -p 
 
 # Run the relevant tests (project-specific is fastest), for example:
-cargo test -p codex-tui
-# If you have cargo-nextest installed, `just test` runs the test suite via nextest:
+just test -p codex-tui
+# `just test` runs the test suite via nextest:
 just test
 # Avoid `--all-features` for routine local runs because it increases build
 # time and `target/` disk usage by compiling additional feature combinations.
-# If you specifically want full feature coverage, use:
-cargo test --all-features
 ```
 
 ## Tracing / verbose logging
diff --git a/justfile b/justfile
index ab2fbc6362..907cd71f6d 100644
--- a/justfile
+++ b/justfile
@@ -46,14 +46,13 @@ install:
     rustup show active-toolchain
     cargo fetch
 
-# Run `cargo nextest` since it's faster than `cargo test`, though including
-# --no-fail-fast is important to ensure all tests are run.
+# Run nextest with --no-fail-fast so all tests are run.
 #
 # Run `cargo install --locked cargo-nextest` if you don't have it installed.
 # Prefer this for routine local runs. Workspace crate features are banned, so
 # there should be no need to add `--all-features`.
-test:
-    RUST_MIN_STACK={{ rust_min_stack }} cargo nextest run --no-fail-fast
+test *args:
+    RUST_MIN_STACK={{ rust_min_stack }} cargo nextest run --no-fail-fast "$@"
 
 # Build and run Codex from source using Bazel.
 # Note we have to use the combination of `[no-cd]` and `--run_under="cd $PWD &&"`
diff --git a/scripts/test-remote-env.sh b/scripts/test-remote-env.sh
index 96743616a2..584a0f6f29 100755
--- a/scripts/test-remote-env.sh
+++ b/scripts/test-remote-env.sh
@@ -5,7 +5,7 @@
 # Usage (source-only):
 # source scripts/test-remote-env.sh
 # cd codex-rs
-# cargo test -p codex-core --test all remote_env_connects_creates_temp_dir_and_runs_sample_script
+# just test -p codex-core --test all remote_test_env_can_connect_and_use_filesystem
 # codex_remote_env_cleanup
 
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"