mirror of
https://github.com/google-gemini/gemini-cli.git
synced 2026-02-01 22:48:03 +00:00
212 lines
5.8 KiB
Markdown
212 lines
5.8 KiB
Markdown
# Integration tests
|
|
|
|
This document provides information about the integration testing framework used
|
|
in this project.
|
|
|
|
## Overview
|
|
|
|
The integration tests are designed to validate the end-to-end functionality of
|
|
the Gemini CLI. They execute the built binary in a controlled environment and
|
|
verify that it behaves as expected when interacting with the file system.
|
|
|
|
These tests are located in the `integration-tests` directory and are run using a
|
|
custom test runner.
|
|
|
|
## Building the tests
|
|
|
|
Prior to running any integration tests, you need to create a release bundle that
|
|
you want to actually test:
|
|
|
|
```bash
|
|
npm run bundle
|
|
```
|
|
|
|
You must re-run this command after making any changes to the CLI source code,
|
|
but not after making changes to tests.
|
|
|
|
## Running the tests
|
|
|
|
The integration tests are not run as part of the default `npm run test` command.
|
|
They must be run explicitly using the `npm run test:integration:all` script.
|
|
|
|
The integration tests can also be run using the following shortcut:
|
|
|
|
```bash
|
|
npm run test:e2e
|
|
```
|
|
|
|
## Running a specific set of tests
|
|
|
|
To run a subset of test files, you can use
|
|
`npm run <integration test command> <file_name1> ....` where <integration
|
|
test command> is either `test:e2e` or `test:integration*` and `<file_name>`
|
|
is any of the `.test.js` files in the `integration-tests/` directory. For
|
|
example, the following command runs `list_directory.test.js` and
|
|
`write_file.test.js`:
|
|
|
|
```bash
|
|
npm run test:e2e list_directory write_file
|
|
```
|
|
|
|
### Running a single test by name
|
|
|
|
To run a single test by its name, use the `--test-name-pattern` flag:
|
|
|
|
```bash
|
|
npm run test:e2e -- --test-name-pattern "reads a file"
|
|
```
|
|
|
|
### Regenerating model responses
|
|
|
|
Some integration tests use faked out model responses, which may need to be
|
|
regenerated from time to time as the implementations change.
|
|
|
|
To regenerate these golden files, set the REGENERATE_MODEL_GOLDENS environment
|
|
variable to "true" when running the tests, for example:
|
|
|
|
**WARNING**: If running locally you should review these updated responses for
|
|
any information about yourself or your system that gemini may have included in
|
|
these responses.
|
|
|
|
```bash
|
|
REGENERATE_MODEL_GOLDENS="true" npm run test:e2e
|
|
```
|
|
|
|
**WARNING**: Make sure you run **await rig.cleanup()** at the end of your test,
|
|
else the golden files will not be updated.
|
|
|
|
### Deflaking a test
|
|
|
|
Before adding a **new** integration test, you should test it at least 5 times
|
|
with the deflake script or workflow to make sure that it is not flaky.
|
|
|
|
### Deflake script
|
|
|
|
```bash
|
|
npm run deflake -- --runs=5 --command="npm run test:e2e -- -- --test-name-pattern '<your-new-test-name>'"
|
|
```
|
|
|
|
#### Deflake workflow
|
|
|
|
```bash
|
|
gh workflow run deflake.yml --ref <your-branch> -f test_name_pattern="<your-test-name-pattern>"
|
|
```
|
|
|
|
### Running all tests
|
|
|
|
To run the entire suite of integration tests, use the following command:
|
|
|
|
```bash
|
|
npm run test:integration:all
|
|
```
|
|
|
|
### Sandbox matrix
|
|
|
|
The `all` command will run tests for `no sandboxing`, `docker` and `podman`.
|
|
Each individual type can be run using the following commands:
|
|
|
|
```bash
|
|
npm run test:integration:sandbox:none
|
|
```
|
|
|
|
```bash
|
|
npm run test:integration:sandbox:docker
|
|
```
|
|
|
|
```bash
|
|
npm run test:integration:sandbox:podman
|
|
```
|
|
|
|
## Diagnostics
|
|
|
|
The integration test runner provides several options for diagnostics to help
|
|
track down test failures.
|
|
|
|
### Keeping test output
|
|
|
|
You can preserve the temporary files created during a test run for inspection.
|
|
This is useful for debugging issues with file system operations.
|
|
|
|
To keep the test output set the `KEEP_OUTPUT` environment variable to `true`.
|
|
|
|
```bash
|
|
KEEP_OUTPUT=true npm run test:integration:sandbox:none
|
|
```
|
|
|
|
When output is kept, the test runner will print the path to the unique directory
|
|
for the test run.
|
|
|
|
### Verbose output
|
|
|
|
For more detailed debugging, set the `VERBOSE` environment variable to `true`.
|
|
|
|
```bash
|
|
VERBOSE=true npm run test:integration:sandbox:none
|
|
```
|
|
|
|
When using `VERBOSE=true` and `KEEP_OUTPUT=true` in the same command, the output
|
|
is streamed to the console and also saved to a log file within the test's
|
|
temporary directory.
|
|
|
|
The verbose output is formatted to clearly identify the source of the logs:
|
|
|
|
```
|
|
--- TEST: <log dir>:<test-name> ---
|
|
... output from the gemini command ...
|
|
--- END TEST: <log dir>:<test-name> ---
|
|
```
|
|
|
|
## Linting and formatting
|
|
|
|
To ensure code quality and consistency, the integration test files are linted as
|
|
part of the main build process. You can also manually run the linter and
|
|
auto-fixer.
|
|
|
|
### Running the linter
|
|
|
|
To check for linting errors, run the following command:
|
|
|
|
```bash
|
|
npm run lint
|
|
```
|
|
|
|
You can include the `:fix` flag in the command to automatically fix any fixable
|
|
linting errors:
|
|
|
|
```bash
|
|
npm run lint:fix
|
|
```
|
|
|
|
## Directory structure
|
|
|
|
The integration tests create a unique directory for each test run inside the
|
|
`.integration-tests` directory. Within this directory, a subdirectory is created
|
|
for each test file, and within that, a subdirectory is created for each
|
|
individual test case.
|
|
|
|
This structure makes it easy to locate the artifacts for a specific test run,
|
|
file, or case.
|
|
|
|
```
|
|
.integration-tests/
|
|
└── <run-id>/
|
|
└── <test-file-name>.test.js/
|
|
└── <test-case-name>/
|
|
├── output.log
|
|
└── ...other test artifacts...
|
|
```
|
|
|
|
## Continuous integration
|
|
|
|
To ensure the integration tests are always run, a GitHub Actions workflow is
|
|
defined in `.github/workflows/chained_e2e.yml`. This workflow automatically runs
|
|
the integrations tests for pull requests against the `main` branch, or when a
|
|
pull request is added to a merge queue.
|
|
|
|
The workflow runs the tests in different sandboxing environments to ensure
|
|
Gemini CLI is tested across each:
|
|
|
|
- `sandbox:none`: Runs the tests without any sandboxing.
|
|
- `sandbox:docker`: Runs the tests in a Docker container.
|
|
- `sandbox:podman`: Runs the tests in a Podman container.
|