Commit Graph

5068 Commits

Author SHA1 Message Date
jif-oai
285f4ea817 feat: restrict spawn_agent v2 to messages (#16325) 2026-03-31 14:52:55 +02:00
jif-oai
4c72e62d0b fix: update fork boundaries computation (#16322) 2026-03-31 14:10:43 +02:00
jif-oai
1fc8aa0e16 feat: fork pattern v2 (#15771)
Adds this:
```
properties.insert(
            "fork_turns".to_string(),
            JsonSchema::String {
                description: Some(
                    "Optional MultiAgentV2 fork mode. Use `none`, `all`, or a positive integer string such as `3` to fork only the most recent turns."
                        .to_string(),
                ),
            },
        );
        ```

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-31 13:06:08 +02:00
jif-oai
2b8d29ac0d nit: update aborted line (#16318) 2026-03-31 13:06:00 +02:00
jif-oai
ec21e1fd01 chore: clean wait v2 (#16317) 2026-03-31 12:18:10 +02:00
jif-oai
25fbd7e40e fix: ma2 (#16238) 2026-03-31 11:22:38 +02:00
jif-oai
873e466549 fix: one shot end of turn (#16308)
Fix the death of the end of turn watcher
2026-03-31 11:11:33 +02:00
Michael Bolin
20f43c1e05 core: support dynamic auth tokens for model providers (#16288)
## Summary

Fixes #15189.

Custom model providers that set `requires_openai_auth = false` could
only use static credentials via `env_key` or
`experimental_bearer_token`. That is not enough for providers that mint
short-lived bearer tokens, because Codex had no way to run a command to
obtain a bearer token, cache it briefly in memory, and retry with a
refreshed token after a `401`.

This PR adds that provider config and wires it through the existing auth
design: request paths still go through `AuthManager.auth()` and
`UnauthorizedRecovery`, with `core` only choosing when to use a
provider-backed bearer-only `AuthManager`.

## Scope

To keep this PR reviewable, `/models` only uses provider auth for the
initial request in this change. It does **not** add a dedicated `401`
retry path for `/models`; that can be follow-up work if we still need it
after landing the main provider-token support.

## Example Usage

```toml
model_provider = "corp-openai"

[model_providers.corp-openai]
name = "Corp OpenAI"
base_url = "https://gateway.example.com/openai"
requires_openai_auth = false

[model_providers.corp-openai.auth]
command = "gcloud"
args = ["auth", "print-access-token"]
timeout_ms = 5000
refresh_interval_ms = 300000
```

The command contract is intentionally small:

- write the bearer token to `stdout`
- exit `0`
- any leading or trailing whitespace is trimmed before the token is used

## What Changed

- add `model_providers.<id>.auth` to the config model and generated
schema
- validate that command-backed provider auth is mutually exclusive with
`env_key`, `experimental_bearer_token`, and `requires_openai_auth`
- build a bearer-only `AuthManager` for `ModelClient` and
`ModelsManager` when a provider configures `auth`
- let normal Responses requests and realtime websocket connects use the
provider-backed bearer source through the same `AuthManager.auth()` path
- allow `/models` online refresh for command-auth providers and attach
the provider token to the initial `/models` request
- keep `auth.cwd` available as an advanced escape hatch and include it
in the generated config schema

## Testing

- `cargo test -p codex-core provider_auth_command`
- `cargo test -p codex-core
refresh_available_models_uses_provider_auth_token`
- `cargo test -p codex-core
test_deserialize_provider_auth_config_defaults`

## Docs

- `developers.openai.com/codex` should document the new
`[model_providers.<id>.auth]` block and the token-command contract
2026-03-31 01:37:27 -07:00
Michael Bolin
0071968829 auth: let AuthManager own external bearer auth (#16287)
## Summary

`AuthManager` and `UnauthorizedRecovery` already own token resolution
and staged `401` recovery. The missing piece for provider auth was a
bearer-only mode that still fit that design, instead of pushing a second
auth abstraction into `codex-core`.

This PR keeps the design centered on `AuthManager`: it teaches
`codex-login` how to own external bearer auth directly so later provider
work can keep calling `AuthManager.auth()` and `UnauthorizedRecovery`.

## Motivation

This is the middle layer for #15189.

The intended design is still:

- `AuthManager` encapsulates token storage and refresh
- `UnauthorizedRecovery` powers staged `401` recovery
- all request tokens go through `AuthManager.auth()`

This PR makes that possible for provider-backed bearer tokens by adding
a bearer-only auth mode inside `AuthManager` instead of building
parallel request-auth plumbing in `core`.

## What Changed

- move `ModelProviderAuthInfo` into `codex-protocol` so `core` and
`login` share one config shape
- add `login/src/auth/external_bearer.rs`, which runs the configured
command, caches the bearer token in memory, and refreshes it after `401`
- add `AuthManager::external_bearer_only(...)` for provider-scoped
request paths that should use command-backed bearer auth without
mutating the shared OpenAI auth manager
- add `AuthManager::shared_with_external_chatgpt_auth_refresher(...)`
and rename the other `AuthManager` helpers that only apply to external
ChatGPT auth so the ChatGPT-only path is explicit at the call site
- keep external ChatGPT refresh behavior unchanged while ensuring
bearer-only external auth never persists to `auth.json`

## Testing

- `cargo test -p codex-login`
- `cargo test -p codex-protocol`





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16287).
* #16288
* __->__ #16287
2026-03-31 01:26:17 -07:00
Michael Bolin
ea650a91b3 auth: generalize external auth tokens for bearer-only sources (#16286)
## Summary

`ExternalAuthRefresher` was still shaped around external ChatGPT auth:
`ExternalAuthTokens` always implied ChatGPT account metadata even when a
caller only needed a bearer token.

This PR generalizes that contract so bearer-only sources are
first-class, while keeping the existing ChatGPT paths strict anywhere we
persist or rebuild ChatGPT auth state.

## Motivation

This is the first step toward #15189.

The follow-on provider-auth work needs one shared external-auth contract
that can do both of these things:

- resolve the current bearer token before a request is sent
- return a refreshed bearer token after a `401`

That should not require a second token result type just because there is
no ChatGPT account metadata attached.

## What Changed

- change `ExternalAuthTokens` to carry `access_token` plus optional
`ExternalAuthChatgptMetadata`
- add helper constructors for bearer-only tokens and ChatGPT-backed
tokens
- add `ExternalAuthRefresher::resolve()` with a default no-op
implementation so refreshers can optionally provide the current token
before a request is sent
- keep ChatGPT-only persistence strict by continuing to require ChatGPT
metadata anywhere the login layer seeds or reloads ChatGPT auth state
- update the app-server bridge to construct the new token shape for
external ChatGPT auth refreshes

## Testing

- `cargo test -p codex-login`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16286).
* #16288
* #16287
* __->__ #16286
2026-03-31 01:02:46 -07:00
Michael Bolin
19f0d196d1 ci: run Windows argument-comment-lint via native Bazel (#16120)
## Why

Follow-up to #16106.

`argument-comment-lint` already runs as a native Bazel aspect on Linux
and macOS, but Windows is still the long pole in `rust-ci`. To move
Windows onto the same native Bazel lane, the toolchain split has to let
exec-side helper binaries build in an MSVC environment while still
linting repo crates as `windows-gnullvm`.

Pushing the Windows lane onto the native Bazel path exposed a second
round of Windows-only issues in the mixed exec-toolchain plumbing after
the initial wrapper/target fixes landed.

## What Changed

- keep the Windows lint lanes on the native Bazel/aspect path in
`rust-ci.yml` and `rust-ci-full.yml`
- add a dedicated `local_windows_msvc` platform for exec-side helper
binaries while keeping `local_windows` as the `windows-gnullvm` target
platform
- patch `rules_rust` so `repository_set(...)` preserves explicit
exec-platform constraints for the generated toolchains, keep the
Windows-specific bootstrap/direct-link fixes needed for the nightly lint
driver, and expose exec-side `rustc-dev` `.rlib`s to the MSVC sysroot
- register the custom Windows nightly toolchain set with MSVC exec
constraints while still exposing both `x86_64-pc-windows-msvc` and
`x86_64-pc-windows-gnullvm` targets
- enable `dev_components` on the custom Windows nightly repository set
so the MSVC exec helper toolchain actually downloads the
compiler-internal crates that `clippy_utils` needs
- teach `run-argument-comment-lint-bazel.sh` to enumerate concrete
Windows Rust rules, normalize the resulting labels, and skip explicitly
requested incompatible targets instead of failing before the lint run
starts
- patch `rules_rust` build-script env propagation so exec-side
`windows-msvc` helper crates drop forwarded MinGW include and linker
search paths as whole flag/path pairs instead of emitting malformed
`CFLAGS`, `CXXFLAGS`, and `LDFLAGS`
- export the Windows VS/MSVC SDK environment in `setup-bazel-ci` and
pass the relevant variables through `run-bazel-ci.sh` via `--action_env`
/ `--host_action_env` so Bazel build scripts can see the MSVC and UCRT
headers on native Windows runs
- add inline comments to the Windows `setup-bazel-ci` MSVC environment
export step so it is easier to audit how `vswhere`, `VsDevCmd.bat`, and
the filtered `GITHUB_ENV` export fit together
- patch `aws-lc-sys` to skip its standalone `memcmp` probe under Bazel
`windows-msvc` build-script environments, which avoids a Windows-native
toolchain mismatch that blocked the lint lane before it reached the
aspect execution
- patch `aws-lc-sys` to prefer its bundled `prebuilt-nasm` objects for
Bazel `windows-msvc` build-script runs, which avoids missing
`generated-src/win-x86_64/*.asm` runfiles in the exec-side helper
toolchain
- annotate the Linux test-only callsites in `codex-rs/linux-sandbox` and
`codex-rs/core` that the wider native lint coverage surfaced

## Patches

This PR introduces a large patch stack because the Windows Bazel lint
lane currently depends on behavior that upstream dependencies do not
provide out of the box in the mixed `windows-gnullvm` target /
`windows-msvc` exec-toolchain setup.

- Most of the `rules_rust` patches look like upstream candidates rather
than OpenAI-only policy. Preserving explicit exec-platform constraints,
forwarding the right MSVC/UCRT environment into exec-side build scripts,
exposing exec-side `rustc-dev` artifacts, and keeping the Windows
bootstrap/linker behavior coherent all look like fixes to the Bazel/Rust
integration layer itself.
- The two `aws-lc-sys` patches are more tactical. They special-case
Bazel `windows-msvc` build-script environments to avoid a `memcmp` probe
mismatch and missing NASM runfiles. Those may be harder to upstream
as-is because they rely on Bazel-specific detection instead of a general
Cargo/build-script contract.
- Short term, carrying these patches in-tree is reasonable because they
unblock a real CI lane and are still narrow enough to audit. Long term,
the goal should not be to keep growing a permanent local fork of either
dependency.
- My current expectation is that the `rules_rust` patches are less
controversial and should be broken out into focused upstream proposals,
while the `aws-lc-sys` patches are more likely to be temporary escape
hatches unless that crate wants a more general hook for hermetic build
systems.

Suggested follow-up plan:

1. Split the `rules_rust` deltas into upstream-sized PRs or issues with
minimized repros.
2. Revisit the `aws-lc-sys` patches during the next dependency bump and
see whether they can be replaced by an upstream fix, a crate upgrade, or
a cleaner opt-in mechanism.
3. Treat each dependency update as a chance to delete patches one by one
so the local patch set only contains still-needed deltas.

## Verification

- `./.github/scripts/run-argument-comment-lint-bazel.sh
--config=argument-comment-lint --keep_going`
- `RUNNER_OS=Windows
./.github/scripts/run-argument-comment-lint-bazel.sh --nobuild
--config=argument-comment-lint --platforms=//:local_windows
--keep_going`
- `cargo test -p codex-linux-sandbox`
- `cargo test -p codex-core shell_snapshot_tests`
- `just argument-comment-lint`

## References

- #16106
2026-03-30 15:32:04 -07:00
Andrey Mishchenko
390b644b21 Update code mode exec() instructions (#16279) 2026-03-30 12:31:13 -10:00
rhan-oai
28a9807f84 [codex-analytics] refactor analytics to use reducer architecture (#16225)
- rework codex analytics crate to use reducer / publish architecture
- in anticipation of extensive codex analytics
2026-03-30 14:27:12 -07:00
Michael Bolin
9313c49e4c fix: close Bazel argument-comment-lint CI gaps (#16253)
## Why

The Bazel-backed `argument-comment-lint` CI path had two gaps:

- Bazel wildcard target expansion skipped inline unit-test crates from
`src/` modules because the generated `*-unit-tests-bin` `rust_test`
targets are tagged `manual`.
- `argument-comment-mismatch` was still only a warning in the Bazel and
packaged-wrapper entrypoints, so a typoed `/*param_name*/` comment could
still pass CI even when the lint detected it.

That left CI blind to real linux-sandbox examples, including the missing
`/*local_port*/` comment in
`codex-rs/linux-sandbox/src/proxy_routing.rs` and typoed argument
comments in `codex-rs/linux-sandbox/src/landlock.rs`.

## What Changed

- Added `tools/argument-comment-lint/list-bazel-targets.sh` so Bazel
lint runs cover `//codex-rs/...` plus the manual `rust_test`
`*-unit-tests-bin` targets.
- Updated `just argument-comment-lint`, `rust-ci.yml`, and
`rust-ci-full.yml` to use that helper.
- Promoted both `argument-comment-mismatch` and
`uncommented-anonymous-literal-argument` to errors in every strict
entrypoint:
  - `tools/argument-comment-lint/lint_aspect.bzl`
  - `tools/argument-comment-lint/src/bin/argument-comment-lint.rs`
  - `tools/argument-comment-lint/wrapper_common.py`
- Added wrapper/bin coverage for the stricter lint flags and documented
the behavior in `tools/argument-comment-lint/README.md`.
- Fixed the now-covered callsites in
`codex-rs/linux-sandbox/src/proxy_routing.rs`,
`codex-rs/linux-sandbox/src/landlock.rs`, and
`codex-rs/core/src/shell_snapshot_tests.rs`.

This keeps the Bazel target expansion narrow while making the Bazel and
prebuilt-linter paths enforce the same strict lint set.

## Verification

- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `cargo +nightly-2025-09-18 test --manifest-path
tools/argument-comment-lint/Cargo.toml`
- `just argument-comment-lint`
2026-03-30 11:59:50 -07:00
Michael Bolin
258ba436f1 codex-tools: extract discoverable tool models (#16254)
## Why

`#16193` moved the pure `tool_search` and `tool_suggest` spec builders
into `codex-tools`, but `codex-core` still owned the shared
discoverable-tool model that those builders and the `tool_suggest`
runtime both depend on. This change continues the migration by moving
that reusable model boundary out of `codex-core` as well, so the
discovery/suggestion stack uses one shared set of types and
`core/src/tools` no longer needs its own `discoverable.rs` module.

## What changed

- Moved `DiscoverableTool`, `DiscoverablePluginInfo`, and
`filter_tool_suggest_discoverable_tools_for_client()` into
`codex-rs/tools/src/tool_discovery.rs` alongside the extracted
discovery/suggestion spec builders.
- Added `codex-app-server-protocol` as a `codex-tools` dependency so the
shared discoverable-tool model can own the connector-side `AppInfo`
variant directly.
- Updated `core/src/tools/handlers/tool_suggest.rs`,
`core/src/tools/spec.rs`, `core/src/tools/router.rs`,
`core/src/connectors.rs`, and `core/src/codex.rs` to consume the shared
`codex-tools` model instead of the old core-local declarations.
- Changed `core/src/plugins/discoverable.rs` to return
`DiscoverablePluginInfo` directly, moved the pure client-filter coverage
into `tool_discovery_tests.rs`, and deleted the old
`core/src/tools/discoverable.rs` module.
- Updated `codex-rs/tools/README.md` so the crate boundary documents
that `codex-tools` now owns the discoverable-tool models in addition to
the discovery/suggestion spec builders.

## Test plan

- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib tools::handlers::tool_suggest::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discoverable-model cargo test -p
codex-core --lib plugins::discoverable::`
- `just bazel-lock-check`
- `just argument-comment-lint`

## References

- #16193
- #16154
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
2026-03-30 10:48:49 -07:00
Michael Bolin
716f7b0428 codex-tools: extract discovery tool specs (#16193)
## Why

`core/src/tools/spec.rs` still owned the pure `tool_search` and
`tool_suggest` spec builders even though that logic no longer needed
`codex-core` runtime state. This change continues the `codex-tools`
migration by moving the reusable discovery and suggestion spec
construction out of `codex-core` so `spec.rs` is left with the
core-owned policy decisions about when these tools are exposed and what
metadata is available.

## What changed

- Added `codex-rs/tools/src/tool_discovery.rs` with the shared
`tool_search` and `tool_suggest` spec builders, plus focused unit tests
in `tool_discovery_tests.rs`.
- Moved the shared `DiscoverableToolAction` and `DiscoverableToolType`
declarations into `codex-tools` so the `tool_suggest` handler and the
extracted spec builders use the same wire-model enums.
- Updated `core/src/tools/spec.rs` to translate `ToolInfo` and
`DiscoverableTool` values into neutral `codex-tools` inputs and delegate
the actual spec building there.
- Removed the old template-based description rendering helpers from
`core/src/tools/spec.rs` and deleted the now-dead helper methods in
`core/src/tools/discoverable.rs`.
- Updated `codex-rs/tools/README.md` to document that discovery and
suggestion models/spec builders now live in `codex-tools`.

## Test plan

- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-discovery-specs cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-discovery-specs cargo test -p
codex-core --lib tools::handlers::tool_suggest::`
- `just argument-comment-lint`

## References

- #16154
- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
2026-03-30 08:15:12 -07:00
jif-oai
c74190a622 fix: ma1 (#16237) 2026-03-30 15:42:17 +02:00
jif-oai
213756c9ab feat: add mailbox concept for wait (#16010)
Add a mailbox we can use for inter-agent communication
`wait` is now based on it and don't take target anymore
2026-03-30 11:47:20 +02:00
Eric Traut
bb95ec3ec6 [codex] Normalize Windows path in MCP startup snapshot test (#16204)
## Summary
A Windows-only snapshot assertion in the app-server MCP startup warning
test compared the raw rendered path, so CI saw `C:\tmp\project` instead
of the normalized `/tmp/project` snapshot fixture.

## Fix
Route that snapshot assertion through the existing
`normalize_snapshot_paths(...)` helper so the test remains
platform-stable.
2026-03-29 17:54:17 -06:00
Michael Bolin
af568afdd5 codex-tools: extract utility tool specs (#16154)
## Why

The previous `codex-tools` migration steps moved the shared schema
models, local-host specs, collaboration specs, and related adapters out
of `codex-core`, but `core/src/tools/spec.rs` still contained a grab bag
of pure utility tool builders. Those specs do not need session state or
handler logic; they only describe wire shapes for tools that
`codex-core` already knows how to execute.

Moving that remaining low-coupling layer into `codex-tools` keeps the
migration moving in meaningful chunks and trims another large block of
passive tool-spec construction out of `codex-core` without touching the
runtime-coupled handlers.

## What changed

- extended `codex-tools` to own the pure spec builders for:
  - code-mode `exec` / `wait`
  - `js_repl` / `js_repl_reset`
- MCP resource tools `list_mcp_resources`,
`list_mcp_resource_templates`, and `read_mcp_resource`
  - utility tools `list_dir` and `test_sync_tool`
- split those builders across small module files with sibling
`*_tests.rs` coverage, keeping `src/lib.rs` exports-only
- rewired `core/src/tools/spec.rs` to call the extracted builders and
deleted the duplicated core-local implementations
- moved the direct JS REPL grammar seam test out of
`core/src/tools/spec_tests.rs` so it now lives with the extracted
implementation in `codex-tools`
- updated `codex-rs/tools/README.md` so the documented crate boundary
matches the new utility-spec surface

## Test plan

- `CARGO_TARGET_DIR=/tmp/codex-tools-utility-specs cargo test -p
codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-utility-specs cargo test -p
codex-core --lib tools::spec::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
- #16141
2026-03-29 14:34:36 -07:00
Eric Traut
38e648ca67 Fix tui_app_server ghost subagent entries in /agent (#16110)
Fixes #16092

The app-server-backed TUI could accumulate ghost subagent entries in
`/agent` after resume/backfill flows. Some of those rows were no longer
live according to the backend, but still appeared selectable in the
picker and could open as blank threads.

*Cause*
Unlike the legacy tui behavior, tui_app_server was creating local
picker/replay state for subagents discovered through metadata refresh
and loaded-thread backfill, even when no real local session or
transcript had been attached. That let stale ids survive in the picker
as if they were replayable threads.

*Fix*
Stop creating empty local thread channels during subagent metadata
hydration and loaded-thread backfill.
When opening /agent, prune metadata-only entries that thread/read
reports as terminally unavailable.
When selecting a discovered subagent that is still live but not yet
locally attached, materialize a real local session on demand from
thread/read instead of falling back to an empty replay state.
2026-03-29 12:19:34 -06:00
Eric Traut
54d3ad1ede Fix app-server TUI MCP startup warnings regression (#16041)
This addresses #16038

The default `tui_app_server` path stopped surfacing MCP startup failures
during cold start, even though the legacy TUI still showed warnings like
`MCP startup incomplete (...)`. The app-server bridge emitted per-server
startup status notifications, but `tui_app_server` ignored them, so
failed MCP handshakes could look like a clean startup.

This change teaches `tui_app_server` to consume MCP startup status
notifications, preserve the immediate per-server failure warning, and
synthesize the same aggregate startup warning the legacy TUI shows once
startup settles.
2026-03-29 11:57:00 -06:00
Michael Bolin
7880414a27 codex-tools: extract collaboration tool specs (#16141)
## Why

The recent `codex-tools` migration steps have moved shared tool models
and low-coupling spec helpers out of `codex-core`, but
`core/src/tools/spec.rs` still owned a large block of pure
collaboration-tool spec construction. Those builders do not need session
state or runtime behavior; they only need a small amount of core-owned
configuration injected at the seam.

Moving that cohesive slice into `codex-tools` makes the crate boundary
more honest and removes a substantial amount of passive tool-spec logic
from `codex-core` without trying to move the runtime-coupled multi-agent
handlers at the same time.

## What changed

- added `agent_tool.rs`, `request_user_input_tool.rs`, and
`agent_job_tool.rs` to `codex-tools`, with sibling `*_tests.rs` coverage
and an exports-only `lib.rs`
- moved the pure `ToolSpec` builders for:
- collaboration tools such as `spawn_agent`, `send_input`,
`send_message`, `assign_task`, `resume_agent`, `wait_agent`,
`list_agents`, and `close_agent`
  - `request_user_input`
  - agent-job specs `spawn_agents_on_csv` and `report_agent_job_result`
- rewired `core/src/tools/spec.rs` to call the extracted builders while
still supplying the core-owned inputs, such as spawn-agent role
descriptions and wait timeout bounds
- updated the `core/src/tools/spec.rs` seam tests to build expected
collaboration specs through `codex-tools`
- updated `codex-rs/tools/README.md` so the crate documentation reflects
the broader collaboration-tool boundary

## Test plan

- `CARGO_TARGET_DIR=/tmp/codex-tools-collab-specs cargo test -p
codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-collab-specs cargo test -p
codex-core --lib tools::spec::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
- #16138
2026-03-28 20:39:47 -07:00
Matthew Zeng
3807807f91 [mcp] Increase MCP startup timeout. (#16080)
- [x] Increase MCP startup timeout to 30s, as the current 10s causes a
lot of local MCPs to timeout.
2026-03-28 19:58:00 -07:00
Eric Traut
3bbc1ce003 Remove TUI voice transcription feature (#16114)
Removes the partially-completed TUI composer voice transcription flow,
including its feature flag, app events, and hold-to-talk state machine.
2026-03-29 00:20:25 +00:00
Michael Bolin
4e119a3b38 codex-tools: extract local host tool specs (#16138)
## Why

`core/src/tools/spec.rs` still bundled a set of pure local-host tool
builders with the orchestration that actually decides when those tools
are exposed and which handlers back them. That made `codex-core`
responsible for JSON/tool-shape construction that does not depend on
session state, and it kept the `codex-tools` migration from taking a
meaningfully larger bite out of `spec.rs`.

This PR moves that reusable spec-building layer into `codex-tools` while
leaving feature gating, handler registration, and runtime-coupled
descriptions in `codex-core`.

## What changed

- added `codex-rs/tools/src/local_tool.rs` for the pure builders for
`exec_command`, `write_stdin`, `shell`, `shell_command`, and
`request_permissions`
- added `codex-rs/tools/src/view_image.rs` for the `view_image` tool
spec and output schema so the extracted modules stay right-sized
- rewired `codex-rs/core/src/tools/spec.rs` to call those extracted
builders instead of constructing these specs inline
- kept the `request_permissions` description source in `codex-core`,
with `codex-tools` taking the description as input so the crate boundary
does not grow a dependency on handler/runtime code
- moved the direct constructor coverage for this slice from
`codex-rs/core/src/tools/spec_tests.rs` into
`codex-rs/tools/src/local_tool_tests.rs` and
`codex-rs/tools/src/view_image_tests.rs`
- updated `codex-rs/tools/README.md` to reflect that `codex-tools` now
owns this local-host spec layer

## Test plan

- `CARGO_TARGET_DIR=/tmp/codex-tools-local-host cargo test -p
codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-local-tools cargo test -p codex-core
--lib tools::spec::`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
- #16132
2026-03-28 16:33:58 -07:00
Eric Traut
46b653e73c Fix skills picker scrolling in tui app server (#16109)
Fixes #16091.

The app-server TUI was truncating the filtered mention candidate list to
`MAX_POPUP_ROWS`, so the `$` skills picker only exposed the first 8
matches. That made it look like many skills were missing and prevented
keyboard navigation beyond the first page, even though direct
`$skill-name` insertion still worked.

Testing: I manually verified the regression and confirmed the fix.
2026-03-28 17:22:25 -06:00
Michael Bolin
f7ef9599ed exec: make review-policy tests hermetic (#16137)
## Why

`thread_start_params_from_config()` is supposed to forward the effective
`approvals_reviewer` into the app-server request, but these tests were
constructing that config through `ConfigBuilder::build()`, which also
loads ambient system and managed config layers. On machines with an
admin or host-level reviewer override, the manual-only case could
inherit `guardian_subagent` and fail even though the exec-side mapping
was correct.

## What changed

- Set `approvals_reviewer` explicitly via `harness_overrides` in the two
`thread_start_params_*review_policy*` tests in
`codex-rs/exec/src/lib.rs`.
- Removed the dependence on default config resolution and temp
`config.toml` writes so the tests exercise only the reviewer-to-request
mapping in `codex-exec`.

## Testing

- `cargo test -p codex-exec`
2026-03-28 23:01:04 +00:00
Michael Bolin
a16a9109d7 ci: use BuildBuddy for rust-ci-full non-Windows argument-comment-lint (#16136)
## Why

PR #16130 fixed the Windows `argument-comment-lint` regression in
`rust-ci-full`, but the next `main` runs still left the Linux and macOS
lint legs timing out.

In [run
23695263729](https://github.com/openai/codex/actions/runs/23695263729),
both non-Windows `argument-comment-lint` jobs were cancelled almost
exactly 30 minutes after they started. The remaining workflow difference
versus `rust-ci.yml` was that `rust-ci-full` did not pass
`BUILDBUDDY_API_KEY` into the non-Windows Bazel lint step, so
`run-bazel-ci.sh` fell back to local Bazel configuration instead of
using the faster remote-backed path available on `main`.

## What changed

- passed `BUILDBUDDY_API_KEY` to the non-Windows `rust-ci-full`
`argument-comment-lint` Bazel step
- left the Windows packaged-wrapper path from #16130 unchanged
- kept the change scoped to `rust-ci-full.yml`

## Test plan

- loaded `.github/workflows/rust-ci-full.yml` and
`.github/workflows/rust-ci.yml` with `python3` + `yaml.safe_load(...)`
- inspected run `23695263729` and confirmed `Argument comment lint -
Linux` and `Argument comment lint - macOS` were cancelled about 30
minutes after start
- verified the updated `rust-ci-full` step now matches the non-Windows
secret wiring already present in `rust-ci.yml`

## References

- #16130
- #16106
2026-03-28 15:36:01 -07:00
Michael Bolin
2238c16a91 codex-tools: extract code mode tool spec adapters (#16132)
## Why

The longer-term `codex-tools` migration is to move pure tool-definition
and tool-spec plumbing out of `codex-core` while leaving session- and
runtime-coupled orchestration behind.

The remaining code-mode adapter layer in
`core/src/tools/code_mode_description.rs` was a good next extraction
seam because it only transformed `ToolSpec` values for code mode and
already delegated the low-level description rendering to
`codex-code-mode`.

## What Changed

- added `codex-rs/tools/src/code_mode.rs` with
`augment_tool_spec_for_code_mode()` and
`tool_spec_to_code_mode_tool_definition()`
- added focused unit coverage in `codex-rs/tools/src/code_mode_tests.rs`
- rewired `core/src/tools/spec.rs` and `core/src/tools/code_mode/mod.rs`
to use the extracted adapters from `codex-tools`
- removed the old `core/src/tools/code_mode_description.rs` shim and its
test file from `codex-core`
- added the `codex-code-mode` dependency to `codex-tools`, updated
`Cargo.lock`, and refreshed the `codex-tools` README to reflect the
expanded boundary

## Test Plan

- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-code-mode-adapters cargo test -p
codex-core --lib tools::code_mode::`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
- #16129
2026-03-28 15:32:35 -07:00
Michael Bolin
c25c0d6e9e core: fix stale curated plugin cache refresh races (#16126)
## Why

The `plugin/list` force-sync path can race app-server startup's curated
plugin cache refresh.

Startup was capturing the configured curated plugin IDs from the initial
config snapshot. If `plugin/list` with `forceRemoteSync` removed curated
plugin entries from `config.toml` while that background refresh was
still in flight, the startup task could recreate cache directories for
plugins that had just been uninstalled.

That leaves the `plugin/list` response logically correct but the on-disk
cache stale, which matches the flaky Ubuntu arm failure seen in
`codex-app-server::all
suite::v2::plugin_list::plugin_list_force_remote_sync_reconciles_curated_plugin_state`
while validating [#16047](https://github.com/openai/codex/pull/16047).

## What

- change `codex-rs/core/src/plugins/manager.rs` so startup curated-repo
refresh rereads the current user `config.toml` before deciding which
curated plugin cache entries to refresh
- factor the configured-plugin parsing so the same logic can be reused
from either the config layer stack or the persisted user config value
- add a regression test that verifies curated plugin IDs are read from
the latest user config state before cache refresh runs

## Testing

- `cargo test -p codex-core
configured_curated_plugin_ids_from_codex_home_reads_latest_user_config
-- --nocapture`
- `cargo test -p codex-app-server
suite::v2::plugin_list::plugin_list_force_remote_sync_reconciles_curated_plugin_state
-- --nocapture`
- `just argument-comment-lint`
2026-03-28 15:00:39 -07:00
Michael Bolin
313fb95989 ci: keep rust-ci-full Windows argument-comment-lint on packaged wrapper (#16130)
## Why

PR #16106 switched `rust-ci-full` over to the native Bazel-backed
`argument-comment-lint` path on all three platforms.

That works on Linux and macOS, but the Windows leg in `rust-ci-full` now
fails before linting starts: Bazel dies while building `rules_rust`'s
`process_wrapper` tool, so `main` reports an `argument-comment-lint`
failure even though no Rust lint finding was produced.

Until native Windows Bazel linting is repaired, `rust-ci-full` should
keep the same Windows split that `rust-ci.yml` already uses.

## What changed

- restored the Windows-only nightly `argument-comment-lint` toolchain
setup in `rust-ci-full`
- limited the Bazel-backed lint step in `rust-ci-full` to non-Windows
runners
- routed the Windows runner back through
`tools/argument-comment-lint/run-prebuilt-linter.py`
- left the Linux and macOS `rust-ci-full` behavior unchanged

## Test plan

- loaded `.github/workflows/rust-ci-full.yml` and
`.github/workflows/rust-ci.yml` with `python3` + `yaml.safe_load(...)`
- inspected failing Actions run `23692864849`, especially job
`69023229311`, to confirm the Windows failure occurs in Bazel
`process_wrapper` setup before lint output is emitted

## References

- #16106
2026-03-28 14:50:19 -07:00
Michael Bolin
4e27a87ec6 codex-tools: extract configured tool specs (#16129)
## Why

This continues the `codex-tools` migration by moving another passive
tool-spec layer out of `codex-core`.

After `ToolSpec` moved into `codex-tools`, `codex-core` still owned
`ConfiguredToolSpec` and `create_tools_json_for_responses_api()`. Both
are data-model and serialization helpers rather than runtime
orchestration, so keeping them in `core/src/tools/registry.rs` and
`core/src/tools/spec.rs` left passive tool-definition code coupled to
`codex-core` longer than necessary.

## What changed

- moved `ConfiguredToolSpec` into `codex-rs/tools/src/tool_spec.rs`
- moved `create_tools_json_for_responses_api()` into
`codex-rs/tools/src/tool_spec.rs`
- re-exported the new surface from `codex-rs/tools/src/lib.rs`, which
remains exports-only
- updated `core/src/client.rs`, `core/src/tools/registry.rs`, and
`core/src/tools/router.rs` to consume the extracted types and serializer
from `codex-tools`
- moved the tool-list serialization test into
`codex-rs/tools/src/tool_spec_tests.rs`
- added focused unit coverage for `ConfiguredToolSpec::name()`
- simplified `core/src/tools/spec_tests.rs` to use the extracted
`ConfiguredToolSpec::name()` directly and removed the now-redundant
local `tool_name()` helper
- updated `codex-rs/tools/README.md` so the crate boundary reflects the
newly extracted tool-spec wrapper and serialization helper

## Test plan

- `cargo test -p codex-tools`
- `CARGO_TARGET_DIR=/tmp/codex-core-configured-spec cargo test -p
codex-core --lib tools::spec::`
- `CARGO_TARGET_DIR=/tmp/codex-core-configured-spec cargo test -p
codex-core --lib client::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
- #16047
2026-03-28 14:24:14 -07:00
Michael Bolin
ae8a3be958 bazel: refresh the expired macOS SDK pin (#16128)
## Why

macOS BuildBuddy started failing before target analysis because the
Apple CDN object pinned in
[`MODULE.bazel`](fce0f76d57/MODULE.bazel (L28-L36))
now returns `403 Forbidden`. The failure report that triggered this
change was this [BuildBuddy
invocation](https://app.buildbuddy.io/invocation/c57590e0-1bdb-4e19-a86f-74d4a7ded228).

This repo uses `@llvm//extensions:osx.bzl` via `osx.from_archive(...)`,
and that API does not discover a current SDK URL for us. It fetches
exactly the `urls`, `sha256`, and `strip_prefix` we pin. Once Apple
retires that `swcdn.apple.com` object, `@macos_sdk` stops resolving and
every downstream macOS build fails during external repository fetch.

This is the same basic failure mode we hit in
[b9fa08ec61](b9fa08ec61):
the pin itself aged out.

## How I tracked it down

1. I started from the BuildBuddy error and copied the exact
`swcdn.apple.com/.../CLTools_macOSNMOS_SDK.pkg` URL from the failure.
2. I reproduced the issue outside CI by opening that URL directly in a
browser and by running `curl -I` against it locally. Both returned `403
Forbidden`, which ruled out BuildBuddy as the root cause.
3. I searched the repo for that URL and found it hardcoded in
`MODULE.bazel`.
4. I inspected the `llvm` Bzlmod `osx` extension implementation to
confirm that `osx.from_archive(...)` is just a literal fetch of the
pinned archive metadata. There is no automatic fallback or catalog
lookup behind it.
5. I queried Apple's software update catalogs to find the current
Command Line Tools package for macOS 26.x. The useful catalog was:
-
`https://swscan.apple.com/content/catalogs/others/index-26-15-14-13-12-10.16-10.15-10.14-10.13-10.12-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz`

This is scriptable; it does not require opening a website in a browser.
The catalog is a gzip-compressed plist served over HTTP, so the workflow
is just:

   1. fetch the catalog,
   2. decompress it,
   3. search or parse the plist for `CLTools_macOSNMOS_SDK.pkg` entries,
   4. inspect the matching product metadata.

   The quick shell version I used was:

   ```shell
   curl -L <catalog-url> \
     | gzip -dc \
     | rg -n -C 6 'CLTools_macOSNMOS_SDK\.pkg|PostDate|English\.dist'
   ```

That is enough to surface the current product id, package URL, post
date, and the matching `.dist` file. If we want something less
grep-driven next time, the same catalog can be parsed structurally. For
example:

   ```python
   import gzip
   import plistlib
   import urllib.request

url =
"https://swscan.apple.com/content/catalogs/others/index-26-15-14-13-12-10.16-10.15-10.14-10.13-10.12-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz"
   with urllib.request.urlopen(url) as resp:
       catalog = plistlib.loads(gzip.decompress(resp.read()))

   for product_id, product in catalog["Products"].items():
       for package in product.get("Packages", []):
           package_url = package.get("URL", "")
           if package_url.endswith("CLTools_macOSNMOS_SDK.pkg"):
               print(product_id)
               print(product.get("PostDate"))
               print(package_url)
               print(product.get("Distributions", {}).get("English"))
   ```

In practice, `curl` was only the transport. The important part is that
the catalog itself is a machine-readable plist, so this can be
automated.
6. That catalog contains the newer `047-96692` Command Line Tools
release, and its distribution file identifies it as [Command Line Tools
for Xcode
26.4](https://swdist.apple.com/content/downloads/32/53/047-96692-A_OAHIHT53YB/ybtshxmrcju8m2qvw3w5elr4rajtg1x3y3/047-96692.English.dist).
7. I downloaded that package locally, computed its SHA-256, expanded it
with `pkgutil --expand-full`, and verified that it contains
`Payload/Library/Developer/CommandLineTools/SDKs/MacOSX26.4.sdk`, which
is the correct new `strip_prefix` for this pin.

The core debugging loop looked like this:

```shell
curl -I <stale swcdn URL>
rg 'swcdn\.apple\.com|osx\.from_archive' MODULE.bazel
curl -L <apple 26.x sucatalog> | gzip -dc | rg 'CLTools_macOSNMOS_SDK.pkg'
pkgutil --expand-full CLTools_macOSNMOS_SDK.pkg expanded
find expanded/Payload/Library/Developer/CommandLineTools/SDKs -maxdepth 1 -mindepth 1
```

## What changed

- Updated `MODULE.bazel` to point `osx.from_archive(...)` at the
currently live `047-96692` `CLTools_macOSNMOS_SDK.pkg` object.
- Updated the pinned `sha256` to match that package.
- Updated the `strip_prefix` from `MacOSX26.2.sdk` to `MacOSX26.4.sdk`.

## Verification

- `bazel --output_user_root="$(mktemp -d
/tmp/codex-bazel-sdk-fetch.XXXXXX)" build @macos_sdk//sysroot`

## Notes for next time

As long as we pin raw `swcdn.apple.com` objects, this will likely happen
again. When it does, the expected recovery path is:

1. Reproduce the `403` against the exact URL from CI.
2. Find the stale pin in `MODULE.bazel`.
3. Look up the current CLTools package in the relevant Apple software
update catalog for that macOS major version.
4. Download the replacement package and refresh both `sha256` and
`strip_prefix`.
5. Validate the new pin with a fresh `@macos_sdk` fetch, not just an
incremental Bazel build.

The important detail is that the non-`26` catalog did not surface the
macOS 26.x SDK package here; the `index-26-15-14-...` catalog was the
one that exposed the currently live replacement.
2026-03-28 21:08:19 +00:00
Michael Bolin
bc53d42fd9 codex-tools: extract tool spec models (#16047)
## Why

This continues the `codex-tools` migration by moving another passive
tool-definition layer out of `codex-core`.

After `ResponsesApiTool` and the lower-level schema adapters moved into
`codex-tools`, `core/src/client_common.rs` was still owning `ToolSpec`
and the web-search request wire types even though they are serialized
data models rather than runtime orchestration. Keeping those types in
`codex-core` makes the crate boundary look smaller than it really is and
leaves non-runtime tool-shape code coupled to core.

## What changed

- moved `ToolSpec`, `ResponsesApiWebSearchFilters`, and
`ResponsesApiWebSearchUserLocation` into
`codex-rs/tools/src/tool_spec.rs`
- added focused unit tests in `codex-rs/tools/src/tool_spec_tests.rs`
for:
  - `ToolSpec::name()`
  - web-search config conversions
  - `ToolSpec` serialization for `web_search` and `tool_search`
- kept `codex-rs/tools/src/lib.rs` exports-only by re-exporting the new
module from `lib.rs`
- reduced `core/src/client_common.rs` to a compatibility shim that
re-exports the extracted tool-spec types for current core call sites
- updated `core/src/tools/spec_tests.rs` to consume the extracted
web-search types directly from `codex-tools`
- updated `codex-rs/tools/README.md` so the crate contract reflects that
`codex-tools` now owns the passive tool-spec request models in addition
to the lower-level Responses API structs

## Test plan

- `cargo test -p codex-tools`
- `cargo test -p codex-core --lib tools::spec::`
- `cargo test -p codex-core --lib client_common::`
- `just fix -p codex-tools -p codex-core`
- `just argument-comment-lint`

## References

- #15923
- #15928
- #15944
- #15953
- #16031
2026-03-28 13:37:00 -07:00
Eric Traut
178d2b00b1 Remove the codex-tui app-server originator workaround (#16116)
## Summary
- remove the temporary `codex-tui` special-case when setting the default
originator during app-server initialization
2026-03-28 13:53:33 -06:00
Eric Traut
48144a7fa4 Remove remaining custom prompt support (#16115)
## Summary
- remove protocol and core support for discovering and listing custom
prompts
- simplify the TUI slash-command flow and command popup to built-in
commands only
- delete obsolete custom prompt tests, helpers, and docs references
- clean up downstream event handling for the removed protocol events
2026-03-28 13:49:37 -06:00
Michael Bolin
fce0f76d57 build: migrate argument-comment-lint to a native Bazel aspect (#16106)
## Why

`argument-comment-lint` had become a PR bottleneck because the repo-wide
lane was still effectively running a `cargo dylint`-style flow across
the workspace instead of reusing Bazel's Rust dependency graph. That
kept the lint enforced, but it threw away the main benefit of moving
this job under Bazel in the first place: metadata reuse and cacheable
per-target analysis in the same shape as Clippy.

This change moves the repo-wide lint onto a native Bazel Rust aspect so
Linux and macOS can lint `codex-rs` without rebuilding the world
crate-by-crate through the wrapper path.

## What Changed

- add a nightly Rust toolchain with `rustc-dev` for Bazel and a
dedicated crate-universe repo for `tools/argument-comment-lint`
- add `tools/argument-comment-lint/driver.rs` and
`tools/argument-comment-lint/lint_aspect.bzl` so Bazel can run the lint
as a custom `rustc_driver`
- switch repo-wide `just argument-comment-lint` and the Linux/macOS
`rust-ci` lanes to `bazel build --config=argument-comment-lint
//codex-rs/...`
- keep the Python/DotSlash wrappers as the package-scoped fallback path
and as the current Windows CI path
- gate the Dylint entrypoint behind a `bazel_native` feature so the
Bazel-native library avoids the `dylint_*` packaging stack
- update the aspect runtime environment so the driver can locate
`rustc_driver` correctly under remote execution
- keep the dedicated `tools/argument-comment-lint` package tests and
wrapper unit tests in CI so the source and packaged entrypoints remain
covered

## Verification

- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `cargo test` in `tools/argument-comment-lint`
- `bazel build
//tools/argument-comment-lint:argument-comment-lint-driver
--@rules_rust//rust/toolchain/channel=nightly`
- `bazel build --config=argument-comment-lint
//codex-rs/utils/path-utils:all`
- `bazel build --config=argument-comment-lint
//codex-rs/rollout:rollout`







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16106).
* #16120
* __->__ #16106
2026-03-28 12:41:56 -07:00
Michael Bolin
65f631c3d6 fix: fix comment linter lint violations in Linux-only code (#16118)
https://github.com/openai/codex/pull/16071 took care of this for
Windows, so this takes care of things for Linux.

We don't touch the CI jobs in this PR because
https://github.com/openai/codex/pull/16106 is going to be the real fix
there (including a major speedup!).
2026-03-28 11:09:41 -07:00
Eric Traut
61429a6c10 Rename tui_app_server to tui (#16104)
This is a follow-up to https://github.com/openai/codex/pull/15922. That
previous PR deleted the old `tui` directory and left the new
`tui_app_server` directory in place. This PR renames `tui_app_server` to
`tui` and fixes up all references.
2026-03-28 11:23:07 -06:00
Eric Traut
3d1abf3f3d Update PR babysitter skill for review replies and resolution (#16112)
This PR updates the "PR Babysitter" skill to clarify that non-actionable
review comments should receive a direct reply explaining why no change
is needed, and actionable review comments should be marked "resolved"
after they are addressed.
2026-03-28 10:35:20 -06:00
Felipe Coury
bede1d9e23 fix(tui): refresh footer on collaboration mode changes (#16026)
## Summary
- Moves status surface refresh (`refresh_status_surfaces` /
`refresh_status_line`) from `App` event handlers into `ChatWidget`
setters via a new `refresh_model_dependent_surfaces()` method
- Ensures model-dependent UI stays in sync whenever collaboration mode,
model, or reasoning effort changes, including the footer and terminal
title in both `tui` and `tui_app_server`
- Applies the fix to both `tui` and `tui_app_server` widgets

#15961

## Test plan
- [x] Added snapshot test
`status_line_model_with_reasoning_plan_mode_footer` verifying footer
renders correctly in plan mode
- [x] Added
`terminal_title_model_updates_on_model_change_without_manual_refresh` in
`tui_app_server`
- [ ] Verify switching collaboration modes updates the footer in real
TUI
- [ ] Verify model/reasoning effort changes reflect in the status bar
and terminal title

---------

Co-authored-by: Eric Traut <etraut@openai.com>
2026-03-28 08:55:32 -06:00
Michael Bolin
e39ddc61b1 bazel: add Windows gnullvm stack flags to unit test binaries (#16074)
## Summary

Add the Windows gnullvm stack-reserve flags to the `*-unit-tests-bin`
path in `codex_rust_crate()`.

## Why

This is the narrow code fix behind the earlier review comment on
[#16067](https://github.com/openai/codex/pull/16067). That comment was
stale relative to the workflow-only PR it landed on, but it pointed at a
real gap in `defs.bzl`.

Today, `codex_rust_crate()` already appends
`WINDOWS_GNULLVM_RUSTC_STACK_FLAGS` for:

- `rust_binary()` targets
- integration-test `rust_test()` targets

But the unit-test binary path still omitted those flags. That meant the
generated `*-unit-tests-bin` executables were not built the same way as
the rest of the Windows gnullvm executables in the macro.

## What Changed

- Added `WINDOWS_GNULLVM_RUSTC_STACK_FLAGS` to the `unit_test_binary`
`rust_test()` rule in `defs.bzl`
- Added a short comment explaining why unit-test binaries need the same
stack-reserve treatment as binaries and integration tests on Windows
gnullvm

## Testing

- `bazel query '//codex-rs/core:*'`
- `bazel query '//codex-rs/shell-command:*'`

Those queries load packages that exercise `codex_rust_crate()`,
including `*-unit-tests-bin` targets. The actual runtime effect is
Windows-specific, so the real end-to-end confirmation still comes from
Windows CI.
2026-03-27 22:11:49 -07:00
Michael Bolin
b94366441e ci: split fast PR Rust CI from full post-merge Cargo CI (#16072)
## Summary

Split the old all-in-one `rust-ci.yml` into:

- a PR-time Cargo workflow in `rust-ci.yml`
- a full post-merge Cargo workflow in `rust-ci-full.yml`

This keeps the PR path focused on fast Cargo-native hygiene plus the
Bazel `build` / `test` / `clippy` coverage in `bazel.yml`, while moving
the heavyweight Cargo-native matrix to `main`.

## Why

`bazel.yml` is now the main Rust verification workflow for pull
requests. It already covers the Bazel build, test, and clippy signal we
care about pre-merge, and it also runs on pushes to `main` to re-verify
the merged tree and help keep the BuildBuddy caches warm.

What was still missing was a clean split for the Cargo-native checks
that Bazel does not replace yet. The old `rust-ci.yml` mixed together:

- fast hygiene checks such as `cargo fmt --check` and `cargo shear`
- `argument-comment-lint`
- the full Cargo clippy / nextest / release-build matrix

That made every PR pay for the full Cargo matrix even though most of
that coverage is better treated as post-merge verification. The goal of
this change is to leave PRs with the checks we still want before merge,
while moving the heavier Cargo-native matrix off the review path.

## What Changed

- Renamed the old heavyweight workflow to `rust-ci-full.yml` and limited
it to `push` on `main` plus `workflow_dispatch`.
- Added a new PR-only `rust-ci.yml` that runs:
  - changed-path detection
  - `cargo fmt --check`
  - `cargo shear`
  - `argument-comment-lint` on Linux, macOS, and Windows
- `tools/argument-comment-lint` package tests when the lint itself or
its workflow wiring changes
- Kept the PR workflow's gatherer as the single required Cargo-native
status so branch protection can stay simple.
- Added `.github/workflows/README.md` to document the intended split
between `bazel.yml`, `rust-ci.yml`, and `rust-ci-full.yml`.
- Preserved the recent Windows `argument-comment-lint` behavior from
`e02fd6e1d3` in `rust-ci-full.yml`, and mirrored cross-platform lint
coverage into the PR workflow.

A few details are deliberate:

- The PR workflow still keeps the Linux lint lane on the
default-targets-only invocation for now, while macOS and Windows use the
broader released-linter path.
- This PR does not change `bazel.yml`; it changes the Cargo-native
workflow around the existing Bazel PR path.

## Testing

- Rebasing this change onto `main` after `e02fd6e1d3`
- `ruby -e 'require "yaml"; %w[.github/workflows/rust-ci.yml
.github/workflows/rust-ci-full.yml .github/workflows/bazel.yml].each {
|f| YAML.load_file(f) }'`
2026-03-27 21:08:08 -07:00
Michael Bolin
e02fd6e1d3 fix: clean up remaining Windows argument-comment-lint violations (#16071)
## Why

The initial `argument-comment-lint` rollout left Windows on
default-target coverage because there were still Windows-only callsites
failing under `--all-targets`. This follow-up cleans up those remaining
Windows-specific violations so the Windows CI lane can enforce the same
stricter coverage, leaving Linux as the remaining platform-specific
follow-up.

## What changed

- switched the Windows `rust-ci` argument-comment-lint step back to the
default wrapper invocation so it runs full-target coverage again
- added the required `/*param_name*/` annotations at Windows-gated
literal callsites in:
  - `codex-rs/windows-sandbox-rs/src/lib.rs`
  - `codex-rs/windows-sandbox-rs/src/elevated_impl.rs`
  - `codex-rs/tui_app_server/src/multi_agents.rs`
  - `codex-rs/network-proxy/src/proxy.rs`

## Validation

- Windows `argument comment lint` CI on this PR
2026-03-27 20:48:21 -07:00
Michael Bolin
f4d0cbfda6 ci: run Bazel clippy on Windows gnullvm (#16067)
## Why

We want more of the pre-merge Rust signal to come from `bazel.yml`,
especially on Windows. The Bazel test workflow already exercises
`x86_64-pc-windows-gnullvm`, but the Bazel clippy job still only ran on
Linux x64 and macOS arm64. That left a gap where Windows-only Bazel lint
breakages could slip through until the Cargo-based workflow ran.

This change keeps the fix narrow. Rather than expanding the Bazel clippy
target set or changing the shared setup logic, it extends the existing
clippy matrix to the same Windows GNU toolchain that the Bazel test job
already uses.

## What Changed

- add `windows-latest` / `x86_64-pc-windows-gnullvm` to the `clippy` job
matrix in `.github/workflows/bazel.yml`
- update the nearby workflow comment to explain that the goal is to get
Bazel-native Windows lint coverage on the same toolchain as the Bazel
test lane
- leave the Bazel clippy scope unchanged at `//codex-rs/...
-//codex-rs/v8-poc:all`

## Verification

- parsed `.github/workflows/bazel.yml` successfully with Ruby
`YAML.load_file`
2026-03-27 20:47:22 -07:00
Michael Bolin
343d1af3da bazel: enable the full Windows gnullvm CI path (#15952)
## Why

This PR is the current, consolidated follow-up to the earlier Windows
Bazel attempt in #11229. The goal is no longer just to get a tiny
Windows smoke job limping along: it is to make the ordinary Bazel CI
path usable on `windows-latest` for `x86_64-pc-windows-gnullvm`, with
the same broad `//...` test shape that macOS and Linux already use.

The earlier smoke-list version of this work was useful as a foothold,
but it was not a good long-term landing point. Windows Bazel kept
surfacing real issues outside that allowlist:

- GitHub's Windows runner exposed runfiles-manifest bugs such as
`FINDSTR: Cannot open D:MANIFEST`, which broke Bazel test launchers even
when the manifest file existed.
- `rules_rs`, `rules_rust`, LLVM extraction, and Abseil still needed
`windows-gnullvm`-specific fixes for our hermetic toolchain.
- the V8 path needed more work than just turning the Windows matrix
entry back on: `rusty_v8` does not ship Windows GNU artifacts in the
same shape we need, and Bazel's in-tree V8 build needed a set of Windows
GNU portability fixes.

Windows performance pressure also pushed this toward a full solution
instead of a permanent smoke suite. During this investigation we hit
targets such as `//codex-rs/shell-command:shell-command-unit-tests` that
were much more expensive on Windows because they repeatedly spawn real
PowerShell parsers (see #16057 for one concrete example of that
pressure). That made it much more valuable to get the real Windows Bazel
path working than to keep iterating on a narrowly curated subset.

The net result is that this PR now aims for the same CI contract on
Windows that we already expect elsewhere: keep standalone
`//third_party/v8:all` out of the ordinary Bazel lane, but allow V8
consumers under `//codex-rs/...` to build and test transitively through
`//...`.

## What Changed

### CI and workflow wiring

- re-enable the `windows-latest` / `x86_64-pc-windows-gnullvm` Bazel
matrix entry in `.github/workflows/bazel.yml`
- move the Windows Bazel output root to `D:\b` and enable `git config
--global core.longpaths true` in
`.github/actions/setup-bazel-ci/action.yml`
- keep the ordinary Bazel target set on Windows aligned with macOS and
Linux by running `//...` while excluding only standalone
`//third_party/v8:all` targets from the normal lane

### Toolchain and module support for `windows-gnullvm`

- patch `rules_rs` so `windows-gnullvm` is modeled as a distinct Windows
exec/toolchain platform instead of collapsing into the generic Windows
shape
- patch `rules_rust` build-script environment handling so llvm-mingw
build-script probes do not inherit unsupported `-fstack-protector*`
flags
- patch the LLVM module archive so it extracts cleanly on Windows and
provides the MinGW libraries this toolchain needs
- patch Abseil so its thread-local identity path matches the hermetic
`windows-gnullvm` toolchain instead of taking an incompatible MinGW
pthread path
- keep both MSVC and GNU Windows targets in the generated Cargo metadata
because the current V8 release-asset story still uses MSVC-shaped names
in some places while the Bazel build targets the GNU ABI

### Windows test-launch and binary-behavior fixes

- update `workspace_root_test_launcher.bat.tpl` to read the runfiles
manifest directly instead of shelling out to `findstr`, which was the
source of the `D:MANIFEST` failures on the GitHub Windows runner
- thread a larger Windows GNU stack reserve through `defs.bzl` so
Bazel-built binaries that pull in V8 behave correctly both under normal
builds and under `bazel test`
- remove the no-longer-needed Windows bootstrap sh-toolchain override
from `.bazelrc`

### V8 / `rusty_v8` Windows GNU support

- export and apply the new Windows GNU patch set from
`patches/BUILD.bazel` / `MODULE.bazel`
- patch the V8 module/rules/source layers so the in-tree V8 build can
produce Windows GNU archives under Bazel
- teach `third_party/v8/BUILD.bazel` to build Windows GNU static
archives in-tree instead of aliasing them to the MSVC prebuilts
- reuse the Linux release binding for the experimental Windows GNU path
where `rusty_v8` does not currently publish a Windows GNU binding
artifact

## Testing

- the primary end-to-end validation for this work is the `Bazel`
workflow plus `v8-canary`, since the hard parts are Windows-specific and
depend on real GitHub runner behavior
- before consolidation back onto this PR, the same net change passed the
full Bazel matrix in [run
23675590471](https://github.com/openai/codex/actions/runs/23675590471)
and passed `v8-canary` in [run
23675590453](https://github.com/openai/codex/actions/runs/23675590453)
- those successful runs included the `windows-latest` /
`x86_64-pc-windows-gnullvm` Bazel job with the ordinary `//...` path,
not the earlier Windows smoke allowlist

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15952).
* #16067
* __->__ #15952
2026-03-27 20:37:03 -07:00
Michael Bolin
5037a2d199 refactor: rewrite argument-comment lint wrappers in Python (#16063)
## Why

The `argument-comment-lint` entrypoints had grown into two shell
wrappers with duplicated parsing, environment setup, and Cargo
forwarding logic. The recent `--` separator regression was a good
example of the problem: the behavior was subtle, easy to break, and hard
to verify.

This change rewrites those wrappers in Python so the control flow is
easier to follow, the shared behavior lives in one place, and the tricky
argument/defaulting paths have direct test coverage.

## What changed

- replaced `tools/argument-comment-lint/run.sh` and
`tools/argument-comment-lint/run-prebuilt-linter.sh` with Python
entrypoints: `run.py` and `run-prebuilt-linter.py`
- moved shared wrapper behavior into
`tools/argument-comment-lint/wrapper_common.py`, including:
  - splitting lint args from forwarded Cargo args after `--`
- defaulting repo runs to `--manifest-path codex-rs/Cargo.toml
--workspace --no-deps`
- defaulting non-`--fix` runs to `--all-targets` unless the caller
explicitly narrows the target set
  - setting repo defaults for `DYLINT_RUSTFLAGS` and `CARGO_INCREMENTAL`
- kept the prebuilt wrapper thin: it still just resolves the packaged
DotSlash entrypoint, keeps `rustup` shims first on `PATH`, infers
`RUSTUP_HOME` when needed, and then launches the packaged `cargo-dylint`
path
- updated `justfile`, `rust-ci.yml`, and
`tools/argument-comment-lint/README.md` to use the Python entrypoints
- updated `rust-ci` so the package job runs Python syntax checks plus
the new wrapper unit tests, and the OS-specific lint jobs invoke the
wrappers through an explicit Python interpreter

This is a follow-up to #16054: it keeps the current lint semantics while
making the wrapper logic maintainable enough to iterate on safely.

## Validation

- `python3 -m py_compile tools/argument-comment-lint/wrapper_common.py
tools/argument-comment-lint/run.py
tools/argument-comment-lint/run-prebuilt-linter.py
tools/argument-comment-lint/test_wrapper_common.py`
- `python3 -m unittest discover -s tools/argument-comment-lint -p
'test_*.py'`
- `python3 ./tools/argument-comment-lint/run-prebuilt-linter.py -p
codex-terminal-detection -- --lib`
- `python3 ./tools/argument-comment-lint/run.py -p
codex-terminal-detection -- --lib`
2026-03-27 19:42:30 -07:00
Michael Bolin
142681ef93 shell-command: reuse a PowerShell parser process on Windows (#16057)
## Why

`//codex-rs/shell-command:shell-command-unit-tests` became a real
bottleneck in the Windows Bazel lane because repeated calls to
`is_safe_command_windows()` were starting a fresh PowerShell parser
process for every `powershell.exe -Command ...` assertion.

PR #16056 was motivated by that same bottleneck, but its test-only
shortcut was the wrong layer to optimize because it weakened the
end-to-end guarantee that our runtime path really asks PowerShell to
parse the command the way we expect.

This PR attacks the actual cost center instead: it keeps the real
PowerShell parser in the loop, but turns that parser into a long-lived
helper process so both tests and the runtime safe-command path can reuse
it across many requests.

## What Changed

- add `shell-command/src/command_safety/powershell_parser.rs`, which
keeps one mutex-protected parser process per PowerShell executable path
and speaks a simple JSON-over-stdio request/response protocol
- turn `shell-command/src/command_safety/powershell_parser.ps1` into a
long-running parser server with comments explaining the protocol, the
AST-shape restrictions, and why unsupported constructs are rejected
conservatively
- keep request ids and a one-time respawn path so a dead or
desynchronized cached child fails closed instead of silently returning
mixed parser output
- preserve separate parser processes for `powershell.exe` and
`pwsh.exe`, since they do not accept the same language surface
- avoid a direct `PipelineChainAst` type reference in the PowerShell
script so the parser service still runs under Windows PowerShell 5.1 as
well as newer `pwsh`
- make `shell-command/src/command_safety/windows_safe_commands.rs`
delegate to the new parser utility instead of spawning a fresh
PowerShell process for every parse
- add a Windows-only unit test that exercises multiple sequential
requests against the same parser process

## Testing

- adds a Windows-only parser-reuse unit test in `powershell_parser.rs`
- the main end-to-end verification for this change is the Windows CI
lane, because the new service depends on real `powershell.exe` /
`pwsh.exe` behavior
2026-03-27 19:33:41 -07:00
Joe Liccini
71923f43a7 Support Codex CLI stdin piping for codex exec (#15917)
# Summary

Claude Code supports a useful prompt-plus-stdin workflow:

```bash
echo "complex input..." | claude -p "summarize concisely"
```

Codex previously did not support the equivalent `codex exec` form. While
`codex exec` could read the prompt from stdin, it could not combine
piped input with an explicit prompt argument.

This change adds that missing workflow:

```bash
echo "complex input..." | codex exec "summarize concisely"
```

With this change, when `codex exec` receives both a positional prompt
and piped stdin, the prompt remains the instruction and stdin is passed
along as structured `<stdin>...</stdin>` context.

Example:

```bash
curl https://jsonplaceholder.typicode.com/comments \
  | ./target/debug/codex exec --skip-git-repo-check "format the top 20 items into a markdown table" \
  > table.md
```

This PR also adds regression coverage for:
- prompt argument + piped stdin
- legacy stdin-as-prompt behavior
- `codex exec -` forced-stdin behavior
- empty-stdin error cases

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-28 02:21:22 +00:00