## Why
`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.
This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.
## What changed
- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`
That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.
## Validation
- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`
## Follow-up
- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.
## Why
`codex-utils-pty` and `codex-windows-sandbox` were the remaining crates
in `codex-rs` that still overrode the workspace's Rust 2024 edition.
Moving them forward in a separate PR keeps the baseline edition update
isolated from the follow-on Bazel clippy workflow in #15955, while
making linting and formatting behavior consistent with the rest of the
workspace.
This PR also needs Cargo and Bazel to agree on the edition for
`codex-windows-sandbox`. Without the Bazel-side sync, the experimental
Bazel app-server builds fail once they compile `windows-sandbox-rs`.
## What changed
- switch `codex-rs/utils/pty` and `codex-rs/windows-sandbox-rs` to
`edition = "2024"`
- update `codex-utils-pty` callsites and tests to use the collapsed `if
let` form that Clippy expects under the new edition
- fix the Rust 2024 fallout in `windows-sandbox-rs`, including the
reserved `gen` identifier, `unsafe extern` requirements, and new Clippy
findings that surfaced under the edition bump
- keep the edition bump separate from a larger unsafe cleanup by
temporarily allowing `unsafe_op_in_unsafe_fn` in the Windows entrypoint
modules that now report it under Rust 2024
- update `codex-rs/windows-sandbox-rs/BUILD.bazel` to `crate_edition =
"2024"` so Bazel compiles the crate with the same edition as Cargo
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/15954).
* #15976
* #15955
* __->__ #15954
## Why
`zsh-fork` sessions launched through unified-exec need the escalation
socket to survive the wrapper -> server -> child handoff so later
intercepted `exec()` calls can still reach the escalation server.
The inherited-fd spawn path also needs to avoid closing Rust's internal
exec-error pipe, and the shell-escalation handoff needs to tolerate the
receive-side case where a transferred fd is installed into the same
stdio slot it will be mapped onto.
## What Changed
- Added `SpawnLifecycle::inherited_fds()` in
`codex-rs/core/src/unified_exec/process.rs` and threaded inherited fds
through `codex-rs/core/src/unified_exec/process_manager.rs` so
unified-exec can preserve required descriptors across both PTY and
no-stdin pipe spawn paths.
- Updated `codex-rs/core/src/tools/runtimes/shell/zsh_fork_backend.rs`
to expose the escalation socket fd through the spawn lifecycle.
- Added inherited-fd-aware spawn helpers in
`codex-rs/utils/pty/src/pty.rs` and `codex-rs/utils/pty/src/pipe.rs`,
including Unix pre-exec fd pruning that preserves requested inherited
fds while leaving `FD_CLOEXEC` descriptors alone. The pruning helper is
now named `close_inherited_fds_except()` to better describe that
behavior.
- Updated `codex-rs/shell-escalation/src/unix/escalate_client.rs` to
duplicate local stdio before transfer and send destination stdio numbers
in `SuperExecMessage`, so the wrapper keeps using its own
`stdin`/`stdout`/`stderr` until the escalated child takes over.
- Updated `codex-rs/shell-escalation/src/unix/escalate_server.rs` so the
server accepts the overlap case where a received fd reuses the same
stdio descriptor number that the child setup will target with `dup2`.
- Added comments around the PTY stdio wiring and the overlap regression
helper to make the fd handoff and controlling-terminal setup easier to
follow.
## Verification
- `cargo test -p codex-utils-pty`
- covers preserved-fd PTY spawn behavior, PTY resize, Python REPL
continuity, exec-failure reporting, and the no-stdin pipe path
- `cargo test -p codex-shell-escalation`
- covers duplicated-fd transfer on the client side and verifies the
overlap case by passing a pipe-backed stdin payload through the
server-side `dup2` path
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13644).
* #14624
* __->__ #13644
## What changed
- keep the explicit stdin-close behavior after writing so the child
still receives EOF deterministically
- on Windows, stop using `python -c` for the round-trip assertion and
instead run a native `cmd.exe` pipeline that reads one line from stdin
with `set /p` and echoes it back
- send `
` on Windows so the stdin payload matches the platform-native line
ending the shell reader expects
## Why this fixes flakiness
The failing branch-local flake was not in `spawn_pipe_process` itself.
The child exited cleanly, but the Windows ARM runner sometimes produced
an empty stdout string when the test used Python as the stdin consumer.
That makes the test sensitive to Python startup and stdin-close timing
rather than the pipe primitive we actually want to validate. Switching
the Windows path to a native `cmd.exe` reader keeps the assertion
focused on our pipe behavior: bytes written to stdin should come back on
stdout before EOF closes the process. The explicit `
` write removes line-ending ambiguity on Windows.
## Scope
- test-only
- no production logic change
## Summary
- run the split stdout/stderr PTY test through the normal shell helper
on every platform
- use a Windows-native command string instead of depending on Python to
emit split streams
- assert CRLF line endings on Windows explicitly
## Why this fixes the flake
The earlier PTY split-output test used a Python one-liner on Windows
while the rest of the file exercised shell-command behavior. That made
the test depend on runner-local Python availability and masked the real
Windows shell output shape. Using a native cmd-compatible command and
asserting the actual CRLF output makes the split stdout/stderr coverage
deterministic on Windows runners.
(Experimental)
This PR adds a first MVP for hooks, with SessionStart and Stop
The core design is:
- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary
On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:
- new live notifications: hook/started, hook/completed
- persisted/replayed hook results live on Turn.hookRuns
- we intentionally did not add hook-specific ThreadItem variants
Hooks messages are not persisted, they remain ephemeral. The context
changes they add are (they get appended to the user's prompt)
## What changed
- The PTY Python REPL test now starts Python with a startup marker
already embedded in argv.
- The test waits for that marker in PTY output before making assertions.
## Why this fixes the flake
- The old version tried to probe the live REPL almost immediately after
spawn.
- That races PTY initialization, Python startup, and prompt buffering,
all of which vary across platforms and CI load.
- By having the child process emit a known marker as part of its own
startup path, the test gets a deterministic synchronization point that
comes from the process under test rather than from guessed timing.
## Scope
- Test-only change.
Enhance pty utils:
* Support closing stdin
* Separate stderr and stdout streams to allow consumers differentiate them
* Provide compatibility helper to merge both streams back into combined one
* Support specifying terminal size for pty, including on-demand resizes while process is already running
* Support terminating the process while still consuming its outputs
Hardens PTY Python REPL test and make MCP test startup deterministic
**Summary**
- `utils/pty/src/tests.rs`
- Added a REPL readiness handshake (`wait_for_python_repl_ready`) that
repeatedly sends a marker and waits for it in PTY output before sending
test commands.
- Updated `pty_python_repl_emits_output_and_exits` to:
- wait for readiness first,
- preserve startup output,
- append output collected through process exit.
- Reduces Windows/ConPTY flakiness from early stdin writes racing REPL
startup.
- `mcp-server/tests/suite/codex_tool.rs`
- Avoid remote model refresh during MCP test startup, reducing
timeout-prone nondeterminism.