codex

mirror of https://github.com/openai/codex.git synced 2026-04-26 15:45:02 +00:00

Author	SHA1	Message	Date
Ruslan Nigmatullin	fe1ef5e637	app-server: tolerate shell startup output in shell command tests Motivation: User-shell command tests can observe shell startup output from the local environment before or around the command output being asserted. Exact output equality makes these tests depend on machine-specific shell profile behavior. Summary: Update thread shell command tests to wait for output deltas containing the expected command output and to assert aggregated output contains the expected text instead of requiring exact equality. Testing: - cargo test -p codex-app-server --test all thread_shell_command_runs_as_standalone_turn_and_persists_history - cargo test -p codex-app-server --test all thread_shell_command_uses_existing_active_turn - cargo test -p codex-app-server	2026-04-10 14:31:46 -07:00
Michael Bolin	a70aee1a1e	Fix Windows Bazel app-server trust tests (#16711) ## Why Extracted from [#16528](https://github.com/openai/codex/pull/16528) so the Windows Bazel app-server test failures can be reviewed independently from the rest of that PR. This PR targets: - `suite::v2::thread_shell_command::thread_shell_command_runs_as_standalone_turn_and_persists_history` - `suite::v2::thread_start::thread_start_with_elevated_sandbox_trusts_project_and_followup_loads_project_config` - `suite::v2::thread_start::thread_start_with_nested_git_cwd_trusts_repo_root` There were two Windows-specific assumptions baked into those tests and the underlying trust lookup: - project trust keys were persisted and looked up using raw path strings, but Bazel's Windows test environment can surface canonicalized paths with `\\?\` / UNC prefixes or normalized symlink/junction targets, so follow-up `thread/start` requests no longer matched the project entry that had just been written - `item/commandExecution/outputDelta` assertions compared exact trailing line endings even though shell output chunk boundaries and CRLF handling can differ on Windows, and Bazel made that timing-sensitive mismatch visible There was also one behavior bug separate from the assertion cleanup: `thread/start` decided whether to persist trust from the final resolved sandbox policy, but on Windows an explicit `workspace-write` request may be downgraded to `read-only`. That incorrectly skipped writing trust even though the request had asked to elevate the project, so the new logic also keys off the requested sandbox mode. ## What - Canonicalize project trust keys when persisting/loading `[projects]` entries, while still accepting legacy raw keys for existing configs. - Persist project trust when `thread/start` explicitly requests `workspace-write` or `danger-full-access`, even if the resolved policy is later downgraded on Windows. - Make the Windows app-server tests compare persisted trust paths and command output deltas in a path/newline-normalized way. ## Verification - Existing app-server v2 tests cover the three failing Windows Bazel cases above.	2026-04-03 21:41:25 +00:00
Michael Bolin	862158b9e9	app-server: make thread/shellCommand tests shell-aware (#16635) ## Why `thread/shellCommand` executes the raw command string through the current user shell, which is PowerShell on Windows. The two v2 app-server tests in `app-server/tests/suite/v2/thread_shell_command.rs` used POSIX `printf`, so Bazel CI on Windows failed with `printf` not being recognized as a PowerShell command. For reference, the user-shell task wraps commands with the active shell before execution: [`core/src/tasks/user_shell.rs`](`7a3eec6fdb/codex-rs/core/src/tasks/user_shell.rs (L120-L126)`). ## What Changed Added a test-local helper that builds a shell-appropriate output command and expected newline sequence from `default_user_shell()`: - PowerShell: `Write-Output '...'` with `\r\n` - Cmd: `echo ...` with `\r\n` - POSIX shells: `printf '%s\n' ...` with `\n` Both `thread_shell_command_runs_as_standalone_turn_and_persists_history` and `thread_shell_command_uses_existing_active_turn` now use that helper. ## Verification - `cargo test -p codex-app-server thread_shell_command`	2026-04-02 17:28:47 -07:00
Michael Bolin	61dfe0b86c	chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054) ## Why `argument-comment-lint` was green in CI even though the repo still had many uncommented literal arguments. The main gap was target coverage: the repo wrapper did not force Cargo to inspect test-only call sites, so examples like the `latest_session_lookup_params(true, ...)` tests in `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path. This change cleans up the existing backlog, makes the default repo lint path cover all Cargo targets, and starts rolling that stricter CI enforcement out on the platform where it is currently validated. ## What changed - mechanically fixed existing `argument-comment-lint` violations across the `codex-rs` workspace, including tests, examples, and benches - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to `--all-targets` unless the caller explicitly narrows the target set - fixed both wrappers so forwarded cargo arguments after `--` are preserved with a single separator - documented the new default behavior in `tools/argument-comment-lint/README.md` - updated `rust-ci` so the macOS lint lane keeps the plain wrapper invocation and therefore enforces `--all-targets`, while Linux and Windows temporarily pass `-- --lib --bins` That temporary CI split keeps the stricter all-targets check where it is already cleaned up, while leaving room to finish the remaining Linux- and Windows-specific target-gated cleanup before enabling `--all-targets` on those runners. The Linux and Windows failures on the intermediate revision were caused by the wrapper forwarding bug, not by additional lint findings in those lanes. ## Validation - `bash -n tools/argument-comment-lint/run.sh` - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh` - shell-level wrapper forwarding check for `-- --lib --bins` - shell-level wrapper forwarding check for `-- --tests` - `just argument-comment-lint` - `cargo test` in `tools/argument-comment-lint` - `cargo test -p codex-terminal-detection` ## Follow-up - Clean up remaining Linux-only target-gated callsites, then switch the Linux lint lane back to the plain wrapper invocation. - Clean up remaining Windows-only target-gated callsites, then switch the Windows lint lane back to the plain wrapper invocation.	2026-03-27 19:00:44 -07:00
Ahmed Ibrahim	2e22885e79	Split features into codex-features crate (#15253) - Split the feature system into a new `codex-features` crate. - Cut `codex-core` and workspace consumers over to the new config and warning APIs. Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-19 20:12:07 -07:00
Eric Traut	01df50cf42	Add thread/shellCommand to app server API surface (#14988) This PR adds a new `thread/shellCommand` app server API so clients can implement `!` shell commands. These commands are executed within the sandbox, and the command text and output are visible to the model. The internal implementation mirrors the current TUI `!` behavior. - persist shell command execution as `CommandExecution` thread items, including source and formatted output metadata - bridge live and replayed app-server command execution events back into the existing `tui_app_server` exec rendering path This PR also wires `tui_app_server` to submit `!` commands through the new API.	2026-03-18 23:42:40 -06:00
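
The shell-aware test helper described in commit 862158b9e9 (#16635) could look roughly like the sketch below. This is an illustration only, not the code from the repository: `ShellKind` and `output_command` are hypothetical names, and the real helper derives the shell from `default_user_shell()`, whose return type is not shown in the log. The sketch also assumes the text to print contains no quotes or shell metacharacters.

```rust
/// Hypothetical stand-in for the shell reported by `default_user_shell()`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShellKind {
    PowerShell,
    Cmd,
    Posix,
}

/// Builds a command that prints `text` in the given shell, paired with the
/// newline sequence that shell is expected to emit after it.
/// Assumption: `text` needs no quoting or escaping.
fn output_command(shell: ShellKind, text: &str) -> (String, &'static str) {
    match shell {
        // PowerShell has no `printf`; `Write-Output` ends lines with CRLF.
        ShellKind::PowerShell => (format!("Write-Output '{text}'"), "\r\n"),
        // cmd.exe `echo` also ends lines with CRLF.
        ShellKind::Cmd => (format!("echo {text}"), "\r\n"),
        // POSIX shells: `printf '%s\n' ...` ends lines with LF.
        ShellKind::Posix => (format!("printf '%s\\n' {text}"), "\n"),
    }
}
```

A test would send the returned command string through `thread/shellCommand` and build its expected output from the text plus the returned newline sequence, so the same test body works on every platform.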

6 Commits
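
Commits fe1ef5e637 and a70aee1a1e both relax output assertions: the tests check that the aggregated output contains the expected text, and compare it in a newline-normalized way, instead of requiring exact equality. A minimal sketch of that idea follows. The function names `normalize_newlines` and `output_contains` are hypothetical and do not come from the repository.

```rust
/// Converts CRLF line endings to LF so comparisons ignore platform differences.
fn normalize_newlines(s: &str) -> String {
    s.replace("\r\n", "\n")
}

/// Returns true when `aggregated` contains `expected` after newline
/// normalization. Extra text before or after the match (for example shell
/// startup output from a local profile) is tolerated.
fn output_contains(aggregated: &str, expected: &str) -> bool {
    let haystack = normalize_newlines(aggregated);
    let needle = normalize_newlines(expected);
    haystack.contains(needle.as_str())
}
```

Containment is deliberately weaker than equality: it trades some strictness for tests that no longer depend on machine-specific shell profile behavior or on how output is split into chunks.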