Replace the Unix shell lookup path in `codex-rs/core/src/shell.rs` to
use
`libc::getpwuid_r()` instead of `libc::getpwuid()` when resolving the
current
user's shell.
Why:
- `getpwuid()` can return pointers into libc-managed shared storage
- on the musl static Linux build, concurrent callers can race on that
storage
- this matches the crash pattern reported in tmux/Linux sessions with
parallel
shell activity
Refs:
- Fixes#13842
robust to Nix shell paths (#12476)
## Summary
- Updated `codex-rs/core/src/shell.rs` tests for shell detection to stop
asserting hardcoded shell paths.
- `detects_bash` and `detects_sh` now assert executable basenames
(`bash`, `sh`) rather than `/bin/*`/`/usr/bin/*` absolute paths.
- This keeps behavior the same while avoiding failures in Nix
environments where shells are resolved from `/nix/store/.../bin`.
## Testing
- `nix develop .#default --command sh -lc 'export
PKG_CONFIG_PATH=/nix/store/6az1q591wwlgazzskngr6rl7gmhpyvnc-libcap-2.77-dev/lib/pkgconfig:/nix/store/fgm3pz8486ksh3f94629lpb7xjr2wjp7-openssl-3.6.0-dev/lib/pkgconfig:$PKG_CONFIG_PATH;
export PKG_CONFIG_PATH_FOR_TARGET=$PKG_CONFIG_PATH; cd
/home/alex/workspace/openai/codex/codex-rs && cargo test -p codex-core
--lib detects_bash && cargo test -p codex-core --lib detects_sh'`
## Why
The two failing tests previously hardcoded fixed paths and failed under
the Nix shell due to Nix-provided shell binary locations.
## Links
- Bug report / enhancement request: not publicly filed yet; this was
reproduced in the local Nix environment.
- Use /bin/sh instead of /bin/bash on FreeBSD/OpenBSD in the process
group timeout test to avoid command-not-found failures.
- Accept /usr/local/bin/bash as a valid SHELL path to match common
FreeBSD installations.
- Switch the shell serialization duration test to /bin/sh for improved
portability across Unix platforms.
With this change, `cargo test -p codex-core --lib` runs and passes on
FreeBSD.
## Summary
Enables shell_command for windows users, and starts adding some basic
command parsing here, to at least remove powershell prefixes. We'll
follow this up with command parsing but I wanted to land this change
separately with some basic UX.
**NOTE**: This implementation parses bash and powershell on both
platforms. In theory this is possible, since you can use git bash on
windows or powershell on linux. In practice, this may not be worth the
complexity of supporting, so I don't feel strongly about the current
approach vs. platform-specific branching.
## Testing
- [x] Added a bunch of tests
- [x] Ran on both windows and os x
This adds support for a new variant of the shell tool behind a flag. To
test, run `codex` with `--enable shell_command_tool`, which will
register the tool with Codex under the name `shell_command` that accepts
the following shape:
```python
{
command: str
workdir: str | None,
timeout_ms: int | None,
with_escalated_permissions: bool | None,
justification: str | None,
}
```
This is comparable to the existing tool registered under
`shell`/`container.exec`. The primary difference is that it accepts
`command` as a `str` instead of a `str[]`. The `shell_command` tool
executes by running `execvp(["bash", "-lc", command])`, though the exact
arguments to `execvp(3)` depend on the user's default shell.
The hypothesis is that this will simplify things for the model. For
example, on Windows, instead of generating:
```json
{"command": ["pwsh.exe", "-NoLogo", "-Command", "ls -Name"]}
```
The model could simply generate:
```json
{"command": "ls -Name"}
```
As part of this change, I extracted some logic out of `user_shell.rs` as
`Shell::derive_exec_args()` so that it can be reused in
`codex-rs/core/src/tools/handlers/shell.rs`. Note the original code
generated exec arg lists like:
```javascript
["bash", "-lc", command]
["zsh", "-lc", command]
["pwsh.exe", "-NoProfile", "-Command", command]
```
Using `-l` for Bash and Zsh, but then specifying `-NoProfile` for
PowerShell seemed inconsistent to me, so I changed this in the new
implementation while also adding a `use_login_shell: bool` option to
make this explicit. If we decide to add a `login: bool` to
`ShellCommandToolCallParams` like we have for unified exec:
807e2c27f0/codex-rs/core/src/tools/handlers/unified_exec.rs (L33-L34)
Then this should make it straightforward to support.
Found that the VS Code Codex extension throws “Error starting
conversation” when initializing a conversation with Git for Windows’
bash on PATH.
Debugging showed the bash-detection logic did not return as expected;
this change makes it reliable in that scenario.
Possibly related to issue #2841.
## Summary
SendUserTurn has not been correctly handling updates to policies. While
the tui protocol handles this in `Op::OverrideTurnContext`, the
SendUserTurn should be appending `EnvironmentContext` messages when the
sandbox settings change. MCP client behavior should match the cli
behavior, so we update `SendUserTurn` message to match.
## Testing
- [x] Added prompt caching tests
Created this PR by:
- adding `redundant_clone` to `[workspace.lints.clippy]` in
`cargo-rs/Cargol.toml`
- running `cargo clippy --tests --fix`
- running `just fmt`
Though I had to clean up one instance of the following that resulted:
```rust
let codex = codex;
```
## Session snapshot
For POSIX shell, the goal is to take a snapshot of the interactive shell
environment, store it in a session file located in `.codex/` and only
source this file for every command that is run.
As a result, if a snapshot files exist, `bash -lc <CALL>` get replaced
by `bash -c <CALL>`.
This also fixes the issue that `bash -lc` does not source `.bashrc`,
resulting in missing env variables and aliases in the codex session.
## POSIX unification
Unify `bash` and `zsh` shell into a POSIX shell. The rational is that
the tool will not use any `zsh` specific capabilities.
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
## What? Why? How?
- When running on Windows, codex often tries to invoke bash commands,
which commonly fail (unless WSL is installed)
- Fix: Detect if powershell is available and, if so, route commands to
it
- Also add a shell_name property to environmental context for codex to
default to powershell commands when running in that environment
## Testing
- Tested within WSL and powershell (e.g. get top 5 largest files within
a folder and validated that commands generated were powershell commands)
- Tested within Zsh
- Updated unit tests
---------
Co-authored-by: Eddy Escardo <eddy@openai.com>
Codex created this PR from the following prompt:
> upgrade this entire repo to Rust 1.89. Note that this requires
updating codex-rs/rust-toolchain.toml as well as the workflows in
.github/. Make sure that things are "clippy clean" as this change will
likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests`
are sufficient to check for correctness
Note this modifies a lot of lines because it folds nested `if`
statements using `&&`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2465).
* #2467
* __->__ #2465
This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible
Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around
This PR does two things because after I got deep into the first one I
started pulling on the thread to the second:
- Makes `ConversationManager` the place where all in-memory
conversations are created and stored. Previously, `MessageProcessor` in
the `codex-mcp-server` crate was doing this via its `session_map`, but
this is something that should be done in `codex-core`.
- It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded
throughout our code. I think this made sense at one time, but now that
we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event,
I don't think this was quite right, so I removed it. For `codex exec`
and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we
no longer make `Notify` a field of `Codex` or `CodexConversation`.
Changes of note:
- Adds the files `conversation_manager.rs` and `codex_conversation.rs`
to `codex-core`.
- `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`:
other crates must use `CodexConversation` instead (which is created via
`ConversationManager`).
- `core/src/codex_wrapper.rs` has been deleted in favor of
`ConversationManager`.
- `ConversationManager::new_conversation()` returns `NewConversation`,
which is in line with the `new_conversation` tool we want to add to the
MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so
we eliminate checks in cases like `codex-rs/core/tests/client.rs` to
verify `SessionConfiguredEvent` is the first event because that is now
internal to `ConversationManager`.
- Quite a bit of code was deleted from
`codex-rs/mcp-server/src/message_processor.rs` since it no longer has to
manage multiple conversations itself: it goes through
`ConversationManager` instead.
- `core/tests/live_agent.rs` has been deleted because I had to update a
bunch of tests and all the tests in here were ignored, and I don't think
anyone ever ran them, so this was just technical debt, at this point.
- Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I
hope to refactor the blandly-named `util.rs` into more descriptive
files).
- In general, I started replacing local variables named `codex` as
`conversation`, where appropriate, though admittedly I didn't do it
through all the integration tests because that would have added a lot of
noise to this PR.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240).
* #2264
* #2263
* __->__ #2240
## Summary
A split-up PR of #1763 , stacked on top of a tools refactor #1858 to
make the change clearer. From the previous summary:
> Let's try something new: tell the model about the sandbox, and let it
decide when it will need to break the sandbox. Some local testing
suggests that it works pretty well with zero iteration on the prompt!
## Testing
- [x] Added unit tests
- [x] Tested locally and it appears to work smoothly!
## Summary
- stream command stdout as `ExecCommandStdout` events
- forward streamed stdout to clients and ignore in human output
processor
- adjust call sites for new streaming API