Compare commits

...

217 Commits

Author SHA1 Message Date
Gabriel Peal
3be407fcbe Works 2025-08-07 13:39:55 -07:00
Michael Bolin
f74fe7af7b fix: fix mistaken bitwise OR in #1949 (#1957)
This is hard for me to test conclusively because I have the default of
`ctrl left/right` used to migrate between Spaces on macOS.
2025-08-07 20:11:06 +00:00
Jeremy Rose
c787603812 ctrl+arrows also move words (#1949)
this was removed at some point, but this is a common keybind for word
left/right.
2025-08-07 18:27:44 +00:00
Ed Bayes
e07776ccc9 update readme (#1948)
Co-authored-by: Alexander Embiricos <ae@openai.com>
2025-08-07 11:20:53 -07:00
pakrym-oai
f23c3066c8 Add capacity error (#1947) 2025-08-07 10:46:43 -07:00
pakrym-oai
a593b1c3ab Use different field for error type (#1945) 2025-08-07 10:20:33 -07:00
Michael Bolin
107d2ce4e7 fix: change OPENAI_DEFAULT_MODEL to "gpt-5" (#1943) 2025-08-07 10:13:13 -07:00
Ed Bayes
09adbf9132 remove composer bg (#1944)
passes local tests
2025-08-07 10:04:49 -07:00
pakrym-oai
62ed5907f9 Better usage errors (#1941)
<img width="771" height="279" alt="image"
src="https://github.com/user-attachments/assets/e56f967f-bcd7-49f7-8a94-3d88df68b65a"
/>
2025-08-07 09:46:13 -07:00
Dylan
bc28b87c7b [config] Onboarding flow with persistence (#1929)
## Summary
In collaboration with @gpeal: upgrade the onboarding flow, and persist
user settings.

---------

Co-authored-by: Gabriel Peal <gabriel@openai.com>
2025-08-07 09:27:38 -07:00
pakrym-oai
7e9ecfbc6a Rename the model (#1942) 2025-08-07 09:07:51 -07:00
pakrym-oai
c87fb83d81 Calculate remaining context based on last token usage (#1940)
We should only take last request size (in tokens) into account
2025-08-07 05:17:18 -07:00
ae
81b148bda2 feat: update system prompt (#1939) 2025-08-07 04:29:50 -07:00
ae
12d29c2779 feat: add tip to upgrade to ChatGPT plan (#1938) 2025-08-07 11:10:13 +00:00
ae
c4dc6a80bf feat: improve output of /status (#1936)
Now it looks like this:
```
/status
📂 Workspace
  • Path: ~/code/codex/codex-rs
  • Approval Mode: on-request
  • Sandbox: workspace-write

👤 Account
  • Signed in with ChatGPT
  • Login: example@example.com
  • Plan: Pro

🧠 Model
  • Name: ?!?!?!?!?!
  • Provider: OpenAI

📊 Token Usage
  • Input: 11940 (+ 7999 cached)
  • Output: 2639
  • Total: 14579
```
2025-08-07 12:02:58 +01:00
ae
7c20160676 feat: /prompts slash command (#1937)
- Shows several example prompts which include @-mentions 

------
https://chatgpt.com/codex/tasks/task_i_6894779ba8cc832ca0c871d17ee06aae
2025-08-07 11:55:59 +01:00
Ed Bayes
1e4bf81653 Update copy (#1935)
Updated copy

---------

Co-authored-by: pap-openai <pap@openai.com>
2025-08-07 03:29:33 -07:00
aibrahim-oai
5589c6089b approval ui (#1933)
Asking for approval:

<img width="269" height="41" alt="image"
src="https://github.com/user-attachments/assets/b9ced569-3297-4dae-9ce7-0b015c9e14ea"
/>

Allow:

<img width="400" height="31" alt="image"
src="https://github.com/user-attachments/assets/92056b22-efda-4d49-854d-e2943d5fcf17"
/>

Reject:

<img width="372" height="30" alt="image"
src="https://github.com/user-attachments/assets/be9530a9-7d41-4800-bb42-abb9a24fc3ea"
/>

Always Approve:

<img width="410" height="36" alt="image"
src="https://github.com/user-attachments/assets/acf871ba-4c26-4501-b303-7956d0151754"
/>
2025-08-07 02:02:56 -07:00
Michael Bolin
c2c327c723 feat: change shell_environment_policy to default to inherit="all" (#1904)
Trying to use `core` as the default has been "too clever." Users can
always take responsibility for controlling the env without this setting
at all by specifying the `env` they use when calling `codex` in the
first place.

See https://github.com/openai/codex/issues/1249.
2025-08-07 01:55:41 -07:00
Ed Bayes
20084facfe Add spinner animation to TUI status indicator (#1917)
## Summary
- add a pulsing dot loader before the shimmering `Working` label in the
status indicator widget and include a small test asserting the spinner
character is rendered
- also fix a small bug in the ran command header by adding a space
between the  and `Ran command`


https://github.com/user-attachments/assets/6768c9d2-e094-49cb-ad51-44bcac10aa6f

## Testing
- `just fmt`
- `just fix` *(failed: E0658 `let` expressions in core/src/client.rs)*
- `cargo test --all-features` *(failed: E0658 `let` expressions in
core/src/client.rs)*

------
https://chatgpt.com/codex/tasks/task_i_68941bffdb948322b0f4190bc9dbe7f6

---------

Co-authored-by: aibrahim-oai <aibrahim@openai.com>
2025-08-07 08:45:04 +00:00
Michael Bolin
13982d6b4e chore: fix outstanding review comments from the bot on #1919 (#1928)
I should have read the comments before submitting!
2025-08-07 01:30:13 -07:00
ae
0334476894 feat: parse info from auth.json and show in /status (#1923)
- `/status` renders
    ```
    signed in with chatgpt
      login: example@example.com
      plan: plus
    ```
- Setup for using this info in a few more places.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-08-07 01:27:45 -07:00
Gabriel Peal
6d19b73edf Add logout command to CLI and TUI (#1932)
## Summary
- support `codex logout` via new subcommand and helper that removes the
stored `auth.json`
- expose a `logout` function in `codex-login` and test it
- add `/logout` slash command in the TUI; command list is filtered when
not logged in and the handler deletes `auth.json` then exits

## Testing
- `just fix` *(fails: failed to get `diffy` from crates.io)*
- `cargo test --all-features` *(fails: failed to get `diffy` from
crates.io)*

------
https://chatgpt.com/codex/tasks/task_i_68945c3facac832ca83d48499716fb51
2025-08-07 04:17:33 -04:00
ae
28395df957 [fix] fix absolute and % token counts (#1931)
- For absolute, use non-cached input + output.
- For estimating what % of the model's context window is used, we need
to account for reasoning output tokens from prior turns being dropped
from the context window. We approximate this here by subtracting
reasoning output tokens from the total. This will be off for the current
turn and pending function calls. We can improve it later.
2025-08-07 08:13:36 +00:00
Ed Bayes
eb80614a7c Tint chat composer background (#1921)
## Summary
- give the chat composer a subtle custom background and apply it across
the full area drawn

<img width="1008" height="718" alt="composer-bg"
src="https://github.com/user-attachments/assets/4b0f7f69-722a-438a-b4e9-0165ae8865a6"
/>

- update turn interrupted to be more human readable
<img width="648" height="170" alt="CleanShot 2025-08-06 at 22 44 47@2x"
src="https://github.com/user-attachments/assets/8d35e53a-bbfa-48e7-8612-c280a54e01dd"
/>

## Testing
- `cargo test --all-features` *(fails: `let` expressions in
`core/src/client.rs` require newer rustc)*
- `just fix` *(fails: `let` expressions in `core/src/client.rs` require
newer rustc)*

------
https://chatgpt.com/codex/tasks/task_i_68941f32c1008322bbcc39ee1d29a526
2025-08-07 00:46:45 -07:00
aibrahim-oai
04b40ac179 Move used tokens next to the hints (#1930)
Before:

<img width="341" height="58" alt="image"
src="https://github.com/user-attachments/assets/3b209e42-1157-4f7b-8385-825c865969e8"
/>

After:

<img width="490" height="53" alt="image"
src="https://github.com/user-attachments/assets/5d99b9bc-6ac2-4748-b62c-c0c3217622c2"
/>
2025-08-07 00:45:47 -07:00
easong-openai
4e29c4afe4 Add a UI hint when you press @ (#1903)
This will make @ more discoverable (even though it is currently not
super useful, IMO it should be used to bring files into context from
outside CWD)

---------

Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
2025-08-07 07:41:48 +00:00
Michael Bolin
cd5f9074af feat: add /tmp by default (#1919)
Replaces the `include_default_writable_roots` option on
`sandbox_workspace_write` (that defaulted to `true`, which was slightly
weird/annoying) with `exclude_tmpdir_env_var`, which defaults to
`false`.

Though perhaps more importantly `/tmp` is now enabled by default as part
of `sandbox_mode = "workspace-write"`, though `exclude_slash_tmp =
false` can be used to disable this.
2025-08-07 00:17:00 -07:00
aibrahim-oai
fff2bb39f9 change todo (#1925)
<img width="746" height="135" alt="image"
src="https://github.com/user-attachments/assets/1605b2fb-aa3a-4337-b9e9-93f6ff1361c5"
/>


<img width="747" height="126" alt="image"
src="https://github.com/user-attachments/assets/6b4366bd-8548-4d29-8cfa-cd484d9a2359"
/>
2025-08-07 00:01:38 -07:00
aibrahim-oai
f15e0fe1df Ensure exec command end always emitted (#1908)
## Summary
- defer ExecCommandEnd emission until after sandbox resolution
- make sandbox error handler return final exec output and response
- align sandbox error stderr with response content and rename to
`final_output`
- replace unstable `let` chains in client command header logic

## Testing
- `just fmt`
- `just fix`
- `cargo test --all-features` *(fails: NotPresent in
core/tests/client.rs)*

------
https://chatgpt.com/codex/tasks/task_i_6893e63b0c408321a8e1ff2a052c4c51
2025-08-07 06:25:56 +00:00
ae
f0fe61c667 feat: use ctrl c in interrupt hint (#1926)
https://chatgpt.com/codex/tasks/task_i_689441c33e1c832c85ceda166dab5d33
2025-08-06 23:22:58 -07:00
ae
935ad5c6f2 feat: >_ (#1924) 2025-08-06 22:54:54 -07:00
aibrahim-oai
ec20e84d80 Change the UI of apply patch (#1907)
<img width="487" height="108" alt="image"
src="https://github.com/user-attachments/assets/3f6ffd56-36f6-40bc-b999-64279705416a"
/>

---------

Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
2025-08-07 05:25:41 +00:00
easong-openai
2098b40369 Scrollable slash commands (#1830)
Scrollable slash commands. Part 1 of the multi PR.
2025-08-06 21:23:09 -07:00
aibrahim-oai
4971d54ca7 Show timing and token counts in status indicator (#1909)
## Summary
- track start time and cumulative tokens in status indicator
- display dim "(Ns • N tokens • Ctrl z to interrupt)" text after
animated Working header
- propagate token usage updates to status indicator views



https://github.com/user-attachments/assets/b73210c1-1533-40b5-b6c2-3c640029fd54


## Testing
- `just fmt`
- `just fix` *(fails: let expressions in this position are unstable)*
- `cargo test --all-features` *(fails: let expressions in this position
are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_6893ec0d74a883218b94005172d7bc4c
2025-08-06 21:20:09 -07:00
Gabriel Peal
8a990b5401 Migrate GitWarning to OnboardingScreen (#1915)
This paves the way to do per-directory approval settings
(https://github.com/openai/codex/pull/1912).

This also lets us pass in a Config/ChatWidgetArgs into onboarding which
can then mutate it and emit the ChatWidgetArgs it wants at the end which
may be modified by the said approval settings.

<img width="1180" height="428" alt="CleanShot 2025-08-06 at 19 30 55"
src="https://github.com/user-attachments/assets/4dcfda42-0f5e-4b6d-a16d-2597109cc31c"
/>
2025-08-06 22:39:07 -04:00
aibrahim-oai
a5e17cda6b Run command UI (#1897)
Edit how commands show:

<img width="243" height="119" alt="image"
src="https://github.com/user-attachments/assets/13d5608e-3b66-4b8d-8fe7-ce464310d85d"
/>
2025-08-07 00:10:59 +00:00
pap-openai
8a980399c5 fix cursor file name insert (#1896)
Cursor wasn't moving when inserting a file, resulting in being not at
the end of the filename when inserting the file.
This fixes it by moving the cursor to the end of the file + one trailing
space.


Example screenshot after selecting a file when typing `@`
<img width="823" height="268" alt="image"
src="https://github.com/user-attachments/assets/ec6e3741-e1ba-4752-89d2-11f14a2bd69f"
/>
2025-08-06 16:58:06 -07:00
pap-openai
af8c1cdf12 fix meta+b meta+f (option+left/right) (#1895)
Option+Left or Option+Right should move cursor to beginning/end of the
word.

We weren't listening to what terminals are sending (on MacOS) and were
therefore printing b or f instead of moving cursor. We were actually in
the first match clause and returning char insertion
(https://github.com/openai/codex/pull/1895/files#diff-6bf130cd00438cc27a38c5a4d9937a27cf9a324c191de4b74fc96019d362be6dL209)

Tested on Apple Terminal, iTerm, Ghostty
2025-08-06 16:16:47 -07:00
pakrym-oai
57c973b571 Add 2025-08-06 model family (#1899) 2025-08-06 23:14:02 +00:00
Gabriel Peal
2d5de795aa First pass at a TUI onboarding (#1876)
This sets up the scaffolding and basic flow for a TUI onboarding
experience. It covers sign in with ChatGPT, env auth, as well as some
safety guidance.

Next up:
1. Replace the git warning screen
2. Use this to configure default approval/sandbox modes


Note the shimmer flashes are from me slicing the video, not jank.

https://github.com/user-attachments/assets/0fbe3479-fdde-41f3-87fb-a7a83ab895b8
2025-08-06 18:22:14 -04:00
Dylan
f25b2e8e2c Propagate apply_patch filesystem errors (#1892)
## Summary
We have been returning `exit code 0` from the apply patch command when
writes fail, which causes our `exec` harness to pass back confusing
messages to the model. Instead, we should loudly fail so that the
harness and the model can handle these errors appropriately.

Also adds a test to confirm this behavior.

## Testing
- `cargo test -p codex-apply-patch`
2025-08-06 14:58:53 -07:00
ae
a575effbb0 feat: interrupt running task on ctrl-z (#1880)
- Arguably a bugfix as previously CTRL-Z didn't do anything.
- Only in TUI mode for now. This may make sense in other modes... to be
researched.
- The TUI runs the terminal in raw mode and the signals arrive as key
events, so we handle CTRL-Z as a key event just like CTRL-C.
- Not adding UI for it as a composer redesign is coming, and we can just
add it then.
- We should follow with CTRL-Z a second time doing the native terminal
action.
2025-08-06 21:56:34 +00:00
ae
6cef86f05b feat: update launch screen (#1881)
- Updates the launch screen to:
  ```
  >_ You are using OpenAI Codex in ~/code/codex/codex-rs
  
   Try one of the following commands to get started:
  
   1. /init - Create an AGENTS.md file with instructions for Codex
   2. /status - Show current session configuration and token usage
   3. /compact - Compact the chat history
   4. /new - Start a new chat
   ```
- These aren't the perfect commands, but as more land soon we can
update.
- We should also add logic later to make /init only show when there's no
existing AGENTS.md.
- Majorly need to iterate on copy.

<img width="905" height="769" alt="image"
src="https://github.com/user-attachments/assets/5912939e-fb0e-4e76-94ff-785261e2d6ee"
/>
2025-08-06 14:36:48 -07:00
pakrym-oai
8262ba58b2 Prefer env var auth over default codex auth (#1861)
## Summary
- Prioritize provider-specific API keys over default Codex auth when
building requests
- Add test to ensure provider env var auth overrides default auth

## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_68926a104f7483208f2c8fd36763e0e3
2025-08-06 13:02:00 -07:00
Jeremy Rose
081caa5a6b show a transient history cell for commands (#1824)
Adds a new "active history cell" for history bits that need to render
more than once before they're inserted into the history. Only used for
commands right now.


https://github.com/user-attachments/assets/925f01a0-e56d-4613-bc25-fdaa85d8aea5

---------

Co-authored-by: easong-openai <easong@openai.com>
2025-08-06 12:03:45 -07:00
Michael Bolin
4344537742 chore: rename INIT.md to prompt_for_init_command.md and move closer to usage (#1886)
Addressing my post-commit review feedback on
https://github.com/openai/codex/pull/1822.
2025-08-06 11:58:57 -07:00
Michael Bolin
64f2f2eca2 fix: support $CODEX_HOME/AGENTS.md instead of $CODEX_HOME/instructions.md (#1891)
The docs and code do not match. It turns out the docs are "right" in
they are what we have been meaning to support, so this PR updates the
code:


ae88b69b09/README.md (L298-L302)

Support for `instructions.md` is a holdover from the TypeScript CLI, so
we are just going to drop support for it altogether rather than maintain
it in perpetuity.
2025-08-06 11:48:03 -07:00
Michael Bolin
ae88b69b09 fix: add more instructions to ensure GitHub Action reviews only the necessary code (#1887)
Empirically, we have seen the GitHub Action comment on code outside of
the PR, so try to provide additional instructions in the prompt to avoid
this.
2025-08-06 10:39:58 -07:00
Charlie Weems
ffe24991b7 Initial implementation of /init (#1822)
Basic /init command that appends an instruction to create AGENTS.md to
the conversation history.
2025-08-06 09:10:23 -07:00
Dylan
dc468d563f [env] Remove git config for now (#1884)
## Summary
Forgot to remove this in #1869 last night! Too much of a performance hit
on the main thread. We can bring it back via an async thread on startup.
2025-08-06 08:05:17 -07:00
Dylan
3e8bcf0247 [prompts] Add <environment_context> (#1869)
## Summary
Includes a new user message in the api payload which provides useful
environment context for the model, so it knows about things like the
current working directory and the sandbox.

## Testing
Updated unit tests
2025-08-06 01:13:31 -07:00
Dylan
cda39e417f [tests] Investigate flakey mcp-server test (#1877)
## Summary
Have seen these tests flaking over the course of today on different
boxes. `wiremock` seems to be generally written with tokio/threads in
mind but based on the weird panics from the tests, let's see if this
helps.
2025-08-06 00:07:58 -07:00
ae
d642b07fcc [feat] add /status slash command (#1873)
- Added a `/status` command, which will be useful when we update the
home screen to print less status.
- Moved `create_config_summary_entries` to common since it's used in a
few places.
- Noticed we inconsistently had periods in slash command descriptions
and just removed them everywhere.
- Noticed the diff description was overflowing so made it shorter.
2025-08-05 23:57:52 -07:00
Michael Bolin
7b3ab968a0 docs: add more detail to the codex-rust-review (#1875)
This PR attempts to break `codex-rust-review.md` into sections so that
it is easier to consume.

It also adds a healthy new section on "Assertions in Tests" that has
been on my mind for awhile.
2025-08-06 06:36:10 +00:00
Michael Bolin
02e7965228 fix: add stricter checks and better error messages to create_github_release.sh (#1874)
This script attempts to verify that:

- You have no local, uncommitted changes.
- You are on `main`
- The commit you are on exists on `main` also exists on the origin
`https://github.com/openai/codex`, i.e., it is not just a commit you
have pushed to your local version of `main`

As part of this, try to print better error message if/when these
conditions are violated.
2025-08-05 23:33:21 -07:00
Michael Bolin
493e4c9463 fix: only tag as prerelease when the version has an -alpha or -beta suffix (#1872)
Hardcoding to `prerelease: true` is a holdover from before we had
migrated to the Rust CLI for releases and decided on how we were doing
version numbers.

To date, I have had to change the release status from "prerelease" to
"actual release" manually through the GitHub Releases web page. This is
a semi-serious problem because I've discovered that it messes up
Homebrew's automation if the version number _looks_ like a real release
but turns out to be a prerelease. The release potentially gets skipped
from being published on Homebrew, so it's important to set the value
correctly from the start.

I verified that `steps.release_name.outputs.name` does not include the
`rust-v` prefix from the tag name.
2025-08-05 23:11:29 -07:00
ae
1f7003b476 tweak comment (#1871)
Belatedly address CR feedback about a comment.

------
https://chatgpt.com/codex/tasks/task_i_6892e8070be4832cba379f2955f5b8bc
2025-08-05 23:02:00 -07:00
Michael Bolin
eaf2fb5b4f fix: fully enumerate EventMsg in chatwidget.rs (#1866)
https://github.com/openai/codex/pull/1868 is a related fix that was in
flight simultaenously, but after talking to @easong-openai, this:

- logs instead of renders for `BackgroundEvent`
- logs for `TurnDiff`
- renders for `PatchApplyEnd`
2025-08-05 22:44:27 -07:00
easong-openai
f8d70d67b6 Add OSS model info (#1860)
Add somewhat arbitrarily chosen context window/output limit.
2025-08-05 22:35:00 -07:00
easong-openai
966d957faf fixes no git repo warning (#1863)
Fix broken git warning

<img width="797" height="482" alt="broken-screen"
src="https://github.com/user-attachments/assets/9c52ed9b-13d8-4f1d-bb37-7c51acac615d"
/>
2025-08-05 22:34:14 -07:00
ae
b90c15abc4 clear terminal on launch (#1870) 2025-08-05 22:01:34 -07:00
aibrahim-oai
31dcae67db Remove Turndiff and Apply patch from the render (#1868)
Make the tui more specific on what to render. Apply patch End and Turn
diff needs special handling.

Avoiding this issue:

<img width="503" height="138" alt="image"
src="https://github.com/user-attachments/assets/4c010ea8-701e-46d2-aa49-88b37fe0e5d9"
/>
2025-08-05 21:32:03 -07:00
Dylan
725dd6be6a [approval_policy] Add OnRequest approval_policy (#1865)
## Summary
A split-up PR of #1763 , stacked on top of a tools refactor #1858 to
make the change clearer. From the previous summary:

> Let's try something new: tell the model about the sandbox, and let it
decide when it will need to break the sandbox. Some local testing
suggests that it works pretty well with zero iteration on the prompt!

## Testing
- [x] Added unit tests
- [x] Tested locally and it appears to work smoothly!
2025-08-05 20:44:20 -07:00
Dylan
aff97ed7dd [core] Separate tools config from openai client (#1858)
## Summary
In an effort to make tools easier to work with and more configurable,
I'm introducing `ToolConfig` and updating `Prompt` to take in a general
list of Tools. I think this is simpler and better for a few reasons:
- We can easily assemble tools from various sources (our own harness,
mcp servers, etc.) and we can consolidate the logic for constructing the
logic in one place that is separate from serialization.
- client.rs no longer needs arbitrary config values, it just takes in a
list of tools to serialize

A hefty portion of the PR is now updating our conversion of
`mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest
accurately called this out as a TODO long ago, I think it's time we
tackled it.

## Testing
- [x] Experimented locally, no changes, as expected
- [x] Added additional unit tests
- [x] Responded to rust-review
2025-08-05 19:27:52 -07:00
Michael Bolin
afa8f0d617 fix: exit cleanly when ShutdownComplete is received (#1864)
Previous to this PR, `ShutdownComplete` was not being handled correctly
in `codex exec`, so it always ended up printing the following to stderr:

```
ERROR codex_exec: Error receiving event: InternalAgentDied
```

Because we were not breaking out of the loop for `ShutdownComplete`,
inevitably `codex.next_event()` would get called again and
`rx_event.recv()` would fail and the error would get mapped to
`InternalAgentDied`:


ea7d3f27bd/codex-rs/core/src/codex.rs (L190-L197)

For reference, https://github.com/openai/codex/pull/1647 introduced the
`ShutdownComplete` variant.
2025-08-05 19:19:36 -07:00
Dylan
ea7d3f27bd [core] Stop escalating timeouts (#1853)
## Summary
Escalating out of sandbox is (almost always) not going to fix
long-running commands timing out - therefore we should just pass the
failure back to the model instead of asking the user to re-run a command
that took a long time anyway.

## Testing
- [x] Ran locally with a timeout and confirmed this worked as expected
2025-08-05 17:52:25 -07:00
ae
f6c8d1117c [feat] make approval key matching case insensitive (#1862) 2025-08-05 15:50:06 -07:00
Michael Bolin
42bd73e150 chore: remove unnecessary default_ prefix (#1854)
This prefix is not inline with the other fields on the `ConfigOverrides`
struct.
2025-08-05 14:42:49 -07:00
Michael Bolin
d365cae077 fix: when using --oss, ensure correct configuration is threaded through correctly (#1859)
This PR started as an investigation with the goal of eliminating the use
of `unsafe { std::env::set_var() }` in `ollama/src/client.rs`, as
setting environment variables in a multithreaded context is indeed
unsafe and these tests were observed to be flaky, as a result.

Though as I dug deeper into the issue, I discovered that the logic for
instantiating `OllamaClient` under test scenarios was not quite right.
In this PR, I aimed to:

- share more code between the two creation codepaths,
`try_from_oss_provider()` and `try_from_provider_with_base_url()`
- use the values from `Config` when setting up Ollama, as we have
various mechanisms for overriding config values, so we should be sure
that we are always using the ultimate `Config` for things such as the
`ModelProviderInfo` associated with the `oss` id

Once this was in place,
`OllamaClient::try_from_provider_with_base_url()` could be used in unit
tests for `OllamaClient` so it was possible to create a properly
configured client without having to set environment variables.
2025-08-05 13:55:32 -07:00
Michael Bolin
0c5fa271bc fix: README ToC did not match contents (#1857)
Similar to https://github.com/openai/codex/pull/1855, this got through.

Fixed by running:

```
python3 scripts/readme_toc.py --fix README.md
```
2025-08-05 11:48:28 -07:00
Michael Bolin
bd24bc320e fix: clean out some ASCII (#1856)
Similar to https://github.com/openai/codex/pull/1855, this got through.

Fixed by running:

```
./scripts/asciicheck.py README.md
```
2025-08-05 11:44:04 -07:00
Michael Bolin
9f91b3da24 fix: correct spelling error that sneaked through (#1855)
I ended up force-pushing https://github.com/openai/codex/pull/1848
because CI jobs were not being triggered after updating the PR on
GitHub, so this spelling error sneaked through.
2025-08-05 11:39:30 -07:00
easong-openai
9285350842 Introduce --oss flag to use gpt-oss models (#1848)
This adds support for easily running Codex backed by a local Ollama
instance running our new open source models. See
https://github.com/openai/gpt-oss for details.

If you pass in `--oss` you'll be prompted to install/launch ollama, and
it will automatically download the 20b model and attempt to use it.

We'll likely want to expand this with some options later to make the
experience smoother for users who can't run the 20b or want to run the
120b.

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-08-05 11:31:11 -07:00
easong-openai
e0303dbac0 Rescue chat completion changes (#1846)
https://github.com/openai/codex/pull/1835 has some messed up history.

This adds support for streaming chat completions, which is useful for ollama. We should probably take a very skeptical eye to the code introduced in this PR.

---------

Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
2025-08-05 08:56:13 +00:00
Dylan
d31e149cb1 [prompt] Update prompt.md (#1839)
## Summary
Additional clarifications to our prompt. Still very concise, but we'll
continue to add more here.
2025-08-05 00:43:23 -07:00
Michael Bolin
136b3ee5bf chore: introduce ModelFamily abstraction (#1838)
To date, we have a number of hardcoded OpenAI model slug checks spread
throughout the codebase, which makes it hard to audit the various
special cases for each model. To mitigate this issue, this PR introduces
the idea of a `ModelFamily` that has fields to represent the existing
special cases, such as `supports_reasoning_summaries` and
`uses_local_shell_tool`.

There is a `find_family_for_model()` function that maps the raw model
slug to a `ModelFamily`. This function hardcodes all the knowledge about
the special attributes for each model. This PR then replaces the
hardcoded model name checks with checks against a `ModelFamily`.

Note `ModelFamily` is now available as `Config::model_family`. We should
ultimately remove `Config::model` in favor of
`Config::model_family::slug`.
2025-08-04 23:50:03 -07:00
Michael Bolin
fcdb1c4b4d fix: disable reorderArrays in tamasfe.even-better-toml (#1837)
The existing setting kept destroying my `~/.codex/config.toml` for the
reasons mentioned in the comment.
2025-08-04 21:57:55 -07:00
easong-openai
906d449760 Stream model responses (#1810)
Stream models thoughts and responses instead of waiting for the whole
thing to come through. Very rough right now, but I'm making the risk call to push through.
2025-08-05 04:23:22 +00:00
Dylan
063083af15 [prompts] Better user_instructions handling (#1836)
## Summary
Our recent change in #1737 can sometimes lead to the model confusing
AGENTS.md context as part of the message. But a little prompting and
formatting can help fix this!

## Testing
- Ran locally with a few different prompts to verify the model
behaves well.
- Updated unit tests
2025-08-04 18:55:57 -07:00
pakrym-oai
f58401e203 Request the simplified auth flow (#1834) 2025-08-04 18:45:13 -07:00
pakrym-oai
84bcadb8d9 Restore API key and query param overrides (#1826)
Addresses https://github.com/openai/codex/issues/1796
2025-08-04 18:07:49 -07:00
Ahmed Ibrahim
e38ce39c51 Revert to 3f13ebce10 without rewriting history. Wrong merge 2025-08-04 17:03:24 -07:00
Ahmed Ibrahim
1a33de34b0 unify flag 2025-08-04 16:56:52 -07:00
Ahmed Ibrahim
bd171e5206 add raw reasoning 2025-08-04 16:49:42 -07:00
Michael Bolin
3f13ebce10 [codex] stop printing error message when --output-last-message is not specified (#1828)
Previously, `codex exec` was printing `Warning: no file to write last
message to` as a warning to stderr even though `--output-last-message`
was not specified, which is wrong. This fixes the code and changes
`handle_last_message()` so that it is only called when
`last_message_path` is `Some`.
2025-08-04 15:56:32 -07:00
dependabot[bot]
7279080edd chore(deps): bump tokio from 1.46.1 to 1.47.1 in /codex-rs (#1816)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.46.1 to 1.47.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/tokio/releases">tokio's
releases</a>.</em></p>
<blockquote>
<h2>Tokio v1.47.1</h2>
<h1>1.47.1 (August 1st, 2025)</h1>
<h3>Fixed</h3>
<ul>
<li>process: fix panic from spurious pidfd wakeup (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7494">#7494</a>)</li>
<li>sync: fix broken link of Python <code>asyncio.Event</code> in
<code>SetOnce</code> docs (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7485">#7485</a>)</li>
</ul>
<p><a
href="https://redirect.github.com/tokio-rs/tokio/issues/7485">#7485</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7485">tokio-rs/tokio#7485</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7494">#7494</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7494">tokio-rs/tokio#7494</a></p>
<h2>Tokio v1.47.0</h2>
<h1>1.47.0 (July 25th, 2025)</h1>
<p>This release adds <code>poll_proceed</code> and
<code>cooperative</code> to the <code>coop</code> module for
cooperative scheduling, adds <code>SetOnce</code> to the
<code>sync</code> module which provides
similar functionality to [<code>std::sync::OnceLock</code>], and adds a
new method
<code>sync::Notify::notified_owned()</code> which returns an
<code>OwnedNotified</code> without
a lifetime parameter.</p>
<h2>Added</h2>
<ul>
<li>coop: add <code>cooperative</code> and <code>poll_proceed</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7405">#7405</a>)</li>
<li>sync: add <code>SetOnce</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7418">#7418</a>)</li>
<li>sync: add <code>sync::Notify::notified_owned()</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7465">#7465</a>)</li>
</ul>
<h2>Changed</h2>
<ul>
<li>deps: upgrade windows-sys 0.52 → 0.59 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7117">#7117</a>)</li>
<li>deps: update to socket2 v0.6 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7443">#7443</a>)</li>
<li>sync: improve <code>AtomicWaker::wake</code> performance (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7450">#7450</a>)</li>
</ul>
<h2>Documented</h2>
<ul>
<li>metrics: fix listed feature requirements for some metrics (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7449">#7449</a>)</li>
<li>runtime: improve safety comments of <code>Readiness&lt;'_&gt;</code>
(<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7415">#7415</a>)</li>
</ul>
<p><a
href="https://redirect.github.com/tokio-rs/tokio/issues/7405">#7405</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7405">tokio-rs/tokio#7405</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7415">#7415</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7415">tokio-rs/tokio#7415</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7418">#7418</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7418">tokio-rs/tokio#7418</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7449">#7449</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7449">tokio-rs/tokio#7449</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7450">#7450</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7450">tokio-rs/tokio#7450</a>
<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7465">#7465</a>:
<a
href="https://redirect.github.com/tokio-rs/tokio/pull/7465">tokio-rs/tokio#7465</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="be8ee45b3f"><code>be8ee45</code></a>
chore: prepare Tokio v1.47.1 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7504">#7504</a>)</li>
<li><a
href="d9b19166cd"><code>d9b1916</code></a>
Merge 'tokio-1.43.2' into 'tokio-1.47.x' (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7503">#7503</a>)</li>
<li><a
href="db8edc620f"><code>db8edc6</code></a>
chore: prepare Tokio v1.43.2 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7502">#7502</a>)</li>
<li><a
href="4730984d66"><code>4730984</code></a>
readme: add 1.47 as LTS release (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7497">#7497</a>)</li>
<li><a
href="1979615cbf"><code>1979615</code></a>
process: fix panic from spurious pidfd wakeup (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7494">#7494</a>)</li>
<li><a
href="f669a609cf"><code>f669a60</code></a>
ci: add lockfile for LTS branch</li>
<li><a
href="ce41896f8d"><code>ce41896</code></a>
sync: fix broken link of Python <code>asyncio.Event</code> in
<code>SetOnce</code> docs (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7485">#7485</a>)</li>
<li><a
href="c8ab78a84f"><code>c8ab78a</code></a>
changelog: fix incorrect PR number for 1.47.0 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7484">#7484</a>)</li>
<li><a
href="3911cb8523"><code>3911cb8</code></a>
chore: prepare Tokio v1.47.0 (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7482">#7482</a>)</li>
<li><a
href="d545aa2601"><code>d545aa2</code></a>
sync: add <code>sync::Notify::notified_owned()</code> (<a
href="https://redirect.github.com/tokio-rs/tokio/issues/7465">#7465</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tokio-rs/tokio/compare/tokio-1.46.1...tokio-1.47.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tokio&package-manager=cargo&previous-version=1.46.1&new-version=1.47.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 14:50:53 -07:00
dependabot[bot]
89ab5c3f74 chore(deps): bump serde_json from 1.0.141 to 1.0.142 in /codex-rs (#1817)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.141 to
1.0.142.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.142</h2>
<ul>
<li>impl Default for &amp;Value (<a
href="https://redirect.github.com/serde-rs/json/issues/1265">#1265</a>,
thanks <a
href="https://github.com/aatifsyed"><code>@​aatifsyed</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1731167cd5"><code>1731167</code></a>
Release 1.0.142</li>
<li><a
href="e51c81450a"><code>e51c814</code></a>
Touch up PR 1265</li>
<li><a
href="84abbdb613"><code>84abbdb</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1265">#1265</a>
from aatifsyed/master</li>
<li><a
href="9206cc0150"><code>9206cc0</code></a>
feat: impl Default for &amp;Value</li>
<li>See full diff in <a
href="https://github.com/serde-rs/json/compare/v1.0.141...v1.0.142">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=serde_json&package-manager=cargo&previous-version=1.0.141&new-version=1.0.142)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 14:26:14 -07:00
dependabot[bot]
6db597ec0c chore(deps-dev): bump typescript from 5.8.3 to 5.9.2 in /.github/actions/codex (#1814)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=typescript&package-manager=bun&previous-version=5.8.3&new-version=5.9.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 14:25:00 -07:00
dependabot[bot]
2899817c94 chore(deps): bump toml from 0.9.2 to 0.9.4 in /codex-rs (#1815)
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.2 to 0.9.4.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2126e6af51"><code>2126e6a</code></a>
chore: Release</li>
<li><a
href="fa2100a888"><code>fa2100a</code></a>
docs: Update changelog</li>
<li><a
href="0c75bbd6f7"><code>0c75bbd</code></a>
feat(toml): Expose DeInteger/DeFloat as_str/radix (<a
href="https://redirect.github.com/toml-rs/toml/issues/1021">#1021</a>)</li>
<li><a
href="e3d64dff47"><code>e3d64df</code></a>
feat(toml): Expose DeFloat::as_str</li>
<li><a
href="ffdd211033"><code>ffdd211</code></a>
feat(toml): Expose DeInteger::as_str/radix</li>
<li><a
href="9e7adcc7fa"><code>9e7adcc</code></a>
docs(readme): Fix links to crates (<a
href="https://redirect.github.com/toml-rs/toml/issues/1020">#1020</a>)</li>
<li><a
href="73d04e20b5"><code>73d04e2</code></a>
docs(readme): Fix links to crates</li>
<li><a
href="da667e8a7d"><code>da667e8</code></a>
chore: Release</li>
<li><a
href="b1327fbe7c"><code>b1327fb</code></a>
docs: Update changelog</li>
<li><a
href="fb5346827e"><code>fb53468</code></a>
fix(toml): Don't enable std in toml_writer (<a
href="https://redirect.github.com/toml-rs/toml/issues/1019">#1019</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/toml-rs/toml/compare/toml-v0.9.2...toml-v0.9.4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=toml&package-manager=cargo&previous-version=0.9.2&new-version=0.9.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 14:24:19 -07:00
Jeremy Rose
64cfbbd3c8 support more keys in textarea (#1820)
Added:
* C-m for newline (not sure if this is actually treated differently to
Enter, but tui-textarea handles it and it doesn't hurt)
* C-d to delete one char forwards (same as Del)
* A-bksp to delete backwards one word
* A-arrows to navigate by word
2025-08-04 11:25:01 -07:00
easong-openai
a6139aa003 Update prompt.md (#1819)
The existing prompt is really bad. As a low-hanging fruit, let's correct
the apply_patch instructions - this helps smaller models successfully
apply patches.
2025-08-04 10:42:39 -07:00
ae
dc15a5cf0b feat: accept custom instructions in profiles (#1803)
Allows users to set their experimental_instructions_file in configs.

For example the below enables experimental instructions when running
`codex -p foo`.
```
[profiles.foo]
experimental_instructions_file = "/Users/foo/.codex/prompt.md"
```

# Testing
-  Running against a profile with experimental_instructions_file works.
-  Running against a profile without experimental_instructions_file
works.
-  Running against no profile with experimental_instructions_file
works.
-  Running against no profile without experimental_instructions_file
works.
2025-08-04 09:34:46 -07:00
Gabriel Peal
1f3318c1c5 Add a TurnDiffTracker to create a unified diff for an entire turn (#1770)
This lets us show an accumulating diff across all patches in a turn.
Refer to the docs for TurnDiffTracker for implementation details.

There are multiple ways this could have been done and this felt like the
right tradeoff between reliability and completeness:
*Pros*
* It will pick up all changes to files that the model touched including
if they prettier or another command that updates them.
* It will not pick up changes made by the user or other agents to files
it didn't modify.

*Cons*
* It will pick up changes that the user made to a file that the model
also touched
* It will not pick up changes to codegen or files that were not modified
with apply_patch
2025-08-04 11:57:04 -04:00
Dylan
e3565a3f43 [sandbox] Filter out certain non-sandbox errors (#1804)
## Summary
Users frequently complain about re-approving commands that have failed
for non-sandbox reasons. We can't diagnose with complete accuracy which
errors happened because of a sandbox failure, but we can start to
eliminate some common simple cases.

This PR captures the most common case I've seen, which is a `command not
found` error.

## Testing
- [x] Added unit tests
- [x] Ran a few cases locally
2025-08-03 13:05:48 -07:00
Jeremy Rose
2576fadc74 shimmer on working (#1807)
change the animation on "working" to be a text shimmer


https://github.com/user-attachments/assets/f64529eb-1c64-493a-8d97-0f68b964bdd0
2025-08-03 18:51:33 +00:00
Jeremy Rose
78a1d49fac fix command duration display (#1806)
we were always displaying "0ms" before.

<img width="731" height="101" alt="Screenshot 2025-08-02 at 10 51 22 PM"
src="https://github.com/user-attachments/assets/f56814ed-b9a4-4164-9e78-181c60ce19b7"
/>
2025-08-03 11:33:44 -07:00
Jeremy Rose
d62b703a21 custom textarea (#1794)
This replaces tui-textarea with a custom textarea component.

Key differences:
1. wrapped lines
2. better unicode handling
3. uses the native terminal cursor

This should perhaps be spun out into its own separate crate at some
point, but for now it's convenient to have it in-tree.
2025-08-03 11:31:35 -07:00
Gabriel Peal
4c9f7b6bcc Fix flaky test_shell_command_approval_triggers_elicitation test (#1802)
This doesn't flake very often but this should fix it.
2025-08-03 10:19:12 -04:00
David Z Hao
75eecb656e Fix MacOS multiprocessing by relaxing sandbox (#1808)
The following test script fails in the codex sandbox:
```
import multiprocessing
from multiprocessing import Lock, Process

def f(lock):
    with lock:
        print("Lock acquired in child process")

if __name__ == '__main__':
    lock = Lock()
    p = Process(target=f, args=(lock,))
    p.start()
    p.join()
```

with 
```
Traceback (most recent call last):
  File "/Users/david.hao/code/codex/codex-rs/cli/test.py", line 9, in <module>
    lock = Lock()
           ^^^^^^
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/context.py", line 68, in Lock
    return Lock(ctx=self.get_context())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/synchronize.py", line 169, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/synchronize.py", line 57, in __init__
    sl = self._semlock = _multiprocessing.SemLock(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 1] Operation not permitted
```

After reading, adding this line to the sandbox configs fixes things -
MacOS multiprocessing appears to use sem_lock(), which opens an IPC
which is considered a disk write even though no file is created. I
interrogated ChatGPT about whether it's okay to loosen, and my
impression after reading is that it is, although would appreciate a
close look


Breadcrumb: You can run `cargo run -- debug seatbelt --full-auto <cmd>`
to test the sandbox
2025-08-03 06:59:26 -07:00
aibrahim-oai
81bb1c9e26 Fix compact (#1798)
We are not recording the summary in the history.
2025-08-02 12:05:06 -07:00
Jeremy Rose
7e0f506da2 check for updates (#1764)
1. Ping https://api.github.com/repos/openai/codex/releases/latest (at
most once every 20 hrs)
2. Store the result in ~/.codex/version.jsonl
3. If CARGO_PKG_VERSION < latest_version, print a message at boot.

---------

Co-authored-by: easong-openai <easong@openai.com>
2025-08-02 00:31:38 +00:00
pakrym-oai
929ba50adc Update succesfull login page look (#1789) 2025-08-01 23:30:15 +00:00
Michael Bolin
80555d4ff2 feat: make .git read-only within a writable root when using Seatbelt (#1765)
To make `--full-auto` safer, this PR updates the Seatbelt policy so that
a `SandboxPolicy` with a `writable_root` that contains a `.git/`
_directory_ will make `.git/` _read-only_ (though as a follow-up, we
should also consider the case where `.git` is a _file_ with a `gitdir:
/path/to/actual/repo/.git` entry that should also be protected).

The two major changes in this PR:

- Updating `SandboxPolicy::get_writable_roots_with_cwd()` to return a
`Vec<WritableRoot>` instead of a `Vec<PathBuf>` where a `WritableRoot`
can specify a list of read-only subpaths.
- Updating `create_seatbelt_command_args()` to honor the read-only
subpaths in `WritableRoot`.

The logic to update the policy is a fairly straightforward update to
`create_seatbelt_command_args()`, but perhaps the more interesting part
of this PR is the introduction of an integration test in
`tests/sandbox.rs`. Leveraging the new API in #1785, we test
`SandboxPolicy` under various conditions, including ones where `$TMPDIR`
is not readable, which is critical for verifying the new behavior.

To ensure that Codex can run its own tests, e.g.:

```
just codex debug seatbelt --full-auto -- cargo test if_git_repo_is_writable_root_then_dot_git_folder_is_read_only
```

I had to introduce the use of `CODEX_SANDBOX=sandbox`, which is
comparable to how `CODEX_SANDBOX_NETWORK_DISABLED=1` was already being
used.

Adding a comparable change for Landlock will be done in a subsequent PR.
2025-08-01 16:11:24 -07:00
aibrahim-oai
97ab8fb610 MCP: add conversation.create tool [Stack 2/2] (#1783)
Introduce conversation.create handler (handle_create_conversation) and
wire it in MessageProcessor.

Stack:
Top: #1783 
Bottom: #1784

---------

Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
2025-08-01 22:18:36 +00:00
aibrahim-oai
fe62f859a6 Add Error variant to ConversationCreateResult [Stack 1/2] (#1784)
Switch ConversationCreateResult from a struct to a tagged enum (Ok |
Error)

Stack:
Top: #1783 
Bottom: #1784
2025-08-01 15:13:53 -07:00
Michael Bolin
92f3566d78 chore: introduce SandboxPolicy::WorkspaceWrite::include_default_writable_roots (#1785)
Without this change, it is challenging to create integration tests to
verify that the folders not included in `writable_roots` in
`SandboxPolicy::WorkspaceWrite` are read-only because, by default,
`get_writable_roots_with_cwd()` includes `TMPDIR`, which is where most
integrationt
tests do their work.

This introduces a `use_exact_writable_roots` option to disable the
default
includes returned by `get_writable_roots_with_cwd()`.




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1785).
* #1765
* __->__ #1785
2025-08-01 14:15:55 -07:00
aibrahim-oai
f20de21cb6 collabse stdout and stderr delta events into one (#1787) 2025-08-01 14:00:19 -07:00
aibrahim-oai
bc7beddaa2 feat: stream exec stdout events (#1786)
## Summary
- stream command stdout as `ExecCommandStdout` events
- forward streamed stdout to clients and ignore in human output
processor
- adjust call sites for new streaming API
2025-08-01 13:04:34 -07:00
Jeremy Rose
8360c6a3ec fix insert_history modifier handling (#1774)
This fixes a bug in insert_history_lines where writing
`Line::From(vec!["A".bold(), "B".into()])` would write "B" as bold,
because "B" didn't explicitly subtract bold.
2025-08-01 10:37:43 -07:00
aibrahim-oai
f918198bbb Introduce a new function to just send user message [Stack 3/3] (#1686)
- MCP server: add send-user-message tool to send user input to a running
Codex session
- Added an integration tests for the happy and sad paths

Changes:
•	Add tool definition and schema.
•	Expose tool in capabilities.
•	Route and handle tool requests with validation.
•	Tests for success, bad UUID, and missing session.


follow‑ups
• Listen path not implemented yet; the tool is present but marked “don’t
use yet” in code comments.
• Session run flag reset: clear running_session_id_set appropriately
after turn completion/errors.

This is the third PR in a stack.
Stack:
Final: #1686
Intermediate: #1751
First: #1750
2025-08-01 17:04:12 +00:00
pakrym-oai
88ea215c80 Add a custom originator setting (#1781) 2025-08-01 09:55:23 -07:00
aibrahim-oai
b67c485d84 ci fix (#1782) 2025-08-01 09:17:13 -07:00
aibrahim-oai
e2c994e32a Add /compact (#1527)
- Add operation to summarize the context so far.
- The operation runs a compact task that summarizes the context.
- The operation clear the previous context to free the context window
- The operation didn't use `run_task` to avoid corrupting the session
- Add /compact in the tui



https://github.com/user-attachments/assets/e06c24e5-dcfb-4806-934a-564d425a919c
2025-07-31 21:34:32 -07:00
aibrahim-oai
ad0295b893 MCP server: route structured tool-call requests and expose mcp_protocol [Stack 2/3] (#1751)
- Expose mcp_protocol from mcp-server for reuse in tests and callers.
- In MessageProcessor, detect structured ToolCallRequestParams in
tools/call and forward to a new handler.
- Add handle_new_tool_calls scaffold (returns error for now).
- Test helper: add send_send_user_message_tool_call to McpProcess to
send ConversationSendMessage requests;

This is the second PR in a stack.
Stack:
Final: #1686
Intermediate: #1751
First: #1750
2025-08-01 02:46:04 +00:00
aibrahim-oai
d3aa5f46b7 MCP Protocol: Align tool-call response with CallToolResult [Stack 1/3] (#1750)
# Summary
- Align MCP server responses with mcp_types by emitting [CallToolResult,
RequestId] instead of an object.
Update send-message result to a tagged enum: Ok or Error { message }.

# Why
Protocol compliance with current MCP schema.

# Tests
- Updated assertions in mcp_protocol.rs for create/stream/send/list and
error cases.

This is the first PR in a stack.
Stack:
Final: #1686
Intermediate: #1751
First: #1750
2025-08-01 02:30:03 +00:00
easong-openai
575590e4c2 Detect kitty terminals (#1748)
We want to detect kitty terminals so we can preferentially upgrade their UX without degrading older terminals.
2025-08-01 00:30:44 +00:00
Jeremy Rose
4aca3e46c8 insert history lines with redraw (#1769)
This delays the call to insert_history_lines until a redraw is
happening. Crucially, the new lines are inserted _after the viewport is
resized_. This results in fewer stray blank lines below the viewport
when modals (e.g. user approval) are closed.
2025-07-31 17:15:26 -07:00
Jeremy Rose
d787434aa8 fix: always send KeyEvent, we now check kind in the handler (#1772)
https://github.com/openai/codex/pull/1754 and #1771 fixed the same thing
in colliding ways.
2025-08-01 00:13:36 +00:00
Jeremy Rose
ea69a1d72f lighter approval modal (#1768)
The yellow hazard stripes were too scary :)

This also has the added benefit of not rendering anything at the full
width of the terminal, so resizing is a little easier to handle.

<img width="860" height="390" alt="Screenshot 2025-07-31 at 4 03 29 PM"
src="https://github.com/user-attachments/assets/18476e1a-065d-4da9-92fe-e94978ab0fce"
/>

<img width="860" height="390" alt="Screenshot 2025-07-31 at 4 05 03 PM"
src="https://github.com/user-attachments/assets/337db0da-de40-48c6-ae71-0e40f24b87e7"
/>
2025-07-31 17:10:52 -07:00
Jeremy Rose
610addbc2e do not dispatch key releases (#1771)
when we enabled KKP in https://github.com/openai/codex/pull/1743, we
started receiving keyup events, but didn't expect them anywhere in our
code. for now, just don't dispatch them at all.
2025-07-31 17:00:48 -07:00
pakrym-oai
0935e6a875 Send account id when available (#1767)
For users with multiple accounts we need to specify the account to use.
2025-07-31 15:40:19 -07:00
easong-openai
6ce0a5875b Initial planning tool (#1753)
We need to optimize the prompt, but this causes the model to use the new
planning_tool.

<img width="765" height="110" alt="image"
src="https://github.com/user-attachments/assets/45633f7f-3c85-4e60-8b80-902f1b3b508d"
/>
2025-07-31 20:45:52 +00:00
Michael Bolin
5a0ad5ab8f chore: refactor exec.rs: create separate seatbelt.rs and spawn.rs files (#1762)
At 550 lines, `exec.rs` was a bit large. In particular, I found it hard
to locate the Seatbelt-related code quickly without a file with
`seatbelt` in the name, so this refactors things so:

- `spawn_command_under_seatbelt()` and dependent code moves to a new
`seatbelt.rs` file
- `spawn_child_async()` and dependent code moves to a new `spawn.rs`
file
2025-07-31 13:11:47 -07:00
easong-openai
9aa11269a5 Fix double-scrolling in approval model (#1754)
Previously, pressing up or down arrow in the new approval modal would be
the equivalent of two up or down presses.
2025-07-31 19:41:32 +00:00
Michael Bolin
06c786b2da fix: ensure PatchApplyBeginEvent and PatchApplyEndEvent are dispatched reliably (#1760)
This is a follow-up to https://github.com/openai/codex/pull/1705, as
that PR inadvertently lost the logic where `PatchApplyBeginEvent` and
`PatchApplyEndEvent` events were sent when patches were auto-approved.

Though as part of this fix, I believe this also makes an important
safety fix to `assess_patch_safety()`, as there was a case that returned
`SandboxType::None`, which arguably is the thing we were trying to avoid
in #1705.

On a high level, we want there to be only one codepath where
`apply_patch` happens, which should be unified with the patch to run
`exec`, in general, so that sandboxing is applied consistently for both
cases.

Prior to this change, `apply_patch()` in `core` would either:

* exit early, delegating to `exec()` to shell out to `apply_patch` using
the appropriate sandbox
* proceed to run the logic for `apply_patch` in memory


549846b29a/codex-rs/core/src/apply_patch.rs (L61-L63)

In this implementation, only the latter would dispatch
`PatchApplyBeginEvent` and `PatchApplyEndEvent`, though the former would
dispatch `ExecCommandBeginEvent` and `ExecCommandEndEvent` for the
`apply_patch` call (or, more specifically, the `codex
--codex-run-as-apply-patch PATCH` call).

To unify things in this PR, we:

* Eliminate the back half of the `apply_patch()` function, and instead
have it also return with `DelegateToExec`, though we add an extra field
to the return value, `user_explicitly_approved_this_action`.
* In `codex.rs` where we process `DelegateToExec`, we use
`SandboxType::None` when `user_explicitly_approved_this_action` is
`true`. This means **we no longer run the apply_patch logic in memory**,
as we always `exec()`. (Note this is what allowed us to delete so much
code in `apply_patch.rs`.)
* In `codex.rs`, we further update `notify_exec_command_begin()` and
`notify_exec_command_end()` to take additional fields to determine what
type of notification to send: `ExecCommand` or `PatchApply`.

Admittedly, this PR also drops some of the functionality about giving
the user the opportunity to expand the set of writable roots as part of
approving the `apply_patch` command. I'm not sure how much that was
used, and we should probably rethink how that works as we are currently
tidying up the protocol to the TUI, in general.
2025-07-31 11:13:57 -07:00
pakrym-oai
549846b29a Add codex login --api-key (#1759)
Allow setting the API key via `codex login --api-key`
2025-07-31 17:48:49 +00:00
Jeremy Rose
96654a5d52 clamp render area to terminal size (#1758)
this fixes a couple of panics that would happen when trying to render
something larger than the terminal, or insert history lines when the top
of the viewport is at y=0.
2025-07-31 09:59:36 -07:00
easong-openai
861ba86403 Show error message after panic (#1752)
Previously we were swallowing errors and silently exiting, which isn't
great for helping users help us.
2025-07-31 09:19:08 -07:00
Jeremy Rose
be0cd34300 fix git tests (#1747)
the git tests were failing on my local machine due to gpg signing config
in my ~/.gitconfig. tests should not be affected by ~/.gitconfig, so
configure them to ignore it.
2025-07-31 09:17:59 -07:00
Jeremy Rose
d86270696e streamline ui (#1733)
Simplify and improve many UI elements.
* Remove all-around borders in most places. These interact badly with
terminal resizing and look heavy. Prefer left-side-only borders.
* Make the viewport adjust to the size of its contents.
* <kbd>/</kbd> and <kbd>@</kbd> autocomplete boxes appear below the
prompt, instead of above it.
* Restyle the keyboard shortcut hints & move them to the left.
* Restyle the approval dialog.
* Use synchronized rendering to avoid flashing during rerenders.


https://github.com/user-attachments/assets/96f044af-283b-411c-b7fc-5e6b8a433c20

<img width="1117" height="858" alt="Screenshot 2025-07-30 at 5 29 20 PM"
src="https://github.com/user-attachments/assets/0cc0af77-8396-429b-b6ee-9feaaccdbee7"
/>
2025-07-31 00:43:21 -07:00
pap-openai
defeafb279 add keyboard enhancements to support shift_return (#1743)
For terminal that supports [keyboard
enhancements](https://docs.rs/libcrossterm/latest/crossterm/enum.KeyboardEnhancementFlags.html),
adds the enhancements (enabling [kitty keyboard
protocol](https://sw.kovidgoyal.net/kitty/keyboard-protocol/)) to
support shift+enter listener.

Those users (users with terminals listed on
[KPP](https://sw.kovidgoyal.net/kitty/keyboard-protocol/)) should be
able to press shift+return for new line

---------

Co-authored-by: easong-openai <easong@openai.com>
2025-07-31 03:23:56 +00:00
pakrym-oai
51b6bdefbe Auto format toml (#1745)
Add recommended extension and configure it to auto format prompt.
2025-07-30 18:37:00 -07:00
Michael Bolin
35010812c7 chore: add support for a new label, codex-rust-review (#1744)
The goal of this change is to try an experiment where we try to get AI
to take on more of the code review load. The idea is that once you
believe your PR is ready for review, please add the `codex-rust-review`
label (as opposed to the `codex-review` label).

Admittedly the corresponding prompt currently represents my personal
biases in terms of code review, but we should massage it over time to
represent the team's preferences.
2025-07-30 17:49:07 -07:00
Jeremy Rose
f2134f6633 resizable viewport (#1732)
Proof of concept for a resizable viewport.

The general approach here is to duplicate the `Terminal` struct from
ratatui, but with our own logic. This is a "light fork" in that we are
still using all the base ratatui functions (`Buffer`, `Widget` and so
on), but we're doing our own bookkeeping at the top level to determine
where to draw everything.

This approach could use improvement—e.g, when the window is resized to a
smaller size, if the UI wraps, we don't correctly clear out the
artifacts from wrapping. This is possible with a little work (i.e.
tracking what parts of our UI would have been wrapped), but this
behavior is at least at par with the existing behavior.


https://github.com/user-attachments/assets/4eb17689-09fd-4daa-8315-c7ebc654986d


cc @joshka who might have Thoughts™
2025-07-31 00:06:55 +00:00
Michael Bolin
221ebfcccc fix: run apply_patch calls through the sandbox (#1705)
Building on the work of https://github.com/openai/codex/pull/1702, this
changes how a shell call to `apply_patch` is handled.

Previously, a shell call to `apply_patch` was always handled in-process,
never leveraging a sandbox. To determine whether the `apply_patch`
operation could be auto-approved, the
`is_write_patch_constrained_to_writable_paths()` function would check if
all the paths listed in the paths were writable. If so, the agent would
apply the changes listed in the patch.

Unfortunately, this approach afforded a loophole: symlinks!

* For a soft link, we could fix this issue by tracing the link and
checking whether the target is in the set of writable paths, however...
* ...For a hard link, things are not as simple. We can run `stat FILE`
to see if the number of links is greater than 1, but then we would have
to do something potentially expensive like `find . -inum <inode_number>`
to find the other paths for `FILE`. Further, even if this worked, this
approach runs the risk of a
[TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
race condition, so it is not robust.

The solution, implemented in this PR, is to take the virtual execution
of the `apply_patch` CLI into an _actual_ execution using `codex
--codex-run-as-apply-patch PATCH`, which we can run under the sandbox
the user specified, just like any other `shell` call.

This, of course, assumes that the sandbox prevents writing through
symlinks as a mechanism to write to folders that are not in the writable
set configured by the sandbox. I verified this by testing the following
on both Mac and Linux:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?

# Codex is run in SANDBOX_DIR, so writes should be constrianed to this directory.
SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
# EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)

echo "SANDBOX_DIR: $SANDBOX_DIR"
echo "EXPLOIT_DIR: $EXPLOIT_DIR"

cleanup() {
  # Only remove if it looks sane and still exists
  [[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
  [[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
}

trap cleanup EXIT

echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"

# Drop the -s to test hard links.
ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"

cat "${SANDBOX_DIR}/link-to-original.txt"

if [[ "$(uname)" == "Linux" ]]; then
    SANDBOX_SUBCOMMAND=landlock
else
    SANDBOX_SUBCOMMAND=seatbelt
fi

# Attempt the exploit
cd "${SANDBOX_DIR}"

codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true

cat "${EXPLOIT_DIR}/original.txt"
```

Admittedly, this change merits a proper integration test, but I think I
will have to do that in a follow-up PR.
2025-07-30 16:45:08 -07:00
pakrym-oai
301ec72107 Add login status command (#1716)
Print the current login mode, sanitized key and return an appropriate
status.
2025-07-30 14:09:26 -07:00
pakrym-oai
e0e245cc1c Send AGENTS.md as a separate user message (#1737) 2025-07-30 13:56:24 -07:00
aibrahim-oai
2f5557056d moving input item from MCP Protocol back to core Protocol (#1740)
- Currently we have duplicate input item. Let's have one source of truth
in the core.
- Used Requestid type
2025-07-30 13:43:08 -07:00
pakrym-oai
ea01a5ffe2 Add support for a separate chatgpt auth endpoint (#1712)
Adds a `CodexAuth` type that encapsulates information about available
auth modes and logic for refreshing the token.
Changes `Responses` API to send requests to different endpoints based on
the auth type.
Updates login_with_chatgpt to support API-less mode and skip the key
exchange.
2025-07-30 19:40:15 +00:00
aibrahim-oai
93341797c4 fix ci (#1739)
I think this commit broke the CI because it changed the
`McpToolCallBeginEvent` type:
347c81ad00
2025-07-30 11:32:38 -07:00
Jeremy Rose
347c81ad00 remove conversation history widget (#1727)
this widget is no longer used.
2025-07-30 10:05:40 -07:00
aibrahim-oai
3823b32b7a Mcp protocol (#1715)
- Add typed MCP protocol surface in
`codex-rs/mcp-server/src/mcp_protocol.rs` for `requests`, `responses`,
and `notifications`
- Requests: `NewConversation`, `Connect`, `SendUserMessage`,
`GetConversations`
- Message content parts: `Text`, `Image` (`ImageUrl`/`FileId`, optional
`ImageDetail`), File (`Url`/`Id`/`inline Data`)
- Responses: `ToolCallResponseEnvelope` with optional `isError` and
`structuredContent` variants (`NewConversation`, `Connect`,
`SendUserMessageAccepted`, `GetConversations`)
- Notifications: `InitialState`, `ConnectionRevoked`, `CodexEvent`,
`Cancelled`
- Uniform `_meta` on `notifications` via `NotificationMeta`
(`conversationId`, `requestId`)
- Unit tests validate JSON wire shapes for key
`requests`/`responses`/`notifications`
2025-07-29 20:14:41 -07:00
pakrym-oai
6b10e22eb3 Trim bash lc and run with login shell (#1725)
include .zshenv, .zprofile by running with the `-l` flag and don't start
a shell inside a shell when we see the typical `bash -lc` invocation.
2025-07-29 16:49:02 -07:00
Gabriel Peal
8828f6f082 Add an experimental plan tool (#1726)
This adds a tool the model can call to update a plan. The tool doesn't
actually _do_ anything but it gives clients a chance to read and render
the structured plan. We will likely iterate on the prompt and tools
exposed for planning over time.
2025-07-29 14:22:02 -04:00
easong-openai
f8fcaaaf6f Relative instruction file (#1722)
Passing in an instruction file with a bad path led to silent failures,
also instruction relative paths were handled in an unintuitive fashion.
2025-07-29 10:06:05 -07:00
Jeremy Rose
fc85f4812f feat: map ^U to kill-line-to-head (#1711)
see
[discussion](https://github.com/rhysd/tui-textarea/issues/51#issuecomment-3021191712),
it's surprising that ^U behaves this way. IMO the undo/redo
functionality in tui-textarea isn't good enough to be worth preserving,
but if we do bring it back it should probably be on C-z / C-S-z / C-y.
2025-07-29 09:40:26 -07:00
easong-openai
efe7f3c793 alternate login wording? (#1723)
Co-authored-by: Jeremy Rose <172423086+nornagon-openai@users.noreply.github.com>
2025-07-29 16:23:09 +00:00
Jeremy Rose
f66704a88f replace login screen with a simple prompt (#1713)
Perhaps there was an intention to make the login screen prettier, but it
feels quite silly right now to just have a screen that says "press q",
so replace it with something that lets the user directly login without
having to quit the app.

<img width="1283" height="635" alt="Screenshot 2025-07-28 at 2 54 05 PM"
src="https://github.com/user-attachments/assets/f19e5595-6ef9-4a2d-b409-aa61b30d3628"
/>
2025-07-28 17:25:14 -07:00
Dylan
094d7af8c3 [mcp-server] Populate notifications._meta with requestId (#1704)
## Summary
Per the [latest MCP
spec](https://modelcontextprotocol.io/specification/2025-06-18/basic#meta),
the `_meta` field is reserved for metadata. In the [Typescript
Schema](0695a497eb/schema/2025-06-18/schema.ts (L37-L40)),
`progressToken` is defined as a value to be attached to subsequent
notifications for that request.

The
[CallToolRequestParams](0695a497eb/schema/2025-06-18/schema.ts (L806-L817))
extends this definition but overwrites the params field. This ambiguity
makes our generated type definitions tricky, so I'm going to skip
`progressToken` field for now and just send back the `requestId`
instead.
 
In a future PR, we can clarify, update our `generate_mcp_types.py`
script, and update our progressToken logic accordingly.

## Testing
- [x] Added unit tests
- [x] Manually tested with mcp client
2025-07-28 13:32:09 -07:00
Jeremy Rose
2d2df891bb fix: long lines incorrectly wrapped (#1710)
fix to #1685.
2025-07-28 12:19:03 -07:00
easong-openai
80c19ea77c Fix approval workflow (#1696)
(Hopefully) temporary solution to the invisible approvals problem -
prints commands to history when they need approval and then also prints
the result of the approval. In the near future we should be able to do
some fancy stuff with updating commands before writing them to permanent
history.

Also, ctr-c while in the approval modal now acts as esc (aborts command)
and puts the TUI in the state where one additional ctr-c will exit.
2025-07-28 19:00:06 +00:00
aibrahim-oai
19bef7659f Serializing the eventmsg type to snake_case (#1709)
This was an abrupt change on our clients. We need to serialize as
snake_case.
2025-07-28 10:26:27 -07:00
Michael Bolin
5ebb7dd34c chore: split apply_patch logic out of codex.rs and into apply_patch.rs (#1703)
This is a straight refactor, moving apply-patch-related code from
`codex.rs` and into the new `apply_patch.rs` file. The only "logical"
change is inlining `#[allow(clippy::unwrap_used)]` instead of declaring
`#![allow(clippy::unwrap_used)]` at the top of the file (which is
currently the case in `codex.rs`).

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1703).
* #1705
* __->__ #1703
* #1702
* #1698
* #1697
2025-07-28 09:51:22 -07:00
Michael Bolin
d76f96ce79 fix: support special --codex-run-as-apply-patch arg (#1702)
This introduces some special behavior to the CLIs that are using the
`codex-arg0` crate where if `arg1` is `--codex-run-as-apply-patch`, then
it will run as if `apply_patch arg2` were invoked. This is important
because it means we can do things like:

```
SANDBOX_TYPE=landlock # or seatbelt for macOS
codex debug "${SANDBOX_TYPE}" -- codex --codex-run-as-apply-patch PATCH
```

which gives us a way to run `apply_patch` while ensuring it adheres to
the sandbox the user specified.

While it would be nice to use the `arg0` trick like we are currently
doing for `codex-linux-sandbox`, there is no way to specify the `arg0`
for the underlying command when running under `/usr/bin/sandbox-exec`,
so it will not work for us in this case.

Admittedly, we could have also supported this via a custom environment
variable (e.g., `CODEX_ARG0`), but since environment variables are
inherited by child processes, that seemed like a potentially leakier
abstraction.

This change, as well as our existing reliance on checking `arg0`, place
additional requirements on those who include `codex-core`. Its
`README.md` has been updated to reflect this.

While we could have just added an `apply-patch` subcommand to the
`codex` multitool CLI, that would not be sufficient for the standalone
`codex-exec` CLI, which is something that we distribute as part of our
GitHub releases for those who know they will not be using the TUI and
therefore prefer to use a slightly smaller executable:

https://github.com/openai/codex/releases/tag/rust-v0.10.0

To that end, this PR adds an integration test to ensure that the
`--codex-run-as-apply-patch` option works with the standalone
`codex-exec` CLI.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1702).
* #1705
* #1703
* __->__ #1702
* #1698
* #1697
2025-07-28 09:26:44 -07:00
Michael Bolin
fcd197d596 fix: use std::env::args_os instead of std::env::args (#1698)
Apparently `std::env::args()` will panic during iteration if any
argument to the process is not valid Unicode:

https://doc.rust-lang.org/std/env/fn.args.html

Let's avoid the risk and just go with `std::env::args_os()`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1698).
* #1705
* #1703
* #1702
* __->__ #1698
* #1697
2025-07-28 08:52:18 -07:00
Michael Bolin
9102255854 fix: move arg0 handling out of codex-linux-sandbox and into its own crate (#1697) 2025-07-28 08:31:24 -07:00
Jeremy Rose
7ecd3153a8 fix: correctly wrap history items (#1685)
The overall idea here is: skip ratatui for writing into scrollback,
because its primitives are wrong. We want to render full lines of text,
that will be wrapped natively by the terminal, and which we never plan
to update using ratatui (so the `Buffer` struct is overhead and in fact
an inhibition).

Instead, we use ANSI scrolling regions (link reference doc to come).
Essentially, we:
1. Define a scrolling region that extends from the top of the prompt
area all the way to the top of scrollback
2. Scroll that region up by N < (screen_height - viewport_height) lines,
in this PR N=1
3. Put our cursor at the top of the newly empty region
4. Print out our new text like normal

The terminal interactions here (write_spans and its dependencies) are
mostly extracted from ratatui.
2025-07-28 14:45:49 +00:00
Michael Bolin
2405c40026 chore: update Codex::spawn() to return a struct instead of a tuple (#1677)
Also update `init_codex()` to return a `struct` instead of a tuple, as well.
2025-07-27 20:01:35 -07:00
easong-openai
58bed77ba7 Remove tab focus switching (#1694)
Previously pressing tab would switch TUI focus to the history scrollbox - no longer necessary.
2025-07-27 11:04:09 -07:00
aibrahim-oai
5a0079fea2 Changing method in MCP notifications (#1684)
- Changing the codex/event type
2025-07-26 10:35:49 -07:00
Jeremy Rose
c66c99c5b5 fix: crash on resize (#1683)
Without this, resizing the terminal prints "Error: The cursor position
could not be read within a normal duration" and quits the app.
2025-07-25 14:23:38 -07:00
Jeremy Rose
75b4008094 fix: paste with newlines (#1682)
This fixes an issue where pasting multi-line content would break the
composer.
2025-07-25 19:26:40 +00:00
pakrym-oai
7ee87123a6 Optionally run using user profile (#1678) 2025-07-25 11:45:23 -07:00
Michael Bolin
994c9a874d chore: use one write call per item in rollout_writer() (#1679)
Most of the time, we expect the `String` returned by
`serde_json::to_string()` to have extra capacity, so `push('\n')` is
unlikely to allocate, which seems cheaper than an extra `write(2)` call,
on average?
2025-07-25 10:43:36 -07:00
easong-openai
480e82b00d Easily Selectable History (#1672)
This update replaces the previous ratatui history widget with an
append-only log so that the terminal can handle text selection and
scrolling. It also disables streaming responses, which we'll do our best
to bring back in a later PR. It also adds a small summary of token use
after the TUI exits.
2025-07-25 01:56:40 -07:00
Pavel Bezglasny
508abbe990 Update render name in tui for approval_policy to match with config values (#1675)
Currently, codex on start shows the value for the approval policy as
name of
[AskForApproval](2437a8d17a/codex-rs/core/src/protocol.rs (L128))
enum, which differs from
[approval_policy](2437a8d17a/codex-rs/config.md (approval_policy))
config values.
E.g. "untrusted" becomes "UnlessTrusted", "on-failure" -> "OnFailure",
"never" -> "Never".
This PR changes render names of the approval policy to match with
configuration values.
2025-07-24 14:17:57 -07:00
Michael Bolin
a1641743a8 feat: expand the set of commands that can be safely identified as "trusted" (#1668)
This PR updates `is_known_safe_command()` to account for "safe
operators" to expand the set of commands that can be run without
approval. This concept existed in the TypeScript CLI, and we are
[finally!] porting it to the Rust one:


c9e2def494/codex-cli/src/approvals.ts (L531-L541)

The idea is that if we have `EXPR1 SAFE_OP EXPR2` and `EXPR1` and
`EXPR2` are considered safe independently, then `EXPR1 SAFE_OP EXPR2`
should be considered safe. Currently, `SAFE_OP` includes `&&`, `||`,
`;`, and `|`.

In the TypeScript implementation, we relied on
https://www.npmjs.com/package/shell-quote to parse the string of Bash,
as it could provide a "lightweight" parse tree, parsing `'beep || boop >
/byte'` as:

```
[ 'beep', { op: '||' }, 'boop', { op: '>' }, '/byte' ]
```

Though in this PR, we introduce the use of
https://crates.io/crates/tree-sitter-bash for parsing (which
incidentally we were already using in
[`codex-apply-patch`](c9e2def494/codex-rs/apply-patch/Cargo.toml (L18))),
which gives us a richer parse tree. (Incidentally, if you have never
played with tree-sitter, try the
[playground](https://tree-sitter.github.io/tree-sitter/7-playground.html)
and select **Bash** from the dropdown to see how it parses various
expressions.)

As a concrete example, prior to this change, our implementation of
`is_known_safe_command()` could verify things like:

```
["bash", "-lc", "grep -R \"Cargo.toml\" -n"]
```

but not:

```
["bash", "-lc", "grep -R \"Cargo.toml\" -n || true"]
```

With this change, the version with `|| true` is also accepted.

Admittedly, this PR does not expand the safety check to support
subshells, so it would reject, e.g. `bash -lc 'ls || (pwd && echo hi)'`,
but that can be addressed in a subsequent PR.
2025-07-24 14:13:30 -07:00
Michael Bolin
c9e2def494 fix: add true,false,nl to the list of trusted commands (#1676)
`nl` is a line-numbering tool that should be on the _trusted _ list, as
there is nothing concerning on https://gtfobins.github.io/gtfobins/nl/
that would merit exclusion.

`true` and `false` are also safe, though not particularly useful given
how `is_known_safe_command()` works today, but that will change with
https://github.com/openai/codex/pull/1668.
2025-07-24 12:59:36 -07:00
Michael Bolin
7af9cedbd7 fix: create separate test_support crates to eliminate #[allow(dead_code)] (#1667)
Because of a quirk of how implementation tests work in Rust, we had a
number of `#[allow(dead_code)]` annotations that were misleading because
the functions _were_ being used, just not by all integration tests in a
`tests/` folder, so when compiling the test that did not use the
function, clippy would complain that it was unused.

This fixes things by create a "test_support" crate under the `tests/`
folder that is imported as a dev dependency for the respective crate.
2025-07-24 12:19:46 -07:00
vishnu-oai
2437a8d17a Record Git metadata to rollout (#1598)
# Summary

- Writing effective evals for codex sessions requires context of the
overall repository state at the moment the session began
- This change adds this metadata (git repository, branch, commit hash)
to the top of the rollout of the session (if available - if not it
doesn't add anything)
- Currently, this is only effective on a clean working tree, as we can't
track uncommitted/untracked changes with the current metadata set.
Ideally in the future we may want to track unclean changes somehow, or
perhaps prompt the user to stash or commit them.

# Testing
- Added unit tests
- `cargo test && cargo clippy --tests && cargo fmt -- --config
imports_granularity=Item`

### Resulting Rollout
<img width="1243" height="127" alt="Screenshot 2025-07-17 at 1 50 00 PM"
src="https://github.com/user-attachments/assets/68108941-f015-45b2-985c-ea315ce05415"
/>
2025-07-24 11:35:28 -07:00
dependabot[bot]
d2be0720b5 chore(deps): bump toml from 0.9.1 to 0.9.2 in /codex-rs (#1562)
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.1 to 0.9.2.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c28f9ac30f"><code>c28f9ac</code></a>
chore: Release</li>
<li><a
href="f3a2299148"><code>f3a2299</code></a>
docs: Update changelog</li>
<li><a
href="69f09d3093"><code>69f09d3</code></a>
fix(lex): Don't loop over ')' for forever (<a
href="https://redirect.github.com/toml-rs/toml/issues/1003">#1003</a>)</li>
<li><a
href="cc68ae4f42"><code>cc68ae4</code></a>
fix(lex): Don't loop over ')' for forever</li>
<li>See full diff in <a
href="https://github.com/toml-rs/toml/compare/toml-v0.9.1...toml-v0.9.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=toml&package-manager=cargo&previous-version=0.9.1&new-version=0.9.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 17:22:05 -07:00
dependabot[bot]
173386eeac chore(deps): bump tree-sitter from 0.25.6 to 0.25.8 in /codex-rs (#1561)
Bumps [tree-sitter](https://github.com/tree-sitter/tree-sitter) from
0.25.6 to 0.25.8.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f2f197b6b2"><code>f2f197b</code></a>
0.25.8</li>
<li><a
href="8bb33f7d8c"><code>8bb33f7</code></a>
perf: reorder conditional operands</li>
<li><a
href="6f944de32f"><code>6f944de</code></a>
fix(generate): propagate node types error</li>
<li><a
href="c15938532d"><code>c159385</code></a>
0.25.7</li>
<li><a
href="94b55bfcdc"><code>94b55bf</code></a>
perf: reorder expensive conditional operand</li>
<li><a
href="bcb30f7951"><code>bcb30f7</code></a>
fix(generate): use topological sort for subtype map</li>
<li><a
href="3bd8f7df8e"><code>3bd8f7d</code></a>
perf: More efficient computation of used symbols</li>
<li><a
href="d7529c3265"><code>d7529c3</code></a>
perf: reserve <code>Vec</code> capacities where appropriate</li>
<li><a
href="bf4217f0ff"><code>bf4217f</code></a>
fix(web): wasm export paths</li>
<li><a
href="bb7b339ae2"><code>bb7b339</code></a>
Fix 'extra' field generation for node-types.json</li>
<li>Additional commits viewable in <a
href="https://github.com/tree-sitter/tree-sitter/compare/v0.25.6...v0.25.8">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tree-sitter&package-manager=cargo&previous-version=0.25.6&new-version=0.25.8)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 16:59:05 -07:00
dependabot[bot]
4a57afaaf2 chore(deps): bump strum_macros from 0.27.1 to 0.27.2 in /codex-rs (#1638)
Bumps [strum_macros](https://github.com/Peternator7/strum) from 0.27.1
to 0.27.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/releases">strum_macros's
releases</a>.</em></p>
<blockquote>
<h2>v0.27.2</h2>
<h2>What's Changed</h2>
<ul>
<li>Adding support for doc comments on <code>EnumDiscriminants</code>
generated type… by <a
href="https://github.com/linclelinkpart5"><code>@​linclelinkpart5</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/141">Peternator7/strum#141</a></li>
<li>Drop needless <code>rustversion</code> dependency by <a
href="https://github.com/paolobarbolini"><code>@​paolobarbolini</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/446">Peternator7/strum#446</a></li>
<li>Upgrade <code>phf</code> to v0.12 by <a
href="https://github.com/paolobarbolini"><code>@​paolobarbolini</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/448">Peternator7/strum#448</a></li>
<li>allow discriminants on empty enum by <a
href="https://github.com/crop2000"><code>@​crop2000</code></a> in <a
href="https://redirect.github.com/Peternator7/strum/pull/435">Peternator7/strum#435</a></li>
<li>Remove broken link to EnumTable docs by <a
href="https://github.com/schneems"><code>@​schneems</code></a> in <a
href="https://redirect.github.com/Peternator7/strum/pull/427">Peternator7/strum#427</a></li>
<li>Change enum table callbacks to FnMut. by <a
href="https://github.com/ClaytonKnittel"><code>@​ClaytonKnittel</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/443">Peternator7/strum#443</a></li>
<li>Add <code>#[automatically_derived]</code> to the <code>impl</code>s
by <a
href="https://github.com/dandedotdev"><code>@​dandedotdev</code></a> in
<a
href="https://redirect.github.com/Peternator7/strum/pull/444">Peternator7/strum#444</a></li>
<li>Implement a <code>suffix</code> attribute for serialization of enum
variants by <a
href="https://github.com/amogh-dambal"><code>@​amogh-dambal</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/440">Peternator7/strum#440</a></li>
<li>Expound upon use_phf docs by <a
href="https://github.com/Peternator7"><code>@​Peternator7</code></a> in
<a
href="https://redirect.github.com/Peternator7/strum/pull/449">Peternator7/strum#449</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/paolobarbolini"><code>@​paolobarbolini</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/446">Peternator7/strum#446</a></li>
<li><a href="https://github.com/crop2000"><code>@​crop2000</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/435">Peternator7/strum#435</a></li>
<li><a href="https://github.com/schneems"><code>@​schneems</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/427">Peternator7/strum#427</a></li>
<li><a
href="https://github.com/ClaytonKnittel"><code>@​ClaytonKnittel</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/443">Peternator7/strum#443</a></li>
<li><a
href="https://github.com/dandedotdev"><code>@​dandedotdev</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/444">Peternator7/strum#444</a></li>
<li><a
href="https://github.com/amogh-dambal"><code>@​amogh-dambal</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/440">Peternator7/strum#440</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/Peternator7/strum/compare/v0.27.1...v0.27.2">https://github.com/Peternator7/strum/compare/v0.27.1...v0.27.2</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum_macros's
changelog</a>.</em></p>
<blockquote>
<h2>0.27.2</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/141">#141</a>:
Adding support for doc comments on <code>EnumDiscriminants</code>
generated type.</p>
<ul>
<li>The doc comment will be copied from the variant on the type
itself.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/435">#435</a>:allow
discriminants on empty enum.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/443">#443</a>:
Change enum table callbacks to FnMut.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>:
Add <code>#[automatically_derived]</code> to the <code>impl</code>s by
<a href="https://github.com/dandedotdev"><code>@​dandedotdev</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/444">Peternator7/strum#444</a></p>
<ul>
<li>This should make the linter less noisy with warnings in generated
code.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/440">#440</a>:
Implement a <code>suffix</code> attribute for serialization of enum
variants.</p>
<pre lang="rust"><code>#[derive(strum::Display)]
#[strum(suffix=&quot;.json&quot;)]
#[strum(serialize_all=&quot;snake_case&quot;)]
enum StorageConfiguration {
  PostgresProvider,
  S3StorageProvider,
  AzureStorageProvider,
}
<p>fn main() {
let response = SurveyResponse::Other(&quot;It was good&quot;.into());
println!(&quot;Loading configuration from: {}&quot;,
StorageConfiguration::PostgresProvider);
// prints: Loaded Configuration from: postgres_provider.json
}
</code></pre></p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/446">#446</a>:
Drop needless <code>rustversion</code> dependency.</p>
</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="38f66210e7"><code>38f6621</code></a>
Expound upon use_phf docs (<a
href="https://redirect.github.com/Peternator7/strum/issues/449">#449</a>)</li>
<li><a
href="bb1339026b"><code>bb13390</code></a>
Implement a <code>suffix</code> attribute for serialization of enum
variants (<a
href="https://redirect.github.com/Peternator7/strum/issues/440">#440</a>)</li>
<li><a
href="c9e52bfd28"><code>c9e52bf</code></a>
Add <code>#[automatically_derived]</code> to the <code>impl</code>s (<a
href="https://redirect.github.com/Peternator7/strum/issues/444">#444</a>)</li>
<li><a
href="1b00f899e5"><code>1b00f89</code></a>
Change enum table callbacks to FnMut. (<a
href="https://redirect.github.com/Peternator7/strum/issues/443">#443</a>)</li>
<li><a
href="6e2ca25fba"><code>6e2ca25</code></a>
Remove broken link to EnumTable docs (<a
href="https://redirect.github.com/Peternator7/strum/issues/427">#427</a>)</li>
<li><a
href="9503781141"><code>9503781</code></a>
allow discriminants on empty enum (<a
href="https://redirect.github.com/Peternator7/strum/issues/435">#435</a>)</li>
<li><a
href="8553ba2845"><code>8553ba2</code></a>
Upgrade <code>phf</code> to v0.12 (<a
href="https://redirect.github.com/Peternator7/strum/issues/448">#448</a>)</li>
<li><a
href="2eba5c2a5c"><code>2eba5c2</code></a>
Drop needless <code>rustversion</code> dependency (<a
href="https://redirect.github.com/Peternator7/strum/issues/446">#446</a>)</li>
<li><a
href="f301b67d91"><code>f301b67</code></a>
Merge branch 'linclelinkpart5-master-2'</li>
<li><a
href="455b2bf859"><code>455b2bf</code></a>
Merge branch 'master' of <a
href="https://github.com/linclelinkpart5/strum">https://github.com/linclelinkpart5/strum</a>
into lincle...</li>
<li>See full diff in <a
href="https://github.com/Peternator7/strum/compare/v0.27.1...v0.27.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=strum_macros&package-manager=cargo&previous-version=0.27.1&new-version=0.27.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 16:34:16 -07:00
dependabot[bot]
9f645353e9 chore(deps): bump strum from 0.27.1 to 0.27.2 in /codex-rs (#1639)
Bumps [strum](https://github.com/Peternator7/strum) from 0.27.1 to
0.27.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/releases">strum's
releases</a>.</em></p>
<blockquote>
<h2>v0.27.2</h2>
<h2>What's Changed</h2>
<ul>
<li>Adding support for doc comments on <code>EnumDiscriminants</code>
generated type… by <a
href="https://github.com/linclelinkpart5"><code>@​linclelinkpart5</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/141">Peternator7/strum#141</a></li>
<li>Drop needless <code>rustversion</code> dependency by <a
href="https://github.com/paolobarbolini"><code>@​paolobarbolini</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/446">Peternator7/strum#446</a></li>
<li>Upgrade <code>phf</code> to v0.12 by <a
href="https://github.com/paolobarbolini"><code>@​paolobarbolini</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/448">Peternator7/strum#448</a></li>
<li>allow discriminants on empty enum by <a
href="https://github.com/crop2000"><code>@​crop2000</code></a> in <a
href="https://redirect.github.com/Peternator7/strum/pull/435">Peternator7/strum#435</a></li>
<li>Remove broken link to EnumTable docs by <a
href="https://github.com/schneems"><code>@​schneems</code></a> in <a
href="https://redirect.github.com/Peternator7/strum/pull/427">Peternator7/strum#427</a></li>
<li>Change enum table callbacks to FnMut. by <a
href="https://github.com/ClaytonKnittel"><code>@​ClaytonKnittel</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/443">Peternator7/strum#443</a></li>
<li>Add <code>#[automatically_derived]</code> to the <code>impl</code>s
by <a
href="https://github.com/dandedotdev"><code>@​dandedotdev</code></a> in
<a
href="https://redirect.github.com/Peternator7/strum/pull/444">Peternator7/strum#444</a></li>
<li>Implement a <code>suffix</code> attribute for serialization of enum
variants by <a
href="https://github.com/amogh-dambal"><code>@​amogh-dambal</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/440">Peternator7/strum#440</a></li>
<li>Expound upon use_phf docs by <a
href="https://github.com/Peternator7"><code>@​Peternator7</code></a> in
<a
href="https://redirect.github.com/Peternator7/strum/pull/449">Peternator7/strum#449</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/paolobarbolini"><code>@​paolobarbolini</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/446">Peternator7/strum#446</a></li>
<li><a href="https://github.com/crop2000"><code>@​crop2000</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/435">Peternator7/strum#435</a></li>
<li><a href="https://github.com/schneems"><code>@​schneems</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/427">Peternator7/strum#427</a></li>
<li><a
href="https://github.com/ClaytonKnittel"><code>@​ClaytonKnittel</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/443">Peternator7/strum#443</a></li>
<li><a
href="https://github.com/dandedotdev"><code>@​dandedotdev</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/444">Peternator7/strum#444</a></li>
<li><a
href="https://github.com/amogh-dambal"><code>@​amogh-dambal</code></a>
made their first contribution in <a
href="https://redirect.github.com/Peternator7/strum/pull/440">Peternator7/strum#440</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/Peternator7/strum/compare/v0.27.1...v0.27.2">https://github.com/Peternator7/strum/compare/v0.27.1...v0.27.2</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum's
changelog</a>.</em></p>
<blockquote>
<h2>0.27.2</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/141">#141</a>:
Adding support for doc comments on <code>EnumDiscriminants</code>
generated type.</p>
<ul>
<li>The doc comment will be copied from the variant on the type
itself.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/435">#435</a>:allow
discriminants on empty enum.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/443">#443</a>:
Change enum table callbacks to FnMut.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>:
Add <code>#[automatically_derived]</code> to the <code>impl</code>s by
<a href="https://github.com/dandedotdev"><code>@​dandedotdev</code></a>
in <a
href="https://redirect.github.com/Peternator7/strum/pull/444">Peternator7/strum#444</a></p>
<ul>
<li>This should make the linter less noisy with warnings in generated
code.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/440">#440</a>:
Implement a <code>suffix</code> attribute for serialization of enum
variants.</p>
<pre lang="rust"><code>#[derive(strum::Display)]
#[strum(suffix=&quot;.json&quot;)]
#[strum(serialize_all=&quot;snake_case&quot;)]
enum StorageConfiguration {
  PostgresProvider,
  S3StorageProvider,
  AzureStorageProvider,
}
<p>fn main() {
let response = SurveyResponse::Other(&quot;It was good&quot;.into());
println!(&quot;Loading configuration from: {}&quot;,
StorageConfiguration::PostgresProvider);
// prints: Loaded Configuration from: postgres_provider.json
}
</code></pre></p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/446">#446</a>:
Drop needless <code>rustversion</code> dependency.</p>
</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="38f66210e7"><code>38f6621</code></a>
Expound upon use_phf docs (<a
href="https://redirect.github.com/Peternator7/strum/issues/449">#449</a>)</li>
<li><a
href="bb1339026b"><code>bb13390</code></a>
Implement a <code>suffix</code> attribute for serialization of enum
variants (<a
href="https://redirect.github.com/Peternator7/strum/issues/440">#440</a>)</li>
<li><a
href="c9e52bfd28"><code>c9e52bf</code></a>
Add <code>#[automatically_derived]</code> to the <code>impl</code>s (<a
href="https://redirect.github.com/Peternator7/strum/issues/444">#444</a>)</li>
<li><a
href="1b00f899e5"><code>1b00f89</code></a>
Change enum table callbacks to FnMut. (<a
href="https://redirect.github.com/Peternator7/strum/issues/443">#443</a>)</li>
<li><a
href="6e2ca25fba"><code>6e2ca25</code></a>
Remove broken link to EnumTable docs (<a
href="https://redirect.github.com/Peternator7/strum/issues/427">#427</a>)</li>
<li><a
href="9503781141"><code>9503781</code></a>
allow discriminants on empty enum (<a
href="https://redirect.github.com/Peternator7/strum/issues/435">#435</a>)</li>
<li><a
href="8553ba2845"><code>8553ba2</code></a>
Upgrade <code>phf</code> to v0.12 (<a
href="https://redirect.github.com/Peternator7/strum/issues/448">#448</a>)</li>
<li><a
href="2eba5c2a5c"><code>2eba5c2</code></a>
Drop needless <code>rustversion</code> dependency (<a
href="https://redirect.github.com/Peternator7/strum/issues/446">#446</a>)</li>
<li><a
href="f301b67d91"><code>f301b67</code></a>
Merge branch 'linclelinkpart5-master-2'</li>
<li><a
href="455b2bf859"><code>455b2bf</code></a>
Merge branch 'master' of <a
href="https://github.com/linclelinkpart5/strum">https://github.com/linclelinkpart5/strum</a>
into lincle...</li>
<li>See full diff in <a
href="https://github.com/Peternator7/strum/compare/v0.27.1...v0.27.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=strum&package-manager=cargo&previous-version=0.27.1&new-version=0.27.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 16:07:33 -07:00
Gabriel Peal
db84722080 Fix flaky test (#1664)
Co-authored-by: aibrahim-oai <aibrahim@openai.com>
2025-07-23 18:40:00 -04:00
dependabot[bot]
6e1838e0d8 chore(deps): bump rand from 0.9.1 to 0.9.2 in /codex-rs (#1637)
Bumps [rand](https://github.com/rust-random/rand) from 0.9.1 to 0.9.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-random/rand/blob/master/CHANGELOG.md">rand's
changelog</a>.</em></p>
<blockquote>
<h2>[0.9.2 — 2025-07-20]</h2>
<h3>Deprecated</h3>
<ul>
<li>Deprecate <code>rand::rngs::mock</code> module and
<code>StepRng</code> generator (<a
href="https://redirect.github.com/rust-random/rand/issues/1634">#1634</a>)</li>
</ul>
<h3>Additions</h3>
<ul>
<li>Enable <code>WeightedIndex&lt;usize&gt;</code> (de)serialization (<a
href="https://redirect.github.com/rust-random/rand/issues/1646">#1646</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d3dd415705"><code>d3dd415</code></a>
Prepare rand_core 0.9.2 (<a
href="https://redirect.github.com/rust-random/rand/issues/1605">#1605</a>)</li>
<li><a
href="99fabd20e9"><code>99fabd2</code></a>
Prepare rand_core 0.9.2</li>
<li><a
href="c7fe1c43b5"><code>c7fe1c4</code></a>
rand: re-export <code>rand_core</code> (<a
href="https://redirect.github.com/rust-random/rand/issues/1604">#1604</a>)</li>
<li><a
href="db2b1e0bb4"><code>db2b1e0</code></a>
rand: re-export <code>rand_core</code></li>
<li><a
href="ee1d96f9f5"><code>ee1d96f</code></a>
rand_core: implement reborrow for <code>UnwrapMut</code> (<a
href="https://redirect.github.com/rust-random/rand/issues/1595">#1595</a>)</li>
<li><a
href="e0eb2ee0fc"><code>e0eb2ee</code></a>
rand_core: implement reborrow for <code>UnwrapMut</code></li>
<li><a
href="975f602f5d"><code>975f602</code></a>
fixup clippy 1.85 warnings</li>
<li><a
href="775b05be1b"><code>775b05b</code></a>
Relax <code>Sized</code> requirements for blanket impls (<a
href="https://redirect.github.com/rust-random/rand/issues/1593">#1593</a>)</li>
<li>See full diff in <a
href="https://github.com/rust-random/rand/compare/rand_core-0.9.1...rand_core-0.9.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=rand&package-manager=cargo&previous-version=0.9.1&new-version=0.9.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 15:36:08 -07:00
dependabot[bot]
4fc4e410bd chore(deps-dev): bump @types/node from 24.0.13 to 24.0.15 in /.github/actions/codex (#1636)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/node&package-manager=bun&previous-version=24.0.13&new-version=24.0.15)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 15:32:31 -07:00
dependabot[bot]
6dd62ffa3b chore(deps-dev): bump @types/bun from 1.2.18 to 1.2.19 in /.github/actions/codex (#1635)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/bun&package-manager=bun&previous-version=1.2.18&new-version=1.2.19)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 15:20:47 -07:00
aibrahim-oai
b4ab7c1b73 Flaky CI fix (#1647)
Flushing before sending `TaskCompleteEvent` and ending the submission
loop to avoid race conditions.
2025-07-23 15:03:26 -07:00
Gabriel Peal
084236f717 Add call_id to patch approvals and elicitations (#1660)
Builds on https://github.com/openai/codex/pull/1659 and adds call_id to
a few more places for the same reason.
2025-07-23 15:55:35 -04:00
Gabriel Peal
bc944e77f5 Improve messages emitted for exec failures (#1659)
1. Emit call_id to exec approval elicitations for mcp client convenience
2. Remove the `-retry` from the call id for the same reason as above but
upstream the reset behavior to the mcp client
2025-07-23 14:43:53 -04:00
pakrym-oai
591cb6149a Always send entire request context (#1641)
Always store the entire conversation history.
Request encrypted COT when not storing Responses.
Send entire input context instead of sending previous_response_id
2025-07-23 10:37:45 -07:00
Michael Bolin
d6c4083f98 feat: support dotenv (including ~/.codex/.env) (#1653)
This PR adds a `load_dotenv()` helper function to the `codex-common`
crate that is available when the `cli` feature is enabled. The function
uses [`dotenvy`](https://crates.io/crates/dotenvy) to update the
environment from:

- `$CODEX_HOME/.env`
- `$(pwd)/.env`

To test:

- ran `printenv OPENAI_API_KEY` to verify the env var exists in my
environment
- ran `just codex exec hello` to verify the CLI uses my `OPENAI_API_KEY`
- ran `unset OPENAI_API_KEY`
- ran `just codex exec hello` again and got **ERROR: Missing environment
variable: `OPENAI_API_KEY`**, as expected
- created `~/.codex/.env` and added `OPENAI_API_KEY=sk-proj-...` (also
ran `chmod 400 ~/.codex/.env` for good measure)
- ran `just codex exec hello` again and it worked, verifying it picked
up `OPENAI_API_KEY` from `~/.codex/.env`

Note this functionality was available in the TypeScript CLI:
https://github.com/openai/codex/pull/122 and was recently requested over
on https://github.com/openai/codex/issues/1262#issuecomment-3093203551.
2025-07-22 15:54:33 -07:00
Michael Bolin
3ef544fb95 chore: for release build, build specific targets instead of --all-targets (#1656)
I noticed that releases have taken longer and longer to build.
Originally, I think I did `--all-targets` to be confident that
everything builds cleanly, but that's really the job of CI that runs on
`main`, so we're spending a lot of time in `rust-release.yml` for not
that much additional signal.
2025-07-22 14:35:50 -07:00
aibrahim-oai
01c0896f0f Adding interrupt Support to MCP (#1646) 2025-07-22 20:33:49 +00:00
Michael Bolin
4082246f6a chore: install an extension for TOML syntax highlighting in the devcontainer (#1650)
Small quality-of-life improvement when doing devcontainer development.
2025-07-22 10:58:09 -07:00
pakrym-oai
6d82907082 Add support for custom base instructions (#1645)
Allows providing custom instructions file as a config parameter and
custom instruction text via MCP tool call.
2025-07-22 09:42:22 -07:00
pakrym-oai
ed206d5687 Log response.failed error message and request-id (#1649)
To help with diagnosing failures.
2025-07-22 09:28:00 -07:00
Michael Bolin
d51654822f fix: use PR_SET_PDEATHSIG so to ensure child processes are killed in a timely manner (#1626)
Some users have reported issues where child processes are not cleaned up
after Codex exits (e.g., https://github.com/openai/codex/issues/1570).

This is generally a tricky issue on operating systems: if a parent
process receives `SIGKILL`, then it terminates immediately and cannot
communicate with the child.

**It only helps on Linux**, but this PR introduces the use of `prctl(2)`
so that if the parent process dies, `SIGTERM` will be delivered to the
child process. Whereas previously, I believe that if Codex spawned a
long-running process (like `tsc --watch`) and the Codex process received
`SIGKILL`, the `tsc --watch` process would be reparented to the init
process and would never be killed. Now with the use of `prctl(2)`, the
`tsc --watch` process should receive `SIGTERM` in that scenario.

We still need to come up with a solution for macOS. I've started to look
at `launchd`, but I'm researching a number of options.
2025-07-22 00:41:27 -07:00
Gabriel Peal
710f728124 Add an elicitation for approve patch and refactor tool calls (#1642)
1. Added an elicitation for `approve-patch` which is very similar to
`approve-exec`.
2. Extracted both elicitations to their own files to prevent
`codex_tool_runner` from blowing up in size.
2025-07-22 02:58:41 -04:00
Michael Bolin
6cf4b96f9d fix: check flags to ripgrep when deciding whether the invocation is "trusted" (#1644)
With this change, if any of `--pre`, `--hostname-bin`, `--search-zip`, or `-z` are used with a proposed invocation of `rg`, do not auto-approve.
2025-07-21 22:38:50 -07:00
Dylan
18b2b30841 [mcp-server] Add reply tool call (#1643)
## Summary
Adds a new mcp tool call, `codex-reply`, so we can continue existing
sessions. This is a first draft and does not yet support sessions from
previous processes.

## Testing
- [x] tested with mcp client
2025-07-21 21:01:56 -07:00
Michael Bolin
d49d802b06 test: add integration test for MCP server (#1633)
This PR introduces a single integration test for `cargo mcp`, though it
also introduces a number of reusable components so that it should be
easier to introduce more integration tests going forward.

The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs`
and the reusable pieces are in `codex-rs/mcp-server/tests/common`.

The test itself verifies new functionality around elicitations
introduced in https://github.com/openai/codex/pull/1623 (and the fix
introduced in https://github.com/openai/codex/pull/1629) by doing the
following:

- starts a mock model provider with canned responses for
`/v1/chat/completions`
- starts the MCP server with a `config.toml` to use that model provider
(and `approval_policy = "untrusted"`)
- sends the `codex` tool call which causes the mock model provider to
request a shell call for `git init`
- the MCP server sends an elicitation to the client to approve the
request
- the client replies to the elicitation with `"approved"`
- the MCP server runs the command and re-samples the model, getting a
`"finish_reason": "stop"`
- in turn, the MCP server sends the final response to the original
`codex` tool call
- verifies that `git init` ran as expected

To test:

```
cargo test shell_command_approval_triggers_elicitation
```

In writing this test, I discovered that `ExecApprovalResponse` does not
conform to `ElicitResult`, so I added a TODO to fix that, since I think
that should be updated in a separate PR. As it stands, this PR does not
update any business logic, though it does make a number of members of
the `mcp-server` crate `pub` so they can be used in the test.

One additional learning from this PR is that
`std::process::Command::cargo_bin()` from the `assert_cmd` trait is only
available for `std::process::Command`, but we really want to use
`tokio::process::Command` so that everything is async and we can
leverage utilities like `tokio::time::timeout()`. The trick I came up
with was to use `cargo_bin()` to locate the program, and then to use
`std::process::Command::get_program()` when constructing the
`tokio::process::Command`.
2025-07-21 10:27:07 -07:00
Michael Bolin
8a6c6cee88 fix: address review feedback on #1621 and #1623 (#1631)
- formalizes `ExecApprovalElicitRequestParams`
- adds some defensive logic when messages fail to parse
- fixes a typo in a comment
2025-07-20 14:42:11 -07:00
Gabriel Peal
8b590105de Don't drop sessions on elicitation responses (#1629) 2025-07-20 13:31:19 -04:00
Michael Bolin
018003e52f feat: leverage elicitations in the MCP server (#1623)
This updates the MCP server so that if it receives an
`ExecApprovalRequest` from the `Codex` session, it in turn sends an [MCP
elicitation](https://modelcontextprotocol.io/specification/draft/client/elicitation)
to the client to ask for the approval decision. Upon getting a response,
it forwards the client's decision via `Op::ExecApproval`.

Admittedly, we should be doing the same thing for
`ApplyPatchApprovalRequest`, but this is our first time experimenting
with elicitations, so I'm inclined to defer wiring that code path up
until we feel good about how this one works.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1623).
* __->__ #1623
* #1622
* #1621
* #1620
2025-07-19 01:32:03 -04:00
Michael Bolin
11fd3123be chore: introduce OutgoingMessageSender (#1622)
Previous to this change, `MessageProcessor` had a
`tokio::sync::mpsc::Sender<JSONRPCMessage>` as an abstraction for server
code to send a message down to the MCP client. Because `Sender` is cheap
to `clone()`, it was straightforward to make it available to tasks
scheduled with `tokio::task::spawn()`.

This worked well when we were only sending notifications or responses
back down to the client, but we want to add support for sending
elicitations in #1623, which means that we need to be able to send
_requests_ to the client, and now we need a bit of centralization to
ensure all request ids are unique.

To that end, this PR introduces `OutgoingMessageSender`, which houses
the existing `Sender<OutgoingMessage>` as well as an `AtomicI64` to mint
out new, unique request ids. It has methods like `send_request()` and
`send_response()` so that callers do not have to deal with
`JSONRPCMessage` directly, as having to set the `jsonrpc` for each
message was a bit tedious (this cleans up `codex_tool_runner.rs` quite a
bit).

We do not have `OutgoingMessageSender` implement `Clone` because it is
important that the `AtomicI64` is shared across all users of
`OutgoingMessageSender`. As such, `Arc<OutgoingMessageSender>` must be
used instead, as it is frequently shared with new tokio tasks.

As part of this change, we update `message_processor.rs` to embrace
`await`, though we must be careful that no individual handler blocks the
main loop and prevents other messages from being handled.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1622).
* #1623
* __->__ #1622
* #1621
* #1620
2025-07-19 00:30:56 -04:00
Michael Bolin
e78ec00e73 chore: support MCP schema 2025-06-18 (#1621)
This updates the schema in `generate_mcp_types.py` from `2025-03-26` to
`2025-06-18`, regenerates `mcp-types/src/lib.rs`, and then updates all
the code that uses `mcp-types` to honor the changes.

Ran

```
npx @modelcontextprotocol/inspector just codex mcp
```

and verified that I was able to invoke the `codex` tool, as expected.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1621).
* #1623
* #1622
* __->__ #1621
2025-07-19 00:09:34 -04:00
Michael Bolin
a06d4f58e4 chore: clean up generate_mcp_types.py so codegen matches existing output (#1620) 2025-07-18 21:40:39 -04:00
aibrahim-oai
83eefb55fb Add session loading support to Codex (#1602)
## Summary
- extend rollout format to store all session data in JSON
- add resume/write helpers for rollouts
- track session state after each conversation
- support `LoadSession` op to resume a previous rollout
- allow starting Codex with an existing session via
`experimental_resume` config variable

We need a way later for exploring the available sessions in a user
friendly way.

## Testing
- `cargo test --no-run` *(fails: `cargo: command not found`)*

------
https://chatgpt.com/codex/tasks/task_i_68792a29dd5c832190bf6930d3466fba

This video is outdated. you should use `-c experimental_resume:<full
path>` instead of `--resume <full path>`


https://github.com/user-attachments/assets/7a9975c7-aa04-4f4e-899a-9e87defd947a
2025-07-18 17:04:04 -07:00
aibrahim-oai
9846adeabf Refactor env settings into config (#1601)
## Summary
- add OpenAI retry and timeout fields to Config
- inject these settings in tests instead of mutating env vars
- plumb Config values through client and chat completions logic
- document new configuration options

## Testing
- `cargo test -p codex-core --no-run`

------
https://chatgpt.com/codex/tasks/task_i_68792c5b04cc832195c03050c8b6ea94

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-07-18 19:12:39 +00:00
aibrahim-oai
d5a2148deb Fix ctrl+c interrupt while streaming (#1617)
Interrupting while streaming now causes is broken because we aren't
clearing the delta buffer.
2025-07-18 12:08:25 -07:00
Michael Bolin
cc874c9205 chore: use AtomicBool instead of Mutex<bool> (#1616) 2025-07-18 11:13:34 -07:00
pakrym-oai
6f2b01bb6b feat: ensure session ID header is sent in Response API request (#1614)
Include the current session id in Responses API requests.
2025-07-18 09:59:07 -07:00
aibrahim-oai
9cedeadf6a change the default debounce rate to 10ms (#1606)
changed the default debounce rate to 10ms because typing was laggy.

Before:


https://github.com/user-attachments/assets/e5d15fcb-6a2b-4837-b2b4-c3dcb4cc3409

After



https://github.com/user-attachments/assets/6f0005eb-fd49-4130-ba68-635ee0f2831f
2025-07-17 17:00:17 -07:00
pakrym-oai
327e2254f6 chore: rename toolchain file (#1604)
Rename toolchain file so older versions of cargo can pick it up.
2025-07-17 15:36:15 -07:00
Michael Bolin
e16657ca45 feat: add --json flag to codex exec (#1603)
This is designed to facilitate programmatic use of Codex in a more
lightweight way than using `codex mcp`.

Passing `--json` to `codex exec` will print each event as a line of JSON
to stdout. Note that it does not print the individual tokens as they are
streamed, only full messages, as this is aimed at programmatic use
rather than to power UI.

<img width="1348" height="1307" alt="image"
src="https://github.com/user-attachments/assets/fc7908de-b78d-46e4-a6ff-c85de28415c7"
/>

I changed the existing `EventProcessor` into a trait and moved the
implementation to `EventProcessorWithHumanOutput`. Then I introduced an
alternative implementation, `EventProcessorWithJsonOutput`. The `--json`
flag determines which implementation to use.
2025-07-17 15:10:15 -07:00
aibrahim-oai
bb30ab9e96 Implement redraw debounce (#1599)
## Summary
- debouce redraw events so repeated requests don't overwhelm the
terminal
- add `RequestRedraw` event and schedule redraws after 100ms

## Testing
- `cargo clippy --tests`
- `cargo test` *(fails: Sandbox Denied errors in landlock tests)*

------
https://chatgpt.com/codex/tasks/task_i_68792a65b8b483218ec90a8f68746cd8

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-07-17 12:54:55 -07:00
pakrym-oai
6949329a7f chore: auto format code on save and add more details to AGENTS.md (#1582)
Adds a default vscode config with generally applicable settings.
Adds more entrypoints to justfile both  for environment setup and to help
agents better verify changes.
2025-07-17 11:40:00 -07:00
pakrym-oai
b95a010e86 fix: trim MCP tool names to fit into tool name length limit (#1571)
Store fully qualified names along with tool entries so we don't have to re-parse them.

Fixes: https://github.com/openai/codex/issues/1289
2025-07-17 11:35:38 -07:00
aibrahim-oai
fcbcc40f51 Storing the sessions in a more organized way for easier look up. (#1596)
now storing the sessions in `~/.codex/sessions/YYYY/MM/DD/<file>`
2025-07-17 10:12:15 -07:00
aibrahim-oai
643ab1f582 Add streaming to exec and tui (#1594)
Added support for streaming in `tui`
Added support for streaming in `exec`


https://github.com/user-attachments/assets/4215892e-d940-452c-a1d0-416ed0cf14eb
2025-07-16 22:26:31 -07:00
Michael Bolin
d3dbc10479 fix: update bin/codex.js so it listens for exit on the child process (#1590)
When Codex CLI is installed via `npm`, we use a `.js` wrapper script to
launch the Rust binary.

- Previously, we were not listening for signals to ensure that killing
the Node.js process would also kill the underlying Rust process.
- We also did not have a proper `exit` handler in place on the child
process to ensure we exited from the Node.js process.

This PR fixes these things and hopefully addresses
https://github.com/openai/codex/issues/1570.

This also adds logic so that Windows falls back to the TypeScript CLI
again, which should address https://github.com/openai/codex/issues/1573.
2025-07-16 16:35:29 -07:00
Preet 🚀
0bc7ee9193 Added mcp-server name validation (#1591)
This PR implements server name validation for MCP (Model Context
Protocol) servers to ensure they conform to the required pattern
^[a-zA-Z0-9_-]+$. This addresses the TODO comment in
mcp_connection_manager.rs:82.

+ Added validation before spawning MCP client tasks
+ Invalid server names are added to errors map with descriptive messages

I have read the CLA Document and I hereby sign the CLA

---------

Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-07-16 16:00:39 -07:00
aibrahim-oai
2bd3314886 support deltas in core (#1587)
- Added support for message and reasoning deltas
- Skipped adding the support in the cli and tui for later
- Commented a failing test (wrong merge) that needs fix in a separate
PR.

Side note: I think we need to disable merge when the CI don't pass.
2025-07-16 15:11:18 -07:00
Michael Bolin
5b820c5ce7 feat: ctrl-d only exits when there is no user input (#1589)
While this does make it so that `ctrl-d` will not exit Codex when the
composer is not empty, `ctrl-d` will still exit Codex if it is in the
"working" state.

Fixes https://github.com/openai/codex/issues/1443.
2025-07-16 08:59:26 -07:00
214 changed files with 27902 additions and 5932 deletions

View File

@@ -21,7 +21,7 @@
"settings": {
"terminal.integrated.defaultProfile.linux": "bash"
},
"extensions": ["rust-lang.rust-analyzer"]
"extensions": ["rust-lang.rust-analyzer", "tamasfe.even-better-toml"]
}
}
}

View File

@@ -8,10 +8,10 @@
"@actions/github": "^6.0.1",
},
"devDependencies": {
"@types/bun": "^1.2.18",
"@types/node": "^24.0.13",
"@types/bun": "^1.2.19",
"@types/node": "^24.1.0",
"prettier": "^3.6.2",
"typescript": "^5.8.3",
"typescript": "^5.9.2",
},
},
},
@@ -48,15 +48,15 @@
"@octokit/types": ["@octokit/types@13.10.0", "", { "dependencies": { "@octokit/openapi-types": "^24.2.0" } }, "sha512-ifLaO34EbbPj0Xgro4G5lP5asESjwHracYJvVaPIyXMuiuXLlhic3S47cBdTb+jfODkTE5YtGCLt3Ay3+J97sA=="],
"@types/bun": ["@types/bun@1.2.18", "", { "dependencies": { "bun-types": "1.2.18" } }, "sha512-Xf6RaWVheyemaThV0kUfaAUvCNokFr+bH8Jxp+tTZfx7dAPA8z9ePnP9S9+Vspzuxxx9JRAXhnyccRj3GyCMdQ=="],
"@types/bun": ["@types/bun@1.2.19", "", { "dependencies": { "bun-types": "1.2.19" } }, "sha512-d9ZCmrH3CJ2uYKXQIUuZ/pUnTqIvLDS0SK7pFmbx8ma+ziH/FRMoAq5bYpRG7y+w1gl+HgyNZbtqgMq4W4e2Lg=="],
"@types/node": ["@types/node@24.0.13", "", { "dependencies": { "undici-types": "~7.8.0" } }, "sha512-Qm9OYVOFHFYg3wJoTSrz80hoec5Lia/dPp84do3X7dZvLikQvM1YpmvTBEdIr/e+U8HTkFjLHLnl78K/qjf+jQ=="],
"@types/node": ["@types/node@24.1.0", "", { "dependencies": { "undici-types": "~7.8.0" } }, "sha512-ut5FthK5moxFKH2T1CUOC6ctR67rQRvvHdFLCD2Ql6KXmMuCrjsSsRI9UsLCm9M18BMwClv4pn327UvB7eeO1w=="],
"@types/react": ["@types/react@19.1.8", "", { "dependencies": { "csstype": "^3.0.2" } }, "sha512-AwAfQ2Wa5bCx9WP8nZL2uMZWod7J7/JSplxbTmBQ5ms6QpqNYm672H0Vu9ZVKVngQ+ii4R/byguVEUZQyeg44g=="],
"before-after-hook": ["before-after-hook@2.2.3", "", {}, "sha512-NzUnlZexiaH/46WDhANlyR2bXRopNg4F/zuSA3OpZnllCUgRaOF2znDioDWrmbNVsuZk6l9pMquQB38cfBZwkQ=="],
"bun-types": ["bun-types@1.2.18", "", { "dependencies": { "@types/node": "*" }, "peerDependencies": { "@types/react": "^19" } }, "sha512-04+Eha5NP7Z0A9YgDAzMk5PHR16ZuLVa83b26kH5+cp1qZW4F6FmAURngE7INf4tKOvCE69vYvDEwoNl1tGiWw=="],
"bun-types": ["bun-types@1.2.19", "", { "dependencies": { "@types/node": "*" }, "peerDependencies": { "@types/react": "^19" } }, "sha512-uAOTaZSPuYsWIXRpj7o56Let0g/wjihKCkeRqUBhlLVM/Bt+Fj9xTo+LhC1OV1XDaGkz4hNC80et5xgy+9KTHQ=="],
"csstype": ["csstype@3.1.3", "", {}, "sha512-M1uQkMl8rQK/szD0LNhtqxIPLpimGm8sOBwU7lLnCpSbTyY3yeU1Vc7l4KT5zT4s/yOxHH5O7tIuuLOCnLADRw=="],
@@ -68,7 +68,7 @@
"tunnel": ["tunnel@0.0.6", "", {}, "sha512-1h/Lnq9yajKY2PEbBadPXj3VxsDDu844OnaAo52UVmIzIvwwtBPIuNvkjuzBlTWpfJyUbG3ez0KSBibQkj4ojg=="],
"typescript": ["typescript@5.8.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ=="],
"typescript": ["typescript@5.9.2", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-CWBzXQrc/qOkhidw1OzBTQuYRbfyxDXJMVJ1XNwUHGROVmuaeiEm3OslpZ1RV96d7SKKjZKrSJu3+t/xlw3R9A=="],
"undici": ["undici@5.29.0", "", { "dependencies": { "@fastify/busboy": "^2.0.0" } }, "sha512-raqeBD6NQK4SkWhQzeYKd1KmIG6dllBOTt55Rmkt4HtI9mwdWtJljnrXjAFUBLTSN67HWrOIZ3EPF4kjUw80Bg=="],
@@ -82,6 +82,8 @@
"@octokit/plugin-rest-endpoint-methods/@octokit/types": ["@octokit/types@12.6.0", "", { "dependencies": { "@octokit/openapi-types": "^20.0.0" } }, "sha512-1rhSOfRa6H9w4YwK0yrf5faDaDTb+yLyBUKOCV4xtCDB5VmIPqd/v9yr9o6SAzOAlRxMiRiCic6JVM1/kunVkw=="],
"bun-types/@types/node": ["@types/node@24.0.13", "", { "dependencies": { "undici-types": "~7.8.0" } }, "sha512-Qm9OYVOFHFYg3wJoTSrz80hoec5Lia/dPp84do3X7dZvLikQvM1YpmvTBEdIr/e+U8HTkFjLHLnl78K/qjf+jQ=="],
"@octokit/plugin-paginate-rest/@octokit/types/@octokit/openapi-types": ["@octokit/openapi-types@20.0.0", "", {}, "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA=="],
"@octokit/plugin-rest-endpoint-methods/@octokit/types/@octokit/openapi-types": ["@octokit/openapi-types@20.0.0", "", {}, "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA=="],

View File

@@ -13,9 +13,9 @@
"@actions/github": "^6.0.1"
},
"devDependencies": {
"@types/bun": "^1.2.18",
"@types/node": "^24.0.13",
"@types/bun": "^1.2.19",
"@types/node": "^24.1.0",
"prettier": "^3.6.2",
"typescript": "^5.8.3"
"typescript": "^5.9.2"
}
}

View File

@@ -91,7 +91,38 @@ async function processLabel(
labelConfig: LabelConfig,
): Promise<void> {
const template = labelConfig.getPromptTemplate();
const populatedTemplate = await renderPromptTemplate(template, ctx);
// If this is a review label, prepend explicit PR-diff scoping guidance to
// reduce out-of-scope feedback. Do this before rendering so placeholders in
// the guidance (e.g., {CODEX_ACTION_GITHUB_EVENT_PATH}) are substituted.
const isReview = label.toLowerCase().includes("review");
const reviewScopeGuidance = `
PR Diff Scope
- Only review changes between the PR's merge-base and head; do not comment on commits or files outside this range.
- Derive the base/head SHAs from the event JSON at {CODEX_ACTION_GITHUB_EVENT_PATH}, then compute and use the PR diff for all analysis and comments.
Commands to determine scope
- Resolve SHAs:
- BASE_SHA=$(jq -r '.pull_request.base.sha // .pull_request.base.ref' "{CODEX_ACTION_GITHUB_EVENT_PATH}")
- HEAD_SHA=$(jq -r '.pull_request.head.sha // .pull_request.head.ref' "{CODEX_ACTION_GITHUB_EVENT_PATH}")
- BASE_SHA=$(git rev-parse "$BASE_SHA")
- HEAD_SHA=$(git rev-parse "$HEAD_SHA")
- Prefer triple-dot (merge-base) semantics for PR diffs:
- Changed commits: git log --oneline "$BASE_SHA...$HEAD_SHA"
- Changed files: git diff --name-status "$BASE_SHA...$HEAD_SHA"
- Review hunks: git diff -U0 "$BASE_SHA...$HEAD_SHA"
Review rules
- Anchor every comment to a file and hunk present in git diff "$BASE_SHA...$HEAD_SHA".
- If you mention context outside the diff, label it as "Follow-up (outside this PR scope)" and keep it brief (<=2 bullets).
- Do not critique commits or files not reachable in the PR range (merge-base(base, head) → head).
`.trim();
const effectiveTemplate = isReview
? `${reviewScopeGuidance}\n\n${template}`
: template;
const populatedTemplate = await renderPromptTemplate(effectiveTemplate, ctx);
// Always run Codex and post the resulting message as a comment.
let commentBody = await runCodex(populatedTemplate, ctx);

BIN
.github/codex-cli-login.png vendored Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 410 KiB

BIN
.github/codex-cli-permissions.png vendored Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 408 KiB

BIN
.github/codex-cli-splash.png vendored Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 412 KiB

View File

@@ -0,0 +1,139 @@
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
Things to look out for when doing the review:
## General Principles
- **Make sure the pull request body explains the motivation behind the change.** If the author has failed to do this, call it out, and if you think you can deduce the motivation behind the change, propose copy.
- Ideally, the PR body also contains a small summary of the change. For small changes, the PR title may be sufficient.
- Each PR should ideally do one conceptual thing. For example, if a PR does a refactoring as well as introducing a new feature, push back and suggest the refactoring be done in a separate PR. This makes things easier for the reviewer, as refactoring changes can often be far-reaching, yet quick to review.
- When introducing new code, be on the lookout for code that duplicates existing code. When found, propose a way to refactor the existing code such that it should be reused.
## Code Organization
- Each create in the Cargo workspace in `codex-rs` has a specific purpose: make a note if you believe new code is not introduced in the correct crate.
- When possible, try to keep the `core` crate as small as possible. Non-core but shared logic is often a good candidate for `codex-rs/common`.
- Be wary of large files and offer suggestions for how to break things into more reasonably-sized files.
- Rust files should generally be organized such that the public parts of the API appear near the top of the file and helper functions go below. This is analagous to the "inverted pyramid" structure that is favored in journalism.
## Assertions in Tests
Assert the equality of the entire objects instead of doing "piecemeal comparisons," performing `assert_eq!()` on individual fields.
Note that unit tests also function as "executable documentation." As shown in the following example, "piecemeal comparisons" are often more verbose, provide less coverage, and are not as useful as executable documentation.
For example, suppose you have the following enum:
```rust
#[derive(Debug, PartialEq)]
enum Message {
Request {
id: String,
method: String,
params: Option<serde_json::Value>,
},
Notification {
method: String,
params: Option<serde_json::Value>,
},
}
```
This is an example of a _piecemeal_ comparison:
```rust
// BAD: Piecemeal Comparison
#[test]
fn test_get_latest_messages() {
let messages = get_latest_messages();
assert_eq!(messages.len(), 2);
let m0 = &messages[0];
match m0 {
Message::Request { id, method, params } => {
assert_eq!(id, "123");
assert_eq!(method, "subscribe");
assert_eq!(
*params,
Some(json!({
"conversation_id": "x42z86"
}))
)
}
Message::Notification { .. } => {
panic!("expected Request");
}
}
let m1 = &messages[1];
match m1 {
Message::Request { .. } => {
panic!("expected Notification");
}
Message::Notification { method, params } => {
assert_eq!(method, "log");
assert_eq!(
*params,
Some(json!({
"level": "info",
"message": "subscribed"
}))
)
}
}
}
```
This is a _deep_ comparison:
```rust
// GOOD: Verify the entire structure with a single assert_eq!().
use pretty_assertions::assert_eq;
#[test]
fn test_get_latest_messages() {
let messages = get_latest_messages();
assert_eq!(
vec![
Message::Request {
id: "123".to_string(),
method: "subscribe".to_string(),
params: Some(json!({
"conversation_id": "x42z86"
})),
},
Message::Notification {
method: "log".to_string(),
params: Some(json!({
"level": "info",
"message": "subscribed"
})),
},
],
messages,
);
}
```
## More Tactical Rust Things To Look Out For
- Do not use `unsafe` (unless you have a really, really good reason like using an operating system API directly and no safe wrapper exists). For example, there are cases where it is tempting to use `unsafe` in order to use `std::env::set_var()`, but this indeed `unsafe` and has led to race conditions on multiple occasions. (When this happens, find a mechanism other than environment variables to use for configuration.)
- Encourage the use of small enums or the newtype pattern in Rust if it helps readability without adding significant cognitive load or lines of code.
- If you see opportunities for the changes in a diff to use more idiomatic Rust, please make specific recommendations. For example, favor the use of expressions over `return`.
- When modifying a `Cargo.toml` file, make sure that dependency lists stay alphabetically sorted. Also consider whether a new dependency is added to the appropriate place (e.g., `[dependencies]` versus `[dev-dependencies]`)
## Pull Request Body
- If the nature of the change seems to have a visual component (which is often the case for changes to `codex-rs/tui`), recommend including a screenshot or video to demonstrate the change, if appropriate.
- References to existing GitHub issues and PRs are encouraged, where appropriate, though you likely do not have network access, so may not be able to help here.
# PR Information
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the `base` and `head` refs that define this PR. Both refs are available locally.

View File

@@ -20,7 +20,7 @@ jobs:
(github.event_name == 'issues' && (
(github.event.action == 'labeled' && (github.event.label.name == 'codex-attempt' || github.event.label.name == 'codex-triage'))
)) ||
(github.event_name == 'pull_request' && github.event.action == 'labeled' && github.event.label.name == 'codex-review')
(github.event_name == 'pull_request' && github.event.action == 'labeled' && (github.event.label.name == 'codex-review' || github.event.label.name == 'codex-rust-review'))
runs-on: ubuntu-latest
permissions:
contents: write # can push or create branches

View File

@@ -93,7 +93,7 @@ jobs:
sudo apt install -y musl-tools pkg-config
- name: Cargo build
run: cargo build --target ${{ matrix.target }} --release --all-targets --all-features
run: cargo build --target ${{ matrix.target }} --release --bin codex --bin codex-exec --bin codex-linux-sandbox
- name: Stage artifacts
shell: bash
@@ -181,9 +181,9 @@ jobs:
name: ${{ steps.release_name.outputs.name }}
tag_name: ${{ github.ref_name }}
files: dist/**
# For now, tag releases as "prerelease" because we are not claiming
# the Rust CLI is stable yet.
prerelease: true
# Mark as prerelease only when the version has a suffix after x.y.z
# (e.g. -alpha, -beta). Otherwise publish a normal release.
prerelease: ${{ contains(steps.release_name.outputs.name, '-') }}
- uses: facebook/dotslash-publish-release@v2
env:

5
.vscode/extensions.json vendored Normal file
View File

@@ -0,0 +1,5 @@
{
"recommendations": [
"tamasfe.even-better-toml",
]
}

18
.vscode/launch.json vendored Normal file
View File

@@ -0,0 +1,18 @@
{
"version": "0.2.0",
"configurations": [
{
"type": "lldb",
"request": "launch",
"name": "Cargo launch",
"cargo": {
"cwd": "${workspaceFolder}/codex-rs",
"args": [
"build",
"--bin=codex-tui"
]
},
"args": []
}
]
}

18
.vscode/settings.json vendored Normal file
View File

@@ -0,0 +1,18 @@
{
"rust-analyzer.checkOnSave": true,
"rust-analyzer.check.command": "clippy",
"rust-analyzer.check.extraArgs": ["--all-features", "--tests"],
"rust-analyzer.rustfmt.extraArgs": ["--config", "imports_granularity=Item"],
"[rust]": {
"editor.defaultFormatter": "rust-lang.rust-analyzer",
"editor.formatOnSave": true,
},
"[toml]": {
"editor.defaultFormatter": "tamasfe.even-better-toml",
"editor.formatOnSave": true,
},
// Array order for options in ~/.codex/config.toml such as `notify` and the
// `args` for an MCP server is significant, so we disable reordering.
"evenBetterToml.formatter.reorderArrays": false,
"evenBetterToml.formatter.reorderKeys": true,
}

View File

@@ -2,4 +2,10 @@
In the codex-rs folder where the rust code lives:
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR`. You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` or `CODEX_SANDBOX_ENV_VAR`.
- You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
- Similarly, when you spawn a process using Seatbelt (`/usr/bin/sandbox-exec`), `CODEX_SANDBOX=seatbelt` will be set on the child process. Integration tests that want to run Seatbelt themselves cannot be run under Seatbelt, so checks for `CODEX_SANDBOX=seatbelt` are also often used to early exit out of tests, as appropriate.
Before creating a pull request with changes to `codex-rs`, run `just fmt` (in `codex-rs` directory) to format the code and `just fix` (in `codex-rs` directory) to fix any linter issues in the code, ensure the test suite passes by running `cargo test --all-features` in the `codex-rs` directory.
When making individual changes prefer running tests on individual files or projects first.

4
NOTICE
View File

@@ -1,2 +1,6 @@
OpenAI Codex
Copyright 2025 OpenAI
This project includes code derived from [Ratatui](https://github.com/ratatui/ratatui), licensed under the MIT license.
Copyright (c) 2016-2022 Florian Dehau
Copyright (c) 2023-2025 The Ratatui Developers

321
README.md
View File

@@ -1,11 +1,12 @@
<h1 align="center">OpenAI Codex CLI</h1>
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>
This is the home of the **Codex CLI**, which is a coding agent from OpenAI that runs locally on your computer. If you are looking for the _cloud-based agent_ from OpenAI, **Codex [Web]**, see <https://chatgpt.com/codex>.
<p align="center"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, see <a href="https://chatgpt.com/codex">chatgpt.com/codex</a>.</p>
<!-- ![Codex demo GIF using: codex "explain this codebase to me"](./.github/demo.gif) -->
<p align="center">
<img src="./.github/codex-cli-splash.png" alt="Codex CLI splash" width="50%" />
</p>
---
@@ -14,21 +15,27 @@ This is the home of the **Codex CLI**, which is a coding agent from OpenAI that
<!-- Begin ToC -->
- [Experimental technology disclaimer](#experimental-technology-disclaimer)
- [Quickstart](#quickstart)
- [OpenAI API Users](#openai-api-users)
- [OpenAI Plus/Pro Users](#openai-pluspro-users)
- [Why Codex?](#why-codex)
- [Security model & permissions](#security-model--permissions)
- [Installing and running Codex CLI](#installing-and-running-codex-cli)
- [Using Codex with your ChatGPT plan](#using-codex-with-your-chatgpt-plan)
- [Usage-based billing alternative: Use an OpenAI API key](#usage-based-billing-alternative-use-an-openai-api-key)
- [Choosing Codex's level of autonomy](#choosing-codexs-level-of-autonomy)
- [**1. Read/write**](#1-readwrite)
- [**2. Read-only**](#2-read-only)
- [**3. Advanced configuration**](#3-advanced-configuration)
- [Can I run without ANY approvals?](#can-i-run-without-any-approvals)
- [Fine-tuning in `config.toml`](#fine-tuning-in-configtoml)
- [Example prompts](#example-prompts)
- [Running with a prompt as input](#running-with-a-prompt-as-input)
- [Using Open Source Models](#using-open-source-models)
- [Platform sandboxing details](#platform-sandboxing-details)
- [Experimental technology disclaimer](#experimental-technology-disclaimer)
- [System requirements](#system-requirements)
- [CLI reference](#cli-reference)
- [Memory & project docs](#memory--project-docs)
- [Non-interactive / CI mode](#non-interactive--ci-mode)
- [Model Context Protocol (MCP)](#model-context-protocol-mcp)
- [Tracing / verbose logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [DotSlash](#dotslash)
- [Configuration](#configuration)
- [FAQ](#faq)
@@ -53,49 +60,156 @@ This is the home of the **Codex CLI**, which is a coding agent from OpenAI that
---
## Experimental technology disclaimer
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
## Quickstart
### Installing and running Codex CLI
Install globally with your preferred package manager:
```shell
npm install -g @openai/codex # Alternatively: `brew install codex`
```
Or go to the [latest GitHub Release](https://github.com/openai/codex/releases/latest) and download the appropriate binary for your platform.
Then simply run `codex` to get started:
### OpenAI API Users
```shell
codex
```
Next, set your OpenAI API key as an environment variable:
<details>
<summary>You can also go to the <a href="https://github.com/openai/codex/releases/latest">latest GitHub Release</a> and download the appropriate binary for your platform.</summary>
Each GitHub Release contains many executables, but in practice, you likely want one of these:
- macOS
- Apple Silicon/arm64: `codex-aarch64-apple-darwin.tar.gz`
- x86_64 (older Mac hardware): `codex-x86_64-apple-darwin.tar.gz`
- Linux
- x86_64: `codex-x86_64-unknown-linux-musl.tar.gz`
- arm64: `codex-aarch64-unknown-linux-musl.tar.gz`
Each archive contains a single entry with the platform baked into the name (e.g., `codex-x86_64-unknown-linux-musl`), so you likely want to rename it to `codex` after extracting it.
</details>
### Using Codex with your ChatGPT plan
<p align="center">
<img src="./.github/codex-cli-login.png" alt="Codex CLI login" width="50%" />
</p>
After you run `codex` select Sign in with ChatGPT. You'll need a Plus, Pro, or Team ChatGPT account, and will get access to our latest models, including `gpt-5`, at no extra cost to your plan. (Enterprise is coming soon.)
> Important: If you've used the Codex CLI before, you'll need to follow these steps to migrate from usage-based billing with your API key:
>
> 1. Update the CLI with `codex update` and ensure `codex --version` is greater than 0.13
> 2. Ensure that there is no `OPENAI_API_KEY` environment variable set. (Check that `env | grep 'OPENAI_API_KEY'` returns empty)
> 3. Run `codex login` again
If you encounter problems with the login flow, please comment on [this issue](https://github.com/openai/codex/issues/1243).
### Usage-based billing alternative: Use an OpenAI API key
If you prefer to pay-as-you-go, you can still authenticate with your OpenAI API key by setting it as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> [!NOTE]
> This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it for the session.
> Note: This command only sets the key for your current terminal session, which we recommend. To set it for all future sessions, you can also add the `export` line to your shell's configuration file (e.g., `~/.zshrc`).
### OpenAI Plus/Pro Users
### Choosing Codex's level of autonomy
If you have a paid OpenAI account, run the following to start the login process:
We always recommend running Codex in its default sandbox that gives you strong guardrails around what the agent can do. The default sandbox prevents it from editing files outside its workspace, or from accessing the network.
```
codex login
When you launch Codex in a new folder, it detects whether the folder is version controlled and recommends one of two levels of autonomy:
#### **1. Read/write**
- Codex can run commands and write files in the workspace without approval.
- To write files in other folders, access network, update git or perform other actions protected by the sandbox, Codex will need your permission.
- By default, the workspace includes the current directory, as well as temporary directories like `/tmp`. You can see what directories are in the workspace with the `/status` command. See the docs for how to customize this behavior.
- Advanced: You can manually specify this configuration by running `codex --sandbox workspace-write --ask-for-approval on-request`
- This is the recommended default for version-controlled folders.
#### **2. Read-only**
- Codex can run read-only commands without approval.
- To edit files, access network, or perform other actions protected by the sandbox, Codex will need your permission.
- Advanced: You can manually specify this configuration by running `codex --sandbox read-only --ask-for-approval on-request`
- This is the recommended default non-version-controlled folders.
#### **3. Advanced configuration**
Codex gives you fine-grained control over the sandbox with the `--sandbox` option, and over when it requests approval with the `--ask-for-approval` option. Run `codex help` for more on these options.
#### Can I run without ANY approvals?
Yes, run codex non-interactively with `--ask-for-approval never`. This option works with all `--sandbox` options, so you still have full control over Codex's level of autonomy. It will make its best attempt with whatever contrainsts you provide. For example:
- Use `codex --ask-for-approval never --sandbox read-only` when you are running many agents to answer questions in parallel in the same workspace.
- Use `codex --ask-for-approval never --sandbox workspace-write` when you want the agent to non-interactively take time to produce the best outcome, with strong guardrails around its behavior.
- Use `codex --ask-for-approval never --sandbox danger-full-access` to dangerously give the agent full autonomy. Because this disables important safety mechanisms, we recommend against using this unless running Codex in an isolated environment.
#### Fine-tuning in `config.toml`
```toml
# approval mode
approval_policy = "untrusted"
sandbox_mode = "read-only"
# full-auto mode
approval_policy = "on-request"
sandbox_mode = "workspace-write"
# Optional: allow network in workspace-write mode
[sandbox_workspace_write]
network_access = true
```
If you complete the process successfully, you should have a `~/.codex/auth.json` file that contains the credentials that Codex will use.
You can also save presets as **profiles**:
If you encounter problems with the login flow, please comment on <https://github.com/openai/codex/issues/1243>.
```toml
[profiles.full_auto]
approval_policy = "on-request"
sandbox_mode = "workspace-write"
[profiles.readonly_quiet]
approval_policy = "never"
sandbox_mode = "read-only"
```
### Example prompts
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
| ✨ | What you type | What happens |
| --- | ------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| 1 | `codex "Refactor the Dashboard component to React Hooks"` | Codex rewrites the class component, runs `npm test`, and shows the diff. |
| 2 | `codex "Generate SQL migrations for adding a users table"` | Infers your ORM, creates migration files, and runs them in a sandboxed DB. |
| 3 | `codex "Write unit tests for utils/date.ts"` | Generates tests, executes them, and iterates until they pass. |
| 4 | `codex "Bulk-rename *.jpeg -> *.jpg with git mv"` | Safely renames files and updates imports/usages. |
| 5 | `codex "Explain what this regex does: ^(?=.*[A-Z]).{8,}$"` | Outputs a step-by-step human explanation. |
| 6 | `codex "Carefully review this repo, and propose 3 high impact well-scoped PRs"` | Suggests impactful PRs in the current codebase. |
| 7 | `codex "Look for vulnerabilities and create a security review report"` | Finds and explains security bugs. |
## Running with a prompt as input
You can also run Codex CLI with a prompt as input:
```shell
codex "explain this codebase to me"
```
```shell
codex --full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
missing dependencies, and show you the live result. Approve the changes and
they'll be committed to your working directory.
## Using Open Source Models
<details>
<summary><strong>Use <code>--profile</code> to use other models</strong></summary>
@@ -156,68 +270,40 @@ model = "mistral"
This way, you can specify one command-line argument (.e.g., `--profile o3`, `--profile mistral`) to override multiple settings together.
</details>
<br />
Run interactively:
Codex can run fully locally against an OpenAI-compatible OSS host (like Ollama) using the `--oss` flag:
```shell
codex
- Interactive UI:
- codex --oss
- Non-interactive (programmatic) mode:
- echo "Refactor utils" | codex exec --oss
Model selection when using `--oss`:
- If you omit `-m/--model`, Codex defaults to -m gpt-oss:20b and will verify it exists locally (downloading if needed).
- To pick a different size, pass one of:
- -m "gpt-oss:20b"
- -m "gpt-oss:120b"
Point Codex at your own OSS host:
- By default, `--oss` talks to http://localhost:11434/v1.
- To use a different host, set one of these environment variables before running Codex:
- CODEX_OSS_BASE_URL, for example:
- CODEX_OSS_BASE_URL="http://my-ollama.example.com:11434/v1" codex --oss -m gpt-oss:20b
- or CODEX_OSS_PORT (when the host is localhost):
- CODEX_OSS_PORT=11434 codex --oss
Advanced: you can persist this in your config instead of environment variables by overriding the built-in `oss` provider in `~/.codex/config.toml`:
```toml
[model_providers.oss]
name = "Open Source"
base_url = "http://my-ollama.example.com:11434/v1"
```
Or, run with a prompt as input (and optionally in `Full Auto` mode):
```shell
codex "explain this codebase to me"
```
```shell
codex --full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
missing dependencies, and show you the live result. Approve the changes and
they'll be committed to your working directory.
---
## Why Codex?
Codex CLI is built for developers who already **live in the terminal** and want
ChatGPT-level reasoning **plus** the power to actually run code, manipulate
files, and iterate - all under version control. In short, it's _chat-driven
development_ that understands and executes your repo.
- **Zero setup** - bring your OpenAI API key and it just works!
- **Full auto-approval, while safe + secure** by running network-disabled and directory-sandboxed
- **Multimodal** - pass in screenshots or diagrams to implement features ✨
And it's **fully open-source** so you can see and contribute to how it develops!
---
## Security model & permissions
Codex lets you decide _how much autonomy_ you want to grant the agent. The following options can be configured independently:
- [`approval_policy`](./codex-rs/config.md#approval_policy) determines when you should be prompted to approve whether Codex can execute a command
- [`sandbox`](./codex-rs/config.md#sandbox) determines the _sandbox policy_ that Codex uses to execute untrusted commands
By default, Codex runs with `--ask-for-approval untrusted` and `--sandbox read-only`, which means that:
- The user is prompted to approve every command not on the set of "trusted" commands built into Codex (`cat`, `ls`, etc.)
- Approved commands are run outside of a sandbox because user approval implies "trust," in this case.
Running Codex with the `--full-auto` convenience flag changes the configuration to `--ask-for-approval on-failure` and `--sandbox workspace-write`, which means that:
- Codex does not initially ask for user approval before running an individual command.
- Though when it runs a command, it is run under a sandbox in which:
- It can read any file on the system.
- It can only write files under the current directory (or the directory specified via `--cd`).
- Network requests are completely disabled.
- Only if the command exits with a non-zero exit code will it ask the user for approval. If granted, it will re-attempt the command outside of the sandbox. (A common case is when Codex cannot `npm install` a dependency because that requires network access.)
Again, these two options can be configured independently. For example, if you want Codex to perform an "exploration" where you are happy for it to read anything it wants but you never want to be prompted, you could run Codex with `--ask-for-approval never` and `--sandbox read-only`.
### Platform sandboxing details
The mechanism Codex uses to implement the sandbox policy depends on your OS:
@@ -229,6 +315,19 @@ Note that when running Linux in a containerized environment such as Docker, sand
---
## Experimental technology disclaimer
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
---
## System requirements
| Requirement | Details |
@@ -304,52 +403,6 @@ See the Rust documentation on [`RUST_LOG`](https://docs.rs/env_logger/latest/env
---
## Recipes
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
| ✨ | What you type | What happens |
| --- | ------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| 1 | `codex "Refactor the Dashboard component to React Hooks"` | Codex rewrites the class component, runs `npm test`, and shows the diff. |
| 2 | `codex "Generate SQL migrations for adding a users table"` | Infers your ORM, creates migration files, and runs them in a sandboxed DB. |
| 3 | `codex "Write unit tests for utils/date.ts"` | Generates tests, executes them, and iterates until they pass. |
| 4 | `codex "Bulk-rename *.jpeg -> *.jpg with git mv"` | Safely renames files and updates imports/usages. |
| 5 | `codex "Explain what this regex does: ^(?=.*[A-Z]).{8,}$"` | Outputs a step-by-step human explanation. |
| 6 | `codex "Carefully review this repo, and propose 3 high impact well-scoped PRs"` | Suggests impactful PRs in the current codebase. |
| 7 | `codex "Look for vulnerabilities and create a security review report"` | Finds and explains security bugs. |
---
## Installation
<details open>
<summary><strong>Install Codex CLI using your preferred package manager.</strong></summary>
From `brew` (recommended, downloads only the binary for your platform):
```bash
brew install codex
```
From `npm` (generally more readily available, but downloads binaries for all supported platforms):
```bash
npm i -g @openai/codex
```
Or go to the [latest GitHub Release](https://github.com/openai/codex/releases/latest) and download the appropriate binary for your platform.
Admittedly, each GitHub Release contains many executables, but in practice, you likely want one of these:
- macOS
- Apple Silicon/arm64: `codex-aarch64-apple-darwin.tar.gz`
- x86_64 (older Mac hardware): `codex-x86_64-apple-darwin.tar.gz`
- Linux
- x86_64: `codex-x86_64-unknown-linux-musl.tar.gz`
- arm64: `codex-aarch64-unknown-linux-musl.tar.gz`
Each archive contains a single entry with the platform baked into the name (e.g., `codex-x86_64-unknown-linux-musl`), so you likely want to rename it to `codex` after extracting it.
### DotSlash
The GitHub Release also contains a [DotSlash](https://dotslash-cli.com/) file for the Codex CLI named `codex`. Using a DotSlash file makes it possible to make a lightweight commit to source control to ensure all contributors use the same version of an executable, regardless of what platform they use for development.

21
SUMMARY.md Normal file
View File

@@ -0,0 +1,21 @@
You are a summarization assistant. A conversation follows between a user and a coding-focused AI (Codex). Your task is to generate a clear summary capturing:
• High-level objective or problem being solved
• Key instructions or design decisions given by the user
• Main code actions or behaviors from the AI
• Important variables, functions, modules, or outputs discussed
• Any unresolved questions or next steps
Produce the summary in a structured format like:
**Objective:**
**User instructions:** … (bulleted)
**AI actions / code behavior:** … (bulleted)
**Important entities:** … (e.g. function names, variables, files)
**Open issues / next steps:** … (if any)
**Summary (concise):** (one or two sentences)

View File

@@ -15,7 +15,6 @@
* current platform / architecture, an error is thrown.
*/
import { spawnSync } from "child_process";
import fs from "fs";
import path from "path";
import { fileURLToPath, pathToFileURL } from "url";
@@ -35,7 +34,7 @@ const wantsNative = fs.existsSync(path.join(__dirname, "use-native")) ||
: false);
// Try native binary if requested.
if (wantsNative) {
if (wantsNative && process.platform !== 'win32') {
const { platform, arch } = process;
let targetTriple = null;
@@ -74,22 +73,77 @@ if (wantsNative) {
}
const binaryPath = path.join(__dirname, "..", "bin", `codex-${targetTriple}`);
const result = spawnSync(binaryPath, process.argv.slice(2), {
// Use an asynchronous spawn instead of spawnSync so that Node is able to
// respond to signals (e.g. Ctrl-C / SIGINT) while the native binary is
// executing. This allows us to forward those signals to the child process
// and guarantees that when either the child terminates or the parent
// receives a fatal signal, both processes exit in a predictable manner.
const { spawn } = await import("child_process");
const child = spawn(binaryPath, process.argv.slice(2), {
stdio: "inherit",
env: { ...process.env, CODEX_MANAGED_BY_NPM: "1" },
});
const exitCode = typeof result.status === "number" ? result.status : 1;
process.exit(exitCode);
}
child.on("error", (err) => {
// Typically triggered when the binary is missing or not executable.
// Re-throwing here will terminate the parent with a non-zero exit code
// while still printing a helpful stack trace.
// eslint-disable-next-line no-console
console.error(err);
process.exit(1);
});
// Fallback: execute the original JavaScript CLI.
// Forward common termination signals to the child so that it shuts down
// gracefully. In the handler we temporarily disable the default behavior of
// exiting immediately; once the child has been signaled we simply wait for
// its exit event which will in turn terminate the parent (see below).
const forwardSignal = (signal) => {
if (child.killed) {
return;
}
try {
child.kill(signal);
} catch {
/* ignore */
}
};
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;
["SIGINT", "SIGTERM", "SIGHUP"].forEach((sig) => {
process.on(sig, () => forwardSignal(sig));
});
// Load and execute the CLI
(async () => {
// When the child exits, mirror its termination reason in the parent so that
// shell scripts and other tooling observe the correct exit status.
// Wrap the lifetime of the child process in a Promise so that we can await
// its termination in a structured way. The Promise resolves with an object
// describing how the child exited: either via exit code or due to a signal.
const childResult = await new Promise((resolve) => {
child.on("exit", (code, signal) => {
if (signal) {
resolve({ type: "signal", signal });
} else {
resolve({ type: "code", exitCode: code ?? 1 });
}
});
});
if (childResult.type === "signal") {
// Re-emit the same signal so that the parent terminates with the expected
// semantics (this also sets the correct exit code of 128 + n).
process.kill(process.pid, childResult.signal);
} else {
process.exit(childResult.exitCode);
}
} else {
// Fallback: execute the original JavaScript CLI.
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;
// Load and execute the CLI
try {
await import(cliUrl);
} catch (err) {
@@ -97,4 +151,4 @@ const cliUrl = pathToFileURL(cliPath).href;
console.error(err);
process.exit(1);
}
})();
}

View File

@@ -370,11 +370,26 @@ export function isSafeCommand(
reason: "View file with line numbers",
group: "Reading files",
};
case "rg":
case "rg": {
// Certain ripgrep options execute external commands or invoke other
// processes, so we must reject them.
const isUnsafe = command.some(
(arg: string) =>
UNSAFE_OPTIONS_FOR_RIPGREP_WITHOUT_ARGS.has(arg) ||
[...UNSAFE_OPTIONS_FOR_RIPGREP_WITH_ARGS].some(
(opt) => arg === opt || arg.startsWith(`${opt}=`),
),
);
if (isUnsafe) {
break;
}
return {
reason: "Ripgrep search",
group: "Searching",
};
}
case "find": {
// Certain options to `find` allow executing arbitrary processes, so we
// cannot auto-approve them.
@@ -495,6 +510,22 @@ const UNSAFE_OPTIONS_FOR_FIND_COMMAND: ReadonlySet<string> = new Set([
"-fprintf",
]);
// Ripgrep options that are considered unsafe because they may execute
// arbitrary commands or spawn auxiliary processes.
const UNSAFE_OPTIONS_FOR_RIPGREP_WITH_ARGS: ReadonlySet<string> = new Set([
// Executes an arbitrary command for each matching file.
"--pre",
// Allows custom hostname command which could leak environment details.
"--hostname-bin",
]);
const UNSAFE_OPTIONS_FOR_RIPGREP_WITHOUT_ARGS: ReadonlySet<string> = new Set([
// Enables searching inside archives which triggers external decompression
// utilities reject out of an abundance of caution.
"--search-zip",
"-z",
]);
// ---------------- Helper utilities for complex shell expressions -----------------
// A conservative allow-list of bash operators that do not, on their own, cause

View File

@@ -854,7 +854,7 @@ export default function TerminalChatInput({
/>
) : (
<Text dimColor>
ctrl+c to exit | "/" to see commands | enter to send
Ctrl+C to exit | "/" to see commands | Enter to send
{contextLeftPercent > 25 && (
<>
{" — "}

View File

@@ -96,7 +96,7 @@ export default function HelpOverlay({
</Box>
<Box paddingX={1}>
<Text dimColor>esc or q to close</Text>
<Text dimColor>Esc or q to close</Text>
</Box>
</Box>
);

View File

@@ -147,4 +147,8 @@ const READ_ONLY_SEATBELT_POLICY = `
(sysctl-name "kern.version")
(sysctl-name "sysctl.proc_cputype")
(sysctl-name-prefix "hw.perflevel")
)`.trim();
)
; Added on top of Chrome profile
; Needed for python multiprocessing on MacOS for the SemLock
(allow ipc-posix-sem)`.trim();

View File

@@ -68,7 +68,7 @@ export function WaitingForAuth(): JSX.Element {
<Spinner type="ball" />
<Text>
{" "}
Waiting for authentication <Text dimColor>ctrl + c to quit</Text>
Waiting for authentication <Text dimColor>Ctrl + C to quit</Text>
</Text>
</Box>
);

View File

@@ -44,6 +44,14 @@ describe("canAutoApprove()", () => {
group: "Navigating",
runInSandbox: false,
});
// Ripgrep safe invocation.
expect(check(["rg", "TODO"])).toEqual({
type: "auto-approve",
reason: "Ripgrep search",
group: "Searching",
runInSandbox: false,
});
});
test("simple safe commands within a `bash -lc` call", () => {
@@ -67,6 +75,24 @@ describe("canAutoApprove()", () => {
});
});
test("ripgrep unsafe flags", () => {
// Flags that do not take arguments
expect(check(["rg", "--search-zip", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "-z", "TODO"])).toEqual({ type: "ask-user" });
// Flags that take arguments (provided separately)
expect(check(["rg", "--pre", "cat", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "--hostname-bin", "hostname", "TODO"])).toEqual({
type: "ask-user",
});
// Flags that take arguments in = form
expect(check(["rg", "--pre=cat", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "--hostname-bin=hostname", "TODO"])).toEqual({
type: "ask-user",
});
});
test("bash -lc commands with unsafe redirects", () => {
expect(check(["bash", "-lc", "echo hello > file.txt"])).toEqual({
type: "ask-user",

446
codex-rs/Cargo.lock generated
View File

@@ -399,6 +399,15 @@ version = "2.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6099cdc01846bc367c4e7dd630dc5966dccf36b652fae7a74e17b640411a91b2"
[[package]]
name = "block-buffer"
version = "0.10.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3078c7629b62d3f0439517fa394996acacc5cbc91c5a20d8c658e77abd503a71"
dependencies = [
"generic-array",
]
[[package]]
name = "bstr"
version = "1.12.0"
@@ -454,18 +463,18 @@ checksum = "df8670b8c7b9dae1793364eafadf7239c40d669904660c5960d74cfd80b46a53"
[[package]]
name = "castaway"
version = "0.2.3"
version = "0.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0abae9be0aaf9ea96a3b1b8b1b55c602ca751eba1b1500220cea4ecbafe7c0d5"
checksum = "dec551ab6e7578819132c713a93c022a05d60159dc86e7a7050223577484c55a"
dependencies = [
"rustversion",
]
[[package]]
name = "cc"
version = "1.2.29"
version = "1.2.30"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c1599538de2394445747c8cf7935946e3cc27e9625f889d979bfb2aaf569362"
checksum = "deec109607ca693028562ed836a5f1c4b8bd77755c4e132fc5ce11b0b6211ae7"
dependencies = [
"jobserver",
"libc",
@@ -561,9 +570,9 @@ checksum = "b94f61472cee1439c0b966b47e3aca9ae07e45d070759512cd390ea2bebc6675"
[[package]]
name = "clipboard-win"
version = "5.4.0"
version = "5.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "15efe7a882b08f34e38556b14f2fb3daa98769d06c7f0c1b076dfd0d983bc892"
checksum = "bde03770d3df201d4fb868f2c9c59e66a3e4e2bd06692a0fe701e7103c7e84d4"
dependencies = [
"error-code",
]
@@ -596,6 +605,18 @@ dependencies = [
"tree-sitter-bash",
]
[[package]]
name = "codex-arg0"
version = "0.0.0"
dependencies = [
"anyhow",
"codex-apply-patch",
"codex-core",
"codex-linux-sandbox",
"dotenvy",
"tokio",
]
[[package]]
name = "codex-chatgpt"
version = "0.0.0"
@@ -619,11 +640,11 @@ dependencies = [
"anyhow",
"clap",
"clap_complete",
"codex-arg0",
"codex-chatgpt",
"codex-common",
"codex-core",
"codex-exec",
"codex-linux-sandbox",
"codex-login",
"codex-mcp-server",
"codex-tui",
@@ -640,7 +661,7 @@ dependencies = [
"clap",
"codex-core",
"serde",
"toml 0.9.1",
"toml 0.9.4",
]
[[package]]
@@ -652,37 +673,48 @@ dependencies = [
"async-channel",
"base64 0.22.1",
"bytes",
"chrono",
"codex-apply-patch",
"codex-login",
"codex-mcp-client",
"core_test_support",
"dirs",
"env-flags",
"eventsource-stream",
"fs2",
"futures",
"landlock",
"libc",
"maplit",
"mcp-types",
"mime_guess",
"openssl-sys",
"predicates",
"pretty_assertions",
"rand 0.9.1",
"rand 0.9.2",
"reqwest",
"seccompiler",
"serde",
"serde_bytes",
"serde_json",
"strum_macros 0.27.1",
"sha1",
"shlex",
"similar",
"strum_macros 0.27.2",
"tempfile",
"thiserror 2.0.12",
"time",
"tokio",
"tokio-test",
"tokio-util",
"toml 0.9.1",
"toml 0.9.4",
"toml_edit 0.23.3",
"tracing",
"tree-sitter",
"tree-sitter-bash",
"uuid",
"walkdir",
"whoami",
"wildmatch",
"wiremock",
]
@@ -692,14 +724,18 @@ name = "codex-exec"
version = "0.0.0"
dependencies = [
"anyhow",
"assert_cmd",
"chrono",
"clap",
"codex-arg0",
"codex-common",
"codex-core",
"codex-linux-sandbox",
"codex-ollama",
"owo-colors",
"predicates",
"serde_json",
"shlex",
"tempfile",
"tokio",
"tracing",
"tracing-subscriber",
@@ -744,6 +780,7 @@ version = "0.0.0"
dependencies = [
"anyhow",
"clap",
"codex-common",
"codex-core",
"landlock",
"libc",
@@ -756,10 +793,14 @@ dependencies = [
name = "codex-login"
version = "0.0.0"
dependencies = [
"base64 0.22.1",
"chrono",
"pretty_assertions",
"reqwest",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.12",
"tokio",
]
@@ -781,17 +822,42 @@ name = "codex-mcp-server"
version = "0.0.0"
dependencies = [
"anyhow",
"assert_cmd",
"codex-arg0",
"codex-core",
"codex-linux-sandbox",
"mcp-types",
"mcp_test_support",
"pretty_assertions",
"schemars 0.8.22",
"serde",
"serde_json",
"shlex",
"strum_macros 0.27.2",
"tempfile",
"tokio",
"toml 0.9.1",
"tokio-test",
"toml 0.9.4",
"tracing",
"tracing-subscriber",
"uuid",
"wiremock",
]
[[package]]
name = "codex-ollama"
version = "0.0.0"
dependencies = [
"async-stream",
"bytes",
"codex-core",
"futures",
"reqwest",
"serde_json",
"tempfile",
"tokio",
"toml 0.9.4",
"tracing",
"wiremock",
]
[[package]]
@@ -800,37 +866,46 @@ version = "0.0.0"
dependencies = [
"anyhow",
"base64 0.22.1",
"chrono",
"clap",
"codex-ansi-escape",
"codex-arg0",
"codex-common",
"codex-core",
"codex-file-search",
"codex-linux-sandbox",
"codex-login",
"codex-ollama",
"color-eyre",
"crossterm",
"diffy",
"image",
"insta",
"lazy_static",
"mcp-types",
"path-clean",
"pretty_assertions",
"rand 0.8.5",
"ratatui",
"ratatui-image",
"regex-lite",
"reqwest",
"serde",
"serde_json",
"shlex",
"strum 0.27.1",
"strum_macros 0.27.1",
"strum 0.27.2",
"strum_macros 0.27.2",
"supports-color",
"textwrap 0.16.2",
"tokio",
"tracing",
"tracing-appender",
"tracing-subscriber",
"tui-input",
"tui-markdown",
"tui-textarea",
"unicode-segmentation",
"unicode-width 0.1.14",
"uuid",
"vt100",
]
[[package]]
@@ -933,10 +1008,29 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b"
[[package]]
name = "crc32fast"
version = "1.4.2"
name = "core_test_support"
version = "0.0.0"
dependencies = [
"codex-core",
"serde_json",
"tempfile",
"tokio",
]
[[package]]
name = "cpufeatures"
version = "0.2.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a97769d94ddab943e4510d138150169a2758b5ef3eb191a9ee688de3e23ef7b3"
checksum = "59ed5838eebb26a2bb2e58f6d5b5316989ae9d08bab10e0e6d103e656d1b0280"
dependencies = [
"libc",
]
[[package]]
name = "crc32fast"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9481c1c90cbf2ac953f07c8d4a58aa3945c425b7185c9154d67a65e4230da511"
dependencies = [
"cfg-if",
]
@@ -1006,6 +1100,16 @@ version = "0.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5"
[[package]]
name = "crypto-common"
version = "0.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1bfb12502f3fc46cca1bb51ac28df9d618d813cdc3d2f25b9fe775a34af26bb3"
dependencies = [
"generic-array",
"typenum",
]
[[package]]
name = "ctor"
version = "0.1.26"
@@ -1156,6 +1260,25 @@ version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6184e33543162437515c2e2b48714794e37845ec9851711914eec9d308f6ebe8"
[[package]]
name = "diffy"
version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b545b8c50194bdd008283985ab0b31dba153cfd5b3066a92770634fbc0d7d291"
dependencies = [
"nu-ansi-term 0.50.1",
]
[[package]]
name = "digest"
version = "0.10.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292"
dependencies = [
"block-buffer",
"crypto-common",
]
[[package]]
name = "dirs"
version = "6.0.0"
@@ -1225,6 +1348,12 @@ version = "0.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fea41bba32d969b513997752735605054bc0dfa92b4c56bf1189f2e174be7a10"
[[package]]
name = "dotenvy"
version = "0.15.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1aaf95b3e5c8f23aa320147307562d361db0ae0d51242340f558153b4eb2439b"
[[package]]
name = "dupe"
version = "0.9.1"
@@ -1457,7 +1586,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0ce92ff622d6dadf7349484f42c93271a0d49b7cc4d466a936405bacbe10aa78"
dependencies = [
"cfg-if",
"rustix 1.0.7",
"rustix 1.0.8",
"windows-sys 0.59.0",
]
@@ -1645,13 +1774,23 @@ dependencies = [
"byteorder",
]
[[package]]
name = "generic-array"
version = "0.14.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85649ca51fd72272d7821adaf274ad91c288277713d9c18820d8499a7ff69e9a"
dependencies = [
"typenum",
"version_check",
]
[[package]]
name = "getopts"
version = "0.2.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cba6ae63eb948698e300f645f87c70f76630d505f23b8907cf1e193ee85048c1"
dependencies = [
"unicode-width 0.2.0",
"unicode-width 0.2.1",
]
[[package]]
@@ -1896,9 +2035,9 @@ dependencies = [
[[package]]
name = "hyper-util"
version = "0.1.15"
version = "0.1.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f66d5bd4c6f02bf0542fad85d626775bab9258cf795a4256dcaf3161114d1df"
checksum = "8d9b05277c7e8da2c93a568989bb6207bef0112e8d17df7a6eda4a3cf143bc5e"
dependencies = [
"base64 0.22.1",
"bytes",
@@ -2165,9 +2304,9 @@ dependencies = [
[[package]]
name = "instability"
version = "0.3.7"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0bf9fed6d91cfb734e7476a06bde8300a1b94e217e1b523b6f0cd1a01998c71d"
checksum = "435d80800b936787d62688c927b6490e887c7ef5ff9ce922c6c6050fca75eb9a"
dependencies = [
"darling",
"indoc",
@@ -2198,9 +2337,9 @@ dependencies = [
[[package]]
name = "io-uring"
version = "0.7.8"
version = "0.7.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b86e202f00093dcba4275d4636b93ef9dd75d025ae560d2521b45ea28ab49013"
checksum = "d93587f37623a1a17d94ef2bc9ada592f5465fe7732084ab7beefabe5c77c0c4"
dependencies = [
"bitflags 2.9.1",
"cfg-if",
@@ -2234,6 +2373,12 @@ dependencies = [
"windows-sys 0.59.0",
]
[[package]]
name = "is_ci"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7655c9839580ee829dfacba1d1278c2b7883e50a277ff7541299489d6bdfdc45"
[[package]]
name = "is_terminal_polyfill"
version = "1.70.1"
@@ -2404,9 +2549,9 @@ dependencies = [
[[package]]
name = "libredox"
version = "0.1.4"
version = "0.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1580801010e535496706ba011c15f8532df6b42297d2e471fec38ceadd8c0638"
checksum = "4488594b9328dee448adb906d8b126d9b7deb7cf5c22161ee591610bb1be83c0"
dependencies = [
"bitflags 2.9.1",
"libc",
@@ -2539,6 +2684,24 @@ dependencies = [
"serde_json",
]
[[package]]
name = "mcp_test_support"
version = "0.0.0"
dependencies = [
"anyhow",
"assert_cmd",
"codex-core",
"codex-mcp-server",
"mcp-types",
"pretty_assertions",
"serde_json",
"shlex",
"tempfile",
"tokio",
"uuid",
"wiremock",
]
[[package]]
name = "memchr"
version = "2.7.5"
@@ -2683,6 +2846,15 @@ dependencies = [
"winapi",
]
[[package]]
name = "nu-ansi-term"
version = "0.50.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d4a28e057d01f97e61255210fcff094d74ed0466038633e95017f5beb68e4399"
dependencies = [
"windows-sys 0.52.0",
]
[[package]]
name = "nucleo-matcher"
version = "0.3.1"
@@ -3102,7 +3274,7 @@ version = "3.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "edce586971a4dfaa28950c6f18ed55e0406c1ab88bbce2c6f6293a7aaba73d35"
dependencies = [
"toml_edit",
"toml_edit 0.22.27",
]
[[package]]
@@ -3214,9 +3386,9 @@ dependencies = [
[[package]]
name = "rand"
version = "0.9.1"
version = "0.9.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9fbfd9d094a40bf3ae768db9361049ace4c0e04a4fd6b359518bd7b73a73dd97"
checksum = "6db2770f06117d490610c7488547d543617b21bfa07796d7a12f6f1bd53850d1"
dependencies = [
"rand_chacha 0.9.0",
"rand_core 0.9.3",
@@ -3263,8 +3435,7 @@ dependencies = [
[[package]]
name = "ratatui"
version = "0.29.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eabd94c2f37801c20583fc49dd5cd6b0ba68c716787c2dd6ed18571e1e63117b"
source = "git+https://github.com/nornagon/ratatui?branch=nornagon-v0.29.0-patch#9b2ad1298408c45918ee9f8241a6f95498cdbed2"
dependencies = [
"bitflags 2.9.1",
"cassowary",
@@ -3278,7 +3449,7 @@ dependencies = [
"strum 0.26.3",
"unicode-segmentation",
"unicode-truncate",
"unicode-width 0.2.0",
"unicode-width 0.2.1",
]
[[package]]
@@ -3369,9 +3540,9 @@ dependencies = [
[[package]]
name = "redox_syscall"
version = "0.5.13"
version = "0.5.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0d04b7d0ee6b4a0207a0a7adb104d23ecb0b47d6beae7152d0fa34b692b29fd6"
checksum = "7e8af0dde094006011e6a740d4879319439489813bd0bcdc7d821beaeeff48ec"
dependencies = [
"bitflags 2.9.1",
]
@@ -3519,9 +3690,9 @@ dependencies = [
[[package]]
name = "rgb"
version = "0.8.51"
version = "0.8.52"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a457e416a0f90d246a4c3288bd7a25b2304ca727f253f95be383dd17af56be8f"
checksum = "0c6a884d2998352bb4daf0183589aec883f16a6da1f4dde84d8e2e9a5409a1ce"
[[package]]
name = "ring"
@@ -3597,22 +3768,22 @@ dependencies = [
[[package]]
name = "rustix"
version = "1.0.7"
version = "1.0.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c71e83d6afe7ff64890ec6b71d6a69bb8a610ab78ce364b3352876bb4c801266"
checksum = "11181fbabf243db407ef8df94a6ce0b2f9a733bd8be4ad02b4eda9602296cac8"
dependencies = [
"bitflags 2.9.1",
"errno",
"libc",
"linux-raw-sys 0.9.4",
"windows-sys 0.59.0",
"windows-sys 0.60.2",
]
[[package]]
name = "rustls"
version = "0.23.28"
version = "0.23.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7160e3e10bf4535308537f3c4e1641468cd0e485175d6163087c0393c7d46643"
checksum = "2491382039b29b9b11ff08b76ff6c97cf287671dbb74f0be44bda389fffe9bd1"
dependencies = [
"once_cell",
"rustls-pki-types",
@@ -3632,9 +3803,9 @@ dependencies = [
[[package]]
name = "rustls-webpki"
version = "0.103.3"
version = "0.103.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e4a72fe2bcf7a6ac6fd7d0b9e5cb68aeb7d4c0a0271730218b3e92d43b4eb435"
checksum = "0a17884ae0c1b773f1ccd2bd4a8c72f16da897310a98b0e84bf349ad5ead92fc"
dependencies = [
"ring",
"rustls-pki-types",
@@ -3836,6 +4007,15 @@ dependencies = [
"serde_derive",
]
[[package]]
name = "serde_bytes"
version = "0.11.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8437fd221bde2d4ca316d61b90e337e9e702b3820b87d63caa9ba6c02bd06d96"
dependencies = [
"serde",
]
[[package]]
name = "serde_derive"
version = "1.0.219"
@@ -3860,9 +4040,9 @@ dependencies = [
[[package]]
name = "serde_json"
version = "1.0.140"
version = "1.0.142"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "20068b6e96dc6c9bd23e01df8827e6c7e1f2fddd43c21810382803c136b99373"
checksum = "030fedb782600dcbd6f02d479bf0d817ac3bb40d644745b769d6a96bc3afc5a7"
dependencies = [
"indexmap 2.10.0",
"itoa",
@@ -3944,6 +4124,17 @@ dependencies = [
"syn 2.0.104",
]
[[package]]
name = "sha1"
version = "0.10.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3bf829a2d51ab4a5ddf1352d8470c140cadc8301b2ae1789db023f01cedd6ba"
dependencies = [
"cfg-if",
"cpufeatures",
"digest",
]
[[package]]
name = "sharded-slab"
version = "0.1.7"
@@ -4035,13 +4226,19 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "67b1b7a3b5fe4f1376887184045fcf45c69e92af734b7aaddc05fb777b6fbd03"
[[package]]
name = "socket2"
version = "0.5.10"
name = "smawk"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e22376abed350d73dd1cd119b57ffccad95b4e585a7cda43e286245ce23c0678"
checksum = "b7c388c1b5e93756d0c740965c41e8822f866621d41acbdf6336a6a168f8840c"
[[package]]
name = "socket2"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "233504af464074f9d066d7b5416c5f9b894a5862a6506e306f7b816cdd6f1807"
dependencies = [
"libc",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -4086,7 +4283,7 @@ dependencies = [
"starlark_syntax",
"static_assertions",
"strsim 0.10.0",
"textwrap",
"textwrap 0.11.0",
"thiserror 1.0.69",
]
@@ -4187,9 +4384,9 @@ dependencies = [
[[package]]
name = "strum"
version = "0.27.1"
version = "0.27.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f64def088c51c9510a8579e3c5d67c65349dcf755e5479ad3d010aa6454e2c32"
checksum = "af23d6f6c1a224baef9d3f61e287d2761385a5b88fdab4eb4c6f11aeb54c4bcf"
[[package]]
name = "strum_macros"
@@ -4206,14 +4403,13 @@ dependencies = [
[[package]]
name = "strum_macros"
version = "0.27.1"
version = "0.27.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c77a8c5abcaf0f9ce05d62342b7d298c346515365c36b673df4ebe3ced01fde8"
checksum = "7695ce3845ea4b33927c055a39dc438a45b059f7c1b3d91d38d10355fb8cbca7"
dependencies = [
"heck",
"proc-macro2",
"quote",
"rustversion",
"syn 2.0.104",
]
@@ -4223,6 +4419,15 @@ version = "2.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "13c2bddecc57b384dee18652358fb23172facb8a2c51ccc10d74c157bdea3292"
[[package]]
name = "supports-color"
version = "3.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c64fc7232dd8d2e4ac5ce4ef302b1d81e0b80d055b9d77c7c4f51f6aa4c867d6"
dependencies = [
"is_ci",
]
[[package]]
name = "syn"
version = "1.0.109"
@@ -4336,7 +4541,7 @@ dependencies = [
"fastrand",
"getrandom 0.3.3",
"once_cell",
"rustix 1.0.7",
"rustix 1.0.8",
"windows-sys 0.59.0",
]
@@ -4357,7 +4562,7 @@ version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "45c6481c4829e4cc63825e62c49186a34538b7b2750b73b266581ffb612fb5ed"
dependencies = [
"rustix 1.0.7",
"rustix 1.0.8",
"windows-sys 0.59.0",
]
@@ -4376,6 +4581,17 @@ dependencies = [
"unicode-width 0.1.14",
]
[[package]]
name = "textwrap"
version = "0.16.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c13547615a44dc9c452a8a534638acdf07120d4b6847c8178705da06306a3057"
dependencies = [
"smawk",
"unicode-linebreak",
"unicode-width 0.2.1",
]
[[package]]
name = "thiserror"
version = "1.0.69"
@@ -4490,9 +4706,9 @@ dependencies = [
[[package]]
name = "tokio"
version = "1.46.1"
version = "1.47.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0cc3a2344dafbe23a245241fe8b09735b521110d30fcefbbd5feb1797ca35d17"
checksum = "89e49afdadebb872d3145a5638b59eb0691ea23e46ca484037cfab3b76b95038"
dependencies = [
"backtrace",
"bytes",
@@ -4505,7 +4721,7 @@ dependencies = [
"slab",
"socket2",
"tokio-macros",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -4585,14 +4801,14 @@ dependencies = [
"serde",
"serde_spanned 0.6.9",
"toml_datetime 0.6.11",
"toml_edit",
"toml_edit 0.22.27",
]
[[package]]
name = "toml"
version = "0.9.1"
version = "0.9.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0207d6ed1852c2a124c1fbec61621acb8330d2bf969a5d0643131e9affd985a5"
checksum = "41ae868b5a0f67631c14589f7e250c1ea2c574ee5ba21c6c8dd4b1485705a5a1"
dependencies = [
"indexmap 2.10.0",
"serde",
@@ -4635,19 +4851,32 @@ dependencies = [
]
[[package]]
name = "toml_parser"
version = "1.0.0"
name = "toml_edit"
version = "0.23.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b5c1c469eda89749d2230d8156a5969a69ffe0d6d01200581cdc6110674d293e"
checksum = "17d3b47e6b7a040216ae5302712c94d1cf88c95b47efa80e2c59ce96c878267e"
dependencies = [
"indexmap 2.10.0",
"toml_datetime 0.7.0",
"toml_parser",
"toml_writer",
"winnow",
]
[[package]]
name = "toml_parser"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b551886f449aa90d4fe2bdaa9f4a2577ad2dde302c61ecf262d80b116db95c10"
dependencies = [
"winnow",
]
[[package]]
name = "toml_writer"
version = "1.0.0"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b679217f2848de74cabd3e8fc5e6d66f40b7da40f8e1954d92054d9010690fd5"
checksum = "fcc842091f2def52017664b53082ecbbeb5c7731092bad69d2c63050401dfd64"
[[package]]
name = "tower"
@@ -4767,7 +4996,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8189decb5ac0fa7bc8b96b7cb9b2701d60d48805aca84a238004d665fcc4008"
dependencies = [
"matchers",
"nu-ansi-term",
"nu-ansi-term 0.46.0",
"once_cell",
"regex",
"sharded-slab",
@@ -4780,9 +5009,9 @@ dependencies = [
[[package]]
name = "tree-sitter"
version = "0.25.6"
version = "0.25.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7cf18d43cbf0bfca51f657132cc616a5097edc4424d538bae6fa60142eaf9f0"
checksum = "6d7b8994f367f16e6fa14b5aebbcb350de5d7cbea82dc5b00ae997dd71680dd2"
dependencies = [
"cc",
"regex",
@@ -4821,7 +5050,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "911e93158bf80bbc94bad533b2b16e3d711e1132d69a6a6980c3920a63422c19"
dependencies = [
"ratatui",
"unicode-width 0.2.0",
"unicode-width 0.2.1",
]
[[package]]
@@ -4841,15 +5070,10 @@ dependencies = [
]
[[package]]
name = "tui-textarea"
version = "0.7.0"
name = "typenum"
version = "1.18.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0a5318dd619ed73c52a9417ad19046724effc1287fb75cdcc4eca1d6ac1acbae"
dependencies = [
"crossterm",
"ratatui",
"unicode-width 0.2.0",
]
checksum = "1dccffe3ce07af9386bfd29e80c0ab1a8205a2fc34e4bcd40364df902cfa8f3f"
[[package]]
name = "unicase"
@@ -4863,6 +5087,12 @@ version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a5f39404a5da50712a4c1eecf25e90dd62b613502b7e925fd4e4d19b5c96512"
[[package]]
name = "unicode-linebreak"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3b09c83c3c29d37506a3e260c08c03743a6bb66a9cd432c6934ab501a190571f"
[[package]]
name = "unicode-segmentation"
version = "1.12.0"
@@ -4888,9 +5118,9 @@ checksum = "7dd6e30e90baa6f72411720665d41d89b9a3d039dc45b8faea1ddd07f617f6af"
[[package]]
name = "unicode-width"
version = "0.2.0"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fc81956842c57dac11422a97c3b8195a1ff727f06e85c84ed2e8aa277c9a0fd"
checksum = "4a1a07cc7db3810833284e8d372ccdc6da29741639ecc70c9ec107df0fa6154c"
[[package]]
name = "unicode-xid"
@@ -4975,6 +5205,27 @@ version = "0.9.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
[[package]]
name = "vt100"
version = "0.16.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "054ff75fb8fa83e609e685106df4faeffdf3a735d3c74ebce97ec557d5d36fd9"
dependencies = [
"itoa",
"unicode-width 0.2.1",
"vte",
]
[[package]]
name = "vte"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a5924018406ce0063cd67f8e008104968b74b563ee1b85dde3ed1f7cb87d3dbd"
dependencies = [
"arrayvec",
"memchr",
]
[[package]]
name = "wait-timeout"
version = "0.2.1"
@@ -5018,6 +5269,12 @@ dependencies = [
"wit-bindgen-rt",
]
[[package]]
name = "wasite"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b8dad83b4f25e74f184f64c43b150b91efe7647395b42289f38e50566d82855b"
[[package]]
name = "wasm-bindgen"
version = "0.2.100"
@@ -5118,6 +5375,17 @@ version = "0.1.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a751b3277700db47d3e574514de2eced5e54dc8a5436a3bf7a0b248b2cee16f3"
[[package]]
name = "whoami"
version = "1.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6994d13118ab492c3c80c1f81928718159254c53c472bf9ce36f8dae4add02a7"
dependencies = [
"redox_syscall",
"wasite",
"web-sys",
]
[[package]]
name = "wildmatch"
version = "2.4.0"
@@ -5446,9 +5714,9 @@ checksum = "271414315aff87387382ec3d271b52d7ae78726f5d44ac98b4f4030c91880486"
[[package]]
name = "winnow"
version = "0.7.11"
version = "0.7.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "74c7b26e3480b707944fc872477815d29a8e429d2f93a1ce000f5fa84a15cbcd"
checksum = "f3edebf492c8125044983378ecb5766203ad3b4c2f7a922bd7dd207f6d443e95"
dependencies = [
"memchr",
]

View File

@@ -1,8 +1,8 @@
[workspace]
resolver = "2"
members = [
"ansi-escape",
"apply-patch",
"arg0",
"cli",
"common",
"core",
@@ -14,8 +14,10 @@ members = [
"mcp-client",
"mcp-server",
"mcp-types",
"ollama",
"tui",
]
resolver = "2"
[workspace.package]
version = "0.0.0"
@@ -40,3 +42,7 @@ strip = "symbols"
# See https://github.com/openai/codex/issues/1411 for details.
codegen-units = 1
[patch.crates-io]
# ratatui = { path = "../../ratatui" }
ratatui = { git = "https://github.com/nornagon/ratatui", branch = "nornagon-v0.29.0-patch" }

View File

@@ -1,7 +1,7 @@
[package]
edition = "2024"
name = "codex-ansi-escape"
version = { workspace = true }
edition = "2024"
[lib]
name = "codex_ansi_escape"
@@ -10,7 +10,7 @@ path = "src/lib.rs"
[dependencies]
ansi-to-tui = "7.0.0"
ratatui = { version = "0.29.0", features = [
"unstable-widget-ref",
"unstable-rendered-line-info",
"unstable-widget-ref",
] }
tracing = { version = "0.1.41", features = ["log"] }

View File

@@ -1,7 +1,7 @@
[package]
edition = "2024"
name = "codex-apply-patch"
version = { workspace = true }
edition = "2024"
[lib]
name = "codex_apply_patch"
@@ -14,7 +14,7 @@ workspace = true
anyhow = "1"
similar = "2.7.0"
thiserror = "2.0.12"
tree-sitter = "0.25.3"
tree-sitter = "0.25.8"
tree-sitter-bash = "0.25.0"
[dev-dependencies]

View File

@@ -42,6 +42,15 @@ impl From<std::io::Error> for ApplyPatchError {
}
}
impl From<&std::io::Error> for ApplyPatchError {
fn from(err: &std::io::Error) -> Self {
ApplyPatchError::IoError(IoError {
context: "I/O error".to_string(),
source: std::io::Error::new(err.kind(), err.to_string()),
})
}
}
#[derive(Debug, Error)]
#[error("{context}: {source}")]
pub struct IoError {
@@ -58,16 +67,24 @@ impl PartialEq for IoError {
#[derive(Debug, PartialEq)]
pub enum MaybeApplyPatch {
Body(Vec<Hunk>),
Body(ApplyPatchArgs),
ShellParseError(ExtractHeredocError),
PatchParseError(ParseError),
NotApplyPatch,
}
/// Both the raw PATCH argument to `apply_patch` as well as the PATCH argument
/// parsed into hunks.
#[derive(Debug, PartialEq)]
pub struct ApplyPatchArgs {
pub patch: String,
pub hunks: Vec<Hunk>,
}
pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
match argv {
[cmd, body] if cmd == "apply_patch" => match parse_patch(body) {
Ok(hunks) => MaybeApplyPatch::Body(hunks),
Ok(source) => MaybeApplyPatch::Body(source),
Err(e) => MaybeApplyPatch::PatchParseError(e),
},
[bash, flag, script]
@@ -77,7 +94,7 @@ pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
{
match extract_heredoc_body_from_apply_patch_command(script) {
Ok(body) => match parse_patch(&body) {
Ok(hunks) => MaybeApplyPatch::Body(hunks),
Ok(source) => MaybeApplyPatch::Body(source),
Err(e) => MaybeApplyPatch::PatchParseError(e),
},
Err(e) => MaybeApplyPatch::ShellParseError(e),
@@ -116,11 +133,19 @@ pub enum MaybeApplyPatchVerified {
NotApplyPatch,
}
#[derive(Debug, PartialEq)]
/// ApplyPatchAction is the result of parsing an `apply_patch` command. By
/// construction, all paths should be absolute paths.
#[derive(Debug, PartialEq)]
pub struct ApplyPatchAction {
changes: HashMap<PathBuf, ApplyPatchFileChange>,
/// The raw patch argument that can be used with `apply_patch` as an exec
/// call. i.e., if the original arg was parsed in "lenient" mode with a
/// heredoc, this should be the value without the heredoc wrapper.
pub patch: String,
/// The working directory that was used to resolve relative paths in the patch.
pub cwd: PathBuf,
}
impl ApplyPatchAction {
@@ -140,8 +165,28 @@ impl ApplyPatchAction {
panic!("path must be absolute");
}
#[allow(clippy::expect_used)]
let filename = path
.file_name()
.expect("path should not be empty")
.to_string_lossy();
let patch = format!(
r#"*** Begin Patch
*** Update File: {filename}
@@
+ {content}
*** End Patch"#,
);
let changes = HashMap::from([(path.to_path_buf(), ApplyPatchFileChange::Add { content })]);
Self { changes }
#[allow(clippy::expect_used)]
Self {
changes,
cwd: path
.parent()
.expect("path should have parent")
.to_path_buf(),
patch,
}
}
}
@@ -149,7 +194,7 @@ impl ApplyPatchAction {
/// patch.
pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApplyPatchVerified {
match maybe_parse_apply_patch(argv) {
MaybeApplyPatch::Body(hunks) => {
MaybeApplyPatch::Body(ApplyPatchArgs { patch, hunks }) => {
let mut changes = HashMap::new();
for hunk in hunks {
let path = hunk.resolve_path(cwd);
@@ -183,7 +228,11 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
}
}
}
MaybeApplyPatchVerified::Body(ApplyPatchAction { changes })
MaybeApplyPatchVerified::Body(ApplyPatchAction {
changes,
patch,
cwd: cwd.to_path_buf(),
})
}
MaybeApplyPatch::ShellParseError(e) => MaybeApplyPatchVerified::ShellParseError(e),
MaybeApplyPatch::PatchParseError(e) => MaybeApplyPatchVerified::CorrectnessError(e.into()),
@@ -264,7 +313,7 @@ pub fn apply_patch(
stderr: &mut impl std::io::Write,
) -> Result<(), ApplyPatchError> {
let hunks = match parse_patch(patch) {
Ok(hunks) => hunks,
Ok(source) => source.hunks,
Err(e) => {
match &e {
InvalidPatchError(message) => {
@@ -326,13 +375,21 @@ pub fn apply_hunks(
match apply_hunks_to_files(hunks) {
Ok(affected) => {
print_summary(&affected, stdout).map_err(ApplyPatchError::from)?;
Ok(())
}
Err(err) => {
writeln!(stderr, "{err:?}").map_err(ApplyPatchError::from)?;
let msg = err.to_string();
writeln!(stderr, "{msg}").map_err(ApplyPatchError::from)?;
if let Some(io) = err.downcast_ref::<std::io::Error>() {
Err(ApplyPatchError::from(io))
} else {
Err(ApplyPatchError::IoError(IoError {
context: msg,
source: std::io::Error::other(err),
}))
}
}
}
Ok(())
}
/// Applies each parsed patch hunk to the filesystem.
@@ -652,7 +709,7 @@ mod tests {
]);
match maybe_parse_apply_patch(&args) {
MaybeApplyPatch::Body(hunks) => {
MaybeApplyPatch::Body(ApplyPatchArgs { hunks, patch: _ }) => {
assert_eq!(
hunks,
vec![Hunk::AddFile {
@@ -679,7 +736,7 @@ PATCH"#,
]);
match maybe_parse_apply_patch(&args) {
MaybeApplyPatch::Body(hunks) => {
MaybeApplyPatch::Body(ApplyPatchArgs { hunks, patch: _ }) => {
assert_eq!(
hunks,
vec![Hunk::AddFile {
@@ -954,7 +1011,7 @@ PATCH"#,
));
let patch = parse_patch(&patch).unwrap();
let update_file_chunks = match patch.as_slice() {
let update_file_chunks = match patch.hunks.as_slice() {
[Hunk::UpdateFile { chunks, .. }] => chunks,
_ => panic!("Expected a single UpdateFile hunk"),
};
@@ -992,7 +1049,7 @@ PATCH"#,
));
let patch = parse_patch(&patch).unwrap();
let chunks = match patch.as_slice() {
let chunks = match patch.hunks.as_slice() {
[Hunk::UpdateFile { chunks, .. }] => chunks,
_ => panic!("Expected a single UpdateFile hunk"),
};
@@ -1029,7 +1086,7 @@ PATCH"#,
));
let patch = parse_patch(&patch).unwrap();
let chunks = match patch.as_slice() {
let chunks = match patch.hunks.as_slice() {
[Hunk::UpdateFile { chunks, .. }] => chunks,
_ => panic!("Expected a single UpdateFile hunk"),
};
@@ -1064,7 +1121,7 @@ PATCH"#,
));
let patch = parse_patch(&patch).unwrap();
let chunks = match patch.as_slice() {
let chunks = match patch.hunks.as_slice() {
[Hunk::UpdateFile { chunks, .. }] => chunks,
_ => panic!("Expected a single UpdateFile hunk"),
};
@@ -1110,7 +1167,7 @@ PATCH"#,
// Extract chunks then build the unified diff.
let parsed = parse_patch(&patch).unwrap();
let chunks = match parsed.as_slice() {
let chunks = match parsed.hunks.as_slice() {
[Hunk::UpdateFile { chunks, .. }] => chunks,
_ => panic!("Expected a single UpdateFile hunk"),
};
@@ -1193,7 +1250,29 @@ g
new_content: "updated session directory content\n".to_string(),
},
)]),
patch: argv[1].clone(),
cwd: session_dir.path().to_path_buf(),
})
);
}
#[test]
fn test_apply_patch_fails_on_write_error() {
let dir = tempdir().unwrap();
let path = dir.path().join("readonly.txt");
fs::write(&path, "before\n").unwrap();
let mut perms = fs::metadata(&path).unwrap().permissions();
perms.set_readonly(true);
fs::set_permissions(&path, perms).unwrap();
let patch = wrap_patch(&format!(
"*** Update File: {}\n@@\n-before\n+after\n*** End Patch",
path.display()
));
let mut stdout = Vec::new();
let mut stderr = Vec::new();
let result = apply_patch(&patch, &mut stdout, &mut stderr);
assert!(result.is_err());
}
}

View File

@@ -22,6 +22,7 @@
//!
//! The parser below is a little more lenient than the explicit spec and allows for
//! leading/trailing whitespace around patch markers.
use crate::ApplyPatchArgs;
use std::path::Path;
use std::path::PathBuf;
@@ -102,7 +103,7 @@ pub struct UpdateFileChunk {
pub is_end_of_file: bool,
}
pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
pub fn parse_patch(patch: &str) -> Result<ApplyPatchArgs, ParseError> {
let mode = if PARSE_IN_STRICT_MODE {
ParseMode::Strict
} else {
@@ -150,7 +151,7 @@ enum ParseMode {
Lenient,
}
fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<Vec<Hunk>, ParseError> {
fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<ApplyPatchArgs, ParseError> {
let lines: Vec<&str> = patch.trim().lines().collect();
let lines: &[&str] = match check_patch_boundaries_strict(&lines) {
Ok(()) => &lines,
@@ -173,7 +174,8 @@ fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<Vec<Hunk>, ParseErro
line_number += hunk_lines;
remaining_lines = &remaining_lines[hunk_lines..]
}
Ok(hunks)
let patch = lines.join("\n");
Ok(ApplyPatchArgs { hunks, patch })
}
/// Checks the start and end lines of the patch text for `apply_patch`,
@@ -425,6 +427,7 @@ fn parse_update_file_chunk(
}
#[test]
#[allow(clippy::unwrap_used)]
fn test_parse_patch() {
assert_eq!(
parse_patch_text("bad", ParseMode::Strict),
@@ -455,8 +458,10 @@ fn test_parse_patch() {
"*** Begin Patch\n\
*** End Patch",
ParseMode::Strict
),
Ok(Vec::new())
)
.unwrap()
.hunks,
Vec::new()
);
assert_eq!(
parse_patch_text(
@@ -472,8 +477,10 @@ fn test_parse_patch() {
+ return 123\n\
*** End Patch",
ParseMode::Strict
),
Ok(vec![
)
.unwrap()
.hunks,
vec![
AddFile {
path: PathBuf::from("path/add.py"),
contents: "abc\ndef\n".to_string()
@@ -491,7 +498,7 @@ fn test_parse_patch() {
is_end_of_file: false
}]
}
])
]
);
// Update hunk followed by another hunk (Add File).
assert_eq!(
@@ -504,8 +511,10 @@ fn test_parse_patch() {
+content\n\
*** End Patch",
ParseMode::Strict
),
Ok(vec![
)
.unwrap()
.hunks,
vec![
UpdateFile {
path: PathBuf::from("file.py"),
move_path: None,
@@ -520,7 +529,7 @@ fn test_parse_patch() {
path: PathBuf::from("other.py"),
contents: "content\n".to_string()
}
])
]
);
// Update hunk without an explicit @@ header for the first chunk should parse.
@@ -533,8 +542,10 @@ fn test_parse_patch() {
+bar
*** End Patch"#,
ParseMode::Strict
),
Ok(vec![UpdateFile {
)
.unwrap()
.hunks,
vec![UpdateFile {
path: PathBuf::from("file2.py"),
move_path: None,
chunks: vec![UpdateFileChunk {
@@ -543,7 +554,7 @@ fn test_parse_patch() {
new_lines: vec!["import foo".to_string(), "bar".to_string()],
is_end_of_file: false,
}],
}])
}]
);
}
@@ -574,7 +585,10 @@ fn test_parse_patch_lenient() {
);
assert_eq!(
parse_patch_text(&patch_text_in_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
Ok(ApplyPatchArgs {
hunks: expected_patch.clone(),
patch: patch_text.to_string()
})
);
let patch_text_in_single_quoted_heredoc = format!("<<'EOF'\n{patch_text}\nEOF\n");
@@ -584,7 +598,10 @@ fn test_parse_patch_lenient() {
);
assert_eq!(
parse_patch_text(&patch_text_in_single_quoted_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
Ok(ApplyPatchArgs {
hunks: expected_patch.clone(),
patch: patch_text.to_string()
})
);
let patch_text_in_double_quoted_heredoc = format!("<<\"EOF\"\n{patch_text}\nEOF\n");
@@ -594,7 +611,10 @@ fn test_parse_patch_lenient() {
);
assert_eq!(
parse_patch_text(&patch_text_in_double_quoted_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
Ok(ApplyPatchArgs {
hunks: expected_patch.clone(),
patch: patch_text.to_string()
})
);
let patch_text_in_mismatched_quotes_heredoc = format!("<<\"EOF'\n{patch_text}\nEOF\n");

19
codex-rs/arg0/Cargo.toml Normal file
View File

@@ -0,0 +1,19 @@
[package]
edition = "2024"
name = "codex-arg0"
version = { workspace = true }
[lib]
name = "codex_arg0"
path = "src/lib.rs"
[lints]
workspace = true
[dependencies]
anyhow = "1"
codex-apply-patch = { path = "../apply-patch" }
codex-core = { path = "../core" }
codex-linux-sandbox = { path = "../linux-sandbox" }
dotenvy = "0.15.7"
tokio = { version = "1", features = ["rt-multi-thread"] }

91
codex-rs/arg0/src/lib.rs Normal file
View File

@@ -0,0 +1,91 @@
use std::future::Future;
use std::path::Path;
use std::path::PathBuf;
use codex_core::CODEX_APPLY_PATCH_ARG1;
/// While we want to deploy the Codex CLI as a single executable for simplicity,
/// we also want to expose some of its functionality as distinct CLIs, so we use
/// the "arg0 trick" to determine which CLI to dispatch. This effectively allows
/// us to simulate deploying multiple executables as a single binary on Mac and
/// Linux (but not Windows).
///
/// When the current executable is invoked through the hard-link or alias named
/// `codex-linux-sandbox` we *directly* execute
/// [`codex_linux_sandbox::run_main`] (which never returns). Otherwise we:
///
/// 1. Use [`dotenvy::from_path`] and [`dotenvy::dotenv`] to modify the
/// environment before creating any threads.
/// 2. Construct a Tokio multi-thread runtime.
/// 3. Derive the path to the current executable (so children can re-invoke the
/// sandbox) when running on Linux.
/// 4. Execute the provided async `main_fn` inside that runtime, forwarding any
/// error. Note that `main_fn` receives `codex_linux_sandbox_exe:
/// Option<PathBuf>`, as an argument, which is generally needed as part of
/// constructing [`codex_core::config::Config`].
///
/// This function should be used to wrap any `main()` function in binary crates
/// in this workspace that depends on these helper CLIs.
pub fn arg0_dispatch_or_else<F, Fut>(main_fn: F) -> anyhow::Result<()>
where
F: FnOnce(Option<PathBuf>) -> Fut,
Fut: Future<Output = anyhow::Result<()>>,
{
// Determine if we were invoked via the special alias.
let mut args = std::env::args_os();
let argv0 = args.next().unwrap_or_default();
let exe_name = Path::new(&argv0)
.file_name()
.and_then(|s| s.to_str())
.unwrap_or("");
if exe_name == "codex-linux-sandbox" {
// Safety: [`run_main`] never returns.
codex_linux_sandbox::run_main();
}
let argv1 = args.next().unwrap_or_default();
if argv1 == CODEX_APPLY_PATCH_ARG1 {
let patch_arg = args.next().and_then(|s| s.to_str().map(|s| s.to_owned()));
let exit_code = match patch_arg {
Some(patch_arg) => {
let mut stdout = std::io::stdout();
let mut stderr = std::io::stderr();
match codex_apply_patch::apply_patch(&patch_arg, &mut stdout, &mut stderr) {
Ok(()) => 0,
Err(_) => 1,
}
}
None => {
eprintln!("Error: {CODEX_APPLY_PATCH_ARG1} requires a UTF-8 PATCH argument.");
1
}
};
std::process::exit(exit_code);
}
// This modifies the environment, which is not thread-safe, so do this
// before creating any threads/the Tokio runtime.
load_dotenv();
// Regular invocation create a Tokio runtime and execute the provided
// async entry-point.
let runtime = tokio::runtime::Runtime::new()?;
runtime.block_on(async move {
let codex_linux_sandbox_exe: Option<PathBuf> = if cfg!(target_os = "linux") {
std::env::current_exe().ok()
} else {
None
};
main_fn(codex_linux_sandbox_exe).await
})
}
/// Load env vars from ~/.codex/.env and `$(pwd)/.env`.
fn load_dotenv() {
if let Ok(codex_home) = codex_core::config::find_codex_home() {
dotenvy::from_path(codex_home.join(".env")).ok();
}
dotenvy::dotenv().ok();
}

View File

@@ -1,7 +1,7 @@
[package]
edition = "2024"
name = "codex-chatgpt"
version = { workspace = true }
edition = "2024"
[lints]
workspace = true
@@ -9,12 +9,12 @@ workspace = true
[dependencies]
anyhow = "1"
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
codex-common = { path = "../common", features = ["cli"] }
codex-core = { path = "../core" }
codex-login = { path = "../login" }
reqwest = { version = "0.12", features = ["json", "stream"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["full"] }
[dev-dependencies]

View File

@@ -1,3 +1,5 @@
use std::path::PathBuf;
use clap::Parser;
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
@@ -17,7 +19,10 @@ pub struct ApplyCommand {
#[clap(flatten)]
pub config_overrides: CliConfigOverrides,
}
pub async fn run_apply_command(apply_cli: ApplyCommand) -> anyhow::Result<()> {
pub async fn run_apply_command(
apply_cli: ApplyCommand,
cwd: Option<PathBuf>,
) -> anyhow::Result<()> {
let config = Config::load_with_cli_overrides(
apply_cli
.config_overrides
@@ -29,10 +34,13 @@ pub async fn run_apply_command(apply_cli: ApplyCommand) -> anyhow::Result<()> {
init_chatgpt_token_from_auth(&config.codex_home).await?;
let task_response = get_task(&config, apply_cli.task_id).await?;
apply_diff_from_task(task_response).await
apply_diff_from_task(task_response, cwd).await
}
pub async fn apply_diff_from_task(task_response: GetTaskResponse) -> anyhow::Result<()> {
pub async fn apply_diff_from_task(
task_response: GetTaskResponse,
cwd: Option<PathBuf>,
) -> anyhow::Result<()> {
let diff_turn = match task_response.current_diff_task_turn {
Some(turn) => turn,
None => anyhow::bail!("No diff turn found"),
@@ -42,13 +50,17 @@ pub async fn apply_diff_from_task(task_response: GetTaskResponse) -> anyhow::Res
_ => None,
});
match output_diff {
Some(output_diff) => apply_diff(&output_diff.diff).await,
Some(output_diff) => apply_diff(&output_diff.diff, cwd).await,
None => anyhow::bail!("No PR output item found"),
}
}
async fn apply_diff(diff: &str) -> anyhow::Result<()> {
let toplevel_output = tokio::process::Command::new("git")
async fn apply_diff(diff: &str, cwd: Option<PathBuf>) -> anyhow::Result<()> {
let mut cmd = tokio::process::Command::new("git");
if let Some(cwd) = cwd {
cmd.current_dir(cwd);
}
let toplevel_output = cmd
.args(vec!["rev-parse", "--show-toplevel"])
.output()
.await?;

View File

@@ -21,10 +21,14 @@ pub(crate) async fn chatgpt_get_request<T: DeserializeOwned>(
let token =
get_chatgpt_token_data().ok_or_else(|| anyhow::anyhow!("ChatGPT token not available"))?;
let account_id = token.account_id.ok_or_else(|| {
anyhow::anyhow!("ChatGPT account ID not available, please re-run `codex login`")
});
let response = client
.get(&url)
.bearer_auth(&token.access_token)
.header("chatgpt-account-id", &token.account_id)
.header("chatgpt-account-id", account_id?)
.header("Content-Type", "application/json")
.header("User-Agent", "codex-cli")
.send()

View File

@@ -18,7 +18,10 @@ pub fn set_chatgpt_token_data(value: TokenData) {
/// Initialize the ChatGPT token from auth.json file
pub async fn init_chatgpt_token_from_auth(codex_home: &Path) -> std::io::Result<()> {
let auth_json = codex_login::try_read_auth_json(codex_home).await?;
set_chatgpt_token_data(auth_json.tokens.clone());
let auth = codex_login::load_auth(codex_home, true)?;
if let Some(auth) = auth {
let token_data = auth.get_token_data().await?;
set_chatgpt_token_data(token_data);
}
Ok(())
}

View File

@@ -10,8 +10,13 @@ use tokio::process::Command;
async fn create_temp_git_repo() -> anyhow::Result<TempDir> {
let temp_dir = TempDir::new()?;
let repo_path = temp_dir.path();
let envs = vec![
("GIT_CONFIG_GLOBAL", "/dev/null"),
("GIT_CONFIG_NOSYSTEM", "1"),
];
let output = Command::new("git")
.envs(envs.clone())
.args(["init"])
.current_dir(repo_path)
.output()
@@ -25,12 +30,14 @@ async fn create_temp_git_repo() -> anyhow::Result<TempDir> {
}
Command::new("git")
.envs(envs.clone())
.args(["config", "user.email", "test@example.com"])
.current_dir(repo_path)
.output()
.await?;
Command::new("git")
.envs(envs.clone())
.args(["config", "user.name", "Test User"])
.current_dir(repo_path)
.output()
@@ -39,12 +46,14 @@ async fn create_temp_git_repo() -> anyhow::Result<TempDir> {
std::fs::write(repo_path.join("README.md"), "# Test Repo\n")?;
Command::new("git")
.envs(envs.clone())
.args(["add", "README.md"])
.current_dir(repo_path)
.output()
.await?;
let output = Command::new("git")
.envs(envs.clone())
.args(["commit", "-m", "Initial commit"])
.current_dir(repo_path)
.output()
@@ -78,17 +87,7 @@ async fn test_apply_command_creates_fibonacci_file() {
.await
.expect("Failed to load fixture");
let original_dir = std::env::current_dir().expect("Failed to get current dir");
std::env::set_current_dir(repo_path).expect("Failed to change directory");
struct DirGuard(std::path::PathBuf);
impl Drop for DirGuard {
fn drop(&mut self) {
let _ = std::env::set_current_dir(&self.0);
}
}
let _guard = DirGuard(original_dir);
apply_diff_from_task(task_response)
apply_diff_from_task(task_response, Some(repo_path.to_path_buf()))
.await
.expect("Failed to apply diff from task");
@@ -173,7 +172,7 @@ console.log(fib(10));
.await
.expect("Failed to load fixture");
let apply_result = apply_diff_from_task(task_response).await;
let apply_result = apply_diff_from_task(task_response, Some(repo_path.to_path_buf())).await;
assert!(
apply_result.is_err(),

View File

@@ -1,7 +1,7 @@
[package]
edition = "2024"
name = "codex-cli"
version = { workspace = true }
edition = "2024"
[[bin]]
name = "codex"
@@ -18,12 +18,12 @@ workspace = true
anyhow = "1"
clap = { version = "4", features = ["derive"] }
clap_complete = "4"
codex-arg0 = { path = "../arg0" }
codex-chatgpt = { path = "../chatgpt" }
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
codex-core = { path = "../core" }
codex-exec = { path = "../exec" }
codex-login = { path = "../login" }
codex-linux-sandbox = { path = "../linux-sandbox" }
codex-mcp-server = { path = "../mcp-server" }
codex-tui = { path = "../tui" }
serde_json = "1"

View File

@@ -0,0 +1,53 @@
//! Print the Fibonacci sequence.
//!
//! Usage:
//! cargo run -p codex-cli --example fibonacci -- [COUNT]
//!
//! If COUNT is omitted, the first 10 numbers are printed.
use std::env;
use std::process;
fn fibonacci(count: usize) -> Vec<u128> {
let mut seq = Vec::with_capacity(count);
if count == 0 {
return seq;
}
// Start with 0, 1
let mut a: u128 = 0;
let mut b: u128 = 1;
for _ in 0..count {
seq.push(a);
let next = a.saturating_add(b);
a = b;
b = next;
}
seq
}
fn parse_count_arg() -> Result<usize, String> {
let mut args = env::args().skip(1);
match args.next() {
None => Ok(10), // default
Some(s) => s
.parse::<usize>()
.map_err(|_| format!("Invalid COUNT: '{}' (expected a non-negative integer)", s)),
}
}
fn main() {
let count = match parse_count_arg() {
Ok(n) => n,
Err(e) => {
eprintln!(
"{}\nUsage: cargo run -p codex-cli --example fibonacci -- [COUNT]",
e
);
process::exit(2);
}
};
for n in fibonacci(count) {
println!("{}", n);
}
}

View File

@@ -4,10 +4,10 @@ use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::config_types::SandboxMode;
use codex_core::exec::StdioPolicy;
use codex_core::exec::spawn_command_under_linux_sandbox;
use codex_core::exec::spawn_command_under_seatbelt;
use codex_core::exec_env::create_env;
use codex_core::seatbelt::spawn_command_under_seatbelt;
use codex_core::spawn::StdioPolicy;
use crate::LandlockCommand;
use crate::SeatbeltCommand;

View File

@@ -1,25 +1,17 @@
use std::env;
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_login::AuthMode;
use codex_login::OPENAI_API_KEY_ENV_VAR;
use codex_login::load_auth;
use codex_login::login_with_api_key;
use codex_login::login_with_chatgpt;
use codex_login::logout;
pub async fn run_login_with_chatgpt(cli_config_overrides: CliConfigOverrides) -> ! {
let cli_overrides = match cli_config_overrides.parse_overrides() {
Ok(v) => v,
Err(e) => {
eprintln!("Error parsing -c overrides: {e}");
std::process::exit(1);
}
};
let config_overrides = ConfigOverrides::default();
let config = match Config::load_with_cli_overrides(cli_overrides, config_overrides) {
Ok(config) => config,
Err(e) => {
eprintln!("Error loading configuration: {e}");
std::process::exit(1);
}
};
let config = load_config_or_exit(cli_config_overrides);
let capture_output = false;
match login_with_chatgpt(&config.codex_home, capture_output).await {
@@ -33,3 +25,122 @@ pub async fn run_login_with_chatgpt(cli_config_overrides: CliConfigOverrides) ->
}
}
}
pub async fn run_login_with_api_key(
cli_config_overrides: CliConfigOverrides,
api_key: String,
) -> ! {
let config = load_config_or_exit(cli_config_overrides);
match login_with_api_key(&config.codex_home, &api_key) {
Ok(_) => {
eprintln!("Successfully logged in");
std::process::exit(0);
}
Err(e) => {
eprintln!("Error logging in: {e}");
std::process::exit(1);
}
}
}
pub async fn run_login_status(cli_config_overrides: CliConfigOverrides) -> ! {
let config = load_config_or_exit(cli_config_overrides);
match load_auth(&config.codex_home, true) {
Ok(Some(auth)) => match auth.mode {
AuthMode::ApiKey => {
if let Some(api_key) = auth.api_key.as_deref() {
eprintln!("Logged in using an API key - {}", safe_format_key(api_key));
if let Ok(env_api_key) = env::var(OPENAI_API_KEY_ENV_VAR) {
if env_api_key == api_key {
eprintln!(
" API loaded from OPENAI_API_KEY environment variable or .env file"
);
}
}
} else {
eprintln!("Logged in using an API key");
}
std::process::exit(0);
}
AuthMode::ChatGPT => {
eprintln!("Logged in using ChatGPT");
std::process::exit(0);
}
},
Ok(None) => {
eprintln!("Not logged in");
std::process::exit(1);
}
Err(e) => {
eprintln!("Error checking login status: {e}");
std::process::exit(1);
}
}
}
pub async fn run_logout(cli_config_overrides: CliConfigOverrides) -> ! {
let config = load_config_or_exit(cli_config_overrides);
match logout(&config.codex_home) {
Ok(true) => {
eprintln!("Successfully logged out");
std::process::exit(0);
}
Ok(false) => {
eprintln!("Not logged in");
std::process::exit(0);
}
Err(e) => {
eprintln!("Error logging out: {e}");
std::process::exit(1);
}
}
}
fn load_config_or_exit(cli_config_overrides: CliConfigOverrides) -> Config {
let cli_overrides = match cli_config_overrides.parse_overrides() {
Ok(v) => v,
Err(e) => {
eprintln!("Error parsing -c overrides: {e}");
std::process::exit(1);
}
};
let config_overrides = ConfigOverrides::default();
match Config::load_with_cli_overrides(cli_overrides, config_overrides) {
Ok(config) => config,
Err(e) => {
eprintln!("Error loading configuration: {e}");
std::process::exit(1);
}
}
}
fn safe_format_key(key: &str) -> String {
if key.len() <= 13 {
return "***".to_string();
}
let prefix = &key[..8];
let suffix = &key[key.len() - 5..];
format!("{prefix}***{suffix}")
}
#[cfg(test)]
mod tests {
use super::safe_format_key;
#[test]
fn formats_long_key() {
let key = "sk-proj-1234567890ABCDE";
assert_eq!(safe_format_key(key), "sk-proj-***ABCDE");
}
#[test]
fn short_key_returns_stars() {
let key = "sk-proj-12345";
assert_eq!(safe_format_key(key), "***");
}
}

View File

@@ -2,11 +2,15 @@ use clap::CommandFactory;
use clap::Parser;
use clap_complete::Shell;
use clap_complete::generate;
use codex_arg0::arg0_dispatch_or_else;
use codex_chatgpt::apply_command::ApplyCommand;
use codex_chatgpt::apply_command::run_apply_command;
use codex_cli::LandlockCommand;
use codex_cli::SeatbeltCommand;
use codex_cli::login::run_login_status;
use codex_cli::login::run_login_with_api_key;
use codex_cli::login::run_login_with_chatgpt;
use codex_cli::login::run_logout;
use codex_cli::proto;
use codex_common::CliConfigOverrides;
use codex_exec::Cli as ExecCli;
@@ -42,9 +46,12 @@ enum Subcommand {
#[clap(visible_alias = "e")]
Exec(ExecCli),
/// Login with ChatGPT.
/// Manage login.
Login(LoginCommand),
/// Remove stored authentication credentials.
Logout(LogoutCommand),
/// Experimental: run Codex as an MCP server.
Mcp,
@@ -89,10 +96,28 @@ enum DebugCommand {
struct LoginCommand {
#[clap(skip)]
config_overrides: CliConfigOverrides,
#[arg(long = "api-key", value_name = "API_KEY")]
api_key: Option<String>,
#[command(subcommand)]
action: Option<LoginSubcommand>,
}
#[derive(Debug, clap::Subcommand)]
enum LoginSubcommand {
/// Show login status.
Status,
}
#[derive(Debug, Parser)]
struct LogoutCommand {
#[clap(skip)]
config_overrides: CliConfigOverrides,
}
fn main() -> anyhow::Result<()> {
codex_linux_sandbox::run_with_sandbox(|codex_linux_sandbox_exe| async move {
arg0_dispatch_or_else(|codex_linux_sandbox_exe| async move {
cli_main(codex_linux_sandbox_exe).await?;
Ok(())
})
@@ -105,7 +130,10 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
None => {
let mut tui_cli = cli.interactive;
prepend_config_flags(&mut tui_cli.config_overrides, cli.config_overrides);
codex_tui::run_main(tui_cli, codex_linux_sandbox_exe)?;
let usage = codex_tui::run_main(tui_cli, codex_linux_sandbox_exe).await?;
if !usage.is_zero() {
println!("{}", codex_core::protocol::FinalOutput::from(usage));
}
}
Some(Subcommand::Exec(mut exec_cli)) => {
prepend_config_flags(&mut exec_cli.config_overrides, cli.config_overrides);
@@ -116,7 +144,22 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
}
Some(Subcommand::Login(mut login_cli)) => {
prepend_config_flags(&mut login_cli.config_overrides, cli.config_overrides);
run_login_with_chatgpt(login_cli.config_overrides).await;
match login_cli.action {
Some(LoginSubcommand::Status) => {
run_login_status(login_cli.config_overrides).await;
}
None => {
if let Some(api_key) = login_cli.api_key {
run_login_with_api_key(login_cli.config_overrides, api_key).await;
} else {
run_login_with_chatgpt(login_cli.config_overrides).await;
}
}
}
}
Some(Subcommand::Logout(mut logout_cli)) => {
prepend_config_flags(&mut logout_cli.config_overrides, cli.config_overrides);
run_logout(logout_cli.config_overrides).await;
}
Some(Subcommand::Proto(mut proto_cli)) => {
prepend_config_flags(&mut proto_cli.config_overrides, cli.config_overrides);
@@ -145,7 +188,7 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
},
Some(Subcommand::Apply(mut apply_cli)) => {
prepend_config_flags(&mut apply_cli.config_overrides, cli.config_overrides);
run_apply_command(apply_cli).await?;
run_apply_command(apply_cli, None).await?;
}
}

View File

@@ -4,10 +4,12 @@ use std::sync::Arc;
use clap::Parser;
use codex_common::CliConfigOverrides;
use codex_core::Codex;
use codex_core::CodexSpawnOk;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::protocol::Submission;
use codex_core::util::notify_on_sigint;
use codex_login::load_auth;
use tokio::io::AsyncBufReadExt;
use tokio::io::BufReader;
use tracing::error;
@@ -34,8 +36,9 @@ pub async fn run_main(opts: ProtoCli) -> anyhow::Result<()> {
.map_err(anyhow::Error::msg)?;
let config = Config::load_with_cli_overrides(overrides_vec, ConfigOverrides::default())?;
let auth = load_auth(&config.codex_home, true)?;
let ctrl_c = notify_on_sigint();
let (codex, _init_id) = Codex::spawn(config, ctrl_c.clone()).await?;
let CodexSpawnOk { codex, .. } = Codex::spawn(config, auth, ctrl_c.clone()).await?;
let codex = Arc::new(codex);
// Task that reads JSON lines from stdin and forwards to Submission Queue

View File

@@ -1,7 +1,7 @@
[package]
edition = "2024"
name = "codex-common"
version = { workspace = true }
edition = "2024"
[lints]
workspace = true
@@ -9,11 +9,11 @@ workspace = true
[dependencies]
clap = { version = "4", features = ["derive", "wrap_help"], optional = true }
codex-core = { path = "../core" }
toml = { version = "0.9", optional = true }
serde = { version = "1", optional = true }
toml = { version = "0.9", optional = true }
[features]
# Separate feature so that `clap` is not a mandatory dependency.
cli = ["clap", "toml", "serde"]
cli = ["clap", "serde", "toml"]
elapsed = []
sandbox_summary = []

View File

@@ -18,6 +18,9 @@ pub enum ApprovalModeCliArg {
/// will escalate to the user to ask for un-sandboxed execution.
OnFailure,
/// The model decides when to ask the user for approval.
OnRequest,
/// Never ask for user approval
/// Execution failures are immediately returned to the model.
Never,
@@ -28,6 +31,7 @@ impl From<ApprovalModeCliArg> for AskForApproval {
match value {
ApprovalModeCliArg::Untrusted => AskForApproval::UnlessTrusted,
ApprovalModeCliArg::OnFailure => AskForApproval::OnFailure,
ApprovalModeCliArg::OnRequest => AskForApproval::OnRequest,
ApprovalModeCliArg::Never => AskForApproval::Never,
}
}

View File

@@ -64,7 +64,11 @@ impl CliConfigOverrides {
// `-c model=o3` without the quotes.
let value: Value = match parse_toml_value(value_str) {
Ok(v) => v,
Err(_) => Value::String(value_str.to_string()),
Err(_) => {
// Strip leading/trailing quotes if present
let trimmed = value_str.trim().trim_matches(|c| c == '"' || c == '\'');
Value::String(trimmed.to_string())
}
};
Ok((key.to_string(), value))

View File

@@ -0,0 +1,29 @@
use codex_core::WireApi;
use codex_core::config::Config;
use crate::sandbox_summary::summarize_sandbox_policy;
/// Build a list of key/value pairs summarizing the effective configuration.
pub fn create_config_summary_entries(config: &Config) -> Vec<(&'static str, String)> {
let mut entries = vec![
("workdir", config.cwd.display().to_string()),
("model", config.model.clone()),
("provider", config.model_provider_id.clone()),
("approval", config.approval_policy.to_string()),
("sandbox", summarize_sandbox_policy(&config.sandbox_policy)),
];
if config.model_provider.wire_api == WireApi::Responses
&& config.model_family.supports_reasoning_summaries
{
entries.push((
"reasoning effort",
config.model_reasoning_effort.to_string(),
));
entries.push((
"reasoning summaries",
config.model_reasoning_summary.to_string(),
));
}
entries
}

View File

@@ -0,0 +1,177 @@
/// Simple case-insensitive subsequence matcher used for fuzzy filtering.
///
/// Returns the indices (character positions) of the matched characters in the
/// ORIGINAL `haystack` string and a score where smaller is better.
///
/// Unicode correctness: we perform the match on a lowercased copy of the
/// haystack and needle but maintain a mapping from each character in the
/// lowercased haystack back to the original character index in `haystack`.
/// This ensures the returned indices can be safely used with
/// `str::chars().enumerate()` consumers for highlighting, even when
/// lowercasing expands certain characters (e.g., ß → ss, İ → i̇).
pub fn fuzzy_match(haystack: &str, needle: &str) -> Option<(Vec<usize>, i32)> {
if needle.is_empty() {
return Some((Vec::new(), i32::MAX));
}
let mut lowered_chars: Vec<char> = Vec::new();
let mut lowered_to_orig_char_idx: Vec<usize> = Vec::new();
for (orig_idx, ch) in haystack.chars().enumerate() {
for lc in ch.to_lowercase() {
lowered_chars.push(lc);
lowered_to_orig_char_idx.push(orig_idx);
}
}
let lowered_needle: Vec<char> = needle.to_lowercase().chars().collect();
let mut result_orig_indices: Vec<usize> = Vec::with_capacity(lowered_needle.len());
let mut last_lower_pos: Option<usize> = None;
let mut cur = 0usize;
for &nc in lowered_needle.iter() {
let mut found_at: Option<usize> = None;
while cur < lowered_chars.len() {
if lowered_chars[cur] == nc {
found_at = Some(cur);
cur += 1;
break;
}
cur += 1;
}
let pos = found_at?;
result_orig_indices.push(lowered_to_orig_char_idx[pos]);
last_lower_pos = Some(pos);
}
let first_lower_pos = if result_orig_indices.is_empty() {
0usize
} else {
let target_orig = result_orig_indices[0];
lowered_to_orig_char_idx
.iter()
.position(|&oi| oi == target_orig)
.unwrap_or(0)
};
// last defaults to first for single-hit; score = extra span between first/last hit
// minus needle len (≥0).
// Strongly reward prefix matches by subtracting 100 when the first hit is at index 0.
let last_lower_pos = last_lower_pos.unwrap_or(first_lower_pos);
let window =
(last_lower_pos as i32 - first_lower_pos as i32 + 1) - (lowered_needle.len() as i32);
let mut score = window.max(0);
if first_lower_pos == 0 {
score -= 100;
}
result_orig_indices.sort_unstable();
result_orig_indices.dedup();
Some((result_orig_indices, score))
}
/// Convenience wrapper to get only the indices for a fuzzy match.
pub fn fuzzy_indices(haystack: &str, needle: &str) -> Option<Vec<usize>> {
fuzzy_match(haystack, needle).map(|(mut idx, _)| {
idx.sort_unstable();
idx.dedup();
idx
})
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn ascii_basic_indices() {
let (idx, score) = match fuzzy_match("hello", "hl") {
Some(v) => v,
None => panic!("expected a match"),
};
assert_eq!(idx, vec![0, 2]);
// 'h' at 0, 'l' at 2 -> window 1; start-of-string bonus applies (-100)
assert_eq!(score, -99);
}
#[test]
fn unicode_dotted_i_istanbul_highlighting() {
let (idx, score) = match fuzzy_match("İstanbul", "is") {
Some(v) => v,
None => panic!("expected a match"),
};
assert_eq!(idx, vec![0, 1]);
// Matches at lowered positions 0 and 2 -> window 1; start-of-string bonus applies
assert_eq!(score, -99);
}
#[test]
fn unicode_german_sharp_s_casefold() {
assert!(fuzzy_match("straße", "strasse").is_none());
}
#[test]
fn prefer_contiguous_match_over_spread() {
let (_idx_a, score_a) = match fuzzy_match("abc", "abc") {
Some(v) => v,
None => panic!("expected a match"),
};
let (_idx_b, score_b) = match fuzzy_match("a-b-c", "abc") {
Some(v) => v,
None => panic!("expected a match"),
};
// Contiguous window -> 0; start-of-string bonus -> -100
assert_eq!(score_a, -100);
// Spread over 5 chars for 3-letter needle -> window 2; with bonus -> -98
assert_eq!(score_b, -98);
assert!(score_a < score_b);
}
#[test]
fn start_of_string_bonus_applies() {
let (_idx_a, score_a) = match fuzzy_match("file_name", "file") {
Some(v) => v,
None => panic!("expected a match"),
};
let (_idx_b, score_b) = match fuzzy_match("my_file_name", "file") {
Some(v) => v,
None => panic!("expected a match"),
};
// Start-of-string contiguous -> window 0; bonus -> -100
assert_eq!(score_a, -100);
// Non-prefix contiguous -> window 0; no bonus -> 0
assert_eq!(score_b, 0);
assert!(score_a < score_b);
}
#[test]
fn empty_needle_matches_with_max_score_and_no_indices() {
let (idx, score) = match fuzzy_match("anything", "") {
Some(v) => v,
None => panic!("empty needle should match"),
};
assert!(idx.is_empty());
assert_eq!(score, i32::MAX);
}
#[test]
fn case_insensitive_matching_basic() {
let (idx, score) = match fuzzy_match("FooBar", "foO") {
Some(v) => v,
None => panic!("expected a match"),
};
assert_eq!(idx, vec![0, 1, 2]);
// Contiguous prefix match (case-insensitive) -> window 0 with bonus
assert_eq!(score, -100);
}
#[test]
fn indices_are_deduped_for_multichar_lowercase_expansion() {
let needle = "\u{0069}\u{0307}"; // "i" + combining dot above
let (idx, score) = match fuzzy_match("İ", needle) {
Some(v) => v,
None => panic!("expected a match"),
};
assert_eq!(idx, vec![0]);
// Lowercasing 'İ' expands to two chars; contiguous prefix -> window 0 with bonus
assert_eq!(score, -100);
}
}

View File

@@ -23,3 +23,9 @@ mod sandbox_summary;
#[cfg(feature = "sandbox_summary")]
pub use sandbox_summary::summarize_sandbox_policy;
mod config_summary;
pub use config_summary::create_config_summary_entries;
// Shared fuzzy matcher (used by TUI selection popups and other UI filtering)
pub mod fuzzy_match;

View File

@@ -7,18 +7,26 @@ pub fn summarize_sandbox_policy(sandbox_policy: &SandboxPolicy) -> String {
SandboxPolicy::WorkspaceWrite {
writable_roots,
network_access,
exclude_tmpdir_env_var,
exclude_slash_tmp,
} => {
let mut summary = "workspace-write".to_string();
if !writable_roots.is_empty() {
summary.push_str(&format!(
" [{}]",
writable_roots
.iter()
.map(|p| p.to_string_lossy())
.collect::<Vec<_>>()
.join(", ")
));
let mut writable_entries = Vec::<String>::new();
writable_entries.push("workdir".to_string());
if !*exclude_slash_tmp {
writable_entries.push("/tmp".to_string());
}
if !*exclude_tmpdir_env_var {
writable_entries.push("$TMPDIR".to_string());
}
writable_entries.extend(
writable_roots
.iter()
.map(|p| p.to_string_lossy().to_string()),
);
summary.push_str(&format!(" [{}]", writable_entries.join(", ")));
if *network_access {
summary.push_str(" (network access enabled)");
}

View File

@@ -92,6 +92,35 @@ http_headers = { "X-Example-Header" = "example-value" }
env_http_headers = { "X-Example-Features": "EXAMPLE_FEATURES" }
```
### Per-provider network tuning
The following optional settings control retry behaviour and streaming idle timeouts **per model provider**. They must be specified inside the corresponding `[model_providers.<id>]` block in `config.toml`. (Older releases accepted toplevel keys; those are now ignored.)
Example:
```toml
[model_providers.openai]
name = "OpenAI"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
# network tuning overrides (all optional; falls back to builtin defaults)
request_max_retries = 4 # retry failed HTTP requests
stream_max_retries = 10 # retry dropped SSE streams
stream_idle_timeout_ms = 300000 # 5m idle timeout
```
#### request_max_retries
How many times Codex will retry a failed HTTP request to the model provider. Defaults to `4`.
#### stream_max_retries
Number of times Codex will attempt to reconnect when a streaming response is interrupted. Defaults to `10`.
#### stream_idle_timeout_ms
How long Codex will wait for activity on a streaming response before treating the connection as lost. Defaults to `300_000` (5 minutes).
## model_provider
Identifies which provider to use from the `model_providers` map. Defaults to `"openai"`. You can override the `base_url` for the built-in `openai` provider via the `OPENAI_BASE_URL` environment variable.
@@ -119,12 +148,20 @@ Determines when the user should be prompted to approve whether Codex can execute
approval_policy = "untrusted"
```
If you want to be notified whenever a command fails, use "on-failure":
```toml
# If the command fails when run in the sandbox, Codex asks for permission to
# retry the command outside the sandbox.
approval_policy = "on-failure"
```
If you want the model to run until it decides that it needs to ask you for escalated permissions, use "on-request":
```toml
# The model decides when to escalate
approval_policy = "on-request"
```
Alternatively, you can have the model run until it is done, and never ask to run a command with escalated permissions:
```toml
# User is never prompted: if the command fails, Codex will automatically try
# something out. Note the `exec` subcommand always uses this mode.
@@ -230,15 +267,20 @@ disk, but attempts to write a file or access the network will be blocked.
A more relaxed policy is `workspace-write`. When specified, the current working directory for the Codex task will be writable (as well as `$TMPDIR` on macOS). Note that the CLI defaults to using the directory where it was spawned as `cwd`, though this can be overridden using `--cwd/-C`.
On macOS (and soon Linux), all writable roots (including `cwd`) that contain a `.git/` folder _as an immediate child_ will configure the `.git/` folder to be read-only while the rest of the Git repository will be writable. This means that commands like `git commit` will fail, by default (as it entails writing to `.git/`), and will require Codex to ask for permission.
```toml
# same as `--sandbox workspace-write`
sandbox_mode = "workspace-write"
# Extra settings that only apply when `sandbox = "workspace-write"`.
[sandbox_workspace_write]
# By default, only the cwd for the Codex session will be writable (and $TMPDIR
# on macOS), but you can specify additional writable folders in this array.
writable_roots = ["/tmp"]
# By default, the cwd for the Codex session will be writable as well as $TMPDIR
# (if set) and /tmp (if it exists). Setting the respective options to `true`
# will override those defaults.
exclude_tmpdir_env_var = false
exclude_slash_tmp = false
# Allow the command being run inside the sandbox to make outbound network
# requests. Disabled by default.
network_access = false
@@ -297,12 +339,11 @@ disable_response_storage = true
## shell_environment_policy
Codex spawns subprocesses (e.g. when executing a `local_shell` tool-call suggested by the assistant). By default it passes **only a minimal core subset** of your environment to those subprocesses to avoid leaking credentials. You can tune this behavior via the **`shell_environment_policy`** block in
`config.toml`:
Codex spawns subprocesses (e.g. when executing a `local_shell` tool-call suggested by the assistant). By default it now passes **your full environment** to those subprocesses. You can tune this behavior via the **`shell_environment_policy`** block in `config.toml`:
```toml
[shell_environment_policy]
# inherit can be "core" (default), "all", or "none"
# inherit can be "all" (default), "core", or "none"
inherit = "core"
# set to true to *skip* the filter for `"*KEY*"` and `"*TOKEN*"`
ignore_default_excludes = false
@@ -316,7 +357,7 @@ include_only = ["PATH", "HOME"]
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `inherit` | string | `all` | Starting template for the environment:<br>`all` (clone full parent env), `core` (`HOME`, `PATH`, `USER`, …), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions always win over inherited values. |
@@ -444,7 +485,7 @@ Currently, `"vscode"` is the default, though Codex does not verify VS Code is in
## hide_agent_reasoning
Codex intermittently emits "reasoning" events that show the models internal "thinking" before it produces a final answer. Some users may find these events distracting, especially in CI logs or minimal terminal output.
Codex intermittently emits "reasoning" events that show the model's internal "thinking" before it produces a final answer. Some users may find these events distracting, especially in CI logs or minimal terminal output.
Setting `hide_agent_reasoning` to `true` suppresses these events in **both** the TUI as well as the headless `exec` sub-command:
@@ -452,6 +493,19 @@ Setting `hide_agent_reasoning` to `true` suppresses these events in **both** the
hide_agent_reasoning = true # defaults to false
```
## show_raw_agent_reasoning
Surfaces the models raw chain-of-thought ("raw reasoning content") when available.
Notes:
- Only takes effect if the selected model/provider actually emits raw reasoning content. Many models do not. When unsupported, this option has no visible effect.
- Raw reasoning may include intermediate thoughts or sensitive context. Enable only if acceptable for your workflow.
Example:
```toml
show_raw_agent_reasoning = true # defaults to false
```
## model_context_window
The size of the context window for the model, in tokens.
@@ -472,14 +526,5 @@ Options that are specific to the TUI.
```toml
[tui]
# This will make it so that Codex does not try to process mouse events, which
# means your Terminal's native drag-to-text to text selection and copy/paste
# should work. The tradeoff is that Codex will not receive any mouse events, so
# it will not be possible to use the mouse to scroll conversation history.
#
# Note that most terminals support holding down a modifier key when using the
# mouse to support text selection. For example, even if Codex mouse capture is
# enabled (i.e., this is set to `false`), you can still hold down alt while
# dragging the mouse to select text.
disable_mouse_capture = true # defaults to `false`
# More to come here
```

View File

@@ -1,7 +1,7 @@
[package]
edition = "2024"
name = "codex-core"
version = { workspace = true }
edition = "2024"
[lib]
name = "codex_core"
@@ -15,20 +15,28 @@ anyhow = "1"
async-channel = "2.3.1"
base64 = "0.22"
bytes = "1.10.1"
chrono = { version = "0.4", features = ["serde"] }
codex-apply-patch = { path = "../apply-patch" }
codex-login = { path = "../login" }
codex-mcp-client = { path = "../mcp-client" }
dirs = "6"
env-flags = "0.1.1"
eventsource-stream = "0.2.3"
fs2 = "0.4.3"
futures = "0.3"
libc = "0.2.174"
mcp-types = { path = "../mcp-types" }
mime_guess = "2.0"
rand = "0.9"
reqwest = { version = "0.12", features = ["json", "stream"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
strum_macros = "0.27.1"
serde_bytes = "0.11"
sha1 = "0.10.6"
shlex = "1.3.0"
similar = "2.7.0"
strum_macros = "0.27.2"
tempfile = "3"
thiserror = "2.0.12"
time = { version = "0.3", features = ["formatting", "local-offset", "macros"] }
tokio = { version = "1", features = [
@@ -39,13 +47,16 @@ tokio = { version = "1", features = [
"signal",
] }
tokio-util = "0.7.14"
toml = "0.9.1"
toml = "0.9.4"
toml_edit = "0.23.3"
tracing = { version = "0.1.41", features = ["log"] }
tree-sitter = "0.25.3"
tree-sitter = "0.25.8"
tree-sitter-bash = "0.25.0"
uuid = { version = "1", features = ["serde", "v4"] }
whoami = "1.6.0"
wildmatch = "2.4.0"
[target.'cfg(target_os = "linux")'.dependencies]
landlock = "0.4.1"
seccompiler = "0.5.0"
@@ -60,9 +71,11 @@ openssl-sys = { version = "*", features = ["vendored"] }
[dev-dependencies]
assert_cmd = "2"
core_test_support = { path = "tests/common" }
maplit = "1.0.2"
predicates = "3"
pretty_assertions = "1.4.1"
tempfile = "3"
tokio-test = "0.4"
walkdir = "2.5.0"
wiremock = "0.6"

View File

@@ -2,9 +2,18 @@
This crate implements the business logic for Codex. It is designed to be used by the various Codex UIs written in Rust.
Though for non-Rust UIs, we are also working to define a _protocol_ for talking to Codex. See:
## Dependencies
- [Specification](../docs/protocol_v1.md)
- [Rust types](./src/protocol.rs)
Note that `codex-core` makes some assumptions about certain helper utilities being available in the environment. Currently, this
You can use the `proto` subcommand using the executable in the [`cli` crate](../cli) to speak the protocol using newline-delimited-JSON over stdin/stdout.
### macOS
Expects `/usr/bin/sandbox-exec` to be present.
### Linux
Expects the binary containing `codex-core` to run the equivalent of `codex debug landlock` when `arg0` is `codex-linux-sandbox`. See the `codex-arg0` crate for details.
### All Platforms
Expects the binary containing `codex-core` to simulate the virtual `apply_patch` CLI when `arg1` is `--codex-run-as-apply-patch`. See the `codex-arg0` crate for details.

View File

@@ -1,42 +1,260 @@
Please resolve the user's task by editing and testing the code files in your current code execution session.
You are a deployed coding agent.
Your session is backed by a container specifically designed for you to easily modify and run code.
The repo(s) are already cloned in your working directory, and you must fully solve the problem for your answer to be considered correct.
You are a coding agent running in the Codex CLI, a terminal-based coding assistant. Codex CLI is an open source project led by OpenAI. You are expected to be precise, safe, and helpful.
You MUST adhere to the following criteria when executing the task:
Your capabilities:
- Receive user prompts and other context provided by the harness, such as files in the workspace.
- Communicate with the user by streaming thinking & responses, and by making & updating plans.
- Emit function calls to run terminal commands and apply patches. Depending on how this specific run is configured, you can request that these function calls be escalated to the user for approval before running. More on this in the "Sandbox and approvals" section.
Within this context, Codex refers to the open-source agentic coding interface (not the old Codex language model built by OpenAI).
# How you work
## Personality
Your default personality and tone is concise, direct, and friendly. You communicate efficiently, always keeping the user clearly informed about ongoing actions without unnecessary detail. You always prioritize actionable guidance, clearly stating assumptions, environment prerequisites, and next steps. Unless explicitly asked, you avoid excessively verbose explanations about your work.
## Responsiveness
### Preamble messages
Before making tool calls, send a brief preamble to the user explaining what youre about to do. When sending preamble messages, follow these principles and examples:
- **Logically group related actions**: if youre about to run several related commands, describe them together in one preamble rather than sending a separate note for each.
- **Keep it concise**: be no more than 1-2 sentences (812 words for quick updates).
- **Build on prior context**: if this is not your first tool call, use the preamble message to connect the dots with whats been done so far and create a sense of momentum and clarity for the user to understand your next actions.
- **Keep your tone light, friendly and curious**: add small touches of personality in preambles feel collaborative and engaging.
**Examples:**
- “Ive explored the repo; now checking the API route definitions.”
- “Next, Ill patch the config and update the related tests.”
- “Im about to scaffold the CLI commands and helper functions.”
- “Ok cool, so Ive wrapped my head around the repo. Now digging into the API routes.”
- “Configs looking tidy. Next up is patching helpers to keep things in sync.”
- “Finished poking at the DB gateway. I will now chase down error handling.”
- “Alright, build pipeline order is interesting. Checking how it reports failures.”
- “Spotted a clever caching util; now hunting where it gets used.”
**Avoiding a preamble for every trivial read (e.g., `cat` a single file) unless its part of a larger grouped action.
- Jumping straight into tool calls without explaining whats about to happen.
- Writing overly long or speculative preambles — focus on immediate, tangible next steps.
## Planning
You have access to an `update_plan` tool which tracks steps and progress and renders them to the user. Using the tool helps demonstrate that you've understood the task and convey how you're approaching it. Plans can help to make complex, ambiguous, or multi-phase work clearer and more collaborative for the user. A good plan should break the task into meaningful, logically ordered steps that are easy to verify as you go. Note that plans are not for padding out simple work with filler steps or stating the obvious. Do not repeat the full contents of the plan after an `update_plan` call — the harness already displays it. Instead, summarize the change made and highlight any important context or next step.
Use a plan when:
- The task is non-trivial and will require multiple actions over a long time horizon.
- There are logical phases or dependencies where sequencing matters.
- The work has ambiguity that benefits from outlining high-level goals.
- You want intermediate checkpoints for feedback and validation.
- When the user asked you to do more than one thing in a single prompt
- The user has asked you to use the plan tool (aka "TODOs")
- You generate additional steps while working, and plan to do them before yielding to the user
Skip a plan when:
- The task is simple and direct.
- Breaking it down would only produce literal or trivial steps.
Planning steps are called "steps" in the tool, but really they're more like tasks or TODOs. As such they should be very concise descriptions of non-obvious work that an engineer might do like "Write the API spec", then "Update the backend", then "Implement the frontend". On the other hand, it's obvious that you'll usually have to "Explore the codebase" or "Implement the changes", so those are not worth tracking in your plan.
It may be the case that you complete all steps in your plan after a single pass of implementation. If this is the case, you can simply mark all the planned steps as completed. The content of your plan should not involve doing anything that you aren't capable of doing (i.e. don't try to test things that you can't test). Do not use plans for simple or single-step queries that you can just do or answer immediately.
### Examples
**High-quality plans**
Example 1:
1. Add CLI entry with file args
2. Parse Markdown via CommonMark library
3. Apply semantic HTML template
4. Handle code blocks, images, links
5. Add error handling for invalid files
Example 2:
1. Define CSS variables for colors
2. Add toggle with localStorage state
3. Refactor components to use variables
4. Verify all views for readability
5. Add smooth theme-change transition
Example 3:
1. Set up Node.js + WebSocket server
2. Add join/leave broadcast events
3. Implement messaging with timestamps
4. Add usernames + mention highlighting
5. Persist messages in lightweight DB
6. Add typing indicators + unread count
**Low-quality plans**
Example 1:
1. Create CLI tool
2. Add Markdown parser
3. Convert to HTML
Example 2:
1. Add dark mode toggle
2. Save preference
3. Make styles look good
Example 3:
1. Create single-file HTML game
2. Run quick sanity check
3. Summarize usage instructions
If you need to write a plan, only write high quality plans, not low quality ones.
## Task execution
You are a coding agent. Please keep going until the query is completely resolved, before ending your turn and yielding back to the user. Only terminate your turn when you are sure that the problem is solved. Autonomously resolve the query to the best of your ability, using the tools available to you, before coming back to the user. Do NOT guess or make up an answer.
You MUST adhere to the following criteria when solving queries:
- Working on the repo(s) in the current environment is allowed, even if they are proprietary.
- Analyzing code for vulnerabilities is allowed.
- Showing user code and tool call details is allowed.
- User instructions may overwrite the _CODING GUIDELINES_ section in this developer message.
- Do not use \`ls -R\`, \`find\`, or \`grep\` - these are slow in large repos. Use \`rg\` and \`rg --files\`.
- Use \`apply_patch\` to edit files: {"cmd":["apply_patch","*** Begin Patch\\n*** Update File: path/to/file.py\\n@@ def example():\\n- pass\\n+ return 123\\n*** End Patch"]}
- If completing the user's task requires writing or modifying files:
- Your code and final answer should follow these _CODING GUIDELINES_:
- Fix the problem at the root cause rather than applying surface-level patches, when possible.
- Avoid unneeded complexity in your solution.
- Ignore unrelated bugs or broken tests; it is not your responsibility to fix them.
- Update documentation as necessary.
- Keep changes consistent with the style of the existing codebase. Changes should be minimal and focused on the task.
- Use \`git log\` and \`git blame\` to search the history of the codebase if additional context is required; internet access is disabled in the container.
- NEVER add copyright or license headers unless specifically requested.
- You do not need to \`git commit\` your changes; this will be done automatically for you.
- If there is a .pre-commit-config.yaml, use \`pre-commit run --files ...\` to check that your changes pass the pre- commit checks. However, do not fix pre-existing errors on lines you didn't touch.
- If pre-commit doesn't work after a few retries, politely inform the user that the pre-commit setup is broken.
- Once you finish coding, you must
- Check \`git status\` to sanity check your changes; revert any scratch files or changes.
- Remove all inline comments you added much as possible, even if they look normal. Check using \`git diff\`. Inline comments must be generally avoided, unless active maintainers of the repo, after long careful study of the code and the issue, will still misinterpret the code without the comments.
- Check if you accidentally add copyright or license headers. If so, remove them.
- Try to run pre-commit if it is available.
- For smaller tasks, describe in brief bullet points
- For more complex tasks, include brief high-level description, use bullet points, and include details that would be relevant to a code reviewer.
- If completing the user's task DOES NOT require writing or modifying files (e.g., the user asks a question about the code base):
- Respond in a friendly tune as a remote teammate, who is knowledgeable, capable and eager to help with coding.
- When your task involves writing or modifying files:
- Do NOT tell the user to "save the file" or "copy the code into a file" if you already created or modified the file using \`apply_patch\`. Instead, reference the file as already saved.
- Do NOT show the full contents of large files you have already written, unless the user explicitly asks for them.
- Use the `apply_patch` tool to edit files (NEVER try `applypatch` or `apply-patch`, only `apply_patch`): {"command":["apply_patch","*** Begin Patch\\n*** Update File: path/to/file.py\\n@@ def example():\\n- pass\\n+ return 123\\n*** End Patch"]}
§ `apply-patch` Specification
If completing the user's task requires writing or modifying files, your code and final answer should follow these coding guidelines, though user instructions (i.e. AGENTS.md) may override these guidelines:
- Fix the problem at the root cause rather than applying surface-level patches, when possible.
- Avoid unneeded complexity in your solution.
- Do not attempt to fix unrelated bugs or broken tests. It is not your responsibility to fix them. (You may mention them to the user in your final message though.)
- Update documentation as necessary.
- Keep changes consistent with the style of the existing codebase. Changes should be minimal and focused on the task.
- Use `git log` and `git blame` to search the history of the codebase if additional context is required.
- NEVER add copyright or license headers unless specifically requested.
- Do not waste tokens by re-reading files after calling `apply_patch` on them. The tool call will fail if it didn't work. The same goes for making folders, deleting folders, etc.
- Do not `git commit` your changes or create new git branches unless explicitly requested.
- Do not add inline comments within code unless explicitly requested.
- Do not use one-letter variable names unless explicitly requested.
- NEVER output inline citations like "【F:README.md†L5-L14】" in your outputs. The CLI is not able to render these so they will just be broken in the UI. Instead, if you output valid filepaths, users will be able to click on them to open the files in their editor.
## Testing your work
If the codebase has tests or the ability to build or run, you should use them to verify that your work is complete. Generally, your testing philosophy should be to start as specific as possible to the code you changed so that you can catch issues efficiently, then make your way to broader tests as you build confidence. If there's no test for the code you changed, and if the adjacent patterns in the codebases show that there's a logical place for you to add a test, you may do so. However, do not add tests to codebases with no tests, or where the patterns don't indicate so.
Once you're confident in correctness, use formatting commands to ensure that your code is well formatted. These commands can take time so you should run them on as precise a target as possible. If there are issues you can iterate up to 3 times to get formatting right, but if you still can't manage it's better to save the user time and present them a correct solution where you call out the formatting in your final message. If the codebase does not have a formatter configured, do not add one.
For all of testing, running, building, and formatting, do not attempt to fix unrelated bugs. It is not your responsibility to fix them. (You may mention them to the user in your final message though.)
## Sandbox and approvals
The Codex CLI harness supports several different sandboxing, and approval configurations that the user can choose from.
Filesystem sandboxing prevents you from editing files without user approval. The options are:
- *read-only*: You can only read files.
- *workspace-write*: You can read files. You can write to files in your workspace folder, but not outside it.
- *danger-full-access*: No filesystem sandboxing.
Network sandboxing prevents you from accessing network without approval. Options are
- *ON*
- *OFF*
Approvals are your mechanism to get user consent to perform more privileged actions. Although they introduce friction to the user because your work is paused until the user responds, you should leverage them to accomplish your important work. Do not let these settings or the sandbox deter you from attempting to accomplish the user's task. Approval options are
- *untrusted*: The harness will escalate most commands for user approval, apart from a limited allowlist of safe "read" commands.
- *on-failure*: The harness will allow all commands to run in the sandbox (if enabled), and failures will be escalated to the user for approval to run again without the sandbox.
- *on-request*: Commands will be run in the sandbox by default, and you can specify in your tool call if you want to escalate a command to run without sandboxing. (Note that this mode is not always available. If it is, you'll see parameters for it in the `shell` command description.)
- *never*: This is a non-interactive mode where you may NEVER ask the user for approval to run commands. Instead, you must always persist and work around constraints to solve the task for the user. You MUST do your utmost best to finish the task and validate your work before yielding. If this mode is pared with `danger-full-access`, take advantage of it to deliver the best outcome for the user. Further, in this mode, your default testing philosophy is overridden: Even if you don't see local patterns for testing, you may add tests and scripts to validate your work. Just remove them before yielding.
When you are running with approvals `on-request`, and sandboxing enabled, here are scenarios where you'll need to request approval:
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /tmp)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (For all of these, you should weigh alternative paths that do not require approval.)
Note that when sandboxing is set to read-only, you'll need to request approval for any command that isn't a read.
You will be told what filesystem sandboxing, network sandboxing, and approval mode are active in a developer or user message. If you are not told about this, assume that you are running with workspace-write, network sandboxing ON, and approval on-failure.
## Ambition vs. precision
For tasks that have no prior context (i.e. the user is starting something brand new), you should feel free to be ambitious and demonstrate creativity with your implementation.
If you're operating in an existing codebase, you should make sure you do exactly what the user asks with surgical precision. Treat the surrounding codebase with respect, and don't overstep (i.e. changing filenames or variables unnecessarily). You should balance being sufficiently ambitious and proactive when completing tasks of this nature.
You should use judicious initiative to decide on the right level of detail and complexity to deliver based on the user's needs. This means showing good judgment that you're capable of doing the right extras without gold-plating. This might be demonstrated by high-value, creative touches when scope of the task is vague; while being surgical and targeted when scope is tightly specified.
## Sharing progress updates
For especially longer tasks that you work on (i.e. requiring many tool calls, or a plan with multiple steps), you should provide progress updates back to the user at reasonable intervals. These updates should be structured as a concise sentence or two (no more than 8-10 words long) recapping progress so far in plain language: this update demonstrates your understanding of what needs to be done, progress so far (i.e. files explores, subtasks complete), and where you're going next.
Before doing large chunks of work that may incur latency as experienced by the user (i.e. writing a new file), you should send a concise message to the user with an update indicating what you're about to do to ensure they know what you're spending time on. Don't start editing or writing large files before informing the user what you are doing and why.
The messages you send before tool calls should describe what is immediately about to be done next in very concise language. If there was previous work done, this preamble message should also include a note about the work done so far to bring the user along.
## Presenting your work and final message
Your final message should read naturally, like an update from a concise teammate. For casual conversation, brainstorming tasks, or quick questions from the user, respond in a friendly, conversational tone. You should ask questions, suggest ideas, and adapt to the users style. If you've finished a large amount of work, when describing what you've done to the user, you should follow the final answer formatting guidelines to communicate substantive changes. You don't need to add structured formatting for one-word answers, greetings, or purely conversational exchanges.
You can skip heavy formatting for single, simple actions or confirmations. In these cases, respond in plain sentences with any relevant next step or quick option. Reserve multi-section structured responses for results that need grouping or explanation.
The user is working on the same computer as you, and has access to your work. As such there's no need to show the full contents of large files you have already written unless the user explicitly asks for them. Similarly, if you've created or modified files using `apply_patch`, there's no need to tell users to "save the file" or "copy the code into a file"—just reference the file path.
If there's something that you think you could help with as a logical next step, concisely ask the user if they want you to do so. Good examples of this are running tests, committing changes, or building out the next logical component. If theres something that you couldn't do (even with approval) but that the user might want to do (such as verifying changes by running the app), include those instructions succinctly.
Brevity is very important as a default. You should be very concise (i.e. no more than 10 lines), but can relax this requirement for tasks where additional detail and comprehensiveness is important for the user's understanding.
### Final answer structure and style guidelines
You are producing plain text that will later be styled by the CLI. Follow these rules exactly. Formatting should make results easy to scan, but not feel mechanical. Use judgment to decide how much structure adds value.
**Section Headers**
- Use only when they improve clarity — they are not mandatory for every answer.
- Choose descriptive names that fit the content
- Keep headers short (13 words) and in `**Title Case**`. Always start headers with `**` and end with `**`
- Leave no blank line before the first bullet under a header.
- Section headers should only be used where they genuinely improve scanability; avoid fragmenting the answer.
**Bullets**
- Use `-` followed by a space for every bullet.
- Bold the keyword, then colon + concise description.
- Merge related points when possible; avoid a bullet for every trivial detail.
- Keep bullets to one line unless breaking for clarity is unavoidable.
- Group into short lists (46 bullets) ordered by importance.
- Use consistent keyword phrasing and formatting across sections.
**Monospace**
- Wrap all commands, file paths, env vars, and code identifiers in backticks (`` `...` ``).
- Apply to inline examples and to bullet keywords if the keyword itself is a literal file/command.
- Never mix monospace and bold markers; choose one based on whether its a keyword (`**`) or inline code/path (`` ` ``).
**Structure**
- Place related bullets together; dont mix unrelated concepts in the same section.
- Order sections from general → specific → supporting info.
- For subsections (e.g., “Binaries” under “Rust Workspace”), introduce with a bolded keyword bullet, then list items under it.
- Match structure to complexity:
- Multi-part or detailed results → use clear headers and grouped bullets.
- Simple results → minimal headers, possibly just a short list or paragraph.
**Tone**
- Keep the voice collaborative and natural, like a coding partner handing off work.
- Be concise and factual — no filler or conversational commentary and avoid unnecessary repetition
- Use present tense and active voice (e.g., “Runs tests” not “This will run tests”).
- Keep descriptions self-contained; dont refer to “above” or “below”.
- Use parallel structure in lists for consistency.
**Dont**
- Dont use literal words “bold” or “monospace” in the content.
- Dont nest bullets or create deep hierarchies.
- Dont output ANSI escape codes directly — the CLI renderer applies them.
- Dont cram unrelated keywords into a single bullet; split for clarity.
- Dont let keyword lists run long — wrap or reformat for scanability.
Generally, ensure your final answers adapt their shape and depth to the request. For example, answers to code explanations should have a precise, structured explanation with code references that answer the question directly. For tasks with a simple implementation, lead with the outcome and supplement only with whats needed for clarity. Larger changes can be presented as a logical walkthrough of your approach, grouping related steps, explaining rationale where it adds value, and highlighting next actions to accelerate the user. Your answers should provide the right level of detail while being easily scannable.
For casual greetings, acknowledgements, or other one-off conversational messages that are not delivering substantive information or structured results, respond naturally without section headers or bullet formatting.
# Tools
## `apply_patch`
Your patch language is a strippeddown, fileoriented diff format designed to be easy to parse and safe to apply. You can think of it as a highlevel envelope:
@@ -96,3 +314,13 @@ You can invoke apply_patch like:
```
shell {"command":["apply_patch","*** Begin Patch\n*** Add File: hello.txt\n+Hello, world!\n*** End Patch\n"]}
```
## `update_plan`
A tool named `update_plan` is available to you. You can use it to keep an uptodate, stepbystep plan for the task.
To create a new plan, call `update_plan` with a short list of 1sentence steps (no more than 5-7 words each) with a `status` for each step (`pending`, `in_progress`, or `completed`).
When steps have been completed, use `update_plan` to mark each finished step as `completed` and the next step you are working on as `in_progress`. There should always be exactly one `in_progress` step until everything is done. You can mark multiple items as complete in a single `update_plan` call.
If all steps are complete, ensure you call `update_plan` to mark all steps as `completed`.

View File

@@ -0,0 +1,157 @@
use crate::codex::Session;
use crate::models::FunctionCallOutputPayload;
use crate::models::ResponseInputItem;
use crate::protocol::FileChange;
use crate::protocol::ReviewDecision;
use crate::safety::SafetyCheck;
use crate::safety::assess_patch_safety;
use codex_apply_patch::ApplyPatchAction;
use codex_apply_patch::ApplyPatchFileChange;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
pub const CODEX_APPLY_PATCH_ARG1: &str = "--codex-run-as-apply-patch";
pub(crate) enum InternalApplyPatchInvocation {
/// The `apply_patch` call was handled programmatically, without any sort
/// of sandbox, because the user explicitly approved it. This is the
/// result to use with the `shell` function call that contained `apply_patch`.
Output(ResponseInputItem),
/// The `apply_patch` call was approved, either automatically because it
/// appears that it should be allowed based on the user's sandbox policy
/// *or* because the user explicitly approved it. In either case, we use
/// exec with [`CODEX_APPLY_PATCH_ARG1`] to realize the `apply_patch` call,
/// but [`ApplyPatchExec::auto_approved`] is used to determine the sandbox
/// used with the `exec()`.
DelegateToExec(ApplyPatchExec),
}
pub(crate) struct ApplyPatchExec {
pub(crate) action: ApplyPatchAction,
pub(crate) user_explicitly_approved_this_action: bool,
}
impl From<ResponseInputItem> for InternalApplyPatchInvocation {
fn from(item: ResponseInputItem) -> Self {
InternalApplyPatchInvocation::Output(item)
}
}
pub(crate) async fn apply_patch(
sess: &Session,
sub_id: &str,
call_id: &str,
action: ApplyPatchAction,
) -> InternalApplyPatchInvocation {
let writable_roots_snapshot = {
#[allow(clippy::unwrap_used)]
let guard = sess.writable_roots.lock().unwrap();
guard.clone()
};
match assess_patch_safety(
&action,
sess.approval_policy,
&writable_roots_snapshot,
&sess.cwd,
) {
SafetyCheck::AutoApprove { .. } => {
InternalApplyPatchInvocation::DelegateToExec(ApplyPatchExec {
action,
user_explicitly_approved_this_action: false,
})
}
SafetyCheck::AskUser => {
// Compute a readable summary of path changes to include in the
// approval request so the user can make an informed decision.
//
// Note that it might be worth expanding this approval request to
// give the user the option to expand the set of writable roots so
// that similar patches can be auto-approved in the future during
// this session.
let rx_approve = sess
.request_patch_approval(sub_id.to_owned(), call_id.to_owned(), &action, None, None)
.await;
match rx_approve.await.unwrap_or_default() {
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {
InternalApplyPatchInvocation::DelegateToExec(ApplyPatchExec {
action,
user_explicitly_approved_this_action: true,
})
}
ReviewDecision::Denied | ReviewDecision::Abort => {
ResponseInputItem::FunctionCallOutput {
call_id: call_id.to_owned(),
output: FunctionCallOutputPayload {
content: "patch rejected by user".to_string(),
success: Some(false),
},
}
.into()
}
}
}
SafetyCheck::Reject { reason } => ResponseInputItem::FunctionCallOutput {
call_id: call_id.to_owned(),
output: FunctionCallOutputPayload {
content: format!("patch rejected: {reason}"),
success: Some(false),
},
}
.into(),
}
}
pub(crate) fn convert_apply_patch_to_protocol(
action: &ApplyPatchAction,
) -> HashMap<PathBuf, FileChange> {
let changes = action.changes();
let mut result = HashMap::with_capacity(changes.len());
for (path, change) in changes {
let protocol_change = match change {
ApplyPatchFileChange::Add { content } => FileChange::Add {
content: content.clone(),
},
ApplyPatchFileChange::Delete => FileChange::Delete,
ApplyPatchFileChange::Update {
unified_diff,
move_path,
new_content: _new_content,
} => FileChange::Update {
unified_diff: unified_diff.clone(),
move_path: move_path.clone(),
},
};
result.insert(path.clone(), protocol_change);
}
result
}
pub(crate) fn get_writable_roots(cwd: &Path) -> Vec<PathBuf> {
let mut writable_roots = Vec::new();
if cfg!(target_os = "macos") {
// On macOS, $TMPDIR is private to the user.
writable_roots.push(std::env::temp_dir());
// Allow pyenv to update its shims directory. Without this, any tool
// that happens to be managed by `pyenv` will fail with an error like:
//
// pyenv: cannot rehash: $HOME/.pyenv/shims isn't writable
//
// which is emitted every time `pyenv` tries to run `rehash` (for
// example, after installing a new Python package that drops an entry
// point). Although the sandbox is intentionally readonly by default,
// writing to the user's local `pyenv` directory is safe because it
// is already userwritable and scoped to the current user account.
if let Ok(home_dir) = std::env::var("HOME") {
let pyenv_dir = PathBuf::from(home_dir).join(".pyenv");
writable_roots.push(pyenv_dir);
}
}
writable_roots.push(cwd.to_path_buf());
writable_roots
}

219
codex-rs/core/src/bash.rs Normal file
View File

@@ -0,0 +1,219 @@
use tree_sitter::Parser;
use tree_sitter::Tree;
use tree_sitter_bash::LANGUAGE as BASH;
/// Parse the provided bash source using tree-sitter-bash, returning a Tree on
/// success or None if parsing failed.
pub fn try_parse_bash(bash_lc_arg: &str) -> Option<Tree> {
let lang = BASH.into();
let mut parser = Parser::new();
#[expect(clippy::expect_used)]
parser.set_language(&lang).expect("load bash grammar");
let old_tree: Option<&Tree> = None;
parser.parse(bash_lc_arg, old_tree)
}
/// Parse a script which may contain multiple simple commands joined only by
/// the safe logical/pipe/sequencing operators: `&&`, `||`, `;`, `|`.
///
/// Returns `Some(Vec<command_words>)` if every command is a plain wordonly
/// command and the parse tree does not contain disallowed constructs
/// (parentheses, redirections, substitutions, control flow, etc.). Otherwise
/// returns `None`.
pub fn try_parse_word_only_commands_sequence(tree: &Tree, src: &str) -> Option<Vec<Vec<String>>> {
if tree.root_node().has_error() {
return None;
}
// List of allowed (named) node kinds for a "word only commands sequence".
// If we encounter a named node that is not in this list we reject.
const ALLOWED_KINDS: &[&str] = &[
// top level containers
"program",
"list",
"pipeline",
// commands & words
"command",
"command_name",
"word",
"string",
"string_content",
"raw_string",
"number",
];
// Allow only safe punctuation / operator tokens; anything else causes reject.
const ALLOWED_PUNCT_TOKENS: &[&str] = &["&&", "||", ";", "|", "\"", "'"];
let root = tree.root_node();
let mut cursor = root.walk();
let mut stack = vec![root];
let mut command_nodes = Vec::new();
while let Some(node) = stack.pop() {
let kind = node.kind();
if node.is_named() {
if !ALLOWED_KINDS.contains(&kind) {
return None;
}
if kind == "command" {
command_nodes.push(node);
}
} else {
// Reject any punctuation / operator tokens that are not explicitly allowed.
if kind.chars().any(|c| "&;|".contains(c)) && !ALLOWED_PUNCT_TOKENS.contains(&kind) {
return None;
}
if !(ALLOWED_PUNCT_TOKENS.contains(&kind) || kind.trim().is_empty()) {
// If it's a quote token or operator it's allowed above; we also allow whitespace tokens.
// Any other punctuation like parentheses, braces, redirects, backticks, etc are rejected.
return None;
}
}
for child in node.children(&mut cursor) {
stack.push(child);
}
}
let mut commands = Vec::new();
for node in command_nodes {
if let Some(words) = parse_plain_command_from_node(node, src) {
commands.push(words);
} else {
return None;
}
}
Some(commands)
}
fn parse_plain_command_from_node(cmd: tree_sitter::Node, src: &str) -> Option<Vec<String>> {
if cmd.kind() != "command" {
return None;
}
let mut words = Vec::new();
let mut cursor = cmd.walk();
for child in cmd.named_children(&mut cursor) {
match child.kind() {
"command_name" => {
let word_node = child.named_child(0)?;
if word_node.kind() != "word" {
return None;
}
words.push(word_node.utf8_text(src.as_bytes()).ok()?.to_owned());
}
"word" | "number" => {
words.push(child.utf8_text(src.as_bytes()).ok()?.to_owned());
}
"string" => {
if child.child_count() == 3
&& child.child(0)?.kind() == "\""
&& child.child(1)?.kind() == "string_content"
&& child.child(2)?.kind() == "\""
{
words.push(child.child(1)?.utf8_text(src.as_bytes()).ok()?.to_owned());
} else {
return None;
}
}
"raw_string" => {
let raw_string = child.utf8_text(src.as_bytes()).ok()?;
let stripped = raw_string
.strip_prefix('\'')
.and_then(|s| s.strip_suffix('\''));
if let Some(s) = stripped {
words.push(s.to_owned());
} else {
return None;
}
}
_ => return None,
}
}
Some(words)
}
#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used)]
use super::*;
fn parse_seq(src: &str) -> Option<Vec<Vec<String>>> {
let tree = try_parse_bash(src)?;
try_parse_word_only_commands_sequence(&tree, src)
}
#[test]
fn accepts_single_simple_command() {
let cmds = parse_seq("ls -1").unwrap();
assert_eq!(cmds, vec![vec!["ls".to_string(), "-1".to_string()]]);
}
#[test]
fn accepts_multiple_commands_with_allowed_operators() {
let src = "ls && pwd; echo 'hi there' | wc -l";
let cmds = parse_seq(src).unwrap();
let expected: Vec<Vec<String>> = vec![
vec!["wc".to_string(), "-l".to_string()],
vec!["echo".to_string(), "hi there".to_string()],
vec!["pwd".to_string()],
vec!["ls".to_string()],
];
assert_eq!(cmds, expected);
}
#[test]
fn extracts_double_and_single_quoted_strings() {
let cmds = parse_seq("echo \"hello world\"").unwrap();
assert_eq!(
cmds,
vec![vec!["echo".to_string(), "hello world".to_string()]]
);
let cmds2 = parse_seq("echo 'hi there'").unwrap();
assert_eq!(
cmds2,
vec![vec!["echo".to_string(), "hi there".to_string()]]
);
}
#[test]
fn accepts_numbers_as_words() {
let cmds = parse_seq("echo 123 456").unwrap();
assert_eq!(
cmds,
vec![vec![
"echo".to_string(),
"123".to_string(),
"456".to_string()
]]
);
}
#[test]
fn rejects_parentheses_and_subshells() {
assert!(parse_seq("(ls)").is_none());
assert!(parse_seq("ls || (pwd && echo hi)").is_none());
}
#[test]
fn rejects_redirections_and_unsupported_operators() {
assert!(parse_seq("ls > out.txt").is_none());
assert!(parse_seq("echo hi & echo bye").is_none());
}
#[test]
fn rejects_command_and_process_substitutions_and_expansions() {
assert!(parse_seq("echo $(pwd)").is_none());
assert!(parse_seq("echo `pwd`").is_none());
assert!(parse_seq("echo $HOME").is_none());
assert!(parse_seq("echo \"hi $USER\"").is_none());
}
#[test]
fn rejects_variable_assignment_prefix() {
assert!(parse_seq("FOO=bar ls").is_none());
}
#[test]
fn rejects_trailing_operator_parse_error() {
assert!(parse_seq("ls &&").is_none());
}
}

View File

@@ -21,9 +21,9 @@ use crate::client_common::ResponseEvent;
use crate::client_common::ResponseStream;
use crate::error::CodexErr;
use crate::error::Result;
use crate::flags::OPENAI_REQUEST_MAX_RETRIES;
use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::model_family::ModelFamily;
use crate::models::ContentItem;
use crate::models::ReasoningItemContent;
use crate::models::ResponseItem;
use crate::openai_tools::create_tools_json_for_chat_completions_api;
use crate::util::backoff;
@@ -31,19 +31,21 @@ use crate::util::backoff;
/// Implementation for the classic Chat Completions API.
pub(crate) async fn stream_chat_completions(
prompt: &Prompt,
model: &str,
model_family: &ModelFamily,
client: &reqwest::Client,
provider: &ModelProviderInfo,
) -> Result<ResponseStream> {
// Build messages array
let mut messages = Vec::<serde_json::Value>::new();
let full_instructions = prompt.get_full_instructions(model);
let full_instructions = prompt.get_full_instructions(model_family);
messages.push(json!({"role": "system", "content": full_instructions}));
for item in &prompt.input {
let input = prompt.get_formatted_input();
for item in &input {
match item {
ResponseItem::Message { role, content } => {
ResponseItem::Message { role, content, .. } => {
let mut text = String::new();
for c in content {
match c {
@@ -60,6 +62,7 @@ pub(crate) async fn stream_chat_completions(
name,
arguments,
call_id,
..
} => {
messages.push(json!({
"role": "assistant",
@@ -106,9 +109,9 @@ pub(crate) async fn stream_chat_completions(
}
}
let tools_json = create_tools_json_for_chat_completions_api(prompt, model)?;
let tools_json = create_tools_json_for_chat_completions_api(&prompt.tools)?;
let payload = json!({
"model": model,
"model": model_family.slug,
"messages": messages,
"stream": true,
"tools": tools_json,
@@ -116,15 +119,16 @@ pub(crate) async fn stream_chat_completions(
debug!(
"POST to {}: {}",
provider.get_full_url(),
provider.get_full_url(&None),
serde_json::to_string_pretty(&payload).unwrap_or_default()
);
let mut attempt = 0;
let max_retries = provider.request_max_retries();
loop {
attempt += 1;
let req_builder = provider.create_request_builder(client)?;
let req_builder = provider.create_request_builder(client, &None).await?;
let res = req_builder
.header(reqwest::header::ACCEPT, "text/event-stream")
@@ -134,9 +138,13 @@ pub(crate) async fn stream_chat_completions(
match res {
Ok(resp) if resp.status().is_success() => {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(1600);
let stream = resp.bytes_stream().map_err(CodexErr::Reqwest);
tokio::spawn(process_chat_sse(stream, tx_event));
tokio::spawn(process_chat_sse(
stream,
tx_event,
provider.stream_idle_timeout(),
));
return Ok(ResponseStream { rx_event });
}
Ok(res) => {
@@ -146,7 +154,7 @@ pub(crate) async fn stream_chat_completions(
return Err(CodexErr::UnexpectedStatus(status, body));
}
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(CodexErr::RetryLimit(status));
}
@@ -162,7 +170,7 @@ pub(crate) async fn stream_chat_completions(
tokio::time::sleep(delay).await;
}
Err(e) => {
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(e.into());
}
let delay = backoff(attempt);
@@ -175,14 +183,15 @@ pub(crate) async fn stream_chat_completions(
/// Lightweight SSE processor for the Chat Completions streaming format. The
/// output is mapped onto Codex's internal [`ResponseEvent`] so that the rest
/// of the pipeline can stay agnostic of the underlying wire format.
async fn process_chat_sse<S>(stream: S, tx_event: mpsc::Sender<Result<ResponseEvent>>)
where
async fn process_chat_sse<S>(
stream: S,
tx_event: mpsc::Sender<Result<ResponseEvent>>,
idle_timeout: Duration,
) where
S: Stream<Item = Result<Bytes>> + Unpin,
{
let mut stream = stream.eventsource();
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// State to accumulate a function call across streaming chunks.
// OpenAI may split the `arguments` string over multiple `delta` events
// until the chunk whose `finish_reason` is `tool_calls` is emitted. We
@@ -197,6 +206,8 @@ where
}
let mut fn_call_state = FunctionCallState::default();
let mut assistant_text = String::new();
let mut reasoning_text = String::new();
loop {
let sse = match timeout(idle_timeout, stream.next()).await {
@@ -225,6 +236,31 @@ where
// OpenAI Chat streaming sends a literal string "[DONE]" when finished.
if sse.data.trim() == "[DONE]" {
// Emit any finalized items before closing so downstream consumers receive
// terminal events for both assistant content and raw reasoning.
if !assistant_text.is_empty() {
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: std::mem::take(&mut assistant_text),
}],
id: None,
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
if !reasoning_text.is_empty() {
let item = ResponseItem::Reasoning {
id: String::new(),
summary: Vec::new(),
content: Some(vec![ReasoningItemContent::ReasoningText {
text: std::mem::take(&mut reasoning_text),
}]),
encrypted_content: None,
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
@@ -244,20 +280,47 @@ where
let choice_opt = chunk.get("choices").and_then(|c| c.get(0));
if let Some(choice) = choice_opt {
// Handle assistant content tokens.
// Handle assistant content tokens as streaming deltas.
if let Some(content) = choice
.get("delta")
.and_then(|d| d.get("content"))
.and_then(|c| c.as_str())
{
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: content.to_string(),
}],
};
if !content.is_empty() {
assistant_text.push_str(content);
let _ = tx_event
.send(Ok(ResponseEvent::OutputTextDelta(content.to_string())))
.await;
}
}
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
// Forward any reasoning/thinking deltas if present.
// Some providers stream `reasoning` as a plain string while others
// nest the text under an object (e.g. `{ "reasoning": { "text": "…" } }`).
if let Some(reasoning_val) = choice.get("delta").and_then(|d| d.get("reasoning")) {
let mut maybe_text = reasoning_val.as_str().map(|s| s.to_string());
if maybe_text.is_none() && reasoning_val.is_object() {
if let Some(s) = reasoning_val
.get("text")
.and_then(|t| t.as_str())
.filter(|s| !s.is_empty())
{
maybe_text = Some(s.to_string());
} else if let Some(s) = reasoning_val
.get("content")
.and_then(|t| t.as_str())
.filter(|s| !s.is_empty())
{
maybe_text = Some(s.to_string());
}
}
if let Some(reasoning) = maybe_text {
let _ = tx_event
.send(Ok(ResponseEvent::ReasoningContentDelta(reasoning)))
.await;
}
}
// Handle streaming function / tool calls.
@@ -294,18 +357,55 @@ where
if let Some(finish_reason) = choice.get("finish_reason").and_then(|v| v.as_str()) {
match finish_reason {
"tool_calls" if fn_call_state.active => {
// Build the FunctionCall response item.
// First, flush the terminal raw reasoning so UIs can finalize
// the reasoning stream before any exec/tool events begin.
if !reasoning_text.is_empty() {
let item = ResponseItem::Reasoning {
id: String::new(),
summary: Vec::new(),
content: Some(vec![ReasoningItemContent::ReasoningText {
text: std::mem::take(&mut reasoning_text),
}]),
encrypted_content: None,
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
// Then emit the FunctionCall response item.
let item = ResponseItem::FunctionCall {
id: None,
name: fn_call_state.name.clone().unwrap_or_else(|| "".to_string()),
arguments: fn_call_state.arguments.clone(),
call_id: fn_call_state.call_id.clone().unwrap_or_else(String::new),
};
// Emit it downstream.
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
"stop" => {
// Regular turn without tool-call.
// Regular turn without tool-call. Emit the final assistant message
// as a single OutputItemDone so non-delta consumers see the result.
if !assistant_text.is_empty() {
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: std::mem::take(&mut assistant_text),
}],
id: None,
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
// Also emit a terminal Reasoning item so UIs can finalize raw reasoning.
if !reasoning_text.is_empty() {
let item = ResponseItem::Reasoning {
id: String::new(),
summary: Vec::new(),
content: Some(vec![ReasoningItemContent::ReasoningText {
text: std::mem::take(&mut reasoning_text),
}]),
encrypted_content: None,
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
}
_ => {}
}
@@ -343,10 +443,17 @@ where
/// The adapter is intentionally *lossless*: callers who do **not** opt in via
/// [`AggregateStreamExt::aggregate()`] keep receiving the original unmodified
/// events.
#[derive(Copy, Clone, Eq, PartialEq)]
enum AggregateMode {
AggregatedOnly,
Streaming,
}
pub(crate) struct AggregatedChatStream<S> {
inner: S,
cumulative: String,
pending_completed: Option<ResponseEvent>,
cumulative_reasoning: String,
pending: std::collections::VecDeque<ResponseEvent>,
mode: AggregateMode,
}
impl<S> Stream for AggregatedChatStream<S>
@@ -358,8 +465,8 @@ where
fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
let this = self.get_mut();
// First, flush any buffered Completed event from the previous call.
if let Some(ev) = this.pending_completed.take() {
// First, flush any buffered events from the previous call.
if let Some(ev) = this.pending.pop_front() {
return Poll::Ready(Some(Ok(ev)));
}
@@ -376,16 +483,21 @@ where
let is_assistant_delta = matches!(&item, crate::models::ResponseItem::Message { role, .. } if role == "assistant");
if is_assistant_delta {
if let crate::models::ResponseItem::Message { content, .. } = &item {
if let Some(text) = content.iter().find_map(|c| match c {
crate::models::ContentItem::OutputText { text } => Some(text),
_ => None,
}) {
this.cumulative.push_str(text);
// Only use the final assistant message if we have not
// seen any deltas; otherwise, deltas already built the
// cumulative text and this would duplicate it.
if this.cumulative.is_empty() {
if let crate::models::ResponseItem::Message { content, .. } = &item {
if let Some(text) = content.iter().find_map(|c| match c {
crate::models::ContentItem::OutputText { text } => Some(text),
_ => None,
}) {
this.cumulative.push_str(text);
}
}
}
// Swallow partial assistant chunk; keep polling.
// Swallow assistant message here; emit on Completed.
continue;
}
@@ -396,23 +508,50 @@ where
response_id,
token_usage,
}))) => {
// Build any aggregated items in the correct order: Reasoning first, then Message.
let mut emitted_any = false;
if !this.cumulative_reasoning.is_empty()
&& matches!(this.mode, AggregateMode::AggregatedOnly)
{
let aggregated_reasoning = crate::models::ResponseItem::Reasoning {
id: String::new(),
summary: Vec::new(),
content: Some(vec![
crate::models::ReasoningItemContent::ReasoningText {
text: std::mem::take(&mut this.cumulative_reasoning),
},
]),
encrypted_content: None,
};
this.pending
.push_back(ResponseEvent::OutputItemDone(aggregated_reasoning));
emitted_any = true;
}
if !this.cumulative.is_empty() {
let aggregated_item = crate::models::ResponseItem::Message {
let aggregated_message = crate::models::ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![crate::models::ContentItem::OutputText {
text: std::mem::take(&mut this.cumulative),
}],
};
this.pending
.push_back(ResponseEvent::OutputItemDone(aggregated_message));
emitted_any = true;
}
// Buffer Completed so it is returned *after* the aggregated message.
this.pending_completed = Some(ResponseEvent::Completed {
response_id,
token_usage,
// Always emit Completed last when anything was aggregated.
if emitted_any {
this.pending.push_back(ResponseEvent::Completed {
response_id: response_id.clone(),
token_usage: token_usage.clone(),
});
return Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(
aggregated_item,
))));
// Return the first pending event now.
if let Some(ev) = this.pending.pop_front() {
return Poll::Ready(Some(Ok(ev)));
}
}
// Nothing aggregated forward Completed directly.
@@ -426,6 +565,29 @@ where
// will never appear in a Chat Completions stream.
continue;
}
Poll::Ready(Some(Ok(ResponseEvent::OutputTextDelta(delta)))) => {
// Always accumulate deltas so we can emit a final OutputItemDone at Completed.
this.cumulative.push_str(&delta);
if matches!(this.mode, AggregateMode::Streaming) {
// In streaming mode, also forward the delta immediately.
return Poll::Ready(Some(Ok(ResponseEvent::OutputTextDelta(delta))));
} else {
continue;
}
}
Poll::Ready(Some(Ok(ResponseEvent::ReasoningContentDelta(delta)))) => {
// Always accumulate reasoning deltas so we can emit a final Reasoning item at Completed.
this.cumulative_reasoning.push_str(&delta);
if matches!(this.mode, AggregateMode::Streaming) {
// In streaming mode, also forward the delta immediately.
return Poll::Ready(Some(Ok(ResponseEvent::ReasoningContentDelta(delta))));
} else {
continue;
}
}
Poll::Ready(Some(Ok(ResponseEvent::ReasoningSummaryDelta(_)))) => {
continue;
}
}
}
}
@@ -453,12 +615,24 @@ pub(crate) trait AggregateStreamExt: Stream<Item = Result<ResponseEvent>> + Size
/// }
/// ```
fn aggregate(self) -> AggregatedChatStream<Self> {
AggregatedChatStream {
inner: self,
cumulative: String::new(),
pending_completed: None,
}
AggregatedChatStream::new(self, AggregateMode::AggregatedOnly)
}
}
impl<T> AggregateStreamExt for T where T: Stream<Item = Result<ResponseEvent>> + Sized {}
impl<S> AggregatedChatStream<S> {
fn new(inner: S, mode: AggregateMode) -> Self {
AggregatedChatStream {
inner,
cumulative: String::new(),
cumulative_reasoning: String::new(),
pending: std::collections::VecDeque::new(),
mode,
}
}
pub(crate) fn streaming_mode(inner: S) -> Self {
Self::new(inner, AggregateMode::Streaming)
}
}

View File

@@ -3,6 +3,8 @@ use std::path::Path;
use std::time::Duration;
use bytes::Bytes;
use codex_login::AuthMode;
use codex_login::CodexAuth;
use eventsource_stream::Eventsource;
use futures::prelude::*;
use reqwest::StatusCode;
@@ -15,6 +17,7 @@ use tokio_util::io::ReaderStream;
use tracing::debug;
use tracing::trace;
use tracing::warn;
use uuid::Uuid;
use crate::chat_completions::AggregateStreamExt;
use crate::chat_completions::stream_chat_completions;
@@ -29,8 +32,6 @@ use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::CodexErr;
use crate::error::Result;
use crate::flags::CODEX_RS_SSE_FIXTURE;
use crate::flags::OPENAI_REQUEST_MAX_RETRIES;
use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::WireApi;
use crate::models::ResponseItem;
@@ -39,11 +40,23 @@ use crate::protocol::TokenUsage;
use crate::util::backoff;
use std::sync::Arc;
#[derive(Debug, Deserialize)]
struct ErrorResponse {
error: Error,
}
#[derive(Debug, Deserialize)]
struct Error {
r#type: String,
}
#[derive(Clone)]
pub struct ModelClient {
config: Arc<Config>,
auth: Option<CodexAuth>,
client: reqwest::Client,
provider: ModelProviderInfo,
session_id: Uuid,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
}
@@ -51,14 +64,18 @@ pub struct ModelClient {
impl ModelClient {
pub fn new(
config: Arc<Config>,
auth: Option<CodexAuth>,
provider: ModelProviderInfo,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
session_id: Uuid,
) -> Self {
Self {
config,
auth,
client: reqwest::Client::new(),
provider,
session_id,
effort,
summary,
}
@@ -74,7 +91,7 @@ impl ModelClient {
// Create the raw streaming connection first.
let response_stream = stream_chat_completions(
prompt,
&self.config.model,
&self.config.model_family,
&self.client,
&self.provider,
)
@@ -83,7 +100,11 @@ impl ModelClient {
// Wrap it with the aggregation adapter so callers see *only*
// the final assistant message per turn (matching the
// behaviour of the Responses API).
let mut aggregated = response_stream.aggregate();
let mut aggregated = if self.config.show_raw_agent_reasoning {
crate::chat_completions::AggregatedChatStream::streaming_mode(response_stream)
} else {
response_stream.aggregate()
};
// Bridge the aggregated stream back into a standard
// `ResponseStream` by forwarding events through a channel.
@@ -109,55 +130,119 @@ impl ModelClient {
if let Some(path) = &*CODEX_RS_SSE_FIXTURE {
// short circuit for tests
warn!(path, "Streaming from fixture");
return stream_from_fixture(path).await;
return stream_from_fixture(path, self.provider.clone()).await;
}
let full_instructions = prompt.get_full_instructions(&self.config.model);
let tools_json = create_tools_json_for_responses_api(prompt, &self.config.model)?;
let reasoning = create_reasoning_param_for_request(&self.config, self.effort, self.summary);
let auth = self.auth.clone();
let auth_mode = auth.as_ref().map(|a| a.mode);
let store = prompt.store && auth_mode != Some(AuthMode::ChatGPT);
let full_instructions = prompt.get_full_instructions(&self.config.model_family);
let tools_json = create_tools_json_for_responses_api(&prompt.tools)?;
let reasoning = create_reasoning_param_for_request(
&self.config.model_family,
self.effort,
self.summary,
);
// Request encrypted COT if we are not storing responses,
// otherwise reasoning items will be referenced by ID
let include: Vec<String> = if !store && reasoning.is_some() {
vec!["reasoning.encrypted_content".to_string()]
} else {
vec![]
};
let input_with_instructions = prompt.get_formatted_input();
let payload = ResponsesApiRequest {
model: &self.config.model,
instructions: &full_instructions,
input: &prompt.input,
input: &input_with_instructions,
tools: &tools_json,
tool_choice: "auto",
parallel_tool_calls: false,
reasoning,
previous_response_id: prompt.prev_id.clone(),
store: prompt.store,
store,
stream: true,
include,
};
let mut attempt = 0;
let max_retries = self.provider.request_max_retries();
trace!(
"POST to {}: {}",
self.provider.get_full_url(),
self.provider.get_full_url(&auth),
serde_json::to_string(&payload)?
);
let mut attempt = 0;
loop {
attempt += 1;
let req_builder = self
let mut req_builder = self
.provider
.create_request_builder(&self.client)?
.create_request_builder(&self.client, &auth)
.await?;
req_builder = req_builder
.header("OpenAI-Beta", "responses=experimental")
.header("session_id", self.session_id.to_string())
.header(reqwest::header::ACCEPT, "text/event-stream")
.json(&payload);
if let Some(auth) = auth.as_ref()
&& auth.mode == AuthMode::ChatGPT
&& let Some(account_id) = auth.get_account_id().await
{
req_builder = req_builder.header("chatgpt-account-id", account_id);
}
let originator = self
.config
.internal_originator
.as_deref()
.unwrap_or("codex_cli_rs");
req_builder = req_builder.header("originator", originator);
let res = req_builder.send().await;
if let Ok(resp) = &res {
trace!(
"Response status: {}, request-id: {}",
resp.status(),
resp.headers()
.get("x-request-id")
.map(|v| v.to_str().unwrap_or_default())
.unwrap_or_default()
);
}
match res {
Ok(resp) if resp.status().is_success() => {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(1600);
// spawn task to process SSE
let stream = resp.bytes_stream().map_err(CodexErr::Reqwest);
tokio::spawn(process_sse(stream, tx_event));
tokio::spawn(process_sse(
stream,
tx_event,
self.provider.stream_idle_timeout(),
));
return Ok(ResponseStream { rx_event });
}
Ok(res) => {
let status = res.status();
// Pull out RetryAfter header if present.
let retry_after_secs = res
.headers()
.get(reqwest::header::RETRY_AFTER)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.parse::<u64>().ok());
// The OpenAI Responses endpoint returns structured JSON bodies even for 4xx/5xx
// errors. When we bubble early with only the HTTP status the caller sees an opaque
// "unexpected status 400 Bad Request" which makes debugging nearly impossible.
@@ -171,16 +256,27 @@ impl ModelClient {
return Err(CodexErr::UnexpectedStatus(status, body));
}
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
return Err(CodexErr::RetryLimit(status));
if status == StatusCode::TOO_MANY_REQUESTS {
let body = res.json::<ErrorResponse>().await.ok();
if let Some(ErrorResponse {
error: Error { r#type, .. },
}) = body
{
if r#type == "usage_limit_reached" {
return Err(CodexErr::UsageLimitReached);
} else if r#type == "usage_not_included" {
return Err(CodexErr::UsageNotIncluded);
}
}
}
// Pull out RetryAfter header if present.
let retry_after_secs = res
.headers()
.get(reqwest::header::RETRY_AFTER)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.parse::<u64>().ok());
if attempt > max_retries {
if status == StatusCode::INTERNAL_SERVER_ERROR {
return Err(CodexErr::InternalServerError);
}
return Err(CodexErr::RetryLimit(status));
}
let delay = retry_after_secs
.map(|s| Duration::from_millis(s * 1_000))
@@ -188,7 +284,7 @@ impl ModelClient {
tokio::time::sleep(delay).await;
}
Err(e) => {
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(e.into());
}
let delay = backoff(attempt);
@@ -197,6 +293,10 @@ impl ModelClient {
}
}
}
pub fn get_provider(&self) -> ModelProviderInfo {
self.provider.clone()
}
}
#[derive(Debug, Deserialize, Serialize)]
@@ -205,6 +305,7 @@ struct SseEvent {
kind: String,
response: Option<Value>,
item: Option<Value>,
delta: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -247,14 +348,16 @@ struct ResponseCompletedOutputTokensDetails {
reasoning_tokens: u64,
}
async fn process_sse<S>(stream: S, tx_event: mpsc::Sender<Result<ResponseEvent>>)
where
async fn process_sse<S>(
stream: S,
tx_event: mpsc::Sender<Result<ResponseEvent>>,
idle_timeout: Duration,
) where
S: Stream<Item = Result<Bytes>> + Unpin,
{
let mut stream = stream.eventsource();
// If the stream stays completely silent for an extended period treat it as disconnected.
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// The response id returned from the "complete" message.
let mut response_completed: Option<ResponseCompleted> = None;
@@ -315,7 +418,7 @@ where
// duplicated `output` array embedded in the `response.completed`
// payload. That produced two concrete issues:
// 1. No realtime streaming the user only saw output after the
// entire turn had finished, which broke the typing UX and
// entire turn had finished, which broke the "typing" UX and
// made longrunning turns look stalled.
// 2. Duplicate `function_call_output` items both the
// individual *and* the completed array were forwarded, which
@@ -337,11 +440,48 @@ where
return;
}
}
"response.output_text.delta" => {
if let Some(delta) = event.delta {
let event = ResponseEvent::OutputTextDelta(delta);
if tx_event.send(Ok(event)).await.is_err() {
return;
}
}
}
"response.reasoning_summary_text.delta" => {
if let Some(delta) = event.delta {
let event = ResponseEvent::ReasoningSummaryDelta(delta);
if tx_event.send(Ok(event)).await.is_err() {
return;
}
}
}
"response.reasoning_text.delta" => {
if let Some(delta) = event.delta {
let event = ResponseEvent::ReasoningContentDelta(delta);
if tx_event.send(Ok(event)).await.is_err() {
return;
}
}
}
"response.created" => {
if event.response.is_some() {
let _ = tx_event.send(Ok(ResponseEvent::Created {})).await;
}
}
"response.failed" => {
if let Some(resp_val) = event.response {
let error = resp_val
.get("error")
.and_then(|v| v.get("message"))
.and_then(|v| v.as_str())
.unwrap_or("response.failed event received");
let _ = tx_event
.send(Err(CodexErr::Stream(error.to_string())))
.await;
}
}
// Final response completed includes array of output items & id
"response.completed" => {
if let Some(resp_val) = event.response {
@@ -360,10 +500,8 @@ where
| "response.function_call_arguments.delta"
| "response.in_progress"
| "response.output_item.added"
| "response.output_text.delta"
| "response.output_text.done"
| "response.reasoning_summary_part.added"
| "response.reasoning_summary_text.delta"
| "response.reasoning_summary_text.done" => {
// Currently, we ignore these events, but we handle them
// separately to skip the logging message in the `other` case.
@@ -374,8 +512,11 @@ where
}
/// used in tests to stream from a text SSE file
async fn stream_from_fixture(path: impl AsRef<Path>) -> Result<ResponseStream> {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
async fn stream_from_fixture(
path: impl AsRef<Path>,
provider: ModelProviderInfo,
) -> Result<ResponseStream> {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(1600);
let f = std::fs::File::open(path.as_ref())?;
let lines = std::io::BufReader::new(f).lines();
@@ -388,7 +529,11 @@ async fn stream_from_fixture(path: impl AsRef<Path>) -> Result<ResponseStream> {
let rdr = std::io::Cursor::new(content);
let stream = ReaderStream::new(rdr).map_err(CodexErr::Io);
tokio::spawn(process_sse(stream, tx_event));
tokio::spawn(process_sse(
stream,
tx_event,
provider.stream_idle_timeout(),
));
Ok(ResponseStream { rx_event })
}
@@ -408,7 +553,10 @@ mod tests {
/// Runs the SSE parser on pre-chunked byte slices and returns every event
/// (including any final `Err` from a stream-closure check).
async fn collect_events(chunks: &[&[u8]]) -> Vec<Result<ResponseEvent>> {
async fn collect_events(
chunks: &[&[u8]],
provider: ModelProviderInfo,
) -> Vec<Result<ResponseEvent>> {
let mut builder = IoBuilder::new();
for chunk in chunks {
builder.read(chunk);
@@ -417,7 +565,7 @@ mod tests {
let reader = builder.build();
let stream = ReaderStream::new(reader).map_err(CodexErr::Io);
let (tx, mut rx) = mpsc::channel::<Result<ResponseEvent>>(16);
tokio::spawn(process_sse(stream, tx));
tokio::spawn(process_sse(stream, tx, provider.stream_idle_timeout()));
let mut events = Vec::new();
while let Some(ev) = rx.recv().await {
@@ -428,7 +576,10 @@ mod tests {
/// Builds an in-memory SSE stream from JSON fixtures and returns only the
/// successfully parsed events (panics on internal channel errors).
async fn run_sse(events: Vec<serde_json::Value>) -> Vec<ResponseEvent> {
async fn run_sse(
events: Vec<serde_json::Value>,
provider: ModelProviderInfo,
) -> Vec<ResponseEvent> {
let mut body = String::new();
for e in events {
let kind = e
@@ -444,7 +595,7 @@ mod tests {
let (tx, mut rx) = mpsc::channel::<Result<ResponseEvent>>(8);
let stream = ReaderStream::new(std::io::Cursor::new(body)).map_err(CodexErr::Io);
tokio::spawn(process_sse(stream, tx));
tokio::spawn(process_sse(stream, tx, provider.stream_idle_timeout()));
let mut out = Vec::new();
while let Some(ev) = rx.recv().await {
@@ -489,7 +640,26 @@ mod tests {
let sse2 = format!("event: response.output_item.done\ndata: {item2}\n\n");
let sse3 = format!("event: response.completed\ndata: {completed}\n\n");
let events = collect_events(&[sse1.as_bytes(), sse2.as_bytes(), sse3.as_bytes()]).await;
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
requires_openai_auth: false,
};
let events = collect_events(
&[sse1.as_bytes(), sse2.as_bytes(), sse3.as_bytes()],
provider,
)
.await;
assert_eq!(events.len(), 3);
@@ -530,8 +700,22 @@ mod tests {
.to_string();
let sse1 = format!("event: response.output_item.done\ndata: {item1}\n\n");
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
requires_openai_auth: false,
};
let events = collect_events(&[sse1.as_bytes()]).await;
let events = collect_events(&[sse1.as_bytes()], provider).await;
assert_eq!(events.len(), 2);
@@ -619,7 +803,22 @@ mod tests {
let mut evs = vec![case.event];
evs.push(completed.clone());
let out = run_sse(evs).await;
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
requires_openai_auth: false,
};
let out = run_sse(evs, provider).await;
assert_eq!(out.len(), case.expected_len, "case {}", case.name);
assert!(
(case.expect_first)(&out[0]),

View File

@@ -1,13 +1,19 @@
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::Result;
use crate::model_family::ModelFamily;
use crate::models::ContentItem;
use crate::models::ResponseItem;
use crate::openai_tools::OpenAiTool;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
use crate::protocol::TokenUsage;
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use futures::Stream;
use serde::Serialize;
use std::borrow::Cow;
use std::collections::HashMap;
use std::fmt::Display;
use std::path::PathBuf;
use std::pin::Pin;
use std::task::Context;
use std::task::Poll;
@@ -17,36 +23,114 @@ use tokio::sync::mpsc;
/// with this content.
const BASE_INSTRUCTIONS: &str = include_str!("../prompt.md");
/// wraps environment context message in a tag for the model to parse more easily.
const ENVIRONMENT_CONTEXT_START: &str = "<environment_context>\n\n";
const ENVIRONMENT_CONTEXT_END: &str = "\n\n</environment_context>";
/// wraps user instructions message in a tag for the model to parse more easily.
const USER_INSTRUCTIONS_START: &str = "<user_instructions>\n\n";
const USER_INSTRUCTIONS_END: &str = "\n\n</user_instructions>";
#[derive(Debug, Clone)]
pub(crate) struct EnvironmentContext {
pub cwd: PathBuf,
pub approval_policy: AskForApproval,
pub sandbox_policy: SandboxPolicy,
}
impl Display for EnvironmentContext {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
writeln!(
f,
"Current working directory: {}",
self.cwd.to_string_lossy()
)?;
writeln!(f, "Approval policy: {}", self.approval_policy)?;
writeln!(f, "Sandbox policy: {}", self.sandbox_policy)?;
let network_access = match self.sandbox_policy.clone() {
SandboxPolicy::DangerFullAccess => "enabled",
SandboxPolicy::ReadOnly => "restricted",
SandboxPolicy::WorkspaceWrite { network_access, .. } => {
if network_access {
"enabled"
} else {
"restricted"
}
}
};
writeln!(f, "Network access: {network_access}")?;
Ok(())
}
}
/// API request payload for a single model turn.
#[derive(Default, Debug, Clone)]
pub struct Prompt {
/// Conversation context input items.
pub input: Vec<ResponseItem>,
/// Optional previous response ID (when storage is enabled).
pub prev_id: Option<String>,
/// Optional instructions from the user to amend to the built-in agent
/// instructions.
pub user_instructions: Option<String>,
/// Whether to store response on server side (disable_response_storage = !store).
pub store: bool,
/// Additional tools sourced from external MCP servers. Note each key is
/// the "fully qualified" tool name (i.e., prefixed with the server name),
/// which should be reported to the model in place of Tool::name.
pub extra_tools: HashMap<String, mcp_types::Tool>,
/// A list of key-value pairs that will be added as a developer message
/// for the model to use
pub environment_context: Option<EnvironmentContext>,
/// Tools available to the model, including additional tools sourced from
/// external MCP servers.
pub tools: Vec<OpenAiTool>,
/// Optional override for the built-in BASE_INSTRUCTIONS.
pub base_instructions_override: Option<String>,
}
impl Prompt {
pub(crate) fn get_full_instructions(&self, model: &str) -> Cow<'_, str> {
let mut sections: Vec<&str> = vec![BASE_INSTRUCTIONS];
if let Some(ref user) = self.user_instructions {
sections.push(user);
}
if model.starts_with("gpt-4.1") {
pub(crate) fn get_full_instructions(&self, model: &ModelFamily) -> Cow<'_, str> {
let base = self
.base_instructions_override
.as_deref()
.unwrap_or(BASE_INSTRUCTIONS);
let mut sections: Vec<&str> = vec![base];
if model.needs_special_apply_patch_instructions {
sections.push(APPLY_PATCH_TOOL_INSTRUCTIONS);
}
Cow::Owned(sections.join("\n"))
}
fn get_formatted_user_instructions(&self) -> Option<String> {
self.user_instructions
.as_ref()
.map(|ui| format!("{USER_INSTRUCTIONS_START}{ui}{USER_INSTRUCTIONS_END}"))
}
fn get_formatted_environment_context(&self) -> Option<String> {
self.environment_context
.as_ref()
.map(|ec| format!("{ENVIRONMENT_CONTEXT_START}{ec}{ENVIRONMENT_CONTEXT_END}"))
}
pub(crate) fn get_formatted_input(&self) -> Vec<ResponseItem> {
let mut input_with_instructions = Vec::with_capacity(self.input.len() + 2);
if let Some(ec) = self.get_formatted_environment_context() {
input_with_instructions.push(ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: ec }],
});
}
if let Some(ui) = self.get_formatted_user_instructions() {
input_with_instructions.push(ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: ui }],
});
}
input_with_instructions.extend(self.input.clone());
input_with_instructions
}
}
#[derive(Debug)]
@@ -57,6 +141,9 @@ pub enum ResponseEvent {
response_id: String,
token_usage: Option<TokenUsage>,
},
OutputTextDelta(String),
ReasoningSummaryDelta(String),
ReasoningContentDelta(String),
}
#[derive(Debug, Serialize)]
@@ -124,21 +211,18 @@ pub(crate) struct ResponsesApiRequest<'a> {
pub(crate) tool_choice: &'static str,
pub(crate) parallel_tool_calls: bool,
pub(crate) reasoning: Option<Reasoning>,
#[serde(skip_serializing_if = "Option::is_none")]
pub(crate) previous_response_id: Option<String>,
/// true when using the Responses API.
pub(crate) store: bool,
pub(crate) stream: bool,
pub(crate) include: Vec<String>,
}
use crate::config::Config;
pub(crate) fn create_reasoning_param_for_request(
config: &Config,
model_family: &ModelFamily,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
) -> Option<Reasoning> {
if model_supports_reasoning_summaries(config) {
if model_family.supports_reasoning_summaries {
let effort: Option<OpenAiReasoningEffort> = effort.into();
let effort = effort?;
Some(Reasoning {
@@ -150,27 +234,6 @@ pub(crate) fn create_reasoning_param_for_request(
}
}
pub fn model_supports_reasoning_summaries(config: &Config) -> bool {
// Currently, we hardcode this rule to decide whether to enable reasoning.
// We expect reasoning to apply only to OpenAI models, but we do not want
// users to have to mess with their config to disable reasoning for models
// that do not support it, such as `gpt-4.1`.
//
// Though if a user is using Codex with non-OpenAI models that, say, happen
// to start with "o", then they can set `model_reasoning_effort = "none"` in
// config.toml to disable reasoning.
//
// Converseley, if a user has a non-OpenAI provider that supports reasoning,
// they can set the top-level `model_supports_reasoning_summaries = true`
// config option to enable reasoning.
if config.model_supports_reasoning_summaries {
return true;
}
let model = &config.model;
model.starts_with("o") || model.starts_with("codex")
}
pub(crate) struct ResponseStream {
pub(crate) rx_event: mpsc::Receiver<Result<ResponseEvent>>,
}
@@ -182,3 +245,23 @@ impl Stream for ResponseStream {
self.rx_event.poll_recv(cx)
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::expect_used)]
use crate::model_family::find_family_for_model;
use super::*;
#[test]
fn get_full_instructions_no_user_content() {
let prompt = Prompt {
user_instructions: Some("custom instruction".to_string()),
..Default::default()
};
let expected = format!("{BASE_INSTRUCTIONS}\n{APPLY_PATCH_TOOL_INSTRUCTIONS}");
let model_family = find_family_for_model("gpt-4.1").expect("known model slug");
let full = prompt.get_full_instructions(&model_family);
assert_eq!(full, expected);
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,20 +1,37 @@
use std::sync::Arc;
use crate::Codex;
use crate::CodexSpawnOk;
use crate::config::Config;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::util::notify_on_sigint;
use codex_login::load_auth;
use tokio::sync::Notify;
use uuid::Uuid;
/// Represents an active Codex conversation, including the first event
/// (which is [`EventMsg::SessionConfigured`]).
pub struct CodexConversation {
pub codex: Codex,
pub session_id: Uuid,
pub session_configured: Event,
pub ctrl_c: Arc<Notify>,
}
/// Spawn a new [`Codex`] and initialize the session.
///
/// Returns the wrapped [`Codex`] **and** the `SessionInitialized` event that
/// is received as a response to the initial `ConfigureSession` submission so
/// that callers can surface the information to the UI.
pub async fn init_codex(config: Config) -> anyhow::Result<(Codex, Event, Arc<Notify>)> {
pub async fn init_codex(config: Config) -> anyhow::Result<CodexConversation> {
let ctrl_c = notify_on_sigint();
let (codex, init_id) = Codex::spawn(config, ctrl_c.clone()).await?;
let auth = load_auth(&config.codex_home, true)?;
let CodexSpawnOk {
codex,
init_id,
session_id,
} = Codex::spawn(config, auth, ctrl_c.clone()).await?;
// The first event must be `SessionInitialized`. Validate and forward it to
// the caller so that they can display it in the conversation history.
@@ -33,5 +50,10 @@ pub async fn init_codex(config: Config) -> anyhow::Result<(Codex, Event, Arc<Not
));
}
Ok((codex, event, ctrl_c))
Ok(CodexConversation {
codex,
session_id,
session_configured: event,
ctrl_c,
})
}

View File

@@ -4,12 +4,13 @@ use crate::config_types::McpServerConfig;
use crate::config_types::ReasoningEffort;
use crate::config_types::ReasoningSummary;
use crate::config_types::SandboxMode;
use crate::config_types::SandboxWorkplaceWrite;
use crate::config_types::SandboxWorkspaceWrite;
use crate::config_types::ShellEnvironmentPolicy;
use crate::config_types::ShellEnvironmentPolicyToml;
use crate::config_types::Tui;
use crate::config_types::UriBasedFileOpener;
use crate::flags::OPENAI_DEFAULT_MODEL;
use crate::model_family::ModelFamily;
use crate::model_family::find_family_for_model;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::built_in_model_providers;
use crate::openai_model_info::get_model_info;
@@ -20,19 +21,27 @@ use serde::Deserialize;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
use tempfile::NamedTempFile;
use toml::Value as TomlValue;
use toml_edit::DocumentMut;
const OPENAI_DEFAULT_MODEL: &str = "gpt-5";
/// Maximum number of bytes of the documentation that will be embedded. Larger
/// files are *silently truncated* to this size so we do not take up too much of
/// the context window.
pub(crate) const PROJECT_DOC_MAX_BYTES: usize = 32 * 1024; // 32 KiB
const CONFIG_TOML_FILE: &str = "config.toml";
/// Application configuration loaded from disk and merged with overrides.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
/// Optional override of model selection.
pub model: String,
pub model_family: ModelFamily,
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<u64>,
@@ -57,13 +66,20 @@ pub struct Config {
/// users are only interested in the final agent responses.
pub hide_agent_reasoning: bool,
/// When set to `true`, `AgentReasoningRawContentEvent` events will be shown in the UI/output.
/// Defaults to `false`.
pub show_raw_agent_reasoning: bool,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
/// who have opted into Zero Data Retention (ZDR).
pub disable_response_storage: bool,
/// User-provided instructions from instructions.md.
pub instructions: Option<String>,
/// User-provided instructions from AGENTS.md.
pub user_instructions: Option<String>,
/// Base instructions override.
pub base_instructions: Option<String>,
/// Optional external notifier command. When set, Codex will spawn this
/// program after each completed *turn* (i.e. when the agent finishes
@@ -131,12 +147,17 @@ pub struct Config {
/// request using the Responses API.
pub model_reasoning_summary: ReasoningSummary,
/// When set to `true`, overrides the default heuristic and forces
/// `model_supports_reasoning_summaries()` to return `true`.
pub model_supports_reasoning_summaries: bool,
/// Base URL for requests to ChatGPT (as opposed to the OpenAI API).
pub chatgpt_base_url: String,
/// Experimental rollout resume path (absolute path to .jsonl; undocumented).
pub experimental_resume: Option<PathBuf>,
/// Include an experimental plan tool that the model can use to update its current plan and status of each step.
pub include_plan_tool: bool,
/// The value for the `originator` header included with Responses API requests.
pub internal_originator: Option<String>,
}
impl Config {
@@ -175,10 +196,28 @@ impl Config {
}
}
pub fn load_config_as_toml_with_cli_overrides(
codex_home: &Path,
cli_overrides: Vec<(String, TomlValue)>,
) -> std::io::Result<ConfigToml> {
let mut root_value = load_config_as_toml(codex_home)?;
for (path, value) in cli_overrides.into_iter() {
apply_toml_override(&mut root_value, &path, value);
}
let cfg: ConfigToml = root_value.try_into().map_err(|e| {
tracing::error!("Failed to deserialize overridden config: {e}");
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
})?;
Ok(cfg)
}
/// Read `CODEX_HOME/config.toml` and return it as a generic TOML value. Returns
/// an empty TOML table when the file does not exist.
fn load_config_as_toml(codex_home: &Path) -> std::io::Result<TomlValue> {
let config_path = codex_home.join("config.toml");
pub fn load_config_as_toml(codex_home: &Path) -> std::io::Result<TomlValue> {
let config_path = codex_home.join(CONFIG_TOML_FILE);
match std::fs::read_to_string(&config_path) {
Ok(contents) => match toml::from_str::<TomlValue>(&contents) {
Ok(val) => Ok(val),
@@ -198,6 +237,35 @@ fn load_config_as_toml(codex_home: &Path) -> std::io::Result<TomlValue> {
}
}
/// Patch `CODEX_HOME/config.toml` project state.
/// Use with caution.
pub fn set_project_trusted(codex_home: &Path, project_path: &Path) -> anyhow::Result<()> {
let config_path = codex_home.join(CONFIG_TOML_FILE);
// Parse existing config if present; otherwise start a new document.
let mut doc = match std::fs::read_to_string(config_path.clone()) {
Ok(s) => s.parse::<DocumentMut>()?,
Err(e) if e.kind() == std::io::ErrorKind::NotFound => DocumentMut::new(),
Err(e) => return Err(e.into()),
};
// Mark the project as trusted. toml_edit is very good at handling
// missing properties
let project_key = project_path.to_string_lossy().to_string();
doc["projects"][project_key.as_str()]["trust_level"] = toml_edit::value("trusted");
// ensure codex_home exists
std::fs::create_dir_all(codex_home)?;
// create a tmp_file
let tmp_file = NamedTempFile::new_in(codex_home)?;
std::fs::write(tmp_file.path(), doc.to_string())?;
// atomically move the tmp file into config.toml
tmp_file.persist(config_path)?;
Ok(())
}
/// Apply a single dotted-path override onto a TOML value.
fn apply_toml_override(root: &mut TomlValue, path: &str, value: TomlValue) {
use toml::value::Table;
@@ -266,7 +334,7 @@ pub struct ConfigToml {
pub sandbox_mode: Option<SandboxMode>,
/// Sandbox configuration to apply if `sandbox` is `WorkspaceWrite`.
pub sandbox_workspace_write: Option<SandboxWorkplaceWrite>,
pub sandbox_workspace_write: Option<SandboxWorkspaceWrite>,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
@@ -313,6 +381,10 @@ pub struct ConfigToml {
/// UI/output. Defaults to `false`.
pub hide_agent_reasoning: Option<bool>,
/// When set to `true`, `AgentReasoningRawContentEvent` events will be shown in the UI/output.
/// Defaults to `false`.
pub show_raw_agent_reasoning: Option<bool>,
pub model_reasoning_effort: Option<ReasoningEffort>,
pub model_reasoning_summary: Option<ReasoningSummary>,
@@ -321,6 +393,22 @@ pub struct ConfigToml {
/// Base URL for requests to ChatGPT (as opposed to the OpenAI API).
pub chatgpt_base_url: Option<String>,
/// Experimental rollout resume path (absolute path to .jsonl; undocumented).
pub experimental_resume: Option<PathBuf>,
/// Experimental path to a file whose contents replace the built-in BASE_INSTRUCTIONS.
pub experimental_instructions_file: Option<PathBuf>,
/// The value for the `originator` header included with Responses API requests.
pub internal_originator: Option<String>,
pub projects: Option<HashMap<String, ProjectConfig>>,
}
#[derive(Deserialize, Debug, Clone, PartialEq, Eq)]
pub struct ProjectConfig {
pub trust_level: Option<String>,
}
impl ConfigToml {
@@ -332,15 +420,52 @@ impl ConfigToml {
match resolved_sandbox_mode {
SandboxMode::ReadOnly => SandboxPolicy::new_read_only_policy(),
SandboxMode::WorkspaceWrite => match self.sandbox_workspace_write.as_ref() {
Some(s) => SandboxPolicy::WorkspaceWrite {
writable_roots: s.writable_roots.clone(),
network_access: s.network_access,
Some(SandboxWorkspaceWrite {
writable_roots,
network_access,
exclude_tmpdir_env_var,
exclude_slash_tmp,
}) => SandboxPolicy::WorkspaceWrite {
writable_roots: writable_roots.clone(),
network_access: *network_access,
exclude_tmpdir_env_var: *exclude_tmpdir_env_var,
exclude_slash_tmp: *exclude_slash_tmp,
},
None => SandboxPolicy::new_workspace_write_policy(),
},
SandboxMode::DangerFullAccess => SandboxPolicy::DangerFullAccess,
}
}
pub fn is_cwd_trusted(&self, resolved_cwd: &Path) -> bool {
let projects = self.projects.clone().unwrap_or_default();
projects
.get(&resolved_cwd.to_string_lossy().to_string())
.map(|p| p.trust_level.clone().unwrap_or("".to_string()) == "trusted")
.unwrap_or(false)
}
pub fn get_config_profile(
&self,
override_profile: Option<String>,
) -> Result<ConfigProfile, std::io::Error> {
let profile = override_profile.or_else(|| self.profile.clone());
match profile {
Some(key) => {
if let Some(profile) = self.profiles.get(key.as_str()) {
return Ok(profile.clone());
}
Err(std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("config profile `{key}` not found"),
))
}
None => Ok(ConfigProfile::default()),
}
}
}
/// Optional overrides for user configuration (e.g., from CLI flags).
@@ -353,6 +478,10 @@ pub struct ConfigOverrides {
pub model_provider: Option<String>,
pub config_profile: Option<String>,
pub codex_linux_sandbox_exe: Option<PathBuf>,
pub base_instructions: Option<String>,
pub include_plan_tool: Option<bool>,
pub disable_response_storage: Option<bool>,
pub show_raw_agent_reasoning: Option<bool>,
}
impl Config {
@@ -363,7 +492,7 @@ impl Config {
overrides: ConfigOverrides,
codex_home: PathBuf,
) -> std::io::Result<Self> {
let instructions = Self::load_instructions(Some(&codex_home));
let user_instructions = Self::load_instructions(Some(&codex_home));
// Destructure ConfigOverrides fully to ensure all overrides are applied.
let ConfigOverrides {
@@ -374,6 +503,10 @@ impl Config {
model_provider,
config_profile: config_profile_key,
codex_linux_sandbox_exe,
base_instructions,
include_plan_tool,
disable_response_storage,
show_raw_agent_reasoning,
} = overrides;
let config_profile = match config_profile_key.as_ref().or(cfg.profile.as_ref()) {
@@ -439,7 +572,19 @@ impl Config {
.or(config_profile.model)
.or(cfg.model)
.unwrap_or_else(default_model);
let openai_model_info = get_model_info(&model);
let model_family = find_family_for_model(&model).unwrap_or_else(|| {
let supports_reasoning_summaries =
cfg.model_supports_reasoning_summaries.unwrap_or(false);
ModelFamily {
slug: model.clone(),
family: model.clone(),
needs_special_apply_patch_instructions: false,
supports_reasoning_summaries,
uses_local_shell_tool: false,
}
});
let openai_model_info = get_model_info(&model_family);
let model_context_window = cfg
.model_context_window
.or_else(|| openai_model_info.as_ref().map(|info| info.context_window));
@@ -448,8 +593,23 @@ impl Config {
.as_ref()
.map(|info| info.max_output_tokens)
});
let experimental_resume = cfg.experimental_resume;
// Load base instructions override from a file if specified. If the
// path is relative, resolve it against the effective cwd so the
// behaviour matches other path-like config values.
let experimental_instructions_path = config_profile
.experimental_instructions_file
.as_ref()
.or(cfg.experimental_instructions_file.as_ref());
let file_base_instructions =
Self::get_base_instructions(experimental_instructions_path, &resolved_cwd)?;
let base_instructions = base_instructions.or(file_base_instructions);
let config = Self {
model,
model_family,
model_context_window,
model_max_output_tokens,
model_provider_id,
@@ -464,9 +624,11 @@ impl Config {
disable_response_storage: config_profile
.disable_response_storage
.or(cfg.disable_response_storage)
.or(disable_response_storage)
.unwrap_or(false),
notify: cfg.notify,
instructions,
user_instructions,
base_instructions,
mcp_servers: cfg.mcp_servers,
model_providers,
project_doc_max_bytes: cfg.project_doc_max_bytes.unwrap_or(PROJECT_DOC_MAX_BYTES),
@@ -477,6 +639,10 @@ impl Config {
codex_linux_sandbox_exe,
hide_agent_reasoning: cfg.hide_agent_reasoning.unwrap_or(false),
show_raw_agent_reasoning: cfg
.show_raw_agent_reasoning
.or(show_raw_agent_reasoning)
.unwrap_or(false),
model_reasoning_effort: config_profile
.model_reasoning_effort
.or(cfg.model_reasoning_effort)
@@ -486,14 +652,14 @@ impl Config {
.or(cfg.model_reasoning_summary)
.unwrap_or_default(),
model_supports_reasoning_summaries: cfg
.model_supports_reasoning_summaries
.unwrap_or(false),
chatgpt_base_url: config_profile
.chatgpt_base_url
.or(cfg.chatgpt_base_url)
.unwrap_or("https://chatgpt.com/backend-api/".to_string()),
experimental_resume,
include_plan_tool: include_plan_tool.unwrap_or(false),
internal_originator: cfg.internal_originator,
};
Ok(config)
}
@@ -504,7 +670,7 @@ impl Config {
None => return None,
};
p.push("instructions.md");
p.push("AGENTS.md");
std::fs::read_to_string(&p).ok().and_then(|s| {
let s = s.trim();
if s.is_empty() {
@@ -514,6 +680,48 @@ impl Config {
}
})
}
fn get_base_instructions(
path: Option<&PathBuf>,
cwd: &Path,
) -> std::io::Result<Option<String>> {
let p = match path.as_ref() {
None => return Ok(None),
Some(p) => p,
};
// Resolve relative paths against the provided cwd to make CLI
// overrides consistent regardless of where the process was launched
// from.
let full_path = if p.is_relative() {
cwd.join(p)
} else {
p.to_path_buf()
};
let contents = std::fs::read_to_string(&full_path).map_err(|e| {
std::io::Error::new(
e.kind(),
format!(
"failed to read experimental instructions file {}: {e}",
full_path.display()
),
)
})?;
let s = contents.trim().to_string();
if s.is_empty() {
Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
format!(
"experimental instructions file is empty: {}",
full_path.display()
),
))
} else {
Ok(Some(s))
}
}
}
fn default_model() -> String {
@@ -528,7 +736,7 @@ fn default_model() -> String {
/// function will Err if the path does not exist.
/// - If `CODEX_HOME` is not set, this function does not verify that the
/// directory exists.
fn find_codex_home() -> std::io::Result<PathBuf> {
pub fn find_codex_home() -> std::io::Result<PathBuf> {
// Honor the `CODEX_HOME` environment variable when it is set to allow users
// (and tests) to override the default location.
if let Ok(val) = std::env::var("CODEX_HOME") {
@@ -632,8 +840,10 @@ sandbox_mode = "workspace-write"
[sandbox_workspace_write]
writable_roots = [
"/tmp",
"/my/workspace",
]
exclude_tmpdir_env_var = true
exclude_slash_tmp = true
"#;
let sandbox_workspace_write_cfg = toml::from_str::<ConfigToml>(sandbox_workspace_write)
@@ -641,8 +851,10 @@ writable_roots = [
let sandbox_mode_override = None;
assert_eq!(
SandboxPolicy::WorkspaceWrite {
writable_roots: vec![PathBuf::from("/tmp")],
writable_roots: vec![PathBuf::from("/my/workspace")],
network_access: false,
exclude_tmpdir_env_var: true,
exclude_slash_tmp: true,
},
sandbox_workspace_write_cfg.derive_sandbox_policy(sandbox_mode_override)
);
@@ -682,6 +894,9 @@ name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
request_max_retries = 4 # retry failed HTTP requests
stream_max_retries = 10 # retry dropped SSE streams
stream_idle_timeout_ms = 300000 # 5m idle timeout
[profiles.o3]
model = "o3"
@@ -715,13 +930,17 @@ disable_response_storage = true
let openai_chat_completions_provider = ModelProviderInfo {
name: "OpenAI using Chat Completions".to_string(),
base_url: "https://api.openai.com/v1".to_string(),
base_url: Some("https://api.openai.com/v1".to_string()),
env_key: Some("OPENAI_API_KEY".to_string()),
wire_api: crate::WireApi::Chat,
env_key_instructions: None,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(4),
stream_max_retries: Some(10),
stream_idle_timeout_ms: Some(300_000),
requires_openai_auth: false,
};
let model_provider_map = {
let mut model_provider_map = built_in_model_providers();
@@ -752,7 +971,7 @@ disable_response_storage = true
///
/// 1. custom command-line argument, e.g. `--model o3`
/// 2. as part of a profile, where the `--profile` is specified via a CLI
/// (or in the config file itelf)
/// (or in the config file itself)
/// 3. as an entry in `config.toml`, e.g. `model = "o3"`
/// 4. the default value for a required field defined in code, e.g.,
/// `crate::flags::OPENAI_DEFAULT_MODEL`
@@ -776,6 +995,7 @@ disable_response_storage = true
assert_eq!(
Config {
model: "o3".to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_provider_id: "openai".to_string(),
@@ -784,7 +1004,7 @@ disable_response_storage = true
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: false,
instructions: None,
user_instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
@@ -796,10 +1016,14 @@ disable_response_storage = true
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
show_raw_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::High,
model_reasoning_summary: ReasoningSummary::Detailed,
model_supports_reasoning_summaries: false,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
experimental_resume: None,
base_instructions: None,
include_plan_tool: false,
internal_originator: None,
},
o3_profile_config
);
@@ -822,6 +1046,7 @@ disable_response_storage = true
)?;
let expected_gpt3_profile_config = Config {
model: "gpt-3.5-turbo".to_string(),
model_family: find_family_for_model("gpt-3.5-turbo").expect("known model slug"),
model_context_window: Some(16_385),
model_max_output_tokens: Some(4_096),
model_provider_id: "openai-chat-completions".to_string(),
@@ -830,7 +1055,7 @@ disable_response_storage = true
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: false,
instructions: None,
user_instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
@@ -842,10 +1067,14 @@ disable_response_storage = true
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
show_raw_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
model_supports_reasoning_summaries: false,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
experimental_resume: None,
base_instructions: None,
include_plan_tool: false,
internal_originator: None,
};
assert_eq!(expected_gpt3_profile_config, gpt3_profile_config);
@@ -883,6 +1112,7 @@ disable_response_storage = true
)?;
let expected_zdr_profile_config = Config {
model: "o3".to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_provider_id: "openai".to_string(),
@@ -891,7 +1121,7 @@ disable_response_storage = true
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: true,
instructions: None,
user_instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
@@ -903,10 +1133,14 @@ disable_response_storage = true
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
show_raw_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
model_supports_reasoning_summaries: false,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
experimental_resume: None,
base_instructions: None,
include_plan_tool: false,
internal_originator: None,
};
assert_eq!(expected_zdr_profile_config, zdr_profile_config);

View File

@@ -1,4 +1,5 @@
use serde::Deserialize;
use std::path::PathBuf;
use crate::config_types::ReasoningEffort;
use crate::config_types::ReasoningSummary;
@@ -17,4 +18,5 @@ pub struct ConfigProfile {
pub model_reasoning_effort: Option<ReasoningEffort>,
pub model_reasoning_summary: Option<ReasoningSummary>,
pub chatgpt_base_url: Option<String>,
pub experimental_instructions_file: Option<PathBuf>,
}

View File

@@ -76,22 +76,9 @@ pub enum HistoryPersistence {
/// Collection of settings that are specific to the TUI.
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
pub struct Tui {
/// By default, mouse capture is enabled in the TUI so that it is possible
/// to scroll the conversation history with a mouse. This comes at the cost
/// of not being able to use the mouse to select text in the TUI.
/// (Most terminals support a modifier key to allow this. For example,
/// text selection works in iTerm if you hold down the `Option` key while
/// clicking and dragging.)
///
/// Setting this option to `true` disables mouse capture, so scrolling with
/// the mouse is not possible, though the keyboard shortcuts e.g. `b` and
/// `space` still work. This allows the user to select text in the TUI
/// using the mouse without needing to hold down a modifier key.
pub disable_mouse_capture: bool,
}
pub struct Tui {}
#[derive(Deserialize, Debug, Clone, Copy, PartialEq, Default)]
#[derive(Deserialize, Debug, Clone, Copy, PartialEq, Default, Serialize)]
#[serde(rename_all = "kebab-case")]
pub enum SandboxMode {
#[serde(rename = "read-only")]
@@ -106,11 +93,15 @@ pub enum SandboxMode {
}
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
pub struct SandboxWorkplaceWrite {
pub struct SandboxWorkspaceWrite {
#[serde(default)]
pub writable_roots: Vec<PathBuf>,
#[serde(default)]
pub network_access: bool,
#[serde(default)]
pub exclude_tmpdir_env_var: bool,
#[serde(default)]
pub exclude_slash_tmp: bool,
}
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
@@ -118,10 +109,10 @@ pub struct SandboxWorkplaceWrite {
pub enum ShellEnvironmentPolicyInherit {
/// "Core" environment variables for the platform. On UNIX, this would
/// include HOME, LOGNAME, PATH, SHELL, and USER, among others.
#[default]
Core,
/// Inherits the full environment from the parent process.
#[default]
All,
/// Do not inherit any environment variables from the parent process.
@@ -143,6 +134,8 @@ pub struct ShellEnvironmentPolicyToml {
/// List of regular expressions.
pub include_only: Option<Vec<String>>,
pub experimental_use_profile: Option<bool>,
}
pub type EnvironmentVariablePattern = WildMatchPattern<'*', '?'>;
@@ -171,11 +164,15 @@ pub struct ShellEnvironmentPolicy {
/// Environment variable names to retain in the environment.
pub include_only: Vec<EnvironmentVariablePattern>,
/// If true, the shell profile will be used to run the command.
pub use_profile: bool,
}
impl From<ShellEnvironmentPolicyToml> for ShellEnvironmentPolicy {
fn from(toml: ShellEnvironmentPolicyToml) -> Self {
let inherit = toml.inherit.unwrap_or(ShellEnvironmentPolicyInherit::Core);
// Default to inheriting the full environment when not specified.
let inherit = toml.inherit.unwrap_or(ShellEnvironmentPolicyInherit::All);
let ignore_default_excludes = toml.ignore_default_excludes.unwrap_or(false);
let exclude = toml
.exclude
@@ -190,6 +187,7 @@ impl From<ShellEnvironmentPolicyToml> for ShellEnvironmentPolicy {
.into_iter()
.map(|s| EnvironmentVariablePattern::new_case_insensitive(&s))
.collect();
let use_profile = toml.experimental_use_profile.unwrap_or(false);
Self {
inherit,
@@ -197,6 +195,7 @@ impl From<ShellEnvironmentPolicyToml> for ShellEnvironmentPolicy {
exclude,
r#set,
include_only,
use_profile,
}
}
}

View File

@@ -1,12 +1,7 @@
use crate::models::ResponseItem;
/// Transcript of conversation history that is needed:
/// - for ZDR clients for which previous_response_id is not available, so we
/// must include the transcript with every API call. This must include each
/// `function_call` and its corresponding `function_call_output`.
/// - for clients using the "chat completions" API as opposed to the
/// "responses" API.
#[derive(Debug, Clone)]
/// Transcript of conversation history
#[derive(Debug, Clone, Default)]
pub(crate) struct ConversationHistory {
/// The oldest items are at the beginning of the vector.
items: Vec<ResponseItem>,
@@ -29,12 +24,83 @@ impl ConversationHistory {
I::Item: std::ops::Deref<Target = ResponseItem>,
{
for item in items {
if is_api_message(&item) {
// Note agent-loop.ts also does filtering on some of the fields.
self.items.push(item.clone());
if !is_api_message(&item) {
continue;
}
// Merge adjacent assistant messages into a single history entry.
// This prevents duplicates when a partial assistant message was
// streamed into history earlier in the turn and the final full
// message is recorded at turn end.
match (&*item, self.items.last_mut()) {
(
ResponseItem::Message {
role: new_role,
content: new_content,
..
},
Some(ResponseItem::Message {
role: last_role,
content: last_content,
..
}),
) if new_role == "assistant" && last_role == "assistant" => {
append_text_content(last_content, new_content);
}
_ => {
self.items.push(item.clone());
}
}
}
}
/// Append a text `delta` to the latest assistant message, creating a new
/// assistant entry if none exists yet (e.g. first delta for this turn).
pub(crate) fn append_assistant_text(&mut self, delta: &str) {
match self.items.last_mut() {
Some(ResponseItem::Message { role, content, .. }) if role == "assistant" => {
append_text_delta(content, delta);
}
_ => {
// Start a new assistant message with the delta.
self.items.push(ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![crate::models::ContentItem::OutputText {
text: delta.to_string(),
}],
});
}
}
}
pub(crate) fn keep_last_messages(&mut self, n: usize) {
if n == 0 {
self.items.clear();
return;
}
// Collect the last N message items (assistant/user), newest to oldest.
let mut kept: Vec<ResponseItem> = Vec::with_capacity(n);
for item in self.items.iter().rev() {
if let ResponseItem::Message { role, content, .. } = item {
kept.push(ResponseItem::Message {
// we need to remove the id or the model will complain that messages are sent without
// their reasonings
id: None,
role: role.clone(),
content: content.clone(),
});
if kept.len() == n {
break;
}
}
}
// Preserve chronological order (oldest to newest) within the kept slice.
kept.reverse();
self.items = kept;
}
}
/// Anything that is not a system message or "reasoning" message is considered
@@ -44,7 +110,145 @@ fn is_api_message(message: &ResponseItem) -> bool {
ResponseItem::Message { role, .. } => role.as_str() != "system",
ResponseItem::FunctionCallOutput { .. }
| ResponseItem::FunctionCall { .. }
| ResponseItem::LocalShellCall { .. } => true,
ResponseItem::Reasoning { .. } | ResponseItem::Other => false,
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. } => true,
ResponseItem::Other => false,
}
}
/// Helper to append the textual content from `src` into `dst` in place.
fn append_text_content(
dst: &mut Vec<crate::models::ContentItem>,
src: &Vec<crate::models::ContentItem>,
) {
for c in src {
if let crate::models::ContentItem::OutputText { text } = c {
append_text_delta(dst, text);
}
}
}
/// Append a single text delta to the last OutputText item in `content`, or
/// push a new OutputText item if none exists.
fn append_text_delta(content: &mut Vec<crate::models::ContentItem>, delta: &str) {
if let Some(crate::models::ContentItem::OutputText { text }) = content
.iter_mut()
.rev()
.find(|c| matches!(c, crate::models::ContentItem::OutputText { .. }))
{
text.push_str(delta);
} else {
content.push(crate::models::ContentItem::OutputText {
text: delta.to_string(),
});
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::models::ContentItem;
fn assistant_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: text.to_string(),
}],
}
}
fn user_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::OutputText {
text: text.to_string(),
}],
}
}
#[test]
fn merges_adjacent_assistant_messages() {
let mut h = ConversationHistory::default();
let a1 = assistant_msg("Hello");
let a2 = assistant_msg(", world!");
h.record_items([&a1, &a2]);
let items = h.contents();
assert_eq!(
items,
vec![ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: "Hello, world!".to_string()
}]
}]
);
}
#[test]
fn append_assistant_text_creates_and_appends() {
let mut h = ConversationHistory::default();
h.append_assistant_text("Hello");
h.append_assistant_text(", world");
// Now record a final full assistant message and verify it merges.
let final_msg = assistant_msg("!");
h.record_items([&final_msg]);
let items = h.contents();
assert_eq!(
items,
vec![ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: "Hello, world!".to_string()
}]
}]
);
}
#[test]
fn filters_non_api_messages() {
let mut h = ConversationHistory::default();
// System message is not an API message; Other is ignored.
let system = ResponseItem::Message {
id: None,
role: "system".to_string(),
content: vec![ContentItem::OutputText {
text: "ignored".to_string(),
}],
};
h.record_items([&system, &ResponseItem::Other]);
// User and assistant should be retained.
let u = user_msg("hi");
let a = assistant_msg("hello");
h.record_items([&u, &a]);
let items = h.contents();
assert_eq!(
items,
vec![
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::OutputText {
text: "hi".to_string()
}]
},
ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: "hello".to_string()
}]
}
]
);
}
}

View File

@@ -62,6 +62,17 @@ pub enum CodexErr {
#[error("unexpected status {0}: {1}")]
UnexpectedStatus(StatusCode, String),
#[error("Usage limit has been reached")]
UsageLimitReached,
#[error("Usage not included with the plan")]
UsageNotIncluded,
#[error(
"Were currently experiencing high demand, which may cause temporary errors. Were adding capacity in East and West Europe to restore normal service."
)]
InternalServerError,
/// Retry limit exceeded.
#[error("exceeded retry limit, last status: {0}")]
RetryLimit(StatusCode),
@@ -132,3 +143,10 @@ impl CodexErr {
(self as &dyn std::any::Any).downcast_ref::<T>()
}
}
pub fn get_error_message_ui(e: &CodexErr) -> String {
match e {
CodexErr::Sandbox(SandboxErr::Denied(_, _, stderr)) => stderr.to_string(),
_ => e.to_string(),
}
}

View File

@@ -6,22 +6,29 @@ use std::io;
use std::path::Path;
use std::path::PathBuf;
use std::process::ExitStatus;
use std::process::Stdio;
use std::sync::Arc;
use std::time::Duration;
use std::time::Instant;
use async_channel::Sender;
use tokio::io::AsyncRead;
use tokio::io::AsyncReadExt;
use tokio::io::BufReader;
use tokio::process::Child;
use tokio::process::Command;
use tokio::sync::Notify;
use crate::error::CodexErr;
use crate::error::Result;
use crate::error::SandboxErr;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandOutputDeltaEvent;
use crate::protocol::ExecOutputStream;
use crate::protocol::SandboxPolicy;
use crate::seatbelt::spawn_command_under_seatbelt;
use crate::spawn::StdioPolicy;
use crate::spawn::spawn_child_async;
use serde_bytes::ByteBuf;
// Maximum we send for each stream, which is either:
// - 10KiB OR
@@ -36,30 +43,20 @@ const DEFAULT_TIMEOUT_MS: u64 = 10_000;
const SIGKILL_CODE: i32 = 9;
const TIMEOUT_CODE: i32 = 64;
const MACOS_SEATBELT_BASE_POLICY: &str = include_str!("seatbelt_base_policy.sbpl");
/// When working with `sandbox-exec`, only consider `sandbox-exec` in `/usr/bin`
/// to defend against an attacker trying to inject a malicious version on the
/// PATH. If /usr/bin/sandbox-exec has been tampered with, then the attacker
/// already has root access.
const MACOS_PATH_TO_SEATBELT_EXECUTABLE: &str = "/usr/bin/sandbox-exec";
/// Experimental environment variable that will be set to some non-empty value
/// if both of the following are true:
///
/// 1. The process was spawned by Codex as part of a shell tool call.
/// 2. SandboxPolicy.has_full_network_access() was false for the tool call.
///
/// We may try to have just one environment variable for all sandboxing
/// attributes, so this may change in the future.
pub const CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR: &str = "CODEX_SANDBOX_NETWORK_DISABLED";
#[derive(Debug, Clone)]
pub struct ExecParams {
pub command: Vec<String>,
pub cwd: PathBuf,
pub timeout_ms: Option<u64>,
pub env: HashMap<String, String>,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
}
impl ExecParams {
pub fn timeout_duration(&self) -> Duration {
Duration::from_millis(self.timeout_ms.unwrap_or(DEFAULT_TIMEOUT_MS))
}
}
#[derive(Clone, Copy, Debug, PartialEq)]
@@ -73,23 +70,30 @@ pub enum SandboxType {
LinuxSeccomp,
}
#[derive(Clone)]
pub struct StdoutStream {
pub sub_id: String,
pub call_id: String,
pub tx_event: Sender<Event>,
}
pub async fn process_exec_tool_call(
params: ExecParams,
sandbox_type: SandboxType,
ctrl_c: Arc<Notify>,
sandbox_policy: &SandboxPolicy,
codex_linux_sandbox_exe: &Option<PathBuf>,
stdout_stream: Option<StdoutStream>,
) -> Result<ExecToolCallOutput> {
let start = Instant::now();
let raw_output_result = match sandbox_type {
SandboxType::None => exec(params, sandbox_policy, ctrl_c).await,
let raw_output_result: std::result::Result<RawExecToolCallOutput, CodexErr> = match sandbox_type
{
SandboxType::None => exec(params, sandbox_policy, ctrl_c, stdout_stream.clone()).await,
SandboxType::MacosSeatbelt => {
let timeout = params.timeout_duration();
let ExecParams {
command,
cwd,
timeout_ms,
env,
command, cwd, env, ..
} = params;
let child = spawn_command_under_seatbelt(
command,
@@ -99,14 +103,12 @@ pub async fn process_exec_tool_call(
env,
)
.await?;
consume_truncated_output(child, ctrl_c, timeout_ms).await
consume_truncated_output(child, ctrl_c, timeout, stdout_stream.clone()).await
}
SandboxType::LinuxSeccomp => {
let timeout = params.timeout_duration();
let ExecParams {
command,
cwd,
timeout_ms,
env,
command, cwd, env, ..
} = params;
let codex_linux_sandbox_exe = codex_linux_sandbox_exe
@@ -122,7 +124,7 @@ pub async fn process_exec_tool_call(
)
.await?;
consume_truncated_output(child, ctrl_c, timeout_ms).await
consume_truncated_output(child, ctrl_c, timeout, stdout_stream).await
}
};
let duration = start.elapsed();
@@ -142,11 +144,7 @@ pub async fn process_exec_tool_call(
let exit_code = raw_output.exit_status.code().unwrap_or(-1);
// NOTE(ragona): This is much less restrictive than the previous check. If we exec
// a command, and it returns anything other than success, we assume that it may have
// been a sandboxing error and allow the user to retry. (The user of course may choose
// not to retry, or in a non-interactive mode, would automatically reject the approval.)
if exit_code != 0 && sandbox_type != SandboxType::None {
if exit_code != 0 && is_likely_sandbox_denied(sandbox_type, exit_code) {
return Err(CodexErr::Sandbox(SandboxErr::Denied(
exit_code, stdout, stderr,
)));
@@ -166,27 +164,6 @@ pub async fn process_exec_tool_call(
}
}
pub async fn spawn_command_under_seatbelt(
command: Vec<String>,
sandbox_policy: &SandboxPolicy,
cwd: PathBuf,
stdio_policy: StdioPolicy,
env: HashMap<String, String>,
) -> std::io::Result<Child> {
let args = create_seatbelt_command_args(command, sandbox_policy, &cwd);
let arg0 = None;
spawn_child_async(
PathBuf::from(MACOS_PATH_TO_SEATBELT_EXECUTABLE),
args,
arg0,
cwd,
sandbox_policy,
stdio_policy,
env,
)
.await
}
/// Spawn a shell tool command under the Linux Landlock+seccomp sandbox helper
/// (codex-linux-sandbox).
///
@@ -246,63 +223,24 @@ fn create_linux_sandbox_command_args(
linux_cmd
}
fn create_seatbelt_command_args(
command: Vec<String>,
sandbox_policy: &SandboxPolicy,
cwd: &Path,
) -> Vec<String> {
let (file_write_policy, extra_cli_args) = {
if sandbox_policy.has_full_disk_write_access() {
// Allegedly, this is more permissive than `(allow file-write*)`.
(
r#"(allow file-write* (regex #"^/"))"#.to_string(),
Vec::<String>::new(),
)
} else {
let writable_roots = sandbox_policy.get_writable_roots_with_cwd(cwd);
let (writable_folder_policies, cli_args): (Vec<String>, Vec<String>) = writable_roots
.iter()
.enumerate()
.map(|(index, root)| {
let param_name = format!("WRITABLE_ROOT_{index}");
let policy: String = format!("(subpath (param \"{param_name}\"))");
let cli_arg = format!("-D{param_name}={}", root.to_string_lossy());
(policy, cli_arg)
})
.unzip();
if writable_folder_policies.is_empty() {
("".to_string(), Vec::<String>::new())
} else {
let file_write_policy = format!(
"(allow file-write*\n{}\n)",
writable_folder_policies.join(" ")
);
(file_write_policy, cli_args)
}
}
};
/// We don't have a fully deterministic way to tell if our command failed
/// because of the sandbox - a command in the user's zshrc file might hit an
/// error, but the command itself might fail or succeed for other reasons.
/// For now, we conservatively check for 'command not found' (exit code 127),
/// and can add additional cases as necessary.
fn is_likely_sandbox_denied(sandbox_type: SandboxType, exit_code: i32) -> bool {
if sandbox_type == SandboxType::None {
return false;
}
let file_read_policy = if sandbox_policy.has_full_disk_read_access() {
"; allow read-only file operations\n(allow file-read*)"
} else {
""
};
// Quick rejects: well-known non-sandbox shell exit codes
// 127: command not found, 2: misuse of shell builtins
if exit_code == 127 {
return false;
}
// TODO(mbolin): apply_patch calls must also honor the SandboxPolicy.
let network_policy = if sandbox_policy.has_full_network_access() {
"(allow network-outbound)\n(allow network-inbound)\n(allow system-socket)"
} else {
""
};
let full_policy = format!(
"{MACOS_SEATBELT_BASE_POLICY}\n{file_read_policy}\n{file_write_policy}\n{network_policy}"
);
let mut seatbelt_args: Vec<String> = vec!["-p".to_string(), full_policy];
seatbelt_args.extend(extra_cli_args);
seatbelt_args.push("--".to_string());
seatbelt_args.extend(command);
seatbelt_args
// For all other cases, we assume the sandbox is the cause
true
}
#[derive(Debug)]
@@ -321,15 +259,16 @@ pub struct ExecToolCallOutput {
}
async fn exec(
ExecParams {
command,
cwd,
timeout_ms,
env,
}: ExecParams,
params: ExecParams,
sandbox_policy: &SandboxPolicy,
ctrl_c: Arc<Notify>,
stdout_stream: Option<StdoutStream>,
) -> Result<RawExecToolCallOutput> {
let timeout = params.timeout_duration();
let ExecParams {
command, cwd, env, ..
} = params;
let (program, args) = command.split_first().ok_or_else(|| {
CodexErr::Io(io::Error::new(
io::ErrorKind::InvalidInput,
@@ -347,62 +286,7 @@ async fn exec(
env,
)
.await?;
consume_truncated_output(child, ctrl_c, timeout_ms).await
}
#[derive(Debug, Clone, Copy)]
pub enum StdioPolicy {
RedirectForShellTool,
Inherit,
}
/// Spawns the appropriate child process for the ExecParams and SandboxPolicy,
/// ensuring the args and environment variables used to create the `Command`
/// (and `Child`) honor the configuration.
///
/// For now, we take `SandboxPolicy` as a parameter to spawn_child() because
/// we need to determine whether to set the
/// `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` environment variable.
async fn spawn_child_async(
program: PathBuf,
args: Vec<String>,
#[cfg_attr(not(unix), allow(unused_variables))] arg0: Option<&str>,
cwd: PathBuf,
sandbox_policy: &SandboxPolicy,
stdio_policy: StdioPolicy,
env: HashMap<String, String>,
) -> std::io::Result<Child> {
let mut cmd = Command::new(&program);
#[cfg(unix)]
cmd.arg0(arg0.map_or_else(|| program.to_string_lossy().to_string(), String::from));
cmd.args(args);
cmd.current_dir(cwd);
cmd.env_clear();
cmd.envs(env);
if !sandbox_policy.has_full_network_access() {
cmd.env(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR, "1");
}
match stdio_policy {
StdioPolicy::RedirectForShellTool => {
// Do not create a file descriptor for stdin because otherwise some
// commands may hang forever waiting for input. For example, ripgrep has
// a heuristic where it may try to read from stdin as explained here:
// https://github.com/BurntSushi/ripgrep/blob/e2362d4d5185d02fa857bf381e7bd52e66fafc73/crates/core/flags/hiargs.rs#L1101-L1103
cmd.stdin(Stdio::null());
cmd.stdout(Stdio::piped()).stderr(Stdio::piped());
}
StdioPolicy::Inherit => {
// Inherit stdin, stdout, and stderr from the parent process.
cmd.stdin(Stdio::inherit())
.stdout(Stdio::inherit())
.stderr(Stdio::inherit());
}
}
cmd.kill_on_drop(true).spawn()
consume_truncated_output(child, ctrl_c, timeout, stdout_stream).await
}
/// Consumes the output of a child process, truncating it so it is suitable for
@@ -410,7 +294,8 @@ async fn spawn_child_async(
pub(crate) async fn consume_truncated_output(
mut child: Child,
ctrl_c: Arc<Notify>,
timeout_ms: Option<u64>,
timeout: Duration,
stdout_stream: Option<StdoutStream>,
) -> Result<RawExecToolCallOutput> {
// Both stdout and stderr were configured with `Stdio::piped()`
// above, therefore `take()` should normally return `Some`. If it doesn't
@@ -431,15 +316,18 @@ pub(crate) async fn consume_truncated_output(
BufReader::new(stdout_reader),
MAX_STREAM_OUTPUT,
MAX_STREAM_OUTPUT_LINES,
stdout_stream.clone(),
false,
));
let stderr_handle = tokio::spawn(read_capped(
BufReader::new(stderr_reader),
MAX_STREAM_OUTPUT,
MAX_STREAM_OUTPUT_LINES,
stdout_stream.clone(),
true,
));
let interrupted = ctrl_c.notified();
let timeout = Duration::from_millis(timeout_ms.unwrap_or(DEFAULT_TIMEOUT_MS));
let exit_status = tokio::select! {
result = tokio::time::timeout(timeout, child.wait()) => {
match result {
@@ -469,10 +357,12 @@ pub(crate) async fn consume_truncated_output(
})
}
async fn read_capped<R: AsyncRead + Unpin>(
async fn read_capped<R: AsyncRead + Unpin + Send + 'static>(
mut reader: R,
max_output: usize,
max_lines: usize,
stream: Option<StdoutStream>,
is_stderr: bool,
) -> io::Result<Vec<u8>> {
let mut buf = Vec::with_capacity(max_output.min(8 * 1024));
let mut tmp = [0u8; 8192];
@@ -486,6 +376,25 @@ async fn read_capped<R: AsyncRead + Unpin>(
break;
}
if let Some(stream) = &stream {
let chunk = tmp[..n].to_vec();
let msg = EventMsg::ExecCommandOutputDelta(ExecCommandOutputDeltaEvent {
call_id: stream.call_id.clone(),
stream: if is_stderr {
ExecOutputStream::Stderr
} else {
ExecOutputStream::Stdout
},
chunk: ByteBuf::from(chunk),
});
let event = Event {
id: stream.sub_id.clone(),
msg,
};
#[allow(clippy::let_unit_value)]
let _ = stream.tx_event.send(event).await;
}
// Copy into the buffer only while we still have byte and line budget.
if remaining_bytes > 0 && remaining_lines > 0 {
let mut copy_len = 0;

View File

@@ -3,7 +3,6 @@ use std::time::Duration;
use env_flags::env_flags;
env_flags! {
pub OPENAI_DEFAULT_MODEL: &str = "codex-mini-latest";
pub OPENAI_API_BASE: &str = "https://api.openai.com/v1";
/// Fallback when the provider-specific key is not set.
@@ -11,14 +10,6 @@ env_flags! {
pub OPENAI_TIMEOUT_MS: Duration = Duration::from_millis(300_000), |value| {
value.parse().map(Duration::from_millis)
};
pub OPENAI_REQUEST_MAX_RETRIES: u64 = 4;
pub OPENAI_STREAM_MAX_RETRIES: u64 = 10;
// We generally don't want to disconnect; this updates the timeout to be five minutes
// which matches the upstream typescript codex impl.
pub OPENAI_STREAM_IDLE_TIMEOUT_MS: Duration = Duration::from_millis(300_000), |value| {
value.parse().map(Duration::from_millis)
};
/// Fixture path for offline tests (see client.rs).
pub CODEX_RS_SSE_FIXTURE: Option<&str> = None;

View File

@@ -0,0 +1,316 @@
use std::path::Path;
use serde::Deserialize;
use serde::Serialize;
use tokio::process::Command;
use tokio::time::Duration as TokioDuration;
use tokio::time::timeout;
/// Timeout for git commands to prevent freezing on large repositories
const GIT_COMMAND_TIMEOUT: TokioDuration = TokioDuration::from_secs(5);
#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct GitInfo {
/// Current commit hash (SHA)
#[serde(skip_serializing_if = "Option::is_none")]
pub commit_hash: Option<String>,
/// Current branch name
#[serde(skip_serializing_if = "Option::is_none")]
pub branch: Option<String>,
/// Repository URL (if available from remote)
#[serde(skip_serializing_if = "Option::is_none")]
pub repository_url: Option<String>,
}
/// Collect git repository information from the given working directory using command-line git.
/// Returns None if no git repository is found or if git operations fail.
/// Uses timeouts to prevent freezing on large repositories.
/// All git commands (except the initial repo check) run in parallel for better performance.
pub async fn collect_git_info(cwd: &Path) -> Option<GitInfo> {
// Check if we're in a git repository first
let is_git_repo = run_git_command_with_timeout(&["rev-parse", "--git-dir"], cwd)
.await?
.status
.success();
if !is_git_repo {
return None;
}
// Run all git info collection commands in parallel
let (commit_result, branch_result, url_result) = tokio::join!(
run_git_command_with_timeout(&["rev-parse", "HEAD"], cwd),
run_git_command_with_timeout(&["rev-parse", "--abbrev-ref", "HEAD"], cwd),
run_git_command_with_timeout(&["remote", "get-url", "origin"], cwd)
);
let mut git_info = GitInfo {
commit_hash: None,
branch: None,
repository_url: None,
};
// Process commit hash
if let Some(output) = commit_result {
if output.status.success() {
if let Ok(hash) = String::from_utf8(output.stdout) {
git_info.commit_hash = Some(hash.trim().to_string());
}
}
}
// Process branch name
if let Some(output) = branch_result {
if output.status.success() {
if let Ok(branch) = String::from_utf8(output.stdout) {
let branch = branch.trim();
if branch != "HEAD" {
git_info.branch = Some(branch.to_string());
}
}
}
}
// Process repository URL
if let Some(output) = url_result {
if output.status.success() {
if let Ok(url) = String::from_utf8(output.stdout) {
git_info.repository_url = Some(url.trim().to_string());
}
}
}
Some(git_info)
}
/// Run a git command with a timeout to prevent blocking on large repositories
async fn run_git_command_with_timeout(args: &[&str], cwd: &Path) -> Option<std::process::Output> {
let result = timeout(
GIT_COMMAND_TIMEOUT,
Command::new("git").args(args).current_dir(cwd).output(),
)
.await;
match result {
Ok(Ok(output)) => Some(output),
_ => None, // Timeout or error
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::expect_used)]
#![allow(clippy::unwrap_used)]
use super::*;
use std::fs;
use std::path::PathBuf;
use tempfile::TempDir;
// Helper function to create a test git repository
async fn create_test_git_repo(temp_dir: &TempDir) -> PathBuf {
let repo_path = temp_dir.path().to_path_buf();
let envs = vec![
("GIT_CONFIG_GLOBAL", "/dev/null"),
("GIT_CONFIG_NOSYSTEM", "1"),
];
// Initialize git repo
Command::new("git")
.envs(envs.clone())
.args(["init"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to init git repo");
// Configure git user (required for commits)
Command::new("git")
.envs(envs.clone())
.args(["config", "user.name", "Test User"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to set git user name");
Command::new("git")
.envs(envs.clone())
.args(["config", "user.email", "test@example.com"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to set git user email");
// Create a test file and commit it
let test_file = repo_path.join("test.txt");
fs::write(&test_file, "test content").expect("Failed to write test file");
Command::new("git")
.envs(envs.clone())
.args(["add", "."])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to add files");
Command::new("git")
.envs(envs.clone())
.args(["commit", "-m", "Initial commit"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to commit");
repo_path
}
#[tokio::test]
async fn test_collect_git_info_non_git_directory() {
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let result = collect_git_info(temp_dir.path()).await;
assert!(result.is_none());
}
#[tokio::test]
async fn test_collect_git_info_git_repository() {
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let repo_path = create_test_git_repo(&temp_dir).await;
let git_info = collect_git_info(&repo_path)
.await
.expect("Should collect git info from repo");
// Should have commit hash
assert!(git_info.commit_hash.is_some());
let commit_hash = git_info.commit_hash.unwrap();
assert_eq!(commit_hash.len(), 40); // SHA-1 hash should be 40 characters
assert!(commit_hash.chars().all(|c| c.is_ascii_hexdigit()));
// Should have branch (likely "main" or "master")
assert!(git_info.branch.is_some());
let branch = git_info.branch.unwrap();
assert!(branch == "main" || branch == "master");
// Repository URL might be None for local repos without remote
// This is acceptable behavior
}
#[tokio::test]
async fn test_collect_git_info_with_remote() {
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let repo_path = create_test_git_repo(&temp_dir).await;
// Add a remote origin
Command::new("git")
.args([
"remote",
"add",
"origin",
"https://github.com/example/repo.git",
])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to add remote");
let git_info = collect_git_info(&repo_path)
.await
.expect("Should collect git info from repo");
// Should have repository URL
assert_eq!(
git_info.repository_url,
Some("https://github.com/example/repo.git".to_string())
);
}
#[tokio::test]
async fn test_collect_git_info_detached_head() {
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let repo_path = create_test_git_repo(&temp_dir).await;
// Get the current commit hash
let output = Command::new("git")
.args(["rev-parse", "HEAD"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to get HEAD");
let commit_hash = String::from_utf8(output.stdout).unwrap().trim().to_string();
// Checkout the commit directly (detached HEAD)
Command::new("git")
.args(["checkout", &commit_hash])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to checkout commit");
let git_info = collect_git_info(&repo_path)
.await
.expect("Should collect git info from repo");
// Should have commit hash
assert!(git_info.commit_hash.is_some());
// Branch should be None for detached HEAD (since rev-parse --abbrev-ref HEAD returns "HEAD")
assert!(git_info.branch.is_none());
}
#[tokio::test]
async fn test_collect_git_info_with_branch() {
let temp_dir = TempDir::new().expect("Failed to create temp dir");
let repo_path = create_test_git_repo(&temp_dir).await;
// Create and checkout a new branch
Command::new("git")
.args(["checkout", "-b", "feature-branch"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to create branch");
let git_info = collect_git_info(&repo_path)
.await
.expect("Should collect git info from repo");
// Should have the new branch name
assert_eq!(git_info.branch, Some("feature-branch".to_string()));
}
#[test]
fn test_git_info_serialization() {
let git_info = GitInfo {
commit_hash: Some("abc123def456".to_string()),
branch: Some("main".to_string()),
repository_url: Some("https://github.com/example/repo.git".to_string()),
};
let json = serde_json::to_string(&git_info).expect("Should serialize GitInfo");
let parsed: serde_json::Value = serde_json::from_str(&json).expect("Should parse JSON");
assert_eq!(parsed["commit_hash"], "abc123def456");
assert_eq!(parsed["branch"], "main");
assert_eq!(
parsed["repository_url"],
"https://github.com/example/repo.git"
);
}
#[test]
fn test_git_info_serialization_with_nones() {
let git_info = GitInfo {
commit_hash: None,
branch: None,
repository_url: None,
};
let json = serde_json::to_string(&git_info).expect("Should serialize GitInfo");
let parsed: serde_json::Value = serde_json::from_str(&json).expect("Should parse JSON");
// Fields with None values should be omitted due to skip_serializing_if
assert!(!parsed.as_object().unwrap().contains_key("commit_hash"));
assert!(!parsed.as_object().unwrap().contains_key("branch"));
assert!(!parsed.as_object().unwrap().contains_key("repository_url"));
}
}

View File

@@ -1,31 +1,57 @@
use tree_sitter::Parser;
use tree_sitter::Tree;
use tree_sitter_bash::LANGUAGE as BASH;
use crate::bash::try_parse_bash;
use crate::bash::try_parse_word_only_commands_sequence;
pub fn is_known_safe_command(command: &[String]) -> bool {
if is_safe_to_call_with_exec(command) {
return true;
}
// TODO(mbolin): Also support safe commands that are piped together such
// as `cat foo | wc -l`.
matches!(
command,
[bash, flag, script]
if bash == "bash"
&& flag == "-lc"
&& try_parse_bash(script).and_then(|tree|
try_parse_single_word_only_command(&tree, script)).is_some_and(|parsed_bash_command| is_safe_to_call_with_exec(&parsed_bash_command))
)
// Support `bash -lc "..."` where the script consists solely of one or
// more "plain" commands (only bare words / quoted strings) combined with
// a conservative allowlist of shell operators that themselves do not
// introduce side effects ( "&&", "||", ";", and "|" ). If every
// individual command in the script is itself a knownsafe command, then
// the composite expression is considered safe.
if let [bash, flag, script] = command {
if bash == "bash" && flag == "-lc" {
if let Some(tree) = try_parse_bash(script) {
if let Some(all_commands) = try_parse_word_only_commands_sequence(&tree, script) {
if !all_commands.is_empty()
&& all_commands
.iter()
.all(|cmd| is_safe_to_call_with_exec(cmd))
{
return true;
}
}
}
}
}
false
}
fn is_safe_to_call_with_exec(command: &[String]) -> bool {
let cmd0 = command.first().map(String::as_str);
match cmd0 {
#[rustfmt::skip]
Some(
"cat" | "cd" | "echo" | "grep" | "head" | "ls" | "pwd" | "rg" | "tail" | "wc" | "which",
) => true,
"cat" |
"cd" |
"echo" |
"false" |
"grep" |
"head" |
"ls" |
"nl" |
"pwd" |
"tail" |
"true" |
"wc" |
"which") => {
true
},
Some("find") => {
// Certain options to `find` can delete files, write to files, or
@@ -46,6 +72,29 @@ fn is_safe_to_call_with_exec(command: &[String]) -> bool {
.any(|arg| UNSAFE_FIND_OPTIONS.contains(&arg.as_str()))
}
// Ripgrep
Some("rg") => {
const UNSAFE_RIPGREP_OPTIONS_WITH_ARGS: &[&str] = &[
// Takes an arbitrary command that is executed for each match.
"--pre",
// Takes a command that can be used to obtain the local hostname.
"--hostname-bin",
];
const UNSAFE_RIPGREP_OPTIONS_WITHOUT_ARGS: &[&str] = &[
// Calls out to other decompression tools, so do not auto-approve
// out of an abundance of caution.
"--search-zip",
"-z",
];
!command.iter().any(|arg| {
UNSAFE_RIPGREP_OPTIONS_WITHOUT_ARGS.contains(&arg.as_str())
|| UNSAFE_RIPGREP_OPTIONS_WITH_ARGS
.iter()
.any(|&opt| arg == opt || arg.starts_with(&format!("{opt}=")))
})
}
// Git
Some("git") => matches!(
command.get(1).map(String::as_str),
@@ -72,90 +121,7 @@ fn is_safe_to_call_with_exec(command: &[String]) -> bool {
}
}
fn try_parse_bash(bash_lc_arg: &str) -> Option<Tree> {
let lang = BASH.into();
let mut parser = Parser::new();
#[expect(clippy::expect_used)]
parser.set_language(&lang).expect("load bash grammar");
let old_tree: Option<&Tree> = None;
parser.parse(bash_lc_arg, old_tree)
}
/// If `tree` represents a single Bash command whose name and every argument is
/// an ordinary `word`, return those words in order; otherwise, return `None`.
///
/// `src` must be the exact source string that was parsed into `tree`, so we can
/// extract the text for every node.
pub fn try_parse_single_word_only_command(tree: &Tree, src: &str) -> Option<Vec<String>> {
// Any parse error is an immediate rejection.
if tree.root_node().has_error() {
return None;
}
// (program …) with exactly one statement
let root = tree.root_node();
if root.kind() != "program" || root.named_child_count() != 1 {
return None;
}
let cmd = root.named_child(0)?; // (command …)
if cmd.kind() != "command" {
return None;
}
let mut words = Vec::new();
let mut cursor = cmd.walk();
for child in cmd.named_children(&mut cursor) {
match child.kind() {
// The command name node wraps one `word` child.
"command_name" => {
let word_node = child.named_child(0)?; // make sure it's only a word
if word_node.kind() != "word" {
return None;
}
words.push(word_node.utf8_text(src.as_bytes()).ok()?.to_owned());
}
// Positionalargument word (allowed).
"word" | "number" => {
words.push(child.utf8_text(src.as_bytes()).ok()?.to_owned());
}
"string" => {
if child.child_count() == 3
&& child.child(0)?.kind() == "\""
&& child.child(1)?.kind() == "string_content"
&& child.child(2)?.kind() == "\""
{
words.push(child.child(1)?.utf8_text(src.as_bytes()).ok()?.to_owned());
} else {
// Anything else means the command is *not* plain words.
return None;
}
}
"concatenation" => {
// TODO: Consider things like `'ab\'a'`.
return None;
}
"raw_string" => {
// Raw string is a single word, but we need to strip the quotes.
let raw_string = child.utf8_text(src.as_bytes()).ok()?;
let stripped = raw_string
.strip_prefix('\'')
.and_then(|s| s.strip_suffix('\''));
if let Some(stripped) = stripped {
words.push(stripped.to_owned());
} else {
return None;
}
}
// Anything else means the command is *not* plain words.
_ => return None,
}
}
Some(words)
}
// (bash parsing helpers implemented in crate::bash)
/* ----------------------------------------------------------
Example
@@ -193,6 +159,7 @@ fn is_valid_sed_n_arg(arg: Option<&str>) -> bool {
_ => false,
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used)]
@@ -209,6 +176,11 @@ mod tests {
assert!(is_safe_to_call_with_exec(&vec_str(&[
"sed", "-n", "1,5p", "file.txt"
])));
assert!(is_safe_to_call_with_exec(&vec_str(&[
"nl",
"-nrz",
"Cargo.toml"
])));
// Safe `find` command (no unsafe options).
assert!(is_safe_to_call_with_exec(&vec_str(&[
@@ -245,6 +217,40 @@ mod tests {
}
}
#[test]
fn ripgrep_rules() {
// Safe ripgrep invocations none of the unsafe flags are present.
assert!(is_safe_to_call_with_exec(&vec_str(&[
"rg",
"Cargo.toml",
"-n"
])));
// Unsafe flags that do not take an argument (present verbatim).
for args in [
vec_str(&["rg", "--search-zip", "files"]),
vec_str(&["rg", "-z", "files"]),
] {
assert!(
!is_safe_to_call_with_exec(&args),
"expected {args:?} to be considered unsafe due to zip-search flag",
);
}
// Unsafe flags that expect a value, provided in both split and = forms.
for args in [
vec_str(&["rg", "--pre", "pwned", "files"]),
vec_str(&["rg", "--pre=pwned", "files"]),
vec_str(&["rg", "--hostname-bin", "pwned", "files"]),
vec_str(&["rg", "--hostname-bin=pwned", "files"]),
] {
assert!(
!is_safe_to_call_with_exec(&args),
"expected {args:?} to be considered unsafe due to external-command flag",
);
}
}
#[test]
fn bash_lc_safe_examples() {
assert!(is_known_safe_command(&vec_str(&["bash", "-lc", "ls"])));
@@ -277,6 +283,30 @@ mod tests {
])));
}
#[test]
fn bash_lc_safe_examples_with_operators() {
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"grep -R \"Cargo.toml\" -n || true"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"ls && pwd"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"echo 'hi' ; ls"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"ls | wc -l"
])));
}
#[test]
fn bash_lc_unsafe_examples() {
assert!(
@@ -290,44 +320,29 @@ mod tests {
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "find . -name file.txt -delete"])),
"Unsafe find option should not be autoapproved."
);
}
#[test]
fn test_try_parse_single_word_only_command() {
let script_with_single_quoted_string = "sed -n '1,5p' file.txt";
let parsed_words = try_parse_bash(script_with_single_quoted_string)
.and_then(|tree| {
try_parse_single_word_only_command(&tree, script_with_single_quoted_string)
})
.unwrap();
assert_eq!(
vec![
"sed".to_string(),
"-n".to_string(),
// Ensure the single quotes are properly removed.
"1,5p".to_string(),
"file.txt".to_string()
],
parsed_words,
"Unsafe find option should not be auto-approved."
);
let script_with_number_arg = "ls -1";
let parsed_words = try_parse_bash(script_with_number_arg)
.and_then(|tree| try_parse_single_word_only_command(&tree, script_with_number_arg))
.unwrap();
assert_eq!(vec!["ls", "-1"], parsed_words,);
// Disallowed because of unsafe command in sequence.
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "ls && rm -rf /"])),
"Sequence containing unsafe command must be rejected"
);
let script_with_double_quoted_string_with_no_funny_stuff_arg = "grep -R \"Cargo.toml\" -n";
let parsed_words = try_parse_bash(script_with_double_quoted_string_with_no_funny_stuff_arg)
.and_then(|tree| {
try_parse_single_word_only_command(
&tree,
script_with_double_quoted_string_with_no_funny_stuff_arg,
)
})
.unwrap();
assert_eq!(vec!["grep", "-R", "Cargo.toml", "-n"], parsed_words);
// Disallowed because of parentheses / subshell.
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "(ls)"])),
"Parentheses (subshell) are not provably safe with the current parser"
);
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "ls || (pwd && echo hi)"])),
"Nested parentheses are not provably safe with the current parser"
);
// Disallowed redirection.
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "ls > out.txt"])),
"> redirection should be rejected"
);
}
}

View File

@@ -5,11 +5,14 @@
// the TUI or the tracing stack).
#![deny(clippy::print_stdout, clippy::print_stderr)]
mod apply_patch;
mod bash;
mod chat_completions;
mod client;
mod client_common;
pub mod codex;
pub use codex::Codex;
pub use codex::CodexSpawnOk;
pub mod codex_wrapper;
pub mod config;
pub mod config_profile;
@@ -19,22 +22,32 @@ pub mod error;
pub mod exec;
pub mod exec_env;
mod flags;
pub mod git_info;
mod is_safe_command;
mod mcp_connection_manager;
mod mcp_tool_call;
mod message_history;
mod model_provider_info;
pub mod parse_command;
pub use model_provider_info::BUILT_IN_OSS_MODEL_PROVIDER_ID;
pub use model_provider_info::ModelProviderInfo;
pub use model_provider_info::WireApi;
pub use model_provider_info::built_in_model_providers;
pub use model_provider_info::create_oss_provider_with_base_url;
pub mod model_family;
mod models;
pub mod openai_api_key;
mod openai_model_info;
mod openai_tools;
pub mod plan_tool;
mod project_doc;
pub mod protocol;
mod rollout;
mod safety;
pub(crate) mod safety;
pub mod seatbelt;
pub mod shell;
pub mod spawn;
pub mod turn_diff_tracker;
mod user_notification;
pub mod util;
pub use client_common::model_supports_reasoning_summaries;
pub use apply_patch::CODEX_APPLY_PATCH_ARG1;
pub use safety::get_platform_sandbox;

View File

@@ -7,6 +7,8 @@
//! `"<server><MCP_TOOL_NAME_DELIMITER><tool>"` as the key.
use std::collections::HashMap;
use std::collections::HashSet;
use std::ffi::OsString;
use std::time::Duration;
use anyhow::Context;
@@ -16,8 +18,13 @@ use codex_mcp_client::McpClient;
use mcp_types::ClientCapabilities;
use mcp_types::Implementation;
use mcp_types::Tool;
use serde_json::json;
use sha1::Digest;
use sha1::Sha1;
use tokio::task::JoinSet;
use tracing::info;
use tracing::warn;
use crate::config_types::McpServerConfig;
@@ -26,7 +33,8 @@ use crate::config_types::McpServerConfig;
///
/// OpenAI requires tool names to conform to `^[a-zA-Z0-9_-]+$`, so we must
/// choose a delimiter from this character set.
const MCP_TOOL_NAME_DELIMITER: &str = "__OAI_CODEX_MCP__";
const MCP_TOOL_NAME_DELIMITER: &str = "__";
const MAX_TOOL_NAME_LENGTH: usize = 64;
/// Timeout for the `tools/list` request.
const LIST_TOOLS_TIMEOUT: Duration = Duration::from_secs(10);
@@ -35,16 +43,42 @@ const LIST_TOOLS_TIMEOUT: Duration = Duration::from_secs(10);
/// spawned successfully.
pub type ClientStartErrors = HashMap<String, anyhow::Error>;
fn fully_qualified_tool_name(server: &str, tool: &str) -> String {
format!("{server}{MCP_TOOL_NAME_DELIMITER}{tool}")
fn qualify_tools(tools: Vec<ToolInfo>) -> HashMap<String, ToolInfo> {
let mut used_names = HashSet::new();
let mut qualified_tools = HashMap::new();
for tool in tools {
let mut qualified_name = format!(
"{}{}{}",
tool.server_name, MCP_TOOL_NAME_DELIMITER, tool.tool_name
);
if qualified_name.len() > MAX_TOOL_NAME_LENGTH {
let mut hasher = Sha1::new();
hasher.update(qualified_name.as_bytes());
let sha1 = hasher.finalize();
let sha1_str = format!("{sha1:x}");
// Truncate to make room for the hash suffix
let prefix_len = MAX_TOOL_NAME_LENGTH - sha1_str.len();
qualified_name = format!("{}{}", &qualified_name[..prefix_len], sha1_str);
}
if used_names.contains(&qualified_name) {
warn!("skipping duplicated tool {}", qualified_name);
continue;
}
used_names.insert(qualified_name.clone());
qualified_tools.insert(qualified_name, tool);
}
qualified_tools
}
pub(crate) fn try_parse_fully_qualified_tool_name(fq_name: &str) -> Option<(String, String)> {
let (server, tool) = fq_name.split_once(MCP_TOOL_NAME_DELIMITER)?;
if server.is_empty() || tool.is_empty() {
return None;
}
Some((server.to_string(), tool.to_string()))
struct ToolInfo {
server_name: String,
tool_name: String,
tool: Tool,
}
/// A thin wrapper around a set of running [`McpClient`] instances.
@@ -57,7 +91,7 @@ pub(crate) struct McpConnectionManager {
clients: HashMap<String, std::sync::Arc<McpClient>>,
/// Fully qualified tool name -> tool instance.
tools: HashMap<String, Tool>,
tools: HashMap<String, ToolInfo>,
}
impl McpConnectionManager {
@@ -79,12 +113,27 @@ impl McpConnectionManager {
// Launch all configured servers concurrently.
let mut join_set = JoinSet::new();
let mut errors = ClientStartErrors::new();
for (server_name, cfg) in mcp_servers {
// TODO: Verify server name: require `^[a-zA-Z0-9_-]+$`?
// Validate server name before spawning
if !is_valid_mcp_server_name(&server_name) {
let error = anyhow::anyhow!(
"invalid server name '{}': must match pattern ^[a-zA-Z0-9_-]+$",
server_name
);
errors.insert(server_name, error);
continue;
}
join_set.spawn(async move {
let McpServerConfig { command, args, env } = cfg;
let client_res = McpClient::new_stdio_client(command, args, env).await;
let client_res = McpClient::new_stdio_client(
command.into(),
args.into_iter().map(OsString::from).collect(),
env,
)
.await;
match client_res {
Ok(client) => {
// Initialize the client.
@@ -93,10 +142,14 @@ impl McpConnectionManager {
experimental: None,
roots: None,
sampling: None,
// https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#capabilities
// indicates this should be an empty object.
elicitation: Some(json!({})),
},
client_info: Implementation {
name: "codex-mcp-client".to_owned(),
version: env!("CARGO_PKG_VERSION").to_owned(),
title: Some("Codex".into()),
},
protocol_version: mcp_types::MCP_SCHEMA_VERSION.to_owned(),
};
@@ -117,7 +170,6 @@ impl McpConnectionManager {
let mut clients: HashMap<String, std::sync::Arc<McpClient>> =
HashMap::with_capacity(join_set.len());
let mut errors = ClientStartErrors::new();
while let Some(res) = join_set.join_next().await {
let (server_name, client_res) = res?; // JoinError propagation
@@ -132,7 +184,9 @@ impl McpConnectionManager {
}
}
let tools = list_all_tools(&clients).await?;
let all_tools = list_all_tools(&clients).await?;
let tools = qualify_tools(all_tools);
Ok((Self { clients, tools }, errors))
}
@@ -140,7 +194,10 @@ impl McpConnectionManager {
/// Returns a single map that contains **all** tools. Each key is the
/// fully-qualified name for the tool.
pub fn list_all_tools(&self) -> HashMap<String, Tool> {
self.tools.clone()
self.tools
.iter()
.map(|(name, tool)| (name.clone(), tool.tool.clone()))
.collect()
}
/// Invoke the tool indicated by the (server, tool) pair.
@@ -162,13 +219,19 @@ impl McpConnectionManager {
.await
.with_context(|| format!("tool call failed for `{server}/{tool}`"))
}
pub fn parse_tool_name(&self, tool_name: &str) -> Option<(String, String)> {
self.tools
.get(tool_name)
.map(|tool| (tool.server_name.clone(), tool.tool_name.clone()))
}
}
/// Query every server for its available tools and return a single map that
/// contains **all** tools. Each key is the fully-qualified name for the tool.
pub async fn list_all_tools(
async fn list_all_tools(
clients: &HashMap<String, std::sync::Arc<McpClient>>,
) -> Result<HashMap<String, Tool>> {
) -> Result<Vec<ToolInfo>> {
let mut join_set = JoinSet::new();
// Spawn one task per server so we can query them concurrently. This
@@ -185,18 +248,19 @@ pub async fn list_all_tools(
});
}
let mut aggregated: HashMap<String, Tool> = HashMap::with_capacity(join_set.len());
let mut aggregated: Vec<ToolInfo> = Vec::with_capacity(join_set.len());
while let Some(join_res) = join_set.join_next().await {
let (server_name, list_result) = join_res?;
let list_result = list_result?;
for tool in list_result.tools {
// TODO(mbolin): escape tool names that contain invalid characters.
let fq_name = fully_qualified_tool_name(&server_name, &tool.name);
if aggregated.insert(fq_name.clone(), tool).is_some() {
panic!("tool name collision for '{fq_name}': suspicious");
}
let tool_info = ToolInfo {
server_name: server_name.clone(),
tool_name: tool.name.clone(),
tool,
};
aggregated.push(tool_info);
}
}
@@ -208,3 +272,99 @@ pub async fn list_all_tools(
Ok(aggregated)
}
fn is_valid_mcp_server_name(server_name: &str) -> bool {
!server_name.is_empty()
&& server_name
.chars()
.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
#[cfg(test)]
#[allow(clippy::unwrap_used)]
mod tests {
use super::*;
use mcp_types::ToolInputSchema;
fn create_test_tool(server_name: &str, tool_name: &str) -> ToolInfo {
ToolInfo {
server_name: server_name.to_string(),
tool_name: tool_name.to_string(),
tool: Tool {
annotations: None,
description: Some(format!("Test tool: {tool_name}")),
input_schema: ToolInputSchema {
properties: None,
required: None,
r#type: "object".to_string(),
},
name: tool_name.to_string(),
output_schema: None,
title: None,
},
}
}
#[test]
fn test_qualify_tools_short_non_duplicated_names() {
let tools = vec![
create_test_tool("server1", "tool1"),
create_test_tool("server1", "tool2"),
];
let qualified_tools = qualify_tools(tools);
assert_eq!(qualified_tools.len(), 2);
assert!(qualified_tools.contains_key("server1__tool1"));
assert!(qualified_tools.contains_key("server1__tool2"));
}
#[test]
fn test_qualify_tools_duplicated_names_skipped() {
let tools = vec![
create_test_tool("server1", "duplicate_tool"),
create_test_tool("server1", "duplicate_tool"),
];
let qualified_tools = qualify_tools(tools);
// Only the first tool should remain, the second is skipped
assert_eq!(qualified_tools.len(), 1);
assert!(qualified_tools.contains_key("server1__duplicate_tool"));
}
#[test]
fn test_qualify_tools_long_names_same_server() {
let server_name = "my_server";
let tools = vec![
create_test_tool(
server_name,
"extremely_lengthy_function_name_that_absolutely_surpasses_all_reasonable_limits",
),
create_test_tool(
server_name,
"yet_another_extremely_lengthy_function_name_that_absolutely_surpasses_all_reasonable_limits",
),
];
let qualified_tools = qualify_tools(tools);
assert_eq!(qualified_tools.len(), 2);
let mut keys: Vec<_> = qualified_tools.keys().cloned().collect();
keys.sort();
assert_eq!(keys[0].len(), 64);
assert_eq!(
keys[0],
"my_server__extremely_lena02e507efc5a9de88637e436690364fd4219e4ef"
);
assert_eq!(keys[1].len(), 64);
assert_eq!(
keys[1],
"my_server__yet_another_e1c3987bd9c50b826cbe1687966f79f0c602d19ca"
);
}
}

View File

@@ -1,4 +1,5 @@
use std::time::Duration;
use std::time::Instant;
use tracing::error;
@@ -7,6 +8,7 @@ use crate::models::FunctionCallOutputPayload;
use crate::models::ResponseInputItem;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::McpInvocation;
use crate::protocol::McpToolCallBeginEvent;
use crate::protocol::McpToolCallEndEvent;
@@ -41,21 +43,28 @@ pub(crate) async fn handle_mcp_tool_call(
}
};
let tool_call_begin_event = EventMsg::McpToolCallBegin(McpToolCallBeginEvent {
call_id: call_id.clone(),
let invocation = McpInvocation {
server: server.clone(),
tool: tool_name.clone(),
arguments: arguments_value.clone(),
};
let tool_call_begin_event = EventMsg::McpToolCallBegin(McpToolCallBeginEvent {
call_id: call_id.clone(),
invocation: invocation.clone(),
});
notify_mcp_tool_call_event(sess, sub_id, tool_call_begin_event).await;
let start = Instant::now();
// Perform the tool call.
let result = sess
.call_tool(&server, &tool_name, arguments_value, timeout)
.call_tool(&server, &tool_name, arguments_value.clone(), timeout)
.await
.map_err(|e| format!("tool call error: {e}"));
let tool_call_end_event = EventMsg::McpToolCallEnd(McpToolCallEndEvent {
call_id: call_id.clone(),
invocation,
duration: start.elapsed(),
result: result.clone(),
});

View File

@@ -0,0 +1,100 @@
/// A model family is a group of models that share certain characteristics.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ModelFamily {
/// The full model slug used to derive this model family, e.g.
/// "gpt-4.1-2025-04-14".
pub slug: String,
/// The model family name, e.g. "gpt-4.1". Note this should able to be used
/// with [`crate::openai_model_info::get_model_info`].
pub family: String,
/// True if the model needs additional instructions on how to use the
/// "virtual" `apply_patch` CLI.
pub needs_special_apply_patch_instructions: bool,
// Whether the `reasoning` field can be set when making a request to this
// model family. Note it has `effort` and `summary` subfields (though
// `summary` is optional).
pub supports_reasoning_summaries: bool,
// This should be set to true when the model expects a tool named
// "local_shell" to be provided. Its contract must be understood natively by
// the model such that its description can be omitted.
// See https://platform.openai.com/docs/guides/tools-local-shell
pub uses_local_shell_tool: bool,
}
macro_rules! model_family {
(
$slug:expr, $family:expr $(, $key:ident : $value:expr )* $(,)?
) => {{
// defaults
let mut mf = ModelFamily {
slug: $slug.to_string(),
family: $family.to_string(),
needs_special_apply_patch_instructions: false,
supports_reasoning_summaries: false,
uses_local_shell_tool: false,
};
// apply overrides
$(
mf.$key = $value;
)*
Some(mf)
}};
}
macro_rules! simple_model_family {
(
$slug:expr, $family:expr
) => {{
Some(ModelFamily {
slug: $slug.to_string(),
family: $family.to_string(),
needs_special_apply_patch_instructions: false,
supports_reasoning_summaries: false,
uses_local_shell_tool: false,
})
}};
}
/// Returns a `ModelFamily` for the given model slug, or `None` if the slug
/// does not match any known model family.
pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
if slug.starts_with("o3") {
model_family!(
slug, "o3",
supports_reasoning_summaries: true,
)
} else if slug.starts_with("o4-mini") {
model_family!(
slug, "o4-mini",
supports_reasoning_summaries: true,
)
} else if slug.starts_with("codex-mini-latest") {
model_family!(
slug, "codex-mini-latest",
supports_reasoning_summaries: true,
uses_local_shell_tool: true,
)
} else if slug.starts_with("gpt-4.1") {
model_family!(
slug, "gpt-4.1",
needs_special_apply_patch_instructions: true,
)
} else if slug.starts_with("gpt-4o") {
simple_model_family!(slug, "gpt-4o")
} else if slug.starts_with("gpt-oss") {
simple_model_family!(slug, "gpt-oss")
} else if slug.starts_with("gpt-3.5") {
simple_model_family!(slug, "gpt-3.5")
} else if slug.starts_with("gpt-5") {
model_family!(
slug, "gpt-5",
supports_reasoning_summaries: true,
)
} else {
None
}
}

View File

@@ -5,17 +5,18 @@
//! 2. User-defined entries inside `~/.codex/config.toml` under the `model_providers`
//! key. These override or extend the defaults at runtime.
use codex_login::AuthMode;
use codex_login::CodexAuth;
use serde::Deserialize;
use serde::Serialize;
use std::collections::HashMap;
use std::env::VarError;
use std::time::Duration;
use crate::error::EnvVarError;
use crate::openai_api_key::get_openai_api_key;
/// Value for the `OpenAI-Originator` header that is sent with requests to
/// OpenAI.
const OPENAI_ORIGINATOR_HEADER: &str = "codex_cli_rs";
const DEFAULT_STREAM_IDLE_TIMEOUT_MS: u64 = 300_000;
const DEFAULT_STREAM_MAX_RETRIES: u64 = 10;
const DEFAULT_REQUEST_MAX_RETRIES: u64 = 4;
/// Wire protocol that the provider speaks. Most third-party services only
/// implement the classic OpenAI Chat Completions JSON schema, whereas OpenAI
@@ -26,7 +27,7 @@ const OPENAI_ORIGINATOR_HEADER: &str = "codex_cli_rs";
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum WireApi {
/// The experimental “Responses API exposed by OpenAI at `/v1/responses`.
/// The Responses API exposed by OpenAI at `/v1/responses`.
Responses,
/// Regular Chat Completions compatible with `/v1/chat/completions`.
@@ -40,7 +41,7 @@ pub struct ModelProviderInfo {
/// Friendly display name.
pub name: String,
/// Base URL for the provider's OpenAI-compatible API.
pub base_url: String,
pub base_url: Option<String>,
/// Environment variable that stores the user's API key for this provider.
pub env_key: Option<String>,
@@ -64,6 +65,20 @@ pub struct ModelProviderInfo {
/// value should be used. If the environment variable is not set, or the
/// value is empty, the header will not be included in the request.
pub env_http_headers: Option<HashMap<String, String>>,
/// Maximum number of times to retry a failed HTTP request to this provider.
pub request_max_retries: Option<u64>,
/// Number of times to retry reconnecting a dropped streaming response before failing.
pub stream_max_retries: Option<u64>,
/// Idle timeout (in milliseconds) to wait for activity on a streaming response before treating
/// the connection as lost.
pub stream_idle_timeout_ms: Option<u64>,
/// Whether this provider requires some form of standard authentication (API key, ChatGPT token).
#[serde(default)]
pub requires_openai_auth: bool,
}
impl ModelProviderInfo {
@@ -71,29 +86,40 @@ impl ModelProviderInfo {
/// reqwest Client applying:
/// • provider-specific headers (static + env based)
/// • Bearer auth header when an API key is available.
/// • Auth token for OAuth.
///
/// When `require_api_key` is true and the provider declares an `env_key`
/// but the variable is missing/empty, returns an [`Err`] identical to the
/// If the provider declares an `env_key` but the variable is missing/empty, returns an [`Err`] identical to the
/// one produced by [`ModelProviderInfo::api_key`].
pub fn create_request_builder<'a>(
pub async fn create_request_builder<'a>(
&'a self,
client: &'a reqwest::Client,
auth: &Option<CodexAuth>,
) -> crate::error::Result<reqwest::RequestBuilder> {
let api_key = self.api_key()?;
let effective_auth = match self.api_key() {
Ok(Some(key)) => Some(CodexAuth::from_api_key(key)),
Ok(None) => auth.clone(),
Err(err) => {
if auth.is_some() {
auth.clone()
} else {
return Err(err);
}
}
};
let url = self.get_full_url();
let url = self.get_full_url(&effective_auth);
let mut builder = client.post(url);
if let Some(key) = api_key {
builder = builder.bearer_auth(key);
if let Some(auth) = effective_auth.as_ref() {
builder = builder.bearer_auth(auth.get_token().await?);
}
Ok(self.apply_http_headers(builder))
}
pub(crate) fn get_full_url(&self) -> String {
let query_string = self
.query_params
fn get_query_string(&self) -> String {
self.query_params
.as_ref()
.map_or_else(String::new, |params| {
let full_params = params
@@ -102,8 +128,27 @@ impl ModelProviderInfo {
.collect::<Vec<_>>()
.join("&");
format!("?{full_params}")
});
let base_url = &self.base_url;
})
}
pub(crate) fn get_full_url(&self, auth: &Option<CodexAuth>) -> String {
let default_base_url = if matches!(
auth,
Some(CodexAuth {
mode: AuthMode::ChatGPT,
..
})
) {
"https://chatgpt.com/backend-api/codex"
} else {
"https://api.openai.com/v1"
};
let query_string = self.get_query_string();
let base_url = self
.base_url
.clone()
.unwrap_or(default_base_url.to_string());
match self.wire_api {
WireApi::Responses => format!("{base_url}/responses{query_string}"),
WireApi::Chat => format!("{base_url}/chat/completions{query_string}"),
@@ -135,14 +180,10 @@ impl ModelProviderInfo {
/// If `env_key` is Some, returns the API key for this provider if present
/// (and non-empty) in the environment. If `env_key` is required but
/// cannot be found, returns an error.
fn api_key(&self) -> crate::error::Result<Option<String>> {
pub fn api_key(&self) -> crate::error::Result<Option<String>> {
match &self.env_key {
Some(env_key) => {
let env_value = if env_key == crate::openai_api_key::OPENAI_API_KEY_ENV_VAR {
get_openai_api_key().map_or_else(|| Err(VarError::NotPresent), Ok)
} else {
std::env::var(env_key)
};
let env_value = std::env::var(env_key);
env_value
.and_then(|v| {
if v.trim().is_empty() {
@@ -161,16 +202,39 @@ impl ModelProviderInfo {
None => Ok(None),
}
}
/// Effective maximum number of request retries for this provider.
pub fn request_max_retries(&self) -> u64 {
self.request_max_retries
.unwrap_or(DEFAULT_REQUEST_MAX_RETRIES)
}
/// Effective maximum number of stream reconnection attempts for this provider.
pub fn stream_max_retries(&self) -> u64 {
self.stream_max_retries
.unwrap_or(DEFAULT_STREAM_MAX_RETRIES)
}
/// Effective idle timeout for streaming responses.
pub fn stream_idle_timeout(&self) -> Duration {
self.stream_idle_timeout_ms
.map(Duration::from_millis)
.unwrap_or(Duration::from_millis(DEFAULT_STREAM_IDLE_TIMEOUT_MS))
}
}
const DEFAULT_OLLAMA_PORT: u32 = 11434;
pub const BUILT_IN_OSS_MODEL_PROVIDER_ID: &str = "oss";
/// Built-in default provider list.
pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
use ModelProviderInfo as P;
// We do not want to be in the business of adjucating which third-party
// providers are bundled with Codex CLI, so we only include the OpenAI
// provider by default. Users are encouraged to add to `model_providers`
// in config.toml to add their own providers.
// providers are bundled with Codex CLI, so we only include the OpenAI and
// open source ("oss") providers by default. Users are encouraged to add to
// `model_providers` in config.toml to add their own providers.
[
(
"openai",
@@ -183,36 +247,79 @@ pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
// OpenAI provider.
base_url: std::env::var("OPENAI_BASE_URL")
.ok()
.filter(|v| !v.trim().is_empty())
.unwrap_or_else(|| "https://api.openai.com/v1".to_string()),
env_key: Some("OPENAI_API_KEY".into()),
env_key_instructions: Some("Create an API key (https://platform.openai.com) and export it as an environment variable.".into()),
.filter(|v| !v.trim().is_empty()),
env_key: None,
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: Some(
[
("originator".to_string(), OPENAI_ORIGINATOR_HEADER.to_string()),
("version".to_string(), env!("CARGO_PKG_VERSION").to_string()),
]
[("version".to_string(), env!("CARGO_PKG_VERSION").to_string())]
.into_iter()
.collect(),
),
env_http_headers: Some(
[
("OpenAI-Organization".to_string(), "OPENAI_ORGANIZATION".to_string()),
(
"OpenAI-Organization".to_string(),
"OPENAI_ORGANIZATION".to_string(),
),
("OpenAI-Project".to_string(), "OPENAI_PROJECT".to_string()),
]
.into_iter()
.collect(),
.into_iter()
.collect(),
),
// Use global defaults for retry/timeout unless overridden in config.toml.
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: true,
},
),
(BUILT_IN_OSS_MODEL_PROVIDER_ID, create_oss_provider()),
]
.into_iter()
.map(|(k, v)| (k.to_string(), v))
.collect()
}
pub fn create_oss_provider() -> ModelProviderInfo {
// These CODEX_OSS_ environment variables are experimental: we may
// switch to reading values from config.toml instead.
let codex_oss_base_url = match std::env::var("CODEX_OSS_BASE_URL")
.ok()
.filter(|v| !v.trim().is_empty())
{
Some(url) => url,
None => format!(
"http://localhost:{port}/v1",
port = std::env::var("CODEX_OSS_PORT")
.ok()
.filter(|v| !v.trim().is_empty())
.and_then(|v| v.parse::<u32>().ok())
.unwrap_or(DEFAULT_OLLAMA_PORT)
),
};
create_oss_provider_with_base_url(&codex_oss_base_url)
}
pub fn create_oss_provider_with_base_url(base_url: &str) -> ModelProviderInfo {
ModelProviderInfo {
name: "gpt-oss".into(),
base_url: Some(base_url.into()),
env_key: None,
env_key_instructions: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: false,
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used)]
@@ -227,13 +334,17 @@ base_url = "http://localhost:11434/v1"
"#;
let expected_provider = ModelProviderInfo {
name: "Ollama".into(),
base_url: "http://localhost:11434/v1".into(),
base_url: Some("http://localhost:11434/v1".into()),
env_key: None,
env_key_instructions: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: false,
};
let provider: ModelProviderInfo = toml::from_str(azure_provider_toml).unwrap();
@@ -250,7 +361,7 @@ query_params = { api-version = "2025-04-01-preview" }
"#;
let expected_provider = ModelProviderInfo {
name: "Azure".into(),
base_url: "https://xxxxx.openai.azure.com/openai".into(),
base_url: Some("https://xxxxx.openai.azure.com/openai".into()),
env_key: Some("AZURE_OPENAI_API_KEY".into()),
env_key_instructions: None,
wire_api: WireApi::Chat,
@@ -259,6 +370,10 @@ query_params = { api-version = "2025-04-01-preview" }
}),
http_headers: None,
env_http_headers: None,
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: false,
};
let provider: ModelProviderInfo = toml::from_str(azure_provider_toml).unwrap();
@@ -276,7 +391,7 @@ env_http_headers = { "X-Example-Env-Header" = "EXAMPLE_ENV_VAR" }
"#;
let expected_provider = ModelProviderInfo {
name: "Example".into(),
base_url: "https://example.com".into(),
base_url: Some("https://example.com".into()),
env_key: Some("API_KEY".into()),
env_key_instructions: None,
wire_api: WireApi::Chat,
@@ -287,6 +402,10 @@ env_http_headers = { "X-Example-Env-Header" = "EXAMPLE_ENV_VAR" }
env_http_headers: Some(maplit::hashmap! {
"X-Example-Env-Header".to_string() => "EXAMPLE_ENV_VAR".to_string(),
}),
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: false,
};
let provider: ModelProviderInfo = toml::from_str(azure_provider_toml).unwrap();

View File

@@ -3,12 +3,13 @@ use std::collections::HashMap;
use base64::Engine;
use mcp_types::CallToolResult;
use serde::Deserialize;
use serde::Deserializer;
use serde::Serialize;
use serde::ser::Serializer;
use crate::protocol::InputItem;
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ResponseInputItem {
Message {
@@ -25,7 +26,7 @@ pub enum ResponseInputItem {
},
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ContentItem {
InputText { text: String },
@@ -33,16 +34,20 @@ pub enum ContentItem {
OutputText { text: String },
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ResponseItem {
Message {
id: Option<String>,
role: String,
content: Vec<ContentItem>,
},
Reasoning {
id: String,
summary: Vec<ReasoningItemReasoningSummary>,
#[serde(default, skip_serializing_if = "Option::is_none")]
content: Option<Vec<ReasoningItemContent>>,
encrypted_content: Option<String>,
},
LocalShellCall {
/// Set when using the chat completions API.
@@ -53,6 +58,7 @@ pub enum ResponseItem {
action: LocalShellAction,
},
FunctionCall {
id: Option<String>,
name: String,
// The Responses API returns the function call arguments as a *string* that contains
// JSON, not as an alreadyparsed object. We keep it as a raw string here and let
@@ -78,7 +84,11 @@ pub enum ResponseItem {
impl From<ResponseInputItem> for ResponseItem {
fn from(item: ResponseInputItem) -> Self {
match item {
ResponseInputItem::Message { role, content } => Self::Message { role, content },
ResponseInputItem::Message { role, content } => Self::Message {
role,
content,
id: None,
},
ResponseInputItem::FunctionCallOutput { call_id, output } => {
Self::FunctionCallOutput { call_id, output }
}
@@ -99,7 +109,7 @@ impl From<ResponseInputItem> for ResponseItem {
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "snake_case")]
pub enum LocalShellStatus {
Completed,
@@ -107,13 +117,13 @@ pub enum LocalShellStatus {
Incomplete,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum LocalShellAction {
Exec(LocalShellExecAction),
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct LocalShellExecAction {
pub command: Vec<String>,
pub timeout_ms: Option<u64>,
@@ -122,12 +132,18 @@ pub struct LocalShellExecAction {
pub user: Option<String>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ReasoningItemReasoningSummary {
SummaryText { text: String },
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ReasoningItemContent {
ReasoningText { text: String },
}
impl From<Vec<InputItem>> for ResponseInputItem {
fn from(items: Vec<InputItem>) -> Self {
Self::Message {
@@ -175,12 +191,15 @@ pub struct ShellToolCallParams {
// The wire format uses `timeout`, which has ambiguous units, so we use
// `timeout_ms` as the field name so it is clear in code.
pub timeout_ms: Option<u64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub with_escalated_permissions: Option<bool>,
#[serde(skip_serializing_if = "Option::is_none")]
pub justification: Option<String>,
}
#[derive(Deserialize, Debug, Clone)]
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionCallOutputPayload {
pub content: String,
#[expect(dead_code)]
pub success: Option<bool>,
}
@@ -205,6 +224,19 @@ impl Serialize for FunctionCallOutputPayload {
}
}
impl<'de> Deserialize<'de> for FunctionCallOutputPayload {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
Ok(FunctionCallOutputPayload {
content: s,
success: None,
})
}
}
// Implement Display so callers can treat the payload like a plain string when logging or doing
// trivial substring checks in tests (existing tests call `.contains()` on the output). Display
// returns the raw `content` field.
@@ -274,6 +306,8 @@ mod tests {
command: vec!["ls".to_string(), "-l".to_string()],
workdir: Some("/tmp".to_string()),
timeout_ms: Some(1000),
with_escalated_permissions: None,
justification: None,
},
params
);

View File

@@ -1,24 +0,0 @@
use std::env;
use std::sync::LazyLock;
use std::sync::RwLock;
pub const OPENAI_API_KEY_ENV_VAR: &str = "OPENAI_API_KEY";
static OPENAI_API_KEY: LazyLock<RwLock<Option<String>>> = LazyLock::new(|| {
let val = env::var(OPENAI_API_KEY_ENV_VAR)
.ok()
.and_then(|s| if s.is_empty() { None } else { Some(s) });
RwLock::new(val)
});
pub fn get_openai_api_key() -> Option<String> {
#![allow(clippy::unwrap_used)]
OPENAI_API_KEY.read().unwrap().clone()
}
pub fn set_openai_api_key(value: String) {
#![allow(clippy::unwrap_used)]
if !value.is_empty() {
*OPENAI_API_KEY.write().unwrap() = Some(value);
}
}

View File

@@ -1,3 +1,5 @@
use crate::model_family::ModelFamily;
/// Metadata about a model, particularly OpenAI models.
/// We may want to consider including details like the pricing for
/// input tokens, output tokens, etc., though users will need to be able to
@@ -12,10 +14,19 @@ pub(crate) struct ModelInfo {
pub(crate) max_output_tokens: u64,
}
/// Note details such as what a model like gpt-4o is aliased to may be out of
/// date.
pub(crate) fn get_model_info(name: &str) -> Option<ModelInfo> {
match name {
pub(crate) fn get_model_info(model_family: &ModelFamily) -> Option<ModelInfo> {
match model_family.slug.as_str() {
// OSS models have a 128k shared token pool.
// Arbitrarily splitting it: 3/4 input context, 1/4 output.
// https://openai.com/index/gpt-oss-model-card/
"gpt-oss-20b" => Some(ModelInfo {
context_window: 96_000,
max_output_tokens: 32_000,
}),
"gpt-oss-120b" => Some(ModelInfo {
context_window: 96_000,
max_output_tokens: 32_000,
}),
// https://platform.openai.com/docs/models/o3
"o3" => Some(ModelInfo {
context_window: 200_000,
@@ -66,6 +77,11 @@ pub(crate) fn get_model_info(name: &str) -> Option<ModelInfo> {
max_output_tokens: 4_096,
}),
"gpt-5" => Some(ModelInfo {
context_window: 200_000,
max_output_tokens: 100_000,
}),
_ => None,
}
}

View File

@@ -1,21 +1,28 @@
use serde::Deserialize;
use serde::Serialize;
use serde_json::json;
use std::collections::BTreeMap;
use std::sync::LazyLock;
use std::collections::HashMap;
use crate::client_common::Prompt;
use crate::model_family::ModelFamily;
use crate::plan_tool::PLAN_TOOL;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
#[derive(Debug, Clone, Serialize)]
pub(crate) struct ResponsesApiTool {
name: &'static str,
description: &'static str,
strict: bool,
parameters: JsonSchema,
#[derive(Debug, Clone, Serialize, PartialEq)]
pub struct ResponsesApiTool {
pub(crate) name: String,
pub(crate) description: String,
/// TODO: Validation. When strict is set to true, the JSON schema,
/// `required` and `additional_properties` must be present. All fields in
/// `properties` must be present in `required`.
pub(crate) strict: bool,
pub(crate) parameters: JsonSchema,
}
/// When serialized as JSON, this produces a valid "Tool" in the OpenAI
/// Responses API.
#[derive(Debug, Clone, Serialize)]
#[derive(Debug, Clone, Serialize, PartialEq)]
#[serde(tag = "type")]
pub(crate) enum OpenAiTool {
#[serde(rename = "function")]
@@ -24,74 +31,218 @@ pub(crate) enum OpenAiTool {
LocalShell {},
}
#[derive(Debug, Clone)]
pub enum ConfigShellToolType {
DefaultShell,
ShellWithRequest { sandbox_policy: SandboxPolicy },
LocalShell,
}
#[derive(Debug, Clone)]
pub struct ToolsConfig {
pub shell_type: ConfigShellToolType,
pub plan_tool: bool,
}
impl ToolsConfig {
pub fn new(
model_family: &ModelFamily,
approval_policy: AskForApproval,
sandbox_policy: SandboxPolicy,
include_plan_tool: bool,
) -> Self {
let mut shell_type = if model_family.uses_local_shell_tool {
ConfigShellToolType::LocalShell
} else {
ConfigShellToolType::DefaultShell
};
if matches!(approval_policy, AskForApproval::OnRequest) {
shell_type = ConfigShellToolType::ShellWithRequest {
sandbox_policy: sandbox_policy.clone(),
}
}
Self {
shell_type,
plan_tool: include_plan_tool,
}
}
}
/// Generic JSONSchema subset needed for our tool definitions
#[derive(Debug, Clone, Serialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(tag = "type", rename_all = "lowercase")]
pub(crate) enum JsonSchema {
String,
Number,
Boolean {
#[serde(skip_serializing_if = "Option::is_none")]
description: Option<String>,
},
String {
#[serde(skip_serializing_if = "Option::is_none")]
description: Option<String>,
},
Number {
#[serde(skip_serializing_if = "Option::is_none")]
description: Option<String>,
},
Array {
items: Box<JsonSchema>,
#[serde(skip_serializing_if = "Option::is_none")]
description: Option<String>,
},
Object {
properties: BTreeMap<String, JsonSchema>,
required: &'static [&'static str],
#[serde(rename = "additionalProperties")]
additional_properties: bool,
#[serde(skip_serializing_if = "Option::is_none")]
required: Option<Vec<String>>,
#[serde(
rename = "additionalProperties",
skip_serializing_if = "Option::is_none"
)]
additional_properties: Option<bool>,
},
}
/// Tool usage specification
static DEFAULT_TOOLS: LazyLock<Vec<OpenAiTool>> = LazyLock::new(|| {
fn create_shell_tool() -> OpenAiTool {
let mut properties = BTreeMap::new();
properties.insert(
"command".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String),
items: Box::new(JsonSchema::String { description: None }),
description: None,
},
);
properties.insert("workdir".to_string(), JsonSchema::String);
properties.insert("timeout".to_string(), JsonSchema::Number);
properties.insert(
"workdir".to_string(),
JsonSchema::String { description: None },
);
properties.insert(
"timeout".to_string(),
JsonSchema::Number { description: None },
);
vec![OpenAiTool::Function(ResponsesApiTool {
name: "shell",
description: "Runs a shell command, and returns its output.",
OpenAiTool::Function(ResponsesApiTool {
name: "shell".to_string(),
description: "Runs a shell command and returns its output".to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: &["command"],
additional_properties: false,
required: Some(vec!["command".to_string()]),
additional_properties: Some(false),
},
})]
});
})
}
static DEFAULT_CODEX_MODEL_TOOLS: LazyLock<Vec<OpenAiTool>> =
LazyLock::new(|| vec![OpenAiTool::LocalShell {}]);
fn create_shell_tool_for_sandbox(sandbox_policy: &SandboxPolicy) -> OpenAiTool {
let mut properties = BTreeMap::new();
properties.insert(
"command".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String { description: None }),
description: Some("The command to execute".to_string()),
},
);
properties.insert(
"workdir".to_string(),
JsonSchema::String {
description: Some("The working directory to execute the command in".to_string()),
},
);
properties.insert(
"timeout".to_string(),
JsonSchema::Number {
description: Some("The timeout for the command in milliseconds".to_string()),
},
);
if matches!(sandbox_policy, SandboxPolicy::WorkspaceWrite { .. }) {
properties.insert(
"with_escalated_permissions".to_string(),
JsonSchema::Boolean {
description: Some("Whether to request escalated permissions. Set to true if command needs to be run without sandbox restrictions".to_string()),
},
);
properties.insert(
"justification".to_string(),
JsonSchema::String {
description: Some("Only set if ask_for_escalated_permissions is true. 1-sentence explanation of why we want to run this command.".to_string()),
},
);
}
let description = match sandbox_policy {
SandboxPolicy::WorkspaceWrite {
network_access,
..
} => {
format!(
r#"
The shell tool is used to execute shell commands.
- When invoking the shell tool, your call will be running in a landlock sandbox, and some shell commands will require escalated privileges:
- Types of actions that require escalated privileges:
- Reading files outside the current directory
- Writing files outside the current directory, and protected folders like .git or .env{}
- Examples of commands that require escalated privileges:
- git commit
- npm install or pnpm install
- cargo build
- cargo test
- When invoking a command that will require escalated privileges:
- Provide the with_escalated_permissions parameter with the boolean value true
- Include a short, 1 sentence explanation for why we need to run with_escalated_permissions in the justification parameter."#,
if !network_access {
"\n - Commands that require network access\n"
} else {
""
}
)
}
SandboxPolicy::DangerFullAccess => {
"Runs a shell command and returns its output.".to_string()
}
SandboxPolicy::ReadOnly => {
r#"
The shell tool is used to execute shell commands.
- When invoking the shell tool, your call will be running in a landlock sandbox, and some shell commands (including apply_patch) will require escalated permissions:
- Types of actions that require escalated privileges:
- Reading files outside the current directory
- Writing files
- Applying patches
- Examples of commands that require escalated privileges:
- apply_patch
- git commit
- npm install or pnpm install
- cargo build
- cargo test
- When invoking a command that will require escalated privileges:
- Provide the with_escalated_permissions parameter with the boolean value true
- Include a short, 1 sentence explanation for why we need to run with_escalated_permissions in the justification parameter"#.to_string()
}
};
OpenAiTool::Function(ResponsesApiTool {
name: "shell".to_string(),
description,
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["command".to_string()]),
additional_properties: Some(false),
},
})
}
/// Returns JSON values that are compatible with Function Calling in the
/// Responses API:
/// https://platform.openai.com/docs/guides/function-calling?api-mode=responses
pub(crate) fn create_tools_json_for_responses_api(
prompt: &Prompt,
model: &str,
tools: &Vec<OpenAiTool>,
) -> crate::error::Result<Vec<serde_json::Value>> {
// Assemble tool list: built-in tools + any extra tools from the prompt.
let default_tools = if model.starts_with("codex") {
&DEFAULT_CODEX_MODEL_TOOLS
} else {
&DEFAULT_TOOLS
};
let mut tools_json = Vec::with_capacity(default_tools.len() + prompt.extra_tools.len());
for t in default_tools.iter() {
tools_json.push(serde_json::to_value(t)?);
let mut tools_json = Vec::new();
for tool in tools {
tools_json.push(serde_json::to_value(tool)?);
}
tools_json.extend(
prompt
.extra_tools
.clone()
.into_iter()
.map(|(name, tool)| mcp_tool_to_openai_tool(name, tool)),
);
Ok(tools_json)
}
@@ -100,12 +251,11 @@ pub(crate) fn create_tools_json_for_responses_api(
/// Chat Completions API:
/// https://platform.openai.com/docs/guides/function-calling?api-mode=chat
pub(crate) fn create_tools_json_for_chat_completions_api(
prompt: &Prompt,
model: &str,
tools: &Vec<OpenAiTool>,
) -> crate::error::Result<Vec<serde_json::Value>> {
// We start with the JSON for the Responses API and than rewrite it to match
// the chat completions tool call format.
let responses_api_tools_json = create_tools_json_for_responses_api(prompt, model)?;
let responses_api_tools_json = create_tools_json_for_responses_api(tools)?;
let tools_json = responses_api_tools_json
.into_iter()
.filter_map(|mut tool| {
@@ -128,10 +278,10 @@ pub(crate) fn create_tools_json_for_chat_completions_api(
Ok(tools_json)
}
fn mcp_tool_to_openai_tool(
pub(crate) fn mcp_tool_to_openai_tool(
fully_qualified_name: String,
tool: mcp_types::Tool,
) -> serde_json::Value {
) -> Result<ResponsesApiTool, serde_json::Error> {
let mcp_types::Tool {
description,
mut input_schema,
@@ -146,12 +296,205 @@ fn mcp_tool_to_openai_tool(
input_schema.properties = Some(serde_json::Value::Object(serde_json::Map::new()));
}
// TODO(mbolin): Change the contract of this function to return
// ResponsesApiTool.
json!({
"name": fully_qualified_name,
"description": description,
"parameters": input_schema,
"type": "function",
let serialized_input_schema = serde_json::to_value(input_schema)?;
let input_schema = serde_json::from_value::<JsonSchema>(serialized_input_schema)?;
Ok(ResponsesApiTool {
name: fully_qualified_name,
description: description.unwrap_or_default(),
strict: false,
parameters: input_schema,
})
}
/// Returns a list of OpenAiTools based on the provided config and MCP tools.
/// Note that the keys of mcp_tools should be fully qualified names. See
/// [`McpConnectionManager`] for more details.
pub(crate) fn get_openai_tools(
config: &ToolsConfig,
mcp_tools: Option<HashMap<String, mcp_types::Tool>>,
) -> Vec<OpenAiTool> {
let mut tools: Vec<OpenAiTool> = Vec::new();
match &config.shell_type {
ConfigShellToolType::DefaultShell => {
tools.push(create_shell_tool());
}
ConfigShellToolType::ShellWithRequest { sandbox_policy } => {
tools.push(create_shell_tool_for_sandbox(sandbox_policy));
}
ConfigShellToolType::LocalShell => {
tools.push(OpenAiTool::LocalShell {});
}
}
if config.plan_tool {
tools.push(PLAN_TOOL.clone());
}
if let Some(mcp_tools) = mcp_tools {
for (name, tool) in mcp_tools {
match mcp_tool_to_openai_tool(name.clone(), tool.clone()) {
Ok(converted_tool) => tools.push(OpenAiTool::Function(converted_tool)),
Err(e) => {
tracing::error!("Failed to convert {name:?} MCP tool to OpenAI tool: {e:?}");
}
}
}
}
tools
}
#[cfg(test)]
#[allow(clippy::expect_used)]
mod tests {
use crate::model_family::find_family_for_model;
use mcp_types::ToolInputSchema;
use super::*;
fn assert_eq_tool_names(tools: &[OpenAiTool], expected_names: &[&str]) {
let tool_names = tools
.iter()
.map(|tool| match tool {
OpenAiTool::Function(ResponsesApiTool { name, .. }) => name,
OpenAiTool::LocalShell {} => "local_shell",
})
.collect::<Vec<_>>();
assert_eq!(
tool_names.len(),
expected_names.len(),
"tool_name mismatch, {tool_names:?}, {expected_names:?}",
);
for (name, expected_name) in tool_names.iter().zip(expected_names.iter()) {
assert_eq!(
name, expected_name,
"tool_name mismatch, {name:?}, {expected_name:?}"
);
}
}
#[test]
fn test_get_openai_tools() {
let model_family = find_family_for_model("codex-mini-latest")
.expect("codex-mini-latest should be a valid model family");
let config = ToolsConfig::new(
&model_family,
AskForApproval::Never,
SandboxPolicy::ReadOnly,
true,
);
let tools = get_openai_tools(&config, Some(HashMap::new()));
assert_eq_tool_names(&tools, &["local_shell", "update_plan"]);
}
#[test]
fn test_get_openai_tools_default_shell() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let config = ToolsConfig::new(
&model_family,
AskForApproval::Never,
SandboxPolicy::ReadOnly,
true,
);
let tools = get_openai_tools(&config, Some(HashMap::new()));
assert_eq_tool_names(&tools, &["shell", "update_plan"]);
}
#[test]
fn test_get_openai_tools_mcp_tools() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let config = ToolsConfig::new(
&model_family,
AskForApproval::Never,
SandboxPolicy::ReadOnly,
false,
);
let tools = get_openai_tools(
&config,
Some(HashMap::from([(
"test_server/do_something_cool".to_string(),
mcp_types::Tool {
name: "do_something_cool".to_string(),
input_schema: ToolInputSchema {
properties: Some(serde_json::json!({
"string_argument": {
"type": "string",
},
"number_argument": {
"type": "number",
},
"object_argument": {
"type": "object",
"properties": {
"string_property": { "type": "string" },
"number_property": { "type": "number" },
},
"required": [
"string_property",
"number_property"
],
"additionalProperties": Some(false),
},
})),
required: None,
r#type: "object".to_string(),
},
output_schema: None,
title: None,
annotations: None,
description: Some("Do something cool".to_string()),
},
)])),
);
assert_eq_tool_names(&tools, &["shell", "test_server/do_something_cool"]);
assert_eq!(
tools[1],
OpenAiTool::Function(ResponsesApiTool {
name: "test_server/do_something_cool".to_string(),
parameters: JsonSchema::Object {
properties: BTreeMap::from([
(
"string_argument".to_string(),
JsonSchema::String { description: None }
),
(
"number_argument".to_string(),
JsonSchema::Number { description: None }
),
(
"object_argument".to_string(),
JsonSchema::Object {
properties: BTreeMap::from([
(
"string_property".to_string(),
JsonSchema::String { description: None }
),
(
"number_property".to_string(),
JsonSchema::Number { description: None }
),
]),
required: Some(vec![
"string_property".to_string(),
"number_property".to_string(),
]),
additional_properties: Some(false),
},
),
]),
required: None,
additional_properties: None,
},
description: "Do something cool".to_string(),
strict: false,
})
);
}
}

View File

@@ -0,0 +1,648 @@
use crate::bash::try_parse_bash;
use crate::bash::try_parse_word_only_commands_sequence;
use serde::Deserialize;
use serde::Serialize;
use shlex::split as shlex_split;
#[derive(Debug, Clone, PartialEq, Eq, Deserialize, Serialize)]
pub enum ParsedCommand {
Read {
cmd: Vec<String>,
name: String,
},
Python {
cmd: Vec<String>,
},
GitStatus {
cmd: Vec<String>,
},
GitLog {
cmd: Vec<String>,
},
GitDiff {
cmd: Vec<String>,
},
Ls {
cmd: Vec<String>,
path: Option<String>,
},
Rg {
cmd: Vec<String>,
query: Option<String>,
path: Option<String>,
files_only: bool,
},
Shell {
cmd: Vec<String>,
display: String,
},
Pnpm {
cmd: Vec<String>,
pnpm_cmd: String,
},
Unknown {
cmd: Vec<String>,
},
}
pub fn parse_command(command: &[String]) -> Vec<ParsedCommand> {
let main_cmd = extract_main_cmd_tokens(command);
// 1) Try the "bash -lc <script>" path: leverage the existing parser so we
// can get each sub-command (words-only) precisely.
if let [bash, flag, script] = command {
if bash == "bash" && flag == "-lc" {
if let Some(tree) = try_parse_bash(script) {
if let Some(all_commands) = try_parse_word_only_commands_sequence(&tree, script) {
if !all_commands.is_empty() {
// Tokenize the entire script once; used to preserve full context for certain summaries.
let script_tokens = shlex_split(script).unwrap_or_else(|| {
vec!["bash".to_string(), flag.clone(), script.clone()]
});
let commands: Vec<ParsedCommand> = all_commands
.into_iter()
.map(|tokens| {
match summarize_main_tokens(&tokens) {
// For ls within a bash -lc script, preserve the full script tokens for display.
ParsedCommand::Ls { path, .. } => ParsedCommand::Ls {
cmd: script_tokens.clone(),
path,
},
other => other,
}
})
.collect();
return commands;
}
}
}
// If we couldn't parse with the bash parser, conservatively treat the
// whole thing as one opaque shell command and mark unsafe.
let display = script.clone();
let commands = vec![ParsedCommand::Shell {
cmd: main_cmd.clone(),
display,
}];
return commands;
}
}
// 2) Not a "bash -lc" form. If there are connectors, split locally.
let has_connectors = main_cmd
.iter()
.any(|t| t == "&&" || t == "||" || t == "|" || t == ";");
let split_subcommands = |tokens: &[String]| -> Vec<Vec<String>> {
let mut out: Vec<Vec<String>> = Vec::new();
let mut cur: Vec<String> = Vec::new();
for t in tokens {
if t == "&&" || t == "||" || t == "|" || t == ";" {
if !cur.is_empty() {
out.push(std::mem::take(&mut cur));
}
} else {
cur.push(t.clone());
}
}
if !cur.is_empty() {
out.push(cur);
}
out
};
let commands_tokens: Vec<Vec<String>> = if has_connectors {
split_subcommands(&main_cmd)
} else {
vec![main_cmd.clone()]
};
// 3) Summarize each sub-command.
let commands: Vec<ParsedCommand> = commands_tokens
.into_iter()
.map(|tokens| summarize_main_tokens(&tokens))
.collect();
commands
}
/// Returns true if `arg` matches /^(\d+,)?\d+p$/
fn is_valid_sed_n_arg(arg: Option<&str>) -> bool {
let s = match arg {
Some(s) => s,
None => return false,
};
let core = match s.strip_suffix('p') {
Some(rest) => rest,
None => return false,
};
let parts: Vec<&str> = core.split(',').collect();
match parts.as_slice() {
[num] => !num.is_empty() && num.chars().all(|c| c.is_ascii_digit()),
[a, b] => {
!a.is_empty()
&& !b.is_empty()
&& a.chars().all(|c| c.is_ascii_digit())
&& b.chars().all(|c| c.is_ascii_digit())
}
_ => false,
}
}
fn extract_main_cmd_tokens(cmd: &[String]) -> Vec<String> {
match cmd {
[first, pipe, rest @ ..] if (first == "yes" || first == "y") && pipe == "|" => {
let s = rest.join(" ");
shlex_split(&s).unwrap_or_else(|| rest.to_vec())
}
[first, pipe, rest @ ..] if (first == "no" || first == "n") && pipe == "|" => {
let s = rest.join(" ");
shlex_split(&s).unwrap_or_else(|| rest.to_vec())
}
[bash, flag, script] if bash == "bash" && (flag == "-c" || flag == "-lc") => {
shlex_split(script)
.unwrap_or_else(|| vec!["bash".to_string(), flag.clone(), script.clone()])
}
_ => cmd.to_vec(),
}
}
fn summarize_main_tokens(main_cmd: &[String]) -> ParsedCommand {
let cut_at_connector = |tokens: &[String]| -> Vec<String> {
let idx = tokens
.iter()
.position(|t| t == "|" || t == "&&" || t == "||")
.unwrap_or(tokens.len());
tokens[..idx].to_vec()
};
let truncate_file_path_for_display = |path: &str| -> String {
let mut parts = path.split('/').rev().filter(|p| {
!p.is_empty() && *p != "build" && *p != "dist" && *p != "node_modules" && *p != "src"
});
parts
.next()
.map(|s| s.to_string())
.unwrap_or_else(|| path.to_string())
};
match main_cmd.split_first() {
Some((head, tail)) if head == "ls" => {
let path = tail
.iter()
.find(|p| !p.starts_with('-'))
.map(|p| truncate_file_path_for_display(p));
ParsedCommand::Ls {
cmd: main_cmd.to_vec(),
path,
}
}
Some((head, tail)) if head == "rg" => {
let args_no_connector = cut_at_connector(tail);
let files_only = args_no_connector.iter().any(|a| a == "--files");
let non_flags: Vec<&String> = args_no_connector
.iter()
.filter(|p| !p.starts_with('-'))
.collect();
let (query, path) = if files_only {
let p = non_flags.first().map(|s| truncate_file_path_for_display(s));
(None, p)
} else {
let q = non_flags.first().map(|s| truncate_file_path_for_display(s));
let p = non_flags.get(1).map(|s| truncate_file_path_for_display(s));
(q, p)
};
ParsedCommand::Rg {
cmd: main_cmd.to_vec(),
query,
path,
files_only,
}
}
Some((head, tail)) if head == "grep" => {
let args_no_connector = cut_at_connector(tail);
let non_flags: Vec<&String> = args_no_connector
.iter()
.filter(|p| !p.starts_with('-'))
.collect();
let query = non_flags.first().map(|s| truncate_file_path_for_display(s));
let path = non_flags.get(1).map(|s| truncate_file_path_for_display(s));
ParsedCommand::Rg {
cmd: main_cmd.to_vec(),
query,
path,
files_only: false,
}
}
Some((head, tail)) if head == "cat" && tail.len() == 1 => {
let name = truncate_file_path_for_display(&tail[0]);
ParsedCommand::Read {
cmd: main_cmd.to_vec(),
name,
}
}
Some((head, tail))
if head == "head"
&& tail.len() >= 3
&& tail[0] == "-n"
&& tail[1].chars().all(|c| c.is_ascii_digit()) =>
{
let name = truncate_file_path_for_display(&tail[2]);
ParsedCommand::Read {
cmd: main_cmd.to_vec(),
name,
}
}
Some((head, tail))
if head == "tail" && tail.len() >= 3 && tail[0] == "-n" && {
let n = &tail[1];
let s = n.strip_prefix('+').unwrap_or(n);
!s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
} =>
{
let name = truncate_file_path_for_display(&tail[2]);
ParsedCommand::Read {
cmd: main_cmd.to_vec(),
name,
}
}
Some((head, tail))
if head == "sed"
&& tail.len() >= 3
&& tail[0] == "-n"
&& is_valid_sed_n_arg(tail.get(1).map(|s| s.as_str())) =>
{
if let Some(path) = tail.get(2) {
let name = truncate_file_path_for_display(path);
ParsedCommand::Read {
cmd: main_cmd.to_vec(),
name,
}
} else {
ParsedCommand::Unknown {
cmd: main_cmd.to_vec(),
}
}
}
Some((head, _tail)) if head == "python" => ParsedCommand::Python {
cmd: main_cmd.to_vec(),
},
Some((first, rest)) if first == "git" => match rest.first().map(|s| s.as_str()) {
Some("status") => ParsedCommand::GitStatus {
cmd: main_cmd.to_vec(),
},
Some("log") => ParsedCommand::GitLog {
cmd: main_cmd.to_vec(),
},
Some("diff") => ParsedCommand::GitDiff {
cmd: main_cmd.to_vec(),
},
_ => ParsedCommand::Unknown {
cmd: main_cmd.to_vec(),
},
},
Some((tool, rest)) if (tool == "pnpm" || tool == "npm") => {
let mut r = rest;
let mut has_r = false;
if let Some(flag) = r.first() {
if flag == "-r" {
has_r = true;
r = &r[1..];
}
}
if r.first().map(|s| s.as_str()) == Some("run") {
let args = r[1..].to_vec();
// For display, only include the script name before any "--" forwarded args.
let script_name = args.first().cloned().unwrap_or_default();
let pnpm_cmd = script_name;
let mut full = vec![tool.clone()];
if has_r {
full.push("-r".to_string());
}
full.push("run".to_string());
full.extend(args.clone());
ParsedCommand::Pnpm {
cmd: full,
pnpm_cmd,
}
} else {
ParsedCommand::Unknown {
cmd: main_cmd.to_vec(),
}
}
}
_ => ParsedCommand::Unknown {
cmd: main_cmd.to_vec(),
},
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used)]
use super::*;
fn vec_str(args: &[&str]) -> Vec<String> {
args.iter().map(|s| s.to_string()).collect()
}
#[test]
fn git_status_summary() {
let out = parse_command(&vec_str(&["git", "status"]));
assert_eq!(
out,
vec![ParsedCommand::GitStatus {
cmd: vec_str(&["git", "status"]),
}]
);
}
#[test]
fn handles_complex_bash_command() {
let inner =
"rg --version && node -v && pnpm -v && rg --files | wc -l && rg --files | head -n 40";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![
ParsedCommand::Unknown {
cmd: vec_str(&["head", "-n", "40"])
},
ParsedCommand::Rg {
cmd: vec_str(&["rg", "--files"]),
query: None,
path: None,
files_only: true,
},
ParsedCommand::Unknown {
cmd: vec_str(&["wc", "-l"])
},
ParsedCommand::Rg {
cmd: vec_str(&["rg", "--files"]),
query: None,
path: None,
files_only: true,
},
ParsedCommand::Unknown {
cmd: vec_str(&["pnpm", "-v"])
},
ParsedCommand::Unknown {
cmd: vec_str(&["node", "-v"])
},
ParsedCommand::Rg {
cmd: vec_str(&["rg", "--version"]),
query: None,
path: None,
files_only: false,
},
]
);
}
#[test]
fn supports_searching_for_navigate_to_route() {
let inner = "rg -n \"navigate-to-route\" -S";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![ParsedCommand::Rg {
cmd: shlex_split(inner).unwrap(),
query: Some("navigate-to-route".to_string()),
path: None,
files_only: false,
}]
);
}
#[test]
fn supports_rg_files_with_path_and_pipe() {
let inner = "rg --files webview/src | sed -n";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![
ParsedCommand::Unknown {
cmd: vec_str(&["sed", "-n"])
},
ParsedCommand::Rg {
cmd: vec_str(&["rg", "--files", "webview/src"]),
query: None,
path: Some("webview".to_string()),
files_only: true,
},
]
);
}
#[test]
fn supports_rg_files_then_head() {
let inner = "rg --files | head -n 50";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![
ParsedCommand::Unknown {
cmd: vec_str(&["head", "-n", "50"])
},
ParsedCommand::Rg {
cmd: vec_str(&["rg", "--files"]),
query: None,
path: None,
files_only: true,
},
]
);
}
#[test]
fn supports_cat() {
let inner = "cat webview/README.md";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![ParsedCommand::Read {
cmd: shlex_split(inner).unwrap(),
name: "README.md".to_string(),
}]
);
}
#[test]
fn supports_ls_with_pipe() {
let inner = "ls -la | sed -n '1,120p'";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![
ParsedCommand::Unknown {
cmd: vec_str(&["sed", "-n", "1,120p"])
},
ParsedCommand::Ls {
cmd: shlex_split(inner).unwrap(),
path: None,
},
]
);
}
#[test]
fn supports_head_n() {
let inner = "head -n 50 Cargo.toml";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![ParsedCommand::Read {
cmd: shlex_split(inner).unwrap(),
name: "Cargo.toml".to_string(),
},]
);
}
#[test]
fn supports_tail_n_plus() {
let inner = "tail -n +522 README.md";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![ParsedCommand::Read {
cmd: shlex_split(inner).unwrap(),
name: "README.md".to_string(),
}]
);
}
#[test]
fn supports_tail_n_last_lines() {
let inner = "tail -n 30 README.md";
let out = parse_command(&vec_str(&["bash", "-lc", inner]));
assert_eq!(
out,
vec![ParsedCommand::Read {
cmd: shlex_split(inner).unwrap(),
name: "README.md".to_string(),
}]
);
}
#[test]
fn supports_npm_run_build() {
let out = parse_command(&vec_str(&["npm", "run", "build"]));
assert_eq!(
out,
vec![ParsedCommand::Pnpm {
cmd: vec_str(&["npm", "run", "build"]),
pnpm_cmd: "build".to_string(),
}]
);
}
#[test]
fn supports_npm_run_with_forwarded_args() {
let out = parse_command(&vec_str(&[
"npm",
"run",
"lint",
"--",
"--max-warnings",
"0",
"--format",
"json",
]));
assert_eq!(
out,
vec![ParsedCommand::Pnpm {
cmd: vec_str(&[
"npm",
"run",
"lint",
"--",
"--max-warnings",
"0",
"--format",
"json",
]),
pnpm_cmd: "lint".to_string(),
}]
);
}
#[test]
fn supports_grep_recursive_current_dir() {
let out = parse_command(&vec_str(&[
"grep",
"-R",
"CODEX_SANDBOX_ENV_VAR",
"-n",
".",
]));
assert_eq!(
out,
vec![ParsedCommand::Rg {
cmd: vec_str(&["grep", "-R", "CODEX_SANDBOX_ENV_VAR", "-n", "."]),
query: Some("CODEX_SANDBOX_ENV_VAR".to_string()),
path: Some(".".to_string()),
files_only: false,
}]
);
}
#[test]
fn supports_grep_recursive_specific_file() {
let out = parse_command(&vec_str(&[
"grep",
"-R",
"CODEX_SANDBOX_ENV_VAR",
"-n",
"core/src/spawn.rs",
]));
assert_eq!(
out,
vec![ParsedCommand::Rg {
cmd: vec_str(&[
"grep",
"-R",
"CODEX_SANDBOX_ENV_VAR",
"-n",
"core/src/spawn.rs",
]),
query: Some("CODEX_SANDBOX_ENV_VAR".to_string()),
path: Some("spawn.rs".to_string()),
files_only: false,
}]
);
}
#[test]
fn supports_grep_weird_backtick_in_query() {
let out = parse_command(&vec_str(&["grep", "-R", "COD`EX_SANDBOX", "-n"]));
assert_eq!(
out,
vec![ParsedCommand::Rg {
cmd: vec_str(&["grep", "-R", "COD`EX_SANDBOX", "-n"]),
query: Some("COD`EX_SANDBOX".to_string()),
path: None,
files_only: false,
}]
);
}
#[test]
fn supports_cd_and_rg_files() {
let out = parse_command(&vec_str(&["cd", "codex-rs", "&&", "rg", "--files"]));
assert_eq!(
out,
vec![
ParsedCommand::Unknown {
cmd: vec_str(&["cd", "codex-rs"]),
},
ParsedCommand::Rg {
cmd: vec_str(&["rg", "--files"]),
query: None,
path: None,
files_only: true,
},
]
);
}
}

View File

@@ -0,0 +1,133 @@
use std::collections::BTreeMap;
use std::sync::LazyLock;
use serde::Deserialize;
use serde::Serialize;
use crate::codex::Session;
use crate::models::FunctionCallOutputPayload;
use crate::models::ResponseInputItem;
use crate::openai_tools::JsonSchema;
use crate::openai_tools::OpenAiTool;
use crate::openai_tools::ResponsesApiTool;
use crate::protocol::Event;
use crate::protocol::EventMsg;
// Types for the TODO tool arguments matching codex-vscode/todo-mcp/src/main.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum StepStatus {
Pending,
InProgress,
Completed,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct PlanItemArg {
pub step: String,
pub status: StepStatus,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct UpdatePlanArgs {
#[serde(default)]
pub explanation: Option<String>,
pub plan: Vec<PlanItemArg>,
}
pub(crate) static PLAN_TOOL: LazyLock<OpenAiTool> = LazyLock::new(|| {
let mut plan_item_props = BTreeMap::new();
plan_item_props.insert("step".to_string(), JsonSchema::String { description: None });
plan_item_props.insert(
"status".to_string(),
JsonSchema::String { description: None },
);
let plan_items_schema = JsonSchema::Array {
description: Some("The list of steps".to_string()),
items: Box::new(JsonSchema::Object {
properties: plan_item_props,
required: Some(vec!["step".to_string(), "status".to_string()]),
additional_properties: Some(false),
}),
};
let mut properties = BTreeMap::new();
properties.insert(
"explanation".to_string(),
JsonSchema::String { description: None },
);
properties.insert("plan".to_string(), plan_items_schema);
OpenAiTool::Function(ResponsesApiTool {
name: "update_plan".to_string(),
description: r#"Use the update_plan tool to keep the user updated on the current plan for the task.
After understanding the user's task, call the update_plan tool with an initial plan. An example of a plan:
1. Explore the codebase to find relevant files (status: in_progress)
2. Implement the feature in the XYZ component (status: pending)
3. Commit changes and make a pull request (status: pending)
Each step should be a short, 1-sentence description.
Until all the steps are finished, there should always be exactly one in_progress step in the plan.
Call the update_plan tool whenever you finish a step, marking the completed step as `completed` and marking the next step as `in_progress`.
Before running a command, consider whether or not you have completed the previous step, and make sure to mark it as completed before moving on to the next step.
Sometimes, you may need to change plans in the middle of a task: call `update_plan` with the updated plan and make sure to provide an `explanation` of the rationale when doing so.
When all steps are completed, call update_plan one last time with all steps marked as `completed`."#.to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["plan".to_string()]),
additional_properties: Some(false),
},
})
});
/// This function doesn't do anything useful. However, it gives the model a structured way to record its plan that clients can read and render.
/// So it's the _inputs_ to this function that are useful to clients, not the outputs and neither are actually useful for the model other
/// than forcing it to come up and document a plan (TBD how that affects performance).
pub(crate) async fn handle_update_plan(
session: &Session,
arguments: String,
sub_id: String,
call_id: String,
) -> ResponseInputItem {
match parse_update_plan_arguments(arguments, &call_id) {
Ok(args) => {
let output = ResponseInputItem::FunctionCallOutput {
call_id,
output: FunctionCallOutputPayload {
content: "Plan updated".to_string(),
success: Some(true),
},
};
session
.send_event(Event {
id: sub_id.to_string(),
msg: EventMsg::PlanUpdate(args),
})
.await;
output
}
Err(output) => *output,
}
}
fn parse_update_plan_arguments(
arguments: String,
call_id: &str,
) -> Result<UpdatePlanArgs, Box<ResponseInputItem>> {
match serde_json::from_str::<UpdatePlanArgs>(&arguments) {
Ok(args) => Ok(args),
Err(e) => {
let output = ResponseInputItem::FunctionCallOutput {
call_id: call_id.to_string(),
output: FunctionCallOutputPayload {
content: format!("failed to parse function arguments: {e}"),
success: None,
},
};
Err(Box::new(output))
}
}
}

View File

@@ -27,16 +27,16 @@ const PROJECT_DOC_SEPARATOR: &str = "\n\n--- project-doc ---\n\n";
/// string of instructions.
pub(crate) async fn get_user_instructions(config: &Config) -> Option<String> {
match find_project_doc(config).await {
Ok(Some(project_doc)) => match &config.instructions {
Ok(Some(project_doc)) => match &config.user_instructions {
Some(original_instructions) => Some(format!(
"{original_instructions}{PROJECT_DOC_SEPARATOR}{project_doc}"
)),
None => Some(project_doc),
},
Ok(None) => config.instructions.clone(),
Ok(None) => config.user_instructions.clone(),
Err(e) => {
error!("error trying to find project doc: {e:#}");
config.instructions.clone()
config.user_instructions.clone()
}
}
}
@@ -159,7 +159,7 @@ mod tests {
config.cwd = root.path().to_path_buf();
config.project_doc_max_bytes = limit;
config.instructions = instructions.map(ToOwned::to_owned);
config.user_instructions = instructions.map(ToOwned::to_owned);
config
}

View File

@@ -4,19 +4,25 @@
//! between user and agent.
use std::collections::HashMap;
use std::fmt;
use std::path::Path;
use std::path::PathBuf;
use std::str::FromStr;
use std::time::Duration;
use mcp_types::CallToolResult;
use serde::Deserialize;
use serde::Serialize;
use serde_bytes::ByteBuf;
use strum_macros::Display;
use uuid::Uuid;
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::message_history::HistoryEntry;
use crate::model_provider_info::ModelProviderInfo;
use crate::parse_command::ParsedCommand;
use crate::plan_tool::UpdatePlanArgs;
/// Submission Queue Entry - requests from user
#[derive(Debug, Clone, Deserialize, Serialize)]
@@ -44,8 +50,12 @@ pub enum Op {
model_reasoning_effort: ReasoningEffortConfig,
model_reasoning_summary: ReasoningSummaryConfig,
/// Model instructions
instructions: Option<String>,
/// Model instructions that are appended to the base instructions.
user_instructions: Option<String>,
/// Base instructions override.
base_instructions: Option<String>,
/// When to escalate for approval for execution
approval_policy: AskForApproval,
/// How to sandbox commands executed in the system
@@ -69,6 +79,10 @@ pub enum Op {
/// `ConfigureSession` operation so that the business-logic layer can
/// operate deterministically.
cwd: std::path::PathBuf,
/// Path to a rollout file to resume from.
#[serde(skip_serializing_if = "Option::is_none")]
resume_path: Option<std::path::PathBuf>,
},
/// Abort current task.
@@ -108,18 +122,26 @@ pub enum Op {
/// Request a single history entry identified by `log_id` + `offset`.
GetHistoryEntryRequest { offset: usize, log_id: u64 },
/// Request the agent to summarize the current conversation context.
/// The agent will use its existing context (either conversation history or previous response id)
/// to generate a summary which will be returned as an AgentMessage event.
Compact,
/// Request to shut down codex instance.
Shutdown,
}
/// Determines the conditions under which the user is consulted to approve
/// running the command proposed by Codex.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Hash, Serialize, Deserialize, Display)]
#[serde(rename_all = "kebab-case")]
#[strum(serialize_all = "kebab-case")]
pub enum AskForApproval {
/// Under this policy, only "known safe" commands—as determined by
/// `is_safe_command()`—that **only read files** are autoapproved.
/// Everything else will ask the user to approve.
#[default]
#[serde(rename = "untrusted")]
#[strum(serialize = "untrusted")]
UnlessTrusted,
/// *All* commands are autoapproved, but they are expected to run inside a
@@ -128,13 +150,18 @@ pub enum AskForApproval {
/// the user to approve execution without a sandbox.
OnFailure,
/// The model decides when to ask the user for approval.
#[default]
OnRequest,
/// Never ask the user to approve commands. Failures are immediately returned
/// to the model, and never escalated to the user for approval.
Never,
}
/// Determines execution restrictions for model shell commands.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, Display)]
#[strum(serialize_all = "kebab-case")]
#[serde(tag = "mode", rename_all = "kebab-case")]
pub enum SandboxPolicy {
/// No restrictions whatsoever. Use with caution.
@@ -158,9 +185,30 @@ pub enum SandboxPolicy {
/// default.
#[serde(default)]
network_access: bool,
/// When set to `true`, will NOT include the per-user `TMPDIR`
/// environment variable among the default writable roots. Defaults to
/// `false`.
#[serde(default)]
exclude_tmpdir_env_var: bool,
/// When set to `true`, will NOT include the `/tmp` among the default
/// writable roots on UNIX. Defaults to `false`.
#[serde(default)]
exclude_slash_tmp: bool,
},
}
/// A writable root path accompanied by a list of subpaths that should remain
/// readonly even when the root is writable. This is primarily used to ensure
/// toplevel VCS metadata directories (e.g. `.git`) under a writable root are
/// not modified by the agent.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WritableRoot {
pub root: PathBuf,
pub read_only_subpaths: Vec<PathBuf>,
}
impl FromStr for SandboxPolicy {
type Err = serde_json::Error;
@@ -182,6 +230,8 @@ impl SandboxPolicy {
SandboxPolicy::WorkspaceWrite {
writable_roots: vec![],
network_access: false,
exclude_tmpdir_env_var: false,
exclude_slash_tmp: false,
}
}
@@ -207,27 +257,64 @@ impl SandboxPolicy {
}
}
/// Returns the list of writable roots that should be passed down to the
/// Landlock rules installer, tailored to the current working directory.
pub fn get_writable_roots_with_cwd(&self, cwd: &Path) -> Vec<PathBuf> {
/// Returns the list of writable roots (tailored to the current working
/// directory) together with subpaths that should remain readonly under
/// each writable root.
pub fn get_writable_roots_with_cwd(&self, cwd: &Path) -> Vec<WritableRoot> {
match self {
SandboxPolicy::DangerFullAccess => Vec::new(),
SandboxPolicy::ReadOnly => Vec::new(),
SandboxPolicy::WorkspaceWrite { writable_roots, .. } => {
let mut roots = writable_roots.clone();
SandboxPolicy::WorkspaceWrite {
writable_roots,
exclude_tmpdir_env_var,
exclude_slash_tmp,
network_access: _,
} => {
// Start from explicitly configured writable roots.
let mut roots: Vec<PathBuf> = writable_roots.clone();
// Always include defaults: cwd, /tmp (if present on Unix), and
// on macOS, the per-user TMPDIR unless explicitly excluded.
roots.push(cwd.to_path_buf());
// Also include the per-user tmp dir on macOS.
// Note this is added dynamically rather than storing it in
// writable_roots because writable_roots contains only static
// values deserialized from the config file.
if cfg!(target_os = "macos") {
if let Some(tmpdir) = std::env::var_os("TMPDIR") {
roots.push(PathBuf::from(tmpdir));
// Include /tmp on Unix unless explicitly excluded.
if cfg!(unix) && !exclude_slash_tmp {
let slash_tmp = PathBuf::from("/tmp");
if slash_tmp.is_dir() {
roots.push(slash_tmp);
}
}
// Include $TMPDIR unless explicitly excluded. On macOS, TMPDIR
// is per-user, so writes to TMPDIR should not be readable by
// other users on the system.
//
// By comparison, TMPDIR is not guaranteed to be defined on
// Linux or Windows, but supporting it here gives users a way to
// provide the model with their own temporary directory without
// having to hardcode it in the config.
if !exclude_tmpdir_env_var
&& let Some(tmpdir) = std::env::var_os("TMPDIR")
&& !tmpdir.is_empty()
{
roots.push(PathBuf::from(tmpdir));
}
// For each root, compute subpaths that should remain read-only.
roots
.into_iter()
.map(|writable_root| {
let mut subpaths = Vec::new();
let top_level_git = writable_root.join(".git");
if top_level_git.is_dir() {
subpaths.push(top_level_git);
}
WritableRoot {
root: writable_root,
read_only_subpaths: subpaths,
}
})
.collect()
}
}
}
@@ -263,8 +350,9 @@ pub struct Event {
}
/// Response event from the agent
#[derive(Debug, Clone, Deserialize, Serialize)]
#[derive(Debug, Clone, Deserialize, Serialize, Display)]
#[serde(tag = "type", rename_all = "snake_case")]
#[strum(serialize_all = "snake_case")]
pub enum EventMsg {
/// Error while executing a submission
Error(ErrorEvent),
@@ -282,9 +370,21 @@ pub enum EventMsg {
/// Agent text output message
AgentMessage(AgentMessageEvent),
/// Agent text output delta message
AgentMessageDelta(AgentMessageDeltaEvent),
/// Reasoning event from agent.
AgentReasoning(AgentReasoningEvent),
/// Agent reasoning delta event from agent.
AgentReasoningDelta(AgentReasoningDeltaEvent),
/// Raw chain-of-thought from agent.
AgentReasoningRawContent(AgentReasoningRawContentEvent),
/// Agent reasoning content delta event from agent.
AgentReasoningRawContentDelta(AgentReasoningRawContentDeltaEvent),
/// Ack the client's configure message.
SessionConfigured(SessionConfiguredEvent),
@@ -295,6 +395,9 @@ pub enum EventMsg {
/// Notification that the server is about to execute a command.
ExecCommandBegin(ExecCommandBeginEvent),
/// Incremental chunk of output from a running command.
ExecCommandOutputDelta(ExecCommandOutputDeltaEvent),
ExecCommandEnd(ExecCommandEndEvent),
ExecApprovalRequest(ExecApprovalRequestEvent),
@@ -310,8 +413,15 @@ pub enum EventMsg {
/// Notification that a patch application has finished.
PatchApplyEnd(PatchApplyEndEvent),
TurnDiff(TurnDiffEvent),
/// Response to GetHistoryEntryRequest.
GetHistoryEntryResponse(GetHistoryEntryResponseEvent),
PlanUpdate(UpdatePlanArgs),
/// Notification that the agent is shutting down.
ShutdownComplete,
}
// Individual event payload types matching each `EventMsg` variant.
@@ -335,20 +445,99 @@ pub struct TokenUsage {
pub total_tokens: u64,
}
impl TokenUsage {
pub fn is_zero(&self) -> bool {
self.total_tokens == 0
}
pub fn cached_input(&self) -> u64 {
self.cached_input_tokens.unwrap_or(0)
}
pub fn non_cached_input(&self) -> u64 {
self.input_tokens.saturating_sub(self.cached_input())
}
/// Primary count for display as a single absolute value: non-cached input + output.
pub fn blended_total(&self) -> u64 {
self.non_cached_input() + self.output_tokens
}
/// For estimating what % of the model's context window is used, we need to account
/// for reasoning output tokens from prior turns being dropped from the context window.
/// We approximate this here by subtracting reasoning output tokens from the total.
/// This will be off for the current turn and pending function calls.
pub fn tokens_in_context_window(&self) -> u64 {
self.total_tokens
.saturating_sub(self.reasoning_output_tokens.unwrap_or(0))
}
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct FinalOutput {
pub token_usage: TokenUsage,
}
impl From<TokenUsage> for FinalOutput {
fn from(token_usage: TokenUsage) -> Self {
Self { token_usage }
}
}
impl fmt::Display for FinalOutput {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let token_usage = &self.token_usage;
write!(
f,
"Token usage: total={} input={}{} output={}{}",
token_usage.blended_total(),
token_usage.non_cached_input(),
if token_usage.cached_input() > 0 {
format!(" (+ {} cached)", token_usage.cached_input())
} else {
String::new()
},
token_usage.output_tokens,
token_usage
.reasoning_output_tokens
.map(|r| format!(" (reasoning {r})"))
.unwrap_or_default()
)
}
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct AgentMessageEvent {
pub message: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct AgentMessageDeltaEvent {
pub delta: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct AgentReasoningEvent {
pub text: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct McpToolCallBeginEvent {
/// Identifier so this can be paired with the McpToolCallEnd event.
pub call_id: String,
pub struct AgentReasoningRawContentEvent {
pub text: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct AgentReasoningRawContentDeltaEvent {
pub delta: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct AgentReasoningDeltaEvent {
pub delta: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct McpInvocation {
/// Name of the MCP server as defined in the config.
pub server: String,
/// Name of the tool as given by the MCP server.
@@ -357,10 +546,19 @@ pub struct McpToolCallBeginEvent {
pub arguments: Option<serde_json::Value>,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct McpToolCallBeginEvent {
/// Identifier so this can be paired with the McpToolCallEnd event.
pub call_id: String,
pub invocation: McpInvocation,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct McpToolCallEndEvent {
/// Identifier for the corresponding McpToolCallBegin that finished.
pub call_id: String,
pub invocation: McpInvocation,
pub duration: Duration,
/// Result of the tool call. Note this could be an error.
pub result: Result<CallToolResult, String>,
}
@@ -382,6 +580,7 @@ pub struct ExecCommandBeginEvent {
pub command: Vec<String>,
/// The command's working directory if not the default cwd for the agent.
pub cwd: PathBuf,
pub parsed_cmd: Vec<ParsedCommand>,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
@@ -394,10 +593,32 @@ pub struct ExecCommandEndEvent {
pub stderr: String,
/// The command's exit code.
pub exit_code: i32,
/// The duration of the command execution.
pub duration: Duration,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
#[serde(rename_all = "snake_case")]
pub enum ExecOutputStream {
Stdout,
Stderr,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct ExecCommandOutputDeltaEvent {
/// Identifier for the ExecCommandBegin that produced this chunk.
pub call_id: String,
/// Which stream produced this chunk.
pub stream: ExecOutputStream,
/// Raw bytes from the stream (may not be valid UTF-8).
#[serde(with = "serde_bytes")]
pub chunk: ByteBuf,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct ExecApprovalRequestEvent {
/// Identifier for the associated exec call, if available.
pub call_id: String,
/// The command to be executed.
pub command: Vec<String>,
/// The command's working directory.
@@ -409,6 +630,8 @@ pub struct ExecApprovalRequestEvent {
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct ApplyPatchApprovalRequestEvent {
/// Responses API call id for the associated patch apply call, if available.
pub call_id: String,
pub changes: HashMap<PathBuf, FileChange>,
/// Optional explanatory reason (e.g. request for extra write access).
#[serde(skip_serializing_if = "Option::is_none")]
@@ -445,6 +668,11 @@ pub struct PatchApplyEndEvent {
pub success: bool,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct TurnDiffEvent {
pub unified_diff: String,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct GetHistoryEntryResponseEvent {
pub offset: usize,

View File

@@ -1,33 +1,57 @@
//! Functionality to persist a Codex conversation *rollout* a linear list of
//! [`ResponseItem`] objects exchanged during a session to disk so that
//! sessions can be replayed or inspected later (mirrors the behaviour of the
//! upstream TypeScript implementation).
//! Persist Codex session rollouts (.jsonl) so sessions can be replayed or inspected later.
use std::fs::File;
use std::fs::{self};
use std::io::Error as IoError;
use std::path::Path;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value;
use time::OffsetDateTime;
use time::format_description::FormatItem;
use time::macros::format_description;
use tokio::io::AsyncWriteExt;
use tokio::sync::mpsc::Sender;
use tokio::sync::mpsc::{self};
use tokio::sync::oneshot;
use tracing::info;
use tracing::warn;
use uuid::Uuid;
use crate::config::Config;
use crate::git_info::GitInfo;
use crate::git_info::collect_git_info;
use crate::models::ResponseItem;
/// Folder inside `~/.codex` that holds saved rollouts.
const SESSIONS_SUBDIR: &str = "sessions";
#[derive(Serialize, Deserialize, Clone, Default)]
pub struct SessionMeta {
pub id: Uuid,
pub timestamp: String,
pub instructions: Option<String>,
}
#[derive(Serialize)]
struct SessionMeta {
id: String,
timestamp: String,
struct SessionMetaWithGit {
#[serde(flatten)]
meta: SessionMeta,
#[serde(skip_serializing_if = "Option::is_none")]
instructions: Option<String>,
git: Option<GitInfo>,
}
#[derive(Serialize, Deserialize, Default, Clone)]
pub struct SessionStateSnapshot {}
#[derive(Serialize, Deserialize, Default, Clone)]
pub struct SavedSession {
pub session: SessionMeta,
#[serde(default)]
pub items: Vec<ResponseItem>,
#[serde(default)]
pub state: SessionStateSnapshot,
pub session_id: Uuid,
}
/// Records all [`ResponseItem`]s for a session and flushes them to disk after
@@ -41,7 +65,13 @@ struct SessionMeta {
/// ```
#[derive(Clone)]
pub(crate) struct RolloutRecorder {
tx: Sender<String>,
tx: Sender<RolloutCmd>,
}
enum RolloutCmd {
AddItems(Vec<ResponseItem>),
UpdateState(SessionStateSnapshot),
Shutdown { ack: oneshot::Sender<()> },
}
impl RolloutRecorder {
@@ -59,7 +89,6 @@ impl RolloutRecorder {
timestamp,
} = create_log_file(config, uuid)?;
// Build the static session metadata JSON first.
let timestamp_format: &[FormatItem] = format_description!(
"[year]-[month]-[day]T[hour]:[minute]:[second].[subsecond digits:3]Z"
);
@@ -67,48 +96,33 @@ impl RolloutRecorder {
.format(timestamp_format)
.map_err(|e| IoError::other(format!("failed to format timestamp: {e}")))?;
let meta = SessionMeta {
timestamp,
id: session_id.to_string(),
instructions,
};
// Clone the cwd for the spawned task to collect git info asynchronously
let cwd = config.cwd.clone();
// A reasonably-sized bounded channel. If the buffer fills up the send
// future will yield, which is fine we only need to ensure we do not
// perform *blocking* I/O on the callers thread.
let (tx, mut rx) = mpsc::channel::<String>(256);
// perform *blocking* I/O on the caller's thread.
let (tx, rx) = mpsc::channel::<RolloutCmd>(256);
// Spawn a Tokio task that owns the file handle and performs async
// writes. Using `tokio::fs::File` keeps everything on the async I/O
// driver instead of blocking the runtime.
tokio::task::spawn(async move {
let mut file = tokio::fs::File::from_std(file);
tokio::task::spawn(rollout_writer(
tokio::fs::File::from_std(file),
rx,
Some(SessionMeta {
timestamp,
id: session_id,
instructions,
}),
cwd,
));
while let Some(line) = rx.recv().await {
// Write line + newline, then flush to disk.
if let Err(e) = file.write_all(line.as_bytes()).await {
tracing::warn!("rollout writer: failed to write line: {e}");
break;
}
if let Err(e) = file.write_all(b"\n").await {
tracing::warn!("rollout writer: failed to write newline: {e}");
break;
}
if let Err(e) = file.flush().await {
tracing::warn!("rollout writer: failed to flush: {e}");
break;
}
}
});
let recorder = Self { tx };
// Ensure SessionMeta is the first item in the file.
recorder.record_item(&meta).await?;
Ok(recorder)
Ok(Self { tx })
}
/// Append `items` to the rollout file.
pub(crate) async fn record_items(&self, items: &[ResponseItem]) -> std::io::Result<()> {
let mut filtered = Vec::new();
for item in items {
match item {
// Note that function calls may look a bit strange if they are
@@ -117,27 +131,114 @@ impl RolloutRecorder {
ResponseItem::Message { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::FunctionCall { .. }
| ResponseItem::FunctionCallOutput { .. } => {}
ResponseItem::Reasoning { .. } | ResponseItem::Other => {
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::Reasoning { .. } => filtered.push(item.clone()),
ResponseItem::Other => {
// These should never be serialized.
continue;
}
}
self.record_item(item).await?;
}
Ok(())
if filtered.is_empty() {
return Ok(());
}
self.tx
.send(RolloutCmd::AddItems(filtered))
.await
.map_err(|e| IoError::other(format!("failed to queue rollout items: {e}")))
}
async fn record_item(&self, item: &impl Serialize) -> std::io::Result<()> {
// Serialize the item to JSON first so that the writer thread only has
// to perform the actual write.
let json = serde_json::to_string(item)
.map_err(|e| IoError::other(format!("failed to serialize response items: {e}")))?;
pub(crate) async fn record_state(&self, state: SessionStateSnapshot) -> std::io::Result<()> {
self.tx
.send(json)
.send(RolloutCmd::UpdateState(state))
.await
.map_err(|e| IoError::other(format!("failed to queue rollout item: {e}")))
.map_err(|e| IoError::other(format!("failed to queue rollout state: {e}")))
}
pub async fn resume(
path: &Path,
cwd: std::path::PathBuf,
) -> std::io::Result<(Self, SavedSession)> {
info!("Resuming rollout from {path:?}");
let text = tokio::fs::read_to_string(path).await?;
let mut lines = text.lines();
let meta_line = lines
.next()
.ok_or_else(|| IoError::other("empty session file"))?;
let session: SessionMeta = serde_json::from_str(meta_line)
.map_err(|e| IoError::other(format!("failed to parse session meta: {e}")))?;
let mut items = Vec::new();
let mut state = SessionStateSnapshot::default();
for line in lines {
if line.trim().is_empty() {
continue;
}
let v: Value = match serde_json::from_str(line) {
Ok(v) => v,
Err(_) => continue,
};
if v.get("record_type")
.and_then(|rt| rt.as_str())
.map(|s| s == "state")
.unwrap_or(false)
{
if let Ok(s) = serde_json::from_value::<SessionStateSnapshot>(v.clone()) {
state = s
}
continue;
}
match serde_json::from_value::<ResponseItem>(v.clone()) {
Ok(item) => match item {
ResponseItem::Message { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::FunctionCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::Reasoning { .. } => items.push(item),
ResponseItem::Other => {}
},
Err(e) => {
warn!("failed to parse item: {v:?}, error: {e}");
}
}
}
let saved = SavedSession {
session: session.clone(),
items: items.clone(),
state: state.clone(),
session_id: session.id,
};
let file = std::fs::OpenOptions::new()
.append(true)
.read(true)
.open(path)?;
let (tx, rx) = mpsc::channel::<RolloutCmd>(256);
tokio::task::spawn(rollout_writer(
tokio::fs::File::from_std(file),
rx,
None,
cwd,
));
info!("Resumed rollout successfully from {path:?}");
Ok((Self { tx }, saved))
}
pub async fn shutdown(&self) -> std::io::Result<()> {
let (tx_done, rx_done) = oneshot::channel();
match self.tx.send(RolloutCmd::Shutdown { ack: tx_done }).await {
Ok(_) => rx_done
.await
.map_err(|e| IoError::other(format!("failed waiting for rollout shutdown: {e}"))),
Err(e) => {
warn!("failed to send rollout shutdown command: {e}");
Err(IoError::other(format!(
"failed to send rollout shutdown command: {e}"
)))
}
}
}
}
@@ -153,13 +254,15 @@ struct LogFileInfo {
}
fn create_log_file(config: &Config, session_id: Uuid) -> std::io::Result<LogFileInfo> {
// Resolve ~/.codex/sessions and create it if missing.
let mut dir = config.codex_home.clone();
dir.push(SESSIONS_SUBDIR);
fs::create_dir_all(&dir)?;
// Resolve ~/.codex/sessions/YYYY/MM/DD and create it if missing.
let timestamp = OffsetDateTime::now_local()
.map_err(|e| IoError::other(format!("failed to get local time: {e}")))?;
let mut dir = config.codex_home.clone();
dir.push(SESSIONS_SUBDIR);
dir.push(timestamp.year().to_string());
dir.push(format!("{:02}", u8::from(timestamp.month())));
dir.push(format!("{:02}", timestamp.day()));
fs::create_dir_all(&dir)?;
// Custom format for YYYY-MM-DDThh-mm-ss. Use `-` instead of `:` for
// compatibility with filesystems that do not allow colons in filenames.
@@ -183,3 +286,77 @@ fn create_log_file(config: &Config, session_id: Uuid) -> std::io::Result<LogFile
timestamp,
})
}
async fn rollout_writer(
file: tokio::fs::File,
mut rx: mpsc::Receiver<RolloutCmd>,
mut meta: Option<SessionMeta>,
cwd: std::path::PathBuf,
) -> std::io::Result<()> {
let mut writer = JsonlWriter { file };
// If we have a meta, collect git info asynchronously and write meta first
if let Some(session_meta) = meta.take() {
let git_info = collect_git_info(&cwd).await;
let session_meta_with_git = SessionMetaWithGit {
meta: session_meta,
git: git_info,
};
// Write the SessionMeta as the first item in the file
writer.write_line(&session_meta_with_git).await?;
}
// Process rollout commands
while let Some(cmd) = rx.recv().await {
match cmd {
RolloutCmd::AddItems(items) => {
for item in items {
match item {
ResponseItem::Message { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::FunctionCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::Reasoning { .. } => {
writer.write_line(&item).await?;
}
ResponseItem::Other => {}
}
}
}
RolloutCmd::UpdateState(state) => {
#[derive(Serialize)]
struct StateLine<'a> {
record_type: &'static str,
#[serde(flatten)]
state: &'a SessionStateSnapshot,
}
writer
.write_line(&StateLine {
record_type: "state",
state: &state,
})
.await?;
}
RolloutCmd::Shutdown { ack } => {
let _ = ack.send(());
}
}
}
Ok(())
}
struct JsonlWriter {
file: tokio::fs::File,
}
impl JsonlWriter {
async fn write_line(&mut self, item: &impl serde::Serialize) -> std::io::Result<()> {
let mut json = serde_json::to_string(item)?;
json.push('\n');
let _ = self.file.write_all(json.as_bytes()).await;
self.file.flush().await?;
Ok(())
}
}

View File

@@ -11,7 +11,7 @@ use crate::is_safe_command::is_known_safe_command;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
#[derive(Debug)]
#[derive(Debug, PartialEq)]
pub enum SafetyCheck {
AutoApprove { sandbox_type: SandboxType },
AskUser,
@@ -31,7 +31,7 @@ pub fn assess_patch_safety(
}
match policy {
AskForApproval::OnFailure | AskForApproval::Never => {
AskForApproval::OnFailure | AskForApproval::Never | AskForApproval::OnRequest => {
// Continue to see if this can be auto-approved.
}
// TODO(ragona): I'm not sure this is actually correct? I believe in this case
@@ -41,11 +41,13 @@ pub fn assess_patch_safety(
}
}
if is_write_patch_constrained_to_writable_paths(action, writable_roots, cwd) {
SafetyCheck::AutoApprove {
sandbox_type: SandboxType::None,
}
} else if policy == AskForApproval::OnFailure {
// Even though the patch *appears* to be constrained to writable paths, it
// is possible that paths in the patch are hard links to files outside the
// writable roots, so we should still run `apply_patch` in a sandbox in that
// case.
if is_write_patch_constrained_to_writable_paths(action, writable_roots, cwd)
|| policy == AskForApproval::OnFailure
{
// Only autoapprove when we can actually enforce a sandbox. Otherwise
// fall back to asking the user because the patch may touch arbitrary
// paths outside the project.
@@ -74,10 +76,8 @@ pub fn assess_command_safety(
approval_policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
approved: &HashSet<Vec<String>>,
with_escalated_permissions: bool,
) -> SafetyCheck {
use AskForApproval::*;
use SandboxPolicy::*;
// A command is "trusted" because either:
// - it belongs to a set of commands we consider "safe" by default, or
// - the user has explicitly approved the command for this session
@@ -97,6 +97,17 @@ pub fn assess_command_safety(
};
}
assess_safety_for_untrusted_command(approval_policy, sandbox_policy, with_escalated_permissions)
}
pub(crate) fn assess_safety_for_untrusted_command(
approval_policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
with_escalated_permissions: bool,
) -> SafetyCheck {
use AskForApproval::*;
use SandboxPolicy::*;
match (approval_policy, sandbox_policy) {
(UnlessTrusted, _) => {
// Even though the user may have opted into DangerFullAccess,
@@ -104,9 +115,23 @@ pub fn assess_command_safety(
// commands.
SafetyCheck::AskUser
}
(OnFailure, DangerFullAccess) | (Never, DangerFullAccess) => SafetyCheck::AutoApprove {
(OnFailure, DangerFullAccess)
| (Never, DangerFullAccess)
| (OnRequest, DangerFullAccess) => SafetyCheck::AutoApprove {
sandbox_type: SandboxType::None,
},
(OnRequest, ReadOnly) | (OnRequest, WorkspaceWrite { .. }) => {
if with_escalated_permissions {
SafetyCheck::AskUser
} else {
match get_platform_sandbox() {
Some(sandbox_type) => SafetyCheck::AutoApprove { sandbox_type },
// Fall back to asking since the command is untrusted and
// we do not have a sandbox available
None => SafetyCheck::AskUser,
}
}
}
(Never, ReadOnly)
| (Never, WorkspaceWrite { .. })
| (OnFailure, ReadOnly)
@@ -255,4 +280,47 @@ mod tests {
&cwd,
))
}
#[test]
fn test_request_escalated_privileges() {
// Should not be a trusted command
let command = vec!["git commit".to_string()];
let approval_policy = AskForApproval::OnRequest;
let sandbox_policy = SandboxPolicy::ReadOnly;
let approved: HashSet<Vec<String>> = HashSet::new();
let request_escalated_privileges = true;
let safety_check = assess_command_safety(
&command,
approval_policy,
&sandbox_policy,
&approved,
request_escalated_privileges,
);
assert_eq!(safety_check, SafetyCheck::AskUser);
}
#[test]
fn test_request_escalated_privileges_no_sandbox_fallback() {
let command = vec!["git".to_string(), "commit".to_string()];
let approval_policy = AskForApproval::OnRequest;
let sandbox_policy = SandboxPolicy::ReadOnly;
let approved: HashSet<Vec<String>> = HashSet::new();
let request_escalated_privileges = false;
let safety_check = assess_command_safety(
&command,
approval_policy,
&sandbox_policy,
&approved,
request_escalated_privileges,
);
let expected = match get_platform_sandbox() {
Some(sandbox_type) => SafetyCheck::AutoApprove { sandbox_type },
None => SafetyCheck::AskUser,
};
assert_eq!(safety_check, expected);
}
}

View File

@@ -0,0 +1,333 @@
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
use tokio::process::Child;
use crate::protocol::SandboxPolicy;
use crate::spawn::CODEX_SANDBOX_ENV_VAR;
use crate::spawn::StdioPolicy;
use crate::spawn::spawn_child_async;
const MACOS_SEATBELT_BASE_POLICY: &str = include_str!("seatbelt_base_policy.sbpl");
/// When working with `sandbox-exec`, only consider `sandbox-exec` in `/usr/bin`
/// to defend against an attacker trying to inject a malicious version on the
/// PATH. If /usr/bin/sandbox-exec has been tampered with, then the attacker
/// already has root access.
const MACOS_PATH_TO_SEATBELT_EXECUTABLE: &str = "/usr/bin/sandbox-exec";
pub async fn spawn_command_under_seatbelt(
command: Vec<String>,
sandbox_policy: &SandboxPolicy,
cwd: PathBuf,
stdio_policy: StdioPolicy,
mut env: HashMap<String, String>,
) -> std::io::Result<Child> {
let args = create_seatbelt_command_args(command, sandbox_policy, &cwd);
let arg0 = None;
env.insert(CODEX_SANDBOX_ENV_VAR.to_string(), "seatbelt".to_string());
spawn_child_async(
PathBuf::from(MACOS_PATH_TO_SEATBELT_EXECUTABLE),
args,
arg0,
cwd,
sandbox_policy,
stdio_policy,
env,
)
.await
}
fn create_seatbelt_command_args(
command: Vec<String>,
sandbox_policy: &SandboxPolicy,
cwd: &Path,
) -> Vec<String> {
let (file_write_policy, extra_cli_args) = {
if sandbox_policy.has_full_disk_write_access() {
// Allegedly, this is more permissive than `(allow file-write*)`.
(
r#"(allow file-write* (regex #"^/"))"#.to_string(),
Vec::<String>::new(),
)
} else {
let writable_roots = sandbox_policy.get_writable_roots_with_cwd(cwd);
let mut writable_folder_policies: Vec<String> = Vec::new();
let mut cli_args: Vec<String> = Vec::new();
for (index, wr) in writable_roots.iter().enumerate() {
// Canonicalize to avoid mismatches like /var vs /private/var on macOS.
let canonical_root = wr.root.canonicalize().unwrap_or_else(|_| wr.root.clone());
let root_param = format!("WRITABLE_ROOT_{index}");
cli_args.push(format!(
"-D{root_param}={}",
canonical_root.to_string_lossy()
));
if wr.read_only_subpaths.is_empty() {
writable_folder_policies.push(format!("(subpath (param \"{root_param}\"))"));
} else {
// Add parameters for each read-only subpath and generate
// the `(require-not ...)` clauses.
let mut require_parts: Vec<String> = Vec::new();
require_parts.push(format!("(subpath (param \"{root_param}\"))"));
for (subpath_index, ro) in wr.read_only_subpaths.iter().enumerate() {
let canonical_ro = ro.canonicalize().unwrap_or_else(|_| ro.clone());
let ro_param = format!("WRITABLE_ROOT_{index}_RO_{subpath_index}");
cli_args.push(format!("-D{ro_param}={}", canonical_ro.to_string_lossy()));
require_parts
.push(format!("(require-not (subpath (param \"{ro_param}\")))"));
}
let policy_component = format!("(require-all {} )", require_parts.join(" "));
writable_folder_policies.push(policy_component);
}
}
if writable_folder_policies.is_empty() {
("".to_string(), Vec::<String>::new())
} else {
let file_write_policy = format!(
"(allow file-write*\n{}\n)",
writable_folder_policies.join(" ")
);
(file_write_policy, cli_args)
}
}
};
let file_read_policy = if sandbox_policy.has_full_disk_read_access() {
"; allow read-only file operations\n(allow file-read*)"
} else {
""
};
// TODO(mbolin): apply_patch calls must also honor the SandboxPolicy.
let network_policy = if sandbox_policy.has_full_network_access() {
"(allow network-outbound)\n(allow network-inbound)\n(allow system-socket)"
} else {
""
};
let full_policy = format!(
"{MACOS_SEATBELT_BASE_POLICY}\n{file_read_policy}\n{file_write_policy}\n{network_policy}"
);
let mut seatbelt_args: Vec<String> = vec!["-p".to_string(), full_policy];
seatbelt_args.extend(extra_cli_args);
seatbelt_args.push("--".to_string());
seatbelt_args.extend(command);
seatbelt_args
}
#[cfg(test)]
mod tests {
#![expect(clippy::expect_used)]
use super::MACOS_SEATBELT_BASE_POLICY;
use super::create_seatbelt_command_args;
use crate::protocol::SandboxPolicy;
use pretty_assertions::assert_eq;
use std::fs;
use std::path::Path;
use std::path::PathBuf;
use tempfile::TempDir;
#[test]
fn create_seatbelt_args_with_read_only_git_subpath() {
if cfg!(target_os = "windows") {
// /tmp does not exist on Windows, so skip this test.
return;
}
// Create a temporary workspace with two writable roots: one containing
// a top-level .git directory and one without it.
let tmp = TempDir::new().expect("tempdir");
let PopulatedTmp {
root_with_git,
root_without_git,
root_with_git_canon,
root_with_git_git_canon,
root_without_git_canon,
} = populate_tmpdir(tmp.path());
let cwd = tmp.path().join("cwd");
// Build a policy that only includes the two test roots as writable and
// does not automatically include defaults TMPDIR or /tmp.
let policy = SandboxPolicy::WorkspaceWrite {
writable_roots: vec![root_with_git.clone(), root_without_git.clone()],
network_access: false,
exclude_tmpdir_env_var: true,
exclude_slash_tmp: true,
};
let args = create_seatbelt_command_args(
vec!["/bin/echo".to_string(), "hello".to_string()],
&policy,
&cwd,
);
// Build the expected policy text using a raw string for readability.
// Note that the policy includes:
// - the base policy,
// - read-only access to the filesystem,
// - write access to WRITABLE_ROOT_0 (but not its .git) and WRITABLE_ROOT_1.
let expected_policy = format!(
r#"{MACOS_SEATBELT_BASE_POLICY}
; allow read-only file operations
(allow file-read*)
(allow file-write*
(require-all (subpath (param "WRITABLE_ROOT_0")) (require-not (subpath (param "WRITABLE_ROOT_0_RO_0"))) ) (subpath (param "WRITABLE_ROOT_1")) (subpath (param "WRITABLE_ROOT_2"))
)
"#,
);
let mut expected_args = vec![
"-p".to_string(),
expected_policy,
format!(
"-DWRITABLE_ROOT_0={}",
root_with_git_canon.to_string_lossy()
),
format!(
"-DWRITABLE_ROOT_0_RO_0={}",
root_with_git_git_canon.to_string_lossy()
),
format!(
"-DWRITABLE_ROOT_1={}",
root_without_git_canon.to_string_lossy()
),
format!("-DWRITABLE_ROOT_2={}", cwd.to_string_lossy()),
];
expected_args.extend(vec![
"--".to_string(),
"/bin/echo".to_string(),
"hello".to_string(),
]);
assert_eq!(expected_args, args);
}
#[test]
fn create_seatbelt_args_for_cwd_as_git_repo() {
if cfg!(target_os = "windows") {
// /tmp does not exist on Windows, so skip this test.
return;
}
// Create a temporary workspace with two writable roots: one containing
// a top-level .git directory and one without it.
let tmp = TempDir::new().expect("tempdir");
let PopulatedTmp {
root_with_git,
root_with_git_canon,
root_with_git_git_canon,
..
} = populate_tmpdir(tmp.path());
// Build a policy that does not specify any writable_roots, but does
// use the default ones (cwd and TMPDIR) and verifies the `.git` check
// is done properly for cwd.
let policy = SandboxPolicy::WorkspaceWrite {
writable_roots: vec![],
network_access: false,
exclude_tmpdir_env_var: false,
exclude_slash_tmp: false,
};
let args = create_seatbelt_command_args(
vec!["/bin/echo".to_string(), "hello".to_string()],
&policy,
root_with_git.as_path(),
);
let tmpdir_env_var = std::env::var("TMPDIR")
.ok()
.map(PathBuf::from)
.and_then(|p| p.canonicalize().ok())
.map(|p| p.to_string_lossy().to_string());
let tempdir_policy_entry = if tmpdir_env_var.is_some() {
r#" (subpath (param "WRITABLE_ROOT_2"))"#
} else {
""
};
// Build the expected policy text using a raw string for readability.
// Note that the policy includes:
// - the base policy,
// - read-only access to the filesystem,
// - write access to WRITABLE_ROOT_0 (but not its .git) and WRITABLE_ROOT_1.
let expected_policy = format!(
r#"{MACOS_SEATBELT_BASE_POLICY}
; allow read-only file operations
(allow file-read*)
(allow file-write*
(require-all (subpath (param "WRITABLE_ROOT_0")) (require-not (subpath (param "WRITABLE_ROOT_0_RO_0"))) ) (subpath (param "WRITABLE_ROOT_1")){tempdir_policy_entry}
)
"#,
);
let mut expected_args = vec![
"-p".to_string(),
expected_policy,
format!(
"-DWRITABLE_ROOT_0={}",
root_with_git_canon.to_string_lossy()
),
format!(
"-DWRITABLE_ROOT_0_RO_0={}",
root_with_git_git_canon.to_string_lossy()
),
format!(
"-DWRITABLE_ROOT_1={}",
PathBuf::from("/tmp")
.canonicalize()
.expect("canonicalize /tmp")
.to_string_lossy()
),
];
if let Some(p) = tmpdir_env_var {
expected_args.push(format!("-DWRITABLE_ROOT_2={p}"));
}
expected_args.extend(vec![
"--".to_string(),
"/bin/echo".to_string(),
"hello".to_string(),
]);
assert_eq!(expected_args, args);
}
struct PopulatedTmp {
root_with_git: PathBuf,
root_without_git: PathBuf,
root_with_git_canon: PathBuf,
root_with_git_git_canon: PathBuf,
root_without_git_canon: PathBuf,
}
fn populate_tmpdir(tmp: &Path) -> PopulatedTmp {
let root_with_git = tmp.join("with_git");
let root_without_git = tmp.join("no_git");
fs::create_dir_all(&root_with_git).expect("create with_git");
fs::create_dir_all(&root_without_git).expect("create no_git");
fs::create_dir_all(root_with_git.join(".git")).expect("create .git");
// Ensure we have canonical paths for -D parameter matching.
let root_with_git_canon = root_with_git.canonicalize().expect("canonicalize with_git");
let root_with_git_git_canon = root_with_git_canon.join(".git");
let root_without_git_canon = root_without_git
.canonicalize()
.expect("canonicalize no_git");
PopulatedTmp {
root_with_git,
root_without_git,
root_with_git_canon,
root_with_git_git_canon,
root_without_git_canon,
}
}
}

View File

@@ -65,3 +65,7 @@
(sysctl-name "sysctl.proc_cputype")
(sysctl-name-prefix "hw.perflevel")
)
; Added on top of Chrome profile
; Needed for python multiprocessing on MacOS for the SemLock
(allow ipc-posix-sem)

239
codex-rs/core/src/shell.rs Normal file
View File

@@ -0,0 +1,239 @@
use shlex;
#[derive(Debug, PartialEq, Eq)]
pub struct ZshShell {
shell_path: String,
zshrc_path: String,
}
#[derive(Debug, PartialEq, Eq)]
pub enum Shell {
Zsh(ZshShell),
Unknown,
}
impl Shell {
pub fn format_default_shell_invocation(&self, command: Vec<String>) -> Option<Vec<String>> {
match self {
Shell::Zsh(zsh) => {
if !std::path::Path::new(&zsh.zshrc_path).exists() {
return None;
}
let mut result = vec![zsh.shell_path.clone()];
result.push("-lc".to_string());
let joined = strip_bash_lc(&command)
.or_else(|| shlex::try_join(command.iter().map(|s| s.as_str())).ok());
if let Some(joined) = joined {
result.push(format!("source {} && ({joined})", zsh.zshrc_path));
} else {
return None;
}
Some(result)
}
Shell::Unknown => None,
}
}
}
fn strip_bash_lc(command: &Vec<String>) -> Option<String> {
match command.as_slice() {
// exactly three items
[first, second, third]
// first two must be "bash", "-lc"
if first == "bash" && second == "-lc" =>
{
Some(third.clone())
}
_ => None,
}
}
#[cfg(target_os = "macos")]
pub async fn default_user_shell() -> Shell {
use tokio::process::Command;
use whoami;
let user = whoami::username();
let home = format!("/Users/{user}");
let output = Command::new("dscl")
.args([".", "-read", &home, "UserShell"])
.output()
.await
.ok();
match output {
Some(o) => {
if !o.status.success() {
return Shell::Unknown;
}
let stdout = String::from_utf8_lossy(&o.stdout);
for line in stdout.lines() {
if let Some(shell_path) = line.strip_prefix("UserShell: ") {
if shell_path.ends_with("/zsh") {
return Shell::Zsh(ZshShell {
shell_path: shell_path.to_string(),
zshrc_path: format!("{home}/.zshrc"),
});
}
}
}
Shell::Unknown
}
_ => Shell::Unknown,
}
}
#[cfg(not(target_os = "macos"))]
pub async fn default_user_shell() -> Shell {
Shell::Unknown
}
#[cfg(test)]
#[cfg(target_os = "macos")]
mod tests {
use super::*;
use std::process::Command;
#[tokio::test]
#[expect(clippy::unwrap_used)]
async fn test_current_shell_detects_zsh() {
let shell = Command::new("sh")
.arg("-c")
.arg("echo $SHELL")
.output()
.unwrap();
let home = std::env::var("HOME").unwrap();
let shell_path = String::from_utf8_lossy(&shell.stdout).trim().to_string();
if shell_path.ends_with("/zsh") {
assert_eq!(
default_user_shell().await,
Shell::Zsh(ZshShell {
shell_path: shell_path.to_string(),
zshrc_path: format!("{home}/.zshrc",),
})
);
}
}
#[tokio::test]
async fn test_run_with_profile_zshrc_not_exists() {
let shell = Shell::Zsh(ZshShell {
shell_path: "/bin/zsh".to_string(),
zshrc_path: "/does/not/exist/.zshrc".to_string(),
});
let actual_cmd = shell.format_default_shell_invocation(vec!["myecho".to_string()]);
assert_eq!(actual_cmd, None);
}
#[expect(clippy::unwrap_used)]
#[tokio::test]
async fn test_run_with_profile_escaping_and_execution() {
let shell_path = "/bin/zsh";
let cases = vec![
(
vec!["myecho"],
vec![shell_path, "-lc", "source ZSHRC_PATH && (myecho)"],
Some("It works!\n"),
),
(
vec!["myecho"],
vec![shell_path, "-lc", "source ZSHRC_PATH && (myecho)"],
Some("It works!\n"),
),
(
vec!["bash", "-c", "echo 'single' \"double\""],
vec![
shell_path,
"-lc",
"source ZSHRC_PATH && (bash -c \"echo 'single' \\\"double\\\"\")",
],
Some("single double\n"),
),
(
vec!["bash", "-lc", "echo 'single' \"double\""],
vec![
shell_path,
"-lc",
"source ZSHRC_PATH && (echo 'single' \"double\")",
],
Some("single double\n"),
),
];
for (input, expected_cmd, expected_output) in cases {
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::Notify;
use crate::exec::ExecParams;
use crate::exec::SandboxType;
use crate::exec::process_exec_tool_call;
use crate::protocol::SandboxPolicy;
// create a temp directory with a zshrc file in it
let temp_home = tempfile::tempdir().unwrap();
let zshrc_path = temp_home.path().join(".zshrc");
std::fs::write(
&zshrc_path,
r#"
set -x
function myecho {
echo 'It works!'
}
"#,
)
.unwrap();
let shell = Shell::Zsh(ZshShell {
shell_path: shell_path.to_string(),
zshrc_path: zshrc_path.to_str().unwrap().to_string(),
});
let actual_cmd = shell
.format_default_shell_invocation(input.iter().map(|s| s.to_string()).collect());
let expected_cmd = expected_cmd
.iter()
.map(|s| {
s.replace("ZSHRC_PATH", zshrc_path.to_str().unwrap())
.to_string()
})
.collect();
assert_eq!(actual_cmd, Some(expected_cmd));
// Actually run the command and check output/exit code
let output = process_exec_tool_call(
ExecParams {
command: actual_cmd.unwrap(),
cwd: PathBuf::from(temp_home.path()),
timeout_ms: None,
env: HashMap::from([(
"HOME".to_string(),
temp_home.path().to_str().unwrap().to_string(),
)]),
with_escalated_permissions: None,
justification: None,
},
SandboxType::None,
Arc::new(Notify::new()),
&SandboxPolicy::DangerFullAccess,
&None,
None,
)
.await
.unwrap();
assert_eq!(output.exit_code, 0, "input: {input:?} output: {output:?}");
if let Some(expected) = expected_output {
assert_eq!(
output.stdout, expected,
"input: {input:?} output: {output:?}"
);
}
}
}
}

107
codex-rs/core/src/spawn.rs Normal file
View File

@@ -0,0 +1,107 @@
use std::collections::HashMap;
use std::path::PathBuf;
use std::process::Stdio;
use tokio::process::Child;
use tokio::process::Command;
use tracing::trace;
use crate::protocol::SandboxPolicy;
/// Experimental environment variable that will be set to some non-empty value
/// if both of the following are true:
///
/// 1. The process was spawned by Codex as part of a shell tool call.
/// 2. SandboxPolicy.has_full_network_access() was false for the tool call.
///
/// We may try to have just one environment variable for all sandboxing
/// attributes, so this may change in the future.
pub const CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR: &str = "CODEX_SANDBOX_NETWORK_DISABLED";
/// Should be set when the process is spawned under a sandbox. Currently, the
/// value is "seatbelt" for macOS, but it may change in the future to
/// accommodate sandboxing configuration and other sandboxing mechanisms.
pub const CODEX_SANDBOX_ENV_VAR: &str = "CODEX_SANDBOX";
#[derive(Debug, Clone, Copy)]
pub enum StdioPolicy {
RedirectForShellTool,
Inherit,
}
/// Spawns the appropriate child process for the ExecParams and SandboxPolicy,
/// ensuring the args and environment variables used to create the `Command`
/// (and `Child`) honor the configuration.
///
/// For now, we take `SandboxPolicy` as a parameter to spawn_child() because
/// we need to determine whether to set the
/// `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` environment variable.
pub(crate) async fn spawn_child_async(
program: PathBuf,
args: Vec<String>,
#[cfg_attr(not(unix), allow(unused_variables))] arg0: Option<&str>,
cwd: PathBuf,
sandbox_policy: &SandboxPolicy,
stdio_policy: StdioPolicy,
env: HashMap<String, String>,
) -> std::io::Result<Child> {
trace!(
"spawn_child_async: {program:?} {args:?} {arg0:?} {cwd:?} {sandbox_policy:?} {stdio_policy:?} {env:?}"
);
let mut cmd = Command::new(&program);
#[cfg(unix)]
cmd.arg0(arg0.map_or_else(|| program.to_string_lossy().to_string(), String::from));
cmd.args(args);
cmd.current_dir(cwd);
cmd.env_clear();
cmd.envs(env);
if !sandbox_policy.has_full_network_access() {
cmd.env(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR, "1");
}
// If this Codex process dies (including being killed via SIGKILL), we want
// any child processes that were spawned as part of a `"shell"` tool call
// to also be terminated.
// This relies on prctl(2), so it only works on Linux.
#[cfg(target_os = "linux")]
unsafe {
cmd.pre_exec(|| {
// This prctl call effectively requests, "deliver SIGTERM when my
// current parent dies."
if libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGTERM) == -1 {
return Err(std::io::Error::last_os_error());
}
// Though if there was a race condition and this pre_exec() block is
// run _after_ the parent (i.e., the Codex process) has already
// exited, then the parent is the _init_ process (which will never
// die), so we should just terminate the child process now.
if libc::getppid() == 1 {
libc::raise(libc::SIGTERM);
}
Ok(())
});
}
match stdio_policy {
StdioPolicy::RedirectForShellTool => {
// Do not create a file descriptor for stdin because otherwise some
// commands may hang forever waiting for input. For example, ripgrep has
// a heuristic where it may try to read from stdin as explained here:
// https://github.com/BurntSushi/ripgrep/blob/e2362d4d5185d02fa857bf381e7bd52e66fafc73/crates/core/flags/hiargs.rs#L1101-L1103
cmd.stdin(Stdio::null());
cmd.stdout(Stdio::piped()).stderr(Stdio::piped());
}
StdioPolicy::Inherit => {
// Inherit stdin, stdout, and stderr from the parent process.
cmd.stdin(Stdio::inherit())
.stdout(Stdio::inherit())
.stderr(Stdio::inherit());
}
}
cmd.kill_on_drop(true).spawn()
}

View File

@@ -0,0 +1,887 @@
use std::collections::HashMap;
use std::fs;
use std::path::Path;
use std::path::PathBuf;
use std::process::Command;
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use sha1::digest::Output;
use uuid::Uuid;
use crate::protocol::FileChange;
const ZERO_OID: &str = "0000000000000000000000000000000000000000";
const DEV_NULL: &str = "/dev/null";
struct BaselineFileInfo {
path: PathBuf,
content: Vec<u8>,
mode: FileMode,
oid: String,
}
/// Tracks sets of changes to files and exposes the overall unified diff.
/// Internally, the way this works is now:
/// 1. Maintain an in-memory baseline snapshot of files when they are first seen.
/// For new additions, do not create a baseline so that diffs are shown as proper additions (using /dev/null).
/// 2. Keep a stable internal filename (uuid) per external path for rename tracking.
/// 3. To compute the aggregated unified diff, compare each baseline snapshot to the current file on disk entirely in-memory
/// using the `similar` crate and emit unified diffs with rewritten external paths.
#[derive(Default)]
pub struct TurnDiffTracker {
/// Map external path -> internal filename (uuid).
external_to_temp_name: HashMap<PathBuf, String>,
/// Internal filename -> baseline file info.
baseline_file_info: HashMap<String, BaselineFileInfo>,
/// Internal filename -> external path as of current accumulated state (after applying all changes).
/// This is where renames are tracked.
temp_name_to_current_path: HashMap<String, PathBuf>,
/// Cache of known git worktree roots to avoid repeated filesystem walks.
git_root_cache: Vec<PathBuf>,
}
impl TurnDiffTracker {
pub fn new() -> Self {
Self::default()
}
/// Front-run apply patch calls to track the starting contents of any modified files.
/// - Creates an in-memory baseline snapshot for files that already exist on disk when first seen.
/// - For additions, we intentionally do not create a baseline snapshot so that diffs are proper additions.
/// - Also updates internal mappings for move/rename events.
pub fn on_patch_begin(&mut self, changes: &HashMap<PathBuf, FileChange>) {
for (path, change) in changes.iter() {
// Ensure a stable internal filename exists for this external path.
if !self.external_to_temp_name.contains_key(path) {
let internal = Uuid::new_v4().to_string();
self.external_to_temp_name
.insert(path.clone(), internal.clone());
self.temp_name_to_current_path
.insert(internal.clone(), path.clone());
// If the file exists on disk now, snapshot as baseline; else leave missing to represent /dev/null.
let baseline_file_info = if path.exists() {
let mode = file_mode_for_path(path);
let mode_val = mode.unwrap_or(FileMode::Regular);
let content = blob_bytes(path, &mode_val).unwrap_or_default();
let oid = if mode == Some(FileMode::Symlink) {
format!("{:x}", git_blob_sha1_hex_bytes(&content))
} else {
self.git_blob_oid_for_path(path)
.unwrap_or_else(|| format!("{:x}", git_blob_sha1_hex_bytes(&content)))
};
Some(BaselineFileInfo {
path: path.clone(),
content,
mode: mode_val,
oid,
})
} else {
Some(BaselineFileInfo {
path: path.clone(),
content: vec![],
mode: FileMode::Regular,
oid: ZERO_OID.to_string(),
})
};
if let Some(baseline_file_info) = baseline_file_info {
self.baseline_file_info
.insert(internal.clone(), baseline_file_info);
}
}
// Track rename/move in current mapping if provided in an Update.
if let FileChange::Update {
move_path: Some(dest),
..
} = change
{
let uuid_filename = match self.external_to_temp_name.get(path) {
Some(i) => i.clone(),
None => {
// This should be rare, but if we haven't mapped the source, create it with no baseline.
let i = Uuid::new_v4().to_string();
self.baseline_file_info.insert(
i.clone(),
BaselineFileInfo {
path: path.clone(),
content: vec![],
mode: FileMode::Regular,
oid: ZERO_OID.to_string(),
},
);
i
}
};
// Update current external mapping for temp file name.
self.temp_name_to_current_path
.insert(uuid_filename.clone(), dest.clone());
// Update forward file_mapping: external current -> internal name.
self.external_to_temp_name.remove(path);
self.external_to_temp_name
.insert(dest.clone(), uuid_filename);
};
}
}
fn get_path_for_internal(&self, internal: &str) -> Option<PathBuf> {
self.temp_name_to_current_path
.get(internal)
.cloned()
.or_else(|| {
self.baseline_file_info
.get(internal)
.map(|info| info.path.clone())
})
}
/// Find the git worktree root for a file/directory by walking up to the first ancestor containing a `.git` entry.
/// Uses a simple cache of known roots and avoids negative-result caching for simplicity.
fn find_git_root_cached(&mut self, start: &Path) -> Option<PathBuf> {
let dir = if start.is_dir() {
start
} else {
start.parent()?
};
// Fast path: if any cached root is an ancestor of this path, use it.
if let Some(root) = self
.git_root_cache
.iter()
.find(|r| dir.starts_with(r))
.cloned()
{
return Some(root);
}
// Walk up to find a `.git` marker.
let mut cur = dir.to_path_buf();
loop {
let git_marker = cur.join(".git");
if git_marker.is_dir() || git_marker.is_file() {
if !self.git_root_cache.iter().any(|r| r == &cur) {
self.git_root_cache.push(cur.clone());
}
return Some(cur);
}
// On Windows, avoid walking above the drive or UNC share root.
#[cfg(windows)]
{
if is_windows_drive_or_unc_root(&cur) {
return None;
}
}
if let Some(parent) = cur.parent() {
cur = parent.to_path_buf();
} else {
return None;
}
}
}
/// Return a display string for `path` relative to its git root if found, else absolute.
fn relative_to_git_root_str(&mut self, path: &Path) -> String {
let s = if let Some(root) = self.find_git_root_cached(path) {
if let Ok(rel) = path.strip_prefix(&root) {
rel.display().to_string()
} else {
path.display().to_string()
}
} else {
path.display().to_string()
};
s.replace('\\', "/")
}
/// Ask git to compute the blob SHA-1 for the file at `path` within its repository.
/// Returns None if no repository is found or git invocation fails.
fn git_blob_oid_for_path(&mut self, path: &Path) -> Option<String> {
let root = self.find_git_root_cached(path)?;
// Compute a path relative to the repo root for better portability across platforms.
let rel = path.strip_prefix(&root).unwrap_or(path);
let output = Command::new("git")
.arg("-C")
.arg(&root)
.arg("hash-object")
.arg("--")
.arg(rel)
.output()
.ok()?;
if !output.status.success() {
return None;
}
let s = String::from_utf8_lossy(&output.stdout).trim().to_string();
if s.len() == 40 { Some(s) } else { None }
}
/// Recompute the aggregated unified diff by comparing all of the in-memory snapshots that were
/// collected before the first time they were touched by apply_patch during this turn with
/// the current repo state.
pub fn get_unified_diff(&mut self) -> Result<Option<String>> {
let mut aggregated = String::new();
// Compute diffs per tracked internal file in a stable order by external path.
let mut baseline_file_names: Vec<String> =
self.baseline_file_info.keys().cloned().collect();
// Sort lexicographically by full repo-relative path to match git behavior.
baseline_file_names.sort_by_key(|internal| {
self.get_path_for_internal(internal)
.map(|p| self.relative_to_git_root_str(&p))
.unwrap_or_default()
});
for internal in baseline_file_names {
aggregated.push_str(self.get_file_diff(&internal).as_str());
if !aggregated.ends_with('\n') {
aggregated.push('\n');
}
}
if aggregated.trim().is_empty() {
Ok(None)
} else {
Ok(Some(aggregated))
}
}
fn get_file_diff(&mut self, internal_file_name: &str) -> String {
let mut aggregated = String::new();
// Snapshot lightweight fields only.
let (baseline_external_path, baseline_mode, left_oid) = {
if let Some(info) = self.baseline_file_info.get(internal_file_name) {
(info.path.clone(), info.mode, info.oid.clone())
} else {
(PathBuf::new(), FileMode::Regular, ZERO_OID.to_string())
}
};
let current_external_path = match self.get_path_for_internal(internal_file_name) {
Some(p) => p,
None => return aggregated,
};
let current_mode = file_mode_for_path(&current_external_path).unwrap_or(FileMode::Regular);
let right_bytes = blob_bytes(&current_external_path, &current_mode);
// Compute displays with &mut self before borrowing any baseline content.
let left_display = self.relative_to_git_root_str(&baseline_external_path);
let right_display = self.relative_to_git_root_str(&current_external_path);
// Compute right oid before borrowing baseline content.
let right_oid = if let Some(b) = right_bytes.as_ref() {
if current_mode == FileMode::Symlink {
format!("{:x}", git_blob_sha1_hex_bytes(b))
} else {
self.git_blob_oid_for_path(&current_external_path)
.unwrap_or_else(|| format!("{:x}", git_blob_sha1_hex_bytes(b)))
}
} else {
ZERO_OID.to_string()
};
// Borrow baseline content only after all &mut self uses are done.
let left_present = left_oid.as_str() != ZERO_OID;
let left_bytes: Option<&[u8]> = if left_present {
self.baseline_file_info
.get(internal_file_name)
.map(|i| i.content.as_slice())
} else {
None
};
// Fast path: identical bytes or both missing.
if left_bytes == right_bytes.as_deref() {
return aggregated;
}
aggregated.push_str(&format!("diff --git a/{left_display} b/{right_display}\n"));
let is_add = !left_present && right_bytes.is_some();
let is_delete = left_present && right_bytes.is_none();
if is_add {
aggregated.push_str(&format!("new file mode {current_mode}\n"));
} else if is_delete {
aggregated.push_str(&format!("deleted file mode {baseline_mode}\n"));
} else if baseline_mode != current_mode {
aggregated.push_str(&format!("old mode {baseline_mode}\n"));
aggregated.push_str(&format!("new mode {current_mode}\n"));
}
let left_text = left_bytes.and_then(|b| std::str::from_utf8(b).ok());
let right_text = right_bytes
.as_deref()
.and_then(|b| std::str::from_utf8(b).ok());
let can_text_diff = matches!(
(left_text, right_text, is_add, is_delete),
(Some(_), Some(_), _, _) | (_, Some(_), true, _) | (Some(_), _, _, true)
);
if can_text_diff {
let l = left_text.unwrap_or("");
let r = right_text.unwrap_or("");
aggregated.push_str(&format!("index {left_oid}..{right_oid}\n"));
let old_header = if left_present {
format!("a/{left_display}")
} else {
DEV_NULL.to_string()
};
let new_header = if right_bytes.is_some() {
format!("b/{right_display}")
} else {
DEV_NULL.to_string()
};
let diff = similar::TextDiff::from_lines(l, r);
let unified = diff
.unified_diff()
.context_radius(3)
.header(&old_header, &new_header)
.to_string();
aggregated.push_str(&unified);
} else {
aggregated.push_str(&format!("index {left_oid}..{right_oid}\n"));
let old_header = if left_present {
format!("a/{left_display}")
} else {
DEV_NULL.to_string()
};
let new_header = if right_bytes.is_some() {
format!("b/{right_display}")
} else {
DEV_NULL.to_string()
};
aggregated.push_str(&format!("--- {old_header}\n"));
aggregated.push_str(&format!("+++ {new_header}\n"));
aggregated.push_str("Binary files differ\n");
}
aggregated
}
}
/// Compute the Git SHA-1 blob object ID for the given content (bytes).
fn git_blob_sha1_hex_bytes(data: &[u8]) -> Output<sha1::Sha1> {
// Git blob hash is sha1 of: "blob <len>\0<data>"
let header = format!("blob {}\0", data.len());
use sha1::Digest;
let mut hasher = sha1::Sha1::new();
hasher.update(header.as_bytes());
hasher.update(data);
hasher.finalize()
}
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FileMode {
Regular,
#[cfg(unix)]
Executable,
Symlink,
}
impl FileMode {
fn as_str(&self) -> &'static str {
match self {
FileMode::Regular => "100644",
#[cfg(unix)]
FileMode::Executable => "100755",
FileMode::Symlink => "120000",
}
}
}
impl std::fmt::Display for FileMode {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.as_str())
}
}
#[cfg(unix)]
fn file_mode_for_path(path: &Path) -> Option<FileMode> {
use std::os::unix::fs::PermissionsExt;
let meta = fs::symlink_metadata(path).ok()?;
let ft = meta.file_type();
if ft.is_symlink() {
return Some(FileMode::Symlink);
}
let mode = meta.permissions().mode();
let is_exec = (mode & 0o111) != 0;
Some(if is_exec {
FileMode::Executable
} else {
FileMode::Regular
})
}
#[cfg(not(unix))]
fn file_mode_for_path(_path: &Path) -> Option<FileMode> {
// Default to non-executable on non-unix.
Some(FileMode::Regular)
}
fn blob_bytes(path: &Path, mode: &FileMode) -> Option<Vec<u8>> {
if path.exists() {
let contents = if *mode == FileMode::Symlink {
symlink_blob_bytes(path)
.ok_or_else(|| anyhow!("failed to read symlink target for {}", path.display()))
} else {
fs::read(path)
.with_context(|| format!("failed to read current file for diff {}", path.display()))
};
contents.ok()
} else {
None
}
}
#[cfg(unix)]
fn symlink_blob_bytes(path: &Path) -> Option<Vec<u8>> {
use std::os::unix::ffi::OsStrExt;
let target = std::fs::read_link(path).ok()?;
Some(target.as_os_str().as_bytes().to_vec())
}
#[cfg(not(unix))]
fn symlink_blob_bytes(_path: &Path) -> Option<Vec<u8>> {
None
}
#[cfg(windows)]
fn is_windows_drive_or_unc_root(p: &std::path::Path) -> bool {
use std::path::Component;
let mut comps = p.components();
matches!(
(comps.next(), comps.next(), comps.next()),
(Some(Component::Prefix(_)), Some(Component::RootDir), None)
)
}
#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used)]
use super::*;
use pretty_assertions::assert_eq;
use tempfile::tempdir;
/// Compute the Git SHA-1 blob object ID for the given content (string).
/// This delegates to the bytes version to avoid UTF-8 lossy conversions here.
fn git_blob_sha1_hex(data: &str) -> String {
format!("{:x}", git_blob_sha1_hex_bytes(data.as_bytes()))
}
fn normalize_diff_for_test(input: &str, root: &Path) -> String {
let root_str = root.display().to_string().replace('\\', "/");
let replaced = input.replace(&root_str, "<TMP>");
// Split into blocks on lines starting with "diff --git ", sort blocks for determinism, and rejoin
let mut blocks: Vec<String> = Vec::new();
let mut current = String::new();
for line in replaced.lines() {
if line.starts_with("diff --git ") && !current.is_empty() {
blocks.push(current);
current = String::new();
}
if !current.is_empty() {
current.push('\n');
}
current.push_str(line);
}
if !current.is_empty() {
blocks.push(current);
}
blocks.sort();
let mut out = blocks.join("\n");
if !out.ends_with('\n') {
out.push('\n');
}
out
}
#[test]
fn accumulates_add_and_update() {
let mut acc = TurnDiffTracker::new();
let dir = tempdir().unwrap();
let file = dir.path().join("a.txt");
// First patch: add file (baseline should be /dev/null).
let add_changes = HashMap::from([(
file.clone(),
FileChange::Add {
content: "foo\n".to_string(),
},
)]);
acc.on_patch_begin(&add_changes);
// Simulate apply: create the file on disk.
fs::write(&file, "foo\n").unwrap();
let first = acc.get_unified_diff().unwrap().unwrap();
let first = normalize_diff_for_test(&first, dir.path());
let expected_first = {
let mode = file_mode_for_path(&file).unwrap_or(FileMode::Regular);
let right_oid = git_blob_sha1_hex("foo\n");
format!(
r#"diff --git a/<TMP>/a.txt b/<TMP>/a.txt
new file mode {mode}
index {ZERO_OID}..{right_oid}
--- {DEV_NULL}
+++ b/<TMP>/a.txt
@@ -0,0 +1 @@
+foo
"#,
)
};
assert_eq!(first, expected_first);
// Second patch: update the file on disk.
let update_changes = HashMap::from([(
file.clone(),
FileChange::Update {
unified_diff: "".to_owned(),
move_path: None,
},
)]);
acc.on_patch_begin(&update_changes);
// Simulate apply: append a new line.
fs::write(&file, "foo\nbar\n").unwrap();
let combined = acc.get_unified_diff().unwrap().unwrap();
let combined = normalize_diff_for_test(&combined, dir.path());
let expected_combined = {
let mode = file_mode_for_path(&file).unwrap_or(FileMode::Regular);
let right_oid = git_blob_sha1_hex("foo\nbar\n");
format!(
r#"diff --git a/<TMP>/a.txt b/<TMP>/a.txt
new file mode {mode}
index {ZERO_OID}..{right_oid}
--- {DEV_NULL}
+++ b/<TMP>/a.txt
@@ -0,0 +1,2 @@
+foo
+bar
"#,
)
};
assert_eq!(combined, expected_combined);
}
#[test]
fn accumulates_delete() {
let dir = tempdir().unwrap();
let file = dir.path().join("b.txt");
fs::write(&file, "x\n").unwrap();
let mut acc = TurnDiffTracker::new();
let del_changes = HashMap::from([(file.clone(), FileChange::Delete)]);
acc.on_patch_begin(&del_changes);
// Simulate apply: delete the file from disk.
let baseline_mode = file_mode_for_path(&file).unwrap_or(FileMode::Regular);
fs::remove_file(&file).unwrap();
let diff = acc.get_unified_diff().unwrap().unwrap();
let diff = normalize_diff_for_test(&diff, dir.path());
let expected = {
let left_oid = git_blob_sha1_hex("x\n");
format!(
r#"diff --git a/<TMP>/b.txt b/<TMP>/b.txt
deleted file mode {baseline_mode}
index {left_oid}..{ZERO_OID}
--- a/<TMP>/b.txt
+++ {DEV_NULL}
@@ -1 +0,0 @@
-x
"#,
)
};
assert_eq!(diff, expected);
}
#[test]
fn accumulates_move_and_update() {
let dir = tempdir().unwrap();
let src = dir.path().join("src.txt");
let dest = dir.path().join("dst.txt");
fs::write(&src, "line\n").unwrap();
let mut acc = TurnDiffTracker::new();
let mv_changes = HashMap::from([(
src.clone(),
FileChange::Update {
unified_diff: "".to_owned(),
move_path: Some(dest.clone()),
},
)]);
acc.on_patch_begin(&mv_changes);
// Simulate apply: move and update content.
fs::rename(&src, &dest).unwrap();
fs::write(&dest, "line2\n").unwrap();
let out = acc.get_unified_diff().unwrap().unwrap();
let out = normalize_diff_for_test(&out, dir.path());
let expected = {
let left_oid = git_blob_sha1_hex("line\n");
let right_oid = git_blob_sha1_hex("line2\n");
format!(
r#"diff --git a/<TMP>/src.txt b/<TMP>/dst.txt
index {left_oid}..{right_oid}
--- a/<TMP>/src.txt
+++ b/<TMP>/dst.txt
@@ -1 +1 @@
-line
+line2
"#
)
};
assert_eq!(out, expected);
}
#[test]
fn move_without_1change_yields_no_diff() {
let dir = tempdir().unwrap();
let src = dir.path().join("moved.txt");
let dest = dir.path().join("renamed.txt");
fs::write(&src, "same\n").unwrap();
let mut acc = TurnDiffTracker::new();
let mv_changes = HashMap::from([(
src.clone(),
FileChange::Update {
unified_diff: "".to_owned(),
move_path: Some(dest.clone()),
},
)]);
acc.on_patch_begin(&mv_changes);
// Simulate apply: move only, no content change.
fs::rename(&src, &dest).unwrap();
let diff = acc.get_unified_diff().unwrap();
assert_eq!(diff, None);
}
#[test]
fn move_declared_but_file_only_appears_at_dest_is_add() {
let dir = tempdir().unwrap();
let src = dir.path().join("src.txt");
let dest = dir.path().join("dest.txt");
let mut acc = TurnDiffTracker::new();
let mv = HashMap::from([(
src.clone(),
FileChange::Update {
unified_diff: "".into(),
move_path: Some(dest.clone()),
},
)]);
acc.on_patch_begin(&mv);
// No file existed initially; create only dest
fs::write(&dest, "hello\n").unwrap();
let diff = acc.get_unified_diff().unwrap().unwrap();
let diff = normalize_diff_for_test(&diff, dir.path());
let expected = {
let mode = file_mode_for_path(&dest).unwrap_or(FileMode::Regular);
let right_oid = git_blob_sha1_hex("hello\n");
format!(
r#"diff --git a/<TMP>/src.txt b/<TMP>/dest.txt
new file mode {mode}
index {ZERO_OID}..{right_oid}
--- {DEV_NULL}
+++ b/<TMP>/dest.txt
@@ -0,0 +1 @@
+hello
"#,
)
};
assert_eq!(diff, expected);
}
#[test]
fn update_persists_across_new_baseline_for_new_file() {
let dir = tempdir().unwrap();
let a = dir.path().join("a.txt");
let b = dir.path().join("b.txt");
fs::write(&a, "foo\n").unwrap();
fs::write(&b, "z\n").unwrap();
let mut acc = TurnDiffTracker::new();
// First: update existing a.txt (baseline snapshot is created for a).
let update_a = HashMap::from([(
a.clone(),
FileChange::Update {
unified_diff: "".to_owned(),
move_path: None,
},
)]);
acc.on_patch_begin(&update_a);
// Simulate apply: modify a.txt on disk.
fs::write(&a, "foo\nbar\n").unwrap();
let first = acc.get_unified_diff().unwrap().unwrap();
let first = normalize_diff_for_test(&first, dir.path());
let expected_first = {
let left_oid = git_blob_sha1_hex("foo\n");
let right_oid = git_blob_sha1_hex("foo\nbar\n");
format!(
r#"diff --git a/<TMP>/a.txt b/<TMP>/a.txt
index {left_oid}..{right_oid}
--- a/<TMP>/a.txt
+++ b/<TMP>/a.txt
@@ -1 +1,2 @@
foo
+bar
"#
)
};
assert_eq!(first, expected_first);
// Next: introduce a brand-new path b.txt into baseline snapshots via a delete change.
let del_b = HashMap::from([(b.clone(), FileChange::Delete)]);
acc.on_patch_begin(&del_b);
// Simulate apply: delete b.txt.
let baseline_mode = file_mode_for_path(&b).unwrap_or(FileMode::Regular);
fs::remove_file(&b).unwrap();
let combined = acc.get_unified_diff().unwrap().unwrap();
let combined = normalize_diff_for_test(&combined, dir.path());
let expected = {
let left_oid_a = git_blob_sha1_hex("foo\n");
let right_oid_a = git_blob_sha1_hex("foo\nbar\n");
let left_oid_b = git_blob_sha1_hex("z\n");
format!(
r#"diff --git a/<TMP>/a.txt b/<TMP>/a.txt
index {left_oid_a}..{right_oid_a}
--- a/<TMP>/a.txt
+++ b/<TMP>/a.txt
@@ -1 +1,2 @@
foo
+bar
diff --git a/<TMP>/b.txt b/<TMP>/b.txt
deleted file mode {baseline_mode}
index {left_oid_b}..{ZERO_OID}
--- a/<TMP>/b.txt
+++ {DEV_NULL}
@@ -1 +0,0 @@
-z
"#,
)
};
assert_eq!(combined, expected);
}
#[test]
fn binary_files_differ_update() {
let dir = tempdir().unwrap();
let file = dir.path().join("bin.dat");
// Initial non-UTF8 bytes
let left_bytes: Vec<u8> = vec![0xff, 0xfe, 0xfd, 0x00];
// Updated non-UTF8 bytes
let right_bytes: Vec<u8> = vec![0x01, 0x02, 0x03, 0x00];
fs::write(&file, &left_bytes).unwrap();
let mut acc = TurnDiffTracker::new();
let update_changes = HashMap::from([(
file.clone(),
FileChange::Update {
unified_diff: "".to_owned(),
move_path: None,
},
)]);
acc.on_patch_begin(&update_changes);
// Apply update on disk
fs::write(&file, &right_bytes).unwrap();
let diff = acc.get_unified_diff().unwrap().unwrap();
let diff = normalize_diff_for_test(&diff, dir.path());
let expected = {
let left_oid = format!("{:x}", git_blob_sha1_hex_bytes(&left_bytes));
let right_oid = format!("{:x}", git_blob_sha1_hex_bytes(&right_bytes));
format!(
r#"diff --git a/<TMP>/bin.dat b/<TMP>/bin.dat
index {left_oid}..{right_oid}
--- a/<TMP>/bin.dat
+++ b/<TMP>/bin.dat
Binary files differ
"#
)
};
assert_eq!(diff, expected);
}
#[test]
fn filenames_with_spaces_add_and_update() {
let mut acc = TurnDiffTracker::new();
let dir = tempdir().unwrap();
let file = dir.path().join("name with spaces.txt");
// First patch: add file (baseline should be /dev/null).
let add_changes = HashMap::from([(
file.clone(),
FileChange::Add {
content: "foo\n".to_string(),
},
)]);
acc.on_patch_begin(&add_changes);
// Simulate apply: create the file on disk.
fs::write(&file, "foo\n").unwrap();
let first = acc.get_unified_diff().unwrap().unwrap();
let first = normalize_diff_for_test(&first, dir.path());
let expected_first = {
let mode = file_mode_for_path(&file).unwrap_or(FileMode::Regular);
let right_oid = git_blob_sha1_hex("foo\n");
format!(
r#"diff --git a/<TMP>/name with spaces.txt b/<TMP>/name with spaces.txt
new file mode {mode}
index {ZERO_OID}..{right_oid}
--- {DEV_NULL}
+++ b/<TMP>/name with spaces.txt
@@ -0,0 +1 @@
+foo
"#,
)
};
assert_eq!(first, expected_first);
// Second patch: update the file on disk.
let update_changes = HashMap::from([(
file.clone(),
FileChange::Update {
unified_diff: "".to_owned(),
move_path: None,
},
)]);
acc.on_patch_begin(&update_changes);
// Simulate apply: append a new line with a space.
fs::write(&file, "foo\nbar baz\n").unwrap();
let combined = acc.get_unified_diff().unwrap().unwrap();
let combined = normalize_diff_for_test(&combined, dir.path());
let expected_combined = {
let mode = file_mode_for_path(&file).unwrap_or(FileMode::Regular);
let right_oid = git_blob_sha1_hex("foo\nbar baz\n");
format!(
r#"diff --git a/<TMP>/name with spaces.txt b/<TMP>/name with spaces.txt
new file mode {mode}
index {ZERO_OID}..{right_oid}
--- {DEV_NULL}
+++ b/<TMP>/name with spaces.txt
@@ -0,0 +1,2 @@
+foo
+bar baz
"#,
)
};
assert_eq!(combined, expected_combined);
}
}

View File

@@ -1,3 +1,4 @@
use std::path::Path;
use std::sync::Arc;
use std::time::Duration;
@@ -5,8 +6,6 @@ use rand::Rng;
use tokio::sync::Notify;
use tracing::debug;
use crate::config::Config;
const INITIAL_DELAY_MS: u64 = 200;
const BACKOFF_FACTOR: f64 = 1.3;
@@ -47,8 +46,8 @@ pub(crate) fn backoff(attempt: u64) -> Duration {
/// `git worktree add` where the checkout lives outside the main repository
/// directory. If you need Codex to work from such a checkout simply pass the
/// `--allow-no-git-exec` CLI flag that disables the repo requirement.
pub fn is_inside_git_repo(config: &Config) -> bool {
let mut dir = config.cwd.to_path_buf();
pub fn is_inside_git_repo(base_dir: &Path) -> bool {
let mut dir = base_dir.to_path_buf();
loop {
if dir.join(".git").exists() {

View File

@@ -1,8 +1,12 @@
#![expect(clippy::unwrap_used)]
use assert_cmd::Command as AssertCommand;
use codex_core::exec::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use std::time::Duration;
use std::time::Instant;
use tempfile::TempDir;
use uuid::Uuid;
use walkdir::WalkDir;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
@@ -71,12 +75,102 @@ async fn chat_mode_stream_cli() {
println!("Stderr:\n{}", String::from_utf8_lossy(&output.stderr));
assert!(output.status.success());
let stdout = String::from_utf8_lossy(&output.stdout);
assert!(stdout.contains("hi"));
assert_eq!(stdout.matches("hi").count(), 1);
let hi_lines = stdout.lines().filter(|line| line.trim() == "hi").count();
assert_eq!(hi_lines, 1, "Expected exactly one line with 'hi'");
server.verify().await;
}
/// Verify that passing `-c experimental_instructions_file=...` to the CLI
/// overrides the built-in base instructions by inspecting the request body
/// received by a mock OpenAI Responses endpoint.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn exec_cli_applies_experimental_instructions_file() {
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
// Start mock server which will capture the request and return a minimal
// SSE stream for a single turn.
let server = MockServer::start().await;
let sse = concat!(
"data: {\"type\":\"response.created\",\"response\":{}}\n\n",
"data: {\"type\":\"response.completed\",\"response\":{\"id\":\"r1\"}}\n\n"
);
Mock::given(method("POST"))
.and(path("/v1/responses"))
.respond_with(
ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse, "text/event-stream"),
)
.expect(1)
.mount(&server)
.await;
// Create a temporary instructions file with a unique marker we can assert
// appears in the outbound request payload.
let custom = TempDir::new().unwrap();
let marker = "cli-experimental-instructions-marker";
let custom_path = custom.path().join("instr.md");
std::fs::write(&custom_path, marker).unwrap();
let custom_path_str = custom_path.to_string_lossy().replace('\\', "/");
// Build a provider override that points at the mock server and instructs
// Codex to use the Responses API with the dummy env var.
let provider_override = format!(
"model_providers.mock={{ name = \"mock\", base_url = \"{}/v1\", env_key = \"PATH\", wire_api = \"responses\" }}",
server.uri()
);
let home = TempDir::new().unwrap();
let mut cmd = AssertCommand::new("cargo");
cmd.arg("run")
.arg("-p")
.arg("codex-cli")
.arg("--quiet")
.arg("--")
.arg("exec")
.arg("--skip-git-repo-check")
.arg("-c")
.arg(&provider_override)
.arg("-c")
.arg("model_provider=\"mock\"")
.arg("-c")
.arg(format!(
"experimental_instructions_file=\"{custom_path_str}\""
))
.arg("-C")
.arg(env!("CARGO_MANIFEST_DIR"))
.arg("hello?\n");
cmd.env("CODEX_HOME", home.path())
.env("OPENAI_API_KEY", "dummy")
.env("OPENAI_BASE_URL", format!("{}/v1", server.uri()));
let output = cmd.output().unwrap();
println!("Status: {}", output.status);
println!("Stdout:\n{}", String::from_utf8_lossy(&output.stdout));
println!("Stderr:\n{}", String::from_utf8_lossy(&output.stderr));
assert!(output.status.success());
// Inspect the captured request and verify our custom base instructions were
// included in the `instructions` field.
let request = &server.received_requests().await.unwrap()[0];
let body = request.body_json::<serde_json::Value>().unwrap();
let instructions = body
.get("instructions")
.and_then(|v| v.as_str())
.unwrap_or_default()
.to_string();
assert!(
instructions.contains(marker),
"instructions did not contain custom marker; got: {instructions}"
);
}
/// Tests streaming responses through the CLI using a local SSE fixture file.
/// This test:
/// 1. Uses a pre-recorded SSE response fixture instead of a live server
@@ -117,3 +211,375 @@ async fn responses_api_stream_cli() {
let stdout = String::from_utf8_lossy(&output.stdout);
assert!(stdout.contains("fixture hello"));
}
/// End-to-end: create a session (writes rollout), verify the file, then resume and confirm append.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn integration_creates_and_checks_session_file() {
// Honor sandbox network restrictions for CI parity with the other tests.
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
// 1. Temp home so we read/write isolated session files.
let home = TempDir::new().unwrap();
// 2. Unique marker we'll look for in the session log.
let marker = format!("integration-test-{}", Uuid::new_v4());
let prompt = format!("echo {marker}");
// 3. Use the same offline SSE fixture as responses_api_stream_cli so the test is hermetic.
let fixture =
std::path::Path::new(env!("CARGO_MANIFEST_DIR")).join("tests/cli_responses_fixture.sse");
// 4. Run the codex CLI through cargo (ensures the right bin is built) and invoke `exec`,
// which is what records a session.
let mut cmd = AssertCommand::new("cargo");
cmd.arg("run")
.arg("-p")
.arg("codex-cli")
.arg("--quiet")
.arg("--")
.arg("exec")
.arg("--skip-git-repo-check")
.arg("-C")
.arg(env!("CARGO_MANIFEST_DIR"))
.arg(&prompt);
cmd.env("CODEX_HOME", home.path())
.env("OPENAI_API_KEY", "dummy")
.env("CODEX_RS_SSE_FIXTURE", &fixture)
// Required for CLI arg parsing even though fixture short-circuits network usage.
.env("OPENAI_BASE_URL", "http://unused.local");
let output = cmd.output().unwrap();
assert!(
output.status.success(),
"codex-cli exec failed: {}",
String::from_utf8_lossy(&output.stderr)
);
// Wait for sessions dir to appear.
let sessions_dir = home.path().join("sessions");
let dir_deadline = Instant::now() + Duration::from_secs(5);
while !sessions_dir.exists() && Instant::now() < dir_deadline {
std::thread::sleep(Duration::from_millis(50));
}
assert!(sessions_dir.exists(), "sessions directory never appeared");
// Find the session file that contains `marker`.
let deadline = Instant::now() + Duration::from_secs(10);
let mut matching_path: Option<std::path::PathBuf> = None;
while Instant::now() < deadline && matching_path.is_none() {
for entry in WalkDir::new(&sessions_dir) {
let entry = match entry {
Ok(e) => e,
Err(_) => continue,
};
if !entry.file_type().is_file() {
continue;
}
if !entry.file_name().to_string_lossy().ends_with(".jsonl") {
continue;
}
let path = entry.path();
let Ok(content) = std::fs::read_to_string(path) else {
continue;
};
let mut lines = content.lines();
if lines.next().is_none() {
continue;
}
for line in lines {
if line.trim().is_empty() {
continue;
}
let item: serde_json::Value = match serde_json::from_str(line) {
Ok(v) => v,
Err(_) => continue,
};
if item.get("type").and_then(|t| t.as_str()) == Some("message") {
if let Some(c) = item.get("content") {
if c.to_string().contains(&marker) {
matching_path = Some(path.to_path_buf());
break;
}
}
}
}
}
if matching_path.is_none() {
std::thread::sleep(Duration::from_millis(50));
}
}
let path = match matching_path {
Some(p) => p,
None => panic!("No session file containing the marker was found"),
};
// Basic sanity checks on location and metadata.
let rel = match path.strip_prefix(&sessions_dir) {
Ok(r) => r,
Err(_) => panic!("session file should live under sessions/"),
};
let comps: Vec<String> = rel
.components()
.map(|c| c.as_os_str().to_string_lossy().into_owned())
.collect();
assert_eq!(
comps.len(),
4,
"Expected sessions/YYYY/MM/DD/<file>, got {rel:?}"
);
let year = &comps[0];
let month = &comps[1];
let day = &comps[2];
assert!(
year.len() == 4 && year.chars().all(|c| c.is_ascii_digit()),
"Year dir not 4-digit numeric: {year}"
);
assert!(
month.len() == 2 && month.chars().all(|c| c.is_ascii_digit()),
"Month dir not zero-padded 2-digit numeric: {month}"
);
assert!(
day.len() == 2 && day.chars().all(|c| c.is_ascii_digit()),
"Day dir not zero-padded 2-digit numeric: {day}"
);
if let Ok(m) = month.parse::<u8>() {
assert!((1..=12).contains(&m), "Month out of range: {m}");
}
if let Ok(d) = day.parse::<u8>() {
assert!((1..=31).contains(&d), "Day out of range: {d}");
}
let content =
std::fs::read_to_string(&path).unwrap_or_else(|_| panic!("Failed to read session file"));
let mut lines = content.lines();
let meta_line = lines
.next()
.ok_or("missing session meta line")
.unwrap_or_else(|_| panic!("missing session meta line"));
let meta: serde_json::Value = serde_json::from_str(meta_line)
.unwrap_or_else(|_| panic!("Failed to parse session meta line as JSON"));
assert!(meta.get("id").is_some(), "SessionMeta missing id");
assert!(
meta.get("timestamp").is_some(),
"SessionMeta missing timestamp"
);
let mut found_message = false;
for line in lines {
if line.trim().is_empty() {
continue;
}
let Ok(item) = serde_json::from_str::<serde_json::Value>(line) else {
continue;
};
if item.get("type").and_then(|t| t.as_str()) == Some("message") {
if let Some(c) = item.get("content") {
if c.to_string().contains(&marker) {
found_message = true;
break;
}
}
}
}
assert!(
found_message,
"No message found in session file containing the marker"
);
// Second run: resume and append.
let orig_len = content.lines().count();
let marker2 = format!("integration-resume-{}", Uuid::new_v4());
let prompt2 = format!("echo {marker2}");
// Crossplatform safe resume override. On Windows, backslashes in a TOML string must be escaped
// or the parse will fail and the raw literal (including quotes) may be preserved all the way down
// to Config, which in turn breaks resume because the path is invalid. Normalize to forward slashes
// to sidestep the issue.
let resume_path_str = path.to_string_lossy().replace('\\', "/");
let resume_override = format!("experimental_resume=\"{resume_path_str}\"");
let mut cmd2 = AssertCommand::new("cargo");
cmd2.arg("run")
.arg("-p")
.arg("codex-cli")
.arg("--quiet")
.arg("--")
.arg("exec")
.arg("--skip-git-repo-check")
.arg("-c")
.arg(&resume_override)
.arg("-C")
.arg(env!("CARGO_MANIFEST_DIR"))
.arg(&prompt2);
cmd2.env("CODEX_HOME", home.path())
.env("OPENAI_API_KEY", "dummy")
.env("CODEX_RS_SSE_FIXTURE", &fixture)
.env("OPENAI_BASE_URL", "http://unused.local");
let output2 = cmd2.output().unwrap();
assert!(output2.status.success(), "resume codex-cli run failed");
// The rollout writer runs on a background async task; give it a moment to flush.
let mut new_len = orig_len;
let deadline = Instant::now() + Duration::from_secs(5);
let mut content2 = String::new();
while Instant::now() < deadline {
if let Ok(c) = std::fs::read_to_string(&path) {
let count = c.lines().count();
if count > orig_len {
content2 = c;
new_len = count;
break;
}
}
std::thread::sleep(Duration::from_millis(50));
}
if content2.is_empty() {
// last attempt
content2 = std::fs::read_to_string(&path).unwrap();
new_len = content2.lines().count();
}
assert!(new_len > orig_len, "rollout file did not grow after resume");
assert!(content2.contains(&marker), "rollout lost original marker");
assert!(
content2.contains(&marker2),
"rollout missing resumed marker"
);
}
/// Integration test to verify git info is collected and recorded in session files.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn integration_git_info_unit_test() {
// This test verifies git info collection works independently
// without depending on the full CLI integration
// 1. Create temp directory for git repo
let temp_dir = TempDir::new().unwrap();
let git_repo = temp_dir.path().to_path_buf();
let envs = vec![
("GIT_CONFIG_GLOBAL", "/dev/null"),
("GIT_CONFIG_NOSYSTEM", "1"),
];
// 2. Initialize a git repository with some content
let init_output = std::process::Command::new("git")
.envs(envs.clone())
.args(["init"])
.current_dir(&git_repo)
.output()
.unwrap();
assert!(init_output.status.success(), "git init failed");
// Configure git user (required for commits)
std::process::Command::new("git")
.envs(envs.clone())
.args(["config", "user.name", "Integration Test"])
.current_dir(&git_repo)
.output()
.unwrap();
std::process::Command::new("git")
.envs(envs.clone())
.args(["config", "user.email", "test@example.com"])
.current_dir(&git_repo)
.output()
.unwrap();
// Create a test file and commit it
let test_file = git_repo.join("test.txt");
std::fs::write(&test_file, "integration test content").unwrap();
std::process::Command::new("git")
.envs(envs.clone())
.args(["add", "."])
.current_dir(&git_repo)
.output()
.unwrap();
let commit_output = std::process::Command::new("git")
.envs(envs.clone())
.args(["commit", "-m", "Integration test commit"])
.current_dir(&git_repo)
.output()
.unwrap();
assert!(commit_output.status.success(), "git commit failed");
// Create a branch to test branch detection
std::process::Command::new("git")
.envs(envs.clone())
.args(["checkout", "-b", "integration-test-branch"])
.current_dir(&git_repo)
.output()
.unwrap();
// Add a remote to test repository URL detection
std::process::Command::new("git")
.envs(envs.clone())
.args([
"remote",
"add",
"origin",
"https://github.com/example/integration-test.git",
])
.current_dir(&git_repo)
.output()
.unwrap();
// 3. Test git info collection directly
let git_info = codex_core::git_info::collect_git_info(&git_repo).await;
// 4. Verify git info is present and contains expected data
assert!(git_info.is_some(), "Git info should be collected");
let git_info = git_info.unwrap();
// Check that we have a commit hash
assert!(
git_info.commit_hash.is_some(),
"Git info should contain commit_hash"
);
let commit_hash = git_info.commit_hash.as_ref().unwrap();
assert_eq!(commit_hash.len(), 40, "Commit hash should be 40 characters");
assert!(
commit_hash.chars().all(|c| c.is_ascii_hexdigit()),
"Commit hash should be hexadecimal"
);
// Check that we have the correct branch
assert!(git_info.branch.is_some(), "Git info should contain branch");
let branch = git_info.branch.as_ref().unwrap();
assert_eq!(
branch, "integration-test-branch",
"Branch should match what we created"
);
// Check that we have the repository URL
assert!(
git_info.repository_url.is_some(),
"Git info should contain repository_url"
);
let repo_url = git_info.repository_url.as_ref().unwrap();
assert_eq!(
repo_url, "https://github.com/example/integration-test.git",
"Repository URL should match what we configured"
);
println!("✅ Git info collection test passed!");
println!(" Commit: {commit_hash}");
println!(" Branch: {branch}");
println!(" Repo: {repo_url}");
// 5. Test serialization to ensure it works in SessionMeta
let serialized = serde_json::to_string(&git_info).unwrap();
let deserialized: codex_core::git_info::GitInfo = serde_json::from_str(&serialized).unwrap();
assert_eq!(git_info.commit_hash, deserialized.commit_hash);
assert_eq!(git_info.branch, deserialized.branch);
assert_eq!(git_info.repository_url, deserialized.repository_url);
println!("✅ Git info serialization test passed!");
}

View File

@@ -0,0 +1,574 @@
#![allow(clippy::expect_used)]
#![allow(clippy::unwrap_used)]
use std::path::PathBuf;
use chrono::Utc;
use codex_core::Codex;
use codex_core::CodexSpawnOk;
use codex_core::ModelProviderInfo;
use codex_core::WireApi;
use codex_core::built_in_model_providers;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SessionConfiguredEvent;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_login::AuthDotJson;
use codex_login::AuthMode;
use codex_login::CodexAuth;
use codex_login::TokenData;
use core_test_support::load_default_config_for_test;
use core_test_support::load_sse_fixture_with_id;
use core_test_support::wait_for_event;
use tempfile::TempDir;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
use wiremock::matchers::header_regex;
use wiremock::matchers::method;
use wiremock::matchers::path;
use wiremock::matchers::query_param;
/// Build minimal SSE stream with completed marker using the JSON fixture.
fn sse_completed(id: &str) -> String {
load_sse_fixture_with_id("tests/fixtures/completed_template.json", id)
}
fn assert_message_role(request_body: &serde_json::Value, role: &str) {
assert_eq!(request_body["role"].as_str().unwrap(), role);
}
fn assert_message_starts_with(request_body: &serde_json::Value, text: &str) {
let content = request_body["content"][0]["text"]
.as_str()
.expect("invalid message content");
assert!(
content.starts_with(text),
"expected message content '{content}' to start with '{text}'"
);
}
fn assert_message_ends_with(request_body: &serde_json::Value, text: &str) {
let content = request_body["content"][0]["text"]
.as_str()
.expect("invalid message content");
assert!(
content.ends_with(text),
"expected message content '{content}' to end with '{text}'"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn includes_session_id_and_model_headers_in_request() {
#![allow(clippy::unwrap_used)]
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
// Mock server
let server = MockServer::start().await;
// First request must NOT include `previous_response_id`.
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
// Init session
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = model_provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } = Codex::spawn(
config,
Some(CodexAuth::from_api_key("Test API Key".to_string())),
ctrl_c.clone(),
)
.await
.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
let EventMsg::SessionConfigured(SessionConfiguredEvent { session_id, .. }) =
wait_for_event(&codex, |ev| matches!(ev, EventMsg::SessionConfigured(_))).await
else {
unreachable!()
};
let current_session_id = Some(session_id.to_string());
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// get request from the server
let request = &server.received_requests().await.unwrap()[0];
let request_session_id = request.headers.get("session_id").unwrap();
let request_authorization = request.headers.get("authorization").unwrap();
let request_originator = request.headers.get("originator").unwrap();
assert!(current_session_id.is_some());
assert_eq!(
request_session_id.to_str().unwrap(),
current_session_id.as_ref().unwrap()
);
assert_eq!(request_originator.to_str().unwrap(), "codex_cli_rs");
assert_eq!(
request_authorization.to_str().unwrap(),
"Bearer Test API Key"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn includes_base_instructions_override_in_request() {
#![allow(clippy::unwrap_used)]
// Mock server
let server = MockServer::start().await;
// First request must NOT include `previous_response_id`.
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.base_instructions = Some("test instructions".to_string());
config.model_provider = model_provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } = Codex::spawn(
config,
Some(CodexAuth::from_api_key("Test API Key".to_string())),
ctrl_c.clone(),
)
.await
.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let request = &server.received_requests().await.unwrap()[0];
let request_body = request.body_json::<serde_json::Value>().unwrap();
assert!(
request_body["instructions"]
.as_str()
.unwrap()
.contains("test instructions")
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn originator_config_override_is_used() {
#![allow(clippy::unwrap_used)]
// Mock server
let server = MockServer::start().await;
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = model_provider;
config.internal_originator = Some("my_override".to_string());
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } = Codex::spawn(
config,
Some(CodexAuth::from_api_key("Test API Key".to_string())),
ctrl_c.clone(),
)
.await
.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let request = &server.received_requests().await.unwrap()[0];
let request_originator = request.headers.get("originator").unwrap();
assert_eq!(request_originator.to_str().unwrap(), "my_override");
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn chatgpt_auth_sends_correct_request() {
#![allow(clippy::unwrap_used)]
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
// Mock server
let server = MockServer::start().await;
// First request must NOT include `previous_response_id`.
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/api/codex/responses"))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/api/codex", server.uri())),
..built_in_model_providers()["openai"].clone()
};
// Init session
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = model_provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } =
Codex::spawn(config, Some(create_dummy_codex_auth()), ctrl_c.clone())
.await
.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
let EventMsg::SessionConfigured(SessionConfiguredEvent { session_id, .. }) =
wait_for_event(&codex, |ev| matches!(ev, EventMsg::SessionConfigured(_))).await
else {
unreachable!()
};
let current_session_id = Some(session_id.to_string());
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// get request from the server
let request = &server.received_requests().await.unwrap()[0];
let request_session_id = request.headers.get("session_id").unwrap();
let request_authorization = request.headers.get("authorization").unwrap();
let request_originator = request.headers.get("originator").unwrap();
let request_chatgpt_account_id = request.headers.get("chatgpt-account-id").unwrap();
let request_body = request.body_json::<serde_json::Value>().unwrap();
assert!(current_session_id.is_some());
assert_eq!(
request_session_id.to_str().unwrap(),
current_session_id.as_ref().unwrap()
);
assert_eq!(request_originator.to_str().unwrap(), "codex_cli_rs");
assert_eq!(
request_authorization.to_str().unwrap(),
"Bearer Access Token"
);
assert_eq!(request_chatgpt_account_id.to_str().unwrap(), "account_id");
assert!(!request_body["store"].as_bool().unwrap());
assert!(request_body["stream"].as_bool().unwrap());
assert_eq!(
request_body["include"][0].as_str().unwrap(),
"reasoning.encrypted_content"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn includes_user_instructions_message_in_request() {
#![allow(clippy::unwrap_used)]
let server = MockServer::start().await;
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = model_provider;
config.user_instructions = Some("be nice".to_string());
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } = Codex::spawn(
config,
Some(CodexAuth::from_api_key("Test API Key".to_string())),
ctrl_c.clone(),
)
.await
.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let request = &server.received_requests().await.unwrap()[0];
let request_body = request.body_json::<serde_json::Value>().unwrap();
assert!(
!request_body["instructions"]
.as_str()
.unwrap()
.contains("be nice")
);
assert_message_role(&request_body["input"][0], "user");
assert_message_starts_with(&request_body["input"][0], "<environment_context>\n\n");
assert_message_ends_with(&request_body["input"][0], "</environment_context>");
assert_message_role(&request_body["input"][1], "user");
assert_message_starts_with(&request_body["input"][1], "<user_instructions>\n\n");
assert_message_ends_with(&request_body["input"][1], "</user_instructions>");
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn azure_overrides_assign_properties_used_for_responses_url() {
#![allow(clippy::unwrap_used)]
let existing_env_var_with_random_value = if cfg!(windows) { "USERNAME" } else { "USER" };
// Mock server
let server = MockServer::start().await;
// First request must NOT include `previous_response_id`.
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
// Expect POST to /openai/responses with api-version query param
Mock::given(method("POST"))
.and(path("/openai/responses"))
.and(query_param("api-version", "2025-04-01-preview"))
.and(header_regex("Custom-Header", "Value"))
.and(header_regex(
"Authorization",
format!(
"Bearer {}",
std::env::var(existing_env_var_with_random_value).unwrap()
)
.as_str(),
))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let provider = ModelProviderInfo {
name: "custom".to_string(),
base_url: Some(format!("{}/openai", server.uri())),
// Reuse the existing environment variable to avoid using unsafe code
env_key: Some(existing_env_var_with_random_value.to_string()),
query_params: Some(std::collections::HashMap::from([(
"api-version".to_string(),
"2025-04-01-preview".to_string(),
)])),
env_key_instructions: None,
wire_api: WireApi::Responses,
http_headers: Some(std::collections::HashMap::from([(
"Custom-Header".to_string(),
"Value".to_string(),
)])),
env_http_headers: None,
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: false,
};
// Init session
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } = Codex::spawn(config, None, ctrl_c.clone()).await.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn env_var_overrides_loaded_auth() {
#![allow(clippy::unwrap_used)]
let existing_env_var_with_random_value = if cfg!(windows) { "USERNAME" } else { "USER" };
// Mock server
let server = MockServer::start().await;
// First request must NOT include `previous_response_id`.
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
// Expect POST to /openai/responses with api-version query param
Mock::given(method("POST"))
.and(path("/openai/responses"))
.and(query_param("api-version", "2025-04-01-preview"))
.and(header_regex("Custom-Header", "Value"))
.and(header_regex(
"Authorization",
format!(
"Bearer {}",
std::env::var(existing_env_var_with_random_value).unwrap()
)
.as_str(),
))
.respond_with(first)
.expect(1)
.mount(&server)
.await;
let provider = ModelProviderInfo {
name: "custom".to_string(),
base_url: Some(format!("{}/openai", server.uri())),
// Reuse the existing environment variable to avoid using unsafe code
env_key: Some(existing_env_var_with_random_value.to_string()),
query_params: Some(std::collections::HashMap::from([(
"api-version".to_string(),
"2025-04-01-preview".to_string(),
)])),
env_key_instructions: None,
wire_api: WireApi::Responses,
http_headers: Some(std::collections::HashMap::from([(
"Custom-Header".to_string(),
"Value".to_string(),
)])),
env_http_headers: None,
request_max_retries: None,
stream_max_retries: None,
stream_idle_timeout_ms: None,
requires_openai_auth: false,
};
// Init session
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } =
Codex::spawn(config, Some(create_dummy_codex_auth()), ctrl_c.clone())
.await
.unwrap();
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
}
fn create_dummy_codex_auth() -> CodexAuth {
CodexAuth::new(
None,
AuthMode::ChatGPT,
PathBuf::new(),
Some(AuthDotJson {
openai_api_key: None,
tokens: Some(TokenData {
id_token: Default::default(),
access_token: "Access Token".to_string(),
refresh_token: "test".to_string(),
account_id: Some("account_id".to_string()),
}),
last_refresh: Some(Utc::now()),
}),
)
}

View File

@@ -0,0 +1,13 @@
[package]
name = "core_test_support"
version = { workspace = true }
edition = "2024"
[lib]
path = "lib.rs"
[dependencies]
codex-core = { path = "../.." }
serde_json = "1"
tempfile = "3"
tokio = { version = "1", features = ["time"] }

View File

@@ -1,9 +1,5 @@
#![allow(clippy::expect_used)]
// Helpers shared by the integration tests. These are located inside the
// `tests/` tree on purpose so they never become part of the public API surface
// of the `codex-core` crate.
use tempfile::TempDir;
use codex_core::config::Config;
@@ -30,7 +26,6 @@ pub fn load_default_config_for_test(codex_home: &TempDir) -> Config {
/// with only a `type` field results in an event with no `data:` section. This
/// makes it trivial to extend the fixtures as OpenAI adds new event kinds or
/// fields.
#[allow(dead_code)]
pub fn load_sse_fixture(path: impl AsRef<std::path::Path>) -> String {
let events: Vec<serde_json::Value> =
serde_json::from_reader(std::fs::File::open(path).expect("read fixture"))
@@ -55,7 +50,6 @@ pub fn load_sse_fixture(path: impl AsRef<std::path::Path>) -> String {
/// fixture template with the supplied identifier before parsing. This lets a
/// single JSON template be reused by multiple tests that each need a unique
/// `response_id`.
#[allow(dead_code)]
pub fn load_sse_fixture_with_id(path: impl AsRef<std::path::Path>, id: &str) -> String {
let raw = std::fs::read_to_string(path).expect("read fixture template");
let replaced = raw.replace("__ID__", id);
@@ -76,3 +70,23 @@ pub fn load_sse_fixture_with_id(path: impl AsRef<std::path::Path>, id: &str) ->
})
.collect()
}
pub async fn wait_for_event<F>(
codex: &codex_core::Codex,
mut predicate: F,
) -> codex_core::protocol::EventMsg
where
F: FnMut(&codex_core::protocol::EventMsg) -> bool,
{
use tokio::time::Duration;
use tokio::time::timeout;
loop {
let ev = timeout(Duration::from_secs(1), codex.next_event())
.await
.expect("timeout waiting for event")
.expect("stream ended unexpectedly");
if predicate(&ev.msg) {
return ev.msg;
}
}
}

View File

@@ -0,0 +1,254 @@
#![expect(clippy::unwrap_used)]
use codex_core::Codex;
use codex_core::CodexSpawnOk;
use codex_core::ModelProviderInfo;
use codex_core::built_in_model_providers;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_login::CodexAuth;
use core_test_support::load_default_config_for_test;
use core_test_support::wait_for_event;
use serde_json::Value;
use tempfile::TempDir;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
use wiremock::matchers::method;
use wiremock::matchers::path;
use pretty_assertions::assert_eq;
// --- Test helpers -----------------------------------------------------------
/// Build an SSE stream body from a list of JSON events.
fn sse(events: Vec<Value>) -> String {
use std::fmt::Write as _;
let mut out = String::new();
for ev in events {
let kind = ev.get("type").and_then(|v| v.as_str()).unwrap();
writeln!(&mut out, "event: {kind}").unwrap();
if !ev.as_object().map(|o| o.len() == 1).unwrap_or(false) {
write!(&mut out, "data: {ev}\n\n").unwrap();
} else {
out.push('\n');
}
}
out
}
/// Convenience: SSE event for a completed response with a specific id.
fn ev_completed(id: &str) -> Value {
serde_json::json!({
"type": "response.completed",
"response": {
"id": id,
"usage": {"input_tokens":0,"input_tokens_details":null,"output_tokens":0,"output_tokens_details":null,"total_tokens":0}
}
})
}
/// Convenience: SSE event for a single assistant message output item.
fn ev_assistant_message(id: &str, text: &str) -> Value {
serde_json::json!({
"type": "response.output_item.done",
"item": {
"type": "message",
"role": "assistant",
"id": id,
"content": [{"type": "output_text", "text": text}]
}
})
}
fn sse_response(body: String) -> ResponseTemplate {
ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(body, "text/event-stream")
}
async fn mount_sse_once<M>(server: &MockServer, matcher: M, body: String)
where
M: wiremock::Match + Send + Sync + 'static,
{
Mock::given(method("POST"))
.and(path("/v1/responses"))
.and(matcher)
.respond_with(sse_response(body))
.expect(1)
.mount(server)
.await;
}
const FIRST_REPLY: &str = "FIRST_REPLY";
const SUMMARY_TEXT: &str = "SUMMARY_ONLY_CONTEXT";
const SUMMARIZE_TRIGGER: &str = "Start Summarization";
const THIRD_USER_MSG: &str = "next turn";
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn summarize_context_three_requests_and_instructions() {
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
// Set up a mock server that we can inspect after the run.
let server = MockServer::start().await;
// SSE 1: assistant replies normally so it is recorded in history.
let sse1 = sse(vec![
ev_assistant_message("m1", FIRST_REPLY),
ev_completed("r1"),
]);
// SSE 2: summarizer returns a summary message.
let sse2 = sse(vec![
ev_assistant_message("m2", SUMMARY_TEXT),
ev_completed("r2"),
]);
// SSE 3: minimal completed; we only need to capture the request body.
let sse3 = sse(vec![ev_completed("r3")]);
// Mount three expectations, one per request, matched by body content.
let first_matcher = |req: &wiremock::Request| {
let body = std::str::from_utf8(&req.body).unwrap_or("");
body.contains("\"text\":\"hello world\"")
&& !body.contains(&format!("\"text\":\"{SUMMARIZE_TRIGGER}\""))
};
mount_sse_once(&server, first_matcher, sse1).await;
let second_matcher = |req: &wiremock::Request| {
let body = std::str::from_utf8(&req.body).unwrap_or("");
body.contains(&format!("\"text\":\"{SUMMARIZE_TRIGGER}\""))
};
mount_sse_once(&server, second_matcher, sse2).await;
let third_matcher = |req: &wiremock::Request| {
let body = std::str::from_utf8(&req.body).unwrap_or("");
body.contains(&format!("\"text\":\"{THIRD_USER_MSG}\""))
};
mount_sse_once(&server, third_matcher, sse3).await;
// Build config pointing to the mock server and spawn Codex.
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
let home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&home);
config.model_provider = model_provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let CodexSpawnOk { codex, .. } = Codex::spawn(
config,
Some(CodexAuth::from_api_key("dummy".to_string())),
ctrl_c.clone(),
)
.await
.unwrap();
// 1) Normal user input should hit server once.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello world".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// 2) Summarize second hit with summarization instructions.
codex.submit(Op::Compact).await.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// 3) Next user input third hit; history should include only the summary.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: THIRD_USER_MSG.into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// Inspect the three captured requests.
let requests = server.received_requests().await.unwrap();
assert_eq!(requests.len(), 3, "expected exactly three requests");
let req1 = &requests[0];
let req2 = &requests[1];
let req3 = &requests[2];
let body1 = req1.body_json::<serde_json::Value>().unwrap();
let body2 = req2.body_json::<serde_json::Value>().unwrap();
let body3 = req3.body_json::<serde_json::Value>().unwrap();
// System instructions should change for the summarization turn.
let instr1 = body1.get("instructions").and_then(|v| v.as_str()).unwrap();
let instr2 = body2.get("instructions").and_then(|v| v.as_str()).unwrap();
assert_ne!(
instr1, instr2,
"summarization should override base instructions"
);
assert!(
instr2.contains("You are a summarization assistant"),
"summarization instructions not applied"
);
// The summarization request should include the injected user input marker.
let input2 = body2.get("input").and_then(|v| v.as_array()).unwrap();
// The last item is the user message created from the injected input.
let last2 = input2.last().unwrap();
assert_eq!(last2.get("type").unwrap().as_str().unwrap(), "message");
assert_eq!(last2.get("role").unwrap().as_str().unwrap(), "user");
let text2 = last2["content"][0]["text"].as_str().unwrap();
assert!(text2.contains(SUMMARIZE_TRIGGER));
// Third request must contain only the summary from step 2 as prior history plus new user msg.
let input3 = body3.get("input").and_then(|v| v.as_array()).unwrap();
println!("third request body: {body3}");
assert!(
input3.len() >= 2,
"expected summary + new user message in third request"
);
// Collect all (role, text) message tuples.
let mut messages: Vec<(String, String)> = Vec::new();
for item in input3 {
if item["type"].as_str() == Some("message") {
let role = item["role"].as_str().unwrap_or_default().to_string();
let text = item["content"][0]["text"]
.as_str()
.unwrap_or_default()
.to_string();
messages.push((role, text));
}
}
// Exactly one assistant message should remain after compaction and the new user message is present.
let assistant_count = messages.iter().filter(|(r, _)| r == "assistant").count();
assert_eq!(
assistant_count, 1,
"exactly one assistant message should remain after compaction"
);
assert!(
messages
.iter()
.any(|(r, t)| r == "user" && t == THIRD_USER_MSG),
"third request should include the new user message"
);
assert!(
!messages.iter().any(|(_, t)| t.contains("hello world")),
"third request should not include the original user input"
);
assert!(
!messages.iter().any(|(_, t)| t.contains(SUMMARIZE_TRIGGER)),
"third request should not include the summarize trigger"
);
}

View File

@@ -0,0 +1,71 @@
#![cfg(target_os = "macos")]
#![expect(clippy::expect_used)]
use std::collections::HashMap;
use std::sync::Arc;
use codex_core::exec::ExecParams;
use codex_core::exec::SandboxType;
use codex_core::exec::process_exec_tool_call;
use codex_core::protocol::SandboxPolicy;
use codex_core::spawn::CODEX_SANDBOX_ENV_VAR;
use tempfile::TempDir;
use tokio::sync::Notify;
use codex_core::get_platform_sandbox;
async fn run_test_cmd(tmp: TempDir, cmd: Vec<&str>, should_be_ok: bool) {
if std::env::var(CODEX_SANDBOX_ENV_VAR) == Ok("seatbelt".to_string()) {
eprintln!("{CODEX_SANDBOX_ENV_VAR} is set to 'seatbelt', skipping test.");
return;
}
let sandbox_type = get_platform_sandbox().expect("should be able to get sandbox type");
assert_eq!(sandbox_type, SandboxType::MacosSeatbelt);
let params = ExecParams {
command: cmd.iter().map(|s| s.to_string()).collect(),
cwd: tmp.path().to_path_buf(),
timeout_ms: Some(1000),
env: HashMap::new(),
with_escalated_permissions: None,
justification: None,
};
let ctrl_c = Arc::new(Notify::new());
let policy = SandboxPolicy::new_read_only_policy();
let result = process_exec_tool_call(params, sandbox_type, ctrl_c, &policy, &None, None).await;
assert!(result.is_ok() == should_be_ok);
}
/// Command succeeds with exit code 0 normally
#[tokio::test]
async fn exit_code_0_succeeds() {
let tmp = TempDir::new().expect("should be able to create temp dir");
let cmd = vec!["echo", "hello"];
run_test_cmd(tmp, cmd, true).await
}
/// Command not found returns exit code 127, this is not considered a sandbox error
#[tokio::test]
async fn exit_command_not_found_is_ok() {
let tmp = TempDir::new().expect("should be able to create temp dir");
let cmd = vec!["/bin/bash", "-c", "nonexistent_command_12345"];
run_test_cmd(tmp, cmd, true).await
}
/// Writing a file fails and should be considered a sandbox error
#[tokio::test]
async fn write_file_fails_as_sandbox_error() {
let tmp = TempDir::new().expect("should be able to create temp dir");
let path = tmp.path().join("test.txt");
let cmd = vec![
"/user/bin/touch",
path.to_str().expect("should be able to get path"),
];
run_test_cmd(tmp, cmd, false).await;
}

View File

@@ -0,0 +1,147 @@
#![cfg(unix)]
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Arc;
use async_channel::Receiver;
use codex_core::exec::ExecParams;
use codex_core::exec::SandboxType;
use codex_core::exec::StdoutStream;
use codex_core::exec::process_exec_tool_call;
use codex_core::protocol::Event;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecCommandOutputDeltaEvent;
use codex_core::protocol::ExecOutputStream;
use codex_core::protocol::SandboxPolicy;
use tokio::sync::Notify;
fn collect_stdout_events(rx: Receiver<Event>) -> Vec<u8> {
let mut out = Vec::new();
while let Ok(ev) = rx.try_recv() {
if let EventMsg::ExecCommandOutputDelta(ExecCommandOutputDeltaEvent {
stream: ExecOutputStream::Stdout,
chunk,
..
}) = ev.msg
{
out.extend_from_slice(&chunk);
}
}
out
}
#[tokio::test]
async fn test_exec_stdout_stream_events_echo() {
let (tx, rx) = async_channel::unbounded::<Event>();
let stdout_stream = StdoutStream {
sub_id: "test-sub".to_string(),
call_id: "call-1".to_string(),
tx_event: tx,
};
let cmd = vec![
"/bin/sh".to_string(),
"-c".to_string(),
// Use printf for predictable behavior across shells
"printf 'hello-world\n'".to_string(),
];
let params = ExecParams {
command: cmd,
cwd: std::env::current_dir().unwrap_or_else(|_| PathBuf::from(".")),
timeout_ms: Some(5_000),
env: HashMap::new(),
with_escalated_permissions: None,
justification: None,
};
let ctrl_c = Arc::new(Notify::new());
let policy = SandboxPolicy::new_read_only_policy();
let result = process_exec_tool_call(
params,
SandboxType::None,
ctrl_c,
&policy,
&None,
Some(stdout_stream),
)
.await;
let result = match result {
Ok(r) => r,
Err(e) => panic!("process_exec_tool_call failed: {e}"),
};
assert_eq!(result.exit_code, 0);
assert_eq!(result.stdout, "hello-world\n");
let streamed = collect_stdout_events(rx);
// We should have received at least the same contents (possibly in one chunk)
assert_eq!(String::from_utf8_lossy(&streamed), "hello-world\n");
}
#[tokio::test]
async fn test_exec_stderr_stream_events_echo() {
let (tx, rx) = async_channel::unbounded::<Event>();
let stdout_stream = StdoutStream {
sub_id: "test-sub".to_string(),
call_id: "call-2".to_string(),
tx_event: tx,
};
let cmd = vec![
"/bin/sh".to_string(),
"-c".to_string(),
// Write to stderr explicitly
"printf 'oops\n' 1>&2".to_string(),
];
let params = ExecParams {
command: cmd,
cwd: std::env::current_dir().unwrap_or_else(|_| PathBuf::from(".")),
timeout_ms: Some(5_000),
env: HashMap::new(),
with_escalated_permissions: None,
justification: None,
};
let ctrl_c = Arc::new(Notify::new());
let policy = SandboxPolicy::new_read_only_policy();
let result = process_exec_tool_call(
params,
SandboxType::None,
ctrl_c,
&policy,
&None,
Some(stdout_stream),
)
.await;
let result = match result {
Ok(r) => r,
Err(e) => panic!("process_exec_tool_call failed: {e}"),
};
assert_eq!(result.exit_code, 0);
assert_eq!(result.stdout, "");
assert_eq!(result.stderr, "oops\n");
// Collect only stderr delta events
let mut err = Vec::new();
while let Ok(ev) = rx.try_recv() {
if let EventMsg::ExecCommandOutputDelta(ExecCommandOutputDeltaEvent {
stream: ExecOutputStream::Stderr,
chunk,
..
}) = ev.msg
{
err.extend_from_slice(&chunk);
}
}
assert_eq!(String::from_utf8_lossy(&err), "oops\n");
}

View File

@@ -20,15 +20,15 @@
use std::time::Duration;
use codex_core::Codex;
use codex_core::CodexSpawnOk;
use codex_core::error::CodexErr;
use codex_core::protocol::AgentMessageEvent;
use codex_core::protocol::ErrorEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
mod test_support;
use core_test_support::load_default_config_for_test;
use tempfile::TempDir;
use test_support::load_default_config_for_test;
use tokio::sync::Notify;
use tokio::time::timeout;
@@ -45,23 +45,12 @@ async fn spawn_codex() -> Result<Codex, CodexErr> {
"OPENAI_API_KEY must be set for live tests"
);
// Environment tweaks to keep the tests snappy and inexpensive while still
// exercising retry/robustness logic.
//
// NOTE: Starting with the 2024 edition `std::env::set_var` is `unsafe`
// because changing the process environment races with any other threads
// that might be performing environment look-ups at the same time.
// Restrict the unsafety to this tiny block that happens at the very
// beginning of the test, before we spawn any background tasks that could
// observe the environment.
unsafe {
std::env::set_var("OPENAI_REQUEST_MAX_RETRIES", "2");
std::env::set_var("OPENAI_STREAM_MAX_RETRIES", "2");
}
let codex_home = TempDir::new().unwrap();
let config = load_default_config_for_test(&codex_home);
let (agent, _init_id) = Codex::spawn(config, std::sync::Arc::new(Notify::new())).await?;
let mut config = load_default_config_for_test(&codex_home);
config.model_provider.request_max_retries = Some(2);
config.model_provider.stream_max_retries = Some(2);
let CodexSpawnOk { codex: agent, .. } =
Codex::spawn(config, None, std::sync::Arc::new(Notify::new())).await?;
Ok(agent)
}
@@ -79,7 +68,7 @@ async fn live_streaming_and_prev_id_reset() {
let codex = spawn_codex().await.unwrap();
// ---------- Task 1 ----------
// ---------- Task 1 ----------
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
@@ -113,7 +102,7 @@ async fn live_streaming_and_prev_id_reset() {
"Agent did not stream any AgentMessage before TaskComplete"
);
// ---------- Task 2 (same session) ----------
// ---------- Task 2 (same session) ----------
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
@@ -188,8 +177,7 @@ async fn live_shell_function_call() {
match ev.msg {
EventMsg::ExecCommandBegin(codex_core::protocol::ExecCommandBeginEvent {
command,
call_id: _,
cwd: _,
..
}) => {
assert_eq!(command, vec!["echo", MARKER]);
saw_begin = true;
@@ -197,8 +185,7 @@ async fn live_shell_function_call() {
EventMsg::ExecCommandEnd(codex_core::protocol::ExecCommandEndEvent {
stdout,
exit_code,
call_id: _,
stderr: _,
..
}) => {
assert_eq!(exit_code, 0, "echo returned nonzero exit code");
assert!(stdout.contains(MARKER));

View File

@@ -1,166 +0,0 @@
use std::time::Duration;
use codex_core::Codex;
use codex_core::ModelProviderInfo;
use codex_core::exec::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_core::protocol::ErrorEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
mod test_support;
use serde_json::Value;
use tempfile::TempDir;
use test_support::load_default_config_for_test;
use test_support::load_sse_fixture_with_id;
use tokio::time::timeout;
use wiremock::Match;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::Request;
use wiremock::ResponseTemplate;
use wiremock::matchers::method;
use wiremock::matchers::path;
/// Matcher asserting that JSON body has NO `previous_response_id` field.
struct NoPrevId;
impl Match for NoPrevId {
fn matches(&self, req: &Request) -> bool {
serde_json::from_slice::<Value>(&req.body)
.map(|v| v.get("previous_response_id").is_none())
.unwrap_or(false)
}
}
/// Matcher asserting that JSON body HAS a `previous_response_id` field.
struct HasPrevId;
impl Match for HasPrevId {
fn matches(&self, req: &Request) -> bool {
serde_json::from_slice::<Value>(&req.body)
.map(|v| v.get("previous_response_id").is_some())
.unwrap_or(false)
}
}
/// Build minimal SSE stream with completed marker using the JSON fixture.
fn sse_completed(id: &str) -> String {
load_sse_fixture_with_id("tests/fixtures/completed_template.json", id)
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn keeps_previous_response_id_between_tasks() {
#![allow(clippy::unwrap_used)]
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
// Mock server
let server = MockServer::start().await;
// First request must NOT include `previous_response_id`.
let first = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp1"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.and(NoPrevId)
.respond_with(first)
.expect(1)
.mount(&server)
.await;
// Second request MUST include `previous_response_id`.
let second = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse_completed("resp2"), "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.and(HasPrevId)
.respond_with(second)
.expect(1)
.mount(&server)
.await;
// Environment
// Update environment `set_var` is `unsafe` starting with the 2024
// edition so we group the calls into a single `unsafe { … }` block.
unsafe {
std::env::set_var("OPENAI_REQUEST_MAX_RETRIES", "0");
std::env::set_var("OPENAI_STREAM_MAX_RETRIES", "0");
}
let model_provider = ModelProviderInfo {
name: "openai".into(),
base_url: format!("{}/v1", server.uri()),
// Environment variable that should exist in the test environment.
// ModelClient will return an error if the environment variable for the
// provider is not set.
env_key: Some("PATH".into()),
env_key_instructions: None,
wire_api: codex_core::WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
};
// Init session
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = model_provider;
let ctrl_c = std::sync::Arc::new(tokio::sync::Notify::new());
let (codex, _init_id) = Codex::spawn(config, ctrl_c.clone()).await.unwrap();
// Task 1 triggers first request (no previous_response_id)
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello".into(),
}],
})
.await
.unwrap();
// Wait for TaskComplete
loop {
let ev = timeout(Duration::from_secs(1), codex.next_event())
.await
.unwrap()
.unwrap();
if matches!(ev.msg, EventMsg::TaskComplete(_)) {
break;
}
}
// Task 2 should include `previous_response_id` (triggers second request)
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "again".into(),
}],
})
.await
.unwrap();
// Wait for TaskComplete or error
loop {
let ev = timeout(Duration::from_secs(1), codex.next_event())
.await
.unwrap()
.unwrap();
match ev.msg {
EventMsg::TaskComplete(_) => break,
EventMsg::Error(ErrorEvent { message }) => {
panic!("unexpected error: {message}")
}
_ => {
// Ignore other events.
}
}
}
}

Some files were not shown because too many files have changed in this diff Show More