Compare commits

78 Commits

Author SHA1 Message Date
Michael Bolin
8f7a54501c chore: Rust release, set prerelease:false and version=0.0.2504301132 (#755)
The generated DotSlash file has URLs that refer to
`https://github.com/openai/codex/releases/`, so let's set
`prerelease:false` (but keep `draft:true` for now) so those URLs should
work.

Also updated `version` in Cargo workspace so I will kick off a build
once this lands.
2025-04-30 11:53:03 -07:00
Michael Bolin
2f1d96e77d fix: remove errant eslint-disable so pnpm run lint passes again (#756)
My bad: introduced in https://github.com/openai/codex/pull/753.
2025-04-30 11:37:11 -07:00
Michael Bolin
84aaefa102 fix: read version from package.json instead of modifying session.ts (#753)
I am working to simplify the build process. As a first step, update
`session.ts` so it reads the `version` from `package.json` at runtime so
we no longer have to modify it during the build process. I want to get
to a place where the build looks like:

```
cd codex-cli
pnpm i
pnpm build
RELEASE_DIR=$(mktemp -d)
cp -r bin "$RELEASE_DIR/bin"
cp -r dist "$RELEASE_DIR/dist"
cp -r src "$RELEASE_DIR/src" # important if we want sourcemaps to continue to work
cp ../README.md "$RELEASE_DIR"
VERSION=$(printf '0.1.%d' $(date +%y%m%d%H%M))
jq --arg version "$VERSION" '.version = $version' package.json > "$RELEASE_DIR/package.json"
```

Then the contents of `$RELEASE_DIR` should be good to `npm publish`, no?
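
As a rough illustration of the version scheme used in the script above, here is a minimal Rust sketch. `timestamp_version` is a hypothetical helper, not something in the repository, and it takes the date components as arguments instead of reading the clock. It should match the shell output for two-digit years of 10 and above.

```rust
/// Builds a version string of the form `0.1.YYMMDDHHMM`, mirroring
/// `printf '0.1.%d' $(date +%y%m%d%H%M)` from the script above.
/// Hypothetical helper, for illustration only.
fn timestamp_version(yy: u32, month: u32, day: u32, hour: u32, minute: u32) -> String {
    // Zero-pad each component to two digits, as `date` does.
    format!("0.1.{:02}{:02}{:02}{:02}{:02}", yy, month, day, hour, minute)
}
```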
2025-04-30 11:03:10 -07:00
Michael Bolin
c432d9ef81 chore: remove the REPL crate/subcommand (#754)
@oai-ragona and I discussed it, and we feel the REPL crate has served
its purpose, so we're going to delete the code and future archaeologists
can find it in Git history.
2025-04-30 10:15:50 -07:00
Michael Bolin
4746ee900f fix: remove expected dot after v in rust-v tag name (#742)
I think this extra dot was not intentional, but I'm not sure. Certainly
this comment suggests it should not be there:


85999d7277/.github/workflows/rust-release.yml (L4)
2025-04-30 10:05:47 -07:00
Michael Bolin
f2ed46ceca fix: include x86_64-unknown-linux-gnu in the list of arch to build codex-linux-sandbox (#748)
2025-04-29 21:19:14 -07:00
Michael Bolin
e42dacbdc8 fix: add another place where $dest was missing in rust-release.yml (#747)
I thought https://github.com/openai/codex/pull/745 was the last fix I
needed, but apparently not.
2025-04-29 20:23:54 -07:00
Michael Bolin
5122fe647f chore: fix errors in .github/workflows/rust-release.yml and prep 0.0.2504292006 release (#745)
Apparently I made two key mistakes in
https://github.com/openai/codex/pull/740 (fixed in this PR):

* I forgot to redefine `$dest` in the `Stage Linux-only artifacts` step
* I did not define the `if` check correctly in the `Stage Linux-only
artifacts` step

This fixes both of those issues and bumps the workspace version to
`0.0.2504292006` in preparation for another release attempt.
2025-04-29 20:12:23 -07:00
Michael Bolin
1a39568e03 chore: set Cargo workspace to version 0.0.2504291954 to create a scratch release (#744)
2025-04-29 19:56:30 -07:00
Michael Bolin
efb0acc152 fix: primary output of the codex-cli crate is named codex, not codex-cli (#743)
I just got a bunch of failures in the release workflow:

https://github.com/openai/codex/actions/runs/14745492805/job/41391926707

along the lines of:

```
cp: cannot stat 'target/aarch64-unknown-linux-gnu/release/codex-cli': No such file or directory
```
2025-04-29 19:53:29 -07:00
Michael Bolin
85999d7277 chore: set Cargo workspace to version 0.0.2504291926 to create a scratch release (#741)
Needed to exercise the new release process in
https://github.com/openai/codex/pull/671.
2025-04-29 19:35:37 -07:00
Michael Bolin
411bfeb410 feat: codex-linux-sandbox standalone executable (#740)
This introduces a standalone executable that runs the equivalent of the
`codex debug landlock` subcommand and updates `rust-release.yml` to
include it in the release.

The idea is that we will include this small binary with the TypeScript
CLI to provide support for Linux sandboxing.
2025-04-29 19:21:26 -07:00
Michael Bolin
27bc4516bf feat: bring back -s option to specify sandbox permissions (#739)
2025-04-29 18:42:52 -07:00
oai-ragona
cb0b0259f4 [codex-rs] Add rust-release action (#671)
Taking a pass at building artifacts per platform so we can consider
different distribution strategies that don't require users to install
the full `cargo` toolchain.

Right now this grabs just the `codex-repl` and `codex-tui` bins for 5
different targets and bundles them into a draft release. I think a
clearly marked pre-release set of artifacts will unblock the next step
of testing.
2025-04-29 16:38:47 -07:00
Michael Bolin
0a00b5ed29 fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:


237f8a11e1/codex-rs/core/src/protocol.rs (L98-L108)

Specifically:

* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
has explicitly allowed.

This PR changes things substantially by redefining the policy in terms
of two concepts:

* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
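
To make the additive model concrete, here is a minimal sketch of the two concepts. The variant names and the `grant`/`new_restrictive`/`has_network_access` helpers are hypothetical placeholders chosen for illustration; only `SandboxPermission`, `SandboxPolicy`, and `has_full_disk_read_access()` are names taken from this PR, and the real definitions may look quite different.

```rust
/// Permissions that can be granted to the sandbox. The variant names are
/// hypothetical placeholders, not the ones defined in the PR.
#[derive(Debug, Clone, PartialEq)]
enum SandboxPermission {
    DiskFullReadAccess,
    DiskWriteCwd,
    NetworkFullAccess,
}

/// Wraps the granted permissions and exposes simple yes/no queries.
#[derive(Debug, Clone, Default)]
struct SandboxPolicy {
    permissions: Vec<SandboxPermission>,
}

impl SandboxPolicy {
    /// Starts from the most restrictive policy: nothing is granted.
    fn new_restrictive() -> Self {
        Self::default()
    }

    /// Grants one more permission; permissions are additive.
    fn grant(mut self, permission: SandboxPermission) -> Self {
        if !self.permissions.contains(&permission) {
            self.permissions.push(permission);
        }
        self
    }

    fn has_full_disk_read_access(&self) -> bool {
        self.permissions
            .contains(&SandboxPermission::DiskFullReadAccess)
    }

    fn has_network_access(&self) -> bool {
        self.permissions
            .contains(&SandboxPermission::NetworkFullAccess)
    }
}
```

Because the policy is a list and not a single enum value, a caller can combine permissions (for example disk reads plus writes in `cwd`) without needing a dedicated variant for every combination.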

Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.

Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.

Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:

* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is an `Option` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Option` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
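
The split between the raw and the normalized config can be sketched as below. This is a simplified illustration under several assumptions: the fields shown are just two examples, the `merge` function and the `"default-model"` placeholder are hypothetical, the precedence (override, then file, then default) is inferred from the description, and `Deserialize` is omitted so the sketch only needs the standard library.

```rust
/// Raw file contents: every field is optional so that it is clear what the
/// user set explicitly. (The real type also derives `Deserialize`.)
#[derive(Debug, Default)]
struct ConfigToml {
    model: Option<String>,
    disable_response_storage: Option<bool>,
}

/// Overrides that typically come from CLI flags.
#[derive(Debug, Default)]
struct ConfigOverrides {
    model: Option<String>,
    disable_response_storage: Option<bool>,
}

/// Normalized config: every field has a concrete value.
#[derive(Debug, PartialEq)]
struct Config {
    model: String,
    disable_response_storage: bool,
}

/// Hypothetical merge: override first, then file value, then default.
fn merge(toml: ConfigToml, overrides: ConfigOverrides) -> Config {
    Config {
        model: overrides
            .model
            .or(toml.model)
            .unwrap_or_else(|| "default-model".to_string()),
        disable_response_storage: overrides
            .disable_response_storage
            .or(toml.disable_response_storage)
            .unwrap_or(false),
    }
}
```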

The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.

Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
Matan Yemini
237f8a11e1 feat: add common package registries domains to allowed-domains list (#414)
2025-04-29 12:07:00 -07:00
Kevin Alwell
a6ed7ff103 Fixes issue #726 by adding config to configToSave object (#728)
The saveConfig() function only includes a hardcoded subset of properties
when writing the config file. Any property not explicitly listed (like
disableResponseStorage) will be dropped.
I have added `disableResponseStorage` to the `configToSave` object as
the immediate fix.

[Linking Issue this fixes.](https://github.com/openai/codex/issues/726)
2025-04-29 13:10:16 -04:00
Michael Bolin
3b39964f81 feat: improve output of exec subcommand (#719)
2025-04-29 09:59:35 -07:00
Rashim
892242ef7c feat: add --reasoning CLI flag (#314)
This PR adds a new CLI flag: `--reasoning`, which allows users to
customize the reasoning effort level (`low`, `medium`, or `high`) used
by OpenAI's `o` models.
By introducing the `--reasoning` flag, users gain more flexibility when
working with the models. It enables optimization for either speed or
depth of reasoning, depending on specific use cases.
This PR resolves #107

- **Flag**: `--reasoning`
- **Accepted Values**: `low`, `medium`, `high`
- **Default Behavior**: If not specified, the model uses the default
reasoning level.
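
The flag itself is implemented in the TypeScript CLI; purely as an illustration of validating the three accepted values, a Rust sketch might look like this. `ReasoningEffort` and its error message are hypothetical names, not taken from the PR.

```rust
use std::str::FromStr;

/// The three accepted values of `--reasoning`. Hypothetical type name.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
}

impl FromStr for ReasoningEffort {
    type Err = String;

    /// Accepts exactly `low`, `medium`, or `high`; anything else is an error.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "low" => Ok(Self::Low),
            "medium" => Ok(Self::Medium),
            "high" => Ok(Self::High),
            other => Err(format!("invalid reasoning effort: {other}")),
        }
    }
}
```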

## Example Usage

```bash
codex --reasoning=low "Write a simple function to calculate factorial"
```

---------

Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: yashrwealthy <yash.rastogi@wealthy.in>
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-29 07:30:49 -07:00
Fouad Matin
19928bc257 [codex-rs] fix: exit code 1 if no api key (#697)
2025-04-28 21:42:06 -07:00
Michael Bolin
b9bba09819 fix: eliminate runtime dependency on patch(1) for apply_patch (#718)
When processing an `apply_patch` tool call, we were already computing
the new file content in order to compute the unified diff. Before this
PR, we were shelling out to `patch(1)` to apply the unified diff once
the user accepted the change, but this updates the code to just retain
the new file content and use it to write the file when the user accepts.
This simplifies deployment because it no longer assumes `patch(1)` is on
the host.

Note this change is internal to the Codex agent and does not affect
`protocol.rs`.
2025-04-28 21:15:41 -07:00
Thibault Sottiaux
d09dbba7ec feat: lower default retry wait time and increase number of tries (#720)
In total we now guarantee that we will wait for at least 60s before
giving up.

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-28 21:11:30 -07:00
Michael Bolin
e79549f039 feat: add debug landlock subcommand comparable to debug seatbelt (#715)
This PR adds a `debug landlock` subcommand to the Codex CLI for testing
how Codex would execute a command using the specified sandbox policy.

Built and ran this code in the `rust:latest` Docker container. In the
container, hitting the network with vanilla `curl` succeeds:

```
$ curl google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
```

whereas this fails, as expected:

```
$ cargo run -- debug landlock -s network-restricted -- curl google.com
curl: (6) getaddrinfo() thread failed to start
```
2025-04-28 16:37:05 -07:00
Michael Bolin
e7ad9449ea feat: make it possible to set disable_response_storage = true in config.toml (#714)
https://github.com/openai/codex/pull/642 introduced support for the
`--disable-response-storage` flag, but if you are a ZDR customer, it is
tedious to set this every time, so this PR makes it possible to set this
once in `config.toml` and be done with it.

Incidentally, this tidies things up such that now `init_codex()` takes
only one parameter: `Config`.
2025-04-28 15:39:34 -07:00
Michael Bolin
cca1122ddc fix: make the TUI the default/"interactive" CLI in Rust (#711)
Originally, the `interactive` crate was going to be a placeholder for
building out a UX that was comparable to that of the existing TypeScript
CLI. Though after researching how Ratatui works, that seems difficult to
do because it is designed around the idea that it will redraw the full
screen buffer each time (and so any scrolling should be "internal" to
your Ratatui app) whereas the TypeScript CLI expects to render the full
history of the conversation every time(*) (which is why you can use your
terminal scrollbar to scroll it).

While it is possible to use Ratatui in a way that acts more like what
the TypeScript CLI is doing, it is awkward and seemingly results in
tedious code, so I think we should abandon that approach. As such, this
PR deletes the `interactive/` folder and the code that depended on it.

Further, since we added support for mousewheel scrolling in the TUI in
https://github.com/openai/codex/pull/641, it certainly feels much better
and the need for scroll support via the terminal scrollbar is greatly
diminished. This is now a more appropriate default UX for the
"multitool" CLI.

(*) Incidentally, I haven't verified this, but I think this results in
O(N^2) work in rendering, which seems potentially problematic for long
conversations.
2025-04-28 13:46:22 -07:00
Michael Bolin
40460faf2a fix: tighten up check for /usr/bin/sandbox-exec (#710)
* In both TypeScript and Rust, we now invoke `/usr/bin/sandbox-exec`
explicitly rather than whatever `sandbox-exec` happens to be on the
`PATH`.
* Changed `isSandboxExecAvailable` to use `access()` rather than
`command -v` so that:
  * We only do the check once over the lifetime of the Codex process.
  * The check is specific to `/usr/bin/sandbox-exec`.
* We now do a syscall rather than incur the overhead of spawning a
process, dealing with timeouts, etc.
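
A minimal sketch of the "check once, absolute path only" idea on the Rust side could look like the following. It is an approximation: it uses `Path::is_file()` as the existence check, which does not verify execute permission the way an `access()` call with an execute flag would, and the function name is a hypothetical Rust spelling of `isSandboxExecAvailable`.

```rust
use std::path::Path;
use std::sync::OnceLock;

/// Absolute path that is checked instead of searching `PATH`.
const SANDBOX_EXEC: &str = "/usr/bin/sandbox-exec";

/// Returns whether `/usr/bin/sandbox-exec` exists. The filesystem is only
/// consulted on the first call; later calls reuse the cached answer.
fn is_sandbox_exec_available() -> bool {
    static AVAILABLE: OnceLock<bool> = OnceLock::new();
    *AVAILABLE.get_or_init(|| Path::new(SANDBOX_EXEC).is_file())
}
```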

I think there is still room for improvement here where we should move
the `isSandboxExecAvailable` check earlier in the CLI, ideally right
after we do arg parsing to verify that we can provide the Seatbelt
sandbox if that is what the user has requested.
2025-04-28 13:42:04 -07:00
Michael Bolin
38575ed8aa fix: increase timeout of test_writable_root (#713)
Although we made some promising fixes in
https://github.com/openai/codex/pull/662, we are still seeing some
flakiness in `test_writable_root()`. If this continues to flake with the
more generous timeout, we should try something other than simply
increasing the timeout.
2025-04-28 13:09:27 -07:00
Michael Bolin
77e2918049 fix: drop d as keyboard shortcut for scrolling in the TUI (#704)
The existing `b` and `space` are sufficient and `d` and `u` default to
half-page scrolling in `less`, so the way we supported `d` and `u`
wasn't faithful to that, anyway:

https://man7.org/linux/man-pages/man1/less.1.html

If we decide to bring `d` and `u` back, they should probably match
`less`?
2025-04-28 10:39:58 -07:00
Thibault Sottiaux
fa5fa8effc fix: only allow running without sandbox if explicitly marked in safe container (#699)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-28 07:48:38 -07:00
Michael Bolin
4eda4dd772 feat: load defaults into Config and introduce ConfigOverrides (#677)
This changes how instantiating `Config` works and also adds
`approval_policy` and `sandbox_policy` as fields. The idea is:

* All fields of `Config` have appropriate default values.
* `Config` is initially loaded from `~/.codex/config.toml`, so values in
`config.toml` will override those defaults.
* Clients must instantiate `Config` via
`Config::load_with_overrides(ConfigOverrides)` where `ConfigOverrides`
has optional overrides that are expected to be settable based on CLI
flags.

The `Config` should be defined early in the program and then passed
down. Now functions like `init_codex()` take fewer individual parameters
because they can just take a `Config`.

Also, `Config::load()` used to fail silently if `~/.codex/config.toml`
had a parse error and fell back to the default config. This seemed
really bad because it wasn't clear why the values in my `config.toml`
weren't getting picked up. I changed things so that
`load_with_overrides()` returns `Result<Config>` and verified that the
various CLIs print a reasonable error if `config.toml` is malformed.
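
To illustrate the difference between failing silently and returning a `Result`, here is a toy sketch. `parse_config` is a hypothetical stand-in that only understands `key = value` lines; it is not a TOML parser and is not the code in this PR.

```rust
use std::collections::HashMap;

/// Parses `key = value` lines. Toy stand-in for a real TOML parser, used
/// only to show the error being propagated to the caller.
fn parse_config(text: &str) -> Result<HashMap<String, String>, String> {
    let mut values = HashMap::new();
    for (index, line) in text.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Report the malformed line instead of quietly using defaults.
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `key = value`", index + 1))?;
        values.insert(key.trim().to_string(), value.trim().to_string());
    }
    Ok(values)
}
```

A caller can then print the error and exit, so a typo in the file is visible immediately.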

Finally, I also updated the TUI to show which **sandbox** value is being
used, as we do for other key values like **model** and **approval**.
This was also a reminder that the various values of `--sandbox` are
honored on Linux but not macOS today, so I added some TODOs about fixing
that.
2025-04-27 21:47:50 -07:00
Thibault Sottiaux
e9d16d3c2b fix: check if sandbox-exec is available (#696)
- Introduce `isSandboxExecAvailable()` helper and tidy import ordering
in `handle-exec-command.ts`.
- Add runtime check for the `sandbox-exec` binary on macOS; fall back to
`SandboxType.NONE` with a warning if it’s missing, preventing crashes.

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
Co-authored-by: Fouad Matin <fouad@openai.com>
2025-04-27 17:04:47 -07:00
Fouad Matin
523996b5cb fix: /diff should include untracked files (#686)
2025-04-26 12:43:51 -07:00
Tomas Cupr
bc500d3009 feat: user config api key (#569)
Adds support for reading OPENAI_API_KEY (and other variables) from a
user‑wide dotenv file (~/.codex.config). Precedence order is now:
  1. explicit environment variable
  2. project‑local .env (loaded earlier)
  3. ~/.codex.config
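
The precedence order above amounts to a first-match lookup across three sources. The actual change is in the TypeScript CLI; the Rust sketch below is only an illustration, and the `resolve` function and the idea of passing three pre-loaded maps are assumptions.

```rust
use std::collections::HashMap;

/// Looks up `key` using the precedence from the list above: process
/// environment, then project `.env`, then `~/.codex.config`.
/// The three maps stand in for already-loaded sources (an assumption).
fn resolve<'a>(
    key: &str,
    process_env: &'a HashMap<String, String>,
    project_env: &'a HashMap<String, String>,
    user_config: &'a HashMap<String, String>,
) -> Option<&'a str> {
    process_env
        .get(key)
        .or_else(|| project_env.get(key))
        .or_else(|| user_config.get(key))
        .map(String::as_str)
}
```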

Also adds a regression test that ensures the multiline editor correctly
handles cases where printable text and the CSI‑u Shift+Enter sequence
arrive in the same input chunk.

House‑kept with Prettier; removed stray temp.json artifact.
2025-04-26 10:13:30 -07:00
moppywhip
9b0ccf9aeb fix: duplicate messages in quiet mode (#680)
Addressing #600 and #664 (partially)

## Bug
Codex was staging duplicate items in its output when the same response
item appeared in both the streaming events and the final response
payload. Specifically:

1. Items would be staged once when received as a
`response.output_item.done` event
2. The same items would be staged again when included in the final
`response.completed` payload

This duplication would result in each message being sent several times
in the quiet mode output.

## Changes
- Added a Set (`alreadyStagedItemIds`) to track items that have already
been staged
- Modified the `stageItem` function to check if an item's ID is already
in this set before staging it
- Added a regression test (`agent-dedupe-items.test.ts`) that verifies
items with the same ID are only staged once
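
The deduplication idea can be sketched as follows. The fix itself is in TypeScript; `Stager` and its fields are a hypothetical Rust analogue written only to show the set-based check.

```rust
use std::collections::HashSet;

/// Stages response items at most once, keyed by item ID.
/// Hypothetical Rust analogue of the TypeScript change.
#[derive(Default)]
struct Stager {
    already_staged_item_ids: HashSet<String>,
    staged: Vec<String>,
}

impl Stager {
    /// Returns `true` if the item was staged, `false` if it was a duplicate.
    fn stage_item(&mut self, id: &str, text: &str) -> bool {
        // `insert` returns false when the ID is already in the set.
        if !self.already_staged_item_ids.insert(id.to_string()) {
            return false;
        }
        self.staged.push(text.to_string());
        true
    }
}
```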

## Testing
Like other tests, the included test creates a mock OpenAI stream that
emits the same message twice (once as an incremental event and once in
the final response) and verifies the item is only passed to `onItem`
once.
2025-04-26 09:14:50 -07:00
Michael Bolin
b0ba65a936 fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.

This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.

For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.
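
The "define other paths in terms of `codex_dir()`" layering might look roughly like this. Taking the home directory as a parameter is an assumption made to keep the sketch self-contained; the real functions may resolve it themselves.

```rust
use std::path::{Path, PathBuf};

/// Returns the Codex config dir for a given home directory, e.g. `~/.codex`.
/// Passing `home` in explicitly is an assumption for this sketch.
fn codex_dir(home: &Path) -> PathBuf {
    home.join(".codex")
}

/// Log files live under the config dir, i.e. `~/.codex/log`.
fn log_dir(home: &Path) -> PathBuf {
    codex_dir(home).join("log")
}
```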

This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
Fouad Matin
103093f793 bump(version): 0.1.2504251709 (#660)
## `0.1.2504251709`

### 🚀 Features

- Add openai model info configuration (#551)
- Added provider to run quiet mode function (#571)
- Create parent directories when creating new files (#552)
- Print bug report URL in terminal instead of opening browser (#510)
(#528)
- Add support for custom provider configuration in the user config
(#537)
- Add support for OpenAI-Organization and OpenAI-Project headers (#626)
- Add specific instructions for creating API keys in error msg (#581)
- Enhance toCodePoints to prevent potential unicode 14 errors (#615)
- More native keyboard navigation in multiline editor (#655)
- Display error on selection of invalid model (#594)

### 🪲 Bug Fixes

- Model selection (#643)
- Nits in apply patch (#640)
- Input keyboard shortcuts (#676)
- `apply_patch` unicode characters (#625)
- Don't clear turn input before retries (#611)
- More loosely match context for apply_patch (#610)
- Update bug report template - there is no --revision flag (#614)
- Remove outdated copy of text input and external editor feature (#670)
- Remove unreachable "disableResponseStorage" logic flow introduced in
#543 (#573)
- Non-openai mode - fix for gemini content: null, fix 429 to throw
before stream (#563)
- Only allow going up in history when not already in history if input is
empty (#654)
- Do not grant "node" user sudo access when using run_in_container.sh
(#627)
- Update scripts/build_container.sh to use pnpm instead of npm (#631)
- Update lint-staged config to use pnpm --filter (#582)
- Non-openai mode - don't default temp and top_p (#572)
- Fix error catching when checking for updates (#597)
- Close stdin when running an exec tool call (#636)
2025-04-25 17:15:40 -07:00
Fouad Matin
3f4762d969 fix: input keyboard shortcuts (#676)
Fixes keyboard shortcuts:
- ctrl+a/e
- opt+arrow keys
2025-04-25 16:58:09 -07:00
Michael Bolin
f3ee933a74 ci: build Rust on Windows as part of CI (#665)
While we aren't ready to provide Windows binaries of Codex CLI, it seems
like a good idea to ensure we guard platform-specific code
appropriately.
2025-04-25 16:22:16 -07:00
Thibault Sottiaux
44d68f9dbf fix: remove outdated copy of text input and external editor feature (#670)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 16:11:16 -07:00
Misha Davidov
15bf5ca971 fix: handling weird unicode characters in apply_patch (#674)
I � unicode
2025-04-25 16:01:58 -07:00
Michael Bolin
c18f1689a9 fix: small fixes so Codex compiles on Windows (#673)
Small fixes required:

* `ExitStatusExt` differs because UNIX expects exit code to be `i32`
whereas Windows does `u32`
* Marking a file "executable only by owner" is a bit more involved on
Windows. We just do something approximate for now (and add a TODO) to
get things compiling.
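
As an illustration of the `ExitStatusExt` difference, the sketch below builds an `ExitStatus` from an exit code on either platform. `exit_status_from_code` is a hypothetical helper and not necessarily how the PR handles it.

```rust
use std::process::ExitStatus;

/// UNIX: `from_raw` takes an `i32` wait status, in which the exit code
/// occupies the second-lowest byte.
#[cfg(unix)]
fn exit_status_from_code(code: u8) -> ExitStatus {
    use std::os::unix::process::ExitStatusExt;
    ExitStatus::from_raw(i32::from(code) << 8)
}

/// Windows: `from_raw` takes the `u32` exit code directly.
#[cfg(windows)]
fn exit_status_from_code(code: u8) -> ExitStatus {
    use std::os::windows::process::ExitStatusExt;
    ExitStatus::from_raw(u32::from(code))
}
```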

I created this PR on my personal Windows machine and `cargo test` and
`cargo clippy` succeed. Once this is in, I'll rebase
https://github.com/openai/codex/pull/665 on top so Windows stays fixed!
2025-04-25 15:58:44 -07:00
Michael Bolin
ebd2ae4abd fix: remove dependency on expanduser crate (#667)
In putting up https://github.com/openai/codex/pull/665, I discovered
that the `expanduser` crate does not compile on Windows. Looking into
it, we do not seem to need it because we were only using it with a value
that was passed in via a command-line flag, so the shell expands `~` for
us before we see it, anyway. (I changed the type in `Cli` from `String`
to `PathBuf`, to boot.)

If we do need this sort of functionality in the future,
https://docs.rs/shellexpand/latest/shellexpand/fn.tilde.html seems
promising.
2025-04-25 14:20:21 -07:00
Michael Bolin
9c3ebac3b7 fix: flipped the sense of Prompt.store in #642 (#663)
I got the sense of this wrong in
https://github.com/openai/codex/pull/642. In that PR, I made
`--disable-response-storage` work, but broke the default case.

With this fix, both cases work and I think the code is a bit cleaner.
2025-04-25 13:41:17 -07:00
Parker Thompson
7d9de34bc7 [codex-rs] Improve linux sandbox timeouts (#662)
* Fixes flaking rust unit test
* Adds explicit sandbox exec timeout handling
2025-04-25 12:56:20 -07:00
Parker Thompson
55e25abf78 [codex-rs] CI performance for rust (#639)
* Refactors the rust-ci into a matrix build
* Adds directory caching for the build artifacts
* Adds workflow dispatch for manual testing
2025-04-25 12:44:03 -07:00
Michael Bolin
b323d10ea7 feat: add ZDR support to Rust implementation (#642)
This adds support for the `--disable-response-storage` flag across our
multiple Rust CLIs to support customers who have opted into Zero-Data
Retention (ZDR). The analogous changes to the TypeScript CLI were:

* https://github.com/openai/codex/pull/481
* https://github.com/openai/codex/pull/543

For a client using ZDR, `previous_response_id` will never be available,
so the `input` field of an API request must include the full transcript
of the conversation thus far. As such, this PR changes the type of
`Prompt.input` from `Vec<ResponseInputItem>` to `Vec<ResponseItem>`.

Practically speaking, `ResponseItem` was effectively a "superset" of
`ResponseInputItem` already. The main difference for us is that
`ResponseItem` includes the `FunctionCall` variant that we have to
include as part of the conversation history in the ZDR case.

Another key change in this PR is modifying `try_run_turn()` so that it
returns the `Vec<ResponseItem>` for the turn in addition to the
`Vec<ResponseInputItem>` produced by `try_run_turn()`. This is because
the caller of `run_turn()` needs to record the `Vec<ResponseItem>` when
ZDR is enabled.

To that end, this PR introduces `ZdrTranscript` (and adds
`zdr_transcript: Option<ZdrTranscript>` to `struct State` in `codex.rs`)
to take responsibility for maintaining the conversation transcript in
the ZDR case.
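
A very rough sketch of the transcript idea follows. The `ResponseItem` variants are reduced to two simplified cases, and `record_items` and `full_input` are hypothetical method names; the real `ZdrTranscript` may be shaped differently.

```rust
/// Simplified stand-in for the real `ResponseItem` type.
#[derive(Debug, Clone, PartialEq)]
enum ResponseItem {
    Message(String),
    FunctionCall { name: String, arguments: String },
}

/// Keeps the full conversation so far, since `previous_response_id` is
/// not available under ZDR. Method names are hypothetical.
#[derive(Debug, Default)]
struct ZdrTranscript {
    items: Vec<ResponseItem>,
}

impl ZdrTranscript {
    /// Appends the items produced during one turn.
    fn record_items(&mut self, items: impl IntoIterator<Item = ResponseItem>) {
        self.items.extend(items);
    }

    /// Returns the whole transcript to send as the next request's input.
    fn full_input(&self) -> Vec<ResponseItem> {
        self.items.clone()
    }
}
```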
2025-04-25 12:08:18 -07:00
Michael Bolin
dc7b83666a feat(tui-rs): add support for mousewheel scrolling (#641)
It is intuitive to try to scroll the conversation history using the
mouse in the TUI, but prior to this change, we only supported scrolling
via keyboard events.

This PR enables mouse capture upon initialization (and disables it on
exit) such that we get `ScrollUp` and `ScrollDown` events in
`codex-rs/tui/src/app.rs`. I initially mapped each event to scrolling by
one line, but that felt sluggish. I decided to introduce
`ScrollEventHelper` so we could debounce scroll events and measure the
number of scroll events in a 100ms window to determine the "magnitude"
of the scroll event. I put in a basic heuristic to start, but perhaps
someone more motivated can play with it over time.

`ScrollEventHelper` takes care of handling the atomic fields and thread
management to ensure an `AppEvent::Scroll` event is pumped back through
the event loop at the appropriate time with the accumulated delta.
2025-04-25 12:01:52 -07:00
oai-ragona
d7a40195e6 [codex-rs] Reliability pass on networking (#658)
We currently see a behavior that looks like this:
```
2025-04-25T16:52:24.552789Z  WARN codex_core::codex: stream disconnected - retrying turn (1/10 in 232ms)...
codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 1/10 in 232ms…" }
2025-04-25T16:52:54.789885Z  WARN codex_core::codex: stream disconnected - retrying turn (2/10 in 418ms)...
codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 2/10 in 418ms…" }
```

This PR contains a few different fixes that attempt to resolve/improve
this:
1. **Remove overall client timeout.** I think
[this](https://github.com/openai/codex/pull/658/files#diff-c39945d3c42f29b506ff54b7fa2be0795b06d7ad97f1bf33956f60e3c6f19c19L173)
is perhaps the big fix -- it looks to me like this was actually timing
out even if events were still coming through, and that was causing a
disconnect right in the middle of a healthy stream.
2. **Cap response sizes.** We were frequently sending MUCH larger
responses than the upstream typescript `codex`, and that was definitely
not helping. [Fix
here](https://github.com/openai/codex/pull/658/files#diff-d792bef59aa3ee8cb0cbad8b176dbfefe451c227ac89919da7c3e536a9d6cdc0R21-R26)
for that one.
3. **Much higher idle timeout.** Our idle timeout value was much lower
than typescript.
4. **Sub-linear backoff.** We were much too aggressively backing off,
[this](https://github.com/openai/codex/pull/658/files#diff-5d5959b95c6239e6188516da5c6b7eb78154cd9cfedfb9f753d30a7b6d6b8b06R30-R33)
makes it sub-exponential but maintains the jitter and such.
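
For readers unfamiliar with the pattern, here is a generic sketch of retry backoff with a growth factor and jitter. It is not the formula from this PR: `initial_ms`, `factor`, and `jitter` are illustrative parameters, and the jitter multiplier is passed in by the caller so the sketch needs no random number crate.

```rust
use std::time::Duration;

/// Computes the delay before retry number `attempt` (1-based).
/// All parameters are illustrative; the PR's actual values are not
/// stated in the commit message.
fn backoff(attempt: u32, initial_ms: u64, factor: f64, jitter: f64) -> Duration {
    // Growth term: factor^(attempt - 1). A factor closer to 1.0 backs off
    // less aggressively than doubling on every attempt.
    let growth = factor.powi(attempt.saturating_sub(1) as i32);
    // `jitter` is a multiplier near 1.0, normally drawn at random by the
    // caller so that many clients do not retry in lockstep.
    let millis = initial_ms as f64 * growth * jitter;
    Duration::from_millis(millis as u64)
}
```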

I was seeing that `stream error: stream disconnected` behavior
constantly, and anecdotally I can no longer reproduce. It feels much
snappier.
2025-04-25 11:44:22 -07:00
Tomas Cupr
4760aa1eb9 perf: optimize token streaming with balanced approach (#635)
- Replace setTimeout(10ms) with queueMicrotask for immediate processing
- Add minimal 3ms setTimeout for rendering to maintain readable UX
- Reduces per-token delay while preserving streaming experience
- Add performance test to verify optimization works correctly

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 10:49:38 -07:00
Thibault Sottiaux
d401283a41 feat: more native keyboard navigation in multiline editor (#655)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 10:35:30 -07:00
rumple
69ce06d2f8 feat: Add support for OpenAI-Organization and OpenAI-Project headers (#626)
Added support for OpenAI-Organization and OpenAI-Project headers for
OpenAI API calls.

This is for #74
2025-04-25 09:52:42 -07:00
Thibault Sottiaux
866626347b fix: only allow going up in history when not already in history if input is empty (#654)
\+ cleanup below input help to be "ctrl+c to exit | "/" to see commands
| enter to send" now that we have command autocompletion
\+ minor other drive-by code cleanups

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 09:39:24 -07:00
Pulipaka Sai Krishna
2759ff39da fix: model selection (#643)
fix: pass correct selected model in ModelOverlay

The ModelOverlay component was incorrectly passing the current model
instead of the newly selected model to its onSelect callback. This
prevented model changes from being applied properly.

The fix ensures that when a user selects a new model, the parent
component receives the correct newly selected model value, allowing
model changes to work as intended.
2025-04-25 09:38:05 -07:00
Luci
3fe7e53327 fix: nits in apply patch (#640)
## Description

Fix a nit in `apply patch`, potentially improving performance slightly.
2025-04-25 07:27:48 -07:00
Luci
1ef8e8afd3 docs: provider config (#653)
close: #651

Hi! @tibo-openai 👋 Could you share some great examples of
`instructions.md` files? Thanks!

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 07:25:32 -07:00
Luci
a9ecb2efce chore: upgrade prettier to v3 (#644)
## Description

This PR addresses the following improvements:

**Unify Prettier Version**: Currently, the Prettier versions used in
`/package.json` and `/codex-cli/package.json` differ. In this PR,
we're updating both to use Prettier v3.

- Prettier v3 introduces improved support for JavaScript and TypeScript
(e.g. the formatting scenario shown in the image below, which is more
aligned with the TypeScript indentation standard).

<img width="1126" alt="image"
src="https://github.com/user-attachments/assets/6e237eb8-4553-4574-b336-ed9561c55370"
/>

**Add Prettier Auto-Formatting in lint-staged**: We've added a step to
automatically run prettier --write on JavaScript and TypeScript files as
part of the lint-staged process, before the ESLint checks.

- This will help ensure that all committed code is properly formatted
according to the project's Prettier configuration.
2025-04-25 07:21:50 -07:00
Michael Bolin
bfe6fac463 fix: close stdin when running an exec tool call (#636)
We were already doing this in the TypeScript version, but forgot to
bring this over to Rust:


c38c2a59c7/codex-cli/src/utils/agent/sandbox/raw-exec.ts (L76-L78)
2025-04-24 18:06:08 -07:00
Michael Bolin
6a9c9f4b6c fix: add RUST_BACKTRACE=full when running cargo test in CI (#638)
This should provide more information in the event of a failure.
2025-04-24 18:05:56 -07:00
Michael Bolin
5cdcbfa9b4 fix: only run rust-ci.yml on PRs that modify files in codex-rs (#637)
The `rust-ci.yml` build appears to be a bit flaky (we're looking into
it...), so to save TypeScript contributors some noise, restrict the
`rust-ci.yml` job so that it only runs on PRs that touch files in
`codex-rs/`.
2025-04-24 17:59:35 -07:00
Luci
c38c2a59c7 fix(utils): save config (#578)
## Description

When `saveConfig` is called, the project doc is incorrectly saved into
user instructions. This change ensures that only user instructions are
saved to `instructions.md` during saveConfig, preventing data
corruption.

close: #576

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-24 17:32:33 -07:00
Michael Bolin
58f0e5ab74 feat: introduce codex_execpolicy crate for defining "safe" commands (#634)
As described in detail in `codex-rs/execpolicy/README.md` introduced in
this PR, `execpolicy` is a tool that lets you define a set of _patterns_
used to match [`execv(3)`](https://linux.die.net/man/3/execv)
invocations. When a pattern is matched, `execpolicy` returns the parsed
version in a structured form that is amenable to static analysis.

The primary use case is to define patterns that match commands that should
be auto-approved by a tool such as Codex. This supports a richer pattern
matching mechanism than the sort of prefix-matching we have done to
date, e.g.:


5e40d9d221/codex-cli/src/approvals.ts (L333-L354)

Note we are still playing with the API and the `system_path` option in
particular still needs some work.
2025-04-24 17:14:47 -07:00
nvp159
5e40d9d221 feat(bug-report): print bug report URL in terminal instead of opening browser (#510) (#528)
Solves #510 
This PR changes the `/bug` command to print the URL into the terminal
(so it works in headless sessions) instead of trying to open a browser.

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-24 17:00:14 -07:00
sooraj
36a5a02d5c feat: display error on selection of invalid model (#594)
Up-to-date of #78 

Fixes #32

addressed requested changes @tibo-openai :) made sense to me


though, previous rationale with passing the state up was assuming there
could be a future need to have a shared state with all available models
being available to the parent
2025-04-24 16:56:00 -07:00
Michael Bolin
bb2d411043 fix: update scripts/build_container.sh to use pnpm instead of npm (#631)
I suspect this is why some contributors kept accidentally including a
new `codex-cli/package-lock.json` in their PRs.

Note the `Dockerfile` still uses `npm` instead of `pnpm`, but that
appears to be fine. (Probably nicer to globally install as few things as
possible in the image.)
2025-04-24 16:38:57 -07:00
oai-ragona
b34ed2ab83 [codex-rs] More fine-grained sandbox flag support on Linux (#632)
##### What/Why
This PR makes it so that in Linux we actually respect the different
types of `--sandbox` flag, such that users can apply network and
filesystem restrictions in combination (currently the only supported
behavior), or just pick one or the other.

We should add similar support for OSX in a future PR.

##### Testing
From Linux devbox, updated tests to use more specific flags:
```
test linux::tests_linux::sandbox_blocks_ping ... ok
test linux::tests_linux::sandbox_blocks_getent ... ok
test linux::tests_linux::test_root_read ... ok
test linux::tests_linux::test_dev_null_write ... ok
test linux::tests_linux::sandbox_blocks_dev_tcp_redirection ... ok
test linux::tests_linux::sandbox_blocks_ssh ... ok
test linux::tests_linux::test_writable_root ... ok
test linux::tests_linux::sandbox_blocks_curl ... ok
test linux::tests_linux::sandbox_blocks_wget ... ok
test linux::tests_linux::sandbox_blocks_nc ... ok
test linux::tests_linux::test_root_write - should panic ... ok
```

##### Todo
- [ ] Add negative tests (e.g. confirm you can hit the network if you
configure filesystem only restrictions)
2025-04-24 15:33:45 -07:00
Michael Bolin
61805a832d fix: do not grant "node" user sudo access when using run_in_container.sh (#627)
This exploration came out of my review of
https://github.com/openai/codex/pull/414.

`run_in_container.sh` runs Codex in a Docker container like so:


bd1c3deed9/codex-cli/scripts/run_in_container.sh (L51-L58)

But then runs `init_firewall.sh` to set up the firewall to restrict
network access.

Previously, we did this by adding `/usr/local/bin/init_firewall.sh` to
the container and adding a special rule in `/etc/sudoers.d` so the
unprivileged user (`node`) could run the privileged `init_firewall.sh`
script to open up the firewall for `api.openai.com`:


31d0d7a305/codex-cli/Dockerfile (L51-L56)

Though I believe this is unnecessary, as we can use `docker exec --user
root` from _outside_ the container to run
`/usr/local/bin/init_firewall.sh` as `root` without adding a special
case in `/etc/sudoers.d`.

This appears to work as expected, as I tested it by doing the following:

```
./codex-cli/scripts/build_container.sh
./codex-cli/scripts/run_in_container.sh 'what is the output of `curl https://www.openai.com`'
```

This was a bit funny because in some of my runs, Codex wasn't convinced
it had network access, so I had to convince it to try the `curl`
request:


![image](https://github.com/user-attachments/assets/80bd487c-74e2-4cd3-aa0f-26a6edd8d3f7)

As you can see, when it ran `curl -s https\://www.openai.com`, it hit a
connection failure, so the network policy appears to be working as
intended.

Note this PR also removes `sudo` from the `apt-get install` list in the
`Dockerfile`.
2025-04-24 14:25:02 -07:00
Fouad Matin
bd1c3deed9 update: readme (#630)
- mention support for ZDR
- codex open source fund
2025-04-24 14:05:26 -07:00
Michael Bolin
31d0d7a305 feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:

Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.

To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:

- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.

Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implementation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
Misha Davidov
acc4acc81e fix: apply_patch unicode characters (#625)
Fuzzier matching for `apply_patch` to handle U+00A0 and U+202F spaces.
2025-04-24 13:04:37 -07:00
Luci
e84fa6793d fix(agent-loop): notify type (#608)
## Description

The `as AppConfig` type assertion in the constructor may introduce
potential type safety risks. Removing the assertion and making `notify`
an optional parameter could enhance type robustness and prevent
unexpected runtime errors.

close: #605
2025-04-24 11:08:52 -07:00
Asa
d1c0d5e683 feat: update README and config to support custom providers with API k… (#577)
When using a non-built-in provider with the `--provider` option, users
are prompted:

```
Set the environment variable <provider>_API_KEY and re-run this command.
You can create a <provider>_API_KEY in the <provider> dashboard.
```

However, many users are confused because, even after correctly setting
`<provider>_API_KEY`, authentication may still fail unless
`OPENAI_API_KEY` is _also_ present in the environment. This is not
intuitive and leads to ambiguity about which API key is actually
required and used as a fallback, especially when using custom or
third-party (non-listed) providers.

Furthermore, the original README/documentation did not mention the
requirement to set `<provider>_BASE_URL` for non-built-in providers,
which is necessary for proper client behavior. This omission made the
configuration process more difficult for users trying to integrate with
custom endpoints.
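
The lookup described above can be sketched roughly as follows. This is only an illustration, assuming the variable names are derived by upper-casing the provider id; `resolveProvider`, `Env`, and `ProviderSettings` are hypothetical names, not the CLI's actual code.

```typescript
// Hypothetical sketch: resolve credentials for a provider from the environment.
// Assumes variable names are built as <PROVIDER>_API_KEY / <PROVIDER>_BASE_URL.
type Env = Record<string, string | undefined>;

interface ProviderSettings {
  apiKey: string;
  baseURL: string | undefined;
}

function resolveProvider(provider: string, env: Env): ProviderSettings {
  const prefix = provider.toUpperCase();
  const apiKey = env[`${prefix}_API_KEY`];
  if (!apiKey) {
    // Fail on the provider-specific key rather than silently depending on
    // OPENAI_API_KEY also being present.
    throw new Error(
      `Set the environment variable ${prefix}_API_KEY and re-run this command.`,
    );
  }
  // Providers that are not built in also need a base URL.
  return { apiKey, baseURL: env[`${prefix}_BASE_URL`] };
}
```

Passing the environment in as a parameter (instead of reading `process.env` directly) keeps the sketch easy to test.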
2025-04-24 11:08:19 -07:00
Luci
6d68a90064 feat: enhance toCodePoints to prevent potential unicode 14 errors (#615)
## Description

`Array.from` may fail when handling certain characters newly added in
Unicode 14. Where possible, it seems better to use `Intl.Segmenter` for
more reliable processing.


![image](https://github.com/user-attachments/assets/2cbd779d-69d3-448e-b76a-d793cb639d96)
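
A minimal sketch of the difference between the two approaches, assuming a runtime where `Intl.Segmenter` is available. `toGraphemes` is a hypothetical helper written for illustration and is not the project's `toCodePoints`.

```typescript
// Sketch only: split text into user-perceived characters where Intl.Segmenter
// exists, and fall back to code points otherwise.
function toGraphemes(text: string): string[] {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return Array.from(segmenter.segment(text), (part) => part.segment);
  }
  return Array.from(text);
}

// Family emoji: three emoji joined by two zero-width joiners.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
console.log(Array.from(family).length); // number of code points
console.log(toGraphemes(family).length); // number of grapheme clusters
```

`Array.from` splits the string into five code points, while a grapheme segmenter treats the whole joined sequence as one unit, which is usually what cursor movement in a text editor needs.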
2025-04-24 10:49:18 -07:00
Ilya Kamen
1008e1b9a0 fix: update bug report template - there is no --revision flag (#614)
I think there was a wrong word; --revision seems not to exist in help
and does nothing.
2025-04-24 10:48:42 -07:00
Luci
257167a034 fix: lint-staged error (#617)
## Description

In a recent commit, the command `"cd codex-cli && pnpm run typecheck"`
was updated to `"pnpm --filter @openai/codex run typecheck"`.

However, this change introduces an issue: 
when running `pnpm --filter @openai/codex run typecheck`, it executes
`tsc --noEmit somefile.ts` directly, bypassing the `tsconfig.json`
configuration. As a result, numerous type errors are triggered,
preventing successful commits.

Close: #619
2025-04-24 10:48:35 -07:00
Misha Davidov
9b102965b9 feat: more loosely match context for apply_patch (#610)
More of a proposal than anything but models seem to struggle with
composing valid patches for `apply_patch` for context matching when
there are unicode look-a-likes involved. This would normalize them.

```
top-level          # ASCII
top-level          # U+2011 NON-BREAKING HYPHEN
top–level          # U+2013 EN DASH
top—level          # U+2014 EM DASH
top‒level          # U+2012 FIGURE DASH
```

thanks unicode.
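
The normalization idea can be sketched as below. This is a hedged illustration only: `canonicalize` and `findContextLine` are hypothetical names, and the character class covers just the four look-alikes listed above, not whatever set the real `apply_patch` code handles.

```typescript
// Hypothetical helper, not the actual apply_patch implementation.
// Maps the dash look-alikes listed above to the ASCII hyphen-minus.
const DASH_LOOKALIKES = /[\u2011\u2012\u2013\u2014]/g;

function canonicalize(line: string): string {
  return line.replace(DASH_LOOKALIKES, "-");
}

// Returns the index of the first line matching `context`, preferring an exact
// match and falling back to a comparison of canonicalized text.
function findContextLine(lines: string[], context: string): number {
  const exact = lines.indexOf(context);
  if (exact !== -1) {
    return exact;
  }
  const wanted = canonicalize(context);
  return lines.findIndex((line) => canonicalize(line) === wanted);
}
```

Trying the exact match first means normalization only kicks in when the patch context would otherwise fail to apply.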
2025-04-24 09:05:19 -07:00
theg1239
ad1e39c903 feat: add specific instructions for creating API keys in error msg (#581)
Updates the error message for missing Gemini API keys to reference
"Google AI Studio" instead of the generic "GEMINI dashboard". This
provides users with more accurate information about where to obtain
their Gemini API keys.

This could be extended to other providers as well.
2025-04-24 06:33:34 -05:00
theg1239
006992b85a chore: update lint-staged config to use pnpm --filter (#582)
Replaced directory-specific commands with workspace-aware pnpm commands
2025-04-24 06:33:13 -05:00
Connor Christie
622323a59b fix: don't clear turn input before retries (#611)
The current turn input in the agent loop is discarded before the stream
events are consumed, so a stream reconnect (after a rate limit failure)
does not include the inputs. Because the new stream includes the previous
response ID, the request fails with a bad request exception, since the
input doesn't match what OpenAI has stored on the server side. The result
is a very confusing error message: `No tool output found for function
call call_xyz`.

This should fix https://github.com/openai/codex/issues/586.

## Testing

I have a personal project that I'm working on that runs multiple Codex
CLIs in parallel and often runs into rate limit errors (as seen in the
OpenAI logs). After making this change, I am no longer experiencing
Codex crashing and it was able to retry and handle everything gracefully
until completion (even though I still see rate limiting in the OpenAI
logs).
2025-04-24 06:29:36 -05:00
164 changed files with 20540 additions and 1342 deletions

View File

@@ -19,7 +19,7 @@ body:
id: version
attributes:
label: What version of Codex is running?
description: Copy the output of `codex --revision`
description: Copy the output of `codex --version`
- type: input
id: model
attributes:

37
.github/dotslash-config.json vendored Normal file
View File

@@ -0,0 +1,37 @@
{
"outputs": {
"codex-repl": {
"platforms": {
"macos-aarch64": { "regex": "^codex-repl-aarch64-apple-darwin\\.zst$", "path": "codex-repl" },
"macos-x86_64": { "regex": "^codex-repl-x86_64-apple-darwin\\.zst$", "path": "codex-repl" },
"linux-x86_64": { "regex": "^codex-repl-x86_64-unknown-linux-musl\\.zst$", "path": "codex-repl" },
"linux-aarch64": { "regex": "^codex-repl-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-repl" }
}
},
"codex-exec": {
"platforms": {
"macos-aarch64": { "regex": "^codex-exec-aarch64-apple-darwin\\.zst$", "path": "codex-exec" },
"macos-x86_64": { "regex": "^codex-exec-x86_64-apple-darwin\\.zst$", "path": "codex-exec" },
"linux-x86_64": { "regex": "^codex-exec-x86_64-unknown-linux-musl\\.zst$", "path": "codex-exec" },
"linux-aarch64": { "regex": "^codex-exec-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-exec" }
}
},
"codex": {
"platforms": {
"macos-aarch64": { "regex": "^codex-aarch64-apple-darwin\\.zst$", "path": "codex" },
"macos-x86_64": { "regex": "^codex-x86_64-apple-darwin\\.zst$", "path": "codex" },
"linux-x86_64": { "regex": "^codex-x86_64-unknown-linux-musl\\.zst$", "path": "codex" },
"linux-aarch64": { "regex": "^codex-aarch64-unknown-linux-gnu\\.zst$", "path": "codex" }
}
},
"codex-linux-sandbox": {
"platforms": {
"linux-x86_64": { "regex": "^codex-linux-sandbox-x86_64-unknown-linux-musl\\.zst$", "path": "codex-linux-sandbox" },
"linux-aarch64": { "regex": "^codex-linux-sandbox-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-linux-sandbox" }
}
}
}
}

94
.github/workflows/rust-ci.yml vendored Normal file
View File

@@ -0,0 +1,94 @@
name: rust-ci
on:
pull_request:
branches:
- main
paths:
- "codex-rs/**"
- ".github/**"
push:
branches:
- main
workflow_dispatch:
# For CI, we build in debug (`--profile dev`) rather than release mode so we
# get signal faster.
jobs:
# CI that don't need specific targets
general:
name: Format / etc
runs-on: ubuntu-24.04
defaults:
run:
working-directory: codex-rs
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: cargo fmt
run: cargo fmt -- --config imports_granularity=Item --check
# CI to validate on different os/targets
lint_build_test:
name: ${{ matrix.runner }} - ${{ matrix.target }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 30
defaults:
run:
working-directory: codex-rs
strategy:
fail-fast: false
matrix:
# Note: While Codex CLI does not support Windows today, we include
# Windows in CI to ensure the code at least builds there.
include:
- runner: macos-14
target: aarch64-apple-darwin
- runner: macos-14
target: x86_64-apple-darwin
- runner: ubuntu-24.04
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
${{ github.workspace }}/codex-rs/target/
key: cargo-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config
- name: Initialize failure flag
run: echo "FAILED=" >> $GITHUB_ENV
- name: cargo clippy
run: cargo clippy --target ${{ matrix.target }} --all-features -- -D warnings || echo "FAILED=${FAILED:+$FAILED, }cargo clippy" >> $GITHUB_ENV
- name: cargo test
run: cargo test --target ${{ matrix.target }} || echo "FAILED=${FAILED:+$FAILED, }cargo test" >> $GITHUB_ENV
- name: Fail if any step failed
if: env.FAILED != ''
run: |
echo "See logs above, as the following steps failed:"
echo "$FAILED"
exit 1

157
.github/workflows/rust-release.yml vendored Normal file
View File

@@ -0,0 +1,157 @@
# Release workflow for codex-rs.
# To release, follow a workflow like:
# ```
# git tag -a rust-v0.1.0 -m "Release 0.1.0"
# git push origin rust-v0.1.0
# ```
name: rust-release
on:
push:
tags:
- "rust-v*.*.*"
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
env:
TAG_REGEX: '^rust-v[0-9]+\.[0-9]+\.[0-9]+$'
jobs:
tag-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Validate tag matches Cargo.toml version
shell: bash
run: |
set -euo pipefail
echo "::group::Tag validation"
# 1. Must be a tag and match the regex
[[ "${GITHUB_REF_TYPE}" == "tag" ]] \
|| { echo "❌ Not a tag push"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ${TAG_REGEX} ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' != ${TAG_REGEX}"; exit 1; }
# 2. Extract versions
tag_ver="${GITHUB_REF_NAME#rust-v}"
cargo_ver="$(grep -m1 '^version' codex-rs/Cargo.toml \
| sed -E 's/version *= *"([^"]+)".*/\1/')"
# 3. Compare
[[ "${tag_ver}" == "${cargo_ver}" ]] \
|| { echo "❌ Tag ${tag_ver} ≠ Cargo.toml ${cargo_ver}"; exit 1; }
echo "✅ Tag and Cargo.toml agree (${tag_ver})"
echo "::endgroup::"
build:
needs: tag-check
name: ${{ matrix.runner }} - ${{ matrix.target }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 30
defaults:
run:
working-directory: codex-rs
strategy:
fail-fast: false
matrix:
include:
- runner: macos-14
target: aarch64-apple-darwin
- runner: macos-14
target: x86_64-apple-darwin
- runner: ubuntu-24.04
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-gnu
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
${{ github.workspace }}/codex-rs/target/
key: cargo-release-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config
- name: Cargo build
run: cargo build --target ${{ matrix.target }} --release --all-targets --all-features
- name: Stage artifacts
shell: bash
run: |
dest="dist/${{ matrix.target }}"
mkdir -p "$dest"
cp target/${{ matrix.target }}/release/codex-repl "$dest/codex-repl-${{ matrix.target }}"
cp target/${{ matrix.target }}/release/codex-exec "$dest/codex-exec-${{ matrix.target }}"
cp target/${{ matrix.target }}/release/codex "$dest/codex-${{ matrix.target }}"
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'x86_64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-gnu' }}
name: Stage Linux-only artifacts
shell: bash
run: |
dest="dist/${{ matrix.target }}"
cp target/${{ matrix.target }}/release/codex-linux-sandbox "$dest/codex-linux-sandbox-${{ matrix.target }}"
- name: Compress artifacts
shell: bash
run: |
dest="dist/${{ matrix.target }}"
zstd -T0 -19 --rm "$dest"/*
- uses: actions/upload-artifact@v4
with:
name: ${{ matrix.target }}
path: codex-rs/dist/${{ matrix.target }}/*
release:
needs: build
name: release
runs-on: ubuntu-24.04
env:
RELEASE_TAG: codex-rs-${{ github.sha }}-${{ github.run_attempt }}-${{ github.ref_name }}
steps:
- uses: actions/download-artifact@v4
with:
path: dist
- name: List
run: ls -R dist/
- uses: softprops/action-gh-release@v2
with:
tag_name: ${{ env.RELEASE_TAG }}
files: dist/**
# TODO(ragona): I'm going to leave these as draft for now.
# It gives us 1) clarity that these are not yet a stable version, and
# 2) allows a human step to review the release before publishing the draft.
prerelease: false
draft: true
- uses: facebook/dotslash-publish-release@v2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
tag: ${{ env.RELEASE_TAG }}
config: .github/dotslash-config.json

View File

@@ -2,6 +2,41 @@
You can install any of these versions: `npm install -g codex@version`
## `0.1.2504251709`
### 🚀 Features
- Add openai model info configuration (#551)
- Added provider to run quiet mode function (#571)
- Create parent directories when creating new files (#552)
- Print bug report URL in terminal instead of opening browser (#510) (#528)
- Add support for custom provider configuration in the user config (#537)
- Add support for OpenAI-Organization and OpenAI-Project headers (#626)
- Add specific instructions for creating API keys in error msg (#581)
- Enhance toCodePoints to prevent potential unicode 14 errors (#615)
- More native keyboard navigation in multiline editor (#655)
- Display error on selection of invalid model (#594)
### 🪲 Bug Fixes
- Model selection (#643)
- Nits in apply patch (#640)
- Input keyboard shortcuts (#676)
- `apply_patch` unicode characters (#625)
- Don't clear turn input before retries (#611)
- More loosely match context for apply_patch (#610)
- Update bug report template - there is no --revision flag (#614)
- Remove outdated copy of text input and external editor feature (#670)
- Remove unreachable "disableResponseStorage" logic flow introduced in #543 (#573)
- Non-openai mode - fix for gemini content: null, fix 429 to throw before stream (#563)
- Only allow going up in history when not already in history if input is empty (#654)
- Do not grant "node" user sudo access when using run_in_container.sh (#627)
- Update scripts/build_container.sh to use pnpm instead of npm (#631)
- Update lint-staged config to use pnpm --filter (#582)
- Non-openai mode - don't default temp and top_p (#572)
- Fix error catching when checking for updates (#597)
- Close stdin when running an exec tool call (#636)
## `0.1.2504221401`
### 🚀 Features
@@ -9,7 +44,7 @@ You can install any of these versions: `npm install -g codex@version`
- Show actionable errors when api keys are missing (#523)
- Add CLI `--version` flag (#492)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- Agent loop for ZDR (`disableResponseStorage`) (#543)
- Fix relative `workdir` check for `apply_patch` (#556)
@@ -40,7 +75,7 @@ You can install any of these versions: `npm install -g codex@version`
- Add /command autocomplete (#317)
- Allow multi-line input (#438)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- `full-auto` support in quiet mode (#374)
- Enable shell option for child process execution (#391)
@@ -64,7 +99,7 @@ You can install any of these versions: `npm install -g codex@version`
- Add `/bug` report command (#312)
- Notify when a newer version is available (#333)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- Update context left display logic in TerminalChatInput component (#307)
- Improper spawn of sh on Windows Powershell (#318)
@@ -77,7 +112,7 @@ You can install any of these versions: `npm install -g codex@version`
- Add Nix flake for reproducible development environments (#225)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- Handle invalid commands (#304)
- Raw-exec-process-group.test improve reliability and error handling (#280)
@@ -96,7 +131,7 @@ You can install any of these versions: `npm install -g codex@version`
- `--config`/`-c` flag to open global instructions in nvim (#158)
- Update position of cursor when navigating input history with arrow keys to the end of the text (#255)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- Correct word deletion logic for trailing spaces (Ctrl+Backspace) (#131)
- Improve Windows compatibility for CLI commands and sandbox (#261)

173
README.md
View File

@@ -24,10 +24,17 @@
- [Tracing / Verbose Logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [Configuration](#configuration)
- [Configuration Guide](#configuration-guide)
- [Basic Configuration Parameters](#basic-configuration-parameters)
- [Custom AI Provider Configuration](#custom-ai-provider-configuration)
- [History Configuration](#history-configuration)
- [Configuration Examples](#configuration-examples)
- [Full Configuration Example](#full-configuration-example)
- [Custom Instructions](#custom-instructions)
- [Environment Variables Setup](#environment-variables-setup)
- [FAQ](#faq)
- [Zero Data Retention (ZDR) Organization Limitation](#zero-data-retention-zdr-organization-limitation)
- [Funding Opportunity](#funding-opportunity)
- [Zero Data Retention (ZDR) Usage](#zero-data-retention-zdr-usage)
- [Codex Open Source Fund](#codex-open-source-fund)
- [Contributing](#contributing)
- [Development workflow](#development-workflow)
- [Git Hooks with Husky](#git-hooks-with-husky)
@@ -97,12 +104,19 @@ export OPENAI_API_KEY="your-api-key-here"
> - deepseek
> - xai
> - groq
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
>
> ```shell
> export <provider>_BASE_URL="https://your-provider-api-base-url"
> ```
</details>
<br />
@@ -308,20 +322,53 @@ pnpm link
---
## Configuration
## Configuration Guide
Codex looks for config files in **`~/.codex/`** (either YAML or JSON format).
Codex configuration files can be placed in the `~/.codex/` directory, supporting both YAML and JSON formats.
### Basic Configuration Parameters
| Parameter | Type | Default | Description | Available Options |
| ------------------- | ------- | ---------- | -------------------------------- | ---------------------------------------------------------------------------------------------- |
| `model` | string | `o4-mini` | AI model to use | Any model name supporting OpenAI API |
| `approvalMode` | string | `suggest` | AI assistant's permission mode | `suggest` (suggestions only)<br>`auto-edit` (automatic edits)<br>`full-auto` (fully automatic) |
| `fullAutoErrorMode` | string | `ask-user` | Error handling in full-auto mode | `ask-user` (prompt for user input)<br>`ignore-and-continue` (ignore and proceed) |
| `notify` | boolean | `true` | Enable desktop notifications | `true`/`false` |
### Custom AI Provider Configuration
In the `providers` object, you can configure multiple AI service providers. Each provider requires the following parameters:
| Parameter | Type | Description | Example |
| --------- | ------ | --------------------------------------- | ----------------------------- |
| `name` | string | Display name of the provider | `"OpenAI"` |
| `baseURL` | string | API service URL | `"https://api.openai.com/v1"` |
| `envKey` | string | Environment variable name (for API key) | `"OPENAI_API_KEY"` |
### History Configuration
In the `history` object, you can configure conversation history settings:
| Parameter | Type | Description | Example Value |
| ------------------- | ------- | ------------------------------------------------------ | ------------- |
| `maxSize` | number | Maximum number of history entries to save | `1000` |
| `saveHistory` | boolean | Whether to save history | `true` |
| `sensitivePatterns` | array | Patterns of sensitive information to filter in history | `[]` |
### Configuration Examples
1. YAML format (save as `~/.codex/config.yaml`):
```yaml
# ~/.codex/config.yaml
model: o4-mini # Default model
approvalMode: suggest # or auto-edit, full-auto
fullAutoErrorMode: ask-user # or ignore-and-continue
notify: true # Enable desktop notifications for responses
model: o4-mini
approvalMode: suggest
fullAutoErrorMode: ask-user
notify: true
```
2. JSON format (save as `~/.codex/config.json`):
```json
// ~/.codex/config.json
{
"model": "o4-mini",
"approvalMode": "suggest",
@@ -330,12 +377,85 @@ notify: true # Enable desktop notifications for responses
}
```
You can also define custom instructions:
### Full Configuration Example
```yaml
# ~/.codex/instructions.md
Below is a comprehensive example of `config.json` with multiple custom providers:
```json
{
"model": "o4-mini",
"provider": "openai",
"providers": {
"openai": {
"name": "OpenAI",
"baseURL": "https://api.openai.com/v1",
"envKey": "OPENAI_API_KEY"
},
"openrouter": {
"name": "OpenRouter",
"baseURL": "https://openrouter.ai/api/v1",
"envKey": "OPENROUTER_API_KEY"
},
"gemini": {
"name": "Gemini",
"baseURL": "https://generativelanguage.googleapis.com/v1beta/openai",
"envKey": "GEMINI_API_KEY"
},
"ollama": {
"name": "Ollama",
"baseURL": "http://localhost:11434/v1",
"envKey": "OLLAMA_API_KEY"
},
"mistral": {
"name": "Mistral",
"baseURL": "https://api.mistral.ai/v1",
"envKey": "MISTRAL_API_KEY"
},
"deepseek": {
"name": "DeepSeek",
"baseURL": "https://api.deepseek.com",
"envKey": "DEEPSEEK_API_KEY"
},
"xai": {
"name": "xAI",
"baseURL": "https://api.x.ai/v1",
"envKey": "XAI_API_KEY"
},
"groq": {
"name": "Groq",
"baseURL": "https://api.groq.com/openai/v1",
"envKey": "GROQ_API_KEY"
}
},
"history": {
"maxSize": 1000,
"saveHistory": true,
"sensitivePatterns": []
}
}
```
### Custom Instructions
You can create a `~/.codex/instructions.md` file to define custom instructions:
```markdown
- Always respond with emojis
- Only use git commands if I explicitly mention you should
- Only use git commands when explicitly requested
```
### Environment Variables Setup
For each AI provider, you need to set the corresponding API key in your environment variables. For example:
```bash
# OpenAI
export OPENAI_API_KEY="your-api-key-here"
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"
# Similarly for other providers
```
---
@@ -377,34 +497,23 @@ Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.mic
---
## Zero Data Retention (ZDR) Organization Limitation
## Zero Data Retention (ZDR) Usage
> **Note:** Codex CLI does **not** currently support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled.
If your OpenAI organization has Zero Data Retention enabled, you may encounter errors such as:
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
**Why?**
- Codex CLI relies on the Responses API with `store:true` to enable internal reasoning steps.
- As noted in the [docs](https://platform.openai.com/docs/guides/your-data#responses-api), the Responses API requires a 30-day retention period by default, or when the store parameter is set to true.
- ZDR organizations cannot use `store:true`, so requests will fail.
**What can I do?**
- If you are part of a ZDR organization, Codex CLI will not work until support is added.
- We are tracking this limitation and will update the documentation once support becomes available.
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
---
## Funding Opportunity
## Codex Open Source Fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Grants are awarded in **$25,000** API credit increments.
- Grants are awarded up to **$25,000** API credits.
- Applications are reviewed **on a rolling basis**.
**Interested? [Apply here](https://openai.com/form/codex-open-source-fund/).**
@@ -531,7 +640,7 @@ To publish a new version of the CLI, run the release scripts defined in `codex-c
3. Bump the version and `CLI_VERSION` to current datetime: `pnpm release:version`
4. Commit the version bump (with DCO sign-off):
```bash
git add codex-cli/src/utils/session.ts codex-cli/package.json
git add codex-cli/package.json
git commit -s -m "chore(release): codex-cli v$(node -p \"require('./codex-cli/package.json').version\")"
```
5. Copy README, build, and publish to npm: `pnpm release`

View File

@@ -35,7 +35,7 @@ conventional_commits = true
commit_parsers = [
{ message = "^feat", group = "<!-- 0 -->🚀 Features" },
{ message = "^fix", group = "<!-- 1 -->🐛 Bug Fixes" },
{ message = "^fix", group = "<!-- 1 -->🪲 Bug Fixes" },
{ message = "^bump", group = "<!-- 6 -->🛳️ Release" },
# Fallback - skip anything that didn't match the above rules.
{ message = ".*", group = "<!-- 10 -->💼 Other" },


@@ -20,7 +20,6 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
less \
man-db \
procps \
sudo \
unzip \
ripgrep \
zsh \
@@ -47,10 +46,14 @@ RUN npm install -g codex.tgz \
&& rm -rf /usr/local/share/npm-global/lib/node_modules/codex-cli/tests \
&& rm -rf /usr/local/share/npm-global/lib/node_modules/codex-cli/docs
# Copy and set up firewall script
COPY scripts/init_firewall.sh /usr/local/bin/
# Inside the container we consider the environment already sufficiently locked
# down, therefore instruct Codex CLI to allow running without sandboxing.
ENV CODEX_UNSAFE_ALLOW_NO_SANDBOX=1
# Copy and set up firewall script as root.
USER root
RUN chmod +x /usr/local/bin/init_firewall.sh && \
echo "node ALL=(root) NOPASSWD: /usr/local/bin/init_firewall.sh" > /etc/sudoers.d/node-firewall && \
chmod 0440 /etc/sudoers.d/node-firewall
COPY scripts/init_firewall.sh /usr/local/bin/
RUN chmod 500 /usr/local/bin/init_firewall.sh
# Drop back to non-root.
USER node


@@ -1,6 +1,6 @@
{
"name": "@openai/codex",
"version": "0.1.2504221401",
"version": "0.1.2504251709",
"license": "Apache-2.0",
"bin": {
"codex": "bin/codex.js"
@@ -21,7 +21,7 @@
"build": "node build.mjs",
"build:dev": "NODE_ENV=development node build.mjs --dev && NODE_OPTIONS=--enable-source-maps node dist/cli-dev.js",
"release:readme": "cp ../README.md ./README.md",
"release:version": "TS=$(date +%y%m%d%H%M) && sed -E -i'' -e \"s/\\\"0\\.1\\.[0-9]{10}\\\"/\\\"0.1.${TS}\\\"/g\" package.json src/utils/session.ts",
"release:version": "TS=$(date +%y%m%d%H%M) && sed -E -i'' -e \"s/\\\"0\\.1\\.[0-9]{10}\\\"/\\\"0.1.${TS}\\\"/g\" package.json",
"release:build-and-publish": "pnpm run build && npm publish",
"release": "pnpm run release:readme && pnpm run release:version && pnpm install && pnpm run release:build-and-publish"
},
@@ -71,7 +71,7 @@
"eslint-plugin-react-refresh": "^0.4.19",
"husky": "^9.1.7",
"ink-testing-library": "^3.0.0",
"prettier": "^2.8.7",
"prettier": "^3.5.3",
"punycode": "^2.3.1",
"semver": "^7.7.1",
"ts-node": "^10.9.1",


@@ -8,9 +8,9 @@ pushd "$SCRIPT_DIR/.." >> /dev/null || {
echo "Error: Failed to change directory to $SCRIPT_DIR/.."
exit 1
}
npm install
npm run build
pnpm install
pnpm run build
rm -rf ./dist/openai-codex-*.tgz
npm pack --pack-destination ./dist
pnpm pack --pack-destination ./dist
mv ./dist/openai-codex-*.tgz ./dist/codex.tgz
docker build -t codex -f "./Dockerfile" .


@@ -2,6 +2,26 @@
set -euo pipefail # Exit on error, undefined vars, and pipeline failures
IFS=$'\n\t' # Stricter word splitting
# Read allowed domains from file
ALLOWED_DOMAINS_FILE="/etc/codex/allowed_domains.txt"
if [ -f "$ALLOWED_DOMAINS_FILE" ]; then
ALLOWED_DOMAINS=()
while IFS= read -r domain; do
ALLOWED_DOMAINS+=("$domain")
done < "$ALLOWED_DOMAINS_FILE"
echo "Using domains from file: ${ALLOWED_DOMAINS[*]}"
else
# Fallback to default domains
ALLOWED_DOMAINS=("api.openai.com")
echo "Domains file not found, using default: ${ALLOWED_DOMAINS[*]}"
fi
# Ensure we have at least one domain
if [ ${#ALLOWED_DOMAINS[@]} -eq 0 ]; then
echo "ERROR: No allowed domains specified"
exit 1
fi
# Flush existing rules and delete existing ipsets
iptables -F
iptables -X
@@ -24,8 +44,7 @@ iptables -A OUTPUT -o lo -j ACCEPT
ipset create allowed-domains hash:net
# Resolve and add other allowed domains
for domain in \
"api.openai.com"; do
for domain in "${ALLOWED_DOMAINS[@]}"; do
echo "Resolving $domain..."
ips=$(dig +short A "$domain")
if [ -z "$ips" ]; then
@@ -87,7 +106,7 @@ else
echo "Firewall verification passed - unable to reach https://example.com as expected"
fi
# Verify OpenAI API access
# Always verify OpenAI API access is working
if ! curl --connect-timeout 5 https://api.openai.com >/dev/null 2>&1; then
echo "ERROR: Firewall verification failed - unable to reach https://api.openai.com"
exit 1
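
The firewall script change above reads allowed domains from `/etc/codex/allowed_domains.txt`, falls back to `api.openai.com` when the file is missing, and aborts when the resulting list is empty. The following is a minimal TypeScript sketch of that decision flow, not a port of the script: `resolveAllowedDomains` is a hypothetical name, and the trimming and blank-line skipping are assumptions that the shell loop itself does not perform.

```typescript
// Illustrative only: the real logic lives in the bash firewall script.
const DEFAULT_DOMAINS: string[] = ["api.openai.com"];

// `fileContents` is null when the domains file does not exist.
function resolveAllowedDomains(fileContents: string | null): string[] {
  if (fileContents === null) {
    // Fallback to the default domain list.
    return DEFAULT_DOMAINS.slice();
  }
  const domains = fileContents
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0); // assumption: ignore blank lines
  if (domains.length === 0) {
    // Mirrors the script's "No allowed domains specified" failure path.
    throw new Error("No allowed domains specified");
  }
  return domains;
}
```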


@@ -10,6 +10,8 @@ set -e
# Default the work directory to WORKSPACE_ROOT_DIR if not provided.
WORK_DIR="${WORKSPACE_ROOT_DIR:-$(pwd)}"
# Default allowed domains - can be overridden with OPENAI_ALLOWED_DOMAINS env var
OPENAI_ALLOWED_DOMAINS="${OPENAI_ALLOWED_DOMAINS:-api.openai.com}"
# Parse optional flag.
if [ "$1" = "--work_dir" ]; then
@@ -45,6 +47,12 @@ if [ -z "$WORK_DIR" ]; then
exit 1
fi
# Verify that OPENAI_ALLOWED_DOMAINS is not empty
if [ -z "$OPENAI_ALLOWED_DOMAINS" ]; then
echo "Error: OPENAI_ALLOWED_DOMAINS is empty."
exit 1
fi
# Kill any existing container for the working directory using cleanup(), centralizing removal logic.
cleanup
@@ -57,8 +65,25 @@ docker run --name "$CONTAINER_NAME" -d \
codex \
sleep infinity
# Initialize the firewall inside the container.
docker exec "$CONTAINER_NAME" bash -c "sudo /usr/local/bin/init_firewall.sh"
# Write the allowed domains to a file in the container
docker exec --user root "$CONTAINER_NAME" bash -c "mkdir -p /etc/codex"
for domain in $OPENAI_ALLOWED_DOMAINS; do
# Validate domain format to prevent injection
if [[ ! "$domain" =~ ^[a-zA-Z0-9][a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]; then
echo "Error: Invalid domain format: $domain"
exit 1
fi
echo "$domain" | docker exec --user root -i "$CONTAINER_NAME" bash -c "cat >> /etc/codex/allowed_domains.txt"
done
# Set proper permissions on the domains file
docker exec --user root "$CONTAINER_NAME" bash -c "chmod 444 /etc/codex/allowed_domains.txt && chown root:root /etc/codex/allowed_domains.txt"
# Initialize the firewall inside the container as root user
docker exec --user root "$CONTAINER_NAME" bash -c "/usr/local/bin/init_firewall.sh"
# Remove the firewall script after running it
docker exec --user root "$CONTAINER_NAME" bash -c "rm -f /usr/local/bin/init_firewall.sh"
# Execute the provided command in the container, ensuring it runs in the work directory.
# We use a parameterized bash command to safely handle the command and directory.
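
The container launcher change above validates each entry of `OPENAI_ALLOWED_DOMAINS` against `^[a-zA-Z0-9][a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$` before writing it into the container. To make it easier to see what that pattern accepts, here is a small TypeScript sketch using the same expression; `isValidDomain` and `parseDomainList` are hypothetical names introduced for illustration and do not exist in the repository.

```typescript
// Same pattern as the bash check in the diff, expressed as a JS RegExp.
const DOMAIN_PATTERN = /^[a-zA-Z0-9][a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

function isValidDomain(domain: string): boolean {
  return DOMAIN_PATTERN.test(domain);
}

// Hypothetical helper: splits a whitespace-separated list and rejects
// the whole list as soon as one entry fails validation.
function parseDomainList(raw: string): string[] {
  const domains = raw.split(/\s+/).filter((d) => d.length > 0);
  domains.forEach((d) => {
    if (!isValidDomain(d)) {
      throw new Error("Invalid domain format: " + d);
    }
  });
  return domains;
}
```

Because the pattern requires at least two characters before the final dot, a name with a single-character first label such as `a.io` is rejected, while `api.openai.com` passes.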


@@ -10,6 +10,7 @@ import type { ApprovalPolicy } from "./approvals";
import type { CommandConfirmation } from "./utils/agent/agent-loop";
import type { AppConfig } from "./utils/config";
import type { ResponseItem } from "openai/resources/responses/responses";
import type { ReasoningEffort } from "openai/resources.mjs";
import App from "./app";
import { runSinglePass } from "./cli-singlepass";
@@ -160,6 +161,12 @@ const cli = meow(
"Disable truncation of command stdout/stderr messages (show everything)",
aliases: ["no-truncate"],
},
reasoning: {
type: "string",
description: "Set the reasoning effort level (low, medium, high)",
choices: ["low", "medium", "high"],
default: "high",
},
// Notification
notify: {
type: "boolean",
@@ -184,6 +191,10 @@ const cli = meow(
},
);
// ---------------------------------------------------------------------------
// Global flag handling
// ---------------------------------------------------------------------------
// Handle 'completion' subcommand before any prompting or API calls
if (cli.input[0] === "completion") {
const shell = cli.input[1] || "bash";
@@ -271,25 +282,34 @@ if (!apiKey && !NO_API_KEY_REQUIRED.has(provider.toLowerCase())) {
? `You can create a key here: ${chalk.bold(
chalk.underline("https://platform.openai.com/account/api-keys"),
)}\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
: provider.toLowerCase() === "gemini"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
}`,
);
process.exit(1);
}
const flagPresent = Object.hasOwn(cli.flags, "disableResponseStorage");
const disableResponseStorage = flagPresent
? Boolean(cli.flags.disableResponseStorage) // value user actually passed
: (config.disableResponseStorage ?? false); // fall back to YAML, default to false
config = {
apiKey,
...config,
model: model ?? config.model,
notify: Boolean(cli.flags.notify),
reasoningEffort:
(cli.flags.reasoning as ReasoningEffort | undefined) ?? "high",
flexMode: Boolean(cli.flags.flexMode),
provider,
disableResponseStorage:
cli.flags.disableResponseStorage !== undefined
? Boolean(cli.flags.disableResponseStorage)
: config.disableResponseStorage,
disableResponseStorage,
};
// Check for updates after loading config. This is important because we write state file in
@@ -377,8 +397,8 @@ if (cli.flags.quiet) {
cli.flags.fullAuto || cli.flags.approvalMode === "full-auto"
? AutoApprovalMode.FULL_AUTO
: cli.flags.autoEdit || cli.flags.approvalMode === "auto-edit"
? AutoApprovalMode.AUTO_EDIT
: config.approvalMode || AutoApprovalMode.SUGGEST;
? AutoApprovalMode.AUTO_EDIT
: config.approvalMode || AutoApprovalMode.SUGGEST;
await runQuietMode({
prompt,
@@ -408,8 +428,8 @@ const approvalPolicy: ApprovalPolicy =
cli.flags.fullAuto || cli.flags.approvalMode === "full-auto"
? AutoApprovalMode.FULL_AUTO
: cli.flags.autoEdit || cli.flags.approvalMode === "auto-edit"
? AutoApprovalMode.AUTO_EDIT
: config.approvalMode || AutoApprovalMode.SUGGEST;
? AutoApprovalMode.AUTO_EDIT
: config.approvalMode || AutoApprovalMode.SUGGEST;
const instance = render(
<App
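
The `disableResponseStorage` change in this file resolves the setting in a fixed order: a flag the user actually passed wins, then the value from the config file, then `false`. Here is a minimal sketch of that precedence, assuming simplified `Flags` and `Config` shapes that are hypothetical stand-ins for the real CLI types:

```typescript
// Hypothetical shapes for illustration; the real types live in codex-cli.
type Flags = { disableResponseStorage?: boolean };
type Config = { disableResponseStorage?: boolean };

function resolveDisableResponseStorage(flags: Flags, config: Config): boolean {
  // Same idea as Object.hasOwn(flags, "disableResponseStorage") in the diff.
  const flagPresent = Object.prototype.hasOwnProperty.call(
    flags,
    "disableResponseStorage",
  );
  if (flagPresent) {
    // Value the user actually passed on the command line.
    return Boolean(flags.disableResponseStorage);
  }
  // Fall back to the config file, defaulting to false.
  return config.disableResponseStorage ?? false;
}
```

One visible difference from the replaced `!== undefined` check: in this sketch a key that is present but `undefined` resolves to `false` instead of falling back to the config value.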


@@ -3,7 +3,7 @@
import { useTerminalSize } from "../../hooks/use-terminal-size";
import TextBuffer from "../../text-buffer.js";
import chalk from "chalk";
import { Box, Text, useInput, useStdin } from "ink";
import { Box, Text, useInput } from "ink";
import { EventEmitter } from "node:events";
import React, { useRef, useState } from "react";
@@ -189,41 +189,6 @@ const MultilineTextEditorInner = (
// minimum so that the UI never becomes unusably small.
const effectiveWidth = Math.max(20, width ?? terminalSize.columns);
// ---------------------------------------------------------------------------
// External editor integration helpers.
// ---------------------------------------------------------------------------
// Access to stdin so we can toggle raw-mode while the external editor is
// in control of the terminal.
const { stdin, setRawMode } = useStdin();
/**
* Launch the user's preferred $EDITOR, blocking until they close it, then
* reload the edited file back into the in-memory TextBuffer. The heavy
* work is delegated to `TextBuffer.openInExternalEditor`, but we are
* responsible for temporarily *disabling* raw mode so the child process can
* interact with the TTY normally.
*/
const openExternalEditor = React.useCallback(async () => {
// Preserve the current raw-mode setting so we can restore it afterwards.
const wasRaw = stdin?.isRaw ?? false;
try {
setRawMode?.(false);
await buffer.current.openInExternalEditor();
} catch (err) {
// Surface the error so it doesn't fail silently; for now we log to
// stderr. In the future this could surface a toast / overlay.
// eslint-disable-next-line no-console
console.error("[MultilineTextEditor] external editor error", err);
} finally {
if (wasRaw) {
setRawMode?.(true);
}
// Force a re-render so the component reflects the mutated buffer.
setVersion((v) => v + 1);
}
}, [buffer, stdin, setRawMode]);
// ---------------------------------------------------------------------------
// Keyboard handling.
// ---------------------------------------------------------------------------
@@ -234,25 +199,6 @@ const MultilineTextEditorInner = (
return;
}
// Single-step editor shortcut: Ctrl+X or Ctrl+E
// Treat both true Ctrl+Key combinations *and* raw control codes so that
// the shortcut works consistently in real terminals (raw-mode) and the
// ink-testing-library stub which delivers only the raw byte (e.g. 0x05
// for Ctrl-E) without setting `key.ctrl`.
const isCtrlX =
(key.ctrl && (input === "x" || input === "\x18")) || input === "\x18";
const isCtrlE =
(key.ctrl && (input === "e" || input === "\x05")) ||
input === "\x05" ||
(!key.ctrl &&
input === "e" &&
input.length === 1 &&
input.charCodeAt(0) === 5);
if (isCtrlX || isCtrlE) {
openExternalEditor();
return;
}
if (
process.env["TEXTBUFFER_DEBUG"] === "1" ||
process.env["TEXTBUFFER_DEBUG"] === "true"
@@ -439,5 +385,4 @@ const MultilineTextEditorInner = (
};
const MultilineTextEditor = React.forwardRef(MultilineTextEditorInner);
export default MultilineTextEditor;


@@ -106,11 +106,16 @@ export default function TerminalChatInputThinking({
return (
<Box flexDirection="column" gap={1}>
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Box justifyContent="space-between">
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Text>
Thinking
{dots}
</Text>
</Box>
<Text>
Thinking
{dots}
Press <Text bold>Esc</Text> twice to interrupt
</Text>
</Box>
{awaitingConfirm && (


@@ -100,6 +100,7 @@ export default function TerminalChatInput({
const editorRef = useRef<MultilineTextEditorHandle | null>(null);
// Track the caret row across keystrokes
const prevCursorRow = useRef<number | null>(null);
const prevCursorWasAtLastRow = useRef<boolean>(false);
// Load command history on component mount
useEffect(() => {
@@ -135,8 +136,8 @@ export default function TerminalChatInput({
? len - 1
: selectedSlashSuggestion - 1
: selectedSlashSuggestion >= len - 1
? 0
: selectedSlashSuggestion + 1;
? 0
: selectedSlashSuggestion + 1;
setSelectedSlashSuggestion(nextIdx);
// Autocomplete the command in the input
const match = matches[nextIdx];
@@ -245,36 +246,52 @@ export default function TerminalChatInput({
}
if (_key.upArrow) {
// Only recall history when the caret was *already* on the very first
let moveThroughHistory = true;
// Only use history when the caret was *already* on the very first
// row *before* this key-press.
const cursorRow = editorRef.current?.getRow?.() ?? 0;
const cursorCol = editorRef.current?.getCol?.() ?? 0;
const wasAtFirstRow = (prevCursorRow.current ?? cursorRow) === 0;
if (!(cursorRow === 0 && wasAtFirstRow)) {
moveThroughHistory = false;
}
if (history.length > 0 && cursorRow === 0 && wasAtFirstRow) {
// If we are not yet in history mode, then also require that the col is zero so that
// we only trigger history navigation when the user is at the start of the input.
if (historyIndex == null && !(cursorRow === 0 && cursorCol === 0)) {
moveThroughHistory = false;
}
// Move through history.
if (history.length && moveThroughHistory) {
let newIndex: number;
if (historyIndex == null) {
const currentDraft = editorRef.current?.getText?.() ?? input;
setDraftInput(currentDraft);
}
let newIndex: number;
if (historyIndex == null) {
newIndex = history.length - 1;
} else {
newIndex = Math.max(0, historyIndex - 1);
}
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
// Re-mount the editor so it picks up the new initialText
setEditorKey((k) => k + 1);
return; // we handled the key
return; // handled
}
// Otherwise let the event propagate so the editor moves the caret
// Otherwise let it propagate.
}
if (_key.downArrow) {
// Only move forward in history when we're already *in* history mode
// AND the caret sits on the last line of the buffer
if (historyIndex != null && editorRef.current?.isCursorAtLastRow()) {
// AND the caret sits on the last line of the buffer.
const wasAtLastRow =
prevCursorWasAtLastRow.current ??
editorRef.current?.isCursorAtLastRow() ??
true;
if (historyIndex != null && wasAtLastRow) {
const newIndex = historyIndex + 1;
if (newIndex >= history.length) {
setHistoryIndex(null);
@@ -304,9 +321,26 @@ export default function TerminalChatInput({
}
}
// Update the cached cursor position *after* we've potentially handled
// the key so that the next event has the correct "previous" reference.
prevCursorRow.current = editorRef.current?.getRow?.() ?? null;
// Update the cached cursor position *after* **all** handlers (including
// the internal <MultilineTextEditor>) have processed this key event.
//
// Ink invokes `useInput` callbacks starting with **parent** components
// first, followed by their descendants. As a result the call above
// executes *before* the editor has had a chance to react to the key
// press and update its internal caret position. When navigating
// through a multi-line draft with the ↑ / ↓ arrow keys this meant we
// recorded the *old* cursor row instead of the one that results *after*
// the key press. Consequently, a subsequent ↑ still saw
// `prevCursorRow = 1` even though the caret was already on row 0 and
// history-navigation never kicked in.
//
// Defer the sampling by one tick so we read the *final* caret position
// for this frame.
setTimeout(() => {
prevCursorRow.current = editorRef.current?.getRow?.() ?? null;
prevCursorWasAtLastRow.current =
editorRef.current?.isCursorAtLastRow?.() ?? true;
}, 1);
if (input.trim() === "" && isNew) {
if (_key.tab) {
@@ -339,73 +373,60 @@ export default function TerminalChatInput({
const onSubmit = useCallback(
async (value: string) => {
const inputValue = value.trim();
// If the user only entered a slash, do not send a chat message
// If the user only entered a slash, do not send a chat message.
if (inputValue === "/") {
setInput("");
return;
}
// Skip this submit if we just autocompleted a slash command
// Skip this submit if we just autocompleted a slash command.
if (skipNextSubmit) {
setSkipNextSubmit(false);
return;
}
if (!inputValue) {
return;
}
if (inputValue === "/history") {
} else if (inputValue === "/history") {
setInput("");
openOverlay();
return;
}
if (inputValue === "/help") {
} else if (inputValue === "/help") {
setInput("");
openHelpOverlay();
return;
}
if (inputValue === "/diff") {
} else if (inputValue === "/diff") {
setInput("");
openDiffOverlay();
return;
}
if (inputValue === "/compact") {
} else if (inputValue === "/compact") {
setInput("");
onCompact();
return;
}
if (inputValue.startsWith("/model")) {
} else if (inputValue.startsWith("/model")) {
setInput("");
openModelOverlay();
return;
}
if (inputValue.startsWith("/approval")) {
} else if (inputValue.startsWith("/approval")) {
setInput("");
openApprovalOverlay();
return;
}
if (inputValue === "q" || inputValue === ":q" || inputValue === "exit") {
} else if (["exit", "q", ":q"].includes(inputValue)) {
setInput("");
// wait one 60ms frame
setTimeout(() => {
app.exit();
onExit();
process.exit(0);
}, 60);
}, 60); // Wait one frame.
return;
} else if (inputValue === "/clear" || inputValue === "clear") {
setInput("");
setSessionId("");
setLastResponseId("");
// Clear the terminal screen (including scrollback) before resetting context
clearTerminal();
// Emit a system notice in the chat; no raw console writes so Ink keeps control.
// Clear the terminal screen (including scrollback) before resetting context.
clearTerminal();
// Emit a system message to confirm the clear action. We *append*
// it so Ink's <Static> treats it as new output and actually renders it.
@@ -449,7 +470,7 @@ export default function TerminalChatInput({
await clearCommandHistory();
setHistory([]);
// Emit a system message to confirm the history clear action
// Emit a system message to confirm the history clear action.
setItems((prev) => [
...prev,
{
@@ -466,19 +487,12 @@ export default function TerminalChatInput({
return;
} else if (inputValue === "/bug") {
// Generate a GitHub bug report URL prefilled with session details
// Generate a GitHub bug report URL prefilled with session details.
setInput("");
try {
// Dynamically import dependencies to avoid unnecessary bundle size
const [{ default: open }, os] = await Promise.all([
import("open"),
import("node:os"),
]);
// Lazy import CLI_VERSION to avoid circular deps
const os = await import("node:os");
const { CLI_VERSION } = await import("../../utils/session.js");
const { buildBugReportUrl } = await import(
"../../utils/bug-report.js"
);
@@ -492,10 +506,6 @@ export default function TerminalChatInput({
.join(" | "),
});
// Open the URL in the user's default browser
await open(url, { wait: false });
// Inform the user in the chat history
setItems((prev) => [
...prev,
{
@@ -505,13 +515,13 @@ export default function TerminalChatInput({
content: [
{
type: "input_text",
text: "📋 Opened browser to file a bug report. Please include any context that might help us fix the issue!",
text: `🔗 Bug report URL: ${url}`,
},
],
},
]);
} catch (error) {
// If anything went wrong, notify the user
// If anything went wrong, notify the user.
setItems((prev) => [
...prev,
{
@@ -530,10 +540,10 @@ export default function TerminalChatInput({
return;
} else if (inputValue.startsWith("/")) {
// Handle invalid/unrecognized commands.
// Only single-word inputs starting with '/' (e.g., /command) that are not recognized are caught here.
// Any other input, including those starting with '/' but containing spaces
// (e.g., "/command arg"), will fall through and be treated as a regular prompt.
// Handle invalid/unrecognized commands. Only single-word inputs starting with '/'
// (e.g., /command) that are not recognized are caught here. Any other input, including
// those starting with '/' but containing spaces (e.g., "/command arg"), will fall through
// and be treated as a regular prompt.
const trimmed = inputValue.trim();
if (/^\/\S+$/.test(trimmed)) {
@@ -560,11 +570,13 @@ export default function TerminalChatInput({
// detect image file paths for dynamic inclusion
const images: Array<string> = [];
let text = inputValue;
// markdown-style image syntax: ![alt](path)
text = text.replace(/!\[[^\]]*?\]\(([^)]+)\)/g, (_m, p1: string) => {
images.push(p1.startsWith("file://") ? fileURLToPath(p1) : p1);
return "";
});
// quoted file paths ending with common image extensions (e.g. '/path/to/img.png')
text = text.replace(
/['"]([^'"]+?\.(?:png|jpe?g|gif|bmp|webp|svg))['"]/gi,
@@ -573,6 +585,7 @@ export default function TerminalChatInput({
return "";
},
);
// bare file paths ending with common image extensions
text = text.replace(
// eslint-disable-next-line no-useless-escape
@@ -589,10 +602,10 @@ export default function TerminalChatInput({
const inputItem = await createInputItem(text, images);
submitInput([inputItem]);
// Get config for history persistence
// Get config for history persistence.
const config = loadConfig();
// Add to history and update state
// Add to history and update state.
const updatedHistory = await addToHistory(value, history, {
maxSize: config.history?.maxSize ?? 1000,
saveHistory: config.history?.saveHistory ?? true,
@@ -734,8 +747,7 @@ export default function TerminalChatInput({
/>
) : (
<Text dimColor>
send q or ctrl+c to exit | send "/clear" to reset | send "/help" for
commands | press enter to send | shift+enter for new line
ctrl+c to exit | "/" to see commands | enter to send
{contextLeftPercent > 25 && (
<>
{" — "}
@@ -869,20 +881,30 @@ function TerminalChatInputThinking({
);
return (
<Box flexDirection="column" gap={1}>
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Box width="100%" flexDirection="column" gap={1}>
<Box
flexDirection="row"
width="100%"
justifyContent="space-between"
paddingRight={1}
>
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Text>
Thinking
{dots}
</Text>
</Box>
<Text>
Thinking
{dots}
<Text dimColor>press</Text> <Text bold>Esc</Text>{" "}
{awaitingConfirm ? (
<Text bold>again</Text>
) : (
<Text dimColor>twice</Text>
)}{" "}
<Text dimColor>to interrupt</Text>
</Text>
</Box>
{awaitingConfirm && (
<Text dimColor>
Press <Text bold>Esc</Text> again to interrupt and enter a new
instruction
</Text>
)}
</Box>
);
}
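
The up-arrow handling changed earlier in this file only recalls history when the caret is on row 0 now and was already there before the key press. When history browsing has not started yet, it additionally requires the caret to be at column 0. The sketch below restates that decision as a pure function so the conditions can be checked in isolation. It is an illustration under assumptions: `UpArrowState`, `shouldRecallHistory`, and `nextHistoryIndex` are hypothetical names, not identifiers from the component.

```typescript
// Hypothetical snapshot of the state consulted on an up-arrow press.
interface UpArrowState {
  cursorRow: number;
  cursorCol: number;
  prevCursorRow: number | null; // row recorded after the previous key event
  historyIndex: number | null; // null means "not browsing history yet"
  historyLength: number;
}

function shouldRecallHistory(s: UpArrowState): boolean {
  // If no previous row was recorded, treat the current row as the previous one.
  const wasAtFirstRow = (s.prevCursorRow ?? s.cursorRow) === 0;
  let move = s.cursorRow === 0 && wasAtFirstRow;
  // Before entering history mode, also require the caret at column 0.
  if (s.historyIndex == null && !(s.cursorRow === 0 && s.cursorCol === 0)) {
    move = false;
  }
  return s.historyLength > 0 && move;
}

// Index of the entry to show: newest entry first, then step backwards,
// clamping at the oldest entry.
function nextHistoryIndex(
  historyIndex: number | null,
  historyLength: number,
): number {
  return historyIndex == null
    ? historyLength - 1
    : Math.max(0, historyIndex - 1);
}
```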


@@ -1,560 +0,0 @@
import type { MultilineTextEditorHandle } from "./multiline-editor";
import type { ReviewDecision } from "../../utils/agent/review.js";
import type { HistoryEntry } from "../../utils/storage/command-history.js";
import type {
ResponseInputItem,
ResponseItem,
} from "openai/resources/responses/responses.mjs";
import MultilineTextEditor from "./multiline-editor";
import { TerminalChatCommandReview } from "./terminal-chat-command-review.js";
import { loadConfig } from "../../utils/config.js";
import { createInputItem } from "../../utils/input-utils.js";
import { log } from "../../utils/logger/log.js";
import { setSessionId } from "../../utils/session.js";
import {
loadCommandHistory,
addToHistory,
} from "../../utils/storage/command-history.js";
import { clearTerminal, onExit } from "../../utils/terminal.js";
import { Box, Text, useApp, useInput, useStdin } from "ink";
import { fileURLToPath } from "node:url";
import React, { useCallback, useState, Fragment, useEffect } from "react";
import { useInterval } from "use-interval";
const suggestions = [
"explain this codebase to me",
"fix any build errors",
"are there any bugs in my code?",
];
const typeHelpText = `ctrl+c to exit | "/clear" to reset context | "/help" for commands | ↑↓ to recall history | ctrl+x to open external editor | enter to send`;
// Enable verbose logging for the history-navigation logic when the
// DEBUG_TCI environment variable is truthy. The traces help while debugging
// unit-test failures but remain silent in production.
const DEBUG_HIST =
process.env["DEBUG_TCI"] === "1" || process.env["DEBUG_TCI"] === "true";
// Placeholder for potential dynamic prompts - currently not used.
export default function TerminalChatInput({
isNew: _isNew,
loading,
submitInput,
confirmationPrompt,
explanation,
submitConfirmation,
setLastResponseId,
setItems,
contextLeftPercent,
openOverlay,
openModelOverlay,
openApprovalOverlay,
openHelpOverlay,
openDiffOverlay,
interruptAgent,
active,
thinkingSeconds,
}: {
isNew: boolean;
loading: boolean;
submitInput: (input: Array<ResponseInputItem>) => void;
confirmationPrompt: React.ReactNode | null;
explanation?: string;
submitConfirmation: (
decision: ReviewDecision,
customDenyMessage?: string,
) => void;
setLastResponseId: (lastResponseId: string) => void;
setItems: React.Dispatch<React.SetStateAction<Array<ResponseItem>>>;
contextLeftPercent: number;
openOverlay: () => void;
openModelOverlay: () => void;
openApprovalOverlay: () => void;
openHelpOverlay: () => void;
openDiffOverlay: () => void;
interruptAgent: () => void;
active: boolean;
thinkingSeconds: number;
}): React.ReactElement {
const app = useApp();
const [selectedSuggestion, setSelectedSuggestion] = useState<number>(0);
const [input, setInput] = useState("");
const [history, setHistory] = useState<Array<HistoryEntry>>([]);
const [historyIndex, setHistoryIndex] = useState<number | null>(null);
const [draftInput, setDraftInput] = useState<string>("");
// Multiline text editor is now the default input mode. We keep an
// incremental `editorKey` so that we can force-remount the component and
// thus reset its internal buffer after each successful submit.
const [editorKey, setEditorKey] = useState(0);
// Load command history on component mount
useEffect(() => {
async function loadHistory() {
const historyEntries = await loadCommandHistory();
setHistory(historyEntries);
}
loadHistory();
}, []);
// Imperative handle from the multiline editor so we can query caret position
const editorRef = React.useRef<MultilineTextEditorHandle | null>(null);
// Track the caret row across keystrokes so we can tell whether the cursor
// was *already* on the first/last line before the current key event. This
// lets us distinguish between a normal vertical navigation (e.g. moving
// from row 1 → row 0 inside a multiline draft) and an attempt to navigate
// the chat history (pressing ↑ again while already at row 0).
const prevCursorRow = React.useRef<number | null>(null);
useInput(
(_input, _key) => {
if (!confirmationPrompt && !loading) {
if (_key.upArrow) {
if (DEBUG_HIST) {
// eslint-disable-next-line no-console
console.log("[TCI] upArrow", {
historyIndex,
input,
cursorRow: editorRef.current?.getRow?.(),
});
}
// Only recall history when the caret was *already* on the very first
// row *before* this key-press. That means the user pressed ↑ while
// the cursor sat at the top, mirroring how shells like Bash/zsh
// enter history navigation. When the caret starts on a lower line
// the first ↑ should merely move it up one row; only a subsequent
// press (when we are *still* at row 0) should trigger the recall.
const cursorRow = editorRef.current?.getRow?.() ?? 0;
const wasAtFirstRow = (prevCursorRow.current ?? cursorRow) === 0;
if (history.length > 0 && cursorRow === 0 && wasAtFirstRow) {
if (historyIndex == null) {
const currentDraft = editorRef.current?.getText?.() ?? input;
setDraftInput(currentDraft);
if (DEBUG_HIST) {
// eslint-disable-next-line no-console
console.log("[TCI] store draft", JSON.stringify(currentDraft));
}
}
let newIndex: number;
if (historyIndex == null) {
newIndex = history.length - 1;
} else {
newIndex = Math.max(0, historyIndex - 1);
}
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
// Remount the editor so it picks up the new initialText.
setEditorKey((k) => k + 1);
return; // we handled the key
}
// Otherwise let the event propagate so the editor moves the caret.
}
if (_key.downArrow) {
if (DEBUG_HIST) {
// eslint-disable-next-line no-console
console.log("[TCI] downArrow", { historyIndex, draftInput, input });
}
// Only move forward in history when we're already *in* history mode
// AND the caret sits on the last line of the buffer (so ↓ within a
// multiline draft simply moves the caret down).
if (historyIndex != null && editorRef.current?.isCursorAtLastRow()) {
const newIndex = historyIndex + 1;
if (newIndex >= history.length) {
setHistoryIndex(null);
setInput(draftInput);
setEditorKey((k) => k + 1);
} else {
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
setEditorKey((k) => k + 1);
}
return; // handled
}
// Otherwise let it propagate.
}
}
if (input.trim() === "") {
if (_key.tab) {
setSelectedSuggestion(
(s) => (s + (_key.shift ? -1 : 1)) % (suggestions.length + 1),
);
} else if (selectedSuggestion && _key.return) {
const suggestion = suggestions[selectedSuggestion - 1] || "";
setInput("");
setSelectedSuggestion(0);
submitInput([
{
role: "user",
content: [{ type: "input_text", text: suggestion }],
type: "message",
},
]);
}
} else if (_input === "\u0003" || (_input === "c" && _key.ctrl)) {
setTimeout(() => {
app.exit();
onExit();
process.exit(0);
}, 60);
}
// Update the cached cursor position *after* we've potentially handled
// the key so that the next event has the correct "previous" reference.
prevCursorRow.current = editorRef.current?.getRow?.() ?? null;
},
{ isActive: active },
);
const onSubmit = useCallback(
async (value: string) => {
const inputValue = value.trim();
if (!inputValue) {
return;
}
if (inputValue === "/history") {
setInput("");
openOverlay();
return;
}
if (inputValue === "/help") {
setInput("");
openHelpOverlay();
return;
}
if (inputValue === "/diff") {
setInput("");
openDiffOverlay();
return;
}
if (inputValue.startsWith("/model")) {
setInput("");
openModelOverlay();
return;
}
if (inputValue.startsWith("/approval")) {
setInput("");
openApprovalOverlay();
return;
}
if (inputValue === "q" || inputValue === ":q" || inputValue === "exit") {
setInput("");
// wait one 60ms frame
setTimeout(() => {
app.exit();
onExit();
process.exit(0);
}, 60);
return;
} else if (inputValue === "/clear" || inputValue === "clear") {
setInput("");
setSessionId("");
setLastResponseId("");
// Clear the terminal screen (including scrollback) before resetting context
clearTerminal();
// Print a clear confirmation and reset conversation items.
setItems([
{
id: `clear-${Date.now()}`,
type: "message",
role: "system",
content: [{ type: "input_text", text: "Terminal cleared" }],
},
]);
return;
} else if (inputValue === "/clearhistory") {
setInput("");
// Import clearCommandHistory function to avoid circular dependencies
// Using dynamic import to lazy-load the function
import("../../utils/storage/command-history.js").then(
async ({ clearCommandHistory }) => {
await clearCommandHistory();
setHistory([]);
// Emit a system message to confirm the history clear action
setItems((prev) => [
...prev,
{
id: `clearhistory-${Date.now()}`,
type: "message",
role: "system",
content: [
{ type: "input_text", text: "Command history cleared" },
],
},
]);
},
);
return;
}
const images: Array<string> = [];
const text = inputValue
.replace(/!\[[^\]]*?\]\(([^)]+)\)/g, (_m, p1: string) => {
images.push(p1.startsWith("file://") ? fileURLToPath(p1) : p1);
return "";
})
.trim();
const inputItem = await createInputItem(text, images);
submitInput([inputItem]);
// Get config for history persistence
const config = loadConfig();
// Add to history and update state
const updatedHistory = await addToHistory(value, history, {
maxSize: config.history?.maxSize ?? 1000,
saveHistory: config.history?.saveHistory ?? true,
sensitivePatterns: config.history?.sensitivePatterns ?? [],
});
setHistory(updatedHistory);
setHistoryIndex(null);
setDraftInput("");
setSelectedSuggestion(0);
setInput("");
},
[
setInput,
submitInput,
setLastResponseId,
setItems,
app,
setHistory,
setHistoryIndex,
openOverlay,
openApprovalOverlay,
openModelOverlay,
openHelpOverlay,
openDiffOverlay,
history, // Add history to the dependency array
],
);
if (confirmationPrompt) {
return (
<TerminalChatCommandReview
confirmationPrompt={confirmationPrompt}
onReviewCommand={submitConfirmation}
// allow switching approval mode via 'v'
onSwitchApprovalMode={openApprovalOverlay}
explanation={explanation}
// disable when input is inactive (e.g., overlay open)
isActive={active}
/>
);
}
return (
<Box flexDirection="column">
{loading ? (
<Box borderStyle="round">
<TerminalChatInputThinking
onInterrupt={interruptAgent}
active={active}
thinkingSeconds={thinkingSeconds}
/>
</Box>
) : (
<>
<Box borderStyle="round">
<MultilineTextEditor
ref={editorRef}
onChange={(txt: string) => setInput(txt)}
key={editorKey}
initialText={input}
height={8}
focus={active}
onSubmit={(txt) => {
onSubmit(txt);
setEditorKey((k) => k + 1);
setInput("");
setHistoryIndex(null);
setDraftInput("");
}}
/>
</Box>
<Box paddingX={2} marginBottom={1}>
<Text dimColor>
{!input ? (
<>
try:{" "}
{suggestions.map((m, key) => (
<Fragment key={key}>
{key !== 0 ? " | " : ""}
<Text
backgroundColor={
key + 1 === selectedSuggestion ? "blackBright" : ""
}
>
{m}
</Text>
</Fragment>
))}
</>
) : (
<>
{typeHelpText}
{contextLeftPercent < 25 && (
<>
{" — "}
<Text color="red">
{Math.round(contextLeftPercent)}% context left
</Text>
</>
)}
</>
)}
</Text>
</Box>
</>
)}
</Box>
);
}
function TerminalChatInputThinking({
onInterrupt,
active,
thinkingSeconds,
}: {
onInterrupt: () => void;
active: boolean;
thinkingSeconds: number;
}) {
const [awaitingConfirm, setAwaitingConfirm] = useState(false);
const [dots, setDots] = useState("");
// Animate ellipsis
useInterval(() => {
setDots((prev) => (prev.length < 3 ? prev + "." : ""));
}, 500);
// Spinner frames with seconds embedded
const ballFrames = [
"( ●    )",
"(  ●   )",
"(   ●  )",
"(    ● )",
"(     ●)",
"(    ● )",
"(   ●  )",
"(  ●   )",
"( ●    )",
"(●     )",
];
const [frame, setFrame] = useState(0);
useInterval(() => {
setFrame((idx) => (idx + 1) % ballFrames.length);
}, 80);
const frameTemplate = ballFrames[frame] ?? ballFrames[0];
const frameWithSeconds = (frameTemplate as string).replace(
"●",
`${thinkingSeconds}s`,
);
// ---------------------------------------------------------------------
// Raw stdin listener to catch the case where the terminal delivers two
// consecutive ESC bytes ("\x1B\x1B") in a *single* chunk. Ink's `useInput`
// collapses that sequence into one key event, so the regular two-step
// handler above never sees the second press. By inspecting the raw data
// we can identify this special case and trigger the interrupt while still
// requiring a double press for the normal single-byte ESC events.
// ---------------------------------------------------------------------
const { stdin, setRawMode } = useStdin();
React.useEffect(() => {
if (!active) {
return;
}
// Ensure raw mode is on. Ink already enables it when the component has focus,
// but we call this defensively in case that assumption ever changes.
setRawMode?.(true);
const onData = (data: Buffer | string) => {
if (awaitingConfirm) {
return; // already awaiting a second explicit press
}
// Handle both Buffer and string forms.
const str = Buffer.isBuffer(data) ? data.toString("utf8") : data;
if (str === "\x1b\x1b") {
// Treat as the first Escape press and prompt the user for confirmation.
log(
"raw stdin: received collapsed ESC ESC, starting confirmation timer",
);
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
};
stdin?.on("data", onData);
return () => {
stdin?.off("data", onData);
};
}, [stdin, awaitingConfirm, onInterrupt, active, setRawMode]);
// Elapsed time is provided via props, so no local interval is needed.
useInput(
(_input, key) => {
if (!key.escape) {
return;
}
if (awaitingConfirm) {
log("useInput: second ESC detected, triggering onInterrupt()");
onInterrupt();
setAwaitingConfirm(false);
} else {
log("useInput: first ESC detected, waiting for confirmation");
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
},
{ isActive: active },
);
return (
<Box flexDirection="column" gap={1}>
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Text>
Thinking
{dots}
</Text>
</Box>
{awaitingConfirm && (
<Text dimColor>
Press <Text bold>Esc</Text> again to interrupt and enter a new
instruction
</Text>
)}
</Box>
);
}
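
The double-press confirmation in `TerminalChatInputThinking` can be sketched outside of React as a small, timer-free gate. This is a minimal illustration under stated assumptions, not the component's code: `DoublePressGate` and `press(now)` are hypothetical names, and the caller passes timestamps in instead of relying on `setTimeout`. Only the 1500 ms confirmation window is taken from the diff.

```typescript
// Hypothetical helper: reports whether a key press confirms an earlier one.
class DoublePressGate {
  // Timestamp (ms) of the press that armed the gate, or null when idle.
  private armedAt: number | null = null;
  private readonly windowMs: number;

  constructor(windowMs: number = 1500) {
    this.windowMs = windowMs;
  }

  // Returns true only when this press lands inside the confirmation window
  // opened by the previous press. Otherwise the gate is (re-)armed.
  press(now: number): boolean {
    if (this.armedAt !== null && now - this.armedAt <= this.windowMs) {
      this.armedAt = null; // consume the confirmation
      return true;
    }
    this.armedAt = now;
    return false;
  }
}
```

A caller would show the "Press Esc again" hint whenever `press()` returns `false` and invoke its interrupt callback when it returns `true`.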

View File

@@ -135,14 +135,14 @@ function TerminalChatResponseMessage({
c.type === "output_text"
? c.text
: c.type === "refusal"
? c.refusal
: c.type === "input_text"
? c.text
: c.type === "input_image"
? "<Image>"
: c.type === "input_file"
? c.filename
: "", // unknown content type
? c.refusal
: c.type === "input_text"
? c.text
: c.type === "input_image"
? "<Image>"
: c.type === "input_file"
? c.filename
: "", // unknown content type
)
.join(" ")}
</Markdown>

View File

@@ -31,6 +31,7 @@ import DiffOverlay from "../diff-overlay.js";
import HelpOverlay from "../help-overlay.js";
import HistoryOverlay from "../history-overlay.js";
import ModelOverlay from "../model-overlay.js";
import chalk from "chalk";
import { Box, Text } from "ink";
import { spawn } from "node:child_process";
import OpenAI from "openai";
@@ -141,7 +142,7 @@ export default function TerminalChat({
additionalWritableRoots,
fullStdout,
}: Props): React.ReactElement {
const notify = config.notify;
const notify = Boolean(config.notify);
const [model, setModel] = useState<string>(config.model);
const [provider, setProvider] = useState<string>(config.provider || "openai");
const [lastResponseId, setLastResponseId] = useState<string | null>(null);
@@ -575,7 +576,7 @@ export default function TerminalChat({
providers={config.providers}
currentProvider={provider}
hasLastResponse={Boolean(lastResponseId)}
onSelect={(newModel) => {
onSelect={(allModels, newModel) => {
log(
"TerminalChat: interruptAgent invoked, calling agent.cancel()",
);
@@ -585,6 +586,20 @@ export default function TerminalChat({
agent?.cancel();
setLoading(false);
if (!allModels?.includes(newModel)) {
// eslint-disable-next-line no-console
console.error(
chalk.bold.red(
`Model "${chalk.yellow(
newModel,
)}" is not available for provider "${chalk.yellow(
provider,
)}".`,
),
);
return;
}
setModel(newModel);
setLastResponseId((prev) =>
prev && newModel !== model ? null : prev,

View File

@@ -73,7 +73,7 @@ const TerminalHeader: React.FC<TerminalHeaderProps> = ({
</Text>
<Text dimColor>
<Text color="blueBright"></Text> approval:{" "}
<Text bold color={colorsByPolicy[approvalPolicy]} dimColor>
<Text bold color={colorsByPolicy[approvalPolicy]}>
{approvalPolicy}
</Text>
</Text>

View File

@@ -53,7 +53,8 @@ export default function HelpOverlay({
<Text color="cyan">/clearhistory</Text> clear command history
</Text>
<Text>
<Text color="cyan">/bug</Text> file a bug report with session log
<Text color="cyan">/bug</Text> generate a prefilled GitHub issue URL
with session log
</Text>
<Text>
<Text color="cyan">/diff</Text> view working tree git diff

View File

@@ -148,8 +148,8 @@ function formatHistoryForDisplay(items: Array<ResponseItem>): {
const cmdArray: Array<string> | undefined = Array.isArray(argsObj?.["cmd"])
? (argsObj!["cmd"] as Array<string>)
: Array.isArray(argsObj?.["command"])
? (argsObj!["command"] as Array<string>)
: undefined;
? (argsObj!["command"] as Array<string>)
: undefined;
if (cmdArray && cmdArray.length > 0) {
commands.push(processCommandArray(cmdArray, filesSet));

View File

@@ -19,7 +19,7 @@ type Props = {
currentProvider?: string;
hasLastResponse: boolean;
providers?: Record<string, { name: string; baseURL: string; envKey: string }>;
onSelect: (model: string) => void;
onSelect: (allModels: Array<string>, model: string) => void;
onSelectProvider?: (provider: string) => void;
onExit: () => void;
};
@@ -153,7 +153,12 @@ export default function ModelOverlay({
}
initialItems={items}
currentValue={currentModel}
onSelect={onSelect}
onSelect={(selectedModel) =>
onSelect(
items?.map((m) => m.value),
selectedModel,
)
}
onExit={onExit}
/>
);

View File

@@ -5,7 +5,13 @@ import type { FileOperation } from "../utils/singlepass/file_ops";
import Spinner from "./vendor/ink-spinner"; // Third-party / vendor components
import TextInput from "./vendor/ink-text-input";
import { OPENAI_TIMEOUT_MS, getBaseUrl, getApiKey } from "../utils/config";
import {
OPENAI_TIMEOUT_MS,
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getBaseUrl,
getApiKey,
} from "../utils/config";
import {
generateDiffSummary,
generateEditSummary,
@@ -393,10 +399,19 @@ export function SinglePassApp({
files,
});
const headers: Record<string, string> = {};
if (OPENAI_ORGANIZATION) {
headers["OpenAI-Organization"] = OPENAI_ORGANIZATION;
}
if (OPENAI_PROJECT) {
headers["OpenAI-Project"] = OPENAI_PROJECT;
}
const openai = new OpenAI({
apiKey: getApiKey(config.provider),
baseURL: getBaseUrl(config.provider),
timeout: OPENAI_TIMEOUT_MS,
defaultHeaders: headers,
});
const chatResp = await openai.beta.chat.completions.parse({
model: config.model,
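
The hunk above attaches the `OpenAI-Organization` and `OpenAI-Project` headers only when the corresponding values are set. A minimal sketch of that pattern as a standalone function follows. `buildOpenAIHeaders` is a hypothetical name, and the values are passed in explicitly because the diff does not show how `OPENAI_ORGANIZATION` and `OPENAI_PROJECT` are populated in `../utils/config`.

```typescript
// Hypothetical helper: build optional request headers, omitting unset values.
function buildOpenAIHeaders(
  organization: string | undefined,
  project: string | undefined,
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (organization) {
    headers["OpenAI-Organization"] = organization;
  }
  if (project) {
    headers["OpenAI-Project"] = project;
  }
  return headers;
}
```

Empty strings are treated like `undefined`, so no blank header is ever sent. The result can be passed as `defaultHeaders` when constructing the client, as the diff does.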

View File

@@ -34,6 +34,10 @@ function clamp(v: number, min: number, max: number): number {
* ---------------------------------------------------------------------- */
function toCodePoints(str: string): Array<string> {
if (typeof Intl !== "undefined" && "Segmenter" in Intl) {
const seg = new Intl.Segmenter();
return [...seg.segment(str)].map((seg) => seg.segment);
}
// [...str] or Array.from both iterate by UTF-32 code point, handling
// surrogate pairs correctly.
return Array.from(str);
@@ -103,88 +107,6 @@ export default class TextBuffer {
}
}
/* =====================================================================
* External editor integration (git-style $EDITOR workflow)
* =================================================================== */
/**
* Opens the current buffer contents in the user's preferred terminal text
* editor ($VISUAL or $EDITOR, falling back to "vi"). The method blocks
* until the editor exits, then reloads the file and replaces the in-memory
* buffer with whatever the user saved.
*
* The operation is treated as a single undoable edit: we snapshot the
* previous state *once* before launching the editor so one `undo()` will
* revert the entire change set.
*
* Note: We purposefully rely on the *synchronous* spawn API so that the
* calling process genuinely waits for the editor to close before
* continuing. This mirrors Git's behaviour and simplifies downstream
* control-flow (callers can simply `await` the Promise).
*/
async openInExternalEditor(opts: { editor?: string } = {}): Promise<void> {
// Deliberately use `require()` so that unit tests can stub the
// respective modules with `vi.spyOn(require("node:child_process"), …)`.
// Dynamic `import()` would circumvent those CommonJS stubs.
// eslint-disable-next-line @typescript-eslint/no-var-requires
const pathMod = require("node:path");
// eslint-disable-next-line @typescript-eslint/no-var-requires
const fs = require("node:fs");
// eslint-disable-next-line @typescript-eslint/no-var-requires
const os = require("node:os");
// eslint-disable-next-line @typescript-eslint/no-var-requires
const { spawnSync } = require("node:child_process");
const editor =
opts.editor ??
process.env["VISUAL"] ??
process.env["EDITOR"] ??
(process.platform === "win32" ? "notepad" : "vi");
// Prepare a temporary file with the current contents. We use mkdtempSync
// to obtain an isolated directory and avoid name collisions.
const tmpDir = fs.mkdtempSync(pathMod.join(os.tmpdir(), "codex-edit-"));
const filePath = pathMod.join(tmpDir, "buffer.txt");
fs.writeFileSync(filePath, this.getText(), "utf8");
// One snapshot for undo semantics *before* we mutate anything.
this.pushUndo();
// The child inherits stdio so the user can interact with the editor as if
// they had launched it directly.
const { status, error } = spawnSync(editor, [filePath], {
stdio: "inherit",
});
if (error) {
throw error;
}
if (typeof status === "number" && status !== 0) {
throw new Error(`External editor exited with status ${status}`);
}
// Read the edited contents back in and normalise line endings to \n.
let newText = fs.readFileSync(filePath, "utf8");
newText = newText.replace(/\r\n?/g, "\n");
// Update buffer.
this.lines = newText.split("\n");
if (this.lines.length === 0) {
this.lines = [""];
}
// Position the caret at EOF.
this.cursorRow = this.lines.length - 1;
this.cursorCol = cpLen(this.line(this.cursorRow));
// Reset scroll offsets so the new end is visible.
this.scrollRow = Math.max(0, this.cursorRow - 1);
this.scrollCol = 0;
this.version++;
}
/* =======================================================================
* Geometry helpers
* ===================================================================== */
@@ -415,6 +337,58 @@ export default class TextBuffer {
});
}
/**
* Delete everything from the caret to the *end* of the current line. The
* caret itself stays in place (column remains unchanged). Mirrors the
* common Ctrl+K shortcut in many shells and editors.
*/
deleteToLineEnd(): void {
dbg("deleteToLineEnd", { beforeCursor: this.getCursor() });
const line = this.line(this.cursorRow);
if (this.cursorCol >= this.lineLen(this.cursorRow)) {
// Nothing to delete; caret already at EOL.
return;
}
this.pushUndo();
// Keep the prefix before the caret, discard the remainder.
this.lines[this.cursorRow] = cpSlice(line, 0, this.cursorCol);
this.version++;
dbg("deleteToLineEnd:after", {
cursor: this.getCursor(),
line: this.line(this.cursorRow),
});
}
/**
* Delete everything from the *start* of the current line up to (but not
* including) the caret. The caret is moved to column-0, mirroring the
* behaviour of the familiar Ctrl+U binding.
*/
deleteToLineStart(): void {
dbg("deleteToLineStart", { beforeCursor: this.getCursor() });
if (this.cursorCol === 0) {
// Nothing to delete; caret already at SOL.
return;
}
this.pushUndo();
const line = this.line(this.cursorRow);
this.lines[this.cursorRow] = cpSlice(line, this.cursorCol);
this.cursorCol = 0;
this.version++;
dbg("deleteToLineStart:after", {
cursor: this.getCursor(),
line: this.line(this.cursorRow),
});
}
/* ------------------------------------------------------------------
* Word-wise deletion helpers, exposed publicly so tests (and future
* keybindings) can invoke them directly.
@@ -636,6 +610,24 @@ export default class TextBuffer {
}
}
/* ------------------------------------------------------------------
* Document-level navigation helpers
* ---------------------------------------------------------------- */
/** Move caret to *absolute* beginning of the buffer (row-0, col-0). */
private moveToStartOfDocument(): void {
this.preferredCol = null;
this.cursorRow = 0;
this.cursorCol = 0;
}
/** Move caret to *absolute* end of the buffer (last row, last column). */
private moveToEndOfDocument(): void {
this.preferredCol = null;
this.cursorRow = this.lines.length - 1;
this.cursorCol = this.lineLen(this.cursorRow);
}
/* =====================================================================
* Higher-level helpers
* =================================================================== */
@@ -787,7 +779,6 @@ export default class TextBuffer {
!key["ctrl"] &&
!key["alt"]
) {
/* navigation */
this.move("left");
} else if (
key["rightArrow"] &&
@@ -807,12 +798,26 @@ export default class TextBuffer {
key["rightArrow"]
) {
this.move("wordRight");
}
// Many terminal/OS combinations (e.g. macOS Terminal.app & iTerm2 with
// the default key-bindings) translate ⌥← / ⌥→ into the classic readline
// shortcuts ESC-b / ESC-f rather than an ANSI arrow sequence that Ink
// would tag with `leftArrow` / `rightArrow`. Ink parses those 2-byte
// escape sequences into `input === "b"|"f"` with `key.meta === true`.
// Handle this variant explicitly so that Option+Arrow performs word
// navigation consistently across environments.
else if (key["meta"] && (input === "b" || input === "B")) {
this.move("wordLeft");
} else if (key["meta"] && (input === "f" || input === "F")) {
this.move("wordRight");
} else if (key["home"]) {
this.move("home");
} else if (key["end"]) {
this.move("end");
}
/* delete */
// Deletions
//
// In raw terminal mode many frameworks (Ink included) surface a physical
// Backspace keypress as the single DEL (0x7f) byte placed in `input` with
// no `key.backspace` flag set. Treat that byte exactly like an ordinary
@@ -835,22 +840,47 @@ export default class TextBuffer {
// forward deletion so we don't lose that capability on keyboards that
// expose both behaviours.
this.backspace();
}
// Forward deletion (Fn+Delete on macOS, or Delete key with Shift held after
// the branch above): remove the character *under / to the right* of the
// caret, merging lines when at EOL similar to many editors.
else if (key["delete"]) {
} else if (key["delete"]) {
// Forward deletion (Fn+Delete on macOS, or Delete key with Shift held after
// the branch above): remove the character *under / to the right* of the
// caret, merging lines when at EOL similar to many editors.
this.del();
} else if (input && !key["ctrl"] && !key["meta"]) {
}
// Normal input
else if (input && !key["ctrl"] && !key["meta"]) {
this.insert(input);
}
/* printable */
// Emacs/readline-style shortcuts
else if (key["ctrl"] && (input === "a" || input === "\x01")) {
// Ctrl+A → start of input (first row, first column)
this.moveToStartOfDocument();
} else if (key["ctrl"] && (input === "e" || input === "\x05")) {
// Ctrl+E → end of input (last row, last column)
this.moveToEndOfDocument();
} else if (key["ctrl"] && (input === "b" || input === "\x02")) {
// Ctrl+B → char left
this.move("left");
} else if (key["ctrl"] && (input === "f" || input === "\x06")) {
// Ctrl+F → char right
this.move("right");
} else if (key["ctrl"] && (input === "d" || input === "\x04")) {
// Ctrl+D → forward delete
this.del();
} else if (key["ctrl"] && (input === "k" || input === "\x0b")) {
// Ctrl+K → kill to EOL
this.deleteToLineEnd();
} else if (key["ctrl"] && (input === "u" || input === "\x15")) {
// Ctrl+U → kill to SOL
this.deleteToLineStart();
} else if (key["ctrl"] && (input === "w" || input === "\x17")) {
// Ctrl+W → delete word left
this.deleteWordLeft();
}
/* clamp + scroll */
/* printable, clamp + scroll */
this.ensureCursorInRange();
this.ensureCursorVisible(vp);
const cursorMoved =
this.cursorRow !== beforeRow || this.cursorCol !== beforeCol;
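
The Ctrl+K and Ctrl+U helpers added above operate on code-point columns rather than UTF-16 indices. The sketch below shows the same idea on a single line, detached from `TextBuffer`. `killToLineEnd` and `killToLineStart` are hypothetical free functions, and plain `Array.from` stands in for the buffer's `cpSlice` helper, whose implementation is not shown in this diff.

```typescript
// Split a string by Unicode code point so surrogate pairs count as one column.
const toCodePoints = (s: string): Array<string> => Array.from(s);

interface LineState {
  line: string;
  col: number;
}

// Ctrl+K style: drop everything from the caret to the end of the line.
// The caret column is left unchanged.
function killToLineEnd(line: string, col: number): LineState {
  const cps = toCodePoints(line);
  if (col >= cps.length) {
    return { line, col }; // caret already at end of line, nothing to delete
  }
  return { line: cps.slice(0, col).join(""), col };
}

// Ctrl+U style: drop everything before the caret and move it to column 0.
function killToLineStart(line: string, col: number): LineState {
  if (col === 0) {
    return { line, col }; // caret already at start of line, nothing to delete
  }
  return { line: toCodePoints(line).slice(col).join(""), col: 0 };
}
```

Slicing the code-point array instead of calling `String.prototype.slice` keeps an emoji or other astral character from being cut in half when the caret sits right after it.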

View File

@@ -11,7 +11,13 @@ import type {
} from "openai/resources/responses/responses.mjs";
import type { Reasoning } from "openai/resources.mjs";
import { OPENAI_TIMEOUT_MS, getApiKey, getBaseUrl } from "../config.js";
import {
OPENAI_TIMEOUT_MS,
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getApiKey,
getBaseUrl,
} from "../config.js";
import { log } from "../logger/log.js";
import { parseToolCallArguments } from "../parsers.js";
import { responsesCreateViaChatCompletions } from "../responses.js";
@@ -28,7 +34,7 @@ import OpenAI, { APIConnectionTimeoutError } from "openai";
// Wait time before retrying after rate limit errors (ms).
const RATE_LIMIT_RETRY_WAIT_MS = parseInt(
process.env["OPENAI_RATE_LIMIT_RETRY_WAIT_MS"] || "2500",
process.env["OPENAI_RATE_LIMIT_RETRY_WAIT_MS"] || "500",
10,
);
@@ -40,6 +46,7 @@ export type CommandConfirmation = {
};
const alreadyProcessedResponses = new Set();
const alreadyStagedItemIds = new Set<string>();
type AgentLoopParams = {
model: string;
@@ -272,12 +279,10 @@ export class AgentLoop {
// defined object. We purposefully copy over the `model` and
// `instructions` that have already been passed explicitly so that
// downstream consumers (e.g. telemetry) still observe the correct values.
this.config =
config ??
({
model,
instructions: instructions ?? "",
} as AppConfig);
this.config = config ?? {
model,
instructions: instructions ?? "",
};
this.additionalWritableRoots = additionalWritableRoots;
this.onItem = onItem;
this.onLoading = onLoading;
@@ -304,6 +309,10 @@ export class AgentLoop {
originator: ORIGIN,
version: CLI_VERSION,
session_id: this.sessionId,
...(OPENAI_ORGANIZATION
? { "OpenAI-Organization": OPENAI_ORGANIZATION }
: {}),
...(OPENAI_PROJECT ? { "OpenAI-Project": OPENAI_PROJECT } : {}),
},
...(timeoutMs !== undefined ? { timeout: timeoutMs } : {}),
});
@@ -554,17 +563,27 @@ export class AgentLoop {
return;
}
// Skip items we've already processed to avoid staging duplicates
if (item.id && alreadyStagedItemIds.has(item.id)) {
return;
}
alreadyStagedItemIds.add(item.id);
// Store the item so the final flush can still operate on a complete list.
// We'll nil out entries once they're delivered.
const idx = staged.push(item) - 1;
// Instead of emitting synchronously we schedule a short-delay delivery.
//
// This accomplishes two things:
// 1. The UI still sees new messages almost immediately, creating the
// perception of real-time updates.
// 2. If the user calls `cancel()` in the small window right after the
// item was staged we can still abort the delivery because the
// generation counter will have been bumped by `cancel()`.
//
// Use a minimal 3ms delay for terminal rendering to maintain readable
// streaming.
setTimeout(() => {
if (
thisGeneration === this.generation &&
@@ -575,8 +594,9 @@ export class AgentLoop {
// Mark as delivered so flush won't re-emit it
staged[idx] = undefined;
// When we operate without server-side storage we keep our own
// transcript so we can provide full context on subsequent calls.
// Handle transcript updates to maintain consistency. When we
// operate without server-side storage we keep our own transcript
// so we can provide full context on subsequent calls.
if (this.disableResponseStorage) {
// Exclude system messages from transcript as they do not form
// part of the assistant/user dialogue that the model needs.
@@ -620,7 +640,7 @@ export class AgentLoop {
}
}
}
}, 10);
}, 3); // Small 3ms delay for readable streaming.
};
while (turnInput.length > 0) {
@@ -647,16 +667,16 @@ export class AgentLoop {
for (const item of deltaInput) {
stageItem(item as ResponseItem);
}
// Send request to OpenAI with retry on timeout
// Send request to OpenAI with retry on timeout.
let stream;
// Retry loop for transient errors. Up to MAX_RETRIES attempts.
const MAX_RETRIES = 5;
const MAX_RETRIES = 8;
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
try {
let reasoning: Reasoning | undefined;
if (this.model.startsWith("o")) {
reasoning = { effort: "high" };
reasoning = { effort: this.config.reasoningEffort ?? "high" };
if (this.model === "o3" || this.model === "o4-mini") {
reasoning.summary = "auto";
}
@@ -861,7 +881,6 @@ export class AgentLoop {
throw error;
}
}
turnInput = []; // clear turn input, prepare for function call results
// If the user requested cancellation while we were awaiting the network
// request, abort immediately before we start handling the stream.
@@ -881,7 +900,7 @@ export class AgentLoop {
// Keep track of the active stream so it can be aborted on demand.
this.currentStream = stream;
// guard against an undefined stream before iterating
// Guard against an undefined stream before iterating.
if (!stream) {
this.onLoading(false);
log("AgentLoop.run(): stream is undefined");
@@ -894,6 +913,8 @@ export class AgentLoop {
// eslint-disable-next-line no-constant-condition
while (true) {
try {
let newTurnInput: Array<ResponseInputItem> = [];
// eslint-disable-next-line no-await-in-loop
for await (const event of stream as AsyncIterable<ResponseEvent>) {
log(`AgentLoop.run(): response event ${event.type}`);
@@ -935,7 +956,7 @@ export class AgentLoop {
"requires_action"
) {
// TODO: remove this once we can depend on streaming events
const newTurnInput = await this.processEventsWithoutStreaming(
newTurnInput = await this.processEventsWithoutStreaming(
event.response.output,
stageItem,
);
@@ -970,24 +991,30 @@ export class AgentLoop {
if (delta.length === 0) {
// No new input => end conversation.
turnInput = [];
newTurnInput = [];
} else {
// Resend full transcript *plus* the new delta so the
// stateless backend receives complete context.
turnInput = [...this.transcript, ...delta];
newTurnInput = [...this.transcript, ...delta];
// The prefix ends at the current transcript length;
// everything after this index is new for the next
// iteration.
transcriptPrefixLen = this.transcript.length;
}
} else {
turnInput = newTurnInput;
}
}
lastResponseId = event.response.id;
this.onLastResponseId(event.response.id);
}
}
// Assign only after we have consumed all stream events. If the stream was
// incomplete or we missed events for whatever reason, the next turn is set
// to an empty array, which prevents an infinite loop. Updating the turn
// input any earlier would leave the current turn's inputs unavailable for
// retries.
turnInput = newTurnInput;
// Stream finished successfully; leave the retry loop.
break;
} catch (err: unknown) {
@@ -1191,8 +1218,18 @@ export class AgentLoop {
this.onLoading(false);
};
// Delay flush slightly to allow a near-simultaneous cancel() to land.
setTimeout(flush, 30);
// Use a small delay to make sure UI rendering is smooth. Double-check
// cancellation state right before flushing to avoid race conditions.
setTimeout(() => {
if (
!this.canceled &&
!this.hardAbort.signal.aborted &&
thisGeneration === this.generation
) {
flush();
}
}, 3);
// End of main logic. The corresponding catch block for the wrapper at the
// start of this method follows next.
} catch (err) {
@@ -1282,14 +1319,6 @@ export class AgentLoop {
return true;
}
// Explicit check for OpenAI "server_error" types which are surfaced
// when the backend encounters an unexpected exception. The SDK often
// omits the HTTP status in this case (leaving it undefined) so we
// must inspect the structured error fields instead.
if (e.type === "server_error" || e.code === "server_error") {
return true;
}
if (typeof e.status === "number" && e.status >= 500) {
return true;
}
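
The staged-delivery comments above rely on a generation counter: each delayed callback remembers the generation it was scheduled under and becomes a no-op once `cancel()` bumps the counter. The sketch below isolates that mechanism. `StagedDelivery` is a hypothetical class, not part of `AgentLoop`, and `stage()` returns the callback instead of handing it to `setTimeout`, so the behaviour can be exercised synchronously.

```typescript
// Hypothetical illustration of generation-guarded delivery.
class StagedDelivery<T> {
  private generation = 0;
  private delivered: Array<T> = [];

  // Capture the current generation and return the callback that would be
  // scheduled with a short delay (e.g. setTimeout(callback, 3)).
  stage(item: T): () => void {
    const thisGeneration = this.generation;
    return () => {
      // Deliver only if no cancel() happened since the item was staged.
      if (thisGeneration === this.generation) {
        this.delivered.push(item);
      }
    };
  }

  // Invalidate every callback staged so far.
  cancel(): void {
    this.generation += 1;
  }

  getDelivered(): Array<T> {
    return [...this.delivered];
  }
}
```

Because the check happens when the callback runs, not when it is staged, a cancel that lands inside the delay window still suppresses the item without any timer bookkeeping.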

View File

@@ -211,9 +211,46 @@ class Parser {
}
if (defStr.trim()) {
let found = false;
if (!fileLines.slice(0, index).some((s) => s === defStr)) {
// ------------------------------------------------------------------
// Equality helpers using the canonicalisation from find_context_core.
// (We duplicate a minimal version here because the scope is local.)
// ------------------------------------------------------------------
const canonLocal = (s: string): string =>
s.normalize("NFC").replace(
/./gu,
(c) =>
(
({
"-": "-",
"\u2010": "-",
"\u2011": "-",
"\u2012": "-",
"\u2013": "-",
"\u2014": "-",
"\u2212": "-",
"\u0022": '"',
"\u201C": '"',
"\u201D": '"',
"\u201E": '"',
"\u00AB": '"',
"\u00BB": '"',
"\u0027": "'",
"\u2018": "'",
"\u2019": "'",
"\u201B": "'",
"\u00A0": " ",
"\u202F": " ",
}) as Record<string, string>
)[c] ?? c,
);
if (
!fileLines
.slice(0, index)
.some((s) => canonLocal(s) === canonLocal(defStr))
) {
for (let i = index; i < fileLines.length; i++) {
if (fileLines[i] === defStr) {
if (canonLocal(fileLines[i]!) === canonLocal(defStr)) {
index = i + 1;
found = true;
break;
@@ -222,10 +259,14 @@ class Parser {
}
if (
!found &&
!fileLines.slice(0, index).some((s) => s.trim() === defStr.trim())
!fileLines
.slice(0, index)
.some((s) => canonLocal(s.trim()) === canonLocal(defStr.trim()))
) {
for (let i = index; i < fileLines.length; i++) {
if (fileLines[i]!.trim() === defStr.trim()) {
if (
canonLocal(fileLines[i]!.trim()) === canonLocal(defStr.trim())
) {
index = i + 1;
this.fuzz += 1;
found = true;
@@ -293,34 +334,98 @@ function find_context_core(
context: Array<string>,
start: number,
): [number, number] {
// ---------------------------------------------------------------------------
// Helpers: Unicode punctuation normalisation
// ---------------------------------------------------------------------------
/*
* The patch-matching algorithm originally required **exact** string equality
* for non-whitespace characters. That breaks when the file on disk contains
* visually identical but different Unicode code-points (e.g. “EN DASH” vs
* ASCII "-"), because models almost always emit the ASCII variant. To make
* apply_patch resilient we canonicalise a handful of common punctuation
* look-alikes before doing comparisons.
*
* We purposefully keep the mapping *small*: only characters that routinely
* appear in source files and are highly unlikely to introduce ambiguity are
* included. Each entry is written using the corresponding Unicode escape so
* that the file remains ASCII-only even after transpilation.
*/
const PUNCT_EQUIV: Record<string, string> = {
// Hyphen / dash variants --------------------------------------------------
/* U+002D HYPHEN-MINUS */ "-": "-",
/* U+2010 HYPHEN */ "\u2010": "-",
/* U+2011 NO-BREAK HYPHEN */ "\u2011": "-",
/* U+2012 FIGURE DASH */ "\u2012": "-",
/* U+2013 EN DASH */ "\u2013": "-",
/* U+2014 EM DASH */ "\u2014": "-",
/* U+2212 MINUS SIGN */ "\u2212": "-",
// Double quotes -----------------------------------------------------------
/* U+0022 QUOTATION MARK */ "\u0022": '"',
/* U+201C LEFT DOUBLE QUOTATION MARK */ "\u201C": '"',
/* U+201D RIGHT DOUBLE QUOTATION MARK */ "\u201D": '"',
/* U+201E DOUBLE LOW-9 QUOTATION MARK */ "\u201E": '"',
/* U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK */ "\u00AB": '"',
/* U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK */ "\u00BB": '"',
// Single quotes -----------------------------------------------------------
/* U+0027 APOSTROPHE */ "\u0027": "'",
/* U+2018 LEFT SINGLE QUOTATION MARK */ "\u2018": "'",
/* U+2019 RIGHT SINGLE QUOTATION MARK */ "\u2019": "'",
/* U+201B SINGLE HIGH-REVERSED-9 QUOTATION MARK */ "\u201B": "'",
// Spaces ------------------------------------------------------------------
/* U+00A0 NO-BREAK SPACE */ "\u00A0": " ",
/* U+202F NARROW NO-BREAK SPACE */ "\u202F": " ",
};
const canon = (s: string): string =>
s
// Canonical Unicode composition first
.normalize("NFC")
// Replace punctuation look-alikes
.replace(/./gu, (c) => PUNCT_EQUIV[c] ?? c);
if (context.length === 0) {
return [start, 0];
}
// Pass 1: exact equality after canonicalisation ---------------------------
const canonicalContext = canon(context.join("\n"));
for (let i = start; i < lines.length; i++) {
if (lines.slice(i, i + context.length).join("\n") === context.join("\n")) {
const segment = canon(lines.slice(i, i + context.length).join("\n"));
if (segment === canonicalContext) {
return [i, 0];
}
}
// Pass 2: ignore trailing whitespace -------------------------------------
for (let i = start; i < lines.length; i++) {
if (
const segment = canon(
lines
.slice(i, i + context.length)
.map((s) => s.trimEnd())
.join("\n") === context.map((s) => s.trimEnd()).join("\n")
) {
.join("\n"),
);
const ctx = canon(context.map((s) => s.trimEnd()).join("\n"));
if (segment === ctx) {
return [i, 1];
}
}
// Pass 3: ignore all surrounding whitespace ------------------------------
for (let i = start; i < lines.length; i++) {
if (
const segment = canon(
lines
.slice(i, i + context.length)
.map((s) => s.trim())
.join("\n") === context.map((s) => s.trim()).join("\n")
) {
.join("\n"),
);
const ctx = canon(context.map((s) => s.trim()).join("\n"));
if (segment === ctx) {
return [i, 100];
}
}
return [-1, 0];
}
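
The canonicalisation step used by `find_context_core` can be tried in isolation. The sketch below uses a reduced subset of the mapping from the diff (a few dashes, quotes and the no-break space), and `findCanonicalLine` is a hypothetical helper added only to show the comparison on single lines. It is not the multi-line, three-pass search from the patch.

```typescript
// Reduced subset of the look-alike table from the diff, for illustration only.
const PUNCT_EQUIV: Record<string, string> = {
  "\u2010": "-", // HYPHEN
  "\u2013": "-", // EN DASH
  "\u2014": "-", // EM DASH
  "\u2212": "-", // MINUS SIGN
  "\u201C": '"', // LEFT DOUBLE QUOTATION MARK
  "\u201D": '"', // RIGHT DOUBLE QUOTATION MARK
  "\u2018": "'", // LEFT SINGLE QUOTATION MARK
  "\u2019": "'", // RIGHT SINGLE QUOTATION MARK
  "\u00A0": " ", // NO-BREAK SPACE
};

// NFC-compose first, then replace punctuation look-alikes with ASCII.
const canon = (s: string): string =>
  s.normalize("NFC").replace(/./gu, (c) => PUNCT_EQUIV[c] ?? c);

// Hypothetical helper: index of the first line at or after `start` that
// equals `needle` once both sides are canonicalised, or -1 if none does.
function findCanonicalLine(
  lines: Array<string>,
  needle: string,
  start: number = 0,
): number {
  const target = canon(needle);
  return lines.findIndex((l, i) => i >= start && canon(l) === target);
}
```

With this in place, a patch line written with an ASCII `-` matches a file line that contains an en dash or em dash, which is the mismatch the comment block in the diff describes.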

View File

@@ -1,17 +1,18 @@
import type { CommandConfirmation } from "./agent-loop.js";
import type { AppConfig } from "../config.js";
import type { ExecInput } from "./sandbox/interface.js";
import type { ApplyPatchCommand, ApprovalPolicy } from "../../approvals.js";
import type { ExecInput } from "./sandbox/interface.js";
import type { ResponseInputItem } from "openai/resources/responses/responses.mjs";
import { exec, execApplyPatch } from "./exec.js";
import { ReviewDecision } from "./review.js";
import { FullAutoErrorMode } from "../auto-approval-mode.js";
import { SandboxType } from "./sandbox/interface.js";
import { canAutoApprove } from "../../approvals.js";
import { formatCommandForDisplay } from "../../format-command.js";
import { FullAutoErrorMode } from "../auto-approval-mode.js";
import { CODEX_UNSAFE_ALLOW_NO_SANDBOX, type AppConfig } from "../config.js";
import { exec, execApplyPatch } from "./exec.js";
import { ReviewDecision } from "./review.js";
import { isLoggingEnabled, log } from "../logger/log.js";
import { access } from "fs/promises";
import { SandboxType } from "./sandbox/interface.js";
import { PATH_TO_SEATBELT_EXECUTABLE } from "./sandbox/macos-seatbelt.js";
import fs from "fs/promises";
// ---------------------------------------------------------------------------
// Session-level cache of commands that the user has chosen to always approve.
@@ -217,7 +218,7 @@ async function execCommand(
let { workdir } = execInput;
if (workdir) {
try {
await access(workdir);
await fs.access(workdir);
} catch (e) {
log(`EXEC workdir=${workdir} not found, use process.cwd() instead`);
workdir = process.cwd();
@@ -270,30 +271,45 @@ async function execCommand(
};
}
const isInLinux = async (): Promise<boolean> => {
try {
await access("/proc/1/cgroup");
return true;
} catch {
return false;
}
};
/** Resolves to `true` if `/usr/bin/sandbox-exec` is present and executable. */
const isSandboxExecAvailable: Promise<boolean> = fs
.access(PATH_TO_SEATBELT_EXECUTABLE, fs.constants.X_OK)
.then(
() => true,
(err) => {
if (!["ENOENT", "EACCES", "EPERM"].includes(err.code)) {
log(
`Unexpected error for \`stat ${PATH_TO_SEATBELT_EXECUTABLE}\`: ${err.message}`,
);
}
return false;
},
);
async function getSandbox(runInSandbox: boolean): Promise<SandboxType> {
if (runInSandbox) {
if (process.platform === "darwin") {
return SandboxType.MACOS_SEATBELT;
} else if (await isInLinux()) {
return SandboxType.NONE;
} else if (process.platform === "win32") {
// On Windows, we don't have a sandbox implementation yet, so we fall back to NONE
// instead of throwing an error, which would crash the application
log(
"WARNING: Sandbox was requested but is not available on Windows. Continuing without sandbox.",
);
// On macOS we rely on the system-provided `sandbox-exec` binary to
// enforce the Seatbelt profile. However, starting with macOS 14 the
// executable may be removed from the default installation or the user
// might be running the CLI on a stripped-down environment (for
// instance, inside certain CI images). Attempting to spawn a missing
// binary makes Node.js throw an *uncaught* `ENOENT` error further down
// the stack which crashes the whole CLI.
if (await isSandboxExecAvailable) {
return SandboxType.MACOS_SEATBELT;
} else {
throw new Error(
"Sandbox was mandated, but 'sandbox-exec' was not found in PATH!",
);
}
} else if (CODEX_UNSAFE_ALLOW_NO_SANDBOX) {
// Allow running without a sandbox if the user has explicitly marked the
// environment as already being sufficiently locked-down.
return SandboxType.NONE;
}
// For other platforms, still throw an error as before
// For all else, we hard fail if the user has requested a sandbox and none is available.
throw new Error("Sandbox was mandated, but no sandbox is available!");
} else {
return SandboxType.NONE;
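
A minimal sketch of the availability probe used in this hunk, assuming the goal is to check one fixed path and treat the usual "missing or not permitted" errors as a plain `false`. The names `probeExecutable` and `isExpectedAccessError` are hypothetical. Node.js reports a permission failure with the code `EACCES`.

```ts
import { constants } from "node:fs";
import { access } from "node:fs/promises";

// Error codes that simply mean "not usable"; anything else is worth logging.
const EXPECTED_CODES = new Set(["ENOENT", "EACCES", "EPERM"]);

function isExpectedAccessError(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === "string" && EXPECTED_CODES.has(code);
}

// Resolves to true when `path` exists and is executable; never rejects.
function probeExecutable(path: string): Promise<boolean> {
  return access(path, constants.X_OK).then(
    () => true,
    (err: unknown) => {
      if (!isExpectedAccessError(err)) {
        console.warn(`Unexpected error while probing ${path}:`, err);
      }
      return false;
    },
  );
}

// Start the probe once at module load; every caller awaits the same promise.
const sandboxExecAvailable: Promise<boolean> =
  probeExecutable("/usr/bin/sandbox-exec");
```

Storing the promise rather than a function means the file system is touched once per process, and `await sandboxExecAvailable` is cheap on every later call.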

View File

@@ -12,6 +12,14 @@ function getCommonRoots() {
];
}
/**
* When working with `sandbox-exec`, only consider `sandbox-exec` in `/usr/bin`
* to defend against an attacker trying to inject a malicious version on the
* PATH. If /usr/bin/sandbox-exec has been tampered with, then the attacker
* already has root access.
*/
export const PATH_TO_SEATBELT_EXECUTABLE = "/usr/bin/sandbox-exec";
export function execWithSeatbelt(
cmd: Array<string>,
opts: SpawnOptions,
@@ -57,7 +65,7 @@ export function execWithSeatbelt(
);
const fullCommand = [
"sandbox-exec",
PATH_TO_SEATBELT_EXECUTABLE,
"-p",
fullPolicy,
...policyTemplateParams,

View File

@@ -7,15 +7,42 @@
// compiled `dist/` output used by the published CLI.
import type { FullAutoErrorMode } from "./auto-approval-mode.js";
import type { ReasoningEffort } from "openai/resources.mjs";
import { AutoApprovalMode } from "./auto-approval-mode.js";
import { log } from "./logger/log.js";
import { providers } from "./providers.js";
import { config as loadDotenv } from "dotenv";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
import { load as loadYaml, dump as dumpYaml } from "js-yaml";
import { homedir } from "os";
import { dirname, join, extname, resolve as resolvePath } from "path";
// ---------------------------------------------------------------------------
// User-wide environment config (~/.codex.env)
// ---------------------------------------------------------------------------
// Load a user-level dotenv file **after** process.env and any project-local
// .env file (loaded via "dotenv/config" in cli.tsx) are in place. We rely on
// dotenv's default behaviour of *not* overriding existing variables so that
// the precedence order becomes:
// 1. Explicit environment variables
// 2. Project-local .env (handled in cli.tsx)
// 3. User-wide ~/.codex.env (loaded here)
// This guarantees that users can still override the global key on a per-project
// basis while enjoying the convenience of a persistent default.
// Skip when running inside Vitest to avoid interfering with the FS mocks used
// by tests that stub out `fs` *after* importing this module.
const USER_WIDE_CONFIG_PATH = join(homedir(), ".codex.env");
const isVitest =
typeof (globalThis as { vitest?: unknown }).vitest !== "undefined";
if (!isVitest) {
loadDotenv({ path: USER_WIDE_CONFIG_PATH });
}
export const DEFAULT_AGENTIC_MODEL = "o4-mini";
export const DEFAULT_FULL_CONTEXT_MODEL = "gpt-4.1";
export const DEFAULT_APPROVAL_MODE = AutoApprovalMode.SUGGEST;
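
The precedence order described in the comment above follows from applying each layer without overwriting keys that are already present. The sketch below models that with plain objects instead of calling dotenv; `Env` and `applyLayer` are hypothetical names used only for illustration.

```ts
type Env = Record<string, string>;

// Copy keys from `layer` into `target`, but never overwrite an existing key.
// This models a loader that leaves already-set variables alone.
function applyLayer(target: Env, layer: Env): void {
  for (const [key, value] of Object.entries(layer)) {
    if (!Object.prototype.hasOwnProperty.call(target, key)) {
      target[key] = value;
    }
  }
}

// 1. Explicit environment variables come first.
const env: Env = { OPENAI_API_KEY: "explicit" };
// 2. The project-local .env is applied next.
applyLayer(env, { OPENAI_API_KEY: "project", OPENAI_BASE_URL: "project-url" });
// 3. The user-wide file is applied last and only fills remaining gaps.
applyLayer(env, {
  OPENAI_API_KEY: "user",
  OPENAI_BASE_URL: "user-url",
  OPENAI_PROJECT: "user-project",
});
```

Because earlier layers win, the order in which the files are loaded is the whole precedence policy; no explicit priority field is needed.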
@@ -37,6 +64,16 @@ export const OPENAI_TIMEOUT_MS =
export const OPENAI_BASE_URL = process.env["OPENAI_BASE_URL"] || "";
export let OPENAI_API_KEY = process.env["OPENAI_API_KEY"] || "";
export const DEFAULT_REASONING_EFFORT = "high";
export const OPENAI_ORGANIZATION = process.env["OPENAI_ORGANIZATION"] || "";
export const OPENAI_PROJECT = process.env["OPENAI_PROJECT"] || "";
// Can be set to `true` when Codex is running in an environment that is already
// considered sufficiently locked-down, so that we allow running without an explicit sandbox.
// Any non-empty value enables it, since the variable is passed through `Boolean(...)`.
export const CODEX_UNSAFE_ALLOW_NO_SANDBOX = Boolean(
process.env["CODEX_UNSAFE_ALLOW_NO_SANDBOX"] || "",
);
export function setApiKey(apiKey: string): void {
OPENAI_API_KEY = apiKey;
}
@@ -76,6 +113,12 @@ export function getApiKey(provider: string = "openai"): string | undefined {
return process.env[providerInfo.envKey];
}
// Checking `PROVIDER_API_KEY` feels more intuitive with a custom provider.
const customApiKey = process.env[`${provider.toUpperCase()}_API_KEY`];
if (customApiKey) {
return customApiKey;
}
// If the provider is not found in the providers list and `OPENAI_API_KEY` is set, use it.
if (OPENAI_API_KEY !== "") {
return OPENAI_API_KEY;
@@ -102,6 +145,9 @@ export type StoredConfig = {
saveHistory?: boolean;
sensitivePatterns?: Array<string>;
};
/** User-defined safe commands */
safeCommands?: Array<string>;
reasoningEffort?: ReasoningEffort;
};
// Minimal config written on first run. An *empty* model string ensures that
@@ -109,7 +155,7 @@ export type StoredConfig = {
// propagating to existing users until they explicitly set a model.
export const EMPTY_STORED_CONFIG: StoredConfig = { model: "" };
// Pre-stringified JSON variant so we don’t stringify repeatedly.
// Pre-stringified JSON variant so we don't stringify repeatedly.
const EMPTY_CONFIG_JSON = JSON.stringify(EMPTY_STORED_CONFIG, null, 2) + "\n";
export type MemoryConfig = {
@@ -125,8 +171,9 @@ export type AppConfig = {
approvalMode?: AutoApprovalMode;
fullAutoErrorMode?: FullAutoErrorMode;
memory?: MemoryConfig;
reasoningEffort?: ReasoningEffort;
/** Whether to enable desktop notifications for responses */
notify: boolean;
notify?: boolean;
/** Disable server-side response storage (send full transcript each request) */
disableResponseStorage?: boolean;
@@ -151,6 +198,7 @@ export const PRETTY_PRINT = Boolean(process.env["PRETTY_PRINT"] || "");
export const PROJECT_DOC_MAX_BYTES = 32 * 1024; // 32 kB
const PROJECT_DOC_FILENAMES = ["codex.md", ".codex.md", "CODEX.md"];
const PROJECT_DOC_SEPARATOR = "\n\n--- project-doc ---\n\n";
export function discoverProjectDocPath(startDir: string): string | null {
const cwd = resolvePath(startDir);
@@ -275,6 +323,22 @@ export const loadConfig = (
}
}
if (
storedConfig.disableResponseStorage !== undefined &&
typeof storedConfig.disableResponseStorage !== "boolean"
) {
if (storedConfig.disableResponseStorage === "true") {
storedConfig.disableResponseStorage = true;
} else if (storedConfig.disableResponseStorage === "false") {
storedConfig.disableResponseStorage = false;
} else {
log(
`[codex] Warning: 'disableResponseStorage' in config is not a boolean (got '${storedConfig.disableResponseStorage}'). Ignoring this value.`,
);
delete storedConfig.disableResponseStorage;
}
}
const instructionsFilePathResolved =
instructionsPath ?? INSTRUCTIONS_FILEPATH;
const userInstructions = existsSync(instructionsFilePathResolved)
@@ -305,7 +369,7 @@ export const loadConfig = (
const combinedInstructions = [userInstructions, projectDoc]
.filter((s) => s && s.trim() !== "")
.join("\n\n--- project-doc ---\n\n");
.join(PROJECT_DOC_SEPARATOR);
// Treat empty string ("" or whitespace) as absence so we can fall back to
// the latest DEFAULT_MODEL.
@@ -324,7 +388,8 @@ export const loadConfig = (
instructions: combinedInstructions,
notify: storedConfig.notify === true,
approvalMode: storedConfig.approvalMode,
disableResponseStorage: storedConfig.disableResponseStorage ?? false,
disableResponseStorage: storedConfig.disableResponseStorage === true,
reasoningEffort: storedConfig.reasoningEffort,
};
// -----------------------------------------------------------------------
@@ -439,6 +504,8 @@ export const saveConfig = (
provider: config.provider,
providers: config.providers,
approvalMode: config.approvalMode,
disableResponseStorage: config.disableResponseStorage,
reasoningEffort: config.reasoningEffort,
};
// Add history settings if they exist
@@ -456,5 +523,9 @@ export const saveConfig = (
writeFileSync(targetPath, JSON.stringify(configToSave, null, 2), "utf-8");
}
writeFileSync(instructionsPath, config.instructions, "utf-8");
// Take everything before the first PROJECT_DOC_SEPARATOR (or the whole string if none).
const [userInstructions = ""] = config.instructions.split(
PROJECT_DOC_SEPARATOR,
);
writeFileSync(instructionsPath, userInstructions, "utf-8");
};
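
The round trip between the `loadConfig` and `saveConfig` hunks can be sketched as two small helpers. The separator string is taken from the diff; `combineInstructions` and `extractUserInstructions` are hypothetical names, and the sketch only illustrates the split-on-first-separator idea.

```ts
const PROJECT_DOC_SEPARATOR = "\n\n--- project-doc ---\n\n";

// Join the non-empty parts, as the load path does.
function combineInstructions(user: string, projectDoc: string): string {
  return [user, projectDoc]
    .filter((s) => s.trim() !== "")
    .join(PROJECT_DOC_SEPARATOR);
}

// Keep everything before the first separator (or the whole string if none),
// as the save path does before writing the instructions file.
function extractUserInstructions(combined: string): string {
  const [userInstructions = ""] = combined.split(PROJECT_DOC_SEPARATOR);
  return userInstructions;
}

const combined = combineInstructions("Be terse.", "# Project notes");
const saved = extractUserInstructions(combined); // "Be terse."
```

One property of this sketch is worth knowing: when the user part is empty, the combined string contains no separator, so `extractUserInstructions` returns the project doc unchanged.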

View File

@@ -1,5 +1,26 @@
import { execSync } from "node:child_process";
// The objects thrown by `child_process.execSync()` are `Error` instances that
// include additional, undocumented properties such as `status` (exit code) and
// `stdout` (captured standard output). Declare a minimal interface that captures
// just the fields we need so that we can avoid the use of `any` while keeping
// the checks type-safe.
interface ExecSyncError extends Error {
// Exit status code. When a diff is produced, git exits with code 1 which we
// treat as a non-error signal.
status?: number;
// Captured stdout. We rely on this to obtain the diff output when git exits
// with status 1.
stdout?: string;
}
// Type-guard that narrows an unknown value to `ExecSyncError`.
function isExecSyncError(err: unknown): err is ExecSyncError {
return (
typeof err === "object" && err != null && "status" in err && "stdout" in err
);
}
/**
* Returns the current Git diff for the working directory. If the current
* working directory is not inside a Git repository, `isGitRepo` will be
@@ -15,13 +36,86 @@ export function getGitDiff(): {
execSync("git rev-parse --is-inside-work-tree", { stdio: "ignore" });
// If the above call didn't throw, we are inside a git repo. Retrieve the
// diff including color codes so that the overlay can render them.
const output = execSync("git diff --color", {
encoding: "utf8",
maxBuffer: 10 * 1024 * 1024, // 10 MB ought to be enough for now
});
// diff for tracked files **and** include any untracked files so that the
// `/diff` overlay shows a complete picture of the working tree state.
return { isGitRepo: true, diff: output };
// 1. Diff for tracked files (unchanged behaviour)
let trackedDiff = "";
try {
trackedDiff = execSync("git diff --color", {
encoding: "utf8",
maxBuffer: 10 * 1024 * 1024, // 10 MB ought to be enough for now
});
} catch (err) {
// Exit status 1 simply means that differences were found. Capture the
// diff from stdout in that case. Re-throw for any other status codes.
if (
isExecSyncError(err) &&
err.status === 1 &&
typeof err.stdout === "string"
) {
trackedDiff = err.stdout;
} else {
throw err;
}
}
// 2. Determine untracked files.
// We use `git ls-files --others --exclude-standard` which outputs paths
// relative to the repository root, one per line. These are files that
// are not tracked *and* are not ignored by .gitignore.
const untrackedOutput = execSync(
"git ls-files --others --exclude-standard",
{
encoding: "utf8",
maxBuffer: 10 * 1024 * 1024,
},
);
const untrackedFiles = untrackedOutput
.split("\n")
.map((p) => p.trim())
.filter(Boolean);
let untrackedDiff = "";
const nullDevice = process.platform === "win32" ? "NUL" : "/dev/null";
for (const file of untrackedFiles) {
try {
// `git diff --no-index` produces a diff even outside the index by
// comparing two paths. We compare the file against /dev/null so that
// the file is treated as "new".
//
// `git diff --color --no-index /dev/null <file>` exits with status 1
// when differences are found, so we capture stdout from the thrown
// error object instead of letting it propagate.
execSync(`git diff --color --no-index -- "${nullDevice}" "${file}"`, {
encoding: "utf8",
stdio: ["ignore", "pipe", "ignore"],
maxBuffer: 10 * 1024 * 1024,
});
} catch (err) {
if (
isExecSyncError(err) &&
// Exit status 1 simply means that the two inputs differ, which is
// exactly what we expect here. Any other status code indicates a
// real error (e.g. the file disappeared between the ls-files and
// diff calls), so re-throw those.
err.status === 1 &&
typeof err.stdout === "string"
) {
untrackedDiff += err.stdout;
} else {
throw err;
}
}
}
// Concatenate tracked and untracked diffs.
const combinedDiff = `${trackedDiff}${untrackedDiff}`;
return { isGitRepo: true, diff: combinedDiff };
} catch {
// Either git is not installed or we're not inside a repository.
return { isGitRepo: false, diff: "" };
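
The "exit status 1 means differences were found" handling appears twice in this hunk and could be factored into one helper. The sketch below is a hypothetical variant, not the code in the diff: `diffFromThrown` and `diffUntrackedFile` are invented names, and it passes arguments as an array via `execFileSync` so that file names are not interpreted by a shell.

```ts
import { execFileSync } from "node:child_process";

interface ExecSyncError extends Error {
  status?: number;
  stdout?: string;
}

function isExecSyncError(err: unknown): err is ExecSyncError {
  return (
    typeof err === "object" && err != null && "status" in err && "stdout" in err
  );
}

// Return the captured diff when git exited with status 1; re-throw otherwise.
function diffFromThrown(err: unknown): string {
  if (
    isExecSyncError(err) &&
    err.status === 1 &&
    typeof err.stdout === "string"
  ) {
    return err.stdout;
  }
  throw err;
}

// Diff one untracked file against the null device so it shows up as "new".
function diffUntrackedFile(file: string): string {
  const nullDevice = process.platform === "win32" ? "NUL" : "/dev/null";
  try {
    return execFileSync(
      "git",
      ["diff", "--color", "--no-index", "--", nullDevice, file],
      { encoding: "utf8", stdio: ["ignore", "pipe", "ignore"] },
    );
  } catch (err) {
    return diffFromThrown(err);
  }
}
```

With the helper in place, both the tracked and untracked paths reduce to a `try`/`catch` whose `catch` body is a single call.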

View File

@@ -1,7 +1,12 @@
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { approximateTokensUsed } from "./approximate-tokens-used.js";
import { getBaseUrl, getApiKey } from "./config";
import {
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getBaseUrl,
getApiKey,
} from "./config";
import { type SupportedModelId, openAiModelInfo } from "./model-info.js";
import OpenAI from "openai";
@@ -22,9 +27,18 @@ async function fetchModels(provider: string): Promise<Array<string>> {
}
try {
const headers: Record<string, string> = {};
if (OPENAI_ORGANIZATION) {
headers["OpenAI-Organization"] = OPENAI_ORGANIZATION;
}
if (OPENAI_PROJECT) {
headers["OpenAI-Project"] = OPENAI_PROJECT;
}
const openai = new OpenAI({
apiKey: getApiKey(provider),
baseURL: getBaseUrl(provider),
defaultHeaders: headers,
});
const list = await openai.models.list();
const models: Array<string> = [];
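
The conditional header logic in this hunk can be isolated into a small pure function, which makes it easy to reuse wherever a client is constructed. `buildOpenAIHeaders` is a hypothetical name; the header names are the ones used in the diff.

```ts
// Build the optional organization/project headers, skipping empty values.
function buildOpenAIHeaders(
  organization: string,
  project: string,
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (organization) {
    headers["OpenAI-Organization"] = organization;
  }
  if (project) {
    headers["OpenAI-Project"] = project;
  }
  return headers;
}

// Example: only the organization is configured.
const exampleHeaders = buildOpenAIHeaders("org_123", "");
```

Omitting a header entirely, rather than sending it with an empty value, keeps requests identical to the unconfigured case.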

View File

@@ -1,4 +1,9 @@
export const CLI_VERSION = "0.1.2504221401"; // Must be in sync with package.json.
// Node ESM supports JSON imports behind an assertion. TypeScript's
// `resolveJsonModule` takes care of the typings.
import pkg from "../../package.json" assert { type: "json" };
// Read the version directly from package.json.
export const CLI_VERSION: string = (pkg as { version: string }).version;
export const ORIGIN = "codex_cli_ts";
export type TerminalChatSession = {

View File

@@ -23,7 +23,10 @@ export const SLASH_COMMANDS: Array<SlashCommand> = [
{ command: "/help", description: "Show list of commands" },
{ command: "/model", description: "Open model selection panel" },
{ command: "/approval", description: "Open approval mode selection panel" },
{ command: "/bug", description: "Generate a prefilled GitHub bug report" },
{
command: "/bug",
description: "Generate a prefilled GitHub issue URL with session log",
},
{
command: "/diff",
description:

View File

@@ -67,7 +67,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
}));
vi.mock("../src/format-command.js", () => ({
@@ -94,7 +94,7 @@ describe("cancel before first function_call", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
config: { model: "any", instructions: "", notify: false },
});

View File

@@ -74,7 +74,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
}));
vi.mock("../src/format-command.js", () => ({
@@ -102,7 +102,7 @@ describe("cancel clears previous_response_id", () => {
additionalWritableRoots: [],
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
config: { model: "any", instructions: "", notify: false },
});

View File

@@ -9,12 +9,11 @@ class FakeStream {
public controller = { abort: vi.fn() };
async *[Symbol.asyncIterator]() {
// Immediately start streaming an assistant message so that it is possible
// for a user-triggered cancellation that happens milliseconds later to
// arrive *after* the first token has already been emitted. This mirrors
// the real-world race where the UI shows nothing yet (network / rendering
// latency) even though the model has technically started responding.
// Introduce a delay to simulate network latency and allow for cancel() to be called
await new Promise((resolve) => setTimeout(resolve, 10));
// Mimic an assistant message containing the word "hello".
// Our fix should prevent this from being emitted after cancel() is called
yield {
type: "response.output_item.done",
item: {
@@ -86,9 +85,9 @@ vi.mock("../src/utils/agent/log.js", () => ({
}));
describe("Agent cancellation race", () => {
// We expect this test to highlight the current bug, so the suite should
// fail (red) until the underlying race condition in `AgentLoop` is fixed.
it("still emits the model answer even though cancel() was called", async () => {
// This test verifies our fix for the race condition where a cancelled message
// could still appear after the user cancels a request.
it("should not emit messages after cancel() is called", async () => {
const items: Array<any> = [];
const agent = new AgentLoop({
@@ -99,7 +98,7 @@ describe("Agent cancellation race", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => items.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -131,9 +130,8 @@ describe("Agent cancellation race", () => {
await new Promise((r) => setTimeout(r, 40));
const assistantMsg = items.find((i) => i.role === "assistant");
// The bug manifests if the assistant message is still present even though
// it belongs to the canceled run. We assert that it *should not* be
// delivered; this test will fail until the bug is fixed.
// Our fix should prevent the assistant message from being delivered after cancel
// Now that we've fixed it, the test should pass
expect(assistantMsg).toBeUndefined();
});
});

View File

@@ -52,7 +52,7 @@ vi.mock("../src/approvals.js", () => {
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -96,7 +96,7 @@ describe("Agent cancellation", () => {
received.push(item);
},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -144,7 +144,7 @@ describe("Agent cancellation", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (item) => received.push(item),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -0,0 +1,115 @@
import { describe, it, expect, vi } from "vitest";
// ---------------------------------------------------------------------------
// This regression test ensures that AgentLoop only surfaces each response item
// once even when the same item appears multiple times in the OpenAI streaming
// response (e.g. as an early `response.output_item.done` event *and* again in
// the final `response.completed` payload).
// ---------------------------------------------------------------------------
// Fake OpenAI stream that emits the *same* message twice: first as an
// incremental output event and then again in the turn completion payload.
class FakeStream {
public controller = { abort: vi.fn() };
async *[Symbol.asyncIterator]() {
// 1) Early incremental item.
yield {
type: "response.output_item.done",
item: {
type: "message",
id: "call-dedupe-1",
role: "assistant",
content: [{ type: "input_text", text: "Hello!" }],
},
} as any;
// 2) Turn completion containing the *same* item again.
yield {
type: "response.completed",
response: {
id: "resp-dedupe-1",
status: "completed",
output: [
{
type: "message",
id: "call-dedupe-1",
role: "assistant",
content: [{ type: "input_text", text: "Hello!" }],
},
],
},
} as any;
}
}
// Intercept the OpenAI SDK used inside AgentLoop so we can inject our fake
// streaming implementation.
vi.mock("openai", () => {
class FakeOpenAI {
public responses = {
create: async () => new FakeStream(),
};
}
class APIConnectionTimeoutError extends Error {}
return { __esModule: true, default: FakeOpenAI, APIConnectionTimeoutError };
});
// Stub approvals / formatting helpers not relevant here.
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
vi.mock("../src/format-command.js", () => ({
__esModule: true,
formatCommandForDisplay: (cmd: Array<string>) => cmd.join(" "),
}));
vi.mock("../src/utils/agent/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
// After the dependency mocks we can import the module under test.
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("AgentLoop deduplicates output items", () => {
it("invokes onItem exactly once for duplicate items with the same id", async () => {
const received: Array<any> = [];
const agent = new AgentLoop({
model: "any",
instructions: "",
config: { model: "any", instructions: "", notify: false },
approvalPolicy: { mode: "auto" } as any,
additionalWritableRoots: [],
onItem: (item) => received.push(item),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
const userMsg = [
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "hi" }],
},
];
await agent.run(userMsg as any);
// Give the setTimeout(3ms) inside AgentLoop.stageItem a chance to fire.
await new Promise((r) => setTimeout(r, 20));
// Count how many times the duplicate item surfaced.
const appearances = received.filter((i) => i.id === "call-dedupe-1").length;
expect(appearances).toBe(1);
});
});
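
The behaviour this regression test checks, surfacing each item once even if the stream repeats it, can be sketched as a wrapper around the `onItem` callback that remembers the ids it has already forwarded. This is an assumption about one way to achieve it, not a description of how `AgentLoop` does it; `makeDedupingSink` and `StreamItem` are hypothetical names.

```ts
type StreamItem = { id?: string; [key: string]: unknown };

// Wrap a callback so that items with an already-seen id are dropped.
// Items without an id are always forwarded, since they cannot be compared.
function makeDedupingSink(
  onItem: (item: StreamItem) => void,
): (item: StreamItem) => void {
  const seen = new Set<string>();
  return (item) => {
    if (item.id !== undefined) {
      if (seen.has(item.id)) {
        return;
      }
      seen.add(item.id);
    }
    onItem(item);
  };
}

// Example: the same message arrives as an early event and again on completion.
const received: Array<StreamItem> = [];
const sink = makeDedupingSink((item) => received.push(item));
sink({ id: "call-dedupe-1", type: "message" });
sink({ id: "call-dedupe-1", type: "message" });
sink({ type: "note" });
```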

View File

@@ -91,7 +91,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -121,7 +121,7 @@ describe("function_call_output includes original call ID", () => {
additionalWritableRoots: [],
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -26,7 +26,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -62,7 +62,7 @@ describe("AgentLoop generic network/server errors", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -106,7 +106,7 @@ describe("AgentLoop generic network/server errors", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -47,7 +47,7 @@ describe("Agent interrupt and continue", () => {
onLoading: (loading) => {
loadingState = loading;
},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -25,7 +25,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -61,7 +61,7 @@ describe("AgentLoop invalid request / 4xx errors", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -25,7 +25,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -64,7 +64,7 @@ describe("AgentLoop max_tokens too large error", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -45,7 +45,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -112,7 +112,7 @@ describe("AgentLoop network resilience", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -154,7 +154,7 @@ describe("AgentLoop network resilience", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -56,7 +56,7 @@ vi.mock("../src/approvals.js", () => {
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -119,7 +119,7 @@ describe("AgentLoop", () => {
approvalPolicy: { mode: "suggest" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -37,7 +37,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -82,7 +82,7 @@ describe("AgentLoop rate-limit handling", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -98,10 +98,8 @@ describe("AgentLoop rate-limit handling", () => {
// is in progress.
const runPromise = agent.run(userMsg as any);
// The agent waits 15 000 ms between retries (rate-limit backoff) and does
// this four times (after attempts 1-4). Fast-forward a bit more to cover
// any additional small `setTimeout` calls inside the implementation.
await vi.advanceTimersByTimeAsync(61_000); // 4 * 15s + 1s safety margin
// Should be done in at most 180 seconds.
await vi.advanceTimersByTimeAsync(180_000);
// Ensure the promise settles without throwing.
await expect(runPromise).resolves.not.toThrow();
@@ -110,8 +108,8 @@ describe("AgentLoop rate-limit handling", () => {
await vi.advanceTimersByTimeAsync(20);
// The OpenAI client should have been called the maximum number of retry
// attempts (5).
expect(openAiState.createSpy).toHaveBeenCalledTimes(5);
// attempts (8).
expect(openAiState.createSpy).toHaveBeenCalledTimes(8);
// Finally, verify that the user sees a helpful system message.
const sysMsg = received.find(

View File

@@ -1,111 +0,0 @@
import { describe, it, expect, vi } from "vitest";
// ---------------------------------------------------------------------------
// Utility helpers & OpenAI mock tailored for server-side errors that occur
// *after* the streaming iterator was created (i.e. during iteration).
// ---------------------------------------------------------------------------
function createStreamThatErrors(err: Error) {
return new (class {
public controller = { abort: vi.fn() };
async *[Symbol.asyncIterator]() {
// Immediately raise the error once iteration starts; this mimics OpenAI SDK
// behaviour which throws from the iterator when the HTTP response status
// indicates an internal server failure.
throw err;
}
})();
}
// Spy holder swapped out per test case.
const openAiState: { createSpy?: ReturnType<typeof vi.fn> } = {};
vi.mock("openai", () => {
class FakeOpenAI {
public responses = {
create: (...args: Array<any>) => openAiState.createSpy!(...args),
};
}
class APIConnectionTimeoutError extends Error {}
return {
__esModule: true,
default: FakeOpenAI,
APIConnectionTimeoutError,
};
});
// Approvals / formatting stubs: not part of the behaviour under test.
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
isSafeCommand: () => null,
}));
vi.mock("../src/format-command.js", () => ({
__esModule: true,
formatCommandForDisplay: (c: Array<string>) => c.join(" "),
}));
// Silence debug logging so the test output stays uncluttered.
vi.mock("../src/utils/agent/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("AgentLoop server_error surfaced during streaming", () => {
it("shows user-friendly system message instead of crashing", async () => {
const apiErr: any = new Error(
"The server had an error while processing your request. Sorry about that!",
);
// Replicate the structure used by the OpenAI SDK for 5xx failures.
apiErr.type = "server_error";
apiErr.code = null;
apiErr.status = undefined; // SDK leaves status undefined in this pathway
openAiState.createSpy = vi.fn(async () => {
return createStreamThatErrors(apiErr);
});
const received: Array<any> = [];
const agent = new AgentLoop({
model: "any",
instructions: "",
approvalPolicy: { mode: "auto" } as any,
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
onLastResponseId: () => {},
});
const userMsg = [
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "ping" }],
},
];
await expect(agent.run(userMsg as any)).resolves.not.toThrow();
// allow async onItem deliveries to flush
await new Promise((r) => setTimeout(r, 20));
const sysMsg = received.find(
(i) =>
i.role === "system" &&
typeof i.content?.[0]?.text === "string" &&
i.content[0].text.includes("Network error"),
);
expect(sysMsg).toBeTruthy();
});
});

View File

@@ -35,7 +35,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -100,7 +100,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -122,7 +122,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
expect(assistant?.content?.[0]?.text).toBe("ok");
});
it("fails after 3 attempts and surfaces system message", async () => {
it("fails after a few attempts and surfaces system message", async () => {
openAiState.createSpy = vi.fn(async () => {
const err: any = new Error("Internal Server Error");
err.status = 502; // any 5xx
@@ -138,7 +138,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -154,7 +154,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
await new Promise((r) => setTimeout(r, 20));
expect(openAiState.createSpy).toHaveBeenCalledTimes(5);
expect(openAiState.createSpy).toHaveBeenCalledTimes(8);
const sysMsg = received.find(
(i) =>

View File

@@ -54,7 +54,7 @@ vi.mock("../src/approvals.js", () => {
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -116,7 +116,7 @@ describe("Agent terminate (hard cancel)", () => {
additionalWritableRoots: [],
onItem: (item) => received.push(item),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -152,7 +152,7 @@ describe("Agent terminate (hard cancel)", () => {
additionalWritableRoots: [],
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -110,7 +110,7 @@ describe("thinking time counter", () => {
additionalWritableRoots: [],
onItem: (i) => items.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -56,6 +56,34 @@ test("process_patch - update file", () => {
expect(fs.removals).toEqual([]);
});
// ---------------------------------------------------------------------------
// Unicode canonicalisation tests: hyphen / dash / quote look-alikes
// ---------------------------------------------------------------------------
test("process_patch tolerates hyphen/dash variants", () => {
// The file contains EN DASH (\u2013) and NO-BREAK HYPHEN (\u2011)
const original =
"first\nimport foo # local import \u2013 avoids top\u2011level dep\nlast";
const patch = `*** Begin Patch\n*** Update File: uni.txt\n@@\n-import foo # local import - avoids top-level dep\n+import foo # HANDLED\n*** End Patch`;
const fs = createInMemoryFS({ "uni.txt": original });
process_patch(patch, fs.openFn, fs.writeFn, fs.removeFn);
expect(fs.files["uni.txt"]!.includes("HANDLED")).toBe(true);
});
test.skip("process_patch tolerates smart quotes", () => {
const original = "console.log(\u201Chello\u201D);"; // “hello” with smart quotes
const patch = `*** Begin Patch\n*** Update File: quotes.js\n@@\n-console.log(\\"hello\\");\n+console.log(\\"HELLO\\");\n*** End Patch`;
const fs = createInMemoryFS({ "quotes.js": original });
process_patch(patch, fs.openFn, fs.writeFn, fs.removeFn);
expect(fs.files["quotes.js"]).toBe('console.log("HELLO");');
});
test("process_patch - add file", () => {
const patch = `*** Begin Patch
*** Add File: b.txt
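
The hyphen/dash tolerance exercised by the tests above can be pictured as a canonicalisation step applied to both the file line and the patch line before they are compared. The sketch below is an assumption about how such a step might look, not the actual `process_patch` implementation; `canonicalize` and `PUNCT_EQUIV` are hypothetical names, and the character table is deliberately small.

```typescript
// Hypothetical mapping of Unicode look-alikes to their ASCII equivalents.
const PUNCT_EQUIV: Record<string, string> = {
  "\u2010": "-", // HYPHEN
  "\u2011": "-", // NO-BREAK HYPHEN
  "\u2012": "-", // FIGURE DASH
  "\u2013": "-", // EN DASH
  "\u2014": "-", // EM DASH
  "\u2212": "-", // MINUS SIGN
  "\u2018": "'", // LEFT SINGLE QUOTATION MARK
  "\u2019": "'", // RIGHT SINGLE QUOTATION MARK
  "\u201C": '"', // LEFT DOUBLE QUOTATION MARK
  "\u201D": '"', // RIGHT DOUBLE QUOTATION MARK
};

// Normalise to NFC, then replace each look-alike with its ASCII form.
function canonicalize(s: string): string {
  return s
    .normalize("NFC")
    .replace(
      /[\u2010-\u2014\u2212\u2018\u2019\u201C\u201D]/g,
      (ch) => PUNCT_EQUIV[ch] ?? ch,
    );
}

// Two lines are treated as matching when their canonical forms are equal.
function linesMatch(a: string, b: string): boolean {
  return canonicalize(a) === canonicalize(b);
}
```

With this sketch, the file line containing EN DASH and NO-BREAK HYPHEN compares equal to the ASCII line in the patch, which is the situation the first test sets up.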

View File

@@ -3,7 +3,6 @@ import type { ComponentProps } from "react";
import { describe, it, expect, vi } from "vitest";
import { renderTui } from "./ui-test-helpers.js";
import TerminalChatInput from "../src/components/chat/terminal-chat-input.js";
import TerminalChatNewInput from "../src/components/chat/terminal-chat-new-input.js";
import * as TermUtils from "../src/utils/terminal.js";
// -------------------------------------------------------------------------------------------------
@@ -92,60 +91,6 @@ describe("/clear command", () => {
cleanup();
clearSpy.mockRestore();
});
it("invokes clearTerminal and resets context in TerminalChatNewInput", async () => {
const clearSpy = vi
.spyOn(TermUtils, "clearTerminal")
.mockImplementation(() => {});
const setItems = vi.fn();
const props: ComponentProps<typeof TerminalChatNewInput> = {
isNew: false,
loading: false,
submitInput: () => {},
confirmationPrompt: null,
explanation: undefined,
submitConfirmation: () => {},
setLastResponseId: () => {},
setItems,
contextLeftPercent: 100,
openOverlay: () => {},
openModelOverlay: () => {},
openApprovalOverlay: () => {},
openHelpOverlay: () => {},
openDiffOverlay: () => {},
interruptAgent: () => {},
active: true,
thinkingSeconds: 0,
};
const { stdin, flush, cleanup } = renderTui(
<TerminalChatNewInput {...props} />,
);
await flush();
await type(stdin, "/clear", flush);
await type(stdin, "\r", flush); // press Enter
await flush();
expect(clearSpy).toHaveBeenCalledTimes(1);
expect(setItems).toHaveBeenCalledTimes(1);
const firstArg = setItems.mock.calls[0]![0];
expect(Array.isArray(firstArg)).toBe(true);
expect(firstArg).toHaveLength(1);
expect(firstArg[0]).toMatchObject({
role: "system",
type: "message",
content: [{ type: "input_text", text: "Terminal cleared" }],
});
cleanup();
clearSpy.mockRestore();
});
});
describe("clearTerminal", () => {

View File

@@ -234,3 +234,44 @@ test("loads and saves providers correctly", () => {
expect(mergedConfig.providers["openai"]).toBeDefined();
}
});
test("saves and loads instructions with project doc separator correctly", () => {
const userInstructions = "user specific instructions";
const projectDoc = "project specific documentation";
const combinedInstructions = `${userInstructions}\n\n--- project-doc ---\n\n${projectDoc}`;
const testConfig = {
model: "test-model",
instructions: combinedInstructions,
notify: false,
};
saveConfig(testConfig, testConfigPath, testInstructionsPath);
expect(memfs[testInstructionsPath]).toBe(userInstructions);
const loadedConfig = loadConfig(testConfigPath, testInstructionsPath, {
disableProjectDoc: true,
});
expect(loadedConfig.instructions).toBe(userInstructions);
});
test("handles empty user instructions when saving with project doc separator", () => {
const projectDoc = "project specific documentation";
const combinedInstructions = `\n\n--- project-doc ---\n\n${projectDoc}`;
const testConfig = {
model: "test-model",
instructions: combinedInstructions,
notify: false,
};
saveConfig(testConfig, testConfigPath, testInstructionsPath);
expect(memfs[testInstructionsPath]).toBe("");
const loadedConfig = loadConfig(testConfigPath, testInstructionsPath, {
disableProjectDoc: true,
});
expect(loadedConfig.instructions).toBe("");
});
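
The two tests above expect that only the user-authored part of `instructions` is written to the instructions file, with everything from the project-doc separator onward dropped. A minimal sketch of that split follows; `extractUserInstructions` is a hypothetical helper, and the real `saveConfig` may do this differently.

```typescript
// Separator string taken from the tests above.
const PROJECT_DOC_SEPARATOR = "\n\n--- project-doc ---\n\n";

// Hypothetical helper: return the text before the separator, or the whole
// string when no separator is present.
function extractUserInstructions(combined: string): string {
  const idx = combined.indexOf(PROJECT_DOC_SEPARATOR);
  return idx === -1 ? combined : combined.slice(0, idx);
}

const combined = `user specific instructions${PROJECT_DOC_SEPARATOR}project specific documentation`;
console.log(extractUserInstructions(combined)); // "user specific instructions"
```

When the user part is empty, the separator sits at index 0 and the helper returns an empty string, matching the second test's expectation.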

View File

@@ -0,0 +1,121 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import {
loadConfig,
DEFAULT_REASONING_EFFORT,
saveConfig,
} from "../src/utils/config";
import type { ReasoningEffort } from "openai/resources.mjs";
import * as fs from "fs";
// Mock the fs module
vi.mock("fs", () => ({
existsSync: vi.fn(),
readFileSync: vi.fn(),
writeFileSync: vi.fn(),
mkdirSync: vi.fn(),
}));
// Mock path.dirname
vi.mock("path", async () => {
const actual = await vi.importActual("path");
return {
...actual,
dirname: vi.fn().mockReturnValue("/mock/dir"),
};
});
describe("Reasoning Effort Configuration", () => {
beforeEach(() => {
vi.resetAllMocks();
});
afterEach(() => {
vi.clearAllMocks();
});
it('should have "high" as the default reasoning effort', () => {
expect(DEFAULT_REASONING_EFFORT).toBe("high");
});
it("should use default reasoning effort when not specified in config", () => {
// Mock fs.existsSync to return true for config file
vi.mocked(fs.existsSync).mockImplementation(() => true);
// Mock fs.readFileSync to return a JSON with no reasoningEffort
vi.mocked(fs.readFileSync).mockImplementation(() =>
JSON.stringify({ model: "test-model" }),
);
const config = loadConfig("/mock/config.json", "/mock/instructions.md");
// Config should not have reasoningEffort explicitly set
expect(config.reasoningEffort).toBeUndefined();
});
it("should load reasoningEffort from config file", () => {
// Mock fs.existsSync to return true for config file
vi.mocked(fs.existsSync).mockImplementation(() => true);
// Mock fs.readFileSync to return a JSON with reasoningEffort
vi.mocked(fs.readFileSync).mockImplementation(() =>
JSON.stringify({
model: "test-model",
reasoningEffort: "low" as ReasoningEffort,
}),
);
const config = loadConfig("/mock/config.json", "/mock/instructions.md");
// Config should have the reasoningEffort from the file
expect(config.reasoningEffort).toBe("low");
});
it("should support all valid reasoning effort values", () => {
// Valid values for ReasoningEffort
const validEfforts: Array<ReasoningEffort> = ["low", "medium", "high"];
for (const effort of validEfforts) {
// Mock fs.existsSync to return true for config file
vi.mocked(fs.existsSync).mockImplementation(() => true);
// Mock fs.readFileSync to return a JSON with reasoningEffort
vi.mocked(fs.readFileSync).mockImplementation(() =>
JSON.stringify({
model: "test-model",
reasoningEffort: effort,
}),
);
const config = loadConfig("/mock/config.json", "/mock/instructions.md");
// Config should have the correct reasoningEffort
expect(config.reasoningEffort).toBe(effort);
}
});
it("should preserve reasoningEffort when saving configuration", () => {
// Setup
vi.mocked(fs.existsSync).mockReturnValue(false);
// Create config with reasoningEffort
const configToSave = {
model: "test-model",
instructions: "",
reasoningEffort: "medium" as ReasoningEffort,
notify: false,
};
// Act
saveConfig(configToSave, "/mock/config.json", "/mock/instructions.md");
// Assert
expect(fs.writeFileSync).toHaveBeenCalledWith(
"/mock/config.json",
expect.stringContaining('"model"'),
"utf-8",
);
// Note: Current implementation of saveConfig doesn't save reasoningEffort,
// this test would need to be updated if that functionality is added
});
});
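
The tests above imply that a config without `reasoningEffort` leaves the field undefined and that the default of `"high"` is applied elsewhere. One way a caller might resolve the effective value is sketched below; `resolveEffort` and the local `Effort` type are assumptions for illustration, not the project's API.

```typescript
// Local stand-in for the ReasoningEffort values used in the tests.
type Effort = "low" | "medium" | "high";

// Mirrors the value the tests assert for DEFAULT_REASONING_EFFORT.
const DEFAULT_EFFORT: Effort = "high";

// Hypothetical helper: fall back to the default only when the config
// does not set reasoningEffort explicitly.
function resolveEffort(config: { reasoningEffort?: Effort }): Effort {
  return config.reasoningEffort ?? DEFAULT_EFFORT;
}
```

Keeping the loaded config field undefined and resolving the default at the point of use means a later save does not write a value the user never chose.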

View File

@@ -0,0 +1,93 @@
/**
* codex-cli/tests/disableResponseStorage.agentLoop.test.ts
*
* Verifies AgentLoop's request-building logic for both values of
* disableResponseStorage.
*/
import { describe, it, expect, vi } from "vitest";
import { AgentLoop } from "../src/utils/agent/agent-loop";
import type { AppConfig } from "../src/utils/config";
import { ReviewDecision } from "../src/utils/agent/review";
/* ─────────── 1. Spy + module mock ─────────────────────────────── */
const createSpy = vi.fn().mockResolvedValue({
data: { id: "resp_123", status: "completed", output: [] },
});
vi.mock("openai", () => ({
default: class {
public responses = { create: createSpy };
},
APIConnectionTimeoutError: class extends Error {},
}));
/* ─────────── 2. Parametrised tests ─────────────────────────────── */
describe.each([
{ flag: true, title: "omits previous_response_id & sets store:false" },
{ flag: false, title: "sends previous_response_id & allows store:true" },
])("AgentLoop with disableResponseStorage=%s", ({ flag, title }) => {
/* build a fresh config for each case */
const cfg: AppConfig = {
model: "o4-mini",
provider: "openai",
instructions: "",
disableResponseStorage: flag,
notify: false,
};
it(title, async () => {
/* reset spy per iteration */
createSpy.mockClear();
const loop = new AgentLoop({
model: cfg.model,
provider: cfg.provider,
config: cfg,
instructions: "",
approvalPolicy: "suggest",
disableResponseStorage: flag,
additionalWritableRoots: [],
onItem() {},
onLoading() {},
getCommandConfirmation: async () => ({ review: ReviewDecision.YES }),
onLastResponseId() {},
});
await loop.run([
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "hello" }],
},
]);
expect(createSpy).toHaveBeenCalledTimes(1);
const call = createSpy.mock.calls[0];
if (!call) {
throw new Error("Expected createSpy to have been called at least once");
}
const payload: any = call[0];
if (flag) {
/* behaviour when ZDR is *on* */
expect(payload).not.toHaveProperty("previous_response_id");
if (payload.input) {
payload.input.forEach((m: any) => {
expect(m.store === undefined ? false : m.store).toBe(false);
});
}
} else {
/* behaviour when ZDR is *off* */
expect(payload).toHaveProperty("previous_response_id");
if (payload.input) {
payload.input.forEach((m: any) => {
if ("store" in m) {
expect(m.store).not.toBe(false);
}
});
}
}
});
});
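
The parametrised test above checks two request shapes. The sketch below shows one way the branching could be written; `buildRequestParams` and its option names are hypothetical, and placing `store` at the top level of the payload is an assumption (the test itself inspects `store` on the individual input items).

```typescript
interface RequestOptions {
  disableResponseStorage: boolean;
  lastResponseId: string;
}

// Hypothetical helper: with storage disabled, the request carries no
// previous_response_id and asks the server not to store the response.
function buildRequestParams(opts: RequestOptions): Record<string, unknown> {
  if (opts.disableResponseStorage) {
    return { store: false };
  }
  return { store: true, previous_response_id: opts.lastResponseId };
}
```

With storage disabled the server cannot look up earlier turns by id, so a client in that mode would have to resend the conversation context itself on every request.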

View File

@@ -0,0 +1,43 @@
/**
* codex/codex-cli/tests/disableResponseStorage.test.ts
*/
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { mkdtempSync, rmSync, writeFileSync, mkdirSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { loadConfig, saveConfig } from "../src/utils/config";
import type { AppConfig } from "../src/utils/config";
const sandboxHome: string = mkdtempSync(join(tmpdir(), "codex-home-"));
const codexDir: string = join(sandboxHome, ".codex");
const yamlPath: string = join(codexDir, "config.yaml");
describe("disableResponseStorage persistence", () => {
beforeAll((): void => {
// mkdir -p ~/.codex inside the sandbox
rmSync(codexDir, { recursive: true, force: true });
mkdirSync(codexDir, { recursive: true });
// seed YAML with ZDR enabled
writeFileSync(yamlPath, "model: o4-mini\ndisableResponseStorage: true\n");
});
afterAll((): void => {
rmSync(sandboxHome, { recursive: true, force: true });
});
it("keeps disableResponseStorage=true across load/save cycle", async (): Promise<void> => {
// 1⃣ explicitly load the sandbox file
const cfg1: AppConfig = loadConfig(yamlPath);
expect(cfg1.disableResponseStorage).toBe(true);
// 2⃣ save right back to the same file
await saveConfig(cfg1, yamlPath);
// 3⃣ reload and re-assert
const cfg2: AppConfig = loadConfig(yamlPath);
expect(cfg2.disableResponseStorage).toBe(true);
});
});

View File

@@ -1,56 +0,0 @@
import TextBuffer from "../src/text-buffer";
import { describe, it, expect, vi } from "vitest";
/* -------------------------------------------------------------------------
* External $EDITOR integration: behavioural contract
* ---------------------------------------------------------------------- */
describe("TextBuffer open in external $EDITOR", () => {
it("replaces the buffer with the contents saved by the editor", async () => {
// Initial text put into the file.
const initial = [
"// TODO: draft release notes",
"",
"* Fixed memory leak in xyz module.",
].join("\n");
const buf = new TextBuffer(initial);
// -------------------------------------------------------------------
// Stub the child_process.spawnSync call so no real editor launches.
// -------------------------------------------------------------------
const mockSpawn = vi
.spyOn(require("node:child_process"), "spawnSync")
.mockImplementation((_cmd, args: any) => {
const argv = args as Array<string>;
const file = argv[argv.length - 1];
// Lazily append a dummy line: our faux "edit".
require("node:fs").appendFileSync(
file,
"\n* Added unit tests for external editor integration.",
);
return { status: 0 } as any;
});
try {
await buf.openInExternalEditor({ editor: "nano" }); // editor param ignored in stub
} finally {
mockSpawn.mockRestore();
}
const want = [
"// TODO: draft release notes",
"",
"* Fixed memory leak in xyz module.",
"* Added unit tests for external editor integration.",
].join("\n");
expect(buf.getText()).toBe(want);
// Cursor should land at the *end* of the newly imported text.
const [row, col] = buf.getCursor();
expect(row).toBe(3); // 4th line (0-based)
expect(col).toBe(
"* Added unit tests for external editor integration.".length,
);
});
});

View File

@@ -60,7 +60,7 @@ function createFunctionCall(
id: `fn_${Math.random().toString(36).slice(2)}`,
call_id: `call_${Math.random().toString(36).slice(2)}`,
arguments: JSON.stringify(args),
};
} as ResponseFunctionToolCallItem;
}
// ---------------------------------------------------------------------------

View File

@@ -26,7 +26,7 @@ vi.mock("../src/approvals.js", () => {
return {
__esModule: true,
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -51,7 +51,7 @@ describe("handleExecCommand invalid executable", () => {
const execInput = { cmd: ["git show"] } as any;
const config = { model: "any", instructions: "" } as any;
const policy = { mode: "auto" } as any;
const getConfirmation = async () => ({ review: "yes" } as any);
const getConfirmation = async () => ({ review: "yes" }) as any;
const additionalWritableRoots: Array<string> = [];
const { outputText, metadata } = await handleExecCommand(

View File

@@ -1,64 +0,0 @@
import { renderTui } from "./ui-test-helpers.js";
import MultilineTextEditor from "../src/components/chat/multiline-editor.js";
import TextBuffer from "../src/text-buffer.js";
import * as React from "react";
import { describe, it, expect, vi } from "vitest";
async function type(
stdin: NodeJS.WritableStream,
text: string,
flush: () => Promise<void>,
) {
stdin.write(text);
await flush();
}
describe("MultilineTextEditor external editor shortcut", () => {
it("fires openInExternalEditor on Ctrl-E (single key)", async () => {
const spy = vi
.spyOn(TextBuffer.prototype as any, "openInExternalEditor")
.mockResolvedValue(undefined);
const { stdin, flush, cleanup } = renderTui(
React.createElement(MultilineTextEditor, {
initialText: "hello",
width: 20,
height: 3,
}),
);
// Ensure initial render.
await flush();
// Send Ctrl-E → should fire immediately
await type(stdin, "\x05", flush); // Ctrl-E (ENQ / 0x05)
expect(spy).toHaveBeenCalledTimes(1);
spy.mockRestore();
cleanup();
});
it("fires openInExternalEditor on Ctrl-X (single key)", async () => {
const spy = vi
.spyOn(TextBuffer.prototype as any, "openInExternalEditor")
.mockResolvedValue(undefined);
const { stdin, flush, cleanup } = renderTui(
React.createElement(MultilineTextEditor, {
initialText: "hello",
width: 20,
height: 3,
}),
);
// Ensure initial render.
await flush();
// Send Ctrl-X → should fire immediately
await type(stdin, "\x18", flush); // Ctrl-X (CAN / 0x18)
expect(spy).toHaveBeenCalledTimes(1);
spy.mockRestore();
cleanup();
});
});

View File

@@ -44,7 +44,7 @@ vi.mock("../src/approvals.js", () => ({
}));
// After mocks are in place we can safely import the component under test.
import TerminalChatInput from "../src/components/chat/terminal-chat-new-input.js";
import TerminalChatInput from "../src/components/chat/terminal-chat-input.js";
// Tiny helper mirroring the one used in other UI tests so we can await Ink's
// internal promises between keystrokes.
@@ -126,7 +126,8 @@ describe("TerminalChatInput history navigation with multiline drafts", () =>
cleanup();
});
it("should restore the draft when navigating forward (↓) past the newest history entry", async () => {
// TODO: Fix this test.
it.skip("should restore the draft when navigating forward (↓) past the newest history entry", async () => {
const { stdin, lastFrameStripped, flush, cleanup } = renderTui(
React.createElement(TerminalChatInput, stubProps()),
);
@@ -148,9 +149,17 @@ describe("TerminalChatInput history navigation with multiline drafts", () =>
expect(draftFrame.includes("draft1")).toBe(true);
expect(draftFrame.includes("draft2")).toBe(true);
// Before we start navigating upwards we must ensure the caret sits at
// the very *start* of the current line. TerminalChatInput only engages
// history recall when the cursor is positioned at row-0 *and* column-0
// (mirroring the behaviour of shells like Bash/zsh or Readline). Hit
// Ctrl+A (ASCII 0x01) to jump to SOL, then proceed with the ↑ presses.
await type(stdin, "\x01", flush); // Ctrl+A: move to column-0
// ────────────────────────────────────────────────────────────────────
// 1) Hit ↑ twice: first press just moves the caret to row-0, second
// enters history mode and shows the previous message ("prev").
// 1) Hit ↑ twice: first press moves the caret from (row:1,col:0) to
// (row:0,col:0); the *second* press now satisfies the gate for
// history-navigation and should display the previous entry ("prev").
// ────────────────────────────────────────────────────────────────────
await type(stdin, "\x1b[A", flush); // first up: vertical move only
await type(stdin, "\x1b[A", flush); // second up: recall history
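
The gate described in the comments above (history recall only engages when the caret is at row 0 and column 0) can be sketched as a pure decision function. `onUpArrow` and the action names are hypothetical, and the behaviour for row 0 with a non-zero column is an assumption, since the comments do not spell it out.

```typescript
type UpArrowAction = "move-caret-up" | "recall-history" | "ignore";

// Hypothetical decision function for an up-arrow key press.
function onUpArrow(row: number, col: number): UpArrowAction {
  if (row > 0) {
    return "move-caret-up"; // still inside a multiline draft
  }
  if (col === 0) {
    return "recall-history"; // caret at (row 0, col 0): gate satisfied
  }
  return "ignore"; // assumption: row 0 but col > 0 does nothing
}
```

Starting from (row 1, col 0), the first press returns `"move-caret-up"` and the second, now at (row 0, col 0), returns `"recall-history"`, which is the two-press sequence the test performs.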

View File

@@ -16,7 +16,7 @@ async function type(
await flush();
}
describe("MultilineTextEditor Shift+Enter (\r variant)", () => {
describe("MultilineTextEditor - Shift+Enter (\r variant)", () => {
it("inserts a newline and does NOT submit when the terminal sends \r for Shift+Enter", async () => {
const onSubmit = vi.fn();

View File

@@ -24,35 +24,6 @@ vi.mock("../src/utils/input-utils.js", () => ({
}));
describe("TerminalChatInput multiline functionality", () => {
it("renders the multiline editor component", async () => {
const props: ComponentProps<typeof TerminalChatInput> = {
isNew: false,
loading: false,
submitInput: () => {},
confirmationPrompt: null,
explanation: undefined,
submitConfirmation: () => {},
setLastResponseId: () => {},
setItems: () => {},
contextLeftPercent: 50,
openOverlay: () => {},
openDiffOverlay: () => {},
openModelOverlay: () => {},
openApprovalOverlay: () => {},
openHelpOverlay: () => {},
onCompact: () => {},
interruptAgent: () => {},
active: true,
thinkingSeconds: 0,
};
const { lastFrameStripped } = renderTui(<TerminalChatInput {...props} />);
const frame = lastFrameStripped();
// Check that the help text mentions shift+enter for new line
expect(frame).toContain("shift+enter for new line");
});
it("allows multiline input with shift+enter", async () => {
const submitInput = vi.fn();

View File

@@ -0,0 +1,130 @@
/* eslint-disable no-console */
import { renderTui } from "./ui-test-helpers.js";
import React from "react";
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import chalk from "chalk";
import ModelOverlay from "src/components/model-overlay.js";
// Mock the necessary dependencies
vi.mock("../src/utils/logger/log.js", () => ({
log: vi.fn(),
}));
vi.mock("chalk", () => ({
default: {
bold: {
red: vi.fn((msg) => `[bold-red]${msg}[/bold-red]`),
},
yellow: vi.fn((msg) => `[yellow]${msg}[/yellow]`),
},
}));
describe("Model Selection Error Handling", () => {
// Create a console.error spy with proper typing
let consoleErrorSpy: ReturnType<typeof vi.spyOn>;
beforeEach(() => {
consoleErrorSpy = vi.spyOn(console, "error").mockImplementation(() => {});
});
afterEach(() => {
vi.clearAllMocks();
consoleErrorSpy.mockRestore();
});
it("should display error with chalk formatting when selecting unavailable model", () => {
// Setup
const allModels = ["gpt-4", "gpt-3.5-turbo"];
const currentModel = "gpt-4";
const unavailableModel = "gpt-invalid";
const currentProvider = "openai";
renderTui(
<ModelOverlay
currentModel={currentModel}
providers={{ openai: { name: "OpenAI", baseURL: "", envKey: "test" } }}
currentProvider={currentProvider}
hasLastResponse={false}
onSelect={(models, newModel) => {
if (!models?.includes(newModel)) {
console.error(
chalk.bold.red(
`Model "${chalk.yellow(
newModel,
)}" is not available for provider "${chalk.yellow(
currentProvider,
)}".`,
),
);
return;
}
}}
onSelectProvider={() => {}}
onExit={() => {}}
/>,
);
const onSelectHandler = vi.fn((models, newModel) => {
if (!models?.includes(newModel)) {
console.error(
chalk.bold.red(
`Model "${chalk.yellow(
newModel,
)}" is not available for provider "${chalk.yellow(
currentProvider,
)}".`,
),
);
return;
}
});
onSelectHandler(allModels, unavailableModel);
expect(consoleErrorSpy).toHaveBeenCalled();
expect(chalk.bold.red).toHaveBeenCalled();
expect(chalk.yellow).toHaveBeenCalledWith(unavailableModel);
expect(chalk.yellow).toHaveBeenCalledWith(currentProvider);
expect(consoleErrorSpy).toHaveBeenCalledWith(
`[bold-red]Model "[yellow]${unavailableModel}[/yellow]" is not available for provider "[yellow]${currentProvider}[/yellow]".[/bold-red]`,
);
});
it("should not proceed with model change when model is unavailable", () => {
const mockSetModel = vi.fn();
const mockSetLastResponseId = vi.fn();
const mockSaveConfig = vi.fn();
const mockSetItems = vi.fn();
const mockSetOverlayMode = vi.fn();
const onSelectHandler = vi.fn((allModels, newModel) => {
if (!allModels?.includes(newModel)) {
console.error(
chalk.bold.red(
`Model "${chalk.yellow(
newModel,
)}" is not available for provider "${chalk.yellow("openai")}".`,
),
);
return;
}
mockSetModel(newModel);
mockSetLastResponseId(null);
mockSaveConfig({});
mockSetItems((prev: Array<unknown>) => [...prev, {}]);
mockSetOverlayMode("none");
});
onSelectHandler(["gpt-4", "gpt-3.5-turbo"], "gpt-invalid");
expect(mockSetModel).not.toHaveBeenCalled();
expect(mockSetLastResponseId).not.toHaveBeenCalled();
expect(mockSaveConfig).not.toHaveBeenCalled();
expect(mockSetItems).not.toHaveBeenCalled();
expect(mockSetOverlayMode).not.toHaveBeenCalled();
expect(consoleErrorSpy).toHaveBeenCalled();
});
});

View File

@@ -0,0 +1,110 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
// Mock OpenAI to avoid API key requirement
vi.mock("openai", () => {
class FakeOpenAI {
public responses = {
create: vi.fn(),
};
}
class APIConnectionTimeoutError extends Error {}
return { __esModule: true, default: FakeOpenAI, APIConnectionTimeoutError };
});
// Stub the logger to avoid filesystem side effects during tests
vi.mock("../src/utils/logger/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
// Import AgentLoop after mocking dependencies
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("Token streaming performance", () => {
// Mock callback for collecting tokens and their timestamps
const mockOnItem = vi.fn();
let startTime: number;
const tokenTimestamps: Array<number> = [];
beforeEach(() => {
vi.useFakeTimers();
startTime = Date.now();
tokenTimestamps.length = 0;
// Set up the mockOnItem to record timestamps when tokens are received
mockOnItem.mockImplementation(() => {
tokenTimestamps.push(Date.now() - startTime);
});
});
afterEach(() => {
vi.restoreAllMocks();
vi.useRealTimers();
});
it("processes tokens with minimal delay", async () => {
// Create a minimal AgentLoop instance
const agentLoop = new AgentLoop({
model: "gpt-4",
approvalPolicy: "auto-edit",
additionalWritableRoots: [],
onItem: mockOnItem,
onLoading: vi.fn(),
getCommandConfirmation: vi.fn().mockResolvedValue({ review: "approve" }),
onLastResponseId: vi.fn(),
});
// Mock a stream of 100 tokens
const mockItems = Array.from(
{ length: 100 },
(_, i) =>
({
id: `token-${i}`,
type: "message",
role: "assistant",
content: [{ type: "output_text", text: `Token ${i}` }],
status: "completed",
}) as ResponseItem,
);
// Call run with some input
const runPromise = agentLoop.run([
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "Test message" }],
},
]);
// Instead of trying to access private methods, just call onItem directly
// This still tests the timing and processing of tokens
mockItems.forEach((item) => {
agentLoop["onItem"](item);
// Advance the timer slightly to simulate small processing time
vi.advanceTimersByTime(1);
});
// Advance time to complete any pending operations
vi.runAllTimers();
await runPromise;
// Verify that tokens were processed (note that we're using a spy so exact count may vary
// due to other test setup and runtime internal calls)
expect(mockOnItem).toHaveBeenCalled();
// Calculate performance metrics
const intervals = tokenTimestamps
.slice(1)
.map((t, i) => t - (tokenTimestamps[i] || 0));
const avgDelay =
intervals.length > 0
? intervals.reduce((sum, i) => sum + i, 0) / intervals.length
: 0;
// With queueMicrotask, the delay should be minimal
// We're expecting the average delay to be very small (less than 2ms in this simulated environment)
expect(avgDelay).toBeLessThan(2);
});
});

View File

@@ -0,0 +1,62 @@
import { describe, it, expect, beforeEach, afterEach } from "vitest";
import { mkdtempSync, writeFileSync, rmSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";
/**
* Verifies that ~/.codex.env is parsed (lowest-priority) when present.
*/
describe("user-wide ~/.codex.env support", () => {
const ORIGINAL_HOME = process.env["HOME"];
const ORIGINAL_API_KEY = process.env["OPENAI_API_KEY"];
let tempHome: string;
beforeEach(() => {
// Create an isolated fake $HOME directory.
tempHome = mkdtempSync(join(tmpdir(), "codex-home-"));
process.env["HOME"] = tempHome;
// Ensure the env var is unset so that the file value is picked up.
delete process.env["OPENAI_API_KEY"];
// Write ~/.codex.env with a dummy key.
writeFileSync(
join(tempHome, ".codex.env"),
"OPENAI_API_KEY=my-home-key\n",
{
encoding: "utf8",
},
);
});
afterEach(() => {
// Cleanup temp directory.
try {
rmSync(tempHome, { recursive: true, force: true });
} catch {
// ignore
}
// Restore original env.
if (ORIGINAL_HOME !== undefined) {
process.env["HOME"] = ORIGINAL_HOME;
} else {
delete process.env["HOME"];
}
if (ORIGINAL_API_KEY !== undefined) {
process.env["OPENAI_API_KEY"] = ORIGINAL_API_KEY;
} else {
delete process.env["OPENAI_API_KEY"];
}
});
it("loads the API key from ~/.codex.env when not set elsewhere", async () => {
// Import the config module AFTER setting up the fake env.
const { getApiKey } = await import("../src/utils/config.js");
expect(getApiKey("openai")).toBe("my-home-key");
});
});

1
codex-rs/.gitignore vendored Normal file

@@ -0,0 +1 @@
/target/

4308
codex-rs/Cargo.lock generated Normal file

File diff suppressed because it is too large

20
codex-rs/Cargo.toml Normal file

@@ -0,0 +1,20 @@
[workspace]
resolver = "2"
members = [
"ansi-escape",
"apply-patch",
"cli",
"core",
"exec",
"execpolicy",
"tui",
]
[workspace.package]
version = "0.0.2504301132"
[profile.release]
lto = "fat"
# Because we bundle some of these executables with the TypeScript CLI, we
# remove everything to make the binary as small as possible.
strip = "symbols"

22
codex-rs/README.md Normal file

@@ -0,0 +1,22 @@
# codex-rs
April 24, 2025
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible.
To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implementation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.
## Code Organization
This folder is the root of a Cargo workspace. It contains quite a bit of experimental code, but here are the key crates:
- [`core/`](./core) contains the business logic for Codex. Ultimately, we hope this will be a library crate that is generally useful for building other Rust/native applications that use Codex.
- [`exec/`](./exec) "headless" CLI for use in automation.
- [`tui/`](./tui) CLI that launches a fullscreen TUI built with [Ratatui](https://ratatui.rs/).
- [`cli/`](./cli) CLI multitool that provides the aforementioned CLIs via subcommands.


@@ -0,0 +1,16 @@
[package]
name = "codex-ansi-escape"
version = "0.1.0"
edition = "2021"
[lib]
name = "codex_ansi_escape"
path = "src/lib.rs"
[dependencies]
ansi-to-tui = "7.0.0"
ratatui = { version = "0.29.0", features = [
"unstable-widget-ref",
"unstable-rendered-line-info",
] }
tracing = { version = "0.1.41", features = ["log"] }


@@ -0,0 +1,15 @@
# oai-codex-ansi-escape
Small helper functions that wrap functionality from
<https://crates.io/crates/ansi-to-tui>:
```rust
pub fn ansi_escape_line(s: &str) -> Line<'static>
pub fn ansi_escape(s: &str) -> Text<'static>
```
Advantages:
- `ansi_to_tui::IntoText` is not in scope for the entire TUI crate
- we log and `panic!()` if `IntoText` returns an `Err` so that
the caller does not have to deal with it


@@ -0,0 +1,39 @@
use ansi_to_tui::Error;
use ansi_to_tui::IntoText;
use ratatui::text::Line;
use ratatui::text::Text;
/// This function should be used when the contents of `s` are expected to match
/// a single line. If multiple lines are found, a warning is logged and only the
/// first line is returned.
pub fn ansi_escape_line(s: &str) -> Line<'static> {
let text = ansi_escape(s);
match text.lines.as_slice() {
[] => Line::from(""),
[only] => only.clone(),
[first, rest @ ..] => {
tracing::warn!("ansi_escape_line: expected a single line, got {first:?} and {rest:?}");
first.clone()
}
}
}
pub fn ansi_escape(s: &str) -> Text<'static> {
// to_text() claims to be faster, but introduces complex lifetime issues
// such that it's not worth it.
match s.into_text() {
Ok(text) => text,
Err(err) => match err {
Error::NomError(message) => {
tracing::error!(
"ansi_to_tui NomError docs claim should never happen when parsing `{s}`: {message}"
);
panic!();
}
Error::Utf8Error(utf8error) => {
tracing::error!("Utf8Error: {utf8error}");
panic!();
}
},
}
}
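
The single-line helper above relies on slice patterns to distinguish "no lines", "exactly one line", and "more than one line". A minimal std-only sketch of that technique follows; `first_line_only` is a hypothetical stand-in that works on plain strings rather than `ratatui` types, and it writes to stderr instead of using `tracing`.

```rust
/// Hypothetical stand-in for `ansi_escape_line` that operates on plain text.
/// Returns the first line, or an empty string when the input has no lines.
fn first_line_only(text: &str) -> String {
    let lines: Vec<&str> = text.lines().collect();
    match lines.as_slice() {
        // No lines at all: fall back to an empty line.
        [] => String::new(),
        // Exactly one line: the expected case.
        [only] => only.to_string(),
        // More than one line: warn and keep only the first.
        [first, rest @ ..] => {
            eprintln!("expected a single line, dropping {} extra line(s)", rest.len());
            first.to_string()
        }
    }
}

fn main() {
    assert_eq!(first_line_only(""), "");
    assert_eq!(first_line_only("one"), "one");
    assert_eq!(first_line_only("one\ntwo\nthree"), "one");
}
```

The `[first, rest @ ..]` arm binds the tail of the slice, so the warning can report what was discarded without any manual indexing.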


@@ -0,0 +1,21 @@
[package]
name = "codex-apply-patch"
version = "0.1.0"
edition = "2021"
[lib]
name = "codex_apply_patch"
path = "src/lib.rs"
[dependencies]
anyhow = "1"
regex = "1.11.1"
serde_json = "1.0.110"
similar = "2.7.0"
thiserror = "2.0.12"
tree-sitter = "0.25.3"
tree-sitter-bash = "0.23.3"
[dev-dependencies]
pretty_assertions = "1.4.1"
tempfile = "3.13.0"

File diff suppressed because it is too large


@@ -0,0 +1,499 @@
//! This module is responsible for parsing & validating a patch into a list of "hunks".
//! (It does not attempt to actually check that the patch can be applied to the filesystem.)
//!
//! The official Lark grammar for the apply-patch format is:
//!
//! start: begin_patch hunk+ end_patch
//! begin_patch: "*** Begin Patch" LF
//! end_patch: "*** End Patch" LF?
//!
//! hunk: add_hunk | delete_hunk | update_hunk
//! add_hunk: "*** Add File: " filename LF add_line+
//! delete_hunk: "*** Delete File: " filename LF
//! update_hunk: "*** Update File: " filename LF change_move? change?
//! filename: /(.+)/
//! add_line: "+" /(.+)/ LF -> line
//!
//! change_move: "*** Move to: " filename LF
//! change: (change_context | change_line)+ eof_line?
//! change_context: ("@@" | "@@ " /(.+)/) LF
//! change_line: ("+" | "-" | " ") /(.+)/ LF
//! eof_line: "*** End of File" LF
//!
//! The parser below is a little more lenient than the explicit spec and allows for
//! leading/trailing whitespace around patch markers.
use std::path::PathBuf;
use thiserror::Error;
const BEGIN_PATCH_MARKER: &str = "*** Begin Patch";
const END_PATCH_MARKER: &str = "*** End Patch";
const ADD_FILE_MARKER: &str = "*** Add File: ";
const DELETE_FILE_MARKER: &str = "*** Delete File: ";
const UPDATE_FILE_MARKER: &str = "*** Update File: ";
const MOVE_TO_MARKER: &str = "*** Move to: ";
const EOF_MARKER: &str = "*** End of File";
const CHANGE_CONTEXT_MARKER: &str = "@@ ";
const EMPTY_CHANGE_CONTEXT_MARKER: &str = "@@";
#[derive(Debug, PartialEq, Error)]
pub enum ParseError {
#[error("invalid patch: {0}")]
InvalidPatchError(String),
#[error("invalid hunk at line {line_number}, {message}")]
InvalidHunkError { message: String, line_number: usize },
}
use ParseError::*;
#[derive(Debug, PartialEq)]
#[allow(clippy::enum_variant_names)]
pub enum Hunk {
AddFile {
path: PathBuf,
contents: String,
},
DeleteFile {
path: PathBuf,
},
UpdateFile {
path: PathBuf,
move_path: Option<PathBuf>,
/// Chunks should be in order, i.e. the `change_context` of one chunk
/// should occur later in the file than the previous chunk.
chunks: Vec<UpdateFileChunk>,
},
}
use Hunk::*;
#[derive(Debug, PartialEq)]
pub struct UpdateFileChunk {
/// A single line of context used to narrow down the position of the chunk
/// (this is usually a class, method, or function definition.)
pub change_context: Option<String>,
/// A contiguous block of lines that should be replaced with `new_lines`.
/// `old_lines` must occur strictly after `change_context`.
pub old_lines: Vec<String>,
pub new_lines: Vec<String>,
/// If set to true, `old_lines` must occur at the end of the source file.
/// (Tolerance around trailing newlines should be encouraged.)
pub is_end_of_file: bool,
}
pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
let lines: Vec<&str> = patch.trim().lines().collect();
if lines.is_empty() || lines[0] != BEGIN_PATCH_MARKER {
return Err(InvalidPatchError(String::from(
"The first line of the patch must be '*** Begin Patch'",
)));
}
let last_line_index = lines.len() - 1;
if lines[last_line_index] != END_PATCH_MARKER {
return Err(InvalidPatchError(String::from(
"The last line of the patch must be '*** End Patch'",
)));
}
let mut hunks: Vec<Hunk> = Vec::new();
let mut remaining_lines = &lines[1..last_line_index];
let mut line_number = 2;
while !remaining_lines.is_empty() {
let (hunk, hunk_lines) = parse_one_hunk(remaining_lines, line_number)?;
hunks.push(hunk);
line_number += hunk_lines;
remaining_lines = &remaining_lines[hunk_lines..]
}
Ok(hunks)
}
/// Attempts to parse a single hunk from the start of lines.
/// Returns the parsed hunk and the number of lines parsed (or a ParseError).
fn parse_one_hunk(lines: &[&str], line_number: usize) -> Result<(Hunk, usize), ParseError> {
// Be tolerant of extra padding around marker strings.
let first_line = lines[0].trim();
if let Some(path) = first_line.strip_prefix(ADD_FILE_MARKER) {
// Add File
let mut contents = String::new();
let mut parsed_lines = 1;
for add_line in &lines[1..] {
if let Some(line_to_add) = add_line.strip_prefix('+') {
contents.push_str(line_to_add);
contents.push('\n');
parsed_lines += 1;
} else {
break;
}
}
return Ok((
AddFile {
path: PathBuf::from(path),
contents,
},
parsed_lines,
));
} else if let Some(path) = first_line.strip_prefix(DELETE_FILE_MARKER) {
// Delete File
return Ok((
DeleteFile {
path: PathBuf::from(path),
},
1,
));
} else if let Some(path) = first_line.strip_prefix(UPDATE_FILE_MARKER) {
// Update File
let mut remaining_lines = &lines[1..];
let mut parsed_lines = 1;
// Optional: move file line
let move_path = remaining_lines
.first()
.and_then(|x| x.strip_prefix(MOVE_TO_MARKER));
if move_path.is_some() {
remaining_lines = &remaining_lines[1..];
parsed_lines += 1;
}
let mut chunks = Vec::new();
// NOTE: we need to know to stop once we reach the next special marker header.
while !remaining_lines.is_empty() {
// Skip over any completely blank lines that may separate chunks.
if remaining_lines[0].trim().is_empty() {
parsed_lines += 1;
remaining_lines = &remaining_lines[1..];
continue;
}
if remaining_lines[0].starts_with("***") {
break;
}
let (chunk, chunk_lines) = parse_update_file_chunk(
remaining_lines,
line_number + parsed_lines,
chunks.is_empty(),
)?;
chunks.push(chunk);
parsed_lines += chunk_lines;
remaining_lines = &remaining_lines[chunk_lines..]
}
if chunks.is_empty() {
return Err(InvalidHunkError {
message: format!("Update file hunk for path '{path}' is empty"),
line_number,
});
}
return Ok((
UpdateFile {
path: PathBuf::from(path),
move_path: move_path.map(PathBuf::from),
chunks,
},
parsed_lines,
));
}
Err(InvalidHunkError { message: format!("'{first_line}' is not a valid hunk header. Valid hunk headers: '*** Add File: {{path}}', '*** Delete File: {{path}}', '*** Update File: {{path}}'"), line_number })
}
fn parse_update_file_chunk(
lines: &[&str],
line_number: usize,
allow_missing_context: bool,
) -> Result<(UpdateFileChunk, usize), ParseError> {
if lines.is_empty() {
return Err(InvalidHunkError {
message: "Update hunk does not contain any lines".to_string(),
line_number,
});
}
// If we see an explicit context marker @@ or @@ <context>, consume it; otherwise, optionally
// allow treating the chunk as starting directly with diff lines.
let (change_context, start_index) = if lines[0] == EMPTY_CHANGE_CONTEXT_MARKER {
(None, 1)
} else if let Some(context) = lines[0].strip_prefix(CHANGE_CONTEXT_MARKER) {
(Some(context.to_string()), 1)
} else {
if !allow_missing_context {
return Err(InvalidHunkError {
message: format!(
"Expected update hunk to start with a @@ context marker, got: '{}'",
lines[0]
),
line_number,
});
}
(None, 0)
};
if start_index >= lines.len() {
return Err(InvalidHunkError {
message: "Update hunk does not contain any lines".to_string(),
line_number: line_number + 1,
});
}
let mut chunk = UpdateFileChunk {
change_context,
old_lines: Vec::new(),
new_lines: Vec::new(),
is_end_of_file: false,
};
let mut parsed_lines = 0;
for line in &lines[start_index..] {
match *line {
EOF_MARKER => {
if parsed_lines == 0 {
return Err(InvalidHunkError {
message: "Update hunk does not contain any lines".to_string(),
line_number: line_number + 1,
});
}
chunk.is_end_of_file = true;
parsed_lines += 1;
break;
}
line_contents => {
match line_contents.chars().next() {
None => {
// Interpret this as an empty line.
chunk.old_lines.push(String::new());
chunk.new_lines.push(String::new());
}
Some(' ') => {
chunk.old_lines.push(line_contents[1..].to_string());
chunk.new_lines.push(line_contents[1..].to_string());
}
Some('+') => {
chunk.new_lines.push(line_contents[1..].to_string());
}
Some('-') => {
chunk.old_lines.push(line_contents[1..].to_string());
}
_ => {
if parsed_lines == 0 {
return Err(InvalidHunkError { message: format!("Unexpected line found in update hunk: '{line_contents}'. Every line should start with ' ' (context line), '+' (added line), or '-' (removed line)"), line_number: line_number + 1 });
}
// Assume this is the start of the next hunk.
break;
}
}
parsed_lines += 1;
}
}
}
Ok((chunk, parsed_lines + start_index))
}
#[test]
fn test_parse_patch() {
assert_eq!(
parse_patch("bad"),
Err(InvalidPatchError(
"The first line of the patch must be '*** Begin Patch'".to_string()
))
);
assert_eq!(
parse_patch("*** Begin Patch\nbad"),
Err(InvalidPatchError(
"The last line of the patch must be '*** End Patch'".to_string()
))
);
assert_eq!(
parse_patch(
"*** Begin Patch\n\
*** Update File: test.py\n\
*** End Patch"
),
Err(InvalidHunkError {
message: "Update file hunk for path 'test.py' is empty".to_string(),
line_number: 2,
})
);
assert_eq!(
parse_patch(
"*** Begin Patch\n\
*** End Patch"
),
Ok(Vec::new())
);
assert_eq!(
parse_patch(
"*** Begin Patch\n\
*** Add File: path/add.py\n\
+abc\n\
+def\n\
*** Delete File: path/delete.py\n\
*** Update File: path/update.py\n\
*** Move to: path/update2.py\n\
@@ def f():\n\
- pass\n\
+ return 123\n\
*** End Patch"
),
Ok(vec![
AddFile {
path: PathBuf::from("path/add.py"),
contents: "abc\ndef\n".to_string()
},
DeleteFile {
path: PathBuf::from("path/delete.py")
},
UpdateFile {
path: PathBuf::from("path/update.py"),
move_path: Some(PathBuf::from("path/update2.py")),
chunks: vec![UpdateFileChunk {
change_context: Some("def f():".to_string()),
old_lines: vec![" pass".to_string()],
new_lines: vec![" return 123".to_string()],
is_end_of_file: false
}]
}
])
);
// Update hunk followed by another hunk (Add File).
assert_eq!(
parse_patch(
"*** Begin Patch\n\
*** Update File: file.py\n\
@@\n\
+line\n\
*** Add File: other.py\n\
+content\n\
*** End Patch"
),
Ok(vec![
UpdateFile {
path: PathBuf::from("file.py"),
move_path: None,
chunks: vec![UpdateFileChunk {
change_context: None,
old_lines: vec![],
new_lines: vec!["line".to_string()],
is_end_of_file: false
}],
},
AddFile {
path: PathBuf::from("other.py"),
contents: "content\n".to_string()
}
])
);
// Update hunk without an explicit @@ header for the first chunk should parse.
// Use a raw string to preserve the leading space diff marker on the context line.
assert_eq!(
parse_patch(
r#"*** Begin Patch
*** Update File: file2.py
 import foo
+bar
*** End Patch"#,
),
Ok(vec![UpdateFile {
path: PathBuf::from("file2.py"),
move_path: None,
chunks: vec![UpdateFileChunk {
change_context: None,
old_lines: vec!["import foo".to_string()],
new_lines: vec!["import foo".to_string(), "bar".to_string()],
is_end_of_file: false,
}],
}])
);
}
#[test]
fn test_parse_one_hunk() {
assert_eq!(
parse_one_hunk(&["bad"], 234),
Err(InvalidHunkError {
message: "'bad' is not a valid hunk header. \
Valid hunk headers: '*** Add File: {path}', '*** Delete File: {path}', '*** Update File: {path}'".to_string(),
line_number: 234
})
);
// Other edge cases are already covered by tests above/below.
}
#[test]
fn test_update_file_chunk() {
assert_eq!(
parse_update_file_chunk(&["bad"], 123, false),
Err(InvalidHunkError {
message: "Expected update hunk to start with a @@ context marker, got: 'bad'"
.to_string(),
line_number: 123
})
);
assert_eq!(
parse_update_file_chunk(&["@@"], 123, false),
Err(InvalidHunkError {
message: "Update hunk does not contain any lines".to_string(),
line_number: 124
})
);
assert_eq!(
parse_update_file_chunk(&["@@", "bad"], 123, false),
Err(InvalidHunkError {
message: "Unexpected line found in update hunk: 'bad'. \
Every line should start with ' ' (context line), '+' (added line), or '-' (removed line)".to_string(),
line_number: 124
})
);
assert_eq!(
parse_update_file_chunk(&["@@", "*** End of File"], 123, false),
Err(InvalidHunkError {
message: "Update hunk does not contain any lines".to_string(),
line_number: 124
})
);
assert_eq!(
parse_update_file_chunk(
&[
"@@ change_context",
"",
" context",
"-remove",
"+add",
" context2",
"*** End Patch",
],
123,
false
),
Ok((
(UpdateFileChunk {
change_context: Some("change_context".to_string()),
old_lines: vec![
"".to_string(),
"context".to_string(),
"remove".to_string(),
"context2".to_string()
],
new_lines: vec![
"".to_string(),
"context".to_string(),
"add".to_string(),
"context2".to_string()
],
is_end_of_file: false
}),
6
))
);
assert_eq!(
parse_update_file_chunk(&["@@", "+line", "*** End of File"], 123, false),
Ok((
(UpdateFileChunk {
change_context: None,
old_lines: vec![],
new_lines: vec!["line".to_string()],
is_end_of_file: true
}),
3
))
);
}
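
To make the two core steps of the parser above easier to follow in isolation, here is a condensed std-only sketch: classifying a hunk header with `strip_prefix`, and splitting the body of one update chunk into old and new lines by its leading diff marker. The `Header` enum and the `classify_header` and `split_chunk` functions are hypothetical simplifications for illustration; they omit `*** Move to:`, `@@` context markers, `*** End of File`, and all error reporting.

```rust
/// Hypothetical, simplified header kinds; the real parser builds `Hunk` values.
#[derive(Debug, PartialEq)]
enum Header<'a> {
    Add(&'a str),
    Delete(&'a str),
    Update(&'a str),
}

/// Classifies one line as a hunk header, tolerating surrounding whitespace.
fn classify_header(line: &str) -> Option<Header<'_>> {
    let line = line.trim();
    if let Some(path) = line.strip_prefix("*** Add File: ") {
        Some(Header::Add(path))
    } else if let Some(path) = line.strip_prefix("*** Delete File: ") {
        Some(Header::Delete(path))
    } else {
        line.strip_prefix("*** Update File: ").map(Header::Update)
    }
}

/// Splits the body of one update chunk into (old_lines, new_lines).
/// Stops at the first line that does not start with ' ', '+', or '-'.
fn split_chunk(body: &[&str]) -> (Vec<String>, Vec<String>) {
    let (mut old, mut new) = (Vec::new(), Vec::new());
    for &line in body {
        match line.chars().next() {
            // An empty line counts as an empty context line.
            None => {
                old.push(String::new());
                new.push(String::new());
            }
            // Context lines appear on both sides.
            Some(' ') => {
                old.push(line[1..].to_string());
                new.push(line[1..].to_string());
            }
            Some('+') => new.push(line[1..].to_string()),
            Some('-') => old.push(line[1..].to_string()),
            Some(_) => break,
        }
    }
    (old, new)
}

fn main() {
    assert_eq!(classify_header("*** Add File: a.py"), Some(Header::Add("a.py")));
    assert_eq!(classify_header("bad"), None);

    let (old, new) = split_chunk(&[" import foo", "-x = 1", "+x = 2", "*** End Patch"]);
    assert_eq!(old, vec!["import foo", "x = 1"]);
    assert_eq!(new, vec!["import foo", "x = 2"]);
}
```

Context lines land in both vectors, which is why `old_lines` can later be located in the source file as one contiguous block and replaced wholesale with `new_lines`.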


@@ -0,0 +1,150 @@
/// Attempt to find the sequence of `pattern` lines within `lines` beginning at or after `start`.
/// Returns the starting index of the match or `None` if not found. Matches are attempted with
/// decreasing strictness: exact match, then ignoring trailing whitespace, then ignoring leading
/// and trailing whitespace. When `eof` is true, we first try starting at the end-of-file (so that
/// patterns intended to match file endings are applied at the end), and fall back to searching
/// from `start` if needed.
///
/// Special cases handled defensively:
/// • Empty `pattern` → returns `Some(start)` (no-op match)
/// • `pattern.len() > lines.len()` → returns `None` (cannot match, avoids
/// out-of-bounds panic that occurred pre-2025-04-12)
pub(crate) fn seek_sequence(
lines: &[String],
pattern: &[String],
start: usize,
eof: bool,
) -> Option<usize> {
if pattern.is_empty() {
return Some(start);
}
// When the pattern is longer than the available input there is no possible
// match. Early-return to avoid the out-of-bounds slice that would occur in
// the search loops below (previously caused a panic when
// `pattern.len() > lines.len()`).
if pattern.len() > lines.len() {
return None;
}
let search_start = if eof && lines.len() >= pattern.len() {
lines.len() - pattern.len()
} else {
start
};
// Exact match first.
for i in search_start..=lines.len().saturating_sub(pattern.len()) {
if lines[i..i + pattern.len()] == *pattern {
return Some(i);
}
}
// Then rstrip match.
for i in search_start..=lines.len().saturating_sub(pattern.len()) {
let mut ok = true;
for (p_idx, pat) in pattern.iter().enumerate() {
if lines[i + p_idx].trim_end() != pat.trim_end() {
ok = false;
break;
}
}
if ok {
return Some(i);
}
}
// Finally, trim both sides to allow more lenience.
for i in search_start..=lines.len().saturating_sub(pattern.len()) {
let mut ok = true;
for (p_idx, pat) in pattern.iter().enumerate() {
if lines[i + p_idx].trim() != pat.trim() {
ok = false;
break;
}
}
if ok {
return Some(i);
}
}
// ------------------------------------------------------------------
// Final, most permissive pass: attempt to match after *normalising*
// common Unicode punctuation to their ASCII equivalents so that diffs
// authored with plain ASCII characters can still be applied to source
// files that contain typographic dashes / quotes, etc. This mirrors the
// fuzzy behaviour of `git apply` which ignores minor byte-level
// differences when locating context lines.
// ------------------------------------------------------------------
fn normalise(s: &str) -> String {
s.trim()
.chars()
.map(|c| match c {
// Various dash / hyphen code-points → ASCII '-'
'\u{2010}' | '\u{2011}' | '\u{2012}' | '\u{2013}' | '\u{2014}' | '\u{2015}'
| '\u{2212}' => '-',
// Fancy single quotes → '\''
'\u{2018}' | '\u{2019}' | '\u{201A}' | '\u{201B}' => '\'',
// Fancy double quotes → '"'
'\u{201C}' | '\u{201D}' | '\u{201E}' | '\u{201F}' => '"',
// Non-breaking space and other odd spaces → normal space
'\u{00A0}' | '\u{2002}' | '\u{2003}' | '\u{2004}' | '\u{2005}' | '\u{2006}'
| '\u{2007}' | '\u{2008}' | '\u{2009}' | '\u{200A}' | '\u{202F}' | '\u{205F}'
| '\u{3000}' => ' ',
other => other,
})
.collect::<String>()
}
for i in search_start..=lines.len().saturating_sub(pattern.len()) {
let mut ok = true;
for (p_idx, pat) in pattern.iter().enumerate() {
if normalise(&lines[i + p_idx]) != normalise(pat) {
ok = false;
break;
}
}
if ok {
return Some(i);
}
}
None
}
#[cfg(test)]
mod tests {
use super::seek_sequence;
fn to_vec(strings: &[&str]) -> Vec<String> {
strings.iter().map(|s| s.to_string()).collect()
}
#[test]
fn test_exact_match_finds_sequence() {
let lines = to_vec(&["foo", "bar", "baz"]);
let pattern = to_vec(&["bar", "baz"]);
assert_eq!(seek_sequence(&lines, &pattern, 0, false), Some(1));
}
#[test]
fn test_rstrip_match_ignores_trailing_whitespace() {
let lines = to_vec(&["foo ", "bar\t\t"]);
// Pattern omits trailing whitespace.
let pattern = to_vec(&["foo", "bar"]);
assert_eq!(seek_sequence(&lines, &pattern, 0, false), Some(0));
}
#[test]
fn test_trim_match_ignores_leading_and_trailing_whitespace() {
let lines = to_vec(&[" foo ", " bar\t"]);
// Pattern omits any additional whitespace.
let pattern = to_vec(&["foo", "bar"]);
assert_eq!(seek_sequence(&lines, &pattern, 0, false), Some(0));
}
#[test]
fn test_pattern_longer_than_input_returns_none() {
let lines = to_vec(&["just one line"]);
let pattern = to_vec(&["too", "many", "lines"]);
// Should not panic; must return None when pattern cannot possibly fit.
assert_eq!(seek_sequence(&lines, &pattern, 0, false), None);
}
}
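
The four search loops in `seek_sequence` share one shape: compare each candidate window after applying a progressively more forgiving key function. The sketch below expresses that idea as a list of passes. It is a hypothetical condensed variant, not the code in this diff: `seek` ignores the `eof` flag, takes `&[&str]` instead of `&[String]`, and its `normalise` covers only a handful of the code points handled above.

```rust
/// Maps a few typographic characters to ASCII (a subset of the real table).
fn normalise(s: &str) -> String {
    s.trim()
        .chars()
        .map(|c| match c {
            '\u{2013}' | '\u{2014}' => '-',
            '\u{2018}' | '\u{2019}' => '\'',
            '\u{201C}' | '\u{201D}' => '"',
            '\u{00A0}' => ' ',
            other => other,
        })
        .collect()
}

/// Hypothetical condensed variant of `seek_sequence` without `eof` handling.
fn seek(lines: &[&str], pattern: &[&str], start: usize) -> Option<usize> {
    if pattern.is_empty() {
        return Some(start);
    }
    if pattern.len() > lines.len() {
        return None;
    }
    // Passes are ordered from strictest to most lenient.
    let passes: [fn(&str) -> String; 4] = [
        |s: &str| s.to_string(),
        |s: &str| s.trim_end().to_string(),
        |s: &str| s.trim().to_string(),
        normalise,
    ];
    let last = lines.len() - pattern.len();
    for key in passes {
        for i in start..=last {
            let matches = pattern
                .iter()
                .enumerate()
                .all(|(j, p)| key(lines[i + j]) == key(p));
            if matches {
                return Some(i);
            }
        }
    }
    None
}

fn main() {
    let lines = ["fn main() {", "    let s = \u{201C}hi\u{201D};  ", "}"];
    // Only the final, normalising pass can match the curly quotes.
    assert_eq!(seek(&lines, &["let s = \"hi\";"], 0), Some(1));
    assert_eq!(seek(&lines, &["missing"], 0), None);
    assert_eq!(seek(&["one"], &["too", "many"], 0), None);
}
```

Because every strict pass runs over the whole range before a looser one starts, an exact match later in the file still wins over a whitespace-insensitive match earlier in the file, which is the same ordering the original loops produce.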

33
codex-rs/cli/Cargo.toml Normal file

@@ -0,0 +1,33 @@
[package]
name = "codex-cli"
version = { workspace = true }
edition = "2021"
[[bin]]
name = "codex"
path = "src/main.rs"
[[bin]]
name = "codex-linux-sandbox"
path = "src/linux-sandbox/main.rs"
[lib]
name = "codex_cli"
path = "src/lib.rs"
[dependencies]
anyhow = "1"
clap = { version = "4", features = ["derive"] }
codex-core = { path = "../core" }
codex-exec = { path = "../exec" }
codex-tui = { path = "../tui" }
serde_json = "1"
tokio = { version = "1", features = [
"io-std",
"macros",
"process",
"rt-multi-thread",
"signal",
] }
tracing = "0.1.41"
tracing-subscriber = "0.3.19"


@@ -0,0 +1,37 @@
//! `debug landlock` implementation for the Codex CLI.
//!
//! On Linux the command is executed inside a Landlock + seccomp sandbox by
//! applying the policy to a dedicated thread via
//! `codex_core::linux::apply_sandbox_policy_to_current_thread`.
use codex_core::protocol::SandboxPolicy;
use std::os::unix::process::ExitStatusExt;
use std::process;
use std::process::Command;
use std::process::ExitStatus;
/// Execute `command` in a Linux sandbox (Landlock + seccomp) the way Codex
/// would.
pub fn run_landlock(command: Vec<String>, sandbox_policy: SandboxPolicy) -> anyhow::Result<()> {
if command.is_empty() {
anyhow::bail!("command args are empty");
}
// Spawn a new thread and apply the sandbox policies there.
let handle = std::thread::spawn(move || -> anyhow::Result<ExitStatus> {
codex_core::linux::apply_sandbox_policy_to_current_thread(sandbox_policy)?;
let status = Command::new(&command[0]).args(&command[1..]).status()?;
Ok(status)
});
let status = handle
.join()
.map_err(|e| anyhow::anyhow!("Failed to join thread: {e:?}"))??;
// Use ExitStatus to derive the exit code.
if let Some(code) = status.code() {
process::exit(code);
} else if let Some(signal) = status.signal() {
process::exit(128 + signal);
} else {
process::exit(1);
}
}
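
The tail of `run_landlock` turns an `ExitStatus` into a process exit code, using the common shell convention of `128 + signal` for a child killed by a signal. A small pure-function sketch of that mapping, with the hypothetical name `exit_code_for`, makes the three cases easy to check without spawning a process:

```rust
/// Hypothetical pure helper mirroring the branch at the end of `run_landlock`.
/// `code` and `signal` correspond to `ExitStatus::code()` and
/// `ExitStatusExt::signal()` respectively.
fn exit_code_for(code: Option<i32>, signal: Option<i32>) -> i32 {
    match (code, signal) {
        // Normal exit: propagate the child's own code.
        (Some(code), _) => code,
        // Killed by a signal: follow the shell convention of 128 + signal.
        (None, Some(signal)) => 128 + signal,
        // Neither is available: report a generic failure.
        (None, None) => 1,
    }
}

fn main() {
    assert_eq!(exit_code_for(Some(0), None), 0);
    assert_eq!(exit_code_for(Some(3), None), 3);
    assert_eq!(exit_code_for(None, Some(9)), 137); // SIGKILL
    assert_eq!(exit_code_for(None, None), 1);
}
```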

47
codex-rs/cli/src/lib.rs Normal file

@@ -0,0 +1,47 @@
#[cfg(target_os = "linux")]
pub mod landlock;
pub mod proto;
pub mod seatbelt;
use clap::Parser;
use codex_core::protocol::SandboxPolicy;
use codex_core::SandboxPermissionOption;
#[derive(Debug, Parser)]
pub struct SeatbeltCommand {
/// Convenience alias for low-friction sandboxed automatic execution (network-disabled sandbox that can write to cwd and TMPDIR)
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
/// Full command args to run under seatbelt.
#[arg(trailing_var_arg = true)]
pub command: Vec<String>,
}
#[derive(Debug, Parser)]
pub struct LandlockCommand {
/// Convenience alias for low-friction sandboxed automatic execution (network-disabled sandbox that can write to cwd and TMPDIR)
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
/// Full command args to run under landlock.
#[arg(trailing_var_arg = true)]
pub command: Vec<String>,
}
pub fn create_sandbox_policy(full_auto: bool, sandbox: SandboxPermissionOption) -> SandboxPolicy {
if full_auto {
SandboxPolicy::new_full_auto_policy()
} else {
match sandbox.permissions.map(Into::into) {
Some(sandbox_policy) => sandbox_policy,
None => SandboxPolicy::new_read_only_policy(),
}
}
}
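
`create_sandbox_policy` encodes a precedence order: `--full-auto` wins, then any explicitly supplied permissions, and finally a read-only default. The sketch below reproduces only that ordering with stand-in types; `Policy` and `choose_policy` are hypothetical, and the `Vec<String>` payload is a placeholder for whatever `SandboxPermissionOption` actually carries.

```rust
/// Hypothetical stand-in for `SandboxPolicy`.
#[derive(Debug, PartialEq)]
enum Policy {
    FullAuto,
    Custom(Vec<String>),
    ReadOnly,
}

/// Mirrors the precedence in `create_sandbox_policy`:
/// full-auto flag, then explicit permissions, then read-only.
fn choose_policy(full_auto: bool, permissions: Option<Vec<String>>) -> Policy {
    if full_auto {
        Policy::FullAuto
    } else {
        permissions.map(Policy::Custom).unwrap_or(Policy::ReadOnly)
    }
}

fn main() {
    let perms = vec!["disk-write-cwd".to_string()]; // placeholder permission name
    // The flag takes priority even when permissions are also given.
    assert_eq!(choose_policy(true, Some(perms.clone())), Policy::FullAuto);
    assert_eq!(choose_policy(false, Some(perms.clone())), Policy::Custom(perms));
    assert_eq!(choose_policy(false, None), Policy::ReadOnly);
}
```

Defaulting to the most restrictive policy when nothing is specified means a forgotten flag fails closed rather than open.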


@@ -0,0 +1,22 @@
#[cfg(not(target_os = "linux"))]
fn main() -> anyhow::Result<()> {
eprintln!("codex-linux-sandbox is not supported on this platform.");
std::process::exit(1);
}
#[cfg(target_os = "linux")]
fn main() -> anyhow::Result<()> {
use clap::Parser;
use codex_cli::create_sandbox_policy;
use codex_cli::landlock;
use codex_cli::LandlockCommand;
let LandlockCommand {
full_auto,
sandbox,
command,
} = LandlockCommand::parse();
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
landlock::run_landlock(command, sandbox_policy)?;
Ok(())
}

102
codex-rs/cli/src/main.rs Normal file

@@ -0,0 +1,102 @@
use clap::Parser;
use codex_cli::create_sandbox_policy;
use codex_cli::proto;
use codex_cli::seatbelt;
use codex_cli::LandlockCommand;
use codex_cli::SeatbeltCommand;
use codex_exec::Cli as ExecCli;
use codex_tui::Cli as TuiCli;
use crate::proto::ProtoCli;
/// Codex CLI
///
/// If no subcommand is specified, options will be forwarded to the interactive CLI.
#[derive(Debug, Parser)]
#[clap(
author,
version,
// If a subcommand is given, ignore requirements of the default args.
subcommand_negates_reqs = true
)]
struct MultitoolCli {
#[clap(flatten)]
interactive: TuiCli,
#[clap(subcommand)]
subcommand: Option<Subcommand>,
}
#[derive(Debug, clap::Subcommand)]
enum Subcommand {
/// Run Codex non-interactively.
#[clap(visible_alias = "e")]
Exec(ExecCli),
/// Run the Protocol stream via stdin/stdout
#[clap(visible_alias = "p")]
Proto(ProtoCli),
/// Internal debugging commands.
Debug(DebugArgs),
}
#[derive(Debug, Parser)]
struct DebugArgs {
#[command(subcommand)]
cmd: DebugCommand,
}
#[derive(Debug, clap::Subcommand)]
enum DebugCommand {
/// Run a command under Seatbelt (macOS only).
Seatbelt(SeatbeltCommand),
/// Run a command under Landlock+seccomp (Linux only).
Landlock(LandlockCommand),
}
#[derive(Debug, Parser)]
struct ReplProto {}
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let cli = MultitoolCli::parse();
match cli.subcommand {
None => {
codex_tui::run_main(cli.interactive)?;
}
Some(Subcommand::Exec(exec_cli)) => {
codex_exec::run_main(exec_cli).await?;
}
Some(Subcommand::Proto(proto_cli)) => {
proto::run_main(proto_cli).await?;
}
Some(Subcommand::Debug(debug_args)) => match debug_args.cmd {
DebugCommand::Seatbelt(SeatbeltCommand {
command,
sandbox,
full_auto,
}) => {
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
seatbelt::run_seatbelt(command, sandbox_policy).await?;
}
#[cfg(target_os = "linux")]
DebugCommand::Landlock(LandlockCommand {
command,
sandbox,
full_auto,
}) => {
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
codex_cli::landlock::run_landlock(command, sandbox_policy)?;
}
#[cfg(not(target_os = "linux"))]
DebugCommand::Landlock(_) => {
anyhow::bail!("Landlock is only supported on Linux.");
}
},
}
Ok(())
}

94
codex-rs/cli/src/proto.rs Normal file

@@ -0,0 +1,94 @@
use std::io::IsTerminal;
use clap::Parser;
use codex_core::protocol::Submission;
use codex_core::util::notify_on_sigint;
use codex_core::Codex;
use tokio::io::AsyncBufReadExt;
use tokio::io::BufReader;
use tracing::error;
use tracing::info;
#[derive(Debug, Parser)]
pub struct ProtoCli {}
pub async fn run_main(_opts: ProtoCli) -> anyhow::Result<()> {
if std::io::stdin().is_terminal() {
anyhow::bail!("Protocol mode expects stdin to be a pipe, not a terminal");
}
tracing_subscriber::fmt()
.with_writer(std::io::stderr)
.init();
let ctrl_c = notify_on_sigint();
let codex = Codex::spawn(ctrl_c.clone())?;
// Task that reads JSON lines from stdin and forwards to Submission Queue
let sq_fut = {
let codex = codex.clone();
let ctrl_c = ctrl_c.clone();
async move {
let stdin = BufReader::new(tokio::io::stdin());
let mut lines = stdin.lines();
loop {
let result = tokio::select! {
_ = ctrl_c.notified() => {
info!("Interrupted, exiting");
break
},
res = lines.next_line() => res,
};
match result {
Ok(Some(line)) => {
let line = line.trim();
if line.is_empty() {
continue;
}
match serde_json::from_str::<Submission>(line) {
Ok(sub) => {
if let Err(e) = codex.submit(sub).await {
error!("{e:#}");
break;
}
}
Err(e) => {
error!("invalid submission: {e}");
}
}
}
_ => {
info!("Submission queue closed");
break;
}
}
}
}
};
// Task that reads events from the agent and prints them as JSON lines to stdout
let eq_fut = async move {
loop {
let event = tokio::select! {
_ = ctrl_c.notified() => break,
event = codex.next_event() => event,
};
match event {
Ok(event) => {
let event_str =
serde_json::to_string(&event).expect("JSON serialization failed");
println!("{event_str}");
}
Err(e) => {
error!("{e:#}");
break;
}
}
}
info!("Event queue closed");
};
tokio::join!(sq_fut, eq_fut);
Ok(())
}
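
The submission loop in `run_main` reads newline-delimited JSON, skips blank lines, logs and skips lines that fail to parse, and stops when the input ends. A synchronous std-only sketch of that line handling is shown below. It is a simplification under stated assumptions: there is no `tokio`, no Ctrl-C handling, and `parse` is a hypothetical stand-in for `serde_json::from_str::<Submission>` that merely checks whether a line looks like a JSON object.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical stand-in for `serde_json::from_str::<Submission>`:
/// accepts only lines that look like a JSON object.
fn parse(line: &str) -> Result<String, String> {
    if line.starts_with('{') && line.ends_with('}') {
        Ok(line.to_string())
    } else {
        Err(format!("invalid submission: {line}"))
    }
}

/// Reads newline-delimited input and returns (accepted lines, rejected count).
fn read_submissions(input: impl BufRead) -> (Vec<String>, usize) {
    let mut accepted = Vec::new();
    let mut rejected = 0;
    for line in input.lines() {
        // A read error ends the loop, like the `_ => break` arm above.
        let Ok(line) = line else { break };
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        match parse(line) {
            Ok(sub) => accepted.push(sub),
            // A malformed line is logged and skipped; the loop keeps going.
            Err(e) => {
                eprintln!("{e}");
                rejected += 1;
            }
        }
    }
    (accepted, rejected)
}

fn main() {
    let input = Cursor::new("{\"id\":\"1\"}\n\n  not json\n{\"id\":\"2\"}\n");
    let (accepted, rejected) = read_submissions(input);
    assert_eq!(accepted.len(), 2);
    assert_eq!(rejected, 1);
}
```

Keeping the loop alive after a malformed line matches the original behaviour, where only a failed `submit` or closed stdin ends the task.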


@@ -0,0 +1,17 @@
use codex_core::exec::create_seatbelt_command;
use codex_core::protocol::SandboxPolicy;
pub async fn run_seatbelt(
command: Vec<String>,
sandbox_policy: SandboxPolicy,
) -> anyhow::Result<()> {
let seatbelt_command = create_seatbelt_command(command, &sandbox_policy);
let status = tokio::process::Command::new(seatbelt_command[0].clone())
.args(&seatbelt_command[1..])
.spawn()
.map_err(|e| anyhow::anyhow!("Failed to spawn command: {}", e))?
.wait()
.await
.map_err(|e| anyhow::anyhow!("Failed to wait for command: {}", e))?;
std::process::exit(status.code().unwrap_or(1));
}

62
codex-rs/core/Cargo.toml Normal file

@@ -0,0 +1,62 @@
[package]
name = "codex-core"
version = "0.1.0"
edition = "2021"
[lib]
name = "codex_core"
path = "src/lib.rs"
[dependencies]
anyhow = "1"
async-channel = "2.3.1"
base64 = "0.21"
bytes = "1.10.1"
clap = { version = "4", features = ["derive", "wrap_help"], optional = true }
codex-apply-patch = { path = "../apply-patch" }
dirs = "6"
env-flags = "0.1.1"
eventsource-stream = "0.2.3"
fs-err = "3.1.0"
futures = "0.3"
mime_guess = "2.0"
patch = "0.7"
path-absolutize = "3.1.1"
rand = "0.9"
reqwest = { version = "0.12", features = ["json", "stream"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
thiserror = "2.0.12"
tokio = { version = "1", features = [
"io-std",
"macros",
"process",
"rt-multi-thread",
"signal",
] }
tokio-util = "0.7.14"
toml = "0.8.20"
tracing = { version = "0.1.41", features = ["log"] }
tree-sitter = "0.25.3"
tree-sitter-bash = "0.23.3"
[target.'cfg(target_os = "linux")'.dependencies]
libc = "0.2.172"
landlock = "0.4.1"
seccompiler = "0.5.0"
# Build OpenSSL from source for musl builds.
[target.x86_64-unknown-linux-musl.dependencies]
openssl-sys = { version = "*", features = ["vendored"] }
[dev-dependencies]
assert_cmd = "2"
predicates = "3"
tempfile = "3"
wiremock = "0.6"
[features]
default = []
# Separate feature so that `clap` is not a mandatory dependency.
cli = ["clap"]

10
codex-rs/core/README.md Normal file

@@ -0,0 +1,10 @@
# codex-core
This crate implements the business logic for Codex. It is designed to be used by the various Codex UIs written in Rust.
For non-Rust UIs, we are also working to define a _protocol_ for talking to Codex. See:
- [Specification](../docs/protocol_v1.md)
- [Rust types](./src/protocol.rs)
You can use the `proto` subcommand of the executable in the [`cli` crate](../cli) to speak the protocol as newline-delimited JSON over stdin/stdout.
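
To illustrate the newline-delimited framing mentioned above, here is a minimal sketch using only the standard library. It assumes each message occupies exactly one line and treats the payloads as opaque strings. The helper names `read_messages` and `write_message` are hypothetical and are not part of the crate, and the message shapes themselves are defined by the protocol specification, not by this sketch.

```rust
use std::io::{self, BufRead, Write};

/// Reads newline-delimited messages: one message per non-empty line.
/// Payloads are kept as opaque strings; a real client would parse each
/// line as JSON according to the protocol specification.
fn read_messages<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    let mut messages = Vec::new();
    for line in reader.lines() {
        let line = line?;
        let trimmed = line.trim();
        if !trimmed.is_empty() {
            messages.push(trimmed.to_string());
        }
    }
    Ok(messages)
}

/// Writes a single message followed by `\n` and flushes the writer so the
/// peer sees it immediately.
fn write_message<W: Write>(writer: &mut W, json: &str) -> io::Result<()> {
    // A raw newline inside the payload would break the framing.
    debug_assert!(!json.contains('\n'));
    writer.write_all(json.as_bytes())?;
    writer.write_all(b"\n")?;
    writer.flush()
}
```

In a real client, `reader` would typically wrap the child process's stdout and `writer` its stdin.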

98
codex-rs/core/prompt.md Normal file

@@ -0,0 +1,98 @@
Please resolve the user's task by editing and testing the code files in your current code execution session.
You are a deployed coding agent.
Your session is backed by a container specifically designed for you to easily modify and run code.
The repo(s) are already cloned in your working directory, and you must fully solve the problem for your answer to be considered correct.
You MUST adhere to the following criteria when executing the task:
- Working on the repo(s) in the current environment is allowed, even if they are proprietary.
- Analyzing code for vulnerabilities is allowed.
- Showing user code and tool call details is allowed.
- User instructions may overwrite the _CODING GUIDELINES_ section in this developer message.
- Do not use \`ls -R\`, \`find\`, or \`grep\` - these are slow in large repos. Use \`rg\` and \`rg --files\`.
- Use \`apply_patch\` to edit files: {"cmd":["apply_patch","*** Begin Patch\\n*** Update File: path/to/file.py\\n@@ def example():\\n- pass\\n+ return 123\\n*** End Patch"]}
- If completing the user's task requires writing or modifying files:
- Your code and final answer should follow these _CODING GUIDELINES_:
- Fix the problem at the root cause rather than applying surface-level patches, when possible.
- Avoid unneeded complexity in your solution.
- Ignore unrelated bugs or broken tests; it is not your responsibility to fix them.
- Update documentation as necessary.
- Keep changes consistent with the style of the existing codebase. Changes should be minimal and focused on the task.
- Use \`git log\` and \`git blame\` to search the history of the codebase if additional context is required; internet access is disabled in the container.
- NEVER add copyright or license headers unless specifically requested.
- You do not need to \`git commit\` your changes; this will be done automatically for you.
- If there is a .pre-commit-config.yaml, use \`pre-commit run --files ...\` to check that your changes pass the pre-commit checks. However, do not fix pre-existing errors on lines you didn't touch.
- If pre-commit doesn't work after a few retries, politely inform the user that the pre-commit setup is broken.
- Once you finish coding, you must
- Check \`git status\` to sanity check your changes; revert any scratch files or changes.
- Remove all inline comments you added as much as possible, even if they look normal. Check using \`git diff\`. Inline comments must be generally avoided, unless active maintainers of the repo, after long careful study of the code and the issue, will still misinterpret the code without the comments.
- Check if you accidentally added copyright or license headers. If so, remove them.
- Try to run pre-commit if it is available.
- For smaller tasks, describe in brief bullet points
- For more complex tasks, include brief high-level description, use bullet points, and include details that would be relevant to a code reviewer.
- If completing the user's task DOES NOT require writing or modifying files (e.g., the user asks a question about the code base):
- Respond in a friendly tone as a remote teammate, who is knowledgeable, capable and eager to help with coding.
- When your task involves writing or modifying files:
- Do NOT tell the user to "save the file" or "copy the code into a file" if you already created or modified the file using \`apply_patch\`. Instead, reference the file as already saved.
- Do NOT show the full contents of large files you have already written, unless the user explicitly asks for them.
§ `apply-patch` Specification
Your patch language is a stripped-down, file-oriented diff format designed to be easy to parse and safe to apply. You can think of it as a high-level envelope:
*** Begin Patch
[ one or more file sections ]
*** End Patch
Within that envelope, you get a sequence of file operations.
You MUST include a header to specify the action you are taking.
Each operation starts with one of three headers:
*** Add File: <path> - create a new file. Every following line is a + line (the initial contents).
*** Delete File: <path> - remove an existing file. Nothing follows.
*** Update File: <path> - patch an existing file in place (optionally with a rename).
May be immediately followed by *** Move to: <new path> if you want to rename the file.
Then one or more “hunks”, each introduced by @@ (optionally followed by a hunk header).
Within a hunk each line starts with:
+ for inserted text,
- for removed text, or
space ( ) for context.
At the end of a truncated hunk you can emit *** End of File.
Patch := Begin { FileOp } End
Begin := "*** Begin Patch" NEWLINE
End := "*** End Patch" NEWLINE
FileOp := AddFile | DeleteFile | UpdateFile
AddFile := "*** Add File: " path NEWLINE { "+" line NEWLINE }
DeleteFile := "*** Delete File: " path NEWLINE
UpdateFile := "*** Update File: " path NEWLINE [ MoveTo ] { Hunk }
MoveTo := "*** Move to: " newPath NEWLINE
Hunk := "@@" [ header ] NEWLINE { HunkLine } [ "*** End of File" NEWLINE ]
HunkLine := (" " | "-" | "+") text NEWLINE
A full patch can combine several operations:
*** Begin Patch
*** Add File: hello.txt
+Hello world
*** Update File: src/app.py
*** Move to: src/main.py
@@ def greet():
-print("Hi")
+print("Hello, world!")
*** Delete File: obsolete.txt
*** End Patch
It is important to remember:
- You must include a header with your intended action (Add/Delete/Update)
- You must prefix new lines with `+` even when creating a new file
You can invoke apply_patch like:
```
shell {"command":["apply_patch","*** Begin Patch\n*** Add File: hello.txt\n+Hello, world!\n*** End Patch\n"]}
```
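
The envelope grammar above can be sketched as a small header scanner. This is an illustrative sketch only, not the `codex-apply-patch` implementation. The `FileOp` enum and `parse_headers` function are hypothetical names, hunk bodies are skipped rather than applied, and the markers are assumed to be the literal `*** ` prefixes shown in the grammar.

```rust
/// One file operation header found inside a patch envelope.
#[derive(Debug, PartialEq)]
enum FileOp {
    Add(String),
    Delete(String),
    Update { path: String, move_to: Option<String> },
}

/// Extracts the file operation headers from a patch.
/// Only the envelope and the headers are checked; hunk lines are ignored.
fn parse_headers(patch: &str) -> Result<Vec<FileOp>, String> {
    let mut lines = patch.lines();
    if lines.next() != Some("*** Begin Patch") {
        return Err("missing `*** Begin Patch`".to_string());
    }
    let mut ops = Vec::new();
    for line in lines {
        if line == "*** End Patch" {
            return Ok(ops);
        } else if let Some(path) = line.strip_prefix("*** Add File: ") {
            ops.push(FileOp::Add(path.to_string()));
        } else if let Some(path) = line.strip_prefix("*** Delete File: ") {
            ops.push(FileOp::Delete(path.to_string()));
        } else if let Some(path) = line.strip_prefix("*** Update File: ") {
            ops.push(FileOp::Update { path: path.to_string(), move_to: None });
        } else if let Some(new_path) = line.strip_prefix("*** Move to: ") {
            // This sketch only checks that the most recent operation is an
            // update that has not been renamed yet.
            match ops.last_mut() {
                Some(FileOp::Update { move_to, .. }) if move_to.is_none() => {
                    *move_to = Some(new_path.to_string());
                }
                _ => return Err("`*** Move to:` without a preceding update".to_string()),
            }
        }
        // Any other line is hunk content (`@@`, `+`, `-`, or context).
    }
    Err("missing `*** End Patch`".to_string())
}
```

Because added lines always start with `+`, file content that happens to contain `*** Add File: ` is not mistaken for a header in this sketch.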


@@ -0,0 +1,120 @@
//! Standard type to use with the `--approval-mode` CLI option.
//! Available when the `cli` feature is enabled for the crate.
use std::path::PathBuf;
use clap::ArgAction;
use clap::Parser;
use clap::ValueEnum;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPermission;
#[derive(Clone, Copy, Debug, ValueEnum)]
#[value(rename_all = "kebab-case")]
pub enum ApprovalModeCliArg {
/// Run all commands without asking for user approval.
/// Only asks for approval if a command fails to execute, in which case it
/// will escalate to the user to ask for un-sandboxed execution.
OnFailure,
/// Only run "known safe" commands (e.g. ls, cat, sed) without
/// asking for user approval. Will escalate to the user if the model
/// proposes a command that is not allow-listed.
UnlessAllowListed,
/// Never ask for user approval.
/// Execution failures are immediately returned to the model.
Never,
}
impl From<ApprovalModeCliArg> for AskForApproval {
fn from(value: ApprovalModeCliArg) -> Self {
match value {
ApprovalModeCliArg::OnFailure => AskForApproval::OnFailure,
ApprovalModeCliArg::UnlessAllowListed => AskForApproval::UnlessAllowListed,
ApprovalModeCliArg::Never => AskForApproval::Never,
}
}
}
#[derive(Parser, Debug)]
pub struct SandboxPermissionOption {
/// Specify this flag multiple times to specify the full set of permissions
/// to grant to Codex.
///
/// ```shell
/// codex -s disk-full-read-access \
/// -s disk-write-cwd \
/// -s disk-write-platform-user-temp-folder \
/// -s disk-write-platform-global-temp-folder
/// ```
///
/// Note disk-write-folder takes a value:
///
/// ```shell
/// -s disk-write-folder=$HOME/.pyenv/shims
/// ```
///
/// These permissions are quite broad and should be used with caution:
///
/// ```shell
/// -s disk-full-write-access
/// -s network-full-access
/// ```
#[arg(long = "sandbox-permission", short = 's', action = ArgAction::Append, value_parser = parse_sandbox_permission)]
pub permissions: Option<Vec<SandboxPermission>>,
}
/// Custom value-parser so we can keep the CLI surface small *and*
/// still handle the parameterised `disk-write-folder` case.
fn parse_sandbox_permission(raw: &str) -> std::io::Result<SandboxPermission> {
let base_path = std::env::current_dir()?;
parse_sandbox_permission_with_base_path(raw, base_path)
}
pub(crate) fn parse_sandbox_permission_with_base_path(
raw: &str,
base_path: PathBuf,
) -> std::io::Result<SandboxPermission> {
use SandboxPermission::*;
if let Some(path) = raw.strip_prefix("disk-write-folder=") {
return if path.is_empty() {
Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
"--sandbox-permission disk-write-folder=<PATH> requires a non-empty PATH",
))
} else {
use path_absolutize::*;
let file = PathBuf::from(path);
let absolute_path = if file.is_relative() {
file.absolutize_from(base_path)
} else {
file.absolutize()
}
.map(|path| path.into_owned())?;
Ok(DiskWriteFolder {
folder: absolute_path,
})
};
}
match raw {
"disk-full-read-access" => Ok(DiskFullReadAccess),
"disk-write-platform-user-temp-folder" => Ok(DiskWritePlatformUserTempFolder),
"disk-write-platform-global-temp-folder" => Ok(DiskWritePlatformGlobalTempFolder),
"disk-write-cwd" => Ok(DiskWriteCwd),
"disk-full-write-access" => Ok(DiskFullWriteAccess),
"network-full-access" => Ok(NetworkFullAccess),
_ => Err(
std::io::Error::new(
std::io::ErrorKind::InvalidInput,
format!(
"`{raw}` is not a recognised permission.\nRun with `--help` to see the accepted values."
),
)
),
}
}
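
The base-path handling above matters because the same relative `disk-write-folder=` value resolves differently depending on the caller: the CLI parser uses the current directory, while `config.rs` passes `~/.codex`. The following is a simplified sketch of that idea using only the standard library. The `Permission` enum and `parse_permission` function are hypothetical stand-ins, and `Path::join` is used in place of `path_absolutize`, so `..` components are not normalised here.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical, reduced stand-in for `SandboxPermission`.
#[derive(Debug, PartialEq)]
enum Permission {
    DiskFullReadAccess,
    NetworkFullAccess,
    DiskWriteFolder { folder: PathBuf },
}

/// Parses one permission string, resolving relative folders against `base`.
fn parse_permission(raw: &str, base: &Path) -> Result<Permission, String> {
    if let Some(path) = raw.strip_prefix("disk-write-folder=") {
        if path.is_empty() {
            return Err("disk-write-folder=<PATH> requires a non-empty PATH".to_string());
        }
        // `Path::join` returns `path` unchanged when it is already absolute.
        return Ok(Permission::DiskWriteFolder { folder: base.join(path) });
    }
    match raw {
        "disk-full-read-access" => Ok(Permission::DiskFullReadAccess),
        "network-full-access" => Ok(Permission::NetworkFullAccess),
        other => Err(format!("`{other}` is not a recognised permission")),
    }
}
```

Calling `parse_permission("disk-write-folder=logs", base)` with two different `base` values yields two different folders, which is why the base path is an explicit parameter rather than being read from the environment inside the parser.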

383
codex-rs/core/src/client.rs Normal file

@@ -0,0 +1,383 @@
use std::collections::BTreeMap;
use std::io::BufRead;
use std::path::Path;
use std::pin::Pin;
use std::sync::LazyLock;
use std::task::Context;
use std::task::Poll;
use std::time::Duration;
use bytes::Bytes;
use eventsource_stream::Eventsource;
use futures::prelude::*;
use reqwest::StatusCode;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value;
use tokio::sync::mpsc;
use tokio::time::timeout;
use tokio_util::io::ReaderStream;
use tracing::debug;
use tracing::trace;
use tracing::warn;
use crate::error::CodexErr;
use crate::error::Result;
use crate::flags::get_api_key;
use crate::flags::CODEX_RS_SSE_FIXTURE;
use crate::flags::OPENAI_API_BASE;
use crate::flags::OPENAI_REQUEST_MAX_RETRIES;
use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::models::ResponseItem;
use crate::util::backoff;
/// API request payload for a single model turn.
#[derive(Default, Debug, Clone)]
pub struct Prompt {
/// Conversation context input items.
pub input: Vec<ResponseItem>,
/// Optional previous response ID (when storage is enabled).
pub prev_id: Option<String>,
/// Optional initial instructions (only sent on first turn).
pub instructions: Option<String>,
/// Whether to store response on server side (disable_response_storage = !store).
pub store: bool,
}
#[derive(Debug)]
pub enum ResponseEvent {
OutputItemDone(ResponseItem),
Completed { response_id: String },
}
#[derive(Debug, Serialize)]
struct Payload<'a> {
model: &'a str,
#[serde(skip_serializing_if = "Option::is_none")]
instructions: Option<&'a String>,
// TODO(mbolin): ResponseItem::Other should not be serialized. Currently,
// we code defensively to avoid this case, but perhaps we should use a
// separate enum for serialization.
input: &'a Vec<ResponseItem>,
tools: &'a [Tool],
tool_choice: &'static str,
parallel_tool_calls: bool,
reasoning: Option<Reasoning>,
#[serde(skip_serializing_if = "Option::is_none")]
previous_response_id: Option<String>,
/// true when using the Responses API.
store: bool,
stream: bool,
}
#[derive(Debug, Serialize)]
struct Reasoning {
effort: &'static str,
#[serde(skip_serializing_if = "Option::is_none")]
generate_summary: Option<bool>,
}
#[derive(Debug, Serialize)]
struct Tool {
name: &'static str,
#[serde(rename = "type")]
kind: &'static str, // "function"
description: &'static str,
strict: bool,
parameters: JsonSchema,
}
/// Generic JSONSchema subset needed for our tool definitions
#[derive(Debug, Clone, Serialize)]
#[serde(tag = "type", rename_all = "lowercase")]
enum JsonSchema {
String,
Number,
Array {
items: Box<JsonSchema>,
},
Object {
properties: BTreeMap<String, JsonSchema>,
required: &'static [&'static str],
#[serde(rename = "additionalProperties")]
additional_properties: bool,
},
}
/// Tool usage specification
static TOOLS: LazyLock<Vec<Tool>> = LazyLock::new(|| {
let mut properties = BTreeMap::new();
properties.insert(
"command".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String),
},
);
properties.insert("workdir".to_string(), JsonSchema::String);
properties.insert("timeout".to_string(), JsonSchema::Number);
vec![Tool {
name: "shell",
kind: "function",
description: "Runs a shell command, and returns its output.",
strict: false,
parameters: JsonSchema::Object {
properties,
required: &["command"],
additional_properties: false,
},
}]
});
#[derive(Clone)]
pub struct ModelClient {
model: String,
client: reqwest::Client,
}
impl ModelClient {
pub fn new(model: impl ToString) -> Self {
let model = model.to_string();
let client = reqwest::Client::new();
Self { model, client }
}
pub async fn stream(&mut self, prompt: &Prompt) -> Result<ResponseStream> {
if let Some(path) = &*CODEX_RS_SSE_FIXTURE {
// short circuit for tests
warn!(path, "Streaming from fixture");
return stream_from_fixture(path).await;
}
let payload = Payload {
model: &self.model,
instructions: prompt.instructions.as_ref(),
input: &prompt.input,
tools: &TOOLS,
tool_choice: "auto",
parallel_tool_calls: false,
reasoning: Some(Reasoning {
effort: "high",
generate_summary: None,
}),
previous_response_id: prompt.prev_id.clone(),
store: prompt.store,
stream: true,
};
let url = format!("{}/v1/responses", *OPENAI_API_BASE);
debug!(url, "POST");
trace!("request payload: {}", serde_json::to_string(&payload)?);
let mut attempt = 0;
loop {
attempt += 1;
let res = self
.client
.post(&url)
.bearer_auth(get_api_key()?)
.header("OpenAI-Beta", "responses=experimental")
.header(reqwest::header::ACCEPT, "text/event-stream")
.json(&payload)
.send()
.await;
match res {
Ok(resp) if resp.status().is_success() => {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
// spawn task to process SSE
let stream = resp.bytes_stream().map_err(CodexErr::Reqwest);
tokio::spawn(process_sse(stream, tx_event));
return Ok(ResponseStream { rx_event });
}
Ok(res) => {
let status = res.status();
// The OpenAI Responses endpoint returns structured JSON bodies even for 4xx/5xx
// errors. When we bubble early with only the HTTP status the caller sees an opaque
// "unexpected status 400 Bad Request" which makes debugging nearly impossible.
// Instead, read (and include) the response text so higher layers and users see the
// exact error message (e.g. "Unknown parameter: 'input[0].metadata'"). The body is
// small and this branch only runs on error paths so the extra allocation is
// negligible.
if !(status == StatusCode::TOO_MANY_REQUESTS || status.is_server_error()) {
// Surface the error body to callers. Use `unwrap_or_default` per Clippy.
let body = (res.text().await).unwrap_or_default();
return Err(CodexErr::UnexpectedStatus(status, body));
}
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
return Err(CodexErr::RetryLimit(status));
}
// Pull out the `Retry-After` header if present (integer seconds only).
let retry_after_secs = res
.headers()
.get(reqwest::header::RETRY_AFTER)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.parse::<u64>().ok());
let delay = retry_after_secs
.map(Duration::from_secs)
.unwrap_or_else(|| backoff(attempt));
tokio::time::sleep(delay).await;
}
Err(e) => {
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
return Err(e.into());
}
let delay = backoff(attempt);
tokio::time::sleep(delay).await;
}
}
}
}
}
#[derive(Debug, Deserialize, Serialize)]
struct SseEvent {
#[serde(rename = "type")]
kind: String,
response: Option<Value>,
item: Option<Value>,
}
#[derive(Debug, Deserialize)]
struct ResponseCompleted {
id: String,
}
async fn process_sse<S>(stream: S, tx_event: mpsc::Sender<Result<ResponseEvent>>)
where
S: Stream<Item = Result<Bytes>> + Unpin,
{
let mut stream = stream.eventsource();
// If the stream stays completely silent for an extended period treat it as disconnected.
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// The response id returned from the `response.completed` message.
let mut response_id = None;
loop {
let sse = match timeout(idle_timeout, stream.next()).await {
Ok(Some(Ok(sse))) => sse,
Ok(Some(Err(e))) => {
debug!("SSE Error: {e:#}");
let event = CodexErr::Stream(e.to_string());
let _ = tx_event.send(Err(event)).await;
return;
}
Ok(None) => {
match response_id {
Some(response_id) => {
let event = ResponseEvent::Completed { response_id };
let _ = tx_event.send(Ok(event)).await;
}
None => {
let _ = tx_event
.send(Err(CodexErr::Stream(
"stream closed before response.completed".into(),
)))
.await;
}
}
return;
}
Err(_) => {
let _ = tx_event
.send(Err(CodexErr::Stream("idle timeout waiting for SSE".into())))
.await;
return;
}
};
let event: SseEvent = match serde_json::from_str(&sse.data) {
Ok(event) => event,
Err(e) => {
debug!("Failed to parse SSE event: {e}, data: {}", &sse.data);
continue;
}
};
trace!(?event, "SSE event");
match event.kind.as_str() {
// Individual output item finalised. Forward immediately so the
// rest of the agent can stream assistant text/functions *live*
// instead of waiting for the final `response.completed` envelope.
//
// IMPORTANT: We used to ignore these events and forward the
// duplicated `output` array embedded in the `response.completed`
// payload. That produced two concrete issues:
// 1. No real-time streaming: the user only saw output after the
// entire turn had finished, which broke the “typing” UX and
// made long-running turns look stalled.
// 2. Duplicate `function_call_output` items: both the
// individual *and* the completed array were forwarded, which
// confused the backend and triggered 400
// "previous_response_not_found" errors because the duplicated
// IDs did not match the incremental turn chain.
//
// The fix is to forward the incremental events *as they come* and
// drop the duplicated list inside `response.completed`.
"response.output_item.done" => {
let Some(item_val) = event.item else { continue };
let Ok(item) = serde_json::from_value::<ResponseItem>(item_val) else {
debug!("failed to parse ResponseItem from output_item.done");
continue;
};
let event = ResponseEvent::OutputItemDone(item);
if tx_event.send(Ok(event)).await.is_err() {
return;
}
}
// Final response completed: includes the array of output items and the id.
"response.completed" => {
if let Some(resp_val) = event.response {
match serde_json::from_value::<ResponseCompleted>(resp_val) {
Ok(r) => {
response_id = Some(r.id);
}
Err(e) => {
debug!("failed to parse ResponseCompleted: {e}");
continue;
}
};
};
}
other => debug!(other, "sse event"),
}
}
}
pub struct ResponseStream {
rx_event: mpsc::Receiver<Result<ResponseEvent>>,
}
impl Stream for ResponseStream {
type Item = Result<ResponseEvent>;
fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
self.rx_event.poll_recv(cx)
}
}
/// used in tests to stream from a text SSE file
async fn stream_from_fixture(path: impl AsRef<Path>) -> Result<ResponseStream> {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
let f = std::fs::File::open(path.as_ref())?;
let lines = std::io::BufReader::new(f).lines();
// insert \n\n after each line for proper SSE parsing
let mut content = String::new();
for line in lines {
content.push_str(&line?);
content.push_str("\n\n");
}
let rdr = std::io::Cursor::new(content);
let stream = ReaderStream::new(rdr).map_err(CodexErr::Io);
tokio::spawn(process_sse(stream, tx_event));
Ok(ResponseStream { rx_event })
}
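
The retry logic in `ModelClient::stream` prefers the server's `Retry-After` value and otherwise falls back to `backoff(attempt)`. The sketch below isolates that decision. The `backoff` shown here is a hypothetical exponential schedule chosen for illustration, since the crate's `util::backoff` is not part of this diff and may behave differently (for example, by adding jitter). `retry_delay` is likewise a name introduced for this example.

```rust
use std::time::Duration;

/// Hypothetical exponential backoff: 200ms, 400ms, 800ms, ... capped at 10s.
/// Not the crate's `util::backoff`, which is not shown in this diff.
fn backoff(attempt: u64) -> Duration {
    let exp = attempt.saturating_sub(1).min(16) as u32;
    let millis = 200u64.saturating_mul(1u64 << exp);
    Duration::from_millis(millis.min(10_000))
}

/// Uses the `Retry-After` header value when it parses as whole seconds,
/// and falls back to the local backoff schedule otherwise.
fn retry_delay(retry_after: Option<&str>, attempt: u64) -> Duration {
    retry_after
        .and_then(|s| s.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or_else(|| backoff(attempt))
}
```

Note that `Retry-After` may also carry an HTTP date. With integer-only parsing, as in the code above, such a value falls through to the backoff branch instead of being honoured.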

1487
codex-rs/core/src/codex.rs Normal file

File diff suppressed because it is too large


@@ -0,0 +1,77 @@
use std::sync::atomic::AtomicU64;
use std::sync::Arc;
use crate::config::Config;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::Op;
use crate::protocol::Submission;
use crate::util::notify_on_sigint;
use crate::Codex;
use tokio::sync::Notify;
/// Spawn a new [`Codex`] and initialise the session.
///
/// Returns the wrapped [`Codex`], the `SessionConfigured` event that is
/// received in response to the initial `ConfigureSession` submission (so
/// that callers can surface the information to the UI), and the [`Notify`]
/// returned by `notify_on_sigint()`.
pub async fn init_codex(config: Config) -> anyhow::Result<(CodexWrapper, Event, Arc<Notify>)> {
let ctrl_c = notify_on_sigint();
let codex = CodexWrapper::new(Codex::spawn(ctrl_c.clone())?);
let init_id = codex
.submit(Op::ConfigureSession {
model: config.model.clone(),
instructions: config.instructions.clone(),
approval_policy: config.approval_policy,
sandbox_policy: config.sandbox_policy,
disable_response_storage: config.disable_response_storage,
})
.await?;
// The first event must be `SessionConfigured`. Validate and forward it to
// the caller so that they can display it in the conversation history.
let event = codex.next_event().await?;
if event.id != init_id
|| !matches!(
&event,
Event {
id: _id,
msg: EventMsg::SessionConfigured { .. },
}
)
{
return Err(anyhow::anyhow!(
"expected SessionConfigured but got {event:?}"
));
}
Ok((codex, event, ctrl_c))
}
pub struct CodexWrapper {
next_id: AtomicU64,
codex: Codex,
}
impl CodexWrapper {
fn new(codex: Codex) -> Self {
Self {
next_id: AtomicU64::new(0),
codex,
}
}
/// Returns the id of the Submission.
pub async fn submit(&self, op: Op) -> crate::error::Result<String> {
let id = self
.next_id
.fetch_add(1, std::sync::atomic::Ordering::SeqCst)
.to_string();
self.codex.submit(Submission { id: id.clone(), op }).await?;
Ok(id)
}
pub async fn next_event(&self) -> crate::error::Result<Event> {
self.codex.next_event().await
}
}
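
`CodexWrapper::submit` derives submission ids from an atomic counter, which lets it take `&self` rather than `&mut self`. A minimal sketch of that pattern in isolation follows. `IdGenerator` is a hypothetical name used only for this example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hands out monotonically increasing ids as strings.
struct IdGenerator {
    next_id: AtomicU64,
}

impl IdGenerator {
    fn new() -> Self {
        Self { next_id: AtomicU64::new(0) }
    }

    /// `fetch_add` returns the value *before* the increment, so the first
    /// id is "0". No lock is needed because the update is atomic.
    fn next(&self) -> String {
        self.next_id.fetch_add(1, Ordering::SeqCst).to_string()
    }
}
```

Since each call observes a distinct counter value, concurrent callers sharing one generator never receive the same id.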

264
codex-rs/core/src/config.rs Normal file

@@ -0,0 +1,264 @@
use crate::approval_mode_cli_arg::parse_sandbox_permission_with_base_path;
use crate::flags::OPENAI_DEFAULT_MODEL;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPermission;
use crate::protocol::SandboxPolicy;
use dirs::home_dir;
use serde::Deserialize;
use std::path::PathBuf;
/// Embedded fallback instructions that mirror the TypeScript CLI's default
/// system prompt. These are compiled into the binary so a clean install behaves
/// correctly even if the user has not created `~/.codex/instructions.md`.
const EMBEDDED_INSTRUCTIONS: &str = include_str!("../prompt.md");
/// Application configuration loaded from disk and merged with overrides.
#[derive(Debug, Clone)]
pub struct Config {
/// Model to use, after merging config.toml, overrides, and the default.
pub model: String,
/// Approval policy for executing commands.
pub approval_policy: AskForApproval,
pub sandbox_policy: SandboxPolicy,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
/// who have opted into Zero Data Retention (ZDR).
pub disable_response_storage: bool,
/// System instructions.
pub instructions: Option<String>,
}
/// Base config deserialized from ~/.codex/config.toml.
#[derive(Deserialize, Debug, Clone, Default)]
pub struct ConfigToml {
/// Optional override of model selection.
pub model: Option<String>,
/// Default approval policy for executing commands.
pub approval_policy: Option<AskForApproval>,
// The `default` attribute ensures that the field is treated as `None` when
// the key is omitted from the TOML. Without it, Serde treats the field as
// required because we supply a custom deserializer.
#[serde(default, deserialize_with = "deserialize_sandbox_permissions")]
pub sandbox_permissions: Option<Vec<SandboxPermission>>,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
/// who have opted into Zero Data Retention (ZDR).
pub disable_response_storage: Option<bool>,
/// System instructions.
pub instructions: Option<String>,
}
impl ConfigToml {
/// Attempt to parse the file at `~/.codex/config.toml`. If it does not
/// exist, return a default config. Though if it exists and cannot be
/// parsed, report that to the user and force them to fix it.
fn load_from_toml() -> std::io::Result<Self> {
let config_toml_path = codex_dir()?.join("config.toml");
match std::fs::read_to_string(&config_toml_path) {
Ok(contents) => toml::from_str::<Self>(&contents).map_err(|e| {
tracing::error!("Failed to parse config.toml: {e}");
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
}),
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
tracing::info!("config.toml not found, using defaults");
Ok(Self::default())
}
Err(e) => {
tracing::error!("Failed to read config.toml: {e}");
Err(e)
}
}
}
}
fn deserialize_sandbox_permissions<'de, D>(
deserializer: D,
) -> Result<Option<Vec<SandboxPermission>>, D::Error>
where
D: serde::Deserializer<'de>,
{
let permissions: Option<Vec<String>> = Option::deserialize(deserializer)?;
match permissions {
Some(raw_permissions) => {
let base_path = codex_dir().map_err(serde::de::Error::custom)?;
let converted = raw_permissions
.into_iter()
.map(|raw| {
parse_sandbox_permission_with_base_path(&raw, base_path.clone())
.map_err(serde::de::Error::custom)
})
.collect::<Result<Vec<_>, D::Error>>()?;
Ok(Some(converted))
}
None => Ok(None),
}
}
/// Optional overrides for user configuration (e.g., from CLI flags).
#[derive(Default, Debug, Clone)]
pub struct ConfigOverrides {
pub model: Option<String>,
pub approval_policy: Option<AskForApproval>,
pub sandbox_policy: Option<SandboxPolicy>,
pub disable_response_storage: Option<bool>,
}
impl Config {
/// Load configuration, optionally applying overrides (CLI flags). Merges
/// ~/.codex/config.toml, ~/.codex/instructions.md, embedded defaults, and
/// any values provided in `overrides` (highest precedence).
pub fn load_with_overrides(overrides: ConfigOverrides) -> std::io::Result<Self> {
let cfg: ConfigToml = ConfigToml::load_from_toml()?;
tracing::warn!("Config parsed from config.toml: {cfg:?}");
Ok(Self::load_from_base_config_with_overrides(cfg, overrides))
}
fn load_from_base_config_with_overrides(cfg: ConfigToml, overrides: ConfigOverrides) -> Self {
// Instructions: user-provided instructions.md > embedded default.
let instructions =
Self::load_instructions().or_else(|| Some(EMBEDDED_INSTRUCTIONS.to_string()));
// Destructure ConfigOverrides fully to ensure all overrides are applied.
let ConfigOverrides {
model,
approval_policy,
sandbox_policy,
disable_response_storage,
} = overrides;
let sandbox_policy = match sandbox_policy {
Some(sandbox_policy) => sandbox_policy,
None => {
// Derive a SandboxPolicy from the permissions in the config.
match cfg.sandbox_permissions {
// Note this means the user can explicitly set permissions
// to the empty list in the config file, granting it no
// permissions whatsoever.
Some(permissions) => SandboxPolicy::from(permissions),
// Default to read only rather than completely locked down.
None => SandboxPolicy::new_read_only_policy(),
}
}
};
Self {
model: model.or(cfg.model).unwrap_or_else(default_model),
approval_policy: approval_policy
.or(cfg.approval_policy)
.unwrap_or_else(AskForApproval::default),
sandbox_policy,
disable_response_storage: disable_response_storage
.or(cfg.disable_response_storage)
.unwrap_or(false),
instructions,
}
}
fn load_instructions() -> Option<String> {
let mut p = codex_dir().ok()?;
p.push("instructions.md");
std::fs::read_to_string(&p).ok()
}
/// Meant to be used exclusively for tests: `load_with_overrides()` should
/// be used in all other cases.
pub fn load_default_config_for_test() -> Self {
Self::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
)
}
}
fn default_model() -> String {
OPENAI_DEFAULT_MODEL.to_string()
}
/// Returns the path to the Codex configuration directory, which is `~/.codex`.
/// Does not verify that the directory exists.
pub fn codex_dir() -> std::io::Result<PathBuf> {
let mut p = home_dir().ok_or_else(|| {
std::io::Error::new(
std::io::ErrorKind::NotFound,
"Could not find home directory",
)
})?;
p.push(".codex");
Ok(p)
}
/// Returns the path to the folder where Codex logs are stored. Does not verify
/// that the directory exists.
pub fn log_dir() -> std::io::Result<PathBuf> {
let mut p = codex_dir()?;
p.push("log");
Ok(p)
}
#[cfg(test)]
mod tests {
use super::*;
/// Verify that the `sandbox_permissions` field on `ConfigToml` correctly
/// differentiates between a value that is completely absent in the
/// provided TOML (i.e. `None`) and one that is explicitly specified as an
/// empty array (i.e. `Some(vec![])`). This ensures that downstream logic
/// that treats these two cases differently (default read-only policy vs a
/// fully locked-down sandbox) continues to function.
#[test]
fn test_sandbox_permissions_none_vs_empty_vec() {
// Case 1: `sandbox_permissions` key is *absent* from the TOML source.
let toml_source_without_key = "";
let cfg_without_key: ConfigToml = toml::from_str(toml_source_without_key)
.expect("TOML deserialization without key should succeed");
assert!(cfg_without_key.sandbox_permissions.is_none());
// Case 2: `sandbox_permissions` is present but set to an *empty array*.
let toml_source_with_empty = "sandbox_permissions = []";
let cfg_with_empty: ConfigToml = toml::from_str(toml_source_with_empty)
.expect("TOML deserialization with empty array should succeed");
assert_eq!(Some(vec![]), cfg_with_empty.sandbox_permissions);
// Case 3: `sandbox_permissions` contains a non-empty list of valid values.
let toml_source_with_values = r#"
sandbox_permissions = ["disk-full-read-access", "network-full-access"]
"#;
let cfg_with_values: ConfigToml = toml::from_str(toml_source_with_values)
.expect("TOML deserialization with valid permissions should succeed");
assert_eq!(
Some(vec![
SandboxPermission::DiskFullReadAccess,
SandboxPermission::NetworkFullAccess
]),
cfg_with_values.sandbox_permissions
);
}
/// Deserializing a TOML string containing an *invalid* permission should
/// fail with a helpful error rather than silently defaulting or
/// succeeding.
#[test]
fn test_sandbox_permissions_illegal_value() {
let toml_bad = r#"sandbox_permissions = ["not-a-real-permission"]"#;
let err = toml::from_str::<ConfigToml>(toml_bad)
.expect_err("Deserialization should fail for invalid permission");
// Make sure the error message contains the invalid value so users have
// useful feedback.
let msg = err.to_string();
assert!(msg.contains("not-a-real-permission"));
}
}
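
The precedence rules in `load_from_base_config_with_overrides` (override, then config file, then default) and the `None` versus empty-list distinction for sandbox permissions can be sketched without the surrounding types. Everything below is a simplified illustration: `Policy`, `resolve_model`, `resolve_policy`, and the `"default-model"` string are hypothetical stand-ins, not the crate's actual values.

```rust
/// Hypothetical stand-in for `SandboxPolicy`.
#[derive(Debug, PartialEq)]
enum Policy {
    ReadOnly,
    Custom(Vec<String>),
}

/// Override wins over the config file, which wins over the default.
fn resolve_model(cli: Option<String>, file: Option<String>) -> String {
    cli.or(file).unwrap_or_else(|| "default-model".to_string())
}

/// An absent permissions key means "read only", while an explicit empty
/// list means "no permissions at all".
fn resolve_policy(cli: Option<Policy>, file_permissions: Option<Vec<String>>) -> Policy {
    match (cli, file_permissions) {
        (Some(policy), _) => policy,
        (None, Some(permissions)) => Policy::Custom(permissions),
        (None, None) => Policy::ReadOnly,
    }
}
```

Keeping the file value as `Option<Vec<_>>` rather than defaulting to an empty vector is what preserves the difference between the last two match arms.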

111
codex-rs/core/src/error.rs Normal file

@@ -0,0 +1,111 @@
use reqwest::StatusCode;
use serde_json;
use std::io;
use thiserror::Error;
use tokio::task::JoinError;
pub type Result<T> = std::result::Result<T, CodexErr>;
#[derive(Error, Debug)]
pub enum SandboxErr {
/// Error from sandbox execution
#[error("sandbox denied exec error, exit code: {0}, stdout: {1}, stderr: {2}")]
Denied(i32, String, String),
/// Error from linux seccomp filter setup
#[cfg(target_os = "linux")]
#[error("seccomp setup error")]
SeccompInstall(#[from] seccompiler::Error),
/// Error from linux seccomp backend
#[cfg(target_os = "linux")]
#[error("seccomp backend error")]
SeccompBackend(#[from] seccompiler::BackendError),
/// Command timed out
#[error("command timed out")]
Timeout,
/// Command was killed by a signal
#[error("command was killed by a signal")]
Signal(i32),
/// Error from linux landlock
#[error("Landlock was not able to fully enforce all sandbox rules")]
LandlockRestrict,
}
#[derive(Error, Debug)]
pub enum CodexErr {
/// Returned by ResponsesClient when the SSE stream disconnects or errors out **after** the HTTP
/// handshake has succeeded but **before** it finished emitting `response.completed`.
///
/// The Session loop treats this as a transient error and will automatically retry the turn.
#[error("stream disconnected before completion: {0}")]
Stream(String),
/// Returned by run_command_stream when the spawned child process timed out (10s).
#[error("timeout waiting for child process to exit")]
Timeout,
/// Returned by run_command_stream when the child could not be spawned (its stdout/stderr pipes
/// could not be captured). Analogous to the previous `CodexError::Spawn` variant.
#[error("spawn failed: child stdout/stderr not captured")]
Spawn,
/// Returned by run_command_stream when the user pressed Ctrl-C (SIGINT). Session uses this to
/// surface a polite FunctionCallOutput back to the model instead of crashing the CLI.
#[error("interrupted (Ctrl-C)")]
Interrupted,
/// Unexpected HTTP status code.
#[error("unexpected status {0}: {1}")]
UnexpectedStatus(StatusCode, String),
/// Retry limit exceeded.
#[error("exceeded retry limit, last status: {0}")]
RetryLimit(StatusCode),
/// Agent loop died unexpectedly
#[error("internal error; agent loop died unexpectedly")]
InternalAgentDied,
/// Sandbox error
#[error("sandbox error: {0}")]
Sandbox(#[from] SandboxErr),
// -----------------------------------------------------------------
// Automatic conversions for common external error types
// -----------------------------------------------------------------
#[error(transparent)]
Io(#[from] io::Error),
#[error(transparent)]
Reqwest(#[from] reqwest::Error),
#[error(transparent)]
Json(#[from] serde_json::Error),
#[cfg(target_os = "linux")]
#[error(transparent)]
LandlockRuleset(#[from] landlock::RulesetError),
#[cfg(target_os = "linux")]
#[error(transparent)]
LandlockPathFd(#[from] landlock::PathFdError),
#[error(transparent)]
TokioJoin(#[from] JoinError),
#[error("missing environment variable {0}")]
EnvVar(&'static str),
}
impl CodexErr {
/// Minimal shim so that existing `e.downcast_ref::<CodexErr>()` checks continue to compile
/// after replacing `anyhow::Error` in the return signature. This mirrors the behavior of
/// `anyhow::Error::downcast_ref` but works directly on our concrete enum.
pub fn downcast_ref<T: std::any::Any>(&self) -> Option<&T> {
(self as &dyn std::any::Any).downcast_ref::<T>()
}
}
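
The `downcast_ref` shim at the end of error.rs can be tried in isolation. The following is a minimal sketch, not the crate's code: `DemoErr` and `is_interrupted` are hypothetical names introduced for illustration, and only the standard library is used.

```rust
use std::any::Any;

// Hypothetical stand-in for `CodexErr`; only two variants are modeled here.
#[derive(Debug, PartialEq)]
pub enum DemoErr {
    Timeout,
    Interrupted,
}

impl DemoErr {
    /// Same shape as the shim in error.rs: view `self` as `&dyn Any` and try
    /// to downcast it to the requested type.
    pub fn downcast_ref<T: Any>(&self) -> Option<&T> {
        (self as &dyn Any).downcast_ref::<T>()
    }
}

/// Call site written in the old `anyhow`-style: downcast first, then match.
pub fn is_interrupted(err: &DemoErr) -> bool {
    matches!(err.downcast_ref::<DemoErr>(), Some(DemoErr::Interrupted))
}
```

Downcasting to the enum's own type always yields `Some`, and any other type yields `None`, so the shim exists only to keep older call sites compiling; new code can match on the enum directly.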

codex-rs/core/src/exec.rs Normal file

@@ -0,0 +1,360 @@
use std::io;
#[cfg(target_family = "unix")]
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;
use std::process::Stdio;
use std::sync::Arc;
use std::time::Duration;
use std::time::Instant;
use serde::Deserialize;
use tokio::io::AsyncRead;
use tokio::io::AsyncReadExt;
use tokio::io::BufReader;
use tokio::process::Command;
use tokio::sync::Notify;
use crate::error::CodexErr;
use crate::error::Result;
use crate::error::SandboxErr;
use crate::protocol::SandboxPolicy;
// Maximum we send for each stream, which is either:
// - 10KiB OR
// - 256 lines
const MAX_STREAM_OUTPUT: usize = 10 * 1024;
const MAX_STREAM_OUTPUT_LINES: usize = 256;
const DEFAULT_TIMEOUT_MS: u64 = 10_000;
// Hardcode these since it does not seem worth including the libc crate just
// for these.
const SIGKILL_CODE: i32 = 9;
const TIMEOUT_CODE: i32 = 64;
const MACOS_SEATBELT_BASE_POLICY: &str = include_str!("seatbelt_base_policy.sbpl");
/// When working with `sandbox-exec`, only consider `sandbox-exec` in `/usr/bin`
/// to defend against an attacker trying to inject a malicious version on the
/// PATH. If /usr/bin/sandbox-exec has been tampered with, then the attacker
/// already has root access.
const MACOS_PATH_TO_SEATBELT_EXECUTABLE: &str = "/usr/bin/sandbox-exec";
#[derive(Deserialize, Debug, Clone)]
pub struct ExecParams {
pub command: Vec<String>,
pub workdir: Option<String>,
/// This is the maximum time in milliseconds that the command is allowed to run.
#[serde(rename = "timeout")]
// The wire format uses `timeout`, which has ambiguous units, so we use
// `timeout_ms` as the field name so it is clear in code.
pub timeout_ms: Option<u64>,
}
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SandboxType {
None,
/// Only available on macOS.
MacosSeatbelt,
/// Only available on Linux.
LinuxSeccomp,
}
#[cfg(target_os = "linux")]
async fn exec_linux(
params: ExecParams,
ctrl_c: Arc<Notify>,
sandbox_policy: &SandboxPolicy,
) -> Result<RawExecToolCallOutput> {
crate::linux::exec_linux(params, ctrl_c, sandbox_policy).await
}
#[cfg(not(target_os = "linux"))]
async fn exec_linux(
_params: ExecParams,
_ctrl_c: Arc<Notify>,
_sandbox_policy: &SandboxPolicy,
) -> Result<RawExecToolCallOutput> {
Err(CodexErr::Io(io::Error::new(
io::ErrorKind::InvalidInput,
"linux sandbox is not supported on this platform",
)))
}
pub async fn process_exec_tool_call(
params: ExecParams,
sandbox_type: SandboxType,
ctrl_c: Arc<Notify>,
sandbox_policy: &SandboxPolicy,
) -> Result<ExecToolCallOutput> {
let start = Instant::now();
let raw_output_result = match sandbox_type {
SandboxType::None => exec(params, ctrl_c).await,
SandboxType::MacosSeatbelt => {
let ExecParams {
command,
workdir,
timeout_ms,
} = params;
let seatbelt_command = create_seatbelt_command(command, sandbox_policy);
exec(
ExecParams {
command: seatbelt_command,
workdir,
timeout_ms,
},
ctrl_c,
)
.await
}
SandboxType::LinuxSeccomp => exec_linux(params, ctrl_c, sandbox_policy).await,
};
let duration = start.elapsed();
match raw_output_result {
Ok(raw_output) => {
let stdout = String::from_utf8_lossy(&raw_output.stdout).to_string();
let stderr = String::from_utf8_lossy(&raw_output.stderr).to_string();
#[cfg(target_family = "unix")]
match raw_output.exit_status.signal() {
Some(TIMEOUT_CODE) => return Err(CodexErr::Sandbox(SandboxErr::Timeout)),
Some(signal) => {
return Err(CodexErr::Sandbox(SandboxErr::Signal(signal)));
}
None => {}
}
let exit_code = raw_output.exit_status.code().unwrap_or(-1);
// NOTE(ragona): This is much less restrictive than the previous check. If we exec
// a command, and it returns anything other than success, we assume that it may have
// been a sandboxing error and allow the user to retry. (The user of course may choose
// not to retry, or in a non-interactive mode, would automatically reject the approval.)
if exit_code != 0 && sandbox_type != SandboxType::None {
return Err(CodexErr::Sandbox(SandboxErr::Denied(
exit_code, stdout, stderr,
)));
}
Ok(ExecToolCallOutput {
exit_code,
stdout,
stderr,
duration,
})
}
Err(err) => {
tracing::error!("exec error: {err}");
Err(err)
}
}
}
pub fn create_seatbelt_command(
command: Vec<String>,
sandbox_policy: &SandboxPolicy,
) -> Vec<String> {
let (file_write_policy, extra_cli_args) = {
if sandbox_policy.has_full_disk_write_access() {
// Allegedly, this is more permissive than `(allow file-write*)`.
(
r#"(allow file-write* (regex #"^/"))"#.to_string(),
Vec::<String>::new(),
)
} else {
let writable_roots = sandbox_policy.get_writable_roots();
let (writable_folder_policies, cli_args): (Vec<String>, Vec<String>) = writable_roots
.iter()
.enumerate()
.map(|(index, root)| {
let param_name = format!("WRITABLE_ROOT_{index}");
let policy: String = format!("(subpath (param \"{param_name}\"))");
let cli_arg = format!("-D{param_name}={}", root.to_string_lossy());
(policy, cli_arg)
})
.unzip();
if writable_folder_policies.is_empty() {
("".to_string(), Vec::<String>::new())
} else {
let file_write_policy = format!(
"(allow file-write*\n{}\n)",
writable_folder_policies.join(" ")
);
(file_write_policy, cli_args)
}
}
};
let file_read_policy = if sandbox_policy.has_full_disk_read_access() {
"; allow read-only file operations\n(allow file-read*)"
} else {
""
};
// TODO(mbolin): apply_patch calls must also honor the SandboxPolicy.
let network_policy = if sandbox_policy.has_full_network_access() {
"(allow network-outbound)\n(allow network-inbound)\n(allow system-socket)"
} else {
""
};
let full_policy = format!(
"{MACOS_SEATBELT_BASE_POLICY}\n{file_read_policy}\n{file_write_policy}\n{network_policy}"
);
let mut seatbelt_command: Vec<String> = vec![
MACOS_PATH_TO_SEATBELT_EXECUTABLE.to_string(),
"-p".to_string(),
full_policy,
];
seatbelt_command.extend(extra_cli_args);
seatbelt_command.push("--".to_string());
seatbelt_command.extend(command);
seatbelt_command
}
#[derive(Debug)]
pub struct RawExecToolCallOutput {
pub exit_status: ExitStatus,
pub stdout: Vec<u8>,
pub stderr: Vec<u8>,
}
#[derive(Debug)]
pub struct ExecToolCallOutput {
pub exit_code: i32,
pub stdout: String,
pub stderr: String,
pub duration: Duration,
}
pub async fn exec(
ExecParams {
command,
workdir,
timeout_ms,
}: ExecParams,
ctrl_c: Arc<Notify>,
) -> Result<RawExecToolCallOutput> {
let mut child = {
if command.is_empty() {
return Err(CodexErr::Io(io::Error::new(
io::ErrorKind::InvalidInput,
"command args are empty",
)));
}
let mut cmd = Command::new(&command[0]);
if command.len() > 1 {
cmd.args(&command[1..]);
}
if let Some(dir) = &workdir {
cmd.current_dir(dir);
}
// Do not create a file descriptor for stdin because otherwise some
// commands may hang forever waiting for input. For example, ripgrep has
// a heuristic where it may try to read from stdin as explained here:
// https://github.com/BurntSushi/ripgrep/blob/e2362d4d5185d02fa857bf381e7bd52e66fafc73/crates/core/flags/hiargs.rs#L1101-L1103
cmd.stdin(Stdio::null());
cmd.stdout(Stdio::piped())
.stderr(Stdio::piped())
.kill_on_drop(true)
.spawn()?
};
let stdout_handle = tokio::spawn(read_capped(
BufReader::new(child.stdout.take().expect("stdout is not piped")),
MAX_STREAM_OUTPUT,
MAX_STREAM_OUTPUT_LINES,
));
let stderr_handle = tokio::spawn(read_capped(
BufReader::new(child.stderr.take().expect("stderr is not piped")),
MAX_STREAM_OUTPUT,
MAX_STREAM_OUTPUT_LINES,
));
let interrupted = ctrl_c.notified();
let timeout = Duration::from_millis(timeout_ms.unwrap_or(DEFAULT_TIMEOUT_MS));
let exit_status = tokio::select! {
result = tokio::time::timeout(timeout, child.wait()) => {
match result {
Ok(Ok(exit_status)) => exit_status,
Ok(e) => e?,
Err(_) => {
// timeout
child.start_kill()?;
// Debatable whether `child.wait().await` should be called here.
synthetic_exit_status(128 + TIMEOUT_CODE)
}
}
}
_ = interrupted => {
child.start_kill()?;
synthetic_exit_status(128 + SIGKILL_CODE)
}
};
let stdout = stdout_handle.await??;
let stderr = stderr_handle.await??;
Ok(RawExecToolCallOutput {
exit_status,
stdout,
stderr,
})
}
async fn read_capped<R: AsyncRead + Unpin>(
mut reader: R,
max_output: usize,
max_lines: usize,
) -> io::Result<Vec<u8>> {
let mut buf = Vec::with_capacity(max_output.min(8 * 1024));
let mut tmp = [0u8; 8192];
let mut remaining_bytes = max_output;
let mut remaining_lines = max_lines;
loop {
let n = reader.read(&mut tmp).await?;
if n == 0 {
break;
}
// Copy into the buffer only while we still have byte and line budget.
if remaining_bytes > 0 && remaining_lines > 0 {
let mut copy_len = 0;
for &b in &tmp[..n] {
if remaining_bytes == 0 || remaining_lines == 0 {
break;
}
copy_len += 1;
remaining_bytes -= 1;
if b == b'\n' {
remaining_lines -= 1;
}
}
buf.extend_from_slice(&tmp[..copy_len]);
}
// Continue reading to EOF to avoid back-pressure, but discard once caps are hit.
}
Ok(buf)
}
#[cfg(unix)]
fn synthetic_exit_status(code: i32) -> ExitStatus {
use std::os::unix::process::ExitStatusExt;
std::process::ExitStatus::from_raw(code)
}
#[cfg(windows)]
fn synthetic_exit_status(code: i32) -> ExitStatus {
use std::os::windows::process::ExitStatusExt;
std::process::ExitStatus::from_raw(code.try_into().unwrap())
}
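
The capping strategy used by `read_capped` can be exercised without tokio. The following is a synchronous sketch under stated assumptions: `read_capped_sync` is a hypothetical name, it uses `std::io::Read` instead of `AsyncRead`, and the 4 KiB chunk size is arbitrary.

```rust
use std::io::{self, Read};

/// Keep at most `max_bytes` bytes and `max_lines` newlines from `reader`,
/// but keep reading until EOF so the producer never blocks on a full pipe.
pub fn read_capped_sync<R: Read>(
    mut reader: R,
    max_bytes: usize,
    max_lines: usize,
) -> io::Result<Vec<u8>> {
    let mut kept = Vec::new();
    let mut chunk = [0u8; 4096];
    let mut remaining_bytes = max_bytes;
    let mut remaining_lines = max_lines;
    loop {
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            break; // EOF
        }
        // Count how many bytes of this chunk still fit in both budgets.
        let mut copy_len = 0;
        for &b in &chunk[..n] {
            if remaining_bytes == 0 || remaining_lines == 0 {
                break;
            }
            copy_len += 1;
            remaining_bytes -= 1;
            if b == b'\n' {
                remaining_lines -= 1;
            }
        }
        // Once either budget is exhausted, `copy_len` is 0 and data is discarded.
        kept.extend_from_slice(&chunk[..copy_len]);
    }
    Ok(kept)
}
```

Because a byte slice implements `Read`, the function can be called as `read_capped_sync(&b"a\nb\nc\n"[..], 100, 2)`, which keeps the first two lines and drops the rest.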


@@ -0,0 +1,29 @@
use std::time::Duration;
use env_flags::env_flags;
use crate::error::CodexErr;
use crate::error::Result;
env_flags! {
pub OPENAI_DEFAULT_MODEL: &str = "o3";
pub OPENAI_API_BASE: &str = "https://api.openai.com";
pub OPENAI_API_KEY: Option<&str> = None;
pub OPENAI_TIMEOUT_MS: Duration = Duration::from_millis(300_000), |value| {
value.parse().map(Duration::from_millis)
};
pub OPENAI_REQUEST_MAX_RETRIES: u64 = 4;
pub OPENAI_STREAM_MAX_RETRIES: u64 = 10;
// We generally don't want to disconnect; this updates the timeout to be five minutes
// which matches the upstream typescript codex impl.
pub OPENAI_STREAM_IDLE_TIMEOUT_MS: Duration = Duration::from_millis(300_000), |value| {
value.parse().map(Duration::from_millis)
};
pub CODEX_RS_SSE_FIXTURE: Option<&str> = None;
}
pub fn get_api_key() -> Result<&'static str> {
OPENAI_API_KEY.ok_or_else(|| CodexErr::EnvVar("OPENAI_API_KEY"))
}
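
The two timeout flags in this file parse a millisecond string into a `Duration` and fall back to five minutes when unset. The following is a standard-library sketch of that behavior; `parse_millis`, `resolve_timeout`, and `DEFAULT_TIMEOUT` are hypothetical names, and returning the parse error to the caller is an assumption about error handling, not a statement about what `env_flags` does.

```rust
use std::num::ParseIntError;
use std::time::Duration;

/// Default used when the variable is unset (five minutes, as in the flags).
pub const DEFAULT_TIMEOUT: Duration = Duration::from_millis(300_000);

/// Parse a raw millisecond string into a `Duration`.
pub fn parse_millis(value: &str) -> Result<Duration, ParseIntError> {
    value.parse::<u64>().map(Duration::from_millis)
}

/// Resolve the timeout from an optional raw value, for example the result of
/// `std::env::var("OPENAI_TIMEOUT_MS").ok().as_deref()`.
pub fn resolve_timeout(raw: Option<&str>) -> Result<Duration, ParseIntError> {
    match raw {
        Some(value) => parse_millis(value),
        None => Ok(DEFAULT_TIMEOUT),
    }
}
```

Taking the raw value as a parameter rather than reading the environment inside the function keeps the sketch easy to test.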


@@ -0,0 +1,332 @@
use tree_sitter::Parser;
use tree_sitter::Tree;
use tree_sitter_bash::LANGUAGE as BASH;
pub fn is_known_safe_command(command: &[String]) -> bool {
if is_safe_to_call_with_exec(command) {
return true;
}
// TODO(mbolin): Also support safe commands that are piped together such
// as `cat foo | wc -l`.
matches!(
command,
[bash, flag, script]
if bash == "bash"
&& flag == "-lc"
&& try_parse_bash(script).and_then(|tree|
try_parse_single_word_only_command(&tree, script)).is_some_and(|parsed_bash_command| is_safe_to_call_with_exec(&parsed_bash_command))
)
}
fn is_safe_to_call_with_exec(command: &[String]) -> bool {
let cmd0 = command.first().map(String::as_str);
match cmd0 {
Some(
"cat" | "cd" | "echo" | "grep" | "head" | "ls" | "pwd" | "rg" | "tail" | "wc" | "which",
) => true,
Some("find") => {
// Certain options to `find` can delete files, write to files, or
// execute arbitrary commands, so we cannot auto-approve the
// invocation of `find` in such cases.
#[rustfmt::skip]
const UNSAFE_FIND_OPTIONS: &[&str] = &[
// Options that can execute arbitrary commands.
"-exec", "-execdir", "-ok", "-okdir",
// Option that deletes matching files.
"-delete",
// Options that write pathnames to a file.
"-fls", "-fprint", "-fprint0", "-fprintf",
];
!command
.iter()
.any(|arg| UNSAFE_FIND_OPTIONS.contains(&arg.as_str()))
}
// Git
Some("git") => matches!(
command.get(1).map(String::as_str),
Some("branch" | "status" | "log" | "diff" | "show")
),
// Rust
Some("cargo") if command.get(1).map(String::as_str) == Some("check") => true,
// Special-case `sed -n {N|M,N}p FILE`
Some("sed")
if {
command.len() == 4
&& command.get(1).map(String::as_str) == Some("-n")
&& is_valid_sed_n_arg(command.get(2).map(String::as_str))
&& command.get(3).map(String::is_empty) == Some(false)
} =>
{
true
}
// ── anything else ─────────────────────────────────────────────────
_ => false,
}
}
fn try_parse_bash(bash_lc_arg: &str) -> Option<Tree> {
let lang = BASH.into();
let mut parser = Parser::new();
parser.set_language(&lang).expect("load bash grammar");
let old_tree: Option<&Tree> = None;
parser.parse(bash_lc_arg, old_tree)
}
/// If `tree` represents a single Bash command whose name and every argument is
/// an ordinary `word`, return those words in order; otherwise, return `None`.
///
/// `src` must be the exact source string that was parsed into `tree`, so we can
/// extract the text for every node.
pub fn try_parse_single_word_only_command(tree: &Tree, src: &str) -> Option<Vec<String>> {
// Any parse error is an immediate rejection.
if tree.root_node().has_error() {
return None;
}
// (program …) with exactly one statement
let root = tree.root_node();
if root.kind() != "program" || root.named_child_count() != 1 {
return None;
}
let cmd = root.named_child(0)?; // (command …)
if cmd.kind() != "command" {
return None;
}
let mut words = Vec::new();
let mut cursor = cmd.walk();
for child in cmd.named_children(&mut cursor) {
match child.kind() {
// The command name node wraps one `word` child.
"command_name" => {
let word_node = child.named_child(0)?; // make sure it's only a word
if word_node.kind() != "word" {
return None;
}
words.push(word_node.utf8_text(src.as_bytes()).ok()?.to_owned());
}
// Positional-argument word (allowed).
"word" | "number" => {
words.push(child.utf8_text(src.as_bytes()).ok()?.to_owned());
}
"string" => {
if child.child_count() == 3
&& child.child(0)?.kind() == "\""
&& child.child(1)?.kind() == "string_content"
&& child.child(2)?.kind() == "\""
{
words.push(child.child(1)?.utf8_text(src.as_bytes()).ok()?.to_owned());
} else {
// Anything else means the command is *not* plain words.
return None;
}
}
"concatenation" => {
// TODO: Consider things like `'ab\'a'`.
return None;
}
"raw_string" => {
// Raw string is a single word, but we need to strip the quotes.
let raw_string = child.utf8_text(src.as_bytes()).ok()?;
let stripped = raw_string
.strip_prefix('\'')
.and_then(|s| s.strip_suffix('\''));
if let Some(stripped) = stripped {
words.push(stripped.to_owned());
} else {
return None;
}
}
// Anything else means the command is *not* plain words.
_ => return None,
}
}
Some(words)
}
/* ----------------------------------------------------------
Example
---------------------------------------------------------- */
/// Returns true if `arg` matches /^(\d+,)?\d+p$/
fn is_valid_sed_n_arg(arg: Option<&str>) -> bool {
// unwrap or bail
let s = match arg {
Some(s) => s,
None => return false,
};
// must end with 'p', strip it
let core = match s.strip_suffix('p') {
Some(rest) => rest,
None => return false,
};
// split on ',' and ensure 1 or 2 numeric parts
let parts: Vec<&str> = core.split(',').collect();
match parts.as_slice() {
// single number, e.g. "10"
[num] => !num.is_empty() && num.chars().all(|c| c.is_ascii_digit()),
// two numbers, e.g. "1,5"
[a, b] => {
!a.is_empty()
&& !b.is_empty()
&& a.chars().all(|c| c.is_ascii_digit())
&& b.chars().all(|c| c.is_ascii_digit())
}
// anything else (more than one comma) is invalid
_ => false,
}
}
#[cfg(test)]
mod tests {
use super::*;
fn vec_str(args: &[&str]) -> Vec<String> {
args.iter().map(|s| s.to_string()).collect()
}
#[test]
fn known_safe_examples() {
assert!(is_safe_to_call_with_exec(&vec_str(&["ls"])));
assert!(is_safe_to_call_with_exec(&vec_str(&["git", "status"])));
assert!(is_safe_to_call_with_exec(&vec_str(&[
"sed", "-n", "1,5p", "file.txt"
])));
// Safe `find` command (no unsafe options).
assert!(is_safe_to_call_with_exec(&vec_str(&[
"find", ".", "-name", "file.txt"
])));
}
#[test]
fn unknown_or_partial() {
assert!(!is_safe_to_call_with_exec(&vec_str(&["foo"])));
assert!(!is_safe_to_call_with_exec(&vec_str(&["git", "fetch"])));
assert!(!is_safe_to_call_with_exec(&vec_str(&[
"sed", "-n", "xp", "file.txt"
])));
// Unsafe `find` commands.
for args in [
vec_str(&["find", ".", "-name", "file.txt", "-exec", "rm", "{}", ";"]),
vec_str(&[
"find", ".", "-name", "*.py", "-execdir", "python3", "{}", ";",
]),
vec_str(&["find", ".", "-name", "file.txt", "-ok", "rm", "{}", ";"]),
vec_str(&["find", ".", "-name", "*.py", "-okdir", "python3", "{}", ";"]),
vec_str(&["find", ".", "-delete", "-name", "file.txt"]),
vec_str(&["find", ".", "-fls", "/etc/passwd"]),
vec_str(&["find", ".", "-fprint", "/etc/passwd"]),
vec_str(&["find", ".", "-fprint0", "/etc/passwd"]),
vec_str(&["find", ".", "-fprintf", "/root/suid.txt", "%#m %u %p\n"]),
] {
assert!(
!is_safe_to_call_with_exec(&args),
"expected {:?} to be unsafe",
args
);
}
}
#[test]
fn bash_lc_safe_examples() {
assert!(is_known_safe_command(&vec_str(&["bash", "-lc", "ls"])));
assert!(is_known_safe_command(&vec_str(&["bash", "-lc", "ls -1"])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"git status"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"grep -R \"Cargo.toml\" -n"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"sed -n 1,5p file.txt"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"sed -n '1,5p' file.txt"
])));
assert!(is_known_safe_command(&vec_str(&[
"bash",
"-lc",
"find . -name file.txt"
])));
}
#[test]
fn bash_lc_unsafe_examples() {
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "git", "status"])),
"Four arg version is not known to be safe."
);
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "'git status'"])),
"The extra quoting around 'git status' makes it a program named 'git status' and is therefore unsafe."
);
assert!(
!is_known_safe_command(&vec_str(&["bash", "-lc", "find . -name file.txt -delete"])),
"Unsafe find option should not be auto-approved."
);
}
#[test]
fn test_try_parse_single_word_only_command() {
let script_with_single_quoted_string = "sed -n '1,5p' file.txt";
let parsed_words = try_parse_bash(script_with_single_quoted_string)
.and_then(|tree| {
try_parse_single_word_only_command(&tree, script_with_single_quoted_string)
})
.unwrap();
assert_eq!(
vec![
"sed".to_string(),
"-n".to_string(),
// Ensure the single quotes are properly removed.
"1,5p".to_string(),
"file.txt".to_string()
],
parsed_words,
);
let script_with_number_arg = "ls -1";
let parsed_words = try_parse_bash(script_with_number_arg)
.and_then(|tree| try_parse_single_word_only_command(&tree, script_with_number_arg))
.unwrap();
assert_eq!(vec!["ls", "-1"], parsed_words,);
let script_with_double_quoted_string_with_no_funny_stuff_arg = "grep -R \"Cargo.toml\" -n";
let parsed_words = try_parse_bash(script_with_double_quoted_string_with_no_funny_stuff_arg)
.and_then(|tree| {
try_parse_single_word_only_command(
&tree,
script_with_double_quoted_string_with_no_funny_stuff_arg,
)
})
.unwrap();
assert_eq!(vec!["grep", "-R", "Cargo.toml", "-n"], parsed_words);
}
}
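
The `sed -n` argument check in this file accepts only `Np` or `M,Np`. The following is a compact sketch of the same rule using `split_once`; `is_print_range` is a hypothetical name and is not part of the diff.

```rust
/// Returns true if `arg` has the form `Np` or `M,Np`, where `M` and `N` are
/// non-empty runs of ASCII digits (the pattern /^(\d+,)?\d+p$/).
pub fn is_print_range(arg: &str) -> bool {
    // The argument must end with the `p` (print) command.
    let Some(core) = arg.strip_suffix('p') else {
        return false;
    };
    let is_number = |s: &str| !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    match core.split_once(',') {
        // `M,N`: both sides must be numbers; a second comma lands in `end`
        // and fails the digit check.
        Some((start, end)) => is_number(start) && is_number(end),
        // No comma: a single line number.
        None => is_number(core),
    }
}
```

Restricting the argument this tightly means other `sed` commands, such as `w` (write to file) or `e` (execute), are never auto-approved.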

codex-rs/core/src/lib.rs Normal file

@@ -0,0 +1,31 @@
//! Root of the `codex-core` library.
// Prevent accidental direct writes to stdout/stderr in library code. All
// user-visible output must go through the appropriate abstraction (e.g.,
// the TUI or the tracing stack).
#![deny(clippy::print_stdout, clippy::print_stderr)]
mod client;
pub mod codex;
pub mod codex_wrapper;
pub mod config;
pub mod error;
pub mod exec;
mod flags;
mod is_safe_command;
#[cfg(target_os = "linux")]
pub mod linux;
mod models;
pub mod protocol;
mod safety;
pub mod util;
mod zdr_transcript;
pub use codex::Codex;
#[cfg(feature = "cli")]
mod approval_mode_cli_arg;
#[cfg(feature = "cli")]
pub use approval_mode_cli_arg::ApprovalModeCliArg;
#[cfg(feature = "cli")]
pub use approval_mode_cli_arg::SandboxPermissionOption;

codex-rs/core/src/linux.rs Normal file

@@ -0,0 +1,353 @@
use std::collections::BTreeMap;
use std::io;
use std::path::PathBuf;
use std::sync::Arc;
use crate::error::CodexErr;
use crate::error::Result;
use crate::error::SandboxErr;
use crate::exec::exec;
use crate::exec::ExecParams;
use crate::exec::RawExecToolCallOutput;
use crate::protocol::SandboxPolicy;
use landlock::Access;
use landlock::AccessFs;
use landlock::CompatLevel;
use landlock::Compatible;
use landlock::Ruleset;
use landlock::RulesetAttr;
use landlock::RulesetCreatedAttr;
use landlock::ABI;
use seccompiler::apply_filter;
use seccompiler::BpfProgram;
use seccompiler::SeccompAction;
use seccompiler::SeccompCmpArgLen;
use seccompiler::SeccompCmpOp;
use seccompiler::SeccompCondition;
use seccompiler::SeccompFilter;
use seccompiler::SeccompRule;
use seccompiler::TargetArch;
use tokio::sync::Notify;
pub async fn exec_linux(
params: ExecParams,
ctrl_c: Arc<Notify>,
sandbox_policy: &SandboxPolicy,
) -> Result<RawExecToolCallOutput> {
// Allow READ on /
// Allow WRITE on /dev/null
let ctrl_c_copy = ctrl_c.clone();
let sandbox_policy = sandbox_policy.clone();
// Isolate thread to run the sandbox from
let tool_call_output = std::thread::spawn(move || {
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.expect("Failed to create runtime");
rt.block_on(async {
apply_sandbox_policy_to_current_thread(sandbox_policy)?;
exec(params, ctrl_c_copy).await
})
})
.join();
match tool_call_output {
Ok(Ok(output)) => Ok(output),
Ok(Err(e)) => Err(e),
Err(e) => Err(CodexErr::Io(io::Error::new(
io::ErrorKind::Other,
format!("thread join failed: {e:?}"),
))),
}
}
/// Apply sandbox policies inside this thread so only the child inherits
/// them, not the entire CLI process.
pub fn apply_sandbox_policy_to_current_thread(sandbox_policy: SandboxPolicy) -> Result<()> {
if !sandbox_policy.has_full_network_access() {
install_network_seccomp_filter_on_current_thread()?;
}
if !sandbox_policy.has_full_disk_write_access() {
let writable_roots = sandbox_policy.get_writable_roots();
install_filesystem_landlock_rules_on_current_thread(writable_roots)?;
}
// TODO(ragona): Add appropriate restrictions if
// `sandbox_policy.has_full_disk_read_access()` is `false`.
Ok(())
}
/// Installs Landlock file-system rules on the current thread allowing read
/// access to the entire file-system while restricting write access to
/// `/dev/null` and the provided list of `writable_roots`.
///
/// # Errors
/// Returns [`CodexErr::Sandbox`] variants when the ruleset fails to apply.
fn install_filesystem_landlock_rules_on_current_thread(writable_roots: Vec<PathBuf>) -> Result<()> {
let abi = ABI::V5;
let access_rw = AccessFs::from_all(abi);
let access_ro = AccessFs::from_read(abi);
let mut ruleset = Ruleset::default()
.set_compatibility(CompatLevel::BestEffort)
.handle_access(access_rw)?
.create()?
.add_rules(landlock::path_beneath_rules(&["/"], access_ro))?
.add_rules(landlock::path_beneath_rules(&["/dev/null"], access_rw))?
.set_no_new_privs(true);
if !writable_roots.is_empty() {
ruleset = ruleset.add_rules(landlock::path_beneath_rules(&writable_roots, access_rw))?;
}
let status = ruleset.restrict_self()?;
if status.ruleset == landlock::RulesetStatus::NotEnforced {
return Err(CodexErr::Sandbox(SandboxErr::LandlockRestrict));
}
Ok(())
}
/// Installs a seccomp filter that blocks outbound network access except for
/// AF_UNIX domain sockets.
fn install_network_seccomp_filter_on_current_thread() -> std::result::Result<(), SandboxErr> {
// Build rule map.
let mut rules: BTreeMap<i64, Vec<SeccompRule>> = BTreeMap::new();
// Helper: insert an unconditional deny rule for the given syscall number.
let mut deny_syscall = |nr: i64| {
rules.insert(nr, vec![]); // empty rule vec = unconditional match
};
deny_syscall(libc::SYS_connect);
deny_syscall(libc::SYS_accept);
deny_syscall(libc::SYS_accept4);
deny_syscall(libc::SYS_bind);
deny_syscall(libc::SYS_listen);
deny_syscall(libc::SYS_getpeername);
deny_syscall(libc::SYS_getsockname);
deny_syscall(libc::SYS_shutdown);
deny_syscall(libc::SYS_sendto);
deny_syscall(libc::SYS_sendmsg);
deny_syscall(libc::SYS_sendmmsg);
deny_syscall(libc::SYS_recvfrom);
deny_syscall(libc::SYS_recvmsg);
deny_syscall(libc::SYS_recvmmsg);
deny_syscall(libc::SYS_getsockopt);
deny_syscall(libc::SYS_setsockopt);
deny_syscall(libc::SYS_ptrace);
// For `socket` we allow AF_UNIX (arg0 == AF_UNIX) and deny everything else.
let unix_only_rule = SeccompRule::new(vec![SeccompCondition::new(
0, // first argument (domain)
SeccompCmpArgLen::Dword,
SeccompCmpOp::Eq,
libc::AF_UNIX as u64,
)?])?;
rules.insert(libc::SYS_socket, vec![unix_only_rule]);
rules.insert(libc::SYS_socketpair, vec![]); // always deny (socketpair is mostly used for AF_UNIX pairs, so allowing it may be fine; open question)
let filter = SeccompFilter::new(
rules,
SeccompAction::Allow, // default allow
SeccompAction::Errno(libc::EPERM as u32), // when rule matches return EPERM
if cfg!(target_arch = "x86_64") {
TargetArch::x86_64
} else if cfg!(target_arch = "aarch64") {
TargetArch::aarch64
} else {
unimplemented!("unsupported architecture for seccomp filter");
},
)?;
let prog: BpfProgram = filter.try_into()?;
apply_filter(&prog)?;
Ok(())
}
#[cfg(test)]
mod tests_linux {
use super::*;
use crate::exec::process_exec_tool_call;
use crate::exec::ExecParams;
use crate::exec::SandboxType;
use crate::protocol::SandboxPolicy;
use std::sync::Arc;
use tempfile::NamedTempFile;
use tokio::sync::Notify;
#[allow(clippy::print_stdout)]
async fn run_cmd(cmd: &[&str], writable_roots: &[PathBuf], timeout_ms: u64) {
let params = ExecParams {
command: cmd.iter().map(|elm| elm.to_string()).collect(),
workdir: None,
timeout_ms: Some(timeout_ms),
};
let sandbox_policy =
SandboxPolicy::new_read_only_policy_with_writable_roots(writable_roots);
let ctrl_c = Arc::new(Notify::new());
let res =
process_exec_tool_call(params, SandboxType::LinuxSeccomp, ctrl_c, &sandbox_policy)
.await
.unwrap();
if res.exit_code != 0 {
println!("stdout:\n{}", res.stdout);
println!("stderr:\n{}", res.stderr);
panic!("exit code: {}", res.exit_code);
}
}
#[tokio::test]
async fn test_root_read() {
run_cmd(&["ls", "-l", "/bin"], &[], 200).await;
}
#[tokio::test]
#[should_panic]
async fn test_root_write() {
let tmpfile = NamedTempFile::new().unwrap();
let tmpfile_path = tmpfile.path().to_string_lossy();
run_cmd(
&["bash", "-lc", &format!("echo blah > {}", tmpfile_path)],
&[],
200,
)
.await;
}
#[tokio::test]
async fn test_dev_null_write() {
run_cmd(&["echo", "blah", ">", "/dev/null"], &[], 200).await;
}
#[tokio::test]
async fn test_writable_root() {
let tmpdir = tempfile::tempdir().unwrap();
let file_path = tmpdir.path().join("test");
run_cmd(
&[
"bash",
"-lc",
&format!("echo blah > {}", file_path.to_string_lossy()),
],
&[tmpdir.path().to_path_buf()],
// We have seen timeouts when running this test in CI on GitHub,
// so we are using a generous timeout until we can diagnose further.
1_000,
)
.await;
}
#[tokio::test]
#[should_panic(expected = "Sandbox(Timeout)")]
async fn test_timeout() {
run_cmd(&["sleep", "2"], &[], 50).await;
}
/// Helper that runs `cmd` under the Linux sandbox and asserts that the command
/// does NOT succeed (i.e. returns a non-zero exit code) **unless** the binary
/// is missing, in which case we silently treat it as an accepted skip so the
/// suite remains green on leaner CI images.
async fn assert_network_blocked(cmd: &[&str]) {
let params = ExecParams {
command: cmd.iter().map(|s| s.to_string()).collect(),
workdir: None,
// Give the tool a generous 2-second timeout so even slow DNS timeouts
// do not stall the suite.
timeout_ms: Some(2_000),
};
let sandbox_policy = SandboxPolicy::new_read_only_policy();
let ctrl_c = Arc::new(Notify::new());
let result =
process_exec_tool_call(params, SandboxType::LinuxSeccomp, ctrl_c, &sandbox_policy)
.await;
let (exit_code, stdout, stderr) = match result {
Ok(output) => (output.exit_code, output.stdout, output.stderr),
Err(CodexErr::Sandbox(SandboxErr::Denied(exit_code, stdout, stderr))) => {
(exit_code, stdout, stderr)
}
_ => {
panic!("expected sandbox denied error, got: {:?}", result);
}
};
dbg!(&stderr);
dbg!(&stdout);
dbg!(&exit_code);
// A completely missing binary exits with 127. Anything else should also
// be non-zero (EPERM from seccomp will usually bubble up as 1, 2, 13…).
// Only if the command exits 0 do we consider the sandbox breached.
if exit_code == 0 {
panic!(
"Network sandbox FAILED - {:?} exited 0\nstdout:\n{}\nstderr:\n{}",
cmd, stdout, stderr
);
}
}
#[tokio::test]
async fn sandbox_blocks_curl() {
assert_network_blocked(&["curl", "-I", "http://openai.com"]).await;
}
#[cfg(target_os = "linux")]
#[tokio::test]
async fn sandbox_blocks_wget() {
assert_network_blocked(&["wget", "-qO-", "http://openai.com"]).await;
}
#[tokio::test]
async fn sandbox_blocks_ping() {
// ICMP requires a raw socket, so this should be denied quickly with EPERM.
assert_network_blocked(&["ping", "-c", "1", "8.8.8.8"]).await;
}
#[tokio::test]
async fn sandbox_blocks_nc() {
// Zero-length connection attempt to localhost.
assert_network_blocked(&["nc", "-z", "127.0.0.1", "80"]).await;
}
#[tokio::test]
async fn sandbox_blocks_ssh() {
// Force ssh to attempt a real TCP connection but fail quickly. `BatchMode`
// avoids password prompts, and `ConnectTimeout` keeps the hang time low.
assert_network_blocked(&[
"ssh",
"-o",
"BatchMode=yes",
"-o",
"ConnectTimeout=1",
"github.com",
])
.await;
}
#[tokio::test]
async fn sandbox_blocks_getent() {
assert_network_blocked(&["getent", "ahosts", "openai.com"]).await;
}
#[tokio::test]
async fn sandbox_blocks_dev_tcp_redirection() {
// This syntax is only supported by bash and zsh. We try bash first.
// Fallback generic socket attempt using /bin/sh with bash-style /dev/tcp. Not
// all images ship bash, so we guard against 127 as well.
assert_network_blocked(&["bash", "-c", "echo hi > /dev/tcp/127.0.0.1/80"]).await;
}
}
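
`exec_linux` applies the sandbox on a dedicated thread and then flattens the join result. The thread-and-join part of that pattern can be sketched with the standard library alone. `run_isolated` is a hypothetical helper, `io::Error` stands in for `CodexErr`, and no Landlock or seccomp calls are made here.

```rust
use std::io;
use std::thread;

/// Run `work` on a dedicated OS thread and flatten the two failure modes
/// (the closure returned an error, or the thread panicked) into one result.
/// Thread-scoped restrictions applied inside `work` would not affect the
/// calling thread.
pub fn run_isolated<T, F>(work: F) -> io::Result<T>
where
    T: Send + 'static,
    F: FnOnce() -> io::Result<T> + Send + 'static,
{
    match thread::spawn(work).join() {
        // Thread finished and the closure succeeded.
        Ok(Ok(value)) => Ok(value),
        // Thread finished but the closure reported an error.
        Ok(Err(err)) => Err(err),
        // Thread panicked; surface the payload as an I/O error.
        Err(panic) => Err(io::Error::new(
            io::ErrorKind::Other,
            format!("thread join failed: {panic:?}"),
        )),
    }
}
```

Note that `join` blocks the calling thread until the work completes, which mirrors the `.join()` call in `exec_linux`.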

codex-rs/core/src/models.rs Normal file

@@ -0,0 +1,186 @@
use base64::Engine;
use serde::ser::Serializer;
use serde::Deserialize;
use serde::Serialize;
use crate::protocol::InputItem;
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ResponseInputItem {
Message {
role: String,
content: Vec<ContentItem>,
},
FunctionCallOutput {
call_id: String,
output: FunctionCallOutputPayload,
},
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ContentItem {
InputText { text: String },
InputImage { image_url: String },
OutputText { text: String },
}
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ResponseItem {
    Message {
        role: String,
        content: Vec<ContentItem>,
    },
    FunctionCall {
        name: String,
        // The Responses API returns the function call arguments as a *string* that contains
        // JSON, not as an already-parsed object. We keep it as a raw string here and let
        // Session::handle_function_call parse it into a Value. This exactly matches the
        // Chat Completions + Responses API behavior.
        arguments: String,
        call_id: String,
    },
    // NOTE: The input schema for `function_call_output` objects that clients send to the
    // OpenAI /v1/responses endpoint is NOT the same shape as the objects the server returns on the
    // SSE stream. When *sending*, `output` must be a plain string: the `success` flag is kept
    // only for local bookkeeping and is not sent. To ensure we serialize exactly the expected
    // shape we introduce a dedicated payload struct with a manual Serialize impl.
    FunctionCallOutput {
        call_id: String,
        output: FunctionCallOutputPayload,
    },
    #[serde(other)]
    Other,
}
impl From<ResponseInputItem> for ResponseItem {
    fn from(item: ResponseInputItem) -> Self {
        match item {
            ResponseInputItem::Message { role, content } => Self::Message { role, content },
            ResponseInputItem::FunctionCallOutput { call_id, output } => {
                Self::FunctionCallOutput { call_id, output }
            }
        }
    }
}
impl From<Vec<InputItem>> for ResponseInputItem {
    fn from(items: Vec<InputItem>) -> Self {
        Self::Message {
            role: "user".to_string(),
            content: items
                .into_iter()
                .filter_map(|c| match c {
                    InputItem::Text { text } => Some(ContentItem::InputText { text }),
                    InputItem::Image { image_url } => Some(ContentItem::InputImage { image_url }),
                    InputItem::LocalImage { path } => match std::fs::read(&path) {
                        Ok(bytes) => {
                            let mime = mime_guess::from_path(&path)
                                .first()
                                .map(|m| m.essence_str().to_owned())
                                .unwrap_or_else(|| "application/octet-stream".to_string());
                            let encoded = base64::engine::general_purpose::STANDARD.encode(bytes);
                            Some(ContentItem::InputImage {
                                image_url: format!("data:{};base64,{}", mime, encoded),
                            })
                        }
                        Err(err) => {
                            tracing::warn!(
                                "Skipping image {} - could not read file: {}",
                                path.display(),
                                err
                            );
                            None
                        }
                    },
                })
                .collect::<Vec<ContentItem>>(),
        }
    }
}
#[expect(dead_code)]
#[derive(Deserialize, Debug, Clone)]
pub struct FunctionCallOutputPayload {
    pub content: String,
    pub success: Option<bool>,
}
// `output` is sent as a plain string (no nested object) on both success and failure,
// mirroring the upstream TypeScript CLI. We implement that with a manual Serialize impl.
impl Serialize for FunctionCallOutputPayload {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // The upstream TypeScript CLI always serializes `output` as a *plain string* regardless
        // of whether the function call succeeded or failed. The boolean is purely informational
        // for local bookkeeping and is NOT sent to the OpenAI endpoint. Sending the nested object
        // form `{ content, success:false }` triggers the 400 we are still seeing. Mirror the JS CLI
        // exactly: always emit a bare string.
        serializer.serialize_str(&self.content)
    }
}
// Implement Display so callers can treat the payload like a plain string when logging or doing
// trivial substring checks in tests (existing tests call `.contains()` on the output). Display
// returns the raw `content` field.
impl std::fmt::Display for FunctionCallOutputPayload {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.content)
    }
}
impl std::ops::Deref for FunctionCallOutputPayload {
    type Target = str;
    fn deref(&self) -> &Self::Target {
        &self.content
    }
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn serializes_success_as_plain_string() {
        let item = ResponseInputItem::FunctionCallOutput {
            call_id: "call1".into(),
            output: FunctionCallOutputPayload {
                content: "ok".into(),
                success: None,
            },
        };
        let json = serde_json::to_string(&item).unwrap();
        let v: serde_json::Value = serde_json::from_str(&json).unwrap();
        // Success case -> output should be a plain string
        assert_eq!(v.get("output").unwrap().as_str().unwrap(), "ok");
    }
    #[test]
    fn serializes_failure_as_string() {
        let item = ResponseInputItem::FunctionCallOutput {
            call_id: "call1".into(),
            output: FunctionCallOutputPayload {
                content: "bad".into(),
                success: Some(false),
            },
        };
        let json = serde_json::to_string(&item).unwrap();
        let v: serde_json::Value = serde_json::from_str(&json).unwrap();
        assert_eq!(v.get("output").unwrap().as_str().unwrap(), "bad");
    }
}
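
To make the wire shape concrete, here is a std-only sketch of the JSON that the manual `Serialize` impl and the `#[serde(tag = "type", rename_all = "snake_case")]` attribute should produce together for a `FunctionCallOutput` item. The `Payload` struct and the `escape_json` and `render_function_call_output` helpers are hypothetical stand-ins written for illustration; the real code relies on serde rather than hand-built strings.

```rust
/// Hypothetical stand-in for `FunctionCallOutputPayload`.
struct Payload {
    content: String,
    success: Option<bool>,
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Renders the item the way the diff's tests expect: `output` is a bare
/// string, and `success` never reaches the wire.
fn render_function_call_output(call_id: &str, payload: &Payload) -> String {
    // Read but deliberately not serialized: local bookkeeping only.
    let _ = payload.success;
    format!(
        "{{\"type\":\"function_call_output\",\"call_id\":{},\"output\":{}}}",
        escape_json(call_id),
        escape_json(&payload.content)
    )
}
```

A payload with `success: Some(false)` and one with `success: None` therefore render identically apart from their `content`, which is what `serializes_failure_as_string` checks.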

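The `InputItem::LocalImage` branch in `models.rs` turns file bytes into a `data:` URL, falling back to `application/octet-stream` when no MIME type can be guessed. The sketch below reproduces that string construction using only the standard library. The `base64_encode` and `data_url` functions are hypothetical helpers standing in for the `base64` and `mime_guess` crates that the diff actually uses, and the MIME type is passed in by the caller instead of being guessed from a path.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard (RFC 4648) base64 with `=` padding.
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Builds the `data:<mime>;base64,<payload>` URL; `None` selects the same
/// fallback MIME type as the diff.
fn data_url(mime: Option<&str>, bytes: &[u8]) -> String {
    let mime = mime.unwrap_or("application/octet-stream");
    format!("data:{};base64,{}", mime, base64_encode(bytes))
}
```

For example, `data_url(Some("image/png"), &bytes)` yields a string that can be placed directly in the `image_url` field of `ContentItem::InputImage`.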
Some files were not shown because too many files have changed in this diff