Compare commits


156 Commits

Author SHA1 Message Date
Tiffany Citra
980778f529 Fix sandbox retry approvals 2025-10-15 17:04:10 -07:00
Kevin Alwell
6e7d4a4b45 Should fix #751 by adding verbose error message. 2025-04-30 14:55:15 -04:00
Kevin Alwell
841e19b05d fix typescript error 2025-04-29 12:43:52 -04:00
Kevin Alwell
f466a73428 Omit default to fix type errors 2025-04-29 12:30:19 -04:00
Kevin Alwell
1a868d35f3 fix typecheck error 2025-04-29 12:22:41 -04:00
Kevin Alwell
9c563054e0 TypeScript bug bashing 2025-04-29 12:17:31 -04:00
Kevin Alwell
ca7204537c TypeScript bug bashing 2025-04-29 12:14:46 -04:00
Kevin Alwell
132e87cc8c TypeScript bug bashing 2025-04-29 12:13:38 -04:00
Kevin Alwell
8e80716169 fix tests 2025-04-29 12:10:54 -04:00
Kevin Alwell
d28aedb07b fix pipeline errors 2025-04-29 12:05:16 -04:00
Kevin Alwell
586ee0ec71 Merge branch 'main' into issue-726 2025-04-29 11:54:34 -04:00
Kevin Alwell
9065e61455 Ensure disableResponseStorage flag is respected 2025-04-29 11:48:47 -04:00
Kevin Alwell
6686f28338 Ensure disableResponseStorage flag is respected 2025-04-29 11:25:07 -04:00
Rashim
892242ef7c feat: add --reasoning CLI flag (#314)
This PR adds a new CLI flag: `--reasoning`, which allows users to
customize the reasoning effort level (`low`, `medium`, or `high`) used
by OpenAI's `o` models.
By introducing the `--reasoning` flag, users gain more flexibility when
working with the models. It enables optimization for either speed or
depth of reasoning, depending on specific use cases.
This PR resolves #107

- **Flag**: `--reasoning`
- **Accepted Values**: `low`, `medium`, `high`
- **Default Behavior**: If not specified, the model uses the default
reasoning level.

## Example Usage

```bash
codex --reasoning=low "Write a simple function to calculate factorial"
```

---------

Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: yashrwealthy <yash.rastogi@wealthy.in>
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-29 07:30:49 -07:00
Kevin Alwell
3818df7ba4 Fixes issue #726 by adding config to configToSave object 2025-04-29 10:12:55 -04:00
Fouad Matin
19928bc257 [codex-rs] fix: exit code 1 if no api key (#697) 2025-04-28 21:42:06 -07:00
Michael Bolin
b9bba09819 fix: eliminate runtime dependency on patch(1) for apply_patch (#718)
When processing an `apply_patch` tool call, we were already computing
the new file content in order to compute the unified diff. Before this
PR, we were shelling out to `patch(1)` to apply the unified diff once
the user accepted the change, but this updates the code to just retain
the new file content and use it to write the file when the user accepts.
This simplifies deployment because it no longer assumes `patch(1)` is on
the host.

Note this change is internal to the Codex agent and does not affect
`protocol.rs`.
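For readers skimming the diff, a minimal sketch of the idea — the real change lives in the Rust agent; the shape and names here are hypothetical illustrations:

```typescript
import { writeFileSync } from "node:fs";

// Hypothetical shape: the content was already computed when the unified diff was built.
interface PendingPatch {
  path: string;
  unifiedDiff: string; // shown to the user for review
  newContent: string; // retained so we never need patch(1)
}

function applyAcceptedPatch(patch: PendingPatch): void {
  // On acceptance, write the retained content directly; no external tool required.
  writeFileSync(patch.path, patch.newContent, "utf8");
}
```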
2025-04-28 21:15:41 -07:00
Thibault Sottiaux
d09dbba7ec feat: lower default retry wait time and increase number of tries (#720)
In total we now guarantee that we will wait for at least 60s before
giving up.
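As a back-of-the-envelope check of that guarantee, a sketch; the base delay and try count are assumptions, not the actual constants:

```typescript
// Assumed values for illustration; the real constants live in the Rust client.
const INITIAL_DELAY_MS = 1_000;
const MAX_TRIES = 7;

// Sum of an exponential back-off schedule: 1s + 2s + 4s + ... = (2^n - 1) seconds.
function minimumTotalWaitMs(): number {
  let total = 0;
  for (let attempt = 0; attempt < MAX_TRIES; attempt++) {
    total += INITIAL_DELAY_MS * 2 ** attempt;
  }
  return total; // 127s with these sample values, clearing the 60s floor
}
```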

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-28 21:11:30 -07:00
Michael Bolin
e79549f039 feat: add debug landlock subcommand comparable to debug seatbelt (#715)
This PR adds a `debug landlock` subcommand to the Codex CLI for testing
how Codex would execute a command using the specified sandbox policy.

Built and ran this code in the `rust:latest` Docker container. In the
container, hitting the network with vanilla `curl` succeeds:

```
$ curl google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
```

whereas this fails, as expected:

```
$ cargo run -- debug landlock -s network-restricted -- curl google.com
curl: (6) getaddrinfo() thread failed to start
```
2025-04-28 16:37:05 -07:00
Michael Bolin
e7ad9449ea feat: make it possible to set disable_response_storage = true in config.toml (#714)
https://github.com/openai/codex/pull/642 introduced support for the
`--disable-response-storage` flag, but if you are a ZDR customer, it is
tedious to set this every time, so this PR makes it possible to set this
once in `config.toml` and be done with it.

Incidentally, this tidies things up such that now `init_codex()` takes
only one parameter: `Config`.
2025-04-28 15:39:34 -07:00
Michael Bolin
cca1122ddc fix: make the TUI the default/"interactive" CLI in Rust (#711)
Originally, the `interactive` crate was going to be a placeholder for
building out a UX that was comparable to that of the existing TypeScript
CLI. Though after researching how Ratatui works, that seems difficult to
do because it is designed around the idea that it will redraw the full
screen buffer each time (and so any scrolling should be "internal" to
your Ratatui app) whereas the TypeScript CLI expects to render the full
history of the conversation every time(*) (which is why you can use your
terminal scrollbar to scroll it).

While it is possible to use Ratatui in a way that acts more like what
the TypeScript CLI is doing, it is awkward and seemingly results in
tedious code, so I think we should abandon that approach. As such, this
PR deletes the `interactive/` folder and the code that depended on it.

Further, since we added support for mousewheel scrolling in the TUI in
https://github.com/openai/codex/pull/641, it certainly feels much better
and the need for scroll support via the terminal scrollbar is greatly
diminished. This is now a more appropriate default UX for the
"multitool" CLI.

(*) Incidentally, I haven't verified this, but I think this results in
O(N^2) work in rendering, which seems potentially problematic for long
conversations.
2025-04-28 13:46:22 -07:00
Michael Bolin
40460faf2a fix: tighten up check for /usr/bin/sandbox-exec (#710)
* In both TypeScript and Rust, we now invoke `/usr/bin/sandbox-exec`
explicitly rather than whatever `sandbox-exec` happens to be on the
`PATH`.
* Changed `isSandboxExecAvailable` to use `access()` rather than
`command -v` so that:
  * We only do the check once over the lifetime of the Codex process.
  * The check is specific to `/usr/bin/sandbox-exec`.
  * We now do a syscall rather than incur the overhead of spawning a
process, dealing with timeouts, etc.

I think there is still room for improvement here where we should move
the `isSandboxExecAvailable` check earlier in the CLI, ideally right
after we do arg parsing to verify that we can provide the Seatbelt
sandbox if that is what the user has requested.
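A sketch of what such a memoized `access()` check could look like; the actual helper may differ:

```typescript
import { access, constants } from "node:fs/promises";

// Cache the promise so the syscall happens at most once per process lifetime.
let sandboxExecCheck: Promise<boolean> | undefined;

export function isSandboxExecAvailable(): Promise<boolean> {
  sandboxExecCheck ??= access("/usr/bin/sandbox-exec", constants.X_OK).then(
    () => true, // binary exists and is executable
    () => false, // missing or not executable; caller falls back accordingly
  );
  return sandboxExecCheck;
}
```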
2025-04-28 13:42:04 -07:00
Michael Bolin
38575ed8aa fix: increase timeout of test_writable_root (#713)
Although we made some promising fixes in
https://github.com/openai/codex/pull/662, we are still seeing some
flakiness in `test_writable_root()`. If this continues to flake with the
more generous timeout, we should try something other than simply
increasing the timeout.
2025-04-28 13:09:27 -07:00
Michael Bolin
77e2918049 fix: drop d as keyboard shortcut for scrolling in the TUI (#704)
The existing `b` and `space` are sufficient and `d` and `u` default to
half-page scrolling in `less`, so the way we supported `d` and `u`
wasn't faithful to that, anyway:

https://man7.org/linux/man-pages/man1/less.1.html

If we decide to bring `d` and `u` back, they should probably match
`less`?
2025-04-28 10:39:58 -07:00
Thibault Sottiaux
fa5fa8effc fix: only allow running without sandbox if explicitly marked in safe container (#699)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-28 07:48:38 -07:00
Michael Bolin
4eda4dd772 feat: load defaults into Config and introduce ConfigOverrides (#677)
This changes how instantiating `Config` works and also adds
`approval_policy` and `sandbox_policy` as fields. The idea is:

* All fields of `Config` have appropriate default values.
* `Config` is initially loaded from `~/.codex/config.toml`, so values in
`config.toml` will override those defaults.
* Clients must instantiate `Config` via
`Config::load_with_overrides(ConfigOverrides)` where `ConfigOverrides`
has optional overrides that are expected to be settable based on CLI
flags.

The `Config` should be defined early in the program and then passed
down. Now functions like `init_codex()` take fewer individual parameters
because they can just take a `Config`.

Also, `Config::load()` used to fail silently if `~/.codex/config.toml`
had a parse error and fell back to the default config. This seemed
really bad because it wasn't clear why the values in my `config.toml`
weren't getting picked up. I changed things so that
`load_with_overrides()` returns `Result<Config>` and verified that the
various CLIs print a reasonable error if `config.toml` is malformed.

Finally, I also updated the TUI to show which **sandbox** value is being
used, as we do for other key values like **model** and **approval**.
This was also a reminder that the various values of `--sandbox` are
honored on Linux but not macOS today, so I added some TODOs about fixing
that.
2025-04-27 21:47:50 -07:00
Thibault Sottiaux
e9d16d3c2b fix: check if sandbox-exec is available (#696)
- Introduce `isSandboxExecAvailable()` helper and tidy import ordering
in `handle-exec-command.ts`.
- Add runtime check for the `sandbox-exec` binary on macOS; fall back to
`SandboxType.NONE` with a warning if it’s missing, preventing crashes.

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
Co-authored-by: Fouad Matin <fouad@openai.com>
2025-04-27 17:04:47 -07:00
Fouad Matin
523996b5cb fix: /diff should include untracked files (#686) 2025-04-26 12:43:51 -07:00
Tomas Cupr
bc500d3009 feat: user config api key (#569)
Adds support for reading OPENAI_API_KEY (and other variables) from a
user‑wide dotenv file (~/.codex.config). Precedence order is now:
  1. explicit environment variable
  2. project‑local .env (loaded earlier)
  3. ~/.codex.config
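A sketch of that precedence order; `parseDotenv` stands in for a real dotenv parser:

```typescript
import { existsSync, readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

function resolveApiKey(
  parseDotenv: (src: string) => Record<string, string>,
): string | undefined {
  // Levels 1 and 2: the explicit env var wins; the project-local .env is
  // assumed to have been merged into process.env earlier.
  if (process.env["OPENAI_API_KEY"]) {
    return process.env["OPENAI_API_KEY"];
  }
  // Level 3: user-wide dotenv fallback.
  const userConfig = join(homedir(), ".codex.config");
  if (existsSync(userConfig)) {
    return parseDotenv(readFileSync(userConfig, "utf8"))["OPENAI_API_KEY"];
  }
  return undefined;
}
```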

Also adds a regression test that ensures the multiline editor correctly
handles cases where printable text and the CSI‑u Shift+Enter sequence
arrive in the same input chunk.

House‑kept with Prettier; removed stray temp.json artifact.
2025-04-26 10:13:30 -07:00
moppywhip
9b0ccf9aeb fix: duplicate messages in quiet mode (#680)
Addressing #600 and #664 (partially)

## Bug
Codex was staging duplicate items in the output when the same
response item appeared in both streaming events. Specifically:

1. Items would be staged once when received as a
`response.output_item.done` event
2. The same items would be staged again when included in the final
`response.completed` payload

This duplication would result in each message being sent several times
in the quiet mode output.

## Changes
- Added a Set (`alreadyStagedItemIds`) to track items that have already
been staged
- Modified the `stageItem` function to check if an item's ID is already
in this set before staging it
- Added a regression test (`agent-dedupe-items.test.ts`) that verifies
items with the same ID are only staged once

## Testing
Like other tests, the included test creates a mock OpenAI stream that
emits the same message twice (once as an incremental event and once in
the final response) and verifies the item is only passed to `onItem`
once.
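A condensed sketch of the deduplication; the names follow the PR description:

```typescript
type ResponseItem = { id: string };

// Tracks IDs that have already been staged for output.
const alreadyStagedItemIds = new Set<string>();

function stageItem(item: ResponseItem, onItem: (item: ResponseItem) => void): void {
  if (alreadyStagedItemIds.has(item.id)) {
    return; // already emitted via response.output_item.done; skip the response.completed copy
  }
  alreadyStagedItemIds.add(item.id);
  onItem(item);
}
```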
2025-04-26 09:14:50 -07:00
Michael Bolin
b0ba65a936 fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.

This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.

For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.

This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
Fouad Matin
103093f793 bump(version): 0.1.2504251709 (#660)
## `0.1.2504251709`

### 🚀 Features

- Add openai model info configuration (#551)
- Added provider to run quiet mode function (#571)
- Create parent directories when creating new files (#552)
- Print bug report URL in terminal instead of opening browser (#510)
(#528)
- Add support for custom provider configuration in the user config
(#537)
- Add support for OpenAI-Organization and OpenAI-Project headers (#626)
- Add specific instructions for creating API keys in error msg (#581)
- Enhance toCodePoints to prevent potential unicode 14 errors (#615)
- More native keyboard navigation in multiline editor (#655)
- Display error on selection of invalid model (#594)

### 🪲 Bug Fixes

- Model selection (#643)
- Nits in apply patch (#640)
- Input keyboard shortcuts (#676)
- `apply_patch` unicode characters (#625)
- Don't clear turn input before retries (#611)
- More loosely match context for apply_patch (#610)
- Update bug report template - there is no --revision flag (#614)
- Remove outdated copy of text input and external editor feature (#670)
- Remove unreachable "disableResponseStorage" logic flow introduced in
#543 (#573)
- Non-openai mode - fix for gemini content: null, fix 429 to throw
before stream (#563)
- Only allow going up in history when not already in history if input is
empty (#654)
- Do not grant "node" user sudo access when using run_in_container.sh
(#627)
- Update scripts/build_container.sh to use pnpm instead of npm (#631)
- Update lint-staged config to use pnpm --filter (#582)
- Non-openai mode - don't default temp and top_p (#572)
- Fix error catching when checking for updates (#597)
- Close stdin when running an exec tool call (#636)
2025-04-25 17:15:40 -07:00
Fouad Matin
3f4762d969 fix: input keyboard shortcuts (#676)
Fixes keyboard shortcuts:
- ctrl+a/e
- opt+arrow keys
2025-04-25 16:58:09 -07:00
Michael Bolin
f3ee933a74 ci: build Rust on Windows as part of CI (#665)
While we aren't ready to provide Windows binaries of Codex CLI, it seems
like a good idea to ensure we guard platform-specific code
appropriately.
2025-04-25 16:22:16 -07:00
Thibault Sottiaux
44d68f9dbf fix: remove outdated copy of text input and external editor feature (#670)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 16:11:16 -07:00
Misha Davidov
15bf5ca971 fix: handling weird unicode characters in apply_patch (#674)
I ❤ unicode
2025-04-25 16:01:58 -07:00
Michael Bolin
c18f1689a9 fix: small fixes so Codex compiles on Windows (#673)
Small fixes required:

* `ExitStatusExt` differs because UNIX expects exit code to be `i32`
whereas Windows does `u32`
* Marking a file "executable only by owner" is a bit more involved on
Windows. We just do something approximate for now (and add a TODO) to
get things compiling.

I created this PR on my personal Windows machine and `cargo test` and
`cargo clippy` succeed. Once this is in, I'll rebase
https://github.com/openai/codex/pull/665 on top so Windows stays fixed!
2025-04-25 15:58:44 -07:00
Michael Bolin
ebd2ae4abd fix: remove dependency on expanduser crate (#667)
In putting up https://github.com/openai/codex/pull/665, I discovered
that the `expanduser` crate does not compile on Windows. Looking into
it, we do not seem to need it because we were only using it with a value
that was passed in via a command-line flag, so the shell expands `~` for
us before we see it, anyway. (I changed the type in `Cli` from `String`
to `PathBuf`, to boot.)

If we do need this sort of functionality in the future,
https://docs.rs/shellexpand/latest/shellexpand/fn.tilde.html seems
promising.
2025-04-25 14:20:21 -07:00
Michael Bolin
9c3ebac3b7 fix: flipped the sense of Prompt.store in #642 (#663)
I got the sense of this wrong in
https://github.com/openai/codex/pull/642. In that PR, I made
`--disable-response-storage` work, but broke the default case.

With this fix, both cases work and I think the code is a bit cleaner.
2025-04-25 13:41:17 -07:00
Parker Thompson
7d9de34bc7 [codex-rs] Improve linux sandbox timeouts (#662)
* Fixes flaking rust unit test
* Adds explicit sandbox exec timeout handling
2025-04-25 12:56:20 -07:00
Parker Thompson
55e25abf78 [codex-rs] CI performance for rust (#639)
* Refactors the rust-ci into a matrix build
* Adds directory caching for the build artifacts
* Adds workflow dispatch for manual testing
2025-04-25 12:44:03 -07:00
Michael Bolin
b323d10ea7 feat: add ZDR support to Rust implementation (#642)
This adds support for the `--disable-response-storage` flag across our
multiple Rust CLIs to support customers who have opted into Zero-Data
Retention (ZDR). The analogous changes to the TypeScript CLI were:

* https://github.com/openai/codex/pull/481
* https://github.com/openai/codex/pull/543

For a client using ZDR, `previous_response_id` will never be available,
so the `input` field of an API request must include the full transcript
of the conversation thus far. As such, this PR changes the type of
`Prompt.input` from `Vec<ResponseInputItem>` to `Vec<ResponseItem>`.

Practically speaking, `ResponseItem` was effectively a "superset" of
`ResponseInputItem` already. The main difference for us is that
`ResponseItem` includes the `FunctionCall` variant that we have to
include as part of the conversation history in the ZDR case.

Another key change in this PR is modifying `try_run_turn()` so that it
returns the `Vec<ResponseItem>` for the turn in addition to the
`Vec<ResponseInputItem>` produced by `try_run_turn()`. This is because
the caller of `run_turn()` needs to record the `Vec<ResponseItem>` when
ZDR is enabled.

To that end, this PR introduces `ZdrTranscript` (and adds
`zdr_transcript: Option<ZdrTranscript>` to `struct State` in `codex.rs`)
to take responsibility for maintaining the conversation transcript in
the ZDR case.
2025-04-25 12:08:18 -07:00
Michael Bolin
dc7b83666a feat(tui-rs): add support for mousewheel scrolling (#641)
It is intuitive to try to scroll the conversation history using the
mouse in the TUI, but prior to this change, we only supported scrolling
via keyboard events.

This PR enables mouse capture upon initialization (and disables it on
exit) such that we get `ScrollUp` and `ScrollDown` events in
`codex-rs/tui/src/app.rs`. I initially mapped each event to scrolling by
one line, but that felt sluggish. I decided to introduce
`ScrollEventHelper` so we could debounce scroll events and measure the
number of scroll events in a 100ms window to determine the "magnitude"
of the scroll event. I put in a basic heuristic to start, but perhaps
someone more motivated can play with it over time.

`ScrollEventHelper` takes care of handling the atomic fields and thread
management to ensure an `AppEvent::Scroll` event is pumped back through
the event loop at the appropriate time with the accumulated delta.
2025-04-25 12:01:52 -07:00
oai-ragona
d7a40195e6 [codex-rs] Reliability pass on networking (#658)
We currently see a behavior that looks like this:
```
2025-04-25T16:52:24.552789Z  WARN codex_core::codex: stream disconnected - retrying turn (1/10 in 232ms)...
codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 1/10 in 232ms…" }
2025-04-25T16:52:54.789885Z  WARN codex_core::codex: stream disconnected - retrying turn (2/10 in 418ms)...
codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 2/10 in 418ms…" }
```

This PR contains a few different fixes that attempt to resolve/improve
this:
1. **Remove overall client timeout.** I think
[this](https://github.com/openai/codex/pull/658/files#diff-c39945d3c42f29b506ff54b7fa2be0795b06d7ad97f1bf33956f60e3c6f19c19L173)
is perhaps the big fix -- it looks to me like this was actually timing
out even if events were still coming through, and that was causing a
disconnect right in the middle of a healthy stream.
2. **Cap response sizes.** We were frequently sending MUCH larger
responses than the upstream typescript `codex`, and that was definitely
not helping. [Fix
here](https://github.com/openai/codex/pull/658/files#diff-d792bef59aa3ee8cb0cbad8b176dbfefe451c227ac89919da7c3e536a9d6cdc0R21-R26)
for that one.
3. **Much higher idle timeout.** Our idle timeout value was much lower
than typescript.
4. **Sub-linear backoff.** We were much too aggressively backing off,
[this](https://github.com/openai/codex/pull/658/files#diff-5d5959b95c6239e6188516da5c6b7eb78154cd9cfedfb9f753d30a7b6d6b8b06R30-R33)
makes it sub-exponential but maintains the jitter and such.

I was seeing that `stream error: stream disconnected` behavior
constantly, and anecdotally I can no longer reproduce. It feels much
snappier.
2025-04-25 11:44:22 -07:00
Tomas Cupr
4760aa1eb9 perf: optimize token streaming with balanced approach (#635)
- Replace setTimeout(10ms) with queueMicrotask for immediate processing
- Add minimal 3ms setTimeout for rendering to maintain readable UX
- Reduces per-token delay while preserving streaming experience
- Add performance test to verify optimization works correctly
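A sketch of the two-tier scheduling described above; the surrounding buffer/render plumbing is assumed:

```typescript
let renderScheduled = false;

function onToken(
  token: string,
  appendToBuffer: (t: string) => void,
  render: () => void,
): void {
  // Immediate processing: queueMicrotask replaces the old setTimeout(10ms) stall.
  queueMicrotask(() => appendToBuffer(token));
  // Rendering keeps a minimal 3ms delay so streamed output stays readable.
  if (!renderScheduled) {
    renderScheduled = true;
    setTimeout(() => {
      renderScheduled = false;
      render();
    }, 3);
  }
}
```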

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 10:49:38 -07:00
Thibault Sottiaux
d401283a41 feat: more native keyboard navigation in multiline editor (#655)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 10:35:30 -07:00
rumple
69ce06d2f8 feat: Add support for OpenAI-Organization and OpenAI-Project headers (#626)
Added support for OpenAI-Organization and OpenAI-Project headers for
OpenAI API calls.

This is for #74
2025-04-25 09:52:42 -07:00
Thibault Sottiaux
866626347b fix: only allow going up in history when not already in history if input is empty (#654)
- cleanup below-input help to be "ctrl+c to exit | "/" to see commands |
enter to send" now that we have command autocompletion
- minor other drive-by code cleanups

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 09:39:24 -07:00
Pulipaka Sai Krishna
2759ff39da fix: model selection (#643)
fix: pass correct selected model in ModelOverlay

The ModelOverlay component was incorrectly passing the current model
instead of the newly selected model to its onSelect callback. This
prevented model changes from being applied properly.

The fix ensures that when a user selects a new model, the parent
component receives the correct newly selected model value, allowing
model changes to work as intended.
2025-04-25 09:38:05 -07:00
Luci
3fe7e53327 fix: nits in apply patch (#640)
## Description

Fix a nit in `apply patch`, potentially improving performance slightly.
2025-04-25 07:27:48 -07:00
Luci
1ef8e8afd3 docs: provider config (#653)
close: #651

Hi! @tibo-openai 👋 Could you share some great examples of
`instructions.md` files? Thanks!

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-25 07:25:32 -07:00
Luci
a9ecb2efce chore: upgrade prettier to v3 (#644)
## Description

This PR addresses the following improvements:

**Unify Prettier Version**: Currently, the Prettier versions used in
`/package.json` and `/codex-cli/package.json` differ. In this PR,
we're updating both to use Prettier v3.

- Prettier v3 introduces improved support for JavaScript and TypeScript.
(e.g. the formatting scenario shown in the image below. This is more
aligned with the TypeScript indentation standard).

<img width="1126" alt="image"
src="https://github.com/user-attachments/assets/6e237eb8-4553-4574-b336-ed9561c55370"
/>

**Add Prettier Auto-Formatting in lint-staged**: We've added a step to
automatically run prettier --write on JavaScript and TypeScript files as
part of the lint-staged process, before the ESLint checks.

- This will help ensure that all committed code is properly formatted
according to the project's Prettier configuration.
2025-04-25 07:21:50 -07:00
Michael Bolin
bfe6fac463 fix: close stdin when running an exec tool call (#636)
We were already doing this in the TypeScript version, but forgot to
bring this over to Rust:


c38c2a59c7/codex-cli/src/utils/agent/sandbox/raw-exec.ts (L76-L78)
2025-04-24 18:06:08 -07:00
Michael Bolin
6a9c9f4b6c fix: add RUST_BACKTRACE=full when running cargo test in CI (#638)
This should provide more information in the event of a failure.
2025-04-24 18:05:56 -07:00
Michael Bolin
5cdcbfa9b4 fix: only run rust-ci.yml on PRs that modify files in codex-rs (#637)
The `rust-ci.yml` build appears to be a bit flaky (we're looking into
it...), so to save TypeScript contributors some noise, restrict the
`rust-ci.yml` job so that it only runs on PRs that touch files in
`codex-rs/`.
2025-04-24 17:59:35 -07:00
Luci
c38c2a59c7 fix(utils): save config (#578)
## Description

When `saveConfig` is called, the project doc is incorrectly saved into
user instructions. This change ensures that only user instructions are
saved to `instructions.md` during saveConfig, preventing data
corruption.

close: #576

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-24 17:32:33 -07:00
Michael Bolin
58f0e5ab74 feat: introduce codex_execpolicy crate for defining "safe" commands (#634)
As described in detail in `codex-rs/execpolicy/README.md` introduced in
this PR, `execpolicy` is a tool that lets you define a set of _patterns_
used to match [`execv(3)`](https://linux.die.net/man/3/execv)
invocations. When a pattern is matched, `execpolicy` returns the parsed
version in a structured form that is amenable to static analysis.

The primary use case is to define patterns that match commands that
should be auto-approved by a tool such as Codex. This supports a richer
pattern-matching mechanism than the prefix matching we have done to
date, e.g.:


5e40d9d221/codex-cli/src/approvals.ts (L333-L354)

Note we are still playing with the API and the `system_path` option in
particular still needs some work.
2025-04-24 17:14:47 -07:00
nvp159
5e40d9d221 feat(bug-report): print bug report URL in terminal instead of opening browser (#510) (#528)
Solves #510 
This PR changes the `/bug` command to print the URL into the terminal
(so it works in headless sessions) instead of trying to open a browser.

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-24 17:00:14 -07:00
sooraj
36a5a02d5c feat: display error on selection of invalid model (#594)
Up-to-date version of #78

Fixes #32

addressed requested changes @tibo-openai :) made sense to me


though, the previous rationale for passing the state up assumed there
could be a future need for a shared state, with all available models
exposed to the parent
2025-04-24 16:56:00 -07:00
Michael Bolin
bb2d411043 fix: update scripts/build_container.sh to use pnpm instead of npm (#631)
I suspect this is why some contributors kept accidentally including a
new `codex-cli/package-lock.json` in their PRs.

Note the `Dockerfile` still uses `npm` instead of `pnpm`, but that
appears to be fine. (Probably nicer to globally install as few things as
possible in the image.)
2025-04-24 16:38:57 -07:00
oai-ragona
b34ed2ab83 [codex-rs] More fine-grained sandbox flag support on Linux (#632)
##### What/Why
This PR makes it so that in Linux we actually respect the different
types of `--sandbox` flag, such that users can apply network and
filesystem restrictions in combination (currently the only supported
behavior), or just pick one or the other.

We should add similar support for OSX in a future PR.

##### Testing
From Linux devbox, updated tests to use more specific flags:
```
test linux::tests_linux::sandbox_blocks_ping ... ok
test linux::tests_linux::sandbox_blocks_getent ... ok
test linux::tests_linux::test_root_read ... ok
test linux::tests_linux::test_dev_null_write ... ok
test linux::tests_linux::sandbox_blocks_dev_tcp_redirection ... ok
test linux::tests_linux::sandbox_blocks_ssh ... ok
test linux::tests_linux::test_writable_root ... ok
test linux::tests_linux::sandbox_blocks_curl ... ok
test linux::tests_linux::sandbox_blocks_wget ... ok
test linux::tests_linux::sandbox_blocks_nc ... ok
test linux::tests_linux::test_root_write - should panic ... ok
```

##### Todo
- [ ] Add negative tests (e.g. confirm you can hit the network if you
configure filesystem only restrictions)
2025-04-24 15:33:45 -07:00
Michael Bolin
61805a832d fix: do not grant "node" user sudo access when using run_in_container.sh (#627)
This exploration came out of my review of
https://github.com/openai/codex/pull/414.

`run_in_container.sh` runs Codex in a Docker container like so:


bd1c3deed9/codex-cli/scripts/run_in_container.sh (L51-L58)

But then runs `init_firewall.sh` to set up the firewall to restrict
network access.

Previously, we did this by adding `/usr/local/bin/init_firewall.sh` to
the container and adding a special rule in `/etc/sudoers.d` so the
unprivileged user (`node`) could run the privileged `init_firewall.sh`
script to open up the firewall for `api.openai.com`:


31d0d7a305/codex-cli/Dockerfile (L51-L56)

Though I believe this is unnecessary, as we can use `docker exec --user
root` from _outside_ the container to run
`/usr/local/bin/init_firewall.sh` as `root` without adding a special
case in `/etc/sudoers.d`.

This appears to work as expected, as I tested it by doing the following:

```
./codex-cli/scripts/build_container.sh
./codex-cli/scripts/run_in_container.sh 'what is the output of `curl https://www.openai.com`'
```

This was a bit funny because in some of my runs, Codex wasn't convinced
it had network access, so I had to convince it to try the `curl`
request:


![image](https://github.com/user-attachments/assets/80bd487c-74e2-4cd3-aa0f-26a6edd8d3f7)

As you can see, when it ran `curl -s https\://www.openai.com`, it hit a
connection failure, so the network policy appears to be working as
intended.

Note this PR also removes `sudo` from the `apt-get install` list in the
`Dockerfile`.
2025-04-24 14:25:02 -07:00
Fouad Matin
bd1c3deed9 update: readme (#630)
- mention support for ZDR
- codex open source fund
2025-04-24 14:05:26 -07:00
Michael Bolin
31d0d7a305 feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:

Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.

To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:

- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.

Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implementation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
Misha Davidov
acc4acc81e fix: apply_patch unicode characters (#625)
fuzzy-er matching for apply_patch to handle U+00A0 and U+202F spaces.
2025-04-24 13:04:37 -07:00
Luci
e84fa6793d fix(agent-loop): notify type (#608)
## Description

The `as AppConfig` type assertion in the constructor may introduce
potential type safety risks. Removing the assertion and making `notify`
an optional parameter could enhance type robustness and prevent
unexpected runtime errors.

close: #605
2025-04-24 11:08:52 -07:00
Asa
d1c0d5e683 feat: update README and config to support custom providers with API k… (#577)
When using a non-built-in provider with the `--provider` option, users
are prompted:

```
Set the environment variable <provider>_API_KEY and re-run this command.
You can create a <provider>_API_KEY in the <provider> dashboard.
```

However, many users are confused because, even after correctly setting
`<provider>_API_KEY`, authentication may still fail unless
`OPENAI_API_KEY` is _also_ present in the environment. This is not
intuitive and leads to ambiguity about which API key is actually
required and used as a fallback, especially when using custom or
third-party (non-listed) providers.

Furthermore, the original README/documentation did not mention the
requirement to set `<provider>_BASE_URL` for non-built-in providers,
which is necessary for proper client behavior. This omission made the
configuration process more difficult for users trying to integrate with
custom endpoints.
2025-04-24 11:08:19 -07:00
Luci
6d68a90064 feat: enhance toCodePoints to prevent potential unicode 14 errors (#615)
## Description

`Array.from` may fail when handling certain characters newly added in
Unicode 14. Where possible, it seems better to use `Intl.Segmenter` for
more reliable processing.


![image](https://github.com/user-attachments/assets/2cbd779d-69d3-448e-b76a-d793cb639d96)
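A hedged sketch of the approach — grapheme segmentation with a plain `Array.from` fallback:

```typescript
function toCodePoints(input: string): Array<string> {
  // Intl.Segmenter handles grapheme clusters that Array.from can mis-split.
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return Array.from(segmenter.segment(input), (s) => s.segment);
  }
  return Array.from(input); // fallback for runtimes without Intl.Segmenter
}
```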
2025-04-24 10:49:18 -07:00
Ilya Kamen
1008e1b9a0 fix: update bug report template - there is no --revision flag (#614)
I think there was a wrong word; --revision seems not to exist in help
and does nothing.
2025-04-24 10:48:42 -07:00
Luci
257167a034 fix: lint-staged error (#617)
## Description

In a recent commit, the command `"cd codex-cli && pnpm run typecheck"`
was updated to `"pnpm --filter @openai/codex run typecheck"`.

However, this change introduces an issue: 
when running `pnpm --filter @openai/codex run typecheck`, it executes
`tsc --noEmit somefile.ts` directly, bypassing the `tsconfig.json`
configuration. As a result, numerous type errors are triggered,
preventing successful commits.

Close: #619
2025-04-24 10:48:35 -07:00
Misha Davidov
9b102965b9 feat: more loosely match context for apply_patch (#610)
More of a proposal than anything, but models seem to struggle to
compose valid patches for `apply_patch` when context matching involves
unicode look-alikes. This would normalize them.

```
top-level          # ASCII
top-level          # U+2011 NON-BREAKING HYPHEN
top–level          # U+2013 EN DASH
top—level          # U+2014 EM DASH
top‒level          # U+2012 FIGURE DASH
```

thanks unicode.
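A sketch of such normalization, covering the characters listed above plus the U+00A0/U+202F spaces from #625; the exact table in the codebase may differ:

```typescript
// Map unicode look-alikes onto ASCII before context matching.
const LOOKALIKES: Record<string, string> = {
  "\u2011": "-", // NON-BREAKING HYPHEN
  "\u2012": "-", // FIGURE DASH
  "\u2013": "-", // EN DASH
  "\u2014": "-", // EM DASH
  "\u00A0": " ", // NO-BREAK SPACE
  "\u202F": " ", // NARROW NO-BREAK SPACE
};

function canonicalize(line: string): string {
  return line.replace(/[\u2011-\u2014\u00A0\u202F]/g, (ch) => LOOKALIKES[ch] ?? ch);
}
```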
2025-04-24 09:05:19 -07:00
theg1239
ad1e39c903 feat: add specific instructions for creating API keys in error msg (#581)
Updates the error message for missing Gemini API keys to reference
"Google AI Studio" instead of the generic "GEMINI dashboard". This
provides users with more accurate information about where to obtain
their Gemini API keys.

This could be extended to other providers as well.
2025-04-24 06:33:34 -05:00
theg1239
006992b85a chore: update lint-staged config to use pnpm --filter (#582)
Replaced directory-specific commands with workspace-aware pnpm commands
2025-04-24 06:33:13 -05:00
Connor Christie
622323a59b fix: don't clear turn input before retries (#611)
The current turn input in the agent loop is being discarded before
the stream events are consumed, which causes the stream reconnect (after
a rate-limit failure) to not include the inputs. Since the new stream
includes the previous response ID, it triggers a bad-request exception
because the input doesn't match what OpenAI has stored on the server
side, and subsequently a very confusing error message: `No tool output
found for function call call_xyz`.

This should fix https://github.com/openai/codex/issues/586.

## Testing

I have a personal project that I'm working on that runs multiple Codex
CLIs in parallel and often runs into rate limit errors (as seen in the
OpenAI logs). After making this change, I am no longer experiencing
Codex crashing and it was able to retry and handle everything gracefully
until completion (even though I still see rate limiting in the OpenAI
logs).
2025-04-24 06:29:36 -05:00
Connor Christie
c75cb507f0 bug: fix error catching when checking for updates (#597)
This fixes https://github.com/openai/codex/issues/480 where the latest
code was crashing when attempting to be run inside docker since the
update checker attempts to reach out to `npm.antfu.dev` but that DNS is
not allowed in the firewall rules.

I believe the original code was attempting to catch and ignore any
errors when checking for updates but was doing so incorrectly: when you
`await` a promise, rejections should be handled with a standard
try/catch rather than `Promise.catch`, so this fixes that.
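A minimal sketch of the corrected pattern; the `check` parameter is a stand-in for the real update check:

```typescript
async function checkForUpdatesSafely(check: () => Promise<void>): Promise<void> {
  try {
    await check(); // e.g. fetches npm.antfu.dev, which a firewall may block
  } catch {
    // Swallow network errors: a failed update check should never crash the CLI.
  }
}
```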

## Testing

### Before

```
$ scripts/run_in_container.sh "explain this project to me"
7d1aa845edf9a36fe4d5b331474b5cb8ba79537b682922b554ea677f14996c6b
Resolving api.openai.com...
Adding 162.159.140.245 for api.openai.com
Adding 172.66.0.243 for api.openai.com
Host network detected as: 172.17.0.0/24
Firewall configuration complete
Verifying firewall rules...
Firewall verification passed - unable to reach https://example.com as expected
Firewall verification passed - able to reach https://api.openai.com as expected
TypeError: fetch failed
    at node:internal/deps/undici/undici:13510:13
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async getLatestVersionBatch (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132669:17)
    at async getLatestVersion (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132674:19)
    at async getUpdateCheckInfo (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132748:20)
    at async checkForUpdates (file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:132772:23)
    at async file:///usr/local/share/npm-global/lib/node_modules/@openai/codex/dist/cli.js:142027:1 {
  [cause]: AggregateError [ECONNREFUSED]: 
      at internalConnectMultiple (node:net:1122:18)
      at afterConnectMultiple (node:net:1689:7) {
    code: 'ECONNREFUSED',
    [errors]: [ [Error], [Error] ]
  }
}
```

### After

```
$ scripts/run_in_container.sh "explain this project to me"
91aa716e3d3f86c9cf6013dd567be31b2c44eb5d7ab184d55ef498731020bb8d
Resolving api.openai.com...
Adding 162.159.140.245 for api.openai.com
Adding 172.66.0.243 for api.openai.com
Host network detected as: 172.17.0.0/24
Firewall configuration complete
Verifying firewall rules...
Firewall verification passed - unable to reach https://example.com as expected
Firewall verification passed - able to reach https://api.openai.com as expected
╭──────────────────────────────────────────────────────────────╮
│ ● OpenAI Codex (research preview) v0.1.2504221401            │
╰──────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────╮
│ localhost session: 7c782f196ae04503866e39f071e26a69          │
│ ↳ model: o4-mini                                             │
│ ↳ provider: openai                                           │
│ ↳ approval: full-auto                                        │
╰──────────────────────────────────────────────────────────────╯
user
explain this project to me
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│( ●    ) 2s  Thinking                                                                                                                                                                                                                                                  │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
  send q or ctrl+c to exit | send "/clear" to reset | send "/help" for commands | press enter to send | shift+enter for new line — 100% context left
```
2025-04-23 15:21:00 -07:00
kshern
146a61b073 feat: add support for custom provider configuration in the user config (#537)
### What

- Add support for loading and merging custom provider configurations
from a local `providers.json` file.
- Allow users to override or extend default providers with their own
settings.

### Why

This change enables users to flexibly customize and extend provider
endpoints and API keys without modifying the codebase, making the CLI
more adaptable for various LLM backends and enterprise use cases.

### How

- Introduced `loadProvidersFromFile` and `getMergedProviders` in config
logic.
- Added/updated related tests in `tests/config.test.tsx`
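A hedged sketch of the merge; the `ProviderConfig` shape and file location are assumptions:

```typescript
import { existsSync, readFileSync } from "node:fs";

type ProviderConfig = { baseURL: string; envKey: string };

function getMergedProviders(
  defaults: Record<string, ProviderConfig>,
  providersFile: string, // e.g. a local providers.json
): Record<string, ProviderConfig> {
  if (!existsSync(providersFile)) {
    return defaults;
  }
  const custom = JSON.parse(
    readFileSync(providersFile, "utf8"),
  ) as Record<string, ProviderConfig>;
  // User entries override or extend the built-in defaults.
  return { ...defaults, ...custom };
}
```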


### Checklist

- [x] Lint passes for changed files
- [x] Tests pass for all files
- [x] Documentation/comments updated as needed

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-23 01:45:56 -04:00
Erick
b428d66f2b feat: added provider to run quiet mode function (#571)
Adding support to be able to run other models in quiet mode

ie: `codex --approval-mode full-auto -q "explain the current directory"
--provider xai --model grok-3-beta`
2025-04-23 01:12:18 -04:00
Connor Christie
cbeb5c3057 fix: remove unreachable "disableResponseStorage" logic flow introduced in #543 (#573)
This PR cleans up unreachable code that was added as a result of
https://github.com/openai/codex/pull/543.

The code being removed is already being handled above:

23f0887df3/codex-cli/src/utils/agent/agent-loop.ts (L535-L539)
2025-04-23 01:08:52 -04:00
Daniel Nakov
4261973467 bug: non-openai mode - don't default temp and top_p (#572)
I haven't seen any actual errors due to this, but it's been bothering me
that I had it defaulted to 1. I think best to leave it undefined and
have each provider do their thing
2025-04-23 01:07:40 -04:00
Daniel Nakov
23f0887df3 bug: non-openai mode - fix for gemini content: null, fix 429 to throw before stream (#563)
Gemini's API is finicky: it 400's without an error message when you
pass `content: null`.
Also fixed the rate limiting issues by throwing outside of the iterator.
I think there's a separate issue with the second isRateLimit check in
agent-loop - turnInput is cleared by that time, so it retries without
the last message.
2025-04-22 20:37:48 -04:00
Misha Davidov
20b6ef0de8 feat: create parent directories when creating new files. (#552)
apply_patch doesn't create parent directories when creating a new file,
leading to confusion and flailing by the agent. This will create parent
directories automatically when absent.
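The gist, as a sketch:

```typescript
import { mkdirSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

// Ensure parent directories exist before writing a newly created file.
function writeNewFile(filePath: string, content: string): void {
  mkdirSync(dirname(filePath), { recursive: true }); // no-op if it already exists
  writeFileSync(filePath, content, "utf8");
}
```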

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-22 19:45:17 -04:00
chunterb
750d97e8ad feat: add openai model info configuration (#551)
In reference to [Issue 548](https://github.com/openai/codex/issues/548)
- part 1.
2025-04-22 17:31:25 -04:00
Fouad Matin
12bc2dcc4e bump(version): 0.1.2504221401 (#559)
## `0.1.2504221401`

### 🚀 Features

- Show actionable errors when api keys are missing (#523)
- Add CLI `--version` flag (#492)

### 🐛 Bug Fixes

- Agent loop for ZDR (`disableResponseStorage`) (#543)
- Fix relative `workdir` check for `apply_patch` (#556)
- Minimal mid-stream #429 retry loop using existing back-off (#506)
- Inconsistent usage of base URL and API key (#507)
- Remove requirement for api key for ollama (#546)
- Support `[provider]_BASE_URL` (#542)
2025-04-22 14:18:04 -07:00
Nick Carchedi
dc096302e5 fix typo in prompt (#558) 2025-04-22 17:15:28 -04:00
Michael Bolin
7c1f2d7deb when a shell tool call invokes apply_patch, resolve relative paths against workdir, if specified (#556)
Previously, we were ignoring the `workdir` field in an `ExecInput` when
running it through `canAutoApprove()`. For ordinary `exec()` calls, that
was sufficient, but for `apply_patch`, we need the `workdir` to resolve
relative paths in the `apply_patch` argument so that we can check them
in `isPathConstrainedToWritablePaths()`.

Likewise, we also need the workdir when running `execApplyPatch()`
because the paths need to be resolved again.

Ideally, the `ApplyPatchCommand` returned by `canAutoApprove()` would
not be a simple `patch: string`, but the parsed patch with all of the
paths resolved, in which case `execApplyPatch()` could expect absolute
paths and would not need `workdir`.
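A sketch of the resolution rule — a hypothetical helper; the real code threads `workdir` through `canAutoApprove()` and `execApplyPatch()`:

```typescript
import { isAbsolute, resolve } from "node:path";

// Resolve a path from an apply_patch hunk against the exec call's workdir.
function resolvePatchPath(p: string, workdir?: string): string {
  if (isAbsolute(p)) {
    return p;
  }
  return workdir ? resolve(workdir, p) : resolve(p); // falls back to process cwd
}
```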
2025-04-22 14:07:47 -07:00
Fouad Matin
a30e79b768 fix: agent loop for disable response storage (#543)
- Fixes post-merge of #506

---------

Co-authored-by: Ilan Bigio <ilan@openai.com>
2025-04-22 13:49:10 -07:00
Daniil Davydov
f99c9080fd fix: support [provider]_BASE_URL (#542)
Resolved issue where an OLLAMA_BASE_URL was not properly handled
(openai/codex#516).
2025-04-22 15:05:48 -04:00
moppywhip
549fc650c3 fix: remove requirement for api key for ollama (#546)
Fixes #540 
# Skip API key validation for Ollama provider

## Description
This PR modifies the CLI to not require an API key when using Ollama as
the provider

## Changes
- Modified the validation logic to skip API key checks for these
providers
- Updated the README to clarify that Ollama doesn't require an API key
2025-04-22 13:59:31 -04:00
Naveen Kumar Battula
fcd1d4bdf9 feat: show actionable errors when api keys are missing (#523)
Changes the error shown when another provider's API key is missing, from
<img width="854" alt="image"
src="https://github.com/user-attachments/assets/f488a247-5040-4b02-92d6-90a2204419ff"
/>
(missing deepseek key but still throws error for openai)
to
<img width="854" alt="image"
src="https://github.com/user-attachments/assets/8333d24a-91f8-4ba8-9a51-ed22a7e8a074"
/>
This should help new users figure out the issue more easily and go to the
right place to get API keys

OpenAI key missing would popup with the right link
<img width="854" alt="image"
src="https://github.com/user-attachments/assets/0ecc9320-380f-425c-972e-4312bf610955"
/>
2025-04-22 12:55:08 -04:00
Michael Bolin
94d5408875 add instructions for connecting to a visual debugger under Contributing (#496)
While here, I also moved the Nix stuff to the end of the
**Contributing** section and replaced some examples with `npm` to use
`pnpm`.
2025-04-22 09:43:10 -07:00
Michael Bolin
9b06fb48a7 add check to ensure ToC in README.md matches headings in the file (#541)
This introduces a Python script (written by Codex!) to verify that the
table of contents in the root `README.md` matches the headings. Like
`scripts/asciicheck.py` in https://github.com/openai/codex/pull/513, it
reports differences by default (and exits non-zero if there are any) and
also has a `--fix` option to synchronize the ToC with the headings.

This will be enforced by CI and the changes to `README.md` in this PR
were generated by the script, so you can see that our ToC was missing
some entries prior to this PR.
2025-04-22 09:38:12 -07:00
narenoai
dd330646d2 feat: add CLI --version flag (#492)
Adds a new flag to cli `--version` that prints the current version and
exits

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-22 12:28:21 -04:00
Scott Leibrand
ee6e1765fa agent-loop: minimal mid-stream #429 retry loop using existing back-off (#506)
As requested by @tibo-openai at
https://github.com/openai/codex/pull/357#issuecomment-2816554203, this
attempts a more minimal implementation of #357 that preserves as much as
possible of the existing code's exponential backoff logic.

Adds a small retry wrapper around the streaming for‑await loop so that
HTTP 429s which occur *after* the stream has started no longer crash the
CLI.

Highlights
• Re‑uses existing RATE_LIMIT_RETRY_WAIT_MS constant and 5‑attempt
limit.
• Exponential back‑off identical to initial request handling. 

This comment is probably more useful here in the PR:

// The OpenAI SDK may raise a 429 (rate‑limit) *after* the stream has
// started. Prior logic already retries the initial `responses.create`
// call, but we need to add equivalent resilience for mid‑stream
// failures. We keep the implementation minimal by wrapping the
// existing `for‑await` loop in a small retry‑for‑loop that re‑creates
// the stream with exponential back‑off.
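A condensed sketch of that wrapper; the constant mirrors the name above, but its value here is an assumption:

```typescript
const RATE_LIMIT_RETRY_WAIT_MS = 500; // assumed value
const MAX_ATTEMPTS = 5;

async function consumeStreamWithRetry(
  createStream: () => Promise<AsyncIterable<unknown>>,
  handleEvent: (event: unknown) => void,
): Promise<void> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      for await (const event of await createStream()) {
        handleEvent(event);
      }
      return; // stream completed cleanly
    } catch (err) {
      const isRateLimit = (err as { status?: number }).status === 429;
      if (!isRateLimit || attempt === MAX_ATTEMPTS) {
        throw err; // non-429s and exhausted attempts still surface
      }
      // Exponential back-off, mirroring the initial-request handling.
      await new Promise((r) =>
        setTimeout(r, RATE_LIMIT_RETRY_WAIT_MS * 2 ** (attempt - 1)),
      );
    }
  }
}
```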
2025-04-22 11:02:10 -04:00
Gabriel Bianconi
98a22273d9 fix: inconsistent usage of base URL and API key (#507)
A recent commit introduced the ability to use third-party model
providers. (Really appreciate it!)

However, the usage is inconsistent: some pieces of code use the custom
providers, whereas others still have the old behavior. Additionally,
`OPENAI_BASE_URL` is now being disregarded when it shouldn't be.

This PR normalizes the usage to `getApiKey` and `getBaseUrl`, and
enables the use of `OPENAI_BASE_URL` if present.

---------

Co-authored-by: Gabriel Bianconi <GabrielBianconi@users.noreply.github.com>
2025-04-22 10:51:26 -04:00
Thomas
d78f77edb7 fix(agent-loop): update required properties to include workdir and ti… (#530)
Without this I get an issue running codex in a docker container. I
receive:

```
{
    "answer": "{\"role\":\"user\",\"content\":[{\"type\":\"input_text\",\"text\":\"\\\"Say hello world\\\"\"}],\"type\":\"message\"}\n{\"id\":\"error-1745325184914\",\"type\":\"message\",\"role\":\"system\",\"content\":[{\"type\":\"input_text\",\"text\":\"⚠️  OpenAI rejected the request (request ID: req_f9027b59ebbce00061e9cd2dbb2d529a). Error details: Status: 400, Code: invalid_function_parameters, Type: invalid_request_error, Message: 400 Invalid schema for function 'shell': In context=(), 'required' is required to be supplied and to be an array including every key in properties. Missing 'workdir'.. Please verify your settings and try again.\"}]}\n"
}
```

This fix makes it work.
2025-04-22 10:32:36 -04:00
Michael Bolin
c00ae2dcc1 Enforce ASCII in README.md (#513)
This all started because I was going to write a script to autogenerate
the Table of Contents in the root `README.md`, but I noticed that the
`href` for the "Why Codex?" heading was `#whycodex` instead of
`#why-codex`. This piqued my curiosity and it turned out that the space
in "Why Codex?" was not an ASCII space but **U+00A0**, a non-breaking
space, and so GitHub ignored it when generating the `href` for the
heading.

This also meant that when I did a text search for `why codex` in the
`README.md` in VS Code, the "Why Codex" heading did not match because of
the presence of **U+00A0**.

In short, these types of Unicode characters seem like a hazard, so I
decided to introduce this script to flag them, and if desired, to
replace them with "good enough" ASCII equivalents. For now, this only
applies to the root `README.md` file, but I think we should ultimately
apply this across our source code, as well, as we seem to have quite a
lot of non-ASCII Unicode and it's probably going to cause `rg` to miss
things.

Contributions of this PR:

* `./scripts/asciicheck.py`, which takes a list of filepaths and returns
non-zero if any of them contain non-ASCII characters. (Currently, there
is one exception for ✨ aka **U+2728**, though I would like to default to
an empty allowlist and then require all exceptions to be specified as
flags.)
* A `--fix` option that will attempt to rewrite files with violations
using equivalents from a hardcoded substitution list.
* An update to `ci.yml` to verify `./scripts/asciicheck.py README.md`
succeeds.
* A cleanup of `README.md` using the `--fix` option as well as some
editorial decisions on my part.
* I tried to update the `href`s in the Table of Contents to reflect the
changes in the heading titles. (TIL that if a heading has a character
like `&` surrounded by spaces, it becomes `--` in the generated `href`.)
2025-04-22 10:07:40 -04:00
Fouad Matin
2cb8355968 bump(version): 0.1.2504220136 (#518)
## `0.1.2504220136`

### 🚀 Features

- Add support for ZDR orgs (#481)
- Include fractional portion of chunk that exceeds stdout/stderr limit
(#497)
2025-04-22 01:45:30 -07:00
Fouad Matin
9f5ccbb618 feat: add support for ZDR orgs (#481)
- Add `store: boolean` to `AgentLoop` to enable client-side storage of
response items
- Add `--disable-response-storage` arg + `disableResponseStorage` config
2025-04-22 01:30:16 -07:00
Michael Bolin
3eba86a553 include fractional portion of chunk that exceeds stdout/stderr limit (#497)
I saw cases where the first chunk of output from `ls -R` could be large
enough to exceed `MAX_OUTPUT_BYTES` or `MAX_OUTPUT_LINES`, in which case
the loop would exit early in `createTruncatingCollector()` such that
nothing was appended to the `chunks` array. As a result, the reported
`stdout` of `ls -R` would be empty.

I asked Codex to add logic to handle this edge case and write a unit
test. I used this as my test:

```
./codex-cli/dist/cli.js -q 'what is the output of `ls -R`'
```

now it appears to include a ton of stuff whereas before this change, I
saw:

```
{"type":"function_call_output","call_id":"call_a2QhVt7HRJYKjb3dIc8w1aBB","output":"{\"output\":\"\\n\\n[Output truncated: too many lines or bytes]\",\"metadata\":{\"exit_code\":0,\"duration_seconds\":0.5}}"}
```
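A sketch of the byte-budget half of the fix; the line-count budget works analogously:

```typescript
// When a single chunk overshoots the remaining budget, keep the fitting
// prefix instead of dropping the chunk entirely.
function appendChunk(
  chunks: Array<Buffer>,
  chunk: Buffer,
  bytesSoFar: number,
  maxBytes: number,
): number {
  const remaining = maxBytes - bytesSoFar;
  if (remaining <= 0) {
    return bytesSoFar; // budget exhausted
  }
  if (chunk.length <= remaining) {
    chunks.push(chunk);
    return bytesSoFar + chunk.length;
  }
  chunks.push(chunk.subarray(0, remaining)); // fractional portion of the oversized chunk
  return maxBytes;
}
```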
2025-04-21 19:06:03 -07:00
Fouad Matin
99ed27ad1b bump(version): 0.1.2504211509 (#493)
## `0.1.2504211509`

### 🚀 Features

- Support multiple providers via Responses-Completion transformation
(#247)
- Add user-defined safe commands configuration and approval logic #380
(#386)
- Allow switching approval modes when prompted to approve an
edit/command (#400)
- Add support for `/diff` command autocomplete in TerminalChatInput
(#431)
- Auto-open model selector if user selects deprecated model (#427)
- Read approvalMode from config file (#298)
- `/diff` command to view git diff (#426)
- Tab completions for file paths (#279)
- Add /command autocomplete (#317)
- Allow multi-line input (#438)

### 🐛 Bug Fixes

- `full-auto` support in quiet mode (#374)
- Enable shell option for child process execution (#391)
- Configure husky and lint-staged for pnpm monorepo (#384)
- Command pipe execution by improving shell detection (#437)
- Name of the file not matching the name of the component (#354)
- Allow proper exit from new Switch approval mode dialog (#453)
- Ensure /clear resets context and exclude system messages from
approximateTokenUsed count (#443)
- `/clear` now clears terminal screen and resets context left indicator
(#425)
- Correct fish completion function name in CLI script (#485)
- Auto-open model-selector when model is not found (#448)
- Remove unnecessary isLoggingEnabled() checks (#420)
- Improve test reliability for `raw-exec` (#434)
- Unintended tear down of agent loop (#483)
- Remove extraneous type casts (#462)
2025-04-21 15:13:33 -07:00
Fouad Matin
0e9d75657b update: readme (#491) 2025-04-21 15:06:03 -07:00
Daniel Nakov
e7a3eec942 docs: Add note about non-openai providers; add --provider cli flag to the help (#484) 2025-04-21 14:53:09 -07:00
Mitchell Kutchuk
09f0ae3899 fix: unintended tear down of agent loop (#483)
fixes #465
2025-04-21 15:01:09 -04:00
Benny Yen
f72cfd7ef3 fix: correct fish completion function name in CLI script (#485)
Missing an underscore.

fish function:
https://github.com/fish-shell/fish-shell/blob/master/share/functions/__fish_complete_path.fish

fixes https://github.com/openai/codex/issues/469
2025-04-21 14:45:49 -04:00
Michael Bolin
12ec57b330 do not auto-approve the find command if it contains options that write files or spawn commands (#482)
Updates `isSafeCommand()` so that an invocation of `find` is not
auto-approved if it contains any of: `-exec`, `-execdir`, `-ok`,
`-okdir`, `-delete`, `-fls`, `-fprint`, `-fprint0`, `-fprintf`.
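The check, sketched — the flag list is taken verbatim from the description; the surrounding `isSafeCommand()` plumbing is omitted:

```typescript
// Any of these flags lets `find` write files or spawn commands, so the
// invocation must not be auto-approved.
const UNSAFE_FIND_OPTIONS: ReadonlySet<string> = new Set([
  "-exec", "-execdir", "-ok", "-okdir",
  "-delete", "-fls", "-fprint", "-fprint0", "-fprintf",
]);

function isSafeFindCommand(argv: ReadonlyArray<string>): boolean {
  return argv[0] === "find" && !argv.some((arg) => UNSAFE_FIND_OPTIONS.has(arg));
}
```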
2025-04-21 10:29:58 -07:00
Jon Church
7346f4388e chore: drop src from publish (#474)
Publish shouldn't need the source files published along with the
distributable bin.

`src` is being shipped to the registry rn:
https://www.npmjs.com/package/@openai/codex?activeTab=code

You can verify that the src is not needed by packing the project
manually after removing src from the files:

```sh
# from the codex-cli dir
rm -rf dist # just for hygiene
pnpm run build
pnpm pack

mkdir /tmp/codex-tar-test
mv openai-codex-0.1.2504181820.tgz /tmp/codex-tar-test
cd /tmp/codex-tar-test

pnpm init
pnpm add ./openai-codex-0.1.2504181820.tgz /tmp/codex-tar-test
pnpm exec codex --full-auto "run a bash -c command to echo hello world"
```

The CLI is operational.

> noticed this when checking the screenshot included in
https://github.com/openai/codex/pull/461
2025-04-21 09:56:00 -07:00
Michael Bolin
d36d295a1a revert #386 due to unsafe shell command parsing (#478)
Reverts https://github.com/openai/codex/pull/386 because:

* The parsing logic for shell commands was unsafe (`split(/\s+/)`
instead of something like `shell-quote`)
* We have a different plan for supporting auto-approved commands.
2025-04-21 12:52:11 -04:00
nerdielol
797eba4930 fix: /clear now clears terminal screen and resets context left indicator (#425)
## What does this PR do?
* Implements the full `/clear` command in **codex‑cli**:
  * Resets chat history **and** wipes the terminal screen.
  * Shows a single system message: `Context cleared`.
* Adds comprehensive unit tests for the new behaviour.

## Why is it needed?
* Fixes user‑reported bugs:  
  * **#395**  
  * **#405**

## How is it implemented?
* **Code** – Adds `process.stdout.write('\x1b[3J\x1b[H\x1b[2J')` in
`terminal.tsx`. Removes the reference to `prev` in
`setItems((prev) => [...prev, ...])` in `terminal-chat-new-input.tsx` &
`terminal-chat-input.tsx`.
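
For reference, a sketch of what that escape sequence does, split into its three parts (the wrapper function here is illustrative):

```ts
// \x1b[3J - clear the terminal's scrollback buffer
// \x1b[H  - move the cursor to the home position (top-left)
// \x1b[2J - clear the visible screen
const CLEAR_SEQUENCE = "\x1b[3J\x1b[H\x1b[2J";

export function clearTerminal(): void {
  process.stdout.write(CLEAR_SEQUENCE);
}
```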

## CI / QA
All commands pass locally:
```bash
pnpm test      # green
pnpm run lint  # green
pnpm run typecheck  # zero TS errors
```

## Results



https://github.com/user-attachments/assets/11dcf05c-e054-495a-8ecb-ac6ef21a9da4

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-21 12:39:46 -04:00
Jon Church
5b19451770 chore(build): cleanup dist before build (#477)
Another one that I noticed.

The dist structure is very simple right now, so you're unlikely to run
into orphaned files, as you're emitting a single built artifact which will
be overwritten on build, but I always prefer to do clean builds as
"hygiene".

I had a dirty dist personally after local development and testing some
things, as an example.

An alternative could be to create a `clean` script with cross-platform
`rimraf dist`.
2025-04-21 12:35:25 -04:00
Thibault Sottiaux
3c4f1fea9b chore: consolidate model utils and drive-by cleanups (#476)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-21 12:33:57 -04:00
Thibault Sottiaux
dc276999a9 chore: improve storage/ implementation; use log(...) consistently (#473)
This PR tidies up primitives under storage/.

**Noop changes:**

* Promote logger implementation to top-level utility outside of agent/
* Use logger within storage primitives
* Cleanup doc strings and comments

**Functional changes:**

* Increase command history size to 10_000
* Remove unnecessary debounce implementation and ensure a session ID is
created only once per agent loop

---------

Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-21 09:51:34 -04:00
Aaron Casanova
8f1ea7fa85 fix: remove extraneous type casts (#462) 2025-04-21 00:00:56 -07:00
Benny Yen
3e71c87708 refactor(updates): fetch version from registry instead of npm CLI to support multiple managers (#446)
## Background  
Addressing feedback from
https://github.com/openai/codex/pull/333#discussion_r2050893224, this PR
adds support for Bun alongside npm and pnpm while keeping the code simple.

## Summary  
The update‑check flow is refactored to use a direct registry lookup
(`fast-npm-meta` + `semver`) instead of shelling out to `npm outdated`,
and adds a lightweight installer‑detection mechanism that:

1. Checks if the invoked script lives under a known global‑bin directory
(npm, pnpm, or bun)
2. If not, falls back to local detection via `getUserAgent()` (the
`package‑manager‑detector` library)
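
A sketch of that two-step flow plus the registry lookup (`getLatestVersion` and `semver.gt` per the description; the global-bin path heuristics are illustrative assumptions):

```ts
import { getLatestVersion } from "fast-npm-meta";
import semver from "semver";

// Step 1 (simplified): infer the installer from where the invoked script
// lives. Real global-bin locations vary per platform and per manager.
function detectInstallerByPath(binPath: string): "npm" | "pnpm" | "bun" | undefined {
  if (binPath.includes("/pnpm/")) return "pnpm";
  if (binPath.includes("/.bun/")) return "bun";
  if (binPath.includes("/npm/")) return "npm";
  return undefined; // step 2 would fall back to getUserAgent()
}

// Registry-based check: no shelling out to `npm outdated`.
async function isUpdateAvailable(current: string): Promise<boolean> {
  const meta = await getLatestVersion("@openai/codex");
  return meta.version != null && semver.gt(meta.version, current);
}
```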

## What’s Changed  
- **Registry‑based version check**  
- Replace `execFile("npm", ["outdated"])` with `getLatestVersion()` and
`semver.gt()`
- **Multi‑manager support**  
- New `renderUpdateCommand` handles update commands for `npm`, `pnpm`,
and `bun`.
  - Detect global installer first via `detectInstallerByPath()`  
  - Fallback to local detection via `getUserAgent()`  
- **Module cleanup**  
- Extract `detectInstallerByPath` into
`utils/package-manager-detector.ts`
- Remove legacy `checkOutdated`, `getNPMCommandPath`, and child‑process
JSON parsing
- **Flow improvements in `checkForUpdates`**  
  1. Short‑circuit by `UPDATE_CHECK_FREQUENCY`
  2. Fetch & compare versions
  3. Persist new timestamp immediately
  4. Render & display styled box only when an update exists
- **Maintain simplicity**
- All multi‑manager logic lives in one small helper and a concise lookup
rather than a complex adapter hierarchy
- Core `checkForUpdates` remains a single, easy‑to‑follow async function
- **Dependencies added**  
- `fast-npm-meta`, `semver`, `package-manager-detector`, `@types/semver`

## Considerations
If we decide to drop the interactive update‑message (`npm install -g
@openai/codex`) rendering altogether, we could remove most of the
installer‑detection code and dependencies, which would simplify the
codebase further but result in a less friendly UX.

## Preview

* npm

![refactor-update-check-flow-npm](https://github.com/user-attachments/assets/57320114-3fb6-4985-8780-3388a1d1ec85)

* bun

![refactor-update-check-flow-bun](https://github.com/user-attachments/assets/d93bf0ae-a687-412a-ab92-581b4f967307)

## Simple Flow Chart

```mermaid
flowchart TD
  A(Start) --> B[Read state]
  B --> C{Recent check?}
  C -- Yes --> Z[End]
  C -- No --> D[Fetch latest version]
  D --> E[Save check time]
  E --> F{Version data OK?}
  F -- No --> Z
  F -- Yes --> G{Update available?}
  G -- No --> Z
  G -- Yes --> H{Global install?}
  H -- Yes --> I[Select global manager]
  H -- No --> K{Local install?}
  K -- No --> Z
  K -- Yes --> L[Select local manager]
  I & L --> M[Render update message]
  M --> N[Format with boxen]
  N --> O[Print update]
  O --> Z
```
2025-04-21 00:00:20 -07:00
Abdelrhman Kamal Mahmoud Ali Slim
655564f25d fix: auto-open model-selector when model is not found (#448)
Change the check from whether the model has been deprecated to whether
the model is not found (`model_not_found`).


https://github.com/user-attachments/assets/ad0f7daf-5eb4-4e4b-89e5-04240044c64f
2025-04-20 23:56:20 -07:00
Aaron Casanova
ee3a9bc14b Remove README.md and bin from package.json#files field (#461)
This PR removes always included files and folders from the
[`package.json#files`
field](https://docs.npmjs.com/cli/v11/configuring-npm/package-json#files):

> Certain files are always included, regardless of settings:
> - package.json
> - README
> - LICENSE / LICENCE
> - The file in the "main" field
> - The file(s) in the "bin" field

Validated by running `pnpm i && cd codex-cli && pnpm build && pnpm
release:readme && pnpm pack` and confirming both the `README.md` file
and `bin` directory are still included in the tarball:

<img width="227" alt="image"
src="https://github.com/user-attachments/assets/ecd90a07-73c7-4940-8c83-cb1d51dfcf96"
/>
2025-04-20 23:54:27 -07:00
Aiden Cline
ee7ce5b601 feat: tab completions for file paths (#279)
Made a PR as requested in #113.
2025-04-20 22:34:27 -07:00
Jordan Docherty
f6b12aa994 refactor(history-overlay): split into modular functions & add tests (fixes #402) (#403)
## What
This PR targets #402 and refactors the `history-overlay.tsx` component to
reduce cognitive complexity by splitting the `buildLists` function into
smaller, focused helper functions. It also adds comprehensive test
coverage to ensure the functionality remains intact.

## Why
The original `buildLists` function had high cognitive complexity due to
multiple nested conditionals, complex string manipulation, and mixed
responsibilities. This refactor makes the code more maintainable and
easier to understand while preserving all existing functionality.

## How
- Split `buildLists` into focused helper functions
- Added comprehensive test coverage for all functionality
- Maintained existing behavior and keyboard interactions
- Improved code organization and readability

## Testing
All tests pass, including:
- Command mode functionality
- File mode functionality
- Keyboard interactions
- Error handling
2025-04-20 22:27:06 -07:00
Abdelrhman Kamal Mahmoud Ali Slim
f97557f59f refactor(component): rename component to match its filename (#432)
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-20 22:21:49 -07:00
Scott Leibrand
e3c8ce49ca fix: allow proper exit from new Switch approval mode dialog (#453)
As described in
https://github.com/openai/codex/issues/392#issuecomment-2817090022, this
was introduced by #400.

The testing I'd done worked correctly because I was using the (s)
shortcut, but selecting the same option using arrow‑key → Enter on
“Switch approval mode” was preventing the user from subsequently exiting
the Switch approval mode dialog, requiring a ^C to quit codex entirely.
With this fix, both entry methods work correctly in my testing.

Per codex:

Issue

- When you navigated down (↓) to “Switch approval mode (s)” in the Shell
Command review dialog and pressed Enter, the ApprovalModeOverlay would
open—but because the underlying `TerminalChatCommandReview` component
stayed mounted (albeit disabled), its own Ink input handlers immediately
re‑captured the same key events and re‑opened the overlay as soon as you
hit Esc or Enter again. In practice this made it impossible to exit the
submenu.

Root cause

- We only disabled the SelectInput via `isDisabled`, but never fully
unmounted the review UI when an overlay was shown, so its `useInput` and
`<Select>` hooks were still active and “stealing” keys.

Fix

- In `terminal-chat.tsx` we now only render `<TerminalChatInput>` (and
by extension `TerminalChatCommandReview`) when `overlayMode === "none"`.
That unmounts all of its key handlers whenever any overlay (history,
model, approval, help, diff) is open, so no input leaks through.

Files changed

- **src/components/chat/terminal-chat.tsx**: Wrapped the entire
`<TerminalChatInput>` block in `overlayMode === "none" && agent`

With that in place, arrow‑key → Enter on “Switch approval mode”
correctly opens the overlay, and then you can use Enter/Esc inside the
overlay without getting stuck or immediately re‑opening it.
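
The shape of the fix as a sketch (components stubbed out; not the actual render tree):

```tsx
import React from "react";
import { Box } from "ink";

// Stand-ins for the real components in terminal-chat.tsx.
declare function ApprovalModeOverlay(): JSX.Element;
declare function TerminalChatInput(): JSX.Element;

type Props = {
  overlayMode: "none" | "history" | "model" | "approval" | "help" | "diff";
  agent?: object;
};

export function TerminalChat({ overlayMode, agent }: Props): JSX.Element {
  return (
    <Box flexDirection="column">
      {overlayMode === "approval" && <ApprovalModeOverlay />}
      {/* Mount the input (and its nested command-review dialog) only when
          no overlay is open, so its useInput handlers unmount and stop
          re-capturing Enter/Esc. */}
      {overlayMode === "none" && agent && <TerminalChatInput />}
    </Box>
  );
}
```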
2025-04-20 22:19:53 -07:00
Brayden Moon
8dd1125681 fix: command pipe execution by improving shell detection (#437)
## Description
This PR fixes Issue #421 where commands with pipes (e.g., `grep -R ...
-n | head -n 20`) were failing to execute properly after PR #391 was
merged.

## Changes
- Modified the `requiresShell` function to only enable shell mode when
the command is a single string containing shell operators
- Added logic to handle the case where shell operators are passed as
separate arguments
- Added comprehensive tests to verify the fix

## Root Cause
The issue was that the `requiresShell` function was detecting shell
operators like `|` even when they were passed as separate arguments,
which caused the command to be executed with `shell: true`
unnecessarily. This was causing syntax errors when running commands with
pipes.
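
A minimal sketch of the refined check (simplified signature; the real function lives alongside the exec helpers):

```ts
const SHELL_OPERATORS = ["|", "&&", "||", ";", ">", "<"] as const;

// Shell mode is only needed when the command arrives as a *single string*
// that embeds shell operators. Operators passed as separate argv entries
// are literal arguments and must not trigger `shell: true`.
function requiresShell(cmd: ReadonlyArray<string>): boolean {
  if (cmd.length !== 1) {
    return false;
  }
  return SHELL_OPERATORS.some((op) => cmd[0].includes(op));
}

// requiresShell(["grep -R foo -n | head -n 20"])    -> true
// requiresShell(["grep", "-R", "foo", "|", "head"]) -> false
```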

## Testing
- Added unit tests to verify the fix
- Manually tested with real commands using pipes
- Ensured all existing tests pass

Fixes #421
2025-04-20 21:11:19 -07:00
Daniel Nakov
eafbc75612 feat: support multiple providers via Responses-Completion transformation (#247)
https://github.com/user-attachments/assets/9ecb51be-fa65-4e99-8512-abb898dda569

Implemented it as a transformation between Responses API and Completion
API so that it supports existing providers that implement the Completion
API and minimizes the changes needed to the codex repo.
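
Roughly, the transformation maps a Responses-style request onto the Chat Completions shape that most providers implement. A minimal sketch with simplified types (not the PR's actual code):

```ts
type ResponsesRequest = {
  model: string;
  instructions?: string;
  input: Array<{ role: "user" | "assistant"; content: string }>;
};

type ChatCompletionsRequest = {
  model: string;
  messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
};

// Collapse the Responses payload into the messages array expected by
// Completions-compatible providers.
function toChatCompletions(req: ResponsesRequest): ChatCompletionsRequest {
  const messages: ChatCompletionsRequest["messages"] = [];
  if (req.instructions) {
    messages.push({ role: "system", content: req.instructions });
  }
  for (const item of req.input) {
    messages.push({ role: item.role, content: item.content });
  }
  return { model: req.model, messages };
}
```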

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: Fouad Matin <fouad@openai.com>
2025-04-20 20:59:34 -07:00
Jordan Docherty
693bd59ecc fix(raw-exec-process-group): improve test reliability (#434)
This PR improves the reliability of `raw-exec-process-group.test`,
addressing [#415](https://github.com/openai/codex/issues/415)

Before: The test would fail sporadically in CI because it checked for
process termination immediately after abort, without accounting for the
time it takes for processes to fully terminate.

Now: We've added a robust `ensureProcessGone` helper that:
- Polls the process status with a 500ms timeout
- Retries every 50ms if the process is still alive
- Provides clear error messages if termination takes too long

We now wait for the child process to fully exit after sending abort
signals, instead of assuming instant death, fixing flakiness caused by
asynchronous process termination.

Changes:
- Added `ensureProcessGone` helper function with retry logic
- Improved error handling and timeout management
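
A sketch of such a helper, using signal 0 as the existence probe:

```ts
// Poll until the process no longer exists, or fail after `timeoutMs`.
// `process.kill(pid, 0)` delivers no signal; it only checks existence
// and throws ESRCH once the process is gone.
async function ensureProcessGone(
  pid: number,
  timeoutMs = 500,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      process.kill(pid, 0);
    } catch {
      return; // process has terminated
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Process ${pid} still alive after ${timeoutMs}ms`);
}
```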

See [this bash
demo](https://gist.github.com/jdocherty/a84dbca2fbf7b47e5f95c87a07034ae8)
for a minimal reproduction of why process death is asynchronous and why
the test needs to retry after aborting.
2025-04-20 12:21:02 -07:00
Michael Bolin
b554b522f7 fix: remove unnecessary isLoggingEnabled() checks (#420)
It appears that use of `isLoggingEnabled()` was erroneously copy-pasted
in many places. This PR updates its docstring to clarify that it should
only be used to avoid constructing a potentially expensive log message.
With this change, the only function that merits/uses this check is
`execCommand()`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/420).
* #423
* __->__ #420
* #419
2025-04-20 09:58:06 -07:00
Abdelrhman Kamal Mahmoud Ali Slim
81cf47e591 feat: auto-open model selector if user selects deprecated model (#427)
https://github.com/user-attachments/assets/d254dff6-f1f9-4492-9dfd-d185c38d3a75
2025-04-20 09:51:49 -07:00
Michael Bolin
e372e4667b Make it so CONFIG_DIR is not in the list of writable roots by default (#419)
To play it safe, let's keep `CONFIG_DIR` out of the default list of
writable roots.

This also fixes an issue where `execWithSeatbelt()` was modifying
`writableRoots` instead of creating a new array.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/419).
* #423
* #420
* __->__ #419
2025-04-20 09:37:07 -07:00
Abdelrhman Kamal Mahmoud Ali Slim
425430debb fix(cli): ensure /clear resets context and exclude system messages from approximateTokenUsed count (#443)
https://github.com/user-attachments/assets/58b8f261-a359-4cd8-8c84-90d8d91a5592
2025-04-20 08:52:14 -07:00
Brayden Moon
6d6ca454cd feat: allow multi-line input (#438)
## Description
This PR implements multi-line input support for Codex when it asks for
user feedback (Issue #344). Users can now use Shift+Enter to add new
lines in their responses, making it easier to provide formatted code
snippets, lists, or other structured content.

## Changes
- Replace the single-line TextInput component with the
MultilineTextEditor component in terminal-chat-input.tsx
- Add support for Shift+Enter to create new lines
- Update key handling logic to properly handle history navigation in a
multi-line context
- Add reference to the editor to access cursor position information
- Update help text to inform users about the Shift+Enter functionality
- Add tests for the new functionality

## Testing
- Added new test file (terminal-chat-input-multiline.test.tsx) to test
the multi-line input functionality
- All existing tests continue to pass
- Manually tested the feature to ensure it works as expected

## Fixes
Closes #344

## Screenshots
N/A

## Additional Notes
This implementation maintains backward compatibility while adding the
requested multi-line input functionality. The UI remains clean and
intuitive, with a simple hint about using Shift+Enter for new lines.

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-20 08:51:38 -07:00
Thomas
fc1e45636e feat: add support for /diff command autocomplete in TerminalChatInput (#431)
It's not working without this
2025-04-19 20:01:40 -07:00
Michael Bolin
63c99e7d82 use spawn instead of exec to avoid injection vulnerability (#416)
https://github.com/openai/codex/pull/160 introduced a call to `exec()`
that takes a format string as an argument, but it is not clear that the
expansions within the format string are escaped safely. As written, it
is possible a carefully crafted command (e.g., if `cwd` were `"; && rm
-rf` or something...) could run arbitrary code.

Moving to `spawn()` makes this a bit better, as now at least `spawn()`
itself won't run an arbitrary process, though I suppose `osascript`
itself still could if the value passed to `-e` were abused. I'm not
clear on the escaping rules for AppleScript to ensure that `safePreview`
and `cwd` are injected safely.
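
The contrast in a sketch (the notification command is illustrative, not the PR's exact code):

```ts
import { exec, spawn } from "node:child_process";

const cwd = process.cwd(); // potentially attacker-influenced input

// Dangerous: the whole string is parsed by a shell, so metacharacters in
// `cwd` (quotes, `;`, `&&`, ...) can change what actually gets executed.
exec(`osascript -e 'display notification "done in ${cwd}"'`);

// Safer: spawn() takes argv directly and starts exactly one process; no
// shell parses `cwd`. (AppleScript's own escaping rules remain the
// residual concern noted above.)
spawn("osascript", ["-e", `display notification "done in ${cwd}"`]);
```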

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/416).
* #423
* #420
* #419
* __->__ #416
2025-04-19 18:29:00 -07:00
Tomas Cupr
a3889f92e4 fix: full-auto support in quiet mode (#374)
Fixes https://github.com/openai/codex/issues/292

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-19 17:00:33 -07:00
Fouad Matin
b3b195351e feat: /diff command to view git diff (#426)
Adds `/diff` command to view git diff
2025-04-19 16:23:27 -07:00
Michael Bolin
419f085cc4 re-enable Prettier check for codex-cli in CI (#417)
This check was lost in https://github.com/openai/codex/pull/287. Both
the root folder and `codex-cli/` have their own `pnpm format` commands
that check the formatting of different things.

Also ran `pnpm format:fix` to fix the formatting violations that got in
while this was disabled in CI.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/417).
* #420
* #419
* #416
* __->__ #417
2025-04-19 11:22:45 -07:00
Thomas
081786eaa6 feat: add /command autocomplete (#317)
Add interactive slash-command autocomplete & navigation in chat input

Description
This PR enhances the chat input component by adding first-class support
for slash commands (/help, /clear, /compact, etc.) with:

* **Live filtering:** As soon as the user types a leading `/`, a list of
matching commands is shown below the prompt.
* **Arrow-key navigation:** Up/Down arrows cycle through suggestions.
* **Enter to autocomplete:** Pressing Enter on a partial command will
fill it (without submitting) so you can add arguments or simply press
Enter again to execute.
* **Type-safe registry:** A new `slash-commands.ts` file declares all
supported commands in one place, along with TypeScript types to prevent
drift.
* **Validation:** Only registered commands will ever autocomplete or be
suggested; unknown single-word slash inputs still show an "Invalid
command" system message.
* **Automated tests:**
  * Unit tests for the command registry and prefix filtering
  * Existing tests continue passing with no regressions

Motivation
Slash commands provide a quick, discoverable way to control the agent
(clearing history, compacting context, opening overlays, etc.). Before,
users had to memorize the exact command or rely on the generic /help
list; autocomplete makes them far more accessible and reduces typos.

Changes

* `src/utils/slash-commands.ts` – defines `SlashCommand` and exports a
flat list of supported commands + descriptions (see the sketch after
these notes)
* `terminal-chat-input.tsx`
  * Import and type the command registry
  * Render filtered suggestions under the prompt when input starts with `/`
  * Hook into `useInput` to handle Up/Down and Enter for selection & fill
  * Flag to swallow the first Enter (autocomplete) and only submit on
    the next
* Updated tests in `tests/slash-commands.test.ts` to cover registry
contents and filtering logic
* Removed the old JS version and fixed a stray `@ts-expect-error`

How to test locally

1. Type `/` in the prompt; you should see matching commands.
2. Use arrows to move the highlight, press Enter to fill, then Enter
again to execute.
3. Run the full test suite (`npm test`) to verify no regressions.

Notes

* Future work could include fuzzy matching, paging long lists, or more
visual styling.
* This change is purely additive and does not affect non-slash inputs or
existing slash handlers.
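
A sketch of the registry-plus-prefix-filtering idea referenced above (command list abbreviated):

```ts
export type SlashCommand = { command: string; description: string };

// Abbreviated registry; the real slash-commands.ts declares every
// supported command in one place.
export const SLASH_COMMANDS: ReadonlyArray<SlashCommand> = [
  { command: "/help", description: "Show available commands" },
  { command: "/clear", description: "Clear history and screen" },
  { command: "/compact", description: "Compact the context" },
];

// Live filtering: only registered commands matching the typed prefix are
// ever suggested.
export function filterSlashCommands(input: string): ReadonlyArray<SlashCommand> {
  if (!input.startsWith("/")) {
    return [];
  }
  return SLASH_COMMANDS.filter((c) => c.command.startsWith(input));
}
```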

---------

Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-19 07:25:46 -07:00
John Gardner
965420cfc5 feat: read approvalMode from config file (#298)
This PR implements support for reading the approvalMode setting from the
user's config file (`~/.codex/config.json` or `~/.codex/config.yaml`),
allowing users to set a persistent default approval mode without needing
to specify command-line flags for each session.

Changes:
- Added approvalMode to the AppConfig type in config.ts
- Updated loadConfig() to read the approval mode from the config file
- Modified saveConfig() to persist the approval mode setting
- Updated CLI logic to respect the config-defined approval mode (while
maintaining CLI flag priority)
- Added comprehensive tests for approval mode config functionality
- Updated README to document the new config option in both YAML and JSON
formats
- additions to `.gitignore` for other CLI tools
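
A sketch of the resulting precedence (CLI flag over config file over built-in default); the merge shown here is illustrative:

```ts
type ApprovalMode = "suggest" | "auto-edit" | "full-auto";

interface AppConfig {
  approvalMode?: ApprovalMode;
  // ...other fields elided
}

// The CLI flag wins, then the value read from ~/.codex/config.(json|yaml),
// then the built-in default.
function resolveApprovalMode(
  cliFlag: ApprovalMode | undefined,
  config: AppConfig,
): ApprovalMode {
  return cliFlag ?? config.approvalMode ?? "suggest";
}
```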

Motivation:
As a user who regularly works with CLI-tools, I found it odd to have to
alias this with the command flags I wanted when `approvalMode` simply
wasn't being parsed even though it was an optional prop in `config.ts`.
This change allows me (and other users) to set the preference once in
the config file, streamlining daily usage while maintaining the ability
to override via command-line flags when needed.

Testing:
I've added a new test case, "loads and saves approvalMode correctly",
that verifies:
- Reading the approvalMode from the config file works correctly
- Saving the approvalMode to the config file works as expected
- The value persists through load/save operations

All tests related to the implementation are passing.
2025-04-19 07:25:25 -07:00
Suyash-K
b37b257e63 gracefully handle SSE parse errors and suppress raw parser code (#367)
Closes #187
Closes #358

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-19 07:24:29 -07:00
Abdelrhman Kamal Mahmoud Ali Slim
f3cab736b4 fix: name of the file not matching the name of the component (#354)
before

![image](https://github.com/user-attachments/assets/e47595bc-8b5b-470d-8cca-1d4c4afbf802)
after

![image](https://github.com/user-attachments/assets/c04bc384-534e-44f0-9917-08a15f7280dc)

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-19 07:23:02 -07:00
Scott Leibrand
9eeb78e54f feat: allow switching approval modes when prompted to approve an edit/command (#400)
Implements https://github.com/openai/codex/issues/392

When the user is in suggest or auto-edit mode and gets an approval
request, they now have an option in the `Shell Command` dialog to:
`Switch approval mode (v)`

That option brings up the standard `Switch approval mode` dialog,
allowing the user to switch into the desired mode, then drops them back
to the `Shell Command` dialog's `Allow command?` prompt, allowing them
to approve the current command and let the agent continue doing the rest
of what it was doing without interruption.
```
╭────────────────────────────────────────────────────────
│Shell Command
│                          
│$ apply_patch << 'PATCH'    
│*** Begin Patch                      
│*** Update File: foo.txt          
│@@ -1 +1 @@                         
│-foo                                          
│+bar                                         
│*** End Patch                         
│PATCH                                    
│                                                
│                                                
│Allow command?                   
│                                                
│    Yes (y)                                
│    Explain this command (x) 
│    Edit or give feedback (e)  
│    Switch approval mode (v)
│    No, and keep going (n)    
│    No, and stop for now (esc)
╰────────────────────────────────────────────────────────╭────────────────────────────────────────────────────────
│ Switch approval mode      
│ Current mode: suggest  
│                                          
│                                          
│                                          
│ ❯ suggest                        
│   auto-edit                       
│   full-auto                        
│ type to search · enter to confirm · esc to cancel 
╰────────────────────────────────────────────────────────
```
2025-04-19 07:21:19 -07:00
Alpha Diop
6c7fbc7b94 fix: configure husky and lint-staged for pnpm monorepo (#384)
# Improve Developer Experience with Husky and lint-staged for pnpm
Monorepo

## Summary
This PR enhances the developer experience by configuring Husky and
lint-staged to work properly with our pnpm monorepo structure. It
centralizes Git hooks at the root level and ensures consistent code
quality across the project.

## Changes
- Centralized Husky and lint-staged configuration at the monorepo root
- Added pre-commit hook that runs lint-staged to enforce code quality
- Configured lint-staged to:
  - Format JSON, MD, and YAML files with Prettier
  - Lint and typecheck TypeScript files before commits
- Fixed release script in codex-cli package.json (changed "pmpm" to "npm
publish")
- Removed duplicate Husky and lint-staged configurations from codex-cli
package.json

## Benefits
- **Consistent Code Quality**: Ensures all committed code meets project
standards
- **Automated Formatting**: Automatically formats code during commits
- **Early Error Detection**: Catches type errors and lint issues before
they're committed
- **Centralized Configuration**: Easier to maintain and update in one
place
- **Improved Collaboration**: Ensures consistent code style across the
team

## Future Improvements
We could further enhance this setup with **commit message validation**:
add commitlint to enforce conventional commit messages.

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-19 07:18:36 -07:00
Robbie Hammond
8e2760e83d Add fallback text for missing images (#397)
# What?
* When a prompt references an image path that doesn’t exist, replace it
with
  ```[missing image: <path>]``` instead of throwing an ENOENT.
* Adds a few unit tests for input-utils as there weren't any beforehand.

# Why?
Right now if you enter an invalid image path (e.g. it doesn't exist),
codex immediately crashes with an ENOENT error like so:
```
Error: ENOENT: no such file or directory, open 'test.png'
   ...
 {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'test.png'
}
```
This aborts the entire session. A soft fallback lets the rest of the
input continue.

# How?
Wraps the image encoding + inputItem content pushing in a try-catch. 

This is a minimal patch to avoid completely crashing — future work could
surface a warning to the user when this happens, or something to that
effect.
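
A sketch of that soft fallback, assuming a simplified input-item shape (helper name hypothetical; media type hard-coded for brevity):

```ts
import fs from "node:fs/promises";

type InputItem =
  | { type: "input_image"; image_url: string }
  | { type: "input_text"; text: string };

// Encode the image if it exists; otherwise push a placeholder instead of
// letting the ENOENT abort the whole session.
async function encodeImageOrFallback(path: string): Promise<InputItem> {
  try {
    const data = await fs.readFile(path);
    return {
      type: "input_image",
      image_url: `data:image/png;base64,${data.toString("base64")}`,
    };
  } catch {
    return { type: "input_text", text: `[missing image: ${path}]` };
  }
}
```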

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-18 22:55:24 -07:00
Shuto Otaki
b46b596e5f fix: enable shell option for child process execution (#391)
## Changes

- Added a `requiresShell` function to detect when a command contains
shell operators
- In the `exec` function, enabled the `shell: true` option if shell
operators are present

## Why This Is Necessary

See the discussion in this issue comment:  
https://github.com/openai/codex/issues/320#issuecomment-2816528014

## Code Explanation

The `requiresShell` function parses the command arguments and checks for
any shell‑specific operators. If it finds shell operators, it adds the
`shell: true` option when running the command so that it’s executed
through a shell interpreter.
2025-04-18 22:42:19 -07:00
autotaker
ca7ab76569 feat: add user-defined safe commands configuration and approval logic #380 (#386)
This pull request adds a feature that allows users to configure
auto-approved commands via a `safeCommands` array in the configuration
file.

## Related Issue
#380 

## Changes
- Added loading and validation of the `safeCommands` array in
`src/utils/config.ts`
- Implemented auto-approval logic for commands matching `safeCommands`
prefixes in `src/approvals.ts`
- Added test cases in `src/tests/approvals.test.ts` to verify
`safeCommands` behavior
- Updated documentation with examples and explanations of the
configuration
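
A sketch of prefix-based auto-approval under that config (hypothetical helper; note this PR was later reverted in #478 because its whitespace-based shell parsing was unsafe):

```ts
// safeCommands comes from the config file, e.g. ["git status", "npm test"].
function isAutoApproved(
  command: ReadonlyArray<string>,
  safeCommands: ReadonlyArray<string>,
): boolean {
  const joined = command.join(" ");
  return safeCommands.some(
    (safe) => joined === safe || joined.startsWith(safe + " "),
  );
}

// isAutoApproved(["git", "status", "-sb"], ["git status"]) -> true
// isAutoApproved(["git", "push"], ["git status"])          -> false
```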
2025-04-18 22:35:32 -07:00
Benny Yen
fd6f6c51c0 docs: update README to use pnpm commands (#390)
Since we migrated to `pnpm` in #287, this updates the README to reflect
that change.
Just a small cleanup to align the commands with the current setup.
2025-04-18 22:16:13 -07:00
salama-openai
1a8610cd9e feat: add flex mode option for cost savings (#372)
Adding in an option to turn on flex processing mode to reduce costs when
running the agent.

Bumped the openai typescript version to add the new feature.

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-18 22:15:01 -07:00
Fouad Matin
6f2271e8cd bump(version): 0.1.2504181820 (#385) 2025-04-18 18:25:33 -07:00
Fouad Matin
aa32e22d4b fix: /bug report command, thinking indicator (#381)
- Fix `/bug` report command
- Fix thinking indicator
2025-04-18 18:13:34 -07:00
Jon Church
c40f4891d4 chore: update lockfile (#379)
Not 100% sure this isn't a me thing, but it might be dirty state left
over from the pnpm migration.
2025-04-18 17:17:35 -07:00
Thibault Sottiaux
9d77d791e3 fix: include pnpm lock file (#377)
Signed-off-by: Thibault Sottiaux <tibo@openai.com>
2025-04-18 17:01:11 -07:00
Benny Yen
d61da89ed3 feat: notify when a newer version is available (#333)
**Summary**  
This change introduces a new startup check that notifies users if a
newer `@openai/codex` version is available. To avoid spamming, it writes
a small state file recording the last check time and will only re‑check
once every 24 hours.
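
The throttling idea as a sketch, assuming the state file is plain JSON (`codex-state.json` under `CONFIG_DIR`, per the description):

```ts
import fs from "node:fs/promises";
import path from "node:path";

const UPDATE_CHECK_FREQUENCY = 24 * 60 * 60 * 1000; // 24 hours in ms

type UpdateState = { lastUpdateCheck?: number };

// Returns true at most once per UPDATE_CHECK_FREQUENCY, persisting the
// timestamp so repeated CLI invocations stay quiet.
async function shouldCheckForUpdates(configDir: string): Promise<boolean> {
  const stateFile = path.join(configDir, "codex-state.json");
  let state: UpdateState = {};
  try {
    state = JSON.parse(await fs.readFile(stateFile, "utf8"));
  } catch {
    // first run or unreadable state: treat as never checked
  }
  const now = Date.now();
  if (state.lastUpdateCheck && now - state.lastUpdateCheck < UPDATE_CHECK_FREQUENCY) {
    return false;
  }
  await fs.writeFile(stateFile, JSON.stringify({ ...state, lastUpdateCheck: now }));
  return true;
}
```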

**What’s Changed**  
- **New file** `src/utils/check-updates.ts`  
  - Runs `npm outdated --global @openai/codex`  
  - Reads/writes `codex-state.json` under `CONFIG_DIR`  
  - Limits checks to once per day (`UPDATE_CHECK_FREQUENCY = 24h`)  
- Uses `boxen` for a styled alert and `which` to locate the npm binary
- **Hooked into** `src/cli.tsx` entrypoint:
  ```ts
  import { checkForUpdates } from "./utils/check-updates";
  // …
  // after loading config
  await checkForUpdates().catch(() => {});
  ```
- **Dependencies**  
  - Added `boxen@^8.0.1`, `which@^5.0.0`, `@types/which@^3.0.4`  
- **Tests**  
  - Vitest suite under `tests/check-updates.test.ts`  
  - Snapshot in `__snapshots__/check-updates.test.ts.snap`  

**Motivation**  
Addresses issue #244. Users running a stale global install will now see
a friendly reminder—at most once per day—to upgrade and enjoy the latest
features.

**Test Plan**  
- `getNPMCommandPath()` resolves npm correctly  
- `checkOutdated()` parses `npm outdated` JSON  
- State file prevents repeat alerts within 24h  
- Boxen snapshot matches expected output  
- No console output when state indicates a recent check  

**Related Issue**  
Resolves #244


**Preview**
Shows a pnpm-style alert when the installed version is outdated

![outdated‑alert](https://github.com/user-attachments/assets/294dad45-d858-45d1-bf34-55e672ab883a)

Let me know if you’d tweak any of the messaging, throttle frequency,
placement in the startup flow, or anything else.

---------

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-04-18 17:00:45 -07:00
Abdelrhman Kamal Mahmoud Ali Slim
d69a17ac49 Fix: Change file name to start with small letter instead of captial l… (#356) 2025-04-18 16:55:49 -07:00
BadPirate
bec3947058 Fix #371 Allow multiple containers on same machine (#373)
- Docker container name based on working directory
- Centralize container removal logic
- Improve quoting for command arguments
- Ensure workdir is always set and normalized
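
A sketch of deriving a per-workdir container name so two checkouts never collide (the hashing scheme is illustrative):

```ts
import crypto from "node:crypto";
import path from "node:path";

// One container per working directory: the same (normalized) workdir
// always maps to the same name, and different workdirs never collide.
function containerNameFor(workdir: string): string {
  const normalized = path.resolve(workdir);
  const digest = crypto
    .createHash("sha256")
    .update(normalized)
    .digest("hex")
    .slice(0, 12);
  return `codex_${digest}`;
}
```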

Resolves: #371 

Signed-off-by: BadPirate <badpirate@gmail.com>
2025-04-18 16:48:07 -07:00
Alpha Diop
e2fe2572ba chore: migrate to pnpm for improved monorepo management (#287)
# Migrate to pnpm for improved monorepo management

## Summary
This PR migrates the Codex repository from npm to pnpm, providing faster
dependency installation, better disk space usage, and improved monorepo
management.

## Changes
- Added `pnpm-workspace.yaml` to define workspace packages
- Added `.npmrc` with optimal pnpm configuration
- Updated root package.json with workspace scripts
- Moved resolutions and overrides to the root package.json
- Updated scripts to use pnpm instead of npm
- Added documentation for the migration
- Updated GitHub Actions workflow for pnpm

## Benefits
- **Faster installations**: pnpm is significantly faster than npm
- **Disk space savings**: pnpm's content-addressable store avoids
duplication
- **Strict dependency management**: prevents phantom dependencies
- **Simplified monorepo management**: better workspace coordination
- **Preparation for Turborepo**: as discussed, this is the first step
before adding Turborepo

## Testing
- Verified that `pnpm install` works correctly
- Verified that `pnpm run build` completes successfully
- Ensured all existing functionality is preserved

## Documentation
Added a detailed migration guide in `PNPM_MIGRATION.md` explaining:
- Why we're migrating to pnpm
- How to use pnpm with this repository
- Common commands and workspace-specific commands
- Monorepo structure and configuration

## Next Steps
As discussed, once this change is stable, we can consider adding
Turborepo as a follow-up enhancement.
2025-04-18 16:25:15 -07:00
Jon Church
9a046dfcaa Revert "fix: canonicalize the writeable paths used in seatbelt policy… (#370)
This reverts commit 3356ac0aef.

related #330
2025-04-18 16:11:34 -07:00
Fouad Matin
8e2e77fafb feat: add /bug report command (#312)
Add `/bug` command for chat session
2025-04-18 14:09:35 -07:00
Prama
4acd7d8617 fix: Improper spawn of sh on Windows Powershell (#318)
# Fix CLI launcher on Windows by replacing `sh`-based entrypoint with
cross-platform Node script

## What's changed

* This PR replaces the sh-based entry point with a Node script that
works on all platforms, including Windows PowerShell and CMD.

## Why 

* Previously, when installing Codex globally via `npm i -g
@openai/codex`, Windows resulted in a broken CLI issue due to the `ps1`
launcher trying to execute `sh.exe`.

* If users don't have a Unix-style shell, running the command will fail,
as seen below, since `sh.exe` can't be found

* Output:
 ``` 
& : The term 'sh.exe' is not recognized as the name of a cmdlet,
function, script file, or operable program. Check the
spelling of the name, or if a path was included, verify that the path is
correct and try again.
At C:\Users\{user}\AppData\Roaming\npm\codex.ps1:24 char:7
+     & "sh$exe"  "$basedir/node_modules/@openai/codex/bin/codex" $args
+       ~~~~~~~~
+ CategoryInfo : ObjectNotFound: (sh.exe:String) [],
CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
```



## How
* By using a Node-based entry point that resolves the path to the compiled ESM bundle and dynamically loads it using native ESM

* Removed the dependency on platform-specific launchers, allowing a single entrypoint to work everywhere Node.js runs.
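
A sketch of such a launcher (file layout assumed, not the PR's exact code):

```ts
#!/usr/bin/env node
// bin/codex.js (sketch): resolve the compiled ESM bundle relative to this
// launcher and load it with a native dynamic import. This runs anywhere
// Node.js runs, with no dependency on sh.exe or platform-specific shims.
import { fileURLToPath, pathToFileURL } from "node:url";
import path from "node:path";

const here = path.dirname(fileURLToPath(import.meta.url));
const cliPath = path.resolve(here, "../dist/cli.js"); // assumed bundle location
await import(pathToFileURL(cliPath).href);
```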


## Result

Codex CLI is now cross-platform and launches correctly via:
* macOS / Linux
* Windows PowerShell
* Git Bash
* CMD
* WSL

Directly addresses #316 

![image](https://github.com/user-attachments/assets/85faaca4-24bc-47c9-8160-4e30df6da4c3)


![image](https://github.com/user-attachments/assets/a13f7adc-52c1-4c0e-af02-e35a35dc45d4)
2025-04-18 11:35:51 -07:00
Amar Sood
82f5abbeea Fix handling of Shift+Enter in e.g. Ghostty (#338)
Fix: Shift + Enter no longer prints “[27;2;13~” in the single‑line
input. Validated as working and necessary in Ghostty on Linux.

## Key points
- src/components/vendor/ink-text-input.tsx
- Added early handler that recognises the two modifyOtherKeys
escape‑sequences
    - [13;<mod>u  (mode 2 / CSI‑u)
    - [27;<mod>;13~ (mode 1 / legacy CSI‑~)
- If Ctrl is held (hasCtrl flag) → call onSubmit() (same as plain
Enter).
- Otherwise → insert a real newline at the caret (same as Option+Enter).
  - Prevents the raw sequence from being inserted into the buffer.

- src/components/chat/multiline-editor.tsx
- Replaced non‑breaking spaces with normal spaces to satisfy eslint
no‑irregular‑whitespace rule (no behaviour change).
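
A sketch of recognising those two sequences (xterm encodes the modifier as 1 + a bitmask with Shift=1, Alt=2, Ctrl=4, so Shift+Enter arrives with mod=2 and Ctrl+Enter with mod=5):

```ts
// The two modifyOtherKeys encodings of Enter:
//   CSI 13;<mod>u    (mode 2 / CSI-u)
//   CSI 27;<mod>;13~ (mode 1 / legacy CSI-~)
const CSI_U_ENTER = /^\x1b\[13;(\d+)u$/;
const LEGACY_ENTER = /^\x1b\[27;(\d+);13~$/;

// Ctrl+Enter submits like plain Enter; any other modified Enter inserts a
// real newline instead of the raw escape sequence.
function classifyEnterSequence(seq: string): "submit" | "newline" | null {
  const match = CSI_U_ENTER.exec(seq) ?? LEGACY_ENTER.exec(seq);
  if (!match) {
    return null;
  }
  const hasCtrl = ((Number(match[1]) - 1) & 4) !== 0;
  return hasCtrl ? "submit" : "newline";
}
```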

All unit tests (114) and ESLint now pass:
npm test ✔️
npm run lint ✔️
2025-04-18 09:19:06 -07:00
Thomas
7b5f343179 fix: update context left display logic in TerminalChatInput component (#307)
Added persistent context length with colour coding.
2025-04-18 09:16:33 -07:00
241 changed files with 32236 additions and 10688 deletions

View File

@@ -19,13 +19,14 @@ body:
id: version
attributes:
label: What version of Codex is running?
- description: Copy the output of `codex --revision`
+ description: Copy the output of `codex --version`
- type: input
id: model
attributes:
label: Which model were you using?
description: Like `gpt-4.1`, `o4-mini`, `o3`, etc.
- type: input
id: platform
attributes:
label: What platform is your computer?
description: |

View File

@@ -19,40 +19,56 @@ jobs:
with:
node-version: 22
# Run codex-cli/ tasks first because they are higher signal.
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 10.8.1
run_install: false
- name: Install dependencies (codex-cli)
working-directory: codex-cli
run: npm ci
- name: Check formatting (codex-cli)
working-directory: codex-cli
run: npm run format
- name: Run tests (codex-cli)
working-directory: codex-cli
run: npm run test
- name: Lint (codex-cli)
working-directory: codex-cli
- name: Get pnpm store directory
id: pnpm-cache
shell: bash
run: |
- npm run lint -- \
+ echo "store_path=$(pnpm store path --silent)" >> $GITHUB_OUTPUT
- name: Setup pnpm cache
uses: actions/cache@v4
with:
path: ${{ steps.pnpm-cache.outputs.store_path }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
- name: Install dependencies
run: pnpm install
# Run all tasks using workspace filters
- name: Check TypeScript code formatting
working-directory: codex-cli
run: pnpm run format
- name: Check Markdown and config file formatting
run: pnpm run format
- name: Run tests
run: pnpm run test
- name: Lint
run: |
pnpm --filter @openai/codex exec -- eslint src tests --ext ts --ext tsx \
--report-unused-disable-directives \
--rule "no-console:error" \
--rule "no-debugger:error" \
--max-warnings=-1
- name: Typecheck (codex-cli)
working-directory: codex-cli
run: npm run typecheck
- name: Type-check
run: pnpm run typecheck
- name: Build (codex-cli)
working-directory: codex-cli
run: npm run build
- name: Build
run: pnpm run build
# Run formatting checks in the root directory last.
- name: Install dependencies (root)
run: npm ci
- name: Check formatting (root)
run: npm run format
- name: Ensure README.md contains only ASCII and certain Unicode code points
run: ./scripts/asciicheck.py README.md
- name: Check README ToC
run: python3 scripts/readme_toc.py README.md

.github/workflows/rust-ci.yml (vendored, new file, 94 lines)

@@ -0,0 +1,94 @@
name: rust-ci
on:
pull_request:
branches:
- main
paths:
- "codex-rs/**"
- ".github/**"
push:
branches:
- main
workflow_dispatch:
# For CI, we build in debug (`--profile dev`) rather than release mode so we
# get signal faster.
jobs:
# CI that don't need specific targets
general:
name: Format / etc
runs-on: ubuntu-24.04
defaults:
run:
working-directory: codex-rs
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: cargo fmt
run: cargo fmt -- --config imports_granularity=Item --check
# CI to validate on different os/targets
lint_build_test:
name: ${{ matrix.runner }} - ${{ matrix.target }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 30
defaults:
run:
working-directory: codex-rs
strategy:
fail-fast: false
matrix:
# Note: While Codex CLI does not support Windows today, we include
# Windows in CI to ensure the code at least builds there.
include:
- runner: macos-14
target: aarch64-apple-darwin
- runner: macos-14
target: x86_64-apple-darwin
- runner: ubuntu-24.04
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
${{ github.workspace }}/codex-rs/target/
key: cargo-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config
- name: Initialize failure flag
run: echo "FAILED=" >> $GITHUB_ENV
- name: cargo clippy
run: cargo clippy --target ${{ matrix.target }} --all-features -- -D warnings || echo "FAILED=${FAILED:+$FAILED, }cargo clippy" >> $GITHUB_ENV
- name: cargo test
run: cargo test --target ${{ matrix.target }} || echo "FAILED=${FAILED:+$FAILED, }cargo test" >> $GITHUB_ENV
- name: Fail if any step failed
if: env.FAILED != ''
run: |
echo "See logs above, as the following steps failed:"
echo "$FAILED"
exit 1

.gitignore (vendored, 14 lines changed)

@@ -1,5 +1,11 @@
# deps
# Node.js dependencies
node_modules
.pnpm-store
.pnpm-debug.log
# Keep pnpm-lock.yaml
!pnpm-lock.yaml
# build
dist/
@@ -17,9 +23,14 @@ result
.vscode/
.idea/
.history/
.zed/
*.swp
*~
# cli tools
CLAUDE.md
.claude/
# caches
.cache/
.turbo/
@@ -61,9 +72,8 @@ Icon?
# Unwanted package managers
.yarn/
yarn.lock
pnpm-lock.yaml
# release
package.json-e
session.ts-e
CHANGELOG.ignore.md

.husky/pre-commit (new file, 1 line)

@@ -0,0 +1 @@
pnpm lint-staged

.npmrc (new file, 4 lines)

@@ -0,0 +1,4 @@
shamefully-hoist=true
strict-peer-dependencies=false
node-linker=hoisted
prefer-workspace-packages=true

View File

@@ -1,2 +1,3 @@
/codex-cli/dist
/codex-cli/node_modules
pnpm-lock.yaml

View File

@@ -2,19 +2,123 @@
You can install any of these versions: `npm install -g codex@version`
## 0.1.2504172351
## `0.1.2504251709`
### 🚀 Features
- Add openai model info configuration (#551)
- Added provider to run quiet mode function (#571)
- Create parent directories when creating new files (#552)
- Print bug report URL in terminal instead of opening browser (#510) (#528)
- Add support for custom provider configuration in the user config (#537)
- Add support for OpenAI-Organization and OpenAI-Project headers (#626)
- Add specific instructions for creating API keys in error msg (#581)
- Enhance toCodePoints to prevent potential unicode 14 errors (#615)
- More native keyboard navigation in multiline editor (#655)
- Display error on selection of invalid model (#594)
### 🪲 Bug Fixes
- Model selection (#643)
- Nits in apply patch (#640)
- Input keyboard shortcuts (#676)
- `apply_patch` unicode characters (#625)
- Don't clear turn input before retries (#611)
- More loosely match context for apply_patch (#610)
- Update bug report template - there is no --revision flag (#614)
- Remove outdated copy of text input and external editor feature (#670)
- Remove unreachable "disableResponseStorage" logic flow introduced in #543 (#573)
- Non-openai mode - fix for gemini content: null, fix 429 to throw before stream (#563)
- Only allow going up in history when not already in history if input is empty (#654)
- Do not grant "node" user sudo access when using run_in_container.sh (#627)
- Update scripts/build_container.sh to use pnpm instead of npm (#631)
- Update lint-staged config to use pnpm --filter (#582)
- Non-openai mode - don't default temp and top_p (#572)
- Fix error catching when checking for updates (#597)
- Close stdin when running an exec tool call (#636)
## `0.1.2504221401`
### 🚀 Features
- Show actionable errors when api keys are missing (#523)
- Add CLI `--version` flag (#492)
### 🪲 Bug Fixes
- Agent loop for ZDR (`disableResponseStorage`) (#543)
- Fix relative `workdir` check for `apply_patch` (#556)
- Minimal mid-stream #429 retry loop using existing back-off (#506)
- Inconsistent usage of base URL and API key (#507)
- Remove requirement for api key for ollama (#546)
- Support `[provider]_BASE_URL` (#542)
## `0.1.2504220136`
### 🚀 Features
- Add support for ZDR orgs (#481)
- Include fractional portion of chunk that exceeds stdout/stderr limit (#497)
## `0.1.2504211509`
### 🚀 Features
- Support multiple providers via Responses-Completion transformation (#247)
- Add user-defined safe commands configuration and approval logic #380 (#386)
- Allow switching approval modes when prompted to approve an edit/command (#400)
- Add support for `/diff` command autocomplete in TerminalChatInput (#431)
- Auto-open model selector if user selects deprecated model (#427)
- Read approvalMode from config file (#298)
- `/diff` command to view git diff (#426)
- Tab completions for file paths (#279)
- Add /command autocomplete (#317)
- Allow multi-line input (#438)
### 🪲 Bug Fixes
- `full-auto` support in quiet mode (#374)
- Enable shell option for child process execution (#391)
- Configure husky and lint-staged for pnpm monorepo (#384)
- Command pipe execution by improving shell detection (#437)
- Name of the file not matching the name of the component (#354)
- Allow proper exit from new Switch approval mode dialog (#453)
- Ensure /clear resets context and exclude system messages from approximateTokenUsed count (#443)
- `/clear` now clears terminal screen and resets context left indicator (#425)
- Correct fish completion function name in CLI script (#485)
- Auto-open model-selector when model is not found (#448)
- Remove unnecessary isLoggingEnabled() checks (#420)
- Improve test reliability for `raw-exec` (#434)
- Unintended tear down of agent loop (#483)
- Remove extraneous type casts (#462)
## `0.1.2504181820`
### 🚀 Features
- Add `/bug` report command (#312)
- Notify when a newer version is available (#333)
### 🪲 Bug Fixes
- Update context left display logic in TerminalChatInput component (#307)
- Improper spawn of sh on Windows Powershell (#318)
- `/bug` report command, thinking indicator (#381)
- Include pnpm lock file (#377)
## `0.1.2504172351`
### 🚀 Features
- Add Nix flake for reproducible development environments (#225)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- Handle invalid commands (#304)
- Raw-exec-process-group.test improve reliability and error handling (#280)
- Canonicalize the writeable paths used in seatbelt policy (#275)
## 0.1.2504172304
## `0.1.2504172304`
### 🚀 Features
@@ -27,7 +131,7 @@ You can install any of these versions: `npm install -g codex@version`
- `--config`/`-c` flag to open global instructions in nvim (#158)
- Update position of cursor when navigating input history with arrow keys to the end of the text (#255)
### 🐛 Bug Fixes
### 🪲 Bug Fixes
- Correct word deletion logic for trailing spaces (Ctrl+Backspace) (#131)
- Improve Windows compatibility for CLI commands and sandbox (#261)

PNPM.md (new file, 70 lines)

@@ -0,0 +1,70 @@
# Migration to pnpm
This project has been migrated from npm to pnpm to improve dependency management and developer experience.
## Why pnpm?
- **Faster installation**: pnpm is significantly faster than npm and yarn
- **Disk space savings**: pnpm uses a content-addressable store to avoid duplication
- **Phantom dependency prevention**: pnpm creates a strict node_modules structure
- **Native workspaces support**: simplified monorepo management
## How to use pnpm
### Installation
```bash
# Global installation of pnpm
npm install -g pnpm@10.8.1
# Or with corepack (available with Node.js 22+)
corepack enable
corepack prepare pnpm@10.8.1 --activate
```
### Common commands
| npm command | pnpm equivalent |
| --------------- | ---------------- |
| `npm install` | `pnpm install` |
| `npm run build` | `pnpm run build` |
| `npm test` | `pnpm test` |
| `npm run lint` | `pnpm run lint` |
### Workspace-specific commands
| Action | Command |
| ------------------------------------------ | ---------------------------------------- |
| Run a command in a specific package | `pnpm --filter @openai/codex run build` |
| Install a dependency in a specific package | `pnpm --filter @openai/codex add lodash` |
| Run a command in all packages | `pnpm -r run test` |
## Monorepo structure
```
codex/
├── pnpm-workspace.yaml # Workspace configuration
├── .npmrc # pnpm configuration
├── package.json # Root dependencies and scripts
├── codex-cli/ # Main package
│ └── package.json # codex-cli specific dependencies
└── docs/ # Documentation (future package)
```
## Configuration files
- **pnpm-workspace.yaml**: Defines the packages included in the monorepo
- **.npmrc**: Configures pnpm behavior
- **Root package.json**: Contains shared scripts and dependencies
## CI/CD
CI/CD workflows have been updated to use pnpm instead of npm. Make sure your CI environments use pnpm 10.8.1 or higher.
## Known issues
If you encounter issues with pnpm, try the following solutions:
1. Remove the `node_modules` folder and `pnpm-lock.yaml` file, then run `pnpm install`
2. Make sure you're using pnpm 10.8.1 or higher
3. Verify that Node.js 22 or higher is installed

README.md (520 lines changed)

@@ -10,24 +10,36 @@
<details>
<summary><strong>Table&nbsp;of&nbsp;Contents</strong></summary>
<!-- Begin ToC -->
- [Experimental Technology Disclaimer](#experimental-technology-disclaimer)
- [Quickstart](#quickstart)
- [Why Codex?](#whycodex)
- [Security Model \& Permissions](#securitymodelpermissions)
- [Why Codex?](#why-codex)
- [Security Model & Permissions](#security-model--permissions)
- [Platform sandboxing details](#platform-sandboxing-details)
- [System Requirements](#systemrequirements)
- [CLI Reference](#clireference)
- [Memory \& Project Docs](#memoryprojectdocs)
- [Noninteractive / CI mode](#noninteractivecimode)
- [System Requirements](#system-requirements)
- [CLI Reference](#cli-reference)
- [Memory & Project Docs](#memory--project-docs)
- [Non-interactive / CI mode](#non-interactive--ci-mode)
- [Tracing / Verbose Logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [Configuration](#configuration)
- [Configuration Guide](#configuration-guide)
- [Basic Configuration Parameters](#basic-configuration-parameters)
- [Custom AI Provider Configuration](#custom-ai-provider-configuration)
- [History Configuration](#history-configuration)
- [Configuration Examples](#configuration-examples)
- [Full Configuration Example](#full-configuration-example)
- [Custom Instructions](#custom-instructions)
- [Environment Variables Setup](#environment-variables-setup)
- [FAQ](#faq)
- [Funding Opportunity](#funding-opportunity)
- [Zero Data Retention (ZDR) Usage](#zero-data-retention-zdr-usage)
- [Codex Open Source Fund](#codex-open-source-fund)
- [Contributing](#contributing)
- [Development workflow](#development-workflow)
- [Nix Flake Development](#nix-flake-development)
- [Writing highimpact code changes](#writing-highimpact-code-changes)
- [Git Hooks with Husky](#git-hooks-with-husky)
- [Debugging](#debugging)
- [Writing high-impact code changes](#writing-high-impact-code-changes)
- [Opening a pull request](#opening-a-pull-request)
- [Review process](#review-process)
- [Community values](#community-values)
@@ -35,9 +47,12 @@
- [Contributor License Agreement (CLA)](#contributor-license-agreement-cla)
- [Quick fixes](#quick-fixes)
- [Releasing `codex`](#releasing-codex)
- [Security \& Responsible AI](#securityresponsibleai)
- [Alternative Build Options](#alternative-build-options)
- [Nix Flake Development](#nix-flake-development)
- [Security & Responsible AI](#security--responsible-ai)
- [License](#license)
- [Zero Data Retention (ZDR) Organization Limitation](#zero-data-retention-zdr-organization-limitation)
<!-- End ToC -->
</details>
@@ -45,7 +60,7 @@
## Experimental Technology Disclaimer
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. Were building it in the open with the community and welcome:
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
@@ -68,9 +83,7 @@ Next, set your OpenAI API key as an environment variable:
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. To make it permanent, add the `export` line to your shell's configuration file (e.g., `~/.zshrc`).
>
> **Tip:** You can also place your API key into a `.env` file at the root of your project:
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`) but we recommend setting for the session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
@@ -78,6 +91,36 @@ export OPENAI_API_KEY="your-api-key-here"
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
>
> ```shell
> export <provider>_BASE_URL="https://your-provider-api-base-url"
> ```
</details>
<br />
Run interactively:
```shell
@@ -94,59 +137,59 @@ codex "explain this codebase to me"
codex --approval-mode full-auto "create the fanciest todo-list app"
```
Thats it Codex will scaffold a file, run it inside a sandbox, install any
That's it - Codex will scaffold a file, run it inside a sandbox, install any
missing dependencies, and show you the live result. Approve the changes and
theyll be committed to your working directory.
they'll be committed to your working directory.
---
## Why Codex?
## Why Codex?
Codex CLI is built for developers who already **live in the terminal** and want
ChatGPTlevel reasoning **plus** the power to actually run code, manipulate
files, and iterate all under version control. In short, its _chatdriven
ChatGPT-level reasoning **plus** the power to actually run code, manipulate
files, and iterate - all under version control. In short, it's _chat-driven
development_ that understands and executes your repo.
- **Zero setup** bring your OpenAI API key and it just works!
- **Zero setup** - bring your OpenAI API key and it just works!
- **Full auto-approval, while safe + secure** by running network-disabled and directory-sandboxed
- **Multimodal** pass in screenshots or diagrams to implement features ✨
- **Multimodal** - pass in screenshots or diagrams to implement features ✨
And it's **fully open-source** so you can see and contribute to how it develops!
---
## Security Model & Permissions
## Security Model & Permissions
Codex lets you decide _how much autonomy_ the agent receives and auto-approval policy via the
`--approval-mode` flag (or the interactive onboarding prompt):
| Mode | What the agent may do without asking | Still requires approval |
| ------------------------- | -------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Suggest** <br>(default) | Read any file in the repo | **All** file writes/patches <br>• **Any** arbitrary shell commands (aside from reading files) |
| **Auto Edit** | Read **and** applypatch writes to files | **All** shell commands |
| **Full Auto** | Read/write files <br>• Execute shell commands (network disabled, writes limited to your workdir) | |
| Mode | What the agent may do without asking | Still requires approval |
| ------------------------- | --------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Suggest** <br>(default) | <li>Read any file in the repo | <li>**All** file writes/patches<li> **Any** arbitrary shell commands (aside from reading files) |
| **Auto Edit** | <li>Read **and** apply-patch writes to files | <li>**All** shell commands |
| **Full Auto** | <li>Read/write files <li> Execute shell commands (network disabled, writes limited to your workdir) | - |
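A minimal sketch of the table above as code (illustrative only, not the CLI's actual implementation):

```typescript
// Which actions need explicit user approval under each mode.
type ApprovalMode = "suggest" | "auto-edit" | "full-auto";
type AgentAction = "read-file" | "apply-patch" | "shell-command";

function requiresApproval(mode: ApprovalMode, action: AgentAction): boolean {
  switch (mode) {
    case "suggest":
      return action !== "read-file"; // only reads run without asking
    case "auto-edit":
      return action === "shell-command"; // reads and patch writes run freely
    case "full-auto":
      return false; // everything runs (sandboxed, network-disabled)
  }
}
```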
In **Full Auto** every command is run **network‑disabled** and confined to the
current working directory (plus temporary files) for defense‑in‑depth. Codex
will also show a warning/confirmation if you start in **auto‑edit** or
**full‑auto** while the directory is _not_ tracked by Git, so you always have a
In **Full Auto** every command is run **network-disabled** and confined to the
current working directory (plus temporary files) for defense-in-depth. Codex
will also show a warning/confirmation if you start in **auto-edit** or
**full-auto** while the directory is _not_ tracked by Git, so you always have a
safety net.
Coming soon: you'll be able to whitelist specific commands to auto‑execute with
the network enabled, once we're confident in additional safeguards.
Coming soon: you'll be able to whitelist specific commands to auto-execute with
the network enabled, once we're confident in additional safeguards.
### Platform sandboxing details
The hardening mechanism Codex uses depends on your OS:
- **macOS 12+** – commands are wrapped with **Apple Seatbelt** (`sandbox-exec`).
- **macOS 12+** - commands are wrapped with **Apple Seatbelt** (`sandbox-exec`).
- Everything is placed in a read‑only jail except for a small set of
- Everything is placed in a read-only jail except for a small set of
writable roots (`$PWD`, `$TMPDIR`, `~/.codex`, etc.).
- Outbound network is _fully blocked_ by default – even if a child process
- Outbound network is _fully blocked_ by default - even if a child process
tries to `curl` somewhere it will fail.
- **Linux** – there is no sandboxing by default.
- **Linux** - there is no sandboxing by default.
We recommend using Docker for sandboxing, where Codex launches itself inside a **minimal
container image** and mounts your repo _read/write_ at the same path. A
custom `iptables`/`ipset` firewall script denies all egress except the
@@ -155,47 +198,47 @@ The hardening mechanism Codex uses depends on your OS:
---
## System Requirements
## System Requirements
| Requirement | Details |
| --------------------------- | --------------------------------------------------------------- |
| Operating systems | macOS 12+, Ubuntu 20.04+/Debian 10+, or Windows 11 **via WSL2** |
| Operating systems | macOS 12+, Ubuntu 20.04+/Debian 10+, or Windows 11 **via WSL2** |
| Node.js | **22 or newer** (LTS recommended) |
| Git (optional, recommended) | 2.23+ for built‑in PR helpers |
| RAM | 4‑GB minimum (8‑GB recommended) |
| Git (optional, recommended) | 2.23+ for built-in PR helpers |
| RAM | 4 GB minimum (8 GB recommended) |
> Never run `sudo npm install -g`; fix npm permissions instead.
---
## CLI Reference
## CLI Reference
| Command | Purpose | Example |
| ------------------------------------ | ----------------------------------- | ------------------------------------ |
| `codex` | Interactive REPL | `codex` |
| `codex ""` | Initial prompt for interactive REPL | `codex "fix lint errors"` |
| `codex -q ""` | Noninteractive "quiet mode" | `codex -q --json "explain utils.ts"` |
| `codex "..."` | Initial prompt for interactive REPL | `codex "fix lint errors"` |
| `codex -q "..."` | Non-interactive "quiet mode" | `codex -q --json "explain utils.ts"` |
| `codex completion <bash\|zsh\|fish>` | Print shell completion script | `codex completion bash` |
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
---
## Memory & Project Docs
## Memory & Project Docs
Codex merges Markdown instructions in this order:
1. `~/.codex/instructions.md` – personal global guidance
2. `codex.md` at repo root – shared project notes
3. `codex.md` in cwd – sub‑package specifics
1. `~/.codex/instructions.md` - personal global guidance
2. `codex.md` at repo root - shared project notes
3. `codex.md` in cwd - sub-package specifics
Disable with `--no-project-doc` or `CODEX_DISABLE_PROJECT_DOC=1`.
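As a rough sketch, the merge behaves like concatenating whichever of those files exist, in that order (illustrative only; the real loader lives in the CLI and also honours the flags above):

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Rough sketch of the documented merge order.
function mergedProjectDocs(repoRoot: string, cwd: string): string {
  const candidates = [
    path.join(os.homedir(), ".codex", "instructions.md"), // 1. personal guidance
    path.join(repoRoot, "codex.md"), // 2. shared project notes
    path.join(cwd, "codex.md"), // 3. sub-package specifics
  ];
  return [...new Set(candidates)] // skip a duplicate when cwd is the repo root
    .filter((p) => fs.existsSync(p))
    .map((p) => fs.readFileSync(p, "utf8"))
    .join("\n\n");
}
```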
---
## Non‑interactive / CI mode
## Non-interactive / CI mode
Run Codex headless in pipelines. Example GitHub Action step:
Run Codex headless in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
@@ -219,15 +262,15 @@ DEBUG=true codex
## Recipes
Below are a few bite‑size examples you can copy‑paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
| ✨ | What you type | What happens |
| --- | ------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| 1 | `codex "Refactor the Dashboard component to React Hooks"` | Codex rewrites the class component, runs `npm test`, and shows the diff. |
| 1 | `codex "Refactor the Dashboard component to React Hooks"` | Codex rewrites the class component, runs `npm test`, and shows the diff. |
| 2 | `codex "Generate SQL migrations for adding a users table"` | Infers your ORM, creates migration files, and runs them in a sandboxed DB. |
| 3 | `codex "Write unit tests for utils/date.ts"` | Generates tests, executes them, and iterates until they pass. |
| 4 | `codex "Bulk‑rename *.jpeg → *.jpg with git mv"` | Safely renames files and updates imports/usages. |
| 5 | `codex "Explain what this regex does: ^(?=.*[A-Z]).{8,}$"` | Outputs a step‑by‑step human explanation. |
| 4 | `codex "Bulk-rename *.jpeg -> *.jpg with git mv"` | Safely renames files and updates imports/usages. |
| 5 | `codex "Explain what this regex does: ^(?=.*[A-Z]).{8,}$"` | Outputs a step-by-step human explanation. |
| 6 | `codex "Carefully review this repo, and propose 3 high impact well-scoped PRs"` | Suggests impactful PRs in the current codebase. |
| 7 | `codex "Look for vulnerabilities and create a security review report"` | Finds and explains security bugs. |
@@ -236,7 +279,7 @@ Below are a few bitesize examples you can copypaste. Replace the text in q
## Installation
<details open>
<summary><strong>From npm (Recommended)</strong></summary>
<summary><strong>From npm (Recommended)</strong></summary>
```bash
npm install -g @openai/codex
@@ -244,53 +287,175 @@ npm install -g @openai/codex
yarn global add @openai/codex
# or
bun install -g @openai/codex
# or
pnpm add -g @openai/codex
```
</details>
<details>
<summary><strong>Build from source</strong></summary>
<summary><strong>Build from source</strong></summary>
```bash
# Clone the repository and navigate to the CLI package
git clone https://github.com/openai/codex.git
cd codex/codex-cli
# Enable corepack
corepack enable
# Install dependencies and build
npm install
npm run build
pnpm install
pnpm build
# Get the usage and the options
node ./dist/cli.js --help
# Run the locally‑built CLI directly
# Run the locally-built CLI directly
node ./dist/cli.js
# Or link the command globally for convenience
npm link
pnpm link
```
</details>
---
## Configuration
## Configuration Guide
Codex looks for config files in **`~/.codex/`**.
Codex configuration files can be placed in the `~/.codex/` directory, supporting both YAML and JSON formats.
### Basic Configuration Parameters
| Parameter | Type | Default | Description | Available Options |
| ------------------- | ------- | ---------- | -------------------------------- | ---------------------------------------------------------------------------------------------- |
| `model` | string | `o4-mini` | AI model to use | Any model name supporting OpenAI API |
| `approvalMode` | string | `suggest` | AI assistant's permission mode | `suggest` (suggestions only)<br>`auto-edit` (automatic edits)<br>`full-auto` (fully automatic) |
| `fullAutoErrorMode` | string | `ask-user` | Error handling in full-auto mode | `ask-user` (prompt for user input)<br>`ignore-and-continue` (ignore and proceed) |
| `notify` | boolean | `true` | Enable desktop notifications | `true`/`false` |
### Custom AI Provider Configuration
In the `providers` object, you can configure multiple AI service providers. Each provider requires the following parameters:
| Parameter | Type | Description | Example |
| --------- | ------ | --------------------------------------- | ----------------------------- |
| `name` | string | Display name of the provider | `"OpenAI"` |
| `baseURL` | string | API service URL | `"https://api.openai.com/v1"` |
| `envKey` | string | Environment variable name (for API key) | `"OPENAI_API_KEY"` |
### History Configuration
In the `history` object, you can configure conversation history settings:
| Parameter | Type | Description | Example Value |
| ------------------- | ------- | ------------------------------------------------------ | ------------- |
| `maxSize` | number | Maximum number of history entries to save | `1000` |
| `saveHistory` | boolean | Whether to save history | `true` |
| `sensitivePatterns` | array | Patterns of sensitive information to filter in history | `[]` |
### Configuration Examples
1. YAML format (save as `~/.codex/config.yaml`):
```yaml
# ~/.codex/config.yaml
model: o4-mini # Default model
fullAutoErrorMode: ask-user # or ignore-and-continue
notify: true # Enable desktop notifications for responses
model: o4-mini
approvalMode: suggest
fullAutoErrorMode: ask-user
notify: true
```
You can also define custom instructions:
2. JSON format (save as `~/.codex/config.json`):
```yaml
# ~/.codex/instructions.md
```json
{
"model": "o4-mini",
"approvalMode": "suggest",
"fullAutoErrorMode": "ask-user",
"notify": true
}
```
### Full Configuration Example
Below is a comprehensive example of `config.json` with multiple custom providers:
```json
{
"model": "o4-mini",
"provider": "openai",
"providers": {
"openai": {
"name": "OpenAI",
"baseURL": "https://api.openai.com/v1",
"envKey": "OPENAI_API_KEY"
},
"openrouter": {
"name": "OpenRouter",
"baseURL": "https://openrouter.ai/api/v1",
"envKey": "OPENROUTER_API_KEY"
},
"gemini": {
"name": "Gemini",
"baseURL": "https://generativelanguage.googleapis.com/v1beta/openai",
"envKey": "GEMINI_API_KEY"
},
"ollama": {
"name": "Ollama",
"baseURL": "http://localhost:11434/v1",
"envKey": "OLLAMA_API_KEY"
},
"mistral": {
"name": "Mistral",
"baseURL": "https://api.mistral.ai/v1",
"envKey": "MISTRAL_API_KEY"
},
"deepseek": {
"name": "DeepSeek",
"baseURL": "https://api.deepseek.com",
"envKey": "DEEPSEEK_API_KEY"
},
"xai": {
"name": "xAI",
"baseURL": "https://api.x.ai/v1",
"envKey": "XAI_API_KEY"
},
"groq": {
"name": "Groq",
"baseURL": "https://api.groq.com/openai/v1",
"envKey": "GROQ_API_KEY"
}
},
"history": {
"maxSize": 1000,
"saveHistory": true,
"sensitivePatterns": []
}
}
```
### Custom Instructions
You can create a `~/.codex/instructions.md` file to define custom instructions:
```markdown
- Always respond with emojis
- Only use git commands if I explicitly mention you should
- Only use git commands when explicitly requested
```
### Environment Variables Setup
For each AI provider, you need to set the corresponding API key in your environment variables. For example:
```bash
# OpenAI
export OPENAI_API_KEY="your-api-key-here"
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"
# Similarly for other providers
```
---
@@ -326,43 +491,32 @@ Codex runs model-generated commands in a sandbox. If a proposed command or file
<details>
<summary>Does it work on Windows?</summary>
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) – Codex has been tested on macOS and Linux with Node ≥ 22.
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex has been tested on macOS and Linux with Node 22.
</details>
---
## Zero Data Retention (ZDR) Organization Limitation
## Zero Data Retention (ZDR) Usage
> **Note:** Codex CLI does **not** currently support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled.
If your OpenAI organization has Zero Data Retention enabled, you may encounter errors such as:
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
**Why?**
- Codex CLI relies on the Responses API with `store:true` to enable internal reasoning steps.
- As noted in the [docs](https://platform.openai.com/docs/guides/your-data#responses-api), the Responses API requires a 30-day retention period by default, or when the store parameter is set to true.
- ZDR organizations cannot use `store:true`, so requests will fail.
**What can I do?**
- If you are part of a ZDR organization, Codex CLI will not work until support is added.
- We are tracking this limitation and will update the documentation if support becomes available.
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
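The trade-off behind that fix: with server-side storage the client can chain turns through `previous_response_id`, while a ZDR-compatible request must carry the whole conversation itself. A simplified sketch of the two request shapes (the field names `store`, `previous_response_id`, and `input` are Responses API parameters; the surrounding variables are illustrative):

```typescript
// Illustrative request shapes - not the CLI's internals.
const fullConversationSoFar = [
  { role: "user", content: "explain this codebase to me" },
  // ...every prior turn...
];

const storedTurn = {
  model: "o4-mini",
  store: true, // server retains the response...
  previous_response_id: "resp_abc123", // ...so only the new turn is sent
  input: [{ role: "user", content: "now add tests" }],
};

const zdrTurn = {
  model: "o4-mini",
  store: false, // nothing retained server-side (ZDR-compatible)
  input: fullConversationSoFar, // entire context resent on every request
};
```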
---
## Funding Opportunity
## Codex Open Source Fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Grants are awarded in **$25,000** API credit increments.
- Grants of up to **$25,000** in API credits are awarded.
- Applications are reviewed **on a rolling basis**.
**Interested? [Apply here](https://openai.com/form/codex-open-source-fund/).**
**Interested? [Apply here](https://openai.com/form/codex-open-source-fund/).**
---
@@ -370,14 +524,14 @@ Were excited to launch a **$1 million initiative** supporting open source pr
This project is under active development and the code will likely change pretty significantly. We'll update this message once that's complete!
More broadly we welcome contributions – whether you are opening your very first pull request or you're a seasoned maintainer. At the same time we care about reliability and long‑term maintainability, so the bar for merging code is intentionally **high**. The guidelines below spell out what “high‑quality” means in practice and should make the whole process transparent and friendly.
More broadly we welcome contributions - whether you are opening your very first pull request or you're a seasoned maintainer. At the same time we care about reliability and long-term maintainability, so the bar for merging code is intentionally **high**. The guidelines below spell out what "high-quality" means in practice and should make the whole process transparent and friendly.
### Development workflow
- Create a _topic branch_ from `main` – e.g. `feat/interactive-prompt`.
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `npm run test:watch` during development for super‑fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type‑checking.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git Hooks with Husky
@@ -390,7 +544,7 @@ This project uses [Husky](https://typicode.github.io/husky/) to enforce code qua
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./codex-cli/HUSKY.md).
```bash
npm test && npm run lint && npm run typecheck
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
@@ -399,20 +553,101 @@ npm test && npm run lint && npm run typecheck
I have read the CLA Document and I hereby sign the CLA
```
The CLA‑Assistant bot will turn the PR status green once all authors have signed.
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests re‑run on change)
npm run test:watch
# Watch mode (tests rerun on change)
pnpm test:watch
# Type‑check without emitting files
npm run typecheck
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
npm run lint:fix
npm run format:fix
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
- In VS Code, choose **Debug: Attach to Node Process** from the command palette and select the option in the dropdown with debug port `9229` (likely the first option)
- Go to <chrome://inspect> in Chrome and find **localhost:9229** and click **trace**
### Writing high-impact code changes
1. **Start with an issue.** Open a new one or comment on an existing discussion so we can agree on the solution before code is written.
2. **Add or update tests.** Every new feature or bug-fix should come with test coverage that fails before your change and passes afterwards. 100% coverage is not required, but aim for meaningful assertions.
3. **Document behaviour.** If your change affects user-facing behaviour, update the README, inline help (`codex --help`), or relevant example projects.
4. **Keep commits atomic.** Each commit should compile and the tests should pass. This makes reviews and potential rollbacks easier.
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
### Review process
1. One maintainer will be assigned as a primary reviewer.
2. We may ask for changes - please do not take this personally. We value the work, we just also value consistency and long-term maintainability.
3. When there is consensus that the PR meets the bar, a maintainer will squash-and-merge.
### Community values
- **Be kind and inclusive.** Treat others with respect; we follow the [Contributor Covenant](https://www.contributor-covenant.org/).
- **Assume good intent.** Written communication is hard - err on the side of generosity.
- **Teach & learn.** If you spot something confusing, open an issue or PR with improvements.
### Getting help
If you run into problems setting up the project, would like feedback on an idea, or just want to say _hi_ - please open a Discussion or jump into the relevant issue. We are happy to help.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor License Agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
1. Open your pull request.
2. Paste the following comment (or reply `recheck` if you've signed before):
```text
I have read the CLA Document and I hereby sign the CLA
```
3. The CLA-Assistant bot records your signature in the repo and marks the status check as passed.
No special Git commands, email attachments, or commit footers required.
#### Quick fixes
| Scenario | Command |
| ----------------- | ------------------------------------------------ |
| Amend last commit | `git commit --amend -s --no-edit && git push -f` |
The **DCO check** blocks merges until every commit in the PR carries the footer (with squash this is just the one).
### Releasing `codex`
To publish a new version of the CLI, run the release scripts defined in `codex-cli/package.json`:
1. Open the `codex-cli` directory
2. Make sure you're on a branch like `git checkout -b bump-version`
3. Bump the version and `CLI_VERSION` to current datetime: `pnpm release:version`
4. Commit the version bump (with DCO sign-off):
```bash
git add codex-cli/src/utils/session.ts codex-cli/package.json
git commit -s -m "chore(release): codex-cli v$(node -p \"require('./codex-cli/package.json').version\")"
```
5. Copy README, build, and publish to npm: `pnpm release`
6. Push to branch: `git push origin HEAD`
### Alternative Build Options
#### Nix Flake Development
Prerequisite: Nix >= 2.4 with flakes enabled (`experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`).
@@ -438,85 +673,14 @@ Run the CLI via the flake app:
nix run .#codex
```
### Writing high‑impact code changes
1. **Start with an issue.** Open a new one or comment on an existing discussion so we can agree on the solution before code is written.
2. **Add or update tests.** Every new feature or bug‑fix should come with test coverage that fails before your change and passes afterwards. 100 % coverage is not required, but aim for meaningful assertions.
3. **Document behaviour.** If your change affects user‑facing behaviour, update the README, inline help (`codex --help`), or relevant example projects.
4. **Keep commits atomic.** Each commit should compile and the tests should pass. This makes reviews and potential rollbacks easier.
### Opening a pull request
- Fill in the PR template (or include similar information) – **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up‑to‑date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge‑able state.
### Review process
1. One maintainer will be assigned as a primary reviewer.
2. We may ask for changes – please do not take this personally. We value the work, we just also value consistency and long‑term maintainability.
3. When there is consensus that the PR meets the bar, a maintainer will squash‑and‑merge.
### Community values
- **Be kind and inclusive.** Treat others with respect; we follow the [Contributor Covenant](https://www.contributor-covenant.org/).
- **Assume good intent.** Written communication is hard – err on the side of generosity.
- **Teach & learn.** If you spot something confusing, open an issue or PR with improvements.
### Getting help
If you run into problems setting up the project, would like feedback on an idea, or just want to say _hi_ – please open a Discussion or jump into the relevant issue. We are happy to help.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor License Agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
1. Open your pull request.
2. Paste the following comment (or reply `recheck` if you've signed before):
```text
I have read the CLA Document and I hereby sign the CLA
```
3. The CLA‑Assistant bot records your signature in the repo and marks the status check as passed.
No special Git commands, email attachments, or commit footers required.
#### Quick fixes
| Scenario | Command |
| ----------------- | ----------------------------------------------------------------------------------------- |
| Amend last commit | `git commit --amend -s --no-edit && git push -f` |
| GitHub UI only | Edit the commit message in the PR → add<br>`Signed-off-by: Your Name <email@example.com>` |
The **DCO check** blocks merges until every commit in the PR carries the footer (with squash this is just the one).
### Releasing `codex`
To publish a new version of the CLI, run the release scripts defined in `codex-cli/package.json`:
1. Open the `codex-cli` directory
2. Make sure you're on a branch like `git checkout -b bump-version`
3. Bump the version and `CLI_VERSION` to current datetime: `npm run release:version`
4. Commit the version bump (with DCO sign-off):
```bash
git add codex-cli/src/utils/session.ts codex-cli/package.json
git commit -s -m "chore(release): codex-cli v$(node -p \"require('./codex-cli/package.json').version\")"
```
5. Copy README, build, and publish to npm: `npm run release`
6. Push to branch: `git push origin HEAD`
---
## Security &amp; Responsible AI
## Security & Responsible AI
Have you discovered a vulnerability or have concerns about model output? Please email **security@openai.com** and we will respond promptly.
Have you discovered a vulnerability or have concerns about model output? Please e-mail **security@openai.com** and we will respond promptly.
---
## License
This repository is licensed under the [Apache-2.0 License](LICENSE).
This repository is licensed under the [Apache-2.0 License](LICENSE).

View File

@@ -35,10 +35,10 @@ conventional_commits = true
commit_parsers = [
{ message = "^feat", group = "<!-- 0 -->🚀 Features" },
{ message = "^fix", group = "<!-- 1 -->🐛 Bug Fixes" },
{ message = "^fix", group = "<!-- 1 -->🪲 Bug Fixes" },
{ message = "^bump", group = "<!-- 6 -->🛳️ Release" },
# Fallback – skip anything that didn't match the above rules.
{ message = ".*", group = "<!-- 10 -->💼 Other", skip = true },
{ message = ".*", group = "<!-- 10 -->💼 Other" },
]
filter_unconventional = false

View File

@@ -1,32 +0,0 @@
#!/usr/bin/env sh
if [ -z "$husky_skip_init" ]; then
debug () {
if [ "$HUSKY_DEBUG" = "1" ]; then
echo "husky (debug) - $1"
fi
}
readonly hook_name="$(basename -- "$0")"
debug "starting $hook_name..."
if [ "$HUSKY" = "0" ]; then
debug "HUSKY env variable is set to 0, skipping hook"
exit 0
fi
if [ -f ~/.huskyrc ]; then
debug "sourcing ~/.huskyrc"
. ~/.huskyrc
fi
readonly husky_skip_init=1
export husky_skip_init
sh -e "$0" "$@"
exitCode="$?"
if [ $exitCode != 0 ]; then
echo "husky - $hook_name hook exited with code $exitCode (error)"
fi
exit $exitCode
fi

View File

@@ -1,5 +0,0 @@
#!/usr/bin/env sh
. "$(dirname -- "$0")/_/husky.sh"
# Run lint-staged to check files that are about to be committed
npm run pre-commit

View File

@@ -1,5 +0,0 @@
#!/usr/bin/env sh
. "$(dirname -- "$0")/_/husky.sh"
# Run tests and type checking before pushing
npm test && npm run typecheck

View File

@@ -1,9 +0,0 @@
{
"*.{ts,tsx}": [
"eslint --fix",
"prettier --write"
],
"*.{json,md,yml}": [
"prettier --write"
]
}

View File

@@ -20,7 +20,6 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
less \
man-db \
procps \
sudo \
unzip \
ripgrep \
zsh \
@@ -47,10 +46,14 @@ RUN npm install -g codex.tgz \
&& rm -rf /usr/local/share/npm-global/lib/node_modules/codex-cli/tests \
&& rm -rf /usr/local/share/npm-global/lib/node_modules/codex-cli/docs
# Copy and set up firewall script
COPY scripts/init_firewall.sh /usr/local/bin/
# Inside the container we consider the environment already sufficiently locked
# down, therefore instruct Codex CLI to allow running without sandboxing.
ENV CODEX_UNSAFE_ALLOW_NO_SANDBOX=1
# Copy and set up firewall script as root.
USER root
RUN chmod +x /usr/local/bin/init_firewall.sh && \
echo "node ALL=(root) NOPASSWD: /usr/local/bin/init_firewall.sh" > /etc/sudoers.d/node-firewall && \
chmod 0440 /etc/sudoers.d/node-firewall
COPY scripts/init_firewall.sh /usr/local/bin/
RUN chmod 500 /usr/local/bin/init_firewall.sh
# Drop back to non-root.
USER node

View File

@@ -1,20 +0,0 @@
#!/usr/bin/env sh
# resolve script path in case of symlink
SOURCE="$0"
while [ -h "$SOURCE" ]; do
DIR=$(dirname "$SOURCE")
SOURCE=$(readlink "$SOURCE")
case "$SOURCE" in
/*) ;; # absolute path
*) SOURCE="$DIR/$SOURCE" ;; # relative path
esac
done
DIR=$(cd "$(dirname "$SOURCE")" && pwd)
if command -v node >/dev/null 2>&1; then
exec node "$DIR/../dist/cli.js" "$@"
elif command -v bun >/dev/null 2>&1; then
exec bun "$DIR/../dist/cli.js" "$@"
else
echo "Error: node or bun is required to run codex" >&2
exit 1
fi

27
codex-cli/bin/codex.js Executable file
View File

@@ -0,0 +1,27 @@
#!/usr/bin/env node
// Unified entry point for Codex CLI on all platforms
// Dynamically loads the compiled ESM bundle in dist/cli.js
import path from 'path';
import { fileURLToPath, pathToFileURL } from 'url';
// Determine this script's directory
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, '../dist/cli.js');
const cliUrl = pathToFileURL(cliPath).href;
// Load and execute the CLI
(async () => {
try {
await import(cliUrl);
} catch (err) {
// eslint-disable-next-line no-console
console.error(err);
// eslint-disable-next-line no-undef
process.exit(1);
}
})();

View File

@@ -1,6 +1,8 @@
import * as esbuild from "esbuild";
import * as fs from "fs";
import * as path from "path";
const OUT_DIR = 'dist'
/**
* ink attempts to import react-devtools-core in an ESM-unfriendly way:
*
@@ -39,6 +41,11 @@ const isDevBuild =
const plugins = [ignoreReactDevToolsPlugin];
// Build hygiene: ensure we drop the previous dist dir and any leftover files
const outPath = path.resolve(OUT_DIR);
if (fs.existsSync(outPath)) {
fs.rmSync(outPath, { recursive: true, force: true });
}
// Add a shebang that enables sourcemap support for dev builds so that stack
// traces point to the original TypeScript lines without requiring callers to
@@ -50,7 +57,7 @@ if (isDevBuild) {
name: "dev-shebang",
setup(build) {
build.onEnd(async () => {
const outFile = path.resolve(isDevBuild ? "dist/cli-dev.js" : "dist/cli.js");
const outFile = path.resolve(isDevBuild ? `${OUT_DIR}/cli-dev.js` : `${OUT_DIR}/cli.js`);
let code = await fs.promises.readFile(outFile, "utf8");
if (code.startsWith("#!")) {
code = code.replace(/^#!.*\n/, devShebangLine);
@@ -69,7 +76,7 @@ esbuild
format: "esm",
platform: "node",
tsconfig: "tsconfig.json",
outfile: isDevBuild ? "dist/cli-dev.js" : "dist/cli.js",
outfile: isDevBuild ? `${OUT_DIR}/cli-dev.js` : `${OUT_DIR}/cli.js`,
minify: !isDevBuild,
sourcemap: isDevBuild ? "inline" : true,
plugins,

File diff suppressed because it is too large

View File

@@ -1,9 +1,9 @@
{
"name": "@openai/codex",
"version": "0.1.2504172351",
"version": "0.1.2504251709",
"license": "Apache-2.0",
"bin": {
"codex": "bin/codex"
"codex": "bin/codex.js"
},
"type": "module",
"engines": {
@@ -22,16 +22,11 @@
"build:dev": "NODE_ENV=development node build.mjs --dev && NODE_OPTIONS=--enable-source-maps node dist/cli-dev.js",
"release:readme": "cp ../README.md ./README.md",
"release:version": "TS=$(date +%y%m%d%H%M) && sed -E -i'' -e \"s/\\\"0\\.1\\.[0-9]{10}\\\"/\\\"0.1.${TS}\\\"/g\" package.json src/utils/session.ts",
"release:build-and-publish": "npm run build && npm publish",
"release": "npm run release:readme && npm run release:version && npm install && npm run release:build-and-publish",
"prepare": "husky",
"pre-commit": "lint-staged"
"release:build-and-publish": "pnpm run build && npm publish",
"release": "pnpm run release:readme && pnpm run release:version && pnpm install && pnpm run release:build-and-publish"
},
"files": [
"README.md",
"bin",
"dist",
"src"
"dist"
],
"dependencies": {
"@inkjs/ui": "^2.0.0",
@@ -39,6 +34,7 @@
"diff": "^7.0.0",
"dotenv": "^16.1.4",
"fast-deep-equal": "^3.1.3",
"fast-npm-meta": "^0.4.2",
"figures": "^6.1.0",
"file-type": "^20.1.0",
"ink": "^5.2.0",
@@ -47,7 +43,8 @@
"marked-terminal": "^7.3.0",
"meow": "^13.2.0",
"open": "^10.1.0",
"openai": "^4.89.0",
"openai": "^4.95.1",
"package-manager-detector": "^1.2.0",
"react": "^18.2.0",
"shell-quote": "^1.8.2",
"strip-ansi": "^7.1.0",
@@ -61,9 +58,12 @@
"@types/js-yaml": "^4.0.9",
"@types/marked-terminal": "^6.1.1",
"@types/react": "^18.0.32",
"@types/semver": "^7.7.0",
"@types/shell-quote": "^1.7.5",
"@types/which": "^3.0.4",
"@typescript-eslint/eslint-plugin": "^7.18.0",
"@typescript-eslint/parser": "^7.18.0",
"boxen": "^8.0.1",
"esbuild": "^0.25.2",
"eslint-plugin-import": "^2.31.0",
"eslint-plugin-react": "^7.32.2",
@@ -71,21 +71,14 @@
"eslint-plugin-react-refresh": "^0.4.19",
"husky": "^9.1.7",
"ink-testing-library": "^3.0.0",
"lint-staged": "^15.5.1",
"prettier": "^2.8.7",
"prettier": "^3.5.3",
"punycode": "^2.3.1",
"semver": "^7.7.1",
"ts-node": "^10.9.1",
"typescript": "^5.0.3",
"vitest": "^3.0.9",
"whatwg-url": "^14.2.0"
},
"resolutions": {
"braces": "^3.0.3",
"micromatch": "^4.0.8",
"semver": "^7.7.1"
},
"overrides": {
"punycode": "^2.3.1"
"whatwg-url": "^14.2.0",
"which": "^5.0.0"
},
"repository": {
"type": "git",

View File

@@ -8,9 +8,9 @@ pushd "$SCRIPT_DIR/.." >> /dev/null || {
echo "Error: Failed to change directory to $SCRIPT_DIR/.."
exit 1
}
npm install
npm run build
pnpm install
pnpm run build
rm -rf ./dist/openai-codex-*.tgz
npm pack --pack-destination ./dist
pnpm pack --pack-destination ./dist
mv ./dist/openai-codex-*.tgz ./dist/codex.tgz
docker build -t codex -f "./Dockerfile" .

View File

@@ -23,6 +23,16 @@ fi
WORK_DIR=$(realpath "$WORK_DIR")
# Generate a unique container name based on the normalized work directory
CONTAINER_NAME="codex_$(echo "$WORK_DIR" | sed 's/\//_/g' | sed 's/[^a-zA-Z0-9_-]//g')"
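# For example, WORK_DIR=/Users/me/project yields
# CONTAINER_NAME=codex__Users_me_project, so concurrent runs in different
# directories no longer collide on a single shared "codex" container.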
# Define cleanup to remove the container on script exit, ensuring no leftover containers
cleanup() {
docker rm -f "$CONTAINER_NAME" >/dev/null 2>&1 || true
}
# Trap EXIT to invoke cleanup regardless of how the script terminates
trap cleanup EXIT
# Ensure a command is provided.
if [ "$#" -eq 0 ]; then
echo "Usage: $0 [--work_dir directory] \"COMMAND\""
@@ -35,11 +45,11 @@ if [ -z "$WORK_DIR" ]; then
exit 1
fi
# Remove any existing container named 'codex'.
docker rm -f codex 2>/dev/null || true
# Kill any existing container for the working directory using cleanup(), centralizing removal logic.
cleanup
# Run the container with the specified directory mounted at the same path inside the container.
docker run --name codex -d \
docker run --name "$CONTAINER_NAME" -d \
-e OPENAI_API_KEY \
--cap-add=NET_ADMIN \
--cap-add=NET_RAW \
@@ -47,8 +57,8 @@ docker run --name codex -d \
codex \
sleep infinity
# Initialize the firewall inside the container.
docker exec codex bash -c "sudo /usr/local/bin/init_firewall.sh"
# Initialize the firewall inside the container with root privileges.
docker exec --user root "$CONTAINER_NAME" /usr/local/bin/init_firewall.sh
# Execute the provided command in the container, ensuring it runs in the work directory.
# We use a parameterized bash command to safely handle the command and directory.
@@ -57,4 +67,4 @@ quoted_args=""
for arg in "$@"; do
quoted_args+=" $(printf '%q' "$arg")"
done
docker exec -it codex bash -c "cd \"/app$WORK_DIR\" && codex --full-auto ${quoted_args}"
docker exec -it "$CONTAINER_NAME" bash -c "cd \"/app$WORK_DIR\" && codex --full-auto ${quoted_args}"

View File

@@ -71,13 +71,14 @@ export type ApprovalPolicy =
*/
export function canAutoApprove(
command: ReadonlyArray<string>,
workdir: string | undefined,
policy: ApprovalPolicy,
writableRoots: ReadonlyArray<string>,
env: NodeJS.ProcessEnv = process.env,
): SafetyAssessment {
if (command[0] === "apply_patch") {
return command.length === 2 && typeof command[1] === "string"
? canAutoApproveApplyPatch(command[1], writableRoots, policy)
? canAutoApproveApplyPatch(command[1], workdir, writableRoots, policy)
: {
type: "reject",
reason: "Invalid apply_patch command",
@@ -103,7 +104,12 @@ export function canAutoApprove(
) {
const applyPatchArg = tryParseApplyPatch(command[2]);
if (applyPatchArg != null) {
return canAutoApproveApplyPatch(applyPatchArg, writableRoots, policy);
return canAutoApproveApplyPatch(
applyPatchArg,
workdir,
writableRoots,
policy,
);
}
let bashCmd;
@@ -135,8 +141,8 @@ export function canAutoApprove(
// bashCmd could be a mix of strings and operators, e.g.:
// "ls || (true && pwd)" => [ 'ls', { op: '||' }, '(', 'true', { op: '&&' }, 'pwd', ')' ]
// We try to ensure that *every* command segment is deemed safe and that
// all operators belong to an allow‑list. If so, the entire expression is
// considered auto‑approvable.
// all operators belong to an allow-list. If so, the entire expression is
// considered auto-approvable.
const shellSafe = isEntireShellExpressionSafe(bashCmd);
if (shellSafe != null) {
@@ -162,6 +168,7 @@ export function canAutoApprove(
function canAutoApproveApplyPatch(
applyPatchArg: string,
workdir: string | undefined,
writableRoots: ReadonlyArray<string>,
policy: ApprovalPolicy,
): SafetyAssessment {
@@ -179,7 +186,13 @@ function canAutoApproveApplyPatch(
break;
}
if (isWritePatchConstrainedToWritablePaths(applyPatchArg, writableRoots)) {
if (
isWritePatchConstrainedToWritablePaths(
applyPatchArg,
workdir,
writableRoots,
)
) {
return {
type: "auto-approve",
reason: "apply_patch command is constrained to writable paths",
@@ -208,6 +221,7 @@ function canAutoApproveApplyPatch(
*/
function isWritePatchConstrainedToWritablePaths(
applyPatchArg: string,
workdir: string | undefined,
writableRoots: ReadonlyArray<string>,
): boolean {
// `identify_files_needed()` returns a list of files that will be modified or
@@ -222,10 +236,12 @@ function isWritePatchConstrainedToWritablePaths(
return (
allPathsConstrainedTowritablePaths(
identify_files_needed(applyPatchArg),
workdir,
writableRoots,
) &&
allPathsConstrainedTowritablePaths(
identify_files_added(applyPatchArg),
workdir,
writableRoots,
)
);
@@ -233,24 +249,47 @@ function isWritePatchConstrainedToWritablePaths(
function allPathsConstrainedTowritablePaths(
candidatePaths: ReadonlyArray<string>,
workdir: string | undefined,
writableRoots: ReadonlyArray<string>,
): boolean {
return candidatePaths.every((candidatePath) =>
isPathConstrainedTowritablePaths(candidatePath, writableRoots),
isPathConstrainedTowritablePaths(candidatePath, workdir, writableRoots),
);
}
/** If candidatePath is relative, it will be resolved against cwd. */
function isPathConstrainedTowritablePaths(
candidatePath: string,
workdir: string | undefined,
writableRoots: ReadonlyArray<string>,
): boolean {
const candidateAbsolutePath = path.resolve(candidatePath);
const candidateAbsolutePath = resolvePathAgainstWorkdir(
candidatePath,
workdir,
);
return writableRoots.some((writablePath) =>
pathContains(writablePath, candidateAbsolutePath),
);
}
/**
* If not already an absolute path, resolves `candidatePath` against `workdir`
* if specified; otherwise, against `process.cwd()`.
*/
export function resolvePathAgainstWorkdir(
candidatePath: string,
workdir: string | undefined,
): string {
if (path.isAbsolute(candidatePath)) {
return candidatePath;
} else if (workdir != null) {
return path.resolve(workdir, candidatePath);
} else {
return path.resolve(candidatePath);
}
}
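// Illustrative examples: with workdir "/repo",
//   resolvePathAgainstWorkdir("src/app.ts", "/repo") -> "/repo/src/app.ts"
//   resolvePathAgainstWorkdir("/etc/hosts", "/repo") -> "/etc/hosts" (already absolute)
// and with workdir undefined, relative paths resolve against process.cwd().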
/** Both `parent` and `child` must be absolute paths. */
function pathContains(parent: string, child: string): boolean {
const relative = path.relative(parent, child);
@@ -314,7 +353,7 @@ export function isSafeCommand(
};
case "true":
return {
reason: "Noop (true)",
reason: "No-op (true)",
group: "Utility",
};
case "echo":
@@ -329,11 +368,20 @@ export function isSafeCommand(
reason: "Ripgrep search",
group: "Searching",
};
case "find":
return {
reason: "Find files or directories",
group: "Searching",
};
case "find": {
// Certain options to `find` allow executing arbitrary processes, so we
// cannot auto-approve them.
if (
command.some((arg: string) => UNSAFE_OPTIONS_FOR_FIND_COMMAND.has(arg))
) {
break;
} else {
return {
reason: "Find files or directories",
group: "Searching",
};
}
}
case "grep":
return {
reason: "Text search (grep)",
@@ -421,12 +469,27 @@ function isValidSedNArg(arg: string | undefined): boolean {
return arg != null && /^(\d+,)?\d+p$/.test(arg);
}
const UNSAFE_OPTIONS_FOR_FIND_COMMAND: ReadonlySet<string> = new Set([
// Options that can execute arbitrary commands.
"-exec",
"-execdir",
"-ok",
"-okdir",
// Option that deletes matching files.
"-delete",
// Options that write pathnames to a file.
"-fls",
"-fprint",
"-fprint0",
"-fprintf",
]);
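// Illustrative examples: `find . -name '*.log' -delete` and
// `find src -type f -exec rm {} \;` fall through and are not auto-approved,
// while a read-only `find src -name '*.ts'` still is.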
// ---------------- Helper utilities for complex shell expressions -----------------
// A conservative allow‑list of bash operators that do not, on their own, cause
// A conservative allow-list of bash operators that do not, on their own, cause
// side effects. Redirections (>, >>, <, etc.) and command substitution `$()`
// are intentionally excluded. Parentheses used for grouping are treated as
// strings by `shell‑quote`, so we do not add them here. Reference:
// strings by `shell-quote`, so we do not add them here. Reference:
// https://github.com/substack/node-shell-quote#parsecmd-opts
const SAFE_SHELL_OPERATORS: ReadonlySet<string> = new Set([
"&&", // logical AND
@@ -452,7 +515,7 @@ function isEntireShellExpressionSafe(
}
try {
// Collect command segments delimited by operators. `shell‑quote` represents
// Collect command segments delimited by operators. `shell-quote` represents
// subshell grouping parentheses as literal strings "(" and ")"; treat them
// as unsafe to keep the logic simple (since subshells could introduce
// unexpected scope changes).
@@ -520,7 +583,7 @@ function isParseEntryWithOp(
return (
typeof entry === "object" &&
entry != null &&
// Using the safe `in` operator keeps the check property‑safe even when
// Using the safe `in` operator keeps the check property-safe even when
// `entry` is a `string`.
"op" in entry &&
typeof (entry as { op?: unknown }).op === "string"

View File

@@ -10,23 +10,23 @@ import type { ApprovalPolicy } from "./approvals";
import type { CommandConfirmation } from "./utils/agent/agent-loop";
import type { AppConfig } from "./utils/config";
import type { ResponseItem } from "openai/resources/responses/responses";
import type { ReasoningEffort } from "openai/resources.mjs";
import App from "./app";
import { runSinglePass } from "./cli-singlepass";
import { AgentLoop } from "./utils/agent/agent-loop";
import { initLogger } from "./utils/agent/log";
import { ReviewDecision } from "./utils/agent/review";
import { AutoApprovalMode } from "./utils/auto-approval-mode";
import { checkForUpdates } from "./utils/check-updates";
import {
getApiKey,
loadConfig,
PRETTY_PRINT,
INSTRUCTIONS_FILEPATH,
} from "./utils/config";
import { createInputItem } from "./utils/input-utils";
import {
isModelSupportedForResponses,
preloadModels,
} from "./utils/model-utils.js";
import { initLogger } from "./utils/logger/log";
import { isModelSupportedForResponses } from "./utils/model-utils.js";
import { parseToolCall } from "./utils/parsers";
import { onExit, setInkRenderer } from "./utils/terminal";
import chalk from "chalk";
@@ -53,8 +53,11 @@ const cli = meow(
$ codex completion <bash|zsh|fish>
Options
--version Print version and exit
-h, --help Show usage and exit
-m, --model <model> Model to use for completions (default: o4-mini)
-p, --provider <provider> Provider to use for completions (default: openai)
-i, --image <path> Path(s) to image files to include as input
-v, --view <rollout> Inspect a previously saved rollout instead of starting a session
-q, --quiet Non-interactive mode that only prints the assistant's final output
@@ -70,6 +73,12 @@ const cli = meow(
--full-stdout Do not truncate stdout/stderr from command outputs
--notify Enable desktop notifications for responses
--disable-response-storage Disable server-side response storage (sends the
full conversation context with every request)
--flex-mode Use "flex-mode" processing mode for the request (only supported
with models o3 and o4-mini)
Dangerous options
--dangerously-auto-approve-everything
Skip all confirmation prompts and execute commands without
@@ -91,8 +100,10 @@ const cli = meow(
flags: {
// misc
help: { type: "boolean", aliases: ["h"] },
version: { type: "boolean", description: "Print version and exit" },
view: { type: "string" },
model: { type: "string", aliases: ["m"] },
provider: { type: "string", aliases: ["p"] },
image: { type: "string", isMultiple: true, aliases: ["i"] },
quiet: {
type: "boolean",
@@ -133,24 +144,41 @@ const cli = meow(
},
noProjectDoc: {
type: "boolean",
description: "Disable automatic inclusion of projectlevel codex.md",
description: "Disable automatic inclusion of project-level codex.md",
},
projectDoc: {
type: "string",
description: "Path to a markdown file to include as project doc",
},
flexMode: {
type: "boolean",
description:
"Enable the flex-mode service tier (only supported by models o3 and o4-mini)",
},
fullStdout: {
type: "boolean",
description:
"Disable truncation of command stdout/stderr messages (show everything)",
aliases: ["no-truncate"],
},
reasoning: {
type: "string",
description: "Set the reasoning effort level (low, medium, high)",
choices: ["low", "medium", "high"],
default: "high",
},
// Notification
notify: {
type: "boolean",
description: "Enable desktop notifications for responses",
},
disableResponseStorage: {
type: "boolean",
description:
"Disable server-side response storage (sends full conversation context with every request)",
},
// Experimental mode where whole directory is loaded in context and model is requested
// to make code edits in a single pass.
fullContext: {
@@ -163,6 +191,10 @@ const cli = meow(
},
);
// ---------------------------------------------------------------------------
// Global flag handling
// ---------------------------------------------------------------------------
// Handle 'completion' subcommand before any prompting or API calls
if (cli.input[0] === "completion") {
const shell = cli.input[1] || "bash";
@@ -182,7 +214,7 @@ _codex() {
}
_codex`,
fish: `# fish completion for codex
complete -c codex -a '(_fish_complete_path)' -d 'file path'`,
complete -c codex -a '(__fish_complete_path)' -d 'file path'`,
};
const script = scripts[shell];
if (!script) {
@@ -194,19 +226,20 @@ complete -c codex -a '(_fish_complete_path)' -d 'file path'`,
console.log(script);
process.exit(0);
}
// Show help if requested
// For --help, show help and exit.
if (cli.flags.help) {
cli.showHelp();
}
// Handle config flag: open instructions file in editor and exit
// For --config, open custom instructions file in editor and exit.
if (cli.flags.config) {
// Ensure configuration and instructions file exist
try {
loadConfig();
loadConfig(); // Ensures the file is created if it doesn't already exist.
} catch {
// ignore errors
}
const filePath = INSTRUCTIONS_FILEPATH;
const editor =
process.env["EDITOR"] || (process.platform === "win32" ? "notepad" : "vi");
@@ -218,45 +251,96 @@ if (cli.flags.config) {
// API key handling
// ---------------------------------------------------------------------------
const apiKey = process.env["OPENAI_API_KEY"];
if (!apiKey) {
// eslint-disable-next-line no-console
console.error(
`\n${chalk.red("Missing OpenAI API key.")}\n\n` +
`Set the environment variable ${chalk.bold("OPENAI_API_KEY")} ` +
`and re-run this command.\n` +
`You can create a key here: ${chalk.bold(
chalk.underline("https://platform.openai.com/account/api-keys"),
)}\n`,
);
process.exit(1);
}
const fullContextMode = Boolean(cli.flags.fullContext);
let config = loadConfig(undefined, undefined, {
cwd: process.cwd(),
disableProjectDoc: Boolean(cli.flags.noProjectDoc),
projectDocPath: cli.flags.projectDoc as string | undefined,
projectDocPath: cli.flags.projectDoc,
isFullContext: fullContextMode,
});
const prompt = cli.input[0];
const model = cli.flags.model;
const imagePaths = cli.flags.image as Array<string> | undefined;
const model = cli.flags.model ?? config.model;
const imagePaths = cli.flags.image;
const provider = cli.flags.provider ?? config.provider ?? "openai";
const apiKey = getApiKey(provider);
// Set of providers that don't require API keys
const NO_API_KEY_REQUIRED = new Set(["ollama"]);
// Skip API key validation for providers that don't require an API key
if (!apiKey && !NO_API_KEY_REQUIRED.has(provider.toLowerCase())) {
// eslint-disable-next-line no-console
console.error(
`\n${chalk.red(`Missing ${provider} API key.`)}\n\n` +
`Set the environment variable ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` +
`and re-run this command.\n` +
`${
provider.toLowerCase() === "openai"
? `You can create a key here: ${chalk.bold(
chalk.underline("https://platform.openai.com/account/api-keys"),
)}\n`
: provider.toLowerCase() === "gemini"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
}`,
);
process.exit(1);
}
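// Illustrative examples: `codex --provider groq "..."` without GROQ_API_KEY
// set prints the error above and exits with status 1, while
// `codex --provider ollama "..."` skips the API-key check entirely.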
const flagPresent = Object.hasOwn(cli.flags, "disableResponseStorage");
const disableResponseStorage = flagPresent
? Boolean(cli.flags.disableResponseStorage) // value user actually passed
: (config.disableResponseStorage ?? false); // fall back to YAML, default to false
config = {
apiKey,
...config,
model: model ?? config.model,
notify: Boolean(cli.flags.notify),
reasoningEffort:
(cli.flags.reasoning as ReasoningEffort | undefined) ?? "high",
flexMode: Boolean(cli.flags.flexMode),
provider,
disableResponseStorage,
};
if (!(await isModelSupportedForResponses(config.model))) {
// Check for updates after loading config. This is important because we write state file in
// the config dir.
try {
await checkForUpdates();
} catch {
// ignore
}
// For --flex-mode, validate and exit if incorrect.
if (cli.flags.flexMode) {
const allowedFlexModels = new Set(["o3", "o4-mini"]);
if (!allowedFlexModels.has(config.model)) {
// eslint-disable-next-line no-console
console.error(
`The --flex-mode option is only supported when using the 'o3' or 'o4-mini' models. ` +
`Current model: '${config.model}'.`,
);
process.exit(1);
}
}
if (
!(await isModelSupportedForResponses(provider, config.model)) &&
(!provider || provider.toLowerCase() === "openai")
) {
// eslint-disable-next-line no-console
console.error(
`The model "${config.model}" does not appear in the list of models ` +
`available to your account. Double‑check the spelling (use\n` +
`available to your account. Double-check the spelling (use\n` +
` openai models list\n` +
`to see the full list) or choose another model with the --model flag.`,
);
@@ -265,6 +349,7 @@ if (!(await isModelSupportedForResponses(config.model))) {
let rollout: AppRollout | undefined;
// For --view, optionally load an existing rollout from disk, display it and exit.
if (cli.flags.view) {
const viewPath = cli.flags.view;
const absolutePath = path.isAbsolute(viewPath)
@@ -280,7 +365,7 @@ if (cli.flags.view) {
}
}
// If we are running in --fullcontext mode, do that and exit.
// For --fullcontext, run the separate cli entrypoint and exit.
if (fullContextMode) {
await runSinglePass({
originalPrompt: prompt,
@@ -296,14 +381,8 @@ const additionalWritableRoots: ReadonlyArray<string> = (
cli.flags.writableRoot ?? []
).map((p) => path.resolve(p));
// If we are running in --quiet mode, do that and exit.
const quietMode = Boolean(cli.flags.quiet);
const autoApproveEverything = Boolean(
cli.flags.dangerouslyAutoApproveEverything,
);
const fullStdout = Boolean(cli.flags.fullStdout);
if (quietMode) {
// For --quiet, run the cli without user interactions and exit.
if (cli.flags.quiet) {
process.env["CODEX_QUIET_MODE"] = "1";
if (!prompt || prompt.trim() === "") {
// eslint-disable-next-line no-console
@@ -312,12 +391,19 @@ if (quietMode) {
);
process.exit(1);
}
await runQuietMode({
prompt: prompt as string,
imagePaths: imagePaths || [],
approvalPolicy: autoApproveEverything
// Determine approval policy for quiet mode based on flags
const quietApprovalPolicy: ApprovalPolicy =
cli.flags.fullAuto || cli.flags.approvalMode === "full-auto"
? AutoApprovalMode.FULL_AUTO
: AutoApprovalMode.SUGGEST,
: cli.flags.autoEdit || cli.flags.approvalMode === "auto-edit"
? AutoApprovalMode.AUTO_EDIT
: config.approvalMode || AutoApprovalMode.SUGGEST;
await runQuietMode({
prompt,
imagePaths: imagePaths || [],
approvalPolicy: quietApprovalPolicy,
additionalWritableRoots,
config,
});
@@ -335,16 +421,15 @@ if (quietMode) {
// it is more dangerous than --fullAuto we deliberately give it lower
// priority so a user specifying both flags still gets the safer behaviour.
// 3. --autoEdit automatically approve edits, but prompt for commands.
// 4. Default suggest mode (prompt for everything).
// 4. config.approvalMode - use the approvalMode setting from ~/.codex/config.json.
// 5. Default suggest mode (prompt for everything).
const approvalPolicy: ApprovalPolicy =
cli.flags.fullAuto || cli.flags.approvalMode === "full-auto"
? AutoApprovalMode.FULL_AUTO
: cli.flags.autoEdit || cli.flags.approvalMode === "auto-edit"
? AutoApprovalMode.AUTO_EDIT
: AutoApprovalMode.SUGGEST;
preloadModels();
? AutoApprovalMode.AUTO_EDIT
: config.approvalMode || AutoApprovalMode.SUGGEST;
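// Illustrative examples: `codex --full-auto` resolves to FULL_AUTO even if
// --auto-edit is also passed; with no flags at all, an `approvalMode` set in
// ~/.codex/config.json (or config.yaml) wins over the SUGGEST default.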
const instance = render(
<App
@@ -354,7 +439,7 @@ const instance = render(
imagePaths={imagePaths}
approvalPolicy={approvalPolicy}
additionalWritableRoots={additionalWritableRoots}
fullStdout={fullStdout}
fullStdout={Boolean(cli.flags.fullStdout)}
/>,
{
patchConsole: process.env["DEBUG"] ? false : true,
@@ -428,8 +513,10 @@ async function runQuietMode({
model: config.model,
config: config,
instructions: config.instructions,
provider: config.provider,
approvalPolicy,
additionalWritableRoots,
disableResponseStorage: config.disableResponseStorage,
onItem: (item: ResponseItem) => {
// eslint-disable-next-line no-console
console.log(formatResponseItemForQuietMode(item));
@@ -440,7 +527,12 @@ async function runQuietMode({
getCommandConfirmation: (
_command: Array<string>,
): Promise<CommandConfirmation> => {
return Promise.resolve({ review: ReviewDecision.NO_CONTINUE });
// In quiet mode, default to NO_CONTINUE, except when in full-auto mode
const reviewDecision =
approvalPolicy === AutoApprovalMode.FULL_AUTO
? ReviewDecision.YES
: ReviewDecision.NO_CONTINUE;
return Promise.resolve({ review: reviewDecision });
},
onLastResponseId: () => {
/* intentionally ignored in quiet mode */
@@ -461,13 +553,13 @@ process.on("SIGQUIT", exit);
process.on("SIGTERM", exit);
// ---------------------------------------------------------------------------
// Fallback for Ctrl‑C when stdin is in raw‑mode
// Fallback for Ctrl-C when stdin is in raw-mode
// ---------------------------------------------------------------------------
if (process.stdin.isTTY) {
// Ensure we do not leave the terminal in raw mode if the user presses
// Ctrl‑C while some other component has focus and Ink is intercepting
// input. Node does *not* emit a SIGINT in raw‑mode, so we listen for the
// Ctrl-C while some other component has focus and Ink is intercepting
// input. Node does *not* emit a SIGINT in raw-mode, so we listen for the
// corresponding byte (0x03) ourselves and trigger a graceful shutdown.
const onRawData = (data: Buffer | string): void => {
const str = Buffer.isBuffer(data) ? data.toString("utf8") : data;
@@ -478,6 +570,6 @@ if (process.stdin.isTTY) {
process.stdin.on("data", onRawData);
}
// Ensure terminal clean‑up always runs, even when other code calls
// Ensure terminal clean-up always runs, even when other code calls
// `process.exit()` directly.
process.once("exit", onExit);

View File

@@ -3,7 +3,7 @@
import { useTerminalSize } from "../../hooks/use-terminal-size";
import TextBuffer from "../../text-buffer.js";
import chalk from "chalk";
import { Box, Text, useInput, useStdin } from "ink";
import { Box, Text, useInput } from "ink";
import { EventEmitter } from "node:events";
import React, { useRef, useState } from "react";
@@ -14,7 +14,7 @@ import React, { useRef, useState } from "react";
* The real `process.stdin` object exposed by Node.js inherits these methods
* from `Socket`, but the lightweight stub used in tests only extends
* `EventEmitter`. Ink calls the two methods when enabling/disabling raw
* mode, so make them harmless no‑ops when they're absent to avoid runtime
* mode, so make them harmless no-ops when they're absent to avoid runtime
* failures during unit tests.
* ----------------------------------------------------------------------- */
@@ -155,6 +155,8 @@ export interface MultilineTextEditorHandle {
isCursorAtLastRow(): boolean;
/** Full text contents */
getText(): string;
/** Move the cursor to the end of the text */
moveCursorToEnd(): void;
}
const MultilineTextEditorInner = (
@@ -187,41 +189,6 @@ const MultilineTextEditorInner = (
// minimum so that the UI never becomes unusably small.
const effectiveWidth = Math.max(20, width ?? terminalSize.columns);
// ---------------------------------------------------------------------------
// External editor integration helpers.
// ---------------------------------------------------------------------------
// Access to stdin so we can toggle raw mode while the external editor is
// in control of the terminal.
const { stdin, setRawMode } = useStdin();
/**
* Launch the user's preferred $EDITOR, blocking until they close it, then
* reload the edited file back into the in-memory TextBuffer. The heavy
* work is delegated to `TextBuffer.openInExternalEditor`, but we are
* responsible for temporarily *disabling* raw mode so the child process can
* interact with the TTY normally.
*/
const openExternalEditor = React.useCallback(async () => {
// Preserve the current raw-mode setting so we can restore it afterwards.
const wasRaw = stdin?.isRaw ?? false;
try {
setRawMode?.(false);
await buffer.current.openInExternalEditor();
} catch (err) {
// Surface the error so it doesn't fail silently; for now we log to
// stderr. In the future this could surface a toast / overlay.
// eslint-disable-next-line no-console
console.error("[MultilineTextEditor] external editor error", err);
} finally {
if (wasRaw) {
setRawMode?.(true);
}
// Force a re-render so the component reflects the mutated buffer.
setVersion((v) => v + 1);
}
}, [buffer, stdin, setRawMode]);
// ---------------------------------------------------------------------------
// Keyboard handling.
// ---------------------------------------------------------------------------
@@ -232,25 +199,6 @@ const MultilineTextEditorInner = (
return;
}
// Single-step editor shortcut: Ctrl+X or Ctrl+E
// Treat both true Ctrl+Key combinations *and* raw control codes so that
// the shortcut works consistently in real terminals (raw mode) and the
// ink-testing-library stub, which delivers only the raw byte (e.g. 0x05
// for Ctrl-E) without setting `key.ctrl`.
const isCtrlX =
(key.ctrl && (input === "x" || input === "\x18")) || input === "\x18";
const isCtrlE =
(key.ctrl && (input === "e" || input === "\x05")) ||
input === "\x05" ||
(!key.ctrl &&
input === "e" &&
input.length === 1 &&
input.charCodeAt(0) === 5);
if (isCtrlX || isCtrlE) {
openExternalEditor();
return;
}
if (
process.env["TEXTBUFFER_DEBUG"] === "1" ||
process.env["TEXTBUFFER_DEBUG"] === "true"
@@ -259,25 +207,47 @@ const MultilineTextEditorInner = (
console.log("[MultilineTextEditor] event", { input, key });
}
// 1) CSIu / modifyOtherKeys (Ink strips initial ESC, so we start with '[')
// 1a) CSI-u / modifyOtherKeys *mode 2* (Ink strips initial ESC, so we
// start with '[') format: "[<code>;<modifiers>u".
if (input.startsWith("[") && input.endsWith("u")) {
const m = input.match(/^\[([0-9]+);([0-9]+)u$/);
if (m && m[1] === "13") {
const mod = Number(m[2]);
// In xterm's encoding: bit1 (value 2) is Shift. Everything >1 that
// isn't exactly 1 means some modifier was held. We treat *shift
// present* (2,4,6,8) as newline; plain (1) as submit.
// In xterm's encoding: bit-1 (value 2) is Shift. Everything >1 that
// isn't exactly 1 means some modifier was held. We treat *shift or
// alt present* (2,3,4,6,8,9) as newline; Ctrl (bit-2 / value 4)
// triggers submit. See xterm/DEC modifyOtherKeys docs.
// Xterm encodes modifier keys in `mod`: bit-2 (value 4) indicates
// that Ctrl was held. We avoid the `&` bitwise operator (disallowed
// by our ESLint config) by using arithmetic instead.
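// Worked example (not from the original source; standard xterm encoding):
// the modifier parameter is 1 + Shift(1) + Alt(2) + Ctrl(4), so Shift+Enter
// arrives as "[13;2u" and Ctrl+Enter as "[13;5u"; for mod=5,
// Math.floor(5 / 4) % 2 === 1, so Ctrl is detected and we submit.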
const hasCtrl = Math.floor(mod / 4) % 2 === 1;
if (hasCtrl) {
if (onSubmit) {
onSubmit(buffer.current.getText());
}
} else {
// Any variant without Ctrl just inserts newline (Shift, Alt, none)
buffer.current.newline();
}
setVersion((v) => v + 1);
return;
}
}
// 1b) CSI-~ / modifyOtherKeys *mode 1* format: "[27;<mod>;<code>~".
// Terminals such as iTerm2 (default), older xterm versions, or when
// modifyOtherKeys=1 is configured, emit this legacy sequence. We
// translate it to the same behaviour as the mode-2 variant above so
// that Shift+Enter (newline) / Ctrl+Enter (submit) work regardless
// of the user's terminal settings.
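// Worked example under the same encoding (illustrative): Ctrl+Enter arrives
// as "[27;5;13~" once Ink strips the leading ESC, and Shift+Enter as
// "[27;2;13~"; the modifier digit is decoded exactly like the CSI-u branch.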
if (input.startsWith("[27;") && input.endsWith("~")) {
const m = input.match(/^\[27;([0-9]+);13~$/);
if (m) {
const mod = Number(m[1]);
const hasCtrl = Math.floor(mod / 4) % 2 === 1;
if (hasCtrl) {
if (onSubmit) {
onSubmit(buffer.current.getText());
}
} else {
buffer.current.newline();
}
setVersion((v) => v + 1);
@@ -350,6 +320,16 @@ const MultilineTextEditorInner = (
return row === lineCount - 1;
},
getText: () => buffer.current.getText(),
moveCursorToEnd: () => {
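// Note: TextBuffer.move() is driven here with named relative motions
// ("home", "down", "end"), so we reach the end of the document by walking
// down once per remaining line and then jumping to the end of the last one.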
buffer.current.move("home");
const lines = buffer.current.getText().split("\n");
for (let i = 0; i < lines.length - 1; i++) {
buffer.current.move("down");
}
buffer.current.move("end");
// Force a re-render
setVersion((v) => v + 1);
},
}),
[],
);
@@ -405,5 +385,4 @@ const MultilineTextEditorInner = (
};
const MultilineTextEditor = React.forwardRef(MultilineTextEditorInner);
export default MultilineTextEditor;

View File

@@ -15,11 +15,18 @@ const DEFAULT_DENY_MESSAGE =
export function TerminalChatCommandReview({
confirmationPrompt,
onReviewCommand,
// callback to switch approval mode overlay
onSwitchApprovalMode,
explanation: propExplanation,
// whether this review Select is active (listening for keys)
isActive = true,
}: {
confirmationPrompt: React.ReactNode;
onReviewCommand: (decision: ReviewDecision, customMessage?: string) => void;
onSwitchApprovalMode: () => void;
explanation?: string;
// when false, disable the underlying Select so it won't capture input
isActive?: boolean;
}): React.ReactElement {
const [mode, setMode] = React.useState<"select" | "input" | "explanation">(
"select",
@@ -70,6 +77,7 @@ export function TerminalChatCommandReview({
const opts: Array<
| { label: string; value: ReviewDecision }
| { label: string; value: "edit" }
| { label: string; value: "switch" }
> = [
{
label: "Yes (y)",
@@ -93,6 +101,11 @@ export function TerminalChatCommandReview({
label: "Edit or give feedback (e)",
value: "edit",
},
// allow switching approval mode
{
label: "Switch approval mode (s)",
value: "switch",
},
{
label: "No, and keep going (n)",
value: ReviewDecision.NO_CONTINUE,
@@ -106,44 +119,50 @@ export function TerminalChatCommandReview({
return opts;
}, [showAlwaysApprove]);
useInput((input, key) => {
if (mode === "select") {
if (input === "y") {
onReviewCommand(ReviewDecision.YES);
} else if (input === "x") {
onReviewCommand(ReviewDecision.EXPLAIN);
} else if (input === "e") {
setMode("input");
} else if (input === "n") {
onReviewCommand(
ReviewDecision.NO_CONTINUE,
"Don't do that, keep going though",
);
} else if (input === "a" && showAlwaysApprove) {
onReviewCommand(ReviewDecision.ALWAYS);
} else if (key.escape) {
onReviewCommand(ReviewDecision.NO_EXIT);
useInput(
(input, key) => {
if (mode === "select") {
if (input === "y") {
onReviewCommand(ReviewDecision.YES);
} else if (input === "x") {
onReviewCommand(ReviewDecision.EXPLAIN);
} else if (input === "e") {
setMode("input");
} else if (input === "n") {
onReviewCommand(
ReviewDecision.NO_CONTINUE,
"Don't do that, keep going though",
);
} else if (input === "a" && showAlwaysApprove) {
onReviewCommand(ReviewDecision.ALWAYS);
} else if (input === "s") {
// switch approval mode
onSwitchApprovalMode();
} else if (key.escape) {
onReviewCommand(ReviewDecision.NO_EXIT);
}
} else if (mode === "explanation") {
// When in explanation mode, any key returns to select mode
if (key.return || key.escape || input === "x") {
setMode("select");
}
} else {
// text entry mode
if (key.return) {
// if user hit enter on empty msg, fall back to DEFAULT_DENY_MESSAGE
const custom = msg.trim() === "" ? DEFAULT_DENY_MESSAGE : msg;
onReviewCommand(ReviewDecision.NO_CONTINUE, custom);
} else if (key.escape) {
// treat escape as denial with default message as well
onReviewCommand(
ReviewDecision.NO_CONTINUE,
msg.trim() === "" ? DEFAULT_DENY_MESSAGE : msg,
);
}
}
} else if (mode === "explanation") {
// When in explanation mode, any key returns to select mode
if (key.return || key.escape || input === "x") {
setMode("select");
}
} else {
// text entry mode
if (key.return) {
// if user hit enter on empty msg, fall back to DEFAULT_DENY_MESSAGE
const custom = msg.trim() === "" ? DEFAULT_DENY_MESSAGE : msg;
onReviewCommand(ReviewDecision.NO_CONTINUE, custom);
} else if (key.escape) {
// treat escape as denial with default message as well
onReviewCommand(
ReviewDecision.NO_CONTINUE,
msg.trim() === "" ? DEFAULT_DENY_MESSAGE : msg,
);
}
}
});
},
{ isActive },
);
return (
<Box flexDirection="column" gap={1} borderStyle="round" marginTop={1}>
@@ -191,9 +210,13 @@ export function TerminalChatCommandReview({
<Text>Allow command?</Text>
<Box paddingX={2} flexDirection="column" gap={1}>
<Select
onChange={(value: ReviewDecision | "edit") => {
isDisabled={!isActive}
visibleOptionCount={approvalOptions.length}
onChange={(value: ReviewDecision | "edit" | "switch") => {
if (value === "edit") {
setMode("input");
} else if (value === "switch") {
onSwitchApprovalMode();
} else {
onReviewCommand(value);
}

View File

@@ -0,0 +1,64 @@
import { Box, Text } from "ink";
import React, { useMemo } from "react";
type TextCompletionProps = {
/**
* Array of text completion options to display in the list
*/
completions: Array<string>;
/**
* Maximum number of completion items to show at once in the view
*/
displayLimit: number;
/**
* Index of the currently selected completion in the completions array
*/
selectedCompletion: number;
};
function TerminalChatCompletions({
completions,
selectedCompletion,
displayLimit,
}: TextCompletionProps): JSX.Element {
const visibleItems = useMemo(() => {
// Try to keep selection centered in view
let startIndex = Math.max(
0,
selectedCompletion - Math.floor(displayLimit / 2),
);
// Fix window position when at the end of the list
if (completions.length - startIndex < displayLimit) {
startIndex = Math.max(0, completions.length - displayLimit);
}
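// Worked example (illustrative): with 10 completions, displayLimit 5 and
// selectedCompletion 7, startIndex = max(0, 7 - 2) = 5, so items 5..9 are
// rendered and the selection stays in view.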
const endIndex = Math.min(completions.length, startIndex + displayLimit);
return completions.slice(startIndex, endIndex).map((completion, index) => ({
completion,
originalIndex: index + startIndex,
}));
}, [completions, selectedCompletion, displayLimit]);
return (
<Box flexDirection="column">
{visibleItems.map(({ completion, originalIndex }) => (
<Text
key={completion}
dimColor={originalIndex !== selectedCompletion}
underline={originalIndex === selectedCompletion}
backgroundColor={
originalIndex === selectedCompletion ? "blackBright" : undefined
}
>
{completion}
</Text>
))}
</Box>
);
}
export default TerminalChatCompletions;

View File

@@ -1,82 +1,28 @@
import { log, isLoggingEnabled } from "../../utils/agent/log.js";
import Spinner from "../vendor/ink-spinner.js";
import { log } from "../../utils/logger/log.js";
import { Box, Text, useInput, useStdin } from "ink";
import React, { useState } from "react";
import { useInterval } from "use-interval";
const thinkingTexts = ["Thinking"]; /* [
"Consulting the rubber duck",
"Maximizing paperclips",
"Reticulating splines",
"Immanentizing the Eschaton",
"Thinking",
"Thinking about thinking",
"Spinning in circles",
"Counting dust specks",
"Updating priors",
"Feeding the utility monster",
"Taking off",
"Wireheading",
"Counting to infinity",
"Staring into the Basilisk",
"Negotiationing acausal trades",
"Searching the library of babel",
"Multiplying matrices",
"Solving the halting problem",
"Counting grains of sand",
"Simulating a simulation",
"Asking the oracle",
"Detangling qubits",
"Reading tea leaves",
"Pondering universal love and transcendent joy",
"Feeling the AGI",
"Shaving the yak",
"Escaping local minima",
"Pruning the search tree",
"Descending the gradient",
"Bikeshedding",
"Securing funding",
"Rewriting in Rust",
"Engaging infinite improbability drive",
"Clapping with one hand",
"Synthesizing",
"Rebasing thesis onto antithesis",
"Transcending the loop",
"Frogeposting",
"Summoning",
"Peeking beyond the veil",
"Seeking",
"Entering deep thought",
"Meditating",
"Decomposing",
"Creating",
"Beseeching the machine spirit",
"Calibrating moral compass",
"Collapsing the wave function",
"Doodling",
"Translating whale song",
"Whispering to silicon",
"Looking for semicolons",
"Asking ChatGPT",
"Bargaining with entropy",
"Channeling",
"Cooking",
"Parroting stochastically",
]; */
// Retaining a single static placeholder text for potential future use. The
// more elaborate randomised thinking prompts were removed to streamline the
// UI; the elapsed-time counter now provides sufficient feedback.
export default function TerminalChatInputThinking({
onInterrupt,
active,
thinkingSeconds,
}: {
onInterrupt: () => void;
active: boolean;
thinkingSeconds: number;
}): React.ReactElement {
const [dots, setDots] = useState("");
const [awaitingConfirm, setAwaitingConfirm] = useState(false);
const [dots, setDots] = useState("");
const [thinkingText, setThinkingText] = useState(
() => thinkingTexts[Math.floor(Math.random() * thinkingTexts.length)],
);
// Animate the ellipsis
useInterval(() => {
setDots((prev) => (prev.length < 3 ? prev + "." : ""));
}, 500);
const { stdin, setRawMode } = useStdin();
@@ -94,11 +40,9 @@ export default function TerminalChatInputThinking({
const str = Buffer.isBuffer(data) ? data.toString("utf8") : data;
if (str === "\x1b\x1b") {
if (isLoggingEnabled()) {
log(
"raw stdin: received collapsed ESC ESC starting confirmation timer",
);
}
log(
"raw stdin: received collapsed ESC ESC starting confirmation timer",
);
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
@@ -110,25 +54,7 @@ export default function TerminalChatInputThinking({
};
}, [stdin, awaitingConfirm, onInterrupt, active, setRawMode]);
useInterval(() => {
setDots((prev) => (prev.length < 3 ? prev + "." : ""));
}, 500);
useInterval(
() => {
setThinkingText((prev) => {
let next = prev;
if (thinkingTexts.length > 1) {
while (next === prev) {
next =
thinkingTexts[Math.floor(Math.random() * thinkingTexts.length)];
}
}
return next;
});
},
active ? 30000 : null,
);
// No timers required beyond tracking the elapsed seconds supplied via props.
useInput(
(_input, key) => {
@@ -137,15 +63,11 @@ export default function TerminalChatInputThinking({
}
if (awaitingConfirm) {
if (isLoggingEnabled()) {
log("useInput: second ESC detected triggering onInterrupt()");
}
log("useInput: second ESC detected triggering onInterrupt()");
onInterrupt();
setAwaitingConfirm(false);
} else {
if (isLoggingEnabled()) {
log("useInput: first ESC detected waiting for confirmation");
}
log("useInput: first ESC detected waiting for confirmation");
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
@@ -153,13 +75,47 @@ export default function TerminalChatInputThinking({
{ isActive: active },
);
// Custom ball animation including the elapsed seconds
const ballFrames = [
"( ● )",
"( ● )",
"( ● )",
"( ● )",
"( ●)",
"( ● )",
"( ● )",
"( ● )",
"( ● )",
"(● )",
];
const [frame, setFrame] = useState(0);
useInterval(() => {
setFrame((idx) => (idx + 1) % ballFrames.length);
}, 80);
// Preserve the spinner (ball) animation while keeping the elapsed seconds
// text static. We achieve this by rendering the bouncing ball inside the
// parentheses and appending the seconds counter *after* the spinner rather
// than injecting it directly next to the ball (which caused the counter to
// move horizontally together with the ball).
const frameTemplate = ballFrames[frame] ?? ballFrames[0];
const frameWithSeconds = `${frameTemplate} ${thinkingSeconds}s`;
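// e.g. with thinkingSeconds = 12 this renders something like "( ● ) 12s";
// only the ball moves between frames, the "12s" suffix stays put.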
return (
<Box flexDirection="column" gap={1}>
<Box gap={2}>
<Spinner type="ball" />
<Box justifyContent="space-between">
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Text>
Thinking
{dots}
</Text>
</Box>
<Text>
{thinkingText}
{dots}
Press <Text bold>Esc</Text> twice to interrupt
</Text>
</Box>
{awaitingConfirm && (

View File

@@ -1,3 +1,4 @@
import type { MultilineTextEditorHandle } from "./multiline-editor";
import type { ReviewDecision } from "../../utils/agent/review.js";
import type { HistoryEntry } from "../../utils/storage/command-history.js";
import type {
@@ -5,22 +6,29 @@ import type {
ResponseItem,
} from "openai/resources/responses/responses.mjs";
import MultilineTextEditor from "./multiline-editor";
import { TerminalChatCommandReview } from "./terminal-chat-command-review.js";
import { log, isLoggingEnabled } from "../../utils/agent/log.js";
import TextCompletions from "./terminal-chat-completions.js";
import { loadConfig } from "../../utils/config.js";
import { getFileSystemSuggestions } from "../../utils/file-system-suggestions.js";
import { createInputItem } from "../../utils/input-utils.js";
import { printAndResetSessionSummary } from "../../utils/session-cost.js";
import { log } from "../../utils/logger/log.js";
import { setSessionId } from "../../utils/session.js";
import { SLASH_COMMANDS, type SlashCommand } from "../../utils/slash-commands";
import {
loadCommandHistory,
addToHistory,
} from "../../utils/storage/command-history.js";
import { clearTerminal, onExit } from "../../utils/terminal.js";
import Spinner from "../vendor/ink-spinner.js";
import TextInput from "../vendor/ink-text-input.js";
import { Box, Text, useApp, useInput, useStdin } from "ink";
import { fileURLToPath } from "node:url";
import React, { useCallback, useState, Fragment, useEffect } from "react";
import React, {
useCallback,
useState,
Fragment,
useEffect,
useRef,
} from "react";
import { useInterval } from "use-interval";
const suggestions = [
@@ -43,9 +51,12 @@ export default function TerminalChatInput({
openModelOverlay,
openApprovalOverlay,
openHelpOverlay,
openDiffOverlay,
onCompact,
interruptAgent,
active,
thinkingSeconds,
items = [],
}: {
isNew: boolean;
loading: boolean;
@@ -63,16 +74,33 @@ export default function TerminalChatInput({
openModelOverlay: () => void;
openApprovalOverlay: () => void;
openHelpOverlay: () => void;
openDiffOverlay: () => void;
onCompact: () => void;
interruptAgent: () => void;
active: boolean;
thinkingSeconds: number;
// New: current conversation items so we can include them in bug reports
items?: Array<ResponseItem>;
}): React.ReactElement {
// Slash command suggestion index
const [selectedSlashSuggestion, setSelectedSlashSuggestion] =
useState<number>(0);
const app = useApp();
const [selectedSuggestion, setSelectedSuggestion] = useState<number>(0);
const [input, setInput] = useState("");
const [history, setHistory] = useState<Array<HistoryEntry>>([]);
const [historyIndex, setHistoryIndex] = useState<number | null>(null);
const [draftInput, setDraftInput] = useState<string>("");
const [skipNextSubmit, setSkipNextSubmit] = useState<boolean>(false);
const [fsSuggestions, setFsSuggestions] = useState<Array<string>>([]);
const [selectedCompletion, setSelectedCompletion] = useState<number>(-1);
// Multiline text editor key to force remount after submission
const [editorKey, setEditorKey] = useState(0);
// Imperative handle from the multiline editor so we can query caret position
const editorRef = useRef<MultilineTextEditorHandle | null>(null);
// Track the caret row across keystrokes
const prevCursorRow = useRef<number | null>(null);
const prevCursorWasAtLastRow = useRef<boolean>(false);
// Load command history on component mount
useEffect(() => {
@@ -83,45 +111,237 @@ export default function TerminalChatInput({
loadHistory();
}, []);
// Reset slash suggestion index when input prefix changes
useEffect(() => {
if (input.trim().startsWith("/")) {
setSelectedSlashSuggestion(0);
}
}, [input]);
useInput(
(_input, _key) => {
if (!confirmationPrompt && !loading) {
if (_key.upArrow) {
if (history.length > 0) {
if (historyIndex == null) {
setDraftInput(input);
// Slash command navigation: up/down to select, enter to fill
if (!confirmationPrompt && !loading && input.trim().startsWith("/")) {
const prefix = input.trim();
const matches = SLASH_COMMANDS.filter((cmd: SlashCommand) =>
cmd.command.startsWith(prefix),
);
if (matches.length > 0) {
if (_key.tab) {
// Cycle and fill slash command suggestions on Tab
const len = matches.length;
// Determine new index based on shift state
const nextIdx = _key.shift
? selectedSlashSuggestion <= 0
? len - 1
: selectedSlashSuggestion - 1
: selectedSlashSuggestion >= len - 1
? 0
: selectedSlashSuggestion + 1;
setSelectedSlashSuggestion(nextIdx);
// Autocomplete the command in the input
const match = matches[nextIdx];
if (!match) {
return;
}
const cmd = match.command;
setInput(cmd);
setDraftInput(cmd);
return;
}
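// Illustrative example (assuming both commands exist in SLASH_COMMANDS):
// typing "/clear" and pressing Tab cycles the highlight between "/clear"
// and "/clearhistory", filling the input with whichever is selected.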
if (_key.upArrow) {
setSelectedSlashSuggestion((prev) =>
prev <= 0 ? matches.length - 1 : prev - 1,
);
return;
}
if (_key.downArrow) {
setSelectedSlashSuggestion((prev) =>
prev < 0 || prev >= matches.length - 1 ? 0 : prev + 1,
);
return;
}
if (_key.return) {
// Execute the currently selected slash command
const selIdx = selectedSlashSuggestion;
const cmdObj = matches[selIdx];
if (cmdObj) {
const cmd = cmdObj.command;
setInput("");
setDraftInput("");
setSelectedSlashSuggestion(0);
switch (cmd) {
case "/history":
openOverlay();
break;
case "/help":
openHelpOverlay();
break;
case "/compact":
onCompact();
break;
case "/model":
openModelOverlay();
break;
case "/approval":
openApprovalOverlay();
break;
case "/diff":
openDiffOverlay();
break;
case "/bug":
onSubmit(cmd);
break;
case "/clear":
onSubmit(cmd);
break;
case "/clearhistory":
onSubmit(cmd);
break;
default:
break;
}
}
return;
}
}
}
if (!confirmationPrompt && !loading) {
if (fsSuggestions.length > 0) {
if (_key.upArrow) {
setSelectedCompletion((prev) =>
prev <= 0 ? fsSuggestions.length - 1 : prev - 1,
);
return;
}
if (_key.downArrow) {
setSelectedCompletion((prev) =>
prev >= fsSuggestions.length - 1 ? 0 : prev + 1,
);
return;
}
if (_key.tab && selectedCompletion >= 0) {
const words = input.trim().split(/\s+/);
const selected = fsSuggestions[selectedCompletion];
if (words.length > 0 && selected) {
words[words.length - 1] = selected;
const newText = words.join(" ");
setInput(newText);
// Force remount of the editor with the new text
setEditorKey((k) => k + 1);
// We need to move the cursor to the end after editor remounts
setTimeout(() => {
editorRef.current?.moveCursorToEnd?.();
}, 0);
setFsSuggestions([]);
setSelectedCompletion(-1);
}
return;
}
}
if (_key.upArrow) {
let moveThroughHistory = true;
// Only use history when the caret was *already* on the very first
// row *before* this key-press.
const cursorRow = editorRef.current?.getRow?.() ?? 0;
const cursorCol = editorRef.current?.getCol?.() ?? 0;
const wasAtFirstRow = (prevCursorRow.current ?? cursorRow) === 0;
if (!(cursorRow === 0 && wasAtFirstRow)) {
moveThroughHistory = false;
}
// If we are not yet in history mode, then also require that the col is zero so that
// we only trigger history navigation when the user is at the start of the input.
if (historyIndex == null && !(cursorRow === 0 && cursorCol === 0)) {
moveThroughHistory = false;
}
// Move through history.
if (history.length && moveThroughHistory) {
let newIndex: number;
if (historyIndex == null) {
const currentDraft = editorRef.current?.getText?.() ?? input;
setDraftInput(currentDraft);
newIndex = history.length - 1;
} else {
newIndex = Math.max(0, historyIndex - 1);
}
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
// Re-mount the editor so it picks up the new initialText
setEditorKey((k) => k + 1);
return; // handled
}
return;
// Otherwise let it propagate.
}
if (_key.downArrow) {
if (historyIndex == null) {
// Only move forward in history when we're already *in* history mode
// AND the caret sits on the last line of the buffer.
const wasAtLastRow =
prevCursorWasAtLastRow.current ??
editorRef.current?.isCursorAtLastRow() ??
true;
if (historyIndex != null && wasAtLastRow) {
const newIndex = historyIndex + 1;
if (newIndex >= history.length) {
setHistoryIndex(null);
setInput(draftInput);
setEditorKey((k) => k + 1);
} else {
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
setEditorKey((k) => k + 1);
}
return; // handled
}
// Otherwise let it propagate
}
if (_key.tab) {
const words = input.split(/\s+/);
const mostRecentWord = words[words.length - 1];
if (mostRecentWord === undefined || mostRecentWord === "") {
return;
}
const newIndex = historyIndex + 1;
if (newIndex >= history.length) {
setHistoryIndex(null);
setInput(draftInput);
} else {
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
const completions = getFileSystemSuggestions(mostRecentWord);
setFsSuggestions(completions);
if (completions.length > 0) {
setSelectedCompletion(0);
}
return;
}
}
// Update the cached cursor position *after* **all** handlers (including
// the internal <MultilineTextEditor>) have processed this key event.
//
// Ink invokes `useInput` callbacks starting with **parent** components
// first, followed by their descendants. As a result the call above
// executes *before* the editor has had a chance to react to the key
// press and update its internal caret position. When navigating
// through a multi-line draft with the ↑ / ↓ arrow keys this meant we
// recorded the *old* cursor row instead of the one that results *after*
// the key press. Consequently, a subsequent ↑ still saw
// `prevCursorRow = 1` even though the caret was already on row 0 and
// history-navigation never kicked in.
//
// Defer the sampling by one tick so we read the *final* caret position
// for this frame.
setTimeout(() => {
prevCursorRow.current = editorRef.current?.getRow?.() ?? null;
prevCursorWasAtLastRow.current =
editorRef.current?.isCursorAtLastRow?.() ?? true;
}, 1);
if (input.trim() === "" && isNew) {
if (_key.tab) {
setSelectedSuggestion(
@@ -153,72 +373,91 @@ export default function TerminalChatInput({
const onSubmit = useCallback(
async (value: string) => {
const inputValue = value.trim();
if (!inputValue) {
// If the user only entered a slash, do not send a chat message.
if (inputValue === "/") {
setInput("");
return;
}
if (inputValue === "/history") {
// Skip this submit if we just autocompleted a slash command.
if (skipNextSubmit) {
setSkipNextSubmit(false);
return;
}
if (!inputValue) {
return;
} else if (inputValue === "/history") {
setInput("");
openOverlay();
return;
}
if (inputValue === "/help") {
} else if (inputValue === "/help") {
setInput("");
openHelpOverlay();
return;
}
if (inputValue === "/compact") {
} else if (inputValue === "/diff") {
setInput("");
openDiffOverlay();
return;
} else if (inputValue === "/compact") {
setInput("");
onCompact();
return;
}
if (inputValue.startsWith("/model")) {
} else if (inputValue.startsWith("/model")) {
setInput("");
openModelOverlay();
return;
}
if (inputValue.startsWith("/approval")) {
} else if (inputValue.startsWith("/approval")) {
setInput("");
openApprovalOverlay();
return;
}
if (inputValue === "q" || inputValue === ":q" || inputValue === "exit") {
} else if (["exit", "q", ":q"].includes(inputValue)) {
setInput("");
// wait one 60ms frame
setTimeout(() => {
app.exit();
onExit();
process.exit(0);
}, 60);
}, 60); // Wait one frame.
return;
} else if (inputValue === "/clear" || inputValue === "clear") {
setInput("");
setSessionId("");
setLastResponseId("");
// Clear the terminal first so the summary is printed on a fresh
// screen before the new session starts.
// Clear the terminal screen (including scrollback) before resetting context.
clearTerminal();
// Show the token/cost summary for the session that just ended.
printAndResetSessionSummary();
// Emit a system message to confirm the clear action. We *append*
// it so Ink's <Static> treats it as new output and actually renders it.
setItems((prev) => [
...prev,
{
id: `clear-${Date.now()}`,
type: "message",
role: "system",
content: [{ type: "input_text", text: "Context cleared" }],
},
]);
setItems((prev) => {
const filteredOldItems = prev.filter((item) => {
// Remove any token-heavy entries (user/assistant turns and function calls)
if (
item.type === "message" &&
(item.role === "user" || item.role === "assistant")
) {
return false;
}
if (
item.type === "function_call" ||
item.type === "function_call_output"
) {
return false;
}
return true; // keep developer/system and other meta entries
});
return [
...filteredOldItems,
{
id: `clear-${Date.now()}`,
type: "message",
role: "system",
content: [{ type: "input_text", text: "Terminal cleared" }],
},
];
});
return;
} else if (inputValue === "/clearhistory") {
@@ -231,7 +470,7 @@ export default function TerminalChatInput({
await clearCommandHistory();
setHistory([]);
// Emit a system message to confirm the history clear action
// Emit a system message to confirm the history clear action.
setItems((prev) => [
...prev,
{
@@ -246,12 +485,65 @@ export default function TerminalChatInput({
},
);
return;
} else if (inputValue === "/bug") {
// Generate a GitHub bug report URL prefilled with session details.
setInput("");
try {
const os = await import("node:os");
const { CLI_VERSION } = await import("../../utils/session.js");
const { buildBugReportUrl } = await import(
"../../utils/bug-report.js"
);
const url = buildBugReportUrl({
items: items ?? [],
cliVersion: CLI_VERSION,
model: loadConfig().model ?? "unknown",
platform: [os.platform(), os.arch(), os.release()]
.map((s) => `\`${s}\``)
.join(" | "),
});
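// e.g. on an Apple-silicon Mac the platform field renders roughly as
// "`darwin` | `arm64` | `23.5.0`" (os.platform | os.arch | os.release).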
setItems((prev) => [
...prev,
{
id: `bugreport-${Date.now()}`,
type: "message",
role: "system",
content: [
{
type: "input_text",
text: `🔗 Bug report URL: ${url}`,
},
],
},
]);
} catch (error) {
// If anything went wrong, notify the user.
setItems((prev) => [
...prev,
{
id: `bugreport-error-${Date.now()}`,
type: "message",
role: "system",
content: [
{
type: "input_text",
text: `⚠️ Failed to create bug report URL: ${error}`,
},
],
},
]);
}
return;
} else if (inputValue.startsWith("/")) {
// Handle invalid/unrecognized commands.
// Only single-word inputs starting with '/' (e.g., /command) that are not recognized are caught here.
// Any other input, including those starting with '/' but containing spaces
// (e.g., "/command arg"), will fall through and be treated as a regular prompt.
// Handle invalid/unrecognized commands. Only single-word inputs starting with '/'
// (e.g., /command) that are not recognized are caught here. Any other input, including
// those starting with '/' but containing spaces (e.g., "/command arg"), will fall through
// and be treated as a regular prompt.
const trimmed = inputValue.trim();
if (/^\/\S+$/.test(trimmed)) {
@@ -278,11 +570,13 @@ export default function TerminalChatInput({
// detect image file paths for dynamic inclusion
const images: Array<string> = [];
let text = inputValue;
// markdown-style image syntax: ![alt](path)
text = text.replace(/!\[[^\]]*?\]\(([^)]+)\)/g, (_m, p1: string) => {
images.push(p1.startsWith("file://") ? fileURLToPath(p1) : p1);
return "";
});
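// e.g. "see ![screenshot](/tmp/shot.png)" pushes "/tmp/shot.png" onto
// `images` and leaves "see " behind as the prompt text (hypothetical path).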
// quoted file paths ending with common image extensions (e.g. '/path/to/img.png')
text = text.replace(
/['"]([^'"]+?\.(?:png|jpe?g|gif|bmp|webp|svg))['"]/gi,
@@ -291,6 +585,7 @@ export default function TerminalChatInput({
return "";
},
);
// bare file paths ending with common image extensions
text = text.replace(
// eslint-disable-next-line no-useless-escape
@@ -307,10 +602,10 @@ export default function TerminalChatInput({
const inputItem = await createInputItem(text, images);
submitInput([inputItem]);
// Get config for history persistence
// Get config for history persistence.
const config = loadConfig();
// Add to history and update state
// Add to history and update state.
const updatedHistory = await addToHistory(value, history, {
maxSize: config.history?.maxSize ?? 1000,
saveHistory: config.history?.saveHistory ?? true,
@@ -322,6 +617,8 @@ export default function TerminalChatInput({
setDraftInput("");
setSelectedSuggestion(0);
setInput("");
setFsSuggestions([]);
setSelectedCompletion(-1);
},
[
setInput,
@@ -335,8 +632,11 @@ export default function TerminalChatInput({
openApprovalOverlay,
openModelOverlay,
openHelpOverlay,
history, // Add history to the dependency array
openDiffOverlay,
history,
onCompact,
skipNextSubmit,
items,
],
);
@@ -345,7 +645,11 @@ export default function TerminalChatInput({
<TerminalChatCommandReview
confirmationPrompt={confirmationPrompt}
onReviewCommand={submitConfirmation}
// allow switching approval mode via 's'
onSwitchApprovalMode={openApprovalOverlay}
explanation={explanation}
// disable when input is inactive (e.g., overlay open)
isActive={active}
/>
);
}
@@ -357,65 +661,112 @@ export default function TerminalChatInput({
<TerminalChatInputThinking
onInterrupt={interruptAgent}
active={active}
thinkingSeconds={thinkingSeconds}
/>
) : (
<Box paddingX={1}>
<TextInput
focus={active}
placeholder={
selectedSuggestion
? `"${suggestions[selectedSuggestion - 1]}"`
: "send a message" +
(isNew ? " or press tab to select a suggestion" : "")
}
showCursor
value={input}
onChange={(value) => {
setDraftInput(value);
<MultilineTextEditor
ref={editorRef}
onChange={(txt: string) => {
setDraftInput(txt);
if (historyIndex != null) {
setHistoryIndex(null);
}
setInput(value);
setInput(txt);
// Clear tab completions if a space is typed
if (txt.endsWith(" ")) {
setFsSuggestions([]);
setSelectedCompletion(-1);
} else if (fsSuggestions.length > 0) {
// Update file suggestions as user types
const words = txt.trim().split(/\s+/);
const mostRecentWord =
words.length > 0 ? words[words.length - 1] : "";
if (mostRecentWord !== undefined) {
setFsSuggestions(getFileSystemSuggestions(mostRecentWord));
}
}
}}
key={editorKey}
initialText={input}
height={6}
focus={active}
onSubmit={(txt) => {
onSubmit(txt);
setEditorKey((k) => k + 1);
setInput("");
setHistoryIndex(null);
setDraftInput("");
}}
onSubmit={onSubmit}
/>
</Box>
)}
</Box>
{/* Slash command autocomplete suggestions */}
{input.trim().startsWith("/") && (
<Box flexDirection="column" paddingX={2} marginBottom={1}>
{SLASH_COMMANDS.filter((cmd: SlashCommand) =>
cmd.command.startsWith(input.trim()),
).map((cmd: SlashCommand, idx: number) => (
<Box key={cmd.command}>
<Text
backgroundColor={
idx === selectedSlashSuggestion ? "blackBright" : undefined
}
>
<Text color="blueBright">{cmd.command}</Text>
<Text> {cmd.description}</Text>
</Text>
</Box>
))}
</Box>
)}
<Box paddingX={2} marginBottom={1}>
<Text dimColor>
{isNew && !input ? (
<>
try:{" "}
{suggestions.map((m, key) => (
<Fragment key={key}>
{key !== 0 ? " | " : ""}
<Text
backgroundColor={
key + 1 === selectedSuggestion ? "blackBright" : ""
}
>
{m}
</Text>
</Fragment>
))}
</>
) : (
<>
send q or ctrl+c to exit | send "/clear" to reset | send "/help"
for commands | press enter to send
{contextLeftPercent < 25 && (
<>
{" — "}
<Text color="red">
{Math.round(contextLeftPercent)}% context left; send
"/compact" to condense context
</Text>
</>
)}
</>
)}
</Text>
{isNew && !input ? (
<Text dimColor>
try:{" "}
{suggestions.map((m, key) => (
<Fragment key={key}>
{key !== 0 ? " | " : ""}
<Text
backgroundColor={
key + 1 === selectedSuggestion ? "blackBright" : ""
}
>
{m}
</Text>
</Fragment>
))}
</Text>
) : fsSuggestions.length > 0 ? (
<TextCompletions
completions={fsSuggestions}
selectedCompletion={selectedCompletion}
displayLimit={5}
/>
) : (
<Text dimColor>
ctrl+c to exit | "/" to see commands | enter to send
{contextLeftPercent > 25 && (
<>
{" — "}
<Text color={contextLeftPercent > 40 ? "green" : "yellow"}>
{Math.round(contextLeftPercent)}% context left
</Text>
</>
)}
{contextLeftPercent <= 25 && (
<>
{" — "}
<Text color="red">
{Math.round(contextLeftPercent)}% context left; send
"/compact" to condense context
</Text>
</>
)}
</Text>
)}
</Box>
</Box>
);
@@ -424,12 +775,42 @@ export default function TerminalChatInput({
function TerminalChatInputThinking({
onInterrupt,
active,
thinkingSeconds,
}: {
onInterrupt: () => void;
active: boolean;
thinkingSeconds: number;
}) {
const [dots, setDots] = useState("");
const [awaitingConfirm, setAwaitingConfirm] = useState(false);
const [dots, setDots] = useState("");
// Animate ellipsis
useInterval(() => {
setDots((prev) => (prev.length < 3 ? prev + "." : ""));
}, 500);
// Spinner frames with embedded seconds
const ballFrames = [
"( ● )",
"( ● )",
"( ● )",
"( ● )",
"( ●)",
"( ● )",
"( ● )",
"( ● )",
"( ● )",
"(● )",
];
const [frame, setFrame] = useState(0);
useInterval(() => {
setFrame((idx) => (idx + 1) % ballFrames.length);
}, 80);
// Keep the elapsed-seconds text fixed while the ball animation moves.
const frameTemplate = ballFrames[frame] ?? ballFrames[0];
const frameWithSeconds = `${frameTemplate} ${thinkingSeconds}s`;
// ---------------------------------------------------------------------
// Raw stdin listener to catch the case where the terminal delivers two
@@ -460,11 +841,9 @@ function TerminalChatInputThinking({
const str = Buffer.isBuffer(data) ? data.toString("utf8") : data;
if (str === "\x1b\x1b") {
// Treat as the first Escape press; prompt the user for confirmation.
if (isLoggingEnabled()) {
log(
"raw stdin: received collapsed ESC ESC - starting confirmation timer",
);
}
log(
"raw stdin: received collapsed ESC ESC - starting confirmation timer",
);
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
@@ -477,10 +856,7 @@ function TerminalChatInputThinking({
};
}, [stdin, awaitingConfirm, onInterrupt, active, setRawMode]);
// Cycle the "Thinking…" animation dots.
useInterval(() => {
setDots((prev) => (prev.length < 3 ? prev + "." : ""));
}, 500);
// No local timer: the parent component supplies the elapsed time via props.
// Listen for the escape key to allow the user to interrupt the current
// operation. We require two presses within a short window (1.5s) to avoid
@@ -492,15 +868,11 @@ function TerminalChatInputThinking({
}
if (awaitingConfirm) {
if (isLoggingEnabled()) {
log("useInput: second ESC detected triggering onInterrupt()");
}
log("useInput: second ESC detected triggering onInterrupt()");
onInterrupt();
setAwaitingConfirm(false);
} else {
if (isLoggingEnabled()) {
log("useInput: first ESC detected waiting for confirmation");
}
log("useInput: first ESC detected waiting for confirmation");
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
@@ -509,17 +881,30 @@ function TerminalChatInputThinking({
);
return (
<Box flexDirection="column" gap={1}>
<Box gap={2}>
<Spinner type="ball" />
<Text>Thinking{dots}</Text>
</Box>
{awaitingConfirm && (
<Text dimColor>
Press <Text bold>Esc</Text> again to interrupt and enter a new
instruction
<Box width="100%" flexDirection="column" gap={1}>
<Box
flexDirection="row"
width="100%"
justifyContent="space-between"
paddingRight={1}
>
<Box gap={2}>
<Text>{frameWithSeconds}</Text>
<Text>
Thinking
{dots}
</Text>
</Box>
<Text>
<Text dimColor>press</Text> <Text bold>Esc</Text>{" "}
{awaitingConfirm ? (
<Text bold>again</Text>
) : (
<Text dimColor>twice</Text>
)}{" "}
<Text dimColor>to interrupt</Text>
</Text>
)}
</Box>
</Box>
);
}

View File

@@ -1,563 +0,0 @@
import type { MultilineTextEditorHandle } from "./multiline-editor";
import type { ReviewDecision } from "../../utils/agent/review.js";
import type { HistoryEntry } from "../../utils/storage/command-history.js";
import type {
ResponseInputItem,
ResponseItem,
} from "openai/resources/responses/responses.mjs";
import MultilineTextEditor from "./multiline-editor";
import { TerminalChatCommandReview } from "./terminal-chat-command-review.js";
import { log, isLoggingEnabled } from "../../utils/agent/log.js";
import { loadConfig } from "../../utils/config.js";
import { createInputItem } from "../../utils/input-utils.js";
import { printAndResetSessionSummary } from "../../utils/session-cost.js";
import { setSessionId } from "../../utils/session.js";
import {
loadCommandHistory,
addToHistory,
} from "../../utils/storage/command-history.js";
import { clearTerminal, onExit } from "../../utils/terminal.js";
import Spinner from "../vendor/ink-spinner.js";
import { Box, Text, useApp, useInput, useStdin } from "ink";
import { fileURLToPath } from "node:url";
import React, { useCallback, useState, Fragment, useEffect } from "react";
import { useInterval } from "use-interval";
const suggestions = [
"explain this codebase to me",
"fix any build errors",
"are there any bugs in my code?",
];
const typeHelpText = `ctrl+c to exit | "/clear" to reset context | "/help" for commands | ↑↓ to recall history | ctrl+x to open external editor | enter to send`;
// Enable verbose logging for the history-navigation logic when the
// DEBUG_TCI environment variable is truthy. The traces help while debugging
// unit-test failures but remain silent in production.
const DEBUG_HIST =
process.env["DEBUG_TCI"] === "1" || process.env["DEBUG_TCI"] === "true";
const thinkingTexts = ["Thinking"]; /* [
"Consulting the rubber duck",
"Maximizing paperclips",
"Reticulating splines",
"Immanentizing the Eschaton",
"Thinking",
"Thinking about thinking",
"Spinning in circles",
"Counting dust specks",
"Updating priors",
"Feeding the utility monster",
"Taking off",
"Wireheading",
"Counting to infinity",
"Staring into the Basilisk",
"Running acausal tariff negotiations",
"Searching the library of babel",
"Multiplying matrices",
"Solving the halting problem",
"Counting grains of sand",
"Simulating a simulation",
"Asking the oracle",
"Detangling qubits",
"Reading tea leaves",
"Pondering universal love and transcendent joy",
"Feeling the AGI",
"Shaving the yak",
"Escaping local minima",
"Pruning the search tree",
"Descending the gradient",
"Painting the bikeshed",
"Securing funding",
]; */
export default function TerminalChatInput({
isNew: _isNew,
loading,
submitInput,
confirmationPrompt,
explanation,
submitConfirmation,
setLastResponseId,
setItems,
contextLeftPercent,
openOverlay,
openModelOverlay,
openApprovalOverlay,
openHelpOverlay,
interruptAgent,
active,
}: {
isNew: boolean;
loading: boolean;
submitInput: (input: Array<ResponseInputItem>) => void;
confirmationPrompt: React.ReactNode | null;
explanation?: string;
submitConfirmation: (
decision: ReviewDecision,
customDenyMessage?: string,
) => void;
setLastResponseId: (lastResponseId: string) => void;
setItems: React.Dispatch<React.SetStateAction<Array<ResponseItem>>>;
contextLeftPercent: number;
openOverlay: () => void;
openModelOverlay: () => void;
openApprovalOverlay: () => void;
openHelpOverlay: () => void;
interruptAgent: () => void;
active: boolean;
}): React.ReactElement {
const app = useApp();
const [selectedSuggestion, setSelectedSuggestion] = useState<number>(0);
const [input, setInput] = useState("");
const [history, setHistory] = useState<Array<HistoryEntry>>([]);
const [historyIndex, setHistoryIndex] = useState<number | null>(null);
const [draftInput, setDraftInput] = useState<string>("");
// Multiline text editor is now the default input mode. We keep an
// incremental `editorKey` so that we can force-remount the component and
// thus reset its internal buffer after each successful submit.
const [editorKey, setEditorKey] = useState(0);
// Load command history on component mount
useEffect(() => {
async function loadHistory() {
const historyEntries = await loadCommandHistory();
setHistory(historyEntries);
}
loadHistory();
}, []);
// Imperative handle from the multiline editor so we can query caret position
const editorRef = React.useRef<MultilineTextEditorHandle | null>(null);
// Track the caret row across keystrokes so we can tell whether the cursor
// was *already* on the first/last line before the current key event. This
// lets us distinguish between a normal vertical navigation (e.g. moving
// from row 1 → row 0 inside a multiline draft) and an attempt to navigate
// the chat history (pressing ↑ again while already at row 0).
const prevCursorRow = React.useRef<number | null>(null);
useInput(
(_input, _key) => {
if (!confirmationPrompt && !loading) {
if (_key.upArrow) {
if (DEBUG_HIST) {
// eslint-disable-next-line no-console
console.log("[TCI] upArrow", {
historyIndex,
input,
cursorRow: editorRef.current?.getRow?.(),
});
}
// Only recall history when the caret was *already* on the very first
// row *before* this keypress. That means the user pressed ↑ while
// the cursor sat at the top, mirroring how shells like Bash/zsh
// enter history navigation. When the caret starts on a lower line
// the first ↑ should merely move it up one row; only a subsequent
// press (when we are *still* at row 0) should trigger the recall.
const cursorRow = editorRef.current?.getRow?.() ?? 0;
const wasAtFirstRow = (prevCursorRow.current ?? cursorRow) === 0;
if (history.length > 0 && cursorRow === 0 && wasAtFirstRow) {
if (historyIndex == null) {
const currentDraft = editorRef.current?.getText?.() ?? input;
setDraftInput(currentDraft);
if (DEBUG_HIST) {
// eslint-disable-next-line no-console
console.log("[TCI] store draft", JSON.stringify(currentDraft));
}
}
let newIndex: number;
if (historyIndex == null) {
newIndex = history.length - 1;
} else {
newIndex = Math.max(0, historyIndex - 1);
}
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
// Remount the editor so it picks up the new initialText.
setEditorKey((k) => k + 1);
return; // we handled the key
}
// Otherwise let the event propagate so the editor moves the caret.
}
if (_key.downArrow) {
if (DEBUG_HIST) {
// eslint-disable-next-line no-console
console.log("[TCI] downArrow", { historyIndex, draftInput, input });
}
// Only move forward in history when we're already *in* history mode
// AND the caret sits on the last line of the buffer (so ↓ within a
// multiline draft simply moves the caret down).
if (historyIndex != null && editorRef.current?.isCursorAtLastRow()) {
const newIndex = historyIndex + 1;
if (newIndex >= history.length) {
setHistoryIndex(null);
setInput(draftInput);
setEditorKey((k) => k + 1);
} else {
setHistoryIndex(newIndex);
setInput(history[newIndex]?.command ?? "");
setEditorKey((k) => k + 1);
}
return; // handled
}
// Otherwise let it propagate.
}
}
if (input.trim() === "") {
if (_key.tab) {
setSelectedSuggestion(
(s) => (s + (_key.shift ? -1 : 1)) % (suggestions.length + 1),
);
} else if (selectedSuggestion && _key.return) {
const suggestion = suggestions[selectedSuggestion - 1] || "";
setInput("");
setSelectedSuggestion(0);
submitInput([
{
role: "user",
content: [{ type: "input_text", text: suggestion }],
type: "message",
},
]);
}
} else if (_input === "\u0003" || (_input === "c" && _key.ctrl)) {
setTimeout(() => {
app.exit();
onExit();
process.exit(0);
}, 60);
}
// Update the cached cursor position *after* we've potentially handled
// the key so that the next event has the correct "previous" reference.
prevCursorRow.current = editorRef.current?.getRow?.() ?? null;
},
{ isActive: active },
);
const onSubmit = useCallback(
async (value: string) => {
const inputValue = value.trim();
if (!inputValue) {
return;
}
if (inputValue === "/history") {
setInput("");
openOverlay();
return;
}
if (inputValue === "/help") {
setInput("");
openHelpOverlay();
return;
}
if (inputValue.startsWith("/model")) {
setInput("");
openModelOverlay();
return;
}
if (inputValue.startsWith("/approval")) {
setInput("");
openApprovalOverlay();
return;
}
if (inputValue === "q" || inputValue === ":q" || inputValue === "exit") {
setInput("");
// wait one 60ms frame
setTimeout(() => {
app.exit();
onExit();
process.exit(0);
}, 60);
return;
} else if (inputValue === "/clear" || inputValue === "clear") {
setInput("");
setSessionId("");
setLastResponseId("");
// Clear screen then display session summary so the user sees it.
clearTerminal();
printAndResetSessionSummary();
// Emit a system message to confirm the clear action. We *append*
// it so Ink's <Static> treats it as new output and actually renders it.
setItems((prev) => [
...prev,
{
id: `clear-${Date.now()}`,
type: "message",
role: "system",
content: [{ type: "input_text", text: "Context cleared" }],
},
]);
return;
} else if (inputValue === "/clearhistory") {
setInput("");
// Import clearCommandHistory function to avoid circular dependencies
// Using dynamic import to lazy-load the function
import("../../utils/storage/command-history.js").then(
async ({ clearCommandHistory }) => {
await clearCommandHistory();
setHistory([]);
// Emit a system message to confirm the history clear action
setItems((prev) => [
...prev,
{
id: `clearhistory-${Date.now()}`,
type: "message",
role: "system",
content: [
{ type: "input_text", text: "Command history cleared" },
],
},
]);
},
);
return;
}
const images: Array<string> = [];
const text = inputValue
.replace(/!\[[^\]]*?\]\(([^)]+)\)/g, (_m, p1: string) => {
images.push(p1.startsWith("file://") ? fileURLToPath(p1) : p1);
return "";
})
.trim();
const inputItem = await createInputItem(text, images);
submitInput([inputItem]);
// Get config for history persistence
const config = loadConfig();
// Add to history and update state
const updatedHistory = await addToHistory(value, history, {
maxSize: config.history?.maxSize ?? 1000,
saveHistory: config.history?.saveHistory ?? true,
sensitivePatterns: config.history?.sensitivePatterns ?? [],
});
setHistory(updatedHistory);
setHistoryIndex(null);
setDraftInput("");
setSelectedSuggestion(0);
setInput("");
},
[
setInput,
submitInput,
setLastResponseId,
setItems,
app,
setHistory,
setHistoryIndex,
openOverlay,
openApprovalOverlay,
openModelOverlay,
openHelpOverlay,
history, // Add history to the dependency array
],
);
if (confirmationPrompt) {
return (
<TerminalChatCommandReview
confirmationPrompt={confirmationPrompt}
onReviewCommand={submitConfirmation}
explanation={explanation}
/>
);
}
return (
<Box flexDirection="column">
{loading ? (
<Box borderStyle="round">
<TerminalChatInputThinking
onInterrupt={interruptAgent}
active={active}
/>
</Box>
) : (
<>
<Box borderStyle="round">
<MultilineTextEditor
ref={editorRef}
onChange={(txt: string) => setInput(txt)}
key={editorKey}
initialText={input}
height={8}
focus={active}
onSubmit={(txt) => {
onSubmit(txt);
setEditorKey((k) => k + 1);
setInput("");
setHistoryIndex(null);
setDraftInput("");
}}
/>
</Box>
<Box paddingX={2} marginBottom={1}>
<Text dimColor>
{!input ? (
<>
try:{" "}
{suggestions.map((m, key) => (
<Fragment key={key}>
{key !== 0 ? " | " : ""}
<Text
backgroundColor={
key + 1 === selectedSuggestion ? "blackBright" : ""
}
>
{m}
</Text>
</Fragment>
))}
</>
) : (
<>
{typeHelpText}
{contextLeftPercent < 25 && (
<>
{" — "}
<Text color="red">
{Math.round(contextLeftPercent)}% context left
</Text>
</>
)}
</>
)}
</Text>
</Box>
</>
)}
</Box>
);
}
function TerminalChatInputThinking({
onInterrupt,
active,
}: {
onInterrupt: () => void;
active: boolean;
}) {
const [dots, setDots] = useState("");
const [awaitingConfirm, setAwaitingConfirm] = useState(false);
const [thinkingText] = useState(
() => thinkingTexts[Math.floor(Math.random() * thinkingTexts.length)],
);
// ---------------------------------------------------------------------
// Raw stdin listener to catch the case where the terminal delivers two
// consecutive ESC bytes ("\x1B\x1B") in a *single* chunk. Ink's `useInput`
// collapses that sequence into one key event, so the regular two-step
// handler above never sees the second press. By inspecting the raw data
// we can identify this special case and trigger the interrupt while still
// requiring a double press for the normal single-byte ESC events.
// ---------------------------------------------------------------------
const { stdin, setRawMode } = useStdin();
React.useEffect(() => {
if (!active) {
return;
}
// Ensure raw mode; it is already enabled by Ink when the component has
// focus, but we set it defensively in case that assumption ever changes.
setRawMode?.(true);
const onData = (data: Buffer | string) => {
if (awaitingConfirm) {
return; // already awaiting a second explicit press
}
// Handle both Buffer and string forms.
const str = Buffer.isBuffer(data) ? data.toString("utf8") : data;
if (str === "\x1b\x1b") {
// Treat as the first Escape press; prompt the user for confirmation.
if (isLoggingEnabled()) {
log(
"raw stdin: received collapsed ESC ESC - starting confirmation timer",
);
}
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
};
stdin?.on("data", onData);
return () => {
stdin?.off("data", onData);
};
}, [stdin, awaitingConfirm, onInterrupt, active, setRawMode]);
useInterval(() => {
setDots((prev) => (prev.length < 3 ? prev + "." : ""));
}, 500);
useInput(
(_input, key) => {
if (!key.escape) {
return;
}
if (awaitingConfirm) {
if (isLoggingEnabled()) {
log("useInput: second ESC detected triggering onInterrupt()");
}
onInterrupt();
setAwaitingConfirm(false);
} else {
if (isLoggingEnabled()) {
log("useInput: first ESC detected waiting for confirmation");
}
setAwaitingConfirm(true);
setTimeout(() => setAwaitingConfirm(false), 1500);
}
},
{ isActive: active },
);
return (
<Box flexDirection="column" gap={1}>
<Box gap={2}>
<Spinner type="ball" />
<Text>
{thinkingText}
{dots}
</Text>
</Box>
{awaitingConfirm && (
<Text dimColor>
Press <Text bold>Esc</Text> again to interrupt and enter a new
instruction
</Text>
)}
</Box>
);
}

View File

@@ -1,3 +1,4 @@
import type { OverlayModeType } from "./terminal-chat";
import type { TerminalRendererOptions } from "marked-terminal";
import type {
ResponseFunctionToolCallItem,
@@ -14,18 +15,25 @@ import chalk, { type ForegroundColorName } from "chalk";
import { Box, Text } from "ink";
import { parse, setOptions } from "marked";
import TerminalRenderer from "marked-terminal";
import React, { useMemo } from "react";
import React, { useEffect, useMemo } from "react";
export default function TerminalChatResponseItem({
item,
fullStdout = false,
setOverlayMode,
}: {
item: ResponseItem;
fullStdout?: boolean;
setOverlayMode?: React.Dispatch<React.SetStateAction<OverlayModeType>>;
}): React.ReactElement {
switch (item.type) {
case "message":
return <TerminalChatResponseMessage message={item} />;
return (
<TerminalChatResponseMessage
setOverlayMode={setOverlayMode}
message={item}
/>
);
case "function_call":
return <TerminalChatResponseToolCall message={item} />;
case "function_call_output":
@@ -98,9 +106,23 @@ const colorsByRole: Record<string, ForegroundColorName> = {
function TerminalChatResponseMessage({
message,
setOverlayMode,
}: {
message: ResponseInputMessageItem | ResponseOutputMessage;
setOverlayMode?: React.Dispatch<React.SetStateAction<OverlayModeType>>;
}) {
// Auto-switch to the model-selection overlay when the system message
// reports an unavailable model (the check below looks for "model_not_found").
useEffect(() => {
if (message.role === "system") {
const systemMessage = message.content.find(
(c) => c.type === "input_text",
)?.text;
if (systemMessage?.includes("model_not_found")) {
setOverlayMode?.("model");
}
}
}, [message, setOverlayMode]);
return (
<Box flexDirection="column">
<Text bold color={colorsByRole[message.role] || "gray"}>
@@ -113,14 +135,14 @@ function TerminalChatResponseMessage({
c.type === "output_text"
? c.text
: c.type === "refusal"
? c.refusal
: c.type === "input_text"
? c.text
: c.type === "input_image"
? "<Image>"
: c.type === "input_file"
? c.filename
: "", // unknown content type
? c.refusal
: c.type === "input_text"
? c.text
: c.type === "input_image"
? "<Image>"
: c.type === "input_file"
? c.filename
: "", // unknown content type
)
.join(" ")}
</Markdown>

View File

@@ -1,135 +0,0 @@
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { approximateTokensUsed } from "../../utils/approximate-tokens-used.js";
/**
* Type guard that narrows a {@link ResponseItem} to one that represents a
* user-authored message. The OpenAI SDK represents both input *and* output
* messages with a discriminated union where:
* • `type` is the string literal "message" and
* • `role` is one of "user" | "assistant" | "system" | "developer".
*
* For the purposes of deduplication we only care about *user* messages so we
* detect those here in a single, reusable helper.
*/
function isUserMessage(
item: ResponseItem,
): item is ResponseItem & { type: "message"; role: "user"; content: unknown } {
return item.type === "message" && (item as { role?: string }).role === "user";
}
/**
* Returns the maximum context length (in tokens) for a given model.
* These numbers are best-effort guesses and provide a basis for UI percentages.
*/
export function maxTokensForModel(model: string): number {
const lower = model.toLowerCase();
// Heuristics for common context window sizes. Keep the checks loosely
// ordered from *largest* to *smallest* so that more specific long-context
// models are detected before their shorter generic counterparts.
// Special-case for the 1,047,576-token demo model (gpt-4.1 long). We match
// either the literal number or "gpt-4.1" variants we occasionally encounter.
if (lower.includes("1,047,576") || /gpt-4\.1/i.test(lower)) {
return 1047576;
}
if (lower.includes("128k") || /gpt-4\.5|gpt-4o-mini|gpt-4o\b/i.test(lower)) {
return 128000;
}
// Experimental o-series advertised at ~200k context
if (/\bo[134]\b|o[134]-mini|o1[- ]?pro/i.test(lower)) {
return 200000;
}
if (lower.includes("32k")) {
return 32000;
}
if (lower.includes("16k")) {
return 16000;
}
if (lower.includes("8k")) {
return 8000;
}
if (lower.includes("4k")) {
return 4000;
}
// Default to 128k for newer long-context models
return 128000;
}
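// Illustrative: maxTokensForModel("gpt-4o") hits the 128k branch, while
// maxTokensForModel("o3") falls through to the o-series check and
// returns 200000.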
/**
* Calculates the percentage of tokens remaining in context for a model.
*/
export function calculateContextPercentRemaining(
items: Array<ResponseItem>,
model: string,
extraContextChars = 0,
): number {
const tokensFromItems = approximateTokensUsed(items);
const extraTokens = Math.ceil(extraContextChars / 4);
const used = tokensFromItems + extraTokens;
const max = maxTokensForModel(model);
const remaining = Math.max(0, max - used);
return (remaining / max) * 100;
}
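// Worked example (assumed numbers): ~2,000 tokens of items plus 400 extra
// context chars (ceil(400 / 4) = 100 tokens) against a 128k model leaves
//   remaining = 128000 - 2100 = 125900
//   percent   = 125900 / 128000 * 100 ≈ 98.4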
/**
* Deduplicate the stream of {@link ResponseItem}s before they are persisted in
* component state.
*
* Historically we used the (optional) {@code id} field returned by the
* OpenAI streaming API as the primary key: the first occurrence of any given
* {@code id} “won” and subsequent duplicates were dropped. In practice this
 * proved brittle because locally-generated user messages don't include an
* {@code id}. The result was that if a user quickly pressed <Enter> twice the
* exact same message would appear twice in the transcript.
*
* The new rules are therefore:
 *   1. If a {@link ResponseItem} has an {@code id}, keep only the *first*
* occurrence of that {@code id} (this retains the previous behaviour for
* assistant / tool messages).
* 2. Additionally, collapse *consecutive* user messages with identical
* content. Two messages are considered identical when their serialized
* {@code content} array matches exactly. We purposefully restrict this
* to **adjacent** duplicates so that legitimately repeated questions at
* a later point in the conversation are still shown.
*/
export function uniqueById(items: Array<ResponseItem>): Array<ResponseItem> {
const seenIds = new Set<string>();
const deduped: Array<ResponseItem> = [];
for (const item of items) {
// ──────────────────────────────────────────────────────────────────
// Rule #1 – deduplicate by id when present
// ──────────────────────────────────────────────────────────────────
if (typeof item.id === "string" && item.id.length > 0) {
if (seenIds.has(item.id)) {
continue; // skip duplicates
}
seenIds.add(item.id);
}
// ──────────────────────────────────────────────────────────────────
// Rule #2 – collapse consecutive identical user messages
// ──────────────────────────────────────────────────────────────────
if (isUserMessage(item) && deduped.length > 0) {
const prev = deduped[deduped.length - 1]!;
if (
isUserMessage(prev) &&
// Note: the `content` field is an array of message parts. Performing
// a deep compare is overkill here; serialising to JSON is sufficient
// (and fast for the tiny payloads involved).
JSON.stringify(prev.content) === JSON.stringify(item.content)
) {
continue; // skip duplicate user message
}
}
deduped.push(item);
}
return deduped;
}
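// A minimal sketch of both rules in action (the helpers and contents are
// hypothetical):
//   const a = assistantMessage({ id: "m1" }); // has an id
//   const b = userMessage({ text: "hi" });    // no id
//   uniqueById([a, a, b, b]); // → [a, b]  (rule 1 drops the second `a`,
//                             //   rule 2 collapses the adjacent `b`s)
//   uniqueById([b, a, b]);    // → [b, a, b]  (non-adjacent repeats kept)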

View File

@@ -5,35 +5,47 @@ import type { ColorName } from "chalk";
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import TerminalChatInput from "./terminal-chat-input.js";
import { TerminalChatToolCallCommand } from "./terminal-chat-tool-call-item.js";
import {
calculateContextPercentRemaining,
uniqueById,
} from "./terminal-chat-utils.js";
import { TerminalChatToolCallCommand } from "./terminal-chat-tool-call-command.js";
import TerminalMessageHistory from "./terminal-message-history.js";
import { formatCommandForDisplay } from "../../format-command.js";
import { useConfirmation } from "../../hooks/use-confirmation.js";
import { useTerminalSize } from "../../hooks/use-terminal-size.js";
import { AgentLoop } from "../../utils/agent/agent-loop.js";
import { isLoggingEnabled, log } from "../../utils/agent/log.js";
import { ReviewDecision } from "../../utils/agent/review.js";
import { generateCompactSummary } from "../../utils/compact-summary.js";
import { OPENAI_BASE_URL } from "../../utils/config.js";
import { getBaseUrl, getApiKey, saveConfig } from "../../utils/config.js";
import { extractAppliedPatches as _extractAppliedPatches } from "../../utils/extract-applied-patches.js";
import { getGitDiff } from "../../utils/get-diff.js";
import { createInputItem } from "../../utils/input-utils.js";
import { getAvailableModels } from "../../utils/model-utils.js";
import { log } from "../../utils/logger/log.js";
import {
getAvailableModels,
calculateContextPercentRemaining,
uniqueById,
} from "../../utils/model-utils.js";
import { CLI_VERSION } from "../../utils/session.js";
import { shortCwd } from "../../utils/short-path.js";
import { saveRollout } from "../../utils/storage/save-rollout.js";
import ApprovalModeOverlay from "../approval-mode-overlay.js";
import DiffOverlay from "../diff-overlay.js";
import HelpOverlay from "../help-overlay.js";
import HistoryOverlay from "../history-overlay.js";
import ModelOverlay from "../model-overlay.js";
import chalk from "chalk";
import { Box, Text } from "ink";
import { exec } from "node:child_process";
import { spawn } from "node:child_process";
import OpenAI from "openai";
import React, { useEffect, useMemo, useRef, useState } from "react";
import { inspect } from "util";
export type OverlayModeType =
| "none"
| "history"
| "model"
| "approval"
| "help"
| "diff";
type Props = {
config: AppConfig;
prompt?: string;
@@ -54,17 +66,21 @@ const colorsByPolicy: Record<ApprovalPolicy, ColorName | undefined> = {
*
* @param command The command to explain
* @param model The model to use for generating the explanation
* @param flexMode Whether to use the flex-mode service tier
* @param config The configuration object
* @returns A human-readable explanation of what the command does
*/
async function generateCommandExplanation(
command: Array<string>,
model: string,
flexMode: boolean,
config: AppConfig,
): Promise<string> {
try {
// Create a temporary OpenAI client
const oai = new OpenAI({
apiKey: process.env["OPENAI_API_KEY"],
baseURL: OPENAI_BASE_URL,
apiKey: getApiKey(config.provider),
baseURL: getBaseUrl(config.provider),
});
// Format the command for display
@@ -73,6 +89,7 @@ async function generateCommandExplanation(
// Create a prompt that asks for an explanation with a more detailed system prompt
const response = await oai.chat.completions.create({
model,
...(flexMode ? { service_tier: "flex" } : {}),
messages: [
{
role: "system",
@@ -93,11 +110,8 @@ async function generateCommandExplanation(
} catch (error) {
log(`Error generating command explanation: ${error}`);
// Improved error handling with more specific error information
let errorMessage = "Unable to generate explanation due to an error.";
if (error instanceof Error) {
// Include specific error message for better debugging
errorMessage = `Unable to generate explanation: ${error.message}`;
// If it's an API error, check for more specific information
@@ -128,21 +142,26 @@ export default function TerminalChat({
additionalWritableRoots,
fullStdout,
}: Props): React.ReactElement {
// Desktop notification setting
const notify = config.notify;
const notify = Boolean(config.notify);
const [model, setModel] = useState<string>(config.model);
const [provider, setProvider] = useState<string>(config.provider || "openai");
const [lastResponseId, setLastResponseId] = useState<string | null>(null);
const [items, setItems] = useState<Array<ResponseItem>>([]);
const [loading, setLoading] = useState<boolean>(false);
// Allow switching approval modes at runtime via an overlay.
const [approvalPolicy, setApprovalPolicy] = useState<ApprovalPolicy>(
initialApprovalPolicy,
);
const [thinkingSeconds, setThinkingSeconds] = useState(0);
const handleCompact = async () => {
setLoading(true);
try {
const summary = await generateCompactSummary(items, model);
const summary = await generateCompactSummary(
items,
model,
Boolean(config.flexMode),
config,
);
setItems([
{
id: `compact-${Date.now()}`,
@@ -167,15 +186,21 @@ export default function TerminalChat({
setLoading(false);
}
};
const {
requestConfirmation,
confirmationPrompt,
explanation,
submitConfirmation,
} = useConfirmation();
const [overlayMode, setOverlayMode] = useState<
"none" | "history" | "model" | "approval" | "help"
>("none");
const [overlayMode, setOverlayMode] = useState<OverlayModeType>("none");
// Store the diff text when opening the diff overlay so the view isn't
// recomputed on every re-render while it is open.
// diffText is passed down to the DiffOverlay component. The setter is
// currently unused but retained for potential future updates. Prefix with
// an underscore so eslint ignores the unused variable.
const [diffText, _setDiffText] = useState<string>("");
const [initialPrompt, setInitialPrompt] = useState(_initialPrompt);
const [initialImagePaths, setInitialImagePaths] =
@@ -191,39 +216,44 @@ export default function TerminalChat({
// ────────────────────────────────────────────────────────────────
// DEBUG: log every render w/ key bits of state
// ────────────────────────────────────────────────────────────────
if (isLoggingEnabled()) {
log(
`render – agent? ${Boolean(agentRef.current)} loading=${loading} items=${
items.length
}`,
);
}
log(
`render - agent? ${Boolean(agentRef.current)} loading=${loading} items=${
items.length
}`,
);
useEffect(() => {
if (isLoggingEnabled()) {
log("creating NEW AgentLoop");
log(
`model=${model} instructions=${Boolean(
config.instructions,
)} approvalPolicy=${approvalPolicy}`,
);
// Skip recreating the agent if awaiting a decision on a pending confirmation.
if (confirmationPrompt != null) {
log("skip AgentLoop recreation due to pending confirmationPrompt");
return;
}
// Tear down any existing loop before creating a new one
log("creating NEW AgentLoop");
log(
`model=${model} provider=${provider} instructions=${Boolean(
config.instructions,
)} approvalPolicy=${approvalPolicy}`,
);
// Tear down any existing loop before creating a new one.
agentRef.current?.terminate();
const sessionId = crypto.randomUUID();
agentRef.current = new AgentLoop({
model,
provider,
config,
instructions: config.instructions,
approvalPolicy,
disableResponseStorage: config.disableResponseStorage,
additionalWritableRoots,
onLastResponseId: setLastResponseId,
onItem: (item) => {
log(`onItem: ${JSON.stringify(item)}`);
setItems((prev) => {
const updated = uniqueById([...prev, item as ResponseItem]);
saveRollout(updated);
saveRollout(sessionId, updated);
return updated;
});
},
@@ -240,15 +270,18 @@ export default function TerminalChat({
<TerminalChatToolCallCommand commandForDisplay={commandForDisplay} />,
);
// If the user wants an explanation, generate one and ask again
// If the user wants an explanation, generate one and ask again.
if (review === ReviewDecision.EXPLAIN) {
log(`Generating explanation for command: ${commandForDisplay}`);
// Generate an explanation using the same model
const explanation = await generateCommandExplanation(command, model);
const explanation = await generateCommandExplanation(
command,
model,
Boolean(config.flexMode),
config,
);
log(`Generated explanation: ${explanation}`);
// Ask for confirmation again, but with the explanation
// Ask for confirmation again, but with the explanation.
const confirmResult = await requestConfirmation(
<TerminalChatToolCallCommand
commandForDisplay={commandForDisplay}
@@ -256,11 +289,11 @@ export default function TerminalChat({
/>,
);
// Update the decision based on the second confirmation
// Update the decision based on the second confirmation.
review = confirmResult.decision;
customDenyMessage = confirmResult.customDenyMessage;
// Return the final decision with the explanation
// Return the final decision with the explanation.
return { review, customDenyMessage, applyPatch, explanation };
}
@@ -268,30 +301,23 @@ export default function TerminalChat({
},
});
// force a render so JSX below can "see" the freshly created agent
// Force a render so JSX below can "see" the freshly created agent.
forceUpdate();
if (isLoggingEnabled()) {
log(`AgentLoop created: ${inspect(agentRef.current, { depth: 1 })}`);
}
log(`AgentLoop created: ${inspect(agentRef.current, { depth: 1 })}`);
return () => {
if (isLoggingEnabled()) {
log("terminating AgentLoop");
}
log("terminating AgentLoop");
agentRef.current?.terminate();
agentRef.current = undefined;
forceUpdate(); // rerender after teardown too
};
}, [
model,
config,
approvalPolicy,
requestConfirmation,
additionalWritableRoots,
]);
// We intentionally omit 'approvalPolicy' and 'confirmationPrompt' from the deps
// so switching modes or showing confirmation dialogs doesn't tear down the loop.
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [model, provider, config, requestConfirmation, additionalWritableRoots]);
// whenever loading starts/stops, reset or start a timer — but pause the
// Whenever loading starts/stops, reset or start a timer — but pause the
// timer while a confirmation overlay is displayed so we don't trigger a
// rerender every second during apply_patch reviews.
useEffect(() => {
@@ -316,14 +342,15 @@ export default function TerminalChat({
};
}, [loading, confirmationPrompt]);
// Notify desktop with a preview when an assistant response arrives
// Notify desktop with a preview when an assistant response arrives.
const prevLoadingRef = useRef<boolean>(false);
useEffect(() => {
// Only notify when notifications are enabled
// Only notify when notifications are enabled.
if (!notify) {
prevLoadingRef.current = loading;
return;
}
if (
prevLoadingRef.current &&
!loading &&
@@ -350,21 +377,20 @@ export default function TerminalChat({
const safePreview = preview.replace(/"/g, '\\"');
const title = "Codex CLI";
const cwd = PWD;
exec(
`osascript -e 'display notification "${safePreview}" with title "${title}" subtitle "${cwd}" sound name "Ping"'`,
);
spawn("osascript", [
"-e",
`display notification "${safePreview}" with title "${title}" subtitle "${cwd}" sound name "Ping"`,
]);
}
}
}
prevLoadingRef.current = loading;
}, [notify, loading, confirmationPrompt, items, PWD]);
// Let's also track whenever the ref becomes available
// Let's also track whenever the ref becomes available.
const agent = agentRef.current;
useEffect(() => {
if (isLoggingEnabled()) {
log(`agentRef.current is now ${Boolean(agent)}`);
}
log(`agentRef.current is now ${Boolean(agent)}`);
}, [agent]);
// ---------------------------------------------------------------------
@@ -384,7 +410,7 @@ export default function TerminalChat({
const inputItems = [
await createInputItem(initialPrompt || "", initialImagePaths || []),
];
// Clear them to prevent subsequent runs
// Clear them to prevent subsequent runs.
setInitialPrompt("");
setInitialImagePaths([]);
agent?.run(inputItems);
@@ -397,7 +423,7 @@ export default function TerminalChat({
// ────────────────────────────────────────────────────────────────
useEffect(() => {
(async () => {
const available = await getAvailableModels();
const available = await getAvailableModels(provider);
if (model && available.length > 0 && !available.includes(model)) {
setItems((prev) => [
...prev,
@@ -408,7 +434,7 @@ export default function TerminalChat({
content: [
{
type: "input_text",
text: `Warning: model "${model}" is not in the list of available models returned by OpenAI.`,
text: `Warning: model "${model}" is not in the list of available models for provider "${provider}".`,
},
],
},
@@ -419,7 +445,7 @@ export default function TerminalChat({
// eslint-disable-next-line react-hooks/exhaustive-deps
}, []);
// Just render every item in order, no grouping/collapse
// Just render every item in order, no grouping/collapse.
const lastMessageBatch = items.map((item) => ({ item }));
const groupCounts: Record<string, number> = {};
const userMsgCount = items.filter(
@@ -427,14 +453,8 @@ export default function TerminalChat({
).length;
const contextLeftPercent = useMemo(
() =>
calculateContextPercentRemaining(
items,
model,
// static system instructions count towards the context budget too
config.instructions?.length ?? 0,
),
[items, model, config.instructions],
() => calculateContextPercentRemaining(items, model),
[items, model],
);
return (
@@ -442,6 +462,7 @@ export default function TerminalChat({
<Box flexDirection="column">
{agent ? (
<TerminalMessageHistory
setOverlayMode={setOverlayMode}
batch={lastMessageBatch}
groupCounts={groupCounts}
items={items}
@@ -455,10 +476,12 @@ export default function TerminalChat({
version: CLI_VERSION,
PWD,
model,
provider,
approvalPolicy,
colorsByPolicy,
agent,
initialImagePaths,
flexModeEnabled: Boolean(config.flexMode),
}}
/>
) : (
@@ -466,7 +489,7 @@ export default function TerminalChat({
<Text color="gray">Initializing agent</Text>
</Box>
)}
{agent && (
{overlayMode === "none" && agent && (
<TerminalChatInput
loading={loading}
setItems={setItems}
@@ -488,17 +511,35 @@ export default function TerminalChat({
openModelOverlay={() => setOverlayMode("model")}
openApprovalOverlay={() => setOverlayMode("approval")}
openHelpOverlay={() => setOverlayMode("help")}
openDiffOverlay={() => {
const { isGitRepo, diff } = getGitDiff();
let text: string;
if (isGitRepo) {
text = diff;
} else {
text = "`/diff` — _not inside a git repository_";
}
setItems((prev) => [
...prev,
{
id: `diff-${Date.now()}`,
type: "message",
role: "system",
content: [{ type: "input_text", text }],
},
]);
// Ensure no overlay is shown.
setOverlayMode("none");
}}
onCompact={handleCompact}
active={overlayMode === "none"}
interruptAgent={() => {
if (!agent) {
return;
}
if (isLoggingEnabled()) {
log(
"TerminalChat: interruptAgent invoked calling agent.cancel()",
);
}
log(
"TerminalChat: interruptAgent invoked calling agent.cancel()",
);
agent.cancel();
setLoading(false);
@@ -522,6 +563,8 @@ export default function TerminalChat({
agent.run(inputs, lastResponseId || "");
return {};
}}
items={items}
thinkingSeconds={thinkingSeconds}
/>
)}
{overlayMode === "history" && (
@@ -530,24 +573,45 @@ export default function TerminalChat({
{overlayMode === "model" && (
<ModelOverlay
currentModel={model}
providers={config.providers}
currentProvider={provider}
hasLastResponse={Boolean(lastResponseId)}
onSelect={(newModel) => {
if (isLoggingEnabled()) {
log(
"TerminalChat: interruptAgent invoked calling agent.cancel()",
);
if (!agent) {
log("TerminalChat: agent is not ready yet");
}
onSelect={(allModels, newModel) => {
log(
"TerminalChat: interruptAgent invoked calling agent.cancel()",
);
if (!agent) {
log("TerminalChat: agent is not ready yet");
}
agent?.cancel();
setLoading(false);
if (!allModels?.includes(newModel)) {
// eslint-disable-next-line no-console
console.error(
chalk.bold.red(
`Model "${chalk.yellow(
newModel,
)}" is not available for provider "${chalk.yellow(
provider,
)}".`,
),
);
return;
}
setModel(newModel);
setLastResponseId((prev) =>
prev && newModel !== model ? null : prev,
);
// Save model to config
saveConfig({
...config,
model: newModel,
provider: provider,
});
setItems((prev) => [
...prev,
{
@@ -565,6 +629,51 @@ export default function TerminalChat({
setOverlayMode("none");
}}
onSelectProvider={(newProvider) => {
log(
"TerminalChat: interruptAgent invoked calling agent.cancel()",
);
if (!agent) {
log("TerminalChat: agent is not ready yet");
}
agent?.cancel();
setLoading(false);
// Keep the current model as the default when switching providers.
const defaultModel = model;
// Save provider to config.
const updatedConfig = {
...config,
provider: newProvider,
model: defaultModel,
};
saveConfig(updatedConfig);
setProvider(newProvider);
setModel(defaultModel);
setLastResponseId((prev) =>
prev && newProvider !== provider ? null : prev,
);
setItems((prev) => [
...prev,
{
id: `switch-provider-${Date.now()}`,
type: "message",
role: "system",
content: [
{
type: "input_text",
text: `Switched provider to ${newProvider} with model ${defaultModel}`,
},
],
},
]);
// Don't close the overlay so the user can select a model for the new provider
// setOverlayMode("none");
}}
onExit={() => setOverlayMode("none")}
/>
)}
@@ -573,12 +682,19 @@ export default function TerminalChat({
<ApprovalModeOverlay
currentMode={approvalPolicy}
onSelect={(newMode) => {
agent?.cancel();
setLoading(false);
// Update approval policy without cancelling an in-progress session.
if (newMode === approvalPolicy) {
return;
}
setApprovalPolicy(newMode as ApprovalPolicy);
if (agentRef.current) {
(
agentRef.current as unknown as {
approvalPolicy: ApprovalPolicy;
}
).approvalPolicy = newMode as ApprovalPolicy;
}
setItems((prev) => [
...prev,
{
@@ -603,6 +719,13 @@ export default function TerminalChat({
{overlayMode === "help" && (
<HelpOverlay onExit={() => setOverlayMode("none")} />
)}
{overlayMode === "diff" && (
<DiffOverlay
diffText={diffText}
onExit={() => setOverlayMode("none")}
/>
)}
</Box>
</Box>
);

View File

@@ -9,10 +9,12 @@ export interface TerminalHeaderProps {
version: string;
PWD: string;
model: string;
provider?: string;
approvalPolicy: string;
colorsByPolicy: Record<string, string | undefined>;
agent?: AgentLoop;
initialImagePaths?: Array<string>;
flexModeEnabled?: boolean;
}
const TerminalHeader: React.FC<TerminalHeaderProps> = ({
@@ -20,18 +22,21 @@ const TerminalHeader: React.FC<TerminalHeaderProps> = ({
version,
PWD,
model,
provider = "openai",
approvalPolicy,
colorsByPolicy,
agent,
initialImagePaths,
flexModeEnabled = false,
}) => {
return (
<>
{terminalRows < 10 ? (
// Compact header for small terminal windows
<Text>
Codex v{version} – {PWD} – {model} –{" "}
Codex v{version} - {PWD} - {model} ({provider}) -{" "}
<Text color={colorsByPolicy[approvalPolicy]}>{approvalPolicy}</Text>
{flexModeEnabled ? " - flex-mode" : ""}
</Text>
) : (
<>
@@ -62,12 +67,22 @@ const TerminalHeader: React.FC<TerminalHeaderProps> = ({
<Text dimColor>
<Text color="blueBright"></Text> model: <Text bold>{model}</Text>
</Text>
<Text dimColor>
<Text color="blueBright"></Text> provider:{" "}
<Text bold>{provider}</Text>
</Text>
<Text dimColor>
<Text color="blueBright"></Text> approval:{" "}
<Text bold color={colorsByPolicy[approvalPolicy]} dimColor>
<Text bold color={colorsByPolicy[approvalPolicy]}>
{approvalPolicy}
</Text>
</Text>
{flexModeEnabled && (
<Text dimColor>
<Text color="blueBright"></Text> flex-mode:{" "}
<Text bold>enabled</Text>
</Text>
)}
{initialImagePaths?.map((img, idx) => (
<Text key={img ?? idx} color="gray">
<Text color="blueBright"></Text> image:{" "}

View File

@@ -1,17 +1,18 @@
import type { OverlayModeType } from "./terminal-chat.js";
import type { TerminalHeaderProps } from "./terminal-header.js";
import type { GroupedResponseItem } from "./use-message-grouping.js";
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import TerminalChatResponseItem from "./terminal-chat-response-item.js";
import TerminalHeader from "./terminal-header.js";
import { Box, Static, Text } from "ink";
import { Box, Static } from "ink";
import React, { useMemo } from "react";
// A batch entry can either be a standalone response item or a grouped set of
// items (e.g. auto-approved tool-call batches) that should be rendered
// together.
type BatchEntry = { item?: ResponseItem; group?: GroupedResponseItem };
type MessageHistoryProps = {
type TerminalMessageHistoryProps = {
batch: Array<BatchEntry>;
groupCounts: Record<string, number>;
items: Array<ResponseItem>;
@@ -21,25 +22,25 @@ type MessageHistoryProps = {
thinkingSeconds: number;
headerProps: TerminalHeaderProps;
fullStdout: boolean;
setOverlayMode: React.Dispatch<React.SetStateAction<OverlayModeType>>;
};
const MessageHistory: React.FC<MessageHistoryProps> = ({
const TerminalMessageHistory: React.FC<TerminalMessageHistoryProps> = ({
batch,
headerProps,
loading,
thinkingSeconds,
// `loading` and `thinkingSeconds` are handled by the input component now.
loading: _loading,
thinkingSeconds: _thinkingSeconds,
fullStdout,
setOverlayMode,
}) => {
// Flatten batch entries to response items.
const messages = useMemo(() => batch.map(({ item }) => item!), [batch]);
return (
<Box flexDirection="column">
{loading && (
<Box marginTop={1}>
<Text color="yellow">{`thinking for ${thinkingSeconds}s`}</Text>
</Box>
)}
{/* The dedicated thinking indicator in the input area now displays the
elapsed time, so we no longer render a separate counter here. */}
<Static items={["header", ...messages]}>
{(item, index) => {
if (item === "header") {
@@ -67,6 +68,7 @@ const MessageHistory: React.FC<MessageHistoryProps> = ({
<TerminalChatResponseItem
item={message}
fullStdout={fullStdout}
setOverlayMode={setOverlayMode}
/>
</Box>
);
@@ -76,4 +78,4 @@ const MessageHistory: React.FC<MessageHistoryProps> = ({
);
};
export default React.memo(MessageHistory);
export default React.memo(TerminalMessageHistory);

View File

@@ -0,0 +1,93 @@
import { Box, Text, useInput } from "ink";
import React, { useState } from "react";
/**
* Simple scrollable view for displaying a diff.
* The component is intentionally lightweight and mirrors the UX of
* HistoryOverlay: Up/Down or j/k to scroll, PgUp/PgDn for paging and Esc to
* close. The caller is responsible for computing the diff text.
*/
export default function DiffOverlay({
diffText,
onExit,
}: {
diffText: string;
onExit: () => void;
}): JSX.Element {
const lines = diffText.length > 0 ? diffText.split("\n") : ["(no changes)"];
const [cursor, setCursor] = useState(0);
// Determine how many rows we can display similar to HistoryOverlay.
const rows = process.stdout.rows || 24;
const headerRows = 2;
const footerRows = 1;
const maxVisible = Math.max(4, rows - headerRows - footerRows);
useInput((input, key) => {
if (key.escape || input === "q") {
onExit();
return;
}
if (key.downArrow || input === "j") {
setCursor((c) => Math.min(lines.length - 1, c + 1));
} else if (key.upArrow || input === "k") {
setCursor((c) => Math.max(0, c - 1));
} else if (key.pageDown) {
setCursor((c) => Math.min(lines.length - 1, c + maxVisible));
} else if (key.pageUp) {
setCursor((c) => Math.max(0, c - maxVisible));
} else if (input === "g") {
setCursor(0);
} else if (input === "G") {
setCursor(lines.length - 1);
}
});
const firstVisible = Math.min(
Math.max(0, cursor - Math.floor(maxVisible / 2)),
Math.max(0, lines.length - maxVisible),
);
const visible = lines.slice(firstVisible, firstVisible + maxVisible);
// Very small helper to colorize diff lines in a basic way.
function renderLine(line: string, idx: number): JSX.Element {
let color: "green" | "red" | "cyan" | undefined = undefined;
if (line.startsWith("+")) {
color = "green";
} else if (line.startsWith("-")) {
color = "red";
} else if (line.startsWith("@@") || line.startsWith("diff --git")) {
color = "cyan";
}
return (
<Text key={idx} color={color} wrap="truncate-end">
{line === "" ? " " : line}
</Text>
);
}
return (
<Box
flexDirection="column"
borderStyle="round"
borderColor="gray"
width={Math.min(120, process.stdout.columns || 120)}
>
<Box paddingX={1}>
<Text bold>Working tree diff ({lines.length} lines)</Text>
</Box>
<Box flexDirection="column" paddingX={1}>
{visible.map((line, idx) => {
return renderLine(line, firstVisible + idx);
})}
</Box>
<Box paddingX={1}>
<Text dimColor>esc Close Scroll PgUp/PgDn g/G First/Last</Text>
</Box>
</Box>
);
}
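// Minimal usage sketch (assumes the caller already computed the diff, e.g.
// via the getGitDiff helper used elsewhere in this PR):
//
//   const { isGitRepo, diff } = getGitDiff();
//   <DiffOverlay
//     diffText={isGitRepo ? diff : ""}
//     onExit={() => setOverlayMode("none")}
//   />;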

View File

@@ -52,6 +52,13 @@ export default function HelpOverlay({
<Text>
<Text color="cyan">/clearhistory</Text> clear command history
</Text>
<Text>
<Text color="cyan">/bug</Text> generate a prefilled GitHub issue URL
with session log
</Text>
<Text>
<Text color="cyan">/diff</Text> view working tree git diff
</Text>
<Text>
<Text color="cyan">/compact</Text> condense context into a summary
</Text>

View File

@@ -14,7 +14,10 @@ export default function HistoryOverlay({ items, onExit }: Props): JSX.Element {
const [mode, setMode] = useState<Mode>("commands");
const [cursor, setCursor] = useState(0);
const { commands, files } = useMemo(() => buildLists(items), [items]);
const { commands, files } = useMemo(
() => formatHistoryForDisplay(items),
[items],
);
const list = mode === "commands" ? commands : files;
@@ -95,7 +98,7 @@ export default function HistoryOverlay({ items, onExit }: Props): JSX.Element {
);
}
function buildLists(items: Array<ResponseItem>): {
function formatHistoryForDisplay(items: Array<ResponseItem>): {
commands: Array<string>;
files: Array<string>;
} {
@@ -103,33 +106,9 @@ function buildLists(items: Array<ResponseItem>): {
const filesSet = new Set<string>();
for (const item of items) {
if (
item.type === "message" &&
(item as unknown as { role?: string }).role === "user"
) {
// TODO: We're ignoring images/files here.
const parts =
(item as unknown as { content?: Array<unknown> }).content ?? [];
const texts: Array<string> = [];
if (Array.isArray(parts)) {
for (const part of parts) {
if (part && typeof part === "object" && "text" in part) {
const t = (part as unknown as { text?: string }).text;
if (typeof t === "string" && t.length > 0) {
texts.push(t);
}
}
}
}
if (texts.length > 0) {
const fullPrompt = texts.join(" ");
// Truncate very long prompts so the history view stays legible.
const truncated =
fullPrompt.length > 120 ? `${fullPrompt.slice(0, 117)}...` : fullPrompt;
commands.push(`> ${truncated}`);
}
const userPrompt = processUserMessage(item);
if (userPrompt) {
commands.push(userPrompt);
continue;
}
@@ -169,35 +148,11 @@ function buildLists(items: Array<ResponseItem>): {
const cmdArray: Array<string> | undefined = Array.isArray(argsObj?.["cmd"])
? (argsObj!["cmd"] as Array<string>)
: Array.isArray(argsObj?.["command"])
? (argsObj!["command"] as Array<string>)
: undefined;
? (argsObj!["command"] as Array<string>)
: undefined;
if (cmdArray && cmdArray.length > 0) {
commands.push(cmdArray.join(" "));
// Heuristic for file paths in command args
for (const part of cmdArray) {
if (!part.startsWith("-") && part.includes("/")) {
filesSet.add(part);
}
}
// Special-case apply_patch so we can extract the list of modified files
if (cmdArray[0] === "apply_patch" || cmdArray.includes("apply_patch")) {
const patchTextMaybe = cmdArray.find((s) =>
s.includes("*** Begin Patch"),
);
if (typeof patchTextMaybe === "string") {
const lines = patchTextMaybe.split("\n");
for (const line of lines) {
const m = line.match(/^[-+]{3} [ab]\/(.+)$/);
if (m && m[1]) {
filesSet.add(m[1]);
}
}
}
}
commands.push(processCommandArray(cmdArray, filesSet));
continue; // We processed this as a command; no need to treat as generic tool call.
}
@@ -205,33 +160,96 @@ function buildLists(items: Array<ResponseItem>): {
// short argument representation to give users an idea of what
// happened.
if (typeof toolName === "string" && toolName.length > 0) {
let summary = toolName;
if (argsJson && typeof argsJson === "object") {
// Extract a few common argument keys to make the summary more useful
// without being overly verbose.
const interestingKeys = [
"path",
"file",
"filepath",
"filename",
"pattern",
];
for (const key of interestingKeys) {
const val = (argsJson as Record<string, unknown>)[key];
if (typeof val === "string") {
summary += ` ${val}`;
if (val.includes("/")) {
filesSet.add(val);
}
break;
}
}
}
commands.push(summary);
commands.push(processNonExecTool(toolName, argsJson, filesSet));
}
}
return { commands, files: Array.from(filesSet) };
}
function processUserMessage(item: ResponseItem): string | null {
if (
item.type === "message" &&
(item as unknown as { role?: string }).role === "user"
) {
// TODO: We're ignoring images/files here.
const parts =
(item as unknown as { content?: Array<unknown> }).content ?? [];
const texts: Array<string> = [];
if (Array.isArray(parts)) {
for (const part of parts) {
if (part && typeof part === "object" && "text" in part) {
const t = (part as unknown as { text?: string }).text;
if (typeof t === "string" && t.length > 0) {
texts.push(t);
}
}
}
}
if (texts.length > 0) {
const fullPrompt = texts.join(" ");
// Truncate very long prompts so the history view stays legible.
return fullPrompt.length > 120
? `> ${fullPrompt.slice(0, 117)}...`
: `> ${fullPrompt}`;
}
}
return null;
}
function processCommandArray(
cmdArray: Array<string>,
filesSet: Set<string>,
): string {
const cmd = cmdArray.join(" ");
// Heuristic for file paths in command args
for (const part of cmdArray) {
if (!part.startsWith("-") && part.includes("/")) {
filesSet.add(part);
}
}
// Special-case apply_patch so we can extract the list of modified files
if (cmdArray[0] === "apply_patch" || cmdArray.includes("apply_patch")) {
const patchTextMaybe = cmdArray.find((s) => s.includes("*** Begin Patch"));
if (typeof patchTextMaybe === "string") {
const lines = patchTextMaybe.split("\n");
for (const line of lines) {
const m = line.match(/^[-+]{3} [ab]\/(.+)$/);
if (m && m[1]) {
filesSet.add(m[1]);
}
}
}
}
return cmd;
}
function processNonExecTool(
toolName: string,
argsJson: unknown,
filesSet: Set<string>,
): string {
let summary = toolName;
if (argsJson && typeof argsJson === "object") {
// Extract a few common argument keys to make the summary more useful
// without being overly verbose.
const interestingKeys = ["path", "file", "filepath", "filename", "pattern"];
for (const key of interestingKeys) {
const val = (argsJson as Record<string, unknown>)[key];
if (typeof val === "string") {
summary += ` ${val}`;
if (val.includes("/")) {
filesSet.add(val);
}
break;
}
}
}
return summary;
}
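// Example (illustrative tool name and args): a non-exec call such as
//   processNonExecTool("read_file", { path: "src/app.ts" }, filesSet)
// returns the summary "read_file src/app.ts" and, because the value
// contains a "/", also records "src/app.ts" in `filesSet`.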

View File

@@ -1,7 +1,7 @@
import TypeaheadOverlay from "./typeahead-overlay.js";
import {
getAvailableModels,
RECOMMENDED_MODELS,
RECOMMENDED_MODELS as _RECOMMENDED_MODELS,
} from "../utils/model-utils.js";
import { Box, Text, useInput } from "ink";
import React, { useEffect, useState } from "react";
@@ -16,39 +16,53 @@ import React, { useEffect, useState } from "react";
*/
type Props = {
currentModel: string;
currentProvider?: string;
hasLastResponse: boolean;
onSelect: (model: string) => void;
providers?: Record<string, { name: string; baseURL: string; envKey: string }>;
onSelect: (allModels: Array<string>, model: string) => void;
onSelectProvider?: (provider: string) => void;
onExit: () => void;
};
export default function ModelOverlay({
currentModel,
providers = {},
currentProvider = "openai",
hasLastResponse,
onSelect,
onSelectProvider,
onExit,
}: Props): JSX.Element {
const [items, setItems] = useState<Array<{ label: string; value: string }>>(
[],
);
const [providerItems, _setProviderItems] = useState<
Array<{ label: string; value: string }>
>(Object.values(providers).map((p) => ({ label: p.name, value: p.name })));
const [mode, setMode] = useState<"model" | "provider">("model");
const [isLoading, setIsLoading] = useState<boolean>(true);
// This effect will run when the provider changes to update the model list
useEffect(() => {
setIsLoading(true);
(async () => {
const models = await getAvailableModels();
// Split the list into recommended and “other” models.
const recommended = RECOMMENDED_MODELS.filter((m) => models.includes(m));
const others = models.filter((m) => !recommended.includes(m));
const ordered = [...recommended, ...others.sort()];
setItems(
ordered.map((m) => ({
label: recommended.includes(m) ? `${m}` : m,
value: m,
})),
);
try {
const models = await getAvailableModels(currentProvider);
// Convert the models to the format needed by TypeaheadOverlay
setItems(
models.map((m) => ({
label: m,
value: m,
})),
);
} catch (error) {
// Silently ignore errors while loading models; the overlay simply shows an empty list.
} finally {
setIsLoading(false);
}
})();
}, []);
}, [currentProvider]);
// ---------------------------------------------------------------------------
// If the conversation already contains a response we cannot change the model
@@ -58,10 +72,14 @@ export default function ModelOverlay({
// available action is to dismiss the overlay (Esc or Enter).
// ---------------------------------------------------------------------------
// Always register input handling so hooks are called consistently.
// Register input handling for switching between model and provider selection
useInput((_input, key) => {
if (hasLastResponse && (key.escape || key.return)) {
onExit();
} else if (!hasLastResponse) {
if (key.tab) {
setMode(mode === "model" ? "provider" : "model");
}
}
});
@@ -91,17 +109,56 @@ export default function ModelOverlay({
);
}
if (mode === "provider") {
return (
<TypeaheadOverlay
title="Select provider"
description={
<Box flexDirection="column">
<Text>
Current provider:{" "}
<Text color="greenBright">{currentProvider}</Text>
</Text>
<Text dimColor>press tab to switch to model selection</Text>
</Box>
}
initialItems={providerItems}
currentValue={currentProvider}
onSelect={(provider) => {
if (onSelectProvider) {
onSelectProvider(provider);
// Immediately switch to model selection so the user can pick a model for the new provider
setMode("model");
}
}}
onExit={onExit}
/>
);
}
return (
<TypeaheadOverlay
title="Switch model"
title="Select model"
description={
<Text>
Current model: <Text color="greenBright">{currentModel}</Text>
</Text>
<Box flexDirection="column">
<Text>
Current model: <Text color="greenBright">{currentModel}</Text>
</Text>
<Text>
Current provider: <Text color="greenBright">{currentProvider}</Text>
</Text>
{isLoading && <Text color="yellow">Loading models...</Text>}
<Text dimColor>press tab to switch to provider selection</Text>
</Box>
}
initialItems={items}
currentValue={currentModel}
onSelect={onSelect}
onSelect={(selectedModel) =>
onSelect(
items?.map((m) => m.value),
selectedModel,
)
}
onExit={onExit}
/>
);

View File

@@ -1,5 +1,5 @@
import Indicator, { type Props as IndicatorProps } from "./Indicator.js";
import ItemComponent, { type Props as ItemProps } from "./Item.js";
import Indicator, { type Props as IndicatorProps } from "./indicator.js";
import ItemComponent, { type Props as ItemProps } from "./item.js";
import isEqual from "fast-deep-equal";
import { Box, useInput } from "ink";
import React, {

View File

@@ -5,7 +5,13 @@ import type { FileOperation } from "../utils/singlepass/file_ops";
import Spinner from "./vendor/ink-spinner"; // Third-party / vendor components
import TextInput from "./vendor/ink-text-input";
import { OPENAI_TIMEOUT_MS, OPENAI_BASE_URL } from "../utils/config";
import {
OPENAI_TIMEOUT_MS,
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getBaseUrl,
getApiKey,
} from "../utils/config";
import {
generateDiffSummary,
generateEditSummary,
@@ -393,13 +399,23 @@ export function SinglePassApp({
files,
});
const headers: Record<string, string> = {};
if (OPENAI_ORGANIZATION) {
headers["OpenAI-Organization"] = OPENAI_ORGANIZATION;
}
if (OPENAI_PROJECT) {
headers["OpenAI-Project"] = OPENAI_PROJECT;
}
const openai = new OpenAI({
apiKey: config.apiKey ?? "",
baseURL: OPENAI_BASE_URL || undefined,
apiKey: getApiKey(config.provider),
baseURL: getBaseUrl(config.provider),
timeout: OPENAI_TIMEOUT_MS,
defaultHeaders: headers,
});
const chatResp = await openai.beta.chat.completions.parse({
model: config.model,
...(config.flexMode ? { service_tier: "flex" } : {}),
messages: [
{
role: "user",

View File

@@ -44,6 +44,11 @@ export type TextInputProps = {
* Function to call when `Enter` is pressed, where first argument is a value of the input.
*/
readonly onSubmit?: (value: string) => void;
/**
* Explicitly set the cursor position to the end of the text
*/
readonly cursorToEnd?: boolean;
};
function findPrevWordJump(prompt: string, cursorOffset: number) {
@@ -90,12 +95,22 @@ function TextInput({
showCursor = true,
onChange,
onSubmit,
cursorToEnd = false,
}: TextInputProps) {
const [state, setState] = useState({
cursorOffset: (originalValue || "").length,
cursorWidth: 0,
});
useEffect(() => {
if (cursorToEnd) {
setState((prev) => ({
...prev,
cursorOffset: (originalValue || "").length,
}));
}
}, [cursorToEnd, originalValue, focus]);
const { cursorOffset, cursorWidth } = state;
useEffect(() => {
@@ -153,6 +168,78 @@ function TextInput({
useInput(
(input, key) => {
// ────────────────────────────────────────────────────────────────
// Support Shift+Enter / Ctrl+Enter from terminals that have
// modifyOtherKeys enabled. Such terminals encode the keycombo in a
// CSI sequence rather than sending a bare "\r"/"\n". Ink passes the
// sequence through as raw text (without the initial ESC), so we need to
// detect and translate it before the generic character handler below
// treats it as literal input (e.g. "[27;2;13~"). We support both the
// modern *mode 2* (CSIu, ending in "u") and the legacy *mode 1*
// variant (ending in "~").
//
// - Shift+Enter → insert newline (same behaviour as Option+Enter)
// - Ctrl+Enter → submit the input (same as plain Enter)
//
// References: https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-Modify-Other-Keys
// ────────────────────────────────────────────────────────────────
function handleEncodedEnterSequence(raw: string): boolean {
// CSIu (modifyOtherKeys=2) → "[13;<mod>u"
let m = raw.match(/^\[([0-9]+);([0-9]+)u$/);
if (m && m[1] === "13") {
const mod = Number(m[2]);
const hasCtrl = Math.floor(mod / 4) % 2 === 1;
if (hasCtrl) {
if (onSubmit) {
onSubmit(originalValue);
}
} else {
const newValue =
originalValue.slice(0, cursorOffset) +
"\n" +
originalValue.slice(cursorOffset);
setState({
cursorOffset: cursorOffset + 1,
cursorWidth: 0,
});
onChange(newValue);
}
return true; // handled
}
// CSI~ (modifyOtherKeys=1) → "[27;<mod>;13~"
m = raw.match(/^\[27;([0-9]+);13~$/);
if (m) {
const mod = Number(m[1]);
const hasCtrl = Math.floor(mod / 4) % 2 === 1;
if (hasCtrl) {
if (onSubmit) {
onSubmit(originalValue);
}
} else {
const newValue =
originalValue.slice(0, cursorOffset) +
"\n" +
originalValue.slice(cursorOffset);
setState({
cursorOffset: cursorOffset + 1,
cursorWidth: 0,
});
onChange(newValue);
}
return true; // handled
}
return false; // not an encoded Enter sequence
}
if (handleEncodedEnterSequence(input)) {
return;
}
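// Worked decodings under the scheme above (sequences shown without the
// leading ESC, exactly as Ink delivers them):
//   "[13;2u"    → mod 2 (Shift)      → newline inserted at the caret
//   "[13;5u"    → mod 5 (Ctrl)       → input submitted
//   "[27;2;13~" → legacy Shift+Enter → newline inserted
//   "[27;5;13~" → legacy Ctrl+Enter  → input submitted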
if (
key.upArrow ||
key.downArrow ||

View File

@@ -1,4 +1,3 @@
// use-confirmation.ts
import type { ReviewDecision } from "../utils/agent/review";
import type React from "react";

codex-cli/src/shims-external.d.ts vendored Normal file
View File

@@ -0,0 +1,24 @@
// Ambient module declarations for optional/runtime-only dependencies so that
// `tsc --noEmit` succeeds without installing their full type definitions.
declare module "package-manager-detector" {
export type AgentName = "npm" | "pnpm" | "yarn" | "bun" | "deno";
/** Detects the package manager based on environment variables. */
export function getUserAgent(): AgentName | null | undefined;
}
declare module "fast-npm-meta" {
export interface LatestVersionMeta {
version: string;
}
export function getLatestVersion(
pkgName: string,
opts?: Record<string, unknown>,
): Promise<LatestVersionMeta | { error: unknown }>;
}
declare module "semver" {
export function gt(v1: string, v2: string): boolean;
}
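// With these ambient declarations in place, call sites type-check without
// installing the packages' own type definitions, e.g. (a sketch):
//
//   import { getUserAgent } from "package-manager-detector";
//   import { gt } from "semver";
//
//   const agent = getUserAgent() ?? "npm";
//   const hasUpdate = gt("2.0.0", "1.9.9"); // true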

View File

@@ -34,6 +34,10 @@ function clamp(v: number, min: number, max: number): number {
* ---------------------------------------------------------------------- */
function toCodePoints(str: string): Array<string> {
if (typeof Intl !== "undefined" && "Segmenter" in Intl) {
const seg = new Intl.Segmenter();
return [...seg.segment(str)].map((s) => s.segment);
}
// [...str] or Array.from both iterate by UTF-32 code point, handling
// surrogate pairs correctly.
return Array.from(str);
@@ -103,88 +107,6 @@ export default class TextBuffer {
}
}
/* =====================================================================
 * External editor integration (git-style $EDITOR workflow)
* =================================================================== */
/**
 * Opens the current buffer contents in the user's preferred terminal text
 * editor ($VISUAL or $EDITOR, falling back to "vi"). The method blocks
 * until the editor exits, then reloads the file and replaces the in-memory
* buffer with whatever the user saved.
*
 * The operation is treated as a single undoable edit – we snapshot the
* previous state *once* before launching the editor so one `undo()` will
* revert the entire change set.
*
* Note: We purposefully rely on the *synchronous* spawn API so that the
* calling process genuinely waits for the editor to close before
 * continuing. This mirrors Git's behaviour and simplifies downstream
 * control-flow (callers can simply `await` the Promise).
*/
async openInExternalEditor(opts: { editor?: string } = {}): Promise<void> {
// Deliberately use `require()` so that unit tests can stub the
// respective modules with `vi.spyOn(require("node:child_process"), …)`.
// Dynamic `import()` would circumvent those CommonJS stubs.
// eslint-disable-next-line @typescript-eslint/no-var-requires
const pathMod = require("node:path");
// eslint-disable-next-line @typescript-eslint/no-var-requires
const fs = require("node:fs");
// eslint-disable-next-line @typescript-eslint/no-var-requires
const os = require("node:os");
// eslint-disable-next-line @typescript-eslint/no-var-requires
const { spawnSync } = require("node:child_process");
const editor =
opts.editor ??
process.env["VISUAL"] ??
process.env["EDITOR"] ??
(process.platform === "win32" ? "notepad" : "vi");
// Prepare a temporary file with the current contents. We use mkdtempSync
// to obtain an isolated directory and avoid name collisions.
const tmpDir = fs.mkdtempSync(pathMod.join(os.tmpdir(), "codex-edit-"));
const filePath = pathMod.join(tmpDir, "buffer.txt");
fs.writeFileSync(filePath, this.getText(), "utf8");
// One snapshot for undo semantics *before* we mutate anything.
this.pushUndo();
// The child inherits stdio so the user can interact with the editor as if
// they had launched it directly.
const { status, error } = spawnSync(editor, [filePath], {
stdio: "inherit",
});
if (error) {
throw error;
}
if (typeof status === "number" && status !== 0) {
throw new Error(`External editor exited with status ${status}`);
}
// Read the edited contents back in – normalise line endings to \n.
let newText = fs.readFileSync(filePath, "utf8");
newText = newText.replace(/\r\n?/g, "\n");
// Update buffer.
this.lines = newText.split("\n");
if (this.lines.length === 0) {
this.lines = [""];
}
// Position the caret at EOF.
this.cursorRow = this.lines.length - 1;
this.cursorCol = cpLen(this.line(this.cursorRow));
// Reset scroll offsets so the new end is visible.
this.scrollRow = Math.max(0, this.cursorRow - 1);
this.scrollCol = 0;
this.version++;
}
/* =======================================================================
* Geometry helpers
* ===================================================================== */
@@ -415,6 +337,58 @@ export default class TextBuffer {
});
}
/**
* Delete everything from the caret to the *end* of the current line. The
* caret itself stays in place (column remains unchanged). Mirrors the
* common Ctrl+K shortcut in many shells and editors.
*/
deleteToLineEnd(): void {
dbg("deleteToLineEnd", { beforeCursor: this.getCursor() });
const line = this.line(this.cursorRow);
if (this.cursorCol >= this.lineLen(this.cursorRow)) {
// Nothing to delete – caret already at EOL.
return;
}
this.pushUndo();
// Keep the prefix before the caret, discard the remainder.
this.lines[this.cursorRow] = cpSlice(line, 0, this.cursorCol);
this.version++;
dbg("deleteToLineEnd:after", {
cursor: this.getCursor(),
line: this.line(this.cursorRow),
});
}
/**
* Delete everything from the *start* of the current line up to (but not
* including) the caret. The caret is moved to column-0, mirroring the
* behaviour of the familiar Ctrl+U binding.
*/
deleteToLineStart(): void {
dbg("deleteToLineStart", { beforeCursor: this.getCursor() });
if (this.cursorCol === 0) {
// Nothing to delete – caret already at SOL.
return;
}
this.pushUndo();
const line = this.line(this.cursorRow);
this.lines[this.cursorRow] = cpSlice(line, this.cursorCol);
this.cursorCol = 0;
this.version++;
dbg("deleteToLineStart:after", {
cursor: this.getCursor(),
line: this.line(this.cursorRow),
});
}
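/* Worked example (assumed buffer state): with the single line "hello world"
 * and the caret at column 5 (just after "hello"):
 *   deleteToLineEnd()   → line becomes "hello", caret stays at column 5
 *   deleteToLineStart() → line becomes " world", caret moves to column 0
 */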
/* ------------------------------------------------------------------
 * Word-wise deletion helpers exposed publicly so tests (and future
* keybindings) can invoke them directly.
@@ -423,7 +397,7 @@ export default class TextBuffer {
/** Delete the word to the *left* of the caret, mirroring common
* Ctrl/Alt+Backspace behaviour in editors & terminals. Both the adjacent
* whitespace *and* the word characters immediately preceding the caret are
 * removed. If the caret is already at column-0 this becomes a noop. */
 * removed. If the caret is already at column-0 this becomes a no-op. */
deleteWordLeft(): void {
dbg("deleteWordLeft", { beforeCursor: this.getCursor() });
@@ -636,6 +610,24 @@ export default class TextBuffer {
}
}
/* ------------------------------------------------------------------
* Document-level navigation helpers
* ---------------------------------------------------------------- */
/** Move caret to *absolute* beginning of the buffer (row-0, col-0). */
private moveToStartOfDocument(): void {
this.preferredCol = null;
this.cursorRow = 0;
this.cursorCol = 0;
}
/** Move caret to *absolute* end of the buffer (last row, last column). */
private moveToEndOfDocument(): void {
this.preferredCol = null;
this.cursorRow = this.lines.length - 1;
this.cursorCol = this.lineLen(this.cursorRow);
}
/* =====================================================================
 * Higher-level helpers
* =================================================================== */
@@ -710,7 +702,7 @@ export default class TextBuffer {
}
endSelection(): void {
// noop for now, kept for API symmetry
// no-op for now, kept for API symmetry
// we rely on anchor + current cursor to compute selection
}
@@ -787,7 +779,6 @@ export default class TextBuffer {
!key["ctrl"] &&
!key["alt"]
) {
/* navigation */
this.move("left");
} else if (
key["rightArrow"] &&
@@ -807,12 +798,26 @@ export default class TextBuffer {
key["rightArrow"]
) {
this.move("wordRight");
}
// Many terminal/OS combinations (e.g. macOS Terminal.app & iTerm2 with
// the default key-bindings) translate ⌥← / ⌥→ into the classic readline
// shortcuts ESC-b / ESC-f rather than an ANSI arrow sequence that Ink
// would tag with `leftArrow` / `rightArrow`. Ink parses those 2-byte
// escape sequences into `input === "b"|"f"` with `key.meta === true`.
// Handle this variant explicitly so that Option+Arrow performs word
// navigation consistently across environments.
else if (key["meta"] && (input === "b" || input === "B")) {
this.move("wordLeft");
} else if (key["meta"] && (input === "f" || input === "F")) {
this.move("wordRight");
} else if (key["home"]) {
this.move("home");
} else if (key["end"]) {
this.move("end");
}
/* delete */
// Deletions
//
// In raw terminal mode many frameworks (Ink included) surface a physical
// Backspace keypress as the single DEL (0x7f) byte placed in `input` with
// no `key.backspace` flag set. Treat that byte exactly like an ordinary
@@ -835,22 +840,47 @@ export default class TextBuffer {
// forward deletion so we don't lose that capability on keyboards that
// expose both behaviours.
this.backspace();
}
// Forward deletion (Fn+Delete on macOS, or Delete key with Shift held after
// the branch above) remove the character *under / to the right* of the
// caret, merging lines when at EOL similar to many editors.
else if (key["delete"]) {
} else if (key["delete"]) {
// Forward deletion (Fn+Delete on macOS, or Delete key with Shift held after
// the branch above) remove the character *under / to the right* of the
// caret, merging lines when at EOL similar to many editors.
this.del();
} else if (input && !key["ctrl"] && !key["meta"]) {
}
// Normal input
else if (input && !key["ctrl"] && !key["meta"]) {
this.insert(input);
}
/* printable */
// Emacs/readline-style shortcuts
else if (key["ctrl"] && (input === "a" || input === "\x01")) {
// Ctrl+A → start of input (first row, first column)
this.moveToStartOfDocument();
} else if (key["ctrl"] && (input === "e" || input === "\x05")) {
// Ctrl+E → end of input (last row, last column)
this.moveToEndOfDocument();
} else if (key["ctrl"] && (input === "b" || input === "\x02")) {
// Ctrl+B → char left
this.move("left");
} else if (key["ctrl"] && (input === "f" || input === "\x06")) {
// Ctrl+F → char right
this.move("right");
} else if (key["ctrl"] && (input === "d" || input === "\x04")) {
// Ctrl+D → forward delete
this.del();
} else if (key["ctrl"] && (input === "k" || input === "\x0b")) {
// Ctrl+K → kill to EOL
this.deleteToLineEnd();
} else if (key["ctrl"] && (input === "u" || input === "\x15")) {
// Ctrl+U → kill to SOL
this.deleteToLineStart();
} else if (key["ctrl"] && (input === "w" || input === "\x17")) {
// Ctrl+W → delete word left
this.deleteWordLeft();
}
/* clamp + scroll */
/* printable, clamp + scroll */
this.ensureCursorInRange();
this.ensureCursorVisible(vp);
const cursorMoved =
this.cursorRow !== beforeRow || this.cursorCol !== beforeCol;

View File

@@ -1,18 +1,26 @@
import type { ReviewDecision } from "./review.js";
import type { ApplyPatchCommand, ApprovalPolicy } from "../../approvals.js";
import type { AppConfig } from "../config.js";
import type { UsageBreakdown } from "../estimate-cost.js";
import type { ResponseEvent } from "../responses.js";
import type {
ResponseFunctionToolCall,
ResponseInputItem,
ResponseItem,
ResponseCreateParams,
FunctionTool,
} from "openai/resources/responses/responses.mjs";
import type { Reasoning } from "openai/resources.mjs";
import { log, isLoggingEnabled } from "./log.js";
import { OPENAI_BASE_URL, OPENAI_TIMEOUT_MS } from "../config.js";
import {
OPENAI_TIMEOUT_MS,
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getApiKey,
getBaseUrl,
} from "../config.js";
import { log } from "../logger/log.js";
import { parseToolCallArguments } from "../parsers.js";
import { ensureSessionTracker } from "../session-cost.js";
import { responsesCreateViaChatCompletions } from "../responses.js";
import {
ORIGIN,
CLI_VERSION,
@@ -26,7 +34,7 @@ import OpenAI, { APIConnectionTimeoutError } from "openai";
// Wait time before retrying after rate limit errors (ms).
const RATE_LIMIT_RETRY_WAIT_MS = parseInt(
process.env["OPENAI_RATE_LIMIT_RETRY_WAIT_MS"] || "2500",
process.env["OPENAI_RATE_LIMIT_RETRY_WAIT_MS"] || "500",
10,
);
@@ -38,12 +46,22 @@ export type CommandConfirmation = {
};
const alreadyProcessedResponses = new Set();
const alreadyStagedItemIds = new Set<string>();
type AgentLoopParams = {
model: string;
provider?: string;
config?: AppConfig;
instructions?: string;
approvalPolicy: ApprovalPolicy;
/**
 * When `true`, model responses are *not* stored on the server side, so
 * `previous_response_id` can no longer supply conversational context.
 * Defaults to `false` (storage enabled) to preserve the current behaviour.
 * When enabled, the agent instead sends the *full* conversation context as
 * the `input` payload on every request and omits the
 * `previous_response_id` parameter.
*/
disableResponseStorage?: boolean;
onItem: (item: ResponseItem) => void;
onLoading: (loading: boolean) => void;
@@ -58,19 +76,39 @@ type AgentLoopParams = {
onLastResponseId: (lastResponseId: string) => void;
};
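// Request shape implied by the flag (paraphrasing the doc comment above):
//   disableResponseStorage: false → input: [new items only],
//                                   previous_response_id: <last id>
//   disableResponseStorage: true  → input: [entire local transcript],
//                                   no previous_response_id sent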
type Usage = {
total_tokens?: number;
input_tokens?: number;
output_tokens?: number;
const shellTool: FunctionTool = {
type: "function",
name: "shell",
description: "Runs a shell command, and returns its output.",
strict: false,
parameters: {
type: "object",
properties: {
command: { type: "array", items: { type: "string" } },
workdir: {
type: "string",
description: "The working directory for the command.",
},
timeout: {
type: "number",
description:
"The maximum time to wait for the command to complete in milliseconds.",
},
},
required: ["command"],
additionalProperties: false,
},
};
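// Example of the arguments the model emits for this tool (a sketch; the
// payload is model-generated JSON validated against the schema above):
//   { "command": ["ls", "-la"], "workdir": "/repo", "timeout": 10000 }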
type MaybeUsageEvent = { response?: { usage?: Usage } };
export class AgentLoop {
private model: string;
private provider: string;
private instructions?: string;
private approvalPolicy: ApprovalPolicy;
private config: AppConfig;
private additionalWritableRoots: ReadonlyArray<string>;
/** Whether we ask the API to persist conversation state on the server */
private readonly disableResponseStorage: boolean;
// Using `InstanceType<typeof OpenAI>` sidesteps typing issues with the OpenAI package under
// the TS 5+ `moduleResolution=bundler` setup. OpenAI client instance. We keep the concrete
@@ -101,6 +139,13 @@ export class AgentLoop {
private execAbortController: AbortController | null = null;
/** Set to true when `cancel()` is called so `run()` can exit early. */
private canceled = false;
/**
* Local conversation transcript used when `disableResponseStorage === true`. Holds
* all nonsystem items exchanged so far so we can provide full context on
* every request.
*/
private transcript: Array<ResponseInputItem> = [];
/** Function calls that were emitted by the model but never answered because
* the user cancelled the run. We keep the `call_id`s around so the *next*
* request can send a dummy `function_call_output` that satisfies the
@@ -125,15 +170,13 @@ export class AgentLoop {
// Reset the current stream to allow new requests
this.currentStream = null;
if (isLoggingEnabled()) {
log(
`AgentLoop.cancel() invoked – currentStream=${Boolean(
this.currentStream,
)} execAbortController=${Boolean(
this.execAbortController,
)} generation=${this.generation}`,
);
}
log(
`AgentLoop.cancel() invoked – currentStream=${Boolean(
this.currentStream,
)} execAbortController=${Boolean(this.execAbortController)} generation=${
this.generation
}`,
);
(
this.currentStream as { controller?: { abort?: () => void } } | null
)?.controller?.abort?.();
@@ -145,9 +188,7 @@ export class AgentLoop {
// Create a new abort controller for future tool calls
this.execAbortController = new AbortController();
if (isLoggingEnabled()) {
log("AgentLoop.cancel(): execAbortController.abort() called");
}
log("AgentLoop.cancel(): execAbortController.abort() called");
// NOTE: We intentionally do *not* clear `lastResponseId` here. If the
// stream produced a `function_call` before the user cancelled, OpenAI now
@@ -183,9 +224,7 @@ export class AgentLoop {
// this.onItem(cancelNotice);
this.generation += 1;
if (isLoggingEnabled()) {
log(`AgentLoop.cancel(): generation bumped to ${this.generation}`);
}
log(`AgentLoop.cancel(): generation bumped to ${this.generation}`);
}
/**
@@ -213,8 +252,10 @@ export class AgentLoop {
// private cumulativeThinkingMs = 0;
constructor({
model,
provider = "openai",
instructions,
approvalPolicy,
disableResponseStorage,
// `config` used to be required. Some unittests (and potentially other
// callers) instantiate `AgentLoop` without passing it, so we make it
// optional and fall back to sensible defaults. This keeps the public
@@ -229,6 +270,7 @@ export class AgentLoop {
additionalWritableRoots,
}: AgentLoopParams & { config?: AppConfig }) {
this.model = model;
this.provider = provider;
this.instructions = instructions;
this.approvalPolicy = approvalPolicy;
@@ -237,32 +279,23 @@ export class AgentLoop {
// defined object. We purposefully copy over the `model` and
// `instructions` that have already been passed explicitly so that
// downstream consumers (e.g. telemetry) still observe the correct values.
this.config =
config ??
({
model,
instructions: instructions ?? "",
} as AppConfig);
this.additionalWritableRoots = additionalWritableRoots;
// Capture usage for cost-tracking before delegating to the caller-supplied
// callback. Wrapping here avoids repeating the bookkeeping logic across
// every UI surface.
this.onItem = (item: ResponseItem) => {
try {
ensureSessionTracker(this.model).addItems([item]);
} catch {
/* best-effort; never block user-visible updates */
}
onItem(item);
this.config = config ?? {
model,
instructions: instructions ?? "",
};
this.additionalWritableRoots = additionalWritableRoots;
this.onItem = onItem;
this.onLoading = onLoading;
this.getCommandConfirmation = getCommandConfirmation;
this.onLastResponseId = onLastResponseId;
this.disableResponseStorage = disableResponseStorage ?? false;
this.sessionId = getSessionId() || randomUUID().replaceAll("-", "");
// Configure OpenAI client with optional timeout (ms) from environment
const timeoutMs = OPENAI_TIMEOUT_MS;
const apiKey = this.config.apiKey ?? process.env["OPENAI_API_KEY"] ?? "";
const apiKey = getApiKey(this.provider);
const baseURL = getBaseUrl(this.provider);
this.oai = new OpenAI({
// The OpenAI JS SDK only requires `apiKey` when making requests against
// the official API. When running unittests we stub out all network
@@ -271,11 +304,15 @@ export class AgentLoop {
// errors inside the SDK (it validates that `apiKey` is a nonempty
// string when the field is present).
...(apiKey ? { apiKey } : {}),
baseURL: OPENAI_BASE_URL,
baseURL,
defaultHeaders: {
originator: ORIGIN,
version: CLI_VERSION,
session_id: this.sessionId,
...(OPENAI_ORGANIZATION
? { "OpenAI-Organization": OPENAI_ORGANIZATION }
: {}),
...(OPENAI_PROJECT ? { "OpenAI-Project": OPENAI_PROJECT } : {}),
},
...(timeoutMs !== undefined ? { timeout: timeoutMs } : {}),
});
@@ -335,13 +372,11 @@ export class AgentLoop {
const callId: string = (item as any).call_id ?? (item as any).id;
const args = parseToolCallArguments(rawArguments ?? "{}");
if (isLoggingEnabled()) {
log(
`handleFunctionCall(): name=${
name ?? "undefined"
} callId=${callId} args=${rawArguments}`,
);
}
log(
`handleFunctionCall(): name=${
name ?? "undefined"
} callId=${callId} args=${rawArguments}`,
);
if (args == null) {
const outputItem: ResponseInputItem.FunctionCallOutput = {
@@ -427,11 +462,9 @@ export class AgentLoop {
// Create a fresh AbortController for this run so that tool calls from a
// previous run do not accidentally get signalled.
this.execAbortController = new AbortController();
if (isLoggingEnabled()) {
log(
`AgentLoop.run(): new execAbortController created (${this.execAbortController.signal}) for generation ${this.generation}`,
);
}
log(
`AgentLoop.run(): new execAbortController created (${this.execAbortController.signal}) for generation ${this.generation}`,
);
// NOTE: We no longer (re)attach an `abort` listener to `hardAbort` here.
// A single listener that forwards the `abort` to the current
// `execAbortController` is installed once in the constructor. Readding a
@@ -439,7 +472,17 @@ export class AgentLoop {
// accumulate listeners which in turn triggered Node's
// `MaxListenersExceededWarning` after ten invocations.
let lastResponseId: string = previousResponseId;
// Track the response ID from the last *stored* response so we can use
// `previous_response_id` when `disableResponseStorage` is enabled. When storage
// is disabled we deliberately ignore the caller-supplied value because
// the backend will not retain any state that could be referenced.
// If the backend stores conversation state (`disableResponseStorage === false`) we
// forward the caller-supplied `previousResponseId` so that the model sees the
// full context. When storage is disabled we *must not* send any ID because the
// server no longer retains the referenced response.
let lastResponseId: string = this.disableResponseStorage
? ""
: previousResponseId;
// If there are unresolved function calls from a previously cancelled run
// we have to emit dummy tool outputs so that the API no longer expects
@@ -461,7 +504,55 @@ export class AgentLoop {
this.pendingAborts.clear();
}
let turnInput = [...abortOutputs, ...input];
// Build the input list for this turn. When responses are stored on the
// server we can simply send the *delta* (the new user input as well as
// any pending abort outputs) and rely on `previous_response_id` for
// context. When storage is disabled the server has no memory of the
// conversation, so we must include the *entire* transcript (minus system
// messages) on every call.
let turnInput: Array<ResponseInputItem> = [];
// Keeps track of how many items in `turnInput` stem from the existing
// transcript so we can avoid reemitting them to the UI. Only used when
// `disableResponseStorage === true`.
let transcriptPrefixLen = 0;
const stripInternalFields = (
item: ResponseInputItem,
): ResponseInputItem => {
// Clone shallowly and remove fields that are not part of the public
// schema expected by the OpenAI Responses API.
// We shallow-clone the item so that subsequent mutations (deleting
// internal fields) do not affect the original object which may still
// be referenced elsewhere (e.g. UI components).
const clean = { ...item } as Record<string, unknown>;
delete clean["duration_ms"];
// Remove OpenAI-assigned identifiers and transient status so the
// backend does not reject items that were never persisted because we
// use `store: false`.
delete clean["id"];
delete clean["status"];
return clean as unknown as ResponseInputItem;
};
if (this.disableResponseStorage) {
// Remember where the existing transcript ends; everything after this
// index in the upcoming `turnInput` list will be *new* for this turn
// and therefore needs to be surfaced to the UI.
transcriptPrefixLen = this.transcript.length;
// Ensure the transcript is up-to-date with the latest user input so
// that subsequent iterations see a complete history.
// `turnInput` is still empty at this point (it will be filled later).
// We need to look at the *input* items the user just supplied.
this.transcript.push(...filterToApiMessages(input));
turnInput = [...this.transcript, ...abortOutputs].map(
stripInternalFields,
);
} else {
turnInput = [...abortOutputs, ...input].map(stripInternalFields);
}
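Concretely, the two branches yield differently shaped requests. A minimal sketch of the idea, with strings standing in for real `ResponseInputItem` values:

```typescript
// Stand-ins only: the real code works with ResponseInputItem objects.
const transcript = ["user: hi", "assistant: hello"]; // accumulated prior turns
const newInput = ["user: run the tests"];
const abortOutputs: Array<string> = []; // dummy outputs for cancelled calls

function buildTurnInput(disableResponseStorage: boolean): Array<string> {
  if (disableResponseStorage) {
    // Stateless server: resend the whole transcript plus anything new.
    return [...transcript, ...newInput, ...abortOutputs];
  }
  // Stateful server: send only the delta; `previous_response_id` supplies context.
  return [...abortOutputs, ...newInput];
}

console.log(buildTurnInput(true).length); // 3 -- full conversation resent
console.log(buildTurnInput(false).length); // 1 -- just the new user input
```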
this.onLoading(true);
@@ -472,17 +563,27 @@ export class AgentLoop {
return;
}
// Skip items we've already processed to avoid staging duplicates
if (item.id && alreadyStagedItemIds.has(item.id)) {
return;
}
alreadyStagedItemIds.add(item.id);
// Store the item so the final flush can still operate on a complete list.
// We'll nil out entries once they're delivered.
const idx = staged.push(item) - 1;
// Instead of emitting synchronously we schedule a short-delay delivery.
//
// This accomplishes two things:
// 1. The UI still sees new messages almost immediately, creating the
// perception of real-time updates.
// 2. If the user calls `cancel()` in the small window right after the
// item was staged we can still abort the delivery because the
// generation counter will have been bumped by `cancel()`.
//
// Use a minimal 3ms delay for terminal rendering to maintain readable
// streaming.
setTimeout(() => {
if (
thisGeneration === this.generation &&
@@ -492,8 +593,54 @@ export class AgentLoop {
this.onItem(item);
// Mark as delivered so flush won't re-emit it
staged[idx] = undefined;
// Handle transcript updates to maintain consistency. When we
// operate without serverside storage we keep our own transcript
// so we can provide full context on subsequent calls.
if (this.disableResponseStorage) {
// Exclude system messages from transcript as they do not form
// part of the assistant/user dialogue that the model needs.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const role = (item as any).role;
if (role !== "system") {
// Clone the item to avoid mutating the object that is also
// rendered in the UI. We need to strip auxiliary metadata
// such as `duration_ms` which is not part of the Responses
// API schema and therefore causes a 400 error when included
// in subsequent requests whose context is sent verbatim.
// Skip items that we have already inserted earlier or that the
// model does not need to see again in the next turn.
// • function_call: superseded by the forthcoming
// function_call_output.
// • reasoning: internal only, never sent back.
// • user messages: we added these to the transcript when
// building the first turnInput; stageItem would add a
// duplicate.
if (
(item as ResponseInputItem).type === "function_call" ||
(item as ResponseInputItem).type === "reasoning" ||
((item as ResponseInputItem).type === "message" &&
// eslint-disable-next-line @typescript-eslint/no-explicit-any
(item as any).role === "user")
) {
return;
}
const clone: ResponseInputItem = {
...(item as unknown as ResponseInputItem),
} as ResponseInputItem;
// The `duration_ms` field is only added to reasoning items to
// show elapsed time in the UI. It must not be forwarded back
// to the server.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
delete (clone as any).duration_ms;
this.transcript.push(clone);
}
}
}
}, 10);
}, 3); // Small 3ms delay for readable streaming.
};
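The staging pattern above (delay delivery a few milliseconds, then re-check a generation counter) reduces to a small standalone sketch:

```typescript
// Minimal sketch of the delayed-delivery / generation-counter pattern.
class StagedEmitter<T> {
  private generation = 0;
  constructor(private readonly onItem: (item: T) => void) {}

  cancel(): void {
    // Invalidates every item staged before this point.
    this.generation += 1;
  }

  stage(item: T): void {
    const thisGeneration = this.generation;
    setTimeout(() => {
      // Deliver only if no cancel() landed inside the 3 ms window.
      if (thisGeneration === this.generation) {
        this.onItem(item);
      }
    }, 3);
  }
}

const emitter = new StagedEmitter<string>((s) => console.log(s));
emitter.stage("dropped"); // staged under generation 0...
emitter.cancel(); // ...which this bump invalidates before the timer fires
emitter.stage("delivered"); // staged under generation 1 and printed
```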
while (turnInput.length > 0) {
@@ -502,66 +649,77 @@ export class AgentLoop {
return;
}
// send request to openAI
for (const item of turnInput) {
// Only surface the *new* input items to the UI; replaying the entire
// transcript would duplicate messages that have already been shown in
// earlier turns.
// `turnInput` holds the *new* items that will be sent to the API in
// this iteration. Surface exactly these to the UI so that we do not
// re-emit messages from previous turns (which would duplicate user
// prompts) and so that freshly generated `function_call_output`s are
// shown immediately.
// Figure out what subset of `turnInput` constitutes *new* information
// for the UI so that we don't spam the interface with repeats of the
// entire transcript on every iteration when response storage is
// disabled.
const deltaInput = this.disableResponseStorage
? turnInput.slice(transcriptPrefixLen)
: [...turnInput];
for (const item of deltaInput) {
stageItem(item as ResponseItem);
}
// Send request to OpenAI with retry on timeout
// Send request to OpenAI with retry on timeout.
let stream;
// Retry loop for transient errors. Up to MAX_RETRIES attempts.
const MAX_RETRIES = 5;
const MAX_RETRIES = 8;
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
try {
let reasoning: Reasoning | undefined;
if (this.model.startsWith("o")) {
reasoning = { effort: "high" };
reasoning = { effort: this.config.reasoningEffort ?? "high" };
if (this.model === "o3" || this.model === "o4-mini") {
// @ts-expect-error waiting for API type update
reasoning.summary = "auto";
}
}
const mergedInstructions = [prefix, this.instructions]
.filter(Boolean)
.join("\n");
if (isLoggingEnabled()) {
log(
`instructions (length ${mergedInstructions.length}): ${mergedInstructions}`,
);
}
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
responsesCreateViaChatCompletions(
this.oai,
params as ResponseCreateParams & { stream: true },
);
log(
`instructions (length ${mergedInstructions.length}): ${mergedInstructions}`,
);
// eslint-disable-next-line no-await-in-loop
stream = await this.oai.responses.create({
stream = await responseCall({
model: this.model,
instructions: mergedInstructions,
previous_response_id: lastResponseId || undefined,
input: turnInput,
stream: true,
parallel_tool_calls: false,
reasoning,
tools: [
{
type: "function",
name: "shell",
description: "Runs a shell command, and returns its output.",
strict: false,
parameters: {
type: "object",
properties: {
command: { type: "array", items: { type: "string" } },
workdir: {
type: "string",
description: "The working directory for the command.",
},
timeout: {
type: "number",
description:
"The maximum time to wait for the command to complete in milliseconds.",
},
},
required: ["command"],
additionalProperties: false,
},
},
],
...(this.config.flexMode ? { service_tier: "flex" } : {}),
...(this.disableResponseStorage
? { store: false }
: {
store: true,
previous_response_id: lastResponseId || undefined,
}),
tools: [shellTool],
// Explicitly tell the model it is allowed to pick whatever
// tool it deems appropriate. Omitting this sometimes leads to
// the model ignoring the available tools and responding with
// plain text instead (resulting in a missing tool-call).
tool_choice: "auto",
});
break;
} catch (error) {
@@ -723,7 +881,6 @@ export class AgentLoop {
throw error;
}
}
turnInput = []; // clear turn input, prepare for function call results
// If the user requested cancellation while we were awaiting the network
// request, abort immediately before we start handling the stream.
@@ -743,93 +900,255 @@ export class AgentLoop {
// Keep track of the active stream so it can be aborted on demand.
this.currentStream = stream;
// guard against an undefined stream before iterating
// Guard against an undefined stream before iterating.
if (!stream) {
this.onLoading(false);
log("AgentLoop.run(): stream is undefined");
return;
}
try {
// eslint-disable-next-line no-await-in-loop
for await (const event of stream) {
if (isLoggingEnabled()) {
const MAX_STREAM_RETRIES = 5;
let streamRetryAttempt = 0;
// eslint-disable-next-line no-constant-condition
while (true) {
try {
let newTurnInput: Array<ResponseInputItem> = [];
// eslint-disable-next-line no-await-in-loop
for await (const event of stream as AsyncIterable<ResponseEvent>) {
log(`AgentLoop.run(): response event ${event.type}`);
}
// process and surface each item (noop until we can depend on streaming events)
if (event.type === "response.output_item.done") {
const item = event.item;
// 1) if it's a reasoning item, annotate it
type ReasoningItem = { type?: string; duration_ms?: number };
const maybeReasoning = item as ReasoningItem;
if (maybeReasoning.type === "reasoning") {
maybeReasoning.duration_ms = Date.now() - thinkingStart;
}
if (item.type === "function_call") {
// Track outstanding tool call so we can abort later if needed.
// The item comes from the streaming response, therefore it has
// either `id` (chat) or `call_id` (responses); we normalise
// by reading both.
const callId =
(item as { call_id?: string; id?: string }).call_id ??
(item as { id?: string }).id;
if (callId) {
this.pendingAborts.add(callId);
// process and surface each item (no-op until we can depend on streaming events)
if (event.type === "response.output_item.done") {
const item = event.item;
// 1) if it's a reasoning item, annotate it
type ReasoningItem = { type?: string; duration_ms?: number };
const maybeReasoning = item as ReasoningItem;
if (maybeReasoning.type === "reasoning") {
maybeReasoning.duration_ms = Date.now() - thinkingStart;
}
} else {
stageItem(item as ResponseItem);
}
}
if (event.type === "response.completed") {
if (thisGeneration === this.generation && !this.canceled) {
for (const item of event.response.output) {
if (item.type === "function_call") {
// Track outstanding tool call so we can abort later if needed.
// The item comes from the streaming response, therefore it has
// either `id` (chat) or `call_id` (responses); we normalise
// by reading both.
const callId =
(item as { call_id?: string; id?: string }).call_id ??
(item as { id?: string }).id;
if (callId) {
this.pendingAborts.add(callId);
}
} else {
stageItem(item as ResponseItem);
}
}
if (event.response.status === "completed") {
// TODO: remove this once we can depend on streaming events
const newTurnInput = await this.processEventsWithoutStreaming(
event.response.output,
stageItem,
);
turnInput = newTurnInput;
}
lastResponseId = event.response.id;
this.onLastResponseId(event.response.id);
// Capture exact token usage for cost tracking when provided by
// the API. `responses.completed` events include a `usage` field
// with {input_tokens, output_tokens, total_tokens}. We record
// the total (or fallback to summing the parts if needed).
try {
const usage = (event as MaybeUsageEvent).response?.usage;
if (usage && typeof usage === "object") {
ensureSessionTracker(this.model).addUsage(
usage as unknown as UsageBreakdown,
);
if (event.type === "response.completed") {
if (thisGeneration === this.generation && !this.canceled) {
for (const item of event.response.output) {
stageItem(item as ResponseItem);
}
}
} catch {
/* besteffort only */
if (
event.response.status === "completed" ||
(event.response.status as unknown as string) ===
"requires_action"
) {
// TODO: remove this once we can depend on streaming events
newTurnInput = await this.processEventsWithoutStreaming(
event.response.output,
stageItem,
);
// When we do not use server-side storage we maintain our
// own transcript so that *future* turns still contain full
// conversational context. However, whether we advance to
// another loop iteration should depend solely on the
// presence of *new* input items (i.e. items that were not
// part of the previous request). Resending the transcript
// by itself would create an infinite request loop because
// `turnInput.length` would never reach zero.
if (this.disableResponseStorage) {
// 1) Append the freshly emitted output to our local
// transcript (minus non-message items the model does
// not need to see again).
const cleaned = filterToApiMessages(
event.response.output.map(stripInternalFields),
);
this.transcript.push(...cleaned);
// 2) Determine the *delta* (newTurnInput) that must be
// sent in the next iteration. If there is none we can
// safely terminate the loop; the transcript alone
// does not constitute new information for the
// assistant to act upon.
const delta = filterToApiMessages(
newTurnInput.map(stripInternalFields),
);
if (delta.length === 0) {
// No new input => end conversation.
newTurnInput = [];
} else {
// Resend full transcript *plus* the new delta so the
// stateless backend receives complete context.
newTurnInput = [...this.transcript, ...delta];
// The prefix ends at the current transcript length;
// everything after this index is new for the next
// iteration.
transcriptPrefixLen = this.transcript.length;
}
}
}
lastResponseId = event.response.id;
this.onLastResponseId(event.response.id);
}
}
}
} catch (err: unknown) {
// Gracefully handle an abort triggered via `cancel()` so that the
// consumer does not see an unhandled exception.
if (err instanceof Error && err.name === "AbortError") {
if (!this.canceled) {
// It was aborted for some other reason; surface the error.
throw err;
// Set after we have consumed all stream events in case the stream wasn't
// complete or we missed events for whatever reason. That way, we will set
// the next turn to an empty array to prevent an infinite loop.
// And don't update the turn input too early otherwise we won't have the
// current turn inputs available for retries.
turnInput = newTurnInput;
// Stream finished successfully; leave the retry loop.
break;
} catch (err: unknown) {
const isRateLimitError = (e: unknown): boolean => {
if (!e || typeof e !== "object") {
return false;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const ex: any = e;
return (
ex.status === 429 ||
ex.code === "rate_limit_exceeded" ||
ex.type === "rate_limit_exceeded"
);
};
if (
isRateLimitError(err) &&
streamRetryAttempt < MAX_STREAM_RETRIES
) {
streamRetryAttempt += 1;
const waitMs =
RATE_LIMIT_RETRY_WAIT_MS * 2 ** (streamRetryAttempt - 1);
log(
`OpenAI stream rate-limited; retry ${streamRetryAttempt}/${MAX_STREAM_RETRIES} in ${waitMs} ms`,
);
// Give the server a breather before retrying.
// eslint-disable-next-line no-await-in-loop
await new Promise((res) => setTimeout(res, waitMs));
// Recreate the stream with the *same* parameters.
let reasoning: Reasoning | undefined;
if (this.model.startsWith("o")) {
reasoning = { effort: "high" };
if (this.model === "o3" || this.model === "o4-mini") {
reasoning.summary = "auto";
}
}
const mergedInstructions = [prefix, this.instructions]
.filter(Boolean)
.join("\n");
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
responsesCreateViaChatCompletions(
this.oai,
params as ResponseCreateParams & { stream: true },
);
log(
"agentLoop.run(): responseCall(1): turnInput: " +
JSON.stringify(turnInput),
);
// eslint-disable-next-line no-await-in-loop
stream = await responseCall({
model: this.model,
instructions: mergedInstructions,
input: turnInput,
stream: true,
parallel_tool_calls: false,
reasoning,
...(this.config.flexMode ? { service_tier: "flex" } : {}),
...(this.disableResponseStorage
? { store: false }
: {
store: true,
previous_response_id: lastResponseId || undefined,
}),
tools: [shellTool],
tool_choice: "auto",
});
this.currentStream = stream;
// Continue to outer while to consume new stream.
continue;
}
this.onLoading(false);
return;
// Gracefully handle an abort triggered via `cancel()` so that the
// consumer does not see an unhandled exception.
if (err instanceof Error && err.name === "AbortError") {
if (!this.canceled) {
// It was aborted for some other reason; surface the error.
throw err;
}
this.onLoading(false);
return;
}
// Suppress internal stack on JSON parse failures
if (err instanceof SyntaxError) {
this.onItem({
id: `error-${Date.now()}`,
type: "message",
role: "system",
content: [
{
type: "input_text",
text: "⚠️ Failed to parse streaming response (invalid JSON). Please `/clear` to reset.",
},
],
});
this.onLoading(false);
return;
}
// Handle OpenAI API quota errors
if (
err instanceof Error &&
(err as { code?: string }).code === "insufficient_quota"
) {
this.onItem({
id: `error-${Date.now()}`,
type: "message",
role: "system",
content: [
{
type: "input_text",
text: `\u26a0 Insufficient quota: ${err instanceof Error && err.message ? err.message.trim() : "No remaining quota."} Manage or purchase credits at https://platform.openai.com/account/billing.`,
},
],
});
this.onLoading(false);
return;
}
throw err;
} finally {
this.currentStream = null;
}
throw err;
} finally {
this.currentStream = null;
}
} // end while retry loop
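The retry delays grow geometrically from `RATE_LIMIT_RETRY_WAIT_MS`; a sketch of the resulting schedule with the new 500 ms default:

```typescript
// waitMs = RATE_LIMIT_RETRY_WAIT_MS * 2 ** (attempt - 1)
const RATE_LIMIT_RETRY_WAIT_MS = 500; // default when the env var is unset
const MAX_STREAM_RETRIES = 5;

for (let attempt = 1; attempt <= MAX_STREAM_RETRIES; attempt++) {
  const waitMs = RATE_LIMIT_RETRY_WAIT_MS * 2 ** (attempt - 1);
  console.log(`retry ${attempt}/${MAX_STREAM_RETRIES}: wait ${waitMs} ms`);
}
// 500, 1000, 2000, 4000, 8000 ms -- about 15.5 s of waiting in total
```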
log(
`Turn inputs (${turnInput.length}) - ${turnInput
@@ -899,8 +1218,18 @@ export class AgentLoop {
this.onLoading(false);
};
// Delay flush slightly to allow a near-simultaneous cancel() to land.
setTimeout(flush, 30);
// Use a small delay to make sure UI rendering is smooth. Double-check
// cancellation state right before flushing to avoid race conditions.
setTimeout(() => {
if (
!this.canceled &&
!this.hardAbort.signal.aborted &&
thisGeneration === this.generation
) {
flush();
}
}, 3);
// End of main logic. The corresponding catch block for the wrapper at the
// start of this method follows next.
} catch (err) {
@@ -929,7 +1258,7 @@ export class AgentLoop {
],
});
} catch {
/* noop: emitting the error message is best-effort */
/* no-op: emitting the error message is best-effort */
}
this.onLoading(false);
return;
@@ -1172,7 +1501,21 @@ You MUST adhere to the following criteria when executing the task:
- For smaller tasks, describe in brief bullet points
- For more complex tasks, include brief high-level description, use bullet points, and include details that would be relevant to a code reviewer.
- If completing the user's task DOES NOT require writing or modifying files (e.g., the user asks a question about the code base):
- Respond in a friendly tune as a remote teammate, who is knowledgeable, capable and eager to help with coding.
- Respond in a friendly tone as a remote teammate, who is knowledgeable, capable and eager to help with coding.
- When your task involves writing or modifying files:
- Do NOT tell the user to "save the file" or "copy the code into a file" if you already created or modified the file using \`apply_patch\`. Instead, reference the file as already saved.
- Do NOT show the full contents of large files you have already written, unless the user explicitly asks for them.`;
function filterToApiMessages(
items: Array<ResponseInputItem>,
): Array<ResponseInputItem> {
return items.filter((it) => {
if (it.type === "message" && it.role === "system") {
return false;
}
if (it.type === "reasoning") {
return false;
}
return true;
});
}
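A quick usage check of the filter above (items are illustrative and cast, since the real `ResponseInputItem` union is wider):

```typescript
// Only the user message survives: system messages and reasoning are dropped.
const sampleItems = [
  { type: "message", role: "system", content: [] },
  { type: "reasoning", summary: [] },
  { type: "message", role: "user", content: [] },
] as unknown as Array<ResponseInputItem>;

console.log(filterToApiMessages(sampleItems).length); // 1
```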

View File

@@ -211,9 +211,46 @@ class Parser {
}
if (defStr.trim()) {
let found = false;
if (!fileLines.slice(0, index).some((s) => s === defStr)) {
// ------------------------------------------------------------------
// Equality helpers using the canonicalisation from find_context_core.
// (We duplicate a minimal version here because the scope is local.)
// ------------------------------------------------------------------
const canonLocal = (s: string): string =>
s.normalize("NFC").replace(
/./gu,
(c) =>
(
({
"-": "-",
"\u2010": "-",
"\u2011": "-",
"\u2012": "-",
"\u2013": "-",
"\u2014": "-",
"\u2212": "-",
"\u0022": '"',
"\u201C": '"',
"\u201D": '"',
"\u201E": '"',
"\u00AB": '"',
"\u00BB": '"',
"\u0027": "'",
"\u2018": "'",
"\u2019": "'",
"\u201B": "'",
"\u00A0": " ",
"\u202F": " ",
}) as Record<string, string>
)[c] ?? c,
);
if (
!fileLines
.slice(0, index)
.some((s) => canonLocal(s) === canonLocal(defStr))
) {
for (let i = index; i < fileLines.length; i++) {
if (fileLines[i] === defStr) {
if (canonLocal(fileLines[i]!) === canonLocal(defStr)) {
index = i + 1;
found = true;
break;
@@ -222,10 +259,14 @@ class Parser {
}
if (
!found &&
!fileLines.slice(0, index).some((s) => s.trim() === defStr.trim())
!fileLines
.slice(0, index)
.some((s) => canonLocal(s.trim()) === canonLocal(defStr.trim()))
) {
for (let i = index; i < fileLines.length; i++) {
if (fileLines[i]!.trim() === defStr.trim()) {
if (
canonLocal(fileLines[i]!.trim()) === canonLocal(defStr.trim())
) {
index = i + 1;
this.fuzz += 1;
found = true;
@@ -293,34 +334,98 @@ function find_context_core(
context: Array<string>,
start: number,
): [number, number] {
// ---------------------------------------------------------------------------
// Helpers: Unicode punctuation normalisation
// ---------------------------------------------------------------------------
/*
* The patch-matching algorithm originally required **exact** string equality
* for non-whitespace characters. That breaks when the file on disk contains
* visually identical but different Unicode code-points (e.g. “EN DASH” vs
* ASCII "-"), because models almost always emit the ASCII variant. To make
* apply_patch resilient we canonicalise a handful of common punctuation
* look-alikes before doing comparisons.
*
* We purposefully keep the mapping *small*: only characters that routinely
* appear in source files and are highly unlikely to introduce ambiguity are
* included. Each entry is written using the corresponding Unicode escape so
* that the file remains ASCII-only even after transpilation.
*/
const PUNCT_EQUIV: Record<string, string> = {
// Hyphen / dash variants --------------------------------------------------
/* U+002D HYPHEN-MINUS */ "-": "-",
/* U+2010 HYPHEN */ "\u2010": "-",
/* U+2011 NO-BREAK HYPHEN */ "\u2011": "-",
/* U+2012 FIGURE DASH */ "\u2012": "-",
/* U+2013 EN DASH */ "\u2013": "-",
/* U+2014 EM DASH */ "\u2014": "-",
/* U+2212 MINUS SIGN */ "\u2212": "-",
// Double quotes -----------------------------------------------------------
/* U+0022 QUOTATION MARK */ "\u0022": '"',
/* U+201C LEFT DOUBLE QUOTATION MARK */ "\u201C": '"',
/* U+201D RIGHT DOUBLE QUOTATION MARK */ "\u201D": '"',
/* U+201E DOUBLE LOW-9 QUOTATION MARK */ "\u201E": '"',
/* U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK */ "\u00AB": '"',
/* U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK */ "\u00BB": '"',
// Single quotes -----------------------------------------------------------
/* U+0027 APOSTROPHE */ "\u0027": "'",
/* U+2018 LEFT SINGLE QUOTATION MARK */ "\u2018": "'",
/* U+2019 RIGHT SINGLE QUOTATION MARK */ "\u2019": "'",
/* U+201B SINGLE HIGH-REVERSED-9 QUOTATION MARK */ "\u201B": "'",
// Spaces ------------------------------------------------------------------
/* U+00A0 NO-BREAK SPACE */ "\u00A0": " ",
/* U+202F NARROW NO-BREAK SPACE */ "\u202F": " ",
};
const canon = (s: string): string =>
s
// Canonical Unicode composition first
.normalize("NFC")
// Replace punctuation look-alikes
.replace(/./gu, (c) => PUNCT_EQUIV[c] ?? c);
if (context.length === 0) {
return [start, 0];
}
// Pass 1: exact equality after canonicalisation ---------------------------
const canonicalContext = canon(context.join("\n"));
for (let i = start; i < lines.length; i++) {
if (lines.slice(i, i + context.length).join("\n") === context.join("\n")) {
const segment = canon(lines.slice(i, i + context.length).join("\n"));
if (segment === canonicalContext) {
return [i, 0];
}
}
// Pass 2: ignore trailing whitespace -------------------------------------
for (let i = start; i < lines.length; i++) {
if (
const segment = canon(
lines
.slice(i, i + context.length)
.map((s) => s.trimEnd())
.join("\n") === context.map((s) => s.trimEnd()).join("\n")
) {
.join("\n"),
);
const ctx = canon(context.map((s) => s.trimEnd()).join("\n"));
if (segment === ctx) {
return [i, 1];
}
}
// Pass 3: ignore all surrounding whitespace ------------------------------
for (let i = start; i < lines.length; i++) {
if (
const segment = canon(
lines
.slice(i, i + context.length)
.map((s) => s.trim())
.join("\n") === context.map((s) => s.trim()).join("\n")
) {
.join("\n"),
);
const ctx = canon(context.map((s) => s.trim()).join("\n"));
if (segment === ctx) {
return [i, 100];
}
}
return [-1, 0];
}
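For instance, the canonicalisation makes an EN DASH on disk match the ASCII hyphen a model emits. A standalone sketch using a subset of the mapping above:

```typescript
// Subset of PUNCT_EQUIV sufficient for the demo.
const EQUIV: Record<string, string> = {
  "\u2013": "-", // EN DASH
  "\u201C": '"', // LEFT DOUBLE QUOTATION MARK
  "\u201D": '"', // RIGHT DOUBLE QUOTATION MARK
};
const canonDemo = (s: string): string =>
  s.normalize("NFC").replace(/./gu, (c) => EQUIV[c] ?? c);

console.log(canonDemo("foo \u2013 bar") === canonDemo("foo - bar")); // true
console.log(canonDemo("say \u201Chi\u201D") === canonDemo('say "hi"')); // true
```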

View File

@@ -1,5 +1,6 @@
import type { ExecInput, ExecResult } from "./sandbox/interface.js";
import type { SpawnOptions } from "child_process";
import type { ParseEntry } from "shell-quote";
import { process_patch } from "./apply-patch.js";
import { SandboxType } from "./sandbox/interface.js";
@@ -8,9 +9,25 @@ import { exec as rawExec } from "./sandbox/raw-exec.js";
import { formatCommandForDisplay } from "../../format-command.js";
import fs from "fs";
import os from "os";
import path from "path";
import { parse } from "shell-quote";
import { resolvePathAgainstWorkdir } from "src/approvals.js";
const DEFAULT_TIMEOUT_MS = 10_000; // 10 seconds
function requiresShell(cmd: Array<string>): boolean {
// If the command is a single string that contains shell operators,
// it needs to be run with shell: true
if (cmd.length === 1 && cmd[0] !== undefined) {
const tokens = parse(cmd[0]) as Array<ParseEntry>;
return tokens.some((token) => typeof token === "object" && "op" in token);
}
// If the command is split into multiple arguments, we don't need shell: true
// even if one of the arguments is a shell operator like '|'
return false;
}
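A quick check of the helper above; shell-quote represents operators as `{ op: ... }` objects, which is what the `"op" in token` test detects:

```typescript
console.log(requiresShell(["ls | wc -l"])); // true -- "|" parses as an operator token
console.log(requiresShell(["ls", "|", "wc", "-l"])); // false -- "|" is a plain argument here
console.log(requiresShell(["echo hello"])); // false -- no operators in the string
```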
/**
* This function should never return a rejected promise: errors should be
* mapped to a non-zero exit code and the error message should be in stderr.
@@ -33,6 +50,7 @@ export function exec(
const opts: SpawnOptions = {
timeout: timeoutInMillis || DEFAULT_TIMEOUT_MS,
...(requiresShell(cmd) ? { shell: true } : {}),
...(workdir ? { cwd: workdir } : {}),
};
// Merge default writable roots with any user-specified ones.
@@ -44,16 +62,32 @@ export function exec(
return execForSandbox(cmd, opts, writableRoots, abortSignal);
}
export function execApplyPatch(patchText: string): ExecResult {
export function execApplyPatch(
patchText: string,
workdir: string | undefined = undefined,
): ExecResult {
// This is a temporary measure to understand what the common base commands
// are until we start persisting and uploading rollouts.
try {
const result = process_patch(
patchText,
(p) => fs.readFileSync(p, "utf8"),
(p, c) => fs.writeFileSync(p, c, "utf8"),
(p) => fs.unlinkSync(p),
(p) => fs.readFileSync(resolvePathAgainstWorkdir(p, workdir), "utf8"),
(p, c) => {
const resolvedPath = resolvePathAgainstWorkdir(p, workdir);
// Ensure the parent directory exists before writing the file. This
// mirrors the behaviour of the standalone apply_patch CLI (see
// write_file() in apply-patch.ts) and prevents errors when adding a
// new file in a not-yet-created subdirectory.
const dir = path.dirname(resolvedPath);
if (dir !== ".") {
fs.mkdirSync(dir, { recursive: true });
}
fs.writeFileSync(resolvedPath, c, "utf8");
},
(p) => fs.unlinkSync(resolvePathAgainstWorkdir(p, workdir)),
);
return {
stdout: result,

View File

@@ -1,17 +1,18 @@
import type { CommandConfirmation } from "./agent-loop.js";
import type { AppConfig } from "../config.js";
import type { ExecInput } from "./sandbox/interface.js";
import type { ApplyPatchCommand, ApprovalPolicy } from "../../approvals.js";
import type { ExecInput } from "./sandbox/interface.js";
import type { ResponseInputItem } from "openai/resources/responses/responses.mjs";
import { exec, execApplyPatch } from "./exec.js";
import { isLoggingEnabled, log } from "./log.js";
import { ReviewDecision } from "./review.js";
import { FullAutoErrorMode } from "../auto-approval-mode.js";
import { SandboxType } from "./sandbox/interface.js";
import { canAutoApprove } from "../../approvals.js";
import { formatCommandForDisplay } from "../../format-command.js";
import { access } from "fs/promises";
import { FullAutoErrorMode } from "../auto-approval-mode.js";
import { CODEX_UNSAFE_ALLOW_NO_SANDBOX, type AppConfig } from "../config.js";
import { exec, execApplyPatch } from "./exec.js";
import { ReviewDecision } from "./review.js";
import { isLoggingEnabled, log } from "../logger/log.js";
import { SandboxType } from "./sandbox/interface.js";
import { PATH_TO_SEATBELT_EXECUTABLE } from "./sandbox/macos-seatbelt.js";
import fs from "fs/promises";
// ---------------------------------------------------------------------------
// Session-level cache of commands that the user has chosen to always approve.
@@ -81,7 +82,7 @@ export async function handleExecCommand(
) => Promise<CommandConfirmation>,
abortSignal?: AbortSignal,
): Promise<HandleExecCommandResult> {
const { cmd: command } = args;
const { cmd: command, workdir } = args;
const key = deriveCommandKey(command);
@@ -103,7 +104,7 @@ export async function handleExecCommand(
// working directory so that edits are constrained to the project root. If
// the caller wishes to broaden or restrict the set it can be made
// configurable in the future.
const safety = canAutoApprove(command, policy, [process.cwd()]);
const safety = canAutoApprove(command, workdir, policy, [process.cwd()]);
let runInSandbox: boolean;
switch (safety.type) {
@@ -144,7 +145,7 @@ export async function handleExecCommand(
abortSignal,
);
// If the operation was aborted in the meantime, propagate the cancellation
// upward by returning an empty (noop) result so that the agent loop will
// upward by returning an empty (no-op) result so that the agent loop will
// exit cleanly without emitting spurious output.
if (abortSignal?.aborted) {
return {
@@ -217,29 +218,28 @@ async function execCommand(
let { workdir } = execInput;
if (workdir) {
try {
await access(workdir);
await fs.access(workdir);
} catch (e) {
log(`EXEC workdir=${workdir} not found, use process.cwd() instead`);
workdir = process.cwd();
}
}
if (isLoggingEnabled()) {
if (applyPatchCommand != null) {
log("EXEC running apply_patch command");
} else {
const { cmd, timeoutInMillis } = execInput;
// Seconds are a bit easier to read in log messages and most timeouts
// are specified as multiples of 1000, anyway.
const timeout =
timeoutInMillis != null
? Math.round(timeoutInMillis / 1000).toString()
: "undefined";
log(
`EXEC running \`${formatCommandForDisplay(
cmd,
)}\` in workdir=${workdir} with timeout=${timeout}s`,
);
}
if (applyPatchCommand != null) {
log("EXEC running apply_patch command");
} else if (isLoggingEnabled()) {
const { cmd, timeoutInMillis } = execInput;
// Seconds are a bit easier to read in log messages and most timeouts
// are specified as multiples of 1000, anyway.
const timeout =
timeoutInMillis != null
? Math.round(timeoutInMillis / 1000).toString()
: "undefined";
log(
`EXEC running \`${formatCommandForDisplay(
cmd,
)}\` in workdir=${workdir} with timeout=${timeout}s`,
);
}
// Note execApplyPatch() and exec() are coded defensively and should not
@@ -248,7 +248,7 @@ async function execCommand(
const start = Date.now();
const execResult =
applyPatchCommand != null
? execApplyPatch(applyPatchCommand.patch)
? execApplyPatch(applyPatchCommand.patch, workdir)
: await exec(
{ ...execInput, additionalWritableRoots },
await getSandbox(runInSandbox),
@@ -271,30 +271,45 @@ async function execCommand(
};
}
const isInLinux = async (): Promise<boolean> => {
try {
await access("/proc/1/cgroup");
return true;
} catch {
return false;
}
};
/** Return `true` if the `/usr/bin/sandbox-exec` is present and executable. */
const isSandboxExecAvailable: Promise<boolean> = fs
.access(PATH_TO_SEATBELT_EXECUTABLE, fs.constants.X_OK)
.then(
() => true,
(err) => {
if (!["ENOENT", "ACCESS", "EPERM"].includes(err.code)) {
log(
`Unexpected error for \`stat ${PATH_TO_SEATBELT_EXECUTABLE}\`: ${err.message}`,
);
}
return false;
},
);
async function getSandbox(runInSandbox: boolean): Promise<SandboxType> {
if (runInSandbox) {
if (process.platform === "darwin") {
return SandboxType.MACOS_SEATBELT;
} else if (await isInLinux()) {
return SandboxType.NONE;
} else if (process.platform === "win32") {
// On Windows, we don't have a sandbox implementation yet, so we fall back to NONE
// instead of throwing an error, which would crash the application
log(
"WARNING: Sandbox was requested but is not available on Windows. Continuing without sandbox.",
);
// On macOS we rely on the system-provided `sandbox-exec` binary to
// enforce the Seatbelt profile. However, starting with macOS 14 the
// executable may be removed from the default installation or the user
// might be running the CLI on a stripped-down environment (for
// instance, inside certain CI images). Attempting to spawn a missing
// binary makes Node.js throw an *uncaught* `ENOENT` error further down
// the stack which crashes the whole CLI.
if (await isSandboxExecAvailable) {
return SandboxType.MACOS_SEATBELT;
} else {
throw new Error(
"Sandbox was mandated, but 'sandbox-exec' was not found in PATH!",
);
}
} else if (CODEX_UNSAFE_ALLOW_NO_SANDBOX) {
// Allow running without a sandbox if the user has explicitly marked the
// environment as already being sufficiently locked-down.
return SandboxType.NONE;
}
// For other platforms, still throw an error as before
// For all else, we hard fail if the user has requested a sandbox and none is available.
throw new Error("Sandbox was mandated, but no sandbox is available!");
} else {
return SandboxType.NONE;

View File

@@ -2,7 +2,7 @@
* Utility functions for handling platform-specific commands
*/
import { log, isLoggingEnabled } from "./log.js";
import { log } from "../logger/log.js";
/**
* Map of Unix commands to their Windows equivalents
@@ -59,9 +59,7 @@ export function adaptCommandForPlatform(command: Array<string>): Array<string> {
return command;
}
if (isLoggingEnabled()) {
log(`Adapting command '${cmd}' for Windows platform`);
}
log(`Adapting command '${cmd}' for Windows platform`);
// Create a new command array with the adapted command
const adaptedCommand = [...command];
@@ -78,9 +76,7 @@ export function adaptCommandForPlatform(command: Array<string>): Array<string> {
}
}
if (isLoggingEnabled()) {
log(`Adapted command: ${adaptedCommand.join(" ")}`);
}
log(`Adapted command: ${adaptedCommand.join(" ")}`);
return adaptedCommand;
}

View File

@@ -0,0 +1,78 @@
// Maximum output cap: either MAX_OUTPUT_LINES lines or MAX_OUTPUT_BYTES bytes,
// whichever limit is reached first.
const MAX_OUTPUT_BYTES = 1024 * 10; // 10 KB
const MAX_OUTPUT_LINES = 256;
/**
* Creates a collector that accumulates data Buffers from a stream up to
* specified byte and line limits. After either limit is exceeded, further
* data is ignored.
*/
export function createTruncatingCollector(
stream: NodeJS.ReadableStream,
byteLimit: number = MAX_OUTPUT_BYTES,
lineLimit: number = MAX_OUTPUT_LINES,
): {
getString: () => string;
hit: boolean;
} {
const chunks: Array<Buffer> = [];
let totalBytes = 0;
let totalLines = 0;
let hitLimit = false;
stream?.on("data", (data: Buffer) => {
if (hitLimit) {
return;
}
const dataLength = data.length;
let newlineCount = 0;
for (let i = 0; i < dataLength; i++) {
if (data[i] === 0x0a) {
newlineCount++;
}
}
// If entire chunk fits within byte and line limits, take it whole
if (
totalBytes + dataLength <= byteLimit &&
totalLines + newlineCount <= lineLimit
) {
chunks.push(data);
totalBytes += dataLength;
totalLines += newlineCount;
} else {
// Otherwise, take a partial slice up to the first limit breach
const allowedBytes = byteLimit - totalBytes;
const allowedLines = lineLimit - totalLines;
let bytesTaken = 0;
let linesSeen = 0;
for (let i = 0; i < dataLength; i++) {
// Stop if byte or line limit is reached
if (bytesTaken === allowedBytes || linesSeen === allowedLines) {
break;
}
const byte = data[i];
if (byte === 0x0a) {
linesSeen++;
}
bytesTaken++;
}
if (bytesTaken > 0) {
chunks.push(data.slice(0, bytesTaken));
totalBytes += bytesTaken;
totalLines += linesSeen;
}
hitLimit = true;
}
});
return {
getString() {
return Buffer.concat(chunks).toString("utf8");
},
/** True if either byte or line limit was exceeded */
get hit(): boolean {
return hitLimit;
},
};
}
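A usage sketch for the collector, with the limits shrunk so the demo trips them (assumes the module path used by raw-exec.ts below):

```typescript
import { Readable } from "node:stream";
import { createTruncatingCollector } from "./create-truncating-collector";

// Cap at 64 bytes or 2 lines, whichever is hit first.
const stream = Readable.from([Buffer.from("line one\nline two\nline three\n")]);
const collector = createTruncatingCollector(stream, 64, 2);

stream.on("end", () => {
  console.log(JSON.stringify(collector.getString())); // "line one\nline two\n"
  console.log(collector.hit); // true -- the third line exceeded the line limit
});
```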

View File

@@ -2,37 +2,41 @@ import type { ExecResult } from "./interface.js";
import type { SpawnOptions } from "child_process";
import { exec } from "./raw-exec.js";
import { log } from "../log.js";
import { realpathSync } from "fs";
import { CONFIG_DIR } from "src/utils/config.js";
import { log } from "../../logger/log.js";
function getCommonRoots() {
return [
CONFIG_DIR,
// Without this root, it'll cause:
// pyenv: cannot rehash: $HOME/.pyenv/shims isn't writable
`${process.env["HOME"]}/.pyenv`,
];
}
/**
* When working with `sandbox-exec`, only consider `sandbox-exec` in `/usr/bin`
* to defend against an attacker trying to inject a malicious version on the
* PATH. If /usr/bin/sandbox-exec has been tampered with, then the attacker
* already has root access.
*/
export const PATH_TO_SEATBELT_EXECUTABLE = "/usr/bin/sandbox-exec";
export function execWithSeatbelt(
cmd: Array<string>,
opts: SpawnOptions,
writableRoots: Array<string>,
writableRoots: ReadonlyArray<string>,
abortSignal?: AbortSignal,
): Promise<ExecResult> {
let scopedWritePolicy: string;
let policyTemplateParams: Array<string>;
if (writableRoots.length > 0) {
// Add `~/.codex` to the list of writable roots
// (if there's any already, not in read-only mode)
getCommonRoots().map((root) => writableRoots.push(root));
const { policies, params } = writableRoots
const fullWritableRoots = [...writableRoots, ...getCommonRoots()];
// In practice, fullWritableRoots will be non-empty, but we check just in
// case the logic to build up fullWritableRoots changes.
if (fullWritableRoots.length > 0) {
const { policies, params } = fullWritableRoots
.map((root, index) => ({
policy: `(subpath (param "WRITABLE_ROOT_${index}"))`,
// the kernel resolves symlinks before handing them to seatbelt for checking
// so store the canonicalized form in the policy to be compared against
param: `-DWRITABLE_ROOT_${index}=${realpathSync(root)}`,
param: `-DWRITABLE_ROOT_${index}=${root}`,
}))
.reduce(
(
@@ -61,7 +65,7 @@ export function execWithSeatbelt(
);
const fullCommand = [
"sandbox-exec",
PATH_TO_SEATBELT_EXECUTABLE,
"-p",
fullPolicy,
...policyTemplateParams,

View File

@@ -7,13 +7,12 @@ import type {
StdioPipe,
} from "child_process";
import { log, isLoggingEnabled } from "../log.js";
import { log } from "../../logger/log.js";
import { adaptCommandForPlatform } from "../platform-commands.js";
import { createTruncatingCollector } from "./create-truncating-collector";
import { spawn } from "child_process";
import * as os from "os";
const MAX_BUFFER = 1024 * 100; // 100 KB
/**
* This function should never return a rejected promise: errors should be
* mapped to a non-zero exit code and the error message should be in stderr.
@@ -21,16 +20,13 @@ const MAX_BUFFER = 1024 * 100; // 100 KB
export function exec(
command: Array<string>,
options: SpawnOptions,
_writableRoots: Array<string>,
_writableRoots: ReadonlyArray<string>,
abortSignal?: AbortSignal,
): Promise<ExecResult> {
// Adapt command for the current platform (e.g., convert 'ls' to 'dir' on Windows)
const adaptedCommand = adaptCommandForPlatform(command);
if (
isLoggingEnabled() &&
JSON.stringify(adaptedCommand) !== JSON.stringify(command)
) {
if (JSON.stringify(adaptedCommand) !== JSON.stringify(command)) {
log(
`Command adapted for platform: ${command.join(
" ",
@@ -95,9 +91,7 @@ export function exec(
// timely fashion.
if (abortSignal) {
const abortHandler = () => {
if (isLoggingEnabled()) {
log(`raw-exec: abort signal received killing child ${child.pid}`);
}
log(`raw-exec: abort signal received killing child ${child.pid}`);
const killTarget = (signal: NodeJS.Signals) => {
if (!child.pid) {
return;
@@ -148,37 +142,14 @@ export function exec(
// resolve the promise and translate the failure into a regular
// ExecResult object so the rest of the agent loop can carry on gracefully.
const stdoutChunks: Array<Buffer> = [];
const stderrChunks: Array<Buffer> = [];
let numStdoutBytes = 0;
let numStderrBytes = 0;
let hitMaxStdout = false;
let hitMaxStderr = false;
return new Promise<ExecResult>((resolve) => {
child.stdout?.on("data", (data: Buffer) => {
if (!hitMaxStdout) {
numStdoutBytes += data.length;
if (numStdoutBytes <= MAX_BUFFER) {
stdoutChunks.push(data);
} else {
hitMaxStdout = true;
}
}
});
child.stderr?.on("data", (data: Buffer) => {
if (!hitMaxStderr) {
numStderrBytes += data.length;
if (numStderrBytes <= MAX_BUFFER) {
stderrChunks.push(data);
} else {
hitMaxStderr = true;
}
}
});
// Collect stdout and stderr up to configured limits.
const stdoutCollector = createTruncatingCollector(child.stdout!);
const stderrCollector = createTruncatingCollector(child.stderr!);
child.on("exit", (code, signal) => {
const stdout = Buffer.concat(stdoutChunks).toString("utf8");
const stderr = Buffer.concat(stderrChunks).toString("utf8");
const stdout = stdoutCollector.getString();
const stderr = stderrCollector.getString();
// Map (code, signal) to an exit code. We expect exactly one of the two
// values to be non-null, but we code defensively to handle the case where
@@ -194,24 +165,61 @@ export function exec(
exitCode = 1;
}
if (isLoggingEnabled()) {
log(
`raw-exec: child ${child.pid} exited code=${exitCode} signal=${signal}`,
);
}
resolve({
log(
`raw-exec: child ${child.pid} exited code=${exitCode} signal=${signal}`,
);
const execResult = {
stdout,
stderr,
exitCode,
});
};
resolve(
addTruncationWarningsIfNecessary(
execResult,
stdoutCollector.hit,
stderrCollector.hit,
),
);
});
child.on("error", (err) => {
resolve({
const execResult = {
stdout: "",
stderr: String(err),
exitCode: 1,
});
};
resolve(
addTruncationWarningsIfNecessary(
execResult,
stdoutCollector.hit,
stderrCollector.hit,
),
);
});
});
}
/**
* Adds a truncation warnings to stdout and stderr, if appropriate.
*/
function addTruncationWarningsIfNecessary(
execResult: ExecResult,
hitMaxStdout: boolean,
hitMaxStderr: boolean,
): ExecResult {
if (!hitMaxStdout && !hitMaxStderr) {
return execResult;
} else {
const { stdout, stderr, exitCode } = execResult;
return {
stdout: hitMaxStdout
? stdout + "\n\n[Output truncated: too many lines or bytes]"
: stdout,
stderr: hitMaxStderr
? stderr + "\n\n[Output truncated: too many lines or bytes]"
: stderr,
exitCode,
};
}
}

View File

@@ -19,6 +19,10 @@ export function approximateTokensUsed(items: Array<ResponseItem>): number {
for (const item of items) {
switch (item.type) {
case "message": {
if (item.role !== "user" && item.role !== "assistant") {
continue;
}
for (const c of item.content) {
if (c.type === "input_text" || c.type === "output_text") {
charCount += c.text.length;

View File

@@ -0,0 +1,82 @@
import type {
ResponseItem,
ResponseOutputItem,
} from "openai/resources/responses/responses.mjs";
/**
* Build a GitHub issues/new URL that prefills the Codex 2-bug-report.yml
* template with whatever structured data we can infer from the current
* session.
*/
export function buildBugReportUrl({
items,
cliVersion,
model,
platform,
}: {
/** Chat history so we can summarise user steps */
items: Array<ResponseItem | ResponseOutputItem>;
/** CLI revision string (e.g. output of `codex --revision`) */
cliVersion: string;
/** Active model name */
model: string;
/** Platform string e.g. `darwin arm64 23.0.0` */
platform: string;
}): string {
const params = new URLSearchParams({
template: "2-bug-report.yml",
labels: "bug",
});
params.set("version", cliVersion);
params.set("model", model);
params.set("platform", platform);
const bullets: Array<string> = [];
for (let i = 0; i < items.length; ) {
const entry = items[i];
if (entry?.type === "message" && entry.role === "user") {
const contentArray = entry.content as
| Array<{ text?: string }>
| undefined;
const messageText = contentArray
?.map((c) => c.text ?? "")
.join(" ")
.trim();
let reasoning = 0;
let toolCalls = 0;
let j = i + 1;
while (j < items.length) {
const it = items[j];
if (it?.type === "message" && it?.role === "user") {
break;
} else if (
it?.type === "reasoning" ||
(it?.type === "message" && it?.role === "assistant")
) {
reasoning += 1;
} else if (it?.type === "function_call") {
toolCalls += 1;
}
j++;
}
const codeBlock = `\`\`\`\n ${messageText}\n \`\`\``;
bullets.push(
`- ${codeBlock}\n - \`${reasoning} reasoning\` | \`${toolCalls} tool\``,
);
i = j;
} else {
i += 1;
}
}
if (bullets.length) {
params.set("steps", bullets.join("\n"));
}
return `https://github.com/openai/codex/issues/new?${params.toString()}`;
}
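Calling the builder with placeholder values shows the URL shape:

```typescript
// Placeholder values; an empty session omits the "steps" param entirely.
const url = buildBugReportUrl({
  items: [],
  cliVersion: "0.0.0-dev",
  model: "o4-mini",
  platform: `${process.platform} ${process.arch}`,
});
console.log(url);
// https://github.com/openai/codex/issues/new?template=2-bug-report.yml&labels=bug&version=...
```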

View File

@@ -0,0 +1,146 @@
import type { AgentName } from "package-manager-detector";
import { detectInstallerByPath } from "./package-manager-detector";
import { CLI_VERSION } from "./session";
import boxen from "boxen";
import chalk from "chalk";
import { getLatestVersion } from "fast-npm-meta";
import { readFile, writeFile } from "node:fs/promises";
import { join } from "node:path";
import { getUserAgent } from "package-manager-detector";
import semver from "semver";
interface UpdateCheckState {
lastUpdateCheck?: string;
}
interface UpdateCheckInfo {
currentVersion: string;
latestVersion: string;
}
export interface UpdateOptions {
manager: AgentName;
packageName: string;
}
const UPDATE_CHECK_FREQUENCY = 1000 * 60 * 60 * 24; // 1 day
export function renderUpdateCommand({
manager,
packageName,
}: UpdateOptions): string {
const updateCommands: Record<AgentName, string> = {
npm: `npm install -g ${packageName}`,
pnpm: `pnpm add -g ${packageName}`,
bun: `bun add -g ${packageName}`,
/** Only works in yarn@v1 */
yarn: `yarn global add ${packageName}`,
deno: `deno install -g npm:${packageName}`,
};
return updateCommands[manager];
}
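Calling the helper defined above (the package name is illustrative):

```typescript
renderUpdateCommand({ manager: "npm", packageName: "@openai/codex" });
// => "npm install -g @openai/codex"
renderUpdateCommand({ manager: "yarn", packageName: "@openai/codex" });
// => "yarn global add @openai/codex" (yarn@v1 only, per the note above)
```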
function renderUpdateMessage(options: UpdateOptions) {
const updateCommand = renderUpdateCommand(options);
return `To update, run ${chalk.magenta(updateCommand)}.`;
}
async function writeState(stateFilePath: string, state: UpdateCheckState) {
await writeFile(stateFilePath, JSON.stringify(state, null, 2), {
encoding: "utf8",
});
}
async function getUpdateCheckInfo(
packageName: string,
): Promise<UpdateCheckInfo | undefined> {
const metadata = await getLatestVersion(packageName, {
force: true,
throw: false,
});
if ("error" in metadata || !metadata?.version) {
return;
}
return {
currentVersion: CLI_VERSION,
latestVersion: metadata.version,
};
}
export async function checkForUpdates(): Promise<void> {
const { CONFIG_DIR } = await import("./config");
const stateFile = join(CONFIG_DIR, "update-check.json");
// Load previous check timestamp
let state: UpdateCheckState | undefined;
try {
state = JSON.parse(await readFile(stateFile, "utf8"));
} catch {
// ignore
}
// Bail out if we checked less than the configured frequency ago
if (
state?.lastUpdateCheck &&
Date.now() - new Date(state.lastUpdateCheck).valueOf() <
UPDATE_CHECK_FREQUENCY
) {
return;
}
// Fetch current vs latest from the registry
const { name: packageName } = await import("../../package.json");
const packageInfo = await getUpdateCheckInfo(packageName);
await writeState(stateFile, {
...state,
lastUpdateCheck: new Date().toUTCString(),
});
if (
!packageInfo ||
!semver.gt(packageInfo.latestVersion, packageInfo.currentVersion)
) {
return;
}
// Detect global installer
let managerName = await detectInstallerByPath();
// Fallback to the local package manager
if (!managerName) {
const local = getUserAgent();
if (!local) {
// No package managers found, skip it.
return;
}
managerName = local;
}
const updateMessage = renderUpdateMessage({
manager: managerName,
packageName,
});
const box = boxen(
`\
Update available! ${chalk.red(packageInfo.currentVersion)} → ${chalk.green(
packageInfo.latestVersion,
)}.
${updateMessage}`,
{
padding: 1,
margin: 1,
align: "center",
borderColor: "yellow",
borderStyle: "round",
},
);
// eslint-disable-next-line no-console
console.log(box);
}

View File

@@ -1,21 +1,31 @@
import type { AppConfig } from "./config.js";
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { OPENAI_BASE_URL } from "./config.js";
import { getBaseUrl, getApiKey } from "./config.js";
import OpenAI from "openai";
/**
* Generate a condensed summary of the conversation items.
* @param items The list of conversation items to summarize
* @param model The model to use for generating the summary
* @returns A concise structured summary string
*/
/**
* Generate a condensed summary of the conversation items.
* @param items The list of conversation items to summarize
* @param model The model to use for generating the summary
* @param flexMode Whether to use the flex-mode service tier
* @param config The configuration object
* @returns A concise structured summary string
*/
export async function generateCompactSummary(
items: Array<ResponseItem>,
model: string,
flexMode = false,
config: AppConfig,
): Promise<string> {
const oai = new OpenAI({
apiKey: process.env["OPENAI_API_KEY"],
baseURL: OPENAI_BASE_URL,
apiKey: getApiKey(config.provider),
baseURL: getBaseUrl(config.provider),
});
const conversationText = items
@@ -44,6 +54,7 @@ export async function generateCompactSummary(
const response = await oai.chat.completions.create({
model,
...(flexMode ? { service_tier: "flex" } : {}),
messages: [
{
role: "assistant",

View File

@@ -7,14 +7,42 @@
// compiled `dist/` output used by the published CLI.
import type { FullAutoErrorMode } from "./auto-approval-mode.js";
import type { ReasoningEffort } from "openai/resources.mjs";
import { log, isLoggingEnabled } from "./agent/log.js";
import { AutoApprovalMode } from "./auto-approval-mode.js";
import { log } from "./logger/log.js";
import { providers } from "./providers.js";
import { config as loadDotenv } from "dotenv";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
import { load as loadYaml, dump as dumpYaml } from "js-yaml";
import { homedir } from "os";
import { dirname, join, extname, resolve as resolvePath } from "path";
// ---------------------------------------------------------------------------
// User-wide environment config (~/.codex.env)
// ---------------------------------------------------------------------------
// Load a user-level dotenv file **after** process.env and any project-local
// .env file (loaded via "dotenv/config" in cli.tsx) are in place. We rely on
// dotenv's default behaviour of *not* overriding existing variables so that
// the precedence order becomes:
// 1. Explicit environment variables
// 2. Projectlocal .env (handled in cli.tsx)
// 3. Userwide ~/.codex.env (loaded here)
// This guarantees that users can still override the global key on a perproject
// basis while enjoying the convenience of a persistent default.
// Skip when running inside Vitest to avoid interfering with the FS mocks used
// by tests that stub out `fs` *after* importing this module.
const USER_WIDE_CONFIG_PATH = join(homedir(), ".codex.env");
const isVitest =
typeof (globalThis as { vitest?: unknown }).vitest !== "undefined";
if (!isVitest) {
loadDotenv({ path: USER_WIDE_CONFIG_PATH });
}
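
Because dotenv never overrides variables that are already set, the precedence comment above can be verified directly. A minimal sketch, assuming a `~/.codex.env` file that contains `OPENAI_API_KEY=from-user-file`:

```ts
import { config as loadDotenv } from "dotenv";
import { homedir } from "os";
import { join } from "path";

// Simulate an explicit environment variable (highest precedence).
process.env["OPENAI_API_KEY"] = "from-shell";

// dotenv's default behaviour is *not* to override existing values,
// so the user-wide file only fills in keys that are still unset.
loadDotenv({ path: join(homedir(), ".codex.env") });

console.log(process.env["OPENAI_API_KEY"]); // "from-shell", not "from-user-file"
```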
export const DEFAULT_AGENTIC_MODEL = "o4-mini";
export const DEFAULT_FULL_CONTEXT_MODEL = "gpt-4.1";
export const DEFAULT_APPROVAL_MODE = AutoApprovalMode.SUGGEST;
@@ -36,26 +64,90 @@ export const OPENAI_TIMEOUT_MS =
export const OPENAI_BASE_URL = process.env["OPENAI_BASE_URL"] || "";
export let OPENAI_API_KEY = process.env["OPENAI_API_KEY"] || "";
export const DEFAULT_REASONING_EFFORT = "high";
export const OPENAI_ORGANIZATION = process.env["OPENAI_ORGANIZATION"] || "";
export const OPENAI_PROJECT = process.env["OPENAI_PROJECT"] || "";
// Can be set `true` when Codex is running in an environment that is marked as already
// considered sufficiently locked-down so that we allow running without an explicit sandbox.
export const CODEX_UNSAFE_ALLOW_NO_SANDBOX = Boolean(
process.env["CODEX_UNSAFE_ALLOW_NO_SANDBOX"] || "",
);
export function setApiKey(apiKey: string): void {
OPENAI_API_KEY = apiKey;
}
// Formatting (quiet mode-only).
export const PRETTY_PRINT = Boolean(process.env["PRETTY_PRINT"] || "");
export function getBaseUrl(provider: string = "openai"): string | undefined {
// Check for a PROVIDER-specific override: e.g. OPENAI_BASE_URL or OLLAMA_BASE_URL.
const envKey = `${provider.toUpperCase()}_BASE_URL`;
if (process.env[envKey]) {
return process.env[envKey];
}
// Get providers config from config file.
const config = loadConfig();
const providersConfig = config.providers ?? providers;
const providerInfo = providersConfig[provider.toLowerCase()];
if (providerInfo) {
return providerInfo.baseURL;
}
// If the provider is not found in the providers list and `OPENAI_BASE_URL` is set, use it.
if (OPENAI_BASE_URL !== "") {
return OPENAI_BASE_URL;
}
// We tried.
return undefined;
}
export function getApiKey(provider: string = "openai"): string | undefined {
const config = loadConfig();
const providersConfig = config.providers ?? providers;
const providerInfo = providersConfig[provider.toLowerCase()];
if (providerInfo) {
if (providerInfo.name === "Ollama") {
return process.env[providerInfo.envKey] ?? "dummy";
}
return process.env[providerInfo.envKey];
}
// Checking `PROVIDER_API_KEY` feels more intuitive with a custom provider.
const customApiKey = process.env[`${provider.toUpperCase()}_API_KEY`];
if (customApiKey) {
return customApiKey;
}
// If the provider is not found in the providers list and `OPENAI_API_KEY` is set, use it.
if (OPENAI_API_KEY !== "") {
return OPENAI_API_KEY;
}
// We tried.
return undefined;
}
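
Taken together, the two helpers resolve settings in a fixed order: provider-specific environment variable, then the providers table (user config merged over the built-in defaults), then the legacy `OPENAI_*` fallbacks. A minimal usage sketch, assuming the module is importable as `./config`:

```ts
import { getBaseUrl, getApiKey } from "./config";

// A provider-specific env override wins over the providers table.
process.env["OLLAMA_BASE_URL"] = "http://127.0.0.1:11434/v1";
console.log(getBaseUrl("ollama")); // "http://127.0.0.1:11434/v1"

// Ollama needs no real key, so getApiKey falls back to "dummy"
// when OLLAMA_API_KEY is unset.
delete process.env["OLLAMA_API_KEY"];
console.log(getApiKey("ollama")); // "dummy"
```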
// Represents config as persisted in config.json.
export type StoredConfig = {
model?: string;
provider?: string;
approvalMode?: AutoApprovalMode;
fullAutoErrorMode?: FullAutoErrorMode;
memory?: MemoryConfig;
/** Whether to enable desktop notifications for responses */
notify?: boolean;
/** Disable server-side response storage (send full transcript each request) */
disableResponseStorage?: boolean;
providers?: Record<string, { name: string; baseURL: string; envKey: string }>;
history?: {
maxSize?: number;
saveHistory?: boolean;
sensitivePatterns?: Array<string>;
};
/** User-defined safe commands */
safeCommands?: Array<string>;
reasoningEffort?: ReasoningEffort;
};
// Minimal config written on first run. An *empty* model string ensures that
@@ -63,7 +155,7 @@ export type StoredConfig = {
// propagating to existing users until they explicitly set a model.
export const EMPTY_STORED_CONFIG: StoredConfig = { model: "" };
// Pre-stringified JSON variant so we don’t stringify repeatedly.
// Pre-stringified JSON variant so we don't stringify repeatedly.
const EMPTY_CONFIG_JSON = JSON.stringify(EMPTY_STORED_CONFIG, null, 2) + "\n";
export type MemoryConfig = {
@@ -74,11 +166,21 @@ export type MemoryConfig = {
export type AppConfig = {
apiKey?: string;
model: string;
provider?: string;
instructions: string;
approvalMode?: AutoApprovalMode;
fullAutoErrorMode?: FullAutoErrorMode;
memory?: MemoryConfig;
reasoningEffort?: ReasoningEffort;
/** Whether to enable desktop notifications for responses */
notify: boolean;
notify?: boolean;
/** Disable server-side response storage (send full transcript each request) */
disableResponseStorage?: boolean;
/** Enable the "flex-mode" processing mode for supported models (o3, o4-mini) */
flexMode?: boolean;
providers?: Record<string, { name: string; baseURL: string; envKey: string }>;
history?: {
maxSize: number;
saveHistory: boolean;
@@ -86,6 +188,9 @@ export type AppConfig = {
};
};
// Formatting (quiet mode-only).
export const PRETTY_PRINT = Boolean(process.env["PRETTY_PRINT"] || "");
// ---------------------------------------------------------------------------
// Project doc support (codex.md)
// ---------------------------------------------------------------------------
@@ -93,6 +198,7 @@ export type AppConfig = {
export const PROJECT_DOC_MAX_BYTES = 32 * 1024; // 32 kB
const PROJECT_DOC_FILENAMES = ["codex.md", ".codex.md", "CODEX.md"];
const PROJECT_DOC_SEPARATOR = "\n\n--- project-doc ---\n\n";
export function discoverProjectDocPath(startDir: string): string | null {
const cwd = resolvePath(startDir);
@@ -217,6 +323,22 @@ export const loadConfig = (
}
}
if (
storedConfig.disableResponseStorage !== undefined &&
typeof storedConfig.disableResponseStorage !== "boolean"
) {
if (storedConfig.disableResponseStorage === "true") {
storedConfig.disableResponseStorage = true;
} else if (storedConfig.disableResponseStorage === "false") {
storedConfig.disableResponseStorage = false;
} else {
log(
`[codex] Warning: 'disableResponseStorage' in config is not a boolean (got '${storedConfig.disableResponseStorage}'). Ignoring this value.`,
);
delete storedConfig.disableResponseStorage;
}
}
const instructionsFilePathResolved =
instructionsPath ?? INSTRUCTIONS_FILEPATH;
const userInstructions = existsSync(instructionsFilePathResolved)
@@ -237,21 +359,17 @@ export const loadConfig = (
? resolvePath(cwd, options.projectDocPath)
: discoverProjectDocPath(cwd);
if (projectDocPath) {
if (isLoggingEnabled()) {
log(
`[codex] Loaded project doc from ${projectDocPath} (${projectDoc.length} bytes)`,
);
}
log(
`[codex] Loaded project doc from ${projectDocPath} (${projectDoc.length} bytes)`,
);
} else {
if (isLoggingEnabled()) {
log(`[codex] No project doc found in ${cwd}`);
}
log(`[codex] No project doc found in ${cwd}`);
}
}
const combinedInstructions = [userInstructions, projectDoc]
.filter((s) => s && s.trim() !== "")
.join("\n\n--- project-doc ---\n\n");
.join(PROJECT_DOC_SEPARATOR);
// Treat empty string ("" or whitespace) as absence so we can fall back to
// the latest DEFAULT_MODEL.
@@ -266,8 +384,12 @@ export const loadConfig = (
(options.isFullContext
? DEFAULT_FULL_CONTEXT_MODEL
: DEFAULT_AGENTIC_MODEL),
provider: storedConfig.provider,
instructions: combinedInstructions,
notify: storedConfig.notify === true,
approvalMode: storedConfig.approvalMode,
disableResponseStorage: storedConfig.disableResponseStorage === true,
reasoningEffort: storedConfig.reasoningEffort,
};
// -----------------------------------------------------------------------
@@ -345,6 +467,9 @@ export const loadConfig = (
};
}
// Merge default providers with user configured providers in the config.
config.providers = { ...providers, ...storedConfig.providers };
return config;
};
@@ -376,6 +501,11 @@ export const saveConfig = (
// Create the config object to save
const configToSave: StoredConfig = {
model: config.model,
provider: config.provider,
providers: config.providers,
approvalMode: config.approvalMode,
disableResponseStorage: config.disableResponseStorage,
reasoningEffort: config.reasoningEffort,
};
// Add history settings if they exist
@@ -393,5 +523,9 @@ export const saveConfig = (
writeFileSync(targetPath, JSON.stringify(configToSave, null, 2), "utf-8");
}
writeFileSync(instructionsPath, config.instructions, "utf-8");
// Take everything before the first PROJECT_DOC_SEPARATOR (or the whole string if none).
const [userInstructions = ""] = config.instructions.split(
PROJECT_DOC_SEPARATOR,
);
writeFileSync(instructionsPath, userInstructions, "utf-8");
};
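
The split above matters because `config.instructions` holds the *combined* string (user instructions plus any discovered project doc, joined by `PROJECT_DOC_SEPARATOR`); only the user-authored prefix should be written back to disk. A small sketch of the behaviour:

```ts
const PROJECT_DOC_SEPARATOR = "\n\n--- project-doc ---\n\n";

const combined =
  "always use pnpm" + PROJECT_DOC_SEPARATOR + "# Project doc\n...";

// Everything before the first separator is the user's own instructions;
// the project-doc half is re-derived from codex.md on the next load.
const [userInstructions = ""] = combined.split(PROJECT_DOC_SEPARATOR);
console.log(userInstructions); // "always use pnpm"
```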

View File

@@ -1,212 +0,0 @@
/**
* Cost-estimation helpers for OpenAI responses.
*
* The implementation now distinguishes between *input*, *cached input* and
* *output* tokens, reflecting OpenAI's 2025-04 pricing scheme. For models
* where we only have a single blended rate we gracefully fall back to the
* legacy logic so existing call-sites continue to work.
*/
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { approximateTokensUsed } from "./approximate-tokens-used.js";
// ────────────────────────────────────────────────────────────────────────────
// Pricing tables
// ────────────────────────────────────────────────────────────────────────────
/** Breakdown of per-token prices (in USD). */
type TokenRates = {
/** Price for *non-cached* input prompt tokens. */
input: number;
/** Preferential price for *cached* input tokens. */
cachedInput: number;
/** Price for completion / output tokens. */
output: number;
};
/**
* Pricing table (exact model name -> per-token rates).
* All keys must be lowercase.
*/
const detailedPriceMap: Record<string, TokenRates> = {
// OpenAI “o-series” experimental
"o3": {
input: 10 / 1_000_000,
cachedInput: 2.5 / 1_000_000,
output: 40 / 1_000_000,
},
"o4-mini": {
input: 1.1 / 1_000_000,
cachedInput: 0.275 / 1_000_000,
output: 4.4 / 1_000_000,
},
// GPT-4.1 family
"gpt-4.1-nano": {
input: 0.1 / 1_000_000,
cachedInput: 0.025 / 1_000_000,
output: 0.4 / 1_000_000,
},
"gpt-4.1-mini": {
input: 0.4 / 1_000_000,
cachedInput: 0.1 / 1_000_000,
output: 1.6 / 1_000_000,
},
"gpt-4.1": {
input: 2 / 1_000_000,
cachedInput: 0.5 / 1_000_000,
output: 8 / 1_000_000,
},
// GPT-4o family
"gpt-4o-mini": {
input: 0.6 / 1_000_000,
cachedInput: 0.3 / 1_000_000,
output: 2.4 / 1_000_000,
},
"gpt-4o": {
input: 5 / 1_000_000,
cachedInput: 2.5 / 1_000_000,
output: 20 / 1_000_000,
},
};
/**
* Legacy single-rate pricing entries (per *thousand* tokens). These are kept
* to provide sensible fallbacks for models that do not yet expose a detailed
* breakdown or where we have no published split pricing. The figures stem
* from older OpenAI announcements and are only meant for *approximation*;
* callers that rely on exact accounting should upgrade to models covered by
* {@link detailedPriceMap}.
*/
const blendedPriceMap: Record<string, number> = {
// GPT-4 Turbo (Apr 2024)
"gpt-4-turbo": 0.01,
// Legacy GPT-4 8k / 32k context models
"gpt-4": 0.03,
// GPT-3.5 Turbo family
"gpt-3.5-turbo": 0.0005,
// Remaining preview variants (exact names)
"gpt-4o-search-preview": 0.0025,
"gpt-4o-mini-search-preview": 0.00015,
"gpt-4o-realtime-preview": 0.005,
"gpt-4o-audio-preview": 0.0025,
"gpt-4o-mini-audio-preview": 0.00015,
"gpt-4o-mini-realtime-preview": 0.0006,
"gpt-4o-mini": 0.00015,
// Older experimental o-series rates
"o3-mini": 0.0011,
"o1-mini": 0.0011,
"o1-pro": 0.15,
"o1": 0.015,
// Additional internal preview models
"computer-use-preview": 0.003,
};
// ────────────────────────────────────────────────────────────────────────────
// Public helpers
// ────────────────────────────────────────────────────────────────────────────
/**
* Return the per-token input/cached/output rates for the supplied model, or
* `null` when no detailed pricing is available.
*/
function normalize(model: string): string {
// Lowercase and strip date/version suffixes like “2025-04-14”.
const lower = model.toLowerCase();
const dateSuffix = /-\d{4}-\d{2}-\d{2}$/;
return lower.replace(dateSuffix, "");
}
export function priceRates(model: string): TokenRates | null {
return detailedPriceMap[normalize(model)] ?? null;
}
/**
* Fallback that returns a *single* blended per-token rate when no detailed
* split is available. This mirrors the behaviour of the pre-2025 version so
* that existing callers keep working unmodified.
*/
export function pricePerToken(model: string): number | null {
// Prefer an *average* of the detailed rates when we have them; this avoids
// surprises where callers mix `pricePerToken()` with the new detailed
// helpers.
const rates = priceRates(model);
if (rates) {
return (rates.input + rates.output) / 2; // simple average heuristic
}
const entry = blendedPriceMap[normalize(model)];
if (entry == null) {
return null;
}
return entry / 1000;
}
// ────────────────────────────────────────────────────────────────────────────
// Cost estimation
// ────────────────────────────────────────────────────────────────────────────
/** Shape of the `usage` object returned by OpenAI's Responses API. */
export type UsageBreakdown = {
input_tokens?: number;
input_tokens_details?: { cached_tokens?: number } | null;
output_tokens?: number;
total_tokens?: number;
};
/**
* Calculate the exact cost (in USD) for a single usage breakdown. Returns
* `null` when the model is unknown.
*/
export function estimateCostFromUsage(
usage: UsageBreakdown,
model: string,
): number | null {
const rates = priceRates(model);
if (!rates) {
// fall back to blended pricing
const per = pricePerToken(model);
if (per == null) {
return null;
}
const tokens =
usage.total_tokens ??
(usage.input_tokens ?? 0) + (usage.output_tokens ?? 0);
return tokens * per;
}
const input = usage.input_tokens ?? 0;
const cached = usage.input_tokens_details?.cached_tokens ?? 0;
const nonCachedInput = Math.max(0, input - cached);
const output = usage.output_tokens ?? 0;
return (
nonCachedInput * rates.input +
cached * rates.cachedInput +
output * rates.output
);
}
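
For reference, the arithmetic the (removed) helper performed under the `o4-mini` rates above: input $1.10, cached input $0.275, and output $4.40 per million tokens. A hedged sketch with a hypothetical import path:

```ts
import { estimateCostFromUsage } from "./estimate-cost.js"; // hypothetical path

const cost = estimateCostFromUsage(
  {
    input_tokens: 10_000,
    input_tokens_details: { cached_tokens: 4_000 },
    output_tokens: 2_000,
  },
  "o4-mini-2025-04-16", // the date suffix is stripped by normalize()
);

// 6_000 * 1.1e-6 + 4_000 * 0.275e-6 + 2_000 * 4.4e-6 = 0.0165
console.log(cost?.toFixed(4)); // "0.0165"
```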
/**
* Rough cost estimate (USD) for a series of {@link ResponseItem}s when using
* the specified model. When no detailed usage object is available we fall
* back to estimating token counts based on the message contents.
*/
export function estimateCostUSD(
items: Array<ResponseItem>,
model: string,
): number | null {
const per = pricePerToken(model);
if (per == null) {
return null;
}
return approximateTokensUsed(items) * per;
}

View File

@@ -0,0 +1,36 @@
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
/**
* Extracts the patch texts of all `apply_patch` tool calls from the given
* message history. Returns an empty string when none are found.
*/
export function extractAppliedPatches(items: Array<ResponseItem>): string {
const patches: Array<string> = [];
for (const item of items) {
if (item.type !== "function_call") {
continue;
}
const { name: toolName, arguments: argsString } = item as unknown as {
name: unknown;
arguments: unknown;
};
if (toolName !== "apply_patch" || typeof argsString !== "string") {
continue;
}
try {
const args = JSON.parse(argsString) as { patch?: string };
if (typeof args.patch === "string" && args.patch.length > 0) {
patches.push(args.patch.trim());
}
} catch {
// Ignore malformed JSON; we never want to crash the overlay.
continue;
}
}
return patches.join("\n\n");
}
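
A minimal sketch of the overlay path: a `function_call` item whose arguments JSON carries a `patch` string is collected, and anything malformed is skipped. The import path is hypothetical:

```ts
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { extractAppliedPatches } from "./extract-applied-patches"; // hypothetical path

const items = [
  {
    type: "function_call",
    name: "apply_patch",
    arguments: JSON.stringify({ patch: "*** Begin Patch\n*** End Patch" }),
  },
  // Malformed arguments are silently ignored rather than crashing the overlay.
  { type: "function_call", name: "apply_patch", arguments: "{not json" },
] as unknown as Array<ResponseItem>;

console.log(extractAppliedPatches(items));
// "*** Begin Patch\n*** End Patch"; the malformed entry is dropped
```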

View File

@@ -0,0 +1,42 @@
import fs from "fs";
import os from "os";
import path from "path";
export function getFileSystemSuggestions(pathPrefix: string): Array<string> {
if (!pathPrefix) {
return [];
}
try {
const sep = path.sep;
const hasTilde = pathPrefix === "~" || pathPrefix.startsWith("~" + sep);
const expanded = hasTilde
? path.join(os.homedir(), pathPrefix.slice(1))
: pathPrefix;
const normalized = path.normalize(expanded);
const isDir = pathPrefix.endsWith(path.sep);
const base = path.basename(normalized);
const dir =
normalized === "." && !pathPrefix.startsWith("." + sep) && !hasTilde
? process.cwd()
: path.dirname(normalized);
const readDir = isDir ? path.join(dir, base) : dir;
return fs
.readdirSync(readDir)
.filter((item) => isDir || item.startsWith(base))
.map((item) => {
const fullPath = path.join(readDir, item);
const isDirectory = fs.statSync(fullPath).isDirectory();
if (isDirectory) {
return path.join(fullPath, sep);
}
return fullPath;
});
} catch {
return [];
}
}
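
Usage sketch (file name hypothetical): a `~`-prefixed or relative prefix is expanded, and directory matches come back with a trailing separator so a follow-up completion can descend into them:

```ts
import { getFileSystemSuggestions } from "./get-file-system-suggestions"; // hypothetical path

// Completes "./sr" against the current directory; a directory match such as
// "src" is returned as "src/" so the next completion step descends into it.
console.log(getFileSystemSuggestions("./sr"));  // e.g. ["src/"]
console.log(getFileSystemSuggestions("~/Doc")); // e.g. ["/home/user/Documents/"]
```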

View File

@@ -0,0 +1,123 @@
import { execSync } from "node:child_process";
// The objects thrown by `child_process.execSync()` are `Error` instances that
// include additional, undocumented properties such as `status` (exit code) and
// `stdout` (captured standard output). Declare a minimal interface that captures
// just the fields we need so that we can avoid the use of `any` while keeping
// the checks type-safe.
interface ExecSyncError extends Error {
// Exit status code. When a diff is produced, git exits with code 1 which we
// treat as a non-error signal.
status?: number;
// Captured stdout. We rely on this to obtain the diff output when git exits
// with status 1.
stdout?: string;
}
// Type-guard that narrows an unknown value to `ExecSyncError`.
function isExecSyncError(err: unknown): err is ExecSyncError {
return (
typeof err === "object" && err != null && "status" in err && "stdout" in err
);
}
/**
* Returns the current Git diff for the working directory. If the current
* working directory is not inside a Git repository, `isGitRepo` will be
* false and `diff` will be an empty string.
*/
export function getGitDiff(): {
isGitRepo: boolean;
diff: string;
} {
try {
// First check whether we are inside a git repository. `rev-parse` exits
// with a non-zero status code if not.
execSync("git rev-parse --is-inside-work-tree", { stdio: "ignore" });
// If the above call didn't throw, we are inside a git repo. Retrieve the
// diff for tracked files **and** include any untracked files so that the
// `/diff` overlay shows a complete picture of the working tree state.
// 1. Diff for tracked files (unchanged behaviour)
let trackedDiff = "";
try {
trackedDiff = execSync("git diff --color", {
encoding: "utf8",
maxBuffer: 10 * 1024 * 1024, // 10 MB ought to be enough for now
});
} catch (err) {
// Exit status 1 simply means that differences were found. Capture the
// diff from stdout in that case. Re-throw for any other status codes.
if (
isExecSyncError(err) &&
err.status === 1 &&
typeof err.stdout === "string"
) {
trackedDiff = err.stdout;
} else {
throw err;
}
}
// 2. Determine untracked files.
// We use `git ls-files --others --exclude-standard` which outputs paths
// relative to the repository root, one per line. These are files that
// are not tracked *and* are not ignored by .gitignore.
const untrackedOutput = execSync(
"git ls-files --others --exclude-standard",
{
encoding: "utf8",
maxBuffer: 10 * 1024 * 1024,
},
);
const untrackedFiles = untrackedOutput
.split("\n")
.map((p) => p.trim())
.filter(Boolean);
let untrackedDiff = "";
const nullDevice = process.platform === "win32" ? "NUL" : "/dev/null";
for (const file of untrackedFiles) {
try {
// `git diff --no-index` produces a diff even outside the index by
// comparing two paths. We compare the file against /dev/null so that
// the file is treated as "new".
//
// `git diff --color --no-index /dev/null <file>` exits with status 1
// when differences are found, so we capture stdout from the thrown
// error object instead of letting it propagate.
execSync(`git diff --color --no-index -- "${nullDevice}" "${file}"`, {
encoding: "utf8",
stdio: ["ignore", "pipe", "ignore"],
maxBuffer: 10 * 1024 * 1024,
});
} catch (err) {
if (
isExecSyncError(err) &&
// Exit status 1 simply means that the two inputs differ, which is
// exactly what we expect here. Any other status code indicates a
// real error (e.g. the file disappeared between the ls-files and
// diff calls), so re-throw those.
err.status === 1 &&
typeof err.stdout === "string"
) {
untrackedDiff += err.stdout;
} else {
throw err;
}
}
}
// Concatenate tracked and untracked diffs.
const combinedDiff = `${trackedDiff}${untrackedDiff}`;
return { isGitRepo: true, diff: combinedDiff };
} catch {
// Either git is not installed or we're not inside a repository.
return { isGitRepo: false, diff: "" };
}
}
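
The status-1 convention above is worth calling out: `git diff` signals "differences found" via exit code 1, which `execSync` surfaces as a thrown error whose `stdout` still holds the diff. A standalone sketch of the same pattern:

```ts
import { execSync } from "node:child_process";

function diffOrEmpty(args: string): string {
  try {
    // Exit code 0 means no differences; the (empty) diff is returned directly.
    return execSync(`git diff ${args}`, { encoding: "utf8" });
  } catch (err) {
    const e = err as { status?: number; stdout?: string };
    // Exit status 1 just means "differences found"; the diff is on stdout.
    if (e.status === 1 && typeof e.stdout === "string") {
      return e.stdout;
    }
    throw err; // any other status is a real failure
  }
}

console.log(diffOrEmpty("--color") || "(working tree clean)");
```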

View File

@@ -2,6 +2,7 @@ import type { ResponseInputItem } from "openai/resources/responses/responses";
import { fileTypeFromBuffer } from "file-type";
import fs from "fs/promises";
import path from "path";
export async function createInputItem(
text: string,
@@ -14,17 +15,24 @@ export async function createInputItem(
};
for (const filePath of images) {
/* eslint-disable no-await-in-loop */
const binary = await fs.readFile(filePath);
const kind = await fileTypeFromBuffer(binary);
/* eslint-enable no-await-in-loop */
const encoded = binary.toString("base64");
const mime = kind?.mime ?? "application/octet-stream";
inputItem.content.push({
type: "input_image",
detail: "auto",
image_url: `data:${mime};base64,${encoded}`,
});
try {
/* eslint-disable no-await-in-loop */
const binary = await fs.readFile(filePath);
const kind = await fileTypeFromBuffer(binary);
/* eslint-enable no-await-in-loop */
const encoded = binary.toString("base64");
const mime = kind?.mime ?? "application/octet-stream";
inputItem.content.push({
type: "input_image",
detail: "auto",
image_url: `data:${mime};base64,${encoded}`,
});
} catch (err) {
inputItem.content.push({
type: "input_text",
text: `[missing image: ${path.basename(filePath)}]`,
});
}
}
return inputItem;

View File

@@ -124,6 +124,14 @@ export function log(message: string): void {
(logger ?? initLogger()).log(message);
}
/**
* USE SPARINGLY! This function should only be used to guard a call to log() if
* the log message is large and you want to avoid constructing it if logging is
* disabled.
*
* `log()` is already a no-op if DEBUG is not set, so an extra
* `isLoggingEnabled()` check is unnecessary.
*/
export function isLoggingEnabled(): boolean {
return (logger ?? initLogger()).isLoggingEnabled();
}

View File

@@ -0,0 +1,198 @@
export type ModelInfo = {
/** The human-readable label for this model */
label: string;
/** The max context window size for this model */
maxContextLength: number;
};
export type SupportedModelId = keyof typeof openAiModelInfo;
export const openAiModelInfo = {
"o1-pro-2025-03-19": {
label: "o1 Pro (2025-03-19)",
maxContextLength: 200000,
},
"o3": {
label: "o3",
maxContextLength: 200000,
},
"o3-2025-04-16": {
label: "o3 (2025-04-16)",
maxContextLength: 200000,
},
"o4-mini": {
label: "o4 Mini",
maxContextLength: 200000,
},
"gpt-4.1-nano": {
label: "GPT-4.1 Nano",
maxContextLength: 1000000,
},
"gpt-4.1-nano-2025-04-14": {
label: "GPT-4.1 Nano (2025-04-14)",
maxContextLength: 1000000,
},
"o4-mini-2025-04-16": {
label: "o4 Mini (2025-04-16)",
maxContextLength: 200000,
},
"gpt-4": {
label: "GPT-4",
maxContextLength: 8192,
},
"o1-preview-2024-09-12": {
label: "o1 Preview (2024-09-12)",
maxContextLength: 128000,
},
"gpt-4.1-mini": {
label: "GPT-4.1 Mini",
maxContextLength: 1000000,
},
"gpt-3.5-turbo-instruct-0914": {
label: "GPT-3.5 Turbo Instruct (0914)",
maxContextLength: 4096,
},
"gpt-4o-mini-search-preview": {
label: "GPT-4o Mini Search Preview",
maxContextLength: 128000,
},
"gpt-4.1-mini-2025-04-14": {
label: "GPT-4.1 Mini (2025-04-14)",
maxContextLength: 1000000,
},
"chatgpt-4o-latest": {
label: "ChatGPT-4o Latest",
maxContextLength: 128000,
},
"gpt-3.5-turbo-1106": {
label: "GPT-3.5 Turbo (1106)",
maxContextLength: 16385,
},
"gpt-4o-search-preview": {
label: "GPT-4o Search Preview",
maxContextLength: 128000,
},
"gpt-4-turbo": {
label: "GPT-4 Turbo",
maxContextLength: 128000,
},
"gpt-4o-realtime-preview-2024-12-17": {
label: "GPT-4o Realtime Preview (2024-12-17)",
maxContextLength: 128000,
},
"gpt-3.5-turbo-instruct": {
label: "GPT-3.5 Turbo Instruct",
maxContextLength: 4096,
},
"gpt-3.5-turbo": {
label: "GPT-3.5 Turbo",
maxContextLength: 16385,
},
"gpt-4-turbo-preview": {
label: "GPT-4 Turbo Preview",
maxContextLength: 128000,
},
"gpt-4o-mini-search-preview-2025-03-11": {
label: "GPT-4o Mini Search Preview (2025-03-11)",
maxContextLength: 128000,
},
"gpt-4-0125-preview": {
label: "GPT-4 (0125) Preview",
maxContextLength: 128000,
},
"gpt-4o-2024-11-20": {
label: "GPT-4o (2024-11-20)",
maxContextLength: 128000,
},
"o3-mini": {
label: "o3 Mini",
maxContextLength: 200000,
},
"gpt-4o-2024-05-13": {
label: "GPT-4o (2024-05-13)",
maxContextLength: 128000,
},
"gpt-4-turbo-2024-04-09": {
label: "GPT-4 Turbo (2024-04-09)",
maxContextLength: 128000,
},
"gpt-3.5-turbo-16k": {
label: "GPT-3.5 Turbo 16k",
maxContextLength: 16385,
},
"o3-mini-2025-01-31": {
label: "o3 Mini (2025-01-31)",
maxContextLength: 200000,
},
"o1-preview": {
label: "o1 Preview",
maxContextLength: 128000,
},
"o1-2024-12-17": {
label: "o1 (2024-12-17)",
maxContextLength: 128000,
},
"gpt-4-0613": {
label: "GPT-4 (0613)",
maxContextLength: 8192,
},
"o1": {
label: "o1",
maxContextLength: 128000,
},
"o1-pro": {
label: "o1 Pro",
maxContextLength: 200000,
},
"gpt-4.5-preview": {
label: "GPT-4.5 Preview",
maxContextLength: 128000,
},
"gpt-4.5-preview-2025-02-27": {
label: "GPT-4.5 Preview (2025-02-27)",
maxContextLength: 128000,
},
"gpt-4o-search-preview-2025-03-11": {
label: "GPT-4o Search Preview (2025-03-11)",
maxContextLength: 128000,
},
"gpt-4o": {
label: "GPT-4o",
maxContextLength: 128000,
},
"gpt-4o-mini": {
label: "GPT-4o Mini",
maxContextLength: 128000,
},
"gpt-4o-2024-08-06": {
label: "GPT-4o (2024-08-06)",
maxContextLength: 128000,
},
"gpt-4.1": {
label: "GPT-4.1",
maxContextLength: 1000000,
},
"gpt-4.1-2025-04-14": {
label: "GPT-4.1 (2025-04-14)",
maxContextLength: 1000000,
},
"gpt-4o-mini-2024-07-18": {
label: "GPT-4o Mini (2024-07-18)",
maxContextLength: 128000,
},
"o1-mini": {
label: "o1 Mini",
maxContextLength: 128000,
},
"gpt-3.5-turbo-0125": {
label: "GPT-3.5 Turbo (0125)",
maxContextLength: 16385,
},
"o1-mini-2024-09-12": {
label: "o1 Mini (2024-09-12)",
maxContextLength: 128000,
},
"gpt-4-1106-preview": {
label: "GPT-4 (1106) Preview",
maxContextLength: 128000,
},
} as const satisfies Record<string, ModelInfo>;

View File

@@ -1,4 +1,13 @@
import { OPENAI_API_KEY } from "./config";
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { approximateTokensUsed } from "./approximate-tokens-used.js";
import {
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getBaseUrl,
getApiKey,
} from "./config";
import { type SupportedModelId, openAiModelInfo } from "./model-info.js";
import OpenAI from "openai";
const MODEL_LIST_TIMEOUT_MS = 2_000; // 2 seconds
@@ -11,53 +20,58 @@ export const RECOMMENDED_MODELS: Array<string> = ["o4-mini", "o3"];
* enters interactive mode. The request is made exactly once during the
* lifetime of the process and the results are cached for subsequent calls.
*/
let modelsPromise: Promise<Array<string>> | null = null;
async function fetchModels(): Promise<Array<string>> {
// If the user has not configured an API key we cannot hit the network.
if (!OPENAI_API_KEY) {
return RECOMMENDED_MODELS;
async function fetchModels(provider: string): Promise<Array<string>> {
// If the user has not configured an API key we cannot retrieve the models.
if (!getApiKey(provider)) {
throw new Error("No API key configured for provider: " + provider);
}
try {
const openai = new OpenAI({ apiKey: OPENAI_API_KEY });
const list = await openai.models.list();
const headers: Record<string, string> = {};
if (OPENAI_ORGANIZATION) {
headers["OpenAI-Organization"] = OPENAI_ORGANIZATION;
}
if (OPENAI_PROJECT) {
headers["OpenAI-Project"] = OPENAI_PROJECT;
}
const openai = new OpenAI({
apiKey: getApiKey(provider),
baseURL: getBaseUrl(provider),
defaultHeaders: headers,
});
const list = await openai.models.list();
const models: Array<string> = [];
for await (const model of list as AsyncIterable<{ id?: string }>) {
if (model && typeof model.id === "string") {
models.push(model.id);
let modelStr = model.id;
// Gemini returns ids prefixed with "models/"; strip the prefix.
if (modelStr.startsWith("models/")) {
modelStr = modelStr.replace("models/", "");
}
models.push(modelStr);
}
}
return models.sort();
} catch {
} catch (error) {
return [];
}
}
export function preloadModels(): void {
if (!modelsPromise) {
// Fire-and-forget: callers that truly need the list should `await`
// `getAvailableModels()` instead.
void getAvailableModels();
}
}
export async function getAvailableModels(): Promise<Array<string>> {
if (!modelsPromise) {
modelsPromise = fetchModels();
}
return modelsPromise;
/** Returns the list of models available for the provided key / credentials. */
export async function getAvailableModels(
provider: string,
): Promise<Array<string>> {
return fetchModels(provider.toLowerCase());
}
/**
* Verify that the provided model identifier is present in the set returned by
* {@link getAvailableModels}. The list of models is fetched from the OpenAI
* `/models` endpoint the first time it is required and then cached in-process.
* Verifies that the provided model identifier is present in the set returned by
* {@link getAvailableModels}.
*/
export async function isModelSupportedForResponses(
provider: string,
model: string | undefined | null,
): Promise<boolean> {
if (
@@ -70,7 +84,7 @@ export async function isModelSupportedForResponses(
try {
const models = await Promise.race<Array<string>>([
getAvailableModels(),
getAvailableModels(provider),
new Promise<Array<string>>((resolve) =>
setTimeout(() => resolve([]), MODEL_LIST_TIMEOUT_MS),
),
@@ -88,3 +102,112 @@ export async function isModelSupportedForResponses(
return true;
}
}
/** Returns the maximum context length (in tokens) for a given model. */
export function maxTokensForModel(model: string): number {
if (model in openAiModelInfo) {
return openAiModelInfo[model as SupportedModelId].maxContextLength;
}
// fallback to heuristics for models not in the registry
const lower = model.toLowerCase();
if (lower.includes("32k")) {
return 32000;
}
if (lower.includes("16k")) {
return 16000;
}
if (lower.includes("8k")) {
return 8000;
}
if (lower.includes("4k")) {
return 4000;
}
return 128000; // Default to 128k for any other model.
}
/** Calculates the percentage of tokens remaining in context for a model. */
export function calculateContextPercentRemaining(
items: Array<ResponseItem>,
model: string,
): number {
const used = approximateTokensUsed(items);
const max = maxTokensForModel(model);
const remaining = Math.max(0, max - used);
return (remaining / max) * 100;
}
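
Worked example of the fallback chain: a registry hit returns the exact window, otherwise the name heuristics apply, and anything else defaults to 128k. The import path is hypothetical:

```ts
import { maxTokensForModel } from "./model-utils"; // hypothetical path

console.log(maxTokensForModel("o4-mini"));       // 200000 (registry hit)
console.log(maxTokensForModel("custom-32k"));    // 32000  (name heuristic)
console.log(maxTokensForModel("mystery-model")); // 128000 (default)

// calculateContextPercentRemaining then reports, e.g. with ~50k tokens
// used against o4-mini: (200000 - 50000) / 200000 = 75% remaining.
```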
/**
* Typeguard that narrows a {@link ResponseItem} to one that represents a
* user-authored message. The OpenAI SDK represents both input *and* output
* messages with a discriminated union where:
* • `type` is the string literal "message" and
* • `role` is one of "user" | "assistant" | "system" | "developer".
*
* For the purposes of deduplication we only care about *user* messages so we
* detect those here in a single, reusable helper.
*/
function isUserMessage(
item: ResponseItem,
): item is ResponseItem & { type: "message"; role: "user"; content: unknown } {
return item.type === "message" && (item as { role?: string }).role === "user";
}
/**
* Deduplicate the stream of {@link ResponseItem}s before they are persisted in
* component state.
*
* Historically we used the (optional) {@code id} field returned by the
* OpenAI streaming API as the primary key: the first occurrence of any given
* {@code id} “won” and subsequent duplicates were dropped. In practice this
* proved brittle because locally-generated user messages don't include an
* {@code id}. The result was that if a user quickly pressed <Enter> twice, the
* exact same message would appear twice in the transcript.
*
* The new rules are therefore:
* 1. If a {@link ResponseItem} has an {@code id} keep only the *first*
* occurrence of that {@code id} (this retains the previous behaviour for
* assistant / tool messages).
* 2. Additionally, collapse *consecutive* user messages with identical
* content. Two messages are considered identical when their serialized
* {@code content} array matches exactly. We purposefully restrict this
* to **adjacent** duplicates so that legitimately repeated questions at
* a later point in the conversation are still shown.
*/
export function uniqueById(items: Array<ResponseItem>): Array<ResponseItem> {
const seenIds = new Set<string>();
const deduped: Array<ResponseItem> = [];
for (const item of items) {
// ──────────────────────────────────────────────────────────────────
// Rule #1: deduplicate by id when present
// ──────────────────────────────────────────────────────────────────
if (typeof item.id === "string" && item.id.length > 0) {
if (seenIds.has(item.id)) {
continue; // skip duplicates
}
seenIds.add(item.id);
}
// ──────────────────────────────────────────────────────────────────
// Rule #2: collapse consecutive identical user messages
// ──────────────────────────────────────────────────────────────────
if (isUserMessage(item) && deduped.length > 0) {
const prev = deduped[deduped.length - 1]!;
if (
isUserMessage(prev) &&
// Note: the `content` field is an array of message parts. Performing
// a deep compare is overkill here; serialising to JSON is sufficient
// (and fast for the tiny payloads involved).
JSON.stringify(prev.content) === JSON.stringify(item.content)
) {
continue; // skip duplicate user message
}
}
deduped.push(item);
}
return deduped;
}
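
A small sketch of rule #2, with a hypothetical import path and fabricated items (no `id`s, as with locally-generated user messages):

```ts
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { uniqueById } from "./model-utils"; // hypothetical path

const user = (text: string) =>
  ({
    type: "message",
    role: "user",
    content: [{ type: "input_text", text }],
  }) as unknown as ResponseItem;

// The double-<Enter> duplicate collapses, but the same question asked
// again later in the conversation is kept.
const items = [user("run tests"), user("run tests"), user("lint"), user("run tests")];
console.log(uniqueById(items).length); // 3
```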

View File

@@ -0,0 +1,73 @@
import type { AgentName } from "package-manager-detector";
import { execFileSync } from "node:child_process";
import { join, resolve } from "node:path";
import which from "which";
function isInstalled(manager: AgentName): boolean {
try {
which.sync(manager);
return true;
} catch {
return false;
}
}
function getGlobalBinDir(manager: AgentName): string | undefined {
if (!isInstalled(manager)) {
return;
}
try {
switch (manager) {
case "npm": {
const stdout = execFileSync("npm", ["prefix", "-g"], {
encoding: "utf-8",
});
return join(stdout.trim(), "bin");
}
case "pnpm": {
// pnpm bin -g prints the bin dir
const stdout = execFileSync("pnpm", ["bin", "-g"], {
encoding: "utf-8",
});
return stdout.trim();
}
case "bun": {
// bun pm bin -g prints your bun global bin folder
const stdout = execFileSync("bun", ["pm", "bin", "-g"], {
encoding: "utf-8",
});
return stdout.trim();
}
default:
return undefined;
}
} catch {
// ignore
}
return undefined;
}
export async function detectInstallerByPath(): Promise<AgentName | undefined> {
// e.g. /usr/local/bin/codex
const invoked = process.argv[1] && resolve(process.argv[1]);
if (!invoked) {
return;
}
const supportedManagers: Array<AgentName> = ["npm", "pnpm", "bun"];
for (const mgr of supportedManagers) {
const binDir = getGlobalBinDir(mgr);
if (binDir && invoked.startsWith(binDir)) {
return mgr;
}
}
return undefined;
}

View File

@@ -81,7 +81,13 @@ export function parseToolCallArguments(
}
const { cmd, command } = json as Record<string, unknown>;
const commandArray = toStringArray(cmd) ?? toStringArray(command);
// The OpenAI model sometimes produces a single string instead of an array.
// Accept both shapes:
const commandArray =
toStringArray(cmd) ??
toStringArray(command) ??
(typeof cmd === "string" ? [cmd] : undefined) ??
(typeof command === "string" ? [command] : undefined);
if (commandArray == null) {
return undefined;
}
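
The effect of the widened parser, sketched with hypothetical calls (the return shape is elided here since the surrounding function is not shown in full):

```ts
import { parseToolCallArguments } from "./parsers"; // hypothetical path

// Array shape: used as-is.
parseToolCallArguments(JSON.stringify({ cmd: ["ls", "-la"] }));

// Bare string shape: wrapped as a single element, ["ls -la"];
// note the string is NOT shell-split into separate argv entries.
parseToolCallArguments(JSON.stringify({ command: "ls -la" }));
```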

View File

@@ -0,0 +1,45 @@
export const providers: Record<
string,
{ name: string; baseURL: string; envKey: string }
> = {
openai: {
name: "OpenAI",
baseURL: "https://api.openai.com/v1",
envKey: "OPENAI_API_KEY",
},
openrouter: {
name: "OpenRouter",
baseURL: "https://openrouter.ai/api/v1",
envKey: "OPENROUTER_API_KEY",
},
gemini: {
name: "Gemini",
baseURL: "https://generativelanguage.googleapis.com/v1beta/openai",
envKey: "GEMINI_API_KEY",
},
ollama: {
name: "Ollama",
baseURL: "http://localhost:11434/v1",
envKey: "OLLAMA_API_KEY",
},
mistral: {
name: "Mistral",
baseURL: "https://api.mistral.ai/v1",
envKey: "MISTRAL_API_KEY",
},
deepseek: {
name: "DeepSeek",
baseURL: "https://api.deepseek.com",
envKey: "DEEPSEEK_API_KEY",
},
xai: {
name: "xAI",
baseURL: "https://api.x.ai/v1",
envKey: "XAI_API_KEY",
},
groq: {
name: "Groq",
baseURL: "https://api.groq.com/openai/v1",
envKey: "GROQ_API_KEY",
},
};
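
Since `loadConfig` spreads user-configured providers over this table (`{ ...providers, ...storedConfig.providers }`), wiring up an extra endpoint needs no code change. A hypothetical entry, names invented for illustration:

```ts
// Hypothetical user-defined provider as it would appear in config.json,
// merged over the defaults above by loadConfig:
const userProviders = {
  myproxy: {
    name: "MyProxy",
    baseURL: "https://llm.internal.example/v1",
    envKey: "MYPROXY_API_KEY",
  },
};
```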

View File

@@ -0,0 +1,717 @@
import type { OpenAI } from "openai";
import type {
ResponseCreateParams,
Response,
} from "openai/resources/responses/responses";
// Define interfaces based on OpenAI API documentation
type ResponseCreateInput = ResponseCreateParams;
type ResponseOutput = Response;
// interface ResponseOutput {
// id: string;
// object: 'response';
// created_at: number;
// status: 'completed' | 'failed' | 'in_progress' | 'incomplete';
// error: { code: string; message: string } | null;
// incomplete_details: { reason: string } | null;
// instructions: string | null;
// max_output_tokens: number | null;
// model: string;
// output: Array<{
// type: 'message';
// id: string;
// status: 'completed' | 'in_progress';
// role: 'assistant';
// content: Array<{
// type: 'output_text' | 'function_call';
// text?: string;
// annotations?: Array<any>;
// tool_call?: {
// id: string;
// type: 'function';
// function: { name: string; arguments: string };
// };
// }>;
// }>;
// parallel_tool_calls: boolean;
// previous_response_id: string | null;
// reasoning: { effort: string | null; summary: string | null };
// store: boolean;
// temperature: number;
// text: { format: { type: 'text' } };
// tool_choice: string | object;
// tools: Array<any>;
// top_p: number;
// truncation: string;
// usage: {
// input_tokens: number;
// input_tokens_details: { cached_tokens: number };
// output_tokens: number;
// output_tokens_details: { reasoning_tokens: number };
// total_tokens: number;
// } | null;
// user: string | null;
// metadata: Record<string, string>;
// }
// Define types for the ResponseItem content and parts
type ResponseContentPart = {
type: string;
[key: string]: unknown;
};
type ResponseItemType = {
type: string;
id?: string;
status?: string;
role?: string;
content?: Array<ResponseContentPart>;
[key: string]: unknown;
};
type ResponseEvent =
| { type: "response.created"; response: Partial<ResponseOutput> }
| { type: "response.in_progress"; response: Partial<ResponseOutput> }
| {
type: "response.output_item.added";
output_index: number;
item: ResponseItemType;
}
| {
type: "response.content_part.added";
item_id: string;
output_index: number;
content_index: number;
part: ResponseContentPart;
}
| {
type: "response.output_text.delta";
item_id: string;
output_index: number;
content_index: number;
delta: string;
}
| {
type: "response.output_text.done";
item_id: string;
output_index: number;
content_index: number;
text: string;
}
| {
type: "response.function_call_arguments.delta";
item_id: string;
output_index: number;
content_index: number;
delta: string;
}
| {
type: "response.function_call_arguments.done";
item_id: string;
output_index: number;
content_index: number;
arguments: string;
}
| {
type: "response.content_part.done";
item_id: string;
output_index: number;
content_index: number;
part: ResponseContentPart;
}
| {
type: "response.output_item.done";
output_index: number;
item: ResponseItemType;
}
| { type: "response.completed"; response: ResponseOutput }
| { type: "error"; code: string; message: string; param: string | null };
// Define a type for tool call data
type ToolCallData = {
id: string;
name: string;
arguments: string;
};
// Define a type for usage data
type UsageData = {
prompt_tokens?: number;
completion_tokens?: number;
total_tokens?: number;
input_tokens?: number;
input_tokens_details?: { cached_tokens: number };
output_tokens?: number;
output_tokens_details?: { reasoning_tokens: number };
[key: string]: unknown;
};
// Define a type for content output
type ResponseContentOutput =
| {
type: "function_call";
call_id: string;
name: string;
arguments: string;
[key: string]: unknown;
}
| {
type: "output_text";
text: string;
annotations: Array<unknown>;
[key: string]: unknown;
};
// Global map to store conversation histories
const conversationHistories = new Map<
string,
{
previous_response_id: string | null;
messages: Array<OpenAI.Chat.Completions.ChatCompletionMessageParam>;
}
>();
// Utility function to generate unique IDs
function generateId(prefix: string = "msg"): string {
return `${prefix}_${Math.random().toString(36).substr(2, 9)}`;
}
// Function to convert ResponseInputItem to ChatCompletionMessageParam
type ResponseInputItem = ResponseCreateInput["input"][number];
function convertInputItemToMessage(
item: string | ResponseInputItem,
): OpenAI.Chat.Completions.ChatCompletionMessageParam {
// Handle string inputs as content for a user message
if (typeof item === "string") {
return { role: "user", content: item };
}
// At this point we know it's a ResponseInputItem
const responseItem = item;
if (responseItem.type === "message") {
// Use a more specific type assertion for the message content
const content = Array.isArray(responseItem.content)
? responseItem.content
.filter((c) => typeof c === "object" && c.type === "input_text")
.map((c) =>
typeof c === "object" && "text" in c
? (c["text"] as string) || ""
: "",
)
.join("")
: "";
return { role: responseItem.role, content };
} else if (responseItem.type === "function_call_output") {
return {
role: "tool",
tool_call_id: responseItem.call_id,
content: responseItem.output,
};
}
throw new Error(`Unsupported input item type: ${responseItem.type}`);
}
// Function to get full messages including history
function getFullMessages(
input: ResponseCreateInput,
): Array<OpenAI.Chat.Completions.ChatCompletionMessageParam> {
let baseHistory: Array<OpenAI.Chat.Completions.ChatCompletionMessageParam> =
[];
if (input.previous_response_id) {
const prev = conversationHistories.get(input.previous_response_id);
if (!prev) {
throw new Error(
`Previous response not found: ${input.previous_response_id}`,
);
}
baseHistory = prev.messages;
}
// Handle both string and ResponseInputItem in input.input
const newInputMessages = Array.isArray(input.input)
? input.input.map(convertInputItemToMessage)
: [convertInputItemToMessage(input.input)];
const messages = [...baseHistory, ...newInputMessages];
if (
input.instructions &&
messages[0]?.role !== "system" &&
messages[0]?.role !== "developer"
) {
return [{ role: "system", content: input.instructions }, ...messages];
}
return messages;
}
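
A sketch of the conversion rules above, with hypothetical inputs: plain strings become user messages, `function_call_output` items become `tool` messages, and `instructions` is prepended as a system message only when the history does not already start with one:

```ts
// Hypothetical inputs run through convertInputItemToMessage:
convertInputItemToMessage("hello");
// -> { role: "user", content: "hello" }

convertInputItemToMessage({
  type: "function_call_output",
  call_id: "call_abc",
  output: "exit code 0",
} as unknown as Parameters<typeof convertInputItemToMessage>[0]);
// -> { role: "tool", tool_call_id: "call_abc", content: "exit code 0" }
```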
// Function to convert tools
function convertTools(
tools?: ResponseCreateInput["tools"],
): Array<OpenAI.Chat.Completions.ChatCompletionTool> | undefined {
return tools
?.filter((tool) => tool.type === "function")
.map((tool) => ({
type: "function" as const,
function: {
name: tool.name,
description: tool.description || undefined,
parameters: tool.parameters,
},
}));
}
const createCompletion = (openai: OpenAI, input: ResponseCreateInput) => {
const fullMessages = getFullMessages(input);
const chatTools = convertTools(input.tools);
const webSearchOptions = input.tools?.some(
(tool) => tool.type === "function" && tool.name === "web_search",
)
? {}
: undefined;
const chatInput: OpenAI.Chat.Completions.ChatCompletionCreateParams = {
model: input.model,
messages: fullMessages,
tools: chatTools,
web_search_options: webSearchOptions,
temperature: input.temperature,
top_p: input.top_p,
tool_choice: (input.tool_choice === "auto"
? "auto"
: input.tool_choice) as OpenAI.Chat.Completions.ChatCompletionCreateParams["tool_choice"],
stream: input.stream || false,
user: input.user,
metadata: input.metadata,
};
return openai.chat.completions.create(chatInput);
};
// Main function with overloading
async function responsesCreateViaChatCompletions(
openai: OpenAI,
input: ResponseCreateInput & { stream: true },
): Promise<AsyncGenerator<ResponseEvent>>;
async function responsesCreateViaChatCompletions(
openai: OpenAI,
input: ResponseCreateInput & { stream?: false },
): Promise<ResponseOutput>;
async function responsesCreateViaChatCompletions(
openai: OpenAI,
input: ResponseCreateInput,
): Promise<ResponseOutput | AsyncGenerator<ResponseEvent>> {
const completion = await createCompletion(openai, input);
if (input.stream) {
return streamResponses(
input,
completion as AsyncIterable<OpenAI.ChatCompletionChunk>,
);
} else {
return nonStreamResponses(
input,
completion as unknown as OpenAI.Chat.Completions.ChatCompletion,
);
}
}
// Non-streaming implementation
async function nonStreamResponses(
input: ResponseCreateInput,
completion: OpenAI.Chat.Completions.ChatCompletion,
): Promise<ResponseOutput> {
const fullMessages = getFullMessages(input);
try {
const chatResponse = completion;
if (!("choices" in chatResponse) || chatResponse.choices.length === 0) {
throw new Error("No choices in chat completion response");
}
const assistantMessage = chatResponse.choices?.[0]?.message;
if (!assistantMessage) {
throw new Error("No assistant message in chat completion response");
}
// Construct ResponseOutput
const responseId = generateId("resp");
const outputItemId = generateId("msg");
const outputContent: Array<ResponseContentOutput> = [];
// Check if the response contains tool calls
const hasFunctionCalls =
assistantMessage.tool_calls && assistantMessage.tool_calls.length > 0;
if (hasFunctionCalls && assistantMessage.tool_calls) {
for (const toolCall of assistantMessage.tool_calls) {
if (toolCall.type === "function") {
outputContent.push({
type: "function_call",
call_id: toolCall.id,
name: toolCall.function.name,
arguments: toolCall.function.arguments,
});
}
}
}
if (assistantMessage.content) {
outputContent.push({
type: "output_text",
text: assistantMessage.content,
annotations: [],
});
}
// Create response with appropriate status and properties
const responseOutput = {
id: responseId,
object: "response",
created_at: Math.floor(Date.now() / 1000),
status: hasFunctionCalls ? "requires_action" : "completed",
error: null,
incomplete_details: null,
instructions: null,
max_output_tokens: null,
model: chatResponse.model,
output: [
{
type: "message",
id: outputItemId,
status: "completed",
role: "assistant",
content: outputContent,
},
],
parallel_tool_calls: input.parallel_tool_calls ?? false,
previous_response_id: input.previous_response_id ?? null,
reasoning: null,
temperature: input.temperature,
text: { format: { type: "text" } },
tool_choice: input.tool_choice ?? "auto",
tools: input.tools ?? [],
top_p: input.top_p,
truncation: input.truncation ?? "disabled",
usage: chatResponse.usage
? {
input_tokens: chatResponse.usage.prompt_tokens,
input_tokens_details: { cached_tokens: 0 },
output_tokens: chatResponse.usage.completion_tokens,
output_tokens_details: { reasoning_tokens: 0 },
total_tokens: chatResponse.usage.total_tokens,
}
: undefined,
user: input.user ?? undefined,
metadata: input.metadata ?? {},
output_text: "",
} as ResponseOutput;
// Add required_action property for tool calls
if (hasFunctionCalls && assistantMessage.tool_calls) {
// Define type with required action
type ResponseWithAction = Partial<ResponseOutput> & {
required_action: unknown;
};
// Use the defined type for the assertion
(responseOutput as ResponseWithAction).required_action = {
type: "submit_tool_outputs",
submit_tool_outputs: {
tool_calls: assistantMessage.tool_calls.map((toolCall) => ({
id: toolCall.id,
type: toolCall.type,
function: {
name: toolCall.function.name,
arguments: toolCall.function.arguments,
},
})),
},
};
}
// Store history
const newHistory = [...fullMessages, assistantMessage];
conversationHistories.set(responseId, {
previous_response_id: input.previous_response_id ?? null,
messages: newHistory,
});
return responseOutput;
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error);
throw new Error(`Failed to process chat completion: ${errorMessage}`);
}
}
// Streaming implementation
async function* streamResponses(
input: ResponseCreateInput,
completion: AsyncIterable<OpenAI.ChatCompletionChunk>,
): AsyncGenerator<ResponseEvent> {
const fullMessages = getFullMessages(input);
const responseId = generateId("resp");
const outputItemId = generateId("msg");
let textContentAdded = false;
let textContent = "";
const toolCalls = new Map<number, ToolCallData>();
let usage: UsageData | null = null;
const finalOutputItem: Array<ResponseContentOutput> = [];
// Initial response
const initialResponse: Partial<ResponseOutput> = {
id: responseId,
object: "response" as const,
created_at: Math.floor(Date.now() / 1000),
status: "in_progress" as const,
model: input.model,
output: [],
error: null,
incomplete_details: null,
instructions: null,
max_output_tokens: null,
parallel_tool_calls: true,
previous_response_id: input.previous_response_id ?? null,
reasoning: null,
temperature: input.temperature,
text: { format: { type: "text" } },
tool_choice: input.tool_choice ?? "auto",
tools: input.tools ?? [],
top_p: input.top_p,
truncation: input.truncation ?? "disabled",
usage: undefined,
user: input.user ?? undefined,
metadata: input.metadata ?? {},
output_text: "",
};
yield { type: "response.created", response: initialResponse };
yield { type: "response.in_progress", response: initialResponse };
let isToolCall = false;
for await (const chunk of completion as AsyncIterable<OpenAI.ChatCompletionChunk>) {
// console.error('\nCHUNK: ', JSON.stringify(chunk));
const choice = chunk.choices[0];
if (!choice) {
continue;
}
if (
!isToolCall &&
(("tool_calls" in choice.delta && choice.delta.tool_calls) ||
choice.finish_reason === "tool_calls")
) {
isToolCall = true;
}
if (chunk.usage) {
usage = {
prompt_tokens: chunk.usage.prompt_tokens,
completion_tokens: chunk.usage.completion_tokens,
total_tokens: chunk.usage.total_tokens,
input_tokens: chunk.usage.prompt_tokens,
input_tokens_details: { cached_tokens: 0 },
output_tokens: chunk.usage.completion_tokens,
output_tokens_details: { reasoning_tokens: 0 },
};
}
if (isToolCall) {
for (const tcDelta of choice.delta.tool_calls || []) {
const tcIndex = tcDelta.index;
const content_index = textContentAdded ? tcIndex + 1 : tcIndex;
if (!toolCalls.has(tcIndex)) {
// New tool call
const toolCallId = tcDelta.id || generateId("call");
const functionName = tcDelta.function?.name || "";
yield {
type: "response.output_item.added",
item: {
type: "function_call",
id: outputItemId,
status: "in_progress",
call_id: toolCallId,
name: functionName,
arguments: "",
},
output_index: 0,
};
toolCalls.set(tcIndex, {
id: toolCallId,
name: functionName,
arguments: "",
});
}
if (tcDelta.function?.arguments) {
const current = toolCalls.get(tcIndex);
if (current) {
current.arguments += tcDelta.function.arguments;
yield {
type: "response.function_call_arguments.delta",
item_id: outputItemId,
output_index: 0,
content_index,
delta: tcDelta.function.arguments,
};
}
}
}
if (choice.finish_reason === "tool_calls") {
for (const [tcIndex, tc] of toolCalls) {
const item = {
type: "function_call",
id: outputItemId,
status: "completed",
call_id: tc.id,
name: tc.name,
arguments: tc.arguments,
};
yield {
type: "response.function_call_arguments.done",
item_id: outputItemId,
output_index: tcIndex,
content_index: textContentAdded ? tcIndex + 1 : tcIndex,
arguments: tc.arguments,
};
yield {
type: "response.output_item.done",
output_index: tcIndex,
item,
};
finalOutputItem.push(item as unknown as ResponseContentOutput);
}
} else {
continue;
}
} else {
if (!textContentAdded) {
yield {
type: "response.content_part.added",
item_id: outputItemId,
output_index: 0,
content_index: 0,
part: { type: "output_text", text: "", annotations: [] },
};
textContentAdded = true;
}
if (choice.delta.content?.length) {
yield {
type: "response.output_text.delta",
item_id: outputItemId,
output_index: 0,
content_index: 0,
delta: choice.delta.content,
};
textContent += choice.delta.content;
}
if (choice.finish_reason) {
yield {
type: "response.output_text.done",
item_id: outputItemId,
output_index: 0,
content_index: 0,
text: textContent,
};
yield {
type: "response.content_part.done",
item_id: outputItemId,
output_index: 0,
content_index: 0,
part: { type: "output_text", text: textContent, annotations: [] },
};
const item = {
type: "message",
id: outputItemId,
status: "completed",
role: "assistant",
content: [
{ type: "output_text", text: textContent, annotations: [] },
],
};
yield {
type: "response.output_item.done",
output_index: 0,
item,
};
finalOutputItem.push(item as unknown as ResponseContentOutput);
} else {
continue;
}
}
// Construct final response
const finalResponse: ResponseOutput = {
id: responseId,
object: "response" as const,
created_at: initialResponse.created_at || Math.floor(Date.now() / 1000),
status: "completed" as const,
error: null,
incomplete_details: null,
instructions: null,
max_output_tokens: null,
model: chunk.model || input.model,
output: finalOutputItem as unknown as ResponseOutput["output"],
parallel_tool_calls: true,
previous_response_id: input.previous_response_id ?? null,
reasoning: null,
temperature: input.temperature,
text: { format: { type: "text" } },
tool_choice: input.tool_choice ?? "auto",
tools: input.tools ?? [],
top_p: input.top_p,
truncation: input.truncation ?? "disabled",
usage: usage as ResponseOutput["usage"],
user: input.user ?? undefined,
metadata: input.metadata ?? {},
output_text: "",
} as ResponseOutput;
// Store history
const assistantMessage: OpenAI.Chat.Completions.ChatCompletionMessageParam =
{
role: "assistant" as const,
};
if (textContent) {
assistantMessage.content = textContent;
}
// Add tool_calls property if needed
if (toolCalls.size > 0) {
const toolCallsArray = Array.from(toolCalls.values()).map((tc) => ({
id: tc.id,
type: "function" as const,
function: { name: tc.name, arguments: tc.arguments },
}));
// Define a more specific type for the assistant message with tool calls
type AssistantMessageWithToolCalls =
OpenAI.Chat.Completions.ChatCompletionMessageParam & {
tool_calls: Array<{
id: string;
type: "function";
function: {
name: string;
arguments: string;
};
}>;
};
// Use type assertion with the defined type
(assistantMessage as AssistantMessageWithToolCalls).tool_calls =
toolCallsArray;
}
const newHistory = [...fullMessages, assistantMessage];
conversationHistories.set(responseId, {
previous_response_id: input.previous_response_id ?? null,
messages: newHistory,
});
yield { type: "response.completed", response: finalResponse };
}
}
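
Consuming the shim's streaming overload then looks like consuming a native Responses stream. A minimal sketch, assuming an `openai` client instance is already constructed elsewhere:

```ts
import type { OpenAI } from "openai";
declare const openai: OpenAI; // assumed to be constructed elsewhere

const stream = await responsesCreateViaChatCompletions(openai, {
  model: "gpt-4.1-mini",
  input: "Summarize the working-tree diff",
  stream: true,
});

for await (const event of stream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta); // incremental assistant text
  } else if (event.type === "response.completed") {
    console.log("\nresponse id:", event.response.id);
  }
}
```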
export {
responsesCreateViaChatCompletions,
ResponseCreateInput,
ResponseOutput,
ResponseEvent,
};

View File

@@ -1,138 +0,0 @@
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import { approximateTokensUsed } from "./approximate-tokens-used.js";
import {
estimateCostFromUsage,
pricePerToken,
type UsageBreakdown,
} from "./estimate-cost.js";
/**
* Simple accumulator for {@link ResponseItem}s that exposes aggregate token
* and (approximate) dollar-cost statistics for the current conversation.
*/
export class SessionCostTracker {
private readonly model: string;
private readonly items: Array<ResponseItem> = [];
private tokensUsedPrecise: number | null = null;
/**
* Aggregated exact cost when we have detailed `usage` information from the
* OpenAI API. Falls back to `null` when we only have the rough estimate
* path available.
*/
private costPrecise: number | null = null;
constructor(model: string) {
this.model = model;
}
/** Append newly-received items to the internal history. */
addItems(items: Array<ResponseItem>): void {
this.items.push(...items);
}
/**
* Add a full usage breakdown as returned by the Responses API. This gives
* us exact token counts and allows true-to-spec cost accounting that
* factors in cached tokens.
*/
addUsage(usage: UsageBreakdown): void {
const tokens =
usage.total_tokens ??
(usage.input_tokens ?? 0) + (usage.output_tokens ?? 0);
if (Number.isFinite(tokens) && tokens > 0) {
this.tokensUsedPrecise = (this.tokensUsedPrecise ?? 0) + tokens;
}
const cost = estimateCostFromUsage(usage, this.model);
if (cost != null) {
this.costPrecise = (this.costPrecise ?? 0) + cost;
}
}
/** Legacy helper for callers that only know the total token count. */
addTokens(count: number): void {
if (Number.isFinite(count) && count > 0) {
this.tokensUsedPrecise = (this.tokensUsedPrecise ?? 0) + count;
// We deliberately do *not* update costPrecise here; without a detailed
// breakdown we cannot know whether tokens were input/output/cached. We
// therefore fall back to the blended rate during `getCostUSD()`.
}
}
/** Approximate total token count so far. */
getTokensUsed(): number {
if (this.tokensUsedPrecise != null) {
return this.tokensUsedPrecise;
}
return approximateTokensUsed(this.items);
}
/** Best-effort USD cost estimate. Returns `null` when the model is unknown. */
getCostUSD(): number | null {
if (this.costPrecise != null) {
return this.costPrecise;
}
const per = pricePerToken(this.model);
if (per == null) {
return null;
}
return this.getTokensUsed() * per;
}
/**
* Human-readable one-liner suitable for printing at session end (e.g. on
* Ctrl-C or `/clear`).
*/
summary(): string {
const tokens = this.getTokensUsed();
const cost = this.getCostUSD();
if (cost == null) {
return `Session complete - approx. ${tokens} tokens used.`;
}
return `Session complete - approx. ${tokens} tokens, $${cost.toFixed(
4,
)} USD.`;
}
}
// ────────────────────────────────────────────────────────────────────────────
// Global helpers so disparate parts of the codebase can share a single
// tracker instance without threading it through countless function calls.
// ────────────────────────────────────────────────────────────────────────────
let globalTracker: SessionCostTracker | null = null;
export function getSessionTracker(): SessionCostTracker | null {
return globalTracker;
}
export function ensureSessionTracker(model: string): SessionCostTracker {
if (!globalTracker) {
globalTracker = new SessionCostTracker(model);
}
return globalTracker;
}
export function resetSessionTracker(): void {
globalTracker = null;
}
/**
* Convenience helper that prints the session summary (if any) and resets the
* global tracker so that the next conversation starts with a clean slate.
*/
export function printAndResetSessionSummary(): void {
if (!globalTracker) {
return; // nothing to do
}
// eslint-disable-next-line no-console -- explicit, uservisible log
console.log("\n" + globalTracker.summary() + "\n");
resetSessionTracker();
}
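
Before its removal, the module above composed like this - a minimal sketch, assuming `UsageBreakdown` carries the three token fields read by `addUsage`, with the model name and payload invented for illustration:

```
import { ensureSessionTracker, printAndResetSessionSummary } from "./session-cost.js";

const tracker = ensureSessionTracker("gpt-4.1"); // example model name
tracker.addUsage({ input_tokens: 1200, output_tokens: 350, total_tokens: 1550 });
tracker.getTokensUsed(); // 1550 (precise, since addUsage() was called)
printAndResetSessionSummary(); // logs the one-liner and clears the global tracker
```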

View File

@@ -1,4 +1,4 @@
export const CLI_VERSION = "0.1.2504172351"; // Must be in sync with package.json.
export const CLI_VERSION = "0.1.2504251709"; // Must be in sync with package.json.
export const ORIGIN = "codex_cli_ts";
export type TerminalChatSession = {

View File

@@ -0,0 +1,35 @@
// Defines the available slash commands and their descriptions.
// Used for autocompletion in the chat input.
export interface SlashCommand {
command: string;
description: string;
}
export const SLASH_COMMANDS: Array<SlashCommand> = [
{
command: "/clear",
description: "Clear conversation history and free up context",
},
{
command: "/clearhistory",
description: "Clear command history",
},
{
command: "/compact",
description:
"Clear conversation history but keep a summary in context. Optional: /compact [instructions for summarization]",
},
{ command: "/history", description: "Open command history" },
{ command: "/help", description: "Show list of commands" },
{ command: "/model", description: "Open model selection panel" },
{ command: "/approval", description: "Open approval mode selection panel" },
{
command: "/bug",
description: "Generate a prefilled GitHub issue URL with session log",
},
{
command: "/diff",
description:
"Show git diff of the working directory (or applied patches if not in git)",
},
];
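
Autocompletion over this table only needs a prefix filter; a sketch (the chat-input component itself is not part of this diff):

```
import { SLASH_COMMANDS, type SlashCommand } from "./slash-commands.js";

function matchSlashCommands(input: string): Array<SlashCommand> {
  if (!input.startsWith("/")) {
    return []; // only complete once the user has typed a slash
  }
  return SLASH_COMMANDS.filter(({ command }) => command.startsWith(input));
}

matchSlashCommands("/c"); // -> /clear, /clearhistory, /compact
```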

View File

@@ -1,12 +1,13 @@
import { log } from "../logger/log.js";
import { existsSync } from "fs";
import fs from "fs/promises";
import os from "os";
import path from "path";
const HISTORY_FILE = path.join(os.homedir(), ".codex", "history.json");
const DEFAULT_HISTORY_SIZE = 1000;
const DEFAULT_HISTORY_SIZE = 10_000;
// Regex patterns for sensitive commands that should not be saved
// Regex patterns for sensitive commands that should not be saved.
const SENSITIVE_PATTERNS = [
/\b[A-Za-z0-9-_]{20,}\b/, // API keys and tokens
/\bpassword\b/i,
@@ -18,7 +19,7 @@ const SENSITIVE_PATTERNS = [
export interface HistoryConfig {
maxSize: number;
saveHistory: boolean;
sensitivePatterns: Array<string>; // Array of regex patterns as strings
sensitivePatterns: Array<string>; // Regex patterns.
}
export interface HistoryEntry {
@@ -32,9 +33,6 @@ export const DEFAULT_HISTORY_CONFIG: HistoryConfig = {
sensitivePatterns: [],
};
/**
* Loads command history from the history file
*/
export async function loadCommandHistory(): Promise<Array<HistoryEntry>> {
try {
if (!existsSync(HISTORY_FILE)) {
@@ -45,26 +43,21 @@ export async function loadCommandHistory(): Promise<Array<HistoryEntry>> {
const history = JSON.parse(data) as Array<HistoryEntry>;
return Array.isArray(history) ? history : [];
} catch (error) {
// Use error logger but for production would use a proper logging system
// eslint-disable-next-line no-console
console.error("Failed to load command history:", error);
log(`error: failed to load command history: ${error}`);
return [];
}
}
/**
* Saves command history to the history file
*/
export async function saveCommandHistory(
history: Array<HistoryEntry>,
config: HistoryConfig = DEFAULT_HISTORY_CONFIG,
): Promise<void> {
try {
// Create directory if it doesn't exist
// Create directory if it doesn't exist.
const dir = path.dirname(HISTORY_FILE);
await fs.mkdir(dir, { recursive: true });
// Trim history to max size
// Trim history to max size.
const trimmedHistory = history.slice(-config.maxSize);
await fs.writeFile(
@@ -73,14 +66,10 @@ export async function saveCommandHistory(
"utf-8",
);
} catch (error) {
// eslint-disable-next-line no-console
console.error("Failed to save command history:", error);
log(`error: failed to save command history: ${error}`);
}
}
/**
* Adds a command to history if it's not sensitive
*/
export async function addToHistory(
command: string,
history: Array<HistoryEntry>,
@@ -90,46 +79,41 @@ export async function addToHistory(
return history;
}
// Check if command contains sensitive information
if (isSensitiveCommand(command, config.sensitivePatterns)) {
// Skip commands with sensitive information.
if (commandHasSensitiveInfo(command, config.sensitivePatterns)) {
return history;
}
// Check for duplicate (don't add if it's the same as the last command)
// Check for duplicate (don't add if it's the same as the last command).
const lastEntry = history[history.length - 1];
if (lastEntry && lastEntry.command === command) {
return history;
}
// Add new entry
const newEntry: HistoryEntry = {
command,
timestamp: Date.now(),
};
const newHistory = [...history, newEntry];
// Save to file
// Add new entry.
const newHistory: Array<HistoryEntry> = [
...history,
{
command,
timestamp: Date.now(),
},
];
await saveCommandHistory(newHistory, config);
return newHistory;
}
/**
* Checks if a command contains sensitive information
*/
function isSensitiveCommand(
function commandHasSensitiveInfo(
command: string,
additionalPatterns: Array<string> = [],
): boolean {
// Check built-in patterns
// Check built-in patterns.
for (const pattern of SENSITIVE_PATTERNS) {
if (pattern.test(command)) {
return true;
}
}
// Check additional patterns from config
// Check additional patterns from config.
for (const patternStr of additionalPatterns) {
try {
const pattern = new RegExp(patternStr);
@@ -137,23 +121,19 @@ function isSensitiveCommand(
return true;
}
} catch (error) {
// Invalid regex pattern, skip it
// Invalid regex pattern, skip it.
}
}
return false;
}
/**
* Clears the command history
*/
export async function clearCommandHistory(): Promise<void> {
try {
if (existsSync(HISTORY_FILE)) {
await fs.writeFile(HISTORY_FILE, JSON.stringify([]), "utf-8");
}
} catch (error) {
// eslint-disable-next-line no-console
console.error("Failed to clear command history:", error);
log(`error: failed to clear command history: ${error}`);
}
}
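
End to end, the helpers in this file compose roughly as follows - a sketch assuming the truncated `addToHistory` signature takes the `HistoryConfig` as its third argument (the module path is hypothetical):

```
import {
  loadCommandHistory,
  addToHistory,
  DEFAULT_HISTORY_CONFIG,
} from "./history.js"; // hypothetical path

let history = await loadCommandHistory();
history = await addToHistory("git status", history, DEFAULT_HISTORY_CONFIG);
// A command matching a sensitive pattern (here the built-in /\bpassword\b/i)
// is returned unchanged and never written to disk:
history = await addToHistory("echo password=hunter2", history, DEFAULT_HISTORY_CONFIG);
```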

View File

@@ -1,20 +1,19 @@
/* eslint-disable no-console */
import type { ResponseItem } from "openai/resources/responses/responses";
import { loadConfig } from "../config";
import { log } from "../logger/log.js";
import fs from "fs/promises";
import os from "os";
import path from "path";
const SESSIONS_ROOT = path.join(os.homedir(), ".codex", "sessions");
async function saveRolloutToHomeSessions(
async function saveRolloutAsync(
sessionId: string,
items: Array<ResponseItem>,
): Promise<void> {
await fs.mkdir(SESSIONS_ROOT, { recursive: true });
const sessionId = crypto.randomUUID();
const timestamp = new Date().toISOString();
const ts = timestamp.replace(/[:.]/g, "-").slice(0, 10);
const filename = `rollout-${ts}-${sessionId}.json`;
@@ -39,23 +38,15 @@ async function saveRolloutToHomeSessions(
"utf8",
);
} catch (error) {
console.error(`Failed to save rollout to ${filePath}: `, error);
log(`error: failed to save rollout to ${filePath}: ${error}`);
}
}
let debounceTimer: NodeJS.Timeout | null = null;
let pendingItems: Array<ResponseItem> | null = null;
export function saveRollout(items: Array<ResponseItem>): void {
pendingItems = items;
if (debounceTimer) {
clearTimeout(debounceTimer);
}
debounceTimer = setTimeout(() => {
if (pendingItems) {
saveRolloutToHomeSessions(pendingItems).catch(() => {});
pendingItems = null;
}
debounceTimer = null;
}, 2000);
export function saveRollout(
sessionId: string,
items: Array<ResponseItem>,
): void {
// Best-effort. We also do not log here in case of failure as that should be taken care of
// by `saveRolloutAsync` already.
saveRolloutAsync(sessionId, items).catch(() => {});
}
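
With this change the caller owns the session id; a usage sketch (the id and empty item list are placeholders):

```
import type { ResponseItem } from "openai/resources/responses/responses";
import { saveRollout } from "./save-rollout.js"; // hypothetical path

const sessionId = crypto.randomUUID();
const items: Array<ResponseItem> = [];
saveRollout(sessionId, items); // fire-and-forget; failures are logged by saveRolloutAsync
```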

View File

@@ -1,9 +1,6 @@
import type { Instance } from "ink";
import type React from "react";
// Cost-tracking
import { printAndResetSessionSummary } from "./session-cost.js";
let inkRenderer: Instance | null = null;
// Track whether the cleanup routine has already executed so repeat calls are
@@ -53,6 +50,8 @@ export function clearTerminal(): void {
if (inkRenderer) {
inkRenderer.clear();
}
// Also clear scrollback and primary buffer to ensure a truly blank slate
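// (ESC[3J erases the scrollback, ESC[H homes the cursor, ESC[2J clears the visible screen.)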
process.stdout.write("\x1b[3J\x1b[H\x1b[2J");
}
export function onExit(): void {
@@ -82,12 +81,4 @@ export function onExit(): void {
/* best-effort: continue even if Ink throws */
}
}
// Finally, print a brief token/cost summary for the session - best effort
// only; errors are swallowed so that shutdown always succeeds.
try {
printAndResetSessionSummary();
} catch {
/* ignore */
}
}

View File

@@ -0,0 +1,12 @@
// Vitest Snapshot v1, https://vitest.dev/guide/snapshot.html
exports[`checkForUpdates() > renders a box when a newer version exists and no global installer 1`] = `
"
╭─────────────────────────────────────────────────╮
│ │
│ Update available! 1.0.0 → 2.0.0. │
│ To update, run bun add -g my-pkg to update. │
│ │
╰─────────────────────────────────────────────────╯
"
`;

View File

@@ -67,7 +67,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
}));
vi.mock("../src/format-command.js", () => ({
@@ -94,7 +94,7 @@ describe("cancel before first function_call", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
config: { model: "any", instructions: "", notify: false },
});

View File

@@ -74,7 +74,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
}));
vi.mock("../src/format-command.js", () => ({
@@ -102,7 +102,7 @@ describe("cancel clears previous_response_id", () => {
additionalWritableRoots: [],
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
config: { model: "any", instructions: "", notify: false },
});

View File

@@ -9,12 +9,11 @@ class FakeStream {
public controller = { abort: vi.fn() };
async *[Symbol.asyncIterator]() {
// Immediately start streaming an assistant message so that it is possible
// for a user-triggered cancellation that happens milliseconds later to
// arrive *after* the first token has already been emitted. This mirrors
// the real-world race where the UI shows nothing yet (network / rendering
// latency) even though the model has technically started responding.
// Introduce a delay to simulate network latency and allow for cancel() to be called
await new Promise((resolve) => setTimeout(resolve, 10));
// Mimic an assistant message containing the word "hello".
// Our fix should prevent this from being emitted after cancel() is called
yield {
type: "response.output_item.done",
item: {
@@ -86,9 +85,9 @@ vi.mock("../src/utils/agent/log.js", () => ({
}));
describe("Agent cancellation race", () => {
// We expect this test to highlight the current bug, so the suite should
// fail (red) until the underlying race condition in `AgentLoop` is fixed.
it("still emits the model answer even though cancel() was called", async () => {
// This test verifies our fix for the race condition where a cancelled message
// could still appear after the user cancels a request.
it("should not emit messages after cancel() is called", async () => {
const items: Array<any> = [];
const agent = new AgentLoop({
@@ -99,7 +98,7 @@ describe("Agent cancellation race", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => items.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -131,9 +130,8 @@ describe("Agent cancellation race", () => {
await new Promise((r) => setTimeout(r, 40));
const assistantMsg = items.find((i) => i.role === "assistant");
// The bug manifests if the assistant message is still present even though
// it belongs to the canceled run. We assert that it *should not* be
// delivered - this test will fail until the bug is fixed.
// Our fix should prevent the assistant message from being delivered after cancel
// Now that we've fixed it, the test should pass
expect(assistantMsg).toBeUndefined();
});
});
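
Conceptually, the guard this test locks in is a per-run generation counter: bump it on cancel, and drop any stream event whose run started under an older generation. A hypothetical reduction (not the actual AgentLoop source):

```
let generation = 0;

function cancel(): void {
  generation += 1; // invalidates whatever run is currently awaiting the stream
}

async function runOnce(emit: (item: unknown) => void): Promise<void> {
  const myGeneration = generation;
  const item = await nextStreamItem(); // stand-in for one streamed event
  if (myGeneration !== generation) {
    return; // cancel() arrived while we were awaiting - swallow the item
  }
  emit(item);
}

async function nextStreamItem(): Promise<unknown> {
  await new Promise((r) => setTimeout(r, 10));
  return { role: "assistant" };
}
```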

View File

@@ -52,7 +52,7 @@ vi.mock("../src/approvals.js", () => {
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -96,7 +96,7 @@ describe("Agent cancellation", () => {
received.push(item);
},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -144,7 +144,7 @@ describe("Agent cancellation", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (item) => received.push(item),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -0,0 +1,115 @@
import { describe, it, expect, vi } from "vitest";
// ---------------------------------------------------------------------------
// This regression test ensures that AgentLoop only surfaces each response item
// once even when the same item appears multiple times in the OpenAI streaming
// response (e.g. as an early `response.output_item.done` event *and* again in
// the final `response.completed` payload).
// ---------------------------------------------------------------------------
// Fake OpenAI stream that emits the *same* message twice: first as an
// incremental output event and then again in the turn completion payload.
class FakeStream {
public controller = { abort: vi.fn() };
async *[Symbol.asyncIterator]() {
// 1) Early incremental item.
yield {
type: "response.output_item.done",
item: {
type: "message",
id: "call-dedupe-1",
role: "assistant",
content: [{ type: "input_text", text: "Hello!" }],
},
} as any;
// 2) Turn completion containing the *same* item again.
yield {
type: "response.completed",
response: {
id: "resp-dedupe-1",
status: "completed",
output: [
{
type: "message",
id: "call-dedupe-1",
role: "assistant",
content: [{ type: "input_text", text: "Hello!" }],
},
],
},
} as any;
}
}
// Intercept the OpenAI SDK used inside AgentLoop so we can inject our fake
// streaming implementation.
vi.mock("openai", () => {
class FakeOpenAI {
public responses = {
create: async () => new FakeStream(),
};
}
class APIConnectionTimeoutError extends Error {}
return { __esModule: true, default: FakeOpenAI, APIConnectionTimeoutError };
});
// Stub approvals / formatting helpers not relevant here.
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
vi.mock("../src/format-command.js", () => ({
__esModule: true,
formatCommandForDisplay: (cmd: Array<string>) => cmd.join(" "),
}));
vi.mock("../src/utils/agent/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
// After the dependency mocks we can import the module under test.
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("AgentLoop deduplicates output items", () => {
it("invokes onItem exactly once for duplicate items with the same id", async () => {
const received: Array<any> = [];
const agent = new AgentLoop({
model: "any",
instructions: "",
config: { model: "any", instructions: "", notify: false },
approvalPolicy: { mode: "auto" } as any,
additionalWritableRoots: [],
onItem: (item) => received.push(item),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
const userMsg = [
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "hi" }],
},
];
await agent.run(userMsg as any);
// Give the setTimeout(3ms) inside AgentLoop.stageItem a chance to fire.
await new Promise((r) => setTimeout(r, 20));
// Count how many times the duplicate item surfaced.
const appearances = received.filter((i) => i.id === "call-dedupe-1").length;
expect(appearances).toBe(1);
});
});
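
The behavior this test pins down reduces to remembering which item ids have already been surfaced; a conceptual sketch (the real `stageItem` also debounces, which the test's 20 ms wait accounts for):

```
const seenIds = new Set<string>();

function surfaceOnce(item: { id?: string }, onItem: (i: unknown) => void): void {
  if (item.id != null) {
    if (seenIds.has(item.id)) {
      return; // already emitted via an earlier streaming event
    }
    seenIds.add(item.id);
  }
  onItem(item);
}
```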

View File

@@ -91,7 +91,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -121,7 +121,7 @@ describe("function_call_output includes original call ID", () => {
additionalWritableRoots: [],
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -26,7 +26,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -62,7 +62,7 @@ describe("AgentLoop generic network/server errors", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -106,7 +106,7 @@ describe("AgentLoop generic network/server errors", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -47,7 +47,7 @@ describe("Agent interrupt and continue", () => {
onLoading: (loading) => {
loadingState = loading;
},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -25,7 +25,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -61,7 +61,7 @@ describe("AgentLoop invalid request / 4xx errors", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -25,7 +25,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -64,7 +64,7 @@ describe("AgentLoop max_tokens too large error", () => {
approvalPolicy: { mode: "auto" } as any,
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -45,7 +45,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -112,7 +112,7 @@ describe("AgentLoop network resilience", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -154,7 +154,7 @@ describe("AgentLoop network resilience", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -50,13 +50,13 @@ vi.mock("openai", () => {
// The AgentLoop pulls these helpers in order to decide whether a command can
// be auto-approved. None of that matters for this test, so we stub the module
// with minimal noop implementations.
// with minimal no-op implementations.
vi.mock("../src/approvals.js", () => {
return {
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -119,7 +119,7 @@ describe("AgentLoop", () => {
approvalPolicy: { mode: "suggest" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -37,7 +37,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -82,7 +82,7 @@ describe("AgentLoop rate-limit handling", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -98,10 +98,8 @@ describe("AgentLoop ratelimit handling", () => {
// is in progress.
const runPromise = agent.run(userMsg as any);
// The agent waits 15 000 ms between retries (rate-limit backoff) and does
// this four times (after attempts 1-4). Fast-forward a bit more to cover
// any additional small `setTimeout` calls inside the implementation.
await vi.advanceTimersByTimeAsync(61_000); // 4 * 15s + 1s safety margin
// Should be done in at most 180 seconds.
await vi.advanceTimersByTimeAsync(180_000);
// Ensure the promise settles without throwing.
await expect(runPromise).resolves.not.toThrow();
@@ -110,8 +108,8 @@ describe("AgentLoop ratelimit handling", () => {
await vi.advanceTimersByTimeAsync(20);
// The OpenAI client should have been called the maximum number of retry
// attempts (5).
expect(openAiState.createSpy).toHaveBeenCalledTimes(5);
// attempts (8).
expect(openAiState.createSpy).toHaveBeenCalledTimes(8);
// Finally, verify that the user sees a helpful system message.
const sysMsg = received.find(

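The new timings (8 attempts settling within 180 s) are consistent with a capped-backoff retry loop of roughly this shape - illustrative only, with assumed constants rather than the shipped values:

```
const MAX_ATTEMPTS = 8;

async function withRetries<T>(call: () => Promise<T>): Promise<T> {
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await call();
    } catch (error) {
      if (attempt >= MAX_ATTEMPTS) {
        throw error; // out of attempts - surface the failure
      }
      // Grow the wait between attempts but cap it, so 8 attempts fit in 180 s.
      const delayMs = Math.min(1_000 * 2 ** (attempt - 1), 30_000);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```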
View File

@@ -35,7 +35,7 @@ vi.mock("openai", () => {
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false } as any),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
}));
@@ -100,7 +100,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -122,7 +122,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
expect(assistant?.content?.[0]?.text).toBe("ok");
});
it("fails after 3 attempts and surfaces system message", async () => {
it("fails after a few attempts and surfaces system message", async () => {
openAiState.createSpy = vi.fn(async () => {
const err: any = new Error("Internal Server Error");
err.status = 502; // any 5xx
@@ -138,7 +138,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
additionalWritableRoots: [],
onItem: (i) => received.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -154,7 +154,7 @@ describe("AgentLoop automatic retry on 5xx errors", () => {
await new Promise((r) => setTimeout(r, 20));
expect(openAiState.createSpy).toHaveBeenCalledTimes(5);
expect(openAiState.createSpy).toHaveBeenCalledTimes(8);
const sysMsg = received.find(
(i) =>

View File

@@ -54,7 +54,7 @@ vi.mock("../src/approvals.js", () => {
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () =>
({ type: "auto-approve", runInSandbox: false } as any),
({ type: "auto-approve", runInSandbox: false }) as any,
isSafeCommand: () => null,
};
});
@@ -116,7 +116,7 @@ describe("Agent terminate (hard cancel)", () => {
additionalWritableRoots: [],
onItem: (item) => received.push(item),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
@@ -152,7 +152,7 @@ describe("Agent terminate (hard cancel)", () => {
additionalWritableRoots: [],
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -110,7 +110,7 @@ describe("thinking time counter", () => {
additionalWritableRoots: [],
onItem: (i) => items.push(i),
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" } as any),
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});

View File

@@ -56,6 +56,34 @@ test("process_patch - update file", () => {
expect(fs.removals).toEqual([]);
});
// ---------------------------------------------------------------------------
// Unicode canonicalisation tests - hyphen / dash / quote look-alikes
// ---------------------------------------------------------------------------
test("process_patch tolerates hyphen/dash variants", () => {
// The file contains EN DASH (\u2013) and NO-BREAK HYPHEN (\u2011)
const original =
"first\nimport foo # local import \u2013 avoids top\u2011level dep\nlast";
const patch = `*** Begin Patch\n*** Update File: uni.txt\n@@\n-import foo # local import - avoids top-level dep\n+import foo # HANDLED\n*** End Patch`;
const fs = createInMemoryFS({ "uni.txt": original });
process_patch(patch, fs.openFn, fs.writeFn, fs.removeFn);
expect(fs.files["uni.txt"]!.includes("HANDLED")).toBe(true);
});
test.skip("process_patch tolerates smart quotes", () => {
const original = "console.log(\u201Chello\u201D);"; // “hello” with smart quotes
const patch = `*** Begin Patch\n*** Update File: quotes.js\n@@\n-console.log(\\"hello\\");\n+console.log(\\"HELLO\\");\n*** End Patch`;
const fs = createInMemoryFS({ "quotes.js": original });
process_patch(patch, fs.openFn, fs.writeFn, fs.removeFn);
expect(fs.files["quotes.js"]).toBe('console.log("HELLO");');
});
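
The canonicalisation these tests exercise amounts to mapping look-alike code points to their ASCII forms in both the file and the patch before matching; a sketch (the table in apply-patch presumably covers more characters):

```
const LOOKALIKES: Record<string, string> = {
  "\u2013": "-", // EN DASH
  "\u2014": "-", // EM DASH
  "\u2011": "-", // NO-BREAK HYPHEN
  "\u201C": '"', // left smart quote
  "\u201D": '"', // right smart quote
};

function canonicalise(s: string): string {
  return s.replace(/[\u2013\u2014\u2011\u201C\u201D]/g, (ch) => LOOKALIKES[ch] ?? ch);
}

canonicalise("top\u2011level dep"); // -> "top-level dep"
```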
test("process_patch - add file", () => {
const patch = `*** Begin Patch
*** Add File: b.txt

View File

@@ -11,7 +11,13 @@ describe("canAutoApprove()", () => {
const writeablePaths: Array<string> = [];
const check = (command: ReadonlyArray<string>): SafetyAssessment =>
canAutoApprove(command, "suggest", writeablePaths, env);
canAutoApprove(
command,
/* workdir */ undefined,
"suggest",
writeablePaths,
env,
);
test("simple safe commands", () => {
expect(check(["ls"])).toEqual({
@@ -73,7 +79,7 @@ describe("canAutoApprove()", () => {
test("true command is considered safe", () => {
expect(check(["true"])).toEqual({
type: "auto-approve",
reason: "Noop (true)",
reason: "No-op (true)",
group: "Utility",
runInSandbox: false,
});
@@ -89,4 +95,56 @@ describe("canAutoApprove()", () => {
expect(check(["cargo", "build"])).toEqual({ type: "ask-user" });
});
test("find", () => {
expect(check(["find", ".", "-name", "file.txt"])).toEqual({
type: "auto-approve",
reason: "Find files or directories",
group: "Searching",
runInSandbox: false,
});
// Options that can execute arbitrary commands.
expect(
check(["find", ".", "-name", "file.txt", "-exec", "rm", "{}", ";"]),
).toEqual({
type: "ask-user",
});
expect(
check(["find", ".", "-name", "*.py", "-execdir", "python3", "{}", ";"]),
).toEqual({
type: "ask-user",
});
expect(
check(["find", ".", "-name", "file.txt", "-ok", "rm", "{}", ";"]),
).toEqual({
type: "ask-user",
});
expect(
check(["find", ".", "-name", "*.py", "-okdir", "python3", "{}", ";"]),
).toEqual({
type: "ask-user",
});
// Option that deletes matching files.
expect(check(["find", ".", "-delete", "-name", "file.txt"])).toEqual({
type: "ask-user",
});
// Options that write pathnames to a file.
expect(check(["find", ".", "-fls", "/etc/passwd"])).toEqual({
type: "ask-user",
});
expect(check(["find", ".", "-fprint", "/etc/passwd"])).toEqual({
type: "ask-user",
});
expect(check(["find", ".", "-fprint0", "/etc/passwd"])).toEqual({
type: "ask-user",
});
expect(
check(["find", ".", "-fprintf", "/root/suid.txt", "%#m %u %p\n"]),
).toEqual({
type: "ask-user",
});
});
});
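
The rule these new cases encode is an argument denylist: `find` stays auto-approvable only when none of its options can execute commands, delete files, or write to arbitrary paths. A conceptual sketch (not the approvals.ts source):

```
const UNSAFE_FIND_OPTIONS = new Set([
  "-exec", "-execdir", "-ok", "-okdir", // can run arbitrary commands
  "-delete", // deletes matching files
  "-fls", "-fprint", "-fprint0", "-fprintf", // write pathnames to a file
]);

function findLooksSafe(command: ReadonlyArray<string>): boolean {
  return command[0] === "find" && !command.some((arg) => UNSAFE_FIND_OPTIONS.has(arg));
}
```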

Some files were not shown because too many files have changed in this diff.