Compare commits

114 Commits

Author SHA1 Message Date
Eugene Brevdo
d131992dbc feat(core): write template config 2025-06-30 14:26:46 -07:00
Michael Bolin
4cb3c76798 fix: softprops/action-gh-release@v2 should use existing tag instead of creating a new tag (#1436)
https://github.com/Homebrew/homebrew-core/pull/228521 details the issues
I was having with the **Source code (tar.gz)** artifact for our GitHub
releases not being quite right. I landed these PRs as stabs in the dark
to fix this:

- https://github.com/openai/codex/pull/1423
- https://github.com/openai/codex/pull/1430

Based on the insights from
https://github.com/Homebrew/homebrew-core/pull/228521, I think those
were wrong and the real problem was this:


6dad5c3b17/.github/workflows/rust-release.yml (L162)

That is, I was manufacturing a new tag name on the fly instead of using
the existing one.

This PR reverts #1423 and #1430 and hopefully fixes how `tag_name` is
set for the `softprops/action-gh-release@v2` step so the **Source code
(tar.gz)** includes the correct files. Assuming this works, this should
make the Homebrew formula straightforward.
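A minimal sketch of the corrected step, assuming the workflow runs on a
tag push (the real workflow's other inputs are omitted):

```yaml
- uses: softprops/action-gh-release@v2
  with:
    # Reuse the tag that triggered the workflow instead of
    # manufacturing a new tag name on the fly.
    tag_name: ${{ github.ref_name }}
```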
2025-06-30 12:10:48 -07:00
Michael Bolin
6dad5c3b17 feat: add query_params option to ModelProviderInfo to support Azure (#1435)
As discovered in https://github.com/openai/codex/issues/1365, the Azure
provider needs to be able to specify `api-version` as a query param, so
this PR introduces a generic `query_params` option to the
`model_providers` config so that an Azure provider can be defined as
follows:

```toml
[model_providers.azure]
name = "Azure"
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
query_params = { api-version = "2025-04-01-preview" }
```

This PR also updates the docs with this example.

While here, we also update `wire_api` to default to `"chat"`, as that is
likely the common case for someone defining an external provider.

Fixes https://github.com/openai/codex/issues/1365.
2025-06-30 11:39:54 -07:00
Michael Bolin
cd2d84d496 fix: need to check out the branch, not the tag (#1430)
This should have been done in https://github.com/openai/codex/pull/1423.
2025-06-29 10:18:50 -07:00
Michael Bolin
688100f7f4 chore: fix Rust release process so generated .tar.gz source works with Homebrew (#1423)
Looking at existing releases such as
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4,
the `.tar.gz` for the source code still seems to have `0.0.0` as the
`version` in `codex-rs/Cargo.toml` instead of what the tag seems to say
it should have:


b289c92070/codex-rs/Cargo.toml (L21)

ChatGPT claims:

> When GitHub generates the Source code (tar.gz) archive for a tag:
> - It uses the commit the tag points to.
> - But in some cases (e.g., shallow clones, GitHub CI, or local tools that only clone the default branch), that commit may not be included, and you might get an outdated view or nothing at all depending on how it’s fetched.
Trying this recommended fix.
2025-06-28 19:46:44 -07:00
Michael Bolin
f30bf4bbcf fix: support pre-release identifiers in tags (#1422)
Had to update the regex in the GitHub workflow to allow suffixes like
`-alpha.4`.
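For illustration, a hypothetical before/after of such a version regex
(the actual pattern in the workflow may differ):

```
# before: only plain x.y.z versions matched
^[0-9]+\.[0-9]+\.[0-9]+$
# after: optional pre-release suffix such as -alpha.4
^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.]+)?$
```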

Successfully ran:

```
./scripts/create_github_release.sh 0.1.0-alpha.4
```

to create
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4

and verified that when I run `codex --version`, it prints `codex-cli
0.1.0-alpha.4`.
2025-06-28 16:05:53 -07:00
Michael Bolin
1b7c8d2569 fix: build with codegen-units = 1 for profile.release (#1421)
Great suggestion from @zamazan4ik on
https://github.com/openai/codex/issues/1411.
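The change boils down to this `Cargo.toml` stanza:

```toml
[profile.release]
# Trade longer compile times for a better-optimized release binary.
codegen-units = 1
```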
2025-06-28 15:24:48 -07:00
Michael Bolin
4a341efe92 feat: highlight matching characters in fuzzy file search (#1420)
Using the new file-search API introduced in
https://github.com/openai/codex/pull/1419, matching characters are now
shown in bold in the TUI:


https://github.com/user-attachments/assets/8bbcc6c6-75a3-493f-8ea4-b2a063e09b3a

Fixes https://github.com/openai/codex/issues/1261
2025-06-28 15:04:23 -07:00
Michael Bolin
e2efe8da9c feat: introduce --compute-indices flag to codex-file-search (#1419)
This is a small quality-of-life feature: the addition of
`--compute-indices` to the CLI, which, when enabled, computes and sets
the `indices` field for each `FileMatch` returned by `run()`. Note we
only bother to compute `indices` once we have the top N results because
there could be a lot of intermediate "top N" results during the search
that are ultimately discarded.

When set, the indices are included in the JSON output when `--json` is
specified and the matching indices are displayed in bold when `--json`
is not specified.
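A hypothetical invocation, assuming the query is passed as a positional
argument:

```shell
codex-file-search --compute-indices --json main.rs
```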
2025-06-28 14:39:29 -07:00
Michael Bolin
5a0f236ca4 feat: add support for @ to do file search (#1401)
Introduces support for `@` to trigger a fuzzy-filename search in the
composer. Under the hood, this leverages
https://crates.io/crates/nucleo-matcher to do the fuzzy matching and
https://crates.io/crates/ignore to build up the list of file candidates
(so that it respects `.gitignore`).

For simplicity (at least for now), we do not do any caching between
searches like VS Code does for its file search:


1d89ed699b/src/vs/workbench/services/search/node/rawSearchService.ts (L212-L218)

Because we do not do any caching, I saw queries take up to three seconds
on large repositories with hundreds of thousands of files. For that
reason, we do not perform searches synchronously on each keystroke, but
instead dispatch an event to do the search on a background thread that
asynchronously reports back to the UI when the results are available.
This is largely handled by the `FileSearchManager` introduced in this
PR, which also has logic for debouncing requests so there is at most one
search in flight at a time.
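A minimal sketch of the debouncing idea, with hypothetical types and
timings (the real `FileSearchManager` is more involved):

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

// Keystrokes send queries over a channel; the background thread only
// searches once input has gone quiet, and a newer query replaces an
// older one that has not run yet.
fn spawn_search_worker(on_results: impl Fn(String) + Send + 'static) -> mpsc::Sender<String> {
    let (tx, rx) = mpsc::channel::<String>();
    thread::spawn(move || {
        let mut pending: Option<String> = None;
        loop {
            match rx.recv_timeout(Duration::from_millis(100)) {
                // A newer query arrived: drop the old one, restart the timer.
                Ok(query) => pending = Some(query),
                // Input went quiet: run the (expensive) search off the UI thread.
                Err(RecvTimeoutError::Timeout) => {
                    if let Some(query) = pending.take() {
                        on_results(format!("results for {query}"));
                    }
                }
                Err(RecvTimeoutError::Disconnected) => break,
            }
        }
    });
    tx
}
```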

While we could potentially polish and tune this feature further, it may
already be overengineered for how it will be used, in practice, so we
can improve things going forward if it turns out that this is not "good
enough" in the wild.

Note this feature does not work like `@` in the TypeScript CLI, which
was more like directory-based tab completion. In the Rust CLI, `@`
triggers a full-repo fuzzy-filename search.

Fixes https://github.com/openai/codex/issues/1261.
2025-06-28 13:47:42 -07:00
Michael Bolin
ff8ae1ffa1 feat: make file search cancellable (#1414)
Update `run()` to take `cancel_flag: Arc<AtomicBool>` that the worker
threads will periodically check to see if it is `true`, exiting early
(and returning empty results) if so.
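A sketch of the worker-loop shape this implies (candidate list and match
logic are placeholders):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

fn run(cancel_flag: Arc<AtomicBool>, candidates: &[String]) -> Vec<String> {
    let mut matches = Vec::new();
    for (i, path) in candidates.iter().enumerate() {
        // Periodically check whether the caller asked us to stop.
        if i % 1024 == 0 && cancel_flag.load(Ordering::Relaxed) {
            return Vec::new(); // cancelled: return empty results
        }
        if path.contains("src") {
            matches.push(path.clone());
        }
    }
    matches
}
```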
2025-06-27 20:01:45 -07:00
Michael Bolin
b3ad764532 chore: change arg from PathBuf to &Path (#1409)
Caller no longer needs to clone a `PathBuf`: can just pass `&Path`.
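In signature terms (function names invented for illustration):

```rust
use std::path::{Path, PathBuf};

// Before: taking ownership forced callers to clone their PathBuf.
fn index_dir_owned(dir: PathBuf) -> usize {
    dir.components().count()
}

// After: borrowing works for a PathBuf, a Path, or a converted &str alike.
fn index_dir(dir: &Path) -> usize {
    dir.components().count()
}
```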
2025-06-27 16:24:41 -07:00
Michael Bolin
a331a67b3e chore: change built_in_model_providers so "openai" is the only "bundled" provider (#1407)
As we are [close to releasing the Rust CLI
beta](https://github.com/openai/codex/discussions/1405), for the moment,
let's take a more neutral stance on what it takes to be a "built-in"
provider.

* For example, there seems to be a discrepancy around what the "right"
configuration for Gemini is: https://github.com/openai/codex/pull/881
* And while the current list of "built-in" providers are all arguably
"well-known" names, this raises a question of what to do about
potentially less familiar providers, such as
https://github.com/openai/codex/pull/1142. Do we just accept every pull
request like this, or is there some criteria a provider has to meet to
"qualify" to be bundled with Codex CLI?

I think that if we can establish clear ground rules for being a built-in
provider, then we can bring this back. But until then, I would rather
take a minimalist approach because if we decided to reverse our position
later, it would break folks who were depending on the presence of the
built-in providers.
2025-06-27 14:49:55 -07:00
Gabriel Peal
2e293ce903 Handle Ctrl+C quit when idle (#1402)
## Summary
- show a `Ctrl+C to quit` hint when pressing Ctrl+C with no active task
- exit when Ctrl+C is pressed again while the hint is visible
- clear the hint when a task begins or another key is pressed


https://github.com/user-attachments/assets/931e2d7c-1c80-4b45-9908-d119f74df23c



------
https://chatgpt.com/s/cd_685ec8875a308191beaa95886dc1379e

Fixes #1245
2025-06-27 13:37:11 -04:00
Michael Bolin
64feeb3803 fix: add tiebreaker logic for paths when scores are equal (#1400)
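The title captures the whole change; a hypothetical sketch of the
ordering rule:

```rust
// Sort by score descending, then by path ascending, so equal-score
// matches always appear in a deterministic order.
fn sort_matches(matches: &mut [(u32, String)]) {
    matches.sort_by(|a, b| b.0.cmp(&a.0).then_with(|| a.1.cmp(&b.1)));
}
```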
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1400).
* #1401
* __->__ #1400
2025-06-26 23:05:10 -07:00
Michael Bolin
fa0e17f83a feat: add support for /diff command (#1389)
Adds support for a `/diff` command comparable to the one available in
the TypeScript CLI.

<img width="1103" alt="Screenshot 2025-06-26 at 12 31 33 PM"
src="https://github.com/user-attachments/assets/5dc646ca-301f-41ff-92a7-595c68db64b6"
/>

While here, changed the `SlashCommand` enum so the declared variant
order is the order the commands appear in the popup menu. This way,
`/toggle-mouse-mode` is listed last, as it is the least likely to be
used.

Fixes https://github.com/openai/codex/issues/1253.
2025-06-26 13:03:31 -07:00
Gabriel Peal
a339a7bcce [Rust] Allow resuming a session that was killed with ctrl + c (#1387)
Previously, if you ctrl+c'd a conversation, all subsequent turns would
400 because the Responses API never got a response for one of its call
ids. This ensures that if we aren't sending a call id by hand, we
generate a synthetic aborted call.
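A sketch of what such a synthetic item might look like (exact fields and
wording may differ from the real implementation):

```rust
use serde_json::{json, Value};

// Pairing an output with the dangling call_id satisfies the Responses
// API's requirement that every function call receives a response.
fn synthetic_aborted_call_output(dangling_call_id: &str) -> Value {
    json!({
        "type": "function_call_output",
        "call_id": dangling_call_id,
        "output": "aborted by user",
    })
}
```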

Fixes #1244 


https://github.com/user-attachments/assets/5126354f-b970-45f5-8c65-f811bca8294a
2025-06-26 14:40:42 -04:00
Michael Bolin
fcfe43c7df feat: show number of tokens remaining in UI (#1388)
When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.
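For example, in `config.toml` (values are illustrative):

```toml
# Only needed if Codex does not recognize the model name.
model_context_window = 128000
model_max_output_tokens = 8192
```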

We then introduce a new event type to `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.

Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer:

![Screenshot 2025-06-25 at 11 20
55 PM](https://github.com/user-attachments/assets/6fd6982f-7247-4f14-84b2-2e600cb1fd49)

We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.

Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:


296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)

Fixes https://github.com/openai/codex/issues/1242
2025-06-25 23:31:11 -07:00
Michael Bolin
296996d74e feat: standalone file search CLI (#1386)
Standalone fuzzy filename search library that should be helpful in
addressing https://github.com/openai/codex/issues/1261.
2025-06-25 13:29:03 -07:00
Michael Bolin
50924101d2 feat: add --dangerously-bypass-approvals-and-sandbox (#1384)
This PR reworks `assess_command_safety()` so that the combination of
`AskForApproval::Never` and `SandboxPolicy::DangerFullAccess` ensures
that commands are run without _any_ sandbox and the user should never be
prompted. In turn, it adds support for a new
`--dangerously-bypass-approvals-and-sandbox` flag (that cannot be used
with `--approval-policy` or `--full-auto`) that sets both of those
options.
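Example usage (the prompt is illustrative):

```shell
# Runs commands with no sandbox and never prompts for approval.
codex --dangerously-bypass-approvals-and-sandbox "run the test suite"
```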

Fixes https://github.com/openai/codex/issues/1254
2025-06-25 12:36:10 -07:00
Michael Bolin
72082164c1 chore: rename AskForApproval::UnlessAllowListed to AskForApproval::UnlessTrusted (#1385)
We could just rename to `Untrusted` instead of `UnlessTrusted`, but I
think `AskForApproval::UnlessTrusted` reads a bit better.
2025-06-25 12:26:13 -07:00
Michael Bolin
e09691337d chore: improve docstring for --full-auto (#1379)
Reference `-c sandbox.mode=workspace-write` in the docstring and users
can read the config docs for `sandbox` for more information.
2025-06-25 09:13:36 -07:00
Michael Bolin
86d5a9d80d chore: rename unless-allow-listed to untrusted (#1378)
For the `approval_policy` config option, renames `unless-allow-listed`
to `untrusted`. In general, when it comes to exec'ing commands, I think
"trusted" is a more accurate term than "safe."

Also drops the `AskForApproval::AutoEdit` variant, as we were not really
making use of it, anyway.

Fixes https://github.com/openai/codex/issues/1250.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1378).
* #1379
* __->__ #1378
2025-06-24 22:19:21 -07:00
Michael Bolin
531ce7626f fix: pretty-print the sandbox config in the TUI/exec modes (#1376)
Now that https://github.com/openai/codex/pull/1373 simplified the
sandbox config, we can print something much simpler in the TUI (and in
`codex exec`) to summarize the sandbox config.

Before:

![Screenshot 2025-06-24 at 5 45
52 PM](https://github.com/user-attachments/assets/b7633efb-a619-43e1-9abe-7bb0be2d0ec0)

With this change:

![Screenshot 2025-06-24 at 5 46
44 PM](https://github.com/user-attachments/assets/8d099bdd-a429-4796-a08d-70931d984e4f)

For reference, my `config.toml` contains:

```
[sandbox]
mode = "workspace-write"
writable_roots = ["/tmp", "/Users/mbolin/.pyenv/shims"]
```

Fixes https://github.com/openai/codex/issues/1248
2025-06-24 17:48:51 -07:00
Michael Bolin
63363a54e5 chore: install just in the devcontainer for Linux development (#1375)
Apparently `just` was added to `apt` in Ubuntu 24, so this required
updating the Ubuntu version in the `Dockerfile` so that we could simply
`apt install just`.

That change initially conflicted with the custom `dev` user we were
using, but the end result seems simpler: we now just use the default
`ubuntu` user provided by Ubuntu 24.
2025-06-24 17:20:53 -07:00
Michael Bolin
6d65010aad chore: install clippy and rustfmt in the devcontainer for Linux development (#1374)
I discovered it was difficult to do development in the devcontainer
without these tools available.
2025-06-24 17:05:36 -07:00
Michael Bolin
0776d78357 feat: redesign sandbox config (#1373)
This is a major redesign of how sandbox configuration works and aims to
fix https://github.com/openai/codex/issues/1248. Specifically, it
replaces `sandbox_permissions` in `config.toml` (and the
`-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
three variants:

```toml
# Safest option: full disk is read-only, but writes and network access are disallowed.
[sandbox]
mode = "read-only"

# The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
# writable_roots can be used to specify additional writable folders.
[sandbox]
mode = "workspace-write"
writable_roots = []  # Optional, defaults to the empty list.
network_access = false  # Optional, defaults to false.

# Disable sandboxing: use at your own risk!!!
[sandbox]
mode = "danger-full-access"
```

This should make sandboxing easier to reason about. While we have
dropped support for `-s`, the way it works now is:

- no flags => `read-only`
- `--full-auto` => `workspace-write`
- currently, there is no way to specify `danger-full-access` via a CLI
flag, but we will revisit that as part of
https://github.com/openai/codex/issues/1254

Outstanding issue:

- As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
still conflating sandbox preferences with approval preferences in that
case, which needs to be cleaned up.
2025-06-24 16:59:47 -07:00
Eric Wright
ed5e848f3e add: responses api support for azure (#1321)
- Use Responses API for Azure provider endpoints
- Added a unit test to catch regression on the change from
`/chat/completions` to `/responses`
- Updated the default AOAI API version from `2025-03-01-preview` to
`2025-04-01-preview` to avoid user-facing 400 errors due to missing
summary support in the March API version.
- Changes have been tested locally on AOAI endpoints
2025-06-22 18:01:13 -07:00
Govind Kamtamneni
5aafe190e2 feat(ts): provider‑specific API‑key discovery and clearer Azure guidance (#1324)
## Summary

This PR refactors the Codex CLI authentication flow so that
**non-OpenAI** providers (for example **azure**, or any future addition)
can supply their API key through a dedicated environment variable
without triggering the OpenAI login flow.

Key behaviours introduced:

* When `provider !== "openai"` the CLI consults `src/utils/providers.ts`
to locate the correct environment variable (`AZURE_OPENAI_API_KEY`,
`GEMINI_API_KEY`, and so on) before considering any interactive login.
* Credit redemption (`--free`) and PKCE login now run **only** when the
provider is OpenAI, eliminating unwanted browser prompts for Azure and
others.
* User-facing error messages are revamped to guide Azure users to
**[https://ai.azure.com/](https://ai.azure.com)** and show the exact
variable name they must set.
* All code paths still export `OPENAI_API_KEY` so legacy scripts
continue to operate unchanged.

---

## Example `config.json`

```jsonc
{
  "model": "codex-mini",
  "provider": "azure",
  "providers": {
    "azure": {
      "name": "AzureOpenAI",
      "baseURL": "https://ai-<project-name>.openai.azure.com/openai",
      "envKey": "AZURE_OPENAI_API_KEY"
    }
  },
  "history": {
    "maxSize": 1000,
    "saveHistory": true,
    "sensitivePatterns": []
  }
}
```

With this file in `~/.codex/config.json`, a single command line is
enough:

```bash
export AZURE_OPENAI_API_KEY="<your-key>"
codex "Hello from Azure"
```

No browser window opens, and the CLI works in entirely non-interactive
mode.

---

## Rationale

The new flow enables Codex to run **asynchronously** in sandboxed
environments such as GitHub Actions pipelines. By passing `--provider
azure` (or setting it in `config.json`) and exporting the correct key,
CI/CD jobs can invoke Codex without any ChatGPT-style login or PKCE
round-trip. This unlocks fully automated testing and deployment
scenarios.

---

## What’s changed

| File | Type | Description |
| --- | --- | --- |
| `codex-cli/src/cli.tsx` | **feat / refactor** | +43 / -20 lines. Imports `providers`, adds early provider-specific key lookup, gates `--free` redemption, rewrites help text. |
| `src/utils/providers.ts` | **chore** | Now consumed by CLI for env-var discovery. |

---

## How to test

```bash
# Azure example
export AZURE_OPENAI_API_KEY="<your-key>"
codex --provider azure "Automated run in CI"

# OpenAI example (unchanged behaviour)
codex --provider openai --login "Standard OpenAI flow"
```

Expected outcomes:

* Azure and other provider paths are non-interactive when provider flag
is passed.
* The CLI always sets `OPENAI_API_KEY` for backward compatibility.

---

## Checklist

* [x] Logic behind provider-specific env-var lookup added.
* [x] Redundant OpenAI login steps removed for other providers.
* [x] Unit tests cover new branches.
* [x] README and sample config updated.
* [x] CI passes on all supported Node versions.

---

**Related work**

* #92
* #769 
* #1321



I have read the CLA Document and I hereby sign the CLA.
2025-06-22 17:56:36 -07:00
Michael Bolin
b73426c1c4 docs: update codex-rs/README.md to list new features in the Rust CLI (#1267)
Let users know about what the Rust CLI supports that the TypeScript CLI
doesn't!
2025-06-06 18:32:10 -07:00
Reilly Wood
345a38502d codex-rs: Rename /clear to /new, make it start an entirely new chat (#1264)
I noticed that `/clear` wasn't fully clearing chat history; it would
clear the chat history widgets _in the UI_, but the LLM still had access
to information from previous messages.

This PR renames `/clear` to `/new` for clarity as per Michael's
suggestion, resetting `app_state` to a fresh `ChatWidget`.
2025-06-06 16:29:37 -07:00
Michael Bolin
029f39b9da feat: port maybeRedeemCredits() from get-api-key.tsx to login_with_chatgpt.py (#1221)
This builds on https://github.com/openai/codex/pull/1212 and ports the
`maybeRedeemCredits()` function from `get-api-key.ts` to
`login_with_chatgpt.py`:


a80240cfdc/codex-cli/src/utils/get-api-key.tsx (L84-L89)
2025-06-05 23:34:10 -07:00
Michael Bolin
a80240cfdc chore: ensure next Node.js release includes musl binaries for arm64 Linux (#1232)
Target a workflow with more recent binary artifacts.
2025-06-05 23:14:10 -07:00
Michael Bolin
2d5246050a fix: use aarch64-unknown-linux-musl instead of aarch64-unknown-linux-gnu (#1228)
Now that we have published a GitHub Release that contains arm64 musl
artifacts for Linux, update the following scripts to take advantage of
them:

- `dotslash-config.json` now uses musl artifacts for the `linux-aarch64`
target
- `install_native_deps.sh` for the TypeScript CLI now includes
`codex-linux-sandbox-aarch64-unknown-linux-musl` instead of
`codex-linux-sandbox-aarch64-unknown-linux-gnu` for sandboxing
- `codex-cli/bin/codex.js` now checks for `aarch64-unknown-linux-musl`
artifacts instead of `aarch64-unknown-linux-gnu` ones
2025-06-05 22:45:45 -07:00
Michael Bolin
77b017f67d fix: truncate auth.json file before rewriting it (#1231)
This was missed in https://github.com/openai/codex/pull/1212. Caught by
@rizwankce in
https://github.com/openai/codex/discussions/1174#discussioncomment-13377475.
2025-06-05 22:11:02 -07:00
Michael Bolin
c02d25fbad fix: include codex-linux-sandbox-aarch64-unknown-linux-musl in the set of release artifacts (#1230)
This was missed in https://github.com/openai/codex/pull/1225. Once we
create a new GitHub Release with this change, we can use the URL from
the workflow that triggered the release in
https://github.com/openai/codex/pull/1228.
2025-06-05 22:03:07 -07:00
Michael Bolin
9db53b33aa fix: support arm64 build for Linux (#1225)
Users were running into issues with glibc mismatches on arm64 linux. In
the past, we did not provide a musl build for arm64 Linux because we had
trouble getting the openssl dependency to build correctly. Today I
tried the same trick in `Cargo.toml` that we were doing for
`x86_64-unknown-linux-musl` (using `openssl-sys` with `features =
["vendored"]`) and the builds "just worked," so I'm not sure what
problem we had in the past!

One tweak that did have to be made is that the integration tests
for Seccomp/Landlock empirically require longer timeouts on arm64 Linux,
or at least on the `ubuntu-24.04-arm` GitHub Runner. As such, we change
the timeouts for arm64 in `codex-rs/linux-sandbox/tests/landlock.rs`.

In solving this problem, I decided I needed a turnkey solution
for testing the Linux build(s) from my Mac laptop, so this PR introduces
`.devcontainer/Dockerfile` and `.devcontainer/devcontainer.json` to
facilitate this. Detailed instructions are in `.devcontainer/README.md`.

We will update `dotslash-config.json` and other release-related scripts
in a follow-up PR.
2025-06-05 20:29:46 -07:00
Michael Bolin
515b6331bd feat: add support for login with ChatGPT (#1212)
This does not implement the full Login with ChatGPT experience, but it
should unblock people.

**What works**

* The `codex` multitool now has a `login` subcommand, so you can run
`codex login`, which should write `CODEX_HOME/auth.json` if you complete
the flow successfully. The TUI will now read the `OPENAI_API_KEY` from
`auth.json`.
* The TUI should refresh the token if it has expired and the necessary
information is in `auth.json`.
* There is a `LoginScreen` in the TUI that tells you to run `codex
login` if both (1) your model provider expects to use `OPENAI_API_KEY`
as its env var, and (2) `OPENAI_API_KEY` is not set.

**What does not work**

* The `LoginScreen` does not support the login flow from within the TUI.
Instead, it tells you to quit, run `codex login`, and then run `codex`
again.
* `codex exec` does not read from `auth.json` yet, nor does it direct the
user to go through the login flow if `OPENAI_API_KEY` is not found.
* The `maybeRedeemCredits()` function from `get-api-key.tsx` has not
been ported from TypeScript to `login_with_chatgpt.py` yet:


a67a67f325/codex-cli/src/utils/get-api-key.tsx (L84-L89)

**Implementation**

Currently, the OAuth flow requires running a local webserver on
`127.0.0.1:1455`. It seemed wasteful to incur the additional binary cost
of a webserver dependency in the Rust CLI just to support login, so
instead we implement this logic in Python, as Python has a `http.server`
module as part of its standard library. Specifically, we bundle the
contents of a single Python file as a string in the Rust CLI and then
use it to spawn a subprocess as `python3 -c
{{SOURCE_FOR_PYTHON_SERVER}}`.

As such, the most significant files in this PR are:

```
codex-rs/login/src/login_with_chatgpt.py
codex-rs/login/src/lib.rs
```

Now that the CLI may load `OPENAI_API_KEY` from the environment _or_
`CODEX_HOME/auth.json`, we need a new abstraction for reading/writing
this variable, so we introduce:

```
codex-rs/core/src/openai_api_key.rs
```

Note that `std::env::set_var()` is [rightfully] `unsafe` in Rust 2024,
so we use a `LazyLock<RwLock<Option<String>>>` to store `OPENAI_API_KEY`
so that it is read in a thread-safe manner.
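A sketch of that approach (accessor names are illustrative):

```rust
use std::sync::{LazyLock, RwLock};

// The key lives behind a lock instead of being written back into the
// process environment, which would require the unsafe set_var().
static OPENAI_API_KEY: LazyLock<RwLock<Option<String>>> =
    LazyLock::new(|| RwLock::new(std::env::var("OPENAI_API_KEY").ok()));

fn get_openai_api_key() -> Option<String> {
    OPENAI_API_KEY.read().unwrap().clone()
}

fn set_openai_api_key(value: String) {
    *OPENAI_API_KEY.write().unwrap() = Some(value);
}
```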

Ultimately, it should be possible to go through the entire login flow
from the TUI. This PR introduces a placeholder `LoginScreen` UI for that
right now, though the new `codex login` subcommand introduced in this PR
should be a viable workaround until the UI is ready.

**Testing**

Because the login flow is currently implemented in a standalone Python
file, you can test it without building any Rust code as follows:

```
rm -rf /tmp/codex_home && mkdir /tmp/codex_home
CODEX_HOME=/tmp/codex_home python3 codex-rs/login/src/login_with_chatgpt.py
```

For reference:

* the original TypeScript implementation was introduced in
https://github.com/openai/codex/pull/963
* support for redeeming credits was later added in
https://github.com/openai/codex/pull/974
2025-06-04 08:44:17 -07:00
Reilly Wood
a67a67f325 codex-rs: make tool calls prettier (#1211)
This PR overhauls how active tool calls and completed tool calls are
displayed:

1. More use of colour to indicate success/failure and distinguish
between components like tool name+arguments
2. Previously, the entire `CallToolResult` was serialized to JSON and
pretty-printed. Now, we extract each individual `CallToolResultContent`
and print those
   1. The previous solution was wasting space by unnecessarily showing
details of the `CallToolResult` struct to users, without formatting the
actual tool call results nicely
   2. We're now able to show users more information from tool results in
less space, with nicer formatting when tools return JSON results

### Before:

<img width="1251" alt="Screenshot 2025-06-03 at 11 24 26"
src="https://github.com/user-attachments/assets/5a58f222-219c-4c53-ace7-d887194e30cf"
/>

### After:

<img width="1265" alt="image"
src="https://github.com/user-attachments/assets/99fe54d0-9ebe-406a-855b-7aa529b91274"
/>

## Future Work

1. Integrate image tool result handling better. We should be able to
display images even if they're not the first `CallToolResultContent`
2. Users should have some way to view the full version of truncated tool
results
3. It would be nice to add some left padding for tool results, make it
more clear that they are results. This is doable, just a little fiddly
due to the way `first_visible_line` scrolling works
4. There's almost certainly a better way to format JSON than "all on 1
line with spaces to make Ratatui wrapping work". But I think that works
OK for now.
2025-06-03 14:29:26 -07:00
Michael Bolin
c6fcec55fe fix: always send full instructions when using the Responses API (#1207)
This fixes a longstanding error in the Rust CLI where `codex.rs`
contained an errant `is_first_turn` check that would exclude the user
instructions for subsequent "turns" of a conversation when using the
responses API (i.e., when `previous_response_id` existed).

While here, renames `Prompt.instructions` to `Prompt.user_instructions`
since we now have quite a few levels of instructions floating around.
Also removed an unnecessary use of `clone()` in
`Prompt.get_full_instructions()`.
2025-06-03 09:40:19 -07:00
Michael Bolin
6fcc528a43 fix: provide tolerance for apply_patch tool (#993)
As explained in detail in the doc comment for `ParseMode::Lenient`, we
have observed that GPT-4.1 does not always generate a valid invocation
of `apply_patch`. Fortunately, the error is predictable, so we introduce
some new logic to the `codex-apply-patch` crate to recover from this
error.

Because we would like to avoid this becoming a de facto standard (as it
would be incompatible if `apply_patch` were provided as an actual
executable, unless we also introduced the lenient behavior in the
executable, as well), we require passing `ParseMode::Lenient` to
`parse_patch_text()` to make it clear that the caller is opting into
supporting this special case.

Note the analogous change to the TypeScript CLI was
https://github.com/openai/codex/pull/930. In addition to changing the
accepted input to `apply_patch`, it also introduced additional
instructions for the model, which we include in this PR.

Note that `apply-patch` does not depend on either `regex` or
`regex-lite`, so some of the checks are slightly more verbose to avoid
introducing this dependency.

That said, this PR does not leverage the existing
`extract_heredoc_body_from_apply_patch_command()`, which depends on
`tree-sitter` and `tree-sitter-bash`:


5a5aa89914/codex-rs/apply-patch/src/lib.rs (L191-L246)

though perhaps it should.
2025-06-03 09:06:38 -07:00
Michael Bolin
5a5aa89914 chore: replace regex with regex-lite, where appropriate (#1200)
As explained on https://crates.io/crates/regex-lite, `regex-lite` is a
lighter alternative to `regex` and seems to be sufficient for our
purposes.
2025-06-02 17:11:45 -07:00
Michael Bolin
0f3cc8f842 feat: make reasoning effort/summaries configurable (#1199)
Previous to this PR, we always set `reasoning` when making a request
using the Responses API:


d7245cbbc9/codex-rs/core/src/client.rs (L108-L111)

Though if you tried to use the Rust CLI with `--model gpt-4.1`, this
would fail with:

```shell
"Unsupported parameter: 'reasoning.effort' is not supported with this model."
```

We take a cue from the TypeScript CLI, which does a check on the model
name:


d7245cbbc9/codex-cli/src/utils/agent/agent-loop.ts (L786-L789)

This PR does a similar check, though also adds support for the following
config options:

```
model_reasoning_effort = "low" | "medium" | "high" | "none"
model_reasoning_summary = "auto" | "concise" | "detailed" | "none"
```

This way, if you have a model whose name happens to start with `"o"` (or
`"codex"`?), you can set these to `"none"` to explicitly disable
reasoning, if necessary. (That said, it seems unlikely anyone would use
the Responses API with non-OpenAI models, but we provide an escape
hatch, anyway.)
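For example, to disable reasoning explicitly in `config.toml`:

```toml
model_reasoning_effort = "none"
model_reasoning_summary = "none"
```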

This PR also updates both the TUI and `codex exec` to show `reasoning
effort` and `reasoning summaries` in the header.
2025-06-02 16:01:34 -07:00
Michael Bolin
d7245cbbc9 fix: chat completions API now also passes tools along (#1167)
Prior to this PR, there were two big misses in `chat_completions.rs`:

1. The loop in `stream_chat_completions()` was only including items of
type `ResponseItem::Message` when building up the `"messages"` JSON for
the `POST` request to the `chat/completions` endpoint. This fixes things
by ensuring other variants (`FunctionCall`, `LocalShellCall`, and
`FunctionCallOutput`) are included, as well.
2. In `process_chat_sse()`, we were not recording tool calls and were
only emitting items of type
`ResponseEvent::OutputItemDone(ResponseItem::Message)` to the stream.
Now we introduce `FunctionCallState`, which is used to accumulate the
`delta`s of type `tool_calls`, so we can ultimately emit a
`ResponseItem::FunctionCall`, when appropriate.
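A minimal sketch of that accumulation (field names are hypothetical):

```rust
// Each SSE delta contributes a fragment until the tool call is complete,
// at which point the state can be turned into a ResponseItem::FunctionCall.
#[derive(Default)]
struct FunctionCallState {
    name: Option<String>,
    call_id: Option<String>,
    arguments: String,
}

impl FunctionCallState {
    fn apply_delta(&mut self, name: Option<&str>, call_id: Option<&str>, args: &str) {
        if let Some(name) = name {
            self.name.get_or_insert_with(|| name.to_string());
        }
        if let Some(id) = call_id {
            self.call_id.get_or_insert_with(|| id.to_string());
        }
        // Arguments arrive as partial JSON fragments across deltas.
        self.arguments.push_str(args);
    }
}
```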

While function calling now appears to work for chat completions with my
local testing, I believe that there are still edge cases that are not
covered and that this codepath would benefit from a battery of
integration tests. (As part of that further cleanup, we should also work
to support streaming responses in the UI.)

The other important part of this PR is some cleanup in
`core/src/codex.rs`. In particular, it was hard to reason about how
`run_task()` was building up the list of messages to include in a
request across the various cases:

- Responses API
- Chat Completions API
- Responses API used in concert with ZDR

I like to think things are a bit cleaner now where:

- `zdr_transcript` (if present) contains all messages in the history of
the conversation, which includes function call outputs that have not
been sent back to the model yet
- `pending_input` includes any messages the user has submitted while the
turn is in flight that need to be injected as part of the next `POST` to
the model
- `input_for_next_turn` includes the tool call outputs that have not
been sent back to the model yet
2025-06-02 13:47:51 -07:00
Michael Bolin
e40f86b446 chore: logging cleanup (#1196)
Update what we log to make `RUST_LOG=debug` a bit easier to work with.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1196).
* #1167
* __->__ #1196
2025-06-02 13:31:33 -07:00
Michael Bolin
7896b1089d chore: update the WORKFLOW_URL in install_native_deps.sh to the latest release (#1190) 2025-05-31 10:30:50 -07:00
Michael Bolin
1410ae95ca fix: set --config hide_agent_reasoning=true in the GitHub Action (#1185)
Whoops, I had this flipped in https://github.com/openai/codex/pull/1183.
2025-05-30 23:57:05 -07:00
Michael Bolin
fccf5f3221 fix: disable agent reasoning output by default in the GitHub Action (#1183) 2025-05-30 23:49:48 -07:00
Michael Bolin
1159eaf04f feat: show the version when starting Codex (#1182)
The TypeScript version of the CLI shows the version when it starts up,
which is helpful when users share screenshots (and nice to know, as a
user).
2025-05-30 23:24:36 -07:00
Michael Bolin
e81327e5f4 feat: add hide_agent_reasoning config option (#1181)
This PR introduces a `hide_agent_reasoning` config option (that defaults
to `false`) that users can enable to make the output less verbose by
suppressing reasoning output.

To test, I verified that this includes agent reasoning in the output:

```
echo hello | just exec
```

whereas this does not:

```
echo hello | just exec --config hide_agent_reasoning=true
```
2025-05-30 23:14:56 -07:00
Michael Bolin
4f3d294762 feat: dim the timestamp in the exec output (#1180)
This required changing `ts_println!()` to take `$self:ident`, which is a
bit more verbose, but the usability improvement seems worth it.

Also eliminated an unnecessary `.to_string()` while here.
2025-05-30 16:27:37 -07:00
Michael Bolin
cf1d070538 feat: grab-bag of improvements to exec output (#1179)
Fixes:

* Instantiate `EventProcessor` earlier in `lib.rs` so
`print_config_summary()` can be an instance method of it and leverage
its various `Style` fields to ensure it honors `with_ansi` properly.
* After printing the config summary, print out the user's prompt with
the heading `User instructions:`. As noted in the comment, now that we can
read the instructions via stdin as of #1178, it is helpful to the user
to ensure they know what instructions were given to Codex.
* Use same colors/bold/italic settings for headers as the TUI, making
the output a bit easier to read.
2025-05-30 16:22:10 -07:00
Michael Bolin
ae743d56b0 feat: for codex exec, if PROMPT is not specified, read from stdin if not a TTY (#1178)
This attempts to make `codex exec` more flexible in how the prompt can
be passed:

* as before, it can be passed as a single string argument
* if `-` is passed as the value, the prompt is read from stdin
* if no argument is passed _and stdin is a TTY_, prints a warning to
stderr that no prompt was specified and exits non-zero.
* if no argument is passed _and stdin is NOT a TTY_, prints `Reading
prompt from stdin...` to stderr to let the user know that Codex will
wait until it reads EOF from stdin to proceed. (You can repro this case
by running `yes | just exec`, since stdin is not a TTY in that case but
it also never reaches EOF.)
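Example invocations (prompt text is illustrative):

```shell
# Prompt as a single argument:
codex exec "explain the build failure"

# Explicitly read the prompt from stdin:
cat prompt.md | codex exec -
```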
2025-05-30 14:41:55 -07:00
Michael Bolin
1bf82056b3 fix: introduce create_tools_json() and share it with chat_completions.rs (#1177)
The main motivator behind this PR is that `stream_chat_completions()`
was not adding the `"tools"` entry to the payload posted to the
`/chat/completions` endpoint. This (1) refactors the existing logic to
build up the `"tools"` JSON from `client.rs` into `openai_tools.rs`, and
(2) updates the use of responses API (`client.rs`) and chat completions
API (`chat_completions.rs`) to both use it.

Note this PR alone is not sufficient to get tool calling from chat
completions working: that is done in
https://github.com/openai/codex/pull/1167.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1177).
* #1167
* __->__ #1177
2025-05-30 14:07:03 -07:00
Michael Bolin
e207f20f64 fix: add extra debugging to GitHub Action (#1173)
https://github.com/openai/codex/actions/runs/15352839832/job/43205041563
appeared to fail around `postComment()`, but I don't see the output from
`fail()` in the logs. Adding a bit more info.
2025-05-30 11:16:30 -07:00
Michael Bolin
0f40ef5a10 fix: missed a step in #1171 for codex.yml (#1172)
Missed in my copy/paste.
2025-05-30 11:04:41 -07:00
Michael Bolin
8676185389 fix: update outdated repo setup in codex.yml (#1171)
We should do some work to share the setup logic across `codex.yml`,
`ci.yml`, and `rust-ci.yml`.
2025-05-30 10:58:57 -07:00
Michael Bolin
baa92f37e0 feat: initial import of experimental GitHub Action (#1170)
This is a first cut at a GitHub Action that lets you define prompt
templates in `.md` files under `.github/codex/labels` that will run
Codex with the associated prompt when the label is added to a GitHub
pull request.

For example, this PR includes these files:

```
.github/codex/labels/codex-attempt.md
.github/codex/labels/codex-code-review.md
.github/codex/labels/codex-investigate-issue.md
```

And the new `.github/workflows/codex.yml` workflow declares the
following triggers:

```yaml
on:
  issues:
    types: [opened, labeled]
  pull_request:
    branches: [main]
    types: [labeled]
```

as well as the following expression to gate the action:

```
jobs:
  codex:
    if: |
      (github.event_name == 'issues' && (
        (github.event.action == 'labeled' && (github.event.label.name == 'codex-attempt' || github.event.label.name == 'codex-investigate-issue'))
      )) ||
      (github.event_name == 'pull_request' && github.event.action == 'labeled' && github.event.label.name == 'codex-code-review')
```

Note the "actor" who added the label must have write access to the repo
for the action to take effect.

After adding a label, the action will "ack" the request by replacing the
original label (e.g., `codex-review`) with an `-in-progress` suffix
(e.g., `codex-review-in-progress`). When it is finished, it will swap
the `-in-progress` label with a `-completed` one (e.g.,
`codex-review-completed`).

Users of the action are responsible for providing an `OPENAI_API_KEY`
and making it available as a secret to the action.
2025-05-30 10:55:28 -07:00
Michael Bolin
a0239c3cd6 fix: enable set positional-arguments in justfile (#1169)
The way these definitions worked before, they did not handle quoted args
with spaces properly.

For example, if you had `/tmp/test-just/printlen.py` as:

```python
#!/usr/bin/env python3

import sys

print(len(sys.argv))
```

and your `justfile` was:

```
printlen *args:
    /tmp/test-just/printlen.py {{args}}
```

Then:

```shell
$ just printlen foo bar
3
$ just printlen 'foo bar'
3
```

which is not what we want: `'foo bar'` should be treated as one
argument.

The fix is to use
[positional-arguments](515e806b51/README.md (L1131)):

```
set positional-arguments

printlen *args:
    /tmp/test-just/printlen.py "$@"
```
2025-05-30 09:11:53 -07:00
Michael Bolin
bdfa95ed31 docs: split the config-related portion of codex-rs/README.md into its own config.md file (#1165)
Also updated the overview on `codex-rs/README.md` while here.
2025-05-29 16:59:35 -07:00
Fouad Matin
828e2062c2 fix(codex-rs): use codex-mini-latest as default (#1164) 2025-05-29 16:55:19 -07:00
Michael Bolin
92957c47fb fix: update justfile to facilitate running CLIs from source and formatting source code (#1163) 2025-05-29 15:35:14 -07:00
Michael Bolin
8c1902b562 chore: update GitHub workflow for native artifacts for npm release (#1162)
Among other things, this picks up this UI treatment fix:

https://github.com/openai/codex/pull/1161
2025-05-29 15:34:06 -07:00
Michael Bolin
a32d305ae6 fix: update UI treatment of slash command menu to match that of the TS CLI (#1161)
Uses the same colors as in the TypeScript CLI:


![image](https://github.com/user-attachments/assets/919cd472-ffb4-4654-a46a-d84f0cd9c097)

Now it is also readable on a light theme, e.g., in Ghostty:


![image](https://github.com/user-attachments/assets/468c37b0-ea63-4455-9b48-73dc2c95f0f6)
2025-05-29 14:57:55 -07:00
Michael Bolin
a768a6a41d fix: introduce ResponseInputItem::McpToolCallOutput variant (#1151)
The output of an MCP server tool call can be one of several types, but
to date, we treated all outputs as text by showing the serialized JSON
as the "tool output" in Codex:


25a9949c49/codex-rs/mcp-types/src/lib.rs (L96-L101)

This PR adds support for the `ImageContent` variant so we can now
display an image output from an MCP tool call.

In making this change, we introduce a new
`ResponseInputItem::McpToolCallOutput` variant so that we can work with
the `mcp_types::CallToolResult` directly when the function call is made
to an MCP server.

Though arguably the more significant change is the introduction of
`HistoryCell::CompletedMcpToolCallWithImageOutput`, which is a cell that
uses `ratatui_image` to render an image into the terminal. To support
this, we introduce `ImageRenderCache`, cache a
`ratatui_image::picker::Picker`, and `ensure_image_cache()` to cache the
appropriate scaled image data and dimensions based on the current
terminal size.

To test, I created a minimal `package.json`:

```json
{
  "name": "kitty-mcp",
  "version": "1.0.0",
  "type": "module",
  "description": "MCP that returns image of kitty",
  "main": "index.js",
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.12.0"
  }
}
```

with the following `index.js` to define the MCP server:

```js
#!/usr/bin/env node

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { readFile } from "node:fs/promises";
import { join } from "node:path";

const IMAGE_URI = "image://Ada.png";

const server = new McpServer({
  name: "Demo",
  version: "1.0.0",
});

server.tool(
  "get-cat-image",
  "If you need a cat image, this tool will provide one.",
  async () => ({
    content: [
      { type: "image", data: await getAdaPngBase64(), mimeType: "image/png" },
    ],
  })
);

server.resource("Ada the Cat", IMAGE_URI, async (uri) => {
  const base64Image = await getAdaPngBase64();
  return {
    contents: [
      {
        uri: uri.href,
        mimeType: "image/png",
        blob: base64Image,
      },
    ],
  };
});

async function getAdaPngBase64() {
  const __dirname = new URL(".", import.meta.url).pathname;
  // From 9705ce2c59/assets/Ada.png
  const filePath = join(__dirname, "Ada.png");
  const imageData = await readFile(filePath);
  const base64Image = imageData.toString("base64");
  return base64Image;
}

const transport = new StdioServerTransport();
await server.connect(transport);
```

With the local changes from this PR, I added the following to my
`config.toml`:

```toml
[mcp_servers.kitty]
command = "node"
args = ["/Users/mbolin/code/kitty-mcp/index.js"]
```

Running the TUI from source:

```
cargo run --bin codex -- --model o3 'I need a picture of a cat'
```

I get:

<img width="732" alt="image"
src="https://github.com/user-attachments/assets/bf80b721-9ca0-4d81-aec7-77d6899e2869"
/>

Now, that said, I have only tested in iTerm and there is definitely some
funny business with getting an accurate character-to-pixel ratio
(sometimes the `CompletedMcpToolCallWithImageOutput` thinks it needs 10
rows to render instead of 4), so there is still work to be done here.
2025-05-28 19:03:17 -07:00
Michael Bolin
25a9949c49 fix: ensure inputSchema for MCP tool always has "properties" field when talking to OpenAI (#1150)
As noted in the comment introduced in this PR, this is analogous to the
issue reported in
https://github.com/openai/openai-agents-python/issues/449. This seems to
work now.
2025-05-28 17:17:21 -07:00
Michael Bolin
392fdd7db6 fix: honor RUST_LOG in mcp-client CLI and default to DEBUG (#1149)
We had `debug!()` logging statements already, but they weren't being
printed because `tracing_subscriber` was not set up.
2025-05-28 17:10:06 -07:00
Michael Bolin
ae1a83f095 feat: introduce CellWidget trait (#1148)
The motivation behind this PR is to make it so a `HistoryCell` is more
like a `WidgetRef` that knows how to render itself into a `Rect` so that
it can be backed by something other than a `Vec<Line>`. Because a
`HistoryCell` is intended to appear in a scrollable list, we want to
ensure the stack of cells can be scrolled one `Line` at a time even if
the `HistoryCell` is not backed by a `Vec<Line>` itself.

To this end, we introduce the `CellWidget` trait whose key method is:

```
fn render_window(&self, first_visible_line: usize, area: Rect, buf: &mut Buffer);
```

The `first_visible_line` param is what differs from
`WidgetRef::render_ref()`, as a `CellWidget` needs to know the offset
into its "full view" at which it should start rendering.

The bookkeeping in `ConversationHistoryWidget` has been updated
accordingly to ensure each `CellWidget` in the history is rendered
appropriately.
2025-05-28 14:03:19 -07:00
Michael Bolin
d60f350cf8 feat: add support for -c/--config to override individual config items (#1137)
This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:

```
codex --config model=o4-mini
```

Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.

Effectively, it means there are four levels of configuration for some
values:

- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)

Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.

Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.
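Putting the precedence together (values are illustrative):

```shell
# config.toml may set model = "o3"; -c overrides the file, and the
# dedicated flag wins over -c, so this session uses o4-mini.
codex -c model=o3 --model o4-mini
```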
2025-05-27 23:11:44 -07:00
Michael Bolin
eba0e32909 fix: update install_native_deps.sh to pick up the latest release (#1136) 2025-05-27 10:06:41 -07:00
Michael Bolin
29d154cb13 fix: use o4-mini as the default model (#1135)
Rollback of https://github.com/openai/codex/pull/972.
2025-05-27 09:12:55 -07:00
Michael Bolin
6b5b184f21 fix: TUI was not honoring --skip-git-repo-check correctly (#1105)
I discovered that if I ran `codex <PROMPT>` in a cwd that was not a Git
repo, Codex did not automatically run `<PROMPT>` after I accepted the
Git warning. It appears that we were not managing the `AppState`
transition correctly, so this fixes the bug and ensures the Codex
session does not start until the user accepts the Git warning.

In particular, we now create the `ChatWidget` lazily and store it in the
`AppState::Chat` variant.
2025-05-24 08:33:49 -07:00
Michael Bolin
4bf81373a7 fix: forgot to pass codex_linux_sandbox_exe through in cli/src/debug_sandbox.rs (#1095)
I accidentally missed this in https://github.com/openai/codex/pull/1086.
2025-05-23 11:53:13 -07:00
Michael Bolin
89ef4efdcf fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086)
Historically, we spawned the Seatbelt and Landlock sandboxes in
substantially different ways:

For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
specified as an arg followed by the original command:


d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)

For **Landlock/Seccomp**, we would do
`tokio::runtime::Builder::new_current_thread()`, _invoke
Landlock/Seccomp APIs to modify the permissions of that new thread_, and
then spawn the command:


d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)

While it is neat that Landlock/Seccomp supports applying a policy to
only one thread without having to apply it to the entire process, it
requires us to maintain two different codepaths and is a bit harder to
reason about. The tipping point was
https://github.com/openai/codex/pull/1061, in which we had to start
building up the `env` in an unexpected way for the existing
Landlock/Seccomp approach to continue to work.

This PR overhauls things so that we do similar things for Mac and Linux.
It turned out that we were already building our own "helper binary"
comparable to Mac's `sandbox-exec` as part of the `cli` crate:


d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)

We originally created this to build a small binary to include with the
Node.js version of the Codex CLI to provide support for Linux
sandboxing.

Though the sticky bit is that, at this point, we still want to deploy
the Rust version of Codex as a single, standalone binary rather than a
CLI and a supporting sandboxing binary. To satisfy this goal, we use
"the arg0 trick," in which we:

* use `std::env::current_exe()` to get the path to the CLI that is
currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`

A CLI that supports sandboxing should check arg0 at the start of the
program. If it is `"codex-linux-sandbox"`, it must invoke
`codex_linux_sandbox::run_main()`, which runs the CLI as if it were
`codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
the original command, so do _replace_ the process rather than spawn a
subprocess. Incidentally, we do this before starting the Tokio runtime,
so the process should only have one thread when `execvp(3)` is called.
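A simplified sketch of both sides of the trick (the real code threads the
exe path through `Config` as `codex_linux_sandbox_exe`):

```rust
use std::env;
use std::io;
use std::process::{Child, Command};

fn spawn_under_linux_sandbox(original_cmd: &[String]) -> io::Result<Child> {
    // Re-invoke the currently running CLI...
    let current_exe = env::current_exe()?;
    let mut command = Command::new(current_exe);
    #[cfg(unix)]
    {
        use std::os::unix::process::CommandExt;
        // ...with arg0 set so it acts as the sandbox helper.
        command.arg0("codex-linux-sandbox");
    }
    command.args(original_cmd).spawn()
}

fn main() {
    // A CLI supporting this checks arg0 first: when invoked as the helper,
    // it applies Landlock/Seccomp and then execvp(3)s the original command.
    if env::args().next().as_deref() == Some("codex-linux-sandbox") {
        // codex_linux_sandbox::run_main();
        return;
    }
    // ... normal CLI entry point ...
}
```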

Because the `core` crate that needs to spawn the Linux sandboxing is not
a CLI in its own right, this means that every CLI that includes `core`
and relies on this behavior has to (1) implement it and (2) provide the
path to the sandboxing executable. While the path is almost always
`std::env::current_exe()`, we needed to make this configurable for
integration tests, so `Config` now has a `codex_linux_sandbox_exe:
Option<PathBuf>` property to facilitate threading this through,
introduced in https://github.com/openai/codex/pull/1089.

This common pattern is now captured in
`codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
functions that should use it have been updated as part of this PR.

The `codex-linux-sandbox` crate added to the Cargo workspace as part of
this PR now has the bulk of the Landlock/Seccomp logic, which makes
`core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
`core/src/landlock.rs` were removed/ported as part of this PR. I also
moved the unit tests for this code into an integration test,
`linux-sandbox/tests/landlock.rs`, in which I use
`env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
`codex_linux_sandbox_exe` since `std::env::current_exe()` is not
appropriate in that case.
2025-05-23 11:37:07 -07:00
Michael Bolin
d1de7bb383 feat: add codex_linux_sandbox_exe: Option<PathBuf> field to Config (#1089)
https://github.com/openai/codex/pull/1086 is a work-in-progress to make
Linux sandboxing work more like Seatbelt where, for the command we want
to sandbox, we build up the command and then hand it, and some sandbox
configuration flags, to another command to set up the sandbox and then
run it.

In the case of Seatbelt, macOS provides this helper binary and provides
it at `/usr/bin/sandbox-exec`. For Linux, we have to build our own and
pass it through (which is what #1086 does), so this makes the new
`codex_linux_sandbox_exe` available on `Config` so that it will later be
available in `exec.rs` when we need it in #1086.
2025-05-22 21:52:28 -07:00
Michael Bolin
63deb7c369 fix: for the @native release of the Node module, use the Rust version by default (#1084)
Added logic so that when we run `./scripts/stage_release.sh --native`
(for the `@native` version of the Node module), we drop a `use-native`
file next to `codex.js`. If present, `codex.js` will now run the Rust
CLI.

Ran `./scripts/stage_release.sh --native` and verified that when the
running `codex.js` in the staged folder:

```
$ /var/folders/wm/f209bc1n2bd_r0jncn9s6j_00000gp/T/tmp.efvEvBlSN6/bin/codex.js --version
codex-cli 0.0.2505220956
```

it ran the expected Rust version of the CLI, as desired.

While here, I also updated the Rust version to one that I cut today,
which includes the new shell environment policy config option:
https://github.com/openai/codex/pull/1061. Note this may "break" some
users if the processes spawned by Codex need extra environment
variables. (We are still working to determine what the right defaults
should be for this option.)
2025-05-22 13:42:55 -07:00
Michael Bolin
cb379d7797 feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.

This PR introduces a `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy without having to restate the entire thing.

Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions – always win over inherited values. |
| `include_only` | array&lt;string&gt; | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |


In particular, note that the default is `inherit = "core"`, so:

* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`

This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
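To make the ordering of the rules concrete, here is a minimal sketch of the pipeline; the real implementation lives in `core/src/exec_env.rs`, and the type names and glob matching below are simplified assumptions:

```rust
use std::collections::HashMap;

// Assumed shapes for illustration; the real types may differ.
enum Inherit { Core, All, None }
struct ShellEnvironmentPolicy {
    inherit: Inherit,
    ignore_default_excludes: bool,
    exclude: Vec<String>,         // case-insensitive glob patterns
    set: HashMap<String, String>, // explicit overrides/additions
    include_only: Vec<String>,    // whitelist patterns; empty = keep everything
}

// Toy case-insensitive matcher supporting a trailing `*`, e.g. "AWS_*".
fn glob_match(pattern: &str, name: &str) -> bool {
    let (pattern, name) = (pattern.to_uppercase(), name.to_uppercase());
    match pattern.strip_suffix('*') {
        Some(prefix) => name.starts_with(prefix),
        None => name == pattern,
    }
}

fn apply_policy(
    parent: &HashMap<String, String>,
    policy: &ShellEnvironmentPolicy,
) -> HashMap<String, String> {
    // 1. Start from the `inherit` template.
    let mut env: HashMap<String, String> = match policy.inherit {
        Inherit::Core => parent
            .iter()
            .filter(|(k, _)| matches!(k.as_str(), "HOME" | "PATH" | "USER"))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect(),
        Inherit::All => parent.clone(),
        Inherit::None => HashMap::new(),
    };
    // 2. Unless opted out, drop anything that looks like a credential.
    if !policy.ignore_default_excludes {
        env.retain(|k, _| {
            let k = k.to_uppercase();
            !(k.contains("KEY") || k.contains("SECRET") || k.contains("TOKEN"))
        });
    }
    // 3. Drop vars matching the user's `exclude` patterns.
    env.retain(|k, _| !policy.exclude.iter().any(|p| glob_match(p, k)));
    // 4. Apply explicit `set` overrides; these always win.
    env.extend(policy.set.clone());
    // 5. Final step: if `include_only` is non-empty, keep only matching vars.
    if !policy.include_only.is_empty() {
        env.retain(|k, _| policy.include_only.iter().any(|p| glob_match(p, k)));
    }
    env
}
```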

If nothing else, prior to this change:

```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```

But after this change it does not print anything (as desired).

One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
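The workaround looks roughly like the following (a sketch of the idea only; the actual `configure_command!` macro in `core/src/exec.rs` is more involved):

```rust
use std::collections::HashMap;
use std::process::Command;

/// Emulate `env_clear()` by removing inherited variables one at a time,
/// then setting the desired map entry by entry. Unlike `env_clear()` +
/// `envs(...)`, this kept the Landlock tests passing.
fn set_env_without_env_clear(cmd: &mut Command, env_map: &HashMap<String, String>) {
    for (key, _) in std::env::vars() {
        if !env_map.contains_key(&key) {
            cmd.env_remove(&key);
        }
    }
    for (key, value) in env_map {
        cmd.env(key, value);
    }
}
```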
2025-05-22 09:51:19 -07:00
Michael Bolin
ef7208359f feat: show Config overview at start of exec (#1073)
Now the `exec` output starts with something like:

```
--------
workdir:  /Users/mbolin/code/codex/codex-rs
model:  o3
provider:  openai
approval:  Never
sandbox:  SandboxPolicy { permissions: [DiskFullReadAccess, DiskWritePlatformUserTempFolder, DiskWritePlatformGlobalTempFolder, DiskWriteCwd, DiskWriteFolder { folder: "/Users/mbolin/.pyenv/shims" }] }
--------
```

which makes it easier to reason about when looking at logs.
2025-05-21 22:53:02 -07:00
Michael Bolin
5746561428 chore: move types out of config.rs into config_types.rs (#1054)
`config.rs` is already quite long without these definitions. Since they
have no real dependencies of their own, let's move them to their own
file so `config.rs` can focus on the business logic of loading a config.
2025-05-20 11:55:25 -07:00
Michael Bolin
d766e845b3 feat: experimental --output-last-message flag to exec subcommand (#1037)
This introduces an experimental `--output-last-message` flag that can be
used to identify a file where the final message from the agent will be
written. Two use cases:

- Ultimately, we will likely add a `--quiet` option to `exec`, but even
if the user does not want any output written to the terminal, they
probably want to know what the agent did. Writing the output to a file
makes it possible to get that information in a clean way.
- Relatedly, when using `exec` in CI, it is easier to review the
transcript written "normally" (i.e., not as JSON or something with
extra escapes), but getting programmatic access to the last message is
likely helpful, so writing the last message to a file gets the best of
both worlds.

I am calling this "experimental" because it is possible that we are
overfitting and will want a more general solution to this problem that
would justify removing this flag.
2025-05-19 16:08:18 -07:00
Michael Bolin
a4bfdf6779 chore: produce .tar.gz versions of artifacts in addition to .zst (#1036)
For sparse containers/environments that do not have `zstd`, provide
`.tar.gz` as an alternative archive format.
2025-05-19 15:17:45 -07:00
Fouad Matin
44022db8d0 bump(version): 0.1.2505172129 (#1008)
## `0.1.2505172129`

### 🪲 Bug Fixes

- Add node version check (#1007)
- Persist token after refresh (#1006)
2025-05-17 21:35:54 -07:00
Fouad Matin
a86270f581 fix: add node version check (#1007) 2025-05-17 21:27:41 -07:00
Fouad Matin
835eb77a7d fix: persist token after refresh (#1006)
After a token refresh/exchange, persist the new refresh and id token
2025-05-17 21:27:02 -07:00
Fouad Matin
dbc0ad348e bump(version): 0.1.2505171619 (#1001)
## `0.1.2505171619`

- `codex --login` + `codex --free` (#998)
2025-05-17 16:25:21 -07:00
Fouad Matin
9b4c2984d4 add: codex --login + codex --free (#998)
## Summary
- add `--login` and `--free` flags to cli help
- handle `--login` and `--free` logic in cli
- factor out redeem flow into `maybeRedeemCredits`
- call new helper from login callback
2025-05-17 16:13:12 -07:00
Michael Bolin
f3bde21759 chore: update install_native_deps.sh to use rust-v0.0.2505171051 (#995)
Use a more recent build of the Rust binaries to include with the Node
module.
2025-05-17 11:25:24 -07:00
Michael Bolin
1c6a3f1097 fix: artifacts from previous frames were bleeding through in TUI (#989)
Prior to this PR, I would frequently see glyphs from previous frames
"bleed" through like this:


![image](https://github.com/user-attachments/assets/8784b3d7-f691-4df6-8666-34e2f134ee85)

I think this was due to two issues (now addressed in this PR):

* We were not making use of `ratatui::widgets::Clear` to clear out the
buffer before drawing into it.
* To calculate the `width` used with `wrapped_line_count_for_cell()`, we
were not accounting for the scrollbar.
* Now we calculate `effective_width` using
`inner.width.saturating_sub(1)` where the `1` is for the scrollbar.
* We compute `text_area` using `effective_width` and pass the `text_area`
to `paragraph.render()`.
* We eliminate the conditional `needs_scrollbar` check and always call
`render(Scrollbar)`.

I suspect this bug was introduced in
https://github.com/openai/codex/pull/937, though I did not try to
verify: I'm just happy that it appears to be fixed!
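Put together, the draw path is now roughly the following (a simplified sketch against the ratatui API; names like `draw_history` are mine):

```rust
use ratatui::layout::Rect;
use ratatui::widgets::{Clear, Paragraph, Scrollbar, ScrollbarOrientation, ScrollbarState};
use ratatui::Frame;

/// Sketch: clear first, wrap text to the width left over after the
/// scrollbar column, and always render the scrollbar.
fn draw_history(
    frame: &mut Frame,
    inner: Rect,
    paragraph: Paragraph,
    state: &mut ScrollbarState,
) {
    // Clear the region so glyphs from previous frames cannot bleed through.
    frame.render_widget(Clear, inner);
    // Reserve one column for the scrollbar when computing the wrap width.
    let text_area = Rect {
        width: inner.width.saturating_sub(1),
        ..inner
    };
    frame.render_widget(paragraph, text_area);
    // No conditional `needs_scrollbar` check: always render the scrollbar.
    frame.render_stateful_widget(
        Scrollbar::new(ScrollbarOrientation::VerticalRight),
        inner,
        state,
    );
}
```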
2025-05-17 10:51:11 -07:00
Michael Bolin
f8b6b1db81 fix: ensure the first user message always displays after the session info (#988)
Previously, if the first user message was sent with the command
invocation, e.g.:

```
$ cargo run --bin codex 'hello'
```

Then the user message was added as the first entry in the history and
then `is_first_event` would be `false` here:


031df77dfb/codex-rs/tui/src/conversation_history_widget.rs (L178-L179)

which would prevent the "welcome" message with things like the model
version from displaying.

The fix in this PR is twofold:

* Reorganize the logic so the `ChatWidget` constructor stores
`initial_user_message` rather than sending it right away. Now inside
`handle_codex_event()`, it waits for the `SessionConfigured` event and
sends the `initial_user_message`, if it exists.
* In `conversation_history_widget.rs`, `add_session_info()` checks to
see whether a `WelcomeMessage` exists in the history when determining
the value of `has_welcome_message`. By construction, we expect that
`WelcomeMessage` is always the first message (in which case the existing
`let is_first_event = self.entries.is_empty();` logic would be sound),
but we decide to be extra defensive in case an `EventMsg::Error` is
processed before `EventMsg::SessionConfigured`.
2025-05-17 09:00:23 -07:00
Christoph K
031df77dfb Remove unnecessary console log from test (#970)
When running `npm test` on `codex-cli`, the test
`agent-cancel-prev-response.test.ts` logs a significant body of text to
console for no obvious reason.

This is not helpful, as it makes test logs messy and far longer.

This change deletes the `console.log(...)` that produces the behavior.
2025-05-16 19:48:11 -07:00
Michael Bolin
f9143d0361 fix: do not let Tab keypress flow through to composer when used to toggle focus (#977)
One line fix from Codex!
2025-05-16 19:27:49 -07:00
Fouad Matin
2880925a44 bump(version): 0.1.2505161800 (#978)
## `0.1.2505161800`

- Sign in with chatgpt credits (#974)
- Add support for OpenAI tool type, local_shell (#961)
2025-05-16 18:18:15 -07:00
Fouad Matin
3e19e8fd59 add: sign in with chatgpt credits (#974) 2025-05-16 17:55:08 -07:00
Trevor Creech
c7312c9d52 Fix CLA link in workflow (#964)
## Summary
- fix the CLA link posted by the bot
- docs suggest using an absolute URL:
https://github.com/marketplace/actions/cla-assistant-lite
2025-05-16 17:11:57 -07:00
Michael Bolin
1dc14cefa1 fix: make codex-mini-latest the default model in the Rust TUI (#972)
It's time to make `codex-mini-latest` the new default, as this should be
an "evergreen" model pointer.

* Equivalent change in TypeScript
https://github.com/openai/codex/pull/951
* See some notes about using `codex-mini-latest` with MCP in
https://github.com/openai/codex/pull/961
2025-05-16 17:08:18 -07:00
Michael Bolin
7ca84087e6 feat: make it possible to toggle mouse mode in the Rust TUI (#971)
I did a bit of research to understand why I could not use my mouse to
drag to select text to copy to the clipboard in iTerm.

Apparently https://github.com/openai/codex/pull/641 to enable mousewheel
scrolling broke this functionality. It seems that, unless we put in a
bit of effort, we can have drag-to-select or scrolling, but not both.
Though if you know the trick of holding down `Option` while dragging with
the mouse in iTerm, you can probably get by with this. (I did not know
about this option prior to researching this issue.)

Nevertheless, users may still prefer to disable mouse capture
altogether, so this PR introduces:

* the ability to set `tui.disable_mouse_capture = true` in `config.toml`
to disable mouse capture
* a new command, `/toggle-mouse-mode` to toggle mouse capture
2025-05-16 16:16:50 -07:00
Michael Bolin
67ac8ef605 fix: use text other than 'TODO' as test example (#969)
I casually `rg TODO` to look for TODOs, so the use of TODO in a sample
string in test output was throwing things off.
2025-05-16 14:51:03 -07:00
Michael Bolin
f48dd99f22 feat: add support for OpenAI tool type, local_shell (#961)
The new `codex-mini-latest` model expects a new tool with `{"type":
"local_shell"}`. Its contract is similar to the existing `function` tool
with `"name": "shell"`, so this takes the `local_shell` tool call into
`ExecParams` and sends it through the existing
`handle_container_exec_with_params()` code path.
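In sketch form, the plumbing looks something like this (field and function shapes here are hypothetical; the real types live in `codex-rs/core`):

```rust
// Hypothetical shapes for illustration only.
struct LocalShellCall {
    command: Vec<String>,
    workdir: Option<String>,
}

struct ExecParams {
    command: Vec<String>,
    cwd: Option<String>,
}

async fn handle_local_shell_call(call: LocalShellCall) {
    // Funnel the new tool call into the same ExecParams the `shell`
    // function tool produces...
    let params = ExecParams {
        command: call.command,
        cwd: call.workdir,
    };
    // ...and reuse the existing execution code path.
    handle_container_exec_with_params(params).await;
}

async fn handle_container_exec_with_params(_params: ExecParams) {
    // Existing code path (elided).
}
```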

This also adds the following logic when adding the default set of tools
to a request:

```rust
let default_tools = if self.model.starts_with("codex") {
    &DEFAULT_CODEX_MODEL_TOOLS
} else {
    &DEFAULT_TOOLS
};
```

That is, if the model name starts with `"codex"`, we add `{"type":
"local_shell"}` to the list of tools; otherwise, we add the
aforementioned `shell` tool.

To test this, I ran the TUI with `-m codex-mini-latest` and verified
that it used the `local_shell` tool. Though I also had some entries in
`[mcp_servers]` in my personal `config.toml`. The `codex-mini-latest`
model seemed eager to try the tools from the MCP servers first, so I
have personally commented them out for now, so keep an eye out if you're
testing `codex-mini-latest`!

Perhaps we should include more details with `{"type": "local_shell"}` or
update the following:


fd0b1b0208/codex-rs/core/prompt.md

For reference, the corresponding change in the TypeScript CLI is
https://github.com/openai/codex/pull/951.
2025-05-16 14:38:08 -07:00
Michael Bolin
dfd54e1433 chore: refactor handle_function_call() into smaller functions (#965)
Overall, `codex.rs` is still far too large, but at least there's less
indenting now that things have been moved into smaller functions.

This will also make it easier to introduce the `local_shell` tool in
https://github.com/openai/codex/pull/961.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/965).
* #961
* __->__ #965
2025-05-16 14:17:10 -07:00
Michael Bolin
9739820366 fix: remove file named ">" in the codex-cli folder (#968)
This file appears to have been accidentally added in
https://github.com/openai/codex/pull/912. Its presence makes the repo
impossible to clone on Windows.
2025-05-16 14:08:23 -07:00
Fouad Matin
fd0b1b0208 bump(version): 0.1.2505161243 (#967)
## `0.1.2505161243`

- Sign in with chatgpt (#963)
- Session history viewer (#912)
- Apply patch issue when using different cwd (#942)
- Diff command for filenames with special characters (#954)
2025-05-16 12:46:52 -07:00
Fouad Matin
c6e08ad8c1 add: sign in with chatgpt (#963)
Sign in with ChatGPT to get an API key (flow to grant API credits for Plus/Pro coming later today!)
2025-05-16 12:28:54 -07:00
Fouad Matin
cabf83f2ed add: session history viewer (#912)
- A new “/sessions” command is available for browsing previous sessions,
as shown in the updated slash command list

- The CLI now documents and parses a new “--history” flag to browse past
sessions from the command line

- A dedicated `SessionsOverlay` component loads session metadata and
allows toggling between viewing and resuming sessions

- When the sessions overlay is opened during a chat, selecting a session
can either show the saved rollout or resume it
2025-05-16 12:28:22 -07:00
Michael Bolin
1e39189393 feat: add support for file_opener option in Rust, similar to #911 (#957)
This ports the enhancement introduced in
https://github.com/openai/codex/pull/911 (and the fixes in
https://github.com/openai/codex/pull/919) for the TypeScript CLI to the
Rust one.
2025-05-16 11:33:08 -07:00
Michael Bolin
3d9f4fcd8a fix: introduce ExtractHeredocError that implements PartialEq (#958) 2025-05-16 09:42:27 -07:00
Sebastian Lund
84e01f4b62 fix: apply patch issue when using different cwd (#942)
If you run a codex instance outside of the current working directory
from where you launched the codex binary it won't be able to apply
patches correctly, even if the sandbox policy allows it. This manifests
as weird behaviours, such as:

* Reading the same filename in the binary working directory, and
overwriting it in the session working directory. e.g. if you have a
`readme` in both folders it will overwrite the readme in the session
working directory with the readme in the binary working directory
*applied with the suggested patch*.
* The LLM ends up in weird loops trying to verify and debug why the
apply_patch won't work, and it can result in it applying patches by
manually writing python or javascript if it figures out that either is
supported by the system instead.

I added a test-case to ensure that the patch contents are based on the
cwd.

## Issue: mixing relative & absolute paths in apply_patch

1. The apply_patch tool uses relative paths based on the session working
directory.
2. `unified_diff_from_chunks` eventually ends up [reading the source
file](https://github.com/reflectionai/codex/blob/main/codex-rs/apply-patch/src/lib.rs#L410)
to figure out what the diff is, by using the relative path.
3. The changes are targeted using an absolute path derived from the
current working directory.

The end result when the session working directory differs from the binary
working directory: we get the diff for a file relative to the binary
working directory, and apply it to a file in the session working
directory.
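Conceptually, the fix is to resolve relative patch paths against the session's working directory everywhere, along these lines (an illustrative sketch, not the actual change):

```rust
use std::path::{Path, PathBuf};

/// Resolve a path from an apply_patch hunk against the session's working
/// directory rather than the binary's working directory.
fn resolve_patch_path(session_cwd: &Path, path: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        session_cwd.join(path)
    }
}
```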
2025-05-16 09:12:16 -07:00
hanson-openai
7edfbae062 fix: diff command for filenames with special characters (#954)
## Summary
- fix quoting issues in `/diff` to correctly handle files with special
characters
- add regression test for `getGitDiff` when filenames contain `$`
- relax timeout in raw-exec-process-group test

Fixes https://github.com/openai/codex/issues/943

## Testing
- `pnpm test`
2025-05-16 09:10:44 -07:00
Fouad Matin
316289d01d bump(version): 0.1.2505160811 codex-mini-latest (#953)
## `0.1.2505160811`

- `codex-mini-latest` (#951)
2025-05-16 08:18:20 -07:00
Michael Bolin
30cbfdfa87 chore: update exec crate to use std::time instead of chrono (#952)
When I originally wrote `elapsed.rs`, I realized we were using both
`std::time` and `chrono` with no real benefit of having both. We should
try to keep the `exec` subcommand trim (as it is also buildable as a
standalone executable), so this helps tighten things up.
2025-05-16 08:14:50 -07:00
Fouad Matin
070499f534 add: codex-mini-latest (#951)
💽

---------

Co-authored-by: Trevor Creech <tcreech@openai.com>
2025-05-16 08:04:00 -07:00
Michael Bolin
ce2ecbe72f feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.

History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript CLI's `$CODEX_HOME/history.json`, where keeping the file
valid JSON means rewriting the entire file for each new entry). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.

Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:


3fdf9df133/codex-cli/src/utils/storage/command-history.ts (L10-L17)

We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.

As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:

```toml
[history]
persistence = "none" 
```

Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).

The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)

The motivation behind this crazy scheme is that it is designed to defend
against:

* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions appending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)

Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
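For a flavor of the append path, here is a minimal sketch that assumes the `fs2` crate for advisory locking; the real implementation in `codex-rs/core/src/message_history.rs` also handles the `o600` permissions and serialization:

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

use fs2::FileExt; // assumed dependency providing advisory file locks

/// Append one JSON line to history.jsonl while holding an exclusive
/// advisory lock so concurrent Codex sessions do not interleave writes.
fn append_history_line(codex_home: &Path, json_line: &str) -> std::io::Result<()> {
    let path = codex_home.join("history.jsonl");
    let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
    file.lock_exclusive()?; // blocks until other writers release the lock
    writeln!(file, "{json_line}")?;
    file.unlock()?;
    Ok(())
}
```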
2025-05-15 16:26:23 -07:00
Michael Bolin
3fdf9df133 chore: introduce AppEventSender to help fix clippy warnings and update to Rust 1.87 (#948)
Moving to Rust 1.87 introduced a clippy warning that
`SendError<AppEvent>` was too large.

In practice, the only thing we ever did when we got this error was log
it (if the mpsc channel is closed, then the app is likely shutting down
or something, so there's not much to do...), so this finally motivated
me to introduce `AppEventSender`, which wraps
`std::sync::mpsc::Sender<AppEvent>` with a `send()` method that invokes
`send()` on the underlying `Sender` and logs an `Err` if it gets one.

This greatly simplifies the code, as many functions that previously
returned `Result<(), SendError<AppEvent>>` now return `()`, so we don't
have to propagate an `Err` all over the place that we don't really
handle, anyway.

This also makes it so we can upgrade to Rust 1.87 in CI.
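In outline, the wrapper is just this (a sketch; logging via `tracing` is my assumption):

```rust
use std::sync::mpsc::Sender;

enum AppEvent {
    Redraw,
    // ...other variants elided...
}

#[derive(Clone)]
struct AppEventSender {
    tx: Sender<AppEvent>,
}

impl AppEventSender {
    /// Send an event, logging (rather than propagating) any failure: a
    /// closed channel normally just means the app is shutting down.
    fn send(&self, event: AppEvent) {
        if let Err(e) = self.tx.send(event) {
            tracing::error!("failed to send AppEvent: {e}");
        }
    }
}
```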
2025-05-15 14:50:30 -07:00
Michael Bolin
ec5e82b77c chore: pin Rust version to 1.86 and use io::Error::other to prepare for 1.87 (#947)
Previously, our GitHub actions specified the Rust toolchain as
`dtolnay/rust-toolchain@stable`, which meant the version could change
out from under us. In this case, the move from 1.86 to 1.87 introduced
new clippy warnings, causing build failures.

Because it will take a little time to fix all the new clippy warnings,
this PR pins things to 1.86 for now to unbreak the build.

It also replaces `io::Error::new(io::ErrorKind::Other)` with
`io::Error::other()` in preparation for 1.87.
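The replacement is mechanical; for example:

```rust
use std::io;

fn make_errors() -> (io::Error, io::Error) {
    // Before (what the codebase used prior to this PR):
    let old = io::Error::new(io::ErrorKind::Other, "boom");
    // After: equivalent, and what clippy prefers on newer toolchains.
    let new = io::Error::other("boom");
    (old, new)
}
```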
2025-05-15 14:07:16 -07:00
Michael Bolin
5fc9fc3e3e chore: expose codex_home via Config (#941) 2025-05-15 00:30:13 -07:00
193 changed files with 14866 additions and 3070 deletions

1
.codespellignore Normal file

@@ -0,0 +1 @@
iTerm

27
.devcontainer/Dockerfile Normal file

@@ -0,0 +1,27 @@
FROM ubuntu:24.04
ARG DEBIAN_FRONTEND=noninteractive
# enable 'universe' because musl-tools & clang live there
RUN apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common && \
add-apt-repository --yes universe
# now install build deps
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential curl git ca-certificates \
pkg-config clang musl-tools libssl-dev just && \
rm -rf /var/lib/apt/lists/*
# Ubuntu 24.04 ships with user 'ubuntu' already created with UID 1000.
USER ubuntu
# install Rust + musl target as dev user
RUN curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal && \
~/.cargo/bin/rustup target add aarch64-unknown-linux-musl && \
~/.cargo/bin/rustup component add clippy rustfmt
ENV PATH="/home/ubuntu/.cargo/bin:${PATH}"
WORKDIR /workspace

30
.devcontainer/README.md Normal file

@@ -0,0 +1,30 @@
# Containerized Development
We provide the following options to facilitate Codex development in a container. This is particularly useful for verifying the Linux build when working on a macOS host.
## Docker
To build the Docker image locally for x64 and then run it with the repo mounted under `/workspace`:
```shell
CODEX_DOCKER_IMAGE_NAME=codex-linux-dev
docker build --platform=linux/amd64 -t "$CODEX_DOCKER_IMAGE_NAME" ./.devcontainer
docker run --platform=linux/amd64 --rm -it -e CARGO_TARGET_DIR=/workspace/codex-rs/target-amd64 -v "$PWD":/workspace -w /workspace/codex-rs "$CODEX_DOCKER_IMAGE_NAME"
```
Note that `/workspace/target` will contain the binaries built for your host platform, so we include `-e CARGO_TARGET_DIR=/workspace/codex-rs/target-amd64` in the `docker run` command so that the binaries built inside your container are written to a separate directory.
For arm64, specify `--platform=linux/arm64` instead for both `docker build` and `docker run`.
Currently, the `Dockerfile` works for both x64 and arm64 Linux, though you need to run `rustup target add x86_64-unknown-linux-musl` yourself to install the musl toolchain for x64.
## VS Code
VS Code recognizes the `devcontainer.json` file and gives you the option to develop Codex in a container. Currently, `devcontainer.json` builds and runs the `arm64` flavor of the container.
From the integrated terminal in VS Code, you can build either flavor of the `arm64` build (GNU or musl):
```shell
cargo build --target aarch64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu
```

27
.devcontainer/devcontainer.json Normal file

@@ -0,0 +1,27 @@
{
"name": "Codex",
"build": {
"dockerfile": "Dockerfile",
"context": "..",
"platform": "linux/arm64"
},
/* Force VS Code to run the container as arm64 in
case your host is x86 (or vice-versa). */
"runArgs": ["--platform=linux/arm64"],
"containerEnv": {
"RUST_BACKTRACE": "1",
"CARGO_TARGET_DIR": "${containerWorkspaceFolder}/codex-rs/target-arm64"
},
"remoteUser": "ubuntu",
"customizations": {
"vscode": {
"settings": {
"terminal.integrated.defaultProfile.linux": "bash"
},
"extensions": ["rust-lang.rust-analyzer"]
}
}
}

1
.github/actions/codex/.gitignore vendored Normal file

@@ -0,0 +1 @@
/node_modules/

8
.github/actions/codex/.prettierrc.toml vendored Normal file

@@ -0,0 +1,8 @@
printWidth = 80
quoteProps = "consistent"
semi = true
tabWidth = 2
trailingComma = "all"
# Preserve existing behavior for markdown/text wrapping.
proseWrap = "preserve"

140
.github/actions/codex/README.md vendored Normal file

@@ -0,0 +1,140 @@
# openai/codex-action
`openai/codex-action` is a GitHub Action that facilitates the use of [Codex](https://github.com/openai/codex) on GitHub issues and pull requests. Using the action, you associate **labels** with prompts so that Codex runs with the appropriate prompt for the given context. Codex will respond by posting comments or creating PRs, whichever you specify!
Here is a sample workflow that uses `openai/codex-action`:
```yaml
name: Codex
on:
issues:
types: [opened, labeled]
pull_request:
branches: [main]
types: [labeled]
jobs:
codex:
if: ... # optional, but can be effective in conserving CI resources
runs-on: ubuntu-latest
# TODO(mbolin): Need to verify if/when `write` is necessary.
permissions:
contents: write
issues: write
pull-requests: write
steps:
# By default, Codex runs with network disabled when using --full-auto, so perform
# any setup that requires network (such as installing dependencies)
# before openai/codex-action.
- name: Checkout repository
uses: actions/checkout@v4
- name: Run Codex
uses: openai/codex-action@latest
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
github_token: ${{ secrets.GITHUB_TOKEN }}
```
See sample usage in [`codex.yml`](../../workflows/codex.yml).
## Triggering the Action
Using the sample workflow above, we have:
```yaml
on:
issues:
types: [opened, labeled]
pull_request:
branches: [main]
types: [labeled]
```
which means our workflow will be triggered when any of the following events occur:
- a label is added to an issue
- a label is added to a pull request against the `main` branch
### Label-Based Triggers
To define a GitHub label that should trigger Codex, create a file named `.github/codex/labels/LABEL-NAME.md` in your repository where `LABEL-NAME` is the name of the label. The content of the file is the prompt template to use when the label is added (see more on [Prompt Template Variables](#prompt-template-variables) below).
For example, if the file `.github/codex/labels/codex-review.md` exists, then:
- Adding the `codex-review` label will trigger the workflow containing the `openai/codex-action` GitHub Action.
- When `openai/codex-action` starts, it will replace the `codex-review` label with `codex-review-in-progress`.
- When `openai/codex-action` is finished, it will replace the `codex-review-in-progress` label with `codex-review-completed`.
If Codex sees that either `codex-review-in-progress` or `codex-review-completed` is already present, it will not perform the action.
As determined by the [default config](./src/default-label-config.ts), Codex will act on the following labels by default:
- Adding the `codex-review` label to a pull request will have Codex review the PR and add it to the PR as a comment.
- Adding the `codex-triage` label to an issue will have Codex investigate the issue and report its findings as a comment.
- Adding the `codex-issue-fix` label to an issue will have Codex attempt to fix the issue and create a PR with the fix, if any.
## Action Inputs
The `openai/codex-action` GitHub Action takes the following inputs:
### `openai_api_key` (required)
Set your `OPENAI_API_KEY` as a [repository secret](https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions). See **Secrets and variables** then **Actions** in the settings for your GitHub repo.
Note that the secret name does not have to be `OPENAI_API_KEY`. For example, you might want to name it `CODEX_OPENAI_API_KEY` and then configure it on `openai/codex-action` as follows:
```yaml
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
```
### `github_token` (required)
This is required so that Codex can post a comment or create a PR. Set this value on the action as follows:
```yaml
github_token: ${{ secrets.GITHUB_TOKEN }}
```
### `codex_args`
A whitespace-delimited list of arguments to pass to Codex. Defaults to `--full-auto`, but if you want to override the default model to use `o3`:
```yaml
codex_args: "--full-auto --model o3"
```
For more complex configurations, use the `codex_home` input.
### `codex_home`
If set, the value to use for the `$CODEX_HOME` environment variable when running Codex. As explained [in the docs](https://github.com/openai/codex/tree/main/codex-rs#readme), this folder can contain the `config.toml` to configure Codex, custom instructions, and log files.
This should be a relative path within your repo.
## Prompt Template Variables
As shown above, `"prompt"` and `"promptPath"` are used to define prompt templates that will be populated and passed to Codex in response to certain events. All template variables are of the form `{CODEX_ACTION_...}` and the supported values are defined below.
### `CODEX_ACTION_ISSUE_TITLE`
If the action was triggered on a GitHub issue, this is the issue title.
Specifically it is read as the `.issue.title` from the `$GITHUB_EVENT_PATH`.
### `CODEX_ACTION_ISSUE_BODY`
If the action was triggered on a GitHub issue, this is the issue body.
Specifically it is read as the `.issue.body` from the `$GITHUB_EVENT_PATH`.
### `CODEX_ACTION_GITHUB_EVENT_PATH`
The value of the `$GITHUB_EVENT_PATH` environment variable, which is the path to the file that contains the JSON payload for the event that triggered the workflow. Codex can use `jq` to read only the fields of interest from this file.
### `CODEX_ACTION_PR_DIFF`
If the action was triggered on a pull request, this is the diff between the base and head commits of the PR. It is the output from `git diff`.
Note that the content of the diff could be quite large, so it is generally safer to point Codex at `CODEX_ACTION_GITHUB_EVENT_PATH` and let it decide how it wants to explore the change.

124
.github/actions/codex/action.yml vendored Normal file

@@ -0,0 +1,124 @@
name: "Codex [reusable action]"
description: "A reusable action that runs a Codex model."
inputs:
openai_api_key:
description: "The value to use as the OPENAI_API_KEY environment variable when running Codex."
required: true
trigger_phrase:
description: "Text to trigger Codex from a PR/issue body or comment."
required: false
default: ""
github_token:
description: "Token so Codex can comment on the PR or issue."
required: true
codex_args:
description: "A whitespace-delimited list of arguments to pass to Codex. Due to limitations in YAML, arguments with spaces are not supported. For more complex configurations, use the `codex_home` input."
required: false
default: "--config hide_agent_reasoning=true --full-auto"
codex_home:
description: "Value to use as the CODEX_HOME environment variable when running Codex."
required: false
codex_release_tag:
description: "The release tag of the Codex model to run."
required: false
default: "codex-rs-ca8e97fcbcb991e542b8689f2d4eab9d30c399d6-1-rust-v0.0.2505302325"
runs:
using: "composite"
steps:
# Do this in Bash so we do not even bother to install Bun if the sender does
# not have write access to the repo.
- name: Verify user has write access to the repo.
env:
GH_TOKEN: ${{ github.token }}
shell: bash
run: |
set -euo pipefail
PERMISSION=$(gh api \
"/repos/${GITHUB_REPOSITORY}/collaborators/${{ github.event.sender.login }}/permission" \
| jq -r '.permission')
if [[ "$PERMISSION" != "admin" && "$PERMISSION" != "write" ]]; then
exit 1
fi
- name: Download Codex
env:
GH_TOKEN: ${{ github.token }}
shell: bash
run: |
set -euo pipefail
# Determine OS/arch and corresponding Codex artifact name.
uname_s=$(uname -s)
uname_m=$(uname -m)
case "$uname_s" in
Linux*) os="linux" ;;
Darwin*) os="apple-darwin" ;;
*) echo "Unsupported operating system: $uname_s"; exit 1 ;;
esac
case "$uname_m" in
x86_64*) arch="x86_64" ;;
arm64*|aarch64*) arch="aarch64" ;;
*) echo "Unsupported architecture: $uname_m"; exit 1 ;;
esac
# linux builds differentiate between musl and gnu.
if [[ "$os" == "linux" ]]; then
if [[ "$arch" == "x86_64" ]]; then
triple="${arch}-unknown-linux-musl"
else
# Only other supported linux build is aarch64 gnu.
triple="${arch}-unknown-linux-gnu"
fi
else
# macOS
triple="${arch}-apple-darwin"
fi
# Note that if we start baking version numbers into the artifact name,
# we will need to update this action.yml file to match.
artifact="codex-exec-${triple}.tar.gz"
gh release download ${{ inputs.codex_release_tag }} --repo openai/codex \
--pattern "$artifact" --output - \
| tar xzO > /usr/local/bin/codex-exec
chmod +x /usr/local/bin/codex-exec
# Display Codex version to confirm binary integrity; ensure we point it
# at the checked-out repository via --cd so that any subsequent commands
# use the correct working directory.
codex-exec --cd "$GITHUB_WORKSPACE" --version
- name: Install Bun
uses: oven-sh/setup-bun@v2
with:
bun-version: 1.2.11
- name: Install dependencies
shell: bash
run: |
cd ${{ github.action_path }}
bun install --production
- name: Run Codex
shell: bash
run: bun run ${{ github.action_path }}/src/main.ts
# Process args plus environment variables often have a max of 128 KiB,
# so we should fit within that limit?
env:
INPUT_CODEX_ARGS: ${{ inputs.codex_args || '' }}
INPUT_CODEX_HOME: ${{ inputs.codex_home || ''}}
INPUT_TRIGGER_PHRASE: ${{ inputs.trigger_phrase || '' }}
OPENAI_API_KEY: ${{ inputs.openai_api_key }}
GITHUB_TOKEN: ${{ inputs.github_token }}
GITHUB_EVENT_ACTION: ${{ github.event.action || '' }}
GITHUB_EVENT_LABEL_NAME: ${{ github.event.label.name || '' }}
GITHUB_EVENT_ISSUE_NUMBER: ${{ github.event.issue.number || '' }}
GITHUB_EVENT_ISSUE_BODY: ${{ github.event.issue.body || '' }}
GITHUB_EVENT_REVIEW_BODY: ${{ github.event.review.body || '' }}
GITHUB_EVENT_COMMENT_BODY: ${{ github.event.comment.body || '' }}

85
.github/actions/codex/bun.lock vendored Normal file

@@ -0,0 +1,85 @@
{
"lockfileVersion": 1,
"workspaces": {
"": {
"name": "codex-action",
"dependencies": {
"@actions/core": "^1.11.1",
"@actions/github": "^6.0.1",
},
"devDependencies": {
"@types/bun": "^1.2.11",
"@types/node": "^22.15.21",
"prettier": "^3.5.3",
"typescript": "^5.8.3",
},
},
},
"packages": {
"@actions/core": ["@actions/core@1.11.1", "", { "dependencies": { "@actions/exec": "^1.1.1", "@actions/http-client": "^2.0.1" } }, "sha512-hXJCSrkwfA46Vd9Z3q4cpEpHB1rL5NG04+/rbqW9d3+CSvtB1tYe8UTpAlixa1vj0m/ULglfEK2UKxMGxCxv5A=="],
"@actions/exec": ["@actions/exec@1.1.1", "", { "dependencies": { "@actions/io": "^1.0.1" } }, "sha512-+sCcHHbVdk93a0XT19ECtO/gIXoxvdsgQLzb2fE2/5sIZmWQuluYyjPQtrtTHdU1YzTZ7bAPN4sITq2xi1679w=="],
"@actions/github": ["@actions/github@6.0.1", "", { "dependencies": { "@actions/http-client": "^2.2.0", "@octokit/core": "^5.0.1", "@octokit/plugin-paginate-rest": "^9.2.2", "@octokit/plugin-rest-endpoint-methods": "^10.4.0", "@octokit/request": "^8.4.1", "@octokit/request-error": "^5.1.1", "undici": "^5.28.5" } }, "sha512-xbZVcaqD4XnQAe35qSQqskb3SqIAfRyLBrHMd/8TuL7hJSz2QtbDwnNM8zWx4zO5l2fnGtseNE3MbEvD7BxVMw=="],
"@actions/http-client": ["@actions/http-client@2.2.3", "", { "dependencies": { "tunnel": "^0.0.6", "undici": "^5.25.4" } }, "sha512-mx8hyJi/hjFvbPokCg4uRd4ZX78t+YyRPtnKWwIl+RzNaVuFpQHfmlGVfsKEJN8LwTCvL+DfVgAM04XaHkm6bA=="],
"@actions/io": ["@actions/io@1.1.3", "", {}, "sha512-wi9JjgKLYS7U/z8PPbco+PvTb/nRWjeoFlJ1Qer83k/3C5PHQi28hiVdeE2kHXmIL99mQFawx8qt/JPjZilJ8Q=="],
"@fastify/busboy": ["@fastify/busboy@2.1.1", "", {}, "sha512-vBZP4NlzfOlerQTnba4aqZoMhE/a9HY7HRqoOPaETQcSQuWEIyZMHGfVu6w9wGtGK5fED5qRs2DteVCjOH60sA=="],
"@octokit/auth-token": ["@octokit/auth-token@4.0.0", "", {}, "sha512-tY/msAuJo6ARbK6SPIxZrPBms3xPbfwBrulZe0Wtr/DIY9lje2HeV1uoebShn6mx7SjCHif6EjMvoREj+gZ+SA=="],
"@octokit/core": ["@octokit/core@5.2.1", "", { "dependencies": { "@octokit/auth-token": "^4.0.0", "@octokit/graphql": "^7.1.0", "@octokit/request": "^8.4.1", "@octokit/request-error": "^5.1.1", "@octokit/types": "^13.0.0", "before-after-hook": "^2.2.0", "universal-user-agent": "^6.0.0" } }, "sha512-dKYCMuPO1bmrpuogcjQ8z7ICCH3FP6WmxpwC03yjzGfZhj9fTJg6+bS1+UAplekbN2C+M61UNllGOOoAfGCrdQ=="],
"@octokit/endpoint": ["@octokit/endpoint@9.0.6", "", { "dependencies": { "@octokit/types": "^13.1.0", "universal-user-agent": "^6.0.0" } }, "sha512-H1fNTMA57HbkFESSt3Y9+FBICv+0jFceJFPWDePYlR/iMGrwM5ph+Dd4XRQs+8X+PUFURLQgX9ChPfhJ/1uNQw=="],
"@octokit/graphql": ["@octokit/graphql@7.1.1", "", { "dependencies": { "@octokit/request": "^8.4.1", "@octokit/types": "^13.0.0", "universal-user-agent": "^6.0.0" } }, "sha512-3mkDltSfcDUoa176nlGoA32RGjeWjl3K7F/BwHwRMJUW/IteSa4bnSV8p2ThNkcIcZU2umkZWxwETSSCJf2Q7g=="],
"@octokit/openapi-types": ["@octokit/openapi-types@24.2.0", "", {}, "sha512-9sIH3nSUttelJSXUrmGzl7QUBFul0/mB8HRYl3fOlgHbIWG+WnYDXU3v/2zMtAvuzZ/ed00Ei6on975FhBfzrg=="],
"@octokit/plugin-paginate-rest": ["@octokit/plugin-paginate-rest@9.2.2", "", { "dependencies": { "@octokit/types": "^12.6.0" }, "peerDependencies": { "@octokit/core": "5" } }, "sha512-u3KYkGF7GcZnSD/3UP0S7K5XUFT2FkOQdcfXZGZQPGv3lm4F2Xbf71lvjldr8c1H3nNbF+33cLEkWYbokGWqiQ=="],
"@octokit/plugin-rest-endpoint-methods": ["@octokit/plugin-rest-endpoint-methods@10.4.1", "", { "dependencies": { "@octokit/types": "^12.6.0" }, "peerDependencies": { "@octokit/core": "5" } }, "sha512-xV1b+ceKV9KytQe3zCVqjg+8GTGfDYwaT1ATU5isiUyVtlVAO3HNdzpS4sr4GBx4hxQ46s7ITtZrAsxG22+rVg=="],
"@octokit/request": ["@octokit/request@8.4.1", "", { "dependencies": { "@octokit/endpoint": "^9.0.6", "@octokit/request-error": "^5.1.1", "@octokit/types": "^13.1.0", "universal-user-agent": "^6.0.0" } }, "sha512-qnB2+SY3hkCmBxZsR/MPCybNmbJe4KAlfWErXq+rBKkQJlbjdJeS85VI9r8UqeLYLvnAenU8Q1okM/0MBsAGXw=="],
"@octokit/request-error": ["@octokit/request-error@5.1.1", "", { "dependencies": { "@octokit/types": "^13.1.0", "deprecation": "^2.0.0", "once": "^1.4.0" } }, "sha512-v9iyEQJH6ZntoENr9/yXxjuezh4My67CBSu9r6Ve/05Iu5gNgnisNWOsoJHTP6k0Rr0+HQIpnH+kyammu90q/g=="],
"@octokit/types": ["@octokit/types@13.10.0", "", { "dependencies": { "@octokit/openapi-types": "^24.2.0" } }, "sha512-ifLaO34EbbPj0Xgro4G5lP5asESjwHracYJvVaPIyXMuiuXLlhic3S47cBdTb+jfODkTE5YtGCLt3Ay3+J97sA=="],
"@types/bun": ["@types/bun@1.2.13", "", { "dependencies": { "bun-types": "1.2.13" } }, "sha512-u6vXep/i9VBxoJl3GjZsl/BFIsvML8DfVDO0RYLEwtSZSp981kEO1V5NwRcO1CPJ7AmvpbnDCiMKo3JvbDEjAg=="],
"@types/node": ["@types/node@22.15.21", "", { "dependencies": { "undici-types": "~6.21.0" } }, "sha512-EV/37Td6c+MgKAbkcLG6vqZ2zEYHD7bvSrzqqs2RIhbA6w3x+Dqz8MZM3sP6kGTeLrdoOgKZe+Xja7tUB2DNkQ=="],
"before-after-hook": ["before-after-hook@2.2.3", "", {}, "sha512-NzUnlZexiaH/46WDhANlyR2bXRopNg4F/zuSA3OpZnllCUgRaOF2znDioDWrmbNVsuZk6l9pMquQB38cfBZwkQ=="],
"bun-types": ["bun-types@1.2.13", "", { "dependencies": { "@types/node": "*" } }, "sha512-rRjA1T6n7wto4gxhAO/ErZEtOXyEZEmnIHQfl0Dt1QQSB4QV0iP6BZ9/YB5fZaHFQ2dwHFrmPaRQ9GGMX01k9Q=="],
"deprecation": ["deprecation@2.3.1", "", {}, "sha512-xmHIy4F3scKVwMsQ4WnVaS8bHOx0DmVwRywosKhaILI0ywMDWPtBSku2HNxRvF7jtwDRsoEwYQSfbxj8b7RlJQ=="],
"once": ["once@1.4.0", "", { "dependencies": { "wrappy": "1" } }, "sha512-lNaJgI+2Q5URQBkccEKHTQOPaXdUxnZZElQTZY0MFUAuaEqe1E+Nyvgdz/aIyNi6Z9MzO5dv1H8n58/GELp3+w=="],
"prettier": ["prettier@3.5.3", "", { "bin": { "prettier": "bin/prettier.cjs" } }, "sha512-QQtaxnoDJeAkDvDKWCLiwIXkTgRhwYDEQCghU9Z6q03iyek/rxRh/2lC3HB7P8sWT2xC/y5JDctPLBIGzHKbhw=="],
"tunnel": ["tunnel@0.0.6", "", {}, "sha512-1h/Lnq9yajKY2PEbBadPXj3VxsDDu844OnaAo52UVmIzIvwwtBPIuNvkjuzBlTWpfJyUbG3ez0KSBibQkj4ojg=="],
"typescript": ["typescript@5.8.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ=="],
"undici": ["undici@5.29.0", "", { "dependencies": { "@fastify/busboy": "^2.0.0" } }, "sha512-raqeBD6NQK4SkWhQzeYKd1KmIG6dllBOTt55Rmkt4HtI9mwdWtJljnrXjAFUBLTSN67HWrOIZ3EPF4kjUw80Bg=="],
"undici-types": ["undici-types@6.21.0", "", {}, "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ=="],
"universal-user-agent": ["universal-user-agent@6.0.1", "", {}, "sha512-yCzhz6FN2wU1NiiQRogkTQszlQSlpWaw8SvVegAc+bDxbzHgh1vX8uIe8OYyMH6DwH+sdTJsgMl36+mSMdRJIQ=="],
"wrappy": ["wrappy@1.0.2", "", {}, "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ=="],
"@octokit/plugin-paginate-rest/@octokit/types": ["@octokit/types@12.6.0", "", { "dependencies": { "@octokit/openapi-types": "^20.0.0" } }, "sha512-1rhSOfRa6H9w4YwK0yrf5faDaDTb+yLyBUKOCV4xtCDB5VmIPqd/v9yr9o6SAzOAlRxMiRiCic6JVM1/kunVkw=="],
"@octokit/plugin-rest-endpoint-methods/@octokit/types": ["@octokit/types@12.6.0", "", { "dependencies": { "@octokit/openapi-types": "^20.0.0" } }, "sha512-1rhSOfRa6H9w4YwK0yrf5faDaDTb+yLyBUKOCV4xtCDB5VmIPqd/v9yr9o6SAzOAlRxMiRiCic6JVM1/kunVkw=="],
"@octokit/plugin-paginate-rest/@octokit/types/@octokit/openapi-types": ["@octokit/openapi-types@20.0.0", "", {}, "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA=="],
"@octokit/plugin-rest-endpoint-methods/@octokit/types/@octokit/openapi-types": ["@octokit/openapi-types@20.0.0", "", {}, "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA=="],
}
}

21
.github/actions/codex/package.json vendored Normal file

@@ -0,0 +1,21 @@
{
"name": "codex-action",
"version": "0.0.0",
"private": true,
"scripts": {
"format": "prettier --check src",
"format:fix": "prettier --write src",
"test": "bun test",
"typecheck": "tsc"
},
"dependencies": {
"@actions/core": "^1.11.1",
"@actions/github": "^6.0.1"
},
"devDependencies": {
"@types/bun": "^1.2.11",
"@types/node": "^22.15.21",
"prettier": "^3.5.3",
"typescript": "^5.8.3"
}
}

85
.github/actions/codex/src/add-reaction.ts vendored Normal file

@@ -0,0 +1,85 @@
import * as github from "@actions/github";
import type { EnvContext } from "./env-context";
/**
* Add an "eyes" reaction to the entity (issue, issue comment, or pull request
* review comment) that triggered the current Codex invocation.
*
* The purpose is to provide immediate feedback to the user similar to the
* *-in-progress label flow indicating that the bot has acknowledged the
* request and is working on it.
*
* We attempt to add the reaction best suited for the current GitHub event:
*
* • issues → POST /repos/{owner}/{repo}/issues/{issue_number}/reactions
* • issue_comment → POST /repos/{owner}/{repo}/issues/comments/{comment_id}/reactions
* • pull_request_review_comment → POST /repos/{owner}/{repo}/pulls/comments/{comment_id}/reactions
*
* If the specific target is unavailable (e.g. unexpected payload shape) we
* silently skip instead of failing the whole action because the reaction is
* merely cosmetic.
*/
export async function addEyesReaction(ctx: EnvContext): Promise<void> {
const octokit = ctx.getOctokit();
const { owner, repo } = github.context.repo;
const eventName = github.context.eventName;
try {
switch (eventName) {
case "issue_comment": {
const commentId = (github.context.payload as any)?.comment?.id;
if (commentId) {
await octokit.rest.reactions.createForIssueComment({
owner,
repo,
comment_id: commentId,
content: "eyes",
});
return;
}
break;
}
case "pull_request_review_comment": {
const commentId = (github.context.payload as any)?.comment?.id;
if (commentId) {
await octokit.rest.reactions.createForPullRequestReviewComment({
owner,
repo,
comment_id: commentId,
content: "eyes",
});
return;
}
break;
}
case "issues": {
const issueNumber = github.context.issue.number;
if (issueNumber) {
await octokit.rest.reactions.createForIssue({
owner,
repo,
issue_number: issueNumber,
content: "eyes",
});
return;
}
break;
}
default: {
// Fallback: try to react to the issue/PR if we have a number.
const issueNumber = github.context.issue.number;
if (issueNumber) {
await octokit.rest.reactions.createForIssue({
owner,
repo,
issue_number: issueNumber,
content: "eyes",
});
}
}
}
} catch (error) {
// Do not fail the action if reaction creation fails: log and continue.
console.warn(`Failed to add \"eyes\" reaction: ${error}`);
}
}

53
.github/actions/codex/src/comment.ts vendored Normal file

@@ -0,0 +1,53 @@
import type { EnvContext } from "./env-context";
import { runCodex } from "./run-codex";
import { postComment } from "./post-comment";
import { addEyesReaction } from "./add-reaction";
/**
* Handle `issue_comment` and `pull_request_review_comment` events once we know
* the action is supported.
*/
export async function onComment(ctx: EnvContext): Promise<void> {
const triggerPhrase = ctx.tryGet("INPUT_TRIGGER_PHRASE");
if (!triggerPhrase) {
console.warn("Empty trigger phrase: skipping.");
return;
}
// Attempt to get the body of the comment from the environment. Depending on
// the event type either `GITHUB_EVENT_COMMENT_BODY` (issue & PR comments) or
// `GITHUB_EVENT_REVIEW_BODY` (PR reviews) is set.
const commentBody =
ctx.tryGetNonEmpty("GITHUB_EVENT_COMMENT_BODY") ??
ctx.tryGetNonEmpty("GITHUB_EVENT_REVIEW_BODY") ??
ctx.tryGetNonEmpty("GITHUB_EVENT_ISSUE_BODY");
if (!commentBody) {
console.warn("Comment body not found in environment: skipping.");
return;
}
// Check if the trigger phrase is present.
if (!commentBody.includes(triggerPhrase)) {
console.log(
`Trigger phrase '${triggerPhrase}' not found: nothing to do for this comment.`,
);
return;
}
// Derive the prompt by removing the trigger phrase. Remove only the first
// occurrence to keep any additional occurrences that might be meaningful.
const prompt = commentBody.replace(triggerPhrase, "").trim();
if (prompt.length === 0) {
console.warn("Prompt is empty after removing trigger phrase: skipping");
return;
}
// Provide immediate feedback that we are working on the request.
await addEyesReaction(ctx);
// Run Codex and post the response as a new comment.
const lastMessage = await runCodex(prompt, ctx);
await postComment(lastMessage, ctx);
}

11
.github/actions/codex/src/config.ts vendored Normal file

@@ -0,0 +1,11 @@
import { readdirSync, statSync } from "fs";
import * as path from "path";
export interface Config {
labels: Record<string, LabelConfig>;
}
export interface LabelConfig {
/** Returns the prompt template. */
getPromptTemplate(): string;
}

44
.github/actions/codex/src/default-label-config.ts vendored Normal file

@@ -0,0 +1,44 @@
import type { Config } from "./config";
export function getDefaultConfig(): Config {
return {
labels: {
"codex-investigate-issue": {
getPromptTemplate: () =>
`
Troubleshoot whether the reported issue is valid.
Provide a concise and respectful comment summarizing the findings.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}
`.trim(),
},
"codex-code-review": {
getPromptTemplate: () =>
`
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the \`base\` and \`head\` refs that define this PR. Both refs are available locally.
`.trim(),
},
"codex-attempt-fix": {
getPromptTemplate: () =>
`
Attempt to solve the reported issue.
If a code change is required, create a new branch, commit the fix, and open a pull-request that resolves the problem.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}
`.trim(),
},
},
};
}

116
.github/actions/codex/src/env-context.ts vendored Normal file

@@ -0,0 +1,116 @@
/*
* Centralised access to environment variables used by the Codex GitHub
* Action.
*
* To enable proper unit-testing we avoid reading from `process.env` at module
* initialisation time. Instead an `EnvContext` object is created (usually from
* the real `process.env`) and passed around explicitly or, where that is not
* yet practical, imported as the shared `defaultContext` singleton. Tests can
* create their own context backed by a stubbed map of variables without having
* to mutate global state.
*/
import { fail } from "./fail";
import * as github from "@actions/github";
export interface EnvContext {
/**
* Return the value for a given environment variable or terminate the action
* via `fail` if it is missing / empty.
*/
get(name: string): string;
/**
* Attempt to read an environment variable. Returns the value when present;
* otherwise returns undefined (does not call `fail`).
*/
tryGet(name: string): string | undefined;
/**
* Attempt to read an environment variable. Returns non-empty string value or
* null if unset or empty string.
*/
tryGetNonEmpty(name: string): string | null;
/**
* Return a memoised Octokit instance authenticated via the token resolved
* from the provided argument (when defined) or the environment variables
* `GITHUB_TOKEN`/`GH_TOKEN`.
*
* Subsequent calls return the same cached instance to avoid spawning
* multiple REST clients within a single action run.
*/
getOctokit(token?: string): ReturnType<typeof github.getOctokit>;
}
/** Internal helper *not* exported. */
function _getRequiredEnv(
name: string,
env: Record<string, string | undefined>,
): string | undefined {
const value = env[name];
// Avoid leaking secrets into logs while still logging non-secret variables.
if (name.endsWith("KEY") || name.endsWith("TOKEN")) {
if (value) {
console.log(`value for ${name} was found`);
}
} else {
console.log(`${name}=${value}`);
}
return value;
}
/** Create a context backed by the supplied environment map (defaults to `process.env`). */
export function createEnvContext(
env: Record<string, string | undefined> = process.env,
): EnvContext {
// Lazily instantiated Octokit client shared across this context.
let cachedOctokit: ReturnType<typeof github.getOctokit> | null = null;
return {
get(name: string): string {
const value = _getRequiredEnv(name, env);
if (value == null) {
fail(`Missing required environment variable: ${name}`);
}
return value;
},
tryGet(name: string): string | undefined {
return _getRequiredEnv(name, env);
},
tryGetNonEmpty(name: string): string | null {
const value = _getRequiredEnv(name, env);
return value == null || value === "" ? null : value;
},
getOctokit(token?: string) {
if (cachedOctokit) {
return cachedOctokit;
}
// Determine the token to authenticate with.
const githubToken = token ?? env["GITHUB_TOKEN"] ?? env["GH_TOKEN"];
if (!githubToken) {
fail(
"Unable to locate a GitHub token. `github_token` should have been set on the action.",
);
}
cachedOctokit = github.getOctokit(githubToken!);
return cachedOctokit;
},
};
}
/**
* Shared context built from the actual `process.env`. Production code that is
* not yet refactored to receive a context explicitly may import and use this
* singleton. Tests should avoid the singleton and instead pass their own
* context to the functions they exercise.
*/
export const defaultContext: EnvContext = createEnvContext();

4
.github/actions/codex/src/fail.ts vendored Normal file

@@ -0,0 +1,4 @@
export function fail(message: string): never {
console.error(message);
process.exit(1);
}

149
.github/actions/codex/src/git-helpers.ts vendored Normal file

@@ -0,0 +1,149 @@
import { spawnSync } from "child_process";
import * as github from "@actions/github";
import { EnvContext } from "./env-context";
function runGit(args: string[], silent = true): string {
console.info(`Running git ${args.join(" ")}`);
const res = spawnSync("git", args, {
encoding: "utf8",
stdio: silent ? ["ignore", "pipe", "pipe"] : "inherit",
});
if (res.error) {
throw res.error;
}
if (res.status !== 0) {
// Return stderr so caller may handle; else throw.
throw new Error(
`git ${args.join(" ")} failed with code ${res.status}: ${res.stderr}`,
);
}
return res.stdout.trim();
}
function stageAllChanges() {
runGit(["add", "-A"]);
}
function hasStagedChanges(): boolean {
const res = spawnSync("git", ["diff", "--cached", "--quiet", "--exit-code"]);
return res.status !== 0;
}
function ensureOnBranch(
issueNumber: number,
protectedBranches: string[],
suggestedSlug?: string,
): string {
let branch = "";
try {
branch = runGit(["symbolic-ref", "--short", "-q", "HEAD"]);
} catch {
branch = "";
}
// If detached HEAD or on a protected branch, create a new branch.
if (!branch || protectedBranches.includes(branch)) {
if (suggestedSlug) {
const safeSlug = suggestedSlug
.toLowerCase()
.replace(/[^\w\s-]/g, "")
.trim()
.replace(/\s+/g, "-");
branch = `codex-fix-${issueNumber}-${safeSlug}`;
} else {
branch = `codex-fix-${issueNumber}-${Date.now()}`;
}
runGit(["switch", "-c", branch]);
}
return branch;
}
function commitIfNeeded(issueNumber: number) {
if (hasStagedChanges()) {
runGit([
"commit",
"-m",
`fix: automated fix for #${issueNumber} via Codex`,
]);
}
}
function pushBranch(branch: string, githubToken: string, ctx: EnvContext) {
const repoSlug = ctx.get("GITHUB_REPOSITORY"); // owner/repo
const remoteUrl = `https://x-access-token:${githubToken}@github.com/${repoSlug}.git`;
runGit(["push", "--force-with-lease", "-u", remoteUrl, `HEAD:${branch}`]);
}
/**
* If this returns a string, it is the URL of the created PR.
*/
export async function maybePublishPRForIssue(
issueNumber: number,
lastMessage: string,
ctx: EnvContext,
): Promise<string | undefined> {
// Only proceed if GITHUB_TOKEN available.
const githubToken =
ctx.tryGetNonEmpty("GITHUB_TOKEN") ?? ctx.tryGetNonEmpty("GH_TOKEN");
if (!githubToken) {
console.warn("No GitHub token - skipping PR creation.");
return undefined;
}
// Print `git status` for debugging.
runGit(["status"]);
// Stage any remaining changes so they can be committed and pushed.
stageAllChanges();
const octokit = ctx.getOctokit(githubToken);
const { owner, repo } = github.context.repo;
// Determine default branch to treat as protected.
let defaultBranch = "main";
try {
const repoInfo = await octokit.rest.repos.get({ owner, repo });
defaultBranch = repoInfo.data.default_branch ?? "main";
} catch (e) {
console.warn(`Failed to get default branch, assuming 'main': ${e}`);
}
const sanitizedMessage = lastMessage.replace(/\u2022/g, "-");
const [summaryLine] = sanitizedMessage.split(/\r?\n/);
const branch = ensureOnBranch(issueNumber, [defaultBranch, "master"], summaryLine);
commitIfNeeded(issueNumber);
pushBranch(branch, githubToken, ctx);
// Try to find existing PR for this branch
const headParam = `${owner}:${branch}`;
const existing = await octokit.rest.pulls.list({
owner,
repo,
head: headParam,
state: "open",
});
if (existing.data.length > 0) {
return existing.data[0].html_url;
}
// Determine base branch (default to main)
let baseBranch = "main";
try {
const repoInfo = await octokit.rest.repos.get({ owner, repo });
baseBranch = repoInfo.data.default_branch ?? "main";
} catch (e) {
console.warn(`Failed to get default branch, assuming 'main': ${e}`);
}
const pr = await octokit.rest.pulls.create({
owner,
repo,
title: summaryLine,
head: branch,
base: baseBranch,
body: sanitizedMessage,
});
return pr.data.html_url;
}

16
.github/actions/codex/src/git-user.ts vendored Normal file

@@ -0,0 +1,16 @@
export function setGitHubActionsUser(): void {
const commands = [
["git", "config", "--global", "user.name", "github-actions[bot]"],
[
"git",
"config",
"--global",
"user.email",
"41898282+github-actions[bot]@users.noreply.github.com",
],
];
for (const command of commands) {
Bun.spawnSync(command);
}
}


@@ -0,0 +1,11 @@
import * as pathMod from "path";
import { EnvContext } from "./env-context";
export function resolveWorkspacePath(path: string, ctx: EnvContext): string {
if (pathMod.isAbsolute(path)) {
return path;
} else {
const workspace = ctx.get("GITHUB_WORKSPACE");
return pathMod.join(workspace, path);
}
}

56
.github/actions/codex/src/load-config.ts vendored Normal file

@@ -0,0 +1,56 @@
import type { Config, LabelConfig } from "./config";
import { getDefaultConfig } from "./default-label-config";
import { readFileSync, readdirSync, statSync } from "fs";
import * as path from "path";
/**
* Build an in-memory configuration object by scanning the repository for
* Markdown templates located in `.github/codex/labels`.
*
* Each `*.md` file in that directory represents a label that can trigger the
* Codex GitHub Action. The filename **without** the extension is interpreted
* as the label name, e.g. `codex-review.md` ➜ `codex-review`.
*
* For every such label we derive the corresponding `doneLabel` by appending
* the suffix `-completed`.
*/
export function loadConfig(workspace: string): Config {
const labelsDir = path.join(workspace, ".github", "codex", "labels");
let entries: string[];
try {
entries = readdirSync(labelsDir);
} catch {
// If the directory is missing, return the default configuration.
return getDefaultConfig();
}
const labels: Record<string, LabelConfig> = {};
for (const entry of entries) {
if (!entry.endsWith(".md")) {
continue;
}
const fullPath = path.join(labelsDir, entry);
if (!statSync(fullPath).isFile()) {
continue;
}
const labelName = entry.slice(0, -3); // trim ".md"
labels[labelName] = new FileLabelConfig(fullPath);
}
return { labels };
}
class FileLabelConfig implements LabelConfig {
constructor(private readonly promptPath: string) {}
getPromptTemplate(): string {
return readFileSync(this.promptPath, "utf8");
}
}
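To make the filename convention concrete, a sketch assuming a hypothetical labels directory containing `codex-attempt.md` and `codex-triage.md`:

```ts
import { loadConfig } from "./load-config";

// The label name is the filename minus ".md"; each entry reads its prompt
// template from disk on demand when getPromptTemplate() is called.
const config = loadConfig("/home/runner/work/codex/codex"); // assumed workspace
Object.keys(config.labels); // => ["codex-attempt", "codex-triage"]
const template = config.labels["codex-attempt"].getPromptTemplate();
```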

.github/actions/codex/src/main.ts (new file, executable)
@@ -0,0 +1,80 @@
#!/usr/bin/env bun
import type { Config } from "./config";
import { defaultContext, EnvContext } from "./env-context";
import { loadConfig } from "./load-config";
import { setGitHubActionsUser } from "./git-user";
import { onLabeled } from "./process-label";
import { ensureBaseAndHeadCommitsForPRAreAvailable } from "./prompt-template";
import { performAdditionalValidation } from "./verify-inputs";
import { onComment } from "./comment";
import { onReview } from "./review";
async function main(): Promise<void> {
const ctx: EnvContext = defaultContext;
// Build the configuration dynamically by scanning `.github/codex/labels`.
const GITHUB_WORKSPACE = ctx.get("GITHUB_WORKSPACE");
const config: Config = loadConfig(GITHUB_WORKSPACE);
// Optionally perform additional validation of prompt template files.
performAdditionalValidation(config, GITHUB_WORKSPACE);
const GITHUB_EVENT_NAME = ctx.get("GITHUB_EVENT_NAME");
const GITHUB_EVENT_ACTION = ctx.get("GITHUB_EVENT_ACTION");
// Set user.name and user.email to a bot before Codex runs, just in case it
// creates a commit.
setGitHubActionsUser();
switch (GITHUB_EVENT_NAME) {
case "issues": {
if (GITHUB_EVENT_ACTION === "labeled") {
await onLabeled(config, ctx);
return;
} else if (GITHUB_EVENT_ACTION === "opened") {
await onComment(ctx);
return;
}
break;
}
case "issue_comment": {
if (GITHUB_EVENT_ACTION === "created") {
await onComment(ctx);
return;
}
break;
}
case "pull_request": {
if (GITHUB_EVENT_ACTION === "labeled") {
await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
await onLabeled(config, ctx);
return;
}
break;
}
case "pull_request_review": {
await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
if (GITHUB_EVENT_ACTION === "submitted") {
await onReview(ctx);
return;
}
break;
}
case "pull_request_review_comment": {
await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
if (GITHUB_EVENT_ACTION === "created") {
await onComment(ctx);
return;
}
break;
}
}
console.warn(
`Unsupported action '${GITHUB_EVENT_ACTION}' for event '${GITHUB_EVENT_NAME}'.`,
);
}
main();
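For quick reference, the event/action pairs this dispatcher acts on (anything else falls through to the `Unsupported action` warning):

- `issues` / `labeled` → `onLabeled`
- `issues` / `opened` → `onComment`
- `issue_comment` / `created` → `onComment`
- `pull_request` / `labeled` → fetch base and head commits, then `onLabeled`
- `pull_request_review` / `submitted` → fetch commits, then `onReview`
- `pull_request_review_comment` / `created` → fetch commits, then `onComment`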

.github/actions/codex/src/post-comment.ts (new file)
@@ -0,0 +1,62 @@
import { fail } from "./fail";
import * as github from "@actions/github";
import { EnvContext } from "./env-context";
/**
* Post a comment to the issue / pull request currently in scope.
*
* Provide the environment context so that token lookup (inside getOctokit) does
* not rely on global state.
*/
export async function postComment(
commentBody: string,
ctx: EnvContext,
): Promise<void> {
// Append a footer with a link back to the workflow run, if available.
const footer = buildWorkflowRunFooter(ctx);
const bodyWithFooter = footer ? `${commentBody}${footer}` : commentBody;
const octokit = ctx.getOctokit();
console.info("Got Octokit instance for posting comment");
const { owner, repo } = github.context.repo;
const issueNumber = github.context.issue.number;
if (!issueNumber) {
console.warn(
"No issue or pull_request number found in GitHub context; skipping comment creation.",
);
return;
}
try {
console.info("Calling octokit.rest.issues.createComment()");
await octokit.rest.issues.createComment({
owner,
repo,
issue_number: issueNumber,
body: bodyWithFooter,
});
} catch (error) {
fail(`Failed to create comment via GitHub API: ${error}`);
}
}
/**
* Helper to build a Markdown fragment linking back to the workflow run that
* generated the current comment. Returns `undefined` if required environment
* variables are missing, e.g. when running outside of GitHub Actions, so we
* can gracefully skip the footer in those cases.
*/
function buildWorkflowRunFooter(ctx: EnvContext): string | undefined {
const serverUrl =
ctx.tryGetNonEmpty("GITHUB_SERVER_URL") ?? "https://github.com";
const repository = ctx.tryGetNonEmpty("GITHUB_REPOSITORY");
const runId = ctx.tryGetNonEmpty("GITHUB_RUN_ID");
if (!repository || !runId) {
return undefined;
}
const url = `${serverUrl}/${repository}/actions/runs/${runId}`;
return `\n\n---\n*[_View workflow run_](${url})*`;
}
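Assuming `GITHUB_REPOSITORY=openai/codex` and `GITHUB_RUN_ID=42` (illustrative values), a posted comment picks up a footer along these lines:

```ts
import { postComment } from "./post-comment";
import { defaultContext } from "./env-context";

// The body below is posted with an appended footer of the form:
//
//   ---
//   *[_View workflow run_](https://github.com/openai/codex/actions/runs/42)*
await postComment("Codex finished triaging this issue.", defaultContext);
```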

.github/actions/codex/src/process-label.ts (new file)
@@ -0,0 +1,195 @@
import { fail } from "./fail";
import { EnvContext } from "./env-context";
import { renderPromptTemplate } from "./prompt-template";
import { postComment } from "./post-comment";
import { runCodex } from "./run-codex";
import * as github from "@actions/github";
import { Config, LabelConfig } from "./config";
import { maybePublishPRForIssue } from "./git-helpers";
export async function onLabeled(
config: Config,
ctx: EnvContext,
): Promise<void> {
const GITHUB_EVENT_LABEL_NAME = ctx.get("GITHUB_EVENT_LABEL_NAME");
const labelConfig = config.labels[GITHUB_EVENT_LABEL_NAME] as
| LabelConfig
| undefined;
if (!labelConfig) {
fail(
`Label \`${GITHUB_EVENT_LABEL_NAME}\` not found in config: ${JSON.stringify(config)}`,
);
}
await processLabelConfig(ctx, GITHUB_EVENT_LABEL_NAME, labelConfig);
}
/**
* Wrapper that handles `-in-progress` and `-completed` semantics around the core lint/fix/review
* processing. It will:
*
* - Skip execution if the `-in-progress` or `-completed` label is already present.
* - Mark the PR/issue as `-in-progress`.
* - After successful execution, mark the PR/issue as `-completed`.
*/
async function processLabelConfig(
ctx: EnvContext,
label: string,
labelConfig: LabelConfig,
): Promise<void> {
const octokit = ctx.getOctokit();
const { owner, repo, issueNumber, labelNames } =
await getCurrentLabels(octokit);
const inProgressLabel = `${label}-in-progress`;
const completedLabel = `${label}-completed`;
for (const markerLabel of [inProgressLabel, completedLabel]) {
if (labelNames.includes(markerLabel)) {
console.log(
`Label '${markerLabel}' already present on issue/PR #${issueNumber}. Skipping Codex action.`,
);
// Clean up: remove the triggering label to avoid confusion and re-runs.
await addAndRemoveLabels(octokit, {
owner,
repo,
issueNumber,
remove: label,
});
return;
}
}
// Mark the PR/issue as in progress.
await addAndRemoveLabels(octokit, {
owner,
repo,
issueNumber,
add: inProgressLabel,
remove: label,
});
// Run the core Codex processing.
await processLabel(ctx, label, labelConfig);
// Mark the PR/issue as completed.
await addAndRemoveLabels(octokit, {
owner,
repo,
issueNumber,
add: completedLabel,
remove: inProgressLabel,
});
}
async function processLabel(
ctx: EnvContext,
label: string,
labelConfig: LabelConfig,
): Promise<void> {
const template = labelConfig.getPromptTemplate();
const populatedTemplate = await renderPromptTemplate(template, ctx);
// Always run Codex and post the resulting message as a comment.
let commentBody = await runCodex(populatedTemplate, ctx);
// Current heuristic: only try to create a PR if "attempt" or "fix" is in the
// label name. (Yes, we plan to evolve this.)
if (label.indexOf("fix") !== -1 || label.indexOf("attempt") !== -1) {
console.info(`label ${label} indicates we should attempt to create a PR`);
const prUrl = await maybeFixIssue(ctx, commentBody);
if (prUrl) {
commentBody += `\n\n---\nOpened pull request: ${prUrl}`;
}
} else {
console.info(
`label ${label} does not indicate we should attempt to create a PR`,
);
}
await postComment(commentBody, ctx);
}
async function maybeFixIssue(
ctx: EnvContext,
lastMessage: string,
): Promise<string | undefined> {
// Attempt to create a PR out of any changes Codex produced.
const issueNumber = github.context.issue.number!; // exists for issues triggering this path
try {
return await maybePublishPRForIssue(issueNumber, lastMessage, ctx);
} catch (e) {
console.warn(`Failed to publish PR: ${e}`);
}
}
async function getCurrentLabels(
octokit: ReturnType<typeof github.getOctokit>,
): Promise<{
owner: string;
repo: string;
issueNumber: number;
labelNames: Array<string>;
}> {
const { owner, repo } = github.context.repo;
const issueNumber = github.context.issue.number;
if (!issueNumber) {
fail("No issue or pull_request number found in GitHub context.");
}
const { data: issueData } = await octokit.rest.issues.get({
owner,
repo,
issue_number: issueNumber,
});
const labelNames =
issueData.labels?.map((label: any) =>
typeof label === "string" ? label : label.name,
) ?? [];
return { owner, repo, issueNumber, labelNames };
}
async function addAndRemoveLabels(
octokit: ReturnType<typeof github.getOctokit>,
opts: {
owner: string;
repo: string;
issueNumber: number;
add?: string;
remove?: string;
},
): Promise<void> {
const { owner, repo, issueNumber, add, remove } = opts;
if (add) {
try {
await octokit.rest.issues.addLabels({
owner,
repo,
issue_number: issueNumber,
labels: [add],
});
} catch (error) {
console.warn(`Failed to add label '${add}': ${error}`);
}
}
if (remove) {
try {
await octokit.rest.issues.removeLabel({
owner,
repo,
issue_number: issueNumber,
name: remove,
});
} catch (error) {
console.warn(`Failed to remove label '${remove}': ${error}`);
}
}
}
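The marker-label lifecycle is easiest to see as a timeline for a hypothetical `codex-attempt` trigger label:

- User adds `codex-attempt` → the action swaps it for `codex-attempt-in-progress` and runs Codex.
- On success → `codex-attempt-in-progress` is replaced by `codex-attempt-completed`.
- Adding `codex-attempt` again while either marker is present → the run is skipped and the trigger label is removed.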

.github/actions/codex/src/prompt-template.ts (new file)
@@ -0,0 +1,284 @@
/*
* Utilities to render Codex prompt templates.
*
* A template is a Markdown (or plain-text) file that may contain one or more
* placeholders of the form `{CODEX_ACTION_<NAME>}`. At runtime these
* placeholders are substituted with dynamically generated content. Each
* placeholder is resolved **exactly once** even if it appears multiple times
* in the same template.
*/
import { readFile } from "fs/promises";
import { EnvContext } from "./env-context";
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/**
* Lazily caches parsed `$GITHUB_EVENT_PATH` contents keyed by the file path so
* we only hit the filesystem once per unique event payload.
*/
const githubEventDataCache: Map<string, Promise<any>> = new Map();
function getGitHubEventData(ctx: EnvContext): Promise<any> {
const eventPath = ctx.get("GITHUB_EVENT_PATH");
let cached = githubEventDataCache.get(eventPath);
if (!cached) {
cached = readFile(eventPath, "utf8").then((raw) => JSON.parse(raw));
githubEventDataCache.set(eventPath, cached);
}
return cached;
}
async function runCommand(args: Array<string>): Promise<string> {
const result = Bun.spawnSync(args, {
stdout: "pipe",
stderr: "pipe",
});
if (result.success) {
return result.stdout.toString();
}
console.error(`Error running ${JSON.stringify(args)}: ${result.stderr}`);
return "";
}
// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------
// Regex that captures the variable name without the surrounding { } braces.
const VAR_REGEX = /\{(CODEX_ACTION_[A-Z0-9_]+)\}/g;
// Cache individual placeholder values so each one is resolved at most once per
// process even if many templates reference it.
const placeholderCache: Map<string, Promise<string>> = new Map();
/**
* Parse a template string, resolve all placeholders and return the rendered
* result.
*/
export async function renderPromptTemplate(
template: string,
ctx: EnvContext,
): Promise<string> {
// ---------------------------------------------------------------------
// 1) Gather all *unique* placeholders present in the template.
// ---------------------------------------------------------------------
const variables = new Set<string>();
for (const match of template.matchAll(VAR_REGEX)) {
variables.add(match[1]);
}
// ---------------------------------------------------------------------
// 2) Kick off (or reuse) async resolution for each variable.
// ---------------------------------------------------------------------
for (const variable of variables) {
if (!placeholderCache.has(variable)) {
placeholderCache.set(variable, resolveVariable(variable, ctx));
}
}
// ---------------------------------------------------------------------
// 3) Await completion so we can perform a simple synchronous replace below.
// ---------------------------------------------------------------------
const resolvedEntries: [string, string][] = [];
for (const [key, promise] of placeholderCache.entries()) {
resolvedEntries.push([key, await promise]);
}
const resolvedMap = new Map<string, string>(resolvedEntries);
// ---------------------------------------------------------------------
// 4) Replace each occurrence. We use replace with a callback to ensure
// correct substitution even if variable names overlap (they shouldn't,
// but better safe than sorry).
// ---------------------------------------------------------------------
return template.replace(VAR_REGEX, (_, varName: string) => {
return resolvedMap.get(varName) ?? "";
});
}
export async function ensureBaseAndHeadCommitsForPRAreAvailable(
ctx: EnvContext,
): Promise<{ baseSha: string; headSha: string } | null> {
const prShas = await getPrShas(ctx);
if (prShas == null) {
console.warn("Unable to resolve PR branches");
return null;
}
const event = await getGitHubEventData(ctx);
const pr = event.pull_request;
if (!pr) {
console.warn("event.pull_request is not defined - unexpected");
return null;
}
const workspace = ctx.get("GITHUB_WORKSPACE");
// Refs (branch names)
const baseRef: string | undefined = pr.base?.ref;
const headRef: string | undefined = pr.head?.ref;
// Clone URLs
const baseRemoteUrl: string | undefined = pr.base?.repo?.clone_url;
const headRemoteUrl: string | undefined = pr.head?.repo?.clone_url;
if (!baseRef || !headRef || !baseRemoteUrl || !headRemoteUrl) {
console.warn(
"Missing PR ref or remote URL information - cannot fetch commits",
);
return null;
}
// Ensure we have the base branch.
await runCommand([
"git",
"-C",
workspace,
"fetch",
"--no-tags",
"origin",
baseRef,
]);
// Ensure we have the head branch.
if (headRemoteUrl === baseRemoteUrl) {
// Same repository: the commit is available from `origin`.
await runCommand([
"git",
"-C",
workspace,
"fetch",
"--no-tags",
"origin",
headRef,
]);
} else {
// Fork: make sure a `pr` remote exists that points at the fork. Attempting
// to add a remote that already exists causes git to error, so we swallow
// any non-zero exit codes from that specific command.
await runCommand([
"git",
"-C",
workspace,
"remote",
"add",
"pr",
headRemoteUrl,
]);
// Whether adding succeeded or the remote already existed, attempt to fetch
// the head ref from the `pr` remote.
await runCommand([
"git",
"-C",
workspace,
"fetch",
"--no-tags",
"pr",
headRef,
]);
}
return prShas;
}
// ---------------------------------------------------------------------------
// Internal helpers still exported for use by other modules.
// ---------------------------------------------------------------------------
export async function resolvePrDiff(ctx: EnvContext): Promise<string> {
const prShas = await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
if (prShas == null) {
console.warn("Unable to resolve PR branches");
return "";
}
const workspace = ctx.get("GITHUB_WORKSPACE");
const { baseSha, headSha } = prShas;
return runCommand([
"git",
"-C",
workspace,
"diff",
"--color=never",
`${baseSha}..${headSha}`,
]);
}
// ---------------------------------------------------------------------------
// Placeholder resolution
// ---------------------------------------------------------------------------
async function resolveVariable(name: string, ctx: EnvContext): Promise<string> {
switch (name) {
case "CODEX_ACTION_ISSUE_TITLE": {
const event = await getGitHubEventData(ctx);
const issue = event.issue ?? event.pull_request;
return issue?.title ?? "";
}
case "CODEX_ACTION_ISSUE_BODY": {
const event = await getGitHubEventData(ctx);
const issue = event.issue ?? event.pull_request;
return issue?.body ?? "";
}
case "CODEX_ACTION_GITHUB_EVENT_PATH": {
return ctx.get("GITHUB_EVENT_PATH");
}
case "CODEX_ACTION_BASE_REF": {
const event = await getGitHubEventData(ctx);
return event?.pull_request?.base?.ref ?? "";
}
case "CODEX_ACTION_HEAD_REF": {
const event = await getGitHubEventData(ctx);
return event?.pull_request?.head?.ref ?? "";
}
case "CODEX_ACTION_PR_DIFF": {
return resolvePrDiff(ctx);
}
// -------------------------------------------------------------------
// Add new template variables here.
// -------------------------------------------------------------------
default: {
// Unknown variable: leave it blank to avoid leaking placeholders to the
// final prompt. The alternative would be to `fail()` here, but silently
// ignoring unknown placeholders is more forgiving and better matches the
// behaviour of typical template engines.
console.warn(`Unknown template variable: ${name}`);
return "";
}
}
}
async function getPrShas(
ctx: EnvContext,
): Promise<{ baseSha: string; headSha: string } | null> {
const event = await getGitHubEventData(ctx);
const pr = event.pull_request;
if (!pr) {
console.warn("event.pull_request is not defined");
return null;
}
// Prefer explicit SHAs if available to avoid relying on local branch names.
const baseSha: string | undefined = pr.base?.sha;
const headSha: string | undefined = pr.head?.sha;
if (!baseSha || !headSha) {
console.warn("one of base or head is not defined on event.pull_request");
return null;
}
return { baseSha, headSha };
}
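A small sketch of the substitution behavior (the template text is hypothetical):

```ts
import { renderPromptTemplate } from "./prompt-template";
import { defaultContext } from "./env-context";

const template = [
  "Triage the following issue:",
  "### {CODEX_ACTION_ISSUE_TITLE}",
  "{CODEX_ACTION_ISSUE_BODY}",
].join("\n");

// Each placeholder is resolved at most once per process (results are cached)
// and every occurrence is substituted; unknown {CODEX_ACTION_*} names render
// as empty strings after a warning.
const prompt = await renderPromptTemplate(template, defaultContext);
```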

.github/actions/codex/src/review.ts (new file)
@@ -0,0 +1,42 @@
import type { EnvContext } from "./env-context";
import { runCodex } from "./run-codex";
import { postComment } from "./post-comment";
import { addEyesReaction } from "./add-reaction";
/**
* Handle `pull_request_review` events. We treat the review body the same way
* as a normal comment.
*/
export async function onReview(ctx: EnvContext): Promise<void> {
const triggerPhrase = ctx.tryGet("INPUT_TRIGGER_PHRASE");
if (!triggerPhrase) {
console.warn("Empty trigger phrase: skipping.");
return;
}
const reviewBody = ctx.tryGet("GITHUB_EVENT_REVIEW_BODY");
if (!reviewBody) {
console.warn("Review body not found in environment: skipping.");
return;
}
if (!reviewBody.includes(triggerPhrase)) {
console.log(
`Trigger phrase '${triggerPhrase}' not found: nothing to do for this review.`,
);
return;
}
const prompt = reviewBody.replace(triggerPhrase, "").trim();
if (prompt.length === 0) {
console.warn("Prompt is empty after removing trigger phrase: skipping.");
return;
}
await addEyesReaction(ctx);
const lastMessage = await runCodex(prompt, ctx);
await postComment(lastMessage, ctx);
}
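For example, with `INPUT_TRIGGER_PHRASE` set to `@codex` (an illustrative value), a review body of `@codex please double-check the error handling` produces the prompt `please double-check the error handling`; a review that never mentions the phrase, or that contains nothing besides it, is skipped.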

.github/actions/codex/src/run-codex.ts (new file)
@@ -0,0 +1,56 @@
import { fail } from "./fail";
import { EnvContext } from "./env-context";
import { tmpdir } from "os";
import { join } from "node:path";
import { readFile, mkdtemp } from "fs/promises";
import { resolveWorkspacePath } from "./github-workspace";
/**
* Runs the Codex CLI with the provided prompt and returns the output written
* to the "last message" file.
*/
export async function runCodex(
prompt: string,
ctx: EnvContext,
): Promise<string> {
const OPENAI_API_KEY = ctx.get("OPENAI_API_KEY");
const tempDirPath = await mkdtemp(join(tmpdir(), "codex-"));
const lastMessageOutput = join(tempDirPath, "codex-prompt.md");
const args = ["/usr/local/bin/codex-exec"];
const inputCodexArgs = ctx.tryGet("INPUT_CODEX_ARGS")?.trim();
if (inputCodexArgs) {
args.push(...inputCodexArgs.split(/\s+/));
}
args.push("--output-last-message", lastMessageOutput, prompt);
const env: Record<string, string> = { ...process.env, OPENAI_API_KEY };
const INPUT_CODEX_HOME = ctx.tryGet("INPUT_CODEX_HOME");
if (INPUT_CODEX_HOME) {
env.CODEX_HOME = resolveWorkspacePath(INPUT_CODEX_HOME, ctx);
}
console.log(`Running Codex: ${JSON.stringify(args)}`);
const result = Bun.spawnSync(args, {
stdout: "inherit",
stderr: "inherit",
env,
});
if (!result.success) {
fail(`Codex failed: see above for details.`);
}
// Read the output generated by Codex.
let lastMessage: string;
try {
lastMessage = await readFile(lastMessageOutput, "utf8");
} catch (err) {
fail(`Failed to read Codex output at '${lastMessageOutput}': ${err}`);
}
return lastMessage;
}
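Note the whitespace-naive splitting: a hypothetical `INPUT_CODEX_ARGS` of `--model o3 --full-auto` expands the invocation to `/usr/local/bin/codex-exec --model o3 --full-auto --output-last-message <tmpdir>/codex-prompt.md <prompt>`, but a quoted argument containing spaces would be split incorrectly.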

.github/actions/codex/src/verify-inputs.ts (new file)
@@ -0,0 +1,33 @@
// Validate the inputs passed to the composite action.
// The script currently ensures that the provided configuration file exists and
// matches the expected schema.
import type { Config } from "./config";
import { existsSync } from "fs";
import * as path from "path";
import { fail } from "./fail";
export function performAdditionalValidation(config: Config, workspace: string) {
// Additional validation: ensure referenced prompt files exist and are Markdown.
for (const [label, details] of Object.entries(config.labels)) {
// Determine which prompt key is present (the schema guarantees exactly one).
const promptPathStr =
(details as any).prompt ?? (details as any).promptPath;
if (promptPathStr) {
const promptPath = path.isAbsolute(promptPathStr)
? promptPathStr
: path.join(workspace, promptPathStr);
if (!existsSync(promptPath)) {
fail(`Prompt file for label '${label}' not found: ${promptPath}`);
}
if (!promptPath.endsWith(".md")) {
fail(
`Prompt file for label '${label}' must be a .md file (got ${promptPathStr}).`,
);
}
}
}
}

.github/actions/codex/tsconfig.json (new file)
@@ -0,0 +1,15 @@
{
"compilerOptions": {
"lib": ["ESNext"],
"target": "ESNext",
"module": "ESNext",
"moduleDetection": "force",
"moduleResolution": "bundler",
"noEmit": true,
"strict": true,
"skipLibCheck": true
},
"include": ["src"]
}

.github/codex/home/config.toml (new file)
@@ -0,0 +1,3 @@
model = "o3"
# Consider setting [mcp_servers] here!

.github/codex/labels/codex-attempt.md (new file)
@@ -0,0 +1,9 @@
Attempt to solve the reported issue.
If a code change is required, create a new branch, commit the fix, and open a pull request that resolves the problem.
Here is the original GitHub issue that triggered this run:
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}

.github/codex/labels/codex-review.md (new file)
@@ -0,0 +1,7 @@
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the `base` and `head` refs that define this PR. Both refs are available locally.

.github/codex/labels/codex-triage.md (new file)
@@ -0,0 +1,7 @@
Troubleshoot whether the reported issue is valid.
Provide a concise and respectful comment summarizing the findings.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}

.github/dotslash-config.json
@@ -5,7 +5,7 @@
"macos-aarch64": { "regex": "^codex-exec-aarch64-apple-darwin\\.zst$", "path": "codex-exec" },
"macos-x86_64": { "regex": "^codex-exec-x86_64-apple-darwin\\.zst$", "path": "codex-exec" },
"linux-x86_64": { "regex": "^codex-exec-x86_64-unknown-linux-musl\\.zst$", "path": "codex-exec" },
"linux-aarch64": { "regex": "^codex-exec-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-exec" }
"linux-aarch64": { "regex": "^codex-exec-aarch64-unknown-linux-musl\\.zst$", "path": "codex-exec" }
}
},
@@ -14,14 +14,14 @@
"macos-aarch64": { "regex": "^codex-aarch64-apple-darwin\\.zst$", "path": "codex" },
"macos-x86_64": { "regex": "^codex-x86_64-apple-darwin\\.zst$", "path": "codex" },
"linux-x86_64": { "regex": "^codex-x86_64-unknown-linux-musl\\.zst$", "path": "codex" },
"linux-aarch64": { "regex": "^codex-aarch64-unknown-linux-gnu\\.zst$", "path": "codex" }
"linux-aarch64": { "regex": "^codex-aarch64-unknown-linux-musl\\.zst$", "path": "codex" }
}
},
"codex-linux-sandbox": {
"platforms": {
"linux-x86_64": { "regex": "^codex-linux-sandbox-x86_64-unknown-linux-musl\\.zst$", "path": "codex-linux-sandbox" },
"linux-aarch64": { "regex": "^codex-linux-sandbox-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-linux-sandbox" }
"linux-aarch64": { "regex": "^codex-linux-sandbox-aarch64-unknown-linux-musl\\.zst$", "path": "codex-linux-sandbox" }
}
}
}

@@ -23,7 +23,7 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
path-to-document: docs/CLA.md
path-to-document: https://github.com/openai/codex/blob/main/docs/CLA.md
path-to-signatures: signatures/cla.json
branch: cla-signatures
allowlist: dependabot[bot]

@@ -23,3 +23,5 @@ jobs:
uses: codespell-project/codespell-problem-matcher@b80729f885d32f78a716c2f107b4db1025001c42 # v1
- name: Codespell
uses: codespell-project/actions-codespell@406322ec52dd7b488e48c1c4b82e2a8b3a1bf630 # v2
with:
ignore_words_file: .codespellignore

.github/workflows/codex.yml (new file)
@@ -0,0 +1,95 @@
name: Codex
on:
issues:
types: [opened, labeled]
pull_request:
branches: [main]
types: [labeled]
jobs:
codex:
# This `if` check provides complex filtering logic to avoid running Codex
# on every PR. Admittedly, one thing this does not verify is whether the
# sender has write access to the repo: that must be done as part of a
# runtime step.
#
# Note the label values should match the ones in the .github/codex/labels
# folder.
if: |
(github.event_name == 'issues' && (
(github.event.action == 'labeled' && (github.event.label.name == 'codex-attempt' || github.event.label.name == 'codex-triage'))
)) ||
(github.event_name == 'pull_request' && github.event.action == 'labeled' && github.event.label.name == 'codex-review')
runs-on: ubuntu-latest
permissions:
contents: write # can push or create branches
issues: write # for comments + labels on issues/PRs
pull-requests: write # for PR comments/labels
steps:
# TODO: Consider adding an optional mode (--dry-run?) to actions/codex
# that verifies whether Codex should actually be run for this event.
# (For example, it may be rejected because the sender does not have
# write access to the repo.) The benefit would be two-fold:
# 1. As the first step of this job, it gives us a chance to add a reaction
# or comment to the PR/issue ASAP to "ack" the request.
# 2. It saves resources by skipping the clone and setup steps below if
# Codex is not going to run.
- name: Checkout repository
uses: actions/checkout@v4
# We install the dependencies like we would for an ordinary CI job,
# particularly because Codex will not have network access to install
# these dependencies.
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 10.8.1
run_install: false
- name: Get pnpm store directory
id: pnpm-cache
shell: bash
run: |
echo "store_path=$(pnpm store path --silent)" >> $GITHUB_OUTPUT
- name: Setup pnpm cache
uses: actions/cache@v4
with:
path: ${{ steps.pnpm-cache.outputs.store_path }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
- name: Install dependencies
run: pnpm install
- uses: dtolnay/rust-toolchain@1.87
with:
targets: x86_64-unknown-linux-gnu
components: clippy
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
${{ github.workspace }}/codex-rs/target/
key: cargo-ubuntu-24.04-x86_64-unknown-linux-gnu-${{ hashFiles('**/Cargo.lock') }}
# Note it is possible that the `verify` step internal to Run Codex will
# fail, in which case the work to setup the repo was worthless :(
- name: Run Codex
uses: ./.github/actions/codex
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
github_token: ${{ secrets.GITHUB_TOKEN }}
codex_home: ./.github/codex/home

@@ -26,7 +26,9 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: dtolnay/rust-toolchain@1.87
with:
components: rustfmt
- name: cargo fmt
run: cargo fmt -- --config imports_granularity=Item --check
@@ -53,14 +55,19 @@ jobs:
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-musl
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-gnu
- runner: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: dtolnay/rust-toolchain@1.87
with:
targets: ${{ matrix.target }}
components: clippy
- uses: actions/cache@v4
with:
@@ -72,7 +79,7 @@ jobs:
${{ github.workspace }}/codex-rs/target/
key: cargo-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl'}}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config
@@ -97,6 +104,8 @@ jobs:
id: test
continue-on-error: true
run: cargo test --all-features --target ${{ matrix.target }}
env:
RUST_BACKTRACE: 1
# Fail the job if any of the previous steps failed.
- name: verify all steps passed

.github/workflows/rust-release.yml
@@ -15,9 +15,6 @@ concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
env:
TAG_REGEX: '^rust-v[0-9]+\.[0-9]+\.[0-9]+$'
jobs:
tag-check:
runs-on: ubuntu-latest
@@ -33,8 +30,8 @@ jobs:
# 1. Must be a tag and match the regex
[[ "${GITHUB_REF_TYPE}" == "tag" ]] \
|| { echo "❌ Not a tag push"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ${TAG_REGEX} ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' != ${TAG_REGEX}"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ^rust-v[0-9]+\.[0-9]+\.[0-9]+(-(alpha|beta)(\.[0-9]+)?)?$ ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' doesn't match expected format"; exit 1; }
# 2. Extract versions
tag_ver="${GITHUB_REF_NAME#rust-v}"
@@ -69,12 +66,14 @@ jobs:
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-musl
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-gnu
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: dtolnay/rust-toolchain@1.87
with:
targets: ${{ matrix.target }}
@@ -88,7 +87,7 @@ jobs:
${{ github.workspace }}/codex-rs/target/
key: cargo-release-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl'}}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config
@@ -105,7 +104,10 @@ jobs:
cp target/${{ matrix.target }}/release/codex-exec "$dest/codex-exec-${{ matrix.target }}"
cp target/${{ matrix.target }}/release/codex "$dest/codex-${{ matrix.target }}"
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'x86_64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-gnu' }}
# After https://github.com/openai/codex/pull/1228 is merged and a new
# release is cut with artifacts built after that PR, the `-gnu`
# variants can go away as we will only use the `-musl` variants.
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'x86_64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-musl' }}
name: Stage Linux-only artifacts
shell: bash
run: |
@@ -115,20 +117,47 @@ jobs:
- name: Compress artifacts
shell: bash
run: |
# Path that contains the uncompressed binaries for the current
# ${{ matrix.target }}
dest="dist/${{ matrix.target }}"
zstd -T0 -19 --rm "$dest"/*
# For compatibility with environments that lack the `zstd` tool, we
# additionally create a `.tar.gz` alongside every single binary that
# we publish. The end result is:
# codex-<target>.zst (existing)
# codex-<target>.tar.gz (new)
# ...same naming for codex-exec-* and codex-linux-sandbox-*
# 1. Produce a .tar.gz for every file in the directory *before* we
# run `zstd --rm`, because that flag deletes the original files.
for f in "$dest"/*; do
base="$(basename "$f")"
# Skip files that are already archives (shouldn't happen, but be
# safe).
if [[ "$base" == *.tar.gz ]]; then
continue
fi
# Create per-binary tar.gz
tar -C "$dest" -czf "$dest/${base}.tar.gz" "$base"
# Also create .zst (existing behaviour) *and* remove the original
# uncompressed binary to keep the directory small.
zstd -T0 -19 --rm "$dest/$base"
done
- uses: actions/upload-artifact@v4
with:
name: ${{ matrix.target }}
path: codex-rs/dist/${{ matrix.target }}/*
# Upload the per-binary .zst files as well as the new .tar.gz
# equivalents we generated in the previous step.
path: |
codex-rs/dist/${{ matrix.target }}/*
release:
needs: build
name: release
runs-on: ubuntu-24.04
env:
RELEASE_TAG: codex-rs-${{ github.sha }}-${{ github.run_attempt }}-${{ github.ref_name }}
runs-on: ubuntu-latest
steps:
- uses: actions/download-artifact@v4
@@ -138,9 +167,19 @@ jobs:
- name: List
run: ls -R dist/
- uses: softprops/action-gh-release@v2
- name: Define release name
id: release_name
run: |
# Extract the version from the tag name, which is in the format
# "rust-v0.1.0".
version="${GITHUB_REF_NAME#rust-v}"
echo "name=${version}" >> $GITHUB_OUTPUT
- name: Create GitHub Release
uses: softprops/action-gh-release@v2
with:
tag_name: ${{ env.RELEASE_TAG }}
name: ${{ steps.release_name.outputs.name }}
tag_name: ${{ github.ref_name }}
files: dist/**
# For now, tag releases as "prerelease" because we are not claiming
# the Rust CLI is stable yet.
@@ -150,5 +189,5 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
tag: ${{ env.RELEASE_TAG }}
tag: ${{ github.ref_name }}
config: .github/dotslash-config.json

@@ -2,6 +2,33 @@
You can install any of these versions: `npm install -g codex@version`
## `0.1.2505172129`
### 🪲 Bug Fixes
- Add node version check (#1007)
- Persist token after refresh (#1006)
## `0.1.2505171619`
- `codex --login` + `codex --free` (#998)
## `0.1.2505161800`
- Sign in with chatgpt credits (#974)
- Add support for OpenAI tool type, local_shell (#961)
## `0.1.2505161243`
- Sign in with chatgpt (#963)
- Session history viewer (#912)
- Apply patch issue when using different cwd (#942)
- Diff command for filenames with special characters (#954)
## `0.1.2505160811`
- `codex-mini-latest` (#951)
## `0.1.2505140839`
### 🪲 Bug Fixes

@@ -469,7 +469,7 @@ export OPENAI_API_KEY="your-api-key-here"
# Azure OpenAI
export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
export AZURE_OPENAI_API_VERSION="2025-03-01-preview" (Optional)
export AZURE_OPENAI_API_VERSION="2025-04-01-preview" (Optional)
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"

codex-cli/bin/codex.js
@@ -16,14 +16,23 @@
*/
import { spawnSync } from "child_process";
import fs from "fs";
import path from "path";
import { fileURLToPath, pathToFileURL } from "url";
// Determine whether the user explicitly wants the Rust CLI.
const wantsNative =
process.env.CODEX_RUST != null
// __dirname equivalent in ESM
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// For the @native release of the Node module, the `use-native` file is added,
// indicating we should default to the native binary. For other releases,
// setting CODEX_RUST=1 will opt-in to the native binary, if included.
const wantsNative = fs.existsSync(path.join(__dirname, "use-native")) ||
(process.env.CODEX_RUST != null
? ["1", "true", "yes"].includes(process.env.CODEX_RUST.toLowerCase())
: false;
: false);
// Try native binary if requested.
if (wantsNative) {
@@ -37,7 +46,7 @@ if (wantsNative) {
targetTriple = "x86_64-unknown-linux-musl";
break;
case "arm64":
targetTriple = "aarch64-unknown-linux-gnu";
targetTriple = "aarch64-unknown-linux-musl";
break;
default:
break;
@@ -63,10 +72,6 @@ if (wantsNative) {
throw new Error(`Unsupported platform: ${platform} (${arch})`);
}
// __dirname equivalent in ESM
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const binaryPath = path.join(__dirname, "..", "bin", `codex-${targetTriple}`);
const result = spawnSync(binaryPath, process.argv.slice(2), {
stdio: "inherit",
@@ -78,10 +83,6 @@ if (wantsNative) {
// Fallback: execute the original JavaScript CLI.
// Determine this script's directory
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;

codex-cli/package.json
@@ -1,6 +1,6 @@
{
"name": "@openai/codex",
"version": "0.1.2504301751",
"version": "0.0.0-dev",
"license": "Apache-2.0",
"bin": {
"codex": "bin/codex.js"
@@ -31,6 +31,7 @@
"chalk": "^5.2.0",
"diff": "^7.0.0",
"dotenv": "^16.1.4",
"express": "^5.1.0",
"fast-deep-equal": "^3.1.3",
"fast-npm-meta": "^0.4.2",
"figures": "^6.1.0",
@@ -54,6 +55,7 @@
"devDependencies": {
"@eslint/js": "^9.22.0",
"@types/diff": "^7.0.2",
"@types/express": "^5.0.1",
"@types/js-yaml": "^4.0.9",
"@types/marked-terminal": "^6.1.1",
"@types/react": "^18.0.32",

codex-cli/scripts/install_native_deps.sh
@@ -65,7 +65,7 @@ mkdir -p "$BIN_DIR"
# Until we start publishing stable GitHub releases, we have to grab the binaries
# from the GitHub Action that created them. Update the URL below to point to the
# appropriate workflow run:
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/14950726936"
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/15483730027"
WORKFLOW_ID="${WORKFLOW_URL##*/}"
ARTIFACTS_DIR="$(mktemp -d)"
@@ -78,7 +78,7 @@ gh run download --dir "$ARTIFACTS_DIR" --repo openai/codex "$WORKFLOW_ID"
zstd -d "$ARTIFACTS_DIR/x86_64-unknown-linux-musl/codex-linux-sandbox-x86_64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-linux-sandbox-x64"
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-gnu/codex-linux-sandbox-aarch64-unknown-linux-gnu.zst" \
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-musl/codex-linux-sandbox-aarch64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-linux-sandbox-arm64"
if [[ "$INCLUDE_RUST" -eq 1 ]]; then
@@ -86,8 +86,8 @@ if [[ "$INCLUDE_RUST" -eq 1 ]]; then
zstd -d "$ARTIFACTS_DIR/x86_64-unknown-linux-musl/codex-x86_64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-x86_64-unknown-linux-musl"
# ARM64 Linux
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-gnu/codex-aarch64-unknown-linux-gnu.zst" \
-o "$BIN_DIR/codex-aarch64-unknown-linux-gnu"
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-musl/codex-aarch64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-aarch64-unknown-linux-musl"
# x64 macOS
zstd -d "$ARTIFACTS_DIR/x86_64-apple-darwin/codex-x86_64-apple-darwin.zst" \
-o "$BIN_DIR/codex-x86_64-apple-darwin"

codex-cli/scripts/stage_release.sh
@@ -17,7 +17,7 @@
# When --native is supplied we copy the linux-sandbox binaries (as before) and
# additionally fetch / unpack the two Rust targets that we currently support:
# - x86_64-unknown-linux-musl
# - aarch64-unknown-linux-gnu
# - aarch64-unknown-linux-musl
#
# NOTE: This script is intended to be run from the repository root via
# `pnpm --filter codex-cli stage-release ...` or inside codex-cli with the
@@ -122,6 +122,7 @@ jq --arg version "$VERSION" \
if [[ "$INCLUDE_NATIVE" -eq 1 ]]; then
./scripts/install_native_deps.sh "$TMPDIR" --full-native
touch "${TMPDIR}/bin/use-native"
else
./scripts/install_native_deps.sh "$TMPDIR"
fi
@@ -130,11 +131,12 @@ popd >/dev/null
echo "Staged version $VERSION for release in $TMPDIR"
echo "Test Node:"
echo " node ${TMPDIR}/bin/codex.js --help"
if [[ "$INCLUDE_NATIVE" -eq 1 ]]; then
echo "Test Rust:"
echo " CODEX_RUST=1 node ${TMPDIR}/bin/codex.js --help"
echo " node ${TMPDIR}/bin/codex.js --help"
else
echo "Test Node:"
echo " node ${TMPDIR}/bin/codex.js --help"
fi
# Print final hint for convenience

codex-cli/src/cli.tsx
@@ -1,6 +1,19 @@
#!/usr/bin/env node
import "dotenv/config";
// Exit early if on an older version of Node.js (< 22)
const major = process.versions.node.split(".").map(Number)[0]!;
if (major < 22) {
// eslint-disable-next-line no-console
console.error(
"\n" +
"Codex CLI requires Node.js version 22 or newer.\n" +
`You are running Node.js v${process.versions.node}.\n` +
"Please upgrade Node.js: https://nodejs.org/en/download/\n",
);
process.exit(1);
}
// Hack to suppress deprecation warnings (punycode)
// eslint-disable-next-line @typescript-eslint/no-explicit-any
(process as any).noDeprecation = true;
@@ -14,26 +27,32 @@ import type { ReasoningEffort } from "openai/resources.mjs";
import App from "./app";
import { runSinglePass } from "./cli-singlepass";
import SessionsOverlay from "./components/sessions-overlay.js";
import { AgentLoop } from "./utils/agent/agent-loop";
import { ReviewDecision } from "./utils/agent/review";
import { AutoApprovalMode } from "./utils/auto-approval-mode";
import { checkForUpdates } from "./utils/check-updates";
import {
getApiKey,
loadConfig,
PRETTY_PRINT,
INSTRUCTIONS_FILEPATH,
} from "./utils/config";
import {
getApiKey as fetchApiKey,
maybeRedeemCredits,
} from "./utils/get-api-key";
import { createInputItem } from "./utils/input-utils";
import { initLogger } from "./utils/logger/log";
import { isModelSupportedForResponses } from "./utils/model-utils.js";
import { parseToolCall } from "./utils/parsers";
import { providers } from "./utils/providers";
import { onExit, setInkRenderer } from "./utils/terminal";
import chalk from "chalk";
import { spawnSync } from "child_process";
import fs from "fs";
import { render } from "ink";
import meow from "meow";
import os from "os";
import path from "path";
import React from "react";
@@ -56,10 +75,13 @@ const cli = meow(
--version Print version and exit
-h, --help Show usage and exit
-m, --model <model> Model to use for completions (default: o4-mini)
-m, --model <model> Model to use for completions (default: codex-mini-latest)
-p, --provider <provider> Provider to use for completions (default: openai)
-i, --image <path> Path(s) to image files to include as input
-v, --view <rollout> Inspect a previously saved rollout instead of starting a session
--history Browse previous sessions
--login Start a new sign in flow
--free Retry redeeming free credits
-q, --quiet Non-interactive mode that only prints the assistant's final output
-c, --config Open the instructions file in your editor
-w, --writable-root <path> Writable folder for sandbox in full-auto mode (can be specified multiple times)
@@ -104,6 +126,9 @@ const cli = meow(
help: { type: "boolean", aliases: ["h"] },
version: { type: "boolean", description: "Print version and exit" },
view: { type: "string" },
history: { type: "boolean", description: "Browse previous sessions" },
login: { type: "boolean", description: "Force a new sign in flow" },
free: { type: "boolean", description: "Retry redeeming free credits" },
model: { type: "string", aliases: ["m"] },
provider: { type: "string", aliases: ["p"] },
image: { type: "string", isMultiple: true, aliases: ["i"] },
@@ -261,11 +286,100 @@ let config = loadConfig(undefined, undefined, {
isFullContext: fullContextMode,
});
const prompt = cli.input[0];
// `prompt` can be updated later when the user resumes a previous session
// via the `--history` flag. Therefore it must be declared with `let` rather
// than `const`.
let prompt = cli.input[0];
const model = cli.flags.model ?? config.model;
const imagePaths = cli.flags.image;
const provider = cli.flags.provider ?? config.provider ?? "openai";
const apiKey = getApiKey(provider);
const client = {
issuer: "https://auth.openai.com",
client_id: "app_EMoamEEZ73f0CkXaXp7hrann",
};
let apiKey = "";
let savedTokens:
| {
id_token?: string;
access_token?: string;
refresh_token: string;
}
| undefined;
// Try to load existing auth file if present
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
if (fs.existsSync(authFile)) {
const data = JSON.parse(fs.readFileSync(authFile, "utf-8"));
savedTokens = data.tokens;
const lastRefreshTime = data.last_refresh
? new Date(data.last_refresh).getTime()
: 0;
const expired = Date.now() - lastRefreshTime > 28 * 24 * 60 * 60 * 1000;
if (data.OPENAI_API_KEY && !expired) {
apiKey = data.OPENAI_API_KEY;
}
}
} catch {
// ignore errors
}
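// For reference, a minimal sketch of the auth.json shape the reads above
// assume (illustrative, not a documented schema):
//
//   {
//     "OPENAI_API_KEY": "sk-...",
//     "last_refresh": "2025-05-17T16:19:00Z",
//     "tokens": { "id_token": "...", "access_token": "...", "refresh_token": "..." }
//   }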
// Get provider-specific API key if not OpenAI
if (provider.toLowerCase() !== "openai") {
const providerInfo = providers[provider.toLowerCase()];
if (providerInfo) {
const providerApiKey = process.env[providerInfo.envKey];
if (providerApiKey) {
apiKey = providerApiKey;
}
}
}
// Only proceed with OpenAI auth flow if:
// 1. Provider is OpenAI and no API key is set, or
// 2. Login flag is explicitly set
if (provider.toLowerCase() === "openai" && !apiKey) {
if (cli.flags.login) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
if (fs.existsSync(authFile)) {
const data = JSON.parse(fs.readFileSync(authFile, "utf-8"));
savedTokens = data.tokens;
}
} catch {
/* ignore */
}
} else {
apiKey = await fetchApiKey(client.issuer, client.client_id);
}
}
// Ensure the API key is available as an environment variable for legacy code
process.env["OPENAI_API_KEY"] = apiKey;
// Only attempt credit redemption for OpenAI provider
if (cli.flags.free && provider.toLowerCase() === "openai") {
// eslint-disable-next-line no-console
console.log(`${chalk.bold("codex --free")} attempting to redeem credits...`);
if (!savedTokens?.refresh_token) {
apiKey = await fetchApiKey(client.issuer, client.client_id, true);
// fetchApiKey includes credit redemption at the end of the flow
} else {
await maybeRedeemCredits(
client.issuer,
client.client_id,
savedTokens.refresh_token,
savedTokens.id_token,
);
}
}
// Set of providers that don't require API keys
const NO_API_KEY_REQUIRED = new Set(["ollama"]);
@@ -284,13 +398,18 @@ if (!apiKey && !NO_API_KEY_REQUIRED.has(provider.toLowerCase())) {
? `You can create a key here: ${chalk.bold(
chalk.underline("https://platform.openai.com/account/api-keys"),
)}\n`
: provider.toLowerCase() === "gemini"
: provider.toLowerCase() === "azure"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
`${provider.toUpperCase()}_OPENAI_API_KEY`,
)} ` +
`in Azure AI Foundry portal at ${chalk.bold(chalk.underline("https://ai.azure.com"))}.\n`
: provider.toLowerCase() === "gemini"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
}`,
);
process.exit(1);
@@ -355,6 +474,46 @@ if (
let rollout: AppRollout | undefined;
// For --history, show session selector and optionally update prompt or rollout.
if (cli.flags.history) {
const result: { path: string; mode: "view" | "resume" } | null =
await new Promise((resolve) => {
const instance = render(
React.createElement(SessionsOverlay, {
onView: (p: string) => {
instance.unmount();
resolve({ path: p, mode: "view" });
},
onResume: (p: string) => {
instance.unmount();
resolve({ path: p, mode: "resume" });
},
onExit: () => {
instance.unmount();
resolve(null);
},
}),
);
});
if (!result) {
process.exit(0);
}
if (result.mode === "view") {
try {
const content = fs.readFileSync(result.path, "utf-8");
rollout = JSON.parse(content) as AppRollout;
} catch (error) {
// eslint-disable-next-line no-console
console.error("Error reading session file:", error);
process.exit(1);
}
} else {
prompt = `Resume this session: ${result.path}`;
}
}
// For --view, optionally load an existing rollout from disk, display it and exit.
if (cli.flags.view) {
const viewPath = cli.flags.view;

codex-cli/src/components/chat/terminal-chat-input.tsx
@@ -54,6 +54,7 @@ export default function TerminalChatInput({
openApprovalOverlay,
openHelpOverlay,
openDiffOverlay,
openSessionsOverlay,
onCompact,
interruptAgent,
active,
@@ -77,6 +78,7 @@ export default function TerminalChatInput({
openApprovalOverlay: () => void;
openHelpOverlay: () => void;
openDiffOverlay: () => void;
openSessionsOverlay: () => void;
onCompact: () => void;
interruptAgent: () => void;
active: boolean;
@@ -280,6 +282,9 @@ export default function TerminalChatInput({
case "/history":
openOverlay();
break;
case "/sessions":
openSessionsOverlay();
break;
case "/help":
openHelpOverlay();
break;
@@ -484,6 +489,10 @@ export default function TerminalChatInput({
setInput("");
openOverlay();
return;
} else if (inputValue === "/sessions") {
setInput("");
openSessionsOverlay();
return;
} else if (inputValue === "/help") {
setInput("");
openHelpOverlay();
@@ -728,6 +737,7 @@ export default function TerminalChatInput({
openModelOverlay,
openHelpOverlay,
openDiffOverlay,
openSessionsOverlay,
history,
onCompact,
skipNextSubmit,

codex-cli/src/components/chat/terminal-chat-response-item.tsx
@@ -19,6 +19,7 @@ import { parse, setOptions } from "marked";
import TerminalRenderer from "marked-terminal";
import path from "path";
import React, { useEffect, useMemo } from "react";
import { formatCommandForDisplay } from "src/format-command.js";
import supportsHyperlinks from "supports-hyperlinks";
export default function TerminalChatResponseItem({
@@ -41,8 +42,12 @@ export default function TerminalChatResponseItem({
fileOpener={fileOpener}
/>
);
// @ts-expect-error new item types aren't in SDK yet
case "local_shell_call":
case "function_call":
return <TerminalChatResponseToolCall message={item} />;
// @ts-expect-error new item types aren't in SDK yet
case "local_shell_call_output":
case "function_call_output":
return (
<TerminalChatResponseToolCallOutput
@@ -166,21 +171,28 @@ function TerminalChatResponseMessage({
function TerminalChatResponseToolCall({
message,
}: {
message: ResponseFunctionToolCallItem;
// eslint-disable-next-line @typescript-eslint/no-explicit-any
message: ResponseFunctionToolCallItem | any;
}) {
const details = parseToolCall(message);
let workdir: string | undefined;
let cmdReadableText: string | undefined;
if (message.type === "function_call") {
const details = parseToolCall(message);
workdir = details?.workdir;
cmdReadableText = details?.cmdReadableText;
} else if (message.type === "local_shell_call") {
const action = message.action;
workdir = action.working_directory;
cmdReadableText = formatCommandForDisplay(action.command);
}
return (
<Box flexDirection="column" gap={1}>
<Text color="magentaBright" bold>
command
{details?.workdir ? (
<Text dimColor>{` (${details?.workdir})`}</Text>
) : (
""
)}
{workdir ? <Text dimColor>{` (${workdir})`}</Text> : ""}
</Text>
<Text>
<Text dimColor>$</Text> {details?.cmdReadableText}
<Text dimColor>$</Text> {cmdReadableText}
</Text>
</Box>
);
@@ -190,7 +202,8 @@ function TerminalChatResponseToolCallOutput({
message,
fullStdout,
}: {
message: ResponseFunctionToolCallOutputItem;
// eslint-disable-next-line @typescript-eslint/no-explicit-any
message: ResponseFunctionToolCallOutputItem | any;
fullStdout: boolean;
}) {
const { output, metadata } = parseToolCallOutput(message.output);

codex-cli/src/components/chat/terminal-chat.tsx
@@ -1,3 +1,4 @@
import type { AppRollout } from "../../app.js";
import type { ApplyPatchCommand, ApprovalPolicy } from "../../approvals.js";
import type { CommandConfirmation } from "../../utils/agent/agent-loop.js";
import type { AppConfig } from "../../utils/config.js";
@@ -5,6 +6,7 @@ import type { ColorName } from "chalk";
import type { ResponseItem } from "openai/resources/responses/responses.mjs";
import TerminalChatInput from "./terminal-chat-input.js";
import TerminalChatPastRollout from "./terminal-chat-past-rollout.js";
import { TerminalChatToolCallCommand } from "./terminal-chat-tool-call-command.js";
import TerminalMessageHistory from "./terminal-message-history.js";
import { formatCommandForDisplay } from "../../format-command.js";
@@ -32,7 +34,9 @@ import DiffOverlay from "../diff-overlay.js";
import HelpOverlay from "../help-overlay.js";
import HistoryOverlay from "../history-overlay.js";
import ModelOverlay from "../model-overlay.js";
import SessionsOverlay from "../sessions-overlay.js";
import chalk from "chalk";
import fs from "fs/promises";
import { Box, Text } from "ink";
import { spawn } from "node:child_process";
import React, { useEffect, useMemo, useRef, useState } from "react";
@@ -41,6 +45,7 @@ import { inspect } from "util";
export type OverlayModeType =
| "none"
| "history"
| "sessions"
| "model"
| "approval"
| "help"
@@ -191,6 +196,7 @@ export default function TerminalChat({
submitConfirmation,
} = useConfirmation();
const [overlayMode, setOverlayMode] = useState<OverlayModeType>("none");
const [viewRollout, setViewRollout] = useState<AppRollout | null>(null);
// Store the diff text when opening the diff overlay so the view isn't
// recomputed on every rerender while it is open.
@@ -454,6 +460,16 @@ export default function TerminalChat({
[items, model],
);
if (viewRollout) {
return (
<TerminalChatPastRollout
fileOpener={config.fileOpener}
session={viewRollout.session}
items={viewRollout.items}
/>
);
}
return (
<Box flexDirection="column">
<Box flexDirection="column">
@@ -509,6 +525,7 @@ export default function TerminalChat({
openModelOverlay={() => setOverlayMode("model")}
openApprovalOverlay={() => setOverlayMode("approval")}
openHelpOverlay={() => setOverlayMode("help")}
openSessionsOverlay={() => setOverlayMode("sessions")}
openDiffOverlay={() => {
const { isGitRepo, diff } = getGitDiff();
let text: string;
@@ -568,6 +585,25 @@ export default function TerminalChat({
{overlayMode === "history" && (
<HistoryOverlay items={items} onExit={() => setOverlayMode("none")} />
)}
{overlayMode === "sessions" && (
<SessionsOverlay
onView={async (p) => {
try {
const txt = await fs.readFile(p, "utf-8");
const data = JSON.parse(txt) as AppRollout;
setViewRollout(data);
setOverlayMode("none");
} catch {
setOverlayMode("none");
}
}}
onResume={(p) => {
setOverlayMode("none");
setInitialPrompt(`Resume this session: ${p}`);
}}
onExit={() => setOverlayMode("none")}
/>
)}
{overlayMode === "model" && (
<ModelOverlay
currentModel={model}

codex-cli/src/components/sessions-overlay.tsx (new file)
@@ -0,0 +1,130 @@
import type { TypeaheadItem } from "./typeahead-overlay.js";
import TypeaheadOverlay from "./typeahead-overlay.js";
import fs from "fs/promises";
import { Box, Text, useInput } from "ink";
import os from "os";
import path from "path";
import React, { useEffect, useState } from "react";
const SESSIONS_ROOT = path.join(os.homedir(), ".codex", "sessions");
export type SessionMeta = {
path: string;
timestamp: string;
userMessages: number;
toolCalls: number;
firstMessage: string;
};
async function loadSessions(): Promise<Array<SessionMeta>> {
try {
const entries = await fs.readdir(SESSIONS_ROOT);
const sessions: Array<SessionMeta> = [];
for (const entry of entries) {
if (!entry.endsWith(".json")) {
continue;
}
const filePath = path.join(SESSIONS_ROOT, entry);
try {
// eslint-disable-next-line no-await-in-loop
const content = await fs.readFile(filePath, "utf-8");
const data = JSON.parse(content) as {
session?: { timestamp?: string };
items?: Array<{
type: string;
role: string;
content: Array<{ text: string }>;
}>;
};
const items = Array.isArray(data.items) ? data.items : [];
const firstUser = items.find(
(i) => i?.type === "message" && i.role === "user",
);
const firstText =
firstUser?.content?.[0]?.text?.replace(/\n/g, " ").slice(0, 16) ?? "";
const userMessages = items.filter(
(i) => i?.type === "message" && i.role === "user",
).length;
const toolCalls = items.filter(
(i) => i?.type === "function_call",
).length;
sessions.push({
path: filePath,
timestamp: data.session?.timestamp || "",
userMessages,
toolCalls,
firstMessage: firstText,
});
} catch {
/* ignore invalid session */
}
}
sessions.sort((a, b) => b.timestamp.localeCompare(a.timestamp));
return sessions;
} catch {
return [];
}
}
type Props = {
onView: (sessionPath: string) => void;
onResume: (sessionPath: string) => void;
onExit: () => void;
};
export default function SessionsOverlay({
onView,
onResume,
onExit,
}: Props): JSX.Element {
const [items, setItems] = useState<Array<TypeaheadItem>>([]);
const [mode, setMode] = useState<"view" | "resume">("view");
useEffect(() => {
(async () => {
const sessions = await loadSessions();
const formatted = sessions.map((s) => {
const ts = s.timestamp
? new Date(s.timestamp).toLocaleString(undefined, {
dateStyle: "short",
timeStyle: "short",
})
: "";
const first = s.firstMessage?.slice(0, 50);
const label = `${ts} · ${s.userMessages} msgs/${s.toolCalls} tools · ${first}`;
return { label, value: s.path } as TypeaheadItem;
});
setItems(formatted);
})();
}, []);
useInput((_input, key) => {
if (key.tab) {
setMode((m) => (m === "view" ? "resume" : "view"));
}
});
return (
<TypeaheadOverlay
title={mode === "view" ? "View session" : "Resume session"}
description={
<Box flexDirection="column">
<Text>
{mode === "view" ? "press enter to view" : "press enter to resume"}
</Text>
<Text dimColor>tab to toggle mode · esc to cancel</Text>
</Box>
}
initialItems={items}
onSelect={(value) => {
if (mode === "view") {
onView(value);
} else {
onResume(value);
}
}}
onExit={onExit}
/>
);
}
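For reference, the rollout file parsed by the `/sessions` view handler (see `onView` in the TerminalChat diff above) is assumed to have roughly this shape, inferred from how `session` and `items` are read; this is a sketch, not the actual `AppRollout` type definition:

```ts
import type { ResponseItem } from "openai/resources/responses/responses.mjs";

// Inferred sketch; field names follow the reads in loadSessions and onView.
type AppRollout = {
  session: { timestamp: string };
  items: Array<ResponseItem>;
};
```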

View File

@@ -8,6 +8,7 @@ import type {
ResponseItem,
ResponseCreateParams,
FunctionTool,
Tool,
} from "openai/resources/responses/responses.mjs";
import type { Reasoning } from "openai/resources.mjs";
@@ -16,7 +17,6 @@ import {
OPENAI_TIMEOUT_MS,
OPENAI_ORGANIZATION,
OPENAI_PROJECT,
getApiKey,
getBaseUrl,
AZURE_OPENAI_API_VERSION,
} from "../config.js";
@@ -84,7 +84,7 @@ type AgentLoopParams = {
onLastResponseId: (lastResponseId: string) => void;
};
const shellTool: FunctionTool = {
const shellFunctionTool: FunctionTool = {
type: "function",
name: "shell",
description: "Runs a shell command, and returns its output.",
@@ -108,6 +108,11 @@ const shellTool: FunctionTool = {
},
};
const localShellTool: Tool = {
//@ts-expect-error - waiting on sdk
type: "local_shell",
};
export class AgentLoop {
private model: string;
private provider: string;
@@ -301,7 +306,7 @@ export class AgentLoop {
this.sessionId = getSessionId() || randomUUID().replaceAll("-", "");
// Configure OpenAI client with optional timeout (ms) from environment
const timeoutMs = OPENAI_TIMEOUT_MS;
const apiKey = getApiKey(this.provider);
const apiKey = this.config.apiKey ?? process.env["OPENAI_API_KEY"] ?? "";
const baseURL = getBaseUrl(this.provider);
this.oai = new OpenAI({
@@ -461,6 +466,73 @@ export class AgentLoop {
return [outputItem, ...additionalItems];
}
private async handleLocalShellCall(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
item: any,
): Promise<Array<ResponseInputItem>> {
// If the agent has been canceled in the meantime we should not perform any
// additional work. Returning an empty array ensures that we neither execute
// the requested tool call nor enqueue any follow-up input items. This keeps
// the cancellation semantics intuitive for users: once they interrupt a
// task, no further actions related to that task should be taken.
if (this.canceled) {
return [];
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const outputItem: any = {
type: "local_shell_call_output",
// `call_id` is mandatory; ensure we never send `undefined`, which would
// trigger the "No tool output found…" 400 from the API.
call_id: item.call_id,
output: "no function found",
};
// We intentionally *do not* remove this `callId` from the `pendingAborts`
// set right away. The output produced below is only queued up for the
// *next* request to the OpenAI API; it has not been delivered yet. If
// the user presses ESC twice (i.e. invokes `cancel()`) in the small window
// between queuing the result and the actual network call, we need to be
// able to surface a synthetic `function_call_output` marked as
// "aborted". Keeping the ID in the set until the run concludes
// successfully lets the next `run()` differentiate between an aborted
// tool call (needs the synthetic output) and a completed one (cleared
// below in the `flush()` helper).
// Used to tell the model to stop if needed.
const additionalItems: Array<ResponseInputItem> = [];
if (item.action.type !== "exec") {
throw new Error("Invalid action type");
}
const args = {
cmd: item.action.command,
workdir: item.action.working_directory,
timeoutInMillis: item.action.timeout_ms,
};
const {
outputText,
metadata,
additionalItems: additionalItemsFromExec,
} = await handleExecCommand(
args,
this.config,
this.approvalPolicy,
this.additionalWritableRoots,
this.getCommandConfirmation,
this.execAbortController?.signal,
);
outputItem.output = JSON.stringify({ output: outputText, metadata });
if (additionalItemsFromExec) {
additionalItems.push(...additionalItemsFromExec);
}
return [outputItem, ...additionalItems];
}
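A rough sketch of the `pendingAborts` behaviour described in the comment above (hypothetical and simplified; the real logic lives inside `run()`, and the `ResponseInputItem` import path is assumed to match the rest of this file):

```ts
import type { ResponseInputItem } from "openai/resources/responses/responses.mjs";

// Sketch: at the start of the next run(), surface a synthetic output for every
// tool call whose real result never reached the API before cancellation.
function syntheticAbortOutputs(
  pendingAborts: Set<string>,
): Array<ResponseInputItem> {
  const items = [...pendingAborts].map(
    (callId) =>
      ({
        type: "function_call_output",
        call_id: callId,
        output: JSON.stringify({ output: "aborted", metadata: {} }),
      }) as ResponseInputItem,
  );
  pendingAborts.clear();
  return items;
}
```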
public async run(
input: Array<ResponseInputItem>,
previousResponseId: string = "",
@@ -545,6 +617,11 @@ export class AgentLoop {
// `disableResponseStorage === true`.
let transcriptPrefixLen = 0;
let tools: Array<Tool> = [shellFunctionTool];
if (this.model.startsWith("codex")) {
tools = [localShellTool];
}
const stripInternalFields = (
item: ResponseInputItem,
): ResponseInputItem => {
@@ -648,6 +725,8 @@ export class AgentLoop {
if (
(item as ResponseInputItem).type === "function_call" ||
(item as ResponseInputItem).type === "reasoning" ||
//@ts-expect-error - waiting on sdk
(item as ResponseInputItem).type === "local_shell_call" ||
((item as ResponseInputItem).type === "message" &&
// eslint-disable-next-line @typescript-eslint/no-explicit-any
(item as any).role === "user")
@@ -686,7 +765,7 @@ export class AgentLoop {
// prompts) and so that freshly generated `function_call_output`s are
// shown immediately.
// Figure out what subset of `turnInput` constitutes *new* information
// for the UI so that we dont spam the interface with repeats of the
// for the UI so that we don't spam the interface with repeats of the
// entire transcript on every iteration when response storage is
// disabled.
const deltaInput = this.disableResponseStorage
@@ -721,7 +800,8 @@ export class AgentLoop {
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
this.config.provider?.toLowerCase() === "openai" ||
this.config.provider?.toLowerCase() === "azure"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
@@ -748,7 +828,7 @@ export class AgentLoop {
store: true,
previous_response_id: lastResponseId || undefined,
}),
tools: [shellTool],
tools: tools,
// Explicitly tell the model it is allowed to pick whatever
// tool it deems appropriate. Omitting this sometimes leads to
// the model ignoring the available tools and responding with
@@ -968,7 +1048,10 @@ export class AgentLoop {
if (maybeReasoning.type === "reasoning") {
maybeReasoning.duration_ms = Date.now() - thinkingStart;
}
if (item.type === "function_call") {
if (
item.type === "function_call" ||
item.type === "local_shell_call"
) {
// Track outstanding tool call so we can abort later if needed.
// The item comes from the streaming response, therefore it has
// either `id` (chat) or `call_id` (responses); we normalise
@@ -1091,7 +1174,11 @@ export class AgentLoop {
let reasoning: Reasoning | undefined;
if (this.model.startsWith("o")) {
reasoning = { effort: "high" };
if (this.model === "o3" || this.model === "o4-mini") {
if (
this.model === "o3" ||
this.model === "o4-mini" ||
this.model === "codex-mini-latest"
) {
reasoning.summary = "auto";
}
}
@@ -1102,7 +1189,8 @@ export class AgentLoop {
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
this.config.provider?.toLowerCase() === "openai" ||
this.config.provider?.toLowerCase() === "azure"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
@@ -1130,7 +1218,7 @@ export class AgentLoop {
store: true,
previous_response_id: lastResponseId || undefined,
}),
tools: [shellTool],
tools: tools,
tool_choice: "auto",
});
@@ -1492,6 +1580,17 @@ export class AgentLoop {
// eslint-disable-next-line no-await-in-loop
const result = await this.handleFunctionCall(item);
turnInput.push(...result);
//@ts-expect-error - waiting on sdk
} else if (item.type === "local_shell_call") {
//@ts-expect-error - waiting on sdk
if (alreadyProcessedResponses.has(item.id)) {
continue;
}
//@ts-expect-error - waiting on sdk
alreadyProcessedResponses.add(item.id);
// eslint-disable-next-line no-await-in-loop
const result = await this.handleLocalShellCall(item);
turnInput.push(...result);
}
emitItem(item as ResponseItem);
}
@@ -1547,7 +1646,6 @@ You MUST adhere to the following criteria when executing the task:
- If there is a .pre-commit-config.yaml, use \`pre-commit run --files ...\` to check that your changes pass the pre-commit checks. However, do not fix pre-existing errors on lines you didn't touch.
- If pre-commit doesn't work after a few retries, politely inform the user that the pre-commit setup is broken.
- Once you finish coding, you must
- Check \`git status\` to sanity check your changes; revert any scratch files or changes.
- Remove all inline comments you added as much as possible, even if they look normal. Check using \`git diff\`. Inline comments must be generally avoided, unless active maintainers of the repo, after long careful study of the code and the issue, will still misinterpret the code without the comments.
- Check if you accidentally add copyright or license headers. If so, remove them.
- Try to run pre-commit if it is available.

View File

@@ -43,7 +43,7 @@ if (!isVitest) {
loadDotenv({ path: USER_WIDE_CONFIG_PATH });
}
export const DEFAULT_AGENTIC_MODEL = "o4-mini";
export const DEFAULT_AGENTIC_MODEL = "codex-mini-latest";
export const DEFAULT_FULL_CONTEXT_MODEL = "gpt-4.1";
export const DEFAULT_APPROVAL_MODE = AutoApprovalMode.SUGGEST;
export const DEFAULT_INSTRUCTIONS = "";
@@ -69,7 +69,7 @@ export const OPENAI_BASE_URL = process.env["OPENAI_BASE_URL"] || "";
export let OPENAI_API_KEY = process.env["OPENAI_API_KEY"] || "";
export const AZURE_OPENAI_API_VERSION =
process.env["AZURE_OPENAI_API_VERSION"] || "2025-03-01-preview";
process.env["AZURE_OPENAI_API_VERSION"] || "2025-04-01-preview";
export const DEFAULT_REASONING_EFFORT = "high";
export const OPENAI_ORGANIZATION = process.env["OPENAI_ORGANIZATION"] || "";
@@ -120,7 +120,7 @@ export function getApiKey(provider: string = "openai"): string | undefined {
return process.env[providerInfo.envKey];
}
// Checking `PROVIDER_API_KEY feels more intuitive with a custom provider.
// Checking `PROVIDER_API_KEY` feels more intuitive with a custom provider.
const customApiKey = process.env[`${provider.toUpperCase()}_API_KEY`];
if (customApiKey) {
return customApiKey;

View File

@@ -0,0 +1,75 @@
import SelectInput from "../components/select-input/select-input.js";
import Spinner from "../components/vendor/ink-spinner.js";
import TextInput from "../components/vendor/ink-text-input.js";
import { Box, Text } from "ink";
import React, { useState } from "react";
export type Choice = { type: "signin" } | { type: "apikey"; key: string };
export function ApiKeyPrompt({
onDone,
}: {
onDone: (choice: Choice) => void;
}): JSX.Element {
const [step, setStep] = useState<"select" | "paste">("select");
const [apiKey, setApiKey] = useState("");
if (step === "select") {
return (
<Box flexDirection="column" gap={1}>
<Box flexDirection="column">
<Text>
Sign in with ChatGPT to generate an API key or paste one you already
have.
</Text>
<Text dimColor>[use arrows to move, enter to select]</Text>
</Box>
<SelectInput
items={[
{ label: "Sign in with ChatGPT", value: "signin" },
{
label: "Paste an API key (or set as OPENAI_API_KEY)",
value: "paste",
},
]}
onSelect={(item: { value: string }) => {
if (item.value === "signin") {
onDone({ type: "signin" });
} else {
setStep("paste");
}
}}
/>
</Box>
);
}
return (
<Box flexDirection="column">
<Text>Paste your OpenAI API key and press &lt;Enter&gt;:</Text>
<TextInput
value={apiKey}
onChange={setApiKey}
onSubmit={(value: string) => {
if (value.trim() !== "") {
onDone({ type: "apikey", key: value.trim() });
}
}}
placeholder="sk-..."
mask="*"
/>
</Box>
);
}
export function WaitingForAuth(): JSX.Element {
return (
<Box flexDirection="row" marginTop={1}>
<Spinner type="ball" />
<Text>
{" "}
Waiting for authentication <Text dimColor>ctrl + c to quit</Text>
</Text>
</Box>
);
}

View File

@@ -0,0 +1,766 @@
import type { Choice } from "./get-api-key-components";
import type { Request, Response } from "express";
import { ApiKeyPrompt, WaitingForAuth } from "./get-api-key-components";
import chalk from "chalk";
import express from "express";
import fs from "fs/promises";
import { render } from "ink";
import crypto from "node:crypto";
import { URL } from "node:url";
import open from "open";
import os from "os";
import path from "path";
import React from "react";
function promptUserForChoice(): Promise<Choice> {
return new Promise<Choice>((resolve) => {
const instance = render(
<ApiKeyPrompt
onDone={(choice: Choice) => {
resolve(choice);
instance.unmount();
}}
/>,
);
});
}
interface OidcConfiguration {
issuer: string;
authorization_endpoint: string;
token_endpoint: string;
}
async function getOidcConfiguration(
issuer: string,
): Promise<OidcConfiguration> {
const discoveryUrl = new URL(issuer);
discoveryUrl.pathname = "/.well-known/openid-configuration";
if (issuer === "https://auth.openai.com") {
// Account for legacy quirk in production tenant
discoveryUrl.pathname = "/v2.0" + discoveryUrl.pathname;
}
const res = await fetch(discoveryUrl.toString());
if (!res.ok) {
throw new Error("Failed to fetch OIDC configuration");
}
return (await res.json()) as OidcConfiguration;
}
interface IDTokenClaims {
"exp": number;
"https://api.openai.com/auth": {
organization_id: string;
project_id: string;
completed_platform_onboarding: boolean;
is_org_owner: boolean;
chatgpt_subscription_active_start: string;
chatgpt_subscription_active_until: string;
chatgpt_plan_type: string;
};
}
interface AccessTokenClaims {
"https://api.openai.com/auth": {
chatgpt_plan_type: string;
};
}
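The claims above are decoded out of raw JWTs by hand in several places below; a minimal helper capturing that pattern might look like this (hypothetical, not part of this diff, and it deliberately does not verify the token signature):

```ts
// Sketch: decode a JWT payload without verifying its signature.
function decodeJwtClaims<T>(token: string): T {
  const payload = token.split(".")[1];
  if (!payload) {
    throw new Error("Malformed JWT");
  }
  return JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as T;
}
```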
function generatePKCECodes(): {
code_verifier: string;
code_challenge: string;
} {
const code_verifier = crypto.randomBytes(64).toString("hex");
const code_challenge = crypto
.createHash("sha256")
.update(code_verifier)
.digest("base64url");
return { code_verifier, code_challenge };
}
async function maybeRedeemCredits(
issuer: string,
clientId: string,
refreshToken: string,
idToken?: string,
): Promise<void> {
try {
let currentIdToken = idToken;
let idClaims: IDTokenClaims | undefined;
if (
currentIdToken &&
typeof currentIdToken === "string" &&
currentIdToken.split(".")[1]
) {
idClaims = JSON.parse(
Buffer.from(currentIdToken.split(".")[1]!, "base64url").toString(
"utf8",
),
) as IDTokenClaims;
} else {
currentIdToken = "";
}
// Validate idToken expiration
// if expired, attempt token-exchange for a fresh idToken
if (!idClaims || !idClaims.exp || Date.now() >= idClaims.exp * 1000) {
// eslint-disable-next-line no-console
console.log(chalk.dim("Refreshing credentials..."));
try {
const refreshRes = await fetch("https://auth.openai.com/oauth/token", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
client_id: clientId,
grant_type: "refresh_token",
refresh_token: refreshToken,
scope: "openid profile email",
}),
});
if (!refreshRes.ok) {
// eslint-disable-next-line no-console
console.warn(
`Failed to refresh credentials: ${refreshRes.status} ${refreshRes.statusText}\n${chalk.dim(await refreshRes.text())}`,
);
// eslint-disable-next-line no-console
console.warn(
`Please sign in again to redeem credits: ${chalk.bold("codex --login")}`,
);
return;
}
const refreshData = (await refreshRes.json()) as {
id_token: string;
refresh_token?: string;
};
currentIdToken = refreshData.id_token;
idClaims = JSON.parse(
Buffer.from(currentIdToken.split(".")[1]!, "base64url").toString(
"utf8",
),
) as IDTokenClaims;
if (refreshData.refresh_token) {
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
const existingJson = JSON.parse(
await fs.readFile(authFile, "utf-8"),
);
existingJson.tokens.id_token = currentIdToken;
existingJson.tokens.refresh_token = refreshData.refresh_token;
existingJson.last_refresh = new Date().toISOString();
await fs.writeFile(
authFile,
JSON.stringify(existingJson, null, 2),
{ mode: 0o600 },
);
} catch (err) {
// eslint-disable-next-line no-console
console.warn("Unable to update refresh token in auth file:", err);
}
}
} catch (err) {
// eslint-disable-next-line no-console
console.warn("Unable to refresh ID token via token-exchange:", err);
return;
}
}
// Confirm the subscription is active for more than 7 days
const subStart =
idClaims["https://api.openai.com/auth"]
?.chatgpt_subscription_active_start;
if (
typeof subStart === "string" &&
Date.now() - new Date(subStart).getTime() < 7 * 24 * 60 * 60 * 1000
) {
// eslint-disable-next-line no-console
console.warn(
"Sorry, your subscription must be active for more than 7 days to redeem credits.\nMore info: " +
chalk.dim("https://help.openai.com/en/articles/11381614") +
chalk.bold(
"\nPlease try again on " +
new Date(
new Date(subStart).getTime() + 7 * 24 * 60 * 60 * 1000,
).toLocaleDateString() +
" " +
new Date(
new Date(subStart).getTime() + 7 * 24 * 60 * 60 * 1000,
).toLocaleTimeString(),
),
);
return;
}
const completed = Boolean(
idClaims["https://api.openai.com/auth"]?.completed_platform_onboarding,
);
const isOwner = Boolean(
idClaims["https://api.openai.com/auth"]?.is_org_owner,
);
const needsSetup = !completed && isOwner;
const planType = idClaims["https://api.openai.com/auth"]
?.chatgpt_plan_type as string | undefined;
if (needsSetup || !(planType === "plus" || planType === "pro")) {
// eslint-disable-next-line no-console
console.warn(
"Users with Plus or Pro subscriptions can redeem free API credits.\nMore info: " +
chalk.dim("https://help.openai.com/en/articles/11381614"),
);
return;
}
const apiHost =
issuer === "https://auth.openai.com"
? "https://api.openai.com"
: "https://api.openai.org";
const redeemRes = await fetch(`${apiHost}/v1/billing/redeem_credits`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ id_token: currentIdToken }),
});
if (!redeemRes.ok) {
// eslint-disable-next-line no-console
console.warn(
`Credit redemption request failed: ${redeemRes.status} ${redeemRes.statusText}`,
);
return;
}
try {
const redeemData = (await redeemRes.json()) as {
granted_chatgpt_subscriber_api_credits?: number;
};
const granted = redeemData?.granted_chatgpt_subscriber_api_credits ?? 0;
if (granted > 0) {
// eslint-disable-next-line no-console
console.log(
chalk.green(
`${chalk.bold(
`Thanks for being a ChatGPT ${
planType === "plus" ? "Plus" : "Pro"
} subscriber!`,
)}\nIf you haven't already redeemed, you should receive ${
planType === "plus" ? "$5" : "$50"
} in API credits\nCredits: ${chalk.dim(chalk.underline("https://platform.openai.com/settings/organization/billing/credit-grants"))}\nMore info: ${chalk.dim(chalk.underline("https://help.openai.com/en/articles/11381614"))}`,
),
);
} else {
// eslint-disable-next-line no-console
console.log(
chalk.green(
`It looks like no credits were granted:\n${JSON.stringify(
redeemData,
null,
2,
)}\nCredits: ${chalk.dim(
chalk.underline(
"https://platform.openai.com/settings/organization/billing/credit-grants",
),
)}\nMore info: ${chalk.dim(
chalk.underline("https://help.openai.com/en/articles/11381614"),
)}`,
),
);
}
} catch (parseErr) {
// eslint-disable-next-line no-console
console.warn("Unable to parse credit redemption response:", parseErr);
}
} catch (err) {
// eslint-disable-next-line no-console
console.warn("Unable to redeem ChatGPT subscriber API credits:", err);
}
}
async function handleCallback(
req: Request,
issuer: string,
oidcConfig: OidcConfiguration,
codeVerifier: string,
clientId: string,
redirectUri: string,
expectedState: string,
): Promise<{ access_token: string; success_url: string }> {
const state = (req.query as Record<string, string>)["state"] as
| string
| undefined;
if (!state || state !== expectedState) {
throw new Error("Invalid state parameter");
}
const code = (req.query as Record<string, string>)["code"] as
| string
| undefined;
if (!code) {
throw new Error("Missing authorization code");
}
const params = new URLSearchParams();
params.append("grant_type", "authorization_code");
params.append("code", code);
params.append("redirect_uri", redirectUri);
params.append("client_id", clientId);
params.append("code_verifier", codeVerifier);
oidcConfig.token_endpoint = `${issuer}/oauth/token`;
const tokenRes = await fetch(oidcConfig.token_endpoint, {
method: "POST",
headers: {
"Content-Type": "application/x-www-form-urlencoded",
},
body: params.toString(),
});
if (!tokenRes.ok) {
throw new Error("Failed to exchange authorization code for tokens");
}
const tokenData = (await tokenRes.json()) as {
id_token: string;
access_token: string;
refresh_token: string;
};
const idTokenParts = tokenData.id_token.split(".");
if (idTokenParts.length !== 3) {
throw new Error("Invalid ID token");
}
const accessTokenParts = tokenData.access_token.split(".");
if (accessTokenParts.length !== 3) {
throw new Error("Invalid access token");
}
const idTokenClaims = JSON.parse(
Buffer.from(idTokenParts[1]!, "base64url").toString("utf8"),
) as IDTokenClaims;
const accessTokenClaims = JSON.parse(
Buffer.from(accessTokenParts[1]!, "base64url").toString("utf8"),
) as AccessTokenClaims;
const org_id = idTokenClaims["https://api.openai.com/auth"]?.organization_id;
if (!org_id) {
throw new Error("Missing organization in id_token claims");
}
const project_id = idTokenClaims["https://api.openai.com/auth"]?.project_id;
if (!project_id) {
throw new Error("Missing project in id_token claims");
}
const randomId = crypto.randomBytes(6).toString("hex");
const exchangeParams = new URLSearchParams({
grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
client_id: clientId,
requested_token: "openai-api-key",
subject_token: tokenData.id_token,
subject_token_type: "urn:ietf:params:oauth:token-type:id_token",
name: `Codex CLI [auto-generated] (${new Date().toISOString().slice(0, 10)}) [${
randomId
}]`,
});
const exchangeRes = await fetch(oidcConfig.token_endpoint, {
method: "POST",
headers: {
"Content-Type": "application/x-www-form-urlencoded",
},
body: exchangeParams.toString(),
});
if (!exchangeRes.ok) {
throw new Error(`Failed to create API key: ${await exchangeRes.text()}`);
}
const exchanged = (await exchangeRes.json()) as {
access_token: string;
// NOTE(mbolin): I did not see the "key" property set in practice. Note
// this property is not read by the code.
key: string;
};
// Determine whether the organization still requires additional
// setup (e.g., adding a payment method) based on the ID-token
// claim provided by the auth service.
const completedOnboarding = Boolean(
idTokenClaims["https://api.openai.com/auth"]?.completed_platform_onboarding,
);
const chatgptPlanType =
accessTokenClaims["https://api.openai.com/auth"]?.chatgpt_plan_type;
const isOrgOwner = Boolean(
idTokenClaims["https://api.openai.com/auth"]?.is_org_owner,
);
const needsSetup = !completedOnboarding && isOrgOwner;
// Build the success URL on the same host/port as the callback and
// include the required query parameters for the front-end page.
// console.log("Redirecting to success page");
const successUrl = new URL("/success", redirectUri);
if (issuer === "https://auth.openai.com") {
successUrl.searchParams.set("platform_url", "https://platform.openai.com");
} else {
successUrl.searchParams.set(
"platform_url",
"https://platform.api.openai.org",
);
}
successUrl.searchParams.set("id_token", tokenData.id_token);
successUrl.searchParams.set("needs_setup", needsSetup ? "true" : "false");
successUrl.searchParams.set("org_id", org_id);
successUrl.searchParams.set("project_id", project_id);
successUrl.searchParams.set("plan_type", chatgptPlanType);
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
await fs.mkdir(authDir, { recursive: true });
const authFile = path.join(authDir, "auth.json");
const authData = {
tokens: tokenData,
last_refresh: new Date().toISOString(),
OPENAI_API_KEY: exchanged.access_token,
};
await fs.writeFile(authFile, JSON.stringify(authData, null, 2), {
mode: 0o600,
});
} catch (err) {
// eslint-disable-next-line no-console
console.warn("Unable to save auth file:", err);
}
await maybeRedeemCredits(
issuer,
clientId,
tokenData.refresh_token,
tokenData.id_token,
);
return {
access_token: exchanged.access_token,
success_url: successUrl.toString(),
};
}
const LOGIN_SUCCESS_HTML = String.raw`
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Sign into Codex CLI</title>
<link rel="icon" href='data:image/svg+xml,%3Csvg xmlns="http://www.w3.org/2000/svg" width="32" height="32" fill="none" viewBox="0 0 32 32"%3E%3Cpath stroke="%23000" stroke-linecap="round" stroke-width="2.484" d="M22.356 19.797H17.17M9.662 12.29l1.979 3.576a.511.511 0 0 1-.005.504l-1.974 3.409M30.758 16c0 8.15-6.607 14.758-14.758 14.758-8.15 0-14.758-6.607-14.758-14.758C1.242 7.85 7.85 1.242 16 1.242c8.15 0 14.758 6.608 14.758 14.758Z"/%3E%3C/svg%3E' type="image/svg+xml">
<style>
.container {
margin: auto;
height: 100%;
display: flex;
align-items: center;
justify-content: center;
position: relative;
background: white;
font-family: system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
}
.inner-container {
width: 400px;
flex-direction: column;
justify-content: flex-start;
align-items: center;
gap: 20px;
display: inline-flex;
}
.content {
align-self: stretch;
flex-direction: column;
justify-content: flex-start;
align-items: center;
gap: 20px;
display: flex;
}
.svg-wrapper {
position: relative;
}
.title {
text-align: center;
color: var(--text-primary, #0D0D0D);
font-size: 28px;
font-weight: 400;
line-height: 36.40px;
word-wrap: break-word;
}
.setup-box {
width: 600px;
padding: 16px 20px;
background: var(--bg-primary, white);
box-shadow: 0px 4px 16px rgba(0, 0, 0, 0.05);
border-radius: 16px;
outline: 1px var(--border-default, rgba(13, 13, 13, 0.10)) solid;
outline-offset: -1px;
justify-content: flex-start;
align-items: center;
gap: 16px;
display: inline-flex;
}
.setup-content {
flex: 1 1 0;
justify-content: flex-start;
align-items: center;
gap: 24px;
display: flex;
}
.setup-text {
flex: 1 1 0;
flex-direction: column;
justify-content: flex-start;
align-items: flex-start;
gap: 4px;
display: inline-flex;
}
.setup-title {
align-self: stretch;
color: var(--text-primary, #0D0D0D);
font-size: 14px;
font-weight: 510;
line-height: 20px;
word-wrap: break-word;
}
.setup-description {
align-self: stretch;
color: var(--text-secondary, #5D5D5D);
font-size: 14px;
font-weight: 400;
line-height: 20px;
word-wrap: break-word;
}
.redirect-box {
justify-content: flex-start;
align-items: center;
gap: 8px;
display: flex;
}
.close-button,
.redirect-button {
height: 28px;
padding: 8px 16px;
background: var(--interactive-bg-primary-default, #0D0D0D);
border-radius: 999px;
justify-content: center;
align-items: center;
gap: 4px;
display: flex;
}
.close-button,
.redirect-text {
color: var(--interactive-label-primary-default, white);
font-size: 14px;
font-weight: 510;
line-height: 20px;
word-wrap: break-word;
text-decoration: none;
}
</style>
</head>
<body>
<div class="container">
<div class="inner-container">
<div class="content">
<div data-svg-wrapper class="svg-wrapper">
<svg width="56" height="56" viewBox="0 0 56 56" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M4.6665 28.0003C4.6665 15.1137 15.1132 4.66699 27.9998 4.66699C40.8865 4.66699 51.3332 15.1137 51.3332 28.0003C51.3332 40.887 40.8865 51.3337 27.9998 51.3337C15.1132 51.3337 4.6665 40.887 4.6665 28.0003ZM37.5093 18.5088C36.4554 17.7672 34.9999 18.0203 34.2583 19.0742L24.8508 32.4427L20.9764 28.1808C20.1095 27.2272 18.6338 27.1569 17.6803 28.0238C16.7267 28.8906 16.6565 30.3664 17.5233 31.3199L23.3566 37.7366C23.833 38.2606 24.5216 38.5399 25.2284 38.4958C25.9353 38.4517 26.5838 38.089 26.9914 37.5098L38.0747 21.7598C38.8163 20.7059 38.5632 19.2504 37.5093 18.5088Z" fill="var(--green-400, #04B84C)"/>
</svg>
</div>
<div class="title">Signed in to Codex CLI</div>
</div>
<div class="close-box" style="display: none;">
<div class="setup-description">You may now close this page</div>
</div>
<div class="setup-box" style="display: none;">
<div class="setup-content">
<div class="setup-text">
<div class="setup-title">Finish setting up your API organization</div>
<div class="setup-description">Add a payment method to use your organization.</div>
</div>
<div class="redirect-box">
<div data-hasendicon="false" data-hasstarticon="false" data-ishovered="false" data-isinactive="false" data-ispressed="false" data-size="large" data-type="primary" class="redirect-button">
<div class="redirect-text">Redirecting in 3s...</div>
</div>
</div>
</div>
</div>
</div>
</div>
<script>
(function () {
const params = new URLSearchParams(window.location.search);
const needsSetup = params.get('needs_setup') === 'true';
const platformUrl = params.get('platform_url') || 'https://platform.openai.com';
const orgId = params.get('org_id');
const projectId = params.get('project_id');
const planType = params.get('plan_type');
const idToken = params.get('id_token');
// Show different message and optional redirect when setup is required
if (needsSetup) {
const setupBox = document.querySelector('.setup-box');
setupBox.style.display = 'flex';
const redirectUrlObj = new URL('/org-setup', platformUrl);
redirectUrlObj.searchParams.set('p', planType);
redirectUrlObj.searchParams.set('t', idToken);
redirectUrlObj.searchParams.set('with_org', orgId);
redirectUrlObj.searchParams.set('project_id', projectId);
const redirectUrl = redirectUrlObj.toString();
const message = document.querySelector('.redirect-text');
let countdown = 3;
function tick() {
message.textContent =
'Redirecting in ' + countdown + 's…';
if (countdown === 0) {
window.location.replace(redirectUrl);
} else {
countdown -= 1;
setTimeout(tick, 1000);
}
}
tick();
} else {
const closeBox = document.querySelector('.close-box');
closeBox.style.display = 'flex';
}
})();
</script>
</body>
</html>`;
async function signInFlow(issuer: string, clientId: string): Promise<string> {
const app = express();
let codeVerifier = "";
let redirectUri = "";
let server: ReturnType<typeof app.listen>;
const state = crypto.randomBytes(32).toString("hex");
const apiKeyPromise = new Promise<string>((resolve, reject) => {
let _apiKey: string | undefined;
app.get("/success", (_req: Request, res: Response) => {
res.type("text/html").send(LOGIN_SUCCESS_HTML);
if (_apiKey) {
resolve(_apiKey);
} else {
// eslint-disable-next-line no-console
console.error(
"Sorry, it seems like the authentication flow failed. Please try again, or submit an issue on our GitHub if it continues.",
);
process.exit(1);
}
});
// Callback route -------------------------------------------------------
app.get("/auth/callback", async (req: Request, res: Response) => {
try {
const oidcConfig = await getOidcConfiguration(issuer);
oidcConfig.token_endpoint = `${issuer}/oauth/token`;
oidcConfig.authorization_endpoint = `${issuer}/oauth/authorize`;
const { access_token, success_url } = await handleCallback(
req,
issuer,
oidcConfig,
codeVerifier,
clientId,
redirectUri,
state,
);
_apiKey = access_token;
res.redirect(success_url);
} catch (err) {
reject(err);
}
});
server = app.listen(1455, "127.0.0.1", async () => {
const address = server.address();
if (typeof address === "string" || !address) {
// eslint-disable-next-line no-console
console.log(
"It seems like you might already be trying to sign in (port :1455 already in use)",
);
process.exit(1);
return;
}
const port = address.port;
redirectUri = `http://localhost:${port}/auth/callback`;
try {
const oidcConfig = await getOidcConfiguration(issuer);
oidcConfig.token_endpoint = `${issuer}/oauth/token`;
oidcConfig.authorization_endpoint = `${issuer}/oauth/authorize`;
const pkce = generatePKCECodes();
codeVerifier = pkce.code_verifier;
const authUrl = new URL(oidcConfig.authorization_endpoint);
authUrl.searchParams.append("response_type", "code");
authUrl.searchParams.append("client_id", clientId);
authUrl.searchParams.append("redirect_uri", redirectUri);
authUrl.searchParams.append(
"scope",
"openid profile email offline_access",
);
authUrl.searchParams.append("code_challenge", pkce.code_challenge);
authUrl.searchParams.append("code_challenge_method", "S256");
authUrl.searchParams.append("id_token_add_organizations", "true");
authUrl.searchParams.append("state", state);
// Open the browser immediately.
open(authUrl.toString());
setTimeout(() => {
// eslint-disable-next-line no-console
console.log(
`\nOpening login page in your browser: ${authUrl.toString()}\n`,
);
}, 500);
} catch (err) {
reject(err);
}
});
});
// Ensure the server is closed afterwards.
return apiKeyPromise.finally(() => {
if (server) {
server.close();
}
});
}
export async function getApiKey(
issuer: string,
clientId: string,
forceLogin: boolean = false,
): Promise<string> {
if (!forceLogin && process.env["OPENAI_API_KEY"]) {
return process.env["OPENAI_API_KEY"]!;
}
const choice = await promptUserForChoice();
if (choice.type === "apikey") {
process.env["OPENAI_API_KEY"] = choice.key;
return choice.key;
}
const spinner = render(<WaitingForAuth />);
try {
const key = await signInFlow(issuer, clientId);
spinner.clear();
spinner.unmount();
process.env["OPENAI_API_KEY"] = key;
return key;
} catch (err) {
spinner.clear();
spinner.unmount();
throw err;
}
}
export { maybeRedeemCredits };

View File

@@ -1,4 +1,4 @@
import { execSync } from "node:child_process";
import { execSync, execFileSync } from "node:child_process";
// The objects thrown by `child_process.execSync()` are `Error` instances that
// include additional, undocumented properties such as `status` (exit code) and
@@ -89,12 +89,18 @@ export function getGitDiff(): {
//
// `git diff --color --no-index /dev/null <file>` exits with status 1
// when differences are found, so we capture stdout from the thrown
// error object instead of letting it propagate.
execSync(`git diff --color --no-index -- "${nullDevice}" "${file}"`, {
encoding: "utf8",
stdio: ["ignore", "pipe", "ignore"],
maxBuffer: 10 * 1024 * 1024,
});
// error object instead of letting it propagate. Using `execFileSync`
// avoids shell interpolation issues with special characters in the
// path.
execFileSync(
"git",
["diff", "--color", "--no-index", "--", nullDevice, file],
{
encoding: "utf8",
stdio: ["ignore", "pipe", "ignore"],
maxBuffer: 10 * 1024 * 1024,
},
);
} catch (err) {
if (
isExecSyncError(err) &&

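A self-contained sketch of the pattern these comments describe (the function name is hypothetical; the real logic lives inside `getGitDiff` and uses a platform-specific null device rather than a hard-coded `/dev/null`):

```ts
import { execFileSync } from "node:child_process";

// Sketch: `git diff --no-index` exits with status 1 when differences are found,
// so the diff text has to be recovered from the thrown error's `stdout`.
function diffAgainstNull(file: string): string {
  try {
    // Exit status 0 means no differences; the captured output is empty.
    return execFileSync(
      "git",
      ["diff", "--color", "--no-index", "--", "/dev/null", file],
      { encoding: "utf8", stdio: ["ignore", "pipe", "ignore"] },
    );
  } catch (err) {
    const e = err as { status?: number; stdout?: string };
    if (e.status === 1 && typeof e.stdout === "string") {
      return e.stdout; // differences found
    }
    throw err;
  }
}
```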
View File

@@ -19,6 +19,10 @@ export const openAiModelInfo = {
label: "o3 (2025-04-16)",
maxContextLength: 200000,
},
"codex-mini-latest": {
label: "codex-mini-latest",
maxContextLength: 200000,
},
"o4-mini": {
label: "o4 Mini",
maxContextLength: 200000,

View File

@@ -20,6 +20,7 @@ export const SLASH_COMMANDS: Array<SlashCommand> = [
"Clear conversation history but keep a summary in context. Optional: /compact [instructions for summarization]",
},
{ command: "/history", description: "Open command history" },
{ command: "/sessions", description: "Browse previous sessions" },
{ command: "/help", description: "Show list of commands" },
{ command: "/model", description: "Open model selection panel" },
{ command: "/approval", description: "Open approval mode selection panel" },

View File

@@ -0,0 +1,107 @@
/**
* tests/agent-azure-responses-endpoint.test.ts
*
* Verifies that AgentLoop calls the `/responses` endpoint when provider is set to Azure.
*/
import { describe, it, expect, vi, beforeEach } from "vitest";
// Fake stream that yields a completed response event
class FakeStream {
async *[Symbol.asyncIterator]() {
yield {
type: "response.completed",
response: { id: "azure_resp", status: "completed", output: [] },
} as any;
}
}
let lastCreateParams: any = null;
vi.mock("openai", () => {
class FakeDefaultClient {
public responses = {
create: async (params: any) => {
lastCreateParams = params;
return new FakeStream();
},
};
}
class FakeAzureClient {
public responses = {
create: async (params: any) => {
lastCreateParams = params;
return new FakeStream();
},
};
}
class APIConnectionTimeoutError extends Error {}
return {
__esModule: true,
default: FakeDefaultClient,
AzureOpenAI: FakeAzureClient,
APIConnectionTimeoutError,
};
});
// Stub approvals to bypass command approval logic
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }),
isSafeCommand: () => null,
}));
// Stub format-command to avoid formatting side effects
vi.mock("../src/format-command.js", () => ({
__esModule: true,
formatCommandForDisplay: (cmd: Array<string>) => cmd.join(" "),
}));
// Stub internal logging to keep output clean
vi.mock("../src/utils/agent/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("AgentLoop Azure provider responses endpoint", () => {
beforeEach(() => {
lastCreateParams = null;
});
it("calls the /responses endpoint when provider is azure", async () => {
const cfg: any = {
model: "test-model",
provider: "azure",
instructions: "",
disableResponseStorage: false,
notify: false,
};
const loop = new AgentLoop({
additionalWritableRoots: [],
model: cfg.model,
config: cfg,
instructions: cfg.instructions,
approvalPolicy: { mode: "suggest" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
await loop.run([
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "hello" }],
},
]);
expect(lastCreateParams).not.toBeNull();
expect(lastCreateParams.model).toBe(cfg.model);
expect(Array.isArray(lastCreateParams.input)).toBe(true);
});
});

View File

@@ -132,8 +132,6 @@ describe("cancel clears previous_response_id", () => {
] as any);
const bodies = _test.getBodies();
// eslint-disable-next-line no-console
console.log(JSON.stringify(bodies, null, 2));
expect(bodies.length).toBeGreaterThanOrEqual(2);
// The *last* invocation belongs to the second run (after cancellation).

View File

@@ -55,6 +55,7 @@ describe("/clear command", () => {
openApprovalOverlay: () => {},
openHelpOverlay: () => {},
openDiffOverlay: () => {},
openSessionsOverlay: () => {},
onCompact: () => {},
interruptAgent: () => {},
active: true,

View File

@@ -67,7 +67,7 @@ test("loads default config if files don't exist", () => {
});
// Keep the test focused on just checking that default model and instructions are loaded
// so we need to make sure we check just these properties
expect(config.model).toBe("o4-mini");
expect(config.model).toBe("codex-mini-latest");
expect(config.instructions).toBe("");
});

View File

@@ -29,7 +29,7 @@ describe.each([
])("AgentLoop with disableResponseStorage=%s", ({ flag, title }) => {
/* build a fresh config for each case */
const cfg: AppConfig = {
model: "o4-mini",
model: "codex-mini-latest",
provider: "openai",
instructions: "",
disableResponseStorage: flag,

View File

@@ -21,7 +21,10 @@ describe("disableResponseStorage persistence", () => {
mkdirSync(codexDir, { recursive: true });
// seed YAML with ZDR enabled
writeFileSync(yamlPath, "model: o4-mini\ndisableResponseStorage: true\n");
writeFileSync(
yamlPath,
"model: codex-mini-latest\ndisableResponseStorage: true\n",
);
});
afterAll((): void => {

View File

@@ -0,0 +1,28 @@
import { mkdtempSync, writeFileSync, rmSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";
import { execSync } from "child_process";
import { describe, it, expect } from "vitest";
import { getGitDiff } from "../src/utils/get-diff.js";
describe("getGitDiff", () => {
it("handles untracked files with special characters", () => {
const repoDir = mkdtempSync(join(tmpdir(), "git-diff-test-"));
const prevCwd = process.cwd();
try {
process.chdir(repoDir);
execSync("git init", { stdio: "ignore" });
const fileName = "a$b.txt";
writeFileSync(join(repoDir, fileName), "hello\n");
const { isGitRepo, diff } = getGitDiff();
expect(isGitRepo).toBe(true);
expect(diff).toContain(fileName);
} finally {
process.chdir(prevCwd);
rmSync(repoDir, { recursive: true, force: true });
}
});
});

View File

@@ -44,7 +44,10 @@ describe("model-utils offline resilience", () => {
"../src/utils/model-utils.js"
);
const supported = await isModelSupportedForResponses("openai", "o4-mini");
const supported = await isModelSupportedForResponses(
"openai",
"codex-mini-latest",
);
expect(supported).toBe(true);
});

View File

@@ -6,6 +6,7 @@ test("SLASH_COMMANDS includes expected commands", () => {
expect(commands).toContain("/clear");
expect(commands).toContain("/compact");
expect(commands).toContain("/history");
expect(commands).toContain("/sessions");
expect(commands).toContain("/help");
expect(commands).toContain("/model");
expect(commands).toContain("/approval");

View File

@@ -21,6 +21,7 @@ describe("TerminalChatInput compact command", () => {
openModelOverlay: () => {},
openApprovalOverlay: () => {},
openHelpOverlay: () => {},
openSessionsOverlay: () => {},
onCompact: () => {},
interruptAgent: () => {},
active: true,

View File

@@ -76,6 +76,7 @@ describe("TerminalChatInput file tag suggestions", () => {
openModelOverlay: vi.fn(),
openApprovalOverlay: vi.fn(),
openHelpOverlay: vi.fn(),
openSessionsOverlay: vi.fn(),
onCompact: vi.fn(),
interruptAgent: vi.fn(),
active: true,

View File

@@ -42,6 +42,7 @@ describe("TerminalChatInput multiline functionality", () => {
openModelOverlay: () => {},
openApprovalOverlay: () => {},
openHelpOverlay: () => {},
openSessionsOverlay: () => {},
onCompact: () => {},
interruptAgent: () => {},
active: true,
@@ -93,6 +94,7 @@ describe("TerminalChatInput multiline functionality", () => {
openModelOverlay: () => {},
openApprovalOverlay: () => {},
openHelpOverlay: () => {},
openSessionsOverlay: () => {},
onCompact: () => {},
interruptAgent: () => {},
active: true,

codex-rs/.gitignore vendored
View File

@@ -1 +1,7 @@
/target/
# Recommended value of CARGO_TARGET_DIR when using Docker as explained in .devcontainer/README.md.
/target-amd64/
# Value of CARGO_TARGET_DIR when using .devcontainer/devcontainer.json.
/target-arm64/

codex-rs/Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -8,6 +8,9 @@ members = [
"core",
"exec",
"execpolicy",
"file-search",
"linux-sandbox",
"login",
"mcp-client",
"mcp-server",
"mcp-types",
@@ -23,7 +26,7 @@ version = "0.0.0"
edition = "2024"
[workspace.lints]
rust = { }
rust = {}
[workspace.lints.clippy]
expect_used = "deny"
@@ -34,3 +37,6 @@ lto = "fat"
# Because we bundle some of these executables with the TypeScript CLI, we
# remove everything to make the binary as small as possible.
strip = "symbols"
# See https://github.com/openai/codex/issues/1411 for details.
codegen-units = 1

View File

@@ -1,16 +1,63 @@
# codex-rs
# Codex CLI (Rust Implementation)
April 24, 2025
We provide Codex CLI as a standalone, native executable to ensure a zero-dependency install.
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible.
## Installing Codex
To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits:
Today, the easiest way to install Codex is via `npm`, though we plan to publish Codex to other package managers soon.
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance.
```shell
npm i -g @openai/codex@native
codex
```
Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implementation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.
You can also download a platform-specific release directly from our [GitHub Releases](https://github.com/openai/codex/releases).
## What's new in the Rust CLI
While we are [working to close the gap between the TypeScript and Rust implementations of Codex CLI](https://github.com/openai/codex/issues/1262), note that the Rust CLI has a number of features that the TypeScript CLI does not!
### Config
Codex supports a rich set of configuration options. Note that the Rust CLI uses `config.toml` instead of `config.json`. See [`config.md`](./config.md) for details.
### Model Context Protocol Support
Codex CLI functions as an MCP client that can connect to MCP servers on startup. See the [`mcp_servers`](./config.md#mcp_servers) section in the configuration documentation for details.
It is still experimental, but you can also launch Codex as an MCP _server_ by running `codex mcp`. Use the [`@modelcontextprotocol/inspector`](https://github.com/modelcontextprotocol/inspector) to try it out:
```shell
npx @modelcontextprotocol/inspector codex mcp
```
### Notifications
You can enable notifications by configuring a script that is run whenever the agent finishes a turn. The [notify documentation](./config.md#notify) includes a detailed example that explains how to get desktop notifications via [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS.
### `codex exec` to run Codex programmatically/non-interactively
To run Codex non-interactively, run `codex exec PROMPT` (you can also pass the prompt via `stdin`) and Codex will work on your task until it decides that it is done and exits. Output is printed to the terminal directly. You can set the `RUST_LOG` environment variable to see more about what's going on.
### `--cd`/`-C` flag
Sometimes it is not convenient to `cd` to the directory you want Codex to use as the "working root" before running Codex. Fortunately, `codex` supports a `--cd` option so you can specify whatever folder you want. You can confirm that Codex is honoring `--cd` by double-checking the **workdir** it reports in the TUI at the start of a new session.
### Experimenting with the Codex Sandbox
To see what happens when a command is run under the sandbox provided by Codex, we provide the following subcommands in Codex CLI:
```
# macOS
codex debug seatbelt [-s SANDBOX_PERMISSION]... [COMMAND]...
# Linux
codex debug landlock [-s SANDBOX_PERMISSION]... [COMMAND]...
```
You can experiment with different values of `-s` to see what permissions the `COMMAND` needs to execute successfully.
Note that the exact API for the `-s` flag is currently in flux. See https://github.com/openai/codex/issues/1248 for details.
## Code Organization
@@ -20,283 +67,3 @@ This folder is the root of a Cargo workspace. It contains quite a bit of experim
- [`exec/`](./exec) "headless" CLI for use in automation.
- [`tui/`](./tui) CLI that launches a fullscreen TUI built with [Ratatui](https://ratatui.rs/).
- [`cli/`](./cli) CLI multitool that provides the aforementioned CLIs via subcommands.
## Config
The CLI can be configured via `~/.codex/config.toml`. It supports the following options:
### model
The model that Codex should use.
```toml
model = "o3" # overrides the default of "o4-mini"
```
### model_provider
Codex comes bundled with a number of "model providers" predefined. This config value is a string that indicates which provider to use. You can also define your own providers via `model_providers`.
For example, if you are running ollama with Mistral locally, then you would need to add the following to your config:
```toml
model = "mistral"
model_provider = "ollama"
```
because the following definition for `ollama` is included in Codex:
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"
```
This option defaults to `"openai"` and the corresponding provider is defined as follows:
```toml
[model_providers.openai]
name = "OpenAI"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"
```
### model_providers
This option lets you override and amend the default set of model providers bundled with Codex. This value is a map where the key is the value to use with `model_provider` to select the corresponding provider.
For example, if you wanted to add a provider that uses the OpenAI 4o model via the chat completions API, then you could define it as follows:
```toml
# Recall that in TOML, root keys must be listed before tables.
model = "gpt-4o"
model_provider = "openai-chat-completions"
[model_providers.openai-chat-completions]
# Name of the provider that will be displayed in the Codex UI.
name = "OpenAI using Chat Completions"
# The path `/chat/completions` will be amended to this URL to make the POST
# request for the chat completions.
base_url = "https://api.openai.com/v1"
# If `env_key` is set, identifies an environment variable that must be set when
# using Codex with this provider. The value of the environment variable must be
# non-empty and will be used in the `Bearer TOKEN` HTTP header for the POST request.
env_key = "OPENAI_API_KEY"
# valid values for wire_api are "chat" and "responses".
wire_api = "chat"
```
### approval_policy
Determines when the user should be prompted to approve whether Codex can execute a command:
```toml
# This is analogous to --suggest in the TypeScript Codex CLI
approval_policy = "unless-allow-listed"
```
```toml
# If the command fails when run in the sandbox, Codex asks for permission to
# retry the command outside the sandbox.
approval_policy = "on-failure"
```
```toml
# User is never prompted: if the command fails, Codex will automatically try
# something out. Note the `exec` subcommand always uses this mode.
approval_policy = "never"
```
### profiles
A _profile_ is a collection of configuration values that can be set together. Multiple profiles can be defined in `config.toml` and you can specify the one you
want to use at runtime via the `--profile` flag.
Here is an example of a `config.toml` that defines multiple profiles:
```toml
model = "o3"
approval_policy = "unless-allow-listed"
sandbox_permissions = ["disk-full-read-access"]
disable_response_storage = false
# Setting `profile` is equivalent to specifying `--profile o3` on the command
# line, though the `--profile` flag can still be used to override this value.
profile = "o3"
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
[profiles.o3]
model = "o3"
model_provider = "openai"
approval_policy = "never"
[profiles.gpt3]
model = "gpt-3.5-turbo"
model_provider = "openai-chat-completions"
[profiles.zdr]
model = "o3"
model_provider = "openai"
approval_policy = "on-failure"
disable_response_storage = true
```
Users can specify config values at multiple levels. Order of precedence is as follows:
1. custom command-line argument, e.g., `--model o3`
2. as part of a profile, where the `--profile` is specified via a CLI (or in the config file itself)
3. as an entry in `config.toml`, e.g., `model = "o3"`
4. the default value that comes with Codex CLI (i.e., Codex CLI defaults to `o4-mini`)
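For example, passing `--model gpt-4o` on the command line wins over `model = "o3"` set in a selected profile, which in turn wins over a top-level `model` entry in `config.toml`.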
### sandbox_permissions
List of permissions to grant to the sandbox that Codex uses to execute untrusted commands:
```toml
# This is comparable to --full-auto in the TypeScript Codex CLI, though
# specifying `disk-write-platform-global-temp-folder` adds /tmp as a writable
# folder in addition to $TMPDIR.
sandbox_permissions = [
"disk-full-read-access",
"disk-write-platform-user-temp-folder",
"disk-write-platform-global-temp-folder",
"disk-write-cwd",
]
```
To add additional writable folders, use `disk-write-folder`, which takes a parameter (this can be specified multiple times):
```toml
sandbox_permissions = [
# ...
"disk-write-folder=/Users/mbolin/.pyenv/shims",
]
```
### mcp_servers
Defines the list of MCP servers that Codex can consult for tool use. Currently, only servers launched by executing a program that communicates over stdio are supported. For servers that use the SSE transport, consider an adapter like [mcp-proxy](https://github.com/sparfenyuk/mcp-proxy).
**Note:** Codex may cache the list of tools and resources from an MCP server so that Codex can include this information in context at startup without spawning all the servers. This is designed to save resources by loading MCP servers lazily.
This config option is comparable to how Claude and Cursor define `mcpServers` in their respective JSON config files, though because Codex uses TOML for its config language, the format is slightly different. For example, the following config in JSON:
```json
{
"mcpServers": {
"server-name": {
"command": "npx",
"args": ["-y", "mcp-server"],
"env": {
"API_KEY": "value"
}
}
}
}
```
Should be represented as follows in `~/.codex/config.toml`:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }
```
### disable_response_storage
Currently, customers whose accounts are set to use Zero Data Retention (ZDR) must set `disable_response_storage` to `true` so that Codex uses an alternative to the Responses API that works with ZDR:
```toml
disable_response_storage = true
```
### notify
Specify a program that will be executed to get notified about events generated by Codex. Note that the program will receive the notification argument as a string of JSON, e.g.:
```json
{
"type": "agent-turn-complete",
"turn-id": "12345",
"input-messages": ["Rename `foo` to `bar` and update the callsites."],
"last-assistant-message": "Rename complete and verified `cargo build` succeeds."
}
```
The `"type"` property will always be set. Currently, `"agent-turn-complete"` is the only notification type that is supported.
As an example, here is a Python script that parses the JSON and decides whether to show a desktop push notification using [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS:
```python
#!/usr/bin/env python3
import json
import subprocess
import sys
def main() -> int:
if len(sys.argv) != 2:
print("Usage: notify.py <NOTIFICATION_JSON>")
return 1
try:
notification = json.loads(sys.argv[1])
except json.JSONDecodeError:
return 1
match notification_type := notification.get("type"):
case "agent-turn-complete":
assistant_message = notification.get("last-assistant-message")
if assistant_message:
title = f"Codex: {assistant_message}"
else:
title = "Codex: Turn Complete!"
input_messages = notification.get("input_messages", [])
message = " ".join(input_messages)
title += message
case _:
print(f"not sending a push notification for: {notification_type}")
return 0
subprocess.check_output(
[
"terminal-notifier",
"-title",
title,
"-message",
message,
"-group",
"codex",
"-ignoreDnD",
"-activate",
"com.googlecode.iterm2",
]
)
return 0
if __name__ == "__main__":
sys.exit(main())
```
To have Codex use this script for notifications, you would configure it via `notify` in `~/.codex/config.toml` using the appropriate path to `notify.py` on your computer:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
### project_doc_max_bytes
Maximum number of bytes to read from an `AGENTS.md` file to include in the instructions sent with the first turn of a session. Defaults to 32 KiB.
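For example, to halve the default: `project_doc_max_bytes = 16384`.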

View File

@@ -12,7 +12,6 @@ workspace = true
[dependencies]
anyhow = "1"
regex = "1.11.1"
serde_json = "1.0.110"
similar = "2.7.0"
thiserror = "2.0.12"

View File

@@ -0,0 +1,40 @@
To edit files, ALWAYS use the `shell` tool with `apply_patch` CLI. `apply_patch` effectively allows you to execute a diff/patch against a file, but the format of the diff specification is unique to this task, so pay careful attention to these instructions. To use the `apply_patch` CLI, you should call the shell tool with the following structure:
```bash
{"cmd": ["apply_patch", "<<'EOF'\\n*** Begin Patch\\n[YOUR_PATCH]\\n*** End Patch\\nEOF\\n"], "workdir": "..."}
```
Where [YOUR_PATCH] is the actual content of your patch, specified in the following V4A diff format.
*** [ACTION] File: [path/to/file] -> ACTION can be one of Add, Update, or Delete.
For each snippet of code that needs to be changed, repeat the following:
[context_before] -> See below for further instructions on context.
- [old_code] -> Precede the old code with a minus sign.
+ [new_code] -> Precede the new, replacement code with a plus sign.
[context_after] -> See below for further instructions on context.
For instructions on [context_before] and [context_after]:
- By default, show 3 lines of code immediately above and 3 lines immediately below each change. If a change is within 3 lines of a previous change, do NOT duplicate the first change's [context_after] lines in the second change's [context_before] lines.
- If 3 lines of context is insufficient to uniquely identify the snippet of code within the file, use the @@ operator to indicate the class or function to which the snippet belongs. For instance, we might have:
@@ class BaseClass
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
- If a code block is repeated so many times in a class or function that even a single `@@` statement and 3 lines of context cannot uniquely identify the snippet of code, you can use multiple `@@` statements to jump to the right context. For instance:
@@ class BaseClass
@@ def method():
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
Note, then, that we do not use line numbers in this diff format, as the context is enough to uniquely identify code. An example of a message that you might pass as "input" to this function, in order to apply a patch, is shown below.
```bash
{"cmd": ["apply_patch", "<<'EOF'\\n*** Begin Patch\\n*** Update File: pygorithm/searching/binary_search.py\\n@@ class BaseClass\\n@@ def search():\\n- pass\\n+ raise NotImplementedError()\\n@@ class Subclass\\n@@ def search():\\n- pass\\n+ raise NotImplementedError()\\n*** End Patch\\nEOF\\n"], "workdir": "..."}
```
File references can only be relative, NEVER ABSOLUTE. After the apply_patch command is run, it will always say "Done!", regardless of whether the patch was applied successfully. However, you can determine whether there were issues or errors by looking at any warnings or logging lines printed BEFORE the "Done!" is output.

View File

@@ -4,9 +4,9 @@ mod seek_sequence;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
use std::str::Utf8Error;
use anyhow::Context;
use anyhow::Error;
use anyhow::Result;
pub use parser::Hunk;
pub use parser::ParseError;
@@ -15,10 +15,14 @@ use parser::UpdateFileChunk;
pub use parser::parse_patch;
use similar::TextDiff;
use thiserror::Error;
use tree_sitter::LanguageError;
use tree_sitter::Parser;
use tree_sitter_bash::LANGUAGE as BASH;
#[derive(Debug, Error)]
/// Detailed instructions for gpt-4.1 on how to use the `apply_patch` tool.
pub const APPLY_PATCH_TOOL_INSTRUCTIONS: &str = include_str!("../apply_patch_tool_instructions.md");
#[derive(Debug, Error, PartialEq)]
pub enum ApplyPatchError {
#[error(transparent)]
ParseError(#[from] ParseError),
@@ -46,10 +50,16 @@ pub struct IoError {
source: std::io::Error,
}
#[derive(Debug)]
impl PartialEq for IoError {
fn eq(&self, other: &Self) -> bool {
self.context == other.context && self.source.to_string() == other.source.to_string()
}
}
#[derive(Debug, PartialEq)]
pub enum MaybeApplyPatch {
Body(Vec<Hunk>),
ShellParseError(Error),
ShellParseError(ExtractHeredocError),
PatchParseError(ParseError),
NotApplyPatch,
}
@@ -77,7 +87,7 @@ pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
}
}
#[derive(Debug)]
#[derive(Debug, PartialEq)]
pub enum ApplyPatchFileChange {
Add {
content: String,
@@ -91,14 +101,14 @@ pub enum ApplyPatchFileChange {
},
}
#[derive(Debug)]
#[derive(Debug, PartialEq)]
pub enum MaybeApplyPatchVerified {
/// `argv` corresponded to an `apply_patch` invocation, and these are the
/// resulting proposed file changes.
Body(ApplyPatchAction),
/// `argv` could not be parsed to determine whether it corresponds to an
/// `apply_patch` invocation.
ShellParseError(Error),
ShellParseError(ExtractHeredocError),
/// `argv` corresponded to an `apply_patch` invocation, but it could not
/// be fulfilled due to the specified error.
CorrectnessError(ApplyPatchError),
@@ -106,7 +116,7 @@ pub enum MaybeApplyPatchVerified {
NotApplyPatch,
}
#[derive(Debug)]
#[derive(Debug, PartialEq)]
/// ApplyPatchAction is the result of parsing an `apply_patch` command. By
/// construction, all paths should be absolute paths.
pub struct ApplyPatchAction {
@@ -142,22 +152,16 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
MaybeApplyPatch::Body(hunks) => {
let mut changes = HashMap::new();
for hunk in hunks {
let path = hunk.resolve_path(cwd);
match hunk {
Hunk::AddFile { path, contents } => {
changes.insert(
cwd.join(path),
ApplyPatchFileChange::Add {
content: contents.clone(),
},
);
Hunk::AddFile { contents, .. } => {
changes.insert(path, ApplyPatchFileChange::Add { content: contents });
}
Hunk::DeleteFile { path } => {
changes.insert(cwd.join(path), ApplyPatchFileChange::Delete);
Hunk::DeleteFile { .. } => {
changes.insert(path, ApplyPatchFileChange::Delete);
}
Hunk::UpdateFile {
path,
move_path,
chunks,
move_path, chunks, ..
} => {
let ApplyPatchFileUpdate {
unified_diff,
@@ -169,7 +173,7 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
}
};
changes.insert(
cwd.join(path),
path,
ApplyPatchFileChange::Update {
unified_diff,
move_path: move_path.map(|p| cwd.join(p)),
@@ -205,19 +209,21 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
/// * `Ok(String)` - The heredoc body if the extraction is successful.
/// * `Err(anyhow::Error)` - An error if the extraction fails.
///
fn extract_heredoc_body_from_apply_patch_command(src: &str) -> anyhow::Result<String> {
fn extract_heredoc_body_from_apply_patch_command(
src: &str,
) -> std::result::Result<String, ExtractHeredocError> {
if !src.trim_start().starts_with("apply_patch") {
anyhow::bail!("expected command to start with 'apply_patch'");
return Err(ExtractHeredocError::CommandDidNotStartWithApplyPatch);
}
let lang = BASH.into();
let mut parser = Parser::new();
parser
.set_language(&lang)
.context("failed to load bash grammar")?;
.map_err(ExtractHeredocError::FailedToLoadBashGrammar)?;
let tree = parser
.parse(src, None)
.ok_or_else(|| anyhow::anyhow!("failed to parse patch into AST"))?;
.ok_or(ExtractHeredocError::FailedToParsePatchIntoAst)?;
let bytes = src.as_bytes();
let mut c = tree.root_node().walk();
@@ -227,7 +233,7 @@ fn extract_heredoc_body_from_apply_patch_command(src: &str) -> anyhow::Result<St
if node.kind() == "heredoc_body" {
let text = node
.utf8_text(bytes)
.context("failed to interpret heredoc body as UTF-8")?;
.map_err(ExtractHeredocError::HeredocNotUtf8)?;
return Ok(text.trim_end_matches('\n').to_owned());
}
@@ -236,12 +242,21 @@ fn extract_heredoc_body_from_apply_patch_command(src: &str) -> anyhow::Result<St
}
while !c.goto_next_sibling() {
if !c.goto_parent() {
anyhow::bail!("expected to find heredoc_body in patch candidate");
return Err(ExtractHeredocError::FailedToFindHeredocBody);
}
}
}
}
#[derive(Debug, PartialEq)]
pub enum ExtractHeredocError {
CommandDidNotStartWithApplyPatch,
FailedToLoadBashGrammar(LanguageError),
HeredocNotUtf8(Utf8Error),
FailedToParsePatchIntoAst,
FailedToFindHeredocBody,
}
/// Applies the patch and prints the result to stdout/stderr.
pub fn apply_patch(
patch: &str,
@@ -1137,4 +1152,48 @@ g
"#
);
}
#[test]
fn test_apply_patch_should_resolve_absolute_paths_in_cwd() {
let session_dir = tempdir().unwrap();
let relative_path = "source.txt";
// Note that we need this file to exist for the patch to be "verified"
// and parsed correctly.
let session_file_path = session_dir.path().join(relative_path);
fs::write(&session_file_path, "session directory content\n").unwrap();
let argv = vec![
"apply_patch".to_string(),
r#"*** Begin Patch
*** Update File: source.txt
@@
-session directory content
+updated session directory content
*** End Patch"#
.to_string(),
];
let result = maybe_parse_apply_patch_verified(&argv, session_dir.path());
// Verify the patch contents - as otherwise we may have pulled contents
// from the wrong file (as we're using relative paths)
assert_eq!(
result,
MaybeApplyPatchVerified::Body(ApplyPatchAction {
changes: HashMap::from([(
session_dir.path().join(relative_path),
ApplyPatchFileChange::Update {
unified_diff: r#"@@ -1 +1 @@
-session directory content
+updated session directory content
"#
.to_string(),
move_path: None,
new_content: "updated session directory content\n".to_string(),
},
)]),
})
);
}
}

View File

@@ -22,6 +22,7 @@
//!
//! The parser below is a little more lenient than the explicit spec and allows for
//! leading/trailing whitespace around patch markers.
use std::path::Path;
use std::path::PathBuf;
use thiserror::Error;
@@ -36,7 +37,15 @@ const EOF_MARKER: &str = "*** End of File";
const CHANGE_CONTEXT_MARKER: &str = "@@ ";
const EMPTY_CHANGE_CONTEXT_MARKER: &str = "@@";
#[derive(Debug, PartialEq, Error)]
/// Currently, the only OpenAI model that knowingly requires lenient parsing is
/// gpt-4.1. While we could try to require everyone to pass in a strictness
/// param when invoking apply_patch, it is a pain to thread it through all of
/// the call sites, so we resign ourselves to allowing lenient parsing for all
/// models. See [`ParseMode::Lenient`] for details on the exceptions we make for
/// gpt-4.1.
const PARSE_IN_STRICT_MODE: bool = false;
#[derive(Debug, PartialEq, Error, Clone)]
pub enum ParseError {
#[error("invalid patch: {0}")]
InvalidPatchError(String),
@@ -45,7 +54,7 @@ pub enum ParseError {
}
use ParseError::*;
#[derive(Debug, PartialEq)]
#[derive(Debug, PartialEq, Clone)]
#[allow(clippy::enum_variant_names)]
pub enum Hunk {
AddFile {
@@ -64,9 +73,20 @@ pub enum Hunk {
chunks: Vec<UpdateFileChunk>,
},
}
impl Hunk {
pub fn resolve_path(&self, cwd: &Path) -> PathBuf {
match self {
Hunk::AddFile { path, .. } => cwd.join(path),
Hunk::DeleteFile { path } => cwd.join(path),
Hunk::UpdateFile { path, .. } => cwd.join(path),
}
}
}
use Hunk::*;
#[derive(Debug, PartialEq)]
#[derive(Debug, PartialEq, Clone)]
pub struct UpdateFileChunk {
/// A single line of context used to narrow down the position of the chunk
/// (this is usually a class, method, or function definition.)
@@ -83,19 +103,68 @@ pub struct UpdateFileChunk {
}
pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
let mode = if PARSE_IN_STRICT_MODE {
ParseMode::Strict
} else {
ParseMode::Lenient
};
parse_patch_text(patch, mode)
}
enum ParseMode {
/// Parse the patch text argument as is.
Strict,
/// GPT-4.1 is known to formulate the `command` array for the `local_shell`
/// tool call for `apply_patch` call using something like the following:
///
/// ```json
/// [
/// "apply_patch",
/// "<<'EOF'\n*** Begin Patch\n*** Update File: README.md\n@@...\n*** End Patch\nEOF\n",
/// ]
/// ```
///
/// This is a problem because `local_shell` is a bit of a misnomer: the
/// `command` is not invoked by passing the arguments to a shell like Bash,
/// but is invoked using something akin to `execvpe(3)`.
///
/// This is significant in this case because where a shell would interpret
/// `<<'EOF'...` as a heredoc and pass the contents via stdin (which is
/// fine, as `apply_patch` is specified to read from stdin if no argument is
/// passed), `execvpe(3)` interprets the heredoc as a literal string. To get
/// the `local_shell` tool to run a command the way shell would, the
/// `command` array must be something like:
///
/// ```json
/// [
/// "bash",
/// "-lc",
/// "apply_patch <<'EOF'\n*** Begin Patch\n*** Update File: README.md\n@@...\n*** End Patch\nEOF\n",
/// ]
/// ```
///
/// In lenient mode, we check if the argument to `apply_patch` starts with
/// `<<'EOF'` and ends with `EOF\n`. If so, we strip off these markers,
/// trim() the result, and treat what is left as the patch text.
Lenient,
}
fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<Vec<Hunk>, ParseError> {
let lines: Vec<&str> = patch.trim().lines().collect();
if lines.is_empty() || lines[0] != BEGIN_PATCH_MARKER {
return Err(InvalidPatchError(String::from(
"The first line of the patch must be '*** Begin Patch'",
)));
}
let last_line_index = lines.len() - 1;
if lines[last_line_index] != END_PATCH_MARKER {
return Err(InvalidPatchError(String::from(
"The last line of the patch must be '*** End Patch'",
)));
}
let lines: &[&str] = match check_patch_boundaries_strict(&lines) {
Ok(()) => &lines,
Err(e) => match mode {
ParseMode::Strict => {
return Err(e);
}
ParseMode::Lenient => check_patch_boundaries_lenient(&lines, e)?,
},
};
let mut hunks: Vec<Hunk> = Vec::new();
// The above checks ensure that lines.len() >= 2.
let last_line_index = lines.len().saturating_sub(1);
let mut remaining_lines = &lines[1..last_line_index];
let mut line_number = 2;
while !remaining_lines.is_empty() {
@@ -107,6 +176,64 @@ pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
Ok(hunks)
}
/// Checks the start and end lines of the patch text for `apply_patch`,
/// returning an error if they do not match the expected markers.
fn check_patch_boundaries_strict(lines: &[&str]) -> Result<(), ParseError> {
let (first_line, last_line) = match lines {
[] => (None, None),
[first] => (Some(first), Some(first)),
[first, .., last] => (Some(first), Some(last)),
};
check_start_and_end_lines_strict(first_line, last_line)
}
/// If we are in lenient mode, we check if the first line starts with `<<EOF`
/// (possibly quoted) and the last line ends with `EOF`. There must be at least
/// 4 lines total because the heredoc markers take up 2 lines and the patch text
/// must have at least 2 lines.
///
/// If successful, returns the lines of the patch text that contain the patch
/// contents, excluding the heredoc markers.
fn check_patch_boundaries_lenient<'a>(
original_lines: &'a [&'a str],
original_parse_error: ParseError,
) -> Result<&'a [&'a str], ParseError> {
match original_lines {
[first, .., last] => {
if (first == &"<<EOF" || first == &"<<'EOF'" || first == &"<<\"EOF\"")
&& last.ends_with("EOF")
&& original_lines.len() >= 4
{
let inner_lines = &original_lines[1..original_lines.len() - 1];
match check_patch_boundaries_strict(inner_lines) {
Ok(()) => Ok(inner_lines),
Err(e) => Err(e),
}
} else {
Err(original_parse_error)
}
}
_ => Err(original_parse_error),
}
}
fn check_start_and_end_lines_strict(
first_line: Option<&&str>,
last_line: Option<&&str>,
) -> Result<(), ParseError> {
match (first_line, last_line) {
(Some(&first), Some(&last)) if first == BEGIN_PATCH_MARKER && last == END_PATCH_MARKER => {
Ok(())
}
(Some(&first), _) if first != BEGIN_PATCH_MARKER => Err(InvalidPatchError(String::from(
"The first line of the patch must be '*** Begin Patch'",
))),
_ => Err(InvalidPatchError(String::from(
"The last line of the patch must be '*** End Patch'",
))),
}
}
/// Attempts to parse a single hunk from the start of lines.
/// Returns the parsed hunk and the number of lines parsed (or a ParseError).
fn parse_one_hunk(lines: &[&str], line_number: usize) -> Result<(Hunk, usize), ParseError> {
@@ -300,22 +427,23 @@ fn parse_update_file_chunk(
#[test]
fn test_parse_patch() {
assert_eq!(
parse_patch("bad"),
parse_patch_text("bad", ParseMode::Strict),
Err(InvalidPatchError(
"The first line of the patch must be '*** Begin Patch'".to_string()
))
);
assert_eq!(
parse_patch("*** Begin Patch\nbad"),
parse_patch_text("*** Begin Patch\nbad", ParseMode::Strict),
Err(InvalidPatchError(
"The last line of the patch must be '*** End Patch'".to_string()
))
);
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** Update File: test.py\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Err(InvalidHunkError {
message: "Update file hunk for path 'test.py' is empty".to_string(),
@@ -323,14 +451,15 @@ fn test_parse_patch() {
})
);
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Ok(Vec::new())
);
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** Add File: path/add.py\n\
+abc\n\
@@ -341,7 +470,8 @@ fn test_parse_patch() {
@@ def f():\n\
- pass\n\
+ return 123\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Ok(vec![
AddFile {
@@ -365,14 +495,15 @@ fn test_parse_patch() {
);
// Update hunk followed by another hunk (Add File).
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** Update File: file.py\n\
@@\n\
+line\n\
*** Add File: other.py\n\
+content\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Ok(vec![
UpdateFile {
@@ -395,12 +526,13 @@ fn test_parse_patch() {
// Update hunk without an explicit @@ header for the first chunk should parse.
// Use a raw string to preserve the leading space diff marker on the context line.
assert_eq!(
parse_patch(
parse_patch_text(
r#"*** Begin Patch
*** Update File: file2.py
import foo
+bar
*** End Patch"#,
ParseMode::Strict
),
Ok(vec![UpdateFile {
path: PathBuf::from("file2.py"),
@@ -415,6 +547,80 @@ fn test_parse_patch() {
);
}
#[test]
fn test_parse_patch_lenient() {
let patch_text = r#"*** Begin Patch
*** Update File: file2.py
import foo
+bar
*** End Patch"#;
let expected_patch = vec![UpdateFile {
path: PathBuf::from("file2.py"),
move_path: None,
chunks: vec![UpdateFileChunk {
change_context: None,
old_lines: vec!["import foo".to_string()],
new_lines: vec!["import foo".to_string(), "bar".to_string()],
is_end_of_file: false,
}],
}];
let expected_error =
InvalidPatchError("The first line of the patch must be '*** Begin Patch'".to_string());
let patch_text_in_heredoc = format!("<<EOF\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
);
let patch_text_in_single_quoted_heredoc = format!("<<'EOF'\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_single_quoted_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_single_quoted_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
);
let patch_text_in_double_quoted_heredoc = format!("<<\"EOF\"\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_double_quoted_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_double_quoted_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
);
let patch_text_in_mismatched_quotes_heredoc = format!("<<\"EOF'\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_mismatched_quotes_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_mismatched_quotes_heredoc, ParseMode::Lenient),
Err(expected_error.clone())
);
let patch_text_with_missing_closing_heredoc =
"<<EOF\n*** Begin Patch\n*** Update File: file2.py\nEOF\n".to_string();
assert_eq!(
parse_patch_text(&patch_text_with_missing_closing_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_with_missing_closing_heredoc, ParseMode::Lenient),
Err(InvalidPatchError(
"The last line of the patch must be '*** End Patch'".to_string()
))
);
}
#[test]
fn test_parse_one_hunk() {
assert_eq!(

View File

@@ -7,10 +7,6 @@ edition = "2024"
name = "codex"
path = "src/main.rs"
[[bin]]
name = "codex-linux-sandbox"
path = "src/linux-sandbox/main.rs"
[lib]
name = "codex_cli"
path = "src/lib.rs"
@@ -24,6 +20,8 @@ clap = { version = "4", features = ["derive"] }
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
codex-exec = { path = "../exec" }
codex-login = { path = "../login" }
codex-linux-sandbox = { path = "../linux-sandbox" }
codex-mcp-server = { path = "../mcp-server" }
codex-tui = { path = "../tui" }
serde_json = "1"

View File

@@ -0,0 +1,113 @@
use std::path::PathBuf;
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::exec::StdioPolicy;
use codex_core::exec::spawn_command_under_linux_sandbox;
use codex_core::exec::spawn_command_under_seatbelt;
use codex_core::exec_env::create_env;
use codex_core::protocol::SandboxPolicy;
use crate::LandlockCommand;
use crate::SeatbeltCommand;
use crate::exit_status::handle_exit_status;
pub async fn run_command_under_seatbelt(
command: SeatbeltCommand,
codex_linux_sandbox_exe: Option<PathBuf>,
) -> anyhow::Result<()> {
let SeatbeltCommand {
full_auto,
config_overrides,
command,
} = command;
run_command_under_sandbox(
full_auto,
command,
config_overrides,
codex_linux_sandbox_exe,
SandboxType::Seatbelt,
)
.await
}
pub async fn run_command_under_landlock(
command: LandlockCommand,
codex_linux_sandbox_exe: Option<PathBuf>,
) -> anyhow::Result<()> {
let LandlockCommand {
full_auto,
config_overrides,
command,
} = command;
run_command_under_sandbox(
full_auto,
command,
config_overrides,
codex_linux_sandbox_exe,
SandboxType::Landlock,
)
.await
}
enum SandboxType {
Seatbelt,
Landlock,
}
async fn run_command_under_sandbox(
full_auto: bool,
command: Vec<String>,
config_overrides: CliConfigOverrides,
codex_linux_sandbox_exe: Option<PathBuf>,
sandbox_type: SandboxType,
) -> anyhow::Result<()> {
let sandbox_policy = create_sandbox_policy(full_auto);
let cwd = std::env::current_dir()?;
let config = Config::load_with_cli_overrides(
config_overrides
.parse_overrides()
.map_err(anyhow::Error::msg)?,
ConfigOverrides {
sandbox_policy: Some(sandbox_policy),
codex_linux_sandbox_exe,
..Default::default()
},
)?;
let stdio_policy = StdioPolicy::Inherit;
let env = create_env(&config.shell_environment_policy);
let mut child = match sandbox_type {
SandboxType::Seatbelt => {
spawn_command_under_seatbelt(command, &config.sandbox_policy, cwd, stdio_policy, env)
.await?
}
SandboxType::Landlock => {
#[expect(clippy::expect_used)]
let codex_linux_sandbox_exe = config
.codex_linux_sandbox_exe
.expect("codex-linux-sandbox executable not found");
spawn_command_under_linux_sandbox(
codex_linux_sandbox_exe,
command,
&config.sandbox_policy,
cwd,
stdio_policy,
env,
)
.await?
}
};
let status = child.wait().await?;
handle_exit_status(status);
}
pub fn create_sandbox_policy(full_auto: bool) -> SandboxPolicy {
if full_auto {
SandboxPolicy::new_workspace_write_policy()
} else {
SandboxPolicy::new_read_only_policy()
}
}

View File

@@ -1,35 +0,0 @@
//! `debug landlock` implementation for the Codex CLI.
//!
//! On Linux the command is executed inside a Landlock + seccomp sandbox by
//! calling the low-level `exec_linux` helper from `codex_core::linux`.
use codex_core::exec::StdioPolicy;
use codex_core::exec::spawn_child_sync;
use codex_core::exec_linux::apply_sandbox_policy_to_current_thread;
use codex_core::protocol::SandboxPolicy;
use std::process::ExitStatus;
use crate::exit_status::handle_exit_status;
/// Execute `command` in a Linux sandbox (Landlock + seccomp) the way Codex
/// would.
pub fn run_landlock(command: Vec<String>, sandbox_policy: SandboxPolicy) -> anyhow::Result<()> {
if command.is_empty() {
anyhow::bail!("command args are empty");
}
// Spawn a new thread and apply the sandbox policies there.
let handle = std::thread::spawn(move || -> anyhow::Result<ExitStatus> {
let cwd = std::env::current_dir()?;
apply_sandbox_policy_to_current_thread(&sandbox_policy, &cwd)?;
let mut child = spawn_child_sync(command, cwd, &sandbox_policy, StdioPolicy::Inherit)?;
let status = child.wait()?;
Ok(status)
});
let status = handle
.join()
.map_err(|e| anyhow::anyhow!("Failed to join thread: {e:?}"))??;
handle_exit_status(status);
}

View File

@@ -1,12 +1,10 @@
pub mod debug_sandbox;
mod exit_status;
#[cfg(unix)]
pub mod landlock;
pub mod login;
pub mod proto;
pub mod seatbelt;
use clap::Parser;
use codex_common::SandboxPermissionOption;
use codex_core::protocol::SandboxPolicy;
use codex_common::CliConfigOverrides;
#[derive(Debug, Parser)]
pub struct SeatbeltCommand {
@@ -14,8 +12,8 @@ pub struct SeatbeltCommand {
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
#[clap(skip)]
pub config_overrides: CliConfigOverrides,
/// Full command args to run under seatbelt.
#[arg(trailing_var_arg = true)]
@@ -28,21 +26,10 @@ pub struct LandlockCommand {
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
#[clap(skip)]
pub config_overrides: CliConfigOverrides,
/// Full command args to run under landlock.
#[arg(trailing_var_arg = true)]
pub command: Vec<String>,
}
pub fn create_sandbox_policy(full_auto: bool, sandbox: SandboxPermissionOption) -> SandboxPolicy {
if full_auto {
SandboxPolicy::new_full_auto_policy()
} else {
match sandbox.permissions.map(Into::into) {
Some(sandbox_policy) => sandbox_policy,
None => SandboxPolicy::new_read_only_policy(),
}
}
}

View File

@@ -1,22 +0,0 @@
#[cfg(not(target_os = "linux"))]
fn main() -> anyhow::Result<()> {
eprintln!("codex-linux-sandbox is not supported on this platform.");
std::process::exit(1);
}
#[cfg(target_os = "linux")]
fn main() -> anyhow::Result<()> {
use clap::Parser;
use codex_cli::LandlockCommand;
use codex_cli::create_sandbox_policy;
use codex_cli::landlock;
let LandlockCommand {
full_auto,
sandbox,
command,
} = LandlockCommand::parse();
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
landlock::run_landlock(command, sandbox_policy)?;
Ok(())
}

codex-rs/cli/src/login.rs Normal file
View File

@@ -0,0 +1,35 @@
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_login::login_with_chatgpt;
pub async fn run_login_with_chatgpt(cli_config_overrides: CliConfigOverrides) -> ! {
let cli_overrides = match cli_config_overrides.parse_overrides() {
Ok(v) => v,
Err(e) => {
eprintln!("Error parsing -c overrides: {e}");
std::process::exit(1);
}
};
let config_overrides = ConfigOverrides::default();
let config = match Config::load_with_cli_overrides(cli_overrides, config_overrides) {
Ok(config) => config,
Err(e) => {
eprintln!("Error loading configuration: {e}");
std::process::exit(1);
}
};
let capture_output = false;
match login_with_chatgpt(&config.codex_home, capture_output).await {
Ok(_) => {
eprintln!("Successfully logged in");
std::process::exit(0);
}
Err(e) => {
eprintln!("Error logging in: {e}");
std::process::exit(1);
}
}
}

View File

@@ -1,11 +1,12 @@
use clap::Parser;
use codex_cli::LandlockCommand;
use codex_cli::SeatbeltCommand;
use codex_cli::create_sandbox_policy;
use codex_cli::login::run_login_with_chatgpt;
use codex_cli::proto;
use codex_cli::seatbelt;
use codex_common::CliConfigOverrides;
use codex_exec::Cli as ExecCli;
use codex_tui::Cli as TuiCli;
use std::path::PathBuf;
use crate::proto::ProtoCli;
@@ -20,6 +21,9 @@ use crate::proto::ProtoCli;
subcommand_negates_reqs = true
)]
struct MultitoolCli {
#[clap(flatten)]
pub config_overrides: CliConfigOverrides,
#[clap(flatten)]
interactive: TuiCli,
@@ -33,6 +37,9 @@ enum Subcommand {
#[clap(visible_alias = "e")]
Exec(ExecCli),
/// Login with ChatGPT.
Login(LoginCommand),
/// Experimental: run Codex as an MCP server.
Mcp,
@@ -60,49 +67,72 @@ enum DebugCommand {
}
#[derive(Debug, Parser)]
struct ReplProto {}
struct LoginCommand {
#[clap(skip)]
config_overrides: CliConfigOverrides,
}
#[tokio::main]
async fn main() -> anyhow::Result<()> {
fn main() -> anyhow::Result<()> {
codex_linux_sandbox::run_with_sandbox(|codex_linux_sandbox_exe| async move {
cli_main(codex_linux_sandbox_exe).await?;
Ok(())
})
}
async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()> {
let cli = MultitoolCli::parse();
match cli.subcommand {
None => {
codex_tui::run_main(cli.interactive)?;
let mut tui_cli = cli.interactive;
prepend_config_flags(&mut tui_cli.config_overrides, cli.config_overrides);
codex_tui::run_main(tui_cli, codex_linux_sandbox_exe)?;
}
Some(Subcommand::Exec(exec_cli)) => {
codex_exec::run_main(exec_cli).await?;
Some(Subcommand::Exec(mut exec_cli)) => {
prepend_config_flags(&mut exec_cli.config_overrides, cli.config_overrides);
codex_exec::run_main(exec_cli, codex_linux_sandbox_exe).await?;
}
Some(Subcommand::Mcp) => {
codex_mcp_server::run_main().await?;
codex_mcp_server::run_main(codex_linux_sandbox_exe).await?;
}
Some(Subcommand::Proto(proto_cli)) => {
Some(Subcommand::Login(mut login_cli)) => {
prepend_config_flags(&mut login_cli.config_overrides, cli.config_overrides);
run_login_with_chatgpt(login_cli.config_overrides).await;
}
Some(Subcommand::Proto(mut proto_cli)) => {
prepend_config_flags(&mut proto_cli.config_overrides, cli.config_overrides);
proto::run_main(proto_cli).await?;
}
Some(Subcommand::Debug(debug_args)) => match debug_args.cmd {
DebugCommand::Seatbelt(SeatbeltCommand {
command,
sandbox,
full_auto,
}) => {
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
seatbelt::run_seatbelt(command, sandbox_policy).await?;
DebugCommand::Seatbelt(mut seatbelt_cli) => {
prepend_config_flags(&mut seatbelt_cli.config_overrides, cli.config_overrides);
codex_cli::debug_sandbox::run_command_under_seatbelt(
seatbelt_cli,
codex_linux_sandbox_exe,
)
.await?;
}
#[cfg(unix)]
DebugCommand::Landlock(LandlockCommand {
command,
sandbox,
full_auto,
}) => {
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
codex_cli::landlock::run_landlock(command, sandbox_policy)?;
}
#[cfg(not(unix))]
DebugCommand::Landlock(_) => {
anyhow::bail!("Landlock is only supported on Linux.");
DebugCommand::Landlock(mut landlock_cli) => {
prepend_config_flags(&mut landlock_cli.config_overrides, cli.config_overrides);
codex_cli::debug_sandbox::run_command_under_landlock(
landlock_cli,
codex_linux_sandbox_exe,
)
.await?;
}
},
}
Ok(())
}
/// Prepend root-level overrides so they have lower precedence than
/// CLI-specific ones specified after the subcommand (if any).
fn prepend_config_flags(
subcommand_config_overrides: &mut CliConfigOverrides,
cli_config_overrides: CliConfigOverrides,
) {
subcommand_config_overrides
.raw_overrides
.splice(0..0, cli_config_overrides.raw_overrides);
}

View File

@@ -2,6 +2,7 @@ use std::io::IsTerminal;
use std::sync::Arc;
use clap::Parser;
use codex_common::CliConfigOverrides;
use codex_core::Codex;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
@@ -13,9 +14,12 @@ use tracing::error;
use tracing::info;
#[derive(Debug, Parser)]
pub struct ProtoCli {}
pub struct ProtoCli {
#[clap(skip)]
pub config_overrides: CliConfigOverrides,
}
pub async fn run_main(_opts: ProtoCli) -> anyhow::Result<()> {
pub async fn run_main(opts: ProtoCli) -> anyhow::Result<()> {
if std::io::stdin().is_terminal() {
anyhow::bail!("Protocol mode expects stdin to be a pipe, not a terminal");
}
@@ -24,7 +28,12 @@ pub async fn run_main(_opts: ProtoCli) -> anyhow::Result<()> {
.with_writer(std::io::stderr)
.init();
let config = Config::load_with_overrides(ConfigOverrides::default())?;
let ProtoCli { config_overrides } = opts;
let overrides_vec = config_overrides
.parse_overrides()
.map_err(anyhow::Error::msg)?;
let config = Config::load_with_cli_overrides(overrides_vec, ConfigOverrides::default())?;
let ctrl_c = notify_on_sigint();
let (codex, _init_id) = Codex::spawn(config, ctrl_c.clone()).await?;
let codex = Arc::new(codex);

View File

@@ -1,16 +0,0 @@
use codex_core::exec::StdioPolicy;
use codex_core::exec::spawn_command_under_seatbelt;
use codex_core::protocol::SandboxPolicy;
use crate::exit_status::handle_exit_status;
pub async fn run_seatbelt(
command: Vec<String>,
sandbox_policy: SandboxPolicy,
) -> anyhow::Result<()> {
let cwd = std::env::current_dir()?;
let mut child =
spawn_command_under_seatbelt(command, &sandbox_policy, cwd, StdioPolicy::Inherit).await?;
let status = child.wait().await?;
handle_exit_status(status);
}

View File

@@ -7,11 +7,13 @@ edition = "2024"
workspace = true
[dependencies]
chrono = { version = "0.4.40", optional = true }
clap = { version = "4", features = ["derive", "wrap_help"], optional = true }
codex-core = { path = "../core" }
toml = { version = "0.8", optional = true }
serde = { version = "1", optional = true }
[features]
# Separate feature so that `clap` is not a mandatory dependency.
cli = ["clap"]
elapsed = ["chrono"]
cli = ["clap", "toml", "serde"]
elapsed = []
sandbox_summary = []

View File

@@ -1,27 +1,23 @@
//! Standard type to use with the `--approval-mode` CLI option.
//! Available when the `cli` feature is enabled for the crate.
use clap::ArgAction;
use clap::Parser;
use clap::ValueEnum;
use codex_core::config::parse_sandbox_permission_with_base_path;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::SandboxPermission;
#[derive(Clone, Copy, Debug, ValueEnum)]
#[value(rename_all = "kebab-case")]
pub enum ApprovalModeCliArg {
/// Only run "trusted" commands (e.g. ls, cat, sed) without asking for user
/// approval. Will escalate to the user if the model proposes a command that
/// is not in the "trusted" set.
Untrusted,
/// Run all commands without asking for user approval.
/// Only asks for approval if a command fails to execute, in which case it
/// will escalate to the user to ask for un-sandboxed execution.
OnFailure,
/// Only run "known safe" commands (e.g. ls, cat, sed) without
/// asking for user approval. Will escalate to the user if the model
/// proposes a command that is not allow-listed.
UnlessAllowListed,
/// Never ask for user approval
/// Execution failures are immediately returned to the model.
Never,
@@ -30,44 +26,9 @@ pub enum ApprovalModeCliArg {
impl From<ApprovalModeCliArg> for AskForApproval {
fn from(value: ApprovalModeCliArg) -> Self {
match value {
ApprovalModeCliArg::Untrusted => AskForApproval::UnlessTrusted,
ApprovalModeCliArg::OnFailure => AskForApproval::OnFailure,
ApprovalModeCliArg::UnlessAllowListed => AskForApproval::UnlessAllowListed,
ApprovalModeCliArg::Never => AskForApproval::Never,
}
}
}
#[derive(Parser, Debug)]
pub struct SandboxPermissionOption {
/// Specify this flag multiple times to specify the full set of permissions
/// to grant to Codex.
///
/// ```shell
/// codex -s disk-full-read-access \
/// -s disk-write-cwd \
/// -s disk-write-platform-user-temp-folder \
/// -s disk-write-platform-global-temp-folder
/// ```
///
/// Note disk-write-folder takes a value:
///
/// ```shell
/// -s disk-write-folder=$HOME/.pyenv/shims
/// ```
///
/// These permissions are quite broad and should be used with caution:
///
/// ```shell
/// -s disk-full-write-access
/// -s network-full-access
/// ```
#[arg(long = "sandbox-permission", short = 's', action = ArgAction::Append, value_parser = parse_sandbox_permission)]
pub permissions: Option<Vec<SandboxPermission>>,
}
/// Custom value-parser so we can keep the CLI surface small *and*
/// still handle the parameterised `disk-write-folder` case.
fn parse_sandbox_permission(raw: &str) -> std::io::Result<SandboxPermission> {
let base_path = std::env::current_dir()?;
parse_sandbox_permission_with_base_path(raw, base_path)
}

View File

@@ -0,0 +1,170 @@
//! Support for `-c key=value` overrides shared across Codex CLI tools.
//!
//! This module provides a [`CliConfigOverrides`] struct that can be embedded
//! into a `clap`-derived CLI struct using `#[clap(flatten)]`. Each occurrence
//! of `-c key=value` (or `--config key=value`) will be collected as a raw
//! string. Helper methods are provided to convert the raw strings into
//! key/value pairs as well as to apply them onto a mutable
//! `serde_json::Value` representing the configuration tree.
use clap::ArgAction;
use clap::Parser;
use serde::de::Error as SerdeError;
use toml::Value;
/// CLI option that captures arbitrary configuration overrides specified as
/// `-c key=value`. It intentionally keeps both halves **unparsed** so that the
/// calling code can decide how to interpret the right-hand side.
#[derive(Parser, Debug, Default, Clone)]
pub struct CliConfigOverrides {
/// Override a configuration value that would otherwise be loaded from
/// `~/.codex/config.toml`. Use a dotted path (`foo.bar.baz`) to override
/// nested values. The `value` portion is parsed as TOML. If it fails to
/// parse as TOML, the raw string is used as a literal.
///
/// Examples:
/// - `-c model="o3"`
/// - `-c 'sandbox_permissions=["disk-full-read-access"]'`
/// - `-c shell_environment_policy.inherit=all`
#[arg(
short = 'c',
long = "config",
value_name = "key=value",
action = ArgAction::Append,
global = true,
)]
pub raw_overrides: Vec<String>,
}
impl CliConfigOverrides {
/// Parse the raw strings captured from the CLI into a list of `(path,
/// value)` tuples where `value` is a `toml::Value`.
pub fn parse_overrides(&self) -> Result<Vec<(String, Value)>, String> {
self.raw_overrides
.iter()
.map(|s| {
// Only split on the *first* '=' so values are free to contain
// the character.
let mut parts = s.splitn(2, '=');
let key = match parts.next() {
Some(k) => k.trim(),
None => return Err("Override missing key".to_string()),
};
let value_str = parts
.next()
.ok_or_else(|| format!("Invalid override (missing '='): {s}"))?
.trim();
if key.is_empty() {
return Err(format!("Empty key in override: {s}"));
}
// Attempt to parse as TOML. If that fails, treat it as a raw
// string. This allows convenient usage such as
// `-c model=o3` without the quotes.
let value: Value = match parse_toml_value(value_str) {
Ok(v) => v,
Err(_) => Value::String(value_str.to_string()),
};
Ok((key.to_string(), value))
})
.collect()
}
/// Apply all parsed overrides onto `target`. Intermediate objects will be
/// created as necessary. Values located at the destination path will be
/// replaced.
pub fn apply_on_value(&self, target: &mut Value) -> Result<(), String> {
let overrides = self.parse_overrides()?;
for (path, value) in overrides {
apply_single_override(target, &path, value);
}
Ok(())
}
}
/// Apply a single override onto `root`, creating intermediate objects as
/// necessary.
fn apply_single_override(root: &mut Value, path: &str, value: Value) {
use toml::value::Table;
let parts: Vec<&str> = path.split('.').collect();
let mut current = root;
for (i, part) in parts.iter().enumerate() {
let is_last = i == parts.len() - 1;
if is_last {
match current {
Value::Table(tbl) => {
tbl.insert((*part).to_string(), value);
}
_ => {
let mut tbl = Table::new();
tbl.insert((*part).to_string(), value);
*current = Value::Table(tbl);
}
}
return;
}
// Traverse or create intermediate table.
match current {
Value::Table(tbl) => {
current = tbl
.entry((*part).to_string())
.or_insert_with(|| Value::Table(Table::new()));
}
_ => {
*current = Value::Table(Table::new());
if let Value::Table(tbl) = current {
current = tbl
.entry((*part).to_string())
.or_insert_with(|| Value::Table(Table::new()));
}
}
}
}
}
fn parse_toml_value(raw: &str) -> Result<Value, toml::de::Error> {
let wrapped = format!("_x_ = {raw}");
let table: toml::Table = toml::from_str(&wrapped)?;
table
.get("_x_")
.cloned()
.ok_or_else(|| SerdeError::custom("missing sentinel key"))
}
#[cfg(all(test, feature = "cli"))]
#[allow(clippy::expect_used, clippy::unwrap_used)]
mod tests {
use super::*;
#[test]
fn parses_basic_scalar() {
let v = parse_toml_value("42").expect("parse");
assert_eq!(v.as_integer(), Some(42));
}
#[test]
fn fails_on_unquoted_string() {
assert!(parse_toml_value("hello").is_err());
}
#[test]
fn parses_array() {
let v = parse_toml_value("[1, 2, 3]").expect("parse");
let arr = v.as_array().expect("array");
assert_eq!(arr.len(), 3);
}
#[test]
fn parses_inline_table() {
let v = parse_toml_value("{a = 1, b = 2}").expect("parse");
let tbl = v.as_table().expect("table");
assert_eq!(tbl.get("a").unwrap().as_integer(), Some(1));
assert_eq!(tbl.get("b").unwrap().as_integer(), Some(2));
}
}

View File

@@ -1,18 +1,19 @@
use chrono::Utc;
use std::time::Duration;
use std::time::Instant;
/// Returns a string representing the elapsed time since `start_time` like
/// "1m15s" or "1.50s".
pub fn format_elapsed(start_time: chrono::DateTime<Utc>) -> String {
let elapsed = Utc::now().signed_duration_since(start_time);
format_time_delta(elapsed)
pub fn format_elapsed(start_time: Instant) -> String {
format_duration(start_time.elapsed())
}
fn format_time_delta(elapsed: chrono::TimeDelta) -> String {
let millis = elapsed.num_milliseconds();
format_elapsed_millis(millis)
}
pub fn format_duration(duration: std::time::Duration) -> String {
/// Convert a [`std::time::Duration`] into a human-readable, compact string.
///
/// Formatting rules:
/// * < 1 s -> "{milli}ms"
/// * < 60 s -> "{sec:.2}s" (two decimal places)
/// * >= 60 s -> "{min}m{sec:02}s"
pub fn format_duration(duration: Duration) -> String {
let millis = duration.as_millis() as i64;
format_elapsed_millis(millis)
}
@@ -32,41 +33,40 @@ fn format_elapsed_millis(millis: i64) -> String {
#[cfg(test)]
mod tests {
use super::*;
use chrono::Duration;
#[test]
fn test_format_time_delta_subsecond() {
fn test_format_duration_subsecond() {
// Durations < 1s should be rendered in milliseconds with no decimals.
let dur = Duration::milliseconds(250);
assert_eq!(format_time_delta(dur), "250ms");
let dur = Duration::from_millis(250);
assert_eq!(format_duration(dur), "250ms");
// Exactly zero should still work.
let dur_zero = Duration::milliseconds(0);
assert_eq!(format_time_delta(dur_zero), "0ms");
let dur_zero = Duration::from_millis(0);
assert_eq!(format_duration(dur_zero), "0ms");
}
#[test]
fn test_format_time_delta_seconds() {
fn test_format_duration_seconds() {
// Durations between 1s (inclusive) and 60s (exclusive) should be
// printed with 2-decimal-place seconds.
let dur = Duration::milliseconds(1_500); // 1.5s
assert_eq!(format_time_delta(dur), "1.50s");
let dur = Duration::from_millis(1_500); // 1.5s
assert_eq!(format_duration(dur), "1.50s");
// 59.999s rounds to 60.00s
let dur2 = Duration::milliseconds(59_999);
assert_eq!(format_time_delta(dur2), "60.00s");
let dur2 = Duration::from_millis(59_999);
assert_eq!(format_duration(dur2), "60.00s");
}
#[test]
fn test_format_time_delta_minutes() {
fn test_format_duration_minutes() {
// Durations ≥ 1 minute should be printed as `XmYYs`.
let dur = Duration::milliseconds(75_000); // 1m15s
assert_eq!(format_time_delta(dur), "1m15s");
let dur = Duration::from_millis(75_000); // 1m15s
assert_eq!(format_duration(dur), "1m15s");
let dur_exact = Duration::milliseconds(60_000); // 1m0s
assert_eq!(format_time_delta(dur_exact), "1m00s");
let dur_exact = Duration::from_millis(60_000); // 1m0s
assert_eq!(format_duration(dur_exact), "1m00s");
let dur_long = Duration::milliseconds(3_601_000);
assert_eq!(format_time_delta(dur_long), "60m01s");
let dur_long = Duration::from_millis(3_601_000);
assert_eq!(format_duration(dur_long), "60m01s");
}
}

View File

@@ -6,5 +6,14 @@ pub mod elapsed;
#[cfg(feature = "cli")]
pub use approval_mode_cli_arg::ApprovalModeCliArg;
#[cfg(any(feature = "cli", test))]
mod config_override;
#[cfg(feature = "cli")]
pub use approval_mode_cli_arg::SandboxPermissionOption;
pub use config_override::CliConfigOverrides;
mod sandbox_summary;
#[cfg(feature = "sandbox_summary")]
pub use sandbox_summary::summarize_sandbox_policy;

View File

@@ -0,0 +1,28 @@
use codex_core::protocol::SandboxPolicy;
pub fn summarize_sandbox_policy(sandbox_policy: &SandboxPolicy) -> String {
match sandbox_policy {
SandboxPolicy::DangerFullAccess => "danger-full-access".to_string(),
SandboxPolicy::ReadOnly => "read-only".to_string(),
SandboxPolicy::WorkspaceWrite {
writable_roots,
network_access,
} => {
let mut summary = "workspace-write".to_string();
if !writable_roots.is_empty() {
summary.push_str(&format!(
" [{}]",
writable_roots
.iter()
.map(|p| p.to_string_lossy())
.collect::<Vec<_>>()
.join(", ")
));
}
if *network_access {
summary.push_str(" (network access enabled)");
}
summary
}
}
}

codex-rs/config.md Normal file
View File

@@ -0,0 +1,454 @@
# Config
Codex supports several mechanisms for setting config values:
- Config-specific command-line flags, such as `--model o3` (highest precedence).
- A generic `-c`/`--config` flag that takes a `key=value` pair, such as `--config model="o3"`.
- The key can contain dots to set a value deeper than the root, e.g. `--config model_providers.openai.wire_api="chat"`.
- Values can contain objects, such as `--config shell_environment_policy.include_only=["PATH", "HOME", "USER"]`.
- For consistency with `config.toml`, values are in TOML format rather than JSON format, so use `{a = 1, b = 2}` rather than `{"a": 1, "b": 2}`.
- If `value` cannot be parsed as a valid TOML value, it is treated as a string value. This means that both `-c model="o3"` and `-c model=o3` are equivalent (see the example after this list).
- The `$CODEX_HOME/config.toml` configuration file, where the `CODEX_HOME` environment variable defaults to `~/.codex`. (Note that `CODEX_HOME` is also where logs and other Codex-related information are stored.)
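For instance, several overrides can be combined in a single invocation. Here is a sketch (the keys shown are the ones documented below):
```shell
codex -c model="o3" \
  -c model_providers.openai.wire_api="chat" \
  -c 'shell_environment_policy.include_only=["PATH", "HOME"]'
```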
Both the `--config` flag and the `config.toml` file support the following options:
## model
The model that Codex should use.
```toml
model = "o3" # overrides the default of "codex-mini-latest"
```
## model_providers
This option lets you override and amend the default set of model providers bundled with Codex. It is a map whose keys are the identifiers you can use with `model_provider` to select the corresponding provider.
For example, if you wanted to add a provider that uses the OpenAI 4o model via the chat completions API, then you could add the following configuration:
```toml
# Recall that in TOML, root keys must be listed before tables.
model = "gpt-4o"
model_provider = "openai-chat-completions"
[model_providers.openai-chat-completions]
# Name of the provider that will be displayed in the Codex UI.
name = "OpenAI using Chat Completions"
# The path `/chat/completions` will be amended to this URL to make the POST
# request for the chat completions.
base_url = "https://api.openai.com/v1"
# If `env_key` is set, identifies an environment variable that must be set when
# using Codex with this provider. The value of the environment variable must be
# non-empty and will be used in the `Bearer TOKEN` HTTP header for the POST request.
env_key = "OPENAI_API_KEY"
# Valid values for wire_api are "chat" and "responses". Defaults to "chat" if omitted.
wire_api = "chat"
# If necessary, extra query params that need to be added to the URL.
# See the Azure example below.
query_params = {}
```
Note this makes it possible to use Codex CLI with non-OpenAI models, so long as they use a wire API that is compatible with the OpenAI chat completions API. For example, you could define the following provider to use Codex CLI with Ollama running locally:
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
```
Or a third-party provider (using a distinct environment variable for the API key):
```toml
[model_providers.mistral]
name = "Mistral"
base_url = "https://api.mistral.ai/v1"
env_key = "MISTRAL_API_KEY"
```
Note that Azure requires `api-version` to be passed as a query parameter, so be sure to specify it as part of `query_params` when defining the Azure provider:
```toml
[model_providers.azure]
name = "Azure"
# Make sure you set the appropriate subdomain for this URL.
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY" # Or "OPENAI_API_KEY", whichever you use.
query_params = { api-version = "2025-04-01-preview" }
```
## model_provider
Identifies which provider to use from the `model_providers` map. Defaults to `"openai"`.
Note that if you override `model_provider`, then you likely want to override
`model`, as well. For example, if you are running Ollama with Mistral locally,
then you would need to add the following to your config in addition to the new entry in the `model_providers` map:
```toml
model = "mistral"
model_provider = "ollama"
```
## approval_policy
Determines when the user should be prompted to approve whether Codex can execute a command:
```toml
# Codex has hardcoded logic that defines a set of "trusted" commands.
# Setting the approval_policy to `untrusted` means that Codex will prompt the
# user before running a command not in the "trusted" set.
#
# See https://github.com/openai/codex/issues/1260 for the plan to enable
# end-users to define their own trusted commands.
approval_policy = "untrusted"
```
```toml
# If the command fails when run in the sandbox, Codex asks for permission to
# retry the command outside the sandbox.
approval_policy = "on-failure"
```
```toml
# User is never prompted: if the command fails, Codex will automatically try
# something out. Note the `exec` subcommand always uses this mode.
approval_policy = "never"
```
## profiles
A _profile_ is a collection of configuration values that can be set together. Multiple profiles can be defined in `config.toml` and you can specify the one you
want to use at runtime via the `--profile` flag.
Here is an example of a `config.toml` that defines multiple profiles:
```toml
model = "o3"
approval_policy = "unless-allow-listed"
disable_response_storage = false
# Setting `profile` is equivalent to specifying `--profile o3` on the command
# line, though the `--profile` flag can still be used to override this value.
profile = "o3"
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
[profiles.o3]
model = "o3"
model_provider = "openai"
approval_policy = "never"
[profiles.gpt3]
model = "gpt-3.5-turbo"
model_provider = "openai-chat-completions"
[profiles.zdr]
model = "o3"
model_provider = "openai"
approval_policy = "on-failure"
disable_response_storage = true
```
Users can specify config values at multiple levels. Order of precedence is as follows:
1. custom command-line argument, e.g., `--model o3`
2. as part of a profile, where the `--profile` is specified via a CLI (or in the config file itself)
3. as an entry in `config.toml`, e.g., `model = "o3"`
4. the default value that comes with Codex CLI (i.e., Codex CLI defaults to `codex-mini-latest`)
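As an illustration, assuming the profiles defined above, the following invocation (a sketch) selects the `gpt3` profile but still runs with `o3`, because a config-specific command-line flag (level 1) takes precedence over a profile value (level 2):
```shell
codex --profile gpt3 --model o3
```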
## model_reasoning_effort
If the model name starts with `"o"` (as in `"o3"` or `"o4-mini"`) or `"codex"`, reasoning is enabled by default when using the Responses API. As explained in the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning), this can be set to:
- `"low"`
- `"medium"` (default)
- `"high"`
To disable reasoning, set `model_reasoning_effort` to `"none"` in your config:
```toml
model_reasoning_effort = "none" # disable reasoning
```
## model_reasoning_summary
If the model name starts with `"o"` (as in `"o3"` or `"o4-mini"`) or `"codex"`, reasoning is enabled by default when using the Responses API. As explained in the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries), this can be set to:
- `"auto"` (default)
- `"concise"`
- `"detailed"`
To disable reasoning summaries, set `model_reasoning_summary` to `"none"` in your config:
```toml
model_reasoning_summary = "none" # disable reasoning summaries
```
## sandbox
The `sandbox` configuration determines the _sandbox policy_ that Codex uses to execute untrusted commands. The `mode` determines the "base policy." Currently, only `workspace-write` supports additional configuration options, but this may change in the future.
The default policy is `read-only`, which means commands can read any file on disk, but attempts to write a file or access the network will be blocked.
```toml
[sandbox]
mode = "read-only"
```
A more relaxed policy is `workspace-write`. When specified, the current working directory for the Codex task will be writable (as well as `$TMPDIR` on macOS). Note that the CLI defaults to using the `cwd` where it was spawned, though this can be overridden with `--cwd/-C`.
```toml
[sandbox]
mode = "workspace-write"
# By default, only the cwd for the Codex session will be writable (and $TMPDIR on macOS),
# but you can specify additional writable folders in this array.
writable_roots = [
  "/tmp",
]
network_access = false # Like read-only, this also defaults to false and can be omitted.
```
To disable sandboxing altogether, specify `danger-full-access` like so:
```toml
[sandbox]
mode = "danger-full-access"
```
This is reasonable to use if Codex is running in an environment that provides its own sandboxing (such as a Docker container), making further sandboxing unnecessary.
Using this option may also be necessary if you run Codex in environments where its native sandboxing mechanisms are unsupported, such as older Linux kernels or Windows.
## mcp_servers
Defines the list of MCP servers that Codex can consult for tool use. Currently, only servers that are launched by executing a program that communicates over stdio are supported. For servers that use the SSE transport, consider an adapter like [mcp-proxy](https://github.com/sparfenyuk/mcp-proxy).
**Note:** Codex may cache the list of tools and resources from an MCP server so that Codex can include this information in context at startup without spawning all the servers. This is designed to save resources by loading MCP servers lazily.
This config option is comparable to how Claude and Cursor define `mcpServers` in their respective JSON config files, though because Codex uses TOML for its config language, the format is slightly different. For example, the following config in JSON:
```json
{
  "mcpServers": {
    "server-name": {
      "command": "npx",
      "args": ["-y", "mcp-server"],
      "env": {
        "API_KEY": "value"
      }
    }
  }
}
```
Should be represented as follows in `~/.codex/config.toml`:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }
```
## disable_response_storage
Currently, customers whose accounts are set to use Zero Data Retention (ZDR) must set `disable_response_storage` to `true` so that Codex uses an alternative to the Responses API that works with ZDR:
```toml
disable_response_storage = true
```
## shell_environment_policy
Codex spawns subprocesses (e.g. when executing a `local_shell` tool-call suggested by the assistant). By default it passes **only a minimal core subset** of your environment to those subprocesses to avoid leaking credentials. You can tune this behavior via the **`shell_environment_policy`** block in
`config.toml`:
```toml
[shell_environment_policy]
# inherit can be "core" (default), "all", or "none"
inherit = "core"
# set to true to *skip* the default filter for `"*KEY*"`, `"*SECRET*"`, and `"*TOKEN*"`
ignore_default_excludes = false
# exclude patterns (case-insensitive globs)
exclude = ["AWS_*", "AZURE_*"]
# force-set / override values
set = { CI = "1" }
# if provided, *only* vars matching these patterns are kept
include_only = ["PATH", "HOME"]
```
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions always win over inherited values. |
| `include_only`            | array&lt;string&gt;        | `[]`    | If non-empty, a whitelist of patterns; only variables that match _at least one_ pattern survive the final step. (Generally used with `inherit = "all"`.)  |
The patterns are **glob style**, not full regular expressions: `*` matches any
number of characters, `?` matches exactly one, and character classes like
`[A-Z]`/`[^0-9]` are supported. Matching is always **case-insensitive**. This
syntax is documented in code as `EnvironmentVariablePattern` (see
`core/src/config_types.rs`).
If you just need a clean slate with a few custom entries you can write:
```toml
[shell_environment_policy]
inherit = "none"
set = { PATH = "/usr/bin", MY_FLAG = "1" }
```
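Conversely, here is a sketch that inherits the full parent environment but keeps only an explicit allow-list (the patterns shown are illustrative):
```toml
[shell_environment_policy]
inherit = "all"
include_only = ["PATH", "HOME", "USER", "SSH_*"]
```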
Currently, `CODEX_SANDBOX_NETWORK_DISABLED=1` is also added to the environment when network access is disabled. This is not configurable.
## notify
Specify a program that will be executed to notify you about events generated by Codex. The program will receive the notification as a single argument containing a JSON string, e.g.:
```json
{
  "type": "agent-turn-complete",
  "turn-id": "12345",
  "input-messages": ["Rename `foo` to `bar` and update the callsites."],
  "last-assistant-message": "Rename complete and verified `cargo build` succeeds."
}
```
The `"type"` property will always be set. Currently, `"agent-turn-complete"` is the only notification type that is supported.
As an example, here is a Python script that parses the JSON and decides whether to show a desktop push notification using [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS:
```python
#!/usr/bin/env python3
import json
import subprocess
import sys


def main() -> int:
    if len(sys.argv) != 2:
        print("Usage: notify.py <NOTIFICATION_JSON>")
        return 1

    try:
        notification = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        return 1

    match notification_type := notification.get("type"):
        case "agent-turn-complete":
            assistant_message = notification.get("last-assistant-message")
            if assistant_message:
                title = f"Codex: {assistant_message}"
            else:
                title = "Codex: Turn Complete!"

            input_messages = notification.get("input-messages", [])
            message = " ".join(input_messages)
            title += message
        case _:
            print(f"not sending a push notification for: {notification_type}")
            return 0

    subprocess.check_output(
        [
            "terminal-notifier",
            "-title",
            title,
            "-message",
            message,
            "-group",
            "codex",
            "-ignoreDnD",
            "-activate",
            "com.googlecode.iterm2",
        ]
    )
    return 0


if __name__ == "__main__":
    sys.exit(main())
```
To have Codex use this script for notifications, you would configure it via `notify` in `~/.codex/config.toml` using the appropriate path to `notify.py` on your computer:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
## history
By default, Codex CLI records messages sent to the model in `$CODEX_HOME/history.jsonl`. Note that on UNIX, the file permissions are set to `0o600`, so it should only be readable and writable by the owner.
To disable this behavior, configure `[history]` as follows:
```toml
[history]
persistence = "none" # "save-all" is the default value
```
## file_opener
Identifies the editor/URI scheme to use for hyperlinking citations in model output. If set, citations to files in the model output will be hyperlinked using the specified URI scheme so they can be ctrl/cmd-clicked from the terminal to open them.
For example, if the model output includes a reference such as `【F:/home/user/project/main.py†L42-L50】`, then this would be rewritten to link to the URI `vscode://file/home/user/project/main.py:42`.
Note this is **not** a general editor setting (like `$EDITOR`), as it only accepts a fixed set of values:
- `"vscode"` (default)
- `"vscode-insiders"`
- `"windsurf"`
- `"cursor"`
- `"none"` to explicitly disable this feature
Currently, `"vscode"` is the default, though Codex does not verify VS Code is installed. As such, `file_opener` may default to `"none"` or something else in the future.
## hide_agent_reasoning
Codex intermittently emits "reasoning" events that show the model's internal "thinking" before it produces a final answer. Some users may find these events distracting, especially in CI logs or minimal terminal output.
Setting `hide_agent_reasoning` to `true` suppresses these events in **both** the TUI as well as the headless `exec` sub-command:
```toml
hide_agent_reasoning = true # defaults to false
```
## model_context_window
The size of the context window for the model, in tokens.
In general, Codex knows the context window for the most common OpenAI models, but if you are using a new model with an old version of the Codex CLI, then you can use `model_context_window` to tell Codex what value to use to determine how much context is left during a conversation.
## model_max_output_tokens
This is analogous to `model_context_window`, but for the maximum number of output tokens for the model.
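Both values fall back to Codex's built-in table of known OpenAI models when unset, as the `config.rs` diff below shows (`cfg.model_context_window.or_else(...)`). A minimal sketch of that fallback, with a hypothetical lookup table (the `o3` numbers match the test expectations in the diff):
```rust
struct ModelInfo {
    context_window: u64,
    max_output_tokens: u64,
}

/// Hypothetical stand-in for `get_model_info` in `openai_model_info`.
fn lookup_model_info(model: &str) -> Option<ModelInfo> {
    match model {
        "o3" => Some(ModelInfo { context_window: 200_000, max_output_tokens: 100_000 }),
        _ => None,
    }
}

/// An explicit config value wins; otherwise fall back to the built-in table.
fn effective_context_window(configured: Option<u64>, model: &str) -> Option<u64> {
    configured.or_else(|| lookup_model_info(model).map(|info| info.context_window))
}

fn effective_max_output_tokens(configured: Option<u64>, model: &str) -> Option<u64> {
    configured.or_else(|| lookup_model_info(model).map(|info| info.max_output_tokens))
}
```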
## project_doc_max_bytes
Maximum number of bytes to read from an `AGENTS.md` file to include in the instructions sent with the first turn of a session. Defaults to 32 KiB.
## tui
Options that are specific to the TUI.
```toml
[tui]
# This will make it so that Codex does not try to process mouse events, which
# means your terminal's native drag-to-select text selection and copy/paste
# should work. The tradeoff is that Codex will not receive any mouse events, so
# it will not be possible to use the mouse to scroll conversation history.
#
# Note that most terminals support holding down a modifier key when using the
# mouse to support text selection. For example, even if Codex mouse capture is
# enabled (i.e., this is set to `false`), you can still hold down alt while
# dragging the mouse to select text.
disable_mouse_capture = true # defaults to `false`
```


@@ -16,10 +16,12 @@ async-channel = "2.3.1"
base64 = "0.21"
bytes = "1.10.1"
codex-apply-patch = { path = "../apply-patch" }
codex-login = { path = "../login" }
codex-mcp-client = { path = "../mcp-client" }
dirs = "6"
env-flags = "0.1.1"
eventsource-stream = "0.2.3"
fs2 = "0.4.3"
fs-err = "3.1.0"
futures = "0.3"
mcp-types = { path = "../mcp-types" }
@@ -30,6 +32,8 @@ rand = "0.9"
reqwest = { version = "0.12", features = ["json", "stream"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
strum = "0.27.1"
strum_macros = "0.27.1"
thiserror = "2.0.12"
time = { version = "0.3", features = ["formatting", "local-offset", "macros"] }
tokio = { version = "1", features = [
@@ -45,9 +49,9 @@ tracing = { version = "0.1.41", features = ["log"] }
tree-sitter = "0.25.3"
tree-sitter-bash = "0.23.3"
uuid = { version = "1", features = ["serde", "v4"] }
wildmatch = "2.4.0"
[target.'cfg(target_os = "linux")'.dependencies]
libc = "0.2.172"
landlock = "0.4.1"
seccompiler = "0.5.0"
@@ -55,8 +59,13 @@ seccompiler = "0.5.0"
[target.x86_64-unknown-linux-musl.dependencies]
openssl-sys = { version = "*", features = ["vendored"] }
# Build OpenSSL from source for musl builds.
[target.aarch64-unknown-linux-musl.dependencies]
openssl-sys = { version = "*", features = ["vendored"] }
[dev-dependencies]
assert_cmd = "2"
maplit = "1.0.2"
predicates = "3"
pretty_assertions = "1.4.1"
tempfile = "3"


@@ -0,0 +1,45 @@
# Codex configuration template
# See https://github.com/openai/codex/blob/main/codex-rs/config.md for details.
# All values below represent defaults. Uncomment to override them.
# model = "codex-mini-latest"
# model_provider = "openai"
# approval_policy = "unless-allow-listed"
# disable_response_storage = false
# project_doc_max_bytes = 32768
# file_opener = "vscode"
# hide_agent_reasoning = false
# model_reasoning_effort = "medium"
# model_reasoning_summary = "auto"
[shell_environment_policy]
# inherit = "core"
# ignore_default_excludes = false
# exclude = []
# set = {}
# include_only = []
[sandbox]
# mode = "read-only"
# writable_roots = []
# network_access = false
[history]
# persistence = "save-all"
[tui]
# disable_mouse_capture = false
# Example provider override
#[model_providers.openai]
# name = "OpenAI"
# base_url = "https://api.openai.com/v1"
# env_key = "OPENAI_API_KEY"
# wire_api = "chat"
# Example profile
#[profiles.example]
# model = "o3"
# model_provider = "openai"
# approval_policy = "never"


@@ -25,10 +25,10 @@ use crate::flags::OPENAI_REQUEST_MAX_RETRIES;
use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::models::ContentItem;
use crate::models::ResponseItem;
use crate::openai_tools::create_tools_json_for_chat_completions_api;
use crate::util::backoff;
/// Implementation for the classic Chat Completions API. This is intentionally
/// minimal: we only stream back plain assistant text.
/// Implementation for the classic Chat Completions API.
pub(crate) async fn stream_chat_completions(
prompt: &Prompt,
model: &str,
@@ -38,35 +38,88 @@ pub(crate) async fn stream_chat_completions(
// Build messages array
let mut messages = Vec::<serde_json::Value>::new();
let full_instructions = prompt.get_full_instructions();
let full_instructions = prompt.get_full_instructions(model);
messages.push(json!({"role": "system", "content": full_instructions}));
for item in &prompt.input {
if let ResponseItem::Message { role, content } = item {
let mut text = String::new();
for c in content {
match c {
ContentItem::InputText { text: t } | ContentItem::OutputText { text: t } => {
text.push_str(t);
match item {
ResponseItem::Message { role, content } => {
let mut text = String::new();
for c in content {
match c {
ContentItem::InputText { text: t }
| ContentItem::OutputText { text: t } => {
text.push_str(t);
}
_ => {}
}
_ => {}
}
messages.push(json!({"role": role, "content": text}));
}
ResponseItem::FunctionCall {
name,
arguments,
call_id,
} => {
messages.push(json!({
"role": "assistant",
"content": null,
"tool_calls": [{
"id": call_id,
"type": "function",
"function": {
"name": name,
"arguments": arguments,
}
}]
}));
}
ResponseItem::LocalShellCall {
id,
call_id: _,
status,
action,
} => {
// Confirm with API team.
messages.push(json!({
"role": "assistant",
"content": null,
"tool_calls": [{
"id": id.clone().unwrap_or_else(|| "".to_string()),
"type": "local_shell_call",
"status": status,
"action": action,
}]
}));
}
ResponseItem::FunctionCallOutput { call_id, output } => {
messages.push(json!({
"role": "tool",
"tool_call_id": call_id,
"content": output.content,
}));
}
ResponseItem::Reasoning { .. } | ResponseItem::Other => {
// Omit these items from the conversation history.
continue;
}
messages.push(json!({"role": role, "content": text}));
}
}
let tools_json = create_tools_json_for_chat_completions_api(prompt, model)?;
let payload = json!({
"model": model,
"messages": messages,
"stream": true
"stream": true,
"tools": tools_json,
});
let base_url = provider.base_url.trim_end_matches('/');
let url = format!("{}/chat/completions", base_url);
let url = provider.get_full_url();
debug!(url, "POST (chat)");
trace!("request payload: {}", payload);
debug!(
"POST to {url}: {}",
serde_json::to_string_pretty(&payload).unwrap_or_default()
);
let api_key = provider.api_key()?;
let mut attempt = 0;
@@ -134,6 +187,21 @@ where
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// State to accumulate a function call across streaming chunks.
// OpenAI may split the `arguments` string over multiple `delta` events
// until the chunk whose `finish_reason` is `tool_calls` is emitted. We
// keep collecting the pieces here and forward a single
// `ResponseItem::FunctionCall` once the call is complete.
#[derive(Default)]
struct FunctionCallState {
name: Option<String>,
arguments: String,
call_id: Option<String>,
active: bool,
}
let mut fn_call_state = FunctionCallState::default();
loop {
let sse = match timeout(idle_timeout, stream.next()).await {
Ok(Some(Ok(ev))) => ev,
@@ -146,6 +214,7 @@ where
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
}))
.await;
return;
@@ -163,6 +232,7 @@ where
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
}))
.await;
return;
@@ -173,23 +243,90 @@ where
Ok(v) => v,
Err(_) => continue,
};
trace!("chat_completions received SSE chunk: {chunk:?}");
let content_opt = chunk
.get("choices")
.and_then(|c| c.get(0))
.and_then(|c| c.get("delta"))
.and_then(|d| d.get("content"))
.and_then(|c| c.as_str());
let choice_opt = chunk.get("choices").and_then(|c| c.get(0));
if let Some(content) = content_opt {
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: content.to_string(),
}],
};
if let Some(choice) = choice_opt {
// Handle assistant content tokens.
if let Some(content) = choice
.get("delta")
.and_then(|d| d.get("content"))
.and_then(|c| c.as_str())
{
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: content.to_string(),
}],
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
// Handle streaming function / tool calls.
if let Some(tool_calls) = choice
.get("delta")
.and_then(|d| d.get("tool_calls"))
.and_then(|tc| tc.as_array())
{
if let Some(tool_call) = tool_calls.first() {
// Mark that we have an active function call in progress.
fn_call_state.active = true;
// Extract call_id if present.
if let Some(id) = tool_call.get("id").and_then(|v| v.as_str()) {
fn_call_state.call_id.get_or_insert_with(|| id.to_string());
}
// Extract function details if present.
if let Some(function) = tool_call.get("function") {
if let Some(name) = function.get("name").and_then(|n| n.as_str()) {
fn_call_state.name.get_or_insert_with(|| name.to_string());
}
if let Some(args_fragment) =
function.get("arguments").and_then(|a| a.as_str())
{
fn_call_state.arguments.push_str(args_fragment);
}
}
}
}
// Emit end-of-turn when finish_reason signals completion.
if let Some(finish_reason) = choice.get("finish_reason").and_then(|v| v.as_str()) {
match finish_reason {
"tool_calls" if fn_call_state.active => {
// Build the FunctionCall response item.
let item = ResponseItem::FunctionCall {
name: fn_call_state.name.clone().unwrap_or_else(|| "".to_string()),
arguments: fn_call_state.arguments.clone(),
call_id: fn_call_state.call_id.clone().unwrap_or_else(String::new),
};
// Emit it downstream.
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
"stop" => {
// Regular turn without tool-call.
}
_ => {}
}
// Emit Completed regardless of reason so the agent can advance.
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
}))
.await;
// Prepare for potential next turn (should not happen in same stream).
// fn_call_state = FunctionCallState::default();
return; // End processing for this SSE stream.
}
}
}
}
@@ -236,9 +373,14 @@ where
Poll::Ready(None) => return Poll::Ready(None),
Poll::Ready(Some(Err(e))) => return Poll::Ready(Some(Err(e))),
Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(item)))) => {
// Accumulate *assistant* text but do not emit yet.
if let crate::models::ResponseItem::Message { role, content } = &item {
if role == "assistant" {
// If this is an incremental assistant message chunk, accumulate but
// do NOT emit yet. Forward any other item (e.g. FunctionCall) right
// away so downstream consumers see it.
let is_assistant_delta = matches!(&item, crate::models::ResponseItem::Message { role, .. } if role == "assistant");
if is_assistant_delta {
if let crate::models::ResponseItem::Message { content, .. } = &item {
if let Some(text) = content.iter().find_map(|c| match c {
crate::models::ContentItem::OutputText { text } => Some(text),
_ => None,
@@ -246,12 +388,18 @@ where
this.cumulative.push_str(text);
}
}
// Swallow partial assistant chunk; keep polling.
continue;
}
// Swallow partial event; keep polling.
continue;
// Not an assistant message; forward immediately.
return Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(item))));
}
Poll::Ready(Some(Ok(ResponseEvent::Completed { response_id }))) => {
Poll::Ready(Some(Ok(ResponseEvent::Completed {
response_id,
token_usage,
}))) => {
if !this.cumulative.is_empty() {
let aggregated_item = crate::models::ResponseItem::Message {
role: "assistant".to_string(),
@@ -261,7 +409,10 @@ where
};
// Buffer Completed so it is returned *after* the aggregated message.
this.pending_completed = Some(ResponseEvent::Completed { response_id });
this.pending_completed = Some(ResponseEvent::Completed {
response_id,
token_usage,
});
return Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(
aggregated_item,
@@ -269,8 +420,16 @@ where
}
// Nothing aggregated; forward Completed directly.
return Poll::Ready(Some(Ok(ResponseEvent::Completed { response_id })));
}
// No other `Ok` variants exist at the moment, continue polling.
return Poll::Ready(Some(Ok(ResponseEvent::Completed {
response_id,
token_usage,
})));
}
Poll::Ready(Some(Ok(ResponseEvent::Created))) => {
// These events are exclusive to the Responses API and
// will never appear in a Chat Completions stream.
continue;
}
}
}
}
@@ -284,7 +443,7 @@ pub(crate) trait AggregateStreamExt: Stream<Item = Result<ResponseEvent>> + Size
///
/// ```ignore
/// OutputItemDone(<full message>)
/// Completed { .. }
/// Completed
/// ```
///
/// No other `OutputItemDone` events will be seen by the caller.


@@ -1,7 +1,5 @@
use std::collections::BTreeMap;
use std::io::BufRead;
use std::path::Path;
use std::sync::LazyLock;
use std::time::Duration;
use bytes::Bytes;
@@ -11,7 +9,6 @@ use reqwest::StatusCode;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value;
use serde_json::json;
use tokio::sync::mpsc;
use tokio::time::timeout;
use tokio_util::io::ReaderStream;
@@ -21,12 +18,13 @@ use tracing::warn;
use crate::chat_completions::AggregateStreamExt;
use crate::chat_completions::stream_chat_completions;
use crate::client_common::Payload;
use crate::client_common::Prompt;
use crate::client_common::Reasoning;
use crate::client_common::ResponseEvent;
use crate::client_common::ResponseStream;
use crate::client_common::Summary;
use crate::client_common::ResponsesApiRequest;
use crate::client_common::create_reasoning_param_for_request;
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::CodexErr;
use crate::error::EnvVarError;
use crate::error::Result;
@@ -36,74 +34,32 @@ use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::WireApi;
use crate::models::ResponseItem;
use crate::openai_tools::create_tools_json_for_responses_api;
use crate::protocol::TokenUsage;
use crate::util::backoff;
/// When serialized as JSON, this produces a valid "Tool" in the OpenAI
/// Responses API.
#[derive(Debug, Serialize)]
struct ResponsesApiTool {
name: &'static str,
r#type: &'static str, // "function"
description: &'static str,
strict: bool,
parameters: JsonSchema,
}
/// Generic JSONSchema subset needed for our tool definitions
#[derive(Debug, Clone, Serialize)]
#[serde(tag = "type", rename_all = "lowercase")]
enum JsonSchema {
String,
Number,
Array {
items: Box<JsonSchema>,
},
Object {
properties: BTreeMap<String, JsonSchema>,
required: &'static [&'static str],
#[serde(rename = "additionalProperties")]
additional_properties: bool,
},
}
/// Tool usage specification
static DEFAULT_TOOLS: LazyLock<Vec<ResponsesApiTool>> = LazyLock::new(|| {
let mut properties = BTreeMap::new();
properties.insert(
"command".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String),
},
);
properties.insert("workdir".to_string(), JsonSchema::String);
properties.insert("timeout".to_string(), JsonSchema::Number);
vec![ResponsesApiTool {
name: "shell",
r#type: "function",
description: "Runs a shell command, and returns its output.",
strict: false,
parameters: JsonSchema::Object {
properties,
required: &["command"],
additional_properties: false,
},
}]
});
#[derive(Clone)]
pub struct ModelClient {
model: String,
client: reqwest::Client,
provider: ModelProviderInfo,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
}
impl ModelClient {
pub fn new(model: impl ToString, provider: ModelProviderInfo) -> Self {
pub fn new(
model: impl ToString,
provider: ModelProviderInfo,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
) -> Self {
Self {
model: model.to_string(),
client: reqwest::Client::new(),
provider,
effort,
summary,
}
}
@@ -151,43 +107,24 @@ impl ModelClient {
return stream_from_fixture(path).await;
}
// Assemble tool list: built-in tools + any extra tools from the prompt.
let mut tools_json = Vec::with_capacity(DEFAULT_TOOLS.len() + prompt.extra_tools.len());
for t in DEFAULT_TOOLS.iter() {
tools_json.push(serde_json::to_value(t)?);
}
tools_json.extend(
prompt
.extra_tools
.clone()
.into_iter()
.map(|(name, tool)| mcp_tool_to_openai_tool(name, tool)),
);
debug!("tools_json: {}", serde_json::to_string_pretty(&tools_json)?);
let full_instructions = prompt.get_full_instructions();
let payload = Payload {
let full_instructions = prompt.get_full_instructions(&self.model);
let tools_json = create_tools_json_for_responses_api(prompt, &self.model)?;
let reasoning = create_reasoning_param_for_request(&self.model, self.effort, self.summary);
let payload = ResponsesApiRequest {
model: &self.model,
instructions: &full_instructions,
input: &prompt.input,
tools: &tools_json,
tool_choice: "auto",
parallel_tool_calls: false,
reasoning: Some(Reasoning {
effort: "high",
summary: Some(Summary::Auto),
}),
reasoning,
previous_response_id: prompt.prev_id.clone(),
store: prompt.store,
stream: true,
};
let base_url = self.provider.base_url.clone();
let base_url = base_url.trim_end_matches('/');
let url = format!("{}/responses", base_url);
debug!(url, "POST");
trace!("request payload: {}", serde_json::to_string(&payload)?);
let url = self.provider.get_full_url();
trace!("POST to {url}: {}", serde_json::to_string(&payload)?);
let mut attempt = 0;
loop {
@@ -229,7 +166,7 @@ impl ModelClient {
// negligible.
if !(status == StatusCode::TOO_MANY_REQUESTS || status.is_server_error()) {
// Surface the error body to callers. Use `unwrap_or_default` per Clippy.
let body = (res.text().await).unwrap_or_default();
let body = res.text().await.unwrap_or_default();
return Err(CodexErr::UnexpectedStatus(status, body));
}
@@ -261,20 +198,6 @@ impl ModelClient {
}
}
fn mcp_tool_to_openai_tool(
fully_qualified_name: String,
tool: mcp_types::Tool,
) -> serde_json::Value {
// TODO(mbolin): Change the contract of this function to return
// ResponsesApiTool.
json!({
"name": fully_qualified_name,
"description": tool.description,
"parameters": tool.input_schema,
"type": "function",
})
}
#[derive(Debug, Deserialize, Serialize)]
struct SseEvent {
#[serde(rename = "type")]
@@ -283,9 +206,44 @@ struct SseEvent {
item: Option<Value>,
}
#[derive(Debug, Deserialize)]
struct ResponseCreated {}
#[derive(Debug, Deserialize)]
struct ResponseCompleted {
id: String,
usage: Option<ResponseCompletedUsage>,
}
#[derive(Debug, Deserialize)]
struct ResponseCompletedUsage {
input_tokens: u64,
input_tokens_details: Option<ResponseCompletedInputTokensDetails>,
output_tokens: u64,
output_tokens_details: Option<ResponseCompletedOutputTokensDetails>,
total_tokens: u64,
}
impl From<ResponseCompletedUsage> for TokenUsage {
fn from(val: ResponseCompletedUsage) -> Self {
TokenUsage {
input_tokens: val.input_tokens,
cached_input_tokens: val.input_tokens_details.map(|d| d.cached_tokens),
output_tokens: val.output_tokens,
reasoning_output_tokens: val.output_tokens_details.map(|d| d.reasoning_tokens),
total_tokens: val.total_tokens,
}
}
}
#[derive(Debug, Deserialize)]
struct ResponseCompletedInputTokensDetails {
cached_tokens: u64,
}
#[derive(Debug, Deserialize)]
struct ResponseCompletedOutputTokensDetails {
reasoning_tokens: u64,
}
async fn process_sse<S>(stream: S, tx_event: mpsc::Sender<Result<ResponseEvent>>)
@@ -297,7 +255,7 @@ where
// If the stream stays completely silent for an extended period treat it as disconnected.
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// The response id returned from the "complete" message.
let mut response_id = None;
let mut response_completed: Option<ResponseCompleted> = None;
loop {
let sse = match timeout(idle_timeout, stream.next()).await {
@@ -309,9 +267,15 @@ where
return;
}
Ok(None) => {
match response_id {
Some(response_id) => {
let event = ResponseEvent::Completed { response_id };
match response_completed {
Some(ResponseCompleted {
id: response_id,
usage,
}) => {
let event = ResponseEvent::Completed {
response_id,
token_usage: usage.map(Into::into),
};
let _ = tx_event.send(Ok(event)).await;
}
None => {
@@ -372,12 +336,17 @@ where
return;
}
}
"response.created" => {
if event.response.is_some() {
let _ = tx_event.send(Ok(ResponseEvent::Created {})).await;
}
}
// Final response completed includes array of output items & id
"response.completed" => {
if let Some(resp_val) = event.response {
match serde_json::from_value::<ResponseCompleted>(resp_val) {
Ok(r) => {
response_id = Some(r.id);
response_completed = Some(r);
}
Err(e) => {
debug!("failed to parse ResponseCompleted: {e}");
@@ -386,6 +355,18 @@ where
};
};
}
"response.content_part.done"
| "response.function_call_arguments.delta"
| "response.in_progress"
| "response.output_item.added"
| "response.output_text.delta"
| "response.output_text.done"
| "response.reasoning_summary_part.added"
| "response.reasoning_summary_text.delta"
| "response.reasoning_summary_text.done" => {
// Currently, we ignore these events, but we handle them
// separately to skip the logging message in the `other` case.
}
other => debug!(other, "sse event"),
}
}


@@ -1,5 +1,9 @@
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::Result;
use crate::models::ResponseItem;
use crate::protocol::TokenUsage;
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use futures::Stream;
use serde::Serialize;
use std::borrow::Cow;
@@ -22,7 +26,7 @@ pub struct Prompt {
pub prev_id: Option<String>,
/// Optional instructions from the user to amend to the built-in agent
/// instructions.
pub instructions: Option<String>,
pub user_instructions: Option<String>,
/// Whether to store response on server side (disable_response_storage = !store).
pub store: bool,
@@ -33,44 +37,83 @@ pub struct Prompt {
}
impl Prompt {
pub(crate) fn get_full_instructions(&self) -> Cow<str> {
match &self.instructions {
Some(instructions) => {
let instructions = format!("{BASE_INSTRUCTIONS}\n{instructions}");
Cow::Owned(instructions)
}
None => Cow::Borrowed(BASE_INSTRUCTIONS),
pub(crate) fn get_full_instructions(&self, model: &str) -> Cow<str> {
let mut sections: Vec<&str> = vec![BASE_INSTRUCTIONS];
if let Some(ref user) = self.user_instructions {
sections.push(user);
}
if model.starts_with("gpt-4.1") {
sections.push(APPLY_PATCH_TOOL_INSTRUCTIONS);
}
Cow::Owned(sections.join("\n"))
}
}
#[derive(Debug)]
pub enum ResponseEvent {
Created,
OutputItemDone(ResponseItem),
Completed { response_id: String },
Completed {
response_id: String,
token_usage: Option<TokenUsage>,
},
}
#[derive(Debug, Serialize)]
pub(crate) struct Reasoning {
pub(crate) effort: &'static str,
pub(crate) effort: OpenAiReasoningEffort,
#[serde(skip_serializing_if = "Option::is_none")]
pub(crate) summary: Option<Summary>,
pub(crate) summary: Option<OpenAiReasoningSummary>,
}
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning
#[derive(Debug, Serialize, Default, Clone, Copy)]
#[serde(rename_all = "lowercase")]
pub(crate) enum OpenAiReasoningEffort {
Low,
#[default]
Medium,
High,
}
impl From<ReasoningEffortConfig> for Option<OpenAiReasoningEffort> {
fn from(effort: ReasoningEffortConfig) -> Self {
match effort {
ReasoningEffortConfig::Low => Some(OpenAiReasoningEffort::Low),
ReasoningEffortConfig::Medium => Some(OpenAiReasoningEffort::Medium),
ReasoningEffortConfig::High => Some(OpenAiReasoningEffort::High),
ReasoningEffortConfig::None => None,
}
}
}
/// A summary of the reasoning performed by the model. This can be useful for
/// debugging and understanding the model's reasoning process.
#[derive(Debug, Serialize)]
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries
#[derive(Debug, Serialize, Default, Clone, Copy)]
#[serde(rename_all = "lowercase")]
pub(crate) enum Summary {
pub(crate) enum OpenAiReasoningSummary {
#[default]
Auto,
#[allow(dead_code)] // Will go away once this is configurable.
Concise,
#[allow(dead_code)] // Will go away once this is configurable.
Detailed,
}
impl From<ReasoningSummaryConfig> for Option<OpenAiReasoningSummary> {
fn from(summary: ReasoningSummaryConfig) -> Self {
match summary {
ReasoningSummaryConfig::Auto => Some(OpenAiReasoningSummary::Auto),
ReasoningSummaryConfig::Concise => Some(OpenAiReasoningSummary::Concise),
ReasoningSummaryConfig::Detailed => Some(OpenAiReasoningSummary::Detailed),
ReasoningSummaryConfig::None => None,
}
}
}
/// Request object that is serialized as JSON and POST'ed when using the
/// Responses API.
#[derive(Debug, Serialize)]
pub(crate) struct Payload<'a> {
pub(crate) struct ResponsesApiRequest<'a> {
pub(crate) model: &'a str,
pub(crate) instructions: &'a str,
// TODO(mbolin): ResponseItem::Other should not be serialized. Currently,
@@ -88,6 +131,40 @@ pub(crate) struct Payload<'a> {
pub(crate) stream: bool,
}
pub(crate) fn create_reasoning_param_for_request(
model: &str,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
) -> Option<Reasoning> {
let effort: Option<OpenAiReasoningEffort> = effort.into();
let effort = effort?;
if model_supports_reasoning_summaries(model) {
Some(Reasoning {
effort,
summary: summary.into(),
})
} else {
None
}
}
pub fn model_supports_reasoning_summaries(model: &str) -> bool {
// Currently, we hardcode this rule to decide whether to enable reasoning.
// We expect reasoning to apply only to OpenAI models, but we do not want
// users to have to mess with their config to disable reasoning for models
// that do not support it, such as `gpt-4.1`.
//
// Though if a user is using Codex with non-OpenAI models that, say, happen
// to start with "o", then they can set `model_reasoning_effort = "none` in
// config.toml to disable reasoning.
//
// Ultimately, this should also be configurable in config.toml, but we
// need to have defaults that "just work." Perhaps we could have a
// "reasoning models pattern" as part of ModelProviderInfo?
model.starts_with("o") || model.starts_with("codex")
}
pub(crate) struct ResponseStream {
pub(crate) rx_event: mpsc::Receiver<Result<ResponseEvent>>,
}

File diff suppressed because it is too large.


@@ -1,16 +1,26 @@
use crate::config_profile::ConfigProfile;
use crate::config_types::History;
use crate::config_types::McpServerConfig;
use crate::config_types::ReasoningEffort;
use crate::config_types::ReasoningSummary;
use crate::config_types::ShellEnvironmentPolicy;
use crate::config_types::ShellEnvironmentPolicyToml;
use crate::config_types::Tui;
use crate::config_types::UriBasedFileOpener;
use crate::flags::OPENAI_DEFAULT_MODEL;
use crate::mcp_server_config::McpServerConfig;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::built_in_model_providers;
use crate::openai_model_info::get_model_info;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPermission;
use crate::protocol::SandboxPolicy;
use dirs::home_dir;
use serde::Deserialize;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
use toml::Value as TomlValue;
const DEFAULT_CONFIG_TEMPLATE: &str = include_str!("../config_template.toml");
/// Maximum number of bytes of the documentation that will be embedded. Larger
/// files are *silently truncated* to this size so we do not take up too much of
@@ -23,6 +33,12 @@ pub struct Config {
/// Optional override of model selection.
pub model: String,
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<u64>,
/// Maximum number of output tokens.
pub model_max_output_tokens: Option<u64>,
/// Key into the model_providers map that specifies which provider to use.
pub model_provider_id: String,
@@ -34,6 +50,13 @@ pub struct Config {
pub sandbox_policy: SandboxPolicy,
pub shell_environment_policy: ShellEnvironmentPolicy,
/// When `true`, `AgentReasoning` events emitted by the backend will be
/// suppressed from the frontend output. This can reduce visual noise when
/// users are only interested in the final agent responses.
pub hide_agent_reasoning: bool,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
/// who have opted into Zero Data Retention (ZDR).
@@ -77,6 +100,152 @@ pub struct Config {
/// Maximum number of bytes to include from an AGENTS.md project doc file.
pub project_doc_max_bytes: usize,
/// Directory containing all Codex state (defaults to `~/.codex` but can be
/// overridden by the `CODEX_HOME` environment variable).
pub codex_home: PathBuf,
/// Settings that govern if and what will be written to `~/.codex/history.jsonl`.
pub history: History,
/// Optional URI-based file opener. If set, citations to files in the model
/// output will be hyperlinked using the specified URI scheme.
pub file_opener: UriBasedFileOpener,
/// Collection of settings that are specific to the TUI.
pub tui: Tui,
/// Path to the `codex-linux-sandbox` executable. This must be set if
/// [`crate::exec::SandboxType::LinuxSeccomp`] is used. Note that this
/// cannot be set in the config file: it must be set in code via
/// [`ConfigOverrides`].
///
/// When this program is invoked, arg0 will be set to `codex-linux-sandbox`.
pub codex_linux_sandbox_exe: Option<PathBuf>,
/// If not "none", the value to use for `reasoning.effort` when making a
/// request using the Responses API.
pub model_reasoning_effort: ReasoningEffort,
/// If not "none", the value to use for `reasoning.summary` when making a
/// request using the Responses API.
pub model_reasoning_summary: ReasoningSummary,
}
impl Config {
/// Load configuration with *generic* CLI overrides (`-c key=value`) applied
/// **in between** the values parsed from `config.toml` and the
/// strongly-typed overrides specified via [`ConfigOverrides`].
///
/// The precedence order is therefore: `config.toml` < `-c` overrides <
/// `ConfigOverrides`.
pub fn load_with_cli_overrides(
cli_overrides: Vec<(String, TomlValue)>,
overrides: ConfigOverrides,
) -> std::io::Result<Self> {
// Resolve the directory that stores Codex state (e.g. ~/.codex or the
// value of $CODEX_HOME) so we can embed it into the resulting
// `Config` instance.
let codex_home = find_codex_home()?;
// Step 1: parse `config.toml` into a generic JSON value.
let mut root_value = load_config_as_toml(&codex_home)?;
// Step 2: apply the `-c` overrides.
for (path, value) in cli_overrides.into_iter() {
apply_toml_override(&mut root_value, &path, value);
}
// Step 3: deserialize into `ConfigToml` so that Serde can enforce the
// correct types.
let cfg: ConfigToml = root_value.try_into().map_err(|e| {
tracing::error!("Failed to deserialize overridden config: {e}");
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
})?;
// Step 4: merge with the strongly-typed overrides.
Self::load_from_base_config_with_overrides(cfg, overrides, codex_home)
}
}
/// Read `CODEX_HOME/config.toml` and return it as a generic TOML value. Returns
/// an empty TOML table when the file does not exist.
fn load_config_as_toml(codex_home: &Path) -> std::io::Result<TomlValue> {
let config_path = codex_home.join("config.toml");
match std::fs::read_to_string(&config_path) {
Ok(contents) => match toml::from_str::<TomlValue>(&contents) {
Ok(val) => Ok(val),
Err(e) => {
tracing::error!("Failed to parse config.toml: {e}");
Err(std::io::Error::new(std::io::ErrorKind::InvalidData, e))
}
},
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
tracing::info!("config.toml not found, writing template");
write_default_config_template(&config_path);
Ok(TomlValue::Table(Default::default()))
}
Err(e) => {
tracing::error!("Failed to read config.toml: {e}");
Err(e)
}
}
}
fn write_default_config_template(config_path: &Path) {
if let Some(parent) = config_path.parent() {
if let Err(e) = std::fs::create_dir_all(parent) {
tracing::error!("Failed to create config dir: {e}");
return;
}
}
match std::fs::write(config_path, DEFAULT_CONFIG_TEMPLATE) {
Ok(_) => tracing::info!("wrote default config template at {}", config_path.display()),
Err(e) => tracing::error!("Failed to write default config template: {e}"),
}
}
/// Apply a single dotted-path override onto a TOML value.
fn apply_toml_override(root: &mut TomlValue, path: &str, value: TomlValue) {
use toml::value::Table;
let segments: Vec<&str> = path.split('.').collect();
let mut current = root;
for (idx, segment) in segments.iter().enumerate() {
let is_last = idx == segments.len() - 1;
if is_last {
match current {
TomlValue::Table(table) => {
table.insert(segment.to_string(), value);
}
_ => {
let mut table = Table::new();
table.insert(segment.to_string(), value);
*current = TomlValue::Table(table);
}
}
return;
}
// Traverse or create intermediate object.
match current {
TomlValue::Table(table) => {
current = table
.entry(segment.to_string())
.or_insert_with(|| TomlValue::Table(Table::new()));
}
_ => {
*current = TomlValue::Table(Table::new());
if let TomlValue::Table(tbl) = current {
current = tbl
.entry(segment.to_string())
.or_insert_with(|| TomlValue::Table(Table::new()));
}
}
}
}
}
/// Base config deserialized from ~/.codex/config.toml.
@@ -88,14 +257,20 @@ pub struct ConfigToml {
/// Provider to use from the model_providers map.
pub model_provider: Option<String>,
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<u64>,
/// Maximum number of output tokens.
pub model_max_output_tokens: Option<u64>,
/// Default approval policy for executing commands.
pub approval_policy: Option<AskForApproval>,
// The `default` attribute ensures that the field is treated as `None` when
// the key is omitted from the TOML. Without it, Serde treats the field as
// required because we supply a custom deserializer.
#[serde(default, deserialize_with = "deserialize_sandbox_permissions")]
pub sandbox_permissions: Option<Vec<SandboxPermission>>,
#[serde(default)]
pub shell_environment_policy: ShellEnvironmentPolicyToml,
/// If omitted, Codex defaults to the restrictive `read-only` policy.
pub sandbox: Option<SandboxPolicy>,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
@@ -126,55 +301,24 @@ pub struct ConfigToml {
/// Named profiles to facilitate switching between different configurations.
#[serde(default)]
pub profiles: HashMap<String, ConfigProfile>,
}
impl ConfigToml {
/// Attempt to parse the file at `~/.codex/config.toml`. If it does not
/// exist, return a default config. Though if it exists and cannot be
/// parsed, report that to the user and force them to fix it.
fn load_from_toml() -> std::io::Result<Self> {
let config_toml_path = codex_dir()?.join("config.toml");
match std::fs::read_to_string(&config_toml_path) {
Ok(contents) => toml::from_str::<Self>(&contents).map_err(|e| {
tracing::error!("Failed to parse config.toml: {e}");
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
}),
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
tracing::info!("config.toml not found, using defaults");
Ok(Self::default())
}
Err(e) => {
tracing::error!("Failed to read config.toml: {e}");
Err(e)
}
}
}
}
/// Settings that govern if and what will be written to `~/.codex/history.jsonl`.
#[serde(default)]
pub history: Option<History>,
fn deserialize_sandbox_permissions<'de, D>(
deserializer: D,
) -> Result<Option<Vec<SandboxPermission>>, D::Error>
where
D: serde::Deserializer<'de>,
{
let permissions: Option<Vec<String>> = Option::deserialize(deserializer)?;
/// Optional URI-based file opener. If set, citations to files in the model
/// output will be hyperlinked using the specified URI scheme.
pub file_opener: Option<UriBasedFileOpener>,
match permissions {
Some(raw_permissions) => {
let base_path = codex_dir().map_err(serde::de::Error::custom)?;
/// Collection of settings that are specific to the TUI.
pub tui: Option<Tui>,
let converted = raw_permissions
.into_iter()
.map(|raw| {
parse_sandbox_permission_with_base_path(&raw, base_path.clone())
.map_err(serde::de::Error::custom)
})
.collect::<Result<Vec<_>, D::Error>>()?;
/// When set to `true`, `AgentReasoning` events will be hidden from the
/// UI/output. Defaults to `false`.
pub hide_agent_reasoning: Option<bool>,
Ok(Some(converted))
}
None => Ok(None),
}
pub model_reasoning_effort: Option<ReasoningEffort>,
pub model_reasoning_summary: Option<ReasoningSummary>,
}
/// Optional overrides for user configuration (e.g., from CLI flags).
@@ -184,28 +328,20 @@ pub struct ConfigOverrides {
pub cwd: Option<PathBuf>,
pub approval_policy: Option<AskForApproval>,
pub sandbox_policy: Option<SandboxPolicy>,
pub disable_response_storage: Option<bool>,
pub model_provider: Option<String>,
pub config_profile: Option<String>,
pub codex_linux_sandbox_exe: Option<PathBuf>,
}
impl Config {
/// Load configuration, optionally applying overrides (CLI flags). Merges
/// ~/.codex/config.toml, ~/.codex/instructions.md, embedded defaults, and
/// any values provided in `overrides` (highest precedence).
pub fn load_with_overrides(overrides: ConfigOverrides) -> std::io::Result<Self> {
let cfg: ConfigToml = ConfigToml::load_from_toml()?;
tracing::warn!("Config parsed from config.toml: {cfg:?}");
let codex_dir = codex_dir().ok();
Self::load_from_base_config_with_overrides(cfg, overrides, codex_dir.as_deref())
}
fn load_from_base_config_with_overrides(
/// Meant to be used exclusively for tests: `load_with_overrides()` should
/// be used in all other cases.
pub fn load_from_base_config_with_overrides(
cfg: ConfigToml,
overrides: ConfigOverrides,
codex_dir: Option<&Path>,
codex_home: PathBuf,
) -> std::io::Result<Self> {
let instructions = Self::load_instructions(codex_dir);
let instructions = Self::load_instructions(Some(&codex_home));
// Destructure ConfigOverrides fully to ensure all overrides are applied.
let ConfigOverrides {
@@ -213,9 +349,9 @@ impl Config {
cwd,
approval_policy,
sandbox_policy,
disable_response_storage,
model_provider,
config_profile: config_profile_key,
codex_linux_sandbox_exe,
} = overrides;
let config_profile = match config_profile_key.or(cfg.profile) {
@@ -232,20 +368,10 @@ impl Config {
None => ConfigProfile::default(),
};
let sandbox_policy = match sandbox_policy {
Some(sandbox_policy) => sandbox_policy,
None => {
// Derive a SandboxPolicy from the permissions in the config.
match cfg.sandbox_permissions {
// Note this means the user can explicitly set permissions
// to the empty list in the config file, granting it no
// permissions whatsoever.
Some(permissions) => SandboxPolicy::from(permissions),
// Default to read only rather than completely locked down.
None => SandboxPolicy::new_read_only_policy(),
}
}
};
let sandbox_policy = sandbox_policy.unwrap_or_else(|| {
cfg.sandbox
.unwrap_or_else(SandboxPolicy::new_read_only_policy)
});
let mut model_providers = built_in_model_providers();
// Merge user-defined providers into the built-in list.
@@ -267,6 +393,8 @@ impl Config {
})?
.clone();
let shell_environment_policy = cfg.shell_environment_policy.into();
let resolved_cwd = {
use std::env;
@@ -286,11 +414,25 @@ impl Config {
}
};
let history = cfg.history.unwrap_or_default();
let model = model
.or(config_profile.model)
.or(cfg.model)
.unwrap_or_else(default_model);
let openai_model_info = get_model_info(&model);
let model_context_window = cfg
.model_context_window
.or_else(|| openai_model_info.as_ref().map(|info| info.context_window));
let model_max_output_tokens = cfg.model_max_output_tokens.or_else(|| {
openai_model_info
.as_ref()
.map(|info| info.max_output_tokens)
});
let config = Self {
model: model
.or(config_profile.model)
.or(cfg.model)
.unwrap_or_else(default_model),
model,
model_context_window,
model_max_output_tokens,
model_provider_id,
model_provider,
cwd: resolved_cwd,
@@ -299,8 +441,9 @@ impl Config {
.or(cfg.approval_policy)
.unwrap_or_else(AskForApproval::default),
sandbox_policy,
disable_response_storage: disable_response_storage
.or(config_profile.disable_response_storage)
shell_environment_policy,
disable_response_storage: config_profile
.disable_response_storage
.or(cfg.disable_response_storage)
.unwrap_or(false),
notify: cfg.notify,
@@ -308,6 +451,15 @@ impl Config {
mcp_servers: cfg.mcp_servers,
model_providers,
project_doc_max_bytes: cfg.project_doc_max_bytes.unwrap_or(PROJECT_DOC_MAX_BYTES),
codex_home,
history,
file_opener: cfg.file_opener.unwrap_or(UriBasedFileOpener::VsCode),
tui: cfg.tui.unwrap_or_default(),
codex_linux_sandbox_exe,
hide_agent_reasoning: cfg.hide_agent_reasoning.unwrap_or(false),
model_reasoning_effort: cfg.model_reasoning_effort.unwrap_or_default(),
model_reasoning_summary: cfg.model_reasoning_summary.unwrap_or_default(),
};
Ok(config)
}
@@ -328,27 +480,29 @@ impl Config {
}
})
}
/// Meant to be used exclusively for tests: `load_with_overrides()` should
/// be used in all other cases.
pub fn load_default_config_for_test() -> Self {
#[expect(clippy::expect_used)]
Self::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
None,
)
.expect("defaults for test should always succeed")
}
}
fn default_model() -> String {
OPENAI_DEFAULT_MODEL.to_string()
}
/// Returns the path to the Codex configuration directory, which is `~/.codex`.
/// Does not verify that the directory exists.
pub fn codex_dir() -> std::io::Result<PathBuf> {
/// Returns the path to the Codex configuration directory, which can be
/// specified by the `CODEX_HOME` environment variable. If not set, defaults to
/// `~/.codex`.
///
/// - If `CODEX_HOME` is set, the value will be canonicalized and this
/// function will Err if the path does not exist.
/// - If `CODEX_HOME` is not set, this function does not verify that the
/// directory exists.
fn find_codex_home() -> std::io::Result<PathBuf> {
// Honor the `CODEX_HOME` environment variable when it is set to allow users
// (and tests) to override the default location.
if let Ok(val) = std::env::var("CODEX_HOME") {
if !val.is_empty() {
return PathBuf::from(val).canonicalize();
}
}
let mut p = home_dir().ok_or_else(|| {
std::io::Error::new(
std::io::ErrorKind::NotFound,
@@ -361,133 +515,119 @@ pub fn codex_dir() -> std::io::Result<PathBuf> {
/// Returns the path to the folder where Codex logs are stored. Does not verify
/// that the directory exists.
pub fn log_dir() -> std::io::Result<PathBuf> {
let mut p = codex_dir()?;
pub fn log_dir(cfg: &Config) -> std::io::Result<PathBuf> {
let mut p = cfg.codex_home.clone();
p.push("log");
Ok(p)
}
pub fn parse_sandbox_permission_with_base_path(
raw: &str,
base_path: PathBuf,
) -> std::io::Result<SandboxPermission> {
use SandboxPermission::*;
if let Some(path) = raw.strip_prefix("disk-write-folder=") {
return if path.is_empty() {
Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
"--sandbox-permission disk-write-folder=<PATH> requires a non-empty PATH",
))
} else {
use path_absolutize::*;
let file = PathBuf::from(path);
let absolute_path = if file.is_relative() {
file.absolutize_from(base_path)
} else {
file.absolutize()
}
.map(|path| path.into_owned())?;
Ok(DiskWriteFolder {
folder: absolute_path,
})
};
}
match raw {
"disk-full-read-access" => Ok(DiskFullReadAccess),
"disk-write-platform-user-temp-folder" => Ok(DiskWritePlatformUserTempFolder),
"disk-write-platform-global-temp-folder" => Ok(DiskWritePlatformGlobalTempFolder),
"disk-write-cwd" => Ok(DiskWriteCwd),
"disk-full-write-access" => Ok(DiskFullWriteAccess),
"network-full-access" => Ok(NetworkFullAccess),
_ => Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
format!(
"`{raw}` is not a recognised permission.\nRun with `--help` to see the accepted values."
),
)),
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::expect_used, clippy::unwrap_used)]
use crate::config_types::HistoryPersistence;
use super::*;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
/// Verify that the `sandbox_permissions` field on `ConfigToml` correctly
/// differentiates between a value that is completely absent in the
/// provided TOML (i.e. `None`) and one that is explicitly specified as an
/// empty array (i.e. `Some(vec![])`). This ensures that downstream logic
/// that treats these two cases differently (default read-only policy vs a
/// fully locked-down sandbox) continues to function.
#[test]
fn test_sandbox_permissions_none_vs_empty_vec() {
// Case 1: `sandbox_permissions` key is *absent* from the TOML source.
let toml_source_without_key = "";
let cfg_without_key: ConfigToml = toml::from_str(toml_source_without_key)
.expect("TOML deserialization without key should succeed");
assert!(cfg_without_key.sandbox_permissions.is_none());
// Case 2: `sandbox_permissions` is present but set to an *empty array*.
let toml_source_with_empty = "sandbox_permissions = []";
let cfg_with_empty: ConfigToml = toml::from_str(toml_source_with_empty)
.expect("TOML deserialization with empty array should succeed");
assert_eq!(Some(vec![]), cfg_with_empty.sandbox_permissions);
// Case 3: `sandbox_permissions` contains a non-empty list of valid values.
let toml_source_with_values = r#"
sandbox_permissions = ["disk-full-read-access", "network-full-access"]
"#;
let cfg_with_values: ConfigToml = toml::from_str(toml_source_with_values)
.expect("TOML deserialization with valid permissions should succeed");
fn test_toml_parsing() {
let history_with_persistence = r#"
[history]
persistence = "save-all"
"#;
let history_with_persistence_cfg = toml::from_str::<ConfigToml>(history_with_persistence)
.expect("TOML deserialization should succeed");
assert_eq!(
Some(vec![
SandboxPermission::DiskFullReadAccess,
SandboxPermission::NetworkFullAccess
]),
cfg_with_values.sandbox_permissions
Some(History {
persistence: HistoryPersistence::SaveAll,
max_bytes: None,
}),
history_with_persistence_cfg.history
);
let history_no_persistence = r#"
[history]
persistence = "none"
"#;
let history_no_persistence_cfg = toml::from_str::<ConfigToml>(history_no_persistence)
.expect("TOML deserialization should succeed");
assert_eq!(
Some(History {
persistence: HistoryPersistence::None,
max_bytes: None,
}),
history_no_persistence_cfg.history
);
}
/// Deserializing a TOML string containing an *invalid* permission should
/// fail with a helpful error rather than silently defaulting or
/// succeeding.
#[test]
fn test_sandbox_permissions_illegal_value() {
let toml_bad = r#"sandbox_permissions = ["not-a-real-permission"]"#;
fn test_sandbox_config_parsing() {
let sandbox_full_access = r#"
[sandbox]
mode = "danger-full-access"
network_access = false # This should be ignored.
"#;
let sandbox_full_access_cfg = toml::from_str::<ConfigToml>(sandbox_full_access)
.expect("TOML deserialization should succeed");
assert_eq!(
Some(SandboxPolicy::DangerFullAccess),
sandbox_full_access_cfg.sandbox
);
let err = toml::from_str::<ConfigToml>(toml_bad)
.expect_err("Deserialization should fail for invalid permission");
let sandbox_read_only = r#"
[sandbox]
mode = "read-only"
network_access = true # This should be ignored.
"#;
// Make sure the error message contains the invalid value so users have
// useful feedback.
let msg = err.to_string();
assert!(msg.contains("not-a-real-permission"));
let sandbox_read_only_cfg = toml::from_str::<ConfigToml>(sandbox_read_only)
.expect("TOML deserialization should succeed");
assert_eq!(Some(SandboxPolicy::ReadOnly), sandbox_read_only_cfg.sandbox);
let sandbox_workspace_write = r#"
[sandbox]
mode = "workspace-write"
writable_roots = [
"/tmp",
]
"#;
let sandbox_workspace_write_cfg = toml::from_str::<ConfigToml>(sandbox_workspace_write)
.expect("TOML deserialization should succeed");
assert_eq!(
Some(SandboxPolicy::WorkspaceWrite {
writable_roots: vec![PathBuf::from("/tmp")],
network_access: false
}),
sandbox_workspace_write_cfg.sandbox
);
}
/// Users can specify config values at multiple levels that have the
/// following precedence:
///
/// 1. custom command-line argument, e.g. `--model o3`
/// 2. as part of a profile, where the `--profile` is specified via a CLI
/// (or in the config file itself)
/// 3. as an entry in `config.toml`, e.g. `model = "o3"`
/// 4. the default value for a required field defined in code, e.g.,
/// `crate::flags::OPENAI_DEFAULT_MODEL`
///
/// Note that profiles are the recommended way to specify a group of
/// configuration options together.
#[test]
fn test_precedence_overrides_then_profile_then_config_toml() -> std::io::Result<()> {
struct PrecedenceTestFixture {
cwd: TempDir,
codex_home: TempDir,
cfg: ConfigToml,
model_provider_map: HashMap<String, ModelProviderInfo>,
openai_provider: ModelProviderInfo,
openai_chat_completions_provider: ModelProviderInfo,
}
impl PrecedenceTestFixture {
fn cwd(&self) -> PathBuf {
self.cwd.path().to_path_buf()
}
fn codex_home(&self) -> PathBuf {
self.codex_home.path().to_path_buf()
}
}
fn create_test_fixture() -> std::io::Result<PrecedenceTestFixture> {
let toml = r#"
model = "o3"
approval_policy = "unless-allow-listed"
sandbox_permissions = ["disk-full-read-access"]
approval_policy = "untrusted"
disable_response_storage = false
# Can be used to determine which profile to use if not specified by
@@ -526,12 +666,15 @@ disable_response_storage = true
// a parent folder, either.
std::fs::write(cwd.join(".git"), "gitdir: nowhere")?;
let codex_home_temp_dir = TempDir::new().unwrap();
let openai_chat_completions_provider = ModelProviderInfo {
name: "OpenAI using Chat Completions".to_string(),
base_url: "https://api.openai.com/v1".to_string(),
env_key: Some("OPENAI_API_KEY".to_string()),
wire_api: crate::WireApi::Chat,
env_key_instructions: None,
query_params: None,
};
let model_provider_map = {
let mut model_provider_map = built_in_model_providers();
@@ -547,94 +690,173 @@ disable_response_storage = true
.expect("openai provider should exist")
.clone();
Ok(PrecedenceTestFixture {
cwd: cwd_temp_dir,
codex_home: codex_home_temp_dir,
cfg,
model_provider_map,
openai_provider,
openai_chat_completions_provider,
})
}
/// Users can specify config values at multiple levels that have the
/// following precedence:
///
/// 1. custom command-line argument, e.g. `--model o3`
/// 2. as part of a profile, where the `--profile` is specified via a CLI
/// (or in the config file itself)
/// 3. as an entry in `config.toml`, e.g. `model = "o3"`
/// 4. the default value for a required field defined in code, e.g.,
/// `crate::flags::OPENAI_DEFAULT_MODEL`
///
/// Note that profiles are the recommended way to specify a group of
/// configuration options together.
#[test]
fn test_precedence_fixture_with_o3_profile() -> std::io::Result<()> {
let fixture = create_test_fixture()?;
let o3_profile_overrides = ConfigOverrides {
config_profile: Some("o3".to_string()),
cwd: Some(cwd.clone()),
cwd: Some(fixture.cwd()),
..Default::default()
};
let o3_profile_config =
Config::load_from_base_config_with_overrides(cfg.clone(), o3_profile_overrides, None)?;
let o3_profile_config: Config = Config::load_from_base_config_with_overrides(
fixture.cfg.clone(),
o3_profile_overrides,
fixture.codex_home(),
)?;
assert_eq!(
Config {
model: "o3".to_string(),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_provider_id: "openai".to_string(),
model_provider: openai_provider.clone(),
model_provider: fixture.openai_provider.clone(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: false,
instructions: None,
notify: None,
cwd: cwd.clone(),
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
model_providers: model_provider_map.clone(),
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
},
o3_profile_config
);
Ok(())
}
#[test]
fn test_precedence_fixture_with_gpt3_profile() -> std::io::Result<()> {
let fixture = create_test_fixture()?;
let gpt3_profile_overrides = ConfigOverrides {
config_profile: Some("gpt3".to_string()),
cwd: Some(cwd.clone()),
cwd: Some(fixture.cwd()),
..Default::default()
};
let gpt3_profile_config = Config::load_from_base_config_with_overrides(
cfg.clone(),
fixture.cfg.clone(),
gpt3_profile_overrides,
None,
fixture.codex_home(),
)?;
let expected_gpt3_profile_config = Config {
model: "gpt-3.5-turbo".to_string(),
model_context_window: Some(16_385),
model_max_output_tokens: Some(4_096),
model_provider_id: "openai-chat-completions".to_string(),
model_provider: openai_chat_completions_provider,
approval_policy: AskForApproval::UnlessAllowListed,
model_provider: fixture.openai_chat_completions_provider.clone(),
approval_policy: AskForApproval::UnlessTrusted,
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: false,
instructions: None,
notify: None,
cwd: cwd.clone(),
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
model_providers: model_provider_map.clone(),
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
};
assert_eq!(expected_gpt3_profile_config.clone(), gpt3_profile_config);
assert_eq!(expected_gpt3_profile_config, gpt3_profile_config);
// Verify that loading without specifying a profile in ConfigOverrides
// uses the default profile from the config file.
// uses the default profile from the config file (which is "gpt3").
let default_profile_overrides = ConfigOverrides {
cwd: Some(fixture.cwd()),
..Default::default()
};
let default_profile_config = Config::load_from_base_config_with_overrides(
fixture.cfg.clone(),
default_profile_overrides,
fixture.codex_home(),
)?;
assert_eq!(expected_gpt3_profile_config, default_profile_config);
Ok(())
}
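
The two loads above differ only in whether `config_profile` is set; a hypothetical helper (not from the crate) makes the selection order explicit.

```rust
/// Hypothetical helper illustrating the order the test exercises: an
/// explicit profile override wins, otherwise the `profile` key from
/// config.toml applies.
fn effective_profile<'a>(
    override_name: Option<&'a str>,
    config_default: Option<&'a str>,
) -> Option<&'a str> {
    override_name.or(config_default)
}

fn main() {
    assert_eq!(effective_profile(Some("gpt3"), None), Some("gpt3"));
    // No override: fall back to the config file's default profile.
    assert_eq!(effective_profile(None, Some("gpt3")), Some("gpt3"));
}
```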
#[test]
fn test_precedence_fixture_with_zdr_profile() -> std::io::Result<()> {
let fixture = create_test_fixture()?;
let zdr_profile_overrides = ConfigOverrides {
config_profile: Some("zdr".to_string()),
cwd: Some(fixture.cwd()),
..Default::default()
};
let zdr_profile_config = Config::load_from_base_config_with_overrides(
fixture.cfg.clone(),
zdr_profile_overrides,
fixture.codex_home(),
)?;
let expected_zdr_profile_config = Config {
model: "o3".to_string(),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_provider_id: "openai".to_string(),
model_provider: fixture.openai_provider.clone(),
approval_policy: AskForApproval::OnFailure,
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: true,
instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
};
assert_eq!(expected_zdr_profile_config, zdr_profile_config);
Ok(())
}
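
The fixture's `zdr` stanza is not itself part of this diff; below is an assumed TOML shape that would line up with the asserted fields (key names taken from the `Config` fields above, values assumed), checked with the `toml` crate.

```rust
fn main() {
    // Assumed stanza; the real fixture TOML is not shown in this diff.
    let raw = r#"
        [profiles.zdr]
        model = "o3"
        approval_policy = "on-failure"
        disable_response_storage = true
    "#;
    let value: toml::Value = raw.parse().expect("valid TOML");
    let zdr = &value["profiles"]["zdr"];
    // The bit the test cares most about: ZDR turns off response storage.
    assert_eq!(zdr["disable_response_storage"].as_bool(), Some(true));
}
```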


@@ -0,0 +1,207 @@
//! Types used to define the fields of [`crate::config::Config`].
// Note this file should generally be restricted to simple struct/enum
// definitions that do not contain business logic.
use std::collections::HashMap;
use strum_macros::Display;
use wildmatch::WildMatchPattern;
use serde::Deserialize;
use serde::Serialize;
#[derive(Deserialize, Debug, Clone, PartialEq)]
pub struct McpServerConfig {
pub command: String,
#[serde(default)]
pub args: Vec<String>,
#[serde(default)]
pub env: Option<HashMap<String, String>>,
}
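
Given the serde derives, an `mcp_servers` table deserializes directly into a map of these structs. A self-contained sketch (the struct is redeclared locally so it compiles on its own; the "docs" server name and command line are invented):

```rust
use serde::Deserialize;
use std::collections::HashMap;

// Local mirror of McpServerConfig, redeclared for this sketch only.
#[derive(Deserialize, Debug)]
struct McpServerConfig {
    command: String,
    #[serde(default)]
    args: Vec<String>,
    #[serde(default)]
    env: Option<HashMap<String, String>>,
}

#[derive(Deserialize)]
struct Root {
    mcp_servers: HashMap<String, McpServerConfig>,
}

fn main() {
    // "docs" and the command are invented for illustration.
    let raw = r#"
        [mcp_servers.docs]
        command = "npx"
        args = ["-y", "@example/mcp-server"]
    "#;
    let root: Root = toml::from_str(raw).expect("valid TOML");
    assert_eq!(root.mcp_servers["docs"].command, "npx");
    assert_eq!(root.mcp_servers["docs"].args.len(), 2);
    assert!(root.mcp_servers["docs"].env.is_none());
}
```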
#[derive(Deserialize, Debug, Copy, Clone, PartialEq)]
pub enum UriBasedFileOpener {
#[serde(rename = "vscode")]
VsCode,
#[serde(rename = "vscode-insiders")]
VsCodeInsiders,
#[serde(rename = "windsurf")]
Windsurf,
#[serde(rename = "cursor")]
Cursor,
/// Option to disable the URI-based file opener.
#[serde(rename = "none")]
None,
}
impl UriBasedFileOpener {
pub fn get_scheme(&self) -> Option<&str> {
match self {
UriBasedFileOpener::VsCode => Some("vscode"),
UriBasedFileOpener::VsCodeInsiders => Some("vscode-insiders"),
UriBasedFileOpener::Windsurf => Some("windsurf"),
UriBasedFileOpener::Cursor => Some("cursor"),
UriBasedFileOpener::None => None,
}
}
}
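
The returned scheme presumably feeds into building clickable file URIs; the exact URI layout is not shown in this diff, so the `scheme://file<path>:<line>` shape below is an assumption.

```rust
// Hypothetical use of the scheme; the URI layout is assumed, not taken
// from the crate.
fn file_uri(scheme: Option<&str>, path: &str, line: u32) -> Option<String> {
    scheme.map(|s| format!("{s}://file{path}:{line}"))
}

fn main() {
    assert_eq!(
        file_uri(Some("vscode"), "/repo/src/main.rs", 42).as_deref(),
        Some("vscode://file/repo/src/main.rs:42")
    );
    // `UriBasedFileOpener::None` yields no scheme, hence no link.
    assert_eq!(file_uri(None, "/repo/src/main.rs", 42), None);
}
```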
/// Settings that govern if and what will be written to `~/.codex/history.jsonl`.
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
pub struct History {
/// Controls whether history entries are written to disk.
pub persistence: HistoryPersistence,
/// If set, the maximum size of the history file in bytes.
/// TODO(mbolin): Not currently honored.
pub max_bytes: Option<usize>,
}
#[derive(Deserialize, Debug, Copy, Clone, PartialEq, Default)]
#[serde(rename_all = "kebab-case")]
pub enum HistoryPersistence {
/// Save all history entries to disk.
#[default]
SaveAll,
/// Do not write history to disk.
None,
}
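
Because of `rename_all = "kebab-case"`, the variants map to `"save-all"` and `"none"` in TOML. A minimal round-trip sketch with mirrored types (redeclared so the snippet is self-contained):

```rust
use serde::Deserialize;

// Mirrors of the types above, for this sketch only.
#[derive(Deserialize, Debug, PartialEq, Default)]
#[serde(rename_all = "kebab-case")]
enum HistoryPersistence {
    #[default]
    SaveAll,
    None,
}

#[derive(Deserialize, Debug, PartialEq)]
struct History {
    persistence: HistoryPersistence,
    max_bytes: Option<usize>,
}

fn main() {
    let history: History =
        toml::from_str("persistence = \"none\"").expect("valid TOML");
    assert_eq!(history.persistence, HistoryPersistence::None);
    // `max_bytes` is optional (and, per the TODO above, not yet honored).
    assert_eq!(history.max_bytes, None);
}
```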
/// Collection of settings that are specific to the TUI.
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
pub struct Tui {
/// By default, mouse capture is enabled in the TUI so that it is possible
/// to scroll the conversation history with a mouse. This comes at the cost
/// of not being able to use the mouse to select text in the TUI.
/// (Most terminals support a modifier key to allow this. For example,
/// text selection works in iTerm if you hold down the `Option` key while
/// clicking and dragging.)
///
/// Setting this option to `true` disables mouse capture, so scrolling with
/// the mouse is not possible, though the keyboard shortcuts e.g. `b` and
/// `space` still work. This allows the user to select text in the TUI
/// using the mouse without needing to hold down a modifier key.
pub disable_mouse_capture: bool,
}
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
#[serde(rename_all = "kebab-case")]
pub enum ShellEnvironmentPolicyInherit {
/// "Core" environment variables for the platform. On UNIX, this would
/// include HOME, LOGNAME, PATH, SHELL, and USER, among others.
#[default]
Core,
/// Inherits the full environment from the parent process.
All,
/// Do not inherit any environment variables from the parent process.
None,
}
/// Policy for building the `env` when spawning a process via either the
/// `shell` or `local_shell` tool.
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
pub struct ShellEnvironmentPolicyToml {
pub inherit: Option<ShellEnvironmentPolicyInherit>,
pub ignore_default_excludes: Option<bool>,
/// List of glob-style patterns (matched case-insensitively).
pub exclude: Option<Vec<String>>,
pub r#set: Option<HashMap<String, String>>,
/// List of glob-style patterns (matched case-insensitively).
pub include_only: Option<Vec<String>>,
}
pub type EnvironmentVariablePattern = WildMatchPattern<'*', '?'>;
/// Deriving the `env` based on this policy works as follows:
/// 1. Create an initial map based on the `inherit` policy.
/// 2. If `ignore_default_excludes` is false, filter the map using the default
/// exclude pattern(s), which are: `"*KEY*"` and `"*TOKEN*"`.
/// 3. If `exclude` is not empty, filter the map using the provided patterns.
/// 4. Insert any entries from `r#set` into the map.
/// 5. If non-empty, filter the map using the `include_only` patterns.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct ShellEnvironmentPolicy {
/// Starting point when building the environment.
pub inherit: ShellEnvironmentPolicyInherit,
/// True to skip the default filtering that removes environment variables
/// whose names contain "KEY" or "TOKEN".
pub ignore_default_excludes: bool,
/// Environment variable names to exclude from the environment.
pub exclude: Vec<EnvironmentVariablePattern>,
/// (key, value) pairs to insert in the environment.
pub r#set: HashMap<String, String>,
/// Environment variable names to retain in the environment.
pub include_only: Vec<EnvironmentVariablePattern>,
}
impl From<ShellEnvironmentPolicyToml> for ShellEnvironmentPolicy {
fn from(toml: ShellEnvironmentPolicyToml) -> Self {
let inherit = toml.inherit.unwrap_or(ShellEnvironmentPolicyInherit::Core);
let ignore_default_excludes = toml.ignore_default_excludes.unwrap_or(false);
let exclude = toml
.exclude
.unwrap_or_default()
.into_iter()
.map(|s| EnvironmentVariablePattern::new_case_insensitive(&s))
.collect();
let r#set = toml.r#set.unwrap_or_default();
let include_only = toml
.include_only
.unwrap_or_default()
.into_iter()
.map(|s| EnvironmentVariablePattern::new_case_insensitive(&s))
.collect();
Self {
inherit,
ignore_default_excludes,
exclude,
r#set,
include_only,
}
}
}
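
Not the crate's actual implementation, but the five documented derivation steps can be sketched end to end, with plain case-insensitive substring matching standing in for the `WildMatchPattern` globs:

```rust
use std::collections::HashMap;

// Crude stand-in for a case-insensitive `*PAT*` glob.
fn matches(name: &str, pat: &str) -> bool {
    name.to_uppercase().contains(&pat.to_uppercase())
}

fn derive_env(
    parent: HashMap<String, String>, // step 1: result of the `inherit` policy
    ignore_default_excludes: bool,
    exclude: &[&str],
    set: HashMap<String, String>,
    include_only: &[&str],
) -> HashMap<String, String> {
    let mut env: HashMap<String, String> = parent
        .into_iter()
        // Step 2: drop anything that looks like a secret, unless opted out.
        .filter(|(k, _)| {
            ignore_default_excludes || !(matches(k, "KEY") || matches(k, "TOKEN"))
        })
        // Step 3: user-provided exclude patterns.
        .filter(|(k, _)| !exclude.iter().any(|p| matches(k, p)))
        .collect();
    // Step 4: explicit `set` entries always land in the map.
    env.extend(set);
    // Step 5: optional allow-list pass.
    if !include_only.is_empty() {
        env.retain(|k, _| include_only.iter().any(|p| matches(k, p)));
    }
    env
}

fn main() {
    let parent = HashMap::from([
        ("PATH".to_string(), "/usr/bin".to_string()),
        ("AWS_SECRET_KEY".to_string(), "shh".to_string()),
    ]);
    let env = derive_env(parent, false, &[], HashMap::new(), &[]);
    assert!(env.contains_key("PATH"));
    assert!(!env.contains_key("AWS_SECRET_KEY")); // hit the "KEY" default exclude
}
```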
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning
#[derive(Debug, Serialize, Deserialize, Default, Clone, Copy, PartialEq, Eq, Display)]
#[serde(rename_all = "lowercase")]
#[strum(serialize_all = "lowercase")]
pub enum ReasoningEffort {
Low,
#[default]
Medium,
High,
/// Option to disable reasoning.
None,
}
/// A summary of the reasoning performed by the model. This can be useful for
/// debugging and understanding the model's reasoning process.
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries
#[derive(Debug, Serialize, Deserialize, Default, Clone, Copy, PartialEq, Eq, Display)]
#[serde(rename_all = "lowercase")]
#[strum(serialize_all = "lowercase")]
pub enum ReasoningSummary {
#[default]
Auto,
Concise,
Detailed,
/// Option to disable reasoning summaries.
None,
}
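
Both enums serialize as lowercase strings, matching the `model_reasoning_effort` / `model_reasoning_summary` fields asserted in the config tests above. A mirrored round-trip sketch (the top-level key placement is an assumption about the config file):

```rust
use serde::Deserialize;

// Self-contained mirrors of the two enums above.
#[derive(Deserialize, Debug, PartialEq)]
#[serde(rename_all = "lowercase")]
enum ReasoningEffort { Low, Medium, High, None }

#[derive(Deserialize, Debug, PartialEq)]
#[serde(rename_all = "lowercase")]
enum ReasoningSummary { Auto, Concise, Detailed, None }

#[derive(Deserialize)]
struct Root {
    model_reasoning_effort: ReasoningEffort,
    model_reasoning_summary: ReasoningSummary,
}

fn main() {
    let root: Root = toml::from_str(
        "model_reasoning_effort = \"high\"\nmodel_reasoning_summary = \"detailed\"",
    )
    .expect("valid TOML");
    assert_eq!(root.model_reasoning_effort, ReasoningEffort::High);
    assert_eq!(root.model_reasoning_summary, ReasoningSummary::Detailed);
}
```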

Some files were not shown because too many files have changed in this diff.