Compare commits

...

131 Commits

Author SHA1 Message Date
Charlie Weems
d624f6a521 Clean up command outputs 2025-07-23 10:17:50 -07:00
pap
903de87833 adding best of n 2025-07-22 18:42:35 -07:00
pap
8b9f09a5ba adding secs ago instead of start date 2025-07-22 18:36:45 -07:00
pap
20db1497f3 missing dependencies 2025-07-22 16:06:30 -07:00
pap
ed9de08465 adding a test and renaming jobs to tasks 2025-07-22 16:06:21 -07:00
pap
bb0befc8db adding codex jobs command, including jobs ls, inspect, logs and -a options 2025-07-22 13:42:47 -07:00
pap
2ca44b46d6 adding concurrent option using worktree 2025-07-22 12:03:52 -07:00
Michael Bolin
4082246f6a chore: install an extension for TOML syntax highlighting in the devcontainer (#1650)
Small quality-of-life improvement when doing devcontainer development.
2025-07-22 10:58:09 -07:00
pakrym-oai
6d82907082 Add support for custom base instructions (#1645)
Allows providing custom instructions file as a config parameter and
custom instruction text via MCP tool call.
2025-07-22 09:42:22 -07:00
pakrym-oai
ed206d5687 Log response.failed error message and request-id (#1649)
To help with diagnosing failures.
2025-07-22 09:28:00 -07:00
Michael Bolin
d51654822f fix: use PR_SET_PDEATHSIG to ensure child processes are killed in a timely manner (#1626)
Some users have reported issues where child processes are not cleaned up
after Codex exits (e.g., https://github.com/openai/codex/issues/1570).

This is generally a tricky issue on operating systems: if a parent
process receives `SIGKILL`, then it terminates immediately and cannot
communicate with the child.

**This only helps on Linux**, but this PR introduces the use of `prctl(2)`
so that if the parent process dies, `SIGTERM` is delivered to the
child process. Previously, I believe that if Codex spawned a
long-running process (like `tsc --watch`) and the Codex process received
`SIGKILL`, the `tsc --watch` process would be reparented to the init
process and never killed. Now, with `prctl(2)`, the `tsc --watch`
process should receive `SIGTERM` in that scenario.

We still need to come up with a solution for macOS. I've started to look
at `launchd`, but I'm researching a number of options.
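
For reference, a minimal sketch of the Linux technique (assuming the `libc`
crate; the function name is illustrative, not the actual Codex code):

```rust
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

/// Spawn a child that receives SIGTERM if this (parent) process dies.
fn spawn_with_pdeathsig(program: &str) -> std::io::Result<Child> {
    let mut cmd = Command::new(program);
    unsafe {
        // Runs in the forked child just before exec.
        cmd.pre_exec(|| {
            // PR_SET_PDEATHSIG: deliver SIGTERM to this process when the
            // parent dies (Linux-only).
            if libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGTERM as libc::c_ulong) == -1 {
                return Err(std::io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd.spawn()
}
```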
2025-07-22 00:41:27 -07:00
Gabriel Peal
710f728124 Add an elicitation for approve patch and refactor tool calls (#1642)
1. Added an elicitation for `approve-patch` which is very similar to
`approve-exec`.
2. Extracted both elicitations to their own files to prevent
`codex_tool_runner` from blowing up in size.
2025-07-22 02:58:41 -04:00
Michael Bolin
6cf4b96f9d fix: check flags to ripgrep when deciding whether the invocation is "trusted" (#1644)
With this change, if any of `--pre`, `--hostname-bin`, `--search-zip`, or `-z` are used with a proposed invocation of `rg`, do not auto-approve.
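
A sketch of the kind of check this implies (illustrative names, not the actual code):

```rust
/// Flags that make an `rg` invocation untrusted because they can execute
/// other programs (--pre, --hostname-bin) or read into archives (--search-zip, -z).
const UNTRUSTED_RG_FLAGS: &[&str] = &["--pre", "--hostname-bin", "--search-zip", "-z"];

fn rg_invocation_is_trusted(args: &[&str]) -> bool {
    !args.iter().any(|arg| UNTRUSTED_RG_FLAGS.contains(arg))
}
```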
2025-07-21 22:38:50 -07:00
Dylan
18b2b30841 [mcp-server] Add reply tool call (#1643)
## Summary
Adds a new mcp tool call, `codex-reply`, so we can continue existing
sessions. This is a first draft and does not yet support sessions from
previous processes.

## Testing
- [x] tested with mcp client
2025-07-21 21:01:56 -07:00
Michael Bolin
d49d802b06 test: add integration test for MCP server (#1633)
This PR introduces a single integration test for `cargo mcp`, though it
also introduces a number of reusable components so that it should be
easier to introduce more integration tests going forward.

The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs`
and the reusable pieces are in `codex-rs/mcp-server/tests/common`.

The test itself verifies new functionality around elicitations
introduced in https://github.com/openai/codex/pull/1623 (and the fix
introduced in https://github.com/openai/codex/pull/1629) by doing the
following:

- starts a mock model provider with canned responses for
`/v1/chat/completions`
- starts the MCP server with a `config.toml` to use that model provider
(and `approval_policy = "untrusted"`)
- sends the `codex` tool call which causes the mock model provider to
request a shell call for `git init`
- the MCP server sends an elicitation to the client to approve the
request
- the client replies to the elicitation with `"approved"`
- the MCP server runs the command and re-samples the model, getting a
`"finish_reason": "stop"`
- in turn, the MCP server sends the final response to the original
`codex` tool call
- verifies that `git init` ran as expected

To test:

```
cargo test shell_command_approval_triggers_elicitation
```

In writing this test, I discovered that `ExecApprovalResponse` does not
conform to `ElicitResult`, so I added a TODO to fix that, since I think
that should be updated in a separate PR. As it stands, this PR does not
update any business logic, though it does make a number of members of
the `mcp-server` crate `pub` so they can be used in the test.

One additional learning from this PR is that
`std::process::Command::cargo_bin()` from the `assert_cmd` trait is only
available for `std::process::Command`, but we really want to use
`tokio::process::Command` so that everything is async and we can
leverage utilities like `tokio::time::timeout()`. The trick I came up
with was to use `cargo_bin()` to locate the program, and then to use
`std::process::Command::get_program()` when constructing the
`tokio::process::Command`.
2025-07-21 10:27:07 -07:00
Michael Bolin
8a6c6cee88 fix: address review feedback on #1621 and #1623 (#1631)
- formalizes `ExecApprovalElicitRequestParams`
- adds some defensive logic when messages fail to parse
- fixes a typo in a comment
2025-07-20 14:42:11 -07:00
Gabriel Peal
8b590105de Don't drop sessions on elicitation responses (#1629) 2025-07-20 13:31:19 -04:00
Michael Bolin
018003e52f feat: leverage elicitations in the MCP server (#1623)
This updates the MCP server so that if it receives an
`ExecApprovalRequest` from the `Codex` session, it in turn sends an [MCP
elicitation](https://modelcontextprotocol.io/specification/draft/client/elicitation)
to the client to ask for the approval decision. Upon getting a response,
it forwards the client's decision via `Op::ExecApproval`.

Admittedly, we should be doing the same thing for
`ApplyPatchApprovalRequest`, but this is our first time experimenting
with elicitations, so I'm inclined to defer wiring that code path up
until we feel good about how this one works.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1623).
* __->__ #1623
* #1622
* #1621
* #1620
2025-07-19 01:32:03 -04:00
Michael Bolin
11fd3123be chore: introduce OutgoingMessageSender (#1622)
Previous to this change, `MessageProcessor` had a
`tokio::sync::mpsc::Sender<JSONRPCMessage>` as an abstraction for server
code to send a message down to the MCP client. Because `Sender` is cheap
to `clone()`, it was straightforward to make it available to tasks
scheduled with `tokio::task::spawn()`.

This worked well when we were only sending notifications or responses
back down to the client, but we want to add support for sending
elicitations in #1623, which means that we need to be able to send
_requests_ to the client, and now we need a bit of centralization to
ensure all request ids are unique.

To that end, this PR introduces `OutgoingMessageSender`, which houses
the existing `Sender<OutgoingMessage>` as well as an `AtomicI64` to mint
out new, unique request ids. It has methods like `send_request()` and
`send_response()` so that callers do not have to deal with
`JSONRPCMessage` directly, as having to set the `jsonrpc` for each
message was a bit tedious (this cleans up `codex_tool_runner.rs` quite a
bit).

We do not have `OutgoingMessageSender` implement `Clone` because it is
important that the `AtomicI64` is shared across all users of
`OutgoingMessageSender`. As such, `Arc<OutgoingMessageSender>` must be
used instead, as it is frequently shared with new tokio tasks.

As part of this change, we update `message_processor.rs` to embrace
`await`, though we must be careful that no individual handler blocks the
main loop and prevents other messages from being handled.
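
A condensed sketch of the resulting shape (types simplified; assumes
`serde_json` for the payload, and the real implementation differs in detail):

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use tokio::sync::mpsc::Sender;

// Placeholder payload for the sketch; the real type wraps JSON-RPC params.
type Params = serde_json::Value;

pub enum OutgoingMessage {
    Request { id: i64, params: Params },
    // Response and Notification variants elided.
}

pub struct OutgoingMessageSender {
    next_request_id: AtomicI64,
    sender: Sender<OutgoingMessage>,
}

impl OutgoingMessageSender {
    pub fn new(sender: Sender<OutgoingMessage>) -> Self {
        Self { next_request_id: AtomicI64::new(0), sender }
    }

    /// Mint a unique id and send a request down to the MCP client.
    pub async fn send_request(&self, params: Params) -> i64 {
        // fetch_add keeps ids unique across concurrent tasks, which is why
        // this struct is shared via Arc rather than implementing Clone.
        let id = self.next_request_id.fetch_add(1, Ordering::SeqCst);
        let _ = self.sender.send(OutgoingMessage::Request { id, params }).await;
        id
    }
}
```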

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1622).
* #1623
* __->__ #1622
* #1621
* #1620
2025-07-19 00:30:56 -04:00
Michael Bolin
e78ec00e73 chore: support MCP schema 2025-06-18 (#1621)
This updates the schema in `generate_mcp_types.py` from `2025-03-26` to
`2025-06-18`, regenerates `mcp-types/src/lib.rs`, and then updates all
the code that uses `mcp-types` to honor the changes.

Ran

```
npx @modelcontextprotocol/inspector just codex mcp
```

and verified that I was able to invoke the `codex` tool, as expected.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1621).
* #1623
* #1622
* __->__ #1621
2025-07-19 00:09:34 -04:00
Michael Bolin
a06d4f58e4 chore: clean up generate_mcp_types.py so codegen matches existing output (#1620) 2025-07-18 21:40:39 -04:00
aibrahim-oai
83eefb55fb Add session loading support to Codex (#1602)
## Summary
- extend rollout format to store all session data in JSON
- add resume/write helpers for rollouts
- track session state after each conversation
- support `LoadSession` op to resume a previous rollout
- allow starting Codex with an existing session via
`experimental_resume` config variable

We still need a way to explore the available sessions in a
user-friendly way.

## Testing
- `cargo test --no-run` *(fails: `cargo: command not found`)*

------
https://chatgpt.com/codex/tasks/task_i_68792a29dd5c832190bf6930d3466fba

This video is outdated. You should use `-c experimental_resume:<full
path>` instead of `--resume <full path>`


https://github.com/user-attachments/assets/7a9975c7-aa04-4f4e-899a-9e87defd947a
2025-07-18 17:04:04 -07:00
aibrahim-oai
9846adeabf Refactor env settings into config (#1601)
## Summary
- add OpenAI retry and timeout fields to Config
- inject these settings in tests instead of mutating env vars
- plumb Config values through client and chat completions logic
- document new configuration options

## Testing
- `cargo test -p codex-core --no-run`

------
https://chatgpt.com/codex/tasks/task_i_68792c5b04cc832195c03050c8b6ea94

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-07-18 19:12:39 +00:00
aibrahim-oai
d5a2148deb Fix ctrl+c interrupt while streaming (#1617)
Interrupting while streaming is currently broken because we aren't
clearing the delta buffer.
2025-07-18 12:08:25 -07:00
Michael Bolin
cc874c9205 chore: use AtomicBool instead of Mutex<bool> (#1616) 2025-07-18 11:13:34 -07:00
pakrym-oai
6f2b01bb6b feat: ensure session ID header is sent in Response API request (#1614)
Include the current session id in Responses API requests.
2025-07-18 09:59:07 -07:00
aibrahim-oai
9cedeadf6a change the default debounce rate to 10ms (#1606)
changed the default debounce rate to 10ms because typing was laggy.

Before:


https://github.com/user-attachments/assets/e5d15fcb-6a2b-4837-b2b4-c3dcb4cc3409

After



https://github.com/user-attachments/assets/6f0005eb-fd49-4130-ba68-635ee0f2831f
2025-07-17 17:00:17 -07:00
pakrym-oai
327e2254f6 chore: rename toolchain file (#1604)
Rename toolchain file so older versions of cargo can pick it up.
2025-07-17 15:36:15 -07:00
Michael Bolin
e16657ca45 feat: add --json flag to codex exec (#1603)
This is designed to facilitate programmatic use of Codex in a more
lightweight way than using `codex mcp`.

Passing `--json` to `codex exec` will print each event as a line of JSON
to stdout. Note that it does not print individual tokens as they are
streamed, only full messages, as this is aimed at programmatic use
rather than powering a UI.

<img width="1348" height="1307" alt="image"
src="https://github.com/user-attachments/assets/fc7908de-b78d-46e4-a6ff-c85de28415c7"
/>

I changed the existing `EventProcessor` into a trait and moved the
implementation to `EventProcessorWithHumanOutput`. Then I introduced an
alternative implementation, `EventProcessorWithJsonOutput`. The `--json`
flag determines which implementation to use.
2025-07-17 15:10:15 -07:00
aibrahim-oai
bb30ab9e96 Implement redraw debounce (#1599)
## Summary
- debounce redraw events so repeated requests don't overwhelm the
terminal
- add `RequestRedraw` event and schedule redraws after 100ms
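
A sketch of the general debouncing pattern (illustrative, not the actual tui code):

```rust
use tokio::sync::mpsc::UnboundedReceiver;
use tokio::time::{sleep, Duration};

/// Coalesce bursts of RequestRedraw events into at most one redraw per window.
async fn debounce_redraws(mut rx: UnboundedReceiver<()>, mut redraw: impl FnMut()) {
    const DEBOUNCE: Duration = Duration::from_millis(100);
    while rx.recv().await.is_some() {
        // Wait out the debounce window, then drop any events that piled up.
        sleep(DEBOUNCE).await;
        while rx.try_recv().is_ok() {}
        redraw();
    }
}
```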

## Testing
- `cargo clippy --tests`
- `cargo test` *(fails: Sandbox Denied errors in landlock tests)*

------
https://chatgpt.com/codex/tasks/task_i_68792a65b8b483218ec90a8f68746cd8

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-07-17 12:54:55 -07:00
pakrym-oai
6949329a7f chore: auto format code on save and add more details to AGENTS.md (#1582)
Adds a default vscode config with generally applicable settings.
Adds more entrypoints to the justfile, both for environment setup and to
help agents better verify changes.
2025-07-17 11:40:00 -07:00
pakrym-oai
b95a010e86 fix: trim MCP tool names to fit into tool name length limit (#1571)
Store fully qualified names along with tool entries so we don't have to re-parse them.

Fixes: https://github.com/openai/codex/issues/1289
2025-07-17 11:35:38 -07:00
aibrahim-oai
fcbcc40f51 Storing the sessions in a more organized way for easier lookup. (#1596)
now storing the sessions in `~/.codex/sessions/YYYY/MM/DD/<file>`
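
A sketch of the path construction (assuming the `chrono` crate; names are illustrative):

```rust
use std::path::{Path, PathBuf};

use chrono::Datelike;

/// ~/.codex/sessions/YYYY/MM/DD/<file>
fn session_path(codex_home: &Path, file: &str) -> PathBuf {
    let now = chrono::Local::now();
    codex_home
        .join("sessions")
        .join(format!("{:04}", now.year()))
        .join(format!("{:02}", now.month()))
        .join(format!("{:02}", now.day()))
        .join(file)
}
```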
2025-07-17 10:12:15 -07:00
aibrahim-oai
643ab1f582 Add streaming to exec and tui (#1594)
Added support for streaming in `tui`
Added support for streaming in `exec`


https://github.com/user-attachments/assets/4215892e-d940-452c-a1d0-416ed0cf14eb
2025-07-16 22:26:31 -07:00
Michael Bolin
d3dbc10479 fix: update bin/codex.js so it listens for exit on the child process (#1590)
When Codex CLI is installed via `npm`, we use a `.js` wrapper script to
launch the Rust binary.

- Previously, we were not listening for signals to ensure that killing
the Node.js process would also kill the underlying Rust process.
- We also did not have a proper `exit` handler in place on the child
process to ensure we exited from the Node.js process.

This PR fixes these things and hopefully addresses
https://github.com/openai/codex/issues/1570.

This also adds logic so that Windows falls back to the TypeScript CLI
again, which should address https://github.com/openai/codex/issues/1573.
2025-07-16 16:35:29 -07:00
Preet 🚀
0bc7ee9193 Added mcp-server name validation (#1591)
This PR implements server name validation for MCP (Model Context
Protocol) servers to ensure they conform to the required pattern
`^[a-zA-Z0-9_-]+$`. This addresses the TODO comment in
`mcp_connection_manager.rs:82`.

+ Added validation before spawning MCP client tasks
+ Invalid server names are added to errors map with descriptive messages
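
A minimal sketch of that rule (the function name is illustrative):

```rust
/// True if `name` matches ^[a-zA-Z0-9_-]+$.
fn is_valid_mcp_server_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```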

I have read the CLA Document and I hereby sign the CLA

---------

Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-07-16 16:00:39 -07:00
aibrahim-oai
2bd3314886 support deltas in core (#1587)
- Added support for message and reasoning deltas
- Skipped adding support in the CLI and TUI for later
- Commented out a failing test (wrong merge) that needs a fix in a
separate PR.

Side note: I think we need to disable merging when CI doesn't pass.
2025-07-16 15:11:18 -07:00
Michael Bolin
5b820c5ce7 feat: ctrl-d only exits when there is no user input (#1589)
While this does make it so that `ctrl-d` will not exit Codex when the
composer is not empty, `ctrl-d` will still exit Codex if it is in the
"working" state.

Fixes https://github.com/openai/codex/issues/1443.
2025-07-16 08:59:26 -07:00
aibrahim-oai
f14b5adabf Add SSE Response parser tests (#1541)
## Summary
- add `tokio-test` dev dependency
- implement response stream parsing unit tests

## Testing
- `cargo clippy -p codex-core --tests -- -D warnings`
- `cargo test -p codex-core -- --nocapture`

------
https://chatgpt.com/codex/tasks/task_i_687163f3b2208321a6ce2adbef3fbc06
2025-07-14 14:51:32 -07:00
Michael Bolin
9c0b413fd1 docs: clarify the build process for the npm release (#1568)
It appears that `0.5.0` was built with `stage_release.sh` instead of
`stage_rust_release.py`, so add docs to clarify this and recommend
running `--version` on the release candidate to verify the right thing
was built.
2025-07-14 09:41:11 -07:00
aibrahim-oai
3777e18243 Add CLI streaming integration tests (#1542)
## Summary
- add integration test for chat mode streaming via CLI using wiremock
- add integration test for Responses API streaming via fixture
- call `cargo run` to invoke the CLI during tests

## Testing
- `cargo test -p codex-core --test cli_stream -- --nocapture`
- `cargo clippy --all-targets --all-features -- -D warnings`


------
https://chatgpt.com/codex/tasks/task_i_68715980bbec8321999534fdd6a013c1
2025-07-12 18:05:58 -07:00
aibrahim-oai
0f8ac92390 Allow deadcode in test_support (#1555)
#1546 was pushed without passing the clippy integration tests. This
fixes it.
2025-07-12 17:20:35 -07:00
aibrahim-oai
c46bb67d77 Improve SSE tests (#1546)
## Summary
- support fixture-based SSE data in tests
- add helpers to load SSE JSON fixtures
- add table-driven SSE unit tests
- let integration tests use fixture loading
- fix clippy errors from format! calls

## Testing
- `cargo clippy --tests`
- `cargo test --workspace --exclude codex-linux-sandbox`


------
https://chatgpt.com/codex/tasks/task_i_68717468c3e48321b51c9ecac6ba0f09
2025-07-12 16:53:55 -07:00
Michael Bolin
94f5cad895 fix: when invoking Codex via MCP, use the request id as the Submission id (#1554)
Small quality-of-life improvement when using `codex mcp`.
2025-07-12 16:22:02 -07:00
aibrahim-oai
72504f1d9c Add paste summarization to Codex TUI (#1549)
## Summary
- introduce `Paste` event to avoid per-character paste handling
- collapse large pasted blocks to `[Pasted Content X lines]`
- store the real text so submission still includes it
- wire paste handling through `App`, `ChatWidget`, `BottomPane`, and
`ChatComposer`
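
A sketch of the collapsing rule (threshold and names are illustrative):

```rust
/// Display text for a paste: large pastes collapse to a placeholder while
/// the real text is kept separately for submission.
fn display_for_paste(pasted: &str, collapse_after_lines: usize) -> String {
    let lines = pasted.lines().count();
    if lines > collapse_after_lines {
        format!("[Pasted Content {lines} lines]")
    } else {
        pasted.to_string()
    }
}
```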

## Testing
- `cargo test -p codex-tui`


------
https://chatgpt.com/codex/tasks/task_i_6871e24abf80832184d1f3ca0c61a5ee


https://github.com/user-attachments/assets/eda7412f-da30-4474-9f7c-96b49d48fbf8
2025-07-12 15:32:00 -07:00
dependabot[bot]
fa6d507c51 chore(deps-dev): bump @types/bun from 1.2.13 to 1.2.18 in /.github/actions/codex (#1509)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/bun&package-manager=bun&previous-version=1.2.13&new-version=1.2.18)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-12 10:29:37 -07:00
dependabot[bot]
a52a2fe7a9 chore(deps-dev): bump @types/node from 22.15.21 to 24.0.12 in /.github/actions/codex (#1507)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/node&package-manager=bun&previous-version=22.15.21&new-version=24.0.12)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-12 09:56:54 -07:00
Gabriel Peal
bfeb8c92a5 Add codex apply to apply a patch created from the Codex remote agent (#1528)
In order to do this, I created a new `chatgpt` crate where we can put
any code that interacts directly with ChatGPT as opposed to the OpenAI
API. I added a disclaimer to the README for it that it should primarily
be modified by OpenAI employees.


https://github.com/user-attachments/assets/bb978e33-d2c9-4d8e-af28-c8c25b1988e8
2025-07-11 13:30:11 -04:00
Michael Bolin
9e58076cf5 chore: read model field off of Config instead of maintaining the parallel field (#1525)
https://github.com/openai/codex/pull/1524 introduced the new `config`
field on `ModelClient`, so this does the post-PR cleanup to remove the
now-unnecessary `model` field.
2025-07-10 14:37:04 -07:00
Michael Bolin
8a424fcfa3 feat: add new config option: model_supports_reasoning_summaries (#1524)
As noted in the updated docs, this makes it so that you can set:

```toml
model_supports_reasoning_summaries = true
```

as a way of overriding the existing heuristic for when to set the
`reasoning` field on a sampling request:


341c091c5b/codex-rs/core/src/client_common.rs (L152-L166)
2025-07-10 14:30:33 -07:00
dependabot[bot]
341c091c5b chore(deps-dev): bump prettier from 3.5.3 to 3.6.2 in /.github/actions/codex (#1508)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=prettier&package-manager=bun&previous-version=3.5.3&new-version=3.6.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:13:59 -07:00
dependabot[bot]
6b1e4a6846 chore(deps): bump node from 22-slim to 24-slim in /codex-cli (#1505)
Bumps node from 22-slim to 24-slim.


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=node&package-manager=docker&previous-version=22-slim&new-version=24-slim)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:11:44 -07:00
dependabot[bot]
75fa65e054 chore(deps): bump toml from 0.9.0 to 0.9.1 in /codex-rs (#1514)
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.0 to 0.9.1.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8c8ef44ea1"><code>8c8ef44</code></a>
chore: Release</li>
<li><a
href="b60ac5bfe9"><code>b60ac5b</code></a>
fix(toml): Correct minimal version for indexmap (<a
href="https://redirect.github.com/toml-rs/toml/issues/998">#998</a>)</li>
<li><a
href="966bd40511"><code>966bd40</code></a>
fix(toml): Correct minimal version for indexmap</li>
<li><a
href="2ed2af6519"><code>2ed2af6</code></a>
docs(readme): Mention additional crates</li>
<li>See full diff in <a
href="https://github.com/toml-rs/toml/compare/toml-v0.9.0...toml-v0.9.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=toml&package-manager=cargo&previous-version=0.9.0&new-version=0.9.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 11:34:37 -07:00
Michael Bolin
16eafd02ad fix: remove reference to /compact until it is implemented (#1503)
Do not mention `/compact` until
https://github.com/openai/codex/issues/1257 is addressed.
2025-07-10 11:23:35 -07:00
Michael Bolin
c8051b906f chore: drop codex-cli from dependabot (#1523)
We are not actively developing `codex-cli`, so I would rather leave the
existing `pnpm-lock.yaml` files as-is.
2025-07-10 11:23:24 -07:00
Rene Leonhardt
82b0cebe8b chore(rs): update dependencies (#1494)
### Chores
- Update cargo dependencies
- Remove unused cargo dependencies
- Fix clippy warnings
- Update Dockerfile (package.json requires node 22)
- Let Dependabot update bun, cargo, devcontainers, docker,
github-actions, npm (nix still not supported)

### TODO
- Upgrade dependencies with breaking changes

```shell
$ cargo update --verbose
   Unchanged crossterm v0.28.1 (available: v0.29.0)
   Unchanged schemars v0.8.22 (available: v1.0.4)
```
2025-07-10 11:08:16 -07:00
pchuri
3a23a86f4b Add Android platform support for Codex CLI (#1488)
## Summary
Add Android platform support to Codex CLI

## What?
- Added `android` to the list of supported platforms in
`codex-cli/bin/codex.js`
- Treats Android as Linux for binary compatibility

## Why?
- Fixes "Unsupported platform: android (arm64)" error on Termux
- Enables Codex CLI usage on Android devices via Termux
- Improves platform compatibility without affecting other platforms

## How?
- Modified the platform detection switch statement to include `case
"android":`
- Android falls through to the same logic as Linux, using appropriate
ARM64 binaries
- Minimal change with no breaking effects on existing functionality

## Testing
- Tested on Android/Termux environment
- Verified the fix resolves the platform detection error
- Confirmed no impact on other platforms

## Related Issues
Fixes the "Unsupported platform: android (arm64)" error reported by
Termux users
2025-07-09 22:06:55 -07:00
Michael Bolin
268267b59e fix: the completion subcommand should assume the CLI is named codex, not codex-cli (#1496)
Current 0.4.0 release:

```
~/code/codex2/codex-rs$ codex completion | head
_codex-cli() {
    local i cur prev opts cmd
    COMPREPLY=()
    if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
        cur="$2"
    else
        cur="${COMP_WORDS[COMP_CWORD]}"
    fi
    prev="$3"
    cmd=""
```

with this change:

```
~/code/codex2/codex-rs$ just codex completion | head
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.82s
     Running `target/debug/codex completion`
_codex() {
    local i cur prev opts cmd
    COMPREPLY=()
    if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
        cur="$2"
    else
        cur="${COMP_WORDS[COMP_CWORD]}"
    fi
    prev="$3"
    cmd=""
```
2025-07-09 14:08:35 -07:00
Michael Bolin
4a15ebc1ca feat: add codex completion to generate shell completions (#1491)
Once this lands, we can update our brew formula to use
`generate_completions_from_executable()` like so:


905238ff7f/Formula/h/hgrep.rb (L21-L25)
2025-07-08 21:43:27 -07:00
Michael Bolin
8d35ad0ef7 feat: honor OPENAI_BASE_URL for the built-in openai provider (#1487)
Some users have proxies or other setups where they are ultimately
hitting OpenAI endpoints, but need a custom `base_url` rather than the
default value of `"https://api.openai.com/v1"`. This PR makes it
possible to override the `base_url` for the `openai` provider via the
`OPENAI_BASE_URL` environment variable.
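
The override amounts to something like this sketch:

```rust
/// Base URL for the built-in `openai` provider, honoring OPENAI_BASE_URL.
fn openai_base_url() -> String {
    std::env::var("OPENAI_BASE_URL")
        .unwrap_or_else(|_| "https://api.openai.com/v1".to_string())
}
```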
2025-07-08 12:39:52 -07:00
Michael Bolin
cc58f1086d docs: document support for model_reasoning_effort and model_reasoning_summary in profiles (#1486)
Documents the new functionality added in
https://github.com/openai/codex/pull/1484.
2025-07-08 12:26:05 -07:00
Yusuf Eren
e444a50cf0 feat: add reasoning fields to profile settings (#1484) 2025-07-08 12:05:22 -07:00
Michael Bolin
f80fc86f18 chore: default to the latest version of the Codex CLI in the GitHub Action (#1485)
Now we no longer have to update the default value of `codex_release_tag`
in the GitHub Action going forward.
2025-07-08 12:00:13 -07:00
Michael Bolin
0b9cb2b9e7 chore: create a release script for the Rust CLI (#1479)
This is a stopgap solution before migrating the build for the npm
release to GitHub Actions (which is ultimately what should be done to
ensure hermetic builds).

The idea is that instead of continuing to create PRs like
https://github.com/openai/codex/pull/1472 where I have to check in a
change to the `WORKFLOW_URL`, this script uses `gh run list` to get the
`WORKFLOW_URL` dynamically and then threads the value through to
`install_native_deps.sh`.

To create the 0.3.0 release on npm, I ran:

```shell
./codex-cli/scripts/stage_rust_release.py --release-version 0.3.0
```

and then did `npm publish --dry-run` followed by `npm publish` in the
temp directory created by `stage_rust_release.py`.
2025-07-07 23:51:34 -07:00
Michael Bolin
e0c08cea4f feat: add support for --sandbox flag (#1476)
On a high-level, we try to design `config.toml` so that you don't have
to "comment out a lot of stuff" when testing different options.

Previously, defining a sandbox policy was somewhat at odds with this
principle because you would define the policy as attributes of
`[sandbox]` like so:

```toml
[sandbox]
mode = "workspace-write"
writable_roots = [ "/tmp" ]
```

but if you wanted to temporarily change to a read-only sandbox, you
might feel compelled to modify your file to be:

```toml
[sandbox]
mode = "read-only"
# mode = "workspace-write"
# writable_roots = [ "/tmp" ]
```

Technically, commenting out `writable_roots` would not be strictly
necessary, as `mode = "read-only"` would ignore `writable_roots`, but
it's still a reasonable thing to do to keep things tidy.

Currently, the various values for `mode` do not support that many
attributes, so this is not that hard to maintain, but one could imagine
this becoming more complex in the future.

In this PR, we change Codex CLI so that it no longer recognizes
`[sandbox]`. Instead, it introduces a top-level option, `sandbox_mode`,
and `[sandbox_workspace_write]` is used to further configure the sandbox
when `sandbox_mode = "workspace-write"` is used:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
writable_roots = [ "/tmp" ]
```

This feels a bit more future-proof in that it is less tedious to
configure different sandboxes:

```toml
sandbox_mode = "workspace-write"

[sandbox_read_only]
# read-only options here...

[sandbox_workspace_write]
writable_roots = [ "/tmp" ]

[sandbox_danger_full_access]
# danger-full-access options here...
```

In this scheme, you never need to comment out the configuration for an
individual sandbox type: you only need to redefine `sandbox_mode`.

Relatedly, previous to this change, a user had to do `-c
sandbox.mode=read-only` to change the mode on the command line. With
this change, things are arguably a bit cleaner because the equivalent
option is `-c sandbox_mode=read-only` (and now `-c
sandbox_workspace_write=...` can be set separately).

Though more importantly, we introduce the `-s/--sandbox` option to the
CLI, which maps directly to `sandbox_mode` in `config.toml`, making
config override behavior easier to reason about. Moreover, as you can
see in the updates to the various Markdown files, it is much easier to
explain how to configure sandboxing when things like `--sandbox
read-only` can be used as an example.

Relatedly, this cleanup also made it straightforward to add support for
a `sandbox` option for Codex when used as an MCP server (see the changes
to `mcp-server/src/codex_tool_config.rs`).

Fixes https://github.com/openai/codex/issues/1248.
2025-07-07 22:31:30 -07:00
Michael Bolin
0a44c42533 docs: update README to include npm install again (#1475)
v0.2.0 of https://www.npmjs.com/package/@openai/codex now runs the Rust
CLI, so it makes sense to bring back the instructions to use `npm i -g
@openai/codex`.

In most places, I list `npm install` before `brew install` because I
believe `npm` is more readily available, though in the more detailed
part of the documentation, I note that `brew install` will download
fewer bytes and, in that sense, is preferred.
2025-07-07 17:44:26 -07:00
Michael Bolin
a9bed68947 chore: normalize repository.url in package.json (#1474)
I got this as a warning when doing `npm publish --dry-run`, so I ran
`npm pkg fix` to create this PR, as instructed.
2025-07-07 16:33:06 -07:00
ryozi
fd67a0086c Fix Unicode handling in chat_composer "@" token detection (#1467)
## Issues Fixed

- **Primary Issue (#1450)**: Unicode cursor positioning was incorrect
due to mixing character positions with byte positions
- **Additional Issue**: Full-width spaces (CJK whitespace like " ")
weren't properly handled as token boundaries
- ref:
https://doc.rust-lang.org/std/primitive.char.html#method.is_whitespace

---------

Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-07-07 13:43:31 -07:00
Michael Bolin
c221eab0b5 feat: support custom HTTP headers for model providers (#1473)
This adds support for two new model provider config options:

- `http_headers` for hardcoded (key, value) pairs
- `env_http_headers` for headers whose values should be read from
environment variables
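
A sketch of how the env-driven headers might be resolved (illustrative, not the actual code):

```rust
use std::collections::HashMap;

/// Resolve env_http_headers into concrete (header, value) pairs, skipping
/// any header whose environment variable is unset.
fn resolve_env_http_headers(env_http_headers: &HashMap<String, String>) -> Vec<(String, String)> {
    env_http_headers
        .iter()
        .filter_map(|(header, env_var)| {
            std::env::var(env_var).ok().map(|value| (header.clone(), value))
        })
        .collect()
}
```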

This also updates the built-in `openai` provider to use this feature to
set the following headers:

- `originator` => `codex_cli_rs`
- `version` => [CLI version]
- `OpenAI-Organization` => `OPENAI_ORGANIZATION` env var
- `OpenAI-Project` => `OPENAI_PROJECT` env var

for consistency with the TypeScript implementation:


bd5a9e8ba9/codex-cli/src/utils/agent/agent-loop.ts (L321-L329)

While here, this also consolidates some logic that was duplicated across
`client.rs` and `chat_completions.rs` by introducing
`ModelProviderInfo.create_request_builder()`.

Resolves https://github.com/openai/codex/discussions/1152
2025-07-07 13:09:16 -07:00
Michael Bolin
bd5a9e8ba9 chore: update release scripts for the TypeScript CLI (#1472)
This introduces two changes to make a quick fix so we can deploy the
Rust CLI for `0.2.0` of `@openai/codex` on npm:

- Updates `WORKFLOW_URL` to point to
https://github.com/openai/codex/actions/runs/15981617627, which is the
GitHub workflow run used to create the binaries for the `0.2.0` release
we published to Homebrew.
- Adds a `--version` option to `stage_release.sh` to specify what the
`version` field in the `package.json` will be.

Locally, I ran the following:

```
./codex-cli/scripts/stage_release.sh --native --version 0.2.0
```

Previously, we only used the `--native` flag to publish to the `native`
tag of `@openai/codex` (e.g., `npm publish --tag native`), but we should
just publish this as the default tag for `0.2.0` to be consistent with
what is in Homebrew.

We can still publish one "final" version of the TypeScript CLI as 0.1.x
later.

Under the hood, this release will still contain `dist/cli.js`,
`bin/codex-linux-sandbox-x64`, and `bin/codex-x86_64-apple-darwin`,
which are not strictly necessary, but we'll fix that in `0.3.0`.
2025-07-07 09:43:03 -07:00
Michael Bolin
abcca30d93 docs: update documentation to reflect Rust CLI release (#1440)
As promised on https://github.com/openai/codex/discussions/1405, we are
making the first official release of the Rust CLI as v0.2.0. As part of
this move, we are making it available in Homebrew:

https://github.com/Homebrew/homebrew-core/pull/228615

Ultimately, we also plan to continue making the CLI available on npm,
though brew is a bit nicer in that `brew install` will download
only the binary for your platform, whereas an npm module is expected to
contain the binaries for _all_ supported platforms, so it is a bit more
heavyweight.

A big part of this change is updating the root `README.md` to document
the behavior of the Rust CLI, which differs in a number of ways from the
TypeScript CLI. The existing `README.md` is moved to
`codex-cli/README.md` as part of this PR, as it is still applicable to
that folder.

As this is still early days for the Rust CLI, I encourage folks to
provide feedback on the command line flags and configuration options.
2025-07-01 15:00:31 -07:00
Michael Bolin
4cb3c76798 fix: softprops/action-gh-release@v2 should use existing tag instead of creating a new tag (#1436)
https://github.com/Homebrew/homebrew-core/pull/228521 details the issues
I was having with the **Source code (tar.gz)** artifact for our GitHub
releases not being quite right. I landed these PRs as stabs in the dark
to fix this:

- https://github.com/openai/codex/pull/1423
- https://github.com/openai/codex/pull/1430

Based on the insights from
https://github.com/Homebrew/homebrew-core/pull/228521, I think those
were wrong and the real problem was this:


6dad5c3b17/.github/workflows/rust-release.yml (L162)

That is, I was manufacturing a new tag name on the fly instead of using
the existing one.

This PR reverts #1423 and #1430 and hopefully fixes how `tag_name` is
set for the `softprops/action-gh-release@v2` step so the **Source code
(tar.gz)** includes the correct files. Assuming this works, this should
make the Homebrew formula straightforward.
2025-06-30 12:10:48 -07:00
Michael Bolin
6dad5c3b17 feat: add query_params option to ModelProviderInfo to support Azure (#1435)
As discovered in https://github.com/openai/codex/issues/1365, the Azure
provider needs to be able to specify `api-version` as a query param, so
this PR introduces a generic `query_params` option to the
`model_providers` config so that an Azure provider can be defined as
follows:

```toml
[model_providers.azure]
name = "Azure"
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
query_params = { api-version = "2025-04-01-preview" }
```

This PR also updates the docs with this example.

While here, we also update `wire_api` to default to `"chat"`, as that is
likely the common case for someone defining an external provider.

Fixes https://github.com/openai/codex/issues/1365.
2025-06-30 11:39:54 -07:00
Michael Bolin
cd2d84d496 fix: need to check out the branch, not the tag (#1430)
This should have been done in https://github.com/openai/codex/pull/1423.
2025-06-29 10:18:50 -07:00
Michael Bolin
688100f7f4 chore: fix Rust release process so generated .tar.gz source works with Homebrew (#1423)
Looking at existing releases such as
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4,
the `.tar.gz` for the source code still seems to have `0.0.0` as the
`version` in `codex-rs/Cargo.toml` instead of what the tag seems to say
it should have:


b289c92070/codex-rs/Cargo.toml (L21)

ChatGPT claims:

> When GitHub generates the Source code (tar.gz) archive for a tag:
> - It uses the commit the tag points to.
> - But in some cases (e.g., shallow clones, GitHub CI, or local tools
> that only clone the default branch), that commit may not be included,
> and you might get an outdated view or nothing at all depending on how
> it's fetched.

Trying this recommended fix.
2025-06-28 19:46:44 -07:00
Michael Bolin
f30bf4bbcf fix: support pre-release identifiers in tags (#1422)
Had to update the regex in the GitHub workflow to allow suffixes like
`-alpha.4`.

Successfully ran:

```
./scripts/create_github_release.sh 0.1.0-alpha.4
```

to create
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4

and verified that when I run `codex --version`, it prints `codex-cli
0.1.0-alpha.4`.
2025-06-28 16:05:53 -07:00
Michael Bolin
1b7c8d2569 fix: build with codegen-units = 1 for profile.release (#1421)
Great suggestion from @zamazan4ik on
https://github.com/openai/codex/issues/1411.
2025-06-28 15:24:48 -07:00
Michael Bolin
4a341efe92 feat: highlight matching characters in fuzzy file search (#1420)
Using the new file-search API introduced in
https://github.com/openai/codex/pull/1419, matching characters are now
shown in bold in the TUI:


https://github.com/user-attachments/assets/8bbcc6c6-75a3-493f-8ea4-b2a063e09b3a

Fixes https://github.com/openai/codex/issues/1261
2025-06-28 15:04:23 -07:00
Michael Bolin
e2efe8da9c feat: introduce --compute-indices flag to codex-file-search (#1419)
This is a small quality-of-life feature, the addition of
`--compute-indices` to the CLI, which, if enabled, will compute and set
the `indices` field for each `FileMatch` returned by `run()`. Note we
only bother to compute `indices` once we have the top N results because
there could be a lot of intermediate "top N" results during the search
that are ultimately discarded.

When set, the indices are included in the JSON output when `--json` is
specified and the matching indices are displayed in bold when `--json`
is not specified.
2025-06-28 14:39:29 -07:00
Michael Bolin
5a0f236ca4 feat: add support for @ to do file search (#1401)
Introduces support for `@` to trigger a fuzzy-filename search in the
composer. Under the hood, this leverages
https://crates.io/crates/nucleo-matcher to do the fuzzy matching and
https://crates.io/crates/ignore to build up the list of file candidates
(so that it respects `.gitignore`).

For simplicity (at least for now), we do not do any caching between
searches like VS Code does for its file search:


1d89ed699b/src/vs/workbench/services/search/node/rawSearchService.ts (L212-L218)

Because we do not do any caching, I saw queries take up to three seconds
on large repositories with hundreds of thousands of files. To that end,
we do not perform searches synchronously on each keystroke, but instead
dispatch an event to do the search on a background thread that
asynchronously reports back to the UI when the results are available.
This is largely handled by the `FileSearchManager` introduced in this
PR, which also has logic for debouncing requests so there is at most one
search in flight at a time.

While we could potentially polish and tune this feature further, it may
already be overengineered for how it will be used, in practice, so we
can improve things going forward if it turns out that this is not "good
enough" in the wild.

Note this feature does not work like `@` in the TypeScript CLI, which
was more like directory-based tab completion. In the Rust CLI, `@`
triggers a full-repo fuzzy-filename search.

Fixes https://github.com/openai/codex/issues/1261.
2025-06-28 13:47:42 -07:00
Michael Bolin
ff8ae1ffa1 feat: make file search cancellable (#1414)
Update `run()` to take `cancel_flag: Arc<AtomicBool>` that the worker
threads will periodically check to see if it is `true`, exiting early
(and returning empty results) if so.
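
A sketch of the worker-side check (illustrative):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Scan candidates, bailing out early (with empty results) once cancelled.
fn search_worker(query: &str, candidates: &[String], cancel_flag: &AtomicBool) -> Vec<String> {
    let mut matches = Vec::new();
    for (i, candidate) in candidates.iter().enumerate() {
        // Check periodically rather than per item to keep the hot loop cheap.
        if i % 1024 == 0 && cancel_flag.load(Ordering::Relaxed) {
            return Vec::new();
        }
        if candidate.contains(query) {
            matches.push(candidate.clone());
        }
    }
    matches
}
```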
2025-06-27 20:01:45 -07:00
Michael Bolin
b3ad764532 chore: change arg from PathBuf to &Path (#1409)
Caller no longer needs to clone a `PathBuf`: can just pass `&Path`.
2025-06-27 16:24:41 -07:00
Michael Bolin
a331a67b3e chore: change built_in_model_providers so "openai" is the only "bundled" provider (#1407)
As we are [close to releasing the Rust CLI
beta](https://github.com/openai/codex/discussions/1405), for the moment,
let's take a more neutral stance on what it takes to be a "built-in"
provider.

* For example, there seems to be a discrepancy around what the "right"
configuration for Gemini is: https://github.com/openai/codex/pull/881
* And while the current list of "built-in" providers are all arguably
"well-known" names, this raises a question of what to do about
potentially less familiar providers, such as
https://github.com/openai/codex/pull/1142. Do we just accept every pull
request like this, or is there some criteria a provider has to meet to
"qualify" to be bundled with Codex CLI?

I think that if we can establish clear ground rules for being a built-in
provider, then we can bring this back. But until then, I would rather
take a minimalist approach because if we decided to reverse our position
later, it would break folks who were depending on the presence of the
built-in providers.
2025-06-27 14:49:55 -07:00
Gabriel Peal
2e293ce903 Handle Ctrl+C quit when idle (#1402)
## Summary
- show `Ctrl+C to quit` hint when pressing Ctrl+C with no active task
- exit with Ctrl+C if the hint is already visible
- clear the hint when tasks begin or other keys are pressed


https://github.com/user-attachments/assets/931e2d7c-1c80-4b45-9908-d119f74df23c



------
https://chatgpt.com/s/cd_685ec8875a308191beaa95886dc1379e

Fixes #1245
2025-06-27 13:37:11 -04:00
Michael Bolin
64feeb3803 fix: add tiebreaker logic for paths when scores are equal (#1400)
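A sketch of such a comparator (field layout is illustrative):

```rust
// Sort (score, path) matches by descending score, breaking ties by path
// so that equal-score results come back in a stable, predictable order.
fn sort_matches(matches: &mut Vec<(i64, String)>) {
    matches.sort_by(|(score_a, path_a), (score_b, path_b)| {
        score_b.cmp(score_a).then_with(|| path_a.cmp(path_b))
    });
}
```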
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1400).
* #1401
* __->__ #1400
2025-06-26 23:05:10 -07:00
Michael Bolin
fa0e17f83a feat: add support for /diff command (#1389)
Adds support for a `/diff` command comparable to the one available in
the TypeScript CLI.

<img width="1103" alt="Screenshot 2025-06-26 at 12 31 33 PM"
src="https://github.com/user-attachments/assets/5dc646ca-301f-41ff-92a7-595c68db64b6"
/>

While here, changed the `SlashCommand` enum so the declared variant
order is the order the commands appear in the popup menu. This way,
`/toggle-mouse-mode` is listed last, as it is the least likely to be
used.

Fixes https://github.com/openai/codex/issues/1253.
2025-06-26 13:03:31 -07:00
Gabriel Peal
a339a7bcce [Rust] Allow resuming a session that was killed with ctrl + c (#1387)
Previously, if you ctrl+c'd a conversation, all subsequent turns would
400 because the Responses API never got a response for one of its call
ids. This ensures that if we aren't sending a call id by hand, we
generate a synthetic aborted call.

Fixes #1244 


https://github.com/user-attachments/assets/5126354f-b970-45f5-8c65-f811bca8294a
2025-06-26 14:40:42 -04:00
Michael Bolin
fcfe43c7df feat: show number of tokens remaining in UI (#1388)
When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.

We then introduce a new event type in `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.

Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer:

![Screenshot 2025-06-25 at 11 20
55 PM](https://github.com/user-attachments/assets/6fd6982f-7247-4f14-84b2-2e600cb1fd49)
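
The bookkeeping amounts to something like this sketch (names are illustrative):

```rust
/// Percentage of the context window still available, given a running total
/// of tokens used and the model's context_window.
fn percent_context_remaining(tokens_used: u64, context_window: u64) -> u8 {
    if tokens_used >= context_window {
        return 0;
    }
    let remaining = context_window - tokens_used;
    ((remaining * 100) / context_window) as u8
}
```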

We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.

Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:


296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)

Fixes https://github.com/openai/codex/issues/1242
2025-06-25 23:31:11 -07:00
Michael Bolin
296996d74e feat: standalone file search CLI (#1386)
Standalone fuzzy filename search library that should be helpful in
addressing https://github.com/openai/codex/issues/1261.
2025-06-25 13:29:03 -07:00
Michael Bolin
50924101d2 feat: add --dangerously-bypass-approvals-and-sandbox (#1384)
This PR reworks `assess_command_safety()` so that the combination of
`AskForApproval::Never` and `SandboxPolicy::DangerFullAccess` ensures
that commands are run without _any_ sandbox and the user should never be
prompted. In turn, it adds support for a new
`--dangerously-bypass-approvals-and-sandbox` flag (that cannot be used
with `--approval-policy` or `--full-auto`) that sets both of those
options.

Fixes https://github.com/openai/codex/issues/1254
2025-06-25 12:36:10 -07:00
Michael Bolin
72082164c1 chore: rename AskForApproval::UnlessAllowListed to AskForApproval::UnlessTrusted (#1385)
We could just rename to `Untrusted` instead of `UnlessTrusted`, but I
think `AskForApproval::UnlessTrusted` reads a bit better.
2025-06-25 12:26:13 -07:00
Michael Bolin
e09691337d chore: improve docstring for --full-auto (#1379)
Reference `-c sandbox.mode=workspace-write` in the docstring so users
can read the config docs for `sandbox` for more information.
2025-06-25 09:13:36 -07:00
Michael Bolin
86d5a9d80d chore: rename unless-allow-listed to untrusted (#1378)
For the `approval_policy` config option, renames `unless-allow-listed`
to `untrusted`. In general, when it comes to exec'ing commands, I think
"trusted" is a more accurate term than "safe."

Also drops the `AskForApproval::AutoEdit` variant, as we were not really
making use of it, anyway.

Fixes https://github.com/openai/codex/issues/1250.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1378).
* #1379
* __->__ #1378
2025-06-24 22:19:21 -07:00
Michael Bolin
531ce7626f fix: pretty-print the sandbox config in the TUI/exec modes (#1376)
Now that https://github.com/openai/codex/pull/1373 simplified the
sandbox config, we can print something much simpler in the TUI (and in
`codex exec`) to summarize the sandbox config.

Before:

![Screenshot 2025-06-24 at 5 45
52 PM](https://github.com/user-attachments/assets/b7633efb-a619-43e1-9abe-7bb0be2d0ec0)

With this change:

![Screenshot 2025-06-24 at 5 46
44 PM](https://github.com/user-attachments/assets/8d099bdd-a429-4796-a08d-70931d984e4f)

For reference, my `config.toml` contains:

```
[sandbox]
mode = "workspace-write"
writable_roots = ["/tmp", "/Users/mbolin/.pyenv/shims"]
```

Fixes https://github.com/openai/codex/issues/1248
2025-06-24 17:48:51 -07:00
Michael Bolin
63363a54e5 chore: install just in the devcontainer for Linux development (#1375)
Apparently `just` was added to `apt` in Ubuntu 24, so this required
updating the Ubuntu version in the `Dockerfile` to make it so we could
simply `apt install just`.

That change caused a conflict with the custom `dev` user we were using,
but the end result seems simpler since we now just use the default
`ubuntu` user provided by Ubuntu 24.
2025-06-24 17:20:53 -07:00
Michael Bolin
6d65010aad chore: install clippy and rustfmt in the devcontainer for Linux development (#1374)
I discovered it was difficult to do development in the devcontainer
without these tools available.
2025-06-24 17:05:36 -07:00
Michael Bolin
0776d78357 feat: redesign sandbox config (#1373)
This is a major redesign of how sandbox configuration works and aims to
fix https://github.com/openai/codex/issues/1248. Specifically, it
replaces `sandbox_permissions` in `config.toml` (and the
`-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
three variants:

```toml
# Safest option: the entire disk is readable, but writes and network access are disallowed.
[sandbox]
mode = "read-only"

# The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
# writable_roots can be used to specify additional writable folders.
[sandbox]
mode = "workspace-write"
writable_roots = []  # Optional, defaults to the empty list.
network_access = false  # Optional, defaults to false.

# Disable sandboxing: use at your own risk!!!
[sandbox]
mode = "danger-full-access"
```

This should make sandboxing easier to reason about. While we have
dropped support for `-s`, the way it works now is:

- no flags => `read-only`
- `--full-auto` => `workspace-write`
- currently, there is no way to specify `danger-full-access` via a CLI
flag, but we will revisit that as part of
https://github.com/openai/codex/issues/1254

Outstanding issue:

- As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
still conflating sandbox preferences with approval preferences in that
case, which needs to be cleaned up.
2025-06-24 16:59:47 -07:00
Eric Wright
ed5e848f3e add: responses api support for azure (#1321)
- Use Responses API for Azure provider endpoints
- Added a unit test to catch regression on the change from
`/chat/completions` to `/responses`
- Updated the default AOAI api version from `2025-03-01-preview` to
`2025-04-01-preview` to avoid user/400 errors due to missing summary
support in the March API version.
- Changes have been tested locally on AOAI endpoints
2025-06-22 18:01:13 -07:00
Govind Kamtamneni
5aafe190e2 feat(ts): provider‑specific API‑key discovery and clearer Azure guidance (#1324)
## Summary

This PR refactors the Codex CLI authentication flow so that
**non-OpenAI** providers (for example **azure**, or any future addition)
can supply their API key through a dedicated environment variable
without triggering the OpenAI login flow.

Key behaviours introduced:

* When `provider !== "openai"` the CLI consults `src/utils/providers.ts`
to locate the correct environment variable (`AZURE_OPENAI_API_KEY`,
`GEMINI_API_KEY`, and so on) before considering any interactive login.
* Credit redemption (`--free`) and PKCE login now run **only** when the
provider is OpenAI, eliminating unwanted browser prompts for Azure and
others.
* User-facing error messages are revamped to guide Azure users to
**[https://ai.azure.com/](https://ai.azure.com)** and show the exact
variable name they must set.
* All code paths still export `OPENAI_API_KEY` so legacy scripts
continue to operate unchanged.

---

## Example `config.json`

```jsonc
{
  "model": "codex-mini",
  "provider": "azure",
  "providers": {
    "azure": {
      "name": "AzureOpenAI",
      "baseURL": "https://ai-<project-name>.openai.azure.com/openai",
      "envKey": "AZURE_OPENAI_API_KEY"
    }
  },
  "history": {
    "maxSize": 1000,
    "saveHistory": true,
    "sensitivePatterns": []
  }
}
```

With this file in `~/.codex/config.json`, a single command line is
enough:

```bash
export AZURE_OPENAI_API_KEY="<your-key>"
codex "Hello from Azure"
```

No browser window opens, and the CLI works in entirely non-interactive
mode.

---

## Rationale

The new flow enables Codex to run **asynchronously** in sandboxed
environments such as GitHub Actions pipelines. By passing `--provider
azure` (or setting it in `config.json`) and exporting the correct key,
CI/CD jobs can invoke Codex without any ChatGPT-style login or PKCE
round-trip. This unlocks fully automated testing and deployment
scenarios.

---

## What’s changed

| File | Type | Description |
| --- | --- | --- |
| `codex-cli/src/cli.tsx` | **feat / refactor** | +43 / -20 lines. Imports `providers`, adds early provider-specific key lookup, gates `--free` redemption, rewrites help text. |
| `src/utils/providers.ts` | **chore** | Now consumed by CLI for env-var discovery. |

---

## How to test

```bash
# Azure example
export AZURE_OPENAI_API_KEY="<your-key>"
codex --provider azure "Automated run in CI"

# OpenAI example (unchanged behaviour)
codex --provider openai --login "Standard OpenAI flow"
```

Expected outcomes:

* Azure and other provider paths are non-interactive when the provider
flag is passed.
* The CLI always sets `OPENAI_API_KEY` for backward compatibility.

---

## Checklist

* [x] Logic behind provider-specific env-var lookup added.
* [x] Redundant OpenAI login steps removed for other providers.
* [x] Unit tests cover new branches.
* [x] README and sample config updated.
* [x] CI passes on all supported Node versions.

---

**Related work**

* #92
* #769 
* #1321



I have read the CLA Document and I hereby sign the CLA.
2025-06-22 17:56:36 -07:00
Michael Bolin
b73426c1c4 docs: update codex-rs/README.md to list new features in the Rust CLI (#1267)
Let users know about what the Rust CLI supports that the TypeScript CLI
doesn't!
2025-06-06 18:32:10 -07:00
Reilly Wood
345a38502d codex-rs: Rename /clear to /new, make it start an entirely new chat (#1264)
I noticed that `/clear` wasn't fully clearing chat history; it would
clear the chat history widgets _in the UI_, but the LLM still had access
to information from previous messages.

This PR renames `/clear` to `/new` for clarity as per Michael's
suggestion, resetting `app_state` to a fresh `ChatWidget`.
2025-06-06 16:29:37 -07:00
Michael Bolin
029f39b9da feat: port maybeRedeemCredits() from get-api-key.tsx to login_with_chatgpt.py (#1221)
This builds on https://github.com/openai/codex/pull/1212 and ports the
`maybeRedeemCredits()` function from `get-api-key.ts` to
`login_with_chatgpt.py`:


a80240cfdc/codex-cli/src/utils/get-api-key.tsx (L84-L89)
2025-06-05 23:34:10 -07:00
Michael Bolin
a80240cfdc chore: ensure next Node.js release includes musl binaries for arm64 Linux (#1232)
Target a workflow with more recent binary artifacts.
2025-06-05 23:14:10 -07:00
Michael Bolin
2d5246050a fix: use aarch64-unknown-linux-musl instead of aarch64-unknown-linux-gnu (#1228)
Now that we have published a GitHub Release that contains arm64 musl
artifacts for Linux, update the following scripts to take advantage of
them:

- `dotslash-config.json` now uses musl artifacts for the `linux-aarch64`
target
- `install_native_deps.sh` for the TypeScript CLI now includes
`codex-linux-sandbox-aarch64-unknown-linux-musl` instead of
`codex-linux-sandbox-aarch64-unknown-linux-gnu` for sandboxing
- `codex-cli/bin/codex.js` now checks for `aarch64-unknown-linux-musl`
artifacts instead of `aarch64-unknown-linux-gnu` ones
2025-06-05 22:45:45 -07:00
Michael Bolin
77b017f67d fix: truncate auth.json file before rewriting it (#1231)
This was missed in https://github.com/openai/codex/pull/1212. Caught by
@rizwankce in
https://github.com/openai/codex/discussions/1174#discussioncomment-13377475.
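
For context, a minimal sketch of the fix (illustrative, not the actual codex code): rewriting a file in place without truncating leaves stale bytes from the old contents whenever the new contents are shorter, which corrupts JSON like `auth.json`.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

fn rewrite_file(path: &Path, contents: &str) -> std::io::Result<()> {
    let mut f = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true) // without this, a shorter rewrite leaves trailing garbage
        .open(path)?;
    f.write_all(contents.as_bytes())
}
```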
2025-06-05 22:11:02 -07:00
Michael Bolin
c02d25fbad fix: include codex-linux-sandbox-aarch64-unknown-linux-musl in the set of release artifacts (#1230)
This was missed in https://github.com/openai/codex/pull/1225. Once we
create a new GitHub Release with this change, we can use the URL from
the workflow that triggered the release in
https://github.com/openai/codex/pull/1228.
2025-06-05 22:03:07 -07:00
Michael Bolin
9db53b33aa fix: support arm64 build for Linux (#1225)
Users were running into issues with glibc mismatches on arm64 linux. In
the past, we did not provide a musl build for arm64 Linux because we had
trouble getting the openssl dependency to build correctly. Though today
I just tried the same trick in `Cargo.toml` that we were doing for
`x86_64-unknown-linux-musl` (using `openssl-sys` with `features =
["vendored"]`), so I'm not sure what problem we had in the past the
builds "just worked" today!

Though one tweak that did have to be made is that the integration tests
for Seccomp/Landlock empirically require longer timeouts on arm64 linux,
or at least on the `ubuntu-24.04-arm` GitHub Runner. As such, we change
the timeouts for arm64 in `codex-rs/linux-sandbox/tests/landlock.rs`.

Though in solving this problem, I decided I needed a turnkey solution
for testing the Linux build(s) from my Mac laptop, so this PR introduces
`.devcontainer/Dockerfile` and `.devcontainer/devcontainer.json` to
facilitate this. Detailed instructions are in `.devcontainer/README.md`.

We will update `dotslash-config.json` and other release-related scripts
in a follow-up PR.
2025-06-05 20:29:46 -07:00
Michael Bolin
515b6331bd feat: add support for login with ChatGPT (#1212)
This does not implement the full Login with ChatGPT experience, but it
should unblock people.

**What works**

* The `codex` multitool now has a `login` subcommand, so you can run
`codex login`, which should write `CODEX_HOME/auth.json` if you complete
the flow successfully. The TUI will now read the `OPENAI_API_KEY` from
`auth.json`.
* The TUI should refresh the token if it has expired and the necessary
information is in `auth.json`.
* There is a `LoginScreen` in the TUI that tells you to run `codex
login` if both (1) your model provider expects to use `OPENAI_API_KEY`
as its env var, and (2) `OPENAI_API_KEY` is not set.

**What does not work**

* The `LoginScreen` does not support the login flow from within the TUI.
Instead, it tells you to quit, run `codex login`, and then run `codex`
again.
* `codex exec` does not read from `auth.json` yet, nor does it direct the
user to go through the login flow if `OPENAI_API_KEY` is not found.
* The `maybeRedeemCredits()` function from `get-api-key.tsx` has not
been ported from TypeScript to `login_with_chatgpt.py` yet:


a67a67f325/codex-cli/src/utils/get-api-key.tsx (L84-L89)

**Implementation**

Currently, the OAuth flow requires running a local webserver on
`127.0.0.1:1455`. It seemed wasteful to incur the additional binary cost
of a webserver dependency in the Rust CLI just to support login, so
instead we implement this logic in Python, as Python has a `http.server`
module as part of its standard library. Specifically, we bundle the
contents of a single Python file as a string in the Rust CLI and then
use it to spawn a subprocess as `python3 -c
{{SOURCE_FOR_PYTHON_SERVER}}`.
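
A sketch of that embedding technique (the constant name mirrors the placeholder above; `spawn_login_server` is a hypothetical name):

```rust
use std::process::{Child, Command};

// Bundle the Python source into the Rust binary at compile time.
const SOURCE_FOR_PYTHON_SERVER: &str = include_str!("login_with_chatgpt.py");

fn spawn_login_server() -> std::io::Result<Child> {
    Command::new("python3")
        .arg("-c")
        .arg(SOURCE_FOR_PYTHON_SERVER)
        .spawn()
}
```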

As such, the most significant files in this PR are:

```
codex-rs/login/src/login_with_chatgpt.py
codex-rs/login/src/lib.rs
```

Now that the CLI may load `OPENAI_API_KEY` from the environment _or_
`CODEX_HOME/auth.json`, we need a new abstraction for reading/writing
this variable, so we introduce:

```
codex-rs/core/src/openai_api_key.rs
```

Note that `std::env::set_var()` is [rightfully] `unsafe` in Rust 2024,
so we use a `LazyLock<RwLock<Option<String>>>` to store `OPENAI_API_KEY`,
ensuring it is read in a thread-safe manner.
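
A sketch of that storage (the accessor names are assumptions):

```rust
use std::sync::{LazyLock, RwLock};

static OPENAI_API_KEY: LazyLock<RwLock<Option<String>>> =
    LazyLock::new(|| RwLock::new(std::env::var("OPENAI_API_KEY").ok()));

fn get_openai_api_key() -> Option<String> {
    OPENAI_API_KEY.read().unwrap().clone()
}

fn set_openai_api_key(value: String) {
    *OPENAI_API_KEY.write().unwrap() = Some(value);
}
```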

Ultimately, it should be possible to go through the entire login flow
from the TUI. This PR introduces a placeholder `LoginScreen` UI for that
right now, though the new `codex login` subcommand introduced in this PR
should be a viable workaround until the UI is ready.

**Testing**

Because the login flow is currently implemented in a standalone Python
file, you can test it without building any Rust code as follows:

```
rm -rf /tmp/codex_home && mkdir /tmp/codex_home
CODEX_HOME=/tmp/codex_home python3 codex-rs/login/src/login_with_chatgpt.py
```

For reference:

* the original TypeScript implementation was introduced in
https://github.com/openai/codex/pull/963
* support for redeeming credits was later added in
https://github.com/openai/codex/pull/974
2025-06-04 08:44:17 -07:00
Reilly Wood
a67a67f325 codex-rs: make tool calls prettier (#1211)
This PR overhauls how active tool calls and completed tool calls are
displayed:

1. More use of colour to indicate success/failure and distinguish
between components like tool name+arguments
2. Previously, the entire `CallToolResult` was serialized to JSON and
pretty-printed. Now, we extract each individual `CallToolResultContent`
and print those:
   1. The previous solution was wasting space by unnecessarily showing
      details of the `CallToolResult` struct to users, without formatting the
      actual tool call results nicely
   2. We're now able to show users more information from tool results in
      less space, with nicer formatting when tools return JSON results

### Before:

<img width="1251" alt="Screenshot 2025-06-03 at 11 24 26"
src="https://github.com/user-attachments/assets/5a58f222-219c-4c53-ace7-d887194e30cf"
/>

### After:

<img width="1265" alt="image"
src="https://github.com/user-attachments/assets/99fe54d0-9ebe-406a-855b-7aa529b91274"
/>

## Future Work

1. Integrate image tool result handling better. We should be able to
display images even if they're not the first `CallToolResultContent`
2. Users should have some way to view the full version of truncated tool
results
3. It would be nice to add some left padding for tool results, make it
more clear that they are results. This is doable, just a little fiddly
due to the way `first_visible_line` scrolling works
4. There's almost certainly a better way to format JSON than "all on 1
line with spaces to make Ratatui wrapping work". But I think that works
OK for now.
2025-06-03 14:29:26 -07:00
Michael Bolin
c6fcec55fe fix: always send full instructions when using the Responses API (#1207)
This fixes a longstanding error in the Rust CLI where `codex.rs`
contained an errant `is_first_turn` check that would exclude the user
instructions for subsequent "turns" of a conversation when using the
responses API (i.e., when `previous_response_id` existed).

While here, renames `Prompt.instructions` to `Prompt.user_instructions`
since we now have quite a few levels of instructions floating around.
Also removed an unnecessary use of `clone()` in
`Prompt.get_full_instructions()`.
2025-06-03 09:40:19 -07:00
Michael Bolin
6fcc528a43 fix: provide tolerance for apply_patch tool (#993)
As explained in detail in the doc comment for `ParseMode::Lenient`, we
have observed that GPT-4.1 does not always generate a valid invocation
of `apply_patch`. Fortunately, the error is predictable, so we introduce
some new logic to the `codex-apply-patch` crate to recover from this
error.

Because we would like to avoid this becoming a de facto standard (it
would be incompatible if `apply_patch` were provided as an actual
executable, unless we also introduced the lenient behavior there), we
require passing `ParseMode::Lenient` to
`parse_patch_text()` to make it clear that the caller is opting into
supporting this special case.
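
A sketch of the opt-in API (`ParseMode::Lenient` and `parse_patch_text()` are named above; the return type and recovery helper are assumptions):

```rust
pub enum ParseMode {
    /// Accept only spec-conformant apply_patch input.
    Strict,
    /// Additionally recover from the predictable GPT-4.1 formatting error.
    Lenient,
}

pub fn parse_patch_text(text: &str, mode: ParseMode) -> Result<Vec<String>, String> {
    let normalized = match mode {
        ParseMode::Strict => text.to_owned(),
        ParseMode::Lenient => recover_known_malformation(text),
    };
    Ok(normalized.lines().map(str::to_owned).collect())
}

fn recover_known_malformation(text: &str) -> String {
    // Placeholder for the real recovery logic in codex-apply-patch.
    text.to_owned()
}
```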

Note the analogous change to the TypeScript CLI was
https://github.com/openai/codex/pull/930. In addition to changing the
accepted input to `apply_patch`, it also introduced additional
instructions for the model, which we include in this PR.

Note that `apply-patch` does not depend on either `regex` or
`regex-lite`, so some of the checks are slightly more verbose to avoid
introducing this dependency.

That said, this PR does not leverage the existing
`extract_heredoc_body_from_apply_patch_command()`, which depends on
`tree-sitter` and `tree-sitter-bash`:


5a5aa89914/codex-rs/apply-patch/src/lib.rs (L191-L246)

though perhaps it should.
2025-06-03 09:06:38 -07:00
Michael Bolin
5a5aa89914 chore: replace regex with regex-lite, where appropriate (#1200)
As explained on https://crates.io/crates/regex-lite, `regex-lite` is a
lighter alternative to `regex` and seems to be sufficient for our
purposes.
2025-06-02 17:11:45 -07:00
Michael Bolin
0f3cc8f842 feat: make reasoning effort/summaries configurable (#1199)
Previous to this PR, we always set `reasoning` when making a request
using the Responses API:


d7245cbbc9/codex-rs/core/src/client.rs (L108-L111)

Though if you tried to use the Rust CLI with `--model gpt-4.1`, this
would fail with:

```shell
"Unsupported parameter: 'reasoning.effort' is not supported with this model."
```

We take a cue from the TypeScript CLI, which does a check on the model
name:


d7245cbbc9/codex-cli/src/utils/agent/agent-loop.ts (L786-L789)

This PR does a similar check, though also adds support for the following
config options:

```
model_reasoning_effort = "low" | "medium" | "high" | "none"
model_reasoning_summary = "auto" | "concise" | "detailed" | "none"
```

This way, if you have a model whose name happens to start with `"o"` (or
`"codex"`?), you can set these to `"none"` to explicitly disable
reasoning, if necessary. (That said, it seems unlikely anyone would use
the Responses API with non-OpenAI models, but we provide an escape
hatch, anyway.)

This PR also updates both the TUI and `codex exec` to show `reasoning
effort` and `reasoning summaries` in the header.
2025-06-02 16:01:34 -07:00
Michael Bolin
d7245cbbc9 fix: chat completions API now also passes tools along (#1167)
Prior to this PR, there were two big misses in `chat_completions.rs`:

1. The loop in `stream_chat_completions()` was only including items of
type `ResponseItem::Message` when building up the `"messages"` JSON for
the `POST` request to the `chat/completions` endpoint. This fixes things
by ensuring other variants (`FunctionCall`, `LocalShellCall`, and
`FunctionCallOutput`) are included, as well.
2. In `process_chat_sse()`, we were not recording tool calls and were
only emitting items of type
`ResponseEvent::OutputItemDone(ResponseItem::Message)` to the stream.
Now we introduce `FunctionCallState`, which is used to accumulate the
`delta`s of type `tool_calls`, so we can ultimately emit a
`ResponseItem::FunctionCall`, when appropriate.
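
For illustration, a sketch of such an accumulator (field and method names are assumptions):

```rust
#[derive(Default)]
struct FunctionCallState {
    name: Option<String>,
    call_id: Option<String>,
    arguments: String, // grows as `tool_calls` deltas stream in
}

impl FunctionCallState {
    fn apply_delta(&mut self, name: Option<&str>, call_id: Option<&str>, args: &str) {
        if self.name.is_none() {
            self.name = name.map(str::to_owned);
        }
        if self.call_id.is_none() {
            self.call_id = call_id.map(str::to_owned);
        }
        self.arguments.push_str(args);
    }
}
```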

While function calling now appears to work for chat completions with my
local testing, I believe that there are still edge cases that are not
covered and that this codepath would benefit from a battery of
integration tests. (As part of that further cleanup, we should also work
to support streaming responses in the UI.)

The other important part of this PR is some cleanup in
`core/src/codex.rs`. In particular, it was hard to reason about how
`run_task()` was building up the list of messages to include in a
request across the various cases:

- Responses API
- Chat Completions API
- Responses API used in concert with ZDR

I like to think things are a bit cleaner now where:

- `zdr_transcript` (if present) contains all messages in the history of
the conversation, which includes function call outputs that have not
been sent back to the model yet
- `pending_input` includes any messages the user has submitted while the
turn is in flight that need to be injected as part of the next `POST` to
the model
- `input_for_next_turn` includes the tool call outputs that have not
been sent back to the model yet
2025-06-02 13:47:51 -07:00
Michael Bolin
e40f86b446 chore: logging cleanup (#1196)
Update what we log to make `RUST_LOG=debug` a bit easier to work with.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1196).
* #1167
* __->__ #1196
2025-06-02 13:31:33 -07:00
Michael Bolin
7896b1089d chore: update the WORKFLOW_URL in install_native_deps.sh to the latest release (#1190) 2025-05-31 10:30:50 -07:00
Michael Bolin
1410ae95ca fix: set --config hide_agent_reasoning=true in the GitHub Action (#1185)
Whoops, I had this flipped in https://github.com/openai/codex/pull/1183.
2025-05-30 23:57:05 -07:00
Michael Bolin
fccf5f3221 fix: disable agent reasoning output by default in the GitHub Action (#1183) 2025-05-30 23:49:48 -07:00
Michael Bolin
1159eaf04f feat: show the version when starting Codex (#1182)
The TypeScript version of the CLI shows the version when it starts up,
which is helpful when users share screenshots (and nice to know, as a
user).
2025-05-30 23:24:36 -07:00
Michael Bolin
e81327e5f4 feat: add hide_agent_reasoning config option (#1181)
This PR introduces a `hide_agent_reasoning` config option (that defaults
to `false`) that users can enable to make the output less verbose by
suppressing reasoning output.

To test, verified that this includes agent reasoning in the output:

```
echo hello | just exec
```

whereas this does not:

```
echo hello | just exec --config hide_agent_reasoning=true
```
2025-05-30 23:14:56 -07:00
Michael Bolin
4f3d294762 feat: dim the timestamp in the exec output (#1180)
This required changing `ts_println!()` to take `$self:ident`, which is a
bit more verbose, but the usability improvement seems worth it.

Also eliminated an unnecessary `.to_string()` while here.
2025-05-30 16:27:37 -07:00
Michael Bolin
cf1d070538 feat: grab-bag of improvements to exec output (#1179)
Fixes:

* Instantiate `EventProcessor` earlier in `lib.rs` so
`print_config_summary()` can be an instance method of it and leverage
its various `Style` fields to ensure it honors `with_ansi` properly.
* After printing the config summary, print out user's prompt with the
heading `User instructions:`. As noted in the comment, now that we can
read the instructions via stdin as of #1178, it is helpful to the user
to ensure they know what instructions were given to Codex.
* Use same colors/bold/italic settings for headers as the TUI, making
the output a bit easier to read.
2025-05-30 16:22:10 -07:00
Michael Bolin
ae743d56b0 feat: for codex exec, if PROMPT is not specified, read from stdin if not a TTY (#1178)
This attempts to make `codex exec` more flexible in how the prompt can
be passed:

* as before, it can be passed as a single string argument
* if `-` is passed as the value, the prompt is read from stdin
* if no argument is passed _and stdin is a TTY_, prints a warning to
stderr that no prompt was specified and exits non-zero.
* if no argument is passed _and stdin is NOT a tty_, prints `Reading
prompt from stdin...` to stderr to let the user know that Codex will
wait until it reads EOF from stdin to proceed. (You can repro this case
by doing `yes | just exec` since stdin is not a TTY in that case but it
also never reaches EOF).
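
A sketch of the prompt-resolution rules just listed (illustrative, not the actual CLI code; assumes `std::io::IsTerminal`):

```rust
use std::io::{IsTerminal, Read};

fn resolve_prompt(arg: Option<&str>) -> Result<String, String> {
    match arg {
        Some("-") => read_stdin(),
        Some(prompt) => Ok(prompt.to_owned()),
        None if std::io::stdin().is_terminal() => {
            Err("no prompt was specified".to_owned())
        }
        None => {
            eprintln!("Reading prompt from stdin...");
            read_stdin()
        }
    }
}

fn read_stdin() -> Result<String, String> {
    let mut buf = String::new();
    std::io::stdin()
        .read_to_string(&mut buf)
        .map_err(|e| e.to_string())?;
    Ok(buf)
}
```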
2025-05-30 14:41:55 -07:00
Michael Bolin
1bf82056b3 fix: introduce create_tools_json() and share it with chat_completions.rs (#1177)
The main motivator behind this PR is that `stream_chat_completions()`
was not adding the `"tools"` entry to the payload posted to the
`/chat/completions` endpoint. This (1) refactors the existing logic to
build up the `"tools"` JSON from `client.rs` into `openai_tools.rs`, and
(2) updates the use of responses API (`client.rs`) and chat completions
API (`chat_completions.rs`) to both use it.

Note this PR alone is not sufficient to get tool calling from chat
completions working: that is done in
https://github.com/openai/codex/pull/1167.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1177).
* #1167
* __->__ #1177
2025-05-30 14:07:03 -07:00
Michael Bolin
e207f20f64 fix: add extra debugging to GitHub Action (#1173)
https://github.com/openai/codex/actions/runs/15352839832/job/43205041563
appeared to fail around `postComment()`, but I don't see the output from
`fail()` in the logs. Adding a bit more info.
2025-05-30 11:16:30 -07:00
Michael Bolin
0f40ef5a10 fix: missed a step in #1171 for codex.yml (#1172)
Missed in my copy/paste.
2025-05-30 11:04:41 -07:00
Michael Bolin
8676185389 fix: update outdated repo setup in codex.yml (#1171)
We should do some work to share the setup logic across `codex.yml`,
`ci.yml`, and `rust-ci.yml`.
2025-05-30 10:58:57 -07:00
Michael Bolin
baa92f37e0 feat: initial import of experimental GitHub Action (#1170)
This is a first cut at a GitHub Action that lets you define prompt
templates in `.md` files under `.github/codex/labels` that will run
Codex with the associated prompt when the label is added to a GitHub
pull request.

For example, this PR includes these files:

```
.github/codex/labels/codex-attempt.md
.github/codex/labels/codex-code-review.md
.github/codex/labels/codex-investigate-issue.md
```

And the new `.github/workflows/codex.yml` workflow declares the
following triggers:

```yaml
on:
  issues:
    types: [opened, labeled]
  pull_request:
    branches: [main]
    types: [labeled]
```

as well as the following expression to gate the action:

```
jobs:
  codex:
    if: |
      (github.event_name == 'issues' && (
        (github.event.action == 'labeled' && (github.event.label.name == 'codex-attempt' || github.event.label.name == 'codex-investigate-issue'))
      )) ||
      (github.event_name == 'pull_request' && github.event.action == 'labeled' && github.event.label.name == 'codex-code-review')
```

Note the "actor" who added the label must have write access to the repo
for the action to take effect.

After adding a label, the action will "ack" the request by replacing the
original label (e.g., `codex-review`) with an `-in-progress` suffix
(e.g., `codex-review-in-progress`). When it is finished, it will swap
the `-in-progress` label with a `-completed` one (e.g.,
`codex-review-completed`).

Users of the action are responsible for providing an `OPENAI_API_KEY`
and making it available as a secret to the action.
2025-05-30 10:55:28 -07:00
Michael Bolin
a0239c3cd6 fix: enable set positional-arguments in justfile (#1169)
The way these definitions worked before, they did not handle quoted args
with spaces properly.

For example, if you had `/tmp/test-just/printlen.py` as:

```python
#!/usr/bin/env python3

import sys

print(len(sys.argv))
```

and your `justfile` was:

```
printlen *args:
    /tmp/test-just/printlen.py {{args}}
```

Then:

```shell
$ just printlen foo bar
3
$ just printlen 'foo bar'
3
```

which is not what we want: `'foo bar'` should be treated as one
argument.

The fix is to use
[positional-arguments](515e806b51/README.md (L1131)):

```
set positional-arguments

printlen *args:
    /tmp/test-just/printlen.py "$@"
```
2025-05-30 09:11:53 -07:00
Michael Bolin
bdfa95ed31 docs: split the config-related portion of codex-rs/README.md into its own config.md file (#1165)
Also updated the overview on `codex-rs/README.md` while here.
2025-05-29 16:59:35 -07:00
Fouad Matin
828e2062c2 fix(codex-rs): use codex-mini-latest as default (#1164) 2025-05-29 16:55:19 -07:00
198 changed files with 19756 additions and 3593 deletions

27
.devcontainer/Dockerfile Normal file

@@ -0,0 +1,27 @@
FROM ubuntu:24.04
ARG DEBIAN_FRONTEND=noninteractive
# enable 'universe' because musl-tools & clang live there
RUN apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common && \
add-apt-repository --yes universe
# now install build deps
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential curl git ca-certificates \
pkg-config clang musl-tools libssl-dev just && \
rm -rf /var/lib/apt/lists/*
# Ubuntu 24.04 ships with user 'ubuntu' already created with UID 1000.
USER ubuntu
# install Rust + musl target as dev user
RUN curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal && \
~/.cargo/bin/rustup target add aarch64-unknown-linux-musl && \
~/.cargo/bin/rustup component add clippy rustfmt
ENV PATH="/home/ubuntu/.cargo/bin:${PATH}"
WORKDIR /workspace

30
.devcontainer/README.md Normal file

@@ -0,0 +1,30 @@
# Containerized Development
We provide the following options to facilitate Codex development in a container. This is particularly useful for verifying the Linux build when working on a macOS host.
## Docker
To build the Docker image locally for x64 and then run it with the repo mounted under `/workspace`:
```shell
CODEX_DOCKER_IMAGE_NAME=codex-linux-dev
docker build --platform=linux/amd64 -t "$CODEX_DOCKER_IMAGE_NAME" ./.devcontainer
docker run --platform=linux/amd64 --rm -it -e CARGO_TARGET_DIR=/workspace/codex-rs/target-amd64 -v "$PWD":/workspace -w /workspace/codex-rs "$CODEX_DOCKER_IMAGE_NAME"
```
Note that `/workspace/target` will contain the binaries built for your host platform, so we include `-e CARGO_TARGET_DIR=/workspace/codex-rs/target-amd64` in the `docker run` command so that the binaries built inside your container are written to a separate directory.
For arm64, specify `--platform=linux/arm64` instead for both `docker build` and `docker run`.
Currently, the `Dockerfile` works for both x64 and arm64 Linux, though you need to run `rustup target add x86_64-unknown-linux-musl` yourself to install the musl toolchain for x64.
## VS Code
VS Code recognizes the `devcontainer.json` file and gives you the option to develop Codex in a container. Currently, `devcontainer.json` builds and runs the `arm64` flavor of the container.
From the integrated terminal in VS Code, you can build either flavor of the `arm64` build (GNU or musl):
```shell
cargo build --target aarch64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu
```

27
.devcontainer/devcontainer.json Normal file

@@ -0,0 +1,27 @@
{
"name": "Codex",
"build": {
"dockerfile": "Dockerfile",
"context": "..",
"platform": "linux/arm64"
},
/* Force VS Code to run the container as arm64 in
case your host is x86 (or vice-versa). */
"runArgs": ["--platform=linux/arm64"],
"containerEnv": {
"RUST_BACKTRACE": "1",
"CARGO_TARGET_DIR": "${containerWorkspaceFolder}/codex-rs/target-arm64"
},
"remoteUser": "ubuntu",
"customizations": {
"vscode": {
"settings": {
"terminal.integrated.defaultProfile.linux": "bash"
},
"extensions": ["rust-lang.rust-analyzer", "tamasfe.even-better-toml"]
}
}
}

1
.github/actions/codex/.gitignore vendored Normal file

@@ -0,0 +1 @@
/node_modules/


@@ -0,0 +1,8 @@
printWidth = 80
quoteProps = "consistent"
semi = true
tabWidth = 2
trailingComma = "all"
# Preserve existing behavior for markdown/text wrapping.
proseWrap = "preserve"

140
.github/actions/codex/README.md vendored Normal file

@@ -0,0 +1,140 @@
# openai/codex-action
`openai/codex-action` is a GitHub Action that facilitates the use of [Codex](https://github.com/openai/codex) on GitHub issues and pull requests. Using the action, you associate **labels** with prompts so that Codex runs with the appropriate prompt for the given context. Codex will respond by posting comments or creating PRs, whichever you specify!
Here is a sample workflow that uses `openai/codex-action`:
```yaml
name: Codex
on:
issues:
types: [opened, labeled]
pull_request:
branches: [main]
types: [labeled]
jobs:
codex:
if: ... # optional, but can be effective in conserving CI resources
runs-on: ubuntu-latest
# TODO(mbolin): Need to verify if/when `write` is necessary.
permissions:
contents: write
issues: write
pull-requests: write
steps:
# By default, Codex runs network disabled using --full-auto, so perform
# any setup that requires network (such as installing dependencies)
# before openai/codex-action.
- name: Checkout repository
uses: actions/checkout@v4
- name: Run Codex
uses: openai/codex-action@latest
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
github_token: ${{ secrets.GITHUB_TOKEN }}
```
See sample usage in [`codex.yml`](../../workflows/codex.yml).
## Triggering the Action
Using the sample workflow above, we have:
```yaml
on:
issues:
types: [opened, labeled]
pull_request:
branches: [main]
types: [labeled]
```
which means our workflow will be triggered when any of the following events occur:
- a label is added to an issue
- a label is added to a pull request against the `main` branch
### Label-Based Triggers
To define a GitHub label that should trigger Codex, create a file named `.github/codex/labels/LABEL-NAME.md` in your repository where `LABEL-NAME` is the name of the label. The content of the file is the prompt template to use when the label is added (see more on [Prompt Template Variables](#prompt-template-variables) below).
For example, if the file `.github/codex/labels/codex-review.md` exists, then:
- Adding the `codex-review` label will trigger the workflow containing the `openai/codex-action` GitHub Action.
- When `openai/codex-action` starts, it will replace the `codex-review` label with `codex-review-in-progress`.
- When `openai/codex-action` is finished, it will replace the `codex-review-in-progress` label with `codex-review-completed`.
If Codex sees that either `codex-review-in-progress` or `codex-review-completed` is already present, it will not perform the action.
As determined by the [default config](./src/default-label-config.ts), Codex will act on the following labels by default:
- Adding the `codex-code-review` label to a pull request will have Codex review the PR and add it to the PR as a comment.
- Adding the `codex-investigate-issue` label to an issue will have Codex investigate the issue and report its findings as a comment.
- Adding the `codex-attempt-fix` label to an issue will have Codex attempt to fix the issue and create a PR with the fix, if any.
## Action Inputs
The `openai/codex-action` GitHub Action takes the following inputs:
### `openai_api_key` (required)
Set your `OPENAI_API_KEY` as a [repository secret](https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions). See **Secrets and variables** then **Actions** in the settings for your GitHub repo.
Note that the secret name does not have to be `OPENAI_API_KEY`. For example, you might want to name it `CODEX_OPENAI_API_KEY` and then configure it on `openai/codex-action` as follows:
```yaml
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
```
### `github_token` (required)
This is required so that Codex can post a comment or create a PR. Set this value on the action as follows:
```yaml
github_token: ${{ secrets.GITHUB_TOKEN }}
```
### `codex_args`
A whitespace-delimited list of arguments to pass to Codex. Defaults to `--full-auto`, but if you want to override the default model to use `o3`:
```yaml
codex_args: "--full-auto --model o3"
```
For more complex configurations, use the `codex_home` input.
### `codex_home`
If set, the value to use for the `$CODEX_HOME` environment variable when running Codex. As explained [in the docs](https://github.com/openai/codex/tree/main/codex-rs#readme), this folder can contain the `config.toml` to configure Codex, custom instructions, and log files.
This should be a relative path within your repo.
## Prompt Template Variables
As shown above, `"prompt"` and `"promptPath"` are used to define prompt templates that will be populated and passed to Codex in response to certain events. All template variables are of the form `{CODEX_ACTION_...}` and the supported values are defined below.
### `CODEX_ACTION_ISSUE_TITLE`
If the action was triggered on a GitHub issue, this is the issue title.
Specifically it is read as the `.issue.title` from the `$GITHUB_EVENT_PATH`.
### `CODEX_ACTION_ISSUE_BODY`
If the action was triggered on a GitHub issue, this is the issue body.
Specifically it is read as the `.issue.body` from the `$GITHUB_EVENT_PATH`.
### `CODEX_ACTION_GITHUB_EVENT_PATH`
The value of the `$GITHUB_EVENT_PATH` environment variable, which is the path to the file that contains the JSON payload for the event that triggered the workflow. Codex can use `jq` to read only the fields of interest from this file.
### `CODEX_ACTION_PR_DIFF`
If the action was triggered on a pull request, this is the diff between the base and head commits of the PR. It is the output from `git diff`.
Note that the content of the diff could be quite large, so it is generally safer to point Codex at `CODEX_ACTION_GITHUB_EVENT_PATH` and let it decide how it wants to explore the change.

127
.github/actions/codex/action.yml vendored Normal file

@@ -0,0 +1,127 @@
name: "Codex [reusable action]"
description: "A reusable action that runs a Codex model."
inputs:
openai_api_key:
description: "The value to use as the OPENAI_API_KEY environment variable when running Codex."
required: true
trigger_phrase:
description: "Text to trigger Codex from a PR/issue body or comment."
required: false
default: ""
github_token:
description: "Token so Codex can comment on the PR or issue."
required: true
codex_args:
description: "A whitespace-delimited list of arguments to pass to Codex. Due to limitations in YAML, arguments with spaces are not supported. For more complex configurations, use the `codex_home` input."
required: false
default: "--config hide_agent_reasoning=true --full-auto"
codex_home:
description: "Value to use as the CODEX_HOME environment variable when running Codex."
required: false
codex_release_tag:
description: "The release tag of the Codex model to run, e.g., 'rust-v0.3.0'. Defaults to the latest release."
required: false
default: ""
runs:
using: "composite"
steps:
# Do this in Bash so we do not even bother to install Bun if the sender does
# not have write access to the repo.
- name: Verify user has write access to the repo.
env:
GH_TOKEN: ${{ github.token }}
shell: bash
run: |
set -euo pipefail
PERMISSION=$(gh api \
"/repos/${GITHUB_REPOSITORY}/collaborators/${{ github.event.sender.login }}/permission" \
| jq -r '.permission')
if [[ "$PERMISSION" != "admin" && "$PERMISSION" != "write" ]]; then
exit 1
fi
- name: Download Codex
env:
GH_TOKEN: ${{ github.token }}
shell: bash
run: |
set -euo pipefail
# Determine OS/arch and corresponding Codex artifact name.
uname_s=$(uname -s)
uname_m=$(uname -m)
case "$uname_s" in
Linux*) os="linux" ;;
Darwin*) os="apple-darwin" ;;
*) echo "Unsupported operating system: $uname_s"; exit 1 ;;
esac
case "$uname_m" in
x86_64*) arch="x86_64" ;;
arm64*|aarch64*) arch="aarch64" ;;
*) echo "Unsupported architecture: $uname_m"; exit 1 ;;
esac
# linux builds differentiate between musl and gnu.
if [[ "$os" == "linux" ]]; then
if [[ "$arch" == "x86_64" ]]; then
triple="${arch}-unknown-linux-musl"
else
# Only other supported linux build is aarch64 gnu.
triple="${arch}-unknown-linux-gnu"
fi
else
# macOS
triple="${arch}-apple-darwin"
fi
# Note that if we start baking version numbers into the artifact name,
# we will need to update this action.yml file to match.
artifact="codex-exec-${triple}.tar.gz"
TAG_ARG="${{ inputs.codex_release_tag }}"
# The usage is `gh release download [<tag>] [flags]`, so if TAG_ARG
# is empty, we do not pass it so we can default to the latest release.
gh release download ${TAG_ARG:+$TAG_ARG} --repo openai/codex \
--pattern "$artifact" --output - \
| tar xzO > /usr/local/bin/codex-exec
chmod +x /usr/local/bin/codex-exec
# Display Codex version to confirm binary integrity; ensure we point it
# at the checked-out repository via --cd so that any subsequent commands
# use the correct working directory.
codex-exec --cd "$GITHUB_WORKSPACE" --version
- name: Install Bun
uses: oven-sh/setup-bun@v2
with:
bun-version: 1.2.11
- name: Install dependencies
shell: bash
run: |
cd ${{ github.action_path }}
bun install --production
- name: Run Codex
shell: bash
run: bun run ${{ github.action_path }}/src/main.ts
# Process args plus environment variables often have a max of 128 KiB,
# so we should fit within that limit?
env:
INPUT_CODEX_ARGS: ${{ inputs.codex_args || '' }}
INPUT_CODEX_HOME: ${{ inputs.codex_home || ''}}
INPUT_TRIGGER_PHRASE: ${{ inputs.trigger_phrase || '' }}
OPENAI_API_KEY: ${{ inputs.openai_api_key }}
GITHUB_TOKEN: ${{ inputs.github_token }}
GITHUB_EVENT_ACTION: ${{ github.event.action || '' }}
GITHUB_EVENT_LABEL_NAME: ${{ github.event.label.name || '' }}
GITHUB_EVENT_ISSUE_NUMBER: ${{ github.event.issue.number || '' }}
GITHUB_EVENT_ISSUE_BODY: ${{ github.event.issue.body || '' }}
GITHUB_EVENT_REVIEW_BODY: ${{ github.event.review.body || '' }}
GITHUB_EVENT_COMMENT_BODY: ${{ github.event.comment.body || '' }}

89
.github/actions/codex/bun.lock vendored Normal file

@@ -0,0 +1,89 @@
{
"lockfileVersion": 1,
"workspaces": {
"": {
"name": "codex-action",
"dependencies": {
"@actions/core": "^1.11.1",
"@actions/github": "^6.0.1",
},
"devDependencies": {
"@types/bun": "^1.2.18",
"@types/node": "^24.0.13",
"prettier": "^3.6.2",
"typescript": "^5.8.3",
},
},
},
"packages": {
"@actions/core": ["@actions/core@1.11.1", "", { "dependencies": { "@actions/exec": "^1.1.1", "@actions/http-client": "^2.0.1" } }, "sha512-hXJCSrkwfA46Vd9Z3q4cpEpHB1rL5NG04+/rbqW9d3+CSvtB1tYe8UTpAlixa1vj0m/ULglfEK2UKxMGxCxv5A=="],
"@actions/exec": ["@actions/exec@1.1.1", "", { "dependencies": { "@actions/io": "^1.0.1" } }, "sha512-+sCcHHbVdk93a0XT19ECtO/gIXoxvdsgQLzb2fE2/5sIZmWQuluYyjPQtrtTHdU1YzTZ7bAPN4sITq2xi1679w=="],
"@actions/github": ["@actions/github@6.0.1", "", { "dependencies": { "@actions/http-client": "^2.2.0", "@octokit/core": "^5.0.1", "@octokit/plugin-paginate-rest": "^9.2.2", "@octokit/plugin-rest-endpoint-methods": "^10.4.0", "@octokit/request": "^8.4.1", "@octokit/request-error": "^5.1.1", "undici": "^5.28.5" } }, "sha512-xbZVcaqD4XnQAe35qSQqskb3SqIAfRyLBrHMd/8TuL7hJSz2QtbDwnNM8zWx4zO5l2fnGtseNE3MbEvD7BxVMw=="],
"@actions/http-client": ["@actions/http-client@2.2.3", "", { "dependencies": { "tunnel": "^0.0.6", "undici": "^5.25.4" } }, "sha512-mx8hyJi/hjFvbPokCg4uRd4ZX78t+YyRPtnKWwIl+RzNaVuFpQHfmlGVfsKEJN8LwTCvL+DfVgAM04XaHkm6bA=="],
"@actions/io": ["@actions/io@1.1.3", "", {}, "sha512-wi9JjgKLYS7U/z8PPbco+PvTb/nRWjeoFlJ1Qer83k/3C5PHQi28hiVdeE2kHXmIL99mQFawx8qt/JPjZilJ8Q=="],
"@fastify/busboy": ["@fastify/busboy@2.1.1", "", {}, "sha512-vBZP4NlzfOlerQTnba4aqZoMhE/a9HY7HRqoOPaETQcSQuWEIyZMHGfVu6w9wGtGK5fED5qRs2DteVCjOH60sA=="],
"@octokit/auth-token": ["@octokit/auth-token@4.0.0", "", {}, "sha512-tY/msAuJo6ARbK6SPIxZrPBms3xPbfwBrulZe0Wtr/DIY9lje2HeV1uoebShn6mx7SjCHif6EjMvoREj+gZ+SA=="],
"@octokit/core": ["@octokit/core@5.2.1", "", { "dependencies": { "@octokit/auth-token": "^4.0.0", "@octokit/graphql": "^7.1.0", "@octokit/request": "^8.4.1", "@octokit/request-error": "^5.1.1", "@octokit/types": "^13.0.0", "before-after-hook": "^2.2.0", "universal-user-agent": "^6.0.0" } }, "sha512-dKYCMuPO1bmrpuogcjQ8z7ICCH3FP6WmxpwC03yjzGfZhj9fTJg6+bS1+UAplekbN2C+M61UNllGOOoAfGCrdQ=="],
"@octokit/endpoint": ["@octokit/endpoint@9.0.6", "", { "dependencies": { "@octokit/types": "^13.1.0", "universal-user-agent": "^6.0.0" } }, "sha512-H1fNTMA57HbkFESSt3Y9+FBICv+0jFceJFPWDePYlR/iMGrwM5ph+Dd4XRQs+8X+PUFURLQgX9ChPfhJ/1uNQw=="],
"@octokit/graphql": ["@octokit/graphql@7.1.1", "", { "dependencies": { "@octokit/request": "^8.4.1", "@octokit/types": "^13.0.0", "universal-user-agent": "^6.0.0" } }, "sha512-3mkDltSfcDUoa176nlGoA32RGjeWjl3K7F/BwHwRMJUW/IteSa4bnSV8p2ThNkcIcZU2umkZWxwETSSCJf2Q7g=="],
"@octokit/openapi-types": ["@octokit/openapi-types@24.2.0", "", {}, "sha512-9sIH3nSUttelJSXUrmGzl7QUBFul0/mB8HRYl3fOlgHbIWG+WnYDXU3v/2zMtAvuzZ/ed00Ei6on975FhBfzrg=="],
"@octokit/plugin-paginate-rest": ["@octokit/plugin-paginate-rest@9.2.2", "", { "dependencies": { "@octokit/types": "^12.6.0" }, "peerDependencies": { "@octokit/core": "5" } }, "sha512-u3KYkGF7GcZnSD/3UP0S7K5XUFT2FkOQdcfXZGZQPGv3lm4F2Xbf71lvjldr8c1H3nNbF+33cLEkWYbokGWqiQ=="],
"@octokit/plugin-rest-endpoint-methods": ["@octokit/plugin-rest-endpoint-methods@10.4.1", "", { "dependencies": { "@octokit/types": "^12.6.0" }, "peerDependencies": { "@octokit/core": "5" } }, "sha512-xV1b+ceKV9KytQe3zCVqjg+8GTGfDYwaT1ATU5isiUyVtlVAO3HNdzpS4sr4GBx4hxQ46s7ITtZrAsxG22+rVg=="],
"@octokit/request": ["@octokit/request@8.4.1", "", { "dependencies": { "@octokit/endpoint": "^9.0.6", "@octokit/request-error": "^5.1.1", "@octokit/types": "^13.1.0", "universal-user-agent": "^6.0.0" } }, "sha512-qnB2+SY3hkCmBxZsR/MPCybNmbJe4KAlfWErXq+rBKkQJlbjdJeS85VI9r8UqeLYLvnAenU8Q1okM/0MBsAGXw=="],
"@octokit/request-error": ["@octokit/request-error@5.1.1", "", { "dependencies": { "@octokit/types": "^13.1.0", "deprecation": "^2.0.0", "once": "^1.4.0" } }, "sha512-v9iyEQJH6ZntoENr9/yXxjuezh4My67CBSu9r6Ve/05Iu5gNgnisNWOsoJHTP6k0Rr0+HQIpnH+kyammu90q/g=="],
"@octokit/types": ["@octokit/types@13.10.0", "", { "dependencies": { "@octokit/openapi-types": "^24.2.0" } }, "sha512-ifLaO34EbbPj0Xgro4G5lP5asESjwHracYJvVaPIyXMuiuXLlhic3S47cBdTb+jfODkTE5YtGCLt3Ay3+J97sA=="],
"@types/bun": ["@types/bun@1.2.18", "", { "dependencies": { "bun-types": "1.2.18" } }, "sha512-Xf6RaWVheyemaThV0kUfaAUvCNokFr+bH8Jxp+tTZfx7dAPA8z9ePnP9S9+Vspzuxxx9JRAXhnyccRj3GyCMdQ=="],
"@types/node": ["@types/node@24.0.13", "", { "dependencies": { "undici-types": "~7.8.0" } }, "sha512-Qm9OYVOFHFYg3wJoTSrz80hoec5Lia/dPp84do3X7dZvLikQvM1YpmvTBEdIr/e+U8HTkFjLHLnl78K/qjf+jQ=="],
"@types/react": ["@types/react@19.1.8", "", { "dependencies": { "csstype": "^3.0.2" } }, "sha512-AwAfQ2Wa5bCx9WP8nZL2uMZWod7J7/JSplxbTmBQ5ms6QpqNYm672H0Vu9ZVKVngQ+ii4R/byguVEUZQyeg44g=="],
"before-after-hook": ["before-after-hook@2.2.3", "", {}, "sha512-NzUnlZexiaH/46WDhANlyR2bXRopNg4F/zuSA3OpZnllCUgRaOF2znDioDWrmbNVsuZk6l9pMquQB38cfBZwkQ=="],
"bun-types": ["bun-types@1.2.18", "", { "dependencies": { "@types/node": "*" }, "peerDependencies": { "@types/react": "^19" } }, "sha512-04+Eha5NP7Z0A9YgDAzMk5PHR16ZuLVa83b26kH5+cp1qZW4F6FmAURngE7INf4tKOvCE69vYvDEwoNl1tGiWw=="],
"csstype": ["csstype@3.1.3", "", {}, "sha512-M1uQkMl8rQK/szD0LNhtqxIPLpimGm8sOBwU7lLnCpSbTyY3yeU1Vc7l4KT5zT4s/yOxHH5O7tIuuLOCnLADRw=="],
"deprecation": ["deprecation@2.3.1", "", {}, "sha512-xmHIy4F3scKVwMsQ4WnVaS8bHOx0DmVwRywosKhaILI0ywMDWPtBSku2HNxRvF7jtwDRsoEwYQSfbxj8b7RlJQ=="],
"once": ["once@1.4.0", "", { "dependencies": { "wrappy": "1" } }, "sha512-lNaJgI+2Q5URQBkccEKHTQOPaXdUxnZZElQTZY0MFUAuaEqe1E+Nyvgdz/aIyNi6Z9MzO5dv1H8n58/GELp3+w=="],
"prettier": ["prettier@3.6.2", "", { "bin": { "prettier": "bin/prettier.cjs" } }, "sha512-I7AIg5boAr5R0FFtJ6rCfD+LFsWHp81dolrFD8S79U9tb8Az2nGrJncnMSnys+bpQJfRUzqs9hnA81OAA3hCuQ=="],
"tunnel": ["tunnel@0.0.6", "", {}, "sha512-1h/Lnq9yajKY2PEbBadPXj3VxsDDu844OnaAo52UVmIzIvwwtBPIuNvkjuzBlTWpfJyUbG3ez0KSBibQkj4ojg=="],
"typescript": ["typescript@5.8.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ=="],
"undici": ["undici@5.29.0", "", { "dependencies": { "@fastify/busboy": "^2.0.0" } }, "sha512-raqeBD6NQK4SkWhQzeYKd1KmIG6dllBOTt55Rmkt4HtI9mwdWtJljnrXjAFUBLTSN67HWrOIZ3EPF4kjUw80Bg=="],
"undici-types": ["undici-types@7.8.0", "", {}, "sha512-9UJ2xGDvQ43tYyVMpuHlsgApydB8ZKfVYTsLDhXkFL/6gfkp+U8xTGdh8pMJv1SpZna0zxG1DwsKZsreLbXBxw=="],
"universal-user-agent": ["universal-user-agent@6.0.1", "", {}, "sha512-yCzhz6FN2wU1NiiQRogkTQszlQSlpWaw8SvVegAc+bDxbzHgh1vX8uIe8OYyMH6DwH+sdTJsgMl36+mSMdRJIQ=="],
"wrappy": ["wrappy@1.0.2", "", {}, "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ=="],
"@octokit/plugin-paginate-rest/@octokit/types": ["@octokit/types@12.6.0", "", { "dependencies": { "@octokit/openapi-types": "^20.0.0" } }, "sha512-1rhSOfRa6H9w4YwK0yrf5faDaDTb+yLyBUKOCV4xtCDB5VmIPqd/v9yr9o6SAzOAlRxMiRiCic6JVM1/kunVkw=="],
"@octokit/plugin-rest-endpoint-methods/@octokit/types": ["@octokit/types@12.6.0", "", { "dependencies": { "@octokit/openapi-types": "^20.0.0" } }, "sha512-1rhSOfRa6H9w4YwK0yrf5faDaDTb+yLyBUKOCV4xtCDB5VmIPqd/v9yr9o6SAzOAlRxMiRiCic6JVM1/kunVkw=="],
"@octokit/plugin-paginate-rest/@octokit/types/@octokit/openapi-types": ["@octokit/openapi-types@20.0.0", "", {}, "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA=="],
"@octokit/plugin-rest-endpoint-methods/@octokit/types/@octokit/openapi-types": ["@octokit/openapi-types@20.0.0", "", {}, "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA=="],
}
}

21
.github/actions/codex/package.json vendored Normal file

@@ -0,0 +1,21 @@
{
"name": "codex-action",
"version": "0.0.0",
"private": true,
"scripts": {
"format": "prettier --check src",
"format:fix": "prettier --write src",
"test": "bun test",
"typecheck": "tsc"
},
"dependencies": {
"@actions/core": "^1.11.1",
"@actions/github": "^6.0.1"
},
"devDependencies": {
"@types/bun": "^1.2.18",
"@types/node": "^24.0.13",
"prettier": "^3.6.2",
"typescript": "^5.8.3"
}
}

85
.github/actions/codex/src/add-reaction.ts vendored Normal file

@@ -0,0 +1,85 @@
import * as github from "@actions/github";
import type { EnvContext } from "./env-context";
/**
* Add an "eyes" reaction to the entity (issue, issue comment, or pull request
* review comment) that triggered the current Codex invocation.
*
* The purpose is to provide immediate feedback to the user similar to the
* *-in-progress label flow, indicating that the bot has acknowledged the
* request and is working on it.
*
* We attempt to add the reaction best suited for the current GitHub event:
*
* • issues → POST /repos/{owner}/{repo}/issues/{issue_number}/reactions
* • issue_comment → POST /repos/{owner}/{repo}/issues/comments/{comment_id}/reactions
* • pull_request_review_comment → POST /repos/{owner}/{repo}/pulls/comments/{comment_id}/reactions
*
* If the specific target is unavailable (e.g. unexpected payload shape) we
* silently skip instead of failing the whole action because the reaction is
* merely cosmetic.
*/
export async function addEyesReaction(ctx: EnvContext): Promise<void> {
const octokit = ctx.getOctokit();
const { owner, repo } = github.context.repo;
const eventName = github.context.eventName;
try {
switch (eventName) {
case "issue_comment": {
const commentId = (github.context.payload as any)?.comment?.id;
if (commentId) {
await octokit.rest.reactions.createForIssueComment({
owner,
repo,
comment_id: commentId,
content: "eyes",
});
return;
}
break;
}
case "pull_request_review_comment": {
const commentId = (github.context.payload as any)?.comment?.id;
if (commentId) {
await octokit.rest.reactions.createForPullRequestReviewComment({
owner,
repo,
comment_id: commentId,
content: "eyes",
});
return;
}
break;
}
case "issues": {
const issueNumber = github.context.issue.number;
if (issueNumber) {
await octokit.rest.reactions.createForIssue({
owner,
repo,
issue_number: issueNumber,
content: "eyes",
});
return;
}
break;
}
default: {
// Fallback: try to react to the issue/PR if we have a number.
const issueNumber = github.context.issue.number;
if (issueNumber) {
await octokit.rest.reactions.createForIssue({
owner,
repo,
issue_number: issueNumber,
content: "eyes",
});
}
}
}
} catch (error) {
// Do not fail the action if reaction creation fails; log and continue.
console.warn(`Failed to add \"eyes\" reaction: ${error}`);
}
}

53
.github/actions/codex/src/comment.ts vendored Normal file

@@ -0,0 +1,53 @@
import type { EnvContext } from "./env-context";
import { runCodex } from "./run-codex";
import { postComment } from "./post-comment";
import { addEyesReaction } from "./add-reaction";
/**
* Handle `issue_comment` and `pull_request_review_comment` events once we know
* the action is supported.
*/
export async function onComment(ctx: EnvContext): Promise<void> {
const triggerPhrase = ctx.tryGet("INPUT_TRIGGER_PHRASE");
if (!triggerPhrase) {
console.warn("Empty trigger phrase: skipping.");
return;
}
// Attempt to get the body of the comment from the environment. Depending on
// the event type either `GITHUB_EVENT_COMMENT_BODY` (issue & PR comments) or
// `GITHUB_EVENT_REVIEW_BODY` (PR reviews) is set.
const commentBody =
ctx.tryGetNonEmpty("GITHUB_EVENT_COMMENT_BODY") ??
ctx.tryGetNonEmpty("GITHUB_EVENT_REVIEW_BODY") ??
ctx.tryGetNonEmpty("GITHUB_EVENT_ISSUE_BODY");
if (!commentBody) {
console.warn("Comment body not found in environment: skipping.");
return;
}
// Check if the trigger phrase is present.
if (!commentBody.includes(triggerPhrase)) {
console.log(
`Trigger phrase '${triggerPhrase}' not found: nothing to do for this comment.`,
);
return;
}
// Derive the prompt by removing the trigger phrase. Remove only the first
// occurrence to keep any additional occurrences that might be meaningful.
const prompt = commentBody.replace(triggerPhrase, "").trim();
if (prompt.length === 0) {
console.warn("Prompt is empty after removing trigger phrase: skipping");
return;
}
// Provide immediate feedback that we are working on the request.
await addEyesReaction(ctx);
// Run Codex and post the response as a new comment.
const lastMessage = await runCodex(prompt, ctx);
await postComment(lastMessage, ctx);
}

11
.github/actions/codex/src/config.ts vendored Normal file

@@ -0,0 +1,11 @@
import { readdirSync, statSync } from "fs";
import * as path from "path";
export interface Config {
labels: Record<string, LabelConfig>;
}
export interface LabelConfig {
/** Returns the prompt template. */
getPromptTemplate(): string;
}

44
.github/actions/codex/src/default-label-config.ts vendored Normal file

@@ -0,0 +1,44 @@
import type { Config } from "./config";
export function getDefaultConfig(): Config {
return {
labels: {
"codex-investigate-issue": {
getPromptTemplate: () =>
`
Troubleshoot whether the reported issue is valid.
Provide a concise and respectful comment summarizing the findings.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}
`.trim(),
},
"codex-code-review": {
getPromptTemplate: () =>
`
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the \`base\` and \`head\` refs that define this PR. Both refs are available locally.
`.trim(),
},
"codex-attempt-fix": {
getPromptTemplate: () =>
`
Attempt to solve the reported issue.
If a code change is required, create a new branch, commit the fix, and open a pull-request that resolves the problem.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}
`.trim(),
},
},
};
}

116
.github/actions/codex/src/env-context.ts vendored Normal file

@@ -0,0 +1,116 @@
/*
* Centralised access to environment variables used by the Codex GitHub
* Action.
*
* To enable proper unit-testing we avoid reading from `process.env` at module
* initialisation time. Instead, an `EnvContext` object is created (usually from
* the real `process.env`) and passed around explicitly or, where that is not
* yet practical, imported as the shared `defaultContext` singleton. Tests can
* create their own context backed by a stubbed map of variables without having
* to mutate global state.
*/
import { fail } from "./fail";
import * as github from "@actions/github";
export interface EnvContext {
/**
* Return the value for a given environment variable or terminate the action
* via `fail` if it is missing / empty.
*/
get(name: string): string;
/**
* Attempt to read an environment variable. Returns the value when present;
* otherwise returns undefined (does not call `fail`).
*/
tryGet(name: string): string | undefined;
/**
* Attempt to read an environment variable. Returns non-empty string value or
* null if unset or empty string.
*/
tryGetNonEmpty(name: string): string | null;
/**
* Return a memoised Octokit instance authenticated via the token resolved
* from the provided argument (when defined) or the environment variables
* `GITHUB_TOKEN`/`GH_TOKEN`.
*
* Subsequent calls return the same cached instance to avoid spawning
* multiple REST clients within a single action run.
*/
getOctokit(token?: string): ReturnType<typeof github.getOctokit>;
}
/** Internal helper *not* exported. */
function _getRequiredEnv(
name: string,
env: Record<string, string | undefined>,
): string | undefined {
const value = env[name];
// Avoid leaking secrets into logs while still logging non-secret variables.
if (name.endsWith("KEY") || name.endsWith("TOKEN")) {
if (value) {
console.log(`value for ${name} was found`);
}
} else {
console.log(`${name}=${value}`);
}
return value;
}
/** Create a context backed by the supplied environment map (defaults to `process.env`). */
export function createEnvContext(
env: Record<string, string | undefined> = process.env,
): EnvContext {
// Lazily instantiated Octokit client shared across this context.
let cachedOctokit: ReturnType<typeof github.getOctokit> | null = null;
return {
get(name: string): string {
const value = _getRequiredEnv(name, env);
if (value == null) {
fail(`Missing required environment variable: ${name}`);
}
return value;
},
tryGet(name: string): string | undefined {
return _getRequiredEnv(name, env);
},
tryGetNonEmpty(name: string): string | null {
const value = _getRequiredEnv(name, env);
return value == null || value === "" ? null : value;
},
getOctokit(token?: string) {
if (cachedOctokit) {
return cachedOctokit;
}
// Determine the token to authenticate with.
const githubToken = token ?? env["GITHUB_TOKEN"] ?? env["GH_TOKEN"];
if (!githubToken) {
fail(
"Unable to locate a GitHub token. `github_token` should have been set on the action.",
);
}
cachedOctokit = github.getOctokit(githubToken!);
return cachedOctokit;
},
};
}
/**
* Shared context built from the actual `process.env`. Production code that is
* not yet refactored to receive a context explicitly may import and use this
* singleton. Tests should avoid the singleton and instead pass their own
* context to the functions they exercise.
*/
export const defaultContext: EnvContext = createEnvContext();
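Because `createEnvContext` accepts an arbitrary map, a test can exercise all three accessors without touching `process.env`. A minimal sketch (the variable values here are hypothetical):

```ts
import { createEnvContext } from "./env-context";

const ctx = createEnvContext({
  GITHUB_REPOSITORY: "octo-org/octo-repo",
  GITHUB_RUN_ID: "", // present but empty
});

ctx.get("GITHUB_REPOSITORY");        // "octo-org/octo-repo"
ctx.tryGet("GITHUB_EVENT_NAME");     // undefined (unset)
ctx.tryGetNonEmpty("GITHUB_RUN_ID"); // null (set, but empty)
```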

4
.github/actions/codex/src/fail.ts vendored Normal file

@@ -0,0 +1,4 @@
export function fail(message: string): never {
console.error(message);
process.exit(1);
}

149
.github/actions/codex/src/git-helpers.ts vendored Normal file

@@ -0,0 +1,149 @@
import { spawnSync } from "child_process";
import * as github from "@actions/github";
import { EnvContext } from "./env-context";
function runGit(args: string[], silent = true): string {
console.info(`Running git ${args.join(" ")}`);
const res = spawnSync("git", args, {
encoding: "utf8",
stdio: silent ? ["ignore", "pipe", "pipe"] : "inherit",
});
if (res.error) {
throw res.error;
}
if (res.status !== 0) {
// Embed stderr in the error message so the caller can see why git failed.
throw new Error(
`git ${args.join(" ")} failed with code ${res.status}: ${res.stderr}`,
);
}
return res.stdout.trim();
}
function stageAllChanges() {
runGit(["add", "-A"]);
}
function hasStagedChanges(): boolean {
const res = spawnSync("git", ["diff", "--cached", "--quiet", "--exit-code"]);
return res.status !== 0;
}
function ensureOnBranch(
issueNumber: number,
protectedBranches: string[],
suggestedSlug?: string,
): string {
let branch = "";
try {
branch = runGit(["symbolic-ref", "--short", "-q", "HEAD"]);
} catch {
branch = "";
}
// If detached HEAD or on a protected branch, create a new branch.
if (!branch || protectedBranches.includes(branch)) {
if (suggestedSlug) {
const safeSlug = suggestedSlug
.toLowerCase()
.replace(/[^\w\s-]/g, "")
.trim()
.replace(/\s+/g, "-");
branch = `codex-fix-${issueNumber}-${safeSlug}`;
} else {
branch = `codex-fix-${issueNumber}-${Date.now()}`;
}
runGit(["switch", "-c", branch]);
}
return branch;
}
function commitIfNeeded(issueNumber: number) {
if (hasStagedChanges()) {
runGit([
"commit",
"-m",
`fix: automated fix for #${issueNumber} via Codex`,
]);
}
}
function pushBranch(branch: string, githubToken: string, ctx: EnvContext) {
const repoSlug = ctx.get("GITHUB_REPOSITORY"); // owner/repo
const remoteUrl = `https://x-access-token:${githubToken}@github.com/${repoSlug}.git`;
runGit(["push", "--force-with-lease", "-u", remoteUrl, `HEAD:${branch}`]);
}
/**
* If this returns a string, it is the URL of the created PR.
*/
export async function maybePublishPRForIssue(
issueNumber: number,
lastMessage: string,
ctx: EnvContext,
): Promise<string | undefined> {
// Only proceed if GITHUB_TOKEN available.
const githubToken =
ctx.tryGetNonEmpty("GITHUB_TOKEN") ?? ctx.tryGetNonEmpty("GH_TOKEN");
if (!githubToken) {
console.warn("No GitHub token - skipping PR creation.");
return undefined;
}
// Print `git status` for debugging.
runGit(["status"]);
// Stage any remaining changes so they can be committed and pushed.
stageAllChanges();
const octokit = ctx.getOctokit(githubToken);
const { owner, repo } = github.context.repo;
// Determine default branch to treat as protected.
let defaultBranch = "main";
try {
const repoInfo = await octokit.rest.repos.get({ owner, repo });
defaultBranch = repoInfo.data.default_branch ?? "main";
} catch (e) {
console.warn(`Failed to get default branch, assuming 'main': ${e}`);
}
const sanitizedMessage = lastMessage.replace(/\u2022/g, "-");
const [summaryLine] = sanitizedMessage.split(/\r?\n/);
const branch = ensureOnBranch(issueNumber, [defaultBranch, "master"], summaryLine);
commitIfNeeded(issueNumber);
pushBranch(branch, githubToken, ctx);
// Try to find existing PR for this branch
const headParam = `${owner}:${branch}`;
const existing = await octokit.rest.pulls.list({
owner,
repo,
head: headParam,
state: "open",
});
if (existing.data.length > 0) {
return existing.data[0].html_url;
}
// Determine base branch (default to main)
let baseBranch = "main";
try {
const repoInfo = await octokit.rest.repos.get({ owner, repo });
baseBranch = repoInfo.data.default_branch ?? "main";
} catch (e) {
console.warn(`Failed to get default branch, assuming 'main': ${e}`);
}
const pr = await octokit.rest.pulls.create({
owner,
repo,
title: summaryLine,
head: branch,
base: baseBranch,
body: sanitizedMessage,
});
return pr.data.html_url;
}
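To make the branch naming in `ensureOnBranch` concrete, here is the slug sanitisation applied to a hypothetical summary line:

```ts
// Hypothetical first line of Codex's last message.
const suggestedSlug = "Fix: crash when opening settings!";
const safeSlug = suggestedSlug
  .toLowerCase()
  .replace(/[^\w\s-]/g, "") // drops ":" and "!"
  .trim()
  .replace(/\s+/g, "-");

const branch = `codex-fix-${123}-${safeSlug}`;
// => "codex-fix-123-fix-crash-when-opening-settings"
```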

16
.github/actions/codex/src/git-user.ts vendored Normal file

@@ -0,0 +1,16 @@
export function setGitHubActionsUser(): void {
const commands = [
["git", "config", "--global", "user.name", "github-actions[bot]"],
[
"git",
"config",
"--global",
"user.email",
"41898282+github-actions[bot]@users.noreply.github.com",
],
];
for (const command of commands) {
Bun.spawnSync(command);
}
}

11
.github/actions/codex/src/github-workspace.ts vendored Normal file

@@ -0,0 +1,11 @@
import * as pathMod from "path";
import { EnvContext } from "./env-context";
export function resolveWorkspacePath(path: string, ctx: EnvContext): string {
if (pathMod.isAbsolute(path)) {
return path;
} else {
const workspace = ctx.get("GITHUB_WORKSPACE");
return pathMod.join(workspace, path);
}
}

56
.github/actions/codex/src/load-config.ts vendored Normal file

@@ -0,0 +1,56 @@
import type { Config, LabelConfig } from "./config";
import { getDefaultConfig } from "./default-label-config";
import { readFileSync, readdirSync, statSync } from "fs";
import * as path from "path";
/**
* Build an in-memory configuration object by scanning the repository for
* Markdown templates located in `.github/codex/labels`.
*
* Each `*.md` file in that directory represents a label that can trigger the
* Codex GitHub Action. The filename **without** the extension is interpreted
* as the label name, e.g. `codex-review.md` ➜ `codex-review`.
*
* For every such label we derive the corresponding `doneLabel` by appending
* the suffix `-completed`.
*/
export function loadConfig(workspace: string): Config {
const labelsDir = path.join(workspace, ".github", "codex", "labels");
let entries: string[];
try {
entries = readdirSync(labelsDir);
} catch {
// If the directory is missing, return the default configuration.
return getDefaultConfig();
}
const labels: Record<string, LabelConfig> = {};
for (const entry of entries) {
if (!entry.endsWith(".md")) {
continue;
}
const fullPath = path.join(labelsDir, entry);
if (!statSync(fullPath).isFile()) {
continue;
}
const labelName = entry.slice(0, -3); // trim ".md"
labels[labelName] = new FileLabelConfig(fullPath);
}
return { labels };
}
class FileLabelConfig implements LabelConfig {
constructor(private readonly promptPath: string) {}
getPromptTemplate(): string {
return readFileSync(this.promptPath, "utf8");
}
}
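A sketch of the resulting shape, assuming a workspace whose `.github/codex/labels` directory contains `codex-review.md` and `codex-attempt.md`:

```ts
import { loadConfig } from "./load-config";

const config = loadConfig("/path/to/workspace"); // hypothetical workspace
// config.labels = {
//   "codex-review":  FileLabelConfig for .github/codex/labels/codex-review.md,
//   "codex-attempt": FileLabelConfig for .github/codex/labels/codex-attempt.md,
// }
const template = config.labels["codex-review"].getPromptTemplate();
```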

80
.github/actions/codex/src/main.ts vendored Executable file

@@ -0,0 +1,80 @@
#!/usr/bin/env bun
import type { Config } from "./config";
import { defaultContext, EnvContext } from "./env-context";
import { loadConfig } from "./load-config";
import { setGitHubActionsUser } from "./git-user";
import { onLabeled } from "./process-label";
import { ensureBaseAndHeadCommitsForPRAreAvailable } from "./prompt-template";
import { performAdditionalValidation } from "./verify-inputs";
import { onComment } from "./comment";
import { onReview } from "./review";
async function main(): Promise<void> {
const ctx: EnvContext = defaultContext;
// Build the configuration dynamically by scanning `.github/codex/labels`.
const GITHUB_WORKSPACE = ctx.get("GITHUB_WORKSPACE");
const config: Config = loadConfig(GITHUB_WORKSPACE);
// Optionally perform additional validation of prompt template files.
performAdditionalValidation(config, GITHUB_WORKSPACE);
const GITHUB_EVENT_NAME = ctx.get("GITHUB_EVENT_NAME");
const GITHUB_EVENT_ACTION = ctx.get("GITHUB_EVENT_ACTION");
// Set user.name and user.email to a bot before Codex runs, just in case it
// creates a commit.
setGitHubActionsUser();
switch (GITHUB_EVENT_NAME) {
case "issues": {
if (GITHUB_EVENT_ACTION === "labeled") {
await onLabeled(config, ctx);
return;
} else if (GITHUB_EVENT_ACTION === "opened") {
await onComment(ctx);
return;
}
break;
}
case "issue_comment": {
if (GITHUB_EVENT_ACTION === "created") {
await onComment(ctx);
return;
}
break;
}
case "pull_request": {
if (GITHUB_EVENT_ACTION === "labeled") {
await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
await onLabeled(config, ctx);
return;
}
break;
}
case "pull_request_review": {
await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
if (GITHUB_EVENT_ACTION === "submitted") {
await onReview(ctx);
return;
}
break;
}
case "pull_request_review_comment": {
await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
if (GITHUB_EVENT_ACTION === "created") {
await onComment(ctx);
return;
}
break;
}
}
console.warn(
`Unsupported action '${GITHUB_EVENT_ACTION}' for event '${GITHUB_EVENT_NAME}'.`,
);
}
main();

62
.github/actions/codex/src/post-comment.ts vendored Normal file

@@ -0,0 +1,62 @@
import { fail } from "./fail";
import * as github from "@actions/github";
import { EnvContext } from "./env-context";
/**
* Post a comment to the issue / pull request currently in scope.
*
* Provide the environment context so that token lookup (inside getOctokit) does
* not rely on global state.
*/
export async function postComment(
commentBody: string,
ctx: EnvContext,
): Promise<void> {
// Append a footer with a link back to the workflow run, if available.
const footer = buildWorkflowRunFooter(ctx);
const bodyWithFooter = footer ? `${commentBody}${footer}` : commentBody;
const octokit = ctx.getOctokit();
console.info("Got Octokit instance for posting comment");
const { owner, repo } = github.context.repo;
const issueNumber = github.context.issue.number;
if (!issueNumber) {
console.warn(
"No issue or pull_request number found in GitHub context; skipping comment creation.",
);
return;
}
try {
console.info("Calling octokit.rest.issues.createComment()");
await octokit.rest.issues.createComment({
owner,
repo,
issue_number: issueNumber,
body: bodyWithFooter,
});
} catch (error) {
fail(`Failed to create comment via GitHub API: ${error}`);
}
}
/**
* Helper to build a Markdown fragment linking back to the workflow run that
* generated the current comment. Returns `undefined` if required environment
* variables are missing (e.g. when running outside of GitHub Actions), so we
* can gracefully skip the footer in those cases.
*/
function buildWorkflowRunFooter(ctx: EnvContext): string | undefined {
const serverUrl =
ctx.tryGetNonEmpty("GITHUB_SERVER_URL") ?? "https://github.com";
const repository = ctx.tryGetNonEmpty("GITHUB_REPOSITORY");
const runId = ctx.tryGetNonEmpty("GITHUB_RUN_ID");
if (!repository || !runId) {
return undefined;
}
const url = `${serverUrl}/${repository}/actions/runs/${runId}`;
return `\n\n---\n*[_View workflow run_](${url})*`;
}
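For a hypothetical run, the footer resolves to a fragment like this (all values made up):

```ts
const serverUrl = "https://github.com";  // GITHUB_SERVER_URL (default)
const repository = "octo-org/octo-repo"; // GITHUB_REPOSITORY
const runId = "9999999999";              // GITHUB_RUN_ID

const url = `${serverUrl}/${repository}/actions/runs/${runId}`;
const footer = `\n\n---\n*[_View workflow run_](${url})*`;
// postComment appends this footer to every comment body it posts.
```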

195
.github/actions/codex/src/process-label.ts vendored Normal file

@@ -0,0 +1,195 @@
import { fail } from "./fail";
import { EnvContext } from "./env-context";
import { renderPromptTemplate } from "./prompt-template";
import { postComment } from "./post-comment";
import { runCodex } from "./run-codex";
import * as github from "@actions/github";
import { Config, LabelConfig } from "./config";
import { maybePublishPRForIssue } from "./git-helpers";
export async function onLabeled(
config: Config,
ctx: EnvContext,
): Promise<void> {
const GITHUB_EVENT_LABEL_NAME = ctx.get("GITHUB_EVENT_LABEL_NAME");
const labelConfig = config.labels[GITHUB_EVENT_LABEL_NAME] as
| LabelConfig
| undefined;
if (!labelConfig) {
fail(
`Label \`${GITHUB_EVENT_LABEL_NAME}\` not found in config: ${JSON.stringify(config)}`,
);
}
await processLabelConfig(ctx, GITHUB_EVENT_LABEL_NAME, labelConfig);
}
/**
* Wrapper that handles `-in-progress` and `-completed` semantics around the core lint/fix/review
* processing. It will:
*
* - Skip execution if the `-in-progress` or `-completed` label is already present.
* - Mark the PR/issue as `-in-progress`.
* - After successful execution, mark the PR/issue as `-completed`.
*/
async function processLabelConfig(
ctx: EnvContext,
label: string,
labelConfig: LabelConfig,
): Promise<void> {
const octokit = ctx.getOctokit();
const { owner, repo, issueNumber, labelNames } =
await getCurrentLabels(octokit);
const inProgressLabel = `${label}-in-progress`;
const completedLabel = `${label}-completed`;
for (const markerLabel of [inProgressLabel, completedLabel]) {
if (labelNames.includes(markerLabel)) {
console.log(
`Label '${markerLabel}' already present on issue/PR #${issueNumber}. Skipping Codex action.`,
);
// Clean up: remove the triggering label to avoid confusion and re-runs.
await addAndRemoveLabels(octokit, {
owner,
repo,
issueNumber,
remove: label,
});
return;
}
}
// Mark the PR/issue as in progress.
await addAndRemoveLabels(octokit, {
owner,
repo,
issueNumber,
add: inProgressLabel,
remove: label,
});
// Run the core Codex processing.
await processLabel(ctx, label, labelConfig);
// Mark the PR/issue as completed.
await addAndRemoveLabels(octokit, {
owner,
repo,
issueNumber,
add: completedLabel,
remove: inProgressLabel,
});
}
async function processLabel(
ctx: EnvContext,
label: string,
labelConfig: LabelConfig,
): Promise<void> {
const template = labelConfig.getPromptTemplate();
const populatedTemplate = await renderPromptTemplate(template, ctx);
// Always run Codex and post the resulting message as a comment.
let commentBody = await runCodex(populatedTemplate, ctx);
// Current heuristic: only try to create a PR if "attempt" or "fix" is in the
// label name. (Yes, we plan to evolve this.)
if (label.indexOf("fix") !== -1 || label.indexOf("attempt") !== -1) {
console.info(`label ${label} indicates we should attempt to create a PR`);
const prUrl = await maybeFixIssue(ctx, commentBody);
if (prUrl) {
commentBody += `\n\n---\nOpened pull request: ${prUrl}`;
}
} else {
console.info(
`label ${label} does not indicate we should attempt to create a PR`,
);
}
await postComment(commentBody, ctx);
}
async function maybeFixIssue(
ctx: EnvContext,
lastMessage: string,
): Promise<string | undefined> {
// Attempt to create a PR out of any changes Codex produced.
const issueNumber = github.context.issue.number!; // exists for issues triggering this path
try {
return await maybePublishPRForIssue(issueNumber, lastMessage, ctx);
} catch (e) {
console.warn(`Failed to publish PR: ${e}`);
}
}
async function getCurrentLabels(
octokit: ReturnType<typeof github.getOctokit>,
): Promise<{
owner: string;
repo: string;
issueNumber: number;
labelNames: Array<string>;
}> {
const { owner, repo } = github.context.repo;
const issueNumber = github.context.issue.number;
if (!issueNumber) {
fail("No issue or pull_request number found in GitHub context.");
}
const { data: issueData } = await octokit.rest.issues.get({
owner,
repo,
issue_number: issueNumber,
});
const labelNames =
issueData.labels?.map((label: any) =>
typeof label === "string" ? label : label.name,
) ?? [];
return { owner, repo, issueNumber, labelNames };
}
async function addAndRemoveLabels(
octokit: ReturnType<typeof github.getOctokit>,
opts: {
owner: string;
repo: string;
issueNumber: number;
add?: string;
remove?: string;
},
): Promise<void> {
const { owner, repo, issueNumber, add, remove } = opts;
if (add) {
try {
await octokit.rest.issues.addLabels({
owner,
repo,
issue_number: issueNumber,
labels: [add],
});
} catch (error) {
console.warn(`Failed to add label '${add}': ${error}`);
}
}
if (remove) {
try {
await octokit.rest.issues.removeLabel({
owner,
repo,
issue_number: issueNumber,
name: remove,
});
} catch (error) {
console.warn(`Failed to remove label '${remove}': ${error}`);
}
}
}
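Putting the marker-label semantics together, a sketch of the lifecycle for a hypothetical `codex-attempt` trigger:

```ts
const label = "codex-attempt"; // hypothetical trigger label
const inProgressLabel = `${label}-in-progress`;
const completedLabel = `${label}-completed`;
// 1. Issue is labeled "codex-attempt".
// 2. processLabelConfig adds "codex-attempt-in-progress" and removes the trigger.
// 3. processLabel renders the template, runs Codex, maybe opens a PR, posts a comment.
// 4. On success it adds "codex-attempt-completed" and removes the in-progress marker.
// If either marker is already present when the event fires, the run is skipped.
```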

284
.github/actions/codex/src/prompt-template.ts vendored Normal file

@@ -0,0 +1,284 @@
/*
* Utilities to render Codex prompt templates.
*
* A template is a Markdown (or plain-text) file that may contain one or more
* placeholders of the form `{CODEX_ACTION_<NAME>}`. At runtime these
* placeholders are substituted with dynamically generated content. Each
* placeholder is resolved **exactly once** even if it appears multiple times
* in the same template.
*/
import { readFile } from "fs/promises";
import { EnvContext } from "./env-context";
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/**
* Lazily caches parsed `$GITHUB_EVENT_PATH` contents keyed by the file path so
* we only hit the filesystem once per unique event payload.
*/
const githubEventDataCache: Map<string, Promise<any>> = new Map();
function getGitHubEventData(ctx: EnvContext): Promise<any> {
const eventPath = ctx.get("GITHUB_EVENT_PATH");
let cached = githubEventDataCache.get(eventPath);
if (!cached) {
cached = readFile(eventPath, "utf8").then((raw) => JSON.parse(raw));
githubEventDataCache.set(eventPath, cached);
}
return cached;
}
async function runCommand(args: Array<string>): Promise<string> {
const result = Bun.spawnSync(args, {
stdout: "pipe",
stderr: "pipe",
});
if (result.success) {
return result.stdout.toString();
}
console.error(`Error running ${JSON.stringify(args)}: ${result.stderr}`);
return "";
}
// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------
// Regex that captures the variable name without the surrounding { } braces.
const VAR_REGEX = /\{(CODEX_ACTION_[A-Z0-9_]+)\}/g;
// Cache individual placeholder values so each one is resolved at most once per
// process even if many templates reference it.
const placeholderCache: Map<string, Promise<string>> = new Map();
/**
* Parse a template string, resolve all placeholders and return the rendered
* result.
*/
export async function renderPromptTemplate(
template: string,
ctx: EnvContext,
): Promise<string> {
// ---------------------------------------------------------------------
// 1) Gather all *unique* placeholders present in the template.
// ---------------------------------------------------------------------
const variables = new Set<string>();
for (const match of template.matchAll(VAR_REGEX)) {
variables.add(match[1]);
}
// ---------------------------------------------------------------------
// 2) Kick off (or reuse) async resolution for each variable.
// ---------------------------------------------------------------------
for (const variable of variables) {
if (!placeholderCache.has(variable)) {
placeholderCache.set(variable, resolveVariable(variable, ctx));
}
}
// ---------------------------------------------------------------------
// 3) Await completion so we can perform a simple synchronous replace below.
// ---------------------------------------------------------------------
const resolvedEntries: [string, string][] = [];
for (const [key, promise] of placeholderCache.entries()) {
resolvedEntries.push([key, await promise]);
}
const resolvedMap = new Map<string, string>(resolvedEntries);
// ---------------------------------------------------------------------
// 4) Replace each occurrence. We use replace with a callback to ensure
// correct substitution even if variable names overlap (they shouldn't,
// but better safe than sorry).
// ---------------------------------------------------------------------
return template.replace(VAR_REGEX, (_, varName: string) => {
return resolvedMap.get(varName) ?? "";
});
}
export async function ensureBaseAndHeadCommitsForPRAreAvailable(
ctx: EnvContext,
): Promise<{ baseSha: string; headSha: string } | null> {
const prShas = await getPrShas(ctx);
if (prShas == null) {
console.warn("Unable to resolve PR branches");
return null;
}
const event = await getGitHubEventData(ctx);
const pr = event.pull_request;
if (!pr) {
console.warn("event.pull_request is not defined - unexpected");
return null;
}
const workspace = ctx.get("GITHUB_WORKSPACE");
// Refs (branch names)
const baseRef: string | undefined = pr.base?.ref;
const headRef: string | undefined = pr.head?.ref;
// Clone URLs
const baseRemoteUrl: string | undefined = pr.base?.repo?.clone_url;
const headRemoteUrl: string | undefined = pr.head?.repo?.clone_url;
if (!baseRef || !headRef || !baseRemoteUrl || !headRemoteUrl) {
console.warn(
"Missing PR ref or remote URL information - cannot fetch commits",
);
return null;
}
// Ensure we have the base branch.
await runCommand([
"git",
"-C",
workspace,
"fetch",
"--no-tags",
"origin",
baseRef,
]);
// Ensure we have the head branch.
if (headRemoteUrl === baseRemoteUrl) {
// Same repository: the commit is available from `origin`.
await runCommand([
"git",
"-C",
workspace,
"fetch",
"--no-tags",
"origin",
headRef,
]);
} else {
// Fork: make sure a `pr` remote exists that points at the fork. Attempting
// to add a remote that already exists causes git to error, so we swallow
// any non-zero exit codes from that specific command.
await runCommand([
"git",
"-C",
workspace,
"remote",
"add",
"pr",
headRemoteUrl,
]);
// Whether adding succeeded or the remote already existed, attempt to fetch
// the head ref from the `pr` remote.
await runCommand([
"git",
"-C",
workspace,
"fetch",
"--no-tags",
"pr",
headRef,
]);
}
return prShas;
}
// ---------------------------------------------------------------------------
// Internal helpers still exported for use by other modules.
// ---------------------------------------------------------------------------
export async function resolvePrDiff(ctx: EnvContext): Promise<string> {
const prShas = await ensureBaseAndHeadCommitsForPRAreAvailable(ctx);
if (prShas == null) {
console.warn("Unable to resolve PR branches");
return "";
}
const workspace = ctx.get("GITHUB_WORKSPACE");
const { baseSha, headSha } = prShas;
return runCommand([
"git",
"-C",
workspace,
"diff",
"--color=never",
`${baseSha}..${headSha}`,
]);
}
// ---------------------------------------------------------------------------
// Placeholder resolution
// ---------------------------------------------------------------------------
async function resolveVariable(name: string, ctx: EnvContext): Promise<string> {
switch (name) {
case "CODEX_ACTION_ISSUE_TITLE": {
const event = await getGitHubEventData(ctx);
const issue = event.issue ?? event.pull_request;
return issue?.title ?? "";
}
case "CODEX_ACTION_ISSUE_BODY": {
const event = await getGitHubEventData(ctx);
const issue = event.issue ?? event.pull_request;
return issue?.body ?? "";
}
case "CODEX_ACTION_GITHUB_EVENT_PATH": {
return ctx.get("GITHUB_EVENT_PATH");
}
case "CODEX_ACTION_BASE_REF": {
const event = await getGitHubEventData(ctx);
return event?.pull_request?.base?.ref ?? "";
}
case "CODEX_ACTION_HEAD_REF": {
const event = await getGitHubEventData(ctx);
return event?.pull_request?.head?.ref ?? "";
}
case "CODEX_ACTION_PR_DIFF": {
return resolvePrDiff(ctx);
}
// -------------------------------------------------------------------
// Add new template variables here.
// -------------------------------------------------------------------
default: {
// Unknown variable: leave it blank to avoid leaking placeholders to the
// final prompt. The alternative would be to `fail()` here, but silently
// ignoring unknown placeholders is more forgiving and better matches the
// behaviour of typical template engines.
console.warn(`Unknown template variable: ${name}`);
return "";
}
}
}
async function getPrShas(
ctx: EnvContext,
): Promise<{ baseSha: string; headSha: string } | null> {
const event = await getGitHubEventData(ctx);
const pr = event.pull_request;
if (!pr) {
console.warn("event.pull_request is not defined");
return null;
}
// Prefer explicit SHAs if available to avoid relying on local branch names.
const baseSha: string | undefined = pr.base?.sha;
const headSha: string | undefined = pr.head?.sha;
if (!baseSha || !headSha) {
console.warn("one of base or head is not defined on event.pull_request");
return null;
}
return { baseSha, headSha };
}
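End to end, rendering looks like this: a sketch assuming an `EnvContext` `ctx` whose `GITHUB_EVENT_PATH` points at a hypothetical issue payload:

```ts
import { renderPromptTemplate } from "./prompt-template";

// Hypothetical payload at GITHUB_EVENT_PATH:
//   { "issue": { "title": "App crashes on launch", "body": "Steps: ..." } }
const template = [
  "### {CODEX_ACTION_ISSUE_TITLE}",
  "{CODEX_ACTION_ISSUE_BODY}",
].join("\n");

const rendered = await renderPromptTemplate(template, ctx);
// "### App crashes on launch\nSteps: ..."
```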

42
.github/actions/codex/src/review.ts vendored Normal file

@@ -0,0 +1,42 @@
import type { EnvContext } from "./env-context";
import { runCodex } from "./run-codex";
import { postComment } from "./post-comment";
import { addEyesReaction } from "./add-reaction";
/**
* Handle `pull_request_review` events. We treat the review body the same way
* as a normal comment.
*/
export async function onReview(ctx: EnvContext): Promise<void> {
const triggerPhrase = ctx.tryGet("INPUT_TRIGGER_PHRASE");
if (!triggerPhrase) {
console.warn("Empty trigger phrase: skipping.");
return;
}
const reviewBody = ctx.tryGet("GITHUB_EVENT_REVIEW_BODY");
if (!reviewBody) {
console.warn("Review body not found in environment: skipping.");
return;
}
if (!reviewBody.includes(triggerPhrase)) {
console.log(
`Trigger phrase '${triggerPhrase}' not found: nothing to do for this review.`,
);
return;
}
const prompt = reviewBody.replace(triggerPhrase, "").trim();
if (prompt.length === 0) {
console.warn("Prompt is empty after removing trigger phrase: skipping.");
return;
}
await addEyesReaction(ctx);
const lastMessage = await runCodex(prompt, ctx);
await postComment(lastMessage, ctx);
}
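The trigger-phrase extraction is simple string surgery; a sketch with hypothetical values:

```ts
const triggerPhrase = "@codex"; // hypothetical INPUT_TRIGGER_PHRASE
const reviewBody = "@codex please double-check the error handling here";

const prompt = reviewBody.replace(triggerPhrase, "").trim();
// => "please double-check the error handling here"
```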

56
.github/actions/codex/src/run-codex.ts vendored Normal file

@@ -0,0 +1,56 @@
import { fail } from "./fail";
import { EnvContext } from "./env-context";
import { tmpdir } from "os";
import { join } from "node:path";
import { readFile, mkdtemp } from "fs/promises";
import { resolveWorkspacePath } from "./github-workspace";
/**
* Runs the Codex CLI with the provided prompt and returns the output written
* to the "last message" file.
*/
export async function runCodex(
prompt: string,
ctx: EnvContext,
): Promise<string> {
const OPENAI_API_KEY = ctx.get("OPENAI_API_KEY");
const tempDirPath = await mkdtemp(join(tmpdir(), "codex-"));
const lastMessageOutput = join(tempDirPath, "codex-prompt.md");
const args = ["/usr/local/bin/codex-exec"];
const inputCodexArgs = ctx.tryGet("INPUT_CODEX_ARGS")?.trim();
if (inputCodexArgs) {
args.push(...inputCodexArgs.split(/\s+/));
}
args.push("--output-last-message", lastMessageOutput, prompt);
const env: Record<string, string> = { ...process.env, OPENAI_API_KEY };
const INPUT_CODEX_HOME = ctx.tryGet("INPUT_CODEX_HOME");
if (INPUT_CODEX_HOME) {
env.CODEX_HOME = resolveWorkspacePath(INPUT_CODEX_HOME, ctx);
}
console.log(`Running Codex: ${JSON.stringify(args)}`);
const result = Bun.spawnSync(args, {
stdout: "inherit",
stderr: "inherit",
env,
});
if (!result.success) {
fail(`Codex failed: see above for details.`);
}
// Read the output generated by Codex.
let lastMessage: string;
try {
lastMessage = await readFile(lastMessageOutput, "utf8");
} catch (err) {
fail(`Failed to read Codex output at '${lastMessageOutput}': ${err}`);
}
return lastMessage;
}
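Note that `INPUT_CODEX_ARGS` is split on whitespace, so quoted arguments containing spaces are not supported. A sketch of the resulting argv for a hypothetical value:

```ts
const inputCodexArgs = "--model o3"; // hypothetical INPUT_CODEX_ARGS
const args = ["/usr/local/bin/codex-exec", ...inputCodexArgs.split(/\s+/)];
// => ["/usr/local/bin/codex-exec", "--model", "o3"]
// runCodex then appends: "--output-last-message", <temp file>, <prompt>.
```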

33
.github/actions/codex/src/verify-inputs.ts vendored Normal file

@@ -0,0 +1,33 @@
// Validate the inputs passed to the composite action.
// The script currently ensures that any prompt files referenced by the
// configuration exist and are Markdown.
import type { Config } from "./config";
import { existsSync } from "fs";
import * as path from "path";
import { fail } from "./fail";
export function performAdditionalValidation(config: Config, workspace: string) {
// Additional validation: ensure referenced prompt files exist and are Markdown.
for (const [label, details] of Object.entries(config.labels)) {
// Determine which prompt key is present (the schema guarantees exactly one).
const promptPathStr =
(details as any).prompt ?? (details as any).promptPath;
if (promptPathStr) {
const promptPath = path.isAbsolute(promptPathStr)
? promptPathStr
: path.join(workspace, promptPathStr);
if (!existsSync(promptPath)) {
fail(`Prompt file for label '${label}' not found: ${promptPath}`);
}
if (!promptPath.endsWith(".md")) {
fail(
`Prompt file for label '${label}' must be a .md file (got ${promptPathStr}).`,
);
}
}
}
}

15
.github/actions/codex/tsconfig.json vendored Normal file

@@ -0,0 +1,15 @@
{
"compilerOptions": {
"lib": ["ESNext"],
"target": "ESNext",
"module": "ESNext",
"moduleDetection": "force",
"moduleResolution": "bundler",
"noEmit": true,
"strict": true,
"skipLibCheck": true
},
"include": ["src"]
}

3
.github/codex/home/config.toml vendored Normal file

@@ -0,0 +1,3 @@
model = "o3"
# Consider setting [mcp_servers] here!

9
.github/codex/labels/codex-attempt.md vendored Normal file

@@ -0,0 +1,9 @@
Attempt to solve the reported issue.
If a code change is required, create a new branch, commit the fix, and open a pull request that resolves the problem.
Here is the original GitHub issue that triggered this run:
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}

7
.github/codex/labels/codex-review.md vendored Normal file

@@ -0,0 +1,7 @@
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the `base` and `head` refs that define this PR. Both refs are available locally.

7
.github/codex/labels/codex-triage.md vendored Normal file

@@ -0,0 +1,7 @@
Troubleshoot whether the reported issue is valid.
Provide a concise and respectful comment summarizing the findings.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}

26
.github/dependabot.yaml vendored Normal file

@@ -0,0 +1,26 @@
# https://docs.github.com/en/code-security/dependabot/working-with-dependabot/dependabot-options-reference#package-ecosystem-
version: 2
updates:
- package-ecosystem: bun
directory: .github/actions/codex
schedule:
interval: weekly
- package-ecosystem: cargo
directories:
- codex-rs
- codex-rs/*
schedule:
interval: weekly
- package-ecosystem: devcontainers
directory: /
schedule:
interval: weekly
- package-ecosystem: docker
directory: codex-cli
schedule:
interval: weekly
- package-ecosystem: github-actions
directory: /
schedule:
interval: weekly

.github/dotslash-config.json vendored

@@ -5,7 +5,7 @@
"macos-aarch64": { "regex": "^codex-exec-aarch64-apple-darwin\\.zst$", "path": "codex-exec" },
"macos-x86_64": { "regex": "^codex-exec-x86_64-apple-darwin\\.zst$", "path": "codex-exec" },
"linux-x86_64": { "regex": "^codex-exec-x86_64-unknown-linux-musl\\.zst$", "path": "codex-exec" },
"linux-aarch64": { "regex": "^codex-exec-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-exec" }
"linux-aarch64": { "regex": "^codex-exec-aarch64-unknown-linux-musl\\.zst$", "path": "codex-exec" }
}
},
@@ -14,14 +14,14 @@
"macos-aarch64": { "regex": "^codex-aarch64-apple-darwin\\.zst$", "path": "codex" },
"macos-x86_64": { "regex": "^codex-x86_64-apple-darwin\\.zst$", "path": "codex" },
"linux-x86_64": { "regex": "^codex-x86_64-unknown-linux-musl\\.zst$", "path": "codex" },
"linux-aarch64": { "regex": "^codex-aarch64-unknown-linux-gnu\\.zst$", "path": "codex" }
"linux-aarch64": { "regex": "^codex-aarch64-unknown-linux-musl\\.zst$", "path": "codex" }
}
},
"codex-linux-sandbox": {
"platforms": {
"linux-x86_64": { "regex": "^codex-linux-sandbox-x86_64-unknown-linux-musl\\.zst$", "path": "codex-linux-sandbox" },
"linux-aarch64": { "regex": "^codex-linux-sandbox-aarch64-unknown-linux-gnu\\.zst$", "path": "codex-linux-sandbox" }
"linux-aarch64": { "regex": "^codex-linux-sandbox-aarch64-unknown-linux-musl\\.zst$", "path": "codex-linux-sandbox" }
}
}
}

.github/workflows/ci.yml vendored

@@ -74,7 +74,12 @@ jobs:
GH_TOKEN: ${{ github.token }}
run: pnpm stage-release
- name: Ensure README.md contains only ASCII and certain Unicode code points
- name: Ensure root README.md contains only ASCII and certain Unicode code points
run: ./scripts/asciicheck.py README.md
- name: Check README ToC
- name: Check root README ToC
run: python3 scripts/readme_toc.py README.md
- name: Ensure codex-cli/README.md contains only ASCII and certain Unicode code points
run: ./scripts/asciicheck.py codex-cli/README.md
- name: Check codex-cli/README ToC
run: python3 scripts/readme_toc.py codex-cli/README.md

95
.github/workflows/codex.yml vendored Normal file
View File

@@ -0,0 +1,95 @@
name: Codex
on:
issues:
types: [opened, labeled]
pull_request:
branches: [main]
types: [labeled]
jobs:
codex:
# This `if` check provides complex filtering logic to avoid running Codex
# on every PR. Admittedly, one thing this does not verify is whether the
# sender has write access to the repo: that must be done as part of a
# runtime step.
#
# Note the label values should match the ones in the .github/codex/labels
# folder.
if: |
(github.event_name == 'issues' && (
(github.event.action == 'labeled' && (github.event.label.name == 'codex-attempt' || github.event.label.name == 'codex-triage'))
)) ||
(github.event_name == 'pull_request' && github.event.action == 'labeled' && github.event.label.name == 'codex-review')
runs-on: ubuntu-latest
permissions:
contents: write # can push or create branches
issues: write # for comments + labels on issues/PRs
pull-requests: write # for PR comments/labels
steps:
# TODO: Consider adding an optional mode (--dry-run?) to actions/codex
# that verifies whether Codex should actually be run for this event.
# (For example, it may be rejected because the sender does not have
# write access to the repo.) The benefit would be two-fold:
# 1. As the first step of this job, it gives us a chance to add a reaction
# or comment to the PR/issue ASAP to "ack" the request.
# 2. It saves resources by skipping the clone and setup steps below if
# Codex is not going to run.
- name: Checkout repository
uses: actions/checkout@v4
# We install the dependencies like we would for an ordinary CI job,
# particularly because Codex will not have network access to install
# these dependencies.
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 10.8.1
run_install: false
- name: Get pnpm store directory
id: pnpm-cache
shell: bash
run: |
echo "store_path=$(pnpm store path --silent)" >> $GITHUB_OUTPUT
- name: Setup pnpm cache
uses: actions/cache@v4
with:
path: ${{ steps.pnpm-cache.outputs.store_path }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
- name: Install dependencies
run: pnpm install
- uses: dtolnay/rust-toolchain@1.88
with:
targets: x86_64-unknown-linux-gnu
components: clippy
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
${{ github.workspace }}/codex-rs/target/
key: cargo-ubuntu-24.04-x86_64-unknown-linux-gnu-${{ hashFiles('**/Cargo.lock') }}
# Note it is possible that the `verify` step internal to Run Codex will
# fail, in which case the work to set up the repo was worthless :(
- name: Run Codex
uses: ./.github/actions/codex
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
github_token: ${{ secrets.GITHUB_TOKEN }}
codex_home: ./.github/codex/home

.github/workflows/rust-ci.yml vendored

@@ -26,7 +26,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
components: rustfmt
- name: cargo fmt
@@ -55,12 +55,16 @@ jobs:
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-musl
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-gnu
- runner: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
targets: ${{ matrix.target }}
components: clippy
@@ -75,7 +79,7 @@ jobs:
${{ github.workspace }}/codex-rs/target/
key: cargo-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl'}}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config

.github/workflows/rust-release.yml vendored

@@ -15,9 +15,6 @@ concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
env:
TAG_REGEX: '^rust-v[0-9]+\.[0-9]+\.[0-9]+$'
jobs:
tag-check:
runs-on: ubuntu-latest
@@ -33,8 +30,8 @@ jobs:
# 1. Must be a tag and match the regex
[[ "${GITHUB_REF_TYPE}" == "tag" ]] \
|| { echo "❌ Not a tag push"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ${TAG_REGEX} ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' != ${TAG_REGEX}"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ^rust-v[0-9]+\.[0-9]+\.[0-9]+(-(alpha|beta)(\.[0-9]+)?)?$ ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' doesn't match expected format"; exit 1; }
# 2. Extract versions
tag_ver="${GITHUB_REF_NAME#rust-v}"
@@ -69,12 +66,14 @@ jobs:
target: x86_64-unknown-linux-musl
- runner: ubuntu-24.04
target: x86_64-unknown-linux-gnu
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-musl
- runner: ubuntu-24.04-arm
target: aarch64-unknown-linux-gnu
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
targets: ${{ matrix.target }}
@@ -88,7 +87,7 @@ jobs:
${{ github.workspace }}/codex-rs/target/
key: cargo-release-${{ matrix.runner }}-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' }}
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl'}}
name: Install musl build tools
run: |
sudo apt install -y musl-tools pkg-config
@@ -105,7 +104,10 @@ jobs:
cp target/${{ matrix.target }}/release/codex-exec "$dest/codex-exec-${{ matrix.target }}"
cp target/${{ matrix.target }}/release/codex "$dest/codex-${{ matrix.target }}"
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'x86_64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-gnu' }}
# After https://github.com/openai/codex/pull/1228 is merged and a new
# release is cut with artifacts built after that PR, the `-gnu`
# variants can go away as we will only use the `-musl` variants.
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'x86_64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-gnu' || matrix.target == 'aarch64-unknown-linux-musl' }}
name: Stage Linux-only artifacts
shell: bash
run: |
@@ -155,9 +157,7 @@ jobs:
release:
needs: build
name: release
runs-on: ubuntu-24.04
env:
RELEASE_TAG: codex-rs-${{ github.sha }}-${{ github.run_attempt }}-${{ github.ref_name }}
runs-on: ubuntu-latest
steps:
- uses: actions/download-artifact@v4
@@ -167,9 +167,19 @@ jobs:
- name: List
run: ls -R dist/
- uses: softprops/action-gh-release@v2
- name: Define release name
id: release_name
run: |
# Extract the version from the tag name, which is in the format
# "rust-v0.1.0".
version="${GITHUB_REF_NAME#rust-v}"
echo "name=${version}" >> $GITHUB_OUTPUT
- name: Create GitHub Release
uses: softprops/action-gh-release@v2
with:
tag_name: ${{ env.RELEASE_TAG }}
name: ${{ steps.release_name.outputs.name }}
tag_name: ${{ github.ref_name }}
files: dist/**
# For now, tag releases as "prerelease" because we are not claiming
# the Rust CLI is stable yet.
@@ -179,5 +189,5 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
tag: ${{ env.RELEASE_TAG }}
tag: ${{ github.ref_name }}
config: .github/dotslash-config.json

18
.vscode/launch.json vendored Normal file

@@ -0,0 +1,18 @@
{
"version": "0.2.0",
"configurations": [
{
"type": "lldb",
"request": "launch",
"name": "Cargo launch",
"cargo": {
"cwd": "${workspaceFolder}/codex-rs",
"args": [
"build",
"--bin=codex-tui"
]
},
"args": []
}
]
}

10
.vscode/settings.json vendored Normal file

@@ -0,0 +1,10 @@
{
"rust-analyzer.checkOnSave": true,
"rust-analyzer.check.command": "clippy",
"rust-analyzer.check.extraArgs": ["--all-features", "--tests"],
"rust-analyzer.rustfmt.extraArgs": ["--config", "imports_granularity=Item"],
"[rust]": {
"editor.defaultFormatter": "rust-lang.rust-analyzer",
"editor.formatOnSave": true,
}
}

AGENTS.md

@@ -3,3 +3,7 @@
In the codex-rs folder where the rust code lives:
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR`. You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
Before creating a pull request with changes to `codex-rs`, run `just fmt` (in the `codex-rs` directory) to format the code, run `just fix` (in the `codex-rs` directory) to fix any linter issues, and ensure the test suite passes by running `cargo test --all-features` in the `codex-rs` directory.
When making individual changes prefer running tests on individual files or projects first.

572
README.md

@@ -1,9 +1,11 @@
<h1 align="center">OpenAI Codex CLI</h1>
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
<p align="center"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>
![Codex demo GIF using: codex "explain this codebase to me"](./.github/demo.gif)
This is the home of the **Codex CLI**, which is a coding agent from OpenAI that runs locally on your computer. If you are looking for the _cloud-based agent_ from OpenAI, **Codex [Web]**, see <https://chatgpt.com/codex>.
<!-- ![Codex demo GIF using: codex "explain this codebase to me"](./.github/demo.gif) -->
---
@@ -14,6 +16,8 @@
- [Experimental technology disclaimer](#experimental-technology-disclaimer)
- [Quickstart](#quickstart)
- [OpenAI API Users](#openai-api-users)
- [OpenAI Plus/Pro Users](#openai-pluspro-users)
- [Why Codex?](#why-codex)
- [Security model & permissions](#security-model--permissions)
- [Platform sandboxing details](#platform-sandboxing-details)
@@ -21,24 +25,17 @@
- [CLI reference](#cli-reference)
- [Memory & project docs](#memory--project-docs)
- [Non-interactive / CI mode](#non-interactive--ci-mode)
- [Model Context Protocol (MCP)](#model-context-protocol-mcp)
- [Tracing / verbose logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [Configuration guide](#configuration-guide)
- [Basic configuration parameters](#basic-configuration-parameters)
- [Custom AI provider configuration](#custom-ai-provider-configuration)
- [History configuration](#history-configuration)
- [Configuration examples](#configuration-examples)
- [Full configuration example](#full-configuration-example)
- [Custom instructions](#custom-instructions)
- [Environment variables setup](#environment-variables-setup)
- [DotSlash](#dotslash)
- [Configuration](#configuration)
- [FAQ](#faq)
- [Zero data retention (ZDR) usage](#zero-data-retention-zdr-usage)
- [Codex open source fund](#codex-open-source-fund)
- [Contributing](#contributing)
- [Development workflow](#development-workflow)
- [Git hooks with Husky](#git-hooks-with-husky)
- [Debugging](#debugging)
- [Writing high-impact code changes](#writing-high-impact-code-changes)
- [Opening a pull request](#opening-a-pull-request)
- [Review process](#review-process)
@@ -47,8 +44,6 @@
- [Contributor license agreement (CLA)](#contributor-license-agreement-cla)
- [Quick fixes](#quick-fixes)
- [Releasing `codex`](#releasing-codex)
- [Alternative build options](#alternative-build-options)
- [Nix flake development](#nix-flake-development)
- [Security & responsible AI](#security--responsible-ai)
- [License](#license)
@@ -71,54 +66,94 @@ Help us improve by filing issues or submitting PRs (see the section below for ho
## Quickstart
Install globally:
Install globally with your preferred package manager:
```shell
npm install -g @openai/codex
npm install -g @openai/codex # Alternatively: `brew install codex`
```
Or go to the [latest GitHub Release](https://github.com/openai/codex/releases/latest) and download the appropriate binary for your platform.
### OpenAI API Users
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`) but we recommend setting for the session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
> [!NOTE]
> This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it for the session.
### OpenAI Plus/Pro Users
If you have a paid OpenAI account, run the following to start the login process:
```
codex login
```
If you complete the process successfully, you should have a `~/.codex/auth.json` file that contains the credentials that Codex will use.
If you encounter problems with the login flow, please comment on <https://github.com/openai/codex/issues/1243>.
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
<summary><strong>Use <code>--profile</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - azure
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - arceeai
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
>
> ```shell
> export <provider>_BASE_URL="https://your-provider-api-base-url"
> ```
Codex also allows you to use other providers that support the OpenAI Chat Completions (or Responses) API.
To do so, you must first define custom [providers](./config.md#model_providers) in `~/.codex/config.toml`. For example, the provider for a standard Ollama setup would be defined as follows:
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
```
The `base_url` will have `/chat/completions` appended to it to build the full URL for the request.
For providers that also require an `Authorization` header of the form `Bearer SECRET`, an `env_key` can be specified, which indicates the environment variable to read to use as the value of `SECRET` when making a request:
```toml
[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
```
Providers that speak the Responses API are also supported by adding `wire_api = "responses"` as part of the definition. Accessing OpenAI models via Azure is an example of such a provider, though it also requires specifying additional `query_params` that need to be appended to the request URL:
```toml
[model_providers.azure]
name = "Azure"
# Make sure you set the appropriate subdomain for this URL.
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY" # Or "OPENAI_API_KEY", whichever you use.
# Newer versions appear to support the responses API, see https://github.com/openai/codex/pull/1321
query_params = { api-version = "2025-04-01-preview" }
wire_api = "responses"
```
Once you have defined a provider you wish to use, you can configure it as your default provider as follows:
```toml
model_provider = "azure"
```
> [!TIP]
> If you find yourself experimenting with a variety of models and providers, then you likely want to invest in defining a _profile_ for each configuration like so:
```toml
[profiles.o3]
model_provider = "azure"
model = "o3"
[profiles.mistral]
model_provider = "ollama"
model = "mistral"
```
This way, you can specify one command-line argument (e.g., `--profile o3`, `--profile mistral`) to override multiple settings together.
</details>
<br />
@@ -136,7 +171,7 @@ codex "explain this codebase to me"
```
```shell
codex --approval-mode full-auto "create the fanciest todo-list app"
codex --full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
@@ -162,41 +197,35 @@ And it's **fully open-source** so you can see and contribute to how it develops!
## Security model & permissions
Codex lets you decide _how much autonomy_ the agent receives and auto-approval policy via the
`--approval-mode` flag (or the interactive onboarding prompt):
Codex lets you decide _how much autonomy_ you want to grant the agent. The following options can be configured independently:
| Mode | What the agent may do without asking | Still requires approval |
| ------------------------- | --------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Suggest** <br>(default) | <li>Read any file in the repo | <li>**All** file writes/patches<li> **Any** arbitrary shell commands (aside from reading files) |
| **Auto Edit** | <li>Read **and** apply-patch writes to files | <li>**All** shell commands |
| **Full Auto** | <li>Read/write files <li> Execute shell commands (network disabled, writes limited to your workdir) | - |
- [`approval_policy`](./codex-rs/config.md#approval_policy) determines when you should be prompted to approve whether Codex can execute a command
- [`sandbox`](./codex-rs/config.md#sandbox) determines the _sandbox policy_ that Codex uses to execute untrusted commands
In **Full Auto** every command is run **network-disabled** and confined to the
current working directory (plus temporary files) for defense-in-depth. Codex
will also show a warning/confirmation if you start in **auto-edit** or
**full-auto** while the directory is _not_ tracked by Git, so you always have a
safety net.
By default, Codex runs with `--ask-for-approval untrusted` and `--sandbox read-only`, which means that:
Coming soon: you'll be able to whitelist specific commands to auto-execute with
the network enabled, once we're confident in additional safeguards.
- The user is prompted to approve every command not on the set of "trusted" commands built into Codex (`cat`, `ls`, etc.)
- Approved commands are run outside of a sandbox because, in this case, user approval implies "trust."
Running Codex with the `--full-auto` convenience flag changes the configuration to `--ask-for-approval on-failure` and `--sandbox workspace-write`, which means that:
- Codex does not initially ask for user approval before running an individual command.
- When it does run a command, however, the command is run under a sandbox in which:
- It can read any file on the system.
- It can only write files under the current directory (or the directory specified via `--cd`).
- Network requests are completely disabled.
- Only if the command exits with a non-zero exit code will it ask the user for approval. If granted, it will re-attempt the command outside of the sandbox. (A common case is when Codex cannot `npm install` a dependency because that requires network access.)
Again, these two options can be configured independently. For example, if you want Codex to perform an "exploration" where you are happy for it to read anything it wants but you never want to be prompted, you could run Codex with `--ask-for-approval never` and `--sandbox read-only`.
### Platform sandboxing details
The hardening mechanism Codex uses depends on your OS:
The mechanism Codex uses to implement the sandbox policy depends on your OS:
- **macOS 12+** - commands are wrapped with **Apple Seatbelt** (`sandbox-exec`).
- **macOS 12+** uses **Apple Seatbelt** and runs commands using `sandbox-exec` with a profile (`-p`) that corresponds to the `--sandbox` that was specified.
- **Linux** uses a combination of Landlock/seccomp APIs to enforce the `sandbox` configuration.
- Everything is placed in a read-only jail except for a small set of
writable roots (`$PWD`, `$TMPDIR`, `~/.codex`, etc.).
- Outbound network is _fully blocked_ by default - even if a child process
tries to `curl` somewhere it will fail.
- **Linux** - there is no sandboxing by default.
We recommend using Docker for sandboxing, where Codex launches itself inside a **minimal
container image** and mounts your repo _read/write_ at the same path. A
custom `iptables`/`ipset` firewall script denies all egress except the
OpenAI API. This gives you deterministic, reproducible runs without needing
root on the host. You can use the [`run_in_container.sh`](./codex-cli/scripts/run_in_container.sh) script to set up the sandbox.
Note that when running Linux in a containerized environment such as Docker, sandboxing may not work if the host/container configuration does not support the necessary Landlock/seccomp APIs. In such cases, we recommend configuring your Docker container so that it provides the sandbox guarantees you are looking for and then running `codex` with `--sandbox danger-full-access` (or, more simply, the `--dangerously-bypass-approvals-and-sandbox` flag) within your container.
---
@@ -205,24 +234,20 @@ The hardening mechanism Codex uses depends on your OS:
| Requirement | Details |
| --------------------------- | --------------------------------------------------------------- |
| Operating systems | macOS 12+, Ubuntu 20.04+/Debian 10+, or Windows 11 **via WSL2** |
| Node.js | **22 or newer** (LTS recommended) |
| Git (optional, recommended) | 2.23+ for built-in PR helpers |
| RAM | 4-GB minimum (8-GB recommended) |
> Never run `sudo npm install -g`; fix npm permissions instead.
---
## CLI reference
| Command | Purpose | Example |
| ------------------------------------ | ----------------------------------- | ------------------------------------ |
| `codex` | Interactive REPL | `codex` |
| `codex "..."` | Initial prompt for interactive REPL | `codex "fix lint errors"` |
| `codex -q "..."` | Non-interactive "quiet mode" | `codex -q --json "explain utils.ts"` |
| `codex completion <bash\|zsh\|fish>` | Print shell completion script | `codex completion bash` |
| Command | Purpose | Example |
| ------------------ | ---------------------------------- | ------------------------------- |
| `codex` | Interactive TUI | `codex` |
| `codex "..."` | Initial prompt for interactive TUI | `codex "fix lint errors"` |
| `codex exec "..."` | Non-interactive "automation mode" | `codex exec "explain utils.ts"` |
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
Key flags: `--model/-m`, `--ask-for-approval/-a`.
---
@@ -234,8 +259,6 @@ You can give Codex extra instructions and guidance using `AGENTS.md` files. Code
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
Disable loading of these files with `--no-project-doc` or the environment variable `CODEX_DISABLE_PROJECT_DOC=1`.
---
## Non-interactive / CI mode
@@ -247,18 +270,37 @@ Run Codex head-less in pipelines. Example GitHub Action step:
run: |
npm install -g @openai/codex
export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
codex -a auto-edit --quiet "update CHANGELOG for next release"
codex exec --full-auto "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
## Model Context Protocol (MCP)
The Codex CLI can be configured to leverage MCP servers by defining an [`mcp_servers`](./codex-rs/config.md#mcp_servers) section in `~/.codex/config.toml`. It is intended to mirror how tools such as Claude and Cursor define `mcpServers` in their respective JSON config files, though the Codex format is slightly different since it uses TOML rather than JSON, e.g.:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }
```
> [!TIP]
> It is somewhat experimental, but the Codex CLI can also be run as an MCP _server_ via `codex mcp`. If you launch it with an MCP client such as `npx @modelcontextprotocol/inspector codex mcp` and send it a `tools/list` request, you will see that there is only one tool, `codex`, that accepts a grab-bag of inputs, including a catch-all `config` map for anything you might want to override. Feel free to play around with it and provide feedback via GitHub issues.
## Tracing / verbose logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
Because Codex is written in Rust, it honors the `RUST_LOG` environment variable to configure its logging behavior.
The TUI defaults to `RUST_LOG=codex_core=info,codex_tui=info` and log messages are written to `~/.codex/log/codex-tui.log`, so you can leave the following running in a separate terminal to monitor log messages as they are written:
```shell
DEBUG=true codex
```

```shell
tail -F ~/.codex/log/codex-tui.log
```
By comparison, the non-interactive mode (`codex exec`) defaults to `RUST_LOG=error`, but messages are printed inline, so there is no need to monitor a separate file.
See the Rust documentation on [`RUST_LOG`](https://docs.rs/env_logger/latest/env_logger/#enabling-logging) for more information on the configuration options.
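To get more detail for a single TUI run, you can raise the level using standard `RUST_LOG` syntax, e.g.:

```shell
# Debug-level logs for one run; the TUI still writes to ~/.codex/log/codex-tui.log.
RUST_LOG=codex_core=debug,codex_tui=debug codex
```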
---
@@ -281,201 +323,78 @@ Below are a few bite-size examples you can copy-paste. Replace the text in quote
## Installation
<details open>
<summary><strong>From npm (Recommended)</strong></summary>
<summary><strong>Install Codex CLI using your preferred package manager.</strong></summary>
From `brew` (recommended, downloads only the binary for your platform):
```bash
npm install -g @openai/codex
# or
yarn global add @openai/codex
# or
bun install -g @openai/codex
# or
pnpm add -g @openai/codex
brew install codex
```
From `npm` (generally more readily available, but downloads binaries for all supported platforms):
```bash
npm i -g @openai/codex
```
Or go to the [latest GitHub Release](https://github.com/openai/codex/releases/latest) and download the appropriate binary for your platform.
Admittedly, each GitHub Release contains many executables, but in practice, you likely want one of these:
- macOS
- Apple Silicon/arm64: `codex-aarch64-apple-darwin.tar.gz`
- x86_64 (older Mac hardware): `codex-x86_64-apple-darwin.tar.gz`
- Linux
- x86_64: `codex-x86_64-unknown-linux-musl.tar.gz`
- arm64: `codex-aarch64-unknown-linux-musl.tar.gz`
Each archive contains a single entry with the platform baked into the name (e.g., `codex-x86_64-unknown-linux-musl`), so you likely want to rename it to `codex` after extracting it.
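As a sketch, installing the Apple Silicon build could look like the following (the `releases/latest/download/` URL shape is GitHub's standard redirect; swap in the archive for your platform):

```shell
# Download, extract, and rename the platform binary (Apple Silicon shown).
curl -LO https://github.com/openai/codex/releases/latest/download/codex-aarch64-apple-darwin.tar.gz
tar -xzf codex-aarch64-apple-darwin.tar.gz
mv codex-aarch64-apple-darwin codex
chmod +x codex && ./codex --version
```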
### DotSlash
The GitHub Release also contains a [DotSlash](https://dotslash-cli.com/) file for the Codex CLI named `codex`. Using a DotSlash file makes it possible to make a lightweight commit to source control to ensure all contributors use the same version of an executable, regardless of what platform they use for development.
</details>
<details>
<summary><strong>Build from source</strong></summary>
```bash
# Clone the repository and navigate to the CLI package
# Clone the repository and navigate to the root of the Cargo workspace.
git clone https://github.com/openai/codex.git
cd codex/codex-cli
cd codex/codex-rs
# Enable corepack
corepack enable
# Install the Rust toolchain, if necessary.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup component add rustfmt
rustup component add clippy
# Install dependencies and build
pnpm install
pnpm build
# Build Codex.
cargo build
# Linux-only: download prebuilt sandboxing binaries (requires gh and zstd).
./scripts/install_native_deps.sh
# Launch the TUI with a sample prompt.
cargo run --bin codex -- "explain this codebase to me"
# Get the usage and the options
node ./dist/cli.js --help
# After making changes, ensure the code is clean.
cargo fmt -- --config imports_granularity=Item
cargo clippy --tests
# Run the locally-built CLI directly
node ./dist/cli.js
# Or link the command globally for convenience
pnpm link
# Run the tests.
cargo test
```
</details>
---
## Configuration guide
## Configuration
Codex configuration files can be placed in the `~/.codex/` directory, supporting both YAML and JSON formats.
Codex supports a rich set of configuration options documented in [`codex-rs/config.md`](./codex-rs/config.md).
### Basic configuration parameters
By default, Codex loads its configuration from `~/.codex/config.toml`.
| Parameter | Type | Default | Description | Available Options |
| ------------------- | ------- | ---------- | -------------------------------- | ---------------------------------------------------------------------------------------------- |
| `model` | string | `o4-mini` | AI model to use | Any model name supporting OpenAI API |
| `approvalMode` | string | `suggest` | AI assistant's permission mode | `suggest` (suggestions only)<br>`auto-edit` (automatic edits)<br>`full-auto` (fully automatic) |
| `fullAutoErrorMode` | string | `ask-user` | Error handling in full-auto mode | `ask-user` (prompt for user input)<br>`ignore-and-continue` (ignore and proceed) |
| `notify` | boolean | `true` | Enable desktop notifications | `true`/`false` |
### Custom AI provider configuration
In the `providers` object, you can configure multiple AI service providers. Each provider requires the following parameters:
| Parameter | Type | Description | Example |
| --------- | ------ | --------------------------------------- | ----------------------------- |
| `name` | string | Display name of the provider | `"OpenAI"` |
| `baseURL` | string | API service URL | `"https://api.openai.com/v1"` |
| `envKey` | string | Environment variable name (for API key) | `"OPENAI_API_KEY"` |
### History configuration
In the `history` object, you can configure conversation history settings:
| Parameter | Type | Description | Example Value |
| ------------------- | ------- | ------------------------------------------------------ | ------------- |
| `maxSize` | number | Maximum number of history entries to save | `1000` |
| `saveHistory` | boolean | Whether to save history | `true` |
| `sensitivePatterns` | array | Patterns of sensitive information to filter in history | `[]` |
### Configuration examples
1. YAML format (save as `~/.codex/config.yaml`):
```yaml
model: o4-mini
approvalMode: suggest
fullAutoErrorMode: ask-user
notify: true
```
2. JSON format (save as `~/.codex/config.json`):
```json
{
"model": "o4-mini",
"approvalMode": "suggest",
"fullAutoErrorMode": "ask-user",
"notify": true
}
```
### Full configuration example
Below is a comprehensive example of `config.json` with multiple custom providers:
```json
{
"model": "o4-mini",
"provider": "openai",
"providers": {
"openai": {
"name": "OpenAI",
"baseURL": "https://api.openai.com/v1",
"envKey": "OPENAI_API_KEY"
},
"azure": {
"name": "AzureOpenAI",
"baseURL": "https://YOUR_PROJECT_NAME.openai.azure.com/openai",
"envKey": "AZURE_OPENAI_API_KEY"
},
"openrouter": {
"name": "OpenRouter",
"baseURL": "https://openrouter.ai/api/v1",
"envKey": "OPENROUTER_API_KEY"
},
"gemini": {
"name": "Gemini",
"baseURL": "https://generativelanguage.googleapis.com/v1beta/openai",
"envKey": "GEMINI_API_KEY"
},
"ollama": {
"name": "Ollama",
"baseURL": "http://localhost:11434/v1",
"envKey": "OLLAMA_API_KEY"
},
"mistral": {
"name": "Mistral",
"baseURL": "https://api.mistral.ai/v1",
"envKey": "MISTRAL_API_KEY"
},
"deepseek": {
"name": "DeepSeek",
"baseURL": "https://api.deepseek.com",
"envKey": "DEEPSEEK_API_KEY"
},
"xai": {
"name": "xAI",
"baseURL": "https://api.x.ai/v1",
"envKey": "XAI_API_KEY"
},
"groq": {
"name": "Groq",
"baseURL": "https://api.groq.com/openai/v1",
"envKey": "GROQ_API_KEY"
},
"arceeai": {
"name": "ArceeAI",
"baseURL": "https://conductor.arcee.ai/v1",
"envKey": "ARCEEAI_API_KEY"
}
},
"history": {
"maxSize": 1000,
"saveHistory": true,
"sensitivePatterns": []
}
}
```
### Custom instructions
You can create a `~/.codex/AGENTS.md` file to define custom guidance for the agent:
```markdown
- Always respond with emojis
- Only use git commands when explicitly requested
```
### Environment variables setup
For each AI provider, you need to set the corresponding API key in your environment variables. For example:
```bash
# OpenAI
export OPENAI_API_KEY="your-api-key-here"
# Azure OpenAI
export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
export AZURE_OPENAI_API_VERSION="2025-03-01-preview" (Optional)
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"
# Similarly for other providers
```
However, `--config` can be used to set/override ad-hoc config values for individual invocations of `codex`.
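For example (values are parsed as TOML where possible; these keys are illustrative — see the config docs for the full list):

```shell
# Override settings for this invocation only; ~/.codex/config.toml is untouched.
codex --config model=o3 --config disable_response_storage=true "explain utils.ts"
```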
---
@@ -524,7 +443,13 @@ Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)]
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
Ensure you are running `codex` with `--config disable_response_storage=true` or add this line to `~/.codex/config.toml` to avoid specifying the command line option each time:
```toml
disable_response_storage = true
```
See [the configuration documentation on `disable_response_storage`](./codex-rs/config.md#disable_response_storage) for details.
---
@@ -549,51 +474,7 @@ More broadly we welcome contributions - whether you are opening your very first
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./codex-cli/HUSKY.md).
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
  - In VS Code, choose **Debug: Attach to Node Process** from the command palette and select the option with debug port `9229` (likely the first option).
  - Go to <chrome://inspect> in Chrome, find **localhost:9229**, and click **trace**.
- Following the [development setup](#development-workflow) instructions above, ensure your change is free of lint warnings and test failures.
### Writing high-impact code changes
@@ -605,7 +486,7 @@ To debug the CLI with a visual debugger, do the following in the `codex-cli` fol
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Run **all** checks locally (`cargo test && cargo clippy --tests && cargo fmt -- --config imports_granularity=Item`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
@@ -652,73 +533,22 @@ The **DCO check** blocks merges until every commit in the PR carries the footer
### Releasing `codex`
To publish a new version of the CLI you first need to stage the npm package. A
helper script in `codex-cli/scripts/` does all the heavy lifting. Inside the
`codex-cli` folder run:

```bash
# Classic, JS implementation that includes small, native binaries for Linux sandboxing.
pnpm stage-release

# Optionally specify the temp directory to reuse between runs.
RELEASE_DIR=$(mktemp -d)
pnpm stage-release --tmp "$RELEASE_DIR"

# "Fat" package that additionally bundles the native Rust CLI binaries for
# Linux. End-users can then opt-in at runtime by setting CODEX_RUST=1.
pnpm stage-release --native
```

Go to the folder where the release is staged and verify that it works as intended. If so, run the following from the temp folder:

```
cd "$RELEASE_DIR"
npm publish
```

_For admins only._

Make sure you are on `main` and have no local changes. Then run:

```shell
VERSION=0.2.0 # Can also be 0.2.0-alpha.1 or any valid Rust version.
./codex-rs/scripts/create_github_release.sh "$VERSION"
```

This will make a local commit on top of `main` with `version` set to `$VERSION` in `codex-rs/Cargo.toml` (note that on `main`, we leave the version as `version = "0.0.0"`).

This will push the commit using the tag `rust-v${VERSION}`, which in turn kicks off [the release workflow](.github/workflows/rust-release.yml). This will create a new GitHub Release named `$VERSION`.

If everything looks good in the generated GitHub Release, uncheck the **pre-release** box so it is the latest release.

### Alternative build options
#### Nix flake development
Prerequisite: Nix >= 2.4 with flakes enabled (`experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`).
Enter a Nix development shell:
```bash
# Use either one of the commands according to which implementation you want to work with
nix develop .#codex-cli # For entering codex-cli specific shell
nix develop .#codex-rs # For entering codex-rs specific shell
```
This shell includes Node.js, installs dependencies, builds the CLI, and provides a `codex` command alias.
Build and run the CLI directly:
```bash
# Use either one of the commands according to which implementation you want to work with
nix build .#codex-cli # For building codex-cli
nix build .#codex-rs # For building codex-rs
./result/bin/codex --help
```
Run the CLI via the flake app:
```bash
# Use either one of the commands according to which implementation you want to work with
nix run .#codex-cli # For running codex-cli
nix run .#codex-rs # For running codex-rs
```
Use direnv with flakes
If you have direnv installed, you can use the following `.envrc` to automatically enter the Nix shell when you `cd` into the project directory:
```bash
cd codex-rs
echo "use flake ../flake.nix#codex-rs" >> .envrc && direnv allow
cd codex-cli
echo "use flake ../flake.nix#codex-cli" >> .envrc && direnv allow
```
Create a PR to update [`Formula/c/codex.rb`](https://github.com/Homebrew/homebrew-core/blob/main/Formula/c/codex.rb) on Homebrew.
---
View File
@@ -1,3 +1,7 @@
# Added by ./scripts/install_native_deps.sh
/bin/codex-aarch64-apple-darwin
/bin/codex-aarch64-unknown-linux-musl
/bin/codex-linux-sandbox-arm64
/bin/codex-linux-sandbox-x64
/bin/codex-x86_64-apple-darwin
/bin/codex-x86_64-unknown-linux-musl
View File
@@ -1,4 +1,4 @@
FROM node:20-slim
FROM node:24-slim
ARG TZ
ENV TZ="$TZ"
736
codex-cli/README.md Normal file
View File
@@ -0,0 +1,736 @@
<h1 align="center">OpenAI Codex CLI</h1>
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
> [!IMPORTANT]
> This is the documentation for the _legacy_ TypeScript implementation of the Codex CLI. It has been superseded by the _Rust_ implementation. See the [README in the root of the Codex repository](https://github.com/openai/codex/blob/main/README.md) for details.
![Codex demo GIF using: codex "explain this codebase to me"](../.github/demo.gif)
---
<details>
<summary><strong>Table of contents</strong></summary>
<!-- Begin ToC -->
- [Experimental technology disclaimer](#experimental-technology-disclaimer)
- [Quickstart](#quickstart)
- [Why Codex?](#why-codex)
- [Security model & permissions](#security-model--permissions)
- [Platform sandboxing details](#platform-sandboxing-details)
- [System requirements](#system-requirements)
- [CLI reference](#cli-reference)
- [Memory & project docs](#memory--project-docs)
- [Non-interactive / CI mode](#non-interactive--ci-mode)
- [Tracing / verbose logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [Configuration guide](#configuration-guide)
- [Basic configuration parameters](#basic-configuration-parameters)
- [Custom AI provider configuration](#custom-ai-provider-configuration)
- [History configuration](#history-configuration)
- [Configuration examples](#configuration-examples)
- [Full configuration example](#full-configuration-example)
- [Custom instructions](#custom-instructions)
- [Environment variables setup](#environment-variables-setup)
- [FAQ](#faq)
- [Zero data retention (ZDR) usage](#zero-data-retention-zdr-usage)
- [Codex open source fund](#codex-open-source-fund)
- [Contributing](#contributing)
- [Development workflow](#development-workflow)
- [Git hooks with Husky](#git-hooks-with-husky)
- [Debugging](#debugging)
- [Writing high-impact code changes](#writing-high-impact-code-changes)
- [Opening a pull request](#opening-a-pull-request)
- [Review process](#review-process)
- [Community values](#community-values)
- [Getting help](#getting-help)
- [Contributor license agreement (CLA)](#contributor-license-agreement-cla)
- [Quick fixes](#quick-fixes)
- [Releasing `codex`](#releasing-codex)
- [Alternative build options](#alternative-build-options)
- [Nix flake development](#nix-flake-development)
- [Security & responsible AI](#security--responsible-ai)
- [License](#license)
<!-- End ToC -->
</details>
---
## Experimental technology disclaimer
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
## Quickstart
Install globally:
```shell
npm install -g @openai/codex
```
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it per session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - azure
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - arceeai
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
>
> ```shell
> export <provider>_BASE_URL="https://your-provider-api-base-url"
> ```
</details>
<br />
Run interactively:
```shell
codex
```
Or, run with a prompt as input (and optionally in `Full Auto` mode):
```shell
codex "explain this codebase to me"
```
```shell
codex --approval-mode full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
missing dependencies, and show you the live result. Approve the changes and
they'll be committed to your working directory.
---
## Why Codex?
Codex CLI is built for developers who already **live in the terminal** and want
ChatGPT-level reasoning **plus** the power to actually run code, manipulate
files, and iterate - all under version control. In short, it's _chat-driven
development_ that understands and executes your repo.
- **Zero setup** - bring your OpenAI API key and it just works!
- **Full auto-approval, while safe + secure** by running network-disabled and directory-sandboxed
- **Multimodal** - pass in screenshots or diagrams to implement features ✨
And it's **fully open-source** so you can see and contribute to how it develops!
---
## Security model & permissions
Codex lets you decide _how much autonomy_ the agent receives, and which auto-approval policy applies, via the `--approval-mode` flag (or the interactive onboarding prompt):
| Mode | What the agent may do without asking | Still requires approval |
| ------------------------- | --------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Suggest** <br>(default) | <li>Read any file in the repo | <li>**All** file writes/patches<li> **Any** arbitrary shell commands (aside from reading files) |
| **Auto Edit** | <li>Read **and** apply-patch writes to files | <li>**All** shell commands |
| **Full Auto** | <li>Read/write files <li> Execute shell commands (network disabled, writes limited to your workdir) | - |
In **Full Auto** every command is run **network-disabled** and confined to the
current working directory (plus temporary files) for defense-in-depth. Codex
will also show a warning/confirmation if you start in **auto-edit** or
**full-auto** while the directory is _not_ tracked by Git, so you always have a
safety net.
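For instance, to grant file-edit autonomy while still gating every shell command:

```shell
# auto-edit: apply patches automatically; still ask before shell commands.
codex --approval-mode auto-edit "write unit tests for utils/date.ts"
```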
Coming soon: you'll be able to whitelist specific commands to auto-execute with
the network enabled, once we're confident in additional safeguards.
### Platform sandboxing details
The hardening mechanism Codex uses depends on your OS:
- **macOS 12+** - commands are wrapped with **Apple Seatbelt** (`sandbox-exec`).
- Everything is placed in a read-only jail except for a small set of
writable roots (`$PWD`, `$TMPDIR`, `~/.codex`, etc.).
- Outbound network is _fully blocked_ by default - even if a child process
tries to `curl` somewhere it will fail.
- **Linux** - there is no sandboxing by default.
We recommend using Docker for sandboxing, where Codex launches itself inside a **minimal
container image** and mounts your repo _read/write_ at the same path. A
custom `iptables`/`ipset` firewall script denies all egress except the
OpenAI API. This gives you deterministic, reproducible runs without needing
root on the host. You can use the [`run_in_container.sh`](../codex-cli/scripts/run_in_container.sh) script to set up the sandbox.
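A sketch of the invocation from the repository root (check the script header for the exact flags it supports):

```shell
# Run Codex inside the firewalled container image; the prompt is forwarded.
./codex-cli/scripts/run_in_container.sh "explain this codebase to me"
```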
---
## System requirements
| Requirement | Details |
| --------------------------- | --------------------------------------------------------------- |
| Operating systems | macOS 12+, Ubuntu 20.04+/Debian 10+, or Windows 11 **via WSL2** |
| Node.js | **22 or newer** (LTS recommended) |
| Git (optional, recommended) | 2.23+ for built-in PR helpers |
| RAM | 4-GB minimum (8-GB recommended) |
> Never run `sudo npm install -g`; fix npm permissions instead.
---
## CLI reference
| Command | Purpose | Example |
| ------------------------------------ | ----------------------------------- | ------------------------------------ |
| `codex` | Interactive REPL | `codex` |
| `codex "..."` | Initial prompt for interactive REPL | `codex "fix lint errors"` |
| `codex -q "..."` | Non-interactive "quiet mode" | `codex -q --json "explain utils.ts"` |
| `codex completion <bash\|zsh\|fish>` | Print shell completion script | `codex completion bash` |
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
---
## Memory & project docs
You can give Codex extra instructions and guidance using `AGENTS.md` files. Codex looks for `AGENTS.md` files in the following places, and merges them top-down:
1. `~/.codex/AGENTS.md` - personal global guidance
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
Disable loading of these files with `--no-project-doc` or the environment variable `CODEX_DISABLE_PROJECT_DOC=1`.
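Either form works for a one-off run; for example:

```shell
# Two equivalent ways to skip AGENTS.md files for a single invocation.
codex --no-project-doc "summarize this repo"
CODEX_DISABLE_PROJECT_DOC=1 codex "summarize this repo"
```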
---
## Non-interactive / CI mode
Run Codex headless in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
run: |
npm install -g @openai/codex
export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
codex -a auto-edit --quiet "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
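For example, combining quiet mode with JSON output in a script:

```shell
# Suppress interactive UI noise and emit machine-readable output.
export CODEX_QUIET_MODE=1
codex -q --json "explain utils.ts"
```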
## Tracing / verbose logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
```shell
DEBUG=true codex
```
---
## Recipes
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
| ✨ | What you type | What happens |
| --- | ------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| 1 | `codex "Refactor the Dashboard component to React Hooks"` | Codex rewrites the class component, runs `npm test`, and shows the diff. |
| 2 | `codex "Generate SQL migrations for adding a users table"` | Infers your ORM, creates migration files, and runs them in a sandboxed DB. |
| 3 | `codex "Write unit tests for utils/date.ts"` | Generates tests, executes them, and iterates until they pass. |
| 4 | `codex "Bulk-rename *.jpeg -> *.jpg with git mv"` | Safely renames files and updates imports/usages. |
| 5 | `codex "Explain what this regex does: ^(?=.*[A-Z]).{8,}$"` | Outputs a step-by-step human explanation. |
| 6 | `codex "Carefully review this repo, and propose 3 high impact well-scoped PRs"` | Suggests impactful PRs in the current codebase. |
| 7 | `codex "Look for vulnerabilities and create a security review report"` | Finds and explains security bugs. |
---
## Installation
<details open>
<summary><strong>From npm (Recommended)</strong></summary>
```bash
npm install -g @openai/codex
# or
yarn global add @openai/codex
# or
bun install -g @openai/codex
# or
pnpm add -g @openai/codex
```
</details>
<details>
<summary><strong>Build from source</strong></summary>
```bash
# Clone the repository and navigate to the CLI package
git clone https://github.com/openai/codex.git
cd codex/codex-cli
# Enable corepack
corepack enable
# Install dependencies and build
pnpm install
pnpm build
# Linux-only: download prebuilt sandboxing binaries (requires gh and zstd).
./scripts/install_native_deps.sh
# Get the usage and the options
node ./dist/cli.js --help
# Run the locally-built CLI directly
node ./dist/cli.js
# Or link the command globally for convenience
pnpm link
```
</details>
---
## Configuration guide
Codex configuration files can be placed in the `~/.codex/` directory, supporting both YAML and JSON formats.
### Basic configuration parameters
| Parameter | Type | Default | Description | Available Options |
| ------------------- | ------- | ---------- | -------------------------------- | ---------------------------------------------------------------------------------------------- |
| `model` | string | `o4-mini` | AI model to use | Any model name supporting OpenAI API |
| `approvalMode` | string | `suggest` | AI assistant's permission mode | `suggest` (suggestions only)<br>`auto-edit` (automatic edits)<br>`full-auto` (fully automatic) |
| `fullAutoErrorMode` | string | `ask-user` | Error handling in full-auto mode | `ask-user` (prompt for user input)<br>`ignore-and-continue` (ignore and proceed) |
| `notify` | boolean | `true` | Enable desktop notifications | `true`/`false` |
### Custom AI provider configuration
In the `providers` object, you can configure multiple AI service providers. Each provider requires the following parameters:
| Parameter | Type | Description | Example |
| --------- | ------ | --------------------------------------- | ----------------------------- |
| `name` | string | Display name of the provider | `"OpenAI"` |
| `baseURL` | string | API service URL | `"https://api.openai.com/v1"` |
| `envKey` | string | Environment variable name (for API key) | `"OPENAI_API_KEY"` |
### History configuration
In the `history` object, you can configure conversation history settings:
| Parameter | Type | Description | Example Value |
| ------------------- | ------- | ------------------------------------------------------ | ------------- |
| `maxSize` | number | Maximum number of history entries to save | `1000` |
| `saveHistory` | boolean | Whether to save history | `true` |
| `sensitivePatterns` | array | Patterns of sensitive information to filter in history | `[]` |
### Configuration examples
1. YAML format (save as `~/.codex/config.yaml`):
```yaml
model: o4-mini
approvalMode: suggest
fullAutoErrorMode: ask-user
notify: true
```
2. JSON format (save as `~/.codex/config.json`):
```json
{
"model": "o4-mini",
"approvalMode": "suggest",
"fullAutoErrorMode": "ask-user",
"notify": true
}
```
### Full configuration example
Below is a comprehensive example of `config.json` with multiple custom providers:
```json
{
"model": "o4-mini",
"provider": "openai",
"providers": {
"openai": {
"name": "OpenAI",
"baseURL": "https://api.openai.com/v1",
"envKey": "OPENAI_API_KEY"
},
"azure": {
"name": "AzureOpenAI",
"baseURL": "https://YOUR_PROJECT_NAME.openai.azure.com/openai",
"envKey": "AZURE_OPENAI_API_KEY"
},
"openrouter": {
"name": "OpenRouter",
"baseURL": "https://openrouter.ai/api/v1",
"envKey": "OPENROUTER_API_KEY"
},
"gemini": {
"name": "Gemini",
"baseURL": "https://generativelanguage.googleapis.com/v1beta/openai",
"envKey": "GEMINI_API_KEY"
},
"ollama": {
"name": "Ollama",
"baseURL": "http://localhost:11434/v1",
"envKey": "OLLAMA_API_KEY"
},
"mistral": {
"name": "Mistral",
"baseURL": "https://api.mistral.ai/v1",
"envKey": "MISTRAL_API_KEY"
},
"deepseek": {
"name": "DeepSeek",
"baseURL": "https://api.deepseek.com",
"envKey": "DEEPSEEK_API_KEY"
},
"xai": {
"name": "xAI",
"baseURL": "https://api.x.ai/v1",
"envKey": "XAI_API_KEY"
},
"groq": {
"name": "Groq",
"baseURL": "https://api.groq.com/openai/v1",
"envKey": "GROQ_API_KEY"
},
"arceeai": {
"name": "ArceeAI",
"baseURL": "https://conductor.arcee.ai/v1",
"envKey": "ARCEEAI_API_KEY"
}
},
"history": {
"maxSize": 1000,
"saveHistory": true,
"sensitivePatterns": []
}
}
```
### Custom instructions
You can create a `~/.codex/AGENTS.md` file to define custom guidance for the agent:
```markdown
- Always respond with emojis
- Only use git commands when explicitly requested
```
### Environment variables setup
For each AI provider, you need to set the corresponding API key in your environment variables. For example:
```bash
# OpenAI
export OPENAI_API_KEY="your-api-key-here"
# Azure OpenAI
export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
export AZURE_OPENAI_API_VERSION="2025-04-01-preview" (Optional)
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"
# Similarly for other providers
```
---
## FAQ
<details>
<summary>OpenAI released a model called Codex in 2021 - is this related?</summary>
In 2021, OpenAI released Codex, an AI system designed to generate code from natural language prompts. That original Codex model was deprecated as of March 2023 and is separate from the CLI tool.
</details>
<details>
<summary>Which models are supported?</summary>
Any model available with the [Responses API](https://platform.openai.com/docs/api-reference/responses). The default is `o4-mini`, but pass `--model gpt-4.1` or set `model: gpt-4.1` in your config file to override.
</details>
<details>
<summary>Why does <code>o3</code> or <code>o4-mini</code> not work for me?</summary>
It's possible that your [API account needs to be verified](https://help.openai.com/en/articles/10910291-api-organization-verification) in order to start streaming responses and seeing chain of thought summaries from the API. If you're still running into issues, please let us know!
</details>
<details>
<summary>How do I stop Codex from editing my files?</summary>
Codex runs model-generated commands in a sandbox. If a proposed command or file change doesn't look right, you can simply type **n** to deny the command or give the model feedback.
</details>
<details>
<summary>Does it work on Windows?</summary>
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex has been tested on macOS and Linux with Node 22.
</details>
---
## Zero data retention (ZDR) usage
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
---
## Codex open source fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Grants are awarded up to **$25,000** in API credits.
- Applications are reviewed **on a rolling basis**.
**Interested? [Apply here](https://openai.com/form/codex-open-source-fund/).**
---
## Contributing
This project is under active development and the code will likely change pretty significantly. We'll update this message once that's complete!
More broadly we welcome contributions - whether you are opening your very first pull request or you're a seasoned maintainer. At the same time we care about reliability and long-term maintainability, so the bar for merging code is intentionally **high**. The guidelines below spell out what "high-quality" means in practice and should make the whole process transparent and friendly.
### Development workflow
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./HUSKY.md).
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
  - In VS Code, choose **Debug: Attach to Node Process** from the command palette and select the option with debug port `9229` (likely the first option).
  - Go to <chrome://inspect> in Chrome, find **localhost:9229**, and click **trace**.
### Writing high-impact code changes
1. **Start with an issue.** Open a new one or comment on an existing discussion so we can agree on the solution before code is written.
2. **Add or update tests.** Every new feature or bug-fix should come with test coverage that fails before your change and passes afterwards. 100% coverage is not required, but aim for meaningful assertions.
3. **Document behaviour.** If your change affects user-facing behaviour, update the README, inline help (`codex --help`), or relevant example projects.
4. **Keep commits atomic.** Each commit should compile and the tests should pass. This makes reviews and potential rollbacks easier.
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
### Review process
1. One maintainer will be assigned as a primary reviewer.
2. We may ask for changes - please do not take this personally. We value the work, we just also value consistency and long-term maintainability.
3. When there is consensus that the PR meets the bar, a maintainer will squash-and-merge.
### Community values
- **Be kind and inclusive.** Treat others with respect; we follow the [Contributor Covenant](https://www.contributor-covenant.org/).
- **Assume good intent.** Written communication is hard - err on the side of generosity.
- **Teach & learn.** If you spot something confusing, open an issue or PR with improvements.
### Getting help
If you run into problems setting up the project, would like feedback on an idea, or just want to say _hi_ - please open a Discussion or jump into the relevant issue. We are happy to help.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor license agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
1. Open your pull request.
2. Paste the following comment (or reply `recheck` if you've signed before):
```text
I have read the CLA Document and I hereby sign the CLA
```
3. The CLA-Assistant bot records your signature in the repo and marks the status check as passed.
No special Git commands, email attachments, or commit footers required.
#### Quick fixes
| Scenario | Command |
| ----------------- | ------------------------------------------------ |
| Amend last commit | `git commit --amend -s --no-edit && git push -f` |
The **DCO check** blocks merges until every commit in the PR carries the footer (with squash this is just the one).
### Releasing `codex`
To publish a new version of the CLI you first need to stage the npm package. A
helper script in `codex-cli/scripts/` does all the heavy lifting. Inside the
`codex-cli` folder run:
```bash
# Classic, JS implementation that includes small, native binaries for Linux sandboxing.
pnpm stage-release
# Optionally specify the temp directory to reuse between runs.
RELEASE_DIR=$(mktemp -d)
pnpm stage-release --tmp "$RELEASE_DIR"
# "Fat" package that additionally bundles the native Rust CLI binaries for
# Linux. End-users can then opt-in at runtime by setting CODEX_RUST=1.
pnpm stage-release --native
```
Go to the folder where the release is staged and verify that it works as intended. If so, run the following from the temp folder:
```
cd "$RELEASE_DIR"
npm publish
```
### Alternative build options
#### Nix flake development
Prerequisite: Nix >= 2.4 with flakes enabled (`experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`).
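If flakes are not yet enabled, a one-time setup writing that line to the stated path should suffice:

```shell
mkdir -p ~/.config/nix
echo "experimental-features = nix-command flakes" >> ~/.config/nix/nix.conf
```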
Enter a Nix development shell:
```bash
# Use either one of the commands according to which implementation you want to work with
nix develop .#codex-cli # For entering codex-cli specific shell
nix develop .#codex-rs # For entering codex-rs specific shell
```
This shell includes Node.js, installs dependencies, builds the CLI, and provides a `codex` command alias.
Build and run the CLI directly:
```bash
# Use either one of the commands according to which implementation you want to work with
nix build .#codex-cli # For building codex-cli
nix build .#codex-rs # For building codex-rs
./result/bin/codex --help
```
Run the CLI via the flake app:
```bash
# Use either one of the commands according to which implementation you want to work with
nix run .#codex-cli # For running codex-cli
nix run .#codex-rs # For running codex-rs
```
Use direnv with flakes
If you have direnv installed, you can use the following `.envrc` to automatically enter the Nix shell when you `cd` into the project directory:
```bash
cd codex-rs
echo "use flake ../flake.nix#codex-rs" >> .envrc && direnv allow
cd codex-cli
echo "use flake ../flake.nix#codex-cli" >> .envrc && direnv allow
```
---
## Security & responsible AI
Have you discovered a vulnerability or have concerns about model output? Please e-mail **security@openai.com** and we will respond promptly.
---
## License
This repository is licensed under the [Apache-2.0 License](LICENSE).
View File
@@ -15,7 +15,6 @@
* current platform / architecture, an error is thrown.
*/
import { spawnSync } from "child_process";
import fs from "fs";
import path from "path";
import { fileURLToPath, pathToFileURL } from "url";
@@ -35,18 +34,19 @@ const wantsNative = fs.existsSync(path.join(__dirname, "use-native")) ||
: false);
// Try native binary if requested.
if (wantsNative) {
if (wantsNative && process.platform !== 'win32') {
const { platform, arch } = process;
let targetTriple = null;
switch (platform) {
case "linux":
case "android":
switch (arch) {
case "x64":
targetTriple = "x86_64-unknown-linux-musl";
break;
case "arm64":
targetTriple = "aarch64-unknown-linux-gnu";
targetTriple = "aarch64-unknown-linux-musl";
break;
default:
break;
@@ -73,22 +73,76 @@ if (wantsNative) {
}
const binaryPath = path.join(__dirname, "..", "bin", `codex-${targetTriple}`);
const result = spawnSync(binaryPath, process.argv.slice(2), {
// Use an asynchronous spawn instead of spawnSync so that Node is able to
// respond to signals (e.g. Ctrl-C / SIGINT) while the native binary is
// executing. This allows us to forward those signals to the child process
// and guarantees that when either the child terminates or the parent
// receives a fatal signal, both processes exit in a predictable manner.
const { spawn } = await import("child_process");
const child = spawn(binaryPath, process.argv.slice(2), {
stdio: "inherit",
});
const exitCode = typeof result.status === "number" ? result.status : 1;
process.exit(exitCode);
}
child.on("error", (err) => {
// Typically triggered when the binary is missing or not executable.
// Re-throwing here will terminate the parent with a non-zero exit code
// while still printing a helpful stack trace.
// eslint-disable-next-line no-console
console.error(err);
process.exit(1);
});
// Fallback: execute the original JavaScript CLI.
// Forward common termination signals to the child so that it shuts down
// gracefully. In the handler we temporarily disable the default behavior of
// exiting immediately; once the child has been signaled we simply wait for
// its exit event which will in turn terminate the parent (see below).
const forwardSignal = (signal) => {
if (child.killed) {
return;
}
try {
child.kill(signal);
} catch {
/* ignore */
}
};
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;
["SIGINT", "SIGTERM", "SIGHUP"].forEach((sig) => {
process.on(sig, () => forwardSignal(sig));
});
// Load and execute the CLI
(async () => {
// When the child exits, mirror its termination reason in the parent so that
// shell scripts and other tooling observe the correct exit status.
// Wrap the lifetime of the child process in a Promise so that we can await
// its termination in a structured way. The Promise resolves with an object
// describing how the child exited: either via exit code or due to a signal.
const childResult = await new Promise((resolve) => {
child.on("exit", (code, signal) => {
if (signal) {
resolve({ type: "signal", signal });
} else {
resolve({ type: "code", exitCode: code ?? 1 });
}
});
});
if (childResult.type === "signal") {
// Re-emit the same signal so that the parent terminates with the expected
// semantics (this also sets the correct exit code of 128 + n).
process.kill(process.pid, childResult.signal);
} else {
process.exit(childResult.exitCode);
}
} else {
// Fallback: execute the original JavaScript CLI.
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;
// Load and execute the CLI
try {
await import(cliUrl);
} catch (err) {
@@ -96,4 +150,4 @@ const cliUrl = pathToFileURL(cliPath).href;
console.error(err);
process.exit(1);
}
})();
}
View File
@@ -84,6 +84,6 @@
},
"repository": {
"type": "git",
"url": "https://github.com/openai/codex"
"url": "git+https://github.com/openai/codex.git"
}
}
View File
@@ -0,0 +1,9 @@
# npm releases
To build the 0.2.x or later version of the npm module, which runs the Rust version of the CLI, run the following:
```bash
./codex-cli/scripts/stage_rust_release.py --release-version 0.6.0
```
View File
@@ -8,7 +8,7 @@
# the native implementation when users set CODEX_RUST=1.
#
# Usage
# install_native_deps.sh [RELEASE_ROOT] [--full-native]
# install_native_deps.sh [--full-native] [--workflow-url URL] [CODEX_CLI_ROOT]
#
# The optional RELEASE_ROOT is the path that contains package.json. Omitting
# it installs the binaries into the repository's own bin/ folder to support
@@ -20,32 +20,43 @@ set -euo pipefail
# Parse arguments
# ------------------
DEST_DIR=""
CODEX_CLI_ROOT=""
INCLUDE_RUST=0
for arg in "$@"; do
case "$arg" in
# Until we start publishing stable GitHub releases, we have to grab the binaries
# from the GitHub Action that created them. Update the URL below to point to the
# appropriate workflow run:
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/15981617627"
while [[ $# -gt 0 ]]; do
case "$1" in
--full-native)
INCLUDE_RUST=1
;;
--workflow-url)
shift || { echo "--workflow-url requires an argument"; exit 1; }
if [ -n "$1" ]; then
WORKFLOW_URL="$1"
fi
;;
*)
if [[ -z "$DEST_DIR" ]]; then
DEST_DIR="$arg"
if [[ -z "$CODEX_CLI_ROOT" ]]; then
CODEX_CLI_ROOT="$1"
else
echo "Unexpected argument: $arg" >&2
echo "Unexpected argument: $1" >&2
exit 1
fi
;;
esac
shift
done
# ----------------------------------------------------------------------------
# Determine where the binaries should be installed.
# ----------------------------------------------------------------------------
if [[ $# -gt 0 ]]; then
if [ -n "$CODEX_CLI_ROOT" ]; then
# The caller supplied a release root directory.
CODEX_CLI_ROOT="$1"
BIN_DIR="$CODEX_CLI_ROOT/bin"
else
# No argument; fall back to the repo's own bin directory.
@@ -62,10 +73,6 @@ mkdir -p "$BIN_DIR"
# Download and decompress the artifacts from the GitHub Actions workflow.
# ----------------------------------------------------------------------------
# Until we start publishing stable GitHub releases, we have to grab the binaries
# from the GitHub Action that created them. Update the URL below to point to the
# appropriate workflow run:
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/15334411824"
WORKFLOW_ID="${WORKFLOW_URL##*/}"
ARTIFACTS_DIR="$(mktemp -d)"
@@ -78,7 +85,7 @@ gh run download --dir "$ARTIFACTS_DIR" --repo openai/codex "$WORKFLOW_ID"
zstd -d "$ARTIFACTS_DIR/x86_64-unknown-linux-musl/codex-linux-sandbox-x86_64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-linux-sandbox-x64"
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-gnu/codex-linux-sandbox-aarch64-unknown-linux-gnu.zst" \
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-musl/codex-linux-sandbox-aarch64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-linux-sandbox-arm64"
if [[ "$INCLUDE_RUST" -eq 1 ]]; then
@@ -86,8 +93,8 @@ if [[ "$INCLUDE_RUST" -eq 1 ]]; then
zstd -d "$ARTIFACTS_DIR/x86_64-unknown-linux-musl/codex-x86_64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-x86_64-unknown-linux-musl"
# ARM64 Linux
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-gnu/codex-aarch64-unknown-linux-gnu.zst" \
-o "$BIN_DIR/codex-aarch64-unknown-linux-gnu"
zstd -d "$ARTIFACTS_DIR/aarch64-unknown-linux-musl/codex-aarch64-unknown-linux-musl.zst" \
-o "$BIN_DIR/codex-aarch64-unknown-linux-musl"
# x64 macOS
zstd -d "$ARTIFACTS_DIR/x86_64-apple-darwin/codex-x86_64-apple-darwin.zst" \
-o "$BIN_DIR/codex-x86_64-apple-darwin"
View File
@@ -4,10 +4,7 @@
# -----------------------------------------------------------------------------
# Stages an npm release for @openai/codex.
#
# The script used to accept a single optional positional argument that indicated
# the temporary directory in which to stage the package. We now support a
# flag-based interface so that we can extend the command with further options
# without breaking the call-site contract.
# Usage:
#
# --tmp <dir> : Use <dir> instead of a freshly created temp directory.
# --native : Bundle the pre-built Rust CLI binaries for Linux alongside
@@ -17,7 +14,7 @@
# When --native is supplied we copy the linux-sandbox binaries (as before) and
# additionally fetch / unpack the two Rust targets that we currently support:
# - x86_64-unknown-linux-musl
# - aarch64-unknown-linux-gnu
# - aarch64-unknown-linux-musl
#
# NOTE: This script is intended to be run from the repository root via
# `pnpm --filter codex-cli stage-release ...` or inside codex-cli with the
@@ -30,11 +27,12 @@ set -euo pipefail
usage() {
cat <<EOF
Usage: $(basename "$0") [--tmp DIR] [--native]
Usage: $(basename "$0") [--tmp DIR] [--native] [--version VERSION]
Options
--tmp DIR Use DIR to stage the release (defaults to a fresh mktemp dir)
--native Bundle Rust binaries for Linux (fat package)
--version Specify the version to release (defaults to a timestamp-based version)
-h, --help Show this help
Legacy positional argument: the first non-flag argument is still interpreted
@@ -45,6 +43,9 @@ EOF
TMPDIR=""
INCLUDE_NATIVE=0
# Default to a timestamp-based version (keep same scheme as before)
VERSION="$(printf '0.1.%d' "$(date +%y%m%d%H%M)")"
WORKFLOW_URL=""
# Manual flag parser - Bash getopts does not handle GNU long options well.
while [[ $# -gt 0 ]]; do
@@ -59,6 +60,14 @@ while [[ $# -gt 0 ]]; do
--native)
INCLUDE_NATIVE=1
;;
--version)
shift || { echo "--version requires an argument"; usage 1; }
VERSION="$1"
;;
--workflow-url)
shift || { echo "--workflow-url requires an argument"; exit 1; }
WORKFLOW_URL="$1"
;;
-h|--help)
usage 0
;;
@@ -108,9 +117,6 @@ cp -r dist "$TMPDIR/dist"
cp -r src "$TMPDIR/src" # keep source for TS sourcemaps
cp ../README.md "$TMPDIR" || true # README is one level up - ignore if missing
# Derive a timestamp-based version (keep same scheme as before)
VERSION="$(printf '0.1.%d' "$(date +%y%m%d%H%M)")"
# Modify package.json - bump version and optionally add the native directory to
# the files array so that the binaries are published to npm.
@@ -121,7 +127,7 @@ jq --arg version "$VERSION" \
# 2. Native runtime deps (sandbox plus optional Rust binaries)
if [[ "$INCLUDE_NATIVE" -eq 1 ]]; then
./scripts/install_native_deps.sh "$TMPDIR" --full-native
./scripts/install_native_deps.sh --full-native --workflow-url "$WORKFLOW_URL" "$TMPDIR"
touch "${TMPDIR}/bin/use-native"
else
./scripts/install_native_deps.sh "$TMPDIR"
@@ -132,7 +138,8 @@ popd >/dev/null
echo "Staged version $VERSION for release in $TMPDIR"
if [[ "$INCLUDE_NATIVE" -eq 1 ]]; then
echo "Test Rust:"
echo "Verify the CLI:"
echo " node ${TMPDIR}/bin/codex.js --version"
echo " node ${TMPDIR}/bin/codex.js --help"
else
echo "Test Node:"
View File
@@ -0,0 +1,62 @@
#!/usr/bin/env python3
import json
import subprocess
import sys
import argparse
from pathlib import Path
def main() -> int:
parser = argparse.ArgumentParser(
description="""Stage a release for the npm module.
Run this after the GitHub Release has been created and use
`--release-version` to specify the version to release.
"""
)
parser.add_argument(
"--release-version", required=True, help="Version to release, e.g., 0.3.0"
)
args = parser.parse_args()
version = args.release_version
gh_run = subprocess.run(
[
"gh",
"run",
"list",
"--branch",
f"rust-v{version}",
"--json",
"workflowName,url,headSha",
"--jq",
'first(.[] | select(.workflowName == "rust-release"))',
],
stdout=subprocess.PIPE,
check=True,
)
gh_run.check_returncode()
workflow = json.loads(gh_run.stdout)
sha = workflow["headSha"]
print(f"should `git checkout {sha}`")
current_dir = Path(__file__).parent.resolve()
stage_release = subprocess.run(
[
current_dir / "stage_release.sh",
"--version",
version,
"--workflow-url",
workflow["url"],
"--native",
]
)
stage_release.check_returncode()
return 0
if __name__ == "__main__":
sys.exit(main())
View File
@@ -370,11 +370,26 @@ export function isSafeCommand(
reason: "View file with line numbers",
group: "Reading files",
};
case "rg":
case "rg": {
// Certain ripgrep options execute external commands or invoke other
// processes, so we must reject them.
const isUnsafe = command.some(
(arg: string) =>
UNSAFE_OPTIONS_FOR_RIPGREP_WITHOUT_ARGS.has(arg) ||
[...UNSAFE_OPTIONS_FOR_RIPGREP_WITH_ARGS].some(
(opt) => arg === opt || arg.startsWith(`${opt}=`),
),
);
if (isUnsafe) {
break;
}
return {
reason: "Ripgrep search",
group: "Searching",
};
}
case "find": {
// Certain options to `find` allow executing arbitrary processes, so we
// cannot auto-approve them.
@@ -495,6 +510,22 @@ const UNSAFE_OPTIONS_FOR_FIND_COMMAND: ReadonlySet<string> = new Set([
"-fprintf",
]);
// Ripgrep options that are considered unsafe because they may execute
// arbitrary commands or spawn auxiliary processes.
const UNSAFE_OPTIONS_FOR_RIPGREP_WITH_ARGS: ReadonlySet<string> = new Set([
// Executes an arbitrary command for each matching file.
"--pre",
// Allows custom hostname command which could leak environment details.
"--hostname-bin",
]);
const UNSAFE_OPTIONS_FOR_RIPGREP_WITHOUT_ARGS: ReadonlySet<string> = new Set([
// Enables searching inside archives, which triggers external decompression
// utilities; reject out of an abundance of caution.
"--search-zip",
"-z",
]);
// ---------------- Helper utilities for complex shell expressions -----------------
// A conservative allow-list of bash operators that do not, on their own, cause
View File
@@ -45,6 +45,7 @@ import { createInputItem } from "./utils/input-utils";
import { initLogger } from "./utils/logger/log";
import { isModelSupportedForResponses } from "./utils/model-utils.js";
import { parseToolCall } from "./utils/parsers";
import { providers } from "./utils/providers";
import { onExit, setInkRenderer } from "./utils/terminal";
import chalk from "chalk";
import { spawnSync } from "child_process";
@@ -327,26 +328,44 @@ try {
// ignore errors
}
if (cli.flags.login) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
if (fs.existsSync(authFile)) {
const data = JSON.parse(fs.readFileSync(authFile, "utf-8"));
savedTokens = data.tokens;
// Get provider-specific API key if not OpenAI
if (provider.toLowerCase() !== "openai") {
const providerInfo = providers[provider.toLowerCase()];
if (providerInfo) {
const providerApiKey = process.env[providerInfo.envKey];
if (providerApiKey) {
apiKey = providerApiKey;
}
} catch {
/* ignore */
}
} else if (!apiKey) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
}
// Only proceed with OpenAI auth flow if:
// 1. Provider is OpenAI and no API key is set, or
// 2. Login flag is explicitly set
if (provider.toLowerCase() === "openai" && !apiKey) {
if (cli.flags.login) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
if (fs.existsSync(authFile)) {
const data = JSON.parse(fs.readFileSync(authFile, "utf-8"));
savedTokens = data.tokens;
}
} catch {
/* ignore */
}
} else {
apiKey = await fetchApiKey(client.issuer, client.client_id);
}
}
// Ensure the API key is available as an environment variable for legacy code
process.env["OPENAI_API_KEY"] = apiKey;
if (cli.flags.free) {
// Only attempt credit redemption for OpenAI provider
if (cli.flags.free && provider.toLowerCase() === "openai") {
// eslint-disable-next-line no-console
console.log(`${chalk.bold("codex --free")} attempting to redeem credits...`);
if (!savedTokens?.refresh_token) {
@@ -379,13 +398,18 @@ if (!apiKey && !NO_API_KEY_REQUIRED.has(provider.toLowerCase())) {
? `You can create a key here: ${chalk.bold(
chalk.underline("https://platform.openai.com/account/api-keys"),
)}\n`
: provider.toLowerCase() === "gemini"
: provider.toLowerCase() === "azure"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
`${provider.toUpperCase()}_OPENAI_API_KEY`,
)} ` +
`in Azure AI Foundry portal at ${chalk.bold(chalk.underline("https://ai.azure.com"))}.\n`
: provider.toLowerCase() === "gemini"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
}`,
);
process.exit(1);

View File

@@ -800,7 +800,8 @@ export class AgentLoop {
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
this.config.provider?.toLowerCase() === "openai" ||
this.config.provider?.toLowerCase() === "azure"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
@@ -1188,7 +1189,8 @@ export class AgentLoop {
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
this.config.provider?.toLowerCase() === "openai" ||
this.config.provider?.toLowerCase() === "azure"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>

View File

@@ -69,7 +69,7 @@ export const OPENAI_BASE_URL = process.env["OPENAI_BASE_URL"] || "";
export let OPENAI_API_KEY = process.env["OPENAI_API_KEY"] || "";
export const AZURE_OPENAI_API_VERSION =
process.env["AZURE_OPENAI_API_VERSION"] || "2025-03-01-preview";
process.env["AZURE_OPENAI_API_VERSION"] || "2025-04-01-preview";
export const DEFAULT_REASONING_EFFORT = "high";
export const OPENAI_ORGANIZATION = process.env["OPENAI_ORGANIZATION"] || "";

View File

@@ -382,6 +382,8 @@ async function handleCallback(
const exchanged = (await exchangeRes.json()) as {
access_token: string;
// NOTE(mbolin): I did not see the "key" property set in practice. Note
// this property is not read by the code.
key: string;
};

View File

@@ -0,0 +1,107 @@
/**
* tests/agent-azure-responses-endpoint.test.ts
*
* Verifies that AgentLoop calls the `/responses` endpoint when provider is set to Azure.
*/
import { describe, it, expect, vi, beforeEach } from "vitest";
// Fake stream that yields a completed response event
class FakeStream {
async *[Symbol.asyncIterator]() {
yield {
type: "response.completed",
response: { id: "azure_resp", status: "completed", output: [] },
} as any;
}
}
let lastCreateParams: any = null;
vi.mock("openai", () => {
class FakeDefaultClient {
public responses = {
create: async (params: any) => {
lastCreateParams = params;
return new FakeStream();
},
};
}
class FakeAzureClient {
public responses = {
create: async (params: any) => {
lastCreateParams = params;
return new FakeStream();
},
};
}
class APIConnectionTimeoutError extends Error {}
return {
__esModule: true,
default: FakeDefaultClient,
AzureOpenAI: FakeAzureClient,
APIConnectionTimeoutError,
};
});
// Stub approvals to bypass command approval logic
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }),
isSafeCommand: () => null,
}));
// Stub format-command to avoid formatting side effects
vi.mock("../src/format-command.js", () => ({
__esModule: true,
formatCommandForDisplay: (cmd: Array<string>) => cmd.join(" "),
}));
// Stub internal logging to keep output clean
vi.mock("../src/utils/agent/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("AgentLoop Azure provider responses endpoint", () => {
beforeEach(() => {
lastCreateParams = null;
});
it("calls the /responses endpoint when provider is azure", async () => {
const cfg: any = {
model: "test-model",
provider: "azure",
instructions: "",
disableResponseStorage: false,
notify: false,
};
const loop = new AgentLoop({
additionalWritableRoots: [],
model: cfg.model,
config: cfg,
instructions: cfg.instructions,
approvalPolicy: { mode: "suggest" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
await loop.run([
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "hello" }],
},
]);
expect(lastCreateParams).not.toBeNull();
expect(lastCreateParams.model).toBe(cfg.model);
expect(Array.isArray(lastCreateParams.input)).toBe(true);
});
});

View File

@@ -44,6 +44,14 @@ describe("canAutoApprove()", () => {
group: "Navigating",
runInSandbox: false,
});
// Ripgrep safe invocation.
expect(check(["rg", "TODO"])).toEqual({
type: "auto-approve",
reason: "Ripgrep search",
group: "Searching",
runInSandbox: false,
});
});
test("simple safe commands within a `bash -lc` call", () => {
@@ -67,6 +75,24 @@ describe("canAutoApprove()", () => {
});
});
test("ripgrep unsafe flags", () => {
// Flags that do not take arguments
expect(check(["rg", "--search-zip", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "-z", "TODO"])).toEqual({ type: "ask-user" });
// Flags that take arguments (provided separately)
expect(check(["rg", "--pre", "cat", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "--hostname-bin", "hostname", "TODO"])).toEqual({
type: "ask-user",
});
// Flags that take arguments in = form
expect(check(["rg", "--pre=cat", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "--hostname-bin=hostname", "TODO"])).toEqual({
type: "ask-user",
});
});
test("bash -lc commands with unsafe redirects", () => {
expect(check(["bash", "-lc", "echo hello > file.txt"])).toEqual({
type: "ask-user",

6
codex-rs/.gitignore vendored
View File

@@ -1 +1,7 @@
/target/
# Recommended value of CARGO_TARGET_DIR when using Docker as explained in .devcontainer/README.md.
/target-amd64/
# Value of CARGO_TARGET_DIR when using .devcontainer/devcontainer.json.
/target-arm64/

1417
codex-rs/Cargo.lock generated

File diff suppressed because it is too large.

View File

@@ -8,7 +8,9 @@ members = [
"core",
"exec",
"execpolicy",
"file-search",
"linux-sandbox",
"login",
"mcp-client",
"mcp-server",
"mcp-types",
@@ -35,3 +37,6 @@ lto = "fat"
# Because we bundle some of these executables with the TypeScript CLI, we
# remove everything to make the binary as small as possible.
strip = "symbols"
# See https://github.com/openai/codex/issues/1411 for details.
codegen-units = 1

View File

@@ -1,16 +1,90 @@
# codex-rs
# Codex CLI (Rust Implementation)
April 24, 2025
We provide Codex CLI as a standalone, native executable to ensure a zero-dependency install.
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible.
## Installing Codex
To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits:
Today, the easiest way to install Codex is via `npm`, though we plan to publish Codex to other package managers soon.
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance.
```shell
npm i -g @openai/codex@native
codex
```
Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implementation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.
You can also download a platform-specific release directly from our [GitHub Releases](https://github.com/openai/codex/releases).
## What's new in the Rust CLI
While we are [working to close the gap between the TypeScript and Rust implementations of Codex CLI](https://github.com/openai/codex/issues/1262), note that the Rust CLI has a number of features that the TypeScript CLI does not!
### Config
Codex supports a rich set of configuration options. Note that the Rust CLI uses `config.toml` instead of `config.json`. See [`config.md`](./config.md) for details.
### Model Context Protocol Support
Codex CLI functions as an MCP client that can connect to MCP servers on startup. See the [`mcp_servers`](./config.md#mcp_servers) section in the configuration documentation for details.
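As a minimal sketch (the server name and package here are illustrative, not part of Codex itself), an entry in `~/.codex/config.toml` looks like:
```toml
# Launch a stdio-based MCP server when Codex starts (illustrative example).
[mcp_servers.everything]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-everything"]
```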
It is still experimental, but you can also launch Codex as an MCP _server_ by running `codex mcp`. Use the [`@modelcontextprotocol/inspector`](https://github.com/modelcontextprotocol/inspector) to try it out:
```shell
npx @modelcontextprotocol/inspector codex mcp
```
### Notifications
You can enable notifications by configuring a script that is run whenever the agent finishes a turn. The [notify documentation](./config.md#notify) includes a detailed example that explains how to get desktop notifications via [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS.
### `codex exec` to run Codex programmatically/non-interactively
To run Codex non-interactively, run `codex exec PROMPT` (you can also pass the prompt via `stdin`) and Codex will work on your task until it decides that it is done and exits. Output is printed to the terminal directly. You can set the `RUST_LOG` environment variable to see more about what's going on.
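For example (the prompt text and log level here are illustrative):
```shell
# Work on a task until Codex decides it is done, then exit.
codex exec "update the CHANGELOG for the next release"

# The prompt can also be piped via stdin.
echo "update the CHANGELOG for the next release" | codex exec

# Surface more detail about what is going on.
RUST_LOG=info codex exec "update the CHANGELOG for the next release"
```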
### Use `@` for file search
Typing `@` triggers a fuzzy-filename search over the workspace root. Use up/down to select among the results and Tab or Enter to replace the `@` with the selected path. You can use Esc to cancel the search.
### `--cd`/`-C` flag
Sometimes it is not convenient to `cd` to the directory you want Codex to use as the "working root" before running Codex. Fortunately, `codex` supports a `--cd` option so you can specify whatever folder you want. You can confirm that Codex is honoring `--cd` by double-checking the **workdir** it reports in the TUI at the start of a new session.
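For instance (the path is illustrative):
```shell
# Use ~/projects/my-app as the working root without cd'ing there first.
codex --cd ~/projects/my-app

# -C is the short form.
codex -C ~/projects/my-app
```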
### Shell completions
Generate shell completion scripts via:
```shell
codex completion bash
codex completion zsh
codex completion fish
```
### Experimenting with the Codex Sandbox
To test what happens when a command is run under the sandbox provided by Codex, we provide the following subcommands in Codex CLI:
```
# macOS
codex debug seatbelt [--full-auto] [COMMAND]...
# Linux
codex debug landlock [--full-auto] [COMMAND]...
```
### Selecting a sandbox policy via `--sandbox`
The Rust CLI exposes a dedicated `--sandbox` (`-s`) flag that lets you pick the sandbox policy **without** having to reach for the generic `-c/--config` option:
```shell
# Run Codex with the default, read-only sandbox
codex --sandbox read-only
# Allow the agent to write within the current workspace while still blocking network access
codex --sandbox workspace-write
# Danger! Disable sandboxing entirely (only do this if you are already running in a container or other isolated env)
codex --sandbox danger-full-access
```
The same setting can be persisted in `~/.codex/config.toml` via the top-level `sandbox_mode = "MODE"` key, e.g. `sandbox_mode = "workspace-write"`.
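For example:
```toml
# ~/.codex/config.toml
sandbox_mode = "workspace-write"
```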
## Code Organization
@@ -20,373 +94,3 @@ This folder is the root of a Cargo workspace. It contains quite a bit of experim
- [`exec/`](./exec) "headless" CLI for use in automation.
- [`tui/`](./tui) CLI that launches a fullscreen TUI built with [Ratatui](https://ratatui.rs/).
- [`cli/`](./cli) CLI multitool that provides the aforementioned CLIs via subcommands.
## Config
The CLI can be configured via a file named `config.toml`. By default, configuration is read from `~/.codex/config.toml`, though the `CODEX_HOME` environment variable can be used to specify a directory other than `~/.codex`.
The `config.toml` file supports the following options:
### model
The model that Codex should use.
```toml
model = "o3" # overrides the default of "o4-mini"
```
### model_provider
Codex comes bundled with a number of "model providers" predefined. This config value is a string that indicates which provider to use. You can also define your own providers via `model_providers`.
For example, if you are running ollama with Mistral locally, then you would need to add the following to your config:
```toml
model = "mistral"
model_provider = "ollama"
```
because the following definition for `ollama` is included in Codex:
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"
```
This option defaults to `"openai"` and the corresponding provider is defined as follows:
```toml
[model_providers.openai]
name = "OpenAI"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"
```
### model_providers
This option lets you override and amend the default set of model providers bundled with Codex. This value is a map where the key is the value to use with `model_provider` to select the corresponding provider.
For example, if you wanted to add a provider that uses the OpenAI 4o model via the chat completions API, then you would add the following to your config:
```toml
# Recall that in TOML, root keys must be listed before tables.
model = "gpt-4o"
model_provider = "openai-chat-completions"
[model_providers.openai-chat-completions]
# Name of the provider that will be displayed in the Codex UI.
name = "OpenAI using Chat Completions"
# The path `/chat/completions` will be appended to this URL to make the POST
# request for the chat completions.
base_url = "https://api.openai.com/v1"
# If `env_key` is set, identifies an environment variable that must be set when
# using Codex with this provider. The value of the environment variable must be
# non-empty and will be used in the `Bearer TOKEN` HTTP header for the POST request.
env_key = "OPENAI_API_KEY"
# valid values for wire_api are "chat" and "responses".
wire_api = "chat"
```
### approval_policy
Determines when the user should be prompted to approve whether Codex can execute a command:
```toml
# This is analogous to --suggest in the TypeScript Codex CLI
approval_policy = "unless-allow-listed"
```
```toml
# If the command fails when run in the sandbox, Codex asks for permission to
# retry the command outside the sandbox.
approval_policy = "on-failure"
```
```toml
# User is never prompted: if the command fails, Codex will automatically try
# something else. Note the `exec` subcommand always uses this mode.
approval_policy = "never"
```
### profiles
A _profile_ is a collection of configuration values that can be set together. Multiple profiles can be defined in `config.toml` and you can specify the one you
want to use at runtime via the `--profile` flag.
Here is an example of a `config.toml` that defines multiple profiles:
```toml
model = "o3"
approval_policy = "unless-allow-listed"
sandbox_permissions = ["disk-full-read-access"]
disable_response_storage = false
# Setting `profile` is equivalent to specifying `--profile o3` on the command
# line, though the `--profile` flag can still be used to override this value.
profile = "o3"
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
[profiles.o3]
model = "o3"
model_provider = "openai"
approval_policy = "never"
[profiles.gpt3]
model = "gpt-3.5-turbo"
model_provider = "openai-chat-completions"
[profiles.zdr]
model = "o3"
model_provider = "openai"
approval_policy = "on-failure"
disable_response_storage = true
```
Users can specify config values at multiple levels. Order of precedence is as follows:
1. custom command-line argument, e.g., `--model o3`
2. as part of a profile, where the `--profile` is specified via the CLI (or in the config file itself)
3. as an entry in `config.toml`, e.g., `model = "o3"`
4. the default value that comes with Codex CLI (i.e., Codex CLI defaults to `o4-mini`)
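For example, with the `config.toml` above, the following invocation sketches how the layers combine:
```shell
# --profile selects [profiles.zdr] (level 2), which sets model = "o3",
# but the explicit --model flag (level 1) wins, so o4-mini is used.
codex --profile zdr --model o4-mini
```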
### sandbox_permissions
List of permissions to grant to the sandbox that Codex uses to execute untrusted commands:
```toml
# This is comparable to --full-auto in the TypeScript Codex CLI, though
# specifying `disk-write-platform-global-temp-folder` adds /tmp as a writable
# folder in addition to $TMPDIR.
sandbox_permissions = [
"disk-full-read-access",
"disk-write-platform-user-temp-folder",
"disk-write-platform-global-temp-folder",
"disk-write-cwd",
]
```
To add additional writable folders, use `disk-write-folder`, which takes a parameter (this can be specified multiple times):
```toml
sandbox_permissions = [
# ...
"disk-write-folder=/Users/mbolin/.pyenv/shims",
]
```
### mcp_servers
Defines the list of MCP servers that Codex can consult for tool use. Currently, only servers that are launched by executing a program that communicates over stdio are supported. For servers that use the SSE transport, consider an adapter like [mcp-proxy](https://github.com/sparfenyuk/mcp-proxy).
**Note:** Codex may cache the list of tools and resources from an MCP server so that Codex can include this information in context at startup without spawning all the servers. This is designed to save resources by loading MCP servers lazily.
This config option is comparable to how Claude and Cursor define `mcpServers` in their respective JSON config files, though because Codex uses TOML for its config language, the format is slightly different. For example, the following config in JSON:
```json
{
  "mcpServers": {
    "server-name": {
      "command": "npx",
      "args": ["-y", "mcp-server"],
      "env": {
        "API_KEY": "value"
      }
    }
  }
}
```
Should be represented as follows in `~/.codex/config.toml`:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }
```
### disable_response_storage
Currently, customers whose accounts are set to use Zero Data Retention (ZDR) must set `disable_response_storage` to `true` so that Codex uses an alternative to the Responses API that works with ZDR:
```toml
disable_response_storage = true
```
### shell_environment_policy
Codex spawns subprocesses (e.g. when executing a `local_shell` tool-call suggested by the assistant). By default it passes **only a minimal core subset** of your environment to those subprocesses to avoid leaking credentials. You can tune this behavior via the **`shell_environment_policy`** block in
`config.toml`:
```toml
[shell_environment_policy]
# inherit can be "core" (default), "all", or "none"
inherit = "core"
# set to true to *skip* the filter for `"*KEY*"` and `"*TOKEN*"`
ignore_default_excludes = false
# exclude patterns (case-insensitive globs)
exclude = ["AWS_*", "AZURE_*"]
# force-set / override values
set = { CI = "1" }
# if provided, *only* vars matching these patterns are kept
include_only = ["PATH", "HOME"]
```
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions always win over inherited values. |
| `include_only` | array&lt;string&gt; | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |
The patterns are **glob style**, not full regular expressions: `*` matches any
number of characters, `?` matches exactly one, and character classes like
`[A-Z]`/`[^0-9]` are supported. Matching is always **case-insensitive**. This
syntax is documented in code as `EnvironmentVariablePattern` (see
`core/src/config_types.rs`).
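As a small sketch using each pattern form (the variable names are hypothetical):
```toml
[shell_environment_policy]
inherit = "all"
exclude = [
  "AWS_*",       # `*`: any run of characters, e.g. AWS_SECRET_ACCESS_KEY
  "TMP?",        # `?`: exactly one character, e.g. TMP1 but not TMPDIR
  "[A-Z]_HOME",  # character class: one capital letter followed by _HOME
]
```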
If you just need a clean slate with a few custom entries you can write:
```toml
[shell_environment_policy]
inherit = "none"
set = { PATH = "/usr/bin", MY_FLAG = "1" }
```
Currently, `CODEX_SANDBOX_NETWORK_DISABLED=1` is also added to the environment, assuming network is disabled. This is not configurable.
### notify
Specify a program that will be executed to get notified about events generated by Codex. Note that the program will receive the notification argument as a string of JSON, e.g.:
```json
{
  "type": "agent-turn-complete",
  "turn-id": "12345",
  "input-messages": ["Rename `foo` to `bar` and update the callsites."],
  "last-assistant-message": "Rename complete and verified `cargo build` succeeds."
}
```
The `"type"` property will always be set. Currently, `"agent-turn-complete"` is the only notification type that is supported.
As an example, here is a Python script that parses the JSON and decides whether to show a desktop push notification using [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS:
```python
#!/usr/bin/env python3

import json
import subprocess
import sys


def main() -> int:
    if len(sys.argv) != 2:
        print("Usage: notify.py <NOTIFICATION_JSON>")
        return 1

    try:
        notification = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        return 1

    match notification_type := notification.get("type"):
        case "agent-turn-complete":
            assistant_message = notification.get("last-assistant-message")
            if assistant_message:
                title = f"Codex: {assistant_message}"
            else:
                title = "Codex: Turn Complete!"
            # Note: the payload uses hyphenated keys (see the JSON example above).
            input_messages = notification.get("input-messages", [])
            message = " ".join(input_messages)
            title += message
        case _:
            print(f"not sending a push notification for: {notification_type}")
            return 0

    subprocess.check_output(
        [
            "terminal-notifier",
            "-title",
            title,
            "-message",
            message,
            "-group",
            "codex",
            "-ignoreDnD",
            "-activate",
            "com.googlecode.iterm2",
        ]
    )

    return 0


if __name__ == "__main__":
    sys.exit(main())
```
To have Codex use this script for notifications, you would configure it via `notify` in `~/.codex/config.toml` using the appropriate path to `notify.py` on your computer:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
### history
By default, Codex CLI records messages sent to the model in `$CODEX_HOME/history.jsonl`. Note that on UNIX, the file permissions are set to `o600`, so it should only be readable and writable by the owner.
To disable this behavior, configure `[history]` as follows:
```toml
[history]
persistence = "none" # "save-all" is the default value
```
### file_opener
Identifies the editor/URI scheme to use for hyperlinking citations in model output. If set, citations to files in the model output will be hyperlinked using the specified URI scheme so they can be ctrl/cmd-clicked from the terminal to open them.
For example, if the model output includes a reference such as `【F:/home/user/project/main.py†L42-L50】`, then this would be rewritten to link to the URI `vscode://file/home/user/project/main.py:42`.
Note this is **not** a general editor setting (like `$EDITOR`), as it only accepts a fixed set of values:
- `"vscode"` (default)
- `"vscode-insiders"`
- `"windsurf"`
- `"cursor"`
- `"none"` to explicitly disable this feature
Currently, `"vscode"` is the default, though Codex does not verify VS Code is installed. As such, `file_opener` may default to `"none"` or something else in the future.
### project_doc_max_bytes
Maximum number of bytes to read from an `AGENTS.md` file to include in the instructions sent with the first turn of a session. Defaults to 32 KiB.
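For example, to double the default limit:
```toml
project_doc_max_bytes = 65536  # 64 KiB; the default is 32 KiB (32768)
```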
### tui
Options that are specific to the TUI.
```toml
[tui]
# This will make it so that Codex does not try to process mouse events, which
# means your terminal's native drag-to-select text selection and copy/paste
# should work. The tradeoff is that Codex will not receive any mouse events, so
# it will not be possible to use the mouse to scroll conversation history.
#
# Note that most terminals support holding down a modifier key when using the
# mouse to support text selection. For example, even if Codex mouse capture is
# enabled (i.e., this is set to `false`), you can still hold down alt while
# dragging the mouse to select text.
disable_mouse_capture = true # defaults to `false`
```

View File

@@ -12,12 +12,10 @@ workspace = true
[dependencies]
anyhow = "1"
regex = "1.11.1"
serde_json = "1.0.110"
similar = "2.7.0"
thiserror = "2.0.12"
tree-sitter = "0.25.3"
tree-sitter-bash = "0.23.3"
tree-sitter-bash = "0.25.0"
[dev-dependencies]
pretty_assertions = "1.4.1"

View File

@@ -0,0 +1,40 @@
To edit files, ALWAYS use the `shell` tool with `apply_patch` CLI. `apply_patch` effectively allows you to execute a diff/patch against a file, but the format of the diff specification is unique to this task, so pay careful attention to these instructions. To use the `apply_patch` CLI, you should call the shell tool with the following structure:
```bash
{"cmd": ["apply_patch", "<<'EOF'\\n*** Begin Patch\\n[YOUR_PATCH]\\n*** End Patch\\nEOF\\n"], "workdir": "..."}
```
Where [YOUR_PATCH] is the actual content of your patch, specified in the following V4A diff format.
*** [ACTION] File: [path/to/file] -> ACTION can be one of Add, Update, or Delete.
For each snippet of code that needs to be changed, repeat the following:
[context_before] -> See below for further instructions on context.
- [old_code] -> Precede the old code with a minus sign.
+ [new_code] -> Precede the new, replacement code with a plus sign.
[context_after] -> See below for further instructions on context.
For instructions on [context_before] and [context_after]:
- By default, show 3 lines of code immediately above and 3 lines immediately below each change. If a change is within 3 lines of a previous change, do NOT duplicate the first change's [context_after] lines in the second change's [context_before] lines.
- If 3 lines of context is insufficient to uniquely identify the snippet of code within the file, use the @@ operator to indicate the class or function to which the snippet belongs. For instance, we might have:
@@ class BaseClass
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
- If a code block is repeated so many times in a class or function such that even a single `@@` statement and 3 lines of context cannot uniquely identify the snippet of code, you can use multiple `@@` statements to jump to the right context. For instance:
@@ class BaseClass
@@ def method():
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
Note, then, that we do not use line numbers in this diff format, as the context is enough to uniquely identify code. An example of a message that you might pass as "input" to this function, in order to apply a patch, is shown below.
```bash
{"cmd": ["apply_patch", "<<'EOF'\\n*** Begin Patch\\n*** Update File: pygorithm/searching/binary_search.py\\n@@ class BaseClass\\n@@ def search():\\n- pass\\n+ raise NotImplementedError()\\n@@ class Subclass\\n@@ def search():\\n- pass\\n+ raise NotImplementedError()\\n*** End Patch\\nEOF\\n"], "workdir": "..."}
```
File references can only be relative, NEVER ABSOLUTE. After the apply_patch command is run, it will always say "Done!", regardless of whether the patch was successfully applied or not. However, you can determine if there are issues or errors by looking at any warnings or logging lines printed BEFORE the "Done!" is output.

View File

@@ -19,6 +19,9 @@ use tree_sitter::LanguageError;
use tree_sitter::Parser;
use tree_sitter_bash::LANGUAGE as BASH;
/// Detailed instructions for gpt-4.1 on how to use the `apply_patch` tool.
pub const APPLY_PATCH_TOOL_INSTRUCTIONS: &str = include_str!("../apply_patch_tool_instructions.md");
#[derive(Debug, Error, PartialEq)]
pub enum ApplyPatchError {
#[error(transparent)]
@@ -630,7 +633,7 @@ mod tests {
/// Helper to construct a patch with the given body.
fn wrap_patch(body: &str) -> String {
format!("*** Begin Patch\n{}\n*** End Patch", body)
format!("*** Begin Patch\n{body}\n*** End Patch")
}
fn strs_to_strings(strs: &[&str]) -> Vec<String> {
@@ -658,7 +661,7 @@ mod tests {
}]
);
}
result => panic!("expected MaybeApplyPatch::Body got {:?}", result),
result => panic!("expected MaybeApplyPatch::Body got {result:?}"),
}
}
@@ -685,7 +688,7 @@ PATCH"#,
}]
);
}
result => panic!("expected MaybeApplyPatch::Body got {:?}", result),
result => panic!("expected MaybeApplyPatch::Body got {result:?}"),
}
}

View File

@@ -37,7 +37,15 @@ const EOF_MARKER: &str = "*** End of File";
const CHANGE_CONTEXT_MARKER: &str = "@@ ";
const EMPTY_CHANGE_CONTEXT_MARKER: &str = "@@";
#[derive(Debug, PartialEq, Error)]
/// Currently, the only OpenAI model that knowingly requires lenient parsing is
/// gpt-4.1. While we could try to require everyone to pass in a strictness
/// param when invoking apply_patch, it is a pain to thread it through all of
/// the call sites, so we resign ourselves to allowing lenient parsing for all
/// models. See [`ParseMode::Lenient`] for details on the exceptions we make for
/// gpt-4.1.
const PARSE_IN_STRICT_MODE: bool = false;
#[derive(Debug, PartialEq, Error, Clone)]
pub enum ParseError {
#[error("invalid patch: {0}")]
InvalidPatchError(String),
@@ -46,7 +54,7 @@ pub enum ParseError {
}
use ParseError::*;
#[derive(Debug, PartialEq)]
#[derive(Debug, PartialEq, Clone)]
#[allow(clippy::enum_variant_names)]
pub enum Hunk {
AddFile {
@@ -78,7 +86,7 @@ impl Hunk {
use Hunk::*;
#[derive(Debug, PartialEq)]
#[derive(Debug, PartialEq, Clone)]
pub struct UpdateFileChunk {
/// A single line of context used to narrow down the position of the chunk
/// (this is usually a class, method, or function definition.)
@@ -95,19 +103,68 @@ pub struct UpdateFileChunk {
}
pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
    let mode = if PARSE_IN_STRICT_MODE {
        ParseMode::Strict
    } else {
        ParseMode::Lenient
    };
    parse_patch_text(patch, mode)
}
enum ParseMode {
    /// Parse the patch text argument as is.
    Strict,
    /// GPT-4.1 is known to formulate the `command` array for the `local_shell`
    /// tool call for an `apply_patch` call using something like the following:
    ///
    /// ```json
    /// [
    ///   "apply_patch",
    ///   "<<'EOF'\n*** Begin Patch\n*** Update File: README.md\n@@...\n*** End Patch\nEOF\n",
    /// ]
    /// ```
    ///
    /// This is a problem because `local_shell` is a bit of a misnomer: the
    /// `command` is not invoked by passing the arguments to a shell like Bash,
    /// but is invoked using something akin to `execvpe(3)`.
    ///
    /// This is significant in this case because where a shell would interpret
    /// `<<'EOF'...` as a heredoc and pass the contents via stdin (which is
    /// fine, as `apply_patch` is specified to read from stdin if no argument is
    /// passed), `execvpe(3)` interprets the heredoc as a literal string. To get
    /// the `local_shell` tool to run a command the way a shell would, the
    /// `command` array must be something like:
    ///
    /// ```json
    /// [
    ///   "bash",
    ///   "-lc",
    ///   "apply_patch <<'EOF'\n*** Begin Patch\n*** Update File: README.md\n@@...\n*** End Patch\nEOF\n",
    /// ]
    /// ```
    ///
    /// In lenient mode, we check if the argument to `apply_patch` starts with
    /// `<<'EOF'` and ends with `EOF\n`. If so, we strip off these markers,
    /// trim() the result, and treat what is left as the patch text.
    Lenient,
}
fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<Vec<Hunk>, ParseError> {
let lines: Vec<&str> = patch.trim().lines().collect();
if lines.is_empty() || lines[0] != BEGIN_PATCH_MARKER {
return Err(InvalidPatchError(String::from(
"The first line of the patch must be '*** Begin Patch'",
)));
}
let last_line_index = lines.len() - 1;
if lines[last_line_index] != END_PATCH_MARKER {
return Err(InvalidPatchError(String::from(
"The last line of the patch must be '*** End Patch'",
)));
}
let lines: &[&str] = match check_patch_boundaries_strict(&lines) {
Ok(()) => &lines,
Err(e) => match mode {
ParseMode::Strict => {
return Err(e);
}
ParseMode::Lenient => check_patch_boundaries_lenient(&lines, e)?,
},
};
let mut hunks: Vec<Hunk> = Vec::new();
// The above checks ensure that lines.len() >= 2.
let last_line_index = lines.len().saturating_sub(1);
let mut remaining_lines = &lines[1..last_line_index];
let mut line_number = 2;
while !remaining_lines.is_empty() {
@@ -119,6 +176,64 @@ pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
Ok(hunks)
}
/// Checks the start and end lines of the patch text for `apply_patch`,
/// returning an error if they do not match the expected markers.
fn check_patch_boundaries_strict(lines: &[&str]) -> Result<(), ParseError> {
let (first_line, last_line) = match lines {
[] => (None, None),
[first] => (Some(first), Some(first)),
[first, .., last] => (Some(first), Some(last)),
};
check_start_and_end_lines_strict(first_line, last_line)
}
/// If we are in lenient mode, we check if the first line starts with `<<EOF`
/// (possibly quoted) and the last line ends with `EOF`. There must be at least
/// 4 lines total because the heredoc markers take up 2 lines and the patch text
/// must have at least 2 lines.
///
/// If successful, returns the lines of the patch text that contain the patch
/// contents, excluding the heredoc markers.
fn check_patch_boundaries_lenient<'a>(
original_lines: &'a [&'a str],
original_parse_error: ParseError,
) -> Result<&'a [&'a str], ParseError> {
match original_lines {
[first, .., last] => {
if (first == &"<<EOF" || first == &"<<'EOF'" || first == &"<<\"EOF\"")
&& last.ends_with("EOF")
&& original_lines.len() >= 4
{
let inner_lines = &original_lines[1..original_lines.len() - 1];
match check_patch_boundaries_strict(inner_lines) {
Ok(()) => Ok(inner_lines),
Err(e) => Err(e),
}
} else {
Err(original_parse_error)
}
}
_ => Err(original_parse_error),
}
}
fn check_start_and_end_lines_strict(
first_line: Option<&&str>,
last_line: Option<&&str>,
) -> Result<(), ParseError> {
match (first_line, last_line) {
(Some(&first), Some(&last)) if first == BEGIN_PATCH_MARKER && last == END_PATCH_MARKER => {
Ok(())
}
(Some(&first), _) if first != BEGIN_PATCH_MARKER => Err(InvalidPatchError(String::from(
"The first line of the patch must be '*** Begin Patch'",
))),
_ => Err(InvalidPatchError(String::from(
"The last line of the patch must be '*** End Patch'",
))),
}
}
/// Attempts to parse a single hunk from the start of lines.
/// Returns the parsed hunk and the number of lines parsed (or a ParseError).
fn parse_one_hunk(lines: &[&str], line_number: usize) -> Result<(Hunk, usize), ParseError> {
@@ -312,22 +427,23 @@ fn parse_update_file_chunk(
#[test]
fn test_parse_patch() {
assert_eq!(
parse_patch("bad"),
parse_patch_text("bad", ParseMode::Strict),
Err(InvalidPatchError(
"The first line of the patch must be '*** Begin Patch'".to_string()
))
);
assert_eq!(
parse_patch("*** Begin Patch\nbad"),
parse_patch_text("*** Begin Patch\nbad", ParseMode::Strict),
Err(InvalidPatchError(
"The last line of the patch must be '*** End Patch'".to_string()
))
);
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** Update File: test.py\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Err(InvalidHunkError {
message: "Update file hunk for path 'test.py' is empty".to_string(),
@@ -335,14 +451,15 @@ fn test_parse_patch() {
})
);
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Ok(Vec::new())
);
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** Add File: path/add.py\n\
+abc\n\
@@ -353,7 +470,8 @@ fn test_parse_patch() {
@@ def f():\n\
- pass\n\
+ return 123\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Ok(vec![
AddFile {
@@ -377,14 +495,15 @@ fn test_parse_patch() {
);
// Update hunk followed by another hunk (Add File).
assert_eq!(
parse_patch(
parse_patch_text(
"*** Begin Patch\n\
*** Update File: file.py\n\
@@\n\
+line\n\
*** Add File: other.py\n\
+content\n\
*** End Patch"
*** End Patch",
ParseMode::Strict
),
Ok(vec![
UpdateFile {
@@ -407,12 +526,13 @@ fn test_parse_patch() {
// Update hunk without an explicit @@ header for the first chunk should parse.
// Use a raw string to preserve the leading space diff marker on the context line.
assert_eq!(
parse_patch(
parse_patch_text(
r#"*** Begin Patch
*** Update File: file2.py
import foo
+bar
*** End Patch"#,
ParseMode::Strict
),
Ok(vec![UpdateFile {
path: PathBuf::from("file2.py"),
@@ -427,6 +547,80 @@ fn test_parse_patch() {
);
}
#[test]
fn test_parse_patch_lenient() {
let patch_text = r#"*** Begin Patch
*** Update File: file2.py
import foo
+bar
*** End Patch"#;
let expected_patch = vec![UpdateFile {
path: PathBuf::from("file2.py"),
move_path: None,
chunks: vec![UpdateFileChunk {
change_context: None,
old_lines: vec!["import foo".to_string()],
new_lines: vec!["import foo".to_string(), "bar".to_string()],
is_end_of_file: false,
}],
}];
let expected_error =
InvalidPatchError("The first line of the patch must be '*** Begin Patch'".to_string());
let patch_text_in_heredoc = format!("<<EOF\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
);
let patch_text_in_single_quoted_heredoc = format!("<<'EOF'\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_single_quoted_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_single_quoted_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
);
let patch_text_in_double_quoted_heredoc = format!("<<\"EOF\"\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_double_quoted_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_double_quoted_heredoc, ParseMode::Lenient),
Ok(expected_patch.clone())
);
let patch_text_in_mismatched_quotes_heredoc = format!("<<\"EOF'\n{patch_text}\nEOF\n");
assert_eq!(
parse_patch_text(&patch_text_in_mismatched_quotes_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_in_mismatched_quotes_heredoc, ParseMode::Lenient),
Err(expected_error.clone())
);
let patch_text_with_missing_closing_heredoc =
"<<EOF\n*** Begin Patch\n*** Update File: file2.py\nEOF\n".to_string();
assert_eq!(
parse_patch_text(&patch_text_with_missing_closing_heredoc, ParseMode::Strict),
Err(expected_error.clone())
);
assert_eq!(
parse_patch_text(&patch_text_with_missing_closing_heredoc, ParseMode::Lenient),
Err(InvalidPatchError(
"The last line of the patch must be '*** End Patch'".to_string()
))
);
}
#[test]
fn test_parse_one_hunk() {
assert_eq!(

View File

@@ -0,0 +1,21 @@
[package]
name = "codex-chatgpt"
version = { workspace = true }
edition = "2024"
[lints]
workspace = true
[dependencies]
anyhow = "1"
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
codex-common = { path = "../common", features = ["cli"] }
codex-core = { path = "../core" }
codex-login = { path = "../login" }
reqwest = { version = "0.12", features = ["json", "stream"] }
tokio = { version = "1", features = ["full"] }
[dev-dependencies]
tempfile = "3"

View File

@@ -0,0 +1,5 @@
# ChatGPT
This crate pertains to first-party ChatGPT APIs and products such as the Codex agent.
This crate should be primarily built and maintained by OpenAI employees. Please reach out to a maintainer before making an external contribution.

View File

@@ -0,0 +1,89 @@
use clap::Parser;
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use crate::chatgpt_token::init_chatgpt_token_from_auth;
use crate::get_task::GetTaskResponse;
use crate::get_task::OutputItem;
use crate::get_task::PrOutputItem;
use crate::get_task::get_task;
/// Applies the latest diff from a Codex agent task.
#[derive(Debug, Parser)]
pub struct ApplyCommand {
pub task_id: String,
#[clap(flatten)]
pub config_overrides: CliConfigOverrides,
}
pub async fn run_apply_command(apply_cli: ApplyCommand) -> anyhow::Result<()> {
let config = Config::load_with_cli_overrides(
apply_cli
.config_overrides
.parse_overrides()
.map_err(anyhow::Error::msg)?,
ConfigOverrides::default(),
)?;
init_chatgpt_token_from_auth(&config.codex_home).await?;
let task_response = get_task(&config, apply_cli.task_id).await?;
apply_diff_from_task(task_response).await
}
pub async fn apply_diff_from_task(task_response: GetTaskResponse) -> anyhow::Result<()> {
let diff_turn = match task_response.current_diff_task_turn {
Some(turn) => turn,
None => anyhow::bail!("No diff turn found"),
};
let output_diff = diff_turn.output_items.iter().find_map(|item| match item {
OutputItem::Pr(PrOutputItem { output_diff }) => Some(output_diff),
_ => None,
});
match output_diff {
Some(output_diff) => apply_diff(&output_diff.diff).await,
None => anyhow::bail!("No PR output item found"),
}
}
async fn apply_diff(diff: &str) -> anyhow::Result<()> {
let toplevel_output = tokio::process::Command::new("git")
.args(vec!["rev-parse", "--show-toplevel"])
.output()
.await?;
if !toplevel_output.status.success() {
anyhow::bail!("apply must be run from a git repository.");
}
let repo_root = String::from_utf8(toplevel_output.stdout)?
.trim()
.to_string();
let mut git_apply_cmd = tokio::process::Command::new("git")
.args(vec!["apply", "--3way"])
.current_dir(&repo_root)
.stdin(std::process::Stdio::piped())
.stdout(std::process::Stdio::piped())
.stderr(std::process::Stdio::piped())
.spawn()?;
if let Some(mut stdin) = git_apply_cmd.stdin.take() {
tokio::io::AsyncWriteExt::write_all(&mut stdin, diff.as_bytes()).await?;
drop(stdin);
}
let output = git_apply_cmd.wait_with_output().await?;
if !output.status.success() {
anyhow::bail!(
"Git apply failed with status {}: {}",
output.status,
String::from_utf8_lossy(&output.stderr)
);
}
println!("Successfully applied diff");
Ok(())
}

View File

@@ -0,0 +1,45 @@
use codex_core::config::Config;
use crate::chatgpt_token::get_chatgpt_token_data;
use crate::chatgpt_token::init_chatgpt_token_from_auth;
use anyhow::Context;
use serde::de::DeserializeOwned;
/// Make a GET request to the ChatGPT backend API.
pub(crate) async fn chatgpt_get_request<T: DeserializeOwned>(
config: &Config,
path: String,
) -> anyhow::Result<T> {
let chatgpt_base_url = &config.chatgpt_base_url;
init_chatgpt_token_from_auth(&config.codex_home).await?;
// Make direct HTTP request to ChatGPT backend API with the token
let client = reqwest::Client::new();
let url = format!("{chatgpt_base_url}{path}");
let token =
get_chatgpt_token_data().ok_or_else(|| anyhow::anyhow!("ChatGPT token not available"))?;
let response = client
.get(&url)
.bearer_auth(&token.access_token)
.header("chatgpt-account-id", &token.account_id)
.header("Content-Type", "application/json")
.header("User-Agent", "codex-cli")
.send()
.await
.context("Failed to send request")?;
if response.status().is_success() {
let result: T = response
.json()
.await
.context("Failed to parse JSON response")?;
Ok(result)
} else {
let status = response.status();
let body = response.text().await.unwrap_or_default();
anyhow::bail!("Request failed with status {}: {}", status, body)
}
}

View File

@@ -0,0 +1,24 @@
use std::path::Path;
use std::sync::LazyLock;
use std::sync::RwLock;
use codex_login::TokenData;
static CHATGPT_TOKEN: LazyLock<RwLock<Option<TokenData>>> = LazyLock::new(|| RwLock::new(None));
pub fn get_chatgpt_token_data() -> Option<TokenData> {
CHATGPT_TOKEN.read().ok()?.clone()
}
pub fn set_chatgpt_token_data(value: TokenData) {
if let Ok(mut guard) = CHATGPT_TOKEN.write() {
*guard = Some(value);
}
}
/// Initialize the ChatGPT token from auth.json file
pub async fn init_chatgpt_token_from_auth(codex_home: &Path) -> std::io::Result<()> {
let auth_json = codex_login::try_read_auth_json(codex_home).await?;
set_chatgpt_token_data(auth_json.tokens.clone());
Ok(())
}

View File

@@ -0,0 +1,40 @@
use codex_core::config::Config;
use serde::Deserialize;
use crate::chatgpt_client::chatgpt_get_request;
#[derive(Debug, Deserialize)]
pub struct GetTaskResponse {
pub current_diff_task_turn: Option<AssistantTurn>,
}
// Only relevant fields for our extraction
#[derive(Debug, Deserialize)]
pub struct AssistantTurn {
pub output_items: Vec<OutputItem>,
}
#[derive(Debug, Deserialize)]
#[serde(tag = "type")]
pub enum OutputItem {
#[serde(rename = "pr")]
Pr(PrOutputItem),
#[serde(other)]
Other,
}
#[derive(Debug, Deserialize)]
pub struct PrOutputItem {
pub output_diff: OutputDiff,
}
#[derive(Debug, Deserialize)]
pub struct OutputDiff {
pub diff: String,
}
pub(crate) async fn get_task(config: &Config, task_id: String) -> anyhow::Result<GetTaskResponse> {
let path = format!("/wham/tasks/{task_id}");
chatgpt_get_request(config, path).await
}

View File

@@ -0,0 +1,4 @@
pub mod apply_command;
mod chatgpt_client;
mod chatgpt_token;
pub mod get_task;

View File

@@ -0,0 +1,191 @@
#![expect(clippy::expect_used)]
use codex_chatgpt::apply_command::apply_diff_from_task;
use codex_chatgpt::get_task::GetTaskResponse;
use std::path::Path;
use tempfile::TempDir;
use tokio::process::Command;
/// Creates a temporary git repository with initial commit
async fn create_temp_git_repo() -> anyhow::Result<TempDir> {
let temp_dir = TempDir::new()?;
let repo_path = temp_dir.path();
let output = Command::new("git")
.args(["init"])
.current_dir(repo_path)
.output()
.await?;
if !output.status.success() {
anyhow::bail!(
"Failed to initialize git repo: {}",
String::from_utf8_lossy(&output.stderr)
);
}
Command::new("git")
.args(["config", "user.email", "test@example.com"])
.current_dir(repo_path)
.output()
.await?;
Command::new("git")
.args(["config", "user.name", "Test User"])
.current_dir(repo_path)
.output()
.await?;
std::fs::write(repo_path.join("README.md"), "# Test Repo\n")?;
Command::new("git")
.args(["add", "README.md"])
.current_dir(repo_path)
.output()
.await?;
let output = Command::new("git")
.args(["commit", "-m", "Initial commit"])
.current_dir(repo_path)
.output()
.await?;
if !output.status.success() {
anyhow::bail!(
"Failed to create initial commit: {}",
String::from_utf8_lossy(&output.stderr)
);
}
Ok(temp_dir)
}
async fn mock_get_task_with_fixture() -> anyhow::Result<GetTaskResponse> {
let fixture_path = Path::new(env!("CARGO_MANIFEST_DIR")).join("tests/task_turn_fixture.json");
let fixture_content = std::fs::read_to_string(fixture_path)?;
let response: GetTaskResponse = serde_json::from_str(&fixture_content)?;
Ok(response)
}
#[tokio::test]
async fn test_apply_command_creates_fibonacci_file() {
let temp_repo = create_temp_git_repo()
.await
.expect("Failed to create temp git repo");
let repo_path = temp_repo.path();
let task_response = mock_get_task_with_fixture()
.await
.expect("Failed to load fixture");
let original_dir = std::env::current_dir().expect("Failed to get current dir");
std::env::set_current_dir(repo_path).expect("Failed to change directory");
struct DirGuard(std::path::PathBuf);
impl Drop for DirGuard {
fn drop(&mut self) {
let _ = std::env::set_current_dir(&self.0);
}
}
let _guard = DirGuard(original_dir);
apply_diff_from_task(task_response)
.await
.expect("Failed to apply diff from task");
// Assert that fibonacci.js was created in scripts/ directory
let fibonacci_path = repo_path.join("scripts/fibonacci.js");
assert!(fibonacci_path.exists(), "fibonacci.js was not created");
// Verify the file contents match expected
let contents = std::fs::read_to_string(&fibonacci_path).expect("Failed to read fibonacci.js");
assert!(
contents.contains("function fibonacci(n)"),
"fibonacci.js doesn't contain expected function"
);
assert!(
contents.contains("#!/usr/bin/env node"),
"fibonacci.js doesn't have shebang"
);
assert!(
contents.contains("module.exports = fibonacci;"),
"fibonacci.js doesn't export function"
);
// Verify file has correct number of lines (31 as specified in fixture)
let line_count = contents.lines().count();
assert_eq!(
line_count, 31,
"fibonacci.js should have 31 lines, got {line_count}",
);
}
#[tokio::test]
async fn test_apply_command_with_merge_conflicts() {
let temp_repo = create_temp_git_repo()
.await
.expect("Failed to create temp git repo");
let repo_path = temp_repo.path();
// Create conflicting fibonacci.js file first
let scripts_dir = repo_path.join("scripts");
std::fs::create_dir_all(&scripts_dir).expect("Failed to create scripts directory");
let conflicting_content = r#"#!/usr/bin/env node
// This is a different fibonacci implementation
function fib(num) {
if (num <= 1) return num;
return fib(num - 1) + fib(num - 2);
}
console.log("Running fibonacci...");
console.log(fib(10));
"#;
let fibonacci_path = scripts_dir.join("fibonacci.js");
std::fs::write(&fibonacci_path, conflicting_content).expect("Failed to write conflicting file");
Command::new("git")
.args(["add", "scripts/fibonacci.js"])
.current_dir(repo_path)
.output()
.await
.expect("Failed to add fibonacci.js");
Command::new("git")
.args(["commit", "-m", "Add conflicting fibonacci implementation"])
.current_dir(repo_path)
.output()
.await
.expect("Failed to commit conflicting file");
let original_dir = std::env::current_dir().expect("Failed to get current dir");
std::env::set_current_dir(repo_path).expect("Failed to change directory");
struct DirGuard(std::path::PathBuf);
impl Drop for DirGuard {
fn drop(&mut self) {
let _ = std::env::set_current_dir(&self.0);
}
}
let _guard = DirGuard(original_dir);
let task_response = mock_get_task_with_fixture()
.await
.expect("Failed to load fixture");
let apply_result = apply_diff_from_task(task_response).await;
assert!(
apply_result.is_err(),
"Expected apply to fail due to merge conflicts"
);
let contents = std::fs::read_to_string(&fibonacci_path).expect("Failed to read fibonacci.js");
assert!(
contents.contains("<<<<<<< HEAD")
|| contents.contains("=======")
|| contents.contains(">>>>>>> "),
"fibonacci.js should contain merge conflict markers, got: {contents}",
);
}

View File

@@ -0,0 +1,65 @@
{
"current_diff_task_turn": {
"output_items": [
{
"type": "pr",
"pr_title": "Add fibonacci script",
"pr_message": "## Summary\n- add a basic Fibonacci script under `scripts/`\n\n## Testing\n- `node scripts/fibonacci.js 10`\n- `npm run lint` *(fails: next not found)*",
"output_diff": {
"type": "output_diff",
"repo_id": "/workspace/rddit-vercel",
"base_commit_sha": "1a2e9baf2ce2fdd0c126b47b1bcfd512de2a9f7b",
"diff": "diff --git a/scripts/fibonacci.js b/scripts/fibonacci.js\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..6c9fdfdbf8669b7968936411050525b995d0a9a6\n--- /dev/null\n+++ b/scripts/fibonacci.js\n@@ -0,0 +1,31 @@\n+#!/usr/bin/env node\n+\n+function fibonacci(n) {\n+ if (n < 0) {\n+ throw new Error(\"n must be non-negative\");\n+ }\n+ let a = 0;\n+ let b = 1;\n+ for (let i = 0; i < n; i++) {\n+ const next = a + b;\n+ a = b;\n+ b = next;\n+ }\n+ return a;\n+}\n+\n+function printUsage() {\n+ console.log(\"Usage: node scripts/fibonacci.js <n>\");\n+}\n+\n+if (require.main === module) {\n+ const arg = process.argv[2];\n+ if (arg === undefined || isNaN(Number(arg))) {\n+ printUsage();\n+ process.exit(1);\n+ }\n+ const n = Number(arg);\n+ console.log(fibonacci(n));\n+}\n+\n+module.exports = fibonacci;\n",
"external_storage_diff": {
"file_id": "file_00000000114c61f786900f8c2130ace7",
"ttl": null
},
"files_modified": 1,
"lines_added": 31,
"lines_removed": 0,
"commit_message": "Add fibonacci script"
}
},
{
"type": "message",
"role": "assistant",
"content": [
{
"content_type": "text",
"text": "**Summary**\n\n- Created a command-line Fibonacci script that validates input and prints the result when executed with Node"
},
{
"content_type": "repo_file_citation",
"path": "scripts/fibonacci.js",
"line_range_start": 1,
"line_range_end": 31
},
{
"content_type": "text",
"text": "\n\n**Testing**\n\n- ❌ `npm run lint` (failed to run `next lint`)"
},
{
"content_type": "terminal_chunk_citation",
"terminal_chunk_id": "7dd543",
"line_range_start": 1,
"line_range_end": 5
},
{
"content_type": "text",
"text": "\n- ✅ `node scripts/fibonacci.js 10` produced “55”"
},
{
"content_type": "terminal_chunk_citation",
"terminal_chunk_id": "6ee559",
"line_range_start": 1,
"line_range_end": 3
},
{
"content_type": "text",
"text": "\n\nCodex couldn't run certain commands due to environment limitations. Consider configuring a setup script or internet access in your Codex environment to install dependencies."
}
]
}
]
}
}

View File

@@ -17,13 +17,18 @@ workspace = true
[dependencies]
anyhow = "1"
clap = { version = "4", features = ["derive"] }
clap_complete = "4"
codex-chatgpt = { path = "../chatgpt" }
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
codex-exec = { path = "../exec" }
codex-login = { path = "../login" }
codex-linux-sandbox = { path = "../linux-sandbox" }
codex-mcp-server = { path = "../mcp-server" }
codex-tui = { path = "../tui" }
serde_json = "1"
serde = { version = "1", features = ["derive"] }
chrono = { version = "0.4", default-features = false, features = ["clock"] }
tokio = { version = "1", features = [
"io-std",
"macros",
@@ -33,3 +38,10 @@ tokio = { version = "1", features = [
] }
tracing = "0.1.41"
tracing-subscriber = "0.3.19"
uuid = { version = "1", features = ["v4"] }
[dev-dependencies]
assert_cmd = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tempfile = "3"

View File

@@ -0,0 +1,357 @@
use std::fs::File;
use std::path::PathBuf;
use std::process::{Command, Stdio};
use std::io::Write; // added for write_all / flush
use anyhow::Context;
use codex_common::ApprovalModeCliArg;
use codex_tui::Cli as TuiCli;
/// Attempt to handle a concurrent background run. Returns Ok(true) if a background exec
/// process was spawned (in which case the caller should NOT start the TUI), or Ok(false)
/// to proceed with normal interactive execution.
pub fn maybe_spawn_concurrent(
tui_cli: &mut TuiCli,
root_raw_overrides: &[String],
concurrent: bool,
concurrent_automerge: Option<bool>,
concurrent_branch_name: &Option<String>,
) -> anyhow::Result<bool> {
if !concurrent { return Ok(false); }
// Enforce autonomous execution conditions when running interactive mode.
// Validate git repository presence (required for --concurrent) only if we're in interactive path.
{
let dir_to_check = tui_cli
.cwd
.clone()
.unwrap_or_else(|| std::env::current_dir().unwrap_or_else(|_| PathBuf::from(".")));
let status = Command::new("git")
.arg("-C")
.arg(&dir_to_check)
.arg("rev-parse")
.arg("--git-dir")
.stdout(Stdio::null())
.stderr(Stdio::null())
.status();
if status.as_ref().map(|s| !s.success()).unwrap_or(true) {
eprintln!(
"Error: --concurrent requires a git repository (directory {:?} is not managed by git).",
dir_to_check
);
std::process::exit(2);
}
}
let ap = tui_cli.approval_policy;
let approval_on_failure = matches!(ap, Some(ApprovalModeCliArg::OnFailure));
let autonomous = tui_cli.full_auto
|| tui_cli.dangerously_bypass_approvals_and_sandbox
|| approval_on_failure;
if !autonomous {
eprintln!(
"Error: --concurrent requires autonomous mode. Use one of: --full-auto, --ask-for-approval on-failure, or --dangerously-bypass-approvals-and-sandbox."
);
std::process::exit(2);
}
if tui_cli.prompt.is_none() {
eprintln!(
"Error: --concurrent requires a prompt argument so the agent does not wait for interactive input."
);
std::process::exit(2);
}
// Build exec args from interactive CLI for autonomous run without TUI (background).
let mut exec_args: Vec<String> = Vec::new();
if !tui_cli.images.is_empty() {
exec_args.push("--image".into());
exec_args.push(tui_cli.images.iter().map(|p| p.display().to_string()).collect::<Vec<_>>().join(","));
}
if let Some(model) = &tui_cli.model { exec_args.push("--model".into()); exec_args.push(model.clone()); }
if let Some(profile) = &tui_cli.config_profile { exec_args.push("--profile".into()); exec_args.push(profile.clone()); }
if let Some(sandbox) = &tui_cli.sandbox_mode {
exec_args.push("--sandbox".into());
// Kebab-case the CamelCase `Debug` name (e.g. `WorkspaceWrite` -> "workspace-write");
// plain lowercasing would yield "workspacewrite", which clap's kebab-case ValueEnum rejects.
let mut kebab = String::new();
for (i, c) in format!("{sandbox:?}").chars().enumerate() {
if c.is_ascii_uppercase() && i > 0 { kebab.push('-'); }
kebab.push(c.to_ascii_lowercase());
}
exec_args.push(kebab);
}
if tui_cli.full_auto { exec_args.push("--full-auto".into()); }
if tui_cli.dangerously_bypass_approvals_and_sandbox { exec_args.push("--dangerously-bypass-approvals-and-sandbox".into()); }
if tui_cli.skip_git_repo_check { exec_args.push("--skip-git-repo-check".into()); }
for raw in root_raw_overrides { exec_args.push("-c".into()); exec_args.push(raw.clone()); }
// Derive a single slug (shared by worktree branch & log filename) from the prompt.
let raw_prompt = tui_cli.prompt.as_deref().unwrap_or("");
let snippet = raw_prompt.chars().take(32).collect::<String>();
let mut slug: String = snippet
.chars()
.map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '-' })
.collect();
while slug.contains("--") { slug = slug.replace("--", "-"); }
slug = slug.trim_matches('-').to_string();
if slug.is_empty() { slug = "prompt".into(); }
// Determine concurrent defaults from env (no config file), then apply CLI precedence.
let env_automerge = parse_env_bool("CONCURRENT_AUTOMERGE");
let env_branch_name = std::env::var("CONCURRENT_BRANCH_NAME").ok();
let effective_automerge = concurrent_automerge.or(env_automerge).unwrap_or(true);
let user_branch_name_opt = concurrent_branch_name.clone().or(env_branch_name);
let branch_name_effective = if let Some(bn_raw) = user_branch_name_opt.as_ref() {
let bn_trim = bn_raw.trim();
if bn_trim.is_empty() { format!("codex/{slug}") } else { bn_trim.to_string() }
} else {
format!("codex/{slug}")
};
// Unique task id for this concurrent run (used for log file naming instead of slug).
let task_id = uuid::Uuid::new_v4().to_string();
// Prepare log file path early so we can write pre-spawn logs (e.g. worktree creation output) into it.
let log_dir = match codex_base_dir() {
Ok(base) => {
let d = base.join("log");
let _ = std::fs::create_dir_all(&d);
d
}
Err(_) => PathBuf::from("/tmp"),
};
let log_path = log_dir.join(format!("codex-logs-{}.log", task_id));
// If user did NOT specify an explicit cwd, create an isolated git worktree.
let mut created_worktree: Option<(PathBuf, String)> = None; // (path, branch)
let mut original_branch: Option<String> = None;
let mut original_commit: Option<String> = None;
let mut pre_spawn_logs = String::new();
if tui_cli.cwd.is_none() {
original_branch = git_capture(["rev-parse", "--abbrev-ref", "HEAD"]).ok();
original_commit = git_capture(["rev-parse", "HEAD"]).ok();
match create_concurrent_worktree(&branch_name_effective) {
Ok(Some(info)) => {
exec_args.push("--cd".into());
exec_args.push(info.worktree_path.display().to_string());
created_worktree = Some((info.worktree_path, info.branch_name.clone()));
// Keep the original git output plus a concise created line (for log file only).
pre_spawn_logs.push_str(&info.logs);
pre_spawn_logs.push_str(&format!(
"Created git worktree at {} (branch {}) for concurrent run\n",
created_worktree.as_ref().unwrap().0.display(), info.branch_name
));
}
Ok(None) => {
// Silence console noise: do not warn here to keep stdout clean; we still proceed.
}
Err(e) => {
eprintln!("Error: failed to create git worktree for --concurrent: {e}");
eprintln!("Hint: remove or rename existing branch '{branch_name_effective}', or pass --concurrent-branch-name to choose a unique name.");
std::process::exit(3);
}
}
} else if let Some(explicit) = &tui_cli.cwd {
exec_args.push("--cd".into());
exec_args.push(explicit.display().to_string());
}
// Prompt (safe to unwrap due to earlier validation).
if let Some(prompt) = tui_cli.prompt.clone() { exec_args.push(prompt); }
// Create (or truncate) the log file and write any pre-spawn logs we captured.
let file = match File::create(&log_path) {
Ok(mut f) => {
if !pre_spawn_logs.is_empty() {
let _ = f.write_all(pre_spawn_logs.as_bytes());
let _ = f.flush();
}
f
}
Err(e) => {
eprintln!("Failed to create log file {}: {e}. Falling back to interactive mode.", log_path.display());
return Ok(false);
}
};
// Reuse the log file handle created above; calling `File::create` again here
// would truncate the pre-spawn logs we just wrote.
let file_err = file.try_clone().ok();
let mut cmd = Command::new(
std::env::current_exe().unwrap_or_else(|_| PathBuf::from("codex"))
);
cmd.arg("exec");
for a in &exec_args { cmd.arg(a); }
// Provide metadata for auto merge if we created a worktree.
if let Some((wt_path, branch)) = &created_worktree {
if effective_automerge { cmd.env("CODEX_CONCURRENT_AUTOMERGE", "1"); }
cmd.env("CODEX_CONCURRENT_BRANCH", branch);
cmd.env("CODEX_CONCURRENT_WORKTREE", wt_path);
if let Some(ob) = &original_branch { cmd.env("CODEX_ORIGINAL_BRANCH", ob); }
if let Some(oc) = &original_commit { cmd.env("CODEX_ORIGINAL_COMMIT", oc); }
if let Ok(orig_root) = std::env::current_dir() { cmd.env("CODEX_ORIGINAL_ROOT", orig_root); }
}
// Provide task id so child process can emit token_count updates to tasks.jsonl.
cmd.env("CODEX_TASK_ID", &task_id);
cmd.stdout(Stdio::from(file));
if let Some(f2) = file_err { cmd.stderr(Stdio::from(f2)); }
match cmd.spawn() {
Ok(child) => {
// Human-friendly multi-line output with bold headers.
let branch_val = created_worktree.as_ref().map(|(_, b)| b.as_str()).unwrap_or("(none)");
let worktree_val = created_worktree
.as_ref()
.map(|(p, _)| p.display().to_string())
.unwrap_or_else(|| "(original cwd)".to_string());
// ANSI escape for bold: \x1b[1m ... \x1b[0m
println!("\x1b[1mTask ID:\x1b[0m {}", task_id);
println!("\x1b[1mPID:\x1b[0m {}", child.id());
println!("\x1b[1mBranch:\x1b[0m {}", branch_val);
println!("\x1b[1mWorktree:\x1b[0m {}", worktree_val);
println!("\x1b[1mState:\x1b[0m started");
// Use bold bright magenta (95) for actionable follow-up commands.
println!("\nMonitor all tasks: \x1b[1;95mcodex tasks ls\x1b[0m");
println!("Watch this task: \x1b[1;95mcodex logs {} -f\x1b[0m", task_id);
// Record task metadata to CODEX_HOME/tasks.jsonl (JSON Lines file).
let record_time = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs())
.unwrap_or(0);
if let Ok(base) = codex_base_dir() {
let tasks_path = base.join("tasks.jsonl");
let record = serde_json::json!({
"task_id": task_id,
"pid": child.id(),
"worktree": created_worktree.as_ref().map(|(p, _)| p.display().to_string()),
"branch": created_worktree.as_ref().map(|(_, b)| b.clone()),
"original_branch": original_branch,
"original_commit": original_commit,
"log_path": log_path.display().to_string(),
"prompt": raw_prompt,
"model": tui_cli.model.clone(),
"start_time": record_time,
"automerge": effective_automerge,
"explicit_branch_name": user_branch_name_opt,
"token_count": serde_json::Value::Null,
"state": "started",
});
if let Ok(mut f) = std::fs::OpenOptions::new().create(true).append(true).open(&tasks_path) {
if let Err(e) = writeln!(f, "{record}") {
eprintln!("Warning: failed writing task record to {}: {e}", tasks_path.display());
}
} else {
eprintln!("Warning: could not open tasks log file at {}", tasks_path.display());
}
}
return Ok(true); // background spawned
}
Err(e) => {
eprintln!("Failed to start background exec: {e}. Falling back to interactive mode.");
}
}
Ok(false)
}
/// Return the base Codex directory under the user's home (~/.codex), creating it if necessary.
fn codex_base_dir() -> anyhow::Result<PathBuf> {
if let Ok(val) = std::env::var("CODEX_HOME") {
if !val.is_empty() {
return Ok(PathBuf::from(val).canonicalize()?);
}
}
let home = std::env::var_os("HOME").ok_or_else(|| anyhow::anyhow!("Could not find home directory"))?;
let base = PathBuf::from(home).join(".codex");
std::fs::create_dir_all(&base)?;
Ok(base)
}
/// Attempt to create a git worktree for an isolated concurrent run capturing git output.
struct WorktreeInfo { worktree_path: PathBuf, branch_name: String, logs: String }
fn create_concurrent_worktree(branch_name: &str) -> anyhow::Result<Option<WorktreeInfo>> {
// Determine repository root.
let output = Command::new("git").arg("rev-parse").arg("--show-toplevel").output();
let repo_root = match output {
Ok(out) if out.status.success() => {
let s = String::from_utf8_lossy(&out.stdout).trim().to_string();
if s.is_empty() { return Ok(None); }
PathBuf::from(s)
}
_ => return Ok(None),
};
// Derive repo name from root directory.
let repo_name = repo_root
.file_name()
.and_then(|s| s.to_str())
.unwrap_or("repo");
// Fast-fail if branch already exists.
if Command::new("git")
.current_dir(&repo_root)
.arg("rev-parse")
.arg("--verify")
.arg(branch_name)
.stdout(Stdio::null())
.stderr(Stdio::null())
.status()
.map(|s| s.success())
.unwrap_or(false) {
anyhow::bail!("branch '{branch_name}' already exists");
}
// Construct worktree directory under ~/.codex/worktrees/<repo_name>/.
let base_dir = codex_base_dir()?.join("worktrees").join(repo_name);
std::fs::create_dir_all(&base_dir)?;
let mut worktree_path = base_dir.join(branch_name.replace('/', "-"));
if worktree_path.exists() {
for i in 1..1000 {
let candidate = base_dir.join(format!("{}-{}", branch_name.replace('/', "-"), i));
if !candidate.exists() { worktree_path = candidate; break; }
}
}
// Run git worktree add capturing output (stdout+stderr).
let add_out = Command::new("git")
.current_dir(&repo_root)
.arg("worktree")
.arg("add")
.arg("-b")
.arg(&branch_name)
.arg(&worktree_path)
.arg("HEAD")
.output()?;
if !add_out.status.success() {
anyhow::bail!("git worktree add failed with status {}", add_out.status);
}
let mut logs = String::new();
if !add_out.stdout.is_empty() { logs.push_str(&String::from_utf8_lossy(&add_out.stdout)); }
if !add_out.stderr.is_empty() { logs.push_str(&String::from_utf8_lossy(&add_out.stderr)); }
Ok(Some(WorktreeInfo { worktree_path, branch_name: branch_name.to_string(), logs }))
}
/// Helper: capture trimmed stdout of a git command.
fn git_capture<I, S>(args: I) -> anyhow::Result<String>
where
I: IntoIterator<Item = S>,
S: AsRef<str>,
{
let mut cmd = Command::new("git");
for a in args { cmd.arg(a.as_ref()); }
let out = cmd.output().context("running git command")?;
if !out.status.success() { anyhow::bail!("git command failed"); }
Ok(String::from_utf8_lossy(&out.stdout).trim().to_string())
}
/// Parse common boolean environment variable representations.
fn parse_env_bool(name: &str) -> Option<bool> {
let raw = std::env::var(name).ok()?;
let lower = raw.to_ascii_lowercase();
match lower.as_str() {
"1" | "true" | "yes" | "on" => Some(true),
"0" | "false" | "no" | "off" => Some(false),
_ => None,
}
}
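For reference, the initial record that the spawn path above appends to `~/.codex/tasks.jsonl` is one JSON object per line; later lines with the same `task_id` (emitted by the child process) update fields such as `state` and `token_count`. A sketch with illustrative values, wrapped here for readability (on disk it is a single line):

```json
{"task_id": "3f2a9c1e-5b7d-4e2a-9c1e-0a1b2c3d4e5f", "pid": 48231,
 "worktree": "/Users/alice/.codex/worktrees/myrepo/codex-add-a-cat",
 "branch": "codex/add-a-cat", "original_branch": "main",
 "original_commit": "9fceb02d0ae598e95dc970b74767f19372d61af8",
 "log_path": "/Users/alice/.codex/log/codex-logs-3f2a9c1e-5b7d-4e2a-9c1e-0a1b2c3d4e5f.log",
 "prompt": "Add a cat in ASCII", "model": null, "start_time": 1753230000,
 "automerge": true, "explicit_branch_name": null, "token_count": null, "state": "started"}
```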

View File

@@ -1,14 +1,13 @@
use std::path::PathBuf;
use codex_common::CliConfigOverrides;
use codex_common::SandboxPermissionOption;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::config_types::SandboxMode;
use codex_core::exec::StdioPolicy;
use codex_core::exec::spawn_command_under_linux_sandbox;
use codex_core::exec::spawn_command_under_seatbelt;
use codex_core::exec_env::create_env;
use codex_core::protocol::SandboxPolicy;
use crate::LandlockCommand;
use crate::SeatbeltCommand;
@@ -20,13 +19,11 @@ pub async fn run_command_under_seatbelt(
) -> anyhow::Result<()> {
let SeatbeltCommand {
full_auto,
sandbox,
config_overrides,
command,
} = command;
run_command_under_sandbox(
full_auto,
sandbox,
command,
config_overrides,
codex_linux_sandbox_exe,
@@ -41,13 +38,11 @@ pub async fn run_command_under_landlock(
) -> anyhow::Result<()> {
let LandlockCommand {
full_auto,
sandbox,
config_overrides,
command,
} = command;
run_command_under_sandbox(
full_auto,
sandbox,
command,
config_overrides,
codex_linux_sandbox_exe,
@@ -63,20 +58,19 @@ enum SandboxType {
async fn run_command_under_sandbox(
full_auto: bool,
sandbox: SandboxPermissionOption,
command: Vec<String>,
config_overrides: CliConfigOverrides,
codex_linux_sandbox_exe: Option<PathBuf>,
sandbox_type: SandboxType,
) -> anyhow::Result<()> {
let sandbox_policy = create_sandbox_policy(full_auto, sandbox);
let sandbox_mode = create_sandbox_mode(full_auto);
let cwd = std::env::current_dir()?;
let config = Config::load_with_cli_overrides(
config_overrides
.parse_overrides()
.map_err(anyhow::Error::msg)?,
ConfigOverrides {
sandbox_policy: Some(sandbox_policy),
sandbox_mode: Some(sandbox_mode),
codex_linux_sandbox_exe,
..Default::default()
},
@@ -110,13 +104,10 @@ async fn run_command_under_sandbox(
handle_exit_status(status);
}
pub fn create_sandbox_policy(full_auto: bool, sandbox: SandboxPermissionOption) -> SandboxPolicy {
pub fn create_sandbox_mode(full_auto: bool) -> SandboxMode {
if full_auto {
SandboxPolicy::new_full_auto_policy()
SandboxMode::WorkspaceWrite
} else {
match sandbox.permissions.map(Into::into) {
Some(sandbox_policy) => sandbox_policy,
None => SandboxPolicy::new_read_only_policy(),
}
SandboxMode::ReadOnly
}
}

codex-rs/cli/src/inspect.rs Normal file
View File

@@ -0,0 +1,185 @@
use clap::Parser;
use serde::Deserialize;
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::PathBuf;
use std::fs;
#[derive(Debug, Parser)]
pub struct InspectCli {
/// Task identifier (full/short task id or exact branch name)
pub id: String,
/// Output JSON instead of human table
#[arg(long)]
pub json: bool,
}
#[derive(Debug, Deserialize)]
struct RawRecord {
task_id: Option<String>,
pid: Option<u64>,
worktree: Option<String>,
branch: Option<String>,
original_branch: Option<String>,
original_commit: Option<String>,
log_path: Option<String>,
prompt: Option<String>,
model: Option<String>,
start_time: Option<u64>,
update_time: Option<u64>,
token_count: Option<serde_json::Value>,
state: Option<String>,
completion_time: Option<u64>,
end_time: Option<u64>,
automerge: Option<bool>,
explicit_branch_name: Option<String>,
}
#[derive(Debug, serde::Serialize, Default, Clone)]
struct TaskFull {
task_id: String,
pid: Option<u64>,
branch: Option<String>,
worktree: Option<String>,
original_branch: Option<String>,
original_commit: Option<String>,
log_path: Option<String>,
prompt: Option<String>,
model: Option<String>,
start_time: Option<u64>,
end_time: Option<u64>,
state: Option<String>,
total_tokens: Option<u64>,
input_tokens: Option<u64>,
output_tokens: Option<u64>,
reasoning_output_tokens: Option<u64>,
automerge: Option<bool>,
explicit_branch_name: Option<String>,
last_update_time: Option<u64>,
duration_secs: Option<u64>,
}
pub fn run_inspect(cli: InspectCli) -> anyhow::Result<()> {
let id = cli.id.to_lowercase();
let tasks = load_task_records()?;
let matches: Vec<TaskFull> = tasks
.into_iter()
.filter(|t| t.task_id.starts_with(&id) || t.branch.as_deref().map(|b| b == id).unwrap_or(false))
.collect();
if matches.is_empty() {
eprintln!("No task matches identifier '{}'.", id);
return Ok(());
}
if matches.len() > 1 {
eprintln!("Identifier '{}' is ambiguous; matches: {}", id, matches.iter().map(|m| &m.task_id[..8]).collect::<Vec<_>>().join(", "));
return Ok(());
}
let task = &matches[0];
if cli.json {
println!("{}", serde_json::to_string_pretty(task)?);
return Ok(());
}
print_human(task);
Ok(())
}
fn base_dir() -> Option<PathBuf> {
if let Ok(val) = std::env::var("CODEX_HOME") { if !val.is_empty() { return std::fs::canonicalize(val).ok(); } }
let home = std::env::var_os("HOME")?;
Some(PathBuf::from(home).join(".codex"))
}
fn load_task_records() -> anyhow::Result<Vec<TaskFull>> {
let mut map: std::collections::HashMap<String, TaskFull> = std::collections::HashMap::new();
let Some(base) = base_dir() else { return Ok(vec![]); };
let tasks = base.join("tasks.jsonl");
if !tasks.exists() { return Ok(vec![]); }
let f = File::open(tasks)?;
let reader = BufReader::new(f);
for line in reader.lines() {
let Ok(line) = line else { continue };
if line.trim().is_empty() { continue; }
let Ok(val) = serde_json::from_str::<serde_json::Value>(&line) else { continue };
let Ok(rec) = serde_json::from_value::<RawRecord>(val) else { continue };
let Some(task_id) = rec.task_id.clone() else { continue };
let entry = map.entry(task_id.clone()).or_insert_with(|| TaskFull { task_id: task_id.clone(), ..Default::default() });
// Initial metadata fields
if rec.start_time.is_some() {
entry.pid = rec.pid.or(entry.pid);
entry.branch = rec.branch.or(entry.branch.clone());
entry.worktree = rec.worktree.or(entry.worktree.clone());
entry.original_branch = rec.original_branch.or(entry.original_branch.clone());
entry.original_commit = rec.original_commit.or(entry.original_commit.clone());
entry.log_path = rec.log_path.or(entry.log_path.clone());
entry.prompt = rec.prompt.or(entry.prompt.clone());
entry.model = rec.model.or(entry.model.clone());
entry.start_time = rec.start_time.or(entry.start_time);
entry.automerge = rec.automerge.or(entry.automerge);
entry.explicit_branch_name = rec.explicit_branch_name.or(entry.explicit_branch_name.clone());
}
if let Some(state) = rec.state { entry.state = Some(state); }
if rec.update_time.is_some() { entry.last_update_time = rec.update_time; }
if rec.end_time.is_some() || rec.completion_time.is_some() {
entry.end_time = rec.end_time.or(rec.completion_time).or(entry.end_time);
}
if let Some(tc) = rec.token_count.as_ref() {
if let Some(total) = tc.get("total_tokens").and_then(|v| v.as_u64()) { entry.total_tokens = Some(total); }
if let Some(inp) = tc.get("input_tokens").and_then(|v| v.as_u64()) { entry.input_tokens = Some(inp); }
if let Some(out) = tc.get("output_tokens").and_then(|v| v.as_u64()) { entry.output_tokens = Some(out); }
if let Some(rout) = tc.get("reasoning_output_tokens").and_then(|v| v.as_u64()) { entry.reasoning_output_tokens = Some(rout); }
}
}
// Compute duration
for t in map.values_mut() {
if let (Some(s), Some(e)) = (t.start_time, t.end_time) { t.duration_secs = Some(e.saturating_sub(s)); }
}
Ok(map.into_values().collect())
}
fn print_human(task: &TaskFull) {
println!("Task {}", task.task_id);
println!("State: {}", task.state.as_deref().unwrap_or("?"));
if let Some(model) = &task.model { println!("Model: {}", model); } else { println!("Model: {}", resolve_default_model()); }
if let Some(branch) = &task.branch { println!("Branch: {}", branch); }
if let Some(wt) = &task.worktree { println!("Worktree: {}", wt); }
if let Some(ob) = &task.original_branch { println!("Original branch: {}", ob); }
if let Some(oc) = &task.original_commit { println!("Original commit: {}", oc); }
if let Some(start) = task.start_time { println!("Start: {}", format_epoch(start)); }
if let Some(end) = task.end_time { println!("End: {}", format_epoch(end)); }
if let Some(d) = task.duration_secs { println!("Duration: {}s", d); }
if let Some(pid) = task.pid { println!("PID: {}", pid); }
if let Some(log) = &task.log_path { println!("Log: {}", log); }
if let Some(am) = task.automerge { println!("Automerge: {}", am); }
if let Some(exp) = &task.explicit_branch_name { println!("Explicit branch name: {}", exp); }
if let Some(total) = task.total_tokens { println!("Total tokens: {}", total); }
if task.input_tokens.is_some() || task.output_tokens.is_some() {
println!(" Input: {:?} Output: {:?} Reasoning: {:?}", task.input_tokens, task.output_tokens, task.reasoning_output_tokens);
}
if let Some(p) = &task.prompt { println!("Prompt:\n{}", p); }
}
fn format_epoch(secs: u64) -> String {
use chrono::{TimeZone, Utc};
if let Some(dt) = Utc.timestamp_opt(secs as i64, 0).single() { dt.to_rfc3339() } else { secs.to_string() }
}
fn resolve_default_model() -> String {
if let Some(base) = base_dir() {
let candidates = ["config.json", "config.yaml", "config.yml"];
for name in candidates {
let p = base.join(name);
if p.exists() {
if let Ok(raw) = fs::read_to_string(&p) {
if name.ends_with(".json") {
if let Ok(v) = serde_json::from_str::<serde_json::Value>(&raw) {
if let Some(m) = v.get("model").and_then(|x| x.as_str()) { if !m.trim().is_empty() { return m.to_string(); } }
}
} else {
for line in raw.lines() { if let Some(rest) = line.trim().strip_prefix("model:") { let val = rest.trim().trim_matches('"'); if !val.is_empty() { return val.to_string(); } } }
}
}
}
}
}
"codex-mini-latest".to_string()
}
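A usage sketch for the new subcommand (the id below is an illustrative prefix; an exact branch name also matches):

```shell
codex inspect 3f2a9c1e                 # human-readable summary for the matching task
codex inspect codex/add-a-cat --json   # match by exact branch name, emit JSON
```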

View File

@@ -1,10 +1,14 @@
pub mod concurrent;
pub mod debug_sandbox;
mod exit_status;
pub mod login;
pub mod proto;
pub mod tasks;
pub mod logs;
pub mod inspect;
use clap::Parser;
use codex_common::CliConfigOverrides;
use codex_common::SandboxPermissionOption;
#[derive(Debug, Parser)]
pub struct SeatbeltCommand {
@@ -12,9 +16,6 @@ pub struct SeatbeltCommand {
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
#[clap(skip)]
pub config_overrides: CliConfigOverrides,
@@ -29,9 +30,6 @@ pub struct LandlockCommand {
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
#[clap(skip)]
pub config_overrides: CliConfigOverrides,

codex-rs/cli/src/login.rs Normal file
View File

@@ -0,0 +1,35 @@
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_login::login_with_chatgpt;
pub async fn run_login_with_chatgpt(cli_config_overrides: CliConfigOverrides) -> ! {
let cli_overrides = match cli_config_overrides.parse_overrides() {
Ok(v) => v,
Err(e) => {
eprintln!("Error parsing -c overrides: {e}");
std::process::exit(1);
}
};
let config_overrides = ConfigOverrides::default();
let config = match Config::load_with_cli_overrides(cli_overrides, config_overrides) {
Ok(config) => config,
Err(e) => {
eprintln!("Error loading configuration: {e}");
std::process::exit(1);
}
};
let capture_output = false;
match login_with_chatgpt(&config.codex_home, capture_output).await {
Ok(_) => {
eprintln!("Successfully logged in");
std::process::exit(0);
}
Err(e) => {
eprintln!("Error logging in: {e}");
std::process::exit(1);
}
}
}

codex-rs/cli/src/logs.rs Normal file
View File

@@ -0,0 +1,145 @@
use clap::Parser;
use serde::Deserialize;
use std::collections::HashMap;
use std::fs::File;
use std::io::{BufRead, BufReader, Read, Seek, SeekFrom};
use std::path::PathBuf;
use std::thread;
use std::time::Duration;
#[derive(Debug, Parser)]
pub struct LogsCli {
/// Task identifier: full/short task UUID or branch name
pub id: String,
/// Follow log output (stream new lines)
#[arg(short = 'f', long = "follow")]
pub follow: bool,
/// Show only the last N lines (like tail -n). If omitted, show full file.
#[arg(short = 'n', long = "lines")]
pub lines: Option<usize>,
}
#[derive(Debug, Deserialize)]
struct RawRecord {
task_id: Option<String>,
branch: Option<String>,
log_path: Option<String>,
start_time: Option<u64>,
}
#[derive(Debug, Clone)]
struct TaskMeta {
task_id: String,
branch: Option<String>,
log_path: String,
start_time: Option<u64>,
}
pub fn run_logs(cli: LogsCli) -> anyhow::Result<()> {
let id = cli.id.to_lowercase();
let tasks = load_tasks_index()?;
if tasks.is_empty() {
eprintln!("No tasks found in tasks.jsonl");
return Ok(());
}
let matches: Vec<&TaskMeta> = tasks
.values()
.filter(|meta| {
meta.task_id.starts_with(&id) || meta.branch.as_deref().map(|b| b == id).unwrap_or(false)
})
.collect();
if matches.is_empty() {
eprintln!("No task matches identifier '{}'.", id);
return Ok(());
}
if matches.len() > 1 {
eprintln!("Identifier '{}' is ambiguous; matches: {}", id, matches.iter().map(|m| &m.task_id[..8]).collect::<Vec<_>>().join(", "));
return Ok(());
}
let task = matches[0];
let path = PathBuf::from(&task.log_path);
if !path.exists() {
eprintln!("Log file not found at {}", path.display());
return Ok(());
}
if cli.follow {
tail_file(&path, cli.lines)?;
} else {
print_file(&path, cli.lines)?;
}
Ok(())
}
fn base_dir() -> Option<PathBuf> {
if let Ok(val) = std::env::var("CODEX_HOME") { if !val.is_empty() { return std::fs::canonicalize(val).ok(); } }
let home = std::env::var_os("HOME")?;
Some(PathBuf::from(home).join(".codex"))
}
fn load_tasks_index() -> anyhow::Result<HashMap<String, TaskMeta>> {
let mut map: HashMap<String, TaskMeta> = HashMap::new();
let Some(base) = base_dir() else { return Ok(map); };
let tasks = base.join("tasks.jsonl");
if !tasks.exists() { return Ok(map); }
let f = File::open(tasks)?;
let reader = BufReader::new(f);
for line in reader.lines() {
let Ok(line) = line else { continue };
if line.trim().is_empty() { continue; }
let Ok(val) = serde_json::from_str::<serde_json::Value>(&line) else { continue };
let Ok(rec) = serde_json::from_value::<RawRecord>(val) else { continue };
let (Some(task_id), Some(log_path)) = (rec.task_id.clone(), rec.log_path.clone()) else { continue };
// Insert or update only if not already present (we just need initial metadata)
map.entry(task_id.clone()).or_insert(TaskMeta {
task_id,
branch: rec.branch,
log_path,
start_time: rec.start_time,
});
}
Ok(map)
}
fn print_file(path: &PathBuf, last_lines: Option<usize>) -> anyhow::Result<()> {
if let Some(n) = last_lines {
let f = File::open(path)?;
let reader = BufReader::new(f);
let mut buf: std::collections::VecDeque<String> = std::collections::VecDeque::with_capacity(n);
for line in reader.lines() {
if let Ok(l) = line { if buf.len() == n { buf.pop_front(); } buf.push_back(l); }
}
for l in buf { println!("{}", l); }
return Ok(());
}
// Full file
let mut f = File::open(path)?;
let mut contents = String::new();
f.read_to_string(&mut contents)?;
print!("{}", contents);
Ok(())
}
fn tail_file(path: &PathBuf, last_lines: Option<usize>) -> anyhow::Result<()> {
use std::io::{self};
// Initial output
if let Some(n) = last_lines { print_file(path, Some(n))?; } else { print_file(path, None)?; }
let mut f = File::open(path)?;
let mut pos = f.metadata()?.len();
loop {
thread::sleep(Duration::from_millis(500));
let meta = match f.metadata() { Ok(m) => m, Err(_) => break };
let len = meta.len();
if len < pos { // truncated
pos = 0;
}
if len > pos {
f.seek(SeekFrom::Start(pos))?;
let mut buf = String::new();
f.read_to_string(&mut buf)?;
if !buf.is_empty() { print!("{}", buf); io::Write::flush(&mut std::io::stdout())?; }
pos = len;
}
}
Ok(())
}
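Usage sketch (illustrative task id):

```shell
codex logs 3f2a9c1e -n 50   # print the last 50 lines of the task's log
codex logs 3f2a9c1e -f      # follow the log, polling for new output every 500ms
```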

View File

@@ -1,6 +1,13 @@
use clap::CommandFactory;
use clap::Parser;
use clap_complete::Shell;
use clap_complete::generate;
use codex_chatgpt::apply_command::ApplyCommand;
use codex_chatgpt::apply_command::run_apply_command;
use codex_cli::concurrent::maybe_spawn_concurrent;
use codex_cli::LandlockCommand;
use codex_cli::SeatbeltCommand;
use codex_cli::login::run_login_with_chatgpt;
use codex_cli::proto;
use codex_common::CliConfigOverrides;
use codex_exec::Cli as ExecCli;
@@ -26,6 +33,25 @@ struct MultitoolCli {
#[clap(flatten)]
interactive: TuiCli,
/// Autonomous mode: run the command in the background & concurrently using a git worktree.
/// Requires the current directory (or --cd provided path) to be a git repository.
#[clap(long)]
concurrent: bool,
/// Control whether the concurrent run auto-merges the worktree branch back into the original branch.
/// Defaults to true (may also be set via CONCURRENT_AUTOMERGE env var).
#[clap(long = "concurrent-automerge", value_name = "BOOL")]
concurrent_automerge: Option<bool>,
/// Explicit branch name to use for the concurrent worktree instead of the default `codex/<slug>`.
/// May also be set via CONCURRENT_BRANCH_NAME env var.
#[clap(long = "concurrent-branch-name", value_name = "BRANCH")]
concurrent_branch_name: Option<String>,
/// Best-of-n: run n concurrent worktrees (1-4) and let user pick the best result. Implies --concurrent and disables automerge.
#[clap(long = "best-of-n", short = 'n', value_name = "N", default_value_t = 1)]
pub best_of_n: u8,
#[clap(subcommand)]
subcommand: Option<Subcommand>,
}
@@ -36,6 +62,9 @@ enum Subcommand {
#[clap(visible_alias = "e")]
Exec(ExecCli),
/// Login with ChatGPT.
Login(LoginCommand),
/// Experimental: run Codex as an MCP server.
Mcp,
@@ -43,8 +72,31 @@ enum Subcommand {
#[clap(visible_alias = "p")]
Proto(ProtoCli),
/// Generate shell completion scripts.
Completion(CompletionCommand),
/// Internal debugging commands.
Debug(DebugArgs),
/// Apply the latest diff produced by Codex agent as a `git apply` to your local working tree.
#[clap(visible_alias = "a")]
Apply(ApplyCommand),
/// Manage / inspect concurrent background tasks.
Tasks(codex_cli::tasks::TasksCli),
/// Show or follow logs for a specific task.
Logs(codex_cli::logs::LogsCli),
/// Inspect full metadata for a task.
Inspect(codex_cli::inspect::InspectCli),
}
#[derive(Debug, Parser)]
struct CompletionCommand {
/// Shell to generate completions for
#[clap(value_enum, default_value_t = Shell::Bash)]
shell: Shell,
}
#[derive(Debug, Parser)]
@@ -63,7 +115,10 @@ enum DebugCommand {
}
#[derive(Debug, Parser)]
struct ReplProto {}
struct LoginCommand {
#[clap(skip)]
config_overrides: CliConfigOverrides,
}
fn main() -> anyhow::Result<()> {
codex_linux_sandbox::run_with_sandbox(|codex_linux_sandbox_exe| async move {
@@ -78,8 +133,64 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
match cli.subcommand {
None => {
let mut tui_cli = cli.interactive;
let root_raw_overrides = cli.config_overrides.raw_overrides.clone();
prepend_config_flags(&mut tui_cli.config_overrides, cli.config_overrides);
codex_tui::run_main(tui_cli, codex_linux_sandbox_exe)?;
// Best-of-n logic
if cli.best_of_n > 1 {
let n = cli.best_of_n.clamp(1, 4);
let mut spawned_any = false;
let base_branch = if let Some(ref name) = cli.concurrent_branch_name {
name.trim().to_string()
} else {
// Derive slug from prompt (copied from maybe_spawn_concurrent)
let raw_prompt = tui_cli.prompt.as_deref().unwrap_or("");
let snippet = raw_prompt.chars().take(32).collect::<String>();
let mut slug: String = snippet
.chars()
.map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '-' })
.collect();
while slug.contains("--") { slug = slug.replace("--", "-"); }
slug = slug.trim_matches('-').to_string();
if slug.is_empty() { slug = "prompt".into(); }
format!("codex/{}", slug)
};
for i in 1..=n {
let mut tui_cli_n = tui_cli.clone();
// Suffix branch name with -01, -02, etc.
let branch_name = format!("{}-{:02}", base_branch, i);
let branch_name_opt = Some(branch_name);
// Always automerge = false for best-of-n
match maybe_spawn_concurrent(
&mut tui_cli_n,
&root_raw_overrides,
true, // force concurrent
Some(false),
&branch_name_opt,
) {
Ok(true) => { spawned_any = true; },
Ok(false) => {},
Err(e) => { eprintln!("Error spawning best-of-n run {}: {e}", i); },
}
}
if !spawned_any {
codex_tui::run_main(tui_cli, codex_linux_sandbox_exe)?;
}
// If any spawned, do not run TUI (user will see task IDs)
} else {
// Attempt concurrent background spawn; if it returns true we skip launching the TUI.
if let Ok(spawned) = maybe_spawn_concurrent(
&mut tui_cli,
&root_raw_overrides,
cli.concurrent,
cli.concurrent_automerge,
&cli.concurrent_branch_name,
) {
if !spawned { codex_tui::run_main(tui_cli, codex_linux_sandbox_exe)?; }
} else {
// On error fallback to interactive.
codex_tui::run_main(tui_cli, codex_linux_sandbox_exe)?;
}
}
}
Some(Subcommand::Exec(mut exec_cli)) => {
prepend_config_flags(&mut exec_cli.config_overrides, cli.config_overrides);
@@ -88,10 +199,17 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
Some(Subcommand::Mcp) => {
codex_mcp_server::run_main(codex_linux_sandbox_exe).await?;
}
Some(Subcommand::Login(mut login_cli)) => {
prepend_config_flags(&mut login_cli.config_overrides, cli.config_overrides);
run_login_with_chatgpt(login_cli.config_overrides).await;
}
Some(Subcommand::Proto(mut proto_cli)) => {
prepend_config_flags(&mut proto_cli.config_overrides, cli.config_overrides);
proto::run_main(proto_cli).await?;
}
Some(Subcommand::Completion(completion_cli)) => {
print_completion(completion_cli);
}
Some(Subcommand::Debug(debug_args)) => match debug_args.cmd {
DebugCommand::Seatbelt(mut seatbelt_cli) => {
prepend_config_flags(&mut seatbelt_cli.config_overrides, cli.config_overrides);
@@ -110,6 +228,19 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
.await?;
}
},
Some(Subcommand::Apply(mut apply_cli)) => {
prepend_config_flags(&mut apply_cli.config_overrides, cli.config_overrides);
run_apply_command(apply_cli).await?;
}
Some(Subcommand::Tasks(tasks_cli)) => {
codex_cli::tasks::run_tasks(tasks_cli)?;
}
Some(Subcommand::Logs(logs_cli)) => {
codex_cli::logs::run_logs(logs_cli)?;
}
Some(Subcommand::Inspect(inspect_cli)) => {
codex_cli::inspect::run_inspect(inspect_cli)?;
}
}
Ok(())
@@ -125,3 +256,9 @@ fn prepend_config_flags(
.raw_overrides
.splice(0..0, cli_config_overrides.raw_overrides);
}
fn print_completion(cmd: CompletionCommand) {
let mut app = MultitoolCli::command();
let name = "codex";
generate(cmd.shell, &mut app, name, &mut std::io::stdout());
}
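Taken together, the new flags are exercised roughly like this (prompts and branch names are illustrative):

```shell
# One background task in a fresh worktree; automerge defaults to true:
codex --concurrent --full-auto "refactor the tokenizer"

# Pick the branch name and disable automerge explicitly:
codex --concurrent --full-auto --concurrent-automerge false \
  --concurrent-branch-name codex/tokenizer-refactor "refactor the tokenizer"

# Best-of-n: three worktrees with -01/-02/-03 branch suffixes; automerge is always off:
codex -n 3 --full-auto "refactor the tokenizer"
```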

View File

@@ -35,7 +35,7 @@ pub async fn run_main(opts: ProtoCli) -> anyhow::Result<()> {
let config = Config::load_with_cli_overrides(overrides_vec, ConfigOverrides::default())?;
let ctrl_c = notify_on_sigint();
let (codex, _init_id) = Codex::spawn(config, ctrl_c.clone()).await?;
let (codex, _init_id, _session_id) = Codex::spawn(config, ctrl_c.clone()).await?;
let codex = Arc::new(codex);
// Task that reads JSON lines from stdin and forwards to Submission Queue

codex-rs/cli/src/tasks.rs Normal file
View File

@@ -0,0 +1,212 @@
use clap::{Parser, Subcommand};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::fs;
use chrono::Local;
use codex_common::elapsed::format_duration;
#[derive(Debug, Parser)]
pub struct TasksCli {
#[command(subcommand)]
pub cmd: TasksCommand,
}
#[derive(Debug, Subcommand)]
pub enum TasksCommand {
/// List background concurrent tasks (from ~/.codex/tasks.jsonl)
Ls(TasksListArgs),
}
#[derive(Debug, Parser)]
pub struct TasksListArgs {
/// Output raw JSON instead of table
#[arg(long)]
pub json: bool,
/// Limit number of tasks displayed (most recent first)
#[arg(long)]
pub limit: Option<usize>,
/// Show completed tasks as well (by default only running tasks)
#[arg(short = 'a', long = "all")]
pub all: bool,
/// Show all columns including prompt text
#[arg(long = "all-columns")]
pub all_columns: bool,
}
#[derive(Debug, Deserialize)]
struct RawRecord {
task_id: Option<String>,
pid: Option<u64>,
worktree: Option<String>,
branch: Option<String>,
original_branch: Option<String>,
original_commit: Option<String>,
log_path: Option<String>,
prompt: Option<String>,
model: Option<String>,
start_time: Option<u64>,
update_time: Option<u64>,
token_count: Option<serde_json::Value>,
state: Option<String>,
completion_time: Option<u64>,
end_time: Option<u64>,
}
#[derive(Debug, Serialize, Default, Clone)]
struct TaskAggregate {
task_id: String,
pid: Option<u64>,
branch: Option<String>,
worktree: Option<String>,
prompt: Option<String>,
model: Option<String>,
start_time: Option<u64>,
last_update_time: Option<u64>,
total_tokens: Option<u64>,
state: Option<String>,
end_time: Option<u64>,
}
pub fn run_tasks(cmd: TasksCli) -> anyhow::Result<()> {
match cmd.cmd {
TasksCommand::Ls(args) => list_tasks(args),
}
}
fn base_dir() -> Option<std::path::PathBuf> {
if let Ok(val) = std::env::var("CODEX_HOME") { if !val.is_empty() { return std::fs::canonicalize(val).ok(); } }
let home = std::env::var_os("HOME")?;
let base = std::path::PathBuf::from(home).join(".codex");
Some(base)
}
fn list_tasks(args: TasksListArgs) -> anyhow::Result<()> {
let Some(base) = base_dir() else {
println!("No home directory found; cannot locate tasks.jsonl");
return Ok(());
};
let path = base.join("tasks.jsonl");
if !path.exists() {
println!("No tasks.jsonl found (no concurrent tasks recorded yet)");
return Ok(());
}
let f = File::open(&path)?;
let reader = BufReader::new(f);
let mut agg: HashMap<String, TaskAggregate> = HashMap::new();
for line_res in reader.lines() {
let line = match line_res { Ok(l) => l, Err(_) => continue };
if line.trim().is_empty() { continue; }
let raw: serde_json::Value = match serde_json::from_str(&line) { Ok(v) => v, Err(_) => continue };
let rec: RawRecord = match serde_json::from_value(raw) { Ok(r) => r, Err(_) => continue };
let Some(task_id) = rec.task_id.clone() else { continue }; // must have task_id
let entry = agg.entry(task_id.clone()).or_insert_with(|| TaskAggregate { task_id: task_id.clone(), ..Default::default() });
if rec.start_time.is_some() { // initial metadata line
entry.pid = rec.pid.or(entry.pid);
entry.branch = rec.branch.or(entry.branch.clone());
entry.worktree = rec.worktree.or(entry.worktree.clone());
entry.prompt = rec.prompt.or(entry.prompt.clone());
entry.model = rec.model.or(entry.model.clone());
entry.start_time = rec.start_time.or(entry.start_time);
}
if let Some(tc_val) = rec.token_count.as_ref() { if tc_val.is_object() { if let Some(total) = tc_val.get("total_tokens").and_then(|v| v.as_u64()) { entry.total_tokens = Some(total); } } }
if rec.update_time.is_some() { entry.last_update_time = rec.update_time; }
if let Some(state) = rec.state { entry.state = Some(state); }
if rec.completion_time.is_some() || rec.end_time.is_some() {
entry.end_time = rec.end_time.or(rec.completion_time).or(entry.end_time);
}
}
// Collect and sort by start_time desc
let mut tasks: Vec<TaskAggregate> = agg.into_values().collect();
tasks.sort_by_key(|j| std::cmp::Reverse(j.start_time.unwrap_or(0)));
if !args.all { tasks.retain(|j| j.state.as_deref() != Some("done")); }
if let Some(limit) = args.limit { tasks.truncate(limit); }
if args.json {
println!("{}", serde_json::to_string_pretty(&tasks)?);
return Ok(());
}
if tasks.is_empty() {
println!("No tasks found");
return Ok(());
}
// Table header
if args.all_columns {
println!("\x1b[1m{:<8} {:>6} {:<22} {:<12} {:<8} {:>8} {:<12} {}\x1b[0m", "TASK_ID", "PID", "BRANCH", "START", "STATE", "TOKENS", "MODEL", "PROMPT");
} else {
// Widened branch column to 22 chars for better readability.
println!("\x1b[1m{:<8} {:>6} {:<22} {:<12} {:<8} {:>8} {:<12}\x1b[0m", "TASK_ID", "PID", "BRANCH", "START", "STATE", "TOKENS", "MODEL");
}
for t in tasks {
let task_short = if t.task_id.len() > 8 { &t.task_id[..8] } else { &t.task_id };
let pid_str = t.pid.map(|p| p.to_string()).unwrap_or_default();
let mut branch = t.branch.clone().unwrap_or_default();
let branch_limit = 22; // unified width for both layouts
if branch.len() > branch_limit { branch.truncate(branch_limit); }
let start = t.start_time.map(|start_secs| {
let now = Local::now().timestamp() as u64;
if now > start_secs {
let elapsed = std::time::Duration::from_secs(now - start_secs);
format!("{} ago", format_duration(elapsed))
} else {
"just now".to_string()
}
}).unwrap_or_default();
let tokens = t.total_tokens.map(|t| t.to_string()).unwrap_or_default();
let state = t.state.clone().unwrap_or_else(|| "?".into());
let mut model = t.model.clone().unwrap_or_default();
if model.trim().is_empty() { model = resolve_default_model(); }
if model.is_empty() { model.push('-'); }
if model.len() > 12 { model.truncate(12); }
if args.all_columns {
let mut prompt = t.prompt.clone().unwrap_or_default().replace('\n', " ");
if prompt.len() > 60 { prompt.truncate(60); }
println!("{:<8} {:>6} {:<22} {:<12} {:<8} {:>8} {:<12} {}", task_short, pid_str, branch, start, state, tokens, model, prompt);
} else {
println!("{:<8} {:>6} {:<22} {:<12} {:<8} {:>8} {:<12}", task_short, pid_str, branch, start, state, tokens, model);
}
}
Ok(())
}
fn resolve_default_model() -> String {
// Attempt to read config json/yaml for model, otherwise fallback to hardcoded default.
if let Some(base) = base_dir() {
let candidates = ["config.json", "config.yaml", "config.yml"];
for name in candidates {
let p = base.join(name);
if p.exists() {
if let Ok(raw) = fs::read_to_string(&p) {
// Try JSON first.
if name.ends_with(".json") {
if let Ok(v) = serde_json::from_str::<serde_json::Value>(&raw) {
if let Some(m) = v.get("model").and_then(|x| x.as_str()) {
if !m.trim().is_empty() { return m.to_string(); }
}
}
} else {
// Very lightweight YAML parse: look for line starting with model:
for line in raw.lines() {
if let Some(rest) = line.trim().strip_prefix("model:") {
let val = rest.trim().trim_matches('"');
if !val.is_empty() {
return val.to_string();
}
}
}
}
}
}
}
}
// Fallback default agentic model used elsewhere.
"codex-mini-latest".to_string()
}
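Usage sketch:

```shell
codex tasks ls                  # running tasks only
codex tasks ls -a --limit 10    # include completed tasks, most recent 10
codex tasks ls --all-columns    # add the prompt column
codex tasks ls --json           # raw JSON for scripting
```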

View File

@@ -0,0 +1,101 @@
// Minimal integration test for --concurrent background spawning.
// Verifies that invoking the top-level CLI with --concurrent records a task entry
// in CODEX_HOME/tasks.jsonl and that multiple invocations append distinct task_ids.
use std::fs;
use std::io::Write;
use std::process::Command;
use std::time::{Duration, Instant};
use tempfile::TempDir;
// Skip helper when sandbox network disabled (mirrors existing tests' behavior).
fn network_disabled() -> bool {
std::env::var(codex_core::exec::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok()
}
#[test]
fn concurrent_creates_task_records() {
if network_disabled() {
eprintln!("Skipping concurrent_creates_task_records due to sandbox network-disabled env");
return;
}
// Temp home (CODEX_HOME) and separate temp git repo.
let home = TempDir::new().expect("temp home");
let repo = TempDir::new().expect("temp repo");
// Initialize a minimal git repository (needed for --concurrent worktree logic).
assert!(Command::new("git").arg("init").current_dir(repo.path()).status().unwrap().success());
fs::write(repo.path().join("README.md"), "# temp\n").unwrap();
assert!(Command::new("git").arg("add").arg(".").current_dir(repo.path()).status().unwrap().success());
assert!(Command::new("git")
.args(["commit", "-m", "init"]) // may warn about user/email; allow non-zero if commit already exists
.current_dir(repo.path())
.status()
.map(|s| s.success())
.unwrap_or(true));
// SSE fixture so the spawned background exec does not perform a real network call.
let fixture = home.path().join("fixture.sse");
let mut f = fs::File::create(&fixture).unwrap();
writeln!(f, "data: {{\"choices\":[{{\"delta\":{{\"content\":\"ok\"}}}}]}}\n").unwrap();
writeln!(f, "data: {{\"choices\":[{{\"delta\":{{}}}}]}}\n").unwrap();
writeln!(f, "data: [DONE]\n").unwrap();
// Helper to run one concurrent invocation with a given prompt.
let run_once = |prompt: &str| {
let mut cmd = Command::new("cargo");
cmd.arg("run")
.arg("-p")
.arg("codex-cli")
.arg("--quiet")
.arg("--")
.arg("--concurrent")
.arg("--full-auto")
.arg("-C")
.arg(repo.path())
.arg(prompt);
cmd.env("CODEX_HOME", home.path())
.env("OPENAI_API_KEY", "dummy")
.env("CODEX_RS_SSE_FIXTURE", &fixture)
.env("OPENAI_BASE_URL", "http://unused.local");
let output = cmd.output().expect("spawn codex");
assert!(output.status.success(), "concurrent codex run failed: stderr={}", String::from_utf8_lossy(&output.stderr));
};
run_once("Add a cat in ASCII");
run_once("Add hello world comment");
// Wait for tasks.jsonl to contain at least two lines with task records.
let tasks_path = home.path().join("tasks.jsonl");
let deadline = Instant::now() + Duration::from_secs(10);
let mut lines: Vec<String> = Vec::new();
while Instant::now() < deadline {
if tasks_path.exists() {
let content = fs::read_to_string(&tasks_path).unwrap_or_default();
lines = content.lines().filter(|l| !l.trim().is_empty()).map(|s| s.to_string()).collect();
if lines.len() >= 2 { break; }
}
std::thread::sleep(Duration::from_millis(100));
}
assert!(lines.len() >= 2, "Expected at least 2 task records, got {}", lines.len());
// Parse JSON and ensure distinct task_ids and prompts present.
let mut task_ids = std::collections::HashSet::new();
let mut saw_cat = false;
let mut saw_hello = false;
for line in &lines {
if let Ok(val) = serde_json::from_str::<serde_json::Value>(line) {
if let Some(tid) = val.get("task_id").and_then(|v| v.as_str()) { task_ids.insert(tid.to_string()); }
if let Some(p) = val.get("prompt").and_then(|v| v.as_str()) {
if p.contains("cat") { saw_cat = true; }
if p.contains("hello") { saw_hello = true; }
}
assert_eq!(val.get("state").and_then(|v| v.as_str()), Some("started"), "task record missing started state");
}
}
assert!(task_ids.len() >= 2, "Expected distinct task_ids, got {:?}", task_ids);
assert!(saw_cat, "Did not find cat prompt in tasks.jsonl");
assert!(saw_hello, "Did not find hello prompt in tasks.jsonl");
}

View File

@@ -9,10 +9,11 @@ workspace = true
[dependencies]
clap = { version = "4", features = ["derive", "wrap_help"], optional = true }
codex-core = { path = "../core" }
toml = { version = "0.8", optional = true }
toml = { version = "0.9", optional = true }
serde = { version = "1", optional = true }
[features]
# Separate feature so that `clap` is not a mandatory dependency.
cli = ["clap", "toml", "serde"]
elapsed = []
sandbox_summary = []

View File

@@ -1,27 +1,23 @@
//! Standard type to use with the `--approval-mode` CLI option.
//! Available when the `cli` feature is enabled for the crate.
use clap::ArgAction;
use clap::Parser;
use clap::ValueEnum;
use codex_core::config::parse_sandbox_permission_with_base_path;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::SandboxPermission;
#[derive(Clone, Copy, Debug, ValueEnum)]
#[value(rename_all = "kebab-case")]
pub enum ApprovalModeCliArg {
/// Only run "trusted" commands (e.g. ls, cat, sed) without asking for user
/// approval. Will escalate to the user if the model proposes a command that
/// is not in the "trusted" set.
Untrusted,
/// Run all commands without asking for user approval.
/// Only asks for approval if a command fails to execute, in which case it
/// will escalate to the user to ask for un-sandboxed execution.
OnFailure,
/// Only run "known safe" commands (e.g. ls, cat, sed) without
/// asking for user approval. Will escalate to the user if the model
/// proposes a command that is not allow-listed.
UnlessAllowListed,
/// Never ask for user approval
/// Execution failures are immediately returned to the model.
Never,
@@ -30,44 +26,9 @@ pub enum ApprovalModeCliArg {
impl From<ApprovalModeCliArg> for AskForApproval {
fn from(value: ApprovalModeCliArg) -> Self {
match value {
ApprovalModeCliArg::Untrusted => AskForApproval::UnlessTrusted,
ApprovalModeCliArg::OnFailure => AskForApproval::OnFailure,
ApprovalModeCliArg::UnlessAllowListed => AskForApproval::UnlessAllowListed,
ApprovalModeCliArg::Never => AskForApproval::Never,
}
}
}
#[derive(Parser, Debug)]
pub struct SandboxPermissionOption {
/// Specify this flag multiple times to specify the full set of permissions
/// to grant to Codex.
///
/// ```shell
/// codex -s disk-full-read-access \
/// -s disk-write-cwd \
/// -s disk-write-platform-user-temp-folder \
/// -s disk-write-platform-global-temp-folder
/// ```
///
/// Note disk-write-folder takes a value:
///
/// ```shell
/// -s disk-write-folder=$HOME/.pyenv/shims
/// ```
///
/// These permissions are quite broad and should be used with caution:
///
/// ```shell
/// -s disk-full-write-access
/// -s network-full-access
/// ```
#[arg(long = "sandbox-permission", short = 's', action = ArgAction::Append, value_parser = parse_sandbox_permission)]
pub permissions: Option<Vec<SandboxPermission>>,
}
/// Custom value-parser so we can keep the CLI surface small *and*
/// still handle the parameterised `disk-write-folder` case.
fn parse_sandbox_permission(raw: &str) -> std::io::Result<SandboxPermission> {
let base_path = std::env::current_dir()?;
parse_sandbox_permission_with_base_path(raw, base_path)
}

View File

@@ -23,7 +23,7 @@ pub struct CliConfigOverrides {
/// parse as JSON, the raw string is used as a literal.
///
/// Examples:
/// - `-c model="o4-mini"`
/// - `-c model="o3"`
/// - `-c 'sandbox_permissions=["disk-full-read-access"]'`
/// - `-c shell_environment_policy.inherit=all`
#[arg(
@@ -61,10 +61,14 @@ impl CliConfigOverrides {
// Attempt to parse as JSON. If that fails, treat it as a raw
// string. This allows convenient usage such as
// `-c model=o4-mini` without the quotes.
// `-c model=o3` without the quotes.
let value: Value = match parse_toml_value(value_str) {
Ok(v) => v,
Err(_) => Value::String(value_str.to_string()),
Err(_) => {
// Strip leading/trailing quotes if present
let trimmed = value_str.trim().trim_matches(|c| c == '"' || c == '\'');
Value::String(trimmed.to_string())
}
};
Ok((key.to_string(), value))

View File

@@ -20,9 +20,10 @@ pub fn format_duration(duration: Duration) -> String {
fn format_elapsed_millis(millis: i64) -> String {
if millis < 1000 {
format!("{}ms", millis)
format!("{millis}ms")
} else if millis < 60_000 {
format!("{:.2}s", millis as f64 / 1000.0)
let secs = millis / 1000;
format!("{secs}s")
} else {
let minutes = millis / 60_000;
let seconds = (millis % 60_000) / 1000;
@@ -48,13 +49,12 @@ mod tests {
#[test]
fn test_format_duration_seconds() {
// Durations between 1s (inclusive) and 60s (exclusive) should be
// printed with 2-decimal-place seconds.
// printed as whole seconds.
let dur = Duration::from_millis(1_500); // 1.5s
assert_eq!(format_duration(dur), "1.50s");
assert_eq!(format_duration(dur), "1s");
// 59.999s truncates to 59s (integer seconds, no rounding)
let dur2 = Duration::from_millis(59_999);
assert_eq!(format_duration(dur2), "60.00s");
assert_eq!(format_duration(dur2), "59s");
}
#[test]

View File

@@ -6,11 +6,20 @@ pub mod elapsed;
#[cfg(feature = "cli")]
pub use approval_mode_cli_arg::ApprovalModeCliArg;
#[cfg(feature = "cli")]
pub use approval_mode_cli_arg::SandboxPermissionOption;
mod sandbox_mode_cli_arg;
#[cfg(feature = "cli")]
pub use sandbox_mode_cli_arg::SandboxModeCliArg;
#[cfg(any(feature = "cli", test))]
mod config_override;
#[cfg(feature = "cli")]
pub use config_override::CliConfigOverrides;
mod sandbox_summary;
#[cfg(feature = "sandbox_summary")]
pub use sandbox_summary::summarize_sandbox_policy;

View File

@@ -0,0 +1,28 @@
//! Standard type to use with the `--sandbox` (`-s`) CLI option.
//!
//! This mirrors the variants of [`codex_core::protocol::SandboxPolicy`], but
//! without any of the associated data so it can be expressed as a simple flag
//! on the command-line. Users that need to tweak the advanced options for
//! `workspace-write` can continue to do so via `-c` overrides or their
//! `config.toml`.
use clap::ValueEnum;
use codex_core::config_types::SandboxMode;
#[derive(Clone, Copy, Debug, ValueEnum)]
#[value(rename_all = "kebab-case")]
pub enum SandboxModeCliArg {
ReadOnly,
WorkspaceWrite,
DangerFullAccess,
}
impl From<SandboxModeCliArg> for SandboxMode {
fn from(value: SandboxModeCliArg) -> Self {
match value {
SandboxModeCliArg::ReadOnly => SandboxMode::ReadOnly,
SandboxModeCliArg::WorkspaceWrite => SandboxMode::WorkspaceWrite,
SandboxModeCliArg::DangerFullAccess => SandboxMode::DangerFullAccess,
}
}
}

View File

@@ -0,0 +1,28 @@
use codex_core::protocol::SandboxPolicy;
pub fn summarize_sandbox_policy(sandbox_policy: &SandboxPolicy) -> String {
match sandbox_policy {
SandboxPolicy::DangerFullAccess => "danger-full-access".to_string(),
SandboxPolicy::ReadOnly => "read-only".to_string(),
SandboxPolicy::WorkspaceWrite {
writable_roots,
network_access,
} => {
let mut summary = "workspace-write".to_string();
if !writable_roots.is_empty() {
summary.push_str(&format!(
" [{}]",
writable_roots
.iter()
.map(|p| p.to_string_lossy())
.collect::<Vec<_>>()
.join(", ")
));
}
if *network_access {
summary.push_str(" (network access enabled)");
}
summary
}
}
}
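A minimal usage sketch (assumes the `sandbox_summary` feature of `codex-common` is enabled):

```rust
use codex_common::summarize_sandbox_policy;
use codex_core::protocol::SandboxPolicy;

fn main() {
    let policy = SandboxPolicy::DangerFullAccess;
    assert_eq!(summarize_sandbox_policy(&policy), "danger-full-access");
}
```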

codex-rs/config.md Normal file
View File

@@ -0,0 +1,511 @@
# Config
Codex supports several mechanisms for setting config values:
- Config-specific command-line flags, such as `--model o3` (highest precedence).
- A generic `-c`/`--config` flag that takes a `key=value` pair, such as `--config model="o3"`.
- The key can contain dots to set a value deeper than the root, e.g. `--config model_providers.openai.wire_api="chat"`.
- Values can contain objects, such as `--config shell_environment_policy.include_only=["PATH", "HOME", "USER"]`.
- For consistency with `config.toml`, values are in TOML format rather than JSON format, so use `{a = 1, b = 2}` rather than `{"a": 1, "b": 2}`.
- If `value` cannot be parsed as a valid TOML value, it is treated as a string value. This means that both `-c model="o3"` and `-c model=o3` are equivalent, as the combined example after this list shows.
- The `$CODEX_HOME/config.toml` configuration file where the `CODEX_HOME` environment value defaults to `~/.codex`. (Note `CODEX_HOME` will also be where logs and other Codex-related information are stored.)
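For example, all of the following are valid (values are illustrative):

```shell
codex -c model="o3"
codex -c model=o3    # unquoted values that fail TOML parsing are treated as strings
codex -c model_providers.openai.wire_api="chat"
codex -c 'shell_environment_policy.include_only=["PATH", "HOME", "USER"]'
```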
Both the `--config` flag and the `config.toml` file support the following options:
## model
The model that Codex should use.
```toml
model = "o3" # overrides the default of "codex-mini-latest"
```
## model_providers
This option lets you override and amend the default set of model providers bundled with Codex. This value is a map where the key is the value to use with `model_provider` to select the corresponding provider.
For example, if you wanted to add a provider that uses the OpenAI 4o model via the chat completions API, then you could add the following configuration:
```toml
# Recall that in TOML, root keys must be listed before tables.
model = "gpt-4o"
model_provider = "openai-chat-completions"
[model_providers.openai-chat-completions]
# Name of the provider that will be displayed in the Codex UI.
name = "OpenAI using Chat Completions"
# The path `/chat/completions` will be appended to this URL to make the POST
# request for the chat completions.
base_url = "https://api.openai.com/v1"
# If `env_key` is set, identifies an environment variable that must be set when
# using Codex with this provider. The value of the environment variable must be
# non-empty and will be used in the `Bearer TOKEN` HTTP header for the POST request.
env_key = "OPENAI_API_KEY"
# Valid values for wire_api are "chat" and "responses". Defaults to "chat" if omitted.
wire_api = "chat"
# If necessary, extra query params that need to be added to the URL.
# See the Azure example below.
query_params = {}
```
Note this makes it possible to use Codex CLI with non-OpenAI models, so long as they use a wire API that is compatible with the OpenAI chat completions API. For example, you could define the following provider to use Codex CLI with Ollama running locally:
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
```
Or a third-party provider (using a distinct environment variable for the API key):
```toml
[model_providers.mistral]
name = "Mistral"
base_url = "https://api.mistral.ai/v1"
env_key = "MISTRAL_API_KEY"
```
Note that Azure requires `api-version` to be passed as a query parameter, so be sure to specify it as part of `query_params` when defining the Azure provider:
```toml
[model_providers.azure]
name = "Azure"
# Make sure you set the appropriate subdomain for this URL.
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY" # Or "OPENAI_API_KEY", whichever you use.
query_params = { api-version = "2025-04-01-preview" }
```
It is also possible to configure a provider to include extra HTTP headers with a request. These can be hardcoded values (`http_headers`) or values read from environment variables (`env_http_headers`):
```toml
[model_providers.example]
# name, base_url, ...
# This will add the HTTP header `X-Example-Header` with value `example-value`
# to each request to the model provider.
http_headers = { "X-Example-Header" = "example-value" }
# This will add the HTTP header `X-Example-Features` with the value of the
# `EXAMPLE_FEATURES` environment variable to each request to the model provider
# _if_ the environment variable is set and its value is non-empty.
env_http_headers = { "X-Example-Features": "EXAMPLE_FEATURES" }
```
### Per-provider network tuning
The following optional settings control retry behavior and streaming idle timeouts **per model provider**. They must be specified inside the corresponding `[model_providers.<id>]` block in `config.toml`. (Older releases accepted top-level keys; those are now ignored.)
Example:
```toml
[model_providers.openai]
name = "OpenAI"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
# network tuning overrides (all optional; falls back to builtin defaults)
request_max_retries = 4 # retry failed HTTP requests
stream_max_retries = 10 # retry dropped SSE streams
stream_idle_timeout_ms = 300000 # 5m idle timeout
```
#### request_max_retries
How many times Codex will retry a failed HTTP request to the model provider. Defaults to `4`.
#### stream_max_retries
Number of times Codex will attempt to reconnect when a streaming response is interrupted. Defaults to `10`.
#### stream_idle_timeout_ms
How long Codex will wait for activity on a streaming response before treating the connection as lost. Defaults to `300_000` (5 minutes).
## model_provider
Identifies which provider to use from the `model_providers` map. Defaults to `"openai"`. You can override the `base_url` for the built-in `openai` provider via the `OPENAI_BASE_URL` environment variable.
Note that if you override `model_provider`, then you likely want to override
`model`, as well. For example, if you are running ollama with Mistral locally,
then you would need to add the following to your config in addition to the new entry in the `model_providers` map:
```toml
model_provider = "ollama"
model = "mistral"
```
## approval_policy
Determines when the user should be prompted to approve whether Codex can execute a command:
```toml
# Codex has hardcoded logic that defines a set of "trusted" commands.
# Setting the approval_policy to `untrusted` means that Codex will prompt the
# user before running a command not in the "trusted" set.
#
# See https://github.com/openai/codex/issues/1260 for the plan to enable
# end-users to define their own trusted commands.
approval_policy = "untrusted"
```
```toml
# If the command fails when run in the sandbox, Codex asks for permission to
# retry the command outside the sandbox.
approval_policy = "on-failure"
```
```toml
# User is never prompted: if the command fails, Codex will automatically try
# something else. Note the `exec` subcommand always uses this mode.
approval_policy = "never"
```
## profiles
A _profile_ is a collection of configuration values that can be set together. Multiple profiles can be defined in `config.toml` and you can specify the one you
want to use at runtime via the `--profile` flag.
Here is an example of a `config.toml` that defines multiple profiles:
```toml
model = "o3"
approval_policy = "unless-allow-listed"
disable_response_storage = false
# Setting `profile` is equivalent to specifying `--profile o3` on the command
# line, though the `--profile` flag can still be used to override this value.
profile = "o3"
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
[profiles.o3]
model = "o3"
model_provider = "openai"
approval_policy = "never"
model_reasoning_effort = "high"
model_reasoning_summary = "detailed"
[profiles.gpt3]
model = "gpt-3.5-turbo"
model_provider = "openai-chat-completions"
[profiles.zdr]
model = "o3"
model_provider = "openai"
approval_policy = "on-failure"
disable_response_storage = true
```
Users can specify config values at multiple levels. Order of precedence is as follows:
1. custom command-line argument, e.g., `--model o3`
2. as part of a profile, where the profile is selected via the `--profile` CLI flag (or via `profile` in the config file itself)
3. as an entry in `config.toml`, e.g., `model = "o3"`
4. the default value that comes with Codex CLI (i.e., Codex CLI defaults to `codex-mini-latest`)
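For example, combining these levels (reusing the `gpt3` profile from above; the `o4-mini` override is purely illustrative):
```toml
model = "o3"               # level 3: top-level entry

[profiles.gpt3]
model = "gpt-3.5-turbo"    # level 2: profile value
```
With this config, `codex --profile gpt3 --model o4-mini` resolves the model to `o4-mini` (level 1 wins), `codex --profile gpt3` resolves it to `gpt-3.5-turbo` (level 2), and plain `codex` falls back to `o3` (level 3).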
## model_reasoning_effort
If the model name starts with `"o"` (as in `"o3"` or `"o4-mini"`) or `"codex"`, reasoning is enabled by default when using the Responses API. As explained in the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning), this can be set to:
- `"low"`
- `"medium"` (default)
- `"high"`
To disable reasoning, set `model_reasoning_effort` to `"none"` in your config:
```toml
model_reasoning_effort = "none" # disable reasoning
```
## model_reasoning_summary
If the model name starts with `"o"` (as in `"o3"` or `"o4-mini"`) or `"codex"`, reasoning is enabled by default when using the Responses API. As explained in the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries), this can be set to:
- `"auto"` (default)
- `"concise"`
- `"detailed"`
To disable reasoning summaries, set `model_reasoning_summary` to `"none"` in your config:
```toml
model_reasoning_summary = "none" # disable reasoning summaries
```
## model_supports_reasoning_summaries
By default, `reasoning` is only set on requests to OpenAI models that are known to support it. To force `reasoning` to be set on requests to the current model, set the following in `config.toml`:
```toml
model_supports_reasoning_summaries = true
```
## sandbox_mode
Codex executes model-generated shell commands inside an OS-level sandbox.
In most cases you can pick the desired behaviour with a single option:
```toml
# same as `--sandbox read-only`
sandbox_mode = "read-only"
```
The default policy is `read-only`, which means commands can read any file on
disk, but attempts to write a file or access the network will be blocked.
A more relaxed policy is `workspace-write`. When specified, the current working directory for the Codex task will be writable (as well as `$TMPDIR` on macOS). Note that the CLI defaults to using the directory where it was spawned as `cwd`, though this can be overridden using `--cwd/-C`.
```toml
# same as `--sandbox workspace-write`
sandbox_mode = "workspace-write"
# Extra settings that only apply when `sandbox = "workspace-write"`.
[sandbox_workspace_write]
# By default, only the cwd for the Codex session will be writable (and $TMPDIR
# on macOS), but you can specify additional writable folders in this array.
writable_roots = ["/tmp"]
# Allow the command being run inside the sandbox to make outbound network
# requests. Disabled by default.
network_access = false
```
To disable sandboxing altogether, specify `danger-full-access` like so:
```toml
# same as `--sandbox danger-full-access`
sandbox_mode = "danger-full-access"
```
This is reasonable to use if Codex is running in an environment that provides its own sandboxing (such as a Docker container), making further sandboxing unnecessary.
Using this option may also be necessary if you try to use Codex in environments where its native sandboxing mechanisms are unsupported, such as older Linux kernels or on Windows.
## mcp_servers
Defines the list of MCP servers that Codex can consult for tool use. Currently, only servers that are launched by executing a program and that communicate over stdio are supported. For servers that use the SSE transport, consider an adapter like [mcp-proxy](https://github.com/sparfenyuk/mcp-proxy).
**Note:** Codex may cache the list of tools and resources from an MCP server so that Codex can include this information in context at startup without spawning all the servers. This is designed to save resources by loading MCP servers lazily.
This config option is comparable to how Claude and Cursor define `mcpServers` in their respective JSON config files, though because Codex uses TOML for its config language, the format is slightly different. For example, the following config in JSON:
```json
{
"mcpServers": {
"server-name": {
"command": "npx",
"args": ["-y", "mcp-server"],
"env": {
"API_KEY": "value"
}
}
}
}
```
Should be represented as follows in `~/.codex/config.toml`:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }
```
## disable_response_storage
Currently, customers whose accounts are set to use Zero Data Retention (ZDR) must set `disable_response_storage` to `true` so that Codex uses an alternative to the Responses API that works with ZDR:
```toml
disable_response_storage = true
```
## shell_environment_policy
Codex spawns subprocesses (e.g. when executing a `local_shell` tool-call suggested by the assistant). By default it passes **only a minimal core subset** of your environment to those subprocesses to avoid leaking credentials. You can tune this behavior via the **`shell_environment_policy`** block in
`config.toml`:
```toml
[shell_environment_policy]
# inherit can be "core" (default), "all", or "none"
inherit = "core"
# set to true to *skip* the filter for `"*KEY*"` and `"*TOKEN*"`
ignore_default_excludes = false
# exclude patterns (case-insensitive globs)
exclude = ["AWS_*", "AZURE_*"]
# force-set / override values
set = { CI = "1" }
# if provided, *only* vars matching these patterns are kept
include_only = ["PATH", "HOME"]
```
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions always win over inherited values. |
| `include_only` | array&lt;string&gt; | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |
The patterns are **glob style**, not full regular expressions: `*` matches any
number of characters, `?` matches exactly one, and character classes like
`[A-Z]`/`[^0-9]` are supported. Matching is always **case-insensitive**. This
syntax is documented in code as `EnvironmentVariablePattern` (see
`core/src/config_types.rs`).
If you just need a clean slate with a few custom entries you can write:
```toml
[shell_environment_policy]
inherit = "none"
set = { PATH = "/usr/bin", MY_FLAG = "1" }
```
Currently, `CODEX_SANDBOX_NETWORK_DISABLED=1` is also added to the environment when the sandbox has network access disabled. This is not configurable.
## notify
Specify a program that will be executed to get notified about events generated by Codex. Note that the program will receive the notification argument as a string of JSON, e.g.:
```json
{
"type": "agent-turn-complete",
"turn-id": "12345",
"input-messages": ["Rename `foo` to `bar` and update the callsites."],
"last-assistant-message": "Rename complete and verified `cargo build` succeeds."
}
```
The `"type"` property will always be set. Currently, `"agent-turn-complete"` is the only notification type that is supported.
As an example, here is a Python script that parses the JSON and decides whether to show a desktop push notification using [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS:
```python
#!/usr/bin/env python3
import json
import subprocess
import sys

def main() -> int:
    if len(sys.argv) != 2:
        print("Usage: notify.py <NOTIFICATION_JSON>")
        return 1

    try:
        notification = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        return 1

    match notification_type := notification.get("type"):
        case "agent-turn-complete":
            assistant_message = notification.get("last-assistant-message")
            if assistant_message:
                title = f"Codex: {assistant_message}"
            else:
                title = "Codex: Turn Complete!"
            input_messages = notification.get("input-messages", [])
            message = " ".join(input_messages)
            title += message
        case _:
            print(f"not sending a push notification for: {notification_type}")
            return 0

    subprocess.check_output(
        [
            "terminal-notifier",
            "-title",
            title,
            "-message",
            message,
            "-group",
            "codex",
            "-ignoreDnD",
            "-activate",
            "com.googlecode.iterm2",
        ]
    )

    return 0


if __name__ == "__main__":
    sys.exit(main())
```
To have Codex use this script for notifications, you would configure it via `notify` in `~/.codex/config.toml` using the appropriate path to `notify.py` on your computer:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
## history
By default, Codex CLI records messages sent to the model in `$CODEX_HOME/history.jsonl`. Note that on UNIX, the file permissions are set to `o600`, so it should only be readable and writable by the owner.
To disable this behavior, configure `[history]` as follows:
```toml
[history]
persistence = "none" # "save-all" is the default value
```
## file_opener
Identifies the editor/URI scheme to use for hyperlinking citations in model output. If set, citations to files in the model output will be hyperlinked using the specified URI scheme so they can be ctrl/cmd-clicked from the terminal to open them.
For example, if the model output includes a reference such as `【F:/home/user/project/main.py†L42-L50】`, then this would be rewritten to link to the URI `vscode://file/home/user/project/main.py:42`.
Note this is **not** a general editor setting (like `$EDITOR`), as it only accepts a fixed set of values:
- `"vscode"` (default)
- `"vscode-insiders"`
- `"windsurf"`
- `"cursor"`
- `"none"` to explicitly disable this feature
Currently, `"vscode"` is the default, though Codex does not verify VS Code is installed. As such, `file_opener` may default to `"none"` or something else in the future.
## hide_agent_reasoning
Codex intermittently emits "reasoning" events that show the model's internal "thinking" before it produces a final answer. Some users may find these events distracting, especially in CI logs or minimal terminal output.
Setting `hide_agent_reasoning` to `true` suppresses these events in **both** the TUI as well as the headless `exec` sub-command:
```toml
hide_agent_reasoning = true # defaults to false
```
## model_context_window
The size of the context window for the model, in tokens.
In general, Codex knows the context window for the most common OpenAI models, but if you are using a new model with an old version of the Codex CLI, then you can use `model_context_window` to tell Codex what value to use to determine how much context is left during a conversation.
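For example (the value is illustrative; use your model's actual limit):
```toml
model_context_window = 200000  # tokens
```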
## model_max_output_tokens
This is analogous to `model_context_window`, but for the maximum number of output tokens for the model.
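For example (again, an illustrative value):
```toml
model_max_output_tokens = 100000  # tokens
```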
## project_doc_max_bytes
Maximum number of bytes to read from an `AGENTS.md` file to include in the instructions sent with the first turn of a session. Defaults to 32 KiB.
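For example, to double the default limit:
```toml
project_doc_max_bytes = 65536  # 64 KiB
```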
## tui
Options that are specific to the TUI.
```toml
[tui]
# This will make it so that Codex does not try to process mouse events, which
# means your terminal's native drag-to-select text selection and copy/paste
# should work. The tradeoff is that Codex will not receive any mouse events, so
# it will not be possible to use the mouse to scroll conversation history.
#
# Note that most terminals support holding down a modifier key when using the
# mouse to support text selection. For example, even if Codex mouse capture is
# enabled (i.e., this is set to `false`), you can still hold down alt while
# dragging the mouse to select text.
disable_mouse_capture = true # defaults to `false`
```

View File

@@ -13,7 +13,7 @@ workspace = true
[dependencies]
anyhow = "1"
async-channel = "2.3.1"
base64 = "0.21"
base64 = "0.22"
bytes = "1.10.1"
codex-apply-patch = { path = "../apply-patch" }
codex-mcp-client = { path = "../mcp-client" }
@@ -21,16 +21,16 @@ dirs = "6"
env-flags = "0.1.1"
eventsource-stream = "0.2.3"
fs2 = "0.4.3"
fs-err = "3.1.0"
futures = "0.3"
libc = "0.2.174"
mcp-types = { path = "../mcp-types" }
mime_guess = "2.0"
patch = "0.7"
path-absolutize = "3.1.1"
rand = "0.9"
reqwest = { version = "0.12", features = ["json", "stream"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sha1 = "0.10.6"
strum_macros = "0.27.1"
thiserror = "2.0.12"
time = { version = "0.3", features = ["formatting", "local-offset", "macros"] }
tokio = { version = "1", features = [
@@ -41,10 +41,10 @@ tokio = { version = "1", features = [
"signal",
] }
tokio-util = "0.7.14"
toml = "0.8.20"
toml = "0.9.1"
tracing = { version = "0.1.41", features = ["log"] }
tree-sitter = "0.25.3"
tree-sitter-bash = "0.23.3"
tree-sitter-bash = "0.25.0"
uuid = { version = "1", features = ["serde", "v4"] }
wildmatch = "2.4.0"
@@ -56,10 +56,16 @@ seccompiler = "0.5.0"
[target.x86_64-unknown-linux-musl.dependencies]
openssl-sys = { version = "*", features = ["vendored"] }
# Build OpenSSL from source for musl builds.
[target.aarch64-unknown-linux-musl.dependencies]
openssl-sys = { version = "*", features = ["vendored"] }
[dev-dependencies]
assert_cmd = "2"
maplit = "1.0.2"
predicates = "3"
pretty_assertions = "1.4.1"
tempfile = "3"
tokio-test = "0.4"
walkdir = "2.5.0"
wiremock = "0.6"

View File

@@ -21,14 +21,12 @@ use crate::client_common::ResponseEvent;
use crate::client_common::ResponseStream;
use crate::error::CodexErr;
use crate::error::Result;
use crate::flags::OPENAI_REQUEST_MAX_RETRIES;
use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::models::ContentItem;
use crate::models::ResponseItem;
use crate::openai_tools::create_tools_json_for_chat_completions_api;
use crate::util::backoff;
/// Implementation for the classic Chat Completions API. This is intentionally
/// minimal: we only stream back plain assistant text.
/// Implementation for the classic Chat Completions API.
pub(crate) async fn stream_chat_completions(
prompt: &Prompt,
model: &str,
@@ -38,45 +36,95 @@ pub(crate) async fn stream_chat_completions(
// Build messages array
let mut messages = Vec::<serde_json::Value>::new();
let full_instructions = prompt.get_full_instructions();
let full_instructions = prompt.get_full_instructions(model);
messages.push(json!({"role": "system", "content": full_instructions}));
for item in &prompt.input {
if let ResponseItem::Message { role, content } = item {
let mut text = String::new();
for c in content {
match c {
ContentItem::InputText { text: t } | ContentItem::OutputText { text: t } => {
text.push_str(t);
match item {
ResponseItem::Message { role, content } => {
let mut text = String::new();
for c in content {
match c {
ContentItem::InputText { text: t }
| ContentItem::OutputText { text: t } => {
text.push_str(t);
}
_ => {}
}
_ => {}
}
messages.push(json!({"role": role, "content": text}));
}
ResponseItem::FunctionCall {
name,
arguments,
call_id,
} => {
messages.push(json!({
"role": "assistant",
"content": null,
"tool_calls": [{
"id": call_id,
"type": "function",
"function": {
"name": name,
"arguments": arguments,
}
}]
}));
}
ResponseItem::LocalShellCall {
id,
call_id: _,
status,
action,
} => {
// Confirm with API team.
messages.push(json!({
"role": "assistant",
"content": null,
"tool_calls": [{
"id": id.clone().unwrap_or_else(|| "".to_string()),
"type": "local_shell_call",
"status": status,
"action": action,
}]
}));
}
ResponseItem::FunctionCallOutput { call_id, output } => {
messages.push(json!({
"role": "tool",
"tool_call_id": call_id,
"content": output.content,
}));
}
ResponseItem::Reasoning { .. } | ResponseItem::Other => {
// Omit these items from the conversation history.
continue;
}
messages.push(json!({"role": role, "content": text}));
}
}
let tools_json = create_tools_json_for_chat_completions_api(prompt, model)?;
let payload = json!({
"model": model,
"messages": messages,
"stream": true
"stream": true,
"tools": tools_json,
});
let base_url = provider.base_url.trim_end_matches('/');
let url = format!("{}/chat/completions", base_url);
debug!(
"POST to {}: {}",
provider.get_full_url(),
serde_json::to_string_pretty(&payload).unwrap_or_default()
);
debug!(url, "POST (chat)");
trace!("request payload: {}", payload);
let api_key = provider.api_key()?;
let mut attempt = 0;
let max_retries = provider.request_max_retries();
loop {
attempt += 1;
let mut req_builder = client.post(&url);
if let Some(api_key) = &api_key {
req_builder = req_builder.bearer_auth(api_key.clone());
}
let req_builder = provider.create_request_builder(client)?;
let res = req_builder
.header(reqwest::header::ACCEPT, "text/event-stream")
.json(&payload)
@@ -85,9 +133,13 @@ pub(crate) async fn stream_chat_completions(
match res {
Ok(resp) if resp.status().is_success() => {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(1600);
let stream = resp.bytes_stream().map_err(CodexErr::Reqwest);
tokio::spawn(process_chat_sse(stream, tx_event));
tokio::spawn(process_chat_sse(
stream,
tx_event,
provider.stream_idle_timeout(),
));
return Ok(ResponseStream { rx_event });
}
Ok(res) => {
@@ -97,7 +149,7 @@ pub(crate) async fn stream_chat_completions(
return Err(CodexErr::UnexpectedStatus(status, body));
}
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(CodexErr::RetryLimit(status));
}
@@ -113,7 +165,7 @@ pub(crate) async fn stream_chat_completions(
tokio::time::sleep(delay).await;
}
Err(e) => {
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(e.into());
}
let delay = backoff(attempt);
@@ -126,13 +178,29 @@ pub(crate) async fn stream_chat_completions(
/// Lightweight SSE processor for the Chat Completions streaming format. The
/// output is mapped onto Codex's internal [`ResponseEvent`] so that the rest
/// of the pipeline can stay agnostic of the underlying wire format.
async fn process_chat_sse<S>(stream: S, tx_event: mpsc::Sender<Result<ResponseEvent>>)
where
async fn process_chat_sse<S>(
stream: S,
tx_event: mpsc::Sender<Result<ResponseEvent>>,
idle_timeout: Duration,
) where
S: Stream<Item = Result<Bytes>> + Unpin,
{
let mut stream = stream.eventsource();
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// State to accumulate a function call across streaming chunks.
// OpenAI may split the `arguments` string over multiple `delta` events
// until the chunk whose `finish_reason` is `tool_calls` is emitted. We
// keep collecting the pieces here and forward a single
// `ResponseItem::FunctionCall` once the call is complete.
#[derive(Default)]
struct FunctionCallState {
name: Option<String>,
arguments: String,
call_id: Option<String>,
active: bool,
}
let mut fn_call_state = FunctionCallState::default();
loop {
let sse = match timeout(idle_timeout, stream.next()).await {
@@ -146,6 +214,7 @@ where
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
}))
.await;
return;
@@ -163,6 +232,7 @@ where
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
}))
.await;
return;
@@ -173,23 +243,90 @@ where
Ok(v) => v,
Err(_) => continue,
};
trace!("chat_completions received SSE chunk: {chunk:?}");
let content_opt = chunk
.get("choices")
.and_then(|c| c.get(0))
.and_then(|c| c.get("delta"))
.and_then(|d| d.get("content"))
.and_then(|c| c.as_str());
let choice_opt = chunk.get("choices").and_then(|c| c.get(0));
if let Some(content) = content_opt {
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: content.to_string(),
}],
};
if let Some(choice) = choice_opt {
// Handle assistant content tokens.
if let Some(content) = choice
.get("delta")
.and_then(|d| d.get("content"))
.and_then(|c| c.as_str())
{
let item = ResponseItem::Message {
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: content.to_string(),
}],
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
// Handle streaming function / tool calls.
if let Some(tool_calls) = choice
.get("delta")
.and_then(|d| d.get("tool_calls"))
.and_then(|tc| tc.as_array())
{
if let Some(tool_call) = tool_calls.first() {
// Mark that we have an active function call in progress.
fn_call_state.active = true;
// Extract call_id if present.
if let Some(id) = tool_call.get("id").and_then(|v| v.as_str()) {
fn_call_state.call_id.get_or_insert_with(|| id.to_string());
}
// Extract function details if present.
if let Some(function) = tool_call.get("function") {
if let Some(name) = function.get("name").and_then(|n| n.as_str()) {
fn_call_state.name.get_or_insert_with(|| name.to_string());
}
if let Some(args_fragment) =
function.get("arguments").and_then(|a| a.as_str())
{
fn_call_state.arguments.push_str(args_fragment);
}
}
}
}
// Emit end-of-turn when finish_reason signals completion.
if let Some(finish_reason) = choice.get("finish_reason").and_then(|v| v.as_str()) {
match finish_reason {
"tool_calls" if fn_call_state.active => {
// Build the FunctionCall response item.
let item = ResponseItem::FunctionCall {
name: fn_call_state.name.clone().unwrap_or_else(|| "".to_string()),
arguments: fn_call_state.arguments.clone(),
call_id: fn_call_state.call_id.clone().unwrap_or_else(String::new),
};
// Emit it downstream.
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
"stop" => {
// Regular turn without tool-call.
}
_ => {}
}
// Emit Completed regardless of reason so the agent can advance.
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
}))
.await;
// Prepare for potential next turn (should not happen in same stream).
// fn_call_state = FunctionCallState::default();
return; // End processing for this SSE stream.
}
}
}
}
@@ -236,9 +373,14 @@ where
Poll::Ready(None) => return Poll::Ready(None),
Poll::Ready(Some(Err(e))) => return Poll::Ready(Some(Err(e))),
Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(item)))) => {
// Accumulate *assistant* text but do not emit yet.
if let crate::models::ResponseItem::Message { role, content } = &item {
if role == "assistant" {
// If this is an incremental assistant message chunk, accumulate but
// do NOT emit yet. Forward any other item (e.g. FunctionCall) right
// away so downstream consumers see it.
let is_assistant_delta = matches!(&item, crate::models::ResponseItem::Message { role, .. } if role == "assistant");
if is_assistant_delta {
if let crate::models::ResponseItem::Message { content, .. } = &item {
if let Some(text) = content.iter().find_map(|c| match c {
crate::models::ContentItem::OutputText { text } => Some(text),
_ => None,
@@ -246,12 +388,18 @@ where
this.cumulative.push_str(text);
}
}
// Swallow partial assistant chunk; keep polling.
continue;
}
// Swallow partial event; keep polling.
continue;
// Not an assistant message; forward immediately.
return Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(item))));
}
Poll::Ready(Some(Ok(ResponseEvent::Completed { response_id }))) => {
Poll::Ready(Some(Ok(ResponseEvent::Completed {
response_id,
token_usage,
}))) => {
if !this.cumulative.is_empty() {
let aggregated_item = crate::models::ResponseItem::Message {
role: "assistant".to_string(),
@@ -261,7 +409,10 @@ where
};
// Buffer Completed so it is returned *after* the aggregated message.
this.pending_completed = Some(ResponseEvent::Completed { response_id });
this.pending_completed = Some(ResponseEvent::Completed {
response_id,
token_usage,
});
return Poll::Ready(Some(Ok(ResponseEvent::OutputItemDone(
aggregated_item,
@@ -269,8 +420,22 @@ where
}
// Nothing aggregated; forward Completed directly.
return Poll::Ready(Some(Ok(ResponseEvent::Completed { response_id })));
} // No other `Ok` variants exist at the moment, continue polling.
return Poll::Ready(Some(Ok(ResponseEvent::Completed {
response_id,
token_usage,
})));
}
Poll::Ready(Some(Ok(ResponseEvent::Created))) => {
// These events are exclusive to the Responses API and
// will never appear in a Chat Completions stream.
continue;
}
Poll::Ready(Some(Ok(ResponseEvent::OutputTextDelta(_))))
| Poll::Ready(Some(Ok(ResponseEvent::ReasoningSummaryDelta(_)))) => {
// Deltas are ignored here since aggregation waits for the
// final OutputItemDone.
continue;
}
}
}
}
@@ -284,7 +449,7 @@ pub(crate) trait AggregateStreamExt: Stream<Item = Result<ResponseEvent>> + Size
///
/// ```ignore
/// OutputItemDone(<full message>)
/// Completed { .. }
/// Completed
/// ```
///
/// No other `OutputItemDone` events will be seen by the caller.

View File

@@ -1,7 +1,5 @@
use std::collections::BTreeMap;
use std::io::BufRead;
use std::path::Path;
use std::sync::LazyLock;
use std::time::Duration;
use bytes::Bytes;
@@ -11,109 +9,60 @@ use reqwest::StatusCode;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value;
use serde_json::json;
use tokio::sync::mpsc;
use tokio::time::timeout;
use tokio_util::io::ReaderStream;
use tracing::debug;
use tracing::trace;
use tracing::warn;
use uuid::Uuid;
use crate::chat_completions::AggregateStreamExt;
use crate::chat_completions::stream_chat_completions;
use crate::client_common::Payload;
use crate::client_common::Prompt;
use crate::client_common::Reasoning;
use crate::client_common::ResponseEvent;
use crate::client_common::ResponseStream;
use crate::client_common::Summary;
use crate::client_common::ResponsesApiRequest;
use crate::client_common::create_reasoning_param_for_request;
use crate::config::Config;
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::CodexErr;
use crate::error::EnvVarError;
use crate::error::Result;
use crate::flags::CODEX_RS_SSE_FIXTURE;
use crate::flags::OPENAI_REQUEST_MAX_RETRIES;
use crate::flags::OPENAI_STREAM_IDLE_TIMEOUT_MS;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::WireApi;
use crate::models::ResponseItem;
use crate::openai_tools::create_tools_json_for_responses_api;
use crate::protocol::TokenUsage;
use crate::util::backoff;
/// When serialized as JSON, this produces a valid "Tool" in the OpenAI
/// Responses API.
#[derive(Debug, Clone, Serialize)]
#[serde(tag = "type")]
enum OpenAiTool {
#[serde(rename = "function")]
Function(ResponsesApiTool),
#[serde(rename = "local_shell")]
LocalShell {},
}
#[derive(Debug, Clone, Serialize)]
struct ResponsesApiTool {
name: &'static str,
description: &'static str,
strict: bool,
parameters: JsonSchema,
}
/// Generic JSONSchema subset needed for our tool definitions
#[derive(Debug, Clone, Serialize)]
#[serde(tag = "type", rename_all = "lowercase")]
enum JsonSchema {
String,
Number,
Array {
items: Box<JsonSchema>,
},
Object {
properties: BTreeMap<String, JsonSchema>,
required: &'static [&'static str],
#[serde(rename = "additionalProperties")]
additional_properties: bool,
},
}
/// Tool usage specification
static DEFAULT_TOOLS: LazyLock<Vec<OpenAiTool>> = LazyLock::new(|| {
let mut properties = BTreeMap::new();
properties.insert(
"command".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String),
},
);
properties.insert("workdir".to_string(), JsonSchema::String);
properties.insert("timeout".to_string(), JsonSchema::Number);
vec![OpenAiTool::Function(ResponsesApiTool {
name: "shell",
description: "Runs a shell command, and returns its output.",
strict: false,
parameters: JsonSchema::Object {
properties,
required: &["command"],
additional_properties: false,
},
})]
});
static DEFAULT_CODEX_MODEL_TOOLS: LazyLock<Vec<OpenAiTool>> =
LazyLock::new(|| vec![OpenAiTool::LocalShell {}]);
use std::sync::Arc;
#[derive(Clone)]
pub struct ModelClient {
model: String,
config: Arc<Config>,
client: reqwest::Client,
provider: ModelProviderInfo,
session_id: Uuid,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
}
impl ModelClient {
pub fn new(model: impl ToString, provider: ModelProviderInfo) -> Self {
pub fn new(
config: Arc<Config>,
provider: ModelProviderInfo,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
session_id: Uuid,
) -> Self {
Self {
model: model.to_string(),
config,
client: reqwest::Client::new(),
provider,
session_id,
effort,
summary,
}
}
@@ -125,9 +74,13 @@ impl ModelClient {
WireApi::Responses => self.stream_responses(prompt).await,
WireApi::Chat => {
// Create the raw streaming connection first.
let response_stream =
stream_chat_completions(prompt, &self.model, &self.client, &self.provider)
.await?;
let response_stream = stream_chat_completions(
prompt,
&self.config.model,
&self.client,
&self.provider,
)
.await?;
// Wrap it with the aggregation adapter so callers see *only*
// the final assistant message per turn (matching the
@@ -158,78 +111,68 @@ impl ModelClient {
if let Some(path) = &*CODEX_RS_SSE_FIXTURE {
// short circuit for tests
warn!(path, "Streaming from fixture");
return stream_from_fixture(path).await;
return stream_from_fixture(path, self.provider.clone()).await;
}
// Assemble tool list: built-in tools + any extra tools from the prompt.
let default_tools = if self.model.starts_with("codex") {
&DEFAULT_CODEX_MODEL_TOOLS
} else {
&DEFAULT_TOOLS
};
let mut tools_json = Vec::with_capacity(default_tools.len() + prompt.extra_tools.len());
for t in default_tools.iter() {
tools_json.push(serde_json::to_value(t)?);
}
tools_json.extend(
prompt
.extra_tools
.clone()
.into_iter()
.map(|(name, tool)| mcp_tool_to_openai_tool(name, tool)),
);
debug!("tools_json: {}", serde_json::to_string_pretty(&tools_json)?);
let full_instructions = prompt.get_full_instructions();
let payload = Payload {
model: &self.model,
let full_instructions = prompt.get_full_instructions(&self.config.model);
let tools_json = create_tools_json_for_responses_api(prompt, &self.config.model)?;
let reasoning = create_reasoning_param_for_request(&self.config, self.effort, self.summary);
let payload = ResponsesApiRequest {
model: &self.config.model,
instructions: &full_instructions,
input: &prompt.input,
tools: &tools_json,
tool_choice: "auto",
parallel_tool_calls: false,
reasoning: Some(Reasoning {
effort: "high",
summary: Some(Summary::Auto),
}),
reasoning,
previous_response_id: prompt.prev_id.clone(),
store: prompt.store,
// TODO: make this configurable
stream: true,
};
let base_url = self.provider.base_url.clone();
let base_url = base_url.trim_end_matches('/');
let url = format!("{}/responses", base_url);
debug!(url, "POST");
trace!("request payload: {}", serde_json::to_string(&payload)?);
trace!(
"POST to {}: {}",
self.provider.get_full_url(),
serde_json::to_string(&payload)?
);
let mut attempt = 0;
let max_retries = self.provider.request_max_retries();
loop {
attempt += 1;
let api_key = self.provider.api_key()?.ok_or_else(|| {
CodexErr::EnvVar(EnvVarError {
var: self.provider.env_key.clone().unwrap_or_default(),
instructions: None,
})
})?;
let res = self
.client
.post(&url)
.bearer_auth(api_key)
let req_builder = self
.provider
.create_request_builder(&self.client)?
.header("OpenAI-Beta", "responses=experimental")
.header("session_id", self.session_id.to_string())
.header(reqwest::header::ACCEPT, "text/event-stream")
.json(&payload)
.send()
.await;
.json(&payload);
let res = req_builder.send().await;
if let Ok(resp) = &res {
trace!(
"Response status: {}, request-id: {}",
resp.status(),
resp.headers()
.get("x-request-id")
.map(|v| v.to_str().unwrap_or_default())
.unwrap_or_default()
);
}
match res {
Ok(resp) if resp.status().is_success() => {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(1600);
// spawn task to process SSE
let stream = resp.bytes_stream().map_err(CodexErr::Reqwest);
tokio::spawn(process_sse(stream, tx_event));
tokio::spawn(process_sse(
stream,
tx_event,
self.provider.stream_idle_timeout(),
));
return Ok(ResponseStream { rx_event });
}
@@ -244,11 +187,11 @@ impl ModelClient {
// negligible.
if !(status == StatusCode::TOO_MANY_REQUESTS || status.is_server_error()) {
// Surface the error body to callers. Use `unwrap_or_default` per Clippy.
let body = (res.text().await).unwrap_or_default();
let body = res.text().await.unwrap_or_default();
return Err(CodexErr::UnexpectedStatus(status, body));
}
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(CodexErr::RetryLimit(status));
}
@@ -265,7 +208,7 @@ impl ModelClient {
tokio::time::sleep(delay).await;
}
Err(e) => {
if attempt > *OPENAI_REQUEST_MAX_RETRIES {
if attempt > max_retries {
return Err(e.into());
}
let delay = backoff(attempt);
@@ -274,34 +217,10 @@ impl ModelClient {
}
}
}
}
fn mcp_tool_to_openai_tool(
fully_qualified_name: String,
tool: mcp_types::Tool,
) -> serde_json::Value {
let mcp_types::Tool {
description,
mut input_schema,
..
} = tool;
// OpenAI models mandate the "properties" field in the schema. The Agents
// SDK fixed this by inserting an empty object for "properties" if it is not
// already present https://github.com/openai/openai-agents-python/issues/449
// so here we do the same.
if input_schema.properties.is_none() {
input_schema.properties = Some(serde_json::Value::Object(serde_json::Map::new()));
pub fn get_provider(&self) -> ModelProviderInfo {
self.provider.clone()
}
// TODO(mbolin): Change the contract of this function to return
// ResponsesApiTool.
json!({
"name": fully_qualified_name,
"description": description,
"parameters": input_schema,
"type": "function",
})
}
#[derive(Debug, Deserialize, Serialize)]
@@ -310,23 +229,61 @@ struct SseEvent {
kind: String,
response: Option<Value>,
item: Option<Value>,
delta: Option<String>,
}
#[derive(Debug, Deserialize)]
struct ResponseCreated {}
#[derive(Debug, Deserialize)]
struct ResponseCompleted {
id: String,
usage: Option<ResponseCompletedUsage>,
}
async fn process_sse<S>(stream: S, tx_event: mpsc::Sender<Result<ResponseEvent>>)
where
#[derive(Debug, Deserialize)]
struct ResponseCompletedUsage {
input_tokens: u64,
input_tokens_details: Option<ResponseCompletedInputTokensDetails>,
output_tokens: u64,
output_tokens_details: Option<ResponseCompletedOutputTokensDetails>,
total_tokens: u64,
}
impl From<ResponseCompletedUsage> for TokenUsage {
fn from(val: ResponseCompletedUsage) -> Self {
TokenUsage {
input_tokens: val.input_tokens,
cached_input_tokens: val.input_tokens_details.map(|d| d.cached_tokens),
output_tokens: val.output_tokens,
reasoning_output_tokens: val.output_tokens_details.map(|d| d.reasoning_tokens),
total_tokens: val.total_tokens,
}
}
}
#[derive(Debug, Deserialize)]
struct ResponseCompletedInputTokensDetails {
cached_tokens: u64,
}
#[derive(Debug, Deserialize)]
struct ResponseCompletedOutputTokensDetails {
reasoning_tokens: u64,
}
async fn process_sse<S>(
stream: S,
tx_event: mpsc::Sender<Result<ResponseEvent>>,
idle_timeout: Duration,
) where
S: Stream<Item = Result<Bytes>> + Unpin,
{
let mut stream = stream.eventsource();
// If the stream stays completely silent for an extended period treat it as disconnected.
let idle_timeout = *OPENAI_STREAM_IDLE_TIMEOUT_MS;
// The response id returned from the "complete" message.
let mut response_id = None;
let mut response_completed: Option<ResponseCompleted> = None;
loop {
let sse = match timeout(idle_timeout, stream.next()).await {
@@ -338,9 +295,15 @@ where
return;
}
Ok(None) => {
match response_id {
Some(response_id) => {
let event = ResponseEvent::Completed { response_id };
match response_completed {
Some(ResponseCompleted {
id: response_id,
usage,
}) => {
let event = ResponseEvent::Completed {
response_id,
token_usage: usage.map(Into::into),
};
let _ = tx_event.send(Ok(event)).await;
}
None => {
@@ -379,7 +342,7 @@ where
// duplicated `output` array embedded in the `response.completed`
// payload. That produced two concrete issues:
// 1. No realtime streaming; the user only saw output after the
// entire turn had finished, which broke the typing UX and
// entire turn had finished, which broke the "typing" UX and
// made long-running turns look stalled.
// 2. Duplicate `function_call_output` items; both the
// individual *and* the completed array were forwarded, which
@@ -401,12 +364,46 @@ where
return;
}
}
"response.output_text.delta" => {
if let Some(delta) = event.delta {
let event = ResponseEvent::OutputTextDelta(delta);
if tx_event.send(Ok(event)).await.is_err() {
return;
}
}
}
"response.reasoning_summary_text.delta" => {
if let Some(delta) = event.delta {
let event = ResponseEvent::ReasoningSummaryDelta(delta);
if tx_event.send(Ok(event)).await.is_err() {
return;
}
}
}
"response.created" => {
if event.response.is_some() {
let _ = tx_event.send(Ok(ResponseEvent::Created {})).await;
}
}
"response.failed" => {
if let Some(resp_val) = event.response {
let error = resp_val
.get("error")
.and_then(|v| v.get("message"))
.and_then(|v| v.as_str())
.unwrap_or("response.failed event received");
let _ = tx_event
.send(Err(CodexErr::Stream(error.to_string())))
.await;
}
}
// Final response completed includes array of output items & id
"response.completed" => {
if let Some(resp_val) = event.response {
match serde_json::from_value::<ResponseCompleted>(resp_val) {
Ok(r) => {
response_id = Some(r.id);
response_completed = Some(r);
}
Err(e) => {
debug!("failed to parse ResponseCompleted: {e}");
@@ -415,14 +412,27 @@ where
};
};
}
"response.content_part.done"
| "response.function_call_arguments.delta"
| "response.in_progress"
| "response.output_item.added"
| "response.output_text.done"
| "response.reasoning_summary_part.added"
| "response.reasoning_summary_text.done" => {
// Currently, we ignore these events, but we handle them
// separately to skip the logging message in the `other` case.
}
other => debug!(other, "sse event"),
}
}
}
/// used in tests to stream from a text SSE file
async fn stream_from_fixture(path: impl AsRef<Path>) -> Result<ResponseStream> {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(16);
async fn stream_from_fixture(
path: impl AsRef<Path>,
provider: ModelProviderInfo,
) -> Result<ResponseStream> {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent>>(1600);
let f = std::fs::File::open(path.as_ref())?;
let lines = std::io::BufReader::new(f).lines();
@@ -435,6 +445,299 @@ async fn stream_from_fixture(path: impl AsRef<Path>) -> Result<ResponseStream> {
let rdr = std::io::Cursor::new(content);
let stream = ReaderStream::new(rdr).map_err(CodexErr::Io);
tokio::spawn(process_sse(stream, tx_event));
tokio::spawn(process_sse(
stream,
tx_event,
provider.stream_idle_timeout(),
));
Ok(ResponseStream { rx_event })
}
#[cfg(test)]
mod tests {
#![allow(clippy::expect_used, clippy::unwrap_used)]
use super::*;
use serde_json::json;
use tokio::sync::mpsc;
use tokio_test::io::Builder as IoBuilder;
use tokio_util::io::ReaderStream;
// ────────────────────────────
// Helpers
// ────────────────────────────
/// Runs the SSE parser on pre-chunked byte slices and returns every event
/// (including any final `Err` from a stream-closure check).
async fn collect_events(
chunks: &[&[u8]],
provider: ModelProviderInfo,
) -> Vec<Result<ResponseEvent>> {
let mut builder = IoBuilder::new();
for chunk in chunks {
builder.read(chunk);
}
let reader = builder.build();
let stream = ReaderStream::new(reader).map_err(CodexErr::Io);
let (tx, mut rx) = mpsc::channel::<Result<ResponseEvent>>(16);
tokio::spawn(process_sse(stream, tx, provider.stream_idle_timeout()));
let mut events = Vec::new();
while let Some(ev) = rx.recv().await {
events.push(ev);
}
events
}
/// Builds an in-memory SSE stream from JSON fixtures and returns only the
/// successfully parsed events (panics on internal channel errors).
async fn run_sse(
events: Vec<serde_json::Value>,
provider: ModelProviderInfo,
) -> Vec<ResponseEvent> {
let mut body = String::new();
for e in events {
let kind = e
.get("type")
.and_then(|v| v.as_str())
.expect("fixture event missing type");
if e.as_object().map(|o| o.len() == 1).unwrap_or(false) {
body.push_str(&format!("event: {kind}\n\n"));
} else {
body.push_str(&format!("event: {kind}\ndata: {e}\n\n"));
}
}
let (tx, mut rx) = mpsc::channel::<Result<ResponseEvent>>(8);
let stream = ReaderStream::new(std::io::Cursor::new(body)).map_err(CodexErr::Io);
tokio::spawn(process_sse(stream, tx, provider.stream_idle_timeout()));
let mut out = Vec::new();
while let Some(ev) = rx.recv().await {
out.push(ev.expect("channel closed"));
}
out
}
// ────────────────────────────
// Tests from `implement-test-for-responses-api-sse-parser`
// ────────────────────────────
#[tokio::test]
async fn parses_items_and_completed() {
let item1 = json!({
"type": "response.output_item.done",
"item": {
"type": "message",
"role": "assistant",
"content": [{"type": "output_text", "text": "Hello"}]
}
})
.to_string();
let item2 = json!({
"type": "response.output_item.done",
"item": {
"type": "message",
"role": "assistant",
"content": [{"type": "output_text", "text": "World"}]
}
})
.to_string();
let completed = json!({
"type": "response.completed",
"response": { "id": "resp1" }
})
.to_string();
let sse1 = format!("event: response.output_item.done\ndata: {item1}\n\n");
let sse2 = format!("event: response.output_item.done\ndata: {item2}\n\n");
let sse3 = format!("event: response.completed\ndata: {completed}\n\n");
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: "https://test.com".to_string(),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
};
let events = collect_events(
&[sse1.as_bytes(), sse2.as_bytes(), sse3.as_bytes()],
provider,
)
.await;
assert_eq!(events.len(), 3);
matches!(
&events[0],
Ok(ResponseEvent::OutputItemDone(ResponseItem::Message { role, .. }))
if role == "assistant"
);
matches!(
&events[1],
Ok(ResponseEvent::OutputItemDone(ResponseItem::Message { role, .. }))
if role == "assistant"
);
match &events[2] {
Ok(ResponseEvent::Completed {
response_id,
token_usage,
}) => {
assert_eq!(response_id, "resp1");
assert!(token_usage.is_none());
}
other => panic!("unexpected third event: {other:?}"),
}
}
#[tokio::test]
async fn error_when_missing_completed() {
let item1 = json!({
"type": "response.output_item.done",
"item": {
"type": "message",
"role": "assistant",
"content": [{"type": "output_text", "text": "Hello"}]
}
})
.to_string();
let sse1 = format!("event: response.output_item.done\ndata: {item1}\n\n");
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: "https://test.com".to_string(),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
};
let events = collect_events(&[sse1.as_bytes()], provider).await;
assert_eq!(events.len(), 2);
matches!(events[0], Ok(ResponseEvent::OutputItemDone(_)));
match &events[1] {
Err(CodexErr::Stream(msg)) => {
assert_eq!(msg, "stream closed before response.completed")
}
other => panic!("unexpected second event: {other:?}"),
}
}
// ────────────────────────────
// Table-driven test from `main`
// ────────────────────────────
/// Verifies that the adapter produces the right `ResponseEvent` for a
/// variety of incoming `type` values.
#[tokio::test]
async fn table_driven_event_kinds() {
struct TestCase {
name: &'static str,
event: serde_json::Value,
expect_first: fn(&ResponseEvent) -> bool,
expected_len: usize,
}
fn is_created(ev: &ResponseEvent) -> bool {
matches!(ev, ResponseEvent::Created)
}
fn is_output(ev: &ResponseEvent) -> bool {
matches!(ev, ResponseEvent::OutputItemDone(_))
}
fn is_completed(ev: &ResponseEvent) -> bool {
matches!(ev, ResponseEvent::Completed { .. })
}
let completed = json!({
"type": "response.completed",
"response": {
"id": "c",
"usage": {
"input_tokens": 0,
"input_tokens_details": null,
"output_tokens": 0,
"output_tokens_details": null,
"total_tokens": 0
},
"output": []
}
});
let cases = vec![
TestCase {
name: "created",
event: json!({"type": "response.created", "response": {}}),
expect_first: is_created,
expected_len: 2,
},
TestCase {
name: "output_item.done",
event: json!({
"type": "response.output_item.done",
"item": {
"type": "message",
"role": "assistant",
"content": [
{"type": "output_text", "text": "hi"}
]
}
}),
expect_first: is_output,
expected_len: 2,
},
TestCase {
name: "unknown",
event: json!({"type": "response.new_tool_event"}),
expect_first: is_completed,
expected_len: 1,
},
];
for case in cases {
let mut evs = vec![case.event];
evs.push(completed.clone());
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: "https://test.com".to_string(),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
};
let out = run_sse(evs, provider).await;
assert_eq!(out.len(), case.expected_len, "case {}", case.name);
assert!(
(case.expect_first)(&out[0]),
"first event mismatch in case {}",
case.name
);
}
}
}

View File

@@ -1,5 +1,9 @@
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::Result;
use crate::models::ResponseItem;
use crate::protocol::TokenUsage;
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use futures::Stream;
use serde::Serialize;
use std::borrow::Cow;
@@ -22,7 +26,7 @@ pub struct Prompt {
pub prev_id: Option<String>,
/// Optional instructions from the user to amend to the built-in agent
/// instructions.
pub instructions: Option<String>,
pub user_instructions: Option<String>,
/// Whether to store response on server side (disable_response_storage = !store).
pub store: bool,
@@ -30,47 +34,95 @@ pub struct Prompt {
/// the "fully qualified" tool name (i.e., prefixed with the server name),
/// which should be reported to the model in place of Tool::name.
pub extra_tools: HashMap<String, mcp_types::Tool>,
/// Optional override for the built-in BASE_INSTRUCTIONS.
pub base_instructions_override: Option<String>,
}
impl Prompt {
pub(crate) fn get_full_instructions(&self) -> Cow<str> {
match &self.instructions {
Some(instructions) => {
let instructions = format!("{BASE_INSTRUCTIONS}\n{instructions}");
Cow::Owned(instructions)
}
None => Cow::Borrowed(BASE_INSTRUCTIONS),
pub(crate) fn get_full_instructions(&self, model: &str) -> Cow<'_, str> {
let base = self
.base_instructions_override
.as_deref()
.unwrap_or(BASE_INSTRUCTIONS);
let mut sections: Vec<&str> = vec![base];
if let Some(ref user) = self.user_instructions {
sections.push(user);
}
if model.starts_with("gpt-4.1") {
sections.push(APPLY_PATCH_TOOL_INSTRUCTIONS);
}
Cow::Owned(sections.join("\n"))
}
}
#[derive(Debug)]
pub enum ResponseEvent {
Created,
OutputItemDone(ResponseItem),
Completed { response_id: String },
Completed {
response_id: String,
token_usage: Option<TokenUsage>,
},
OutputTextDelta(String),
ReasoningSummaryDelta(String),
}
#[derive(Debug, Serialize)]
pub(crate) struct Reasoning {
pub(crate) effort: &'static str,
pub(crate) effort: OpenAiReasoningEffort,
#[serde(skip_serializing_if = "Option::is_none")]
pub(crate) summary: Option<Summary>,
pub(crate) summary: Option<OpenAiReasoningSummary>,
}
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning
#[derive(Debug, Serialize, Default, Clone, Copy)]
#[serde(rename_all = "lowercase")]
pub(crate) enum OpenAiReasoningEffort {
Low,
#[default]
Medium,
High,
}
impl From<ReasoningEffortConfig> for Option<OpenAiReasoningEffort> {
fn from(effort: ReasoningEffortConfig) -> Self {
match effort {
ReasoningEffortConfig::Low => Some(OpenAiReasoningEffort::Low),
ReasoningEffortConfig::Medium => Some(OpenAiReasoningEffort::Medium),
ReasoningEffortConfig::High => Some(OpenAiReasoningEffort::High),
ReasoningEffortConfig::None => None,
}
}
}
/// A summary of the reasoning performed by the model. This can be useful for
/// debugging and understanding the model's reasoning process.
#[derive(Debug, Serialize)]
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries
#[derive(Debug, Serialize, Default, Clone, Copy)]
#[serde(rename_all = "lowercase")]
pub(crate) enum Summary {
pub(crate) enum OpenAiReasoningSummary {
#[default]
Auto,
#[allow(dead_code)] // Will go away once this is configurable.
Concise,
#[allow(dead_code)] // Will go away once this is configurable.
Detailed,
}
impl From<ReasoningSummaryConfig> for Option<OpenAiReasoningSummary> {
fn from(summary: ReasoningSummaryConfig) -> Self {
match summary {
ReasoningSummaryConfig::Auto => Some(OpenAiReasoningSummary::Auto),
ReasoningSummaryConfig::Concise => Some(OpenAiReasoningSummary::Concise),
ReasoningSummaryConfig::Detailed => Some(OpenAiReasoningSummary::Detailed),
ReasoningSummaryConfig::None => None,
}
}
}
/// Request object that is serialized as JSON and POST'ed when using the
/// Responses API.
#[derive(Debug, Serialize)]
pub(crate) struct Payload<'a> {
pub(crate) struct ResponsesApiRequest<'a> {
pub(crate) model: &'a str,
pub(crate) instructions: &'a str,
// TODO(mbolin): ResponseItem::Other should not be serialized. Currently,
@@ -88,6 +140,46 @@ pub(crate) struct Payload<'a> {
pub(crate) stream: bool,
}
use crate::config::Config;
pub(crate) fn create_reasoning_param_for_request(
config: &Config,
effort: ReasoningEffortConfig,
summary: ReasoningSummaryConfig,
) -> Option<Reasoning> {
if model_supports_reasoning_summaries(config) {
let effort: Option<OpenAiReasoningEffort> = effort.into();
let effort = effort?;
Some(Reasoning {
effort,
summary: summary.into(),
})
} else {
None
}
}
pub fn model_supports_reasoning_summaries(config: &Config) -> bool {
// Currently, we hardcode this rule to decide whether to enable reasoning.
// We expect reasoning to apply only to OpenAI models, but we do not want
// users to have to mess with their config to disable reasoning for models
// that do not support it, such as `gpt-4.1`.
//
// Though if a user is using Codex with non-OpenAI models that, say, happen
// to start with "o", then they can set `model_reasoning_effort = "none"` in
// config.toml to disable reasoning.
//
// Conversely, if a user has a non-OpenAI provider that supports reasoning,
// they can set the top-level `model_supports_reasoning_summaries = true`
// config option to enable reasoning.
if config.model_supports_reasoning_summaries {
return true;
}
let model = &config.model;
model.starts_with("o") || model.starts_with("codex")
}
pub(crate) struct ResponseStream {
pub(crate) rx_event: mpsc::Receiver<Result<ResponseEvent>>,
}

View File

@@ -1,6 +1,7 @@
// Poisoned mutex should fail the program
#![allow(clippy::unwrap_used)]
use std::borrow::Cow;
use std::collections::HashMap;
use std::collections::HashSet;
use std::path::Path;
@@ -20,6 +21,7 @@ use codex_apply_patch::MaybeApplyPatchVerified;
use codex_apply_patch::maybe_parse_apply_patch_verified;
use codex_apply_patch::print_summary;
use futures::prelude::*;
use mcp_types::CallToolResult;
use serde::Serialize;
use serde_json;
use tokio::sync::Notify;
@@ -47,9 +49,7 @@ use crate::exec::ExecToolCallOutput;
use crate::exec::SandboxType;
use crate::exec::process_exec_tool_call;
use crate::exec_env::create_env;
use crate::flags::OPENAI_STREAM_MAX_RETRIES;
use crate::mcp_connection_manager::McpConnectionManager;
use crate::mcp_connection_manager::try_parse_fully_qualified_tool_name;
use crate::mcp_tool_call::handle_mcp_tool_call;
use crate::models::ContentItem;
use crate::models::FunctionCallOutputPayload;
@@ -58,8 +58,10 @@ use crate::models::ReasoningItemReasoningSummary;
use crate::models::ResponseInputItem;
use crate::models::ResponseItem;
use crate::models::ShellToolCallParams;
use crate::project_doc::create_full_instructions;
use crate::project_doc::get_user_instructions;
use crate::protocol::AgentMessageDeltaEvent;
use crate::protocol::AgentMessageEvent;
use crate::protocol::AgentReasoningDeltaEvent;
use crate::protocol::AgentReasoningEvent;
use crate::protocol::ApplyPatchApprovalRequestEvent;
use crate::protocol::AskForApproval;
@@ -99,24 +101,37 @@ impl Codex {
/// Spawn a new [`Codex`] and initialize the session. Returns the instance
/// of `Codex` and the ID of the `SessionInitialized` event that was
/// submitted to start the session.
pub async fn spawn(config: Config, ctrl_c: Arc<Notify>) -> CodexResult<(Codex, String)> {
pub async fn spawn(config: Config, ctrl_c: Arc<Notify>) -> CodexResult<(Codex, String, Uuid)> {
// experimental resume path (undocumented)
let resume_path = config.experimental_resume.clone();
info!("resume_path: {resume_path:?}");
let (tx_sub, rx_sub) = async_channel::bounded(64);
let (tx_event, rx_event) = async_channel::bounded(64);
let (tx_event, rx_event) = async_channel::bounded(1600);
let user_instructions = get_user_instructions(&config).await;
let instructions = create_full_instructions(&config).await;
let configure_session = Op::ConfigureSession {
provider: config.model_provider.clone(),
model: config.model.clone(),
instructions,
model_reasoning_effort: config.model_reasoning_effort,
model_reasoning_summary: config.model_reasoning_summary,
user_instructions,
base_instructions: config.base_instructions.clone(),
approval_policy: config.approval_policy,
sandbox_policy: config.sandbox_policy.clone(),
disable_response_storage: config.disable_response_storage,
notify: config.notify.clone(),
cwd: config.cwd.clone(),
resume_path: resume_path.clone(),
};
let config = Arc::new(config);
tokio::spawn(submission_loop(config, rx_sub, tx_event, ctrl_c));
// Generate a unique ID for the lifetime of this Codex session.
let session_id = Uuid::new_v4();
tokio::spawn(submission_loop(
session_id, config, rx_sub, tx_event, ctrl_c,
));
let codex = Codex {
next_id: AtomicU64::new(0),
tx_sub,
@@ -124,7 +139,7 @@ impl Codex {
};
let init_id = codex.submit(configure_session).await?;
Ok((codex, init_id))
Ok((codex, init_id, session_id))
}
/// Submit the `op` wrapped in a `Submission` with a unique ID.
@@ -170,7 +185,8 @@ pub(crate) struct Session {
/// the model as well as sandbox policies are resolved against this path
/// instead of `std::env::current_dir()`.
cwd: PathBuf,
instructions: Option<String>,
base_instructions: Option<String>,
user_instructions: Option<String>,
approval_policy: AskForApproval,
sandbox_policy: SandboxPolicy,
shell_environment_policy: ShellEnvironmentPolicy,
@@ -185,7 +201,7 @@ pub(crate) struct Session {
/// Optional rollout recorder for persisting the conversation transcript so
/// sessions can be replayed or inspected later.
rollout: Mutex<Option<crate::rollout::RolloutRecorder>>,
rollout: Mutex<Option<RolloutRecorder>>,
state: Mutex<State>,
codex_linux_sandbox_exe: Option<PathBuf>,
}
@@ -203,6 +219,9 @@ impl Session {
struct State {
approved_commands: HashSet<Vec<String>>,
current_task: Option<AgentTask>,
/// Call IDs received from the Responses API whose outputs have not been sent back yet.
/// A follow-up Responses API request will be rejected with a 400 unless the output for every pending call has been sent back.
pending_call_ids: HashSet<String>,
previous_response_id: Option<String>,
pending_approvals: HashMap<String, oneshot::Sender<ReviewDecision>>,
pending_input: Vec<ResponseInputItem>,
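
```rust
// The constraint documented on `pending_call_ids` is what the synthetic
// "aborted" outputs later in this file enforce. A minimal sketch of that
// bookkeeping, mirroring the `ResponseItem`/`FunctionCallOutputPayload`
// usage from this diff (the free function is illustrative only):
use std::collections::HashSet;

fn synthesize_aborted_outputs(pending_call_ids: &HashSet<String>) -> Vec<ResponseItem> {
    // For every call id the model is still waiting on, emit a synthetic
    // failed output so the next Responses API request is not rejected.
    pending_call_ids
        .iter()
        .map(|call_id| ResponseItem::FunctionCallOutput {
            call_id: call_id.clone(),
            output: FunctionCallOutputPayload {
                content: "aborted".to_string(),
                success: Some(false),
            },
        })
        .collect()
}
```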
@@ -295,17 +314,34 @@ impl Session {
state.approved_commands.insert(cmd);
}
/// Append the given items to the session's rollout transcript (if enabled)
/// and persist them to disk.
async fn record_rollout_items(&self, items: &[ResponseItem]) {
// Clone the recorder outside of the mutex so we don't hold the lock
// across an await point (MutexGuard is not Send).
/// Records items to both the rollout and the chat completions/ZDR
/// transcript, if enabled.
async fn record_conversation_items(&self, items: &[ResponseItem]) {
debug!("Recording items for conversation: {items:?}");
self.record_state_snapshot(items).await;
if let Some(transcript) = self.state.lock().unwrap().zdr_transcript.as_mut() {
transcript.record_items(items);
}
}
async fn record_state_snapshot(&self, items: &[ResponseItem]) {
let snapshot = {
let state = self.state.lock().unwrap();
crate::rollout::SessionStateSnapshot {
previous_response_id: state.previous_response_id.clone(),
}
};
let recorder = {
let guard = self.rollout.lock().unwrap();
guard.as_ref().cloned()
};
if let Some(rec) = recorder {
if let Err(e) = rec.record_state(snapshot).await {
error!("failed to record rollout state: {e:#}");
}
if let Err(e) = rec.record_items(items).await {
error!("failed to record rollout items: {e:#}");
}
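
```rust
// The recorder is cloned out of the mutex before any `.await` because a
// `std::sync::MutexGuard` is not `Send`. A self-contained illustration of
// the clone-then-await pattern, using a hypothetical `Recorder` type:
use std::sync::Mutex;

#[derive(Clone)]
struct Recorder;

impl Recorder {
    async fn record(&self, _items: &[String]) {
        // ... write to disk ...
    }
}

async fn persist(slot: &Mutex<Option<Recorder>>, items: &[String]) {
    // The guard is a temporary that drops at the end of this statement,
    // so the lock is released before we await.
    let recorder = slot.lock().unwrap().as_ref().cloned();
    if let Some(rec) = recorder {
        rec.record(items).await;
    }
}
```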
@@ -388,7 +424,7 @@ impl Session {
tool: &str,
arguments: Option<serde_json::Value>,
timeout: Option<Duration>,
) -> anyhow::Result<mcp_types::CallToolResult> {
) -> anyhow::Result<CallToolResult> {
self.mcp_connection_manager
.call_tool(server, tool, arguments, timeout)
.await
@@ -397,6 +433,8 @@ impl Session {
pub fn abort(&self) {
info!("Aborting existing session");
let mut state = self.state.lock().unwrap();
// Don't clear pending_call_ids because we need to keep track of them to ensure we don't 400 on the next turn.
// We will generate a synthetic aborted response for each pending call id.
state.pending_approvals.clear();
state.pending_input.clear();
if let Some(task) = state.current_task.take() {
@@ -417,7 +455,7 @@ impl Session {
}
let Ok(json) = serde_json::to_string(&notification) else {
tracing::error!("failed to serialise notification payload");
error!("failed to serialise notification payload");
return;
};
@@ -429,7 +467,7 @@ impl Session {
// Fire-and-forget; we do not wait for completion.
if let Err(e) = command.spawn() {
tracing::warn!("failed to spawn notifier '{}': {e}", notify_command[0]);
warn!("failed to spawn notifier '{}': {e}", notify_command[0]);
}
}
}
@@ -491,14 +529,12 @@ impl AgentTask {
}
async fn submission_loop(
mut session_id: Uuid,
config: Arc<Config>,
rx_sub: Receiver<Submission>,
tx_event: Sender<Event>,
ctrl_c: Arc<Notify>,
) {
// Generate a unique ID for the lifetime of this Codex session.
let session_id = Uuid::new_v4();
let mut sess: Option<Arc<Session>> = None;
// shorthand - send an event when there is no active session
let send_no_session_event = |sub_id: String| async {
@@ -542,14 +578,20 @@ async fn submission_loop(
Op::ConfigureSession {
provider,
model,
instructions,
model_reasoning_effort,
model_reasoning_summary,
user_instructions,
base_instructions,
approval_policy,
sandbox_policy,
disable_response_storage,
notify,
cwd,
resume_path,
} => {
info!("Configuring session: model={model}; provider={provider:?}");
info!(
"Configuring session: model={model}; provider={provider:?}; resume={resume_path:?}"
);
if !cwd.is_absolute() {
let message = format!("cwd is not absolute: {cwd:?}");
error!(message);
@@ -562,8 +604,51 @@ async fn submission_loop(
}
return;
}
// Optionally resume an existing rollout.
let mut restored_items: Option<Vec<ResponseItem>> = None;
let mut restored_prev_id: Option<String> = None;
let rollout_recorder: Option<RolloutRecorder> =
if let Some(path) = resume_path.as_ref() {
match RolloutRecorder::resume(path).await {
Ok((rec, saved)) => {
session_id = saved.session_id;
restored_prev_id = saved.state.previous_response_id;
if !saved.items.is_empty() {
restored_items = Some(saved.items);
}
Some(rec)
}
Err(e) => {
warn!("failed to resume rollout from {path:?}: {e}");
None
}
}
} else {
None
};
let client = ModelClient::new(model.clone(), provider.clone());
let rollout_recorder = match rollout_recorder {
Some(rec) => Some(rec),
None => {
match RolloutRecorder::new(&config, session_id, user_instructions.clone())
.await
{
Ok(r) => Some(r),
Err(e) => {
warn!("failed to initialise rollout recorder: {e}");
None
}
}
}
};
let client = ModelClient::new(
config.clone(),
provider.clone(),
model_reasoning_effort,
model_reasoning_summary,
session_id,
);
// abort any current running session and clone its state
let retain_zdr_transcript =
@@ -616,26 +701,12 @@ async fn submission_loop(
});
}
}
// Attempt to create a RolloutRecorder *before* moving the
// `instructions` value into the Session struct.
// TODO: if ConfigureSession is sent twice, we will create an
// overlapping rollout file. Consider passing RolloutRecorder
// from above.
let rollout_recorder =
match RolloutRecorder::new(&config, session_id, instructions.clone()).await {
Ok(r) => Some(r),
Err(e) => {
tracing::warn!("failed to initialise rollout recorder: {e}");
None
}
};
sess = Some(Arc::new(Session {
client,
tx_event: tx_event.clone(),
ctrl_c: Arc::clone(&ctrl_c),
instructions,
user_instructions,
base_instructions,
approval_policy,
sandbox_policy,
shell_environment_policy: config.shell_environment_policy.clone(),
@@ -648,6 +719,19 @@ async fn submission_loop(
codex_linux_sandbox_exe: config.codex_linux_sandbox_exe.clone(),
}));
// Patch restored state into the newly created session.
if let Some(sess_arc) = &sess {
if restored_prev_id.is_some() || restored_items.is_some() {
let mut st = sess_arc.state.lock().unwrap();
st.previous_response_id = restored_prev_id;
if let (Some(hist), Some(items)) =
(st.zdr_transcript.as_mut(), restored_items.as_ref())
{
hist.record_items(items.iter());
}
}
}
// Gather history metadata for SessionConfiguredEvent.
let (history_log_id, history_entry_count) =
crate::message_history::history_metadata(&config).await;
@@ -716,12 +800,14 @@ async fn submission_loop(
}
}
Op::AddToHistory { text } => {
// TODO: What should we do if we got AddToHistory before ConfigureSession?
// Currently, if ConfigureSession has a resume path, this history will be ignored.
let id = session_id;
let config = config.clone();
tokio::spawn(async move {
if let Err(e) = crate::message_history::append_entry(&text, &id, &config).await
{
tracing::warn!("failed to append to message history: {e}");
warn!("failed to append to message history: {e}");
}
});
}
@@ -751,7 +837,7 @@ async fn submission_loop(
};
if let Err(e) = tx_event.send(event).await {
tracing::warn!("failed to send GetHistoryEntryResponse event: {e}");
warn!("failed to send GetHistoryEntryResponse event: {e}");
}
});
}
@@ -760,6 +846,19 @@ async fn submission_loop(
debug!("Agent loop exited");
}
/// Takes a user message as input and runs a loop where, at each turn, the model
/// replies with either:
///
/// - requested function calls
/// - an assistant message
///
/// While it is possible for the model to return multiple of these items in a
/// single turn, in practice we generally see one item per turn:
///
/// - If the model requests a function call, we execute it and send the output
/// back to the model in the next turn.
/// - If the model sends only an assistant message, we record it in the
/// conversation history and consider the task complete.
async fn run_task(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
if input.is_empty() {
return;
@@ -772,10 +871,14 @@ async fn run_task(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
return;
}
let mut pending_response_input: Vec<ResponseInputItem> = vec![ResponseInputItem::from(input)];
let initial_input_for_turn = ResponseInputItem::from(input);
sess.record_conversation_items(&[initial_input_for_turn.clone().into()])
.await;
let mut input_for_next_turn: Vec<ResponseInputItem> = vec![initial_input_for_turn];
let last_agent_message: Option<String>;
loop {
let mut net_new_turn_input = pending_response_input
let mut net_new_turn_input = input_for_next_turn
.drain(..)
.map(ResponseItem::from)
.collect::<Vec<_>>();
@@ -783,11 +886,12 @@ async fn run_task(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
// Note that pending_input would be something like a message the user
// submitted through the UI while the model was running. Though the UI
// may support this, the model might not.
let pending_input = sess.get_pending_input().into_iter().map(ResponseItem::from);
net_new_turn_input.extend(pending_input);
// Persist only the net-new items of this turn to the rollout.
sess.record_rollout_items(&net_new_turn_input).await;
let pending_input = sess
.get_pending_input()
.into_iter()
.map(ResponseItem::from)
.collect::<Vec<ResponseItem>>();
sess.record_conversation_items(&pending_input).await;
// Construct the input that we will send to the model. When using the
// Chat completions API (or ZDR clients), the model needs the full
@@ -796,20 +900,24 @@ async fn run_task(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
// represents an append-only log without duplicates.
let turn_input: Vec<ResponseItem> =
if let Some(transcript) = sess.state.lock().unwrap().zdr_transcript.as_mut() {
// If we are using Chat/ZDR, we need to send the transcript with every turn.
// 1. Build up the conversation history for the next turn.
let full_transcript = [transcript.contents(), net_new_turn_input.clone()].concat();
// 2. Update the in-memory transcript so that future turns
// include these items as part of the history.
transcript.record_items(&net_new_turn_input);
// Note that `transcript.record_items()` does some filtering
// such that `full_transcript` may include items that were
// excluded from `transcript`.
full_transcript
// If we are using Chat/ZDR, we need to send the transcript with
// every turn. By induction, `transcript` already contains:
// - The `input` that kicked off this task.
// - Each `ResponseItem` that was recorded in the previous turn.
// - Each response to a `ResponseItem` (in practice, the only
// response type we seem to have is `FunctionCallOutput`).
//
// The only thing the `transcript` does not contain is the
// `pending_input` that was injected while the model was
// running. We need to add that to the conversation history
// so that the model can see it in the next turn.
[transcript.contents(), pending_input].concat()
} else {
// In practice, net_new_turn_input should contain only:
// - User messages
// - Outputs for function calls requested by the model
net_new_turn_input.extend(pending_input);
// Responses API path we can just send the new items and
// record the same.
net_new_turn_input
@@ -830,29 +938,88 @@ async fn run_task(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
.collect();
match run_turn(&sess, sub_id.clone(), turn_input).await {
Ok(turn_output) => {
let (items, responses): (Vec<_>, Vec<_>) = turn_output
.into_iter()
.map(|p| (p.item, p.response))
.unzip();
let responses = responses
.into_iter()
.flatten()
.collect::<Vec<ResponseInputItem>>();
let mut items_to_record_in_conversation_history = Vec::<ResponseItem>::new();
let mut responses = Vec::<ResponseInputItem>::new();
for processed_response_item in turn_output {
let ProcessedResponseItem { item, response } = processed_response_item;
match (&item, &response) {
(ResponseItem::Message { role, .. }, None) if role == "assistant" => {
// If the model returned a message, we need to record it.
items_to_record_in_conversation_history.push(item);
}
(
ResponseItem::LocalShellCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
},
);
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
},
);
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::McpToolCallOutput { call_id, result }),
) => {
items_to_record_in_conversation_history.push(item);
let (content, success): (String, Option<bool>) = match result {
Ok(CallToolResult {
content,
is_error,
structured_content: _,
}) => match serde_json::to_string(content) {
Ok(content) => (content, *is_error),
Err(e) => {
warn!("Failed to serialize MCP tool call output: {e}");
(e.to_string(), Some(true))
}
},
Err(e) => (e.clone(), Some(true)),
};
items_to_record_in_conversation_history.push(
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload { content, success },
},
);
}
(ResponseItem::Reasoning { .. }, None) => {
// Omit from conversation history.
}
_ => {
warn!("Unexpected response item: {item:?} with response: {response:?}");
}
};
if let Some(response) = response {
responses.push(response);
}
}
// Only attempt to take the lock if there is something to record.
if !items.is_empty() {
// First persist model-generated output to the rollout file; this only borrows.
sess.record_rollout_items(&items).await;
// For ZDR we also need to keep a transcript clone.
if let Some(transcript) = sess.state.lock().unwrap().zdr_transcript.as_mut() {
transcript.record_items(&items);
}
if !items_to_record_in_conversation_history.is_empty() {
sess.record_conversation_items(&items_to_record_in_conversation_history)
.await;
}
if responses.is_empty() {
debug!("Turn completed");
last_agent_message = get_last_assistant_message_from_turn(&items);
last_agent_message = get_last_assistant_message_from_turn(
&items_to_record_in_conversation_history,
);
sess.maybe_notify(UserNotification::AgentTurnComplete {
turn_id: sub_id.clone(),
input_messages: turn_input_messages,
@@ -861,7 +1028,7 @@ async fn run_task(sess: Arc<Session>, sub_id: String, input: Vec<InputItem>) {
break;
}
pending_response_input = responses;
input_for_next_turn = responses;
}
Err(e) => {
info!("Turn error: {e:#}");
@@ -890,9 +1057,8 @@ async fn run_turn(
input: Vec<ResponseItem>,
) -> CodexResult<Vec<ProcessedResponseItem>> {
// Decide whether to use server-side storage (previous_response_id) or disable it
let (prev_id, store, is_first_turn) = {
let (prev_id, store) = {
let state = sess.state.lock().unwrap();
let is_first_turn = state.previous_response_id.is_none();
let store = state.zdr_transcript.is_none();
let prev_id = if store {
state.previous_response_id.clone()
@@ -901,22 +1067,17 @@ async fn run_turn(
// back, but trying to use it results in a 400.
None
};
(prev_id, store, is_first_turn)
};
let instructions = if is_first_turn {
sess.instructions.clone()
} else {
None
(prev_id, store)
};
let extra_tools = sess.mcp_connection_manager.list_all_tools();
let prompt = Prompt {
input,
prev_id,
instructions,
user_instructions: sess.user_instructions.clone(),
store,
extra_tools,
base_instructions_override: sess.base_instructions.clone(),
};
let mut retries = 0;
@@ -926,12 +1087,13 @@ async fn run_turn(
Err(CodexErr::Interrupted) => return Err(CodexErr::Interrupted),
Err(CodexErr::EnvVar(var)) => return Err(CodexErr::EnvVar(var)),
Err(e) => {
if retries < *OPENAI_STREAM_MAX_RETRIES {
// Use the configured provider-specific stream retry budget.
let max_retries = sess.client.get_provider().stream_max_retries();
if retries < max_retries {
retries += 1;
let delay = backoff(retries);
warn!(
"stream disconnected - retrying turn ({retries}/{} in {delay:?})...",
*OPENAI_STREAM_MAX_RETRIES
"stream disconnected - retrying turn ({retries}/{max_retries} in {delay:?})...",
);
// Surface retry information to any UI/frontend so the
@@ -940,8 +1102,7 @@ async fn run_turn(
sess.notify_background_event(
&sub_id,
format!(
"stream error: {e}; retrying {retries}/{} in {:?}",
*OPENAI_STREAM_MAX_RETRIES, delay
"stream error: {e}; retrying {retries}/{max_retries} in {delay:?}"
),
)
.await;
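
```rust
// Hedged sketch of the retry discipline above: a per-provider retry budget
// with exponential backoff. The `backoff` curve here is an assumption for
// illustration, not the crate's actual helper.
use std::future::Future;
use std::time::Duration;

fn backoff(attempt: u64) -> Duration {
    // Assumed curve: 200ms doubling per attempt, capped at 10s.
    Duration::from_millis((200u64 << attempt.min(6)).min(10_000))
}

async fn with_retries<T, E, Fut>(
    max_retries: u64,
    mut attempt: impl FnMut() -> Fut,
) -> Result<T, E>
where
    Fut: Future<Output = Result<T, E>>,
{
    let mut retries = 0;
    loop {
        match attempt().await {
            Ok(value) => return Ok(value),
            Err(_) if retries < max_retries => {
                retries += 1;
                tokio::time::sleep(backoff(retries)).await;
            }
            Err(err) => return Err(err),
        }
    }
}
```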
@@ -959,6 +1120,7 @@ async fn run_turn(
/// events map to a `ResponseItem`. A `ResponseItem` may need to be
/// "handled" such that it produces a `ResponseInputItem` that needs to be
/// sent back to the model on the next turn.
#[derive(Debug)]
struct ProcessedResponseItem {
item: ResponseItem,
response: Option<ResponseInputItem>,
@@ -969,30 +1131,139 @@ async fn try_run_turn(
sub_id: &str,
prompt: &Prompt,
) -> CodexResult<Vec<ProcessedResponseItem>> {
let mut stream = sess.client.clone().stream(prompt).await?;
// call_ids that are part of this response.
let completed_call_ids = prompt
.input
.iter()
.filter_map(|ri| match ri {
ResponseItem::FunctionCallOutput { call_id, .. } => Some(call_id),
ResponseItem::LocalShellCall {
call_id: Some(call_id),
..
} => Some(call_id),
_ => None,
})
.collect::<Vec<_>>();
// Buffer all the incoming messages from the stream first, then execute them.
// If we execute a function call in the middle of handling the stream, it can time out.
let mut input = Vec::new();
while let Some(event) = stream.next().await {
input.push(event?);
}
// call_ids that were pending but are not part of this response.
// This usually happens because the user interrupted the model before we responded to one of its tool calls
// and then the user sent a follow-up message.
let missing_calls = {
sess.state
.lock()
.unwrap()
.pending_call_ids
.iter()
.filter_map(|call_id| {
if completed_call_ids.contains(&call_id) {
None
} else {
Some(call_id.clone())
}
})
.map(|call_id| ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: Some(false),
},
})
.collect::<Vec<_>>()
};
let prompt: Cow<Prompt> = if missing_calls.is_empty() {
Cow::Borrowed(prompt)
} else {
// Add the synthetic aborted missing calls to the beginning of the input to ensure all call ids have responses.
let input = [missing_calls, prompt.input.clone()].concat();
Cow::Owned(Prompt {
input,
..prompt.clone()
})
};
let mut stream = sess.client.clone().stream(&prompt).await?;
let mut output = Vec::new();
for event in input {
loop {
// Poll the next item from the model stream. We must inspect *both* Ok and Err
// cases so that transient stream failures (e.g., dropped SSE connection before
// `response.completed`) bubble up and trigger the caller's retry logic.
let event = stream.next().await;
let Some(event) = event else {
// Channel closed without yielding a final Completed event or explicit error.
// Treat as a disconnected stream so the caller can retry.
return Err(CodexErr::Stream(
"stream closed before response.completed".into(),
));
};
let event = match event {
Ok(ev) => ev,
Err(e) => {
// Propagate the underlying stream error to the caller (run_turn), which
// will apply the configured `stream_max_retries` policy.
return Err(e);
}
};
match event {
ResponseEvent::Created => {
let mut state = sess.state.lock().unwrap();
// We successfully created a new response and ensured that all pending calls were included so we can clear the pending call ids.
state.pending_call_ids.clear();
}
ResponseEvent::OutputItemDone(item) => {
let call_id = match &item {
ResponseItem::LocalShellCall {
call_id: Some(call_id),
..
} => Some(call_id),
ResponseItem::FunctionCall { call_id, .. } => Some(call_id),
_ => None,
};
if let Some(call_id) = call_id {
// We just got a new call id so we need to make sure to respond to it in the next turn.
let mut state = sess.state.lock().unwrap();
state.pending_call_ids.insert(call_id.clone());
}
let response = handle_response_item(sess, sub_id, item.clone()).await?;
output.push(ProcessedResponseItem { item, response });
}
ResponseEvent::Completed { response_id } => {
ResponseEvent::Completed {
response_id,
token_usage,
} => {
if let Some(token_usage) = token_usage {
sess.tx_event
.send(Event {
id: sub_id.to_string(),
msg: EventMsg::TokenCount(token_usage),
})
.await
.ok();
}
let mut state = sess.state.lock().unwrap();
state.previous_response_id = Some(response_id);
break;
return Ok(output);
}
ResponseEvent::OutputTextDelta(delta) => {
let event = Event {
id: sub_id.to_string(),
msg: EventMsg::AgentMessageDelta(AgentMessageDeltaEvent { delta }),
};
sess.tx_event.send(event).await.ok();
}
ResponseEvent::ReasoningSummaryDelta(delta) => {
let event = Event {
id: sub_id.to_string(),
msg: EventMsg::AgentReasoningDelta(AgentReasoningDeltaEvent { delta }),
};
sess.tx_event.send(event).await.ok();
}
}
}
Ok(output)
}
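
```rust
// The `Cow<Prompt>` above is a borrow-or-own decision: the common path (no
// missing calls) borrows the prompt as-is, and an allocation happens only
// when synthetic outputs must be prepended. The same shape in isolation:
use std::borrow::Cow;

fn with_prefix<'a>(input: &'a [String], prefix: Vec<String>) -> Cow<'a, [String]> {
    if prefix.is_empty() {
        // Nothing to prepend: hand back the borrowed slice untouched.
        Cow::Borrowed(input)
    } else {
        // Rare path: allocate a combined vector.
        Cow::Owned([prefix, input.to_vec()].concat())
    }
}
```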
async fn handle_response_item(
@@ -1032,7 +1303,7 @@ async fn handle_response_item(
arguments,
call_id,
} => {
tracing::info!("FunctionCall: {arguments}");
info!("FunctionCall: {arguments}");
Some(handle_function_call(sess, sub_id.to_string(), name, arguments, call_id).await)
}
ResponseItem::LocalShellCall {
@@ -1095,13 +1366,13 @@ async fn handle_function_call(
let params = match parse_container_exec_arguments(arguments, sess, &call_id) {
Ok(params) => params,
Err(output) => {
return output;
return *output;
}
};
handle_container_exec_with_params(params, sess, sub_id, call_id).await
}
_ => {
match try_parse_fully_qualified_tool_name(&name) {
match sess.mcp_connection_manager.parse_tool_name(&name) {
Some((server, tool_name)) => {
// TODO(mbolin): Determine appropriate timeout for tool call.
let timeout = None;
@@ -1114,8 +1385,8 @@ async fn handle_function_call(
// Unknown function: reply with structured failure so the model can adapt.
ResponseInputItem::FunctionCallOutput {
call_id,
output: crate::models::FunctionCallOutputPayload {
content: format!("unsupported call: {}", name),
output: FunctionCallOutputPayload {
content: format!("unsupported call: {name}"),
success: None,
},
}
@@ -1138,7 +1409,7 @@ fn parse_container_exec_arguments(
arguments: String,
sess: &Session,
call_id: &str,
) -> Result<ExecParams, ResponseInputItem> {
) -> Result<ExecParams, Box<ResponseInputItem>> {
// parse command
match serde_json::from_str::<ShellToolCallParams>(&arguments) {
Ok(shell_tool_call_params) => Ok(to_exec_params(shell_tool_call_params, sess)),
@@ -1146,12 +1417,12 @@ fn parse_container_exec_arguments(
// allow model to re-sample
let output = ResponseInputItem::FunctionCallOutput {
call_id: call_id.to_string(),
output: crate::models::FunctionCallOutputPayload {
output: FunctionCallOutputPayload {
content: format!("failed to parse function arguments: {e}"),
success: None,
},
};
Err(output)
Err(Box::new(output))
}
}
}
@@ -1214,7 +1485,7 @@ async fn handle_container_exec_with_params(
ReviewDecision::Denied | ReviewDecision::Abort => {
return ResponseInputItem::FunctionCallOutput {
call_id,
output: crate::models::FunctionCallOutputPayload {
output: FunctionCallOutputPayload {
content: "exec command rejected by user".to_string(),
success: None,
},
@@ -1230,7 +1501,7 @@ async fn handle_container_exec_with_params(
SafetyCheck::Reject { reason } => {
return ResponseInputItem::FunctionCallOutput {
call_id,
output: crate::models::FunctionCallOutputPayload {
output: FunctionCallOutputPayload {
content: format!("exec command rejected: {reason}"),
success: None,
},
@@ -1278,7 +1549,7 @@ async fn handle_container_exec_with_params(
}
}
Err(CodexErr::Sandbox(error)) => {
handle_sanbox_error(error, sandbox_type, params, sess, sub_id, call_id).await
handle_sandbox_error(error, sandbox_type, params, sess, sub_id, call_id).await
}
Err(e) => {
// Handle non-sandbox errors
@@ -1293,7 +1564,7 @@ async fn handle_container_exec_with_params(
}
}
async fn handle_sanbox_error(
async fn handle_sandbox_error(
error: SandboxErr,
sandbox_type: SandboxType,
params: ExecParams,
@@ -1307,15 +1578,21 @@ async fn handle_sanbox_error(
call_id,
output: FunctionCallOutputPayload {
content: format!(
"failed in sandbox {:?} with execution error: {error}",
sandbox_type
"failed in sandbox {sandbox_type:?} with execution error: {error}"
),
success: Some(false),
},
};
}
// Ask the user to retry without sandbox
// Note that when `error` is `SandboxErr::Denied`, it could be a false
// positive. That is, it may have exited with a non-zero exit code, not
// because the sandbox denied it, but because that is its expected behavior,
// i.e., a grep command that did not match anything. Ideally we would
// include additional metadata on the command to indicate whether non-zero
// exit codes merit a retry.
// For now, we categorically ask the user to retry without sandbox.
sess.notify_background_event(&sub_id, format!("Execution failed: {error}"))
.await;
@@ -1757,7 +2034,7 @@ fn apply_changes_from_apply_patch(action: &ApplyPatchAction) -> anyhow::Result<A
})
}
fn get_writable_roots(cwd: &Path) -> Vec<std::path::PathBuf> {
fn get_writable_roots(cwd: &Path) -> Vec<PathBuf> {
let mut writable_roots = Vec::new();
if cfg!(target_os = "macos") {
// On macOS, $TMPDIR is private to the user.
@@ -1785,7 +2062,7 @@ fn get_writable_roots(cwd: &Path) -> Vec<std::path::PathBuf> {
}
/// Exec output is a pre-serialized JSON payload
fn format_exec_output(output: &str, exit_code: i32, duration: std::time::Duration) -> String {
fn format_exec_output(output: &str, exit_code: i32, duration: Duration) -> String {
#[derive(Serialize)]
struct ExecMetadata {
exit_code: i32,

View File

@@ -6,15 +6,16 @@ use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::util::notify_on_sigint;
use tokio::sync::Notify;
use uuid::Uuid;
/// Spawn a new [`Codex`] and initialize the session.
///
/// Returns the wrapped [`Codex`] **and** the `SessionInitialized` event that
/// is received as a response to the initial `ConfigureSession` submission so
/// that callers can surface the information to the UI.
pub async fn init_codex(config: Config) -> anyhow::Result<(Codex, Event, Arc<Notify>)> {
pub async fn init_codex(config: Config) -> anyhow::Result<(Codex, Event, Arc<Notify>, Uuid)> {
let ctrl_c = notify_on_sigint();
let (codex, init_id) = Codex::spawn(config, ctrl_c.clone()).await?;
let (codex, init_id, session_id) = Codex::spawn(config, ctrl_c.clone()).await?;
// The first event must be `SessionInitialized`. Validate and forward it to
// the caller so that they can display it in the conversation history.
@@ -33,5 +34,5 @@ pub async fn init_codex(config: Config) -> anyhow::Result<(Codex, Event, Arc<Not
));
}
Ok((codex, event, ctrl_c))
Ok((codex, event, ctrl_c, session_id))
}

View File

@@ -1,6 +1,10 @@
use crate::config_profile::ConfigProfile;
use crate::config_types::History;
use crate::config_types::McpServerConfig;
use crate::config_types::ReasoningEffort;
use crate::config_types::ReasoningSummary;
use crate::config_types::SandboxMode;
use crate::config_types::SandboxWorkplaceWrite;
use crate::config_types::ShellEnvironmentPolicy;
use crate::config_types::ShellEnvironmentPolicyToml;
use crate::config_types::Tui;
@@ -8,8 +12,8 @@ use crate::config_types::UriBasedFileOpener;
use crate::flags::OPENAI_DEFAULT_MODEL;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::built_in_model_providers;
use crate::openai_model_info::get_model_info;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPermission;
use crate::protocol::SandboxPolicy;
use dirs::home_dir;
use serde::Deserialize;
@@ -29,6 +33,12 @@ pub struct Config {
/// Optional override of model selection.
pub model: String,
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<u64>,
/// Maximum number of output tokens.
pub model_max_output_tokens: Option<u64>,
/// Key into the model_providers map that specifies which provider to use.
pub model_provider_id: String,
@@ -42,13 +52,21 @@ pub struct Config {
pub shell_environment_policy: ShellEnvironmentPolicy,
/// When `true`, `AgentReasoning` events emitted by the backend will be
/// suppressed from the frontend output. This can reduce visual noise when
/// users are only interested in the final agent responses.
pub hide_agent_reasoning: bool,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
/// who have opted into Zero Data Retention (ZDR).
pub disable_response_storage: bool,
/// User-provided instructions from instructions.md.
pub instructions: Option<String>,
pub user_instructions: Option<String>,
/// Base instructions override.
pub base_instructions: Option<String>,
/// Optional external notifier command. When set, Codex will spawn this
/// program after each completed *turn* (i.e. when the agent finishes
@@ -107,6 +125,24 @@ pub struct Config {
///
/// When this program is invoked, arg0 will be set to `codex-linux-sandbox`.
pub codex_linux_sandbox_exe: Option<PathBuf>,
/// If not "none", the value to use for `reasoning.effort` when making a
/// request using the Responses API.
pub model_reasoning_effort: ReasoningEffort,
/// If not "none", the value to use for `reasoning.summary` when making a
/// request using the Responses API.
pub model_reasoning_summary: ReasoningSummary,
/// When set to `true`, overrides the default heuristic and forces
/// `model_supports_reasoning_summaries()` to return `true`.
pub model_supports_reasoning_summaries: bool,
/// Base URL for requests to ChatGPT (as opposed to the OpenAI API).
pub chatgpt_base_url: String,
/// Experimental rollout resume path (absolute path to .jsonl; undocumented).
pub experimental_resume: Option<PathBuf>,
}
impl Config {
@@ -220,17 +256,23 @@ pub struct ConfigToml {
/// Provider to use from the model_providers map.
pub model_provider: Option<String>,
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<u64>,
/// Maximum number of output tokens.
pub model_max_output_tokens: Option<u64>,
/// Default approval policy for executing commands.
pub approval_policy: Option<AskForApproval>,
#[serde(default)]
pub shell_environment_policy: ShellEnvironmentPolicyToml,
// The `default` attribute ensures that the field is treated as `None` when
// the key is omitted from the TOML. Without it, Serde treats the field as
// required because we supply a custom deserializer.
#[serde(default, deserialize_with = "deserialize_sandbox_permissions")]
pub sandbox_permissions: Option<Vec<SandboxPermission>>,
/// Sandbox mode to use.
pub sandbox_mode: Option<SandboxMode>,
/// Sandbox configuration to apply if `sandbox` is `WorkspaceWrite`.
pub sandbox_workspace_write: Option<SandboxWorkplaceWrite>,
/// Disable server-side response storage (sends the full conversation
/// context with every request). Currently necessary for OpenAI customers
@@ -272,31 +314,44 @@ pub struct ConfigToml {
/// Collection of settings that are specific to the TUI.
pub tui: Option<Tui>,
/// When set to `true`, `AgentReasoning` events will be hidden from the
/// UI/output. Defaults to `false`.
pub hide_agent_reasoning: Option<bool>,
pub model_reasoning_effort: Option<ReasoningEffort>,
pub model_reasoning_summary: Option<ReasoningSummary>,
/// Override to force-enable reasoning summaries for the configured model.
pub model_supports_reasoning_summaries: Option<bool>,
/// Base URL for requests to ChatGPT (as opposed to the OpenAI API).
pub chatgpt_base_url: Option<String>,
/// Experimental rollout resume path (absolute path to .jsonl; undocumented).
pub experimental_resume: Option<PathBuf>,
/// Experimental path to a file whose contents replace the built-in BASE_INSTRUCTIONS.
pub experimental_instructions_file: Option<PathBuf>,
}
fn deserialize_sandbox_permissions<'de, D>(
deserializer: D,
) -> Result<Option<Vec<SandboxPermission>>, D::Error>
where
D: serde::Deserializer<'de>,
{
let permissions: Option<Vec<String>> = Option::deserialize(deserializer)?;
match permissions {
Some(raw_permissions) => {
let base_path = find_codex_home().map_err(serde::de::Error::custom)?;
let converted = raw_permissions
.into_iter()
.map(|raw| {
parse_sandbox_permission_with_base_path(&raw, base_path.clone())
.map_err(serde::de::Error::custom)
})
.collect::<Result<Vec<_>, D::Error>>()?;
Ok(Some(converted))
impl ConfigToml {
/// Derive the effective sandbox policy from the configuration.
fn derive_sandbox_policy(&self, sandbox_mode_override: Option<SandboxMode>) -> SandboxPolicy {
let resolved_sandbox_mode = sandbox_mode_override
.or(self.sandbox_mode)
.unwrap_or_default();
match resolved_sandbox_mode {
SandboxMode::ReadOnly => SandboxPolicy::new_read_only_policy(),
SandboxMode::WorkspaceWrite => match self.sandbox_workspace_write.as_ref() {
Some(s) => SandboxPolicy::WorkspaceWrite {
writable_roots: s.writable_roots.clone(),
network_access: s.network_access,
},
None => SandboxPolicy::new_workspace_write_policy(),
},
SandboxMode::DangerFullAccess => SandboxPolicy::DangerFullAccess,
}
None => Ok(None),
}
}
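
```rust
// The resolution order implemented by `derive_sandbox_policy` is: explicit
// override, then the TOML value, then the default. Reduced to its core with
// a hypothetical standalone enum:
#[derive(Clone, Copy, Default)]
enum SandboxMode {
    #[default]
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

fn resolve_mode(
    cli_override: Option<SandboxMode>,
    from_toml: Option<SandboxMode>,
) -> SandboxMode {
    cli_override.or(from_toml).unwrap_or_default()
}
```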
@@ -306,10 +361,11 @@ pub struct ConfigOverrides {
pub model: Option<String>,
pub cwd: Option<PathBuf>,
pub approval_policy: Option<AskForApproval>,
pub sandbox_policy: Option<SandboxPolicy>,
pub sandbox_mode: Option<SandboxMode>,
pub model_provider: Option<String>,
pub config_profile: Option<String>,
pub codex_linux_sandbox_exe: Option<PathBuf>,
pub base_instructions: Option<String>,
}
impl Config {
@@ -320,23 +376,24 @@ impl Config {
overrides: ConfigOverrides,
codex_home: PathBuf,
) -> std::io::Result<Self> {
let instructions = Self::load_instructions(Some(&codex_home));
let user_instructions = Self::load_instructions(Some(&codex_home));
// Destructure ConfigOverrides fully to ensure all overrides are applied.
let ConfigOverrides {
model,
cwd,
approval_policy,
sandbox_policy,
sandbox_mode,
model_provider,
config_profile: config_profile_key,
codex_linux_sandbox_exe,
base_instructions,
} = overrides;
let config_profile = match config_profile_key.or(cfg.profile) {
let config_profile = match config_profile_key.as_ref().or(cfg.profile.as_ref()) {
Some(key) => cfg
.profiles
.get(&key)
.get(key)
.ok_or_else(|| {
std::io::Error::new(
std::io::ErrorKind::NotFound,
@@ -347,20 +404,7 @@ impl Config {
None => ConfigProfile::default(),
};
let sandbox_policy = match sandbox_policy {
Some(sandbox_policy) => sandbox_policy,
None => {
// Derive a SandboxPolicy from the permissions in the config.
match cfg.sandbox_permissions {
// Note this means the user can explicitly set permissions
// to the empty list in the config file, granting it no
// permissions whatsoever.
Some(permissions) => SandboxPolicy::from(permissions),
// Default to read only rather than completely locked down.
None => SandboxPolicy::new_read_only_policy(),
}
}
};
let sandbox_policy = cfg.derive_sandbox_policy(sandbox_mode);
let mut model_providers = built_in_model_providers();
// Merge user-defined providers into the built-in list.
@@ -405,11 +449,30 @@ impl Config {
let history = cfg.history.unwrap_or_default();
let model = model
.or(config_profile.model)
.or(cfg.model)
.unwrap_or_else(default_model);
let openai_model_info = get_model_info(&model);
let model_context_window = cfg
.model_context_window
.or_else(|| openai_model_info.as_ref().map(|info| info.context_window));
let model_max_output_tokens = cfg.model_max_output_tokens.or_else(|| {
openai_model_info
.as_ref()
.map(|info| info.max_output_tokens)
});
let experimental_resume = cfg.experimental_resume;
let base_instructions = base_instructions.or(Self::get_base_instructions(
cfg.experimental_instructions_file.as_ref(),
));
let config = Self {
model: model
.or(config_profile.model)
.or(cfg.model)
.unwrap_or_else(default_model),
model,
model_context_window,
model_max_output_tokens,
model_provider_id,
model_provider,
cwd: resolved_cwd,
@@ -424,7 +487,8 @@ impl Config {
.or(cfg.disable_response_storage)
.unwrap_or(false),
notify: cfg.notify,
instructions,
user_instructions,
base_instructions,
mcp_servers: cfg.mcp_servers,
model_providers,
project_doc_max_bytes: cfg.project_doc_max_bytes.unwrap_or(PROJECT_DOC_MAX_BYTES),
@@ -433,6 +497,27 @@ impl Config {
file_opener: cfg.file_opener.unwrap_or(UriBasedFileOpener::VsCode),
tui: cfg.tui.unwrap_or_default(),
codex_linux_sandbox_exe,
hide_agent_reasoning: cfg.hide_agent_reasoning.unwrap_or(false),
model_reasoning_effort: config_profile
.model_reasoning_effort
.or(cfg.model_reasoning_effort)
.unwrap_or_default(),
model_reasoning_summary: config_profile
.model_reasoning_summary
.or(cfg.model_reasoning_summary)
.unwrap_or_default(),
model_supports_reasoning_summaries: cfg
.model_supports_reasoning_summaries
.unwrap_or(false),
chatgpt_base_url: config_profile
.chatgpt_base_url
.or(cfg.chatgpt_base_url)
.unwrap_or("https://chatgpt.com/backend-api/".to_string()),
experimental_resume,
};
Ok(config)
}
@@ -453,6 +538,15 @@ impl Config {
}
})
}
fn get_base_instructions(path: Option<&PathBuf>) -> Option<String> {
let path = path.as_ref()?;
std::fs::read_to_string(path)
.ok()
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
}
}
fn default_model() -> String {
@@ -494,50 +588,6 @@ pub fn log_dir(cfg: &Config) -> std::io::Result<PathBuf> {
Ok(p)
}
pub fn parse_sandbox_permission_with_base_path(
raw: &str,
base_path: PathBuf,
) -> std::io::Result<SandboxPermission> {
use SandboxPermission::*;
if let Some(path) = raw.strip_prefix("disk-write-folder=") {
return if path.is_empty() {
Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
"--sandbox-permission disk-write-folder=<PATH> requires a non-empty PATH",
))
} else {
use path_absolutize::*;
let file = PathBuf::from(path);
let absolute_path = if file.is_relative() {
file.absolutize_from(base_path)
} else {
file.absolutize()
}
.map(|path| path.into_owned())?;
Ok(DiskWriteFolder {
folder: absolute_path,
})
};
}
match raw {
"disk-full-read-access" => Ok(DiskFullReadAccess),
"disk-write-platform-user-temp-folder" => Ok(DiskWritePlatformUserTempFolder),
"disk-write-platform-global-temp-folder" => Ok(DiskWritePlatformGlobalTempFolder),
"disk-write-cwd" => Ok(DiskWriteCwd),
"disk-full-write-access" => Ok(DiskFullWriteAccess),
"network-full-access" => Ok(NetworkFullAccess),
_ => Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
format!(
"`{raw}` is not a recognised permission.\nRun with `--help` to see the accepted values."
),
)),
}
}
#[cfg(test)]
mod tests {
#![allow(clippy::expect_used, clippy::unwrap_used)]
@@ -547,51 +597,14 @@ mod tests {
use pretty_assertions::assert_eq;
use tempfile::TempDir;
/// Verify that the `sandbox_permissions` field on `ConfigToml` correctly
/// differentiates between a value that is completely absent in the
/// provided TOML (i.e. `None`) and one that is explicitly specified as an
/// empty array (i.e. `Some(vec![])`). This ensures that downstream logic
/// that treats these two cases differently (default read-only policy vs a
/// fully locked-down sandbox) continues to function.
#[test]
fn test_sandbox_permissions_none_vs_empty_vec() {
// Case 1: `sandbox_permissions` key is *absent* from the TOML source.
let toml_source_without_key = "";
let cfg_without_key: ConfigToml = toml::from_str(toml_source_without_key)
.expect("TOML deserialization without key should succeed");
assert!(cfg_without_key.sandbox_permissions.is_none());
// Case 2: `sandbox_permissions` is present but set to an *empty array*.
let toml_source_with_empty = "sandbox_permissions = []";
let cfg_with_empty: ConfigToml = toml::from_str(toml_source_with_empty)
.expect("TOML deserialization with empty array should succeed");
assert_eq!(Some(vec![]), cfg_with_empty.sandbox_permissions);
// Case 3: `sandbox_permissions` contains a non-empty list of valid values.
let toml_source_with_values = r#"
sandbox_permissions = ["disk-full-read-access", "network-full-access"]
"#;
let cfg_with_values: ConfigToml = toml::from_str(toml_source_with_values)
.expect("TOML deserialization with valid permissions should succeed");
assert_eq!(
Some(vec![
SandboxPermission::DiskFullReadAccess,
SandboxPermission::NetworkFullAccess
]),
cfg_with_values.sandbox_permissions
);
}
#[test]
fn test_toml_parsing() {
let history_with_persistence = r#"
[history]
persistence = "save-all"
"#;
let history_with_persistence_cfg: ConfigToml =
toml::from_str::<ConfigToml>(history_with_persistence)
.expect("TOML deserialization should succeed");
let history_with_persistence_cfg = toml::from_str::<ConfigToml>(history_with_persistence)
.expect("TOML deserialization should succeed");
assert_eq!(
Some(History {
persistence: HistoryPersistence::SaveAll,
@@ -605,9 +618,8 @@ persistence = "save-all"
persistence = "none"
"#;
let history_no_persistence_cfg: ConfigToml =
toml::from_str::<ConfigToml>(history_no_persistence)
.expect("TOML deserialization should succeed");
let history_no_persistence_cfg = toml::from_str::<ConfigToml>(history_no_persistence)
.expect("TOML deserialization should succeed");
assert_eq!(
Some(History {
persistence: HistoryPersistence::None,
@@ -617,20 +629,56 @@ persistence = "none"
);
}
/// Deserializing a TOML string containing an *invalid* permission should
/// fail with a helpful error rather than silently defaulting or
/// succeeding.
#[test]
fn test_sandbox_permissions_illegal_value() {
let toml_bad = r#"sandbox_permissions = ["not-a-real-permission"]"#;
fn test_sandbox_config_parsing() {
let sandbox_full_access = r#"
sandbox_mode = "danger-full-access"
let err = toml::from_str::<ConfigToml>(toml_bad)
.expect_err("Deserialization should fail for invalid permission");
[sandbox_workspace_write]
network_access = false # This should be ignored.
"#;
let sandbox_full_access_cfg = toml::from_str::<ConfigToml>(sandbox_full_access)
.expect("TOML deserialization should succeed");
let sandbox_mode_override = None;
assert_eq!(
SandboxPolicy::DangerFullAccess,
sandbox_full_access_cfg.derive_sandbox_policy(sandbox_mode_override)
);
// Make sure the error message contains the invalid value so users have
// useful feedback.
let msg = err.to_string();
assert!(msg.contains("not-a-real-permission"));
let sandbox_read_only = r#"
sandbox_mode = "read-only"
[sandbox_workspace_write]
network_access = true # This should be ignored.
"#;
let sandbox_read_only_cfg = toml::from_str::<ConfigToml>(sandbox_read_only)
.expect("TOML deserialization should succeed");
let sandbox_mode_override = None;
assert_eq!(
SandboxPolicy::ReadOnly,
sandbox_read_only_cfg.derive_sandbox_policy(sandbox_mode_override)
);
let sandbox_workspace_write = r#"
sandbox_mode = "workspace-write"
[sandbox_workspace_write]
writable_roots = [
"/tmp",
]
"#;
let sandbox_workspace_write_cfg = toml::from_str::<ConfigToml>(sandbox_workspace_write)
.expect("TOML deserialization should succeed");
let sandbox_mode_override = None;
assert_eq!(
SandboxPolicy::WorkspaceWrite {
writable_roots: vec![PathBuf::from("/tmp")],
network_access: false,
},
sandbox_workspace_write_cfg.derive_sandbox_policy(sandbox_mode_override)
);
}
struct PrecedenceTestFixture {
@@ -655,8 +703,7 @@ persistence = "none"
fn create_test_fixture() -> std::io::Result<PrecedenceTestFixture> {
let toml = r#"
model = "o3"
approval_policy = "unless-allow-listed"
sandbox_permissions = ["disk-full-read-access"]
approval_policy = "untrusted"
disable_response_storage = false
# Can be used to determine which profile to use if not specified by
@@ -668,11 +715,16 @@ name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
request_max_retries = 4 # retry failed HTTP requests
stream_max_retries = 10 # retry dropped SSE streams
stream_idle_timeout_ms = 300000 # 5m idle timeout
[profiles.o3]
model = "o3"
model_provider = "openai"
approval_policy = "never"
model_reasoning_effort = "high"
model_reasoning_summary = "detailed"
[profiles.gpt3]
model = "gpt-3.5-turbo"
@@ -703,6 +755,12 @@ disable_response_storage = true
env_key: Some("OPENAI_API_KEY".to_string()),
wire_api: crate::WireApi::Chat,
env_key_instructions: None,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(4),
stream_max_retries: Some(10),
stream_idle_timeout_ms: Some(300_000),
};
let model_provider_map = {
let mut model_provider_map = built_in_model_providers();
@@ -757,13 +815,15 @@ disable_response_storage = true
assert_eq!(
Config {
model: "o3".to_string(),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_provider_id: "openai".to_string(),
model_provider: fixture.openai_provider.clone(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: false,
instructions: None,
user_instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
@@ -774,6 +834,13 @@ disable_response_storage = true
file_opener: UriBasedFileOpener::VsCode,
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::High,
model_reasoning_summary: ReasoningSummary::Detailed,
model_supports_reasoning_summaries: false,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
experimental_resume: None,
base_instructions: None,
},
o3_profile_config
);
@@ -796,13 +863,15 @@ disable_response_storage = true
)?;
let expected_gpt3_profile_config = Config {
model: "gpt-3.5-turbo".to_string(),
model_context_window: Some(16_385),
model_max_output_tokens: Some(4_096),
model_provider_id: "openai-chat-completions".to_string(),
model_provider: fixture.openai_chat_completions_provider.clone(),
approval_policy: AskForApproval::UnlessAllowListed,
approval_policy: AskForApproval::UnlessTrusted,
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: false,
instructions: None,
user_instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
@@ -813,6 +882,13 @@ disable_response_storage = true
file_opener: UriBasedFileOpener::VsCode,
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
model_supports_reasoning_summaries: false,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
experimental_resume: None,
base_instructions: None,
};
assert_eq!(expected_gpt3_profile_config, gpt3_profile_config);
@@ -850,13 +926,15 @@ disable_response_storage = true
)?;
let expected_zdr_profile_config = Config {
model: "o3".to_string(),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_provider_id: "openai".to_string(),
model_provider: fixture.openai_provider.clone(),
approval_policy: AskForApproval::OnFailure,
sandbox_policy: SandboxPolicy::new_read_only_policy(),
shell_environment_policy: ShellEnvironmentPolicy::default(),
disable_response_storage: true,
instructions: None,
user_instructions: None,
notify: None,
cwd: fixture.cwd(),
mcp_servers: HashMap::new(),
@@ -867,6 +945,13 @@ disable_response_storage = true
file_opener: UriBasedFileOpener::VsCode,
tui: Tui::default(),
codex_linux_sandbox_exe: None,
hide_agent_reasoning: false,
model_reasoning_effort: ReasoningEffort::default(),
model_reasoning_summary: ReasoningSummary::default(),
model_supports_reasoning_summaries: false,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
experimental_resume: None,
base_instructions: None,
};
assert_eq!(expected_zdr_profile_config, zdr_profile_config);

Some files were not shown because too many files have changed in this diff.