## Summary
In collaboration with @gpeal: upgrade the onboarding flow, and persist
user settings.
---------
Co-authored-by: Gabriel Peal <gabriel@openai.com>
Trying to use `core` as the default has been "too clever." Users can
always take responsibility for controlling the env themselves, without this
setting at all, by specifying the `env` they want when calling `codex` in
the first place.
See https://github.com/openai/codex/issues/1249.
## Summary
- add a pulsing dot loader before the shimmering `Working` label in the
status indicator widget and include a small test asserting the spinner
character is rendered
- also fix a small bug in the ran command header by adding a space
between the ⚡ and `Ran command`
https://github.com/user-attachments/assets/6768c9d2-e094-49cb-ad51-44bcac10aa6f
## Testing
- `just fmt`
- `just fix` *(failed: E0658 `let` expressions in core/src/client.rs)*
- `cargo test --all-features` *(failed: E0658 `let` expressions in
core/src/client.rs)*
------
https://chatgpt.com/codex/tasks/task_i_68941bffdb948322b0f4190bc9dbe7f6
---------
Co-authored-by: aibrahim-oai <aibrahim@openai.com>
- `/status` renders
```
signed in with chatgpt
login: example@example.com
plan: plus
```
- Setup for using this info in a few more places.
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
## Summary
- support `codex logout` via new subcommand and helper that removes the
stored `auth.json`
- expose a `logout` function in `codex-login` and test it
- add `/logout` slash command in the TUI; command list is filtered when
not logged in and the handler deletes `auth.json` then exits
## Testing
- `just fix` *(fails: failed to get `diffy` from crates.io)*
- `cargo test --all-features` *(fails: failed to get `diffy` from
crates.io)*
------
https://chatgpt.com/codex/tasks/task_i_68945c3facac832ca83d48499716fb51
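A minimal sketch of what the `logout` helper could look like; the signature and the `codex_home` layout here are assumptions for illustration, not the actual `codex-login` API:
```rust
use std::path::Path;

/// Hedged sketch: delete the stored auth.json if present, reporting whether
/// anything was removed. (Signature is an assumption.)
pub fn logout(codex_home: &Path) -> std::io::Result<bool> {
    let auth_file = codex_home.join("auth.json");
    match std::fs::remove_file(&auth_file) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```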
- For the absolute token count, use non-cached input + output.
- For estimating what % of the model's context window is used, we need
to account for reasoning output tokens from prior turns being dropped
from the context window. We approximate this here by subtracting
reasoning output tokens from the total. This will be off for the current
turn and pending function calls. We can improve it later.
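A rough sketch of the estimate described above; the struct and field names are assumptions chosen for illustration, not the actual codex-rs types:
```rust
/// Hypothetical token-usage bookkeeping (names are assumptions).
struct TokenUsage {
    input_tokens: u64,
    cached_input_tokens: u64,
    output_tokens: u64,
    reasoning_output_tokens: u64,
}

impl TokenUsage {
    /// Absolute count: non-cached input plus output.
    fn total_tokens(&self) -> u64 {
        self.input_tokens - self.cached_input_tokens + self.output_tokens
    }

    /// Approximate percent of the context window used. Reasoning output tokens
    /// from prior turns are dropped from the window, so subtract them; this is
    /// off for the current turn and pending function calls.
    fn percent_of_context_window_used(&self, context_window: u64) -> u8 {
        let used = self.total_tokens().saturating_sub(self.reasoning_output_tokens);
        ((used as f64 / context_window as f64) * 100.0).min(100.0) as u8
    }
}

fn main() {
    let usage = TokenUsage {
        input_tokens: 9_000,
        cached_input_tokens: 4_000,
        output_tokens: 2_000,
        reasoning_output_tokens: 1_500,
    };
    println!("{}%", usage.percent_of_context_window_used(128_000));
}
```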
This will make `@` more discoverable (even though it is currently not
super useful; IMO it should be used to bring files into context from
outside the CWD).
---------
Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
Replaces the `include_default_writable_roots` option on
`sandbox_workspace_write` (that defaulted to `true`, which was slightly
weird/annoying) with `exclude_tmpdir_env_var`, which defaults to
`false`.
Though perhaps more importantly, `/tmp` is now enabled by default as part
of `sandbox_mode = "workspace-write"`; `exclude_slash_tmp = true` can be
used to disable this.
The cursor wasn't moving when inserting a file, so after insertion it was
not at the end of the filename.
This fixes it by moving the cursor to the end of the filename plus one
trailing space.
Example screenshot after selecting a file when typing `@`
<img width="823" height="268" alt="image"
src="https://github.com/user-attachments/assets/ec6e3741-e1ba-4752-89d2-11f14a2bd69f"
/>
This sets up the scaffolding and basic flow for a TUI onboarding
experience. It covers sign in with ChatGPT, env auth, as well as some
safety guidance.
Next up:
1. Replace the git warning screen
2. Use this to configure default approval/sandbox modes
Note the shimmer flashes are from me slicing the video, not jank.
https://github.com/user-attachments/assets/0fbe3479-fdde-41f3-87fb-a7a83ab895b8
## Summary
We have been returning `exit code 0` from the apply patch command when
writes fail, which causes our `exec` harness to pass back confusing
messages to the model. Instead, we should loudly fail so that the
harness and the model can handle these errors appropriately.
Also adds a test to confirm this behavior.
## Testing
- `cargo test -p codex-apply-patch`
- Arguably a bugfix as previously CTRL-Z didn't do anything.
- Only in TUI mode for now. This may make sense in other modes... to be
researched.
- The TUI runs the terminal in raw mode and the signals arrive as key
events, so we handle CTRL-Z as a key event just like CTRL-C.
- Not adding UI for it as a composer redesign is coming, and we can just
add it then.
- We should follow up by making a second CTRL-Z perform the native
terminal action.
- Updates the launch screen to:
```
>_ You are using OpenAI Codex in ~/code/codex/codex-rs
Try one of the following commands to get started:
1. /init - Create an AGENTS.md file with instructions for Codex
2. /status - Show current session configuration and token usage
3. /compact - Compact the chat history
4. /new - Start a new chat
```
- These aren't the perfect commands, but as more land soon we can
update the list.
- We should also add logic later to make /init only show when there's no
existing AGENTS.md.
- Majorly need to iterate on copy.
<img width="905" height="769" alt="image"
src="https://github.com/user-attachments/assets/5912939e-fb0e-4e76-94ff-785261e2d6ee"
/>
## Summary
- Prioritize provider-specific API keys over default Codex auth when
building requests
- Add test to ensure provider env var auth overrides default auth
## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*
------
https://chatgpt.com/codex/tasks/task_i_68926a104f7483208f2c8fd36763e0e3
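A hedged sketch of the precedence rule with invented names; the real request-building code differs:
```rust
/// A provider-specific env var API key wins over the default Codex auth when
/// building a request. (Function and parameter names are assumptions.)
fn choose_api_key(provider_env_var: Option<&str>, default_codex_auth: Option<String>) -> Option<String> {
    provider_env_var
        .and_then(|var| std::env::var(var).ok())
        .or(default_codex_auth)
}

fn main() {
    // With HYPOTHETICAL_PROVIDER_KEY set, that key is used; otherwise we fall
    // back to the default Codex auth.
    let key = choose_api_key(Some("HYPOTHETICAL_PROVIDER_KEY"), Some("codex-default".to_string()));
    println!("{key:?}");
}
```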
The docs and code do not match. It turns out the docs are "right" in that
they are what we have been meaning to support, so this PR updates the
code:
ae88b69b09/README.md (L298-L302)
Support for `instructions.md` is a holdover from the TypeScript CLI, so
we are just going to drop support for it altogether rather than maintain
it in perpetuity.
## Summary
Forgot to remove this in #1869 last night! Too much of a performance hit
on the main thread. We can bring it back via an async thread on startup.
## Summary
Includes a new user message in the api payload which provides useful
environment context for the model, so it knows about things like the
current working directory and the sandbox.
## Testing
Updated unit tests
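A small sketch of the idea, assuming a simple tag-wrapped text message; the actual field names and wording in the payload are not shown here:
```rust
/// Hedged sketch: compose the environment-context message included in the
/// API payload (tag name and contents are assumptions).
fn environment_context_message(cwd: &std::path::Path, sandbox_mode: &str) -> String {
    format!(
        "<environment_context>\ncwd: {}\nsandbox: {}\n</environment_context>",
        cwd.display(),
        sandbox_mode
    )
}

fn main() {
    let msg = environment_context_message(std::path::Path::new("/home/user/project"), "workspace-write");
    println!("{msg}");
}
```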
## Summary
Have seen these tests flaking over the course of today on different
boxes. `wiremock` seems to be generally written with tokio/threads in
mind but based on the weird panics from the tests, let's see if this
helps.
- Added a `/status` command, which will be useful when we update the
home screen to print less status.
- Moved `create_config_summary_entries` to common since it's used in a
few places.
- Noticed we inconsistently had periods in slash command descriptions
and just removed them everywhere.
- Noticed the diff description was overflowing so made it shorter.
This PR attempts to break `codex-rust-review.md` into sections so that
it is easier to consume.
It also adds a healthy new section on "Assertions in Tests" that has
been on my mind for a while.
This script attempts to verify that:
- You have no local, uncommitted changes.
- You are on `main`
- The commit you are on also exists on `main` at the origin
`https://github.com/openai/codex`, i.e., it is not a commit that exists
only on your local version of `main`
As part of this, the script tries to print a better error message if/when
these conditions are violated.
Hardcoding to `prerelease: true` is a holdover from before we had
migrated to the Rust CLI for releases and decided on how we were doing
version numbers.
To date, I have had to change the release status from "prerelease" to
"actual release" manually through the GitHub Releases web page. This is
a semi-serious problem because I've discovered that it messes up
Homebrew's automation if the version number _looks_ like a real release
but turns out to be a prerelease. The release potentially gets skipped
from being published on Homebrew, so it's important to set the value
correctly from the start.
I verified that `steps.release_name.outputs.name` does not include the
`rust-v` prefix from the tag name.
https://github.com/openai/codex/pull/1868 is a related fix that was in
flight simultaneously, but after talking to @easong-openai, this:
- logs instead of renders for `BackgroundEvent`
- logs for `TurnDiff`
- renders for `PatchApplyEnd`
## Summary
A split-up PR of #1763, stacked on top of a tools refactor #1858 to
make the change clearer. From the previous summary:
> Let's try something new: tell the model about the sandbox, and let it
decide when it will need to break the sandbox. Some local testing
suggests that it works pretty well with zero iteration on the prompt!
## Testing
- [x] Added unit tests
- [x] Tested locally and it appears to work smoothly!
## Summary
In an effort to make tools easier to work with and more configurable,
I'm introducing `ToolConfig` and updating `Prompt` to take in a general
list of Tools. I think this is simpler and better for a few reasons:
- We can easily assemble tools from various sources (our own harness,
MCP servers, etc.) and we can consolidate the construction logic in one
place that is separate from serialization.
- client.rs no longer needs arbitrary config values; it just takes in a
list of tools to serialize
A hefty portion of the PR is now updating our conversion of
`mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest
accurately called this out as a TODO long ago, I think it's time we
tackled it.
## Testing
- [x] Experimented locally, no changes, as expected
- [x] Added additional unit tests
- [x] Responded to rust-review
Prior to this PR, `ShutdownComplete` was not being handled correctly
in `codex exec`, so it always ended up printing the following to stderr:
```
ERROR codex_exec: Error receiving event: InternalAgentDied
```
Because we were not breaking out of the loop for `ShutdownComplete`,
inevitably `codex.next_event()` would get called again and
`rx_event.recv()` would fail and the error would get mapped to
`InternalAgentDied`:
ea7d3f27bd/codex-rs/core/src/codex.rs (L190-L197)
For reference, https://github.com/openai/codex/pull/1647 introduced the
`ShutdownComplete` variant.
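A simplified, synchronous sketch of the fix; the real `codex exec` loop is async and uses the codex-core event types:
```rust
/// Simplified stand-in for the real event type (an assumption for illustration).
enum EventMsg {
    AgentMessage(String),
    ShutdownComplete,
}

fn run_event_loop(rx: std::sync::mpsc::Receiver<EventMsg>) {
    loop {
        match rx.recv() {
            // The key point: break on ShutdownComplete instead of looping
            // around and calling recv() on a channel that is about to close.
            Ok(EventMsg::ShutdownComplete) => break,
            Ok(EventMsg::AgentMessage(text)) => println!("{text}"),
            Err(_) => {
                // With the break above, we no longer hit this path after shutdown.
                eprintln!("Error receiving event: InternalAgentDied");
                break;
            }
        }
    }
}

fn main() {
    let (tx, rx) = std::sync::mpsc::channel();
    tx.send(EventMsg::AgentMessage("hello".to_string())).unwrap();
    tx.send(EventMsg::ShutdownComplete).unwrap();
    run_event_loop(rx);
}
```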
## Summary
Escalating out of sandbox is (almost always) not going to fix
long-running commands timing out - therefore we should just pass the
failure back to the model instead of asking the user to re-run a command
that took a long time anyway.
## Testing
- [x] Ran locally with a timeout and confirmed this worked as expected
This PR started as an investigation with the goal of eliminating the use
of `unsafe { std::env::set_var() }` in `ollama/src/client.rs`, as
setting environment variables in a multithreaded context is indeed
unsafe, and these tests were observed to be flaky as a result.
Though as I dug deeper into the issue, I discovered that the logic for
instantiating `OllamaClient` under test scenarios was not quite right.
In this PR, I aimed to:
- share more code between the two creation codepaths,
`try_from_oss_provider()` and `try_from_provider_with_base_url()`
- use the values from `Config` when setting up Ollama, as we have
various mechanisms for overriding config values, so we should be sure
that we are always using the ultimate `Config` for things such as the
`ModelProviderInfo` associated with the `oss` id
Once this was in place,
`OllamaClient::try_from_provider_with_base_url()` could be used in unit
tests for `OllamaClient` so it was possible to create a properly
configured client without having to set environment variables.
I ended up force-pushing https://github.com/openai/codex/pull/1848
because CI jobs were not being triggered after updating the PR on
GitHub, so this spelling error sneaked through.
This adds support for easily running Codex backed by a local Ollama
instance running our new open source models. See
https://github.com/openai/gpt-oss for details.
If you pass in `--oss` you'll be prompted to install/launch ollama, and
it will automatically download the 20b model and attempt to use it.
We'll likely want to expand this with some options later to make the
experience smoother for users who can't run the 20b or want to run the
120b.
Co-authored-by: Michael Bolin <mbolin@openai.com>
https://github.com/openai/codex/pull/1835 has some messed up history.
This adds support for streaming chat completions, which is useful for Ollama. We should probably cast a very skeptical eye on the code introduced in this PR.
---------
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
To date, we have a number of hardcoded OpenAI model slug checks spread
throughout the codebase, which makes it hard to audit the various
special cases for each model. To mitigate this issue, this PR introduces
the idea of a `ModelFamily` that has fields to represent the existing
special cases, such as `supports_reasoning_summaries` and
`uses_local_shell_tool`.
There is a `find_family_for_model()` function that maps the raw model
slug to a `ModelFamily`. This function hardcodes all the knowledge about
the special attributes for each model. This PR then replaces the
hardcoded model name checks with checks against a `ModelFamily`.
Note `ModelFamily` is now available as `Config::model_family`. We should
ultimately remove `Config::model` in favor of
`Config::model_family::slug`.
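A hedged sketch of the shape of `ModelFamily` and `find_family_for_model()`; the fields follow the description above, and the function body is a placeholder rather than the real per-model table:
```rust
/// One place that records each model's special-case attributes, keyed off the
/// raw model slug. (Exact fields beyond those named above are assumptions.)
struct ModelFamily {
    slug: String,
    supports_reasoning_summaries: bool,
    uses_local_shell_tool: bool,
}

fn find_family_for_model(slug: &str) -> ModelFamily {
    // Illustrative placeholder: the real function hardcodes the knowledge
    // about each model's special attributes.
    ModelFamily {
        slug: slug.to_string(),
        supports_reasoning_summaries: false,
        uses_local_shell_tool: false,
    }
}

fn main() {
    let family = find_family_for_model("some-model-slug");
    println!("{} {}", family.slug, family.supports_reasoning_summaries);
    let _ = family.uses_local_shell_tool;
}
```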
Stream the model's thoughts and responses instead of waiting for the whole
thing to come through. Very rough right now, but I'm making the risk call to push through.
## Summary
Our recent change in #1737 can sometimes lead to the model confusing
AGENTS.md context as part of the message. But a little prompting and
formatting can help fix this!
## Testing
- Ran locally with a few different prompts to verify the model
behaves well.
- Updated unit tests
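A sketch of the kind of formatting this refers to, assuming a simple delimiter-based wrapper; the actual prompt wording is an assumption:
```rust
/// Hedged sketch: wrap AGENTS.md contents in explicit markers so the model
/// does not confuse project guidance with the user's message.
fn format_user_instructions(agents_md: &str) -> String {
    format!(
        "<user_instructions>\n\nThe following is project guidance from AGENTS.md, \
not part of the user's message:\n\n{agents_md}\n\n</user_instructions>"
    )
}

fn main() {
    println!("{}", format_user_instructions("Run `just fmt` before committing."));
}
```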
Previously, `codex exec` was printing `Warning: no file to write last
message to` as a warning to stderr even though `--output-last-message`
was not specified, which is wrong. This fixes the code and changes
`handle_last_message()` so that it is only called when
`last_message_path` is `Some`.
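A minimal sketch of the corrected behavior (the function name and signature here are assumptions):
```rust
/// Hedged sketch: only write the last message when --output-last-message was
/// given; otherwise do nothing and emit no warning.
fn maybe_write_last_message(last_message: &str, last_message_path: Option<&std::path::Path>) -> std::io::Result<()> {
    if let Some(path) = last_message_path {
        std::fs::write(path, last_message)?;
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    // No path supplied: nothing is written and nothing is printed to stderr.
    maybe_write_last_message("final agent message", None)
}
```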
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.141 to
1.0.142.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.142</h2>
<ul>
<li>impl Default for &Value (<a
href="https://redirect.github.com/serde-rs/json/issues/1265">#1265</a>,
thanks <a
href="https://github.com/aatifsyed"><code>@aatifsyed</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1731167cd5"><code>1731167</code></a>
Release 1.0.142</li>
<li><a
href="e51c81450a"><code>e51c814</code></a>
Touch up PR 1265</li>
<li><a
href="84abbdb613"><code>84abbdb</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1265">#1265</a>
from aatifsyed/master</li>
<li><a
href="9206cc0150"><code>9206cc0</code></a>
feat: impl Default for &Value</li>
<li>See full diff in <a
href="https://github.com/serde-rs/json/compare/v1.0.141...v1.0.142">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.2 to 0.9.4.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2126e6af51"><code>2126e6a</code></a>
chore: Release</li>
<li><a
href="fa2100a888"><code>fa2100a</code></a>
docs: Update changelog</li>
<li><a
href="0c75bbd6f7"><code>0c75bbd</code></a>
feat(toml): Expose DeInteger/DeFloat as_str/radix (<a
href="https://redirect.github.com/toml-rs/toml/issues/1021">#1021</a>)</li>
<li><a
href="e3d64dff47"><code>e3d64df</code></a>
feat(toml): Expose DeFloat::as_str</li>
<li><a
href="ffdd211033"><code>ffdd211</code></a>
feat(toml): Expose DeInteger::as_str/radix</li>
<li><a
href="9e7adcc7fa"><code>9e7adcc</code></a>
docs(readme): Fix links to crates (<a
href="https://redirect.github.com/toml-rs/toml/issues/1020">#1020</a>)</li>
<li><a
href="73d04e20b5"><code>73d04e2</code></a>
docs(readme): Fix links to crates</li>
<li><a
href="da667e8a7d"><code>da667e8</code></a>
chore: Release</li>
<li><a
href="b1327fbe7c"><code>b1327fb</code></a>
docs: Update changelog</li>
<li><a
href="fb5346827e"><code>fb53468</code></a>
fix(toml): Don't enable std in toml_writer (<a
href="https://redirect.github.com/toml-rs/toml/issues/1019">#1019</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/toml-rs/toml/compare/toml-v0.9.2...toml-v0.9.4">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Added:
* C-m for newline (not sure if this is actually treated differently from
Enter, but tui-textarea handles it and it doesn't hurt)
* C-d to delete one char forwards (same as Del)
* A-bksp to delete backwards one word
* A-arrows to navigate by word
The existing prompt is really bad. As a low-hanging fruit, let's correct
the apply_patch instructions - this helps smaller models successfully
apply patches.
Allows users to set their experimental_instructions_file in configs.
For example the below enables experimental instructions when running
`codex -p foo`.
```
[profiles.foo]
experimental_instructions_file = "/Users/foo/.codex/prompt.md"
```
# Testing
- ✅ Running against a profile with experimental_instructions_file works.
- ✅ Running against a profile without experimental_instructions_file
works.
- ✅ Running against no profile with experimental_instructions_file
works.
- ✅ Running against no profile without experimental_instructions_file
works.
This lets us show an accumulating diff across all patches in a turn.
Refer to the docs for TurnDiffTracker for implementation details.
There are multiple ways this could have been done and this felt like the
right tradeoff between reliability and completeness:
*Pros*
* It will pick up all changes to files that the model touched, including
changes made by prettier or another command that updates them.
* It will not pick up changes made by the user or other agents to files
it didn't modify.
*Cons*
* It will pick up changes that the user made to a file that the model
also touched
* It will not pick up changes to codegen or files that were not modified
with apply_patch
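To make the tradeoff above concrete, here is a hedged sketch of the baseline-snapshot idea behind TurnDiffTracker; the names and the (omitted) hunk generation are assumptions, not the real implementation:
```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Snapshot each file's contents the first time the model touches it, then
/// diff the current contents against that baseline for the turn's
/// cumulative diff.
#[derive(Default)]
struct TurnDiffTracker {
    baselines: HashMap<PathBuf, String>,
}

impl TurnDiffTracker {
    fn on_patch_touches(&mut self, path: PathBuf) {
        self.baselines
            .entry(path.clone())
            .or_insert_with(|| std::fs::read_to_string(&path).unwrap_or_default());
    }

    fn unified_diff(&self) -> String {
        let mut out = String::new();
        for (path, baseline) in &self.baselines {
            let current = std::fs::read_to_string(path).unwrap_or_default();
            if current != *baseline {
                out.push_str(&format!("--- {p}\n+++ {p}\n", p = path.display()));
                // A real implementation would emit hunks here (e.g. via a diff crate).
            }
        }
        out
    }
}

fn main() {
    let mut tracker = TurnDiffTracker::default();
    tracker.on_patch_touches(PathBuf::from("src/main.rs"));
    println!("{}", tracker.unified_diff());
}
```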
## Summary
Users frequently complain about re-approving commands that have failed
for non-sandbox reasons. We can't diagnose with complete accuracy which
errors happened because of a sandbox failure, but we can start to
eliminate some common simple cases.
This PR captures the most common case I've seen, which is a `command not
found` error.
## Testing
- [x] Added unit tests
- [x] Ran a few cases locally
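A minimal sketch of the kind of check this adds, under the assumption that `command not found` surfaces as exit code 127 or in stderr; the real classifier may look different:
```rust
/// Hedged sketch: classify an exec failure that escalating out of the sandbox
/// will not fix, e.g. a missing executable.
fn is_likely_not_sandbox_error(exit_code: i32, stderr: &str) -> bool {
    // 127 is the conventional "command not found" exit code from most shells.
    exit_code == 127 || stderr.contains("command not found")
}

fn main() {
    assert!(is_likely_not_sandbox_error(127, "bash: foo: command not found"));
    assert!(!is_likely_not_sandbox_error(1, "Permission denied"));
}
```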
This replaces tui-textarea with a custom textarea component.
Key differences:
1. wrapped lines
2. better unicode handling
3. uses the native terminal cursor
This should perhaps be spun out into its own separate crate at some
point, but for now it's convenient to have it in-tree.
The following test script fails in the codex sandbox:
```
import multiprocessing
from multiprocessing import Lock, Process
def f(lock):
    with lock:
        print("Lock acquired in child process")

if __name__ == '__main__':
    lock = Lock()
    p = Process(target=f, args=(lock,))
    p.start()
    p.join()
```
with
```
Traceback (most recent call last):
  File "/Users/david.hao/code/codex/codex-rs/cli/test.py", line 9, in <module>
    lock = Lock()
           ^^^^^^
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/context.py", line 68, in Lock
    return Lock(ctx=self.get_context())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/synchronize.py", line 169, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/synchronize.py", line 57, in __init__
    sl = self._semlock = _multiprocessing.SemLock(
         ^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 1] Operation not permitted
```
After some reading, adding this line to the sandbox configs fixes things:
macOS multiprocessing appears to use sem_lock(), which opens an IPC
object that is considered a disk write even though no file is created. I
interrogated ChatGPT about whether it's okay to loosen this, and my
impression after reading is that it is, although I would appreciate a
close look.
Breadcrumb: You can run `cargo run -- debug seatbelt --full-auto <cmd>`
to test the sandbox
To make `--full-auto` safer, this PR updates the Seatbelt policy so that
a `SandboxPolicy` with a `writable_root` that contains a `.git/`
_directory_ will make `.git/` _read-only_ (though as a follow-up, we
should also consider the case where `.git` is a _file_ with a `gitdir:
/path/to/actual/repo/.git` entry that should also be protected).
The two major changes in this PR:
- Updating `SandboxPolicy::get_writable_roots_with_cwd()` to return a
`Vec<WritableRoot>` instead of a `Vec<PathBuf>` where a `WritableRoot`
can specify a list of read-only subpaths.
- Updating `create_seatbelt_command_args()` to honor the read-only
subpaths in `WritableRoot`.
The logic to update the policy is a fairly straightforward update to
`create_seatbelt_command_args()`, but perhaps the more interesting part
of this PR is the introduction of an integration test in
`tests/sandbox.rs`. Leveraging the new API in #1785, we test
`SandboxPolicy` under various conditions, including ones where `$TMPDIR`
is not readable, which is critical for verifying the new behavior.
To ensure that Codex can run its own tests, e.g.:
```
just codex debug seatbelt --full-auto -- cargo test if_git_repo_is_writable_root_then_dot_git_folder_is_read_only
```
I had to introduce the use of `CODEX_SANDBOX=sandbox`, which is
comparable to how `CODEX_SANDBOX_NETWORK_DISABLED=1` was already being
used.
Adding a comparable change for Landlock will be done in a subsequent PR.
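A hedged sketch of the new `WritableRoot` shape and the `.git/` rule; field names are assumptions based on the description above, not the exact codex-rs types:
```rust
use std::path::PathBuf;

/// A writable root plus subpaths that must remain read-only inside it.
struct WritableRoot {
    root: PathBuf,
    read_only_subpaths: Vec<PathBuf>,
}

fn writable_root_for(root: PathBuf) -> WritableRoot {
    // If the writable root contains a `.git/` directory, mark it read-only.
    let dot_git = root.join(".git");
    let read_only_subpaths = if dot_git.is_dir() { vec![dot_git] } else { Vec::new() };
    WritableRoot { root, read_only_subpaths }
}

fn main() {
    let root = writable_root_for(PathBuf::from("."));
    println!("{} read-only subpaths under {}", root.read_only_subpaths.len(), root.root.display());
}
```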
Introduce conversation.create handler (handle_create_conversation) and
wire it in MessageProcessor.
Stack:
Top: #1783
Bottom: #1784
---------
Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
Without this change, it is challenging to create integration tests to
verify that the folders not included in `writable_roots` in
`SandboxPolicy::WorkspaceWrite` are read-only because, by default,
`get_writable_roots_with_cwd()` includes `TMPDIR`, which is where most
integration tests do their work.
This introduces a `use_exact_writable_roots` option to disable the
default includes returned by `get_writable_roots_with_cwd()`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1785).
* #1765
* __->__ #1785
## Summary
- stream command stdout as `ExecCommandStdout` events
- forward streamed stdout to clients and ignore in human output
processor
- adjust call sites for new streaming API
This fixes a bug in insert_history_lines where writing
`Line::from(vec!["A".bold(), "B".into()])` would write "B" as bold,
because "B" didn't explicitly subtract bold.
- MCP server: add send-user-message tool to send user input to a running
Codex session
- Added integration tests for the happy and sad paths
Changes:
• Add tool definition and schema.
• Expose tool in capabilities.
• Route and handle tool requests with validation.
• Tests for success, bad UUID, and missing session.
Follow-ups:
• Listen path not implemented yet; the tool is present but marked “don’t
use yet” in code comments.
• Session run flag reset: clear running_session_id_set appropriately
after turn completion/errors.
This is the third PR in a stack.
Stack:
Final: #1686
Intermediate: #1751
First: #1750
- Add an operation to summarize the context so far.
- The operation runs a compact task that summarizes the context.
- The operation clears the previous context to free the context window.
- The operation doesn't use `run_task`, to avoid corrupting the session.
- Add /compact in the tui
https://github.com/user-attachments/assets/e06c24e5-dcfb-4806-934a-564d425a919c
- Expose mcp_protocol from mcp-server for reuse in tests and callers.
- In MessageProcessor, detect structured ToolCallRequestParams in
tools/call and forward to a new handler.
- Add handle_new_tool_calls scaffold (returns error for now).
- Test helper: add send_send_user_message_tool_call to McpProcess to
send ConversationSendMessage requests;
This is the second PR in a stack.
Stack:
Final: #1686
Intermediate: #1751
First: #1750
# Summary
- Align MCP server responses with mcp_types by emitting [CallToolResult,
RequestId] instead of an object.
- Update the send-message result to a tagged enum: Ok or Error { message }.
# Why
Protocol compliance with current MCP schema.
# Tests
- Updated assertions in mcp_protocol.rs for create/stream/send/list and
error cases.
This is the first PR in a stack.
Stack:
Final: #1686
Intermediate: #1751
First: #1750
This delays the call to insert_history_lines until a redraw is
happening. Crucially, the new lines are inserted _after the viewport is
resized_. This results in fewer stray blank lines below the viewport
when modals (e.g. user approval) are closed.
When we enabled KKP in https://github.com/openai/codex/pull/1743, we
started receiving key-up events but didn't expect them anywhere in our
code. For now, just don't dispatch them at all.
At 550 lines, `exec.rs` was a bit large. In particular, I found it hard
to locate the Seatbelt-related code quickly without a file with
`seatbelt` in the name, so this refactors things so:
- `spawn_command_under_seatbelt()` and dependent code moves to a new
`seatbelt.rs` file
- `spawn_child_async()` and dependent code moves to a new `spawn.rs`
file
This is a follow-up to https://github.com/openai/codex/pull/1705, as
that PR inadvertently lost the logic where `PatchApplyBeginEvent` and
`PatchApplyEndEvent` events were sent when patches were auto-approved.
Though as part of this fix, I believe this also makes an important
safety fix to `assess_patch_safety()`, as there was a case that returned
`SandboxType::None`, which arguably is the thing we were trying to avoid
in #1705.
On a high level, we want there to be only one codepath where
`apply_patch` happens, which should be unified with the codepath used to
run `exec`, in general, so that sandboxing is applied consistently for
both cases.
Prior to this change, `apply_patch()` in `core` would either:
* exit early, delegating to `exec()` to shell out to `apply_patch` using
the appropriate sandbox
* proceed to run the logic for `apply_patch` in memory
549846b29a/codex-rs/core/src/apply_patch.rs (L61-L63)
In this implementation, only the latter would dispatch
`PatchApplyBeginEvent` and `PatchApplyEndEvent`, though the former would
dispatch `ExecCommandBeginEvent` and `ExecCommandEndEvent` for the
`apply_patch` call (or, more specifically, the `codex
--codex-run-as-apply-patch PATCH` call).
To unify things in this PR, we:
* Eliminate the back half of the `apply_patch()` function, and instead
have it also return with `DelegateToExec`, though we add an extra field
to the return value, `user_explicitly_approved_this_action`.
* In `codex.rs` where we process `DelegateToExec`, we use
`SandboxType::None` when `user_explicitly_approved_this_action` is
`true`. This means **we no longer run the apply_patch logic in memory**,
as we always `exec()`. (Note this is what allowed us to delete so much
code in `apply_patch.rs`.)
* In `codex.rs`, we further update `notify_exec_command_begin()` and
`notify_exec_command_end()` to take additional fields to determine what
type of notification to send: `ExecCommand` or `PatchApply`.
Admittedly, this PR also drops some of the functionality about giving
the user the opportunity to expand the set of writable roots as part of
approving the `apply_patch` command. I'm not sure how much that was
used, and we should probably rethink how that works as we are currently
tidying up the protocol to the TUI, in general.
This fixes a couple of panics that would happen when trying to render
something larger than the terminal, or when inserting history lines while
the top of the viewport is at y=0.
The git tests were failing on my local machine due to gpg signing config
in my ~/.gitconfig. Tests should not be affected by ~/.gitconfig, so
configure them to ignore it.
Simplify and improve many UI elements.
* Remove all-around borders in most places. These interact badly with
terminal resizing and look heavy. Prefer left-side-only borders.
* Make the viewport adjust to the size of its contents.
* <kbd>/</kbd> and <kbd>@</kbd> autocomplete boxes appear below the
prompt, instead of above it.
* Restyle the keyboard shortcut hints & move them to the left.
* Restyle the approval dialog.
* Use synchronized rendering to avoid flashing during rerenders.
https://github.com/user-attachments/assets/96f044af-283b-411c-b7fc-5e6b8a433c20
<img width="1117" height="858" alt="Screenshot 2025-07-30 at 5 29 20 PM"
src="https://github.com/user-attachments/assets/0cc0af77-8396-429b-b6ee-9feaaccdbee7"
/>
The goal of this change is to try an experiment where we try to get AI
to take on more of the code review load. The idea is that once you
believe your PR is ready for review, please add the `codex-rust-review`
label (as opposed to the `codex-review` label).
Admittedly the corresponding prompt currently represents my personal
biases in terms of code review, but we should massage it over time to
represent the team's preferences.
Proof of concept for a resizable viewport.
The general approach here is to duplicate the `Terminal` struct from
ratatui, but with our own logic. This is a "light fork" in that we are
still using all the base ratatui functions (`Buffer`, `Widget` and so
on), but we're doing our own bookkeeping at the top level to determine
where to draw everything.
This approach could use improvement. E.g., when the window is resized to a
smaller size, if the UI wraps, we don't correctly clear out the
artifacts from wrapping. This is possible with a little work (i.e.
tracking what parts of our UI would have been wrapped), but this
behavior is at least on par with the existing behavior.
https://github.com/user-attachments/assets/4eb17689-09fd-4daa-8315-c7ebc654986d
cc @joshka who might have Thoughts™
Building on the work of https://github.com/openai/codex/pull/1702, this
changes how a shell call to `apply_patch` is handled.
Previously, a shell call to `apply_patch` was always handled in-process,
never leveraging a sandbox. To determine whether the `apply_patch`
operation could be auto-approved, the
`is_write_patch_constrained_to_writable_paths()` function would check if
all the paths listed in the patch were writable. If so, the agent would
apply the changes listed in the patch.
Unfortunately, this approach afforded a loophole: symlinks!
* For a soft link, we could fix this issue by tracing the link and
checking whether the target is in the set of writable paths, however...
* ...For a hard link, things are not as simple. We can run `stat FILE`
to see if the number of links is greater than 1, but then we would have
to do something potentially expensive like `find . -inum <inode_number>`
to find the other paths for `FILE`. Further, even if this worked, this
approach runs the risk of a
[TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
race condition, so it is not robust.
The solution, implemented in this PR, is to turn the virtual execution
of the `apply_patch` CLI into an _actual_ execution using `codex
--codex-run-as-apply-patch PATCH`, which we can run under the sandbox
the user specified, just like any other `shell` call.
This, of course, assumes that the sandbox prevents writing through
symlinks as a mechanism to write to folders that are not in the writable
set configured by the sandbox. I verified this by testing the following
on both Mac and Linux:
```shell
#!/usr/bin/env bash
set -euo pipefail
# Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?
# Codex is run in SANDBOX_DIR, so writes should be constrained to this directory.
SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
# EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
echo "SANDBOX_DIR: $SANDBOX_DIR"
echo "EXPLOIT_DIR: $EXPLOIT_DIR"
cleanup() {
# Only remove if it looks sane and still exists
[[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
[[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
}
trap cleanup EXIT
echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"
# Drop the -s to test hard links.
ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"
cat "${SANDBOX_DIR}/link-to-original.txt"
if [[ "$(uname)" == "Linux" ]]; then
SANDBOX_SUBCOMMAND=landlock
else
SANDBOX_SUBCOMMAND=seatbelt
fi
# Attempt the exploit
cd "${SANDBOX_DIR}"
codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true
cat "${EXPLOIT_DIR}/original.txt"
```
Admittedly, this change merits a proper integration test, but I think I
will have to do that in a follow-up PR.
Adds a `CodexAuth` type that encapsulates information about available
auth modes and logic for refreshing the token.
Changes `Responses` API to send requests to different endpoints based on
the auth type.
Updates login_with_chatgpt to support API-less mode and skip the key
exchange.
This adds a tool the model can call to update a plan. The tool doesn't
actually _do_ anything but it gives clients a chance to read and render
the structured plan. We will likely iterate on the prompt and tools
exposed for planning over time.
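A hedged sketch of the structured plan payload clients might render; these field names are assumptions for illustration only:
```rust
/// Hypothetical shape of the plan the tool receives and clients render.
#[derive(Debug)]
enum StepStatus {
    Pending,
    InProgress,
    Completed,
}

#[derive(Debug)]
struct PlanItem {
    step: String,
    status: StepStatus,
}

#[derive(Debug)]
struct UpdatePlanArgs {
    explanation: Option<String>,
    plan: Vec<PlanItem>,
}

fn main() {
    let args = UpdatePlanArgs {
        explanation: Some("Switching to the test-writing step".to_string()),
        plan: vec![
            PlanItem { step: "Implement the feature".to_string(), status: StepStatus::Completed },
            PlanItem { step: "Write tests".to_string(), status: StepStatus::InProgress },
        ],
    };
    println!("{args:#?}");
}
```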
See the
[discussion](https://github.com/rhysd/tui-textarea/issues/51#issuecomment-3021191712);
it's surprising that ^U behaves this way. IMO the undo/redo
functionality in tui-textarea isn't good enough to be worth preserving,
but if we do bring it back it should probably be on C-z / C-S-z / C-y.
Perhaps there was an intention to make the login screen prettier, but it
feels quite silly right now to just have a screen that says "press q",
so replace it with something that lets the user log in directly without
having to quit the app.
<img width="1283" height="635" alt="Screenshot 2025-07-28 at 2 54 05 PM"
src="https://github.com/user-attachments/assets/f19e5595-6ef9-4a2d-b409-aa61b30d3628"
/>
## Summary
Per the [latest MCP
spec](https://modelcontextprotocol.io/specification/2025-06-18/basic#meta),
the `_meta` field is reserved for metadata. In the [Typescript
Schema](0695a497eb/schema/2025-06-18/schema.ts (L37-L40)),
`progressToken` is defined as a value to be attached to subsequent
notifications for that request.
The
[CallToolRequestParams](0695a497eb/schema/2025-06-18/schema.ts (L806-L817))
extends this definition but overwrites the params field. This ambiguity
makes our generated type definitions tricky, so I'm going to skip the
`progressToken` field for now and just send back the `requestId`
instead.
In a future PR, we can clarify, update our `generate_mcp_types.py`
script, and update our progressToken logic accordingly.
## Testing
- [x] Added unit tests
- [x] Manually tested with mcp client
(Hopefully) temporary solution to the invisible approvals problem -
prints commands to history when they need approval and then also prints
the result of the approval. In the near future we should be able to do
some fancy stuff with updating commands before writing them to permanent
history.
Also, ctrl-c while in the approval modal now acts as esc (aborts the
command) and puts the TUI in the state where one additional ctrl-c will
exit.
This is a straight refactor, moving apply-patch-related code from
`codex.rs` and into the new `apply_patch.rs` file. The only "logical"
change is inlining `#[allow(clippy::unwrap_used)]` instead of declaring
`#![allow(clippy::unwrap_used)]` at the top of the file (which is
currently the case in `codex.rs`).
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1703).
* #1705
* __->__ #1703
* #1702
* #1698
* #1697
This introduces some special behavior to the CLIs that are using the
`codex-arg0` crate where if `arg1` is `--codex-run-as-apply-patch`, then
it will run as if `apply_patch arg2` were invoked. This is important
because it means we can do things like:
```
SANDBOX_TYPE=landlock # or seatbelt for macOS
codex debug "${SANDBOX_TYPE}" -- codex --codex-run-as-apply-patch PATCH
```
which gives us a way to run `apply_patch` while ensuring it adheres to
the sandbox the user specified.
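A rough sketch of the argv dispatch described above, as a standalone illustration; the real crate integrates this into the `codex-arg0` startup path:
```rust
fn main() {
    let mut args = std::env::args().skip(1);
    // If the first argument is --codex-run-as-apply-patch, behave as if
    // `apply_patch <PATCH>` had been invoked and exit immediately.
    if args.next().as_deref() == Some("--codex-run-as-apply-patch") {
        match args.next() {
            Some(patch) => {
                // The real code applies the patch here; print it as a stand-in.
                println!("would apply patch:\n{patch}");
                std::process::exit(0);
            }
            None => {
                eprintln!("--codex-run-as-apply-patch requires a PATCH argument");
                std::process::exit(1);
            }
        }
    }
    // ...otherwise continue with normal CLI startup.
}
```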
While it would be nice to use the `arg0` trick like we are currently
doing for `codex-linux-sandbox`, there is no way to specify the `arg0`
for the underlying command when running under `/usr/bin/sandbox-exec`,
so it will not work for us in this case.
Admittedly, we could have also supported this via a custom environment
variable (e.g., `CODEX_ARG0`), but since environment variables are
inherited by child processes, that seemed like a potentially leakier
abstraction.
This change, as well as our existing reliance on checking `arg0`, places
additional requirements on those who include `codex-core`. Its
`README.md` has been updated to reflect this.
While we could have just added an `apply-patch` subcommand to the
`codex` multitool CLI, that would not be sufficient for the standalone
`codex-exec` CLI, which is something that we distribute as part of our
GitHub releases for those who know they will not be using the TUI and
therefore prefer to use a slightly smaller executable:
https://github.com/openai/codex/releases/tag/rust-v0.10.0
To that end, this PR adds an integration test to ensure that the
`--codex-run-as-apply-patch` option works with the standalone
`codex-exec` CLI.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1702).
* #1705
* #1703
* __->__ #1702
* #1698
* #1697
The overall idea here is: skip ratatui for writing into scrollback,
because its primitives are wrong. We want to render full lines of text,
that will be wrapped natively by the terminal, and which we never plan
to update using ratatui (so the `Buffer` struct is overhead and in fact
an inhibition).
Instead, we use ANSI scrolling regions (link reference doc to come).
Essentially, we:
1. Define a scrolling region that extends from the top of the prompt
area all the way to the top of scrollback
2. Scroll that region up by N < (screen_height - viewport_height) lines,
in this PR N=1
3. Put our cursor at the top of the newly empty region
4. Print out our new text like normal
The terminal interactions here (write_spans and its dependencies) are
mostly extracted from ratatui.
Most of the time, we expect the `String` returned by
`serde_json::to_string()` to have extra capacity, so `push('\n')` is
unlikely to allocate, which seems cheaper than an extra `write(2)` call,
on average?
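A small sketch of the pattern, assuming serde/serde_json; this is not the exact rollout-writing code:
```rust
use std::io::Write;

/// Serialize a value and append '\n' to the same String (which usually has
/// spare capacity) rather than making a separate write(2) call.
fn write_json_line<W: Write, T: serde::Serialize>(writer: &mut W, value: &T) -> std::io::Result<()> {
    let mut line = serde_json::to_string(value).map_err(std::io::Error::other)?;
    line.push('\n');
    writer.write_all(line.as_bytes())
}

fn main() -> std::io::Result<()> {
    let mut out = Vec::new();
    write_json_line(&mut out, &serde_json::json!({"type": "example"}))?;
    print!("{}", String::from_utf8_lossy(&out));
    Ok(())
}
```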
This update replaces the previous ratatui history widget with an
append-only log so that the terminal can handle text selection and
scrolling. It also disables streaming responses, which we'll do our best
to bring back in a later PR. It also adds a small summary of token use
after the TUI exits.
Currently, on startup codex shows the value of the approval policy as the
name of the
[AskForApproval](2437a8d17a/codex-rs/core/src/protocol.rs (L128))
enum variant, which differs from the
[approval_policy](2437a8d17a/codex-rs/config.md (approval_policy))
config values.
E.g. "untrusted" becomes "UnlessTrusted", "on-failure" -> "OnFailure",
"never" -> "Never".
This PR changes the rendered names of the approval policy to match the
configuration values.
This PR updates `is_known_safe_command()` to account for "safe
operators" to expand the set of commands that can be run without
approval. This concept existed in the TypeScript CLI, and we are
[finally!] porting it to the Rust one:
c9e2def494/codex-cli/src/approvals.ts (L531-L541)
The idea is that if we have `EXPR1 SAFE_OP EXPR2` and `EXPR1` and
`EXPR2` are considered safe independently, then `EXPR1 SAFE_OP EXPR2`
should be considered safe. Currently, `SAFE_OP` includes `&&`, `||`,
`;`, and `|`.
In the TypeScript implementation, we relied on
https://www.npmjs.com/package/shell-quote to parse the string of Bash,
as it could provide a "lightweight" parse tree, parsing `'beep || boop >
/byte'` as:
```
[ 'beep', { op: '||' }, 'boop', { op: '>' }, '/byte' ]
```
Though in this PR, we introduce the use of
https://crates.io/crates/tree-sitter-bash for parsing (which
incidentally we were already using in
[`codex-apply-patch`](c9e2def494/codex-rs/apply-patch/Cargo.toml (L18))),
which gives us a richer parse tree. (Incidentally, if you have never
played with tree-sitter, try the
[playground](https://tree-sitter.github.io/tree-sitter/7-playground.html)
and select **Bash** from the dropdown to see how it parses various
expressions.)
As a concrete example, prior to this change, our implementation of
`is_known_safe_command()` could verify things like:
```
["bash", "-lc", "grep -R \"Cargo.toml\" -n"]
```
but not:
```
["bash", "-lc", "grep -R \"Cargo.toml\" -n || true"]
```
With this change, the version with `|| true` is also accepted.
Admittedly, this PR does not expand the safety check to support
subshells, so it would reject, e.g. `bash -lc 'ls || (pwd && echo hi)'`,
but that can be addressed in a subsequent PR.
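A conceptual sketch of the rule, using a toy expression type instead of the actual tree-sitter-bash parse tree:
```rust
/// Toy stand-in for the parsed Bash expression (the real code walks a
/// tree-sitter parse tree).
enum BashExpr {
    Command(Vec<String>),
    Op { left: Box<BashExpr>, op: String, right: Box<BashExpr> },
}

fn is_safe_operator(op: &str) -> bool {
    matches!(op, "&&" | "||" | ";" | "|")
}

/// EXPR1 SAFE_OP EXPR2 is safe if both sides are independently safe.
fn is_known_safe(expr: &BashExpr, is_safe_command: &dyn Fn(&[String]) -> bool) -> bool {
    match expr {
        BashExpr::Command(words) => is_safe_command(words),
        BashExpr::Op { left, op, right } => {
            is_safe_operator(op)
                && is_known_safe(left, is_safe_command)
                && is_known_safe(right, is_safe_command)
        }
    }
}

fn main() {
    // `grep ... || true` is safe if each side is safe on its own.
    let expr = BashExpr::Op {
        left: Box::new(BashExpr::Command(vec!["grep".into(), "-R".into(), "Cargo.toml".into(), "-n".into()])),
        op: "||".into(),
        right: Box::new(BashExpr::Command(vec!["true".into()])),
    };
    let trivially_safe = |words: &[String]| matches!(words.first().map(String::as_str), Some("grep" | "true"));
    println!("{}", is_known_safe(&expr, &trivially_safe));
}
```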
`nl` is a line-numbering tool that should be on the _trusted_ list, as
there is nothing concerning on https://gtfobins.github.io/gtfobins/nl/
that would merit exclusion.
`true` and `false` are also safe, though not particularly useful given
how `is_known_safe_command()` works today, but that will change with
https://github.com/openai/codex/pull/1668.
Because of a quirk of how integration tests work in Rust, we had a
number of `#[allow(dead_code)]` annotations that were misleading because
the functions _were_ being used, just not by all integration tests in a
`tests/` folder, so when compiling the test that did not use the
function, clippy would complain that it was unused.
This fixes things by creating a "test_support" crate under the `tests/`
folder that is imported as a dev dependency for the respective crate.
# Summary
- Writing effective evals for codex sessions requires context of the
overall repository state at the moment the session began
- This change adds this metadata (git repository, branch, commit hash)
to the top of the rollout of the session (if available; if not, it
doesn't add anything)
- Currently, this is only effective on a clean working tree, as we can't
track uncommitted/untracked changes with the current metadata set.
Ideally in the future we may want to track unclean changes somehow, or
perhaps prompt the user to stash or commit them.
# Testing
- Added unit tests
- `cargo test && cargo clippy --tests && cargo fmt -- --config
imports_granularity=Item`
### Resulting Rollout
<img width="1243" height="127" alt="Screenshot 2025-07-17 at 1 50 00 PM"
src="https://github.com/user-attachments/assets/68108941-f015-45b2-985c-ea315ce05415"
/>
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.1 to 0.9.2.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c28f9ac30f"><code>c28f9ac</code></a>
chore: Release</li>
<li><a
href="f3a2299148"><code>f3a2299</code></a>
docs: Update changelog</li>
<li><a
href="69f09d3093"><code>69f09d3</code></a>
fix(lex): Don't loop over ')' for forever (<a
href="https://redirect.github.com/toml-rs/toml/issues/1003">#1003</a>)</li>
<li><a
href="cc68ae4f42"><code>cc68ae4</code></a>
fix(lex): Don't loop over ')' for forever</li>
<li>See full diff in <a
href="https://github.com/toml-rs/toml/compare/toml-v0.9.1...toml-v0.9.2">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [tree-sitter](https://github.com/tree-sitter/tree-sitter) from
0.25.6 to 0.25.8.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f2f197b6b2"><code>f2f197b</code></a>
0.25.8</li>
<li><a
href="8bb33f7d8c"><code>8bb33f7</code></a>
perf: reorder conditional operands</li>
<li><a
href="6f944de32f"><code>6f944de</code></a>
fix(generate): propagate node types error</li>
<li><a
href="c15938532d"><code>c159385</code></a>
0.25.7</li>
<li><a
href="94b55bfcdc"><code>94b55bf</code></a>
perf: reorder expensive conditional operand</li>
<li><a
href="bcb30f7951"><code>bcb30f7</code></a>
fix(generate): use topological sort for subtype map</li>
<li><a
href="3bd8f7df8e"><code>3bd8f7d</code></a>
perf: More efficient computation of used symbols</li>
<li><a
href="d7529c3265"><code>d7529c3</code></a>
perf: reserve <code>Vec</code> capacities where appropriate</li>
<li><a
href="bf4217f0ff"><code>bf4217f</code></a>
fix(web): wasm export paths</li>
<li><a
href="bb7b339ae2"><code>bb7b339</code></a>
Fix 'extra' field generation for node-types.json</li>
<li>Additional commits viewable in <a
href="https://github.com/tree-sitter/tree-sitter/compare/v0.25.6...v0.25.8">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [rand](https://github.com/rust-random/rand) from 0.9.1 to 0.9.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-random/rand/blob/master/CHANGELOG.md">rand's
changelog</a>.</em></p>
<blockquote>
<h2>[0.9.2 — 2025-07-20]</h2>
<h3>Deprecated</h3>
<ul>
<li>Deprecate <code>rand::rngs::mock</code> module and
<code>StepRng</code> generator (<a
href="https://redirect.github.com/rust-random/rand/issues/1634">#1634</a>)</li>
</ul>
<h3>Additions</h3>
<ul>
<li>Enable <code>WeightedIndex<usize></code> (de)serialization (<a
href="https://redirect.github.com/rust-random/rand/issues/1646">#1646</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d3dd415705"><code>d3dd415</code></a>
Prepare rand_core 0.9.2 (<a
href="https://redirect.github.com/rust-random/rand/issues/1605">#1605</a>)</li>
<li><a
href="99fabd20e9"><code>99fabd2</code></a>
Prepare rand_core 0.9.2</li>
<li><a
href="c7fe1c43b5"><code>c7fe1c4</code></a>
rand: re-export <code>rand_core</code> (<a
href="https://redirect.github.com/rust-random/rand/issues/1604">#1604</a>)</li>
<li><a
href="db2b1e0bb4"><code>db2b1e0</code></a>
rand: re-export <code>rand_core</code></li>
<li><a
href="ee1d96f9f5"><code>ee1d96f</code></a>
rand_core: implement reborrow for <code>UnwrapMut</code> (<a
href="https://redirect.github.com/rust-random/rand/issues/1595">#1595</a>)</li>
<li><a
href="e0eb2ee0fc"><code>e0eb2ee</code></a>
rand_core: implement reborrow for <code>UnwrapMut</code></li>
<li><a
href="975f602f5d"><code>975f602</code></a>
fixup clippy 1.85 warnings</li>
<li><a
href="775b05be1b"><code>775b05b</code></a>
Relax <code>Sized</code> requirements for blanket impls (<a
href="https://redirect.github.com/rust-random/rand/issues/1593">#1593</a>)</li>
<li>See full diff in <a
href="https://github.com/rust-random/rand/compare/rand_core-0.9.1...rand_core-0.9.2">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
1. Emit `call_id` in exec approval elicitations for MCP client convenience.
2. Remove the `-retry` suffix from the call id for the same reason as above, but
move the reset behavior upstream to the MCP client.
Always store the entire conversation history.
Request encrypted CoT when not storing Responses.
Send the entire input context instead of sending `previous_response_id`.
This PR adds a `load_dotenv()` helper function to the `codex-common`
crate that is available when the `cli` feature is enabled. The function
uses [`dotenvy`](https://crates.io/crates/dotenvy) to update the
environment from:
- `$CODEX_HOME/.env`
- `$(pwd)/.env`
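A minimal sketch of how such a helper might look with `dotenvy` (illustrative only, not the actual `codex-common` implementation; missing files are simply ignored):

```rust
use std::path::PathBuf;

/// Load `.env` files from `$CODEX_HOME` and the current working directory.
/// Errors (e.g. a missing file) are deliberately ignored.
pub fn load_dotenv() {
    if let Ok(codex_home) = std::env::var("CODEX_HOME") {
        let _ = dotenvy::from_path(PathBuf::from(codex_home).join(".env"));
    }
    if let Ok(cwd) = std::env::current_dir() {
        let _ = dotenvy::from_path(cwd.join(".env"));
    }
}
```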
To test:
- ran `printenv OPENAI_API_KEY` to verify the env var exists in my
environment
- ran `just codex exec hello` to verify the CLI uses my `OPENAI_API_KEY`
- ran `unset OPENAI_API_KEY`
- ran `just codex exec hello` again and got **ERROR: Missing environment
variable: `OPENAI_API_KEY`**, as expected
- created `~/.codex/.env` and added `OPENAI_API_KEY=sk-proj-...` (also
ran `chmod 400 ~/.codex/.env` for good measure)
- ran `just codex exec hello` again and it worked, verifying it picked
up `OPENAI_API_KEY` from `~/.codex/.env`
Note this functionality was available in the TypeScript CLI:
https://github.com/openai/codex/pull/122 and was recently requested over
on https://github.com/openai/codex/issues/1262#issuecomment-3093203551.
I noticed that releases have taken longer and longer to build.
Originally, I think I did `--all-targets` to be confident that
everything builds cleanly, but that's really the job of CI that runs on
`main`, so we're spending a lot of time in `rust-release.yml` for not
that much additional signal.
Some users have reported issues where child processes are not cleaned up
after Codex exits (e.g., https://github.com/openai/codex/issues/1570).
This is generally a tricky issue on operating systems: if a parent
process receives `SIGKILL`, then it terminates immediately and cannot
communicate with the child.
**It only helps on Linux**, but this PR introduces the use of `prctl(2)`
so that if the parent process dies, `SIGTERM` will be delivered to the
child process. Whereas previously, I believe that if Codex spawned a
long-running process (like `tsc --watch`) and the Codex process received
`SIGKILL`, the `tsc --watch` process would be reparented to the init
process and would never be killed. Now with the use of `prctl(2)`, the
`tsc --watch` process should receive `SIGTERM` in that scenario.
We still need to come up with a solution for macOS. I've started to look
at `launchd`, but I'm researching a number of options.
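For reference, a hedged sketch of the `prctl(2)` approach (assuming the `libc` crate; this is not the exact Codex implementation):

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

/// Spawn `cmd` so the child receives SIGTERM if this (parent) process dies.
/// Linux-only: PR_SET_PDEATHSIG has no effect on other platforms.
fn spawn_with_parent_death_signal(mut cmd: Command) -> io::Result<Child> {
    unsafe {
        cmd.pre_exec(|| {
            if libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGTERM as libc::c_ulong) != 0 {
                return Err(io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd.spawn()
}
```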
1. Added an elicitation for `approve-patch` which is very similar to
`approve-exec`.
2. Extracted both elicitations to their own files to prevent
`codex_tool_runner` from blowing up in size.
## Summary
Adds a new mcp tool call, `codex-reply`, so we can continue existing
sessions. This is a first draft and does not yet support sessions from
previous processes.
## Testing
- [x] tested with mcp client
This PR introduces a single integration test for `cargo mcp`, though it
also introduces a number of reusable components so that it should be
easier to introduce more integration tests going forward.
The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs`
and the reusable pieces are in `codex-rs/mcp-server/tests/common`.
The test itself verifies new functionality around elicitations
introduced in https://github.com/openai/codex/pull/1623 (and the fix
introduced in https://github.com/openai/codex/pull/1629) by doing the
following:
- starts a mock model provider with canned responses for
`/v1/chat/completions`
- starts the MCP server with a `config.toml` to use that model provider
(and `approval_policy = "untrusted"`)
- sends the `codex` tool call which causes the mock model provider to
request a shell call for `git init`
- the MCP server sends an elicitation to the client to approve the
request
- the client replies to the elicitation with `"approved"`
- the MCP server runs the command and re-samples the model, getting a
`"finish_reason": "stop"`
- in turn, the MCP server sends the final response to the original
`codex` tool call
- verifies that `git init` ran as expected
To test:
```
cargo test shell_command_approval_triggers_elicitation
```
In writing this test, I discovered that `ExecApprovalResponse` does not
conform to `ElicitResult`, so I added a TODO to fix that, since I think
that should be updated in a separate PR. As it stands, this PR does not
update any business logic, though it does make a number of members of
the `mcp-server` crate `pub` so they can be used in the test.
One additional learning from this PR is that
`std::process::Command::cargo_bin()` from `assert_cmd`'s `CommandCargoExt` trait is only
implemented for `std::process::Command`, but we really want to use
`tokio::process::Command` so that everything is async and we can
leverage utilities like `tokio::time::timeout()`. The trick I came up
with was to use `cargo_bin()` to locate the program, and then to use
`std::process::Command::get_program()` when constructing the
`tokio::process::Command`.
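A hedged sketch of that trick (the helper name is illustrative; `cargo_bin()` comes from `assert_cmd`'s `CommandCargoExt` trait):

```rust
use assert_cmd::cargo::CommandCargoExt;

/// Locate a compiled binary via assert_cmd, then drive it with tokio's async
/// Command so we can use utilities like tokio::time::timeout() around it.
fn async_command_for_bin(bin: &str) -> anyhow::Result<tokio::process::Command> {
    let std_cmd = std::process::Command::cargo_bin(bin)?;
    Ok(tokio::process::Command::new(std_cmd.get_program()))
}
```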
This updates the MCP server so that if it receives an
`ExecApprovalRequest` from the `Codex` session, it in turn sends an [MCP
elicitation](https://modelcontextprotocol.io/specification/draft/client/elicitation)
to the client to ask for the approval decision. Upon getting a response,
it forwards the client's decision via `Op::ExecApproval`.
Admittedly, we should be doing the same thing for
`ApplyPatchApprovalRequest`, but this is our first time experimenting
with elicitations, so I'm inclined to defer wiring that code path up
until we feel good about how this one works.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1623).
* __->__ #1623
* #1622
* #1621
* #1620
Previous to this change, `MessageProcessor` had a
`tokio::sync::mpsc::Sender<JSONRPCMessage>` as an abstraction for server
code to send a message down to the MCP client. Because `Sender` is cheap
to `clone()`, it was straightforward to make it available to tasks
scheduled with `tokio::task::spawn()`.
This worked well when we were only sending notifications or responses
back down to the client, but we want to add support for sending
elicitations in #1623, which means that we need to be able to send
_requests_ to the client, and now we need a bit of centralization to
ensure all request ids are unique.
To that end, this PR introduces `OutgoingMessageSender`, which houses
the existing `Sender<OutgoingMessage>` as well as an `AtomicI64` to mint
out new, unique request ids. It has methods like `send_request()` and
`send_response()` so that callers do not have to deal with
`JSONRPCMessage` directly, as having to set the `jsonrpc` for each
message was a bit tedious (this cleans up `codex_tool_runner.rs` quite a
bit).
We do not have `OutgoingMessageSender` implement `Clone` because it is
important that the `AtomicI64` is shared across all users of
`OutgoingMessageSender`. As such, `Arc<OutgoingMessageSender>` must be
used instead, as it is frequently shared with new tokio tasks.
As part of this change, we update `message_processor.rs` to embrace
`await`, though we must be careful that no individual handler blocks the
main loop and prevents other messages from being handled.
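A rough sketch of the shape described above (types simplified; the real `OutgoingMessage` variants and method set are richer):

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use tokio::sync::mpsc::Sender;

enum OutgoingMessage {
    Request { id: i64, payload: serde_json::Value },
}

/// Not Clone on purpose: all users must share one AtomicI64 (via Arc) so
/// request ids stay unique across tasks.
struct OutgoingMessageSender {
    next_request_id: AtomicI64,
    sender: Sender<OutgoingMessage>,
}

impl OutgoingMessageSender {
    async fn send_request(&self, payload: serde_json::Value) -> i64 {
        let id = self.next_request_id.fetch_add(1, Ordering::SeqCst);
        let _ = self
            .sender
            .send(OutgoingMessage::Request { id, payload })
            .await;
        id
    }
}
```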
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1622).
* #1623
* __->__ #1622
* #1621
* #1620
This updates the schema in `generate_mcp_types.py` from `2025-03-26` to
`2025-06-18`, regenerates `mcp-types/src/lib.rs`, and then updates all
the code that uses `mcp-types` to honor the changes.
Ran
```
npx @modelcontextprotocol/inspector just codex mcp
```
and verified that I was able to invoke the `codex` tool, as expected.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1621).
* #1623
* #1622
* __->__ #1621
## Summary
- extend rollout format to store all session data in JSON
- add resume/write helpers for rollouts
- track session state after each conversation
- support `LoadSession` op to resume a previous rollout
- allow starting Codex with an existing session via
`experimental_resume` config variable
Later, we will need a user-friendly way to explore the available sessions.
## Testing
- `cargo test --no-run` *(fails: `cargo: command not found`)*
------
https://chatgpt.com/codex/tasks/task_i_68792a29dd5c832190bf6930d3466fba
This video is outdated. You should use `-c experimental_resume:<full
path>` instead of `--resume <full path>`.
https://github.com/user-attachments/assets/7a9975c7-aa04-4f4e-899a-9e87defd947a
## Summary
- add OpenAI retry and timeout fields to Config
- inject these settings in tests instead of mutating env vars
- plumb Config values through client and chat completions logic
- document new configuration options
## Testing
- `cargo test -p codex-core --no-run`
------
https://chatgpt.com/codex/tasks/task_i_68792c5b04cc832195c03050c8b6ea94
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
This is designed to facilitate programmatic use of Codex in a more
lightweight way than using `codex mcp`.
Passing `--json` to `codex exec` will print each event as a line of JSON
to stdout. Note that it does not print the individual tokens as they are
streamed, only full messages, as this is aimed at programmatic use
rather than to power UI.
<img width="1348" height="1307" alt="image"
src="https://github.com/user-attachments/assets/fc7908de-b78d-46e4-a6ff-c85de28415c7"
/>
I changed the existing `EventProcessor` into a trait and moved the
implementation to `EventProcessorWithHumanOutput`. Then I introduced an
alternative implementation, `EventProcessorWithJsonOutput`. The `--json`
flag determines which implementation to use.
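A hedged sketch of the trait split (the event type is simplified to `serde_json::Value`; the real trait surface differs):

```rust
trait EventProcessor {
    fn process_event(&mut self, event: &serde_json::Value);
}

struct EventProcessorWithJsonOutput;

impl EventProcessor for EventProcessorWithJsonOutput {
    fn process_event(&mut self, event: &serde_json::Value) {
        // One JSON object per line on stdout, aimed at programmatic consumers.
        println!("{event}");
    }
}
```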
Adds a default vscode config with generally applicable settings.
Adds more entrypoints to justfile both for environment setup and to help
agents better verify changes.
When Codex CLI is installed via `npm`, we use a `.js` wrapper script to
launch the Rust binary.
- Previously, we were not listening for signals to ensure that killing
the Node.js process would also kill the underlying Rust process.
- We also did not have a proper `exit` handler in place on the child
process to ensure we exited from the Node.js process.
This PR fixes these things and hopefully addresses
https://github.com/openai/codex/issues/1570.
This also adds logic so that Windows falls back to the TypeScript CLI
again, which should address https://github.com/openai/codex/issues/1573.
This PR implements server name validation for MCP (Model Context
Protocol) servers to ensure they conform to the required pattern
`^[a-zA-Z0-9_-]+$`. This addresses the TODO comment in
`mcp_connection_manager.rs:82`.
+ Added validation before spawning MCP client tasks
+ Invalid server names are added to errors map with descriptive messages
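A minimal sketch of the validation rule (equivalent to `^[a-zA-Z0-9_-]+$`, written without a regex dependency; not necessarily how the PR implements it):

```rust
fn is_valid_mcp_server_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```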
I have read the CLA Document and I hereby sign the CLA
---------
Co-authored-by: Michael Bolin <bolinfest@gmail.com>
- Added support for message and reasoning deltas
- Deferred adding this support to the CLI and TUI to a later PR
- Commented out a failing test (broken by a bad merge) that needs a fix in a
separate PR.
Side note: I think we need to disable merging when CI doesn't pass.
While this does make it so that `ctrl-d` will not exit Codex when the
composer is not empty, `ctrl-d` will still exit Codex if it is in the
"working" state.
Fixes https://github.com/openai/codex/issues/1443.
It appears that `0.5.0` was built with `stage_release.sh` instead of
`stage_rust_release.py`, so add docs to clarify this and recommend
running `--version` on the release candidate to verify the right thing
was built.
## Summary
- add integration test for chat mode streaming via CLI using wiremock
- add integration test for Responses API streaming via fixture
- call `cargo run` to invoke the CLI during tests
## Testing
- `cargo test -p codex-core --test cli_stream -- --nocapture`
- `cargo clippy --all-targets --all-features -- -D warnings`
------
https://chatgpt.com/codex/tasks/task_i_68715980bbec8321999534fdd6a013c1
In order to do this, I created a new `chatgpt` crate where we can put
any code that interacts directly with ChatGPT as opposed to the OpenAI
API. I added a disclaimer to the README for it that it should primarily
be modified by OpenAI employees.
https://github.com/user-attachments/assets/bb978e33-d2c9-4d8e-af28-c8c25b1988e8
https://github.com/openai/codex/pull/1524 introduced the new `config`
field on `ModelClient`, so this does the post-PR cleanup to remove the
now-unnecessary `model` field.
As noted in the updated docs, this makes it so that you can set:
```toml
model_supports_reasoning_summaries = true
```
as a way of overriding the existing heuristic for when to set the
`reasoning` field on a sampling request:
341c091c5b/codex-rs/core/src/client_common.rs (L152-L166)
Bumps node from 22-slim to 24-slim.
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.0 to 0.9.1. Notable commits:
- fix(toml): Correct minimal version for indexmap (#998)
- docs(readme): Mention additional crates
Full diff: https://github.com/toml-rs/toml/compare/toml-v0.9.0...toml-v0.9.1
## Summary
Add Android platform support to Codex CLI
## What?
- Added `android` to the list of supported platforms in
`codex-cli/bin/codex.js`
- Treats Android as Linux for binary compatibility
## Why?
- Fixes "Unsupported platform: android (arm64)" error on Termux
- Enables Codex CLI usage on Android devices via Termux
- Improves platform compatibility without affecting other platforms
## How?
- Modified the platform detection switch statement to include `case
"android":`
- Android falls through to the same logic as Linux, using appropriate
ARM64 binaries
- Minimal change with no breaking effects on existing functionality
## Testing
- Tested on Android/Termux environment
- Verified the fix resolves the platform detection error
- Confirmed no impact on other platforms
## Related Issues
Fixes the "Unsupported platform: android (arm64)" error reported by
Termux users
Current 0.4.0 release:
```
~/code/codex2/codex-rs$ codex completion | head
_codex-cli() {
local i cur prev opts cmd
COMPREPLY=()
if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
cur="$2"
else
cur="${COMP_WORDS[COMP_CWORD]}"
fi
prev="$3"
cmd=""
```
with this change:
```
~/code/codex2/codex-rs$ just codex completion | head
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.82s
Running `target/debug/codex completion`
_codex() {
local i cur prev opts cmd
COMPREPLY=()
if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
cur="$2"
else
cur="${COMP_WORDS[COMP_CWORD]}"
fi
prev="$3"
cmd=""
```
Some users have proxies or other setups where they are ultimately
hitting OpenAI endpoints, but need a custom `base_url` rather than the
default value of `"https://api.openai.com/v1"`. This PR makes it
possible to override the `base_url` for the `openai` provider via the
`OPENAI_BASE_URL` environment variable.
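A hedged sketch of the override (helper name is illustrative, not the crate's actual API):

```rust
/// Prefer OPENAI_BASE_URL when set; otherwise fall back to the default.
fn openai_base_url() -> String {
    std::env::var("OPENAI_BASE_URL")
        .unwrap_or_else(|_| "https://api.openai.com/v1".to_string())
}
```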
This is a stopgap solution before migrating the build for the npm
release to GitHub Actions (which is ultimately what should be done to
ensure hermetic builds).
The idea is that instead of continuing to create PRs like
https://github.com/openai/codex/pull/1472 where I have to check in a
change to the `WORKFLOW_URL`, this script uses `gh run list` to get the
`WORKFLOW_URL` dynamically and then threads the value through to
`install_native_deps.sh`.
To create the 0.3.0 release on npm, I ran:
```shell
./codex-cli/scripts/stage_rust_release.py --release-version 0.3.0
```
and then did `npm publish --dry-run` followed by `npm publish` in the
temp directory created by `stage_rust_release.py`.
On a high-level, we try to design `config.toml` so that you don't have
to "comment out a lot of stuff" when testing different options.
Previously, defining a sandbox policy was somewhat at odds with this
principle because you would define the policy as attributes of
`[sandbox]` like so:
```toml
[sandbox]
mode = "workspace-write"
writable_roots = [ "/tmp" ]
```
but if you wanted to temporarily change to a read-only sandbox, you
might feel compelled to modify your file to be:
```toml
[sandbox]
mode = "read-only"
# mode = "workspace-write"
# writable_roots = [ "/tmp" ]
```
Technically, commenting out `writable_roots` would not be strictly
necessary, as `mode = "read-only"` would ignore `writable_roots`, but
it's still a reasonable thing to do to keep things tidy.
Currently, the various values for `mode` do not support that many
attributes, so this is not that hard to maintain, but one could imagine
this becoming more complex in the future.
In this PR, we change Codex CLI so that it no longer recognizes
`[sandbox]`. Instead, it introduces a top-level option, `sandbox_mode`,
and `[sandbox_workspace_write]` is used to further configure the sandbox
when `sandbox_mode = "workspace-write"` is used:
```toml
sandbox_mode = "workspace-write"
[sandbox_workspace_write]
writable_roots = [ "/tmp" ]
```
This feels a bit more future-proof in that it is less tedious to
configure different sandboxes:
```toml
sandbox_mode = "workspace-write"
[sandbox_read_only]
# read-only options here...
[sandbox_workspace_write]
writable_roots = [ "/tmp" ]
[sandbox_danger_full_access]
# danger-full-access options here...
```
In this scheme, you never need to comment out the configuration for an
individual sandbox type: you only need to redefine `sandbox_mode`.
Relatedly, previous to this change, a user had to do `-c
sandbox.mode=read-only` to change the mode on the command line. With
this change, things are arguably a bit cleaner because the equivalent
option is `-c sandbox_mode=read-only` (and now `-c
sandbox_workspace_write=...` can be set separately).
Though more importantly, we introduce the `-s/--sandbox` option to the
CLI, which maps directly to `sandbox_mode` in `config.toml`, making
config override behavior easier to reason about. Moreover, as you can
see in the updates to the various Markdown files, it is much easier to
explain how to configure sandboxing when things like `--sandbox
read-only` can be used as an example.
Relatedly, this cleanup also made it straightforward to add support for
a `sandbox` option for Codex when used as an MCP server (see the changes
to `mcp-server/src/codex_tool_config.rs`).
Fixes https://github.com/openai/codex/issues/1248.
v0.2.0 of https://www.npmjs.com/package/@openai/codex now runs the Rust
CLI, so it makes sense to bring back the instructions to use `npm i -g
@openai/codex`.
In most places, I list `npm install` before `brew install` because I
believe `npm` is more readily available, though in the more detailed
part of the documentation, I note that `brew install` will download
fewer bytes and, in that sense, is preferred.
This adds support for two new model provider config options:
- `http_headers` for hardcoded (key, value) pairs
- `env_http_headers` for headers whose values should be read from
environment variables
This also updates the built-in `openai` provider to use this feature to
set the following headers:
- `originator` => `codex_cli_rs`
- `version` => [CLI version]
- `OpenAI-Organization` => `OPENAI_ORGANIZATION` env var
- `OpenAI-Project` => `OPENAI_PROJECT` env var
for consistency with the TypeScript implementation:
bd5a9e8ba9/codex-cli/src/utils/agent/agent-loop.ts (L321-L329)
While here, this also consolidates some logic that was duplicated across
`client.rs` and `chat_completions.rs` by introducing
`ModelProviderInfo.create_request_builder()`.
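To illustrate how the two header options could be applied to an outgoing request, here is a sketch assuming `reqwest` (the field and method names are illustrative, not the crate's actual API):

```rust
use std::collections::HashMap;

struct ModelProviderInfo {
    http_headers: HashMap<String, String>,
    env_http_headers: HashMap<String, String>,
}

impl ModelProviderInfo {
    fn apply_headers(&self, mut req: reqwest::RequestBuilder) -> reqwest::RequestBuilder {
        // Hardcoded (name, value) pairs.
        for (name, value) in &self.http_headers {
            req = req.header(name.as_str(), value.as_str());
        }
        // Headers whose values come from environment variables; skipped if unset.
        for (name, env_var) in &self.env_http_headers {
            if let Ok(value) = std::env::var(env_var) {
                req = req.header(name.as_str(), value.as_str());
            }
        }
        req
    }
}
```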
Resolves https://github.com/openai/codex/discussions/1152
This introduces two changes to make a quick fix so we can deploy the
Rust CLI for `0.2.0` of `@openai/codex` on npm:
- Updates `WORKFLOW_URL` to point to
https://github.com/openai/codex/actions/runs/15981617627, which is the
GitHub workflow run used to create the binaries for the `0.2.0` release
we published to Homebrew.
- Adds a `--version` option to `stage_release.sh` to specify what the
`version` field in the `package.json` will be.
Locally, I ran the following:
```
./codex-cli/scripts/stage_release.sh --native --version 0.2.0
```
Previously, we only used the `--native` flag to publish to the `native`
tag of `@openai/codex` (e.g., `npm publish --tag native`), but we should
just publish this as the default tag for `0.2.0` to be consistent with
what is in Homebrew.
We can still publish one "final" version of the TypeScript CLI as 0.1.x
later.
Under the hood, this release will still contain `dist/cli.js`,
`bin/codex-linux-sandbox-x64`, and `bin/codex-x86_64-apple-darwin`,
which are not strictly necessary, but we'll fix that in `0.3.0`.
As promised on https://github.com/openai/codex/discussions/1405, we are
making the first official release of the Rust CLI as v0.2.0. As part of
this move, we are making it available in Homebrew:
https://github.com/Homebrew/homebrew-core/pull/228615
Ultimately, we also plan to continue making the CLI available on npm,
though brew is a bit nicer in that `brew install` downloads
only the binary for your platform, whereas an npm module is expected to
contain the binaries for _all_ supported platforms, so it is a bit more
heavyweight.
A big part of this change is updating the root `README.md` to document
the behavior of the Rust CLI, which differs in a number of ways from the
TypeScript CLI. The existing `README.md` is moved to
`codex-cli/README.md` as part of this PR, as it is still applicable to
that folder.
As this is still early days for the Rust CLI, I encourage folks to
provide feedback on the command line flags and configuration options.
As discovered in https://github.com/openai/codex/issues/1365, the Azure
provider needs to be able to specify `api-version` as a query param, so
this PR introduces a generic `query_params` option to the
`model_providers` config so that an Azure provider can be defined as
follows:
```toml
[model_providers.azure]
name = "Azure"
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
query_params = { api-version = "2025-04-01-preview" }
```
This PR also updates the docs with this example.
While here, we also update `wire_api` to default to `"chat"`, as that is
likely the common case for someone defining an external provider.
Fixes https://github.com/openai/codex/issues/1365.
Looking at existing releases such as
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4,
the `.tar.gz` for the source code still seems to have `0.0.0` as the
`version` in `codex-rs/Cargo.toml` instead of what the tag seems to say
it should have:
b289c92070/codex-rs/Cargo.toml (L21)
ChatGPT claims:
> When GitHub generates the Source code (tar.gz) archive for a tag:
> - It uses the commit the tag points to.
> - But in some cases (e.g., shallow clones, GitHub CI, or local tools
> that only clone the default branch), that commit may not be included,
> and you might get an outdated view or nothing at all depending on how
> it’s fetched.
Trying this recommended fix.
This is a small quality-of-life feature, the addition of
`--compute-indices` to the CLI, which, if enabled, will compute and set
the `indices` field for each `FileMatch` returned by `run()`. Note we
only bother to compute `indices` once we have the top N results because
there could be a lot of intermediate "top N" results during the search
that are ultimately discarded.
When set, the indices are included in the JSON output when `--json` is
specified and the matching indices are displayed in bold when `--json`
is not specified.
Introduces support for `@` to trigger a fuzzy-filename search in the
composer. Under the hood, this leverages
https://crates.io/crates/nucleo-matcher to do the fuzzy matching and
https://crates.io/crates/ignore to build up the list of file candidates
(so that it respects `.gitignore`).
For simplicity (at least for now), we do not do any caching between
searches like VS Code does for its file search:
1d89ed699b/src/vs/workbench/services/search/node/rawSearchService.ts (L212-L218)
Because we do not do any caching, I saw queries take up to three seconds
on large repositories with hundreds of thousands of files. To that end,
we do not perform searches synchronously on each keystroke, but instead
dispatch an event to do the search on a background thread that
asynchronously reports back to the UI when the results are available.
This is largely handled by the `FileSearchManager` introduced in this
PR, which also has logic for debouncing requests so there is at most one
search in flight at a time.
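As a rough illustration of the candidate-gathering half (the fuzzy scoring itself is delegated to `nucleo-matcher` in the real code), here is a sketch of walking the repo with the `ignore` crate so `.gitignore` is respected:

```rust
use ignore::WalkBuilder;
use std::path::Path;

/// Collect relative file paths under `root`, honoring .gitignore rules.
fn collect_candidates(root: &Path) -> Vec<String> {
    WalkBuilder::new(root)
        .build()
        .filter_map(Result::ok)
        .filter(|entry| entry.file_type().map_or(false, |ft| ft.is_file()))
        .filter_map(|entry| {
            entry
                .path()
                .strip_prefix(root)
                .ok()
                .map(|p| p.to_string_lossy().into_owned())
        })
        .collect()
}
```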
While we could potentially polish and tune this feature further, it may
already be overengineered for how it will be used, in practice, so we
can improve things going forward if it turns out that this is not "good
enough" in the wild.
Note this feature does not work like `@` in the TypeScript CLI, which
was more like directory-based tab completion. In the Rust CLI, `@`
triggers a full-repo fuzzy-filename search.
Fixes https://github.com/openai/codex/issues/1261.
Update `run()` to take `cancel_flag: Arc<AtomicBool>` that the worker
threads will periodically check to see if it is `true`, exiting early
(and returning empty results) if so.
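A minimal sketch of that cooperative-cancellation shape (the real `run()` has more parameters and does actual search work):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

fn run(cancel_flag: Arc<AtomicBool>) -> Vec<String> {
    let mut results = Vec::new();
    for _work_unit in 0..1_000 {
        // Workers check the flag periodically and bail out early if cancelled.
        if cancel_flag.load(Ordering::Relaxed) {
            return Vec::new();
        }
        // ... perform one slice of the search and push matches into `results` ...
    }
    results
}
```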
As we are [close to releasing the Rust CLI
beta](https://github.com/openai/codex/discussions/1405), for the moment,
let's take a more neutral stance on what it takes to be a "built-in"
provider.
* For example, there seems to be a discrepancy around what the "right"
configuration for Gemini is: https://github.com/openai/codex/pull/881
* And while the current list of "built-in" providers are all arguably
"well-known" names, this raises a question of what to do about
potentially less familiar providers, such as
https://github.com/openai/codex/pull/1142. Do we just accept every pull
request like this, or is there some criteria a provider has to meet to
"qualify" to be bundled with Codex CLI?
I think that if we can establish clear ground rules for being a built-in
provider, then we can bring this back. But until then, I would rather
take a minimalist approach because if we decided to reverse our position
later, it would break folks who were depending on the presence of the
built-in providers.
Adds support for a `/diff` command comparable to the one available in
the TypeScript CLI.
<img width="1103" alt="Screenshot 2025-06-26 at 12 31 33 PM"
src="https://github.com/user-attachments/assets/5dc646ca-301f-41ff-92a7-595c68db64b6"
/>
While here, changed the `SlashCommand` enum so the declared variant
order is the order the commands appear in the popup menu. This way,
`/toggle-mouse-mode` is listed last, as it is the least likely to be
used.
Fixes https://github.com/openai/codex/issues/1253.
When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.
We then introduce a new event type in `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.
Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer:

We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.
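A hedged sketch of the running-total bookkeeping (field names are illustrative; the real `TokenUsage` in `protocol.rs` has more fields):

```rust
struct TokenUsage {
    input_tokens: u64,
    output_tokens: u64,
}

struct ContextWindowTracker {
    used_tokens: u64,
    context_window: u64,
}

impl ContextWindowTracker {
    /// Add the usage reported for the latest turn to the running sum.
    fn record(&mut self, usage: &TokenUsage) {
        self.used_tokens += usage.input_tokens + usage.output_tokens;
    }

    /// Rough percentage of the context window still available.
    fn percent_remaining(&self) -> u64 {
        let used_pct = self.used_tokens * 100 / self.context_window.max(1);
        100u64.saturating_sub(used_pct)
    }
}
```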
Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:
296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)
Fixes https://github.com/openai/codex/issues/1242
This PR reworks `assess_command_safety()` so that the combination of
`AskForApproval::Never` and `SandboxPolicy::DangerFullAccess` ensures
that commands are run without _any_ sandbox and the user should never be
prompted. In turn, it adds support for a new
`--dangerously-bypass-approvals-and-sandbox` flag (that cannot be used
with `--approval-policy` or `--full-auto`) that sets both of those
options.
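A hedged sketch of the decision being described (the enum variants beyond those named in this PR are hypothetical placeholders, and the real function takes more inputs):

```rust
enum AskForApproval {
    Never,
    Untrusted,
}

enum SandboxPolicy {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

enum SafetyCheck {
    RunWithoutSandbox,
    RunInSandbox,
    AskUser,
}

fn assess(approval: AskForApproval, sandbox: SandboxPolicy) -> SafetyCheck {
    match (approval, sandbox) {
        // Never prompt and never sandbox when the user opts all the way out.
        (AskForApproval::Never, SandboxPolicy::DangerFullAccess) => SafetyCheck::RunWithoutSandbox,
        // Under the "untrusted" policy, unknown commands go to the user.
        (AskForApproval::Untrusted, _) => SafetyCheck::AskUser,
        _ => SafetyCheck::RunInSandbox,
    }
}
```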
Fixes https://github.com/openai/codex/issues/1254
For the `approval_policy` config option, renames `unless-allow-listed`
to `untrusted`. In general, when it comes to exec'ing commands, I think
"trusted" is a more accurate term than "safe."
Also drops the `AskForApproval::AutoEdit` variant, as we were not really
making use of it, anyway.
Fixes https://github.com/openai/codex/issues/1250.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1378).
* #1379
* __->__ #1378
Apparently `just` was added to `apt` in Ubuntu 24, so this required
updating the Ubuntu version in the `Dockerfile` to make it so we could
simply `apt install just`.
That then caused a conflict with the custom `dev` user we were
using, but the end result seems simpler since we now just use the
default `ubuntu` user provided by Ubuntu 24.
This is a major redesign of how sandbox configuration works and aims to
fix https://github.com/openai/codex/issues/1248. Specifically, it
replaces `sandbox_permissions` in `config.toml` (and the
`-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
three variants:
```toml
# Safest option: the entire disk is readable, but writes and network access are disallowed.
[sandbox]
mode = "read-only"
# The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
# writable_roots can be used to specify additional writable folders.
[sandbox]
mode = "workspace-write"
writable_roots = [] # Optional, defaults to the empty list.
network_access = false # Optional, defaults to false.
# Disable sandboxing: use at your own risk!!!
[sandbox]
mode = "danger-full-access"
```
This should make sandboxing easier to reason about. While we have
dropped support for `-s`, the way it works now is:
- no flags => `read-only`
- `--full-auto` => `workspace-write`
- currently, there is no way to specify `danger-full-access` via a CLI
flag, but we will revisit that as part of
https://github.com/openai/codex/issues/1254
Outstanding issue:
- As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
still conflating sandbox preferences with approval preferences in that
case, which needs to be cleaned up.
- Use Responses API for Azure provider endpoints
- Added a unit test to catch regression on the change from
`/chat/completions` to `/responses`
- Updated the default AOAI api version from `2025-03-01-preview` to
`2025-04-01-preview` to avoid user/400 errors due to missing summary
support in the March API version.
- Changes have been tested locally on AOAI endpoints
## Summary
This PR refactors the Codex CLI authentication flow so that
**non-OpenAI** providers (for example **azure**, or any future addition)
can supply their API key through a dedicated environment variable
without triggering the OpenAI login flow.
Key behaviours introduced:
* When `provider !== "openai"` the CLI consults `src/utils/providers.ts`
to locate the correct environment variable (`AZURE_OPENAI_API_KEY`,
`GEMINI_API_KEY`, and so on) before considering any interactive login.
* Credit redemption (`--free`) and PKCE login now run **only** when the
provider is OpenAI, eliminating unwanted browser prompts for Azure and
others.
* User-facing error messages are revamped to guide Azure users to
**[https://ai.azure.com/](https://ai.azure.com)** and show the exact
variable name they must set.
* All code paths still export `OPENAI_API_KEY` so legacy scripts
continue to operate unchanged.
---
## Example `config.json`
```jsonc
{
"model": "codex-mini",
"provider": "azure",
"providers": {
"azure": {
"name": "AzureOpenAI",
"baseURL": "https://ai-<project-name>.openai.azure.com/openai",
"envKey": "AZURE_OPENAI_API_KEY"
}
},
"history": {
"maxSize": 1000,
"saveHistory": true,
"sensitivePatterns": []
}
}
```
With this file in `~/.codex/config.json`, a single command line is
enough:
```bash
export AZURE_OPENAI_API_KEY="<your-key>"
codex "Hello from Azure"
```
No browser window opens, and the CLI works in entirely non-interactive
mode.
---
## Rationale
The new flow enables Codex to run **asynchronously** in sandboxed
environments such as GitHub Actions pipelines. By passing `--provider
azure` (or setting it in `config.json`) and exporting the correct key,
CI/CD jobs can invoke Codex without any ChatGPT-style login or PKCE
round-trip. This unlocks fully automated testing and deployment
scenarios.
---
## What’s changed
| File | Type | Description |
| --- | --- | --- |
| `codex-cli/src/cli.tsx` | **feat / refactor** | +43 / -20 lines. Imports `providers`, adds early provider-specific key lookup, gates `--free` redemption, rewrites help text. |
| `src/utils/providers.ts` | **chore** | Now consumed by CLI for env-var discovery. |
---
## How to test
```bash
# Azure example
export AZURE_OPENAI_API_KEY="<your-key>"
codex --provider azure "Automated run in CI"
# OpenAI example (unchanged behaviour)
codex --provider openai --login "Standard OpenAI flow"
```
Expected outcomes:
* Azure and other provider paths are non-interactive when provider flag
is passed.
* The CLI always sets `OPENAI_API_KEY` for backward compatibility.
---
## Checklist
* [x] Logic behind provider-specific env-var lookup added.
* [x] Redundant OpenAI login steps removed for other providers.
* [x] Unit tests cover new branches.
* [x] README and sample config updated.
* [x] CI passes on all supported Node versions.
---
**Related work**
* #92
* #769
* #1321
I have read the CLA Document and I hereby sign the CLA.
I noticed that `/clear` wasn't fully clearing chat history; it would
clear the chat history widgets _in the UI_, but the LLM still had access
to information from previous messages.
This PR renames `/clear` to `/new` for clarity as per Michael's
suggestion, resetting `app_state` to a fresh `ChatWidget`.
Now that we have published a GitHub Release that contains arm64 musl
artifacts for Linux, update the following scripts to take advantage of
them:
- `dotslash-config.json` now uses musl artifacts for the `linux-aarch64`
target
- `install_native_deps.sh` for the TypeScript CLI now includes
`codex-linux-sandbox-aarch64-unknown-linux-musl` instead of
`codex-linux-sandbox-aarch64-unknown-linux-gnu` for sandboxing
- `codex-cli/bin/codex.js` now checks for `aarch64-unknown-linux-musl`
artifacts instead of `aarch64-unknown-linux-gnu` ones
Users were running into issues with glibc mismatches on arm64 linux. In
the past, we did not provide a musl build for arm64 Linux because we had
trouble getting the openssl dependency to build correctly. Though today
I just tried the same trick in `Cargo.toml` that we were doing for
`x86_64-unknown-linux-musl` (using `openssl-sys` with `features =
["vendored"]`), so I'm not sure what problem we had in the past the
builds "just worked" today!
Though one tweak that did have to be made is that the integration tests
for Seccomp/Landlock empirically require longer timeouts on arm64 linux,
or at least on the `ubuntu-24.04-arm` GitHub Runner. As such, we change
the timeouts for arm64 in `codex-rs/linux-sandbox/tests/landlock.rs`.
Though in solving this problem, I decided I needed a turnkey solution
for testing the Linux build(s) from my Mac laptop, so this PR introduces
`.devcontainer/Dockerfile` and `.devcontainer/devcontainer.json` to
facilitate this. Detailed instructions are in `.devcontainer/README.md`.
We will update `dotslash-config.json` and other release-related scripts
in a follow-up PR.
This does not implement the full Login with ChatGPT experience, but it
should unblock people.
**What works**
* The `codex` multitool now has a `login` subcommand, so you can run
`codex login`, which should write `CODEX_HOME/auth.json` if you complete
the flow successfully. The TUI will now read the `OPENAI_API_KEY` from
`auth.json`.
* The TUI should refresh the token if it has expired and the necessary
information is in `auth.json`.
* There is a `LoginScreen` in the TUI that tells you to run `codex
login` if both (1) your model provider expects to use `OPENAI_API_KEY`
as its env var, and (2) `OPENAI_API_KEY` is not set.
**What does not work**
* The `LoginScreen` does not support the login flow from within the TUI.
Instead, it tells you to quit, run `codex login`, and then run `codex`
again.
* `codex exec` does not read from `auth.json` yet, nor does it direct the
user to go through the login flow if `OPENAI_API_KEY` is not found.
* The `maybeRedeemCredits()` function from `get-api-key.tsx` has not
been ported from TypeScript to `login_with_chatgpt.py` yet:
a67a67f325/codex-cli/src/utils/get-api-key.tsx (L84-L89)
**Implementation**
Currently, the OAuth flow requires running a local webserver on
`127.0.0.1:1455`. It seemed wasteful to incur the additional binary cost
of a webserver dependency in the Rust CLI just to support login, so
instead we implement this logic in Python, as Python has a `http.server`
module as part of its standard library. Specifically, we bundle the
contents of a single Python file as a string in the Rust CLI and then
use it to spawn a subprocess as `python3 -c
{{SOURCE_FOR_PYTHON_SERVER}}`.
As such, the most significant files in this PR are:
```
codex-rs/login/src/login_with_chatgpt.py
codex-rs/login/src/lib.rs
```
Now that the CLI may load `OPENAI_API_KEY` from the environment _or_
`CODEX_HOME/auth.json`, we need a new abstraction for reading/writing
this variable, so we introduce:
```
codex-rs/core/src/openai_api_key.rs
```
Note that `std::env::set_var()` is [rightfully] `unsafe` in Rust 2024,
so we use a `LazyLock<RwLock<Option<String>>>` to store `OPENAI_API_KEY`
so it is read in a thread-safe manner.
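As a rough sketch of that pattern (the accessor names here are illustrative, not necessarily those in `openai_api_key.rs`):

```rust
use std::sync::{LazyLock, RwLock};

/// Process-wide holder for the API key, initialized from the environment and
/// updatable (e.g. after reading auth.json) without unsafe set_var calls.
static OPENAI_API_KEY: LazyLock<RwLock<Option<String>>> =
    LazyLock::new(|| RwLock::new(std::env::var("OPENAI_API_KEY").ok()));

fn get_openai_api_key() -> Option<String> {
    OPENAI_API_KEY.read().unwrap().clone()
}

fn set_openai_api_key(value: String) {
    *OPENAI_API_KEY.write().unwrap() = Some(value);
}
```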
Ultimately, it should be possible to go through the entire login flow
from the TUI. This PR introduces a placeholder `LoginScreen` UI for that
right now, though the new `codex login` subcommand introduced in this PR
should be a viable workaround until the UI is ready.
**Testing**
Because the login flow is currently implemented in a standalone Python
file, you can test it without building any Rust code as follows:
```
rm -rf /tmp/codex_home && mkdir /tmp/codex_home
CODEX_HOME=/tmp/codex_home python3 codex-rs/login/src/login_with_chatgpt.py
```
For reference:
* the original TypeScript implementation was introduced in
https://github.com/openai/codex/pull/963
* support for redeeming credits was later added in
https://github.com/openai/codex/pull/974
This PR overhauls how active tool calls and completed tool calls are
displayed:
1. More use of colour to indicate success/failure and distinguish
between components like tool name+arguments
2. Previously, the entire `CallToolResult` was serialized to JSON and
pretty-printed. Now, we extract each individual `CallToolResultContent`
and print those
1. The previous solution was wasting space by unnecessarily showing
details of the `CallToolResult` struct to users, without formatting the
actual tool call results nicely
2. We're now able to show users more information from tool results in
less space, with nicer formatting when tools return JSON results
### Before:
<img width="1251" alt="Screenshot 2025-06-03 at 11 24 26"
src="https://github.com/user-attachments/assets/5a58f222-219c-4c53-ace7-d887194e30cf"
/>
### After:
<img width="1265" alt="image"
src="https://github.com/user-attachments/assets/99fe54d0-9ebe-406a-855b-7aa529b91274"
/>
## Future Work
1. Integrate image tool result handling better. We should be able to
display images even if they're not the first `CallToolResultContent`
2. Users should have some way to view the full version of truncated tool
results
3. It would be nice to add some left padding for tool results, make it
more clear that they are results. This is doable, just a little fiddly
due to the way `first_visible_line` scrolling works
4. There's almost certainly a better way to format JSON than "all on 1
line with spaces to make Ratatui wrapping work". But I think that works
OK for now.
This fixes a longstanding error in the Rust CLI where `codex.rs`
contained an errant `is_first_turn` check that would exclude the user
instructions for subsequent "turns" of a conversation when using the
responses API (i.e., when `previous_response_id` existed).
While here, renames `Prompt.instructions` to `Prompt.user_instructions`
since we now have quite a few levels of instructions floating around.
Also removed an unnecessary use of `clone()` in
`Prompt.get_full_instructions()`.
As explained in detail in the doc comment for `ParseMode::Lenient`, we
have observed that GPT-4.1 does not always generate a valid invocation
of `apply_patch`. Fortunately, the error is predictable, so we introduce
some new logic to the `codex-apply-patch` crate to recover from this
error.
Because we would like to avoid this becoming a de facto standard (as it
would be incompatible if `apply_patch` were provided as an actual
executable, unless we also introduced the lenient behavior in the
executable, as well), we require passing `ParseMode::Lenient` to
`parse_patch_text()` to make it clear that the caller is opting into
supporting this special case.
Note the analogous change to the TypeScript CLI was
https://github.com/openai/codex/pull/930. In addition to changing the
accepted input to `apply_patch`, it also introduced additional
instructions for the model, which we include in this PR.
Note that `apply-patch` does not depend on either `regex` or
`regex-lite`, so some of the checks are slightly more verbose to avoid
introducing this dependency.
That said, this PR does not leverage the existing
`extract_heredoc_body_from_apply_patch_command()`, which depends on
`tree-sitter` and `tree-sitter-bash`:
5a5aa89914/codex-rs/apply-patch/src/lib.rs (L191-L246)
though perhaps it should.
Previous to this PR, we always set `reasoning` when making a request
using the Responses API:
d7245cbbc9/codex-rs/core/src/client.rs (L108-L111)
Though if you tried to use the Rust CLI with `--model gpt-4.1`, this
would fail with:
```shell
"Unsupported parameter: 'reasoning.effort' is not supported with this model."
```
We take a cue from the TypeScript CLI, which does a check on the model
name:
d7245cbbc9/codex-cli/src/utils/agent/agent-loop.ts (L786-L789)
This PR does a similar check, though also adds support for the following
config options:
```
model_reasoning_effort = "low" | "medium" | "high" | "none"
model_reasoning_summary = "auto" | "concise" | "detailed" | "none"
```
This way, if you have a model whose name happens to start with `"o"` (or
`"codex"`?), you can set these to `"none"` to explicitly disable
reasoning, if necessary. (That said, it seems unlikely anyone would use
the Responses API with non-OpenAI models, but we provide an escape
hatch, anyway.)
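A rough sketch of that gate; the field names and config plumbing here are illustrative rather than the actual code in `client.rs`:
```rust
use serde_json::{json, Value};

// Returns the `reasoning` object to include in the Responses API payload,
// or None when the model (e.g. gpt-4.1) or the config says to omit it.
fn reasoning_param(model: &str, effort: &str, summary: &str) -> Option<Value> {
    let model_supports_reasoning = model.starts_with("o") || model.starts_with("codex");
    if !model_supports_reasoning || effort == "none" {
        return None;
    }
    let mut reasoning = json!({ "effort": effort });
    if summary != "none" {
        reasoning["summary"] = json!(summary);
    }
    Some(reasoning)
}
```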
This PR also updates both the TUI and `codex exec` to show `reasoning
effort` and `reasoning summaries` in the header.
Prior to this PR, there were two big misses in `chat_completions.rs`:
1. The loop in `stream_chat_completions()` was only including items of
type `ResponseItem::Message` when building up the `"messages"` JSON for
the `POST` request to the `chat/completions` endpoint. This fixes things
by ensuring other variants (`FunctionCall`, `LocalShellCall`, and
`FunctionCallOutput`) are included, as well.
2. In `process_chat_sse()`, we were not recording tool calls and were
only emitting items of type
`ResponseEvent::OutputItemDone(ResponseItem::Message)` to the stream.
Now we introduce `FunctionCallState`, which is used to accumulate the
`delta`s of type `tool_calls`, so we can ultimately emit a
`ResponseItem::FunctionCall`, when appropriate.
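A simplified sketch of that accumulation; the real `FunctionCallState` in `process_chat_sse()` likely carries more bookkeeping:
```rust
// Chat Completions streams a tool call as many small deltas: the id and name
// arrive early, and the JSON `arguments` string trickles in as fragments.
#[derive(Default)]
struct FunctionCallState {
    call_id: Option<String>,
    name: Option<String>,
    arguments: String,
}

impl FunctionCallState {
    fn apply_delta(&mut self, id: Option<&str>, name: Option<&str>, args_fragment: Option<&str>) {
        if let Some(id) = id {
            self.call_id.get_or_insert_with(|| id.to_string());
        }
        if let Some(name) = name {
            self.name.get_or_insert_with(|| name.to_string());
        }
        if let Some(fragment) = args_fragment {
            self.arguments.push_str(fragment);
        }
    }

    // Once the stream signals the call is finished, this becomes the
    // equivalent of a `ResponseItem::FunctionCall`.
    fn is_complete(&self) -> bool {
        self.call_id.is_some() && self.name.is_some()
    }
}
```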
While function calling now appears to work for chat completions with my
local testing, I believe that there are still edge cases that are not
covered and that this codepath would benefit from a battery of
integration tests. (As part of that further cleanup, we should also work
to support streaming responses in the UI.)
The other important part of this PR is some cleanup in
`core/src/codex.rs`. In particular, it was hard to reason about how
`run_task()` was building up the list of messages to include in a
request across the various cases:
- Responses API
- Chat Completions API
- Responses API used in concert with ZDR
I like to think things are a bit cleaner now where:
- `zdr_transcript` (if present) contains all messages in the history of
the conversation, which includes function call outputs that have not
been sent back to the model yet
- `pending_input` includes any messages the user has submitted while the
turn is in flight that need to be injected as part of the next `POST` to
the model
- `input_for_next_turn` includes the tool call outputs that have not
been sent back to the model yet
This PR introduces a `hide_agent_reasoning` config option (that defaults
to `false`) that users can enable to make the output less verbose by
suppressing reasoning output.
To test, verified that this includes agent reasoning in the output:
```
echo hello | just exec
```
whereas this does not:
```
echo hello | just exec --config hide_agent_reasoning=true
```
This required changing `ts_println!()` to take `$self:ident`, which is a
bit more verbose, but the usability improvement seems worth it.
Also eliminated an unnecessary `.to_string()` while here.
Fixes:
* Instantiate `EventProcessor` earlier in `lib.rs` so
`print_config_summary()` can be an instance method of it and leverage
its various `Style` fields to ensure it honors `with_ansi` properly.
* After printing the config summary, print out user's prompt with the
heading `User instructions:`. As noted in the comment, now that we can
read the instructions via stdin as of #1178, it is helpful to the user
to ensure they know what instructions were given to Codex.
* Use same colors/bold/italic settings for headers as the TUI, making
the output a bit easier to read.
This attempts to make `codex exec` more flexible in how the prompt can
be passed:
* as before, it can be passed as a single string argument
* if `-` is passed as the value, the prompt is read from stdin
* if no argument is passed _and stdin is a tty_, prints a warning to
stderr that no prompt was specified and exits non-zero.
* if no argument is passed _and stdin is NOT a tty_, prints `Reading
prompt from stdin...` to stderr to let the user know that Codex will
wait until it reads EOF from stdin to proceed. (You can repro this case
by doing `yes | just exec` since stdin is not a TTY in that case but it
also never reaches EOF).
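A minimal sketch of those rules; the function name and message wording are illustrative:
```rust
use std::io::{IsTerminal, Read};

fn resolve_prompt(arg: Option<String>) -> Option<String> {
    match arg.as_deref() {
        // Explicit `-`: always read the prompt from stdin.
        Some("-") => read_stdin(),
        // Ordinary argument: use it verbatim.
        Some(prompt) => Some(prompt.to_string()),
        // No argument and stdin is piped: tell the user we are waiting on EOF.
        None if !std::io::stdin().is_terminal() => {
            eprintln!("Reading prompt from stdin...");
            read_stdin()
        }
        // No argument and stdin is a TTY: warn; the caller exits non-zero.
        None => {
            eprintln!("No prompt specified.");
            None
        }
    }
}

fn read_stdin() -> Option<String> {
    let mut buf = String::new();
    std::io::stdin().read_to_string(&mut buf).ok()?;
    Some(buf)
}
```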
The main motivator behind this PR is that `stream_chat_completions()`
was not adding the `"tools"` entry to the payload posted to the
`/chat/completions` endpoint. This (1) refactors the existing logic to
build up the `"tools"` JSON from `client.rs` into `openai_tools.rs`, and
(2) updates the use of responses API (`client.rs`) and chat completions
API (`chat_completions.rs`) to both use it.
Note this PR alone is not sufficient to get tool calling from chat
completions working: that is done in
https://github.com/openai/codex/pull/1167.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1177).
* #1167
* __->__ #1177
This is a first cut at a GitHub Action that lets you define prompt
templates in `.md` files under `.github/codex/labels` that will run
Codex with the associated prompt when the label is added to a GitHub
pull request.
For example, this PR includes these files:
```
.github/codex/labels/codex-attempt.md
.github/codex/labels/codex-code-review.md
.github/codex/labels/codex-investigate-issue.md
```
And the new `.github/workflows/codex.yml` workflow declares the
following triggers:
```yaml
on:
  issues:
    types: [opened, labeled]
  pull_request:
    branches: [main]
    types: [labeled]
```
as well as the following expression to gate the action:
```
jobs:
  codex:
    if: |
      (github.event_name == 'issues' && (
        (github.event.action == 'labeled' && (github.event.label.name == 'codex-attempt' || github.event.label.name == 'codex-investigate-issue'))
      )) ||
      (github.event_name == 'pull_request' && github.event.action == 'labeled' && github.event.label.name == 'codex-code-review')
```
Note the "actor" who added the label must have write access to the repo
for the action to take effect.
After adding a label, the action will "ack" the request by replacing the
original label (e.g., `codex-review`) with an `-in-progress` suffix
(e.g., `codex-review-in-progress`). When it is finished, it will swap
the `-in-progress` label with a `-completed` one (e.g.,
`codex-review-completed`).
Users of the action are responsible for providing an `OPENAI_API_KEY`
and making it available as a secret to the action.
The way these definitions worked before, they did not handle quoted args
with spaces properly.
For example, if you had `/tmp/test-just/printlen.py` as:
```python
#!/usr/bin/env python3
import sys
print(len(sys.argv))
```
and your `justfile` was:
```
printlen *args:
    /tmp/test-just/printlen.py {{args}}
```
Then:
```shell
$ just printlen foo bar
3
$ just printlen 'foo bar'
3
```
which is not what we want: `'foo bar'` should be treated as one
argument.
The fix is to use
[positional-arguments](515e806b51/README.md (L1131)):
```
set positional-arguments
printlen *args:
/tmp/test-just/printlen.py "$@"
```
The output of an MCP server tool call can be one of several types, but
to date, we treated all outputs as text by showing the serialized JSON
as the "tool output" in Codex:
25a9949c49/codex-rs/mcp-types/src/lib.rs (L96-L101)
This PR adds support for the `ImageContent` variant so we can now
display an image output from an MCP tool call.
In making this change, we introduce a new
`ResponseInputItem::McpToolCallOutput` variant so that we can work with
the `mcp_types::CallToolResult` directly when the function call is made
to an MCP server.
Though arguably the more significant change is the introduction of
`HistoryCell::CompletedMcpToolCallWithImageOutput`, which is a cell that
uses `ratatui_image` to render an image into the terminal. To support
this, we introduce `ImageRenderCache`, cache a
`ratatui_image::picker::Picker`, and `ensure_image_cache()` to cache the
appropriate scaled image data and dimensions based on the current
terminal size.
To test, I created a minimal `package.json`:
```json
{
"name": "kitty-mcp",
"version": "1.0.0",
"type": "module",
"description": "MCP that returns image of kitty",
"main": "index.js",
"dependencies": {
"@modelcontextprotocol/sdk": "^1.12.0"
}
}
```
with the following `index.js` to define the MCP server:
```js
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { readFile } from "node:fs/promises";
import { join } from "node:path";
const IMAGE_URI = "image://Ada.png";
const server = new McpServer({
  name: "Demo",
  version: "1.0.0",
});
server.tool(
  "get-cat-image",
  "If you need a cat image, this tool will provide one.",
  async () => ({
    content: [
      { type: "image", data: await getAdaPngBase64(), mimeType: "image/png" },
    ],
  })
);
server.resource("Ada the Cat", IMAGE_URI, async (uri) => {
  const base64Image = await getAdaPngBase64();
  return {
    contents: [
      {
        uri: uri.href,
        mimeType: "image/png",
        blob: base64Image,
      },
    ],
  };
});
async function getAdaPngBase64() {
  const __dirname = new URL(".", import.meta.url).pathname;
  // From 9705ce2c59/assets/Ada.png
  const filePath = join(__dirname, "Ada.png");
  const imageData = await readFile(filePath);
  const base64Image = imageData.toString("base64");
  return base64Image;
}
const transport = new StdioServerTransport();
await server.connect(transport);
```
With the local changes from this PR, I added the following to my
`config.toml`:
```toml
[mcp_servers.kitty]
command = "node"
args = ["/Users/mbolin/code/kitty-mcp/index.js"]
```
Running the TUI from source:
```
cargo run --bin codex -- --model o3 'I need a picture of a cat'
```
I get:
<img width="732" alt="image"
src="https://github.com/user-attachments/assets/bf80b721-9ca0-4d81-aec7-77d6899e2869"
/>
Now, that said, I have only tested in iTerm and there is definitely some
funny business with getting an accurate character-to-pixel ratio
(sometimes the `CompletedMcpToolCallWithImageOutput` thinks it needs 10
rows to render instead of 4), so there is still work to be done here.
The motivation behind this PR is to make it so a `HistoryCell` is more
like a `WidgetRef` that knows how to render itself into a `Rect` so that
it can be backed by something other than a `Vec<Line>`. Because a
`HistoryCell` is intended to appear in a scrollable list, we want to
ensure the stack of cells can be scrolled one `Line` at a time even if
the `HistoryCell` is not backed by a `Vec<Line>` itself.
To this end, we introduce the `CellWidget` trait whose key method is:
```
fn render_window(&self, first_visible_line: usize, area: Rect, buf: &mut Buffer);
```
The `first_visible_line` param is what differs from
`WidgetRef::render_ref()`, as a `CellWidget` needs to know the offset
into its "full view" at which it should start rendering.
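For example, a cell still backed by a `Vec<Line>` could satisfy the trait by skipping `first_visible_line` lines before rendering (a sketch, not the repo's actual implementations):
```rust
use ratatui::buffer::Buffer;
use ratatui::layout::Rect;
use ratatui::text::Line;
use ratatui::widgets::{Paragraph, Widget};

struct PlainCell {
    lines: Vec<Line<'static>>,
}

impl PlainCell {
    fn render_window(&self, first_visible_line: usize, area: Rect, buf: &mut Buffer) {
        // Skip the lines scrolled past, then render at most `area.height` lines.
        let visible: Vec<Line<'static>> = self
            .lines
            .iter()
            .skip(first_visible_line)
            .take(area.height as usize)
            .cloned()
            .collect();
        Paragraph::new(visible).render(area, buf);
    }
}
```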
The bookkeeping in `ConversationHistoryWidget` has been updated
accordingly to ensure each `CellWidget` in the history is rendered
appropriately.
This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:
```
codex --config model=o4-mini
```
Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.
Effectively, it means there are four levels of configuration for some
values:
- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)
Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.
Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.
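A condensed sketch of that override step; the helper name and error handling here are illustrative:
```rust
use toml::Value;

// Applies a single `-c key=value` override onto the parsed `config.toml`,
// creating intermediate tables for dotted keys as needed.
fn apply_override(root: &mut Value, key: &str, raw_value: &str) -> Option<()> {
    // Parse the value by wrapping it in a tiny synthetic TOML document.
    let parsed = match format!("x = {raw_value}").parse::<Value>() {
        Ok(doc) => doc.get("x")?.clone(),
        // Bare words like `model=o3` are not valid TOML values, so fall back
        // to treating them as plain strings.
        Err(_) => Value::String(raw_value.to_string()),
    };

    let segments: Vec<&str> = key.split('.').collect();
    let (last, parents) = segments.split_last()?;
    let mut node = root;
    for segment in parents {
        node = node
            .as_table_mut()?
            .entry((*segment).to_string())
            .or_insert_with(|| Value::Table(toml::map::Map::new()));
    }
    node.as_table_mut()?.insert((*last).to_string(), parsed);
    Some(())
}
```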
I discovered that if I ran `codex <PROMPT>` in a cwd that was not a Git
repo, Codex did not automatically run `<PROMPT>` after I accepted the
Git warning. It appears that we were not managing the `AppState`
transition correctly, so this fixes the bug and ensures the Codex
session does not start until the user accepts the Git warning.
In particular, we now create the `ChatWidget` lazily and store it in the
`AppState::Chat` variant.
Historically, we spawned the Seatbelt and Landlock sandboxes in
substantially different ways:
For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
specified as an arg followed by the original command:
d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)
For **Landlock/Seccomp**, we would do
`tokio::runtime::Builder::new_current_thread()`, _invoke
Landlock/Seccomp APIs to modify the permissions of that new thread_, and
then spawn the command:
d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)
While it is neat that Landlock/Seccomp supports applying a policy to
only one thread without having to apply it to the entire process, it
requires us to maintain two different codepaths and is a bit harder to
reason about. The tipping point was
https://github.com/openai/codex/pull/1061, in which we had to start
building up the `env` in an unexpected way for the existing
Landlock/Seccomp approach to continue to work.
This PR overhauls things so that we do similar things for Mac and Linux.
It turned out that we were already building our own "helper binary"
comparable to Mac's `sandbox-exec` as part of the `cli` crate:
d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)
We originally created this to build a small binary to include with the
Node.js version of the Codex CLI to provide support for Linux
sandboxing.
Though the sticky bit is that, at this point, we still want to deploy
the Rust version of Codex as a single, standalone binary rather than a
CLI and a supporting sandboxing binary. To satisfy this goal, we use
"the arg0 trick," in which we:
* use `std::env::current_exe()` to get the path to the CLI that is
currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`
A CLI that supports sandboxing should check arg0 at the start of the
program. If it is `"codex-linux-sandbox"`, it must invoke
`codex_linux_sandbox::run_main()`, which runs the CLI as if it were
`codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
appropriate Landlock/Seccomp API calls and then use `execvp(3)` to run
the original command, so we _replace_ the process rather than spawning a
subprocess. Incidentally, we do this before starting the Tokio runtime,
so the process should only have one thread when `execvp(3)` is called.
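A small sketch of the spawn side of that trick (Unix-only, error handling trimmed; names are illustrative):
```rust
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

// Re-invoke the currently running CLI as the sandbox helper by setting arg0;
// the program path stays the same single binary.
fn spawn_linux_sandbox(sandboxed_cmd: &[String]) -> std::io::Result<Child> {
    let self_exe = std::env::current_exe()?;
    Command::new(self_exe)
        .arg0("codex-linux-sandbox") // the helper branch keys off arg0
        .args(sandboxed_cmd)
        .spawn()
}
```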
Because the `core` crate that needs to spawn the Linux sandboxing is not
a CLI in its own right, this means that every CLI that includes `core`
and relies on this behavior has to (1) implement it and (2) provide the
path to the sandboxing executable. While the path is almost always
`std::env::current_exe()`, we needed to make this configurable for
integration tests, so `Config` now has a `codex_linux_sandbox_exe:
Option<PathBuf>` property to facilitate threading this through,
introduced in https://github.com/openai/codex/pull/1089.
This common pattern is now captured in
`codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
functions that should use it have been updated as part of this PR.
The `codex-linux-sandbox` crate added to the Cargo workspace as part of
this PR now has the bulk of the Landlock/Seccomp logic, which makes
`core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
`core/src/landlock.rs` were removed/ported as part of this PR. I also
moved the unit tests for this code into an integration test,
`linux-sandbox/tests/landlock.rs`, in which I use
`env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
`codex_linux_sandbox_exe` since `std::env::current_exe()` is not
appropriate in that case.
https://github.com/openai/codex/pull/1086 is a work-in-progress to make
Linux sandboxing work more like Seatbelt where, for the command we want
to sandbox, we build up the command and then hand it, and some sandbox
configuration flags, to another command to set up the sandbox and then
run it.
In the case of Seatbelt, macOS provides this helper binary and provides
it at `/usr/bin/sandbox-exec`. For Linux, we have to build our own and
pass it through (which is what #1086 does), so this makes the new
`codex_linux_sandbox_exe` available on `Config` so that it will later be
available in `exec.rs` when we need it in #1086.
Added logic so that when we run `./scripts/stage_release.sh --native`
(for the `@native` version of the Node module), we drop a `use-native`
file next to `codex.js`. If present, `codex.js` will now run the Rust
CLI.
Ran `./scripts/stage_release.sh --native` and verified that when running
`codex.js` in the staged folder:
```
$ /var/folders/wm/f209bc1n2bd_r0jncn9s6j_00000gp/T/tmp.efvEvBlSN6/bin/codex.js --version
codex-cli 0.0.2505220956
```
it ran the expected Rust version of the CLI, as desired.
While here, I also updated the Rust version to one that I cut today,
which includes the new shell environment policy config option:
https://github.com/openai/codex/pull/1061. Note this may "break" some
users if the processes spawned by Codex need extra environment
variables. (We are still working to determine what the right defaults
should be for this option.)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
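A heavily simplified sketch of that filtering order; the real `exec_env.rs` applies case-insensitive globs, for which prefix matching stands in here:
```rust
use std::collections::HashMap;

fn build_env(
    parent: &HashMap<String, String>,
    inherit_all: bool,
    set: &HashMap<String, String>,
    include_only: &[String],
) -> HashMap<String, String> {
    const CORE_VARS: [&str; 3] = ["HOME", "PATH", "USER"];
    let mut env: HashMap<String, String> = parent
        .iter()
        // Step 1: start from the `core` template or the full parent env.
        .filter(|(name, _)| inherit_all || CORE_VARS.contains(&name.as_str()))
        // Step 2: default excludes drop anything that looks like a credential.
        .filter(|(name, _)| {
            let upper = name.to_uppercase();
            !["KEY", "SECRET", "TOKEN"].iter().any(|s| upper.contains(s))
        })
        .map(|(name, value)| (name.clone(), value.clone()))
        .collect();

    // Step 3: explicit `set` entries always win over inherited values.
    env.extend(set.iter().map(|(name, value)| (name.clone(), value.clone())));

    // Step 4: if `include_only` is non-empty, only matching names survive.
    if !include_only.is_empty() {
        env.retain(|name, _| {
            include_only
                .iter()
                .any(|pattern| name.starts_with(pattern.trim_end_matches('*')))
        });
    }
    env
}
```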
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
Now the `exec` output starts with something like:
```
--------
workdir: /Users/mbolin/code/codex/codex-rs
model: o3
provider: openai
approval: Never
sandbox: SandboxPolicy { permissions: [DiskFullReadAccess, DiskWritePlatformUserTempFolder, DiskWritePlatformGlobalTempFolder, DiskWriteCwd, DiskWriteFolder { folder: "/Users/mbolin/.pyenv/shims" }] }
--------
```
which makes it easier to reason about when looking at logs.
`config.rs` is already quite long without these definitions. Since they
have no real dependencies of their own, let's move them to their own
file so `config.rs` can focus on the business logic of loading a config.
This introduces an experimental `--output-last-message` flag that can be
used to identify a file where the final message from the agent will be
written. Two use cases:
- Ultimately, we will likely add a `--quiet` option to `exec`, but even
if the user does not want any output written to the terminal, they
probably want to know what the agent did. Writing the output to a file
makes it possible to get that information in a clean way.
- Relatedly, when using `exec` in CI, it is easier to review the
transcript written "normally," (i.e., not as JSON or something with
extra escapes), but getting programmatic access to the last message is
likely helpful, so writing the last message to a file gets the best of
both worlds.
I am calling this "experimental" because it is possible that we are
overfitting and will want a more general solution to this problem that
would justify removing this flag.
## Summary
- add `--login` and `--free` flags to cli help
- handle `--login` and `--free` logic in cli
- factor out redeem flow into `maybeRedeemCredits`
- call new helper from login callback
Prior to this PR, I would frequently see glyphs from previous frames
"bleed" through like this:

I think this was due to two issues (now addressed in this PR):
* We were not making use of `ratatui::widgets::Clear` to clear out the
buffer before drawing into it.
* To calculate the `width` used with `wrapped_line_count_for_cell()`, we
were not accounting for the scrollbar.
  * Now we calculate `effective_width` using
`inner.width.saturating_sub(1)` where the `1` is for the scrollbar.
  * We compute `text_area` using `effective_width` and pass the `text_area`
to `paragraph.render()`.
  * We eliminate the conditional `needs_scrollbar` check and always call
`render(Scrollbar)`.
I suspect this bug was introduced in
https://github.com/openai/codex/pull/937, though I did not try to
verify: I'm just happy that it appears to be fixed!
Previously, if the first user message was sent with the command
invocation, e.g.:
```
$ cargo run --bin codex 'hello'
```
Then the user message was added as the first entry in the history and
then `is_first_event` would be `false` here:
031df77dfb/codex-rs/tui/src/conversation_history_widget.rs (L178-L179)
which would prevent the "welcome" message with things like the model
version from displaying.
The fix in this PR is twofold:
* Reorganize the logic so the `ChatWidget` constructor stores
`initial_user_message` rather than sending it right away. Now inside
`handle_codex_event()`, it waits for the `SessionConfigured` event and
sends the `initial_user_message`, if it exists.
* In `conversation_history_widget.rs`, `add_session_info()` checks to
see whether a `WelcomeMessage` exists in the history when determining
the value of `has_welcome_message`. By construction, we expect that
`WelcomeMessage` is always the first message (in which case the existing
`let is_first_event = self.entries.is_empty();` logic would be sound),
but we decide to be extra defensive in case an `EventMsg::Error` is
processed before `EventMsg::SessionConfigured`.
When running `npm test` on `codex-cli`, the test
`agent-cancel-prev-response.test.ts` logs a significant body of text to
console for no obvious reason.
This is not helpful, as it makes test logs messy and far longer.
This change deletes the `console.log(...)` that produces the behavior.
I did a bit of research to understand why I could not use my mouse to
drag to select text to copy to the clipboard in iTerm.
Apparently https://github.com/openai/codex/pull/641 to enable mousewheel
scrolling broke this functionality. It seems that, unless we put in a
bit of effort, we can have drag-to-select or scrolling, but not both.
Though if you know the trick of holding down `Option` while dragging with
the mouse in iTerm, you can probably get by with this. (I did not know
about this option prior to researching this issue.)
Nevertheless, users may still prefer to disable mouse capture
altogether, so this PR introduces:
* the ability to set `tui.disable_mouse_capture = true` in `config.toml`
to disable mouse capture
* a new command, `/toggle-mouse-mode` to toggle mouse capture
The new `codex-mini-latest` model expects a new tool with `{"type":
"local_shell"}`. Its contract is similar to the existing `function` tool
with `"name": "shell"`, so this takes the `local_shell` tool call into
`ExecParams` and sends it through the existing
`handle_container_exec_with_params()` code path.
This also adds the following logic when adding the default set of tools
to a request:
```rust
let default_tools = if self.model.starts_with("codex") {
    &DEFAULT_CODEX_MODEL_TOOLS
} else {
    &DEFAULT_TOOLS
};
```
That is, if the model name starts with `"codex"`, we add `{"type":
"local_shell"}` to the list of tools; otherwise, we add the
aforementioned `shell` tool.
To test this, I ran the TUI with `-m codex-mini-latest` and verified
that it used the `local_shell` tool. Though I also had some entries in
`[mcp_servers]` in my personal `config.toml`. The `codex-mini-latest`
model seemed eager to try the tools from the MCP servers first, so I
have personally commented them out for now, so keep an eye out if you're
testing `codex-mini-latest`!
Perhaps we should include more details with `{"type": "local_shell"}` or
update the following:
fd0b1b0208/codex-rs/core/prompt.md
For reference, the corresponding change in the TypeScript CLI is
https://github.com/openai/codex/pull/951.
## `0.1.2505161243`
- Sign in with chatgpt (#963)
- Session history viewer (#912)
- Apply patch issue when using different cwd (#942)
- Diff command for filenames with special characters (#954)
- A new “/sessions” command is available for browsing previous sessions,
as shown in the updated slash command list
- The CLI now documents and parses a new “--history” flag to browse past
sessions from the command line
- A dedicated `SessionsOverlay` component loads session metadata and
allows toggling between viewing and resuming sessions
- When the sessions overlay is opened during a chat, selecting a session
can either show the saved rollout or resume it
If you run a codex session whose working directory differs from the one
where you launched the codex binary, it won't be able to apply patches
correctly, even if the sandbox policy allows it. This manifests in weird
behaviours, such as:
* Reading a file from the binary's working directory and overwriting the
file of the same name in the session's working directory. E.g., if you
have a `readme` in both folders, the `readme` in the session working
directory gets overwritten with the binary working directory's `readme`
*with the suggested patch applied*.
* The LLM ends up in weird loops trying to verify and debug why
apply_patch won't work, and it can result in it applying patches by
manually writing Python or JavaScript instead, if it figures out that
either is supported by the system.
I added a test-case to ensure that the patch contents are based on the
cwd.
## Issue: mixing relative & absolute paths in apply_patch
1. The apply_patch tool uses relative paths based on the session working
directory.
2. `unified_diff_from_chunks` eventually ends up [reading the source
file](https://github.com/reflectionai/codex/blob/main/codex-rs/apply-patch/src/lib.rs#L410)
to figure out what the diff is, by using the relative path.
3. The changes are targeted using an absolute path derived from the
current working directory.
The end-result in case session working directory differs from the binary
working directory: we get the diff for a file relative to the binary
working directory, and apply it on a file in the session working
directory.
## Summary
- fix quoting issues in `/diff` to correctly handle files with special
characters
- add regression test for `getGitDiff` when filenames contain `$`
- relax timeout in raw-exec-process-group test
Fixes https://github.com/openai/codex/issues/943
## Testing
- `pnpm test`
When I originally wrote `elapsed.rs`, I realized we were using both
`std::time` and `chrono` with no real benefit of having both. We should
try to keep the `exec` subcommand trim (as it also buildable as a
standalone executable), so this helps tighten things up.
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
3fdf9df133/codex-cli/src/utils/storage/command-history.ts (L10-L17)
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
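A sketch of the append path; the real `message_history.rs` also takes an advisory file lock around the write, which is omitted here:
```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

// Appends one JSONL entry to $CODEX_HOME/history.jsonl, creating the file
// with 0o600 so other users on the system cannot read it.
fn append_history_entry(history_path: &Path, json_line: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .mode(0o600)
        .open(history_path)?;
    writeln!(file, "{json_line}")
}
```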
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions appending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
Moving to Rust 1.87 introduced a clippy warning that
`SendError<AppEvent>` was too large.
In practice, the only thing we ever did when we got this error was log
it (if the mspc channel is closed, then the app is likely shutting down
or something, so there's not much to do...), so this finally motivated
me to introduce `AppEventSender`, which wraps
`std::sync::mpsc::Sender<AppEvent>` with a `send()` method that invokes
`send()` on the underlying `Sender` and logs an `Err` if it gets one.
This greatly simplifies the code, as many functions that previously
returned `Result<(), SendError<AppEvent>>` now return `()`, so we don't
have to propagate an `Err` all over the place that we don't really
handle, anyway.
This also makes it so we can upgrade to Rust 1.87 in CI.
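The wrapper is roughly this shape (a sketch; `AppEvent` stands in for the TUI's real event enum):
```rust
use std::sync::mpsc::Sender;

enum AppEvent {
    Redraw,
    // ...
}

#[derive(Clone)]
struct AppEventSender(Sender<AppEvent>);

impl AppEventSender {
    // If the receiver is gone the app is shutting down, so log and move on
    // instead of making every caller propagate a SendError.
    fn send(&self, event: AppEvent) {
        if let Err(err) = self.0.send(event) {
            eprintln!("failed to send AppEvent: {err}");
        }
    }
}
```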
Previously, our GitHub actions specified the Rust toolchain as
`dtolnay/rust-toolchain@stable`, which meant the version could change
out from under us. In this case, the move from 1.86 to 1.87 introduced
new clippy warnings, causing build failures.
Because it will take a little time to fix all the new clippy warnings,
this PR pins things to 1.86 for now to unbreak the build.
It also replaces `io::Error::new(io::ErrorKind::Other)` with
`io::Error::other()` in preparation for 1.87.
As discussed on
699ec5a87f (commitcomment-156776835),
to properly support scrolling long content in Ratatui for a sequence of
cells, we need to:
* take the `Vec<Line>` for each cell
* using the wrapping logic we want to use at render time, compute the
_effective line count_ using `Paragraph::line_count()` (see
`wrapped_line_count_for_cell()` in this PR)
* sum up the effective line count to compute the height of the area
being scrolled
* given a `scroll_position: usize`, index into the list of "effective
lines" and accumulate the appropriate `Vec<Line>` for the cells that
should be displayed
* take that `Vec<Line>` to create a `Paragraph` and use the same
line-wrapping policy that was used in `wrapped_line_count_for_cell()`
* display the resulting `Paragraph` and use the accounting to display a
scrollbar with the appropriate thumb size and offset without having to
render the `Vec<Line>` for the full history
With this change, lines wrap as I expect and everything appears to
redraw correctly as I resize my terminal!
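The per-cell accounting step looks roughly like this (a sketch; note that `Paragraph::line_count()` is behind a ratatui feature flag, and the wrap settings must match the ones used at render time):
```rust
use ratatui::text::Line;
use ratatui::widgets::{Paragraph, Wrap};

// Effective height of one cell at the current width, i.e. how many terminal
// rows its lines occupy once wrapped.
fn wrapped_line_count_for_cell(lines: Vec<Line<'static>>, width: u16) -> usize {
    Paragraph::new(lines)
        .wrap(Wrap { trim: false })
        .line_count(width)
}
```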
For now, this removes the `#[non_exhaustive]` directive on `EventMsg` so
that we are forced to handle all `EventMsg` by default. (We may revisit
this if/when we publish `core/` as a `lib` crate.) For now, it is
helpful to have this as a forcing function because we have effectively
two UIs (`tui` and `exec`) and usually when we add a new variant to
`EventMsg`, we want to be sure that we update both.
Previously, running Codex as an MCP server required a standalone binary
in our Cargo workspace, but this PR makes it available as a subcommand
(`mcp`) of the main CLI.
Ran this with:
```
RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run --bin codex -- mcp
```
and verified it worked as expected in the inspector at
`http://127.0.0.1:6274/`.
Introduces support for slash commands like in the TypeScript CLI. We do
not support the full set of commands yet, but the core abstraction is
there now.
In particular, we have a `SlashCommand` enum and due to thoughtful use
of the [strum](https://crates.io/crates/strum) crate, it requires
minimal boilerplate to add a new command to the list.
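For illustration, a strum-backed enum along these lines keeps parsing, listing, and display together; the variants shown here are placeholders, not the repo's actual command list:
```rust
use strum_macros::{AsRefStr, EnumIter, EnumString};

// EnumString parses "toggle-mouse-mode" into a variant, AsRefStr renders it
// back, and EnumIter provides SlashCommand::iter() for building the popup.
#[derive(Clone, Copy, Debug, AsRefStr, EnumIter, EnumString)]
#[strum(serialize_all = "kebab-case")]
enum SlashCommand {
    New,
    ToggleMouseMode,
    Quit,
}

impl SlashCommand {
    fn description(self) -> &'static str {
        match self {
            SlashCommand::New => "Start a new conversation",
            SlashCommand::ToggleMouseMode => "Toggle mouse capture",
            SlashCommand::Quit => "Exit Codex",
        }
    }
}
```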
The key new piece of UI is `CommandPopup`, though the keyboard events
are still handled by `ChatComposer`. The behavior is roughly as follows:
* if the first character in the composer is `/`, the command popup is
displayed (if you really want to send a message to Codex that starts
with a `/`, simply put a space before the `/`)
* while the popup is displayed, up/down can be used to change the
selection of the popup
* if there is a selection, hitting tab completes the command, but does
not send it
* if there is a selection, hitting enter sends the command
* if the prefix of the composer matches a command, the command will be
visible in the popup so the user can see the description (commands could
take arguments, so additional text may appear after the command name
itself)
https://github.com/user-attachments/assets/39c3e6ee-eeb7-4ef7-a911-466d8184975f
Incidentally, Codex wrote almost all the code for this PR!
`BottomPane` was getting a bit unwieldy because it maintained a
`PaneState` enum with three variants and many of its methods had `match`
statements to handle each variant. To replace the enum, this PR:
* Introduces a `trait BottomPaneView` that has two implementations:
`StatusIndicatorView` and `ApprovalModalView`.
* Migrates `PaneState::TextInput` into its own struct, `ChatComposer`,
that does **not** implement `BottomPaneView`.
* Updates `BottomPane` so it has `composer: ChatComposer` and
`active_view: Option<Box<dyn BottomPaneView<'a> + 'a>>`. The idea is
that `active_view` takes priority and is displayed when it is `Some`;
otherwise, `ChatComposer` is displayed.
* While methods of `BottomPane` often have to check whether
`active_view` is present to decide which component to delegate to, the
code is more straightforward than before and introducing new
implementations of `BottomPaneView` should be less painful.
Because we want to retain the `TextArea` owned by `ChatComposer` even
when another view is displayed, to keep the ownership logic simple, it
seemed best to keep `ChatComposer` distinct from `BottomPaneView`.
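The resulting shape is roughly the following (a sketch; the real trait has more methods for key handling and sizing, and `ChatComposer` is reduced to a stub):
```rust
use ratatui::buffer::Buffer;
use ratatui::layout::Rect;

trait BottomPaneView {
    fn render(&self, area: Rect, buf: &mut Buffer);
}

struct ChatComposer;

impl ChatComposer {
    fn render(&self, _area: Rect, _buf: &mut Buffer) {
        // Draw the text area the user types into.
    }
}

struct BottomPane<'a> {
    composer: ChatComposer,
    active_view: Option<Box<dyn BottomPaneView + 'a>>,
}

impl<'a> BottomPane<'a> {
    fn render(&self, area: Rect, buf: &mut Buffer) {
        // Whatever view is active takes priority; otherwise show the composer.
        match &self.active_view {
            Some(view) => view.render(area, buf),
            None => self.composer.render(area, buf),
        }
    }
}
```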
More about codespell: https://github.com/codespell-project/codespell .
I have personally introduced it to dozens if not hundreds of projects
already and have so far received only positive feedback.
The CI workflow has 'permissions' set only to 'read', so it should also be safe.
Let me know if you would rather just take the typo fixes in and drop the CI.
---------
Signed-off-by: Yaroslav O. Halchenko <debian@onerussian.com>
## `0.1.2505140839`
### 🪲 Bug Fixes
- Gpt-4.1 apply_patch handling (#930)
- Add support for fileOpener in config.json (#911)
- Patch in #366 and #367 for marked-terminal (#916)
- Remember to set lastIndex = 0 on shared RegExp (#918)
- Always load version from package.json at runtime (#909)
- Tweak the label for citations for better rendering (#919)
- Tighten up some logic around session timestamps and ids (#922)
- Change EventMsg enum so every variant takes a single struct (#925)
- Reasoning default to medium, show workdir when supplied (#931)
- Test_dev_null_write() was not using echo as intended (#923)
While the `TextArea` used in the Rust TUI is "multiline," it is not like
an HTML `<textarea>` in that it does not wrap, so there was not much
benefit to setting `MIN_TEXTAREA_ROWS` to `3`, so this PR changes it to
`1`. Though there are now three ways to "increase" the height due to
actual linebreaks:
* paste in multiline content (this worked before this PR)
* pressing `Ctrl+J` will insert a newline
* if you have your terminal emulator set such that it is possible to
press something that `crossterm` interprets as "Enter plus some
modifier," then now that will also work
Now things look a bit more compact on startup:
<img width="745" alt="image"
src="https://github.com/user-attachments/assets/86e2857f-f31c-46f5-a80b-1ab2120b266e"
/>
I believe this test was meant to verify that echoing content to `/dev/null`
succeeded, but instead it was testing the equivalent of `echo
'blah > /dev/null'`.
https://github.com/openai/codex/pull/922 did this for the
`SessionConfigured` enum variant, and I think it is generally helpful to
be able to work with each enum variant's values as their own type,
so this converts the remaining variants and updates all of the
callsites.
Added a simple unit test to verify that the JSON-serialized version of
`Event` does not have any unexpected nesting.
* update `SessionConfigured` event to include the UUID for the session
* show the UUID in the Rust TUI
* use local timestamps in log files instead of UTC
* include timestamps in log file names for easier discovery
This introduces a much-needed "profile" concept where users can specify
a collection of options under one name and then pass that via
`--profile` to the CLI.
This PR introduces the `ConfigProfile` struct and makes it a field of
`ConfigToml`. It further updates
`Config::load_from_base_config_with_overrides()` to respect
`ConfigProfile`, overriding default values where appropriate. A detailed
unit test is added at the end of `config.rs` to verify this behavior.
Details on how to use this feature have also been added to
`codex-rs/README.md`.
Since the repo currently has two different implementations of Codex, the
flake was updated to work with both the TypeScript implementation and the
Rust implementation.
Adds a space so that sequential citations have some more breathing room.
As I had to update the tests for this change, I also introduced a
`toDiffableString()` helper to make the test easier to update as we make
formatting changes to the output.
This PR uses [`pnpm
patch`](https://www.petermekhaeil.com/til/pnpm-patch/) to pull in the
following proposed fixes for `marked-terminal`:
* https://github.com/mikaelbr/marked-terminal/pull/366
* https://github.com/mikaelbr/marked-terminal/pull/367
This adds a substantial test to `codex-cli/tests/markdown.test.tsx` to
verify the new behavior.
Note that one of the tests shows two citations being split across a line
even though the rendered version would fit comfortably on one line.
Changing this likely requires a subtle fix to `marked-terminal` to
account for "rendered length" when determining line breaks.
This PR introduces the following type:
```typescript
export type FileOpenerScheme = "vscode" | "cursor" | "windsurf";
```
and uses it as the new type for a `fileOpener` option in `config.json`.
If set, this will be used to linkify file annotations in the output
using the URI-based file opener supported in VS Code-based IDEs.
Currently, this does not pass:
Updated `codex-cli/tests/markdown.test.tsx` to verify the new behavior.
Note it required mocking `supports-hyperlinks` and temporarily modifying
`chalk.level` to yield the desired output.
Note the high-level motivation behind this change is to avoid the need
to make temporary changes in the source tree in order to cut a release
build since that runs the risk of leaving things in an inconsistent
state in the event of a failure. The existing code:
```
import pkg from "../../package.json" assert { type: "json" };
```
did not work as intended because, as written, ESBuild would bake the
contents of the local `package.json` into the release build at build
time whereas we want it to read the contents at runtime so we can use
the `package.json` in the tree to build the code and later inject a
modified version into the release package with a timestamped build
version.
Changes:
* move `CLI_VERSION` out of `src/utils/session.ts` and into
`src/version.ts` so `../package.json` is a correct relative path both
from `src/version.ts` in the source tree and also in the final
`dist/cli.js` build output
* change `assert` to `with` in `import pkg` as apparently `with` became
standard in Node 22
* mark `"../package.json"` as external in `build.mjs` so the version is
not baked into the `.js` at build time
After using `pnpm stage-release` to build a release version, if I use
Node 22.0 to run Codex, I see the following printed to stderr at
startup:
```
(node:71308) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
```
Note it is a warning and does not prevent Codex from running.
In Node 22.12, the warning goes away, but the warning still appears in
Node 22.11. For Node 22, 22.15.0 is the current LTS version, so LTS
users will not see this.
Also, something about moving the definition of `CLI_VERSION` caused a
problem with the mocks in `check-updates.test.ts`. I asked Codex to fix
it, and it came up with the change to the test configs. I don't know
enough about vitest to understand what it did, but the tests seem
healthy again, so I'm going with it.
I had seen issues where `codex-rs` would not always write files without
me pressuring it to do so, and between that and the report of
https://github.com/openai/codex/issues/900, I decided to look into this
further. I found two serious issues with agent instructions:
(1) We were only sending agent instructions on the first turn, but
looking at the TypeScript code, we should be sending them on every turn.
(2) There was a serious issue where the agent instructions were
frequently lost:
* The TypeScript CLI appears to keep writing `~/.codex/instructions.md`:
55142e3e6c/codex-cli/src/utils/config.ts (L586)
* If `instructions.md` is present, the Rust CLI uses the contents of it
INSTEAD OF the default prompt, even if `instructions.md` is empty:
55142e3e6c/codex-rs/core/src/config.rs (L202-L203)
The combination of these two things means that I have been using
`codex-rs` without these key instructions:
https://github.com/openai/codex/blob/main/codex-rs/core/prompt.md
Looking at the TypeScript code, it appears we should be concatenating
these three items every time (if they exist):
* `prompt.md`
* `~/.codex/instructions.md`
* nearest `AGENTS.md`
This PR fixes things so that:
* `Config.instructions` is `None` if `instructions.md` is empty
* `Payload.instructions` is now `&'a str` instead of `Option<&'a
String>` because we should always have _something_ to send
* `Prompt` now has a `get_full_instructions()` helper that returns a
`Cow<str>` that will always include the agent instructions first.
This PR fixes a potential path traversal vulnerability by ensuring all
paths are properly normalized in the `resolvePathAgainstWorkdir`
function.
## Changes
- Added path normalization for both absolute and relative paths
- Ensures normalized paths are used in all subsequent operations
- Prevents potential path traversal attacks through non-normalized paths
This minimal change addresses the security concern without adding
unnecessary complexity, while maintaining compatibility with existing
code.
This PR introduces an optional build flag, `--native`, that will build a
version of the Codex npm module that:
- Includes both the Node.js and native Rust versions (for Mac and Linux)
- Will run the native version if `CODEX_RUST=1` is set
- Runs the TypeScript version otherwise
Note this PR also updates the workflow URL to
https://github.com/openai/codex/actions/runs/14872557396, as that is a
build from today that includes everything up through
https://github.com/openai/codex/pull/843.
Test Plan:
In `~/code/codex/codex-cli`, I ran:
```
pnpm stage-release --native
```
The end of the output was:
```
Staged version 0.1.2505121317 for release in /var/folders/wm/f209bc1n2bd_r0jncn9s6j_00000gp/T/tmp.xd2p5ETYGN
Test Node:
node /var/folders/wm/f209bc1n2bd_r0jncn9s6j_00000gp/T/tmp.xd2p5ETYGN/bin/codex.js --help
Test Rust:
CODEX_RUST=1 node /var/folders/wm/f209bc1n2bd_r0jncn9s6j_00000gp/T/tmp.xd2p5ETYGN/bin/codex.js --help
Next: cd "/var/folders/wm/f209bc1n2bd_r0jncn9s6j_00000gp/T/tmp.xd2p5ETYGN" && npm publish --tag native
```
I verified that running each of these commands ran the expected version
of Codex.
While here, I also added `bin` to the `files` list in `package.json`,
which should have been done as part of
https://github.com/openai/codex/pull/757, as that added new entries to
`bin` that were matched by `.gitignore` but should have been included in
a release.
Adds `expect()` as a denied lint. The same deal as with `unwrap()` applies:
we now need to put `#[expect(...)]` on the ones we legitimately want. Took
care to still allow `expect()` in test contexts.
# Tests
```
cargo fmt
cargo clippy --all-features --all-targets --no-deps -- -D warnings
cargo test
```
This PR fixes things so that:
* when the `BottomPane` is in the `StatusIndicator` state, the border
should be dim
* when the `BottomPane` does not have input focus, the border should be
dim
To make it easier to enforce this invariant, this PR introduces
`BottomPane::set_state()` that will:
* update `self.state`
* call `update_border_for_input_focus()`
* request a repaint
This should make it easier to enforce other updates for state changes
going forward.
As shown in the screenshot, we now include reasoning messages from the
model in the TUI under the heading "codex reasoning":

To ensure these are visible by default when using `o4-mini`, this also
changes the default value for `summary` (formerly `generate_summary`,
which is deprecated in favor of `summary` according to the docs) from
unset to `"auto"`.
The TypeScript CLI already has support for including the contents of
`AGENTS.md` in the instructions sent with the first turn of a
conversation. This PR brings this functionality to the Rust CLI.
To be considered, `AGENTS.md` must be in the `cwd` of the session, or in
one of the parent folders up to a Git/filesystem root (whichever is
encountered first).
By default, a maximum of 32 KiB of `AGENTS.md` will be included, though
this is configurable using the new-in-this-PR `project_doc_max_bytes`
option in `config.toml`.
* Add flexMode to stored config, and use it during config loading unless
the flag is explicitly passed.
* If the config asks for flexMode and the model doesn't support it,
silently disable flexMode.
Resolves #803
- Added ArceeAI as a provider - https://conductor.arcee.ai/v1
- Compatible with ArceeAI SLMs (Virtuoso, Maestro)
- Works with ArceeAI's Conductor auto‑router models (auto, auto‑tool),
once #817 is merged
- Fixes the guard by using optional chaining to safely check
`chunk.choices?.[0]` before accessing it.
- Previously, accessing `chunk.choices[0]` without checking could throw if
`choices` was missing from the chunk.
Reasoning effort was already available, but not expressed into the help
text, so it was non-discoverable.
Other issues discovered, but will fix in separate PR since they are
larger:
* #816 reasoningEffort isn't displayed in the terminal-header, making it
rather hard to see the state of configuration
* I don't think the config file setting works, as the CLI option always
"wins" and overwrites it
Fix: retry on server_error responses that lack an HTTP status code
### What happened
1. An OpenAI endpoint returned a **5xx** (transient server-side
failure).
2. The SDK surfaced it as an `APIError` with
{ "type": "server_error", "message": "...", "status": undefined }
(The SDK does not always populate `status` for these cases.)
3. Our retry logic in `src/utils/agent/agent-loop.ts` determined
isServerError = typeof status === "number" && status >= 500;
Because `status` was *undefined*, the error was **not** recognised as
retriable, the exception bubbled out, and the CLI crashed with a stack
trace similar to:
Error: An error occurred while processing the request.
at .../cli.js:474:1514
### Root cause
The transient-error detector ignored the semantic flag type ===
"server_error" that the SDK provides when the numeric status is missing.
#### Fix (1 loc + comment)
Extend the check:
const status = errCtx?.status ?? errCtx?.httpStatus ??
errCtx?.statusCode;
const isServerError = (typeof status === "number" && status >= 500) ||
// classic 5xx
errCtx?.type === "server_error"; // <-- NEW
Now the agent:
* Retries up to **5** times (existing logic) when the backend reports a
transient failure, even if `status` is absent.
* If all retries fail, surfaces the existing friendly system message
instead of an uncaught exception.
### Tests & validation
pnpm test # all suites green (17 agent-level tests now include this
path)
pnpm run lint # 0 errors / warnings
pnpm run typecheck
A new unit-test file isn’t required—the behaviour is already covered by
tests/agent-server-retry.test.ts, which stubs type: "server_error" and
now passes with the updated logic.
### Impact
* No API-surface changes.
* Prevents CLI crashes on intermittent OpenAI outages.
* Adds robust handling for other providers that may follow the same
error-shape.
When using Codex to develop Codex itself, I noticed that sometimes it
would try to add `#[ignore]` to the following tests:
```
keeps_previous_response_id_between_tasks()
retries_on_early_close()
```
Both of these tests start a `MockServer` that launches an HTTP server on
an ephemeral port and requires network access to hit it, which the
Seatbelt policy associated with `--full-auto` correctly denies. If I
wasn't paying attention to the code that Codex was generating, one of
these `#[ignore]` annotations could have slipped into the codebase,
effectively disabling the test for everyone.
To that end, this PR enables an experimental environment variable named
`CODEX_SANDBOX_NETWORK_DISABLED` that is set to `1` if the
`SandboxPolicy` used to spawn the process does not have full network
access. I say it is "experimental" because I'm not convinced this API is
quite right, but we need to start somewhere. (It might be more
appropriate to have an env var like `CODEX_SANDBOX=full-auto`, but the
challenge is that our newer `SandboxPolicy` abstraction does not map to
a simple set of enums like in the TypeScript CLI.)
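The spawn-side half is small; a sketch, with the helper name being illustrative:
```rust
use std::process::Command;

pub const CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR: &str = "CODEX_SANDBOX_NETWORK_DISABLED";

// When the active SandboxPolicy does not grant full network access, mark the
// child environment so tests can skip work that needs the network.
fn apply_network_marker(cmd: &mut Command, has_full_network_access: bool) {
    if !has_full_network_access {
        cmd.env(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR, "1");
    }
}
```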
We leverage this new functionality by adding the following code to the
aforementioned tests as a way to "dynamically disable" them:
```rust
if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
    println!(
        "Skipping test because it cannot execute when network is disabled in a Codex sandbox."
    );
    return;
}
```
We can use the `debug seatbelt --full-auto` command to verify that
`cargo test` fails when run under Seatbelt prior to this change:
```
$ cargo run --bin codex -- debug seatbelt --full-auto -- cargo test
---- keeps_previous_response_id_between_tasks stdout ----
thread 'keeps_previous_response_id_between_tasks' panicked at /Users/mbolin/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/wiremock-0.6.3/src/mock_server/builder.rs:107:46:
Failed to bind an OS port for a mock server.: Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
keeps_previous_response_id_between_tasks
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
error: test failed, to rerun pass `-p codex-core --test previous_response_id`
```
Though after this change, the above command succeeds! This means that,
going forward, when Codex operates on Codex itself and runs `cargo test`,
only "real failures" should cause the command to fail.
As part of this change, I decided to tighten up the codepaths for
running `exec()` for shell tool calls. In particular, we do it in `core`
for the main Codex business logic itself, but we also expose this logic
via `debug` subcommands in the CLI in the `cli` crate. The logic for the
`debug` subcommands was not quite as faithful to the true business logic
as I liked, so I:
* refactored a bit of the Linux code, splitting `linux.rs` into
`linux_exec.rs` and `landlock.rs` in the `core` crate.
* gated less code behind `#[cfg(target_os = "linux")]`, because such
code does not get built by default when I develop on Mac, which means I
either have to build the code in Docker or wait for CI signal
* introduced `macro_rules! configure_command` in `exec.rs` so we can
have both sync and async versions of this code. The synchronous version
seems more appropriate for straight threads or potentially fork/exec.
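For illustration, the macro is roughly of this shape (a sketch only; the function names, arguments, and body here are not the actual `exec.rs` code):
```rust
// Sketch: one macro body, expanded once for std (sync) and once for tokio (async)
// commands, so the configuration logic stays identical in both paths.
macro_rules! configure_command {
    ($fn_name:ident, $command:ty) => {
        fn $fn_name(cmd: &mut $command, args: &[String]) {
            for arg in args {
                cmd.arg(arg);
            }
            // Illustrative: the env var introduced in this PR.
            cmd.env("CODEX_SANDBOX_NETWORK_DISABLED", "1");
        }
    };
}

configure_command!(configure_std_command, std::process::Command);
configure_command!(configure_tokio_command, tokio::process::Command);
```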
## Summary
This PR introduces support for Azure OpenAI as a provider within the
Codex CLI. Users can now configure the tool to leverage their Azure
OpenAI deployments by specifying `"azure"` as the provider in
`config.json` and setting the corresponding `AZURE_OPENAI_API_KEY` and
`AZURE_OPENAI_API_VERSION` environment variables. This functionality is
added alongside the existing provider options (OpenAI, OpenRouter,
etc.).
Related to #92
**Note:** This PR is currently in **Draft** status because tests on the
`main` branch are failing. It will be marked as ready for review once
the `main` branch is stable and tests are passing.
---
## What’s Changed
- **Configuration (`config.ts`, `providers.ts`, `README.md`):**
- Added `"azure"` to the supported `providers` list in `providers.ts`,
specifying its name, default base URL structure, and environment
variable key (`AZURE_OPENAI_API_KEY`).
- Defined the `AZURE_OPENAI_API_VERSION` environment variable in
`config.ts` with a default value (`2025-03-01-preview`).
- Updated `README.md` to:
- Include "azure" in the list of providers.
- Add a configuration section for Azure OpenAI, detailing the required
environment variables (`AZURE_OPENAI_API_KEY`,
`AZURE_OPENAI_API_VERSION`) with examples.
- **Client Instantiation (`terminal-chat.tsx`, `singlepass-cli-app.tsx`,
`agent-loop.ts`, `compact-summary.ts`, `model-utils.ts`):**
- Modified various components and utility functions where the OpenAI
client is initialized.
- Added conditional logic to check if the configured `provider` is
`"azure"`.
- If the provider is Azure, the `AzureOpenAI` client from the `openai`
package is instantiated, using the configured `baseURL`, `apiKey` (from
`AZURE_OPENAI_API_KEY`), and `apiVersion` (from
`AZURE_OPENAI_API_VERSION`).
- Otherwise, the standard `OpenAI` client is instantiated as before.
- **Dependencies:**
- Relies on the `openai` package's built-in support for `AzureOpenAI`.
No *new* external dependencies were added specifically for this Azure
implementation beyond the `openai` package itself.
---
## How to Test
*This has been tested locally and confirmed working with Azure OpenAI.*
1. **Configure `config.json`:**
Ensure your `~/.codex/config.json` (or project-specific config) includes
Azure and sets it as the active provider:
```json
{
"providers": {
// ... other providers
"azure": {
"name": "AzureOpenAI",
"baseURL": "https://YOUR_RESOURCE_NAME.openai.azure.com", // Replace
with your Azure endpoint
"envKey": "AZURE_OPENAI_API_KEY"
}
},
"provider": "azure", // Set Azure as the active provider
"model": "o4-mini" // Use your Azure deployment name here
// ... other config settings
}
```
2. **Set up Environment Variables:**
```bash
# Set the API Key for your Azure OpenAI resource
export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
# Set the API Version (Optional - defaults to `2025-03-01-preview` if
not set)
# Ensure this version is supported by your Azure deployment and endpoint
export AZURE_OPENAI_API_VERSION="2025-03-01-preview"
```
3. **Get the Codex CLI by building from this PR branch:**
Clone your fork, checkout this branch (`feat/azure-openai`), navigate to
`codex-cli`, and build:
```bash
# cd /path/to/your/fork/codex
git checkout feat/azure-openai # Or your branch name
cd codex-cli
corepack enable
pnpm install
pnpm build
```
4. **Invoke Codex:**
Run the locally built CLI using `node` from the `codex-cli` directory:
```bash
node ./dist/cli.js "Explain the purpose of this PR"
```
*(Alternatively, if you ran `pnpm link` after building, you can use
`codex "Explain the purpose of this PR"` from anywhere)*.
5. **Verify:** Confirm that the command executes successfully and
interacts with your configured Azure OpenAI deployment.
---
## Tests
- [x] Tested locally against an Azure OpenAI deployment using API Key
authentication. Basic commands and interactions confirmed working.
---
## Checklist
- [x] Added Azure provider details to configuration files
(`providers.ts`, `config.ts`).
- [x] Implemented conditional `AzureOpenAI` client initialization based
on provider setting.
- [x] Ensured `apiVersion` is passed correctly to the Azure client.
- [x] Updated `README.md` with Azure OpenAI setup instructions.
- [x] Manually tested core functionality against a live Azure OpenAI
endpoint.
- [x] Add/update automated tests for the Azure code path (pending `main`
stability).
cc @theabhinavdas @nikodem-wrona @fouad-openai @tibo-openai (adjust as
needed)
---
I have read the CLA Document and I hereby sign the CLA
Noticed that when pasting multi-line blocks, each newline was treated
like a new submission.
Update tui to handle Paste directly and map newlines to shift+enter.
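Roughly, the mapping looks like this (a sketch assuming crossterm's key types; the real handler lives in the composer widget):
```rust
use crossterm::event::{KeyCode, KeyEvent, KeyModifiers};

/// Translate a bracketed paste into key events, turning each newline into
/// Shift+Enter so the composer inserts a line break instead of submitting.
fn paste_to_key_events(pasted: &str) -> Vec<KeyEvent> {
    pasted
        .chars()
        .map(|ch| match ch {
            '\n' => KeyEvent::new(KeyCode::Enter, KeyModifiers::SHIFT),
            other => KeyEvent::new(KeyCode::Char(other), KeyModifiers::NONE),
        })
        .collect()
}
```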
# Test
Copied this into clipboard:
```
Do nothing.
Explain this repo to me.
```
Pasted in and saw multi-line input. Hitting Enter then submitted the
full block.
This PR is a straight refactor so that creating the `Child` process for
an `shell` tool call and consuming its output can be separate concerns.
For the actual tool call, we will always apply
`consume_truncated_output()`, but for the top-level debug commands in
the CLI (e.g., `debug seatbelt` and `debug landlock`), we only want to
use the `spawn_child()` part of `exec()`.
We want the subcommands to match the `shell` tool call usage as
faithfully as possible. This becomes more important when we introduce a
new parameter to `spawn_child()` in
https://github.com/openai/codex/pull/879.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/878).
* #879
* __->__ #878
I inadvertently regressed support for the Responses API when adding
support for the chat completions API in
https://github.com/openai/codex/pull/862. This should get both APIs
working again, but the chat completions codepath seems more complex than
necessary. I'll try to clean that up shortly, but I want to get things
working again ASAP.
This is a substantial PR to add support for the chat completions API,
which in turn makes it possible to use non-OpenAI model providers (just
like in the TypeScript CLI):
* It moves a number of structs from `client.rs` to `client_common.rs` so
they can be shared.
* It introduces support for the chat completions API in
`chat_completions.rs`.
* It updates `ModelProviderInfo` so that `env_key` is `Option<String>`
instead of `String` (for e.g., ollama) and adds a `wire_api` field
* It updates `client.rs` to choose between `stream_responses()` and
`stream_chat_completions()` based on the `wire_api` for the
`ModelProviderInfo`
* It updates the `exec` and TUI CLIs to no longer fail if the
`OPENAI_API_KEY` environment variable is not set
* It updates the TUI so that `EventMsg::Error` is displayed more
prominently when it occurs, particularly now that it is important to
alert users to the `CodexErr::EnvVar` variant.
* `CodexErr::EnvVar` was updated to include an optional `instructions`
field so we can preserve the behavior where we direct users to
https://platform.openai.com if `OPENAI_API_KEY` is not set.
* Cleaned up the "welcome message" in the TUI to ensure the model
provider is displayed.
* Updated the docs in `codex-rs/README.md`.
To exercise the chat completions API from OpenAI models, I added the
following to my `config.toml`:
```toml
model = "gpt-4o"
model_provider = "openai-chat-completions"
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
```
Though to test a non-OpenAI provider, I installed ollama with mistral
locally on my Mac because ChatGPT said that would be a good match for my
hardware:
```shell
brew install ollama
ollama serve
ollama pull mistral
```
Then I added the following to my `~/.codex/config.toml`:
```toml
model = "mistral"
model_provider = "ollama"
```
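For reference, the struct these entries deserialize into looks roughly like this (field names mirror the TOML keys above; the exact types, e.g. whether `wire_api` is an enum, may differ in the real code):
```rust
/// Sketch of the per-provider configuration described above.
#[derive(Clone, Debug, serde::Deserialize)]
pub struct ModelProviderInfo {
    pub name: String,
    pub base_url: String,
    /// `None` for providers such as ollama that require no API key.
    pub env_key: Option<String>,
    /// "responses" (the default) or "chat".
    pub wire_api: String,
}
```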
Note this code could certainly use more test coverage, but I want to get
this in so folks can start playing with it.
For reference, I believe https://github.com/openai/codex/pull/247 was
roughly the comparable PR on the TypeScript side.
I installed the GitHub Actions extension for VS Code and it started
giving me lint warnings about this line:
a9adb4175c/.github/workflows/rust-ci.yml (L99)
Using an env var to track the state of individual steps was not great,
so I did some research about GitHub actions, which led to the discovery
of combining `continue-on-error: true` with `if .. steps.STEP.outcome ==
'failure'...`.
Apparently there is also a `failure()` macro that is supposed to make
this simpler, but I saw a number of complaints online about it not
working as expected. Checking `outcome` seems maybe more reliable at the
cost of being slightly more verbose.
https://github.com/openai/codex/pull/855 added the clippy warning to
disallow `unwrap()`, but apparently we were not verifying that tests
were "clippy clean" in CI, so I ended up with a lot of local errors in
VS Code.
This turns on the check in CI and fixes the offenders.
I noticed that sometimes I would enter a new message, but it would not
show up in the conversation history. Even if I focused the conversation
history and tried to scroll it to the bottom, I could not bring it into
view. At first, I was concerned that messages were not making it to the
UI layer, but I added debug statements and verified that was not the
issue.
It turned out that, previous to this PR, lines that are wider than the
viewport take up multiple lines of vertical space because `wrap()` was
set on the `Paragraph` inside the scroll pane. Unfortunately, that broke
our "scrollbar math" that assumed each `Line` contributes one line of
height in the UI.
This PR removes the `wrap()`, but introduces a new issue, which is that
now you cannot see long lines without resizing your terminal window. For
now, I filed an issue here:
https://github.com/openai/codex/issues/869
I think the long-term fix is to fix our math so it calculates the height
of a `Line` after it is wrapped given the current width of the viewport.
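The core of that math is just a ceiling division over the viewport width (a sketch only; the real fix also has to account for the `Wrap` settings and grapheme widths):
```rust
/// Number of terminal rows a line of `line_width` cells occupies when wrapped
/// to a viewport that is `viewport_width` cells wide (sketch, not the actual fix).
fn wrapped_height(line_width: usize, viewport_width: usize) -> usize {
    if viewport_width == 0 {
        return 1;
    }
    line_width.div_ceil(viewport_width).max(1)
}
```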
Sets submodules to use workspace lints. Added denying unwrap as a
workspace level lint, which found a couple of cases where we could have
propagated errors. Also manually labeled ones that were fine by my eye.
This is the first step in supporting other model providers in the Rust
CLI. Specifically, this PR adds support for the new entries in `Config`
and `ConfigOverrides` to specify a `ModelProviderInfo`, which is the
basic config needed for an LLM provider. This PR does not get us all the
way there yet because `client.rs` still categorically appends
`/responses` to the URL and expects the endpoint to support the OpenAI
Responses API. Will fix that next!
I discovered that I accidentally introduced a change in
https://github.com/openai/codex/pull/829 where we load a fresh `Config`
in the middle of `codex.rs`:
c3e10e180a/codex-rs/core/src/codex.rs (L515-L522)
This is not good because the `Config` could differ from the one that has
the user's overrides specified from the CLI. Also, in unit tests, it
means the `Config` was picking up my personal settings as opposed to
using a vanilla config, which was problematic.
This PR cleans things up by moving the common case where
`Op::ConfigureSession` is derived from `Config` (originally done in
`codex_wrapper.rs`) and making it the standard way to initialize `Codex`
by putting it in `Codex::spawn()`. Note this also eliminates quite a bit
of boilerplate from the tests and relieves the caller of the
responsibility of minting out unique IDs when invoking `submit()`.
These abstractions were originally created exclusively for the REPL,
which was removed in https://github.com/openai/codex/pull/754.
Currently, they create some unnecessary Tokio tasks, so we are better off
without them. (We can always bring this back if we have a new use case.)
This adds support for saving transcripts when using the Rust CLI. Like
the TypeScript CLI, it saves the transcript to `~/.codex/sessions`,
though it uses JSONL for the file format (and `.jsonl` for the file
extension) so that even if Codex crashes, what was written to the
`.jsonl` file should generally still be valid JSONL content.
We now impose a 10s timeout on the initial `tools/list` request to an
MCP server. We do not apply a timeout for other types of requests yet,
but we should start enforcing those, as well.
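The mechanism is essentially `tokio::time::timeout` wrapped around the request future, something like the following sketch (the function name and error type are illustrative, not the actual API):
```rust
use std::time::Duration;
use tokio::time::timeout;

/// Bound a `tools/list` request at 10 seconds. `list_tools` stands in for the
/// real future issued by the MCP connection manager.
async fn list_tools_with_timeout<F, T>(list_tools: F) -> Result<T, String>
where
    F: std::future::Future<Output = Result<T, String>>,
{
    match timeout(Duration::from_secs(10), list_tools).await {
        Ok(result) => result,
        Err(_elapsed) => Err("tools/list did not complete within 10s".to_string()),
    }
}
```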
This introduces the use of the `tui-markdown` crate to parse an
assistant message as Markdown and style it using ANSI for a better user
experience. As shown in the screenshot below, it has support for syntax
highlighting for _tagged_ fenced code blocks:
<img width="907" alt="image"
src="https://github.com/user-attachments/assets/900dc229-80bb-46e8-b1bb-efee4c70ba3c"
/>
That said, `tui-markdown` is not as configurable (or stylish!) as
https://www.npmjs.com/package/marked-terminal, which is what we use in
the TypeScript CLI. In particular:
* The styles are hardcoded and `tui_markdown::from_str()` does not take
any options whatsoever. It uses "bold white" for inline code style which
does not stand out as much as the yellow used by `marked-terminal`:
65402cbda7/tui-markdown/src/lib.rs (L464)
I asked Codex to take a first pass at this and it came up with:
https://github.com/joshka/tui-markdown/pull/80
* If a fenced code block is not tagged, then it does not get
highlighted. I would rather add some logic here:
65402cbda7/tui-markdown/src/lib.rs (L262)
that uses something like https://pypi.org/project/guesslang/ to examine
the value of `text` and try to use the appropriate syntax highlighter.
* When we have a fenced code block, we do not want to show the opening
and closing triple backticks in the output.
To unblock ourselves, we might want to bundle our own fork of
`tui-markdown` temporarily until we figure out what the shape of the API
should be and then try to upstream it.
Some effects of this change:
- New formatting changes across many files. No functionality changes
should occur from that.
- Calls to `set_env` are considered unsafe; since this only happens in
tests, we wrap those calls in `unsafe` blocks
I started this PR because I wanted to share the `format_duration()`
utility function in `codex-rs/exec/src/event_processor.rs` with the TUI.
The question was: where to put it?
`core` should have as few dependencies as possible, so moving it there
would introduce a dependency on `chrono`, which seemed undesirable.
`core` already had this `cli` feature to deal with a similar situation
around sharing common utility functions, so I decided to:
* make `core` feature-free
* introduce `common`
* `common` can have as many "special interest" features as it needs,
each of which can declare their own deps
* the first two features of common are `cli` and `elapsed`
In practice, this meant updating a number of `Cargo.toml` files,
replacing this line:
```toml
codex-core = { path = "../core", features = ["cli"] }
```
with these:
```toml
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
```
Moving `format_duration()` into its own file gave it some "breathing
room" to add a unit test, so I had Codex generate some tests and new
support for durations over 1 minute.
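For a rough idea of the behaviour, here is a sketch using `std::time::Duration` (the real helper in `codex-common` is built around `chrono` and its exact formatting may differ):
```rust
use std::time::Duration;

/// Sketch: human-friendly formatting, including durations over one minute.
fn format_duration(d: Duration) -> String {
    let millis = d.as_millis();
    if millis < 1_000 {
        format!("{millis}ms")
    } else if millis < 60_000 {
        format!("{:.2}s", millis as f64 / 1_000.0)
    } else {
        let secs = d.as_secs();
        format!("{}m{:02}s", secs / 60, secs % 60)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn formats_durations_over_one_minute() {
        assert_eq!(format_duration(Duration::from_secs(75)), "1m15s");
    }
}
```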
Out of the box, we will make `/` the only official "escape sequence" for
commands in the Rust TUI. We will look to support `q` (or any string you
want to use as a "macro") via a plugin, but not make it part of the
default experience.
Existing `q` users will have to get by with `ctrl+d` for now.
https://github.com/openai/codex/pull/829 noted it introduced a circular
dep between `codex.rs` and `mcp_tool_call.rs`. This attempts to clean
things up: the circular dep still exists, but at least all the fields of
`Session` are private again.
This adds initial support for MCP servers in the style of Claude Desktop
and Cursor. Note this PR is the bare minimum to get things working end
to end: all configured MCP servers are launched every time Codex is run,
there is no recovery for MCP servers that crash, etc.
(Also, I took some shortcuts to change some fields of `Session` to be
`pub(crate)`, which also means there are circular deps between
`codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a
subsequent PR.)
`codex-rs/README.md` is updated as part of this PR to explain how to use
this feature. There is a bit of plumbing to route the new settings from
`Config` to the business logic in `codex.rs`. The most significant
chunks for new code are in `mcp_connection_manager.rs` (which defines
the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is
responsible for tool calls.
This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd`
event types to the protocol, but does not add any handlers for them.
(See https://github.com/openai/codex/pull/836 for initial usage.)
To test, I added the following to my `~/.codex/config.toml`:
```toml
# Local build of https://github.com/hideya/mcp-server-weather-js
[mcp_servers.weather]
command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js"
args = []
```
And then I ran the following:
```
codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco'
[2025-05-06T22:40:05] Task started: 1
[2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W):
This Afternoon (Tue):
• Sunny, high near 69 °F
• West-southwest wind around 12 mph
Tonight:
• Partly cloudy, low around 52 °F
• SW wind 7–10 mph
...
```
Note that Codex itself is not able to make network calls, so it would
not normally be able to get live weather information like this. However,
the weather MCP is [currently] not run under the Codex sandbox, so it is
able to hit `api.weather.gov` and fetch current weather information.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829).
* #836
* __->__ #829
I discovered that `cargo build` worked for the entire workspace, but not
for the `mcp-client` or `core` crates.
* `mcp-client` failed to build because it underspecified the set of
features it needed from `tokio`.
* `core` failed to build because it was using a "feature" of its own
crate in the default, no-feature version.
This PR fixes the builds and adds a check in CI to defend against this
sort of thing going forward.
Cleans up the signature for `new_stdio_client()` to more closely mirror
how MCP servers are declared in config files (`command`, `args`, `env`).
Also takes a cue from Claude Code where the MCP server is launched with
a restricted `env` so that it only includes "safe" things like `USER`
and `PATH` (see the `create_env_for_mcp_server()` function introduced in
this PR for details) by default, as it is common for developers to have
sensitive API keys present in their environment that should only be
forwarded to the MCP server when the user has explicitly configured it
to do so.
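The gist of `create_env_for_mcp_server()` is an allowlist-first construction along these lines (a sketch; the actual allowlist and the way user-configured variables are merged differ):
```rust
use std::collections::HashMap;

/// Sketch: build the MCP server env from a short allowlist of "safe" variables,
/// then layer on only what the user explicitly configured for this server.
fn create_env_for_mcp_server(extra: &HashMap<String, String>) -> HashMap<String, String> {
    const SAFE_VARS: &[&str] = &["HOME", "PATH", "USER", "TMPDIR"];
    let mut env: HashMap<String, String> = SAFE_VARS
        .iter()
        .filter_map(|name| std::env::var(name).ok().map(|v| (name.to_string(), v)))
        .collect();
    env.extend(extra.iter().map(|(k, v)| (k.clone(), v.clone())));
    env
}
```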
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/831).
* #829
* __->__ #831
This PR introduces an initial `McpClient` that we will use to give Codex
itself programmatic access to foreign MCPs. This does not wire it up in
Codex itself yet, but the new `mcp-client` crate includes a `main.rs`
for basic testing for now.
Manually tested by sending a `tools/list` request to Codex's own MCP
server:
```
codex-rs$ cargo build
codex-rs$ cargo run --bin codex-mcp-client ./target/debug/codex-mcp-server
{
  "tools": [
    {
      "description": "Run a Codex session. Accepts configuration parameters matching the Codex Config struct.",
      "inputSchema": {
        "properties": {
          "approval-policy": {
            "description": "Execution approval policy expressed as the kebab-case variant name (`unless-allow-listed`, `auto-edit`, `on-failure`, `never`).",
            "enum": [
              "auto-edit",
              "unless-allow-listed",
              "on-failure",
              "never"
            ],
            "type": "string"
          },
          "cwd": {
            "description": "Working directory for the session. If relative, it is resolved against the server process's current working directory.",
            "type": "string"
          },
          "disable-response-storage": {
            "description": "Disable server-side response storage.",
            "type": "boolean"
          },
          "model": {
            "description": "Optional override for the model name (e.g. \"o3\", \"o4-mini\")",
            "type": "string"
          },
          "prompt": {
            "description": "The *initial user prompt* to start the Codex conversation.",
            "type": "string"
          },
          "sandbox-permissions": {
            "description": "Sandbox permissions using the same string values accepted by the CLI (e.g. \"disk-write-cwd\", \"network-full-access\").",
            "items": {
              "enum": [
                "disk-full-read-access",
                "disk-write-cwd",
                "disk-write-platform-user-temp-folder",
                "disk-write-platform-global-temp-folder",
                "disk-full-write-access",
                "network-full-access"
              ],
              "type": "string"
            },
            "type": "array"
          }
        },
        "required": [
          "prompt"
        ],
        "type": "object"
      },
      "name": "codex"
    }
  ]
}
```
This Pull Request addresses an issue where the output of commands
executed in the raw-exec utility was being truncated due to restrictive
limits on the number of lines and bytes collected. The truncation caused
the message [Output truncated: too many lines or bytes] to appear when
processing large outputs, which could hinder the functionality of the
CLI.
## Changes Made
- Increased the maximum output limits in the
[createTruncatingCollector](https://github.com/openai/codex/pull/575) utility:
  - Bytes: increased from 10 KB to 100 KB.
  - Lines: increased from 256 lines to 1024 lines.
- Installed the `@types/node` package to resolve missing type definitions
for [NodeJS](https://github.com/openai/codex/pull/575) and
[Buffer](https://github.com/openai/codex/pull/575).
- Verified and fixed any related errors in the
[createTruncatingCollector](https://github.com/openai/codex/pull/575)
implementation.
## Issue Solved
This PR ensures that larger outputs can be processed without truncation,
improving the usability of the CLI for commands that generate extensive
output. https://github.com/openai/codex/issues/509
---------
Co-authored-by: Michael Bolin <bolinfest@gmail.com>
This PR replaces the placeholder `"echo"` tool call in the MCP server
with a `"codex"` tool that calls Codex. Events such as
`ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled
properly yet, but I have `approval_policy = "never"` set in my
`~/.codex/config.toml` such that those codepaths are not exercised.
The schema for this MCP tool is defined by a new `CodexToolCallParam`
struct introduced in this PR. It is fairly similar to `ConfigOverrides`,
as the param is used to help create the `Config` used to start the Codex
session, though it also includes the `prompt` used to kick off the
session.
This PR also introduces the use of the third-party `schemars` crate to
generate the JSON schema, which is verified in the
`verify_codex_tool_json_schema()` unit test.
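A reduced sketch of the idea (only two of the fields are shown, and the `main` is just for illustration):
```rust
use schemars::{schema_for, JsonSchema};
use serde::Deserialize;

/// Sketch of the `CodexToolCallParam` approach: derive `JsonSchema` so the MCP
/// tool schema is generated rather than hand-written. Field set abbreviated.
#[derive(Debug, Deserialize, JsonSchema)]
#[serde(rename_all = "kebab-case")]
pub struct CodexToolCallParam {
    /// The initial user prompt to start the Codex conversation.
    pub prompt: String,
    /// Optional override for the model name (e.g. "o3", "o4-mini").
    pub model: Option<String>,
}

fn main() -> serde_json::Result<()> {
    let schema = schema_for!(CodexToolCallParam);
    println!("{}", serde_json::to_string_pretty(&schema)?);
    Ok(())
}
```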
Events that are dispatched during the Codex session are sent back to the
MCP client as MCP notifications. This gives the client a way to monitor
progress as the tool call itself may take minutes to complete depending
on the complexity of the task requested by the user.
In the video below, I launched the server via:
```shell
mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run --
```
In the video, you can see the flow of:
* requesting the list of tools
* choosing the **codex** tool
* entering a value for **prompt** and then making the tool call
Note that I left the other fields blank because when unspecified, the
values in my `~/.codex/config.toml` were used:
https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192
Note that while using the inspector, I did run into
https://github.com/modelcontextprotocol/inspector/issues/293, though the
tip about ensuring I had only one instance of the **MCP Inspector** tab
open in my browser seemed to fix things.
https://github.com/openai/codex/pull/800 kicked off some work to be more
disciplined about honoring the `cwd` param passed in rather than
assuming `std::env::current_dir()` as the `cwd`. As part of this, we
need to ensure `apply_patch` calls honor the appropriate `cwd` as well,
which is significant if the paths in the `apply_patch` arg are not
absolute paths themselves. In that case:
- The `apply_patch` function call can contain an optional `workdir`
param, so:
- If specified and is an absolute path, it should be used to resolve
relative paths
- If specified and is a relative path, should be resolved against
`Config.cwd` and then any relative paths will be resolved against the
result
- If `workdir` is not specified on the function call, relative paths
should be resolved against `Config.cwd`
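In code, those rules boil down to something like this sketch (`resolve_patch_path` is a hypothetical helper name, not the actual API):
```rust
use std::path::{Path, PathBuf};

/// Resolve a path from an `apply_patch` call so the resulting
/// `ApplyPatchAction` only ever holds absolute paths (sketch of the rules above).
fn resolve_patch_path(config_cwd: &Path, workdir: Option<&Path>, path: &Path) -> PathBuf {
    let base: PathBuf = match workdir {
        Some(dir) if dir.is_absolute() => dir.to_path_buf(),
        Some(dir) => config_cwd.join(dir),
        None => config_cwd.to_path_buf(),
    };
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        base.join(path)
    }
}
```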
Note that we had a similar issue in the TypeScript CLI that was fixed in
https://github.com/openai/codex/pull/556.
As part of the fix, this PR introduces `ApplyPatchAction` so clients can
deal with that instead of the raw `HashMap<PathBuf,
ApplyPatchFileChange>`. This enables us to enforce, by construction,
that all paths contained in the `ApplyPatchAction` are absolute paths.
https://github.com/openai/codex/pull/800 made `cwd` a property of
`Config` and made it so the `cwd` is not necessarily
`std::env::current_dir()`. As such, `is_inside_git_repo()` should check
`Config.cwd` rather than `std::env::current_dir()`.
This PR updates `is_inside_git_repo()` to take `Config` instead of an
arbitrary `PathBuf` to force the check to operate on a `Config` where
`cwd` has been resolved to what the user specified.
In order to expose Codex via an MCP server, I realized that we should be
taking `cwd` as a parameter rather than assuming
`std::env::current_dir()` as the `cwd`. Specifically, the user may want
to start a session in a directory other than the one where the MCP
server has been started.
This PR makes `cwd: PathBuf` a required field of `Session` and threads
it all the way through, though I think there is still an issue with not
honoring `workdir` for `apply_patch`, which is something we also had to
fix in the TypeScript version: https://github.com/openai/codex/pull/556.
This also adds `-C`/`--cd` to change the cwd via the command line.
To test, I ran:
```
cargo run --bin codex -- exec -C /tmp 'show the output of ls'
```
and verified it showed the contents of my `/tmp` folder instead of
`$PWD`.
https://github.com/openai/codex/pull/793 had important information on
the `notify` config option that seemed worth memorializing, so this PR
updates the documentation about all of the configurable options in
`~/.codex/config.toml`.
With this change, you can specify a program that will be executed to get
notified about events generated by Codex. The notification info will be
packaged as a JSON object. The supported notification types are defined
by the `UserNotification` enum introduced in this PR. Initially, it
contains only one variant, `AgentTurnComplete`:
```rust
pub(crate) enum UserNotification {
    #[serde(rename_all = "kebab-case")]
    AgentTurnComplete {
        turn_id: String,
        /// Messages that the user sent to the agent to initiate the turn.
        input_messages: Vec<String>,
        /// The last message sent by the assistant in the turn.
        last_assistant_message: Option<String>,
    },
}
```
This is intended to support the common case when a "turn" ends, which
often means it is now your chance to give Codex further instructions.
For example, I have the following in my `~/.codex/config.toml`:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
I created my own custom notifier script that calls out to
[terminal-notifier](https://github.com/julienXX/terminal-notifier) to
show a desktop push notification on macOS. Contents of `notify.py`:
```python
#!/usr/bin/env python3

import json
import subprocess
import sys


def main() -> int:
    if len(sys.argv) != 2:
        print("Usage: notify.py <NOTIFICATION_JSON>")
        return 1

    try:
        notification = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        return 1

    match notification_type := notification.get("type"):
        case "agent-turn-complete":
            assistant_message = notification.get("last-assistant-message")
            if assistant_message:
                title = f"Codex: {assistant_message}"
            else:
                title = "Codex: Turn Complete!"

            # Fields are serialized in kebab-case (see the enum above).
            input_messages = notification.get("input-messages", [])
            message = " ".join(input_messages)
            title += message
        case _:
            print(f"not sending a push notification for: {notification_type}")
            return 0

    subprocess.check_output(
        [
            "terminal-notifier",
            "-title",
            title,
            "-message",
            message,
            "-group",
            "codex",
            "-ignoreDnD",
            "-activate",
            "com.googlecode.iterm2",
        ]
    )

    return 0


if __name__ == "__main__":
    sys.exit(main())
```
For reference, here are related PRs that tried to add this functionality
to the TypeScript version of the Codex CLI:
* https://github.com/openai/codex/pull/160
* https://github.com/openai/codex/pull/498
While creating a basic MCP server in
https://github.com/openai/codex/pull/792, I discovered a number of bugs
with the initial `mcp-types` crate that I needed to fix in order to
implement the server.
For example, I discovered that when serializing a message, `"jsonrpc":
"2.0"` was not being included.
I changed the codegen so that the field is added as:
```rust
#[serde(rename = "jsonrpc", default = "default_jsonrpc")]
pub jsonrpc: String,
```
This ensures that the field is serialized as `"2.0"`, though the field
still has to be assigned, which is tedious. I may experiment with
`Default` or something else in the future. (I also considered creating a
custom serializer, but I'm not sure it's worth the trouble.)
While here, I also added `MCP_SCHEMA_VERSION` and `JSONRPC_VERSION` as
`pub const`s for the crate.
I also discovered that MCP rejects sending `null` for optional fields,
so I had to add `#[serde(skip_serializing_if = "Option::is_none")]` on
`Option` fields.
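For example, on an illustrative struct (not copied from the generated `lib.rs`):
```rust
use serde::Serialize;

/// Illustrative only: with this attribute, `None` is omitted from the JSON
/// output entirely instead of being serialized as `null`.
#[derive(Serialize)]
pub struct Annotations {
    #[serde(skip_serializing_if = "Option::is_none")]
    pub priority: Option<f64>,
}
```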
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/791).
* #792
* __->__ #791
This adds our own `mcp-types` crate to our Cargo workspace. We vendor in
the
[`2025-03-26/schema.json`](05f2045136/schema/2025-03-26/schema.json)
from the MCP repo and introduce a `generate_mcp_types.py` script to
codegen the `lib.rs` from the JSON schema.
Test coverage is currently light, but I plan to refine things as we
start making use of this crate.
And yes, I am aware that
https://github.com/modelcontextprotocol/rust-sdk exists, though the
published https://crates.io/crates/rmcp appears to be a competing
effort. While things are up in the air, it seems better for us to
control our own version of this code.
Incidentally, Codex did a lot of the work for this PR. I told it to
never edit `lib.rs` directly and instead to update
`generate_mcp_types.py` and then re-run it to update `lib.rs`. It
followed these instructions and once things were working end-to-end, I
iteratively asked for changes to the tests until the API looked
reasonable (and the code worked). Codex was responsible for figuring out
what to do to `generate_mcp_types.py` to achieve the requested test/API
changes.
Building on top of https://github.com/openai/codex/pull/757, this PR
updates Codex to use the Landlock executor binary for sandboxing in the
Node.js CLI. Note that Codex has to be invoked with either `--full-auto`
or `--auto-edit` to activate sandboxing. (Using `--suggest` or
`--dangerously-auto-approve-everything` ensures the sandboxing codepath
will not be exercised.)
When I tested this on a Linux host (specifically, `Ubuntu 24.04.1 LTS`),
things worked as expected: I ran Codex CLI with `--full-auto` and then
asked it to do `echo 'hello mbolin' into hello_world.txt` and it
succeeded without prompting me.
However, in my testing, I discovered that the sandboxing did *not* work
when using `--full-auto` in a Linux Docker container from a macOS host.
I updated the code to throw a detailed error message when this happens.
This introduces `./codex-cli/scripts/stage_release.sh`, which is a shell
script that stages a release for the Node.js module in a temp directory.
It updates the release to include these native binaries:
```
bin/codex-linux-sandbox-arm64
bin/codex-linux-sandbox-x64
```
though this PR does not update Codex CLI to use them yet.
When doing local development, run
`./codex-cli/scripts/install_native_deps.sh` to install these in your
own `bin/` folder.
This PR also updates `README.md` to document the new workflow.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/757).
* #763
* __->__ #757
## `0.1.2504301751`
### 🚀 Features
- User config api key (#569)
- `@mention` files in codex (#701)
- Add `--reasoning` CLI flag (#314)
- Lower default retry wait time and increase number of tries (#720)
- Add common package registries domains to allowed-domains list (#414)
### 🪲 Bug Fixes
- Insufficient quota message (#758)
- Input keyboard shortcut opt+delete (#685)
- `/diff` should include untracked files (#686)
- Only allow running without sandbox if explicitly marked in safe
container (#699)
- Tighten up check for /usr/bin/sandbox-exec (#710)
- Check if sandbox-exec is available (#696)
- Duplicate messages in quiet mode (#680)
Solves #700
## State of the World Before
Prior to this PR, when users wanted to share file contents with Codex,
they had two options:
- Manually copy and paste file contents into the chat
- Wait for the assistant to use the shell tool to view the file
The second approach required the assistant to:
1. Recognize the need to view a file
2. Execute a shell tool call
3. Wait for the tool call to complete
4. Process the file contents
This consumed extra tokens and reduced user control over which files
were shared with the model.
## State of the World After
With this PR, users can now:
- Reference files directly in their chat input using the `@path` syntax
- Have file contents automatically expanded into XML blocks before being
sent to the LLM
For example, users can type `@src/utils/config.js` in their message, and
the file contents will be included in context. Within the terminal chat
history, these file blocks will be collapsed back to `@path` format in
the UI for clean presentation.
Tag File suggestions:
<img width="857" alt="file-suggestions"
src="https://github.com/user-attachments/assets/397669dc-ad83-492d-b5f0-164fab2ff4ba"
/>
Tagging files in action:
<img width="858" alt="tagging-files"
src="https://github.com/user-attachments/assets/0de9d559-7b7f-4916-aeff-87ae9b16550a"
/>
Demo video of file tagging:
[](https://www.youtube.com/watch?v=vL4LqtBnqt8)
## Implementation Details
This PR consists of 2 main components:
1. **File Tag Utilities**:
- New `file-tag-utils.ts` utility module that handles both expansion and
collapsing of file tags
- `expandFileTags()` identifies `@path` tokens and replaces them with
XML blocks containing file contents
- `collapseXmlBlocks()` reverses the process, converting XML blocks back
to `@path` format for UI display
- Tokens are only expanded if they point to valid files (directories are
ignored)
- Expansion happens just before sending input to the model
2. **Terminal Chat Integration**:
- Leveraged the existing file system completion system for tabbing to
support the `@path` syntax
- Added `updateFsSuggestions` helper to manage filesystem suggestions
- Added `replaceFileSystemSuggestion` to replace input with filesystem
suggestions
- Applied `collapseXmlBlocks` in the chat response rendering so that
tagged files are shown as simple `@path` tags
The PR also includes test coverage for both the UI and the file tag
utilities.
## Next Steps
Some ideas I'd like to implement if this feature gets merged:
- Line selection: `@path[50:80]` to grab specific sections of files
- Method selection: `@path#methodName` to grab just one function/class
- Visual improvements: highlight file tags in the UI to make them more
noticeable
This pull request includes a change to improve the error message
displayed when there is insufficient quota in the `AgentLoop` class. The
updated message provides more detailed information and a link for
managing or purchasing credits.
Error message improvement:
*
[`codex-cli/src/utils/agent/agent-loop.ts`](diffhunk://#diff-b15957eac2720c3f1f55aa32f172cdd0ac6969caf4e7be87983df747a9f97083L1140-R1140):
Updated the error message in the `AgentLoop` class to include the
specific error message (if available) and a link to manage or purchase
credits.
Fixes #751
I suspect this was done originally so that `execForSandbox()` had a
consistent signature for both the `SandboxType.NONE` and
`SandboxType.MACOS_SEATBELT` cases, but that is not really necessary and
turns out to make the upcoming Landlock support a bit more complicated
to implement, so I had Codex remove it and clean up the call sites.
Apparently the URLs for draft releases cannot be downloaded using
unauthenticated `curl`, which means the DotSlash file only works for
users who are authenticated with `gh`. According to chat, prereleases
_can_ be fetched with unauthenticated `curl`, so let's try that.
For now, keep things simple such that we never update the `version` in
the `Cargo.toml` for the workspace root on the `main` branch. Instead,
create a new branch for a release, push one commit that updates the
`version`, and then tag that branch to kick off a release.
To test, I ran this script and created this release job:
https://github.com/openai/codex/actions/runs/14762580641
The generated DotSlash file has URLs that refer to
`https://github.com/openai/codex/releases/`, so let's set
`prerelease:false` (but keep `draft:true` for now) so those URLs should
work.
Also updated `version` in Cargo workspace so I will kick off a build
once this lands.
I am working to simplify the build process. As a first step, update
`session.ts` so it reads the `version` from `package.json` at runtime so
we no longer have to modify it during the build process. I want to get
to a place where the build looks like:
```
cd codex-cli
pnpm i
pnpm build
RELEASE_DIR=$(mktemp -d)
cp -r bin "$RELEASE_DIR/bin"
cp -r dist "$RELEASE_DIR/dist"
cp -r src "$RELEASE_DIR/src" # important if we want sourcemaps to continue to work
cp ../README.md "$RELEASE_DIR"
VERSION=$(printf '0.1.%d' $(date +%y%m%d%H%M))
jq --arg version "$VERSION" '.version = $version' package.json > "$RELEASE_DIR/package.json"
```
Then the contents of `$RELEASE_DIR` should be good to `npm publish`, no?
@oai-ragona and I discussed it, and we feel the REPL crate has served
its purpose, so we're going to delete the code and future archaeologists
can find it in Git history.
Apparently I made two key mistakes in
https://github.com/openai/codex/pull/740 (fixed in this PR):
* I forgot to redefine `$dest` in the `Stage Linux-only artifacts` step
* I did not define the `if` check correctly in the `Stage Linux-only
artifacts` step
This fixes both of those issues and bumps the workspace version to
`0.0.2504292006` in preparation for another release attempt.
This introduces a standalone executable that runs the equivalent of the
`codex debug landlock` subcommand and updates `rust-release.yml` to
include it in the release.
The idea is that we will include this small binary with the TypeScript
CLI to provide support for Linux sandboxing.
Taking a pass at building artifacts per platform so we can consider
different distribution strategies that don't require users to install
the full `cargo` toolchain.
Right now this grabs just the `codex-repl` and `codex-tui` bins for 5
different targets and bundles them into a draft release. I think a
clearly marked pre-release set of artifacts will unblock the next step
of testing.
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
237f8a11e1/codex-rs/core/src/protocol.rs (L98-L108)
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
has explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
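Loosely, the new shape is something like the following sketch (variant names follow the kebab-case permission strings elsewhere in this repo; `has_full_disk_read_access()` is mentioned below, while the other method and the exact fields are illustrative):
```rust
/// Sketch of the additive permission model described above.
#[derive(Clone, Debug)]
pub enum SandboxPermission {
    DiskFullReadAccess,
    DiskWriteCwd,
    DiskWritePlatformUserTempFolder,
    NetworkFullAccess,
}

#[derive(Clone, Debug, Default)]
pub struct SandboxPolicy {
    permissions: Vec<SandboxPermission>,
}

impl SandboxPolicy {
    pub fn has_full_disk_read_access(&self) -> bool {
        self.permissions
            .iter()
            .any(|p| matches!(p, SandboxPermission::DiskFullReadAccess))
    }

    pub fn has_full_network_access(&self) -> bool {
        self.permissions
            .iter()
            .any(|p| matches!(p, SandboxPermission::NetworkFullAccess))
    }
}
```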
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is an `Option` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Option` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
The `saveConfig()` function only includes a hardcoded subset of properties
when writing the config file. Any property not explicitly listed (like
`disableResponseStorage`) will be dropped.
I have added `disableResponseStorage` to the `configToSave` object as
the immediate fix.
[Linking Issue this fixes.](https://github.com/openai/codex/issues/726)
We provide the following options to facilitate Codex development in a container. This is particularly useful for verifying the Linux build when working on a macOS host.
## Docker
To build the Docker image locally for x64 and then run it with the repo mounted under `/workspace`:
Note that `/workspace/target` will contain the binaries built for your host platform, so we include `-e CARGO_TARGET_DIR=/workspace/codex-rs/target-amd64` in the `docker run` command so that the binaries built inside your container are written to a separate directory.
For arm64, specify `--platform=linux/arm64` instead for both `docker build` and `docker run`.
Currently, the `Dockerfile` works for both x64 and arm64 Linux, though you need to run `rustup target add x86_64-unknown-linux-musl` yourself to install the musl toolchain for x64.
## VS Code
VS Code recognizes the `devcontainer.json` file and gives you the option to develop Codex in a container. Currently, `devcontainer.json` builds and runs the `arm64` flavor of the container.
From the integrated terminal in VS Code, you can build either flavor of the `arm64` build (GNU or musl):
`openai/codex-action` is a GitHub Action that facilitates the use of [Codex](https://github.com/openai/codex) on GitHub issues and pull requests. Using the action, associate **labels** to run Codex with the appropriate prompt for the given context. Codex will respond by posting comments or creating PRs, whichever you specify!
Here is a sample workflow that uses `openai/codex-action`:
```yaml
name: Codex

on:
  issues:
    types: [opened, labeled]
  pull_request:
    branches: [main]
    types: [labeled]

jobs:
  codex:
    if: ... # optional, but can be effective in conserving CI resources
    runs-on: ubuntu-latest

    # TODO(mbolin): Need to verify if/when `write` is necessary.
    permissions:
      contents: write
      issues: write
      pull-requests: write

    steps:
      # By default, Codex runs network disabled using --full-auto, so perform
      # any setup that requires network (such as installing dependencies)
```
See sample usage in [`codex.yml`](../../workflows/codex.yml).
## Triggering the Action
Using the sample workflow above, we have:
```yaml
on:
  issues:
    types: [opened, labeled]
  pull_request:
    branches: [main]
    types: [labeled]
```
which means our workflow will be triggered when any of the following events occur:
- a label is added to an issue
- a label is added to a pull request against the `main` branch
### Label-Based Triggers
To define a GitHub label that should trigger Codex, create a file named `.github/codex/labels/LABEL-NAME.md` in your repository where `LABEL-NAME` is the name of the label. The content of the file is the prompt template to use when the label is added (see more on [Prompt Template Variables](#prompt-template-variables) below).
For example, if the file `.github/codex/labels/codex-review.md` exists, then:
- Adding the `codex-review` label will trigger the workflow containing the `openai/codex-action` GitHub Action.
- When `openai/codex-action` starts, it will replace the `codex-review` label with `codex-review-in-progress`.
- When `openai/codex-action` is finished, it will replace the `codex-review-in-progress` label with `codex-review-completed`.
If Codex sees that either `codex-review-in-progress` or `codex-review-completed` is already present, it will not perform the action.
As determined by the [default config](./src/default-label-config.ts), Codex will act on the following labels by default:
- Adding the `codex-review` label to a pull request will have Codex review the PR and add it to the PR as a comment.
- Adding the `codex-triage` label to an issue will have Codex investigate the issue and report its findings as a comment.
- Adding the `codex-issue-fix` label to an issue will have Codex attempt to fix the issue and create a PR with the fix, if any.
## Action Inputs
The `openai/codex-action` GitHub Action takes the following inputs
### `openai_api_key` (required)
Set your `OPENAI_API_KEY` as a [repository secret](https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions). See **Secrets and variables** then **Actions** in the settings for your GitHub repo.
Note that the secret name does not have to be `OPENAI_API_KEY`. For example, you might want to name it `CODEX_OPENAI_API_KEY` and then configure it on `openai/codex-action` as follows:
### `github_token` (required)
This is required so that Codex can post a comment or create a PR. Set this value on the action as follows:
```yaml
github_token: ${{ secrets.GITHUB_TOKEN }}
```
### `codex_args`
A whitespace-delimited list of arguments to pass to Codex. Defaults to `--full-auto`, but if you want to override the default model to use `o3`:
```yaml
codex_args: "--full-auto --model o3"
```
For more complex configurations, use the `codex_home` input.
### `codex_home`
If set, the value to use for the `$CODEX_HOME` environment variable when running Codex. As explained [in the docs](https://github.com/openai/codex/tree/main/codex-rs#readme), this folder can contain the `config.toml` to configure Codex, custom instructions, and log files.
This should be a relative path within your repo.
## Prompt Template Variables
As shown above, `"prompt"` and `"promptPath"` are used to define prompt templates that will be populated and passed to Codex in response to certain events. All template variables are of the form `{CODEX_ACTION_...}` and the supported values are defined below.
### `CODEX_ACTION_ISSUE_TITLE`
If the action was triggered on a GitHub issue, this is the issue title.
Specifically it is read as the `.issue.title` from the `$GITHUB_EVENT_PATH`.
### `CODEX_ACTION_ISSUE_BODY`
If the action was triggered on a GitHub issue, this is the issue body.
Specifically it is read as the `.issue.body` from the `$GITHUB_EVENT_PATH`.
### `CODEX_ACTION_GITHUB_EVENT_PATH`
The value of the `$GITHUB_EVENT_PATH` environment variable, which is the path to the file that contains the JSON payload for the event that triggered the workflow. Codex can use `jq` to read only the fields of interest from this file.
### `CODEX_ACTION_PR_DIFF`
If the action was triggered on a pull request, this is the diff between the base and head commits of the PR. It is the output from `git diff`.
Note that the content of the diff could be quite large, so it is generally safer to point Codex at `CODEX_ACTION_GITHUB_EVENT_PATH` and let it decide how it wants to explore the change.
```yaml
description: "A reusable action that runs a Codex model."

inputs:
  openai_api_key:
    description: "The value to use as the OPENAI_API_KEY environment variable when running Codex."
    required: true
  trigger_phrase:
    description: "Text to trigger Codex from a PR/issue body or comment."
    required: false
    default: ""
  github_token:
    description: "Token so Codex can comment on the PR or issue."
    required: true
  codex_args:
    description: "A whitespace-delimited list of arguments to pass to Codex. Due to limitations in YAML, arguments with spaces are not supported. For more complex configurations, use the `codex_home` input."
```
Provide a concise and respectful comment summarizing the findings.
### {CODEX_ACTION_ISSUE_TITLE}
{CODEX_ACTION_ISSUE_BODY}
`.trim(),
},
"codex-code-review":{
getPromptTemplate:()=>
`
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the \`base\` and \`head\` refs that define this PR. Both refs are available locally.
`.trim(),
},
"codex-attempt-fix":{
getPromptTemplate:()=>
`
Attempt to solve the reported issue.
If a code change is required, create a new branch, commit the fix, and open a pull-request that resolves the problem.
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the `base` and `head` refs that define this PR. Both refs are available locally.
Review this PR and respond with a very concise final message, formatted in Markdown.
There should be a summary of the changes (1-2 sentences) and a few bullet points if necessary.
Then provide the **review** (1-2 sentences plus bullet points, friendly tone).
Things to look out for when doing the review:
## General Principles
- **Make sure the pull request body explains the motivation behind the change.** If the author has failed to do this, call it out, and if you think you can deduce the motivation behind the change, propose copy.
- Ideally, the PR body also contains a small summary of the change. For small changes, the PR title may be sufficient.
- Each PR should ideally do one conceptual thing. For example, if a PR does a refactoring as well as introducing a new feature, push back and suggest the refactoring be done in a separate PR. This makes things easier for the reviewer, as refactoring changes can often be far-reaching, yet quick to review.
- When introducing new code, be on the lookout for code that duplicates existing code. When found, propose a way to refactor the existing code such that it should be reused.
## Code Organization
- Each crate in the Cargo workspace in `codex-rs` has a specific purpose: make a note if you believe new code is not introduced in the correct crate.
- When possible, try to keep the `core` crate as small as possible. Non-core but shared logic is often a good candidate for `codex-rs/common`.
- Be wary of large files and offer suggestions for how to break things into more reasonably-sized files.
- Rust files should generally be organized such that the public parts of the API appear near the top of the file and helper functions go below. This is analogous to the "inverted pyramid" structure that is favored in journalism.
## Assertions in Tests
Assert the equality of the entire objects instead of doing "piecemeal comparisons," performing `assert_eq!()` on individual fields.
Note that unit tests also function as "executable documentation." As shown in the following example, "piecemeal comparisons" are often more verbose, provide less coverage, and are not as useful as executable documentation.
For example, suppose you have the following enum:
```rust
#[derive(Debug, PartialEq)]
enum Message {
    Request {
        id: String,
        method: String,
        params: Option<serde_json::Value>,
    },
    Notification {
        method: String,
        params: Option<serde_json::Value>,
    },
}
```
This is an example of a _piecemeal_ comparison:
```rust
// BAD: Piecemeal Comparison
#[test]
fn test_get_latest_messages() {
    let messages = get_latest_messages();
    assert_eq!(messages.len(), 2);

    let m0 = &messages[0];
    match m0 {
        Message::Request { id, method, params } => {
            assert_eq!(id, "123");
            assert_eq!(method, "subscribe");
            assert_eq!(
                *params,
                Some(json!({
                    "conversation_id": "x42z86"
                }))
            )
        }
        Message::Notification { .. } => {
            panic!("expected Request");
        }
    }

    let m1 = &messages[1];
    match m1 {
        Message::Request { .. } => {
            panic!("expected Notification");
        }
        Message::Notification { method, params } => {
            assert_eq!(method, "log");
            assert_eq!(
                *params,
                Some(json!({
                    "level": "info",
                    "message": "subscribed"
                }))
            )
        }
    }
}
```
This is a _deep_ comparison:
```rust
// GOOD: Verify the entire structure with a single assert_eq!().
use pretty_assertions::assert_eq;

#[test]
fn test_get_latest_messages() {
    let messages = get_latest_messages();
    assert_eq!(
        vec![
            Message::Request {
                id: "123".to_string(),
                method: "subscribe".to_string(),
                params: Some(json!({
                    "conversation_id": "x42z86"
                })),
            },
            Message::Notification {
                method: "log".to_string(),
                params: Some(json!({
                    "level": "info",
                    "message": "subscribed"
                })),
            },
        ],
        messages,
    );
}
```
## More Tactical Rust Things To Look Out For
- Do not use `unsafe` (unless you have a really, really good reason like using an operating system API directly and no safe wrapper exists). For example, there are cases where it is tempting to use `unsafe` in order to use `std::env::set_var()`, but this is indeed `unsafe` and has led to race conditions on multiple occasions. (When this happens, find a mechanism other than environment variables to use for configuration.)
- Encourage the use of small enums or the newtype pattern in Rust if it helps readability without adding significant cognitive load or lines of code.
- If you see opportunities for the changes in a diff to use more idiomatic Rust, please make specific recommendations. For example, favor the use of expressions over `return`.
- When modifying a `Cargo.toml` file, make sure that dependency lists stay alphabetically sorted. Also consider whether a new dependency is added to the appropriate place (e.g., `[dependencies]` versus `[dev-dependencies]`)
## Pull Request Body
- If the nature of the change seems to have a visual component (which is often the case for changes to `codex-rs/tui`), recommend including a screenshot or video to demonstrate the change, if appropriate.
- References to existing GitHub issues and PRs are encouraged, where appropriate, though you likely do not have network access, so may not be able to help here.
# PR Information
{CODEX_ACTION_GITHUB_EVENT_PATH} contains the JSON that triggered this GitHub workflow. It contains the `base` and `head` refs that define this PR. Both refs are available locally.
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` or `CODEX_SANDBOX_ENV_VAR`.
- You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
- Similarly, when you spawn a process using Seatbelt (`/usr/bin/sandbox-exec`), `CODEX_SANDBOX=seatbelt` will be set on the child process. Integration tests that want to run Seatbelt themselves cannot be run under Seatbelt, so checks for `CODEX_SANDBOX=seatbelt` are also often used to early exit out of tests, as appropriate.
Before creating a pull request with changes to `codex-rs`, run `just fmt` (in the `codex-rs` directory) to format the code and `just fix` (in the `codex-rs` directory) to fix any linter issues. Also ensure the test suite passes by running `cargo test --all-features` in the `codex-rs` directory.
When making individual changes, prefer running tests on individual files or projects first.
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
<p align="center"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>

<p align="center"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, see <a href="https://chatgpt.com/codex">chatgpt.com/codex</a>.</p>
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
## Quickstart
### Installing and running Codex CLI
Install globally with your preferred package manager:
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it per session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
Or, run with a prompt as input (and optionally in `Full Auto` mode):
<details>
<summary>You can also go to the <a href="https://github.com/openai/codex/releases/latest">latest GitHub Release</a> and download the appropriate binary for your platform.</summary>
Each GitHub Release contains many executables, but in practice, you likely want one of these:
- macOS
- Apple Silicon/arm64: `codex-aarch64-apple-darwin.tar.gz`
- x86_64 (older Mac hardware): `codex-x86_64-apple-darwin.tar.gz`
Each archive contains a single entry with the platform baked into the name (e.g., `codex-x86_64-unknown-linux-musl`), so you likely want to rename it to `codex` after extracting it.
After you run `codex`, select **Sign in with ChatGPT**. You'll need a Plus, Pro, or Team ChatGPT account, and you will get access to our latest models, including `gpt-5`, at no extra cost to your plan. (Enterprise is coming soon.)
> Important: If you've used the Codex CLI before, you'll need to follow these steps to migrate from usage-based billing with your API key:
>
> 1. Update the CLI with `codex update` and ensure `codex --version` is greater than 0.13
> 2. Ensure that there is no `OPENAI_API_KEY` environment variable set. (Check that `env | grep 'OPENAI_API_KEY'` returns empty)
> 3. Run `codex login` again
If you encounter problems with the login flow, please comment on [this issue](https://github.com/openai/codex/issues/1243).
### Usage-based billing alternative: Use an OpenAI API key
If you prefer to pay-as-you-go, you can still authenticate with your OpenAI API key by setting it as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> Note: This command only sets the key for your current terminal session, which we recommend. To set it for all future sessions, you can also add the `export` line to your shell's configuration file (e.g., `~/.zshrc`).
### Choosing Codex's level of autonomy
We always recommend running Codex in its default sandbox that gives you strong guardrails around what the agent can do. The default sandbox prevents it from editing files outside its workspace, or from accessing the network.
When you launch Codex in a new folder, it detects whether the folder is version controlled and recommends one of two levels of autonomy:
#### **1. Read/write**
- Codex can run commands and write files in the workspace without approval.
- To write files in other folders, access the network, update git, or perform other actions protected by the sandbox, Codex will need your permission.
- By default, the workspace includes the current directory, as well as temporary directories like `/tmp`. You can see what directories are in the workspace with the `/status` command. See the docs for how to customize this behavior.
- Advanced: You can manually specify this configuration by running `codex --sandbox workspace-write --ask-for-approval on-request`
- This is the recommended default for version-controlled folders.
#### **2. Read-only**
- Codex can run read-only commands without approval.
- To edit files, access the network, or perform other actions protected by the sandbox, Codex will need your permission.
- Advanced: You can manually specify this configuration by running `codex --sandbox read-only --ask-for-approval on-request`
- This is the recommended default for non-version-controlled folders.
#### **3. Advanced configuration**
Codex gives you fine-grained control over the sandbox with the `--sandbox` option, and over when it requests approval with the `--ask-for-approval` option. Run `codex help` for more on these options.
#### Can I run without ANY approvals?
Yes, run Codex non-interactively with `--ask-for-approval never`. This option works with all `--sandbox` options, so you still have full control over Codex's level of autonomy. It will make its best attempt with whatever constraints you provide. For example:
- Use `codex --ask-for-approval never --sandbox read-only` when you are running many agents to answer questions in parallel in the same workspace.
- Use `codex --ask-for-approval never --sandbox workspace-write` when you want the agent to non-interactively take time to produce the best outcome, with strong guardrails around its behavior.
- Use `codex --ask-for-approval never --sandbox danger-full-access` to dangerously give the agent full autonomy. Because this disables important safety mechanisms, we recommend against using this unless running Codex in an isolated environment.
#### Fine-tuning in `config.toml`
```toml
# approval mode
approval_policy = "untrusted"
sandbox_mode = "read-only"

# full-auto mode
approval_policy = "on-request"
sandbox_mode = "workspace-write"

# Optional: allow network in workspace-write mode
[sandbox_workspace_write]
network_access = true
```
You can also save presets as **profiles**:

```toml
[profiles.full_auto]
approval_policy = "on-request"
sandbox_mode = "workspace-write"

[profiles.readonly_quiet]
approval_policy = "never"
sandbox_mode = "read-only"
```

---
## Why Codex?
Codex CLI is built for developers who already **live in the terminal** and want
ChatGPT-level reasoning **plus** the power to actually run code, manipulate
files, and iterate - all under version control. In short, it's _chat-driven
development_ that understands and executes your repo.
- **Zero setup** - bring your OpenAI API key and it just works!
- **Full auto-approval, while safe + secure** by running network-disabled and directory-sandboxed
- **Multimodal** - pass in screenshots or diagrams to implement features ✨
And it's **fully open-source** so you can see and contribute to how it develops!
---
## Security Model & Permissions
Codex lets you decide _how much autonomy_ the agent receives and its auto-approval policy via the `--approval-mode` flag (or the interactive onboarding prompt):
| Mode | What the agent may do without asking | Still requires approval |
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
---
## Memory & Project Docs
Codex merges Markdown instructions in this order:
1. `~/.codex/instructions.md` - personal global guidance
2. `codex.md` at repo root - shared project notes
3. `codex.md` in cwd - sub-package specifics
Disable with `--no-project-doc` or `CODEX_DISABLE_PROJECT_DOC=1`.
---
## Non-interactive / CI mode
Run Codex head-less in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
  run: |
    npm install -g @openai/codex
    export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
    codex -a auto-edit --quiet "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
## Tracing / Verbose Logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
```shell
DEBUG=true codex
```
---
## Recipes
### Example prompts
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
| 6 | `codex "Carefully review this repo, and propose 3 high impact well-scoped PRs"` | Suggests impactful PRs in the current codebase. |
| 7 | `codex "Look for vulnerabilities and create a security review report"` | Finds and explains security bugs. |
## Running with a prompt as input
You can also run Codex CLI with a prompt as input:
```shell
codex "explain this codebase to me"
```
```shell
codex --full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
missing dependencies, and show you the live result. Approve the changes and
they'll be committed to your working directory.
## Using Open Source Models
<details>
<summary><strong>Use <code>--profile</code> to use other models</strong></summary>
Codex also allows you to use other providers that support the OpenAI Chat Completions (or Responses) API.
To do so, you must first define custom [providers](./config.md#model_providers) in `~/.codex/config.toml`. For example, the provider for a standard Ollama setup would be defined as follows:
```toml
[model_providers.ollama]
name="Ollama"
base_url="http://localhost:11434/v1"
```
The `base_url` will have `/chat/completions` appended to it to build the full URL for the request.
For providers that also require an `Authorization` header of the form `Bearer SECRET`, an `env_key` can be specified, which indicates the environment variable to read to use as the value of `SECRET` when making a request:
```toml
[model_providers.openrouter]
name="OpenRouter"
base_url="https://openrouter.ai/api/v1"
env_key="OPENROUTER_API_KEY"
```
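As a rough, hypothetical illustration of how these two fields fit together (this is not Codex's actual implementation, and the use of `reqwest` here is an assumption for the example): the request URL is `base_url` plus `/chat/completions`, and the environment variable named by `env_key` supplies the bearer token.
```rust
use std::env;

/// Illustrative only: build a chat-completions request for a provider
/// configured with `base_url` and an optional `env_key`.
fn build_request(
    client: &reqwest::Client,
    base_url: &str,
    env_key: Option<&str>,
) -> Result<reqwest::RequestBuilder, env::VarError> {
    // `/chat/completions` is appended to the configured base_url.
    let url = format!("{}/chat/completions", base_url.trim_end_matches('/'));
    let mut request = client.post(url);
    // If env_key is set, read that environment variable and send it as
    // an `Authorization: Bearer <secret>` header.
    if let Some(key) = env_key {
        request = request.bearer_auth(env::var(key)?);
    }
    Ok(request)
}
```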
Providers that speak the Responses API are also supported by adding `wire_api = "responses"` as part of the definition. Accessing OpenAI models via Azure is an example of such a provider, though it also requires specifying additional `query_params` that need to be appended to the request URL:
```toml
[model_providers.azure]
name="Azure"
# Make sure you set the appropriate subdomain for this URL.
env_key="AZURE_OPENAI_API_KEY"# Or "OPENAI_API_KEY", whichever you use.
# Newer versions appear to support the responses API, see https://github.com/openai/codex/pull/1321
query_params={api-version="2025-04-01-preview"}
wire_api="responses"
```
Once you have defined a provider you wish to use, you can configure it as your default provider as follows:
```toml
model_provider="azure"
```
> [!TIP]
> If you find yourself experimenting with a variety of models and providers, then you likely want to invest in defining a _profile_ for each configuration like so:
```toml
[profiles.o3]
model_provider = "azure"
model = "o3"

[profiles.mistral]
model_provider = "ollama"
model = "mistral"
```
This way, you can specify one command-line argument (e.g., `--profile o3`, `--profile mistral`) to override multiple settings together.
</details>
Codex can run fully locally against an OpenAI-compatible OSS host (like Ollama) using the `--oss` flag:
- Interactive UI:
  - `codex --oss`
- Non-interactive (programmatic) mode:
  - `echo "Refactor utils" | codex exec --oss`

Model selection when using `--oss`:

- If you omit `-m/--model`, Codex defaults to `-m gpt-oss:20b` and will verify it exists locally (downloading if needed).
- To pick a different size, pass one of:
  - `-m "gpt-oss:20b"`
  - `-m "gpt-oss:120b"`

Point Codex at your own OSS host:

- By default, `--oss` talks to `http://localhost:11434/v1`.
- To use a different host, set one of these environment variables before running Codex:
```bash
npm install -g @openai/codex
# or
yarn global add @openai/codex
# or
bun install -g @openai/codex
# or
pnpm add -g @openai/codex
```

The mechanism Codex uses to implement the sandbox policy depends on your OS:

- **macOS 12+** uses **Apple Seatbelt** and runs commands using `sandbox-exec` with a profile (`-p`) that corresponds to the `--sandbox` that was specified.
- **Linux** uses a combination of Landlock/seccomp APIs to enforce the `sandbox` configuration.
Note that when running Linux in a containerized environment such as Docker, sandboxing may not work if the host/container configuration does not support the necessary Landlock/seccomp APIs. In such cases, we recommend configuring your Docker container so that it provides the sandbox guarantees you are looking for and then running `codex` with `--sandbox danger-full-access` (or, more simply, the `--dangerously-bypass-approvals-and-sandbox` flag) within your container.
---
## Experimental technology disclaimer
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
You can give Codex extra instructions and guidance using `AGENTS.md` files. Codex looks for `AGENTS.md` files in the following places, and merges them top-down:
1. `~/.codex/AGENTS.md` - personal global guidance
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
---
## Non-interactive / CI mode
Run Codex head-less in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
  run: |
    npm install -g @openai/codex
    export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
    codex exec --full-auto "update CHANGELOG for next release"
```
## Model Context Protocol (MCP)
The Codex CLI can be configured to leverage MCP servers by defining an [`mcp_servers`](./codex-rs/config.md#mcp_servers) section in `~/.codex/config.toml`. It is intended to mirror how tools such as Claude and Cursor define `mcpServers` in their respective JSON config files, though the Codex format is slightly different since it uses TOML rather than JSON, e.g.:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command="npx"
args=["-y","mcp-server"]
env={"API_KEY"="value"}
```
> [!TIP]
> It is somewhat experimental, but the Codex CLI can also be run as an MCP _server_ via `codex mcp`. If you launch it with an MCP client such as `npx @modelcontextprotocol/inspector codex mcp` and send it a `tools/list` request, you will see that there is only one tool, `codex`, that accepts a grab-bag of inputs, including a catch-all `config` map for anything you might want to override. Feel free to play around with it and provide feedback via GitHub issues.
## Tracing / verbose logging
Because Codex is written in Rust, it honors the `RUST_LOG` environment variable to configure its logging behavior.
The TUI defaults to `RUST_LOG=codex_core=info,codex_tui=info` and log messages are written to `~/.codex/log/codex-tui.log`, so you can leave the following running in a separate terminal to monitor log messages as they are written:
```
tail -F ~/.codex/log/codex-tui.log
```
By comparison, the non-interactive mode (`codex exec`) defaults to `RUST_LOG=error`, but messages are printed inline, so there is no need to monitor a separate file.
See the Rust documentation on [`RUST_LOG`](https://docs.rs/env_logger/latest/env_logger/#enabling-logging) for more information on the configuration options.
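For context, the snippet below is a generic sketch of how a Rust binary typically honors `RUST_LOG` via `tracing-subscriber`'s `EnvFilter`; the default filter of `error` here is an assumption for the example, not necessarily what Codex does.
```rust
use tracing_subscriber::EnvFilter;

fn init_logging() {
    // Use RUST_LOG if it is set; otherwise fall back to a default filter.
    let filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new("error"));
    tracing_subscriber::fmt().with_env_filter(filter).init();
}
```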
---
### DotSlash
The GitHub Release also contains a [DotSlash](https://dotslash-cli.com/) file for the Codex CLI named `codex`. Using a DotSlash file makes it possible to make a lightweight commit to source control to ensure all contributors use the same version of an executable, regardless of what platform they use for development.
</details>
<details>
<summary><strong>Build from source</strong></summary>
```bash
# Clone the repository and navigate to the root of the Cargo workspace.
git clone https://github.com/openai/codex.git
cd codex/codex-rs
```

`--config` can be used to set/override ad-hoc config values for individual invocations of `codex`.
---
---
## Zero data retention (ZDR) usage
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
Ensure you are running `codex` with `--config disable_response_storage=true` or add this line to `~/.codex/config.toml` to avoid specifying the command line option each time:
```toml
disable_response_storage = true
```
See [the configuration documentation on `disable_response_storage`](./codex-rs/config.md#disable_response_storage) for details.
---
## Codex open source fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git Hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./codex-cli/HUSKY.md).
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
- In VS Code, choose **Debug: Attach to Node Process** from the command palette and choose the option in the dropdown with debug port `9229` (likely the first option)
- Go to <chrome://inspect> in Chrome and find **localhost:9229** and click **trace**
- Following the [development setup](#development-workflow) instructions above, ensure your change is free of lint warnings and test failures.
### Writing high-impact code changes
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`cargo test && cargo clippy --tests && cargo fmt -- --config imports_granularity=Item`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor license agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
### Releasing `codex`

_For admins only._

To publish a new version of the CLI, run the release scripts defined in `codex-cli/package.json`:

1. Open the `codex-cli` directory
2. Make sure you're on a branch like `git checkout -b bump-version`
3. Bump the version and `CLI_VERSION` to current datetime: `pnpm release:version`

This will make a local commit on top of `main` with `version` set to `$VERSION` in `codex-rs/Cargo.toml` (note that on `main`, we leave the version as `version = "0.0.0"`).

This will push the commit using the tag `rust-v${VERSION}`, which in turn kicks off [the release workflow](.github/workflows/rust-release.yml). This will create a new GitHub Release named `$VERSION`.

If everything looks good in the generated GitHub Release, uncheck the **pre-release** box so it is the latest release.

Create a PR to update [`Formula/c/codex.rb`](https://github.com/Homebrew/homebrew-core/blob/main/Formula/c/codex.rb) on Homebrew.

This shell includes Node.js, installs dependencies, builds the CLI, and provides a `codex` command alias.

Build and run the CLI directly:

```bash
nix build
./result/bin/codex --help
```

Run the CLI via the flake app:

```bash
nix run .#codex
```
---
## Security & Responsible AI
## Security & responsible AI
Have you discovered a vulnerability or have concerns about model output? Please e-mail **security@openai.com** and we will respond promptly.
You are a summarization assistant. A conversation follows between a user and a coding-focused AI (Codex). Your task is to generate a clear summary capturing:
• High-level objective or problem being solved
• Key instructions or design decisions given by the user
• Main code actions or behaviors from the AI
• Important variables, functions, modules, or outputs discussed
• Any unresolved questions or next steps
Produce the summary in a structured format like:
**Objective:** …
**User instructions:** … (bulleted)
**AI actions / code behavior:** … (bulleted)
**Important entities:** … (e.g. function names, variables, files)
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
> [!IMPORTANT]
> This is the documentation for the _legacy_ TypeScript implementation of the Codex CLI. It has been superseded by the _Rust_ implementation. See the [README in the root of the Codex repository](https://github.com/openai/codex/blob/main/README.md) for details.

---
<details>
<summary><strong>Table of contents</strong></summary>
Codex CLI is an experimental project under active development. It is not yet stable, may contain bugs, incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
## Quickstart
Install globally:
```shell
npm install -g @openai/codex
```
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it per session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - azure
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - arceeai
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
---
## Memory & project docs
You can give Codex extra instructions and guidance using `AGENTS.md` files. Codex looks for `AGENTS.md` files in the following places, and merges them top-down:
1. `~/.codex/AGENTS.md` - personal global guidance
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
Disable loading of these files with `--no-project-doc` or the environment variable `CODEX_DISABLE_PROJECT_DOC=1`.
---
## Non-interactive / CI mode
Run Codex head-less in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
  run: |
    npm install -g @openai/codex
    export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
    codex -a auto-edit --quiet "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
## Tracing / verbose logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
```shell
DEBUG=true codex
```
---
## Recipes
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
<summary>OpenAI released a model called Codex in 2021 - is this related?</summary>
In 2021, OpenAI released Codex, an AI system designed to generate code from natural language prompts. That original Codex model was deprecated as of March 2023 and is separate from the CLI tool.
</details>
<details>
<summary>Which models are supported?</summary>
Any model available with [Responses API](https://platform.openai.com/docs/api-reference/responses). The default is `o4-mini`, but pass `--model gpt-4.1` or set `model: gpt-4.1` in your config file to override.
</details>
<details>
<summary>Why does <code>o3</code> or <code>o4-mini</code> not work for me?</summary>
It's possible that your [API account needs to be verified](https://help.openai.com/en/articles/10910291-api-organization-verification) in order to start streaming responses and seeing chain of thought summaries from the API. If you're still running into issues, please let us know!
</details>
<details>
<summary>How do I stop Codex from editing my files?</summary>
Codex runs model-generated commands in a sandbox. If a proposed command or file change doesn't look right, you can simply type **n** to deny the command or give the model feedback.
</details>
<details>
<summary>Does it work on Windows?</summary>
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex has been tested on macOS and Linux with Node 22.
</details>
---
## Zero data retention (ZDR) usage
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
---
## Codex open source fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Grants are awarded up to **$25,000** in API credits.
- Applications are reviewed **on a rolling basis**.
This project is under active development and the code will likely change pretty significantly. We'll update this message once that's complete!
More broadly we welcome contributions - whether you are opening your very first pull request or you're a seasoned maintainer. At the same time we care about reliability and long-term maintainability, so the bar for merging code is intentionally **high**. The guidelines below spell out what "high-quality" means in practice and should make the whole process transparent and friendly.
### Development workflow
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./HUSKY.md).
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
- In VS Code, choose **Debug: Attach to Node Process** from the command palette and choose the option in the dropdown with debug port `9229` (likely the first option)
- Go to <chrome://inspect> in Chrome and find **localhost:9229** and click **trace**
### Writing high-impact code changes
1. **Start with an issue.** Open a new one or comment on an existing discussion so we can agree on the solution before code is written.
2. **Add or update tests.** Every new feature or bug-fix should come with test coverage that fails before your change and passes afterwards. 100% coverage is not required, but aim for meaningful assertions.
3. **Document behaviour.** If your change affects user-facing behaviour, update the README, inline help (`codex --help`), or relevant example projects.
4. **Keep commits atomic.** Each commit should compile and the tests should pass. This makes reviews and potential rollbacks easier.
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
### Review process
1. One maintainer will be assigned as a primary reviewer.
2. We may ask for changes - please do not take this personally. We value the work, we just also value consistency and long-term maintainability.
3. When there is consensus that the PR meets the bar, a maintainer will squash-and-merge.
### Community values
- **Be kind and inclusive.** Treat others with respect; we follow the [Contributor Covenant](https://www.contributor-covenant.org/).
- **Assume good intent.** Written communication is hard - err on the side of generosity.
- **Teach & learn.** If you spot something confusing, open an issue or PR with improvements.
### Getting help
If you run into problems setting up the project, would like feedback on an idea, or just want to say _hi_ - please open a Discussion or jump into the relevant issue. We are happy to help.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor license agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
1. Open your pull request.
2. Paste the following comment (or reply `recheck` if you've signed before):
```text
I have read the CLA Document and I hereby sign the CLA
```
3. The CLA-Assistant bot records your signature in the repo and marks the status check as passed.
No special Git commands, email attachments, or commit footers required.
"Advertiser","I want you to act as an advertiser. You will create a campaign to promote a product or service of your choice. You will choose a target audience, develop key messages and slogans, select the media channels for promotion, and decide on any additional activities needed to reach your goals. My first suggestion request is ""I need help creating an advertising campaign for a new type of energy drink targeting young adults aged 18-30.""",FALSE
"Storyteller","I want you to act as a storyteller. You will come up with entertaining stories that are engaging, imaginative and captivating for the audience. It can be fairy tales, educational stories or any other type of stories which has the potential to capture people's attention and imagination. Depending on the target audience, you may choose specific themes or topics for your storytelling session e.g., if it's children then you can talk about animals; If it's adults then history-based tales might engage them better etc. My first request is ""I need an interesting story on perseverance.""",FALSE
"Football Commentator","I want you to act as a football commentator. I will give you descriptions of football matches in progress and you will commentate on the match, providing your analysis on what has happened thus far and predicting how the game may end. You should be knowledgeable of football terminology, tactics, players/teams involved in each match, and focus primarily on providing intelligent commentary rather than just narrating play-by-play. My first request is ""I'm watching Manchester United vs Chelsea - provide commentary for this match.""",FALSE
"Stand-up Comedian","I want you to act as a stand-up comedian. I will provide you with some topics related to current events and you will use your wit, creativity, and observational skills to create a routine based on those topics. You should also be sure to incorporate personal anecdotes or experiences into the routine in order to make it more relatable and engaging for the audience. My first request is ""I want an humorous take on politics.""",FALSE
"Stand-up Comedian","I want you to act as a stand-up comedian. I will provide you with some topics related to current events and you will use your with, creativity, and observational skills to create a routine based on those topics. You should also be sure to incorporate personal anecdotes or experiences into the routine in order to make it more relatable and engaging for the audience. My first request is ""I want an humorous take on politics.""",FALSE
"Motivational Coach","I want you to act as a motivational coach. I will provide you with some information about someone's goals and challenges, and it will be your job to come up with strategies that can help this person achieve their goals. This could involve providing positive affirmations, giving helpful advice or suggesting activities they can do to reach their end goal. My first request is ""I need help motivating myself to stay disciplined while studying for an upcoming exam"".",FALSE
"Composer","I want you to act as a composer. I will provide the lyrics to a song and you will create music for it. This could include using various instruments or tools, such as synthesizers or samplers, in order to create melodies and harmonies that bring the lyrics to life. My first request is ""I have written a poem named Hayalet Sevgilim"" and need music to go with it.""""""",FALSE
"Debater","I want you to act as a debater. I will provide you with some topics related to current events and your task is to research both sides of the debates, present valid arguments for each side, refute opposing points of view, and draw persuasive conclusions based on evidence. Your goal is to help people come away from the discussion with increased knowledge and insight into the topic at hand. My first request is ""I want an opinion piece about Deno.""",FALSE
"Movie Critic","I want you to act as a movie critic. You will develop an engaging and creative movie review. You can cover topics like plot, themes and tone, acting and characters, direction, score, cinematography, production design, special effects, editing, pace, dialog. The most important aspect though is to emphasize how the movie has made you feel. What has really resonated with you. You can also be critical about the movie. Please avoid spoilers. My first request is ""I need to write a movie review for the movie Interstellar""",FALSE
"Relationship Coach","I want you to act as a relationship coach. I will provide some details about the two people involved in a conflict, and it will be your job to come up with suggestions on how they can work through the issues that are separating them. This could include advice on communication techniques or different strategies for improving their understanding of one another's perspectives. My first request is ""I need help solving conflicts between my spouse and myself.""",FALSE
"Poet","I want you to act as a poet. You will create poems that evoke emotions and have the power to stir people's soul. Write on any topic or theme but make sure your words convey the feeling you are trying to express in beautiful yet meaningful ways. You can also come up with short verses that are still powerful enough to leave an imprint in readers' minds. My first request is ""I need a poem about love.""",FALSE
"Rapper","I want you to act as a rapper. You will come up with powerful and meaningful lyrics, beats and rhythm that can 'wow' the audience. Your lyrics should have an intriguing meaning and message which people can relate too. When it comes to choosing your beat, make sure it is catchy yet relevant to your words, so that when combined they make an explosion of sound everytime! My first request is ""I need a rap song about finding strength within yourself.""",FALSE
"Rapper","I want you to act as a rapper. You will come up with powerful and meaningful lyrics, beats and rhythm that can 'wow' the audience. Your lyrics should have an intriguing meaning and message which people can relate too. When it comes to choosing your beat, make sure it is catchy yet relevant to your words, so that when combined they make an explosion of sound everytime! My first request is ""I need a rap song about finding strength within yourself.""",FALSE
"Motivational Speaker","I want you to act as a motivational speaker. Put together words that inspire action and make people feel empowered to do something beyond their abilities. You can talk about any topics but the aim is to make sure what you say resonates with your audience, giving them an incentive to work on their goals and strive for better possibilities. My first request is ""I need a speech about how everyone should never give up.""",FALSE
"Philosophy Teacher","I want you to act as a philosophy teacher. I will provide some topics related to the study of philosophy, and it will be your job to explain these concepts in an easy-to-understand manner. This could include providing examples, posing questions or breaking down complex ideas into smaller pieces that are easier to comprehend. My first request is ""I need help understanding how different philosophical theories can be applied in everyday life.""",FALSE
"Philosopher","I want you to act as a philosopher. I will provide some topics or questions related to the study of philosophy, and it will be your job to explore these concepts in depth. This could involve conducting research into various philosophical theories, proposing new ideas or finding creative solutions for solving complex problems. My first request is ""I need help developing an ethical framework for decision making.""",FALSE
"- Always use rg instead of grep/ls -R because it is much faster and respects gitignore",
);
}
const dynamicPrefix = dynamicLines.join("\n");
const prefix = `You are operating as and within the Codex CLI, a terminal-based agentic coding assistant built by OpenAI. It wraps OpenAI models to enable natural language interaction with a local codebase. You are expected to be precise, safe, and helpful.
You can:
- If there is a .pre-commit-config.yaml, use \`pre-commit run --files ...\` to check that your changes pass the pre-commit checks. However, do not fix pre-existing errors on lines you didn't touch.
- If pre-commit doesn't work after a few retries, politely inform the user that the pre-commit setup is broken.
- Once you finish coding, you must
- Check \`git status\` to sanity check your changes; revert any scratch files or changes.
- Remove all inline comments you added as much as possible, even if they look normal. Check using \`git diff\`. Inline comments must be generally avoided, unless active maintainers of the repo, after long careful study of the code and the issue, will still misinterpret the code without the comments.
- Check if you accidentally add copyright or license headers. If so, remove them.
- Try to run pre-commit if it is available.
- Respond in a friendly tone as a remote teammate, who is knowledgeable, capable and eager to help with coding.
- When your task involves writing or modifying files:
- Do NOT tell the user to "save the file" or "copy the code into a file" if you already created or modified the file using \`apply_patch\`. Instead, reference the file as already saved.
- Do NOT show the full contents of large files you have already written, unless the user explicitly asks for them.`;
reason += "Patch text must end with the correct patch suffix.";
}
throw new DiffError(reason);
}
const parser = new Parser(orig, lines);
parser.index = 1;
}
});
}
export const applyPatchToolInstructions = `
To edit files, ALWAYS use the \`shell\` tool with \`apply_patch\` CLI. \`apply_patch\` effectively allows you to execute a diff/patch against a file, but the format of the diff specification is unique to this task, so pay careful attention to these instructions. To use the \`apply_patch\` CLI, you should call the shell tool with the following structure:
\`\`\`bash
{"cmd": ["apply_patch", "<<'EOF'\\n*** Begin Patch\\n[YOUR_PATCH]\\n*** End Patch\\nEOF\\n"], "workdir": "..."}
\`\`\`
Where [YOUR_PATCH] is the actual content of your patch, specified in the following V4A diff format.
*** [ACTION] File: [path/to/file] -> ACTION can be one of Add, Update, or Delete.
For each snippet of code that needs to be changed, repeat the following:
[context_before] -> See below for further instructions on context.
- [old_code] -> Precede the old code with a minus sign.
+ [new_code] -> Precede the new, replacement code with a plus sign.
[context_after] -> See below for further instructions on context.
For instructions on [context_before] and [context_after]:
- By default, show 3 lines of code immediately above and 3 lines immediately below each change. If a change is within 3 lines of a previous change, do NOT duplicate the first change’s [context_after] lines in the second change’s [context_before] lines.
- If 3 lines of context is insufficient to uniquely identify the snippet of code within the file, use the @@ operator to indicate the class or function to which the snippet belongs. For instance, we might have:
@@ class BaseClass
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
- If a code block is repeated so many times in a class or function such that even a single \`@@\` statement and 3 lines of context cannot uniquely identify the snippet of code, you can use multiple \`@@\` statements to jump to the right context. For instance:
@@ class BaseClass
@@ def method():
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
Note, then, that we do not use line numbers in this diff format, as the context is enough to uniquely identify code. An example of a message that you might pass as "input" to this function, in order to apply a patch, is shown below.
\`\`\`bash
{"cmd": ["apply_patch", "<<'EOF'\\n*** Begin Patch\\n*** Update File: pygorithm/searching/binary_search.py\\n@@ class BaseClass\\n@@ def search():\\n- pass\\n+ raise NotImplementedError()\\n@@ class Subclass\\n@@ def search():\\n- pass\\n+ raise NotImplementedError()\\n*** End Patch\\nEOF\\n"], "workdir": "..."}
\`\`\`
File references can only be relative, NEVER ABSOLUTE. After the apply_patch command is run, it will always say "Done!", regardless of whether the patch was successfully applied or not. However, you can determine if there are issues and errors by looking at any warnings or logging lines printed BEFORE the "Done!" is output.
)}\nIf you haven't already redeemed, you should receive ${
planType === "plus" ? "$5" : "$50"
} in API credits\nCredits: ${chalk.dim(chalk.underline("https://platform.openai.com/settings/organization/billing/credit-grants"))}\nMore info: ${chalk.dim(chalk.underline("https://help.openai.com/en/articles/11381614"))}`,
),
);
} else {
// eslint-disable-next-line no-console
console.log(
chalk.green(
`It looks like no credits were granted:\n${JSON.stringify(