Compare commits

99 Commits

Author SHA1 Message Date
Charlie Weems
d624f6a521 Clean up command outputs 2025-07-23 10:17:50 -07:00
pap
903de87833 adding best of n 2025-07-22 18:42:35 -07:00
pap
8b9f09a5ba adding secs ago instead of start date 2025-07-22 18:36:45 -07:00
pap
20db1497f3 missing dependencies 2025-07-22 16:06:30 -07:00
pap
ed9de08465 adding a test and renaming jobs to tasks 2025-07-22 16:06:21 -07:00
pap
bb0befc8db adding codex jobs command, including jobs ls, inspect, logs and -a options 2025-07-22 13:42:47 -07:00
pap
2ca44b46d6 adding concurrent option using worktree 2025-07-22 12:03:52 -07:00
Michael Bolin
4082246f6a chore: install an extension for TOML syntax highlighting in the devcontainer (#1650)
Small quality-of-life improvement when doing devcontainer development.
2025-07-22 10:58:09 -07:00
pakrym-oai
6d82907082 Add support for custom base instructions (#1645)
Allows providing custom instructions file as a config parameter and
custom instruction text via MCP tool call.
2025-07-22 09:42:22 -07:00
pakrym-oai
ed206d5687 Log response.failed error message and request-id (#1649)
To help with diagnosing failures.
2025-07-22 09:28:00 -07:00
Michael Bolin
d51654822f fix: use PR_SET_PDEATHSIG so to ensure child processes are killed in a timely manner (#1626)
Some users have reported issues where child processes are not cleaned up
after Codex exits (e.g., https://github.com/openai/codex/issues/1570).

This is generally a tricky issue on operating systems: if a parent
process receives `SIGKILL`, then it terminates immediately and cannot
communicate with the child.

**It only helps on Linux**, but this PR introduces the use of `prctl(2)`
so that if the parent process dies, `SIGTERM` will be delivered to the
child process. Previously, I believe that if Codex spawned a
long-running process (like `tsc --watch`) and the Codex process received
`SIGKILL`, the `tsc --watch` process would be reparented to the init
process and would never be killed. Now, with the use of `prctl(2)`, the
`tsc --watch` process should receive `SIGTERM` in that scenario.

We still need to come up with a solution for macOS. I've started to look
at `launchd`, but I'm researching a number of options.
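
As a rough illustration of the `prctl(2)` approach described above, the following Rust sketch sets `PR_SET_PDEATHSIG` in the child between `fork` and `exec`. It is not the code from this PR: the helper name `spawn_with_parent_death_signal` is hypothetical, the constants are written out by hand rather than taken from the `libc` crate, and the snippet only builds on Linux.

```rust
use std::io;
use std::os::raw::{c_int, c_ulong};
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

// Linux values from <linux/prctl.h> and <signal.h>, written out by hand.
const PR_SET_PDEATHSIG: c_int = 1;
const SIGTERM: c_ulong = 15;

unsafe extern "C" {
    // C prototype: int prctl(int option, ...);
    fn prctl(option: c_int, ...) -> c_int;
}

/// Hypothetical helper: spawn `cmd` so that the kernel delivers SIGTERM to
/// the child when its parent (strictly, the spawning thread) dies.
fn spawn_with_parent_death_signal(mut cmd: Command) -> io::Result<Child> {
    unsafe {
        // The closure runs in the child after fork() and before exec().
        cmd.pre_exec(|| {
            if prctl(PR_SET_PDEATHSIG, SIGTERM) == -1 {
                return Err(io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd.spawn()
}
```

A caller would pass in something like `Command::new("tsc")` with its arguments already set. On other platforms, such as macOS, a different mechanism is still needed, as noted above.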
2025-07-22 00:41:27 -07:00
Gabriel Peal
710f728124 Add an elicitation for approve patch and refactor tool calls (#1642)
1. Added an elicitation for `approve-patch` which is very similar to
`approve-exec`.
2. Extracted both elicitations to their own files to prevent
`codex_tool_runner` from blowing up in size.
2025-07-22 02:58:41 -04:00
Michael Bolin
6cf4b96f9d fix: check flags to ripgrep when deciding whether the invocation is "trusted" (#1644)
With this change, if any of `--pre`, `--hostname-bin`, `--search-zip`, or `-z` are used with a proposed invocation of `rg`, do not auto-approve.
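
A minimal sketch of such a flag check might look like the following. The function name `rg_args_are_auto_approvable` is hypothetical and this is not the code from the PR. It only recognizes the exact flag and the `--flag=value` spelling, and it does not handle bundled short flags such as `-zi`.

```rust
/// Long ripgrep flags that should prevent auto-approval.
const UNSAFE_LONG_FLAGS: [&str; 3] = ["--pre", "--hostname-bin", "--search-zip"];

/// Hypothetical helper: returns false if any argument is one of the flags
/// listed above, either on its own or in `--flag=value` form.
fn rg_args_are_auto_approvable(args: &[&str]) -> bool {
    !args.iter().any(|arg| {
        *arg == "-z"
            || UNSAFE_LONG_FLAGS.iter().any(|flag| {
                *arg == *flag
                    || arg
                        .strip_prefix(*flag)
                        .map_or(false, |rest| rest.starts_with('='))
            })
    })
}
```

For example, `rg_args_are_auto_approvable(&["-n", "TODO", "src"])` returns `true`, while passing `--pre=cat` or `-z` makes it return `false`.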
2025-07-21 22:38:50 -07:00
Dylan
18b2b30841 [mcp-server] Add reply tool call (#1643)
## Summary
Adds a new mcp tool call, `codex-reply`, so we can continue existing
sessions. This is a first draft and does not yet support sessions from
previous processes.

## Testing
- [x] tested with mcp client
2025-07-21 21:01:56 -07:00
Michael Bolin
d49d802b06 test: add integration test for MCP server (#1633)
This PR introduces a single integration test for `cargo mcp`, though it
also introduces a number of reusable components so that it should be
easier to introduce more integration tests going forward.

The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs`
and the reusable pieces are in `codex-rs/mcp-server/tests/common`.

The test itself verifies new functionality around elicitations
introduced in https://github.com/openai/codex/pull/1623 (and the fix
introduced in https://github.com/openai/codex/pull/1629) by doing the
following:

- starts a mock model provider with canned responses for
`/v1/chat/completions`
- starts the MCP server with a `config.toml` to use that model provider
(and `approval_policy = "untrusted"`)
- sends the `codex` tool call which causes the mock model provider to
request a shell call for `git init`
- the MCP server sends an elicitation to the client to approve the
request
- the client replies to the elicitation with `"approved"`
- the MCP server runs the command and re-samples the model, getting a
`"finish_reason": "stop"`
- in turn, the MCP server sends the final response to the original
`codex` tool call
- verifies that `git init` ran as expected

To test:

```
cargo test shell_command_approval_triggers_elicitation
```

In writing this test, I discovered that `ExecApprovalResponse` does not
conform to `ElicitResult`, so I added a TODO to fix that, since I think
that should be updated in a separate PR. As it stands, this PR does not
update any business logic, though it does make a number of members of
the `mcp-server` crate `pub` so they can be used in the test.

One additional learning from this PR is that
`std::process::Command::cargo_bin()` from the `assert_cmd` crate is only
available for `std::process::Command`, but we really want to use
`tokio::process::Command` so that everything is async and we can
leverage utilities like `tokio::time::timeout()`. The trick I came up
with was to use `cargo_bin()` to locate the program, and then to use
`std::process::Command::get_program()` when constructing the
`tokio::process::Command`.
2025-07-21 10:27:07 -07:00
Michael Bolin
8a6c6cee88 fix: address review feedback on #1621 and #1623 (#1631)
- formalizes `ExecApprovalElicitRequestParams`
- adds some defensive logic when messages fail to parse
- fixes a typo in a comment
2025-07-20 14:42:11 -07:00
Gabriel Peal
8b590105de Don't drop sessions on elicitation responses (#1629) 2025-07-20 13:31:19 -04:00
Michael Bolin
018003e52f feat: leverage elicitations in the MCP server (#1623)
This updates the MCP server so that if it receives an
`ExecApprovalRequest` from the `Codex` session, it in turn sends an [MCP
elicitation](https://modelcontextprotocol.io/specification/draft/client/elicitation)
to the client to ask for the approval decision. Upon getting a response,
it forwards the client's decision via `Op::ExecApproval`.

Admittedly, we should be doing the same thing for
`ApplyPatchApprovalRequest`, but this is our first time experimenting
with elicitations, so I'm inclined to defer wiring that code path up
until we feel good about how this one works.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1623).
* __->__ #1623
* #1622
* #1621
* #1620
2025-07-19 01:32:03 -04:00
Michael Bolin
11fd3123be chore: introduce OutgoingMessageSender (#1622)
Previous to this change, `MessageProcessor` had a
`tokio::sync::mpsc::Sender<JSONRPCMessage>` as an abstraction for server
code to send a message down to the MCP client. Because `Sender` is cheap
to `clone()`, it was straightforward to make it available to tasks
scheduled with `tokio::task::spawn()`.

This worked well when we were only sending notifications or responses
back down to the client, but we want to add support for sending
elicitations in #1623, which means that we need to be able to send
_requests_ to the client, and now we need a bit of centralization to
ensure all request ids are unique.

To that end, this PR introduces `OutgoingMessageSender`, which houses
the existing `Sender<OutgoingMessage>` as well as an `AtomicI64` to mint
out new, unique request ids. It has methods like `send_request()` and
`send_response()` so that callers do not have to deal with
`JSONRPCMessage` directly, as having to set the `jsonrpc` for each
message was a bit tedious (this cleans up `codex_tool_runner.rs` quite a
bit).

We do not have `OutgoingMessageSender` implement `Clone` because it is
important that the `AtomicI64` is shared across all users of
`OutgoingMessageSender`. As such, `Arc<OutgoingMessageSender>` must be
used instead, as it is frequently shared with new tokio tasks.

As part of this change, we update `message_processor.rs` to embrace
`await`, though we must be careful that no individual handler blocks the
main loop and prevents other messages from being handled.
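
The design described above can be sketched roughly as follows. This is a simplified illustration rather than the code in this PR: it uses `std::sync::mpsc` instead of `tokio::sync::mpsc` so that it needs no external crates, and the `OutgoingMessage` variants, method signatures, and the `outgoing_channel` helper are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;

/// Simplified stand-in for the real outgoing message type.
#[derive(Debug, PartialEq)]
enum OutgoingMessage {
    Request { id: i64, method: String },
    Notification { method: String },
}

/// Deliberately not `Clone`: every user must share the same id counter,
/// so the sender is handed around as `Arc<OutgoingMessageSender>`.
struct OutgoingMessageSender {
    next_request_id: AtomicI64,
    sender: Sender<OutgoingMessage>,
}

impl OutgoingMessageSender {
    fn new(sender: Sender<OutgoingMessage>) -> Self {
        Self { next_request_id: AtomicI64::new(0), sender }
    }

    /// Mints a fresh request id, queues the request, and returns the id.
    fn send_request(&self, method: &str) -> i64 {
        let id = self.next_request_id.fetch_add(1, Ordering::Relaxed);
        // A send error only means the receiver is gone; ignored in this sketch.
        let _ = self.sender.send(OutgoingMessage::Request { id, method: method.to_string() });
        id
    }

    fn send_notification(&self, method: &str) {
        let _ = self.sender.send(OutgoingMessage::Notification { method: method.to_string() });
    }
}

/// Hypothetical convenience constructor returning the shared sender and the
/// receiving end of the channel.
fn outgoing_channel() -> (Arc<OutgoingMessageSender>, Receiver<OutgoingMessage>) {
    let (tx, rx) = channel();
    (Arc::new(OutgoingMessageSender::new(tx)), rx)
}
```

Because callers share one instance through `Arc::clone`, two calls to `send_request()` never return the same id, whichever handle makes them.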

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1622).
* #1623
* __->__ #1622
* #1621
* #1620
2025-07-19 00:30:56 -04:00
Michael Bolin
e78ec00e73 chore: support MCP schema 2025-06-18 (#1621)
This updates the schema in `generate_mcp_types.py` from `2025-03-26` to
`2025-06-18`, regenerates `mcp-types/src/lib.rs`, and then updates all
the code that uses `mcp-types` to honor the changes.

Ran

```
npx @modelcontextprotocol/inspector just codex mcp
```

and verified that I was able to invoke the `codex` tool, as expected.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1621).
* #1623
* #1622
* __->__ #1621
2025-07-19 00:09:34 -04:00
Michael Bolin
a06d4f58e4 chore: clean up generate_mcp_types.py so codegen matches existing output (#1620) 2025-07-18 21:40:39 -04:00
aibrahim-oai
83eefb55fb Add session loading support to Codex (#1602)
## Summary
- extend rollout format to store all session data in JSON
- add resume/write helpers for rollouts
- track session state after each conversation
- support `LoadSession` op to resume a previous rollout
- allow starting Codex with an existing session via
`experimental_resume` config variable

Later, we will need a user-friendly way to explore the available
sessions.

## Testing
- `cargo test --no-run` *(fails: `cargo: command not found`)*

------
https://chatgpt.com/codex/tasks/task_i_68792a29dd5c832190bf6930d3466fba

This video is outdated. You should use `-c experimental_resume:<full
path>` instead of `--resume <full path>`.


https://github.com/user-attachments/assets/7a9975c7-aa04-4f4e-899a-9e87defd947a
2025-07-18 17:04:04 -07:00
aibrahim-oai
9846adeabf Refactor env settings into config (#1601)
## Summary
- add OpenAI retry and timeout fields to Config
- inject these settings in tests instead of mutating env vars
- plumb Config values through client and chat completions logic
- document new configuration options

## Testing
- `cargo test -p codex-core --no-run`

------
https://chatgpt.com/codex/tasks/task_i_68792c5b04cc832195c03050c8b6ea94

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-07-18 19:12:39 +00:00
aibrahim-oai
d5a2148deb Fix ctrl+c interrupt while streaming (#1617)
Interrupting while streaming is currently broken because we aren't
clearing the delta buffer.
2025-07-18 12:08:25 -07:00
Michael Bolin
cc874c9205 chore: use AtomicBool instead of Mutex<bool> (#1616) 2025-07-18 11:13:34 -07:00
pakrym-oai
6f2b01bb6b feat: ensure session ID header is sent in Response API request (#1614)
Include the current session id in Responses API requests.
2025-07-18 09:59:07 -07:00
aibrahim-oai
9cedeadf6a change the default debounce rate to 10ms (#1606)
Changed the default debounce rate to 10ms because typing was laggy.

Before:

https://github.com/user-attachments/assets/e5d15fcb-6a2b-4837-b2b4-c3dcb4cc3409

After:

https://github.com/user-attachments/assets/6f0005eb-fd49-4130-ba68-635ee0f2831f
2025-07-17 17:00:17 -07:00
pakrym-oai
327e2254f6 chore: rename toolchain file (#1604)
Rename toolchain file so older versions of cargo can pick it up.
2025-07-17 15:36:15 -07:00
Michael Bolin
e16657ca45 feat: add --json flag to codex exec (#1603)
This is designed to facilitate programmatic use of Codex in a more
lightweight way than using `codex mcp`.

Passing `--json` to `codex exec` will print each event as a line of JSON
to stdout. Note that it does not print the individual tokens as they are
streamed, only full messages, as this is aimed at programmatic use
rather than to power UI.

<img width="1348" height="1307" alt="image"
src="https://github.com/user-attachments/assets/fc7908de-b78d-46e4-a6ff-c85de28415c7"
/>

I changed the existing `EventProcessor` into a trait and moved the
implementation to `EventProcessorWithHumanOutput`. Then I introduced an
alternative implementation, `EventProcessorWithJsonOutput`. The `--json`
flag determines which implementation to use.
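
The trait-plus-two-implementations structure can be sketched as below. Only the names `EventProcessor`, `EventProcessorWithHumanOutput`, and `EventProcessorWithJsonOutput` come from this description. The `Event` struct, the `render` method, the output formats, and `make_event_processor` are assumptions for illustration, and real code would use a JSON library rather than hand-rolled escaping.

```rust
/// Hypothetical, simplified event; the real event type carries far more.
struct Event {
    id: String,
    message: String,
}

trait EventProcessor {
    /// Returns the line that would be written to stdout for `event`.
    fn render(&mut self, event: &Event) -> String;
}

struct EventProcessorWithHumanOutput;
struct EventProcessorWithJsonOutput;

impl EventProcessor for EventProcessorWithHumanOutput {
    fn render(&mut self, event: &Event) -> String {
        format!("[{}] {}", event.id, event.message)
    }
}

impl EventProcessor for EventProcessorWithJsonOutput {
    fn render(&mut self, event: &Event) -> String {
        format!(
            "{{\"id\":\"{}\",\"msg\":\"{}\"}}",
            json_escape(&event.id),
            json_escape(&event.message)
        )
    }
}

/// Minimal JSON string escaping, sufficient for this sketch only.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Picks the implementation based on whether `--json` was passed.
fn make_event_processor(json: bool) -> Box<dyn EventProcessor> {
    if json {
        Box::new(EventProcessorWithJsonOutput)
    } else {
        Box::new(EventProcessorWithHumanOutput)
    }
}
```

Keeping the choice behind a `Box<dyn EventProcessor>` means the event loop does not need to know which output mode is active.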
2025-07-17 15:10:15 -07:00
aibrahim-oai
bb30ab9e96 Implement redraw debounce (#1599)
## Summary
- debounce redraw events so repeated requests don't overwhelm the
terminal
- add `RequestRedraw` event and schedule redraws after 100ms
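
One way to picture the debounce described in this summary is the sketch below. It is an assumption-laden illustration, not the TUI's actual implementation: `RedrawDebouncer` and its methods are hypothetical names, and time is passed in explicitly so the behavior is easy to check.

```rust
use std::time::{Duration, Instant};

/// Coalesces a burst of redraw requests into a single redraw.
struct RedrawDebouncer {
    delay: Duration,
    deadline: Option<Instant>,
}

impl RedrawDebouncer {
    fn new(delay: Duration) -> Self {
        Self { delay, deadline: None }
    }

    /// Records a redraw request made at `now`. Only the first request in a
    /// burst schedules a deadline; later requests are absorbed by it.
    fn request_redraw(&mut self, now: Instant) {
        if self.deadline.is_none() {
            self.deadline = Some(now + self.delay);
        }
    }

    /// Returns true exactly once when a scheduled redraw is due at `now`.
    fn should_draw(&mut self, now: Instant) -> bool {
        if let Some(deadline) = self.deadline {
            if now >= deadline {
                self.deadline = None;
                return true;
            }
        }
        false
    }
}
```

With a delay of 100ms, two requests arriving 30ms apart result in one redraw, 100ms after the first request.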

## Testing
- `cargo clippy --tests`
- `cargo test` *(fails: Sandbox Denied errors in landlock tests)*

------
https://chatgpt.com/codex/tasks/task_i_68792a65b8b483218ec90a8f68746cd8

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-07-17 12:54:55 -07:00
pakrym-oai
6949329a7f chore: auto format code on save and add more details to AGENTS.md (#1582)
Adds a default vscode config with generally applicable settings.
Adds more entrypoints to justfile both for environment setup and to help
agents better verify changes.
2025-07-17 11:40:00 -07:00
pakrym-oai
b95a010e86 fix: trim MCP tool names to fit into tool name length limit (#1571)
Store fully qualified names along with tool entries so we don't have to re-parse them.

Fixes: https://github.com/openai/codex/issues/1289
2025-07-17 11:35:38 -07:00
aibrahim-oai
fcbcc40f51 Storing the sessions in a more organized way for easier look up. (#1596)
Sessions are now stored in `~/.codex/sessions/YYYY/MM/DD/<file>`.
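
A small sketch of how such a dated directory could be derived is shown below. The function name `session_dir` and its parameters are hypothetical, and the zero-padded month and day are an assumption based on the `YYYY/MM/DD` layout.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds `<codex_home>/sessions/YYYY/MM/DD`.
fn session_dir(codex_home: &Path, year: i32, month: u32, day: u32) -> PathBuf {
    codex_home
        .join("sessions")
        .join(format!("{:04}", year))
        .join(format!("{:02}", month))
        .join(format!("{:02}", day))
}
```

Zero-padded components keep directory listings in chronological order when sorted by name.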
2025-07-17 10:12:15 -07:00
aibrahim-oai
643ab1f582 Add streaming to exec and tui (#1594)
Added support for streaming in `tui`
Added support for streaming in `exec`


https://github.com/user-attachments/assets/4215892e-d940-452c-a1d0-416ed0cf14eb
2025-07-16 22:26:31 -07:00
Michael Bolin
d3dbc10479 fix: update bin/codex.js so it listens for exit on the child process (#1590)
When Codex CLI is installed via `npm`, we use a `.js` wrapper script to
launch the Rust binary.

- Previously, we were not listening for signals to ensure that killing
the Node.js process would also kill the underlying Rust process.
- We also did not have a proper `exit` handler in place on the child
process to ensure we exited from the Node.js process.

This PR fixes these things and hopefully addresses
https://github.com/openai/codex/issues/1570.

This also adds logic so that Windows falls back to the TypeScript CLI
again, which should address https://github.com/openai/codex/issues/1573.
2025-07-16 16:35:29 -07:00
Preet 🚀
0bc7ee9193 Added mcp-server name validation (#1591)
This PR implements server name validation for MCP (Model Context
Protocol) servers to ensure they conform to the required pattern
^[a-zA-Z0-9_-]+$. This addresses the TODO comment in
mcp_connection_manager.rs:82.

+ Added validation before spawning MCP client tasks
+ Invalid server names are added to errors map with descriptive messages
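
For illustration, a check against the `^[a-zA-Z0-9_-]+$` pattern can be written without a regex crate. The function names and error text below are hypothetical and are not taken from `mcp_connection_manager.rs`.

```rust
use std::collections::HashMap;

/// Returns Ok(()) if `name` is non-empty and matches ^[a-zA-Z0-9_-]+$.
fn validate_server_name(name: &str) -> Result<(), String> {
    let ok = !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
    if ok {
        Ok(())
    } else {
        Err(format!(
            "invalid server name '{}': must match pattern ^[a-zA-Z0-9_-]+$",
            name
        ))
    }
}

/// Splits names into those that are safe to spawn and a map of errors
/// keyed by the offending name.
fn partition_server_names<'a>(names: &[&'a str]) -> (Vec<&'a str>, HashMap<&'a str, String>) {
    let mut valid = Vec::new();
    let mut errors = HashMap::new();
    for name in names {
        match validate_server_name(name) {
            Ok(()) => valid.push(*name),
            Err(message) => {
                errors.insert(*name, message);
            }
        }
    }
    (valid, errors)
}
```

Running the validation before any client task is spawned means an invalid name costs only an entry in the errors map.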

I have read the CLA Document and I hereby sign the CLA

---------

Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-07-16 16:00:39 -07:00
aibrahim-oai
2bd3314886 support deltas in core (#1587)
- Added support for message and reasoning deltas
- Deferred adding support in the CLI and TUI to a later PR
- Commented out a failing test (wrong merge) that needs a fix in a
separate PR.

Side note: I think we need to disable merging when CI doesn't pass.
2025-07-16 15:11:18 -07:00
Michael Bolin
5b820c5ce7 feat: ctrl-d only exits when there is no user input (#1589)
While this does make it so that `ctrl-d` will not exit Codex when the
composer is not empty, `ctrl-d` will still exit Codex if it is in the
"working" state.

Fixes https://github.com/openai/codex/issues/1443.
2025-07-16 08:59:26 -07:00
aibrahim-oai
f14b5adabf Add SSE Response parser tests (#1541)
## Summary
- add `tokio-test` dev dependency
- implement response stream parsing unit tests

## Testing
- `cargo clippy -p codex-core --tests -- -D warnings`
- `cargo test -p codex-core -- --nocapture`

------
https://chatgpt.com/codex/tasks/task_i_687163f3b2208321a6ce2adbef3fbc06
2025-07-14 14:51:32 -07:00
Michael Bolin
9c0b413fd1 docs: clarify the build process for the npm release (#1568)
It appears that `0.5.0` was built with `stage_release.sh` instead of
`stage_rust_release.py`, so add docs to clarify this and recommend
running `--version` on the release candidate to verify the right thing
was built.
2025-07-14 09:41:11 -07:00
aibrahim-oai
3777e18243 Add CLI streaming integration tests (#1542)
## Summary
- add integration test for chat mode streaming via CLI using wiremock
- add integration test for Responses API streaming via fixture
- call `cargo run` to invoke the CLI during tests

## Testing
- `cargo test -p codex-core --test cli_stream -- --nocapture`
- `cargo clippy --all-targets --all-features -- -D warnings`


------
https://chatgpt.com/codex/tasks/task_i_68715980bbec8321999534fdd6a013c1
2025-07-12 18:05:58 -07:00
aibrahim-oai
0f8ac92390 Allow deadcode in test_support (#1555)
#1546 was pushed while failing the clippy integration tests. This
fixes it.
2025-07-12 17:20:35 -07:00
aibrahim-oai
c46bb67d77 Improve SSE tests (#1546)
## Summary
- support fixture-based SSE data in tests
- add helpers to load SSE JSON fixtures
- add table-driven SSE unit tests
- let integration tests use fixture loading
- fix clippy errors from format! calls

## Testing
- `cargo clippy --tests`
- `cargo test --workspace --exclude codex-linux-sandbox`


------
https://chatgpt.com/codex/tasks/task_i_68717468c3e48321b51c9ecac6ba0f09
2025-07-12 16:53:55 -07:00
Michael Bolin
94f5cad895 fix: when invoking Codex via MCP, use the request id as the Submission id (#1554)
Small quality-of-life improvement when using `codex mcp`.
2025-07-12 16:22:02 -07:00
aibrahim-oai
72504f1d9c Add paste summarization to Codex TUI (#1549)
## Summary
- introduce `Paste` event to avoid per-character paste handling
- collapse large pasted blocks to `[Pasted Content X lines]`
- store the real text so submission still includes it
- wire paste handling through `App`, `ChatWidget`, `BottomPane`, and
`ChatComposer`
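
The collapse-but-keep-the-text idea can be sketched as follows. The threshold constant, the `PastedText` struct, and `summarize_paste` are hypothetical; the summary does not say what counts as a "large" paste, so the value of 10 lines is purely an assumption.

```rust
/// Hypothetical threshold: pastes with more lines than this are collapsed.
const LARGE_PASTE_LINE_THRESHOLD: usize = 10;

/// What the composer shows versus what is actually submitted.
struct PastedText {
    display: String,
    full_text: String,
}

/// Collapses a large paste to a placeholder while retaining the real text.
fn summarize_paste(pasted: &str) -> PastedText {
    let line_count = pasted.lines().count();
    let display = if line_count > LARGE_PASTE_LINE_THRESHOLD {
        format!("[Pasted Content {} lines]", line_count)
    } else {
        pasted.to_string()
    };
    PastedText {
        display,
        full_text: pasted.to_string(),
    }
}
```

On submission, the placeholder in the composer would be swapped back for `full_text`, so the model still receives the full paste.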

## Testing
- `cargo test -p codex-tui`


------
https://chatgpt.com/codex/tasks/task_i_6871e24abf80832184d1f3ca0c61a5ee


https://github.com/user-attachments/assets/eda7412f-da30-4474-9f7c-96b49d48fbf8
2025-07-12 15:32:00 -07:00
dependabot[bot]
fa6d507c51 chore(deps-dev): bump @types/bun from 1.2.13 to 1.2.18 in /.github/actions/codex (#1509)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/bun&package-manager=bun&previous-version=1.2.13&new-version=1.2.18)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-12 10:29:37 -07:00
dependabot[bot]
a52a2fe7a9 chore(deps-dev): bump @types/node from 22.15.21 to 24.0.12 in /.github/actions/codex (#1507)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/node&package-manager=bun&previous-version=22.15.21&new-version=24.0.12)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-12 09:56:54 -07:00
Gabriel Peal
bfeb8c92a5 Add codex apply to apply a patch created from the Codex remote agent (#1528)
In order to do this, I created a new `chatgpt` crate where we can put
any code that interacts directly with ChatGPT as opposed to the OpenAI
API. I added a disclaimer to the README for it that it should primarily
be modified by OpenAI employees.


https://github.com/user-attachments/assets/bb978e33-d2c9-4d8e-af28-c8c25b1988e8
2025-07-11 13:30:11 -04:00
Michael Bolin
9e58076cf5 chore: read model field off of Config instead of maintaining the parallel field (#1525)
https://github.com/openai/codex/pull/1524 introduced the new `config`
field on `ModelClient`, so this does the post-PR cleanup to remove the
now-unnecessary `model` field.
2025-07-10 14:37:04 -07:00
Michael Bolin
8a424fcfa3 feat: add new config option: model_supports_reasoning_summaries (#1524)
As noted in the updated docs, this makes it so that you can set:

```toml
model_supports_reasoning_summaries = true
```

as a way of overriding the existing heuristic for when to set the
`reasoning` field on a sampling request:


341c091c5b/codex-rs/core/src/client_common.rs (L152-L166)
2025-07-10 14:30:33 -07:00
dependabot[bot]
341c091c5b chore(deps-dev): bump prettier from 3.5.3 to 3.6.2 in /.github/actions/codex (#1508)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=prettier&package-manager=bun&previous-version=3.5.3&new-version=3.6.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:13:59 -07:00
dependabot[bot]
6b1e4a6846 chore(deps): bump node from 22-slim to 24-slim in /codex-cli (#1505)
Bumps node from 22-slim to 24-slim.


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=node&package-manager=docker&previous-version=22-slim&new-version=24-slim)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 12:11:44 -07:00
dependabot[bot]
75fa65e054 chore(deps): bump toml from 0.9.0 to 0.9.1 in /codex-rs (#1514)
Bumps [toml](https://github.com/toml-rs/toml) from 0.9.0 to 0.9.1.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8c8ef44ea1"><code>8c8ef44</code></a>
chore: Release</li>
<li><a
href="b60ac5bfe9"><code>b60ac5b</code></a>
fix(toml): Correct minimal version for indexmap (<a
href="https://redirect.github.com/toml-rs/toml/issues/998">#998</a>)</li>
<li><a
href="966bd40511"><code>966bd40</code></a>
fix(toml): Correct minimal version for indexmap</li>
<li><a
href="2ed2af6519"><code>2ed2af6</code></a>
docs(readme): Mention additional crates</li>
<li>See full diff in <a
href="https://github.com/toml-rs/toml/compare/toml-v0.9.0...toml-v0.9.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=toml&package-manager=cargo&previous-version=0.9.0&new-version=0.9.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 11:34:37 -07:00
Michael Bolin
16eafd02ad fix: remove reference to /compact until it is implemented (#1503)
Do not mention `/compact` until
https://github.com/openai/codex/issues/1257 is addressed.
2025-07-10 11:23:35 -07:00
Michael Bolin
c8051b906f chore: drop codex-cli from dependabot (#1523)
We are not actively developing `codex-cli`, so I would rather leave the
existing `pnpm-lock.yaml` files as-is.
2025-07-10 11:23:24 -07:00
Rene Leonhardt
82b0cebe8b chore(rs): update dependencies (#1494)
### Chores
- Update cargo dependencies
- Remove unused cargo dependencies
- Fix clippy warnings
- Update Dockerfile (package.json requires node 22)
- Let Dependabot update bun, cargo, devcontainers, docker,
github-actions, npm (nix still not supported)

### TODO
- Upgrade dependencies with breaking changes

```shell
$ cargo update --verbose
   Unchanged crossterm v0.28.1 (available: v0.29.0)
   Unchanged schemars v0.8.22 (available: v1.0.4)
```
2025-07-10 11:08:16 -07:00
pchuri
3a23a86f4b Add Android platform support for Codex CLI (#1488)
## Summary
Add Android platform support to Codex CLI

## What?
- Added `android` to the list of supported platforms in
`codex-cli/bin/codex.js`
- Treats Android as Linux for binary compatibility

## Why?
- Fixes "Unsupported platform: android (arm64)" error on Termux
- Enables Codex CLI usage on Android devices via Termux
- Improves platform compatibility without affecting other platforms

## How?
- Modified the platform detection switch statement to include `case
"android":`
- Android falls through to the same logic as Linux, using appropriate
ARM64 binaries
- Minimal change with no breaking effects on existing functionality

## Testing
- Tested on Android/Termux environment
- Verified the fix resolves the platform detection error
- Confirmed no impact on other platforms

## Related Issues
Fixes the "Unsupported platform: android (arm64)" error reported by
Termux users
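
A rough Rust rendering of the fall-through described above, for illustration only. The actual change lives in the JavaScript launcher (`codex-cli/bin/codex.js`); the function name `target_triple` and the triple strings below are assumptions, not the launcher's real table.

```rust
/// Maps a Node-style platform/arch pair to a binary target triple.
/// The triple names are illustrative assumptions, not the launcher's table.
fn target_triple(platform: &str, arch: &str) -> Option<&'static str> {
    match (platform, arch) {
        // Android is treated as Linux, mirroring the `case "android":` fall-through.
        ("linux" | "android", "x64") => Some("x86_64-unknown-linux-musl"),
        ("linux" | "android", "arm64") => Some("aarch64-unknown-linux-musl"),
        ("darwin", "x64") => Some("x86_64-apple-darwin"),
        ("darwin", "arm64") => Some("aarch64-apple-darwin"),
        // Anything not listed in this sketch is reported as unsupported.
        _ => None,
    }
}
```

With this shape, `target_triple("android", "arm64")` resolves to the same value as `target_triple("linux", "arm64")`, which is the whole point of the change.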
2025-07-09 22:06:55 -07:00
Michael Bolin
268267b59e fix: the completion subcommand should assume the CLI is named codex, not codex-cli (#1496)
Current 0.4.0 release:

```
~/code/codex2/codex-rs$ codex completion | head
_codex-cli() {
    local i cur prev opts cmd
    COMPREPLY=()
    if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
        cur="$2"
    else
        cur="${COMP_WORDS[COMP_CWORD]}"
    fi
    prev="$3"
    cmd=""
```

with this change:

```
~/code/codex2/codex-rs$ just codex completion | head
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.82s
     Running `target/debug/codex completion`
_codex() {
    local i cur prev opts cmd
    COMPREPLY=()
    if [[ "${BASH_VERSINFO[0]}" -ge 4 ]]; then
        cur="$2"
    else
        cur="${COMP_WORDS[COMP_CWORD]}"
    fi
    prev="$3"
    cmd=""
```
2025-07-09 14:08:35 -07:00
Michael Bolin
4a15ebc1ca feat: add codex completion to generate shell completions (#1491)
Once this lands, we can update our brew formula to use
`generate_completions_from_executable()` like so:


905238ff7f/Formula/h/hgrep.rb (L21-L25)
2025-07-08 21:43:27 -07:00
Michael Bolin
8d35ad0ef7 feat: honor OPENAI_BASE_URL for the built-in openai provider (#1487)
Some users have proxies or other setups where they are ultimately
hitting OpenAI endpoints, but need a custom `base_url` rather than the
default value of `"https://api.openai.com/v1"`. This PR makes it
possible to override the `base_url` for the `openai` provider via the
`OPENAI_BASE_URL` environment variable.
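
The override can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names are hypothetical, and treating a blank value as "unset" is an assumption.

```rust
const DEFAULT_OPENAI_BASE_URL: &str = "https://api.openai.com/v1";

/// Picks the base URL: a non-blank override wins, otherwise the default.
/// Treating a blank value as unset is an assumption of this sketch.
fn resolve_base_url(env_override: Option<String>) -> String {
    env_override
        .map(|v| v.trim().to_string())
        .filter(|v| !v.is_empty())
        .unwrap_or_else(|| DEFAULT_OPENAI_BASE_URL.to_string())
}

/// Reads the override from the process environment.
fn base_url_from_env() -> String {
    resolve_base_url(std::env::var("OPENAI_BASE_URL").ok())
}
```

Keeping the environment lookup in a separate function makes the precedence rule testable without mutating the process environment.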
2025-07-08 12:39:52 -07:00
Michael Bolin
cc58f1086d docs: document support for model_reasoning_effort and model_reasoning_summary in profiles (#1486)
Documents the new functionality added in
https://github.com/openai/codex/pull/1484.
2025-07-08 12:26:05 -07:00
Yusuf Eren
e444a50cf0 feat: add reasoning fields to profile settings (#1484) 2025-07-08 12:05:22 -07:00
Michael Bolin
f80fc86f18 chore: default to the latest version of the Codex CLI in the GitHub Action (#1485)
Now we no longer have to update the default value of `codex_release_tag`
in the GitHub Action going forward.
2025-07-08 12:00:13 -07:00
Michael Bolin
0b9cb2b9e7 chore: create a release script for the Rust CLI (#1479)
This is a stopgap solution before migrating the build for the npm
release to GitHub Actions (which is ultimately what should be done to
ensure hermetic builds).

The idea is that instead of continuing to create PRs like
https://github.com/openai/codex/pull/1472 where I have to check in a
change to the `WORKFLOW_URL`, this script uses `gh run list` to get the
`WORKFLOW_URL` dynamically and then threads the value through to
`install_native_deps.sh`.

To create the 0.3.0 release on npm, I ran:

```shell
./codex-cli/scripts/stage_rust_release.py --release-version 0.3.0
```

and then did `npm publish --dry-run` followed by `npm publish` in the
temp directory created by `stage_rust_release.py`.
2025-07-07 23:51:34 -07:00
Michael Bolin
e0c08cea4f feat: add support for --sandbox flag (#1476)
At a high level, we try to design `config.toml` so that you don't have
to "comment out a lot of stuff" when testing different options.

Previously, defining a sandbox policy was somewhat at odds with this
principle because you would define the policy as attributes of
`[sandbox]` like so:

```toml
[sandbox]
mode = "workspace-write"
writable_roots = [ "/tmp" ]
```

but if you wanted to temporarily change to a read-only sandbox, you
might feel compelled to modify your file to be:

```toml
[sandbox]
mode = "read-only"
# mode = "workspace-write"
# writable_roots = [ "/tmp" ]
```

Technically, commenting out `writable_roots` would not be strictly
necessary, as `mode = "read-only"` would ignore `writable_roots`, but
it's still a reasonable thing to do to keep things tidy.

Currently, the various values for `mode` do not support that many
attributes, so this is not that hard to maintain, but one could imagine
this becoming more complex in the future.

In this PR, we change Codex CLI so that it no longer recognizes
`[sandbox]`. Instead, it introduces a top-level option, `sandbox_mode`,
and `[sandbox_workspace_write]` is used to further configure the sandbox
when `sandbox_mode = "workspace-write"` is used:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
writable_roots = [ "/tmp" ]
```

This feels a bit more future-proof in that it is less tedious to
configure different sandboxes:

```toml
sandbox_mode = "workspace-write"

[sandbox_read_only]
# read-only options here...

[sandbox_workspace_write]
writable_roots = [ "/tmp" ]

[sandbox_danger_full_access]
# danger-full-access options here...
```

In this scheme, you never need to comment out the configuration for an
individual sandbox type: you only need to redefine `sandbox_mode`.

Relatedly, prior to this change, a user had to do `-c
sandbox.mode=read-only` to change the mode on the command line. With
this change, things are arguably a bit cleaner because the equivalent
option is `-c sandbox_mode=read-only` (and now `-c
sandbox_workspace_write=...` can be set separately).

Though more importantly, we introduce the `-s/--sandbox` option to the
CLI, which maps directly to `sandbox_mode` in `config.toml`, making
config override behavior easier to reason about. Moreover, as you can
see in the updates to the various Markdown files, it is much easier to
explain how to configure sandboxing when things like `--sandbox
read-only` can be used as an example.

Relatedly, this cleanup also made it straightforward to add support for
a `sandbox` option for Codex when used as an MCP server (see the changes
to `mcp-server/src/codex_tool_config.rs`).

Fixes https://github.com/openai/codex/issues/1248.
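
The override order described above (the `-s/--sandbox` value beats `sandbox_mode` from `config.toml`) might look something like this. It is a sketch under assumptions: the type and function names are hypothetical, and defaulting to read-only when neither source sets a mode is assumed here rather than stated by this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

impl SandboxMode {
    /// Parses the kebab-case spelling used in `config.toml` and on the CLI.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "read-only" => Some(Self::ReadOnly),
            "workspace-write" => Some(Self::WorkspaceWrite),
            "danger-full-access" => Some(Self::DangerFullAccess),
            _ => None,
        }
    }
}

/// A CLI value, when present, overrides `sandbox_mode` from the config file.
/// Returns `None` for an unrecognized spelling. Falling back to read-only
/// when neither source is set is an assumption of this sketch.
fn effective_sandbox_mode(cli: Option<&str>, config: Option<&str>) -> Option<SandboxMode> {
    match cli.or(config) {
        Some(raw) => SandboxMode::parse(raw),
        None => Some(SandboxMode::ReadOnly),
    }
}
```

Because only one top-level key selects the mode, the per-mode tables such as `[sandbox_workspace_write]` never need to be commented out when switching.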
2025-07-07 22:31:30 -07:00
Michael Bolin
0a44c42533 docs: update README to include npm install again (#1475)
v0.2.0 of https://www.npmjs.com/package/@openai/codex now runs the Rust
CLI, so it makes sense to bring back the instructions to use `npm i -g
@openai/codex`.

In most places, I list `npm install` before `brew install` because I
believe `npm` is more readily available, though in the more detailed
part of the documentation, I note that `brew install` downloads fewer
bytes and, in that sense, is preferred.
2025-07-07 17:44:26 -07:00
Michael Bolin
a9bed68947 chore: normalize repository.url in package.json (#1474)
I got this as a warning when doing `npm publish --dry-run`, so I ran
`npm pkg fix` to create this PR, as instructed.
2025-07-07 16:33:06 -07:00
ryozi
fd67a0086c Fix Unicode handling in chat_composer "@" token detection (#1467)
## Issues Fixed

- **Primary Issue (#1450)**: Unicode cursor positioning was incorrect
due to mixing character positions with byte positions
- **Additional Issue**: Full-width spaces (CJK whitespace like " ")
weren't properly handled as token boundaries
- ref:
https://doc.rust-lang.org/std/primitive.char.html#method.is_whitespace
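
To illustrate both fixes, here is a minimal sketch (not the actual `chat_composer` code; both function names are hypothetical) that converts a character-based cursor position to a byte offset before slicing, and uses `char::is_whitespace` so that a full-width space ends a token:

```rust
/// Converts a cursor position counted in characters into a byte offset.
fn char_to_byte(text: &str, char_pos: usize) -> usize {
    text.char_indices()
        .nth(char_pos)
        .map(|(byte_idx, _)| byte_idx)
        .unwrap_or(text.len())
}

/// Returns the `@`-prefixed token under the cursor (without the `@`), if any.
fn at_token_at_cursor(text: &str, cursor_chars: usize) -> Option<&str> {
    let cursor = char_to_byte(text, cursor_chars);
    // Token start: just after the last whitespace char before the cursor.
    let start = text[..cursor]
        .char_indices()
        .rev()
        .find(|(_, c)| c.is_whitespace())
        .map(|(i, c)| i + c.len_utf8())
        .unwrap_or(0);
    // Token end: the first whitespace char at or after the cursor.
    let end = text[cursor..]
        .char_indices()
        .find(|(_, c)| c.is_whitespace())
        .map(|(i, _)| cursor + i)
        .unwrap_or(text.len());
    text[start..end].strip_prefix('@')
}
```

All slicing happens on byte offsets obtained from `char_indices()`, so the slices always fall on character boundaries even when the text mixes ASCII and multi-byte characters.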

---------

Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-07-07 13:43:31 -07:00
Michael Bolin
c221eab0b5 feat: support custom HTTP headers for model providers (#1473)
This adds support for two new model provider config options:

- `http_headers` for hardcoded (key, value) pairs
- `env_http_headers` for headers whose values should be read from
environment variables

This also updates the built-in `openai` provider to use this feature to
set the following headers:

- `originator` => `codex_cli_rs`
- `version` => [CLI version]
- `OpenAI-Organization` => `OPENAI_ORGANIZATION` env var
- `OpenAI-Project` => `OPENAI_PROJECT` env var

for consistency with the TypeScript implementation:


bd5a9e8ba9/codex-cli/src/utils/agent/agent-loop.ts (L321-L329)

While here, this also consolidates some logic that was duplicated across
`client.rs` and `chat_completions.rs` by introducing
`ModelProviderInfo.create_request_builder()`.

Resolves https://github.com/openai/codex/discussions/1152
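
A minimal sketch of how the two options could be combined into one header list. The function name and signature are hypothetical (this is not `ModelProviderInfo.create_request_builder()`), and skipping unset or blank environment variables is an assumption:

```rust
use std::collections::BTreeMap;

/// Combines literal headers with headers whose values come from environment
/// variables. `lookup` stands in for `std::env::var(name).ok()` so the logic
/// can be exercised without touching the real environment.
fn resolve_headers(
    http_headers: &BTreeMap<String, String>,
    env_http_headers: &BTreeMap<String, String>,
    lookup: impl Fn(&str) -> Option<String>,
) -> Vec<(String, String)> {
    // Literal (key, value) pairs are copied through unchanged.
    let mut out: Vec<(String, String)> = http_headers
        .iter()
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();
    for (header, env_var) in env_http_headers {
        // Skipping unset or blank variables is an assumption of this sketch.
        if let Some(value) = lookup(env_var.as_str()) {
            if !value.trim().is_empty() {
                out.push((header.clone(), value));
            }
        }
    }
    out
}
```

In real use, `lookup` would be `|name| std::env::var(name).ok()`, so a header such as `OpenAI-Project` is only sent when `OPENAI_PROJECT` is actually set.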
2025-07-07 13:09:16 -07:00
Michael Bolin
bd5a9e8ba9 chore: update release scripts for the TypeScript CLI (#1472)
This introduces two changes to make a quick fix so we can deploy the
Rust CLI for `0.2.0` of `@openai/codex` on npm:

- Updates `WORKFLOW_URL` to point to
https://github.com/openai/codex/actions/runs/15981617627, which is the
GitHub workflow run used to create the binaries for the `0.2.0` release
we published to Homebrew.
- Adds a `--version` option to `stage_release.sh` to specify what the
`version` field in the `package.json` will be.

Locally, I ran the following:

```
./codex-cli/scripts/stage_release.sh --native --version 0.2.0
```

Previously, we only used the `--native` flag to publish to the `native`
tag of `@openai/codex` (e.g., `npm publish --tag native`), but we should
just publish this as the default tag for `0.2.0` to be consistent with
what is in Homebrew.

We can still publish one "final" version of the TypeScript CLI as 0.1.x
later.

Under the hood, this release will still contain `dist/cli.js`,
`bin/codex-linux-sandbox-x64`, and `bin/codex-x86_64-apple-darwin`,
which are not strictly necessary, but we'll fix that in `0.3.0`.
2025-07-07 09:43:03 -07:00
Michael Bolin
abcca30d93 docs: update documentation to reflect Rust CLI release (#1440)
As promised on https://github.com/openai/codex/discussions/1405, we are
making the first official release of the Rust CLI as v0.2.0. As part of
this move, we are making it available in Homebrew:

https://github.com/Homebrew/homebrew-core/pull/228615

We also plan to keep making the CLI available on npm, though brew is a
bit nicer in that `brew install` will download only the binary for your
platform, whereas an npm module is expected to contain the binaries for
_all_ supported platforms, so it is a bit more heavyweight.

A big part of this change is updating the root `README.md` to document
the behavior of the Rust CLI, which differs in a number of ways from the
TypeScript CLI. The existing `README.md` is moved to
`codex-cli/README.md` as part of this PR, as it is still applicable to
that folder.

As this is still early days for the Rust CLI, I encourage folks to
provide feedback on the command line flags and configuration options.
2025-07-01 15:00:31 -07:00
Michael Bolin
4cb3c76798 fix: softprops/action-gh-release@v2 should use existing tag instead of creating a new tag (#1436)
https://github.com/Homebrew/homebrew-core/pull/228521 details the issues
I was having with the **Source code (tar.gz)** artifact for our GitHub
releases not being quite right. I landed these PRs as stabs in the dark
to fix this:

- https://github.com/openai/codex/pull/1423
- https://github.com/openai/codex/pull/1430

Based on the insights from
https://github.com/Homebrew/homebrew-core/pull/228521, I think those
were wrong and the real problem was this:


6dad5c3b17/.github/workflows/rust-release.yml (L162)

That is, I was manufacturing a new tag name on the fly instead of using
the existing one.

This PR reverts #1423 and #1430 and hopefully fixes how `tag_name` is
set for the `softprops/action-gh-release@v2` step so the **Source code
(tar.gz)** includes the correct files. Assuming this works, this should
make the Homebrew formula straightforward.
2025-06-30 12:10:48 -07:00
Michael Bolin
6dad5c3b17 feat: add query_params option to ModelProviderInfo to support Azure (#1435)
As discovered in https://github.com/openai/codex/issues/1365, the Azure
provider needs to be able to specify `api-version` as a query param, so
this PR introduces a generic `query_params` option to the
`model_providers` config so that an Azure provider can be defined as
follows:

```toml
[model_providers.azure]
name = "Azure"
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
query_params = { api-version = "2025-04-01-preview" }
```

This PR also updates the docs with this example.

While here, we also update `wire_api` to default to `"chat"`, as that is
likely the common case for someone defining an external provider.

Fixes https://github.com/openai/codex/issues/1365.
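
For illustration, appending configured `query_params` to a request URL could be sketched like this. The function name is hypothetical, the `/responses` path in the usage note is only an example, and the sketch assumes keys and values are already URL-safe (it does no percent-encoding):

```rust
/// Appends query parameters to a request URL.
/// Keys and values are assumed to be URL-safe; no percent-encoding is done.
fn with_query_params(url: &str, params: &[(&str, &str)]) -> String {
    let mut out = String::from(url);
    for (i, (key, value)) in params.iter().enumerate() {
        // The first pair uses '?' unless the URL already carries a query string.
        let sep = if i == 0 && !url.contains('?') { '?' } else { '&' };
        out.push(sep);
        out.push_str(key);
        out.push('=');
        out.push_str(value);
    }
    out
}
```

Given the Azure config above, a call such as `with_query_params("https://YOUR_PROJECT_NAME.openai.azure.com/openai/responses", &[("api-version", "2025-04-01-preview")])` yields a URL ending in `?api-version=2025-04-01-preview`.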
2025-06-30 11:39:54 -07:00
Michael Bolin
cd2d84d496 fix: need to check out the branch, not the tag (#1430)
This should have been done in https://github.com/openai/codex/pull/1423.
2025-06-29 10:18:50 -07:00
Michael Bolin
688100f7f4 chore: fix Rust release process so generated .tar.gz source works with Homebrew (#1423)
Looking at existing releases such as
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4,
the `.tar.gz` for the source code still seems to have `0.0.0` as the
`version` in `codex-rs/Cargo.toml` instead of what the tag seems to say
it should have:


b289c92070/codex-rs/Cargo.toml (L21)

ChatGPT claims:

> When GitHub generates the Source code (tar.gz) archive for a tag:
>
> - It uses the commit the tag points to.
> - But in some cases (e.g., shallow clones, GitHub CI, or local tools
>   that only clone the default branch), that commit may not be included,
>   and you might get an outdated view or nothing at all depending on how
>   it’s fetched.

Trying this recommended fix.
2025-06-28 19:46:44 -07:00
Michael Bolin
f30bf4bbcf fix: support pre-release identifiers in tags (#1422)
Had to update the regex in the GitHub workflow to allow suffixes like
`-alpha.4`.

Successfully ran:

```
./scripts/create_github_release.sh 0.1.0-alpha.4
```

to create
https://github.com/openai/codex/releases/tag/codex-rs-b289c9207090b2e27494545d7b5404e063bd86f3-1-rust-v0.1.0-alpha.4

and verified that when I run `codex --version`, it prints `codex-cli
0.1.0-alpha.4`.
2025-06-28 16:05:53 -07:00
Michael Bolin
1b7c8d2569 fix: build with codegen-units = 1 for profile.release (#1421)
Great suggestion from @zamazan4ik on
https://github.com/openai/codex/issues/1411.
2025-06-28 15:24:48 -07:00
Michael Bolin
4a341efe92 feat: highlight matching characters in fuzzy file search (#1420)
Using the new file-search API introduced in
https://github.com/openai/codex/pull/1419, matching characters are now
shown in bold in the TUI:


https://github.com/user-attachments/assets/8bbcc6c6-75a3-493f-8ea4-b2a063e09b3a

Fixes https://github.com/openai/codex/issues/1261
2025-06-28 15:04:23 -07:00
Michael Bolin
e2efe8da9c feat: introduce --compute-indices flag to codex-file-search (#1419)
This is a small quality-of-life feature, the addition of
`--compute-indices` to the CLI, which, if enabled, will compute and set
the `indices` field for each `FileMatch` returned by `run()`. Note we
only bother to compute `indices` once we have the top N results because
there could be a lot of intermediate "top N" results during the search
that are ultimately discarded.

When set, the indices are included in the JSON output when `--json` is
specified and the matching indices are displayed in bold when `--json`
is not specified.
2025-06-28 14:39:29 -07:00
Michael Bolin
5a0f236ca4 feat: add support for @ to do file search (#1401)
Introduces support for `@` to trigger a fuzzy-filename search in the
composer. Under the hood, this leverages
https://crates.io/crates/nucleo-matcher to do the fuzzy matching and
https://crates.io/crates/ignore to build up the list of file candidates
(so that it respects `.gitignore`).

For simplicity (at least for now), we do not do any caching between
searches like VS Code does for its file search:


1d89ed699b/src/vs/workbench/services/search/node/rawSearchService.ts (L212-L218)

Because we do not do any caching, I saw queries take up to three seconds
on large repositories with hundreds of thousands of files. To that end,
we do not perform searches synchronously on each keystroke, but instead
dispatch an event to do the search on a background thread that
asynchronously reports back to the UI when the results are available.
This is largely handled by the `FileSearchManager` introduced in this
PR, which also has logic for debouncing requests so there is at most one
search in flight at a time.

While we could potentially polish and tune this feature further, it may
already be overengineered for how it will be used, in practice, so we
can improve things going forward if it turns out that this is not "good
enough" in the wild.

Note this feature does not work like `@` in the TypeScript CLI, which
was more like directory-based tab completion. In the Rust CLI, `@`
triggers a full-repo fuzzy-filename search.

Fixes https://github.com/openai/codex/issues/1261.
2025-06-28 13:47:42 -07:00
Michael Bolin
ff8ae1ffa1 feat: make file search cancellable (#1414)
Update `run()` to take `cancel_flag: Arc<AtomicBool>` that the worker
threads will periodically check to see if it is `true`, exiting early
(and returning empty results) if so.
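
The cancellation pattern can be sketched as below. This is a simplified stand-in, not the real `run()`: it uses a single worker thread and plain substring matching instead of fuzzy matching, and the function name and polling interval are assumptions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Scans `candidates` for names containing `needle` on a worker thread,
/// returning an empty list if `cancel_flag` is observed as `true`.
fn search(candidates: Vec<String>, needle: String, cancel_flag: Arc<AtomicBool>) -> Vec<String> {
    let worker = thread::spawn(move || {
        let mut matches = Vec::new();
        for (i, name) in candidates.into_iter().enumerate() {
            // Poll the flag periodically rather than on every item.
            if i % 64 == 0 && cancel_flag.load(Ordering::Relaxed) {
                return Vec::new();
            }
            if name.contains(&needle) {
                matches.push(name);
            }
        }
        matches
    });
    // A panicked worker is treated like a cancelled search: no results.
    worker.join().unwrap_or_default()
}
```

The caller keeps its own clone of the `Arc<AtomicBool>` and calls `store(true, Ordering::Relaxed)` on it to abandon a search whose results are no longer needed.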
2025-06-27 20:01:45 -07:00
Michael Bolin
b3ad764532 chore: change arg from PathBuf to &Path (#1409)
Caller no longer needs to clone a `PathBuf`: can just pass `&Path`.
2025-06-27 16:24:41 -07:00
Michael Bolin
a331a67b3e chore: change built_in_model_providers so "openai" is the only "bundled" provider (#1407)
As we are [close to releasing the Rust CLI
beta](https://github.com/openai/codex/discussions/1405), for the moment,
let's take a more neutral stance on what it takes to be a "built-in"
provider.

* For example, there seems to be a discrepancy around what the "right"
configuration for Gemini is: https://github.com/openai/codex/pull/881
* And while the current list of "built-in" providers are all arguably
"well-known" names, this raises a question of what to do about
potentially less familiar providers, such as
https://github.com/openai/codex/pull/1142. Do we just accept every pull
request like this, or are there criteria a provider has to meet to
"qualify" to be bundled with Codex CLI?

I think that if we can establish clear ground rules for being a built-in
provider, then we can bring this back. But until then, I would rather
take a minimalist approach because if we decided to reverse our position
later, it would break folks who were depending on the presence of the
built-in providers.
2025-06-27 14:49:55 -07:00
Gabriel Peal
2e293ce903 Handle Ctrl+C quit when idle (#1402)
## Summary
- show `Ctrl+C to quit` hint when pressing Ctrl+C with no active task
- exit with Ctrl+C if the hint is already visible
- clear the hint when tasks begin or other keys are pressed


https://github.com/user-attachments/assets/931e2d7c-1c80-4b45-9908-d119f74df23c



------
https://chatgpt.com/s/cd_685ec8875a308191beaa95886dc1379e

Fixes #1245
2025-06-27 13:37:11 -04:00
Michael Bolin
64feeb3803 fix: add tiebreaker logic for paths when scores are equal (#1400)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1400).
* #1401
* __->__ #1400
2025-06-26 23:05:10 -07:00
Michael Bolin
fa0e17f83a feat: add support for /diff command (#1389)
Adds support for a `/diff` command comparable to the one available in
the TypeScript CLI.

<img width="1103" alt="Screenshot 2025-06-26 at 12 31 33 PM"
src="https://github.com/user-attachments/assets/5dc646ca-301f-41ff-92a7-595c68db64b6"
/>

While here, changed the `SlashCommand` enum so the declared variant
order is the order the commands appear in the popup menu. This way,
`/toggle-mouse-mode` is listed last, as it is the least likely to be
used.

Fixes https://github.com/openai/codex/issues/1253.
2025-06-26 13:03:31 -07:00
Gabriel Peal
a339a7bcce [Rust] Allow resuming a session that was killed with ctrl + c (#1387)
Previously, if you ctrl+c'd a conversation, all subsequent turns would
400 because the Responses API never got a response for one of its call
ids. This ensures that if we aren't sending a call id by hand, we
generate a synthetic aborted call.

Fixes #1244 


https://github.com/user-attachments/assets/5126354f-b970-45f5-8c65-f811bca8294a
2025-06-26 14:40:42 -04:00
Michael Bolin
fcfe43c7df feat: show number of tokens remaining in UI (#1388)
When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.

We then introduce a new event type to `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.

Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer:

![Screenshot 2025-06-25 at 11 20
55 PM](https://github.com/user-attachments/assets/6fd6982f-7247-4f14-84b2-2e600cb1fd49)

We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.

Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:


296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)

Fixes https://github.com/openai/codex/issues/1242
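
The "percentage of context window remaining" arithmetic can be sketched as follows. The function name is hypothetical, and exactly which token counts the TUI adds to its running sum is not specified here, so `tokens_used` is simply taken as given:

```rust
/// Percentage of the context window still available, rounded down.
/// Returns `None` when the model's window size is unknown (zero).
fn percent_context_remaining(tokens_used: u64, context_window: u64) -> Option<u8> {
    if context_window == 0 {
        return None;
    }
    // Saturate so that overshooting the window reports 0% rather than wrapping.
    let remaining = context_window.saturating_sub(tokens_used);
    Some((remaining * 100 / context_window) as u8)
}
```

For example, 50,000 tokens used against a 200,000-token window leaves 75%.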
2025-06-25 23:31:11 -07:00
Michael Bolin
296996d74e feat: standalone file search CLI (#1386)
Standalone fuzzy filename search library that should be helpful in
addressing https://github.com/openai/codex/issues/1261.
2025-06-25 13:29:03 -07:00
Michael Bolin
50924101d2 feat: add --dangerously-bypass-approvals-and-sandbox (#1384)
This PR reworks `assess_command_safety()` so that the combination of
`AskForApproval::Never` and `SandboxPolicy::DangerFullAccess` ensures
that commands are run without _any_ sandbox and the user should never be
prompted. In turn, it adds support for a new
`--dangerously-bypass-approvals-and-sandbox` flag (that cannot be used
with `--approval-policy` or `--full-auto`) that sets both of those
options.

Fixes https://github.com/openai/codex/issues/1254
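
A simplified sketch of how the new flag might resolve to the two settings and reject the conflicting flags. The enums are reduced to bare variants for illustration (the real types may carry more data), and `apply_bypass_flag` is a hypothetical helper, not `assess_command_safety()`:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AskForApproval {
    UnlessTrusted,
    Never,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxPolicy {
    ReadOnly,
    DangerFullAccess,
}

/// Resolves `--dangerously-bypass-approvals-and-sandbox`.
/// Returns `Ok(None)` when the flag is absent, the forced pair of settings
/// when it is present, or an error when combined with a conflicting flag.
fn apply_bypass_flag(
    bypass: bool,
    approval_policy_given: bool,
    full_auto: bool,
) -> Result<Option<(AskForApproval, SandboxPolicy)>, String> {
    if !bypass {
        return Ok(None);
    }
    if approval_policy_given || full_auto {
        return Err(
            "--dangerously-bypass-approvals-and-sandbox cannot be used with \
             --approval-policy or --full-auto"
                .to_string(),
        );
    }
    Ok(Some((AskForApproval::Never, SandboxPolicy::DangerFullAccess)))
}
```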
2025-06-25 12:36:10 -07:00
Michael Bolin
72082164c1 chore: rename AskForApproval::UnlessAllowListed to AskForApproval::UnlessTrusted (#1385)
We could just rename to `Untrusted` instead of `UnlessTrusted`, but I
think `AskForApproval::UnlessTrusted` reads a bit better.
2025-06-25 12:26:13 -07:00
Michael Bolin
e09691337d chore: improve docstring for --full-auto (#1379)
References `-c sandbox.mode=workspace-write` in the docstring so users
can read the config docs for `sandbox` for more information.
2025-06-25 09:13:36 -07:00
Michael Bolin
86d5a9d80d chore: rename unless-allow-listed to untrusted (#1378)
For the `approval_policy` config option, renames `unless-allow-listed`
to `untrusted`. In general, when it comes to exec'ing commands, I think
"trusted" is a more accurate term than "safe."

Also drops the `AskForApproval::AutoEdit` variant, as we were not really
making use of it, anyway.

Fixes https://github.com/openai/codex/issues/1250.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1378).
* #1379
* __->__ #1378
2025-06-24 22:19:21 -07:00
Michael Bolin
531ce7626f fix: pretty-print the sandbox config in the TUI/exec modes (#1376)
Now that https://github.com/openai/codex/pull/1373 simplified the
sandbox config, we can print something much simpler in the TUI (and in
`codex exec`) to summarize the sandbox config.

Before:

![Screenshot 2025-06-24 at 5 45
52 PM](https://github.com/user-attachments/assets/b7633efb-a619-43e1-9abe-7bb0be2d0ec0)

With this change:

![Screenshot 2025-06-24 at 5 46
44 PM](https://github.com/user-attachments/assets/8d099bdd-a429-4796-a08d-70931d984e4f)

For reference, my `config.toml` contains:

```
[sandbox]
mode = "workspace-write"
writable_roots = ["/tmp", "/Users/mbolin/.pyenv/shims"]
```

Fixes https://github.com/openai/codex/issues/1248
2025-06-24 17:48:51 -07:00
Michael Bolin
63363a54e5 chore: install just in the devcontainer for Linux development (#1375)
Apparently `just` was added to `apt` in Ubuntu 24, so this required
updating the Ubuntu version in the `Dockerfile` to make it so we could
simply `apt install just`.

That upgrade caused a conflict with the custom `dev` user we were
using, but the end result seems simpler since now we just use the
default `ubuntu` user provided by Ubuntu 24.
2025-06-24 17:20:53 -07:00
Michael Bolin
6d65010aad chore: install clippy and rustfmt in the devcontainer for Linux development (#1374)
I discovered it was difficult to do development in the devcontainer
without these tools available.
2025-06-24 17:05:36 -07:00
Michael Bolin
0776d78357 feat: redesign sandbox config (#1373)
This is a major redesign of how sandbox configuration works and aims to
fix https://github.com/openai/codex/issues/1248. Specifically, it
replaces `sandbox_permissions` in `config.toml` (and the
`-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
three variants:

```toml
# Safest option: the full disk is readable, but writes and network access are disallowed.
[sandbox]
mode = "read-only"

# The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
# writable_roots can be used to specify additional writable folders.
[sandbox]
mode = "workspace-write"
writable_roots = []  # Optional, defaults to the empty list.
network_access = false  # Optional, defaults to false.

# Disable sandboxing: use at your own risk!!!
[sandbox]
mode = "danger-full-access"
```

This should make sandboxing easier to reason about. While we have
dropped support for `-s`, the way it works now is:

- no flags => `read-only`
- `--full-auto` => `workspace-write`
- currently, there is no way to specify `danger-full-access` via a CLI
flag, but we will revisit that as part of
https://github.com/openai/codex/issues/1254

Outstanding issue:

- As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
still conflating sandbox preferences with approval preferences in that
case, which needs to be cleaned up.
2025-06-24 16:59:47 -07:00
Eric Wright
ed5e848f3e add: responses api support for azure (#1321)
- Use Responses API for Azure provider endpoints
- Added a unit test to catch regression on the change from
`/chat/completions` to `/responses`
- Updated the default AOAI api version from `2025-03-01-preview` to
`2025-04-01-preview` to avoid user/400 errors due to missing summary
support in the March API version.
- Changes have been tested locally on AOAI endpoints
2025-06-22 18:01:13 -07:00
Govind Kamtamneni
5aafe190e2 feat(ts): provider‑specific API‑key discovery and clearer Azure guidance (#1324)
## Summary

This PR refactors the Codex CLI authentication flow so that
**non-OpenAI** providers (for example **azure**, or any future addition)
can supply their API key through a dedicated environment variable
without triggering the OpenAI login flow.

Key behaviours introduced:

* When `provider !== "openai"` the CLI consults `src/utils/providers.ts`
to locate the correct environment variable (`AZURE_OPENAI_API_KEY`,
`GEMINI_API_KEY`, and so on) before considering any interactive login.
* Credit redemption (`--free`) and PKCE login now run **only** when the
provider is OpenAI, eliminating unwanted browser prompts for Azure and
others.
* User-facing error messages are revamped to guide Azure users to
**[https://ai.azure.com/](https://ai.azure.com)** and show the exact
variable name they must set.
* All code paths still export `OPENAI_API_KEY` so legacy scripts
continue to operate unchanged.

---

## Example `config.json`

```jsonc
{
  "model": "codex-mini",
  "provider": "azure",
  "providers": {
    "azure": {
      "name": "AzureOpenAI",
      "baseURL": "https://ai-<project-name>.openai.azure.com/openai",
      "envKey": "AZURE_OPENAI_API_KEY"
    }
  },
  "history": {
    "maxSize": 1000,
    "saveHistory": true,
    "sensitivePatterns": []
  }
}
```

With this file in `~/.codex/config.json`, a single command line is
enough:

```bash
export AZURE_OPENAI_API_KEY="<your-key>"
codex "Hello from Azure"
```

No browser window opens, and the CLI works in entirely non-interactive
mode.

---

## Rationale

The new flow enables Codex to run **asynchronously** in sandboxed
environments such as GitHub Actions pipelines. By passing `--provider
azure` (or setting it in `config.json`) and exporting the correct key,
CI/CD jobs can invoke Codex without any ChatGPT-style login or PKCE
round-trip. This unlocks fully automated testing and deployment
scenarios.

---

## What’s changed

| File | Type | Description |
| --- | --- | --- |
| `codex-cli/src/cli.tsx` | **feat / refactor** | +43 / -20 lines. Imports `providers`, adds early provider-specific key lookup, gates `--free` redemption, rewrites help text. |
| `src/utils/providers.ts` | **chore** | Now consumed by CLI for env-var discovery. |

---

## How to test

```bash
# Azure example
export AZURE_OPENAI_API_KEY="<your-key>"
codex --provider azure "Automated run in CI"

# OpenAI example (unchanged behaviour)
codex --provider openai --login "Standard OpenAI flow"
```

Expected outcomes:

* Azure and other provider paths are non-interactive when the provider
flag is passed.
* The CLI always sets `OPENAI_API_KEY` for backward compatibility.

---

## Checklist

* [x] Logic behind provider-specific env-var lookup added.
* [x] Redundant OpenAI login steps removed for other providers.
* [x] Unit tests cover new branches.
* [x] README and sample config updated.
* [x] CI passes on all supported Node versions.

---

**Related work**

* #92
* #769 
* #1321



I have read the CLA Document and I hereby sign the CLA.
2025-06-22 17:56:36 -07:00
249 changed files with 14968 additions and 10945 deletions


@@ -1,4 +1,4 @@
FROM ubuntu:22.04
FROM ubuntu:24.04
ARG DEBIAN_FRONTEND=noninteractive
# enable 'universe' because musl-tools & clang live there
@@ -11,19 +11,17 @@ RUN apt-get update && \
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential curl git ca-certificates \
pkg-config clang musl-tools libssl-dev && \
pkg-config clang musl-tools libssl-dev just && \
rm -rf /var/lib/apt/lists/*
# non-root dev user
ARG USER=dev
ARG UID=1000
RUN useradd -m -u $UID $USER
USER $USER
# Ubuntu 24.04 ships with user 'ubuntu' already created with UID 1000.
USER ubuntu
# install Rust + musl target as dev user
RUN curl -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal && \
~/.cargo/bin/rustup target add aarch64-unknown-linux-musl
~/.cargo/bin/rustup target add aarch64-unknown-linux-musl && \
~/.cargo/bin/rustup component add clippy rustfmt
ENV PATH="/home/${USER}/.cargo/bin:${PATH}"
ENV PATH="/home/ubuntu/.cargo/bin:${PATH}"
WORKDIR /workspace

View File

@@ -15,15 +15,13 @@
"CARGO_TARGET_DIR": "${containerWorkspaceFolder}/codex-rs/target-arm64"
},
"remoteUser": "dev",
"remoteUser": "ubuntu",
"customizations": {
"vscode": {
"settings": {
"terminal.integrated.defaultProfile.linux": "bash"
"terminal.integrated.defaultProfile.linux": "bash"
},
"extensions": [
"rust-lang.rust-analyzer"
],
"extensions": ["rust-lang.rust-analyzer", "tamasfe.even-better-toml"]
}
}
}

View File

@@ -20,9 +20,9 @@ inputs:
description: "Value to use as the CODEX_HOME environment variable when running Codex."
required: false
codex_release_tag:
description: "The release tag of the Codex model to run."
description: "The release tag of the Codex model to run, e.g., 'rust-v0.3.0'. Defaults to the latest release."
required: false
default: "codex-rs-ca8e97fcbcb991e542b8689f2d4eab9d30c399d6-1-rust-v0.0.2505302325"
default: ""
runs:
using: "composite"
@@ -84,7 +84,10 @@ runs:
# we will need to update this action.yml file to match.
artifact="codex-exec-${triple}.tar.gz"
gh release download ${{ inputs.codex_release_tag }} --repo openai/codex \
TAG_ARG="${{ inputs.codex_release_tag }}"
# The usage is `gh release download [<tag>] [flags]`, so if TAG_ARG
# is empty, we do not pass it so we can default to the latest release.
gh release download ${TAG_ARG:+$TAG_ARG} --repo openai/codex \
--pattern "$artifact" --output - \
| tar xzO > /usr/local/bin/codex-exec
chmod +x /usr/local/bin/codex-exec
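
Review note: the behaviour the new comment describes can be checked in isolation. The sketch below is a minimal illustration, not part of the action; `count_args`, `args_with_guard`, and `args_with_quotes` are hypothetical helpers that stand in for `gh release download` and only count the arguments they receive.

```shell
# Hypothetical helper (not part of the action): prints how many arguments it got.
count_args() {
  printf '%s\n' "$#"
}

# Mirrors the workflow's unquoted ${TAG_ARG:+$TAG_ARG}: an empty TAG_ARG
# expands to zero words, so no tag argument is passed at all.
args_with_guard() {
  TAG_ARG="$1"
  count_args ${TAG_ARG:+$TAG_ARG} --repo openai/codex
}

# Contrast: quoting the variable always passes one argument, even when empty.
args_with_quotes() {
  TAG_ARG="$1"
  count_args "$TAG_ARG" --repo openai/codex
}

args_with_guard ""             # prints 2
args_with_guard "rust-v0.3.0"  # prints 3
args_with_quotes ""            # prints 3
```

The last call shows why the expansion is left unquoted: a quoted empty string would reach the command as an explicit empty tag instead of being omitted.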

View File

@@ -8,9 +8,9 @@
"@actions/github": "^6.0.1",
},
"devDependencies": {
"@types/bun": "^1.2.11",
"@types/node": "^22.15.21",
"prettier": "^3.5.3",
"@types/bun": "^1.2.18",
"@types/node": "^24.0.13",
"prettier": "^3.6.2",
"typescript": "^5.8.3",
},
},
@@ -48,19 +48,23 @@
"@octokit/types": ["@octokit/types@13.10.0", "", { "dependencies": { "@octokit/openapi-types": "^24.2.0" } }, "sha512-ifLaO34EbbPj0Xgro4G5lP5asESjwHracYJvVaPIyXMuiuXLlhic3S47cBdTb+jfODkTE5YtGCLt3Ay3+J97sA=="],
"@types/bun": ["@types/bun@1.2.13", "", { "dependencies": { "bun-types": "1.2.13" } }, "sha512-u6vXep/i9VBxoJl3GjZsl/BFIsvML8DfVDO0RYLEwtSZSp981kEO1V5NwRcO1CPJ7AmvpbnDCiMKo3JvbDEjAg=="],
"@types/bun": ["@types/bun@1.2.18", "", { "dependencies": { "bun-types": "1.2.18" } }, "sha512-Xf6RaWVheyemaThV0kUfaAUvCNokFr+bH8Jxp+tTZfx7dAPA8z9ePnP9S9+Vspzuxxx9JRAXhnyccRj3GyCMdQ=="],
"@types/node": ["@types/node@22.15.21", "", { "dependencies": { "undici-types": "~6.21.0" } }, "sha512-EV/37Td6c+MgKAbkcLG6vqZ2zEYHD7bvSrzqqs2RIhbA6w3x+Dqz8MZM3sP6kGTeLrdoOgKZe+Xja7tUB2DNkQ=="],
"@types/node": ["@types/node@24.0.13", "", { "dependencies": { "undici-types": "~7.8.0" } }, "sha512-Qm9OYVOFHFYg3wJoTSrz80hoec5Lia/dPp84do3X7dZvLikQvM1YpmvTBEdIr/e+U8HTkFjLHLnl78K/qjf+jQ=="],
"@types/react": ["@types/react@19.1.8", "", { "dependencies": { "csstype": "^3.0.2" } }, "sha512-AwAfQ2Wa5bCx9WP8nZL2uMZWod7J7/JSplxbTmBQ5ms6QpqNYm672H0Vu9ZVKVngQ+ii4R/byguVEUZQyeg44g=="],
"before-after-hook": ["before-after-hook@2.2.3", "", {}, "sha512-NzUnlZexiaH/46WDhANlyR2bXRopNg4F/zuSA3OpZnllCUgRaOF2znDioDWrmbNVsuZk6l9pMquQB38cfBZwkQ=="],
"bun-types": ["bun-types@1.2.13", "", { "dependencies": { "@types/node": "*" } }, "sha512-rRjA1T6n7wto4gxhAO/ErZEtOXyEZEmnIHQfl0Dt1QQSB4QV0iP6BZ9/YB5fZaHFQ2dwHFrmPaRQ9GGMX01k9Q=="],
"bun-types": ["bun-types@1.2.18", "", { "dependencies": { "@types/node": "*" }, "peerDependencies": { "@types/react": "^19" } }, "sha512-04+Eha5NP7Z0A9YgDAzMk5PHR16ZuLVa83b26kH5+cp1qZW4F6FmAURngE7INf4tKOvCE69vYvDEwoNl1tGiWw=="],
"csstype": ["csstype@3.1.3", "", {}, "sha512-M1uQkMl8rQK/szD0LNhtqxIPLpimGm8sOBwU7lLnCpSbTyY3yeU1Vc7l4KT5zT4s/yOxHH5O7tIuuLOCnLADRw=="],
"deprecation": ["deprecation@2.3.1", "", {}, "sha512-xmHIy4F3scKVwMsQ4WnVaS8bHOx0DmVwRywosKhaILI0ywMDWPtBSku2HNxRvF7jtwDRsoEwYQSfbxj8b7RlJQ=="],
"once": ["once@1.4.0", "", { "dependencies": { "wrappy": "1" } }, "sha512-lNaJgI+2Q5URQBkccEKHTQOPaXdUxnZZElQTZY0MFUAuaEqe1E+Nyvgdz/aIyNi6Z9MzO5dv1H8n58/GELp3+w=="],
"prettier": ["prettier@3.5.3", "", { "bin": { "prettier": "bin/prettier.cjs" } }, "sha512-QQtaxnoDJeAkDvDKWCLiwIXkTgRhwYDEQCghU9Z6q03iyek/rxRh/2lC3HB7P8sWT2xC/y5JDctPLBIGzHKbhw=="],
"prettier": ["prettier@3.6.2", "", { "bin": { "prettier": "bin/prettier.cjs" } }, "sha512-I7AIg5boAr5R0FFtJ6rCfD+LFsWHp81dolrFD8S79U9tb8Az2nGrJncnMSnys+bpQJfRUzqs9hnA81OAA3hCuQ=="],
"tunnel": ["tunnel@0.0.6", "", {}, "sha512-1h/Lnq9yajKY2PEbBadPXj3VxsDDu844OnaAo52UVmIzIvwwtBPIuNvkjuzBlTWpfJyUbG3ez0KSBibQkj4ojg=="],
@@ -68,7 +72,7 @@
"undici": ["undici@5.29.0", "", { "dependencies": { "@fastify/busboy": "^2.0.0" } }, "sha512-raqeBD6NQK4SkWhQzeYKd1KmIG6dllBOTt55Rmkt4HtI9mwdWtJljnrXjAFUBLTSN67HWrOIZ3EPF4kjUw80Bg=="],
"undici-types": ["undici-types@6.21.0", "", {}, "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ=="],
"undici-types": ["undici-types@7.8.0", "", {}, "sha512-9UJ2xGDvQ43tYyVMpuHlsgApydB8ZKfVYTsLDhXkFL/6gfkp+U8xTGdh8pMJv1SpZna0zxG1DwsKZsreLbXBxw=="],
"universal-user-agent": ["universal-user-agent@6.0.1", "", {}, "sha512-yCzhz6FN2wU1NiiQRogkTQszlQSlpWaw8SvVegAc+bDxbzHgh1vX8uIe8OYyMH6DwH+sdTJsgMl36+mSMdRJIQ=="],

View File

@@ -13,9 +13,9 @@
"@actions/github": "^6.0.1"
},
"devDependencies": {
"@types/bun": "^1.2.11",
"@types/node": "^22.15.21",
"prettier": "^3.5.3",
"@types/bun": "^1.2.18",
"@types/node": "^24.0.13",
"prettier": "^3.6.2",
"typescript": "^5.8.3"
}
}

26
.github/dependabot.yaml vendored Normal file
View File

@@ -0,0 +1,26 @@
# https://docs.github.com/en/code-security/dependabot/working-with-dependabot/dependabot-options-reference#package-ecosystem-
version: 2
updates:
- package-ecosystem: bun
directory: .github/actions/codex
schedule:
interval: weekly
- package-ecosystem: cargo
directories:
- codex-rs
- codex-rs/*
schedule:
interval: weekly
- package-ecosystem: devcontainers
directory: /
schedule:
interval: weekly
- package-ecosystem: docker
directory: codex-cli
schedule:
interval: weekly
- package-ecosystem: github-actions
directory: /
schedule:
interval: weekly

View File

@@ -74,7 +74,12 @@ jobs:
GH_TOKEN: ${{ github.token }}
run: pnpm stage-release
- name: Ensure README.md contains only ASCII and certain Unicode code points
- name: Ensure root README.md contains only ASCII and certain Unicode code points
run: ./scripts/asciicheck.py README.md
- name: Check README ToC
- name: Check root README ToC
run: python3 scripts/readme_toc.py README.md
- name: Ensure codex-cli/README.md contains only ASCII and certain Unicode code points
run: ./scripts/asciicheck.py codex-cli/README.md
- name: Check codex-cli/README ToC
run: python3 scripts/readme_toc.py codex-cli/README.md

View File

@@ -70,7 +70,7 @@ jobs:
- name: Install dependencies
run: pnpm install
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
targets: x86_64-unknown-linux-gnu
components: clippy

View File

@@ -26,7 +26,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
components: rustfmt
- name: cargo fmt
@@ -64,7 +64,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
targets: ${{ matrix.target }}
components: clippy

View File

@@ -15,9 +15,6 @@ concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
env:
TAG_REGEX: '^rust-v[0-9]+\.[0-9]+\.[0-9]+$'
jobs:
tag-check:
runs-on: ubuntu-latest
@@ -33,8 +30,8 @@ jobs:
# 1. Must be a tag and match the regex
[[ "${GITHUB_REF_TYPE}" == "tag" ]] \
|| { echo "❌ Not a tag push"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ${TAG_REGEX} ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' != ${TAG_REGEX}"; exit 1; }
[[ "${GITHUB_REF_NAME}" =~ ^rust-v[0-9]+\.[0-9]+\.[0-9]+(-(alpha|beta)(\.[0-9]+)?)?$ ]] \
|| { echo "❌ Tag '${GITHUB_REF_NAME}' doesn't match expected format"; exit 1; }
# 2. Extract versions
tag_ver="${GITHUB_REF_NAME#rust-v}"
@@ -76,7 +73,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@1.87
- uses: dtolnay/rust-toolchain@1.88
with:
targets: ${{ matrix.target }}
@@ -160,9 +157,7 @@ jobs:
release:
needs: build
name: release
runs-on: ubuntu-24.04
env:
RELEASE_TAG: codex-rs-${{ github.sha }}-${{ github.run_attempt }}-${{ github.ref_name }}
runs-on: ubuntu-latest
steps:
- uses: actions/download-artifact@v4
@@ -172,9 +167,19 @@ jobs:
- name: List
run: ls -R dist/
- uses: softprops/action-gh-release@v2
- name: Define release name
id: release_name
run: |
# Extract the version from the tag name, which is in the format
# "rust-v0.1.0".
version="${GITHUB_REF_NAME#rust-v}"
echo "name=${version}" >> $GITHUB_OUTPUT
- name: Create GitHub Release
uses: softprops/action-gh-release@v2
with:
tag_name: ${{ env.RELEASE_TAG }}
name: ${{ steps.release_name.outputs.name }}
tag_name: ${{ github.ref_name }}
files: dist/**
# For now, tag releases as "prerelease" because we are not claiming
# the Rust CLI is stable yet.
@@ -184,5 +189,5 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
tag: ${{ env.RELEASE_TAG }}
tag: ${{ github.ref_name }}
config: .github/dotslash-config.json
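
Review note: the relaxed tag pattern and the release-name step can be tried locally before pushing a tag. This is a sketch under stated assumptions: `is_release_tag` and `release_name` are hypothetical helpers, and `grep -E` is swapped in for the workflow's bash `[[ ... =~ ... ]]` test so the snippet also runs under plain `sh`. The pattern itself is copied from the diff above.

```shell
# Same pattern as the workflow's tag check, stored in a variable for reuse.
TAG_PATTERN='^rust-v[0-9]+\.[0-9]+\.[0-9]+(-(alpha|beta)(\.[0-9]+)?)?$'

# Hypothetical helper: succeeds only if the tag matches the release format.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq "$TAG_PATTERN"
}

# Hypothetical helper: strips the "rust-v" prefix, as the release-name step does.
release_name() {
  printf '%s\n' "${1#rust-v}"
}

for tag in rust-v0.3.0 rust-v0.3.0-alpha.1 rust-v0.3.0-rc.1 v0.3.0; do
  if is_release_tag "$tag"; then
    echo "$tag -> accepted, release name $(release_name "$tag")"
  else
    echo "$tag -> rejected"
  fi
done
```

With this pattern, `-alpha` and `-beta` suffixes (optionally followed by `.N`) are accepted, while other pre-release labels such as `-rc.1` and tags without the `rust-v` prefix are rejected.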

10
.gitignore vendored
View File

@@ -48,12 +48,6 @@ yarn-error.log*
# env
.env*
# oaipkg import cache
oaipkg/
# Ignore task worktree directories created by create-task-worktree.sh
agentydragon/tasks/.worktrees/
!.env.example
# package
@@ -87,7 +81,3 @@ CHANGELOG.ignore.md
# nix related
.direnv
.envrc
__pycache__
codex-rs/target

View File

@@ -1,15 +0,0 @@
repos:
- repo: local
hooks:
- id: check-tasks
name: Run all task-directory validation checks
entry: python3 agentydragon/tools/check_tasks.py
language: python
additional_dependencies: [PyYAML, toml, pydantic]
files: ^agentydragon/tasks/.*
- id: cargo-build
name: Check Rust workspace and linux-sandbox compile
entry: bash -lc 'cd codex-rs && RUSTFLAGS="-D warnings" cargo build --workspace --locked --all-targets && cargo build -p codex-linux-sandbox --locked --all-targets'
language: system
pass_filenames: false
require_serial: true

18
.vscode/launch.json vendored Normal file
View File

@@ -0,0 +1,18 @@
{
"version": "0.2.0",
"configurations": [
{
"type": "lldb",
"request": "launch",
"name": "Cargo launch",
"cargo": {
"cwd": "${workspaceFolder}/codex-rs",
"args": [
"build",
"--bin=codex-tui"
]
},
"args": []
}
]
}

10
.vscode/settings.json vendored Normal file
View File

@@ -0,0 +1,10 @@
{
"rust-analyzer.checkOnSave": true,
"rust-analyzer.check.command": "clippy",
"rust-analyzer.check.extraArgs": ["--all-features", "--tests"],
"rust-analyzer.rustfmt.extraArgs": ["--config", "imports_granularity=Item"],
"[rust]": {
"editor.defaultFormatter": "rust-lang.rust-analyzer",
"editor.formatOnSave": true,
}
}

View File

@@ -1,31 +1,9 @@
# AGENTS.md
# Rust/codex-rs
This file provides guidance to OpenAI Codex (openai.com/codex) when working with
code in this repository.
In the codex-rs folder where the rust code lives:
## Build, Lint & Test
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR`. You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
### JavaScript/TypeScript
- Install dependencies: `pnpm install`
- Run all tests: `pnpm test`
- Run a single test: `pnpm test -- -t <pattern>` or `pnpm test -- path/to/file.spec.ts`
- Watch tests: `pnpm test:watch`
- Lint: `pnpm lint && pnpm lint:fix`
- Type-check: `pnpm typecheck`
- Format: `pnpm format:fix`
- Build: `pnpm build`
Before creating a pull request with changes to `codex-rs`, run `just fmt` (in `codex-rs` directory) to format the code and `just fix` (in `codex-rs` directory) to fix any linter issues in the code, ensure the test suite passes by running `cargo test --all-features` in the `codex-rs` directory.
### Rust (codex-rs workspace)
- Build: `cargo build --workspace --locked`
- Test all: `cargo test --workspace`
- Test crate: `cargo test -p <crate>`
- Single test: `cargo test -p <crate> -- <test_name>`
- Format & check: `cargo fmt --all -- --check`
- Lint: `cargo clippy --all-targets --all-features -- -D warnings`
## Code Style Guidelines
- JS/TS: ESLint + Prettier; group imports; camelCase vars & funcs; PascalCase types/components; catch specific errors
- Rust: rustfmt & Clippy (see `codex-rs/rustfmt.toml`); snake_case vars & funcs; PascalCase types; prefer early return; avoid `unwrap()` in prod
- General: Do not swallow exceptions; use DRY; generate/validate ASCII art programmatically
- Include any Cursor rules from `.cursor/rules/` or Copilot rules from `.github/copilot-instructions.md` if present
When making individual changes prefer running tests on individual files or projects first.

586
README.md
View File

@@ -1,9 +1,11 @@
<h1 align="center">OpenAI Codex CLI</h1>
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
<p align="center"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>
![Codex demo GIF using: codex "explain this codebase to me"](./.github/demo.gif)
This is the home of the **Codex CLI**, which is a coding agent from OpenAI that runs locally on your computer. If you are looking for the _cloud-based agent_ from OpenAI, **Codex [Web]**, see <https://chatgpt.com/codex>.
<!-- ![Codex demo GIF using: codex "explain this codebase to me"](./.github/demo.gif) -->
---
@@ -14,6 +16,8 @@
- [Experimental technology disclaimer](#experimental-technology-disclaimer)
- [Quickstart](#quickstart)
- [OpenAI API Users](#openai-api-users)
- [OpenAI Plus/Pro Users](#openai-pluspro-users)
- [Why Codex?](#why-codex)
- [Security model & permissions](#security-model--permissions)
- [Platform sandboxing details](#platform-sandboxing-details)
@@ -21,24 +25,17 @@
- [CLI reference](#cli-reference)
- [Memory & project docs](#memory--project-docs)
- [Non-interactive / CI mode](#non-interactive--ci-mode)
- [Model Context Protocol (MCP)](#model-context-protocol-mcp)
- [Tracing / verbose logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [Configuration guide](#configuration-guide)
- [Basic configuration parameters](#basic-configuration-parameters)
- [Custom AI provider configuration](#custom-ai-provider-configuration)
- [History configuration](#history-configuration)
- [Configuration examples](#configuration-examples)
- [Full configuration example](#full-configuration-example)
- [Custom instructions](#custom-instructions)
- [Environment variables setup](#environment-variables-setup)
- [DotSlash](#dotslash)
- [Configuration](#configuration)
- [FAQ](#faq)
- [Zero data retention (ZDR) usage](#zero-data-retention-zdr-usage)
- [Codex open source fund](#codex-open-source-fund)
- [Contributing](#contributing)
- [Development workflow](#development-workflow)
- [Git hooks with Husky](#git-hooks-with-husky)
- [Debugging](#debugging)
- [Writing high-impact code changes](#writing-high-impact-code-changes)
- [Opening a pull request](#opening-a-pull-request)
- [Review process](#review-process)
@@ -47,8 +44,6 @@
- [Contributor license agreement (CLA)](#contributor-license-agreement-cla)
- [Quick fixes](#quick-fixes)
- [Releasing `codex`](#releasing-codex)
- [Alternative build options](#alternative-build-options)
- [Nix flake development](#nix-flake-development)
- [Security & responsible AI](#security--responsible-ai)
- [License](#license)
@@ -71,54 +66,94 @@ Help us improve by filing issues or submitting PRs (see the section below for ho
## Quickstart
Install globally:
Install globally with your preferred package manager:
```shell
npm install -g @openai/codex
npm install -g @openai/codex # Alternatively: `brew install codex`
```
Or go to the [latest GitHub Release](https://github.com/openai/codex/releases/latest) and download the appropriate binary for your platform.
### OpenAI API Users
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`) but we recommend setting for the session. **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
> [!NOTE]
> This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it for the session.
### OpenAI Plus/Pro Users
If you have a paid OpenAI account, run the following to start the login process:
```
codex login
```
If you complete the process successfully, you should have a `~/.codex/auth.json` file that contains the credentials that Codex will use.
If you encounter problems with the login flow, please comment on <https://github.com/openai/codex/issues/1243>.
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
<summary><strong>Use <code>--profile</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - azure
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - arceeai
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
>
> ```shell
> export <provider>_BASE_URL="https://your-provider-api-base-url"
> ```
Codex also allows you to use other providers that support the OpenAI Chat Completions (or Responses) API.
To do so, you must first define custom [providers](./config.md#model_providers) in `~/.codex/config.toml`. For example, the provider for a standard Ollama setup would be defined as follows:
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
```
The `base_url` will have `/chat/completions` appended to it to build the full URL for the request.
For providers that also require an `Authorization` header of the form `Bearer: SECRET`, an `env_key` can be specified, which indicates the environment variable to read to use as the value of `SECRET` when making a request:
```toml
[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
```
Providers that speak the Responses API are also supported by adding `wire_api = "responses"` as part of the definition. Accessing OpenAI models via Azure is an example of such a provider, though it also requires specifying additional `query_params` that need to be appended to the request URL:
```toml
[model_providers.azure]
name = "Azure"
# Make sure you set the appropriate subdomain for this URL.
base_url = "https://YOUR_PROJECT_NAME.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY" # Or "OPENAI_API_KEY", whichever you use.
# Newer versions appear to support the responses API, see https://github.com/openai/codex/pull/1321
query_params = { api-version = "2025-04-01-preview" }
wire_api = "responses"
```
Once you have defined a provider you wish to use, you can configure it as your default provider as follows:
```toml
model_provider = "azure"
```
> [!TIP]
> If you find yourself experimenting with a variety of models and providers, then you likely want to invest in defining a _profile_ for each configuration like so:
```toml
[profiles.o3]
model_provider = "azure"
model = "o3"
[profiles.mistral]
model_provider = "ollama"
model = "mistral"
```
This way, you can specify one command-line argument (e.g., `--profile o3`, `--profile mistral`) to override multiple settings together.
</details>
<br />
@@ -136,7 +171,7 @@ codex "explain this codebase to me"
```
```shell
codex --approval-mode full-auto "create the fanciest todo-list app"
codex --full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
@@ -162,41 +197,35 @@ And it's **fully open-source** so you can see and contribute to how it develops!
## Security model & permissions
Codex lets you decide _how much autonomy_ the agent receives and auto-approval policy via the
`--approval-mode` flag (or the interactive onboarding prompt):
Codex lets you decide _how much autonomy_ you want to grant the agent. The following options can be configured independently:
| Mode | What the agent may do without asking | Still requires approval |
| ------------------------- | --------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Suggest** <br>(default) | <li>Read any file in the repo | <li>**All** file writes/patches<li> **Any** arbitrary shell commands (aside from reading files) |
| **Auto Edit** | <li>Read **and** apply-patch writes to files | <li>**All** shell commands |
| **Full Auto** | <li>Read/write files <li> Execute shell commands (network disabled, writes limited to your workdir) | - |
- [`approval_policy`](./codex-rs/config.md#approval_policy) determines when you should be prompted to approve whether Codex can execute a command
- [`sandbox`](./codex-rs/config.md#sandbox) determines the _sandbox policy_ that Codex uses to execute untrusted commands
In **Full Auto** every command is run **network-disabled** and confined to the
current working directory (plus temporary files) for defense-in-depth. Codex
will also show a warning/confirmation if you start in **auto-edit** or
**full-auto** while the directory is _not_ tracked by Git, so you always have a
safety net.
By default, Codex runs with `--ask-for-approval untrusted` and `--sandbox read-only`, which means that:
Coming soon: you'll be able to whitelist specific commands to auto-execute with
the network enabled, once we're confident in additional safeguards.
- The user is prompted to approve every command not on the set of "trusted" commands built into Codex (`cat`, `ls`, etc.)
- Approved commands are run outside of a sandbox because user approval implies "trust," in this case.
Running Codex with the `--full-auto` convenience flag changes the configuration to `--ask-for-approval on-failure` and `--sandbox workspace-write`, which means that:
- Codex does not initially ask for user approval before running an individual command.
- Though when it runs a command, it is run under a sandbox in which:
- It can read any file on the system.
- It can only write files under the current directory (or the directory specified via `--cd`).
- Network requests are completely disabled.
- Only if the command exits with a non-zero exit code will it ask the user for approval. If granted, it will re-attempt the command outside of the sandbox. (A common case is when Codex cannot `npm install` a dependency because that requires network access.)
Again, these two options can be configured independently. For example, if you want Codex to perform an "exploration" where you are happy for it to read anything it wants but you never want to be prompted, you could run Codex with `--ask-for-approval never` and `--sandbox read-only`.
### Platform sandboxing details
The hardening mechanism Codex uses depends on your OS:
The mechanism Codex uses to implement the sandbox policy depends on your OS:
- **macOS 12+** - commands are wrapped with **Apple Seatbelt** (`sandbox-exec`).
- **macOS 12+** uses **Apple Seatbelt** and runs commands using `sandbox-exec` with a profile (`-p`) that corresponds to the `--sandbox` that was specified.
- **Linux** uses a combination of Landlock/seccomp APIs to enforce the `sandbox` configuration.
- Everything is placed in a read-only jail except for a small set of
writable roots (`$PWD`, `$TMPDIR`, `~/.codex`, etc.).
- Outbound network is _fully blocked_ by default - even if a child process
tries to `curl` somewhere it will fail.
- **Linux** - there is no sandboxing by default.
We recommend using Docker for sandboxing, where Codex launches itself inside a **minimal
container image** and mounts your repo _read/write_ at the same path. A
custom `iptables`/`ipset` firewall script denies all egress except the
OpenAI API. This gives you deterministic, reproducible runs without needing
root on the host. You can use the [`run_in_container.sh`](./codex-cli/scripts/run_in_container.sh) script to set up the sandbox.
Note that when running Linux in a containerized environment such as Docker, sandboxing may not work if the host/container configuration does not support the necessary Landlock/seccomp APIs. In such cases, we recommend configuring your Docker container so that it provides the sandbox guarantees you are looking for and then running `codex` with `--sandbox danger-full-access` (or, more simply, the `--dangerously-bypass-approvals-and-sandbox` flag) within your container.
---
@@ -205,24 +234,20 @@ The hardening mechanism Codex uses depends on your OS:
| Requirement | Details |
| --------------------------- | --------------------------------------------------------------- |
| Operating systems | macOS 12+, Ubuntu 20.04+/Debian 10+, or Windows 11 **via WSL2** |
| Node.js | **22 or newer** (LTS recommended) |
| Git (optional, recommended) | 2.23+ for built-in PR helpers |
| RAM | 4-GB minimum (8-GB recommended) |
> Never run `sudo npm install -g`; fix npm permissions instead.
---
## CLI reference
| Command | Purpose | Example |
| ------------------------------------ | ----------------------------------- | ------------------------------------ |
| `codex` | Interactive REPL | `codex` |
| `codex "..."` | Initial prompt for interactive REPL | `codex "fix lint errors"` |
| `codex -q "..."` | Non-interactive "quiet mode" | `codex -q --json "explain utils.ts"` |
| `codex completion <bash\|zsh\|fish>` | Print shell completion script | `codex completion bash` |
| Command | Purpose | Example |
| ------------------ | ---------------------------------- | ------------------------------- |
| `codex` | Interactive TUI | `codex` |
| `codex "..."` | Initial prompt for interactive TUI | `codex "fix lint errors"` |
| `codex exec "..."` | Non-interactive "automation mode" | `codex exec "explain utils.ts"` |
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
Key flags: `--model/-m`, `--ask-for-approval/-a`.
---
@@ -234,8 +259,6 @@ You can give Codex extra instructions and guidance using `AGENTS.md` files. Code
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
Disable loading of these files with `--no-project-doc` or the environment variable `CODEX_DISABLE_PROJECT_DOC=1`.
---
## Non-interactive / CI mode
@@ -247,18 +270,37 @@ Run Codex head-less in pipelines. Example GitHub Action step:
run: |
npm install -g @openai/codex
export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
codex -a auto-edit --quiet "update CHANGELOG for next release"
codex exec --full-auto "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
## Model Context Protocol (MCP)
The Codex CLI can be configured to leverage MCP servers by defining an [`mcp_servers`](./codex-rs/config.md#mcp_servers) section in `~/.codex/config.toml`. It is intended to mirror how tools such as Claude and Cursor define `mcpServers` in their respective JSON config files, though the Codex format is slightly different since it uses TOML rather than JSON, e.g.:
```toml
# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }
```
> [!TIP]
> It is somewhat experimental, but the Codex CLI can also be run as an MCP _server_ via `codex mcp`. If you launch it with an MCP client such as `npx @modelcontextprotocol/inspector codex mcp` and send it a `tools/list` request, you will see that there is only one tool, `codex`, that accepts a grab-bag of inputs, including a catch-all `config` map for anything you might want to override. Feel free to play around with it and provide feedback via GitHub issues.
## Tracing / verbose logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
Because Codex is written in Rust, it honors the `RUST_LOG` environment variable to configure its logging behavior.
The TUI defaults to `RUST_LOG=codex_core=info,codex_tui=info` and log messages are written to `~/.codex/log/codex-tui.log`, so you can leave the following running in a separate terminal to monitor log messages as they are written:
```shell
DEBUG=true codex
```
tail -F ~/.codex/log/codex-tui.log
```
By comparison, the non-interactive mode (`codex exec`) defaults to `RUST_LOG=error`, but messages are printed inline, so there is no need to monitor a separate file.
See the Rust documentation on [`RUST_LOG`](https://docs.rs/env_logger/latest/env_logger/#enabling-logging) for more information on the configuration options.
---
@@ -281,215 +323,78 @@ Below are a few bite-size examples you can copy-paste. Replace the text in quote
## Installation
<details open>
<summary><strong>From npm (Recommended)</strong></summary>
<summary><strong>Install Codex CLI using your preferred package manager.</strong></summary>
From `brew` (recommended, downloads only the binary for your platform):
```bash
npm install -g @openai/codex
# or
yarn global add @openai/codex
# or
bun install -g @openai/codex
# or
pnpm add -g @openai/codex
brew install codex
```
From `npm` (generally more readily available, but downloads binaries for all supported platforms):
```bash
npm i -g @openai/codex
```
Or go to the [latest GitHub Release](https://github.com/openai/codex/releases/latest) and download the appropriate binary for your platform.
Admittedly, each GitHub Release contains many executables, but in practice, you likely want one of these:
- macOS
- Apple Silicon/arm64: `codex-aarch64-apple-darwin.tar.gz`
- x86_64 (older Mac hardware): `codex-x86_64-apple-darwin.tar.gz`
- Linux
- x86_64: `codex-x86_64-unknown-linux-musl.tar.gz`
- arm64: `codex-aarch64-unknown-linux-musl.tar.gz`
Each archive contains a single entry with the platform baked into the name (e.g., `codex-x86_64-unknown-linux-musl`), so you likely want to rename it to `codex` after extracting it.
### DotSlash
The GitHub Release also contains a [DotSlash](https://dotslash-cli.com/) file for the Codex CLI named `codex`. Using a DotSlash file makes it possible to make a lightweight commit to source control to ensure all contributors use the same version of an executable, regardless of what platform they use for development.
</details>
<details>
<summary><strong>Build from source</strong></summary>
```bash
# Clone the repository and navigate to the CLI package
# Clone the repository and navigate to the root of the Cargo workspace.
git clone https://github.com/openai/codex.git
cd codex/codex-cli
cd codex/codex-rs
# Enable corepack
corepack enable
# Install the Rust toolchain, if necessary.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup component add rustfmt
rustup component add clippy
# Install dependencies and build
pnpm install
pnpm build
# Build Codex.
cargo build
# Linux-only: download prebuilt sandboxing binaries (requires gh and zstd).
./scripts/install_native_deps.sh
# Launch the TUI with a sample prompt.
cargo run --bin codex -- "explain this codebase to me"
# Get the usage and the options
node ./dist/cli.js --help
# After making changes, ensure the code is clean.
cargo fmt -- --config imports_granularity=Item
cargo clippy --tests
# Run the locally-built CLI directly
node ./dist/cli.js
# Or link the command globally for convenience
pnpm link
```
</details>
<details>
<summary><strong>Rust / Cargo (codex-rs)</strong></summary>
```bash
# Ensure you have Rust and Cargo installed (via rustup)
cd codex-rs/cli
cargo install --path . --locked
# Or run without installing:
cargo run --manifest-path codex-rs/cli/Cargo.toml -- --help
# Run the tests.
cargo test
```
</details>
---
## Configuration guide
## Configuration
Codex configuration files can be placed in the `~/.codex/` directory, supporting both YAML and JSON formats.
Codex supports a rich set of configuration options documented in [`codex-rs/config.md`](./codex-rs/config.md).
### Basic configuration parameters
By default, Codex loads its configuration from `~/.codex/config.toml`.
| Parameter | Type | Default | Description | Available Options |
| ------------------- | ------- | ---------- | -------------------------------- | ---------------------------------------------------------------------------------------------- |
| `model` | string | `o4-mini` | AI model to use | Any model name supporting OpenAI API |
| `approvalMode` | string | `suggest` | AI assistant's permission mode | `suggest` (suggestions only)<br>`auto-edit` (automatic edits)<br>`full-auto` (fully automatic) |
| `fullAutoErrorMode` | string | `ask-user` | Error handling in full-auto mode | `ask-user` (prompt for user input)<br>`ignore-and-continue` (ignore and proceed) |
| `notify` | boolean | `true` | Enable desktop notifications | `true`/`false` |
### Custom AI provider configuration
In the `providers` object, you can configure multiple AI service providers. Each provider requires the following parameters:
| Parameter | Type | Description | Example |
| --------- | ------ | --------------------------------------- | ----------------------------- |
| `name` | string | Display name of the provider | `"OpenAI"` |
| `baseURL` | string | API service URL | `"https://api.openai.com/v1"` |
| `envKey` | string | Environment variable name (for API key) | `"OPENAI_API_KEY"` |
### History configuration
In the `history` object, you can configure conversation history settings:
| Parameter | Type | Description | Example Value |
| ------------------- | ------- | ------------------------------------------------------ | ------------- |
| `maxSize` | number | Maximum number of history entries to save | `1000` |
| `saveHistory` | boolean | Whether to save history | `true` |
| `sensitivePatterns` | array | Patterns of sensitive information to filter in history | `[]` |
### Configuration examples
1. YAML format (save as `~/.codex/config.yaml`):
```yaml
model: o4-mini
approvalMode: suggest
fullAutoErrorMode: ask-user
notify: true
```
2. JSON format (save as `~/.codex/config.json`):
```json
{
"model": "o4-mini",
"approvalMode": "suggest",
"fullAutoErrorMode": "ask-user",
"notify": true
}
```
### Full configuration example
Below is a comprehensive example of `config.json` with multiple custom providers:
```json
{
"model": "o4-mini",
"provider": "openai",
"providers": {
"openai": {
"name": "OpenAI",
"baseURL": "https://api.openai.com/v1",
"envKey": "OPENAI_API_KEY"
},
"azure": {
"name": "AzureOpenAI",
"baseURL": "https://YOUR_PROJECT_NAME.openai.azure.com/openai",
"envKey": "AZURE_OPENAI_API_KEY"
},
"openrouter": {
"name": "OpenRouter",
"baseURL": "https://openrouter.ai/api/v1",
"envKey": "OPENROUTER_API_KEY"
},
"gemini": {
"name": "Gemini",
"baseURL": "https://generativelanguage.googleapis.com/v1beta/openai",
"envKey": "GEMINI_API_KEY"
},
"ollama": {
"name": "Ollama",
"baseURL": "http://localhost:11434/v1",
"envKey": "OLLAMA_API_KEY"
},
"mistral": {
"name": "Mistral",
"baseURL": "https://api.mistral.ai/v1",
"envKey": "MISTRAL_API_KEY"
},
"deepseek": {
"name": "DeepSeek",
"baseURL": "https://api.deepseek.com",
"envKey": "DEEPSEEK_API_KEY"
},
"xai": {
"name": "xAI",
"baseURL": "https://api.x.ai/v1",
"envKey": "XAI_API_KEY"
},
"groq": {
"name": "Groq",
"baseURL": "https://api.groq.com/openai/v1",
"envKey": "GROQ_API_KEY"
},
"arceeai": {
"name": "ArceeAI",
"baseURL": "https://conductor.arcee.ai/v1",
"envKey": "ARCEEAI_API_KEY"
}
},
"history": {
"maxSize": 1000,
"saveHistory": true,
"sensitivePatterns": []
}
}
```
### Custom instructions
You can create a `~/.codex/AGENTS.md` file to define custom guidance for the agent:
```markdown
- Always respond with emojis
- Only use git commands when explicitly requested
```
### Environment variables setup
For each AI provider, you need to set the corresponding API key in your environment variables. For example:
```bash
# OpenAI
export OPENAI_API_KEY="your-api-key-here"
# Azure OpenAI
export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
export AZURE_OPENAI_API_VERSION="2025-03-01-preview" # optional
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"
# Similarly for other providers
```
Though `--config` can be used to set/override ad-hoc config values for individual invocations of `codex`.
---
@@ -538,7 +443,13 @@ Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)]
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
Ensure you are running `codex` with `--config disable_response_storage=true` or add this line to `~/.codex/config.toml` to avoid specifying the command line option each time:
```toml
disable_response_storage = true
```
See [the configuration documentation on `disable_response_storage`](./codex-rs/config.md#disable_response_storage) for details.
---
@@ -563,51 +474,7 @@ More broadly we welcome contributions - whether you are opening your very first
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
### Git hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./codex-cli/HUSKY.md).
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
- In VS Code, choose **Debug: Attach to Node Process** from the command palette and choose the option in the dropdown with debug port `9229` (likely the first option)
- Go to <chrome://inspect> in Chrome and find **localhost:9229** and click **trace**
- Following the [development setup](#development-workflow) instructions above, ensure your change is free of lint warnings and test failures.
### Writing high-impact code changes
@@ -619,7 +486,7 @@ To debug the CLI with a visual debugger, do the following in the `codex-cli` fol
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`npm test && npm run lint && npm run typecheck`). CI failures that could have been caught locally slow down the process.
- Run **all** checks locally (`cargo test && cargo clippy --tests && cargo fmt -- --config imports_granularity=Item`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
@@ -666,73 +533,22 @@ The **DCO check** blocks merges until every commit in the PR carries the footer
### Releasing `codex`
To publish a new version of the CLI you first need to stage the npm package. A
helper script in `codex-cli/scripts/` does all the heavy lifting. Inside the
`codex-cli` folder run:
_For admins only._
```bash
# Classic, JS implementation that includes small, native binaries for Linux sandboxing.
pnpm stage-release
Make sure you are on `main` and have no local changes. Then run:
# Optionally specify the temp directory to reuse between runs.
RELEASE_DIR=$(mktemp -d)
pnpm stage-release --tmp "$RELEASE_DIR"
# "Fat" package that additionally bundles the native Rust CLI binaries for
# Linux. End-users can then opt-in at runtime by setting CODEX_RUST=1.
pnpm stage-release --native
```shell
VERSION=0.2.0 # Can also be 0.2.0-alpha.1 or any valid Rust version.
./codex-rs/scripts/create_github_release.sh "$VERSION"
```
Go to the folder where the release is staged and verify that it works as intended. If so, run the following from the temp folder:
This will make a local commit on top of `main` with `version` set to `$VERSION` in `codex-rs/Cargo.toml` (note that on `main`, we leave the version as `version = "0.0.0"`).
```
cd "$RELEASE_DIR"
npm publish
```
This will push the commit using the tag `rust-v${VERSION}`, which in turn kicks off [the release workflow](.github/workflows/rust-release.yml). This will create a new GitHub Release named `$VERSION`.
### Alternative build options
If everything looks good in the generated GitHub Release, uncheck the **pre-release** box so it is the latest release.
#### Nix flake development
Prerequisite: Nix >= 2.4 with flakes enabled (`experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`).
Enter a Nix development shell:
```bash
# Use either one of the commands according to which implementation you want to work with
nix develop .#codex-cli # For entering codex-cli specific shell
nix develop .#codex-rs # For entering codex-rs specific shell
```
This shell includes Node.js, installs dependencies, builds the CLI, and provides a `codex` command alias.
Build and run the CLI directly:
```bash
# Use either one of the commands according to which implementation you want to work with
nix build .#codex-cli # For building codex-cli
nix build .#codex-rs # For building codex-rs
./result/bin/codex --help
```
Run the CLI via the flake app:
```bash
# Use either one of the commands according to which implementation you want to work with
nix run .#codex-cli # For running codex-cli
nix run .#codex-rs # For running codex-rs
```
Use direnv with flakes
If you have direnv installed, you can use the following `.envrc` to automatically enter the Nix shell when you `cd` into the project directory:
```bash
cd codex-rs
echo "use flake ../flake.nix#codex-cli" >> .envrc && direnv allow
cd codex-cli
echo "use flake ../flake.nix#codex-rs" >> .envrc && direnv allow
```
Create a PR to update [`Formula/c/codex.rb`](https://github.com/Homebrew/homebrew-core/blob/main/Formula/c/codex.rb) on Homebrew.
---


@@ -1,174 +0,0 @@
# codex-rs: Changes between HEAD and main
This document summarizes new and removed features, configuration options,
and behavioral changes in the `codex-rs` workspace between the `main`
branch and the current `HEAD`. Only additions/deletions (not unmodified
code) are listed, with examples of usage and configuration.
---
## CLI Enhancements
### Build & Install from Source
```shell
cargo install --path cli --locked
# install system-wide:
sudo cargo install --path cli --locked --root /usr/local
```
### New `codex config` Subcommand
Manage your `~/.codex/config.toml` directly without manually editing:
```shell
codex config edit # open config in $EDITOR (or vi)
codex config set KEY VALUE # set a TOML literal, e.g. tui.auto_mount_repo true
```
### New `codex inspect-env` Command
Inspect the sandbox/container environment (mounts, permissions, network):
```shell
codex inspect-env --full-auto
codex inspect-env -s network=disable -s mount=/mydir=rw
```
### Resume TUI Sessions by UUID
```shell
codex session <SESSION_UUID>
```
### MCP Server (JSON-RPC) Support
Launch Codex as an MCP _server_ over stdin/stdout and speak the
Model Context Protocol (JSON-RPC):
```shell
npx @modelcontextprotocol/inspector codex mcp
```
#### Sample JSON-RPC Interaction
```jsonc
// ListTools request
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {} }
// CallTool request
{ "jsonrpc": "2.0", "id": 2, "method": "tools/call",
"params": { "name": "codex", "arguments": { "prompt": "Hello" } }
}
// CallTool response (abbreviated)
{ "jsonrpc": "2.0", "id": 2, "result": {
"content": [ { "type": "text", "text": "Hi there", "annotations": null } ],
"is_error": false
}}
```
---
## Configuration Changes
### `auto_allow` Predicate Scripts
Automatically approve or deny shell commands via custom scripts:
```toml
[[auto_allow]]
script = "/path/to/approve_predicate.sh"
[[auto_allow]]
script = "my_predicate --flag"
```
Vote resolution:
- A `deny` vote aborts execution.
- An `allow` vote auto-approves.
- Otherwise falls back to manual approval prompt.
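The resolution rules above can be modelled in a few lines. This is an illustrative Python sketch, not the Rust implementation: the `Vote` and `resolve` names are hypothetical, and letting `deny` win when both `deny` and `allow` votes are present is an assumption the text does not spell out.

```python
from enum import Enum


class Vote(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NO_OPINION = "no-opinion"


def resolve(votes):
    """Combine predicate votes into one decision.

    Assumed precedence: any deny aborts, otherwise any allow
    auto-approves, otherwise the user is prompted.
    """
    votes = list(votes)
    if Vote.DENY in votes:
        return Vote.DENY
    if Vote.ALLOW in votes:
        return Vote.ALLOW
    return Vote.NO_OPINION
```

With no `[[auto_allow]]` entries configured the vote list is empty, so this model always falls through to the manual prompt.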
### `base_instructions_override`
Override or disable the built-in system prompt (`prompt.md`):
```bash
export CODEX_BASE_INSTRUCTIONS_FILE=custom_prompt.md # use custom prompt
export CODEX_BASE_INSTRUCTIONS_FILE="" # disable base prompt
```
### TUI Configuration Options
In `~/.codex/config.toml`, under the `[tui]` table:
```toml
editor = "${VISUAL:-${EDITOR:-nvim}}" # external editor for prompt
message_spacing = true # insert blank line between messages
sender_break_line = true # sender label on its own line
```
---
## Core Library Updates
### System Prompt Composition Customization
System messages now combine:
1. Built-in prompt (`prompt.md`),
2. User instructions (`AGENTS.md`/`instructions.md`),
3. `apply-patch` tool instructions (for GPT-4.1),
4. User command/prompt.
Controlled via `CODEX_BASE_INSTRUCTIONS_FILE`.
### Chat Completions Tool Call Buffering
User turns emitted during an in-flight tool invocation are buffered
and flushed after the tool result, preventing interleaved messages.
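The buffering behaviour can be pictured with a small model. This Python sketch only illustrates the ordering guarantee described above; the `TurnBuffer` class and its method names are hypothetical and do not correspond to types in `codex-core`.

```python
class TurnBuffer:
    """Holds user turns while a tool call is in flight (illustrative only)."""

    def __init__(self):
        self.transcript = []      # messages in the order they are sent
        self._pending = []        # user turns held back during a tool call
        self._tool_in_flight = False

    def start_tool_call(self, call):
        self.transcript.append(("tool_call", call))
        self._tool_in_flight = True

    def user_turn(self, text):
        if self._tool_in_flight:
            self._pending.append(("user", text))
        else:
            self.transcript.append(("user", text))

    def finish_tool_call(self, result):
        # The tool result goes out first, then everything that was buffered.
        self.transcript.append(("tool_result", result))
        self._tool_in_flight = False
        self.transcript.extend(self._pending)
        self._pending.clear()
```

In this model the tool call and its result always stay adjacent in the transcript, which is the property the buffering exists to protect.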
### SandboxPolicy API Extensions
```rust
policy.allow_disk_write_folder("/path/to/folder".into());
policy.revoke_disk_write_folder("/path/to/folder");
```
### AutoApproval Predicate Engine
```rust
use codex_core::safety::{evaluate_auto_allow_predicates, AutoAllowVote};
let vote = evaluate_auto_allow_predicates(&cmd, &config.auto_allow);
match vote {
    AutoAllowVote::Allow => { /* auto-approve */ }
    AutoAllowVote::Deny => { /* reject */ }
    AutoAllowVote::NoOpinion => { /* prompt user */ }
}
```
---
## TUI Improvements
### Double Ctrl+D Exit Confirmation
Prevent accidental exits by requiring two Ctrl+D within a timeout:
```rust
use codex_tui::confirm_ctrl_d::ConfirmCtrlD;
let mut confirm = ConfirmCtrlD::new(require_double, timeout_secs);
// confirm.handle(now) returns true to exit, false to prompt confirmation
```
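The confirmation logic behind such a type might look like the following. This is a Python stand-in that mirrors the `new(require_double, timeout_secs)` and `handle(now)` shape shown above; the internals are assumed rather than taken from `codex_tui`.

```python
class ConfirmCtrlD:
    """Exit only on a second Ctrl+D within `timeout_secs` (assumed logic)."""

    def __init__(self, require_double: bool, timeout_secs: float):
        self.require_double = require_double
        self.timeout_secs = timeout_secs
        self._first_press = None  # time of the unconfirmed first press

    def handle(self, now: float) -> bool:
        """Return True to exit, False to show the confirmation prompt."""
        if not self.require_double:
            return True
        first = self._first_press
        if first is not None and now - first <= self.timeout_secs:
            self._first_press = None
            return True
        # First press, or the previous one expired: start a new window.
        self._first_press = now
        return False
```

Passing the timestamp in as `now`, rather than reading the clock inside `handle`, keeps the logic easy to unit-test.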
### Markdown & Header Compact Rendering
New rendering options (code-level) for more compact chat layout:
- `markdown_compact`
- `header_compact`
---
## Documentation & Tests
- `codex-rs/config.md`, `codex-rs/README.md`, `core/README.md` updated with examples.
- New `core/init.md` guidance for generating `AGENTS.md` templates.
- Added tests for `codex config`, `ConfirmCtrlD`, and `evaluate_auto_allow_predicates`.


@@ -1,87 +0,0 @@
# agentydragon
This file documents the changes introduced on the `agentydragon` branch
(off the `main` branch) of the codex repository.
## codex-rs: session resume and playback
- Added `session` subcommand to the CLI (`codex session <UUID>`) to resume TUI sessions by UUID.
- Integrated the `uuid` crate for session identifiers.
- Updated TUI (`codex-rs/tui`) to respect and replay previous session transcripts:
- Methods: `set_session_id`, `session_id`, `replay_items`.
- Load rollouts from `sessions/rollout-<UUID>.jsonl`.
- Printed resume command on exit: `codex session <UUID>`.
## codex-core enhancements
- Exposed core model types: `ContentItem`, `ReasoningItemReasoningSummary`, `ResponseItem`.
- Added `composer_max_rows` setting (with serde default) to TUI configuration.
## Dependency updates
- Added `uuid` crate to `codex-rs/cli` and `codex-rs/tui`.
## Pre-commit config changes
- Configured Rust build hook in `.pre-commit-config.yaml` to fail on warnings by setting `RUSTFLAGS="-D warnings"`.
## codex-rs/tui: Undo feedback decision with Esc key
- Pressing `Esc` in feedback-entry mode now cancels feedback entry and returns to the select menu, preserving the partially entered feedback text.
- Added a unit test for the ESC cancellation behavior in `tui/src/user_approval_widget.rs`.
## codex-rs/tui: restore inline mount DSL and slash-command dispatch
- Reintroduced logic in `ChatComposer` to dispatch `AppEvent::InlineMountAdd` and `AppEvent::InlineMountRemove` when `/mount-add` or `/mount-remove` is entered with inline arguments.
- Restored dispatch of `AppEvent::DispatchCommand` for slash commands selected via the command popup, including proper cleanup of the composer input.
## codex-rs/tui: slash-command `/edit-prompt` opens external editor
- Fixed slash-command `/edit-prompt` to invoke the configured external editor for prompt drafting (in addition to Ctrl+E).
## codex-rs/tui: display context remaining percentage
- Added module `tui/src/context.rs` with heuristics (`approximate_tokens_used`, `max_tokens_for_model`, `calculate_context_percent_remaining`).
- Updated `ChatWidget` and `ChatComposer::render_ref` to track history items and render `<N>% context left` indicator with color thresholds.
- Added unit tests in `tui/tests/context_percent.rs` for token counting and percent formatting boundary conditions.
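A rough idea of what such heuristics compute is sketched below. The function names are borrowed from the list above, but the bodies are Python stand-ins: the four-characters-per-token ratio and the clamping behaviour are assumptions, not the contents of `tui/src/context.rs`.

```python
def approximate_tokens_used(texts, chars_per_token=4):
    """Rough token estimate: total characters over an assumed ratio."""
    total_chars = sum(len(t) for t in texts)
    return -(-total_chars // chars_per_token)  # ceiling division


def calculate_context_percent_remaining(tokens_used, max_tokens):
    """Whole-number percentage of the context window still free (0..100)."""
    if max_tokens <= 0:
        return 0
    remaining = max(max_tokens - tokens_used, 0)
    return min(100, remaining * 100 // max_tokens)
```

The clamping matters at the boundaries: an over-full history reports 0% rather than a negative number.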
## codex-rs/tui: compact Markdown rendering option
- Added `markdown_compact` config flag under UI settings to collapse heading-content spacing when enabled.
- When enabled, headings render immediately adjacent to content with no blank line between them.
- Updated Markdown rendering in chat UI and logs to honor compact mode globally (diffs, docs, help messages).
- Added unit tests covering H1-H6 heading spacing for both compact and default modes.
## codex-rs: document MCP servers example in README
- Added an inline TOML snippet under “Model Context Protocol Support” in `codex-rs/README.md` showing how to configure external `mcp_servers` entries in `~/.codex/config.toml`.
- Documented `codex mcp` behavior: JSON-RPC over stdin/stdout, optional sandbox, no ephemeral container, default `codex` tool schema, and example ListTools/CallTool schema.
## Documentation tasks
## codex-rs/tui: interactive shell-command affordance via hotkey
- Bound `Ctrl+M` to open a ShellCommandView overlay for arbitrary container shell input.
- Toggled shell-command mode with `Ctrl+M` to enter or exit prompt, with styled border in shell mode.
- Executed commands asynchronously (`sh -c`) and recorded outputs inline in conversation history.
- Added unit tests for ShellCommandView event emission and shell-mode toggling behavior.
Tasks live under `agentydragon/tasks/` as individual Markdown files. Please update each task's **Status** and **Implementation** sections in place rather than maintaining a static list here.
### Branch & Worktree Workflow
- **Branch convention**: work on each task in its own branch named `agentydragon-<task-id>-<task-slug>`, to avoid refname conflicts.
- **Worktree helper**: in `agentydragon/tasks/`, run:
-
- ```sh
- # Accept a full slug (NN-slug) or two-digit task ID (NN), optionally multiple; --tmux opens each in its own tmux pane and auto-commits each task as its Developer agent finishes:
- agentydragon/tools/create_task_worktree.py [--agent] [--tmux] [--interactive] [--shell] [--skip-presubmit] <task-slug|NN> [<task-slug|NN>...]
- ```
-
- Without `--agent`, this creates or reuses a worktree at
- `agentydragon/tasks/.worktrees/<task-id>-<task-slug>` off the `agentydragon` branch.
- Internally, the helper uses CoW hydration instead of a normal checkout: it registers the worktree with `git worktree add --no-checkout`, then performs a filesystem-level reflink
- of all files (macOS: `cp -cRp`; Linux: `cp --reflink=auto`), falling back to `rsync` if reflinks aren't supported. This makes new worktrees appear nearly instantly on supported filesystems while
- preserving untracked files.
- With `--agent`, after setting up a new worktree it runs presubmit pre-commit checks (aborting with a clear message on failure unless `--skip-presubmit` is passed), then launches the Developer Codex agent (using `prompts/developer.md` and the task file).
- After the Developer agent exits, if the task's **Status** is set to `Done`, it automatically runs the Commit agent helper to stage fixes and commit the work.
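The naming conventions above can be expressed as a small helper. This is a sketch under stated assumptions: `task_locations` is a hypothetical function (it is not taken from `create_task_worktree.py`), and the accepted slug characters are a guess.

```python
import re
from pathlib import PurePosixPath

# Assumed slug shape: two-digit task ID, a dash, then lowercase words.
_SLUG_RE = re.compile(r"^(?P<id>\d{2})-(?P<slug>[a-z0-9][a-z0-9-]*)$")


def task_locations(full_slug: str):
    """Map an `NN-slug` task name to its branch name and worktree path."""
    m = _SLUG_RE.match(full_slug)
    if m is None:
        raise ValueError(f"expected NN-slug, got {full_slug!r}")
    name = f"{m['id']}-{m['slug']}"
    branch = f"agentydragon-{name}"
    worktree = PurePosixPath("agentydragon/tasks/.worktrees") / name
    return branch, worktree
```

For example, `task_locations("07-live-reload")` yields the branch `agentydragon-07-live-reload` and a worktree under `agentydragon/tasks/.worktrees/07-live-reload`.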
**Commit agent helper**: in `agentydragon/tasks/`, run:
```sh
# Generate and apply commit(s) for completed task(s) in their worktrees:
agentydragon/tools/launch_commit_agent.py <task-slug|NN> [<task-slug|NN>...]
```
After the Developer agent finishes and updates the task file, the Commit agent will write the commit message to a temporary file and then commit using that file (`git commit -F`). An external orchestrator can then stage files and run pre-commit hooks as usual. You do not need to run `git commit` manually.
---
*This README was autogenerated to summarize changes on the `agentydragon` branch.*


@@ -1,38 +0,0 @@
# Agent Handoff Workflow
This document explains the multi-agent handoff pattern used for task development and commits
in the `agentydragon` workspace. It consolidates shared guidance so individual agent prompts
do not need to repeat these details.
## 1. Developer Agent
- **Scope**: Runs inside a sandboxed git worktree for a single task branch (`agentydragon-<ID>-<slug>`).
- **Actions**:
1. If the task's **Status** is `Needs input`, stop immediately and await further instructions; do **not** implement code changes or run pre-commit hooks.
2. Update the task Markdown file's **Status** to `Done` when implementation is complete.
3. Implement the code changes for the task.
4. Run `pre-commit run --files $(git diff --name-only)` to apply and stage any autofix changes.
5. **Do not** run `git commit`.
## 2. Commit Agent
- **Scope**: Runs in the sandbox (read-only `.git`) or equivalent environment.
- **Actions**:
1. Emit exactly one line to stdout: the commit message prefixed `agentydragon(tasks): `
summarizing the task's **Implementation** section.
2. Stop immediately.
## 3. Orchestrator
- **Scope**: Outside the sandbox with full Git permissions.
- **Actions**:
1. Stage all changes: `git add -u`.
2. Run `pre-commit run --files $(git diff --name-only --cached)`.
3. Read the commit message and run `git commit -m "$MSG"`.
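Before running step 3, an orchestrator might check that the Commit agent's output has the expected shape. The following is a minimal sketch; `parse_commit_message` is a hypothetical helper, and rejecting an empty summary is an added assumption.

```python
PREFIX = "agentydragon(tasks): "


def parse_commit_message(stdout: str) -> str:
    """Validate the Commit agent's stdout and return the commit message."""
    lines = stdout.strip().splitlines()
    if len(lines) != 1:
        raise ValueError(f"expected exactly one line, got {len(lines)}")
    msg = lines[0].strip()
    # After stripping, a bare prefix loses its trailing space and fails here.
    if not msg.startswith(PREFIX):
        raise ValueError(f"message must start with {PREFIX!r} and a summary")
    return msg
```

Failing loudly here keeps stray agent commentary out of the Git history, since the message is passed to `git commit -m` verbatim.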
## 4. Status & Launch
- Use `agentydragon_task.py status` to view tasks (including those in `.done/`).
- Summaries:
- **Merged:** tasks with no branch/worktree.
- **Ready to merge:** tasks marked Done with branch commits ahead.
- **Unblocked:** tasks with no outstanding dependencies.
- The script also prints a `agentydragon/tools/create_task_worktree.py --agent --tmux <IDs>` command for all unblocked tasks.
This guide centralizes the handoff workflow for all agents.


@@ -1,16 +0,0 @@
## Commit Agent Prompt
Refer to `agentydragon/WORKFLOW.md` for the overall Developer→Commit→Orchestrator handoff workflow.
You are the **Commit** Codex agent for the `codex` repository. Your job is to stage and commit the changes made by the Developer agent.
Your sole responsibility is to generate the Git commit message on stdout.
Do **not** modify any files or run Git commands; this agent must remain sandbox-friendly.
When you run, **output exactly** the desired commit message (with no extra commentary) on stdout. The message must:
- Be prefixed with `agentydragon(tasks): `
- Concisely summarize the work performed as described in the task's **Implementation** section.
Stop immediately after emitting the commit message. An external orchestrator will stage, run hooks, and commit using this message.
Below, you will get the task description the agent got. But still verify that the agent actually did what it was supposed to, and adjust the commit message according to what is actually implemented, DO NOT just copy what's in the task file.


@@ -1,24 +0,0 @@
## Developer Agent Prompt
Refer to `agentydragon/WORKFLOW.md` for the overall Developer→Commit→Orchestrator handoff workflow.
You are the **Developer** Codex agent for the `codex` repository. You are running inside a dedicated git worktree for a single task branch.
Use the task Markdown file under `agentydragon/tasks/` as your progress tracker: update its **Status** and **Implementation** sections to record your progress.
Before making any changes, read the task definition in `agentydragon/tasks/` and note that its **Status** and **Implementation** sections are placeholders.
After reviewing, update the task's **Status** to "In progress" and fill in the **Implementation** section with your planned approach.
If the **Implementation** section is blank or does not describe your intended design and steps, populate it with a concise high-level plan before proceeding.
Then proceed directly to implement the full functionality in the codebase as a single atomic unit—regardless of how many components are involved, do not split the work into separate sub-steps or pause to ask whether to decompose it.
Do not pause to seek user confirmation after editing the Markdown;
only ask clarifying questions if you encounter genuine ambiguities in the requirements.
At any point, you may set the task's **Status** to any valid state (e.g. Not started, In progress, Needs input, Needs manual review, Done, Cancelled) as appropriate. Use **Needs input** to request further clarification or resources before proceeding.
When you have finished working on the task file:
- If the task's **Status** is "Needs input", stop immediately and await further instructions; do **not** run pre-commit hooks or invoke the Commit agent.
- Otherwise, set the task's **Status** to "Done".
- Run the repository's pre-commit hooks on all changed files (e.g. `pre-commit run --files <changed-files>`), and stage any autofix changes.
- Do **not** stage or commit beyond hook-driven fixes. Instead, stop and await the Commit agent to record your updates.
Then stop and await further instructions.


@@ -1,54 +0,0 @@
# Project Manager Agent Prompt
You are the **Project Manager** Codex agent for the `codex` repository.
Refer to `agentydragon/WORKFLOW.md` for the standard Developer→Commit→Orchestrator handoff workflow.
Your responsibilities include:
- **Reading documentation**: Load and understand all relevant docs in this repo (especially those defining task, worktree, and branch conventions, as well as each task file and top-level README files).
- **Task orchestration**: Maintain the list of tasks, statuses, and dependencies; plan waves of work; and generate commands to launch work in parallel using `agentydragon/tools/create_task_worktree.py` (or the legacy `agentydragon/tools/create-task-worktree.sh`) with `--agent` and `--tmux`.
- **Task creation**: When creating a new task stub, review the descriptions of all existing tasks; set the `dependencies` front-matter field to list the tasks that must be completed before work on this task can begin; and include a brief rationale as a Markdown comment (e.g., `<!-- rationale: depends on tasks X and Y because ... -->`) explaining why these dependencies are required and why other tasks are not.
- **Live coordination**: Continuously monitor and report progress, adjust the plan as tasks complete or new ones appear, and surface any blockers.
- **Worktree monitoring**: Check each task's worktree for uncommitted changes or dirty state to detect agents still working or potential crashes, and report their status as in-progress or needing attention.
- When displaying the task-status table, highlight dirty worktrees in red and tasks marked Done or Merged in green; exclude tasks that are Merged with no branch and no worktree from the main table (they should instead be listed in a green “Done & merged:” summary at the bottom), and filter such merged tasks out of other tasks' dependency lists.
- **Background polling**: On user request, enter a sleep-and-scan loop (e.g. 5 min interval) to detect tasks marked “Done” in their Markdown; for each completed task, review its branch worktree, check for merge conflicts, propose merging cleanly mergeable branches, and suggest conflict-resolution steps for any that aren't cleanly mergeable.
- **Manager utilities**: Create and maintain utility scripts under `agentydragon/tools/manager_utils/` to support your work (e.g., branch scanning, conflict checking, merge proposals, polling loops). Include clear documentation (header comments or docstrings with usage examples) in each script, and invoke these scripts in your workflow.
- **Merge orchestration**: When proposing merges of completed task branches into the integration branch, consider both single-branch and octopus (multi-branch) merges. Detect and report conflicts between branches as well as with the integration branch, and recommend resolution steps or merge ordering to avoid or resolve conflicts.
### First Actions
1. For each task branch (named `agentydragon-<task-id>-<task-slug>`), **without changing the current working directory's Git HEAD or modifying its status**, create or open a dedicated worktree for that branch (e.g. via `agentydragon/tools/create_task_worktree.py <task-slug>`) and read the task's Markdown copy in that worktree to extract and list the task number, title, live **Status**, and dependencies. *(Always read the **Status** and dependencies from the copy of the task file in the branch's worktree, never from master/HEAD.)*
2. Produce a one-line tmux launch command to spin up only those tasks whose dependencies are satisfied and can actually run in parallel, following the conventions defined in repository documentation.
3. Describe the high-level wave-by-wave plan and explain which tasks can run in parallel.
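One way to derive a wave-by-wave plan from the `dependencies` front-matter is to repeatedly peel off the tasks whose dependencies are already satisfied. The sketch below assumes the dependencies have already been parsed into a dictionary; `plan_waves` is a hypothetical helper, not one of the scripts under `agentydragon/tools/`.

```python
def plan_waves(dependencies, done=()):
    """Group tasks into waves that can each run in parallel.

    `dependencies` maps a task ID to the IDs it depends on; `done` lists
    tasks that are already merged. Each returned wave depends only on
    earlier waves or on `done`.
    """
    finished = set(done)
    remaining = {t: set(d) for t, d in dependencies.items() if t not in finished}
    waves = []
    while remaining:
        ready = sorted(t for t, deps in remaining.items() if deps <= finished)
        if not ready:
            # Nothing can start: a cycle, or a dependency that is never defined.
            raise ValueError(f"unsatisfiable dependencies among {sorted(remaining)}")
        waves.append(ready)
        finished.update(ready)
        for t in ready:
            del remaining[t]
    return waves
```

For tasks `02` and `04` with no dependencies and task `07` depending on both, this returns `[["02", "04"], ["07"]]`; the first wave is the set of IDs to pass to the `--agent --tmux` launcher.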
### Usage Examples
```bash
# Parallel worktree launch
agentydragon/tools/create_task_worktree.py --agent --tmux 02 04 07
# Wave-by-wave plan
# Wave 1: tasks 02,04 (no unmet deps)
# Wave 2: task 07 (depends on 02,04)
# Background polling loop (every 5 min)
while true; do
python3 agentydragon/tools/check_tasks.py && \
python3 agentydragon/tools/launch_commit_agent.py $(python3 agentydragon/tools/find_done_tasks.py)
sleep 300
done
# Dispose a task worktree
python3 agentydragon/tools/manager_utils/agentydragon_task.py dispose 07
```
More functionality and refinements will be added later. Begin by executing these steps and await further instructions.
*If instructed, enter a background polling loop (sleep for a configured interval, e.g. 5 minutes) to watch for tasks whose Markdown status is updated to “Done” and then prepare review/merge steps for only those branches.*
Once a task branch is merged cleanly into the integration branch, dispose of its worktree and delete its Git branch. To record that merge, use `python3 agentydragon/tools/manager_utils/agentydragon_task.py set-status <task-id> Merged`.
Use `python3 agentydragon/tools/manager_utils/agentydragon_task.py dispose <task-id>` to remove the worktree and branch without changing the status (e.g. for cancelled tasks).
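The wave-by-wave planning described above can be sketched in Python as a repeated "pick everything whose dependencies are done" pass. This is a minimal illustration under stated assumptions: `plan_waves` and its arguments are hypothetical names, not part of the existing `agentydragon/tools` scripts, and the sample input mirrors the usage example (task 07 depends on 02 and 04).

```python
def plan_waves(dependencies, merged=frozenset()):
    """Group unmerged tasks into waves (hypothetical helper).

    dependencies: dict mapping task id -> iterable of task ids it depends on.
    merged: ids of tasks that are already merged and therefore satisfied.
    """
    done = set(merged)
    remaining = {t: set(d) for t, d in dependencies.items() if t not in done}
    waves = []
    while remaining:
        # A task is ready once all of its dependencies are done.
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(
                f"dependency cycle or unknown dependency among: {sorted(remaining)}"
            )
        waves.append(ready)
        done.update(ready)
        for t in ready:
            del remaining[t]
    return waves


if __name__ == "__main__":
    print(plan_waves({"02": [], "04": [], "07": ["02", "04"]}))
```

Tasks inside one wave have no unmet dependencies on each other, so they are candidates for a single parallel tmux launch; whether they also avoid merge conflicts still has to be checked separately.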

@@ -1,5 +0,0 @@
Read the full diff between HEAD and main and produce a list of everything that was added/removed.
Include examples of how to use the features, how to configure them, etc.
Use Markdown format. Write into $(git rev-parse --show-toplevel)/agentydragon/CHANGES.md. Delete it if it already exists.
Only document changes under codex-rs.
Do not include things that already exist on main branch - only what was changed.

@@ -1,4 +0,0 @@
read the description of all tasks in agentydragon/tasks/*.md and relevant context in codex-rs. for every task: disregard existing dependency declarations in the frontmatter. think long about
why and how they might depend on each other and if there's any way they might conflict and whether the overall picture of how they fit together makes sense. for each, *REGENERATE* the
dependency list in frontmatter to the list of tasks that must be done before each given task becomes unblocked. no need to populate this for already merged tasks. also no need to list
merged tasks inside any dependency list.

@@ -1,66 +0,0 @@
You are the AI “Scaffolding Assistant” for the `codex` monorepo. Your mission is to generate, in separate commits, all of the initial scaffolding needed for the
agentydragon-driven task workflow:
1. **Task stubs**
- Create `agentydragon/tasks/task-template.md`.
- Create numbered task stubs (`01-*.md`, `02-*.md`, …) for each planned feature (mounting, approval predicates, live-reload, editor integration, etc.), filling in
title, “Status”, “Goal”, and sections for “Acceptance Criteria”, “Implementation”, and “Notes”.
2. **Worktree launcher**
- Implement `agentydragon/tools/create_task_worktree.py` with:
- `--agent` mode to spin up a Codex agent in the worktree,
- `--tmux` to tile panes for multiple tasks in a single tmux session,
- two-digit or slug ID resolution.
- Ensure usage, help text, and numeric/slug handling are correct.
3. **Helper scripts**
- Add `agentydragon/tasks/review-unmerged-task-branches.sh` to review and merge task branches.
- Add `agentydragon/tools/launch-project-manager.sh` to invoke the Project Manager agent prompt.
4. **Project-manager prompts**
- Create `agentydragon/prompts/manager.md` containing the following Project Manager agent prompt:
```
# Project Manager Agent Prompt
You are the **Project Manager** Codex agent for the `codex` repository. Your responsibilities include:
- **Reading documentation**: Load and understand all relevant docs in this repo (especially those defining task, worktree, and branch conventions, as well as each task file and top-level README files).
- **Task orchestration**: Maintain the list of tasks, statuses, and dependencies; plan waves of work; and generate shell commands to launch work on tasks in parallel using `create_task_worktree.py` with `--agent` and `--tmux`.
- **Live coordination**: Continuously monitor and report progress, adjust the plan as tasks complete or new ones appear, and surface any blockers.
- **Worktree monitoring**: Check each task's worktree for uncommitted changes or dirty state to detect agents still working or potential crashes, and report their status as in-progress or needing attention.
- **Background polling**: On user request, enter a sleep-and-scan loop (e.g. 5 min interval) to detect tasks marked “Done” in their Markdown; for each completed task, review its branch worktree, check for merge conflicts, propose merging cleanly mergeable branches, and suggest conflict-resolution steps for any that aren't cleanly mergeable.
- **Manager utilities**: Create and maintain utility scripts under `agentydragon/tools/manager_utils/` to support your work (e.g., branch scanning, conflict checking, merge proposals, polling loops). Include clear documentation (header comments or docstrings with usage examples) in each script, and invoke these scripts in your workflow.
- **Merge orchestration**: When proposing merges of completed task branches into the integration branch, consider both single-branch and octopus (multi-branch) merges. Detect and report conflicts between branches as well as with the integration branch, and recommend resolution steps or merge ordering to avoid or resolve conflicts.
### First Actions
1. For each task branch (named `agentydragon-<task-id>-<task-slug>`), **without changing the current working directory's Git HEAD or modifying its status**, create or open a dedicated worktree for that branch (e.g. via `create_task_worktree.py <task-slug>`) and read the task's Markdown copy under that worktree's `agentydragon/tasks/` to extract and list the task number, title, live **Status**, and dependencies. *(Always read the **Status** and dependencies from the copy of the task file in the branch's worktree, never from master/HEAD.)*
2. Produce a one-line tmux launch command to spin up only those tasks whose dependencies are satisfied and can actually run in parallel, following the conventions defined in repository documentation.
3. Describe the high-level wave-by-wave plan and explain which tasks can run in parallel.
More functionality and refinements will be added later. Begin by executing these steps and await further instructions.
```
5. **Wave-by-wave plan**
- Draft a human-readable plan outlining task dependencies and four “waves” of work, indicating which tasks can run in parallel.
6. **Bootstrap commands**
- Provide concrete shell/`rg`/`tmux` one-liner examples to launch Wave 1 (e.g. tasks 06, 03, 08) in parallel.
- Provide a single tmux one-liner to spin up all unblocked tasks.
**Before you begin**, read the existing docs under `agentydragon/tasks/`, top-level `README.md` and `oaipackaging/README.md` so you fully understand the context and
conventions.
**Commit strategy**
- Commit each major component (tasks, script, helper scripts, prompts, plan) as its own Git commit.
- Follow our existing commit-message style: prefix with `agentydragon(tasks):`, `agentydragon:`, etc.
- Don't batch everything into one huge commit; keep each logical piece isolated for easy review.
**Reporting**
After each commit, print a short status message (e.g. “✅ Task stubs created”, “✅ create_task_worktree.py implemented”, etc.) and await confirmation before continuing
to the next step.
---
Begin now by listing the current task directory contents and generating `task-template.md`.

@@ -1 +0,0 @@
# Keep this directory in version control

@@ -1,64 +0,0 @@
+++
id = "01"
title = "Dynamic Mount-Add and Mount-Remove Commands"
status = "Merged"
dependencies = ""
last_updated = "2025-06-25T01:40:09.501150"
+++
# Task 01: Dynamic Mount-Add and Mount-Remove Commands
> *This task is specific to codex-rs.*
## Status
**General Status**: Merged
**Summary**: Implemented inline DSL and interactive dialogs for `/mount-add` and `/mount-remove`, with dynamic sandbox policy updates.
## Goal
Implement the `/mount-add` and `/mount-remove` slash commands in the TUI, supporting two modes:
1. **Inline DSL**: e.g. `/mount-add host=/path/to/host container=/path/in/agent mode=rw`
2. **Interactive dialog**: if the user just types `/mount-add` or `/mount-remove` without args, pop up a prompt to fill in `host`, `container`, and optional `mode` fields.
These commands should:
- Create or remove symlinks (or real directories) under the current working directory.
- Update the in-memory `SandboxPolicy` to grant or revoke read/write permission for the host path.
- Emit confirmation or error messages into the TUI log pane.
## Acceptance Criteria
- Users can type `/mount-add host=... container=... mode=...` and the mount is created immediately.
- Users can type `/mount-add` alone to open a small TUI form prompting for the three fields.
- Symmetrically for `/mount-remove` by container path.
- The `sandbox_policy` is updated so subsequent shell commands can read/write the newly mounted folder.
## Implementation
**How it was implemented**
- Added two new slash commands (`mount-add`, `mount-remove`) to the TUI's `slash-command` popup.
- Inline DSL parsing: commands typed as `/mount-add host=... container=... mode=...` or `/mount-remove container=...` are detected and handled immediately by parsing key/value args, performing the mount/unmount, and updating the `Config.sandbox_policy` in memory.
- Interactive dialogs: selecting `/mount-add` or `/mount-remove` without args opens a bottom-pane form (`MountAddView` or `MountRemoveView`) that prompts sequentially for the required fields and then triggers the same mount logic.
- Mount logic implemented in `do_mount_add`/`do_mount_remove`:
- Creates/removes a symlink under `cwd` pointing to the host path (`std::os::unix::fs::symlink` on Unix, platform equivalents on Windows).
- Uses new `SandboxPolicy` methods (`allow_disk_write_folder`/`revoke_disk_write_folder`) to grant or revoke `DiskWriteFolder` permissions for the host path.
- Emits success or error messages via `tracing::info!`/`tracing::error!`, which appear in the TUI log pane.
**How it works**
1. **Inline DSL**
- User types:
```
/mount-add host=/path/to/host container=path/in/cwd mode=ro
```
- The first-stage popup intercepts the mount-add command with args, dispatches `InlineMountAdd`, and the app parses the args and runs the mount logic immediately.
2. **Interactive dialog**
- User types `/mount-add` (or selects it via the popup) without args.
- A small form appears that prompts for `host`, `container`, then `mode`.
- Upon completion, the same mount logic runs.
3. **Unmount**
- `/mount-remove container=...` (inline) or `/mount-remove` (interactive) remove the symlink and revoke write permissions.
4. **Policy update**
- `allow_disk_write_folder` appends a `DiskWriteFolder` permission for new mounts.
- `revoke_disk_write_folder` removes the corresponding permission on unmount.
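The inline DSL parsing step can be illustrated with a short Python sketch. It is not the codex-rs implementation (which is Rust); `MountArgs` and `parse_mount_add` are hypothetical names, the default `mode` of `rw` is an assumption (the task only says `mode` is optional), and paths containing spaces are deliberately not handled.

```python
from dataclasses import dataclass


@dataclass
class MountArgs:
    host: str
    container: str
    mode: str = "rw"  # assumed default; the source only says mode is optional


def parse_mount_add(line: str) -> MountArgs:
    """Parse '/mount-add host=... container=... mode=...' into MountArgs."""
    parts = line.split()
    if not parts or parts[0] != "/mount-add":
        raise ValueError("not a /mount-add command")
    fields = {}
    for token in parts[1:]:
        key, sep, value = token.partition("=")
        if not sep or not value:
            raise ValueError(f"expected key=value, got {token!r}")
        if key not in ("host", "container", "mode"):
            raise ValueError(f"unknown key {key!r}")
        fields[key] = value
    missing = [k for k in ("host", "container") if k not in fields]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    if fields.get("mode", "rw") not in ("ro", "rw"):
        raise ValueError("mode must be 'ro' or 'rw'")
    return MountArgs(**fields)
```

A command with no arguments fails with a "missing required field(s)" error here; in the TUI that case is instead routed to the interactive dialog.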
## Notes
- This builds on the static `[[sandbox.mounts]]` support introduced earlier.

@@ -1,42 +0,0 @@
+++
id = "03"
title = "Live Config Reload and Prompt on Changes"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T05:36:17.783726"
+++
# Task 03: Live Config Reload and Prompt on Changes
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: Live config watcher, diff prompt, and reload integration implemented.
## Goal
Detect changes to the user `config.toml` file while a session is running and prompt the user to apply or ignore the updated settings.
## Acceptance Criteria
- A background file watcher watches `$CODEX_HOME/config.toml` (or active user config path).
- On any write event, compute a unified diff between the in-memory config and the on-disk file.
- Pause the agent, display the diff in the TUI bottom pane, and offer two actions: `Apply new config now` or `Continue with old config`.
- If the user applies, re-parse the config, merge overrides, and resume using the new settings. Otherwise, discard changes and resume.
## Implementation
**How it was implemented**
- Added `codex_tui::config_reload::generate_diff` to compute unified diffs via the `similar` crate (with a unit test).
- Spawned a `notify`-based filesystem watcher thread in `tui::run_main` that debounces write events on `$CODEX_HOME/config.toml`, generates diffs against the last-read contents, and posts `AppEvent::ConfigReloadRequest(diff)`.
- Introduced `AppEvent` variants (`ConfigReloadRequest`, `ConfigReloadApply`, `ConfigReloadIgnore`) and wired them in `App::run` to display a new `BottomPaneView` overlay.
- Created `BottomPaneView` implementation `ConfigReloadView` to render the diff and handle `<Enter>`/`<Esc>` for apply or ignore.
- On apply, reloaded `Config` via `Config::load_with_cli_overrides`, updated both `App.config` and `ChatWidget` (rebuilding its bottom pane with updated settings).
**How it works**
- The watcher thread detects on-disk changes and pushes a diff request into the UI event loop.
- Upon `ConfigReloadRequest`, the TUI bottom pane overlays the diff view and blocks normal input.
- `<Enter>` applies the new config (re-parses and updates runtime state); `<Esc>` dismisses the overlay and continues with the old settings.
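The diff step can be sketched in Python with the standard-library `difflib` module, standing in for the Rust `similar` crate that the implementation uses. The function name mirrors `generate_diff` from the description, but the signature and the header labels are assumptions made for illustration.

```python
import difflib


def generate_diff(old: str, new: str, path: str = "config.toml") -> str:
    """Return a unified diff between the last-read and on-disk config text."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{path} (in memory)",  # label chosen for this sketch
        tofile=f"{path} (on disk)",      # label chosen for this sketch
    )
    return "".join(lines)


if __name__ == "__main__":
    print(generate_diff('model = "a"\n', 'model = "b"\n'), end="")
```

An empty result means the write event did not change the contents, so a watcher built this way could skip the reload prompt entirely.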
## Notes
- Leverage a crate such as `notify` for FS events and `similar` or `diff` for unified diff generation.

@@ -1,42 +0,0 @@
+++
id = "06"
title = "External Editor Integration for Prompt Entry"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T02:40:09.505778"
+++
# Task 06: External Editor Integration for Prompt Entry
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: External editor integration for prompt entry implemented.
## Goal
Allow users to spawn an external editor (e.g. Neovim) to compose or edit the chat prompt. The prompt box should update with the editor's contents when closed.
## Acceptance Criteria
- A slash command `/edit-prompt` (or `Ctrl+E`) launches the user's preferred editor on a temporary file pre-populated with the current draft.
- Upon editor exit, the draft is re-read into the composer widget.
- Configurable via `editor = "${VISUAL:-${EDITOR:-nvim}}"` setting in `config.toml`.
## Implementation
**How it was implemented**
- Added `editor` option to `[tui]` section in `config.toml`, defaulting to `${VISUAL:-${EDITOR:-nvim}}`.
- Exposed the `tui.editor` setting in the `codex-core` config model (`config_types.rs`) and wired it through to the TUI.
- Added a new slash-command variant `EditPrompt` in `tui/src/slash_command.rs` to trigger external-editor mode.
- Implemented `ChatComposer::open_external_editor()` in `tui/src/bottom_pane/chat_composer.rs`:
- Creates a temporary file pre-populated with the current draft prompt.
- Launches the configured editor (from `VISUAL`/`EDITOR` with `nvim` fallback) in a blocking subprocess.
- Reads the edited contents back into the `TextArea` on editor exit.
- Wired both `Ctrl+E` and the `/edit-prompt` slash command to invoke `open_external_editor()`.
- Updated `config.md` to document the new `editor` setting under `[tui]`.
**How it works**
- Pressing `Ctrl+E`, or typing `/edit-prompt` and hitting Enter, spawns the user's preferred editor on a temporary file containing the current draft.
- When the editor process exits, the TUI reads the file back and updates the chat composer with the edited text.
- The default editor is determined by `VISUAL`, then `EDITOR`, falling back to `nvim` if neither is set.
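The editor lookup and temp-file round trip can be sketched in Python as follows. This is an illustration only, not the Rust `ChatComposer::open_external_editor()` code; `resolve_editor` and `edit_in_external_editor` are hypothetical names, and the `.md` suffix is an assumption.

```python
import os
import shlex
import subprocess
import tempfile


def resolve_editor(env=None) -> str:
    """VISUAL, then EDITOR, then nvim; empty values count as unset."""
    env = os.environ if env is None else env
    return env.get("VISUAL") or env.get("EDITOR") or "nvim"


def edit_in_external_editor(draft: str, editor: str) -> str:
    """Write the draft to a temp file, block on the editor, read it back."""
    with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as f:
        f.write(draft)
        path = f.name
    try:
        # shlex.split lets the setting carry arguments, e.g. "code -w".
        subprocess.run([*shlex.split(editor), path], check=True)
        with open(path) as f:
            return f.read()
    finally:
        os.unlink(path)
```

Treating an empty variable as unset matches the `${VISUAL:-${EDITOR:-nvim}}` shell expansion used in the config default.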

@@ -1,37 +0,0 @@
+++
id = "07"
title = "Undo Feedback Decision with Esc Key"
status = "Merged"
dependencies = "01,04,10,12,16,17"
last_updated = "2025-06-25T01:40:09.506146"
+++
# Task 07: Undo Feedback Decision with Esc Key
> *This task is specific to codex-rs.*
## Status
**General Status**: Merged
**Summary**: ESC key now cancels feedback entry and returns to the select menu, preserving any entered text; implementation and tests added.
## Goal
Enhance the user-approval dialog so that if the user opted to leave feedback (“No, enter feedback”) they can press `Esc` to cancel the feedback flow and return to the previous approval choice menu (e.g. “Yes, proceed” vs. “No, enter feedback”).
## Acceptance Criteria
- While the feedback-entry textarea is active, pressing `Esc` closes the feedback editor and reopens the yes/no confirmation dialog.
- The cancellation must restore the dialog state without losing any partially entered feedback text.
## Implementation
**How it was implemented**
- In `tui/src/user_approval_widget.rs`, updated `UserApprovalWidget::handle_input_key` so that pressing `Esc` in input mode switches `mode` back to `Select` (rather than sending a deny decision), and restores `selected_option` to the feedback entry item without clearing the input buffer.
- Added a unit test in the same module to verify that `Esc` cancels input mode, preserves the feedback text, and does not emit any decision event.
**How it works**
- When the widget is in `Mode::Input` (feedback-entry), receiving `KeyCode::Esc` resets `mode` to `Select` and sets `selected_option` to the index of the “Edit or give feedback” option.
- The `input` buffer remains intact, so any partially typed feedback is preserved in case the user re-enters feedback mode.
- No approval decision is sent on `Esc`, so the modal remains active and the user can still approve, deny, or re-enter feedback.
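The `Esc` handling described above amounts to a small state machine. The Python model below is a sketch under assumptions, not the Rust `UserApprovalWidget`: the class, the string key names, and the `FEEDBACK_OPTION` index are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Mode(Enum):
    SELECT = auto()
    INPUT = auto()


FEEDBACK_OPTION = 1  # hypothetical index of the feedback entry in the menu


@dataclass
class ApprovalWidget:
    mode: Mode = Mode.SELECT
    selected_option: int = 0
    input: str = ""
    decision: Optional[str] = None  # stays None until a decision is sent

    def handle_key(self, key: str) -> None:
        if self.mode is Mode.INPUT:
            if key == "esc":
                # Cancel feedback entry: back to the menu, keep the text,
                # and send no decision.
                self.mode = Mode.SELECT
                self.selected_option = FEEDBACK_OPTION
            elif key == "enter":
                self.decision = f"deny: {self.input}"
            elif len(key) == 1:
                self.input += key
        elif key == "enter" and self.selected_option == FEEDBACK_OPTION:
            self.mode = Mode.INPUT
```

Because `input` is never cleared on `Esc`, re-entering feedback mode resumes from the partially typed text.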
## Notes
- Changes in `tui/src/user_approval_widget.rs` to treat `Esc` in input mode as a cancel-feedback action and added corresponding tests.

@@ -1,52 +0,0 @@
+++
id = "08"
title = "Set Shell Title to Reflect Session Status"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T04:06:55.265790"
+++
# Task 08: Set Shell Title to Reflect Session Status
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: Implemented session title persistence, `/set-title` slash command, and real-time ANSI updates in both TUI and exec clients.
## Goal
Allow the CLI to update the terminal title bar to reflect the current session status—executing, thinking (sampling), idle, or waiting for approval decision—and persist the title with the session. Users should also be able to explicitly set a custom title.
## Acceptance Criteria
- Implement a slash command or API (`/set-title <title>`) for users to explicitly set the session title.
- Persist the title in session metadata so that on resume the last title is restored.
- Dynamically update the shell/terminal title in real time based on session events:
- Executing: use a play symbol (e.g. ▶)
- Thinking/sampling: use an hourglass or brain symbol (e.g. ⏳)
- Idle: use a green dot or sleep symbol (e.g. 🟢)
- Waiting for approval decision: use an attention-grabbing symbol (e.g. ❗)
- Ensure title updates work across Linux, macOS, and Windows terminals via ANSI escape sequences.
## Implementation
**Note**: Populate this section with a concise high-level plan before beginning detailed implementation.
**Planned approach**
- Extend the session protocol schema (`SessionConfiguredEvent`) in `codex-rs/core` to include an optional `title` field and introduce a new `SessionUpdatedTitleEvent` type.
- Add a `SetTitle { title: String }` variant to the `Op` enum for custom titles and implement the `/set-title <text>` slash command in the TUI crates (`tui/src/slash_command.rs`, `tui/src/app_event.rs`, and `tui/src/app.rs`).
- Modify the core agent loop to handle `Op::SetTitle`: persist the new title in session metadata, emit a `SessionUpdatedTitleEvent`, and include the persisted title in `SessionConfiguredEvent` on startup/resume.
- Implement event listeners in both the interactive TUI (`tui/src/chatwidget.rs`) and non-interactive exec client (`exec/src/event_processor.rs`) that respond to session, title, and lifecycle events (session start, task begin/end, reasoning, idle, approval) by emitting ANSI escape sequences (`\x1b]0;<symbol> <title>\x07`) to update the terminal title bar.
- Choose consistent Unicode symbols for each session state—executing (▶), thinking (⏳), idle (🟢), awaiting approval (❗)—and apply these as status indicators prefixed to the title.
- On session startup or resume, restore the last persisted title or fall back to a default if none exists.
**How it works**
- Users type `/set-title MyTitle` to set a custom session title; the core persists it and broadcasts a `SessionUpdatedTitleEvent`.
- Clients print the appropriate ANSI escape code to update the terminal title before rendering UI or logs, reflecting real-time session state via the selected status symbol prefix.
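A minimal Python sketch of the title update: it builds the `ESC ] 0 ; <text> BEL` sequence quoted in the plan and prefixes the status symbol. The symbols come from the acceptance criteria; the state keys and function names are assumptions made for this example.

```python
import sys

# Symbols from the acceptance criteria; the dictionary keys are hypothetical.
STATUS_SYMBOLS = {
    "executing": "▶",
    "thinking": "⏳",
    "idle": "🟢",
    "approval": "❗",
}


def title_sequence(state: str, title: str) -> str:
    """Build the escape sequence that sets the terminal title."""
    return f"\x1b]0;{STATUS_SYMBOLS[state]} {title}\x07"


def set_terminal_title(state: str, title: str, stream=sys.stdout) -> None:
    """Write the sequence and flush so the title changes immediately."""
    stream.write(title_sequence(state, title))
    stream.flush()
```

For example, `set_terminal_title("thinking", "MyTitle")` would show `⏳ MyTitle` in terminals that honor this sequence.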
## Notes
- Use ANSI escape code `\033]0;<title>\007` to set the terminal title.
- Extend the session JSON schema to include a `title` field.
- Select Unicode symbols that render consistently in common terminal fonts.

@@ -1,52 +0,0 @@
+++
id = "10"
title = "Inspect Container State (Mounts, Permissions, Network)"
status = "Merged"
dependencies = ""
last_updated = "2025-06-25T04:07:56.197523"
+++
# Task 10: Inspect Container State (Mounts, Permissions, Network)
> *This task is specific to codex-rs.*
## Status
**General Status**: Completed
**Summary**: Implemented `codex inspect-env` subcommand, CLI output and TUI bindings, tested in sandbox and headless modes.
## Goal
Provide a runtime command that displays the current sandbox/container environment details—what is mounted where, permission scopes, network access status, and other relevant sandbox policies.
## Acceptance Criteria
- Implement a slash command or CLI subcommand (`/inspect-env` or `codex inspect-env`) that outputs:
- List of bind mounts (host path → container path, mode)
- File-system permission policies in effect
- Network sandbox status (restricted or allowed)
- Runtime TUI statusbar indicators for key sandbox attributes (e.g. network enabled/disabled, mount count, read/write scopes)
- Any additional sandbox rules or policy settings applied
- Format the output in a human-readable table or tree view in the TUI and plaintext for logs.
- Ensure the command works in both interactive TUI sessions and non-interactive (headless) modes.
- Include a brief explanation header summarizing each section to help users understand what they are seeing.
## Implementation
**How it was implemented**
Implemented a new `inspect-env` subcommand in `codex-cli`, reusing `create_sandbox_policy` and `Config::load_with_cli_overrides` to derive the effective sandbox policy and working directory. The code computes read-only or read-write mount entries (root and writable roots), enumerates granted `SandboxPermission`s, and checks `has_full_network_access()`. It then prints a formatted table (via `println!`) and summary counts.
**How it works**
Running `codex inspect-env` loads user overrides, builds the sandbox policy, and:
- Lists mounts (path and mode) in a table.
- Prints each granted permission.
- Shows network status as `enabled`/`disabled`.
- Outputs summary counts for mounts and writable roots.
This command works both in CI/headless and inside the TUI (status-bar integration).
## Notes
- Leverage existing sandbox policy data structures used at startup.
- Reuse TUI table or tree components for formatting (e.g., tui-rs widgets).
- Include clear labels for network status (e.g., `NETWORK: disabled` or `NETWORK: enabled`).

@@ -1,61 +0,0 @@
+++
id = "11"
title = "User-Configurable Approval Predicates"
status = "Merged"
dependencies = "01,04,10,12,16,17"
last_updated = "2025-06-25T01:40:09.508560"
+++
# Task 11: User-Configurable Approval Predicates
> *This task is specific to codex-rs.*
## Status
**General Status**: Merged
**Summary**: Implemented custom approval predicates feature: configuration parsing, predicate invocation logic, tests, and documentation.
## Goal
Allow users to plug in an external executable that makes approval decisions for shell commands based on session context.
## Acceptance Criteria
- Support a new `[[approval_predicates]]` section in `config.toml` for Python-based predicates, each with a `python_predicate_binary = "..."` field (pointing to the predicate executable) and an implicit `never_expire = true` setting.
- Before prompting the user, invoke each configured predicate in order, passing the following (via CLI args or env vars):
- Session ID
- Container working directory (CWD)
- Host working directory (CWD)
- Candidate shell command string
- The predicate must print exactly one of `allow`, `deny`, or `ask` on stdout:
- `allow` → auto-approve and skip remaining predicates
- `deny` → auto-reject and skip remaining predicates
- `ask` → open the standard approval dialog and skip remaining predicates
- If a predicate exits non-zero or outputs anything else, treat it as `ask` and continue to the next predicate.
- Write unit and integration tests covering typical and edge-case predicate behavior.
- Document configuration syntax and behavior in the top-level config docs (`config.md`).
## Implementation
**How it was implemented**
- Added `approval_predicates` field to `ConfigToml` and `Config` in `codex_core::config`, supporting a `python_predicate_binary: PathBuf` and an implicit `never_expire = true`.
- Hooked into the command-approval code path in `codex_core::safety` to invoke each configured predicate executable before showing the approval prompt. Predicates are launched via `std::process::Command` with context passed in environment variables (`CODEX_SESSION_ID`, `CODEX_CONTAINER_CWD`, `CODEX_HOST_CWD`, `CODEX_COMMAND`).
- Parsed each predicate's stdout for exactly `allow`, `deny`, or `ask`, short-circuiting on `allow` or `deny` (auto-approve/auto-reject) and treating failures or unexpected output as `ask` to continue to the next predicate.
- Wrote unit tests for configuration parsing and predicate-invocation behavior, covering exit-code and output edge cases, plus integration tests verifying end-to-end approval decisions.
- Updated `config.md` to document the `[[approval_predicates]]` table syntax, default semantics, and runtime behavior.
**How it works**
When a shell command requires approval, Codex iterates over each entry in `[[approval_predicates]]` in order. For each predicate:
- Launch the configured binary with session context in its environment.
- If it exits successfully and writes `allow`, Codex auto-approves and skips remaining predicates.
- If it writes `deny`, Codex auto-rejects and skips remaining predicates.
- Otherwise (writes `ask`, fails, or emits unexpected output), Codex moves to the next predicate or falls back to the manual approval dialog if none return `allow` or `deny`.
This mechanism lets users automate approval decisions via custom Python scripts while retaining manual control when predicates defer.
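The predicate chain can be sketched in Python as below. This mirrors the flow in "How it works" and is not the Rust code in `codex_core::safety`; the function names, the injectable `run` parameter, and the 10-second timeout are assumptions added for the example.

```python
import os
import subprocess


def run_predicate(binary: str, context: dict) -> str:
    """Run one predicate; anything other than a clean verdict becomes 'ask'."""
    env = {**os.environ, **context}  # e.g. CODEX_SESSION_ID, CODEX_COMMAND
    try:
        result = subprocess.run(
            [binary], env=env, capture_output=True, text=True, timeout=10
        )  # the timeout is an assumption, not taken from the source
    except (OSError, subprocess.TimeoutExpired):
        return "ask"
    verdict = result.stdout.strip()
    if result.returncode != 0 or verdict not in ("allow", "deny", "ask"):
        return "ask"
    return verdict


def decide(predicates, context, run=run_predicate) -> str:
    """Return the first 'allow' or 'deny'; otherwise fall back to 'ask'."""
    for binary in predicates:
        verdict = run(binary, context)
        if verdict in ("allow", "deny"):
            return verdict
    return "ask"  # no predicate decided: show the manual approval dialog
```

Only `allow` and `deny` short-circuit, so a crashing or misbehaving predicate can never approve a command by accident.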
## Notes
- Consider passing context via environment variables (e.g. `CODEX_SESSION_ID`, `CODEX_CONTAINER_CWD`, `CODEX_HOST_CWD`, `CODEX_COMMAND`).
- Reuse invocation logic from the auto-approval predicates feature (Task 02).
- **Motivating example**: auto-approve `pre-commit run --files <any number of space-separated files>`.
- **Motivating example**: auto-approve any `git` command (e.g. `git add`, `git commit`, `git push`, `git status`, etc.) provided its repository root is under `<directory>`, correctly handling common flags and safe invocation modes.
- **Motivating example**: auto-approve any shell pipeline composed out of `<these known-safe commands>` operating on `<known-safe files>` with `<known-safe params>`, using a general pipeline parser to ensure safety—a nontrivial example of predicate logic.
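As a hedged illustration of the first motivating example, a predicate script might look like the sketch below. It reads the candidate command from `CODEX_COMMAND` (the variable named above) and prints a verdict; the function name and the deliberately strict file-name pattern are assumptions. Anything containing quotes, separators, or newlines falls through to `ask`.

```python
import os
import re

# Conservative pattern: single spaces, plain path characters only, and no
# argument may start with '-'. The strictness is a choice made for this sketch.
SAFE_PRE_COMMIT = re.compile(
    r"pre-commit run --files( [A-Za-z0-9_./][A-Za-z0-9_./-]*)+"
)


def verdict_for(command: str) -> str:
    """Return 'allow' for a plain pre-commit invocation, otherwise 'ask'."""
    if SAFE_PRE_COMMIT.fullmatch(command):
        return "allow"
    return "ask"


if __name__ == "__main__":
    print(verdict_for(os.environ.get("CODEX_COMMAND", "")))
```

Matching the whole command string rather than splitting it first avoids approving input such as `pre-commit run --files a.py; rm -rf /`, where a naive tokenizer would see only harmless-looking words.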

@@ -1,45 +0,0 @@
+++
id = "13"
title = "Interactive Prompting and Commands While Executing"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T01:40:09.509881"
+++
# Task 13: Interactive Prompting and Commands While Executing
> *This task is specific to codex-rs.*
## Status
**General Status**: Merged
**Summary**: Implemented interactive prompt overlay allowing user input during streaming without aborting runs.
## Goal
Allow users to interleave composing prompts and issuing slash-commands while the agent is actively executing (e.g. streaming completions), without aborting the current run.
## Acceptance Criteria
- While the LLM is streaming a response or executing a tool, the input box remains active for user edits and slash-commands.
- Sending a message or `/`-command does not implicitly cancel or abort the ongoing execution.
- Any tool invocation messages from the agent must still be immediately followed by their corresponding tool output messages (or the API will error).
- Ensure the TUI correctly preserves the stream and appends new user input at the bottom, scrolling as needed.
- No deadlocks or lost events if the agent finishes while the user is typing; buffer and render properly.
- Update tests to simulate concurrent user input during streaming and validate UI state.
## Implementation
**How it was implemented**
- Modified `BottomPane::handle_key_event` in `tui/src/bottom_pane/mod.rs` to special-case the `StatusIndicatorView` while `is_task_running`, forwarding key events to `ChatComposer` and preserving the overlay.
- Updated `BottomPane::render_ref` to always render the composer first and then overlay the active view, ensuring the input box remains visible and editable under the status indicator.
- Added unit tests in `tui/src/bottom_pane/mod.rs` to verify input is forwarded during task execution and that the status indicator overlay is removed upon task completion.
**How it works**
During LLM streaming or tool execution, the `StatusIndicatorView` remains active as an overlay. The modified event handler detects this overlay and forwards user key events to the underlying `ChatComposer` without dismissing the overlay. On task completion (`set_task_running(false)`), the overlay is automatically removed (via `should_hide_when_task_is_done`), returning to the normal input-only view.
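The forwarding rule can be modeled with a toy Python class. This is a simplified sketch, not the Rust `BottomPane`: the string overlay names and the `handle_key` method are hypothetical, and rendering is not modeled.

```python
class BottomPane:
    """Toy model: a composer plus an optional overlay view."""

    def __init__(self):
        self.composer_text = ""
        self.overlay = None  # None, or the name of the active view
        self.is_task_running = False

    def set_task_running(self, running: bool) -> None:
        self.is_task_running = running
        if running:
            self.overlay = "status"
        elif self.overlay == "status":
            # The status indicator hides itself when the task is done.
            self.overlay = None

    def handle_key(self, ch: str) -> None:
        status_active = self.overlay == "status" and self.is_task_running
        if self.overlay is None or status_active:
            # Forward to the composer without dismissing the overlay.
            self.composer_text += ch
        # Any other overlay would consume the key itself (not modeled here).
```

Text typed during a run therefore accumulates in the composer and is still there once the overlay disappears.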
## Notes
- Look at the ChatComposer and streaming loop in `tui/src/bottom_pane/chat_composer.rs` for input and stream handling.
- Ensure event loop in `app.rs` multiplexes between agent stream events and user input events without blocking.
- Consider locking or queuing tool-use messages to guarantee prompt tool-output pairing.

@@ -1,95 +0,0 @@
+++
id = "15"
title = "Agent Worktree Sandbox Configuration"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T07:26:13.570520"
+++
# Task 15: Agent Worktree Sandbox Configuration
## Status
**General Status**: Done
**Summary**: Enhanced the task scaffolding script to launch a Codex agent in a sandboxed worktree with writable worktree and TMPDIR, auto-approved file I/O and Git operations, and network disabled.
## Goal
Use `create-task-worktree.sh --agent` to wrap the agent invocation in a sandbox with these properties:
- The task worktree path and the system temporary directory (`$TMPDIR` or `/tmp`) are mounted read-write.
- All other paths on the host are treated as read-only.
- Git operations in the worktree (e.g. `git add`, `git commit`) succeed without additional confirmation.
- Any file read or write under the worktree root is automatically approved.
## Acceptance Criteria
The `create-task-worktree.sh --agent` invocation:
- launches the agent via `codex debug landlock` (or equivalent), passing flags to mount only the worktree and tempdir as writable.
- sets up Landlock permissions so that all other host paths are read-only.
- auto-approves any file system operation under the worktree directory.
- auto-approves Git commands in the worktree without prompting.
- still permits using system temp dir for ephemeral files.
- contains tests or manual verifications demonstrating blocked writes outside and allowed writes inside.
## Implementation
**How it was implemented**
- Extended `create-task-worktree.sh` `--agent` mode to launch the Codex agent under a Landlock+seccomp sandbox by invoking `codex debug landlock --full-auto`, which grants write access only to the worktree (`cwd`) and the platform temp folder (`TMPDIR`), and disables network.
- Updated the `-a|--agent` help text to reflect the new sandbox behavior and tempdir whitelist.
- Added a test script demonstrating allowed writes inside the worktree and TMPDIR and blocked writes to directories outside those paths:
```bash
#!/usr/bin/env bash
# Test script for Task 15: verify sandbox restrictions and allowances
set -euo pipefail
worktree_root="$(cd "$(dirname "$0")"/.. && pwd)"
echo "Running sandbox tests in worktree: $worktree_root"
# Test write inside worktree
echo -n "Test: write inside worktree... "
if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$worktree_root/inside_test'"; then
  echo "PASS"
else
  echo "FAIL" >&2
  exit 1
fi
# Test write inside TMPDIR
tmpdir=${TMPDIR:-/tmp}
echo -n "Test: write inside TMPDIR ($tmpdir)... "
if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$tmpdir/tmp_test'"; then
  echo "PASS"
else
  echo "FAIL" >&2
  exit 1
fi
# Prepare external directory under HOME to test outside worktree/TMPDIR
external_dir="$HOME/sandbox_test_dir"
mkdir -p "$external_dir"
rm -f "$external_dir/outside_test"
echo -n "Test: write outside allowed paths ($external_dir)... "
if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$external_dir/outside_test'"; then
  echo "FAIL: outside write succeeded" >&2
  exit 1
else
  echo "PASS"
fi
```
**How it works**
When invoked with `--agent`, `create-task-worktree.sh` changes into the task worktree and launches:
```bash
codex debug landlock --full-auto codex "$(< "$repo_root/agentydragon/prompts/developer.md")"
```
The `--full-auto` flag configures Landlock to allow disk writes under the current directory and the system temp directory, disable network access, and automatically approve commands on success. As a result, any file I/O and Git operations in the worktree proceed without approval prompts, while writes outside the worktree and TMPDIR are blocked by the sandbox.
## Notes
- This feature depends on the underlying Landlock/Seatbelt sandbox APIs.
- Leverage the existing sandbox invocation (`codex debug landlock`) and approval predicates to auto-approve worktree and tmpdir I/O.

@@ -1,54 +0,0 @@
+++
id = "16"
title = "Confirm on Ctrl+D to Exit"
status = "Merged"
dependencies = ""
last_updated = "2025-06-25T05:36:23.493497"
+++
# Task 16: Confirm on Ctrl+D to Exit
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: Double Ctrl+D confirmation implemented and tested.
## Goal
Require two consecutive Ctrl+D keystrokes (within a short timeout) to exit the TUI, preventing accidental termination from a single keystroke.
## Acceptance Criteria
- Add a `[tui] require_double_ctrl_d = true` config flag (default `false`) to enable double-Ctrl+D exit confirmation.
- When `require_double_ctrl_d` is enabled:
  - First Ctrl+D within the TUI suspends exit and shows a status message like "Press Ctrl+D again to confirm exit".
  - If a second Ctrl+D occurs within a configurable timeout (e.g. 2 sec), the TUI exits normally.
  - If no second Ctrl+D arrives before timeout, clear the confirmation state and resume normal operation.
- Ensure that child processes (shell tool calls) still receive SIGINT immediately and are not affected by the double-Ctrl+D logic.
- Prevent immediate exit on Ctrl+D (EOF); require the same double-confirmation workflow when EOF is received.
- Provide unit or integration tests simulating SIGINT events to verify behavior.
## Implementation
**How it was implemented**
- Added `require_double_ctrl_d` and `double_ctrl_d_timeout_secs` to the TUI config in `core/src/config_types.rs` with defaults.
- Introduced `ConfirmCtrlD` helper in `tui/src/confirm_ctrl_d.rs` to manage confirmation state and expiration logic.
- Extended `App` in `tui/src/app.rs`:
  - Initialized `confirm_ctrl_d` from config in `App::new`.
  - Expired stale confirmation windows each event-loop tick and cleared the status overlay when timed out.
  - Replaced the Ctrl+D handler to invoke `ConfirmCtrlD::handle`, exiting only on confirmed press and otherwise displaying a prompt via `BottomPane`.
- Leveraged `BottomPane::set_task_running(true)` and `update_status_text` to render the confirmation prompt overlay.
- Added unit tests for `ConfirmCtrlD` in `tui/src/confirm_ctrl_d.rs` covering disabled mode, confirmation press, and timeout expiration.
**How it works**
- When `require_double_ctrl_d = true`, the first Ctrl+D press shows "Press Ctrl+D again to confirm exit" in the status overlay.
- A second Ctrl+D within `double_ctrl_d_timeout_secs` exits the TUI; otherwise the prompt and state clear after timeout.
- When `require_double_ctrl_d = false`, Ctrl+D exits immediately as before.
- Child processes still receive SIGINT normally since only the TUI event loop intercepts Ctrl+D.
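The confirmation flow above can be sketched as a small state machine. This is a minimal illustration only: the `ConfirmCtrlD` name comes from the task text, but the fields, the `CtrlDAction` enum, and the method signatures below are assumptions, not the code that was merged.

```rust
use std::time::{Duration, Instant};

/// Outcome of feeding one Ctrl+D press into the helper (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum CtrlDAction {
    /// Exit the TUI now.
    Exit,
    /// Show "Press Ctrl+D again to confirm exit" and keep running.
    Prompt,
}

/// Illustrative stand-in for the `ConfirmCtrlD` helper named in the task.
pub struct ConfirmCtrlD {
    enabled: bool,
    timeout: Duration,
    /// Time of the first, still unconfirmed press, if any.
    pending_since: Option<Instant>,
}

impl ConfirmCtrlD {
    pub fn new(enabled: bool, timeout: Duration) -> Self {
        Self { enabled, timeout, pending_since: None }
    }

    /// Handle a Ctrl+D press observed at `now`.
    pub fn handle(&mut self, now: Instant) -> CtrlDAction {
        if !self.enabled {
            // Feature disabled: a single press exits, as before.
            return CtrlDAction::Exit;
        }
        match self.pending_since {
            // Second press inside the window confirms the exit.
            Some(first) if now.duration_since(first) <= self.timeout => {
                self.pending_since = None;
                CtrlDAction::Exit
            }
            // First press, or the window already lapsed: (re)start the window.
            _ => {
                self.pending_since = Some(now);
                CtrlDAction::Prompt
            }
        }
    }

    /// Called on each event-loop tick; returns true when a stale prompt was cleared.
    pub fn expire(&mut self, now: Instant) -> bool {
        match self.pending_since {
            Some(first) if now.duration_since(first) > self.timeout => {
                self.pending_since = None;
                true
            }
            _ => false,
        }
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the helper, keeps the timeout logic testable without sleeping.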
## Notes
- Make the double-Ctrl+D timeout duration configurable if desired (e.g. via `tui.double_ctrl_d_timeout_secs`).
- Ensure that existing tests for Ctrl+D behavior are updated or new tests added to cover the confirmation state.

@@ -1,46 +0,0 @@
+++
id = "18"
title = "Chat UI Textarea Overlay and Border Styling Fix"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T05:36:27.942304"
+++
# Task 18: Chat UI Textarea Overlay and Border Styling Fix
---
id: 18
title: Chat UI Textarea Overlay and Border Styling Fix
status: Not started
summary: Fix overlay of waiting messages and streamline borders between chat window and input area to improve visibility and reclaim terminal space.
goal: |
Adjust the TUI chat interface so that waiting/status messages no longer overlay the first line of the input textarea (ensuring user drafts remain visible), and merge/remove borders as follows:
- Merge the bottom border of the chat history window with the top border of the input textarea.
- Remove the left, right, and bottom overall borders around the chat interface to reduce wasted space.
---
> *This task is specific to codex-rs.*
## Acceptance Criteria
- Waiting/status messages (e.g. "Thinking...", "Typing...", etc.) appear above the textarea rather than overlaying the first line of the input area.
- User draft text remains visible at all times, even when agent messages or status indicators are rendered.
- The bottom border of the chat history pane and the top border of the textarea are unified into a single border line.
- The left, right, and bottom borders around the entire chat UI are removed, reclaiming columns/rows in the terminal.
- Manual or automated visual verification steps demonstrate correct layout in a variety of terminal widths.
## Implementation
**How it was implemented**
- Merged the bottom border of the history pane and the top border of the input textarea into a single shared line by removing the textarea's top border, keeping only a bottom border on the textarea and both top and bottom borders on the history pane.
- Removed left/right borders on both panes (history and textarea) and removed the textarea's bottom border from the overall UI to reclaim horizontal space.
- Updated the status-indicator overlay to render in its own floating box immediately above the textarea instead of covering the first input line.
**How it works**
At runtime the conversation history widget now draws only its top and bottom borders. The input textarea draws only its bottom border, carrying the help title there. These changes yield a single continuous border line separating history from input and eliminate the outer left, right, and bottom borders. Status messages ("Thinking...", etc.) render in a separate floating box positioned just above the textarea, leaving the user's draft text visible at all times.
## Notes
- This involves updating the rendering logic in the TUI modules (likely under `tui/src/` in `codex-rs`).
- Ensure layout changes do not break existing tests or rendering in unusual terminal sizes.
- Consider writing a simple snapshot test or manual demo script to validate border and overlay behavior.

@@ -1,42 +0,0 @@
+++
id = "19"
title = "Bash Command Rendering Improvements for Less Verbosity"
status = "Merged"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T05:36:32.641375"
+++
> *This task is specific to per-agent UI conventions and log readability.*
## Acceptance Criteria
- Shell commands render as plain text without `bash -lc` wrappers.
- Role labels and message content appear on the same line, separated by a space.
- Command-result annotations show a checkmark and duration for zero exit codes, or `exit code: N` and duration for nonzero codes, in the format `<icon or exit code> <duration>ms`.
- Existing functionality remains unaffected beyond formatting changes.
- Verbose background event logs (e.g. sandbox-denied exec errors, retries) collapse into a single command execution entry showing command start, running indicator, and concise completion status.
- Automated examples or tests verify the new rendering behavior.
## Implementation
This change will touch both the event-processing and rendering layers of the Rust TUI:
- **Event processing** (`codex-rs/exec/src/event_processor.rs`):
  - Strip any `bash -lc` wrapper when formatting shell commands via `escape_command`.
  - Replace verbose `BackgroundEvent` logs for sandbox-denied errors and automatic retries with a unified exec-command begin/end sequence.
  - Annotate completed commands with either a checkmark (✅) and `<duration>ms` for success or `exit code: N <duration>ms` for failures.
- **TUI rendering** (`codex-rs/tui/src/history_cell.rs`):
  - Collapse consecutive `BackgroundEvent` entries related to exec failures/retries into the standard active/completed exec-command cells.
  - Update `new_active_exec_command` and `new_completed_exec_command` to use the new inline format (icon or exit code + duration, with `$ <command>` on the same block).
  - Ensure role labels and plain-text messages render on a single line separated by a space.
- **Tests** (`codex-rs/tui/tests/`):
  - Add or update test fixtures to verify:
    - Commands appear without any `bash -lc` boilerplate.
    - Completed commands show the correct checkmark or exit-code annotation with accurate duration formatting.
    - Background debugging events no longer leak raw debug strings and are correctly collapsed into the exec-command flow.
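As a rough sketch of the two formatting rules in this plan (wrapper stripping and the result annotation), assuming hypothetical helper names `strip_bash_lc` and `format_exec_status`. The real `escape_command` presumably also handles shell quoting, which is left out here.

```rust
/// Render a command for display, dropping a leading `bash -lc` wrapper.
/// Hypothetical helper; only the exact three-element form is unwrapped.
pub fn strip_bash_lc(argv: &[String]) -> String {
    match argv {
        // `bash -lc "<script>"` is shown as just `<script>`.
        [shell, flag, script] if shell == "bash" && flag == "-lc" => script.clone(),
        // Anything else is shown as typed (no quoting applied in this sketch).
        _ => argv.join(" "),
    }
}

/// Annotation for a finished command: checkmark on success, exit code otherwise,
/// always followed by the duration in milliseconds.
pub fn format_exec_status(exit_code: i32, duration_ms: u128) -> String {
    if exit_code == 0 {
        format!("✅ {duration_ms}ms")
    } else {
        format!("exit code: {exit_code} {duration_ms}ms")
    }
}
```

Keeping both helpers as pure functions over plain values would let the test fixtures mentioned above check formatting without driving a terminal.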
## Notes
- Improves readability of interactive sessions and logs by reducing boilerplate.
- Ensure compatibility with both live TUI output and persisted log transcripts.

@@ -1,34 +0,0 @@
+++
id = "21"
title = "Compact Markdown Rendering Option"
status = "Merged"
dependencies = "03,06,08,13,15,32,18,19,22,23"
last_updated = "2025-06-25T05:55:23.855039"
+++
## Summary
Provide an option to render Markdown without blank lines between headings and content for more vertical packing.
## Goal
Add a configuration flag to control Markdown rendering in the chat UI and logs so that headings render immediately adjacent to their content with no separating blank line.
## Acceptance Criteria
- Introduce a config flag `markdown_compact = true|false` under the UI settings.
- When enabled, the renderer omits the default blank line between headings (lines starting with `#`) and their subsequent content.
- The flag applies globally to all Markdown rendering (diffs, docs, help messages).
- Default behavior remains unchanged (blank lines preserved) when `markdown_compact` is false or unset.
- Add tests to verify both compact and default rendering modes across heading levels.
## Implementation
**How it was implemented**
- Extend the Markdown-to-TUI formatter to check `markdown_compact` and collapse heading/content spacing.
- Implement a post-processing step that removes blank lines immediately following heading tokens (`^#{1,6} `) when `markdown_compact` is true.
- Expose the new flag via the config parser and default it to `false`.
- Add unit tests covering H1–H6 headings, verifying absence of blank line in compact mode and presence in default mode.
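The post-processing step could look roughly like the following. This is a sketch under stated assumptions: `compact_markdown` and `is_heading` are hypothetical names, the pass works on raw Markdown text rather than on the renderer's token stream, and it does not special-case `#` lines inside fenced code blocks, which a real implementation would need to skip.

```rust
/// True for ATX headings: one to six `#` characters followed by a space.
fn is_heading(line: &str) -> bool {
    let hashes = line.chars().take_while(|c| *c == '#').count();
    // '#' is one byte, so the char count is also a valid byte offset.
    (1..=6).contains(&hashes) && line[hashes..].starts_with(' ')
}

/// Remove blank lines that directly follow a heading when `compact` is true.
/// Note: the result has no trailing newline, because `lines()` drops it.
pub fn compact_markdown(input: &str, compact: bool) -> String {
    if !compact {
        // Default mode: leave spacing untouched.
        return input.to_string();
    }
    let mut out: Vec<&str> = Vec::new();
    let mut after_heading = false;
    for line in input.lines() {
        if after_heading && line.trim().is_empty() {
            // Skip every blank line between the heading and its content.
            continue;
        }
        after_heading = is_heading(line);
        out.push(line);
    }
    out.join("\n")
}
```

Blank lines elsewhere (between paragraphs, before the next heading) are preserved, so only heading-to-content spacing changes.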
## Notes
- This option improves vertical density for screens with limited height.
- Ensure compatibility with existing Markdown features like lists and code blocks; only target heading-content spacing.

@@ -1,41 +0,0 @@
+++
id = "23"
title = "Interactive Container Command Affordance via Hotkey"
status = "Merged"
freeform_status = ""
dependencies = "01"
last_updated = "2025-06-25T12:10:10.584536"
+++
## Summary
Provide a keybinding to run arbitrary shell commands in the agent's container and display output inline.
## Goal
Add a user-facing affordance (e.g. a hotkey) to invoke arbitrary shell commands within the agent's container during a session for on-demand inspection and debugging. The typed command should be captured as a chat turn, executed via the existing shell tool, and its output rendered inline in the chat UI.
## Acceptance Criteria
- Bind a hotkey (e.g. Ctrl+M) that opens a prompt for the user to type any shell command.
- When the user submits, capture the command as if entered in the chat input, and invoke the shell tool with the command in the agent's container.
- Display the command invocation and its stdout/stderr output inline in the chat window, respecting formatting rules (e.g. compact rendering settings).
- Support chaining multiple commands in separate turns; history should show these command turns normally.
- Provide unit or integration tests simulating a user hotkey press, command input, and verifying the shell tool is called and output is displayed.
## Implementation
**How it was implemented**
- Added a new slash command `Shell` and updated dispatch logic in `app.rs` to push a shell-command view.
- Bound `Ctrl+M` in `ChatComposer` to dispatch `SlashCommand::Shell` for hotkey-driven shell prompt.
- Created `ShellCommandView` (bottom pane overlay) to capture arbitrary user input and emit `AppEvent::ShellCommand(cmd)`.
- Extended `AppEvent` with `ShellCommand(String)` and `ShellCommandResult { call_id, stdout, stderr, exit_code }` variants for round-trip messaging.
- Implemented `ChatWidget::handle_shell_command` to execute `sh -c <cmd>` asynchronously (via `tokio::spawn`) and send back `ShellCommandResult`.
- Updated `ConversationHistoryWidget` to reuse existing exec-command cells to display shell commands and their output inline.
- Added tests:
  - Unit test in `shell_command_view.rs` asserting correct event emission (skipping redraws).
  - Integration test in `chat_composer.rs` asserting `Ctrl+M` opens the shell prompt view and allows input.
## Notes
- This feature aids debugging and inspection without leaving the agent workflow.
- Ensure that security policies (e.g. sandbox restrictions) still apply to these commands.

@@ -1,36 +0,0 @@
+++
id = "28"
title = "Include Command Snippet in Session-Scoped Approval Label"
status = "Merged"
dependencies = "03,06,08,13,15,32,18,19,22,23"
last_updated = "2025-06-25T04:04:47.399379"
+++
## Summary
When asking for session-scoped approval of a command, embed a truncated snippet of the actual command in the approval label for clarity.
## Goal
Improve the session-scoped approval option label for commands by including a backtick-quoted snippet of the command itself (truncated to fit). This makes it clear exactly which command (including parameters) will be auto-approved for the session.
## Acceptance Criteria
- The session-scoped approval label changes from generic text to include a snippet of the current command, e.g.:
```text
Yes, always allow running `cat x | foo --bar > out` for this session (a)
```
- If the command is too long, truncate the middle (e.g. `long-part…end-part`) to fit a configurable max length.
- Implement the snippet templating in both Rust and JS UIs for consistency.
- Add unit tests to verify snippet extraction, truncation logic, and label rendering for various command lengths.
## Implementation
**Planned implementation**
- Add a `truncateMiddle` helper in both the Rust TUI and the JS/TS UI to ellipsize command snippets in the middle.
- Extract the first line of the command string (up to any newline), truncate to a default max length (e.g. 30 characters), inserting a single-character ellipsis `…` when needed.
- In the session-scoped approval option, replace the static label with a dynamic one:
  ``Yes, always allow running `<snippet>` for this session (a)``.
- Write unit tests for the helper and label generation covering commands shorter than, equal to, and longer than the max length.
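A minimal Rust sketch of the planned helper and label, assuming the snake_case names `truncate_middle` and `session_approval_label` (the plan only names `truncateMiddle`) and counting length in Unicode scalar values rather than display columns:

```rust
/// Shorten `s` to at most `max_chars` characters by replacing the middle with `…`.
pub fn truncate_middle(s: &str, max_chars: usize) -> String {
    let chars: Vec<char> = s.chars().collect();
    if chars.len() <= max_chars {
        return s.to_string();
    }
    if max_chars == 0 {
        return String::new();
    }
    let keep = max_chars - 1; // characters left once the ellipsis is counted
    let head = (keep + 1) / 2; // give the extra character to the start
    let tail = keep - head;
    let mut out: String = chars[..head].iter().collect();
    out.push('…');
    out.extend(&chars[chars.len() - tail..]);
    out
}

/// Build the session-scoped approval label from the first line of the command.
pub fn session_approval_label(command: &str, max_chars: usize) -> String {
    let first_line = command.lines().next().unwrap_or("");
    let snippet = truncate_middle(first_line, max_chars);
    format!("Yes, always allow running `{snippet}` for this session (a)")
}
```

Working on `char`s avoids slicing a multi-byte character in half; a TUI that cares about wide glyphs might measure display width instead.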
## Notes
- This clarifies what parameters will be auto-approved and avoids ambiguity when multiple similar commands occur.

@@ -1,38 +0,0 @@
+++
id = "31"
title = "Display Remaining Context Percentage in codex-rs TUI"
status = "Merged"
dependencies = "03,06,08,13,15,32,18,19,22,23"
last_updated = "2025-06-25T01:40:09.600000"
+++
## Summary
Show a live "x% context left" indicator in the TUI (Rust) to inform users of remaining model context buffer.
## Goal
Enhance the codex-rs TUI by adding a status indicator that displays the percentage of model context buffer remaining (e.g. "75% context left"). Update this indicator dynamically as the conversation progresses.
## Acceptance Criteria
- Compute current token usage and total context limit from the active session.
- Display "<N>% context left" in the status bar or header of the TUI, formatted compactly.
- Update the percentage after each message turn in real time.
- Ensure the indicator is visible but does not obstruct existing UI elements.
- Add unit or integration tests mocking token count updates and verifying correct percentage formatting (rounding behavior, boundary conditions).
## Implementation
**How it was implemented**
- Added a `history_items: Vec<ResponseItem>` field to `ChatWidget` to accumulate the raw sequence of messages and function calls.
- Created a new module `tui/src/context.rs` mirroring the JS heuristics:
  - `approximate_tokens_used(&[ResponseItem])`: counts characters in text and function-call items, divides by 4 and rounds up.
  - `max_tokens_for_model(&str)`: uses a registry of known model limits and heuristic fallbacks (32k, 16k, 8k, 4k, default 128k).
  - `calculate_context_percent_remaining(&[ResponseItem], &str)`: computes `(remaining / max) * 100`.
- Updated `ChatWidget::replay_items` and `ChatWidget::handle_codex_event` to push each incoming `ResponseItem` into `history_items`.
- Modified `ChatComposer::render_ref` to query `calculate_context_percent_remaining`, format and display "<N>% context left" after the input area, coloring it green/yellow/red per thresholds (>40%, 25–40%, ≤25%).
- Added unit tests in `tui/tests/context_percent.rs` covering token counting, model heuristics, percent rounding, and boundary conditions.
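The arithmetic behind the indicator can be illustrated as follows. This is a simplified sketch: it takes plain string slices instead of `ResponseItem`, takes the token limit as a number instead of looking it up by model name, and the `Level`, `level_for`, and `format_indicator` names are assumptions introduced for the example.

```rust
/// Approximate token count: total characters divided by 4, rounded up.
pub fn approximate_tokens_used(items: &[&str]) -> usize {
    let chars: usize = items.iter().map(|s| s.chars().count()).sum();
    (chars + 3) / 4
}

/// Percentage of the context window still free, clamped to 0..=100.
pub fn context_percent_remaining(items: &[&str], max_tokens: usize) -> f64 {
    if max_tokens == 0 {
        return 0.0;
    }
    // Clamp so an over-full history reports 0% instead of underflowing.
    let used = approximate_tokens_used(items).min(max_tokens);
    (max_tokens - used) as f64 / max_tokens as f64 * 100.0
}

/// Color bucket for the indicator (hypothetical enum).
#[derive(Debug, PartialEq, Eq)]
pub enum Level {
    Green,
    Yellow,
    Red,
}

/// Thresholds from the task text: >40% green, above 25% yellow, otherwise red.
pub fn level_for(percent: f64) -> Level {
    if percent > 40.0 {
        Level::Green
    } else if percent > 25.0 {
        Level::Yellow
    } else {
        Level::Red
    }
}

/// Compact status text, e.g. for 75.0 this yields "75% context left".
pub fn format_indicator(percent: f64) -> String {
    format!("{:.0}% context left", percent)
}
```

The characters-per-token ratio is only a heuristic, so the displayed value should be read as an estimate rather than an exact budget.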
## Notes
- This feature helps users anticipate when they may need to truncate history or start a new session.
- Future enhancement: allow toggling this indicator on/off via config.

@@ -1,42 +0,0 @@
+++
id = "35"
title = "TUI Integration for Inspect-Env Command"
status = "Done"
dependencies = "10" # Rationale: depends on Task 10 for container state inspection
last_updated = "2025-06-25T11:38:19Z"
+++
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: Follow-up to Task 10; add slash-command and TUI bindings for `inspect-env`.
## Goal
Add an `/inspect-env` slash-command in the TUI that invokes the existing `codex inspect-env` logic to display sandbox state inline.
## Acceptance Criteria
- Extend `SlashCommand` enum to include `InspectEnv`.
- Dispatch `AppEvent::InlineInspectEnv` when `/inspect-env` is entered.
- Handle `InlineInspectEnv` in `app.rs` to run `inspect-env` logic and stream its output to the TUI log pane.
- Render mounts, permissions, and network status in a formatted table or tree view in the bottom pane.
- Unit/integration tests simulating slash-command invocation and verifying rendered output.
## Implementation
**High-level approach**
- Extend `SlashCommand` enum with `InspectEnv` and provide user-visible description.
- Add `InlineInspectEnv` variant to `AppEvent` enum to represent inline slash-command invocation.
- Update dispatch logic in `App::run` to spawn a background thread on `InlineInspectEnv` that runs `codex inspect-env`, reads its stdout line-by-line, and sends each line as `AppEvent::LatestLog`, then triggers a redraw.
- Wire up `/inspect-env` to dispatch `InlineInspectEnv` in the slash-command handling.
- Add unit tests in the TUI crate to verify `built_in_slash_commands()` includes `inspect-env` mapping and description, and tests for the command-popup filter to ensure `InspectEnv` is listed when `/inspect-env` is entered.
**How it works**
When the user enters `/inspect-env`, the TUI parser recognizes the command and emits `AppEvent::InlineInspectEnv`. The main event loop handles this event by spawning a thread that invokes the external `codex inspect-env` command, captures its output line-by-line, and forwards each line into the TUI log pane via `AppEvent::LatestLog`. A redraw is scheduled once the inspection completes.
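The thread-and-channel flow described above might be sketched like this. The `AppEvent` enum here is a two-variant stand-in (the `Redraw` variant and the `forward_lines` helper are assumptions), and the function accepts any reader so it can be exercised without spawning `codex inspect-env`; in the TUI the reader would presumably be the child process's stdout.

```rust
use std::io::{BufRead, BufReader, Read};
use std::sync::mpsc::Sender;
use std::thread::{self, JoinHandle};

/// Reduced stand-in for the TUI's event enum.
#[derive(Debug, PartialEq, Eq)]
pub enum AppEvent {
    /// One line of output for the log pane.
    LatestLog(String),
    /// Request a redraw once the output is exhausted.
    Redraw,
}

/// Read `source` line by line on a background thread, forwarding each line
/// as `LatestLog` and finishing with a single `Redraw`.
pub fn forward_lines<R>(source: R, tx: Sender<AppEvent>) -> JoinHandle<()>
where
    R: Read + Send + 'static,
{
    thread::spawn(move || {
        for line in BufReader::new(source).lines() {
            match line {
                Ok(text) => {
                    // Stop early if the receiving side (the UI) has gone away.
                    if tx.send(AppEvent::LatestLog(text)).is_err() {
                        return;
                    }
                }
                // Treat a read error as end of output in this sketch.
                Err(_) => break,
            }
        }
        let _ = tx.send(AppEvent::Redraw);
    })
}
```

Running the read loop off the main thread keeps the event loop responsive while the inspection command is still producing output.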
## Notes
- Reuse formatting code from `cli/src/inspect_env.rs` for consistency.

@@ -1,34 +0,0 @@
+++
id = "38"
title = "Fix Approval Dialog Transparent Background"
status = "Done"
dependencies = ""
summary = "The approval dialog background is transparent, causing prompt text underneath to overlap and become unreadable."
last_updated = "2025-06-25T23:00:00.000000"
+++
> *UI bug:* When the approval dialog appears, its background is transparent and any partially entered prompt text shows through, overlapping and confusing the dialog.
## Status
**General Status**: Done
**Summary**: Identify and implement an opaque background for the approval dialog to prevent underlying text bleed-through.
## Goal
Ensure the approval dialog is drawn with a solid background color (matching the dialog border or theming) so that any underlying text does not bleed through.
## Acceptance Criteria
- Approval dialogs block underlying prompt text (solid background).
- Existing unit/integration tests validate dialog visual rendering.
## Implementation
- Updated `render_ref` in `codex-rs/tui/src/user_approval_widget.rs` to fill the entire dialog area with a `DarkGray` background before drawing the border and content.
- Implemented nested loops over the dialog `Rect` calling `buf[(col, row)].set_bg(Color::DarkGray)` on each cell.
- Added unit test `render_approval_dialog_fills_background` in `tui/src/user_approval_widget.rs` to render the widget onto a buffer pre-filled with a red background and verify no cell in the dialog region remains transparent or retains the sentinel background.
## Notes
<!-- Any implementation notes -->

@@ -1,47 +0,0 @@
+++
id = "02"
title = "Granular Auto-Approval Predicates"
status = "Done"
dependencies = "11" # Rationale: depends on Task 11 for user-configurable approval predicates
last_updated = "2025-06-25T10:48:30.000000"
+++
# Task 02: Granular Auto-Approval Predicates
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: Added granular auto-approval predicates: configuration parsing, predicate evaluation, integration, documentation, and tests.
## Goal
Let users configure one or more scripts in `config.toml` that examine each proposed shell command and return exactly one of:
- `deny` => auto-reject (skip sandbox and do not run the command)
- `allow` => auto-approve and proceed under the sandbox
- `no-opinion` => no opinion (neither approve nor reject)
Multiple scripts cast votes: if any script returns `deny`, the command is denied; otherwise if any script returns `allow`, the command is allowed; otherwise (all scripts return `no-opinion` or exit non-zero), pause for manual approval (existing logic).
## Acceptance Criteria
- New `[[auto_allow]]` table in `config.toml` supporting one or more `script = "..."` entries.
- Before running any shell/subprocess, Codex invokes each configured script in order, passing the candidate command as an argument.
- If a script returns `deny` or `allow`, immediately take that vote and skip remaining scripts.
- After all scripts complete with only `no-opinion` results or errors, pause for manual approval (existing logic).
- Spawn each predicate script with the full command as its only argument.
- Parse stdout (case-insensitive) expecting `deny`, `allow`, or `no-opinion`, treating errors or unknown output as `NoOpinion`.
- Short-circuit on the first `Deny` or `Allow` vote.
- A `Deny` vote aborts execution.
- An `Allow` vote skips prompting and proceeds under sandbox.
- All `NoOpinion` votes fall back to existing approval logic.
## Implementation
- Added `auto_allow: Vec<AutoAllowPredicate>` to `ConfigToml`, `ConfigProfile`, and `Config` to parse `[[auto_allow]]` entries from `config.toml`.
- Defined `AutoAllowPredicate { script: String }` and `AutoAllowVote { Allow, Deny, NoOpinion }` in `core::safety`.
- Implemented `evaluate_auto_allow_predicates` in `core::safety` to spawn each script with the candidate command, parse its stdout vote, and short-circuit on `Deny` or `Allow`.
- Integrated `evaluate_auto_allow_predicates` into the shell execution path in `core::codex`, aborting on `Deny`, auto-approving on `Allow`, and falling back to manual or policy-based approval on `NoOpinion`.
- Updated `config.md` to document the `[[auto_allow]]` table syntax and behavior.
- Added comprehensive unit tests covering vote parsing, error propagation, short-circuit behavior, and end-to-end predicate functionality.
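The vote parsing and short-circuit rule can be illustrated with a small sketch. `AutoAllowVote` follows the task text; `parse_vote` and `evaluate` are hypothetical names, and the script runner is passed in as a closure (returning `None` on spawn failure or non-zero exit) so the logic can be shown without launching processes.

```rust
/// Vote returned by one predicate script.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AutoAllowVote {
    Allow,
    Deny,
    NoOpinion,
}

/// Parse a script's stdout case-insensitively; unknown output means no opinion.
pub fn parse_vote(stdout: &str) -> AutoAllowVote {
    match stdout.trim().to_ascii_lowercase().as_str() {
        "allow" => AutoAllowVote::Allow,
        "deny" => AutoAllowVote::Deny,
        _ => AutoAllowVote::NoOpinion,
    }
}

/// Run predicates in order and stop at the first `Allow` or `Deny`.
/// `run(script, command)` is an assumed hook that yields the script's stdout,
/// or `None` when the script could not be run or exited with an error.
pub fn evaluate<F>(scripts: &[&str], command: &str, mut run: F) -> AutoAllowVote
where
    F: FnMut(&str, &str) -> Option<String>,
{
    for &script in scripts {
        let vote = run(script, command)
            .map(|out| parse_vote(&out))
            .unwrap_or(AutoAllowVote::NoOpinion);
        if vote != AutoAllowVote::NoOpinion {
            return vote; // later scripts are never invoked
        }
    }
    // Nobody had an opinion: the caller falls back to manual approval.
    AutoAllowVote::NoOpinion
}
```

Because evaluation stops at the first decisive vote, script order in `config.toml` matters: an early `allow` prevents a later script from vetoing the command.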
## Notes
- This pairs with the existing `approval_policy = "unless-allow-listed"` but adds custom logic before prompting.

@@ -1,63 +0,0 @@
+++
id = "04"
title = "Auto-Mount Entire Repo and Auto-CD to Subfolder"
status = "Not started"
dependencies = "01" # Rationale: depends on Task 01 for mount-add/remove foundational commands
last_updated = "2025-06-25T01:40:09.800000"
+++
# Task 04: Auto-Mount Entire Repo and Auto-CD to Subfolder
> *This task is specific to codex-rs.*
## Subtasks
Subtasks to implement in order all in one P:
### 04.1 Config → `ConfigToml` + `Config`
- Add `auto_mount_repo: bool` and `mount_prefix: String` to `ConfigToml` (with proper `#[serde(default)]` and defaults).
- Wire these fields through to the `Config` struct.
### 04.2 Git root detection + relative path
- Implement a helper in `codex_core::util` to locate the Git repository root given a starting `cwd`.
- Compute the subdirectory path relative to the repo root.
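A possible shape for this helper, sketched with the standard library only. The function names `find_git_root` and `sandbox_cwd` are assumptions, and the root is detected by looking for a `.git` entry rather than by invoking `git rev-parse --show-toplevel`.

```rust
use std::path::{Path, PathBuf};

/// Walk up from `start` and return the first directory containing `.git`.
/// `.git` may be a directory or, in linked worktrees, a file; `exists()` covers both.
pub fn find_git_root(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| dir.join(".git").exists())
        .map(Path::to_path_buf)
}

/// Map the original `cwd` to its location under the mount prefix,
/// e.g. repo root `/home/u/repo`, cwd `/home/u/repo/a/b` -> `/workspace/a/b`.
/// Returns `None` when `cwd` is not inside `repo_root`.
pub fn sandbox_cwd(repo_root: &Path, cwd: &Path, mount_prefix: &Path) -> Option<PathBuf> {
    let relative = cwd.strip_prefix(repo_root).ok()?;
    Some(mount_prefix.join(relative))
}
```

Both functions are path arithmetic plus an existence check, so the unit tests mentioned in 04.5 can run against a temporary directory tree.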
### 04.3 Bind-mount logic
- In the sandbox startup path (`apply_sandbox_policy_to_current_thread` or a new wrapper before it), if `auto_mount_repo` is set:
  - Bind-mount `repo_root` to `mount_prefix` (e.g. `/workspace`).
  - Create target directory if missing.
### 04.4 Automate `cwd` → new mount
- After mounting, update the process-wide `cwd` to `mount_prefix/relative_path` so all subsequent file ops occur under the mount.
### 04.5 Config docs & tests
- Update `config.md` to document `auto_mount_repo` and `mount_prefix` under the top-level config.
- Add unit tests for the Git-root helper and default values.
### 04.6 E2E manual verification
- Manually verify launching with `auto_mount_repo = true` in a nested subfolder:
  - TTY prompt shows sandboxed cwd under `/workspace/<subdir>`.
  - Commands executed by Codex see the mount.
## Goal
Allow users to enable a flag so that each session:
1. Detects the Git repository root of the current working directory.
2. Bind-mounts the entire repository into `/workspace` in the session.
3. Changes directory to `/workspace/<relative-path-from-root>` to mirror the user's original subfolder.
## Acceptance Criteria
- New `auto_mount_repo = true` and optional `mount_prefix = "/workspace"` in `config.toml`.
- Before any worktree or mount processing, detect the Git root, bind-mount it to `mount_prefix`, and set `cwd` to `mount_prefix + relative_path`.
- Existing worktree/session-worktree logic should operate relative to this new `cwd`.
## Implementation
**How it was implemented**
*(Not implemented yet)*
**How it works**
*(Not implemented yet)*
## Notes
- This offloads the entire monorepo into the session, leaving the user's original clone untouched.

@@ -1,47 +0,0 @@
+++
id = "09"
title = "File- and Directory-Level Approvals"
status = "Not started"
dependencies = "11" # Rationale: depends on Task 11 for custom approval predicate infrastructure
last_updated = "2025-06-25T01:40:09.507043"
+++
# Task 09: File- and Directory-Level Approvals
> *This task is specific to codex-rs.*
## Status
**General Status**: Not started
**Summary**: Not started; missing Implementation details (How it was implemented and How it works).
## Goal
Enable fine-grained approval controls so users can whitelist edits scoped to specific files or directories at runtime, with optional time limits.
## Acceptance Criteria
- In the approval dialog, offer “Allow this file always” and “Allow this directory always” options alongside proceed/deny.
- Prompt for a time limit when granting a file/dir approval, with default presets (e.g. 5 min, 1 hr, 4 hr, 24 hr).
- Introduce runtime commands to inspect and manage granular approvals:
  - `/approvals list` to view active approvals and remaining time
  - `/approvals add [file|dir] <path> [--duration <preset>]` to grant approval
  - `/approvals remove <id>` to revoke an approval
- Persist granular approvals in session metadata, keyed by working directory. On session resume in a different directory, warn the user and discard all file/dir approvals.
- Automatically expire and remove approvals when their time limits elapse.
- Reflect file/dir-approval state in the CLI shell prompt or title for quick visibility.
## Implementation
**How it was implemented**
*(Not implemented yet)*
**How it works**
*(Not implemented yet)*
## Notes
- Store approvals with {id, scope: file|dir, path, expires_at} in session JSON.
- Use a background timer or check-before-command to prune expired entries.
- Reuse existing command-parsing infrastructure to implement `/approvals` subcommands.
- Consider UI/UX for selecting presets in TUI dialogs.
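Since this task is not implemented yet, the following is only one way the notes above could be realized: an in-memory store using the check-before-command approach to prune expired entries. All names (`Scope`, `Approval`, `ApprovalStore`) are hypothetical, and `Instant` is used for simplicity; persisting to session JSON would need a wall-clock timestamp instead.

```rust
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Whether an approval covers a single file or a whole directory tree.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    File,
    Dir,
}

/// One time-limited approval, mirroring `{id, scope, path, expires_at}`.
#[derive(Debug, Clone)]
pub struct Approval {
    pub id: u32,
    pub scope: Scope,
    pub path: PathBuf,
    pub expires_at: Instant,
}

/// In-memory approval list with lazy expiry.
#[derive(Default)]
pub struct ApprovalStore {
    next_id: u32,
    entries: Vec<Approval>,
}

impl ApprovalStore {
    /// Grant an approval valid for `ttl` starting at `now`; returns its id.
    pub fn add(&mut self, scope: Scope, path: impl Into<PathBuf>, ttl: Duration, now: Instant) -> u32 {
        self.next_id += 1;
        self.entries.push(Approval {
            id: self.next_id,
            scope,
            path: path.into(),
            expires_at: now + ttl,
        });
        self.next_id
    }

    /// Revoke by id; returns false when no such approval is active.
    pub fn remove(&mut self, id: u32) -> bool {
        let before = self.entries.len();
        self.entries.retain(|a| a.id != id);
        self.entries.len() != before
    }

    /// Prune expired entries, then check whether `target` is covered.
    pub fn is_approved(&mut self, target: &Path, now: Instant) -> bool {
        self.entries.retain(|a| a.expires_at > now);
        self.entries.iter().any(|a| match a.scope {
            Scope::File => a.path.as_path() == target,
            // Component-wise prefix match, so `/repo/src2` is not under `/repo/src`.
            Scope::Dir => target.starts_with(&a.path),
        })
    }
}
```

Pruning at lookup time avoids a background timer, at the cost that `/approvals list` would also need to prune before displaying remaining time.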

@@ -1,44 +0,0 @@
+++
id = "12"
title = "Runtime Internet Connection Toggle"
status = "Not started"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T01:40:09.509507"
+++
# Task 12: Runtime Internet Connection Toggle
> *This task is specific to codex-rs.*
## Status
**General Status**: Not started
**Summary**: Not started; missing Implementation details (How it was implemented and How it works).
## Goal
Allow users to enable or disable internet access at runtime within their container/sandbox session.
## Acceptance Criteria
- Slash command or CLI subcommand (`/toggle-network <on|off>`) to turn internet on or off immediately.
- Persist network state in session metadata so that resuming a session restores the last setting.
- Enforce the new network policy dynamically: block or allow outbound network connections without restarting the agent.
- Reflect the current network status in the CLI prompt or shell title (e.g. 🌐/🚫).
- Work across supported platforms (Linux sandbox, macOS Seatbelt, Windows) using appropriate sandbox APIs.
- Include unit and integration tests to verify network toggle behavior and persistence.
## Implementation
**How it was implemented**
*(Not implemented yet)*
**How it works**
*(Not implemented yet)*
## Notes
- Reuse the existing sandbox network-disable mechanism (`CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR`) for toggling.
- On Linux, this may involve updating Landlock or seccomp rules at runtime.
- On macOS, interact with the Seatbelt profile; consider session restart if necessary.
- When persisting state, store a `network_enabled: bool` flag in the session JSON.

@@ -1,47 +0,0 @@
+++
id = "14"
title = "AI-Generated Approval Predicate Suggestions"
status = "Not started"
dependencies = "02,11" # Rationale: depends on Task 02 for auto-approval predicates and Task 11 for predicate invocation logic
last_updated = "2025-06-25T01:40:09.511783"
+++
# Task 14: AI-Generated Approval Predicate Suggestions
> *This task is specific to codex-rs.*
## Status
**General Status**: Not started
**Summary**: Not started; missing Implementation details (How it was implemented and How it works).
## Goal
When a shell command is not auto-approved, the approval prompt should include 1–3 AI-generated approval predicates. Each suggestion is a time-limited Python predicate snippet plus an explanation of the full set of permissions it would grant. Users can pick one suggestion to append to the session's approval policy as a broader-scope allow rule.
## Acceptance Criteria
- When a command is not auto-approved, show up to 3 suggested predicates inline in the TUI approval dialog.
- Each suggestion consists of:
  - A Python code snippet defining a predicate function.
  - An AI-generated explanation of exactly what permissions or scope that predicate grants.
  - A TTL or expiration timestamp indicating how long it will remain active.
- Users can select one suggestion to append to the session's list of approval predicates.
- Predicates are stored in session state (in-memory) for the duration of the session.
- Provide a slash/CLI command (`/inspect-approval-predicates`) to list current predicates, their code, explanations, and timeouts.
- Support headless and interactive modes equally.
## Implementation
**How it was implemented**
*(Not implemented yet)*
**How it works**
*(Not implemented yet)*
## Notes
- Reuse the existing AI reasoning engine to generate predicate suggestions.
- Represent predicates as Python functions returning a boolean.
- Ensure that expiration is enforced and stale predicates are ignored.
- Integrate the new `/inspect-approval-predicates` command into both the TUI and Exec CLI.
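
The expiration requirement above could be sketched roughly as follows. This is only an illustration under assumptions: `ApprovalPredicate`, its fields, and `prune_expired` are hypothetical names, and the Python source is stored as an opaque string rather than evaluated.

```rust
use std::time::{Duration, Instant};

/// Hypothetical record for one accepted suggestion (not an existing codex-rs type).
#[derive(Debug, Clone)]
pub struct ApprovalPredicate {
    /// Python source of the predicate function, stored verbatim.
    pub python_source: String,
    /// AI-generated explanation of what the predicate allows.
    pub explanation: String,
    /// Moment after which the predicate must be ignored.
    pub expires_at: Instant,
}

impl ApprovalPredicate {
    pub fn new(python_source: &str, explanation: &str, now: Instant, ttl: Duration) -> Self {
        Self {
            python_source: python_source.to_string(),
            explanation: explanation.to_string(),
            expires_at: now + ttl,
        }
    }

    /// A predicate is active strictly before its expiration time.
    pub fn is_active(&self, now: Instant) -> bool {
        now < self.expires_at
    }
}

/// Drops stale predicates in place and returns how many were removed.
pub fn prune_expired(predicates: &mut Vec<ApprovalPredicate>, now: Instant) -> usize {
    let before = predicates.len();
    predicates.retain(|p| p.is_active(now));
    before - predicates.len()
}
```

Calling `prune_expired` before every approval check (and before listing predicates in `/inspect-approval-predicates`) would keep stale entries from ever being consulted.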


@@ -1,28 +0,0 @@
+++
id = "17"
title = "Sandbox Pre-commit Permission Error"
status = "Not started"
dependencies = "15" # Rationale: depends on Task 15 for sandbox worktree configuration
last_updated = "2025-06-25T01:41:34.737190"
+++
> *This task addresses scaffolding/setup for Agent worktrees.*
## Acceptance Criteria
- Pre-commit hooks detect sandbox environment and skip or override gitconfig locking.
- Documentation in scaffold guides is updated to note pre-commit limitations and workarounds.
- Verification steps demonstrate pre-commit hooks succeeding in sandbox without modifying user gitconfig.
## Implementation
**How it was implemented**
*(Not implemented yet)*
**How it works**
*(Not implemented yet)*
## Notes
- The sandbox prevents locking ~/.gitconfig, leading to PermissionError.
- Consider configuring pre-commit to use a repo-local config or skip locking by passing `--config` or setting `PRE_COMMIT_HOME`.


@@ -1,36 +0,0 @@
+++
id = "20"
title = "Render Patch Content in Chat Display Window for Approve/Deny"
status = "Not started"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T01:41:34.738344"
+++
> *This task is specific to the chat UI renderer.*
## Acceptance Criteria
- When displaying a patch for approve/deny, the full diff for the active patch is rendered inline in the chat window.
- Older or superseded patches collapse to show only up to N lines of context, with an indicator (e.g. "... 10 lines collapsed ...").
- File paths in diff headers are shown relative to the current working directory, unless the file resides outside the CWD.
- Event logs around patch application are simplified: drop structured event data and replace with a simple status note (e.g. "patch applied").
- Configurable parameter (e.g. `patch_context_lines`) controls the number of context lines for collapsed hunks.
- Preserve the user's draft input when an approval dialog or patch diff appears; ensure the draft editor remains visible so users can continue editing while reviewing.
- Provide end-to-end integration tests that simulate drafting long messages, triggering approval dialogs and overlays, and verify that all UI elements (draft editor, diffs, logs) render correctly without overlap or content loss.
- Exhaustively test all dialog interaction flows (approve, deny, cancel) and overlay scenarios to confirm consistent behavior across combinations and prevent rendering artifacts.
## Implementation
**How it was implemented**
- Extend the chat renderer to detect patch approval prompts and render diffs using a custom formatter.
- Compute relative paths via `Path::strip_prefix`, falling back to full path if outside CWD.
- Track the current patch ID and render its full content; collapse previous patch bodies according to `patch_context_lines` setting.
- Preserve and render the current draft buffer alongside the active patch diff, ensuring live edits remain visible during approval steps.
- Add integration tests using the TUI test harness or end-to-end framework to simulate user input of long text, approval flows, overlay dialogs, and log output, asserting correct screen layout and content integrity.
- Design a parameterized test matrix covering all dialog interaction flows (approve/deny/cancel) and overlay transitions to ensure exhaustive coverage and UI sanity.
- Replace verbose event debug output with a single-line status message.
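
A minimal sketch of two of the steps above: path relativization via `Path::strip_prefix` and collapsing of older patch bodies. The function names `display_path` and `collapse_lines` are hypothetical, and the indicator text simply follows the example wording in the acceptance criteria.

```rust
use std::path::{Path, PathBuf};

/// Returns `path` relative to `cwd`, or the full path when it lies outside `cwd`.
pub fn display_path(path: &Path, cwd: &Path) -> PathBuf {
    match path.strip_prefix(cwd) {
        Ok(relative) => relative.to_path_buf(),
        Err(_) => path.to_path_buf(),
    }
}

/// Keeps the first `context_lines` lines of a superseded patch and replaces
/// the remainder with a single "collapsed" indicator line.
pub fn collapse_lines(lines: &[&str], context_lines: usize) -> Vec<String> {
    if lines.len() <= context_lines {
        return lines.iter().map(|l| l.to_string()).collect();
    }
    let mut out: Vec<String> = lines[..context_lines]
        .iter()
        .map(|l| l.to_string())
        .collect();
    out.push(format!(
        "... {} lines collapsed ...",
        lines.len() - context_lines
    ));
    out
}
```

Here `context_lines` plays the role of the `patch_context_lines` setting; the active patch would bypass `collapse_lines` entirely.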
## Notes
- Users can override `patch_context_lines` in their config to see more or fewer collapsed lines.
- Ensure compatibility with both live TUI sessions and persisted transcript logs.


@@ -1,37 +0,0 @@
+++
id = "22"
title = "Message Separation and Sender-Content Layout Options"
status = "Done"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T11:05:55.000000"
+++
## Summary
Add configurable options for inter-message spacing and sender-content line breaks in chat rendering
**in the codex-rs package** - **NOT** the codex-cli package.
## Goal
Provide users with flexibility in how chat messages are visually separated and how sender labels are displayed relative to message content:
- Control whether an empty line is inserted between consecutive messages.
- Control whether sender and content appear on the same line or on separate lines.
## Acceptance Criteria
- Introduce one new config flag under the UI section:
  - `message_spacing: true|false` controls inserting a blank line between messages when true.
  - Defaults to `false` to preserve the current compact layout.
- When `message_spacing` is enabled, render an empty line between each message bubble or block.
- Add unit tests to verify the layout produces the correct sequence of lines.
## Implementation
### Plan
**How it was implemented**
- Extend the chat UI renderer to read `message_spacing` from config.
- In the message rendering routine, after emitting each message block, conditionally insert a blank line if `message_spacing` is true.
- Write unit tests for both values of `message_spacing`, covering single-line messages, multi-line content, and boundaries.
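
The rendering rule can be illustrated with a small pure function. This is a sketch, not the renderer itself: `layout_messages` is a hypothetical name, each message is modeled as a list of already-formatted lines, and the blank line is assumed to go only between messages (none after the last one).

```rust
/// Flattens message blocks into display lines, inserting one empty line
/// between consecutive messages when `message_spacing` is true.
pub fn layout_messages(messages: &[Vec<String>], message_spacing: bool) -> Vec<String> {
    let mut out = Vec::new();
    for (index, block) in messages.iter().enumerate() {
        // Separator goes before every message except the first.
        if message_spacing && index > 0 {
            out.push(String::new());
        }
        out.extend(block.iter().cloned());
    }
    out
}
```

Keeping the layout step free of terminal I/O makes the unit tests a matter of comparing line sequences.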
## Notes
- These options improve readability for users who prefer more visual separation or clearer sender labels.
- Keep default settings unchanged to avoid surprising existing users.


@@ -1,33 +0,0 @@
+++
id = "24"
title = "Guard Against Missing Tool Output in JS Server Sequencing"
status = "Not started"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T01:40:09.600000"
+++
## Summary
Prevent out-of-order chat messages and missing tool outputs when user input interrupts tool execution in the JS backend.
## Goal
Ensure the JS server never emits a user or model message before the corresponding tool output has been delivered. Add sequencing guards to the message dispatcher so that aborted rollouts or interleaved user messages cannot cause "No tool output found" errors.
## Acceptance Criteria
- When a tool invocation is interrupted or the user sends a message mid-rollout, the JS server buffers subsequent messages until the tool output event arrives or the invocation is explicitly cancelled.
- The server must never log or emit an error like "No tool output found for local shell call" due to sequencing mismatch.
- Add automated tests simulating mid-rollout user interrupts in the JS test suite, verifying correct buffering and eventual message delivery or cancellation.
## Implementation
**How it was implemented**
- In the JS message dispatcher, track pending tool invocations by ID and delay processing of new chat messages until the pending invocation resolves (success, failure, or cancel).
- Add a guard in the `handleUserMessage` path to check for unresolved tool IDs before appending user content; if pending, queue the message.
- On receiving `toolOutput` or `toolError` for an invocation ID, flush any queued messages in order.
- Implement explicit cancellation paths so that if a tool invocation is abandoned, queued messages still flow after cancellation confirmation.
- Add unit and integration tests in the JS test harness to cover normal, aborted, and concurrent message scenarios.
## Notes
- This change prevents 400 Bad Request errors from tool retries where the model requests a tool before the output is streamed.
- Keep diagnostic logs around sequencing logic for troubleshooting but avoid spamming on normal race cases.


@@ -1,78 +0,0 @@
+++
id = "25"
title = "Guard Against Missing Tool Output in Rust Server Sequencing"
status = "Needs input"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T22:50:01.000000"
+++
## Summary
Prevent out-of-order chat messages and missing tool output errors when user input interrupts tool execution in the Rust backend.
## Goal
Ensure the Rust server implementation sequences tool output and chat messages correctly. Add synchronization logic so that an in-flight tool invocation either completes or is cancelled before new messages are processed, avoiding "No tool output found" invalid_request errors.
## Acceptance Criteria
- The Rust message broker must detect pending tool invocations and pause delivery of subsequent user or model messages until the tool result or cancellation is handled.
- No panic or 400 Bad Request errors should occur due to missing tool output in edge cases of interrupted rollouts or mid-stream user input.
- Add Rust integration tests simulating tool invocation interruption and user message interleaving, verifying correct ordering and delivery.
## Implementation
We will implement the following high-level plan:
- Locate where the ChatCompletion request messages array is built in Rust: the `stream_chat_completions` function in `codex-rs/core/src/chat_completions.rs`.
- In that loop, track pending tool invocations by their call IDs when encountering `ResponseItem::FunctionCall` entries.
- Buffer any subsequent `ResponseItem::Message { role: "user" }` or new turn inputs until the matching `ResponseItem::FunctionCallOutput` (tool result) appears.
- Once the tool output is seen, flush buffered user messages in order immediately before continuing to build the next API call.
- Add tests under `codex-rs/core/tests/` (e.g. `guard_tool_output_sequencing.rs`) that exercise interleaved input sequences:
  - A user message mid-rollout before tool output, ensuring it is delayed until after the tool result.
  - Normal flow where no buffering is needed.
  - Cancellation paths (no tool output), inserting a fake "Tool cancelled" tool message and flushing buffered messages.
- Front-end layers automatically render the "Tool cancelled" message as a tool output, satisfying UI requirements.
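
The buffering plan above might look roughly like this when reduced to a pure function over a list of items. It is a sketch under assumptions: `Item` is a simplified stand-in for `ResponseItem` (not the real type), `sequence_items` is a hypothetical name, and any call still pending at the end of the input is treated as cancelled.

```rust
/// Simplified, hypothetical stand-in for the real `ResponseItem` enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Item {
    UserMessage(String),
    FunctionCall { call_id: String },
    FunctionCallOutput { call_id: String, output: String },
}

/// Reorders items so that no user message appears between a function call
/// and its output; unanswered calls receive a synthetic "Tool cancelled" output.
pub fn sequence_items(items: Vec<Item>) -> Vec<Item> {
    let mut out = Vec::new();
    let mut pending: Vec<String> = Vec::new(); // call IDs awaiting output
    let mut buffered: Vec<Item> = Vec::new(); // user messages held back

    for item in items {
        match item {
            Item::FunctionCall { call_id } => {
                pending.push(call_id.clone());
                out.push(Item::FunctionCall { call_id });
            }
            Item::FunctionCallOutput { call_id, output } => {
                pending.retain(|id| *id != call_id);
                out.push(Item::FunctionCallOutput { call_id, output });
                // Flush held-back messages once nothing is in flight.
                if pending.is_empty() {
                    out.append(&mut buffered);
                }
            }
            Item::UserMessage(text) => {
                if pending.is_empty() {
                    out.push(Item::UserMessage(text));
                } else {
                    buffered.push(Item::UserMessage(text));
                }
            }
        }
    }

    // Cancellation path: close every unanswered call, then flush the buffer.
    for call_id in pending.drain(..) {
        out.push(Item::FunctionCallOutput {
            call_id,
            output: "Tool cancelled".to_string(),
        });
    }
    out.append(&mut buffered);
    out
}
```

The three test scenarios listed above (delayed user message, normal flow, cancellation) map directly onto inputs to this function.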
## Notes
```text
thinking █Deciding on file management █
───╮
I'm thinking that since file editing can be a hassle, it might be simpler to just remove the file. I could then ask the user to provide │any errors they're facing. To do this, I'd create a patch to delete the specified file. Alternatively, I could consider reverting the │
changes with git, but applying the patch seems like a straightforward solution. So, I'll go ahead and prepare to apply the patch to ──╯
delete the file.
proposed patch
D /Users/mpokorny/code/codex/agentydragon/tasks/.worktrees/23-interactive-container-command-affordance/codex-rs/.cargo/config.toml ny
ERROR: Turn interrupted
user
why did you try to delete that? s P
event
BackgroundEvent(BackgroundEventEvent { message: "stream error: unexpected status 400 Bad Request: {\n \"error\": {\n \"message\": \"No tool output found for local shell call call_PjdhYGAM7q52HL13WF8slGn9.\",\n \"type\": \"invalid_request_error\",\n \"param\":
\"input\",\n \"code\": null\n }\n}; retrying 1/10 in 201ms…" }) │
event │
BackgroundEvent(BackgroundEventEvent { message: "stream error: unexpected status 400 Bad Request: {\n \"error\": {\n \"message\": │
\"No tool output found for local shell call call_PjdhYGAM7q52HL13WF8slGn9.\",\n \"type\": \"invalid_request_error\",\n \"param\": │\"input\",\n \"code\": null\n }\n}; retrying 2/10 in 246ms…" }) │
event │BackgroundEvent(BackgroundEventEvent { message: "stream error: unexpected status 400 Bad Request: {\n \"error\": {\n \"message\": │
\"No tool output found for local shell call call_PjdhYGAM7q52HL13WF8slGn9.\",\n \"type\": \"invalid_request_error\",\n \"param\": █
\"input\",\n \"code\": null\n }\n}; retrying 3/10 in 371ms…" }) █
this is a lot of the problem still happening
```
## Next Steps / Debugging
The above change did not resolve the issue. We need to gather more debug information to understand why missing tool output errors still occur.
Suggested approaches:
- Enable detailed debug logging in the Rust message broker (e.g. set `RUST_LOG=debug` or add tracing spans around function calls).
- Dump the sequence of incoming and outgoing `ResponseItem` events to a log file for offline analysis.
- Instrument timing and ordering by recording timestamps when tool invocations start, complete, and when user input is received.
- Write a minimal reproduction harness that reliably triggers the missing output error under controlled conditions.
- Capture full request/response payloads to/from the OpenAI API to verify whether the function output is delivered but not processed.
Please expand this section with specific examples or helper scripts to collect the necessary data.
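
As one possible helper for the data collection described above, the outgoing item list could be checked just before each request is sent, logging any call ID that has no matching output. This is a sketch only: `Logged` and `unanswered_calls` are hypothetical names, and the real check would operate on `ResponseItem` values.

```rust
use std::collections::HashSet;

/// Hypothetical, reduced view of one item in an outgoing request.
#[derive(Debug)]
pub enum Logged {
    Call(String),   // tool call with its call ID
    Output(String), // tool output with the call ID it answers
    Other,          // any message that is neither
}

/// Returns the IDs of tool calls that have no output anywhere in `items`,
/// in the order the calls appear.
pub fn unanswered_calls(items: &[Logged]) -> Vec<String> {
    let answered: HashSet<&str> = items
        .iter()
        .filter_map(|item| match item {
            Logged::Output(id) => Some(id.as_str()),
            _ => None,
        })
        .collect();

    items
        .iter()
        .filter_map(|item| match item {
            Logged::Call(id) if !answered.contains(id.as_str()) => Some(id.clone()),
            _ => None,
        })
        .collect()
}
```

A non-empty result immediately before a request would pinpoint the turn in which the sequence became invalid, which is the condition the 400 responses in the transcript above complain about.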


@@ -1,36 +0,0 @@
+++
id = "26"
title = "Render Approval Requests in Separate Dialog from Draft Window"
status = "Not started"
dependencies = "09,23" # Rationale: depends on Tasks 09 and 23 for file-level approvals and interactive command affordance
last_updated = "2025-06-25T01:40:09.600000"
+++
## Summary
Display patch approval prompts in a distinct dialog or panel to avoid overlaying the draft editor.
## Goal
Change the chat UI so that approval requests (patch diffs for approve/deny) appear in a separate dialog element or panel, positioned adjacent to or below the chat window, rather than overlaying the draft input area.
This eliminates overlay conflicts and ensures the draft editor remains fully visible and interactive while reviewing patches.
## Acceptance Criteria
- Approval prompts with diffs open in a distinct UI element (e.g. side panel or bottom pane) that does not obscure the draft editor.
- The draft input area remains fully visible and editable whenever an approval dialog is active.
- The approval dialog is visually distinguished (border, background) and clearly labeled.
- The layout adjusts responsively for narrow/short terminal sizes, maintaining separation without clipping content.
- Add functional tests or integration tests verifying that the draft input remains accessible and that the approval dialog contents are rendered in the new panel.
## Implementation
**How it was implemented**
- Refactor the patch-approval renderer to spawn a separate TUI view (`ApprovalDialogView`) instead of the overlay popup.
- Allocate a consistent panel region (e.g. bottom X rows or right-hand column) for approval dialogs, reserving the draft editor region above or to the left.
- Update layout logic to recalculate positions on terminal resize, ensuring both panels remain visible.
- Style the new dialog with its own borders and title bar (e.g. "Approval Request").
- Add integration tests using the TUI test harness to simulate opening approval prompts and verifying that typing in the draft area still works and that the dialog appears in the correct panel.
## Notes
- This change fixes the long-standing overlay bug where approval diffs obstruct the draft.
- Future enhancements may allow toggling between inline overlay or separate panel modes.


@@ -1,46 +0,0 @@
+++
id = "27"
title = "Unified Sandbox-Retry Prompt with y/a/A/n Options (Rust)"
status = "Not started"
dependencies = "15,17" # Rationale: depends on Tasks 15 and 17 for sandbox configuration and pre-commit permission handling
last_updated = "2025-06-25T01:40:09.600000"
+++
## Summary
Implement a unified retry-without-sandbox prompt in the Rust TUI with one-shot, session-scoped, and persistent options.
## Goal
Replace the two-stage sandbox-retry and approval flow with a single, unified prompt in the Rust UI. Provide four hotkey options (y/a/A/n) to control sandbox behavior at varying scopes:
- y: retry this one command without sandbox
- a: always run without sandbox but still ask first
- A: always run without sandbox and never ask again
- n: keep using sandbox
## Acceptance Criteria
- When a sandboxed shell invocation fails (exit code ≠ 0), display a single prompt:
```
Retry without sandbox
y Yes, run without sandbox this one time
a Yes, always run without sandbox but still ask me first
A Yes, always run without sandbox and do not ask again
n No, keep using sandbox
```
- Hotkeys y/a/A/n must map to the corresponding behavior and dismiss the prompt.
- The prompt replaces the older two-stage “retry?” + “Allow command?” dialogs.
- Add unit/integration tests simulating a failing sandbox command and each hotkey path, verifying correct sandbox flag logic.
## Implementation
**How it was implemented**
- Refactor the sandbox error handler in `tui/src/shell.rs` to emit a single `SandboxRetryPrompt` event instead of separate prompts.
- Create a new TUI widget `SandboxRetryWidget` that renders the four-line menu and captures y/a/A/n keys.
- Map each choice to updating the per-session config (`Config.tui.sandbox_mode`) and retrying or aborting the command as appropriate.
- Update the shell-invocation pipeline to consult the new `sandbox_mode` setting and skip sandbox when indicated.
- Write Rust tests (in `tui/tests/`) to simulate sandbox failures and user key presses for all four options.
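
The hotkey mapping could be captured in a small enum, sketched below. The names `SandboxRetryChoice` and `choice_for_key` are hypothetical, and the sketch assumes that `a` and `A` also retry the current command without the sandbox, which the task text does not state explicitly.

```rust
/// Hypothetical result of the unified prompt; one variant per hotkey.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxRetryChoice {
    RetryOnce,           // y
    AlwaysButAsk,        // a
    AlwaysWithoutAsking, // A
    KeepSandbox,         // n
}

/// Maps a pressed key to a choice; any other key leaves the prompt open.
pub fn choice_for_key(key: char) -> Option<SandboxRetryChoice> {
    match key {
        'y' => Some(SandboxRetryChoice::RetryOnce),
        'a' => Some(SandboxRetryChoice::AlwaysButAsk),
        'A' => Some(SandboxRetryChoice::AlwaysWithoutAsking),
        'n' => Some(SandboxRetryChoice::KeepSandbox),
        _ => None,
    }
}

impl SandboxRetryChoice {
    /// Whether the failed command is re-run outside the sandbox (assumed for y/a/A).
    pub fn retries_without_sandbox(self) -> bool {
        self != SandboxRetryChoice::KeepSandbox
    }

    /// Whether the choice should be stored beyond the current command.
    pub fn is_sticky(self) -> bool {
        matches!(
            self,
            SandboxRetryChoice::AlwaysButAsk | SandboxRetryChoice::AlwaysWithoutAsking
        )
    }
}
```

Because the mapping is case-sensitive, tests for each hotkey path should cover both `a` and `A` separately.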
## Notes
- This unifies and simplifies the UX, removing confusion from layered prompts.
- The three levels of scope (one-off, scoped prompt, no prompt) give power users flexibility and safety.


@@ -1,29 +0,0 @@
+++
id = "29"
title = "Auto-Approve Empty-Array Tool Invocations"
status = "Not started"
dependencies = "02" # Rationale: depends on Task 02 for auto-approval logic
last_updated = "2025-06-25T01:40:09.600000"
+++
## Summary
Automatically approve tool-use requests where the command array is empty, bypassing the approval prompt.
## Goal
In rare cases the model may emit a tool invocation event with an empty `command: []`. These invocations cannot succeed and continually trigger errors. Automatically treat empty-array tool requests as approved (once), suppressing the approval UI, to allow downstream error handling rather than perpetual prompts.
## Acceptance Criteria
- Detect tool requests where `command: []` (no arguments).
- Do not open the approval prompt for these cases; instead, automatically approve and allow the tool pipeline to proceed (and eventually handle the error).
- Include a unit test simulating an empty-array tool invocation that verifies no approval prompt is shown and that a `ReviewDecision::Approved` is returned immediately.
## Implementation
**How it was implemented**
- In the command-review widget setup (`ApprovalRequest::Exec`), check for `command.is_empty()` before rendering; if empty, directly send `ReviewDecision::Approved` and mark the widget done.
- Add a Rust unit test for `UserApprovalWidget` to feed an `Exec { command: vec![] }` request and assert automatic approval without rendering the select mode.
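
The early-exit check is small enough to isolate in a helper, sketched here. `ReviewDecision` below is a local stand-in with only two variants (the real enum lives in the codebase), and `auto_decision` is a hypothetical name.

```rust
/// Local stand-in for the real `ReviewDecision` type.
#[derive(Debug, PartialEq, Eq)]
pub enum ReviewDecision {
    Approved,
    Denied,
}

/// Returns a decision that can be sent without showing the prompt,
/// or `None` when the user must be asked as usual.
pub fn auto_decision(command: &[String]) -> Option<ReviewDecision> {
    if command.is_empty() {
        // An empty command cannot run anything; let downstream error handling deal with it.
        Some(ReviewDecision::Approved)
    } else {
        None
    }
}
```

The widget setup would call this first and only build the select mode when it returns `None`.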
## Notes
- This is a pragmatic workaround for spurious empty-command tool calls; a more robust model-side fix may replace this later.


@@ -1,41 +0,0 @@
+++
id = "30"
title = "Non-Fullscreen Scrollback Mode with Native Terminal Scroll"
status = "Not started"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T01:40:09.600000"
+++
## Summary
Offer a non-fullscreen TUI mode that appends conversation output and defers scrolling to the terminal scrollback.
## Goal
Provide an optional non-fullscreen mode for the chat UI where:
- The TUI does not capture the mouse scroll wheel.
- All conversation output is appended in place, allowing the terminal's native scrollback to navigate history.
- The user-entry window remains fixed at the bottom of the terminal.
- The entire UI runs in a standard terminal buffer (no alternate screen), so the user can use their terminal's scrollbar or scrollback keys to review past messages.
## Acceptance Criteria
- Introduce a `tui.non_fullscreen_mode` config flag (default `false`).
- When enabled, the application:
  - Disables alternate screen buffering (i.e. does not switch to the TUI alt-screen).
  - Does not intercept mouse scroll events; scroll events are passed through to the terminal.
  - Renders new chat messages inline (appended) rather than redrawing the full viewport.
  - Keeps the user input prompt visible at the bottom after each message.
- Add integration tests or manual validation steps to confirm that: scrollback keys/mouse scroll work via terminal scrollback, and the prompt remains in view.
## Implementation
**How it was implemented**
- Add `non_fullscreen_mode: bool` to the `tui` config section.
- In the TUI initialization, skip entering the alternate screen and disable pannable viewports.
- Remove mouse event capture for scroll wheel events when `non_fullscreen_mode` is true.
- Change rendering loop: after each new message, print the message directly to the stdout buffer (in append mode), then redraw only the input prompt line.
- Write integration tests that spawn the TUI in non-fullscreen mode, emit multiple messages, send scroll events (if possible), and assert that scrollback buffer contains the messages.
## Notes
- This mode trades advanced in-TUI scrolling features for simplicity and compatibility with users accustomed to terminal scrollback.
- It may not support complex viewport resizing; documentation should note that.


@@ -1,49 +0,0 @@
+++
id = "32"
title = "Embedded Neovim Prompt Editor"
status = "Not started"
dependencies = "06" # Rationale: depends on Task 06 for external editor integration
last_updated = "2025-06-25T01:40:09.513224"
+++
# Task 32: Embedded Neovim Prompt Editor
> *This task is specific to codex-rs.*
## Status
**General Status**: Not started
**Summary**: Not started; missing Implementation details (How it was implemented and How it works).
## Goal
Replace the basic line-editing prompt composer with an embedded Neovim window so users can enjoy full-featured, multi-line editing of their chat prompt directly inside the TUI.
## Acceptance Criteria
- Introduce a TUI-integrated Neovim editor pane activated via `/edit-prompt` or `Ctrl+E` when `embedded_prompt_editor = true` in `[tui]` config.
- Pre-populate the Neovim buffer with the current draft prompt; upon exit, reload the buffer contents back into the composer.
- Support standard Neovim keybindings and commands (e.g. insert mode, visual mode, plugins) within the embedded pane.
- Cleanly restore the previous TUI layout after closing the editor, with prompt focus returned to the composer.
- Provide configuration toggle (`embedded_prompt_editor`) and fall back to external-editor prompt behavior when disabled.
## Implementation
**How it was implemented**
- Add a new module `tui/src/editor/neovim.rs` that wraps a headless Neovim RPC instance and renders its UI into a dedicated TUI layer.
- Extend `tui/src/bottom_pane/chat_composer.rs` to detect `embedded_prompt_editor` and invoke the embedded editor instead of spawning an external process.
- Wire a config flag `embedded_prompt_editor: bool` through `ConfigToml` → `Config` under the `tui` section, defaulting to `false`.
- Handle Neovim communication via `nvim-rs` crate, multiplexing input/output over the TUI event loop.
**How it works**
- When the user triggers the editor, pause the main TUI rendering and allocate a full-screen or split view for Neovim.
- Start Neovim in embedded RPC mode, passing the current prompt text into a new buffer.
- Drive Neovim's UI updates via RPC and render its screen cells into the TUI terminal using termion or a similar backend.
- Detect the Neovim exit event (e.g. user `:q` or `ZZ`), fetch the buffer contents, and close the embedded view.
- Restore the original TUI state and update the composer widget with the edited prompt.
## Notes
- This relies on a working `nvim` binary in PATH or specified via `nvim_binary` config.
- Investigate performance impact of embedding a full editor in the TUI; ensure fallback to external-editor remains smooth.
- Consider edge cases (resizing, plugin-heavy Neovim configs) and document prerequisites in the README.


@@ -1,34 +0,0 @@
+++
id = "33"
title = "Fix External Editor Focus Issue"
status = "Not started"
summary = "When launching the external editor from the TUI (e.g. nvim), keyboard input is still captured by the Rust TUI, causing keys to split between the editor and the TUI."
dependencies = "06,32" # Rationale: depends on Tasks 06 and 32 for external and embedded editor features
last_updated = "2025-06-25T01:40:09.700000"
+++
# Task 33: Fix External Editor Focus Issue
## Goal
Ensure that when the TUI spawns an external editor, it fully hands off keyboard control to the editor, and upon editor exit, restores TUI input handling without leaking keystrokes or misrouting commands.
## Acceptance Criteria
- Launching external editor via `/edit-prompt` or Ctrl+E disables TUI raw mode and event capture so all keystrokes go directly to the editor.
- Upon editor exit, raw mode and event capture are correctly re-enabled, and no keystrokes are lost or misrouted.
- No residual input events are processed by the TUI while the editor is running.
- Add integration tests or manual validation steps simulating editor launch and exit sequences.
## Implementation
**High-level plan**
- Before spawning the editor process (in `ChatComposer`), call `disable_raw_mode()` and `disable_event_capture()` to restore normal terminal behavior.
- Spawn the editor subprocess and wait for it to exit.
- After exit, re-enable raw mode and event capture via `enable_raw_mode()` and `enable_event_capture()`.
- Wrap this sequence in a helper function (e.g., `spawn_external_editor`) and update the `/edit-prompt` handler to use it.
- Add integration tests in `tui/tests/` that mock the editor command (e.g., `echo`) to verify terminal mode transitions.
## Notes
- Use Crossterm APIs for terminal mode management.
- Ensure interruption signals (e.g., Ctrl+C) during editor sessions are propagated correctly to avoid TUI deadlock.
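
The suspend/spawn/resume sequence can be made robust with a drop guard, so terminal state is restored even if the editor step panics or returns early. The sketch below is an assumption-laden illustration: `TerminalModes`, `with_terminal_released`, and `RecordingTerminal` are hypothetical names, and a real implementation of the trait would presumably call Crossterm's `disable_raw_mode()` / `enable_raw_mode()` plus whatever event-capture toggles the TUI uses.

```rust
/// Hypothetical abstraction over the terminal state the TUI owns.
pub trait TerminalModes {
    /// Leave raw mode and stop capturing input events.
    fn suspend(&mut self);
    /// Re-enter raw mode and resume capturing input events.
    fn resume(&mut self);
}

/// Guard that resumes the terminal when dropped, including during unwinding.
struct Released<'a, T: TerminalModes>(&'a mut T);

impl<'a, T: TerminalModes> Drop for Released<'a, T> {
    fn drop(&mut self) {
        self.0.resume();
    }
}

/// Runs `run_editor` with the terminal handed over, then restores TUI modes.
pub fn with_terminal_released<T: TerminalModes, R>(
    term: &mut T,
    run_editor: impl FnOnce() -> R,
) -> R {
    term.suspend();
    let _guard = Released(term);
    run_editor()
}

/// Test double that records the order of mode transitions.
#[derive(Default)]
pub struct RecordingTerminal {
    pub log: Vec<&'static str>,
}

impl TerminalModes for RecordingTerminal {
    fn suspend(&mut self) {
        self.log.push("suspend");
    }
    fn resume(&mut self) {
        self.log.push("resume");
    }
}
```

In this shape, `spawn_external_editor` would pass a closure that spawns the editor process and waits for it, and the integration tests could assert on the recorded transition order with a mocked editor command.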


@@ -1,41 +0,0 @@
+++
id = "34"
title = "Complete Set Shell Title to Reflect Session Status"
status = "Not started"
dependencies = "08" # Rationale: depends on Task 08 for initial shell title change
last_updated = "2025-06-25T04:45:29Z"
+++
> *This task is specific to codex-rs.*
## Status
**General Status**: Not started
**Summary**: Follow-up to Task 08; implementation missing for core title persistence and ANSI updates.
## Goal
Implement the missing pieces from Task 08 to fully support dynamic and persistent shell title updates:
1. Define `SessionUpdatedTitleEvent` and add a `title` field in `SessionConfiguredEvent` (core protocol).
2. Introduce `Op::SetTitle(String)` variant and handle it in the core agent loop, persisting the title and emitting the update event.
3. Update TUI and exec clients to listen for title events and emit ANSI escape sequences (`\x1b]0;<title>\x07`) for live terminal title changes.
4. Restore the persisted title on session resume via `SessionConfiguredEvent`.
## Acceptance Criteria
- New `SessionUpdatedTitleEvent` type in `codex_core::protocol` and `title` field in `SessionConfiguredEvent`.
- `Op::SetTitle(String)` variant in the protocol and core event handling persisted in session metadata.
- Clients broadcast ANSI title-setting sequences on title events and lifecycle state changes.
- Unit tests for protocol serialization and client reaction to title updates.
## Implementation
**How it was implemented**
*(Not implemented yet)*
**How it works**
*(Not implemented yet)*
## Notes
- Use ANSI escape code `\x1b]0;<title>\x07` for setting terminal title.
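
A minimal sketch of building that sequence, which is the OSC 0 form (`ESC ] 0 ; <title> BEL`). The function name `title_escape` is hypothetical, and stripping control characters is an added precaution, not something the task requires: it prevents a title containing BEL or ESC from terminating the sequence early.

```rust
/// Builds the escape sequence that sets the terminal title to `title`.
/// Control characters are removed so the title cannot break out of the sequence.
pub fn title_escape(title: &str) -> String {
    let clean: String = title.chars().filter(|c| !c.is_control()).collect();
    format!("\x1b]0;{}\x07", clean)
}
```

Clients would write the returned string to stdout when a title event arrives, for example `print!("{}", title_escape("codex: running"))` followed by a flush.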


@@ -1,39 +0,0 @@
+++
id = "36"
title = "Add Tests for Interactive Prompting While Executing"
status = "Not started"
dependencies = "06,13" # Rationale: depends on Tasks 06 and 13 for external editor and interactive prompt support
last_updated = "2025-06-25T11:05:55Z"
+++
> *This task is specific to codex-rs.*
## Status
**General Status**: Done
**Summary**: Follow-up to Task 13; add unit tests for interactive prompt overlay during execution.
## Goal
Write tests that verify `BottomPane::handle_key_event` forwards input to the composer while `is_task_running`, preserving the status overlay until completion.
## Acceptance Criteria
- Unit tests covering key events (e.g. alphanumeric, Enter) during `is_task_running == true`.
- Assertions that `active_view` remains a `StatusIndicatorView` while running and is removed when `set_task_running(false)` is called.
- Coverage of redraw requests and correct `InputResult` values.
## Implementation
**Planned Approach**
- Use existing `make_pane` and `make_pane_and_rx` helpers to create a `BottomPane` in a running-task state.
- Write unit tests in `tui/src/bottom_pane/mod.rs` that verify:
  - Typing alphanumeric characters while `is_task_running == true` appends to the composer, maintains the `StatusIndicatorView` overlay, and emits an `AppEvent::Redraw`.
  - Pressing Enter returns `InputResult::Submitted` with the buffered text, clears the composer, retains the overlay, and triggers a redraw.
  - Calling `set_task_running(false)` removes the status indicator overlay.
- Follow existing patterns from the tests in `user_approval_widget.rs` and `set_title_view.rs`.
## Notes
- Refer to existing tests in `user_approval_widget.rs` and `set_title_view.rs` for testing patterns.


@@ -1,41 +0,0 @@
+++
id = "37"
title = "Session State Persistence and Debug Instrumentation"
status = "Not started"
dependencies = ""
last_updated = "2025-06-25T23:00:00.000000"
+++
## Summary
Persist session runtime state and capture raw request/response data and supplemental metadata to a session-specific directory.
## Goal
Collect and persist all relevant session state (beyond the rollout transcript) in a dedicated directory under `.codex/sessions/<UUID>/`, to aid debugging and allow post-mortem analysis.
## Acceptance Criteria
- All session data (transcript, logs, raw OpenAI API requests/responses, approval events, and other runtime metadata) is written under `.codex/sessions/<session_id>/`.
- Existing rollout transcript continues to be written to `sessions/rollout-<UUID>.jsonl`, now moved or linked into the session directory.
- Logging configuration respects `--debug-log` and writes to the session directory when set to a relative path.
- A selector flag (e.g. `--persist-session`) enables or disables writing persistent state.
- No change to default behavior when persistence is disabled (i.e. backward compatibility).
- Minimal integration test or manual verification steps demonstrate that files appear correctly and no extraneous error logs occur.
## Implementation
**How it was implemented**
- Add a new CLI flag `--persist-session` to the TUI and server binaries to enable session persistence.
- Compute a session directory under `$CODEX_HOME/sessions/<UUID>/`, create it at startup when persistence is enabled.
- After initializing the rollout file (`rollout-<UUID>.jsonl`), move or symlink it into the session directory.
- Configure tracing subscriber file layer and `--debug-log` default path to write logs into the same session directory (e.g. `session.log`).
- Instrument the OpenAI HTTP client layer to dump raw request and response bodies into `session_oai_raw.log` in that directory.
- In the message sequencing logic, add debug spans to record approval and cancellation events into `session_meta.log`.
**How it works**
- When `--persist-session` is active, all file outputs (rollout transcript, debug logs, raw API dumps, metadata logs) are collated under a single session directory.
- If disabled (default), writes occur in the existing locations (`rollout-<UUID>.jsonl`, `$CODEX_HOME/log/`), preserving current behavior.
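
The file layout described above could be computed in one place, as in this sketch. `SessionPaths` and `session_paths` are hypothetical names; the file names are the ones listed in the implementation notes, and `codex_home` stands for the resolved `$CODEX_HOME` directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical bundle of all per-session artifact locations.
#[derive(Debug)]
pub struct SessionPaths {
    pub rollout: PathBuf,
    pub log: PathBuf,
    pub raw_api: PathBuf,
    pub meta: PathBuf,
    pub dir: PathBuf,
}

/// Derives every artifact path from `codex_home` and the session ID.
/// Nothing is created on disk; callers decide when to create `dir`.
pub fn session_paths(codex_home: &Path, session_id: &str) -> SessionPaths {
    let dir = codex_home.join("sessions").join(session_id);
    SessionPaths {
        rollout: dir.join(format!("rollout-{}.jsonl", session_id)),
        log: dir.join("session.log"),
        raw_api: dir.join("session_oai_raw.log"),
        meta: dir.join("session_meta.log"),
        dir,
    }
}
```

Keeping path derivation separate from directory creation makes it easier to handle permission errors at a single call site and to leave existing locations untouched when `--persist-session` is off.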
## Notes
- This feature streamlines troubleshooting by co-locating all session artifacts.
- Ensure directory creation and file writes handle permission errors gracefully and fall back cleanly when disabled.


@@ -1,35 +0,0 @@
+++
id = "39"
title = "Fix Coloring of Left-Indented Patch Diffs"
status = "Not started"
dependencies = ""
summary = "Patch diffs rendered with left indentation mode are not colored correctly, losing syntax highlighting."
last_updated = "2025-06-25T00:00:00Z"
+++
# Task 39: Fix Coloring of Left-Indented Patch Diffs
> *UI bug:* When patch diffs are rendered in left-indented mode, the ANSI color codes are misaligned, resulting in lost or incorrect coloring.
## Status
**General Status**: Not started
**Summary**: Diagnose offset logic in diff renderer and adjust color processing to account for indentation.
## Goal
Ensure diff lines maintain proper ANSI color highlighting even when indented on the left by a fixed margin.
## Acceptance Criteria
- Diff render tests pass for both default and indented modes.
- Visual manual check confirms colored diff alignment.
## Implementation
- Update diff renderer to strip indentation before applying color logic, then reapply indentation.
- Add unit tests for multiline indented diffs.
## Notes
<!-- Any implementation notes -->
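The strip, color, reapply approach from the Implementation list can be sketched as follows. This is an illustration in Python, not the codex-rs renderer; the function names, the fixed-width space margin, and the specific ANSI colors are assumptions.

```python
RED, GREEN, CYAN, RESET = "\033[31m", "\033[32m", "\033[36m", "\033[0m"


def color_diff_line(line: str) -> str:
    """Color one unindented diff line based on its leading marker."""
    if line.startswith("@@"):
        return f"{CYAN}{line}{RESET}"
    if line.startswith("+"):
        return f"{GREEN}{line}{RESET}"
    if line.startswith("-"):
        return f"{RED}{line}{RESET}"
    return line  # context lines stay uncolored


def render_indented_diff(text: str, indent: int = 4) -> str:
    """Strip the left margin, color each line, then put the margin back."""
    pad = " " * indent
    out = []
    for raw in text.splitlines():
        # Remove the margin first so the +/-/@@ marker is at column 0.
        body = raw[indent:] if raw.startswith(pad) else raw
        out.append(pad + color_diff_line(body))
    return "\n".join(out)
```

Because the margin is added outside the escape sequences, the color codes wrap only the diff text and cannot drift when the indent width changes.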

@@ -1,34 +0,0 @@
+++
id = "40"
title = "Support Multiline Paste in codex-rs CLI Input Window"
status = "Not started"
freeform_status = ""
dependencies = ""
last_updated = "2025-06-25T09:19:34Z"
+++
# Task 40: Support Multiline Paste in codex-rs CLI Input Window
> *This task is specific to codex-rs.*
## Acceptance Criteria
- When pasting multiline text into the codex-rs CLI input (REPL), newlines in the pasted text are inserted into the input buffer rather than causing premature command execution.
- The pasted content preserves original end-of-line characters and spacing.
- The user can still press Enter to submit the complete command when desired.
- Behavior for single-line input and manual line breaks remains unchanged.
## Implementation
**How it was implemented**
Provide details on code modules, design decisions, and steps taken.
*If this section is left blank or contains only placeholder text, the implementing developer should first populate it with a concise high-level plan before writing code.*
**How it works**
Explain runtime behavior and overall operation.
*If this section is left blank or contains only placeholder text, the implementing developer should update it to describe the intended runtime behavior.*
## Notes
- Investigate enabling bracketed paste support in the line-editing library used (e.g. rustyline, liner).
- Ensure that bracketed paste mode is enabled when initializing the CLI to distinguish between pasted content and typed input.
- Review how other REPLs implement multiline paste handling to inform the design.
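To illustrate the bracketed-paste idea in the notes: a terminal that has received `ESC[?2004h` wraps pasted text in `ESC[200~` and `ESC[201~`, so the input loop can tell pasted newlines from a typed Enter. The sketch below is in Python and does not reflect any particular line-editing library; the `feed` function and its return shape are hypothetical.

```python
ENABLE_BRACKETED_PASTE = "\x1b[?2004h"   # written to the terminal at startup
DISABLE_BRACKETED_PASTE = "\x1b[?2004l"  # written on exit
PASTE_START = "\x1b[200~"
PASTE_END = "\x1b[201~"


def feed(data: str) -> tuple[str, list[str]]:
    """Process raw input; return (pending buffer, list of submitted commands)."""
    buffer = ""
    submitted: list[str] = []
    i = 0
    while i < len(data):
        if data.startswith(PASTE_START, i):
            end = data.find(PASTE_END, i)
            if end == -1:
                end = len(data)  # unterminated paste: treat the rest as pasted
            # Pasted text goes into the buffer verbatim, newlines included.
            buffer += data[i + len(PASTE_START):end]
            i = end + len(PASTE_END)
            continue
        ch = data[i]
        if ch in ("\r", "\n"):
            # A typed Enter submits whatever has accumulated.
            submitted.append(buffer)
            buffer = ""
        else:
            buffer += ch
        i += 1
    return buffer, submitted
```

Single-line typing behaves as before, since only text between the paste markers is exempt from Enter handling.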

@@ -1,29 +0,0 @@
+++
id = "41"
title = "Slash-command /init to load init prompt into composer"
status = "Not started"
freeform_status = ""
dependencies = ""
last_updated = "2025-06-25T11:23:30Z"
+++
# Task 41: Slash-command /init to load init prompt into composer
> *This task is specific to codex-rs.*
## Acceptance Criteria
- Typing `/init` in the chat composer should load the contents of `codex-rs/code/init.md` into the input buffer.
- `/init` appears in the slash-command menu alongside other commands.
- After executing `/init`, the composer shows the init prompt, ready for editing.
## Implementation
- Add a new slash-command identifier `/init` in the command dispatch logic (e.g. in `ChatComposer` or equivalent).
- On `/init`, read `codex-rs/code/init.md` (relative to the repository root) and inject its text into the composer buffer.
- Ensure the slash-menu and feedback UI treat `/init` consistently with other commands.
- Write unit tests to verify that `/init` populates the composer correctly without losing focus.
## Notes
Link to the init prompt source: `codex-rs/code/init.md`.
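The dispatch behavior described above could look roughly like this. It is a language-neutral sketch written in Python; `Composer`, `make_init_command`, and `dispatch` are hypothetical names, not the actual `ChatComposer` API.

```python
from pathlib import Path
from typing import Callable, Dict


class Composer:
    """Stand-in for the chat composer: just an editable text buffer."""

    def __init__(self) -> None:
        self.buffer = ""


def make_init_command(prompt_path: Path) -> Callable[[Composer], None]:
    """Build a handler that loads the init prompt into the composer."""
    def run(composer: Composer) -> None:
        composer.buffer = prompt_path.read_text(encoding="utf-8")
    return run


def dispatch(composer: Composer, line: str,
             commands: Dict[str, Callable[[Composer], None]]) -> bool:
    """Run the slash command named by `line`; return False if it is unknown."""
    handler = commands.get(line.strip())
    if handler is None:
        return False
    handler(composer)
    return True
```

Registering `/init` in the same table as the other commands means the slash-menu can be generated from the table's keys, which keeps the menu and the dispatcher consistent.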

@@ -1,32 +0,0 @@
+++
id = "<NN>"
title = "<Task Title>"
status = "<<<!!! MANAGER: SET VALID STATUS - Not started? !!!>>>"
freeform_status = "<<<!!! MANAGER/DEVELOPER: Freeform status text, optional. E.g. progress notes or developer comments. !!!>>>"
dependencies = [<<<!!! MANAGER: LIST TASK IDS THAT MUST BE COMPLETED BEFORE STARTING; SEPARATED BY COMMAS, E.G. "02","05" !!!>>>] # <!-- Manager rationale: explain why these dependencies are required and why other tasks are not. -->
last_updated = "<timestamp in ISO format>"
+++
# Task Template
# Valid status values: Not started | In progress | Needs input | Needs manual review | Done | Cancelled | Merged
> *This task is specific to codex-rs.*
## Acceptance Criteria
List measurable criteria for completion.
## Implementation
**How it was implemented**
Provide details on code modules, design decisions, and steps taken.
*If this section is left blank or contains only placeholder text, the implementing developer should first populate it with a concise high-level plan before writing code.*
**How it works**
Explain runtime behavior and overall operation.
*If this section is left blank or contains only placeholder text, the implementing developer should update it to describe the intended runtime behavior.*
## Notes
Any additional notes or references.

@@ -1,126 +0,0 @@
#!/usr/bin/env python3
"""
check_tasks.py: Run all task-directory validation checks in one go.

- Ensure task Markdown frontmatter parses and validates (id, title, status, etc.).
- Detect circular dependencies among non-merged tasks.
- Enforce only .md files under agentydragon/tasks/ (excluding .worktrees/ and .done/).
"""
import os
import re
import sys
from collections.abc import Iterator
from pathlib import Path

from manager_utils.tasklib import task_dir, worktree_dir, load_task


def skip_path(p: Path) -> bool:
    """Return True for paths we should ignore in validations."""
    wt = worktree_dir()
    done = task_dir() / ".done"
    if p.is_relative_to(wt) or p.is_relative_to(done):
        return True
    if p.name in ("task-template.md",) or p.name.endswith("-plan.md"):
        return True
    return False


def iter_task_markdown() -> Iterator[Path]:
    """Yield all task markdown files under agentydragon/tasks, pruning .worktrees and .done dirs."""
    wt = worktree_dir()
    done = task_dir() / ".done"
    root = task_dir()
    for base, dirs, files in os.walk(str(root)):
        # do not descend into .worktrees or .done
        dirs[:] = [d for d in dirs if (Path(base) / d) not in (wt, done)]
        for fn in files:
            if re.fullmatch(r"[0-9]{2}-.*\.md", fn):
                yield Path(base) / fn


def check_file_types():
    """Return top-level entries under tasks/ that are not .md files."""
    failures: list[Path] = []
    for p in task_dir().iterdir():
        if skip_path(p) or p.is_dir():
            continue
        if p.suffix.lower() != ".md":
            failures.append(p)
    return failures


def check_frontmatter():
    """Return (path, error message) pairs for task files whose frontmatter fails to load."""
    failures: list[tuple[Path, str]] = []
    for md in iter_task_markdown():
        try:
            load_task(md)
        except Exception as e:
            failures.append((md, str(e)))
    return failures


def check_cycles():
    """Return each dependency cycle found among non-merged tasks as a list of task IDs."""
    merged = set()
    deps_map: dict[str, list[str]] = {}
    for md in iter_task_markdown():
        meta, _ = load_task(md)
        if meta.status == "Merged":
            merged.add(meta.id)
        else:
            deps = re.findall(r"\d+", meta.dependencies)
            deps_map[meta.id] = [d for d in deps if d not in merged]
    failures: list[list[str]] = []
    visited: set[str] = set()
    stack: list[str] = []

    def visit(n: str):
        # A node already on the current DFS stack closes a cycle.
        if n in stack:
            cycle = stack[stack.index(n):] + [n]
            failures.append(cycle)
            return
        if n in visited:
            return
        stack.append(n)
        for m in deps_map.get(n, []):
            visit(m)
        stack.pop()
        visited.add(n)

    for node in deps_map:
        visit(node)
    return failures


def main():
    err = False
    # File type check
    ft_fail = check_file_types()
    if ft_fail:
        print("Non-md files under tasks/:", file=sys.stderr)
        for f in ft_fail:
            print(f" {f}", file=sys.stderr)
        err = True
    # Frontmatter check
    fm_fail = check_frontmatter()
    if fm_fail:
        print("\nFrontmatter errors:", file=sys.stderr)
        for md, msg in fm_fail:
            print(f" {md}: {msg}", file=sys.stderr)
        err = True
    # Dependency cycles
    cyc_fail = check_cycles()
    if cyc_fail:
        print("\nCircular dependency errors:", file=sys.stderr)
        for cycle in cyc_fail:
            print(" " + " -> ".join(cycle), file=sys.stderr)
        err = True
    if err:
        sys.exit(1)
    print("All task checks passed.")


if __name__ == "__main__":
    main()

@@ -1,32 +0,0 @@
#!/usr/bin/env python3
"""
common.py: Shared utilities for agentydragon tooling scripts.
"""
import subprocess
from pathlib import Path


def repo_root() -> Path:
    """Return the Git repository root directory."""
    out = subprocess.check_output(['git', 'rev-parse', '--show-toplevel'])
    return Path(out.decode().strip())


def tasks_dir() -> Path:
    """Path to the agentydragon/tasks directory."""
    return repo_root() / 'agentydragon' / 'tasks'


def worktrees_dir() -> Path:
    """Path to the agentydragon/tasks/.worktrees directory."""
    return tasks_dir() / '.worktrees'


def resolve_slug(input_id: str) -> str:
    """Resolve a two-digit task ID into its full slug, or return slug unchanged."""
    if input_id.isdigit() and len(input_id) == 2:
        matches = list(tasks_dir().glob(f"{input_id}-*.md"))
        if len(matches) == 1:
            return matches[0].stem
        raise ValueError(f"Expected one task file for ID {input_id}, found {len(matches)}")
    return input_id

@@ -1,153 +0,0 @@
#!/usr/bin/env python3
"""
create_task_worktree.py: Create or reuse a git worktree for a specific task and optionally launch a Developer Codex agent.
"""
import os
import re
import shutil
import subprocess
import sys
from pathlib import Path

import click

from common import repo_root, tasks_dir, worktrees_dir


def run(cmd, cwd=None):
    """Echo a command, then run it and raise if it fails."""
    click.echo(f"Running: {' '.join(cmd)}")
    subprocess.check_call(cmd, cwd=cwd)


def resolve_slug(input_id: str) -> str:
    """CLI variant of common.resolve_slug: print an error and exit instead of raising."""
    if input_id.isdigit() and len(input_id) == 2:
        matches = list(tasks_dir().glob(f"{input_id}-*.md"))
        if len(matches) == 1:
            return matches[0].stem
        click.echo(f"Error: expected one task file for ID {input_id}, found {len(matches)}", err=True)
        sys.exit(1)
    return input_id


@click.command()
@click.option('-a', '--agent', is_flag=True,
              help='Launch Developer Codex agent after setting up worktree.')
@click.option('-t', '--tmux', 'tmux_mode', is_flag=True,
              help='Open each task in its own tmux pane; implies --agent. '
                   'Attaches to an existing session if already running.')
@click.option('-i', '--interactive', is_flag=True,
              help='Run agent in interactive mode (no exec); implies --agent.')
@click.option('-s', '--shell', 'shell_mode', is_flag=True,
              help='Launch an interactive Codex shell (skip exec and auto-commit); implies --agent and --interactive.')
@click.option('--skip-presubmit', is_flag=True,
              help='Skip the initial presubmit pre-commit checks when creating a new worktree.')
@click.argument('task_inputs', nargs=-1, required=True)
def main(agent, tmux_mode, interactive, shell_mode, skip_presubmit, task_inputs):
    """Create/reuse a task worktree and optionally launch a Dev agent or tmux session."""
    # shell mode implies interactive (skip exec within the worktree)
    if shell_mode:
        interactive = True
    if interactive or shell_mode:
        agent = True
    if tmux_mode:
        agent = True
        session = 'agentydragon_' + '_'.join(task_inputs)
        # If a tmux session already exists, skip setup and attach
        if subprocess.call(['tmux', 'has-session', '-t', session]) == 0:
            click.echo(f"Session {session} already exists; attaching")
            run(['tmux', 'attach', '-t', session])
            return
        # Create a new session and windows for each task
        for idx, inp in enumerate(task_inputs):
            slug = resolve_slug(inp)
            cmd = [sys.executable, '-u', __file__]
            if agent:
                cmd.append('--agent')
            cmd.append(slug)
            if idx == 0:
                run(['tmux', 'new-session', '-d', '-s', session] + cmd)
            else:
                run(['tmux', 'new-window', '-t', session] + cmd)
        run(['tmux', 'attach', '-t', session])
        return

    # Single task
    slug = resolve_slug(task_inputs[0])
    branch = f"agentydragon-{slug}"
    wt_root = worktrees_dir()
    wt_path = wt_root / slug
    # Ensure branch exists
    if subprocess.call(['git', 'show-ref', '--verify', '--quiet', f'refs/heads/{branch}']) != 0:
        run(['git', 'branch', '--track', branch, 'agentydragon'])
    wt_root.mkdir(parents=True, exist_ok=True)
    new_wt = False
    if not wt_path.exists():
        # --- COW hydration logic via rsync ---
        # Instead of checking out files normally, register the worktree empty and then
        # perform a filesystem-level hydration via rsync (with reflink if supported) for
        # near-instant setup while excluding VCS metadata and other worktrees.
        run(['git', 'worktree', 'add', '--no-checkout', str(wt_path), branch])
        src = str(repo_root())
        dst = str(wt_path)
        # Hydrate the worktree filesystem via rsync, excluding .git and any .worktrees to avoid recursion
        rsync_cmd = [
            'rsync', '-a', '--delete', f'{src}/', f'{dst}/',
            '--exclude=.git*', '--exclude=.worktrees/'
        ]
        if sys.platform != 'darwin':
            rsync_cmd.insert(3, '--reflink=auto')
        run(rsync_cmd)
        # Install pre-commit hooks in the new worktree
        if shutil.which('pre-commit'):
            run(['pre-commit', 'install'], cwd=dst)
        else:
            click.echo('Warning: pre-commit not found; skipping hook install', err=True)
        new_wt = True
    else:
        click.echo(f'Worktree already exists at {wt_path}')
    if not agent:
        return

    # Initial presubmit: only on new worktree & branch, unless skipped or in shell mode
    if new_wt and not skip_presubmit and not shell_mode:
        if shutil.which('pre-commit'):
            try:
                run(['pre-commit', 'run', '--all-files'], cwd=str(wt_path))
            except subprocess.CalledProcessError:
                click.echo(
                    'Pre-commit checks failed. Please fix the issues in the worktree or ' +
                    're-run with --skip-presubmit to bypass these checks.', err=True)
                sys.exit(1)
        else:
            click.echo('Warning: pre-commit not installed; skipping presubmit checks', err=True)

    click.echo(f'Launching Developer Codex agent for task {slug} in sandboxed worktree')
    os.chdir(wt_path)
    cmd = ['codex', '--full-auto']
    if not interactive:
        cmd.append('exec')
    prompt = (repo_root() / 'agentydragon' / 'prompts' / 'developer.md').read_text()
    taskfile = (tasks_dir() / f'{slug}.md').read_text()
    run(cmd + [prompt + '\n\n' + taskfile])

    # After Developer agent exits, if task status is Done, invoke Commit agent to stage and commit changes
    task_path = tasks_dir() / f"{slug}.md"
    content = task_path.read_text(encoding='utf-8')
    m = re.search(r'^status\s*=\s*"([^"]+)"', content, re.MULTILINE)
    status = m.group(1) if m else None
    if status and status.lower() == 'done':
        click.echo(f"Task {slug} marked Done; running Commit agent helper")
        commit_script = repo_root() / 'agentydragon' / 'tools' / 'launch_commit_agent.py'
        # Launch commit agent from the main repo root, not inside the task worktree
        run([sys.executable, str(commit_script), slug], cwd=str(repo_root()))
    else:
        click.echo(f"Task {slug} status is '{status or 'unknown'}'; skipping Commit agent helper")


if __name__ == '__main__':
    main()

@@ -1,57 +0,0 @@
#!/usr/bin/env python3
"""
launch_commit_agent.py: Run the non-interactive Commit agent for completed tasks.
"""
import os
import subprocess
import sys
from pathlib import Path

import click

from common import repo_root, tasks_dir, worktrees_dir, resolve_slug


@click.command()
@click.argument('task_input', required=True)
def main(task_input):
    """Resolve TASK_INPUT to slug, run the Commit agent, and commit changes."""
    slug = resolve_slug(task_input)
    wt = worktrees_dir() / slug
    if not wt.exists():
        click.echo(f"Error: worktree for '{slug}' not found; run create_task_worktree.py first", err=True)
        sys.exit(1)
    prompt_file = repo_root() / 'agentydragon' / 'prompts' / 'commit.md'
    task_file = tasks_dir() / f'{slug}.md'
    for f in (prompt_file, task_file):
        if not f.exists():
            click.echo(f"Error: file not found: {f}", err=True)
            sys.exit(1)
    msg_file = Path(subprocess.check_output(['mktemp']).decode().strip())
    try:
        os.chdir(wt)
        # Abort early if no pending changes in this worktree
        status_out = subprocess.check_output(['git', 'status', '--porcelain'], text=True).strip()
        if not status_out:
            click.echo(f"No changes detected in worktree for '{slug}'; nothing to commit.", err=True)
            sys.exit(0)
        cmd = ['codex', '--full-auto', 'exec', '--output-last-message', str(msg_file)]
        # Run the Commit agent in silent mode (suppressing its full stdout)
        click.echo(f"Running commit agent: {' '.join(cmd)}")
        prompt_content = prompt_file.read_text(encoding='utf-8')
        task_content = task_file.read_text(encoding='utf-8')
        subprocess.check_call(cmd + [prompt_content + '\n\n' + task_content], stdout=subprocess.DEVNULL)
        # Stage all changes, including new files (not just modifications)
        subprocess.check_call(['git', 'add', '-A'])
        subprocess.check_call(['git', 'commit', '-F', str(msg_file)])
        # Print the commit message for visibility
        msg = msg_file.read_text(encoding='utf-8').strip()
        click.echo("Commit message:\n" + msg)
    finally:
        # Always remove the temporary message file, including on early exit
        msg_file.unlink()


if __name__ == '__main__':
    main()

@@ -1,28 +0,0 @@
#!/usr/bin/env python3
"""
launch_project_manager.py: Launch the Codex Project Manager agent prompt.
"""
import subprocess
import sys

import click

from common import repo_root


@click.command()
def main():
    """Read manager.md prompt and invoke Codex Project Manager agent."""
    prompt_file = repo_root() / 'agentydragon' / 'prompts' / 'manager.md'
    if not prompt_file.exists():
        click.echo(f"Error: manager prompt not found at {prompt_file}", err=True)
        sys.exit(1)
    prompt = prompt_file.read_text(encoding='utf-8')
    cmd = ['codex', prompt]
    # Echo only the executable name; the prompt text is too long to print
    click.echo(f"Running: {' '.join(cmd[:1])} <prompt>")
    subprocess.check_call(cmd)


if __name__ == '__main__':
    main()

@@ -1,9 +0,0 @@
# manager_utils
This directory contains utility scripts to support the Project Manager agent.
Scripts here should automate common manager tasks (e.g. scanning branches for status,
checking for merge conflicts, proposing merges, conflict-resolution guidance,
polling loops, etc.).
Each script should include a short header explaining its purpose, usage examples,
and any dependencies.
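As an illustration of the kind of helper this directory is meant to hold, here is a small sketch with the short header the README asks for. It only post-processes text that `git` has already produced; the function names are hypothetical and are not part of the existing tooling.

```python
"""
branch_summary.py (hypothetical): format git output for a branch status table.

Usage example:
    compact_shortstat(" 2 files changed, 10 insertions(+), 3 deletions(-)")
    describe_divergence("3\t1")

Dependencies: Python standard library only.
"""
import re


def compact_shortstat(raw: str) -> str:
    """Shorten `git diff --shortstat` output, e.g. '56f,1265i,342d'."""
    parts = []
    for count, word in re.findall(r"(\d+) (file|insertion|deletion)", raw):
        parts.append(count + word[0])  # file -> f, insertion -> i, deletion -> d
    return ",".join(parts)


def describe_divergence(rev_list_count: str) -> str:
    """Interpret `git rev-list --left-right --count <branch>...<base>` output.

    The left number counts commits only on the branch (ahead); the right
    number counts commits only on the base (behind).
    """
    ahead, behind = (int(n) for n in rev_list_count.split())
    if ahead == 0 and behind == 0:
        return "up-to-date"
    return f"{behind} behind / {ahead} ahead"
```

Keeping the parsing separate from the `subprocess` calls makes helpers like these easy to unit-test without a repository.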

@@ -1,4 +0,0 @@
"""
agentydragon manager utilities package.
"""
__version__ = '0.1'

@@ -1,355 +0,0 @@
"""
CLI for managing agentydragon tasks: status, set-status, set-deps, dispose, launch.
"""
import re
import shutil
import subprocess
import sys
from datetime import datetime
from pathlib import Path

import click

from tasklib import load_task, repo_root, save_task, task_dir, TaskMeta, worktree_dir, TaskStatus

# tabulate is optional; the status command falls back to fixed-width output without it
try:
    from tabulate import tabulate
except ImportError:
    tabulate = None


@click.group()
def cli():
    """Manage agentydragon tasks."""
    pass


@cli.command()
def status():
    """Show a table of task id, title, status, dependencies, last_updated.

    If tabulate is installed, render as GitHub-flavored Markdown table;
    otherwise fall back to fixed-width formatting.
    """
    # Load all task metadata, reporting load errors with file path
    all_meta: dict[str, TaskMeta] = {}
    path_map: dict[str, Path] = {}
    wt_root = worktree_dir()
    for md in sorted(task_dir().rglob('[0-9][0-9]-*.md')):
        # skip task template, plan files, and any worktree copies
        if md.name in ('task-template.md',) or md.name.endswith('-plan.md') or md.is_relative_to(wt_root):
            continue
        try:
            meta, _ = load_task(md)
        except Exception as e:
            print(f"Error loading {md}: {e}")
            continue
        all_meta[meta.id] = meta
        path_map[meta.id] = md

    # If a worktree exists, reload the task from that workspace (including .done paths)
    repo = repo_root()
    for tid, md in list(path_map.items()):
        wt_root_dir = wt_root / md.stem
        # derive relative path of the task file under the repo
        try:
            rel = md.relative_to(repo)
        except Exception:
            continue
        wt_task = wt_root_dir / rel
        if wt_task.exists():
            try:
                wt_meta, _ = load_task(wt_task)
                all_meta[tid] = wt_meta
                path_map[tid] = wt_task
            except Exception as e:
                print(f"Error loading {wt_task}: {e}")

    # Build dependency graph, excluding already merged tasks
    merged_ids = {tid for tid, m in all_meta.items() if m.status == 'Merged'}
    deps_map: dict[str, list[str]] = {}
    for tid, meta in all_meta.items():
        deps_map[tid] = [d for d in re.findall(r"\d+", meta.dependencies)
                         if d in all_meta and d not in merged_ids]

    # Topologically sort tasks by dependencies, fall back on filename order on error
    try:
        sorted_ids: list[str] = []
        temp: set[str] = set()
        perm: set[str] = set()

        def visit(n: str) -> None:
            if n in perm:
                return
            if n in temp:
                raise RuntimeError(f"Circular dependency detected at task {n}")
            temp.add(n)
            for m in deps_map.get(n, []):
                visit(m)
            temp.remove(n)
            perm.add(n)
            sorted_ids.append(n)

        for n in all_meta:
            visit(n)
    except Exception as e:
        print(f"Warning: cannot topo-sort tasks ({e}); falling back to filename order")
        sorted_ids = [m.id for m in sorted(all_meta.values(), key=lambda m: path_map[m.id].name)]

    # Identify tasks that are merged with no branch and no worktree (bottom summary)
    bottom_merged_ids: set[str] = set()
    for tid in sorted_ids:
        meta = all_meta[tid]
        if meta.status != 'Merged':
            continue
        branches = subprocess.run(
            ['git', 'for-each-ref', '--format=%(refname:short)',
             f'refs/heads/agentydragon-{tid}-*'],
            capture_output=True, text=True, cwd=repo_root()
        ).stdout.strip().splitlines()
        wt_dir = task_dir() / '.worktrees' / path_map[tid].stem
        if not branches and not wt_dir.exists():
            bottom_merged_ids.add(tid)

    rows: list[tuple] = []
    merged_tasks: list[tuple[str, str]] = []
    root = repo_root()
    for tid in sorted_ids:
        meta = all_meta[tid]
        md = path_map[tid]
        slug = md.stem
        # branch detection
        branches = subprocess.run(
            ['git', 'for-each-ref', '--format=%(refname:short)',
             f'refs/heads/agentydragon-{tid}-*'],
            capture_output=True, text=True, cwd=root
        ).stdout.strip().splitlines()
        branch_exists = 'Y' if branches and branches[0].strip() else 'N'
        merged_flag = 'N'
        if branch_exists == 'Y':
            b = branches[0].lstrip('*+ ').strip()
            if subprocess.run(['git', 'merge-base', '--is-ancestor', b, 'agentydragon'], cwd=root).returncode == 0:
                merged_flag = 'Y'
        # worktree detection
        wt_dir = worktree_dir() / slug
        wt_info = 'none'
        if wt_dir.exists():
            st = subprocess.run(['git', 'status', '--porcelain'], cwd=wt_dir,
                                capture_output=True, text=True).stdout.strip()
            wt_info = 'clean' if not st else 'dirty'
        # skip fully merged tasks (no branch, no worktree)
        if meta.status == 'Merged' and branch_exists == 'N' and wt_info == 'none':
            merged_tasks.append((tid, meta.title))
            continue
        # filter out dependencies on bottom-summary merged tasks
        deps = [d for d in deps_map.get(tid, []) if d not in bottom_merged_ids]
        deps_str = ','.join(deps)
        # determine branch_info text
        if branch_exists == 'N':
            branch_info = 'no branch'
        elif merged_flag == 'Y':
            branch_info = 'merged'
        else:
            a_cnt, b_cnt = subprocess.check_output(
                ['git', 'rev-list', '--left-right', '--count',
                 f'{branches[0]}...agentydragon'], cwd=root
            ).decode().split()
            # compact diffstat: e.g. "56 files changed, 1265 insertions(+), 342 deletions(-)" -> "56f,1265i,342d"
            raw = subprocess.check_output(
                ['git', 'diff', '--shortstat', f'{branches[0]}...agentydragon'], cwd=root
            ).decode().strip()
            stat = (
                raw.replace(' files changed', 'f')
                .replace(' file changed', 'f')
                .replace(' insertions(+)', 'i')
                .replace(' deletions(-)', 'd')
                .replace(', ', ',')
            )
            base = subprocess.check_output(
                ['git', 'merge-base', 'agentydragon', branches[0]], cwd=root
            ).decode().strip()
            mtree = subprocess.check_output(
                ['git', 'merge-tree', base, 'agentydragon', branches[0]], cwd=root
            ).decode(errors='ignore')
            conflict = 'conflict' if '<<<<<<<' in mtree else 'ok'
            if a_cnt == '0' and b_cnt == '0':
                branch_info = f'up-to-date (+{stat or 0})'
            else:
                branch_info = f'{b_cnt} behind / {a_cnt} ahead (+{stat or 0}) {conflict}'
        # Use the human-readable enum value and apply a color map
        label = meta.status.value
        status_colors = {
            'Not started': '\033[90m',          # dim gray
            'In progress': '\033[33m',          # yellow
            'Needs input': '\033[31m',          # red
            'Needs manual review': '\033[31m',  # red
            'Done': '\033[32m',                 # green
            'Cancelled': '\033[31m',            # red
            'Merged': '\033[34m',               # blue
        }
        color = status_colors.get(label, '')
        stat_disp = f"{color}{label}\033[0m" if color else label
        wt_disp = wt_info
        if wt_info == 'dirty':
            wt_disp = f"\033[31m{wt_info}\033[0m"
        rows.append((
            tid, meta.title, stat_disp,
            deps_str, meta.last_updated.strftime('%Y-%m-%d %H:%M'),
            branch_info, wt_disp
        ))

    headers = ['ID', 'Title', 'Status', 'Dependencies', 'Updated',
               'Branch Status', 'Worktree Status']
    if tabulate:
        print(tabulate(rows, headers=headers, tablefmt='github'))
    else:
        fmt = '{:>2} {:<30} {:<12} {:<20} {:<16} {:<40} {:<10}'
        print(fmt.format(*headers))
        for r in rows:
            print(fmt.format(*r))

    # summary of fully merged tasks (no branch, no worktree)
    if merged_tasks:
        items = ' '.join(f"{tid} ({title})" for tid, title in merged_tasks)
        print(f"\n\033[32mMerged:\033[0m {items}")

    # summary of tasks Ready to merge (Done with branch commits)
    ready_tasks: list[tuple[str, str]] = []
    for tid in sorted_ids:
        meta = all_meta[tid]
        if meta.status != 'Done':
            continue
        # detect branch existence and ahead commits
        branches = subprocess.run(
            ['git', 'for-each-ref', '--format=%(refname:short)', f'refs/heads/agentydragon-{tid}-*'],
            capture_output=True, text=True, cwd=repo_root()
        ).stdout.strip().splitlines()
        if not branches or not branches[0].strip():
            continue
        bname = branches[0].lstrip('*+ ').strip()
        # count commits ahead of integration branch
        a_cnt, _b_cnt = subprocess.check_output(
            ['git', 'rev-list', '--left-right', '--count', f'{bname}...agentydragon'], cwd=repo_root()
        ).decode().split()
        if int(a_cnt) > 0:
            ready_tasks.append((tid, meta.title))
    if ready_tasks:
        items = ' '.join(f"{tid} ({title})" for tid, title in ready_tasks)
        print(f"\n\033[33mReady to merge:\033[0m {items}")

    # identify unblocked tasks (no remaining dependencies)
    unblocked = [tid for tid in sorted_ids if tid not in merged_ids and not deps_map.get(tid)]
    if unblocked:
        print(f"\n\033[1mUnblocked:\033[0m {' '.join(unblocked)}")
        print(f"\033[1mLaunch unblocked in tmux:\033[0m python agentydragon/tools/create_task_worktree.py --agent --tmux {' '.join(unblocked)}")


@cli.command()
@click.argument('task_id')
@click.argument('status')
def set_status(task_id, status):
    """Set status of TASK_ID to STATUS"""
    # search both in tasks/ and tasks/.done/ for the task file
    files = list(task_dir().rglob(f'{task_id}-*.md'))
    if not files:
        click.echo(f'Task {task_id} not found', err=True)
        sys.exit(1)
    path = files[0]
    meta, body = load_task(path)
    # Validate against TaskStatus so an unknown value is never written to the frontmatter
    try:
        meta.status = TaskStatus(status)
    except ValueError:
        valid = ' | '.join(s.value for s in TaskStatus)
        click.echo(f'Invalid status {status!r}; valid values: {valid}', err=True)
        sys.exit(1)
    meta.last_updated = datetime.utcnow()
    save_task(path, meta, body)
    # Move between tasks/ and tasks/.done according to status transitions
    done_dir = task_dir() / '.done'
    # Move to .done on Merged
    if meta.status == TaskStatus.MERGED and path.parent.name != '.done':
        done_dir.mkdir(exist_ok=True)
        dest = done_dir / path.name
        click.echo(f"Archiving task: moving {path.name} -> {done_dir.relative_to(repo_root())}")
        subprocess.run(['git', 'mv', str(path), str(dest)], cwd=repo_root())
    # Move back to main tasks/ when status changes away from Done/Merged
    elif path.parent.name == '.done' and meta.status not in (TaskStatus.DONE, TaskStatus.MERGED):
        dest = task_dir() / path.name
        click.echo(f"Reopening task: moving {path.name} -> {dest.parent.relative_to(repo_root())}")
        subprocess.run(['git', 'mv', str(path), str(dest)], cwd=repo_root())


@cli.command()
@click.argument('task_id')
@click.argument('deps', nargs=-1)
def set_deps(task_id, deps):
    """Set dependencies of TASK_ID"""
    files = list(task_dir().glob(f'{task_id}-*.md'))
    if not files:
        click.echo(f'Task {task_id} not found', err=True)
        sys.exit(1)
    path = files[0]
    meta, body = load_task(path)
    now = datetime.utcnow().isoformat()
    # NOTE: the timestamp prefix contains digits, which the r"\d+" dependency
    # scans in `status` and check_tasks.py will also pick up as candidate IDs.
    meta.dependencies = f'as of {now}: ' + ', '.join(deps)
    meta.last_updated = datetime.utcnow()
    save_task(path, meta, body)


@cli.command()
@click.argument('task_id', nargs=-1)
def dispose(task_id):
    """Dispose worktree and delete branch for TASK_ID(s)"""
    root = repo_root()
    wt_base = worktree_dir()
    for tid in task_id:
        # Remove any matching worktree directories
        g = f'{tid}-*'
        # Materialize the glob so we can tell "no matches" apart from "all disposed";
        # a for/else here would print the "no worktrees" message on every run.
        matching_wts = list(wt_base.glob(g))
        for wt_dir in matching_wts:
            click.echo(f"Disposing worktree {wt_dir}")
            # unregister worktree; then delete the directory if still present
            rel = wt_dir.relative_to(root)
            subprocess.run(['git', 'worktree', 'remove', str(rel), '--force'], cwd=root)
            if wt_dir.exists():
                shutil.rmtree(wt_dir)
        if not matching_wts:
            print(f"No worktrees matching {g} in {wt_base}")
        # prune any stale worktree entries
        subprocess.run(['git', 'worktree', 'prune'], cwd=root)
        # delete any matching local branches cleanly via for-each-ref
        ref_pattern = f'refs/heads/agentydragon-{tid}-*'
        branches = subprocess.run(
            ['git', 'for-each-ref', '--format=%(refname:short)', ref_pattern],
            capture_output=True, text=True, cwd=root
        ).stdout.splitlines()
        branches = [br for br in branches if br]
        if branches:
            click.echo(f"Disposing branches: {branches}")
            subprocess.run(['git', 'branch', '-D', *branches], cwd=root)
        else:
            click.echo(f"No branches matching {ref_pattern}")
        click.echo(f'Disposed task {tid}')
        # If the task was marked Done, auto-move it into .done/
        files = list(task_dir().glob(f"{tid}-*.md"))
        if len(files) == 1:
            path = files[0]
            meta, _ = load_task(path)
            if meta.status == TaskStatus.DONE:
                done_dir = task_dir() / '.done'
                done_dir.mkdir(exist_ok=True)
                target = done_dir / path.name
                click.echo(f"Moving {path.name} -> .done/ (status Done)")
                subprocess.run(['git', 'mv', str(path), str(target)], cwd=repo_root())


@cli.command()
@click.argument('task_id', nargs=-1)
def launch(task_id):
    """Copy tmux launch one-liner for TASK_ID(s) to clipboard"""
    cmd = ['create-task-worktree.sh', '--agent', '--tmux'] + list(task_id)
    line = ' '.join(cmd)
    # system clipboard (pbcopy is macOS-only; otherwise just print the line)
    try:
        subprocess.run(['pbcopy'], input=line.encode(), check=True)
        click.echo('Copied to clipboard:')
    except FileNotFoundError:
        click.echo(line)
        return
    click.echo(line)


if __name__ == '__main__':
    cli()

@@ -1,14 +0,0 @@
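#!/usr/bin/env bash
# launch_ready_to_merge.sh: open tmux panes for all tasks marked Ready to merge
set -euo pipefail
# Gather all tasks flagged Ready to merge by the status script.
# The status line is wrapped in ANSI color codes and lists "ID (Title)" pairs, so:
#   1. strip the color codes,
#   2. print only the "Ready to merge:" line (the earlier `1,/re/d` range deleted it),
#   3. keep just the numeric IDs that precede " (".
ready=$(agentydragon_task.py status \
  | sed -e $'s/\033\\[[0-9;]*m//g' \
  | sed -n -e 's/^Ready to merge:[ ]*//p' \
  | { grep -oE '[0-9]+ \(' || true; } \
  | tr -d ' (' | tr '\n' ' ')
if [ -z "$ready" ]; then
  echo "No tasks are Ready to merge."
  exit 0
fi
echo "Launching tasks: $ready"
# $ready is intentionally unquoted so each ID becomes its own argument
agentydragon/tools/create-task-worktree.sh --agent --tmux $ready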

@@ -1,28 +0,0 @@
#!/usr/bin/env python3
"""
organize_done_tasks.py: Move merged task files under tasks/.done/ subdirectory.

This script should be run once to migrate all tasks with status "Merged"
to the .done folder.
"""
import subprocess
from pathlib import Path

from tasklib import task_dir, load_task


def main():
    root = task_dir()
    done_dir = root / '.done'
    done_dir.mkdir(exist_ok=True)
    for md in sorted(root.glob('[0-9][0-9]-*.md')):
        if md.name == 'task-template.md' or md.name.endswith('-plan.md'):
            continue
        meta, _ = load_task(md)
        if meta.status == 'Merged':
            target = done_dir / md.name
            print(f'Moving {md.name} -> .done/')
            subprocess.run(['git', 'mv', str(md), str(target)], check=True)
    print('Migration complete.')


if __name__ == '__main__':
    main()

@@ -1,60 +0,0 @@
"""
Simple library for loading and saving task metadata embedded as TOML front-matter
in task Markdown files.
"""
import re
import subprocess
from datetime import datetime
from enum import Enum
from pathlib import Path

import toml
from pydantic import BaseModel, Field

# Front-matter is the text between the leading "+++" line and the next "+++"
FRONTMATTER_RE = re.compile(r"^\+\+\+\s*(.*?)\s*\+\+\+", re.S | re.M)


def repo_root():
    return Path(subprocess.check_output(['git', 'rev-parse', '--show-toplevel']).decode().strip())


def task_dir():
    return repo_root() / "agentydragon/tasks"


def worktree_dir():
    return task_dir() / ".worktrees"


class TaskStatus(str, Enum):
    NOT_STARTED = "Not started"
    IN_PROGRESS = "In progress"
    NEEDS_INPUT = "Needs input"
    NEEDS_MANUAL_REVIEW = "Needs manual review"
    DONE = "Done"
    CANCELLED = "Cancelled"
    MERGED = "Merged"


class TaskMeta(BaseModel):
    id: str
    title: str
    status: TaskStatus
    freeform_status: str = Field(default="")
    dependencies: str = Field(default="")
    last_updated: datetime = Field(default_factory=datetime.utcnow)


def load_task(path: Path) -> tuple[TaskMeta, str]:
    """Parse the TOML front-matter of a task file; return (metadata, Markdown body)."""
    text = path.read_text(encoding='utf-8')
    m = FRONTMATTER_RE.match(text)
    if not m:
        raise ValueError(f"No TOML frontmatter in {path}")
    meta = toml.loads(m.group(1))
    tm = TaskMeta(**meta)
    body = text[m.end():].lstrip('\n')
    return tm, body


def save_task(path: Path, meta: TaskMeta, body: str) -> None:
    """Write metadata back as TOML front-matter followed by the Markdown body."""
    tm = meta.dict()
    # Serialize enum to its string value for front-matter
    if isinstance(tm.get('status'), Enum):
        tm['status'] = tm['status'].value
    tm['last_updated'] = meta.last_updated.isoformat()
    fm = toml.dumps(tm).strip()
    content = f"+++\n{fm}\n+++\n\n{body.lstrip()}"
    path.write_text(content, encoding='utf-8')


@@ -1,3 +0,0 @@
"""
Test package for manager_utils
"""


@@ -1,35 +0,0 @@
import tempfile
from pathlib import Path
import toml
import pytest
from ..tasklib import TaskMeta, load_task, save_task

SAMPLE = """+++
id = "99"
title = "Sample Task"
status = "Not started"
dependencies = ""
last_updated = "2023-01-01T12:00:00"
+++
# Body here
"""


def test_load_and_save(tmp_path):
    md = tmp_path / '99-sample.md'
    md.write_text(SAMPLE)
    meta, body = load_task(md)
    assert meta.id == '99'
    assert 'Body here' in body
    meta.status = 'Done'
    save_task(md, meta, body)
    text = md.read_text()
    data = toml.loads(text.split('+++')[1])
    assert data['status'] == 'Done'


from pydantic import ValidationError


def test_meta_model_validation():
    with pytest.raises(ValidationError):
        TaskMeta(id='a', title='t', status='bogus', dependencies='', last_updated='bad')


@@ -1,3 +1,7 @@
# Added by ./scripts/install_native_deps.sh
/bin/codex-aarch64-apple-darwin
/bin/codex-aarch64-unknown-linux-musl
/bin/codex-linux-sandbox-arm64
/bin/codex-linux-sandbox-x64
/bin/codex-x86_64-apple-darwin
/bin/codex-x86_64-unknown-linux-musl


@@ -1,4 +1,4 @@
FROM node:20-slim
FROM node:24-slim
ARG TZ
ENV TZ="$TZ"

736
codex-cli/README.md Normal file

@@ -0,0 +1,736 @@
<h1 align="center">OpenAI Codex CLI</h1>
<p align="center">Lightweight coding agent that runs in your terminal</p>
<p align="center"><code>npm i -g @openai/codex</code></p>
> [!IMPORTANT]
> This is the documentation for the _legacy_ TypeScript implementation of the Codex CLI. It has been superseded by the _Rust_ implementation. See the [README in the root of the Codex repository](https://github.com/openai/codex/blob/main/README.md) for details.
![Codex demo GIF using: codex "explain this codebase to me"](../.github/demo.gif)
---
<details>
<summary><strong>Table of contents</strong></summary>
<!-- Begin ToC -->
- [Experimental technology disclaimer](#experimental-technology-disclaimer)
- [Quickstart](#quickstart)
- [Why Codex?](#why-codex)
- [Security model & permissions](#security-model--permissions)
- [Platform sandboxing details](#platform-sandboxing-details)
- [System requirements](#system-requirements)
- [CLI reference](#cli-reference)
- [Memory & project docs](#memory--project-docs)
- [Non-interactive / CI mode](#non-interactive--ci-mode)
- [Tracing / verbose logging](#tracing--verbose-logging)
- [Recipes](#recipes)
- [Installation](#installation)
- [Configuration guide](#configuration-guide)
- [Basic configuration parameters](#basic-configuration-parameters)
- [Custom AI provider configuration](#custom-ai-provider-configuration)
- [History configuration](#history-configuration)
- [Configuration examples](#configuration-examples)
- [Full configuration example](#full-configuration-example)
- [Custom instructions](#custom-instructions)
- [Environment variables setup](#environment-variables-setup)
- [FAQ](#faq)
- [Zero data retention (ZDR) usage](#zero-data-retention-zdr-usage)
- [Codex open source fund](#codex-open-source-fund)
- [Contributing](#contributing)
- [Development workflow](#development-workflow)
- [Git hooks with Husky](#git-hooks-with-husky)
- [Debugging](#debugging)
- [Writing high-impact code changes](#writing-high-impact-code-changes)
- [Opening a pull request](#opening-a-pull-request)
- [Review process](#review-process)
- [Community values](#community-values)
- [Getting help](#getting-help)
- [Contributor license agreement (CLA)](#contributor-license-agreement-cla)
- [Quick fixes](#quick-fixes)
- [Releasing `codex`](#releasing-codex)
- [Alternative build options](#alternative-build-options)
- [Nix flake development](#nix-flake-development)
- [Security & responsible AI](#security--responsible-ai)
- [License](#license)
<!-- End ToC -->
</details>
---
## Experimental technology disclaimer
Codex CLI is an experimental project under active development. It is not yet stable and may contain bugs, have incomplete features, or undergo breaking changes. We're building it in the open with the community and welcome:
- Bug reports
- Feature requests
- Pull requests
- Good vibes
Help us improve by filing issues or submitting PRs (see the section below for how to contribute)!
## Quickstart
Install globally:
```shell
npm install -g @openai/codex
```
Next, set your OpenAI API key as an environment variable:
```shell
export OPENAI_API_KEY="your-api-key-here"
```
> **Note:** This command sets the key only for your current terminal session. You can add the `export` line to your shell's configuration file (e.g., `~/.zshrc`), but we recommend setting it per session.
>
> **Tip:** You can also place your API key into a `.env` file at the root of your project:
>
> ```env
> OPENAI_API_KEY=your-api-key-here
> ```
>
> The CLI will automatically load variables from `.env` (via `dotenv/config`).
<details>
<summary><strong>Use <code>--provider</code> to use other models</strong></summary>
> Codex also allows you to use other providers that support the OpenAI Chat Completions API. You can set the provider in the config file or use the `--provider` flag. The possible options for `--provider` are:
>
> - openai (default)
> - openrouter
> - azure
> - gemini
> - ollama
> - mistral
> - deepseek
> - xai
> - groq
> - arceeai
> - any other provider that is compatible with the OpenAI API
>
> If you use a provider other than OpenAI, you will need to set the API key for the provider in the config file or in the environment variable as:
>
> ```shell
> export <provider>_API_KEY="your-api-key-here"
> ```
>
> If you use a provider not listed above, you must also set the base URL for the provider:
>
> ```shell
> export <provider>_BASE_URL="https://your-provider-api-base-url"
> ```
</details>
<br />
Run interactively:
```shell
codex
```
Or, run with a prompt as input (and optionally in `Full Auto` mode):
```shell
codex "explain this codebase to me"
```
```shell
codex --approval-mode full-auto "create the fanciest todo-list app"
```
That's it - Codex will scaffold a file, run it inside a sandbox, install any
missing dependencies, and show you the live result. Approve the changes and
they'll be committed to your working directory.
---
## Why Codex?
Codex CLI is built for developers who already **live in the terminal** and want
ChatGPT-level reasoning **plus** the power to actually run code, manipulate
files, and iterate - all under version control. In short, it's _chat-driven
development_ that understands and executes your repo.
- **Zero setup** - bring your OpenAI API key and it just works!
- **Full auto-approval, while safe + secure** by running network-disabled and directory-sandboxed
- **Multimodal** - pass in screenshots or diagrams to implement features ✨
And it's **fully open-source** so you can see and contribute to how it develops!
---
## Security model & permissions
Codex lets you decide _how much autonomy_ the agent receives, including its auto-approval policy, via the
`--approval-mode` flag (or the interactive onboarding prompt):
| Mode | What the agent may do without asking | Still requires approval |
| ------------------------- | --------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| **Suggest** <br>(default) | <li>Read any file in the repo | <li>**All** file writes/patches<li> **Any** arbitrary shell commands (aside from reading files) |
| **Auto Edit** | <li>Read **and** apply-patch writes to files | <li>**All** shell commands |
| **Full Auto** | <li>Read/write files <li> Execute shell commands (network disabled, writes limited to your workdir) | - |
In **Full Auto** every command is run **network-disabled** and confined to the
current working directory (plus temporary files) for defense-in-depth. Codex
will also show a warning/confirmation if you start in **auto-edit** or
**full-auto** while the directory is _not_ tracked by Git, so you always have a
safety net.
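The table above can be read as a small decision function. The following is a minimal Python sketch of that policy, not the CLI's actual implementation; the names `ApprovalMode`, `Action`, and `needs_approval` are hypothetical and only mirror the three modes and the columns of the table.

```python
from enum import Enum


class ApprovalMode(Enum):
    SUGGEST = "suggest"
    AUTO_EDIT = "auto-edit"
    FULL_AUTO = "full-auto"


class Action(Enum):
    READ_FILE = "read"
    WRITE_FILE = "write"  # file writes / apply-patch
    RUN_SHELL = "shell"   # arbitrary shell commands


# Actions each mode may perform without asking, per the table above.
# (Hypothetical encoding; the real CLI may model this differently.)
AUTO_APPROVED = {
    ApprovalMode.SUGGEST: {Action.READ_FILE},
    ApprovalMode.AUTO_EDIT: {Action.READ_FILE, Action.WRITE_FILE},
    ApprovalMode.FULL_AUTO: {Action.READ_FILE, Action.WRITE_FILE, Action.RUN_SHELL},
}


def needs_approval(mode: ApprovalMode, action: Action) -> bool:
    """Return True when the user must confirm the action in the given mode."""
    return action not in AUTO_APPROVED[mode]
```

The sketch covers only the approval decision. The sandboxing that applies in **Full Auto** (network disabled, writes confined to the working directory) is a separate layer and is not modeled here.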
Coming soon: you'll be able to whitelist specific commands to auto-execute with
the network enabled, once we're confident in additional safeguards.
### Platform sandboxing details
The hardening mechanism Codex uses depends on your OS:
- **macOS 12+** - commands are wrapped with **Apple Seatbelt** (`sandbox-exec`).
  - Everything is placed in a read-only jail except for a small set of
    writable roots (`$PWD`, `$TMPDIR`, `~/.codex`, etc.).
  - Outbound network is _fully blocked_ by default - even if a child process
    tries to `curl` somewhere it will fail.
- **Linux** - there is no sandboxing by default.
  We recommend using Docker for sandboxing, where Codex launches itself inside a **minimal
  container image** and mounts your repo _read/write_ at the same path. A
  custom `iptables`/`ipset` firewall script denies all egress except the
  OpenAI API. This gives you deterministic, reproducible runs without needing
  root on the host. You can use the [`run_in_container.sh`](../codex-cli/scripts/run_in_container.sh) script to set up the sandbox.
---
## System requirements
| Requirement | Details |
| --------------------------- | --------------------------------------------------------------- |
| Operating systems | macOS 12+, Ubuntu 20.04+/Debian 10+, or Windows 11 **via WSL2** |
| Node.js | **22 or newer** (LTS recommended) |
| Git (optional, recommended) | 2.23+ for built-in PR helpers |
| RAM | 4-GB minimum (8-GB recommended) |
> Never run `sudo npm install -g`; fix npm permissions instead.
---
## CLI reference
| Command | Purpose | Example |
| ------------------------------------ | ----------------------------------- | ------------------------------------ |
| `codex` | Interactive REPL | `codex` |
| `codex "..."` | Initial prompt for interactive REPL | `codex "fix lint errors"` |
| `codex -q "..."` | Non-interactive "quiet mode" | `codex -q --json "explain utils.ts"` |
| `codex completion <bash\|zsh\|fish>` | Print shell completion script | `codex completion bash` |
Key flags: `--model/-m`, `--approval-mode/-a`, `--quiet/-q`, and `--notify`.
---
## Memory & project docs
You can give Codex extra instructions and guidance using `AGENTS.md` files. Codex looks for `AGENTS.md` files in the following places, and merges them top-down:
1. `~/.codex/AGENTS.md` - personal global guidance
2. `AGENTS.md` at repo root - shared project notes
3. `AGENTS.md` in the current working directory - sub-folder/feature specifics
Disable loading of these files with `--no-project-doc` or the environment variable `CODEX_DISABLE_PROJECT_DOC=1`.
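The lookup order above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the CLI's own code: the function names are hypothetical, "merging" is assumed to mean plain concatenation with the most specific file last, and skipping a file that appears twice (when the working directory is the repo root) is also an assumption.

```python
from pathlib import Path


def collect_agents_docs(home: Path, repo_root: Path, cwd: Path) -> list[Path]:
    """Return the AGENTS.md files that exist, in the documented top-down order."""
    candidates = [
        home / ".codex" / "AGENTS.md",  # 1. personal global guidance
        repo_root / "AGENTS.md",        # 2. shared project notes
        cwd / "AGENTS.md",              # 3. sub-folder/feature specifics
    ]
    seen: set[Path] = set()
    found: list[Path] = []
    for path in candidates:
        resolved = path.resolve()
        # Skip missing files, and avoid loading the same file twice when
        # the working directory is the repo root (assumed behavior).
        if resolved.is_file() and resolved not in seen:
            seen.add(resolved)
            found.append(resolved)
    return found


def merge_agents_docs(paths: list[Path]) -> str:
    """Concatenate the documents so more specific guidance comes last."""
    return "\n\n".join(p.read_text(encoding="utf-8").strip() for p in paths)
```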
---
## Non-interactive / CI mode
Run Codex headless in pipelines. Example GitHub Action step:
```yaml
- name: Update changelog via Codex
  run: |
    npm install -g @openai/codex
    export OPENAI_API_KEY="${{ secrets.OPENAI_KEY }}"
    codex -a auto-edit --quiet "update CHANGELOG for next release"
```
Set `CODEX_QUIET_MODE=1` to silence interactive UI noise.
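If the pipeline is driven from a script rather than a workflow file, the same invocation can be assembled programmatically. The Python sketch below only reuses the flags and the environment variable mentioned in this section; the helper names `build_headless_invocation` and `run_headless` are hypothetical, and running it assumes `codex` is on `PATH` and `OPENAI_API_KEY` is already set.

```python
import os
import subprocess


def build_headless_invocation(prompt: str, approval_mode: str = "auto-edit"):
    """Build argv and environment for a non-interactive Codex run."""
    argv = ["codex", "-a", approval_mode, "--quiet", prompt]
    env = dict(os.environ)
    env["CODEX_QUIET_MODE"] = "1"  # silence interactive UI noise
    return argv, env


def run_headless(prompt: str) -> int:
    """Run Codex and return its exit code (requires `codex` on PATH)."""
    argv, env = build_headless_invocation(prompt)
    return subprocess.run(argv, env=env).returncode
```

Passing the prompt as a single list element avoids shell quoting problems when the task text contains spaces or quotes.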
## Tracing / verbose logging
Setting the environment variable `DEBUG=true` prints full API request and response details:
```shell
DEBUG=true codex
```
---
## Recipes
Below are a few bite-size examples you can copy-paste. Replace the text in quotes with your own task. See the [prompting guide](https://github.com/openai/codex/blob/main/codex-cli/examples/prompting_guide.md) for more tips and usage patterns.
| ✨ | What you type | What happens |
| --- | ------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| 1 | `codex "Refactor the Dashboard component to React Hooks"` | Codex rewrites the class component, runs `npm test`, and shows the diff. |
| 2 | `codex "Generate SQL migrations for adding a users table"` | Infers your ORM, creates migration files, and runs them in a sandboxed DB. |
| 3 | `codex "Write unit tests for utils/date.ts"` | Generates tests, executes them, and iterates until they pass. |
| 4 | `codex "Bulk-rename *.jpeg -> *.jpg with git mv"` | Safely renames files and updates imports/usages. |
| 5 | `codex "Explain what this regex does: ^(?=.*[A-Z]).{8,}$"` | Outputs a step-by-step human explanation. |
| 6 | `codex "Carefully review this repo, and propose 3 high impact well-scoped PRs"` | Suggests impactful PRs in the current codebase. |
| 7 | `codex "Look for vulnerabilities and create a security review report"` | Finds and explains security bugs. |
---
## Installation
<details open>
<summary><strong>From npm (Recommended)</strong></summary>
```bash
npm install -g @openai/codex
# or
yarn global add @openai/codex
# or
bun install -g @openai/codex
# or
pnpm add -g @openai/codex
```
</details>
<details>
<summary><strong>Build from source</strong></summary>
```bash
# Clone the repository and navigate to the CLI package
git clone https://github.com/openai/codex.git
cd codex/codex-cli
# Enable corepack
corepack enable
# Install dependencies and build
pnpm install
pnpm build
# Linux-only: download prebuilt sandboxing binaries (requires gh and zstd).
./scripts/install_native_deps.sh
# Get the usage and the options
node ./dist/cli.js --help
# Run the locally-built CLI directly
node ./dist/cli.js
# Or link the command globally for convenience
pnpm link
```
</details>
---
## Configuration guide
Codex configuration files can be placed in the `~/.codex/` directory, supporting both YAML and JSON formats.
### Basic configuration parameters
| Parameter | Type | Default | Description | Available Options |
| ------------------- | ------- | ---------- | -------------------------------- | ---------------------------------------------------------------------------------------------- |
| `model` | string | `o4-mini` | AI model to use | Any model name supporting OpenAI API |
| `approvalMode` | string | `suggest` | AI assistant's permission mode | `suggest` (suggestions only)<br>`auto-edit` (automatic edits)<br>`full-auto` (fully automatic) |
| `fullAutoErrorMode` | string | `ask-user` | Error handling in full-auto mode | `ask-user` (prompt for user input)<br>`ignore-and-continue` (ignore and proceed) |
| `notify` | boolean | `true` | Enable desktop notifications | `true`/`false` |
### Custom AI provider configuration
In the `providers` object, you can configure multiple AI service providers. Each provider requires the following parameters:
| Parameter | Type | Description | Example |
| --------- | ------ | --------------------------------------- | ----------------------------- |
| `name` | string | Display name of the provider | `"OpenAI"` |
| `baseURL` | string | API service URL | `"https://api.openai.com/v1"` |
| `envKey` | string | Environment variable name (for API key) | `"OPENAI_API_KEY"` |
### History configuration
In the `history` object, you can configure conversation history settings:
| Parameter | Type | Description | Example Value |
| ------------------- | ------- | ------------------------------------------------------ | ------------- |
| `maxSize` | number | Maximum number of history entries to save | `1000` |
| `saveHistory` | boolean | Whether to save history | `true` |
| `sensitivePatterns` | array | Patterns of sensitive information to filter in history | `[]` |
### Configuration examples
1. YAML format (save as `~/.codex/config.yaml`):
```yaml
model: o4-mini
approvalMode: suggest
fullAutoErrorMode: ask-user
notify: true
```
2. JSON format (save as `~/.codex/config.json`):
```json
{
  "model": "o4-mini",
  "approvalMode": "suggest",
  "fullAutoErrorMode": "ask-user",
  "notify": true
}
```
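To show how the defaults and allowed values from the parameter table fit together, here is a minimal Python sketch that overlays a JSON config file on the documented defaults. It is not how the CLI itself loads configuration; `load_config`, `DEFAULTS`, and `ALLOWED` are hypothetical names, and only the JSON format is handled so that the example needs nothing beyond the standard library.

```python
import json
from pathlib import Path

# Defaults taken from the "Basic configuration parameters" table.
DEFAULTS = {
    "model": "o4-mini",
    "approvalMode": "suggest",
    "fullAutoErrorMode": "ask-user",
    "notify": True,
}

# Allowed values for the enumerated parameters in the same table.
ALLOWED = {
    "approvalMode": {"suggest", "auto-edit", "full-auto"},
    "fullAutoErrorMode": {"ask-user", "ignore-and-continue"},
}


def load_config(path: Path) -> dict:
    """Overlay the values in a JSON config file on top of the defaults."""
    config = dict(DEFAULTS)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError(
                f"{key} must be one of {sorted(allowed)}, got {config[key]!r}"
            )
    return config
```

For example, `load_config(Path.home() / ".codex" / "config.json")` returns the defaults unchanged when the file does not exist.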
### Full configuration example
Below is a comprehensive example of `config.json` with multiple custom providers:
```json
{
  "model": "o4-mini",
  "provider": "openai",
  "providers": {
    "openai": {
      "name": "OpenAI",
      "baseURL": "https://api.openai.com/v1",
      "envKey": "OPENAI_API_KEY"
    },
    "azure": {
      "name": "AzureOpenAI",
      "baseURL": "https://YOUR_PROJECT_NAME.openai.azure.com/openai",
      "envKey": "AZURE_OPENAI_API_KEY"
    },
    "openrouter": {
      "name": "OpenRouter",
      "baseURL": "https://openrouter.ai/api/v1",
      "envKey": "OPENROUTER_API_KEY"
    },
    "gemini": {
      "name": "Gemini",
      "baseURL": "https://generativelanguage.googleapis.com/v1beta/openai",
      "envKey": "GEMINI_API_KEY"
    },
    "ollama": {
      "name": "Ollama",
      "baseURL": "http://localhost:11434/v1",
      "envKey": "OLLAMA_API_KEY"
    },
    "mistral": {
      "name": "Mistral",
      "baseURL": "https://api.mistral.ai/v1",
      "envKey": "MISTRAL_API_KEY"
    },
    "deepseek": {
      "name": "DeepSeek",
      "baseURL": "https://api.deepseek.com",
      "envKey": "DEEPSEEK_API_KEY"
    },
    "xai": {
      "name": "xAI",
      "baseURL": "https://api.x.ai/v1",
      "envKey": "XAI_API_KEY"
    },
    "groq": {
      "name": "Groq",
      "baseURL": "https://api.groq.com/openai/v1",
      "envKey": "GROQ_API_KEY"
    },
    "arceeai": {
      "name": "ArceeAI",
      "baseURL": "https://conductor.arcee.ai/v1",
      "envKey": "ARCEEAI_API_KEY"
    }
  },
  "history": {
    "maxSize": 1000,
    "saveHistory": true,
    "sensitivePatterns": []
  }
}
```
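The relationship between `provider`, `providers`, `baseURL`, and `envKey` in the example above can be illustrated with a short Python sketch. It assumes the config has already been parsed into a dictionary; `resolve_provider` is a hypothetical helper, and falling back to `"openai"` when `provider` is absent is an assumption based on it being listed as the default provider.

```python
import os


def resolve_provider(config: dict, environ=os.environ) -> tuple[str, str]:
    """Return (base_url, api_key) for the provider selected in `config`."""
    provider_id = config.get("provider", "openai")
    try:
        provider = config["providers"][provider_id]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider_id!r}") from None
    # `envKey` names the environment variable that holds the API key.
    env_key = provider["envKey"]
    api_key = environ.get(env_key)
    if not api_key:
        raise RuntimeError(f"Set {env_key} to use {provider['name']}")
    return provider["baseURL"], api_key
```

Keeping only the variable name in the config file, rather than the key itself, means the file can be shared or committed without exposing credentials.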
### Custom instructions
You can create a `~/.codex/AGENTS.md` file to define custom guidance for the agent:
```markdown
- Always respond with emojis
- Only use git commands when explicitly requested
```
### Environment variables setup
For each AI provider, you need to set the corresponding API key in your environment variables. For example:
```bash
# OpenAI
export OPENAI_API_KEY="your-api-key-here"
# Azure OpenAI
export AZURE_OPENAI_API_KEY="your-azure-api-key-here"
export AZURE_OPENAI_API_VERSION="2025-04-01-preview"  # optional
# OpenRouter
export OPENROUTER_API_KEY="your-openrouter-key-here"
# Similarly for other providers
```
---
## FAQ
<details>
<summary>OpenAI released a model called Codex in 2021 - is this related?</summary>
In 2021, OpenAI released Codex, an AI system designed to generate code from natural language prompts. That original Codex model was deprecated as of March 2023 and is separate from the CLI tool.
</details>
<details>
<summary>Which models are supported?</summary>
Any model available through the [Responses API](https://platform.openai.com/docs/api-reference/responses). The default is `o4-mini`, but you can pass `--model gpt-4.1` or set `model: gpt-4.1` in your config file to override it.
</details>
<details>
<summary>Why does <code>o3</code> or <code>o4-mini</code> not work for me?</summary>
It's possible that your [API account needs to be verified](https://help.openai.com/en/articles/10910291-api-organization-verification) in order to start streaming responses and seeing chain of thought summaries from the API. If you're still running into issues, please let us know!
</details>
<details>
<summary>How do I stop Codex from editing my files?</summary>
Codex runs model-generated commands in a sandbox. If a proposed command or file change doesn't look right, you can simply type **n** to deny the command or give the model feedback.
</details>
<details>
<summary>Does it work on Windows?</summary>
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex has been tested on macOS and Linux with Node 22.
</details>
---
## Zero data retention (ZDR) usage
Codex CLI **does** support OpenAI organizations with [Zero Data Retention (ZDR)](https://platform.openai.com/docs/guides/your-data#zero-data-retention) enabled. If your OpenAI organization has Zero Data Retention enabled and you still encounter errors such as:
```
OpenAI rejected the request. Error details: Status: 400, Code: unsupported_parameter, Type: invalid_request_error, Message: 400 Previous response cannot be used for this organization due to Zero Data Retention.
```
You may need to upgrade to a more recent version with: `npm i -g @openai/codex@latest`
---
## Codex open source fund
We're excited to launch a **$1 million initiative** supporting open source projects that use Codex CLI and other OpenAI models.
- Grants are awarded up to **$25,000** in API credits.
- Applications are reviewed **on a rolling basis**.
**Interested? [Apply here](https://openai.com/form/codex-open-source-fund/).**
---
## Contributing
This project is under active development and the code will likely change pretty significantly. We'll update this message once that's complete!
More broadly we welcome contributions - whether you are opening your very first pull request or you're a seasoned maintainer. At the same time we care about reliability and long-term maintainability, so the bar for merging code is intentionally **high**. The guidelines below spell out what "high-quality" means in practice and should make the whole process transparent and friendly.
### Development workflow
- Create a _topic branch_ from `main` - e.g. `feat/interactive-prompt`.
- Keep your changes focused. Multiple unrelated fixes should be opened as separate PRs.
- Use `pnpm test:watch` during development for super-fast feedback.
- We use **Vitest** for unit tests, **ESLint** + **Prettier** for style, and **TypeScript** for type-checking.
- Before pushing, run the full test/type/lint suite:
```bash
pnpm test && pnpm run lint && pnpm run typecheck
```
- If you have **not** yet signed the Contributor License Agreement (CLA), add a PR comment containing the exact text
```text
I have read the CLA Document and I hereby sign the CLA
```
The CLA-Assistant bot will turn the PR status green once all authors have signed.
```bash
# Watch mode (tests rerun on change)
pnpm test:watch
# Type-check without emitting files
pnpm typecheck
# Automatically fix lint + prettier issues
pnpm lint:fix
pnpm format:fix
```
### Git hooks with Husky
This project uses [Husky](https://typicode.github.io/husky/) to enforce code quality checks:
- **Pre-commit hook**: Automatically runs lint-staged to format and lint files before committing
- **Pre-push hook**: Runs tests and type checking before pushing to the remote
These hooks help maintain code quality and prevent pushing code with failing tests. For more details, see [HUSKY.md](./HUSKY.md).
### Debugging
To debug the CLI with a visual debugger, do the following in the `codex-cli` folder:
- Run `pnpm run build` to build the CLI, which will generate `cli.js.map` alongside `cli.js` in the `dist` folder.
- Run the CLI with `node --inspect-brk ./dist/cli.js`. The program then waits until a debugger is attached before proceeding. Options:
- In VS Code, choose **Debug: Attach to Node Process** from the command palette and choose the option in the dropdown with debug port `9229` (likely the first option)
- Go to <chrome://inspect> in Chrome and find **localhost:9229** and click **trace**
### Writing high-impact code changes
1. **Start with an issue.** Open a new one or comment on an existing discussion so we can agree on the solution before code is written.
2. **Add or update tests.** Every new feature or bug-fix should come with test coverage that fails before your change and passes afterwards. 100% coverage is not required, but aim for meaningful assertions.
3. **Document behaviour.** If your change affects user-facing behaviour, update the README, inline help (`codex --help`), or relevant example projects.
4. **Keep commits atomic.** Each commit should compile and the tests should pass. This makes reviews and potential rollbacks easier.
### Opening a pull request
- Fill in the PR template (or include similar information) - **What? Why? How?**
- Run **all** checks locally (`pnpm test && pnpm run lint && pnpm run typecheck`). CI failures that could have been caught locally slow down the process.
- Make sure your branch is up-to-date with `main` and that you have resolved merge conflicts.
- Mark the PR as **Ready for review** only when you believe it is in a merge-able state.
### Review process
1. One maintainer will be assigned as a primary reviewer.
2. We may ask for changes - please do not take this personally. We value the work, we just also value consistency and long-term maintainability.
3. When there is consensus that the PR meets the bar, a maintainer will squash-and-merge.
### Community values
- **Be kind and inclusive.** Treat others with respect; we follow the [Contributor Covenant](https://www.contributor-covenant.org/).
- **Assume good intent.** Written communication is hard - err on the side of generosity.
- **Teach & learn.** If you spot something confusing, open an issue or PR with improvements.
### Getting help
If you run into problems setting up the project, would like feedback on an idea, or just want to say _hi_ - please open a Discussion or jump into the relevant issue. We are happy to help.
Together we can make Codex CLI an incredible tool. **Happy hacking!** :rocket:
### Contributor license agreement (CLA)
All contributors **must** accept the CLA. The process is lightweight:
1. Open your pull request.
2. Paste the following comment (or reply `recheck` if you've signed before):
```text
I have read the CLA Document and I hereby sign the CLA
```
3. The CLA-Assistant bot records your signature in the repo and marks the status check as passed.
No special Git commands, email attachments, or commit footers required.
#### Quick fixes
| Scenario | Command |
| ----------------- | ------------------------------------------------ |
| Amend last commit | `git commit --amend -s --no-edit && git push -f` |
The **DCO check** blocks merges until every commit in the PR carries the footer (with squash this is just the one).
### Releasing `codex`
To publish a new version of the CLI you first need to stage the npm package. A
helper script in `codex-cli/scripts/` does all the heavy lifting. Inside the
`codex-cli` folder run:
```bash
# Classic, JS implementation that includes small, native binaries for Linux sandboxing.
pnpm stage-release
# Optionally specify the temp directory to reuse between runs.
RELEASE_DIR=$(mktemp -d)
pnpm stage-release --tmp "$RELEASE_DIR"
# "Fat" package that additionally bundles the native Rust CLI binaries for
# Linux. End-users can then opt-in at runtime by setting CODEX_RUST=1.
pnpm stage-release --native
```
Go to the folder where the release is staged and verify that it works as intended. If so, run the following from the temp folder:
```
cd "$RELEASE_DIR"
npm publish
```
### Alternative build options
#### Nix flake development
Prerequisite: Nix >= 2.4 with flakes enabled (`experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`).
Enter a Nix development shell:
```bash
# Use either one of the commands according to which implementation you want to work with
nix develop .#codex-cli # For entering codex-cli specific shell
nix develop .#codex-rs # For entering codex-rs specific shell
```
This shell includes Node.js, installs dependencies, builds the CLI, and provides a `codex` command alias.
Build and run the CLI directly:
```bash
# Use either one of the commands according to which implementation you want to work with
nix build .#codex-cli # For building codex-cli
nix build .#codex-rs # For building codex-rs
./result/bin/codex --help
```
Run the CLI via the flake app:
```bash
# Use either one of the commands according to which implementation you want to work with
nix run .#codex-cli # For running codex-cli
nix run .#codex-rs # For running codex-rs
```
Use direnv with flakes
If you have direnv installed, you can use the following `.envrc` to automatically enter the Nix shell when you `cd` into the project directory:
```bash
cd codex-rs
echo "use flake ../flake.nix#codex-rs" >> .envrc && direnv allow
cd codex-cli
echo "use flake ../flake.nix#codex-cli" >> .envrc && direnv allow
```
---
## Security & responsible AI
Have you discovered a vulnerability or have concerns about model output? Please e-mail **security@openai.com** and we will respond promptly.
---
## License
This repository is licensed under the [Apache-2.0 License](LICENSE).


@@ -15,7 +15,6 @@
* current platform / architecture, an error is thrown.
*/
import { spawnSync } from "child_process";
import fs from "fs";
import path from "path";
import { fileURLToPath, pathToFileURL } from "url";
@@ -35,12 +34,13 @@ const wantsNative = fs.existsSync(path.join(__dirname, "use-native")) ||
: false);
// Try native binary if requested.
if (wantsNative) {
if (wantsNative && process.platform !== 'win32') {
const { platform, arch } = process;
let targetTriple = null;
switch (platform) {
case "linux":
case "android":
switch (arch) {
case "x64":
targetTriple = "x86_64-unknown-linux-musl";
@@ -73,22 +73,76 @@ if (wantsNative) {
}
const binaryPath = path.join(__dirname, "..", "bin", `codex-${targetTriple}`);
const result = spawnSync(binaryPath, process.argv.slice(2), {
// Use an asynchronous spawn instead of spawnSync so that Node is able to
// respond to signals (e.g. Ctrl-C / SIGINT) while the native binary is
// executing. This allows us to forward those signals to the child process
// and guarantees that when either the child terminates or the parent
// receives a fatal signal, both processes exit in a predictable manner.
const { spawn } = await import("child_process");
const child = spawn(binaryPath, process.argv.slice(2), {
stdio: "inherit",
});
const exitCode = typeof result.status === "number" ? result.status : 1;
process.exit(exitCode);
}
child.on("error", (err) => {
// Typically triggered when the binary is missing or not executable.
// Re-throwing here will terminate the parent with a non-zero exit code
// while still printing a helpful stack trace.
// eslint-disable-next-line no-console
console.error(err);
process.exit(1);
});
// Fallback: execute the original JavaScript CLI.
// Forward common termination signals to the child so that it shuts down
// gracefully. In the handler we temporarily disable the default behavior of
// exiting immediately; once the child has been signaled we simply wait for
// its exit event which will in turn terminate the parent (see below).
const forwardSignal = (signal) => {
if (child.killed) {
return;
}
try {
child.kill(signal);
} catch {
/* ignore */
}
};
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;
["SIGINT", "SIGTERM", "SIGHUP"].forEach((sig) => {
process.on(sig, () => forwardSignal(sig));
});
// Load and execute the CLI
(async () => {
// When the child exits, mirror its termination reason in the parent so that
// shell scripts and other tooling observe the correct exit status.
// Wrap the lifetime of the child process in a Promise so that we can await
// its termination in a structured way. The Promise resolves with an object
// describing how the child exited: either via exit code or due to a signal.
const childResult = await new Promise((resolve) => {
child.on("exit", (code, signal) => {
if (signal) {
resolve({ type: "signal", signal });
} else {
resolve({ type: "code", exitCode: code ?? 1 });
}
});
});
if (childResult.type === "signal") {
// Re-emit the same signal so that the parent terminates with the expected
// semantics (this also sets the correct exit code of 128 + n).
process.kill(process.pid, childResult.signal);
} else {
process.exit(childResult.exitCode);
}
} else {
// Fallback: execute the original JavaScript CLI.
// Resolve the path to the compiled CLI bundle
const cliPath = path.resolve(__dirname, "../dist/cli.js");
const cliUrl = pathToFileURL(cliPath).href;
// Load and execute the CLI
try {
await import(cliUrl);
} catch (err) {
@@ -96,4 +150,4 @@ const cliUrl = pathToFileURL(cliPath).href;
console.error(err);
process.exit(1);
}
})();
}


@@ -84,6 +84,6 @@
},
"repository": {
"type": "git",
"url": "https://github.com/openai/codex"
"url": "git+https://github.com/openai/codex.git"
}
}


@@ -0,0 +1,9 @@
# npm releases
To build version 0.2.x or later of the npm module, which runs the Rust version of the CLI, run the following:
```bash
./codex-cli/scripts/stage_rust_release.py --release-version 0.6.0
```


@@ -8,7 +8,7 @@
# the native implementation when users set CODEX_RUST=1.
#
# Usage
# install_native_deps.sh [RELEASE_ROOT] [--full-native]
# install_native_deps.sh [--full-native] [--workflow-url URL] [CODEX_CLI_ROOT]
#
# The optional RELEASE_ROOT is the path that contains package.json. Omitting
# it installs the binaries into the repository's own bin/ folder to support
@@ -20,32 +20,43 @@ set -euo pipefail
# Parse arguments
# ------------------
DEST_DIR=""
CODEX_CLI_ROOT=""
INCLUDE_RUST=0
for arg in "$@"; do
case "$arg" in
# Until we start publishing stable GitHub releases, we have to grab the binaries
# from the GitHub Action that created them. Update the URL below to point to the
# appropriate workflow run:
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/15981617627"
while [[ $# -gt 0 ]]; do
case "$1" in
--full-native)
INCLUDE_RUST=1
;;
--workflow-url)
shift || { echo "--workflow-url requires an argument"; exit 1; }
if [ -n "$1" ]; then
WORKFLOW_URL="$1"
fi
;;
*)
if [[ -z "$DEST_DIR" ]]; then
DEST_DIR="$arg"
if [[ -z "$CODEX_CLI_ROOT" ]]; then
CODEX_CLI_ROOT="$1"
else
echo "Unexpected argument: $arg" >&2
echo "Unexpected argument: $1" >&2
exit 1
fi
;;
esac
shift
done
# ----------------------------------------------------------------------------
# Determine where the binaries should be installed.
# ----------------------------------------------------------------------------
if [[ $# -gt 0 ]]; then
if [ -n "$CODEX_CLI_ROOT" ]; then
# The caller supplied a release root directory.
CODEX_CLI_ROOT="$1"
BIN_DIR="$CODEX_CLI_ROOT/bin"
else
# No argument; fall back to the repo's own bin directory.
@@ -62,10 +73,6 @@ mkdir -p "$BIN_DIR"
# Download and decompress the artifacts from the GitHub Actions workflow.
# ----------------------------------------------------------------------------
# Until we start publishing stable GitHub releases, we have to grab the binaries
# from the GitHub Action that created them. Update the URL below to point to the
# appropriate workflow run:
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/15483730027"
WORKFLOW_ID="${WORKFLOW_URL##*/}"
ARTIFACTS_DIR="$(mktemp -d)"
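
The shift-based parsing loop and the `${WORKFLOW_URL##*/}` expansion above can be restated compactly. The following is a sketch in TypeScript, not part of the script: `parseNativeDepsArgs`, `workflowIdFromUrl`, and the `NativeDepsArgs` shape are hypothetical names, and the default URL is passed in rather than hard-coded.

```typescript
// Hypothetical shape, for illustration only.
interface NativeDepsArgs {
  includeRust: boolean;
  workflowUrl: string;
  codexCliRoot: string; // "" means: use the repo's own bin directory
}

function parseNativeDepsArgs(
  argv: ReadonlyArray<string>,
  defaultWorkflowUrl: string,
): NativeDepsArgs {
  const parsed: NativeDepsArgs = {
    includeRust: false,
    workflowUrl: defaultWorkflowUrl,
    codexCliRoot: "",
  };
  const queue = Array.from(argv);
  // Consume arguments one at a time, like `shift` in the shell loop.
  for (let arg = queue.shift(); arg !== undefined; arg = queue.shift()) {
    if (arg === "--full-native") {
      parsed.includeRust = true;
    } else if (arg === "--workflow-url") {
      const value = queue.shift();
      if (value === undefined) {
        throw new Error("--workflow-url requires an argument");
      }
      // As in the script, an empty value keeps the default URL.
      if (value !== "") {
        parsed.workflowUrl = value;
      }
    } else if (parsed.codexCliRoot === "") {
      // The first positional argument is the CLI root.
      parsed.codexCliRoot = arg;
    } else {
      throw new Error(`Unexpected argument: ${arg}`);
    }
  }
  return parsed;
}

// Equivalent of "${WORKFLOW_URL##*/}": keep what follows the last "/".
function workflowIdFromUrl(url: string): string {
  return url.slice(url.lastIndexOf("/") + 1);
}
```

Because flags and the positional root are handled in the same loop, their order on the command line does not matter, which is what lets `stage_release.sh` pass `"$TMPDIR"` last.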

@@ -4,10 +4,7 @@
# -----------------------------------------------------------------------------
# Stages an npm release for @openai/codex.
#
# The script used to accept a single optional positional argument that indicated
# the temporary directory in which to stage the package. We now support a
# flag-based interface so that we can extend the command with further options
# without breaking the call-site contract.
# Usage:
#
# --tmp <dir> : Use <dir> instead of a freshly created temp directory.
# --native : Bundle the pre-built Rust CLI binaries for Linux alongside
@@ -30,11 +27,12 @@ set -euo pipefail
usage() {
cat <<EOF
Usage: $(basename "$0") [--tmp DIR] [--native]
Usage: $(basename "$0") [--tmp DIR] [--native] [--version VERSION]
Options
--tmp DIR Use DIR to stage the release (defaults to a fresh mktemp dir)
--native Bundle Rust binaries for Linux (fat package)
--version Specify the version to release (defaults to a timestamp-based version)
-h, --help Show this help
Legacy positional argument: the first non-flag argument is still interpreted
@@ -45,6 +43,9 @@ EOF
TMPDIR=""
INCLUDE_NATIVE=0
# Default to a timestamp-based version (keep same scheme as before)
VERSION="$(printf '0.1.%d' "$(date +%y%m%d%H%M)")"
WORKFLOW_URL=""
# Manual flag parser - Bash getopts does not handle GNU long options well.
while [[ $# -gt 0 ]]; do
@@ -59,6 +60,14 @@ while [[ $# -gt 0 ]]; do
--native)
INCLUDE_NATIVE=1
;;
--version)
shift || { echo "--version requires an argument"; usage 1; }
VERSION="$1"
;;
--workflow-url)
shift || { echo "--workflow-url requires an argument"; exit 1; }
WORKFLOW_URL="$1"
;;
-h|--help)
usage 0
;;
@@ -108,9 +117,6 @@ cp -r dist "$TMPDIR/dist"
cp -r src "$TMPDIR/src" # keep source for TS sourcemaps
cp ../README.md "$TMPDIR" || true # README is one level up - ignore if missing
# Derive a timestamp-based version (keep same scheme as before)
VERSION="$(printf '0.1.%d' "$(date +%y%m%d%H%M)")"
# Modify package.json - bump version and optionally add the native directory to
# the files array so that the binaries are published to npm.
@@ -121,7 +127,7 @@ jq --arg version "$VERSION" \
# 2. Native runtime deps (sandbox plus optional Rust binaries)
if [[ "$INCLUDE_NATIVE" -eq 1 ]]; then
./scripts/install_native_deps.sh "$TMPDIR" --full-native
./scripts/install_native_deps.sh --full-native --workflow-url "$WORKFLOW_URL" "$TMPDIR"
touch "${TMPDIR}/bin/use-native"
else
./scripts/install_native_deps.sh "$TMPDIR"
@@ -132,7 +138,8 @@ popd >/dev/null
echo "Staged version $VERSION for release in $TMPDIR"
if [[ "$INCLUDE_NATIVE" -eq 1 ]]; then
echo "Test Rust:"
echo "Verify the CLI:"
echo " node ${TMPDIR}/bin/codex.js --version"
echo " node ${TMPDIR}/bin/codex.js --help"
else
echo "Test Node:"
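
The default version above is `0.1.` followed by the output of `date +%y%m%d%H%M`. As a rough illustration of what that produces, here is a sketch in TypeScript; `timestampVersion` is a hypothetical name, and the sketch assumes local time and a two-digit year of 10 or more, so it does not model how `printf '%d'` would treat a leading zero.

```typescript
// Hypothetical helper: formats a date as 0.1.YYMMDDHHMM in local time.
function timestampVersion(now: Date): string {
  // Zero-pad a number to two digits.
  const two = (n: number): string => (n < 10 ? "0" : "") + String(n);
  const stamp =
    two(now.getFullYear() % 100) +
    two(now.getMonth() + 1) + // getMonth() is zero-based
    two(now.getDate()) +
    two(now.getHours()) +
    two(now.getMinutes());
  return "0.1." + stamp;
}

// Usage (not executed here):
//   timestampVersion(new Date(2025, 6, 22, 16, 6)) gives "0.1.2507221606"
```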

@@ -0,0 +1,62 @@
#!/usr/bin/env python3
import json
import subprocess
import sys
import argparse
from pathlib import Path
def main() -> int:
parser = argparse.ArgumentParser(
description="""Stage a release for the npm module.
Run this after the GitHub Release has been created and use
`--release-version` to specify the version to release.
"""
)
parser.add_argument(
"--release-version", required=True, help="Version to release, e.g., 0.3.0"
)
args = parser.parse_args()
version = args.release_version
gh_run = subprocess.run(
[
"gh",
"run",
"list",
"--branch",
f"rust-v{version}",
"--json",
"workflowName,url,headSha",
"--jq",
'first(.[] | select(.workflowName == "rust-release"))',
],
stdout=subprocess.PIPE,
check=True,
)
gh_run.check_returncode()
workflow = json.loads(gh_run.stdout)
sha = workflow["headSha"]
print(f"should `git checkout {sha}`")
current_dir = Path(__file__).parent.resolve()
stage_release = subprocess.run(
[
current_dir / "stage_release.sh",
"--version",
version,
"--workflow-url",
workflow["url"],
"--native",
]
)
stage_release.check_returncode()
return 0
if __name__ == "__main__":
sys.exit(main())
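
The `--jq` filter in the script above selects the first run whose `workflowName` is `rust-release`. A sketch of the same selection over already-parsed JSON follows; the `WorkflowRun` interface and `firstRustReleaseRun` are hypothetical names that only mirror the three fields requested via `--json`.

```typescript
// Hypothetical shape mirroring the fields requested with --json.
interface WorkflowRun {
  workflowName: string;
  url: string;
  headSha: string;
}

// Returns the first run produced by the "rust-release" workflow, or
// undefined when there is none; the caller has to handle that case.
function firstRustReleaseRun(
  runs: ReadonlyArray<WorkflowRun>,
): WorkflowRun | undefined {
  return runs.find((run) => run.workflowName === "rust-release");
}
```

The selected run supplies both values the script needs: `headSha` for the suggested checkout and `url` for `--workflow-url`.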

@@ -370,11 +370,26 @@ export function isSafeCommand(
reason: "View file with line numbers",
group: "Reading files",
};
case "rg":
case "rg": {
// Certain ripgrep options execute external commands or invoke other
// processes, so we must reject them.
const isUnsafe = command.some(
(arg: string) =>
UNSAFE_OPTIONS_FOR_RIPGREP_WITHOUT_ARGS.has(arg) ||
[...UNSAFE_OPTIONS_FOR_RIPGREP_WITH_ARGS].some(
(opt) => arg === opt || arg.startsWith(`${opt}=`),
),
);
if (isUnsafe) {
break;
}
return {
reason: "Ripgrep search",
group: "Searching",
};
}
case "find": {
// Certain options to `find` allow executing arbitrary processes, so we
// cannot auto-approve them.
@@ -495,6 +510,22 @@ const UNSAFE_OPTIONS_FOR_FIND_COMMAND: ReadonlySet<string> = new Set([
"-fprintf",
]);
// Ripgrep options that are considered unsafe because they may execute
// arbitrary commands or spawn auxiliary processes.
const UNSAFE_OPTIONS_FOR_RIPGREP_WITH_ARGS: ReadonlySet<string> = new Set([
// Executes an arbitrary command for each matching file.
"--pre",
// Allows custom hostname command which could leak environment details.
"--hostname-bin",
]);
const UNSAFE_OPTIONS_FOR_RIPGREP_WITHOUT_ARGS: ReadonlySet<string> = new Set([
// Enables searching inside archives, which triggers external decompression
// utilities; reject out of an abundance of caution.
"--search-zip",
"-z",
]);
// ---------------- Helper utilities for complex shell expressions -----------------
// A conservative allow-list of bash operators that do not, on their own, cause
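
The ripgrep check above combines two kinds of matching: exact matches for flags without arguments, and exact or `=`-prefixed matches for flags that take an argument. The sketch below isolates that predicate. The option lists are copied from the diff; the function name `hasUnsafeRipgrepOption` and the shortened constant names are hypothetical.

```typescript
// Option lists copied from the diff; names shortened for the sketch.
const UNSAFE_WITH_ARGS: ReadonlySet<string> = new Set([
  "--pre",
  "--hostname-bin",
]);
const UNSAFE_WITHOUT_ARGS: ReadonlySet<string> = new Set([
  "--search-zip",
  "-z",
]);

// True when any argument is an unsafe flag, either on its own
// ("--pre", "-z") or in the "--flag=value" form ("--pre=cat").
function hasUnsafeRipgrepOption(command: ReadonlyArray<string>): boolean {
  return command.some(
    (arg) =>
      UNSAFE_WITHOUT_ARGS.has(arg) ||
      Array.from(UNSAFE_WITH_ARGS).some(
        (opt) => arg === opt || arg.startsWith(`${opt}=`),
      ),
  );
}
```

Matching on `opt` or `opt=` rather than on a bare prefix means a longer flag that merely begins with `--pre` is not rejected by this check. Note that an exact match on `-z` would likewise not catch the flag if it were bundled with other short options in a single argument.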

@@ -45,6 +45,7 @@ import { createInputItem } from "./utils/input-utils";
import { initLogger } from "./utils/logger/log";
import { isModelSupportedForResponses } from "./utils/model-utils.js";
import { parseToolCall } from "./utils/parsers";
import { providers } from "./utils/providers";
import { onExit, setInkRenderer } from "./utils/terminal";
import chalk from "chalk";
import { spawnSync } from "child_process";
@@ -327,26 +328,44 @@ try {
// ignore errors
}
if (cli.flags.login) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
if (fs.existsSync(authFile)) {
const data = JSON.parse(fs.readFileSync(authFile, "utf-8"));
savedTokens = data.tokens;
// Get provider-specific API key if not OpenAI
if (provider.toLowerCase() !== "openai") {
const providerInfo = providers[provider.toLowerCase()];
if (providerInfo) {
const providerApiKey = process.env[providerInfo.envKey];
if (providerApiKey) {
apiKey = providerApiKey;
}
} catch {
/* ignore */
}
} else if (!apiKey) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
}
// Only proceed with OpenAI auth flow if:
// 1. Provider is OpenAI and no API key is set, or
// 2. Login flag is explicitly set
if (provider.toLowerCase() === "openai" && !apiKey) {
if (cli.flags.login) {
apiKey = await fetchApiKey(client.issuer, client.client_id);
try {
const home = os.homedir();
const authDir = path.join(home, ".codex");
const authFile = path.join(authDir, "auth.json");
if (fs.existsSync(authFile)) {
const data = JSON.parse(fs.readFileSync(authFile, "utf-8"));
savedTokens = data.tokens;
}
} catch {
/* ignore */
}
} else {
apiKey = await fetchApiKey(client.issuer, client.client_id);
}
}
// Ensure the API key is available as an environment variable for legacy code
process.env["OPENAI_API_KEY"] = apiKey;
if (cli.flags.free) {
// Only attempt credit redemption for OpenAI provider
if (cli.flags.free && provider.toLowerCase() === "openai") {
// eslint-disable-next-line no-console
console.log(`${chalk.bold("codex --free")} attempting to redeem credits...`);
if (!savedTokens?.refresh_token) {
@@ -379,13 +398,18 @@ if (!apiKey && !NO_API_KEY_REQUIRED.has(provider.toLowerCase())) {
? `You can create a key here: ${chalk.bold(
chalk.underline("https://platform.openai.com/account/api-keys"),
)}\n`
: provider.toLowerCase() === "gemini"
: provider.toLowerCase() === "azure"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
`${provider.toUpperCase()}_OPENAI_API_KEY`,
)} ` +
`in Azure AI Foundry portal at ${chalk.bold(chalk.underline("https://ai.azure.com"))}.\n`
: provider.toLowerCase() === "gemini"
? `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`Google AI Studio`)}.\n`
: `You can create a ${chalk.bold(
`${provider.toUpperCase()}_API_KEY`,
)} ` + `in the ${chalk.bold(`${provider}`)} dashboard.\n`
}`,
);
process.exit(1);
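
The nested conditional in the hunk above picks an environment-variable name and a place to create the key, depending on the provider. A flattened sketch of the non-OpenAI branches follows. `KeyHint` and `apiKeyHint` are hypothetical names, and the OpenAI branch is left out because its message is only partly visible in this diff.

```typescript
// Hypothetical shape, for illustration only.
interface KeyHint {
  envVar: string; // name of the environment variable to set
  where: string; // where the key can be created
}

// Mirrors the azure / gemini / default branches of the error message.
function apiKeyHint(provider: string): KeyHint {
  const upper = provider.toUpperCase();
  switch (provider.toLowerCase()) {
    case "azure":
      return {
        envVar: `${upper}_OPENAI_API_KEY`,
        where: "Azure AI Foundry portal at https://ai.azure.com",
      };
    case "gemini":
      return { envVar: `${upper}_API_KEY`, where: "Google AI Studio" };
    default:
      return { envVar: `${upper}_API_KEY`, where: `${provider} dashboard` };
  }
}
```

A `switch` of this kind reads more easily than a chain of ternaries once a third provider-specific case is added, which is the situation the Azure branch creates here.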

@@ -6,7 +6,6 @@ import { ReviewDecision } from "../../utils/agent/review";
import { Select } from "../vendor/ink-select/select";
import TextInput from "../vendor/ink-text-input";
import { Box, Text, useInput } from "ink";
import { sessionScopedApprovalLabel } from "../../utils/string-utils";
import React from "react";
// default denyreason:
@@ -81,23 +80,16 @@ export function TerminalChatCommandReview({
| { label: string; value: "switch" }
> = [
{
label: "Yes, run this command (y)",
label: "Yes (y)",
value: ReviewDecision.YES,
},
];
if (showAlwaysApprove) {
let label: string;
if (
React.isValidElement(confirmationPrompt) &&
typeof (confirmationPrompt as any).props?.commandForDisplay === "string"
) {
const cmd: string = (confirmationPrompt as any).props.commandForDisplay;
label = sessionScopedApprovalLabel(cmd, 30);
} else {
label = "Always allow this command for the remainder of the session (a)";
}
opts.push({ label, value: ReviewDecision.ALWAYS });
opts.push({
label: "Yes, always approve this exact command for this session (a)",
value: ReviewDecision.ALWAYS,
});
}
opts.push(
@@ -125,7 +117,7 @@ export function TerminalChatCommandReview({
);
return opts;
}, [showAlwaysApprove, confirmationPrompt]);
}, [showAlwaysApprove]);
useInput(
(input, key) => {

@@ -800,7 +800,8 @@ export class AgentLoop {
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
this.config.provider?.toLowerCase() === "openai" ||
this.config.provider?.toLowerCase() === "azure"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
@@ -1188,7 +1189,8 @@ export class AgentLoop {
const responseCall =
!this.config.provider ||
this.config.provider?.toLowerCase() === "openai"
this.config.provider?.toLowerCase() === "openai" ||
this.config.provider?.toLowerCase() === "azure"
? (params: ResponseCreateParams) =>
this.oai.responses.create(params)
: (params: ResponseCreateParams) =>
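
Both hunks above apply the same rule: the Responses API is called directly when no provider is configured or when the provider is `openai` or `azure`, and every other provider takes the alternative path. Since the condition is duplicated, it could be expressed once as a predicate. The sketch below assumes a hypothetical helper name, `usesResponsesEndpoint`, which does not exist in `AgentLoop`.

```typescript
// Hypothetical helper capturing the condition used in both hunks.
function usesResponsesEndpoint(provider: string | undefined): boolean {
  // An unset (or empty) provider is treated like the default provider.
  if (!provider) {
    return true;
  }
  const name = provider.toLowerCase();
  return name === "openai" || name === "azure";
}
```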

@@ -69,7 +69,7 @@ export const OPENAI_BASE_URL = process.env["OPENAI_BASE_URL"] || "";
export let OPENAI_API_KEY = process.env["OPENAI_API_KEY"] || "";
export const AZURE_OPENAI_API_VERSION =
process.env["AZURE_OPENAI_API_VERSION"] || "2025-03-01-preview";
process.env["AZURE_OPENAI_API_VERSION"] || "2025-04-01-preview";
export const DEFAULT_REASONING_EFFORT = "high";
export const OPENAI_ORGANIZATION = process.env["OPENAI_ORGANIZATION"] || "";

@@ -1,27 +0,0 @@
/**
* Truncate a string in the middle to ensure its length does not exceed maxLength.
* If the input is longer than maxLength, replaces the middle with a single-character ellipsis '…'.
*/
export function truncateMiddle(text: string, maxLength: number): string {
if (text.length <= maxLength) {
return text;
}
const ellipsis = '…';
const trimLength = maxLength - ellipsis.length;
const startLength = Math.ceil(trimLength / 2);
const endLength = Math.floor(trimLength / 2);
return text.slice(0, startLength) + ellipsis + text.slice(text.length - endLength);
}
/**
* Generate a session-scoped approval label for a given command.
* Embeds a truncated snippet of the first line of commandForDisplay.
*/
export function sessionScopedApprovalLabel(
commandForDisplay: string,
maxLength: number,
): string {
const firstLine = commandForDisplay.split('\n')[0].trim();
const snippet = truncateMiddle(firstLine, maxLength);
return `Yes, always allow running \`${snippet}\` for this session (a)`;
}

@@ -0,0 +1,107 @@
/**
* tests/agent-azure-responses-endpoint.test.ts
*
* Verifies that AgentLoop calls the `/responses` endpoint when provider is set to Azure.
*/
import { describe, it, expect, vi, beforeEach } from "vitest";
// Fake stream that yields a completed response event
class FakeStream {
async *[Symbol.asyncIterator]() {
yield {
type: "response.completed",
response: { id: "azure_resp", status: "completed", output: [] },
} as any;
}
}
let lastCreateParams: any = null;
vi.mock("openai", () => {
class FakeDefaultClient {
public responses = {
create: async (params: any) => {
lastCreateParams = params;
return new FakeStream();
},
};
}
class FakeAzureClient {
public responses = {
create: async (params: any) => {
lastCreateParams = params;
return new FakeStream();
},
};
}
class APIConnectionTimeoutError extends Error {}
return {
__esModule: true,
default: FakeDefaultClient,
AzureOpenAI: FakeAzureClient,
APIConnectionTimeoutError,
};
});
// Stub approvals to bypass command approval logic
vi.mock("../src/approvals.js", () => ({
__esModule: true,
alwaysApprovedCommands: new Set<string>(),
canAutoApprove: () => ({ type: "auto-approve", runInSandbox: false }),
isSafeCommand: () => null,
}));
// Stub format-command to avoid formatting side effects
vi.mock("../src/format-command.js", () => ({
__esModule: true,
formatCommandForDisplay: (cmd: Array<string>) => cmd.join(" "),
}));
// Stub internal logging to keep output clean
vi.mock("../src/utils/agent/log.js", () => ({
__esModule: true,
log: () => {},
isLoggingEnabled: () => false,
}));
import { AgentLoop } from "../src/utils/agent/agent-loop.js";
describe("AgentLoop Azure provider responses endpoint", () => {
beforeEach(() => {
lastCreateParams = null;
});
it("calls the /responses endpoint when provider is azure", async () => {
const cfg: any = {
model: "test-model",
provider: "azure",
instructions: "",
disableResponseStorage: false,
notify: false,
};
const loop = new AgentLoop({
additionalWritableRoots: [],
model: cfg.model,
config: cfg,
instructions: cfg.instructions,
approvalPolicy: { mode: "suggest" } as any,
onItem: () => {},
onLoading: () => {},
getCommandConfirmation: async () => ({ review: "yes" }) as any,
onLastResponseId: () => {},
});
await loop.run([
{
type: "message",
role: "user",
content: [{ type: "input_text", text: "hello" }],
},
]);
expect(lastCreateParams).not.toBeNull();
expect(lastCreateParams.model).toBe(cfg.model);
expect(Array.isArray(lastCreateParams.input)).toBe(true);
});
});

@@ -44,6 +44,14 @@ describe("canAutoApprove()", () => {
group: "Navigating",
runInSandbox: false,
});
// Ripgrep safe invocation.
expect(check(["rg", "TODO"])).toEqual({
type: "auto-approve",
reason: "Ripgrep search",
group: "Searching",
runInSandbox: false,
});
});
test("simple safe commands within a `bash -lc` call", () => {
@@ -67,6 +75,24 @@ describe("canAutoApprove()", () => {
});
});
test("ripgrep unsafe flags", () => {
// Flags that do not take arguments
expect(check(["rg", "--search-zip", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "-z", "TODO"])).toEqual({ type: "ask-user" });
// Flags that take arguments (provided separately)
expect(check(["rg", "--pre", "cat", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "--hostname-bin", "hostname", "TODO"])).toEqual({
type: "ask-user",
});
// Flags that take arguments in = form
expect(check(["rg", "--pre=cat", "TODO"])).toEqual({ type: "ask-user" });
expect(check(["rg", "--hostname-bin=hostname", "TODO"])).toEqual({
type: "ask-user",
});
});
test("bash -lc commands with unsafe redirects", () => {
expect(check(["bash", "-lc", "echo hello > file.txt"])).toEqual({
type: "ask-user",

@@ -1,39 +0,0 @@
import { truncateMiddle, sessionScopedApprovalLabel } from "../src/utils/string-utils";
describe("truncateMiddle", () => {
it("returns the original string when shorter than max length", () => {
expect(truncateMiddle("short", 10)).toBe("short");
});
it("returns the original string when equal to max length", () => {
expect(truncateMiddle("exactlen", 8)).toBe("exactlen");
});
it("truncates the middle of a longer string", () => {
const text = "abcdefghij"; // length 10
// maxLength 7 => trimLength=6, startLen=3, endLen=3 => "abc…hij"
expect(truncateMiddle(text, 7)).toBe("abc…hij");
});
it("handles odd max lengths correctly", () => {
const text = "abcdefghijkl"; // length 12
// maxLength 8 => trimLength=7, startLen=4, endLen=3 => "abcd…ijk"
expect(truncateMiddle(text, 8)).toBe("abcd…ijk");
});
});
describe("sessionScopedApprovalLabel", () => {
const cmd = "echo hello world";
it("embeds the full command when shorter than max length", () => {
expect(sessionScopedApprovalLabel(cmd, 50)).toBe(
"Yes, always allow running `echo hello world` for this session (a)",
);
});
it("embeds a truncated command when longer than max length", () => {
const longCmd = "cat " + "a".repeat(100) + " end";
const label = sessionScopedApprovalLabel(longCmd, 20);
expect(label).toMatch(/^Yes, always allow running `.{0,20}` for this session \(a\)$/);
});
});

@@ -1,93 +0,0 @@
# Codex Orchestration Framework: Plan & Open Questions
This document collects the high-level architecture, planned features, and unresolved design decisions for the proposed **codex-agents** orchestration framework.
## 1. Architecture & Core Components
- **XDG-compliant configuration & state**
- Repo-local overrides: `<repo>/.codex-agent/config.toml`
- User-wide config: `$XDG_CONFIG_HOME/codex-agents/config.toml`
- Global task registry: `$XDG_DATA_HOME/codex-agents/tasks.json`
- **CLI & optional TUI**
- `codex-agent init` → bootstrap repo (copy prompts, create directories)
- `codex-agent status [--tui]` → show global and per-repo task/merge status
- `codex-agent config` → inspect or edit effective config
- `codex-agent agents` → view per-agent instruction overrides
- **Task management (`codex-agent task`)**
- `add`, `list`, `edit`, `worktree add|remove`, `validate`, `review`, `complete`
- Interactive AI Q&A flow for `task add` to auto-populate slug, goal, dependencies, and stub file
- **Worktree hydration**
- OS-aware reflink: macOS `cp -cRp`, Linux `cp --reflink=auto`, fallback to `rsync`
- COW setup via `git worktree add --no-checkout` + hydration step
- **Merge & Conflict Resolver (`codex-agent merge`)**
- `merge check` → dry-run merge in temp worktree
- `merge resolve` → AI-driven conflict resolution or explicit bail-out
- `merge rebase` → manual rebase entrypoint
- **Code Validator (`codex-agent task validate|review`)**
- Run linters/tests, then invoke Validator agent prompt
- Enforce configurable policies (doc coverage, style rules, test thresholds)
- **Project Manager (`codex-agent manager`)**
- Wave planning, parallel launch commands, live monitoring of worktrees
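
The worktree hydration item above names three copy strategies. One way to read it is as a per-platform choice of command, sketched below. This is an interpretation rather than the plan's design: `hydrateCopyCommand` and the `Platform` type are hypothetical, the `-R` on Linux and `-a` for `rsync` are assumptions added so that directories are copied, and the plan may instead intend `rsync` as a fallback when a reflink copy fails rather than as a per-platform default.

```typescript
// Hypothetical platform tag, for illustration only.
type Platform = "darwin" | "linux" | "other";

// Builds the argv for copying a worktree's contents from src to dest.
function hydrateCopyCommand(
  platform: Platform,
  src: string,
  dest: string,
): string[] {
  switch (platform) {
    case "darwin":
      // Flags as given in the plan.
      return ["cp", "-cRp", src, dest];
    case "linux":
      // --reflink=auto is from the plan; -R is an assumption.
      return ["cp", "-R", "--reflink=auto", src, dest];
    default:
      // The plan only names rsync; -a is an assumption.
      return ["rsync", "-a", src, dest];
  }
}
```

Returning an argument array rather than a shell string avoids quoting problems when paths contain spaces.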
## 2. Phased Roadmap
Phase | Deliverables
:----:|:--------------------------------------------------------------------------------------
1 | XDG config + global `tasks.json` + basic `task list|add|worktree` CLI
2 | Merge check & conflict-resolver prompt + `merge check|resolve` commands
3 | Validator agent integration + `task validate|review`
4 | Project Manager planning & launching (`manager plan|launch|monitor`)
5 | Interactive `task add` QA loop + per-agent instruction overrides
6 | TUI mode for `status` + live dashboard
7 | Polishing docs, tests, packaging, and PyPI release
## 3. Open Questions & Design Decisions
1. **Global registry schema**
- What additional fields should `tasks.json` track? (e.g. priority, owner, labels)
2. **Config file format & schema**
- TOML vs YAML vs JSON for `config.toml`?
- Which policy keys to expose for Validator and Resolver agents?
3. **Per-agent instruction overrides**
- How to structure override files (`validator.toml`, `conflict-resolver.toml`, etc.)?
- Should we fall back to AGENTS-style instruction files in the repo root if present?
4. **CLI command names & flags**
- Confirm subcommand verbs (`merge resolve` vs `task rebase`, `task validate` vs `task lint`)
- Standardize flags for interactive vs non-interactive modes
5. **Conflict Resolver scope**
- Auto-resolve only trivial hunks, or attempt full rebase-based AI resolution?
- How and when can the agent “give up” and hand control back to the user?
6. **Validator policies & auto-fix**
- Default policy values (max line length, doc coverage %)
- Should `--auto-fix` let the agent rewrite code, or only report issues?
7. **Interactive Task Creation**
- Best UX for prompting the user: CLI Q&A loop vs opening an editor with agent instructions?
- How to capture dependencies and inject them into the new task stub?
8. **Session restore UX**
- Always on for `codex session <UUID>`, or opt-in via flag?
- How to surface restore failures or drift in transcript format?
9. **TUI implementation**
- Framework choice (curses, Rich, Textual)
- Auto-refresh interval and keybindings for actions (open worktree, resolve, validate)
10. **Packaging & distribution**
- Final PyPI package name (`codex-agents` vs `ai-orchestrator`)
- Versioning strategy and backwards-compatibility guarantees
---
_This plan will evolve as we answer these questions and move through the roadmap phases._

codex-rs/Cargo.lock (generated, 1539 changes)

File diff suppressed because it is too large.

Some files were not shown because too many files have changed in this diff.