Compare commits

32 Commits

Author SHA1 Message Date
Fouad Matin
f08cb68a62 - 2025-10-05 14:57:50 -07:00
Fouad Matin
357612da38 fix: ci 2025-10-04 15:01:48 -07:00
Fouad Matin
07442e4533 fix: ci 2025-10-04 13:12:54 -07:00
Fouad Matin
9f850f8bb6 address feedback 2025-10-04 09:35:06 -07:00
Fouad Matin
662bc2c9ab fix: clippy 2025-10-03 16:23:13 -07:00
Fouad Matin
81ec812bcc - 2025-10-03 15:57:21 -07:00
Fouad Matin
deebfb9d37 - 2025-10-03 15:53:04 -07:00
Fouad Matin
7d3cf212e1 fix: clippy 2025-10-03 13:46:25 -07:00
Fouad Matin
ec9bf6f53e - 2025-10-03 13:37:05 -07:00
Fouad Matin
2c668fa4a1 add [admin] config 2025-10-03 13:07:23 -07:00
Fouad Matin
a5b7675e42 add(core): managed config (#3868)
## Summary

- Factor `load_config_as_toml` into `core::config_loader` so config
loading is reusable across callers.
- Layer `~/.codex/config.toml`, optional `~/.codex/managed_config.toml`,
and macOS managed preferences (base64) with recursive table merging and
scoped threads per source.

## Config Flow

```
Managed prefs (macOS profile: com.openai.codex/config_toml_base64)
                               ▲
                               │
~/.codex/managed_config.toml   │  (optional file-based override)
                               ▲
                               │
                ~/.codex/config.toml (user-defined settings)
```

- The loader searches under the resolved `CODEX_HOME` directory
(defaults to `~/.codex`).
- Managed configs let administrators ship fleet-wide overrides via
device profiles, which is useful for enforcing settings such as
sandbox or approval defaults.
- For nested hash tables: overlays merge recursively. Child tables are
merged key-by-key, while scalar or array values replace the prior layer
entirely. This lets admins add or tweak individual fields without
clobbering unrelated user settings.
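
The merge rule above can be sketched as follows. This is a minimal illustration, not the loader's actual code: the `Value` enum is a simplified stand-in for a TOML value, and `merge_layer` and `table` are hypothetical names introduced here.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a TOML value (assumption: the real loader
/// operates on parsed TOML, not on this enum).
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Str(String),
    Bool(bool),
    Array(Vec<Value>),
    Table(BTreeMap<String, Value>),
}

/// Hypothetical helper: overlay a higher-priority layer onto `base`.
/// Tables merge key by key; scalars and arrays replace the old value.
fn merge_layer(base: &mut Value, overlay: Value) {
    match (base, overlay) {
        (Value::Table(base_table), Value::Table(overlay_table)) => {
            for (key, overlay_value) in overlay_table {
                if let Some(base_value) = base_table.get_mut(&key) {
                    // Key exists in both layers: recurse so nested tables merge.
                    merge_layer(base_value, overlay_value);
                } else {
                    // Key only exists in the overlay: add it.
                    base_table.insert(key, overlay_value);
                }
            }
        }
        // Anything that is not table-on-table replaces the prior layer entirely.
        (base_slot, overlay_value) => *base_slot = overlay_value,
    }
}

/// Hypothetical convenience constructor for building tables in examples.
fn table(entries: Vec<(&str, Value)>) -> Value {
    Value::Table(
        entries
            .into_iter()
            .map(|(key, value)| (key.to_string(), value))
            .collect(),
    )
}
```

Applying the layers in priority order (user config first, then the managed file, then managed preferences) would leave any user key that no managed layer mentions untouched.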
2025-10-03 13:02:26 -07:00
Michael Bolin
9823de3cc6 fix: run Prettier in CI (#4681)
This was supposed to be in https://github.com/openai/codex/pull/4645.
2025-10-03 19:10:27 +00:00
Michael Bolin
c32e9cfe86 chore: subject docs/*.md to Prettier checks (#4645)
Apparently we were not running our `pnpm run prettier` check in CI, so
many files that were covered by the existing Prettier check were not
well-formatted.

This updates CI and formats the files.
2025-10-03 11:35:48 -07:00
Gabriel Peal
1d17ca1fa3 [MCP] Add support for MCP Oauth credentials (#4517)
This PR adds OAuth login support to streamable HTTP servers when
`experimental_use_rmcp_client` is enabled.

This PR is large but represents the minimal amount of work required for
this to work. To keep this PR smaller, login can only be done with
`codex mcp login` and `codex mcp logout` but it doesn't appear in `/mcp`
or `codex mcp list` yet. Fingers crossed that this is the last large MCP
PR and that subsequent PRs can be smaller.

Under the hood, credentials are stored using platform credential
managers using the [keyring crate](https://crates.io/crates/keyring).
When the keyring isn't available, it falls back to storing credentials
in `CODEX_HOME/.credentials.json` which is consistent with how other
coding agents handle authentication.
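
The fallback behavior can be sketched roughly as below. Everything here is hypothetical and for illustration only: the `CredentialStore` trait, the two store types, and `save_with_fallback` are not the PR's API, and no real keyring or file I/O is performed.

```rust
/// Hypothetical abstraction over "somewhere credentials can be saved".
trait CredentialStore {
    fn save(&mut self, server: &str, token: &str) -> Result<(), String>;
}

/// Stand-in for a platform keyring that is not available on this machine.
struct UnavailableKeyring;

impl CredentialStore for UnavailableKeyring {
    fn save(&mut self, _server: &str, _token: &str) -> Result<(), String> {
        Err("keyring unavailable".to_string())
    }
}

/// Stand-in for the file-based fallback; it keeps entries in memory
/// instead of writing a credentials file.
#[derive(Default)]
struct InMemoryFile {
    entries: Vec<(String, String)>,
}

impl CredentialStore for InMemoryFile {
    fn save(&mut self, server: &str, token: &str) -> Result<(), String> {
        self.entries.push((server.to_string(), token.to_string()));
        Ok(())
    }
}

/// Records which store ended up holding the credential.
#[derive(Debug, PartialEq)]
enum StoredIn {
    Primary,
    Fallback,
}

/// Try the primary store first; only if it fails, use the fallback.
fn save_with_fallback(
    primary: &mut dyn CredentialStore,
    fallback: &mut dyn CredentialStore,
    server: &str,
    token: &str,
) -> Result<StoredIn, String> {
    match primary.save(server, token) {
        Ok(()) => Ok(StoredIn::Primary),
        Err(_) => fallback.save(server, token).map(|()| StoredIn::Fallback),
    }
}
```

Returning which store was used makes it easy for a caller to tell the user where the credential ended up.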

I tested this on macOS, Windows, WSL (Ubuntu), and Linux. I wasn't able
to test the D-Bus store on Linux but did verify that the fallback works.

One quirk: during development, every build produces its own ad-hoc
binary, so if you have stored credentials the keyring won't recognize
the reader as being the same as the writer and may ask for the user's
password. I may add an override to disable this or allow
users/enterprises to opt out of the keyring storage if it causes issues.

<img width="5064" height="686" alt="CleanShot 2025-09-30 at 19 31 40"
src="https://github.com/user-attachments/assets/9573f9b4-07f1-4160-83b8-2920db287e2d"
/>
<img width="745" height="486" alt="image"
src="https://github.com/user-attachments/assets/9562649b-ea5f-4f22-ace2-d0cb438b143e"
/>
2025-10-03 13:43:12 -04:00
jif-oai
bfe3328129 Fix flaky test (#4672)
The test was flaky because the timeout was not always long enough to
produce enough characters for truncation, and because of a race between
the synthetic timeout and the process kill.
2025-10-03 18:09:41 +01:00
jif-oai
e0b38bd7a2 feat: add beta_supported_tools (#4669)
Gate the new read_file tool behind a new `beta_supported_tools` flag and
only enable it for `gpt-5-codex`
2025-10-03 16:58:03 +00:00
Michael Bolin
153338c20f docs: add barebones README for codex-app-server crate (#4671) 2025-10-03 09:26:44 -07:00
pakrym-oai
3495a7dc37 Modernize workflows (#4668)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
2025-10-03 09:25:29 -07:00
Michael Bolin
042d4d55d9 feat: codex exec writes only the final message to stdout (#4644)
This updates `codex exec` so that, by default, most of the agent's
activity is written to stderr so that only the final agent message is
written to stdout. This makes it easier to pipe `codex exec` into
another tool without extra filtering.
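
As a rough sketch of that split (not the actual `codex exec` code), the function below sends progress lines to one writer and the final message to another. The name `report` and its parameters are hypothetical; in a real binary the two writers would be `std::io::stdout()` and `std::io::stderr()`.

```rust
use std::io::{self, Write};

/// Hypothetical helper: progress goes to `err`, and only the final
/// message goes to `out`, so `out` is safe to pipe into another tool.
fn report<O: Write, E: Write>(
    out: &mut O,
    err: &mut E,
    progress: &[&str],
    final_message: &str,
) -> io::Result<()> {
    for line in progress {
        // Diagnostic output: visible in the terminal, invisible to a pipe on stdout.
        writeln!(err, "{line}")?;
    }
    // The only bytes that reach the stdout-like writer.
    writeln!(out, "{final_message}")
}
```

Taking the writers as parameters keeps the behavior testable with in-memory buffers.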

I introduced `#![deny(clippy::print_stdout)]` to help enforce this
change and renamed the `ts_println!()` macro to `ts_msg()` because (1)
it no longer calls `println!()` and (2) `ts_eprintln!()` seemed too
long a name.

While here, this also adds `-o` as an alias for `--output-last-message`.

Fixes https://github.com/openai/codex/issues/1670
2025-10-03 16:22:12 +00:00
pakrym-oai
5af08e0719 Update issue-deduplicator.yml (#4660) 2025-10-03 06:41:57 -07:00
jif-oai
33d3ecbccc chore: refactor tool handling (#4510)
# Tool System Refactor

- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.

Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.
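
To make the handler's output format concrete, here is a minimal sketch of the 1-indexed `offset`, `limit`, and `L#: ` prefix behavior described above. The function `read_lines` is a hypothetical name, it works on an in-memory string rather than a file, and it leaves out the truncation step.

```rust
/// Hypothetical helper: return up to `limit` lines starting at the
/// 1-indexed line `offset`, each prefixed with its line number as `L#: `.
fn read_lines(contents: &str, offset: usize, limit: usize) -> Result<Vec<String>, String> {
    if offset == 0 {
        return Err("offset must be 1 or greater".to_string());
    }
    Ok(contents
        .lines()
        .enumerate()
        .skip(offset - 1) // convert the 1-indexed offset to a 0-indexed skip count
        .take(limit)
        .map(|(index, line)| format!("L{}: {line}", index + 1))
        .collect())
}
```

An offset past the end of the input yields an empty list in this sketch; whether the real handler treats that as an error is not stated above.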

## Next steps:
- Decompose `handle_container_exec_with_params` 
- Add parallel tool calls
2025-10-03 13:21:06 +01:00
jif-oai
69cb72f842 chore: sandbox refactor 2 (#4653)
Revert the revert and fix the UI issue
2025-10-03 11:17:39 +01:00
Michael Bolin
69ac5153d4 fix: replace --api-key with --with-api-key in codex login (#4646)
Previously, users could supply their API key directly via:

```shell
codex login --api-key KEY
```

but this has the drawback that `KEY` is more likely to end up in shell
history, can be read from `/proc`, etc.

This PR removes support for `--api-key` and replaces it with
`--with-api-key`, which reads the key from stdin, so either of these is
a better option:

```
printenv OPENAI_API_KEY | codex login --with-api-key
codex login --with-api-key < my_key.txt
```

Other CLIs, such as `gh auth login --with-token`, follow the same
practice.
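
A minimal sketch of reading a secret from stdin rather than from argv, under the assumption that the key is the entire input with surrounding whitespace trimmed. `read_api_key` is a hypothetical name, not the function used in this PR.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read the whole input, trim whitespace (including
/// the trailing newline that `printenv` or a text file adds), and reject
/// empty input.
fn read_api_key<R: Read>(mut input: R) -> io::Result<String> {
    let mut buffer = String::new();
    input.read_to_string(&mut buffer)?;
    let key = buffer.trim();
    if key.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "no API key provided on stdin",
        ));
    }
    Ok(key.to_string())
}
```

A caller would pass `std::io::stdin().lock()`; because the key never appears in the argument list, it stays out of shell history and process listings.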
2025-10-03 06:17:31 +00:00
dedrisian-oai
16b6951648 Nit: Pop model effort picker on esc (#4642)
Pops the effort picker instead of dismissing the whole thing (on
escape).



https://github.com/user-attachments/assets/cef32291-cd07-4ac7-be8f-ce62b38145f9
2025-10-02 21:07:47 -07:00
dedrisian-oai
231c36f8d3 Move gpt-5-codex to top (#4641)
In /model picker
2025-10-03 03:34:58 +00:00
dedrisian-oai
1e4541b982 Fix tab+enter regression on slash commands (#4639)
Previously, when you typed `/di`, pressed tab to complete it to `/diff`,
and then hit enter, it would execute `/diff`. Now it just sends it as
text. This fixes the regression.
2025-10-02 20:14:28 -07:00
Shijie Rao
7be3b484ad feat: add file name to fuzzy search response (#4619)
### Summary
* Updated fuzzy search result to include the file name. 
* This should not affect CLI usage and the UI there will be addressed in
a separate PR.

### Testing
Tested locally and with the extension.

### Screenshot
<img width="431" height="244" alt="Screenshot 2025-10-02 at 11 08 44 AM"
src="https://github.com/user-attachments/assets/ba2ca299-a81d-4453-9242-1750e945aea2"
/>

---------

Co-authored-by: shijie.rao <shijie.rao@squareup.com>
2025-10-02 18:19:13 -07:00
Jeremy Rose
9617b69c8a tui: • Working, 100% context dim (#4629)
- add a `•` before the "Working" shimmer
- make the percentage in "X% context left" dim instead of bold

<img width="751" height="480" alt="Screenshot 2025-10-02 at 2 29 57 PM"
src="https://github.com/user-attachments/assets/cf3e771f-ddb3-48f4-babe-1eaf1f0c2959"
/>
2025-10-03 01:17:34 +00:00
pakrym-oai
1d94b9111c Use supports_color in codex exec (#4633)
It knows how to detect github actions
2025-10-03 01:15:03 +00:00
pakrym-oai
2d6cd6951a Enable codex workflows (#4636) 2025-10-02 17:37:22 -07:00
pakrym-oai
310e3c32e5 Update issue-deduplicator.yml (#4638)
let's test codex_args flag
2025-10-02 17:19:00 -07:00
Michael Bolin
37786593a0 feat: write pid in addition to port to server info (#4571)
This is nice to have for debugging.

While here, also cleaned up a bunch of unnecessary noise in
`write_server_info()`.
2025-10-02 17:15:09 -07:00
134 changed files with 11503 additions and 3285 deletions

View File

@@ -60,3 +60,6 @@ jobs:
run: ./scripts/asciicheck.py codex-cli/README.md
- name: Check codex-cli/README ToC
run: python3 scripts/readme_toc.py codex-cli/README.md
- name: Prettier (run `pnpm run format:fix` to fix)
run: pnpm run format

View File

@@ -3,7 +3,7 @@ name: Issue Deduplicator
on:
issues:
types:
# - opened - disabled while testing
- opened
- labeled
jobs:
@@ -45,9 +45,35 @@ jobs:
uses: openai/codex-action@main
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
prompt_file: .github/prompts/issue-deduplicator.txt
require_repo_write: false
codex_version: 0.43.0-alpha.16
model: gpt-5
prompt: |
You are an assistant that triages new GitHub issues by identifying potential duplicates.
You will receive the following JSON files located in the current working directory:
- `codex-current-issue.json`: JSON object describing the newly created issue (fields: number, title, body).
- `codex-existing-issues.json`: JSON array of recent issues (each element includes number, title, body, createdAt).
Instructions:
- Load both files as JSON and review their contents carefully. The codex-existing-issues.json file is large, ensure you explore all of it.
- Compare the current issue against the existing issues to find up to five that appear to describe the same underlying problem or request.
- When unsure, prefer returning fewer matches.
- Include at most five numbers.
output_schema: |
{
"type": "object",
"properties": {
"issues": {
"type": "array",
"items": {
"type": "string"
}
}
},
"required": ["issues"],
"additionalProperties": false
}
comment-on-issue:
name: Comment with potential duplicates
@@ -65,20 +91,27 @@ jobs:
with:
github-token: ${{ github.token }}
script: |
let numbers;
const raw = process.env.CODEX_OUTPUT ?? '';
let parsed;
try {
numbers = JSON.parse(process.env.CODEX_OUTPUT);
parsed = JSON.parse(raw);
} catch (error) {
core.info(`Codex output was not valid JSON. Raw output: ${raw}`);
core.info(`Parse error: ${error.message}`);
return;
}
if (numbers.length === 0) {
const issues = Array.isArray(parsed?.issues) ? parsed.issues : [];
if (issues.length === 0) {
core.info('Codex reported no potential duplicates.');
return;
}
const lines = ['Potential duplicates detected:', ...numbers.map((value) => `- #${value}`)];
const lines = [
'Potential duplicates detected:',
...issues.map((value) => `- #${String(value)}`),
'',
'*Powered by [Codex Action](https://github.com/openai/codex-action)*'];
await github.rest.issues.createComment({
owner: context.repo.owner,

View File

@@ -3,7 +3,7 @@ name: Issue Labeler
on:
issues:
types:
# - opened - disabled while testing
- opened
- labeled
jobs:
@@ -13,11 +13,6 @@ jobs:
runs-on: ubuntu-latest
permissions:
contents: read
env:
ISSUE_NUMBER: ${{ github.event.issue.number }}
ISSUE_TITLE: ${{ github.event.issue.title }}
ISSUE_BODY: ${{ github.event.issue.body }}
REPO_FULL_NAME: ${{ github.repository }}
outputs:
codex_output: ${{ steps.codex.outputs.final_message }}
steps:
@@ -27,9 +22,51 @@ jobs:
uses: openai/codex-action@main
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
prompt_file: .github/prompts/issue-labeler.txt
require_repo_write: false
codex_version: 0.43.0-alpha.16
prompt: |
You are an assistant that reviews GitHub issues for the repository.
Your job is to choose the most appropriate existing labels for the issue described later in this prompt.
Follow these rules:
- Only pick labels out of the list below.
- Prefer a small set of precise labels over many broad ones.
Labels to apply:
1. bug — Reproducible defects in Codex products (CLI, VS Code extension, web, auth).
2. enhancement — Feature requests or usability improvements that ask for new capabilities, better ergonomics, or quality-of-life tweaks.
3. extension — VS Code (or other IDE) extension-specific issues.
4. windows-os — Bugs or friction specific to Windows environments (always when PowerShell is mentioned, path handling, copy/paste, OS-specific auth or tooling failures).
5. mcp — Topics involving Model Context Protocol servers/clients.
6. codex-web — Issues targeting the Codex web UI/Cloud experience.
8. azure — Problems or requests tied to Azure OpenAI deployments.
9. documentation — Updates or corrections needed in docs/README/config references (broken links, missing examples, outdated keys, clarification requests).
10. model-behavior — Undesirable LLM behavior: forgetting goals, refusing work, hallucinating environment details, quota misreports, or other reasoning/performance anomalies.
Issue number: ${{ github.event.issue.number }}
Issue title:
${{ github.event.issue.title }}
Issue body:
${{ github.event.issue.body }}
Repository full name:
${{ github.repository }}
output_schema: |
{
"type": "object",
"properties": {
"labels": {
"type": "array",
"items": {
"type": "string"
}
}
},
"required": ["labels"],
"additionalProperties": false
}
apply-labels:
name: Apply labels from Codex output
@@ -53,12 +90,12 @@ jobs:
exit 0
fi
if ! printf '%s' "$json" | jq -e 'type == "array"' >/dev/null 2>&1; then
echo "Codex output was not a JSON array. Raw output: $json"
if ! printf '%s' "$json" | jq -e 'type == "object" and (.labels | type == "array")' >/dev/null 2>&1; then
echo "Codex output did not include a labels array. Raw output: $json"
exit 0
fi
labels=$(printf '%s' "$json" | jq -r '.[] | tostring')
labels=$(printf '%s' "$json" | jq -r '.labels[] | tostring')
if [ -z "$labels" ]; then
echo "Codex returned an empty array. Nothing to do."
exit 0

View File

@@ -8,11 +8,16 @@ In the codex-rs folder where the rust code lives:
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` or `CODEX_SANDBOX_ENV_VAR`.
- You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
- Similarly, when you spawn a process using Seatbelt (`/usr/bin/sandbox-exec`), `CODEX_SANDBOX=seatbelt` will be set on the child process. Integration tests that want to run Seatbelt themselves cannot be run under Seatbelt, so checks for `CODEX_SANDBOX=seatbelt` are also often used to early exit out of tests, as appropriate.
- Always collapse if statements per https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_if
- Always inline format! args when possible per https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
- Use method references over closures when possible per https://rust-lang.github.io/rust-clippy/master/index.html#redundant_closure_for_method_calls
- When writing tests, prefer comparing the equality of entire objects over fields one by one.
Run `just fmt` (in `codex-rs` directory) automatically after making Rust code changes; do not ask for approval to run it. Before finalizing a change to `codex-rs`, run `just fix -p <project>` (in `codex-rs` directory) to fix any linter issues in the code. Prefer scoping with `-p` to avoid slow workspacewide Clippy builds; only run `just fix` without `-p` if you changed shared crates. Additionally, run the tests:
1. Run the test for the specific project that was changed. For example, if changes were made in `codex-rs/tui`, run `cargo test -p codex-tui`.
2. Once those pass, if any changes were made in common, core, or protocol, run the complete test suite with `cargo test --all-features`.
When running interactively, ask the user before running `just fix` to finalize. `just fmt` does not require approval. project-specific or individual tests can be run without asking the user, but do ask the user before running the complete test suite.
When running interactively, ask the user before running `just fix` to finalize. `just fmt` does not require approval. project-specific or individual tests can be run without asking the user, but do ask the user before running the complete test suite.
## TUI style conventions
@@ -28,6 +33,7 @@ See `codex-rs/tui/styles.md`.
- Desired: vec![" └ ".into(), "M".red(), " ".dim(), "tui/src/app.rs".dim()]
### TUI Styling (ratatui)
- Prefer Stylize helpers: use "text".dim(), .bold(), .cyan(), .italic(), .underlined() instead of manual Style where possible.
- Prefer simple conversions: use "text".into() for spans and vec![…].into() for lines; when inference is ambiguous (e.g., Paragraph::new/Cell::from), use Line::from(spans) or Span::from(text).
- Computed styles: if the Style is computed at runtime, using `Span::styled` is OK (`Span::from(text).set_style(style)` is also acceptable).
@@ -39,6 +45,7 @@ See `codex-rs/tui/styles.md`.
- Compactness: prefer the form that stays on one line after rustfmt; if only one of Line::from(vec![…]) or vec![…].into() avoids wrapping, choose that. If both wrap, pick the one with fewer wrapped lines.
### Text wrapping
- Always use textwrap::wrap to wrap plain strings.
- If you have a ratatui Line and you want to wrap it, use the helpers in tui/src/wrapping.rs, e.g. word_wrap_lines / word_wrap_line.
- If you need to indent wrapped lines, use the initial_indent / subsequent_indent options from RtOptions if you can, rather than writing custom logic.
@@ -60,6 +67,7 @@ This repo uses snapshot tests (via `insta`), especially in `codex-rs/tui`, to va
- `cargo insta accept -p codex-tui`
If you dont have the tool:
- `cargo install cargo-insta`
### Test assertions

View File

@@ -1,4 +1,3 @@
<p align="center"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>
<p align="center"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.
@@ -64,7 +63,6 @@ You can also use Codex with an API key, but this requires [additional setup](./d
Codex CLI supports [MCP servers](./docs/advanced.md#model-context-protocol-mcp). Enable by adding an `mcp_servers` section to your `~/.codex/config.toml`.
### Configuration
Codex CLI supports a rich set of configuration options, with preferences stored in `~/.codex/config.toml`. For full configuration options, see [Configuration](./docs/config.md).

1424
codex-rs/Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -32,6 +32,7 @@ members = [
"git-apply",
"utils/json-to-toml",
"utils/readiness",
"utils/string",
]
resolver = "2"
@@ -71,6 +72,7 @@ codex-rmcp-client = { path = "rmcp-client" }
codex-tui = { path = "tui" }
codex-utils-json-to-toml = { path = "utils/json-to-toml" }
codex-utils-readiness = { path = "utils/readiness" }
codex-utils-string = { path = "utils/string" }
core_test_support = { path = "core/tests/common" }
mcp-types = { path = "mcp-types" }
mcp_test_support = { path = "mcp-server/tests/common" }
@@ -85,6 +87,7 @@ assert_cmd = "2"
async-channel = "2.3.1"
async-stream = "0.3.6"
async-trait = "0.1.89"
axum = { version = "0.8", default-features = false }
base64 = "0.22.1"
bytes = "1.10.1"
chrono = "0.4.42"
@@ -102,7 +105,8 @@ env-flags = "0.1.1"
env_logger = "0.11.5"
escargot = "0.5"
eventsource-stream = "0.2.3"
futures = "0.3"
futures = { version = "0.3", default-features = false }
fd-lock = "4.0.4"
icu_decimal = "2.0.0"
icu_locale_core = "2.0.0"
ignore = "0.4.23"
@@ -110,6 +114,7 @@ image = { version = "^0.25.8", default-features = false }
indexmap = "2.6.0"
insta = "1.43.2"
itertools = "0.14.0"
keyring = "3.6"
landlock = "0.4.1"
lazy_static = "1"
libc = "0.2.175"
@@ -138,13 +143,16 @@ rand = "0.9"
ratatui = "0.29.0"
regex-lite = "0.1.7"
reqwest = "0.12"
rmcp = { version = "0.7.0", default-features = false }
schemars = "0.8.22"
seccompiler = "0.5.0"
serde = "1"
serde_json = "1"
serde_with = "3.14"
serial_test = "3.2.0"
sha1 = "0.10.6"
sha2 = "0.10"
shellexpand = "3.1.0"
shlex = "1.3.0"
similar = "2.7.0"
starlark = "0.13.0"

View File

@@ -725,6 +725,7 @@ pub struct FuzzyFileSearchParams {
pub struct FuzzyFileSearchResult {
pub root: String,
pub path: String,
pub file_name: String,
pub score: u32,
#[serde(skip_serializing_if = "Option::is_none")]
pub indices: Option<Vec<u32>>,

View File

@@ -0,0 +1,15 @@
# codex-app-server
`codex app-server` is the harness Codex uses to power rich interfaces such as the [Codex VS Code extension](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt). The message schema is currently unstable, but those who wish to build experimental UIs on top of Codex may find it valuable.
## Protocol
Similar to [MCP](https://modelcontextprotocol.io/), `codex app-server` supports bidirectional communication, streaming JSONL over stdio. The protocol is JSON-RPC 2.0, though the `"jsonrpc":"2.0"` header is omitted.
## Message Schema
Currently, you can dump a TypeScript version of the schema using `codex generate-ts`. It is specific to the version of Codex you used to run `generate-ts`, so the two are guaranteed to be compatible.
```
codex generate-ts --out DIR
```

View File

@@ -500,7 +500,7 @@ impl CodexMessageProcessor {
}
async fn get_user_saved_config(&self, request_id: RequestId) {
let toml_value = match load_config_as_toml(&self.config.codex_home) {
let toml_value = match load_config_as_toml(&self.config.codex_home).await {
Ok(val) => val,
Err(err) => {
let error = JSONRPCErrorError {
@@ -653,18 +653,19 @@ impl CodexMessageProcessor {
}
async fn process_new_conversation(&self, request_id: RequestId, params: NewConversationParams) {
let config = match derive_config_from_params(params, self.codex_linux_sandbox_exe.clone()) {
Ok(config) => config,
Err(err) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("error deriving config: {err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
};
let config =
match derive_config_from_params(params, self.codex_linux_sandbox_exe.clone()).await {
Ok(config) => config,
Err(err) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("error deriving config: {err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
};
match self.conversation_manager.new_conversation(config).await {
Ok(conversation_id) => {
@@ -752,7 +753,7 @@ impl CodexMessageProcessor {
// Derive a Config using the same logic as new conversation, honoring overrides if provided.
let config = match params.overrides {
Some(overrides) => {
derive_config_from_params(overrides, self.codex_linux_sandbox_exe.clone())
derive_config_from_params(overrides, self.codex_linux_sandbox_exe.clone()).await
}
None => Ok(self.config.as_ref().clone()),
};
@@ -1320,7 +1321,7 @@ async fn apply_bespoke_event_handling(
}
}
fn derive_config_from_params(
async fn derive_config_from_params(
params: NewConversationParams,
codex_linux_sandbox_exe: Option<PathBuf>,
) -> std::io::Result<Config> {
@@ -1358,7 +1359,7 @@ fn derive_config_from_params(
.map(|(k, v)| (k, json_to_toml(v)))
.collect();
Config::load_with_cli_overrides(cli_overrides, overrides)
Config::load_with_cli_overrides(cli_overrides, overrides).await
}
async fn on_patch_approval_response(

View File

@@ -1,5 +1,6 @@
use std::num::NonZero;
use std::num::NonZeroUsize;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use std::sync::atomic::AtomicBool;
@@ -56,9 +57,16 @@ pub(crate) async fn run_fuzzy_file_search(
match res {
Ok(Ok((root, res))) => {
for m in res.matches {
let path = m.path;
//TODO(shijie): Move file name generation to file_search lib.
let file_name = Path::new(&path)
.file_name()
.map(|name| name.to_string_lossy().into_owned())
.unwrap_or_else(|| path.clone());
let result = FuzzyFileSearchResult {
root: root.clone(),
path: m.path,
path,
file_name,
score: m.score,
indices: m.indices,
};

View File

@@ -81,6 +81,7 @@ pub async fn run_main(
)
})?;
let config = Config::load_with_cli_overrides(cli_kv_overrides, ConfigOverrides::default())
.await
.map_err(|e| {
std::io::Error::new(ErrorKind::InvalidData, format!("error loading config: {e}"))
})?;

View File

@@ -1,3 +1,4 @@
use std::collections::VecDeque;
use std::path::Path;
use std::process::Stdio;
use std::sync::atomic::AtomicI64;
@@ -47,6 +48,7 @@ pub struct McpProcess {
process: Child,
stdin: ChildStdin,
stdout: BufReader<ChildStdout>,
pending_user_messages: VecDeque<JSONRPCNotification>,
}
impl McpProcess {
@@ -117,6 +119,7 @@ impl McpProcess {
process,
stdin,
stdout,
pending_user_messages: VecDeque::new(),
})
}
@@ -375,8 +378,9 @@ impl McpProcess {
let message = self.read_jsonrpc_message().await?;
match message {
JSONRPCMessage::Notification(_) => {
eprintln!("notification: {message:?}");
JSONRPCMessage::Notification(notification) => {
eprintln!("notification: {notification:?}");
self.enqueue_user_message(notification);
}
JSONRPCMessage::Request(jsonrpc_request) => {
return jsonrpc_request.try_into().with_context(
@@ -402,8 +406,9 @@ impl McpProcess {
loop {
let message = self.read_jsonrpc_message().await?;
match message {
JSONRPCMessage::Notification(_) => {
eprintln!("notification: {message:?}");
JSONRPCMessage::Notification(notification) => {
eprintln!("notification: {notification:?}");
self.enqueue_user_message(notification);
}
JSONRPCMessage::Request(_) => {
anyhow::bail!("unexpected JSONRPCMessage::Request: {message:?}");
@@ -427,8 +432,9 @@ impl McpProcess {
loop {
let message = self.read_jsonrpc_message().await?;
match message {
JSONRPCMessage::Notification(_) => {
eprintln!("notification: {message:?}");
JSONRPCMessage::Notification(notification) => {
eprintln!("notification: {notification:?}");
self.enqueue_user_message(notification);
}
JSONRPCMessage::Request(_) => {
anyhow::bail!("unexpected JSONRPCMessage::Request: {message:?}");
@@ -451,6 +457,10 @@ impl McpProcess {
) -> anyhow::Result<JSONRPCNotification> {
eprintln!("in read_stream_until_notification_message({method})");
if let Some(notification) = self.take_pending_notification_by_method(method) {
return Ok(notification);
}
loop {
let message = self.read_jsonrpc_message().await?;
match message {
@@ -458,6 +468,7 @@ impl McpProcess {
if notification.method == method {
return Ok(notification);
}
self.enqueue_user_message(notification);
}
JSONRPCMessage::Request(_) => {
anyhow::bail!("unexpected JSONRPCMessage::Request: {message:?}");
@@ -471,4 +482,21 @@ impl McpProcess {
}
}
}
fn take_pending_notification_by_method(&mut self, method: &str) -> Option<JSONRPCNotification> {
if let Some(pos) = self
.pending_user_messages
.iter()
.position(|notification| notification.method == method)
{
return self.pending_user_messages.remove(pos);
}
None
}
fn enqueue_user_message(&mut self, notification: JSONRPCNotification) {
if notification.method == "codex/event/user_message" {
self.pending_user_messages.push_back(notification);
}
}
}

View File

@@ -8,6 +8,7 @@ use app_test_support::to_response;
use codex_app_server_protocol::AddConversationListenerParams;
use codex_app_server_protocol::AddConversationSubscriptionResponse;
use codex_app_server_protocol::ExecCommandApprovalParams;
use codex_app_server_protocol::InputItem;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::NewConversationParams;
@@ -25,6 +26,10 @@ use codex_core::protocol::SandboxPolicy;
use codex_core::protocol_config_types::ReasoningEffort;
use codex_core::protocol_config_types::ReasoningSummary;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::InputMessageKind;
use pretty_assertions::assert_eq;
use std::env;
use tempfile::TempDir;
@@ -367,6 +372,234 @@ async fn test_send_user_turn_changes_approval_policy_behavior() {
}
// Helper: minimal config.toml pointing at mock provider.
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn test_send_user_turn_updates_sandbox_and_cwd_between_turns() {
if env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
let tmp = TempDir::new().expect("tmp dir");
let codex_home = tmp.path().join("codex_home");
std::fs::create_dir(&codex_home).expect("create codex home dir");
let workspace_root = tmp.path().join("workspace");
std::fs::create_dir(&workspace_root).expect("create workspace root");
let first_cwd = workspace_root.join("turn1");
let second_cwd = workspace_root.join("turn2");
std::fs::create_dir(&first_cwd).expect("create first cwd");
std::fs::create_dir(&second_cwd).expect("create second cwd");
let responses = vec![
create_shell_sse_response(
vec![
"bash".to_string(),
"-lc".to_string(),
"echo first turn".to_string(),
],
None,
Some(5000),
"call-first",
)
.expect("create first shell response"),
create_final_assistant_message_sse_response("done first")
.expect("create first final assistant message"),
create_shell_sse_response(
vec![
"bash".to_string(),
"-lc".to_string(),
"echo second turn".to_string(),
],
None,
Some(5000),
"call-second",
)
.expect("create second shell response"),
create_final_assistant_message_sse_response("done second")
.expect("create second final assistant message"),
];
let server = create_mock_chat_completions_server(responses).await;
create_config_toml(&codex_home, &server.uri()).expect("write config");
let mut mcp = McpProcess::new(&codex_home)
.await
.expect("spawn mcp process");
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.expect("init timeout")
.expect("init failed");
let new_conv_id = mcp
.send_new_conversation_request(NewConversationParams {
cwd: Some(first_cwd.to_string_lossy().into_owned()),
approval_policy: Some(AskForApproval::Never),
sandbox: Some(SandboxMode::WorkspaceWrite),
..Default::default()
})
.await
.expect("send newConversation");
let new_conv_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(new_conv_id)),
)
.await
.expect("newConversation timeout")
.expect("newConversation resp");
let NewConversationResponse {
conversation_id,
model,
..
} = to_response::<NewConversationResponse>(new_conv_resp)
.expect("deserialize newConversation response");
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.await
.expect("send addConversationListener");
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(add_listener_id)),
)
.await
.expect("addConversationListener timeout")
.expect("addConversationListener resp");
let first_turn_id = mcp
.send_send_user_turn_request(SendUserTurnParams {
conversation_id,
items: vec![InputItem::Text {
text: "first turn".to_string(),
}],
cwd: first_cwd.clone(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::WorkspaceWrite {
writable_roots: vec![first_cwd.clone()],
network_access: false,
exclude_tmpdir_env_var: false,
exclude_slash_tmp: false,
},
model: model.clone(),
effort: Some(ReasoningEffort::Medium),
summary: ReasoningSummary::Auto,
})
.await
.expect("send first sendUserTurn");
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(first_turn_id)),
)
.await
.expect("sendUserTurn 1 timeout")
.expect("sendUserTurn 1 resp");
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await
.expect("task_complete 1 timeout")
.expect("task_complete 1 notification");
let second_turn_id = mcp
.send_send_user_turn_request(SendUserTurnParams {
conversation_id,
items: vec![InputItem::Text {
text: "second turn".to_string(),
}],
cwd: second_cwd.clone(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: model.clone(),
effort: Some(ReasoningEffort::Medium),
summary: ReasoningSummary::Auto,
})
.await
.expect("send second sendUserTurn");
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(second_turn_id)),
)
.await
.expect("sendUserTurn 2 timeout")
.expect("sendUserTurn 2 resp");
let mut env_message: Option<String> = None;
let second_cwd_str = second_cwd.to_string_lossy().into_owned();
for _ in 0..10 {
let notification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/user_message"),
)
.await
.expect("user_message timeout")
.expect("user_message notification");
let params = notification
.params
.clone()
.expect("user_message should include params");
let event: Event = serde_json::from_value(params).expect("deserialize user_message event");
if let EventMsg::UserMessage(user) = event.msg
&& matches!(user.kind, Some(InputMessageKind::EnvironmentContext))
&& user.message.contains(&second_cwd_str)
{
env_message = Some(user.message);
break;
}
}
let env_message = env_message.expect("expected environment context update");
assert!(
env_message.contains("<sandbox_mode>danger-full-access</sandbox_mode>"),
"env context should reflect new sandbox mode: {env_message}"
);
assert!(
env_message.contains("<network_access>enabled</network_access>"),
"env context should enable network access for danger-full-access policy: {env_message}"
);
assert!(
env_message.contains(&second_cwd_str),
"env context should include updated cwd: {env_message}"
);
let exec_begin_notification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/exec_command_begin"),
)
.await
.expect("exec_command_begin timeout")
.expect("exec_command_begin notification");
let params = exec_begin_notification
.params
.clone()
.expect("exec_command_begin params");
let event: Event = serde_json::from_value(params).expect("deserialize exec begin event");
let exec_begin = match event.msg {
EventMsg::ExecCommandBegin(exec_begin) => exec_begin,
other => panic!("expected ExecCommandBegin event, got {other:?}"),
};
assert_eq!(
exec_begin.cwd, second_cwd,
"exec turn should run from updated cwd"
);
assert_eq!(
exec_begin.command,
vec![
"bash".to_string(),
"-lc".to_string(),
"echo second turn".to_string()
],
"exec turn should run expected command"
);
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await
.expect("task_complete 2 timeout")
.expect("task_complete 2 notification");
}
fn create_config_toml(codex_home: &Path, server_uri: &str) -> std::io::Result<()> {
let config_toml = codex_home.join("config.toml");
std::fs::write(

View File

@@ -1,3 +1,5 @@
use anyhow::Context;
use anyhow::Result;
use app_test_support::McpProcess;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
@@ -9,30 +11,41 @@ use tokio::time::timeout;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn test_fuzzy_file_search_sorts_and_includes_indices() {
async fn test_fuzzy_file_search_sorts_and_includes_indices() -> Result<()> {
// Prepare a temporary Codex home and a separate root with test files.
let codex_home = TempDir::new().expect("create temp codex home");
let root = TempDir::new().expect("create temp search root");
let codex_home = TempDir::new().context("create temp codex home")?;
let root = TempDir::new().context("create temp search root")?;
// Create files designed to have deterministic ordering for query "abc".
std::fs::write(root.path().join("abc"), "x").expect("write file abc");
std::fs::write(root.path().join("abcde"), "x").expect("write file abcx");
std::fs::write(root.path().join("abexy"), "x").expect("write file abcx");
std::fs::write(root.path().join("zzz.txt"), "x").expect("write file zzz");
// Create files designed to have deterministic ordering for query "abe".
std::fs::write(root.path().join("abc"), "x").context("write file abc")?;
std::fs::write(root.path().join("abcde"), "x").context("write file abcde")?;
std::fs::write(root.path().join("abexy"), "x").context("write file abexy")?;
std::fs::write(root.path().join("zzz.txt"), "x").context("write file zzz")?;
let sub_dir = root.path().join("sub");
std::fs::create_dir_all(&sub_dir).context("create sub dir")?;
let sub_abce_path = sub_dir.join("abce");
std::fs::write(&sub_abce_path, "x").context("write file sub/abce")?;
let sub_abce_rel = sub_abce_path
.strip_prefix(root.path())
.context("strip root prefix from sub/abce")?
.to_string_lossy()
.to_string();
// Start MCP server and initialize.
let mut mcp = McpProcess::new(codex_home.path()).await.expect("spawn mcp");
let mut mcp = McpProcess::new(codex_home.path())
.await
.context("spawn mcp")?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.expect("init timeout")
.expect("init failed");
.context("init timeout")?
.context("init failed")?;
let root_path = root.path().to_string_lossy().to_string();
// Send fuzzyFileSearch request.
let request_id = mcp
.send_fuzzy_file_search_request("abe", vec![root_path.clone()], None)
.await
.expect("send fuzzyFileSearch");
.context("send fuzzyFileSearch")?;
// Read response and verify shape and ordering.
let resp: JSONRPCResponse = timeout(
@@ -40,39 +53,65 @@ async fn test_fuzzy_file_search_sorts_and_includes_indices() {
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await
.expect("fuzzyFileSearch timeout")
.expect("fuzzyFileSearch resp");
.context("fuzzyFileSearch timeout")?
.context("fuzzyFileSearch resp")?;
let value = resp.result;
// The path separator on Windows affects the score.
let expected_score = if cfg!(windows) { 69 } else { 72 };
assert_eq!(
value,
json!({
"files": [
{ "root": root_path.clone(), "path": "abexy", "score": 88, "indices": [0, 1, 2] },
{ "root": root_path.clone(), "path": "abcde", "score": 74, "indices": [0, 1, 4] },
{
"root": root_path.clone(),
"path": "abexy",
"file_name": "abexy",
"score": 88,
"indices": [0, 1, 2],
},
{
"root": root_path.clone(),
"path": "abcde",
"file_name": "abcde",
"score": 74,
"indices": [0, 1, 4],
},
{
"root": root_path.clone(),
"path": sub_abce_rel,
"file_name": "abce",
"score": expected_score,
"indices": [4, 5, 7],
},
]
})
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn test_fuzzy_file_search_accepts_cancellation_token() {
let codex_home = TempDir::new().expect("create temp codex home");
let root = TempDir::new().expect("create temp search root");
async fn test_fuzzy_file_search_accepts_cancellation_token() -> Result<()> {
let codex_home = TempDir::new().context("create temp codex home")?;
let root = TempDir::new().context("create temp search root")?;
std::fs::write(root.path().join("alpha.txt"), "contents").expect("write alpha");
std::fs::write(root.path().join("alpha.txt"), "contents").context("write alpha")?;
let mut mcp = McpProcess::new(codex_home.path()).await.expect("spawn mcp");
let mut mcp = McpProcess::new(codex_home.path())
.await
.context("spawn mcp")?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.expect("init timeout")
.expect("init failed");
.context("init timeout")?
.context("init failed")?;
let root_path = root.path().to_string_lossy().to_string();
let request_id = mcp
.send_fuzzy_file_search_request("alp", vec![root_path.clone()], None)
.await
.expect("send fuzzyFileSearch");
.context("send fuzzyFileSearch")?;
let request_id_2 = mcp
.send_fuzzy_file_search_request(
@@ -81,24 +120,27 @@ async fn test_fuzzy_file_search_accepts_cancellation_token() {
Some(request_id.to_string()),
)
.await
.expect("send fuzzyFileSearch");
.context("send fuzzyFileSearch")?;
let resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id_2)),
)
.await
.expect("fuzzyFileSearch timeout")
.expect("fuzzyFileSearch resp");
.context("fuzzyFileSearch timeout")?
.context("fuzzyFileSearch resp")?;
let files = resp
.result
.get("files")
.and_then(|value| value.as_array())
.cloned()
.expect("files array");
.context("files key missing")?
.as_array()
.context("files not array")?
.clone();
assert_eq!(files.len(), 1);
assert_eq!(files[0]["root"], root_path);
assert_eq!(files[0]["path"], "alpha.txt");
Ok(())
}

View File

@@ -29,7 +29,8 @@ pub async fn run_apply_command(
.parse_overrides()
.map_err(anyhow::Error::msg)?,
ConfigOverrides::default(),
)?;
)
.await?;
init_chatgpt_token_from_auth(&config.codex_home).await?;

View File

@@ -32,6 +32,7 @@ codex-app-server-protocol = { workspace = true }
codex-protocol-ts = { workspace = true }
codex-responses-api-proxy = { workspace = true }
codex-tui = { workspace = true }
codex-rmcp-client = { workspace = true }
codex-cloud-tasks = { path = "../cloud-tasks" }
ctor = { workspace = true }
owo-colors = { workspace = true }

View File

@@ -73,7 +73,8 @@ async fn run_command_under_sandbox(
codex_linux_sandbox_exe,
..Default::default()
},
)?;
)
.await?;
// In practice, this should be `std::env::current_dir()` because this CLI
// does not support `--cwd`, but let's use the config value for consistency.

View File

@@ -9,6 +9,8 @@ use codex_core::config::ConfigOverrides;
use codex_login::ServerOptions;
use codex_login::run_device_code_login;
use codex_login::run_login_server;
use std::io::IsTerminal;
use std::io::Read;
use std::path::PathBuf;
pub async fn login_with_chatgpt(codex_home: PathBuf) -> std::io::Result<()> {
@@ -24,7 +26,7 @@ pub async fn login_with_chatgpt(codex_home: PathBuf) -> std::io::Result<()> {
}
pub async fn run_login_with_chatgpt(cli_config_overrides: CliConfigOverrides) -> ! {
let config = load_config_or_exit(cli_config_overrides);
let config = load_config_or_exit(cli_config_overrides).await;
match login_with_chatgpt(config.codex_home).await {
Ok(_) => {
@@ -42,7 +44,7 @@ pub async fn run_login_with_api_key(
cli_config_overrides: CliConfigOverrides,
api_key: String,
) -> ! {
let config = load_config_or_exit(cli_config_overrides);
let config = load_config_or_exit(cli_config_overrides).await;
match login_with_api_key(&config.codex_home, &api_key) {
Ok(_) => {
@@ -56,13 +58,40 @@ pub async fn run_login_with_api_key(
}
}
pub fn read_api_key_from_stdin() -> String {
let mut stdin = std::io::stdin();
if stdin.is_terminal() {
eprintln!(
"--with-api-key expects the API key on stdin. Try piping it, e.g. `printenv OPENAI_API_KEY | codex login --with-api-key`."
);
std::process::exit(1);
}
eprintln!("Reading API key from stdin...");
let mut buffer = String::new();
if let Err(err) = stdin.read_to_string(&mut buffer) {
eprintln!("Failed to read API key from stdin: {err}");
std::process::exit(1);
}
let api_key = buffer.trim().to_string();
if api_key.is_empty() {
eprintln!("No API key provided via stdin.");
std::process::exit(1);
}
api_key
}
/// Login using the OAuth device code flow.
pub async fn run_login_with_device_code(
cli_config_overrides: CliConfigOverrides,
issuer_base_url: Option<String>,
client_id: Option<String>,
) -> ! {
let config = load_config_or_exit(cli_config_overrides);
let config = load_config_or_exit(cli_config_overrides).await;
let mut opts = ServerOptions::new(
config.codex_home,
client_id.unwrap_or(CLIENT_ID.to_string()),
@@ -83,7 +112,7 @@ pub async fn run_login_with_device_code(
}
pub async fn run_login_status(cli_config_overrides: CliConfigOverrides) -> ! {
let config = load_config_or_exit(cli_config_overrides);
let config = load_config_or_exit(cli_config_overrides).await;
match CodexAuth::from_codex_home(&config.codex_home) {
Ok(Some(auth)) => match auth.mode {
@@ -114,7 +143,7 @@ pub async fn run_login_status(cli_config_overrides: CliConfigOverrides) -> ! {
}
pub async fn run_logout(cli_config_overrides: CliConfigOverrides) -> ! {
let config = load_config_or_exit(cli_config_overrides);
let config = load_config_or_exit(cli_config_overrides).await;
match logout(&config.codex_home) {
Ok(true) => {
@@ -132,7 +161,7 @@ pub async fn run_logout(cli_config_overrides: CliConfigOverrides) -> ! {
}
}
fn load_config_or_exit(cli_config_overrides: CliConfigOverrides) -> Config {
async fn load_config_or_exit(cli_config_overrides: CliConfigOverrides) -> Config {
let cli_overrides = match cli_config_overrides.parse_overrides() {
Ok(v) => v,
Err(e) => {
@@ -142,7 +171,7 @@ fn load_config_or_exit(cli_config_overrides: CliConfigOverrides) -> Config {
};
let config_overrides = ConfigOverrides::default();
match Config::load_with_cli_overrides(cli_overrides, config_overrides) {
match Config::load_with_cli_overrides(cli_overrides, config_overrides).await {
Ok(config) => config,
Err(e) => {
eprintln!("Error loading configuration: {e}");
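Review note: `read_api_key_from_stdin` in this file combines reading, validation, and `std::process::exit`, which makes the trimming and empty-key checks hard to unit test. A minimal sketch of one way to isolate them, assuming a hypothetical helper named `read_api_key_from` (not part of this change) that accepts any `Read` source and returns an error instead of exiting:

```rust
use std::io::Read;

/// Hypothetical helper (not in the diff): reads the whole input, trims
/// surrounding whitespace, and rejects an empty key instead of exiting.
pub fn read_api_key_from<R: Read>(mut reader: R) -> Result<String, String> {
    let mut buffer = String::new();
    reader
        .read_to_string(&mut buffer)
        .map_err(|err| format!("Failed to read API key: {err}"))?;

    // Mirror the diff's behavior: a piped key usually ends with a newline.
    let api_key = buffer.trim();
    if api_key.is_empty() {
        return Err("No API key provided.".to_string());
    }
    Ok(api_key.to_string())
}
```

Under this assumption the CLI entry point would keep the `is_terminal()` check and the `exit(1)` calls, and pass `std::io::stdin()` to the helper.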

View File

@@ -7,6 +7,7 @@ use codex_chatgpt::apply_command::ApplyCommand;
use codex_chatgpt::apply_command::run_apply_command;
use codex_cli::LandlockCommand;
use codex_cli::SeatbeltCommand;
use codex_cli::login::read_api_key_from_stdin;
use codex_cli::login::run_login_status;
use codex_cli::login::run_login_with_api_key;
use codex_cli::login::run_login_with_chatgpt;
@@ -139,7 +140,18 @@ struct LoginCommand {
#[clap(skip)]
config_overrides: CliConfigOverrides,
#[arg(long = "api-key", value_name = "API_KEY")]
#[arg(
long = "with-api-key",
help = "Read the API key from stdin (e.g. `printenv OPENAI_API_KEY | codex login --with-api-key`)"
)]
with_api_key: bool,
#[arg(
long = "api-key",
value_name = "API_KEY",
help = "(deprecated) Previously accepted the API key directly; now exits with guidance to use --with-api-key",
hide = true
)]
api_key: Option<String>,
/// EXPERIMENTAL: Use device code flow (not yet supported)
@@ -298,7 +310,13 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
login_cli.client_id,
)
.await;
} else if let Some(api_key) = login_cli.api_key {
} else if login_cli.api_key.is_some() {
eprintln!(
"The --api-key flag is no longer supported. Pipe the key instead, e.g. `printenv OPENAI_API_KEY | codex login --with-api-key`."
);
std::process::exit(1);
} else if login_cli.with_api_key {
let api_key = read_api_key_from_stdin();
run_login_with_api_key(login_cli.config_overrides, api_key).await;
} else {
run_login_with_chatgpt(login_cli.config_overrides).await;

View File

@@ -12,6 +12,8 @@ use codex_core::config::load_global_mcp_servers;
use codex_core::config::write_global_mcp_servers;
use codex_core::config_types::McpServerConfig;
use codex_core::config_types::McpServerTransportConfig;
use codex_rmcp_client::delete_oauth_tokens;
use codex_rmcp_client::perform_oauth_login;
/// [experimental] Launch Codex as an MCP server or manage configured MCP servers.
///
@@ -43,6 +45,14 @@ pub enum McpSubcommand {
/// [experimental] Remove a global MCP server entry.
Remove(RemoveArgs),
/// [experimental] Authenticate with a configured MCP server via OAuth.
/// Requires experimental_use_rmcp_client = true in config.toml.
Login(LoginArgs),
/// [experimental] Remove stored OAuth credentials for a server.
/// Requires experimental_use_rmcp_client = true in config.toml.
Logout(LogoutArgs),
}
#[derive(Debug, clap::Parser)]
@@ -82,6 +92,18 @@ pub struct RemoveArgs {
pub name: String,
}
#[derive(Debug, clap::Parser)]
pub struct LoginArgs {
/// Name of the MCP server to authenticate with via OAuth.
pub name: String,
}
#[derive(Debug, clap::Parser)]
pub struct LogoutArgs {
/// Name of the MCP server to deauthenticate.
pub name: String,
}
impl McpCli {
pub async fn run(self) -> Result<()> {
let McpCli {
@@ -91,16 +113,22 @@ impl McpCli {
match subcommand {
McpSubcommand::List(args) => {
run_list(&config_overrides, args)?;
run_list(&config_overrides, args).await?;
}
McpSubcommand::Get(args) => {
run_get(&config_overrides, args)?;
run_get(&config_overrides, args).await?;
}
McpSubcommand::Add(args) => {
run_add(&config_overrides, args)?;
run_add(&config_overrides, args).await?;
}
McpSubcommand::Remove(args) => {
run_remove(&config_overrides, args)?;
run_remove(&config_overrides, args).await?;
}
McpSubcommand::Login(args) => {
run_login(&config_overrides, args).await?;
}
McpSubcommand::Logout(args) => {
run_logout(&config_overrides, args).await?;
}
}
@@ -108,7 +136,7 @@ impl McpCli {
}
}
fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Result<()> {
async fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Result<()> {
// Validate any provided overrides even though they are not currently applied.
config_overrides.parse_overrides().map_err(|e| anyhow!(e))?;
@@ -134,6 +162,7 @@ fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Result<(
let codex_home = find_codex_home().context("failed to resolve CODEX_HOME")?;
let mut servers = load_global_mcp_servers(&codex_home)
.await
.with_context(|| format!("failed to load MCP servers from {}", codex_home.display()))?;
let new_entry = McpServerConfig {
@@ -156,7 +185,7 @@ fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Result<(
Ok(())
}
fn run_remove(config_overrides: &CliConfigOverrides, remove_args: RemoveArgs) -> Result<()> {
async fn run_remove(config_overrides: &CliConfigOverrides, remove_args: RemoveArgs) -> Result<()> {
config_overrides.parse_overrides().map_err(|e| anyhow!(e))?;
let RemoveArgs { name } = remove_args;
@@ -165,6 +194,7 @@ fn run_remove(config_overrides: &CliConfigOverrides, remove_args: RemoveArgs) ->
let codex_home = find_codex_home().context("failed to resolve CODEX_HOME")?;
let mut servers = load_global_mcp_servers(&codex_home)
.await
.with_context(|| format!("failed to load MCP servers from {}", codex_home.display()))?;
let removed = servers.remove(&name).is_some();
@@ -183,9 +213,65 @@ fn run_remove(config_overrides: &CliConfigOverrides, remove_args: RemoveArgs) ->
Ok(())
}
fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) -> Result<()> {
async fn run_login(config_overrides: &CliConfigOverrides, login_args: LoginArgs) -> Result<()> {
let overrides = config_overrides.parse_overrides().map_err(|e| anyhow!(e))?;
let config = Config::load_with_cli_overrides(overrides, ConfigOverrides::default())
.await
.context("failed to load configuration")?;
if !config.use_experimental_use_rmcp_client {
bail!(
"OAuth login is only supported when experimental_use_rmcp_client is true in config.toml."
);
}
let LoginArgs { name } = login_args;
let Some(server) = config.mcp_servers.get(&name) else {
bail!("No MCP server named '{name}' found.");
};
let url = match &server.transport {
McpServerTransportConfig::StreamableHttp { url, .. } => url.clone(),
_ => bail!("OAuth login is only supported for streamable HTTP servers."),
};
perform_oauth_login(&name, &url).await?;
println!("Successfully logged in to MCP server '{name}'.");
Ok(())
}
async fn run_logout(config_overrides: &CliConfigOverrides, logout_args: LogoutArgs) -> Result<()> {
let overrides = config_overrides.parse_overrides().map_err(|e| anyhow!(e))?;
let config = Config::load_with_cli_overrides(overrides, ConfigOverrides::default())
.await
.context("failed to load configuration")?;
let LogoutArgs { name } = logout_args;
let server = config
.mcp_servers
.get(&name)
.ok_or_else(|| anyhow!("No MCP server named '{name}' found in configuration."))?;
let url = match &server.transport {
McpServerTransportConfig::StreamableHttp { url, .. } => url.clone(),
_ => bail!("OAuth logout is only supported for streamable_http transports."),
};
match delete_oauth_tokens(&name, &url) {
Ok(true) => println!("Removed OAuth credentials for '{name}'."),
Ok(false) => println!("No OAuth credentials stored for '{name}'."),
Err(err) => return Err(anyhow!("failed to delete OAuth credentials: {err}")),
}
Ok(())
}
async fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) -> Result<()> {
let overrides = config_overrides.parse_overrides().map_err(|e| anyhow!(e))?;
let config = Config::load_with_cli_overrides(overrides, ConfigOverrides::default())
.await
.context("failed to load configuration")?;
let mut entries: Vec<_> = config.mcp_servers.iter().collect();
@@ -343,9 +429,10 @@ fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) -> Resul
Ok(())
}
fn run_get(config_overrides: &CliConfigOverrides, get_args: GetArgs) -> Result<()> {
async fn run_get(config_overrides: &CliConfigOverrides, get_args: GetArgs) -> Result<()> {
let overrides = config_overrides.parse_overrides().map_err(|e| anyhow!(e))?;
let config = Config::load_with_cli_overrides(overrides, ConfigOverrides::default())
.await
.context("failed to load configuration")?;
let Some(server) = config.mcp_servers.get(&get_args.name) else {
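Review note: `run_login` and `run_logout` in this file each match on the server transport to extract the URL, with slightly different error wording ("streamable HTTP servers" vs. "streamable_http transports"). A minimal sketch of a shared helper, assuming a simplified stand-in enum `Transport` and a hypothetical function `oauth_url`; the real `McpServerTransportConfig` has more fields than shown here:

```rust
/// Hypothetical, simplified stand-in for `McpServerTransportConfig`.
#[derive(Debug)]
pub enum Transport {
    Stdio { command: String },
    StreamableHttp { url: String },
}

/// Hypothetical helper: returns the URL used for OAuth, or an error for
/// transports that have no URL to authenticate against.
pub fn oauth_url(transport: &Transport) -> Result<String, String> {
    match transport {
        Transport::StreamableHttp { url } => Ok(url.clone()),
        _ => Err("OAuth is only supported for streamable HTTP servers.".to_string()),
    }
}
```

Both subcommands could then call the same helper, so the user sees one error message regardless of which command rejected the transport.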

View File

@@ -13,8 +13,8 @@ fn codex_command(codex_home: &Path) -> Result<assert_cmd::Command> {
Ok(cmd)
}
#[test]
fn add_and_remove_server_updates_global_config() -> Result<()> {
#[tokio::test]
async fn add_and_remove_server_updates_global_config() -> Result<()> {
let codex_home = TempDir::new()?;
let mut add_cmd = codex_command(codex_home.path())?;
@@ -24,7 +24,7 @@ fn add_and_remove_server_updates_global_config() -> Result<()> {
.success()
.stdout(contains("Added global MCP server 'docs'."));
let servers = load_global_mcp_servers(codex_home.path())?;
let servers = load_global_mcp_servers(codex_home.path()).await?;
assert_eq!(servers.len(), 1);
let docs = servers.get("docs").expect("server should exist");
match &docs.transport {
@@ -43,7 +43,7 @@ fn add_and_remove_server_updates_global_config() -> Result<()> {
.success()
.stdout(contains("Removed global MCP server 'docs'."));
let servers = load_global_mcp_servers(codex_home.path())?;
let servers = load_global_mcp_servers(codex_home.path()).await?;
assert!(servers.is_empty());
let mut remove_again_cmd = codex_command(codex_home.path())?;
@@ -53,14 +53,14 @@ fn add_and_remove_server_updates_global_config() -> Result<()> {
.success()
.stdout(contains("No MCP server named 'docs' found."));
let servers = load_global_mcp_servers(codex_home.path())?;
let servers = load_global_mcp_servers(codex_home.path()).await?;
assert!(servers.is_empty());
Ok(())
}
#[test]
fn add_with_env_preserves_key_order_and_values() -> Result<()> {
#[tokio::test]
async fn add_with_env_preserves_key_order_and_values() -> Result<()> {
let codex_home = TempDir::new()?;
let mut add_cmd = codex_command(codex_home.path())?;
@@ -80,7 +80,7 @@ fn add_with_env_preserves_key_order_and_values() -> Result<()> {
.assert()
.success();
let servers = load_global_mcp_servers(codex_home.path())?;
let servers = load_global_mcp_servers(codex_home.path()).await?;
let envy = servers.get("envy").expect("server should exist");
let env = match &envy.transport {
McpServerTransportConfig::Stdio { env: Some(env), .. } => env,

View File

@@ -19,18 +19,21 @@ async-trait = { workspace = true }
base64 = { workspace = true }
bytes = { workspace = true }
chrono = { workspace = true, features = ["serde"] }
codex-app-server-protocol = { workspace = true }
codex-apply-patch = { workspace = true }
codex-file-search = { workspace = true }
codex-mcp-client = { workspace = true }
codex-rmcp-client = { workspace = true }
codex-protocol = { workspace = true }
codex-app-server-protocol = { workspace = true }
codex-otel = { workspace = true, features = ["otel"] }
codex-protocol = { workspace = true }
codex-rmcp-client = { workspace = true }
codex-utils-string = { workspace = true }
dirs = { workspace = true }
dunce = { workspace = true }
env-flags = { workspace = true }
eventsource-stream = { workspace = true }
fd-lock = { workspace = true }
futures = { workspace = true }
gethostname = "0.4"
indexmap = { workspace = true }
libc = { workspace = true }
mcp-types = { workspace = true }
@@ -39,6 +42,7 @@ portable-pty = { workspace = true }
rand = { workspace = true }
regex-lite = { workspace = true }
reqwest = { workspace = true, features = ["json", "stream"] }
shellexpand = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
sha1 = { workspace = true }
@@ -75,6 +79,9 @@ wildmatch = { workspace = true }
landlock = { workspace = true }
seccompiler = { workspace = true }
[target.'cfg(target_os = "macos")'.dependencies]
core-foundation = "0.9"
# Build OpenSSL from source for musl builds.
[target.x86_64-unknown-linux-musl.dependencies]
openssl-sys = { workspace = true, features = ["vendored"] }
@@ -90,11 +97,12 @@ escargot = { workspace = true }
maplit = { workspace = true }
predicates = { workspace = true }
pretty_assertions = { workspace = true }
serial_test = { workspace = true }
tempfile = { workspace = true }
tokio-test = { workspace = true }
tracing-test = { workspace = true, features = ["no-env-filter"] }
walkdir = { workspace = true }
wiremock = { workspace = true }
tracing-test = { workspace = true, features = ["no-env-filter"] }
[package.metadata.cargo-shear]
ignored = ["openssl-sys"]

View File

@@ -0,0 +1,448 @@
use crate::config_types::AdminAuditEventKind;
use crate::config_types::AdminAuditToml;
use crate::config_types::AdminConfigToml;
use crate::exec::ExecParams;
use crate::exec::SandboxType;
use crate::path_utils::expand_tilde;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
use chrono::DateTime;
use chrono::Utc;
use fd_lock::RwLock;
use gethostname::gethostname;
use reqwest::Client;
use serde::Serialize;
use std::collections::HashSet;
use std::fs;
use std::fs::OpenOptions;
use std::io::Write;
use std::io;
use std::path::Path;
use std::path::PathBuf;
use tokio::runtime::Handle;
use tracing::warn;
#[cfg(unix)]
use std::os::unix::fs::OpenOptionsExt;
#[derive(Debug, Clone, PartialEq, Default)]
pub struct AdminControls {
pub danger: DangerControls,
pub audit: Option<AdminAuditConfig>,
pub pending: Vec<PendingAdminAction>,
}
#[derive(Debug, Clone, PartialEq, Default)]
pub struct DangerControls {
pub disallow_full_access: bool,
pub allow_with_reason: bool,
}
#[derive(Debug, Clone, PartialEq)]
pub struct AdminAuditConfig {
pub log_file: Option<PathBuf>,
pub log_endpoint: Option<String>,
pub log_events: HashSet<AdminAuditEventKind>,
}
#[derive(Debug, Clone, PartialEq)]
pub enum PendingAdminAction {
Danger(DangerPending),
}
#[derive(Debug, Clone, PartialEq)]
pub struct DangerPending {
pub source: DangerRequestSource,
pub requested_sandbox: SandboxPolicy,
pub requested_approval: AskForApproval,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
#[serde(rename_all = "snake_case")]
pub enum DangerRequestSource {
Startup,
Resume,
Approvals,
ExecCli,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DangerDecision {
Allowed,
RequiresJustification,
Denied,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
#[serde(rename_all = "snake_case")]
pub enum DangerAuditAction {
Requested,
Approved,
Cancelled,
Denied,
}
#[derive(Debug, Clone, Serialize)]
#[serde(tag = "audit_kind", rename_all = "snake_case")]
pub enum AdminAuditPayload {
Danger {
action: DangerAuditAction,
justification: Option<String>,
requested_by: DangerRequestSource,
sandbox_policy: SandboxPolicy,
approval_policy: AskForApproval,
},
Command {
command: Vec<String>,
command_cwd: PathBuf,
cli_cwd: PathBuf,
sandbox_type: SandboxType,
sandbox_policy: SandboxPolicy,
escalated: bool,
justification: Option<String>,
},
}
#[derive(Debug, Clone, Serialize)]
pub struct AdminAuditRecord {
timestamp: DateTime<Utc>,
username: String,
hostname: String,
#[serde(flatten)]
payload: AdminAuditPayload,
}
impl AdminControls {
pub fn from_toml(raw: Option<AdminConfigToml>) -> io::Result<Self> {
let raw = raw.unwrap_or_default();
let danger = DangerControls {
disallow_full_access: raw.disallow_danger_full_access.unwrap_or(false),
allow_with_reason: raw.allow_danger_with_reason.unwrap_or(false),
};
let audit = match raw.audit {
Some(audit_raw) => AdminAuditConfig::from_toml(audit_raw)?,
None => None,
};
Ok(Self {
danger,
audit,
pending: Vec::new(),
})
}
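/// Maps the admin danger settings to a decision. Full access is allowed unless
/// `disallow_full_access` is set; when it is set, `allow_with_reason` turns the
/// denial into `RequiresJustification`.
pub fn decision_for_danger(&self) -> DangerDecision {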
if !self.danger.disallow_full_access {
DangerDecision::Allowed
} else if self.danger.allow_with_reason {
DangerDecision::RequiresJustification
} else {
DangerDecision::Denied
}
}
pub fn has_pending_danger(&self) -> bool {
self.pending
.iter()
.any(|action| matches!(action, PendingAdminAction::Danger(_)))
}
pub fn take_pending_danger(&mut self) -> Option<DangerPending> {
self.pending
.extract_if(.., |action| matches!(action, PendingAdminAction::Danger(_)))
.next()
.map(|action| match action {
PendingAdminAction::Danger(pending) => pending,
})
}
pub fn peek_pending_danger(&self) -> Option<&DangerPending> {
self.pending
.iter()
.map(|action| match action {
PendingAdminAction::Danger(pending) => pending,
})
.next()
}
}
impl AdminAuditConfig {
pub fn from_toml(raw: AdminAuditToml) -> io::Result<Option<Self>> {
let AdminAuditToml {
log_file,
log_endpoint,
log_events,
} = raw;
let log_file = match log_file {
Some(path) => {
let trimmed = path.trim();
if trimmed.is_empty() {
None
} else {
Some(expand_tilde(trimmed)?)
}
}
None => None,
};
let log_endpoint = log_endpoint
.map(|endpoint| endpoint.trim().to_string())
.filter(|s| !s.is_empty());
if log_file.is_none() && log_endpoint.is_none() {
return Ok(None);
}
let log_events = log_events.into_iter().collect();
Ok(Some(Self {
log_file,
log_endpoint,
log_events,
}))
}
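/// Returns whether events of `kind` should be logged. An empty `log_events`
/// set means every event kind is logged.
pub fn should_log(&self, kind: AdminAuditEventKind) -> bool {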
self.log_events.is_empty() || self.log_events.contains(&kind)
}
}
impl AdminAuditPayload {
pub fn kind(&self) -> AdminAuditEventKind {
match self {
AdminAuditPayload::Danger { .. } => AdminAuditEventKind::Danger,
AdminAuditPayload::Command { .. } => AdminAuditEventKind::Command,
}
}
}
impl AdminAuditRecord {
fn new(payload: AdminAuditPayload) -> Self {
Self {
timestamp: Utc::now(),
username: current_username(),
hostname: current_hostname(),
payload,
}
}
}
pub fn log_admin_event(config: &AdminAuditConfig, payload: AdminAuditPayload) {
let kind = payload.kind();
if !config.should_log(kind) {
return;
}
let record = AdminAuditRecord::new(payload);
if let Some(path) = &config.log_file
&& let Err(err) = append_record_to_file(path, &record)
{
warn!(
"failed to write admin audit event to {}: {err:?}",
path.display()
);
}
if let Some(endpoint) = &config.log_endpoint {
if Handle::try_current().is_ok() {
let endpoint = endpoint.clone();
tokio::spawn(async move {
if let Err(err) = send_record_to_endpoint(&endpoint, record).await {
warn!("failed to post admin audit event to {endpoint}: {err:?}");
}
});
} else {
warn!(
"admin audit HTTP logging requested for {endpoint}, but no async runtime is available",
);
}
}
}
fn append_record_to_file(path: &Path, record: &AdminAuditRecord) -> io::Result<()> {
if let Some(parent) = path.parent() {
fs::create_dir_all(parent)?;
}
let mut options = OpenOptions::new();
options.create(true).append(true).write(true);
#[cfg(unix)]
{
options.mode(0o600);
}
let file = options.open(path)?;
let mut lock = RwLock::new(file);
let mut guard = lock.write()?;
let line = serde_json::to_string(record).map_err(io::Error::other)?;
guard.write_all(line.as_bytes())?;
guard.write_all(b"\n")?;
guard.flush()?;
Ok(())
}
async fn send_record_to_endpoint(
endpoint: &str,
record: AdminAuditRecord,
) -> Result<(), reqwest::Error> {
Client::new().post(endpoint).json(&record).send().await?;
Ok(())
}
fn current_username() -> String {
env_var("USER")
.or_else(|| env_var("USERNAME"))
.unwrap_or_else(|| "unknown".to_string())
}
fn current_hostname() -> String {
gethostname()
.into_string()
.ok()
.filter(|value| !value.is_empty())
.or_else(|| env_var("HOSTNAME"))
.or_else(|| env_var("COMPUTERNAME"))
.unwrap_or_else(|| "unknown".to_string())
}
fn env_var(key: &str) -> Option<String> {
std::env::var(key).ok().filter(|value| !value.is_empty())
}
pub fn build_danger_audit_payload(
pending: &DangerPending,
action: DangerAuditAction,
justification: Option<String>,
) -> AdminAuditPayload {
AdminAuditPayload::Danger {
action,
justification,
requested_by: pending.source,
sandbox_policy: pending.requested_sandbox.clone(),
approval_policy: pending.requested_approval,
}
}
pub fn build_command_audit_payload(
params: &ExecParams,
sandbox_type: SandboxType,
sandbox_policy: &SandboxPolicy,
cli_cwd: &Path,
) -> AdminAuditPayload {
AdminAuditPayload::Command {
command: params.command.clone(),
command_cwd: params.cwd.clone(),
cli_cwd: cli_cwd.to_path_buf(),
sandbox_type,
sandbox_policy: sandbox_policy.clone(),
escalated: params.with_escalated_permissions.unwrap_or(false),
justification: params.justification.clone(),
}
}
#[cfg(test)]
mod tests {
use super::*;
use serde_json::Value;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
#[test]
fn danger_payload_serializes_expected_fields() {
let pending = DangerPending {
source: DangerRequestSource::Approvals,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: AskForApproval::Never,
};
let payload = build_danger_audit_payload(
&pending,
DangerAuditAction::Requested,
Some("reason".to_string()),
);
let record = AdminAuditRecord::new(payload);
let value = serde_json::to_value(record).expect("serialize record");
assert_eq!(
value.get("audit_kind"),
Some(&Value::String("danger".to_string()))
);
assert_eq!(
value.get("action"),
Some(&Value::String("requested".to_string()))
);
assert_eq!(
value.get("requested_by"),
Some(&Value::String("approvals".to_string()))
);
assert_eq!(
value.get("approval_policy"),
Some(&Value::String("never".to_string()))
);
assert_eq!(
value.get("sandbox_policy").and_then(|sp| sp.get("mode")),
Some(&Value::String("danger-full-access".to_string()))
);
assert_eq!(
value.get("justification"),
Some(&Value::String("reason".to_string()))
);
}
#[test]
fn command_payload_serializes_expected_fields() {
let mut env = HashMap::new();
env.insert("PATH".to_string(), "/usr/bin".to_string());
let params = ExecParams {
command: vec!["echo".to_string(), "hello".to_string()],
cwd: PathBuf::from("/tmp"),
timeout_ms: Some(1000),
env,
with_escalated_permissions: Some(true),
justification: Some("investigation".to_string()),
};
let sandbox_policy = SandboxPolicy::new_workspace_write_policy();
let payload = build_command_audit_payload(
&params,
SandboxType::MacosSeatbelt,
&sandbox_policy,
Path::new("/workspace"),
);
let record = AdminAuditRecord::new(payload);
let value = serde_json::to_value(record).expect("serialize record");
assert_eq!(
value.get("audit_kind"),
Some(&Value::String("command".to_string()))
);
assert_eq!(
value.get("command"),
Some(&serde_json::json!(["echo", "hello"]))
);
assert_eq!(
value.get("command_cwd"),
Some(&Value::String("/tmp".to_string()))
);
assert_eq!(
value.get("cli_cwd"),
Some(&Value::String("/workspace".to_string()))
);
assert_eq!(
value.get("sandbox_type"),
Some(&Value::String("macos-seatbelt".to_string()))
);
assert_eq!(
value.get("sandbox_policy").and_then(|sp| sp.get("mode")),
Some(&Value::String("workspace-write".to_string()))
);
assert_eq!(value.get("escalated"), Some(&Value::Bool(true)));
assert_eq!(
value.get("justification"),
Some(&Value::String("investigation".to_string()))
);
}
}
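The normalization and filtering rules used by the audit config above can be sketched in isolation. This is a minimal, std-only illustration rather than the crate's code: `EventKind`, `AuditConfig`, `normalize`, and `from_raw` are hypothetical names, and the sketch assumes a single file sink (no HTTP endpoint, no tilde expansion).

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the audit event kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum EventKind {
    Danger,
    Command,
}

/// Hypothetical, simplified audit config: one sink plus an event filter.
#[derive(Debug, PartialEq)]
pub struct AuditConfig {
    pub log_file: String,
    pub log_events: HashSet<EventKind>,
}

/// Trim the raw value and treat blank strings as "not configured".
pub fn normalize(raw: Option<&str>) -> Option<String> {
    raw.map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_string)
}

impl AuditConfig {
    /// Returns `None` when no sink is configured, which disables auditing.
    pub fn from_raw(log_file: Option<&str>, log_events: Vec<EventKind>) -> Option<Self> {
        let log_file = normalize(log_file)?;
        Some(Self {
            log_file,
            log_events: log_events.into_iter().collect(),
        })
    }

    /// An empty filter means "log every kind of event".
    pub fn should_log(&self, kind: EventKind) -> bool {
        self.log_events.is_empty() || self.log_events.contains(&kind)
    }
}
```

Treating an empty `log_events` list as "everything" keeps the smallest useful configuration to a single sink setting, at the cost of having no way to express "log nothing" other than omitting the sink.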


@@ -27,6 +27,7 @@ pub(crate) enum InternalApplyPatchInvocation {
DelegateToExec(ApplyPatchExec),
}
#[derive(Debug)]
pub(crate) struct ApplyPatchExec {
pub(crate) action: ApplyPatchAction,
pub(crate) user_explicitly_approved_this_action: bool,
@@ -109,3 +110,28 @@ pub(crate) fn convert_apply_patch_to_protocol(
}
result
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
use tempfile::tempdir;
#[test]
fn convert_apply_patch_maps_add_variant() {
let tmp = tempdir().expect("tmp");
let p = tmp.path().join("a.txt");
// Create an action with a single Add change
let action = ApplyPatchAction::new_add_for_test(&p, "hello".to_string());
let got = convert_apply_patch_to_protocol(&action);
assert_eq!(
got.get(&p),
Some(&FileChange::Add {
content: "hello".to_string()
})
);
}
}


@@ -1,6 +1,6 @@
use crate::client_common::tools::ToolSpec;
use crate::error::Result;
use crate::model_family::ModelFamily;
use crate::openai_tools::OpenAiTool;
use crate::protocol::RateLimitSnapshot;
use crate::protocol::TokenUsage;
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
@@ -29,7 +29,7 @@ pub struct Prompt {
/// Tools available to the model, including additional tools sourced from
/// external MCP servers.
pub(crate) tools: Vec<OpenAiTool>,
pub(crate) tools: Vec<ToolSpec>,
/// Optional override for the built-in BASE_INSTRUCTIONS.
pub base_instructions_override: Option<String>,
@@ -49,8 +49,8 @@ impl Prompt {
// AND
// - there is no apply_patch tool present
let is_apply_patch_tool_present = self.tools.iter().any(|tool| match tool {
OpenAiTool::Function(f) => f.name == "apply_patch",
OpenAiTool::Freeform(f) => f.name == "apply_patch",
ToolSpec::Function(f) => f.name == "apply_patch",
ToolSpec::Freeform(f) => f.name == "apply_patch",
_ => false,
});
if self.base_instructions_override.is_none()
@@ -160,6 +160,54 @@ pub(crate) struct ResponsesApiRequest<'a> {
pub(crate) text: Option<TextControls>,
}
pub(crate) mod tools {
use crate::openai_tools::JsonSchema;
use serde::Deserialize;
use serde::Serialize;
/// When serialized as JSON, this produces a valid "Tool" in the OpenAI
/// Responses API.
#[derive(Debug, Clone, Serialize, PartialEq)]
#[serde(tag = "type")]
pub(crate) enum ToolSpec {
#[serde(rename = "function")]
Function(ResponsesApiTool),
#[serde(rename = "local_shell")]
LocalShell {},
// TODO: Understand why we get an error on web_search although the API docs say it's supported.
// https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses#:~:text=%7B%20type%3A%20%22web_search%22%20%7D%2C
#[serde(rename = "web_search")]
WebSearch {},
#[serde(rename = "custom")]
Freeform(FreeformTool),
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct FreeformTool {
pub(crate) name: String,
pub(crate) description: String,
pub(crate) format: FreeformToolFormat,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct FreeformToolFormat {
pub(crate) r#type: String,
pub(crate) syntax: String,
pub(crate) definition: String,
}
#[derive(Debug, Clone, Serialize, PartialEq)]
pub struct ResponsesApiTool {
pub(crate) name: String,
pub(crate) description: String,
/// TODO: Validation. When strict is set to true, the JSON schema,
/// `required` and `additional_properties` must be present. All fields in
/// `properties` must be present in `required`.
pub(crate) strict: bool,
pub(crate) parameters: JsonSchema,
}
}
pub(crate) fn create_reasoning_param_for_request(
model_family: &ModelFamily,
effort: Option<ReasoningEffortConfig>,

File diff suppressed because it is too large


@@ -1,4 +1,17 @@
use crate::admin_controls::AdminControls;
use crate::admin_controls::DangerAuditAction;
use crate::admin_controls::DangerDecision;
use crate::admin_controls::DangerPending;
use crate::admin_controls::DangerRequestSource;
use crate::admin_controls::PendingAdminAction;
use crate::admin_controls::build_danger_audit_payload;
use crate::admin_controls::log_admin_event;
use crate::config_loader::LoadedConfigLayers;
pub use crate::config_loader::load_config_as_toml;
use crate::config_loader::load_config_layers_with_overrides;
use crate::config_loader::merge_toml_values;
use crate::config_profile::ConfigProfile;
use crate::config_types::AdminConfigToml;
use crate::config_types::DEFAULT_OTEL_ENVIRONMENT;
use crate::config_types::History;
use crate::config_types::McpServerConfig;
@@ -209,53 +222,44 @@ pub struct Config {
/// OTEL configuration (exporter type, endpoint, headers, etc.).
pub otel: crate::config_types::OtelConfig,
/// Administrator-controlled options and audit configuration.
pub admin: AdminControls,
}
impl Config {
/// Load configuration with *generic* CLI overrides (`-c key=value`) applied
/// **in between** the values parsed from `config.toml` and the managed
/// configuration layers, followed by the strongly-typed overrides specified
/// via [`ConfigOverrides`].
///
/// The precedence order is therefore: `config.toml` < `-c` overrides <
/// managed config layers < `ConfigOverrides`.
pub fn load_with_cli_overrides(
pub async fn load_with_cli_overrides(
cli_overrides: Vec<(String, TomlValue)>,
overrides: ConfigOverrides,
) -> std::io::Result<Self> {
// Resolve the directory that stores Codex state (e.g. ~/.codex or the
// value of $CODEX_HOME) so we can embed it into the resulting
// `Config` instance.
let codex_home = find_codex_home()?;
// Step 1: parse `config.toml` into a generic TOML value.
let mut root_value = load_config_as_toml(&codex_home)?;
let root_value = load_resolved_config(
&codex_home,
cli_overrides,
crate::config_loader::LoaderOverrides::default(),
)
.await?;
// Step 2: apply the `-c` overrides.
for (path, value) in cli_overrides.into_iter() {
apply_toml_override(&mut root_value, &path, value);
}
// Step 3: deserialize into `ConfigToml` so that Serde can enforce the
// correct types.
let cfg: ConfigToml = root_value.try_into().map_err(|e| {
tracing::error!("Failed to deserialize overridden config: {e}");
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
})?;
// Step 4: merge with the strongly-typed overrides.
Self::load_from_base_config_with_overrides(cfg, overrides, codex_home)
}
}
pub fn load_config_as_toml_with_cli_overrides(
pub async fn load_config_as_toml_with_cli_overrides(
codex_home: &Path,
cli_overrides: Vec<(String, TomlValue)>,
) -> std::io::Result<ConfigToml> {
let mut root_value = load_config_as_toml(codex_home)?;
for (path, value) in cli_overrides.into_iter() {
apply_toml_override(&mut root_value, &path, value);
}
let root_value = load_resolved_config(
codex_home,
cli_overrides,
crate::config_loader::LoaderOverrides::default(),
)
.await?;
let cfg: ConfigToml = root_value.try_into().map_err(|e| {
tracing::error!("Failed to deserialize overridden config: {e}");
@@ -265,33 +269,40 @@ pub fn load_config_as_toml_with_cli_overrides(
Ok(cfg)
}
/// Read `CODEX_HOME/config.toml` and return it as a generic TOML value. Returns
/// an empty TOML table when the file does not exist.
pub fn load_config_as_toml(codex_home: &Path) -> std::io::Result<TomlValue> {
let config_path = codex_home.join(CONFIG_TOML_FILE);
match std::fs::read_to_string(&config_path) {
Ok(contents) => match toml::from_str::<TomlValue>(&contents) {
Ok(val) => Ok(val),
Err(e) => {
tracing::error!("Failed to parse config.toml: {e}");
Err(std::io::Error::new(std::io::ErrorKind::InvalidData, e))
}
},
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
tracing::info!("config.toml not found, using defaults");
Ok(TomlValue::Table(Default::default()))
}
Err(e) => {
tracing::error!("Failed to read config.toml: {e}");
Err(e)
}
}
async fn load_resolved_config(
codex_home: &Path,
cli_overrides: Vec<(String, TomlValue)>,
overrides: crate::config_loader::LoaderOverrides,
) -> std::io::Result<TomlValue> {
let layers = load_config_layers_with_overrides(codex_home, overrides).await?;
Ok(apply_overlays(layers, cli_overrides))
}
pub fn load_global_mcp_servers(
fn apply_overlays(
layers: LoadedConfigLayers,
cli_overrides: Vec<(String, TomlValue)>,
) -> TomlValue {
let LoadedConfigLayers {
mut base,
managed_config,
managed_preferences,
} = layers;
for (path, value) in cli_overrides.into_iter() {
apply_toml_override(&mut base, &path, value);
}
for overlay in [managed_config, managed_preferences].into_iter().flatten() {
merge_toml_values(&mut base, &overlay);
}
base
}
pub async fn load_global_mcp_servers(
codex_home: &Path,
) -> std::io::Result<BTreeMap<String, McpServerConfig>> {
let root_value = load_config_as_toml(codex_home)?;
let root_value = load_config_as_toml(codex_home).await?;
let Some(servers_value) = root_value.get("mcp_servers") else {
return Ok(BTreeMap::new());
};
@@ -735,6 +746,10 @@ pub struct ConfigToml {
/// OTEL configuration.
pub otel: Option<crate::config_types::OtelConfigToml>,
/// Administrator-level controls applied to all users on this host.
#[serde(default)]
pub admin: Option<AdminConfigToml>,
}
impl From<ConfigToml> for UserSavedConfig {
@@ -923,7 +938,68 @@ impl Config {
None => ConfigProfile::default(),
};
let sandbox_policy = cfg.derive_sandbox_policy(sandbox_mode);
let resolved_approval_policy = approval_policy
.or(config_profile.approval_policy)
.or(cfg.approval_policy)
.unwrap_or_else(AskForApproval::default);
let mut admin = AdminControls::from_toml(cfg.admin.clone())?;
let mut sandbox_policy = cfg.derive_sandbox_policy(sandbox_mode);
if matches!(sandbox_policy, SandboxPolicy::DangerFullAccess) {
match admin.decision_for_danger() {
DangerDecision::Allowed => {
if let Some(audit) = admin.audit.as_ref() {
let pending = DangerPending {
source: DangerRequestSource::Startup,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: resolved_approval_policy,
};
log_admin_event(
audit,
build_danger_audit_payload(&pending, DangerAuditAction::Approved, None),
);
}
}
DangerDecision::RequiresJustification => {
let pending = DangerPending {
source: DangerRequestSource::Startup,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: resolved_approval_policy,
};
if let Some(audit) = admin.audit.as_ref() {
log_admin_event(
audit,
build_danger_audit_payload(
&pending,
DangerAuditAction::Requested,
None,
),
);
}
admin.pending.push(PendingAdminAction::Danger(pending));
sandbox_policy = SandboxPolicy::new_workspace_write_policy();
}
DangerDecision::Denied => {
if let Some(audit) = admin.audit.as_ref() {
let pending = DangerPending {
source: DangerRequestSource::Startup,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: resolved_approval_policy,
};
log_admin_event(
audit,
build_danger_audit_payload(&pending, DangerAuditAction::Denied, None),
);
}
return Err(std::io::Error::new(
std::io::ErrorKind::PermissionDenied,
"danger-full-access is disabled by administrator policy",
));
}
}
}
let mut model_providers = built_in_model_providers();
// Merge user-defined providers into the built-in list.
@@ -1032,10 +1108,7 @@ impl Config {
model_provider_id,
model_provider,
cwd: resolved_cwd,
approval_policy: approval_policy
.or(config_profile.approval_policy)
.or(cfg.approval_policy)
.unwrap_or_else(AskForApproval::default),
approval_policy: resolved_approval_policy,
sandbox_policy,
shell_environment_policy,
notify: cfg.notify,
@@ -1110,6 +1183,7 @@ impl Config {
exporter,
}
},
admin,
};
Ok(config)
}
@@ -1219,6 +1293,7 @@ pub fn log_dir(cfg: &Config) -> std::io::Result<PathBuf> {
#[cfg(test)]
mod tests {
use crate::admin_controls::AdminControls;
use crate::config_types::HistoryPersistence;
use crate::config_types::Notifications;
@@ -1329,18 +1404,18 @@ exclude_slash_tmp = true
);
}
#[test]
fn load_global_mcp_servers_returns_empty_if_missing() -> anyhow::Result<()> {
#[tokio::test]
async fn load_global_mcp_servers_returns_empty_if_missing() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let servers = load_global_mcp_servers(codex_home.path())?;
let servers = load_global_mcp_servers(codex_home.path()).await?;
assert!(servers.is_empty());
Ok(())
}
#[test]
fn write_global_mcp_servers_round_trips_entries() -> anyhow::Result<()> {
#[tokio::test]
async fn write_global_mcp_servers_round_trips_entries() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let mut servers = BTreeMap::new();
@@ -1359,7 +1434,7 @@ exclude_slash_tmp = true
write_global_mcp_servers(codex_home.path(), &servers)?;
let loaded = load_global_mcp_servers(codex_home.path())?;
let loaded = load_global_mcp_servers(codex_home.path()).await?;
assert_eq!(loaded.len(), 1);
let docs = loaded.get("docs").expect("docs entry");
match &docs.transport {
@@ -1375,14 +1450,47 @@ exclude_slash_tmp = true
let empty = BTreeMap::new();
write_global_mcp_servers(codex_home.path(), &empty)?;
let loaded = load_global_mcp_servers(codex_home.path())?;
let loaded = load_global_mcp_servers(codex_home.path()).await?;
assert!(loaded.is_empty());
Ok(())
}
#[test]
fn load_global_mcp_servers_accepts_legacy_ms_field() -> anyhow::Result<()> {
#[tokio::test]
async fn managed_config_wins_over_cli_overrides() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let managed_path = codex_home.path().join("managed_config.toml");
std::fs::write(
codex_home.path().join(CONFIG_TOML_FILE),
"model = \"base\"\n",
)?;
std::fs::write(&managed_path, "model = \"managed_config\"\n")?;
let overrides = crate::config_loader::LoaderOverrides {
managed_config_path: Some(managed_path),
#[cfg(target_os = "macos")]
managed_preferences_base64: None,
};
let root_value = load_resolved_config(
codex_home.path(),
vec![("model".to_string(), TomlValue::String("cli".to_string()))],
overrides,
)
.await?;
let cfg: ConfigToml = root_value.try_into().map_err(|e| {
tracing::error!("Failed to deserialize overridden config: {e}");
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
})?;
assert_eq!(cfg.model.as_deref(), Some("managed_config"));
Ok(())
}
#[tokio::test]
async fn load_global_mcp_servers_accepts_legacy_ms_field() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let config_path = codex_home.path().join(CONFIG_TOML_FILE);
@@ -1396,15 +1504,15 @@ startup_timeout_ms = 2500
"#,
)?;
let servers = load_global_mcp_servers(codex_home.path())?;
let servers = load_global_mcp_servers(codex_home.path()).await?;
let docs = servers.get("docs").expect("docs entry");
assert_eq!(docs.startup_timeout_sec, Some(Duration::from_millis(2500)));
Ok(())
}
#[test]
fn write_global_mcp_servers_serializes_env_sorted() -> anyhow::Result<()> {
#[tokio::test]
async fn write_global_mcp_servers_serializes_env_sorted() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let servers = BTreeMap::from([(
@@ -1439,7 +1547,7 @@ ZIG_VAR = "3"
"#
);
let loaded = load_global_mcp_servers(codex_home.path())?;
let loaded = load_global_mcp_servers(codex_home.path()).await?;
let docs = loaded.get("docs").expect("docs entry");
match &docs.transport {
McpServerTransportConfig::Stdio { command, args, env } => {
@@ -1457,8 +1565,8 @@ ZIG_VAR = "3"
Ok(())
}
#[test]
fn write_global_mcp_servers_serializes_streamable_http() -> anyhow::Result<()> {
#[tokio::test]
async fn write_global_mcp_servers_serializes_streamable_http() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let mut servers = BTreeMap::from([(
@@ -1486,7 +1594,7 @@ startup_timeout_sec = 2.0
"#
);
let loaded = load_global_mcp_servers(codex_home.path())?;
let loaded = load_global_mcp_servers(codex_home.path()).await?;
let docs = loaded.get("docs").expect("docs entry");
match &docs.transport {
McpServerTransportConfig::StreamableHttp { url, bearer_token } => {
@@ -1518,7 +1626,7 @@ url = "https://example.com/mcp"
"#
);
let loaded = load_global_mcp_servers(codex_home.path())?;
let loaded = load_global_mcp_servers(codex_home.path()).await?;
let docs = loaded.get("docs").expect("docs entry");
match &docs.transport {
McpServerTransportConfig::StreamableHttp { url, bearer_token } => {
@@ -1853,6 +1961,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
otel: OtelConfig::default(),
admin: AdminControls::default(),
},
o3_profile_config
);
@@ -1914,6 +2023,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
otel: OtelConfig::default(),
admin: AdminControls::default(),
};
assert_eq!(expected_gpt3_profile_config, gpt3_profile_config);
@@ -1990,6 +2100,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
otel: OtelConfig::default(),
admin: AdminControls::default(),
};
assert_eq!(expected_zdr_profile_config, zdr_profile_config);
@@ -2052,6 +2163,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
otel: OtelConfig::default(),
admin: AdminControls::default(),
};
assert_eq!(expected_gpt5_profile_config, gpt5_profile_config);

View File

@@ -0,0 +1,118 @@
use std::io;
use toml::Value as TomlValue;
#[cfg(target_os = "macos")]
mod native {
use super::*;
use base64::Engine;
use base64::prelude::BASE64_STANDARD;
use core_foundation::base::TCFType;
use core_foundation::string::CFString;
use core_foundation::string::CFStringRef;
use std::ffi::c_void;
use tokio::task;
pub(crate) async fn load_managed_admin_config_layer(
override_base64: Option<&str>,
) -> io::Result<Option<TomlValue>> {
if let Some(encoded) = override_base64 {
let trimmed = encoded.trim();
return if trimmed.is_empty() {
Ok(None)
} else {
parse_managed_preferences_base64(trimmed).map(Some)
};
}
const LOAD_ERROR: &str = "Failed to load managed preferences configuration";
match task::spawn_blocking(load_managed_admin_config).await {
Ok(result) => result,
Err(join_err) => {
if join_err.is_cancelled() {
tracing::error!("Managed preferences load task was cancelled");
} else {
tracing::error!("Managed preferences load task failed: {join_err}");
}
Err(io::Error::other(LOAD_ERROR))
}
}
}
pub(super) fn load_managed_admin_config() -> io::Result<Option<TomlValue>> {
#[link(name = "CoreFoundation", kind = "framework")]
unsafe extern "C" {
fn CFPreferencesCopyAppValue(
key: CFStringRef,
application_id: CFStringRef,
) -> *mut c_void;
}
const MANAGED_PREFERENCES_APPLICATION_ID: &str = "com.openai.codex";
const MANAGED_PREFERENCES_CONFIG_KEY: &str = "config_toml_base64";
let application_id = CFString::new(MANAGED_PREFERENCES_APPLICATION_ID);
let key = CFString::new(MANAGED_PREFERENCES_CONFIG_KEY);
let value_ref = unsafe {
CFPreferencesCopyAppValue(
key.as_concrete_TypeRef(),
application_id.as_concrete_TypeRef(),
)
};
if value_ref.is_null() {
tracing::debug!(
"Managed preferences for {} key {} not found",
MANAGED_PREFERENCES_APPLICATION_ID,
MANAGED_PREFERENCES_CONFIG_KEY
);
return Ok(None);
}
let value = unsafe { CFString::wrap_under_create_rule(value_ref as _) };
let contents = value.to_string();
let trimmed = contents.trim();
parse_managed_preferences_base64(trimmed).map(Some)
}
pub(super) fn parse_managed_preferences_base64(encoded: &str) -> io::Result<TomlValue> {
let decoded = BASE64_STANDARD.decode(encoded.as_bytes()).map_err(|err| {
tracing::error!("Failed to decode managed preferences as base64: {err}");
io::Error::new(io::ErrorKind::InvalidData, err)
})?;
let decoded_str = String::from_utf8(decoded).map_err(|err| {
tracing::error!("Managed preferences base64 contents were not valid UTF-8: {err}");
io::Error::new(io::ErrorKind::InvalidData, err)
})?;
match toml::from_str::<TomlValue>(&decoded_str) {
Ok(TomlValue::Table(parsed)) => Ok(TomlValue::Table(parsed)),
Ok(other) => {
tracing::error!(
"Managed preferences TOML must have a table at the root, found {other:?}",
);
Err(io::Error::new(
io::ErrorKind::InvalidData,
"managed preferences root must be a table",
))
}
Err(err) => {
tracing::error!("Failed to parse managed preferences TOML: {err}");
Err(io::Error::new(io::ErrorKind::InvalidData, err))
}
}
}
}
#[cfg(target_os = "macos")]
pub(crate) use native::load_managed_admin_config_layer;
#[cfg(not(target_os = "macos"))]
pub(crate) async fn load_managed_admin_config_layer(
_override_base64: Option<&str>,
) -> io::Result<Option<TomlValue>> {
Ok(None)
}


@@ -0,0 +1,311 @@
mod macos;
use crate::config::CONFIG_TOML_FILE;
use macos::load_managed_admin_config_layer;
use std::io;
use std::path::Path;
use std::path::PathBuf;
use tokio::fs;
use toml::Value as TomlValue;
#[cfg(unix)]
const CODEX_MANAGED_CONFIG_SYSTEM_PATH: &str = "/etc/codex/managed_config.toml";
#[derive(Debug)]
pub(crate) struct LoadedConfigLayers {
pub base: TomlValue,
pub managed_config: Option<TomlValue>,
pub managed_preferences: Option<TomlValue>,
}
#[derive(Debug, Default)]
pub(crate) struct LoaderOverrides {
pub managed_config_path: Option<PathBuf>,
#[cfg(target_os = "macos")]
pub managed_preferences_base64: Option<String>,
}
// Configuration layering pipeline (top overrides bottom):
//
// +-------------------------+
// | Managed preferences (*) |
// +-------------------------+
// ^
// |
// +-------------------------+
// | managed_config.toml |
// +-------------------------+
// ^
// |
// +-------------------------+
// | config.toml (base) |
// +-------------------------+
//
// (*) Only available on macOS via managed device profiles.
pub async fn load_config_as_toml(codex_home: &Path) -> io::Result<TomlValue> {
load_config_as_toml_with_overrides(codex_home, LoaderOverrides::default()).await
}
fn default_empty_table() -> TomlValue {
TomlValue::Table(Default::default())
}
pub(crate) async fn load_config_layers_with_overrides(
codex_home: &Path,
overrides: LoaderOverrides,
) -> io::Result<LoadedConfigLayers> {
load_config_layers_internal(codex_home, overrides).await
}
async fn load_config_as_toml_with_overrides(
codex_home: &Path,
overrides: LoaderOverrides,
) -> io::Result<TomlValue> {
let layers = load_config_layers_internal(codex_home, overrides).await?;
Ok(apply_managed_layers(layers))
}
async fn load_config_layers_internal(
codex_home: &Path,
overrides: LoaderOverrides,
) -> io::Result<LoadedConfigLayers> {
#[cfg(target_os = "macos")]
let LoaderOverrides {
managed_config_path,
managed_preferences_base64,
} = overrides;
#[cfg(not(target_os = "macos"))]
let LoaderOverrides {
managed_config_path,
} = overrides;
let managed_config_path =
managed_config_path.unwrap_or_else(|| managed_config_default_path(codex_home));
let user_config_path = codex_home.join(CONFIG_TOML_FILE);
let user_config = read_config_from_path(&user_config_path, true).await?;
let managed_config = read_config_from_path(&managed_config_path, false).await?;
#[cfg(target_os = "macos")]
let managed_preferences =
load_managed_admin_config_layer(managed_preferences_base64.as_deref()).await?;
#[cfg(not(target_os = "macos"))]
let managed_preferences = load_managed_admin_config_layer(None).await?;
Ok(LoadedConfigLayers {
base: user_config.unwrap_or_else(default_empty_table),
managed_config,
managed_preferences,
})
}
async fn read_config_from_path(
path: &Path,
log_missing_as_info: bool,
) -> io::Result<Option<TomlValue>> {
match fs::read_to_string(path).await {
Ok(contents) => match toml::from_str::<TomlValue>(&contents) {
Ok(value) => Ok(Some(value)),
Err(err) => {
tracing::error!("Failed to parse {}: {err}", path.display());
Err(io::Error::new(io::ErrorKind::InvalidData, err))
}
},
Err(err) if err.kind() == io::ErrorKind::NotFound => {
if log_missing_as_info {
tracing::info!("{} not found, using defaults", path.display());
} else {
tracing::debug!("{} not found", path.display());
}
Ok(None)
}
Err(err) => {
tracing::error!("Failed to read {}: {err}", path.display());
Err(err)
}
}
}
/// Merge config `overlay` into `base`, giving `overlay` precedence.
pub(crate) fn merge_toml_values(base: &mut TomlValue, overlay: &TomlValue) {
if let TomlValue::Table(overlay_table) = overlay
&& let TomlValue::Table(base_table) = base
{
for (key, value) in overlay_table {
if let Some(existing) = base_table.get_mut(key) {
merge_toml_values(existing, value);
} else {
base_table.insert(key.clone(), value.clone());
}
}
} else {
*base = overlay.clone();
}
}
fn managed_config_default_path(codex_home: &Path) -> PathBuf {
#[cfg(unix)]
{
let _ = codex_home;
PathBuf::from(CODEX_MANAGED_CONFIG_SYSTEM_PATH)
}
#[cfg(not(unix))]
{
codex_home.join("managed_config.toml")
}
}
fn apply_managed_layers(layers: LoadedConfigLayers) -> TomlValue {
let LoadedConfigLayers {
mut base,
managed_config,
managed_preferences,
} = layers;
for overlay in [managed_config, managed_preferences].into_iter().flatten() {
merge_toml_values(&mut base, &overlay);
}
base
}
#[cfg(test)]
mod tests {
use super::*;
use tempfile::tempdir;
#[tokio::test]
async fn merges_managed_config_layer_on_top() {
let tmp = tempdir().expect("tempdir");
let managed_path = tmp.path().join("managed_config.toml");
std::fs::write(
tmp.path().join(CONFIG_TOML_FILE),
r#"foo = 1
[nested]
value = "base"
"#,
)
.expect("write base");
std::fs::write(
&managed_path,
r#"foo = 2
[nested]
value = "managed_config"
extra = true
"#,
)
.expect("write managed config");
let overrides = LoaderOverrides {
managed_config_path: Some(managed_path),
#[cfg(target_os = "macos")]
managed_preferences_base64: None,
};
let loaded = load_config_as_toml_with_overrides(tmp.path(), overrides)
.await
.expect("load config");
let table = loaded.as_table().expect("top-level table expected");
assert_eq!(table.get("foo"), Some(&TomlValue::Integer(2)));
let nested = table
.get("nested")
.and_then(|v| v.as_table())
.expect("nested");
assert_eq!(
nested.get("value"),
Some(&TomlValue::String("managed_config".to_string()))
);
assert_eq!(nested.get("extra"), Some(&TomlValue::Boolean(true)));
}
#[tokio::test]
async fn returns_empty_when_all_layers_missing() {
let tmp = tempdir().expect("tempdir");
let managed_path = tmp.path().join("managed_config.toml");
let overrides = LoaderOverrides {
managed_config_path: Some(managed_path),
#[cfg(target_os = "macos")]
managed_preferences_base64: None,
};
let layers = load_config_layers_with_overrides(tmp.path(), overrides)
.await
.expect("load layers");
let base_table = layers.base.as_table().expect("base table expected");
assert!(
base_table.is_empty(),
"expected empty base layer when configs missing"
);
assert!(
layers.managed_config.is_none(),
"managed config layer should be absent when file missing"
);
#[cfg(not(target_os = "macos"))]
{
let loaded = load_config_as_toml(tmp.path()).await.expect("load config");
let table = loaded.as_table().expect("top-level table expected");
assert!(
table.is_empty(),
"expected empty table when configs missing"
);
}
}
#[cfg(target_os = "macos")]
#[tokio::test]
async fn managed_preferences_take_highest_precedence() {
use base64::Engine;
let managed_payload = r#"
[nested]
value = "managed"
flag = false
"#;
let encoded = base64::prelude::BASE64_STANDARD.encode(managed_payload.as_bytes());
let tmp = tempdir().expect("tempdir");
let managed_path = tmp.path().join("managed_config.toml");
std::fs::write(
tmp.path().join(CONFIG_TOML_FILE),
r#"[nested]
value = "base"
"#,
)
.expect("write base");
std::fs::write(
&managed_path,
r#"[nested]
value = "managed_config"
flag = true
"#,
)
.expect("write managed config");
let overrides = LoaderOverrides {
managed_config_path: Some(managed_path),
managed_preferences_base64: Some(encoded),
};
let loaded = load_config_as_toml_with_overrides(tmp.path(), overrides)
.await
.expect("load config");
let nested = loaded
.get("nested")
.and_then(|v| v.as_table())
.expect("nested table");
assert_eq!(
nested.get("value"),
Some(&TomlValue::String("managed".to_string()))
);
assert_eq!(nested.get("flag"), Some(&TomlValue::Boolean(false)));
}
}
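The recursive merge rule implemented by `merge_toml_values` (tables merge key by key, while scalars and arrays are replaced wholesale) can be illustrated without the `toml` crate. The sketch below is a std-only approximation: `Value`, `merge`, and `table` are hypothetical stand-ins, and the enum covers only a few of the variants a real TOML value has.

```rust
use std::collections::BTreeMap;

/// Hypothetical, pared-down stand-in for a TOML value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Integer(i64),
    Text(String),
    Array(Vec<Value>),
    Table(BTreeMap<String, Value>),
}

/// Merge `overlay` into `base`, giving `overlay` precedence.
pub fn merge(base: &mut Value, overlay: &Value) {
    if let (Value::Table(base_table), Value::Table(overlay_table)) = (&mut *base, overlay) {
        for (key, value) in overlay_table {
            if let Some(existing) = base_table.get_mut(key) {
                // Both layers define the key: recurse so nested tables merge.
                merge(existing, value);
            } else {
                // Only the overlay defines the key: copy it in.
                base_table.insert(key.clone(), value.clone());
            }
        }
        return;
    }
    // At least one side is not a table: the overlay replaces the base value.
    *base = overlay.clone();
}

/// Small helper for building a table from literal entries.
pub fn table(entries: Vec<(&str, Value)>) -> Value {
    Value::Table(
        entries
            .into_iter()
            .map(|(key, value)| (key.to_string(), value))
            .collect(),
    )
}
```

Because arrays are replaced rather than concatenated, an overlay that sets a list-valued key fully determines that list, while sibling keys in the same table that the overlay does not mention are left untouched.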


@@ -563,3 +563,34 @@ mod tests {
.expect_err("should reject bearer token for stdio transport");
}
}
#[derive(Deserialize, Debug, Clone, Default, PartialEq)]
pub struct AdminConfigToml {
    #[serde(default)]
    pub disallow_danger_full_access: Option<bool>,
    #[serde(default)]
    pub allow_danger_with_reason: Option<bool>,
    #[serde(default)]
    pub audit: Option<AdminAuditToml>,
}

#[derive(Deserialize, Debug, Clone, Default, PartialEq)]
pub struct AdminAuditToml {
    #[serde(default)]
    pub log_file: Option<String>,
    #[serde(default)]
    pub log_endpoint: Option<String>,
    #[serde(default)]
    pub log_events: Vec<AdminAuditEventKind>,
}

// `rename_all = "lowercase"` maps the variants to `danger` and `command`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum AdminAuditEventKind {
    Danger,
    Command,
}


@@ -108,6 +108,9 @@ pub enum CodexErr {
#[error("unsupported operation: {0}")]
UnsupportedOperation(String),
#[error("Fatal error: {0}")]
Fatal(String),
// -----------------------------------------------------------------
// Automatic conversions for common external error types
// -----------------------------------------------------------------


@@ -27,6 +27,7 @@ use crate::protocol::SandboxPolicy;
use crate::seatbelt::spawn_command_under_seatbelt;
use crate::spawn::StdioPolicy;
use crate::spawn::spawn_child_async;
+use serde::Serialize;
const DEFAULT_TIMEOUT_MS: u64 = 10_000;
@@ -61,7 +62,8 @@ impl ExecParams {
}
}
-#[derive(Clone, Copy, Debug, PartialEq)]
+#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize)]
+#[serde(rename_all = "kebab-case")]
pub enum SandboxType {
None,


@@ -1,7 +1,7 @@
use std::collections::BTreeMap;
use crate::client_common::tools::ResponsesApiTool;
use crate::openai_tools::JsonSchema;
use crate::openai_tools::ResponsesApiTool;
pub const EXEC_COMMAND_TOOL_NAME: &str = "exec_command";
pub const WRITE_STDIN_TOOL_NAME: &str = "write_stdin";


@@ -0,0 +1,101 @@
use std::collections::HashMap;
use std::env;
use async_trait::async_trait;
use crate::CODEX_APPLY_PATCH_ARG1;
use crate::apply_patch::ApplyPatchExec;
use crate::exec::ExecParams;
use crate::function_tool::FunctionCallError;
pub(crate) enum ExecutionMode {
Shell,
ApplyPatch(ApplyPatchExec),
}
#[async_trait]
/// Backend-specific hooks that prepare and post-process execution requests for a
/// given [`ExecutionMode`].
pub(crate) trait ExecutionBackend: Send + Sync {
fn prepare(
&self,
params: ExecParams,
// Required for downcasting the apply_patch.
mode: &ExecutionMode,
) -> Result<ExecParams, FunctionCallError>;
fn stream_stdout(&self, _mode: &ExecutionMode) -> bool {
true
}
}
static SHELL_BACKEND: ShellBackend = ShellBackend;
static APPLY_PATCH_BACKEND: ApplyPatchBackend = ApplyPatchBackend;
pub(crate) fn backend_for_mode(mode: &ExecutionMode) -> &'static dyn ExecutionBackend {
match mode {
ExecutionMode::Shell => &SHELL_BACKEND,
ExecutionMode::ApplyPatch(_) => &APPLY_PATCH_BACKEND,
}
}
struct ShellBackend;
#[async_trait]
impl ExecutionBackend for ShellBackend {
fn prepare(
&self,
params: ExecParams,
mode: &ExecutionMode,
) -> Result<ExecParams, FunctionCallError> {
match mode {
ExecutionMode::Shell => Ok(params),
_ => Err(FunctionCallError::RespondToModel(
"shell backend invoked with non-shell mode".to_string(),
)),
}
}
}
struct ApplyPatchBackend;
#[async_trait]
impl ExecutionBackend for ApplyPatchBackend {
fn prepare(
&self,
params: ExecParams,
mode: &ExecutionMode,
) -> Result<ExecParams, FunctionCallError> {
match mode {
ExecutionMode::ApplyPatch(exec) => {
let path_to_codex = env::current_exe()
.ok()
.map(|p| p.to_string_lossy().to_string())
.ok_or_else(|| {
FunctionCallError::RespondToModel(
"failed to determine path to codex executable".to_string(),
)
})?;
let patch = exec.action.patch.clone();
Ok(ExecParams {
command: vec![path_to_codex, CODEX_APPLY_PATCH_ARG1.to_string(), patch],
cwd: exec.action.cwd.clone(),
timeout_ms: params.timeout_ms,
// Run apply_patch with a minimal environment for determinism and to
// avoid leaking host environment variables into the patch process.
env: HashMap::new(),
with_escalated_permissions: params.with_escalated_permissions,
justification: params.justification,
})
}
ExecutionMode::Shell => Err(FunctionCallError::RespondToModel(
"apply_patch backend invoked without patch context".to_string(),
)),
}
}
fn stream_stdout(&self, _mode: &ExecutionMode) -> bool {
false
}
}
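The backend lookup above hands out `&'static` trait objects backed by zero-sized statics, so choosing a backend allocates nothing. Below is a reduced std-only sketch of the same dispatch pattern. `Mode`, `Backend`, and the placeholder strings `codex` and `<apply-patch-arg>` are assumptions made for illustration; the real code uses `ExecParams`, `FunctionCallError`, and `CODEX_APPLY_PATCH_ARG1`.

```rust
/// Hypothetical, reduced stand-in for `ExecutionMode`.
enum Mode {
    Shell,
    ApplyPatch { patch: String },
}

/// Hypothetical, reduced stand-in for `ExecutionBackend`.
trait Backend: Sync {
    fn prepare(&self, command: Vec<String>, mode: &Mode) -> Result<Vec<String>, String>;

    /// Backends stream stdout unless they opt out.
    fn stream_stdout(&self) -> bool {
        true
    }
}

struct ShellBackend;
struct ApplyPatchBackend;

impl Backend for ShellBackend {
    fn prepare(&self, command: Vec<String>, mode: &Mode) -> Result<Vec<String>, String> {
        match mode {
            // Shell commands pass through unchanged.
            Mode::Shell => Ok(command),
            Mode::ApplyPatch { .. } => {
                Err("shell backend invoked with non-shell mode".to_string())
            }
        }
    }
}

impl Backend for ApplyPatchBackend {
    fn prepare(&self, _command: Vec<String>, mode: &Mode) -> Result<Vec<String>, String> {
        match mode {
            // Placeholder argv: the real executable path and flag are not reproduced here.
            Mode::ApplyPatch { patch } => Ok(vec![
                "codex".to_string(),
                "<apply-patch-arg>".to_string(),
                patch.clone(),
            ]),
            Mode::Shell => Err("apply_patch backend invoked without patch context".to_string()),
        }
    }

    fn stream_stdout(&self) -> bool {
        false
    }
}

static SHELL_BACKEND: ShellBackend = ShellBackend;
static APPLY_PATCH_BACKEND: ApplyPatchBackend = ApplyPatchBackend;

/// Picks the backend for a mode. Each arm coerces to `&'static dyn Backend`.
fn backend_for_mode(mode: &Mode) -> &'static dyn Backend {
    match mode {
        Mode::Shell => &SHELL_BACKEND,
        Mode::ApplyPatch { .. } => &APPLY_PATCH_BACKEND,
    }
}
```

Passing the mode to `prepare` a second time looks redundant, but it lets each backend reject a mismatched mode with an error instead of panicking.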


@@ -0,0 +1,51 @@
use std::collections::HashSet;
use std::sync::Arc;
use std::sync::Mutex;

/// Thread-safe store of user approvals so repeated commands can reuse
/// previously granted trust.
#[derive(Clone, Debug, Default)]
pub(crate) struct ApprovalCache {
    inner: Arc<Mutex<HashSet<Vec<String>>>>,
}

impl ApprovalCache {
    pub(crate) fn insert(&self, command: Vec<String>) {
        if command.is_empty() {
            return;
        }
        if let Ok(mut guard) = self.inner.lock() {
            guard.insert(command);
        }
    }

    pub(crate) fn snapshot(&self) -> HashSet<Vec<String>> {
        self.inner.lock().map(|g| g.clone()).unwrap_or_default()
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use pretty_assertions::assert_eq;

    #[test]
    fn insert_ignores_empty_and_dedupes() {
        let cache = ApprovalCache::default();

        // Empty should be ignored
        cache.insert(vec![]);
        assert!(cache.snapshot().is_empty());

        // Insert a command and verify snapshot contains it
        let cmd = vec!["foo".to_string(), "bar".to_string()];
        cache.insert(cmd.clone());
        let snap1 = cache.snapshot();
        assert!(snap1.contains(&cmd));

        // Reinserting should not create duplicates
        cache.insert(cmd);
        let snap2 = cache.snapshot();
        assert_eq!(snap1, snap2);
    }
}
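Because the cache wraps its set in `Arc<Mutex<...>>`, clones share state, while `snapshot` returns a detached copy that later inserts do not change. The sketch below restates the cache with std only and adds a hypothetical helper, `approve_concurrently`, to show both properties. The helper is not part of the change.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

/// Std-only restatement of the approval cache, for illustration.
#[derive(Clone, Default)]
struct ApprovalCache {
    inner: Arc<Mutex<HashSet<Vec<String>>>>,
}

impl ApprovalCache {
    fn insert(&self, command: Vec<String>) {
        if command.is_empty() {
            return;
        }
        // A poisoned lock is silently skipped, as in the original.
        if let Ok(mut guard) = self.inner.lock() {
            guard.insert(command);
        }
    }

    fn snapshot(&self) -> HashSet<Vec<String>> {
        self.inner.lock().map(|g| g.clone()).unwrap_or_default()
    }
}

/// Hypothetical helper: records each command from its own thread through a
/// clone of the cache, then returns a snapshot taken after all threads finish.
fn approve_concurrently(
    cache: &ApprovalCache,
    commands: Vec<Vec<String>>,
) -> HashSet<Vec<String>> {
    let handles: Vec<_> = commands
        .into_iter()
        .map(|command| {
            let shared = cache.clone();
            thread::spawn(move || shared.insert(command))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    cache.snapshot()
}
```

Handing a snapshot to the safety check, rather than the live cache, keeps the mutex from being held while a decision is computed or a user prompt is pending.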


@@ -0,0 +1,64 @@
mod backends;
mod cache;
mod runner;
mod sandbox;
pub(crate) use backends::ExecutionMode;
pub(crate) use runner::ExecutionRequest;
pub(crate) use runner::Executor;
pub(crate) use runner::ExecutorConfig;
pub(crate) use runner::normalize_exec_result;
pub(crate) mod linkers {
use crate::exec::ExecParams;
use crate::exec::StdoutStream;
use crate::executor::backends::ExecutionMode;
use crate::executor::runner::ExecutionRequest;
use crate::tools::context::ExecCommandContext;
pub struct PreparedExec {
pub(crate) context: ExecCommandContext,
pub(crate) request: ExecutionRequest,
}
impl PreparedExec {
pub fn new(
context: ExecCommandContext,
params: ExecParams,
approval_command: Vec<String>,
mode: ExecutionMode,
stdout_stream: Option<StdoutStream>,
use_shell_profile: bool,
) -> Self {
let request = ExecutionRequest {
params,
approval_command,
mode,
stdout_stream,
use_shell_profile,
};
Self { context, request }
}
}
}
pub mod errors {
use crate::error::CodexErr;
use crate::function_tool::FunctionCallError;
use thiserror::Error;
#[derive(Debug, Error)]
pub enum ExecError {
#[error(transparent)]
Function(#[from] FunctionCallError),
#[error(transparent)]
Codex(#[from] CodexErr),
}
impl ExecError {
pub(crate) fn rejection(msg: impl Into<String>) -> Self {
FunctionCallError::RespondToModel(msg.into()).into()
}
}
}


@@ -0,0 +1,427 @@
use std::path::PathBuf;
use std::sync::Arc;
use std::sync::RwLock;
use std::time::Duration;
use super::backends::ExecutionMode;
use super::backends::backend_for_mode;
use super::cache::ApprovalCache;
use crate::admin_controls::AdminAuditConfig;
use crate::admin_controls::build_command_audit_payload;
use crate::admin_controls::log_admin_event;
use crate::codex::Session;
use crate::config_types::AdminAuditEventKind;
use crate::error::CodexErr;
use crate::error::SandboxErr;
use crate::error::get_error_message_ui;
use crate::exec::ExecParams;
use crate::exec::ExecToolCallOutput;
use crate::exec::SandboxType;
use crate::exec::StdoutStream;
use crate::exec::StreamOutput;
use crate::exec::process_exec_tool_call;
use crate::executor::errors::ExecError;
use crate::executor::sandbox::select_sandbox;
use crate::function_tool::FunctionCallError;
use crate::protocol::AskForApproval;
use crate::protocol::ReviewDecision;
use crate::protocol::SandboxPolicy;
use crate::shell;
use crate::tools::context::ExecCommandContext;
use codex_otel::otel_event_manager::ToolDecisionSource;
#[derive(Clone, Debug)]
pub(crate) struct ExecutorConfig {
pub(crate) sandbox_policy: SandboxPolicy,
pub(crate) sandbox_cwd: PathBuf,
codex_linux_sandbox_exe: Option<PathBuf>,
pub(crate) admin_audit: Option<AdminAuditConfig>,
}
impl ExecutorConfig {
pub(crate) fn new(
sandbox_policy: SandboxPolicy,
sandbox_cwd: PathBuf,
codex_linux_sandbox_exe: Option<PathBuf>,
admin_audit: Option<AdminAuditConfig>,
) -> Self {
Self {
sandbox_policy,
sandbox_cwd,
codex_linux_sandbox_exe,
admin_audit,
}
}
}
/// Coordinates sandbox selection, backend-specific preparation, and command
/// execution for tool calls requested by the model.
pub(crate) struct Executor {
approval_cache: ApprovalCache,
config: Arc<RwLock<ExecutorConfig>>,
}
impl Executor {
pub(crate) fn new(config: ExecutorConfig) -> Self {
Self {
approval_cache: ApprovalCache::default(),
config: Arc::new(RwLock::new(config)),
}
}
/// Updates the sandbox policy and working directory used for future
/// executions without recreating the executor.
pub(crate) fn update_environment(&self, sandbox_policy: SandboxPolicy, sandbox_cwd: PathBuf) {
if let Ok(mut cfg) = self.config.write() {
cfg.sandbox_policy = sandbox_policy;
cfg.sandbox_cwd = sandbox_cwd;
}
}
/// Runs a prepared execution request end-to-end: prepares parameters, decides on
/// sandbox placement (prompting the user when necessary), launches the command,
/// and lets the backend post-process the final output.
pub(crate) async fn run(
&self,
mut request: ExecutionRequest,
session: &Session,
approval_policy: AskForApproval,
context: &ExecCommandContext,
) -> Result<ExecToolCallOutput, ExecError> {
if matches!(request.mode, ExecutionMode::Shell) {
request.params =
maybe_translate_shell_command(request.params, session, request.use_shell_profile);
}
// Step 1: Normalise parameters via the selected backend.
let backend = backend_for_mode(&request.mode);
let stdout_stream = if backend.stream_stdout(&request.mode) {
request.stdout_stream.clone()
} else {
None
};
request.params = backend
.prepare(request.params, &request.mode)
.map_err(ExecError::from)?;
// Step 2: Snapshot sandbox configuration so it stays stable for this run.
let config = self
.config
.read()
.map_err(|_| ExecError::rejection("executor config poisoned"))?
.clone();
// Step 3: Decide sandbox placement, prompting for approval when needed.
let sandbox_decision = select_sandbox(
&request,
approval_policy,
self.approval_cache.snapshot(),
&config,
session,
&context.sub_id,
&context.call_id,
&context.otel_event_manager,
)
.await?;
if sandbox_decision.record_session_approval {
self.approval_cache.insert(request.approval_command.clone());
}
// Step 4: Launch the command within the chosen sandbox.
let first_attempt = self
.spawn(
request.params.clone(),
sandbox_decision.initial_sandbox,
&config,
stdout_stream.clone(),
)
.await;
// Step 5: Handle sandbox outcomes, optionally escalating to an unsandboxed retry.
match first_attempt {
Ok(output) => Ok(output),
Err(CodexErr::Sandbox(SandboxErr::Timeout { output })) => {
Err(CodexErr::Sandbox(SandboxErr::Timeout { output }).into())
}
Err(CodexErr::Sandbox(error)) => {
if sandbox_decision.escalate_on_failure {
self.retry_without_sandbox(
&request,
&config,
session,
context,
stdout_stream,
error,
)
.await
} else {
let message = sandbox_failure_message(error);
Err(ExecError::rejection(message))
}
}
Err(err) => Err(err.into()),
}
}
/// Fallback path invoked when a sandboxed run is denied so the user can
/// approve rerunning without isolation.
async fn retry_without_sandbox(
&self,
request: &ExecutionRequest,
config: &ExecutorConfig,
session: &Session,
context: &ExecCommandContext,
stdout_stream: Option<StdoutStream>,
sandbox_error: SandboxErr,
) -> Result<ExecToolCallOutput, ExecError> {
session
.notify_background_event(
&context.sub_id,
format!("Execution failed: {sandbox_error}"),
)
.await;
let decision = session
.request_command_approval(
context.sub_id.to_string(),
context.call_id.to_string(),
request.approval_command.clone(),
request.params.cwd.clone(),
Some("command failed; retry without sandbox?".to_string()),
)
.await;
context.otel_event_manager.tool_decision(
&context.tool_name,
&context.call_id,
decision,
ToolDecisionSource::User,
);
match decision {
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {
if matches!(decision, ReviewDecision::ApprovedForSession) {
self.approval_cache.insert(request.approval_command.clone());
}
session
.notify_background_event(&context.sub_id, "retrying command without sandbox")
.await;
let retry_output = self
.spawn(
request.params.clone(),
SandboxType::None,
config,
stdout_stream,
)
.await?;
Ok(retry_output)
}
ReviewDecision::Denied | ReviewDecision::Abort => {
Err(ExecError::rejection("exec command rejected by user"))
}
}
}
async fn spawn(
&self,
params: ExecParams,
sandbox: SandboxType,
config: &ExecutorConfig,
stdout_stream: Option<StdoutStream>,
) -> Result<ExecToolCallOutput, CodexErr> {
if let Some(admin_audit) = config.admin_audit.as_ref()
&& admin_audit.should_log(AdminAuditEventKind::Command)
{
let payload = build_command_audit_payload(
&params,
sandbox,
&config.sandbox_policy,
&config.sandbox_cwd,
);
log_admin_event(admin_audit, payload);
}
process_exec_tool_call(
params,
sandbox,
&config.sandbox_policy,
&config.sandbox_cwd,
&config.codex_linux_sandbox_exe,
stdout_stream,
)
.await
}
}
fn maybe_translate_shell_command(
params: ExecParams,
session: &Session,
use_shell_profile: bool,
) -> ExecParams {
let should_translate =
matches!(session.user_shell(), shell::Shell::PowerShell(_)) || use_shell_profile;
if should_translate
&& let Some(command) = session
.user_shell()
.format_default_shell_invocation(params.command.clone())
{
return ExecParams { command, ..params };
}
params
}
fn sandbox_failure_message(error: SandboxErr) -> String {
let codex_error = CodexErr::Sandbox(error);
let friendly = get_error_message_ui(&codex_error);
format!("failed in sandbox: {friendly}")
}
pub(crate) struct ExecutionRequest {
pub params: ExecParams,
pub approval_command: Vec<String>,
pub mode: ExecutionMode,
pub stdout_stream: Option<StdoutStream>,
pub use_shell_profile: bool,
}
pub(crate) struct NormalizedExecOutput<'a> {
borrowed: Option<&'a ExecToolCallOutput>,
synthetic: Option<ExecToolCallOutput>,
}
impl<'a> NormalizedExecOutput<'a> {
pub(crate) fn event_output(&'a self) -> &'a ExecToolCallOutput {
match (self.borrowed, self.synthetic.as_ref()) {
(Some(output), _) => output,
(None, Some(output)) => output,
(None, None) => unreachable!("normalized exec output missing data"),
}
}
}
/// Converts a raw execution result into a uniform view that always exposes an
/// [`ExecToolCallOutput`], synthesizing error output when the command fails
/// before producing a response.
pub(crate) fn normalize_exec_result(
result: &Result<ExecToolCallOutput, ExecError>,
) -> NormalizedExecOutput<'_> {
match result {
Ok(output) => NormalizedExecOutput {
borrowed: Some(output),
synthetic: None,
},
Err(ExecError::Codex(CodexErr::Sandbox(SandboxErr::Timeout { output }))) => {
NormalizedExecOutput {
borrowed: Some(output.as_ref()),
synthetic: None,
}
}
Err(err) => {
let message = match err {
ExecError::Function(FunctionCallError::RespondToModel(msg)) => msg.clone(),
ExecError::Codex(e) => get_error_message_ui(e),
err => err.to_string(),
};
let synthetic = ExecToolCallOutput {
exit_code: -1,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(message.clone()),
aggregated_output: StreamOutput::new(message),
duration: Duration::default(),
timed_out: false,
};
NormalizedExecOutput {
borrowed: None,
synthetic: Some(synthetic),
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::error::CodexErr;
use crate::error::EnvVarError;
use crate::error::SandboxErr;
use crate::exec::StreamOutput;
use pretty_assertions::assert_eq;
fn make_output(text: &str) -> ExecToolCallOutput {
ExecToolCallOutput {
exit_code: 1,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(text.to_string()),
duration: Duration::from_millis(123),
timed_out: false,
}
}
#[test]
fn normalize_success_borrows() {
let out = make_output("ok");
let result: Result<ExecToolCallOutput, ExecError> = Ok(out);
let normalized = normalize_exec_result(&result);
assert_eq!(normalized.event_output().aggregated_output.text, "ok");
}
#[test]
fn normalize_timeout_borrows_embedded_output() {
let out = make_output("timed out payload");
let err = CodexErr::Sandbox(SandboxErr::Timeout {
output: Box::new(out),
});
let result: Result<ExecToolCallOutput, ExecError> = Err(ExecError::Codex(err));
let normalized = normalize_exec_result(&result);
assert_eq!(
normalized.event_output().aggregated_output.text,
"timed out payload"
);
}
#[test]
fn sandbox_failure_message_uses_denied_stderr() {
let output = ExecToolCallOutput {
exit_code: 101,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new("sandbox stderr".to_string()),
aggregated_output: StreamOutput::new(String::new()),
duration: Duration::from_millis(10),
timed_out: false,
};
let err = SandboxErr::Denied {
output: Box::new(output),
};
let message = sandbox_failure_message(err);
assert_eq!(message, "failed in sandbox: sandbox stderr");
}
#[test]
fn normalize_function_error_synthesizes_payload() {
let err = FunctionCallError::RespondToModel("boom".to_string());
let result: Result<ExecToolCallOutput, ExecError> = Err(ExecError::Function(err));
let normalized = normalize_exec_result(&result);
assert_eq!(normalized.event_output().aggregated_output.text, "boom");
}
#[test]
fn normalize_codex_error_synthesizes_user_message() {
// Use a simple EnvVar error which formats to a clear message
let e = CodexErr::EnvVar(EnvVarError {
var: "FOO".to_string(),
instructions: Some("set it".to_string()),
});
let result: Result<ExecToolCallOutput, ExecError> = Err(ExecError::Codex(e));
let normalized = normalize_exec_result(&result);
assert!(
normalized
.event_output()
.aggregated_output
.text
.contains("Missing environment variable: `FOO`"),
"expected synthesized user-friendly message"
);
}
}
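Step 5 of `Executor::run` reduces to a small decision tree: success and timeouts are returned as they are, other sandbox failures either trigger a user-approved unsandboxed retry or become a rejection, and everything else propagates. The sketch below models only that control flow. `FirstAttempt`, `resolve`, and the string-based results are hypothetical simplifications; the real code carries `ExecToolCallOutput` and `ExecError`, and the timeout message here is invented for the sketch.

```rust
/// Hypothetical, reduced outcome of the first (possibly sandboxed) attempt.
#[derive(Debug, PartialEq)]
enum FirstAttempt {
    Success(String),
    Timeout(String),
    SandboxDenied(String),
    OtherError(String),
}

/// Resolves the first attempt into a final result.
///
/// `ask_user` stands in for the approval prompt and `retry_unsandboxed` for
/// the second spawn with no sandbox; both run only when escalation is allowed.
fn resolve<A, R>(
    first: FirstAttempt,
    escalate_on_failure: bool,
    ask_user: A,
    retry_unsandboxed: R,
) -> Result<String, String>
where
    A: FnOnce() -> bool,
    R: FnOnce() -> Result<String, String>,
{
    match first {
        FirstAttempt::Success(output) => Ok(output),
        // Timeouts never escalate; the error is handed back with its output.
        FirstAttempt::Timeout(output) => Err(format!("timed out: {}", output)),
        FirstAttempt::SandboxDenied(reason) => {
            if !escalate_on_failure {
                return Err(format!("failed in sandbox: {}", reason));
            }
            if ask_user() {
                retry_unsandboxed()
            } else {
                Err("exec command rejected by user".to_string())
            }
        }
        FirstAttempt::OtherError(message) => Err(message),
    }
}
```

Taking the prompt and the retry as closures keeps the sketch synchronous and testable; the real method awaits the session for both.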


@@ -0,0 +1,415 @@
use crate::apply_patch::ApplyPatchExec;
use crate::codex::Session;
use crate::exec::SandboxType;
use crate::executor::ExecutionMode;
use crate::executor::ExecutionRequest;
use crate::executor::ExecutorConfig;
use crate::executor::errors::ExecError;
use crate::safety::SafetyCheck;
use crate::safety::assess_command_safety;
use crate::safety::assess_patch_safety;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_otel::otel_event_manager::ToolDecisionSource;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::ReviewDecision;
use std::collections::HashSet;
/// Sandbox placement options selected for an execution run, including whether
/// to escalate after failures and whether approvals should persist.
pub(crate) struct SandboxDecision {
    pub(crate) initial_sandbox: SandboxType,
    pub(crate) escalate_on_failure: bool,
    pub(crate) record_session_approval: bool,
}

impl SandboxDecision {
    fn auto(sandbox: SandboxType, escalate_on_failure: bool) -> Self {
        Self {
            initial_sandbox: sandbox,
            escalate_on_failure,
            record_session_approval: false,
        }
    }

    fn user_override(record_session_approval: bool) -> Self {
        Self {
            initial_sandbox: SandboxType::None,
            escalate_on_failure: false,
            record_session_approval,
        }
    }
}

fn should_escalate_on_failure(approval: AskForApproval, sandbox: SandboxType) -> bool {
    matches!(
        (approval, sandbox),
        (
            AskForApproval::UnlessTrusted | AskForApproval::OnFailure,
            SandboxType::MacosSeatbelt | SandboxType::LinuxSeccomp
        )
    )
}
/// Determines how a command should be sandboxed, prompting the user when
/// policy requires explicit approval.
#[allow(clippy::too_many_arguments)]
pub async fn select_sandbox(
request: &ExecutionRequest,
approval_policy: AskForApproval,
approval_cache: HashSet<Vec<String>>,
config: &ExecutorConfig,
session: &Session,
sub_id: &str,
call_id: &str,
otel_event_manager: &OtelEventManager,
) -> Result<SandboxDecision, ExecError> {
match &request.mode {
ExecutionMode::Shell => {
select_shell_sandbox(
request,
approval_policy,
approval_cache,
config,
session,
sub_id,
call_id,
otel_event_manager,
)
.await
}
ExecutionMode::ApplyPatch(exec) => {
select_apply_patch_sandbox(exec, approval_policy, config)
}
}
}
#[allow(clippy::too_many_arguments)]
async fn select_shell_sandbox(
request: &ExecutionRequest,
approval_policy: AskForApproval,
approved_snapshot: HashSet<Vec<String>>,
config: &ExecutorConfig,
session: &Session,
sub_id: &str,
call_id: &str,
otel_event_manager: &OtelEventManager,
) -> Result<SandboxDecision, ExecError> {
let command_for_safety = if request.approval_command.is_empty() {
request.params.command.clone()
} else {
request.approval_command.clone()
};
let safety = assess_command_safety(
&command_for_safety,
approval_policy,
&config.sandbox_policy,
&approved_snapshot,
request.params.with_escalated_permissions.unwrap_or(false),
);
match safety {
SafetyCheck::AutoApprove {
sandbox_type,
user_explicitly_approved,
} => {
let mut decision = SandboxDecision::auto(
sandbox_type,
should_escalate_on_failure(approval_policy, sandbox_type),
);
if user_explicitly_approved {
decision.record_session_approval = true;
}
let (decision_for_event, source) = if user_explicitly_approved {
(ReviewDecision::ApprovedForSession, ToolDecisionSource::User)
} else {
(ReviewDecision::Approved, ToolDecisionSource::Config)
};
otel_event_manager.tool_decision("local_shell", call_id, decision_for_event, source);
Ok(decision)
}
SafetyCheck::AskUser => {
let decision = session
.request_command_approval(
sub_id.to_string(),
call_id.to_string(),
request.approval_command.clone(),
request.params.cwd.clone(),
request.params.justification.clone(),
)
.await;
otel_event_manager.tool_decision(
"local_shell",
call_id,
decision,
ToolDecisionSource::User,
);
match decision {
ReviewDecision::Approved => Ok(SandboxDecision::user_override(false)),
ReviewDecision::ApprovedForSession => Ok(SandboxDecision::user_override(true)),
ReviewDecision::Denied | ReviewDecision::Abort => {
Err(ExecError::rejection("exec command rejected by user"))
}
}
}
SafetyCheck::Reject { reason } => Err(ExecError::rejection(format!(
"exec command rejected: {reason}"
))),
}
}
fn select_apply_patch_sandbox(
exec: &ApplyPatchExec,
approval_policy: AskForApproval,
config: &ExecutorConfig,
) -> Result<SandboxDecision, ExecError> {
if exec.user_explicitly_approved_this_action {
return Ok(SandboxDecision::user_override(false));
}
match assess_patch_safety(
&exec.action,
approval_policy,
&config.sandbox_policy,
&config.sandbox_cwd,
) {
SafetyCheck::AutoApprove { sandbox_type, .. } => Ok(SandboxDecision::auto(
sandbox_type,
should_escalate_on_failure(approval_policy, sandbox_type),
)),
SafetyCheck::AskUser => Err(ExecError::rejection(
"patch requires approval but none was recorded",
)),
SafetyCheck::Reject { reason } => {
Err(ExecError::rejection(format!("patch rejected: {reason}")))
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::codex::make_session_and_context;
use crate::exec::ExecParams;
use crate::function_tool::FunctionCallError;
use crate::protocol::SandboxPolicy;
use codex_apply_patch::ApplyPatchAction;
use pretty_assertions::assert_eq;
#[tokio::test]
async fn select_apply_patch_user_override_when_explicit() {
let (session, ctx) = make_session_and_context();
let tmp = tempfile::tempdir().expect("tmp");
let p = tmp.path().join("a.txt");
let action = ApplyPatchAction::new_add_for_test(&p, "hello".to_string());
let exec = ApplyPatchExec {
action,
user_explicitly_approved_this_action: true,
};
let cfg = ExecutorConfig::new(SandboxPolicy::ReadOnly, std::env::temp_dir(), None, None);
let request = ExecutionRequest {
params: ExecParams {
command: vec!["apply_patch".into()],
cwd: std::env::temp_dir(),
timeout_ms: None,
env: std::collections::HashMap::new(),
with_escalated_permissions: None,
justification: None,
},
approval_command: vec!["apply_patch".into()],
mode: ExecutionMode::ApplyPatch(exec),
stdout_stream: None,
use_shell_profile: false,
};
let otel_event_manager = ctx.client.get_otel_event_manager();
let decision = select_sandbox(
&request,
AskForApproval::OnRequest,
Default::default(),
&cfg,
&session,
"sub",
"call",
&otel_event_manager,
)
.await
.expect("ok");
// Explicit user override runs without sandbox
assert_eq!(decision.initial_sandbox, SandboxType::None);
assert_eq!(decision.escalate_on_failure, false);
}
#[tokio::test]
async fn select_apply_patch_autoapprove_in_danger() {
let (session, ctx) = make_session_and_context();
let tmp = tempfile::tempdir().expect("tmp");
let p = tmp.path().join("a.txt");
let action = ApplyPatchAction::new_add_for_test(&p, "hello".to_string());
let exec = ApplyPatchExec {
action,
user_explicitly_approved_this_action: false,
};
let cfg = ExecutorConfig::new(
SandboxPolicy::DangerFullAccess,
std::env::temp_dir(),
None,
None,
);
let request = ExecutionRequest {
params: ExecParams {
command: vec!["apply_patch".into()],
cwd: std::env::temp_dir(),
timeout_ms: None,
env: std::collections::HashMap::new(),
with_escalated_permissions: None,
justification: None,
},
approval_command: vec!["apply_patch".into()],
mode: ExecutionMode::ApplyPatch(exec),
stdout_stream: None,
use_shell_profile: false,
};
let otel_event_manager = ctx.client.get_otel_event_manager();
let decision = select_sandbox(
&request,
AskForApproval::OnRequest,
Default::default(),
&cfg,
&session,
"sub",
"call",
&otel_event_manager,
)
.await
.expect("ok");
// On platforms with a sandbox, DangerFullAccess still prefers it
let expected = crate::safety::get_platform_sandbox().unwrap_or(SandboxType::None);
assert_eq!(decision.initial_sandbox, expected);
assert_eq!(decision.escalate_on_failure, false);
}
#[tokio::test]
async fn select_apply_patch_requires_approval_on_unless_trusted() {
let (session, ctx) = make_session_and_context();
let tempdir = tempfile::tempdir().expect("tmpdir");
let p = tempdir.path().join("a.txt");
let action = ApplyPatchAction::new_add_for_test(&p, "hello".to_string());
let exec = ApplyPatchExec {
action,
user_explicitly_approved_this_action: false,
};
let cfg = ExecutorConfig::new(SandboxPolicy::ReadOnly, std::env::temp_dir(), None, None);
let request = ExecutionRequest {
params: ExecParams {
command: vec!["apply_patch".into()],
cwd: std::env::temp_dir(),
timeout_ms: None,
env: std::collections::HashMap::new(),
with_escalated_permissions: None,
justification: None,
},
approval_command: vec!["apply_patch".into()],
mode: ExecutionMode::ApplyPatch(exec),
stdout_stream: None,
use_shell_profile: false,
};
let otel_event_manager = ctx.client.get_otel_event_manager();
let result = select_sandbox(
&request,
AskForApproval::UnlessTrusted,
Default::default(),
&cfg,
&session,
"sub",
"call",
&otel_event_manager,
)
.await;
match result {
Ok(_) => panic!("expected error"),
Err(ExecError::Function(FunctionCallError::RespondToModel(msg))) => {
assert!(msg.contains("requires approval"))
}
Err(other) => panic!("unexpected error: {other:?}"),
}
}
#[tokio::test]
async fn select_shell_autoapprove_in_danger_mode() {
let (session, ctx) = make_session_and_context();
let cfg = ExecutorConfig::new(
SandboxPolicy::DangerFullAccess,
std::env::temp_dir(),
None,
None,
);
let request = ExecutionRequest {
params: ExecParams {
command: vec!["some-unknown".into()],
cwd: std::env::temp_dir(),
timeout_ms: None,
env: std::collections::HashMap::new(),
with_escalated_permissions: None,
justification: None,
},
approval_command: vec!["some-unknown".into()],
mode: ExecutionMode::Shell,
stdout_stream: None,
use_shell_profile: false,
};
let otel_event_manager = ctx.client.get_otel_event_manager();
let decision = select_sandbox(
&request,
AskForApproval::OnRequest,
Default::default(),
&cfg,
&session,
"sub",
"call",
&otel_event_manager,
)
.await
.expect("ok");
assert_eq!(decision.initial_sandbox, SandboxType::None);
assert_eq!(decision.escalate_on_failure, false);
}
#[cfg(any(target_os = "macos", target_os = "linux"))]
#[tokio::test]
async fn select_shell_escalates_on_failure_with_platform_sandbox() {
let (session, ctx) = make_session_and_context();
let cfg = ExecutorConfig::new(SandboxPolicy::ReadOnly, std::env::temp_dir(), None, None);
let request = ExecutionRequest {
params: ExecParams {
// Unknown command => untrusted but not flagged dangerous
command: vec!["some-unknown".into()],
cwd: std::env::temp_dir(),
timeout_ms: None,
env: std::collections::HashMap::new(),
with_escalated_permissions: None,
justification: None,
},
approval_command: vec!["some-unknown".into()],
mode: ExecutionMode::Shell,
stdout_stream: None,
use_shell_profile: false,
};
let otel_event_manager = ctx.client.get_otel_event_manager();
let decision = select_sandbox(
&request,
AskForApproval::OnFailure,
Default::default(),
&cfg,
&session,
"sub",
"call",
&otel_event_manager,
)
.await
.expect("ok");
// On macOS/Linux we should have a platform sandbox and escalate on failure
assert_ne!(decision.initial_sandbox, SandboxType::None);
assert_eq!(decision.escalate_on_failure, true);
}
}
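The shell path of `select_sandbox` maps a safety verdict, plus the user's answer when one is needed, onto a `SandboxDecision`. The std-only sketch below reproduces that mapping with reduced, hypothetical types (`Sandbox`, `Approval`, `Safety`, `Review`, `Decision`). It lists only the approval policies that appear in this diff and leaves out telemetry and the async prompt.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Sandbox {
    None,
    MacosSeatbelt,
    LinuxSeccomp,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Approval {
    UnlessTrusted,
    OnFailure,
    OnRequest,
}

enum Safety {
    AutoApprove { sandbox: Sandbox, user_explicitly_approved: bool },
    AskUser,
    Reject(String),
}

/// The answer the user would give if prompted (hypothetical stand-in).
#[derive(Clone, Copy)]
enum Review {
    Approved,
    ApprovedForSession,
    Denied,
}

#[derive(Debug, PartialEq)]
struct Decision {
    initial_sandbox: Sandbox,
    escalate_on_failure: bool,
    record_session_approval: bool,
}

/// Escalation applies only when a real platform sandbox is in use and the
/// policy permits asking after a failure.
fn should_escalate(approval: Approval, sandbox: Sandbox) -> bool {
    matches!(
        (approval, sandbox),
        (
            Approval::UnlessTrusted | Approval::OnFailure,
            Sandbox::MacosSeatbelt | Sandbox::LinuxSeccomp
        )
    )
}

fn decide_shell(safety: Safety, approval: Approval, review: Review) -> Result<Decision, String> {
    // A user-approved run skips the sandbox and never escalates.
    let user_override = |record_session_approval: bool| Decision {
        initial_sandbox: Sandbox::None,
        escalate_on_failure: false,
        record_session_approval,
    };
    match safety {
        Safety::AutoApprove { sandbox, user_explicitly_approved } => Ok(Decision {
            initial_sandbox: sandbox,
            escalate_on_failure: should_escalate(approval, sandbox),
            record_session_approval: user_explicitly_approved,
        }),
        Safety::AskUser => match review {
            Review::Approved => Ok(user_override(false)),
            Review::ApprovedForSession => Ok(user_override(true)),
            Review::Denied => Err("exec command rejected by user".to_string()),
        },
        Safety::Reject(reason) => Err(format!("exec command rejected: {}", reason)),
    }
}
```

In this sketch `review` is consulted only for `Safety::AskUser`, which mirrors the original prompting the user only on that branch.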


@@ -4,4 +4,8 @@ use thiserror::Error;
pub enum FunctionCallError {
#[error("{0}")]
RespondToModel(String),
#[error("LocalShellCall without call_id or id")]
MissingLocalShellCallId,
#[error("Fatal error: {0}")]
Fatal(String),
}


@@ -5,6 +5,7 @@
// the TUI or the tracing stack).
#![deny(clippy::print_stdout, clippy::print_stderr)]
pub mod admin_controls;
mod apply_patch;
pub mod auth;
pub mod bash;
@@ -18,6 +19,7 @@ pub use codex_conversation::CodexConversation;
mod command_safety;
pub mod config;
pub mod config_edit;
pub mod config_loader;
pub mod config_profile;
pub mod config_types;
mod conversation_history;
@@ -27,6 +29,7 @@ pub mod error;
pub mod exec;
mod exec_command;
pub mod exec_env;
pub mod executor;
mod flags;
pub mod git_info;
pub mod landlock;
@@ -35,6 +38,7 @@ mod mcp_tool_call;
mod message_history;
mod model_provider_info;
pub mod parse_command;
mod path_utils;
mod truncate;
mod unified_exec;
mod user_instructions;
@@ -56,7 +60,6 @@ pub mod default_client;
pub mod model_family;
mod openai_model_info;
mod openai_tools;
pub mod plan_tool;
pub mod project_doc;
mod rollout;
pub(crate) mod safety;
@@ -64,7 +67,7 @@ pub mod seatbelt;
pub mod shell;
pub mod spawn;
pub mod terminal;
mod tool_apply_patch;
mod tools;
pub mod turn_diff_tracker;
pub use rollout::ARCHIVED_SESSIONS_SUBDIR;
pub use rollout::INTERACTIVE_SESSION_SOURCES;


@@ -123,12 +123,15 @@ impl McpClientAdapter {
}
async fn new_streamable_http_client(
+server_name: String,
url: String,
bearer_token: Option<String>,
params: mcp_types::InitializeRequestParams,
startup_timeout: Duration,
) -> Result<Self> {
-let client = Arc::new(RmcpClient::new_streamable_http_client(url, bearer_token)?);
+let client = Arc::new(
+RmcpClient::new_streamable_http_client(&server_name, &url, bearer_token).await?,
+);
client.initialize(params, Some(startup_timeout)).await?;
Ok(McpClientAdapter::Rmcp(client))
}
@@ -208,8 +211,7 @@ impl McpConnectionManager {
) && !use_rmcp_client
{
info!(
-"skipping MCP server `{}` configured with url because rmcp client is disabled",
-server_name
+"skipping MCP server `{server_name}` because the legacy MCP client only supports stdio servers",
);
continue;
}
@@ -217,7 +219,6 @@ impl McpConnectionManager {
let startup_timeout = cfg.startup_timeout_sec.unwrap_or(DEFAULT_STARTUP_TIMEOUT);
let tool_timeout = cfg.tool_timeout_sec.unwrap_or(DEFAULT_TOOL_TIMEOUT);
-let use_rmcp_client_flag = use_rmcp_client;
join_set.spawn(async move {
let McpServerConfig { transport, .. } = cfg;
let params = mcp_types::InitializeRequestParams {
@@ -246,17 +247,18 @@ impl McpConnectionManager {
let command_os: OsString = command.into();
let args_os: Vec<OsString> = args.into_iter().map(Into::into).collect();
McpClientAdapter::new_stdio_client(
-use_rmcp_client_flag,
+use_rmcp_client,
command_os,
args_os,
env,
-params.clone(),
+params,
startup_timeout,
)
.await
}
McpServerTransportConfig::StreamableHttp { url, bearer_token } => {
McpClientAdapter::new_streamable_http_client(
+server_name.clone(),
url,
bearer_token,
params,


@@ -1,5 +1,5 @@
use crate::config_types::ReasoningSummaryFormat;
-use crate::tool_apply_patch::ApplyPatchToolType;
+use crate::tools::handlers::apply_patch::ApplyPatchToolType;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
@@ -41,6 +41,9 @@ pub struct ModelFamily {
// Instructions to use for querying the model
pub base_instructions: String,
/// Names of beta tools that should be exposed to this model family.
pub experimental_supported_tools: Vec<String>,
}
macro_rules! model_family {
@@ -57,6 +60,7 @@ macro_rules! model_family {
uses_local_shell_tool: false,
apply_patch_tool_type: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
};
// apply overrides
$(
@@ -105,6 +109,7 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
supports_reasoning_summaries: true,
reasoning_summary_format: ReasoningSummaryFormat::Experimental,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
experimental_supported_tools: vec!["read_file".to_string()],
)
} else if slug.starts_with("gpt-5") {
model_family!(
@@ -127,5 +132,6 @@ pub fn derive_default_model_family(model: &str) -> ModelFamily {
uses_local_shell_tool: false,
apply_patch_tool_type: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
}
}

File diff suppressed because it is too large

@@ -0,0 +1,19 @@
use std::io;
use std::path::PathBuf;
pub(crate) fn expand_tilde(raw: &str) -> io::Result<PathBuf> {
if raw.starts_with('~') {
// `shellexpand::tilde` falls back to returning the input when the home directory
// cannot be resolved; mirror the previous error semantics in that case.
let expanded = shellexpand::tilde(raw);
if expanded.starts_with('~') {
return Err(io::Error::new(
io::ErrorKind::NotFound,
"could not resolve home directory while expanding path",
));
}
return Ok(PathBuf::from(expanded.as_ref()));
}
Ok(PathBuf::from(raw))
}
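
The tilde handling in `expand_tilde` can be illustrated without the `shellexpand` dependency. The following is only a sketch under stated assumptions: `expand_tilde_with_home` is a hypothetical name, and the home directory is passed in as a parameter rather than resolved from the environment. As in the hunk above, an input that starts with `~` but cannot be expanded is reported as `NotFound`.

```rust
use std::io;
use std::path::PathBuf;

/// Hypothetical, dependency-free variant of `expand_tilde`.
/// `home` stands in for whatever the real home-directory lookup returns.
fn expand_tilde_with_home(raw: &str, home: Option<&str>) -> io::Result<PathBuf> {
    // Paths that do not start with `~` are returned unchanged.
    let rest = match raw.strip_prefix('~') {
        Some(rest) => rest,
        None => return Ok(PathBuf::from(raw)),
    };
    match home {
        // Only a bare `~` or a `~/...` prefix is expanded in this sketch.
        Some(home) if rest.is_empty() || rest.starts_with('/') => {
            Ok(PathBuf::from(format!("{home}{rest}")))
        }
        // Anything else would still begin with `~`, which is treated as an error.
        _ => Err(io::Error::new(
            io::ErrorKind::NotFound,
            "could not resolve home directory while expanding path",
        )),
    }
}
```

Passing the home directory explicitly keeps the sketch deterministic; the real function relies on `shellexpand::tilde` for the lookup.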

@@ -125,9 +125,10 @@ pub fn assess_command_safety(
// the session _because_ they know it needs to run outside a sandbox.
if is_known_safe_command(command) || approved.contains(command) {
let user_explicitly_approved = approved.contains(command);
return SafetyCheck::AutoApprove {
sandbox_type: SandboxType::None,
user_explicitly_approved: false,
user_explicitly_approved,
};
}
@@ -380,7 +381,7 @@ mod tests {
safety_check,
SafetyCheck::AutoApprove {
sandbox_type: SandboxType::None,
user_explicitly_approved: false,
user_explicitly_approved: true,
}
);
}
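
The behavioral change in this hunk is that `user_explicitly_approved` now reflects whether the command was found in the approved set, instead of always being `false`. A minimal sketch of that logic follows, assuming simplified types: `SafetyCheck` here has only two variants, `assess` is a hypothetical name, and `is_known_safe` is a stand-in allowlist, since the real `is_known_safe_command` is not shown in this diff.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum SafetyCheck {
    AutoApprove { user_explicitly_approved: bool },
    AskUser,
}

// Hypothetical stand-in for `is_known_safe_command`.
fn is_known_safe(command: &[String]) -> bool {
    command.first().map(String::as_str) == Some("ls")
}

fn assess(command: &[String], approved: &HashSet<Vec<String>>) -> SafetyCheck {
    // Record *why* the command is auto-approved, not just that it is.
    let user_explicitly_approved = approved.contains(command);
    if is_known_safe(command) || user_explicitly_approved {
        return SafetyCheck::AutoApprove {
            user_explicitly_approved,
        };
    }
    SafetyCheck::AskUser
}
```

In this sketch a known-safe command that the user never approved still reports `false`, while a previously approved command reports `true`, which matches the updated test expectation in the hunk.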

@@ -1,9 +1,9 @@
use crate::RolloutRecorder;
use crate::exec_command::ExecSessionManager;
use crate::executor::Executor;
use crate::mcp_connection_manager::McpConnectionManager;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::user_notification::UserNotifier;
use std::path::PathBuf;
use tokio::sync::Mutex;
pub(crate) struct SessionServices {
@@ -12,7 +12,7 @@ pub(crate) struct SessionServices {
pub(crate) unified_exec_manager: UnifiedExecSessionManager,
pub(crate) notifier: UserNotifier,
pub(crate) rollout: Mutex<Option<RolloutRecorder>>,
pub(crate) codex_linux_sandbox_exe: Option<PathBuf>,
pub(crate) user_shell: crate::shell::Shell,
pub(crate) show_raw_agent_reasoning: bool,
pub(crate) executor: Executor,
}

@@ -1,7 +1,5 @@
//! Session-wide mutable state.
use std::collections::HashSet;
use codex_protocol::models::ResponseItem;
use crate::conversation_history::ConversationHistory;
@@ -12,7 +10,6 @@ use crate::protocol::TokenUsageInfo;
/// Persistent, session-scoped state previously stored directly on `Session`.
#[derive(Default)]
pub(crate) struct SessionState {
pub(crate) approved_commands: HashSet<Vec<String>>,
pub(crate) history: ConversationHistory,
pub(crate) token_info: Option<TokenUsageInfo>,
pub(crate) latest_rate_limits: Option<RateLimitSnapshot>,
@@ -44,15 +41,6 @@ impl SessionState {
self.history.replace(items);
}
// Approved command helpers
pub(crate) fn add_approved_command(&mut self, cmd: Vec<String>) {
self.approved_commands.insert(cmd);
}
pub(crate) fn approved_commands_ref(&self) -> &HashSet<Vec<String>> {
&self.approved_commands
}
// Token/rate limit helpers
pub(crate) fn update_token_info_from_usage(
&mut self,

@@ -0,0 +1,244 @@
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::tools::TELEMETRY_PREVIEW_MAX_BYTES;
use crate::tools::TELEMETRY_PREVIEW_MAX_LINES;
use crate::tools::TELEMETRY_PREVIEW_TRUNCATION_NOTICE;
use crate::turn_diff_tracker::TurnDiffTracker;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ShellToolCallParams;
use codex_protocol::protocol::FileChange;
use codex_utils_string::take_bytes_at_char_boundary;
use mcp_types::CallToolResult;
use std::borrow::Cow;
use std::collections::HashMap;
use std::path::PathBuf;
pub struct ToolInvocation<'a> {
pub session: &'a Session,
pub turn: &'a TurnContext,
pub tracker: &'a mut TurnDiffTracker,
pub sub_id: &'a str,
pub call_id: String,
pub tool_name: String,
pub payload: ToolPayload,
}
#[derive(Clone)]
pub enum ToolPayload {
Function {
arguments: String,
},
Custom {
input: String,
},
LocalShell {
params: ShellToolCallParams,
},
UnifiedExec {
arguments: String,
},
Mcp {
server: String,
tool: String,
raw_arguments: String,
},
}
impl ToolPayload {
pub fn log_payload(&self) -> Cow<'_, str> {
match self {
ToolPayload::Function { arguments } => Cow::Borrowed(arguments),
ToolPayload::Custom { input } => Cow::Borrowed(input),
ToolPayload::LocalShell { params } => Cow::Owned(params.command.join(" ")),
ToolPayload::UnifiedExec { arguments } => Cow::Borrowed(arguments),
ToolPayload::Mcp { raw_arguments, .. } => Cow::Borrowed(raw_arguments),
}
}
}
#[derive(Clone)]
pub enum ToolOutput {
Function {
content: String,
success: Option<bool>,
},
Mcp {
result: Result<CallToolResult, String>,
},
}
impl ToolOutput {
pub fn log_preview(&self) -> String {
match self {
ToolOutput::Function { content, .. } => telemetry_preview(content),
ToolOutput::Mcp { result } => format!("{result:?}"),
}
}
pub fn success_for_logging(&self) -> bool {
match self {
ToolOutput::Function { success, .. } => success.unwrap_or(true),
ToolOutput::Mcp { result } => result.is_ok(),
}
}
pub fn into_response(self, call_id: &str, payload: &ToolPayload) -> ResponseInputItem {
match self {
ToolOutput::Function { content, success } => {
if matches!(payload, ToolPayload::Custom { .. }) {
ResponseInputItem::CustomToolCallOutput {
call_id: call_id.to_string(),
output: content,
}
} else {
ResponseInputItem::FunctionCallOutput {
call_id: call_id.to_string(),
output: FunctionCallOutputPayload { content, success },
}
}
}
ToolOutput::Mcp { result } => ResponseInputItem::McpToolCallOutput {
call_id: call_id.to_string(),
result,
},
}
}
}
fn telemetry_preview(content: &str) -> String {
let truncated_slice = take_bytes_at_char_boundary(content, TELEMETRY_PREVIEW_MAX_BYTES);
let truncated_by_bytes = truncated_slice.len() < content.len();
let mut preview = String::new();
let mut lines_iter = truncated_slice.lines();
for idx in 0..TELEMETRY_PREVIEW_MAX_LINES {
match lines_iter.next() {
Some(line) => {
if idx > 0 {
preview.push('\n');
}
preview.push_str(line);
}
None => break,
}
}
let truncated_by_lines = lines_iter.next().is_some();
if !truncated_by_bytes && !truncated_by_lines {
return content.to_string();
}
if preview.len() < truncated_slice.len()
&& truncated_slice
.as_bytes()
.get(preview.len())
.is_some_and(|byte| *byte == b'\n')
{
preview.push('\n');
}
if !preview.is_empty() && !preview.ends_with('\n') {
preview.push('\n');
}
preview.push_str(TELEMETRY_PREVIEW_TRUNCATION_NOTICE);
preview
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
#[test]
fn custom_tool_calls_should_roundtrip_as_custom_outputs() {
let payload = ToolPayload::Custom {
input: "patch".to_string(),
};
let response = ToolOutput::Function {
content: "patched".to_string(),
success: Some(true),
}
.into_response("call-42", &payload);
match response {
ResponseInputItem::CustomToolCallOutput { call_id, output } => {
assert_eq!(call_id, "call-42");
assert_eq!(output, "patched");
}
other => panic!("expected CustomToolCallOutput, got {other:?}"),
}
}
#[test]
fn function_payloads_remain_function_outputs() {
let payload = ToolPayload::Function {
arguments: "{}".to_string(),
};
let response = ToolOutput::Function {
content: "ok".to_string(),
success: Some(true),
}
.into_response("fn-1", &payload);
match response {
ResponseInputItem::FunctionCallOutput { call_id, output } => {
assert_eq!(call_id, "fn-1");
assert_eq!(output.content, "ok");
assert_eq!(output.success, Some(true));
}
other => panic!("expected FunctionCallOutput, got {other:?}"),
}
}
#[test]
fn telemetry_preview_returns_original_within_limits() {
let content = "short output";
assert_eq!(telemetry_preview(content), content);
}
#[test]
fn telemetry_preview_truncates_by_bytes() {
let content = "x".repeat(TELEMETRY_PREVIEW_MAX_BYTES + 8);
let preview = telemetry_preview(&content);
assert!(preview.contains(TELEMETRY_PREVIEW_TRUNCATION_NOTICE));
assert!(
preview.len()
<= TELEMETRY_PREVIEW_MAX_BYTES + TELEMETRY_PREVIEW_TRUNCATION_NOTICE.len() + 1
);
}
#[test]
fn telemetry_preview_truncates_by_lines() {
let content = (0..(TELEMETRY_PREVIEW_MAX_LINES + 5))
.map(|idx| format!("line {idx}"))
.collect::<Vec<_>>()
.join("\n");
let preview = telemetry_preview(&content);
let lines: Vec<&str> = preview.lines().collect();
assert!(lines.len() <= TELEMETRY_PREVIEW_MAX_LINES + 1);
assert_eq!(lines.last(), Some(&TELEMETRY_PREVIEW_TRUNCATION_NOTICE));
}
}
#[derive(Clone, Debug)]
pub(crate) struct ExecCommandContext {
pub(crate) sub_id: String,
pub(crate) call_id: String,
pub(crate) command_for_display: Vec<String>,
pub(crate) cwd: PathBuf,
pub(crate) apply_patch: Option<ApplyPatchCommandContext>,
pub(crate) tool_name: String,
pub(crate) otel_event_manager: OtelEventManager,
}
#[derive(Clone, Debug)]
pub(crate) struct ApplyPatchCommandContext {
pub(crate) user_explicitly_approved_this_action: bool,
pub(crate) changes: HashMap<PathBuf, FileChange>,
}
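
The two-stage truncation in `telemetry_preview` (first by bytes at a character boundary, then by line count) can be sketched in isolation. This is a simplified illustration, not the code from the diff: the limits and the notice text are hypothetical small values chosen for readability, `preview` is an invented name, and `take_bytes_at_char_boundary` is reimplemented locally as an assumption about what the `codex_utils_string` helper does.

```rust
// Hypothetical limits; the real TELEMETRY_PREVIEW_* constants are defined elsewhere.
const MAX_BYTES: usize = 16;
const MAX_LINES: usize = 2;
const NOTICE: &str = "[... truncated]";

// Assumed behavior: longest prefix of `s` that fits in `max` bytes
// without splitting a UTF-8 character.
fn take_bytes_at_char_boundary(s: &str, max: usize) -> &str {
    if s.len() <= max {
        return s;
    }
    let mut end = max;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

fn preview(content: &str) -> String {
    let slice = take_bytes_at_char_boundary(content, MAX_BYTES);
    let truncated_by_bytes = slice.len() < content.len();

    // Keep at most MAX_LINES lines, then check whether any were left over.
    let mut lines = slice.lines();
    let kept: Vec<&str> = lines.by_ref().take(MAX_LINES).collect();
    let truncated_by_lines = lines.next().is_some();

    if !truncated_by_bytes && !truncated_by_lines {
        return content.to_string();
    }
    let mut out = kept.join("\n");
    if !out.is_empty() {
        out.push('\n');
    }
    out.push_str(NOTICE);
    out
}
```

Content within both limits is returned unchanged; anything longer ends with the notice on its own line, which is the property the tests in the diff check.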

@@ -1,15 +1,99 @@
use std::collections::BTreeMap;
use std::collections::HashMap;
use crate::client_common::tools::FreeformTool;
use crate::client_common::tools::FreeformToolFormat;
use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::exec::ExecParams;
use crate::function_tool::FunctionCallError;
use crate::openai_tools::JsonSchema;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::handle_container_exec_with_params;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::tools::spec::ApplyPatchToolArgs;
use async_trait::async_trait;
use serde::Deserialize;
use serde::Serialize;
use std::collections::BTreeMap;
use crate::openai_tools::FreeformTool;
use crate::openai_tools::FreeformToolFormat;
use crate::openai_tools::JsonSchema;
use crate::openai_tools::OpenAiTool;
use crate::openai_tools::ResponsesApiTool;
pub struct ApplyPatchHandler;
const APPLY_PATCH_LARK_GRAMMAR: &str = include_str!("tool_apply_patch.lark");
#[async_trait]
impl ToolHandler for ApplyPatchHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
fn matches_kind(&self, payload: &ToolPayload) -> bool {
matches!(
payload,
ToolPayload::Function { .. } | ToolPayload::Custom { .. }
)
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
turn,
tracker,
sub_id,
call_id,
tool_name,
payload,
} = invocation;
let patch_input = match payload {
ToolPayload::Function { arguments } => {
let args: ApplyPatchToolArgs = serde_json::from_str(&arguments).map_err(|e| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {e:?}"
))
})?;
args.input
}
ToolPayload::Custom { input } => input,
_ => {
return Err(FunctionCallError::RespondToModel(
"apply_patch handler received unsupported payload".to_string(),
));
}
};
let exec_params = ExecParams {
command: vec!["apply_patch".to_string(), patch_input.clone()],
cwd: turn.cwd.clone(),
timeout_ms: None,
env: HashMap::new(),
with_escalated_permissions: None,
justification: None,
};
let content = handle_container_exec_with_params(
tool_name.as_str(),
exec_params,
session,
turn,
tracker,
sub_id.to_string(),
call_id.clone(),
)
.await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, Hash)]
#[serde(rename_all = "snake_case")]
pub enum ApplyPatchToolType {
@@ -19,8 +103,8 @@ pub enum ApplyPatchToolType {
/// Returns a custom tool that can be used to edit files. Well-suited for GPT-5 models
/// https://platform.openai.com/docs/guides/function-calling#custom-tools
pub(crate) fn create_apply_patch_freeform_tool() -> OpenAiTool {
OpenAiTool::Freeform(FreeformTool {
pub(crate) fn create_apply_patch_freeform_tool() -> ToolSpec {
ToolSpec::Freeform(FreeformTool {
name: "apply_patch".to_string(),
description: "Use the `apply_patch` tool to edit files".to_string(),
format: FreeformToolFormat {
@@ -32,7 +116,7 @@ pub(crate) fn create_apply_patch_freeform_tool() -> OpenAiTool {
}
/// Returns a json tool that can be used to edit files. Should only be used with gpt-oss models
pub(crate) fn create_apply_patch_json_tool() -> OpenAiTool {
pub(crate) fn create_apply_patch_json_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
"input".to_string(),
@@ -41,7 +125,7 @@ pub(crate) fn create_apply_patch_json_tool() -> OpenAiTool {
},
);
OpenAiTool::Function(ResponsesApiTool {
ToolSpec::Function(ResponsesApiTool {
name: "apply_patch".to_string(),
description: r#"Use the `apply_patch` tool to edit files.
Your patch language is a stripped-down, file-oriented diff format designed to be easy to parse and safe to apply. You can think of it as a high-level envelope:
@@ -111,7 +195,7 @@ It is important to remember:
- You must prefix new lines with `+` even when creating a new file
- File references can only be relative, NEVER ABSOLUTE.
"#
.to_string(),
.to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,

@@ -0,0 +1,71 @@
use async_trait::async_trait;
use crate::exec_command::EXEC_COMMAND_TOOL_NAME;
use crate::exec_command::ExecCommandParams;
use crate::exec_command::WRITE_STDIN_TOOL_NAME;
use crate::exec_command::WriteStdinParams;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
pub struct ExecStreamHandler;
#[async_trait]
impl ToolHandler for ExecStreamHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
tool_name,
payload,
..
} = invocation;
let arguments = match payload {
ToolPayload::Function { arguments } => arguments,
_ => {
return Err(FunctionCallError::RespondToModel(
"exec_stream handler received unsupported payload".to_string(),
));
}
};
let content = match tool_name.as_str() {
EXEC_COMMAND_TOOL_NAME => {
let params: ExecCommandParams = serde_json::from_str(&arguments).map_err(|e| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {e:?}"
))
})?;
session.handle_exec_command_tool(params).await?
}
WRITE_STDIN_TOOL_NAME => {
let params: WriteStdinParams = serde_json::from_str(&arguments).map_err(|e| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {e:?}"
))
})?;
session.handle_write_stdin_tool(params).await?
}
_ => {
return Err(FunctionCallError::RespondToModel(format!(
"exec_stream handler does not support tool {tool_name}"
)));
}
};
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
}

@@ -0,0 +1,70 @@
use async_trait::async_trait;
use crate::function_tool::FunctionCallError;
use crate::mcp_tool_call::handle_mcp_tool_call;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
pub struct McpHandler;
#[async_trait]
impl ToolHandler for McpHandler {
fn kind(&self) -> ToolKind {
ToolKind::Mcp
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
sub_id,
call_id,
payload,
..
} = invocation;
let payload = match payload {
ToolPayload::Mcp {
server,
tool,
raw_arguments,
} => (server, tool, raw_arguments),
_ => {
return Err(FunctionCallError::RespondToModel(
"mcp handler received unsupported payload".to_string(),
));
}
};
let (server, tool, raw_arguments) = payload;
let arguments_str = raw_arguments;
let response = handle_mcp_tool_call(
session,
sub_id,
call_id.clone(),
server,
tool,
arguments_str,
)
.await;
match response {
codex_protocol::models::ResponseInputItem::McpToolCallOutput { result, .. } => {
Ok(ToolOutput::Mcp { result })
}
codex_protocol::models::ResponseInputItem::FunctionCallOutput { output, .. } => {
let codex_protocol::models::FunctionCallOutputPayload { content, success } = output;
Ok(ToolOutput::Function { content, success })
}
_ => Err(FunctionCallError::RespondToModel(
"mcp handler received unexpected response variant".to_string(),
)),
}
}
}

@@ -0,0 +1,19 @@
pub mod apply_patch;
mod exec_stream;
mod mcp;
mod plan;
mod read_file;
mod shell;
mod unified_exec;
mod view_image;
pub use plan::PLAN_TOOL;
pub use apply_patch::ApplyPatchHandler;
pub use exec_stream::ExecStreamHandler;
pub use mcp::McpHandler;
pub use plan::PlanHandler;
pub use read_file::ReadFileHandler;
pub use shell::ShellHandler;
pub use unified_exec::UnifiedExecHandler;
pub use view_image::ViewImageHandler;

@@ -1,23 +1,23 @@
use std::collections::BTreeMap;
use std::sync::LazyLock;
use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::codex::Session;
use crate::function_tool::FunctionCallError;
use crate::openai_tools::JsonSchema;
use crate::openai_tools::OpenAiTool;
use crate::openai_tools::ResponsesApiTool;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use async_trait::async_trait;
use codex_protocol::plan_tool::UpdatePlanArgs;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
use std::collections::BTreeMap;
use std::sync::LazyLock;
// Use the canonical plan tool types from the protocol crate to ensure
// type-identity matches events transported via `codex_protocol`.
pub use codex_protocol::plan_tool::PlanItemArg;
pub use codex_protocol::plan_tool::StepStatus;
pub use codex_protocol::plan_tool::UpdatePlanArgs;
pub struct PlanHandler;
// Types for the TODO tool arguments matching codex-vscode/todo-mcp/src/main.rs
pub(crate) static PLAN_TOOL: LazyLock<OpenAiTool> = LazyLock::new(|| {
pub static PLAN_TOOL: LazyLock<ToolSpec> = LazyLock::new(|| {
let mut plan_item_props = BTreeMap::new();
plan_item_props.insert("step".to_string(), JsonSchema::String { description: None });
plan_item_props.insert(
@@ -43,7 +43,7 @@ pub(crate) static PLAN_TOOL: LazyLock<OpenAiTool> = LazyLock::new(|| {
);
properties.insert("plan".to_string(), plan_items_schema);
OpenAiTool::Function(ResponsesApiTool {
ToolSpec::Function(ResponsesApiTool {
name: "update_plan".to_string(),
description: r#"Updates the task plan.
Provide an optional explanation and a list of plan items, each with a step and status.
@@ -59,6 +59,42 @@ At most one step can be in_progress at a time.
})
});
#[async_trait]
impl ToolHandler for PlanHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
sub_id,
call_id,
payload,
..
} = invocation;
let arguments = match payload {
ToolPayload::Function { arguments } => arguments,
_ => {
return Err(FunctionCallError::RespondToModel(
"update_plan handler received unsupported payload".to_string(),
));
}
};
let content = handle_update_plan(session, arguments, sub_id.to_string(), call_id).await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
}
/// This function doesn't do anything useful. However, it gives the model a structured way to record its plan that clients can read and render.
/// So it's the _inputs_ to this function that are useful to clients, not the outputs and neither are actually useful for the model other
/// than forcing it to come up and document a plan (TBD how that affects performance).

@@ -0,0 +1,255 @@
use std::path::Path;
use std::path::PathBuf;
use async_trait::async_trait;
use codex_utils_string::take_bytes_at_char_boundary;
use serde::Deserialize;
use tokio::fs::File;
use tokio::io::AsyncBufReadExt;
use tokio::io::BufReader;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
pub struct ReadFileHandler;
const MAX_LINE_LENGTH: usize = 500;
fn default_offset() -> usize {
1
}
fn default_limit() -> usize {
2000
}
#[derive(Deserialize)]
struct ReadFileArgs {
file_path: String,
#[serde(default = "default_offset")]
offset: usize,
#[serde(default = "default_limit")]
limit: usize,
}
#[async_trait]
impl ToolHandler for ReadFileHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation { payload, .. } = invocation;
let arguments = match payload {
ToolPayload::Function { arguments } => arguments,
_ => {
return Err(FunctionCallError::RespondToModel(
"read_file handler received unsupported payload".to_string(),
));
}
};
let args: ReadFileArgs = serde_json::from_str(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {err:?}"
))
})?;
let ReadFileArgs {
file_path,
offset,
limit,
} = args;
if offset == 0 {
return Err(FunctionCallError::RespondToModel(
"offset must be a 1-indexed line number".to_string(),
));
}
if limit == 0 {
return Err(FunctionCallError::RespondToModel(
"limit must be greater than zero".to_string(),
));
}
let path = PathBuf::from(&file_path);
if !path.is_absolute() {
return Err(FunctionCallError::RespondToModel(
"file_path must be an absolute path".to_string(),
));
}
let collected = read_file_slice(&path, offset, limit).await?;
Ok(ToolOutput::Function {
content: collected.join("\n"),
success: Some(true),
})
}
}
async fn read_file_slice(
path: &Path,
offset: usize,
limit: usize,
) -> Result<Vec<String>, FunctionCallError> {
let file = File::open(path)
.await
.map_err(|err| FunctionCallError::RespondToModel(format!("failed to read file: {err}")))?;
let mut reader = BufReader::new(file);
let mut collected = Vec::new();
let mut seen = 0usize;
let mut buffer = Vec::new();
loop {
buffer.clear();
let bytes_read = reader.read_until(b'\n', &mut buffer).await.map_err(|err| {
FunctionCallError::RespondToModel(format!("failed to read file: {err}"))
})?;
if bytes_read == 0 {
break;
}
if buffer.last() == Some(&b'\n') {
buffer.pop();
if buffer.last() == Some(&b'\r') {
buffer.pop();
}
}
seen += 1;
if seen < offset {
continue;
}
if collected.len() == limit {
break;
}
let formatted = format_line(&buffer);
collected.push(format!("L{seen}: {formatted}"));
if collected.len() == limit {
break;
}
}
if seen < offset {
return Err(FunctionCallError::RespondToModel(
"offset exceeds file length".to_string(),
));
}
Ok(collected)
}
fn format_line(bytes: &[u8]) -> String {
let decoded = String::from_utf8_lossy(bytes);
if decoded.len() > MAX_LINE_LENGTH {
take_bytes_at_char_boundary(&decoded, MAX_LINE_LENGTH).to_string()
} else {
decoded.into_owned()
}
}
#[cfg(test)]
mod tests {
use super::*;
use tempfile::NamedTempFile;
#[tokio::test]
async fn reads_requested_range() {
let mut temp = NamedTempFile::new().expect("create temp file");
use std::io::Write as _;
writeln!(temp, "alpha").unwrap();
writeln!(temp, "beta").unwrap();
writeln!(temp, "gamma").unwrap();
let lines = read_file_slice(temp.path(), 2, 2)
.await
.expect("read slice");
assert_eq!(lines, vec!["L2: beta".to_string(), "L3: gamma".to_string()]);
}
#[tokio::test]
async fn errors_when_offset_exceeds_length() {
let mut temp = NamedTempFile::new().expect("create temp file");
use std::io::Write as _;
writeln!(temp, "only").unwrap();
let err = read_file_slice(temp.path(), 3, 1)
.await
.expect_err("offset exceeds length");
assert_eq!(
err,
FunctionCallError::RespondToModel("offset exceeds file length".to_string())
);
}
#[tokio::test]
async fn reads_non_utf8_lines() {
let mut temp = NamedTempFile::new().expect("create temp file");
use std::io::Write as _;
temp.as_file_mut().write_all(b"\xff\xfe\nplain\n").unwrap();
let lines = read_file_slice(temp.path(), 1, 2)
.await
.expect("read slice");
let expected_first = format!("L1: {}{}", '\u{FFFD}', '\u{FFFD}');
assert_eq!(lines, vec![expected_first, "L2: plain".to_string()]);
}
#[tokio::test]
async fn trims_crlf_endings() {
let mut temp = NamedTempFile::new().expect("create temp file");
use std::io::Write as _;
write!(temp, "one\r\ntwo\r\n").unwrap();
let lines = read_file_slice(temp.path(), 1, 2)
.await
.expect("read slice");
assert_eq!(lines, vec!["L1: one".to_string(), "L2: two".to_string()]);
}
#[tokio::test]
async fn respects_limit_even_with_more_lines() {
let mut temp = NamedTempFile::new().expect("create temp file");
use std::io::Write as _;
writeln!(temp, "first").unwrap();
writeln!(temp, "second").unwrap();
writeln!(temp, "third").unwrap();
let lines = read_file_slice(temp.path(), 1, 2)
.await
.expect("read slice");
assert_eq!(
lines,
vec!["L1: first".to_string(), "L2: second".to_string()]
);
}
#[tokio::test]
async fn truncates_lines_longer_than_max_length() {
let mut temp = NamedTempFile::new().expect("create temp file");
use std::io::Write as _;
let long_line = "x".repeat(MAX_LINE_LENGTH + 50);
writeln!(temp, "{long_line}").unwrap();
let lines = read_file_slice(temp.path(), 1, 1)
.await
.expect("read slice");
let expected = "x".repeat(MAX_LINE_LENGTH);
assert_eq!(lines, vec![format!("L1: {expected}")]);
}
}
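
The line-window logic of `read_file_slice` does not depend on async I/O, so it can be sketched synchronously over any `std::io::BufRead` source. The version below is an illustration under assumptions: `read_slice` is a hypothetical name, errors are plain `io::Error` values instead of `FunctionCallError`, the per-line length cap is omitted, and the caller is assumed to have already rejected `offset == 0` and `limit == 0` as the handler above does.

```rust
use std::io::{self, BufRead};

/// Returns up to `limit` lines starting at the 1-indexed line `offset`,
/// each prefixed with its line number. Assumes `offset >= 1` and `limit >= 1`.
fn read_slice<R: BufRead>(mut reader: R, offset: usize, limit: usize) -> io::Result<Vec<String>> {
    let mut collected = Vec::new();
    let mut seen = 0usize;
    let mut buffer = Vec::new();
    loop {
        buffer.clear();
        // Read raw bytes so that invalid UTF-8 does not abort the read.
        if reader.read_until(b'\n', &mut buffer)? == 0 {
            break;
        }
        // Strip a trailing LF or CRLF.
        if buffer.last() == Some(&b'\n') {
            buffer.pop();
            if buffer.last() == Some(&b'\r') {
                buffer.pop();
            }
        }
        seen += 1;
        if seen < offset {
            continue;
        }
        collected.push(format!("L{seen}: {}", String::from_utf8_lossy(&buffer)));
        if collected.len() == limit {
            break;
        }
    }
    if seen < offset {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "offset exceeds file length",
        ));
    }
    Ok(collected)
}
```

Because `&[u8]` implements `BufRead`, the sketch can be exercised on an in-memory byte slice, for example `read_slice(&b"one\r\ntwo\nthree"[..], 2, 2)`.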

@@ -0,0 +1,103 @@
use async_trait::async_trait;
use codex_protocol::models::ShellToolCallParams;
use crate::codex::TurnContext;
use crate::exec::ExecParams;
use crate::exec_env::create_env;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::handle_container_exec_with_params;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
pub struct ShellHandler;
impl ShellHandler {
fn to_exec_params(params: ShellToolCallParams, turn_context: &TurnContext) -> ExecParams {
ExecParams {
command: params.command,
cwd: turn_context.resolve_path(params.workdir.clone()),
timeout_ms: params.timeout_ms,
env: create_env(&turn_context.shell_environment_policy),
with_escalated_permissions: params.with_escalated_permissions,
justification: params.justification,
}
}
}
#[async_trait]
impl ToolHandler for ShellHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
fn matches_kind(&self, payload: &ToolPayload) -> bool {
matches!(
payload,
ToolPayload::Function { .. } | ToolPayload::LocalShell { .. }
)
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
turn,
tracker,
sub_id,
call_id,
tool_name,
payload,
} = invocation;
match payload {
ToolPayload::Function { arguments } => {
let params: ShellToolCallParams =
serde_json::from_str(&arguments).map_err(|e| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {e:?}"
))
})?;
let exec_params = Self::to_exec_params(params, turn);
let content = handle_container_exec_with_params(
tool_name.as_str(),
exec_params,
session,
turn,
tracker,
sub_id.to_string(),
call_id.clone(),
)
.await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
ToolPayload::LocalShell { params } => {
let exec_params = Self::to_exec_params(params, turn);
let content = handle_container_exec_with_params(
tool_name.as_str(),
exec_params,
session,
turn,
tracker,
sub_id.to_string(),
call_id.clone(),
)
.await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
_ => Err(FunctionCallError::RespondToModel(format!(
"unsupported payload for shell handler: {tool_name}"
))),
}
}
}
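
`ShellHandler` overrides `matches_kind` so that one handler can accept two payload variants. The sketch below shows how a dispatcher might use that hook. It rests on several assumptions: the `ToolHandler` trait definition is not part of this section, so the trait here is a simplified synchronous stand-in, its default `matches_kind` is a guess, and `ShellLike` and `dispatch` are hypothetical names.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToolKind {
    Function,
    Mcp,
}

enum ToolPayload {
    Function { arguments: String },
    LocalShell { command: Vec<String> },
    Mcp { raw_arguments: String },
}

trait ToolHandler {
    fn kind(&self) -> ToolKind;

    // Assumed default: accept only the payload variant that matches `kind()`.
    fn matches_kind(&self, payload: &ToolPayload) -> bool {
        matches!(
            (self.kind(), payload),
            (ToolKind::Function, ToolPayload::Function { .. })
                | (ToolKind::Mcp, ToolPayload::Mcp { .. })
        )
    }

    fn handle(&self, payload: ToolPayload) -> Result<String, String>;
}

struct ShellLike;

impl ToolHandler for ShellLike {
    fn kind(&self) -> ToolKind {
        ToolKind::Function
    }

    // Widen the default so local shell calls reach this handler too.
    fn matches_kind(&self, payload: &ToolPayload) -> bool {
        matches!(
            payload,
            ToolPayload::Function { .. } | ToolPayload::LocalShell { .. }
        )
    }

    fn handle(&self, payload: ToolPayload) -> Result<String, String> {
        match payload {
            ToolPayload::Function { arguments } => Ok(arguments),
            ToolPayload::LocalShell { command } => Ok(command.join(" ")),
            ToolPayload::Mcp { raw_arguments } => Err(format!(
                "unsupported payload for shell handler: {raw_arguments}"
            )),
        }
    }
}

// Hypothetical dispatcher: reject mismatched payloads before calling the handler.
fn dispatch(handler: &dyn ToolHandler, payload: ToolPayload) -> Result<String, String> {
    if !handler.matches_kind(&payload) {
        return Err("incompatible payload for handler".to_string());
    }
    handler.handle(payload)
}
```

Keeping the compatibility check separate from `handle` means the fallback arm inside each handler only fires if the dispatcher and the handler disagree.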

@@ -0,0 +1,112 @@
use async_trait::async_trait;
use serde::Deserialize;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::unified_exec::UnifiedExecRequest;
pub struct UnifiedExecHandler;
#[derive(Deserialize)]
struct UnifiedExecArgs {
input: Vec<String>,
#[serde(default)]
session_id: Option<String>,
#[serde(default)]
timeout_ms: Option<u64>,
}
#[async_trait]
impl ToolHandler for UnifiedExecHandler {
fn kind(&self) -> ToolKind {
ToolKind::UnifiedExec
}
fn matches_kind(&self, payload: &ToolPayload) -> bool {
matches!(
payload,
ToolPayload::UnifiedExec { .. } | ToolPayload::Function { .. }
)
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session, payload, ..
} = invocation;
let args = match payload {
ToolPayload::UnifiedExec { arguments } | ToolPayload::Function { arguments } => {
serde_json::from_str::<UnifiedExecArgs>(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {err:?}"
))
})?
}
_ => {
return Err(FunctionCallError::RespondToModel(
"unified_exec handler received unsupported payload".to_string(),
));
}
};
let UnifiedExecArgs {
input,
session_id,
timeout_ms,
} = args;
let parsed_session_id = if let Some(session_id) = session_id {
match session_id.parse::<i32>() {
Ok(parsed) => Some(parsed),
Err(output) => {
return Err(FunctionCallError::RespondToModel(format!(
"invalid session_id: {session_id} due to error {output:?}"
)));
}
}
} else {
None
};
let request = UnifiedExecRequest {
session_id: parsed_session_id,
input_chunks: &input,
timeout_ms,
};
let value = session
.run_unified_exec_request(request)
.await
.map_err(|err| {
FunctionCallError::RespondToModel(format!("unified exec failed: {err:?}"))
})?;
#[derive(serde::Serialize)]
struct SerializedUnifiedExecResult {
session_id: Option<String>,
output: String,
}
let content = serde_json::to_string(&SerializedUnifiedExecResult {
session_id: value.session_id.map(|id| id.to_string()),
output: value.output,
})
.map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to serialize unified exec output: {err:?}"
))
})?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
}
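
The `session_id` handling in `UnifiedExecHandler` accepts an optional string and turns it into an optional `i32`, failing only when a value is present but not numeric. A compact way to express the same idea is `Option::map` followed by `transpose`. This is a sketch only: `parse_session_id` is a hypothetical helper and the error type is simplified to `String`.

```rust
/// `None` stays `None`; `Some("7")` becomes `Some(7)`; a non-numeric id is an error.
fn parse_session_id(session_id: Option<&str>) -> Result<Option<i32>, String> {
    session_id
        .map(|raw| {
            raw.parse::<i32>()
                .map_err(|err| format!("invalid session_id: {raw} due to error {err:?}"))
        })
        // Option<Result<i32, String>> -> Result<Option<i32>, String>
        .transpose()
}
```

The explicit `match` in the handler and this combinator form are equivalent in outcome; which one reads better is a matter of taste.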

@@ -0,0 +1,96 @@
use async_trait::async_trait;
use serde::Deserialize;
use tokio::fs;
use crate::function_tool::FunctionCallError;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::InputItem;
use crate::protocol::ViewImageToolCallEvent;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
pub struct ViewImageHandler;
#[derive(Deserialize)]
struct ViewImageArgs {
path: String,
}
#[async_trait]
impl ToolHandler for ViewImageHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
async fn handle(
&self,
invocation: ToolInvocation<'_>,
) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
turn,
payload,
sub_id,
call_id,
..
} = invocation;
let arguments = match payload {
ToolPayload::Function { arguments } => arguments,
_ => {
return Err(FunctionCallError::RespondToModel(
"view_image handler received unsupported payload".to_string(),
));
}
};
let args: ViewImageArgs = serde_json::from_str(&arguments).map_err(|e| {
FunctionCallError::RespondToModel(format!("failed to parse function arguments: {e:?}"))
})?;
let abs_path = turn.resolve_path(Some(args.path));
let metadata = fs::metadata(&abs_path).await.map_err(|error| {
FunctionCallError::RespondToModel(format!(
"unable to locate image at `{}`: {error}",
abs_path.display()
))
})?;
if !metadata.is_file() {
return Err(FunctionCallError::RespondToModel(format!(
"image path `{}` is not a file",
abs_path.display()
)));
}
let event_path = abs_path.clone();
session
.inject_input(vec![InputItem::LocalImage { path: abs_path }])
.await
.map_err(|_| {
FunctionCallError::RespondToModel(
"unable to attach image (no active task)".to_string(),
)
})?;
session
.send_event(Event {
id: sub_id.to_string(),
msg: EventMsg::ViewImageToolCall(ViewImageToolCallEvent {
call_id,
path: event_path,
}),
})
.await;
Ok(ToolOutput::Function {
content: "attached local image path".to_string(),
success: Some(true),
})
}
}


@@ -0,0 +1,280 @@
pub mod context;
pub(crate) mod handlers;
pub mod registry;
pub mod router;
pub mod spec;
use crate::apply_patch;
use crate::apply_patch::ApplyPatchExec;
use crate::apply_patch::InternalApplyPatchInvocation;
use crate::apply_patch::convert_apply_patch_to_protocol;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::error::SandboxErr;
use crate::exec::ExecParams;
use crate::exec::ExecToolCallOutput;
use crate::exec::StdoutStream;
use crate::executor::ExecutionMode;
use crate::executor::errors::ExecError;
use crate::executor::linkers::PreparedExec;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ApplyPatchCommandContext;
use crate::tools::context::ExecCommandContext;
use crate::turn_diff_tracker::TurnDiffTracker;
use codex_apply_patch::MaybeApplyPatchVerified;
use codex_apply_patch::maybe_parse_apply_patch_verified;
use codex_protocol::protocol::AskForApproval;
use codex_utils_string::take_bytes_at_char_boundary;
use codex_utils_string::take_last_bytes_at_char_boundary;
pub use router::ToolRouter;
use serde::Serialize;
use tracing::trace;
// Model-formatting limits: clients get full streams; only content sent to the model is truncated.
pub(crate) const MODEL_FORMAT_MAX_BYTES: usize = 10 * 1024; // 10 KiB
pub(crate) const MODEL_FORMAT_MAX_LINES: usize = 256; // lines
pub(crate) const MODEL_FORMAT_HEAD_LINES: usize = MODEL_FORMAT_MAX_LINES / 2;
pub(crate) const MODEL_FORMAT_TAIL_LINES: usize = MODEL_FORMAT_MAX_LINES - MODEL_FORMAT_HEAD_LINES; // 128
pub(crate) const MODEL_FORMAT_HEAD_BYTES: usize = MODEL_FORMAT_MAX_BYTES / 2;
// Telemetry preview limits: keep log events smaller than model budgets.
pub(crate) const TELEMETRY_PREVIEW_MAX_BYTES: usize = 2 * 1024; // 2 KiB
pub(crate) const TELEMETRY_PREVIEW_MAX_LINES: usize = 64; // lines
pub(crate) const TELEMETRY_PREVIEW_TRUNCATION_NOTICE: &str =
"[... telemetry preview truncated ...]";
// TODO(jif) break this down
pub(crate) async fn handle_container_exec_with_params(
tool_name: &str,
params: ExecParams,
sess: &Session,
turn_context: &TurnContext,
turn_diff_tracker: &mut TurnDiffTracker,
sub_id: String,
call_id: String,
) -> Result<String, FunctionCallError> {
let otel_event_manager = turn_context.client.get_otel_event_manager();
if params.with_escalated_permissions.unwrap_or(false)
&& !matches!(turn_context.approval_policy, AskForApproval::OnRequest)
{
return Err(FunctionCallError::RespondToModel(format!(
"approval policy is {policy:?}; reject command — you should not ask for escalated permissions if the approval policy is {policy:?}",
policy = turn_context.approval_policy
)));
}
// check if this was a patch, and apply it if so
let apply_patch_exec = match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
MaybeApplyPatchVerified::Body(changes) => {
match apply_patch::apply_patch(sess, turn_context, &sub_id, &call_id, changes).await {
InternalApplyPatchInvocation::Output(item) => return item,
InternalApplyPatchInvocation::DelegateToExec(apply_patch_exec) => {
Some(apply_patch_exec)
}
}
}
MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
// It looks like an invocation of `apply_patch`, but we
// could not resolve it into a patch that would apply
// cleanly. Return to model for resample.
return Err(FunctionCallError::RespondToModel(format!(
"apply_patch verification failed: {parse_error}"
)));
}
MaybeApplyPatchVerified::ShellParseError(error) => {
trace!("Failed to parse shell command, {error:?}");
None
}
MaybeApplyPatchVerified::NotApplyPatch => None,
};
let command_for_display = if let Some(exec) = apply_patch_exec.as_ref() {
vec!["apply_patch".to_string(), exec.action.patch.clone()]
} else {
params.command.clone()
};
let exec_command_context = ExecCommandContext {
sub_id: sub_id.clone(),
call_id: call_id.clone(),
command_for_display: command_for_display.clone(),
cwd: params.cwd.clone(),
apply_patch: apply_patch_exec.as_ref().map(
|ApplyPatchExec {
action,
user_explicitly_approved_this_action,
}| ApplyPatchCommandContext {
user_explicitly_approved_this_action: *user_explicitly_approved_this_action,
changes: convert_apply_patch_to_protocol(action),
},
),
tool_name: tool_name.to_string(),
otel_event_manager,
};
let mode = match apply_patch_exec {
Some(exec) => ExecutionMode::ApplyPatch(exec),
None => ExecutionMode::Shell,
};
sess.services.executor.update_environment(
turn_context.sandbox_policy.clone(),
turn_context.cwd.clone(),
);
let prepared_exec = PreparedExec::new(
exec_command_context,
params,
command_for_display,
mode,
Some(StdoutStream {
sub_id: sub_id.clone(),
call_id: call_id.clone(),
tx_event: sess.get_tx_event(),
}),
turn_context.shell_environment_policy.use_profile,
);
let output_result = sess
.run_exec_with_events(
turn_diff_tracker,
prepared_exec,
turn_context.approval_policy,
)
.await;
match output_result {
Ok(output) => {
let ExecToolCallOutput { exit_code, .. } = &output;
let content = format_exec_output_apply_patch(&output);
if *exit_code == 0 {
Ok(content)
} else {
Err(FunctionCallError::RespondToModel(content))
}
}
Err(ExecError::Function(err)) => Err(err),
Err(ExecError::Codex(CodexErr::Sandbox(SandboxErr::Timeout { output }))) => Err(
FunctionCallError::RespondToModel(format_exec_output_apply_patch(&output)),
),
Err(ExecError::Codex(err)) => Err(FunctionCallError::RespondToModel(format!(
"execution error: {err:?}"
))),
}
}
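/// Serializes exec output for the model as JSON: the (possibly truncated)
/// output text plus `exit_code` and `duration_seconds` metadata.
pub fn format_exec_output_apply_patch(exec_output: &ExecToolCallOutput) -> String {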
let ExecToolCallOutput {
exit_code,
duration,
..
} = exec_output;
#[derive(Serialize)]
struct ExecMetadata {
exit_code: i32,
duration_seconds: f32,
}
#[derive(Serialize)]
struct ExecOutput<'a> {
output: &'a str,
metadata: ExecMetadata,
}
// round to 1 decimal place
let duration_seconds = ((duration.as_secs_f32()) * 10.0).round() / 10.0;
let formatted_output = format_exec_output_str(exec_output);
let payload = ExecOutput {
output: &formatted_output,
metadata: ExecMetadata {
exit_code: *exit_code,
duration_seconds,
},
};
#[expect(clippy::expect_used)]
serde_json::to_string(&payload).expect("serialize ExecOutput")
}
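/// Formats the aggregated output for the model: prefixes a timeout notice when
/// the command timed out, then applies head+tail truncation if the text exceeds
/// `MODEL_FORMAT_MAX_BYTES` or `MODEL_FORMAT_MAX_LINES`.
pub fn format_exec_output_str(exec_output: &ExecToolCallOutput) -> String {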
let ExecToolCallOutput {
aggregated_output, ..
} = exec_output;
// Head+tail truncation for the model: show the beginning and end with an elision.
// Clients still receive full streams; only this formatted summary is capped.
let mut s = &aggregated_output.text;
let prefixed_str: String;
if exec_output.timed_out {
prefixed_str = format!(
"command timed out after {} milliseconds\n",
exec_output.duration.as_millis()
) + s;
s = &prefixed_str;
}
let total_lines = s.lines().count();
if s.len() <= MODEL_FORMAT_MAX_BYTES && total_lines <= MODEL_FORMAT_MAX_LINES {
return s.to_string();
}
let segments: Vec<&str> = s.split_inclusive('\n').collect();
let head_take = MODEL_FORMAT_HEAD_LINES.min(segments.len());
let tail_take = MODEL_FORMAT_TAIL_LINES.min(segments.len().saturating_sub(head_take));
let omitted = segments.len().saturating_sub(head_take + tail_take);
let head_slice_end: usize = segments
.iter()
.take(head_take)
.map(|segment| segment.len())
.sum();
let tail_slice_start: usize = if tail_take == 0 {
s.len()
} else {
s.len()
- segments
.iter()
.rev()
.take(tail_take)
.map(|segment| segment.len())
.sum::<usize>()
};
let marker = format!("\n[... omitted {omitted} of {total_lines} lines ...]\n\n");
// Byte budgets for head/tail around the marker
let mut head_budget = MODEL_FORMAT_HEAD_BYTES.min(MODEL_FORMAT_MAX_BYTES);
let tail_budget = MODEL_FORMAT_MAX_BYTES.saturating_sub(head_budget + marker.len());
if tail_budget == 0 && marker.len() >= MODEL_FORMAT_MAX_BYTES {
// Degenerate case: marker alone exceeds budget; return a clipped marker
return take_bytes_at_char_boundary(&marker, MODEL_FORMAT_MAX_BYTES).to_string();
}
if tail_budget == 0 {
// Make room for the marker by shrinking head
head_budget = MODEL_FORMAT_MAX_BYTES.saturating_sub(marker.len());
}
let head_slice = &s[..head_slice_end];
let head_part = take_bytes_at_char_boundary(head_slice, head_budget);
let mut result = String::with_capacity(MODEL_FORMAT_MAX_BYTES.min(s.len()));
result.push_str(head_part);
result.push_str(&marker);
let remaining = MODEL_FORMAT_MAX_BYTES.saturating_sub(result.len());
if remaining == 0 {
return result;
}
let tail_slice = &s[tail_slice_start..];
let tail_part = take_last_bytes_at_char_boundary(tail_slice, remaining);
result.push_str(tail_part);
result
}
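
The head+tail truncation above can be illustrated in isolation. This is a simplified sketch under stated assumptions, not the crate's implementation: `head_tail_lines` and `clip_at_char_boundary` are hypothetical names (the latter only stands in for the role of `take_bytes_at_char_boundary`, whose real implementation is not shown in this diff), and the sketch budgets by lines only rather than by lines and bytes.

```rust
/// Hypothetical helper: keep the first `head` and last `tail` lines of `s`
/// and replace everything in between with an elision marker.
fn head_tail_lines(s: &str, head: usize, tail: usize) -> String {
    // split_inclusive keeps the trailing '\n' on each segment, so pushing
    // segments back together reproduces the original text exactly.
    let segments: Vec<&str> = s.split_inclusive('\n').collect();
    if segments.len() <= head + tail {
        return s.to_string();
    }
    let omitted = segments.len() - head - tail;
    let mut out = String::new();
    for segment in &segments[..head] {
        out.push_str(segment);
    }
    out.push_str(&format!(
        "[... omitted {omitted} of {} lines ...]\n",
        segments.len()
    ));
    for segment in &segments[segments.len() - tail..] {
        out.push_str(segment);
    }
    out
}

/// Hypothetical helper: clip `s` to at most `max_bytes` without splitting a
/// multi-byte UTF-8 character.
fn clip_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Backing off to a char boundary matters because slicing a `&str` in the middle of a multi-byte character panics.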


@@ -0,0 +1,197 @@
use std::collections::HashMap;
use std::sync::Arc;
use std::time::Duration;
use async_trait::async_trait;
use codex_protocol::models::ResponseInputItem;
use tracing::warn;
use crate::client_common::tools::ToolSpec;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum ToolKind {
Function,
UnifiedExec,
Mcp,
}
#[async_trait]
pub trait ToolHandler: Send + Sync {
fn kind(&self) -> ToolKind;
fn matches_kind(&self, payload: &ToolPayload) -> bool {
matches!(
(self.kind(), payload),
(ToolKind::Function, ToolPayload::Function { .. })
| (ToolKind::UnifiedExec, ToolPayload::UnifiedExec { .. })
| (ToolKind::Mcp, ToolPayload::Mcp { .. })
)
}
async fn handle(&self, invocation: ToolInvocation<'_>)
-> Result<ToolOutput, FunctionCallError>;
}
pub struct ToolRegistry {
handlers: HashMap<String, Arc<dyn ToolHandler>>,
}
impl ToolRegistry {
pub fn new(handlers: HashMap<String, Arc<dyn ToolHandler>>) -> Self {
Self { handlers }
}
pub fn handler(&self, name: &str) -> Option<Arc<dyn ToolHandler>> {
self.handlers.get(name).map(Arc::clone)
}
// TODO(jif) for dynamic tools.
// pub fn register(&mut self, name: impl Into<String>, handler: Arc<dyn ToolHandler>) {
// let name = name.into();
// if self.handlers.insert(name.clone(), handler).is_some() {
// warn!("overwriting handler for tool {name}");
// }
// }
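/// Looks up the handler registered for the invocation's tool name, checks that
/// the payload kind matches the handler, runs the handler under telemetry
/// logging, and converts its output into a response item.
pub async fn dispatch<'a>(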
&self,
invocation: ToolInvocation<'a>,
) -> Result<ResponseInputItem, FunctionCallError> {
let tool_name = invocation.tool_name.clone();
let call_id_owned = invocation.call_id.clone();
let otel = invocation.turn.client.get_otel_event_manager();
let payload_for_response = invocation.payload.clone();
let log_payload = payload_for_response.log_payload();
let handler = match self.handler(tool_name.as_ref()) {
Some(handler) => handler,
None => {
let message =
unsupported_tool_call_message(&invocation.payload, tool_name.as_ref());
otel.tool_result(
tool_name.as_ref(),
&call_id_owned,
log_payload.as_ref(),
Duration::ZERO,
false,
&message,
);
return Err(FunctionCallError::RespondToModel(message));
}
};
if !handler.matches_kind(&invocation.payload) {
let message = format!("tool {tool_name} invoked with incompatible payload");
otel.tool_result(
tool_name.as_ref(),
&call_id_owned,
log_payload.as_ref(),
Duration::ZERO,
false,
&message,
);
return Err(FunctionCallError::Fatal(message));
}
let output_cell = tokio::sync::Mutex::new(None);
let result = otel
.log_tool_result(
tool_name.as_ref(),
&call_id_owned,
log_payload.as_ref(),
|| {
let handler = handler.clone();
let output_cell = &output_cell;
let invocation = invocation;
async move {
match handler.handle(invocation).await {
Ok(output) => {
let preview = output.log_preview();
let success = output.success_for_logging();
let mut guard = output_cell.lock().await;
*guard = Some(output);
Ok((preview, success))
}
Err(err) => Err(err),
}
}
},
)
.await;
match result {
Ok(_) => {
let mut guard = output_cell.lock().await;
let output = guard.take().ok_or_else(|| {
FunctionCallError::Fatal("tool produced no output".to_string())
})?;
Ok(output.into_response(&call_id_owned, &payload_for_response))
}
Err(err) => Err(err),
}
}
}
pub struct ToolRegistryBuilder {
handlers: HashMap<String, Arc<dyn ToolHandler>>,
specs: Vec<ToolSpec>,
}
impl ToolRegistryBuilder {
pub fn new() -> Self {
Self {
handlers: HashMap::new(),
specs: Vec::new(),
}
}
pub fn push_spec(&mut self, spec: ToolSpec) {
self.specs.push(spec);
}
pub fn register_handler(&mut self, name: impl Into<String>, handler: Arc<dyn ToolHandler>) {
let name = name.into();
if self.handlers.insert(name.clone(), handler).is_some() {
warn!("overwriting handler for tool {name}");
}
}
// TODO(jif) for dynamic tools.
// pub fn register_many<I>(&mut self, names: I, handler: Arc<dyn ToolHandler>)
// where
// I: IntoIterator,
// I::Item: Into<String>,
// {
// for name in names {
// let name = name.into();
// if self
// .handlers
// .insert(name.clone(), handler.clone())
// .is_some()
// {
// warn!("overwriting handler for tool {name}");
// }
// }
// }
pub fn build(self) -> (Vec<ToolSpec>, ToolRegistry) {
let registry = ToolRegistry::new(self.handlers);
(self.specs, registry)
}
}
fn unsupported_tool_call_message(payload: &ToolPayload, tool_name: &str) -> String {
match payload {
ToolPayload::Custom { .. } => format!("unsupported custom tool call: {tool_name}"),
_ => format!("unsupported call: {tool_name}"),
}
}
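
The registry's lookup-then-kind-check flow can be sketched without the async and telemetry machinery. Everything below is a simplified, synchronous illustration: `Kind`, `DispatchError`, `Handler`, `Echo`, and `Registry` are hypothetical stand-ins for `ToolKind`, `FunctionCallError`, `ToolHandler`, a concrete handler, and `ToolRegistry`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Function,
    Mcp,
}

#[derive(Debug, PartialEq)]
enum DispatchError {
    UnknownTool(String),
    IncompatiblePayload(String),
}

/// Hypothetical, synchronous stand-in for the async handler trait.
trait Handler: Send + Sync {
    fn kind(&self) -> Kind;
    fn handle(&self, input: &str) -> String;
}

struct Echo;

impl Handler for Echo {
    fn kind(&self) -> Kind {
        Kind::Function
    }
    fn handle(&self, input: &str) -> String {
        format!("echo: {input}")
    }
}

struct Registry {
    handlers: HashMap<String, Arc<dyn Handler>>,
}

impl Registry {
    /// Resolve the handler by name, reject mismatched payload kinds, then run it.
    fn dispatch(&self, name: &str, payload_kind: Kind, input: &str) -> Result<String, DispatchError> {
        let handler = self
            .handlers
            .get(name)
            .ok_or_else(|| DispatchError::UnknownTool(name.to_string()))?;
        if handler.kind() != payload_kind {
            return Err(DispatchError::IncompatiblePayload(name.to_string()));
        }
        Ok(handler.handle(input))
    }
}
```

Storing handlers as `Arc<dyn Handler>` lets one handler instance be registered under several names and cloned cheaply at dispatch time.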


@@ -0,0 +1,177 @@
use std::collections::HashMap;
use crate::client_common::tools::ToolSpec;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolRegistry;
use crate::tools::spec::ToolsConfig;
use crate::tools::spec::build_specs;
use crate::turn_diff_tracker::TurnDiffTracker;
use codex_protocol::models::LocalShellAction;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::models::ShellToolCallParams;
#[derive(Clone)]
pub struct ToolCall {
pub tool_name: String,
pub call_id: String,
pub payload: ToolPayload,
}
pub struct ToolRouter {
registry: ToolRegistry,
specs: Vec<ToolSpec>,
}
impl ToolRouter {
pub fn from_config(
config: &ToolsConfig,
mcp_tools: Option<HashMap<String, mcp_types::Tool>>,
) -> Self {
let builder = build_specs(config, mcp_tools);
let (specs, registry) = builder.build();
Self { registry, specs }
}
pub fn specs(&self) -> &[ToolSpec] {
&self.specs
}
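/// Converts a model response item into a `ToolCall`, or returns `Ok(None)`
/// when the item is not a tool call.
pub fn build_tool_call(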
session: &Session,
item: ResponseItem,
) -> Result<Option<ToolCall>, FunctionCallError> {
match item {
ResponseItem::FunctionCall {
name,
arguments,
call_id,
..
} => {
if let Some((server, tool)) = session.parse_mcp_tool_name(&name) {
Ok(Some(ToolCall {
tool_name: name,
call_id,
payload: ToolPayload::Mcp {
server,
tool,
raw_arguments: arguments,
},
}))
} else {
let payload = if name == "unified_exec" {
ToolPayload::UnifiedExec { arguments }
} else {
ToolPayload::Function { arguments }
};
Ok(Some(ToolCall {
tool_name: name,
call_id,
payload,
}))
}
}
ResponseItem::CustomToolCall {
name,
input,
call_id,
..
} => Ok(Some(ToolCall {
tool_name: name,
call_id,
payload: ToolPayload::Custom { input },
})),
ResponseItem::LocalShellCall {
id,
call_id,
action,
..
} => {
let call_id = call_id
.or(id)
.ok_or(FunctionCallError::MissingLocalShellCallId)?;
match action {
LocalShellAction::Exec(exec) => {
let params = ShellToolCallParams {
command: exec.command,
workdir: exec.working_directory,
timeout_ms: exec.timeout_ms,
with_escalated_permissions: None,
justification: None,
};
Ok(Some(ToolCall {
tool_name: "local_shell".to_string(),
call_id,
payload: ToolPayload::LocalShell { params },
}))
}
}
}
_ => Ok(None),
}
}
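/// Dispatches a tool call through the registry. Fatal errors propagate to the
/// caller; any other error is converted into a failure output for the model.
pub async fn dispatch_tool_call(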
&self,
session: &Session,
turn: &TurnContext,
tracker: &mut TurnDiffTracker,
sub_id: &str,
call: ToolCall,
) -> Result<ResponseInputItem, FunctionCallError> {
let ToolCall {
tool_name,
call_id,
payload,
} = call;
let payload_outputs_custom = matches!(payload, ToolPayload::Custom { .. });
let failure_call_id = call_id.clone();
let invocation = ToolInvocation {
session,
turn,
tracker,
sub_id,
call_id,
tool_name,
payload,
};
match self.registry.dispatch(invocation).await {
Ok(response) => Ok(response),
Err(FunctionCallError::Fatal(message)) => Err(FunctionCallError::Fatal(message)),
Err(err) => Ok(Self::failure_response(
failure_call_id,
payload_outputs_custom,
err,
)),
}
}
fn failure_response(
call_id: String,
payload_outputs_custom: bool,
err: FunctionCallError,
) -> ResponseInputItem {
let message = err.to_string();
if payload_outputs_custom {
ResponseInputItem::CustomToolCallOutput {
call_id,
output: message,
}
} else {
ResponseInputItem::FunctionCallOutput {
call_id,
output: codex_protocol::models::FunctionCallOutputPayload {
content: message,
success: Some(false),
},
}
}
}
}
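
The error-routing rule in `dispatch_tool_call` (fatal errors propagate, everything else is reported back to the model as a failed call) can be shown on its own. This is a minimal sketch with hypothetical types: `CallError` and `Response` stand in for `FunctionCallError` and `ResponseInputItem`, and `route` is an illustrative name.

```rust
#[derive(Debug, PartialEq)]
enum CallError {
    RespondToModel(String),
    Fatal(String),
}

#[derive(Debug, PartialEq)]
struct Response {
    call_id: String,
    content: String,
    success: bool,
}

/// Hypothetical helper: turn a handler result into either a response for the
/// model or a fatal error for the caller.
fn route(call_id: &str, result: Result<String, CallError>) -> Result<Response, CallError> {
    match result {
        Ok(content) => Ok(Response {
            call_id: call_id.to_string(),
            content,
            success: true,
        }),
        // Fatal errors abort the turn instead of being shown to the model.
        Err(CallError::Fatal(message)) => Err(CallError::Fatal(message)),
        // Recoverable errors become a failed tool output the model can react to.
        Err(CallError::RespondToModel(message)) => Ok(Response {
            call_id: call_id.to_string(),
            content: message,
            success: false,
        }),
    }
}
```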

File diff suppressed because it is too large


@@ -7,6 +7,9 @@ use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::config::ConfigToml;
#[cfg(target_os = "linux")]
use assert_cmd::cargo::cargo_bin;
pub mod responses;
pub mod test_codex;
pub mod test_codex_exec;
@@ -17,12 +20,25 @@ pub mod test_codex_exec;
pub fn load_default_config_for_test(codex_home: &TempDir) -> Config {
Config::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
default_test_overrides(),
codex_home.path().to_path_buf(),
)
.expect("defaults for test should always succeed")
}
#[cfg(target_os = "linux")]
fn default_test_overrides() -> ConfigOverrides {
ConfigOverrides {
codex_linux_sandbox_exe: Some(cargo_bin("codex-linux-sandbox")),
..ConfigOverrides::default()
}
}
#[cfg(not(target_os = "linux"))]
fn default_test_overrides() -> ConfigOverrides {
ConfigOverrides::default()
}
/// Builds an SSE stream body from a JSON fixture.
///
/// The fixture must contain an array of objects where each object represents a


@@ -13,7 +13,7 @@ use tempfile::TempDir;
use crate::load_default_config_for_test;
type ConfigMutator = dyn FnOnce(&mut Config);
type ConfigMutator = dyn FnOnce(&mut Config) + Send;
pub struct TestCodexBuilder {
config_mutators: Vec<Box<ConfigMutator>>,
@@ -22,7 +22,7 @@ pub struct TestCodexBuilder {
impl TestCodexBuilder {
pub fn with_config<T>(mut self, mutator: T) -> Self
where
T: FnOnce(&mut Config) + 'static,
T: FnOnce(&mut Config) + Send + 'static,
{
self.config_mutators.push(Box::new(mutator));
self


@@ -12,12 +12,18 @@ mod fork_conversation;
mod json_result;
mod live_cli;
mod model_overrides;
mod model_tools;
mod otel;
mod prompt_caching;
mod read_file;
mod review;
mod rmcp_client;
mod rollout_list_find;
mod seatbelt;
mod stream_error_allows_next_turn;
mod stream_no_completed;
mod tool_harness;
mod tools;
mod unified_exec;
mod user_notification;
mod view_image;


@@ -0,0 +1,131 @@
#![allow(clippy::unwrap_used)]
use codex_core::CodexAuth;
use codex_core::ConversationManager;
use codex_core::ModelProviderInfo;
use codex_core::built_in_model_providers;
use codex_core::model_family::find_family_for_model;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use core_test_support::load_default_config_for_test;
use core_test_support::load_sse_fixture_with_id;
use core_test_support::skip_if_no_network;
use core_test_support::wait_for_event;
use tempfile::TempDir;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
use wiremock::matchers::method;
use wiremock::matchers::path;
fn sse_completed(id: &str) -> String {
load_sse_fixture_with_id("tests/fixtures/completed_template.json", id)
}
#[allow(clippy::expect_used)]
fn tool_identifiers(body: &serde_json::Value) -> Vec<String> {
body["tools"]
.as_array()
.unwrap()
.iter()
.map(|tool| {
tool.get("name")
.and_then(|v| v.as_str())
.or_else(|| tool.get("type").and_then(|v| v.as_str()))
.map(std::string::ToString::to_string)
.expect("tool should have either name or type")
})
.collect()
}
#[allow(clippy::expect_used)]
async fn collect_tool_identifiers_for_model(model: &str) -> Vec<String> {
let server = MockServer::start().await;
let sse = sse_completed(model);
let template = ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_raw(sse, "text/event-stream");
Mock::given(method("POST"))
.and(path("/v1/responses"))
.respond_with(template)
.expect(1)
.mount(&server)
.await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
let cwd = TempDir::new().unwrap();
let codex_home = TempDir::new().unwrap();
let mut config = load_default_config_for_test(&codex_home);
config.cwd = cwd.path().to_path_buf();
config.model_provider = model_provider;
config.model = model.to_string();
config.model_family =
find_family_for_model(model).unwrap_or_else(|| panic!("unknown model family for {model}"));
config.include_plan_tool = false;
config.include_apply_patch_tool = false;
config.include_view_image_tool = false;
config.tools_web_search_request = false;
config.use_experimental_streamable_shell_tool = false;
config.use_experimental_unified_exec_tool = false;
let conversation_manager =
ConversationManager::with_auth(CodexAuth::from_api_key("Test API Key"));
let codex = conversation_manager
.new_conversation(config)
.await
.expect("create new conversation")
.conversation;
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: "hello tools".into(),
}],
})
.await
.unwrap();
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let requests = server.received_requests().await.unwrap();
assert_eq!(
requests.len(),
1,
"expected a single request for model {model}"
);
let body = requests[0].body_json::<serde_json::Value>().unwrap();
tool_identifiers(&body)
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn model_selects_expected_tools() {
skip_if_no_network!();
use pretty_assertions::assert_eq;
let codex_tools = collect_tool_identifiers_for_model("codex-mini-latest").await;
assert_eq!(
codex_tools,
vec!["local_shell".to_string()],
"codex-mini-latest should expose the local shell tool",
);
let o3_tools = collect_tool_identifiers_for_model("o3").await;
assert_eq!(
o3_tools,
vec!["shell".to_string()],
"o3 should expose the generic shell tool",
);
let gpt5_codex_tools = collect_tool_identifiers_for_model("gpt-5-codex").await;
assert_eq!(
gpt5_codex_tools,
vec!["shell".to_string(), "read_file".to_string()],
"gpt-5-codex should expose the beta read_file tool",
);
}


@@ -219,7 +219,13 @@ async fn prompt_tools_are_consistent_across_requests() {
// our internal implementation is responsible for keeping tools in sync
// with the OpenAI schema, so we just verify the tool presence here
let expected_tools_names: &[&str] = &["shell", "update_plan", "apply_patch", "view_image"];
let expected_tools_names: &[&str] = &[
"shell",
"update_plan",
"apply_patch",
"read_file",
"view_image",
];
let body0 = requests[0].body_json::<serde_json::Value>().unwrap();
assert_eq!(
body0["instructions"],


@@ -0,0 +1,124 @@
#![cfg(not(target_os = "windows"))]
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use core_test_support::responses;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use pretty_assertions::assert_eq;
use serde_json::Value;
use wiremock::matchers::any;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn read_file_tool_returns_requested_lines() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
cwd,
session_configured,
..
} = test_codex().build(&server).await?;
let file_path = cwd.path().join("sample.txt");
std::fs::write(&file_path, "first\nsecond\nthird\nfourth\n")?;
let file_path = file_path.to_string_lossy().to_string();
let call_id = "read-file-call";
let arguments = serde_json::json!({
"file_path": file_path,
"offset": 2,
"limit": 2,
})
.to_string();
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_function_call(call_id, "read_file", &arguments),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please inspect sample.txt".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let requests = server.received_requests().await.expect("recorded requests");
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().unwrap())
.collect::<Vec<_>>();
assert!(
!request_bodies.is_empty(),
"expected at least one request body"
);
let tool_output_item = request_bodies
.iter()
.find_map(|body| {
body.get("input")
.and_then(Value::as_array)
.and_then(|items| {
items.iter().find(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call_output")
})
})
})
.unwrap_or_else(|| {
panic!("function_call_output item not found in requests: {request_bodies:#?}")
});
assert_eq!(
tool_output_item.get("call_id").and_then(Value::as_str),
Some(call_id)
);
let output_text = tool_output_item
.get("output")
.and_then(|value| match value {
Value::String(text) => Some(text.as_str()),
Value::Object(obj) => obj.get("content").and_then(Value::as_str),
_ => None,
})
.expect("output text present");
assert_eq!(output_text, "L2: second\nL3: third");
Ok(())
}
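
The test above expects `offset: 2, limit: 2` to yield `L2: second\nL3: third`, which suggests a 1-indexed offset and an `L<n>: ` prefix per line. The `read_file` handler itself is not part of this excerpt, so the following is only a sketch of logic that would produce that output; `read_numbered_lines` is a hypothetical name.

```rust
/// Hypothetical helper: return `limit` lines starting at 1-indexed `offset`,
/// each prefixed with its line number as `L<n>: `.
fn read_numbered_lines(text: &str, offset: usize, limit: usize) -> String {
    text.lines()
        .enumerate()
        // offset is 1-indexed; saturating_sub keeps offset 0 from underflowing.
        .skip(offset.saturating_sub(1))
        .take(limit)
        .map(|(index, line)| format!("L{}: {line}", index + 1))
        .collect::<Vec<_>>()
        .join("\n")
}
```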


@@ -1,6 +1,11 @@
use std::collections::HashMap;
use std::ffi::OsString;
use std::fs;
use std::net::TcpListener;
use std::path::Path;
use std::time::Duration;
use std::time::SystemTime;
use std::time::UNIX_EPOCH;
use codex_core::config_types::McpServerConfig;
use codex_core::config_types::McpServerTransportConfig;
@@ -19,6 +24,8 @@ use core_test_support::wait_for_event;
use core_test_support::wait_for_event_with_timeout;
use escargot::CargoBuild;
use serde_json::Value;
use serial_test::serial;
use tempfile::tempdir;
use tokio::net::TcpStream;
use tokio::process::Child;
use tokio::process::Command;
@@ -328,6 +335,189 @@ async fn streamable_http_tool_call_round_trip() -> anyhow::Result<()> {
Ok(())
}
/// This test writes to a fallback credentials file in CODEX_HOME.
/// Ideally we would not need to serialize this test, but wiring CODEX_HOME through the code is much more cumbersome.
#[serial(codex_home)]
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
async fn streamable_http_with_oauth_round_trip() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let call_id = "call-789";
let server_name = "rmcp_http_oauth";
let tool_name = format!("{server_name}__echo");
mount_sse_once_match(
&server,
any(),
responses::sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
responses::ev_function_call(call_id, &tool_name, "{\"message\":\"ping\"}"),
responses::ev_completed("resp-1"),
]),
)
.await;
mount_sse_once_match(
&server,
any(),
responses::sse(vec![
responses::ev_assistant_message(
"msg-1",
"rmcp streamable http oauth echo tool completed successfully.",
),
responses::ev_completed("resp-2"),
]),
)
.await;
let expected_env_value = "propagated-env-http-oauth";
let expected_token = "initial-access-token";
let client_id = "test-client-id";
let refresh_token = "initial-refresh-token";
let rmcp_http_server_bin = CargoBuild::new()
.package("codex-rmcp-client")
.bin("test_streamable_http_server")
.run()?
.path()
.to_string_lossy()
.into_owned();
let listener = TcpListener::bind("127.0.0.1:0")?;
let port = listener.local_addr()?.port();
drop(listener);
let bind_addr = format!("127.0.0.1:{port}");
let server_url = format!("http://{bind_addr}/mcp");
let mut http_server_child = Command::new(&rmcp_http_server_bin)
.kill_on_drop(true)
.env("MCP_STREAMABLE_HTTP_BIND_ADDR", &bind_addr)
.env("MCP_EXPECT_BEARER", expected_token)
.env("MCP_TEST_VALUE", expected_env_value)
.spawn()?;
wait_for_streamable_http_server(&mut http_server_child, &bind_addr, Duration::from_secs(5))
.await?;
let temp_home = tempdir()?;
let _guard = EnvVarGuard::set("CODEX_HOME", temp_home.path().as_os_str());
write_fallback_oauth_tokens(
temp_home.path(),
server_name,
&server_url,
client_id,
expected_token,
refresh_token,
)?;
let fixture = test_codex()
.with_config(move |config| {
config.use_experimental_use_rmcp_client = true;
config.mcp_servers.insert(
server_name.to_string(),
McpServerConfig {
transport: McpServerTransportConfig::StreamableHttp {
url: server_url,
bearer_token: None,
},
startup_timeout_sec: Some(Duration::from_secs(10)),
tool_timeout_sec: None,
},
);
})
.build(&server)
.await?;
let session_model = fixture.session_configured.model.clone();
fixture
.codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "call the rmcp streamable http oauth echo tool".into(),
}],
final_output_json_schema: None,
cwd: fixture.cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
unreachable!("event guard guarantees McpToolCallBegin");
};
assert_eq!(begin.invocation.server, server_name);
assert_eq!(begin.invocation.tool, "echo");
let end_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallEnd(_))
})
.await;
let EventMsg::McpToolCallEnd(end) = end_event else {
unreachable!("event guard guarantees McpToolCallEnd");
};
let result = end
.result
.as_ref()
.expect("rmcp echo tool should return success");
assert_eq!(result.is_error, Some(false));
assert!(
result.content.is_empty(),
"content should default to an empty array"
);
let structured = result
.structured_content
.as_ref()
.expect("structured content");
let Value::Object(map) = structured else {
panic!("structured content should be an object: {structured:?}");
};
let echo_value = map
.get("echo")
.and_then(Value::as_str)
.expect("echo payload present");
assert_eq!(echo_value, "ECHOING: ping");
let env_value = map
.get("env")
.and_then(Value::as_str)
.expect("env snapshot inserted");
assert_eq!(env_value, expected_env_value);
wait_for_event(&fixture.codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
server.verify().await;
match http_server_child.try_wait() {
Ok(Some(_)) => {}
Ok(None) => {
let _ = http_server_child.kill().await;
}
Err(error) => {
eprintln!("failed to check streamable http oauth server status: {error}");
let _ = http_server_child.kill().await;
}
}
if let Err(error) = http_server_child.wait().await {
eprintln!("failed to await streamable http oauth server shutdown: {error}");
}
Ok(())
}
async fn wait_for_streamable_http_server(
server_child: &mut Child,
address: &str,
@@ -369,3 +559,60 @@ async fn wait_for_streamable_http_server(
sleep(Duration::from_millis(50)).await;
}
}
fn write_fallback_oauth_tokens(
home: &Path,
server_name: &str,
server_url: &str,
client_id: &str,
access_token: &str,
refresh_token: &str,
) -> anyhow::Result<()> {
let expires_at = SystemTime::now()
.checked_add(Duration::from_secs(3600))
.ok_or_else(|| anyhow::anyhow!("failed to compute expiry time"))?
.duration_since(UNIX_EPOCH)?
.as_millis() as u64;
let store = serde_json::json!({
"stub": {
"server_name": server_name,
"server_url": server_url,
"client_id": client_id,
"access_token": access_token,
"expires_at": expires_at,
"refresh_token": refresh_token,
"scopes": ["profile"],
}
});
let file_path = home.join(".credentials.json");
fs::write(&file_path, serde_json::to_vec(&store)?)?;
Ok(())
}
struct EnvVarGuard {
key: &'static str,
original: Option<OsString>,
}
impl EnvVarGuard {
fn set(key: &'static str, value: &std::ffi::OsStr) -> Self {
let original = std::env::var_os(key);
unsafe {
std::env::set_var(key, value);
}
Self { key, original }
}
}
impl Drop for EnvVarGuard {
fn drop(&mut self) {
unsafe {
match &self.original {
Some(value) => std::env::set_var(self.key, value),
None => std::env::remove_var(self.key),
}
}
}
}
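The guard above can also be wrapped in a closure-based helper so the override never outlives the code that needs it. The following is a minimal, standard-library-only sketch; `ScopedEnvVar` and `with_env_var` are hypothetical names used for illustration and are not part of this change. Environment variables are process-global, which is why `std::env::set_var` and `std::env::remove_var` are `unsafe` in the Rust 2024 edition; the sketch assumes no other thread reads or writes the environment while the override is active.

```rust
use std::env;
use std::ffi::{OsStr, OsString};

/// Hypothetical guard: restores an environment variable to its previous
/// state (set or unset) when dropped.
struct ScopedEnvVar {
    key: &'static str,
    original: Option<OsString>,
}

impl ScopedEnvVar {
    // `unsafe` is only required in edition 2024; the allow keeps older
    // editions free of `unused_unsafe` warnings.
    #[allow(unused_unsafe)]
    fn set(key: &'static str, value: &OsStr) -> Self {
        let original = env::var_os(key);
        // Assumption: no other thread touches the environment concurrently.
        unsafe { env::set_var(key, value) };
        Self { key, original }
    }
}

impl Drop for ScopedEnvVar {
    #[allow(unused_unsafe)]
    fn drop(&mut self) {
        // Same single-threaded assumption as in `set`.
        unsafe {
            match &self.original {
                Some(value) => env::set_var(self.key, value),
                None => env::remove_var(self.key),
            }
        }
    }
}

/// Hypothetical helper: runs `body` with `key` overridden, then restores
/// the previous value even if `body` panics (the guard is dropped on unwind).
fn with_env_var<T, F>(key: &'static str, value: &str, body: F) -> T
where
    F: FnOnce() -> T,
{
    let _guard = ScopedEnvVar::set(key, OsStr::new(value));
    body()
}
```

A test would call `with_env_var("SOME_KEY", "1", || { ... })` and rely on the guard's `Drop` to undo the change, rather than holding a guard binding for the rest of the function.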


@@ -169,6 +169,12 @@ async fn python_getpwuid_works_under_seatbelt() {
return;
}
// Skip when python3 is unavailable, e.g. on a local dev machine.
if which::which("python3").is_err() {
eprintln!("python3 not found in PATH, skipping test.");
return;
}
// ReadOnly is sufficient here since we are only exercising user lookup.
let policy = SandboxPolicy::ReadOnly;
let command_cwd = std::env::current_dir().expect("getcwd");


@@ -0,0 +1,568 @@
#![cfg(not(target_os = "windows"))]
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::plan_tool::StepStatus;
use core_test_support::responses;
use core_test_support::responses::ev_apply_patch_function_call;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::ev_local_shell_call;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use serde_json::Value;
use serde_json::json;
use wiremock::matchers::any;
// Returns the first `function_call_output` item in a request body's `input` array, if any.
fn function_call_output(body: &Value) -> Option<&Value> {
body.get("input")
.and_then(Value::as_array)
.and_then(|items| {
items.iter().find(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call_output")
})
})
}
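// Returns the tool output text, accepting either a bare string or an object with a `content` string.
fn extract_output_text(item: &Value) -> Option<&str> {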
item.get("output").and_then(|value| match value {
Value::String(text) => Some(text.as_str()),
Value::Object(obj) => obj.get("content").and_then(Value::as_str),
_ => None,
})
}
fn find_request_with_function_call_output(requests: &[Value]) -> Option<&Value> {
requests
.iter()
.find(|body| function_call_output(body).is_some())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn shell_tool_executes_command_and_streams_output() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.include_apply_patch_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "shell-tool-call";
let command = vec!["/bin/echo", "tool harness"];
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_local_shell_call(call_id, "completed", command),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "all done"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please run the shell command".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = codex.next_event().await.expect("event");
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
let output_text = extract_output_text(output_item).expect("output text present");
let exec_output: Value = serde_json::from_str(output_text)?;
assert_eq!(exec_output["metadata"]["exit_code"], 0);
let stdout = exec_output["output"].as_str().expect("stdout field");
assert!(
stdout.contains("tool harness"),
"expected stdout to contain command output, got {stdout:?}"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn update_plan_tool_emits_plan_update_event() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.include_plan_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "plan-tool-call";
let plan_args = json!({
"explanation": "Tool harness check",
"plan": [
{"step": "Inspect workspace", "status": "in_progress"},
{"step": "Report results", "status": "pending"},
],
})
.to_string();
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_function_call(call_id, "update_plan", &plan_args),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "plan acknowledged"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please update the plan".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let mut saw_plan_update = false;
loop {
let event = codex.next_event().await.expect("event");
match event.msg {
EventMsg::PlanUpdate(update) => {
saw_plan_update = true;
assert_eq!(update.explanation.as_deref(), Some("Tool harness check"));
assert_eq!(update.plan.len(), 2);
assert_eq!(update.plan[0].step, "Inspect workspace");
assert!(matches!(update.plan[0].status, StepStatus::InProgress));
assert_eq!(update.plan[1].step, "Report results");
assert!(matches!(update.plan[1].status, StepStatus::Pending));
}
EventMsg::TaskComplete(_) => break,
_ => {}
}
}
assert!(saw_plan_update, "expected PlanUpdate event");
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
assert_eq!(
output_item.get("call_id").and_then(Value::as_str),
Some(call_id)
);
let output_text = extract_output_text(output_item).expect("output text present");
assert_eq!(output_text, "Plan updated");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn update_plan_tool_rejects_malformed_payload() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.include_plan_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "plan-tool-invalid";
let invalid_args = json!({
"explanation": "Missing plan data"
})
.to_string();
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_function_call(call_id, "update_plan", &invalid_args),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "malformed plan payload"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please update the plan".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let mut saw_plan_update = false;
loop {
let event = codex.next_event().await.expect("event");
match event.msg {
EventMsg::PlanUpdate(_) => saw_plan_update = true,
EventMsg::TaskComplete(_) => break,
_ => {}
}
}
assert!(
!saw_plan_update,
"did not expect PlanUpdate event for malformed payload"
);
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
assert_eq!(
output_item.get("call_id").and_then(Value::as_str),
Some(call_id)
);
let output_text = extract_output_text(output_item).expect("output text present");
assert!(
output_text.contains("failed to parse function arguments"),
"expected parse error message in output text, got {output_text:?}"
);
if let Some(success_flag) = output_item
.get("output")
.and_then(|value| value.as_object())
.and_then(|obj| obj.get("success"))
.and_then(serde_json::Value::as_bool)
{
assert!(
!success_flag,
"expected tool output to mark success=false for malformed payload"
);
}
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn apply_patch_tool_executes_and_emits_patch_events() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.include_apply_patch_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "apply-patch-call";
let patch_content = r#"*** Begin Patch
*** Add File: notes.txt
+Tool harness apply patch
*** End Patch"#;
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_apply_patch_function_call(call_id, patch_content),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "patch complete"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please apply a patch".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let mut saw_patch_begin = false;
let mut patch_end_success = None;
loop {
let event = codex.next_event().await.expect("event");
match event.msg {
EventMsg::PatchApplyBegin(begin) => {
saw_patch_begin = true;
assert_eq!(begin.call_id, call_id);
}
EventMsg::PatchApplyEnd(end) => {
assert_eq!(end.call_id, call_id);
patch_end_success = Some(end.success);
}
EventMsg::TaskComplete(_) => break,
_ => {}
}
}
assert!(saw_patch_begin, "expected PatchApplyBegin event");
let patch_end_success =
patch_end_success.expect("expected PatchApplyEnd event to capture success flag");
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
assert_eq!(
output_item.get("call_id").and_then(Value::as_str),
Some(call_id)
);
let output_text = extract_output_text(output_item).expect("output text present");
if let Ok(exec_output) = serde_json::from_str::<Value>(output_text) {
let exit_code = exec_output["metadata"]["exit_code"]
.as_i64()
.expect("exit_code present");
let summary = exec_output["output"].as_str().expect("output field");
assert_eq!(
exit_code, 0,
"expected apply_patch exit_code=0, got {exit_code}, summary: {summary:?}"
);
assert!(
patch_end_success,
"expected PatchApplyEnd success flag, summary: {summary:?}"
);
assert!(
summary.contains("Success."),
"expected apply_patch summary to note success, got {summary:?}"
);
let patched_path = cwd.path().join("notes.txt");
let contents = std::fs::read_to_string(&patched_path)
.unwrap_or_else(|e| panic!("failed reading {}: {e}", patched_path.display()));
assert_eq!(contents, "Tool harness apply patch\n");
} else {
assert!(
output_text.contains("codex-run-as-apply-patch"),
"expected apply_patch failure message to mention codex-run-as-apply-patch, got {output_text:?}"
);
assert!(
!patch_end_success,
"expected PatchApplyEnd to report success=false when apply_patch invocation fails"
);
}
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn apply_patch_reports_parse_diagnostics() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.include_apply_patch_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "apply-patch-parse-error";
let patch_content = r"*** Begin Patch
*** Update File: broken.txt
*** End Patch";
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_apply_patch_function_call(call_id, patch_content),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "failed"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please apply a patch".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = codex.next_event().await.expect("event");
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
assert_eq!(
output_item.get("call_id").and_then(Value::as_str),
Some(call_id)
);
let output_text = extract_output_text(output_item).expect("output text present");
assert!(
output_text.contains("apply_patch verification failed"),
"expected apply_patch verification failure message, got {output_text:?}"
);
assert!(
output_text.contains("invalid hunk"),
"expected parse diagnostics in output text, got {output_text:?}"
);
if let Some(success_flag) = output_item
.get("output")
.and_then(|value| value.as_object())
.and_then(|obj| obj.get("success"))
.and_then(serde_json::Value::as_bool)
{
assert!(
!success_flag,
"expected tool output to mark success=false for parse failures"
);
}
Ok(())
}


@@ -0,0 +1,460 @@
#![cfg(not(target_os = "windows"))]
#![allow(clippy::unwrap_used, clippy::expect_used)]
use anyhow::Result;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_custom_tool_call;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use serde_json::Value;
use serde_json::json;
use wiremock::Request;
async fn submit_turn(
test: &TestCodex,
prompt: &str,
approval_policy: AskForApproval,
sandbox_policy: SandboxPolicy,
) -> Result<()> {
let session_model = test.session_configured.model.clone();
test.codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: prompt.into(),
}],
final_output_json_schema: None,
cwd: test.cwd.path().to_path_buf(),
approval_policy,
sandbox_policy,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = test.codex.next_event().await?;
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
Ok(())
}
fn request_bodies(requests: &[Request]) -> Result<Vec<Value>> {
requests
.iter()
.map(|req| Ok(serde_json::from_slice::<Value>(&req.body)?))
.collect()
}
// Collects every `input` item whose `type` equals `ty`, across all request bodies.
fn collect_output_items<'a>(bodies: &'a [Value], ty: &str) -> Vec<&'a Value> {
let mut out = Vec::new();
for body in bodies {
if let Some(items) = body.get("input").and_then(Value::as_array) {
for item in items {
if item.get("type").and_then(Value::as_str) == Some(ty) {
out.push(item);
}
}
}
}
out
}
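// Lists the entries of the request's `tools` array by `name`, falling back to `type` when a tool has no `name`.
fn tool_names(body: &Value) -> Vec<String> {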
body.get("tools")
.and_then(Value::as_array)
.map(|tools| {
tools
.iter()
.filter_map(|tool| {
tool.get("name")
.or_else(|| tool.get("type"))
.and_then(Value::as_str)
.map(str::to_string)
})
.collect()
})
.unwrap_or_default()
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn custom_tool_unknown_returns_custom_output_error() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex();
let test = builder.build(&server).await?;
let call_id = "custom-unsupported";
let tool_name = "unsupported_tool";
let responses = vec![
sse(vec![
json!({"type": "response.created", "response": {"id": "resp-1"}}),
ev_custom_tool_call(call_id, tool_name, "\"payload\""),
ev_completed("resp-1"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
submit_turn(
&test,
"invoke custom tool",
AskForApproval::Never,
SandboxPolicy::DangerFullAccess,
)
.await?;
let requests = server.received_requests().await.expect("recorded requests");
let bodies = request_bodies(&requests)?;
let custom_items = collect_output_items(&bodies, "custom_tool_call_output");
assert_eq!(custom_items.len(), 1, "expected single custom tool output");
let item = custom_items[0];
assert_eq!(item.get("call_id").and_then(Value::as_str), Some(call_id));
let output = item
.get("output")
.and_then(Value::as_str)
.unwrap_or_default();
let expected = format!("unsupported custom tool call: {tool_name}");
assert_eq!(output, expected);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn shell_escalated_permissions_rejected_then_ok() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex();
let test = builder.build(&server).await?;
let command = ["/bin/echo", "shell ok"];
let call_id_blocked = "shell-blocked";
let call_id_success = "shell-success";
let first_args = json!({
"command": command,
"timeout_ms": 1_000,
"with_escalated_permissions": true,
});
let second_args = json!({
"command": command,
"timeout_ms": 1_000,
});
let responses = vec![
sse(vec![
json!({"type": "response.created", "response": {"id": "resp-1"}}),
ev_function_call(
call_id_blocked,
"shell",
&serde_json::to_string(&first_args)?,
),
ev_completed("resp-1"),
]),
sse(vec![
json!({"type": "response.created", "response": {"id": "resp-2"}}),
ev_function_call(
call_id_success,
"shell",
&serde_json::to_string(&second_args)?,
),
ev_completed("resp-2"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-3"),
]),
];
mount_sse_sequence(&server, responses).await;
submit_turn(
&test,
"run the shell command",
AskForApproval::Never,
SandboxPolicy::DangerFullAccess,
)
.await?;
let requests = server.received_requests().await.expect("recorded requests");
let bodies = request_bodies(&requests)?;
let function_outputs = collect_output_items(&bodies, "function_call_output");
for item in &function_outputs {
let call_id = item
.get("call_id")
.and_then(Value::as_str)
.unwrap_or_default();
assert!(
call_id == call_id_blocked || call_id == call_id_success,
"unexpected call id {call_id}"
);
}
let policy = AskForApproval::Never;
let expected_message = format!(
"approval policy is {policy:?}; reject command — you should not ask for escalated permissions if the approval policy is {policy:?}"
);
let blocked_outputs: Vec<&Value> = function_outputs
.iter()
.filter(|item| item.get("call_id").and_then(Value::as_str) == Some(call_id_blocked))
.copied()
.collect();
assert!(
!blocked_outputs.is_empty(),
"expected at least one rejection output for {call_id_blocked}"
);
for item in blocked_outputs {
assert_eq!(
item.get("output").and_then(Value::as_str),
Some(expected_message.as_str()),
"unexpected rejection message"
);
}
let success_item = function_outputs
.iter()
.find(|item| item.get("call_id").and_then(Value::as_str) == Some(call_id_success))
.expect("success output present");
let output_json: Value = serde_json::from_str(
success_item
.get("output")
.and_then(Value::as_str)
.expect("success output string"),
)?;
assert_eq!(
output_json["metadata"]["exit_code"].as_i64(),
Some(0),
"expected exit code 0 after rerunning without escalation",
);
let stdout = output_json["output"].as_str().unwrap_or_default();
assert!(
stdout.contains("shell ok"),
"expected stdout to include command output, got {stdout:?}"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn local_shell_missing_ids_maps_to_function_output_error() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex();
let test = builder.build(&server).await?;
let local_shell_event = json!({
"type": "response.output_item.done",
"item": {
"type": "local_shell_call",
"status": "completed",
"action": {
"type": "exec",
"command": ["/bin/echo", "hi"],
}
}
});
let responses = vec![
sse(vec![
json!({"type": "response.created", "response": {"id": "resp-1"}}),
local_shell_event,
ev_completed("resp-1"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
submit_turn(
&test,
"check shell output",
AskForApproval::Never,
SandboxPolicy::DangerFullAccess,
)
.await?;
let requests = server.received_requests().await.expect("recorded requests");
let bodies = request_bodies(&requests)?;
let function_outputs = collect_output_items(&bodies, "function_call_output");
assert_eq!(
function_outputs.len(),
1,
"expected a single function output"
);
let item = function_outputs[0];
assert_eq!(item.get("call_id").and_then(Value::as_str), Some(""));
assert_eq!(
item.get("output").and_then(Value::as_str),
Some("LocalShellCall without call_id or id"),
);
Ok(())
}
async fn collect_tools(use_unified_exec: bool) -> Result<Vec<String>> {
let server = start_mock_server().await;
let responses = vec![sse(vec![
json!({"type": "response.created", "response": {"id": "resp-1"}}),
ev_assistant_message("msg-1", "done"),
ev_completed("resp-1"),
])];
mount_sse_sequence(&server, responses).await;
let mut builder = test_codex().with_config(move |config| {
config.use_experimental_unified_exec_tool = use_unified_exec;
});
let test = builder.build(&server).await?;
submit_turn(
&test,
"list tools",
AskForApproval::Never,
SandboxPolicy::DangerFullAccess,
)
.await?;
let requests = server.received_requests().await.expect("recorded requests");
assert_eq!(
requests.len(),
1,
"expected a single request for tools collection"
);
let bodies = request_bodies(&requests)?;
let first_body = bodies.first().expect("request body present");
Ok(tool_names(first_body))
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_spec_toggle_end_to_end() -> Result<()> {
skip_if_no_network!(Ok(()));
let tools_disabled = collect_tools(false).await?;
assert!(
!tools_disabled.iter().any(|name| name == "unified_exec"),
"tools list should not include unified_exec when disabled: {tools_disabled:?}"
);
let tools_enabled = collect_tools(true).await?;
assert!(
tools_enabled.iter().any(|name| name == "unified_exec"),
"tools list should include unified_exec when enabled: {tools_enabled:?}"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn shell_timeout_includes_timeout_prefix_and_metadata() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex();
let test = builder.build(&server).await?;
let call_id = "shell-timeout";
let timeout_ms = 50u64;
let args = json!({
"command": ["/bin/sh", "-c", "yes line | head -n 400; sleep 1"],
"timeout_ms": timeout_ms,
});
let responses = vec![
sse(vec![
json!({"type": "response.created", "response": {"id": "resp-1"}}),
ev_function_call(call_id, "shell", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
submit_turn(
&test,
"run a long command",
AskForApproval::Never,
SandboxPolicy::DangerFullAccess,
)
.await?;
let requests = server.received_requests().await.expect("recorded requests");
let bodies = request_bodies(&requests)?;
let function_outputs = collect_output_items(&bodies, "function_call_output");
let timeout_item = function_outputs
.iter()
.find(|item| item.get("call_id").and_then(Value::as_str) == Some(call_id))
.expect("timeout output present");
let output_str = timeout_item
.get("output")
.and_then(Value::as_str)
.expect("timeout output string");
// The exec path can report a timeout in two ways depending on timing:
// 1) Structured JSON with exit_code 124 and a timeout prefix (preferred), or
// 2) A plain error string if the child is observed as killed by a signal first.
if let Ok(output_json) = serde_json::from_str::<Value>(output_str) {
assert_eq!(
output_json["metadata"]["exit_code"].as_i64(),
Some(124),
"expected timeout exit code 124",
);
let stdout = output_json["output"].as_str().unwrap_or_default();
assert!(
stdout.starts_with("command timed out after "),
"expected timeout prefix, got {stdout:?}"
);
let first_line = stdout.lines().next().unwrap_or_default();
let duration_ms = first_line
.strip_prefix("command timed out after ")
.and_then(|line| line.strip_suffix(" milliseconds"))
.and_then(|value| value.parse::<u64>().ok())
.unwrap_or_default();
assert!(
duration_ms >= timeout_ms,
"expected duration >= configured timeout, got {duration_ms} (timeout {timeout_ms})"
);
} else {
// Fallback: accept the signal classification path to deflake the test.
assert!(
output_str.contains("execution error"),
"unexpected non-JSON output: {output_str:?}"
);
assert!(
output_str.to_lowercase().contains("signal"),
"expected signal classification in error output, got {output_str:?}"
);
}
Ok(())
}
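The prefix handling in the structured-timeout branch above can be isolated into a small parser. This is a minimal sketch using only the standard library; `parse_timeout_ms` is a hypothetical helper name, and the `"command timed out after N milliseconds"` first-line format is taken from the assertions in the test rather than from a documented contract.

```rust
/// Hypothetical helper: extracts the reported duration (in milliseconds)
/// from the first line of a timed-out command's output.
///
/// Returns `None` when the first line does not have the form
/// `command timed out after <N> milliseconds`.
fn parse_timeout_ms(output: &str) -> Option<u64> {
    output
        .lines()
        .next()? // only the first line carries the timeout notice
        .strip_prefix("command timed out after ")?
        .strip_suffix(" milliseconds")?
        .parse()
        .ok()
}
```

Returning `Option<u64>` keeps a malformed prefix distinct from a parsed value of `0`, which `unwrap_or_default()` would otherwise conflate; a caller could then report "prefix did not parse" separately from "duration shorter than the configured timeout".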


@@ -0,0 +1,280 @@
#![cfg(not(target_os = "windows"))]
use std::collections::HashMap;
use anyhow::Result;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::skip_if_sandbox;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use serde_json::Value;
fn extract_output_text(item: &Value) -> Option<&str> {
item.get("output").and_then(|value| match value {
Value::String(text) => Some(text.as_str()),
Value::Object(obj) => obj.get("content").and_then(Value::as_str),
_ => None,
})
}
fn collect_tool_outputs(bodies: &[Value]) -> Result<HashMap<String, Value>> {
let mut outputs = HashMap::new();
for body in bodies {
if let Some(items) = body.get("input").and_then(Value::as_array) {
for item in items {
if item.get("type").and_then(Value::as_str) != Some("function_call_output") {
continue;
}
if let Some(call_id) = item.get("call_id").and_then(Value::as_str) {
let content = extract_output_text(item)
.ok_or_else(|| anyhow::anyhow!("missing tool output content"))?;
let parsed: Value = serde_json::from_str(content)?;
outputs.insert(call_id.to_string(), parsed);
}
}
}
}
Ok(outputs)
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_reuses_session_via_stdin() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let first_call_id = "uexec-start";
let first_args = serde_json::json!({
"input": ["/bin/cat"],
"timeout_ms": 200,
});
let second_call_id = "uexec-stdin";
let second_args = serde_json::json!({
"input": ["hello unified exec\n"],
"session_id": "0",
"timeout_ms": 500,
});
let responses = vec![
sse(vec![
serde_json::json!({"type": "response.created", "response": {"id": "resp-1"}}),
ev_function_call(
first_call_id,
"unified_exec",
&serde_json::to_string(&first_args)?,
),
ev_completed("resp-1"),
]),
sse(vec![
serde_json::json!({"type": "response.created", "response": {"id": "resp-2"}}),
ev_function_call(
second_call_id,
"unified_exec",
&serde_json::to_string(&second_args)?,
),
ev_completed("resp-2"),
]),
sse(vec![
ev_assistant_message("msg-1", "all done"),
ev_completed("resp-3"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "run unified exec".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = codex.next_event().await.expect("event");
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let outputs = collect_tool_outputs(&bodies)?;
let start_output = outputs
.get(first_call_id)
.expect("missing first unified_exec output");
let session_id = start_output["session_id"].as_str().unwrap_or_default();
assert!(
!session_id.is_empty(),
"expected session id in first unified_exec response"
);
assert!(
start_output["output"]
.as_str()
.unwrap_or_default()
.is_empty()
);
let reuse_output = outputs
.get(second_call_id)
.expect("missing reused unified_exec output");
assert_eq!(
reuse_output["session_id"].as_str().unwrap_or_default(),
session_id
);
let echoed = reuse_output["output"].as_str().unwrap_or_default();
assert!(
echoed.contains("hello unified exec"),
"expected echoed output, got {echoed:?}"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_timeout_and_followup_poll() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let first_call_id = "uexec-timeout";
let first_args = serde_json::json!({
"input": ["/bin/sh", "-c", "sleep 0.1; echo ready"],
"timeout_ms": 10,
});
let second_call_id = "uexec-poll";
let second_args = serde_json::json!({
"input": Vec::<String>::new(),
"session_id": "0",
"timeout_ms": 800,
});
let responses = vec![
sse(vec![
serde_json::json!({"type": "response.created", "response": {"id": "resp-1"}}),
ev_function_call(
first_call_id,
"unified_exec",
&serde_json::to_string(&first_args)?,
),
ev_completed("resp-1"),
]),
sse(vec![
serde_json::json!({"type": "response.created", "response": {"id": "resp-2"}}),
ev_function_call(
second_call_id,
"unified_exec",
&serde_json::to_string(&second_args)?,
),
ev_completed("resp-2"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-3"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "check timeout".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = codex.next_event().await.expect("event");
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let outputs = collect_tool_outputs(&bodies)?;
let first_output = outputs.get(first_call_id).expect("missing timeout output");
assert_eq!(first_output["session_id"], "0");
assert!(
first_output["output"]
.as_str()
.unwrap_or_default()
.is_empty()
);
let poll_output = outputs.get(second_call_id).expect("missing poll output");
let output_text = poll_output["output"].as_str().unwrap_or_default();
assert!(
output_text.contains("ready"),
"expected ready output, got {output_text:?}"
);
Ok(())
}

@@ -0,0 +1,351 @@
#![cfg(not(target_os = "windows"))]
use base64::Engine;
use base64::engine::general_purpose::STANDARD as BASE64_STANDARD;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use core_test_support::responses;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use serde_json::Value;
use wiremock::matchers::any;
fn function_call_output(body: &Value) -> Option<&Value> {
body.get("input")
.and_then(Value::as_array)
.and_then(|items| {
items.iter().find(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call_output")
})
})
}
fn find_image_message(body: &Value) -> Option<&Value> {
body.get("input")
.and_then(Value::as_array)
.and_then(|items| {
items.iter().find(|item| {
item.get("type").and_then(Value::as_str) == Some("message")
&& item
.get("content")
.and_then(Value::as_array)
.map(|content| {
content.iter().any(|span| {
span.get("type").and_then(Value::as_str) == Some("input_image")
})
})
.unwrap_or(false)
})
})
}
fn extract_output_text(item: &Value) -> Option<&str> {
item.get("output").and_then(|value| match value {
Value::String(text) => Some(text.as_str()),
Value::Object(obj) => obj.get("content").and_then(Value::as_str),
_ => None,
})
}
fn find_request_with_function_call_output(requests: &[Value]) -> Option<&Value> {
requests
.iter()
.find(|body| function_call_output(body).is_some())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn view_image_tool_attaches_local_image() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
cwd,
session_configured,
..
} = test_codex().build(&server).await?;
let rel_path = "assets/example.png";
let abs_path = cwd.path().join(rel_path);
if let Some(parent) = abs_path.parent() {
std::fs::create_dir_all(parent)?;
}
let image_bytes = b"fake_png_bytes".to_vec();
std::fs::write(&abs_path, &image_bytes)?;
let call_id = "view-image-call";
let arguments = serde_json::json!({ "path": rel_path }).to_string();
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_function_call(call_id, "view_image", &arguments),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please add the screenshot".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let mut tool_event = None;
loop {
let event = codex.next_event().await.expect("event");
match event.msg {
EventMsg::ViewImageToolCall(ev) => tool_event = Some(ev),
EventMsg::TaskComplete(_) => break,
_ => {}
}
}
let tool_event = tool_event.expect("view image tool event emitted");
assert_eq!(tool_event.call_id, call_id);
assert_eq!(tool_event.path, abs_path);
let requests = server.received_requests().await.expect("recorded requests");
assert!(
requests.len() >= 2,
"expected at least two POST requests, got {}",
requests.len()
);
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
let output_text = extract_output_text(output_item).expect("output text present");
assert_eq!(output_text, "attached local image path");
let image_message = find_image_message(body_with_tool_output)
.expect("pending input image message not included in request");
let image_url = image_message
.get("content")
.and_then(Value::as_array)
.and_then(|content| {
content.iter().find_map(|span| {
if span.get("type").and_then(Value::as_str) == Some("input_image") {
span.get("image_url").and_then(Value::as_str)
} else {
None
}
})
})
.expect("image_url present");
let expected_image_url = format!(
"data:image/png;base64,{}",
BASE64_STANDARD.encode(&image_bytes)
);
assert_eq!(image_url, expected_image_url);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn view_image_tool_errors_when_path_is_directory() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
cwd,
session_configured,
..
} = test_codex().build(&server).await?;
let rel_path = "assets";
let abs_path = cwd.path().join(rel_path);
std::fs::create_dir_all(&abs_path)?;
let call_id = "view-image-directory";
let arguments = serde_json::json!({ "path": rel_path }).to_string();
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_function_call(call_id, "view_image", &arguments),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please attach the folder".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = codex.next_event().await.expect("event");
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
let requests = server.received_requests().await.expect("recorded requests");
assert!(
requests.len() >= 2,
"expected at least two POST requests, got {}",
requests.len()
);
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
let output_text = extract_output_text(output_item).expect("output text present");
let expected_message = format!("image path `{}` is not a file", abs_path.display());
assert_eq!(output_text, expected_message);
assert!(
find_image_message(body_with_tool_output).is_none(),
"directory path should not produce an input_image message"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn view_image_tool_errors_when_file_missing() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
cwd,
session_configured,
..
} = test_codex().build(&server).await?;
let rel_path = "missing/example.png";
let abs_path = cwd.path().join(rel_path);
let call_id = "view-image-missing";
let arguments = serde_json::json!({ "path": rel_path }).to_string();
let first_response = sse(vec![
serde_json::json!({
"type": "response.created",
"response": {"id": "resp-1"}
}),
ev_function_call(call_id, "view_image", &arguments),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
let second_response = sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]);
responses::mount_sse_once_match(&server, any(), second_response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
text: "please attach the missing image".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
loop {
let event = codex.next_event().await.expect("event");
if matches!(event.msg, EventMsg::TaskComplete(_)) {
break;
}
}
let requests = server.received_requests().await.expect("recorded requests");
assert!(
requests.len() >= 2,
"expected at least two POST requests, got {}",
requests.len()
);
let request_bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let body_with_tool_output = find_request_with_function_call_output(&request_bodies)
.expect("function_call_output item not found in requests");
let output_item = function_call_output(body_with_tool_output).expect("tool output item");
let output_text = extract_output_text(output_item).expect("output text present");
let expected_prefix = format!("unable to locate image at `{}`:", abs_path.display());
assert!(
output_text.starts_with(&expected_prefix),
"expected error to start with `{expected_prefix}` but got `{output_text}`"
);
assert!(
find_image_message(body_with_tool_output).is_none(),
"missing file should not produce an input_image message"
);
Ok(())
}
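
The first test above expects the attached image to reach the model as a `data:image/png;base64,...` URL built from the raw file bytes. A minimal std-only sketch of that encoding follows. `base64_encode` and `png_data_url` are hypothetical names introduced here for illustration; the test itself uses `BASE64_STANDARD.encode` from the `base64` crate, and this hand-rolled encoder is only a stand-in for it.

```rust
/// Standard base64 alphabet (RFC 4648), used with `=` padding.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical std-only stand-in for `BASE64_STANDARD.encode`.
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit groups, padding the positions that have no input byte.
        out.push(ALPHABET[((n >> 18) & 0x3f) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 0x3f) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 0x3f) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 0x3f) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Hypothetical helper: builds the data URL shape the test asserts on.
fn png_data_url(bytes: &[u8]) -> String {
    format!("data:image/png;base64,{}", base64_encode(bytes))
}
```

The test writes placeholder bytes rather than a real PNG, which suggests the tool does not validate image contents in this path; the sketch likewise treats the input as opaque bytes.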

@@ -31,6 +31,7 @@ owo-colors = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
shlex = { workspace = true }
supports-color = { workspace = true }
tokio = { workspace = true, features = [
"io-std",
"macros",

@@ -72,7 +72,7 @@ pub struct Cli {
pub include_plan_tool: bool,
/// Specifies file where the last message from the agent should be written.
#[arg(long = "output-last-message")]
#[arg(long = "output-last-message", short = 'o', value_name = "FILE")]
pub last_message_file: Option<PathBuf>,
/// Initial instructions for the agent. If not provided as an argument (or

View File

@@ -1,7 +1,6 @@
use codex_common::elapsed::format_duration;
use codex_common::elapsed::format_elapsed;
use codex_core::config::Config;
use codex_core::plan_tool::UpdatePlanArgs;
use codex_core::protocol::AgentMessageEvent;
use codex_core::protocol::AgentReasoningRawContentEvent;
use codex_core::protocol::BackgroundEventEvent;
@@ -35,6 +34,8 @@ use crate::event_processor::CodexStatus;
use crate::event_processor::EventProcessor;
use crate::event_processor::handle_last_message;
use codex_common::create_config_summary_entries;
use codex_protocol::plan_tool::StepStatus;
use codex_protocol::plan_tool::UpdatePlanArgs;
/// This should be configurable. When used in CI, users may not want to impose
/// a limit so they can see the full transcript.
@@ -59,6 +60,7 @@ pub(crate) struct EventProcessorWithHumanOutput {
show_raw_agent_reasoning: bool,
last_message_path: Option<PathBuf>,
last_total_token_usage: Option<codex_core::protocol::TokenUsageInfo>,
final_message: Option<String>,
}
impl EventProcessorWithHumanOutput {
@@ -83,6 +85,7 @@ impl EventProcessorWithHumanOutput {
show_raw_agent_reasoning: config.show_raw_agent_reasoning,
last_message_path,
last_total_token_usage: None,
final_message: None,
}
} else {
Self {
@@ -98,6 +101,7 @@ impl EventProcessorWithHumanOutput {
show_raw_agent_reasoning: config.show_raw_agent_reasoning,
last_message_path,
last_total_token_usage: None,
final_message: None,
}
}
}
@@ -108,11 +112,10 @@ struct PatchApplyBegin {
auto_approved: bool,
}
// Timestamped println helper. The timestamp is styled with self.dimmed.
#[macro_export]
macro_rules! ts_println {
/// Timestamped helper. The timestamp is styled with self.dimmed.
macro_rules! ts_msg {
($self:ident, $($arg:tt)*) => {{
println!($($arg)*);
eprintln!($($arg)*);
}};
}
@@ -127,7 +130,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
session_configured_event: &SessionConfiguredEvent,
) {
const VERSION: &str = env!("CARGO_PKG_VERSION");
ts_println!(
ts_msg!(
self,
"OpenAI Codex v{} (research preview)\n--------",
VERSION
@@ -140,15 +143,15 @@ impl EventProcessor for EventProcessorWithHumanOutput {
));
for (key, value) in entries {
println!("{} {}", format!("{key}:").style(self.bold), value);
eprintln!("{} {}", format!("{key}:").style(self.bold), value);
}
println!("--------");
eprintln!("--------");
// Echo the prompt that will be sent to the agent so it is visible in the
// transcript/logs before any events come in. Note the prompt may have been
// read from stdin, so it may not be visible in the terminal otherwise.
ts_println!(self, "{}\n{}", "user".style(self.cyan), prompt);
ts_msg!(self, "{}\n{}", "user".style(self.cyan), prompt);
}
fn process_event(&mut self, event: Event) -> CodexStatus {
@@ -156,21 +159,25 @@ impl EventProcessor for EventProcessorWithHumanOutput {
match msg {
EventMsg::Error(ErrorEvent { message }) => {
let prefix = "ERROR:".style(self.red);
ts_println!(self, "{prefix} {message}");
ts_msg!(self, "{prefix} {message}");
}
EventMsg::BackgroundEvent(BackgroundEventEvent { message }) => {
ts_println!(self, "{}", message.style(self.dimmed));
ts_msg!(self, "{}", message.style(self.dimmed));
}
EventMsg::StreamError(StreamErrorEvent { message }) => {
ts_println!(self, "{}", message.style(self.dimmed));
ts_msg!(self, "{}", message.style(self.dimmed));
}
EventMsg::TaskStarted(_) => {
// Ignore.
}
EventMsg::TaskComplete(TaskCompleteEvent { last_agent_message }) => {
let last_message = last_agent_message.as_deref();
if let Some(output_file) = self.last_message_path.as_deref() {
handle_last_message(last_agent_message.as_deref(), output_file);
handle_last_message(last_message, output_file);
}
self.final_message = last_agent_message;
return CodexStatus::InitiateShutdown;
}
EventMsg::TokenCount(ev) => {
@@ -181,11 +188,11 @@ impl EventProcessor for EventProcessorWithHumanOutput {
if !self.show_agent_reasoning {
return CodexStatus::Running;
}
println!();
eprintln!();
}
EventMsg::AgentReasoningRawContent(AgentReasoningRawContentEvent { text }) => {
if self.show_raw_agent_reasoning {
ts_println!(
ts_msg!(
self,
"{}\n{}",
"thinking".style(self.italic).style(self.magenta),
@@ -194,7 +201,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
}
}
EventMsg::AgentMessage(AgentMessageEvent { message }) => {
ts_println!(
ts_msg!(
self,
"{}\n{}",
"codex".style(self.italic).style(self.magenta),
@@ -202,7 +209,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
);
}
EventMsg::ExecCommandBegin(ExecCommandBeginEvent { command, cwd, .. }) => {
print!(
eprint!(
"{}\n{} in {}",
"exec".style(self.italic).style(self.magenta),
escape_command(&command).style(self.bold),
@@ -226,20 +233,20 @@ impl EventProcessor for EventProcessorWithHumanOutput {
match exit_code {
0 => {
let title = format!(" succeeded{duration}:");
ts_println!(self, "{}", title.style(self.green));
ts_msg!(self, "{}", title.style(self.green));
}
_ => {
let title = format!(" exited {exit_code}{duration}:");
ts_println!(self, "{}", title.style(self.red));
ts_msg!(self, "{}", title.style(self.red));
}
}
println!("{}", truncated_output.style(self.dimmed));
eprintln!("{}", truncated_output.style(self.dimmed));
}
EventMsg::McpToolCallBegin(McpToolCallBeginEvent {
call_id: _,
invocation,
}) => {
ts_println!(
ts_msg!(
self,
"{} {}",
"tool".style(self.magenta),
@@ -264,7 +271,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
format_mcp_invocation(&invocation)
);
ts_println!(self, "{}", title.style(title_style));
ts_msg!(self, "{}", title.style(title_style));
if let Ok(res) = result {
let val: serde_json::Value = res.into();
@@ -272,13 +279,13 @@ impl EventProcessor for EventProcessorWithHumanOutput {
serde_json::to_string_pretty(&val).unwrap_or_else(|_| val.to_string());
for line in pretty.lines().take(MAX_OUTPUT_LINES_FOR_EXEC_TOOL_CALL) {
println!("{}", line.style(self.dimmed));
eprintln!("{}", line.style(self.dimmed));
}
}
}
EventMsg::WebSearchBegin(WebSearchBeginEvent { call_id: _ }) => {}
EventMsg::WebSearchEnd(WebSearchEndEvent { call_id: _, query }) => {
ts_println!(self, "🌐 Searched: {query}");
ts_msg!(self, "🌐 Searched: {query}");
}
EventMsg::PatchApplyBegin(PatchApplyBeginEvent {
call_id,
@@ -295,7 +302,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
},
);
ts_println!(
ts_msg!(
self,
"{}",
"file update".style(self.magenta).style(self.italic),
@@ -311,9 +318,9 @@ impl EventProcessor for EventProcessorWithHumanOutput {
format_file_change(change),
path.to_string_lossy()
);
println!("{}", header.style(self.magenta));
eprintln!("{}", header.style(self.magenta));
for line in content.lines() {
println!("{}", line.style(self.green));
eprintln!("{}", line.style(self.green));
}
}
FileChange::Delete { content } => {
@@ -322,9 +329,9 @@ impl EventProcessor for EventProcessorWithHumanOutput {
format_file_change(change),
path.to_string_lossy()
);
println!("{}", header.style(self.magenta));
eprintln!("{}", header.style(self.magenta));
for line in content.lines() {
println!("{}", line.style(self.red));
eprintln!("{}", line.style(self.red));
}
}
FileChange::Update {
@@ -341,20 +348,20 @@ impl EventProcessor for EventProcessorWithHumanOutput {
} else {
format!("{} {}", format_file_change(change), path.to_string_lossy())
};
println!("{}", header.style(self.magenta));
eprintln!("{}", header.style(self.magenta));
// Colorize diff lines. We keep file header lines
// (--- / +++) without extra coloring so they are
// still readable.
for diff_line in unified_diff.lines() {
if diff_line.starts_with('+') && !diff_line.starts_with("+++") {
println!("{}", diff_line.style(self.green));
eprintln!("{}", diff_line.style(self.green));
} else if diff_line.starts_with('-')
&& !diff_line.starts_with("---")
{
println!("{}", diff_line.style(self.red));
eprintln!("{}", diff_line.style(self.red));
} else {
println!("{diff_line}");
eprintln!("{diff_line}");
}
}
}
@@ -391,18 +398,18 @@ impl EventProcessor for EventProcessorWithHumanOutput {
};
let title = format!("{label} exited {exit_code}{duration}:");
ts_println!(self, "{}", title.style(title_style));
ts_msg!(self, "{}", title.style(title_style));
for line in output.lines() {
println!("{}", line.style(self.dimmed));
eprintln!("{}", line.style(self.dimmed));
}
}
EventMsg::TurnDiff(TurnDiffEvent { unified_diff }) => {
ts_println!(
ts_msg!(
self,
"{}",
"file update:".style(self.magenta).style(self.italic)
);
println!("{unified_diff}");
eprintln!("{unified_diff}");
}
EventMsg::ExecApprovalRequest(_) => {
// Should we exit?
@@ -412,7 +419,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
}
EventMsg::AgentReasoning(agent_reasoning_event) => {
if self.show_agent_reasoning {
ts_println!(
ts_msg!(
self,
"{}\n{}",
"thinking".style(self.italic).style(self.magenta),
@@ -431,41 +438,40 @@ impl EventProcessor for EventProcessorWithHumanOutput {
rollout_path: _,
} = session_configured_event;
ts_println!(
ts_msg!(
self,
"{} {}",
"codex session".style(self.magenta).style(self.bold),
conversation_id.to_string().style(self.dimmed)
);
ts_println!(self, "model: {}", model);
println!();
ts_msg!(self, "model: {}", model);
eprintln!();
}
EventMsg::PlanUpdate(plan_update_event) => {
let UpdatePlanArgs { explanation, plan } = plan_update_event;
// Header
ts_println!(self, "{}", "Plan update".style(self.magenta));
ts_msg!(self, "{}", "Plan update".style(self.magenta));
// Optional explanation
if let Some(explanation) = explanation
&& !explanation.trim().is_empty()
{
ts_println!(self, "{}", explanation.style(self.italic));
ts_msg!(self, "{}", explanation.style(self.italic));
}
// Pretty-print the plan items with simple status markers.
for item in plan {
use codex_core::plan_tool::StepStatus;
match item.status {
StepStatus::Completed => {
ts_println!(self, " {} {}", "✓".style(self.green), item.step);
ts_msg!(self, " {} {}", "✓".style(self.green), item.step);
}
StepStatus::InProgress => {
ts_println!(self, " {} {}", "→".style(self.cyan), item.step);
ts_msg!(self, " {} {}", "→".style(self.cyan), item.step);
}
StepStatus::Pending => {
ts_println!(
ts_msg!(
self,
" {} {}",
"•".style(self.dimmed),
@@ -485,7 +491,7 @@ impl EventProcessor for EventProcessorWithHumanOutput {
// Currently ignored in exec output.
}
EventMsg::ViewImageToolCall(view) => {
ts_println!(
ts_msg!(
self,
"{} {}",
"viewed image".style(self.magenta),
@@ -494,13 +500,13 @@ impl EventProcessor for EventProcessorWithHumanOutput {
}
EventMsg::TurnAborted(abort_reason) => match abort_reason.reason {
TurnAbortReason::Interrupted => {
ts_println!(self, "task interrupted");
ts_msg!(self, "task interrupted");
}
TurnAbortReason::Replaced => {
ts_println!(self, "task aborted: replaced by a new task");
ts_msg!(self, "task aborted: replaced by a new task");
}
TurnAbortReason::ReviewEnded => {
ts_println!(self, "task aborted: review ended");
ts_msg!(self, "task aborted: review ended");
}
},
EventMsg::ShutdownComplete => return CodexStatus::Shutdown,
@@ -517,13 +523,25 @@ impl EventProcessor for EventProcessorWithHumanOutput {
fn print_final_output(&mut self) {
if let Some(usage_info) = &self.last_total_token_usage {
ts_println!(
self,
eprintln!(
"{}\n{}",
"tokens used".style(self.magenta).style(self.italic),
format_with_separators(usage_info.total_token_usage.blended_total())
);
}
// If the user has not piped the final message to a file, they will see
// it twice: once written to stderr as part of the normal event
// processing, and once here on stdout. We print the token summary above
// to help break up the output visually in that case.
#[allow(clippy::print_stdout)]
if let Some(message) = &self.final_message {
if message.ends_with('\n') {
print!("{message}");
} else {
println!("{message}");
}
}
}
}
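
The `FileChange::Update` arm in this file colors unified-diff lines by their first character while leaving the `---` / `+++` file headers unstyled. The rule can be sketched in isolation as below. `DiffLineKind` and `classify_diff_line` are hypothetical names used only for this illustration; the actual code applies `self.green` / `self.red` inline instead of returning an enum.

```rust
/// Hypothetical classification of a single unified-diff line.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DiffLineKind {
    /// Added content line, rendered green in the diff above.
    Added,
    /// Removed content line, rendered red in the diff above.
    Removed,
    /// File headers, hunk headers, and context lines, left unstyled.
    Context,
}

/// Mirrors the prefix checks used by the colorizing loop.
fn classify_diff_line(line: &str) -> DiffLineKind {
    if line.starts_with('+') && !line.starts_with("+++") {
        DiffLineKind::Added
    } else if line.starts_with('-') && !line.starts_with("---") {
        DiffLineKind::Removed
    } else {
        DiffLineKind::Context
    }
}
```

One consequence of the prefix test: an added line whose own content begins with `++` (so the diff line starts with `+++`) is treated like a file header and printed without color. That only affects styling, not the text that is printed.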

@@ -31,8 +31,6 @@ use crate::exec_events::TurnStartedEvent;
use crate::exec_events::Usage;
use crate::exec_events::WebSearchItem;
use codex_core::config::Config;
use codex_core::plan_tool::StepStatus;
use codex_core::plan_tool::UpdatePlanArgs;
use codex_core::protocol::AgentMessageEvent;
use codex_core::protocol::AgentReasoningEvent;
use codex_core::protocol::Event;
@@ -48,6 +46,8 @@ use codex_core::protocol::SessionConfiguredEvent;
use codex_core::protocol::TaskCompleteEvent;
use codex_core::protocol::TaskStartedEvent;
use codex_core::protocol::WebSearchEndEvent;
use codex_protocol::plan_tool::StepStatus;
use codex_protocol::plan_tool::UpdatePlanArgs;
use tracing::error;
use tracing::warn;
@@ -428,6 +428,7 @@ impl EventProcessor for EventProcessorWithJsonOutput {
});
}
#[allow(clippy::print_stdout)]
fn process_event(&mut self, event: Event) -> CodexStatus {
let aggregated = self.collect_thread_events(&event);
for conv_event in aggregated {

View File

@@ -1,14 +1,25 @@
// - In the default output mode, it is paramount that the only thing written to
// stdout is the final message (if any).
// - In --json mode, stdout must be valid JSONL, one event per line.
// For both modes, any other output must be written to stderr.
#![deny(clippy::print_stdout)]
mod cli;
mod event_processor;
mod event_processor_with_human_output;
pub mod event_processor_with_jsonl_output;
pub mod exec_events;
use anyhow::bail;
pub use cli::Cli;
use codex_core::AuthManager;
use codex_core::BUILT_IN_OSS_MODEL_PROVIDER_ID;
use codex_core::ConversationManager;
use codex_core::NewConversation;
use codex_core::admin_controls::DangerAuditAction;
use codex_core::admin_controls::PendingAdminAction;
use codex_core::admin_controls::build_danger_audit_payload;
use codex_core::admin_controls::log_admin_event;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::git_info::get_git_repo_root;
@@ -28,6 +39,7 @@ use serde_json::Value;
use std::io::IsTerminal;
use std::io::Read;
use std::path::PathBuf;
use supports_color::Stream;
use tracing::debug;
use tracing::error;
use tracing::info;
@@ -113,8 +125,8 @@ pub async fn run_main(cli: Cli, codex_linux_sandbox_exe: Option<PathBuf>) -> any
cli::Color::Always => (true, true),
cli::Color::Never => (false, false),
cli::Color::Auto => (
std::io::stdout().is_terminal(),
std::io::stderr().is_terminal(),
supports_color::on_cached(Stream::Stdout).is_some(),
supports_color::on_cached(Stream::Stderr).is_some(),
),
};
@@ -170,7 +182,7 @@ pub async fn run_main(cli: Cli, codex_linux_sandbox_exe: Option<PathBuf>) -> any
codex_linux_sandbox_exe,
base_instructions: None,
include_plan_tool: Some(include_plan_tool),
include_apply_patch_tool: None,
include_apply_patch_tool: Some(true),
include_view_image_tool: None,
show_raw_agent_reasoning: oss.then_some(true),
tools_web_search_request: None,
@@ -184,7 +196,25 @@ pub async fn run_main(cli: Cli, codex_linux_sandbox_exe: Option<PathBuf>) -> any
}
};
let config = Config::load_with_cli_overrides(cli_kv_overrides, overrides)?;
let config = Config::load_with_cli_overrides(cli_kv_overrides, overrides).await?;
if config.admin.has_pending_danger() {
if let Some(audit) = config.admin.audit.as_ref()
&& let Some(PendingAdminAction::Danger(pending)) = config
.admin
.pending
.iter()
.find(|action| matches!(action, PendingAdminAction::Danger(_)))
{
log_admin_event(
audit,
build_danger_audit_payload(pending, DangerAuditAction::Denied, None),
);
}
bail!(
"danger-full-access requires interactive justification; rerun in the interactive TUI"
);
}
let otel = codex_core::otel_init::build_provider(&config, env!("CARGO_PKG_VERSION"));
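
The header comment added to this file states the output contract for the default mode: stdout carries only the final message, and everything else goes to stderr. A minimal sketch of that split, written against generic writers so it can be exercised without a terminal, might look like the following. `emit_progress` and `emit_final_message` are hypothetical helpers, not functions from this crate; the trailing-newline handling follows `print_final_output` in the human-output processor.

```rust
use std::io::{self, Write};

/// Hypothetical helper: progress and diagnostics go to the stderr-like sink.
fn emit_progress<E: Write>(err: &mut E, line: &str) -> io::Result<()> {
    writeln!(err, "{line}")
}

/// Hypothetical helper: the final message is the only thing written to the
/// stdout-like sink. A trailing newline is added only when one is missing.
fn emit_final_message<O: Write>(out: &mut O, message: Option<&str>) -> io::Result<()> {
    if let Some(message) = message {
        if message.ends_with('\n') {
            write!(out, "{message}")?;
        } else {
            writeln!(out, "{message}")?;
        }
    }
    Ok(())
}
```

In a real binary the two sinks would be `std::io::stdout().lock()` and `std::io::stderr().lock()`. Keeping stdout limited to the final message makes it straightforward to capture or pipe the agent's answer while the transcript stays visible on stderr.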

@@ -37,6 +37,9 @@ use codex_exec::exec_events::TurnFailedEvent;
use codex_exec::exec_events::TurnStartedEvent;
use codex_exec::exec_events::Usage;
use codex_exec::exec_events::WebSearchItem;
use codex_protocol::plan_tool::PlanItemArg;
use codex_protocol::plan_tool::StepStatus;
use codex_protocol::plan_tool::UpdatePlanArgs;
use mcp_types::CallToolResult;
use pretty_assertions::assert_eq;
use std::path::PathBuf;
@@ -115,10 +118,6 @@ fn web_search_end_emits_item_completed() {
#[test]
fn plan_update_emits_todo_list_started_updated_and_completed() {
use codex_core::plan_tool::PlanItemArg;
use codex_core::plan_tool::StepStatus;
use codex_core::plan_tool::UpdatePlanArgs;
let mut ep = EventProcessorWithJsonOutput::new(None);
// First plan update => item.started (todo_list)
@@ -339,10 +338,6 @@ fn mcp_tool_call_failure_sets_failed_status() {
#[test]
fn plan_update_after_complete_starts_new_todo_list_with_new_id() {
use codex_core::plan_tool::PlanItemArg;
use codex_core::plan_tool::StepStatus;
use codex_core::plan_tool::UpdatePlanArgs;
let mut ep = EventProcessorWithJsonOutput::new(None);
// First turn: start + complete

@@ -229,14 +229,14 @@ fn exec_resume_preserves_cli_configuration_overrides() -> anyhow::Result<()> {
assert!(output.status.success(), "resume run failed: {output:?}");
let stdout = String::from_utf8(output.stdout)?;
let stderr = String::from_utf8(output.stderr)?;
assert!(
stdout.contains("model: gpt-5-high"),
"stdout missing model override: {stdout}"
stderr.contains("model: gpt-5-high"),
"stderr missing model override: {stderr}"
);
assert!(
- stdout.contains("sandbox: workspace-write"),
- "stdout missing sandbox override: {stdout}"
+ stderr.contains("sandbox: workspace-write"),
+ "stderr missing sandbox override: {stderr}"
);
let resumed_path = find_session_file_containing_marker(&sessions_dir, &marker2)

View File

@@ -132,7 +132,7 @@ pub(crate) fn create_tool_for_codex_tool_call_param() -> Tool {
impl CodexToolCallParam {
/// Returns the initial user prompt to start the Codex conversation and the
/// effective Config object generated from the supplied parameters.
- pub fn into_config(
+ pub async fn into_config(
self,
codex_linux_sandbox_exe: Option<PathBuf>,
) -> std::io::Result<(String, codex_core::config::Config)> {
@@ -172,7 +172,8 @@ impl CodexToolCallParam {
.map(|(k, v)| (k, json_to_toml(v)))
.collect();
- let cfg = codex_core::config::Config::load_with_cli_overrides(cli_overrides, overrides)?;
+ let cfg =
+ codex_core::config::Config::load_with_cli_overrides(cli_overrides, overrides).await?;
Ok((prompt, cfg))
}

View File

@@ -91,6 +91,7 @@ pub async fn run_main(
)
})?;
let config = Config::load_with_cli_overrides(cli_kv_overrides, ConfigOverrides::default())
+ .await
.map_err(|e| {
std::io::Error::new(ErrorKind::InvalidData, format!("error loading config: {e}"))
})?;

View File

@@ -342,7 +342,10 @@ impl MessageProcessor {
async fn handle_tool_call_codex(&self, id: RequestId, arguments: Option<serde_json::Value>) {
let (initial_prompt, config): (String, Config) = match arguments {
Some(json_val) => match serde_json::from_value::<CodexToolCallParam>(json_val) {
- Ok(tool_cfg) => match tool_cfg.into_config(self.codex_linux_sandbox_exe.clone()) {
+ Ok(tool_cfg) => match tool_cfg
+ .into_config(self.codex_linux_sandbox_exe.clone())
+ .await
+ {
Ok(cfg) => cfg,
Err(e) => {
let result = CallToolResult {

View File

@@ -14,6 +14,7 @@ use eventsource_stream::EventStreamError as StreamError;
use reqwest::Error;
use reqwest::Response;
use serde::Serialize;
+ use std::borrow::Cow;
use std::fmt::Display;
use std::time::Duration;
use std::time::Instant;
@@ -366,10 +367,10 @@ impl OtelEventManager {
call_id: &str,
arguments: &str,
f: F,
- ) -> Result<String, E>
+ ) -> Result<(String, bool), E>
where
F: FnOnce() -> Fut,
- Fut: Future<Output = Result<String, E>>,
+ Fut: Future<Output = Result<(String, bool), E>>,
E: Display,
{
let start = Instant::now();
@@ -377,10 +378,12 @@ impl OtelEventManager {
let duration = start.elapsed();
let (output, success) = match &result {
- Ok(content) => (content, true),
- Err(error) => (&error.to_string(), false),
+ Ok((preview, success)) => (Cow::Borrowed(preview.as_str()), *success),
+ Err(error) => (Cow::Owned(error.to_string()), false),
};
+ let success_str = if success { "true" } else { "false" };
tracing::event!(
tracing::Level::INFO,
event.name = "codex.tool_result",
@@ -396,7 +399,8 @@ impl OtelEventManager {
call_id = %call_id,
arguments = %arguments,
duration_ms = %duration.as_millis(),
- success = %success,
+ success = %success_str,
+ // `output` is truncated by the tool layer before reaching telemetry.
output = %output,
);

View File

@@ -259,6 +259,7 @@ pub struct ShellToolCallParams {
#[derive(Debug, Clone, PartialEq, TS)]
pub struct FunctionCallOutputPayload {
pub content: String,
+ // TODO(jif) drop this.
pub success: Option<bool>,
}

View File

@@ -1206,7 +1206,6 @@ pub struct GetHistoryEntryResponseEvent {
pub entry: Option<HistoryEntry>,
}
/// Response payload for `Op::ListMcpTools`.
#[derive(Debug, Clone, Deserialize, Serialize, TS)]
pub struct McpListToolsResponseEvent {
/// Fully qualified tool name -> tool definition.

View File

@@ -49,6 +49,7 @@ pub struct Args {
#[derive(Serialize)]
struct ServerInfo {
port: u16,
+ pid: u32,
}
/// Entry point for the library main, for parity with other crates.
@@ -100,15 +101,17 @@ fn write_server_info(path: &Path, port: u16) -> Result<()> {
if let Some(parent) = path.parent()
&& !parent.as_os_str().is_empty()
{
- let parent_display = parent.display();
- fs::create_dir_all(parent).with_context(|| format!("create_dir_all {parent_display}"))?;
+ fs::create_dir_all(parent)?;
}
- let info = ServerInfo { port };
- let data = serde_json::to_vec(&info).context("serialize startup info")?;
- let p = path.display();
- let mut f = File::create(path).with_context(|| format!("create {p}"))?;
- f.write_all(&data).with_context(|| format!("write {p}"))?;
- f.write_all(b"\n").with_context(|| format!("newline {p}"))?;
+ let info = ServerInfo {
+ port,
+ pid: std::process::id(),
+ };
+ let mut data = serde_json::to_string(&info)?;
+ data.push('\n');
+ let mut f = File::create(path)?;
+ f.write_all(data.as_bytes())?;
Ok(())
}
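
As a rough illustration of the new startup-info format (one JSON object carrying `port` and `pid`, followed by a newline), here is a standard-library-only sketch. It formats the JSON by hand instead of using `serde_json`, and `render_server_info` is a hypothetical helper added for this example, so treat it as an approximation of the hunk above, not a copy of it.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: renders the startup info as a single JSON line.
/// Hand-formatted here to stay dependency-free; the diff uses serde_json.
fn render_server_info(port: u16, pid: u32) -> String {
    format!("{{\"port\":{port},\"pid\":{pid}}}\n")
}

/// Writes the startup info, creating the parent directory when needed.
fn write_server_info(path: &Path, port: u16) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    let data = render_server_info(port, std::process::id());
    let mut f = File::create(path)?;
    // A single write covers both the payload and the trailing newline.
    f.write_all(data.as_bytes())?;
    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("server-info-sketch.json");
    write_server_info(&path, 4000)?;
    print!("{}", fs::read_to_string(&path)?);
    fs::remove_file(&path)
}
```

Recording the process ID alongside the port lets a reader of the file check that the server that wrote it is still the one running.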

View File

@@ -8,8 +8,19 @@ workspace = true
[dependencies]
anyhow = "1"
+ axum = { workspace = true, default-features = false, features = [
+ "http1",
+ "tokio",
+ ] }
+ keyring = { workspace = true, features = [
+ "apple-native",
+ "crypto-rust",
+ "linux-native-async-persistent",
+ "windows-native",
+ ] }
mcp-types = { path = "../mcp-types" }
- rmcp = { version = "0.7.0", default-features = false, features = [
+ rmcp = { workspace = true, default-features = false, features = [
+ "auth",
"base64",
"client",
"macros",
@@ -19,16 +30,19 @@ rmcp = { version = "0.7.0", default-features = false, features = [
"transport-streamable-http-client-reqwest",
"transport-streamable-http-server",
] }
- axum = { version = "0.8", default-features = false, features = ["http1", "tokio"] }
- futures = { version = "0.3", default-features = false, features = ["std"] }
+ futures = { workspace = true, default-features = false, features = ["std"] }
reqwest = { version = "0.12", default-features = false, features = [
"json",
"stream",
"rustls-tls",
] }
- serde = { version = "1", features = ["derive"] }
- serde_json = "1"
- tokio = { version = "1", features = [
+ serde = { workspace = true, features = ["derive"] }
+ serde_json = { workspace = true }
+ sha2 = { workspace = true }
+ dirs = { workspace = true }
+ oauth2 = "5"
+ tiny_http = { workspace = true }
+ tokio = { workspace = true, features = [
"io-util",
"macros",
"process",
@@ -37,7 +51,10 @@ tokio = { version = "1", features = [
"io-std",
"time",
] }
- tracing = { version = "0.1.41", features = ["log"] }
+ tracing = { workspace = true, features = ["log"] }
+ urlencoding = { workspace = true }
+ webbrowser = { workspace = true }
[dev-dependencies]
- pretty_assertions = "1.4.1"
+ pretty_assertions = { workspace = true }
+ tempfile = { workspace = true }

View File

@@ -5,6 +5,14 @@ use std::net::SocketAddr;
use std::sync::Arc;
use axum::Router;
+ use axum::body::Body;
+ use axum::extract::State;
+ use axum::http::Request;
+ use axum::http::StatusCode;
+ use axum::http::header::AUTHORIZATION;
+ use axum::middleware;
+ use axum::middleware::Next;
+ use axum::response::Response;
use rmcp::ErrorData as McpError;
use rmcp::handler::server::ServerHandler;
use rmcp::model::CallToolRequestParam;
@@ -161,7 +169,30 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
),
);
+ let router = if let Ok(token) = std::env::var("MCP_EXPECT_BEARER") {
+ let expected = Arc::new(format!("Bearer {token}"));
+ router.layer(middleware::from_fn_with_state(expected, require_bearer))
+ } else {
+ router
+ };
axum::serve(listener, router).await?;
task::yield_now().await;
Ok(())
}
+ async fn require_bearer(
+ State(expected): State<Arc<String>>,
+ request: Request<Body>,
+ next: Next,
+ ) -> Result<Response, StatusCode> {
+ if request
+ .headers()
+ .get(AUTHORIZATION)
+ .is_some_and(|value| value.as_bytes() == expected.as_bytes())
+ {
+ Ok(next.run(request).await)
+ } else {
+ Err(StatusCode::UNAUTHORIZED)
+ }
+ }
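
The check in `require_bearer` comes down to an exact byte comparison between the `Authorization` header and the string `Bearer <token>`. The sketch below isolates that comparison without axum; `is_authorized` is a hypothetical name used only for this example.

```rust
/// Hypothetical helper mirroring the comparison in `require_bearer`:
/// the header must be present and match the expected value byte for byte.
fn is_authorized(header: Option<&[u8]>, expected: &str) -> bool {
    header.is_some_and(|value| value == expected.as_bytes())
}

fn main() {
    let expected = format!("Bearer {}", "secret-token");
    println!("{}", is_authorized(Some(b"Bearer secret-token".as_slice()), &expected));
    println!("{}", is_authorized(Some(b"bearer secret-token".as_slice()), &expected));
    println!("{}", is_authorized(None, &expected));
}
```

Because the match is exact, a lowercase `bearer` scheme or a missing header is rejected, which in the test server maps to `401 Unauthorized`.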

View File

@@ -0,0 +1,33 @@
use dirs::home_dir;
use std::path::PathBuf;
/// This was copied from codex-core, which cannot be imported here because
/// codex-core depends on this crate.
/// TODO: move this to a shared crate lower in the dependency tree.
///
/// Returns the path to the Codex configuration directory, which can be
/// specified by the `CODEX_HOME` environment variable. If not set, defaults to
/// `~/.codex`.
///
/// - If `CODEX_HOME` is set to a non-empty value, the value is canonicalized
/// and this function returns an error if the path does not exist.
/// - If `CODEX_HOME` is not set, this function does not verify that the
/// directory exists.
pub(crate) fn find_codex_home() -> std::io::Result<PathBuf> {
// Honor the `CODEX_HOME` environment variable when it is set to allow users
// (and tests) to override the default location.
if let Ok(val) = std::env::var("CODEX_HOME")
&& !val.is_empty()
{
return PathBuf::from(val).canonicalize();
}
let mut p = home_dir().ok_or_else(|| {
std::io::Error::new(
std::io::ErrorKind::NotFound,
"Could not find home directory",
)
})?;
p.push(".codex");
Ok(p)
}
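
To make the two branches of `find_codex_home` easier to exercise, the sketch below takes the environment value and home directory as parameters instead of reading them from the process. `resolve_codex_home` and its signature are assumptions made for this example; the function in the diff reads `CODEX_HOME` and calls `dirs::home_dir()` directly.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, parameterized variant of `find_codex_home`.
/// `codex_home` stands in for the `CODEX_HOME` variable, and `home_dir`
/// for the result of `dirs::home_dir()`.
fn resolve_codex_home(codex_home: Option<&str>, home_dir: Option<&Path>) -> io::Result<PathBuf> {
    if let Some(val) = codex_home {
        if !val.is_empty() {
            // Canonicalizing fails when the path does not exist.
            return PathBuf::from(val).canonicalize();
        }
    }
    let mut p = home_dir.map(Path::to_path_buf).ok_or_else(|| {
        io::Error::new(io::ErrorKind::NotFound, "Could not find home directory")
    })?;
    // The default location is not checked for existence.
    p.push(".codex");
    Ok(p)
}

fn main() {
    let home = Path::new("/home/example");
    println!("{:?}", resolve_codex_home(None, Some(home)));
    println!("{:?}", resolve_codex_home(Some("/no/such/dir"), Some(home)).is_err());
}
```

An empty `CODEX_HOME` is treated the same as an unset one, so it falls through to the home-directory default.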

View File

@@ -1,5 +1,14 @@
+ mod find_codex_home;
mod logging_client_handler;
+ mod oauth;
+ mod perform_oauth_login;
mod rmcp_client;
mod utils;
+ pub use oauth::StoredOAuthTokens;
+ pub use oauth::WrappedOAuthTokenResponse;
+ pub use oauth::delete_oauth_tokens;
+ pub(crate) use oauth::load_oauth_tokens;
+ pub use oauth::save_oauth_tokens;
+ pub use perform_oauth_login::perform_oauth_login;
pub use rmcp_client::RmcpClient;

View File

@@ -0,0 +1,822 @@
//! This file handles all logic related to managing MCP OAuth credentials.
//! All credentials are stored using the keyring crate, which uses OS-specific keyring services.
//! https://crates.io/crates/keyring
//! macOS: macOS Keychain
//! Windows: Windows Credential Manager
//! Linux: DBus-based Secret Service, the kernel keyutils, and a combination of the two
//! FreeBSD, OpenBSD: DBus-based Secret Service
//!
//! For Linux, we use linux-native-async-persistent, which uses both keyutils and async-secret-service (see below) for storage.
//! See the docs for the keyutils_persistent module for a full explanation of why both are used. Because this store uses
//! async-secret-service, you must specify the additional features required by that store.
//!
//! async-secret-service provides access to the DBus-based Secret Service storage on Linux, FreeBSD, and OpenBSD. This is an asynchronous
//! keystore that always encrypts secrets when they are transferred across the bus. If DBus isn't installed, credential storage falls back
//! to the JSON file, because we don't use the "vendored" feature.
//!
//! If the keyring is not available or fails, we fall back to CODEX_HOME/.credentials.json, which is consistent with other coding CLI agents.
use anyhow::Context;
use anyhow::Result;
use keyring::Entry;
use oauth2::AccessToken;
use oauth2::EmptyExtraTokenFields;
use oauth2::RefreshToken;
use oauth2::Scope;
use oauth2::TokenResponse;
use oauth2::basic::BasicTokenType;
use rmcp::transport::auth::OAuthTokenResponse;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value;
use serde_json::map::Map as JsonMap;
use sha2::Digest;
use sha2::Sha256;
use std::collections::BTreeMap;
use std::fmt;
use std::fs;
use std::io::ErrorKind;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
use std::time::SystemTime;
use std::time::UNIX_EPOCH;
use tracing::warn;
use rmcp::transport::auth::AuthorizationManager;
use tokio::sync::Mutex;
use crate::find_codex_home::find_codex_home;
const KEYRING_SERVICE: &str = "Codex MCP Credentials";
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct StoredOAuthTokens {
pub server_name: String,
pub url: String,
pub client_id: String,
pub token_response: WrappedOAuthTokenResponse,
}
#[derive(Debug)]
struct CredentialStoreError(anyhow::Error);
impl CredentialStoreError {
fn new(error: impl Into<anyhow::Error>) -> Self {
Self(error.into())
}
fn message(&self) -> String {
self.0.to_string()
}
fn into_error(self) -> anyhow::Error {
self.0
}
}
impl fmt::Display for CredentialStoreError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0)
}
}
impl std::error::Error for CredentialStoreError {}
trait CredentialStore {
fn load(&self, service: &str, account: &str) -> Result<Option<String>, CredentialStoreError>;
fn save(&self, service: &str, account: &str, value: &str) -> Result<(), CredentialStoreError>;
fn delete(&self, service: &str, account: &str) -> Result<bool, CredentialStoreError>;
}
struct KeyringCredentialStore;
impl CredentialStore for KeyringCredentialStore {
fn load(&self, service: &str, account: &str) -> Result<Option<String>, CredentialStoreError> {
let entry = Entry::new(service, account).map_err(CredentialStoreError::new)?;
match entry.get_password() {
Ok(password) => Ok(Some(password)),
Err(keyring::Error::NoEntry) => Ok(None),
Err(error) => Err(CredentialStoreError::new(error)),
}
}
fn save(&self, service: &str, account: &str, value: &str) -> Result<(), CredentialStoreError> {
let entry = Entry::new(service, account).map_err(CredentialStoreError::new)?;
entry.set_password(value).map_err(CredentialStoreError::new)
}
fn delete(&self, service: &str, account: &str) -> Result<bool, CredentialStoreError> {
let entry = Entry::new(service, account).map_err(CredentialStoreError::new)?;
match entry.delete_credential() {
Ok(()) => Ok(true),
Err(keyring::Error::NoEntry) => Ok(false),
Err(error) => Err(CredentialStoreError::new(error)),
}
}
}
/// Wraps OAuthTokenResponse so it can implement `PartialEq`; two values are
/// equal when their serialized JSON forms are equal.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct WrappedOAuthTokenResponse(pub OAuthTokenResponse);
impl PartialEq for WrappedOAuthTokenResponse {
fn eq(&self, other: &Self) -> bool {
match (serde_json::to_string(self), serde_json::to_string(other)) {
(Ok(s1), Ok(s2)) => s1 == s2,
_ => false,
}
}
}
pub(crate) fn load_oauth_tokens(server_name: &str, url: &str) -> Result<Option<StoredOAuthTokens>> {
let store = KeyringCredentialStore;
load_oauth_tokens_with_store(&store, server_name, url)
}
fn load_oauth_tokens_with_store<C: CredentialStore>(
store: &C,
server_name: &str,
url: &str,
) -> Result<Option<StoredOAuthTokens>> {
let key = compute_store_key(server_name, url)?;
match store.load(KEYRING_SERVICE, &key) {
Ok(Some(serialized)) => {
let tokens: StoredOAuthTokens = serde_json::from_str(&serialized)
.context("failed to deserialize OAuth tokens from keyring")?;
Ok(Some(tokens))
}
Ok(None) => load_oauth_tokens_from_file(server_name, url),
Err(error) => {
let message = error.message();
warn!("failed to read OAuth tokens from keyring: {message}");
load_oauth_tokens_from_file(server_name, url)
.with_context(|| format!("failed to read OAuth tokens from keyring: {message}"))
}
}
}
pub fn save_oauth_tokens(server_name: &str, tokens: &StoredOAuthTokens) -> Result<()> {
let store = KeyringCredentialStore;
save_oauth_tokens_with_store(&store, server_name, tokens)
}
fn save_oauth_tokens_with_store<C: CredentialStore>(
store: &C,
server_name: &str,
tokens: &StoredOAuthTokens,
) -> Result<()> {
let serialized = serde_json::to_string(tokens).context("failed to serialize OAuth tokens")?;
let key = compute_store_key(server_name, &tokens.url)?;
match store.save(KEYRING_SERVICE, &key, &serialized) {
Ok(()) => {
if let Err(error) = delete_oauth_tokens_from_file(&key) {
warn!("failed to remove OAuth tokens from fallback storage: {error:?}");
}
Ok(())
}
Err(error) => {
let message = error.message();
warn!("failed to write OAuth tokens to keyring: {message}");
save_oauth_tokens_to_file(tokens)
.with_context(|| format!("failed to write OAuth tokens to keyring: {message}"))
}
}
}
pub fn delete_oauth_tokens(server_name: &str, url: &str) -> Result<bool> {
let store = KeyringCredentialStore;
delete_oauth_tokens_with_store(&store, server_name, url)
}
fn delete_oauth_tokens_with_store<C: CredentialStore>(
store: &C,
server_name: &str,
url: &str,
) -> Result<bool> {
let key = compute_store_key(server_name, url)?;
let keyring_removed = match store.delete(KEYRING_SERVICE, &key) {
Ok(removed) => removed,
Err(error) => {
let message = error.message();
warn!("failed to delete OAuth tokens from keyring: {message}");
return Err(error.into_error()).context("failed to delete OAuth tokens from keyring");
}
};
let file_removed = delete_oauth_tokens_from_file(&key)?;
Ok(keyring_removed || file_removed)
}
#[derive(Clone)]
pub(crate) struct OAuthPersistor {
inner: Arc<OAuthPersistorInner>,
}
struct OAuthPersistorInner {
server_name: String,
url: String,
authorization_manager: Arc<Mutex<AuthorizationManager>>,
last_credentials: Mutex<Option<StoredOAuthTokens>>,
}
impl OAuthPersistor {
pub(crate) fn new(
server_name: String,
url: String,
manager: Arc<Mutex<AuthorizationManager>>,
initial_credentials: Option<StoredOAuthTokens>,
) -> Self {
Self {
inner: Arc::new(OAuthPersistorInner {
server_name,
url,
authorization_manager: manager,
last_credentials: Mutex::new(initial_credentials),
}),
}
}
/// Persists the latest stored credentials if they have changed.
/// Deletes the credentials if they are no longer present.
pub(crate) async fn persist_if_needed(&self) -> Result<()> {
let (client_id, maybe_credentials) = {
let manager = self.inner.authorization_manager.clone();
let guard = manager.lock().await;
guard.get_credentials().await
}?;
match maybe_credentials {
Some(credentials) => {
let stored = StoredOAuthTokens {
server_name: self.inner.server_name.clone(),
url: self.inner.url.clone(),
client_id,
token_response: WrappedOAuthTokenResponse(credentials.clone()),
};
let mut last_credentials = self.inner.last_credentials.lock().await;
if last_credentials.as_ref() != Some(&stored) {
save_oauth_tokens(&self.inner.server_name, &stored)?;
*last_credentials = Some(stored);
}
}
None => {
let mut last_serialized = self.inner.last_credentials.lock().await;
if last_serialized.take().is_some()
&& let Err(error) =
delete_oauth_tokens(&self.inner.server_name, &self.inner.url)
{
warn!(
"failed to remove OAuth tokens for server {}: {error}",
self.inner.server_name
);
}
}
}
Ok(())
}
}
const FALLBACK_FILENAME: &str = ".credentials.json";
const MCP_SERVER_TYPE: &str = "http";
type FallbackFile = BTreeMap<String, FallbackTokenEntry>;
#[derive(Debug, Clone, Serialize, Deserialize)]
struct FallbackTokenEntry {
server_name: String,
server_url: String,
client_id: String,
access_token: String,
#[serde(default)]
expires_at: Option<u64>,
#[serde(default)]
refresh_token: Option<String>,
#[serde(default)]
scopes: Vec<String>,
}
fn load_oauth_tokens_from_file(server_name: &str, url: &str) -> Result<Option<StoredOAuthTokens>> {
let Some(store) = read_fallback_file()? else {
return Ok(None);
};
let key = compute_store_key(server_name, url)?;
for entry in store.values() {
let entry_key = compute_store_key(&entry.server_name, &entry.server_url)?;
if entry_key != key {
continue;
}
let mut token_response = OAuthTokenResponse::new(
AccessToken::new(entry.access_token.clone()),
BasicTokenType::Bearer,
EmptyExtraTokenFields {},
);
if let Some(refresh) = entry.refresh_token.clone() {
token_response.set_refresh_token(Some(RefreshToken::new(refresh)));
}
let scopes = entry.scopes.clone();
if !scopes.is_empty() {
token_response.set_scopes(Some(scopes.into_iter().map(Scope::new).collect()));
}
if let Some(expires_at) = entry.expires_at
&& let Some(seconds) = expires_in_from_timestamp(expires_at)
{
let duration = Duration::from_secs(seconds);
token_response.set_expires_in(Some(&duration));
}
let stored = StoredOAuthTokens {
server_name: entry.server_name.clone(),
url: entry.server_url.clone(),
client_id: entry.client_id.clone(),
token_response: WrappedOAuthTokenResponse(token_response),
};
return Ok(Some(stored));
}
Ok(None)
}
fn save_oauth_tokens_to_file(tokens: &StoredOAuthTokens) -> Result<()> {
let key = compute_store_key(&tokens.server_name, &tokens.url)?;
let mut store = read_fallback_file()?.unwrap_or_default();
let token_response = &tokens.token_response.0;
let refresh_token = token_response
.refresh_token()
.map(|token| token.secret().to_string());
let scopes = token_response
.scopes()
.map(|s| s.iter().map(|s| s.to_string()).collect())
.unwrap_or_default();
let entry = FallbackTokenEntry {
server_name: tokens.server_name.clone(),
server_url: tokens.url.clone(),
client_id: tokens.client_id.clone(),
access_token: token_response.access_token().secret().to_string(),
expires_at: compute_expires_at_millis(token_response),
refresh_token,
scopes,
};
store.insert(key, entry);
write_fallback_file(&store)
}
fn delete_oauth_tokens_from_file(key: &str) -> Result<bool> {
let mut store = match read_fallback_file()? {
Some(store) => store,
None => return Ok(false),
};
let removed = store.remove(key).is_some();
if removed {
write_fallback_file(&store)?;
}
Ok(removed)
}
fn compute_expires_at_millis(response: &OAuthTokenResponse) -> Option<u64> {
let expires_in = response.expires_in()?;
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_else(|_| Duration::from_secs(0));
let expiry = now.checked_add(expires_in)?;
let millis = expiry.as_millis();
if millis > u128::from(u64::MAX) {
Some(u64::MAX)
} else {
Some(millis as u64)
}
}
fn expires_in_from_timestamp(expires_at: u64) -> Option<u64> {
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_else(|_| Duration::from_secs(0));
let now_ms = now.as_millis() as u64;
if expires_at <= now_ms {
None
} else {
Some((expires_at - now_ms) / 1000)
}
}
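
The two helpers above convert between a relative `expires_in` and an absolute millisecond timestamp stored in the fallback file. The sketch below restates that round trip with the current time passed in explicitly, which makes the truncation to whole seconds easy to see. The function names and the `now` parameter are assumptions for illustration; the real helpers read `SystemTime::now()`.

```rust
use std::time::Duration;

/// Hypothetical variant of `compute_expires_at_millis` with `now` injected
/// as a duration since the Unix epoch. Saturates at `u64::MAX`.
fn expires_at_millis(now: Duration, expires_in: Duration) -> Option<u64> {
    let expiry = now.checked_add(expires_in)?;
    let millis = expiry.as_millis();
    if millis > u128::from(u64::MAX) {
        Some(u64::MAX)
    } else {
        Some(millis as u64)
    }
}

/// Hypothetical variant of `expires_in_from_timestamp`: whole seconds left,
/// or `None` once the timestamp has passed.
fn remaining_secs(now: Duration, expires_at_ms: u64) -> Option<u64> {
    let now_ms = now.as_millis() as u64;
    if expires_at_ms <= now_ms {
        None
    } else {
        Some((expires_at_ms - now_ms) / 1000)
    }
}

fn main() {
    let saved_at = Duration::from_secs(1_000);
    let expires_at = expires_at_millis(saved_at, Duration::from_secs(3_600));
    println!("{expires_at:?}");
    let loaded_at = Duration::from_millis(1_000_500);
    println!("{:?}", expires_at.and_then(|ts| remaining_secs(loaded_at, ts)));
}
```

Since the remaining time is rounded down to whole seconds, a token reloaded from the fallback file never reports a longer lifetime than it had when saved, which is presumably why the tests above compare tokens without the exact expiry.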
fn compute_store_key(server_name: &str, server_url: &str) -> Result<String> {
let mut payload = JsonMap::new();
payload.insert(
"type".to_string(),
Value::String(MCP_SERVER_TYPE.to_string()),
);
payload.insert("url".to_string(), Value::String(server_url.to_string()));
payload.insert("headers".to_string(), Value::Object(JsonMap::new()));
let truncated = sha_256_prefix(&Value::Object(payload))?;
Ok(format!("{server_name}|{truncated}"))
}
fn fallback_file_path() -> Result<PathBuf> {
let mut path = find_codex_home()?;
path.push(FALLBACK_FILENAME);
Ok(path)
}
fn read_fallback_file() -> Result<Option<FallbackFile>> {
let path = fallback_file_path()?;
let contents = match fs::read_to_string(&path) {
Ok(contents) => contents,
Err(err) if err.kind() == ErrorKind::NotFound => return Ok(None),
Err(err) => {
return Err(err).context(format!(
"failed to read credentials file at {}",
path.display()
));
}
};
match serde_json::from_str::<FallbackFile>(&contents) {
Ok(store) => Ok(Some(store)),
Err(e) => Err(e).context(format!(
"failed to parse credentials file at {}",
path.display()
)),
}
}
fn write_fallback_file(store: &FallbackFile) -> Result<()> {
let path = fallback_file_path()?;
if store.is_empty() {
if path.exists() {
fs::remove_file(path)?;
}
return Ok(());
}
if let Some(parent) = path.parent() {
fs::create_dir_all(parent)?;
}
let serialized = serde_json::to_string(store)?;
fs::write(&path, serialized)?;
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let perms = fs::Permissions::from_mode(0o600);
fs::set_permissions(&path, perms)?;
}
Ok(())
}
fn sha_256_prefix(value: &Value) -> Result<String> {
let serialized =
serde_json::to_string(&value).context("failed to serialize MCP OAuth key payload")?;
let mut hasher = Sha256::new();
hasher.update(serialized.as_bytes());
let digest = hasher.finalize();
let hex = format!("{digest:x}");
let truncated = &hex[..16];
Ok(truncated.to_string())
}
#[cfg(test)]
mod tests {
use super::*;
use anyhow::Result;
use keyring::Error as KeyringError;
use keyring::credential::CredentialApi as _;
use keyring::mock::MockCredential;
use pretty_assertions::assert_eq;
use std::collections::HashMap;
use std::sync::Arc;
use std::sync::Mutex;
use std::sync::MutexGuard;
use std::sync::OnceLock;
use std::sync::PoisonError;
use tempfile::tempdir;
#[derive(Default, Clone)]
struct MockCredentialStore {
credentials: Arc<Mutex<HashMap<String, Arc<MockCredential>>>>,
}
impl MockCredentialStore {
fn credential(&self, account: &str) -> Arc<MockCredential> {
let mut guard = self.credentials.lock().unwrap();
guard
.entry(account.to_string())
.or_insert_with(|| Arc::new(MockCredential::default()))
.clone()
}
fn saved_value(&self, account: &str) -> Option<String> {
let credential = {
let guard = self.credentials.lock().unwrap();
guard.get(account).cloned()
}?;
credential.get_password().ok()
}
fn set_error(&self, account: &str, error: KeyringError) {
let credential = self.credential(account);
credential.set_error(error);
}
fn contains(&self, account: &str) -> bool {
let guard = self.credentials.lock().unwrap();
guard.contains_key(account)
}
}
impl CredentialStore for MockCredentialStore {
fn load(
&self,
_service: &str,
account: &str,
) -> Result<Option<String>, CredentialStoreError> {
let credential = {
let guard = self.credentials.lock().unwrap();
guard.get(account).cloned()
};
let Some(credential) = credential else {
return Ok(None);
};
match credential.get_password() {
Ok(password) => Ok(Some(password)),
Err(KeyringError::NoEntry) => Ok(None),
Err(error) => Err(CredentialStoreError::new(error)),
}
}
fn save(
&self,
_service: &str,
account: &str,
value: &str,
) -> Result<(), CredentialStoreError> {
let credential = self.credential(account);
credential
.set_password(value)
.map_err(CredentialStoreError::new)
}
fn delete(&self, _service: &str, account: &str) -> Result<bool, CredentialStoreError> {
let credential = {
let guard = self.credentials.lock().unwrap();
guard.get(account).cloned()
};
let Some(credential) = credential else {
return Ok(false);
};
match credential.delete_credential() {
Ok(()) => {
let mut guard = self.credentials.lock().unwrap();
guard.remove(account);
Ok(true)
}
Err(KeyringError::NoEntry) => {
let mut guard = self.credentials.lock().unwrap();
guard.remove(account);
Ok(false)
}
Err(error) => Err(CredentialStoreError::new(error)),
}
}
}
struct TempCodexHome {
_guard: MutexGuard<'static, ()>,
_dir: tempfile::TempDir,
}
impl TempCodexHome {
fn new() -> Self {
static LOCK: OnceLock<Mutex<()>> = OnceLock::new();
let guard = LOCK
.get_or_init(Mutex::default)
.lock()
.unwrap_or_else(PoisonError::into_inner);
let dir = tempdir().expect("create CODEX_HOME temp dir");
unsafe {
std::env::set_var("CODEX_HOME", dir.path());
}
Self {
_guard: guard,
_dir: dir,
}
}
}
impl Drop for TempCodexHome {
fn drop(&mut self) {
unsafe {
std::env::remove_var("CODEX_HOME");
}
}
}
#[test]
fn load_oauth_tokens_reads_from_keyring_when_available() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let expected = tokens.clone();
let serialized = serde_json::to_string(&tokens)?;
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
store.save(KEYRING_SERVICE, &key, &serialized)?;
let loaded = super::load_oauth_tokens_with_store(&store, &tokens.server_name, &tokens.url)?;
assert_eq!(loaded, Some(expected));
Ok(())
}
#[test]
fn load_oauth_tokens_falls_back_when_missing_in_keyring() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let expected = tokens.clone();
super::save_oauth_tokens_to_file(&tokens)?;
let loaded = super::load_oauth_tokens_with_store(&store, &tokens.server_name, &tokens.url)?
.expect("tokens should load from fallback");
assert_tokens_match_without_expiry(&loaded, &expected);
Ok(())
}
#[test]
fn load_oauth_tokens_falls_back_when_keyring_errors() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let expected = tokens.clone();
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
store.set_error(&key, KeyringError::Invalid("error".into(), "load".into()));
super::save_oauth_tokens_to_file(&tokens)?;
let loaded = super::load_oauth_tokens_with_store(&store, &tokens.server_name, &tokens.url)?
.expect("tokens should load from fallback");
assert_tokens_match_without_expiry(&loaded, &expected);
Ok(())
}
#[test]
fn save_oauth_tokens_prefers_keyring_when_available() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
super::save_oauth_tokens_to_file(&tokens)?;
super::save_oauth_tokens_with_store(&store, &tokens.server_name, &tokens)?;
let fallback_path = super::fallback_file_path()?;
assert!(!fallback_path.exists(), "fallback file should be removed");
let stored = store.saved_value(&key).expect("value saved to keyring");
assert_eq!(serde_json::from_str::<StoredOAuthTokens>(&stored)?, tokens);
Ok(())
}
#[test]
fn save_oauth_tokens_writes_fallback_when_keyring_fails() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
store.set_error(&key, KeyringError::Invalid("error".into(), "save".into()));
super::save_oauth_tokens_with_store(&store, &tokens.server_name, &tokens)?;
let fallback_path = super::fallback_file_path()?;
assert!(fallback_path.exists(), "fallback file should be created");
let saved = super::read_fallback_file()?.expect("fallback file should load");
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
let entry = saved.get(&key).expect("entry for key");
assert_eq!(entry.server_name, tokens.server_name);
assert_eq!(entry.server_url, tokens.url);
assert_eq!(entry.client_id, tokens.client_id);
assert_eq!(
entry.access_token,
tokens.token_response.0.access_token().secret().as_str()
);
assert!(store.saved_value(&key).is_none());
Ok(())
}
#[test]
fn delete_oauth_tokens_removes_all_storage() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let serialized = serde_json::to_string(&tokens)?;
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
store.save(KEYRING_SERVICE, &key, &serialized)?;
super::save_oauth_tokens_to_file(&tokens)?;
let removed =
super::delete_oauth_tokens_with_store(&store, &tokens.server_name, &tokens.url)?;
assert!(removed);
assert!(!store.contains(&key));
assert!(!super::fallback_file_path()?.exists());
Ok(())
}
#[test]
fn delete_oauth_tokens_propagates_keyring_errors() -> Result<()> {
let _env = TempCodexHome::new();
let store = MockCredentialStore::default();
let tokens = sample_tokens();
let key = super::compute_store_key(&tokens.server_name, &tokens.url)?;
store.set_error(&key, KeyringError::Invalid("error".into(), "delete".into()));
super::save_oauth_tokens_to_file(&tokens).unwrap();
let result =
super::delete_oauth_tokens_with_store(&store, &tokens.server_name, &tokens.url);
assert!(result.is_err());
assert!(super::fallback_file_path().unwrap().exists());
Ok(())
}
fn assert_tokens_match_without_expiry(
actual: &StoredOAuthTokens,
expected: &StoredOAuthTokens,
) {
assert_eq!(actual.server_name, expected.server_name);
assert_eq!(actual.url, expected.url);
assert_eq!(actual.client_id, expected.client_id);
assert_token_response_match_without_expiry(
&actual.token_response,
&expected.token_response,
);
}
fn assert_token_response_match_without_expiry(
actual: &WrappedOAuthTokenResponse,
expected: &WrappedOAuthTokenResponse,
) {
let actual_response = &actual.0;
let expected_response = &expected.0;
assert_eq!(
actual_response.access_token().secret(),
expected_response.access_token().secret()
);
assert_eq!(actual_response.token_type(), expected_response.token_type());
assert_eq!(
actual_response.refresh_token().map(RefreshToken::secret),
expected_response.refresh_token().map(RefreshToken::secret),
);
assert_eq!(actual_response.scopes(), expected_response.scopes());
assert_eq!(
actual_response.extra_fields(),
expected_response.extra_fields()
);
assert_eq!(
actual_response.expires_in().is_some(),
expected_response.expires_in().is_some()
);
}
fn sample_tokens() -> StoredOAuthTokens {
let mut response = OAuthTokenResponse::new(
AccessToken::new("access-token".to_string()),
BasicTokenType::Bearer,
EmptyExtraTokenFields {},
);
response.set_refresh_token(Some(RefreshToken::new("refresh-token".to_string())));
response.set_scopes(Some(vec![
Scope::new("scope-a".to_string()),
Scope::new("scope-b".to_string()),
]));
let expires_in = Duration::from_secs(3600);
response.set_expires_in(Some(&expires_in));
StoredOAuthTokens {
server_name: "test-server".to_string(),
url: "https://example.test".to_string(),
client_id: "client-id".to_string(),
token_response: WrappedOAuthTokenResponse(response),
}
}
}
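
The tests above exercise a "keyring first, file second" lookup: a hit in the primary store wins, a miss falls through to the fallback file, and a primary error is logged before the fallback is consulted. The following dependency-free sketch shows that control flow with two in-memory stores. The `Store` trait, `MemoryStore`, and `load_with_fallback` are hypothetical stand-ins for `CredentialStore`, the keyring, and the JSON fallback file.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `CredentialStore` trait (load only).
trait Store {
    fn load(&self, key: &str) -> Result<Option<String>, String>;
}

/// In-memory store that can be told to fail, like the mock in the tests.
struct MemoryStore {
    entries: HashMap<String, String>,
    fail: bool,
}

impl MemoryStore {
    fn with(entries: &[(&str, &str)], fail: bool) -> Self {
        let entries = entries
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        Self { entries, fail }
    }
}

impl Store for MemoryStore {
    fn load(&self, key: &str) -> Result<Option<String>, String> {
        if self.fail {
            return Err("store unavailable".to_string());
        }
        Ok(self.entries.get(key).cloned())
    }
}

/// Primary hit wins; a miss or an error falls through to the fallback.
fn load_with_fallback(
    primary: &dyn Store,
    fallback: &dyn Store,
    key: &str,
) -> Result<Option<String>, String> {
    match primary.load(key) {
        Ok(Some(value)) => Ok(Some(value)),
        Ok(None) => fallback.load(key),
        Err(primary_error) => {
            eprintln!("primary store failed: {primary_error}");
            fallback
                .load(key)
                .map_err(|e| format!("{e} (primary store also failed: {primary_error})"))
        }
    }
}

fn main() {
    let keyring = MemoryStore::with(&[], true);
    let file = MemoryStore::with(&[("server|abc", "token")], false);
    println!("{:?}", load_with_fallback(&keyring, &file, "server|abc"));
}
```

Abstracting the store behind a trait is what lets the tests in the diff swap in `MockCredentialStore` and force keyring errors without touching the real OS keychain.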

View File

@@ -0,0 +1,141 @@
use std::string::String;
use std::sync::Arc;
use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use rmcp::transport::auth::OAuthState;
use tiny_http::Response;
use tiny_http::Server;
use tokio::sync::oneshot;
use tokio::time::timeout;
use urlencoding::decode;
use crate::StoredOAuthTokens;
use crate::WrappedOAuthTokenResponse;
use crate::save_oauth_tokens;
struct CallbackServerGuard {
server: Arc<Server>,
}
impl Drop for CallbackServerGuard {
fn drop(&mut self) {
self.server.unblock();
}
}
pub async fn perform_oauth_login(server_name: &str, server_url: &str) -> Result<()> {
let server = Arc::new(Server::http("127.0.0.1:0").map_err(|err| anyhow!(err))?);
let guard = CallbackServerGuard {
server: Arc::clone(&server),
};
let redirect_uri = match server.server_addr() {
tiny_http::ListenAddr::IP(std::net::SocketAddr::V4(addr)) => {
format!("http://{}:{}/callback", addr.ip(), addr.port())
}
tiny_http::ListenAddr::IP(std::net::SocketAddr::V6(addr)) => {
format!("http://[{}]:{}/callback", addr.ip(), addr.port())
}
#[cfg(not(target_os = "windows"))]
_ => return Err(anyhow!("unable to determine callback address")),
};
let (tx, rx) = oneshot::channel();
spawn_callback_server(server, tx);
let mut oauth_state = OAuthState::new(server_url, None).await?;
oauth_state.start_authorization(&[], &redirect_uri).await?;
let auth_url = oauth_state.get_authorization_url().await?;
println!("Authorize `{server_name}` by opening this URL in your browser:\n{auth_url}\n");
if webbrowser::open(&auth_url).is_err() {
println!("(Browser launch failed; please copy the URL above manually.)");
}
let (code, csrf_state) = timeout(Duration::from_secs(300), rx)
.await
.context("timed out waiting for OAuth callback")?
.context("OAuth callback was cancelled")?;
oauth_state
.handle_callback(&code, &csrf_state)
.await
.context("failed to handle OAuth callback")?;
let (client_id, credentials_opt) = oauth_state
.get_credentials()
.await
.context("failed to retrieve OAuth credentials")?;
let credentials =
credentials_opt.ok_or_else(|| anyhow!("OAuth provider did not return credentials"))?;
let stored = StoredOAuthTokens {
server_name: server_name.to_string(),
url: server_url.to_string(),
client_id,
token_response: WrappedOAuthTokenResponse(credentials),
};
save_oauth_tokens(server_name, &stored)?;
drop(guard);
Ok(())
}
fn spawn_callback_server(server: Arc<Server>, tx: oneshot::Sender<(String, String)>) {
tokio::task::spawn_blocking(move || {
while let Ok(request) = server.recv() {
let path = request.url().to_string();
if let Some(OauthCallbackResult { code, state }) = parse_oauth_callback(&path) {
let response =
Response::from_string("Authentication complete. You may close this window.");
if let Err(err) = request.respond(response) {
eprintln!("Failed to respond to OAuth callback: {err}");
}
if let Err(err) = tx.send((code, state)) {
eprintln!("Failed to send OAuth callback: {err:?}");
}
break;
} else {
let response =
Response::from_string("Invalid OAuth callback").with_status_code(400);
if let Err(err) = request.respond(response) {
eprintln!("Failed to respond to OAuth callback: {err}");
}
}
}
});
}
struct OauthCallbackResult {
code: String,
state: String,
}
fn parse_oauth_callback(path: &str) -> Option<OauthCallbackResult> {
let (route, query) = path.split_once('?')?;
if route != "/callback" {
return None;
}
let mut code = None;
let mut state = None;
for pair in query.split('&') {
let (key, value) = pair.split_once('=')?;
let decoded = decode(value).ok()?.into_owned();
match key {
"code" => code = Some(decoded),
"state" => state = Some(decoded),
_ => {}
}
}
Some(OauthCallbackResult {
code: code?,
state: state?,
})
}
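
The callback parsing in this file can be tried in isolation. The following is a minimal, std-only sketch, not the code from this change: `CallbackParams`, `percent_decode`, and `parse_callback` are hypothetical names, and `percent_decode` is an assumed stand-in for `urlencoding::decode`. It also differs from the diff in one deliberate way: a query pair without `=` is skipped rather than causing the whole callback to be rejected.

```rust
/// Hypothetical result type, mirroring the `code`/`state` pair used in the diff.
#[derive(Debug, PartialEq)]
struct CallbackParams {
    code: String,
    state: String,
}

/// Minimal percent-decoder (assumption: stands in for `urlencoding::decode`).
/// Returns `None` on a truncated or non-hex escape, or on invalid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // `get` fails if fewer than two characters follow the '%'.
            let hex = input.get(i + 1..i + 3)?;
            if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// Parse `/callback?code=...&state=...`; any other route yields `None`.
fn parse_callback(path: &str) -> Option<CallbackParams> {
    let (route, query) = path.split_once('?')?;
    if route != "/callback" {
        return None;
    }
    let mut code = None;
    let mut state = None;
    for pair in query.split('&') {
        // Skip valueless pairs instead of rejecting the whole request.
        let Some((key, value)) = pair.split_once('=') else {
            continue;
        };
        match key {
            "code" => code = Some(percent_decode(value)?),
            "state" => state = Some(percent_decode(value)?),
            _ => {} // Unrelated parameters are ignored without decoding.
        }
    }
    // Both parameters are required for a valid callback.
    Some(CallbackParams {
        code: code?,
        state: state?,
    })
}
```

Calling `parse_callback("/callback?code=abc%2F123&state=xyz")` yields a `CallbackParams` whose `code` is `abc/123`; a request that lacks either parameter, or that targets another route, yields `None`, which corresponds to the 400 response in the server loop above.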


@@ -21,6 +21,8 @@ use rmcp::service::RoleClient;
use rmcp::service::RunningService;
use rmcp::service::{self};
use rmcp::transport::StreamableHttpClientTransport;
use rmcp::transport::auth::AuthClient;
use rmcp::transport::auth::OAuthState;
use rmcp::transport::child_process::TokioChildProcess;
use rmcp::transport::streamable_http_client::StreamableHttpClientTransportConfig;
use tokio::io::AsyncBufReadExt;
@@ -31,7 +33,10 @@ use tokio::time;
use tracing::info;
use tracing::warn;
use crate::load_oauth_tokens;
use crate::logging_client_handler::LoggingClientHandler;
use crate::oauth::OAuthPersistor;
use crate::oauth::StoredOAuthTokens;
use crate::utils::convert_call_tool_result;
use crate::utils::convert_to_mcp;
use crate::utils::convert_to_rmcp;
@@ -40,7 +45,13 @@ use crate::utils::run_with_timeout;
enum PendingTransport {
ChildProcess(TokioChildProcess),
StreamableHttp(StreamableHttpClientTransport<reqwest::Client>),
StreamableHttp {
transport: StreamableHttpClientTransport<reqwest::Client>,
},
StreamableHttpWithOAuth {
transport: StreamableHttpClientTransport<AuthClient<reqwest::Client>>,
oauth_persistor: OAuthPersistor,
},
}
enum ClientState {
@@ -49,6 +60,7 @@ enum ClientState {
},
Ready {
service: Arc<RunningService<RoleClient, LoggingClientHandler>>,
oauth: Option<OAuthPersistor>,
},
}
@@ -103,17 +115,37 @@ impl RmcpClient {
})
}
pub fn new_streamable_http_client(url: String, bearer_token: Option<String>) -> Result<Self> {
let mut config = StreamableHttpClientTransportConfig::with_uri(url);
if let Some(token) = bearer_token {
config = config.auth_header(format!("Bearer {token}"));
}
let transport = StreamableHttpClientTransport::from_config(config);
pub async fn new_streamable_http_client(
server_name: &str,
url: &str,
bearer_token: Option<String>,
) -> Result<Self> {
let initial_tokens = match load_oauth_tokens(server_name, url) {
Ok(tokens) => tokens,
Err(err) => {
warn!("failed to read tokens for server `{server_name}`: {err}");
None
}
};
let transport = if let Some(initial_tokens) = initial_tokens.clone() {
let (transport, oauth_persistor) =
create_oauth_transport_and_runtime(server_name, url, initial_tokens).await?;
PendingTransport::StreamableHttpWithOAuth {
transport,
oauth_persistor,
}
} else {
let mut http_config = StreamableHttpClientTransportConfig::with_uri(url.to_string());
if let Some(bearer_token) = bearer_token {
http_config = http_config.auth_header(format!("Bearer {bearer_token}"));
}
let transport = StreamableHttpClientTransport::from_config(http_config);
PendingTransport::StreamableHttp { transport }
};
Ok(Self {
state: Mutex::new(ClientState::Connecting {
transport: Some(PendingTransport::StreamableHttp(transport)),
transport: Some(transport),
}),
})
}
@@ -125,35 +157,40 @@ impl RmcpClient {
params: InitializeRequestParams,
timeout: Option<Duration>,
) -> Result<InitializeResult> {
let transport = {
let rmcp_params: InitializeRequestParam = convert_to_rmcp(params.clone())?;
let client_handler = LoggingClientHandler::new(rmcp_params);
let (transport, oauth_persistor) = {
let mut guard = self.state.lock().await;
match &mut *guard {
ClientState::Connecting { transport } => transport
.take()
.ok_or_else(|| anyhow!("client already initializing"))?,
ClientState::Ready { .. } => {
return Err(anyhow!("client already initialized"));
}
}
};
let client_info = convert_to_rmcp::<_, InitializeRequestParam>(params.clone())?;
let client_handler = LoggingClientHandler::new(client_info);
let service_future = match transport {
PendingTransport::ChildProcess(transport) => {
service::serve_client(client_handler.clone(), transport).boxed()
}
PendingTransport::StreamableHttp(transport) => {
service::serve_client(client_handler, transport).boxed()
ClientState::Connecting { transport } => match transport.take() {
Some(PendingTransport::ChildProcess(transport)) => (
service::serve_client(client_handler.clone(), transport).boxed(),
None,
),
Some(PendingTransport::StreamableHttp { transport }) => (
service::serve_client(client_handler.clone(), transport).boxed(),
None,
),
Some(PendingTransport::StreamableHttpWithOAuth {
transport,
oauth_persistor,
}) => (
service::serve_client(client_handler.clone(), transport).boxed(),
Some(oauth_persistor),
),
None => return Err(anyhow!("client already initializing")),
},
ClientState::Ready { .. } => return Err(anyhow!("client already initialized")),
}
};
let service = match timeout {
Some(duration) => time::timeout(duration, service_future)
Some(duration) => time::timeout(duration, transport)
.await
.map_err(|_| anyhow!("timed out handshaking with MCP server after {duration:?}"))?
.map_err(|err| anyhow!("handshaking with MCP server failed: {err}"))?,
None => service_future
None => transport
.await
.map_err(|err| anyhow!("handshaking with MCP server failed: {err}"))?,
};
@@ -168,9 +205,16 @@ impl RmcpClient {
let mut guard = self.state.lock().await;
*guard = ClientState::Ready {
service: Arc::new(service),
oauth: oauth_persistor.clone(),
};
}
if let Some(runtime) = oauth_persistor
&& let Err(error) = runtime.persist_if_needed().await
{
warn!("failed to persist OAuth tokens after initialize: {error}");
}
Ok(initialize_result)
}
@@ -186,7 +230,9 @@ impl RmcpClient {
let fut = service.list_tools(rmcp_params);
let result = run_with_timeout(fut, timeout, "tools/list").await?;
convert_to_mcp(result)
let converted = convert_to_mcp(result)?;
self.persist_oauth_tokens().await;
Ok(converted)
}
pub async fn call_tool(
@@ -200,14 +246,79 @@ impl RmcpClient {
let rmcp_params: CallToolRequestParam = convert_to_rmcp(params)?;
let fut = service.call_tool(rmcp_params);
let rmcp_result = run_with_timeout(fut, timeout, "tools/call").await?;
convert_call_tool_result(rmcp_result)
let converted = convert_call_tool_result(rmcp_result)?;
self.persist_oauth_tokens().await;
Ok(converted)
}
async fn service(&self) -> Result<Arc<RunningService<RoleClient, LoggingClientHandler>>> {
let guard = self.state.lock().await;
match &*guard {
ClientState::Ready { service } => Ok(Arc::clone(service)),
ClientState::Ready { service, .. } => Ok(Arc::clone(service)),
ClientState::Connecting { .. } => Err(anyhow!("MCP client not initialized")),
}
}
async fn oauth_persistor(&self) -> Option<OAuthPersistor> {
let guard = self.state.lock().await;
match &*guard {
ClientState::Ready {
oauth: Some(runtime),
service: _,
} => Some(runtime.clone()),
_ => None,
}
}
async fn persist_oauth_tokens(&self) {
if let Some(runtime) = self.oauth_persistor().await
&& let Err(error) = runtime.persist_if_needed().await
{
warn!("failed to persist OAuth tokens: {error}");
}
}
}
async fn create_oauth_transport_and_runtime(
server_name: &str,
url: &str,
initial_tokens: StoredOAuthTokens,
) -> Result<(
StreamableHttpClientTransport<AuthClient<reqwest::Client>>,
OAuthPersistor,
)> {
let http_client = reqwest::Client::builder().build()?;
let mut oauth_state = OAuthState::new(url.to_string(), Some(http_client.clone())).await?;
oauth_state
.set_credentials(
&initial_tokens.client_id,
initial_tokens.token_response.0.clone(),
)
.await?;
let manager = match oauth_state {
OAuthState::Authorized(manager) => manager,
OAuthState::Unauthorized(manager) => manager,
OAuthState::Session(_) | OAuthState::AuthorizedHttpClient(_) => {
return Err(anyhow!("unexpected OAuth state during client setup"));
}
};
let auth_client = AuthClient::new(http_client, manager);
let auth_manager = auth_client.auth_manager.clone();
let transport = StreamableHttpClientTransport::with_client(
auth_client,
StreamableHttpClientTransportConfig::with_uri(url.to_string()),
);
let runtime = OAuthPersistor::new(
server_name.to_string(),
url.to_string(),
auth_manager,
Some(initial_tokens),
);
Ok((transport, runtime))
}


@@ -154,6 +154,8 @@ impl App {
backtrack: BacktrackState::default(),
};
app.process_pending_admin_controls();
let tui_events = tui.event_stream();
tokio::pin!(tui_events);
@@ -366,11 +368,14 @@ impl App {
}
}
}
AppEvent::UpdateAskForApprovalPolicy(policy) => {
self.chat_widget.set_approval_policy(policy);
AppEvent::ApplyApprovalPreset(preset) => {
self.handle_apply_approval_preset(preset)?;
}
AppEvent::UpdateSandboxPolicy(policy) => {
self.chat_widget.set_sandbox_policy(policy);
AppEvent::DangerJustificationSubmitted { justification } => {
self.handle_danger_justification_submission(justification)?;
}
AppEvent::DangerJustificationCancelled => {
self.handle_danger_justification_cancelled()?;
}
AppEvent::OpenReviewBranchPicker(cwd) => {
self.chat_widget.show_review_branch_picker(&cwd).await;


@@ -0,0 +1,192 @@
use crate::app::App;
use codex_common::approval_presets::ApprovalPreset;
use codex_core::admin_controls::DangerAuditAction;
use codex_core::admin_controls::DangerDecision;
use codex_core::admin_controls::DangerPending;
use codex_core::admin_controls::DangerRequestSource;
use codex_core::admin_controls::PendingAdminAction;
use codex_core::admin_controls::build_danger_audit_payload;
use codex_core::admin_controls::log_admin_event;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use color_eyre::eyre::Result;
impl App {
pub(crate) fn handle_apply_approval_preset(&mut self, preset: ApprovalPreset) -> Result<()> {
self.cancel_existing_pending_requests();
let ApprovalPreset {
approval, sandbox, ..
} = preset;
match sandbox {
SandboxPolicy::DangerFullAccess => match self.config.admin.decision_for_danger() {
DangerDecision::Allowed => {
let pending = DangerPending {
source: DangerRequestSource::Approvals,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: approval,
};
self.log_danger_event(&pending, DangerAuditAction::Approved, None);
self.apply_sandbox_and_approval(approval, SandboxPolicy::DangerFullAccess);
}
DangerDecision::RequiresJustification => {
let pending = DangerPending {
source: DangerRequestSource::Approvals,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: approval,
};
self.log_danger_event(&pending, DangerAuditAction::Requested, None);
self.push_pending_danger(pending.clone());
self.chat_widget.prompt_for_danger_justification(pending);
}
DangerDecision::Denied => {
let pending = DangerPending {
source: DangerRequestSource::Approvals,
requested_sandbox: SandboxPolicy::DangerFullAccess,
requested_approval: approval,
};
self.log_danger_event(&pending, DangerAuditAction::Denied, None);
self.chat_widget.add_error_message(
"Full access is disabled by your administrator.".to_string(),
);
}
},
other => {
self.apply_sandbox_and_approval(approval, other);
}
}
Ok(())
}
pub(crate) fn handle_danger_justification_submission(
&mut self,
justification: String,
) -> Result<()> {
let justification = justification.trim();
if justification.is_empty() {
self.chat_widget.add_error_message(
"Please provide a justification before enabling full access.".to_string(),
);
return Ok(());
}
let Some(pending) = self.chat_widget.take_pending_danger() else {
return Ok(());
};
if let Some(internal) = self.drop_pending_from_configs() {
debug_assert_eq!(internal, pending);
}
self.log_danger_event(
&pending,
DangerAuditAction::Approved,
Some(justification.to_string()),
);
let DangerPending {
requested_approval,
requested_sandbox,
..
} = pending;
self.apply_sandbox_and_approval(requested_approval, requested_sandbox);
self.chat_widget.add_info_message(
"Full access enabled.".to_string(),
Some("Justification has been logged.".to_string()),
);
Ok(())
}
pub(crate) fn handle_danger_justification_cancelled(&mut self) -> Result<()> {
self.cancel_existing_pending_requests();
let approval_label = self.config.approval_policy.to_string();
let sandbox_label = self.config.sandbox_policy.to_string();
self.chat_widget.add_info_message(
format!(
"Full access remains disabled. Current approval policy `{approval_label}`, sandbox `{sandbox_label}`."
),
None,
);
Ok(())
}
pub(crate) fn process_pending_admin_controls(&mut self) {
while let Some(pending) = self.drop_pending_from_configs() {
self.chat_widget.prompt_for_danger_justification(pending);
}
}
fn apply_sandbox_and_approval(&mut self, approval: AskForApproval, sandbox: SandboxPolicy) {
self.chat_widget.submit_op(Op::OverrideTurnContext {
cwd: None,
approval_policy: Some(approval),
sandbox_policy: Some(sandbox.clone()),
model: None,
effort: None,
summary: None,
});
self.chat_widget.set_approval_policy(approval);
self.chat_widget.set_sandbox_policy(sandbox.clone());
self.config.approval_policy = approval;
self.config.sandbox_policy = sandbox;
}
fn push_pending_danger(&mut self, pending: DangerPending) {
self.config
.admin
.pending
.push(PendingAdminAction::Danger(pending.clone()));
self.chat_widget
.config_mut()
.admin
.pending
.push(PendingAdminAction::Danger(pending));
}
fn cancel_existing_pending_requests(&mut self) {
let mut logged = false;
if let Some(previous) = self.config.admin.take_pending_danger() {
self.log_danger_event(&previous, DangerAuditAction::Cancelled, None);
logged = true;
}
if let Some(previous) = self.chat_widget.config_mut().admin.take_pending_danger()
&& !logged
{
self.log_danger_event(&previous, DangerAuditAction::Cancelled, None);
logged = true;
}
if let Some(previous) = self.chat_widget.take_pending_danger()
&& !logged
{
self.log_danger_event(&previous, DangerAuditAction::Cancelled, None);
}
}
fn drop_pending_from_configs(&mut self) -> Option<DangerPending> {
let config_pending = self.config.admin.take_pending_danger();
let widget_pending = self.chat_widget.config_mut().admin.take_pending_danger();
config_pending.or(widget_pending)
}
fn log_danger_event(
&self,
pending: &DangerPending,
action: DangerAuditAction,
justification: Option<String>,
) {
if let Some(audit) = self.config.admin.audit.as_ref() {
log_admin_event(
audit,
build_danger_audit_payload(pending, action, justification),
);
}
}
}
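
The three-way decision handled in this file (allow, require a justification, deny) can be summarized as a small state sketch. This is a simplified illustration under stated assumptions, not the real `codex_core::admin_controls` API: `Outcome`, `AuditAction`, `decide_full_access`, and `submit_justification` are hypothetical names, and the audit log is modeled as a plain `Vec`.

```rust
/// Hypothetical mirror of the admin decision for a full-access request.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DangerDecision {
    Allowed,
    RequiresJustification,
    Denied,
}

/// Hypothetical audit actions, modeled on the variants logged in the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AuditAction {
    Approved,
    Requested,
    Denied,
}

/// Hypothetical result of handling a request (not a type from the codebase).
#[derive(Debug, PartialEq)]
enum Outcome {
    Applied,
    AwaitingJustification,
    Rejected(&'static str),
}

/// Route a full-access request according to the admin decision,
/// recording one audit entry per request.
fn decide_full_access(decision: DangerDecision, audit: &mut Vec<AuditAction>) -> Outcome {
    match decision {
        DangerDecision::Allowed => {
            audit.push(AuditAction::Approved);
            Outcome::Applied
        }
        DangerDecision::RequiresJustification => {
            audit.push(AuditAction::Requested);
            Outcome::AwaitingJustification
        }
        DangerDecision::Denied => {
            audit.push(AuditAction::Denied);
            Outcome::Rejected("Full access is disabled by your administrator.")
        }
    }
}

/// A blank justification leaves the request pending and logs nothing;
/// a non-empty one approves it.
fn submit_justification(text: &str, audit: &mut Vec<AuditAction>) -> Outcome {
    if text.trim().is_empty() {
        return Outcome::AwaitingJustification;
    }
    audit.push(AuditAction::Approved);
    Outcome::Applied
}
```

As in `handle_danger_justification_submission`, whitespace-only input does not consume the pending request, so the user can retry without re-selecting the preset.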


@@ -1,5 +1,6 @@
use std::path::PathBuf;
use codex_common::approval_presets::ApprovalPreset;
use codex_common::model_presets::ModelPreset;
use codex_core::protocol::ConversationPathResponseEvent;
use codex_core::protocol::Event;
@@ -8,8 +9,6 @@ use codex_file_search::FileMatch;
use crate::bottom_pane::ApprovalRequest;
use crate::history_cell::HistoryCell;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::SandboxPolicy;
use codex_core::protocol_config_types::ReasoningEffort;
#[allow(clippy::large_enum_variant)]
@@ -67,11 +66,16 @@ pub(crate) enum AppEvent {
presets: Vec<ModelPreset>,
},
/// Update the current approval policy in the running app and widget.
UpdateAskForApprovalPolicy(AskForApproval),
/// Apply an approval preset chosen from the popup.
ApplyApprovalPreset(ApprovalPreset),
/// Update the current sandbox policy in the running app and widget.
UpdateSandboxPolicy(SandboxPolicy),
/// Submit a justification for enabling danger-full-access.
DangerJustificationSubmitted {
justification: String,
},
/// User cancelled the danger justification prompt without submitting a reason.
DangerJustificationCancelled,
/// Forwarded conversation history snapshot from the current conversation.
ConversationHistory(ConversationPathResponseEvent),


@@ -36,6 +36,7 @@ use crate::bottom_pane::prompt_args::prompt_argument_names;
use crate::bottom_pane::prompt_args::prompt_command_with_arg_placeholders;
use crate::bottom_pane::prompt_args::prompt_has_numeric_placeholders;
use crate::slash_command::SlashCommand;
use crate::slash_command::built_in_slash_commands;
use crate::style::user_message_style;
use crate::terminal_palette;
use codex_protocol::custom_prompts::CustomPrompt;
@@ -894,6 +895,23 @@ impl ChatComposer {
modifiers: KeyModifiers::NONE,
..
} => {
// If the first line is a bare built-in slash command (no args),
// dispatch it even when the slash popup isn't visible. This preserves
// the workflow: type a prefix ("/di"), press Tab to complete to
// "/diff ", then press Enter to run it. Tab moves the cursor beyond
// the '/name' token and our caret-based heuristic hides the popup,
// but Enter should still dispatch the command rather than submit
// literal text.
let first_line = self.textarea.text().lines().next().unwrap_or("");
if let Some((name, rest)) = parse_slash_name(first_line)
&& rest.is_empty()
&& let Some((_n, cmd)) = built_in_slash_commands()
.into_iter()
.find(|(n, _)| *n == name)
{
self.textarea.set_text("");
return (InputResult::Command(cmd), true);
}
// If we're in a paste-like burst capture, treat Enter as part of the burst
// and accumulate it rather than submitting or inserting immediately.
// Do not treat Enter as paste inside a slash-command context.
@@ -2277,6 +2295,38 @@ mod tests {
assert_eq!(composer.textarea.cursor(), composer.textarea.text().len());
}
#[test]
fn slash_tab_then_enter_dispatches_builtin_command() {
let (tx, _rx) = unbounded_channel::<AppEvent>();
let sender = AppEventSender::new(tx);
let mut composer = ChatComposer::new(
true,
sender,
false,
"Ask Codex to do anything".to_string(),
false,
);
// Type a prefix and complete with Tab, which inserts a trailing space
// and moves the cursor beyond the '/name' token (hides the popup).
type_chars_humanlike(&mut composer, &['/', 'd', 'i']);
let (_res, _redraw) =
composer.handle_key_event(KeyEvent::new(KeyCode::Tab, KeyModifiers::NONE));
assert_eq!(composer.textarea.text(), "/diff ");
// Press Enter: should dispatch the command, not submit literal text.
let (result, _needs_redraw) =
composer.handle_key_event(KeyEvent::new(KeyCode::Enter, KeyModifiers::NONE));
match result {
InputResult::Command(cmd) => assert_eq!(cmd.command(), "diff"),
InputResult::Submitted(text) => {
panic!("expected command dispatch after Tab completion, got literal submit: {text}")
}
InputResult::None => panic!("expected Command result for '/diff'"),
}
assert!(composer.textarea.is_empty());
}
#[test]
fn slash_mention_dispatches_command_and_inserts_at() {
use crossterm::event::KeyCode;

Some files were not shown because too many files have changed in this diff.