Compare commits

...

30 Commits

Author SHA1 Message Date
jif-oai
cbbe3acdf4 feat: add feed char tool 2025-12-08 15:35:47 +00:00
gameofby
98923654d0 fix: refine the warning message and docs for deprecated tools config (#7685)
Issue #7661 revealed that users are confused by deprecation warnings
like:
> `tools.web_search` is deprecated. Use `web_search_request` instead.

This message misleadingly suggests renaming the config key from
`web_search` to `web_search_request`, when the actual required change is
to **move and rename the configuration from the `[tools]` section to the
`[features]` section**.

This PR clarifies the warning messages and documentation to make it
clear that deprecated `[tools]` configurations should be moved to
`[features]`. Changes made:
- Updated deprecation warning format in `codex-rs/core/src/codex.rs:520`
to include `[features].` prefix
- Updated corresponding test expectations in
`codex-rs/core/tests/suite/deprecation_notice.rs:39`
- Improved documentation in `docs/config.md` to clarify upfront that
`[tools]` options are deprecated in favor of `[features]`
2025-12-08 01:23:21 -08:00
Robby He
57ba9fa100 fix(doc): TOML otel exporter example — multi-line inline table is inv… (#7669)
…alid (#7668)

The `otel` exporter example in `docs/config.md` is misleading and will
cause
the configuration parser to fail if copied verbatim.

Summary
-------
The example uses a TOML inline table but spreads the inline-table braces
across multiple lines. TOML inline tables must be contained on a single
line
(`key = { a = 1, b = 2 }`); placing newlines inside the braces triggers
a
parse error in most TOML parsers and prevents Codex from starting.

Reproduction
------------
1. Paste the snippet below into `~/.codex/config.toml` (or your project
config).
2. Run `codex` (or the command that loads the config).
3. The process will fail to start with a TOML parse error similar to:

```text
Error loading config.toml: TOML parse error at line 55, column 27
   |
55 | exporter = { otlp-http = {
   |                           ^
newlines are unsupported in inline tables, expected nothing
```

Problematic snippet (as currently shown in the docs)
---------------------------------------------------
```toml
[otel]
exporter = { otlp-http = {
  endpoint = "https://otel.example.com/v1/logs",
  protocol = "binary",
  headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
}}
```

Recommended fixes
------------------
```toml
[otel.exporter."otlp-http"]
endpoint = "https://otel.example.com/v1/logs"
protocol = "binary"

[otel.exporter."otlp-http".headers]
"x-otlp-api-key" = "${OTLP_TOKEN}"
```

Or, keep an inline table but write it on one line (valid but less
readable):

```toml
[otel]
exporter = { "otlp-http" = { endpoint = "https://otel.example.com/v1/logs", protocol = "binary", headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" } } }
```
2025-12-08 01:20:23 -08:00
Eric Traut
acb8ed493f Fixed regression for chat endpoint; missing tools name caused litellm proxy to crash (#7724)
This PR addresses https://github.com/openai/codex/issues/7051
2025-12-08 00:49:51 -08:00
Ahmed Ibrahim
53a486f7ea Add remote models feature flag (#7648)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2025-12-07 09:47:48 -08:00
Michael Bolin
3c3d3d1adc fix: add integration tests for codex-exec-mcp-server with execpolicy (#7617)
This PR introduces integration tests that run
[codex-shell-tool-mcp](https://www.npmjs.com/package/@openai/codex-shell-tool-mcp)
as a user would. Note that this requires running our fork of Bash, so we
introduce a [DotSlash](https://dotslash-cli.com/) file for `bash` so
that we can run the integration tests on multiple platforms without
having to check the binaries into the repository. (As noted in the
DotSlash file, it is slightly more heavyweight than necessary, which may
be worth addressing as disk space in CI is limited:
https://github.com/openai/codex/pull/7678.)

To start, this PR adds two tests:

- `list_tools()` makes the `list_tools` request to the MCP server and
verifies we get the expected response
- `accept_elicitation_for_prompt_rule()` defines a `prefix_rule()` with
`decision="prompt"` and verifies the elicitation flow works as expected

Though the `accept_elicitation_for_prompt_rule()` test **only works on
Linux**, as this PR reveals that there are currently issues when running
the Bash fork in a read-only sandbox on Linux. This will have to be
fixed in a follow-up PR.

Incidentally, getting this test run to correctly on macOS also requires
a recent fix we made to `brew` that hasn't hit a mainline release yet,
so getting CI green in this PR required
https://github.com/openai/codex/pull/7680.
2025-12-07 06:39:38 +00:00
Michael Bolin
3c087e8fda fix: ensure macOS CI runners for Rust tests include recent Homebrew fixes (#7680)
As noted in the code comment, we introduced a key fix for `brew` in
https://github.com/Homebrew/brew/pull/21157 that Codex needs, but it has
not hit stable yet, so we update our CI job to use latest `brew` from
`origin/main`.

This is necessary for the new integration tests introduced in
https://github.com/openai/codex/pull/7617.
2025-12-06 22:11:07 -08:00
Michael Bolin
7386e2efbc fix: clear out space on ubuntu runners before running Rust tests (#7678)
When I put up https://github.com/openai/codex/pull/7617 for review,
initially I started seeing failures on the `ubuntu-24.04` runner used
for Rust test runs for the `x86_64-unknown-linux-gnu` architecture. Chat
suggested a number of things that could be removed to save space, which
seems to help.
2025-12-06 21:46:07 -08:00
Victor
b2cb05d562 docs: point dev checks to just (#7673)
Update install and contributing guides to use the root justfile helpers
(`just fmt`, `just fix -p <crate>`, and targeted tests) instead of the
older cargo fmt/clippy/test instructions that have been in place since
459363e17b. This matches the justfile relocation to the repo root in
952d6c946 and the current lint/test workflow for CI (see
`.github/workflows/rust-ci.yml`).
2025-12-06 18:57:08 -08:00
Jay Sabva
9a74228c66 docs: Remove experimental_use_rmcp_client from config (#7672)
Removed experimental Rust MCP client option from config.
2025-12-06 16:51:07 -08:00
Jay Sabva
315b1e957d docs: fix documentation of rmcp client flag (#7665)
## Summary
- Updated the rmcp client flag's documentation in config.md file
- changed it from `experimental_use_rmcp_client` to `rmcp_client`
2025-12-06 10:17:18 -08:00
Michael Bolin
82090803d9 fix: exec-server stream was erroring for large requests (#7654)
Previous to this change, large `EscalateRequest` payloads exceeded the
kernel send buffer, causing our single `sendmsg(2)` call (with attached
FDs) to be split and retried without proper control handling; this led
to `EINVAL`/broken pipe in the
`handle_escalate_session_respects_run_in_sandbox_decision()` test when
using an `env` with large contents.

**Before:** `AsyncSocket::send_with_fds()` called `send_json_message()`,
which called `send_message_bytes()`, which made one `socket.sendmsg()`
call followed by additional `socket.send()` calls, as necessary:


2e4a402521/codex-rs/exec-server/src/posix/socket.rs (L198-L209)

**After:** `AsyncSocket::send_with_fds()` now calls
`send_stream_frame()`, which calls `send_stream_chunk()` one or more
times. Each call to `send_stream_chunk()` calls `socket.sendmsg()`.

In the previous implementation, the subsequent `socket.send()` writes
had no control information associated with them, whereas in the new
`send_stream_chunk()` implementation, a fresh `MsgHdr` (using
`with_control()`, as appropriate) is created for `socket.sendmsg()` each
time.

Additionally, with this PR, stream sending attaches `SCM_RIGHTS` only on
the first chunk, and omits control data when there are no FDs, allowing
oversized payloads to deliver correctly while preserving FD limits and
error checks.
2025-12-06 10:16:47 -08:00
Alexander
f521d29726 fix: OTEL HTTP exporter panic and mTLS support (#7651)
This fixes two issues with the OTEL HTTP exporter:

1. **Runtime panic with async reqwest client**

The `opentelemetry_sdk` `BatchLogProcessor` spawns a dedicated OS thread
that uses `futures_executor::block_on()` rather than tokio's runtime.
When the async reqwest client's timeout mechanism calls
`tokio::time::sleep()`, it panics with "there is no reactor running,
must be called from the context of a Tokio 1.x runtime".

The fix is to use `reqwest::blocking::Client` instead, which doesn't
depend on tokio for timeouts. However, the blocking client creates its
own internal tokio runtime during construction, which would panic if
built from within an async context. We wrap the construction in
`tokio::task::block_in_place()` to handle this.

2. **mTLS certificate handling**

The HTTP client wasn't properly configured for mTLS, matching the fixes
previously done for the model provider client:

- Added `.tls_built_in_root_certs(false)` when using a custom CA
certificate to ensure only our CA is trusted
- Added `.https_only(true)` when using client identity
- Added `rustls-tls` feature to ensure rustls is used (required for
`Identity::from_pem()` to work correctly)
2025-12-05 20:46:44 -08:00
xl-openai
93f61dbc5f Also load skills from repo root. (#7645)
Also load skills from /REPO_ROOT/codex/skills.
2025-12-05 18:01:49 -08:00
Dylan Hurd
6c9c563faf fix(apply-patch): preserve CRLF line endings on Windows (#7515)
## Summary
This PR is heavily based on #4017, which contains the core logic for the
fix. To reduce the risk, we are first introducing it only on windows. We
can then expand to wsl / other environments as needed, and then tackle
net new files.

## Testing
- [x] added unit tests in apply-patch
- [x] add integration tests to apply_patch_cli.rs

---------

Co-authored-by: Chase Naples <Cnaples79@gmail.com>
2025-12-05 16:43:27 -08:00
Josh McKinney
952d6c9465 Move justfile to repository root (#7652)
## Summary
- move the workspace justfile to the repository root for easier
discovery
- set the just working directory to codex-rs so existing recipes still
run in the Rust workspace

## Testing
- not run (not requested)


------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69334db473108329b0cc253b7fd8218e)
2025-12-05 16:24:55 -08:00
Jeremy Rose
2e4a402521 cloud: status, diff, apply (#7614)
Adds cli commands for getting the status of cloud tasks, and for
getting/applying the diffs from same.
2025-12-05 21:39:23 +00:00
Pavel Krymets
f48d88067e Fix unified_exec on windows (#7620)
Fix unified_exec on windows

Requires removal of PSUEDOCONSOLE_INHERIT_CURSOR flag so child processed
don't attempt to wait for cursor position response (and timeout).


https://github.com/wezterm/wezterm/compare/main...pakrym:wezterm:PSUEDOCONSOLE_INHERIT_CURSOR?expand=1

---------

Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-12-05 20:09:43 +00:00
Dylan Hurd
a8cbbdbc6e feat(core) Add login to shell_command tool (#6846)
## Summary
Adds the `login` parameter to the `shell_command` tool - optional,
defaults to true.

## Testing
- [x] Tested locally
2025-12-05 11:03:25 -08:00
Ahmed Ibrahim
d08efb1743 Wire with_remote_overrides to construct model families (#7621)
- This PR wires `with_remote_overrides` and make the
`construct_model_families` an async function
- Moves getting model family a level above to keep the function `sync`
- Updates the tests to local, offline, and `sync` helper for model
families
2025-12-05 10:40:15 -08:00
jif-oai
5f80ad6da8 fix: chat completion with parallel tool call (#7634) 2025-12-05 10:20:36 -08:00
jif-oai
e91bb6b947 fix: ignore ghost snapshots in token consumption (#7638) 2025-12-05 13:57:24 +00:00
zhao-oai
b8eab7ce90 fix: taking plan type from usage endpoint instead of thru auth token (#7610)
pull plan type from the usage endpoint, persist it in session state /
tui state, and propagate through rate limit snapshots
2025-12-04 23:34:13 -08:00
zhao-oai
b1c918d8f7 feat: exec policy integration in shell mcp (#7609)
adding execpolicy support into the `posix` mcp

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-12-04 21:55:54 -08:00
zhao-oai
4c9762d15c fix typo (#7626) 2025-12-04 21:48:15 -08:00
Ahmed Ibrahim
7b359c9c8e Call models endpoint in models manager (#7616)
- Introduce `with_remote_overrides` and update
`refresh_available_models`
- Put `auth_manager` instead of `auth_mode` on `models_manager`
- Remove `ShellType` and `ReasoningLevel` to use already existing
structs
2025-12-04 18:28:03 -08:00
jif-oai
6736d1828d fix: sse for chat (#7594) 2025-12-04 16:46:56 -08:00
Dylan Hurd
073a8533b8 chore(apply-patch) scenarios for e2e testing (#7567)
## Summary
This PR introduces an End to End test suite for apply-patch, so we can
easily validate behavior against other implementations as well.

## Testing
- [x] These are tests
2025-12-05 00:20:54 +00:00
Michael Bolin
0972cd9404 chore: refactor to move Arc<RwLock> concern outside exec_policy_for (#7615)
The caller should decide whether wrapping the policy in `Arc<RwLock>` is
necessary. This should make https://github.com/openai/codex/pull/7609 a
bit smoother.

- `exec_policy_for()` -> `load_exec_policy_for_features()`
- introduce `load_exec_policy()` that does not take `Features` as an arg
- both return `Result<Policy, ExecPolicyError>` instead of
Result<Arc<RwLock<Policy>>, ExecPolicyError>`

This simplifies the tests as they have no need for `Arc<RwLock>`.
2025-12-04 15:13:27 -08:00
Robby He
28dcdb566a Fix handle_shortcut_overlay_key for cross-platform consistency (#7583)
**Summary**
- Shortcut toggle using `?` in `handle_shortcut_overlay_key` fails to
trigger on some platforms (notably Windows). Current match requires
`KeyCode::Char('?')` with `KeyModifiers::NONE`. Some terminals set
`SHIFT` when producing `?` (since it is typically `Shift + /`), so the
strict `NONE` check prevents toggling.

**Impact**
- On Windows consoles/terminals, pressing `?` with an empty composer
often does nothing, leading to inconsistent UX compared to macOS/Linux.

**Root Cause**
- Crossterm/terminal backends report modifiers inconsistently across
platforms. Generating `?` may include `SHIFT`. The code enforces
`modifiers == NONE`, so valid `?` presses with `SHIFT` are ignored.
AltGr keyboards may also surface as `ALT`.

**Repro Steps**
- Open the TUI, ensure the composer is empty.
- Press `?`.
- Expected: Shortcut overlay toggles.
- Actual (Windows frequently): No toggle occurs.

**Fix Options**
- Option 1 (preferred): Accept `?` regardless of `SHIFT`, but reject
`CONTROL` and `ALT`.
- Rationale: Keeps behavior consistent across platforms with minimal
code change.
	- Example change:
		- Before: matching `KeyModifiers::NONE` only.
		- After: allow `SHIFT`, disallow `CONTROL | ALT`.
		- Suggested condition:
			```rust
			let toggles = matches!(key_event.code, KeyCode::Char('?'))
&& !key_event.modifiers.intersects(KeyModifiers::CONTROL |
KeyModifiers::ALT)
					&& self.is_empty();
			```

- Option 2: Platform-specific handling (Windows vs non-Windows).
- Implement two variants or conditional branches using `#[cfg(target_os
= "windows")]`.
- On Windows, accept `?` with `SHIFT`; on other platforms, retain
current behavior.
- Trade-off: Higher maintenance burden and code divergence for limited
benefit.

---

close #5495
2025-12-04 14:56:58 -08:00
145 changed files with 4053 additions and 519 deletions

View File

@@ -369,6 +369,57 @@ jobs:
steps:
- uses: actions/checkout@v6
# We have been running out of space when running this job on Linux for
# x86_64-unknown-linux-gnu, so remove some unnecessary dependencies.
- name: Remove unnecessary dependencies to save space
if: ${{ startsWith(matrix.runner, 'ubuntu') }}
shell: bash
run: |
set -euo pipefail
sudo rm -rf \
/usr/local/lib/android \
/usr/share/dotnet \
/usr/local/share/boost \
/usr/local/lib/node_modules \
/opt/ghc
sudo apt-get remove -y docker.io docker-compose podman buildah
# Ensure brew includes this fix so that brew's shellenv.sh loads
# cleanly in the Codex sandbox (it is frequently eval'd via .zprofile
# for Brew users, including the macOS runners on GitHub):
#
# https://github.com/Homebrew/brew/pull/21157
#
# Once brew 5.0.5 is released and is the default on macOS runners, this
# step can be removed.
- name: Upgrade brew
if: ${{ startsWith(matrix.runner, 'macos') }}
shell: bash
run: |
set -euo pipefail
brew --version
git -C "$(brew --repo)" fetch origin
git -C "$(brew --repo)" checkout main
git -C "$(brew --repo)" reset --hard origin/main
export HOMEBREW_UPDATE_TO_TAG=0
brew update
brew upgrade
brew --version
# Some integration tests rely on DotSlash being installed.
# See https://github.com/openai/codex/pull/7617.
- name: Install DotSlash
uses: facebook/install-dotslash@v2
- name: Pre-fetch DotSlash artifacts
# The Bash wrapper is not available on Windows.
if: ${{ !startsWith(matrix.runner, 'windows') }}
shell: bash
run: |
set -euo pipefail
dotslash -- fetch exec-server/tests/suite/bash
- uses: dtolnay/rust-toolchain@1.90
with:
targets: ${{ matrix.target }}

View File

@@ -75,6 +75,7 @@ If you dont have the tool:
### Test assertions
- Tests should use pretty_assertions::assert_eq for clearer diffs. Import this at the top of the test module if it isn't already.
- Prefer deep equals comparisons whenever possible. Perform `assert_eq!()` on entire objects, rather than individual fields.
### Integration tests (core)

48
codex-rs/Cargo.lock generated
View File

@@ -1048,7 +1048,7 @@ dependencies = [
"pretty_assertions",
"regex-lite",
"serde_json",
"supports-color",
"supports-color 3.0.2",
"tempfile",
"tokio",
"toml",
@@ -1088,10 +1088,13 @@ dependencies = [
"codex-login",
"codex-tui",
"crossterm",
"owo-colors",
"pretty_assertions",
"ratatui",
"reqwest",
"serde",
"serde_json",
"supports-color 3.0.2",
"tokio",
"tokio-stream",
"tracing",
@@ -1237,7 +1240,7 @@ dependencies = [
"serde",
"serde_json",
"shlex",
"supports-color",
"supports-color 3.0.2",
"tempfile",
"tokio",
"tracing",
@@ -1253,10 +1256,14 @@ name = "codex-exec-server"
version = "0.0.0"
dependencies = [
"anyhow",
"assert_cmd",
"async-trait",
"clap",
"codex-core",
"codex-execpolicy",
"exec_server_test_support",
"libc",
"maplit",
"path-absolutize",
"pretty_assertions",
"rmcp",
@@ -1269,6 +1276,7 @@ dependencies = [
"tokio-util",
"tracing",
"tracing-subscriber",
"which",
]
[[package]]
@@ -1610,7 +1618,7 @@ dependencies = [
"shlex",
"strum 0.27.2",
"strum_macros 0.27.2",
"supports-color",
"supports-color 3.0.2",
"tempfile",
"textwrap 0.16.2",
"tokio",
@@ -2497,6 +2505,18 @@ dependencies = [
"pin-project-lite",
]
[[package]]
name = "exec_server_test_support"
version = "0.0.0"
dependencies = [
"anyhow",
"assert_cmd",
"codex-core",
"rmcp",
"serde_json",
"tokio",
]
[[package]]
name = "eyre"
version = "0.6.12"
@@ -2556,8 +2576,7 @@ dependencies = [
[[package]]
name = "filedescriptor"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e40758ed24c9b2eeb76c35fb0aebc66c626084edd827e07e1552279814c6682d"
source = "git+https://github.com/pakrym/wezterm?branch=PSUEDOCONSOLE_INHERIT_CURSOR#fe38df8409545a696909aa9a09e63438630f217d"
dependencies = [
"libc",
"thiserror 1.0.69",
@@ -4433,6 +4452,10 @@ name = "owo-colors"
version = "4.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "48dd4f4a2c8405440fd0462561f0e5806bd0f77e86f51c761481bdd4018b545e"
dependencies = [
"supports-color 2.1.0",
"supports-color 3.0.2",
]
[[package]]
name = "parking"
@@ -4631,8 +4654,7 @@ dependencies = [
[[package]]
name = "portable-pty"
version = "0.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4a596a2b3d2752d94f51fac2d4a96737b8705dddd311a32b9af47211f08671e"
source = "git+https://github.com/pakrym/wezterm?branch=PSUEDOCONSOLE_INHERIT_CURSOR#fe38df8409545a696909aa9a09e63438630f217d"
dependencies = [
"anyhow",
"bitflags 1.3.2",
@@ -4641,7 +4663,7 @@ dependencies = [
"lazy_static",
"libc",
"log",
"nix 0.28.0",
"nix 0.29.0",
"serial2",
"shared_library",
"shell-words",
@@ -6169,6 +6191,16 @@ version = "2.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "13c2bddecc57b384dee18652358fb23172facb8a2c51ccc10d74c157bdea3292"
[[package]]
name = "supports-color"
version = "2.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d6398cde53adc3c4557306a96ce67b302968513830a77a95b2b17305d9719a89"
dependencies = [
"is-terminal",
"is_ci",
]
[[package]]
name = "supports-color"
version = "3.0.2"

View File

@@ -96,6 +96,7 @@ codex-utils-readiness = { path = "utils/readiness" }
codex-utils-string = { path = "utils/string" }
codex-windows-sandbox = { path = "windows-sandbox-rs" }
core_test_support = { path = "core/tests/common" }
exec_server_test_support = { path = "exec-server/tests/common" }
mcp-types = { path = "mcp-types" }
mcp_test_support = { path = "mcp-server/tests/common" }
@@ -178,8 +179,8 @@ seccompiler = "0.5.0"
sentry = "0.34.0"
serde = "1"
serde_json = "1"
serde_yaml = "0.9"
serde_with = "3.16"
serde_yaml = "0.9"
serial_test = "3.2.0"
sha1 = "0.10.6"
sha2 = "0.10"
@@ -288,6 +289,7 @@ opt-level = 0
# Uncomment to debug local changes.
# ratatui = { path = "../../ratatui" }
crossterm = { git = "https://github.com/nornagon/crossterm", branch = "nornagon/color-query" }
portable-pty = { git = "https://github.com/pakrym/wezterm", branch = "PSUEDOCONSOLE_INHERIT_CURSOR" }
ratatui = { git = "https://github.com/nornagon/ratatui", branch = "nornagon-v0.29.0-patch" }
# Uncomment to debug local changes.

View File

@@ -1524,6 +1524,7 @@ pub struct RateLimitSnapshot {
pub primary: Option<RateLimitWindow>,
pub secondary: Option<RateLimitWindow>,
pub credits: Option<CreditsSnapshot>,
pub plan_type: Option<PlanType>,
}
impl From<CoreRateLimitSnapshot> for RateLimitSnapshot {
@@ -1532,6 +1533,7 @@ impl From<CoreRateLimitSnapshot> for RateLimitSnapshot {
primary: value.primary.map(RateLimitWindow::from),
secondary: value.secondary.map(RateLimitWindow::from),
credits: value.credits.map(CreditsSnapshot::from),
plan_type: value.plan_type,
}
}
}

View File

@@ -1499,6 +1499,7 @@ mod tests {
unlimited: false,
balance: Some("5".to_string()),
}),
plan_type: None,
};
handle_token_count_event(

View File

@@ -16,6 +16,9 @@ use tracing::warn;
use crate::error_code::INTERNAL_ERROR_CODE;
#[cfg(test)]
use codex_protocol::account::PlanType;
/// Sends messages to the client and manages request callbacks.
pub(crate) struct OutgoingMessageSender {
next_request_id: AtomicI64,
@@ -230,6 +233,7 @@ mod tests {
}),
secondary: None,
credits: None,
plan_type: Some(PlanType::Plus),
},
});
@@ -245,7 +249,8 @@ mod tests {
"resetsAt": 123
},
"secondary": null,
"credits": null
"credits": null,
"planType": "plus"
}
},
}),

View File

@@ -11,6 +11,7 @@ use codex_app_server_protocol::RateLimitSnapshot;
use codex_app_server_protocol::RateLimitWindow;
use codex_app_server_protocol::RequestId;
use codex_core::auth::AuthCredentialsStoreMode;
use codex_protocol::account::PlanType as AccountPlanType;
use pretty_assertions::assert_eq;
use serde_json::json;
use std::path::Path;
@@ -153,6 +154,7 @@ async fn get_account_rate_limits_returns_snapshot() -> Result<()> {
resets_at: Some(secondary_reset_timestamp),
}),
credits: None,
plan_type: Some(AccountPlanType::Pro),
},
};
assert_eq!(received, expected);

View File

@@ -699,13 +699,7 @@ fn derive_new_contents_from_chunks(
}
};
let mut original_lines: Vec<String> = original_contents.split('\n').map(String::from).collect();
// Drop the trailing empty element that results from the final newline so
// that line counts match the behaviour of standard `diff`.
if original_lines.last().is_some_and(String::is_empty) {
original_lines.pop();
}
let original_lines: Vec<String> = build_lines_from_contents(&original_contents);
let replacements = compute_replacements(&original_lines, path, chunks)?;
let new_lines = apply_replacements(original_lines, &replacements);
@@ -713,13 +707,67 @@ fn derive_new_contents_from_chunks(
if !new_lines.last().is_some_and(String::is_empty) {
new_lines.push(String::new());
}
let new_contents = new_lines.join("\n");
let new_contents = build_contents_from_lines(&original_contents, &new_lines);
Ok(AppliedPatch {
original_contents,
new_contents,
})
}
// TODO(dylan-hurd-oai): I think we can migrate to just use `contents.lines()`
// across all platforms.
fn build_lines_from_contents(contents: &str) -> Vec<String> {
if cfg!(windows) {
contents.lines().map(String::from).collect()
} else {
let mut lines: Vec<String> = contents.split('\n').map(String::from).collect();
// Drop the trailing empty element that results from the final newline so
// that line counts match the behaviour of standard `diff`.
if lines.last().is_some_and(String::is_empty) {
lines.pop();
}
lines
}
}
fn build_contents_from_lines(original_contents: &str, lines: &[String]) -> String {
if cfg!(windows) {
// for now, only compute this if we're on Windows.
let uses_crlf = contents_uses_crlf(original_contents);
if uses_crlf {
lines.join("\r\n")
} else {
lines.join("\n")
}
} else {
lines.join("\n")
}
}
/// Detects whether the source file uses Windows CRLF line endings consistently.
/// We only consider a file CRLF-formatted if every newline is part of a
/// CRLF sequence. This avoids rewriting an LF-formatted file that merely
/// contains embedded sequences of "\r\n".
///
/// Returns `true` if the file uses CRLF line endings, `false` otherwise.
fn contents_uses_crlf(contents: &str) -> bool {
let bytes = contents.as_bytes();
let mut n_newlines = 0usize;
let mut n_crlf = 0usize;
for i in 0..bytes.len() {
if bytes[i] == b'\n' {
n_newlines += 1;
if i > 0 && bytes[i - 1] == b'\r' {
n_crlf += 1;
}
}
}
n_newlines > 0 && n_crlf == n_newlines
}
/// Compute a list of replacements needed to transform `original_lines` into the
/// new lines, given the patch `chunks`. Each replacement is returned as
/// `(start_index, old_len, new_lines)`.
@@ -1359,6 +1407,72 @@ PATCH"#,
assert_eq!(contents, "a\nB\nc\nd\nE\nf\ng\n");
}
/// Ensure CRLF line endings are preserved for updated files on Windowsstyle inputs.
#[cfg(windows)]
#[test]
fn test_preserve_crlf_line_endings_on_update() {
let dir = tempdir().unwrap();
let path = dir.path().join("crlf.txt");
// Original file uses CRLF (\r\n) endings.
std::fs::write(&path, b"a\r\nb\r\nc\r\n").unwrap();
// Replace `b` -> `B` and append `d`.
let patch = wrap_patch(&format!(
r#"*** Update File: {}
@@
a
-b
+B
@@
c
+d
*** End of File"#,
path.display()
));
let mut stdout = Vec::new();
let mut stderr = Vec::new();
apply_patch(&patch, &mut stdout, &mut stderr).unwrap();
let out = std::fs::read(&path).unwrap();
// Expect all CRLF endings; count occurrences of CRLF and ensure there are 4 lines.
let content = String::from_utf8_lossy(&out);
assert!(content.contains("\r\n"));
// No bare LF occurrences immediately preceding a non-CR: the text should not contain "a\nb".
assert!(!content.contains("a\nb"));
// Validate exact content sequence with CRLF delimiters.
assert_eq!(content, "a\r\nB\r\nc\r\nd\r\n");
}
/// Ensure CRLF inputs with embedded carriage returns in the content are preserved.
#[cfg(windows)]
#[test]
fn test_preserve_crlf_embedded_carriage_returns_on_append() {
let dir = tempdir().unwrap();
let path = dir.path().join("crlf_cr_content.txt");
// Original file: first line has a literal '\r' in the content before the CRLF terminator.
std::fs::write(&path, b"foo\r\r\nbar\r\n").unwrap();
// Append a new line without modifying existing ones.
let patch = wrap_patch(&format!(
r#"*** Update File: {}
@@
+BAZ
*** End of File"#,
path.display()
));
let mut stdout = Vec::new();
let mut stderr = Vec::new();
apply_patch(&patch, &mut stdout, &mut stderr).unwrap();
let out = std::fs::read(&path).unwrap();
// CRLF endings must be preserved and the extra CR in "foo\r\r" must not be collapsed.
assert_eq!(out.as_slice(), b"foo\r\r\nbar\r\nBAZ\r\n");
}
#[test]
fn test_pure_addition_chunk_followed_by_removal() {
let dir = tempdir().unwrap();
@@ -1544,6 +1658,37 @@ PATCH"#,
assert_eq!(expected, diff);
}
/// For LF-only inputs with a trailing newline ensure that the helper used
/// on Windows-style builds drops the synthetic trailing empty element so
/// replacements behave like standard `diff` line numbering.
#[test]
fn test_derive_new_contents_lf_trailing_newline() {
let dir = tempdir().unwrap();
let path = dir.path().join("lf_trailing_newline.txt");
fs::write(&path, "foo\nbar\n").unwrap();
let patch = wrap_patch(&format!(
r#"*** Update File: {}
@@
foo
-bar
+BAR
"#,
path.display()
));
let patch = parse_patch(&patch).unwrap();
let chunks = match patch.hunks.as_slice() {
[Hunk::UpdateFile { chunks, .. }] => chunks,
_ => panic!("Expected a single UpdateFile hunk"),
};
let AppliedPatch { new_contents, .. } =
derive_new_contents_from_chunks(&path, chunks).unwrap();
assert_eq!(new_contents, "foo\nBAR\n");
}
#[test]
fn test_unified_diff_insert_at_eof() {
// Insert a new line at endoffile.

View File

@@ -0,0 +1 @@
** text eol=lf

View File

@@ -0,0 +1 @@
This is a new file

View File

@@ -0,0 +1,4 @@
*** Begin Patch
*** Add File: bar.md
+This is a new file
*** End Patch

View File

@@ -0,0 +1,2 @@
line1
changed

View File

@@ -0,0 +1 @@
obsolete

View File

@@ -0,0 +1,2 @@
line1
line2

View File

@@ -0,0 +1,9 @@
*** Begin Patch
*** Add File: nested/new.txt
+created
*** Delete File: delete.txt
*** Update File: modify.txt
@@
-line2
+changed
*** End Patch

View File

@@ -0,0 +1,4 @@
line1
changed2
line3
changed4

View File

@@ -0,0 +1,4 @@
line1
line2
line3
line4

View File

@@ -0,0 +1,9 @@
*** Begin Patch
*** Update File: multi.txt
@@
-line2
+changed2
@@
-line4
+changed4
*** End Patch

View File

@@ -0,0 +1 @@
unrelated file

View File

@@ -0,0 +1 @@
old content

View File

@@ -0,0 +1 @@
unrelated file

View File

@@ -0,0 +1,7 @@
*** Begin Patch
*** Update File: old/name.txt
*** Move to: renamed/dir/name.txt
@@
-old content
+new content
*** End Patch

View File

@@ -0,0 +1,2 @@
*** Begin Patch
*** End Patch

View File

@@ -0,0 +1,2 @@
line1
line2

View File

@@ -0,0 +1,2 @@
line1
line2

View File

@@ -0,0 +1,6 @@
*** Begin Patch
*** Update File: modify.txt
@@
-missing
+changed
*** End Patch

View File

@@ -0,0 +1,3 @@
*** Begin Patch
*** Delete File: missing.txt
*** End Patch

View File

@@ -0,0 +1,3 @@
*** Begin Patch
*** Update File: foo.txt
*** End Patch

View File

@@ -0,0 +1,6 @@
*** Begin Patch
*** Update File: missing.txt
@@
-old
+new
*** End Patch

View File

@@ -0,0 +1,7 @@
*** Begin Patch
*** Update File: old/name.txt
*** Move to: renamed/dir/name.txt
@@
-from
+new
*** End Patch

View File

@@ -0,0 +1,4 @@
*** Begin Patch
*** Add File: duplicate.txt
+new content
*** End Patch

View File

@@ -0,0 +1,3 @@
*** Begin Patch
*** Delete File: dir
*** End Patch

View File

@@ -0,0 +1,3 @@
*** Begin Patch
*** Frobnicate File: foo
*** End Patch

View File

@@ -0,0 +1,2 @@
first line
second line

View File

@@ -0,0 +1,7 @@
*** Begin Patch
*** Update File: no_newline.txt
@@
-no newline at end
+first line
+second line
*** End Patch

View File

@@ -0,0 +1,8 @@
*** Begin Patch
*** Add File: created.txt
+hello
*** Update File: missing.txt
@@
-old
+new
*** End Patch

View File

@@ -0,0 +1,4 @@
line1
line2
added line 1
added line 2

View File

@@ -0,0 +1,2 @@
line1
line2

View File

@@ -0,0 +1,6 @@
*** Begin Patch
*** Update File: input.txt
@@
+added line 1
+added line 2
*** End Patch

View File

@@ -0,0 +1,6 @@
*** Begin Patch
*** Update File: foo.txt
@@
-old
+new
*** End Patch

View File

@@ -0,0 +1,6 @@
*** Begin Patch
*** Update File: file.txt
@@
-one
+two
*** End Patch

View File

@@ -0,0 +1,18 @@
# Overview
This directory is a collection of end to end tests for the apply-patch specification, meant to be easily portable to other languages or platforms.
# Specification
Each test case is one directory, composed of input state (input/), the patch operation (patch.txt), and the expected final state (expected/). This structure is designed to keep tests simple (i.e. test exactly one patch at a time) while still providing enough flexibility to test any given operation across files.
Here's what this would look like for a simple test apply-patch test case to create a new file:
```
001_add/
input/
foo.md
expected/
foo.md
bar.md
patch.txt
```

View File

@@ -1,3 +1,4 @@
mod cli;
mod scenarios;
#[cfg(not(target_os = "windows"))]
mod tool;

View File

@@ -0,0 +1,114 @@
use assert_cmd::prelude::*;
use pretty_assertions::assert_eq;
use std::collections::BTreeMap;
use std::fs;
use std::path::Path;
use std::path::PathBuf;
use std::process::Command;
use tempfile::tempdir;
#[test]
fn test_apply_patch_scenarios() -> anyhow::Result<()> {
for scenario in fs::read_dir("tests/fixtures/scenarios")? {
let scenario = scenario?;
let path = scenario.path();
if path.is_dir() {
run_apply_patch_scenario(&path)?;
}
}
Ok(())
}
/// Reads a scenario directory, copies the input files to a temporary directory, runs apply-patch,
/// and asserts that the final state matches the expected state exactly.
fn run_apply_patch_scenario(dir: &Path) -> anyhow::Result<()> {
let tmp = tempdir()?;
// Copy the input files to the temporary directory
let input_dir = dir.join("input");
if input_dir.is_dir() {
copy_dir_recursive(&input_dir, tmp.path())?;
}
// Read the patch.txt file
let patch = fs::read_to_string(dir.join("patch.txt"))?;
// Run apply_patch in the temporary directory. We intentionally do not assert
// on the exit status here; the scenarios are specified purely in terms of
// final filesystem state, which we compare below.
Command::cargo_bin("apply_patch")?
.arg(patch)
.current_dir(tmp.path())
.output()?;
// Assert that the final state matches the expected state exactly
let expected_dir = dir.join("expected");
let expected_snapshot = snapshot_dir(&expected_dir)?;
let actual_snapshot = snapshot_dir(tmp.path())?;
assert_eq!(
actual_snapshot,
expected_snapshot,
"Scenario {} did not match expected final state",
dir.display()
);
Ok(())
}
#[derive(Debug, Clone, PartialEq, Eq)]
enum Entry {
File(Vec<u8>),
Dir,
}
fn snapshot_dir(root: &Path) -> anyhow::Result<BTreeMap<PathBuf, Entry>> {
let mut entries = BTreeMap::new();
if root.is_dir() {
snapshot_dir_recursive(root, root, &mut entries)?;
}
Ok(entries)
}
fn snapshot_dir_recursive(
base: &Path,
dir: &Path,
entries: &mut BTreeMap<PathBuf, Entry>,
) -> anyhow::Result<()> {
for entry in fs::read_dir(dir)? {
let entry = entry?;
let path = entry.path();
let Some(stripped) = path.strip_prefix(base).ok() else {
continue;
};
let rel = stripped.to_path_buf();
let file_type = entry.file_type()?;
if file_type.is_dir() {
entries.insert(rel.clone(), Entry::Dir);
snapshot_dir_recursive(base, &path, entries)?;
} else if file_type.is_file() {
let contents = fs::read(&path)?;
entries.insert(rel, Entry::File(contents));
}
}
Ok(())
}
fn copy_dir_recursive(src: &Path, dst: &Path) -> anyhow::Result<()> {
for entry in fs::read_dir(src)? {
let entry = entry?;
let path = entry.path();
let file_type = entry.file_type()?;
let dest_path = dst.join(entry.file_name());
if file_type.is_dir() {
fs::create_dir_all(&dest_path)?;
copy_dir_recursive(&path, &dest_path)?;
} else if file_type.is_file() {
if let Some(parent) = dest_path.parent() {
fs::create_dir_all(parent)?;
}
fs::copy(&path, &dest_path)?;
}
}
Ok(())
}

View File

@@ -7,6 +7,7 @@ use crate::types::TurnAttemptsSiblingTurnsResponse;
use anyhow::Result;
use codex_core::auth::CodexAuth;
use codex_core::default_client::get_codex_user_agent;
use codex_protocol::account::PlanType as AccountPlanType;
use codex_protocol::protocol::CreditsSnapshot;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow;
@@ -291,6 +292,7 @@ impl Client {
primary,
secondary,
credits: Self::map_credits(payload.credits),
plan_type: Some(Self::map_plan_type(payload.plan_type)),
}
}
@@ -325,6 +327,23 @@ impl Client {
})
}
fn map_plan_type(plan_type: crate::types::PlanType) -> AccountPlanType {
match plan_type {
crate::types::PlanType::Free => AccountPlanType::Free,
crate::types::PlanType::Plus => AccountPlanType::Plus,
crate::types::PlanType::Pro => AccountPlanType::Pro,
crate::types::PlanType::Team => AccountPlanType::Team,
crate::types::PlanType::Business => AccountPlanType::Business,
crate::types::PlanType::Enterprise => AccountPlanType::Enterprise,
crate::types::PlanType::Edu | crate::types::PlanType::Education => AccountPlanType::Edu,
crate::types::PlanType::Guest
| crate::types::PlanType::Go
| crate::types::PlanType::FreeWorkspace
| crate::types::PlanType::Quorum
| crate::types::PlanType::K12 => AccountPlanType::Unknown,
}
}
fn window_minutes_from_seconds(seconds: i32) -> Option<i64> {
if seconds <= 0 {
return None;

View File

@@ -127,6 +127,7 @@ impl Default for TaskText {
#[async_trait::async_trait]
pub trait CloudBackend: Send + Sync {
async fn list_tasks(&self, env: Option<&str>) -> Result<Vec<TaskSummary>>;
async fn get_task_summary(&self, id: TaskId) -> Result<TaskSummary>;
async fn get_task_diff(&self, id: TaskId) -> Result<Option<String>>;
/// Return assistant output messages (no diff) when available.
async fn get_task_messages(&self, id: TaskId) -> Result<Vec<String>>;

View File

@@ -63,6 +63,10 @@ impl CloudBackend for HttpClient {
self.tasks_api().list(env).await
}
async fn get_task_summary(&self, id: TaskId) -> Result<TaskSummary> {
self.tasks_api().summary(id).await
}
async fn get_task_diff(&self, id: TaskId) -> Result<Option<String>> {
self.tasks_api().diff(id).await
}
@@ -149,6 +153,75 @@ mod api {
Ok(tasks)
}
pub(crate) async fn summary(&self, id: TaskId) -> Result<TaskSummary> {
let id_str = id.0.clone();
let (details, body, ct) = self
.details_with_body(&id.0)
.await
.map_err(|e| CloudTaskError::Http(format!("get_task_details failed: {e}")))?;
let parsed: Value = serde_json::from_str(&body).map_err(|e| {
CloudTaskError::Http(format!(
"Decode error for {}: {e}; content-type={ct}; body={body}",
id.0
))
})?;
let task_obj = parsed
.get("task")
.and_then(Value::as_object)
.ok_or_else(|| {
CloudTaskError::Http(format!("Task metadata missing from details for {id_str}"))
})?;
let status_display = parsed
.get("task_status_display")
.or_else(|| task_obj.get("task_status_display"))
.and_then(Value::as_object)
.map(|m| {
m.iter()
.map(|(k, v)| (k.clone(), v.clone()))
.collect::<HashMap<String, Value>>()
});
let status = map_status(status_display.as_ref());
let mut summary = diff_summary_from_status_display(status_display.as_ref());
if summary.files_changed == 0
&& summary.lines_added == 0
&& summary.lines_removed == 0
&& let Some(diff) = details.unified_diff()
{
summary = diff_summary_from_diff(&diff);
}
let updated_at_raw = task_obj
.get("updated_at")
.and_then(Value::as_f64)
.or_else(|| task_obj.get("created_at").and_then(Value::as_f64))
.or_else(|| latest_turn_timestamp(status_display.as_ref()));
let environment_id = task_obj
.get("environment_id")
.and_then(Value::as_str)
.map(str::to_string);
let environment_label = env_label_from_status_display(status_display.as_ref());
let attempt_total = attempt_total_from_status_display(status_display.as_ref());
let title = task_obj
.get("title")
.and_then(Value::as_str)
.unwrap_or("<untitled>")
.to_string();
let is_review = task_obj
.get("is_review")
.and_then(Value::as_bool)
.unwrap_or(false);
Ok(TaskSummary {
id,
title,
status,
updated_at: parse_updated_at(updated_at_raw.as_ref()),
environment_id,
environment_label,
summary,
is_review,
attempt_total,
})
}
pub(crate) async fn diff(&self, id: TaskId) -> Result<Option<String>> {
let (details, body, ct) = self
.details_with_body(&id.0)
@@ -679,6 +752,34 @@ mod api {
.map(str::to_string)
}
fn diff_summary_from_diff(diff: &str) -> DiffSummary {
let mut files_changed = 0usize;
let mut lines_added = 0usize;
let mut lines_removed = 0usize;
for line in diff.lines() {
if line.starts_with("diff --git ") {
files_changed += 1;
continue;
}
if line.starts_with("+++") || line.starts_with("---") || line.starts_with("@@") {
continue;
}
match line.as_bytes().first() {
Some(b'+') => lines_added += 1,
Some(b'-') => lines_removed += 1,
_ => {}
}
}
if files_changed == 0 && !diff.trim().is_empty() {
files_changed = 1;
}
DiffSummary {
files_changed,
lines_added,
lines_removed,
}
}
fn diff_summary_from_status_display(v: Option<&HashMap<String, Value>>) -> DiffSummary {
let mut out = DiffSummary::default();
let Some(map) = v else { return out };
@@ -700,6 +801,17 @@ mod api {
out
}
fn latest_turn_timestamp(v: Option<&HashMap<String, Value>>) -> Option<f64> {
let map = v?;
let latest = map
.get("latest_turn_status_display")
.and_then(Value::as_object)?;
latest
.get("updated_at")
.or_else(|| latest.get("created_at"))
.and_then(Value::as_f64)
}
fn attempt_total_from_status_display(v: Option<&HashMap<String, Value>>) -> Option<usize> {
let map = v?;
let latest = map

View File

@@ -1,6 +1,7 @@
use crate::ApplyOutcome;
use crate::AttemptStatus;
use crate::CloudBackend;
use crate::CloudTaskError;
use crate::DiffSummary;
use crate::Result;
use crate::TaskId;
@@ -60,6 +61,14 @@ impl CloudBackend for MockClient {
Ok(out)
}
async fn get_task_summary(&self, id: TaskId) -> Result<TaskSummary> {
let tasks = self.list_tasks(None).await?;
tasks
.into_iter()
.find(|t| t.id == id)
.ok_or_else(|| CloudTaskError::Msg(format!("Task {} not found (mock)", id.0)))
}
async fn get_task_diff(&self, id: TaskId) -> Result<Option<String>> {
Ok(Some(mock_diff_for(&id)))
}

View File

@@ -34,6 +34,9 @@ tokio-stream = { workspace = true }
tracing = { workspace = true, features = ["log"] }
tracing-subscriber = { workspace = true, features = ["env-filter"] }
unicode-width = { workspace = true }
owo-colors = { workspace = true, features = ["supports-colors"] }
supports-color = { workspace = true }
[dev-dependencies]
async-trait = { workspace = true }
pretty_assertions = { workspace = true }

View File

@@ -350,6 +350,7 @@ pub enum AppEvent {
mod tests {
use super::*;
use chrono::Utc;
use codex_cloud_tasks_client::CloudTaskError;
struct FakeBackend {
// maps env key to titles
@@ -385,6 +386,17 @@ mod tests {
Ok(out)
}
async fn get_task_summary(
&self,
id: TaskId,
) -> codex_cloud_tasks_client::Result<TaskSummary> {
self.list_tasks(None)
.await?
.into_iter()
.find(|t| t.id == id)
.ok_or_else(|| CloudTaskError::Msg(format!("Task {} not found", id.0)))
}
async fn get_task_diff(
&self,
_id: TaskId,

View File

@@ -16,6 +16,12 @@ pub struct Cli {
pub enum Command {
/// Submit a new Codex Cloud task without launching the TUI.
Exec(ExecCommand),
/// Show the status of a Codex Cloud task.
Status(StatusCommand),
/// Apply the diff for a Codex Cloud task locally.
Apply(ApplyCommand),
/// Show the unified diff for a Codex Cloud task.
Diff(DiffCommand),
}
#[derive(Debug, Args)]
@@ -51,3 +57,32 @@ fn parse_attempts(input: &str) -> Result<usize, String> {
Err("attempts must be between 1 and 4".to_string())
}
}
#[derive(Debug, Args)]
pub struct StatusCommand {
/// Codex Cloud task identifier to inspect.
#[arg(value_name = "TASK_ID")]
pub task_id: String,
}
#[derive(Debug, Args)]
pub struct ApplyCommand {
/// Codex Cloud task identifier to apply.
#[arg(value_name = "TASK_ID")]
pub task_id: String,
/// Attempt number to apply (1-based).
#[arg(long = "attempt", value_parser = parse_attempts, value_name = "N")]
pub attempt: Option<usize>,
}
#[derive(Debug, Args)]
pub struct DiffCommand {
/// Codex Cloud task identifier to display.
#[arg(value_name = "TASK_ID")]
pub task_id: String,
/// Attempt number to display (1-based).
#[arg(long = "attempt", value_parser = parse_attempts, value_name = "N")]
pub attempt: Option<usize>,
}

View File

@@ -8,17 +8,24 @@ pub mod util;
pub use cli::Cli;
use anyhow::anyhow;
use chrono::Utc;
use codex_cloud_tasks_client::TaskStatus;
use codex_login::AuthManager;
use owo_colors::OwoColorize;
use owo_colors::Stream;
use std::cmp::Ordering;
use std::io::IsTerminal;
use std::io::Read;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
use std::time::Instant;
use supports_color::Stream as SupportStream;
use tokio::sync::mpsc::UnboundedSender;
use tracing::info;
use tracing_subscriber::EnvFilter;
use util::append_error_log;
use util::format_relative_time;
use util::set_user_agent_suffix;
struct ApplyJob {
@@ -193,6 +200,273 @@ fn resolve_query_input(query_arg: Option<String>) -> anyhow::Result<String> {
}
}
fn parse_task_id(raw: &str) -> anyhow::Result<codex_cloud_tasks_client::TaskId> {
let trimmed = raw.trim();
if trimmed.is_empty() {
anyhow::bail!("task id must not be empty");
}
let without_fragment = trimmed.split('#').next().unwrap_or(trimmed);
let without_query = without_fragment
.split('?')
.next()
.unwrap_or(without_fragment);
let id = without_query
.rsplit('/')
.next()
.unwrap_or(without_query)
.trim();
if id.is_empty() {
anyhow::bail!("task id must not be empty");
}
Ok(codex_cloud_tasks_client::TaskId(id.to_string()))
}
#[derive(Clone, Debug)]
struct AttemptDiffData {
placement: Option<i64>,
created_at: Option<chrono::DateTime<Utc>>,
diff: String,
}
fn cmp_attempt(lhs: &AttemptDiffData, rhs: &AttemptDiffData) -> Ordering {
match (lhs.placement, rhs.placement) {
(Some(a), Some(b)) => a.cmp(&b),
(Some(_), None) => Ordering::Less,
(None, Some(_)) => Ordering::Greater,
(None, None) => match (lhs.created_at, rhs.created_at) {
(Some(a), Some(b)) => a.cmp(&b),
(Some(_), None) => Ordering::Less,
(None, Some(_)) => Ordering::Greater,
(None, None) => Ordering::Equal,
},
}
}
async fn collect_attempt_diffs(
backend: &dyn codex_cloud_tasks_client::CloudBackend,
task_id: &codex_cloud_tasks_client::TaskId,
) -> anyhow::Result<Vec<AttemptDiffData>> {
let text =
codex_cloud_tasks_client::CloudBackend::get_task_text(backend, task_id.clone()).await?;
let mut attempts = Vec::new();
if let Some(diff) =
codex_cloud_tasks_client::CloudBackend::get_task_diff(backend, task_id.clone()).await?
{
attempts.push(AttemptDiffData {
placement: text.attempt_placement,
created_at: None,
diff,
});
}
if let Some(turn_id) = text.turn_id {
let siblings = codex_cloud_tasks_client::CloudBackend::list_sibling_attempts(
backend,
task_id.clone(),
turn_id,
)
.await?;
for sibling in siblings {
if let Some(diff) = sibling.diff {
attempts.push(AttemptDiffData {
placement: sibling.attempt_placement,
created_at: sibling.created_at,
diff,
});
}
}
}
attempts.sort_by(cmp_attempt);
if attempts.is_empty() {
anyhow::bail!(
"No diff available for task {}; it may still be running.",
task_id.0
);
}
Ok(attempts)
}
fn select_attempt(
attempts: &[AttemptDiffData],
attempt: Option<usize>,
) -> anyhow::Result<&AttemptDiffData> {
if attempts.is_empty() {
anyhow::bail!("No attempts available");
}
let desired = attempt.unwrap_or(1);
let idx = desired
.checked_sub(1)
.ok_or_else(|| anyhow!("attempt must be at least 1"))?;
if idx >= attempts.len() {
anyhow::bail!(
"Attempt {desired} not available; only {} attempt(s) found",
attempts.len()
);
}
Ok(&attempts[idx])
}
fn task_status_label(status: &TaskStatus) -> &'static str {
match status {
TaskStatus::Pending => "PENDING",
TaskStatus::Ready => "READY",
TaskStatus::Applied => "APPLIED",
TaskStatus::Error => "ERROR",
}
}
fn summary_line(summary: &codex_cloud_tasks_client::DiffSummary, colorize: bool) -> String {
if summary.files_changed == 0 && summary.lines_added == 0 && summary.lines_removed == 0 {
let base = "no diff";
return if colorize {
base.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string()
} else {
base.to_string()
};
}
let adds = summary.lines_added;
let dels = summary.lines_removed;
let files = summary.files_changed;
if colorize {
let adds_raw = format!("+{adds}");
let adds_str = adds_raw
.as_str()
.if_supports_color(Stream::Stdout, |t| t.green())
.to_string();
let dels_raw = format!("-{dels}");
let dels_str = dels_raw
.as_str()
.if_supports_color(Stream::Stdout, |t| t.red())
.to_string();
let bullet = ""
.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string();
let file_label = "file"
.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string();
let plural = if files == 1 { "" } else { "s" };
format!("{adds_str}/{dels_str} {bullet} {files} {file_label}{plural}")
} else {
format!(
"+{adds}/-{dels}{files} file{}",
if files == 1 { "" } else { "s" }
)
}
}
fn format_task_status_lines(
task: &codex_cloud_tasks_client::TaskSummary,
now: chrono::DateTime<Utc>,
colorize: bool,
) -> Vec<String> {
let mut lines = Vec::new();
let status = task_status_label(&task.status);
let status = if colorize {
match task.status {
TaskStatus::Ready => status
.if_supports_color(Stream::Stdout, |t| t.green())
.to_string(),
TaskStatus::Pending => status
.if_supports_color(Stream::Stdout, |t| t.magenta())
.to_string(),
TaskStatus::Applied => status
.if_supports_color(Stream::Stdout, |t| t.blue())
.to_string(),
TaskStatus::Error => status
.if_supports_color(Stream::Stdout, |t| t.red())
.to_string(),
}
} else {
status.to_string()
};
lines.push(format!("[{status}] {}", task.title));
let mut meta_parts = Vec::new();
if let Some(label) = task.environment_label.as_deref().filter(|s| !s.is_empty()) {
if colorize {
meta_parts.push(
label
.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string(),
);
} else {
meta_parts.push(label.to_string());
}
} else if let Some(id) = task.environment_id.as_deref() {
if colorize {
meta_parts.push(
id.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string(),
);
} else {
meta_parts.push(id.to_string());
}
}
let when = format_relative_time(now, task.updated_at);
meta_parts.push(if colorize {
when.as_str()
.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string()
} else {
when
});
let sep = if colorize {
""
.if_supports_color(Stream::Stdout, |t| t.dimmed())
.to_string()
} else {
"".to_string()
};
lines.push(meta_parts.join(&sep));
lines.push(summary_line(&task.summary, colorize));
lines
}
async fn run_status_command(args: crate::cli::StatusCommand) -> anyhow::Result<()> {
let ctx = init_backend("codex_cloud_tasks_status").await?;
let task_id = parse_task_id(&args.task_id)?;
let summary =
codex_cloud_tasks_client::CloudBackend::get_task_summary(&*ctx.backend, task_id).await?;
let now = Utc::now();
let colorize = supports_color::on(SupportStream::Stdout).is_some();
for line in format_task_status_lines(&summary, now, colorize) {
println!("{line}");
}
if !matches!(summary.status, TaskStatus::Ready) {
std::process::exit(1);
}
Ok(())
}
async fn run_diff_command(args: crate::cli::DiffCommand) -> anyhow::Result<()> {
let ctx = init_backend("codex_cloud_tasks_diff").await?;
let task_id = parse_task_id(&args.task_id)?;
let attempts = collect_attempt_diffs(&*ctx.backend, &task_id).await?;
let selected = select_attempt(&attempts, args.attempt)?;
print!("{}", selected.diff);
Ok(())
}
async fn run_apply_command(args: crate::cli::ApplyCommand) -> anyhow::Result<()> {
let ctx = init_backend("codex_cloud_tasks_apply").await?;
let task_id = parse_task_id(&args.task_id)?;
let attempts = collect_attempt_diffs(&*ctx.backend, &task_id).await?;
let selected = select_attempt(&attempts, args.attempt)?;
let outcome = codex_cloud_tasks_client::CloudBackend::apply_task(
&*ctx.backend,
task_id,
Some(selected.diff.clone()),
)
.await?;
println!("{}", outcome.message);
if !matches!(
outcome.status,
codex_cloud_tasks_client::ApplyStatus::Success
) {
std::process::exit(1);
}
Ok(())
}
fn level_from_status(status: codex_cloud_tasks_client::ApplyStatus) -> app::ApplyResultLevel {
match status {
codex_cloud_tasks_client::ApplyStatus::Success => app::ApplyResultLevel::Success,
@@ -322,6 +596,9 @@ pub async fn run_main(cli: Cli, _codex_linux_sandbox_exe: Option<PathBuf>) -> an
if let Some(command) = cli.command {
return match command {
crate::cli::Command::Exec(args) => run_exec_command(args).await,
crate::cli::Command::Status(args) => run_status_command(args).await,
crate::cli::Command::Apply(args) => run_apply_command(args).await,
crate::cli::Command::Diff(args) => run_diff_command(args).await,
};
}
let Cli { .. } = cli;
@@ -1713,14 +1990,111 @@ fn pretty_lines_from_error(raw: &str) -> Vec<String> {
#[cfg(test)]
mod tests {
use super::*;
use codex_cloud_tasks_client::DiffSummary;
use codex_cloud_tasks_client::MockClient;
use codex_cloud_tasks_client::TaskId;
use codex_cloud_tasks_client::TaskStatus;
use codex_cloud_tasks_client::TaskSummary;
use codex_tui::ComposerAction;
use codex_tui::ComposerInput;
use crossterm::event::KeyCode;
use crossterm::event::KeyEvent;
use crossterm::event::KeyModifiers;
use pretty_assertions::assert_eq;
use ratatui::buffer::Buffer;
use ratatui::layout::Rect;
#[test]
fn format_task_status_lines_with_diff_and_label() {
let now = Utc::now();
let task = TaskSummary {
id: TaskId("task_1".to_string()),
title: "Example task".to_string(),
status: TaskStatus::Ready,
updated_at: now,
environment_id: Some("env-1".to_string()),
environment_label: Some("Env".to_string()),
summary: DiffSummary {
files_changed: 3,
lines_added: 5,
lines_removed: 2,
},
is_review: false,
attempt_total: None,
};
let lines = format_task_status_lines(&task, now, false);
assert_eq!(
lines,
vec![
"[READY] Example task".to_string(),
"Env • 0s ago".to_string(),
"+5/-2 • 3 files".to_string(),
]
);
}
#[test]
fn format_task_status_lines_without_diff_falls_back() {
let now = Utc::now();
let task = TaskSummary {
id: TaskId("task_2".to_string()),
title: "No diff task".to_string(),
status: TaskStatus::Pending,
updated_at: now,
environment_id: Some("env-2".to_string()),
environment_label: None,
summary: DiffSummary::default(),
is_review: false,
attempt_total: Some(1),
};
let lines = format_task_status_lines(&task, now, false);
assert_eq!(
lines,
vec![
"[PENDING] No diff task".to_string(),
"env-2 • 0s ago".to_string(),
"no diff".to_string(),
]
);
}
#[tokio::test]
async fn collect_attempt_diffs_includes_sibling_attempts() {
let backend = MockClient;
let task_id = parse_task_id("https://chatgpt.com/codex/tasks/T-1000").expect("id");
let attempts = collect_attempt_diffs(&backend, &task_id)
.await
.expect("attempts");
assert_eq!(attempts.len(), 2);
assert_eq!(attempts[0].placement, Some(0));
assert_eq!(attempts[1].placement, Some(1));
assert!(!attempts[0].diff.is_empty());
assert!(!attempts[1].diff.is_empty());
}
#[test]
fn select_attempt_validates_bounds() {
let attempts = vec![AttemptDiffData {
placement: Some(0),
created_at: None,
diff: "diff --git a/file b/file\n".to_string(),
}];
let first = select_attempt(&attempts, Some(1)).expect("attempt 1");
assert_eq!(first.diff, "diff --git a/file b/file\n");
assert!(select_attempt(&attempts, Some(2)).is_err());
}
#[test]
fn parse_task_id_from_url_and_raw() {
let raw = parse_task_id("task_i_abc123").expect("raw id");
assert_eq!(raw.0, "task_i_abc123");
let url =
parse_task_id("https://chatgpt.com/codex/tasks/task_i_123456?foo=bar").expect("url id");
assert_eq!(url.0, "task_i_123456");
assert!(parse_task_id(" ").is_err());
}
#[test]
#[ignore = "very slow"]
fn composer_input_renders_typed_characters() {

View File

@@ -20,8 +20,7 @@ use std::time::Instant;
use crate::app::App;
use crate::app::AttemptView;
use chrono::Local;
use chrono::Utc;
use crate::util::format_relative_time_now;
use codex_cloud_tasks_client::AttemptStatus;
use codex_cloud_tasks_client::TaskStatus;
use codex_tui::render_markdown_text;
@@ -804,7 +803,7 @@ fn render_task_item(_app: &App, t: &codex_cloud_tasks_client::TaskSummary) -> Li
if let Some(lbl) = t.environment_label.as_ref().filter(|s| !s.is_empty()) {
meta.push(lbl.clone().dim());
}
let when = format_relative_time(t.updated_at).dim();
let when = format_relative_time_now(t.updated_at).dim();
if !meta.is_empty() {
meta.push(" ".into());
meta.push("".dim());
@@ -841,27 +840,6 @@ fn render_task_item(_app: &App, t: &codex_cloud_tasks_client::TaskSummary) -> Li
ListItem::new(vec![title, meta_line, sub, spacer])
}
fn format_relative_time(ts: chrono::DateTime<Utc>) -> String {
let now = Utc::now();
let mut secs = (now - ts).num_seconds();
if secs < 0 {
secs = 0;
}
if secs < 60 {
return format!("{secs}s ago");
}
let mins = secs / 60;
if mins < 60 {
return format!("{mins}m ago");
}
let hours = mins / 60;
if hours < 24 {
return format!("{hours}h ago");
}
let local = ts.with_timezone(&Local);
local.format("%b %e %H:%M").to_string()
}
fn draw_inline_spinner(
frame: &mut Frame,
area: Rect,

View File

@@ -1,4 +1,6 @@
use base64::Engine as _;
use chrono::DateTime;
use chrono::Local;
use chrono::Utc;
use reqwest::header::HeaderMap;
@@ -120,3 +122,27 @@ pub fn task_url(base_url: &str, task_id: &str) -> String {
}
format!("{normalized}/codex/tasks/{task_id}")
}
pub fn format_relative_time(reference: DateTime<Utc>, ts: DateTime<Utc>) -> String {
let mut secs = (reference - ts).num_seconds();
if secs < 0 {
secs = 0;
}
if secs < 60 {
return format!("{secs}s ago");
}
let mins = secs / 60;
if mins < 60 {
return format!("{mins}m ago");
}
let hours = mins / 60;
if hours < 24 {
return format!("{hours}h ago");
}
let local = ts.with_timezone(&Local);
local.format("%b %e %H:%M").to_string()
}
pub fn format_relative_time_now(ts: DateTime<Utc>) -> String {
format_relative_time(Utc::now(), ts)
}

View File

@@ -181,12 +181,13 @@ mod tests {
"display_name": "gpt-test",
"description": "desc",
"default_reasoning_level": "medium",
"supported_reasoning_levels": ["low", "medium", "high"],
"supported_reasoning_levels": [{"effort": "low", "description": "low"}, {"effort": "medium", "description": "medium"}, {"effort": "high", "description": "high"}],
"shell_type": "shell_command",
"visibility": "list",
"minimal_client_version": [0, 99, 0],
"supported_in_api": true,
"priority": 1
"priority": 1,
"upgrade": null,
}))
.unwrap(),
],

View File

@@ -37,6 +37,7 @@ pub fn parse_rate_limit(headers: &HeaderMap) -> Option<RateLimitSnapshot> {
primary,
secondary,
credits,
plan_type: None,
})
}

View File

@@ -10,6 +10,7 @@ use eventsource_stream::Eventsource;
use futures::Stream;
use futures::StreamExt;
use std::collections::HashMap;
use std::collections::HashSet;
use std::time::Duration;
use tokio::sync::mpsc;
use tokio::time::Instant;
@@ -41,12 +42,17 @@ pub async fn process_chat_sse<S>(
#[derive(Default, Debug)]
struct ToolCallState {
id: Option<String>,
name: Option<String>,
arguments: String,
}
let mut tool_calls: HashMap<String, ToolCallState> = HashMap::new();
let mut tool_call_order: Vec<String> = Vec::new();
let mut tool_calls: HashMap<usize, ToolCallState> = HashMap::new();
let mut tool_call_order: Vec<usize> = Vec::new();
let mut tool_call_order_seen: HashSet<usize> = HashSet::new();
let mut tool_call_index_by_id: HashMap<String, usize> = HashMap::new();
let mut next_tool_call_index = 0usize;
let mut last_tool_call_index: Option<usize> = None;
let mut assistant_item: Option<ResponseItem> = None;
let mut reasoning_item: Option<ResponseItem> = None;
let mut completed_sent = false;
@@ -149,26 +155,55 @@ pub async fn process_chat_sse<S>(
if let Some(tool_call_values) = delta.get("tool_calls").and_then(|c| c.as_array()) {
for tool_call in tool_call_values {
let id = tool_call
.get("id")
.and_then(|i| i.as_str())
.map(str::to_string)
.unwrap_or_else(|| format!("tool-call-{}", tool_call_order.len()));
let mut index = tool_call
.get("index")
.and_then(serde_json::Value::as_u64)
.map(|i| i as usize);
let call_state = tool_calls.entry(id.clone()).or_default();
if !tool_call_order.contains(&id) {
tool_call_order.push(id.clone());
let mut call_id_for_lookup = None;
if let Some(call_id) = tool_call.get("id").and_then(|i| i.as_str()) {
call_id_for_lookup = Some(call_id.to_string());
if let Some(existing) = tool_call_index_by_id.get(call_id) {
index = Some(*existing);
}
}
if index.is_none() && call_id_for_lookup.is_none() {
index = last_tool_call_index;
}
let index = index.unwrap_or_else(|| {
while tool_calls.contains_key(&next_tool_call_index) {
next_tool_call_index += 1;
}
let idx = next_tool_call_index;
next_tool_call_index += 1;
idx
});
let call_state = tool_calls.entry(index).or_default();
if tool_call_order_seen.insert(index) {
tool_call_order.push(index);
}
if let Some(id) = tool_call.get("id").and_then(|i| i.as_str()) {
call_state.id.get_or_insert_with(|| id.to_string());
tool_call_index_by_id.entry(id.to_string()).or_insert(index);
}
if let Some(func) = tool_call.get("function") {
if let Some(fname) = func.get("name").and_then(|n| n.as_str()) {
call_state.name = Some(fname.to_string());
if let Some(fname) = func.get("name").and_then(|n| n.as_str())
&& !fname.is_empty()
{
call_state.name.get_or_insert_with(|| fname.to_string());
}
if let Some(arguments) = func.get("arguments").and_then(|a| a.as_str())
{
call_state.arguments.push_str(arguments);
}
}
last_tool_call_index = Some(index);
}
}
}
@@ -222,13 +257,25 @@ pub async fn process_chat_sse<S>(
.await;
}
for call_id in tool_call_order.drain(..) {
let state = tool_calls.remove(&call_id).unwrap_or_default();
for index in tool_call_order.drain(..) {
let Some(state) = tool_calls.remove(&index) else {
continue;
};
tool_call_order_seen.remove(&index);
let ToolCallState {
id,
name,
arguments,
} = state;
let Some(name) = name else {
debug!("Skipping tool call at index {index} because name is missing");
continue;
};
let item = ResponseItem::FunctionCall {
id: None,
name: state.name.unwrap_or_default(),
arguments: state.arguments,
call_id: call_id.clone(),
name,
arguments,
call_id: id.unwrap_or_else(|| format!("tool-call-{index}")),
};
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
@@ -333,6 +380,59 @@ mod tests {
out
}
#[tokio::test]
async fn concatenates_tool_call_arguments_across_deltas() {
let delta_name = json!({
"choices": [{
"delta": {
"tool_calls": [{
"id": "call_a",
"index": 0,
"function": { "name": "do_a" }
}]
}
}]
});
let delta_args_1 = json!({
"choices": [{
"delta": {
"tool_calls": [{
"index": 0,
"function": { "arguments": "{ \"foo\":" }
}]
}
}]
});
let delta_args_2 = json!({
"choices": [{
"delta": {
"tool_calls": [{
"index": 0,
"function": { "arguments": "1}" }
}]
}
}]
});
let finish = json!({
"choices": [{
"finish_reason": "tool_calls"
}]
});
let body = build_body(&[delta_name, delta_args_1, delta_args_2, finish]);
let events = collect_events(&body).await;
assert_matches!(
&events[..],
[
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id, name, arguments, .. }),
ResponseEvent::Completed { .. }
] if call_id == "call_a" && name == "do_a" && arguments == "{ \"foo\":1}"
);
}
#[tokio::test]
async fn emits_multiple_tool_calls() {
let delta_a = json!({
@@ -365,50 +465,74 @@ mod tests {
let body = build_body(&[delta_a, delta_b, finish]);
let events = collect_events(&body).await;
assert_eq!(events.len(), 3);
assert_matches!(
&events[0],
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id, name, arguments, .. })
if call_id == "call_a" && name == "do_a" && arguments == "{\"foo\":1}"
&events[..],
[
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id: call_a, name: name_a, arguments: args_a, .. }),
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id: call_b, name: name_b, arguments: args_b, .. }),
ResponseEvent::Completed { .. }
] if call_a == "call_a" && name_a == "do_a" && args_a == "{\"foo\":1}" && call_b == "call_b" && name_b == "do_b" && args_b == "{\"bar\":2}"
);
assert_matches!(
&events[1],
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id, name, arguments, .. })
if call_id == "call_b" && name == "do_b" && arguments == "{\"bar\":2}"
);
assert_matches!(events[2], ResponseEvent::Completed { .. });
}
#[tokio::test]
async fn concatenates_tool_call_arguments_across_deltas() {
let delta_name = json!({
async fn emits_tool_calls_for_multiple_choices() {
let payload = json!({
"choices": [
{
"delta": {
"tool_calls": [{
"id": "call_a",
"index": 0,
"function": { "name": "do_a", "arguments": "{}" }
}]
},
"finish_reason": "tool_calls"
},
{
"delta": {
"tool_calls": [{
"id": "call_b",
"index": 0,
"function": { "name": "do_b", "arguments": "{}" }
}]
},
"finish_reason": "tool_calls"
}
]
});
let body = build_body(&[payload]);
let events = collect_events(&body).await;
assert_matches!(
&events[..],
[
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id: call_a, name: name_a, arguments: args_a, .. }),
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { call_id: call_b, name: name_b, arguments: args_b, .. }),
ResponseEvent::Completed { .. }
] if call_a == "call_a" && name_a == "do_a" && args_a == "{}" && call_b == "call_b" && name_b == "do_b" && args_b == "{}"
);
}
#[tokio::test]
async fn merges_tool_calls_by_index_when_id_missing_on_subsequent_deltas() {
let delta_with_id = json!({
"choices": [{
"delta": {
"tool_calls": [{
"index": 0,
"id": "call_a",
"function": { "name": "do_a" }
"function": { "name": "do_a", "arguments": "{ \"foo\":" }
}]
}
}]
});
let delta_args_1 = json!({
let delta_without_id = json!({
"choices": [{
"delta": {
"tool_calls": [{
"id": "call_a",
"function": { "arguments": "{ \"foo\":" }
}]
}
}]
});
let delta_args_2 = json!({
"choices": [{
"delta": {
"tool_calls": [{
"id": "call_a",
"index": 0,
"function": { "arguments": "1}" }
}]
}
@@ -421,7 +545,7 @@ mod tests {
}]
});
let body = build_body(&[delta_name, delta_args_1, delta_args_2, finish]);
let body = build_body(&[delta_with_id, delta_without_id, finish]);
let events = collect_events(&body).await;
assert_matches!(
&events[..],
@@ -432,6 +556,47 @@ mod tests {
);
}
#[tokio::test]
async fn preserves_tool_call_name_when_empty_deltas_arrive() {
let delta_with_name = json!({
"choices": [{
"delta": {
"tool_calls": [{
"id": "call_a",
"function": { "name": "do_a" }
}]
}
}]
});
let delta_with_empty_name = json!({
"choices": [{
"delta": {
"tool_calls": [{
"id": "call_a",
"function": { "name": "", "arguments": "{}" }
}]
}
}]
});
let finish = json!({
"choices": [{
"finish_reason": "tool_calls"
}]
});
let body = build_body(&[delta_with_name, delta_with_empty_name, finish]);
let events = collect_events(&body).await;
assert_matches!(
&events[..],
[
ResponseEvent::OutputItemDone(ResponseItem::FunctionCall { name, arguments, .. }),
ResponseEvent::Completed { .. }
] if name == "do_a" && arguments == "{}"
);
}
#[tokio::test]
async fn emits_tool_calls_even_when_content_and_reasoning_present() {
let delta_content_and_tools = json!({

View File

@@ -5,11 +5,12 @@ use codex_api::provider::RetryConfig;
use codex_api::provider::WireApi;
use codex_client::ReqwestTransport;
use codex_protocol::openai_models::ClientVersion;
use codex_protocol::openai_models::ConfigShellToolType;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelVisibility;
use codex_protocol::openai_models::ModelsResponse;
use codex_protocol::openai_models::ReasoningLevel;
use codex_protocol::openai_models::ShellType;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::openai_models::ReasoningEffortPreset;
use http::HeaderMap;
use http::Method;
use wiremock::Mock;
@@ -55,17 +56,27 @@ async fn models_client_hits_models_endpoint() {
slug: "gpt-test".to_string(),
display_name: "gpt-test".to_string(),
description: Some("desc".to_string()),
default_reasoning_level: ReasoningLevel::Medium,
default_reasoning_level: ReasoningEffort::Medium,
supported_reasoning_levels: vec![
ReasoningLevel::Low,
ReasoningLevel::Medium,
ReasoningLevel::High,
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: ReasoningEffort::Low.to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: ReasoningEffort::Medium.to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: ReasoningEffort::High.to_string(),
},
],
shell_type: ShellType::ShellCommand,
shell_type: ConfigShellToolType::ShellCommand,
visibility: ModelVisibility::List,
minimal_client_version: ClientVersion(0, 1, 0),
supported_in_api: true,
priority: 1,
upgrade: None,
}],
};

View File

@@ -90,6 +90,7 @@ wildmatch = { workspace = true }
[features]
deterministic_process_ids = []
test-support = []
[target.'cfg(target_os = "linux")'.dependencies]

View File

@@ -227,23 +227,6 @@ impl CodexAuth {
})
}
/// Raw plan string from the ID token (including unknown/new plan types).
pub fn raw_plan_type(&self) -> Option<String> {
self.get_plan_type().map(|plan| match plan {
InternalPlanType::Known(k) => format!("{k:?}"),
InternalPlanType::Unknown(raw) => raw,
})
}
/// Raw internal plan value from the ID token.
/// Exposes the underlying `token_data::PlanType` without mapping it to the
/// public `AccountPlanType`. Use this when downstream code needs to inspect
/// internal/unknown plan strings exactly as issued in the token.
pub(crate) fn get_plan_type(&self) -> Option<InternalPlanType> {
self.get_current_token_data()
.and_then(|t| t.id_token.chatgpt_plan_type)
}
fn get_current_auth_json(&self) -> Option<AuthDotJson> {
#[expect(clippy::unwrap_used)]
self.auth_dot_json.lock().unwrap().clone()
@@ -1041,10 +1024,6 @@ mod tests {
.expect("auth available");
pretty_assertions::assert_eq!(auth.account_plan_type(), Some(AccountPlanType::Pro));
pretty_assertions::assert_eq!(
auth.get_plan_type(),
Some(InternalPlanType::Known(InternalKnownPlan::Pro))
);
}
#[test]
@@ -1065,10 +1044,6 @@ mod tests {
.expect("auth available");
pretty_assertions::assert_eq!(auth.account_plan_type(), Some(AccountPlanType::Unknown));
pretty_assertions::assert_eq!(
auth.get_plan_type(),
Some(InternalPlanType::Unknown("mystery-tier".to_string()))
);
}
}

View File

@@ -11,8 +11,10 @@ use crate::compact;
use crate::compact::run_inline_auto_compact_task;
use crate::compact::should_use_remote_compact_task;
use crate::compact_remote::run_inline_remote_auto_compact_task;
use crate::exec_policy::load_exec_policy_for_features;
use crate::features::Feature;
use crate::features::Features;
use crate::openai_models::model_family::ModelFamily;
use crate::openai_models::models_manager::ModelsManager;
use crate::parse_command::parse_command;
use crate::parse_turn_item;
@@ -174,9 +176,10 @@ impl Codex {
let user_instructions = get_user_instructions(&config).await;
let exec_policy = crate::exec_policy::exec_policy_for(&config.features, &config.codex_home)
let exec_policy = load_exec_policy_for_features(&config.features, &config.codex_home)
.await
.map_err(|err| CodexErr::Fatal(format!("failed to load execpolicy: {err}")))?;
let exec_policy = Arc::new(RwLock::new(exec_policy));
let config = Arc::new(config);
@@ -396,35 +399,39 @@ pub(crate) struct SessionSettingsUpdate {
}
impl Session {
fn make_turn_context(
auth_manager: Option<Arc<AuthManager>>,
models_manager: Arc<ModelsManager>,
otel_event_manager: &OtelEventManager,
provider: ModelProviderInfo,
session_configuration: &SessionConfiguration,
conversation_id: ConversationId,
sub_id: String,
) -> TurnContext {
fn build_per_turn_config(session_configuration: &SessionConfiguration) -> Config {
let config = session_configuration.original_config_do_not_use.clone();
let features = &config.features;
let mut per_turn_config = (*config).clone();
per_turn_config.model = session_configuration.model.clone();
per_turn_config.model_reasoning_effort = session_configuration.model_reasoning_effort;
per_turn_config.model_reasoning_summary = session_configuration.model_reasoning_summary;
per_turn_config.features = features.clone();
let model_family =
models_manager.construct_model_family(&per_turn_config.model, &per_turn_config);
per_turn_config.features = config.features.clone();
per_turn_config
}
#[allow(clippy::too_many_arguments)]
fn make_turn_context(
auth_manager: Option<Arc<AuthManager>>,
otel_event_manager: &OtelEventManager,
provider: ModelProviderInfo,
session_configuration: &SessionConfiguration,
mut per_turn_config: Config,
model_family: ModelFamily,
conversation_id: ConversationId,
sub_id: String,
) -> TurnContext {
if let Some(model_info) = get_model_info(&model_family) {
per_turn_config.model_context_window = Some(model_info.context_window);
}
let otel_event_manager = otel_event_manager.clone().with_model(
session_configuration.model.as_str(),
session_configuration.model.as_str(),
model_family.slug.as_str(),
);
let per_turn_config = Arc::new(per_turn_config);
let client = ModelClient::new(
Arc::new(per_turn_config.clone()),
per_turn_config.clone(),
auth_manager,
model_family.clone(),
otel_event_manager,
@@ -437,7 +444,7 @@ impl Session {
let tools_config = ToolsConfig::new(&ToolsConfigParams {
model_family: &model_family,
features,
features: &per_turn_config.features,
});
TurnContext {
@@ -450,14 +457,14 @@ impl Session {
user_instructions: session_configuration.user_instructions.clone(),
approval_policy: session_configuration.approval_policy,
sandbox_policy: session_configuration.sandbox_policy.clone(),
shell_environment_policy: config.shell_environment_policy.clone(),
shell_environment_policy: per_turn_config.shell_environment_policy.clone(),
tools_config,
final_output_json_schema: None,
codex_linux_sandbox_exe: config.codex_linux_sandbox_exe.clone(),
codex_linux_sandbox_exe: per_turn_config.codex_linux_sandbox_exe.clone(),
tool_call_gate: Arc::new(ReadinessFlag::new()),
exec_policy: session_configuration.exec_policy.clone(),
truncation_policy: TruncationPolicy::new(
&per_turn_config,
per_turn_config.as_ref(),
model_family.truncation_policy,
),
}
@@ -529,7 +536,7 @@ impl Session {
for (alias, feature) in config.features.legacy_feature_usages() {
let canonical = feature.key();
let summary = format!("`{alias}` is deprecated. Use `{canonical}` instead.");
let summary = format!("`{alias}` is deprecated. Use `[features].{canonical}` instead.");
let details = if alias == canonical {
None
} else {
@@ -543,7 +550,9 @@ impl Session {
});
}
let model_family = models_manager.construct_model_family(&config.model, &config);
let model_family = models_manager
.construct_model_family(&config.model, &config)
.await;
// todo(aibrahim): why are we passing model here while it can change?
let otel_event_manager = OtelEventManager::new(
conversation_id,
@@ -766,12 +775,19 @@ impl Session {
session_configuration
};
let per_turn_config = Self::build_per_turn_config(&session_configuration);
let model_family = self
.services
.models_manager
.construct_model_family(&per_turn_config.model, &per_turn_config)
.await;
let mut turn_context: TurnContext = Self::make_turn_context(
Some(Arc::clone(&self.services.auth_manager)),
Arc::clone(&self.services.models_manager),
&self.services.otel_event_manager,
session_configuration.provider.clone(),
&session_configuration,
per_turn_config,
model_family,
self.conversation_id,
sub_id,
);
@@ -1454,6 +1470,16 @@ async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiv
let mut previous_context: Option<Arc<TurnContext>> =
Some(sess.new_turn(SessionSettingsUpdate::default()).await);
if config.features.enabled(Feature::RemoteModels)
&& let Err(err) = sess
.services
.models_manager
.refresh_available_models(&config.model_provider)
.await
{
error!("failed to refresh available models: {err}");
}
// To break out of this loop, send Op::Shutdown.
while let Ok(sub) = rx_sub.recv().await {
debug!(?sub, "Submission");
@@ -1905,7 +1931,8 @@ async fn spawn_review_thread(
let review_model_family = sess
.services
.models_manager
.construct_model_family(&model, &config);
.construct_model_family(&model, &config)
.await;
// For reviews, disable web_search and view_image regardless of global settings.
let mut review_features = sess.features.clone();
review_features
@@ -2473,6 +2500,7 @@ pub(crate) use tests::make_session_and_context_with_rx;
#[cfg(test)]
mod tests {
use super::*;
use crate::CodexAuth;
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use crate::exec::ExecToolCallOutput;
@@ -2595,6 +2623,7 @@ mod tests {
unlimited: false,
balance: Some("10.00".to_string()),
}),
plan_type: Some(codex_protocol::account::PlanType::Plus),
};
state.set_rate_limits(initial.clone());
@@ -2610,6 +2639,7 @@ mod tests {
resets_at: Some(1_900),
}),
credits: None,
plan_type: None,
};
state.set_rate_limits(update.clone());
@@ -2619,6 +2649,78 @@ mod tests {
primary: update.primary.clone(),
secondary: update.secondary,
credits: initial.credits,
plan_type: initial.plan_type,
})
);
}
#[test]
fn set_rate_limits_updates_plan_type_when_present() {
let codex_home = tempfile::tempdir().expect("create temp dir");
let config = Config::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
codex_home.path().to_path_buf(),
)
.expect("load default test config");
let config = Arc::new(config);
let session_configuration = SessionConfiguration {
provider: config.model_provider.clone(),
model: config.model.clone(),
model_reasoning_effort: config.model_reasoning_effort,
model_reasoning_summary: config.model_reasoning_summary,
developer_instructions: config.developer_instructions.clone(),
user_instructions: config.user_instructions.clone(),
base_instructions: config.base_instructions.clone(),
compact_prompt: config.compact_prompt.clone(),
approval_policy: config.approval_policy,
sandbox_policy: config.sandbox_policy.clone(),
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
exec_policy: Arc::new(RwLock::new(ExecPolicy::empty())),
session_source: SessionSource::Exec,
};
let mut state = SessionState::new(session_configuration);
let initial = RateLimitSnapshot {
primary: Some(RateLimitWindow {
used_percent: 15.0,
window_minutes: Some(20),
resets_at: Some(1_600),
}),
secondary: Some(RateLimitWindow {
used_percent: 5.0,
window_minutes: Some(45),
resets_at: Some(1_650),
}),
credits: Some(CreditsSnapshot {
has_credits: true,
unlimited: false,
balance: Some("15.00".to_string()),
}),
plan_type: Some(codex_protocol::account::PlanType::Plus),
};
state.set_rate_limits(initial.clone());
let update = RateLimitSnapshot {
primary: Some(RateLimitWindow {
used_percent: 35.0,
window_minutes: Some(25),
resets_at: Some(1_700),
}),
secondary: None,
credits: None,
plan_type: Some(codex_protocol::account::PlanType::Pro),
};
state.set_rate_limits(update.clone());
assert_eq!(
state.latest_rate_limits,
Some(RateLimitSnapshot {
primary: update.primary,
secondary: update.secondary,
credits: initial.credits,
plan_type: update.plan_type,
})
);
}
@@ -2735,15 +2837,12 @@ mod tests {
fn otel_event_manager(
conversation_id: ConversationId,
config: &Config,
models_manager: &ModelsManager,
model_family: &ModelFamily,
) -> OtelEventManager {
OtelEventManager::new(
conversation_id,
config.model.as_str(),
models_manager
.construct_model_family(&config.model, config)
.slug
.as_str(),
model_family.slug.as_str(),
None,
Some("test@test.com".to_string()),
Some(AuthMode::ChatGPT),
@@ -2763,15 +2862,9 @@ mod tests {
.expect("load default test config");
let config = Arc::new(config);
let conversation_id = ConversationId::default();
let auth_manager = AuthManager::shared(
config.cwd.clone(),
false,
config.cli_auth_credentials_store_mode,
);
let models_manager = Arc::new(ModelsManager::new(auth_manager.get_auth_mode()));
let otel_event_manager =
otel_event_manager(conversation_id, config.as_ref(), &models_manager);
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let models_manager = Arc::new(ModelsManager::new(auth_manager.clone()));
let session_configuration = SessionConfiguration {
provider: config.model_provider.clone(),
model: config.model.clone(),
@@ -2788,6 +2881,11 @@ mod tests {
exec_policy: Arc::new(RwLock::new(ExecPolicy::empty())),
session_source: SessionSource::Exec,
};
let per_turn_config = Session::build_per_turn_config(&session_configuration);
let model_family =
ModelsManager::construct_model_family_offline(&per_turn_config.model, &per_turn_config);
let otel_event_manager =
otel_event_manager(conversation_id, config.as_ref(), &model_family);
let state = SessionState::new(session_configuration.clone());
@@ -2799,18 +2897,19 @@ mod tests {
rollout: Mutex::new(None),
user_shell: default_user_shell(),
show_raw_agent_reasoning: config.show_raw_agent_reasoning,
auth_manager: Arc::clone(&auth_manager),
auth_manager: auth_manager.clone(),
otel_event_manager: otel_event_manager.clone(),
models_manager: models_manager.clone(),
models_manager,
tool_approvals: Mutex::new(ApprovalStore::default()),
};
let turn_context = Session::make_turn_context(
Some(Arc::clone(&auth_manager)),
models_manager,
&otel_event_manager,
session_configuration.provider.clone(),
&session_configuration,
per_turn_config,
model_family,
conversation_id,
"turn_id".to_string(),
);
@@ -2845,15 +2944,9 @@ mod tests {
.expect("load default test config");
let config = Arc::new(config);
let conversation_id = ConversationId::default();
let auth_manager = AuthManager::shared(
config.cwd.clone(),
false,
config.cli_auth_credentials_store_mode,
);
let models_manager = Arc::new(ModelsManager::new(auth_manager.get_auth_mode()));
let otel_event_manager =
otel_event_manager(conversation_id, config.as_ref(), &models_manager);
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let models_manager = Arc::new(ModelsManager::new(auth_manager.clone()));
let session_configuration = SessionConfiguration {
provider: config.model_provider.clone(),
model: config.model.clone(),
@@ -2870,6 +2963,11 @@ mod tests {
exec_policy: Arc::new(RwLock::new(ExecPolicy::empty())),
session_source: SessionSource::Exec,
};
let per_turn_config = Session::build_per_turn_config(&session_configuration);
let model_family =
ModelsManager::construct_model_family_offline(&per_turn_config.model, &per_turn_config);
let otel_event_manager =
otel_event_manager(conversation_id, config.as_ref(), &model_family);
let state = SessionState::new(session_configuration.clone());
@@ -2883,16 +2981,17 @@ mod tests {
show_raw_agent_reasoning: config.show_raw_agent_reasoning,
auth_manager: Arc::clone(&auth_manager),
otel_event_manager: otel_event_manager.clone(),
models_manager: models_manager.clone(),
models_manager,
tool_approvals: Mutex::new(ApprovalStore::default()),
};
let turn_context = Arc::new(Session::make_turn_context(
Some(Arc::clone(&auth_manager)),
models_manager,
&otel_event_manager,
session_configuration.provider.clone(),
&session_configuration,
per_turn_config,
model_family,
conversation_id,
"turn_id".to_string(),
));

View File

@@ -87,6 +87,7 @@ impl ContextManager {
let items_tokens = self.items.iter().fold(0i64, |acc, item| {
acc + match item {
ResponseItem::GhostSnapshot { .. } => 0,
ResponseItem::Reasoning {
encrypted_content: Some(content),
..

View File

@@ -47,7 +47,7 @@ impl ConversationManager {
conversations: Arc::new(RwLock::new(HashMap::new())),
auth_manager: auth_manager.clone(),
session_source,
models_manager: Arc::new(ModelsManager::new(auth_manager.get_auth_mode())),
models_manager: Arc::new(ModelsManager::new(auth_manager)),
}
}

View File

@@ -560,6 +560,7 @@ mod tests {
resets_at: Some(secondary_reset_at),
}),
credits: None,
plan_type: None,
}
}

View File

@@ -73,14 +73,18 @@ pub enum ExecPolicyUpdateError {
FeatureDisabled,
}
pub(crate) async fn exec_policy_for(
pub(crate) async fn load_exec_policy_for_features(
features: &Features,
codex_home: &Path,
) -> Result<Arc<RwLock<Policy>>, ExecPolicyError> {
) -> Result<Policy, ExecPolicyError> {
if !features.enabled(Feature::ExecPolicy) {
return Ok(Arc::new(RwLock::new(Policy::empty())));
Ok(Policy::empty())
} else {
load_exec_policy(codex_home).await
}
}
pub async fn load_exec_policy(codex_home: &Path) -> Result<Policy, ExecPolicyError> {
let policy_dir = codex_home.join(POLICY_DIR_NAME);
let policy_paths = collect_policy_files(&policy_dir).await?;
@@ -102,7 +106,7 @@ pub(crate) async fn exec_policy_for(
})?;
}
let policy = Arc::new(RwLock::new(parser.build()));
let policy = parser.build();
tracing::debug!(
"loaded execpolicy from {} files in {}",
policy_paths.len(),
@@ -306,7 +310,7 @@ mod tests {
features.disable(Feature::ExecPolicy);
let temp_dir = tempdir().expect("create temp dir");
let policy = exec_policy_for(&features, temp_dir.path())
let policy = load_exec_policy_for_features(&features, temp_dir.path())
.await
.expect("policy result");
@@ -319,10 +323,7 @@ mod tests {
decision: Decision::Allow
}],
},
policy
.read()
.await
.check_multiple(commands.iter(), &|_| Decision::Allow)
policy.check_multiple(commands.iter(), &|_| Decision::Allow)
);
assert!(!temp_dir.path().join(POLICY_DIR_NAME).exists());
}
@@ -350,7 +351,7 @@ mod tests {
)
.expect("write policy file");
let policy = exec_policy_for(&Features::with_defaults(), temp_dir.path())
let policy = load_exec_policy(temp_dir.path())
.await
.expect("policy result");
let command = [vec!["rm".to_string()]];
@@ -362,10 +363,7 @@ mod tests {
decision: Decision::Forbidden
}],
},
policy
.read()
.await
.check_multiple(command.iter(), &|_| Decision::Allow)
policy.check_multiple(command.iter(), &|_| Decision::Allow)
);
}
@@ -378,7 +376,7 @@ mod tests {
)
.expect("write policy file");
let policy = exec_policy_for(&Features::with_defaults(), temp_dir.path())
let policy = load_exec_policy(temp_dir.path())
.await
.expect("policy result");
let command = [vec!["ls".to_string()]];
@@ -390,10 +388,7 @@ mod tests {
decision: Decision::Allow
}],
},
policy
.read()
.await
.check_multiple(command.iter(), &|_| Decision::Allow)
policy.check_multiple(command.iter(), &|_| Decision::Allow)
);
}

View File

@@ -40,6 +40,8 @@ pub enum Feature {
// Experimental
/// Use the single unified PTY-backed exec tool.
UnifiedExec,
/// Use the unified exec wrapper tool with named sessions.
UnifiedExecWrapper,
/// Enable experimental RMCP features such as OAuth login.
RmcpClient,
/// Include the freeform apply_patch tool.
@@ -54,6 +56,8 @@ pub enum Feature {
WindowsSandbox,
/// Remote compaction enabled (only for ChatGPT auth)
RemoteCompaction,
/// Refresh remote models and emit AppReady once the list is available.
RemoteModels,
/// Allow model to call multiple tools in parallel (only for models supporting it).
ParallelToolCalls,
/// Experimental skills injection (CLI flag-driven).
@@ -291,6 +295,12 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::UnifiedExecWrapper,
key: "unified_exec_wrapper",
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::RmcpClient,
key: "rmcp_client",
@@ -333,6 +343,12 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Experimental,
default_enabled: true,
},
FeatureSpec {
id: Feature::RemoteModels,
key: "remote_models",
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::ParallelToolCalls,
key: "parallel",

View File

@@ -97,7 +97,10 @@ mod user_shell_command;
pub mod util;
pub use apply_patch::CODEX_APPLY_PATCH_ARG1;
pub use command_safety::is_dangerous_command;
pub use command_safety::is_safe_command;
pub use exec_policy::ExecPolicyError;
pub use exec_policy::load_exec_policy;
pub use safety::get_platform_sandbox;
pub use safety::set_windows_sandbox_enabled;
// Re-export the protocol types from the standalone `codex-protocol` crate so existing

View File

@@ -1,11 +1,12 @@
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ReasoningEffort;
use crate::config::Config;
use crate::config::types::ReasoningSummaryFormat;
use crate::tools::handlers::apply_patch::ApplyPatchToolType;
use crate::tools::spec::ConfigShellToolType;
use crate::truncate::TruncationPolicy;
use codex_protocol::openai_models::ConfigShellToolType;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
@@ -83,6 +84,15 @@ impl ModelFamily {
}
self
}
pub fn with_remote_overrides(mut self, remote_models: Vec<ModelInfo>) -> Self {
for model in remote_models {
if model.slug == self.slug {
self.default_reasoning_effort = Some(model.default_reasoning_level);
self.shell_type = model.shell_type;
}
}
self
}
}
macro_rules! model_family {
@@ -275,3 +285,76 @@ fn derive_default_model_family(model: &str) -> ModelFamily {
truncation_policy: TruncationPolicy::Bytes(10_000),
}
}
#[cfg(test)]
mod tests {
use super::*;
use codex_protocol::openai_models::ClientVersion;
use codex_protocol::openai_models::ModelVisibility;
use codex_protocol::openai_models::ReasoningEffortPreset;
fn remote(slug: &str, effort: ReasoningEffort, shell: ConfigShellToolType) -> ModelInfo {
ModelInfo {
slug: slug.to_string(),
display_name: slug.to_string(),
description: Some(format!("{slug} desc")),
default_reasoning_level: effort,
supported_reasoning_levels: vec![ReasoningEffortPreset {
effort,
description: effort.to_string(),
}],
shell_type: shell,
visibility: ModelVisibility::List,
minimal_client_version: ClientVersion(0, 1, 0),
supported_in_api: true,
priority: 1,
upgrade: None,
}
}
#[test]
fn remote_overrides_apply_when_slug_matches() {
let family = model_family!("gpt-4o-mini", "gpt-4o-mini");
assert_ne!(family.default_reasoning_effort, Some(ReasoningEffort::High));
let updated = family.with_remote_overrides(vec![
remote(
"gpt-4o-mini",
ReasoningEffort::High,
ConfigShellToolType::ShellCommand,
),
remote(
"other-model",
ReasoningEffort::Low,
ConfigShellToolType::UnifiedExec,
),
]);
assert_eq!(
updated.default_reasoning_effort,
Some(ReasoningEffort::High)
);
assert_eq!(updated.shell_type, ConfigShellToolType::ShellCommand);
}
#[test]
fn remote_overrides_skip_non_matching_models() {
let family = model_family!(
"codex-mini-latest",
"codex-mini-latest",
shell_type: ConfigShellToolType::Local
);
let updated = family.clone().with_remote_overrides(vec![remote(
"other",
ReasoningEffort::High,
ConfigShellToolType::ShellCommand,
)]);
assert_eq!(
updated.default_reasoning_effort,
family.default_reasoning_effort
);
assert_eq!(updated.shell_type, family.shell_type);
}
}

View File

@@ -1,34 +1,187 @@
use codex_app_server_protocol::AuthMode;
use codex_api::ModelsClient;
use codex_api::ReqwestTransport;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelPreset;
use http::HeaderMap;
use std::sync::Arc;
use tokio::sync::RwLock;
use crate::api_bridge::auth_provider_from_auth;
use crate::api_bridge::map_api_error;
use crate::auth::AuthManager;
use crate::config::Config;
use crate::default_client::build_reqwest_client;
use crate::error::Result as CoreResult;
use crate::model_provider_info::ModelProviderInfo;
use crate::openai_models::model_family::ModelFamily;
use crate::openai_models::model_family::find_family_for_model;
use crate::openai_models::model_presets::builtin_model_presets;
#[derive(Debug)]
pub struct ModelsManager {
// todo(aibrahim) merge available_models and model family creation into one struct
pub available_models: RwLock<Vec<ModelPreset>>,
pub remote_models: RwLock<Vec<ModelInfo>>,
pub etag: String,
pub auth_mode: Option<AuthMode>,
pub auth_manager: Arc<AuthManager>,
}
impl ModelsManager {
pub fn new(auth_mode: Option<AuthMode>) -> Self {
pub fn new(auth_manager: Arc<AuthManager>) -> Self {
Self {
available_models: RwLock::new(builtin_model_presets(auth_mode)),
available_models: RwLock::new(builtin_model_presets(auth_manager.get_auth_mode())),
remote_models: RwLock::new(Vec::new()),
etag: String::new(),
auth_mode,
auth_manager,
}
}
pub async fn refresh_available_models(&self) {
let models = builtin_model_presets(self.auth_mode);
*self.available_models.write().await = models;
pub async fn refresh_available_models(
&self,
provider: &ModelProviderInfo,
) -> CoreResult<Vec<ModelInfo>> {
let auth = self.auth_manager.auth();
let api_provider = provider.to_api_provider(auth.as_ref().map(|auth| auth.mode))?;
let api_auth = auth_provider_from_auth(auth.clone(), provider).await?;
let transport = ReqwestTransport::new(build_reqwest_client());
let client = ModelsClient::new(transport, api_provider, api_auth);
let mut client_version = env!("CARGO_PKG_VERSION");
if client_version == "0.0.0" {
client_version = "99.99.99";
}
let response = client
.list_models(client_version, HeaderMap::new())
.await
.map_err(map_api_error)?;
let models = response.models;
*self.remote_models.write().await = models.clone();
let available_models = self.build_available_models().await;
{
let mut available_models_guard = self.available_models.write().await;
*available_models_guard = available_models;
}
Ok(models)
}
pub fn construct_model_family(&self, model: &str, config: &Config) -> ModelFamily {
pub async fn construct_model_family(&self, model: &str, config: &Config) -> ModelFamily {
find_family_for_model(model)
.with_config_overrides(config)
.with_remote_overrides(self.remote_models.read().await.clone())
}
#[cfg(any(test, feature = "test-support"))]
pub fn construct_model_family_offline(model: &str, config: &Config) -> ModelFamily {
find_family_for_model(model).with_config_overrides(config)
}
async fn build_available_models(&self) -> Vec<ModelPreset> {
let mut available_models = self.remote_models.read().await.clone();
available_models.sort_by(|a, b| b.priority.cmp(&a.priority));
let mut model_presets: Vec<ModelPreset> = available_models
.into_iter()
.map(Into::into)
.filter(|preset: &ModelPreset| preset.show_in_picker)
.collect();
if let Some(default) = model_presets.first_mut() {
default.is_default = true;
}
model_presets
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::CodexAuth;
use crate::model_provider_info::WireApi;
use codex_protocol::openai_models::ModelsResponse;
use serde_json::json;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
use wiremock::matchers::method;
use wiremock::matchers::path;
fn remote_model(slug: &str, display: &str, priority: i32) -> ModelInfo {
serde_json::from_value(json!({
"slug": slug,
"display_name": display,
"description": format!("{display} desc"),
"default_reasoning_level": "medium",
"supported_reasoning_levels": [{"effort": "low", "description": "low"}, {"effort": "medium", "description": "medium"}],
"shell_type": "shell_command",
"visibility": "list",
"minimal_client_version": [0, 1, 0],
"supported_in_api": true,
"priority": priority,
"upgrade": null,
}))
.expect("valid model")
}
fn provider_for(base_url: String) -> ModelProviderInfo {
ModelProviderInfo {
name: "mock".into(),
base_url: Some(base_url),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(5_000),
requires_openai_auth: false,
}
}
#[tokio::test]
async fn refresh_available_models_sorts_and_marks_default() {
let server = MockServer::start().await;
let remote_models = vec![
remote_model("priority-low", "Low", 1),
remote_model("priority-high", "High", 10),
];
let response = ModelsResponse {
models: remote_models.clone(),
};
Mock::given(method("GET"))
.and(path("/models"))
.respond_with(
ResponseTemplate::new(200)
.insert_header("content-type", "application/json")
.set_body_json(&response),
)
.expect(1)
.mount(&server)
.await;
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let manager = ModelsManager::new(auth_manager);
let provider = provider_for(server.uri());
let returned = manager
.refresh_available_models(&provider)
.await
.expect("refresh succeeds");
assert_eq!(returned, remote_models);
let cached_remote = manager.remote_models.read().await.clone();
assert_eq!(cached_remote, remote_models);
let available = manager.available_models.read().await.clone();
assert_eq!(available.len(), 2);
assert_eq!(available[0].model, "priority-high");
assert!(
available[0].is_default,
"highest priority should be default"
);
assert_eq!(available[1].model, "priority-low");
assert!(!available[1].is_default);
}
}

View File

@@ -126,7 +126,9 @@ pub(crate) async fn assess_command(
output_schema: Some(sandbox_assessment_schema()),
};
let model_family = models_manager.construct_model_family(&config.model, &config);
let model_family = models_manager
.construct_model_family(&config.model, &config)
.await;
let child_otel = parent_otel.with_model(config.model.as_str(), model_family.slug.as_str());

View File

@@ -58,6 +58,28 @@ impl Shell {
}
}
}
pub fn login_command(&self) -> Vec<String> {
self.command_with_login(true)
}
pub fn non_login_command(&self) -> Vec<String> {
self.command_with_login(false)
}
fn command_with_login(&self, login_shell: bool) -> Vec<String> {
let shell_path = self.shell_path.to_string_lossy().to_string();
match self.shell_type {
ShellType::PowerShell | ShellType::Cmd => vec![shell_path],
ShellType::Zsh | ShellType::Bash | ShellType::Sh => {
if login_shell {
vec![shell_path, "-l".to_string()]
} else {
vec![shell_path]
}
}
}
}
}
#[cfg(unix)]
@@ -408,6 +430,59 @@ mod tests {
}
}
#[test]
fn derive_exec_args() {
let test_bash_shell = Shell {
shell_type: ShellType::Bash,
shell_path: PathBuf::from("/bin/bash"),
};
assert_eq!(
test_bash_shell.derive_exec_args("echo hello", false),
vec!["/bin/bash", "-c", "echo hello"]
);
assert_eq!(
test_bash_shell.derive_exec_args("echo hello", true),
vec!["/bin/bash", "-lc", "echo hello"]
);
let test_zsh_shell = Shell {
shell_type: ShellType::Zsh,
shell_path: PathBuf::from("/bin/zsh"),
};
assert_eq!(
test_zsh_shell.derive_exec_args("echo hello", false),
vec!["/bin/zsh", "-c", "echo hello"]
);
assert_eq!(
test_zsh_shell.derive_exec_args("echo hello", true),
vec!["/bin/zsh", "-lc", "echo hello"]
);
let test_powershell_shell = Shell {
shell_type: ShellType::PowerShell,
shell_path: PathBuf::from("pwsh.exe"),
};
assert_eq!(
test_powershell_shell.derive_exec_args("echo hello", false),
vec!["pwsh.exe", "-NoProfile", "-Command", "echo hello"]
);
assert_eq!(
test_powershell_shell.derive_exec_args("echo hello", true),
vec!["pwsh.exe", "-Command", "echo hello"]
);
}
#[test]
fn shell_command_login_variants() {
let sh_shell = Shell {
shell_type: ShellType::Sh,
shell_path: PathBuf::from("/bin/sh"),
};
assert_eq!(sh_shell.login_command(), vec!["/bin/sh", "-l"]);
assert_eq!(sh_shell.non_login_command(), vec!["/bin/sh"]);
}
#[tokio::test]
async fn test_current_shell_detects_zsh() {
let shell = Command::new("sh")

View File

@@ -1,4 +1,5 @@
use crate::config::Config;
use crate::git_info::resolve_root_git_project_for_trust;
use crate::skills::model::SkillError;
use crate::skills::model::SkillLoadOutcome;
use crate::skills::model::SkillMetadata;
@@ -20,6 +21,7 @@ struct SkillFrontmatter {
const SKILLS_FILENAME: &str = "SKILL.md";
const SKILLS_DIR_NAME: &str = "skills";
const REPO_ROOT_CONFIG_DIR_NAME: &str = ".codex";
const MAX_NAME_LEN: usize = 100;
const MAX_DESCRIPTION_LEN: usize = 500;
@@ -65,7 +67,17 @@ pub fn load_skills(config: &Config) -> SkillLoadOutcome {
}
fn skill_roots(config: &Config) -> Vec<PathBuf> {
vec![config.codex_home.join(SKILLS_DIR_NAME)]
let mut roots = vec![config.codex_home.join(SKILLS_DIR_NAME)];
if let Some(repo_root) = resolve_root_git_project_for_trust(&config.cwd) {
roots.push(
repo_root
.join(REPO_ROOT_CONFIG_DIR_NAME)
.join(SKILLS_DIR_NAME),
);
}
roots
}
fn discover_skills_under_root(root: &Path, outcome: &mut SkillLoadOutcome) {
@@ -196,6 +208,9 @@ mod tests {
use super::*;
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use pretty_assertions::assert_eq;
use std::path::Path;
use std::process::Command;
use tempfile::TempDir;
fn make_config(codex_home: &TempDir) -> Config {
@@ -211,7 +226,11 @@ mod tests {
}
fn write_skill(codex_home: &TempDir, dir: &str, name: &str, description: &str) -> PathBuf {
let skill_dir = codex_home.path().join(format!("skills/{dir}"));
write_skill_at(codex_home.path(), dir, name, description)
}
fn write_skill_at(root: &Path, dir: &str, name: &str, description: &str) -> PathBuf {
let skill_dir = root.join(format!("skills/{dir}"));
fs::create_dir_all(&skill_dir).unwrap();
let indented_description = description.replace('\n', "\n ");
let content = format!(
@@ -288,4 +307,37 @@ mod tests {
"expected length error"
);
}
#[test]
fn loads_skills_from_repo_root() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
let status = Command::new("git")
.arg("init")
.current_dir(repo_dir.path())
.status()
.expect("git init");
assert!(status.success(), "git init failed");
let skills_root = repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
.join(SKILLS_DIR_NAME);
write_skill_at(&skills_root, "repo", "repo-skill", "from repo");
let mut cfg = make_config(&codex_home);
cfg.cwd = repo_dir.path().to_path_buf();
let repo_root = normalize_path(&skills_root).unwrap_or_else(|_| skills_root.clone());
let outcome = load_skills(&cfg);
assert!(
outcome.errors.is_empty(),
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
let skill = &outcome.skills[0];
assert_eq!(skill.name, "repo-skill");
assert!(skill.path.starts_with(&repo_root));
}
}

View File

@@ -62,7 +62,7 @@ impl SessionState {
}
pub(crate) fn set_rate_limits(&mut self, snapshot: RateLimitSnapshot) {
self.latest_rate_limits = Some(merge_rate_limit_credits(
self.latest_rate_limits = Some(merge_rate_limit_fields(
self.latest_rate_limits.as_ref(),
snapshot,
));
@@ -83,13 +83,16 @@ impl SessionState {
}
}
// Sometimes new snapshots don't include credits
fn merge_rate_limit_credits(
// Sometimes new snapshots don't include credits or plan information.
fn merge_rate_limit_fields(
previous: Option<&RateLimitSnapshot>,
mut snapshot: RateLimitSnapshot,
) -> RateLimitSnapshot {
if snapshot.credits.is_none() {
snapshot.credits = previous.and_then(|prior| prior.credits.clone());
}
if snapshot.plan_type.is_none() {
snapshot.plan_type = previous.and_then(|prior| prior.plan_type);
}
snapshot
}

View File

@@ -49,8 +49,7 @@ impl ShellCommandHandler {
turn_context: &TurnContext,
) -> ExecParams {
let shell = session.user_shell();
let use_login_shell = true;
let command = shell.derive_exec_args(&params.command, use_login_shell);
let command = shell.derive_exec_args(&params.command, params.login.unwrap_or(true));
ExecParams {
command,
@@ -276,9 +275,15 @@ impl ShellHandler {
mod tests {
use std::path::PathBuf;
use codex_protocol::models::ShellCommandToolCallParams;
use pretty_assertions::assert_eq;
use crate::codex::make_session_and_context;
use crate::exec_env::create_env;
use crate::is_safe_command::is_known_safe_command;
use crate::shell::Shell;
use crate::shell::ShellType;
use crate::tools::handlers::ShellCommandHandler;
/// The logic for is_known_safe_command() has heuristics for known shells,
/// so we must ensure the commands generated by [ShellCommandHandler] can be
@@ -312,4 +317,43 @@ mod tests {
&shell.derive_exec_args(command, /* use_login_shell */ false)
));
}
#[test]
fn shell_command_handler_to_exec_params_uses_session_shell_and_turn_context() {
let (session, turn_context) = make_session_and_context();
let command = "echo hello".to_string();
let workdir = Some("subdir".to_string());
let login = None;
let timeout_ms = Some(1234);
let with_escalated_permissions = Some(true);
let justification = Some("because tests".to_string());
let expected_command = session.user_shell().derive_exec_args(&command, true);
let expected_cwd = turn_context.resolve_path(workdir.clone());
let expected_env = create_env(&turn_context.shell_environment_policy);
let params = ShellCommandToolCallParams {
command,
workdir,
login,
timeout_ms,
with_escalated_permissions,
justification: justification.clone(),
};
let exec_params = ShellCommandHandler::to_exec_params(params, &session, &turn_context);
// ExecParams cannot derive Eq due to the CancellationToken field, so we manually compare the fields.
assert_eq!(exec_params.command, expected_command);
assert_eq!(exec_params.cwd, expected_cwd);
assert_eq!(exec_params.env, expected_env);
assert_eq!(exec_params.expiration.timeout_ms(), timeout_ms);
assert_eq!(
exec_params.with_escalated_permissions,
with_escalated_permissions
);
assert_eq!(exec_params.justification, justification);
assert_eq!(exec_params.arg0, None);
}
}

View File

@@ -58,6 +58,28 @@ struct WriteStdinArgs {
max_output_tokens: Option<usize>,
}
#[derive(Debug, Deserialize)]
struct NewSessionArgs {
session_name: String,
#[serde(default)]
workdir: Option<String>,
#[serde(default = "default_write_stdin_yield_time_ms")]
yield_time_ms: u64,
#[serde(default)]
max_output_tokens: Option<usize>,
}
#[derive(Debug, Deserialize)]
struct FeedCharsArgs {
session_name: String,
#[serde(default)]
chars: String,
#[serde(default = "default_write_stdin_yield_time_ms")]
yield_time_ms: u64,
#[serde(default)]
max_output_tokens: Option<usize>,
}
fn default_exec_yield_time_ms() -> u64 {
10000
}
@@ -207,6 +229,87 @@ impl ToolHandler for UnifiedExecHandler {
FunctionCallError::RespondToModel(format!("exec_command failed: {err:?}"))
})?
}
"new_session" => {
let args: NewSessionArgs = serde_json::from_str(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse new_session arguments: {err:?}"
))
})?;
if args.session_name.trim().is_empty() {
return Err(FunctionCallError::RespondToModel(
"session_name must not be empty".to_string(),
));
}
let process_id = manager.allocate_process_id().await;
if let Err(err) = manager
.register_session_name(&args.session_name, &process_id)
.await
{
manager.release_process_id(&process_id).await;
return Err(FunctionCallError::RespondToModel(err.to_string()));
}
let workdir = args
.workdir
.filter(|value| !value.is_empty())
.map(|dir| context.turn.resolve_path(Some(dir)));
let response = manager
.exec_command(
ExecCommandRequest {
command: session.user_shell().login_command(),
process_id,
yield_time_ms: args.yield_time_ms,
max_output_tokens: args.max_output_tokens,
workdir,
with_escalated_permissions: None,
justification: None,
},
&context,
)
.await
.map_err(|err| {
FunctionCallError::RespondToModel(format!("exec_command failed: {err:?}"))
})?;
if response.process_id.is_none() {
manager.clear_session_name(&args.session_name).await;
}
response
}
"feed_chars" => {
let args: FeedCharsArgs = serde_json::from_str(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse feed_chars arguments: {err:?}"
))
})?;
let Some(process_id) = manager
.process_id_for_session_name(&args.session_name)
.await
else {
return Err(FunctionCallError::RespondToModel(format!(
"Session '{}' not found",
args.session_name
)));
};
manager
.write_stdin(WriteStdinRequest {
call_id: &call_id,
process_id: &process_id,
input: &args.chars,
yield_time_ms: args.yield_time_ms,
max_output_tokens: args.max_output_tokens,
})
.await
.map_err(|err| {
FunctionCallError::RespondToModel(format!("write_stdin failed: {err:?}"))
})?
}
"write_stdin" => {
let args: WriteStdinArgs = serde_json::from_str(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(

View File

@@ -8,6 +8,7 @@ use crate::tools::handlers::apply_patch::ApplyPatchToolType;
use crate::tools::handlers::apply_patch::create_apply_patch_freeform_tool;
use crate::tools::handlers::apply_patch::create_apply_patch_json_tool;
use crate::tools::registry::ToolRegistryBuilder;
use codex_protocol::openai_models::ConfigShellToolType;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value as JsonValue;
@@ -15,20 +16,6 @@ use serde_json::json;
use std::collections::BTreeMap;
use std::collections::HashMap;
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ConfigShellToolType {
Default,
Local,
UnifiedExec,
/// Do not include a shell tool by default. Useful when using Codex
/// with tools provided exclusively provided by MCP servers. Often used
/// with `--config base_instructions=CUSTOM_INSTRUCTIONS`
/// to customize agent behavior.
Disabled,
/// Takes a command as a single string to be run in the user's default shell.
ShellCommand,
}
#[derive(Debug, Clone)]
pub(crate) struct ToolsConfig {
pub shell_type: ConfigShellToolType,
@@ -55,10 +42,12 @@ impl ToolsConfig {
let shell_type = if !features.enabled(Feature::ShellTool) {
ConfigShellToolType::Disabled
} else if features.enabled(Feature::UnifiedExecWrapper) {
ConfigShellToolType::UnifiedExecWrapper
} else if features.enabled(Feature::UnifiedExec) {
ConfigShellToolType::UnifiedExec
} else {
model_family.shell_type.clone()
model_family.shell_type
};
let apply_patch_tool_type = match model_family.apply_patch_tool_type {
@@ -264,6 +253,98 @@ fn create_write_stdin_tool() -> ToolSpec {
})
}
fn create_new_session_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
"session_name".to_string(),
JsonSchema::String {
description: Some("Unique name for the session".to_string()),
},
);
properties.insert(
"workdir".to_string(),
JsonSchema::String {
description: Some(
"Optional working directory for the session; defaults to the turn cwd.".to_string(),
),
},
);
properties.insert(
"yield_time_ms".to_string(),
JsonSchema::Number {
description: Some(
"How long to wait (in milliseconds) before returning the initial output."
.to_string(),
),
},
);
properties.insert(
"max_output_tokens".to_string(),
JsonSchema::Number {
description: Some(
"Maximum number of tokens to return. Excess output will be truncated.".to_string(),
),
},
);
ToolSpec::Function(ResponsesApiTool {
name: "new_session".to_string(),
description: "Open a new interactive exec session in a container. Normally used for launching an interactive shell. Multiple sessions may be running at a time.".to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["session_name".to_string()]),
additional_properties: Some(false.into()),
},
})
}
fn create_feed_chars_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
"session_name".to_string(),
JsonSchema::String {
description: Some("Session to feed characters to".to_string()),
},
);
properties.insert(
"chars".to_string(),
JsonSchema::String {
description: Some("Characters to feed; may be empty".to_string()),
},
);
properties.insert(
"yield_time_ms".to_string(),
JsonSchema::Number {
description: Some(
"How long to wait (in milliseconds) for output before flushing STDOUT/STDERR."
.to_string(),
),
},
);
properties.insert(
"max_output_tokens".to_string(),
JsonSchema::Number {
description: Some(
"Maximum number of tokens to return. Excess output will be truncated.".to_string(),
),
},
);
ToolSpec::Function(ResponsesApiTool {
name: "feed_chars".to_string(),
description:
"Feed characters to a session's STDIN, wait briefly, flush STDOUT/STDERR, and return the results."
.to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["session_name".to_string()]),
additional_properties: Some(false.into()),
},
})
}
fn create_shell_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
@@ -344,6 +425,15 @@ fn create_shell_command_tool() -> ToolSpec {
description: Some("The working directory to execute the command in".to_string()),
},
);
properties.insert(
"login".to_string(),
JsonSchema::Boolean {
description: Some(
"Whether to run the shell with login shell semantics. Defaults to true."
.to_string(),
),
},
);
properties.insert(
"timeout_ms".to_string(),
JsonSchema::Number {
@@ -808,10 +898,16 @@ pub(crate) fn create_tools_json_for_chat_completions_api(
}
if let Some(map) = tool.as_object_mut() {
let name = map
.get("name")
.and_then(|v| v.as_str())
.unwrap_or_default()
.to_string();
// Remove "type" field as it is not needed in chat completions.
map.remove("type");
Some(json!({
"type": "function",
"name": name,
"function": map,
}))
} else {
@@ -1011,6 +1107,12 @@ pub(crate) fn build_specs(
builder.register_handler("exec_command", unified_exec_handler.clone());
builder.register_handler("write_stdin", unified_exec_handler);
}
ConfigShellToolType::UnifiedExecWrapper => {
builder.push_spec(create_new_session_tool());
builder.push_spec(create_feed_chars_tool());
builder.register_handler("new_session", unified_exec_handler.clone());
builder.register_handler("feed_chars", unified_exec_handler);
}
ConfigShellToolType::Disabled => {
// Do nothing.
}
@@ -1161,6 +1263,7 @@ mod tests {
ConfigShellToolType::Default => Some("shell"),
ConfigShellToolType::Local => Some("local_shell"),
ConfigShellToolType::UnifiedExec => None,
ConfigShellToolType::UnifiedExecWrapper => Some("unified_exec_wrapper"),
ConfigShellToolType::Disabled => None,
ConfigShellToolType::ShellCommand => Some("shell_command"),
}
@@ -2087,4 +2190,58 @@ Examples of valid command strings:
})
);
}
#[test]
fn chat_tools_include_top_level_name() {
let mut properties = BTreeMap::new();
properties.insert("foo".to_string(), JsonSchema::String { description: None });
let tools = vec![ToolSpec::Function(ResponsesApiTool {
name: "demo".to_string(),
description: "A demo tool".to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: None,
additional_properties: None,
},
})];
let responses_json = create_tools_json_for_responses_api(&tools).unwrap();
assert_eq!(
responses_json,
vec![json!({
"type": "function",
"name": "demo",
"description": "A demo tool",
"strict": false,
"parameters": {
"type": "object",
"properties": {
"foo": { "type": "string" }
},
},
})]
);
let tools_json = create_tools_json_for_chat_completions_api(&tools).unwrap();
assert_eq!(
tools_json,
vec![json!({
"type": "function",
"name": "demo",
"function": {
"name": "demo",
"description": "A demo tool",
"strict": false,
"parameters": {
"type": "object",
"properties": {
"foo": { "type": "string" }
},
},
}
})]
);
}
}

View File

@@ -109,17 +109,43 @@ pub(crate) struct UnifiedExecSessionManager {
pub(crate) struct SessionStore {
sessions: HashMap<String, SessionEntry>,
reserved_sessions_id: HashSet<String>,
session_names: HashMap<String, String>,
}
impl SessionStore {
fn remove(&mut self, session_id: &str) -> Option<SessionEntry> {
self.reserved_sessions_id.remove(session_id);
self.session_names.retain(|_, id| id != session_id);
self.sessions.remove(session_id)
}
pub(crate) fn clear(&mut self) {
self.reserved_sessions_id.clear();
self.sessions.clear();
self.session_names.clear();
}
fn process_id_for_name(&self, session_name: &str) -> Option<String> {
self.session_names.get(session_name).cloned()
}
fn insert_session_name(
&mut self,
session_name: &str,
process_id: &str,
) -> Result<(), UnifiedExecError> {
if self.session_names.contains_key(session_name) {
return Err(UnifiedExecError::create_session(format!(
"Session '{session_name}' already in use"
)));
}
self.session_names
.insert(session_name.to_string(), process_id.to_string());
Ok(())
}
fn clear_session_name(&mut self, session_name: &str) {
self.session_names.remove(session_name);
}
}
@@ -456,4 +482,45 @@ mod tests {
Ok(())
}
#[tokio::test]
async fn session_names_track_process_ids() {
let (session, _turn) = test_session_and_turn();
let process_id = session
.services
.unified_exec_manager
.allocate_process_id()
.await;
session
.services
.unified_exec_manager
.register_session_name("default", &process_id)
.await
.expect("session name reserved");
pretty_assertions::assert_eq!(
session
.services
.unified_exec_manager
.process_id_for_session_name("default")
.await,
Some(process_id.clone())
);
session
.services
.unified_exec_manager
.release_process_id(&process_id)
.await;
assert!(
session
.services
.unified_exec_manager
.process_id_for_session_name("default")
.await
.is_none()
);
}
}

View File

@@ -116,6 +116,25 @@ impl UnifiedExecSessionManager {
store.remove(process_id);
}
pub(crate) async fn register_session_name(
&self,
session_name: &str,
process_id: &str,
) -> Result<(), UnifiedExecError> {
let mut store = self.session_store.lock().await;
store.insert_session_name(session_name, process_id)
}
pub(crate) async fn process_id_for_session_name(&self, session_name: &str) -> Option<String> {
let store = self.session_store.lock().await;
store.process_id_for_name(session_name)
}
pub(crate) async fn clear_session_name(&self, session_name: &str) {
let mut store = self.session_store.lock().await;
store.clear_session_name(session_name);
}
pub(crate) async fn exec_command(
&self,
request: ExecCommandRequest,

View File

@@ -71,8 +71,7 @@ async fn run_request(input: Vec<ResponseItem>) -> Value {
let config = Arc::new(config);
let conversation_id = ConversationId::new();
let models_manager = Arc::new(ModelsManager::new(Some(AuthMode::ApiKey)));
let model_family = models_manager.construct_model_family(&config.model, &config);
let model_family = ModelsManager::construct_model_family_offline(&config.model, &config);
let otel_event_manager = OtelEventManager::new(
conversation_id,
config.model.as_str(),

View File

@@ -1,9 +1,9 @@
use assert_matches::assert_matches;
use codex_core::openai_models::models_manager::ModelsManager;
use codex_core::AuthManager;
use std::sync::Arc;
use tracing_test::traced_test;
use codex_app_server_protocol::AuthMode;
use codex_core::CodexAuth;
use codex_core::ContentItem;
use codex_core::ModelClient;
use codex_core::ModelProviderInfo;
@@ -11,6 +11,7 @@ use codex_core::Prompt;
use codex_core::ResponseEvent;
use codex_core::ResponseItem;
use codex_core::WireApi;
use codex_core::openai_models::models_manager::ModelsManager;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::models::ReasoningItemContent;
@@ -71,16 +72,16 @@ async fn run_stream_with_bytes(sse_body: &[u8]) -> Vec<ResponseEvent> {
let config = Arc::new(config);
let conversation_id = ConversationId::new();
let auth_mode = AuthMode::ApiKey;
let models_manager = Arc::new(ModelsManager::new(Some(auth_mode)));
let model_family = models_manager.construct_model_family(&config.model, &config);
let auth_manager = AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let auth_mode = auth_manager.get_auth_mode();
let model_family = ModelsManager::construct_model_family_offline(&config.model, &config);
let otel_event_manager = OtelEventManager::new(
conversation_id,
config.model.as_str(),
model_family.slug.as_str(),
None,
Some("test@test.com".to_string()),
Some(auth_mode),
auth_mode,
false,
"test".to_string(),
);

View File

@@ -11,7 +11,7 @@ path = "lib.rs"
anyhow = { workspace = true }
assert_cmd = { workspace = true }
base64 = { workspace = true }
codex-core = { workspace = true }
codex-core = { workspace = true, features = ["test-support"] }
codex-protocol = { workspace = true }
notify = { workspace = true }
regex-lite = { workspace = true }

View File

@@ -369,3 +369,13 @@ macro_rules! skip_if_no_network {
}
}};
}
#[macro_export]
macro_rules! skip_if_windows {
($return_value:expr $(,)?) => {{
if cfg!(target_os = "windows") {
println!("Skipping test because it cannot execute on Windows.");
return $return_value;
}
}};
}

View File

@@ -1,6 +1,8 @@
use std::sync::Arc;
use codex_app_server_protocol::AuthMode;
use codex_core::AuthManager;
use codex_core::CodexAuth;
use codex_core::ContentItem;
use codex_core::ModelClient;
use codex_core::ModelProviderInfo;
@@ -63,8 +65,7 @@ async fn responses_stream_includes_subagent_header_on_review() {
let conversation_id = ConversationId::new();
let auth_mode = AuthMode::ChatGPT;
let models_manager = Arc::new(ModelsManager::new(Some(auth_mode)));
let model_family = models_manager.construct_model_family(&config.model, &config);
let model_family = ModelsManager::construct_model_family_offline(&config.model, &config);
let otel_event_manager = OtelEventManager::new(
conversation_id,
config.model.as_str(),
@@ -154,8 +155,7 @@ async fn responses_stream_includes_subagent_header_on_other() {
let conversation_id = ConversationId::new();
let auth_mode = AuthMode::ChatGPT;
let models_manager = Arc::new(ModelsManager::new(Some(auth_mode)));
let model_family = models_manager.construct_model_family(&config.model, &config);
let model_family = ModelsManager::construct_model_family_offline(&config.model, &config);
let otel_event_manager = OtelEventManager::new(
conversation_id,
@@ -246,16 +246,16 @@ async fn responses_respects_model_family_overrides_from_config() {
let config = Arc::new(config);
let conversation_id = ConversationId::new();
let auth_mode = AuthMode::ChatGPT;
let models_manager = Arc::new(ModelsManager::new(Some(auth_mode)));
let model_family = models_manager.construct_model_family(&config.model, &config);
let auth_mode =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key")).get_auth_mode();
let model_family = ModelsManager::construct_model_family_offline(&config.model, &config);
let otel_event_manager = OtelEventManager::new(
conversation_id,
config.model.as_str(),
model_family.slug.as_str(),
None,
Some("test@test.com".to_string()),
Some(auth_mode),
auth_mode,
false,
"test".to_string(),
);

View File

@@ -1250,3 +1250,94 @@ async fn apply_patch_change_context_disambiguates_target(
assert_eq!(contents, "fn a\nx=10\ny=2\nfn b\nx=11\ny=20\n");
Ok(())
}
/// Ensure that applying a patch can update a CRLF file with unicode characters.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[test_case(ApplyPatchModelOutput::Freeform)]
#[test_case(ApplyPatchModelOutput::Function)]
#[test_case(ApplyPatchModelOutput::Shell)]
#[test_case(ApplyPatchModelOutput::ShellViaHeredoc)]
#[test_case(ApplyPatchModelOutput::ShellCommandViaHeredoc)]
async fn apply_patch_cli_updates_unicode_characters(
model_output: ApplyPatchModelOutput,
) -> Result<()> {
skip_if_no_network!(Ok(()));
let harness = apply_patch_harness().await?;
let target = harness.path("unicode.txt");
fs::write(&target, "first ⚠️\nsecond ❌\nthird 🔥\n")?;
let patch = format!(
r#"*** Begin Patch
*** Update File: {}
@@
first ⚠️
-second ❌
+SECOND ✅
@@
third 🔥
+FOURTH
*** End of File
*** End Patch"#,
target.display()
);
let call_id = "apply-unicode-update";
mount_apply_patch(&harness, call_id, patch.as_str(), "ok", model_output).await;
harness
.submit("update unicode characters via apply_patch CLI")
.await?;
let file_contents = fs::read(&target)?;
let content = String::from_utf8_lossy(&file_contents);
assert_eq!(content, "first ⚠️\nSECOND ✅\nthird 🔥\nFOURTH\n");
Ok(())
}
/// Ensure that applying a patch via the CLI preserves CRLF line endings for
/// Windows-style inputs even when updating the file contents.
#[cfg(windows)]
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[test_case(ApplyPatchModelOutput::Freeform)]
#[test_case(ApplyPatchModelOutput::Function)]
#[test_case(ApplyPatchModelOutput::Shell)]
#[test_case(ApplyPatchModelOutput::ShellViaHeredoc)]
#[test_case(ApplyPatchModelOutput::ShellCommandViaHeredoc)]
async fn apply_patch_cli_updates_crlf_file_preserves_line_endings(
model_output: ApplyPatchModelOutput,
) -> Result<()> {
skip_if_no_network!(Ok(()));
let harness = apply_patch_harness().await?;
let target = harness.path("crlf.txt");
fs::write(&target, b"first\r\nsecond\r\nthird\r\n")?;
let patch = format!(
r#"*** Begin Patch
*** Update File: {}
@@
first
-second
+SECOND
@@
third
+FOURTH
*** End of File
*** End Patch"#,
target.display()
);
let call_id = "apply-crlf-update";
mount_apply_patch(&harness, call_id, patch.as_str(), "ok", model_output).await;
harness
.submit("update crlf file via apply_patch CLI")
.await?;
let file_contents = fs::read(&target)?;
let content = String::from_utf8_lossy(&file_contents);
assert!(content.contains("\r\n"));
assert_eq!(content, "first\r\nSECOND\r\nthird\r\nFOURTH\r\n");
Ok(())
}

Some files were not shown because too many files have changed in this diff Show More