Commit Graph

91 Commits

Author SHA1 Message Date
Michael Bolin
5d64e58a38 fix: increase timeout to account for slow PowerShell startup (#16608)
Similar to https://github.com/openai/codex/pull/16604, I am seeing
failures on Windows Bazel that could be due to PowerShell startup
timeouts, so try increasing.
2026-04-02 12:40:19 -07:00
Michael Bolin
aa2403e2eb core: remove cross-crate re-exports from lib.rs (#16512)
## Why

`codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
which made downstream crates depend on `codex-core` as a proxy module
instead of the actual owner crate.

Removing those forwards makes crate boundaries explicit and lets leaf
crates drop unnecessary `codex-core` dependencies. In this PR, this
reduces the dependency on `codex-core` to `codex-login` in the following
files:

```
codex-rs/backend-client/Cargo.toml
codex-rs/mcp-server/tests/common/Cargo.toml
```

## What

- Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
`codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
`codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
`codex-tools`, and `codex-utils-path`.
- Delete the `default_client` forwarding shim in `codex-rs/core`.
- Update in-crate and downstream callsites to import directly from the
owning `codex-*` crate.
- Add direct Cargo dependencies where callsites now target the owner
crate, and remove `codex-core` from `codex-rs/backend-client`.
2026-04-01 23:06:24 -07:00
Michael Bolin
9a8730f31e ci: verify codex-rs Cargo manifests inherit workspace settings (#16353)
## Why

Bazel clippy now catches lints that `cargo clippy` can still miss when a
crate under `codex-rs` forgets to opt into workspace lints. The concrete
example here was `codex-rs/app-server/tests/common/Cargo.toml`: Bazel
flagged a clippy violation in `models_cache.rs`, but Cargo did not
because that crate inherited workspace package metadata without
declaring `[lints] workspace = true`.

We already mirror the workspace clippy deny list into Bazel after
[#15955](https://github.com/openai/codex/pull/15955), so we also need a
repo-side check that keeps every `codex-rs` manifest opted into the same
workspace settings.

## What changed

- add `.github/scripts/verify_cargo_workspace_manifests.py`, which
parses every `codex-rs/**/Cargo.toml` with `tomllib` and verifies:
  - `version.workspace = true`
  - `edition.workspace = true`
  - `license.workspace = true`
  - `[lints] workspace = true`
- top-level crate names follow the `codex-*` / `codex-utils-*`
conventions, with explicit exceptions for `windows-sandbox-rs` and
`utils/path-utils`
- run that script in `.github/workflows/ci.yml`
- update the current outlier manifests so the check is enforceable
immediately
- fix the newly exposed clippy violations in the affected crates
(`app-server/tests/common`, `file-search`, `feedback`,
`shell-escalation`, and `debug-client`)






---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16353).
* #16351
* __->__ #16353
2026-03-31 21:59:28 +00:00
Michael Bolin
61dfe0b86c chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
## Why

`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.

This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.

## What changed

- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`

That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.

## Validation

- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`

## Follow-up

- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.
2026-03-27 19:00:44 -07:00
Ahmed Ibrahim
7eb19e5319 Move terminal module to terminal-detection crate (#15216)
- Move core/src/terminal.rs and its tests into a standalone
terminal-detection workspace crate.
- Update direct consumers to depend on codex-terminal-detection and
import terminal APIs directly.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-19 14:08:04 -07:00
Ahmed Ibrahim
203a70a191 Stabilize shell approval MCP test (#14101)
## Summary
- replace the Python-based file creation command in the MCP shell
approval test with native platform commands
- build the expected command string from the exact argv that the test
sends

## Why this fixes the flake
The old test depended on Python startup and shell quoting details that
varied across runners. The new version still verifies the same approval
flow, but it uses `touch` on Unix and `New-Item` on Windows so the
assertion only depends on the MCP shell command that Codex actually
forwards.
2026-03-09 11:18:26 -07:00
Michael Bolin
1af2a37ada chore: remove codex-core public protocol/shell re-exports (#12432)
## Why

`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.

This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.

## What Changed

- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
  - `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
  - `codex_protocol::protocol`
  - `codex_protocol::config_types`
  - `codex_protocol::models`
  - `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
  - `codex-utils-approval-presets`
  - `codex-utils-cli`

## Verification

- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
2026-02-20 23:45:35 -08:00
sayan-oai
41800fc876 chore: rm remote models fflag (#11699)
rm `remote_models` feature flag.

We see issues like #11527 when a user has `remote_models` disabled, as
we always use the default fallback `ModelInfo`. This causes issues with
model performance.

Builds on #11690, which helps by warning the user when they are using
the default fallback. This PR will make that happen much less frequently
as an accidental consequence of disabling `remote_models`.
2026-02-17 11:43:16 -08:00
Gabriel Peal
bd3ce98190 Bump rmcp to 0.15 (#11539)
https://github.com/modelcontextprotocol/rust-sdk/pull/598 in 0.14 broke
some MCP oauth (like Linear) and
https://github.com/modelcontextprotocol/rust-sdk/pull/641 fixed it in
0.15
2026-02-11 22:04:17 -08:00
Josh McKinney
de59e550c0 test: deflake nextest child-process leak in MCP harnesses (#11263)
## Summary
- add deterministic child-process cleanup to both test `McpProcess`
helpers
- keep Tokio `kill_on_drop(true)` but also reap via bounded `try_wait()`
polling in `Drop`
- document the failure mode and why this avoids nondeterministic `LEAK`
flakes

## Why
`cargo nextest` leak detection can intermittently report `LEAK` when a
spawned server outlives test teardown, making CI flaky.

## Testing
- `just fmt`
- `cargo test -p codex-app-server`
- `cargo test -p codex-mcp-server`


## Failing CI Reference
- Original failing job:
https://github.com/openai/codex/actions/runs/21845226299/job/63039443593?pr=11245
2026-02-10 03:43:24 +00:00
Matthew Zeng
9f1009540b Upgrade rmcp to 0.14 (#10718)
- [x] Upgrade rmcp to 0.14
2026-02-08 15:07:53 -08:00
Eric Traut
10336068db Fix flaky windows CI test (#10993)
Hardens PTY Python REPL test and make MCP test startup deterministic

**Summary**
- `utils/pty/src/tests.rs`
- Added a REPL readiness handshake (`wait_for_python_repl_ready`) that
repeatedly sends a marker and waits for it in PTY output before sending
test commands.
  - Updated `pty_python_repl_emits_output_and_exits` to:
    - wait for readiness first,
    - preserve startup output,
    - append output collected through process exit.
- Reduces Windows/ConPTY flakiness from early stdin writes racing REPL
startup.

- `mcp-server/tests/suite/codex_tool.rs`
- Avoid remote model refresh during MCP test startup, reducing
timeout-prone nondeterminism.
2026-02-07 08:55:42 -08:00
jif-oai
d2394a2494 chore: nuke chat/completions API (#10157) 2026-02-03 11:31:57 +00:00
Michael Bolin
66447d5d2c feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:


8b95d3e082/codex-rs/mcp-types/README.md

Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.

Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:

```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```

so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:

```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```

so the deletion of those individual variants should not be a cause of
great concern.

Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:

```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```

so:

- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`

and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
2026-02-02 17:41:55 -08:00
Shijie Rao
57ec3a8277 Feat: request user input tool (#9472)
### Summary
* Add `requestUserInput` tool that the model can use for gather
feedback/asking question mid turn.


### Tool input schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput input",
  "type": "object",
  "additionalProperties": false,
  "required": ["questions"],
  "properties": {
    "questions": {
      "type": "array",
      "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
      "minItems": 1,
      "maxItems": 3,
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["id", "header", "question"],
        "properties": {
          "id": {
            "type": "string",
            "description": "Stable identifier for mapping answers (snake_case)."
          },
          "header": {
            "type": "string",
            "description": "Short header label shown in the UI (12 or fewer chars)."
          },
          "question": {
            "type": "string",
            "description": "Single-sentence prompt shown to the user."
          },
          "options": {
            "type": "array",
            "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
            "minItems": 2,
            "maxItems": 3,
            "items": {
              "type": "object",
              "additionalProperties": false,
              "required": ["value", "label", "description"],
              "properties": {
                "value": {
                  "type": "string",
                  "description": "Machine-readable value (snake_case)."
                },
                "label": {
                  "type": "string",
                  "description": "User-facing label (1-5 words)."
                },
                "description": {
                  "type": "string",
                  "description": "One short sentence explaining impact/tradeoff if selected."
                }
              }
            }
          }
        }
      }
    }
  }
}
```

### Tool output schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput output",
  "type": "object",
  "additionalProperties": false,
  "required": ["answers"],
  "properties": {
    "answers": {
      "type": "object",
      "description": "Map of question id to user answer.",
      "additionalProperties": {
        "type": "object",
        "additionalProperties": false,
        "required": ["selected"],
        "properties": {
          "selected": {
            "type": "array",
            "items": { "type": "string" }
          },
          "other": {
            "type": ["string", "null"]
          }
        }
      }
    }
  }
}
```
2026-01-19 10:17:30 -08:00
Michael Bolin
99f47d6e9a fix(mcp): include threadId in both content and structuredContent in CallToolResult (#9338) 2026-01-15 18:33:11 -08:00
Michael Bolin
0c09dc3c03 feat: add threadId to MCP server messages (#9192)
This favors `threadId` instead of `conversationId` so we use the same
terms as https://developers.openai.com/codex/sdk/.

To test the local build:

```
cd codex-rs
cargo build --bin codex
npx -y @modelcontextprotocol/inspector ./target/debug/codex mcp-server
```

I sent:

```json
{
  "method": "tools/call",
  "params": {
    "name": "codex",
    "arguments": {
      "prompt": "favorite ls option?"
    },
    "_meta": {
      "progressToken": 0
    }
  }
}
```

and got:

```json
{
  "content": [
    {
      "type": "text",
      "text": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
    }
  ],
  "structuredContent": {
    "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
  }
}
```

and successfully used the `threadId` in the follow-up with the
`codex-reply` tool call:

```json
{
  "method": "tools/call",
  "params": {
    "name": "codex-reply",
    "arguments": {
      "prompt": "what is the long versoin",
      "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
    },
    "_meta": {
      "progressToken": 1
    }
  }
}
```

whose response also has the `threadId`:

```json
{
  "content": [
    {
      "type": "text",
      "text": "Long listing is `ls -l` (adds permissions, owner/group, size, timestamp)."
    }
  ],
  "structuredContent": {
    "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
  }
}
```

Fixes https://github.com/openai/codex/issues/3712.
2026-01-13 22:14:41 -08:00
Ahmed Ibrahim
87f7226cca Assemble sandbox/approval/network prompts dynamically (#8961)
- Add a single builder for developer permissions messaging that accepts
SandboxPolicy and approval policy. This builder now drives the developer
“permissions” message that’s injected at session start and any time
sandbox/approval settings change.
- Trim EnvironmentContext to only include cwd, writable roots, and
shell; removed sandbox/approval/network duplication and adjusted XML
serialization and tests accordingly.

Follow-up: adding a config value to replace the developer permissions
message for custom sandboxes.
2026-01-12 23:12:59 +00:00
zbarsky-openai
2a06d64bc9 feat: add support for building with Bazel (#8875)
This PR configures Codex CLI so it can be built with
[Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
includes configuration so that remote builds can be done using
[BuildBuddy](https://www.buildbuddy.io).

If you are familiar with Bazel, things should work as you expect, e.g.,
run `bazel test //... --keep-going` to run all the tests in the repo,
but we have also added some new aliases in the `justfile` for
convenience:

- `just bazel-test` to run tests locally
- `just bazel-remote-test` to run tests remotely (currently, the remote
build is for x86_64 Linux regardless of your host platform). Note we are
currently seeing the following test failures in the remote build, so we
still need to figure out what is happening here:

```
failures:
    suite::compact::manual_compact_twice_preserves_latest_user_messages
    suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
    suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
```

- `just build-for-release` to build release binaries for all
platforms/architectures remotely

To setup remote execution:
- [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
employees should also request org access at
https://openai.buildbuddy.io/join/ with their `@openai.com` email
address.)
- [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
`~/.bazelrc` (add the line `build
--remote_header=x-buildbuddy-api-key=YOUR_KEY`)
- Use `--config=remote` in your `bazel` invocations (or add `common
--config=remote` to your `~/.bazelrc`, or use the `just` commands)

## CI

In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
(we are working on supporting Windows, but that is not ready yet). Note
that the failures we are seeing in `just bazel-remote-test` do not occur
on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
is green right now.

The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
that macOS CI jobs build _remotely_ on Linux hosts (using the
`docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
root `BUILD.bazel`) using cross-compilation to build the macOS
artifacts. Then these artifacts are downloaded locally to GitHub's macOS
runner so the tests can be executed natively. This is the relevant
config that enables this:

```
common:macos --config=remote
common:macos --strategy=remote
common:macos --strategy=TestRunner=darwin-sandbox,local
```

Because of the remote caching benefits we get from BuildBuddy, these new
CI jobs can be extremely fast! For example, consider these two jobs that
ran all the tests on Linux x86_64:

- Bazel 1m37s
https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
- Cargo 9m20s
https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875

For now, we will continue to run both the Bazel and Cargo jobs for PRs,
but once we add support for Windows and running Clippy, we should be
able to cutover to using Bazel exclusively for PRs, which should still
speed things up considerably. We will probably continue to run the Cargo
jobs post-merge for commits that land on `main` as a sanity check.

Release builds will also continue to be done by Cargo for now.

Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
Earlier attempt to add support for Buck2, now abandoned:
https://github.com/openai/codex/pull/8504

---------

Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-09 11:09:43 -08:00
jif-oai
1aed01e99f renaming: task to turn (#8963) 2026-01-09 17:31:17 +00:00
Michael Bolin
e61bae12e3 feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496)
This PR introduces a `codex-utils-cargo-bin` utility crate that
wraps/replaces our use of `assert_cmd::Command` and
`escargot::CargoBuild`.

As you can infer from the introduction of `buck_project_root()` in this
PR, I am attempting to make it possible to build Codex under
[Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
achieve faster incremental local builds (largely due to Buck2's
[dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
build strategy, as well as benefits from its local build daemon) as well
as faster CI builds if we invest in remote execution and caching.

See
https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
for more details about the performance advantages of Buck2.

Buck2 enforces stronger requirements in terms of build and test
isolation. It discourages assumptions about absolute paths (which is key
to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
variables that Cargo provides are absolute paths (which
`assert_cmd::Command` reads), this is a problem for Buck2, which is why
we need this `codex-utils-cargo-bin` utility.

My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
passed to a `rust_test()` build rule as relative paths.
`codex-utils-cargo-bin` will resolve these values to absolute paths,
when necessary.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
* #8498
* __->__ #8496
2025-12-23 19:29:32 -08:00
jif-oai
29381ba5c2 feat: add shell snapshot for shell command (#7786) 2025-12-11 13:46:43 +00:00
Eric Traut
c4af707e09 Removed experimental "command risk assessment" feature (#7799)
This experimental feature received lukewarm reception during internal
testing. Removing from the code base.
2025-12-10 09:48:11 -08:00
Josh McKinney
ec49b56874 chore: add cargo-deny configuration (#7119)
- add GitHub workflow running cargo-deny on push/PR
- document cargo-deny allowlist with workspace-dep notes and advisory
ignores
- align workspace crates to inherit version/edition/license for
consistent checks
2025-11-24 12:22:18 -08:00
pakrym-oai
767b66f407 Migrate coverage to shell_command (#7042) 2025-11-21 03:44:00 +00:00
Anton Panasenko
9572cfc782 [codex] add developer instructions (#5897)
we are using developer instructions for code reviews, we need to pass
them in cli as well.
2025-10-30 11:18:31 -07:00
Eric Traut
f8af4f5c8d Added model summary and risk assessment for commands that violate sandbox policy (#5536)
This PR adds support for a model-based summary and risk assessment for
commands that violate the sandbox policy and require user approval. This
aids the user in evaluating whether the command should be approved.

The feature works by taking a failed command and passing it back to the
model and asking it to summarize the command, give it a risk level (low,
medium, high) and a risk category (e.g. "data deletion" or "data
exfiltration"). It uses a new conversation thread so the context in the
existing thread doesn't influence the answer. If the call to the model
fails or takes longer than 5 seconds, it falls back to the current
behavior.

For now, this is an experimental feature and is gated by a config key
`experimental_sandbox_command_assessment`.

Here is a screen shot of the approval prompt showing the risk assessment
and summary.

<img width="723" height="282" alt="image"
src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
/>
2025-10-24 15:23:44 -07:00
jif-oai
5e4f3bbb0b chore: rework tools execution workflow (#5278)
Re-work the tool execution flow. Read `orchestrator.rs` to understand
the structure
2025-10-20 20:57:37 +01:00
Michael Bolin
995f5c3614 feat: add Vec<ParsedCommand> to ExecApprovalRequestEvent (#5222)
This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent`
in the core protocol (`protocol/src/protocol.rs`), which is also what
this field is named on `ExecCommandBeginEvent`. Honestly, I don't love
the name (it sounds like a single command, but it is actually a list of
them), but I don't want to get distracted by a naming discussion right
now.

This also adds `parsed_cmd` to `ExecCommandApprovalParams` in
`codex-rs/app-server-protocol/src/protocol.rs`, so it will be available
via `codex app-server`, as well.

For consistency, I also updated `ExecApprovalElicitRequestParams` in
`codex-rs/mcp-server/src/exec_approval.rs` to include this field under
the name `codex_parsed_cmd`, as that struct already has a number of
special `codex_*` fields. Note this is the code for when Codex is used
as an MCP _server_ and therefore has to conform to the official spec for
an MCP elicitation type.
2025-10-15 13:58:40 -07:00
Michael Bolin
d9dbf48828 fix: separate codex mcp into codex mcp-server and codex app-server (#4471)
This is a very large PR with some non-backwards-compatible changes.

Historically, `codex mcp` (or `codex mcp serve`) started a JSON-RPC-ish
server that had two overlapping responsibilities:

- Running an MCP server, providing some basic tool calls.
- Running the app server used to power experiences such as the VS Code
extension.

This PR aims to separate these into distinct concepts:

- `codex mcp-server` for the MCP server
- `codex app-server` for the "application server"

Note `codex mcp` still exists because it already has its own subcommands
for MCP management (`list`, `add`, etc.)

The MCP logic continues to live in `codex-rs/mcp-server` whereas the
refactored app server logic is in the new `codex-rs/app-server` folder.
Note that most of the existing integration tests in
`codex-rs/mcp-server/tests/suite` were actually for the app server, so
all the tests have been moved with the exception of
`codex-rs/mcp-server/tests/suite/mod.rs`.

Because this is already a large diff, I tried not to change more than I
had to, so `codex-rs/app-server/tests/common/mcp_process.rs` still uses
the name `McpProcess` for now, but I will do some mechanical renamings
to things like `AppServer` in subsequent PRs.

While `mcp-server` and `app-server` share some overlapping functionality
(like reading streams of JSONL and dispatching based on message types)
and some differences (completely different message types), I ended up
doing a bit of copypasta between the two crates, as both have somewhat
similar `message_processor.rs` and `outgoing_message.rs` files for now,
though I expect them to diverge more in the near future.

One material change is that of the initialize handshake for `codex
app-server`, as we no longer use the MCP types for that handshake.
Instead, we update `codex-rs/protocol/src/mcp_protocol.rs` to add an
`Initialize` variant to `ClientRequest`, which takes the `ClientInfo`
object we need to update the `USER_AGENT_SUFFIX` in
`codex-rs/app-server/src/message_processor.rs`.

One other material change is in
`codex-rs/app-server/src/codex_message_processor.rs` where I eliminated
a use of the `send_event_as_notification()` method I am generally trying
to deprecate (because it blindly maps an `EventMsg` into a
`JSONNotification`) in favor of `send_server_notification()`, which
takes a `ServerNotification`, as that is intended to be a custom enum of
all notification types supported by the app server. So to make this
update, I had to introduce a new variant of `ServerNotification`,
`SessionConfigured`, which is a non-backwards compatible change with the
old `codex mcp`, and clients will have to be updated after the next
release that contains this PR. Note that
`codex-rs/app-server/tests/suite/list_resume.rs` also had to be update
to reflect this change.

I introduced `codex-rs/utils/json-to-toml/src/lib.rs` as a small utility
crate to avoid some of the copying between `mcp-server` and
`app-server`.
2025-09-30 07:06:18 +00:00
Dylan
197f45a3be [mcp-server] Expose fuzzy file search in MCP (#2677)
## Summary
Expose a simple fuzzy file search implementation for mcp clients to work
with

## Testing
- [x] Tested locally
2025-09-29 12:19:09 -07:00
Jeremy Rose
4a5f05c136 make tests pass cleanly in sandbox (#4067)
This changes the reqwest client used in tests to be sandbox-friendly,
and skips a bunch of other tests that don't work inside the
sandbox/without network.
2025-09-25 13:11:14 -07:00
Thibault Sottiaux
c93e77b68b feat: update default (#4076)
Changes:
- Default model and docs now use gpt-5-codex. 
- Disables the GPT-5 Codex NUX by default.
- Keeps presets available for API key users.
2025-09-22 20:10:52 -07:00
jif-oai
be366a31ab chore: clippy on redundant closure (#4058)
Add redundant closure clippy rules and let Codex fix it by minimising
FQP
2025-09-22 19:30:16 +00:00
jif-oai
e5fe50d3ce chore: unify cargo versions (#4044)
Unify cargo versions at root
2025-09-22 16:47:01 +00:00
pakrym-oai
14a115d488 Add non_sandbox_test helper (#3880)
Makes tests shorter
2025-09-22 14:50:41 +00:00
pakrym-oai
d4aba772cb Switch to uuid_v7 and tighten ConversationId usage (#3819)
Make sure conversations have a timestamp.
2025-09-18 14:37:03 +00:00
Eric Traut
e5dd7f0934 Fix get_auth_status response when using custom provider (#3581)
This PR addresses an edge-case bug that appears in the VS Code extension
in the following situation:
1. Log in using ChatGPT (using either the CLI or extension). This will
create an `auth.json` file.
2. Manually modify `config.toml` to specify a custom provider.
3. Start a fresh copy of the VS Code extension.

The profile menu in the VS Code extension will indicate that you are
logged in using ChatGPT even though you're not.

This is caused by the `get_auth_status` method returning an
`auth_method: 'chatgpt'` when a custom provider is configured and it
doesn't use OpenAI auth (i.e. `requires_openai_auth` is false). The
method should always return `auth_method: None` if
`requires_openai_auth` is false.

The same bug also causes the NUX (new user experience) screen to be
displayed in the VSCE in this situation.
2025-09-14 18:27:02 -07:00
jif-oai
c6fd056aa6 feat: reasoning effort as optional (#3527)
Allow the reasoning effort to be optional
2025-09-12 12:06:33 -07:00
Michael Bolin
abdcb40f4c feat: change the behavior of SetDefaultModel RPC so None clears the value. (#3529)
It turns out that we want slightly different behavior for the
`SetDefaultModel` RPC because some models do not work with reasoning
(like GPT-4.1), so we should be able to explicitly clear this value.

Verified in `codex-rs/mcp-server/tests/suite/set_default_model.rs`.
2025-09-12 11:35:51 -07:00
Michael Bolin
c172e8e997 feat: added SetDefaultModel to JSON-RPC server (#3512)
This adds `SetDefaultModel`, which takes `model` and `reasoning_effort`
as optional fields. If set, the field will overwrite what is in the
user's `config.toml`.

This reuses logic that was added to support the `/model` command in the
TUI: https://github.com/openai/codex/pull/2799.
2025-09-11 23:44:17 -07:00
Michael Bolin
9bbeb75361 feat: include reasoning_effort in NewConversationResponse (#3506)
`ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.
2025-09-11 21:04:40 -07:00
Eric Traut
e13b35ecb0 Simplify auth flow and reconcile differences between ChatGPT and API Key auth (#3189)
This PR does the following:
* Adds the ability to paste or type an API key.
* Removes the `preferred_auth_method` config option. The last login
method is always persisted in auth.json, so this isn't needed.
* If OPENAI_API_KEY env variable is defined, the value is used to
prepopulate the new UI. The env variable is otherwise ignored by the
CLI.
* Adds a new MCP server entry point "login_api_key" so we can implement
this same API key behavior for the VS Code extension.
<img width="473" height="140" alt="Screenshot 2025-09-04 at 3 51 04 PM"
src="https://github.com/user-attachments/assets/c11bbd5b-8a4d-4d71-90fd-34130460f9d9"
/>
<img width="726" height="254" alt="Screenshot 2025-09-04 at 3 51 32 PM"
src="https://github.com/user-attachments/assets/6cc76b34-309a-4387-acbc-15ee5c756db9"
/>
2025-09-11 09:16:34 -07:00
Michael Bolin
65f3528cad feat: add UserInfo request to JSON-RPC server (#3428)
This adds a simple endpoint that provides the email address encoded in
`$CODEX_HOME/auth.json`.

As noted, for now, we do not hit the server to verify this is the user's
true email address.
2025-09-10 17:03:35 -07:00
Michael Bolin
44262d8fd8 fix: ensure output of codex-rs/mcp-types/generate_mcp_types.py matches codex-rs/mcp-types/src/lib.rs (#3439)
https://github.com/openai/codex/pull/3395 updated `mcp-types/src/lib.rs`
by hand, but that file is generated code that is produced by
`mcp-types/generate_mcp_types.py`. Unfortunately, we do not have
anything in CI to verify this right now, but I will address that in a
subsequent PR.

#3395 ended up introducing a change that added a required field when
deserializing `InitializeResult`, breaking Codex when used as an MCP
client, so the quick fix in #3436 was to make the new field `Optional`
with `skip_serializing_if = "Option::is_none"`, but that did not address
the problem that `mcp-types/generate_mcp_types.py` and
`mcp-types/src/lib.rs` are out of sync.

This PR gets things back to where they are in sync. It removes the
custom `mcp_types::McpClientInfo` type that was added to
`mcp-types/src/lib.rs` and forces us to use the generated
`mcp_types::Implementation` type. Though this PR also updates
`generate_mcp_types.py` to generate the additional `user_agent:
Optional<String>` field on `Implementation` so that we can continue to
specify it when Codex operates as an MCP server.

However, this also requires us to specify `user_agent: None` when Codex
operates as an MCP client.

We may want to introduce our own `InitializeResult` type that is
specific to when we run as a server to avoid this in the future, but my
immediate goal is just to get things back in sync.
2025-09-10 16:14:41 -07:00
Eric Traut
acb28bf914 Improved resiliency of two auth-related tests (#3427)
This PR improves two existing auth-related tests. They were failing when
run in an environment where an `OPENAI_API_KEY` env variable was
defined. The change makes them more resilient.
2025-09-10 11:46:02 -07:00
Gabriel Peal
8636bff46d Set a user agent suffix when used as a mcp server (#3395)
This automatically adds a user agent suffix whenever the CLI is used as
a MCP server
2025-09-10 02:32:57 +00:00
Ahmed Ibrahim
43809a454e Introduce rollout items (#3380)
This PR introduces Rollout items. This enable us to rollout eventmsgs
and session meta.

This is mostly #3214 with rebase on main
2025-09-09 23:52:33 +00:00
Gabriel Peal
5eab4c7ab4 Replace config.responses_originator_header_internal_override with CODEX_INTERNAL_ORIGINATOR_OVERRIDE_ENV_VAR (#3388)
The previous config approach had a few issues:
1. It is part of the config but not designed to be used externally
2. It had to be wired through many places (look at the +/- on this PR
3. It wasn't guaranteed to be set consistently everywhere because we
don't have a super well defined way that configs stack. For example, the
extension would configure during newConversation but anything that
happened outside of that (like login) wouldn't get it.

This env var approach is cleaner and also creates one less thing we have
to deal with when coming up with a better holistic story around configs.

One downside is that I removed the unit test testing for the override
because I don't want to deal with setting the global env or spawning
child processes and figuring out how to introspect their originator
header. The new code is sufficiently simple and I tested it e2e that I
feel as if this is still worth it.
2025-09-09 17:23:23 -04:00
Michael Bolin
ace14e8d36 feat: add ArchiveConversation to ClientRequest (#3353)
Adds support for `ArchiveConversation` in the JSON-RPC server that takes
a `(ConversationId, PathBuf)` pair and:

- verifies the `ConversationId` corresponds to the rollout id at the
`PathBuf`
- if so, invokes
`ConversationManager.remove_conversation(ConversationId)`
- if the `CodexConversation` was in memory, send `Shutdown` and wait for
`ShutdownComplete` with a timeout
- moves the `.jsonl` file to `$CODEX_HOME/archived_sessions`

---------

Co-authored-by: Gabriel Peal <gabriel@openai.com>
2025-09-09 11:39:00 -04:00