`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.
It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.
## What
- Added `ReadOnlyAccess` in protocol with:
- `Restricted { include_platform_defaults, readable_roots }`
- `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
- `ReadOnly { access: ReadOnlyAccess }`
- `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.
## Compatibility / rollout
- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
Summary
- add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
in all fixtures and constructors
- wire the new flag into websocket selection so models that opt in
always use websocket transport even when the feature gate is off
Testing
- Not run (not requested)
- Replace image blocks in MCP tool results with a text placeholder when
the active model does not accept image input.
- Add an e2e rmcp test to verify sanitized tool output is what gets sent
back to the model.
## Summary
- keep wiremock MockServer handles alive through async assertions in
remote model suite tests
- assert /models request count in remote_models_hide_picker_only_models
- use a slightly higher parallel timing threshold on aarch64 while
keeping existing x86 threshold
## Validation
- just fmt
- targeted tests:
- cargo test -p codex-core --test all
suite::remote_models::remote_models_merge_replaces_overlapping_model --
--exact
- cargo test -p codex-core --test all
suite::remote_models::remote_models_hide_picker_only_models -- --exact
- cargo test -p codex-core --test all
suite::tool_parallelism::shell_tools_run_in_parallel -- --exact
- soak loop: 40 iterations of all three targeted tests
## Notes
- cargo test -p codex-core has one unrelated local-env failure in
shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file from
exported certificate env content in this workspace.
- local bazel test //codex-rs/core:core-all-test failed to build due
missing rust-objcopy in this host toolchain.
Summary
- add a `required` flag for MCP servers everywhere config/CLI data is
touched so mandatory helpers can be round-tripped
- have `codex exec` and `codex app-server` thread start/resume fail fast
when required MCPs fail to initialize
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:
8b95d3e082/codex-rs/mcp-types/README.md
Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.
Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:
```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```
so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:
```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```
so the deletion of those individual variants should not be a cause of
great concern.
Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```
so:
- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
### Motivation
- Allow MCP OAuth flows to request scopes defined in `config.toml`
instead of requiring users to always pass `--scopes` on the CLI.
CLI/remote parameters should still override config values.
### Description
- Add optional `scopes: Option<Vec<String>>` to `McpServerConfig` and
`RawMcpServerConfig`, and propagate it through deserialization and the
built config types.
- Serialize `scopes` into the MCP server TOML via
`serialize_mcp_server_table` in `core/src/config/edit.rs` and include
`scopes` in the generated config schema (`core/config.schema.json`).
- CLI: update `codex-rs/cli/src/mcp_cmd.rs` `run_login` to fall back to
`server.scopes` when the `--scopes` flag is empty, with explicit CLI
scopes still taking precedence.
- App server: update
`codex-rs/app-server/src/codex_message_processor.rs`
`mcp_server_oauth_login` to use `params.scopes.or_else(||
server.scopes.clone())` so the RPC path also respects configured scopes.
- Update many test fixtures to initialize the new `scopes` field (set to
`None`) so test code builds with the new struct field.
### Testing
- Ran config tooling and formatters: `just write-config-schema`
(succeeded), `just fmt` (succeeded), and `just fix -p codex-core`, `just
fix -p codex-cli`, `just fix -p codex-app-server` (succeeded where
applicable).
- Ran unit tests for the CLI: `cargo test -p codex-cli` (passed).
- Ran unit tests for core: `cargo test -p codex-core` (ran; many tests
passed but several failed, including model refresh/403-related tests,
shell snapshot/timeouts, and several `unified_exec` expectations).
- Ran app-server tests: `cargo test -p codex-app-server` (ran; many
integration-suite tests failed due to mocked/remote HTTP 401/403
responses and wiremock expectations).
If you want, I can split the tests into smaller focused runs or help
debug the failing integration tests (they appear to be unrelated to the
config change and stem from external HTTP/mocking behaviors encountered
during the test runs).
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69718f505914832ea1f334b3ba064553)
## Summary
Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
is explicitly unused by the clients so far, to simplify PRs - app-server
and tui implementations will be follow-ups.
## Testing
- [x] added integration tests
Enterprises want to restrict the MCP servers their users can use.
Admins can now specify an allowlist of MCPs in `requirements.toml`. The
MCP servers are matched on both Name and Transport (local path or HTTP
URL) -- both must match to allow the MCP server. This prevents
circumventing the allowlist by renaming MCP servers in user config. (It
is still possible to replace the local path e.g. rewrite say
`/usr/local/github-mcp` with a nefarious MCP. We could allow hash
pinning in the future, but that would break updates. I also think this
represents a broader, out-of-scope problem.)
We introduce a new field to Constrained: "normalizer". In general, it is
a fn(T) -> T and applies when `Constrained<T>.set()` is called. In this
particular case, it disables MCP servers which do not match the
allowlist. An alternative solution would remove this and instead throw a
ConstraintError. That would stop Codex launching if any MCP server was
configured which didn't match. I think this is bad.
We currently reuse the enabled flag on MCP servers to disable them, but
don't propagate any information about why they are disabled. I'd like to
add that in a follow up PR, possibly by switching out enabled with an
enum.
In action:
```
# MCP server config has two MCPs. We are going to allowlist one of them.
➜ codex git:(gt/restrict-mcps) ✗ cat ~/.codex/config.toml | grep mcp_servers -A1
[mcp_servers.hello_world]
command = "hello-world-mcp"
--
[mcp_servers.docs]
command = "docs-mcp"
# Restrict the MCPs to the hello_world MCP.
➜ codex git:(gt/restrict-mcps) ✗ defaults read com.openai.codex requirements_toml_base64 | base64 -d
[mcp_server_allowlist.hello_world]
command = "hello-world-mcp"
# List the MCPs, observe hello_world is enabled and docs is disabled.
➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/codex mcp list`
Name Command Args Env Cwd Status Auth
docs docs-mcp - - - disabled Unsupported
hello_world hello-world-mcp - - - enabled Unsupported
# Remove the restrictions.
➜ codex git:(gt/restrict-mcps) ✗ defaults delete com.openai.codex requirements_toml_base64
# Observe both MCPs are enabled.
➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/codex mcp list`
Name Command Args Env Cwd Status Auth
docs docs-mcp - - - enabled Unsupported
hello_world hello-world-mcp - - - enabled Unsupported
# A new requirements that updates the command to one that does not match.
➜ codex git:(gt/restrict-mcps) ✗ cat ~/requirements.toml
[mcp_server_allowlist.hello_world]
command = "hello-world-mcp-v2"
# Use those requirements.
➜ codex git:(gt/restrict-mcps) ✗ defaults write com.openai.codex requirements_toml_base64 "$(base64 -i /Users/gt/requirements.toml)"
# Observe both MCPs are disabled.
➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.75s
Running `target/debug/codex mcp list`
Name Command Args Env Cwd Status Auth
docs docs-mcp - - - disabled Unsupported
hello_world hello-world-mcp - - - disabled Unsupported
```
### Summary
With codesigning on Mac, Windows and Linux, we should be able to safely
remove `features.rmcp_client` and `use_experimental_use_rmcp_client`
check from the codebase now.
- Make Config.model optional and centralize default-selection logic in
ModelsManager, including a default_model helper (with
codex-auto-balanced when available) so sessions now carry an explicit
chosen model separate from the base config.
- Resolve `model` once in `core` and `tui` from config. Then store the
state of it on other structs.
- Move refreshing models to be before resolving the default model
It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.
This PR builds off of 4391 and fixes#4819. I would like @KKcorps to get
adequete credit here but I also want to get this fix in ASAP so I gave
him a week to update it and haven't gotten a response so I'm going to
take it across the finish line.
This test highlights how absured the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>
After this change, it correctly outputs:
> Captured the page: image dhows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)
Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.
Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.
I tested this change with the openai respones api as well as a third
party completions api
Some MCP servers expose a lot of tools. In those cases, it is reasonable
to allow/denylist tools for Codex to use so it doesn't get overwhelmed
with too many tools.
The new configuration options available in the `mcp_server` toml table
are:
* `enabled_tools`
* `disabled_tools`
Fixes#4796
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).
Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
This should make it more clear that specific tools come from MCP
servers.
#4806 requested that we add the server name but we already do that.
Fixes#4806
This makes stdio mcp servers more flexible by allowing users to specify
the cwd to run the server command from and adding additional environment
variables to be passed through to the server.
Example config using the test server in this repo:
```toml
[mcp_servers.test_stdio]
cwd = "/Users/<user>/code/codex/codex-rs"
command = "cargo"
args = ["run", "--bin", "test_stdio_server"]
env_vars = ["MCP_TEST_VALUE"]
```
@bolinfest I know you hate these env var tests but let's roll with this
for now. I may take a stab at the env guard + serial macro at some
point.
This adds two new config fields to streamable http mcp servers:
`http_headers`: a map of key to value
`env_http_headers` a map of key to env var which will be resolved at
request time
All headers will be passed to all MCP requests to that server just like
authorization headers.
There is a test ensuring that headers are not passed to other servers.
Fixes#5180
Add proper feature flag instead of having custom flags for everything.
This is just for experimental/wip part of the code
It can be used through CLI:
```bash
codex --enable unified_exec --disable view_image_tool
```
Or in the `config.toml`
```toml
# Global toggles applied to every profile unless overridden.
[features]
apply_patch_freeform = true
view_image_tool = false
```
Follow-up:
In a following PR, the goal is to have a default have `bundles` of
features that we can associate to a model
1. You can now add streamable http servers via the CLI
2. As part of this, I'm also changing the existing bearer_token plain
text config field with ane env var
```
mcp add github --url https://api.githubcopilot.com/mcp/ --bearer-token-env-var=GITHUB_PAT
```
## Summary
- add a reusable `ev_response_created` helper that builds
`response.created` SSE events for integration tests
- update the exec and core integration suites to use the new helper
instead of repeating manual JSON literals
- keep the streaming fixtures consistent by relying on the shared helper
in every touched test
## Testing
- `just fmt`
------
https://chatgpt.com/codex/tasks/task_i_68e1fe885bb883208aafffb94218da61
This PR adds oauth login support to streamable http servers when
`experimental_use_rmcp_client` is enabled.
This PR is large but represents the minimal amount of work required for
this to work. To keep this PR smaller, login can only be done with
`codex mcp login` and `codex mcp logout` but it doesn't appear in `/mcp`
or `codex mcp list` yet. Fingers crossed that this is the last large MCP
PR and that subsequent PRs can be smaller.
Under the hood, credentials are stored using platform credential
managers using the [keyring crate](https://crates.io/crates/keyring).
When the keyring isn't available, it falls back to storing credentials
in `CODEX_HOME/.credentials.json` which is consistent with how other
coding agents handle authentication.
I tested this on macOS, Windows, WSL (ubuntu), and Linux. I wasn't able
to test the dbus store on linux but did verify that the fallback works.
One quirk is that if you have credentials, during development, every
build will have its own ad-hoc binary so the keyring won't recognize the
reader as being the same as the write so it may ask for the user's
password. I may add an override to disable this or allow
users/enterprises to opt-out of the keyring storage if it causes issues.
<img width="5064" height="686" alt="CleanShot 2025-09-30 at 19 31 40"
src="https://github.com/user-attachments/assets/9573f9b4-07f1-4160-83b8-2920db287e2d"
/>
<img width="745" height="486" alt="image"
src="https://github.com/user-attachments/assets/9562649b-ea5f-4f22-ace2-d0cb438b143e"
/>
### Title
## otel
Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events**
that
describe each run: outbound API requests, streamed responses, user
input,
tool-approval decisions, and the result of every tool invocation. Export
is
**disabled by default** so local runs remain self-contained. Opt in by
adding an
`[otel]` table and choosing an exporter.
```toml
[otel]
environment = "staging" # defaults to "dev"
exporter = "none" # defaults to "none"; set to otlp-http or otlp-grpc to send events
log_user_prompt = false # defaults to false; redact prompt text unless explicitly enabled
```
Codex tags every exported event with `service.name = "codex-cli"`, the
CLI
version, and an `env` attribute so downstream collectors can distinguish
dev/staging/prod traffic. Only telemetry produced inside the
`codex_otel`
crate—the events listed below—is forwarded to the exporter.
### Event catalog
Every event shares a common set of metadata fields: `event.timestamp`,
`conversation.id`, `app.version`, `auth_mode` (when available),
`user.account_id` (when available), `terminal.type`, `model`, and
`slug`.
With OTEL enabled Codex emits the following event types (in addition to
the
metadata above):
- `codex.api_request`
- `cf_ray` (optional)
- `attempt`
- `duration_ms`
- `http.response.status_code` (optional)
- `error.message` (failures)
- `codex.sse_event`
- `event.kind`
- `duration_ms`
- `error.message` (failures)
- `input_token_count` (completion only)
- `output_token_count` (completion only)
- `cached_token_count` (completion only, optional)
- `reasoning_token_count` (completion only, optional)
- `tool_token_count` (completion only)
- `codex.user_prompt`
- `prompt_length`
- `prompt` (redacted unless `log_user_prompt = true`)
- `codex.tool_decision`
- `tool_name`
- `call_id`
- `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
- `source` (`config` or `user`)
- `codex.tool_result`
- `tool_name`
- `call_id`
- `arguments`
- `duration_ms` (execution time for the tool)
- `success` (`"true"` or `"false"`)
- `output`
### Choosing an exporter
Set `otel.exporter` to control where events go:
- `none` – leaves instrumentation active but skips exporting. This is
the
default.
- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector.
Specify the
endpoint, protocol, and headers your collector expects:
```toml
[otel]
exporter = { otlp-http = {
endpoint = "https://otel.example.com/v1/logs",
protocol = "binary",
headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
}}
```
- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint
and any
metadata headers:
```toml
[otel]
exporter = { otlp-grpc = {
endpoint = "https://otel.example.com:4317",
headers = { "x-otlp-meta" = "abc123" }
}}
```
If the exporter is `none` nothing is written anywhere; otherwise you
must run or point to your
own collector. All exporters run on a background batch worker that is
flushed on
shutdown.
If you build Codex from source the OTEL crate is still behind an `otel`
feature
flag; the official prebuilt binaries ship with the feature enabled. When
the
feature is disabled the telemetry hooks become no-ops so the CLI
continues to
function without the extra dependencies.
---------
Co-authored-by: Anton Panasenko <apanasenko@openai.com>
This PR adds support for streamable HTTP MCP servers when the
`experimental_use_rmcp_client` is enabled.
To set one up, simply add a new mcp server config with the url:
```
[mcp_servers.figma]
url = "http://127.0.0.1:3845/mcp"
```
It also supports an optional `bearer_token` which will be provided in an
authorization header. The full oauth flow is not supported yet.
The config parsing will throw if it detects that the user mixed and
matched config fields (like command + bearer token or url + env).
The best way to review it is to review `core/src` and then
`rmcp-client/src/rmcp_client.rs` first. The rest is tests and
propagating the `Transport` struct around the codebase.
Example with the Figma MCP:
<img width="5084" height="1614" alt="CleanShot 2025-09-26 at 13 35 40"
src="https://github.com/user-attachments/assets/eaf2771e-df3e-4300-816b-184d7dec5a28"
/>
The [official Rust
SDK](57fc428c57)
has come a long way since we first started our mcp client implementation
5 months ago and, today, it is much more complete than our own
stdio-only implementation.
This PR introduces a new config flag `experimental_use_rmcp_client`
which will use a new mcp client powered by the sdk instead of our own.
To keep this PR simple, I've only implemented the same stdio MCP
functionality that we had but will expand on it with future PRs.
---------
Co-authored-by: pakrym-oai <pakrym@openai.com>