Commit Graph

2785 Commits

Author SHA1 Message Date
Sama Setty
8623dfb4e4 Merge origin/main into dev/ssetty/explicit-plugin-app-startup 2026-04-16 20:15:34 -07:00
pakrym-oai
91e8eebd03 Split codex session modules (#18244)
## Summary
- split `codex.rs` session definitions and constructor into
`codex/session.rs`
- move MCP session methods into `codex/mcp.rs`
- move turn-context types/helpers into `codex/turn_context.rs`
- move review thread spawning into `codex/review.rs`

## Testing
- `cargo check -p codex-core`
- `just fmt`
- `just fix -p codex-core`
- `cargo test -p codex-core` (unit tests passed; integration run failed
locally with 45 failures, including missing helper binaries such as
`test_stdio_server`/`codex` plus approval/web-search/MCP-related cases)
2026-04-16 18:15:19 -07:00
Akshay Nathan
7995c66032 Stream apply_patch changes (#17862)
Adds new events for streaming apply_patch changes from responses api.
This is to enable clients to show progress during file writes.

Caveat: This does not work with apply_patch in function call mode, since
that required adding streaming json parsing.
2026-04-16 18:12:19 -07:00
pakrym-oai
9effa0509f Refactor config loading to use filesystem abstraction (#18209)
Initial pass propagating FileSystem through config loading.
2026-04-17 00:51:21 +00:00
viyatb-oai
2967900d81 fix: deprecate use_legacy_landlock feature flag (#17971)
## Summary
- mark `features.use_legacy_landlock` as a deprecated feature flag
- emit a startup deprecation notice when the flag is configured
- add feature- and core-level regression coverage for the notice


<img width="1288" height="93" alt="Screenshot 2026-04-15 at 11 14 00 PM"
src="https://github.com/user-attachments/assets/fffc628b-614c-4521-9374-64e50a269252"
/>

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 17:37:15 -07:00
viyatb-oai
0d0abe839a feat(sandbox): add glob deny-read platform enforcement (#18096)
## Summary
- adds macOS Seatbelt deny rules for unreadable glob patterns
- expands unreadable glob matches on Linux and masks them in bwrap,
including canonical symlink targets
- keeps Linux glob expansion robust when `rg` is unavailable in minimal
or Bazel test environments
- adds sandbox integration coverage that runs `shell` and `exec_command`
with a `**/*.env = none` policy and verifies the secret contents do not
reach the model

## Linux glob expansion

```text
Prefer:   rg --files --hidden --no-ignore --glob <pattern> -- <search-root>
Fallback: internal globset walker when rg is not installed
Failure:  any other rg failure aborts sandbox construction
```

```
[permissions.workspace.filesystem]
glob_scan_max_depth = 2

[permissions.workspace.filesystem.":project_roots"]
"**/*.env" = "none"
```


This keeps the common path fast without making sandbox construction
depend on an ambient `rg` binary. If `rg` is present but fails for
another reason, the sandbox setup fails closed instead of silently
omitting deny-read masks.

## Platform support
- macOS: subprocess sandbox enforcement is handled by Seatbelt regex
deny rules
- Linux: subprocess sandbox enforcement is handled by expanding existing
glob matches and masking them in bwrap
- Windows: policy/config/direct-tool glob support is already on `main`
from #15979; Windows subprocess sandbox paths continue to fail closed
when unreadable split filesystem carveouts require runtime enforcement,
rather than silently running unsandboxed

## Stack
1. #15979 - merged: cross-platform glob deny-read
policy/config/direct-tool support for macOS, Linux, and Windows
2. This PR - macOS/Linux subprocess sandbox enforcement plus Windows
fail-closed clarification
3. #17740 - managed deny-read requirements

## Verification
- Added integration coverage for `shell` and `exec_command` glob
deny-read enforcement
- `cargo check -p codex-sandboxing -p codex-linux-sandbox --tests`
- `cargo check -p codex-core --test all`
- `cargo clippy -p codex-linux-sandbox -p codex-sandboxing --tests`
- `just bazel-lock-check`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 17:35:16 -07:00
xli-oai
5818ed6660 Move marketplace add under plugin command (#18116)
## Summary
- move the marketplace add CLI from `codex marketplace add` to `codex
plugin marketplace add`
- keep marketplace config overrides working through the nested plugin
command
- reject `--sparse` for local marketplace directory sources before the
local-source install path bypasses git-source validation

## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-cli`
- `cargo test -p codex-core marketplace_add -- --nocapture`
- `cargo test -p codex-core
install_plugin_updates_config_with_relative_path_and_plugin_key --
--nocapture`
- `xli-test-marketplace-cli` local isolated matrix: `T1`, `L1`-`L10`
2026-04-16 17:06:34 -07:00
Jeff Harris
65cc12d72e Use codex-auto-review for guardian reviews (#18169)
## Summary

This is the minimal client-side follow-up for the Codex Auto Review
model slug rollout. It updates the guardian reviewer preferred model
from `gpt-5.4` to `codex-auto-review`, so the client can rely on the
backend catalog + Statsig mapping instead of hardcoding the GPT-5.4
slug.

Context:
https://openai.slack.com/archives/C0AF9328RL0/p1775777479388369?thread_ts=1775773094.071629&cid=C0AF9328RL0

## Testing

- `cargo fmt --package codex-core --check`
- `cargo test -p codex-core guardian::`
- `bazel test --experimental_remote_downloader= --test_output=errors
//codex-rs/core:core-unit-tests --test_arg=guardian`
2026-04-16 15:43:51 -07:00
pakrym-oai
a1736fcd20 [codex] Split codex turn logic (#18206)
## Summary
- Move Codex turn execution logic from `codex.rs` into `codex/turn.rs`.
- Keep the existing crate-visible `run_turn`, `build_prompt`,
`built_tools`, and `get_last_assistant_message_from_turn` surface
re-exported from `codex.rs`.
- Preserve test access for moved turn helpers while reducing the main
`codex.rs` orchestration footprint.

## Stack
- Base: #18200 (`pakrym/split-codex-handlers`)

## Testing
- `CARGO_INCREMENTAL=0 cargo test -p codex-core --lib`
- `just fix -p codex-core`
- `just fmt`
- `git diff --check`
2026-04-16 15:28:59 -07:00
bxie-openai
6a1ddfc366 [codex] Update realtime V2 VAD silence delay and 1.5 prompt (#18092)
## Summary

- set the realtime v2 server VAD silence delay to 500ms
- update the default realtime 1.5 backend prompt to the v4 text
- keep the session payload and prompt rendering tests aligned with those
changes

## Why

- the VAD change gives the voice path a longer pause before ending the
user's turn
- the prompt change makes the default bundled realtime prompt match the
current v4 content

## Validation

- `cargo +1.93.0 test -p codex-core realtime_prompt --manifest-path
/tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml`
- `CARGO_TARGET_DIR=/tmp/codex-pr-v4-target cargo +1.93.0 test -p
codex-api
realtime_v2_session_update_includes_background_agent_tool_and_handoff_output_item
--manifest-path
/tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml`
- `CARGO_TARGET_DIR=/tmp/codex-pr-v4-target cargo +1.93.0 test -p
codex-app-server --test all
'suite::v2::realtime_conversation::realtime_webrtc_start_emits_sdp_notification'
--manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml
-- --exact`
2026-04-16 14:30:57 -07:00
Abhinav
d9c71d41a9 Add OTEL metrics for hook runs (#18026)
# Why
We already emit analytics for completed hook runs, but we don't have
matching OTEL metrics to track hook volume and latency.

# What
- add `codex.hooks.run` and `codex.hooks.run.duration_ms`
- tag both metrics with `hook_name`, `source`, and `status`
- emit the metrics from the completed hook path

Verified locally against a dummy OTLP collector

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 21:30:38 +00:00
Adrian
55c3de75cb Register agent tasks behind use_agent_identity (#17387)
## Summary

Stack PR3 for feature-gated agent identity support.

This PR adds per-thread agent task registration behind
`features.use_agent_identity`. Tasks are minted on the first real user
turn and cached in thread runtime state for later turns.

## Stack

- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - this PR, original
task registration slice
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: https://github.com/openai/codex/pull/17980 - use `AgentAssertion`
downstream when enabled

## Validation

Covered as part of the local stack validation pass:

- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`

## Notes

The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
2026-04-16 14:30:02 -07:00
pakrym-oai
0708cc78cb [codex] Split codex op handlers (#18200)
Start splitting the codex.rs
2026-04-16 14:21:29 -07:00
bxie-openai
37bf42d5d5 [codex] Make realtime startup context truncation deterministic (#18172)
## Summary

- remove the final whole-blob truncation pass from realtime
startup-context assembly
- enforce fixed per-section budgets, including each section heading
- keep the existing per-section caps and raise the overall realtime
startup-context budget to `5300`, matching the sum of those section
budgets
- add focused tests for the new wrapping and section-budget behavior

## Why

The previous flow truncated each section and then middle-truncated the
final combined startup-context blob again. Small input changes could
shift that combined cut point, which made retained context unstable and
caused nondeterministic tests.

## Impact

Startup context now preserves section boundaries and ordering
deterministically. Each section is still budgeted independently, but the
final assembled blob is no longer truncated again as a single opaque
string. To match that design, the overall startup-context token budget
is updated to the sum of the existing section budgets rather than
lowering the section caps.

## Validation

- `cargo +1.93.0 test -p codex-core realtime_context`
- `cargo +1.93.0 test -p codex-core --test all
suite::realtime_conversation::conversation_start_injects_startup_context_from_thread_history
-- --exact`
- `cargo +1.93.0 test -p codex-core --test all
suite::realtime_conversation::conversation_startup_context_current_thread_selects_many_turns_by_budget
-- --exact`
- `cargo +1.93.0 test -p codex-core --test all
suite::realtime_conversation::conversation_startup_context_falls_back_to_workspace_map
-- --exact`
- `cargo +1.93.0 test -p codex-core --test all
suite::realtime_conversation::conversation_startup_context_is_truncated_and_sent_once_per_start
-- --exact`
2026-04-16 13:51:43 -07:00
Felipe Coury
ec8d4bfc77 fix(app-server): replay token usage after resume and fork (#18023)
## Problem

When a user resumed or forked a session, the TUI could render the
restored thread history immediately, but it did not receive token usage
until a later model turn emitted a fresh usage event. That left the
context/status UI blank or stale during the exact window where the user
expects resumed state to look complete. Core already reconstructed token
usage from the rollout; the missing behavior was app-server lifecycle
replay to the client that just attached.

## Mental model

Token usage has two representations. The rollout is the durable source
of historical `TokenCount` events, and the core session cache is the
in-memory snapshot reconstructed from that rollout on resume or fork.
App-server v2 clients do not read core state directly; they learn about
usage through `thread/tokenUsage/updated`. The fix keeps those roles
separate: core exposes the restored `TokenUsageInfo`, and app-server
sends one targeted notification after a successful `thread/resume` or
`thread/fork` response when that restored snapshot exists.

This notification is not a new model event. It is a replay of
already-persisted state for the client that just attached. That
distinction matters because using the normal core event path here would
risk duplicating `TokenCount` entries in the rollout and making future
resumes count historical usage twice.

## Non-goals

This change does not add a new protocol method or payload shape. It
reuses the existing v2 `thread/tokenUsage/updated` notification and the
TUI’s existing handler for that notification.

This change does not alter how token usage is computed, accumulated,
compacted, or written during turns. It only exposes the token usage that
resume and fork reconstruction already restored.

This change does not broadcast historical usage replay to every
subscribed client. The replay is intentionally scoped to the connection
that requested resume or fork so already-attached clients are not
surprised by an old usage update while they may be rendering live
activity.

## Tradeoffs

Sending the usage notification after the JSON-RPC response preserves a
clear lifecycle order: the client first receives the thread object, then
receives restored usage for that thread. The tradeoff is that usage is
still a notification rather than part of the `thread/resume` or
`thread/fork` response. That keeps the protocol shape stable and avoids
duplicating usage fields across response types, but clients must
continue listening for notifications after receiving the response.

The helper selects the latest non-in-progress turn id for the replayed
usage notification. This is conservative because restored usage belongs
to completed persisted accounting, not to newly attached in-flight work.
The fallback to the last turn preserves a stable wire payload for
unusual histories, but histories with no meaningful completed turn still
have a weak attribution story.

## Architecture

Core already seeds `Session` token state from the last persisted rollout
`TokenCount` during `InitialHistory::Resumed` and
`InitialHistory::Forked`. The new core accessor exposes the complete
`TokenUsageInfo` through `CodexThread` without giving app-server direct
session mutation authority.

App-server calls that accessor from three lifecycle paths: cold
`thread/resume`, running-thread resume/rejoin, and `thread/fork`. In
each path, the server sends the normal response first, then calls a
shared helper that converts core usage into
`ThreadTokenUsageUpdatedNotification` and sends it only to the
requesting connection.

The tests build fake rollouts with a user turn plus a persisted token
usage event. They then exercise `thread/resume` and `thread/fork`
without starting another model turn, proving that restored usage arrives
before any next-turn token event could be produced.

## Observability

The primary debug path is the app-server JSON-RPC stream. After
`thread/resume` or `thread/fork`, a client should see the response
followed by `thread/tokenUsage/updated` when the source rollout includes
token usage. If the notification is absent, check whether the rollout
contains an `event_msg` payload of type `token_count`, whether core
reconstruction seeded `Session::token_usage_info`, and whether the
connection stayed attached long enough to receive the targeted
notification.

The notification is sent through the existing
`OutgoingMessageSender::send_server_notification_to_connections` path,
so existing app-server tracing around server notifications still
applies. Because this is a replay, not a model turn event, debugging
should start at the resume/fork handlers rather than the turn event
translation in `bespoke_event_handling`.

## Tests

The focused regression coverage is `cargo test -p codex-app-server
emits_restored_token_usage`, which covers both resume and fork. The core
reconstruction guard is `cargo test -p codex-core
record_initial_history_seeds_token_info_from_rollout`.

Formatting and lint/fix passes were run with `just fmt`, `just fix -p
codex-core`, and `just fix -p codex-app-server`. Full crate test runs
surfaced pre-existing unrelated failures in command execution and plugin
marketplace tests; the new token usage tests passed in focused runs and
within the app-server suite before the unrelated command execution
failure.
2026-04-16 17:29:34 -03:00
Abhinav
8720b7bdce Add codex_hook_run analytics event (#17996)
# Why
Add product analytics for hook handler executions so we can understand
which hooks are running, where they came from, and whether they
completed, failed, stopped, or blocked work.

# What
- add the new `codex_hook_run` analytics event and payload plumbing in
`codex-rs/analytics`
- emit hook-run analytics from the shared hook completion path in
`codex-rs/core`
- classify hook source from the loaded hook path as `system`, `user`,
`project`, or `unknown`

```
{
  "event_type": "codex_hook_run",
  "event_params": {
    "thread_id": "string",
    "turn_id": "string",
    "model_slug": "string",
    "hook_name": "string, // any HookEventName
    "hook_source": "system | user | project | unknown",
    "status": "completed | failed | stopped | blocked"
  }
}
```

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 19:43:16 +00:00
starr-openai
62847e7554 Make thread unsubscribe test deterministic (#18000)
## Summary
- replace the unsubscribe-during-turn test's sleep/polling flow with a
gated streaming SSE response
- add request-count notification support to the streaming SSE test
server so the test can wait for the in-flight Responses request
deterministically

## Scope
- codex-rs/app-server/tests/suite/v2/thread_unsubscribe.rs
- codex-rs/core/tests/common/streaming_sse.rs

## Validation
- Not run locally; this is a narrow extraction from the prior CI-green
branch.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 19:34:04 +00:00
Michael Bolin
dfff8a7d03 fix: drop lock earlier; was held across send_event().await unnecessarily (#18178)
This was flagged by the Codex Security tool: the `state` lock was held
longer than necessary, which included being held across an `async` call,
increasing the potential for deadlock.

While this was flagged by the Codex Security tool, I will look into
enabling
https://rust-lang.github.io/rust-clippy/stable/index.html#await_holding_lock
in a follow-up PR (though unfortunately, that Clippy rule claims it
reports false positives when `drop()` is used to drop a guard instead of
using the end of block scope to drop). Though I can't seem to find a
Clippy rule that checks for opportunities to drop a guard as soon as it
is no longer referenced, in general.
2026-04-16 19:29:25 +00:00
Matthew Zeng
71174574ad Add server-level approval defaults for custom MCP servers (#17843)
## Summary
- Add `default_tools_approval_mode` support for custom MCP server
configs, matching the existing `codex_apps` behavior
- Apply approval precedence as per-tool override, then server default,
then `auto`
- Update config serialization, CLI display, schema generation, docs, and
tests

## Testing
- `cargo check -p codex-config`
- `cargo check -p codex-core`
- `just write-config-schema`
- `just fmt`
- `cargo test -p codex-config`
- Targeted `codex-core` tests for config parsing, config writes, and MCP
approval precedence
- `just fix -p codex-config -p codex-core`
2026-04-16 18:18:07 +00:00
pakrym-oai
206dd13c32 Move more connector logic into connectors crate (#18158)
Reduce the size of core
2026-04-16 11:16:44 -07:00
pakrym-oai
ab97c9aaad Refactor AGENTS.md discovery into AgentsMdManager (#18035)
Encapsulate Agents MD processing a bit and drop user_instructions_path
from config.
2026-04-16 10:51:33 -07:00
xli-oai
faf48489f3 Auto-upgrade configured marketplaces (#17425)
## Summary
- Add best-effort auto-upgrade for user-configured Git marketplaces
recorded in `config.toml`.
- Track the last activated Git revision with `last_revision` so
unchanged marketplace sources skip clone work.
- Trigger the upgrade from plugin startup and `plugin/list`, while
preserving existing fail-open plugin behavior with warning logs rather
than new user-visible errors.

## Details
- Remote configured marketplaces use `git ls-remote` to compare the
source/ref against the recorded revision.
- Upgrades clone into a staging directory, validate that
`.agents/plugins/marketplace.json` exists and that the manifest name
matches the configured marketplace key, then atomically activate the new
root.
- Local `.agents/plugins/marketplace.json` marketplaces remain live
filesystem state and are not auto-pulled.
- Existing non-curated plugin cache refresh is kicked after successful
marketplace root upgrades.

## Validation
- `just write-config-schema`
- `cargo test -p codex-core marketplace_upgrade`
- `cargo check -p codex-cli -p codex-app-server`
- `just fix -p codex-core`

Did not run the complete `cargo test` suite because the repo
instructions require asking before a full core workspace run.
2026-04-16 10:36:34 -07:00
alexsong-oai
109b22a8d0 Improve external agent plugin migration for configured marketplaces (#18055) 2026-04-16 17:34:38 +00:00
viyatb-oai
6862b9c745 feat(permissions): add glob deny-read policy support (#15979)
## Summary
- adds first-class filesystem policy entries for deny-read glob patterns
- parses config such as :project_roots { "**/*.env" = "none" } into
pattern entries
- enforces deny-read patterns in direct read/list helpers
- fails closed for sandbox execution until platform backends enforce
glob patterns in #18096
- preserves split filesystem policy in turn context only when it cannot
be reconstructed from legacy sandbox policy

## Stack
1. This PR - glob deny-read policy/config/direct-tool support
2. #18096 - macOS and Linux sandbox enforcement
3. #17740 - managed deny-read requirements

## Verification
- just fmt
- cargo check -p codex-core -p codex-sandboxing --tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 10:31:51 -07:00
Ahmed Ibrahim
2ca270d08d [2/8] Support piped stdin in exec process API (#18086)
## Summary
- Add an explicit stdin mode to process/start.
- Keep normal non-interactive exec stdin closed while allowing
pipe-backed processes.

## Stack
```text
o  #18027 [8/8] Fail exec client operations after disconnect
│
o  #18025 [7/8] Cover MCP stdio tests with executor placement
│
o  #18089 [6/8] Wire remote MCP stdio through executor
│
o  #18088 [5/8] Add executor process transport for MCP stdio
│
o  #18087 [4/8] Abstract MCP stdio server launching
│
o  #18020 [3/8] Add pushed exec process events
│
@  #18086 [2/8] Support piped stdin in exec process API
│
o  #18085 [1/8] Add MCP server environment config
│
o  main
```

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 10:30:10 -07:00
Won Park
3a4fa77ad7 Make yolo skip managed-network tool enforcement (#18042)
## Summary

This makes `DangerFullAccess` / yolo tool execution fully opt out of
managed-network enforcement.

Previously, yolo turns could have `turn.network` stripped while tool
orchestration still derived `enforce_managed_network=true` from
`requirements.toml.network`. That created an inconsistent state where
the turn had no managed proxy attached, but tool execution still behaved
like managed networking was active.

This updates the tool orchestration and JS REPL paths to treat managed
networking as active only when the current turn actually has
`turn.network`.

## Behavior

- Yolo / `DangerFullAccess`: no managed proxy, no managed-network
enforcement.
- Guardian / workspace-write with managed proxy: managed-network
enforcement still applies.
- Avoids the half-state where yolo has no proxy but still gets
managed-network sandbox behavior.

## Tests

- `just fmt`
- `cargo test -p codex-core
danger_full_access_tool_attempts_do_not_enforce_managed_network --
--nocapture`
- `cargo test -p codex-core danger_full_access -- --nocapture`
- `just fix -p codex-core`

Co-authored-by: jgershen-oai <jgershen@openai.com>
2026-04-16 09:06:10 -07:00
Won Park
85203d8872 Launch image generation by default (#17153)
## Summary
- Promote `image_generation` from under-development to stable
- Enable image generation by default in the feature registry
- Update feature coverage for the new launch-state expectation
- Add the missing image-generation auth fixture field in a tool registry
test

## Testing
- `just fmt`
- `cargo test -p codex-features`
- `cargo test -p codex-tools` currently fails:
`test_full_toolset_specs_for_gpt5_codex_unified_exec_web_search` needs
its expected default tool list updated for `image_generation`
2026-04-16 09:05:38 -07:00
sayan-oai
9c6d038622 [code mode] defer mcp tools from exec description (#17287)
## Summary
- hide deferred MCP/app nested tool descriptions from the `exec` prompt
in code mode
- add short guidance that omitted nested tools are still available
through `ALL_TOOLS`
- cover the code_mode_only path with an integration test that discovers
and calls a deferred app tool

## Motivation
`code_mode_only` exposes only top-level `exec`/`wait`, but the `exec`
description could still include a large nested-tool reference. This
keeps deferred nested tools callable while avoiding that prompt bloat.

## Tests
- `just fmt`
- `just fix -p codex-code-mode`
- `just fix -p codex-tools`
- `cargo test -p codex-code-mode
exec_description_mentions_deferred_nested_tools_when_available`
- `cargo test -p codex-tools
create_code_mode_tool_matches_expected_spec`
- `cargo test -p codex-core
code_mode_only_guides_all_tools_search_and_calls_deferred_app_tools`
2026-04-17 00:01:14 +08:00
Ahmed Ibrahim
b4be3617f9 [1/8] Add MCP server environment config (#18085)
## Summary
- Add an MCP server environment setting with local as the default.
- Thread the default through config serialization, schema generation,
and existing config fixtures.

## Stack
```text
o  #18027 [8/8] Fail exec client operations after disconnect
│
o  #18025 [7/8] Cover MCP stdio tests with executor placement
│
o  #18089 [6/8] Wire remote MCP stdio through executor
│
o  #18088 [5/8] Add executor process transport for MCP stdio
│
o  #18087 [4/8] Abstract MCP stdio server launching
│
o  #18020 [3/8] Add pushed exec process events
│
o  #18086 [2/8] Support piped stdin in exec process API
│
@  #18085 [1/8] Add MCP server environment config
│
o  main
```

Co-authored-by: Codex <noreply@openai.com>
2026-04-16 08:50:03 -07:00
David de Regt
6adba99f4d Stabilize Bazel tests (timeout tweaks and flake fixes) (#17791) 2026-04-16 07:57:51 -07:00
jif-oai
b33478c236 chore: unify memory drop endpoints (#18134)
Unify all the memories drop behind a single implementation that drops
both the main memories and the extensions
2026-04-16 15:44:23 +01:00
jif-oai
18e9ac8c75 chore: more pollution filtering (#18138) 2026-04-16 15:32:32 +01:00
jif-oai
9c326c4cb4 nit: add min values for memories (#18137)
Just add min values to some memories config fields
2026-04-16 14:37:43 +01:00
jif-oai
b0324f9f05 fix: more flake (#18006)
Stabilizes the Responses API proxy header test by splitting the coverage
at the right boundary:
- Core integration test now verifies parent/subagent identity headers
directly from captured `/responses` requests.
- Proxy dump unit test now verifies those identity headers are preserved
in dumped request JSON.
- Removes the flaky real proxy process + temp-file dump polling path
from the core test.
2026-04-16 10:01:45 +01:00
jackz-oai
f97be7dfff [codex] Route Fed ChatGPT auth through Fed edge (#17151)
## Summary
- parse chatgpt_account_is_fedramp from signed ChatGPT auth metadata
- add _account_is_fedramp=true to ChatGPT backend-api requests only for
FedRAMP ChatGPT-auth accounts
2026-04-16 07:13:15 +00:00
xl-openai
48cf3ed7b0 Extract plugin loading and marketplace logic into codex-core-plugins (#18070)
Split plugin loading, marketplace, and related infrastructure out of
core into codex-core-plugins, while keeping the core-facing
configuration and orchestration flow in codex-core.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-15 23:13:17 -07:00
Matthew Zeng
224dad41ac [codex][mcp] Add resource uri meta to tool call item. (#17831)
- [x] Add resource uri meta to tool call item so that the app-server
client can start prefetching resources immediately without loading mcp
server status.
2026-04-16 05:09:17 +00:00
Matthew Zeng
77fe33bf72 Update ToolSearch to be enabled by default (#17854)
## Summary
- Promote `Feature::ToolSearch` to `Stable` and enable it in the default
feature set
- Update feature tests and tool registry coverage to match the new
default
- Adjust the search-tool integration test to assert the default-on path
and explicit disable fallback

## Testing
- `just fmt`
- `cargo test -p codex-features`
- `cargo test -p codex-core --test all search_tool`
- `cargo test -p codex-tools`
2026-04-15 22:01:05 -07:00
pakrym-oai
bd61737e8a Async config loading (#18022)
Parts of config will come from executor. Prepare for that by making
config loading methods async.
2026-04-15 19:18:38 -07:00
Won Park
e2dbe7dfc3 removing network proxy for yolo (#17742)
**Summary**
- prevent managed requirements.toml network settings from leaking into
DangerFullAccess / yolo turns by gating managed proxy attachment on
sandbox mode
- keep guardian/sandboxed modes on the managed proxy path, while making
true yolo bypass the proxy entirely, including /shell full-access
commands
2026-04-16 00:02:42 +00:00
bxie-openai
c2bdb7812c Clarify realtime v2 context and handoff messages (#17896)
## Summary
- wrap realtime startup context in
`<startup_context>...</startup_context>` tags
- prefix V2 mirrored user text and relayed backend text with `[USER]` /
`[BACKEND]`
- remove the V2 progress suffix and replace the final V2 handoff output
with a short completion acknowledgement while preserving the existing V1
wrapper

## Testing
- cargo test -p codex-api
realtime_v2_session_update_includes_background_agent_tool_and_handoff_output_item
-- --exact
- cargo test -p codex-app-server webrtc_v2_background_agent_
- cargo test -p codex-app-server webrtc_v2_text_input_is_
- cargo test -p codex-core conversation_user_text_turn_is_
2026-04-15 16:26:20 -07:00
Matthew Zeng
28b76d13fe [mcp] Add dummy tools for previously called but currently missing tools. (#17853)
- [x] Add dummy tools for previously called but currently missing tools.
Currently supporting MCP tools only.
2026-04-15 21:48:05 +00:00
Curtis 'Fjord' Hawthorne
9e2fc31854 Support original-detail metadata on MCP image outputs (#17714)
## Summary
- honor `_meta["codex/imageDetail"] == "original"` on MCP image content
and map it to `detail: "original"` where supported
- strip that detail back out when the active model does not support
original-detail image inputs
- update code-mode `image(...)` to accept individual MCP image blocks
- teach `js_repl` / `codex.emitImage(...)` to preserve the same hint
from raw MCP image outputs
- document the new `_meta` contract and add generic RMCP-backed coverage
across protocol, core, code-mode, and js_repl paths
2026-04-15 14:43:33 -07:00
evawong-oai
17d94bd1e3 [docs] Revert extra changes from PR 17848 (#18003)
## Summary

1. Revert https://github.com/openai/codex/pull/17848 so the Bazel and
`BUILD` file changes leave `main`.
2. Prepare for a narrower follow up that restores only `SECURITY.md`.

## Validation

1. Reviewed the revert diff against `main`.
2. Ran a clean diff check before push.
2026-04-15 14:43:30 -07:00
xl-openai
e70ccdeaf7 feat: Support alternate marketplace manifests and local string (#17885)
- Discover marketplace manifests from different supported layout paths
instead of only .agents/plugins/marketplace.json.
- Accept local plugin sources written either as { source: "local", path:
... } or as a direct string path.
- Skip unsupported or invalid plugin source entries without failing the
entire marketplace, and keep valid local plugins loadable.
2026-04-15 14:16:41 -07:00
Tom
50d3128269 Migrate archive/unarchive to local ThreadStore (#17892)
# Summary
- implement local ThreadStore archive/unarchive operations
- implement local ThreadStore read_thread operation
- break up the various ThreadStore local method implementations into
separate files
- migrate app-server archive/unarchive and core archive fixture to use
ThreadStore (but not all read operations yet!)
- use the ThreadStore's read operation as a proxy check for thread
persistence/existence in the app server code
- move all other filesystem operations related to archive (path
validation etc) into the local thread store.

# Tests
- add dedicated local store archive/unarchive tests
2026-04-15 20:48:09 +00:00
evawong-oai
0bb438bca6 [docs] Add security boundaries reference in SECURITY.md (#17848)
## Summary
1. Add a Security Boundaries section to `SECURITY.md`.
2. Point readers to the Codex Agent approvals and security documentation
for sandboxing, approvals, and network controls.

## Validation
1. Reviewed the `SECURITY.md` diff in a clean worktree.
2. No tests run. Docs only change.
2026-04-15 20:12:46 +00:00
Ivan Murashko
f2a4925f63 Support remote compaction for Azure responses providers (#17958)
Azure Responses providers were still falling back to local compaction
because the compaction gate only checked
`ModelProviderInfo::is_openai()`.

Move the capability check onto `ModelProviderInfo` with
`supports_remote_compaction()`, backed by the existing Azure Responses
endpoint detection used in `codex-api`, and have `core::compact`
delegate to that helper.

Add regression coverage for:
- OpenAI providers using remote compaction
- Azure providers using remote compaction
- non-OpenAI/non-Azure providers staying on the local path

resolves #17773

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-04-15 13:05:11 -07:00
Michael Bolin
66533ddc61 mcp: remove codex/sandbox-state custom request support (#17957)
## Why

#17763 moved sandbox-state delivery for MCP tool calls to request
`_meta` via the `codex/sandbox-state-meta` experimental capability.
Keeping the older `codex/sandbox-state` capability meant Codex still
maintained a second transport that pushed updates with the custom
`codex/sandbox-state/update` request at server startup and when the
session sandbox policy changed.

That duplicate MCP path is redundant with the per-tool-call metadata
path and makes the sandbox-state contract larger than needed. The
existing managed network proxy refresh on sandbox-policy changes is
still needed, so this keeps that behavior separate from the removed MCP
notification.

## What Changed

- Removed the exported `MCP_SANDBOX_STATE_CAPABILITY` and
`MCP_SANDBOX_STATE_METHOD` constants.
- Removed detection of `codex/sandbox-state` during MCP initialization
and stopped sending `codex/sandbox-state/update` at server startup.
- Removed the `McpConnectionManager::notify_sandbox_state_change`
plumbing while preserving the managed network proxy refresh when a user
turn changes sandbox policy.
- Slimmed `McpConnectionManager::new` so startup paths pass only the
initial `SandboxPolicy` needed for MCP elicitation state.
- Kept `codex/sandbox-state-meta` support intact; servers that opt in
still receive the current `SandboxState` on tool-call request `_meta`
([remaining call
path](ff2d3c1e72/codex-rs/core/src/mcp_tool_call.rs (L487-L526))).
- Added regression coverage for refreshing the live managed network
proxy on a per-turn sandbox-policy change.

## Verification

- `cargo test -p codex-core
new_turn_refreshes_managed_network_proxy_for_sandbox_change`
- `cargo test -p codex-mcp`
2026-04-15 12:02:40 -07:00
pakrym-oai
f5e8eac2ae Refactor auth providers to mutate request headers (#17866)
## Summary
- Move auth header construction into the
`AuthProvider::add_auth_headers` contract.
- Inline `CoreAuthProvider` header mutation in its provider impl and
remove the shared header-map helper.
- Update HTTP, websocket, file upload, sideband websocket, and test auth
callsites to use the provider method.
- Add direct coverage for `CoreAuthProvider` auth header mutation.

## Testing
- `just fmt`
- `cargo test -p codex-api`
- `cargo test -p codex-core
client::tests::auth_request_telemetry_context_tracks_attached_auth_and_retry_phase`
- `cargo test -p codex-core` failed on unrelated/reproducible
`tools::handlers::multi_agents::tests::multi_agent_v2_followup_task_interrupts_busy_child_without_losing_message`

---------

Co-authored-by: Celia Chen <celia@openai.com>
2026-04-15 11:52:51 -07:00