Commit Graph

2838 Commits

Author SHA1 Message Date
Eric Traut
fa0e2ba87c Avoid false shell snapshot cleanup warnings (#18441)
## Why
Fresh app-server thread startup can create a shell snapshot through a
temp file and then promote it to the final snapshot path. The previous
implementation briefly wrapped the temp path in `ShellSnapshot`, so
after a successful rename its `Drop` attempted to delete the old temp
path and could log a false `ENOENT` warning.

Fixes #17549.

## What changed
- Validate the temp snapshot path directly before promotion.
- Rename the temp path directly to the final snapshot path.
- Keep explicit cleanup of the temp path on validation or finalization
failures.
2026-04-20 15:15:05 +01:00
Adrian
904c751a40 [codex] Use background agent task auth for backend calls (#18094)
## Summary

Introduces a single background/control-plane agent task for ChatGPT
backend requests that do not have a thread-scoped task, with
`AuthManager` owning the default ChatGPT backend authorization decision.

Callers now ask `AuthManager` for the default ChatGPT backend
authorization header. `AuthManager` decides whether that is bearer or
background AgentAssertion based on config/internal state, while
low-level bootstrap paths can explicitly request bearer-only auth.

This PR is stacked on PR4 and focuses on the shared background task auth
plumbing plus the first tranche of backend/control-plane consumers. The
remaining callsite wiring is split into PR4.2 to keep review size down.

## Stack

- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: https://github.com/openai/codex/pull/17980 - use task-scoped
`AgentAssertion` for downstream calls
- PR4.1: this PR - introduce AuthManager-owned background/control-plane
`AgentAssertion` auth
- PR4.2: https://github.com/openai/codex/pull/18260 - use background
task auth for additional backend/control-plane calls

## What Changed

- add background task registration and assertion minting inside
`codex-login`
- persist `agent_identity.background_task_id` separately from
per-session task state
- make `BackgroundAgentTaskManager` private to `codex-login`; call sites
do not instantiate or pass it around
- teach `AuthManager` the ChatGPT backend base URL and feature-derived
background auth mode from resolved config
- expose bearer-only helpers for bootstrap/registration/refresh-style
paths that must not use AgentAssertion
- wire `AuthManager` default ChatGPT authorization through app listing,
connector directory listing, remote plugins, MCP status/listing,
analytics, and core-skills remote calls
- preserve bearer fallback when the feature is disabled, the backend
host is unsupported, or background task registration is not available

## Validation

- `just fmt`
- `cargo check -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `cargo test -p codex-login agent_identity`
- `cargo test -p codex-model-provider bearer_auth_provider`
- `cargo test -p codex-core agent_assertion`
- `cargo test -p codex-app-server remote_control`
- `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
- `cargo test -p codex-models-manager manager::tests`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-cloud-tasks`
- `just fix -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-20 06:50:28 -07:00
jif-oai
2c59806fe0 feat: add metric to track the number of turns with memory usage (#18662)
Add a metric `codex.turn.memory` to know if a turn used memories or not.
This is not part of the other turn metrics as a label to limit
cardinality
2026-04-20 14:31:22 +01:00
jif-oai
1c24347772 feat: chronicle alias (#18651)
Rename Telepathy to Chronicle and add an alias for backward
compatibility
2026-04-20 11:52:21 +01:00
jif-oai
fc758af9eb fix: exec policy loading for sub-agents (#18654) 2026-04-20 11:51:58 +01:00
jif-oai
ff6a5804d2 nit: telepathy to chronicle in tests (#18652) 2026-04-20 11:51:55 +01:00
jif-oai
be4fe9f9b2 feat: add --ignore-user-config and --ignore-rules (#18646)
Add those 2 flags to be able to fully isolate a run of `codex exec` from
any rules or tools.
This will be used by Chronicle
2026-04-20 11:27:47 +01:00
jif-oai
7d8bd69283 fix: FS watcher when file does not exist yet (#18492)
The initial goal of this PR was to stabilise the test
`fs_watch_allows_missing_file_targets`. After further investigation, it
turns out that this test was always failing and the unstability was
coming from a race between timeouts mostly

The goal of the test was to test what happens if a notifier gets
subscribed while a file does not exist yet. But actually the main code
was broken and in case of a file not existing yet, the notifier used to
never notify anything (even if the file ended up being created)

This PR fixes the main code (and the test). For this, we basically watch
the sup-directory when a file does not exist and refresh on it when the
files gets created
2026-04-20 11:23:00 +01:00
jif-oai
7171b25b30 fix: main 2 (#18649) 2026-04-20 10:53:54 +01:00
jif-oai
b528ff02b6 chore: morpheus to path (#18353)
Make the morpheus agent (which is the phase 2 memories agent) follow the
agent-v2 path system by naming it `/morpheus`. To maintain the path
primitive this means moving it to a dedicated `AgentControl`

Co-authored-by: Codex <noreply@openai.com>
2026-04-20 10:32:20 +01:00
jif-oai
e404c4e910 feat: add mem 2 agent header (#18644)
Add a header to memory phase 2 agent for analytics
2026-04-20 09:58:32 +01:00
Adrian
b44d2851cf [codex] Use AgentAssertion downstream behind use_agent_identity (#17980)
## Summary

This is the AgentAssertion downstream slice for feature-gated agent
identity support, replacing the oversized AgentAssertion slice from PR
#17807.

It isolates task-scoped downstream AgentAssertion wiring on top of the
merged PR3.1 work without re-carrying the earlier agent registration,
task registration, or task-state history.

This PR includes the task-scoped bug-fix call sites from the review:
generic file upload auth, MCP OpenAI file upload auth, and ARC monitor
auth. Broader user/control-plane calls move to PR4.1 and PR4.2.

## Stack

- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: this PR - use task-scoped `AgentAssertion` downstream when
enabled
- PR4.1: https://github.com/openai/codex/pull/18094 - introduce
AuthManager-owned background/control-plane `AgentAssertion` auth
- PR4.2: https://github.com/openai/codex/pull/18260 - use background
task auth for additional backend/control-plane calls

## What Changed

- add AgentAssertion envelope generation in `codex-core`
- route downstream HTTP and websocket auth through AgentAssertion when
an agent task is present
- extend the model-provider auth provider so non-bearer authorization
schemes can be passed through cleanly
- make generic file uploads attach the full authorization header value
- make MCP OpenAI file uploads use the cached thread agent task
assertion when present
- make ARC monitor calls use the cached thread agent task assertion when
present

## Why

The original PR had drifted ancestry and showed a much larger diff than
the semantic change actually required. Restacking it onto PR3.1 keeps
the reviewable surface down to the downstream assertion slice.

## Validation

- `just fmt`
- `cargo check -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `cargo test -p codex-model-provider bearer_auth_provider`
- `cargo test -p codex-core agent_assertion`
- `cargo test -p codex-app-server remote_control`
- `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
- `cargo test -p codex-models-manager manager::tests`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-cloud-tasks`
- `cargo test -p codex-login agent_identity`
- `just fix -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-19 23:16:43 -07:00
richardopenai
3c75f9b4dd [codex] Add workspace owner usage nudge UI (#18221)
## Summary

Third PR in the split from #17956. Stacked on #18220.

- shows workspace-owner/member-specific rate-limit messages behind
`workspace_owner_usage_nudge`
- prompts workspace members to notify the owner or request a usage-limit
increase
- sends the confirmed nudge through the app-server API and renders
completion feedback
- adds focused TUI snapshot coverage for prompts and completion states
- feature gate

## Validation

- `cargo test -p codex-backend-client`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server rate_limits`
- `cargo test -p codex-tui workspace_`
- `cargo test -p codex-tui status_`
- `just fmt`
- `just fix -p codex-backend-client`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
2026-04-20 05:51:47 +00:00
Andrey Mishchenko
ab65fbbdd6 Add codex debug models to show model catalog (#18625) 2026-04-20 05:42:22 +00:00
Dylan Hurd
0500801123 fix(guardian) disable skills message in guardian thread (#18599)
## Summary
Remove the skills message from the guardian dev message

## Test Plan
- [x] Ran locally
- [x] Added unit test

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-20 04:42:55 +00:00
Dylan Hurd
49403e3676 chore(multiagent) skills instructions toggle (#18596)
## Summary
Support toggling the skills message off.

## Test Plan
- [x] Updated unit tests
2026-04-19 21:11:52 -07:00
Adrian
e5b52a3caa Persist and prewarm agent tasks per thread (#17978)
## Summary
- persist registered agent tasks in the session state update stream so
the thread can reuse them
- prewarm task registration once identity registration succeeds, while
keeping startup failures best-effort
- isolate the session-side task lifecycle into a dedicated module so
AgentIdentityManager and RegisteredAgentTask do not leak across as many
core layers

## Testing
- cargo test -p codex-core startup_agent_task_prewarm
- cargo test -p codex-core
cached_agent_task_for_current_identity_clears_stale_task
- cargo test -p codex-core record_initial_history_
2026-04-19 15:45:28 -07:00
Ahmed Ibrahim
ed1c5013ab Remove unused models.json (#18585)
- Remove the stale core models catalog.
- Update the release workflow to refresh the active models-manager
catalog.
2026-04-19 11:58:55 -07:00
Ahmed Ibrahim
d556e68ff0 Log realtime session id (#18571)
- Log the actual realtime session id when the session.updated event
arrives.
2026-04-19 11:23:25 -07:00
alexsong-oai
cce6002339 Add fallback source for external official marketplace (#18524) 2026-04-19 11:04:13 -07:00
Ahmed Ibrahim
996aa23e4c [5/6] Wire executor-backed MCP stdio (#18212)
## Summary
- Add the executor-backed RMCP stdio transport.
- Wire MCP stdio placement through the executor environment config.
- Cover local and executor-backed stdio paths with the existing MCP test
helpers.

## Stack
```text
o  #18027 [6/6] Fail exec client operations after disconnect
│
@  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
o  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-18 21:47:43 -07:00
pakrym-oai
53b1570367 Update image outputs to default to high detail (#18386)
Do not assume the default `detail`.
2026-04-18 11:01:12 -07:00
jif-oai
e3c2acb9cd Revert "[codex] drain mailbox only at request boundaries" (#18325)
## Summary
- Reverts PR #17749 so queued inter-agent mail can again preempt after
reasoning/commentary output item boundaries.
- Applies the revert to the current `codex/turn.rs` module layout and
restores the prior pending-input test expectations/snapshots.

## Testing
- `just fmt`
- `cargo test -p codex-core --test all pending_input`
- `cargo test -p codex-core` failed in unrelated
`tools::js_repl::tests::js_repl_imported_local_files_can_access_repl_globals`:
dotslash download hit `mktemp: mkdtemp failed ... Operation not
permitted` in the sandbox temp dir.

Co-authored-by: Codex <noreply@openai.com>
2026-04-18 09:53:48 -07:00
Ahmed Ibrahim
5bb193aa88 Add max context window model metadata (#18382)
Adds max_context_window to model metadata and routes core context-window
reads through resolved model info. Config model_context_window overrides
are clamped to max_context_window when present; without an override, the
model context_window is used.
2026-04-17 21:48:14 -07:00
xli-oai
e9c70fff3f [codex] Add marketplace remove command and shared logic (#17752)
## Summary

Move the marketplace remove implementation into shared core logic so
both the CLI command and follow-up app-server RPC can reuse the same
behavior.

This change:
- adds a shared `codex_core::plugins::remove_marketplace(...)` flow
- moves validation, config removal, and installed-root deletion out of
the CLI
- keeps the CLI as a thin wrapper over the shared implementation
- adds focused core coverage for the shared remove path

## Validation

- `just fmt`
- focused local coverage for the shared remove path
- heavier follow-up validation deferred to stacked PR CI
2026-04-17 21:44:47 -07:00
xli-oai
def6467d2b [codex] Describe uninstalled cross-repo plugin reads (#18449)
## Summary
- Populate `PluginDetail.description` in core for uninstalled cross-repo
plugins when detailed fields are unavailable until install.
- Include the source Git URL plus optional path/ref/sha details in that
fallback description.
- Keep `details_unavailable_reason` as the structured signal while
app-server forwards the description normally.
- Add plugin-read coverage proving the response does not clone the
remote source just to show the message.

## Why
Uninstalled cross-repo plugins intentionally return sparse detail data
so listing/reading does not clone the plugin source. Without a
description, Desktop and TUI detail pages look like an ordinary empty
plugin. This gives users a concrete explanation and source pointer while
keeping the existing structured reason available for callers.

## Validation
- `just fmt`
- `cargo test -p codex-core
read_plugin_for_config_uninstalled_git_source_requires_install_without_cloning`
- `cargo test -p codex-app-server plugin_read --test all`
- `just fix -p codex-core`
- `just fix -p codex-app-server`

Note: `cargo test -p codex-app-server` was also attempted before the
latest refactor and failed broadly in unrelated v2
thread/realtime/review/skills suites; the new plugin-read test passed in
that run as well.
2026-04-17 20:31:13 -07:00
xl-openai
3f7222ec76 feat: Budget skill metadata and surface trimming as a warning (#18298)
Cap the model-visible skills section to a small share of the context
window, with a fallback character budget, and keep only as many implicit
skills as fit within that budget.

Emit a non-fatal warning when enabled skills are omitted, and add a new
app-server warning notification

Record thread-start skill metrics for total enabled skills, kept skills,
and whether truncation happened

---------

Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-04-17 18:11:47 -07:00
alexsong-oai
93ff798e5b [TUI] add external config migration prompt when start TUI (#17891)
- add a TUI startup migration prompt for external agent config
- support migrating external configs including config, skills, AGENTS.md
and plugins
- gate the prompt behind features.external_migrate (default false)

<img width="1037" height="480" alt="Screenshot 2026-04-14 at 9 29 14 PM"
src="https://github.com/user-attachments/assets/6060849b-03cb-429a-9c13-c7bb46ad2e65"
/>
<img width="713" height="183" alt="Screenshot 2026-04-14 at 9 29 26 PM"
src="https://github.com/user-attachments/assets/d13f177e-d4c4-479c-8736-ef29636081e1"
/>

---------

Co-authored-by: Eric Traut <etraut@openai.com>
2026-04-17 17:58:32 -07:00
viyatb-oai
370bed4bf4 fix: trust-gate project hooks and exec policies (#14718)
## Summary
- trust-gate project `.codex` layers consistently, including repos that
have `.codex/hooks.json` or `.codex/execpolicy/*.rules` but no
`.codex/config.toml`
- keep disabled project layers in the config stack so nested trusted
project layers still resolve correctly, while preventing hooks and exec
policies from loading until the project is trusted
- update app-server/TUI onboarding copy to make the trust boundary
explicit and add regressions for loader, hooks, exec-policy, and
onboarding coverage

## Security
Before this change, an untrusted repo could auto-load project hooks or
exec policies from `.codex/` as long as `config.toml` was absent. This
makes trust the single gate for project-local config, hooks, and exec
policies.

## Stack
- Parent of #15936

## Test
- cargo test -p codex-core without_config_toml

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-17 17:56:58 -07:00
xl-openai
26d9894a27 feat: Add remote plugin fields to plugin API (#17277)
## Summary
Update the plugin API for the new remote plugin model.

The mental model is no longer “keep local plugin state in sync with
remote.” Instead, local and remote plugins are becoming separate
sources. Remote catalog entries can be shown directly from the remote
API before installation; after installation they are still downloaded
into the local cache for execution, but remote installed state will come
from the API and be held in memory rather than being read from config.

• ## API changes
- Remove `forceRemoteSync` from `plugin/list`, `plugin/install`, and
`plugin/uninstall`.
  - Remove `remoteSyncError` from `plugin/list`.
  - Add remote-capable metadata to `plugin/list` / `plugin/read`:
    - nullable `marketplaces[].path`
    - `source: { type: "remote", downloadUrl }`
    - URL asset fields alongside local path fields:
  `composerIconUrl`, `logoUrl`, `screenshotUrls`
  - Make `plugin/read` and `plugin/install` source-compatible:
    - `marketplacePath?: AbsolutePathBuf | null`
    - `remoteMarketplaceName?: string | null`
    - exactly one source is required at runtime
2026-04-17 16:47:58 -07:00
pakrym-oai
120bbf46c1 Update image resizing to fit 2048 square bounds (#18384)
We don't have to downsize to 768 height.
2026-04-17 16:31:03 -07:00
Michael Bolin
96d35dd640 bazel: use native rust test sharding (#18082)
## Why

The large Rust test suites are slow and include some of our flakiest
tests, so we want to run them with Bazel native sharding while keeping
shard membership stable between runs.

This is the simpler follow-up to the explicit-label experiment in
#17998. Since #18397 upgraded Codex to `rules_rs` `0.0.58`, which
includes the stable test-name hashing support from
hermeticbuild/rules_rust#14, this PR only needs to wire Codex's Bazel
macros into that support.

Using native sharding preserves BuildBuddy's sharded-test UI and Bazel's
per-shard test action caching. Using stable name hashing avoids
reshuffling every test when one test is added or removed.

## What Changed

`codex_rust_crate` now accepts `test_shard_counts` and applies the right
Bazel/rules_rust attributes to generated unit and integration test
rules. Matched tests are also marked `flaky = True`, giving them Bazel's
default three attempts.

This PR shards these labels 8 ways:

```text
//codex-rs/core:core-all-test
//codex-rs/core:core-unit-tests
//codex-rs/app-server:app-server-all-test
//codex-rs/app-server:app-server-unit-tests
//codex-rs/tui:tui-unit-tests
```

## Verification

`bazel query --output=build` over the selected public labels and their
inner unit-test binaries confirmed the expected `shard_count = 8`,
`flaky = True`, and `experimental_enable_sharding = True` attributes.

Also verified that we see the shards as expected in BuildBuddy so they
can be analyzed independently.

Co-authored-by: Codex <noreply@openai.com>
2026-04-17 23:14:11 +00:00
viyatb-oai
f705f42ba8 fix: fix fs sandbox helper for apply_patch (#18296)
## Summary

- pass split filesystem sandbox policy/cwd through apply_patch contexts,
while omitting legacy-equivalent policies to keep payloads small
- keep the fs helper compatible with legacy Landlock by avoiding helper
read-root permission expansion in that mode and disabling helper network
access

## Root Cause

`d626dc38950fb40a1a5ad0a8ffab2485e3348c53` routed exec-server filesystem
operations through a sandboxed helper. That path forwarded legacy
Landlock into a helper policy shape that could require direct
split-policy enforcement. Sandboxed `apply_patch` hit that edge through
the filesystem abstraction.

The same 0.121 edit-regression path is consistent with #18354: normal
writes route through the `apply_patch` filesystem helper, fail under
sandbox, and then surface the generic retry-without-sandbox prompt.

Fixes #18069
Fixes #18354

## Validation

- `cd codex-rs && just fmt`
- earlier branch validation before merging current `origin/main` and
dropping the now-separate PATH fix:
  - `cd codex-rs && cargo test -p codex-exec-server`
- `cd codex-rs && cargo test -p codex-core file_system_sandbox_context`
  - `cd codex-rs && just fix -p codex-exec-server`
  - `cd codex-rs && just fix -p codex-core`
  - `git diff --check`
  - `cd codex-rs && cargo clean`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-17 15:39:07 -07:00
xli-oai
0e111e08d0 [codex] Add cross-repo plugin sources to marketplace manifests (#18017)
## Summary
- add first-class marketplace support for git-backed plugin sources
- keep the newer marketplace parsing behavior from `main`, including
alternate manifest locations and string local sources
- materialize remote plugin sources during install, detail reads, and
non-curated cache refresh
- expose git plugin source metadata through the app-server protocol

## Details
This teaches the marketplace parser to accept all of the following:
- local string sources such as `"source": "./plugins/foo"`
- local object sources such as
`{"source":"local","path":"./plugins/foo"}`
- remote repo-root sources such as
`{"source":"url","url":"https://github.com/org/repo.git"}`
- remote subdir sources such as
`{"source":"git-subdir","url":"owner/repo","path":"plugins/foo","ref":"main","sha":"..."}`

It also preserves the newer tolerant behavior from `main`: invalid or
unsupported plugin entries are skipped instead of breaking the whole
marketplace.

## Validation
- `cargo test -p codex-core plugins::marketplace::tests`
- `just fix -p codex-core`
- `just fmt`

## Notes
- A full `cargo test -p codex-core` run still hit unrelated existing
failures in agent and multi-agent tests during this session; the
marketplace-focused suite passed after the rebase resolution.
2026-04-17 15:11:42 -07:00
Michael Bolin
1265df0ec2 refactor: narrow async lock guard lifetimes (#18211)
Follow-up to https://github.com/openai/codex/pull/18178, where we called
out enabling the await-holding lint as a follow-up.

The long-term goal is to enable Clippy coverage for async guards held
across awaits. This PR is intentionally only the first, low-risk cleanup
pass: it narrows obvious lock guard lifetimes and leaves
`codex-rs/Cargo.toml` unchanged so the lint is not enabled until the
remaining cases are fixed or explicitly justified. It intentionally
leaves the active-turn/turn-state locking pattern alone because those
checks and mutations need to stay atomic.

## Common fixes used here

These are the main patterns reviewers should expect in this PR, and they
are also the patterns to reach for when fixing future `await_holding_*`
findings:

- **Scope the guard to the synchronous work.** If the code only needs
data from a locked value, move the lock into a small block, clone or
compute the needed values, and do the later `.await` after the block.
- **Use direct one-line mutations when there is no later await.** Cases
like `map.lock().await.remove(&id)` are acceptable when the guard is
only needed for that single mutation and the statement ends before any
async work.
- **Drain or clone work out of the lock before notifying or awaiting.**
For example, the JS REPL drains pending exec senders into a local vector
and the websocket writer clones buffered envelopes before it serializes
or sends them.
- **Use a `Semaphore` only when serialization is intentional across
async work.** The test serialization guards intentionally span awaited
setup or execution, so using a semaphore communicates "one at a time"
without holding a mutex guard.
- **Remove the mutex when there is only one owner.** The PTY stdin
writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not
protect shared access because nothing else had access to the writer.
- **Do not split locks that protect an atomic invariant.** This PR
deliberately leaves active-turn/turn-state paths alone because those
checks and mutations need to stay atomic. Those cases should be fixed
separately with a design change or documented with `#[expect]`.

## What changed

- Narrow scoped async mutex guards in app-server, JS REPL, network
approval, remote-control websocket, and the RMCP test server.
- Replace test-only async mutex serialization guards with semaphores
where the guard intentionally lives across async work.
- Let the PTY pipe writer task own stdin directly instead of wrapping it
in an async mutex.

## Verification

- `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p
codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness`
- `just clippy -p codex-core`
- `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p
codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was
run; the app-server suite passed, and `codex-core` failed in the local
sandbox on six otel approval tests plus
`suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`,
which appear to depend on local command approval/default rules and
`CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.
2026-04-17 14:06:50 -07:00
richardopenai
139fa8b8f2 [codex] Propagate rate limit reached type (#18227)
## Summary

First PR in the split from #17956.

- adds the core/app-server `RateLimitReachedType` shape
- maps backend `rate_limit_reached_type` into Codex rate-limit snapshots
- carries the field through app-server notifications/responses and
generated schemas
- updates existing constructors/tests for the new optional field

## Validation

- `cargo test -p codex-backend-client`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server rate_limits`
- `cargo test -p codex-tui workspace_`
- `cargo test -p codex-tui status_`
- `just fmt`
- `just fix -p codex-backend-client`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
2026-04-17 13:37:25 -07:00
github-actions[bot]
a801b999ff Update models.json (#12640)
Automated update of models.json.

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
2026-04-17 12:16:07 -07:00
Ahmed Ibrahim
9d3a5cf05e [3/6] Add pushed exec process events (#18020)
## Summary
- Add a pushed `ExecProcessEvent` stream alongside retained
`process/read` output.
- Publish local and remote output, exit, close, and failure events.
- Cover the event stream with shared local/remote exec process tests.

## Testing
- `cargo check -p codex-exec-server`
- `cargo check -p codex-rmcp-client`
- Not run: `cargo test` per repo instruction; CI will cover.

## Stack
```text
o  #18027 [6/6] Fail exec client operations after disconnect
│
o  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
@  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-17 19:07:43 +00:00
David de Regt
eaf78e43f2 Add sorting/backwardsCursor to thread/list and new thread/turns/list api (#17305)
To improve performance of UI loads from the app, add two main
improvements:
1. The `thread/list` api now gets a `sortDirection` request field and a
`backwardsCursor` to the response, which lets you paginate forwards and
backwards from a window. This lets you fetch the first few items to
display immediately while you paginate to fill in history, then can
paginate "backwards" on future loads to catch up with any changes since
the last UI load without a full reload of the entire data set.
2. Added a new `thread/turns/list` api which also has sortDirection and
backwardsCursor for the same behavior as `thread/list`, allowing you the
same small-fetch for immediate display followed by background fill-in
and resync catchup.
2026-04-17 11:49:02 -07:00
Michael Bolin
2c2ed51876 ci: make Windows Bazel clippy catch core test imports (#18350)
## Why

Unused imports in `core/tests/suite/unified_exec.rs` in the Windows
build were not caught by Bazel CI on
https://github.com/openai/codex/pull/18096. I spot-checked
https://github.com/openai/codex/actions/workflows/rust-ci-full.yml?query=branch%3Amain
and noticed that builds were consistently red. This revealed that our
Cargo builds _were_ properly catching these issues, identifying a
Windows-specific coverage hole in the Bazel clippy job.

The Windows Bazel clippy job uses `--skip_incompatible_explicit_targets`
so it can lint a broad target set without failing immediately on targets
that are genuinely incompatible with Windows. However, with the default
Windows host platform, `rust_test` targets such as
`//codex-rs/core:core-all-test` could be skipped before the clippy
aspect reached their integration-test modules. As a result, the imports
in `core/tests/suite/unified_exec.rs` were not being linted by the
Windows Bazel clippy job at all.

The clippy diagnostic that Windows Bazel should have surfaced was:

```text
error: unused import: `codex_config::Constrained`
 --> core\tests\suite\unified_exec.rs:8:5
  |
8 | use codex_config::Constrained;
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `-D unused-imports` implied by `-D warnings`
  = help: to override `-D warnings` add `#[allow(unused_imports)]`

error: unused import: `codex_protocol::permissions::FileSystemAccessMode`
  --> core\tests\suite\unified_exec.rs:11:5
   |
11 | use codex_protocol::permissions::FileSystemAccessMode;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unused import: `codex_protocol::permissions::FileSystemPath`
  --> core\tests\suite\unified_exec.rs:12:5
   |
12 | use codex_protocol::permissions::FileSystemPath;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unused import: `codex_protocol::permissions::FileSystemSandboxEntry`
  --> core\tests\suite\unified_exec.rs:13:5
   |
13 | use codex_protocol::permissions::FileSystemSandboxEntry;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: unused import: `codex_protocol::permissions::FileSystemSandboxPolicy`
  --> core\tests\suite\unified_exec.rs:14:5
   |
14 | use codex_protocol::permissions::FileSystemSandboxPolicy;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

## What changed

- Run the Windows Bazel clippy job with the MSVC host platform via
`--windows-msvc-host-platform`, matching the Windows Bazel test job.
This keeps `--skip_incompatible_explicit_targets` while ensuring Windows
`rust_test` targets such as `//codex-rs/core:core-all-test` are still
linted.
- Remove the unused imports from `core/tests/suite/unified_exec.rs`.
- Add `--print-failed-action-summary` to
`.github/scripts/run-bazel-ci.sh` so Bazel action failures can be
summarized after the build exits.

## Failure reporting

Once the coverage issue was fixed, an intentionally reintroduced unused
import made the Windows Bazel clippy job fail as expected. That exposed
a separate usability problem: because the job keeps `--keep_going`, the
top-level Bazel output could still end with:

```text
ERROR: Build did NOT complete successfully
FAILED:
```

without the underlying rustc/clippy diagnostic being visible in the
obvious part of the GitHub Actions log.

To keep `--keep_going` while making failures actionable, the wrapper now
scans the captured Bazel console output for failed actions and prints
the matching rustc/clippy diagnostic block. When a diagnostic block is
found, it is emitted both as a GitHub `::error` annotation and as plain
expanded log output, rather than being hidden in a collapsed group.

## Verification

To validate the CI path, I intentionally introduced an unused import in
`core/tests/suite/unified_exec.rs`. The Windows Bazel clippy job failed
as expected, confirming that the integration-test module is now covered
by Bazel clippy. The same failure also verified that the wrapper
surfaces the matching clippy diagnostics directly in the Actions output.
2026-04-17 18:19:58 +00:00
sayan-oai
6991be7ead enable tool search over dynamic tools (#18263)
## Summary

- Normalize deferred MCP and dynamic tools into `ToolSearchEntry` values
before constructing `ToolSearchHandler`.
- Move the tool-search entry adapter out of `tools/handlers` and into
`tools/tool_search_entry.rs` so the handlers directory stays focused on
handlers.
- Keep `ToolSearchHandler` operating over one generic entry list for
BM25 search, namespace grouping, and per-bucket default limits.

## Why

Follow-up cleanup for #17849. The dynamic tool-search support made the
handler juggle source-specific MCP and dynamic tool lists, index
arithmetic, output conversion, and namespace emission. This keeps source
adaptation outside the handler so the search loop itself is smaller and
source-agnostic.

## Validation

- `just fmt`
- `cargo test -p codex-core tools::handlers::tool_search::tests`
- `git diff --check`
- `cargo test -p codex-core` currently fails in unrelated
`plugins::manager::tests::list_marketplaces_ignores_installed_roots_missing_from_config`;
rerunning that single test fails the same way at
`core/src/plugins/manager_tests.rs:1692`.

---------

Co-authored-by: pash <pash@openai.com>
2026-04-18 02:07:59 +08:00
colby-oai
ea84537369 Make app tool hint defaults pessimistic for app policies (#17232)
## Summary
- default missing app tool destructive/open-world hints to true for app
policies
- add regression tests for missing MCP annotations under restrictive app
config
2026-04-17 13:27:49 -04:00
jif-oai
cfc23eee3d feat: config aliases (#18140)
Rename `no_memories_if_mcp_or_web_search` →
`disable_on_external_context` with backward compatibility

While doing so, we add a key alias system on our layer merging system.
What we try to avoid is a case where a company managed config use an old
name while the user has a new name in it's local config (which would
make the deserialization fail)
2026-04-17 18:26:09 +01:00
Won Park
af7b8d551c Guardian -> Auto-Review (#18021)
This PR is a user-facing change for our rebranding of guardian to
auto-review.
2026-04-17 09:56:24 -07:00
Michael Bolin
d0eff70383 Fix config-loader tests after filesystem abstraction race (#18351)
## Why

`origin/main` picked up two changes that crossed in flight:

- #18209 refactored config loading to read through `ExecutorFileSystem`,
changing `load_requirements_toml` to take a filesystem handle and an
`AbsolutePathBuf`.
- #17740 added managed `deny_read` requirements tests that still called
`load_requirements_toml` with the previous two-argument signature.

Once both landed, `just clippy` failed because the new tests no longer
matched the current helper API.

## What

- Updates the two managed `deny_read` requirements tests to convert the
fixture path to `AbsolutePathBuf` before loading.
- Passes `LOCAL_FS.as_ref()` into `load_requirements_toml` so these
tests follow the filesystem abstraction introduced by #18209.

## Verification

- `just clippy`
- `cargo test -p codex-core load_requirements_toml_resolves_deny_read`
- `cargo test -p codex-core --test all
unified_exec_enforces_glob_deny_read_policy`
2026-04-17 09:20:39 -07:00
pakrym-oai
71e4c6fa17 Move codex module under session (#18249)
## Summary
- rename the core codex module root to session/mod.rs without using
#[path]
- move the codex module directory and tests under core/src/session
- remove session/mod.rs reexports so call sites use explicit child
module paths

## Testing
- cargo test -p codex-core --lib
- cargo check -p codex-core --tests
- just fmt
- just fix -p codex-core
- git diff --check
2026-04-17 16:18:53 +00:00
viyatb-oai
dae0608c06 feat(config): support managed deny-read requirements (#17740)
## Summary
- adds managed requirements support for deny-read filesystem entries
- constrains config layers so managed deny-read requirements cannot be
widened by user-controlled config
- surfaces managed deny-read requirements through debug/config plumbing

This PR lets managed requirements inject deny-read filesystem
constraints into the effective filesystem sandbox policy.
User-controlled config can still choose the surrounding permission
profile, but it cannot remove or weaken the managed deny-read entries.

## Managed deny-read shape
A managed requirements file can declare exact paths and glob patterns
under `[permissions.filesystem]`:

```toml
# /etc/codex/requirements.toml
[permissions.filesystem]
deny_read = [
  "/Users/alice/.gitconfig",
  "/Users/alice/.ssh",
  "./managed-private/**/*.env",
]
```

Those entries are compiled into the effective filesystem policy as
`access = none` rules, equivalent in shape to filesystem permission
entries like:

```toml
[permissions.workspace.filesystem]
"/Users/alice/.gitconfig" = "none"
"/Users/alice/.ssh" = "none"
"/absolute/path/to/managed-private/**/*.env" = "none"
```

The important difference is that the managed entries come from
requirements, so lower-precedence user config cannot remove them or make
those paths readable again.

Relative managed `deny_read` entries are resolved relative to the
directory containing the managed requirements file. Glob entries keep
their glob suffix after the non-glob prefix is normalized.

## Runtime behavior
- Managed `deny_read` entries are appended to the effective
`FileSystemSandboxPolicy` after the selected permission profile is
resolved.
- Exact paths become `FileSystemPath::Path { access: None }`; glob
patterns become `FileSystemPath::GlobPattern { access: None }`.
- When managed deny-read entries are present, `sandbox_mode` is
constrained to `read-only` or `workspace-write`; `danger-full-access`
and `external-sandbox` cannot silently bypass the managed read-deny
policy.
- On Windows, the managed deny-read policy is enforced for direct file
tools, but shell subprocess reads are not sandboxed yet, so startup
emits a warning for that platform.
- `/debug-config` shows the effective managed requirement as
`permissions.filesystem.deny_read` with its source.

## Stack
1. #15979 - glob deny-read policy/config/direct-tool support
2. #18096 - macOS and Linux sandbox enforcement
3. This PR - managed deny-read requirements

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-17 08:40:09 -07:00
jif-oai
3421a107e0 nit: phase 2 ephemeral (#18338) 2026-04-17 16:10:58 +01:00
Abhinav
8494e5bd7b Add PermissionRequest hooks support (#17563)
## Why

We need `PermissionRequest` hook support!

Also addresses:
- https://github.com/openai/codex/issues/16301
- run a script on Hook to do things like play a sound to draw attention
but actually no-op so user can still approve
- can omit the `decision` object from output or just have the script
exit 0 and print nothing
- https://github.com/openai/codex/issues/15311
  - let the script approve/deny on its own
  - external UI what will run on Hook and relay decision back to codex


## Reviewer Note

There's a lot of plumbing for the new hook, key files to review are:
- New hook added in `codex-rs/hooks/src/events/permission_request.rs`
- Wiring for network approvals
`codex-rs/core/src/tools/network_approval.rs`
- Wiring for tool orchestrator `codex-rs/core/src/tools/orchestrator.rs`
- Wiring for execve
`codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs`

## What

- Wires shell, unified exec, and network approval prompts into the
`PermissionRequest` hook flow.
- Lets hooks allow or deny approval prompts; quiet or invalid hooks fall
back to the normal approval path.
- Uses `tool_input.description` for user-facing context when it helps:
  - shell / `exec_command`: the request justification, when present
  - network approvals: `network-access <domain>`
- Uses `tool_name: Bash` for shell, unified exec, and network approval
permission-request hooks.
- For network approvals, passes the originating command in
`tool_input.command` when there is a single owning call; otherwise falls
back to the synthetic `network-access ...` command.

<details>
<summary>Example `PermissionRequest` hook input for a shell
approval</summary>

```json
{
  "session_id": "<session-id>",
  "turn_id": "<turn-id>",
  "transcript_path": "/path/to/transcript.jsonl",
  "cwd": "/path/to/cwd",
  "hook_event_name": "PermissionRequest",
  "model": "gpt-5",
  "permission_mode": "default",
  "tool_name": "Bash",
  "tool_input": {
    "command": "rm -f /tmp/example"
  }
}
```

</details>

<details>
<summary>Example `PermissionRequest` hook input for an escalated
`exec_command` request</summary>

```json
{
  "session_id": "<session-id>",
  "turn_id": "<turn-id>",
  "transcript_path": "/path/to/transcript.jsonl",
  "cwd": "/path/to/cwd",
  "hook_event_name": "PermissionRequest",
  "model": "gpt-5",
  "permission_mode": "default",
  "tool_name": "Bash",
  "tool_input": {
    "command": "cp /tmp/source.json /Users/alice/export/source.json",
    "description": "Need to copy a generated file outside the workspace"
  }
}
```

</details>

<details>
<summary>Example `PermissionRequest` hook input for a network
approval</summary>

```json
{
  "session_id": "<session-id>",
  "turn_id": "<turn-id>",
  "transcript_path": "/path/to/transcript.jsonl",
  "cwd": "/path/to/cwd",
  "hook_event_name": "PermissionRequest",
  "model": "gpt-5",
  "permission_mode": "default",
  "tool_name": "Bash",
  "tool_input": {
    "command": "curl http://codex-network-test.invalid",
    "description": "network-access http://codex-network-test.invalid"
  }
}
```

</details>

## Follow-ups

- Implement the `PermissionRequest` semantics for `updatedInput`,
`updatedPermissions`, `interrupt`, and suggestions /
`permission_suggestions`
- Add `PermissionRequest` support for the `request_permissions` tool
path

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-17 14:45:47 +00:00
sayan-oai
d0047de7cb add token-based tool deferral behind feature flag (#18097)
add new `tool_search_always_defer_mcp_tools` feature flag that always
defers all mcp tools rather than deferring once > 100 deferrable tools.

add new tests, also move `mcp_exposure` tests into dedicated file rather
than polluting `codex_tests`.
2026-04-17 18:34:06 +08:00