Commit Graph

535 Commits

Author SHA1 Message Date
Matthew Zeng
b7139a7e8f [mcp] Support MCP Apps part 3 - Add mcp tool call support. (#17364)
- [x] Add a new app-server method so that MCP Apps can call their own
MCP server directly.
2026-04-11 04:39:19 +00:00
Won Park
37aac89a6d representing guardian review timeouts in protocol types (#17381)
## Summary

- Add `TimedOut` to Guardian/review carrier types:
  - `ReviewDecision::TimedOut`
  - `GuardianAssessmentStatus::TimedOut`
  - app-server v2 `GuardianApprovalReviewStatus::TimedOut`
- Regenerate app-server JSON/TypeScript schemas for the new wire shape.
- Wire the new status through core/app-server/TUI mappings with
conservative fail-closed handling.
- Keep `TimedOut` non-user-selectable in the approval UI.

**Does not change runtime behavior yet; emitting `TimeOut` and
parent-model timeout messaging will come in followup PRs**
2026-04-10 20:02:33 -07:00
Shijie Rao
930e5adb7e Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391)
Reverts openai/codex#16969

#sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300
2026-04-10 23:33:13 +00:00
Owen Lin
a3be74143a fix(guardian, app-server): introduce guardian review ids (#17298)
## Description

This PR introduces `review_id` as the stable identifier for guardian
reviews and exposes it in app-server `item/autoApprovalReview/started`
and `item/autoApprovalReview/completed` events.

Internally, guardian rejection state is now keyed by `review_id` instead
of the reviewed tool item ID. `target_item_id` is still included when a
review maps to a concrete thread item, but it is no longer overloaded as
the review lifecycle identifier.

## Motivation

We'd like to give users the ability to preempt a guardian review while
it's running (approve or decline).

However, we can't implement the API that allows the user to override a
running guardian review because we didn't have a unique `review_id` per
guardian review. Using `target_item_id` is not correct since:
- with execve reviews, there can be multiple execve calls (and therefore
guardian reviews) per shell command
- with network policy reviews, there is no target item ID

The PR that actually implements user overrides will use `review_id` as
the stable identifier.
2026-04-10 16:21:02 -07:00
Abhinav
7999b0f60f Support clear SessionStart source (#17073)
## Motivation

The `SessionStart` hook already receives `startup` and `resume` sources,
but sessions created from `/clear` previously looked like normal startup
sessions. This makes it impossible for hook authors to distinguish
between these with the matcher.

## Summary

- Add `InitialHistory::Cleared` so `/clear`-created sessions can be
distinguished from ordinary startup sessions.
- Add `SessionStartSource::Clear` and wire it through core, app-server
thread start params, and TUI clear-session flow.
- Update app-server protocol schemas, generated TypeScript, docs, and
related tests.


https://github.com/user-attachments/assets/9cae3cb4-41c7-4d06-b34f-966252442e5c
2026-04-10 16:05:21 -07:00
rhan-oai
5779be314a [codex-analytics] add compaction analytics event (#17155)
- event for compaction analytics
- introduces thread-connection and thread metadata caches for data
denormalization, expected to be useful for denormalization onto core
emitted events in general
- threads analytics event client into core (mirrors approved
implementation in #16640)
- denormalizes key thread metadata: thread_source, subagent_source,
parent_thread_id, as well as app-server client and runtime metadata)
- compaction strategy defaults to memento, forward compatible with
expected prefill_compaction strategy

1. Manual standalone compact, local
`INFO | 2026-04-09 17:35:50 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:526 | Tracked
codex_compaction_event event params={'thread_id':
'019d74d0-5cfb-70c0-bef9-165c3bf9b2df', 'turn_id':
'019d74d0-d7f6-7c81-acc6-aae2030243d6', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': True}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'trigger': 'manual', 'reason':
'user_requested', 'implementation': 'responses', 'phase':
'standalone_turn', 'strategy': 'memento', 'status': 'completed',
'active_context_tokens_before': 20170, 'active_context_tokens_after':
4830, 'started_at': 1775781337, 'completed_at': 1775781350,
'thread_source': 'user', 'subagent_source': None, 'parent_thread_id':
None, 'error': None, 'duration_ms': 13524} | `

2. Auto pre-turn compact, local
`INFO | 2026-04-09 17:37:30 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:526 | Tracked
codex_compaction_event event params={'thread_id':
'019d74d2-45ef-71d1-9c93-23cc0c13d988', 'turn_id':
'019d74d2-7b42-7372-9f0e-c0da3f352328', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': True}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'trigger': 'auto', 'reason':
'context_limit', 'implementation': 'responses', 'phase': 'pre_turn',
'strategy': 'memento', 'status': 'completed',
'active_context_tokens_before': 20063, 'active_context_tokens_after':
4822, 'started_at': 1775781444, 'completed_at': 1775781449,
'thread_source': 'user', 'subagent_source': None, 'parent_thread_id':
None, 'error': None, 'duration_ms': 5497} | `

3. Auto mid-turn compact, local
`INFO | 2026-04-09 17:38:28 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:526 | Tracked
codex_compaction_event event params={'thread_id':
'019d74d3-212f-7a20-8c0a-4816a978675e', 'turn_id':
'019d74d3-3ee1-7462-89f6-2ffbeefcd5e3', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': True}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'trigger': 'auto', 'reason':
'context_limit', 'implementation': 'responses', 'phase': 'mid_turn',
'strategy': 'memento', 'status': 'completed',
'active_context_tokens_before': 20325, 'active_context_tokens_after':
14641, 'started_at': 1775781500, 'completed_at': 1775781508,
'thread_source': 'user', 'subagent_source': None, 'parent_thread_id':
None, 'error': None, 'duration_ms': 7507} | `

4. Remote /responses/compact, manual standalone
`INFO | 2026-04-09 17:40:20 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:526 | Tracked
codex_compaction_event event params={'thread_id':
'019d74d4-7a11-78a1-89f7-0535a1149416', 'turn_id':
'019d74d4-e087-7183-9c20-b1e40b7578c0', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': True}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'trigger': 'manual', 'reason':
'user_requested', 'implementation': 'responses_compact', 'phase':
'standalone_turn', 'strategy': 'memento', 'status': 'completed',
'active_context_tokens_before': 23461, 'active_context_tokens_after':
6171, 'started_at': 1775781601, 'completed_at': 1775781620,
'thread_source': 'user', 'subagent_source': None, 'parent_thread_id':
None, 'error': None, 'duration_ms': 18971} | `
2026-04-10 13:03:54 -07:00
jif-oai
87328976f6 fix: main (#17352) 2026-04-10 18:14:42 +01:00
Ahmed Ibrahim
2e81eac004 Queue Realtime V2 response.create while active (#17306)
Builds on #17264.

- queues Realtime V2 `response.create` while an active response is open,
then flushes it after `response.done` or `response.cancelled`
- requests `response.create` after background agent final output and
steering acknowledgements
- adds app-server integration coverage for all `response.create` paths

Validation:
- `just fmt`
- `cargo check -p codex-app-server --tests`
- `git diff --check`
- CI green

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-10 09:09:13 -07:00
jif-oai
8d58899297 fix: MCP leaks in app-server (#17223)
The disconnect path now reuses the same teardown flow as explicit
unsubscribe, and the thread-state bookkeeping consistently reports only
threads that lost their last subscriber

https://github.com/openai/codex/issues/16895
2026-04-10 15:31:26 +01:00
richardopenai
9f2a585153 Option to Notify Workspace Owner When Usage Limit is Reached (#16969)
## Summary
- Replace the manual `/notify-owner` flow with an inline confirmation
prompt when a usage-based workspace member hits a credits-depleted
limit.
- Fetch the current workspace role from the live ChatGPT
`accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches
the desktop and web clients.
- Keep owner, member, and spend-cap messaging distinct so we only offer
the owner nudge when the workspace is actually out of credits.

## What Changed
- `backend-client`
- Added a typed fetch for the current account role from
`accounts/check`.
  - Mapped backend role values into a Rust workspace-role enum.
- `app-server` and protocol
  - Added `workspaceRole` to `account/read` and `account/updated`.
- Derived `isWorkspaceOwner` from the live role, with a fallback to the
cached token claim when the role fetch is unavailable.
- `tui`
  - Removed the explicit `/notify-owner` slash command.
- When a member is blocked because the workspace is out of credits, the
error now prompts:
- `Your workspace is out of credits. Request more from your workspace
owner? [y/N]`
  - Choosing `y` sends the existing owner-notification request.
- Choosing `n`, pressing `Esc`, or accepting the default selection
dismisses the prompt without sending anything.
- Selection popups now honor explicit item shortcuts, which is how the
`y` / `n` interaction is wired.

## Reviewer Notes
- The main behavior change is scoped to usage-based workspace members
whose workspace credits are depleted.
- Spend-cap reached should not show the owner-notification prompt.
- Owners and admins should continue to see `/usage` guidance instead of
the member prompt.
- The live role fetch is best-effort; if it fails, we fall back to the
existing token-derived ownership signal.

## Testing
- Manual verification
  - Workspace owner does not see the member prompt.
- Workspace member with depleted credits sees the confirmation prompt
and can send the nudge with `y`.
- Workspace member with spend cap reached does not see the
owner-notification prompt.

### Workspace member out of usage

https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1

### Workspace owner
<img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48
22 AM"
src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6"
/>
2026-04-09 21:15:17 -07:00
Abhinav
f6cc2bb0cb Emit live hook prompts before raw-event filtering (#17189)
# What
Project raw Stop-hook prompt response items into typed v2 hookPrompt
item-completed notifications before applying the raw-response-event
filter. Keep ordinary raw response items filtered for normal
subscribers; only the existing hookPrompt bridge runs on the filtered
raw-item path.

# Why
Blocked Stop hooks record their continuation instruction as a raw
model-history user item. Normal v2 desktop subscribers do not opt into
raw response events, so the app-server listener filtered that raw item
before the existing hookPrompt translator could emit the typed live
item/completed notification. As a result, the hook-prompt bubble only
appeared after thread history was reloaded.
2026-04-09 19:48:21 -07:00
Ruslan Nigmatullin
ff1ab61e4f app-server: Fix clippy by removing extra mut (#17262) 2026-04-09 14:30:18 -07:00
Matthew Zeng
d7f99b0fa6 [mcp] Expand tool search to custom MCPs. (#16944)
- [x] Expand tool search to custom MCPs.
- [x] Rename several variables/fields to be more generic.

Updated tool & server name lifecycles:

**Raw Identity**

ToolInfo.server_name is raw MCP server name.
ToolInfo.tool.name is raw MCP tool name.
MCP calls route back to raw via parse_tool_name() returning
(tool.server_name, tool.tool.name).
mcpServerStatus/list now groups by raw server and keys tools by
Tool.name: mod.rs:599
App-server just forwards that grouped raw snapshot:
codex_message_processor.rs:5245

**Callable Names**

On list-tools, we create provisional callable_namespace / callable_name:
mcp_connection_manager.rs:1556
For non-app MCP, provisional callable name starts as raw tool name.
For codex-apps, provisional callable name is sanitized and strips
connector name/id prefix; namespace includes connector name.
Then qualify_tools() sanitizes callable namespace + name to ASCII alnum
/ _ only: mcp_tool_names.rs:128
Note: this is stricter than Responses API. Hyphen is currently replaced
with _ for code-mode compatibility.

**Collision Handling**

We do initially collapse example-server and example_server to the same
base.
Then qualify_tools() detects distinct raw namespace identities behind
the same sanitized namespace and appends a hash to the callable
namespace: mcp_tool_names.rs:137
Same idea for tool-name collisions: hash suffix goes on callable tool
name.
Final list_all_tools() map key is callable_namespace + callable_name:
mcp_connection_manager.rs:769

**Direct Model Tools**

Direct MCP tool declarations use the full qualified sanitized key as the
Responses function name.
The raw rmcp Tool is converted but renamed for model exposure.

**Tool Search / Deferred**

Tool search result namespace = final ToolInfo.callable_namespace:
tool_search.rs:85
Tool search result nested name = final ToolInfo.callable_name:
tool_search.rs:86
Deferred tool handler is registered as "{namespace}:{name}":
tool_registry_plan.rs:248
When a function call comes back, core recombines namespace + name, looks
up the full qualified key, and gets the raw server/tool for MCP
execution: codex.rs:4353

**Separate Legacy Snapshot**

collect_mcp_snapshot_from_manager_with_detail() still returns a map
keyed by qualified callable name.
mcpServerStatus/list no longer uses that; it uses
McpServerStatusSnapshot, which is raw-inventory shaped.
2026-04-09 13:34:52 -07:00
Ruslan Nigmatullin
545f3daba0 app-server: Use shared receivers for app-server message processors (#17256)
We do not rely on the mutability here, so express it in the type system.
2026-04-09 19:53:50 +00:00
neil-oai
a92a5085bd Forward app-server turn clientMetadata to Responses (#16009)
## Summary
App-server v2 already receives turn-scoped `clientMetadata`, but the
Rust app-server was dropping it before the outbound Responses request.
This change keeps the fix lightweight by threading that metadata through
the existing turn-metadata path rather than inventing a new transport.

## What we're trying to do and why
We want turn-scoped metadata from the app-server protocol layer,
especially fields like Hermes/GAAS run IDs, to survive all the way to
the actual Responses API request so it is visible in downstream
websocket request logging and analytics.

The specific bug was:
- app-server protocol uses camelCase `clientMetadata`
- Responses transport already has an existing turn metadata carrier:
`x-codex-turn-metadata`
- websocket transport already rewrites that header into
`request.request_body.client_metadata["x-codex-turn-metadata"]`
- but the Rust app-server never parsed or stored `clientMetadata`, so
nothing from the app-server request was making it into that existing
path

This PR fixes that without adding a new header or a second metadata
channel.

## How we did it
### Protocol surface
- Add optional `clientMetadata` to v2 `TurnStartParams` and
`TurnSteerParams`
- Regenerate the JSON schema / TypeScript fixtures
- Update app-server docs to describe the field and its behavior

### Runtime plumbing
- Add a dedicated core op for app-server user input carrying turn-scoped
metadata: `Op::UserInputWithClientMetadata`
- Wire `turn/start` and `turn/steer` through that op / signature path
instead of dropping the metadata at the message-processor boundary
- Store the metadata in `TurnMetadataState`

### Transport behavior
- Reuse the existing serialized `x-codex-turn-metadata` payload
- Merge the new app-server `clientMetadata` into that JSON additively
- Do **not** replace built-in reserved fields already present in the
turn metadata payload
- Keep websocket behavior unchanged at the outer shape level: it still
sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON
string now contains the merged fields
- Keep HTTP fallback behavior unchanged except that the existing
`x-codex-turn-metadata` header now includes the merged fields too

### Request shape before / after
Before, a websocket `response.create` looked like:
```json
{
  "type": "response.create",
  "client_metadata": {
    "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}"
  }
}
```
Even if the app-server caller supplied `clientMetadata`, it was not
represented there.

After, the same request shape is preserved, but the serialized payload
now includes the new turn-scoped fields:
```json
{
  "type": "response.create",
  "client_metadata": {
    "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}"
  }
}
```

## Validation
### Targeted tests added / updated
- protocol round-trip coverage for `clientMetadata` on `turn/start` and
`turn/steer`
- protocol round-trip coverage for `Op::UserInputWithClientMetadata`
- `TurnMetadataState` merge test proving client metadata is added
without overwriting reserved built-in fields
- websocket request-shape test proving outbound `response.create`
contains merged metadata inside
`client_metadata["x-codex-turn-metadata"]`
- app-server integration tests proving:
- `turn/start` forwards `clientMetadata` into the outbound Responses
request path
  - websocket warmup + real turn request both behave correctly
  - `turn/steer` updates the follow-up request metadata

### Commands run
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-protocol`
- `cargo test -p codex-core
turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields
--lib`
- `cargo test -p codex-core --test all
responses_websocket_preserves_custom_turn_metadata_fields`
- `cargo test -p codex-app-server --test all client_metadata`
- `cargo test -p codex-app-server --test all
turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2
-- --nocapture`
- `just fmt`
- `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol
-p codex-app-server`
- `just fix -p codex-exec -p codex-tui-app-server`
- `just argument-comment-lint`

### Full suite note
`cargo test` in `codex-rs` still fails in:
-
`suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request`

I verified that same failure on a clean detached `HEAD` worktree with an
isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.
2026-04-09 11:52:37 -07:00
jif-oai
12f0e0b0eb chore: merge name and title (#17116)
Merge title and name concept to leverage the sqlite title column and
have more efficient queries

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-09 18:44:26 +01:00
Ahmed Ibrahim
2f9090be62 Add realtime voice selection (#17176)
- Add realtime voice selection for realtime/start.
- Expose the supported v1/v2 voice lists and cover explicit, configured,
default, and invalid voice paths.
2026-04-08 20:19:15 -07:00
maja-openai
dcbc91fd39 Update guardian output schema (#17061)
## Summary
- Update guardian output schema to separate risk, authorization,
outcome, and rationale.
- Feed guardian rationale into rejection messages.
- Split the guardian policy into template and tenant-config sections.

## Validation
- `cargo test -p codex-core mcp_tool_call`
- `env -u CODEX_SANDBOX_NETWORK_DISABLED INSTA_UPDATE=always cargo test
-p codex-core guardian::`

---------

Co-authored-by: Owen Lin <owen@openai.com>
2026-04-08 15:47:29 -07:00
pakrym-oai
e4d6702b87 [codex] Support remote exec cwd in TUI startup (#17142)
When running with remote executor the cwd is the remote path. Today we
check for existence of a local directory on startup and attempt to load
config from it.

For remote executors don't do that.
2026-04-08 13:09:28 -07:00
pakrym-oai
35b5720e8d Use AbsolutePathBuf for exec cwd plumbing (#17063)
## Summary
- Carry `AbsolutePathBuf` through tool cwd parsing/resolution instead of
resolving workdirs to raw `PathBuf`s.
- Type exec/sandbox request cwd fields as `AbsolutePathBuf` through
`ExecParams`, `ExecRequest`, `SandboxCommand`, and unified exec runtime
requests.
- Keep `PathBuf` conversions at external/event boundaries and update
existing tests/fixtures for the typed cwd.

## Validation
- `cargo check -p codex-core --tests`
- `cargo check -p codex-sandboxing --tests`
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core --lib tools::handlers::`
- `just fix -p codex-sandboxing`
- `just fix -p codex-core`
- `just fmt`

Full `codex-core` test suite was not run locally; per repo guidance I
kept local validation targeted.
2026-04-08 10:54:12 -07:00
Vivian Fang
ea516f9a40 Support anyOf and enum in JsonSchema (#16875)
This brings us into better alignment with the JSON schema subset that is
supported in
<https://developers.openai.com/api/docs/guides/structured-outputs#supported-schemas>,
and also allows us to render richer function signatures in code mode
(e.g., anyOf{null, OtherObjectType})
2026-04-08 01:07:55 -07:00
pash-openai
80ebc80be5 Use model metadata for Fast Mode status (#16949)
Fast Mode status was still tied to one model name in the TUI and
model-list plumbing. This changes the model metadata shape so a model
can advertise additional speed tiers, carries that field through the
app-server model list, and uses it to decide when to show Fast Mode
status.

For people using Codex, the behavior is intended to stay the same for
existing models. Fast Mode still requires the existing signed-in /
feature-gated path; the difference is that the UI can now recognize any
model the model list marks as Fast-capable, instead of requiring a new
client-side slug check.
2026-04-07 17:55:40 -07:00
Ahmed Ibrahim
fb3dcfde1d Add WebRTC transport to realtime start (#16960)
Adds WebRTC startup to the experimental app-server
`thread/realtime/start` method with an optional transport enum. The
websocket path remains the default; WebRTC offers create the realtime
session through the shared start flow and emit the answer SDP via
`thread/realtime/sdp`.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-07 15:43:38 -07:00
Dylan Hurd
6c36e7d688 fix(app-server) revert null instructions changes (#17047) 2026-04-07 15:18:34 -07:00
Ruslan Nigmatullin
59af4a730c app-server: Allow enabling remote control in runtime (#16973)
Refresh the feature flag on writes to the config.
2026-04-07 11:36:17 -07:00
Ruslan Nigmatullin
8a13f82204 app-server: Move watch_id to request of fs/watch (#17026)
It's easier for clients to maintain watchers if they define the watch
id, so move it into the request.
It's not used yet, so should be a safe change.
2026-04-07 11:22:28 -07:00
Matthew Zeng
252d79f5eb [mcp] Support MCP Apps part 2 - Add meta to mcp tool call result. (#16465)
- [x] Add meta to mcp tool call result.
2026-04-07 11:10:21 -07:00
pakrym-oai
f1a2b920f9 [codex] Make AbsolutePathBuf joins infallible (#16981)
Having to check for errors every time join is called is painful and
unnecessary.
2026-04-07 10:52:08 -07:00
pakrym-oai
413c1e1fdf [codex] reduce module visibility (#16978)
## Summary
- reduce public module visibility across Rust crates, preferring private
or crate-private modules with explicit crate-root public exports
- update external call sites and tests to use the intended public crate
APIs instead of reaching through module trees
- add the module visibility guideline to AGENTS.md

## Validation
- `cargo check --workspace --all-targets --message-format=short` passed
before the final fix/format pass
- `just fix` completed successfully
- `just fmt` completed successfully
- `git diff --check` passed
2026-04-07 08:03:35 -07:00
jif-oai
89f1a44afa feat: /feedback cascade (#16442)
Example here:
https://openai.sentry.io/issues/7380240430/?project=4510195390611458&query=019d498f-bec4-7ba2-96d2-612b1e4507df&referrer=issue-stream
2026-04-07 12:47:37 +01:00
Ahmed Ibrahim
24c598e8a9 Honor null thread instructions (#16964)
- Treat explicit null thread instructions as a blank-slate override
while preserving omitted-field fallback behavior.
- Preserve null through rollout resume/fork and keep explicit empty
strings distinct.
- Add app-server v2 start/fork coverage for the tri-state instruction
params.
2026-04-07 04:10:19 +00:00
viyatb-oai
9d13d29acd [codex] Add danger-full-access denylist-only network mode (#16946)
## Summary

This adds `experimental_network.danger_full_access_denylist_only` for
orgs that want yolo / danger-full-access sessions to keep full network
access while still enforcing centrally managed deny rules.

When the flag is true and the session sandbox is `danger-full-access`,
the network proxy starts with:

- domain allowlist set to `*`
- managed domain `deny` entries enforced
- upstream proxy use allowed
- all Unix sockets allowed
- local/private binding allowed

Caveat: the denylist is best effort only. In yolo / danger-full-access
mode, Codex or the model can use an allowed socket or other
local/private network path to bypass the proxy denylist, so this should
not be treated as a hard security boundary.

The flag is intentionally scoped to `SandboxPolicy::DangerFullAccess`.
Read-only and workspace-write modes keep the existing managed/user
allowlist, denylist, Unix socket, and local-binding behavior. This does
not enable the non-loopback proxy listener setting; that still requires
its own explicit config.

This also threads the new field through config requirements parsing,
app-server protocol/schema output, config API mapping, and the TUI debug
config output.

## How to use

Add the flag under `[experimental_network]` in the network policy config
that is delivered to Codex. The setting is not under `[permissions]`.

```toml
[experimental_network]
enabled = true
danger_full_access_denylist_only = true

[experimental_network.domains]
"blocked.example.com" = "deny"
"*.blocked.example.com" = "deny"
```

With that configuration, yolo / danger-full-access sessions get broad
network access except for the managed denied domains above. The denylist
remains a best-effort proxy policy because the session may still use
allowed sockets to bypass it. Other sandbox modes do not get the
wildcard domain allowlist or the socket/local-binding relaxations from
this flag.

## Verification

- `cargo test -p codex-config network_requirements`
- `cargo test -p codex-core network_proxy_spec`
- `cargo test -p codex-app-server map_requirements_toml_to_api`
- `cargo test -p codex-tui debug_config_output`
- `cargo test -p codex-app-server-protocol`
- `just write-app-server-schema`
- `just fmt`
- `just fix -p codex-config -p codex-core -p codex-app-server-protocol
-p codex-app-server -p codex-tui`
- `just fix -p codex-core -p codex-config`
- `git diff --check`
- `cargo clean`
2026-04-06 19:38:51 -07:00
Matthew Zeng
5fe9ef06ce [mcp] Support MCP Apps part 1. (#16082)
- [x] Add `mcpResource/read` method to read mcp resource.
2026-04-06 19:17:14 -07:00
Ruslan Nigmatullin
b34a3a6e92 app-server: Unify config changes handling a bit (#16961) 2026-04-06 18:04:00 -07:00
pakrym-oai
1f2411629f Refactor config types into a separate crate (#16962)
Move config types into a separate crate because their macros expand into
a lot of new code.
2026-04-07 00:32:41 +00:00
Eric Traut
9f737c28dd Speed up /mcp inventory listing (#16831)
Addresses #16244

This was a performance regression introduced when we moved the TUI on
top of the app server API.

Problem: `/mcp` rebuilt a full MCP inventory through
`mcpServerStatus/list`, including resources and resource templates that
made the TUI wait on slow inventory probes.

Solution: add a lightweight `detail` mode to `mcpServerStatus/list`,
have `/mcp` request tools-and-auth only, and cover the fast path with
app-server and TUI tests.

Testing: Confirmed slow (multi-second) response prior to change and
immediate response after change.

I considered two options:
1. Change the existing `mcpServerStatus/list` API to accept an optional
"details" parameter so callers can request only a subset of the
information.
2. Add a separate `mcpServer/list` API that returns only the servers,
tools, and auth but omits the resources.

I chose option 1, but option 2 is also a reasonable approach.
2026-04-06 16:27:02 -07:00
rhan-oai
756c45ec61 [codex-analytics] add protocol-native turn timestamps (#16638)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
* #16870
* #16706
* #16659
* #16641
* #16640
* __->__ #16638
2026-04-06 16:22:59 -07:00
xl-openai
e62d645e67 feat: refresh non-curated cache from plugin list. (#16191)
1. Use versions for non-curated plugin (defined in plugin.json) for
cache refresh
2. Trigger refresh from plugin/list roots
2026-04-06 15:40:00 -07:00
Ruslan Nigmatullin
73dab2046f app-server: Add transport for remote control (#15951) 2026-04-06 14:55:59 -07:00
joeytrasatti-openai
03c07956cf Revert "[codex-backend] Make thread metadata updates tolerate pending backfill" (#16923)
Reverts openai/codex#16877
2026-04-06 21:25:05 +00:00
Matthew Zeng
756ba8baae Fix clippy warning (#16939)
- [x] Fix clippy warning
2026-04-06 14:08:55 -07:00
Ruslan Nigmatullin
1525bbdb9a app-server: centralize AuthManager initialization (#16764)
Extract a shared helper that builds AuthManager from Config and applies
the forced ChatGPT workspace override in one place.

Create the shared AuthManager at MessageProcessor call sites so that
upcoming new transport's initialization can reuse the same handle, and
keep only external auth refresher wiring inside `MessageProcessor`.

Remove the now-unused `AuthManager::shared_with_external_auth` helper.
2026-04-06 12:46:55 -07:00
Owen Lin
bd30bad96f fix(guardian): fix ordering of guardian events (#16462)
Guardian events were emitted a bit out of order for CommandExecution
items. This would make it hard for the frontend to render a guardian
auto-review, which has this payload:
```
pub struct ItemGuardianApprovalReviewStartedNotification {
    pub thread_id: String,
    pub turn_id: String,
    pub target_item_id: String,
    pub review: GuardianApprovalReview,
    // FYI this is no longer a json blob
    pub action: Option<JsonValue>,
}
```

There is a `target_item_id` the auto-approval review is referring to,
but the actual item had not been emitted yet.

Before this PR:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`, and if approved...
- `item/started`
- `item/completed`

After this PR:
- `item/started`
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
- `item/completed`

This lines up much better with existing patterns (i.e. human review in
`Default mode`, where app-server would send a server request to prompt
for user approval after `item/started`), and makes it easier for clients
to render what guardian is actually reviewing.

We do this following a similar pattern as `FileChange` (aka apply patch)
items, where we create a FileChange item and emit `item/started` if we
see the apply patch approval request, before the actual apply patch call
runs.
2026-04-06 19:14:27 +00:00
Owen Lin
ded559680d feat(requirements): support allowed_approval_reviewers (#16701)
## Description

Add requirements.toml support for `allowed_approvals_reviewers =
["user", "guardian_subagent"]`, so admins can now restrict the use of
guardian mode.

Note: If a user sets a reviewer that isn’t allowed by requirements.toml,
config loading falls back to the first allowed reviewer and emits a
startup warning.

The table below describes the possible admin controls.
| Admin intent | `requirements.toml` | User `config.toml` | End result |
|---|---|---|---|
| Leave Guardian optional | omit `allowed_approvals_reviewers` or set
`["user", "guardian_subagent"]` | user chooses `approvals_reviewer =
"user"` or `"guardian_subagent"` | Guardian off for `user`, on for
`guardian_subagent` + `approval_policy = "on-request"` |
| Force Guardian off | `allowed_approvals_reviewers = ["user"]` | any
user value | Effective reviewer is `user`; Guardian off |
| Force Guardian on | `allowed_approvals_reviewers =
["guardian_subagent"]` and usually `allowed_approval_policies =
["on-request"]` | any user reviewer value; user should also have
`approval_policy = "on-request"` unless policy is forced | Effective
reviewer is `guardian_subagent`; Guardian on when effective approval
policy is `on-request` |
| Allow both, but default to manual if user does nothing |
`allowed_approvals_reviewers = ["user", "guardian_subagent"]` | omit
`approvals_reviewer` | Effective reviewer is `user`; Guardian off |
| Allow both, and user explicitly opts into Guardian |
`allowed_approvals_reviewers = ["user", "guardian_subagent"]` |
`approvals_reviewer = "guardian_subagent"` and `approval_policy =
"on-request"` | Guardian on |
| Invalid admin config | `allowed_approvals_reviewers = []` | anything |
Config load error |
2026-04-06 11:11:44 -07:00
joeytrasatti-openai
4ce97cef02 [codex-backend] Make thread metadata updates tolerate pending backfill (#16877)
### Summary
Fix `thread/metadata/update` so it can still patch stored thread
metadata when the list/backfill-gated `get_state_db(...)` path is
unavailable.

What was happening:
- The app logs showed `thread/metadata/update` failing with `sqlite
state db unavailable for thread ...`.
- This was not isolated to one bad thread. Once the failure started for
a user, branch metadata updates failed 100% of the time for that user.
- Reports were staggered across users, which points at local app-server
/ local SQLite state rather than one global server-side failure.
- Turns could still start immediately after the metadata update failed,
which suggests the thread itself was valid and the failure was in the
metadata endpoint DB-handle path.

The fix:
- Keep using the loaded thread state DB and the normal
`get_state_db(...)` fallback first.
- If that still returns `None`, open `StateRuntime::init(...)` directly
for this targeted metadata update path.
- Log the direct state runtime init error if that final fallback also
fails, so future reports have the real DB-open cause instead of only the
generic unavailable error.
- Add a regression test where the DB exists but backfill is not
complete, and verify `thread/metadata/update` can still repair the
stored rollout thread and patch `gitInfo`.

Relevant context / suspect PRs:
- #16434 changed state DB startup to run auto-vacuum / incremental
vacuum. This is the most suspicious timing match for per-user, staggered
local SQLite availability failures.
- #16433 dropped the old log table from the state DB, also near the
timing window.
- #13280 introduced this endpoint and made it rely on SQLite for git
metadata without resuming the thread.
- #14859 and #14888 added/consumed persisted model + reasoning effort
metadata. I checked these because of the new thread metadata fields, but
this failure happens before the endpoint reaches thread-row update/load
logic, so they seem less likely as the direct cause.

### Testing
- `cargo fmt -- --config imports_granularity=Item` completed; local
stable rustfmt emitted warnings that `imports_granularity` is unstable
- `cargo test -p codex-app-server thread_metadata_update`
- `git diff --check`
2026-04-06 13:07:19 -04:00
rhan-oai
4fd5c35c4f [codex-analytics] subagent analytics (#15915)
- creates custom event that emits subagent thread analytics from core
- wires client metadata (`product_client_id, client_name,
client_version`), through from app-server
- creates `created_at `timestamp in core
- subagent analytics are behind `FeatureFlag::GeneralAnalytics`

PR stack
- [[telemetry] thread events
#15690](https://github.com/openai/codex/pull/15690)
- --> [[telemetry] subagent events
#15915](https://github.com/openai/codex/pull/15915)
- [[telemetry] turn events
#15591](https://github.com/openai/codex/pull/15591)
- [[telemetry] steer events
#15697](https://github.com/openai/codex/pull/15697)
- [[telemetry] queued prompt data
#15804](https://github.com/openai/codex/pull/15804)

Notes:
- core does not spawn a subagent thread for compact, but represented in
mapping for consistency

`INFO | 2026-04-01 13:08:12 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:399 | Tracked
codex_thread_initialized event params={'thread_id':
'019d4aa9-233b-70f2-a958-c3dbae1e30fa', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': None}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'model': 'gpt-5.3-codex', 'ephemeral':
False, 'initialization_mode': 'new', 'created_at': 1775074091,
'thread_source': 'subagent', 'subagent_source': 'thread_spawn',
'parent_thread_id': '019d4aa8-51ec-77e3-bafb-2c1b8e29e385'} | `

`INFO | 2026-04-01 13:08:41 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:399 | Tracked
codex_thread_initialized event params={'thread_id':
'019d4aa9-94e3-75f1-8864-ff8ad0e55e1e', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': None}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'model': 'gpt-5.3-codex', 'ephemeral':
False, 'initialization_mode': 'new', 'created_at': 1775074120,
'thread_source': 'subagent', 'subagent_source': 'review',
'parent_thread_id': None} | `

---------

Co-authored-by: jif-oai <jif@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-04-04 11:06:43 -07:00
Michael Bolin
a70aee1a1e Fix Windows Bazel app-server trust tests (#16711)
## Why

Extracted from [#16528](https://github.com/openai/codex/pull/16528) so
the Windows Bazel app-server test failures can be reviewed independently
from the rest of that PR.

This PR targets:

-
`suite::v2::thread_shell_command::thread_shell_command_runs_as_standalone_turn_and_persists_history`
-
`suite::v2::thread_start::thread_start_with_elevated_sandbox_trusts_project_and_followup_loads_project_config`
-
`suite::v2::thread_start::thread_start_with_nested_git_cwd_trusts_repo_root`

There were two Windows-specific assumptions baked into those tests and
the underlying trust lookup:

- project trust keys were persisted and looked up using raw path
strings, but Bazel's Windows test environment can surface canonicalized
paths with `\\?\` / UNC prefixes or normalized symlink/junction targets,
so follow-up `thread/start` requests no longer matched the project entry
that had just been written
- `item/commandExecution/outputDelta` assertions compared exact trailing
line endings even though shell output chunk boundaries and CRLF handling
can differ on Windows, and Bazel made that timing-sensitive mismatch
visible

There was also one behavior bug separate from the assertion cleanup:
`thread/start` decided whether to persist trust from the final resolved
sandbox policy, but on Windows an explicit `workspace-write` request may
be downgraded to `read-only`. That incorrectly skipped writing trust
even though the request had asked to elevate the project, so the new
logic also keys off the requested sandbox mode.

## What

- Canonicalize project trust keys when persisting/loading `[projects]`
entries, while still accepting legacy raw keys for existing configs.
- Persist project trust when `thread/start` explicitly requests
`workspace-write` or `danger-full-access`, even if the resolved policy
is later downgraded on Windows.
- Make the Windows app-server tests compare persisted trust paths and
command output deltas in a path/newline-normalized way.

## Verification

- Existing app-server v2 tests cover the three failing Windows Bazel
cases above.
2026-04-03 21:41:25 +00:00
Eric Traut
0ab8eda375 Add remote --cd forwarding for app-server sessions (#16700)
Addresses #16124

Problem: `codex --remote --cd <path>` canonicalized the path locally and
then omitted it from remote thread lifecycle requests, so remote-only
working directories failed or were ignored.

Solution: Keep remote startup on the local cwd, forward explicit `--cd`
values verbatim to `thread/start`, `thread/resume`, and `thread/fork`,
and cover the behavior with `codex-tui` tests.

Testing: I manually tested `--remote --cd` with both absolute and
relative paths and validated correct behavior.


---

Update based on code review feedback:

Problem: Remote `--cd` was forwarded to `thread/resume` and
`thread/fork`, but not to `thread/list` lookups, so `--resume --last`
and picker flows could select a session from the wrong cwd; relative cwd
filters also failed against stored absolute paths.

Solution: Apply explicit remote `--cd` to `thread/list` lookups for
`--last` and picker flows, normalize relative cwd filters on the
app-server before exact matching, and document/test the behavior.
2026-04-03 11:26:45 -07:00
Eric Traut
0f7394883e Suppress bwrap warning when sandboxing is bypassed (#16667)
Addresses #15282

Problem: Codex warned about missing system bubblewrap even when
sandboxing was disabled.

Solution: Gate the bwrap warning on the active sandbox policy and skip
it for danger-full-access and external-sandbox modes.
2026-04-03 10:54:30 -07:00
Eric Traut
a3b3e7a6cc Fix MCP tool listing for hyphenated server names (#16674)
Addresses #16671 and #14927

Problem: `mcpServerStatus/list` rebuilt MCP tool groups from sanitized
tool prefixes but looked them up by unsanitized server names, so
hyphenated servers rendered as having no tools in `/mcp`. This was
reported as a regression when the TUI switched to use the app server.

Solution: Build each server's tool map using the original server name's
sanitized prefix, include effective runtime MCP servers in the status
response, and add a regression test for hyphenated server names.
2026-04-03 09:05:50 -07:00