Commit Graph

562 Commits

Author SHA1 Message Date
Michael Bolin
c886193921 app-server tests: select turn permission profiles by name 2026-04-30 03:39:50 -07:00
xl-openai
87d0cf1a62 feat: Add workspace plugin sharing APIs (#20278)
1. Adds v2 plugin/share/save, plugin/share/list, and plugin/share/delete
RPCs.
2. Implements save by archiving a local plugin root, enforcing a size
limit, uploading through the workspace upload flow, and supporting
updates via remotePluginId.
3. Lists created workspace plugins
4. Deletes a previously uploaded/shared plugin.
2026-04-29 23:49:20 -07:00
Abhinav
8f3c06cc97 Add persisted hook enablement state (#19840)
## Why

After `hooks/list` exposes the hook inventory, clients need a way to
persist user hook preferences, make those changes effective in
already-open sessions, and distinguish user-controllable hooks from
managed requirements without adding another bespoke app-server write
API.

## What

- Extends `hooks/list` entries with effective `enabled` state.
- Persists user-level hook state under `hooks.state.<hook-id>` so the
model can grow beyond a single boolean over time.
- Uses the existing `config/batchWrite` path for hook state updates
instead of introducing a dedicated hook write RPC.
- Refreshes live session hook engines after config writes so
already-open threads observe updated enablement without a restart.

## Stack

1. openai/codex#19705
2. openai/codex#19778
3. This PR - openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

The generated schema files account for much of the raw diff. The core
behavior is in:

- `hooks/src/config_rules.rs`, which resolves per-hook user state from
the config layer stack.
- `hooks/src/engine/discovery.rs`, which projects effective enablement
into `hooks/list` from source-derived managedness.
- `config/src/hook_config.rs`, which defines the new `hooks.state`
representation.
- `core/src/session/mod.rs`, which rebuilds live hook state after user
config reloads.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 04:46:32 +00:00
Michael Bolin
ac4332c05b permissions: expose active profile metadata (#20095) 2026-04-29 20:54:59 -07:00
Alex Zamoshchin
8b07132e09 update codex_plugins_beta_setting (from workspace settings) (#20250)
update the name after rename internally

see https://github.com/openai/openai/pull/871006
2026-04-30 00:40:25 +00:00
pakrym-oai
fedcefe9da Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider
2026-04-29 17:22:41 -07:00
stefanstokic-oai
c8abcbf925 Import external agent sessions in background (#20284)
Summary:
- Return from external agent import before session history import
finishes
- Run session import work in the background and emit the existing
completion notification when it is done
- Serialize session imports so duplicate requests do not create
duplicate imported threads

Verification:
- cargo test -p codex-app-server external_agent_config_
- cargo test -p codex-external-agent-sessions
- just fix -p codex-app-server
- just fix -p codex-external-agent-sessions
- git diff --check
2026-04-30 00:00:41 +00:00
alexsong-oai
7bcd4626c4 Consume ai-title from external sessions and add end marker (#20261)
## Summary
- Support Claude Code `ai-title` / `aiTitle` records when detecting and
importing external agent sessions.
- Preserve existing `custom-title` / `customTitle` precedence; only fall
back to `aiTitle` when no custom title is present.
- Add coverage for both detection and import title selection, including
the custom-title-over-ai-title case.

## Testing
- `cargo test -p codex-external-agent-sessions`
- `just fix -p codex-external-agent-sessions`
2026-04-30 00:00:13 +00:00
Abhinav
8774229a89 Add hooks/list app-server RPC (#19778)
## Why

We need a way to list the available hooks to expose via the TUI and App
so users can view and manage their hooks

## What

- Adds `hooks/list` for one or more `cwd` values that returns discovered
hook metadata

## Stack

1. openai/codex#19705
2. This PR - openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Review Notes

The generated schema files account for most of the raw diff, these files
have the core change:

- `hooks/src/engine/discovery.rs` builds the inventory entries during
hook discovery while leaving runtime handlers focused on execution.
- `app-server/src/codex_message_processor.rs` wires `hooks/list` into
the app-server flow for each requested `cwd`.
- `app-server-protocol/src/protocol/v2.rs` defines the new v2
request/response payloads exposed on the wire.

### Core Changes

`core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so
`skills/list` and `hooks/list`can resolve plugin state for each
requested `cwd`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 23:39:57 +00:00
xli-oai
afbddabc8b Require remote plugin detail before uninstall (#19966)
## Summary
- Fetch remote plugin detail before sending the uninstall request.
- Use the detail response to derive the marketplace namespace and plugin
name for cache cleanup.
- Stop the uninstall before the backend POST if detail lookup fails, so
backend state and local cache state do not diverge.

## Testing
- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo test -p codex-core-plugins`
- `git diff --check`
2026-04-29 14:01:11 -07:00
xl-openai
73cd831952 feat: Use remote installed plugin cache for skills and MCP (#20096)
- Fetches and caches remote /installed plugin state
- Lets skills/list load skills from remote-installed cached plugins
without requiring a local marketplace entry
- Routes plugin list/startup/install/uninstall changes through async
plugin cache invalidation and MCP refresh
2026-04-29 12:09:49 -07:00
xl-openai
6f328d5e02 Soften skill description budget warnings (#20112)
Updates skill description budget messaging to be less alarming
2026-04-28 19:56:25 -07:00
Celia Chen
8c47e36504 feat: expose provider capability bounds to app server clients (#20049)
follow up of #19442. The app server now exposes provider-derived bounds
through a new v2 `modelProvider/read` method. The response reports the
configured provider map key as `modelProvider` and returns the effective
capability booleans so clients can align their UI with the same
provider-owned limits used by core.
2026-04-29 01:36:19 +00:00
canvrno-oai
4c39ad33cb Fix plugin list workspace settings test isolation (#20086)
Fixes test that often fails locally when running `cargo test`
- Add an app-server test helper that combines managed-config isolation
with custom env overrides.
- Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests
so host home marketplaces do not affect results.
2026-04-28 18:34:38 -07:00
Gabriel Peal
5e6cbbadf7 Return None when auth refresh fails (#20092)
Right now, if Codex winds up in a state with auth but it can't refresh
the token, the user is left with an unhelpful message that says to log
out and log back in again.

Ultimately, we should prevent that from happening but if it does,
returning None will allow the caller to redirect the user back to the
login page
2026-04-28 16:15:47 -07:00
Ruslan Nigmatullin
1de7a9bf69 app-server: allow remote_control runtime feature override (#20047) 2026-04-28 13:36:12 -07:00
stefanstokic-oai
4c68bd728f External agent session support (#19895)
## Summary

This extends external agent detection/import beyond config artifacts so
Codex can detect recent sessions files from the external agent home and
import them into Codex rollout history.

## What changed

- Added a focused `external_agent_sessions` module for:
  - session discovery
  - source-record parsing
  - rollout construction
  - import ledger tracking
- Wired session detection/import into the app-server external agent
config API.
- Added compaction handling so large imported sessions can be resumed
safely before the first follow-up turn.

## Testing

Added coverage for:
- recent-session detection
- custom-title handling
- recency filtering
- dedupe and re-detect-after-source-change behavior
- visible imported turn construction
- backward-compatible import payload deserialization
- end-to-end RPC import flow
- rejection of undetected session paths
- repeat-import behavior
- large-session compaction before first follow-up

Ran:
- `cargo test -p codex-app-server external_agent_config_import_ --test
all`
2026-04-28 17:42:36 +00:00
xli-oai
803705f795 Add remote plugin uninstall API (#19456)
## Summary
- Adds the remote `plugin/uninstall` request form using required
`pluginId` plus optional `remoteMarketplaceName`, while preserving local
`pluginId` uninstall.
- Adds `codex_core_plugins::remote::uninstall_remote_plugin` for the
deployed ChatGPT plugin backend uninstall path and validates the backend
returns the same id with `enabled: false`.
- Routes app-server remote uninstall through feature checks, remote
plugin id validation, backend mutation, local downloaded cache deletion,
cache clearing, docs, and regenerated protocol schemas.

## Tests
- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol
plugin_uninstall_params_serialization_omits_force_remote_sync`
- `cargo test -p codex-app-server plugin_uninstall --test all`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo build -p codex-cli`
- `CODEX_BIN=/Users/xli/code/codex/codex-rs/target/debug/codex python3
/Users/xli/.codex/skills/xli-test-marketplace-api/scripts/run_marketplace_api_matrix.py`
(44 pass / 0 fail)
- `just fix -p codex-app-server-protocol -p codex-app-server -p
codex-tui`
- `just fix -p codex-app-server`
2026-04-28 03:27:53 -07:00
xl-openai
7d72fc8f53 feat: Cache remote plugin bundles on install (#19914)
Remote installs now fetch, validate, download, and cache the plugin
bundle locally
2026-04-28 00:53:27 -07:00
Michael Bolin
4b55979755 permissions: remove cwd special path (#19841)
## Why

The experimental `PermissionProfile` API had both `:cwd` and
`:project_roots` special filesystem paths, which made the permission
root ambiguous. This PR removes the unstable `current_working_directory`
special path before the permissions API is stabilized, so callers use
`:project_roots` for symbolic project-root access.

## What changed

- Removes `FileSystemSpecialPath::CurrentWorkingDirectory` from protocol
and app-server protocol models, plus regenerated app-server
JSON/TypeScript schemas.
- Replaces internal `:cwd` permission entries with `:project_roots`
entries.
- Keeps the existing cwd-update behavior for legacy-shaped
workspace-write profiles, while removing the deleted
`CurrentWorkingDirectory` case from that compatibility path.
- Keeps `PermissionProfile::workspace_write()` as the reusable symbolic
workspace-write helper, with docs noting that `:project_roots` entries
resolve at enforcement time.
- Updates app-server docs/examples and approval UI labeling to stop
advertising `:cwd` as a permission token.

## Compatibility

Persisted rollout items may contain the old
`{"kind":"current_working_directory"}` tag from earlier experimental
`permissionProfile` snapshots. This PR keeps that tag as a
deserialize-only alias for `ProjectRoots { subpath: None }`, while
continuing to serialize only the new `project_roots` tag.

## Follow-up

This PR intentionally does not introduce an explicit project-root set on
`SessionConfiguration` or runtime sandbox resolution. Today, the
resolver still uses the active cwd as the single implicit project root.
A follow-up should model project roots separately from tool cwd so
`:project_roots` entries can resolve against the configured project
roots, and resolve to no entries when there are no project roots.

## Verification

- `cargo test -p codex-protocol permissions:: --lib`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-sandboxing -p codex-exec-server --lib`
- `cargo test -p codex-core session_configuration_apply_ --lib`
- `cargo test -p codex-app-server
command_exec_permission_profile_project_roots_use_command_cwd --test
all`
- `cargo test -p codex-tui
thread_read_session_state_does_not_reuse_primary_permission_profile
--lib`
- `cargo test -p codex-tui
preset_matching_accepts_workspace_write_with_extra_roots --lib`
- `cargo test -p codex-config --lib`
2026-04-27 13:41:27 -07:00
rhan-oai
215d5a8f7c [codex-analytics] remove ga flag (#19863) 2026-04-27 19:29:19 +00:00
Andrey Mishchenko
35bc6e3d01 Delete unused ResponseItem::Message.end_turn (#19605)
This field is unused. Delete it.
2026-04-26 17:18:09 -07:00
pakrym-oai
9c3abcd46c [codex] Move config loading into codex-config (#19487)
## Why

Config loading had become split across crates: `codex-config` owned the
config types and merge logic, while `codex-core` still owned the loader
that assembled the layer stack. This change consolidates that
responsibility in `codex-config`, so the crate that defines config
behavior also owns how configs are discovered and loaded.

To make that move possible without reintroducing the old dependency
cycle, the shell-environment policy types and helpers that
`codex-exec-server` needs now live in `codex-protocol` instead of
flowing through `codex-config`.

This also makes the migrated loader tests more deterministic on machines
that already have managed or system Codex config installed by letting
tests override the system config and requirements paths instead of
reading the host's `/etc/codex`.

## What Changed

- moved the config loader implementation from `codex-core` into
`codex-config::loader` and deleted the old `core::config_loader` module
instead of leaving a compatibility shim
- moved shell-environment policy types and helpers into
`codex-protocol`, then updated `codex-exec-server` and other downstream
crates to import them from their new home
- updated downstream callers to use loader/config APIs from
`codex-config`
- added test-only loader overrides for system config and requirements
paths so loader-focused tests do not depend on host-managed config state
- cleaned up now-unused dependency entries and platform-specific cfgs
that were surfaced by post-push CI

## Testing

- `cargo test -p codex-config`
- `cargo test -p codex-core config_loader_tests::`
- `cargo test -p codex-protocol -p codex-exec-server -p
codex-cloud-requirements -p codex-rmcp-client --lib`
- `cargo test --lib -p codex-app-server-client -p codex-exec`
- `cargo test --no-run --lib -p codex-app-server`
- `cargo test -p codex-linux-sandbox --lib`
- `cargo shear`
- `just bazel-lock-check`

## Notes

- I did not chase unrelated full-suite failures outside the migrated
loader surface.
- `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
failures on this machine, and Windows CI still shows unrelated
long-running/timeouting test noise outside the loader migration itself.
2026-04-26 15:10:53 -07:00
Michael Bolin
4d7ce3447d permissions: make runtime config profile-backed (#19606)
## Why

This supersedes #19391. During stack repair, GitHub marked #19391 as
merged into a temporary stack branch rather than into `main`, so the
runtime-config change needed a fresh PR.

`PermissionProfile` is now the canonical permissions shape after #19231
because it can distinguish `Managed`, `Disabled`, and `External`
enforcement while also carrying filesystem rules that legacy
`SandboxPolicy` cannot represent cleanly. Core config and session state
still needed to accept profile-backed permissions without forcing every
profile through the strict legacy bridge, which rejected valid runtime
profiles such as direct write roots.

The unrelated CI/test hardening that previously rode along with this PR
has been split into #19683 so this PR stays focused on the permissions
model migration.

## What Changed

- Adds `Permissions.permission_profile` and
`SessionConfiguration.permission_profile` as constrained runtime state,
while keeping `sandbox_policy` as a legacy compatibility projection.
- Introduces profile setters that keep `PermissionProfile`, split
filesystem/network policies, and legacy `SandboxPolicy` projections
synchronized.
- Uses a compatibility projection for requirement checks and legacy
consumers instead of rejecting profiles that cannot round-trip through
`SandboxPolicy` exactly.
- Updates config loading, config overrides, session updates, turn
context plumbing, prompt permission text, sandbox tags, and exec request
construction to carry profile-backed runtime permissions.
- Preserves configured deny-read entries and `glob_scan_max_depth` when
command/session profiles are narrowed.
- Adds `PermissionProfile::read_only()` and
`PermissionProfile::workspace_write()` presets that match legacy
defaults.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19606).
* #19395
* #19394
* #19393
* #19392
* __->__ #19606
2026-04-26 13:29:54 -07:00
Michael Bolin
ac2bffa443 test: harden app-server integration tests (#19683)
## Why

Windows Bazel runs in the permissions stack exposed that app-server
integration tests were launching normal plugin startup warmups in every
subprocess. Those warmups can call
`https://chatgpt.com/backend-api/plugins/featured` when a test is not
specifically exercising plugin startup, which adds slow background work,
noisy stderr, and dependence on external network state. The relevant
startup/featured-plugin behavior was introduced across #15042 and
#15264.

A few app-server tests also had long optional waits or unbounded cleanup
paths, making failures expensive to diagnose and contributing to slow
Windows shards. One external-agent config test from #18246 used a
GitHub-style marketplace source, which was enough to exercise the
pending remote-import path but also meant the background completion task
could attempt a real clone.

## What Changed

- Adds explicit `AppServerRuntimeOptions` / `PluginStartupTasks`
plumbing and a hidden debug-only
`--disable-plugin-startup-tasks-for-tests` app-server flag, so
integration tests can suppress startup plugin warmups without adding a
production env-var gate.
- Has the app-server test harness pass that hidden flag by default,
while opting plugin-startup coverage back in for tests that
intentionally exercise startup sync and featured-plugin warmup behavior.
- Lowers normal app-server subprocess logging from `info`/`debug` to
`warn` to avoid multi-megabyte stderr output in Bazel logs.
- Prevents the external-agent config test from attempting a real
marketplace clone by using an invalid non-local source while still
exercising the pending-import completion path.
- Bounds optional filesystem/realtime waits and fake WebSocket
test-server shutdown so failures produce targeted timeouts instead of
hanging a shard.
- Fixes the Unix script-resolution test in `rmcp-client` to exercise
PATH resolution directly and include the actual spawn error in failures.

## Verification

- `cargo check -p codex-app-server`
- `cargo clippy -p codex-app-server --tests -- -D warnings`
- `cargo test -p codex-rmcp-client
program_resolver::tests::test_unix_executes_script_without_extension`
- `cargo test -p codex-app-server --test all
external_agent_config_import_sends_completion_notification_after_pending_plugins_finish
-- --nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_uses_warmed_featured_plugin_ids_cache_on_first_request --
--nocapture`
- Windows Local Bazel passed with this test-hardening bundle before it
was extracted from #19606.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19683).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19683
2026-04-26 12:43:16 -07:00
Michael Bolin
d54493ba1c test: stabilize app-server path assertions on Windows (#19604)
## Why

Windows can represent the same canonical local path with either a normal
drive path or a verbatim device path prefix. The failure pattern that
motivated this PR was an assertion diff like `C:\...` versus
`\\?\C:\...`: different spellings, same file.

That became visible while validating the permissions stack above this
PR. The stack increasingly routes paths through `AbsolutePathBuf`, which
normalizes supported Windows device prefixes, while several existing
tests still built expected values directly with
`std::fs::canonicalize()` or compared `AbsolutePathBuf::as_path()` to a
raw `PathBuf`. On Windows, that can make tests fail because the two
sides choose different textual forms for an otherwise equivalent
canonical path.

This PR is intentionally split out as the bottom PR below #19606. The
runtime permissions migration should not carry unrelated Windows test
stabilization, and reviewers should be able to verify this as a
test-only change before looking at the larger permissions changes.

## Failure Modes Covered

- `conversation_summary` expected rollout paths were built from raw
canonicalized `PathBuf`s, while app-server responses could carry
`AbsolutePathBuf`-normalized paths.
- `thread_resume` compared returned thread paths directly to previously
stored or fixture paths, so a verbatim-prefix spelling could fail an
otherwise correct resume.
- `marketplace_add` compared plugin install roots through `as_path()`
against raw canonicalized paths, reproducing the same `C:\...` versus
`\\?\C:\...` mismatch in both app-server and core-plugin coverage.

## What Changed

- In `app-server/tests/suite/conversation_summary.rs`, normalize both
expected rollout paths and received `ConversationSummary.path` values
through `AbsolutePathBuf` before comparing the full summary object.
- In `app-server/tests/suite/v2/thread_resume.rs`, normalize both sides
of thread path comparisons before asserting equality. This keeps the
tests focused on whether resume returned the same existing path, not
whether Windows used the same string spelling.
- In `app-server/tests/suite/v2/marketplace_add.rs` and
`core-plugins/src/marketplace_add.rs`, compare install roots as
`AbsolutePathBuf` values instead of comparing an absolute-path wrapper
to a raw canonicalized `PathBuf`.

## Behavior

This PR does not change production app-server or marketplace behavior.
It only changes tests to assert semantic path identity across Windows
path spelling variants. It also leaves API response values untouched;
the normalization happens inside assertions only.

## Verification

Targeted local checks run while extracting this fix:

- `cargo test -p codex-app-server
get_conversation_summary_by_thread_id_reads_rollout`
- `cargo test -p codex-app-server
get_conversation_summary_by_relative_rollout_path_resolves_from_codex_home`
- `cargo test -p codex-app-server
thread_resume_prefers_path_over_thread_id`

Windows-specific confidence comes from the Bazel Windows CI job for this
PR, since the failure is platform-specific.

## Docs

No docs update is needed because this is test-only infrastructure
stabilization.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19604).
* #19395
* #19394
* #19393
* #19392
* #19606
* __->__ #19604
2026-04-25 16:25:28 -07:00
Michael Bolin
f41306b4f3 test: isolate remote thread store regression from plugin warmups (#19593)
Follow-up to #19266.

## Why


`thread_start_with_non_local_thread_store_does_not_create_local_persistence`
is meant to catch accidental local thread persistence when a non-local
thread store is configured. The Windows flake reported in [this
BuildBuddy
invocation](https://app.buildbuddy.io/invocation/0b75dde4-6828-4e7b-a35b-e45b73fb005d)
showed that the assertion was tripping on an unexpected top-level `.tmp`
entry:

```diff
 {
+    ".tmp",
     "config.toml",
     "installation_id",
     "memories",
     "skills",
 }
```

That `.tmp` does not appear to come from `tempfile::TempDir`; it comes
from unrelated plugin startup work that can legitimately materialize
`codex_home/.tmp`, including the startup remote plugin sync marker in
[`core/src/plugins/startup_sync.rs`](bce74c70ce/codex-rs/core/src/plugins/startup_sync.rs (L13-L15))
and the curated plugin snapshot under
[`.tmp/plugins`](bce74c70ce/codex-rs/core-plugins/src/startup_sync.rs (L25-L26)).

That makes the regression race unrelated background startup tasks
instead of validating the thread-store invariant it was added to cover.
Rather than weakening the assertion to allow arbitrary `.tmp` entries,
this change isolates the test from plugin warmups so it can stay strict
about unexpected local thread persistence artifacts.

## What changed

- disable plugins in the generated config used by
`app-server/tests/suite/v2/remote_thread_store.rs`
- keep the existing `codex_home` assertions unchanged so the test still
fails if local session or sqlite persistence is introduced

## Verification

- `cargo test -p codex-app-server
suite::v2::remote_thread_store::thread_start_with_non_local_thread_store_does_not_create_local_persistence
-- --exact`
2026-04-25 20:45:31 +00:00
Eric Traut
4167628622 Add goal core runtime (4 / 5) (#18076)
Adds the core runtime behavior for active goals on top of the model
tools from PR 3.

## Why

A long-running goal should be a core runtime concern, not something
every client has to implement. Core owns the turn lifecycle, tool
completion boundaries, interruptions, resume behavior, and token usage,
so it is the right place to account progress, enforce budgets, and
decide when to continue work.

## What changed

- Centralized goal lifecycle side effects behind
`Session::goal_runtime_apply(GoalRuntimeEvent::...)`.
- Starts goal continuation turns only when the session is idle; pending
user input and mailbox work take priority.
- Accounts token and wall-clock usage at turn, tool, mutation,
interrupt, and resume boundaries; `get_thread_goal` remains read-only.
- Preserves sub-second wall-clock remainder across accounting boundaries
so long-running goals do not drift downward over time.
- Treats token budget exhaustion as a soft stop by marking the goal
`budget_limited` and injecting wrap-up steering instead of aborting the
active turn.
- Suppresses budget steering when `update_goal` marks a goal complete.
- Pauses active goals on interrupt and auto-reactivates paused goals
when a thread resumes outside plan mode.
- Suppresses repeated automatic continuation when a continuation turn
makes no tool calls.
- Added continuation and budget-limit prompt templates.

## Verification

- Added focused core coverage for continuation scheduling, accounting
boundaries, budget-limit steering, completion accounting, interrupt
pause behavior, resume auto-activation, and wall-clock remainder
accounting.
2026-04-24 21:16:00 -07:00
Eric Traut
6c874f9b34 Add goal app-server API (2 / 5) (#18074)
Adds the app-server v2 goal API on top of the persisted goal state from
PR 1.

## Why

Clients need a stable app-server surface for reading and controlling
materialized thread goals before the model tools and TUI can use them.
Goal changes also need to be observable by app-server clients, including
clients that resume an existing thread.

## What changed

- Added v2 `thread/goal/get`, `thread/goal/set`, and `thread/goal/clear`
RPCs for materialized threads.
- Added `thread/goal/updated` and `thread/goal/cleared` notifications so
clients can keep local goal state in sync.
- Added resume/snapshot wiring so reconnecting clients see the current
goal state for a thread.
- Added app-server handlers that reconcile persisted rollout state
before direct goal mutations.
- Updated the app-server README plus generated JSON and TypeScript
schema fixtures for the new API surface.

## Verification

- Added app-server v2 coverage for goal get/set/clear behavior,
notification emission, resume snapshots, and non-local thread-store
interactions.
2026-04-24 20:53:41 -07:00
Michael Bolin
789f387982 permissions: remove legacy read-only access modes (#19449)
## Why

`ReadOnlyAccess` was a transitional legacy shape on `SandboxPolicy`:
`FullAccess` meant the historical read-only/workspace-write modes could
read the full filesystem, while `Restricted` tried to carry partial
readable roots. The partial-read model now belongs in
`FileSystemSandboxPolicy` and `PermissionProfile`, so keeping it on
`SandboxPolicy` makes every legacy projection reintroduce lossy
read-root bookkeeping and creates unnecessary noise in the rest of the
permissions migration.

This PR makes the legacy policy model narrower and explicit:
`SandboxPolicy::ReadOnly` and `SandboxPolicy::WorkspaceWrite` represent
the old full-read sandbox modes only. Split readable roots, deny-read
globs, and platform-default/minimal read behavior stay in the runtime
permissions model.

## What changed

- Removes `ReadOnlyAccess` from
`codex_protocol::protocol::SandboxPolicy`, including the generated
`access` and `readOnlyAccess` API fields.
- Updates legacy policy/profile conversions so restricted filesystem
reads are represented only by `FileSystemSandboxPolicy` /
`PermissionProfile` entries.
- Keeps app-server v2 compatible with legacy `fullAccess` read-access
payloads by accepting and ignoring that no-op shape, while rejecting
legacy `restricted` read-access payloads instead of silently widening
them to full-read legacy policies.
- Carries Windows sandbox platform-default read behavior with an
explicit override flag instead of depending on
`ReadOnlyAccess::Restricted`.
- Refreshes generated app-server schema/types and updates tests/docs for
the simplified legacy policy shape.

## Verification

- `cargo check -p codex-app-server-protocol --tests`
- `cargo check -p codex-windows-sandbox --tests`
- `cargo test -p codex-app-server-protocol sandbox_policy_`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19449).
* #19395
* #19394
* #19393
* #19392
* #19391
* __->__ #19449
2026-04-24 17:16:58 -07:00
Tom
588f7a9fc4 [codex] add non-local thread store regression harness (#19266)
- Add an integration test that guarantees nothing gets written to codex
home dir or sqlite when running a rollout with a non-local ThreadStore
- Add an in-memory "spy" ThreadStore for tests like this

Note I could not find a good way to also ensure there were no filesystem
_reads_ that didn't go through threadstore. I explored a more elaborate
sandboxed-subprocess approach but it isn't platform portable and felt
like it wasn't (yet) worth it.
2026-04-24 15:45:44 -07:00
Tom
0a9b559c0b Migrate fork and resume reads to thread store (#18900)
- Route cold thread/resume and thread/fork source loading through
ThreadStore reads instead of direct rollout path operations
- Keep lookups that explicitly specify a rollout-path using the local
thread store methods but return an invalid-request error for remote
ThreadStore configurations
- Add some additional unit tests for code path coverage
2026-04-24 13:51:37 -07:00
Ruslan Nigmatullin
a3cccbd8ed [codex] Omit fork turns from thread started notifications (#19093)
## Why

`thread/fork` responses intentionally include copied history so the
caller can render the fork immediately, but `thread/started` is a
lifecycle notification. The v2 `Thread` contract says notifications
should return `turns: []`, and the fork path was reusing the response
thread directly, causing copied turns to be emitted through
`thread/started` as well.

## What Changed

- Route app-server `thread/started` notification construction through a
helper that clears `thread.turns` before sending.
- Keep `thread/fork` responses unchanged so callers still receive copied
history.
- Add persistent and ephemeral fork coverage that asserts
`thread/started` emits an empty `turns` array while the response retains
fork history.

## Testing

- `just fmt`
- `cargo test -p codex-app-server`
2026-04-24 12:31:13 -07:00
Alex Zamoshchin
bcc1caa920 respect workspace option for disabling plugins (#18907)
Respects the workspace setting for plugins in Codex

Plugins menu disappears
Plugins do not load
Plugins do not load in composer

no plugins loaded
<img width="809" height="226" alt="Screenshot 2026-04-23 at 3 20 45 PM"
src="https://github.com/user-attachments/assets/3a4dba8e-69c3-4046-a77e-f13ab77f84b4"
/>


no plugins in menu
<img width="293" height="204" alt="Screenshot 2026-04-23 at 3 20 35 PM"
src="https://github.com/user-attachments/assets/5cb9bf52-ad72-488f-b90c-5eb457da09a3"
/>
2026-04-24 17:38:45 +00:00
danwang-oai
11806faf71 Fix hang on turn/interrupt (#18392)
Fix a bug where the `turn/interrupt` RPC hangs when interrupting a turn
that has already completed.

Before this change, `turn/interrupt` requests were queued in app-server
and only answered when a later TurnAborted event arrived. If the target
turn was already complete, core treated Op::Interrupt as a no-op, so no
abort event was emitted and the RPC could hang indefinitely.

This change fixes that in two places:

* Reject turn/interrupt immediately with `INVALID_REQUEST` when the
requested turn is no longer the active turn.
* Resolve any already-accepted pending interrupt requests when the turn
reaches TurnComplete, covering the case where a turn finishes naturally
after the interrupt request is accepted but before it aborts.

I tested this by adding a failing test in
707487c063. You may view the results here:
https://github.com/openai/codex/actions/runs/24585182419/

<img width="1512" height="310" alt="CleanShot 2026-04-17 at 16 33 30@2x"
src="https://github.com/user-attachments/assets/f4a88228-b2a4-41f4-9aaa-ec82814096af"
/>
2026-04-24 10:47:50 -04:00
sayan-oai
c10f95ddac Update models.json and related fixtures (#19323)
Supersedes #18735.

The scheduled rust-release-prepare workflow force-pushed
`bot/update-models-json` back to the generated models.json-only diff,
which dropped the test and snapshot updates needed for CI.

This PR keeps the latest generated `models.json` from #18735 and adds
the corresponding fixture updates:
- preserve model availability NUX in the app-server model cache fixture
- update core/TUI expectations for the new `gpt-5.4` `xhigh` default
reasoning
- refresh affected TUI chatwidget snapshots for the `gpt-5.5`
default/model copy changes

Validation run locally while preparing the fix:
- `just fmt`
- `cargo test -p codex-app-server model_list`
- `cargo test -p codex-core includes_no_effort_in_request`
- `cargo test -p codex-core
includes_default_reasoning_effort_in_request_when_defined_by_model_info`
- `cargo test -p codex-tui --lib chatwidget::tests`
- `cargo insta pending-snapshots`

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
2026-04-24 11:14:13 +02:00
Michael Bolin
4816b89204 permissions: make profiles represent enforcement (#19231)
## Why

`PermissionProfile` is becoming the canonical permissions abstraction,
but the old shape only carried optional filesystem and network fields.
It could describe allowed access, but not who is responsible for
enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy
when profiles were exported, cached, or round-tripped through app-server
APIs.

The important model change is that active permissions are now a disjoint
union over the enforcement mode. Conceptually:

```rust
pub enum PermissionProfile {
    Managed {
        file_system: FileSystemSandboxPolicy,
        network: NetworkSandboxPolicy,
    },
    Disabled,
    External {
        network: NetworkSandboxPolicy,
    },
}
```

This distinction matters because `Disabled` means Codex should apply no
outer sandbox at all, while `External` means filesystem isolation is
owned by an outside caller. Those are not equivalent to a broad managed
sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an
inner sandbox may require the outer Codex layer to use no sandbox rather
than a permissive one.

## How Existing Modeling Maps

Legacy `SandboxPolicy` remains a boundary projection, but it now maps
into the higher-fidelity profile model:

- `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed`
with restricted filesystem entries plus the corresponding network
policy.
- `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving
the “no outer sandbox” intent instead of treating it as a lax managed
sandbox.
- `ExternalSandbox { network_access }` maps to
`PermissionProfile::External { network }`, preserving external
filesystem enforcement while still carrying the active network policy.
- Split runtime policies that legacy `SandboxPolicy` cannot faithfully
express, such as managed unrestricted filesystem plus restricted
network, stay `Managed` instead of being collapsed into
`ExternalSandbox`.
- Per-command/session/turn grants remain partial overlays via
`AdditionalPermissionProfile`; full `PermissionProfile` is reserved for
complete active runtime permissions.

## What Changed

- Change active `PermissionProfile` into a tagged union: `managed`,
`disabled`, and `external`.
- Keep partial permission grants separate with
`AdditionalPermissionProfile` for command/session/turn overlays.
- Represent managed filesystem permissions as either `restricted`
entries or `unrestricted`; `glob_scan_max_depth` is non-zero when
present.
- Preserve old rollout compatibility by accepting the pre-tagged `{
network, file_system }` profile shape during deserialization.
- Preserve fidelity for important edge cases: `DangerFullAccess`
round-trips as `disabled`, `ExternalSandbox` round-trips as `external`,
and managed unrestricted filesystem + restricted network stays managed
instead of being mistaken for external enforcement.
- Preserve configured deny-read entries and bounded glob scan depth when
full profiles are projected back into runtime policies, including
unrestricted replacements that now become `:root = write` plus deny
entries.
- Regenerate the experimental app-server v2 JSON/TypeScript schema and
update the `command/exec` README example for the tagged
`permissionProfile` shape.

## Compatibility

Legacy `SandboxPolicy` remains available at config/API boundaries as the
compatibility projection. Existing rollout lines with the old
`PermissionProfile` shape continue to load. The app-server
`permissionProfile` field is experimental, so its v2 wire shape is
intentionally updated to match the higher-fidelity model.

## Verification

- `just write-app-server-schema`
- `cargo check --tests`
- `cargo test -p codex-protocol permission_profile`
- `cargo test -p codex-protocol
preserving_deny_entries_keeps_unrestricted_policy_enforceable`
- `cargo test -p codex-app-server-protocol
permission_profile_file_system_permissions`
- `cargo test -p codex-app-server-protocol serialize_client_response`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox`
- `just fix`
- `just fix -p codex-protocol`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
2026-04-23 23:02:18 -07:00
xli-oai
33cc135cc3 [codex] Support remote plugin install writes (#18917)
## Summary
- Add a remote plugin install write call that POSTs the selected remote
plugin to the ChatGPT cloud plugin API.
- Align remote install with the latest remote read contract:
`pluginName` carries the backend remote plugin id directly, for example
`plugins~Plugin_linear`, and install no longer synthesizes
`<name>@<marketplace>` ids.
- Validate remote install ids with the same character rules as remote
read, return the same install response shape as local installs, and
include mocked app-server coverage for the write path.

## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all plugin_install`
- `cargo test -p codex-core-plugins`
- `just fix -p codex-app-server`
- `just fix -p codex-core-plugins`
2026-04-23 22:10:15 -07:00
starr-openai
49fb25997f Add sticky environment API and thread state (#18897)
## Summary
- add sticky environment selections to app-server v2 thread/start and
turn/start request flow
- carry thread-level selections through core session/thread state
- add app-server coverage for sticky selections and turn overrides

## Stack
1. This PR: API and thread persistence
2. #18898: config.toml named environment loading
3. #18899: downstream tool/runtime consumers

## Validation
- Not run locally; split only.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 18:57:13 -07:00
Celia Chen
432771c5fd feat: expose AWS account state from account/read (#19048)
## Why

AWS/Bedrock mode currently reports `account: null` with
`requiresOpenaiAuth: false` from `account/read`. That suppresses the
OpenAI-auth requirement, but it does not let app clients distinguish AWS
auth from any other non-OpenAI custom provider. For the prototype AWS
provider UX, clients need a simple provider-derived signal so they can
suppress ChatGPT/API-key login and token-refresh paths without
hardcoding Bedrock checks.

## What changed

- Adds an `aws` variant to the v2 `Account` protocol union.
- Adds `ProviderAccountKind` to `codex-model-provider` so the runtime
provider owns the app-visible account classification.
- Makes Amazon Bedrock return `ProviderAccountKind::Aws` from the
model-provider layer.
- Updates app-server `account/read` to map `ProviderAccountKind` to the
existing `GetAccountResponse` wire shape.
- Preserves the existing `account: null, requiresOpenaiAuth: false`
behavior for other non-OpenAI providers.
- Regenerates the app-server protocol schema fixtures.
- Adds coverage for provider account classification and for the Amazon
Bedrock `account/read` response.

## Testing

- `cargo test -p codex-model-provider`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server get_account_with_aws_provider`

## Notes

I attempted `just bazel-lock-update` and `just bazel-lock-check`, but
both are blocked in my local environment because `bazel` is not
installed.
2026-04-24 01:53:13 +00:00
xli-oai
0d6a90cd6b Add app-server marketplace upgrade RPC (#19074)
## Summary
- add a v2 `marketplace/upgrade` app-server RPC that mirrors the
existing configured Git marketplace upgrade path
- expose typed request/response/error payloads and regenerate
JSON/TypeScript schema fixtures
- add app-server integration coverage for all, named, already
up-to-date, and invalid marketplace upgrade requests

## Tests
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server marketplace_upgrade`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fmt`
2026-04-23 13:00:46 -07:00
xl-openai
198eddd25d Move marketplace add/remove and startup sync out of core. (#19099)
Move more things to core-plugins.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 11:27:17 -07:00
Eric Traut
c2423f42d1 Respect explicit untrusted project config (#18626)
## Why

Fixes #18475. A `-c` override such as `projects.<cwd>.trust_level =
"untrusted"` is meant to be a runtime config override, but app-server
thread startup treated any non-trusted project as eligible for automatic
trust persistence when a permissive sandbox/cwd was requested. That
meant an explicit `untrusted` session override could still cause
`config.toml` to be updated with `trusted`.

## What changed

The app-server auto-trust path now runs only when the active project
trust level is unknown. Explicit `trusted` and explicit `untrusted`
values are both respected, regardless of whether they came from
persisted config or session flags.

A focused `thread/start` test now covers the explicit `untrusted` case
with a permissive sandbox request.

## Verification

- `cargo test -p codex-app-server`
- `just fix -p codex-app-server`
2026-04-23 10:51:17 -07:00
David de Regt
3d3028a5a9 Add excludeTurns parameter to thread/resume and thread/fork (#19014)
For callers who expect to be paginating the results for the UI, they can
now call thread/resume or thread/fork with excludeturns:true so it will
not fetch any pages of turns, and instead only set up the subscription.
That call can be immediately followed by pagination requests to
thread/turns/list to fetch pages of turns according to the UI's current
interactions.
2026-04-23 10:07:59 -07:00
Abhinav
305825abd9 Support MCP tools in hooks (#18385)
## Summary

Lifecycle hooks currently treat `PreToolUse`, `PostToolUse`, and
`PermissionRequest` as Bash-only flows
- hook schema constrains `tool_name` to `Bash`
- hook input assumes a command-shaped `tool_input`
- core hook dispatch path passes only shell command strings

That means hooks cannot target MCP tools even though MCP tool names are
model-visible and stable

This change generalizes those hook paths so they can match and receive
payloads for MCP tools while preserving the existing Bash behavior.

## Reviewer Notes

I think these are the key files
- `codex-rs/core/src/tools/handlers/mcp.rs`
- `codex-rs/core/src/mcp_tool_call.rs`

Otherwise the changes across apply_patch, shell, and unified_exec are
mainly to rewire everything to be `tool_input` based instead of just
`command` so that it'll make sense for MCP tools.

## Changes

- Allow `PreToolUse`, `PostToolUse`, and `PermissionRequest` hook inputs
to carry arbitrary `tool_name` and `tool_input` values instead of
hard-coding `Bash` and command-only payloads.
- Add MCP hook payload support through `McpHandler`, using the
model-visible tool name from `ToolInvocation` and the raw MCP arguments
as `tool_input`.
- Include MCP tool responses in `PostToolUse` by serializing
`McpToolOutput` into the hook response payload.
- Run `PermissionRequest` hooks for MCP approval requests after
remembered approval checks and before falling back to user-facing MCP
elicitation.
- Preserve exact matching for literal hook matchers like `Bash` and
`mcp__memory__create_entities`, while keeping regex matcher support for
patterns like `mcp__memory__.*` and `mcp__.*__write.*`.

---------

Co-authored-by: Andrei Eternal <eternal@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-04-23 07:33:57 +00:00
Michael Bolin
8bc667b07b app-server: include filesystem entries in permission requests (#19086)
## Why

`item/permissions/requestApproval` sends a requested permission profile
to app-server clients. The core profile already stores filesystem
permissions as `entries`, but the v2 compatibility conversion used the
legacy `read`/`write` projection whenever possible and left `entries`
unset.

That made the request ambiguous for clients that consume the canonical
v2 shape: `permissions.fileSystem.entries` was missing even though
filesystem access was being requested. A client that rendered or echoed
grants from `entries` could treat the request as having no filesystem
permission entries, then return an empty or incomplete grant. The
app-server intersects responses with the original request, so omitted
filesystem permissions are denied.

## What Changed

- Populate `AdditionalFileSystemPermissions.entries` when converting
legacy read/write roots for request permission payloads, while
preserving `read` and `write` for compatibility.
- Mark `read` and `write` as transitional schema fields in the generated
app-server schema.
- Add regression coverage for the v2 conversion, the app-server
`item/permissions/requestApproval` round trip, and TUI app-server
approval conversion expectations.
- Refresh generated JSON and TypeScript schema fixtures.

## Verification

- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server request_permissions_round_trip`
- `cargo test -p codex-tui
converts_request_permissions_into_granted_permissions`
- `cargo test -p codex-tui
resolves_permissions_and_user_input_through_app_server_request_id`
2026-04-23 00:21:59 -07:00
xl-openai
fb6308cf64 Use remote plugin IDs for detail reads and enlarge list pages (#19079)
1. For remote plugin use plugin id (plugin name) directly for read
plugin details;
2. Request up to 200 remote plugins per directory list page.
2026-04-22 22:50:20 -07:00
Michael Bolin
9d824cf4b4 app-server: accept command permission profiles (#18283)
## Why

`command/exec` is another app-server entry point that can run under
caller-provided permissions. It needs to accept `PermissionProfile`
directly so command execution is not left behind on `SandboxPolicy`
while thread APIs move forward.

Command-level profiles also need to preserve the semantics clients
expect from profile-relative paths. `:cwd` and cwd-relative deny globs
should be anchored to the resolved command cwd for a command-specific
profile, while configured deny-read restrictions such as `**/*.env =
none` still need to be enforced because they can come from config or
requirements rather than the command override itself.

## What Changed

This adds `permissionProfile` to `CommandExecParams`, rejects requests
that combine it with `sandboxPolicy`, and converts accepted profiles
into the runtime filesystem/network permissions used for command
execution.

When a command supplies a profile, the app-server resolves that profile
against the command cwd instead of the thread/server cwd. It also
preserves configured deny-read entries and `globScanMaxDepth` on the
effective filesystem policy so one-off command overrides cannot drop
those read protections. The PR also updates app-server docs/schema
fixtures and adds command-exec coverage for accepted, rejected,
cwd-scoped, and deny-read-preserving profile paths.

## Verification

- `cargo test -p codex-app-server
command_exec_permission_profile_cwd_uses_command_cwd`
- `cargo test -p codex-app-server
command_profile_preserves_configured_deny_read_restrictions`
- `cargo test -p codex-app-server
command_exec_accepts_permission_profile`
- `cargo test -p codex-app-server
command_exec_rejects_sandbox_policy_with_permission_profile`
- `just fix -p codex-app-server`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18283).
* #18288
* #18287
* #18286
* #18285
* #18284
* __->__ #18283
2026-04-22 22:33:16 -07:00
Eric Traut
bbff4ee61a Add safety check notification and error handling (#19055)
Adds a new app-server notification that fires when a user account has
been flagged for potential safety reasons.
2026-04-22 22:24:12 -07:00
Dylan Hurd
5e71da1424 feat(request-permissions) approve with strict review (#19050)
## Summary
Allow the user to approve a request_permissions_tool request with the
condition that all commands in the rest of the turn are reviewed by
guardian, regardless of sandbox status.

## Testing
- [x] Added unit tests
- [x] Ran locally
2026-04-23 01:56:32 +00:00