Compare commits

..

595 Commits

Author SHA1 Message Date
viyatb-oai
71ea2607b5 feat(tui): add permission profile picker
Co-authored-by: Codex noreply@openai.com
2026-05-15 17:01:43 -07:00
viyatb-oai
96866bc773 style(core): satisfy session formatting
Co-authored-by: Codex noreply@openai.com
2026-05-15 17:01:28 -07:00
viyatb-oai
8970201bc1 fix: align runtime profile refresh with current session APIs
Co-authored-by: Codex noreply@openai.com
2026-05-15 16:54:33 -07:00
viyatb-oai
a26143432d fix: add active profile to memories override test
Co-authored-by: Codex noreply@openai.com
2026-05-15 16:47:18 -07:00
viyatb-oai
a107f154d8 fix: tolerate Windows delete-pending memory files in tests
Co-authored-by: Codex noreply@openai.com
2026-05-15 16:47:18 -07:00
viyatb-oai
2ff9585378 style: stabilize network proxy config formatting
Co-authored-by: Codex noreply@openai.com
2026-05-15 16:47:17 -07:00
viyatb-oai
d89d4f8d2c fix: refresh profile network proxy on permission switches
Co-authored-by: Codex noreply@openai.com
2026-05-15 16:47:17 -07:00
viyatb-oai
7335c4a54a core: expose permission profile picker metadata
Co-authored-by: Codex noreply@openai.com
2026-05-15 16:45:46 -07:00
Michael Bolin
bbb5c2811d tui: pass active permission profiles through app commands (#22891)
## Why

This continues the permissions migration by keeping the TUI command
boundary aligned with the app-server protocol direction from #22795:
callers should select a permission profile by id instead of passing a
concrete `PermissionProfile` value around as the turn configuration.

`AppCommand` is internal to the TUI, but it is the path that eventually
becomes `thread/turn/start`, so carrying concrete profile details there
made it too easy for UI code to keep relying on the old whole-profile
replacement model.

## What changed

- `AppCommand::UserTurn` and `AppCommand::OverrideTurnContext` now carry
`Option<ActivePermissionProfile>` instead of `PermissionProfile`.
- Composer submissions copy the active permission profile id from the
current session snapshot; legacy snapshots intentionally submit no
active profile id.
- Permission preset UI events now carry only the active built-in profile
id. The app derives the concrete built-in `PermissionProfile` internally
only when updating its local config/status snapshot.
- Permission presets expose their built-in active profile id, and preset
selection preserves that id in both the immediate turn override and the
local TUI config snapshot.
- Turn routing sends `TurnPermissionsOverride::ActiveProfile` when an
active id is present, and only falls back to the legacy sandbox
projection for the remaining runtime override path.

## How to review

Start with `codex-rs/tui/src/app_command.rs` to verify the command shape
no longer exposes `PermissionProfile`.

Then read `codex-rs/tui/src/app/thread_routing.rs` to verify the
app-server turn-start conversion: active ids go through as ids, while
the legacy sandbox fallback is still constrained to the existing runtime
override case.

Finally, check `codex-rs/tui/src/chatwidget/permission_popups.rs`,
`codex-rs/tui/src/app/event_dispatch.rs`,
`codex-rs/tui/src/app/config_persistence.rs`, and
`codex-rs/utils/approval-presets/src/lib.rs` to see how preset
selections stay id-only across TUI events while the local display/config
mirror still gets a concrete built-in profile.

## Verification

Latest local verification after the id-only `AppEvent` cleanup:

- `cargo check -p codex-tui --tests`
- `cargo test -p codex-tui
permissions_selection_sends_approvals_reviewer_in_override_turn_context`
- `cargo test -p codex-tui update_feature_flags_enabling_guardian`
- `cargo test -p codex-utils-approval-presets`
- `just fmt`
- `just fix -p codex-tui -p codex-utils-approval-presets`

Earlier in the same PR, before the final event-shape cleanup:

- `cargo test -p codex-tui turn_permissions_`
- `cargo test -p codex-tui submission_`
- `cargo test -p codex-tui
session_configured_syncs_widget_config_permissions_and_cwd`
- `RUST_MIN_STACK=16777216 cargo test -p codex-tui`
2026-05-15 22:42:35 +00:00
Curtis 'Fjord' Hawthorne
8543e39885 Preserve image detail in app-server inputs (#20693)
## Summary

- Add optional image detail to user image inputs across core, app-server
v2, thread history/event mapping, and the generated app-server
schemas/types.
- Preserve requested detail when serializing Responses image inputs:
omitted detail stays on the existing `high` default, while explicit
`original` keeps local images on the original-resolution path.
- Support `high`/`original` consistently for tool image outputs,
including MCP `codex/imageDetail`, code-mode image helpers, and
`view_image`.
2026-05-15 15:04:04 -07:00
Tom
249d50aafc [codex] Soften SQLite metadata sync failures (#22899)
## Summary
- keep transcript-derived local thread metadata SQLite failures
best-effort
- preserve hard failures for explicit git-only metadata updates that
still require SQLite state
- add regression coverage for the soft-vs-hard metadata update policy

## Root cause
The live thread metadata sync introduced after v0.131.0-alpha.8 moved
append-derived metadata writes above the rollout writer. Those SQLite
writes now propagated through the live thread flush path, so a corrupted
optional state DB could surface as a transcript persistence warning even
when JSONL writes still succeeded.

The hard failures were introduced in #22236
2026-05-15 21:37:27 +00:00
Owen Lin
6a331a66eb feat(app-server): update remote control APIs for better UX (#22877)
## Why
To help improve `codex remote-control` CLI UX which I plan to do in a
followup, this PR adds `server-name` to the various remote control APIs:
- `remoteControl/enable`
- `remoteControl/disable`
- `remoteControl/status/changed`

Also, add a `remoteControl/status/read` API. This will be helpful in the
Codex App.
2026-05-15 14:33:24 -07:00
Shijie Rao
98129fb9c5 Disable DMG staging for signed macOS promotion (#22900)
## Why
`promote_signed` is now used to finish a release from an externally
signed macOS handoff, but this release path (temporarily) no longer
distributes DMGs. Keeping DMG staging enabled made the handoff
unnecessarily require DMG assets and notarization/stapling validation
even though the promoted release only needs the signed macOS binaries.

## What changed
- Set every `stage-signed-macos` matrix entry to `build_dmg: "false"`,
including the primary macOS bundles.
- Kept the existing DMG staging branch in place behind
`matrix.build_dmg` so it can be re-enabled deliberately later.
- Updated the workflow header comment so the signed handoff contract
asks for signed binaries, not signed DMGs.

The regular signed build path that creates, signs, notarizes, and stages
DMGs is unchanged; this only affects the `promote_signed` handoff path.
2026-05-15 14:19:06 -07:00
Michael Bolin
8df2d96860 core: construct test permission profiles directly (#22795)
## Why

The core migration is trying to make `PermissionProfile` the shape tests
and runtime code reason about, leaving `SandboxPolicy` only where legacy
behavior is explicitly under test. The local
`permission_profile_for_sandbox_policy()` test helpers kept new
permission-profile tests mentally tied to the old sandbox model even
when the equivalent profile is straightforward.

## What Changed

- Removed the `permission_profile_for_sandbox_policy()` helper from the
network proxy spec tests and session tests.
- Replaced legacy conversions for read-only, workspace-write, and
full-access cases with `PermissionProfile::read_only()`,
`PermissionProfile::workspace_write()`, and
`PermissionProfile::Disabled`.
- Constructed the external-sandbox session test's
`PermissionProfile::External` directly, while preserving the legacy
`SandboxPolicy` only where the test still exercises legacy config update
behavior.

## How To Review

This PR is intentionally test-only. Review the two touched files and
check that each replacement preserves the old legacy mapping:

- `SandboxPolicy::new_read_only_policy()` ->
`PermissionProfile::read_only()`
- `SandboxPolicy::new_workspace_write_policy()` ->
`PermissionProfile::workspace_write()`
- `SandboxPolicy::DangerFullAccess` -> `PermissionProfile::Disabled`
- `SandboxPolicy::ExternalSandbox { network_access: Restricted }` ->
`PermissionProfile::External { network: Restricted }`

## Verification

- `cargo test -p codex-core
requirements_allowed_domains_are_a_baseline_for_user_allowlist`
- `cargo test -p codex-core
start_managed_network_proxy_applies_execpolicy_network_rules`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox`
- `cargo test -p codex-core
managed_network_proxy_decider_survives_full_access_start`
- `just fix -p codex-core`








---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22795).
* #22891
* __->__ #22795
2026-05-15 13:09:25 -07:00
Michael Bolin
83bbb4f326 app-server: stop returning thread permission profiles (#22792)
## Why

The app-server thread lifecycle API should no longer expose the full
`PermissionProfile` value. After the permissions-profile migration,
clients should round-trip only the active profile identity through
`activePermissionProfile` and `permissions` when that identity is known.

The full profile is server-side config. Treating a response-derived
legacy sandbox projection as a new local profile can lose named-profile
restrictions and accidentally widen permissions on the next turn. The
legacy `sandbox` response field remains only as the
compatibility/display fallback.

## What Changed

- Removed `permissionProfile` from `ThreadStartResponse`,
`ThreadResumeResponse`, and `ThreadForkResponse`.
- Stopped populating that field in app-server thread start/resume/fork
responses.
- Updated embedded exec/TUI response mapping to derive display
permission state from local config or the legacy sandbox fallback
instead of a response profile value.
- Added a TUI turn override shape that distinguishes preserving server
permissions, selecting an active profile id, and sending a legacy
sandbox for an explicit local override.
- Preserved remote app-server permissions across turns by sending
`permissions` only when an `activePermissionProfile` id is known, and
otherwise sending no sandbox override unless the user selected a local
override.
- Kept embedded `thread/resume` hydration server-authored when
`activePermissionProfile` is absent, which matches the live-thread
attach path where the server ignores requested overrides.
- Updated the app-server README to remove the obsolete lifecycle
response `permissionProfile` reference. The remaining
`permissionProfile` README references are request-side permission
overrides.
- Regenerated app-server JSON schema and TypeScript fixtures.
- Kept the generated typed response enum exempt from
`large_enum_variant`, matching the existing payload enum exemption after
the lifecycle response variants shrank.

## How To Review

Start with `codex-rs/app-server-protocol/src/protocol/v2/thread.rs` to
confirm the response shape, then check the response construction in
`codex-rs/app-server/src/request_processors`. The generated schema and
TypeScript fixture changes are mechanical follow-through from the
protocol removal.

The TUI behavior is the delicate part: review
`codex-rs/tui/src/app_server_session.rs` for response hydration and
turn-start override projection, then
`codex-rs/tui/src/app/thread_routing.rs` for the decision about whether
the next turn should preserve the server snapshot, send an active
profile id, or send a legacy sandbox for an explicit local override.

## Verification

- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol
thread_lifecycle_responses_default_missing_optional_fields`
- `cargo test -p codex-exec
session_configured_from_thread_response_uses_permission_profile_from_config`
- `cargo test -p codex-tui --lib thread_response`
- `cargo test -p codex-tui turn_permissions_`
- `cargo test -p codex-tui
resume_response_restores_turns_from_thread_items`
- `cargo test -p codex-analytics
track_response_only_enqueues_analytics_relevant_responses`
- `just fix -p codex-analytics`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-tui`
- `just argument-comment-lint`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22792).
* #22795
* __->__ #22792
2026-05-15 12:45:48 -07:00
viyatb-oai
6afe00efda Workflow updates (#22582) 2026-05-15 12:41:18 -07:00
Boyang Niu
c15613f2b6 Forward apps MCP product SKU from Codex config (#22872)
This adds `apps_mcp_product_sku` as a toplevel config.toml key. We pass
the given value as a header when listing MCPs for the client, allowing
connectors to be filtered per product entry point.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-15 11:52:14 -07:00
Michael Bolin
4c80435eba telemetry: tag sandboxes from permission profiles (#22791)
## Why

Sandbox telemetry tags should be derived from the active permission
profile, not from a legacy `SandboxPolicy`, so the tagging code stays
aligned with the permissions migration and does not preserve a
policy-shaped production helper only for tests.

## What Changed

- Removed the production `sandbox_tag(&SandboxPolicy, ...)` helper.
- Updated sandbox tag tests to construct the relevant
`PermissionProfile` values directly.
- Kept the platform-specific sandbox tag behavior under the existing
`permission_profile_sandbox_tag` path.

## How To Review

The production change is in `codex-rs/core/src/sandbox_tags.rs`. Most of
the diff is test cleanup that replaces legacy policy setup with
permission profiles, so review the expected tag assertions rather than
the old helper mechanics.

## Verification

- `cargo test -p codex-core sandbox_tag`









---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22791).
* #22795
* #22792
* __->__ #22791
2026-05-15 10:58:50 -07:00
Michael Bolin
aeca1cba6f context: remove legacy permissions instructions helper (#22790)
## Why

The permissions instruction builder should consume the new permissions
model directly. Keeping a `SandboxPolicy` conversion helper in this path
encourages new code to route through legacy sandbox policy values even
when the caller already has a `PermissionProfile`.

## What Changed

- Removed `PermissionsInstructions::from_policy`.
- Removed the test that exercised that legacy helper.
- Left the existing profile-based instruction coverage in place.

## How To Review

Review `codex-rs/core/src/context/permissions_instructions.rs` first.
This PR is intentionally narrow: the production behavior should be
unchanged for profile callers, and the deleted surface was only a
convenience adapter from `SandboxPolicy`.

## Verification

- `cargo test -p codex-core builds_permissions_from_profile`








---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22790).
* #22795
* #22792
* #22791
* __->__ #22790
2026-05-15 10:11:16 -07:00
Chris Bookholt
9facdccb37 Ignore configured hooks in git helpers (#22843)
## What
- Internal Git helper commands now ignore configured hook directories
during repository bookkeeping.

## Why
- These helper flows should stay consistent even when a repository has
hook-directory configuration of its own.

## How
- Pass a command-local `core.hooksPath` override in the shared helper
path and the Git-info helper path.
- Add regressions for the baseline index rewrite flow and the metadata
status flow.

## Validation
- `cargo fmt --manifest-path
/Users/bookholt/code/codex/codex-rs/Cargo.toml --all --check`
- `cargo test --manifest-path
/Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-git-utils`
- `cargo test --manifest-path
/Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-core
test_get_has_changes_`
2026-05-15 10:07:54 -07:00
Eric Traut
7fa0007ea8 tui: split remaining composer draft and footer state (#22656)
## Why

[#22581](https://github.com/openai/codex/pull/22581) started separating
the chat composer’s responsibilities, but `ChatComposer` still owned the
remaining editable draft state alongside footer/status presentation
state. This follow-up makes those ownership lines explicit so future
composer changes have a smaller blast radius and `BottomPane` does not
need to keep exposing scattered draft getters.

This is just a refactor. No functional or behavioral changes are
intended.

## What changed

- Move the remaining editable composer state into
`bottom_pane/chat_composer/draft_state.rs`.
- Move footer and status-row presentation state into
`bottom_pane/chat_composer/footer_state.rs`.
- Add an internal `ComposerDraftSnapshot` for restore flows, replacing
several ad hoc `BottomPane` pass-through reads.
- Rewire the related history-search and thread-input restore paths to
use the extracted state.

## Verification

- `RUST_MIN_STACK=8388608 cargo test -p codex-tui`
- `cargo insta pending-snapshots`
2026-05-15 09:12:52 -07:00
Michael Bolin
68ccfdc905 guardian: use permission profile for review sandbox (#22789)
## Why

`SandboxPolicy` is being pushed back toward legacy config loading and
compatibility boundaries. Guardian review sessions already want the
built-in read-only permission behavior; carrying that as an active
`PermissionProfile` makes the review sandbox follow the new permissions
path instead of configuring the child session through the legacy policy
API.

## What Changed

- Configure the guardian review session with
`PermissionProfile::read_only()`.
- Send the read-only profile through the guardian child `Op::UserTurn`.
- Keep the legacy `sandbox_policy` field populated with
`SandboxPolicy::new_read_only_policy()` declared next to the profile so
the two remain visibly in sync until the compatibility field goes away.

## How To Review

Start in `codex-rs/core/src/guardian/review_session.rs`. The important
check is that both the guardian config and the child turn now use the
read-only permission profile, while the remaining
`SandboxPolicy::ReadOnly` assignment is only the compatibility field
required by the current turn protocol.

## Verification

- `cargo test -p codex-core
guardian_review_session_config_clears_parent_developer_instructions`





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22789).
* #22795
* #22792
* #22791
* #22790
* __->__ #22789
2026-05-15 08:59:31 -07:00
jif-oai
cccde930ce Move memory prompt injection to app-server extension (#22841)
## Why

Memory prompt injection should be owned by the extension path that
app-server composes at runtime, not by an inlined special case inside
`codex-core`. This keeps `codex-core` focused on session orchestration
while allowing the memories extension to own its app-server prompt
behavior.

## What Changed

- Registers `codex-memories-extension` in the app-server extension
registry.
- Moves the memory developer-instruction injection out of
`core/src/session/mod.rs` and into the memories extension prompt
contributor.
- Adds config-change handling so the extension keeps its per-thread
memory settings in sync after startup.
- Leaves memories read/retrieval tools unregistered for now so this PR
only changes prompt injection.
- Removes the stale `cargo-shear` ignore now that app-server depends on
the extension crate.

## Validation

Not run locally; validation is left to CI.
2026-05-15 16:19:34 +02:00
jif-oai
5d30764fe9 Run compact hooks for remote compaction v2 (#22828)
## Why

Remote compaction v2 is the `/responses` implementation of
session-history compaction, but it still needs to preserve the
observable contract of the legacy `/responses/compact` path. In
particular, users and integrations that rely on `PreCompact` and
`PostCompact` hooks should not see different behavior when
`remote_compaction_v2` is enabled.

## What Changed

- Runs `PreCompact` before issuing the remote compaction v2 request,
including `Interrupted` analytics when a pre-hook stops execution.
- Runs `PostCompact` after a successful v2 compaction and aborts the
turn if the post-hook stops execution.
- Adds `compact_remote_parity` coverage that compares legacy and v2
compaction across manual transcript shapes, automatic pre-turn
compaction, automatic mid-turn compaction, hook payloads, replacement
history, follow-up request payloads, and API-key `service_tier=fast`
behavior.
- Registers the new parity suite under `core/tests/suite`.

Relevant code:

-
[`compact_remote_v2.rs`](af63745cb5/codex-rs/core/src/compact_remote_v2.rs)
-
[`compact_remote_parity.rs`](af63745cb5/codex-rs/core/tests/suite/compact_remote_parity.rs)

## Verification

- Added `core/tests/suite/compact_remote_parity.rs` to assert parity
between legacy remote compaction and remote compaction v2 for the
affected request, hook, rollout-history, and follow-up paths.
- Existing `compact_remote_v2` unit coverage still exercises v2
replacement-history retention and compaction-output collection.
2026-05-15 15:26:21 +02:00
jif-oai
c03cea4ca2 Remove zombie tools spec module (#22820)
## Summary

- move tool_user_shell_type out of the old tools::spec module and call
it from tools directly
- attach the remaining spec planning model tests under spec_plan
- delete core/src/tools/spec.rs

## Tests

- just fmt
- cargo test -p codex-core tools::spec_plan

Note: a broader cargo test -p codex-core run on the earlier PR-head
worktree still hit the pre-existing stack overflow in
agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns.
2026-05-15 13:44:58 +02:00
jif-oai
6f1a01fbdd Simplify tool executor and registry plumbing (#22636)
## Why

The tool runtime path still had a typed output associated type on
`ToolExecutor`, plus a core-only `RegisteredTool` adapter and
extension-only executor aliases. That made every new shared tool runtime
carry extra adapter plumbing before it could participate in core
dispatch, extension tools, hook payloads, telemetry, and model-visible
spec generation.

This PR moves output erasure to the shared executor boundary so core and
extension tools can use the same execution contract directly.

## What Changed

- Changed `codex_tools::ToolExecutor` to return `Box<dyn ToolOutput>`
instead of an associated `Output` type.
- Removed the extension-specific `ExtensionToolExecutor` /
`ExtensionToolOutput` aliases and exposed `ToolExecutor<ToolCall>` plus
`ToolOutput` through `codex-extension-api`.
- Reworked core tool registration around `CoreToolRuntime` and
`ToolRegistry::from_tools`, removing the extra `RegisteredTool` /
`ToolRegistryBuilder` layer.
- Consolidated model-visible spec planning and registry construction in
`core/src/tools/spec_plan.rs`, including deferred tool search and
code-mode-only filtering.
- Added `ToolOutput` helpers for post-tool-use hook ids and inputs so
MCP, unified exec, extension, and other boxed outputs preserve the same
hook payload behavior.
- Updated core handlers, memories tools, and the related
registry/spec/router tests to use the simplified contract.

## Test Coverage

- Updated coverage for tool spec planning, registry lookup, deferred
tool search registration, extension tool routing, post-tool-use hook
payloads, dispatch tracing, guardian output extraction, and memories
extension tool execution.
2026-05-15 11:47:54 +02:00
jif-oai
0322ac3df8 [codex] Use compaction_trigger item for remote compaction v2 (#22809)
## Why

Remote compaction v2 was still using `context_compaction` as both the
request trigger and the compacted output shape. The Responses API now
has the landed contract for this flow: Codex sends a dedicated `{
"type": "compaction_trigger" }` input item, and the backend returns the
standard `compaction` output item with encrypted content.

This aligns the v2 path with that wire contract while preserving the
existing local compacted-history post-processing behavior.

## What changed

- Add `ResponseItem::CompactionTrigger` and regenerate the app-server
protocol schema fixtures.
- Send `compaction_trigger` from `remote_compaction_v2` instead of a
payload-less `context_compaction`.
- Collect exactly one backend `compaction` output item, then reuse the
existing compacted-history rebuilding path.
- Treat the trigger item as a transient request marker rather than model
output or persisted rollout/memory content.

## Verification

- `cargo test -p codex-protocol compaction_trigger`
- `cargo test -p codex-core remote_compact_v2`
- `cargo test -p codex-core compact_remote_v2`
- `cargo test -p codex-core
responses_websocket_sends_response_processed_after_remote_compaction_v2`
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol schema_fixtures`
2026-05-15 11:40:35 +02:00
jif-oai
a5e5faf216 Reject legacy [profiles] when using profile-v2 (#22647)
## Why

`profile-v2` layers the selected profile file on top of the base user
`config.toml`, but the legacy `[profiles]` table also stores named
profile overrides in that same base file. Allowing both paths during one
load makes it too easy to get a mixed profile where stale legacy
settings still influence a profile-v2 run.

## What Changed

- Detect a legacy `[profiles]` table in the base user config whenever
`--profile-v2` selects a profile file.
- Fail config loading with an `InvalidData` error that tells the user to
move those settings into the selected profile-v2 file or remove
`[profiles]`.
- Add a loader regression covering `--profile-v2` with legacy
`[profiles]` in `config.toml`.

## Testing

- `cargo test -p codex-config
profile_v2_rejects_legacy_profiles_in_base_user_config`
2026-05-15 11:35:42 +02:00
Shijie Rao
302149d979 Fix signed macOS release promotion follow-up jobs (#22788)
## Why

The `release_mode=promote_signed` path intentionally skips the build
jobs after signed macOS artifacts are staged, then runs the `release`
job from the signed handoff. In the `rust-v0.131.0-alpha.19` promotion
run, `release` succeeded but the npm, PyPI, and `latest-alpha-cli`
follow-up jobs were skipped because their custom job `if:` expressions
let GitHub Actions apply the implicit `success()` status check before
reading `needs.release.outputs.*`.

The unsigned build handoff does not need DotSlash manifests. Publishing
unsigned DotSlash manifests creates release assets that can conflict
with the later signed promotion, especially shared outputs such as
`bwrap`, `codex-command-runner`, and `codex-windows-sandbox-setup`.

## What Changed

- Stop publishing DotSlash manifests when `SIGN_MACOS == 'false'`.
- Delete `.github/dotslash-unsigned-config.json`.
- Gate post-release jobs with the `!cancelled()` status function plus an
explicit `needs.release.result == 'success'` check before consulting
release outputs.
- Keep the existing publish eligibility rules for npm, PyPI, WinGet, and
`latest-alpha-cli`.

## Verification

- `rg -n "dotslash-unsigned-config|SIGN_MACOS ==
'false'.*dotslash|unsigned-config" .github/workflows/rust-release.yml
.github || true`
- `git diff --check -- .github/workflows/rust-release.yml
.github/dotslash-unsigned-config.json`
2026-05-15 00:43:23 -07:00
Michael Bolin
8adb6032cc tui/exec: show effective workspace roots in summaries (#22612)
## Why

This PR builds on [#22611](https://github.com/openai/codex/pull/22611).

After `runtimeWorkspaceRoots` moved onto thread state, the user-facing
summaries were still inconsistent about which roots they showed. In
particular, `/status` and the exec startup summary could under-report
extra workspace roots from `--add-dir` or from profile-defined
`workspace_roots`, which made the new model look incorrect even when the
permissions themselves were right.

## What Changed

- switched the TUI status surfaces to summarize against
`Config::effective_workspace_roots()`
- updated the exec human-output summary to render from the effective
permission profile instead of the raw constrained profile
- added focused regressions for both the TUI and exec code paths so
extra workspace roots stay visible in user-facing summaries

## Verification

Targeted coverage for this follow-up lives in:
- `codex-rs/tui/src/status/tests.rs`
- `codex-rs/exec/src/event_processor_with_human_output_tests.rs`

The added regressions verify that:
- status output includes profile-defined workspace roots in the
effective permissions summary
- exec startup output includes runtime workspace roots instead of
collapsing back to `cwd` only
2026-05-14 23:10:45 -07:00
Michael Bolin
8a5306ff88 app-server: use permission ids and runtime workspace roots (#22611)
## Why

This PR builds on [#22610](https://github.com/openai/codex/pull/22610)
and is the app-server side of the migration from mutable per-turn
`SandboxPolicy` replacement toward selecting immutable permission
profiles by id plus mutable runtime workspace roots.

Once permission profiles can carry their own immutable
`workspace_roots`, app-server no longer needs to mutate the selected
`PermissionProfile` just to represent thread-specific filesystem
context. The mutable part now lives on the thread as explicit
`runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until
the sandbox is realized for a turn.

## What Changed

- Replaced the v2 permission-selection wrapper surface with plain
profile ids for `thread/start`, `thread/resume`, `thread/fork`, and
`turn/start`.
- Removed the API surface for profile modifications
(`PermissionProfileSelectionParams`,
`PermissionProfileModificationParams`,
`ActivePermissionProfileModification`).
- Added experimental `runtimeWorkspaceRoots` fields to the thread
lifecycle and turn-start APIs.
- Threaded runtime workspace roots through core session/thread
snapshots, turn overrides, app-server request handling, and command
execution permission resolution.
- Kept session permission state symbolic so later runtime root updates
and cwd-only implicit-root retargeting rebind `:workspace_roots`
correctly.
- Updated the embedded clients just enough to send and restore the new
thread state.
- Refreshed the generated schema/TypeScript artifacts and the app-server
README to match the new contract.

## Verification

Targeted coverage for this layer lives in:

- `codex-rs/app-server-protocol/src/protocol/v2/tests.rs`
- `codex-rs/app-server/tests/suite/v2/thread_start.rs`
- `codex-rs/app-server/tests/suite/v2/thread_resume.rs`
- `codex-rs/app-server/tests/suite/v2/turn_start.rs`
- `codex-rs/core/src/session/tests.rs`

The key regression checks exercise that:

- `runtimeWorkspaceRoots` resolve against the effective cwd on thread
start.
- Profile-declared workspace roots are excluded from the runtime
workspace roots returned by app-server.
- A turn-level runtime workspace-root update persists onto the thread
and is returned by `thread/resume`.
- A named permission profile selected on one turn remains symbolic so a
later runtime-root-only turn update changes the actual sandbox writes.
- A cwd-only turn update retargets the implicit runtime cwd root while
preserving additional runtime roots.
- The protocol fixtures and generated client artifacts stay in sync with
the string-based permission selection contract.











---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611).
* #22612
* __->__ #22611
2026-05-14 23:00:05 -07:00
Eric Traut
e6a7368810 TUI: split history cells into focused modules (#22704)
## Why

`codex-rs/tui/src/history_cell.rs` had become the dumping ground for
transcript rendering: the shared trait, common helpers, and the concrete
cells for messages, plans, MCP/search, notices, patches, approvals,
session chrome, and separators all lived together. That made small
transcript changes require reopening a very large file and made
ownership less obvious.

## What changed

- Replaced the monolithic `history_cell.rs` with a `history_cell/`
module tree organized by concern.
- Kept the existing `crate::history_cell::*` surface stable through
re-exports in `history_cell/mod.rs`.
- Moved the existing render coverage into `history_cell/tests.rs`.

## Reviewer notes

- This PR is intentionally mechanical in mature — existing code and
tests moving into files that match their concern.
- The snapshot files under `codex-rs/tui/src/history_cell/snapshots/`
moved with the extracted test module. `insta` resolves these unnamed
snapshots relative to the source file that declares them, so this is
path churn only; snapshot contents were not updated.
- The small non-mechanical seam edits are limited to split fallout:
sibling-module visibility for shared cell containers, moving
approval-specific exec-snippet helpers beside approvals, fixing the
separator module path, and keeping a couple of existing test helpers
reachable after extraction.
2026-05-14 21:19:06 -07:00
Eric Traut
d1235a0a78 Prevent Esc from dismissing or rewinding /side (#22710)
Addresses #22599

## Why
`/side` currently lets `Esc` return to the parent thread. Multiple users
reported that this collides with queued-steer UI that also advertises
`Esc`, so a timing-sensitive keypress can dismiss an ephemeral side chat
instead of sending the queued prompt.

After removing that dismissal shortcut, the same `Esc` path could fall
through to main-thread backtrack/edit-previous handling, which is not
valid for ephemeral side conversations. This keeps `/side` out of both
global `Esc` behaviors.

## What changed
- Remove `Esc` from the `/side` return shortcut matcher while keeping
the existing `Ctrl+C` and `Ctrl+D` behavior.
- Update side-conversation hints and blocked-command copy to advertise
`Ctrl+C` as the return shortcut.
- Rename the reserved `Esc` keymap label to describe backtracking only.
- Block backtrack/edit-previous handling while a side conversation is
active and report `Editing previous prompts is unavailable in side
conversations.` when that path would have fired.
- Keep composer-owned `Esc` behavior, such as Vim insert-mode escape,
routed locally.
- Refresh focused shortcut assertions and TUI snapshots for the updated
footer and new side-conversation error message.

## Verification
Manually tested `/side` use cases and `Esc`, `Ctrl+C`, `Ctrl+D`.
2026-05-14 20:51:08 -07:00
guinness-oai
4f2918dd7f [codex] Add opaque desktop config namespace (#22584)
## Summary
- reserve an explicit opaque `desktop` namespace in `ConfigToml`
- expose `desktop` directly in the app-server v2 `config/read` response
- keep `config/value/write` and `config/batchWrite` as the only mutation
seam for paths like `desktop.someKey`
- regenerate the config/app-server schema outputs and document the new
contract

## Why
The desktop settings work wants one durable, user-editable home for
app-owned preferences in `~/.codex/config.toml`, without forcing Rust to
model every individual desktop setting key.

This PR is only the enabling Rust/app-server layer. It gives the
Electron app a first-class config namespace it can read and write
through the existing config APIs, while leaving the actual desktop
migration to the app PR.

## Behavior and design notes
- **Opaque but explicit:** `desktop` is first-class at the typed config
root, while its children remain app-owned and open-ended.
- **Strict validation still works:** arbitrary nested `desktop.*` keys
are accepted instead of being rejected as unknown config.
- **Existing config APIs stay the seam:** `config/read` returns the bag,
and dotted writes such as `desktop.someKey` continue to flow through
`config/value/write` / `config/batchWrite` rather than a bespoke RPC.
- **No new consumer behavior:** Core/TUI do not start depending on
desktop preferences. This only preserves and exposes the namespace for
callers that intentionally use it.
- **Same persistence machinery:** hand-edited `config.toml` keeps using
the existing TOML edit/write path; this PR does not introduce a second
serializer or side channel.
- **TOML-friendly values:** the namespace is intended for ordinary
JSON-shaped setting values that map cleanly into TOML: strings, numbers,
booleans, arrays, and nested object/table values. This PR does not add
special handling for TOML-only edge cases such as datetimes.

## Layering semantics
Reads keep using the ordinary effective config pipeline, so `desktop`
participates in the same layered `config/read` behavior as the rest of
`ConfigToml`. Writes still target user config through the existing
config service.

## Why this is the shape
The alternative would be teaching Rust about each desktop setting as it
is added. That would make ordinary app preferences into a cross-repo
change, which is exactly the coupling we want to avoid.

This keeps the contract small:
1. Rust owns one opaque `desktop` namespace in `config.toml`.
2. The desktop app owns the schema and meaning of individual keys inside
it.
3. The existing config APIs remain the transport and mutation surface.

That is the piece the desktop settings PR needs in order to move forward
cleanly.

## Verification
- `cargo test -p codex-config strict_config_accepts_opaque_desktop_keys`
- `cargo test -p codex-core
desktop_toml_round_trips_opaque_nested_values`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server --test all desktop_settings`
2026-05-15 02:34:21 +00:00
Eric Traut
3a23e87e20 tui: recover local state db startup failures (#22734)
## Why

#22580 made app-server startup fail when the local SQLite state database
cannot be initialized. Embedded/local TUI startup still continued on the
permissive path, which left the CLI inconsistent and could hide a real
startup problem behind unrelated UI. This brings local TUI startup onto
the same fail-closed behavior while keeping recovery humane for the two
failure modes we are seeing in practice: damaged database files and
startup stalls caused by another process holding the database write
lock.

## What changed

- Embedded TUI startup now uses `state_db::try_init(...)` and returns a
typed `LocalStateDbStartupError` that preserves the affected database
path plus the underlying failure detail.
- CLI startup handles that failure before entering the interactive TUI:
- lock-contention failures tell users to quit other Codex processes and
try again
- failures consistent with a broken local database offer a safe repair
that backs up Codex-owned SQLite files, rebuilds local database files,
and retries startup once
- declined or unsuccessful repairs print concise guidance plus technical
details
- Shared startup error plumbing lives in `tui/src/startup_error.rs`,
while CLI recovery policy and focused recovery tests live in
`cli/src/state_db_recovery.rs`.

## Verification

- `cargo test -p codex-tui
embedded_state_db_failure_is_typed_for_cli_recovery`
- `cargo test -p codex-cli state_db_recovery`
- Manually held an exclusive SQLite lock on `state_5.sqlite` and
confirmed the CLI shows lock-specific guidance without offering repair.
- Manually exercised the repair path with a deliberately invalid
`sqlite_home` and confirmed it backs up the blocking path and resumes
startup.
2026-05-14 18:51:36 -07:00
Michael Bolin
3c6d727810 permissions: resolve profile identity with constraints (#22683)
## Why

This PR is the invariant-cleanup layer that follows the workspace-roots
base merged in [#22610](https://github.com/openai/codex/pull/22610).

#22610 adds `[permissions.<id>.workspace_roots]` and keeps runtime
workspace roots separate from the raw permission profile, but its
in-memory representation is intentionally transitional: `Permissions`
still carries the selected profile identity next to a constrained
`PermissionProfile`. That makes APIs such as
`set_constrained_permission_profile_with_active_profile()` fragile
because the id and value only mean the right thing when every caller
keeps them in sync.

This PR introduces a single resolved profile state so profile identity,
`extends`, the profile value, and profile-declared workspace roots
travel together. The next PR,
[#22611](https://github.com/openai/codex/pull/22611), builds on this by
changing the app-server turn API to select permission profiles by id
plus runtime workspace roots.

## Stack Context

- #22610, now merged: adds profile-declared `workspace_roots`, runtime
workspace roots, and `:workspace_roots` materialization.
- This PR: replaces the parallel active-profile/profile-value fields
with `PermissionProfileState`.
- #22611: switches app-server turn updates toward profile ids plus
runtime workspace roots.
- #22612: updates TUI/exec summaries to show the effective workspace
roots.

Keeping this separate from #22611 is deliberate: reviewers can validate
the internal state invariant before reviewing the app-server protocol
migration.

## What Changed

- Added `ResolvedPermissionProfile::{Legacy, BuiltIn, Named}` and
`PermissionProfileState`.
- Typed built-in profile ids with `BuiltInPermissionProfileId`.
- Moved selected profile identity and profile-declared workspace roots
into the resolved state.
- Replaced `Permissions` parallel profile fields with one
`permission_profile_state`.
- Removed `set_constrained_permission_profile_with_active_profile()`
from session sync paths.
- Kept trusted session replay/`SessionConfigured` compatibility through
explicit session snapshot helpers.
- Updated session configuration, MCP initialization, app-server, exec,
TUI, and guardian call sites to consume `&PermissionProfile` directly.

## Review Guide

Start with `codex-rs/core/src/config/resolved_permission_profile.rs`; it
is the new invariant boundary. Then review
`codex-rs/core/src/config/mod.rs` to see how config loading records
active profile identity and profile workspace roots. The remaining
call-site changes are mostly mechanical fallout from
`Permissions::permission_profile()` returning `&PermissionProfile`
instead of `&Constrained<PermissionProfile>`.

## Verification

The existing config/session coverage now constructs and asserts through
`PermissionProfileState`. The workspace-root config test also asserts
that profile-declared roots are preserved in the resolved state, which
is the behavior #22611 relies on when runtime roots become mutable
through the app-server API.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22683).
* #22612
* #22611
* __->__ #22683
2026-05-14 18:47:44 -07:00
Dylan Hurd
06bb508547 Stabilize compact rollback follow-up test (#22303)
## Summary
- add the missing response.created event to the mocked empty follow-up
response in the compact rollback test
- keep the fix scoped to the flaky mocked stream shape, without
increasing timeouts

## Recent flakes on main
- `snapshot_rollback_followup_turn_trims_context_updates` failed in
`rust-ci-full` on `main` in the Ubuntu remote test job on 2026-05-14:
https://github.com/openai/codex/actions/runs/25891434395/job/76095284830
- The same `compact_resume_fork` suite also failed recently on `main`
with `snapshot_rollback_past_compaction_replays_append_only_history`,
which has the same mocked Responses stream shape sensitivity this PR is
tightening:
https://github.com/openai/codex/actions/runs/25892437363/job/76098329098

## Verification
- env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core --test
all snapshot_rollback_followup_turn_trims_context_updates -- --nocapture
- repeated the same focused test 3 consecutive times locally
- UV_CACHE_DIR=/private/tmp/uv-cache-codex-fmt just fmt
2026-05-14 18:43:18 -07:00
Michael Bolin
db51df0f44 ci: support signed macOS release promotion (#22737)
## Why

`rust-release.yml` can create unsigned macOS artifacts for external
signing, but there was no signed resume path after those artifacts
returned from a secure enclave. Release operators need a way to reuse
the first run artifacts, ingest signed macOS binaries and DMGs, and
continue the normal signed release path without rebuilding every
platform or treating handoff assets as final release assets.

## How this is meant to be used

First, start the release as an unsigned macOS build against the release
tag:

```shell
gh workflow run rust-release.yml \
  --repo openai/codex \
  --ref rust-vX.Y.Z \
  -f release_mode=build_unsigned
```

That run builds the normal Linux/Windows artifacts and publishes
unsigned macOS handoff artifacts. The unsigned macOS binaries are then
copied to the secure enclave, signed and notarized there, packaged as a
signed handoff archive, and uploaded back to the GitHub Release for the
same tag.

The signed handoff asset should contain either target directories such
as `aarch64-apple-darwin/` and `x86_64-apple-darwin/`, or artifact
directories such as `aarch64-apple-darwin-app-server/`. The promote
workflow accepts either layout. The directories should contain the
signed binaries and, for primary macOS bundles, the signed and stapled
DMGs.

For example, after signing, upload the handoff asset to the release:

```shell
gh release upload rust-vX.Y.Z \
  signed-macos-rust-vX.Y.Z.tar.zst \
  --repo openai/codex \
  --clobber
```

Then start the promotion run. `unsigned_run_id` is the workflow run id
from the first `build_unsigned` run, and `signed_macos_asset` is the
exact Release asset name uploaded by the secure enclave:

```shell
gh workflow run rust-release.yml \
  --repo openai/codex \
  --ref rust-vX.Y.Z \
  -f release_mode=promote_signed \
  -f unsigned_run_id=1234567890 \
  -f signed_macos_asset=signed-macos-rust-vX.Y.Z.tar.zst \
  -f signed_macos_sha256=<sha256>
```

The `signed_macos_sha256` input is optional, but when provided the
promotion run verifies the handoff archive before unpacking it. The
promotion run also validates that `unsigned_run_id` points to a
successful manual `rust-release` run for the same tag and commit before
importing artifacts.

## What Changed

- Add explicit manual `release_mode` values for `build_unsigned` and
`promote_signed` while keeping `sign_macos` as a deprecated
compatibility input.
- Add promote inputs for `unsigned_run_id`, `signed_macos_asset`, and
optional `signed_macos_sha256`.
- Add a `stage-signed-macos` job that downloads the signed handoff asset
from the GitHub Release, verifies signed binaries and stapled DMGs,
repacks normal macOS release artifacts, and builds macOS Python runtime
wheels.
- Teach the release job to download Part 1 artifacts from the unsigned
run, discard unsigned macOS staging artifacts, re-upload promoted Linux
and Windows artifacts for npm staging, and then run the signed release
tail.
- Validate that `unsigned_run_id` points to a successful manual
`rust-release` run for the same tag and commit before importing
artifacts.
- Limit unsigned macOS artifact upload to the unsigned build path so
normal signed releases do not publish unsigned handoff binaries.
- Clean up unsigned and signed handoff release assets after successful
promotion.

## Verification

- Parsed `.github/workflows/rust-release.yml` with Ruby YAML loading.

No developers.openai.com documentation update is needed.
2026-05-14 18:36:20 -07:00
mchen-oai
10cf1f79dd Add user_input_requested_during_turn to MCP turn metadata (#22237)
## Why
- Similar change as https://github.com/openai/codex/pull/21219
- Without change: MCP tool calls receive
`_meta["x-codex-turn-metadata"]` with various key values.
- Issue: MCP servers currently do not know if user input was requested
during the turn (Ex: Model decides to prompt the user for approval
mid-turn before making a possibly risky tool call). MCP servers may want
to know this when tracking latency metrics because these instances are
inflated.

## What Changed
- With change: MCP turn metadata now includes
`user_input_requested_during_turn` when a model-visible
`request_user_input` call happened earlier in the turn, propagated in
`_meta["x-codex-turn-metadata"]`.
- `mark_turn_user_input_requested()` is called when user input is
requested through either MCP elicitation (`mcp.rs`) or the
`request_user_input` tool (`mod.rs`).
- MCP tool call `_meta` is now built immediately before execution
(`mcp_tool_call.rs`) so user input requested earlier in the same turn,
including within the same tool call via elicitation, is reflected in the
metadata.
- Normal `/responses` turn metadata headers are unchanged.

## Verification
- `codex-rs/core/src/session/mcp_tests.rs`
- `codex-rs/core/src/tools/handlers/request_user_input_tests.rs`
- `codex-rs/core/src/turn_metadata_tests.rs`
- `codex-rs/core/tests/suite/search_tool.rs`
2026-05-15 01:26:50 +00:00
Michael Bolin
c25d905f61 permissions: support workspace roots in profiles (#22610)
## Why

This is the configuration/model half of the alternative permissions
migration we discussed as a comparison point for
[#22401](https://github.com/openai/codex/pull/22401) and
[#22402](https://github.com/openai/codex/pull/22402).

The old `workspace-write` model mixes three concerns that we want to
keep separate:
- reusable profile rules that should stay immutable once selected
- user/runtime workspace roots from `cwd`, `--add-dir`, and legacy
workspace-write config
- internal Codex writable roots such as memories, which should not be
shown as user workspace roots

This PR gives permission profiles first-class `workspace_roots` so users
can opt multiple repositories into the same `:workspace_roots` rules
without using broad absolute-path write grants. It also starts
separating the raw selected profile from the effective runtime profile
by making `Permissions` expose explicit accessors instead of public
mutable fields.

A representative `config.toml` looks like this:

```toml
default_permissions = "dev"

[permissions.dev.workspace_roots]
"~/code/openai" = true
"~/code/developers-website" = true

[permissions.dev.filesystem.":workspace_roots"]
"." = "write"
".codex" = "read"
".git" = "read"
".vscode" = "read"
```

If Codex starts in `~/code/codex` with that profile selected, the
effective workspace-root set becomes:
- `~/code/codex` from the runtime `cwd`
- `~/code/openai` from the profile
- `~/code/developers-website` from the profile

The `:workspace_roots` rules are materialized across each root, so
`.git`, `.codex`, and `.vscode` stay scoped the same way everywhere.
Runtime additions such as `--add-dir` can still layer on later stack
entries without mutating the selected profile.

## Stack Shape

This PR intentionally stops before the profile-identity cleanup in
[#22683](https://github.com/openai/codex/pull/22683) so the base review
stays focused on config loading, workspace-root materialization, and
compatibility with legacy `workspace-write`.

The representation in this PR is therefore transitional: `Permissions`
carries enough state to distinguish the raw constrained profile from the
effective runtime profile, and there are still call sites that must keep
the active profile identity and constrained profile value in sync. The
follow-up PR replaces that with a single resolved profile state
(`ResolvedPermissionProfile` / `PermissionProfileState`) that keeps the
profile id, immutable `PermissionProfile`, and profile-declared
workspace roots together. That follow-up removes APIs such as
`set_constrained_permission_profile_with_active_profile()` where
separate arguments could drift out of sync.

Downstream PRs then build on this base to switch app-server turn updates
to profile ids plus runtime workspace roots and to finish the
user-visible summary behavior. Reviewers should judge this PR as the
workspace-roots foundation, not as the final in-memory shape of selected
permission profiles.

## Review Guide

Suggested review order:

1. Start with `codex-rs/core/src/config/mod.rs`.
This is the main shape change in the base slice. `Permissions` now
stores a private raw `Constrained<PermissionProfile>` plus runtime
`workspace_roots`. Callers use `permission_profile()` when they need the
raw constrained value and `effective_permission_profile()` when they
need a materialized runtime profile. As noted above,
[#22683](https://github.com/openai/codex/pull/22683) replaces this
transitional shape with a resolved profile state that keeps identity and
profile data together.

2. Review `codex-rs/config/src/permissions_toml.rs` and
`codex-rs/core/src/config/permissions.rs`.
These add `[permissions.<id>.workspace_roots]`, resolve enabled entries
relative to the policy cwd, and keep `:workspace_roots` deny-read glob
patterns symbolic until the actual roots are known.

3. Review `codex-rs/protocol/src/permissions.rs` and
`codex-rs/protocol/src/models.rs`.
These add the policy/profile materialization helpers that expand exact
`:workspace_roots` entries and scoped deny-read globs over every
workspace root. This is also where `ActivePermissionProfileModification`
is removed from the core model.

4. Review the legacy bridge in
`Config::load_from_base_config_with_overrides` and
`Config::set_legacy_sandbox_policy`.
This is where legacy `workspace-write` roots become runtime workspace
roots, while Codex internal writable roots stay internal and do not
appear as user-facing workspace roots.

5. Then skim downstream call sites.
The interesting pattern is raw-vs-effective access: state/proxy/bwrap
paths keep the raw constrained profile, while execution, summaries, and
user-visible status use the effective profile and workspace-root list.

## What Changed

- added `[permissions.<id>.workspace_roots]` to the config model and
schema
- added runtime `workspace_roots` state to `Config`/`Permissions` and
`ConfigOverrides`
- made `Permissions` profile fields private and replaced direct mutation
with accessors/setters
- added `PermissionProfile` and `FileSystemSandboxPolicy` helpers for
materializing `:workspace_roots` exact paths and deny-read globs across
all roots
- moved legacy additional writable roots into runtime workspace-root
state instead of active profile modifications
- removed `ActivePermissionProfileModification` and its app-server
protocol/schema export
- updated sandbox/status summary paths so internal writable roots are
not reported as user workspace roots

## Verification Strategy

The targeted tests cover the behavior at the layers where regressions
are most likely:
- `codex-rs/core/src/config/config_tests.rs` verifies config loading,
legacy workspace-root seeding, effective profile materialization, and
memory-root handling.
- `codex-rs/core/src/config/permissions_tests.rs` verifies profile
`workspace_roots` parsing and `:workspace_roots` scoped/glob
compilation.
- `codex-rs/protocol/src/permissions.rs` unit tests verify exact and
glob materialization over multiple workspace roots.
- `codex-rs/tui/src/status/tests.rs` and
`codex-rs/utils/sandbox-summary/src/sandbox_summary.rs` verify the
user-facing summaries show effective workspace roots and hide internal
writes.

I also ran `cargo check --tests` locally after the latest stack refresh
to catch cross-crate API breakage from the private-field/accessor
changes.







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22610).
* #22612
* #22611
* #22683
* __->__ #22610
2026-05-14 18:25:23 -07:00
Dylan Hurd
7dbe1c9498 [codex] Remove experimental instructions file config (#22724)
## Summary

Remove the deprecated `experimental_instructions_file` config setting
from the typed config surface and the remaining deprecation-notice
plumbing. `model_instructions_file` remains the supported setting and
its loading path is unchanged.

The setting was deprecated when it was renamed to
`model_instructions_file` on January 20, 2026 in
https://github.com/openai/codex/pull/9555.

## Changes

- Remove `experimental_instructions_file` from `ConfigToml` and
`ConfigProfile`.
- Delete the custom config-layer scan and session deprecation notice for
the removed setting.
- Stop clearing the removed field from generated session config locks.
- Remove the obsolete deprecation-notice test case while keeping
`model_instructions_file` coverage intact.

## Validation

- `just write-config-schema`
- `just fmt`
- `cargo test -p codex-config`
- `cargo test -p codex-core model_instructions_file`
- `just fix -p codex-core`
- `git diff --check`

Co-authored-by: Codex <noreply@openai.com>
2026-05-14 18:04:26 -07:00
Dylan Hurd
eeabaf74ea [codex] Group removed feature flags (#22730)
## Summary
- move removed feature enum variants under the existing Removed section
- keep active feature variants grouped away from no-op compatibility
flags

## Test plan
- just fmt
- cargo test -p codex-features

Co-authored-by: Codex <noreply@openai.com>
2026-05-15 00:53:13 +00:00
pakrym-oai
4bff020a96 Remove SSE fixture loaders (#22684)
## Why

The Responses API test support already has structured SSE event
builders. Keeping separate JSON fixture loaders made small mock streams
harder to read and left an on-disk fixture for a single event.

## What changed

- Removed `load_sse_fixture` and `load_sse_fixture_with_id_from_str`
from `core_test_support`.
- Deleted the one `tests/fixtures/incomplete_sse.json` Responses API
fixture.
- Replaced the remaining call sites with `responses::sse(...)` and
existing event helpers.

## Validation

- `cargo test -p codex-core --test all
stream_no_completed::retries_on_early_close`
- `cargo test -p codex-core --test all
history_dedupes_streamed_and_final_messages_across_turns`
- `cargo test -p codex-core --test all review::`
2026-05-15 00:40:32 +00:00
canvrno-oai
66af217865 Fix /review mode MCP startup render issue (#21624)
This change fixes the case where the UI can sit on _"Starting MCP
servers"_ even though the review work is already running or has already
completed.

- MCP startup status header is visible when a `/review` turn starts with
enabled MCP server startups
- Restore the underlying _Working..._ status after MCP startup completes
or fails
- Add regression coverage for overlapping startup/turn flows and status
restoration

_De-scoped from a broader thread-scoped MCP status change that would
have made it easier to route MCP startup statuses to the appropriate
thread (parent vs. review). These changes address the UI regression
without requiring more significant changes across app-server & core._

Fixes #18792.
2026-05-14 17:25:32 -07:00
Eric Traut
3dc278b68e Trim TUI legacy core helper usage (#22695)
## Why

The TUI still had a few low-risk dependencies flowing through the
transitional `legacy_core` namespace after the app-server migration.
These helpers either already have clearer non-core owners or are
presentation logic that does not belong in `codex-core`, so moving them
out reduces the compatibility surface without changing product behavior.

## What changed

This is a low-risk change, almost completely mechanical in nature.

- Route TUI Codex-home lookup through `codex-utils-home-dir`, use
`Config::log_dir` directly, and call
`codex-sandboxing::system_bwrap_warning` without going through
`legacy_core`.
- Move shared `codex resume` hint formatting from `codex-core` into
`codex-utils-cli`.
- Update CLI and TUI call sites to use the shared CLI utility, and keep
the resume-command behavior covered by tests in its new home.

## Verification

- `cargo test -p codex-utils-cli`
- `cargo test -p codex-utils-cli resume_command`
2026-05-14 16:54:59 -07:00
Dylan Hurd
85915a2a21 chore(config) rm windows_wsl_setup_acknowledged (#22717)
## Summary
Remove dead code from a notice that no longer exists.

## Testing
- [x] Unit tests pass.
2026-05-14 23:25:15 +00:00
Dylan Hurd
51b0e94105 chore(features) rm Feature::ApplyPatchFreeform (#22711)
## Summary
Removes the feature since this is effectively on by default in all cases
where we should use it, or can be configured via models.json.

## Testing
- [x] unit tests pass
2026-05-14 16:15:56 -07:00
starr-openai
7c11c14efc Fix Windows sandbox clippy clones (#22687)
## Summary
- remove two redundant `PathBuf` clones in Windows sandbox setup tests
- fix current `rust-ci-full` Windows clippy failures on `main`

## Validation
- `just fmt`
- attempted on `dev`: `cargo clippy --target x86_64-pc-windows-msvc
--tests --profile dev --timings -- -D warnings`
- blocked by missing MSVC cross toolchain on the Linux devbox (`lib.exe`
/ MSVC C toolchain unavailable)
- live failure evidence: main `rust-ci-full` runs 25880209898 and
25879137967 failed on `windows-sandbox-rs/src/bin/setup_main/win.rs`
with `clippy::redundant_clone` at the two edited callsites
2026-05-14 15:54:18 -07:00
xli-oai
8c7a176b55 Unqueue plugin list and read requests (#22703)
## Summary
- remove the app-server `plugin-read` serialization queue from
`plugin/list` and `plugin/read`
- allow plugin read/list requests to start immediately instead of
waiting behind other plugin read/list requests

## Test plan
- `just fmt`
- `cargo test -p codex-app-server-protocol`
2026-05-14 15:07:20 -07:00
sayan-oai
d346957288 make rust-release-prepare use env secret (#22702)
made a `rust-release-prepare` environment with the necessary API key as
an environment secret. use this in the workflow rather than the action
secret.

once this merges and i confirm it works as intended, ill rm the action
secret.
2026-05-14 21:45:53 +00:00
rreichel3-oai
02a7205250 [codex] Support multiple forced ChatGPT workspaces (#18161)
## Summary

This change lets `forced_chatgpt_workspace_id` accept multiple workspace
IDs instead of a single value.

It keeps the existing config key name, adds backward-compatible parsing
for a single string in `config.toml`, and normalizes the setting into an
allowed workspace list across login enforcement, app-server config
surfaces, and local ChatGPT auth helpers.

## Why

Workspace-restricted deployments may need to allow more than one ChatGPT
workspace without dropping the guardrail entirely.

## Server-side impact

Codex's local server and app-server protocol needed changes because they
previously assumed a single workspace ID. The local login flow now
matches the auth backend interface by sending the allowed workspace list
as a single comma-separated `allowed_workspace_id` query parameter.

## Validation

This was tested with:

- A single workspace config
- With multi-workspace configs
- With multiple workspaces in the config
- The user only being a part of a subset of them

All were successful.

Automated coverage:

- `cargo test -p codex-login`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-tui local_chatgpt_auth`
- `cargo test --locked -p codex-app-server
login_account_chatgpt_includes_forced_workspace_allowlist_query_param`
2026-05-14 17:11:36 -04:00
starr-openai
32b45a43e2 tests: isolate codex home for live cli (#22563)
## Why

Some core integration-test paths were creating Codex state under ambient
`~/.codex`. In environments where `HOME=/tmp`, that showed up as
`/tmp/.codex`, which is host-level shared state and makes these tests
environment/order sensitive.

The affected paths were:

- `core/tests/suite/live_cli.rs`: `run_live()` spawned the real CLI with
a temp cwd, but without an isolated home, so the child resolved Codex
home from ambient `HOME`.
- core / exec-server integration test binaries using
`configure_test_binary_dispatch(...)`: their startup ctor installs arg0
helper aliases like `apply_patch` and `codex-linux-sandbox`. Full
`arg0_dispatch()` also installs aliases from ambient Codex-home
resolution, so test-binary startup could create `CODEX_HOME/tmp/arg0`;
with `HOME=/tmp`, that became `/tmp/.codex/tmp/arg0/...`.

## What changed

- `live_cli` now gives the spawned CLI a temp `HOME` and temp
`CODEX_HOME`.
- arg0 alias setup now has an explicit-home form,
`prepend_path_entry_for_codex_aliases_in(...)`, so test helpers can
place alias state under a temp directory without relying on ambient
`CODEX_HOME`.
- helper re-entry behavior is preserved with
`dispatch_arg0_if_needed()`, so aliases like `apply_patch` and
`codex-linux-sandbox` still dispatch correctly before test alias
installation.
- core test support keeps the temp Codex home alive for the lifetime of
the test binary, matching the alias lifetime.

## Verification

Verified on `dev2` with `HOME=/tmp` that the focused core test-binary
startup path no longer recreates `/tmp/.codex`.

Also checked the exact `live_cli` test path under `HOME=/tmp`; on `dev2`
it still hits the existing remote-only `cargo_bin("codex-rs")`
resolution failure before spawning the child, but `/tmp/.codex` remains
absent after the run.
2026-05-14 12:59:56 -07:00
starr-openai
255748638c Fix remote environment test fixtures (#22572)
## Why
The Docker remote-env coverage was failing before it reached the
behavior those tests are meant to exercise. The remote-aware test
fixture only registered the remote environment, so tests that
intentionally select both `local` and `remote` could not start a turn.
After that was fixed, two tests exposed stale fixtures: the approval
test was auto-approving under workspace-write, and the remote
`view_image` test was writing invalid PNG bytes.

## What Changed
- Added `EnvironmentManager::create_for_tests_with_local(...)` so tests
can keep the provider default while also selecting `local` explicitly.
- Updated `build_remote_aware()` to use that test-only manager when a
remote exec-server URL is present.
- Changed the remote apply-patch approval helper to use
`SandboxPolicy::new_read_only_policy()` so the test actually exercises
approval caching per environment.
- Replaced the hardcoded remote `view_image` PNG blob with the existing
`png_bytes(...)` helper so the test uses a valid image fixture.

## Validation
Ran these isolated Docker remote-env tests on the devbox with
`$remote-tests` setup:
-
`suite::remote_env::apply_patch_freeform_routes_to_selected_remote_environment`
-
`suite::remote_env::apply_patch_approvals_are_remembered_per_environment`
-
`suite::remote_env::apply_patch_intercepted_exec_command_routes_to_selected_remote_environment`
-
`suite::remote_env::exec_command_routes_to_selected_remote_environment`
- `suite::view_image::view_image_routes_to_selected_remote_environment`

All five pass.
2026-05-14 12:40:01 -07:00
Michael Bolin
e8969d940d test: isolate exec review policy config test (#22512)
## Why


`thread_start_params_include_review_policy_when_review_policy_is_manual_only`
builds a `Config` with a temporary `CODEX_HOME`, but
`ConfigBuilder::default()` can still load host-managed configuration. On
local macOS machines with enterprise-managed Codex config, that host
state can leak into the test and change the resulting config, even
though CI does not have the same managed config source.

This makes the test environment-dependent: it can pass in CI while
failing locally for developers who have managed configuration installed.

## What Changed

- Updated `codex-rs/exec/src/lib_tests.rs` so the test calls
`LoaderOverrides::without_managed_config_for_tests()` through
`ConfigBuilder::loader_overrides(...)`.
- Left the rest of the test setup intact, including the temporary
`CODEX_HOME`, temporary cwd, and explicit `approvals_reviewer` harness
override.

## Verification

```shell
cargo test -p codex-exec thread_start_params_include_review_policy_when_review_policy_is_manual_only
```
2026-05-14 12:14:20 -07:00
Matthew Zeng
d8ddeb6869 Support explicit MCP OAuth client IDs (#22575)
## Why
Some MCP OAuth providers require a pre-registered public client ID and
cannot rely on dynamic client registration. Codex already supports MCP
OAuth, but it had no way to supply that client ID from config into the
PKCE flow.

## What changed
- add `oauth.client_id` under `[mcp_servers.<server>]` config, including
config editing and schema generation
- thread the configured client ID through CLI, app-server, plugin login,
and MCP skill dependency OAuth entrypoints
- configure RMCP authorization with the explicit client when present,
while preserving the existing dynamic-registration path when it is
absent
- add focused coverage for config parsing/serialization and OAuth URL
generation

## Verification
- `cargo test -p codex-config -p codex-rmcp-client -p codex-mcp -p
codex-core-plugins`
- `cargo test -p codex-core blocking_replace_mcp_servers_round_trips
--lib`
- `cargo test -p codex-core
replace_mcp_servers_streamable_http_serializes_oauth_resource --lib`
- `cargo test -p codex-core config_schema_matches_fixture --lib`

## Notes
Broader local package runs still hit unrelated pre-existing stack
overflows in:
- `codex-app-server::in_process_start_clamps_zero_channel_capacity`
-
`codex-core::resume_agent_from_rollout_uses_edge_data_when_descendant_metadata_source_is_stale`
2026-05-14 11:52:43 -07:00
Casey Chow
4a1f1df8ce [codex] fix plugin CLI active user layer compile (#22666)
## Why

PR #21396 merged after #17141 removed the old
`ConfigLayerStack::get_user_layer()` API. The new plugin CLI call sites
still used that stale API, which caused `main` to fail compilation.

## What Changed

- update `codex plugin marketplace list` to read configured marketplaces
through `get_active_user_layer()`
- update the plugin snapshot validation helper to use
`get_active_user_layer()`

This preserves the intended active writable user-layer behavior from the
profile-aware config API while fixing the stale call sites.

## Validation

- `cargo check -p codex-cli`
- `cargo test -p codex-cli --test plugin_cli`
- `git diff --check`
2026-05-14 18:41:04 +00:00
Rajeev Nayak
f13e21ef43 Prefer the model list fetched from the backend for SIWC users (#22547)
## Summary
- For SIWC users, update the model list merging logic to prefer the
model list fetched from the backend over the bundled model list (this is
needed for special cases where users have a more limited set of models
they're allowed to use)
- Add or update tests covering the revised cache behavior

## Testing
- Added/updated unit tests in
`codex-rs/models-manager/src/manager_tests.rs`
- Not run (not requested)
2026-05-14 13:45:49 -04:00
Felipe Coury
5a02962519 fix(tui): render network approval history by target (#22229)
## Why

Network approval prompts are rendered without a command string on the
app-server path. After the user approves one of those prompts, the TUI
history cell previously fell back to command-oriented copy and produced
malformed lines such as:

```text
You approved codex to run  every time this session
```

That hid the network target the user actually approved and left a
visibly broken transcript entry.

## What changed

- Preserve the approval subject as either a command or a network target
when recording TUI approval decisions.
- Render target-aware history copy for network approval outcomes:
  - approve once
  - approve for the current session
  - cancel
- Include the approval protocol and preserve the managed-proxy
`network-access` target when present, including non-default ports such
as `https://example.com:8443`.
- Fall back to formatting the network approval context as
`protocol://host` when no generated target command is available.
- Keep ordinary command approval history, Guardian approval history, and
persisted network-rule history behavior unchanged.
- Add focused regression coverage and snapshots for the three
network-history cases.

## How to Test

1. Start Codex in a flow that triggers a network approval prompt.
2. Approve network access only for the current conversation.
3. Confirm the transcript records the approved network target, for
example:
- `You approved codex network access to https://example.com:8443 every
time this session`
4. Trigger the prompt again and verify the one-time approval and cancel
paths also record target-specific history text instead of an empty
command gap.

Targeted automated coverage:
- `cargo test -p codex-tui network_exec_approval_history`

## Additional verification

- `cargo insta pending-snapshots`
- `git diff --check`
- `just fix -p codex-tui`
- `just argument-comment-lint`

## Known unrelated local test noise

A full `cargo test -p codex-tui` run still hits a pre-existing stack
overflow outside this change:
- `tests::fork_last_filters_latest_session_by_cwd_unless_show_all`
aborts with a stack overflow
2026-05-14 14:33:54 -03:00
Chris Bookholt
6ec8c4a6ec [codex] Ignore fsmonitor config in Git metadata reads (#22652)
## Summary
- keep Git metadata/status subprocesses independent of repository
`core.fsmonitor` configuration
- preserve existing working-tree state reporting while making the helper
behavior more predictable
- add regression coverage for `get_has_changes` when a repository
defines an fsmonitor command

## Validation
- `cargo fmt --all`
- `cargo test -p codex-core test_get_has_changes_`
- `cargo test -p codex-git-utils`
2026-05-14 10:07:43 -07:00
starr-openai
8736e32657 tests: avoid ambient temp sandbox roots (#22576)
## Why
Some sandboxed integration tests enabled both ambient temp roots
(`TMPDIR` and literal `/tmp`) even though they were not testing
temp-root behavior. On Linux bwrap, making `/tmp` writable causes
protected metadata mount targets such as `/tmp/.git`, `/tmp/.agents`,
and `/tmp/.codex` to be synthesized. If a run is interrupted, those
top-level markers can be left behind and contaminate later tests.

## What changed
For the incidental integration tests that do not need ambient temp-root
access, set `exclude_tmpdir_env_var` and `exclude_slash_tmp` to `true`.
Dedicated protected-metadata coverage remains in the lower-level sandbox
tests that use isolated temp roots.

## Verification
Focused remote devbox repros passed with a watcher polling `/tmp/.git`,
`/tmp/.agents`, and `/tmp/.codex`; no leaked markers were observed.
2026-05-14 10:04:24 -07:00
Casey Chow
74a1b46a00 [codex] add plugin marketplace CLI commands (#21396)
## Why

Plugin CLI installs should behave more like `apt-get install`:
configured marketplaces are the only install sources, the local
marketplace snapshot is the package index used at install time, and
`plugins/cache` is only a cache of already-downloaded plugin bytes.

That distinction matters once marketplaces and plugins have auth or
availability state. A repo-local marketplace manifest or leftover cached
plugin artifact should not silently become an install source unless the
marketplace was explicitly configured and its readable snapshot still
authorizes the plugin.

## What Changed

- add CLI commands to list configured marketplaces and add, list, or
remove marketplace plugins
- accept stable `plugin@marketplace` ids for add/remove while preserving
the explicit `--marketplace` form
- restrict `codex plugin add` and `codex plugin list` to configured
marketplaces instead of also discovering current-working-directory
marketplace roots
- fail `codex plugin add` and `codex plugin list` when a configured
marketplace snapshot is missing or malformed instead of treating it as
an empty source or a generic plugin miss
- preserve marketplace snapshot semantics: a configured local/Git
marketplace snapshot can authorize installs without consulting the
original upstream source
- allow `plugins/cache` reuse only after configured marketplace
resolution succeeds
- keep removal resilient after marketplace deletion or drift and ignore
malformed marketplace config entries in listing

## Commands Added

- `codex plugin add <plugin>@<marketplace>`
- `codex plugin add <plugin> --marketplace <marketplace>`
- `codex plugin list`
- `codex plugin list --marketplace <marketplace>`
- `codex plugin remove <plugin>@<marketplace>`
- `codex plugin remove <plugin> --marketplace <marketplace>`
- `codex plugin marketplace add <source>`
- `codex plugin marketplace add <source> --ref <ref>`
- `codex plugin marketplace add <source> --sparse <path>`
- `codex plugin marketplace list`
- `codex plugin marketplace upgrade`
- `codex plugin marketplace upgrade <marketplace>`
- `codex plugin marketplace remove <marketplace>`

## CLI Help Output

<details>
<summary><code>codex plugin --help</code></summary>

```text
Manage Codex plugins

Usage: codex plugin [OPTIONS] <COMMAND>

Commands:
  add          Install a plugin from a configured marketplace snapshot
  list         List plugins available from configured marketplace snapshots
  marketplace  Add, list, upgrade, or remove configured plugin marketplaces
  remove       Remove an installed plugin from local config and cache
  help         Print this message or the help of the given subcommand(s)
```

</details>

<details>
<summary><code>codex plugin add --help</code></summary>

```text
Install a plugin from a configured marketplace snapshot.

Pass either `PLUGIN@MARKETPLACE` or pass `PLUGIN` with `--marketplace MARKETPLACE`.

Usage: codex plugin add [OPTIONS] <PLUGIN[@MARKETPLACE]>

Arguments:
  <PLUGIN[@MARKETPLACE]>
          Plugin selector to install: either PLUGIN@MARKETPLACE or PLUGIN with --marketplace

Options:
  -m, --marketplace <MARKETPLACE>
          Configured marketplace name to use when PLUGIN does not include @MARKETPLACE

Examples:
  codex plugin add sample@debug
  codex plugin add sample --marketplace debug
```

</details>

<details>
<summary><code>codex plugin list --help</code></summary>

```text
List plugins available from configured marketplace snapshots

Usage: codex plugin list [OPTIONS]

Options:
  -m, --marketplace <MARKETPLACE>
          Only list plugins from this configured marketplace name

Examples:
  codex plugin list
  codex plugin list --marketplace debug
```

</details>

<details>
<summary><code>codex plugin remove --help</code></summary>

```text
Remove an installed plugin from local config and cache.

Pass either `PLUGIN@MARKETPLACE` or pass `PLUGIN` with `--marketplace MARKETPLACE`.

Usage: codex plugin remove [OPTIONS] <PLUGIN[@MARKETPLACE]>

Arguments:
  <PLUGIN[@MARKETPLACE]>
          Plugin selector to remove: either PLUGIN@MARKETPLACE or PLUGIN with --marketplace

Options:
  -m, --marketplace <MARKETPLACE>
          Marketplace name to use when PLUGIN does not include @MARKETPLACE

Examples:
  codex plugin remove sample@debug
  codex plugin remove sample --marketplace debug
```

</details>

<details>
<summary><code>codex plugin marketplace --help</code></summary>

```text
Add, list, upgrade, or remove configured plugin marketplaces

Usage: codex plugin marketplace [OPTIONS] <COMMAND>

Commands:
  add      Add a local or Git marketplace to the configured marketplace sources
  list     List configured marketplace names and their local snapshot roots
  upgrade  Refresh configured Git marketplace snapshots
  remove   Remove a configured marketplace source by name
```

</details>

<details>
<summary><code>codex plugin marketplace add --help</code></summary>

```text
Add a local or Git marketplace to the configured marketplace sources

Usage: codex plugin marketplace add [OPTIONS] <SOURCE>

Arguments:
  <SOURCE>
          Marketplace source: a local path, owner/repo[@ref], HTTPS Git URL, or SSH Git URL

Options:
      --ref <REF>
          Git ref to fetch for Git marketplace sources

      --sparse <PATH>
          Sparse checkout path for Git marketplace sources. Can be repeated

Examples:
  codex plugin marketplace add ./path/to/marketplace
  codex plugin marketplace add owner/repo --ref main
  codex plugin marketplace add https://github.com/owner/repo --sparse plugins/foo
```

</details>

<details>
<summary><code>codex plugin marketplace list --help</code></summary>

```text
List configured marketplace names and their local snapshot roots

Usage: codex plugin marketplace list [OPTIONS]
```

</details>

<details>
<summary><code>codex plugin marketplace upgrade --help</code></summary>

```text
Refresh configured Git marketplace snapshots.

Omit MARKETPLACE_NAME to upgrade all configured Git marketplaces.

Usage: codex plugin marketplace upgrade [OPTIONS] [MARKETPLACE_NAME]

Arguments:
  [MARKETPLACE_NAME]
          Optional configured marketplace name to upgrade. Omit to upgrade all Git marketplaces

Examples:
  codex plugin marketplace upgrade
  codex plugin marketplace upgrade debug
```

</details>

<details>
<summary><code>codex plugin marketplace remove --help</code></summary>

```text
Remove a configured marketplace source by name

Usage: codex plugin marketplace remove [OPTIONS] <MARKETPLACE_NAME>

Arguments:
  <MARKETPLACE_NAME>
          Configured marketplace name to remove

Example:
  codex plugin marketplace remove debug
```

</details>

## Public Semantics

- `codex plugin add <plugin>@<marketplace>` succeeds only when
`<marketplace>` is configured and its local marketplace snapshot
contains `<plugin>`
- repo-local marketplaces are not install sources until the user runs
`codex plugin marketplace add ...`
- configured marketplace snapshots must be readable; missing or
malformed snapshots fail the CLI operation rather than silently falling
through to cache or empty results
- cached plugin artifacts can satisfy reinstall only when the configured
marketplace snapshot still authorizes that plugin
- cached plugin artifacts alone never make a plugin installable

## Tests

- `cargo test -p codex-cli --test plugin_cli`
- `cargo clippy -p codex-cli --tests -- -D warnings`
- `cargo test -p codex-cli`
- `git diff --check`
- `just bazel-lock-update`
- `just bazel-lock-check`
2026-05-14 09:33:38 -07:00
Eric Traut
a5040d0b39 tui: split composer attachment and popup state (#22581)
## Why

`ChatComposer` currently owns text editing alongside attachment
bookkeeping and popup lifecycle state, while `BottomPane` still triggers
a couple of popup resyncs after composer methods that already do that
work internally. That blurs the ownership boundary and makes the
composer harder to simplify safely.

This PR is part 1 of a two-part cleanup. It peels off the composer state
that can move cleanly on its own, so the follow-up can tackle the
heavier draft/editing boundary without mixing every concern into one
diff.

## What changed

- Move local and remote image bookkeeping, placeholder relabeling, and
remote-image keyboard selection into `AttachmentState`.
- Move active-popup and popup-dismissal/query bookkeeping into
`PopupState`.
- Update composer and history-search paths to use those state owners
directly.
- Remove redundant `BottomPane` popup synchronization after paste
handling and `insert_str`.

## Part 2

The follow-up PR will finish the cleanup around the remaining composer
boundary: split out the draft/editing-oriented state and footer/status
presentation concerns that still live in `ChatComposer`, then revisit
the leftover `BottomPane` pass-throughs once those ownership lines are
explicit. The goal is for `ChatComposer` to coordinate a few focused
collaborators instead of continuing to be the landing zone for every
input-path concern.

## Verification

Did manual smoke tests.
2026-05-14 09:04:27 -07:00
Shijie Rao
e79e1b42b9 Chore: better published unsigned artifacts (#22649)
This is the exact same change as @bolinfest made but he could not push
because of github action change permission.

## Why

The `rust-release` workflow can now be run manually with
`sign_macos=false` to skip macOS signing, but that path previously
stopped before creating a GitHub Release. That left the unsigned macOS
binaries available only as workflow-run artifacts, which are awkward to
fetch from automation and cannot be retrieved with a simple
unauthenticated `curl`.

For the unsigned path we still should not perform the normal release
side effects: no npm or Python publishing, no WinGet publishing, no
`latest-alpha-cli` branch update, and no promotion to GitHub's latest
release. The goal is only to make the build outputs easy to fetch from
the release page.

## What changed

- Allow the `release` job in `.github/workflows/rust-release.yml` to run
for `workflow_dispatch` runs with `sign_macos=false`.
- For unsigned runs, keep the unsigned macOS artifacts plus the normal
Linux and Windows release artifacts needed for DotSlash, then
create/update the GitHub Release with `make_latest: false`.
- Keep the normal publish/promote paths gated to signed releases:
  - npm staging and publish
  - Python runtime publish
  - WinGet publish
  - `latest-alpha-cli` update
  - developer-site deploy
  - normal DotSlash release files
- Add `.github/dotslash-unsigned-config.json`, which publishes
`*-unsigned` DotSlash files that use unsigned macOS artifacts and the
normal Linux/Windows artifacts.


## What I added
PLEASE READ THIS!!!
I added `codex-command-runner` and `codex-windows-sandbox-setup` entries
to `.github/dotslash-unsigned-config.json` so that with
`sign_macos=false` we would still get the dotslash files for those
artifacts which are necessary for windows builds.
2026-05-14 08:47:21 -07:00
Michael Bolin
01d93fd9fc permissions: canonicalize workspace_roots and danger-full-access names (#22624)
## Why

This is a small precursor to the larger permissions-migration work. Both
the comparison stack in
[#22401](https://github.com/openai/codex/pull/22401) /
[#22402](https://github.com/openai/codex/pull/22402) and the alternate
stack in [#22610](https://github.com/openai/codex/pull/22610) /
[#22611](https://github.com/openai/codex/pull/22611) /
[#22612](https://github.com/openai/codex/pull/22612) are easier to
review if the terminology is already settled underneath them.

Because `:project_roots` and `:danger-no-sandbox` have not shipped as
stable user-facing surface area, carrying them forward as aliases would
just add more migration logic to the later stacks. This PR removes that
ambiguity now so the follow-on work can rely on one spelling for each
built-in concept.

## What Changed

- renamed the config-facing special filesystem key from `:project_roots`
to `:workspace_roots`
- dropped unpublished `:project_roots` parsing support in
`core/src/config/permissions.rs`, so new config only recognizes
`:workspace_roots`
- renamed the built-in full-access permission profile id from
`:danger-no-sandbox` to `:danger-full-access`
- dropped unpublished `:danger-no-sandbox` support entirely, including
the old active-profile canonicalization path, and added explicit
rejection coverage for the legacy id
- introduced shared built-in permission-profile id constants in
`codex-rs/protocol/src/models.rs`
- updated `core`, `app-server`, and `tui` call sites that special-case
built-in profiles to use the shared constants and canonical ids
- updated tests and the Linux sandbox README to use `:workspace_roots` /
`:danger-full-access`

## Verification

I focused verification on the three places this rename can regress:
config parsing, active-profile identity surfaced back out of `core`, and
user/server call sites that special-case built-in profiles.

Targeted checks:

-
`config::tests::default_permissions_can_select_builtin_profile_without_permissions_table`
-
`config::tests::default_permissions_read_only_applies_additional_writable_roots_as_modifications`
-
`config::tests::default_permissions_can_select_builtin_full_access_profile`
- `config::tests::legacy_danger_no_sandbox_is_rejected`
- `workspace_root` filtered `codex-core` tests
-
`request_processors::thread_processor::thread_processor_tests::thread_processor_behavior_tests::requested_permissions_trust_project_uses_permission_profile_intent`
-
`suite::v2::turn_start::turn_start_rejects_invalid_permission_selection_before_starting_turn`
- `status::tests::status_snapshot_shows_auto_review_permissions`
-
`status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access`
-
`app_server_session::tests::embedded_turn_permissions_use_active_profile_selection`
2026-05-14 08:45:54 -07:00
jif-oai
12bfb57139 Fix turn extension data task plumbing (#22646)
## Summary
- carry the per-turn extension data through RunningTask so abort
handling can rebuild SessionTaskContext
- update stale test ExtensionData::new() callsites to pass the turn id

## Testing
- Not run after PR branch creation; CI will cover.
2026-05-14 16:00:06 +02:00
Chris Bookholt
9ea38136b0 [codex] treat PowerShell stop-parsing forms as unsupported (#22643)
## Summary
- Treat PowerShell stop-parsing token forms as unsupported in the
AST-backed command flattener.
- Add focused regressions at the parser layer and Windows command-safety
layer.

## Why
The command-safety parser lowers PowerShell AST elements into argv-like
words. Stop-parsing syntax preserves a native-command argument shape
that this lowering does not model, so these forms should stay on the
conservative unsupported path.

## Validation
- `cargo fmt --manifest-path codex-rs/Cargo.toml --all --check`
- `cargo test --manifest-path codex-rs/Cargo.toml -p
codex-shell-command`
2026-05-14 06:28:34 -07:00
jif-oai
deedf3b2c4 feat: add layered --profile-v2 config files (#17141)
## Why

`--profile-v2 <name>` gives launchers and runtime entry points a named
profile config without making each profile duplicate the base user
config. The base `$CODEX_HOME/config.toml` still loads first, then
`$CODEX_HOME/<name>.config.toml` layers above it and becomes the active
writable user config for that session.

That keeps shared defaults, plugin/MCP setup, and managed/user
constraints in one place while letting a named profile override only the
pieces that need to differ.

## What Changed

- Added the shared `--profile-v2 <name>` runtime option with validated
plain names, now represented by `ProfileV2Name`.
- Extended config layer state so the base user config and selected
profile config are both `User` layers; APIs expose the active user layer
and merged effective user config.
- Threaded profile selection through runtime entry points: `codex`,
`codex exec`, `codex review`, `codex resume`, `codex fork`, and `codex
debug prompt-input`.
- Made user-facing config writes go to the selected profile file when
active, including TUI/settings persistence, app-server config writes,
and MCP/app tool approval persistence.
- Made plugin, marketplace, MCP, hooks, and config reload paths read
from the merged user config so base and profile layers both participate.
- Updated app-server config layer schemas to mark profile-backed user
layers.

## Limits

`--profile-v2` is still rejected for config-management subcommands such
as feature, MCP, and marketplace edits. Those paths remain tied to the
base `config.toml` until they have explicit profile-selection semantics.

Some adjacent background writes may still update base or global state
rather than the selected profile:

- marketplace auto-upgrade metadata
- automatic MCP dependency installs from skills
- remote plugin sync or uninstall config edits
- personality migration marker/default writes

## Verification

Added targeted coverage for profile name validation, layer
ordering/merging, selected-profile writes, app-server config writes,
session hot reload, plugin config merging, hooks/config fixture updates,
and MCP/app approval persistence.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-14 15:16:15 +02:00
jif-oai
17cd321c32 Wire turn item contributors into stream output (#22494)
## Summary
- run registered TurnItemContributor hooks for parsed stream output
items
- plumb the active turn extension store into stream item handling
- preserve existing memory citation parsing as fallback after
contributors run

## Tests
- cargo test -p codex-core stream_events_utils -- --nocapture
- just fmt
- just fix -p codex-core
- git diff --check
2026-05-14 14:48:17 +02:00
jif-oai
6d65686313 feat: make ToolExecutor an async trait (#22560)
## Why

`codex_tools::ToolExecutor` keeps a tool spec attached to its runtime
handler, but extension tools still carried a parallel
`ExtensionToolFuture` / `ExtensionToolExecutor` shape. That made
extension-owned tools look different from host tools even though
routing, registration, and execution need the same abstraction.

This PR makes the shared executor contract directly async and lets
extension tools implement it too, so host tools and extension tools can
move through the same registration path.

## What changed

- Changed `ToolExecutor::handle` to an `async fn` using `async-trait`,
and updated built-in tool handlers to implement the async trait
directly.
- Replaced the bespoke `ExtensionToolFuture` contract with a marker
`ExtensionToolExecutor` over `ToolExecutor<ToolCall, Output =
JsonToolOutput>`, re-exporting `ToolExecutor` from
`codex-extension-api`.
- Updated the memories extension tools to implement the shared executor
trait.
- Split tool-router construction into collected executors plus hosted
model specs, keeping hosted tools like web search and image generation
separate from executable handlers.
- Updated spec/router tests and extension-tool stubs for the new
executor shape.

## Verification

- Not run locally.
2026-05-14 11:23:57 +02:00
Eric Traut
6a225e4005 Defer startup NUX impressions until startup succeeds (#22587)
## Why

This is a follow-up to #22573. This problem was surfaced in a code
review comment that I missed before merging the previous PR.

Fresh-session startup could prepare a model-availability NUX before
`app_server.start_thread(&config)` completed. If thread startup then
failed, the TUI never rendered the tooltip, but
`prepare_startup_tooltip_override(...)` had already persisted one of the
limited impressions.

## What Changed

- Move startup tooltip preparation inside the fresh-thread startup
branch, after `start_thread(...)` succeeds.
- Keep resume/fork paths unchanged.
- Remove the now-redundant
`should_prepare_startup_tooltip_override(...)` helper and its gate test.
2026-05-13 21:03:19 -07:00
xli-oai
9797296564 Relax remote plugin sync gate (#22594)
## Summary
- Allow remote installed-plugin cache refresh to start whenever plugins
are enabled.
- Allow remote installed-plugin bundle sync to start whenever plugins
are enabled.
- Remove the extra local `remote_plugin_enabled` guard from those
background sync paths.

## Context
Server-side installed plugin state and optional bundle URL behavior are
owned by plugin-service `/public/plugins/installed`, so these local sync
paths only need the overall plugin enablement gate.

## Test plan
- `just fmt`
- `cargo test -p codex-core-plugins`
2026-05-14 03:38:30 +00:00
Eric Traut
35451ba79c Simplify TUI startup test coverage (#22573)
## Why

The TUI startup test surface had drifted into expensive, brittle
coverage:

- `tui/tests/suite/no_panic_on_startup.rs` was already ignored as flaky
while still spawning a PTY to exercise malformed exec-policy rules.
- `tui/tests/suite/model_availability_nux.rs` used a seeded session,
cursor-query spoofing, and repeated interrupts to verify a narrow
resume-path invariant.
- `app/tests.rs` had started accumulating unrelated startup and summary
coverage in one flat module even after the surrounding app code was
split into feature modules.

This keeps those behaviors covered while making the tests cheaper to
understand and less likely to rot. It also preserves the malformed-rules
regression from #8803 without requiring a terminal orchestration test.

## What changed

- Replaced the malformed `rules` startup PTY case with a direct
exec-policy loader regression:

[`rules_path_file_returns_read_dir_error`](21b6b5622f/codex-rs/core/src/exec_policy_tests.rs (L264-L284))
- Made the existing fresh-session-only startup tooltip behavior explicit
with

[`should_prepare_startup_tooltip_override`](21b6b5622f/codex-rs/tui/src/app/thread_routing.rs (L1272-L1279)),
then added focused coverage for the resume/fork gate and the persisted
NUX counter.
- Split startup and session-summary coverage out of
`tui/src/app/tests.rs` into dedicated modules so the test layout better
mirrors the current app architecture.
- Converted one single-message goal validation snapshot into semantic
assertions where layout was not the behavior under test.
- Removed the two PTY-heavy suite files that the narrower tests now
supersede.

## Verification

- `cargo test -p codex-core rules_path_file_returns_read_dir_error`
- `cargo test -p codex-tui startup_`
- `cargo test -p codex-tui session_summary_`
- `cargo test -p codex-tui
goal_slash_command_rejects_oversized_objective`
2026-05-13 18:16:54 -07:00
Owen Lin
4e368aa2e9 enable/disable remote control at runtime, not via features (#22578)
## Why
reapplies https://github.com/openai/codex/pull/22386 which was
previously reverted

Also, introduce `remoteControl/enable` and `remoteControl/disable`
app-server APIs to toggle on/off remote control at runtime for a given
running app-server instance.

## What Changed

- Adds experimental v2 RPCs:
  - `remoteControl/enable`
  - `remoteControl/disable`
- Adds `RemoteControlRequestProcessor` and routes the new RPCs through
it instead of `ConfigRequestProcessor`.
- Adds named `RemoteControlHandle::enable`, `disable`, and `status`
methods.
- Makes `remoteControl/enable` return an error when sqlite state DB is
unavailable, while keeping enrollment/websocket failures as async status
updates.
- Adds `AppServerRuntimeOptions.remote_control_enabled` and hidden
`--remote-control` flags for `codex app-server` and `codex-app-server`.
- Updates managed daemon startup to use `codex app-server
--remote-control --listen unix://`.
- Marks `Feature::RemoteControl` as removed and ignores
`[features].remote_control`.
- Updates app-server README entries for the new remote-control methods.
2026-05-14 01:07:46 +00:00
Owen Lin
512f8f8012 Improve remote-control daemon UX (#22562)
## Why

`codex remote-control` manages the app-server daemon with
`remote_control` enabled, but it previously only exposed an implicit
start path. Once started, there was no obvious top-level
`remote-control` command for stopping the daemon; users had to know
about the lower-level `codex app-server daemon stop` command.

The startup failure for missing managed installs was also ambiguous.
`codex remote-control` and daemon bootstrap require the standalone Codex
install under `CODEX_HOME/packages/standalone/current/codex`, but the
old error only said to install Codex first, which is unclear when
another `codex` binary is already on PATH. Now we add an explicit
instruction for how to get the standalone Codex install.

## What changed

- Converts `codex remote-control` into a command group while preserving
bare `codex remote-control` as the existing start behavior.
- Adds `codex remote-control start` as the explicit start path.
- Adds `codex remote-control stop`, which maps to app-server daemon
stop.
- Updates the shared daemon managed-install error to name the missing
standalone path, explain why that install is required, provide the
installer command, and tell users to rerun the command they just tried.

## Verification

- `cargo test -p codex-app-server-daemon`
- `cargo test -p codex-cli`
- `./target/debug/codex remote-control --help`
2026-05-13 18:04:08 -07:00
Dylan Hurd
e33cf9ae28 chore(config) rm experimental_use_freeform_apply_patch (#22565)
## Summary
Get rid of the `experimental_use_freeform_apply_patch` config option,
since it is now encoded in model config. No deprecation message since it
has been experimental this entire time.

## Testing
- [x] Updated unit tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-13 17:52:15 -07:00
David de Regt
53a36fc1c2 fix: Block appserver startup if state db can't be opened (#22580)
All apps must be able to open the db to proceed -- codex is having
issues with manufacturing new installation ids in local mode when the db
can't be opened for race conditions or any other reasons.
2026-05-14 00:50:17 +00:00
Eric Ning
64d8f387f9 Remove connector_openai prefix filtering (#22555)
Remove unnecessary prefix filtering from codex

## Test Plan

Test local cli build + make sure backend returns appropriate apps 

```
cd ~/code/codex/codex-rs
cargo build -p codex-cli --bin codex
./target/debug/codex
```

Appropriate apps show up in my list
2026-05-13 16:59:22 -07:00
Max Burkhardt
4fca5c32da Deprecate issue labeler (#22574) 2026-05-13 19:55:31 -04:00
Shijie Rao
49d1f66c4c Add unsigned macOS release artifacts (#22559)
## Summary
- Upload unsigned macOS release binaries before signing so they remain
available from the workflow run if signing fails
- Add a manual `workflow_dispatch` option, `sign_macos`, defaulting to
`true`
- When `sign_macos=false`, skip macOS signing, signed-name macOS
artifacts, DMGs, npm/DotSlash/PyPI publishing, latest release marking,
and `latest-alpha-cli` updates


## Process
HAVE NOT TESTED YET BUT we should be able to run
```
gh workflow run rust-release.yml \
  -R openai/codex \
  --ref rust-v0.132.0 \
  -f sign_macos=false
```

which will then start the rust-release script with `sign_macos` and
therefore do not codesign mac and also no release afterward.
2026-05-13 16:47:44 -07:00
xl-openai
e3bf0cfc63 [codex] Canonicalize shared workspace plugin IDs (#22564)
## Summary
- Canonicalize private and unlisted workspace shared plugin IDs to
`workspace-shared-with-me`.
- Keep `plugin/list` private/unlisted shared-with-me buckets as UI
grouping only.
- Update share read/list/checkout and cache cleanup coverage for the
canonical namespace.

## Tests
- `cargo test -p codex-app-server --test all
plugin_list_fetches_shared_with_me_kind`
- `cargo test -p codex-app-server --test all
plugin_read_returns_share_context_for_shared_remote_plugin`
- `cargo test -p codex-app-server --test all suite::v2::plugin_share`
- `cargo test -p codex-core-plugins
list_remote_plugin_shares_fetches_created_workspace_plugins`
- `cargo test -p codex-core-plugins
stale_remote_plugin_cleanup_removes_old_shared_with_me_cache_and_keeps_canonical_cache`
- `git diff --check`
2026-05-13 16:29:47 -07:00
Eric Traut
3c3e18c222 Refactor chatwidget orchestration into modules (phase 5) (#22537)
## Why

`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase cleanup to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer. #22407 completed phase 2 by extracting input and
submission flow, #22433 completed phase 3 by extracting protocol,
replay, streaming, and tool lifecycle handling, and #22518 completed
phase 4 by extracting settings, popups, and status surfaces.

This PR is phase 5. It cleans up the remaining constructor and
orchestration code now that the larger behavior domains have moved out,
leaving `chatwidget.rs` much closer to the composition layer the cleanup
was aiming for. This is once again a mechanical movement of existing
functions. No functional changes.

## What Changed

- Added focused modules for widget construction and initial wiring,
session configuration flow, key/composer interaction routing, review
popup orchestration, desktop notification coalescing, and render
composition.
- Moved the remaining constructor, session setup, interaction,
notification, review picker, and rendering helpers out of
`codex-rs/tui/src/chatwidget.rs`.
- Preserved the existing startup/session behavior, keyboard handling,
review picker flow, notification priority behavior, and render
composition while shrinking the central widget module substantially.
- Left `codex-rs/tui/src/chatwidget.rs` as the registration and
composition surface for the extracted behavior modules.

## Cleanup Phases

The five-phase cleanup plan from #22269 is:

1. Phase 1: mechanical helper and state moves. Completed in #22269.
2. Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. Completed in #22407.
3. Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
Completed in #22433.
4. Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers. Completed in #22518.
5. Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer. This PR.

## Verification

- `cargo check -p codex-tui`
- `cargo test -p codex-tui chatwidget::tests::popups_and_settings`
- `cargo test -p codex-tui chatwidget::tests::plan_mode`
- `cargo test -p codex-tui chatwidget::tests::review_mode`
- `cargo test -p codex-tui chatwidget::tests::status_and_layout`

`cargo test -p codex-tui` also compiles and begins running, but aborts
in the unchanged app-side test
`app::tests::discard_side_thread_keeps_local_state_when_server_close_fails`
with the same reproducible stack overflow noted in phase 4.
2026-05-13 15:40:53 -07:00
Eric Traut
efdcbba053 Remove resurrected /collab slash command (#22535)
## Summary
`/collab` was intentionally removed in
[#12012](https://github.com/openai/codex/pull/12012), but the
TUI/app-server migration accidentally brought that slash-command path
back. This restores the earlier product decision so the TUI no longer
advertises or dispatches `/collab`. This command was redundant because
it did the same thing as `/plan` but in a less-intuitive way.

## What Changed
- Remove `SlashCommand::Collab` from the TUI slash-command surface.
- Delete the picker and app-event plumbing that only existed to service
`/collab`.
- Remove obsolete TUI test coverage for the deleted picker flow.
2026-05-13 15:40:37 -07:00
jif-oai
e6939e3969 feat: namespace in ext (#22556) 2026-05-14 00:37:48 +02:00
Abhinav
23bb524973 Spill oversized PreToolUse additionalContext (#22529)
# Why

`PreToolUse.additionalContext` became model-visible after #20692, but
the hook-output spilling path from #21069 never picked up that newer
lane. As a result, oversized `PreToolUse` context could bypass the
truncation/spill treatment that already applies to the other hook
outputs Codex forwards to the model.

# What

- Run `PreToolUseOutcome.additional_contexts` through
`maybe_spill_texts(...)`
- Add an integration test proving a large `PreToolUse.additionalContext`
is replaced with a truncated preview plus spill-file pointer, while the
full text is preserved on disk.
2026-05-13 15:21:31 -07:00
Andrey Mishchenko
7c57a59f51 Make multi_agent_v2 wait_agent timeouts configurable (#22528)
## Why

`multi_agent_v2` already allowed configuring the minimum `wait_agent`
timeout, but the default timeout and upper bound were still hard-coded.
That made it hard to tune waits for subagent mailbox activity in
sessions that need either faster wakeups or longer waits, and it meant
the model-visible `wait_agent` schema could not fully reflect the
resolved runtime limits.

## What Changed

- Added `features.multi_agent_v2.max_wait_timeout_ms` and
`features.multi_agent_v2.default_wait_timeout_ms` alongside the existing
`min_wait_timeout_ms` setting.
- Validated all three timeouts in config as `0..=3_600_000`, with
`min_wait_timeout_ms <= default_wait_timeout_ms <= max_wait_timeout_ms`.
- Thread and review session tool config now passes the resolved
min/default/max values into the `wait_agent` tool schema.
- `wait_agent` now uses the configured default when `timeout_ms` is
omitted and rejects explicit values outside the configured min/max range
instead of silently clamping them.
- Updated the generated config schema and config-lock test coverage for
the new fields.
2026-05-13 14:43:06 -07:00
iceweasel-oai
8ae0c837f0 Avoid PowerShell profiles in elevated Windows sandbox (#21400)
## Why

On Windows, elevated sandboxed commands run under a dedicated sandbox
account while `HOME` / `USERPROFILE` can still point at the real user's
profile directory. For PowerShell login shells, that combination can
make the sandbox account try to load the real user's PowerShell profile
script. If the sandbox account's execution policy differs from the real
user's policy, startup can emit profile-loading errors before the
requested command runs.

For this backend, loading the profile is not a faithful user login
shell: it is cross-account profile execution. Treating these PowerShell
invocations as non-login shells avoids that invalid startup path.

## Why This Happens Late

The normal `login` decision is resolved when shell argv is created, but
that point is too early to make this Windows sandbox-specific decision.
At argv creation time we do not yet know the actual sandbox attempt that
will run the command. A turn can include sandboxed and unsandboxed
attempts, and a broad turn-level override would also affect Full Access
commands where the user's profile should remain available.

Instead, this change carries the selected `ShellType` alongside the argv
and applies the `-NoProfile` adjustment in the shell runtimes once the
`SandboxAttempt` is known. That keeps the override scoped to actual
`WindowsRestrictedToken` attempts with `WindowsSandboxLevel::Elevated`.

The runtime uses the selected shell metadata rather than re-detecting
PowerShell from argv. That avoids brittle parsing and covers PowerShell
invocation shapes such as `-EncodedCommand`.

## What Changed

- Carry selected shell metadata through `exec_command` / unified exec
requests and shell tool requests.
- Insert `-NoProfile` for PowerShell commands only when the runtime is
about to execute a sandboxed elevated Windows attempt.
- Add focused unit coverage for elevated Windows PowerShell,
`-EncodedCommand`, existing `-NoProfile`, legacy restricted-token
attempts, unsandboxed attempts, and non-PowerShell commands.

## Verification

- `cargo test -p codex-core disable_powershell_profile_tests`
- `cargo test -p codex-core test_get_command`
- `cargo clippy --fix --tests --allow-dirty --allow-no-vcs -p
codex-core`

A full `cargo test -p codex-core` run was also attempted during
development, but it still hit an unrelated stack overflow in
`agent::control` tests before reaching this area.
2026-05-13 21:37:50 +00:00
sayan-oai
3de4d7f238 clean up instructions (#22543)
rm behavioral steering in tool docs for code mode.
2026-05-13 14:28:57 -07:00
Felipe Coury
9798eb377a feat(cli): add codex doctor diagnostics (#22336)
## Why

Users and support need a single command that captures the local Codex
runtime, configuration, auth, terminal, network, and state shape without
asking the user to know which diagnostic depth to choose first. `codex
doctor` now runs the useful checks by default and makes the detailed
human output the default because the command is usually run when someone
already needs context.

The command also targets concrete support failure modes we have seen
while iterating on the design:

- update-target mismatches like #21956, where the installed package
manager target can differ from the running executable
- terminal and multiplexer issues that depend on `TERM`, tmux/zellij
state, color handling, and TTY metadata
- provider-specific HTTP/WebSocket connectivity, including ChatGPT
WebSocket handshakes and API-key/provider endpoint reachability
- local state/log SQLite integrity problems and large rollout
directories
- feedback reports that need an attached, redacted diagnostic snapshot
without asking the user to run a second command

## What Changed

- Adds `codex doctor` as a grouped CLI diagnostic report with default
detailed output and `--summary` for the compact view.
- Adds stable report sections for Environment, Configuration, Updates,
Connectivity, and Background Server, plus a top Notes block that
promotes anomalies such as available updates, large rollout directories,
optional MCP issues, and mixed auth signals.
- Adds runtime provenance, install consistency, bundled/system search
readiness, terminal/multiplexer metadata, `config.toml` parse status,
auth mode details, sandbox details, feature flag summaries, update
cache/latest-version state, app-server daemon state, SQLite integrity
checks, rollout statistics, and provider-aware network diagnostics.
- Adds ChatGPT WebSocket diagnostics that report the negotiated HTTP
upgrade as `HTTP 101 Switching Protocols` and include timeout, DNS,
auth, and provider context in detailed output.
- Makes reachability provider-aware: API-key OpenAI setups check the API
endpoint, ChatGPT auth checks the ChatGPT path, and custom/AWS/local
providers check configured HTTP endpoints when available.
- Adds structured, redacted JSON output where `checks` is keyed by check
id and `details` is a key/value object for support tooling.
- Integrates doctor with feedback uploads by attaching a best-effort
`codex-doctor-report.json` report and adding derived Sentry tags for
overall status and failing/warning checks.
- Updates the TUI feedback consent copy so users can see that the doctor
report is included when logs/diagnostics are uploaded.
- Updates the CLI bug issue template to ask reporters for `codex doctor
--json` and render pasted reports as JSON.

## Example Output

The examples below are sanitized from local smoke runs with `--no-color`
so the structure is reviewable in plain text.

### `codex doctor`

```text
Codex Doctor v0.0.0 · macos-aarch64

Notes
   ↑ updates      0.130.0 available (current 0.0.0, dismissed 0.128.0)
   ⚠ rollouts     1,526 active files · 2.53 GB on disk
   ⚠ mcp          MCP configuration has optional issues
   ⚠ auth         mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode
─────────────────────────────────────────────────────────────

Environment
  ✓ runtime      local debug build
      version                  0.0.0
      install method           other
      commit                   unknown
      executable               ~/code/codex.fcoury-doct…x-rs/target/debug/codex
  ✓ install      consistent
      context                  other
      managed by               npm: no · bun: no · package root —
      PATH entries (2)         ~/.local/share/mise/installs/node/24/bin/codex
                               ~/.local/share/mise/shims/codex
  ✓ search       ripgrep 15.1.0 (system, `rg`)
  ✓ terminal     Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color
      terminal                 Ghostty
      TERM_PROGRAM             ghostty
      terminal version         1.3.2-main-+b0f827665
      TERM                     xterm-256color
      multiplexer              tmux 3.6a
      tmux extended-keys       on
      tmux allow-passthrough   on
      tmux set-clipboard       on
  ✓ state        databases healthy
      CODEX_HOME               ~/.codex (dir)
      state DB                 ~/.codex/state_5.sqlite (file) · integrity ok
      log DB                   ~/.codex/logs_2.sqlite (file) · integrity ok
      active rollouts          1,526 files · 2.53 GB (avg 1.70 MB)
      archived rollouts        8 files · 3.84 MB (avg 491.11 KB)

Configuration
  ✓ config       loaded
      model                    gpt-5.5 · openai
      cwd                      ~/code/codex.fcoury-doctor/codex-rs
      config.toml              ~/.codex/config.toml
      config.toml parse        ok
      MCP servers              1
      feature flags            36 enabled · 7 overridden (full list with --all)
      overrides                code_mode, code_mode_only, memories, chronicle, goals, remote_control, prevent_idle_sleep
  ✓ auth         auth is configured
      auth storage mode        File
      auth file                ~/.codex/auth.json
      auth env vars present    OPENAI_API_KEY
      stored auth mode         chatgpt
      stored API key           false
      stored ChatGPT tokens    true
      stored agent identity    false
  ⚠ mcp          MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server.
      configured servers       1
      disabled servers         0
      streamable_http servers  1
      optional reachability    openaiDeveloperDocs: https://developers.openai.com/mcp (HEAD connect failed; GET connect failed)
  ✓ sandbox      restricted fs + restricted network · approval OnRequest
      approval policy          OnRequest
      filesystem sandbox       restricted
      network sandbox          restricted

Connectivity
  ✓ network      network-related environment looks readable
  ✓ websocket    connected (HTTP 101 Switching Protocols) · 15s timeout
      model provider           openai
      provider name            OpenAI
      wire API                 responses
      supports websockets      true
      connect timeout          15000 ms
      auth mode                chatgpt
      endpoint                 wss://chatgpt.com/backend-api/<redacted>
      DNS                      2 IPv4, 2 IPv6, first IPv6
      handshake result         HTTP 101 Switching Protocols
  ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration.
      reachability mode        API key auth
      openai API               https://api.openai.com/v1 connect failed (required)

Background Server
  ○ app-server   not running (ephemeral mode)

─────────────────────────────────────────────────────────────
11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed

--summary compact output           --all expand truncated lists
--json redacted report
```

### `codex doctor --summary`

```text
Codex Doctor v0.0.0 · macos-aarch64

Notes
   ↑ updates      0.130.0 available (current 0.0.0, dismissed 0.128.0)
   ⚠ rollouts     1,526 active files · 2.53 GB on disk
   ⚠ mcp          MCP configuration has optional issues
   ⚠ auth         mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode
─────────────────────────────────────────────────────────────

Environment
  ✓ runtime      local debug build
  ✓ install      consistent
  ✓ search       ripgrep 15.1.0 (system, `rg`)
  ✓ terminal     Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color
  ✓ state        databases healthy

Configuration
  ✓ config       loaded
  ✓ auth         auth is configured
  ⚠ mcp          MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server.
  ✓ sandbox      restricted fs + restricted network · approval OnRequest

Updates
  ✓ updates      update configuration is locally consistent

Connectivity
  ✓ network      network-related environment looks readable
  ✓ websocket    connected (HTTP 101 Switching Protocols) · 15s timeout
  ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration.

Background Server
  ○ app-server   not running (ephemeral mode)

─────────────────────────────────────────────────────────────
11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed

Run codex doctor without --summary for detailed diagnostics.
--all expand truncated lists       --json redacted report
```

### `codex doctor --json` shape

```json
{
  "schema_version": 1,
  "overall_status": "fail",
  "checks": {
    "runtime.provenance": {
      "id": "runtime.provenance",
      "category": "Environment",
      "status": "ok",
      "summary": "local debug build",
      "details": {
        "version": "0.0.0",
        "install method": "other",
        "commit": "unknown"
      }
    },
    "sandbox.helpers": {
      "id": "sandbox.helpers",
      "category": "Configuration",
      "status": "ok",
      "summary": "restricted fs + restricted network · approval OnRequest",
      "details": {
        "approval policy": "OnRequest",
        "filesystem sandbox": "restricted",
        "network sandbox": "restricted"
      }
    }
  }
}
```

### `/feedback` new sentry attachment

<img width="938" height="798" alt="CleanShot 2026-05-13 at 15 36 14"
src="https://github.com/user-attachments/assets/715e62e0-d7b4-4fea-a35a-fd5d5d33c4c0"
/>

### New section in CLI issue template

<img width="1164" height="435" alt="CleanShot 2026-05-13 at 15 47 24"
src="https://github.com/user-attachments/assets/9081dc25-a28c-4afa-8ba1-e299c2b4031d"
/>

## How to Test

1. Run `cargo run --bin codex -- doctor --no-color`.
2. Confirm the detailed report is the default and includes promoted
Notes, grouped sections, terminal details, state DB integrity, rollout
stats, provider reachability, WebSocket diagnostics, and app-server
status.
3. Run `cargo run --bin codex -- doctor --summary --no-color`.
4. Confirm the compact view keeps the same sections and summary counts
but omits detailed key/value rows.
5. Run `cargo run --bin codex -- doctor --json`.
6. Confirm the output is redacted JSON, `checks` is an object keyed by
check id, and each check's `details` is a key/value object.
7. Preview the CLI bug issue template and confirm the `Codex doctor
report` field appears after the terminal field, asks for `codex doctor
--json`, and renders pasted output as JSON.
8. Start a feedback flow that includes logs.
9. Confirm the upload consent copy lists `codex-doctor-report.json`
alongside the log attachments.

Targeted tests:

- `cargo test -p codex-cli doctor`
- `cargo test -p codex-app-server
doctor_report_tags_summarize_status_counts`
- `cargo test -p codex-feedback`
- `cargo test -p codex-tui feedback_view`
- `just argument-comment-lint`
- `git diff --check`
2026-05-13 21:23:19 +00:00
canvrno-oai
5d7e6a2503 [codex] Fix TUI wrapping for external borrowed slices (#21235)
Fixes #20587, reported by @noeljackson.

This prevents the TUI wrapping code from panicking when `textwrap`
returns a borrowed slice that does not point into the original source
text. The fix follows the direction proposed by @misrtjakub in the issue
comment: validate the borrowed slice pointer range first, and fall back
to the existing owned-line mapper when the slice is external.

- Guards borrowed wrapped slices before converting pointer offsets into
byte ranges.
- Reuses the existing owned-line range recovery path for external
borrowed slices.
- Adds coverage for rejecting borrowed slices outside the source text.

End-user testing steps:
- Start Codex in TUI mode under a PTY wrapper that can inject stdin
after startup.
- Inject `\x1b[200~test message\x1b[201~\r` after the TUI is ready.
- Confirm Codex does not panic and the pasted text is handled normally.

Local validation:
- `cargo test -p codex-tui wrapping::tests::`
- `cargo test -p codex-tui -- --skip
status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access
--skip
status::tests::status_permissions_full_disk_managed_without_network_is_external_sandbox`
2026-05-13 14:19:05 -07:00
canvrno-oai
16592f593d Use plugin/list to get list of plugins for mentions (#22375)
This switches TUI plugin mentions to use app-server `plugin/list` for
plugin inventory and metadata instead of `PluginManager`, while keeping
the same mention-eligibility filters as before.

Same filters as before:
- Only plugins in the current config / cwd scope.
- Only installed and enabled plugins.
- Only plugins that actually expose a capability, meaning at least one
skill, MCP server, or app connector.
- Uses `plugin/list` for the mention names/descriptions
2026-05-13 14:11:10 -07:00
Abhinav
14473c216f Enable plugin hooks by default (#22549)
# Why

Plugin-bundled hooks are already wired through the plugin manager,
session setup, and app-server hook listing paths. Keeping `plugin_hooks`
disabled by default means users still need an explicit feature opt-in
before that existing behavior participates in normal plugin loading.

# What

- mark `plugin_hooks` as stable and enable it by default
- add feature-registry test coverage for the new default/stage pairing

Validation:

- `cargo test -p codex-features`
- `just fmt`
2026-05-13 21:10:28 +00:00
stevenlee-oai
0a2d751fc2 Add callback ids to local MCP OAuth redirects (#20237)
## Summary

- Add a deterministic callback-id path segment to local MCP OAuth
redirect URIs before starting authorization.
- Derive the callback id from the normalized MCP server URL and encode
it as a 12-character URL-safe hash.
- Reuse the existing exact callback-path validation so OAuth completion
only succeeds on the callback path that was sent in the redirect URI.

## Context

Slack thread:
https://openai.slack.com/archives/C087WB3AGCR/p1777480566571699

That thread calls out the OAuth mix-up class of issue for MCP servers.
The connector/App Connect flow already has a callback_id concept that
binds the OAuth callback URL to the MCP app/server identity. Codex
desktop's local MCP OAuth flow was still using a generic local callback
path like `/callback`, so this PR adds the same shape to the shared
local MCP OAuth helper.

## Behavior

Before this change, local MCP OAuth used:

- default local callback URL: `http://127.0.0.1:<port>/callback`
- configured callback URL: `<configured callback URL>` unchanged

After this change, Codex appends a deterministic callback-id segment:

- default local callback URL:
`http://127.0.0.1:<port>/callback/<callback_id>`
- configured callback URL: `<configured callback path>/<callback_id>`

The local callback server already compares the incoming request path
against the path from the redirect URI. By appending the callback id
before both authorization and callback validation, callbacks that arrive
on the old generic path or a mismatched callback-id path are rejected.

The callback id is bound to the MCP endpoint URL, including path and
query, so path-based multi-tenant MCP deployments on the same origin do
not share a callback path. URL fragments are ignored because they are
not sent to the server.

The change lives in `codex-rmcp-client`, so it covers both the normal
desktop MCP OAuth login path and silent/plugin-triggered MCP OAuth login
paths that use the same `perform_oauth_login_*` helpers.

## Scope and non-goals

- This does not change the app-server protocol or desktop webview
request shape.
- This does not implement RFC 9207 `iss` validation; issuer validation
is still useful when providers return `iss`.
- This does not make arbitrary untrusted MCP servers safe to use. It
specifically adds callback URL binding for the local MCP OAuth flow.

## Validation

- `cargo fmt --all`
- `cargo test -p codex-rmcp-client perform_oauth_login`
2026-05-13 13:26:04 -07:00
pakrym-oai
3ac1d15598 Use selected environment cwd for filesystem helpers (#22542)
## Why

`TurnContext::cwd` is deprecated in favor of resolving paths from the
selected turn environment cwd. A few filesystem-oriented paths were
still constructing sandbox context from the legacy cwd and then mutating
it afterward, or resolving local file paths through the deprecated
helper.

## What changed

- Make `TurnContext::file_system_sandbox_context` take the trusted cwd
explicitly.
- Pass the selected turn environment cwd directly from `apply_patch` and
`view_image` call sites.
- Restrict `spawn_agents_on_csv` to exactly one local environment and
resolve input/output CSV paths from that local environment cwd.
- Remove a redundant test setup assignment that only synchronized
deprecated `TurnContext::cwd` with a replaced config.

## Validation

- `cargo test -p codex-core view_image`
- `cargo test -p codex-core
maybe_persist_mcp_tool_approval_writes_project_config_for_project_server`
- `cargo test -p codex-core parse_csv_supports_quotes_and_commas`
- `git diff --check`
2026-05-13 13:18:56 -07:00
starr-openai
d666238b40 Shard Bazel Windows tests across jobs (#22408)
## Summary
- split the single PR-blocking Bazel Windows test leg into four Windows
shard jobs
- preserve the existing required Windows Bazel check name with a
lightweight aggregate gate
- keep Linux/macOS Bazel test jobs and the separate Windows
clippy/release jobs unchanged

## Why
The ordinary PR Windows Bazel test leg was one GitHub Actions job, so
Bazel only had in-job parallelism. This gives that lane real job-level
fanout across separate Windows hosts while keeping the target set
disjoint via stable label hashing.

## Evidence
- final pre-rebase green run: `25774733562`
- Windows shard target counts: `61/212`, `48/212`, `52/212`, `51/212`
- Windows test fanout completed in about 7m29s versus a recent
monolithic median around 22m26s

## Notes
- this is scoped to the Bazel Windows test leg only
- each shard keeps the existing Windows cross-compile/RBE path and
restores the former monolithic Windows test cache
- shard jobs do not upload duplicate repository caches after test work,
keeping cache cleanup off the PR-blocking shard path
- no local validation run; relying on GitHub Actions for the
workflow-shaped check

Co-authored-by: Codex <noreply@openai.com>
2026-05-13 12:46:51 -07:00
Owen Lin
fb7cfc813a fix: prevent codex-backend from stealing originator (#22533)
## Why

Remote control starts by letting `codex-backend` initialize against the
app-server as an infrastructure health/proxy client before the real
remote client connects. App-server initialization also sets the
process-wide `originator` from `client_info.name`, so `codex-backend`
could become the sticky originator for later model/API requests even
after the real client initialized.

## What changed

- Treat `codex-backend` as a non-originating initialize client,
alongside the existing `codex_app_server_daemon` probe client.
- Preserve normal per-connection initialize behavior, including session
metadata and initialize analytics.
- Add regression coverage that verifies `codex-backend` initialize does
not replace the default originator.

## Testing

- `cargo test -p codex-app-server --test all
initialize_codex_backend_does_not_override_originator`
2026-05-13 12:38:34 -07:00
Dylan Hurd
9c691b74d6 chore(config) rm tools.view_image (#22501)
## Summary
It appears this config flag has been broken/a noop for quite some time:
since https://github.com/openai/codex/pull/8850. Let's simplify and get
rid of this.

## Testing
- [x] Updated unit tests
2026-05-13 12:35:37 -07:00
Dylan Hurd
d18a7c982e chore(config) rm Feature::CodexGitCommit (#22412)
## Summary
Removes the unused Feature::CodexGitCommit

## Testing
- [x] tests pass
2026-05-13 12:33:36 -07:00
Alex Daley
a98b57d065 [codex] Reuse Apps MCP path override for plugin-service rollout (#22527)
## Summary
- reuse `apps_mcp_path_override` for the plugin-service rollout,
defaulting enabled boolean overrides to `/ps/mcp` while preserving
explicit configured paths

## Validation
- `just write-config-schema`
- `just fmt`
- `cargo test -p codex-mcp`
- `cargo test -p codex-core apps_mcp_path_override`
- `cargo test -p codex-core
to_mcp_config_preserves_apps_feature_from_config`
- `cargo test -p codex-features`
2026-05-13 19:18:35 +00:00
iceweasel-oai
c9edb26755 windows-sandbox: fail elevated setup when firewall policy is ineffective (#22353)
## Why

Elevated Windows sandbox setup currently assumes that the firewall rules
it writes will take effect. On managed Windows hosts, local firewall
policy changes can be ignored or only partially apply across the active
profiles, which means setup can appear to succeed without providing the
expected network isolation.

## What changed

- Query `INetFwPolicy2::LocalPolicyModifyState` before configuring the
elevated sandbox firewall rules.
- Fail setup when Windows reports that local firewall policy edits are
ineffective or only apply to some current profiles.
- Surface that condition with a dedicated
`helper_firewall_policy_ineffective` setup error code so support and
IT-facing diagnostics can distinguish it from COM access failures.
- Add focused coverage for effective policy, group-policy override, and
partial-profile coverage cases.

## Testing

- `cargo test -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
2026-05-13 18:43:14 +00:00
Adrian
3d2a0b5517 [codex] Scope Windows sandbox write-root capability SIDs (#21479)
## Summary
- fix by scoping Windows workspace-write capability SIDs to active
effective write roots
- build legacy/elevated tokens from only the active effective write
roots
- align setup/audit deny ACL handling with active root-specific SIDs

## Testing
- just fmt
- git diff --check --cached
- just argument-comment-lint
- cargo check -p codex-windows-sandbox --locked (blocked by libwebrtc ->
libyuv fetch: CONNECT tunnel failed, response 403)
2026-05-13 18:28:52 +00:00
Eric Traut
1ae811ddb2 Refactor chatwidget settings surfaces into modules (phase 4) (#22518)
## Why

`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase cleanup to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer. #22407 completed phase 2 by extracting input and
submission flow, and #22433 completed phase 3 by extracting protocol,
replay, streaming, and tool lifecycle handling.

This PR is phase 4. It keeps moving high-churn UI coordination out of
the central widget by extracting settings, popups, and status surfaces
without changing the visible behavior those flows already provide. This
is once again a mechanical movement of existing functions. No functional
changes.

## What Changed

- Added focused modules for runtime settings/model coordination,
model/reasoning/collaboration popups,
settings/personality/theme/audio/experimental popups, permission
prompts, status setup/output controls, and Windows sandbox prompt flows.
- Moved the remaining rate-limit nudge/status helpers and connectors
popup/loading/update helpers into their existing focused modules.
- Preserved the existing picker flows, approval behavior, status/title
setup previews, rate-limit notices, and connectors/app list behavior
while shrinking `chatwidget.rs` back toward orchestration.
- Left `codex-rs/tui/src/chatwidget.rs` as the registration and
composition surface for these extracted behaviors.

## Cleanup Phases

The five-phase cleanup plan from #22269 is:

1. Phase 1: mechanical helper and state moves. Completed in #22269.
2. Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. Completed in #22407.
3. Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
Completed in #22433.
4. Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers. This PR.
5. Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.

## Verification

- `cargo check -p codex-tui`
- `cargo test -p codex-tui chatwidget::tests::permissions`
- `cargo test -p codex-tui chatwidget::tests::status_surface_previews`
- `cargo test -p codex-tui chatwidget::tests::popups_and_settings`
- `cargo test -p codex-tui chatwidget::tests::status_and_layout`

`cargo test -p codex-tui` also compiles and begins running, but aborts
in the unchanged app-side test
`app::tests::discard_side_thread_keeps_local_state_when_server_close_fails`
with a reproducible stack overflow.
2026-05-13 11:26:37 -07:00
pakrym-oai
4454e1411b Deprecate TurnContext cwd and resolve_path (#22519)
## Why

`TurnContext::cwd` and `TurnContext::resolve_path` are being phased out
in favor of using the selected turn environment cwd directly.
Deprecating both APIs makes any new direct dependency visible while
preserving the existing migration path for current callers.

## What Changed

- Marked `TurnContext::cwd` and `TurnContext::resolve_path` as
deprecated with guidance to use the selected turn environment cwd
instead.
- Added exact `#[allow(deprecated)]` suppressions at each existing
direct usage site, including tests, rather than adding crate-wide
suppression.
- Kept the change behavior-preserving: current cwd reads, writes, and
path resolution continue to use the same values.

## Verification

- `just fmt`
- `cargo check -p codex-core`
- `cargo check -p codex-core --tests`
- `git diff --check`
2026-05-13 11:15:25 -07:00
Eric Ning
610b86fefb Pass Codex product SKU to ChatGPT backend (#22366)
# Description

We need to set the appropriate Product SKU for full functionality for
the apps endpoints for each type of client

# Testing

`./target/debug/codex --enable app`

<img width="1786" height="398" alt="CleanShot 2026-05-12 at 11 51 25@2x"
src="https://github.com/user-attachments/assets/2142f768-fc72-4fcb-8f39-9bd0d8569170"
/>

Regular slack flows seem to work, also curling these endpoints with the
correct SKU returns the right apps
2026-05-13 11:00:54 -07:00
jif-oai
fc26af377f feat: expose multi-agent v2 as model-only tools (#22514)
## Why

`code_mode_only` filters code-mode nested tools out of the top-level
tool list. For multi-agent v2, we need a rollout shape where the
collaboration tools remain callable as normal model tools without also
being embedded into the code-mode `exec` tool declaration.

Related to this:
https://openai-corpws.slack.com/archives/C0AQLHB4U75/p1778660267922549

## What Changed

- Adds `features.multi_agent_v2.non_code_mode_only`, including config
resolution, profile override handling, and generated schema coverage.
- Introduces `ToolExposure::DirectModelOnly` so a tool can be included
in the initial model-visible list while staying out of the nested
code-mode tool surface.
- Applies that exposure to the multi-agent v2 tools when the new flag is
set: `spawn_agent`, `send_message`, `followup_task`, `wait_agent`,
`close_agent`, and `list_agents`.
- Updates code-mode-only filtering so direct-model-only tools remain
visible while ordinary nested code-mode tools are still hidden.

## Verification

- Added config parsing/profile tests for `non_code_mode_only`.
- Added tool spec coverage for the code-mode-only multi-agent v2
exposure behavior.
2026-05-13 19:49:47 +02:00
Shijie Rao
157fffc6a4 Revert "Scope macOS signing secrets to release environment" (#22513)
Reverts openai/codex#22443
2026-05-13 10:41:45 -07:00
Owen Lin
2b3b220605 revert: mark Feature::RemoteControl as removed (#22520)
reverts: https://github.com/openai/codex/pull/22386
2026-05-13 17:32:15 +00:00
pakrym-oai
83decfa300 [codex] Remove unused legacy shell tools (#22246)
## Why

Recent session history showed no active use of the raw `shell`,
`local_shell`, or `container.exec` execution surfaces. Keeping those
handlers/specs wired into core leaves duplicate shell execution paths
alongside the supported `shell_command` and unified exec tools.

## What changed

- Removed the raw `shell` handler/spec and its `ShellToolCallParams`
protocol helper.
- Removed the legacy `local_shell` and `container.exec` handler/spec
plumbing while preserving persisted-history compatibility for old
response items.
- Normalized model/config `default` and `local` shell selections to
`shell_command`.
- Pruned tests that exercised removed raw-shell/local-shell/apply-patch
variants and kept coverage on `shell_command`, unified exec, and
freeform `apply_patch`.

## Verification

- `git diff --check`
- `cargo test -p codex-protocol`
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::handlers::shell`
- `cargo test -p codex-core tools::spec`
- `cargo test -p codex-core tools::router`
- `cargo test -p codex-core
active_call_preserves_triggering_command_context`
- `cargo test -p codex-core guardian_tests`
- `cargo test -p codex-core --test all shell_serialization`
- `cargo test -p codex-core --test all apply_patch_cli`
- `cargo test -p codex-core --test all shell_command_`
- `cargo test -p codex-core --test all local_shell`
- `cargo test -p codex-core --test all otel::`
- `cargo test -p codex-core --test all hooks::`
- `just fix -p codex-core`
- `just fix -p codex-tools`
2026-05-13 16:43:25 +00:00
jif-oai
7c7b4861d8 fix: drop underscored id headers (#22193)
## Why
Stop sending duplicate `session_id`/`thread_id` headers. We only want
the hyphenated forms as `_` is rejected by some proxies

Related discussion here:
https://openai.slack.com/archives/C095U48JNL9/p1778508316923179

## What
- Keep `session-id` and `thread-id`
- Remove the underscore aliases
2026-05-13 18:21:02 +02:00
jif-oai
fdda59c00b Introduce tool exposure for deferred registration (#22489)
## Why

Deferred tools were tracked with separate side-channel filtering after
tool specs had already been assembled. That made the registry
responsible for executing tools while the router/spec planner separately
decided whether those same tools should be exposed to the model up
front.

This PR makes exposure part of the tool handler contract so direct
versus deferred availability travels with the executable tool
registration.

Next step will be to simplify registration

## What Changed

- Adds `ToolExposure` to `codex-tools` and exposes it through
`ToolExecutor`, defaulting tools to `Direct`.
- Teaches dynamic tools and MCP handlers to mark deferred tools as
`Deferred` at construction time.
- Renames the registry object-safe wrapper from `AnyToolHandler` to
`RegisteredTool` and uses `ToolExposure` when deciding whether to
include a handler's spec in the initial model-visible tool list.
- Refactors tool spec planning to derive direct specs and deferred
search entries from registered handlers, removing the router's
special-case deferred dynamic tool filtering.

## Verification

- Not run.
2026-05-13 18:16:51 +02:00
Michael Bolin
889ee018e7 config: add strict config parsing (#20559)
## Why

Codex intentionally ignores unknown `config.toml` fields by default so
older and newer config files keep working across versions. That leniency
also makes typo detection hard because misspelled or misplaced keys
disappear silently.

This change adds an opt-in strict config mode so users and tooling can
fail fast on unrecognized config fields without changing the default
permissive behavior.

This feature is possible because `serde_ignored` exposes the exact
signal Codex needs: it lets Codex run ordinary Serde deserialization
while recording fields Serde would otherwise ignore. That avoids
requiring `#[serde(deny_unknown_fields)]` across every config type and
keeps strict validation opt-in around the existing config model.

## What Changed

### Added strict config validation

- Added `serde_ignored`-based validation for `ConfigToml` in
`codex-rs/config/src/strict_config.rs`.
- Combined `serde_ignored` with `serde_path_to_error` so strict mode
preserves typed config error paths while also collecting fields Serde
would otherwise ignore.
- Added strict-mode validation for unknown `[features]` keys, including
keys that would otherwise be accepted by `FeaturesToml`'s flattened
boolean map.
- Kept typed config errors ahead of ignored-field reporting, so
malformed known fields are reported before unknown-field diagnostics.
- Added source-range diagnostics for top-level and nested unknown config
fields, including non-file managed preference source names.

### Kept parsing single-pass per source

- Reworked file and managed-config loading so strict validation reuses
the already parsed `TomlValue` for that source.
- For actual config files and managed config strings, the loader now
reads once, parses once, and validates that same parsed value instead of
deserializing multiple times.
- Validated `-c` / `--config` override layers with the same
base-directory context used for normal relative-path resolution, so
unknown override keys are still reported when another override contains
a relative path.

### Scoped `--strict-config` to config-heavy entry points

- Added support for `--strict-config` on the main config-loading entry
points where it is most useful:
  - `codex`
  - `codex resume`
  - `codex fork`
  - `codex exec`
  - `codex review`
  - `codex mcp-server`
  - `codex app-server` when running the server itself
  - the standalone `codex-app-server` binary
  - the standalone `codex-exec` binary
- Commands outside that set now reject `--strict-config` early with
targeted errors instead of accepting it everywhere through shared CLI
plumbing.
- `codex app-server` subcommands such as `proxy`, `daemon`, and
`generate-*` are intentionally excluded from the first rollout.
- When app-server strict mode sees invalid config, app-server exits with
the config error instead of logging a warning and continuing with
defaults.
- Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
instead of extending shared `ReviewArgs`, so `--strict-config` stays on
the outer config-loading command surface and does not become part of the
reusable review payload used by `codex exec review`.

### Coverage

- Added tests for top-level and nested unknown config fields, unknown
`[features]` keys, typed-error precedence, source-location reporting,
and non-file managed preference source names.
- Added CLI coverage showing invalid `--enable`, invalid `--disable`,
and unknown `-c` overrides still error when `--strict-config` is
present, including compound-looking feature names such as
`multi_agent_v2.subagent_usage_hint_text`.
- Added integration coverage showing both `codex app-server
--strict-config` and standalone `codex-app-server --strict-config` exit
with an error for unknown config fields instead of starting with
fallback defaults.
- Added coverage showing unsupported command surfaces reject
`--strict-config` with explicit errors.

## Example Usage

Run Codex with strict config validation enabled:

```shell
codex --strict-config
```

Strict config mode is also available on the supported config-heavy
subcommands:

```shell
codex --strict-config exec "explain this repository"
codex review --strict-config --uncommitted
codex mcp-server --strict-config
codex app-server --strict-config --listen off
codex-app-server --strict-config --listen off
```

For example, if `~/.codex/config.toml` contains a typo in a key name:

```toml
model = "gpt-5"
approval_polic = "on-request"
```

then `codex --strict-config` reports the misspelled key instead of
silently ignoring it. The path is shortened to `~` here for readability:

```text
$ codex --strict-config
Error loading config.toml:
~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
  |
2 | approval_polic = "on-request"
  | ^^^^^^^^^^^^^^
```

Without `--strict-config`, Codex keeps the existing permissive behavior
and ignores the unknown key.

Strict config mode also validates ad-hoc `-c` / `--config` overrides:

```text
$ codex --strict-config -c foo=bar
Error: unknown configuration field `foo` in -c/--config override

$ codex --strict-config -c features.foo=true
Error: unknown configuration field `features.foo` in -c/--config override
```

Invalid feature toggles are rejected too, including values that look
like nested config paths:

```text
$ codex --strict-config --enable does_not_exist
Error: Unknown feature flag: does_not_exist

$ codex --strict-config --disable does_not_exist
Error: Unknown feature flag: does_not_exist

$ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
```

Unsupported commands reject the flag explicitly:

```text
$ codex --strict-config cloud list
Error: `--strict-config` is not supported for `codex cloud`
```

## Verification

The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
`--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
case, unknown `-c` overrides, app-server strict startup failure through
`codex app-server`, and rejection for unsupported commands such as
`codex cloud`, `codex mcp`, `codex remote-control`, and `codex
app-server proxy`.

The config and config-loader tests cover unknown top-level fields,
unknown nested fields, unknown `[features]` keys, source-location
reporting, non-file managed config sources, and `-c` validation for keys
such as `features.foo`.

The app-server test suite covers standalone `codex-app-server
--strict-config` startup failure for an unknown config field.

## Documentation

The Codex CLI docs on developers.openai.com/codex should mention
`--strict-config` as an opt-in validation mode for supported
config-heavy entry points once this ships.
2026-05-13 16:08:05 +00:00
cassirer-openai
702e6a3c64 [rollout-trace] Add a trace ID to MCP calls. (#22326)
This allows us to connect individual tool calls to the logs of the
invocations.
2026-05-13 16:03:33 +00:00
jif-oai
382404db99 fix: prevent fmt from updating Python SDK lockfile (#22505)
## Why

`just fmt` should align source formatting without resolving dependencies
or rewriting lockfiles. The Python SDK formatting steps run through
`uv`, so differing local `uv` versions could decide the SDK lock was
stale and mutate `sdk/python/uv.lock` before Ruff ran.

## What

- Add `--frozen` to both Python SDK `uv run ... ruff` commands in the
root `fmt` recipe.
- Update the existing Python SDK artifact workflow guard test so future
changes keep the formatter recipe non-lock-mutating.

## Verification

- `uv run --frozen --project ../sdk/python --extra dev pytest
../sdk/python/tests/test_artifact_workflow_and_binaries.py -q`
2026-05-13 17:58:08 +02:00
Eric Traut
8fe0ecb045 Refactor chatwidget protocol flows into modules (phase 3) (#22433)
## Why

`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase cleanup to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer. #22407 completed phase 2 by extracting input and
submission flow.

This PR is phase 3. It keeps moving high-churn event handling out of the
central widget by extracting protocol, replay, streaming, and tool
lifecycle handling without changing the visible behavior those flows
already provide. This is once again just a mechanical movement of
existing functions. No functional changes.

## What Changed

- Added focused modules for protocol request dispatch, replay rendering,
assistant/plan/reasoning streaming, turn runtime bookkeeping, hook
lifecycle handling, command lifecycle handling, tool lifecycle
rendering, and interactive tool request prompts.
- Kept active-cell grouping, transcript invalidation, interrupt
deferral, and final-message separator behavior in the same flows, just
moved into smaller files.
- Added module header comments to the new files so the ownership
boundaries are explicit.
- Left `codex-rs/tui/src/chatwidget.rs` as the registration and
orchestration surface for these extracted behaviors.

## Cleanup Phases

The five-phase cleanup plan from #22269 is:

1. Phase 1: mechanical helper and state moves. Completed in #22269.
2. Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. Completed in #22407.
3. Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
This PR.
4. Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers.
5. Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
2026-05-13 08:52:56 -07:00
jif-oai
2edae8d858 refactor: split memories extension crate modules (#22500)
## Why

The memories extension has several distinct responsibilities:
registering its prompt and tool contributors, enforcing local-memory
filesystem boundaries, implementing list/read/search behavior, and
wrapping that backend as extension tools. Those responsibilities were
concentrated in `lib.rs`, `local.rs`, and the tool modules, which made
follow-up work harder to review and risked growing files through
unrelated edits.

This PR reorganizes the crate so each responsibility has a narrower
owner while preserving the same extension entrypoint and memory tool
behavior.

## What Changed

- Moved extension lifecycle, prompt, and tool registration into
`src/extension.rs`, leaving `src/lib.rs` as the small crate entrypoint.
- Split `LocalMemoriesBackend` helpers into `local/list.rs`,
`local/path.rs`, `local/read.rs`, and `local/search.rs`.
- Centralized tool names and limits at the crate level, and kept the
backend and extension implementation crate-private.
- Made `memory_list`, `memory_read`, and `memory_search` tool executors
generic over `MemoriesBackend`, so tests can exercise the full executor
path without depending on tool internals.
- Consolidated and expanded memory extension tests in `src/tests.rs`,
including read/search tool output coverage, multi-query search, windowed
`all_within_lines`, and legacy `query` rejection.

## Testing

- Not run locally.
2026-05-13 17:39:50 +02:00
Felipe Coury
3d517fbd00 feat(tui): standardize picker navigation keys (#22347)
## Why

Picker-style UI in the TUI has accumulated a mix of hardcoded navigation
keys. Some lists supported page movement, some did not; some accepted
Vim-like keys, while others only accepted arrows; and tabbed or
horizontally adjustable pickers had no shared keymap action for
left/right movement.

This PR makes picker/list navigation consistent and configurable so
users can rely on the same defaults across the TUI.

## What Changed

- Adds shared list keymap actions for:
  - vertical movement: `move_up`, `move_down`
  - horizontal movement: `move_left`, `move_right`
  - paging and jumps: `page_up`, `page_down`, `jump_top`, `jump_bottom`
- Adds defaults:
- Up/down: arrows, `Ctrl+P/N`, `Ctrl+K/J`, and plain `k/j` where text
input is not active
  - Page up/down: `PageUp/PageDown` and `Ctrl+B/F`
  - First/last: `Home/End`
  - Left/right: `Left/Right` and `Ctrl+H/L`
- Wires the shared list keymap through picker and list surfaces
including session resume, multi-select, tabbed selection lists,
settings-style lists, app-link selection, MCP elicitation,
request-user-input, and the OSS selection wizard.
- Keeps search behavior intact by reserving printable characters for
query text in searchable pickers.
- Updates keymap setup actions, config schema, snapshots, and focused
coverage for the new list actions.

## How to Test

1. Start Codex from this branch and open the session picker, for example
with an existing session history.
2. In the session list, verify that `Ctrl+J/K` moves the selection
down/up.
3. Verify that `Ctrl+F/B` pages down/up and `Home/End` jumps to the
first/last visible session.
4. Type printable search text such as `j` or `k` and confirm it updates
the query instead of navigating.
5. Focus a picker control that changes values horizontally, such as a
session picker toolbar control, and verify `Ctrl+H/L` changes the
focused value like left/right arrows.

Targeted tests run:

- `cargo test -p codex-tui keymap::tests::`
- `cargo test -p codex-tui keymap_setup::tests::`
- `cargo test -p codex-tui horizontal_list_keys`
- `cargo test -p codex-tui page_and_jump_navigation_use_list_keymap`
- `cargo test -p codex-tui ctrl_h_l_move_provider_selection`
- `cargo test -p codex-tui scroll_state::tests`
- `cargo test -p codex-tui
switching_tabs_changes_visible_items_and_clears_search`
- `cargo test -p codex-tui toggle_sort_key_reloads_with_new_sort`

Also ran `just write-config-schema`, `just fmt`, `just fix -p
codex-tui`, `just argument-comment-lint`, and `git diff --check`.

Note: `cargo test -p codex-tui` was attempted and still aborts in the
pre-existing
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all` stack
overflow, which is unrelated to this branch.
2026-05-13 15:33:27 +00:00
jif-oai
441c2f818f fix: main (#22503)
Fix main due to conflicting merge
2026-05-13 17:28:37 +02:00
jif-oai
8ba6749932 feat: memories ext (#22498)
First memories extension implementation
Based on memories-mcp tools
2026-05-13 17:14:31 +02:00
jif-oai
34bb85519f feat: add config-change extension contributor (#22488)
## Why

Extensions can observe thread and turn lifecycle events today, but there
was no single host-owned hook for changes to the effective thread
configuration. That makes features that need to react to model,
permission, or tool-suggest updates either depend on individual mutation
paths or risk going stale after runtime config refreshes.

This adds a typed config-change contributor so extension-owned state can
stay synchronized with the effective thread config while the host
remains responsible for deciding when config changed.

## What Changed

- Added `ConfigContributor<C>` to `codex_extension_api`, with
before/after immutable snapshots of the effective config plus
session/thread extension stores.
- Added registry builder/accessor support through `config_contributor`
and `config_contributors`.
- Emits config-change callbacks after committed updates from session
settings, per-turn setting updates, and `refresh_runtime_config`.
- Builds effective config snapshots only when config contributors are
registered, and suppresses no-op callbacks when the before/after
snapshots are equal.
- Added a core session regression test that verifies contributors
observe both model changes and user-layer runtime config changes,
including access to session and thread extension stores.

## Validation

Added `config_change_contributor_observes_effective_config_changes` in
`codex-rs/core/src/session/tests.rs` to cover the new contributor path.
2026-05-13 17:13:34 +02:00
Ahmed Ibrahim
87de4e3290 Add service tier overrides to spawned agents (#22139)
## Why

Spawned agents can already override `model` and `reasoning_effort`, but
they have no equivalent way to opt into a model-supported service tier.
That makes it impossible to preserve or intentionally select tiered
execution behavior when delegating work to a sub-agent, even though the
model catalog already advertises supported `service_tiers`.

## What changed

- Add optional `service_tier` to both legacy and `MultiAgentV2`
`spawn_agent` tool inputs.
- Show each picker-visible model's supported service tier ids and
descriptions in the `spawn_agent` tool guidance.
- Resolve service tier selection after the child agent's effective model
is known.
- Inherit the parent tier when omitted and still supported by the final
child model; otherwise clear it.
- Reject explicit unsupported tier requests with a model-facing error.
- Keep explicit `service_tier` usable on full-history forks, while still
honoring the existing model/reasoning fork restrictions.
- Hide `service_tier` alongside other spawn metadata when
`hide_spawn_agent_metadata` is enabled.

## Verification

Added focused coverage for:

- v1/v2 `spawn_agent` schema exposure for `service_tier`
- tier descriptions in spawn guidance
- hidden-metadata suppression
- explicit supported tier selection
- explicit unknown and unsupported tier rejection
- inherited tier preservation or clearing based on child-model support
- full-history fork acceptance for explicit service tiers in both v1 and
v2

Local Rust tests were not run in this workspace per repo guidance; the
new coverage is included for CI.
2026-05-13 18:11:50 +03:00
Felipe Coury
6f77b70ff3 feat(tui): remove Zellij TUI workarounds (#22214)
## Why

We added Zellij-specific TUI workarounds because older Zellij behavior
did not work with Codex's normal terminal model:

- #8555 made `tui.alternate_screen = "auto"` disable alternate screen in
Zellij so transcript history stayed available.
- #16578 avoided scroll-region operations in Zellij by emitting raw
newlines and using a separate composer styling path.

This PR removes both workarounds because the latest Zellij release
tested locally (`zellij 0.44.1`) works correctly with Codex's standard
TUI behavior: normal alternate-screen handling, redraw, and history
insertion.

## What Changed

- Removed the `InsertHistoryMode::Zellij` path and the Zellij-only
newline scrollback insertion behavior.
- Removed cached `is_zellij` state from the TUI and composer.
- Removed Zellij-specific composer styling, the helper snapshot, and the
`TerminalInfo::is_zellij()` convenience method that only served this
workaround.
- Changed `tui.alternate_screen = "auto"` to use alternate screen for
Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still
preserve the inline mode escape hatch.
- Updated the generated config schema description for
`tui.alternate_screen`.

## How to Test

Manual smoke path used with `zellij 0.44.1`:

1. Build and run this branch inside a Zellij `0.44.1` session with
default config.
2. Start Codex normally and produce enough assistant/tool output to
create scrollback.
3. Confirm the transcript remains readable, the composer renders
normally, and scrolling through terminal history works.
4. Resize the Zellij pane while output exists and confirm the TUI
redraws without duplicated, missing, or stale rows.
5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if
you want to verify the inline fallback still works.

Targeted tests:
- `just write-config-schema`
- `just fmt`
- `just fix -p codex-tui`
- `cargo test -p codex-terminal-detection`
- `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen`

Attempted but did not complete locally:
- `cargo test -p codex-tui` built and ran the new test successfully,
then failed later on unrelated local failures in
`status_permissions_full_disk_managed_*` and a stack overflow in
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.

## Documentation

No developers.openai.com Codex documentation update is needed for this
revert.
2026-05-13 12:11:15 -03:00
jif-oai
68e045a631 Make context contributors async (#22491)
## Summary
- make ContextContributor return a boxed Send future
- await context contributors during initial context assembly
- update existing contributors and extension-api examples for the async
contract

## Testing
- cargo test -p codex-extension-api --examples
- cargo test -p codex-git-attribution
- cargo test -p codex-core
build_initial_context_includes_git_attribution_from_extensions --
--nocapture
- cargo test -p codex-core
build_initial_context_omits_git_attribution_when_feature_is_disabled --
--nocapture
- cargo test -p codex-core (fails in unrelated
agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
stack overflow)
- just fix -p codex-extension-api
- just fix -p codex-git-attribution
- just fix -p codex-core
- cargo clippy -p codex-extension-api --examples
2026-05-13 16:43:28 +02:00
jif-oai
1dcc89f1d4 feat: move extension scope ids into ExtensionData (#22490)
## Summary
- add a scoped level_id to ExtensionData and expose it through
level_id()
- remove thread_id/turn_id parameters from extension contributor inputs
where the scoped ExtensionData already carries that identity
- move turn-scoped extension data onto TurnContext so token usage and
lifecycle contributors can share the same turn store

## Testing
- cargo check -p codex-extension-api -p codex-core --tests
- cargo test -p codex-extension-api
- cargo test -p codex-guardian
- cargo test -p codex-core --lib
record_token_usage_info_notifies_extension_contributors
- cargo test -p codex-core --lib
submission_loop_channel_close_emits_thread_stop_lifecycle
- cargo test -p codex-core --lib
submission_loop_channel_close_aborts_active_turn_before_thread_stop_lifecycle
- just fix -p codex-extension-api
- just fix -p codex-guardian
- just fix -p codex-core
- just fmt

## Note
- Attempted cargo test -p codex-core; it aborted in
agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
with the existing stack overflow before the full suite completed.
2026-05-13 16:13:16 +02:00
Shijie Rao
99157f3797 Scope macOS signing secrets to release environment (#22443)
## Summary
- Split macOS Rust release builds into a dedicated `build-macos` job
- Attach the `macos-signing` environment only to the macOS signing/build
job
- Keep Linux release builds outside the Apple signing environment while
preserving the existing shared release build steps
2026-05-13 06:31:08 -07:00
jif-oai
083c1962f9 feat: add token usage contributor hook (#22485)
## Why

Extensions need a stable place to observe token accounting after Codex
folds model-provider usage into the session's cached `TokenUsageInfo`.
Without a contributor hook, extension-owned features that need last-turn
or cumulative token usage have to duplicate session plumbing or infer
state from client-facing `TokenCount` notifications.

## What changed

- Added `TokenUsageContributor` to `codex-extension-api`, passing
session/thread `ExtensionData`, `ThreadId`, turn id, and the current
`TokenUsageInfo`.
- Added registry builder/storage support for token-usage contributors.
- Invoked registered contributors from
`Session::record_token_usage_info` after the session token cache is
updated and before the client `TokenCount` notification is emitted.

## Testing

- Added `record_token_usage_info_notifies_extension_contributors`,
covering cumulative token usage updates and access to both extension
stores.
2026-05-13 14:32:23 +02:00
jif-oai
fcc2a92743 fix: emit thread stop lifecycle on implicit shutdown (#22482)
## Why

The thread lifecycle contributor hooks from #22476 should observe every
session teardown. The explicit `Op::Shutdown` path already emitted
`on_thread_stop`, but when `submission_loop` exited because its
submission channel closed, it only tore down runtime services. That
meant extensions could miss the thread-stop lifecycle signal on implicit
runtime shutdown.

## What Changed

- Split shared runtime teardown into `shutdown_runtime_services(...)`.
- Split thread-stop lifecycle emission into
`emit_thread_stop_lifecycle(...)`.
- Reused those helpers from both explicit shutdown and the channel-close
shutdown path.
- Tracked whether `Op::Shutdown` was received so the explicit path does
not double-emit lifecycle events after it exits the loop.
- Added a regression test that closes the submission channel and asserts
`ThreadLifecycleContributor::on_thread_stop` runs once with the expected
thread/session stores.

## Testing

- `cargo test -p codex-core
submission_loop_channel_close_emits_thread_stop_lifecycle`
2026-05-13 14:19:57 +02:00
jif-oai
27e67a8c2a feat: add turn lifecycle contributors (#22480)
## Why

Extensions can already contribute prompt, tool, turn-item, and
thread-lifecycle behavior, but there was no explicit host-owned hook for
per-turn setup and cleanup. That makes extension-private turn state
awkward: an extension either has to stash it outside the turn lifecycle
or depend on core runtime objects.

This adds a small turn lifecycle boundary. Extensions receive stable
identifiers plus the existing session, thread, and turn `ExtensionData`
stores, while core keeps owning task scheduling, cancellation, and turn
teardown.

## What Changed

- Added `TurnLifecycleContributor` with `on_turn_start`, `on_turn_stop`,
and `on_turn_abort` callbacks in `codex-rs/ext/extension-api`.
- Added typed `TurnStartInput`, `TurnStopInput`, and `TurnAbortInput`
payloads that expose `thread_id`, `turn_id`, `session_store`,
`thread_store`, and `turn_store`.
- Registered and re-exported turn lifecycle contributors through
`ExtensionRegistry` and `ExtensionRegistryBuilder`.
- Wired `Session` to emit turn start, stop, and abort callbacks from the
existing turn/task lifecycle paths.
- Carried the turn-scoped `ExtensionData` through `RunningTask` and
`RemovedTask` so stop/abort callbacks receive the same turn store
created at turn start.

## Verification

- Not run locally.
2026-05-13 13:47:27 +02:00
jif-oai
e831db7a96 nit: codeowners (#22479) 2026-05-13 13:38:46 +02:00
jif-oai
5ab7e6b4c6 feat: add thread lifecycle contributor hooks (#22476)
## Why

Extensions that need thread-scoped state currently only get a start-time
callback. That is enough for seeding stores, but it leaves the host
without a shared extension seam for later thread rehydrate and flush
work as thread ownership evolves. This PR turns that start-only seam
into a host-owned thread lifecycle contributor contract so
extension-private state can stay behind the extension API instead of
leaking extra orchestration through core.

## What changed

- Replaced `ThreadStartContributor` with `ThreadLifecycleContributor`
and added typed lifecycle inputs for thread start, resume, and stop. The
contract lives in
[`contributors/thread_lifecycle.rs`](d0e9211f70/codex-rs/ext/extension-api/src/contributors/thread_lifecycle.rs (L1-L64)).
- Kept the existing start-time behavior intact by routing session
construction through `on_thread_start`.
- Invoked `on_thread_stop` during session shutdown before thread-scoped
extension state is dropped, while isolating contributor failures behind
warning logs.
- Migrated `git-attribution` and `guardian` onto the lifecycle
registration path.
- Renamed the extension registry plumbing from start-specific
contributors to lifecycle-specific contributors.

## Notes

`on_thread_resume` is introduced at the API boundary here so extensions
can target the final lifecycle shape; host resume dispatch can be wired
where that runtime path is finalized.
2026-05-13 13:11:30 +02:00
xli-oai
7fbd342fb3 [codex] isolate plugin/list from config serialization queue (#22437)
## Summary
- move `plugin/list` from the shared `config` read queue onto a
dedicated `plugin-list` shared-read queue
- move `plugin/read` onto that same dedicated shared-read queue as well
- keep the existing scheduler behavior unchanged
- allow plugin list/read operations to proceed independently of
config-family writes, accepting temporary stale or transient read errors
during concurrent mutations

## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
2026-05-13 11:05:57 +00:00
cassirer-openai
842ac74b9c [app-server] Gate login issuer override constant (#22338)
Gate the debug-only login issuer override constant so release builds no
longer warn that it is unused.
2026-05-13 10:43:18 +00:00
jif-oai
9c5dfa7b1a Refactor extension tools onto shared ToolExecutor (#22369)
## Why

Extension tools were split across two public runtime contracts:
`codex-tool-api` exposed `ToolBundle` plus its own call/spec/error
types, while core native tools used `codex_tools::ToolExecutor`. That
made contributed tool specs and execution behavior easy to drift apart
and added another crate boundary for what should be one executable-tool
seam.

This PR makes `ToolExecutor` the single runtime contract and keeps
extension-specific pinning in `codex-extension-api`.

## Remaining todo

https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5
Either generic `Invocation` or sub-extract the `ToolCall` and clean
`ToolInvocation`

## What changed

- Removed the `codex-tool-api` workspace crate and its dependencies from
core and `codex-extension-api`.
- Made `codex_tools::ToolExecutor` object-safe with `async_trait` so
extension contributors can return a dyn executor.
- Added the extension-facing aliases under
`ext/extension-api/src/contributors/tools.rs`, including
`ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output =
ExtensionToolOutput>`.
- Changed `ToolContributor::tools` to return extension executors
directly instead of `ToolBundle`s.
- Updated core’s extension tool handler/registry/router path to adapt
those extension executors into the existing native `ToolInvocation`
runtime path.
- Added focused coverage for extension tools being registered,
model-visible, dispatchable, and not replacing built-in tools.

## Verification

- `cargo test -p codex-tools`
- `cargo test -p codex-extension-api`
2026-05-13 12:12:06 +02:00
jif-oai
1824685a00 feat: extract shared tool executor interface (#22359)
## Why

Codex still models model-visible tools and executable behavior largely
inside `codex-core`, which makes it harder to evolve the tool system
toward a single reusable abstraction for built-ins, MCP-backed tools,
dynamic tools, and later tools injected from outside core.

This PR takes the next incremental step in that direction by moving the
common execution-facing pieces out of core and separating them from
core-only orchestration. The intent is to let shared tool abstractions
improve in one place, while `codex-core` keeps the parts that are still
inherently host-specific today, such as `ToolInvocation`, dispatch
wiring, and hook integration.

This PR is mostly moving things around. The only interesting piece is
this abstraction:
https://github.com/openai/codex/pull/22359/changes#diff-81af519002548ba51ed102bdaaf77e081d40a1e73a6e5f9b104bbbc96a6f1b3dR13

## What changed

- Added `codex_tools::ToolExecutor<Invocation>` as the shared execution
trait for model-visible tools.
- Moved the reusable execution support types from `codex-core` into
`codex-tools`:
  - `FunctionCallError`
  - `ToolPayload`
  - `ToolOutput`
- Refactored core tool implementations so that execution behavior lives
on `ToolExecutor<ToolInvocation>`, while `ToolHandler` remains the
core-local extension point for hook payloads, telemetry tags, diff
consumers, and other orchestration concerns.
- Kept the registry and dispatch flow behaviorally unchanged while
making the shared/extracted boundary explicit across built-in, MCP,
dynamic, extension-backed, shell, and multi-agent tool handlers.

## Verification

- `cargo test -p codex-tools`
- `just fix -p codex-tools`
- `just fix -p codex-core`
- `cargo test -p codex-core` progressed through the updated tool
surfaces and then hit the existing unrelated multi-agent stack overflow
in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.
2026-05-13 11:31:27 +02:00
jif-oai
155c04ad40 extension-api: add approval review contributor flow (#22344)
## Why

`codex-extension-api` needs an approval hook that lets an installed
extension own a rendered approval-review prompt and produce the final
`ReviewDecision`. The prior interceptor stub only exposed a yes/no claim
and did not model the review result itself, which left the host with the
missing half of the control flow.

## What changed

- Replaces `ApprovalInterceptorContributor` with
[`ApprovalReviewContributor`](c49d17531e/codex-rs/ext/extension-api/src/contributors.rs (L43-L55)),
which may claim a rendered prompt and return an async `ReviewDecision`.
- Re-exports the new contributor and future types from `extension-api`.
- Adds registry support through `approval_review_contributor(...)` plus
[`ExtensionRegistry::approval_review(...)`](c49d17531e/codex-rs/ext/extension-api/src/registry.rs (L90-L101)),
which returns the first installed contributor that claims the prompt.
2026-05-13 10:39:12 +02:00
jif-oai
7e97da7c13 chore: Keep view_image sandbox test in temp dir (#22355)
## Summary
- move the `view_image` sandbox filesystem-read unit test onto a
temporary cwd
- keep the turn cwd and selected turn environment cwd aligned inside the
test
- avoid leaving `core/image.png` behind in the repo checkout after the
test runs

## Root cause
The test wrote `image.png` beneath `turn.cwd`, and the shared session
test helper defaults that cwd to the current repo directory when no
override is provided.

## Validation
- `just fmt`
- `cargo test -p codex-core
tools::handlers::view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads`
2026-05-13 10:39:07 +02:00
xl-openai
2a67c46de4 feat: Add plugin share checkout (#22435)
Adds plugin/share/checkout to turn a shared remote plugin into a local
working copy under ~/plugins/<name>.

Registers the copy in the managed personal marketplace and records the
remote-to-local mapping for later share/save flows.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-13 00:50:29 -07:00
Abhinav
392e94e9ea add --dangerously-bypass-hook-trust CLI flag (#21768)
# Why

Hook trust happens through the TUI in `/hooks` so it can block
non-interactive use cases. This flag will allow users that are using
codex headlessly to bypass hooks when they want to.

# What

This adds one invocation-scoped escape hatch.

- the CLI flag sets a runtime-only `bypass_hook_trust` override; there
is no durable `config.toml` setting
- hook discovery still respects normal enablement, so explicitly
disabled hooks remain disabled
- we show a `--dangerously-bypass-hook-trust is enabled. Enabled hooks
may run without review for this invocation.` message on startup so
accidental use is visible in both interactive and exec flows

This keeps “enabled” and “trusted” as separate concepts in the normal
path, while giving CI/E2E callers a stable way to opt into the
exceptional path when they already control the hook set.
2026-05-13 07:13:57 +00:00
Abhinav
934a40c7d9 Use root repo hooks in linked worktrees (#21969)
# Why

Linked worktrees currently load their own project hook declarations, so
the same repo can present different hook definitions depending on which
checkout is active. https://github.com/openai/codex/pull/21762 tried to
share trust by giving matching worktree hooks a shared synthetic key,
but review pointed out that divergent worktree hook definitions would
then fight over one `trusted_hash`.

Instead of introducing a second trust model, this makes linked worktrees
use the root checkout as the single source of truth for project hook
declarations. Worktree-local project config can still diverge for
unrelated settings, but project hooks now keep one real source path and
one trust state per repo.

# What

- Teach project config loading to remember the matching root-checkout
`.codex/` folder for actual linked-worktree project layers.
- Keep ordinary project config sourced from the worktree, but replace
project hook declarations with the root checkout's matching layer before
hook discovery runs, including linked-worktree layers with `.codex/` but
no local `config.toml`.
- Make hook discovery use that authoritative hook folder for both
`hooks.json` and TOML hook source paths, so linked worktrees produce the
same hook key and trust state as the root checkout.
- Cover the linked-worktree path plus regressions for missing worktree
`config.toml` and nested non-worktree project roots.
2026-05-13 06:58:58 +00:00
sayan-oai
2304ec45ca Remove unavailable MCP placeholder tool backfill (#22439)
## Why

`UnavailableDummyTools` kept synthetic placeholder tools alive for
historical tool calls whose backing MCP tool was no longer available.
That path adds stale model-visible tool specs and special routing at the
point where unavailable MCP calls should use ordinary current-tool
handling. This removes the runtime backfill instead of preserving a
second compatibility lane.

## Is it safe to remove?

The unavailable tools were added in #17853 after a CS issue when a
previously-called MCP tool failed to load and was omitted from the CS
spec. Now that we have tool search, I think this is resolved:
- API merges tools from previous TST output into effective tool set so
theyre always in CS spec
- if an MCP tool surfaced by TST later becomes unavailable, the model
can still call it and it will just return model-visible error
- both TST output and function call output are dropped on compaction so
model will not remember old calls to MCP post compaction

## What changed

- Delete unavailable-tool collection, placeholder handler, router/spec
plumbing, and obsolete placeholder coverage.
- Keep `features.unavailable_dummy_tools` as a removed no-op feature
tombstone so existing configs still parse cleanly.
- Add an integration-style `tool_search` regression test showing that a
deferred MCP tool surfaced through `tool_search` still routes through
MCP and returns a model-visible tool-call error rather than `unsupported
call`.

## Verification

- `cargo test -p codex-core tool_search`
2026-05-12 23:30:13 -07:00
Eric Traut
2630a6ca35 Refactor chatwidget input flow into modules (#22407)
## Why

`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase effort to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer.

This PR is phase 2 of that plan. It extracts the input and submission
flow as a mechanical move before the later protocol, popup/status, and
constructor/orchestration phases.

## What Changed

- Added `codex-rs/tui/src/chatwidget/input_flow.rs` for composer input
results, queued user-message draining, pending-input previews, and
mode-specific submission entry points.
- Added `codex-rs/tui/src/chatwidget/input_submission.rs` for
user-message construction/submission, shell prompt submission,
structured mention resolution, and blocked image draft restoration.
- Added `codex-rs/tui/src/chatwidget/input_restore.rs` for
initial-message submission, pending steer restoration after interrupts,
and thread input snapshot/restore behavior.
- Registered the new modules and removed the moved `ChatWidget` impl
methods from `codex-rs/tui/src/chatwidget.rs`.

## Follow-On Refactor Phases

The five-phase plan from #22269 is:

- Phase 1: mechanical helper and state moves. Completed in #22269.
- Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. This PR.
- Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
- Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers.
- Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
2026-05-12 21:17:35 -07:00
Eric Traut
ad572709ab Add support for UDS in codex --remote (#22414)
## Why

Added support for UDS connections in `codex --remote`.

TUI also now connects to local app-server using UDS by default if it is
running and set to listen to UDS connection.

## What Changed

- Introduced `RemoteAppServerEndpoint` with `WebSocket` and `UnixSocket`
variants.
- Reused the existing JSON-RPC-over-WebSocket protocol over either a TCP
WebSocket stream or a UDS stream.
- Updated `codex --remote` to accept `ws://host:port`,
`wss://host:port`, `unix://`, and `unix://PATH`.
- Kept `--remote-auth-token-env` restricted to `wss://` and loopback
`ws://` remotes.
- Added a fast TUI startup probe for the default daemon socket, falling
back to the embedded app server when the daemon is absent or
unresponsive.

## Verification

- Manually verified that the updated remote flow works.
- Added coverage for UDS remote round trips, WebSocket auth headers,
auth-token transport policy, remote address parsing, and missing-daemon
fallback.
- Ran focused remote test coverage locally.
2026-05-12 21:17:20 -07:00
xl-openai
7bf95b39aa feat: Split shared workspace plugins by discoverability (#22425)
- Keep shared-with-me as the plugin/list request kind, but return
private plugins under workspace-shared-with-me-private.
- Add workspace-shared-with-me-unlisted for installed workspace plugins
with UNLISTED discoverability,
2026-05-12 21:11:19 -07:00
pakrym-oai
104fc14956 Encapsulate tool search entries in handlers (#22261)
## Why

This builds on the handler-owned spec refactor by moving deferred
tool-search metadata to the same handlers that already own tool specs.
The registry builder no longer needs a separate prebuilt
`tool_search_entries` path; it can collect searchable entries from
deferred handlers directly.

## What changed

- Added `search_info()` to tool handlers and implemented it for MCP and
dynamic handlers.
- Reused handler `spec()` output when constructing tool-search entries,
adapting it into the deferred `LoadableToolSpec` shape expected by
`tool_search`.
- Simplified `build_tool_registry_builder(...)` so `tool_search`
registration is based on deferred handlers with search info.
- Removed the old standalone search-entry builders and now-unused
`codex-tools` discovery helper exports.

## Verification

- `cargo test -p codex-core tools::handlers::tool_search::tests:: --
--nocapture`
- `cargo test -p codex-core tools::spec_plan::tests::search_tool --
--nocapture`
- `cargo test -p codex-core tools::spec::tests:: -- --nocapture`
- `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture`
- `cargo test -p codex-tools`
- `just fix -p codex-core`
- `just fix -p codex-tools`
2026-05-12 20:48:02 -07:00
pakrym-oai
67c8486462 tools: infer code-mode namespace descriptions from specs (#22406)
## Why

Code mode already builds the merged nested `ToolSpec`s that feed the
`exec` prompt. Keeping a separate `tool_namespaces` map in the planning
path duplicated that metadata and left extra wrapper plumbing in
`spec.rs`.

## What changed

- derive code-mode namespace descriptions from the merged
`ToolSpec::Namespace` entries before building the code-mode handlers
- extract `build_code_mode_handlers(...)` so the code-mode-specific
planning stays in one place
- remove `tool_namespaces` from `ToolRegistryBuildParams`
- delete the now-unused `McpToolPlanInputs` wrapper and related test
helper plumbing

## Testing

- `cargo test -p codex-core spec_plan`
2026-05-12 20:47:50 -07:00
pakrym-oai
96833c5b15 Remove CODEX_RS_SSE_FIXTURE test hook (#22413)
## Why

`CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
bypass the normal Responses transport by reading SSE from local files.
That kept test-only behavior wired through production client code. The
affected tests can stay hermetic by using the existing
`core_test_support::responses` mock server and passing `openai_base_url`
instead.

## What Changed

- Removed the `CODEX_RS_SSE_FIXTURE` flag,
`codex_api::stream_from_fixture`, the `env-flags` dependency, and the
checked-in SSE fixture files.
- Repointed the affected core, exec, and TUI tests at `MockServer` with
the existing SSE event constructors.
- Removed the Bazel test data plumbing for the deleted fixtures and
refreshed cargo/Bazel lock state.

## Verification

- `cargo build -p codex-cli`
- `cargo test -p codex-api`
- `cargo test -p codex-core --test all responses_api_stream_cli`
- `cargo test -p codex-core --test all
integration_creates_and_checks_session_file`
- `cargo test -p codex-exec --test all ephemeral`
- `cargo test -p codex-exec --test all resume`
- `cargo test -p codex-tui --test all
resume_startup_does_not_consume_model_availability_nux_count`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
- `git diff --check`
2026-05-13 03:08:01 +00:00
Andrei Eternal
913aad4d3c Add allow_managed_hooks_only hook requirement (#20319)
## Why

Enterprise-managed hook policy needs a narrow way to require Codex to
ignore user-controlled lifecycle hooks without adopting the broader
trust-precedence model from earlier hook work. This keeps the policy
anchored in `requirements.toml`, so admins can opt into managed hooks
only while normal `config.toml` files cannot enable the restriction
themselves.

## What changed

- Added `allow_managed_hooks_only` to the requirements data flow and
preserved explicit `false` values.
- Also adds it to /debug-config
- Marked MDM, system, and legacy managed config layers as managed for
hook discovery.
- Updated hook discovery so `allow_managed_hooks_only = true`:
  - keeps managed requirements hooks and managed config-layer hooks,
- skips user/project/session `hooks.json` and `[hooks]` entries with
concise startup warnings,
  - skips current unmanaged plugin hooks,
- ignores any `allow_managed_hooks_only` key placed in ordinary
`config.toml` layers.
2026-05-12 19:05:25 -07:00
Andrei Eternal
fbfbfe5fc5 hooks: use new session IDs instead of thread IDs for hooks, apply parent's session ID to subagents' hooks (#22268)
## Why

hook semantics treat `session_id` as shared across a root session and
its subagents. Codex hooks were still emitting the current thread ID,
which made spawned agents look like independent sessions and made it
harder for hook integrations to correlate work across a root thread and
its spawned helpers

This change makes hooks use Codex's existing shared session identity so
hook `session_id` matches the root-thread session across spawned
subagents.

## What Changed

- switch hook payloads to use the existing shared session identity from
core instead of the current thread ID
- cover all hook surfaces that expose `session_id`, including
`SessionStart`, tool hooks, compact hooks, prompt-submit hooks, stop
hooks, and legacy after-agent dispatch
2026-05-12 19:05:10 -07:00
Celia Chen
e2eb7c30fe feat: route guardian review model selection through providers (#22258)
## Why

Guardian review selection was hard-coded in `core`, which worked for the
default OpenAI path but did not give provider implementations a way to
choose backend-specific reviewer model IDs. That matters for Amazon
Bedrock: guardian review should run through the Bedrock/Mantle provider
using Bedrock's `openai.gpt-5.4` model ID, instead of accidentally
selecting a reviewer model that implies the OpenAI backend.

## What Changed

- Added provider-owned approval review model selection via
`ModelProvider::approval_review_model_selection`.
- Moved the existing default selection policy into the provider
abstraction: prefer the requested reviewer model when it is available,
otherwise fall back to the active turn model, preferring `Low` reasoning
when supported.
- Added an Amazon Bedrock override that pins guardian review to
`openai.gpt-5.4` with `Low` reasoning.
2026-05-13 01:55:46 +00:00
Eric Traut
51bfb5f3b1 Restore app-server websocket listener with auth guard (#22404)
## Why
PR #21843 removed the TCP websocket app-server listener, but that also
removed functionality that still needs to exist. Restoring it as-is
would reopen the old remote exposure problem, so this keeps the restored
listener while making remote and non-loopback usage require explicit
auth.

## What Changed
- Mostly reverts #21843 and reapplies the small merge-conflict
resolutions needed on top of current main.
- Restores ws://IP:PORT parsing, the app-server TCP websocket acceptor,
websocket auth CLI flags, and the associated tests.
- The only intentional behavior change from the restored code is that
non-loopback websocket listeners now fail startup unless --ws-auth
capability-token or --ws-auth signed-bearer-token is configured.
Loopback listeners remain available for local and SSH-forwarding
workflows.

## Reviewer Focus
Please focus review on the small auth-enforcement delta layered on top
of the revert:

- codex-rs/app-server-transport/src/transport/websocket.rs:
start_websocket_acceptor now rejects unauthenticated non-loopback
websocket binds before accepting connections.
- codex-rs/app-server-transport/src/transport/auth.rs: helper logic
classifies unauthenticated non-loopback listeners.
- codex-rs/app-server/tests/suite/v2/connection_handling_websocket.rs:
tests cover unauthenticated ws://0.0.0.0 startup rejection and
authenticated non-loopback capability-token startup.

Everything else is intended to be revert/merge-conflict restoration
rather than new product behavior.

## Verification

- Manually verified that TUI remoting is restored and that auth is
enforced for non-localhost urls.
2026-05-12 18:40:53 -07:00
xl-openai
d1430fd61e feat: Expose plugin versions and gate plugin sharing (#22397)
- Adds localVersion to plugin summaries and remoteVersion to share
context, including generated API schemas.
- Hydrates local and remote plugin versions from manifests and remote
release metadata.
- Adds default-on plugin_sharing gate for shared-with-me listing and
plugin/share/save, with disabled-path errors
    and focused coverage.
2026-05-12 17:56:30 -07:00
efrazer-oai
01b4817bac docs(skills): simplify plugin creator deeplink shape (#22240)
## Summary

Plugin Creator now documents the shorter local-plugin handoff URL that
the app can interpret directly.
[#22221](https://github.com/openai/codex/pull/22221) teaches the skill
to end marketplace-backed creation flows with named View and Share
links; this follow-up updates those examples so the skill only emits the
normalized plugin name, the absolute marketplace path, and optional
share mode.

The documented shape is:

```txt
codex://plugins/<normalized-plugin-name>?marketplacePath=<absolute-marketplace-json-path>
codex://plugins/<normalized-plugin-name>?marketplacePath=<absolute-marketplace-json-path>&mode=share
```

The skill text now states exactly where the normalized plugin name
belongs, exactly where the absolute marketplace path belongs, and that
it should not add `pluginName` or `hostId` query parameters.

## Testing

Tests: plugin-creator skill validation.
2026-05-13 00:54:52 +00:00
Owen Lin
2237a13cf1 mark Feature::RemoteControl as removed (#22386)
## Why

`remote_control` can appear in `config.toml`, CLI feature overrides, and
the app-server config APIs. Before this PR, app-server startup treated
`config.features.enabled(Feature::RemoteControl)` as the signal to start
remote control ([base
code](5e3ee5eddf/codex-rs/app-server/src/lib.rs (L678-L680))).
That meant a user with:

```toml
[features]
remote_control = true
```

would accidentally opt every app-server process into remote control.
Remote-control startup should instead be a per-process launch decision
made by CLI flags.

## What Changed

- Marks `Feature::RemoteControl` as `Stage::Removed`, keeping
`remote_control` as a known compatibility key while making it
config-inert.
- Adds a hidden `--remote-control` process flag to `codex app-server`
and standalone `codex-app-server`.
- Plumbs that flag through
`AppServerRuntimeOptions.remote_control_enabled` and makes app-server
startup use only that runtime option to decide whether to start remote
control.
- Removes the app-server config mutation hook that reloaded config and
toggled remote control at runtime.
- Updates managed daemon spawning to use `codex app-server
--remote-control --listen unix://` instead of `--enable remote_control`.

Config APIs can still list, read, write, and set `remote_control`; those
operations just no longer affect remote-control process enrollment.
2026-05-13 00:52:45 +00:00
sayan-oai
1ae9867296 [codex] Remove tool search bucket limit override (#22381)
## Why

`tool_search` still carries the server-specific result-cap path added in
#17684 for `computer-use`: when the model omitted `limit`, a matching
result expanded the search to 20 and then `limit_results_by_bucket`
applied per-bucket caps. That makes default result handling depend on a
one-off server exception instead of the single
`TOOL_SEARCH_DEFAULT_LIMIT` path.

This PR removes that custom branch so omitted `limit` values use the
ordinary global default consistently. The implementation being retired
is the pre-change bucketed search path in
[`tool_search.rs`](5e3ee5eddf/codex-rs/core/src/tools/handlers/tool_search.rs (L121-L190)).

## What changed

- Collapse `ToolSearchHandler::search` back to one BM25 search with the
resolved limit.
- Remove `limit_results_by_bucket`, the `computer-use` constants, and
the omitted-limit plumbing that only existed for the override.
- Drop dead `ToolSearchEntry::limit_bucket` metadata from deferred MCP
and dynamic search entries.
- Remove tests and helpers that only asserted the deleted override
behavior.
- Add direct handler-level unit coverage for omitted/default and
explicit `tool_search` result limits.

## Validation

- `cargo test -p codex-core tool_search`
- The matching unit tests passed, including the new omitted/default and
explicit result-limit coverage.
- The broader `--test all` search-tool fixture phase then failed before
sending mocked response requests in
`tool_search_indexes_only_enabled_non_app_mcp_tools` and
`tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`.
- `cargo test -p codex-core`
- The touched tool-search coverage passed before the run later aborted
in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`
with a stack overflow.
2026-05-13 00:46:07 +00:00
Eric Traut
92930a8d40 Refactor chatwidget state into modules (#22269)
## Why

`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. After #21866 consolidated some of the state it tracks, this
starts the next phase by moving coherent state/helper clusters out of
the main module without changing behavior.

This PR is intentionally mechanical: it only moves existing functions,
structs, and helpers into focused modules so the boundaries are easier
to review before the less mechanical refactors that should follow.

## What Changed

- Moved user-message, composer, queue, pending steer, and merge/remap
helpers into `codex-rs/tui/src/chatwidget/user_messages.rs`.
- Added `codex-rs/tui/src/chatwidget/exec_state.rs` for unified exec
bookkeeping helpers.
- Added `codex-rs/tui/src/chatwidget/rate_limits.rs` for rate-limit
warning, prompt, and error classification state.
- Moved plugin list fetch and install auth-flow state into
`codex-rs/tui/src/chatwidget/plugins.rs`.
- Made a couple of test-only `VecDeque` imports explicit now that those
tests no longer inherit the parent module import.

## Verification

- `cargo test -p codex-tui` was run

## Follow-On Refactor Phases

This PR is phase 1: mechanical helper and state moves. Planned follow-up
PRs:

- Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior.
- Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
- Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers.
- Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
2026-05-12 17:33:33 -07:00
Tom
c51c65ad09 Unify thread metadata updates above store (#22236)
- make ThreadStore::update_thread_metadata accept a broad range of
metadata patches
- keep ThreadStore::append_items as raw canonical history append (no
metadata side effects)
- in the local store, write these metadata updates to a combination of
sqlite and rollout jsonl files for backwards-compat. It special cases
which fields need to go into jsonl vs sqlite vs whatever, confining the
awkwardness to just this implementation
- in remote stores we can simply persist the metadata directly to a
database, no special casing required.
- move the "implicit metadata updates triggered by appending rollout
items" from the RolloutRecorder (which is local-threadstore-specific) to
the LiveThread layer above the ThreadStore, inside of a private helper
utility called ThreadMetadataSync. LiveThread calls ThreadStore
append_items and update_metadata separately.
- Add a generic update metadata method to ThreadManager that works on
both live threads and "cold" threads
- Call that ThreadManager method from app server code, so app server
doesn't need to worry about whether the thread is live or not
2026-05-13 00:28:15 +00:00
pakrym-oai
f11ad1eacb [codex] Add search term coverage for tool_search (#22398)
## Why

`tool_search` already had solid end-to-end coverage for discovery and
follow-up execution, but it did not prove that distinct pieces of
indexed search text actually work in integration. In particular, we were
not exercising whether unique tool names, descriptions, namespaces,
underscore-expanded dynamic names, and schema-property terms were
sufficient to surface the expected deferred tools.

This change adds focused integration coverage for those term sources so
regressions in search text construction are caught by a real `TestCodex`
flow instead of only by lower-level unit tests.

## What changed

- added a small helper in `core/tests/suite/search_tool.rs` to assert
that a `tool_search_output` contains an expected namespace child tool
- added an MCP integration test that issues several `tool_search_call`s
and verifies distinct query terms match the expected app tools:
  - exact tool name: `calendar_timezone_option_99`
  - tool description phrase: `uploaded document`
  - top-level schema property: `starts_at`
- added a dynamic-tool integration test that verifies distinct query
terms match the expected deferred dynamic tool:
  - exact name: `quasar_ping_beacon`
  - underscore-expanded name: `quasar ping beacon`
  - description phrase: `saffron metronome`
  - namespace: `orbit_ops`
  - schema property: `chrono_spec`

## Validation

- `cargo test -p codex-core tool_search_matches_`

## Docs

No documentation update needed.
2026-05-13 00:24:07 +00:00
Michael Bolin
9e7cdbd0d2 core: box multi-agent handler futures (#22266)
## Why

This is the base PR in the split stack for the permissions migration. It
isolates stack-safety work that had been mixed into the larger
permissions PR, so reviewers can evaluate the async-future changes
separately from the permissions model changes in #22267.

The main risk this addresses is large or recursive multi-agent futures
overflowing smaller runner stacks. A follow-up review also called out
that `shutdown_live_agent` must remain quiescent: callers should not
remove a live agent from tracking or release its spawn slot until the
worker loop has actually terminated.

## What Changed

- Boxes the large async futures in the multi-agent spawn, resume, and
close tool handlers.
- Boxes the `AgentControl` spawn and recursive close/shutdown paths that
can otherwise build very deep futures.
- Keeps `shutdown_live_agent` waiting for thread termination before
removing/releasing the live agent, preserving the previous shutdown
ordering while still boxing the recursive close path.

## Verification Strategy

The focused local coverage was `cargo test -p codex-core multi_agents`,
which exercises the multi-agent spawn/resume/close handlers, cascade
close/resume behavior, and the shutdown path touched by this PR.












---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22266).
* #22330
* #22329
* #22328
* #22327
* __->__ #22266
2026-05-12 17:22:25 -07:00
Channing Conger
589b820d6e code-mode: Add pending-aware code mode execution (#22280)
Introduce execute_to_pending and wait_to_pending APIs that freeze
pending-mode runtimes until an explicit resume, while preserving the
existing continuously-running execute path. Add runtime and service
coverage for pending, resume, completion, and freeze behavior.
2026-05-12 17:16:57 -07:00
pakrym-oai
0173f71143 Refactor namespaced tool spec registration (#22256)
## Summary

This refactor makes tool handlers the owner of the specs they can
publish, so registry construction can register handlers once and
separately publish only the specs that should be model-visible.

The main motivation is deferred tools: MCP and dynamic tools still need
handlers registered up front, but deferred tools should be discoverable
through `tool_search` rather than emitted in the initial tool spec list.

## What changed

- `McpHandler` and `DynamicToolHandler` can return their own `ToolSpec`.
- `build_tool_registry_builder` now collects handlers, registers them
through the no-spec path, and publishes only non-deferred handler specs.
- Deferred MCP and dynamic tool names are combined into one
`all_deferred_tools` set that drives spec filtering, code-mode
deferred-tool signaling, and `tool_search` registration.
- `tool_search` registration now requires both deferred tools and
`namespace_tools`.
- Namespace specs are merged in `spec_plan`, preserving top-level spec
order, sorting tools within each namespace, and backfilling empty
namespace descriptions.
- Hosted web search and image-generation specs are included in the
collected spec vector before namespace merge/publication, and tool-name
tests that should not care about hosted relative order now compare sets.

## Testing

- `cargo test -p codex-core tools::spec::tests:: -- --nocapture`
- `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture`
- `cargo test -p codex-core
tools::router::tests::specs_filter_deferred_dynamic_tools --
--nocapture`
- `cargo test -p codex-core
suite::prompt_caching::prompt_tools_are_consistent_across_requests --
--nocapture`
- `just fmt`
- `just fix -p codex-core`
- `cargo test -p codex-core -- --skip
tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`
passed the library suite after skipping the known stack-overflowing unit
test.

Full `cargo test -p codex-core` currently hits a stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`;
the same focused test reproduces on `origin/main`.
2026-05-12 17:09:14 -07:00
richardopenai
b6e718591b [codex] Remove workspace owner usage nudge gate (#20509)
## Summary
- make workspace owner nudge handling unconditional in the TUI now that
it is fully rolled out
- keep `workspace_owner_usage_nudge` as a removed no-op compatibility
flag so old configs/app overrides remain accepted during rollout
- remove flag-disabled test setup

## Companion PR
- https://github.com/openai/openai/pull/876351 removes the Codex Apps
Statsig rollout gate override after this change is available to the
app/runtime path

## Validation
- `just write-config-schema`
- `just fmt`
- `cargo test -p codex-features`
- `cargo test -p codex-tui status_and_layout`
2026-05-12 17:07:33 -07:00
Anton Panasenko
ac466c0dbd feat(exec-server): use protobuf relay frames (#22343)
## Why

Remote exec-server now needs one executor websocket to serve multiple
harness JSON-RPC sessions. Rendezvous routes by `stream_id`, and the
exec-server side needs to use the same stable relay frame contract
instead of a hand-rolled JSON shape.

The relay protocol also needs to make ownership boundaries clear:
harness and executor endpoints own sequencing, acks, retries, duplicate
suppression, segmentation, and reassembly; rendezvous only routes
frames.

## What Changed

- Add the checked-in `codex.exec_server.relay.v1.RelayMessageFrame`
proto plus generated prost bindings for `codex-exec-server`.
- Encode remote harness/executor relay traffic as binary protobuf
websocket frames while keeping local websocket JSON-RPC unchanged.
- Demux executor-side relay streams into independent
`ConnectionProcessor` sessions keyed by `stream_id`.
- Add a programmatic `RemoteExecutorConfig::with_bearer_token(...)`
constructor for non-CLI callers and integration tests.
- Add an integration test that starts the remote executor against a fake
registry/rendezvous websocket and verifies two virtual streams share one
executor websocket without cross-talk, including per-stream reset
behavior.
- Document the remote relay envelope, sequence ranges, `ack`/`ack_bits`,
and endpoint responsibilities in `exec-server/README.md`.

## Verification

- `cargo test -p codex-exec-server --test relay
multiplexed_remote_executor_routes_independent_virtual_streams --
--exact`
- `cargo test -p codex-exec-server --test relay`
- `cargo test -p codex-exec-server` passed outside the sandbox. The
sandboxed run hit macOS `sandbox-exec: sandbox_apply: Operation not
permitted` in filesystem sandbox tests.
2026-05-12 16:50:45 -07:00
Felipe Coury
6dc3b3d7c8 test(tui): relax configured pet load timeout (#22392)
## Why

Windows CI has been timing out in
`configured_pet_load_is_deferred_until_after_construction` while waiting
for the deferred configured-pet load event.

The test still needs to prove construction returns before the pet image
is available, but the background load slices the built-in pet
spritesheet into frame cache files. That work can exceed the old 2
second deadline on slower or more contended CI machines.

## What Changed

- Increased the test wait for `ConfiguredPetLoaded` from 2 seconds to 30
seconds.
- Kept the post-construction assertion intact so the test still verifies
that the pet is not loaded synchronously during `ChatWidget`
construction.

## How to Test

Targeted tests:
- `cargo test -p codex-tui
configured_pet_load_is_deferred_until_after_construction`
- `just argument-comment-lint`

Additional check:
- `cargo test -p codex-tui` was run, but the broader crate suite did not
complete successfully due to unrelated existing failures:
-
`status::tests::status_permissions_full_disk_managed_without_network_is_external_sandbox`
-
`status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access`
- later abort in
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all` from
stack overflow
2026-05-12 16:50:35 -07:00
pakrym-oai
960d42ddae code-mode: carry nested tool kind through runtime (#22377)
## Why

Code mode only used nested spec lookup at execution time to rediscover
whether a nested tool should be invoked as a function tool or a freeform
tool.

That information is already present in the enabled tool metadata that
code mode builds to expose `tools.*` and `ALL_TOOLS`, so re-looking it
up from the router was redundant and kept execution coupled to a
separate spec lookup path.

## What Changed

- thread `CodeModeToolKind` through the code-mode runtime `ToolCall`
event and `CodeModeNestedToolCall`
- emit the nested tool kind directly from the V8 callback using the
already-enabled tool metadata
- build nested tool payloads from the propagated kind instead of calling
`find_spec`
- remove the now-unused `find_spec` plumbing from the router and
parallel runtime helpers
- add unit coverage for function vs freeform payload shaping and update
affected router tests

## Testing

- `cargo test -p codex-code-mode`
- `cargo test -p codex-core code_mode::tests`
- `cargo test -p codex-core
extension_tool_bundles_are_model_visible_and_dispatchable`
- `cargo test -p codex-core
model_visible_specs_filter_deferred_dynamic_tools`
2026-05-12 23:34:37 +00:00
Dylan Hurd
8123bddb16 chore(config) include_collaboration_mode_instructions (#22383)
## Summary
Adds include_collaboration_mode_instructions, which is a config
equivalent to include_permissions_instructions for collaboration modes.
Desired for situations where we want to disable this instruction from
entering the context

## Testing
- [x] Added unit test
2026-05-12 15:50:10 -07:00
pakrym-oai
862b2122ee tools: remove is_mutating dispatch gating (#22382)
## Why

Tool dispatch had two serialization mechanisms:

- `supports_parallel_tool_calls` decides whether a tool participates in
the shared parallel-execution lock.
- `is_mutating` separately gated some calls inside dispatch.

That second hook no longer carried its weight. The remaining
parallel-support flag is already the per-tool concurrency policy, so
keeping a second mutating gate made dispatch harder to follow and left
behind extra session plumbing that only existed for that path.

## What changed

- Removed `is_mutating` from tool handlers and deleted the
`tool_call_gate` path that existed only to support it.
- Simplified dispatch and routing to rely on the existing per-tool
`supports_parallel_tool_calls` boolean.
- Dropped the now-unused handler overrides and related session/test
scaffolding.
- Kept the router/parallel tests focused on the surviving per-tool
behavior.
- Removed the unused `codex-utils-readiness` dependency from
`codex-core` as a follow-up fix for `cargo shear`.

## Testing

- `cargo test -p codex-core
parallel_support_does_not_match_namespaced_local_tool_names`
- `cargo test -p codex-core mcp_parallel_support_uses_handler_data`
- `cargo test -p codex-core
tools_without_handlers_do_not_support_parallel`
2026-05-12 22:44:54 +00:00
Chris Bookholt
5e3ee5eddf [codex] Tighten unified exec sandbox setup (#22207)
## Summary
- tighten unified exec sandbox initialization
- preserve the requested process workdir independently from sandbox
setup
- add regression coverage for the updated invariant

## Validation
- Ran `/tmp/cargo-tools/bin/just fmt`.
- Ran the targeted `codex-core` regression test successfully.
- Ran `cargo test -p codex-core`; it did not complete cleanly because
unrelated existing agent/config-loader tests failed and the run later
aborted on a stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 08:41:00 -07:00
jif-oai
89c8e9a4db fix: uv lock (#22323)
Update the lock of UV
2026-05-12 16:24:54 +02:00
Felipe Coury
95b332c820 feat(tui): add ambient terminal pets (#21206)
## Why

The Codex App has animated pets, but the TUI had no equivalent ambient
companion surface. This brings that experience into terminal Codex while
keeping the main chat flow usable: the pet should feel present, but it
cannot cover transcript text, composer input, approvals, or picker
content.

The feature also needs to be terminal-aware. Different terminals support
different image protocols, tmux can interfere with image rendering, and
some users will want pets disabled entirely or anchored differently
depending on their layout.

<table>
<tr><td>
<img width="4110" height="2584" alt="CleanShot 2026-05-05 at 12 41
45@2x"
src="https://github.com/user-attachments/assets/68a1fcbc-2104-48d6-b834-69c6aaa95cdf"
/>
<p align="center">macOS - Ghostty, iTerm2 and WezTerm with Custom
Pet</p>
</td></tr>
<tr><td>
![Uploading CleanShot 2026-05-10 at 20.28.30.png…]()
<p align="center">Windows Terminal</p>
</td></tr>
<tr><td>
<img width="3902" height="2752" alt="CleanShot 2026-05-05 at 12 39
02@2x"
src="https://github.com/user-attachments/assets/300e2931-6b00-467e-91cb-ab8e28470500"
/>
<p align="center">Linux - WezTerm and Ghostty</p>
</td></tr>
</table>

## What Changed

- Add a TUI ambient pet renderer in `codex-rs/tui/src/pets/`.
- Port the app-style pet animation states so the sprite changes with
task status, waiting-for-input states, review/ready states, and
failures.
- Add `/pets` selection UI with a preview pane, loading state, built-in
pet choices, and a first-row `Disable terminal pets` option.
- Download built-in pet spritesheets on demand from the same public CDN
path already used by Android, under
`https://persistent.oaistatic.com/codex/pets/v1/...`, and cache them
locally under `~/.codex/cache/tui-pets/`.
- Keep custom pets local.
- Add config support for pet selection, disabling pets, and choosing
whether the pet follows the composer bottom or anchors to the terminal
bottom.
- Reserve layout space around the pet so transcript wrapping, live
responses, and composer input do not render underneath the sprite.
- Gate image rendering by terminal capability, disable image pets under
tmux, and support both Kitty Graphics and SIXEL terminals.
- Add redraw cleanup for terminal image artifacts, including sixel cell
clearing.

## Current Scope

- This is an initial TUI version of ambient pets, not full App parity.
- It focuses on ambient sprite rendering, `/pets` selection, custom
pets, terminal capability gating, and on-demand CDN-backed built-in
assets.
- The ambient text overlay is currently disabled, so the TUI renders the
pet sprite without extra status text beside it.

## How to Test

1. Start Codex TUI in a terminal with image support.
2. Run `/pets`.
3. Confirm the picker shows built-in pets plus custom pets, and the
first item is `Disable terminal pets`.
4. On a fresh `~/.codex/cache/tui-pets/`, move onto a built-in pet and
confirm the first preview downloads the spritesheet from the shared
Codex pets CDN and renders successfully.
5. Move through the pet list and confirm subsequent built-in previews
use the local cache.
6. Select a pet, then send and receive messages. Confirm transcript and
composer text wrap before the pet instead of rendering underneath the
sprite.
7. Change the pet anchor setting and confirm the pet can either follow
the composer bottom or sit at the terminal bottom.
8. Return to `/pets`, choose `Disable terminal pets`, and confirm the
sprite disappears cleanly.

Targeted tests:
- `cargo test -p codex-tui ambient_pet_`
- `cargo test -p codex-tui
resize_reflow_wraps_transcript_early_when_pet_is_enabled`
- `cargo insta pending-snapshots`
2026-05-12 10:43:17 -03:00
cassirer-openai
cb55b769d1 [rollout-trace] Add x-codex-inference-call-id header to inference calls. (#22311)
This allows us to attach call logs to inference requests in traces.
2026-05-12 05:55:11 -07:00
jif-oai
d996f5366f feat: guardian as an extension (contributors part) (#22216)
Part 1 of guardian as extension. This bind all the logic to spawn
another agent from an extension and it adds `ThreadId` in the start
thread collaborator
2026-05-12 14:41:45 +02:00
xl-openai
5b1a4c2fa7 feat: Normalize remote plugin summary identities. (#22265)
Makes plugin summaries use config-style plugin@marketplace IDs while
exposing backend remote IDs separately as remotePluginId.

Also fix the consistency issue of REMOTE_SHARED_WITH_ME_MARKETPLACE_NAME
2026-05-12 00:58:37 -07:00
viyatb-oai
46f30d0282 feat(sandbox): add Windows deny-read parity (#18202)
## Why

The split filesystem policy stack already supports exact and glob
`access = none` read restrictions on macOS and Linux. Windows still
needed subprocess handling for those deny-read policies without claiming
enforcement from a backend that cannot provide it.

## Key finding

The unelevated restricted-token backend cannot safely enforce deny-read
overlays. Its `WRITE_RESTRICTED` token model is authoritative for write
checks, not read denials, so this PR intentionally fails that backend
closed when deny-read overrides are present instead of claiming
unsupported enforcement.

## What changed

This PR adds the Windows deny-read enforcement layer and makes the
backend split explicit:

- Resolves Windows deny-read filesystem policy entries into concrete ACL
targets.
- Preserves exact missing paths so they can be materialized and denied
before an enforceable sandboxed process starts.
- Snapshot-expands existing glob matches into ACL targets for Windows
subprocess enforcement.
- Honors `glob_scan_max_depth` when expanding Windows deny-read globs.
- Plans both the configured lexical path and the canonical target for
existing paths so reparse-point aliases are covered.
- Threads deny-read overrides through the elevated/logon-user Windows
sandbox backend and unified exec.
- Applies elevated deny-read ACLs synchronously before command launch
rather than delegating them to the background read-grant helper.
- Reconciles persistent deny-read ACEs per sandbox principal so policy
changes do not leave stale deny-read ACLs behind.
- Fails closed on the unelevated restricted-token backend when deny-read
overrides are present, because its `WRITE_RESTRICTED` token model is not
authoritative for read denials.

## Landed prerequisites

These prerequisite PRs are already on `main`:

1. #15979 `feat(permissions): add glob deny-read policy support`
2. #18096 `feat(sandbox): add glob deny-read platform enforcement`
3. #17740 `feat(config): support managed deny-read requirements`

This PR targets `main` directly and contains only the Windows deny-read
enforcement layer.

## Implementation notes

- Exact deny-read paths remain enforceable on the elevated path even
when they do not exist yet: Windows materializes the missing path before
applying the deny ACE, so the sandboxed command cannot create and read
it during the same run.
- Existing exact deny paths are preserved lexically until the ACL
planner, which then adds the canonical target as a second ACL target
when needed. That keeps both the configured alias and the resolved
object covered.
- Windows ACLs do not consume Codex glob syntax directly, so glob
deny-read entries are expanded to the concrete matches that exist before
process launch.
- Glob traversal deduplicates directory visits within each pattern walk
to avoid cycles, without collapsing distinct lexical roots that happen
to resolve to the same target.
- Persistent deny-read ACL state is keyed by sandbox principal SID, so
cleanup only removes ACEs owned by the same backend principal.
- Deny-read ACEs are fail-closed on the elevated path: setup aborts if
mandatory deny-read ACL application fails.
- Unelevated restricted-token sessions reject deny-read overrides early
instead of running with a silently unenforceable read policy.

## Verification

- `cargo test -p codex-core
windows_restricted_token_rejects_unreadable_split_carveouts`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-windows-sandbox`
- GitHub Actions rerun is in progress on the pushed head.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-11 23:04:28 -07:00
pakrym-oai
c9e46ed639 [codex] Make handlers own parallel tool support (#22254)
## Why

`ToolRouter::tool_supports_parallel()` was still consulting configured
specs when a handler lookup missed, even though parallel schedulability
is really a property of the executable handler. Keeping that metadata on
`ConfiguredToolSpec` duplicated state between the model-visible spec
layer and the runtime handler layer.

This change makes handlers the sole source of truth for parallel tool
support and removes the extra spec wrapper that only existed to carry
duplicated metadata.

## What changed

- removed `ConfiguredToolSpec` and store plain `ToolSpec` values in the
registry/router builder path
- changed `ToolRouter::tool_supports_parallel()` to consult only the
handler registry and fall back to `false`
- simplified spec collection and test helpers to operate directly on
`ToolSpec`
- updated router/spec tests to cover handler-owned parallel behavior and
the no-handler fallback

## Validation

- `cargo test -p codex-tools`
- `cargo test -p codex-core mcp_parallel_support_uses_handler_data`
- `cargo test -p codex-core
deferred_responses_api_tool_serializes_with_defer_loading`
- `cargo test -p codex-core
tools_without_handlers_do_not_support_parallel`
- `cargo test -p codex-core
request_plugin_install_can_be_registered_without_search_tool`

## Docs

No documentation updates needed.
2026-05-11 22:26:33 -07:00
pakrym-oai
79c65f816c [codex] Filter legacy warning messages during compaction (#22243)
## Why

Older sessions can contain model-warning records persisted as `user`
messages, including the unified exec process-limit warning, the
`apply_patch`-via-`exec_command` warning, and the model-mismatch
high-risk cyber fallback warning. Those warnings are no longer produced
as conversation history items, but when old sessions compact they should
still be recognized as injected context rather than preserved as real
user turns.

## What changed

- Removed `record_model_warning` and the production paths that emitted
these warning messages into conversation history.
- Added `LegacyUnifiedExecProcessLimitWarning`,
`LegacyApplyPatchExecCommandWarning`, and `LegacyModelMismatchWarning`
contextual fragments that are used only for matching old persisted
messages.
- Registered the legacy fragments with contextual user message detection
so compaction filters them through the existing fragment path.
- Added focused compaction coverage for old warning messages being
dropped during compacted-history processing.

## Testing

- `cargo test -p codex-core warning`
- `just fix -p codex-core`
2026-05-11 19:51:51 -07:00
Abhinav
d08906a944 Support PreToolUse updatedInput rewrites (#20527)
## Why

`PreToolUse` already exposes `updatedInput` in its hook output schema,
but Codex currently rejects it instead of applying the rewrite. That
leaves hook authors unable to make the documented pre-execution
adjustment to a tool call before it runs.

## What

- Accept `updatedInput` from `PreToolUse` hooks when paired with
`permissionDecision: "allow"`.
- Apply the rewritten input before dispatch so the tool executes the
updated payload, not the original one.
- Preserve the stable hook-facing compatibility shapes that
participating tool handlers expose:
- Bash-like tools (`shell`, `container.exec`, `local_shell`,
`shell_command`, `exec_command`) use `{ "command": ... }`.
- `apply_patch` exposes its patch body through the same command-shaped
hook contract.
  - MCP tools expose their JSON argument object directly.
- Keep each participating tool handler responsible for translating
hook-facing `updatedInput` back into its concrete invocation shape.

## Verification

Direct Bash-like rewrite coverage:

- `pre_tool_use_rewrites_shell_before_execution`
- `pre_tool_use_rewrites_container_exec_before_execution`
- `pre_tool_use_rewrites_local_shell_before_execution`
- `pre_tool_use_rewrites_shell_command_before_execution`
- `pre_tool_use_rewrites_exec_command_before_execution`

These cases assert that each supported Bash-like surface runs only the
rewritten command while the hook still observes the original `{
"command": ... }` input.

`pre_tool_use_rewrites_apply_patch_before_execution`

- Model emits one patch.
- Hook swaps in a different patch.
- Asserts only the rewritten file is created, and the hook saw the
original patch.

`pre_tool_use_rewrites_code_mode_nested_exec_command_before_execution`

- Model runs one nested shell command from code mode.
- Hook rewrites it.
- Asserts only the rewritten command runs, and the hook saw the original
nested input.

`pre_tool_use_rewrites_mcp_tool_before_execution`

- Model calls the RMCP echo tool.
- Hook rewrites the MCP arguments.
- Asserts the MCP server receives and returns the rewritten message, not
the original one.
2026-05-11 22:27:24 -04:00
starr-openai
17ed5ad0b0 Apply sandbox context to local view_image reads (#21861)
## Summary
- create a selected-cwd filesystem sandbox context for view_image
metadata and file reads in both local and remote environments
- add a local restricted-profile regression test for the previously
unsandboxed read path

## Validation
- just fmt
- bazel test --bes_backend= --bes_results_url= --test_output=errors
--test_filter=view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads
//codex-rs/core:core-unit-tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-11 18:48:43 -07:00
efrazer-oai
fd24c00b0b feat(skills): default plugin creator to personal share flow (#22221)
## Summary
Plugin creation now defaults to the personal marketplace path and ends
with a readable handoff back into Codex after a marketplace-backed
scaffold.

Before this change, `plugin-creator` centered repo-local marketplace
updates and did not clearly guide the agent to return the user to the
created plugin afterward. This PR updates the bundled system skill so
marketplace-backed scaffolds default to `~/plugins/<plugin-name>` plus
`~/.agents/plugins/marketplace.json`, ask for user intent only when an
existing repo marketplace makes personal vs team scope ambiguous, and
end with named Markdown deeplinks labeled `View <plugin-name>` and
`Share <plugin-name>`.

## What changed
- default marketplace-backed creation to the personal plugin location
- document the explicit repo/team override path for codebases that
should own the plugin entry
- ask personal vs team only when the current Git repo already has
`.agents/plugins/marketplace.json` and the user has not stated scope
- require named Markdown deeplinks after marketplace-backed creation so
the final response returns the user to the exact plugin cleanly
- keep the deeplink targets precise with real absolute `marketplacePath`
and normalized `pluginName` values
- align the bundled prompt, scaffold help text, and marketplace
reference spec with the new default

## Testing
Tests: targeted skill validation, Python compile checks,
personal-default scaffold smoke, repo-override scaffold smoke, and
whitespace checks.
2026-05-11 17:58:48 -07:00
pakrym-oai
ed5944ba1d Simplify MCP tool handler plumbing (#21595)
## Why
The MCP tool path had accumulated a few core-owned special cases: a
dedicated payload variant, resolver plumbing, a legacy `AfterToolUse`
translation path, and a side channel for parallel-call metadata. That
made `ToolRegistry` and the spec builder know more about MCP than they
needed to.

This change moves MCP-specific execution details back onto `ToolInfo`
and `McpHandler` so `codex-core` can treat MCP calls like normal
function calls while still preserving MCP-specific dispatch and
telemetry behavior where it belongs.

## What changed
- removed `resolve_mcp_tool_info`, `ToolPayload::Mcp`, `ToolKind`, and
the remaining registry-side MCP resolver path
- stored MCP routing metadata directly on `McpHandler` and `ToolInfo`,
including `supports_parallel_tool_calls`
- deleted the legacy `AfterToolUse` consumer in `core`, which removes
the need for handler-specific `after_tool_use_payload` implementations
- switched tool-result telemetry to handler-provided tags and kept
MCP-specific dispatch payload construction inside the handler
- simplified tool spec planning/building by passing `ToolInfo` directly
and dropping the direct/deferred MCP wrapper structs and the
parallel-server side table

## Testing
- `cargo check -p codex-core -p codex-mcp -p codex-otel`
- `cargo test -p codex-core
mcp_parallel_support_uses_exact_payload_server`
- `cargo test -p codex-core
direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core
search_tool_description_lists_each_mcp_source_once`
- `cargo test -p codex-mcp
list_all_tools_uses_startup_snapshot_while_client_is_pending`
- `just fix -p codex-core -p codex-mcp -p codex-otel`
2026-05-12 00:11:31 +00:00
Felipe Coury
e16b4e46d4 fix(tui): handle hidden app git directives (#21946) 2026-05-11 21:08:40 -03:00
Ruslan Nigmatullin
95d8669ab2 [exec-server] serve websocket listener via HTTP upgrade (#21963)
## Why

`codex exec-server` should keep the existing public `ws://IP:PORT` URL
shape while serving that websocket connection through an HTTP upgrade
path internally. That keeps the client-facing configuration simple and
allows the listener to work through intermediate HTTP-aware
infrastructure.

## What changed

- keep the emitted and configured exec-server URL as `ws://IP:PORT`
- serve that websocket endpoint through Axum HTTP upgrade handling on
`/`
- expose `GET /readyz` from the same listener for readiness checks
- route upgraded Axum websocket streams through the shared JSON-RPC
connection machinery
- initialize the rustls crypto provider before websocket client
connections
- preserve inbound binary websocket JSON-RPC parsing for compatibility
with the prior transport behavior

## Verification

- `cargo test -p codex-exec-server --test health --test process --test
websocket --test initialize --test exec_process`
2026-05-11 17:04:21 -07:00
Matthew Zeng
e15ecc9c35 Add production startup and TTFT telemetry (#22198)
## Why

While investigating `codex exec hi` startup latency, the useful
questions were not "is startup slow?" but "which durable bucket is slow
in production?"

The path we observed has a few distinct stages:

1. `thread/start` creates the session
2. startup prewarm builds the turn context, tools, and prompt
3. startup prewarm warms the websocket
4. the first real turn resolves the prewarm
5. the model produces the first token

Before this PR, production telemetry had some of the raw measurements
already:

- aggregate startup-prewarm duration / age-at-first-turn metrics
- TTFT as a metric
- websocket request telemetry

But there was no coherent production event stream for the startup
breakdown itself, and TTFT was metric-only. That made it hard to answer
the same latency questions from OpenTelemetry-backed logs without adding
one-off local instrumentation.

## What changed

Add durable production telemetry on the existing `SessionTelemetry`
path:

- new `codex.startup_phase` OTel log/trace events plus
`codex.startup.phase.duration_ms`
- new `codex.turn_ttft` OTel log/trace events while preserving the
existing TTFT metric

The startup phase event is emitted for the coarse buckets we actually
observed while running `exec hi`:

- `thread_start_create_thread`
- `startup_prewarm_total`
- `startup_prewarm_create_turn_context`
- `startup_prewarm_build_tools`
- `startup_prewarm_build_prompt`
- `startup_prewarm_websocket_warmup`
- `startup_prewarm_resolve`

These phases are intentionally low-cardinality so they remain safe as
production telemetry tags.

## Why this shape

This keeps the instrumentation on the same production path as the rest
of the session telemetry instead of adding a local debug-only trace
mode. It also avoids changing startup behavior:

- prewarm still runs
- no control flow changes
- no extra remote calls
- no user-visible behavior changes

One boundary is intentional: very early process bootstrap that happens
before a session exists is not included here, because this PR uses
session-scoped production telemetry. The expensive buckets we were
trying to understand after `thread/start` are now covered durably.

## Verification

- `cargo test -p codex-otel`
- `cargo test -p codex-core turn_timing`
- `cargo test -p codex-core
regular_turn_emits_turn_started_without_waiting_for_startup_prewarm`
- `cargo test -p codex-core
interrupting_regular_turn_waiting_on_startup_prewarm_emits_turn_aborted`
- `cargo test -p codex-app-server thread_start`
- `just fix -p codex-otel -p codex-core -p codex-app-server`

I also ran `cargo test -p codex-core`; it built successfully and then
hit an existing unrelated stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.
2026-05-11 23:58:36 +00:00
starr-openai
22e84c49d0 Support multi-environment apply_patch selection (#21617)
## Summary
- add multi-environment apply_patch routing for both freeform and
function-call tool flows
- parse and reconcile the optional environment selector in the main
apply_patch parser, then verify against the selected environment in the
handler
- carry environment_id through runtime and approval surfaces so
remote-targeted patches stay explicit end to end

## Testing
- just fmt
- remote exec-server e2e: `cargo test -p codex-core --test all
apply_patch_multi_environment_uses_remote_executor -- --nocapture` on
dev via `scripts/test-remote-env.sh`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-11 16:33:44 -07:00
alexsong-oai
bb6134c028 Stop uploading accepted line fingerprints (#22180)
## Summary
- keep accepted-line diff parsing and fingerprint hashing logic locally
- stop uploading path/line hash fingerprints in the accepted-line
analytics event payload
- keep aggregate accepted added/deleted line counts in the event

## Testing
- just fmt
- cargo test -p codex-analytics
- just fix -p codex-analytics
2026-05-11 15:41:38 -07:00
Owen Lin
4859d80ffe Update codex remote-control to start the daemon (#22218)
## Why
Update `codex remote-control` to use the new app server daemon commands
instead.
- if the updater loop is not running, bootstrap the daemon with remote
control enabled (`codex app-server daemon bootstrap --remote-control`)
- otherwise, enable the persisted remote-control setting and start the
daemon normally
2026-05-11 15:38:30 -07:00
Abhinav
9ab7f4e6ac Add Windows hook command overrides (#22159)
# Why

Managed hook configs need a shared cross-platform shape without making
the existing `command` field polymorphic. The common case is still one
command string, with Windows needing a different entrypoint only when
the runtime is actually Windows.

Keeping `command` as the portable/default path and adding an optional
Windows override keeps the config easier to read, preserves the existing
scalar shape for non-Windows users, and avoids forcing every caller into
a `{ unix, windows }` object when only one platform needs special
handling.

# What

- Add optional `command_windows` / `commandWindows` alongside the
existing hook `command` field.
- Resolve `command_windows` only on Windows during hook discovery; other
platforms continue to use `command` unchanged.
- Keep trust hashing aligned to the effective command selected for the
current runtime.

# Docs

The Codex hooks/config reference should document `command_windows` as
the Windows-only override for command hooks.
2026-05-11 22:22:29 +00:00
rhan-oai
a175ddacc0 [codex-analytics] emit terminal review events (#18748)
## Why

Review telemetry should describe reviews as first-class events, not only
as counters denormalized onto terminal tool-item events. That lets us
analyze guardian and user reviews consistently across command execution,
file changes, permissions, and network access, while still preserving
the terminal item summaries that existing tool analytics need.

To make those review events accurate, analytics also needs the observed
completion time for each review and enough command metadata to
distinguish `shell` from `unified_exec` reviews.

## What changed

- emit generic `codex_review_event` rows for completed user and guardian
reviews, with review subjects, reviewer, trigger, terminal status,
resolution, and observed duration
- reduce approval request / response / abort facts into review events
for command execution, file change, and permissions flows
- keep denormalized review counts, final approval outcome, and
permission-request flags on terminal tool-item events for
item-associated reviews
- plumb review completion timing so user-review responses and aborts use
app-server-observed completion times, while guardian analytics reuse the
same terminal timestamps emitted on guardian assessment events
- carry command approval `source` through the protocol and app-server
layers so review analytics can distinguish `shell` from `unified_exec`
- add analytics coverage for user-review emission, guardian-review
emission, permission reviews that should not denormalize onto tool
items, item-summary isolation across threads, and the serialized
review-event shape

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18748).
* __->__ #18748
* #21434
* #18747
* #17090
* #17089
* #20514
2026-05-11 22:13:32 +00:00
Ahmed Ibrahim
aa9e8f0262 [8/8] Add Python SDK Ruff formatting (#22021)
## Why

The Python SDK needs the same tight formatter/lint loop as the rest of
the repo: a safe Ruff autofix pass, Ruff formatting, editor save
behavior, and CI checks that catch drift. Without that loop, SDK changes
can land with formatting or import ordering that differs from what
reviewers and CI expect.

## What

- Add Ruff configuration to `sdk/python/pyproject.toml`, excluding
generated protocol code and notebooks from the normal lint/format pass.
- Update `just fmt` so it still formats Rust and also runs Python SDK
Ruff autofix and formatting.
- Add Python SDK CI steps for `ruff check` and `ruff format --check`
before pytest.
- Recommend the Ruff VS Code extension and enable Python
format/fix/organize-on-save so Cmd+S uses the same tooling.
- Apply the resulting Ruff formatting to SDK Python files, examples, and
the checked-in generated `v2_all.py` output emitted by the pinned
generator.
- Add a guard test for the `just fmt` recipe so it keeps working from
both Rust and Python SDK working directories.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. This PR `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added `test_root_fmt_recipe_formats_rust_and_python_sdk` for the
shared format recipe.
- Ran `just fmt` after the recipe update.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 01:10:29 +03:00
Ahmed Ibrahim
3e10e09e24 [7/8] Add Python SDK app-server integration harness (#22014)
## Why

The SDK had behavioral tests that replaced SDK client internals. Those
tests could catch wrapper mistakes, but they did not prove the pinned
app-server runtime, generated notification models, request routing, and
sync/async public clients worked together.

This PR adds deterministic integration coverage that starts the pinned
`codex app-server` process and mocks only the upstream Responses HTTP
boundary.

## What

- Add `AppServerHarness` and `MockResponsesServer` helpers for isolated
`CODEX_HOME`, mock-provider config, queued SSE responses, and captured
`/v1/responses` requests.
- Add shared helpers for SSE construction, stream assertions,
approval-policy inspection, and image fixtures.
- Split integration coverage into focused modules for run behavior,
inputs, streaming, turn controls, approvals, and thread lifecycle.
- Cover sync and async `Thread.run`, `TurnHandle.stream`, interleaved
streams, approval-mode persistence, lifecycle helpers, final-answer
phase handling, image inputs, loaded skill input injection, steering,
interruption, listing, history reads, run overrides, and token usage
mapping.
- Replace public-wrapper tests that duplicated integration-test behavior
with lower-level client tests only where direct client behavior is the
thing under test.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. This PR `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added pinned app-server integration tests under
`sdk/python/tests/test_app_server_*.py` and
`test_real_app_server_integration.py`.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 01:06:41 +03:00
Ahmed Ibrahim
2b90c37069 [6/8] Add high-level Python SDK approval mode (#21910)
## Why

The high-level SDK should expose the approval behavior it actually
supports instead of leaking generated app-server routing fields. New
work should have two clear choices: default auto review, or explicitly
deny escalated permission requests. Existing threads and subsequent
turns should preserve their current approval behavior unless the caller
passes an override.

## What

- Add the public `ApprovalMode` enum with `auto_review` and `deny_all`.
- Default new thread creation to `ApprovalMode.auto_review`.
- Preserve existing approval settings by default for resume, fork, run,
and turn helpers.
- Remove raw `approval_policy` / `approvals_reviewer` kwargs from
high-level SDK wrappers.
- Update generated wrapper output, docs, examples, notebooks, and tests
for the high-level approval mode API.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. This PR `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added approval-mode mapping/default tests for new threads, existing
threads, forks, resumes, and subsequent turns.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 01:02:43 +03:00
Ahmed Ibrahim
f1b84fac63 [5/8] Rename Python SDK package to openai-codex (#21905)
## Why

The SDK should publish under the reserved public distribution name
`openai-codex`, and its import module should match that name in the
Python style. Since package names can contain hyphens but import modules
cannot, the public import path becomes `openai_codex`.

Keeping the rename separate from the public API surface change makes the
naming change easy to review and avoids mixing it with API curation.

## What

- Rename the SDK distribution from `openai-codex-app-server-sdk` to
`openai-codex`.
- Rename the import package from `codex_app_server` to `openai_codex`.
- Keep the runtime wheel as the separate `openai-codex-cli-bin`
dependency.
- Update docs, examples, notebooks, artifact scripts, lockfile metadata,
and tests for the new distribution/module names.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. This PR `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Updated package metadata and public API tests to assert the
distribution and import names.

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 00:59:25 +03:00
Ahmed Ibrahim
b4bc02439f [4/8] Define Python SDK public API surface (#21896)
## Why

The SDK package root should be the ergonomic public client API, not a
dump of every generated app-server schema type. Generated models still
need a supported import path, but callers should be able to tell which
names are high-level SDK entrypoints and which names are protocol value
models.

## What

- Define a curated root `__all__` for clients, handles, input helpers,
retry helpers, config, and public errors.
- Add a `types` module as the supported home for generated app-server
response, event, enum, and helper models.
- Update docs and examples to import protocol/value models from the type
module.
- Add tests that lock root exports, type-module exports, star-import
behavior, and example import hygiene.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. This PR `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added public API signature tests for root exports, `types` exports,
and example imports.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 00:57:44 +03:00
Ahmed Ibrahim
3e2936dd0e [3/8] Run Python SDK tests in CI (#21895)
## Why

The Python SDK stack now depends on packaging metadata, pinned runtime
wheels, generated artifacts, async behavior, and stream interleaving.
Those checks need to run in CI so future changes cannot bypass the SDK
test suite.

## What

- Add a dedicated `python-sdk` job to `.github/workflows/sdk.yml`.
- Run the job in `python:3.12-alpine` so dependency resolution exercises
the pinned musl runtime wheel.
- Keep the Python SDK test job parallel to the existing SDK job instead
of serializing the full workflow.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. This PR `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- The added workflow job installs the SDK with `uv sync --extra dev
--frozen` and runs the Python SDK pytest suite.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 00:53:36 +03:00
Ahmed Ibrahim
6a4653efc8 [2/8] Generate Python SDK types from pinned runtime (#21893)
## Why

Once the SDK declares its runtime package, generated Python artifacts
should come from that pinned runtime rather than whatever app-server
schema happens to be in the current checkout. That keeps the generated
API and model surface aligned with the runtime users install.

## What

- Teach `scripts/update_sdk_artifacts.py generate-types` to invoke the
pinned runtime package for schema generation.
- Regenerate `v2_all.py`, `notification_registry.py`, and generated
public wrapper methods from that schema.
- Add freshness coverage so regenerating from the pinned runtime must
leave checked-in artifacts unchanged.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. This PR `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added `test_generated_files_are_up_to_date` for pinned-runtime
generation drift.
- Added generator-structure tests for schema annotation and notification
metadata generation.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 00:53:21 +03:00
Ahmed Ibrahim
5fe33443b0 [1/8] Pin Python SDK runtime dependency (#21891)
## Why

The Python SDK depends on the app-server runtime package for the bundled
`codex` binary and schema source of truth. That relationship should be
explicit in package metadata instead of inferred from matching version
numbers, so installers, lockfiles, and reviewers can see exactly which
runtime the SDK expects.

## What

- Declare `openai-codex-cli-bin==0.131.0a4` as a Python SDK dependency.
- Update runtime setup helpers to resolve the runtime version from the
declared dependency pin.
- Refresh the SDK lockfile for the pinned runtime wheel.
- Update package/runtime tests and docs that describe where the runtime
version comes from.

## Stack

1. This PR `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added coverage for the SDK runtime dependency pin and runtime
distribution naming.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-12 00:42:26 +03:00
viyatb-oai
c7b55cdc46 feat: add network proxy feature flag (#20147)
## Why

The permissions migration is making
`permissions.<profile>.network.enabled` the canonical sandbox network
bit, while proxy startup is a separate concern. Enabling network access
should not implicitly start the proxy, and users who are still on legacy
sandbox modes need a separate place to opt into proxy startup and
provide proxy-specific settings.

This follow-up to #19900 gives the network proxy its own feature surface
instead of overloading permission-profile network semantics.

## What changed

- Add an experimental `network_proxy` feature with a configurable
`[features.network_proxy]` table.
- Overlay `features.network_proxy` settings onto the configured proxy
state after permission-profile selection, so the proxy only starts when
the active `NetworkSandboxPolicy` already allows network access.
- Preserve `[experimental_network]` startup behavior independently of
the new feature flag.

## Behavior and examples

There are now three related knobs:

- `permissions.<profile>.network.enabled` controls whether the active
permission profile has network access at all.
- `features.network_proxy` enables proxy restrictions for an
already-network-enabled profile.
- Legacy `sandbox_mode` plus `[sandbox_workspace_write].network_access`
still control whether legacy `workspace-write` has network access at
all.

The rule is:

- network off + proxy flag on -> network stays off, proxy is a no-op
- network on + proxy flag off -> unrestricted direct network
- network on + proxy flag on -> network stays on, with proxy
restrictions applied

For permission profiles, the feature toggle adds proxy restrictions only
when network access is already enabled:

```toml
default_permissions = "workspace"

[permissions.workspace.filesystem]
":minimal" = "read"

[permissions.workspace.network]
enabled = true

[features]
network_proxy = true
```

If `network.enabled = false`, the same feature flag is a no-op: network
remains off and the proxy does not start.

For legacy sandbox config, `network_access` remains the master switch:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
network_access = true

[features]
network_proxy = true
```

That keeps legacy `workspace-write` network access on, but routes it
through the proxy policy. If `network_access = false`, the proxy feature
is a no-op and legacy `workspace-write` remains offline.

The same proxy opt-in can be supplied from the CLI:

```bash
codex -c 'features.network_proxy=true'
```

Additional proxy settings can be supplied when a table is needed:

```bash
codex \
  -c 'features.network_proxy.enabled=true' \
  -c 'features.network_proxy.enable_socks5=false'
```

The intended behavior matrix is:

| Config surface | Network setting | `features.network_proxy` | Direct
sandbox network | Proxy |
| --- | --- | --- | --- | --- |
| Permission profile | `network.enabled = false` | off | restricted |
off |
| Permission profile | `network.enabled = false` | on | restricted | off
|
| Permission profile | `network.enabled = true` | off | enabled | off |
| Permission profile | `network.enabled = true` | on | enabled | on |
| Legacy `workspace-write` | `network_access = false` | off | restricted
| off |
| Legacy `workspace-write` | `network_access = false` | on | restricted
| off |
| Legacy `workspace-write` | `network_access = true` | off | enabled |
off |
| Legacy `workspace-write` | `network_access = true` | on | enabled | on
|

`[experimental_network]` requirements remain separate from the user
feature toggle and still start the proxy on their own.

Relevant code:
-
[`features/src/feature_configs.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/features/src/feature_configs.rs#L58-L117)
defines the feature-specific proxy config.
-
[`core/src/config/mod.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L1959-L1964)
reads the feature table, and [later applies it only when network access
is already
enabled](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L2448-L2458).

## Verification

Added focused coverage for:
- keeping the proxy off when `features.network_proxy` is enabled but
sandbox network access is disabled
- the full permission-profile and legacy `workspace-write` matrix above
- preserving `[experimental_network]` startup without the feature
- reusing profile-supplied proxy settings when the feature is enabled

Ran:
- `cargo test -p codex-features`
- `cargo test -p codex-core network_proxy_feature`
- `cargo test -p codex-core
experimental_network_requirements_enable_proxy_without_feature`
2026-05-11 14:12:00 -07:00
cooper-oai
54ec99cb54 [login] revoke superseded auth tokens on relogin (#21747)
## Summary
- revoke previously stored managed ChatGPT tokens after a successful
re-login
- keep the new login successful even when revocation is unavailable or
fails
- cover the shared persistence path used by browser and device-code
login flows

## Why
A new `codex login` currently overwrites existing managed ChatGPT
credentials without attempting to revoke the superseded tokens, leaving
old credentials valid longer than necessary.

## Validation
- `just fmt`
- `CARGO_HOME=/tmp/cargo-home cargo test -p codex-login`

## Notes
- Initial local Cargo validation hit a corrupt existing crate cache in
the default `CARGO_HOME`; rerunning with a clean temporary `CARGO_HOME`
passed.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-11 13:36:46 -07:00
Ruslan Nigmatullin
e3f481da98 daemon: refresh updater after validated binary rollout (#21853)
## Why

`bootstrap` starts a detached pid-backed updater loop, but before this
change that updater could keep running an old executable image even
after `install.sh` replaced the managed standalone binary under
`CODEX_HOME`. That left the updater itself behind the binary it had just
rolled out, especially when the app-server was stopped or when the
managed binary changed without a version-string change.

## What changed

- Track updater identity from the executable contents rather than only
the reported CLI version.
- Force the managed app-server restart path when the managed binary
contents differ from the running updater image, then re-exec the updater
from the managed binary once the rollout is in a safe state.
- Distinguish a genuinely absent managed app-server from a managed
process that exists but is not yet probeable, so self-refresh does not
skip a required restart.
- Keep the restart/re-exec decision under the daemon operation lock so
`bootstrap` cannot race the handoff.
- Update `app-server-daemon/README.md` to document the resulting
standalone and out-of-band update behavior.

## Verification

- `cargo test -p codex-app-server-daemon`
- `just fix -p codex-app-server-daemon`

Added focused unit coverage for:
- content-based updater refresh decisions
- safe updater re-exec outcomes across restart states
2026-05-11 12:37:10 -07:00
Felipe Coury
99b98aece6 config: accept minus in TUI keymap config (#22192)
## Summary

Fixes #22128.

The `/keymap` flow already persists the `-` key as `minus`, and the
runtime keymap parser already accepts that spelling. `codex-config` was
the missing leg: it rejected `minus` during config deserialization, so a
binding saved by Codex could fail on the next startup or config reload.

## What Changed

- Accept `minus` as a valid canonical key name in `tui.keymap` config
normalization.
- Update the config validation message so its supported-key list
includes `minus`.
- Add regression coverage that deserializes both `minus` and `alt-minus`
under `[tui.keymap.global]` and verifies the normalized config shape.

## How to Test

1. Start Codex TUI.
2. Run `/keymap`.
3. Assign the `-` key to an action and save the change.
4. Restart Codex or reload the config.
5. Confirm the config loads normally and the saved binding remains
usable instead of failing on `minus`.
6. As a focused regression check, repeat with a modifier form such as
`alt--` captured through `/keymap`, which persists as `alt-minus` and
should also reload successfully.

Targeted tests:
- `cargo test -p codex-config`
2026-05-11 16:34:33 -03:00
Matthew Zeng
192481d1a1 [elicitation] Advertise new url elicitation capability when auth_elicitation is enabled. (#22188)
## Why

We've added support for auth elicitation behind the auth_elicitation
flag, but servers need to explicitly check the capability before it
decides to send elicitations in order to be backward compatible. This PR
adds the capability advertising conditioned on the flag.

## What changed

- Build `client_elicitation_capability` from the `AuthElicitation`
feature state.
- Thread that capability through MCP config, session startup, and
`McpConnectionManager` so RMCP initialization advertises the correct
elicitation support.
- Advertise both `form` and `url` elicitation when the feature is
enabled, and preserve the empty default capability when it is disabled.
- Add coverage for the feature-derived config shape and the advertised
initialization payload.

## Testing

- `cargo test -p codex-mcp`
- `cargo test -p codex-core
to_mcp_config_preserves_auth_elicitation_feature_from_config`
- `cargo test -p codex-core` *(currently fails outside this change in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`
with a stack overflow after unrelated tests have started running)*
2026-05-11 12:23:55 -07:00
viyatb-oai
d0fa2d81d8 feat(connectors): support managed app tool approval requirements (#21061)
## Why

Managed requirements can already centrally disable apps, but they could
not express the per-tool app approval rules that normal config already
supports. That left admins without a way to enforce connector tool
approvals through `/etc/codex/requirements.toml` or cloud requirements.

## What changed

- Extend app requirements with per-tool `approval_mode` entries.
- Merge managed app tool requirements across managed sources while
preserving higher-precedence exact tool settings.
- Apply managed tool approvals separately from user app config so
managed policy is matched only on raw MCP `tool.name`, while user config
keeps the existing raw-name-then-title convenience fallback.
- Add coverage for local requirements, cloud requirements parsing,
managed-over-user precedence, and a title-collision case that must not
widen managed auto-approval.

## Configuration shape

Local `/etc/codex/requirements.toml` and cloud requirements use the same
TOML shape:

```toml
[apps.connector_123123.tools."calendar/list_events"]
approval_mode = "approve"
```

This is a per-tool approval rule keyed by app ID and raw MCP tool name,
not an app-level boolean such as `apps.connector_123123.approve = true`.
2026-05-11 19:08:26 +00:00
viyatb-oai
6506765168 fix(permissions): preserve managed deny-read during escalation (#15977)
## Why

Managed filesystem `deny_read` requirements are administrator-enforced
restrictions on specific paths. Once those requirements are active,
Codex should not drop them just because an execution path would
otherwise leave the sandbox.

Before this change, an explicit escalation, a prefix-rule allow, a
sandbox-denial retry, or an app-server legacy sandbox override could
rebuild the runtime policy without those managed read-deny entries and
expose a path the administrator had marked unreadable.

This is narrower than general sandbox-mode constraints. If an enterprise
only sets `allowed_sandbox_modes`, a trusted `prefix_rule(..., decision
= "allow")` can still run its matching command unsandboxed; this PR only
preserves managed filesystem `deny_read` restrictions across those
paths.

## What Changed

- Mark filesystem policies built from managed `deny_read` requirements
so callers can tell when those deny entries must survive escalation.
- Preserve managed deny-read entries when runtime permission profiles
are rebuilt through protocol, app-server, or legacy sandbox-policy
compatibility paths.
- Keep managed deny-read attempts inside the selected sandbox on the
first attempt and after sandbox-denial retries.
- Preserve the same behavior in the zsh-fork escalation path, including
prefix-rule-driven escalation.
- Add a regression test showing the opposite case too: without managed
deny-read, a prefix-rule allow still chooses unsandboxed execution.

## Verification

Targeted automated verification:

```shell
cargo test -p codex-core shell_request_escalation_execution_is_explicit -- --nocapture
cargo test -p codex-core prefix_rule_uses_unsandboxed_execution_without_managed_deny_read -- --nocapture
cargo test -p codex-core prefix_rule_preserves_managed_deny_read_escalation -- --nocapture
cargo test -p codex-protocol permission_profile_round_trip_preserves_filesystem_policy_metadata -- --nocapture
cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable -- --nocapture
cargo test -p codex-app-server-protocol permission_profile_file_system_permissions_preserves_policy_metadata -- --nocapture
cargo check -p codex-app-server -p codex-tui
```

Smoke-test invocations:

```shell
# macOS exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",":project_roots"={"."="write","secrets"="none","future-secret"="none","**/*.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'

# Linux exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",glob_scan_max_depth=3,":project_roots"={"."="write","secrets"="none","future-secret"="none","**/*.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'
```

Observed manual smoke matrix:

| Case | macOS Seatbelt | Linux bubblewrap |
| --- | --- | --- |
| `cat allowed.txt` | Pass | Pass |
| `cat secrets/exact-secret.txt` | Blocked | Blocked |
| `cat envs/root.env` | Blocked | Blocked |
| `cat envs/nested/one.env` | Blocked | Blocked |
| `cat envs/nested/two.env` | Blocked | Blocked |
| `cat alias-to-secrets/exact-secret.txt` | Blocked | Blocked |
| Missing denied path | A file created after sandbox setup remained
unreadable | Creation was blocked by the reserved missing-path
placeholder, and the placeholder was cleaned up after exit |
| Real `codex exec` shell turn | Pass | Pass |

Notes:

- The Linux smoke run used the fallback glob walker because the devbox
did not have `rg` installed.
- The smoke matrix verifies the end-to-end filesystem behavior on macOS
and Linux; the escalation-specific behavior is covered by the focused
tests above.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Charlie Marsh <charliemarsh@openai.com>
2026-05-11 11:49:44 -07:00
Owen Lin
7bddb3083d fix(app-server): thread history redaction for remote clients (#22178)
## Summary

Remote clients can still receive large `thread/resume` histories when
prior turns include MCP tool call payloads or image-generation results.
This adds a temporary response-only redaction path for the known remote
client names.

Longer term we will move towards fully paginated APIs backed by SQLite.

## Changes

- Redact MCP tool call payload-bearing fields in `thread/resume`
responses for `codex_chatgpt_android_remote` and
`codex_chatgpt_ios_remote`.
- Drop `imageGeneration` items from those `thread/resume` responses.
- Keep redaction out of persisted rollout files, `thread/read`,
`thread/turns/list`, live notifications, and token usage replay.
- Cover the behavior with app-server helper tests and a v2 resume
integration test that checks both remote clients plus a non-target
control client.

## Testing

- `cargo test -p codex-app-server thread_resume_redaction`
- `cargo test -p codex-app-server
thread_resume_redacts_payloads_for_chatgpt_remote_clients`
2026-05-11 11:45:25 -07:00
Felipe Coury
90bd445e7f fix(exec-server): suppress Windows taskkill output (#22058)
## Summary

This is the `exec-server` follow-up to #21759.

#21759 fixed the Windows `taskkill` output leak for the `rmcp-client`
MCP teardown path, but #22050 showed that `exec-server` still had a
parallel `taskkill /T /F` cleanup path in
`exec-server/src/connection.rs`. Because that command inherited the
parent stdio handles, Windows could still print `SUCCESS:` lines into
the user's terminal during stdio child cleanup.

This change silences that remaining `exec-server` callsite by
redirecting `taskkill` stdin, stdout, and stderr to `Stdio::null()`.

## What Changed

- add a Windows-only `Stdio` import in `exec-server/src/connection.rs`
- redirect the `taskkill` command in `kill_windows_process_tree` to
`Stdio::null()` for stdin, stdout, and stderr
- keep the existing kill semantics unchanged by still checking
`.status()` and preserving the existing fallback/logging behavior

## How to Test

Manual validation is Windows-only, so I did not run the UI repro path
locally here.

1. On Windows, use a Codex build from this branch.
2. Exercise an `exec-server` stdio flow that spawns a child process tree
and then triggers transport cleanup.
3. Confirm the child process tree is still torn down.
4. Confirm the terminal no longer shows `SUCCESS: The process with PID
... has been terminated.` lines during cleanup.

Targeted tests:
- `cargo test -p codex-exec-server
client::tests::dropping_stdio_client_terminates_spawned_process --
--exact`
- `cargo test -p codex-exec-server
client::tests::malformed_stdio_message_terminates_spawned_process --
--exact`

Notes:
- `cargo test -p codex-exec-server` still hits unrelated local macOS
`sandbox-exec: sandbox_apply: Operation not permitted` failures in
`tests/file_system.rs`.

## References

- Fixes the remaining callsite discussed in #22050
- Related earlier fix: #21759
2026-05-11 15:40:56 -03:00
Dylan Hurd
e783dab44c fix(exec-policy) use is_known_safe_command less (#20305)
## Summary
Restricts behavior of `is_known_safe_command` only to modes where it is
explicitly part of the documented behavior:
- when `environment_lacks_sandbox_protections`
- in `AskForApproval::UnlessTrusted`

Notably, as a result of this, escalations for commands that pass
`is_known_safe_commands` are no longer auto-approved in
AskForApproval::OnRequest or AskForApproval::Granular.

## Testing
- [x] Updated unit tests
- [x] Updated approvals scenario tests.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-11 11:37:53 -07:00
canvrno-oai
eaf05c9002 Unified mentions in TUI (#19068)
This PR replaces the TUI’s file-only `@mention` popup with a unified
mentions experience. Typing `@...` now searches across filesystem
matches, installed plugins, and skills in one popup, with result types
clearly labeled and selectable from the same flow.

- Adds a unified `@mentions` popup that returns:
  - plugins
  - skills
  - files
  - directories

- Adds search modes so users can narrow the popup without changing their
query:
  - All Results _(default/same as Codex App)_
  - Filesystem Only
  - Plugins _(...and skills)_

- Preserves existing insertion behavior:
  - selected file paths are inserted into the prompt
  - paths with spaces are quoted
  - image file selections still attach as images when possible
  - selecting a plugin or skill inserts the corresponding `$name`
- the composer records the canonical mention binding, such as
`plugin://...` or the skill path

- Expanded `@mentions` rendering:
  - type tags for Plugin, Skill, File, and Dir
  - distinct plugin/filesystem colors
  - stable fixed-height layout (8 rows)
  - truncation behavior for narrow terminals

Note:
- The unified mentions popup does not display app connectors under
`@mention` results for Codex App parity. Connector mentions remain
available through the existing `$mention` path.


https://github.com/user-attachments/assets/f93781ed-57d3-4cb5-9972-675bc5f3ef3f
2026-05-11 11:34:52 -07:00
jif-oai
b401666ca5 Add process-scoped SQLite telemetry (#22154)
## Summary
- add SQLite init, backfill-gate, and fallback telemetry without
introducing a cross-cutting state-db access wrapper
- install one process-scoped telemetry sink after OTEL startup and let
low-level state/rollout paths emit through it directly
- add process-start metrics for the process owners that initialize
SQLite

---------

Co-authored-by: Owen Lin <owen@openai.com>
2026-05-11 11:32:40 -07:00
rhan-oai
cf6342b75b [codex-analytics] add turn tool counts to turn events (#21431)
## Summary
- accumulate completed tool-item counts per turn from the item lifecycle
- populate the reserved count fields on `codex_turn_event`
- add reducer coverage for zero-count turns and mixed completed tool
items

## Why
PR #17090 moved tool-item analytics onto the item lifecycle, so the turn
reducer can now derive the per-turn tool counts from the same completed
items instead of leaving the reserved fields null.

## Validation
- `just fmt`
- `cargo test -p codex-analytics`
2026-05-11 18:18:02 +00:00
Won Park
0dbad2a348 Make auto-review denial short-circuit use a rolling review window (#22110)
## Why

Long-running turns can accumulate enough denied auto-review decisions to
trip the global short-circuit even when those denials are spread far
apart. The breaker should still stop genuinely bad loops, but it should
judge recent behavior instead of lifetime turn history.

## What changed

- Replaced the lifetime `10 total denials` threshold with `10 denials in
the last 50 reviews`.
- Kept the existing `3 consecutive denials` interrupt behavior
unchanged.
- Tracked recent auto-review outcomes in the circuit breaker and updated
the warning copy to report the rolling-window count.
- Renamed the new rolling-window coverage to `auto_review_*` test names.
- Added coverage that confirms older denials fall out of the 50-review
window and no longer trigger the breaker.

## Validation

- `just fmt`
- `cargo test -p codex-core guardian_rejection_circuit_breaker --lib`
- `cargo test -p codex-core auto_review_rejection_circuit_breaker --lib`
2026-05-11 11:03:11 -07:00
Eric Traut
1e65b3e0af Fix goal update and add /goal edit command in TUI (#21954)
## Why

Users have requested the ability to edit a goal's objective after a goal
has been created. This PR exposes a new `/goal edit` command in the TUI
to address this request.

In the process of implementing this, I also noticed an existing bug in
the goal runtime. When a goal's objective is updated through the
`thread/goal/set` app server API, the goal runtime didn't emit a new
steering prompt to tell the agent about the new objective. This PR also
fixes this hole.

## What Changed

- Adds `/goal edit` in the TUI, opening an edit box prefilled with the
current goal objective.
- Keeps active and paused goals in their current state, resets completed
goals to active, keeps budget-limited goals budget-limited, and
preserves the existing token budget.
- Changes the existing `thread/goal/set` behavior so editing an
objective preserves goal accounting instead of resetting it. The older
reset-on-new-objective behavior was left over from before
`thread/goal/clear`; clients that need to reset accounting can now clear
the existing goal and create a new one.
- Reuses the existing goal set API path; this does not add or change
app-server protocol surface area.
- Adds a dedicated goal runtime steering prompt when an externally
persisted goal mutation changes the objective, so active turns receive
the updated objective.

## Validation

- Make sure `/goal edit` returns an error if no goal currently exists
- Make sure `/goal edit` displays an edit box that can be optionally
canceled with no side effects
- Make sure that an edited goal results in a steer so the agent starts
pursuing the new objective
- Make sure the new objective is reflected in the goal if you use
`/goal` to display the goal summary
- Make sure that `/goal edit` doesn't reset the token budget, time/token
accounting on the updated goal
2026-05-11 10:49:19 -07:00
jif-oai
32b1ae7099 chore: drop built-in MCPs (#22173)
Drop something that was never used
2026-05-11 19:45:08 +02:00
Ruslan Nigmatullin
a124ddb854 app-server: remove TCP websocket listener (#21843)
## Why

The app-server no longer needs to expose a TCP websocket listener.
Keeping that transport also kept around a separate listener/auth surface
that is unnecessary now that local clients can use stdio or the
Unix-domain control socket, while remote connectivity is handled by
`remote_control`.

## What Changed

- Removed `ws://IP:PORT` parsing and the `AppServerTransport::WebSocket`
startup path.
- Deleted the app-server websocket listener auth module and removed
related CLI flags/dependencies.
- Kept websocket framing only where it is still needed: over the
Unix-domain control socket and in the outbound `remote_control`
connection.
- Updated app-server CLI/help text and `app-server/README.md` to
document only `stdio://`, `unix://`, `unix://PATH`, and `off` for local
transports.
- Converted affected app-server integration coverage from TCP websocket
listeners to UDS-backed websocket connections, and added a parse test
that rejects `ws://` listen URLs.
- Removed the now-unused workspace `constant_time_eq` dependency and
refreshed `Cargo.lock` after `cargo shear` caught the drift.
- Moved test app-server UDS socket paths to short Unix temp paths so
macOS Bazel test sandboxes do not exceed Unix socket path limits.

## Verification

- Added/updated tests around UDS websocket transport behavior and
`ws://` listen URL rejection.
- `cargo shear`
- `cargo metadata --no-deps --format-version 1`
- `cargo test -p codex-app-server unix_socket_transport`
- `cargo test -p codex-app-server unix_socket_disconnect`
- `just fix -p codex-app-server`
- `git diff --check`

Local full Rust test execution was blocked before compilation by an
external fetch failure for the pinned `nornagon/crossterm` git
dependency. `just bazel-lock-update` and `just bazel-lock-check` were
retried after the manifest cleanup but remain blocked by external
BuildBuddy/V8 fetch timeouts.
2026-05-11 10:17:26 -07:00
Eric Traut
f10ddc3f13 Use goal preview metadata for goal-first threads (#21981)
Fixes #20792

## Why

`/goal`-first threads are valid resumable threads, but they can be
missing from `codex resume` and app recents because discovery depends on
metadata derived from a normal first user message.

PR #21489 attempted to fix this by using the goal objective as
`first_user_message`. Review feedback pointed out that
`first_user_message` does more than provide visible text today: it gates
listing, supplies preview text, and participates in deciding whether a
later title should surface as a distinct thread name. Reusing it for the
goal objective could leave a `/goal`-first thread with
`first_user_message=<goal>` and `title=<later prompt>`, even though the
goal should only provide the initial visible preview.

This PR follows that feedback by and keeps the `first_user_message` as
is but introduces a new `preview` field to separate concerns. The
`preview` field is populated from the first user message or the goal
objective. We can extend it in the future to include other sources.

## What Changed

- Added internal thread `preview` metadata in `codex-state`, including a
SQLite migration that backfills from `first_user_message` and from
existing `thread_goals` objectives when needed.
- Treated `ThreadGoalUpdated` as preview-bearing metadata so goal-first
threads can be listed and searched without mutating
`first_user_message`.
- Updated rollout listing, state queries, thread-store conversion, and
app-server mapping to use preview metadata while continuing to expose
the existing public `preview` field.
- Preserved title/name distinctness behavior around literal
`first_user_message`, so a later normal prompt after `/goal` does not
surface as a separate name just because the goal supplied the initial
preview.
- Preserved compatibility for older/internal metadata writes by deriving
preview from `first_user_message` when explicit preview metadata is
absent.

## Verification

- Manually verified that a thread that starts with a `/goal <objective>`
shows up in the resume picker.
2026-05-11 10:12:46 -07:00
Eric Traut
96836e15ed Improve goal continuation based on feedback (#22045)
## Summary

This PR updates the goal continuation prompt to address feedback from
early adopters. There are two primary changes:

1. Goal continuation and budget-limit steering prompts now use hidden
user-context messages instead of hidden developer messages.
2. The goal continuation prompt is refined to improve the model's
ability to fully complete the active goal rather than stop at a smaller
or merely passing subset.

The user-message transition is important for two reasons. First, it
eliminates an issue where older steering messages could be responded to
again after a new turn. Second, it works better with compaction because
user messages are treated differently from developer messages during
compaction.

The prompt refinements make persistence explicit, ground work in current
evidence, encourage `update_plan` for multi-step progress visibility,
and require stronger completion audits before calling `update_goal`. It
also removes the elapsed-time reporting in the prompt; I saw evidence
that this was causing the model to shortcut work as it became nervous
about time.

These changes were tested with evals. Chriss4123 has also been running
independent evals in
[#19910](https://github.com/openai/codex/issues/19910), and many of the
improvements in this PR were suggested by him.

## Verification

- Tested with evals.
- Added and updated focused `codex-core` coverage for hidden goal user
context, continuation and budget-limit request shape, prompt rendering,
and objective delimiter escaping.
2026-05-11 09:51:21 -07:00
Eric Traut
c03eb20d8d Fix side conversation config inheritance (#22106)
Addresses #22101

## Why

Side conversations are ephemeral forks of the active thread, but `/side`
was building its fork config from the app-level config after refreshing
it from disk. If the parent thread had runtime settings that differed
from the current persisted defaults, such as a changed model, reasoning
effort, permissions, reviewer, or fast-mode selection, the side
conversation could start with different behavior than its parent.

## What changed

- Build side fork config from the active parent `ChatWidget` config,
then overlay the parent thread's effective model, reasoning effort,
service tier, and fast-mode opt-out state.
- Forward model reasoning summary, verbosity, personality, web search
mode, and service-tier overrides through TUI app-server
start/resume/fork lifecycle params.
- Add focused tests for parent runtime inheritance, side developer
guardrail preservation, and lifecycle param forwarding.
2026-05-11 09:47:51 -07:00
Ahmed Ibrahim
69f3183a8e Revert "[codex] Harden overflow auto-compaction recovery" (#22170)
Reverts openai/codex#22141
2026-05-11 19:33:15 +03:00
Ahmed Ibrahim
15e79f3c26 [codex] Harden overflow auto-compaction recovery (#22141)
## Why
Dogfooder feedback exposed two correctness gaps in normal-loop overflow
recovery:

1. a sampling request that hit `ContextWindowExceeded` could keep
re-entering auto-compaction indefinitely if the compacted retry still
did not fit, and
2. local compact-history rebuilds flattened user messages down to text,
so an overflowing `[image, "what is this?"]` turn could be retried
without the image after compaction.

That means recovery could either fail to terminate cleanly or proceed
with a materially weakened version of the user request.

## What changed
- Move normal-loop `ContextWindowExceeded` handling into the sampling
retry loop, so successful rescue compaction consumes the provider retry
budget instead of creating an unbounded outer-turn loop.
- Keep compacted user-history rebuilds structured:
`collect_user_messages` now carries user `UserInput` content rather than
flattened strings, and `build_compacted_history` reconstructs full user
messages from that structured representation.
- Preserve image inputs while retaining the existing text-budget
truncation behavior for compacted user history.
- Preserve existing compaction-task failure handling and client-session
reset behavior while bounding repeated overflow retries.
- Add focused regression coverage for:
  - recovery after a normal-loop overflow,
  - retry-budget exhaustion after repeated overflow,
  - local recovery preserving image + text input,
  - remote recovery preserving image + text input,
  - remote compaction v2 preserving image + text input, and
  - compaction failure still terminating cleanly.

The main behavior changes are in `codex-rs/core/src/session/turn.rs` and
`codex-rs/core/src/compact.rs`.

## Verification
- Not run locally; relying on PR CI for this update.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-11 16:16:49 +00:00
Eric Traut
2229c8daf2 Persist /goal commands in history (#21860)
## Summary

A user reported that `/goal` was not saved to the TUI command history,
which made it unavailable for later recall even though other accepted
input paths persist history entries.

This updates the TUI goal slash-command dispatch so successful `/goal`
invocations append the command text to message history. The change
covers the bare `/goal` menu command, goal control commands such as
`/goal pause`, and objective-setting commands such as `/goal improve
benchmark coverage`.

## Verification

- `cargo test -p codex-tui goal_slash_command -- --nocapture`
2026-05-11 08:43:55 -07:00
Andrey Mishchenko
704ad620f6 Add x-codex-ws-stream-request-start-ms (#22113)
For capturing client-side timing information.
2026-05-11 08:15:52 -07:00
jif-oai
8e12c12a07 feat: move extensions tool (#22163)
This PR is just moving stuff around
2026-05-11 17:14:43 +02:00
jif-oai
672cc1f669 feat: wire extension tool bundles into core (#22147)
## Why

This is the next narrow step toward moving concrete tool families out of
core. After #22138 introduced `codex-tool-api`, we still needed a real
end-to-end seam that lets an extension own an executable tool definition
once and have core install it without the temporary `extension-api`
wrapper or a dependency on `codex-tools`.

`codex-tool-api` is the small extension-facing execution contract, while
`codex-tools` still has a different job: host-side shared tool metadata
and planning logic that is not “run this contributed tool”, like spec
shaping, namespaces, discovery, code-mode augmentation, and
MCP/dynamic-to-Responses API conversion

## What changed

- Moved the shared leaf tool-spec and JSON Schema types into
`codex-tool-api`, so the executable contract now lives with
[`ToolBundle`](c538758095/codex-rs/tool-api/src/bundle.rs (L19-L70)).
- Replaced the temporary extension-side tool wrapper with direct
`ToolBundle` use in `codex-extension-api`.
- Taught core to collect contributed bundles, include them in spec
planning, register them through
[`ToolRegistryBuilder::register_tool_bundle`](c538758095/codex-rs/core/src/tools/registry.rs (L653-L667)),
and dispatch them through the existing router/runtime path.
- Added focused coverage for contributed tools becoming model-visible
and dispatchable, plus spec-planning coverage for contributed function
and freeform tools.

## Verification

- Added `extension_tool_bundles_are_model_visible_and_dispatchable` in
`core/src/tools/router_tests.rs`.
- Added spec-plan coverage in `core/src/tools/spec_plan_tests.rs` for
contributed extension bundles.

## Related

- Follow-up to #22138
2026-05-11 16:42:29 +02:00
jif-oai
7e15e6db9e [codex] default unknown contributed tools to mutating (#22143)
## Summary
- make the shared `ToolExecutor::is_mutating` default conservative by
returning `true`
- update the trait docs to say read-only tools should opt out explicitly
- add a regression test covering the default behavior

## Why
Hosts use this signal for serialization and approval policy. Treating
unknown contributed tools as read-only lets a write-capable tool
accidentally bypass mutating-tool safeguards if it forgets to override
the hook.

## Validation
- not run, per request
2026-05-11 14:39:21 +02:00
jif-oai
ebd3d53451 feat: drop CodexExtension (#22140)
Drop `CodexExtension` as not needed for now
2026-05-11 14:19:51 +02:00
jif-oai
95bfea847d refactor: extract executable tool contracts into codex-tool-api (#22138)
## Why
The tool-extraction work needs one shared executable-tool seam that
hosts and tool owners can depend on without reaching into `codex-core`.
Landing that seam first makes the later tool-family ports incremental
and keeps the reusable contract separate from any one migration.

## What changed
- add a new `codex-tool-api` crate and workspace wiring
- move the common executable-tool contracts into that crate:
`ToolBundle`, `ToolDefinition`, `ToolExecutor`, `ToolCall`, `ToolInput`,
`ToolOutput`, `JsonToolOutput`, and `ToolError`
- keep host state generic through `ToolBundle<C>` / `ToolCall<C>` so
later integrations can provide their own runtime context without baking
core types into the API
- carry the host signals the runtime will need later, including
parallel-call support and mutability probing
- leave existing tool families in place for now; this PR only
establishes the reusable API surface
- add the Bazel target and lockfile updates for the new crate

## Testing
- `cargo test -p codex-tool-api`
2026-05-11 13:56:59 +02:00
jif-oai
569ff6a1c4 extension: move git attribution into an extension (#21738)
## Why

Git commit attribution is prompt policy, not session orchestration.
After #21737 adds the extension-registry seam, this moves that
prompt-only behavior out of `codex-core` so `Session` can consume
extension-contributed prompt fragments instead of owning a one-off
policy path itself.

Before this PR, `Session` injected the trailer instruction directly from
`codex-core` ([session
assembly](a57a747eb6/codex-rs/core/src/session/mod.rs (L2733-L2739)),
[helper
module](a57a747eb6/codex-rs/core/src/commit_attribution.rs (L1-L33))).
This branch moves that same responsibility into
[`codex-git-attribution`](b5029a6736/codex-rs/ext/git-attribution/src/lib.rs (L14-L100)).

## What changed

- Added the `codex-git-attribution` extension crate.
- Snapshot `CodexGitCommit` plus `commit_attribution` at thread start,
then contribute the developer-policy fragment through the extension
registry.
- Register the extension in app-server thread extensions.
- Remove the old `codex-core` helper module and direct `Session`
injection path.

This keeps the existing behavior intact: the prompt is only contributed
when `CodexGitCommit` is enabled, blank attribution still disables the
trailer, and the default remains `Codex <noreply@openai.com>`.

## Stack

- Stacked on #21737.
2026-05-11 12:53:15 +02:00
jif-oai
436c0df658 extension: wire extension registries into sessions (#21737)
## Why

[#21736](https://github.com/openai/codex/pull/21736) introduces the
typed extension API, but the runtime does not yet carry a registry
through thread/session startup or give contributors host-owned stores to
read from. This PR wires that host-side path so later feature migrations
can move product-specific behavior behind typed contributions without
adding another bespoke seam directly to `codex-core`.

## What changed

- Thread `ExtensionRegistry<Config>` through `ThreadManager`,
`CodexSpawnArgs`, `Session`, and sub-agent spawn paths.
- Wire `ThreadStartContributor` and `ContextContributor`
- Expose the small supporting surface needed by non-core callers that
construct threads directly, including `empty_extension_registry()`
through `codex-core-api`.

This PR lands the host plumbing only: the app-server registry is still
empty, and concrete feature migrations are intended to follow
separately.
2026-05-11 11:38:18 +02:00
jif-oai
d2c3ebac1f extension: add initial typed extension API (#21736)
## Why

`codex-core` still owns a growing amount of product-specific behavior.
This PR starts the extraction path by introducing a small, typed
first-party extension seam: features can install the contribution
families they actually own, while the host keeps lifecycle and state
ownership instead of pushing a broad service locator into the API.

See the `examples/` for illustration

## Known limitations
* Tool contract definition will be shared with core
* Fragments must be extracted
* Missing some contributors
2026-05-11 11:06:24 +02:00
xli-oai
2abdeb34d5 Read cached metadata for installed Git plugins (#20825)
## Summary
- Populate `plugin/list` interface metadata for installed Git-sourced
marketplace plugins from the active cached plugin bundle.
- Preserve marketplace category precedence so list behavior matches
`plugin/read`.
- Keep existing fallback behavior when the cache or manifest is missing
or invalid.

## Test Plan
- `cd codex-rs && just fmt`
- `cd codex-rs && cargo test -p codex-core-plugins
list_marketplaces_installed_git_source_reads_metadata_from_cache_without_cloning`
- `cd codex-rs && cargo test -p codex-app-server
plugin_list_returns_installed_git_source_interface_from_cache`
- `cd codex-rs && just fix -p codex-core-plugins`
- `cd codex-rs && just fix -p codex-app-server`
- `git diff --check`

Server-truth check: OpenAI monorepo app-server generated types already
expose `PluginSummary.interface`, and the webview consumes it for plugin
cards. This PR keeps the protocol/schema unchanged and fills the
existing field from the cached installed bundle for Git-backed
cross-repo plugins.
2026-05-10 16:59:57 -07:00
Felipe Coury
5248e3da2b feat(tui): render responsive Markdown tables in TUI (#22052)
## Why

The TUI currently treats Markdown tables as ordinary wrapped text, which
makes table-heavy responses hard to read and brittle across narrow panes
and terminal resizes.

This change teaches the TUI to render Markdown tables responsively while
preserving the raw Markdown source needed to re-render streamed and
finalized transcript content after width changes. The goal is to keep
tables legible during streaming, after resize, and once a turn has
finished, without corrupting scrollback ordering.

## What Changed

- add table detection and responsive table rendering in the Markdown
renderer
- render standard tables with Unicode box-drawing borders when the pane
is wide enough
- add a vertical readability fallback for constrained or dense tables so
narrow panes still show each row clearly
- keep links and `<br>` content inside table cells instead of leaking
text outside the table
- avoid table normalization inside fenced or indented code blocks
- preserve raw streamed Markdown source and keep the active table as a
mutable tail until finalization
- consolidate finalized streamed content into source-backed transcript
cells so post-resize re-rendering stays correct
- add snapshot and targeted streaming/resize regression coverage for the
new table behavior

## How to Test

1. Start Codex TUI from this branch.
2. Paste this exact prompt:
`This is a session to test codex, no need to do any thinking, just end
different markdown tables, with columns exploring different markdown
contents, like links, bold italic, code, etc. Make them different sizes,
some 30+ rows, some not and intertwine them with some paragraphs with
complex formatting as well.`
3. Confirm the response includes several Markdown tables mixed with
richly formatted paragraphs.
4. Confirm wide-enough tables render with box-drawing borders instead of
plain wrapped pipe text.
5. Resize the terminal narrower while the answer is still streaming and
confirm the in-progress table stays coherent instead of duplicating
headers or leaving broken scrollback behind.
6. Resize again after the turn finishes and confirm the finalized
transcript re-renders cleanly at the new width.
7. In a narrow pane, verify dense tables fall back to the vertical
per-row layout instead of producing unreadable wrapped columns.
8. Also verify pipe-heavy fenced code blocks still render as code, not
as tables.

Targeted tests:
- `cargo test -p codex-tui table_readability_fallback --no-fail-fast`
- `cargo test -p codex-tui markdown_render --no-fail-fast`
- `cargo test -p codex-tui streaming::controller --no-fail-fast`
- `cargo test -p codex-tui table_resize_lifecycle --no-fail-fast`

## Docs

No developer docs update appears necessary.
2026-05-10 20:42:11 +00:00
Eric Traut
76845d716b Deduplicate issue digest interactions by user (#22039)
## Summary

The issue digest uses recent posts, comments, and reactions to decide
which issues deserve attention. A single active user could previously
raise an issue's apparent importance by commenting or reacting multiple
times in the window.

This changes `codex-issue-digest` so `user_interactions` counts unique
human GitHub users per issue across new issue posts, new comments, and
new reactions. Raw reaction/comment counts are still preserved for
detail output, and the skill guidance now describes `Interactions` as a
unique-human-user count.
2026-05-10 09:55:42 -07:00
Felipe Coury
e5d022297d fix(tui): suppress taskkill output for MCP teardown on Windows (#21759)
## Why

On native Windows, running `/mcp` can leak `taskkill`'s normal
`SUCCESS:` messages into the Codex TUI while the temporary MCP inventory
process tree is being torn down. That corrupts the screen even though
MCP itself is working correctly.

Fixes #20845.

## What Changed

- Redirect the Windows-only MCP teardown `taskkill` subprocess to null
stdio so its console output cannot reach the TUI.

## How to Test

1. On native Windows, configure a stdio MCP server, for example:
   ```powershell
codex mcp add sequential-thinking -- npx -y
@modelcontextprotocol/server-sequential-thinking
   ```
2. With the latest released Codex CLI, start Codex and run `/mcp`.
3. Confirm the current behavior: `taskkill` `SUCCESS:` lines appear in
the TUI during the MCP refresh.
4. Switch to this branch's build, start Codex again, and run `/mcp`.
5. Confirm the MCP inventory still renders normally and the `taskkill`
lines no longer appear.
6. Repeat `/mcp` once more on this branch to verify the regression does
not recur on repeated inventory requests.

Targeted tests:
- `cargo test -p codex-rmcp-client`
- `cargo test -p codex-rmcp-client --test process_group_cleanup --quiet`
2026-05-10 15:51:26 +00:00
Felipe Coury
cac5354455 fix(tui): preserve Shift+Enter in tmux csi-u panes (#21943)
## Why

Inside tmux, `Shift+Enter` can still reach Codex as a plain `Enter` even
when tmux has extended keys enabled. In `csi-u` tmux panes, Codex needs
to request `modifyOtherKeys` mode 2 so tmux moves the pane from `VT10x`
into extended-key mode and preserves the Shift modifier. Without that
extra request, composer `Shift+Enter` submits the draft instead of
inserting a newline.

Fixes #21699.

## What Changed

- Detect tmux sessions and read the active `extended-keys-format`,
preferring the pane-local value before falling back to the global
option.
- Request `modifyOtherKeys` mode 2 for tmux panes using `csi-u` extended
keys, and reset it when restoring keyboard reporting.
- Add unit coverage for tmux detection, the format gate, and the emitted
`modifyOtherKeys` escape sequence.

## How to Test

1. In tmux, configure:
   ```tmux
   set-option -g extended-keys on
   set-option -g extended-keys-format csi-u
   ```
2. Start Codex in a fresh tmux pane from this branch.
3. From another pane, confirm the Codex pane reports `mode=Ext 2`:
   ```bash
tmux list-panes -a -F '#{session_name}:#{window_index}.#{pane_index}
mode=#{pane_key_mode} cmd=#{pane_current_command}'
   ```
4. Type a draft in the composer and press `Shift+Enter`; confirm it
inserts a newline instead of submitting.
5. Also confirm plain `Enter` still submits as before.

Targeted tests:
- `cargo test -p codex-tui`

## Notes

- Manual verification used both real `Shift+Enter` in iTerm2/tmux and
`tmux send-keys ... S-Enter` to confirm the tmux pane changes from
`VT10x` to `Ext 2` and preserves newline behavior.
- On this checkout, the broader `codex-tui` test run currently reaches
unrelated existing failures in `status::tests::*` plus a later stack
overflow in
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.
2026-05-10 11:45:49 -03:00
Ahmed Ibrahim
178c3d3005 Persist 'priority' service tier as fast in config (#21991)
### Motivation
- Normalize persisted service tier so selecting the request value
`priority` (or legacy `fast`) is stored as `fast` while preserving
unknown tier IDs and keeping request-time behavior unchanged.

### Description
- Update persistence logic in `codex-rs/core/src/config/edit.rs` so
`ConfigEdit::SetServiceTier` maps request values: `priority`/`fast` ->
`"fast"`, `flex` -> `"flex"`, and leaves unknown strings unchanged.
- Add unit tests in `codex-rs/core/src/config/edit_tests.rs` that verify
a `priority` selection is written to `config.toml` as `"fast"` and that
unknown tiers are preserved.
- Add a config load test in `codex-rs/core/src/config/config_tests.rs`
to ensure `service_tier = "priority"` still resolves to the `priority`
request value at load time.
- Add the required import `use
codex_protocol::config_types::ServiceTier;` to the edited modules.

### Testing
- Ran `just fmt` and `just fix -p codex-core` to apply formatting and
lints and they completed successfully.
- Ran `cargo test -p codex-core --lib service_tier` (focused unit tests
for the change) and the tests passed.
- Ran `cargo test -p codex-protocol` and the protocol test suite passed.
- Note: an initial broader `cargo test -p codex-core service_tier`
invocation matched integration tests and produced unrelated
failures/hangs, so that run was interrupted and the focused `--lib`
unit-test invocation was used instead.

------
[Codex
Task](https://chatgpt.com/codex/cloud/tasks/task_i_69ffc5a1262c8321af91b69c9845147f)
2026-05-10 06:22:46 +03:00
Eric Traut
789b7e39dc Split ChatWidget state into focused modules (#21866)
## Summary

`ChatWidget` has been carrying several independent domains in one large
state bag: transcript bookkeeping, turn lifecycle, queued input, status
surfaces, connectors, review mode, and protocol dispatch. That makes
otherwise-local changes hard to reason about because unrelated fields
and side effects live beside each other in `chatwidget.rs`.

This is the first cleanup PR in a larger decomposition effort. It does
not try to make `chatwidget.rs` small in one sweep; instead, it
establishes focused state boundaries that later handler, popup,
rendering, and effect-synchronization extractions can build on.

This PR keeps `ChatWidget` as the composition layer while moving focused
state into smaller `codex-tui` modules. The widget still owns effects
that touch the bottom pane, app events, command submission, redraw
scheduling, and terminal-title updates.

## Changes

- Add focused state modules under `codex-rs/tui/src/chatwidget/` for
input queues, turn lifecycle, transcript bookkeeping, status state,
connectors, review mode, and app-server protocol dispatch.
- Update `ChatWidget` to hold grouped state structs and route
input/lifecycle/status operations through those focused helpers.
- Move app-server notification dispatch into `chatwidget/protocol.rs`
while leaving feature handlers and side effects on `ChatWidget`.
- Replace the large manual `ChatWidget` test literal with the normal
constructor plus narrow test overrides, so future state moves do not
require every field to be restated in test setup.
- Update existing tests to access the new grouped state or narrower
helpers without changing snapshot behavior.

## Longer-term direction

Follow-up PRs can continue shrinking `chatwidget.rs` by moving behavior,
not just state, into focused modules:

- Extract input/submission flow, turn/stream handling, and tool-cell
lifecycles into domain modules that call the new state reducers.
- Move popup/settings builders and rendering helpers out of the main
widget file so `ChatWidget` stays focused on composition.
- Reduce direct `BottomPane` mutation by applying domain-specific sync
outputs at clearer boundaries.
2026-05-09 15:16:01 -07:00
Eric Traut
90c0bec50c Avoid blocking TUI on agent metadata hydration (#21870)
## Why

Fixes #16688.

The TUI currently hydrates collab receiver metadata by awaiting
`thread/read` before each active-thread notification is rendered. During
large subagent fan-outs, the embedded app-server can be busy starting
agents and processing spawn work, so those synchronous metadata reads
queue behind the fan-out and block the TUI event loop. That makes the UI
appear frozen even though the underlying agent work can continue.

## What Changed

- Replaced eager `thread/read` metadata hydration on the active
notification path with local receiver-thread caching.
- Kept `ThreadStarted` and picker refreshes as the places that fill in
agent nickname/role metadata when it is available.
- Skipped caching receiver threads that are explicitly reported as
`NotFound`, avoiding live-looking ghost entries for failed stale-agent
calls.
- Added TUI tests covering both local receiver caching and `NotFound`
suppression.

## Verification

- `cargo test -p codex-tui collab_receiver_notification`
- `just fix -p codex-tui`

I also ran the full `cargo test -p codex-tui`; the new test passed, but
the full process later aborted with an unrelated stack overflow in
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.
2026-05-09 15:15:40 -07:00
Abhinav
6d747db7d8 Improve hooks trust flow in TUI (#21755)
# Why
Hooks that need trust review were easy to miss, and the existing TUI
flow made users discover `/hooks` manually before they could decide
whether to inspect or trust them.

# What
- add a startup review prompt for new or changed hooks before normal
composer use
- add a top-level `t` shortcut in `/hooks` to trust every review-needed
hook at once
- make pending-review rows and helper copy use warning styling

## TUI

### Startup review interstitial

```text
Hooks need review
2 hooks are new or changed.
Hooks can run outside the sandbox after you trust them.

› 1. Review hooks
  2. Trust all and continue
  3. Continue without trusting (hooks won't run)
```

### Top-level `/hooks` page when review is needed

```text
Hooks
Lifecycle hooks from config and enabled plugins.

⚠ 1 hook needs review before it can run.

Event                 Installed   Active   Review   Description
PreToolUse            1           0        1        Before a tool executes
...

Press t to trust all; enter to review hooks; esc to close
```
2026-05-09 21:17:30 +00:00
Felipe Coury
53468b97f6 fix(tui): improve light-mode selection contrast (#21950)
## Why

On light terminal backgrounds, selected rows in several TUI pickers were
rendered with the same bright cyan accent used on dark themes. Against
the light menu surface, that made the current selection hard to
distinguish at a glance.

<table><tr>
<td>
<p align="center">Before</p>
<img width="1109" height="864" alt="SCR-20260509-nmtz"
src="https://github.com/user-attachments/assets/b31ce0d0-19c2-4bdd-a220-7acc77bd8e8e"
/>
</td>
<td>
<p align="center">After</p>
<img width="1164" height="844" alt="SCR-20260509-nmox"
src="https://github.com/user-attachments/assets/7b3fede0-4739-4a9f-a979-cdbb7451841f"
/>
</td>
</tr></table>

## What changed

- Added a shared background-aware accent style for active/selected TUI
controls.
- Use a darker cyan-family accent on light backgrounds while preserving
the existing bright cyan accent on dark or unknown backgrounds.
- Reused that accent across shared picker rows and the custom
selection-like surfaces that had drifted separately: picker tabs, hooks
browsing, external-agent migration choices, and /keymap affordances.
- Added focused tests for the light/dark accent rule and rendered
selected-row styling.

## How to Test

1. Start Codex in a terminal using a light background theme.
2. Type `/` to open the slash-command picker and move the selection
through a few rows.
3. Confirm that the selected row is visibly colored with strong contrast
instead of blending into the popup surface.
4. Open `/keymap` and confirm the active tab, selected rows, and picker
hint accents use the same light-theme accent treatment.
5. In a dark terminal theme, repeat the slash-picker check and confirm
the existing bright cyan selection styling is preserved.

Targeted tests:
- `cargo test -p codex-tui accent_style_uses_`
- `cargo test -p codex-tui selected_rows_use_the_shared_accent_style`
- `cargo test -p codex-tui
selected_event_rows_use_the_shared_accent_style`

Notes:
- A full `cargo test -p codex-tui` run reached the end of the suite but
hit an unrelated existing stack overflow in
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.
2026-05-09 16:10:56 -03:00
Felipe Coury
f27cf9db09 fix(tui): preserve wrapped prose beside URLs (#21760)
## Why

Mixed prose lines that contained URLs started taking the URL-preserving
wrapping path, but that path could split ordinary words mid-token. A
follow-up issue remained in scrollback insertion: when already-rendered
indented rows were wrapped again, continuation rows could lose their
margin and fall back to terminal hard wrapping. Together those bugs made
normal Markdown output look broken around links, lists, blockquotes, and
indented content.

Separately, the local argument-comment lint wrappers failed under
environments that set `PYTHONSAFEPATH=1`, because Python no longer adds
the script directory to `sys.path` automatically. That prevented the
lint from reaching Rust callsites at all.

<img width="1778" height="1558" alt="CleanShot 2026-05-09 at 11 51 38"
src="https://github.com/user-attachments/assets/9274d150-1757-4f1a-89ac-5bdc9997d8cb"
/>

## What Changed

- Preserve URL tokens without turning every neighboring prose word into
a character-level split point.
- Add a mixed URL/prose wrapper that keeps ordinary words whole,
preserves leading whitespace, and re-splits long non-URL tokens against
the actual width available on continuation rows.
- Reuse a rendered history row's leading whitespace as the continuation
indent when scrollback insertion has to pre-wrap it again.
- Add regression coverage for markdown wrapping, history-cell rendering,
scrollback continuation margins, leading-indent width accounting, and
continuation-row re-splitting.
- Make both argument-comment lint entrypoints explicitly add their own
directory to `sys.path`, so sibling imports still work when
`PYTHONSAFEPATH=1`.

## How to Test

1. Start Codex and render a long Markdown response that mixes prose with
inline links, blockquotes, lists, and indented code-like text.
2. Confirm that ordinary words next to links stay whole instead of
breaking mid-word.
3. Resize or replay the transcript and confirm wrapped continuation rows
keep their expected left margin for blockquotes, lists, and indented
content.
4. Run the source argument-comment lint from a shell with
`PYTHONSAFEPATH=1` and confirm it starts normally instead of failing to
import `wrapper_common`.

Targeted tests:
- `cargo test -p codex-tui mixed_line --lib`
- `cargo test -p codex-tui preserves_prefix_on_wrapped_rows --lib`
- `cargo test -p codex-tui
agent_markdown_cell_does_not_split_words_after_inline_markdown --lib`
- `cargo test -p codex-tui
mixed_url_markdown_wraps_prose_without_splitting_words_snapshot --lib`
- `python3 tools/argument-comment-lint/test_wrapper_common.py`
- `just argument-comment-lint-from-source -p codex-tui -- --lib`

Notes:
- `cargo test -p codex-tui` currently reaches the new tests
successfully, then still aborts in the pre-existing
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all`
stack-overflow failure.
2026-05-09 13:58:10 -03:00
Michael Bolin
0c70698e24 tests: cover sandbox link write behavior (#21819)
## Why

[PR #1705](https://github.com/openai/codex/pull/1705) moved
`apply_patch` execution under the configured sandbox and called out the
need for integration coverage. We already covered textual `../` escapes,
but did not have coverage for link aliases that live inside a writable
workspace while pointing at, or aliasing, files visible outside it.

This PR locks in the current sandbox boundary without changing
production write semantics. Symlink escapes into a read-only outside
root should fail and leave the outside file unchanged. Existing hard
links are characterized separately: if a user-created hard link already
exists inside the writable root, sandboxed writes preserve normal
hard-link semantics rather than replacing the link and silently breaking
that relationship.

## What Changed

- Added
`apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
to verify `apply_patch` cannot update a symlink that targets a file
outside the writable workspace.
- Added `apply_patch_cli_preserves_existing_hard_link_outside_workspace`
to verify `apply_patch` intentionally writes through an existing hard
link and does not unlink or replace it.
- Added `file_system_sandboxed_write_preserves_existing_hard_link` to
verify sandboxed `fs/writeFile` preserves an existing hard link and
writes the shared inode.

## Testing

- `cargo test -p codex-exec-server file_system_sandboxed_write`
- `cargo test -p codex-core
apply_patch_cli_does_not_write_through_symlink_escape_outside_workspace`
- `cargo test -p codex-core
apply_patch_cli_preserves_existing_hard_link_outside_workspace`
- `just fix -p codex-exec-server -p codex-core`
- `just fix -p codex-core`



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21819).
* #21845
* __->__ #21819
2026-05-09 08:28:15 -07:00
Ahmed Ibrahim
fca81eeb5b [codex] Lowercase TUI service tier commands (#21906)
## Why

Service-tier slash commands are built from model-catalog metadata. If
the catalog returns a name like `Fast`, the TUI currently exposes
`/Fast` and exact dispatch expects that casing, which is inconsistent
with the lowercase command style used elsewhere.

## What

- Lowercase service-tier command names when converting catalog tiers
into `ServiceTierCommand` values.
- Add regression coverage that seeds a catalog tier named `Fast` and
expects the generated command to be `fast`.

## Testing

Not run locally per repo instruction; PR CI should run the new
`service_tier_commands_lowercase_catalog_names` coverage.
2026-05-09 14:29:12 +03:00
Ahmed Ibrahim
ebe75bb683 Route Python SDK turn notifications by ID (#21778)
## Why

The Python SDK previously protected the stdio transport with a single
active turn-consumer guard. That avoided competing reads from stdout,
but it also meant one `Codex`/`AsyncCodex` client could not stream
multiple active turns at the same time. Notifications could also arrive
before the caller received a `TurnHandle` and registered for streaming,
so the SDK needed an explicit routing layer instead of letting
individual API calls read directly from the shared transport.

## What Changed

- Added a private `MessageRouter` that owns per-request response queues,
per-turn notification queues, pending turn-notification replay, and
global notification delivery behind a single stdout reader thread.
- Generated typed notification routing metadata so turn IDs come from
known payload shapes instead of router-side attribute guessing, with
explicit fallback handling for unknown notification payloads.
- Updated sync and async turn streaming so `TurnHandle.stream()`/`run()`
and `stream_text()` consume only notifications for their own turn ID,
while `AsyncAppServerClient` no longer serializes all transport calls
behind one async lock.
- Cleared pending turn-notification buffers when unregistered turns
complete so never-consumed turn handles do not leave stale queues
behind.
- Removed the internal stream-until helper now that turn completion
waiting can register directly with routed turn notifications.
- Updated Python SDK docs and focused tests for concurrent transport
calls, interleaved turn routing, buffered early notifications, unknown
notification routing, async delegation, and routed turn completion
behavior.

## Validation

- `uv run --extra dev ruff format scripts/update_sdk_artifacts.py
src/codex_app_server/_message_router.py src/codex_app_server/client.py
src/codex_app_server/generated/notification_registry.py
tests/test_client_rpc_methods.py
tests/test_public_api_runtime_behavior.py
tests/test_async_client_behavior.py`
- `uv run --extra dev ruff check scripts/update_sdk_artifacts.py
src/codex_app_server/_message_router.py src/codex_app_server/client.py
src/codex_app_server/generated/notification_registry.py
tests/test_client_rpc_methods.py
tests/test_public_api_runtime_behavior.py
tests/test_async_client_behavior.py`
- `uv run --extra dev pytest tests/test_client_rpc_methods.py
tests/test_public_api_runtime_behavior.py
tests/test_async_client_behavior.py`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-09 04:16:23 +00:00
sayan-oai
77d9223e9f [codex] compact network context rendering (#21875)
## Why

The model-visible `<network>` context currently repeats indentation and
a pair of XML tags for every allowed or denied domain. Large domain sets
spend a surprising amount of prompt budget on that scaffolding instead
of the actual policy values.

## What changed

- Render allowed domains as one comma-separated `<allowed>` value
instead of one element per domain.
- Render denied domains the same way.
- Keep the full allow/deny domain sets model-visible while updating the
serialization and settings-update coverage for the denser shape.

## Example

Before:
```xml
<network enabled="true">
  <allowed>api.example.test</allowed>
  <allowed>cdn.example.test</allowed>
  <denied>blocked.example.test</denied>
</network>
```

After:
```xml
<network enabled="true"><allowed>api.example.test,cdn.example.test</allowed><denied>blocked.example.test</denied></network>
```

## Validation

- `cargo test -p codex-core environment_context`
- `cargo test -p codex-core
build_settings_update_items_emits_environment_item_for_network_changes`
- Ran a local `codex` session with a real network context containing 121
allowed domains and 42 denied domains, then inspected the raw prompt
with `raw_token_viewer_cli.py`. With the same domain set, the rendered
`<network>` section shrank from 7,175 characters across 161 lines to
3,666 characters on one line, and the containing environment-context
block fell from 6,428 tokens to 5,379 tokens.
2026-05-09 03:52:48 +00:00
xl-openai
479491ed89 feat: Add role-aware plugin share context APIs (#21867)
Expose discoverability and full share principals in share context, carry
roles through save/updateTargets, hydrate local shared plugin reads, and
keep share URLs only under plugin.shareContext.
2026-05-08 20:46:39 -07:00
pakrym-oai
c579da41b1 Move file watcher out of core (#21290)
## Why

The app-server watcher relocation leaves the generic filesystem watcher
as the last watcher-specific implementation still living inside
`codex-core`. Moving that code to a small crate keeps `codex-core`
focused on thread execution and lets app-server depend on the watcher
without reaching back into core for filesystem watching primitives.

This PR is stacked on #21287.

## What changed

- Added a new `codex-file-watcher` crate containing the existing watcher
implementation and its unit tests.
- Updated app-server `fs_watch`, `skills_watcher`, and listener state to
import watcher types from `codex-file-watcher`.
- Removed the `file_watcher` module and `notify` dependency from
`codex-core`.
- Updated Cargo workspace metadata and `Cargo.lock` for the new internal
crate.

## Validation

- `cargo check -p codex-file-watcher -p codex-core -p codex-app-server`
- `cargo test -p codex-file-watcher`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fix -p codex-file-watcher`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
2026-05-08 18:19:23 -07:00
pakrym-oai
408e6218ab Reapply "Move skills watcher to app-server" (#21652)
## Why

PR #21460 reverted the earlier move of skills change watching from
`codex-core` into app-server. This reapplies that boundary change so
app-server owns client-facing `skills/changed` notifications and core no
longer carries the watcher.

## What

- Restore the app-server `SkillsWatcher` and register it from thread
listener setup.
- Remove the core-owned skills watcher and its core live-reload
integration surface.
- Restore app-server coverage for `skills/changed` notifications after a
watched skill file changes.

## Validation

- `cargo test -p codex-app-server --test all
suite::v2::skills_list::skills_changed_notification_is_emitted_after_skill_change
-- --exact --nocapture`
- `cargo test -p codex-core --lib --no-run`
2026-05-08 17:41:15 -07:00
Owen Lin
95ca276373 sqlite: no more destructive version bumps (#21847)
## Why

We'd like SQLite state to become required and load-bearing. As a first
step, let's remove the mechanism that allows us to blow away the SQLite
DB on a version bump, and instead rely on graceful migrations.

The original motivation
([PR](https://github.com/openai/codex/pull/10623)) behind this mechanism
was to care less about backwards compatibility while SQLite was being
landed, but I'd say it's quite important now to keep the data in it.

## What changed

- Make `STATE_DB_FILENAME` and `LOGS_DB_FILENAME` the full canonical
filenames: `state_5.sqlite` and `logs_2.sqlite`.
- Remove `STATE_DB_VERSION` / `LOGS_DB_VERSION` and the helper that
constructed filenames from versions.
- Stop `StateRuntime::init` from scanning for or deleting older SQLite
DB filenames at startup.
- Delete the tests that encoded legacy state/logs DB deletion behavior.

## Verification

- `cargo test -p codex-state`
2026-05-08 17:29:44 -07:00
Celia Chen
bd42660cb4 feat: add Bedrock Mantle client agent header (#21840)
## Why

Amazon Bedrock Mantle needs a stable client-agent header so requests
from the built-in Bedrock provider can be identified as coming from
Codex for safety stack.

## What changed

- Added `x-amzn-mantle-client-agent: codex` to the built-in Amazon
Bedrock provider default HTTP headers.
2026-05-08 23:58:41 +00:00
Ruslan Nigmatullin
0c8d42525e [daemon] Add app-server daemon lifecycle management (#20718)
## Why

Desktop and mobile Codex clients need a machine-readable way to
bootstrap and manage `codex app-server` on remote machines reached over
SSH. The same flow is also useful for bringing up app-server with
`remote_control` enabled on a fresh developer machine and keeping that
managed install current without requiring a human session.

## What changed

- add the new experimental `codex-app-server-daemon` crate and wire it
into `codex app-server daemon` lifecycle commands: `start`, `restart`,
`stop`, `version`, and `bootstrap`
- add explicit `enable-remote-control` and `disable-remote-control`
commands that persist the launch setting and restart a running managed
daemon so the change takes effect immediately
- emit JSON success responses for daemon commands so remote callers can
consume them directly
- support a Unix-only pidfile-backed detached backend for lifecycle
management
- assume the standalone `install.sh` layout for daemon-managed binaries
and always launch `CODEX_HOME/packages/standalone/current/codex`
- add bootstrap support for the standalone managed install plus a
detached hourly updater loop
- harden lifecycle management around concurrent operations, pidfile
ownership, stale state cleanup, updater ownership, managed-binary
preflight, Unix-only rejection, forced shutdown after the graceful
window, and updater process-group tracking/cleanup
- document the experimental Unix-only support boundary plus the
standalone bootstrap/update flow in
`codex-rs/app-server-daemon/README.md`

## Verification

- `cargo test -p codex-app-server-daemon -p codex-cli`
- live pid validation on `cb4`: `bootstrap --remote-control`, `restart`,
`version`, `stop`

## Follow-up

- Add updater self-refresh so the long-lived `pid-update-loop` can
replace its own executable image after installing a newer managed Codex
binary.
2026-05-08 16:51:16 -07:00
starr-openai
faa5d4a5e2 Increase exec-server environment transport timeouts (#21825)
## Why

The environment-backed exec-server transport currently hardcodes 5
second connect and initialize timeouts in `client_transport.rs`. That is
short for SSH-backed stdio environments and remote websocket
environments, and there is currently no way to raise those values from
`CODEX_HOME/environments.toml`.

This stacked follow-up raises the default environment transport timeouts
and lets each configured environment override them in
`environments.toml`.

## What Changed

- raise the default environment transport connect and initialize
timeouts from 5s to 10s
- store concrete timeout values on `ExecServerTransportParams` instead
of hardcoding them in `connect_for_transport(...)`
- add `connect_timeout_sec` and `initialize_timeout_sec` to
`[[environments]]` entries in `environments.toml`
- apply parse-time defaults so runtime transport code receives fully
resolved timeout values
- reject `connect_timeout_sec` on stdio environments because it only
applies to websocket transports
- extend parser tests to cover the new fields and defaults

## Stack

- base: https://github.com/openai/codex/pull/21794
- this PR: configurable environment transport timeouts

## Validation

- `cd
/Users/starr/code/codex-worktrees/exec-env-timeouts-config-20260508/codex-rs
&& just fmt`
- not run: tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 16:33:29 -07:00
Michael Zeng
8f4020846e [codex] support executor registry remote environments (#21323)
## Summary

Support registry-backed remote executors end to end so downstream
services can resolve an executor id into an exec-server URL and make
that environment available to Codex without relying on the legacy cloud
environments flow.

## What changed

- switch remote executor registration to the executor registry bootstrap
contract
- allow named remote environments to be inserted into
`EnvironmentManager` at runtime
- add the experimental app-server RPC `environment/add` so initialized
experimental clients can register those remote environments for later
`thread/start` and `turn/start` selection

## Validation

Ran focused validation locally:
- `cargo test -p codex-exec-server environment_manager_`
- `cargo test -p codex-exec-server
register_executor_posts_with_bearer_token_header`
- `cargo test -p codex-app-server-protocol`
2026-05-08 16:30:07 -07:00
lt-oai
80a408e201 Support openai library tool (#20293)
Support chatgpt library tool
2026-05-08 22:56:13 +00:00
Ruslan Nigmatullin
1b86906fa1 app-server: support daemon-safe restart handling (#21831)
## Why

The app-server daemon work needs two app-server behaviors to be safe
when lifecycle management is driven by a helper process:

- a readiness probe must not become the process-wide client identity
just because it connects first
- a graceful reload signal needs to keep draining active turns even if
it is delivered more than once

## What changed

- Treat `codex_app_server_daemon` initialization as a probe-only client
for process-global originator and user-agent suffix state.
- Distinguish forceable shutdown signals from graceful-only ones, and
treat Unix `SIGHUP` as graceful-only while leaving `SIGTERM` and Ctrl-C
forceable.
- Add regression coverage for daemon probe initialization and repeated
`SIGHUP` delivery while a turn is still running.

## Testing

- `cargo test -p codex-app-server`
  - The new daemon-probe and repeated-`SIGHUP` coverage passed.
- The run still failed in the existing
`suite::conversation_summary::get_conversation_summary_by_relative_rollout_path_resolves_from_codex_home`
and
`suite::conversation_summary::get_conversation_summary_by_thread_id_reads_rollout`
tests because their initialize handshake timed out.
- `cargo test -p codex-app-server --test all
suite::conversation_summary::`
- Reproduced the same two existing initialize-timeout failures in
isolation.
2026-05-08 15:47:51 -07:00
starr-openai
dac108f2f1 Make environment provider snapshots path-free (#21794)
## Summary
- make EnvironmentProvider::snapshot path-free and keep providers
focused on provider-owned remote environments
- let provider snapshots request local inclusion via include_local, with
environments.toml including local and CODEX_EXEC_SERVER_URL excluding
local
- move reserved local environment construction into EnvironmentManager
using ExecServerRuntimePaths

Follow-up to https://github.com/openai/codex/pull/20667

## Testing
- just fmt
- git diff --check
- devbox: bazel build --bes_backend= --bes_results_url=
//codex-rs/exec-server:exec-server
- devbox: bazel test --bes_backend= --bes_results_url=
//codex-rs/exec-server:exec-server-unit-tests

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 15:30:00 -07:00
Michael Bolin
24111790f0 ci: check out PR head commits in workflows (#21835)
## Why

PR CI should test the exact commit that was pushed to the PR branch. By
default, GitHub's `pull_request` event checks out a synthetic merge
commit from `refs/pull/<number>/merge`, so the tested tree can include
an implicit merge with the current base branch instead of matching the
pushed head SHA.

Using the PR head SHA makes each check result correspond to a concrete
commit the author submitted. This also behaves better for stacked PR
workflows, including Sapling stacks and other Git stack tooling: a
middle PR's head commit already contains the lower stack changes in its
tree, without pulling in commits above it or GitHub's temporary merge
ref.

## What Changed

- Set every `actions/checkout` in `pull_request` workflows under
`.github/workflows` to use `github.event.pull_request.head.sha` on PR
events and `github.sha` otherwise.
- Updated `blob-size-policy` to compare
`github.event.pull_request.base.sha` and
`github.event.pull_request.head.sha`, since it no longer checks out
GitHub's merge commit where `HEAD^1`/`HEAD^2` represented the PR range.

## Verification

- Parsed the edited workflow YAML files with Ruby.
- Checked that every checkout block in the `pull_request` workflows has
the PR-head `ref`.
2026-05-08 15:14:33 -07:00
Matthew Zeng
2f3a2d7a86 Using cached connector directory for discoverable tools list (#21497)
## Summary

Startup tool construction currently depends on connector directory
metadata for `tool_suggest` discoverables. On a cold directory cache,
that can put slow connector-directory requests on the blocking path even
though the tools array only needs directory data for install
suggestions, not for the live connector MCP tools themselves.

This PR keeps the discoverables path off that cold network fetch:
- read connector directory metadata from cache only when building
discoverable tools
- persist connector directory metadata to
`~/.codex/cache/codex_app_directory/<hash>.json` and use it to hydrate
the in-memory cache on later runs before the normal refresh path updates
it
- use connector-directory-specific cache naming to distinguish this
metadata cache from the separate Codex Apps tools-spec cache

This reduces first-turn startup work without changing how live connector
MCP tools are sourced. Longer term, directory-backed install suggestions
should move to a search-based flow so they no longer need to be inlined
into the tools prompt at all.

## Testing

- `cargo test -p codex-connectors`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-core
request_plugin_install_is_available_without_search_tool_after_discovery_attempts`
- `cargo test -p codex-core
tool_suggest_uses_connector_id_fallback_when_directory_cache_is_empty`
2026-05-08 14:14:11 -07:00
Charlie Marsh
7c9731c9af Enable --deny-warnings for cargo shear (#21616)
## Summary

In https://github.com/openai/codex/pull/21584, we disabled doctests for
crates that lack any doctests. We can enforce that property via `cargo
shear --deny-warnings`: crates that lack doctests will be flagged if
doctests are enabled, and crates with doctests will be flagged if
doctests are disabled.

A few additional notes:

- By adding `--deny-warnings`, `cargo shear` also flagged a number of
modules that were not reachable at all. Some of those have been removed.
- This PR removes a usage of `windows_modules!` (since `cargo shear` and
`rustfmt` couldn't see through it) in favor of simple `#[cfg(target_os =
"windows")]` macros. As a consequence, many of these files exhibit churn
in this PR, since they weren't being formatted by `rustfmt` at all on
main.
- Again, to make the code more analyzable, this PR also removes some
usages of `#[path = "cwd_junction.rs"]` in favor of a more standard
module structure. The bin sidecar structure is still retained, but,
e.g., `windows-sandbox-rs/src/bin/command_runner.rs‎` was moved to
`windows-sandbox-rs/src/bin/command_runner/main.rs`, and so on.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 20:29:00 +00:00
pakrym-oai
46e2250bcf [codex] Remove legacy after tool use hooks (#21805)
## Why

The legacy `AfterToolUse` hook path was still wired through core tool
dispatch even though the hooks registry never populated any handlers for
it. The supported hook surface is `PostToolUse`, so the old
infrastructure was dead code on the hot path.

## What changed

- Removed the legacy `AfterToolUse` dispatch from `codex-core` tool
execution.
- Removed the unused legacy hook payload types and exports from
`codex-hooks`.
- Simplified legacy notify handling now that `HookEvent` only carries
`AfterAgent`.

## Validation

- `cargo test -p codex-hooks`
- `cargo test -p codex-core registry`
2026-05-08 13:20:05 -07:00
pakrym-oai
e783341b70 [codex] Delete function-style apply_patch (#21651)
## Why

`apply_patch` is now a freeform/custom tool. Keeping the old
JSON/function-style registration and parsing path left another way for
models and tests to invoke `apply_patch`, which made the tool surface
harder to reason about.

## What changed

- Removed the `ApplyPatchToolType::Function` variant, JSON `apply_patch`
spec, and handler support for function payloads.
- Kept `apply_patch_tool_type = freeform` as the supported model
metadata path, including Bedrock catalog metadata.
- Migrated `apply_patch` tests and SSE fixtures to custom/freeform tool
calls.

## Verification

- `cargo test -p codex-tools -p codex-protocol -p codex-model-provider`
- `cargo test -p codex-core tools::handlers::apply_patch --lib`
- `cargo test -p codex-core --test all
apply_patch_tool_executes_and_emits_patch_events`
- `cargo test -p codex-core --test all
apply_patch_reports_parse_diagnostics`
- `cargo test -p codex-exec test_apply_patch_tool`
- `just fix -p codex-core`
- `just fix -p codex-tools -p codex-protocol -p codex-model-provider -p
codex-exec`
2026-05-08 13:00:57 -07:00
Ahmed Ibrahim
cf941ede15 Revert "Publish Python runtime wheels on release" (#21810)
Reverts openai/codex#21784
2026-05-08 22:37:10 +03:00
Jiaming Zhang
5f4d0ec343 [codex] request desktop attestation from app (#20619)
## Summary

TL;DR: teaches `codex-rs` / app-server to request a desktop-provided
attestation token and attach it as `x-oai-attestation` on the scoped
ChatGPT Codex request paths.

![DeviceCheck attestation
interface](https://raw.githubusercontent.com/openai/codex/dev/jm/devicecheck-diagram-assets/pr-assets/devicecheck-attestation-interface.png)

## Details

This PR teaches the Codex app-server runtime how to request and attach
an attestation token. It does not generate DeviceCheck tokens directly;
instead, it relies on the connected desktop app to advertise that it can
generate attestation and then asks that app for a fresh header value
when needed.

The flow is:

1. The Codex desktop app connects to app-server.
2. During `initialize`, the app can advertise that it supports
`requestAttestation`.
3. Before app-server calls selected ChatGPT Codex endpoints, it sends
the internal server request `attestation/generate` to the app.
4. app-server receives a pre-encoded header value back.
5. app-server forwards that value as `x-oai-attestation` on the scoped
outbound requests.

The code in this repo is mostly protocol and runtime plumbing: it adds
the app-server request/response shape, introduces an attestation
provider in core, wires that provider into Responses / compaction /
realtime setup paths, and covers the intended scoping with tests. The
signed macOS DeviceCheck generation remains owned by the desktop app PR.

## Related PR

- Codex desktop app implementation:
https://github.com/openai/openai/pull/878649

## Validation

<details>
<summary>Tests run</summary>

```sh
cargo test -p codex-app-server-protocol
cargo test -p codex-core attestation --lib
cargo test -p codex-app-server --lib attestation
```

Also ran:

```sh
just fix -p codex-core
just fix -p codex-app-server
just fix -p codex-app-server-protocol
just fmt
just write-app-server-schema
```

</details>

<details>
<summary>E2E DeviceCheck validation</summary>

First validated the signed desktop app boundary directly: launched a
packaged signed `Codex.app`, sent `attestation/generate`, decoded the
returned `v1.` attestation header, and validated the extracted
DeviceCheck token with `personal/jm/verify_devicecheck_token.py` using
bundle ID `com.openai.codex`. Apple returned `status_code: 200` and
`is_ok: true`.

Then ran the fuller app + app-server flow. The packaged `Codex.app`
launched a current-branch app-server via `CODEX_CLI_PATH`, and a local
MITM proxy intercepted outbound `chatgpt.com` traffic. The app-server
requested `attestation/generate` from the real Electron app process, and
the intercepted `/backend-api/codex/responses` traffic included
`x-oai-attestation` on both routes:

```text
GET  /backend-api/codex/responses  Upgrade: websocket  x-oai-attestation: present
POST /backend-api/codex/responses  Upgrade: none       x-oai-attestation: present
```

The captured header decoded to a DeviceCheck token that also validated
with Apple for `com.openai.codex` (`status_code: 200`, `is_ok: true`,
team `2DC432GLL2`).

</details>

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 12:36:02 -07:00
pakrym-oai
61142b6169 Remove ToolName display helper (#21465)
## Why

`ToolName::display()` made it too easy to flatten tool identity and
accidentally compare rendered strings. Tool identity should stay
structural until a legacy string boundary actually requires the
flattened spelling.

## What

- Removes `ToolName::display()` and relies on the existing `Display`
impl for messages and errors.
- Adds structural ordering for `ToolName` and uses it for
sorting/deduping deferred tools.
- Carries `ToolName` through tool/sandbox plumbing, flattening only at
legacy boundaries such as hook payloads, telemetry tags, and Responses
tool names.
- Updates MCP normalization tests to assert `ToolName` structure instead
of rendered strings.

## Testing

- `cargo test -p codex-mcp test_normalize_tools`
- `cargo test -p codex-core unavailable_tool`
- `just fix -p codex-protocol`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
2026-05-08 12:17:48 -07:00
alexsong-oai
bbb6bf0a37 Emit accepted line fingerprint analytics (#21601)
## Why

Codex assisted-code attribution needs a client-side accepted-code source
that does not upload raw code. This adds a hash-only analytics event
derived from the turn diff so downstream attribution can compare
accepted Codex lines against commit or PR diffs.

## What Changed

- Parse accepted/effective added lines from the final turn diff and emit
`codex_accepted_line_fingerprints` analytics.
- Hash repo, path, and normalized line content before upload; raw code
and raw diffs are not included in the event.
- Chunk large fingerprint payloads and send accepted-line fingerprint
events in isolated requests while preserving normal batching for other
analytics events.
- Canonicalize Git remote URLs before repo hashing so SSH/HTTPS GitHub
remotes join to the same repo hash.
- Add parser coverage for unified diff hunk lines that look like `+++`
or `---` file headers.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-git-utils canonicalize_git_remote_url`
- `just fix -p codex-analytics`
- `just bazel-lock-check`
- `git diff --check`
2026-05-08 12:16:24 -07:00
Ahmed Ibrahim
9183503b97 Publish Python runtime wheels on release (#21784)
## Why

Published Python SDK builds depend on an exact `openai-codex-cli-bin`
runtime package, but the release workflow did not publish that runtime
package to PyPI. That left the SDK packaging story incomplete: release
artifacts could produce Codex binaries, but Python users still needed a
matching wheel carrying the platform-specific runtime and helper
executables.

This PR is stacked on #21787 so release jobs can include helper binaries
in runtime wheels: Linux wheels include `bwrap` for sandbox fallback,
and Windows wheels include the signed sandbox/elevation helpers beside
`codex.exe`.

## What changed

- Builds platform-specific `openai-codex-cli-bin` wheels from signed
release binaries on macOS, Linux, and Windows release runners.
- Packages Linux `bwrap` into musllinux runtime wheels.
- Packages Windows sandbox helper executables into Windows runtime
wheels.
- Uploads runtime wheels as GitHub release assets and publishes them to
PyPI using trusted publishing from the `pypi` GitHub environment.
- Keeps the new Python runtime publish job non-blocking so failures need
follow-up but do not fail the Rust release workflow.
- Pins the PyPA publish action to the `v1.13.0` commit SHA for
reproducible release publishing.
- Documents that runtime wheels are platform wheels published through
PyPI trusted publishing.

## Testing

- `ruby -e 'require "yaml"; ARGV.each { |f| YAML.load_file(f); puts "ok
#{f}" }' .github/workflows/rust-release.yml
.github/workflows/rust-release-windows.yml`
- `git diff --check`

CI is the real end-to-end verification for the release workflow path.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 22:00:58 +03:00
Ahmed Ibrahim
8956a928a1 Support resource binaries in Python runtime staging (#21787)
## Why

Some Codex runtime distributions need helper executables beside the main
bundled binary. Linux sandbox fallback needs a packaged `bwrap` when no
suitable system `bwrap` is available, and Windows sandbox/elevation
needs helper executables discoverable beside `codex.exe`. The checked-in
`openai-codex-cli-bin` template already packages everything under
`codex_cli_bin/bin/**`, but the staging script only copied the main
Codex binary into that directory.

This PR adds the generic staging primitive needed by release workflows
to build complete platform runtime wheels without baking
platform-specific helper names into the package template.

## What changed

- Added repeatable `stage-runtime --resource-binary` support so release
workflows can copy extra executables beside the bundled Codex binary.
- Kept resource selection in workflow code, where the platform target is
known.
- Added tests that verify resource binaries are copied into the staged
runtime package, that the wheel include config covers them, and that the
CLI forwards repeated `--resource-binary` values.

## Testing

- `uv run ruff check scripts/update_sdk_artifacts.py
tests/test_artifact_workflow_and_binaries.py`
- `uv run --extra dev pytest
tests/test_artifact_workflow_and_binaries.py::test_stage_runtime_release_copies_resource_binaries
tests/test_artifact_workflow_and_binaries.py::test_runtime_resource_binaries_are_included_by_wheel_config
tests/test_artifact_workflow_and_binaries.py::test_stage_runtime_stages_binary_without_type_generation`

Full `tests/test_artifact_workflow_and_binaries.py` still has unrelated
schema-normalization drift in the local checkout.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 22:00:44 +03:00
github-actions[bot]
772e034594 Update models.json (#21776)
Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
2026-05-08 21:37:23 +03:00
starr-openai
5f2543b74e Load configured environments from CODEX_HOME (#20667)
## Why

The earlier PRs add stdio transport support and the config-backed
environment provider, but the feature remains inert until normal Codex
entrypoints construct `EnvironmentManager` with enough context to
discover `CODEX_HOME/environments.toml`. This final stack PR activates
the provider while preserving the legacy `CODEX_EXEC_SERVER_URL`
fallback when no environments file exists.

**Stack position:** this is PR 5 of 5. It is the product wiring PR that
activates the configured environment provider added in PR 4.

## What Changed

- Thread `codex_home` into `EnvironmentManagerArgs`.
- Change `EnvironmentManager::new(...)` to load the provider from
`CODEX_HOME`.
- Preserve legacy behavior by falling back to
`DefaultEnvironmentProvider::from_env()` when `environments.toml` is
absent.
- Make `environments.toml`-backed managers start new threads with all
configured environments, default first, while keeping the legacy env-var
path single-default.
- Update the app-server, TUI, exec, MCP server, connector, prompt-debug,
and thread-manager-sample callsites to pass `codex_home` and handle
provider-loading errors.

## Self-Review Notes

- The multi-environment startup path is intentionally tied to the
`environments.toml` provider. Using `>1` configured environment as the
only signal would also expand the legacy `CODEX_EXEC_SERVER_URL`
provider because it keeps `local` addressable alongside `remote`.
- The startup environment list is still derived inside
`EnvironmentManager`; the provider only says whether its snapshot should
start new threads with all configured environments.
- The thread-manager sample was updated to pass the current
`ThreadManager::new(...)` installation id argument so the stack compiles
under Bazel.

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- **5. This PR:** https://github.com/openai/codex/pull/20667 - Load
configured environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

- `just fmt`
- `git diff --check`
- `bazel build --config=remote --strategy=remote
--remote_download_toplevel
//codex-rs/thread-manager-sample:codex-thread-manager-sample`
- `bazel test --config=remote --strategy=remote
--remote_download_toplevel
//codex-rs/exec-server:exec-server-unit-tests`
- `bazel test --config=remote --strategy=remote
--remote_download_toplevel --test_sharding_strategy=disabled
--test_arg=default_thread_environment_selections_use_manager_default_id
//codex-rs/core:core-unit-tests`
- `bazel test --config=remote --strategy=remote
--remote_download_toplevel --test_sharding_strategy=disabled
--test_arg=start_thread_uses_all_default_environments_from_codex_home
//codex-rs/core:core-unit-tests`

## Documentation

This activates `CODEX_HOME/environments.toml`; user-facing documentation
should be added before this stack is treated as a documented public
workflow.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 11:17:56 -07:00
David de Regt
872b8b15b3 feat: Use installation ID in remote enrollments (#21662)
* Pass installation ID for storage on enrollments server for
deduping/grouping multiple appservers per installation
* Pass installation ID in remoteControl/status/changed events
2026-05-08 17:54:01 +00:00
William Woodruff
8bea5d231a [codex] Address some more GHA hygiene issues (#21622)
This does two things:

- We use `persist-credentials: false` everywhere now. This is
unfortunately not the default in GitHub Actions, but it prevents
`actions/checkout` from dropping `secrets.GITHUB_TOKEN` onto disk.
- We interpose (some) template expansions through environment variables.
I've limited this to contexts that have non-fixed values; contexts that
are fixed (like `*.result`) are not dangerous to expand directly inline
(but maybe we should clean those up in the future for consistency
anyways).

This is a medium-risk change in terms of CI breakage: I did a scan for
usage of `git push` and other commands that implicitly use the persisted
credential, but couldn't find any. Even still, some implicit usages of
the persisted credentials may be lurking. Please ping ww@ if any issues
arise.
2026-05-08 10:19:27 -07:00
Eric Traut
0a0d09ad21 Clarify docs folder guidance in AGENTS.md (#21772)
## Summary

Codex keeps trying to add documentation to the `docs/` directory. With
the exception of app server API documentation, the docs for Codex should
not live in this repo. We don't want the local `docs/` folder to become
a stale shadow of the official docs.

This PR updates `AGENTS.md` to make that boundary explicit and scopes
the existing API documentation guidance to app-server docs/examples. It
also removes the extra `docs/config.md` sections that were recently
added.
2026-05-08 10:11:57 -07:00
Ahmed Ibrahim
7c0e54bf59 [codex] Generalize service tier slash commands (#21745)
## Why

`/fast` was wired as a one-off slash command even though model metadata
now exposes service tiers as catalog data. That meant adding another
tier, such as a slower/cheaper tier, would require more hardcoded TUI
plumbing instead of letting the model catalog drive the available
commands.

This change makes service-tier commands data-driven: each advertised
`service_tiers` entry becomes a `/name` command using the catalog
description, while the request path sends the tier `id` only when the
selected model supports it.

## What Changed

- Removed the hardcoded `/fast` slash-command variant and introduced
dynamic service-tier command items in the composer and command popup.
- Added toggle behavior for service-tier commands: invoking `/name`
selects that tier, and invoking it again clears the selection.
- Preserved the existing Fast-mode keybinding/status affordances by
resolving the current model tier whose name is `fast`, while still
sending the tier request value such as `priority`.
- Persisted service-tier selections as raw request strings so non-fast
tiers can round-trip through config.
- Updated the Bedrock catalog entry to advertise fast support through
`service_tiers` with `id: "priority"` and `name: "fast"`.
- Added defensive filtering in core so unsupported selected service
tiers are omitted from `/responses` requests.

## Validation

- Added/updated coverage for dynamic service-tier slash command lookup,
popup descriptions, composer dispatch, TUI fast toggling, and
unsupported-tier omission in core request construction.
- Local tests were not run per request.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 20:09:51 +03:00
Zanie Blue
47f1d7b40b Use CARGO_NET_GIT_FETCH_WITH_CLI in rust-ci-full for more reliable git fetches (#21628)
Cargo uses libgit2 by default. In uv, we gave up this entirely and
always call out to the git CLI because it is much more reliable. This is
a part of my attempt to reduce flakes in `rust-ci-full`.
2026-05-08 09:53:21 -07:00
Zanie Blue
05ffa0b1d0 Fix rust-ci-full failures due to missing bwrap (#21604)
Since https://github.com/openai/codex/pull/21255, `rust-ci-full` has
been failing due to a missing `bwrap`.

```
thread 'main' panicked at linux-sandbox/src/launcher.rs:43:13:
bubblewrap is unavailable: no system bwrap was found on PATH and no bundled codex-resources/bwrap binary was found next to the Codex executable
```

Since the happy path is now to use the system binary, let's ensure
that's installed.


8d51826631
was necessary for the `bwrap` executable to be discoverable when the
working directory is `/`.

I ran `rust-ci-full` at
https://github.com/openai/codex/actions/runs/25528074506

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 09:52:19 -07:00
evawong-oai
5733e00c79 [sandboxing] Remove Darwin user cache write from Seatbelt network policy (#21443)
## Summary

1. Removes the broad `DARWIN_USER_CACHE_DIR` write rule from the macOS
Seatbelt network policy.
2. Removes the now unused policy parameter plumbing for that cache path.
3. Adds sandboxing coverage that keeps `com.apple.trustd.agent` for TLS
while rejecting the cache write rule.

## Why

This closes the exact cache poisoning boundary. The earlier `gh` TLS
issue is now covered by trustd access, so the cache write is no longer
needed.

## Validation

1. Rust formatting passed.
2. The sandboxing crate tests passed.
3. Local macOS Seatbelt repro with patched policy passed. `gh api`
returned `21442` without the cache write rule.
2026-05-08 16:43:07 +00:00
jif-oai
5b87bd2845 chore: thread tui (#21767) 2026-05-08 17:53:23 +02:00
bbrown-oai
607b0dd1f0 codex-otel: validate provider span attributes consistently (#21749)
Provider initialization installs process-global OTEL state, so invalid
trace metadata needs to fail before setup begins.

Use the same span attribute validator as config loading when traces are
exported so provider startup enforces the config contract without
duplicating validation logic.
2026-05-08 08:20:49 -07:00
jif-oai
f9bbbafb68 nit: comment (#21763)
Because of an async discussion
2026-05-08 17:15:46 +02:00
jif-oai
bd8fc9adb9 api: send hyphenated session and thread headers (#21757)
## Why
Some consumers expect conventional hyphenated HTTP headers. Codex
already sends the session and thread IDs on outbound Responses requests,
but it only uses the underscore spellings today, which makes those IDs
harder to consume in systems that normalize or reject underscore header
names.

Full context here:
https://openai.slack.com/archives/C08KCGLSPSQ/p1778248578422369

## What changed
- `build_session_headers` now emits both `session_id` and `session-id`
when a session ID is present.
- It does the same for `thread_id` and `thread-id`.
- Added regression coverage in `codex-api/tests/clients.rs` and
`core/tests/suite/client.rs` so both the lower-level client tests and
the end-to-end request tests assert the two header spellings are
present.

## Test plan
- Added header assertions in `codex-api/tests/clients.rs`.
- Added request-header assertions in `core/tests/suite/client.rs` for
both the `/v1/responses` and `/api/codex/responses` request paths.
2026-05-08 17:11:19 +02:00
Eric Traut
e6312d44f0 Show permissions and approval mode in the TUI status line (#21677)
Fixes #21665.

## Why

The TUI status line is the right place for compact, glanceable session
state. The original request was motivated by the need to see the active
permission posture without opening `/permissions` or `/status`,
especially when switching between safer and more permissive modes during
a session.

This PR intentionally separates `permissions` from `approval-mode`
instead of combining them into one status-line item. They answer related
but different questions: `permissions` describes the active
sandbox/profile shape, while `approval-mode` describes how command
approvals are handled. Keeping them separate makes each item
independently configurable and avoids long combined labels in an already
space-constrained status line.

The tradeoff is that users who want the full permission posture in the
status line need to opt into both items. In exchange, users can show
only the sandbox/profile label, only the approval behavior, or both, and
named user-defined profiles remain concise. Non-standard permission
shapes are rendered as `Custom permissions` rather than trying to
squeeze detailed profile contents into the status line; `/status`
remains the fuller explanatory surface.

## What changed

- Added a configurable `permissions` status-line item.
- Added a separate `approval-mode` status-line item, with `approval` as
an alias.
- Render standard permission states compactly as `Read Only`,
`Workspace`, or `Full Access`.
- Preserve user-defined permission profile names directly in the status
line.
- Render unnamed non-standard permission shapes as `Custom permissions`.
- Refresh status surfaces when `/permissions` updates the permission
profile, approval policy, or approval reviewer.
- Updated status-line preview snapshot coverage for the new items.

## Verification

- `cargo test -p codex-tui
status_permissions_non_default_workspace_write_uses_workspace_label`
- `cargo test -p codex-tui
permissions_selection_emits_history_cell_when_selection_changes`
- `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`
2026-05-08 08:03:11 -07:00
Eric Traut
f86d95a242 Display blended token count in status line (#21669)
## Why

The configurable `/statusline` and terminal title can display session
token usage. That display was using the raw total token count, which
includes cached input tokens, so it significantly overstated the token
usage compared with the blended token count shown elsewhere (in
`/status` and tracked in goals). This inconsistency resulted in user
confusion. We don't want to report cached tokens because we don't charge
for them and they are somewhat of an implementation detail that users
shouldn't care about.

## What changed

- Use `TokenUsage::blended_total()` for the `used-tokens` status surface
item so cached input is excluded.
- Add a brief comment to `tokens_in_context_window()` clarifying that it
returns raw `total_tokens`, whose meaning depends on whether the caller
has last-turn or accumulated usage.
2026-05-08 07:56:13 -07:00
github-actions[bot]
aadcae9f3c Update models.json (#19896)
Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
2026-05-08 17:41:55 +03:00
Ahmed Ibrahim
cce059467a [codex] Enable apply_patch freeform by default (#21687)
## Summary
- enable `apply_patch_freeform` by default in the feature registry

## Why
- make the freeform `apply_patch` tool available by default when model
metadata does not explicitly opt into another mode

## Validation
- `just fmt`
- did not run tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 13:15:00 +00:00
Ahmed Ibrahim
317213fd33 Allow string service tiers in config TOML (#21697)
## Why

`service_tier` in `config.toml` and profile config was still modeled as
an enum, which blocked newer or experimental service tier IDs even
though the runtime paths already carry string IDs.

This change makes the TOML-facing config accept string service tier IDs
directly while keeping the legacy `fast` alias behavior by normalizing
it to the request value `priority`.

## What Changed

- change the TOML-facing `service_tier` fields in global and profile
config to `Option<String>`
- keep config-load normalization so legacy `fast` still resolves to
`priority`
- persist resolved service tier strings directly in config locks so
arbitrary IDs round-trip cleanly
- regenerate the config schema and add config coverage for arbitrary
string IDs plus legacy `fast` normalization

## Verification

- added config tests for arbitrary string service tiers and legacy
`fast` normalization
- ran `just write-config-schema`
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 15:15:00 +03:00
jif-oai
d9feaffffb [codex] make shutdown pending-touch test deterministic (#21550)
## What changed
- rewrote `shutdown_flushes_pending_metadata_irrelevant_updated_at` to
seed an existing pending `updated_at` touch directly in
`RolloutWriterState`
- kept the shutdown test focused on draining a pending touch, leaving
the separate coalescing test to cover timing-based deferral

## Why
The old test had to complete several async operations inside the 50 ms
test-only coalescing window. When that sequence took longer, the second
flush updated `threads.updated_at` immediately and the pre-shutdown
equality assertion failed, even though shutdown behavior was correct.

## Validation
- `cargo test -p codex-rollout
shutdown_flushes_pending_metadata_irrelevant_updated_at`
- `cargo test -p codex-rollout`

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 10:48:45 +02:00
Ahmed Ibrahim
71d80f9a14 Omit service_tier from remote /responses/compact requests under API auth (#21676)
## Summary

API-key-auth remote compaction requests should not inherit
`service_tier` from normal `/responses` turns. This path needs to match
API auth expectations, while ChatGPT-auth remote compaction should keep
reusing the shared request fields that still apply there.

This change keeps the decision inline in
`codex-rs/core/src/compact_remote.rs` only. Under API key auth, the
classic remote `/responses/compact` path now omits `service_tier`; under
ChatGPT auth, it keeps reusing the configured tier.
`codex-rs/core/src/compact_remote_v2.rs` is unchanged. The remote
compaction parity coverage and snapshots were updated to assert the
API-key omission and preserve the ChatGPT-auth behavior.

## Testing

- Updated remote compaction parity coverage in
`codex-rs/core/tests/suite/compact_remote.rs` and the corresponding
snapshots.
2026-05-08 11:15:14 +03:00
Eric Traut
d2e71db22a Remove exec research preview banner wording (#21683)
## Why

`codex exec` still included the stale `research preview` label in its
human-readable startup banner, which makes the CLI look older and less
current than it is.

Fixes #21444.

## What Changed

Removed the hard-coded ` (research preview)` suffix from the `OpenAI
Codex v<version>` startup banner in
`codex-rs/exec/src/event_processor_with_human_output.rs`.

## Validation

Local validation was not required for this one-line startup banner text
cleanup.
2026-05-08 00:30:44 -07:00
Eric Traut
c15ce42a12 Fix feature request Contributing link (#21688)
Fixes #20870.

## Summary

The feature request template currently links users to the README
`#contributing` anchor, but that anchor does not exist. This can confuse
users who are trying to understand contribution expectations before
filing a request.

This updates `.github/ISSUE_TEMPLATE/5-feature-request.yml` to point
`Contributing` at `docs/contributing.md`, matching the repository's
existing contribution guidance.
2026-05-08 00:23:40 -07:00
Eric Traut
8b1d6875ed Fix issue template labels (#21686)
Issue forms should only reference labels that exist in the repository so
new reports receive the intended automatic labels.

This updates the CLI issue form to stop applying the missing `needs
triage` label, and changes the documentation issue form from `docs` to
the existing `documentation` label.

Fixes #21158
2026-05-08 00:22:33 -07:00
Eric Traut
911841001d Fix duplicate CLI issue template description (#21685)
Fixes #21270.

The CLI bug report template defined `description` twice for the terminal
emulator field. Because duplicate YAML keys are ambiguous and parsers
generally keep the later value, the form could drop the multiplexer
guidance.

This combines that guidance with the terminal examples under a single
block scalar in `.github/ISSUE_TEMPLATE/3-cli.yml`.
2026-05-08 00:20:17 -07:00
xl-openai
ae15343243 feat: Update plugin share settings with discoverability (#21637)
Requires discoverability on plugin/share/updateTargets so the server can
manage workspace link access consistently, including auto-adding the
workspace principal for UNLISTED.

Also rejects LISTED on share creation and blocks client-supplied
workspace principals while preserving response parsing for LISTED.
2026-05-07 21:28:18 -07:00
Celia Chen
9cbd4c0371 feat: enable AWS login credentials for Bedrock auth (#21623)
## Summary

Codex's Amazon Bedrock provider signs Mantle requests with SigV4 using
credentials resolved by the AWS SDK. That worked for standard AWS
profiles and environment credentials, but AWS CLI console-login profiles
created by `aws login` require the SDK's `credentials-login` feature to
resolve `login_session` credentials.

This change enables that credential provider so Bedrock can use AWS
console-login credentials through the existing provider-owned AWS auth
path.

While testing the console-login path, we also hit a Mantle-specific
SigV4 regression from the new split between `session_id` and
`thread_id`. Mantle does not preserve legacy OpenAI compatibility
headers that use `snake_case` before SigV4 verification, so signing
those headers can make the server reconstruct a different canonical
request. The Bedrock auth path now removes that header class before
signing, keeping preserved hyphenated Codex/AWS headers such as
`x-codex-turn-metadata` signed normally.

## Changes

- Enable `aws-config`'s `credentials-login` feature in
`codex-rs/aws-auth`.
- Add a compile-time regression test for
`aws_config::login::LoginCredentialsProvider`.
- Strip `snake_case` compatibility headers from Bedrock Mantle SigV4
requests before signing.
- Expand the Bedrock auth regression test to cover `session_id`,
`thread_id`, and future headers of the same shape.
- Refresh Cargo and Bazel lockfiles for the added `aws-sdk-signin`
dependency.

## Tests
- tested with `aws login` locally and verified that it works as
intended.
2026-05-08 04:07:59 +00:00
xli-oai
314229fd72 Remove skills list extra roots (#21485)
## Summary
- Remove `perCwdExtraUserRoots` / `SkillsListExtraRootsForCwd` from the
`skills/list` app-server API.
- Drop Rust app-server and `codex-core-skills` extra-root plumbing so
skill scans are keyed by the normal cwd/user/plugin roots only.
- Regenerate app-server schemas and update docs/tests that only existed
for the removed extra-roots behavior.

## Validation
- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-core-skills`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core-skills`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`

## Notes
- `cargo test -p codex-app-server --test all skills_list` ran the edited
skills-list cases, but the full filtered run ended on existing
`skills_changed_notification_is_emitted_after_skill_change` timeout
after a websocket `401`.
- `cargo test -p codex-tui --lib` compiled the changed TUI callers, then
failed two unrelated status permission tests because local
`/etc/codex/requirements.toml` forbids `DangerFullAccess`.
- Source-truth check found the OpenAI monorepo still has
generated/app-server-kit mirror references to the removed field; those
should be cleaned up when generated app-server types are synced or in a
companion OpenAI cleanup.
2026-05-07 20:56:42 -07:00
rhan-oai
99016ec732 [codex-analytics] plumb protocol-native review timing (#21434)
## Why

We want terminal tool review analytics, but the reducer should not stamp
review timing from its own wall clock.

This PR plumbs review timing through the real protocol and app-server
seams so downstream analytics can consume the emitter's timestamps
directly. Guardian reviews keep their enriched `started_at` /
`completed_at` analytics fields by deriving those legacy second-based
values from the same protocol-native millisecond lifecycle timestamps,
rather than sampling a separate analytics clock.

## What changed

- add `started_at_ms` to user approval request payloads
- add `started_at_ms` / `completed_at_ms` to guardian review
notifications
- preserve Guardian review `started_at` / `completed_at` enrichment from
the protocol-native timing source
- stamp typed `ServerResponse` analytics facts with app-server-observed
`completed_at_ms`
- thread the new timing fields through core, protocol, app-server, TUI,
and analytics fixtures

## Verification

- `cargo test -p codex-app-server outgoing_message --manifest-path
codex-rs/Cargo.toml`
- `cargo test -p codex-app-server-protocol guardian --manifest-path
codex-rs/Cargo.toml`
- `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml`
- `cargo test -p codex-analytics analytics_client_tests --manifest-path
codex-rs/Cargo.toml`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434).
* #18748
* __->__ #21434
* #18747
* #17090
* #17089
* #20514
2026-05-07 20:31:41 -07:00
pakrym-oai
af16baa549 Revert "Use --locked in cargo build and lint invocations" (#21646)
Reverts openai/codex#21602
2026-05-07 20:05:47 -07:00
pakrym-oai
dfa1e864a2 Send response.processed after remote compaction v2 (#21642)
## Why

Remote compaction v2 consumes a normal Responses stream, but that
compaction-specific stream consumer dropped the `response.completed` id.
As a result, the `responses_websocket_response_processed` lifecycle
notification was emitted for normal turn sampling but not after a v2
remote compaction response was fully processed.

## What changed

- Return the completed response id alongside the v2 `context_compaction`
output item.
- After v2 compacted history is installed, send `response.processed`
through the same websocket session when the feature is enabled.
- Add websocket regression coverage for a remote compaction v2 request
followed by `response.processed`.

## Verification

- `cargo test -p codex-core --test all
responses_websocket_sends_response_processed_after_remote_compaction_v2
-- --nocapture`
- `cargo test -p codex-core
collect_context_compaction_output_accepts_additional_output_items --
--nocapture`
2026-05-07 19:57:36 -07:00
starr-openai
07b695190f Add CODEX_HOME environments TOML provider (#20666)
## Why

After stdio transports and provider-owned defaults exist, Codex needs a
config-backed provider that can describe more than the single legacy
`CODEX_EXEC_SERVER_URL` remote. This PR adds that provider without
activating it in product entrypoints yet, keeping parser/validation
review separate from runtime wiring.

**Stack position:** this is PR 4 of 5. It builds on PR 3's
provider/default model and adds the `environments.toml` provider used by
PR 5.

## What Changed

- Add `environment_toml.rs` as the TOML-specific home for parsing,
validation, and provider construction.
- Keep the TOML schema/provider structs private; the public constructor
added here is `EnvironmentManager::from_codex_home(...)`.
- Add `TomlEnvironmentProvider`, including validation for:
  - reserved ids such as `local` and `none`
  - duplicate ids
  - unknown explicit defaults
  - empty programs or URLs
  - exactly one of `url` or `program` per configured environment
- Support websocket environments with `url = "ws://..."` / `wss://...`.
- Support stdio-command environments with `program = "..."`.
- Add helpers to load `environments.toml` from `CODEX_HOME`, but do not
wire entrypoints to call them yet.
- Add the `toml` dependency for parsing.

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- **4. This PR:** https://github.com/openai/codex/pull/20666 - Add
CODEX_HOME environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack.

## Documentation

This introduces the config shape for `environments.toml`; user-facing
documentation should be added before this stack is treated as a
documented public workflow.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 01:37:47 +00:00
starr-openai
1bfc3d9773 Route view_image through selected environments
Route view_image through selected environments so image reads use the selected turn environment and cwd, with schema exposure limited to multi-environment toolsets.\n\nCo-authored-by: Codex <noreply@openai.com>
2026-05-08 01:29:03 +00:00
starr-openai
9669756b5f Make environment providers own default selection (#20665)
## Why

The next PR in this stack introduces configured environments, where the
provider knows both which environments exist and which one should be
selected by default. The existing manager derived the default internally
by checking for the legacy `remote` and `local` ids, and it treated
"remote" as equivalent to "has a websocket URL." That does not work
cleanly for stdio-command remotes because they are remote environments
without an `exec_server_url`.

**Stack position:** this is PR 3 of 5. It is the environment-model
bridge between PR 2's transport enum and PR 4's TOML provider.

## What Changed

- Add `DefaultEnvironmentSelection` to the `EnvironmentProvider`
contract:
  - `Derived` preserves the old `remote`-then-`local` fallback behavior.
- `Environment(id)` lets a provider explicitly select a configured
default.
- `Disabled` lets a provider intentionally expose no default
environment.
- Move the legacy `CODEX_EXEC_SERVER_URL=none` default-disabling
behavior into `DefaultEnvironmentProvider`.
- Make `EnvironmentManager` validate explicit provider defaults and
return an error if the selected id is missing.
- Track `remote_transport` separately from `exec_server_url` so
stdio-command environments are still recognized as remote.
- Add `Environment::remote_stdio_shell_command(...)` for the TOML
provider added in the next PR.

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- **3. This PR:** https://github.com/openai/codex/pull/20665 - Make
environment providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-08 01:00:31 +00:00
Tom
79ad209ce6 [codex] Remove remote thread store implementation (#21596)
Remove the remote thread-store backend and checked-in protobuf
artifacts. We've moved these into another crate that link against this
one.

Also remove the config settings for thread store backend selection,
since we'll instead pass an instantiated thread store into the core-api
crate's main entrypoint.
2026-05-08 00:02:46 +00:00
starr-openai
a3de5bde6e Add stdio exec-server client transport (#20664)
## Why

Configured environments need to connect to exec-server instances that
are not necessarily already listening on a websocket URL. A
command-backed stdio transport lets Codex start an exec-server process,
speak JSON-RPC over its stdio streams, and clean up that child process
with the client lifetime.

**Stack position:** this is PR 2 of 5. It builds on the server-side
stdio listener from PR 1 and provides the client transport used by later
environment/config PRs.

## What Changed

- Add `ExecServerTransport` variants for websocket URLs and stdio shell
commands.
- Add stdio command connection support for `ExecServerClient`.
- Move websocket/stdio transport setup into `client_transport.rs` so
`client.rs` stays focused on shared JSON-RPC client, session, HTTP, and
notification behavior.
- Tie stdio child process cleanup to the JSON-RPC connection lifetime
with a RAII lifetime guard.
- Keep existing websocket environment behavior by adapting URL-backed
remotes to `ExecServerTransport::WebSocketUrl`.

## Stack

- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- **2. This PR:** https://github.com/openai/codex/pull/20664 - Add stdio
exec-server client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack and then
refactored to separate transport setup from the base client.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 23:48:50 +00:00
Zanie Blue
79154e6952 Use --locked in cargo build and lint invocations (#21602)
This ensures CI fails if the committed lockfile is outdated
2026-05-07 23:14:18 +00:00
William Woodruff
893038f77c [codex] Apply a Dependabot cooldown of 7 days (#21599)
This adds 7-day cooldowns to all of our Dependabot ecosystem blocks. Our
Dependabot runs will continue at the same cadence as before, but the
scheduled PRs will no suggest updates that are fewer than 7 days old
themselves. This serves two purposes: to let dependencies "bake" for a
bit in terms of stability before we adopt them, and to give third-party
security services/tooling a chance to detect and revoke malware.

This should have no functional changes/consequences besides how rapidly
we get (non-security) updates. Dependabot security PRs can still be
scheduled and will bypass the cooldown.
2026-05-07 16:07:46 -07:00
bbrown-oai
31b233c7c6 codex-otel: add configurable trace metadata (#21556)
Add Codex config for static trace span attributes and structured W3C
tracestate field upserts. The config flows through OtelSettings so
callers can attach trace metadata without touching every span call site.

Apply span attributes with an SDK span processor so every exported
trace span carries the configured metadata. Model tracestate as nested
member fields so configured keys can be upserted while unrelated
propagated state in the same member is preserved.

Validate configured tracestate before installing provider-global state,
including header-unsafe values the SDK does not reject by itself. This
keeps Codex from propagating malformed trace context from config.

Update the config schema, public docs, and OTLP loopback coverage for
config parsing, span export, propagation, and invalid-header rejection.
2026-05-07 16:06:57 -07:00
Owen Lin
0d0835dd53 feat(app-server, threadstore): Thread pagination APIs and ThreadStore contract (#21566)
## Why
The goal of this PR is to align on app-server and `ThreadStore` API
updates for paginating through large threads.


#### app-server
##### `thread/turns/list`
- Updates `thread/turns/list` to support `itemsView?: "notLoaded" |
"summary" | "full" | null`, defaulting to `summary`.
- Implements the current `thread/turns/list` behavior over the existing
persisted rollout-history fallback:
  - `notLoaded` returns turn envelopes with empty `items`.
- `summary` returns the first user message and final assistant message
when available.
  - `full` preserves the existing full item behavior.

Note that this method still uses the naive approach of loading the
entire rollout file, and returns just the filtered slice of the data.
Real pagination will come later by leveraging SQLite.

##### `thread/turns/items/list`
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.

#### ThreadStore
- Adds the experimental `thread/turns/items/list` protocol, schema,
dispatcher, and processor stub. The app-server currently returns
JSON-RPC `-32601` with `thread/turns/items/list is not supported yet`.
- Adds `ThreadStore` contract types and stubbed methods for listing
thread turns and listing items within a turn.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.
- Adds a typed `StoredTurnStatus` and `StoredTurnError` to avoid baking
app-server API enums or lossy string status values into the store-facing
turn contract.

This also sketches the storage abstraction we expect to need once turns
are indexed/stored. In particular, `notLoaded` is useful only if
ThreadStore can eventually list turn metadata without loading every
persisted item for each turn.

## Validation

- Added/updated protocol serialization coverage for the new request and
response shapes.
- Added app-server integration coverage for `thread/turns/list` default
summary behavior and all three `itemsView` modes.
- Added app-server integration coverage that `thread/turns/items/list`
returns the expected unsupported JSON-RPC error when experimental APIs
are enabled.
- Added thread-store coverage that the default trait methods return
`ThreadStoreError::Unsupported`.

No developers.openai.com documentation update is needed for this
internal experimental app-server API surface.
2026-05-07 15:44:43 -07:00
Charlie Marsh
54ef99a365 Disable empty Cargo test targets (#21584)
## Summary

`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 15:44:17 -07:00
Aria Desires
80a8563e48 Ensure all mentions of cargo-install are --locked (#21592)
There's already a preference for this in the codebase, but a few of them
have drifted away. Generally `--locked` is preferred to reduce exposure
to supply-chain attacks (and just generally improve reproducibility).

In an ideal world these dependencies would maybe even be pinned to
versions but Cargo is kinda bad at that for devtools. Still better to
use --locked than not.
2026-05-07 15:30:37 -07:00
William Woodruff
8abcc5357d [codex] Fully qualify hash-pins in GitHub Actions (#21436)
This builds on top of https://github.com/openai/codex/pull/15828 by
ensuring that hash-pinned actions with version comments are fully
qualified, rather than referencing floating/mutable comments like "v7".
This makes actions management tools behave more consistently.

This shouldn't break anything, since it's comment only. But if it does,
ping ww@ 🙂
2026-05-07 14:31:20 -07:00
Zanie Blue
27ec488ad5 Add a Cargo build profile for benchmarking (#21574)
A clean release build takes ~18m and an incremental build takes ~12m.
This is far too slow to iterate on performance related changes and the
build time is dominated by LTO.

This pull request adds a `profiling` profile for Cargo which takes ~13m
clean and ~6m incremental, the primary change is that LTO is disabled.
This matches a profile used in uv and follows the great work at
https://github.com/astral-sh/uv/pull/5955 — there's a bit of commentary
there about the trade-offs this implies.

We've found that this does not inhibit the ability to accurately
benchmark as measurements with LTO disabled are generally consistent
with the results with LTO enabled and it makes it much faster (~2x) to
rebuild after making a change.

This is motivated by my interest in improving Codex TUI performance,
which is blocked by the tragically builds right now.

I tested incremental build times by making a no-op change to the
`codex-cli` crate.
2026-05-07 14:30:35 -07:00
Zanie Blue
8367ef4522 Use descriptive names for Cargo profile options (#21582)
These are equivalent and their intent is clearer, e.g., I was confused
if `debug = 1` meant the same thing as `debug = true` (it does not).
2026-05-07 14:19:32 -07:00
iceweasel-oai
163eac9306 Grant sandbox users access to desktop runtime bin (#21564)
## Why

Codex desktop copies bundled Windows binaries out of `WindowsApps` into
a LocalAppData runtime cache before launching `codex.exe`. Sandboxed
commands can then need to execute helpers from that cache, but the
sandbox user group may not have read/execute access to the runtime bin
directory.

This makes the Windows sandbox refresh path repair that access directly
so the packaged desktop runtime remains usable from sandboxed sessions.

## What changed

- Added `setup_runtime_bin` to locate `%LOCALAPPDATA%\OpenAI\Codex\bin`,
matching the desktop bundled-binaries destination path, with the same
`USERPROFILE\AppData\Local` fallback shape.
- During refresh setup, check whether `CodexSandboxUsers` already has
read/execute access to the runtime bin directory.
- If access is missing, grant `CodexSandboxUsers` `OI/CI/RX` inheritance
on that directory.
- If the runtime bin directory does not exist, no-op cleanly.

## Verification

- `cargo build -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- `cargo test -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- Manual Windows ACL exercise against the installed packaged runtime
bin:
- existing inherited `CodexSandboxUsers:(I)(OI)(CI)(RX)` no-ops without
changing SDDL
- after disabling inheritance and removing the group ACE, setup adds
`CodexSandboxUsers:(OI)(CI)(RX)`
- with `LOCALAPPDATA` pointed at a fake location without
`OpenAI\Codex\bin`, setup exits successfully and does not create the
directory
- restored the real runtime bin with inherited ACLs and confirmed the
final SDDL matched the baseline exactly
2026-05-07 11:38:10 -07:00
Tom
4242bba2eb Route ThreadManager rollout path reads through thread store (#21265)
- Route ThreadManager rollout-path resume/fork through ThreadStore
history reads.
- Add in-memory store coverage proving path-addressed reads are used.

This isn't strictly necessary for the ThreadStore migration, since these
ThreadManager methods _only_ work for path-based lookups, but I'm trying
to migrate all the rollout recorder callsites to use the threadstore
were possible for consistency.
2026-05-07 11:25:25 -07:00
Tom
0274398901 [codex] Fix pathless thread summaries (#21266)
## Summary

Fix `getConversationSummary` so thread-id summaries work for stored
threads that do not have a local rollout path, such as remote thread
stores.

The root cause was that `summary_from_stored_thread` returned `None`
when `StoredThread.rollout_path` was absent, and
`get_thread_summary_response_inner` treated that as an internal error.
This made conversation-id lookups depend on a local-only field even
though the thread store can address the thread by id.
2026-05-07 11:18:16 -07:00
Tom
56823ec46b Move thread name edits to ThreadStore (#21264)
- Route live thread renames through `ThreadStore` metadata updates.
- Read resumed thread names from store metadata with legacy local
fallback preserved in the store.
2026-05-07 11:12:22 -07:00
Charlie Marsh
0dc1885a5c Upgrade cargo-shear to 1.11.2 (#21547)
## Summary

Catches a few additional dependencies (`sha2`, `url`) that should be in
`dev-dependencies`.
2026-05-07 11:07:18 -07:00
pakrym-oai
566f2cb612 [codex] Move tool specs onto handlers (#21461)
## Why

This is the next stacked step after deleting the tool-handler kind
indirection. Specs should come from the registered handlers themselves
so registry construction has a single source of truth for handler
behavior and exposed tool definitions.

## What changed

- Added `ToolHandler::spec()` plus handler-provided parallel/code-mode
metadata, and made `ToolRegistryBuilder::register_handler` automatically
collect specs from registered handlers.
- Moved builtin tool spec construction into the corresponding handlers
and their adjacent `_spec` modules, including shell, unified exec, apply
patch, view image, request plugin install, tool search, MCP resource,
goals, planning, permissions, agent jobs, and multi-agent tools.
- Reworked configurable handlers to receive their tool-building options
through constructors, with non-optional handler options where the
handler is always spec-backed. Shell fallback handlers keep an explicit
no-spec mode because they are also registered as hidden dispatch
aliases.
- Kept `CodeModeExecuteHandler` on the explicit configured wrapper so
the code-mode exec spec can still be built from the nested registry.

## Verification

- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `cargo test -p codex-core tools::handlers::multi_agents_spec::tests`
- `RUST_MIN_STACK=16777216 cargo test -p codex-core
tools::handlers::multi_agents::tests`
- `cargo test -p codex-core tools::handlers::apply_patch::tests`
- `cargo test -p codex-core tools::handlers::unified_exec::tests`
- `just fix -p codex-core`
- `git diff --check`
2026-05-07 10:48:36 -07:00
jif-oai
eb0462f2af app-server: refresh live threads from latest config snapshot (#21187)
## Why

App-server config writes were leaving existing threads partially stale.
After a config mutation, the app-server told each live thread to run
`Op::ReloadUserConfig`, but that path only re-read the user
`config.toml` layer. Settings that came from the app-server's
materialized config snapshot did not propagate to existing threads until
restart.

This change prevent a FS access from `core` for CCA.

## What changed

- add `CodexThread::refresh_runtime_config()` and
`Session::refresh_runtime_config()` so the app-server can push a freshly
rebuilt config snapshot into a live thread
- rebuild the latest config with each thread's `cwd` after config
mutations, then refresh the thread from that snapshot instead of asking
it to reload only `config.toml`
- keep session-static settings unchanged during refresh, while updating
runtime-refreshable state such as the config layer stack,
`tool_suggest`, and derived hook/plugin/skill state
- keep `reload_user_config_layer()` as the file-backed fallback for
legacy local reload flows, but route the shared refresh logic through
the new runtime refresh path

## Testing

- add a session test that verifies `refresh_runtime_config()` rebuilds
hooks from refreshed config
- add a session test that verifies runtime-refreshable fields update
while session-static settings like `model` and `notify` stay unchanged

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 19:22:04 +02:00
Owen Lin
129401df43 add top-level remote-control command (#21424)
## Summary

`codex --enable remote_control app-server --listen off` is the current
way to start a headless, remote-controllable app-server, but it is hard
to remember and exposes implementation details.

This adds `codex remote-control` as a friendly top-level wrapper for
that flow. The command starts a foreground app-server with local
transports disabled and enables `remote_control` only for that
invocation.

## Changes

- Add a visible `codex remote-control` CLI subcommand.
- Launch app-server with `AppServerTransport::Off`.
- Append `features.remote_control=true` after root feature toggles so
the explicit command wins over `--disable remote_control`.
- Reject root `--remote` / `--remote-auth-token-env`, matching other
non-TUI subcommands.
- Add tests for parsing, launch defaults, override ordering, and remote
flag rejection.

## Verification

- `cargo test -p codex-cli`
- `just fix -p codex-cli`
2026-05-07 10:17:07 -07:00
pakrym-oai
857e731478 [codex] Remove string-keyed MCP tool maps (#21454)
## Summary

This PR removes the synthetic `HashMap<String, ToolInfo>` keys from MCP
tool discovery. `McpConnectionManager::list_all_tools()` now returns
normalized `Vec<ToolInfo>`, and downstream code derives identity from
`ToolInfo::canonical_tool_name()`.

The motivation is to keep model-visible tool identity on
`ToolName`/`ToolInfo` instead of parallel string map keys, so future
namespace changes do not have to preserve otherwise-unused lookup keys.

## Changes

- Rename the MCP normalization path from `qualify_tools` to
`normalize_tools_for_model` and return tool values directly.
- Flow MCP tool lists through connectors, plugin injection, router/spec
building, code mode, and tool search as vectors/slices.
- Keep direct/deferred subtraction local to `mcp_tool_exposure`, using
`ToolName` values.
- Update tests to compare `ToolName` instances where MCP identity
matters.

## Validation

- `cargo test -p codex-mcp test_normalize_tools`
- `cargo test -p codex-core mcp_tool_exposure`
- `cargo test -p codex-core
direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
2026-05-07 10:16:10 -07:00
xl-openai
114bac1409 feat: Expose plugin share metadata in shareContext (#21495)
Extends PluginSummary.shareContext with shareUrl and reader shareTargets
2026-05-07 10:07:03 -07:00
rhan-oai
3444b0d60a [codex-analytics] add tool review event schema (#18747)
## Why

We want to emit terminal review analytics for tool-related approval
flows, but the event contract needs to exist before the reducer can
publish anything.

This PR is the schema-only slice for the Codex review event family.

## What changed

- add the `ReviewEvent` analytics envelope in
`codex-rs/analytics/src/events.rs`
- define the review subject kind, reviewer, trigger, terminal status,
and post-review resolution enums
- define the review event payload with thread, turn, item, lineage,
tool, and timing fields that the emitter stack will populate

## Verification

- stacked verification in dependent PRs: `cargo test -p codex-analytics
analytics_client_tests --manifest-path codex-rs/Cargo.toml`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18747).
* #18748
* #21434
* __->__ #18747
* #17090
* #17089
* #20514
2026-05-07 09:46:46 -07:00
jif-oai
9b6c6f7a01 fix: preserve exact turn diffs after partial apply_patch failures (#21518)
## Why

Follow-up to #21180: turn diffs are operation-backed now, but a failed
`apply_patch` can still leave exact filesystem mutations behind. For
example, a move can write the destination file before failing to remove
the source. Treating the whole call as unknowable then drops a change
that Codex actually knows happened, so the emitted turn diff can drift
from the workspace.

## What changed

-
[`apply-patch`](f55724e027/codex-rs/apply-patch/src/lib.rs (L248-L345))
now returns `ApplyPatchFailure` with the exact committed prefix
accumulated before an error. If a write failure may already have mutated
the target, the delta is marked inexact instead of being reused blindly.
- Move handling now records the destination write before attempting
source removal, so a partially failed move can still report the
destination file that definitely landed
([code](f55724e027/codex-rs/apply-patch/src/lib.rs (L463-L521))).
-
[`ApplyPatchRuntime`](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L49-L67))
now accumulates committed deltas across attempts and forwards them even
when the visible tool result is failed or sandbox-denied ([runtime
path](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L223-L250)),
[event
path](f55724e027/codex-rs/core/src/tools/events.rs (L215-L225))).
- `TurnDiffTracker` now consumes committed exact deltas rather than only
fully successful patches; exact-empty failures leave the aggregate
unchanged, while inexact deltas still invalidate it.

## Verification

- Added a regression test covering a failed move that still emits the
committed destination diff:
[`apply_patch_failed_move_preserves_committed_destination_diff`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1517-L1586)).
- Kept explicit coverage that an inexact delta clears the aggregate
instead of publishing a guessed diff:
[`apply_patch_clears_aggregated_diff_after_inexact_delta`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1589-L1655)).

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 18:05:45 +02:00
Ruslan Nigmatullin
e64a8979b0 device-key: clean up unused crate (#21487) 2026-05-07 09:01:44 -07:00
pakrym-oai
acac786d91 [codex] add account id to feedback uploads (#21498)
## Why

Feedback uploads already carry auth-derived context like
`chatgpt_user_id`, but they do not include the authenticated
workspace/account id. Adding `account_id` makes feedback triage easier
when a user can operate across multiple ChatGPT workspaces.

## What changed

- emit auth-derived `account_id` into feedback tags in `app-server`
before the feedback snapshot is uploaded
- preserve that tag through `codex-feedback` upload tag assembly
alongside the existing merge behavior for other tags
- extend `codex-feedback` coverage to assert that snapshot-derived
`account_id` is present in uploaded tags

## Verification

- `cargo test -p codex-feedback
upload_tags_include_client_tags_and_preserve_reserved_fields`
- `cargo test -p codex-app-server --lib feedback_processor`
2026-05-07 08:45:16 -07:00
jif-oai
f7e8ff8e50 Make turn diff tracking operation backed (#21180)
## Summary
- replace filesystem-based turn diff tracking with an operation-backed
accumulator
- preserve enough verified apply_patch state to render move-overwrite
cases correctly
- keep the turn/diff/updated contract intact while removing remote-only
turn-diff test skips

This takes the assumption that no 3P services rely on the output format
of `apply_patch`

## Why
For the CCA file system isolation push

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 11:33:47 +02:00
jif-oai
b2268999fe feat: make built-in MCPs first-class runtime servers (#21356)
## DISCLAIMER
This is experimental and no production service must rely on this

## Why

Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.

## What changed

- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.

## Test plan

- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 10:36:32 +02:00
Abhinav
40e282849c Show plugin hooks in plugin details (#21447)
Supersedes the abandoned #19859, rebuilt on latest `main`.

# Why

PR #19705 adds discovery for hooks bundled with plugins, but `/plugins`
still only shows skills, apps, and MCP servers. This follow-up makes
bundled hooks visible in the same plugin detail view so users can
inspect the full plugin surface in one place.

We also need `PluginHookSummary` to populate Plugin Hooks in the app;
`hooks/list` is not enough there because plugin detail needs to show
hooks for disabled plugins too.

# What

- extend `plugin/read` with `PluginHookSummary` entries for bundled
hooks
- summarize plugin hooks while loading plugin details
- render a `Hooks` row in the `/plugins` detail popup

<img width="3456" height="848" alt="CleanShot 2026-04-27 at 11 45 34@2x"
src="https://github.com/user-attachments/assets/fe3a38d6-a260-4351-8513-fb04c93d725b"
/>
2026-05-07 00:21:14 -07:00
xli-oai
898f5bfeaa [codex] fix PluginListParams test initializer (#21494)
## Summary
- update the app-server protocol test fixture to include the required
`marketplace_kinds` field on `PluginListParams`

## Why
`PluginListParams` now requires `marketplace_kinds`, but a later-added
test fixture in `common.rs` still constructed the older shape with only
`cwds`. That stale initializer breaks the main build with `missing field
marketplace_kinds`.

## Impact
This is a test-only repair. It restores compilation without changing the
JSON-RPC schema or runtime behavior.

## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
2026-05-06 23:58:26 -07:00
pakrym-oai
a8488fec5e Revert state DB injection and agent graph store (#21481)
## Why

Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.

## What changed

- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.

## Validation

- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.
2026-05-06 22:48:29 -07:00
xli-oai
5bc33fe31f [codex] Parallelize skills list cwd loading (#21441)
## Summary
- process `skills/list` cwd entries with bounded concurrency of 5
- preserve the caller's requested cwd order in the response
- add coverage that verifies response ordering remains stable

## Why
Cold-start desktop traces showed that `skills/list` can dominate the
shared config queue when it scans many workspace roots serially. The
expensive work is largely independent per cwd, so the request was paying
the sum of all cwd costs instead of the cost of the slowest bounded
batch.

## Impact
This keeps current request semantics intact while reducing the
wall-clock time of large multi-root `skills/list` calls. That should
also reduce how long later config-family requests, such as
`plugin/list`, wait behind `skills/list` during startup.

## Validation
- `just fmt`
- `cargo test -p codex-app-server`
- `cargo test -p codex-app-server
skills_list_preserves_requested_cwd_order`
2026-05-06 21:25:24 -07:00
xli-oai
05cd5c313e [codex] allow shared config reads in app-server queue (#21340)
## Summary
- add a shared-read serialization mode for global app-server request
families
- let consecutive leading shared reads for the same family run together
while keeping exclusive requests ordered
- mark only `skills/list`, `config/read` and `plugin/list` as shared
reads for now

## Why
`skills/list` and `plugin/list` are read-only config-family requests,
but the app-server queue currently treats every config request as
exclusive. That means one long `skills/list` can make a later
`plugin/list` wait even though the two requests do not mutate config.

This change keeps the existing queue order but lets adjacent reads
overlap. If a write is already waiting, later reads still stay behind
it, so writes do not starve.

## Scope
This intentionally keeps the first pass narrow:
- shared reads: `skills/list`, `plugin/list`
- still exclusive: `plugin/install`, `marketplace/*`,
`skills/config/write`, `config/*write`, `config/read`, and the rest of
the config family

## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`

## Desktop verification
I ran the dev desktop app against this branch's built binary with the
existing UI timing logs enabled. The app did use
`/Users/xli/code/codex_6/codex-rs/target/debug/codex`.

The new scheduler behavior works, but this narrow change does not remove
every cold-start delay: in the observed trace, an earlier exclusive
`config/read` was already queued ahead of the later `skills/list` and
`plugin/list` requests, so the page-open plugin requests still waited
behind that earlier exclusive config-family request before they could
run together.

That means this PR is the scheduler primitive needed for shared reads,
not the complete end-to-end latency fix by itself.

## Not run
- full workspace test suite, because repo policy requires explicit
approval before running it after touching `app-server-protocol`
2026-05-06 21:16:31 -07:00
mifan-oai
001363188a [codex] Add OpenAI Developers to tool suggest allowlist (#21423)
## Summary

Add `openai-developers@openai-curated` to
`TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` so the OpenAI Developers
plugin can be surfaced through tool suggestions once it is available in
the Built by OpenAI marketplace.

Update the discoverable plugin test fixture to assert the plugin is
returned from the curated marketplace allowlist path.

## Validation

- `cargo fmt --check` passed; rustfmt emitted the existing
stable-channel warnings about `imports_granularity`.
- `cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_uninstalled_curated_plugins`
passed.
2026-05-06 23:49:15 -04:00
pakrym-oai
e394625ea2 [codex] Delete tool handler plan indirection (#21427)
## Why

The spec split in the parent PR still left an intermediate registry plan
that recorded `ToolHandlerKind` values and translated them into concrete
handlers later. That kept tool registration dependent on static enum
bookkeeping instead of registering handlers from the same code that
assembles their specs.

## What Changed

- Make `build_tool_registry_builder` register concrete handlers directly
while adding specs.
- Add small `ToolRegistryBuilder` helpers for spec augmentation and
nested code-mode inspection.
- Remove `ToolHandlerKind`, `ToolHandlerSpec`, and `ToolRegistryPlan`.
- Update spec-plan tests to assert against the built `ToolRegistry`
instead of static handler descriptors.

## Validation

- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `just fix -p codex-core`
2026-05-06 20:36:24 -07:00
Felipe Coury
5a4b2702f2 fix(tui): clear first inline viewport render (#21450)
## Why

The alpha TUI can render the initial trust-directory prompt with stale
terminal text showing through spaces when startup begins below existing
shell output. The first inline viewport transition can happen while the
previous viewport is still empty, so the old clear path no-ops before
Ratatui draws the prompt. Ratatui then skips blank cells because its
previous buffer also thinks those cells are blank, leaving old terminal
contents visible inside the prompt.

## What Changed

- Clear from the new inline viewport top when the previous viewport is
empty during a viewport transition.
- Keep the existing clear-from-old-viewport behavior for normal viewport
updates.
- Add a VT100-backed regression test that pre-fills terminal contents,
performs the first viewport clear, and verifies stale text inside the
new viewport is removed while shell content above the viewport remains.

## How to Test

1. Start Codex alpha in a terminal that already has visible shell output
above the cursor.
2. Use a fresh untrusted project directory so the trust-directory prompt
appears.
3. Confirm the prompt text renders cleanly, with spaces staying blank
instead of showing fragments of previous shell output.
4. As a regression check, confirm content above the inline viewport is
still preserved in terminal scrollback.

Targeted tests:
- `cargo test -p codex-tui
first_viewport_change_clears_from_new_viewport_when_old_viewport_is_empty
-- --nocapture`
- `cargo test -p codex-tui`
2026-05-07 02:48:49 +00:00
pakrym-oai
103dc2b6ae Revert "Move skills watcher to app-server" (#21460)
Reverts openai/codex#21287
2026-05-07 02:24:20 +00:00
Andrei Eternal
527d52df03 Add compact lifecycle hooks (started by vincentkoc - external contrib) (#19905)
Based on work from Vincent K -
https://github.com/openai/codex/pull/19060

<img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
/>

## Why

Compaction rewrites the conversation context that future model turns
receive, but hooks currently have no deterministic lifecycle point
around that rewrite. This adds compact lifecycle hooks so users can
audit manual and automatic compaction, surface hook messages in the UI,
and run post-compaction follow-up without overloading tool or prompt
hooks.

## What Changed

- Added `PreCompact` and `PostCompact` hook events across hook config,
discovery, dispatch, generated schemas, app-server notifications,
analytics, and TUI hook rendering.
- Added trigger matching for compact hooks with the documented `manual`
and `auto` matcher values.
- Wired `PreCompact` before both local and remote compaction, and
`PostCompact` after successful local or remote compaction.
- Kept compact hook command input to lifecycle metadata: session id,
Codex turn id, transcript path, cwd, hook event name, model, and
trigger.
- Made compact stdout handling consistent with other hooks: plain stdout
is ignored as debug output, while malformed JSON-looking stdout is
reported as failed hook output.
- Added integration coverage for compact hook dispatch, trigger
matching, post-compact execution, and the audited behavior that
`decision:"block"` does not block compaction.

## Out of Scope

- Hook-specific compaction blocking is not implemented;
`decision:"block"` and exit-code-2 blocking semantics are intentionally
unsupported for `PreCompact`.
- Custom compaction instructions are not exposed to compact hooks in
this PR.
- Compact summaries, summary character counts, and summary previews are
not exposed to compact hooks in this PR.

## Verification

- `cargo test -p codex-hooks`
- `cargo test -p codex-core
manual_pre_compact_block_decision_does_not_block_compaction`
- `cargo test -p codex-app-server hooks_list`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-tui hooks_browser`

## Docs

The developer documentation for Codex hooks should be updated alongside
this feature to document `PreCompact` and `PostCompact`, the
`manual`/`auto` matcher values, and the compact hook payload fields.

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-05-06 18:08:31 -07:00
xl-openai
11106016ff feat: Add marketplace source filtering and plugin share context (#21419)
Adds marketplaceKinds to plugin/list for local, workspace-directory, and
shared-with-me; omitted params keep default local plus gated global
behavior, while explicit kinds are exact.

Exposes shareContext on plugin summaries from local share mappings and
remote workspace/shared responses, including remotePluginId and nullable
creator metadata.

Adds shared-with-me listing through /ps/plugins/workspace/shared,
renames the workspace remote namespace to workspace-directory, and keeps
direct remote read/share/install/update/delete paths gated by plugins
rather than remote_plugin.
2026-05-06 16:12:23 -07:00
pakrym-oai
9417cf9696 [codex] Move tool specs into core handlers (#21416)
## Why

This is the first mechanical slice of moving tool spec ownership toward
the handlers. `codex-tools` should keep shared primitives and conversion
helpers, while builtin tool specs and registration planning live in
`codex-core` with the handlers that own those tools.

Keeping this PR to relocation and import updates isolates the copy/move
review from the later logic change that wires specs through registered
handlers.

## What changed

- Moved builtin tool spec constructors from `codex-rs/tools/src` into
`codex-rs/core/src/tools/handlers/*_spec.rs` or nearby core tool
modules.
- Moved the registry planning code into
`codex-rs/core/src/tools/spec_plan.rs` and its associated types/tests
into core.
- Kept shared primitives in `codex-tools`, including `ToolSpec`,
schema/types, discovery/config primitives, dynamic/MCP conversion
helpers, and code-mode collection helpers.
- Updated handlers that referenced moved argument types or tool-name
constants to use the core spec modules.
- Moved spec tests next to the moved spec modules.

## Verification

- `cargo check -p codex-tools`
- `cargo check -p codex-core`
- `cargo test -p codex-tools`
- `cargo test -p codex-core _spec::tests`
- `cargo test -p codex-core tools::spec_plan::tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`

Note: I also tried the broader `cargo test -p codex-core tools::`; it
reached the moved spec-plan/spec tests successfully, then aborted with a
stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`,
which is outside this spec relocation.
2026-05-06 15:40:50 -07:00
pakrym-oai
d5eea229cc Move skills watcher to app-server (#21287)
## Why

Skills update notifications are app-server API behavior, but the watcher
lived in `codex-core` and surfaced through
`EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
focused on thread execution and lets app-server own both cache
invalidation and the `skills/changed` notification.

## What changed

- Added an app-server-owned skills watcher that watches local skill
roots, clears the shared skills cache, and emits `skills/changed`
directly.
- Registers skill watches from the common app-server thread listener
attach path, including direct starts, resumes, and app-server-observed
child or forked threads.
- Stores the `WatchRegistration` on `ThreadState`, so listener
replacement, thread teardown, idle unload, and app-server shutdown
deregister by dropping the RAII guard.
- Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
old core live-reload test.
- Extended the app-server skills change test to verify a cached skills
list is refreshed after a filesystem change without forcing reload.

## Validation

- `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-rollout -p codex-rollout-trace`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`
2026-05-06 15:38:11 -07:00
Brian Henzelmann
8f5d68f9d2 Document Codex git commit attribution config (#21379)
## Summary
- document that commit attribution for generated git commit messages is
gated by the `codex_git_commit` feature flag
- add an example `config.toml` snippet showing `commit_attribution` with
`[features].codex_git_commit = true`
- update the config schema description so the reference docs explain
that `commit_attribution` only takes effect when the feature is enabled

Fixes #19799.

## Validation
- `cargo run -p codex-core --bin codex-write-config-schema`
- `cargo test -p codex-config`
- `cargo test -p codex-features`
- `cargo fmt --check`
- `git diff --check`

## Notes
- `cargo test -p codex-core config_schema_matches_fixture` currently
fails before reaching the schema test because `core_test_support`
imports `similar` without a linked crate in this checkout. The narrower
package checks above avoid that unrelated test-support build failure.
2026-05-06 16:14:50 -05:00
iceweasel-oai
123e78b97b [codex] Fix Windows sandbox git safe.directory for worktrees (#21409)
## Why

Windows sandboxed commands run as a sandbox user, while workspace
repositories are usually owned by the real user. The sandbox compensates
by injecting a temporary Git `safe.directory` entry into the child
environment.

That injection was still broken for linked worktrees because the helper
followed the `.git` file's `gitdir:` pointer and injected the internal
`.git/worktrees/...` location. Git's dubious-ownership check expects the
worktree root instead, so sandboxed Git commands still failed in
worktree-based Codex checkouts.

## What changed

- Treat any `.git` marker, directory or file, as the worktree root for
`safe.directory` injection.
- Keep the safe-directory logic in
`windows-sandbox-rs/src/sandbox_utils.rs` and have the one-shot elevated
path reuse it.
- Add regression coverage for both normal `.git` directories and
gitfile-based worktrees.

## Validation

- `cargo test -p codex-windows-sandbox sandbox_utils::tests`
- `cargo test -p codex-windows-sandbox` built and ran; the new
`sandbox_utils` tests passed, while two pre-existing legacy sandbox
tests failed locally with `Access is denied`:
`session::tests::legacy_non_tty_cmd_emits_output` and
`spawn_prep::tests::legacy_spawn_env_applies_offline_network_rewrite`.
2026-05-06 14:08:45 -07:00
rhan-oai
fbdbc6b2fe [codex-analytics] emit tool item events from item lifecycle (#17090)
## Why

After the tool-item schemas are in place, analytics needs to emit them
from the app-server item lifecycle rather than requiring bespoke
tracking at each callsite. The reducer should also reuse the shared
thread analytics context introduced below it in the stack so later event
families do not repeat the same reducer joins or missing-state ladder.

## What changed

- Tracks tool-item completion notifications and emits the matching tool
analytics event when a terminal item arrives.
- Derives event-specific payload details for command execution, file
changes, MCP calls, dynamic tools, collaboration tools, web search, and
image generation.
- Denormalizes thread, app-server client, runtime, and subagent
provenance metadata through the shared thread analytics context.
- Adds reducer coverage for item lifecycle emission and subagent
metadata inheritance.

## Duration semantics

`duration_ms` is computed from the app-server item lifecycle timestamps:
`completed_at_ms - started_at_ms`. That makes it the duration of the
lifecycle Codex observed locally, not necessarily the upstream
provider's full execution time.

- Web search usually has a meaningful observed lifecycle because
Responses can send `response.output_item.added` before
`response.output_item.done`; in that case `started_at_ms` comes from the
added event and `completed_at_ms` comes from the done event.
- Image generation can be much less precise. In the current observed
stream, image generation often arrives only as a completed
`response.output_item.done`; when there is no earlier added event, Codex
synthesizes the started item immediately before completion, so
`duration_ms` can be `0` even though upstream image generation took
longer.
- Standalone web search and standalone image generation work is expected
to land after this stack. Those paths may introduce more direct
lifecycle events or timing points, so the current
web-search/image-generation duration semantics should be treated as the
best available item-lifecycle approximation, not the final latency
contract for those tool families.
- `execution_duration_ms` is populated only where the completed item
already carries a native execution duration; otherwise it remains `null`
while `duration_ms` still reflects the local lifecycle interval.

## Currently placeholder / partial fields

Some fields are included in the schema for the intended steady-state
contract, but this PR does not yet populate them from real
approval/review state:

- `review_count`, `guardian_review_count`, and `user_review_count`
currently default to `0`.
- `final_approval_outcome` currently defaults to `unknown`.
- `requested_additional_permissions` and `requested_network_access`
currently default to `false`.

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17090).
* #18748
* #18747
* __->__ #17090
* #17089
* #20514
2026-05-06 20:27:41 +00:00
rhan-oai
21295f47e2 [codex-tui] pass thread source for tui threads (#21401)
## Summary
- mark TUI-created thread starts and forks with explicit `thread_source
= user`
- add focused coverage for embedded and remote lifecycle request
builders

## Why
Thread analytics now consume an explicit thread-level source
classification instead of inferring it from `session_source`. The TUI
still omitted that field, so TUI-created interactive threads would
continue to land as `null` even after the new analytics plumbing
shipped.

## Validation
- `cargo test -p codex-tui app_server_session --lib`
2026-05-06 13:18:41 -07:00
pakrym-oai
b9c50a53d7 [codex] Split tool handlers into separate files (#21395)
## Why

Several tool handler modules still bundled multiple `ToolHandler`
implementations in one file. That made the handler directory harder to
navigate and made otherwise local handler edits land in large shared
modules.

## What

- Split grouped tool handlers into one handler file each for agent jobs,
goals, MCP resources, shell tools, and unified exec.
- Kept shared parsing, payload, and runtime helpers in the existing
parent modules, with re-exports preserving the existing handler import
paths.
- Updated the shell handler tests to construct `ShellCommandHandler`
through the existing `ShellCommandBackendConfig` conversion now that the
backend detail lives with the shell-command handler.

## Validation

- `cargo check -p codex-core`
- `cargo clippy -p codex-core --lib -- -D warnings`
- `git diff --check -- codex-rs/core/src/tools/handlers`

Targeted `codex-core` handler tests did not run locally because
`core_test_support` currently fails to compile before reaching these
tests due to an unresolved `similar` import.
2026-05-06 13:12:24 -07:00
canvrno-oai
d5f0b6d63a [codex] Dedupe fallback model metadata warnings (#21090)
Fixes #21070.

This is a small cleanup around model metadata handling for
gateway/provider model names. It follows the report and proposed
direction from @dkbush by keeping the fallback metadata warning useful
without repeating it every turn, and by tightening the existing
provider-prefix lookup path.

- Track fallback metadata warning slugs in session state so each
unresolved model warns once per session.
- Keep warning emission outside the session-state lock and preserve the
existing warning text.
- Allow one-segment provider prefixes with hyphenated provider IDs,
while preserving the multi-segment rejection behavior.
- Add focused coverage for warning dedupe and hyphenated provider-prefix
metadata matching.

Testing:

- Ran `just fmt`.
- Ran `git diff --check`.
- Added tests for the new warning dedupe and provider-prefix lookup
behavior.
2026-05-06 13:11:44 -07:00
starr-openai
63a27ad6c6 Avoid hard-coded environment context shell (#21390)
## Summary
- make resolved turn environment shell metadata optional instead of
hard-coding bash
- render environment context shells from explicit environment metadata
when present, falling back to the existing session shell
- update environment context tests for inherited PowerShell-style
fallback and explicit per-environment shell override

## Testing
- Not run (not requested; formatted with `just fmt`).

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 19:54:26 +00:00
Christoph Paasch (OpenAI)
f9063045e1 Avoid noisy OTEL diagnostics in codex exec (#21107)
`codex exec` should not print OpenTelemetry exporter self-diagnostics to
stderr by default. Suppress the SDK and OTLP exporter targets unless
callers
explicitly opt in with `RUST_LOG`.

Also stop defaulting the trace exporter to the log exporter, since OTLP
HTTP
endpoints are signal-specific and a logs endpoint is not valid for
spans.

Co-authored-by: Codex <noreply@openai.com>

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 12:49:13 -07:00
Clark DuVall
346070a424 Route opted-in MCP elicitations through Guardian (#19431)
# Motivation

Browser Use origin-access prompts are MCP elicitations, not direct
tool-call approval prompts, so they were bypassing the Guardian approval
path. We need a generic opt-in that lets eligible MCP elicitations use
Guardian when the current turn already routes approvals there.

# Description

Add a generic elicitation reviewer hook in codex-mcp and wire codex-core
to pass a Guardian reviewer callback when creating the MCP connection
manager. The reviewer validates explicit mcp_tool_call opt-in metadata,
builds a Guardian MCP tool-call review request from
server/tool/connector metadata and tool params, and maps Guardian
approval, denial, timeout, and cancellation decisions back to MCP
elicitation responses.

The new option to trigger this in the `_meta` object is:
```
"codex_request_type": "approval_request",
```

# Testing

- RUST_MIN_STACK=8388608 NEXTEST_STATUS_LEVEL=leak cargo nextest run
--no-fail-fast --cargo-profile ci-test --test-threads 2
- cargo clippy --tests -- -D warnings
- cargo fmt -- --config imports_granularity=Item --check
- cargo shear
- pnpm run format
- python3 .github/scripts/verify_cargo_workspace_manifests.py
- python3 .github/scripts/verify_tui_core_boundary.py
- python3 .github/scripts/verify_bazel_clippy_lints.py
- git diff --check
2026-05-06 19:42:45 +00:00
Felipe Coury
6b7d6cafa0 fix(tui): persist ctrl-c draft via app event (#21397)
## Why

The main branch started failing after #21351 merged because the merge
commit kept calling `AppCommand::add_to_history` from
`BottomPane::clear_composer_for_ctrl_c`, but main had already removed
that helper as part of the history persistence refactor. The PR head
passed because it was based on an older main commit where the helper
still existed.

This restores the Ctrl+C draft-stashing behavior using the current
app-event path instead of the removed command helper.

## What Changed

- Store the active `ThreadId` in `BottomPane` when history metadata is
provided.
- Emit `AppEvent::AppendMessageHistoryEntry` for Ctrl+C-cleared drafts.
- Update the slash-clear regression test to assert the current history
event shape.

## How to Test

Targeted tests:
- `cargo test -p codex-tui
slash_clear_after_ctrl_c_keeps_stashed_draft_recallable`

Broader local checks:
- `just fix -p codex-tui`
- `just argument-comment-lint -p codex-tui`
- `git diff --check origin/main...HEAD`
- `cargo test -p codex-tui` reached completion; the fixed test passed,
and the only local failures were
`status::tests::status_permissions_full_disk_managed_*`, blocked by this
machine config rejecting `DangerFullAccess` via
`/etc/codex/requirements.toml`.
2026-05-06 19:03:11 +00:00
iceweasel-oai
f32c496144 [codex] Handle git pagination flags by position (#21381)
## Why

This is a follow-up to the Windows Git safe-command bypass fix for
BUGB-15601. Git's global `--paginate` / `-p` flags can route output
through a configured pager, so they should not be auto-approved as safe
before the subcommand. At the same time, `-p` after read-only
subcommands like `log`, `diff`, and `show` is the common patch-output
flag, so treating every `-p` as unsafe would make ordinary read-only
inspection commands prompt unnecessarily.

## What Changed

- Split Git option safety matching into explicit global-option and
subcommand-option lists.
- Treat global `git --paginate ...` and `git -p ...` as unsafe.
- Keep post-subcommand patch usage such as `git log -p`, `git diff -p`,
and `git show -p HEAD` safe.
- Keep the pagination coverage with the shared Git safe-command
implementation rather than the Windows wrapper tests.
- Remove the stale `git_global_option_requires_prompt` helper now that
safe-command Git option matching owns the prompt-required lists.

## Testing

- `cargo test -p codex-shell-command`
2026-05-06 11:53:26 -07:00
pakrym-oai
712305be47 Remove core MCP list tools op (#21281)
## Why

The core `Op::ListMcpTools` request path is no longer needed. Keeping it
around left a dead request/response surface alongside the app-server MCP
inventory APIs that own current server status listing.

## What Changed

- Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the
core handler that built the MCP snapshot response.
- Removed the now-unused `codex-mcp` snapshot wrapper/export and passive
event handling arms in rollout and MCP-server consumers.
- Updated tests that used the old op as a synchronization hook to wait
on existing startup/skills events, and deleted the plugin test that only
exercised the removed listing op.

## Validation

- `cargo test -p codex-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `cargo test -p codex-core --test all
pending_input::queued_inter_agent_mail`
- `cargo test -p codex-core --test all
rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta`
- `cargo test -p codex-core --test all
rmcp_client::stdio_image_responses`
- `just fix -p codex-core -p codex-protocol -p codex-mcp -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
2026-05-06 11:20:34 -07:00
Michael Bolin
123ec8b035 vendor: update bubblewrap to 0.11.2 (#21389)
## Why

`codex-rs/vendor/bubblewrap` had fallen behind upstream, and upstream
`v0.11.2` is the current Bubblewrap release. The release is a security
update for `CVE-2026-41163`, affecting setuid Bubblewrap builds, and
deprecates setuid support in favor of the default non-setuid build mode.

## What changed

- Refreshed the vendored Bubblewrap sources under
`codex-rs/vendor/bubblewrap` to upstream `v0.11.2`.
- Brought in the upstream `-Dsupport_setuid` build option, which
defaults setuid support off.
- Updated vendored release notes and documentation files included with
Bubblewrap.

## Verification

Not run locally; this PR only refreshes the vendored upstream Bubblewrap
source snapshot.

Upstream release:
https://github.com/containers/bubblewrap/releases/tag/v0.11.2
2026-05-06 18:10:30 +00:00
Felipe Coury
e97610cf3b fix(tui): keep Ctrl-C stashed drafts after /clear (#21351)
## Why

When a user stashes a draft with Ctrl+C, then runs `/clear`, the fresh
chat session loses the in-memory composer history that held the stashed
draft. Pressing Up after `/clear` can then recall an older submitted
prompt instead of the draft the user explicitly saved for later.

## What Changed

- Record Ctrl+C-cleared composer text through the existing message
history path, so it survives the fresh session created by `/clear`.
- Keep `/clear` itself out of local slash-command recall so it does not
sit ahead of the stashed draft.
- Add regression coverage for the full flow: submit a prompt, stash a
later draft with Ctrl+C, run `/clear`, then recall the stashed draft
before the older prompt.

## How to Test

1. Start Codex with `just c`.
2. Submit a short prompt such as `ok` and wait for the turn to complete.
3. Type a new draft, press Ctrl+C, then run `/clear`.
4. Press Up and confirm the stashed draft is restored.
5. Press Up again and confirm the older submitted prompt is still
reachable after the stashed draft.

Targeted tests:

- `cargo test -p codex-tui
slash_clear_after_ctrl_c_keeps_stashed_draft_recallable`

Manual verification:

- Reproduced the issue in tmux with `RUST_LOG=trace just c -c
log_dir=...`: before the fix, Up after `/clear` recalled the older
submitted prompt.
- Re-tested the same tmux flow after the fix: Up after `/clear` restored
the Ctrl+C-stashed draft.
2026-05-06 14:46:18 -03:00
mifan-oai
f2f5d6f6c7 [codex] Coordinate OpenAI docs sample with API key setup (#21263)
## Summary
- Add the same API key setup coordination guidance to the embedded
OpenAI Docs sample skill in `codex-rs/skills`.
- Keep the skill description/frontmatter unchanged; the coordination
lives only in the body.
- Preserve direct OpenAI Docs routing for docs-only questions,
citations, model/API guidance, conceptual explanations, and non-building
examples.

## Why
The Codex repo carries its own OpenAI Docs skill variant under
`codex-rs/skills/src/assets/samples`. This keeps that embedded sample
aligned with the other OpenAI Docs variants patched in the related PRs.

## Validation
- `cargo test -p codex-skills`
- `git diff --check`
2026-05-06 13:46:15 -04:00
jif-oai
ab43db44a2 feat: move auto vaccum (#21378)
The initial vaccum is not needed anymore. We can consider all the DBs
have been reclaimed by now
2026-05-06 19:32:28 +02:00
jif-oai
0e821b380a rollout: coalesce thread updated_at touches (#21367)
## Why

Metadata-irrelevant rollout events currently refresh
`threads.updated_at` on every flush. That keeps thread recency accurate,
but it also turns high-frequency agent output into unnecessary SQLite
writes. Recency only needs to advance periodically during an active
session, while the final suppressed touch still needs to be persisted
before shutdown.

## What changed

- coalesce touch-only `updated_at` writes in the rollout writer, with a
short production interval between persisted touches
- retain the latest suppressed touch and flush it during shutdown so the
thread is not left stale
- extend rollout recorder coverage for coalesced touches, delayed
refresh, shutdown flushing, and the existing missing-thread fallback
path

## Verification

- Added regression coverage in `rollout/src/recorder_tests.rs` for
coalescing and shutdown flushing behavior.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 19:32:24 +02:00
pakrym-oai
2070d5bfd3 [codex] Add response.processed websocket request (#21284)
## Summary

- Add a `response.processed` websocket request payload and sender for
Responses API websockets.
- Send `response.processed` from `try_run_sampling_request` after a
response completes, local turn processing succeeds, and the
session-owned feature flag is enabled.
- Add websocket coverage for both enabled and disabled feature-flag
behavior.

## Validation

- `just fmt`
- `cargo test -p codex-core response_processed`
- `cargo test -p codex-api responses_websocket`
- `cargo test -p codex-features
responses_websocket_response_processed_is_under_development`
- `git diff --check`
- `just fix -p codex-api -p codex-core -p codex-features`
- `git diff --check origin/main...HEAD`
2026-05-06 09:58:46 -07:00
pakrym-oai
2004173cd7 Move message history out of core (#21278)
## Why

Message history was implemented inside `codex-core` and surfaced through
core protocol ops and `SessionConfiguredEvent` fields even though the
current consumer is TUI-local prompt recall. That made core own UI
history persistence and exposed `history_log_id` / `history_entry_count`
through surfaces that app-server and other clients do not need.

This change moves message history persistence out of core and keeps the
recall plumbing local to the TUI.

## What changed

- Added a new `codex-message-history` crate for appending, looking up,
trimming, and reading metadata from `history.jsonl`.
- Removed core protocol history ops/events: `AddToHistory`,
`GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
- Removed `history_log_id` and `history_entry_count` from
`SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
- Updated the TUI to dispatch local app events for message-history
append/lookup and keep its persistent-history metadata in TUI session
state.

## Validation

- `cargo test -p codex-message-history -p codex-protocol`
- `cargo test -p codex-exec event_processor_with_json_output`
- `cargo test -p codex-mcp-server outgoing_message`
- `cargo test -p codex-tui`
- `just fix -p codex-message-history -p codex-protocol -p codex-core -p
codex-tui -p codex-exec -p codex-mcp-server`
2026-05-06 08:35:42 -07:00
Ahmed Ibrahim
be1d3cff93 2- Use string service tiers in session protocol (#20971)
## Summary
- break service tier session/op/app-server protocol fields from the
closed enum to string tier ids
- send the service tier string directly through model requests, prewarm,
compaction, memories, and TUI/app-server turn starts
- regenerate app-server protocol JSON/TypeScript schemas, removing the
standalone ServiceTier TS enum

## Verification
- just fmt
- cargo check -p codex-core -p codex-app-server -p codex-tui
- just write-app-server-schema

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 18:00:21 +03:00
jif-oai
ebd9ec05b4 [codex] fix builtin MCP Windows path test (#21350)
## Summary
- make the builtin MCP config test derive the expected `--codex-home`
argument from `AbsolutePathBuf`

## Why
`AbsolutePathBuf::try_from("/tmp/codex-home")` is rendered as
`D:\\tmp\\codex-home` on Windows, but the test asserted the Unix literal
`"/tmp/codex-home"`. That made the Windows Bazel job fail even though
the production code was behaving correctly.

## Impact
This keeps the test cross-platform while preserving the same transport
assertion on Unix and Windows.

## Validation
- `cargo test -p codex-builtin-mcps`

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 16:06:21 +02:00
jif-oai
5ecff05196 feat(app-server): move v2 sessionId onto Thread (#21336)
## Why

`session_id` and `thread_id` are separate identities after #20437, but
app-server only surfaced `sessionId` on the `thread/start`,
`thread/resume`, and `thread/fork` response envelopes. Other
thread-bearing surfaces such as `thread/list`, `thread/read`,
`thread/started`, `thread/rollback`, `thread/metadata/update`, and
`thread/unarchive` either lacked the grouping key or forced clients to
special-case those three responses.

Making `sessionId` part of the reusable `Thread` payload gives every v2
API surface one place to expose session-tree identity.

## Mental model
  1. thread.sessionId lives on `Thread`
2. It is a view/runtime identity for the current live session tree, not
durable stored lineage metadata
3. When app-server has a live loaded thread, it copies the real value
from core’s session_configured.session_id
4. When it only has stored/unloaded data, it falls back to
thread.sessionId = thread.id

## What changed

- Added `sessionId` to the v2
[`Thread`](8fc9e9b4cf/codex-rs/app-server-protocol/src/protocol/v2/thread_data.rs (L105-L109)).
- Removed the duplicate top-level `sessionId` fields from
`thread/start`, `thread/resume`, and `thread/fork`; clients should now
read `response.thread.sessionId`.
- Populated `thread.sessionId` when building live thread responses,
replaying loaded threads, and returning stored-thread summaries so the
field is present across start, resume, fork, list, read, rollback,
metadata-update, unarchive, and `thread/started` paths. See
[`load_thread_from_resume_source_or_send_internal`](8fc9e9b4cf/codex-rs/app-server/src/request_processors/thread_processor.rs (L2824-L2918))
and
[`thread_from_stored_thread`](8fc9e9b4cf/codex-rs/app-server/src/request_processors/thread_processor.rs (L3671-L3719)).
- Preserved the stored-thread fallback: if a thread has not been loaded
into a live session tree yet, `thread.sessionId` falls back to
`thread.id`; once the thread is live again, the field reports the active
session tree root.
- Regenerated the JSON/TypeScript schemas and updated the app-server
README examples to show
[`thread.sessionId`](8fc9e9b4cf/codex-rs/app-server/README.md (L306-L310))
on the thread object.
2026-05-06 15:23:25 +02:00
jif-oai
ca257b6ce5 chore: spawn MCP for memories (#21214)
Co-authored-by: Codex <noreply@openai.com>
2026-05-06 15:05:54 +02:00
jif-oai
8f3bb355f4 Move installation ID resolution out of core startup (#21182)
## Summary

- resolve or inject the installation ID before core startup and pass it
through `ThreadManager`, `CodexSpawnArgs`, and `Session` as a plain
`String`
- keep child sessions on the parent installation ID instead of
rediscovering it inside core
- propagate installation ID startup failures in `mcp-server` instead of
panicking

## Why

Core was still touching the filesystem on the session startup path to
discover `installation_id`. This moves that work to the outer host
boundary so core no longer depends on `codex_home` reads during session
construction.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 10:48:54 +00:00
Ahmed Ibrahim
5d6f23a27b Propagate cache key and service tiers in compact (#21249)
## Why

`/responses/compact` should preserve the request-affinity fields that
apply to the active auth mode. ChatGPT-auth compact requests need the
effective `service_tier`, and compact requests for every auth mode need
the stable `prompt_cache_key`, so compaction does not quietly lose
routing or cache behavior that normal sampling already has.

This follows the request-parity direction from #20719, but keeps the net
change focused on the compact payload fields needed here.

## What changed

- Add `service_tier` and `prompt_cache_key` to the compact endpoint
input payload.
- Build the remote compact payload from the existing responses request
builder output so `Fast` still maps to `priority` when compact sends a
service tier.
- Pass the turn service tier into remote compaction, but only include it
in compact payloads for ChatGPT-backed auth.
- Keep `prompt_cache_key` on compact payloads for all auth modes.
- Add request-body diff snapshot coverage in
`core/tests/suite/compact_remote.rs` for:
- API-key auth reusing `prompt_cache_key` while omitting `service_tier`
even when `Fast` is configured.
  - ChatGPT auth reusing both `service_tier` and `prompt_cache_key`.
- Drive the snapshot coverage through five varied turns: plain text,
multi-part text, tool-call continuation, image+text input, local-shell
continuation, and final-turn reasoning output.

## Verification

- Added insta snapshots for compact request-body parity against the last
normal `/responses` request after five varied turns.
- Not run locally per repo guidance; relying on GitHub CI for test
execution.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 13:38:43 +03:00
jif-oai
cc84e6bc6d Revert "feat: support template interpolation in multi-agent usage hints" (#21337)
Reverts openai/codex#20973
2026-05-06 12:33:37 +02:00
jif-oai
06e5dfa4dd feat: return session ID from thread/fork (#21332)
## Why

`thread/start` and `thread/resume` already return `sessionId`, but
`thread/fork` only returned the new thread. That left clients to infer
the forked thread's session identity from `thread.id`, which kept the
new `session_id` / `thread_id` split implicit at one lifecycle boundary.
Follow-up to #20437.

## What changed

- Add `sessionId` to `ThreadForkResponse`.
- Populate it from the forked session configuration.
- Regenerate the v2 JSON/TypeScript schema fixtures and update the
app-server docs/example.
- Extend the fork integration test to assert the returned `sessionId`.

## Verification

- Added coverage in `thread_fork_creates_new_thread_and_emits_started`
for the new response field.
2026-05-06 12:04:27 +02:00
jif-oai
fe24a180ab feat: include thread ID in MCP turn metadata (#21329)
## Why

MCP tool calls already include `session_id` in `x-codex-turn-metadata`,
but descendant threads intentionally share that value with the root
thread. Consumers that need to correlate work at the concrete thread
level also need the current `thread_id`.

## What changed

- add `thread_id` to `x-codex-turn-metadata` while preserving
`session_id` as the shared session identity
- thread the two identities separately through normal turns and spawned
review threads
- add regression coverage for resumed sessions, reserved metadata
fields, and deferred MCP tool calls

## Verification

- added focused coverage in `core/src/session/tests.rs`,
`core/src/turn_metadata_tests.rs`, and `core/tests/suite/search_tool.rs`
2026-05-06 11:36:15 +02:00
jif-oai
b5e965e1d7 test: isolate app-server-client in-process test state (#21328)
## Why

The in-process `app-server-client` tests were still building their
configs from the ambient `codex_home` and letting the embedded app
server create its own state DB when `state_db` was absent. That matters
because in-process startup falls back to
`init_state_db_from_config(...)` in that case, so tests can otherwise
share persisted state instead of getting isolated fixtures:
[`app-server/src/in_process.rs`](a98623511b/codex-rs/app-server/src/in_process.rs (L368-L373)).

## What changed

- Give each in-process test client its own temporary `codex_home`.
- Initialize the matching state DB from that per-client config and pass
it into the client explicitly.
- Keep the temp directory alive for the lifetime of the test client
through a small `TestClient` wrapper.
- Add `tempfile` as a dev dependency for the new harness.

The updated setup lives in
[`app-server-client/src/lib.rs`](35c1133d45/codex-rs/app-server-client/src/lib.rs (L982-L1055)).

## Testing

- Existing `codex-app-server-client` tests continue to exercise the
updated in-process client path through the isolated helper.
2026-05-06 09:21:22 +00:00
jif-oai
a98623511b feat: add session_id (#20437)
## Summary

Related to
https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
TLDR:
We update the meaning of session ids and thread ids:
* thread_id stays as now
* session_id become a shared id between every thread under a /root
thread (i.e. every sub-agent share the same session id)

This PR introduces an explicit `SessionId` and threads it through the
protocol/client boundary so `session_id` and `thread_id` can diverge
when they need to, while preserving compatibility for older serialized
`session_configured` events.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 10:48:37 +02:00
Matthew Zeng
f9a907aebe Support Codex Apps auth elicitations (#19193)
## Summary

- request URL-mode MCP elicitations when Codex Apps tool calls fail with
connector auth metadata
- route Codex Apps auth URL elicitations into the TUI app-link flow

## Test plan

- `just fmt`
- `cargo test -p codex-core mcp_tool_call::tests`
- `cargo test -p codex-mcp`
- `cargo test -p codex-tui bottom_pane::app_link_view::tests`
- `just fix -p codex-core`
- `just fix -p codex-mcp`
- `just fix -p codex-tui`

Also attempted broader local runs:

- `cargo test -p codex-core` fails in unrelated
config/request-permission/proxy-sensitive tests under the current Codex
Desktop environment.
- `cargo test -p codex-tui` fails in unrelated status
snapshots/trust-default tests because the ambient environment renders
workspace-write/network permission defaults.
2026-05-06 07:18:00 +00:00
Michael Bolin
22326e263c release: bundle bwrap with Linux codex DotSlash artifact (#21312)
## Why

#21255 changed the Linux sandbox fallback so Codex can use a bundled
`codex-resources/bwrap` executable when no suitable system `bwrap` is
available. That lookup is relative to the native Codex executable
returned by
`std::env::current_exe()`, as implemented in
[`bundled_bwrap.rs`](9766d3d51c/codex-rs/linux-sandbox/src/bundled_bwrap.rs (L83-L93)).

The release already publishes a separate `bwrap` DotSlash output, but
the Linux `codex` DotSlash output still pointed at a single-binary
`.zst` payload. Running the `codex` DotSlash manifest only materializes
the native `codex` executable; it does not also create sibling files
from the separate `bwrap` manifest. The fallback path therefore needs
the Linux `codex` DotSlash artifact itself to include the real `bwrap`
executable at `codex-resources/bwrap`.

## What changed

- stage a Linux primary `codex-<target>-bundle.tar.zst` release artifact
containing `codex` and `codex-resources/bwrap`
- point the Linux `codex` DotSlash outputs at that bundle tarball
- leave the standalone `bwrap` DotSlash output in place for consumers
that want to fetch `bwrap` directly

## Verification

- `jq . .github/dotslash-config.json`
- Ruby YAML parse of `.github/workflows/rust-release.yml`
2026-05-05 23:33:13 -07:00
viyatb-oai
9766d3d51c fix(bwrap): emit libcap after standalone archive (#21285)
## Why

#21255 added the standalone `codex-bwrap` binary. In the Cargo build,
[`pkg_config::probe("libcap")`](a736cb55a2/codex-rs/bwrap/build.rs (L37-L39))
emits `-lcap` before
[`cc::Build::compile("standalone_bwrap")`](a736cb55a2/codex-rs/bwrap/build.rs (L50-L67))
adds the static bwrap archive. The Linux musl link then sees `-lcap
-lstandalone_bwrap`; because static archives are resolved left-to-right,
`cap_from_name` is still undefined once `standalone_bwrap` introduces
that reference.

The musl setup already builds `libcap.a` and exposes it through
[`libcap.pc`](a736cb55a2/.github/scripts/install-musl-build-tools.sh (L78-L88)),
so the failure is link ordering rather than a missing dependency.

## What changed

- probe `libcap` with `cargo_metadata(false)` so `pkg-config` does not
emit its link flags early
- emit the discovered `libcap` search paths and libraries after
`standalone_bwrap` is compiled, preserving the needed static-link order

## Verification

- `cargo test -p codex-bwrap`
- `cargo clippy -p codex-bwrap --all-targets`

The affected Linux musl release link is exercised by CI, which is the
path this fix targets.
2026-05-05 22:22:01 -07:00
Matthew Zeng
41505bcea2 [mcp] Return Accept early per feedback. (#21277)
- [x] Return Accept early when auto_deny is enabled per feedback.
2026-05-05 21:23:42 -07:00
aaronl-openai
9f06d171e2 Preserve session MCP config on refresh (#21055)
# Overview
MCP refreshes were rebuilding active threads from fresh disk-backed
config only, which dropped thread-start session overlays such as
app-injected MCP servers. This keeps refreshes current with disk config
while preserving the thread-local config that only the active thread
knows about.

# Changes
- Rebuild refreshed config per active thread using that thread's current
`cwd`, rather than fanning out one app-server config to every thread.
- Preserve each thread's `SessionFlags` layer while replacing reloadable
config layers with freshly loaded config, then derive the MCP refresh
payload from the rebuilt result.
- Move MCP refresh orchestration into app-server so manual refreshes
fail loudly while background refreshes remain best-effort, and route
plugin-triggered refreshes through the same per-thread reload path.
- Add regression coverage for session overlays, fresh project config,
plugin-derived MCP config, current requirements, and strict vs
best-effort refresh behavior.

# Verification
- Passed focused Rust coverage for the thread-config rebuild behavior
and deferred MCP refresh flow, plus `cargo test -p codex-app-server
--lib`.
- Verified end to end in the Codex dev app against the locally built
CLI: registered an MCP via thread config, verified that it could be used
successfully before refresh, manually triggered MCP refresh, and
verified that it continued to be available afterward.
2026-05-05 21:09:28 -07:00
Andrei Eternal
8ef31894dc app-server: align dynamic tool identifiers with Responses API (#20724)
## Why

Codex currently accepts dynamic tool names and namespaces that the
upstream Responses function-tool path does not actually support. In
practice, that means app-server can register a dynamic tool successfully
and only discover later that the LLM-facing tool contract will reject or
mishandle it.

This PR tightens the app-server-side dynamic tool contract to match the
Responses API before we stack dynamic tool hook support on top of it.

## What changed

- validate dynamic tool `name` against the Responses function-tool
identifier contract: `^[a-zA-Z0-9_-]+$`, length `1..128`
- validate dynamic tool `namespace` the same way, with the Responses
namespace length limit `1..64`
- reject namespaces that collide with the always-reserved Responses
runtime namespaces such as `functions`, `multi_tool_use`, `file_search`,
`web`, `browser`, `image_gen`, `computer`, `container`, `terminal`,
`python`, `python_user_visible`, `api_tool`, `tool_search`, and
`submodel_delegator`
- escape invalid identifiers in error messages so control characters do
not spill raw into logs or client-visible error text
- document the tightened dynamic tool identifier contract in
`codex-rs/app-server/README.md`
- add both unit coverage for the validator and an app-server integration
test that rejects a `thread/start` request with Responses-incompatible
dynamic tool identifiers

## Verification

- `cargo test -p codex-app-server validate_dynamic_tools_`
- `cargo test -p codex-app-server --test all
thread_start_rejects_dynamic_tools_not_supported_by_responses`
2026-05-05 21:05:00 -07:00
xl-openai
5119680f85 feat: Add plugin share access controls (#21124)
Extends `plugin/share/save` to accept optional discoverability and
shareTargets while uploading plugin contents, and adds
`plugin/share/updateTargets` for share-only target updates without
re-uploading.
2026-05-05 20:14:18 -07:00
rhan-oai
b3d4f1a9f0 [codex-analytics] rework thread_source for thread analytics (#20949)
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification

## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.

Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.

## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.

## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
2026-05-06 02:12:31 +00:00
Abdulrahman Alfozan
94db03d5af Expose plugin manifest keywords in app server (#21271)
## Summary
- Add plugin manifest keywords to core plugin marketplace/detail models
- Expose keywords on app-server v2 PluginSummary and generated
schema/types
- Populate keywords in plugin/list and plugin/read responses for local
plugins

Depends on https://github.com/openai/openai/pull/891087

## Validation
- just fmt
- just write-app-server-schema
- cargo test -p codex-app-server-protocol
- cargo test -p codex-core-plugins
- cargo test -p codex-app-server
plugin_list_keeps_valid_marketplaces_when_another_marketplace_fails_to_load
- cargo test -p codex-app-server
plugin_read_returns_plugin_details_with_bundle_contents
2026-05-06 02:09:05 +00:00
pakrym-oai
136e442e95 [codex] Remove legacy ListSkills op (#21282)
## Why

`skills/list` is already exposed through app-server v2 and covered by
the app-server test suite. Keeping the separate core `Op::ListSkills`
path leaves a duplicate legacy protocol surface that no longer needs to
be maintained.

## What Changed

- Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the
core protocol.
- Deleted the corresponding core session handler and stale core
integration tests.
- Removed rollout/MCP ignore branches and protocol v1 docs references
for the deleted event/op.
- Left app-server `skills/list` and its existing coverage intact.

## Validation

- `cargo test -p codex-protocol`
- `cargo test -p codex-core --test all suite::skills`
- `cargo check -p codex-mcp-server -p codex-rollout -p
codex-rollout-trace`
- `just fix -p codex-core`
2026-05-05 18:58:18 -07:00
pakrym-oai
024118625e [codex] Remove unused ListModels op (#21276)
## Why

The core protocol still exposed a `ListModels` submission op even though
no client sends it and the core submission loop treated it as an ignored
unknown op. Keeping the dead variant made the protocol surface look
supported while the active model listing API is the app-server
`model/list` JSON-RPC request.

## What Changed

- Removed the unused `Op::ListModels` variant from `codex-rs/protocol`.
- Removed its `Op::kind()` mapping.

The existing app-server `model/list` endpoint is unchanged.

## Verification

- `cargo test -p codex-protocol`
2026-05-06 01:57:17 +00:00
Michael Bolin
a736cb55a2 release/npm: bundle standalone bwrap on Linux (#21257) 2026-05-05 18:21:52 -07:00
iceweasel-oai
db22c91e61 Share Git safe-command logic on Windows (#21275)
## Why

BUGB-15601 showed that the Windows safe-command path had drifted from
the generic Git classifier. The Windows-specific Git parser could
classify a PowerShell-wrapped `git` command as safe as soon as it found
a safelisted subcommand, without applying the generic checks for unsafe
subcommand options such as `--output`, `--ext-diff`, `--textconv`,
`--paginate`, or `cat-file --filters`.

The generic classifier already models the Git command boundary and the
read-only argument checks more carefully, so Windows should reuse that
logic instead of maintaining a smaller parallel parser.

## What Changed

- Extracted the existing generic Git classification logic into
`is_safe_git_command`.
- Updated `windows_safe_commands.rs` to call that shared helper for
parsed PowerShell `git` commands.
- Removed the Windows-only Git subcommand safelist, including the
`cat-file` allowance that was part of the reported bypass.
- Added a Windows regression test that keeps PowerShell-wrapped Git
commands with side-effecting options classified unsafe.
- Made the full-path PowerShell test discover the installed PowerShell
executable instead of depending on one hard-coded `pwsh.exe` path.

## Verification

- `cargo test -p codex-shell-command
rejects_git_subcommand_options_with_side_effects`
- `cargo test -p codex-shell-command
git_global_override_flags_are_not_safe`
- `cargo test -p codex-shell-command
windows_powershell_full_path_is_safe -- --nocapture`

Co-authored-by: Codex <codex@openai.com>
2026-05-05 17:49:42 -07:00
mchen-oai
794c240f25 Add model and reasoning effort to MCP turn metadata (#21219)
## Why
- Similar change as https://github.com/openai/codex/pull/19473.
- Without change: MCP tool calls receive
`_meta["x-codex-turn-metadata"]` with `session_id`, `turn_id`, and
`turn_started_at_unix_ms`.
- Issue: MCP servers may want the model and reasoning effort to better
understand tool-call behavior and latency relative to turn start.

## What Changed
- With change: MCP turn metadata now includes `model` and
`reasoning_effort`, propagated in `_meta["x-codex-turn-metadata"]`.
- Normal `/responses` turn metadata headers are unchanged.

## Verification
- `codex-rs/core/src/mcp_tool_call_tests.rs`
- `codex-rs/core/src/turn_metadata_tests.rs`
- `codex-rs/core/tests/suite/search_tool.rs`
2026-05-05 17:37:48 -07:00
pakrym-oai
2c1a361a2e [codex] Move thread naming to app server (#21260)
## Why

Thread names are app-server metadata now, backed by the thread store and
sqlite state database. Keeping a core `SetThreadName` op plus a rollout
`thread_name_updated` event made rename persistence live in the wrong
layer and required historical replay support for an event that new
app-server flows should not write.

## What changed

- Removed `Op::SetThreadName` and `EventMsg::ThreadNameUpdated` from the
core protocol and deleted the core handler path that appended rename
events to rollouts.
- Updated app-server `thread/name/set` so both loaded and unloaded
threads write through thread-store metadata and app-server emits
`thread/name/updated` notifications.
- Updated local thread-store name metadata updates to write sqlite title
metadata and the legacy thread-name index without appending rollout
events.
- Removed state extraction and rollout handling for the deleted
thread-name event.

## Validation

- `cargo test -p codex-app-server thread_name_updated_broadcasts`
- `cargo test -p codex-app-server
thread_name_set_is_reflected_in_read_list_and_resume`
- `cargo test -p codex-thread-store
update_thread_metadata_sets_name_on_active_rollout_and_indexes_name`
- `cargo test -p codex-state`
- `cargo check -p codex-mcp-server -p codex-rollout-trace`
- `just fix -p codex-app-server -p codex-thread-store -p codex-state -p
codex-mcp-server -p codex-rollout-trace`

## Docs

No external documentation update is expected for this internal ownership
change.
2026-05-05 17:16:06 -07:00
Michael Bolin
3ec18a2c0a release: publish standalone bwrap artifacts (#21256)
**Summary**
- Build Linux `bwrap` before the main release binaries.
- Export the release `bwrap` SHA-256 as `CODEX_BWRAP_SHA256` so the
Codex binary can verify the bundled fallback.
- Sign, stage, and upload `bwrap` alongside the primary Linux release
artifacts.

**Verification**
- YAML parse check for `.github/workflows/rust-release.yml`











---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21256).
* #21257
* __->__ #21256
2026-05-05 17:15:46 -07:00
Michael Bolin
26f355b67b linux-sandbox: use standalone bundled bwrap (#21255)
**Summary**
- Add `codex-bwrap`, a standalone `bwrap` binary built from the existing
vendored bubblewrap sources.
- Remove the linked vendored bwrap path from `codex-linux-sandbox`;
runtime now prefers system `bwrap` and falls back to bundled
`codex-resources/bwrap`.
- Add bundled SHA-256 verification with missing/all-zero digest as the
dev-mode skip value, then exec the verified file through
`/proc/self/fd`.
- Keep `launcher.rs` focused on choosing and dispatching the preferred
launcher. Bundled lookup, digest verification, and bundled exec now live
in `linux-sandbox/src/bundled_bwrap.rs`; Bazel runfiles lookup lives in
`linux-sandbox/src/bazel_bwrap.rs`; shared argv/fd exec helpers live in
`linux-sandbox/src/exec_util.rs`.
- Teach Bazel tests to surface the Bazel-built `//codex-rs/bwrap:bwrap`
through `CARGO_BIN_EXE_bwrap`; `codex-linux-sandbox` only honors that
fallback in debug Bazel runfiles environments so release/user runtime
lookup stays tied to `codex-resources/bwrap`.
- Allow `codex-exec-server` filesystem helpers to preserve just the
Bazel bwrap/runfiles variables they need in debug Bazel builds, since
those helpers intentionally rebuild a small environment before spawning
`codex-linux-sandbox`.
- Verify the Bazel bwrap target in Linux release CI with a build-only
check. Running `bwrap --version` is too strong for GitHub runners
because bubblewrap still attempts namespace setup there.

**Verification**
- Latest update: `cargo test -p codex-linux-sandbox`
- Latest update: `just fix -p codex-linux-sandbox`
- `cargo check --target x86_64-unknown-linux-gnu -p codex-linux-sandbox`
could not run locally because this macOS machine does not have
`x86_64-linux-gnu-gcc`; GitHub Linux Bazel CI is expected to cover the
Linux-only modules.
- Earlier in this PR: `cargo test -p codex-bwrap`
- Earlier in this PR: `cargo test -p codex-exec-server`
- Earlier in this PR: `cargo check --release -p codex-exec-server`
- Earlier in this PR: `just fix -p codex-linux-sandbox -p
codex-exec-server`
- Earlier in this PR: `bazel test --nobuild
//codex-rs/linux-sandbox:linux-sandbox-all-test
//codex-rs/core:core-all-test
//codex-rs/exec-server:exec-server-file_system-test
//codex-rs/app-server:app-server-all-test` (analysis completed; Bazel
then refuses to run tests under `--nobuild`)
- Earlier in this PR: `bazel build --nobuild //codex-rs/bwrap:bwrap`
- Prior to this update: `just bazel-lock-update`, `just
bazel-lock-check`, and YAML parse check for
`.github/workflows/bazel.yml`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21255).
* #21257
* #21256
* __->__ #21255
2026-05-05 17:14:29 -07:00
Channing Conger
03d3403a41 ci: trigger rusty-v8 releases from tags (#21259)
Swap to tag based releasing and allow tags of type `rusty-v8-v*.*.*`
2026-05-05 16:56:43 -07:00
Owen Lin
d7de4dd3ac chore(app-server-protocol): split v2 API definitions into modules (#21251)
## Why

`codex-rs/app-server-protocol/src/protocol/v2.rs` had grown into a
single ~12k-line definition file for the entire app-server v2 API.

This is purely a mechanical refactor to break up the monolithic `v2.rs`
file that contains all app-server API v2 types into more modular files,
grouped by resource (e.g. account, thread, turn, etc.).

`just write-app-server-schema` shows no real changes, so we can be sure
that this is purely an internal organizational change.

## What changed

- Replaced the monolithic `protocol/v2.rs` with a `protocol/v2/` module
tree and a small `mod.rs` that only declares and reexports modules.
- Grouped v2 API definitions by conceptual owner, including `account`,
`apps`, `collaboration_mode`, `command_exec`, `config`, `device_key`,
`experimental_feature`, `feedback`, `fs`, `hook`, `item`, `mcp`,
`model`, `notification`, `permissions`, `plugin`, `process`, `realtime`,
`review`, `thread`, `thread_data`, `turn`, and `windows_sandbox`.
- Moved v2 tests into `protocol/v2/tests.rs` so `mod.rs` stays small.
- Kept shared protocol helpers in `protocol/v2/shared.rs`, including the
enum mirroring macro and common cross-resource types.
- Co-located resource-specific notifications and server-request payloads
with the modules that own those resources.
- Regenerated app-server protocol schema fixtures. The schema diffs are
non-semantic newline-only changes after the refactor.

## Verification

- `cargo check -p codex-app-server-protocol`
- `cargo test -p codex-app-server-protocol`
- `just write-app-server-schema`
2026-05-05 16:46:51 -07:00
Michael Bolin
332b8b2c74 fix build (#21261)
I believe a merge race in https://github.com/openai/codex/pull/20689
broke the build, so this is a quick fix.

`cargo check --tests` passed locally.
2026-05-05 16:02:06 -07:00
Tom
ee02cf26d6 codex: use ThreadStore history for core review forks (#20577)
- fork loaded parent threads from `ThreadStore` history in core agent
control paths
- migrate guardian review fork history to loaded session history instead
of rereading rollout files

## Verification
- `cargo test -p codex-core spawn_agent_fork`
2026-05-05 15:25:19 -07:00
Michael Zeng
d0f9d5eba2 Add cloud executor registration to exec-server (#19575)
## Summary
This PR adds the first `codex-rs` milestone for remote-exec e2e: a local
`codex exec-server` can now register itself with
`codex-cloud-environments` and attach to the returned rendezvous
websocket.

At a high level, `codex exec-server --cloud ...` now:
- loads ChatGPT auth from normal Codex config
- registers an executor with `codex-cloud-environments`
- receives a signed rendezvous websocket URL
- serves the existing exec-server JSON-RPC protocol over that websocket

## What Changed
- Added `--cloud`, `--cloud-base-url`, `--cloud-environment-id`, and
`--cloud-name` to `codex exec-server`
- Added a new `exec-server/src/cloud.rs` module that handles:
  - registration requests
  - auth/header setup
  - bounded auth retry on `401/403`
  - reconnect/backoff after websocket disconnects
- Reused the existing `ConnectionProcessor` / `ExecServerHandler` path
so cloud mode serves the same exec/filesystem RPC surface as local
websocket mode
- Added cloud-specific error variants and minimal docs for the new mode

## Testing
Manual e2e test that fully goes through exec server flow with our codex
cloud agent as orchestrator
2026-05-05 22:01:48 +00:00
Rasmus Rygaard
7e310bc7f3 Inject state DB, agent graph store (#20689)
## Why

We want the agent graph store to be passed down the stack as a real
dependency, the same way we already treat the thread store.

This will let us inject the agent graph store as a real dependency and
support implementations other than the local SQLite-backed one. Right
now most code instantiates a state DB and an agent graph store
just-in-time. Ideally, we would not depend on the state DB directly but
only read through the higher-level interfaces.

This change makes the dependency boundaries explicit and moves state DB
initialization to process bootstrap instead of hiding it inside local
store implementations.

## What changed

- `ThreadManager` now requires a `StateDbHandle` and an
`AgentGraphStore` at construction time instead of treating them as
optional internals.
- The local store constructors no longer lazily initialize SQLite.
Callers now initialize the state DB once per process and use that shared
handle to build:
  - `LocalThreadStore`
  - `LocalAgentGraphStore`
- App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the
thread-manager sample) now initialize the state DB up front and inject
the resulting handle down the stack.
- `app-server` now consistently uses its process-scoped state DB handle
instead of reopening SQLite or trying to recover it from loaded threads.
- Device-key storage now reuses the shared state DB handle instead of
maintaining its own lazy opener.
- The thread archive / descendant traversal paths now use the injected
`AgentGraphStore` instead of reaching through local
thread-store-specific state.

## Verification

- `cargo check -p codex-core -p codex-thread-store -p codex-app-server
-p codex-mcp-server -p codex-thread-manager-sample --tests`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-core
thread_manager_accepts_separate_agent_graph_store_and_thread_store --
--nocapture`
- `cargo test -p codex-app-server
thread_archive_archives_spawned_descendants -- --nocapture`
2026-05-05 21:45:29 +00:00
Channing Conger
36460387ec Enable V8 sandboxing for source-built builds (#21146)
## Summary

This is the first PR in the V8 in-process sandboxing rollout.

It adds the build-system and Rust feature plumbing needed to support
sandboxed V8 builds, then enables sandboxing by default for the
source-built Bazel V8 path that we control directly. It deliberately
keeps the published `rusty_v8` artifact workflows on their current
non-sandboxed contract so this PR can land and ship independently before
we change any released artifacts.

## Rollout plan

- [x] **PR 1: land sandbox plumbing and default source-built Bazel V8 to
sandboxed mode**

- [ ] **PR 2: publish sandbox-enabled release artifacts and add
compatibility validation**
- Produce sandboxed artifact pairs for every released Cargo target that
does not already use the source-built Bazel path.
- Add CI coverage that consumes those sandboxed artifacts and verifies:
    - `codex-v8-poc` reports sandbox enabled
    - `codex-code-mode` builds/tests against the sandboxed path

- [ ] **PR 3: switch release consumers to sandboxed artifacts by
default**
  - Update released artifact selectors/checksums.
- Enable the Rust `v8_enable_sandbox` feature in the default release
path.
- Make the sandboxed artifact family the normal path for published
builds.

- [ ] **PR 4: remove rollout-only compatibility paths**
- Remove the temporary non-sandbox release compatibility config once the
new default has shipped and baked.
  - Keep the invariant tests permanently.
2026-05-05 14:36:37 -07:00
Felipe Coury
bb2257e3f5 [codex] fix TUI turn items view fixtures (#21243)
## Summary

Adds the required `items_view` field to the three session picker `Turn`
test fixtures that populate full turn item lists.

## Root Cause

`#21063` added `Turn.items_view` to the app-server protocol type. The
later session picker merge added three test-only
`codex_app_server_protocol::Turn` literals without the new field, which
broke Bazel compilation on `main` with `E0063: missing field
items_view`.

## Validation

- `just fmt`
- `cargo test -p codex-tui resume_picker --no-fail-fast`
- `just argument-comment-lint`

I also ran `cargo test -p codex-tui`; it compiled and ran the suite, but
this local machine failed two pre-existing status permission-profile
tests because `/etc/codex/requirements.toml` disallows
`DangerFullAccess`.
2026-05-05 14:24:28 -07:00
Eric Traut
8c88f9a304 Auto-deny MCP elicitations for Xcode 26.4 clients (#21113)
## Summary

Xcode 26.4 was built against app-server behavior from before MCP
elicitation requests became client-visible in CLI 0.120.0 via #17043.
That client line does not expect the new events/messages, so this PR
restores the old behavior for exactly that client/version combination.

The compatibility handling stays in the app-server layer: when the
initialized client is `Xcode` and its version starts with `26.4`, the
app server marks the live Codex thread so MCP elicitations are
auto-denied. The flag is applied on thread start/resume/fork/turn
attachment, carried through `Codex`/`CodexThread`, and stored on
`McpConnectionManager` so refreshed MCP managers preserve the behavior.

## Notes

This is intentionally narrow and includes a TODO to remove the
compatibility path once Xcode 26.4 ages out.
2026-05-05 14:05:42 -07:00
pakrym-oai
f593323ef1 [codex] Split tool handlers by tool name (#20687)
## Why

Tool registration used to bind a tool name to a handler externally,
which left ownership split between the registry plan and the handler
implementation. Some built-in handlers also multiplexed multiple in-core
tools by switching on the invoked tool name internally.

This moves the registry identity onto the handler itself and makes
built-in multi-tool areas use separate concrete handlers, so each
registered handler instance owns exactly one tool name and one dispatch
path.

## What Changed

- Added `ToolHandler::tool_name()` and changed
`ToolRegistryBuilder::register_handler` to derive the registry key from
the handler.
- Split built-in multiplexed handlers into concrete per-tool handlers
for unified exec, shell/local shell/container exec, MCP resources, goal
tools, and agent job tools.
- Kept name-carrying handler instances only where the runtime target is
inherently external or dynamic, such as MCP tools, dynamic tools, and
unavailable placeholders.
- Updated `ToolHandlerKind` and registry-plan construction so plan
entries map directly to concrete handler registrations.

## Verification

- `cargo test -p codex-tools tool_registry_plan`
- `cargo test -p codex-core --lib tools::registry_tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`
2026-05-05 13:46:45 -07:00
viyatb-oai
9cbef243b5 fix(linux-sandbox): isolate Linux sandbox synthetic mount registry per user for shared codex use case (#21234)
## Summary
- make the Linux sandbox synthetic mount registry path unique per
effective UID
- keep same-user coordination intact while avoiding collisions between
users sharing `/tmp`
- add a regression test for the registry path contract

## Why
Issue #21192 reports that the Linux sandbox currently uses one global
temp path at `/tmp/codex-bwrap-synthetic-mount-targets`. If another user
creates that directory first, later users can fail to open the shared
lock file with `Permission denied`.

## Validation
- `just fmt`
- `cargo test -p codex-linux-sandbox`
- `cargo clippy -p codex-linux-sandbox --all-targets`

Fixes #21192
2026-05-05 20:43:37 +00:00
viyatb-oai
8b95d5467e fix(linux-sandbox): avoid panic on bwrap build failures (#21127)
## Summary

- Propagate Linux bubblewrap argument-construction failures instead of
panicking in the helper
- Keep mutable-symlink carveouts fail-closed while reporting them as
ordinary sandbox build failures
- Add regression coverage for a protected `.codex` symlink inside a
writable workspace root

## Root cause

Linux bubblewrap intentionally rejects read-only carveouts that cross a
symlink the sandboxed process can still rewrite. That is the correct
security behavior for protected metadata paths such as `.codex`.

The bug was one layer higher: `linux_run_main` treated the expected
build failure as impossible and panicked while constructing the
bubblewrap argv. For issue #20716, that turned a normal fail-closed
sandbox outcome into a noisy panic in the transcript.

## User impact

Users with a project-local `.codex` symlink inside a writable workspace
still get the conservative sandbox decision, but they no longer see a
Rust panic for that condition. The helper now exits with the concise
sandbox-build error so the normal denial / escalation path can handle
it.


Fixes #20716
2026-05-05 13:34:08 -07:00
Felipe Coury
3b2ebb368e feat(tui): redesign session picker (#20065)
## Why

The resume/fork picker is becoming the main way users recover previous
work, but the old fixed table made sessions hard to scan once thread
names, branches, working directories, and timestamps all mattered. This
redesign makes the picker denser by default, easier to search, and safer
to inspect before resuming or forking.

<table>
<tr>
<td>
<img width="1660" height="1103" alt="CleanShot 2026-05-03 at 12 34 10"
src="https://github.com/user-attachments/assets/313ede1d-1da4-4863-acd2-56b3e27e9703"
/>
</td>
<td>
<img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 34 15"
src="https://github.com/user-attachments/assets/cfde7d5c-bab0-4994-a807-254e53f344ea"
/>
</td>
</tr>
<tr>
<td>
<img width="1664" height="1107" alt="CleanShot 2026-05-03 at 12 39 22"
src="https://github.com/user-attachments/assets/e1ee58ca-4dc5-4a35-ae0f-47562da3974c"
/>
</td>
<td>
<img width="1662" height="1100" alt="CleanShot 2026-05-03 at 12 35 09"
src="https://github.com/user-attachments/assets/9c888072-eedf-4f45-985c-0c14df28bcc7"
/>
</td>
</tr>
</table>

## What Changed

- Replaces the old session table with responsive session rows that
prioritize the session name or preview, then show timestamp, cwd, and
branch metadata.
- Makes dense view the default while keeping comfortable view available
through `Ctrl+O`.
- Persists the picker view preference in `[tui].session_picker_view`,
including active profile-scoped config.
- Adds sort/filter controls for updated time, created time, cwd, and all
sessions.
- Expands search matching across session name, preview, thread id,
branch, and cwd.
- Makes `Esc` safer in search mode: it clears an active query before
starting a new session.
- Adds lazy transcript inspection:
  - `Space` expands recent transcript context inline.
  - `Ctrl+T` opens a transcript overlay.
  - raw reasoning visibility follows `show_raw_agent_reasoning`.
- Keeps remote cwd filtering server-side for remote app-server sessions
so local path normalization does not incorrectly hide remote results.
- Updates snapshots and config schema for the new picker states and
config option.

## How to Test

1. Start Codex in a repo with several saved sessions.
2. Press `Ctrl+R` / resume picker entry point.
3. Confirm the picker opens in dense mode and shows session name or
preview, timestamp, cwd, and branch metadata.
4. Press `Ctrl+O` and confirm it switches between dense and comfortable
views.
5. Restart Codex and confirm the selected view persists.
6. Type a query that matches a branch, cwd, thread id, or session name;
confirm matching sessions appear.
7. Press `Esc` while the query is non-empty and confirm it clears search
instead of starting a new session.
8. Select a session and press `Space`; confirm recent transcript context
expands inline.
9. Press `Ctrl+T`; confirm the transcript overlay opens and respects
raw-reasoning visibility settings.

Targeted tests:
- `cargo test -p codex-tui resume_picker --no-fail-fast`
- `cargo test -p codex-core
runtime_config_resolves_session_picker_view_default_and_override`
- `cargo test -p codex-core profile_tui_rejects_unsupported_settings`
- `cargo check -p codex-thread-manager-sample`
- `cargo insta pending-snapshots`
2026-05-05 13:32:54 -07:00
Felipe Coury
52fbbe7cdd feat(tui): route /diff through workspace commands (#21001)
Stacked on #20892.

## Why

#20892 adds the TUI workspace command abstraction so branch status
metadata can run through app-server instead of assuming the CLI process
has the active workspace locally. `/diff` still used direct local
process execution, which means remote app-server sessions could compute
the diff against the wrong machine or fail to see the active workspace
at all.

This PR moves `/diff` onto that same app-server-backed command path so
Git runs wherever the active workspace lives.

## What Changed

- Route `/diff` through the TUI `WorkspaceCommandExecutor` using the
active chat cwd.
- Replace direct `tokio::process::Command` usage in `get_git_diff` with
argv-based workspace command requests.
- Preserve the existing `/diff` behavior: tracked diff output, untracked
file diffs, treating Git diff exit code `1` as success, and showing the
existing non-git-repository message.
- Extend `WorkspaceCommand` with caller-set timeouts and an explicit
uncapped-output opt-out. Metadata probes remain capped by default;
`/diff` opts out because its full output is the user-visible payload.

## How to Test

Manual reviewer path:

1. Start the Codex TUI from a Git worktree with one tracked file change
and one untracked file.
2. Run `/diff`.
3. Confirm the rendered diff includes both the tracked diff and the
untracked file diff.
4. Start the TUI outside a Git worktree, or switch to a non-git cwd,
then run `/diff`.
5. Confirm it shows the existing `/diff` not-inside-a-git-repository
message.

Targeted tests run:

- `cargo test -p codex-tui get_git_diff -- --nocapture`
- `cargo test -p codex-tui branch_summary -- --nocapture`
- `cargo test -p codex-tui`
2026-05-05 17:09:25 -03:00
rhan-oai
9e0c191c13 add turn items view to app-server turns (#21063)
## Why

`Turn.items` currently overloads an empty array to mean either that no
items exist or that the server intentionally did not load them for this
response. That ambiguity blocks future lazy-loading work where clients
need to distinguish unloaded, summary, and fully hydrated turn payloads.

## What changed

- add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
variants
- add required `itemsView` metadata to app-server `Turn` payloads
- mark reconstructed persisted history as `full` and live shell-style
turn payloads as `notLoaded`
- keep current `thread/turns/list` behavior unchanged and document that
it still returns `full` turns today
- regenerate the JSON and TypeScript protocol fixtures

## Verification

- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server thread_read_can_include_turns`
- `cargo test -p codex-app-server
thread_turns_list_can_page_backward_and_forward`
- `cargo test -p codex-app-server
thread_resume_rejects_history_when_thread_is_running`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fmt`
2026-05-05 19:17:16 +00:00
pakrym-oai
b6d4c4ea6b [codex] Use shared app-server JSON-RPC error helpers (#21221)
## Why

App-server had repeated hand-built JSON-RPC error objects for standard
error shapes. Using the shared helpers keeps the common
`invalid_request`, `invalid_params`, and `internal_error` construction
in one place and reduces the chance of new call sites drifting from the
common error payload shape.

## What changed

- Replaced manual standard JSON-RPC error object creation with
`internal_error(...)`, `invalid_request(...)`, and `invalid_params(...)`
across app-server request processors and runtime paths.
- Removed local duplicate helper definitions from search and review
request handling.
- Preserved existing structured `data` payloads by creating the shared
helper error first and then attaching the existing metadata.
- Left custom non-standard errors and raw error-code assertions intact.

## Validation

- `cargo test -p codex-app-server`
2026-05-05 12:13:59 -07:00
Abhinav
0452dca986 hook trust metadata and enforcement (#20321)
# Why

We want shared hook trust that both the app and the TUI can build on,
but the metadata is only useful if runtime behavior agrees with it. This
PR adds a single backend trust model for hooks so unmanaged hooks cannot
run until the current definition has been reviewed, while managed hooks
remain runnable and non-configurable.

# What

- persist `trusted_hash` alongside hook state in `config.toml`
- expose `currentHash` and derived `trustStatus` through `hooks/list`
- derive trust from normalized hook definitions so equivalent hooks from
`config.toml` and `hooks.json` share the same trust identity
- gate unmanaged hooks on trust before they enter the runnable handler
set

# Reviewer Notes

- key file to review is `codex-rs/hooks/src/engine/discovery.rs`
- the only **core** change is schema related
2026-05-05 19:13:55 +00:00
starr-openai
78421face0 Route process tools to selected environments (#20647)
## Why
When a turn exposes multiple selected environments, shell-style tools
need a model-facing way to identify the intended target environment and
handlers need to resolve that target before parsing cwd-relative
permission fields or launching processes.

This PR scopes that rollout to process tools. Filesystem-oriented tools
such as `apply_patch`, `view_image`, and `list_dir` are intentionally
left for follow-up slices.

## What Changed
- Adds an `include_environment_id` option to shell-style tool schema
builders.
- Exposes optional `environment_id` on `shell`, `shell_command`, and
`exec_command` only when `ToolEnvironmentMode::Multiple` is active.
- Adds a shared handler helper that parses `environment_id` and
`workdir` from JSON function-call arguments and returns the selected
`Environment` plus effective absolute cwd.
- Uses that helper in `shell`, `shell_command`, and `exec_command`
handling so process execution uses the selected environment filesystem
and cwd.
- Changes `ExecCommandRequest` to carry a required resolved `cwd`,
removing the process-manager fallback to the primary turn cwd for new
exec commands.
- Leaves `write_stdin` unchanged because it targets an existing process
id, not a new environment.

## Testing
- Added unit coverage for process-tool schema exposure, selected
environment resolution, primary fallback, no-environment handling,
unknown environment ids, and resolving cwd-relative permission paths
against the selected environment cwd.
- Added a remote-suite e2e coverage case for `exec_command` routing
across explicit zero environments, one local environment, and
local+remote environments.
- Ran `just fmt` and `git diff --check`.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-05 12:12:03 -07:00
rhan-oai
fb7e1eb6fc [codex-analytics] add tool item event schemas (#17089)
## Why

Tool analytics need stable, typed payloads before the later lifecycle
reducer starts emitting them. Keeping the event schema definitions
isolated in their own PR makes the emitted surface reviewable separately
from the reducer logic that produces those events.

## What changed

- Adds the common tool-item analytics event base plus event payload
types for command execution, file changes, MCP calls, dynamic tools,
collaboration tools, web search, and image generation.
- Extends `TrackEventRequest` with the corresponding tool-item variants.
- Adds serialization coverage for the command-execution event shape.

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17089).
* #18748
* #18747
* #17090
* __->__ #17089
* #20514
2026-05-05 11:49:30 -07:00
Owen Lin
6075b77001 app-server: ignore persist_extended_history param (#21225)
## Why

Taking a step to removing the `persistExtendedHistory` field. It's not
scalable to be persisting so much data in the rollout file and returning
it in the thread history.

When a client explicitly sends `true`, the server now tells that client
the parameter is deprecated and ignored so the caller has a clear
migration signal via the `deprecationNotice` notification.

## What changed

- Keep the `persist_extended_history` / `persistExtendedHistory` field
in the v2 protocol for compatibility, but document it as deprecated and
ignored.
- Ignore the parameter in app-server `thread/start`, `thread/resume`,
and `thread/fork`; those paths always use limited history persistence
now.
- Stop treating `persistExtendedHistory` as a running-thread resume
override mismatch.
- Emit a connection-scoped `deprecationNotice` when a request explicitly
sets `persist_extended_history: true`.

## Verification

- Added `thread_start_deprecates_persist_extended_history_true` to cover
the deprecation notice.
- `cargo test -p codex-app-server`
- `cargo test -p codex-app-server-protocol`
2026-05-05 18:36:13 +00:00
Felipe Coury
5e0a4adbe5 feat(tui): add raw scrollback mode (#20819)
## Why

Granular copy is particularly difficult with the current output. Part of
it was solved with the introduction of the `/copy` command but when you
only need to copy parts of a response, you still encounter some issues:

- When you copy a paragraph, the result is a sequence of separate lines
instead of one correctly joined paragraph.
- When a word wraps, part of it stays on the original line and the rest
appears at the start of the next line.
- When you copy a long command, extra line breaks are often inserted,
and command arguments can be split across multiple lines.


https://github.com/user-attachments/assets/0ef85c84-9363-4aad-b43a-15fce062a443

## Solution

Now that we own the scrollback and we re-create it when we resize, we
have the opportunity of toggling between the raw text and the rich text
we see today.

- Add TUI raw scrollback mode with `tui.raw_output_mode`, `/raw
[on|off]`, and the configurable `tui.keymap.global.toggle_raw_output`
action.
- Render transcript cells through rich/raw-aware paths so raw mode
preserves source text and lets the terminal soft-wrap selection-friendly
output.
- Bind raw-mode toggle to `alt-r` by default, with the keybinding path
toggling silently while `/raw` continues to emit confirmation messages.

## Related Issues

Likely addressed by raw mode:

- #12200: clean copy for multiline and soft-wrapped output. Raw mode
removes Codex-inserted wrapping/indentation and lets the terminal
soft-wrap logical lines.
- #9252: command suggestions gain unwanted leading spaces when copied.
Raw mode renders transcript text without the rich-mode left
padding/gutter.
- #8258: prompt output is hard to copy because of leading indentation.
Raw mode renders user/source-backed transcript text without that
decorative indentation.

Partially or conditionally addressed:

- #2880: copy/export message as Markdown. Raw mode exposes raw Markdown
for terminal selection, but this PR does not add a dedicated
export/copy-message command.
- #19820: mouse drag selection + copy in the TUI. Raw mode improves
terminal-native selection of output/history text, but this PR does not
implement in-TUI mouse selection, highlighting, auto-copy, or composer
selection.
- #18979: copied content is divided into two parts. This should improve
cases caused by Codex-inserted wraps/padding in rendered output; if the
report is about pasting into the composer/input path, that remains
outside this PR.

## Validation

- `just write-config-schema`
- `just fmt`
- `cargo test -p codex-config`
- `cargo test -p codex-tui`
- `just fix -p codex-tui`
- `just argument-comment-lint`
- `cargo test -p codex-tui
raw_output_mode_can_change_without_inserting_notice -- --nocapture`
- `cargo test -p codex-tui
raw_slash_command_toggles_and_accepts_on_off_args -- --nocapture`
- `cargo test -p codex-tui raw_output_toggle -- --nocapture`
- `git diff --check`
- `cargo insta pending-snapshots`
2026-05-05 11:17:47 -07:00
viyatb-oai
172303bbfa chore: add minimal proxy egress diagnostics (#21220)
## Why
Recent Auto Review reports show Git traffic hanging through the local
proxy on both SSH and HTTPS paths. Today the support bundle does not
make it obvious whether a request is stuck before upstream dialing,
during the proxy hop, or after the upstream response begins, which slows
down root-cause triage.

This adds a small amount of runtime visibility at the existing proxy
boundaries without changing routing or policy behavior.

## What changed
- log whether HTTP and CONNECT traffic take the direct or upstream-proxy
route
- log start / success / failure timings for CONNECT, HTTP, and SOCKS5
upstream dials
- log CONNECT forwarding lifecycle events
- describe HTTP success at the response-header boundary that is actually
observed, rather than implying the full body finished

## Verification
- `cargo test -p codex-network-proxy`
- `cargo clippy -p codex-network-proxy --all-targets -- -D warnings`
2026-05-05 17:50:59 +00:00
viyatb-oai
ed6082c9f9 fix(sandboxing): Bound advisory system bwrap startup probe (#20111)
## Why

Linux startup runs an advisory system `bwrap` warning probe on each
launch. On hosts with NFS or autofs mounts, its `--ro-bind / /` probe
can take tens of seconds before Codex prints anything, matching #19828.
Because this probe only decides whether to surface a warning, it should
not be allowed to stall startup.

Relevant pre-change path:
[`codex-rs/sandboxing/src/bwrap.rs`](de2ccf9473/codex-rs/sandboxing/src/bwrap.rs (L64-L80))

## What changed

- Bound the advisory system `bwrap` probe to 500 ms.
- Preserve the existing warning behavior when `bwrap` promptly reports a
known user-namespace failure.
- Kill and reap the probe child on timeout, then suppress the advisory
warning instead of blocking startup.
- Read probe stderr with a bounded nonblocking drain so descendants that
inherit the pipe cannot extend startup after the probe child exits.
- Add regression coverage for both a deliberately slow fake `bwrap`
process and a fake probe whose descendant keeps stderr open.

## Security

This only bounds the advisory startup probe. It does not change the
command execution path or add a fail-open sandbox fallback. The related
command-side hang in #20017 remains separate from this PR.

## Verification

- Added `system_bwrap_probe_times_out_without_reporting_a_warning`.
- Added
`system_bwrap_probe_does_not_wait_for_descendants_holding_stderr_open`.
- `cargo test -p codex-sandboxing`
- `cargo clippy -p codex-sandboxing --all-targets -- -D warnings`

Fixes #19828
Related: #20017
2026-05-05 10:45:35 -07:00
Felipe Coury
a3a09dfc9b fix(tui): external editor expansion for same-size large pastes (#21190)
## Why

We found this while reviewing #21091, but confirmed it is not introduced
by that PR: the order-sensitive `current_text_with_pending()`
replacement loop already existed, and `main` already allowed active
same-size large pastes to use prefix-overlapping labels such as `[Pasted
Content N chars]` and `[Pasted Content N chars] #2`.

#21091 fixes placeholder numbering after a draft is cleared, so a fresh
same-size paste can reuse the base label. This PR fixes a different
path: when a draft already contains multiple active same-size large
pastes, the placeholders can overlap by prefix, for example `[Pasted
Content N chars]` and `[Pasted Content N chars] #2`.

That overlap breaks `current_text_with_pending()` when the composer
materializes the draft text for the external editor. Replacing the base
placeholder first can partially rewrite the `#2` placeholder, leaving
the external editor seeded with corrupted text instead of both paste
payloads.

| Before | After |
|---|---|
| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 09"
src="https://github.com/user-attachments/assets/88a2936c-cf00-4adc-8567-8fd8f398b4a8"
/> | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 20
31"
src="https://github.com/user-attachments/assets/119cff52-43c8-432a-9367-418d82f4ed82"
/> |
| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 57"
src="https://github.com/user-attachments/assets/026031bb-839b-4252-a0fd-9ba9616435fe"
/> | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 21
31"
src="https://github.com/user-attachments/assets/8cb6f2c8-3a5d-411b-8623-dca666ee3c08"
/> |

## What Changed

- Changed `current_text_with_pending()` to expand pending pastes through
the existing element-range based `expand_pending_pastes()` helper
instead of global string replacement.
- Added a regression test with two different same-length large pastes to
ensure both overlapping placeholders expand to their original payloads.

## How to Test

1. Start Codex TUI.
2. Paste a large string, for example 1004 `A` characters.
```shell
perl -e 'print "A" x 1004' | pbcopy
```
3. Paste a second large string with the same length, for example 1004
`B` characters.
```shell
perl -e 'print "B" x 1004' | pbcopy
```
4. Open the external editor from the composer.
5. Confirm the editor is seeded with the full `A...` payload followed by
the full `B...` payload, with no literal `#2` left behind.

Targeted tests:
- `cargo test -p codex-tui
current_text_with_pending_expands_overlapping_placeholders`
- `just argument-comment-lint-from-source -p codex-tui`

I also ran `cargo test -p codex-tui`; it reached the full crate suite
but failed two unrelated local status tests because this machine's
`/etc/codex/requirements.toml` rejects `DangerFullAccess`.
2026-05-05 14:41:43 -03:00
Abhinav
13be504063 revert legacy notify deprecation (#21152)
# Why

Revert #20524 for now because the computer use plugin has not migrated
off legacy `notify` yet. Keeping the deprecation in place today would
show users a warning before the plugin path is ready to move, so this
rolls the change back until that migration is complete.

# What

- revert the legacy `notify` deprecation change from #20524
- restore the prior `notify` behavior and remove the temporary
deprecation metrics/docs from that change

Once the computer use plugin has migrated, we can land the same
deprecation again.
2026-05-05 10:34:44 -07:00
canvrno-oai
394242e95b [codex] Fix fork --last cwd filtering (#21089)
Fixes #20945.

This keeps `codex fork --last` aligned with the neighboring
latest-session lookup flows. The local fork path now uses the same
cwd-scope helper as `resume --last`, which is also a small code cleanup
around how this selection logic is shared.

Credit to @chanwooyang1 for the report and for pointing out the narrow
fix direction.

What changed:
- Route `fork --last` through the shared latest-session cwd filter.
- Preserve `--all` as the explicit opt-in for global latest-session
selection.
- Keep remote cwd override behavior unchanged.
- Add focused coverage for local default, `--all`, and remote override
filter semantics.

Validation:
- Ran `just fmt`.
- Ran `git diff --check`.
- Reviewed the `fork --last`, `resume --last`, and fork picker selection
paths against the issue report.
2026-05-05 10:33:40 -07:00
canvrno-oai
1feaa7d85b [codex] Fix TUI large paste placeholder numbering after Ctrl+C (#21091)
Fixes #19940.

Large-paste placeholder numbering was backed by a per-size counter, so
clearing a draft with `Ctrl+C` left numbering state behind even though
the active pending paste state was gone. This updates the composer to
derive the next placeholder suffix from active pending pastes instead,
which keeps simultaneous same-size pastes distinct while letting fresh
drafts reuse the base label. This is also a small code cleanup: pending
paste state is now the source of truth instead of maintaining a separate
counter.

Credit to @Sungyoun-Kim for the issue report, root-cause notes, and fork
with the proposed fix, and to @charley-oai for the earlier related
#10032 proposal.

Changes:
- Remove the monotonic large-paste counter from the composer.
- Compute suffixes from currently active pending paste placeholders.
- Document large-paste placeholder behavior in the composer module docs.
- Add regression coverage for `Ctrl+C` clearing and deletion/reset
behavior.

Testing:
- `just fmt`
- `git diff --check`
2026-05-05 10:33:37 -07:00
Abhinav
af86be529c Support PreToolUse additionalContext (#20692)
# Why

`PreToolUse` already exposes `hookSpecificOutput.additionalContext` in
the generated hook schema, but the runtime still rejected it as
unsupported. That leaves `PreToolUse` out of step with the other
context-injecting hooks and prevents hook authors from attaching
model-visible guidance to a pending tool call before it runs.

# What

- Parse `PreToolUse.additionalContext` and carry it through the hook
event pipeline.
- Record `PreToolUse` context at the hook boundary so successful context
is preserved for both allowed and blocked calls without widening the
tool registry surface.
- Preserve existing deny behavior when context is combined with either
`permissionDecision: "deny"` or the legacy `decision: "block"` shape.
2026-05-05 10:29:30 -07:00
iceweasel-oai
f35285dc78 Add Windows sandbox readiness RPC (#20708)
## Why

The desktop app on Windows needs a read-only way to tell, before the
next tool call, whether the local Windows sandbox setup is in a state
that should block the user and ask for setup again.

The main case we want to cover is the elevated sandbox setup version
bump. Today, if the app is configured for elevated Windows sandboxing
and the installed setup is stale, the next sandboxed shell/exec path can
end up triggering the elevated setup flow directly. That means the user
can see an unexpected UAC prompt with no UI explanation.

This change adds a small app-server preflight so the desktop app can ask
“is Windows sandbox ready, not configured, or update-required?” during
startup and show the appropriate blocking UI before the user hits a tool
call.

## What changed

- Added a new read-only app-server RPC: `windowsSandbox/readiness`
- Added a new protocol enum and response type:
  - `WindowsSandboxReadiness`
  - `WindowsSandboxReadinessResponse`
- Added core readiness logic in `core/src/windows_sandbox.rs`:
  - `ready`
  - `notConfigured`
  - `updateRequired`
- Wired the new request through `codex_message_processor`
- Regenerated the vendored app-server schema fixtures

## Readiness semantics

This is intentionally a coarse startup/version-bump readiness check, not
a full predictor of every runtime repair case.

For now, readiness is determined from:
- the configured Windows sandbox level
- `sandbox_setup_is_complete()` for elevated mode

That means:
- `disabled` maps to `notConfigured`
- `restricted token` maps to `ready`
- `elevated` maps to `ready` or `updateRequired` depending on
`sandbox_setup_is_complete()`

This is deliberate for the first UI integration because the common case
we want to catch is “the app updated, the elevated setup version bumped,
and the user should see an update-required blocker instead of a surprise
UAC prompt”.

It does not attempt to model every case where the deeper runtime path
might decide to repair or re-run setup.

## Testing

- Ran `cargo fmt --all -- app-server-protocol/src/protocol/common.rs
app-server-protocol/src/protocol/v2.rs
app-server/src/codex_message_processor.rs core/src/windows_sandbox.rs
core/src/windows_sandbox_tests.rs`
- Added unit tests for the pure readiness mapping in
`core/src/windows_sandbox_tests.rs`
- Regenerated vendored schema fixtures with `cargo run -p
codex-app-server-protocol --bin write_schema_fixtures -- --schema-root
app-server-protocol/schema`
- Did not run the full cargo test suite
2026-05-05 09:58:23 -07:00
Eric Traut
f09e1936e0 Validate /goal objective length in TUI (#20746)
## Why

Long `/goal` definitions currently reach lower-level goal validation and
can produce an opaque failure. This bug was reported by a user. Pasted
instruction blocks are especially confusing because the composer can
still contain a paste placeholder before expansion, which may otherwise
fall into the generic prompt-size error path.

There was also a related paste edge case where `/goal ` followed by a
multiline block whose first pasted line was blank looked like a bare
`/goal` command. That showed the goal usage/summary instead of setting
the pasted objective.

## What Changed

This adds TUI-side preflight validation for `/goal <objective>` using
the shared `MAX_THREAD_GOAL_OBJECTIVE_CHARS` limit. Oversized typed,
queued, and pasted goal objectives now fail locally with a goal-specific
message that recommends putting longer instructions in a file and
referencing that file from the goal.

The TUI now also lets inline-argument slash commands consume later-line
arguments before treating the first line as a bare command, so `/goal `
followed by blank lines and then objective text sets the goal instead of
opening the bare `/goal` flow.

## Manual Testing

1. Start the TUI with goals enabled and an active session.
2. Submit `/goal ` followed by exactly 4,000 objective characters. It
should continue through the normal goal-setting path.
3. Submit `/goal ` followed by 4,001 objective characters. It should not
set a goal, and should show `Goal objective is too long: 4,001
characters. Limit: 4,000 characters.` followed by the guidance to put
longer instructions in a file and reference that file from the goal.
4. Type `/goal `, paste a large block that becomes a `[Pasted Content
... chars]` placeholder, then submit. It should validate the expanded
pasted text and show the goal-specific file guidance rather than the
generic prompt-size error.
5. Type `/goal `, paste a multiline block whose first line is blank,
then submit. It should set the objective from the non-blank pasted
content instead of showing `Usage: /goal <objective>` or the bare goal
summary.
6. While a turn is running, queue an oversized `/goal` command. When the
queue drains, it should show the same goal-specific error and should not
emit a goal-setting request.
2026-05-05 09:55:07 -07:00
Eric Traut
91b7350187 Add goal lifecycle metrics (#20799)
## Why

Adding goal metrics makes it possible to track how often goals are
created, completed, and stopped by budget limits, plus the final token
and wall-clock usage for terminal outcomes.

## What Changed

- Added OpenTelemetry metric constants for goal lifecycle tracking:
- `codex.goal.created`: increments each time a new persisted goal is
created or an existing goal is replaced with a new objective.
- `codex.goal.completed`: increments when a goal transitions to
`complete`.
- `codex.goal.budget_limited`: increments when a goal transitions to
`budget_limited` because its token budget has been reached.
- `codex.goal.token_count`: records the final persisted token count when
a goal transitions to `complete` or `budget_limited`.
- `codex.goal.duration_s`: records the final persisted elapsed
wall-clock time, in seconds, when a goal transitions to `complete` or
`budget_limited`.
- Emitted creation metrics when a goal is created or replaced.
- Emitted terminal outcome counters and final usage histograms when a
goal transitions to `complete` or `budget_limited`, avoiding
double-counting later in-flight accounting for already budget-limited
goals.
- Added focused `codex-core` tests for create/complete metrics and
one-time budget-limit metrics.
2026-05-05 09:21:54 -07:00
Felipe Coury
69283aa1c0 fix(tui): make /copy work inside tmux without passthrough (#20207)
## Summary
- prefer tmux's native clipboard integration for `/copy` when running
inside tmux
- fall back to OSC 52 when tmux clipboard copy is unavailable
- add coverage for tmux-preferred, fallback, and combined-failure paths

## Why
Inside tmux, `/copy` previously relied on DCS-wrapped OSC 52 when `TMUX`
was set. That only reaches the outer terminal when tmux passthrough is
enabled, so Codex could report success even though the system clipboard
never changed.

## User impact
`/copy` now works inside tmux even when `allow-passthrough` is off, as
long as tmux clipboard integration is available. If tmux cannot handle
the copy, Codex still keeps the existing OSC 52 fallback path.

## Validation
- `cargo test -p codex-tui`
- `just fmt`
- `just fix -p codex-tui`
- `just argument-comment-lint`
- manually verified `/copy` inside tmux with `allow-passthrough off`

Fixes #19926
2026-05-05 16:18:02 +00:00
jif-oai
be12a80ad1 feat: add normalized matching to memory search (#21205)
## Why

Memory search currently treats separators literally, so callers need to
know whether a stored term uses spaces, hyphens, or no separators at
all. That makes recall brittle for terms such as `MultiAgentV2` vs.
`multi agent v2` and `cold-resume` vs. `cold resume`.

## What changed

- Add an opt-in `normalized` mode to memory search that removes
non-alphanumeric separators after any requested case folding.
- Thread the new flag through the MCP `search` tool into the local
backend while keeping existing literal matching as the default.
- Reject queries that normalize to an empty string, and add regression
coverage for both normalized matching and that validation path.

## Testing

- `cargo test -p codex-memories-mcp`
2026-05-05 17:33:07 +02:00
jif-oai
f75c600872 feat: support windowed multi-query memory search (#21204)
## Why

Memory search currently supports either independent substring matches or
requiring every query to appear on the same line. That is too
restrictive for memory files where related terms often land on nearby
lines in the same note or bullet block.

## What changed

- Replace the old `all` match mode with explicit tagged modes:
`all_on_same_line` and `all_within_lines { line_count }`.
- Add windowed matching in `codex-rs/memories/mcp/src/local.rs` so
callers can require every query to appear within a bounded line range
while returning only the minimal qualifying windows.
- Reject invalid zero-width windows and update the MCP tool description
plus argument parsing to expose the new mode.
- Add coverage for same-line matching, windowed matching, and invalid
`line_count` input.

## Verification

- Added targeted coverage in `codex-rs/memories/mcp/src/local_tests.rs`
for `search_supports_all_within_lines_match_mode` and
`search_rejects_zero_line_window`.
- Added server-side parsing coverage in
`codex-rs/memories/mcp/src/server.rs` for
`search_args_accept_windowed_all_match_mode`.
2026-05-05 17:15:21 +02:00
jif-oai
de924af134 memories-mcp: hide dot paths from list, read, and search (#21201)
## Why

The local memories root can contain implementation details such as
`.git` plus incidental OS metadata like `.DS_Store`. Those entries are
not authored memory content, so the memories MCP should keep them
invisible instead of exposing them through normal discovery or direct
lookup.

Only for local implementation ofc

## What changed

- Return `NotFound` for scoped `list`, `read`, and `search` requests
that include a hidden path component.
- Skip hidden files and directories while listing a directory or
recursively searching the memories tree.
- Add regression coverage for hidden files, hidden directories, and
hidden scoped requests across `list`, `read`, and `search`.

## Testing

- Added focused regression tests in `memories/mcp/src/local_tests.rs`
covering hidden-path behavior across the affected APIs.
2026-05-05 16:59:05 +02:00
jif-oai
70807730f5 tools: remove unused experimental list_dir tool (#21170)
## Why
`list_dir` still carries a full spec/handler/test path, but nothing in
the current model catalog advertises it via
`experimental_supported_tools`. That leaves us maintaining an
environment-backed tool surface that is effectively unused.

## What changed
- delete the `list_dir` handler and its tests from `codex-core`
- remove the `list_dir` spec builder, handler kind, and registry wiring
from `codex-tools`
- clean up the remaining internal README and registry tests so they no
longer mention the removed tool
2026-05-05 13:11:07 +02:00
Ahmed Ibrahim
9d579813bb 1- Add model service tiers metadata (#20969)
## Why

The model list needs to carry display-ready service tier metadata so
clients can render tier choices with stable IDs, names, and
descriptions. A raw speed-tier string list is not enough for richer UI
copy or future tier labels.

## What changed

- Added `ModelServiceTier` to shared model metadata with string `id`,
`name`, and `description` fields.
- Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
empty defaults for older cached model payloads.
- Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
it through TUI app-server model conversion.
- Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
metadata as deprecated in source and generated schema output.
- Regenerated app-server protocol JSON schema and TypeScript fixtures,
including `ModelServiceTier.ts`.

## Verification

- Ran `just write-app-server-schema`.
- Did not run local tests per repo instruction; relying on PR CI.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-05 09:51:18 +03:00
Abhinav
dca105cf99 Spill large hook outputs from context (#21069)
## Why

Large hook outputs can enter model-visible context through hook-specific
paths such as `additionalContext` and `Stop` continuation prompts.
Without a dedicated cap, one hook can inject a large blob directly into
conversation history instead of leaving a bounded preview for the model
and preserving the full text elsewhere.

## What

- spill hook text once it exceeds a fixed `2_500`-token budget,
preserving the full output on disk and leaving a head/tail preview plus
saved path in context
- add shared hook-output spilling under
`CODEX_HOME/hook_outputs/<thread_id>/<uuid>.txt`
- apply the cap to both `additionalContext`, `feedback_message`, and
`Stop` continuation fragments
2026-05-05 05:03:18 +00:00
Tom
33d24b0df5 codex: migrate (more) app-server thread history reads to ThreadStore (#20575)
Migrate token usage replay, rollback responses, and detached review
setup (a special case of forking) to be served from ThreadStore reads
rather direct rollout files.

- replay restored token usage from already-loaded `RolloutItem` history
instead of reopening `Thread.path`
- rebuild rollback responses from loaded `ThreadStore` snapshots and
history
- start detached reviews from store-backed parent history and stored
review-thread metadata
- remove obsolete app-server rollout-summary helper code that became
dead after the store-backed migration
- preserve response/notification ordering for resume, fork, rollback,
and detached review flows
- add integration test coverage for the affected paths
2026-05-04 21:16:50 -07:00
edwardysun3
7e71d02610 Add turn_id to Codex skill invocation analytics (#21122) 2026-05-05 00:11:06 -04:00
alexsong-oai
3ad7cf0993 Add plugin ID to skill analytics (#20923)
## Summary
- thread plugin skill roots through the skills loader with their plugin
ID
- store plugin ID on loaded skill metadata for plugin-provided skills
- include plugin ID on skill invocation analytics events

## Test plan
- cargo check -p codex-core-skills
- cargo check -p codex-core -p codex-core-plugins -p codex-analytics
- cargo check -p codex-tui
- cargo check -p codex-plugin -p codex-core -p codex-core-plugins -p
codex-analytics
- cargo check -p codex-app-server
- cargo test -p codex-analytics
- HOME=/private/tmp/codex-empty-home cargo test -p codex-core-skills
- just fix -p codex-core-skills
- just fix -p codex-analytics
- just fix -p codex-core-plugins
- just fix -p codex-core
- just fmt
- git diff --check
2026-05-04 20:36:29 -07:00
Tom
707e51bd8b codex: route metadata updates through ThreadStore (#20576)
- Route `thread/metadata/update` through
`ThreadStore::update_thread_metadata`.
- Add `LocalThreadStore` git metadata patch support for set, partial
update, and clear semantics.
- Add some unit tests for the new thread store code
- Remove a lot of dead code/tests!
2026-05-04 20:09:41 -07:00
Shijie Rao
0d418f478d Rename agent identity login surface to access token (#21059)
## Why
The external startup/login surface for this auth path should talk about
an access token instead of exposing the internal Agent Identity
terminology. Users should pass `CODEX_ACCESS_TOKEN` or pipe a token into
`codex login --with-access-token`; the old external env/flag spellings
are removed so there is only one supported user-facing path.

## What Changed
- Added `CODEX_ACCESS_TOKEN` as the supported environment variable for
this auth path.
- Added `codex login --with-access-token` as the supported stdin-based
login command.
- Removed the legacy `CODEX_AGENT_IDENTITY` env-var fallback and hidden
`--with-agent-identity` CLI alias.
- Updated CLI error, status, and stdin prompts to use access-token
language.
- Added coverage for access-token env loading, CLI login failure
behavior, and renamed login status text.

## Validation
- `cargo test -p codex-login`
- `cargo test -p codex-cli`
- `just fix -p codex-login`
- `just fix -p codex-cli`
2026-05-04 19:43:48 -07:00
evawong-oai
d85783901c [network-proxy] Cover DNS timeout blocking (#21105)
## Summary
- Add a testable DNS lookup helper for the local or private host
precheck while preserving production `lookup_host` behavior.
- Add deterministic coverage for DNS timeout, lookup error, private
resolution, and public resolution decisions.
- Keep BUGB 15982 guarded without relying on ambient DNS timing or
resolver behavior.

## Why
BUGB 15982 was fixed by failing closed on DNS lookup errors and
timeouts. The existing regression covered lookup failure through real
DNS, but did not deterministically exercise the timeout branch. This PR
adds a small injection point so CI can cover that branch without
standing up slow authoritative DNS.

## Validation
- `cargo test -p codex-network-proxy host_resolves_to_non_public_ip --
--nocapture`
- `cargo test -p codex-network-proxy
host_blocked_rejects_allowlisted_hostname_when_dns_lookup_fails --
--nocapture`
- `cargo test -p codex-network-proxy`
- `just fmt`
- `just fix -p codex-network-proxy`
- `git diff --check`

## Tickets
- BUGB 15982
-
https://linear.app/openai/issue/BUGB-15982/codex-dns-timeout-fail-open-in-codex-network-proxy-bypasses
- Bugcrowd:
https://tracker.bugcrowd.com/openai/submissions/b2bf131d-db04-478f-85aa-cdd17ca8f604
2026-05-04 19:03:56 -07:00
Ruslan Nigmatullin
4950e7d8a6 [codex] Add unsandboxed process exec API (#19040)
## Why

App-server clients sometimes need argv-based local process execution
while sandbox policy is controlled outside Codex. Those environments can
reject sandbox-disabling paths before a command ever starts, even when
the caller intentionally wants unsandboxed execution.

This PR adds a distinct `process/*` API for that use case instead of
extending `command/exec` with another sandbox-disabling shape. Keeping
the new surface separate also makes the future removal of `command/exec`
simpler: clients that need explicit process lifecycle control can move
to the newer handle-based API without depending on `command/exec`
business logic.

## What changed

- Added v2 process lifecycle methods: `process/spawn`,
`process/writeStdin`, `process/resizePty`, and `process/kill`.
- Added process notifications: `process/outputDelta` for streamed
stdout/stderr chunks and `process/exited` for final exit status and
buffered output.
- Made `process/spawn` intentionally unsandboxed and omitted
sandbox-selection fields such as `sandboxPolicy` and
`permissionProfile`.
- Added client-supplied, connection-scoped `processHandle` values for
follow-up control requests and notification routing.
- Supported cwd, environment overrides, PTY mode and size, stdin
streaming, stdout/stderr streaming, per-stream output caps, and timeout
controls.
- Killed active process sessions when the originating app-server
connection closes.
- Wired the implementation through the modular `request_processors/`
app-server layout, with process-handle request serialization for
follow-up control calls.
- Updated generated JSON/TypeScript schema fixtures and documented the
new API in `codex-rs/app-server/README.md`.
- Added v2 app-server integration coverage in
`codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn
acknowledgement before exit, buffered output caps, and process
termination.

## Verification

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`

---------

Co-authored-by: Owen Lin <owen@openai.com>
2026-05-04 16:43:58 -07:00
xli-oai
a8db4af5c3 Remove remote plugin uninstall prefix gate (#20722)
## Summary

Remove the hardcoded remote plugin ID prefix allow-list from app-server
uninstall routing. IDs that do not parse as local `plugin@marketplace`
IDs now flow through the remote uninstall path, where the existing
remote ID safety validation still rejects empty IDs, spaces, slashes,
and other unsafe characters before URL/cache use.

## Why

Plugin-service owns the backend remote plugin ID contract. Codex should
not require remote IDs to start with the local hardcoded prefixes
`plugins~`, `plugins_`, `app_`, `asdk_app_`, or `connector_`, because
newer backend ID families could otherwise be rejected before
plugin-service sees the request.

## Validation

- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `just fix -p codex-app-server`
- `git diff --check`
2026-05-04 16:28:13 -07:00
rhan-oai
aee1fe2659 [codex-analytics] add item lifecycle timing (#20514)
## Why

Tool families already disagree on what their existing `duration` fields
mean, so lifecycle latency should live on the shared item envelope
instead of being inferred from per-tool execution fields. Carrying that
envelope through app-server notifications gives downstream consumers one
reusable timing signal without pretending every tool has the same
execution semantics.

## What changed

- Adds `started_at_ms` to core `ItemStartedEvent` values and
`completed_at_ms` to core `ItemCompletedEvent` values.
- Populates those timestamps in the shared session lifecycle emitters,
so protocol-native items get timing without each producer tracking its
own clock state.
- Exposes `startedAtMs` on app-server `item/started` notifications and
`completedAtMs` on `item/completed` notifications.
- Maps the lifecycle timestamps through the app-server boundary while
leaving legacy-converted notifications nullable when no lifecycle
timestamp exists.
- Regenerates the app-server JSON schema and TypeScript fixtures for the
notification-envelope change and updates downstream fixtures that
construct those notifications directly.
- Extends the existing web-search and image-generation integration flows
to assert the new lifecycle timestamps on the native item events.

## Verification

- `cargo check -p codex-protocol -p codex-core -p
codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
-p codex-app-server-client`
- `cargo test -p codex-core --test all web_search_item_is_emitted`
- `cargo test -p codex-core --test all
image_generation_call_event_is_emitted`
- `cargo test -p codex-app-server-protocol`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
* #18748
* #18747
* #17090
* #17089
* __->__ #20514
2026-05-04 22:33:20 +00:00
kmeelu-oai
e7e6267ab3 Make realtime sideband startup async (#20715)
## Summary

Moves the WebRTC realtime sideband websocket join out of the voice start
critical path. Call creation still posts the SDP offer and session
config synchronously so the client gets the SDP answer, but the sideband
websocket now connects in the input task async and doesn't block
conversation state installation.

This lets the normal realtime input channels buffer text, handoff
output, and audio while the WebRTC sideband websocket is connecting. If
the sideband join fails while the conversation is still active, the task
sends a RealtimeEvent::Error through the existing events_tx / fanout
path.

To rephrase this:
* No longer blocked on sideband: the client can receive the SDP answer
earlier, set up the WebRTC peer connection, and let the media leg
progress while the sideband websocket joins.
* Still blocked on sideband: queued text, handoff output, and sideband
server events cannot flow until connect_webrtc_sideband(...).await
finishes and then run_realtime_input_task(...) starts

## Validation

- `env CODEX_SKIP_VENDORED_BWRAP=1 cargo test --manifest-path
codex-rs/Cargo.toml -p codex-core --test all
conversation_webrtc_start_posts_generated_session`

`CODEX_SKIP_VENDORED_BWRAP=1` is needed in this local environment
because `libcap.pc` is not installed for the vendored bubblewrap build.

## Testing
I tested this locally by running `cargo run -p codex-cli --bin codex --
--enable realtime_conversation` and invoking `/realtime`. Then, we get
logs emitted in `~/.codex/log/codex-tui.log`.

### Before the Change
Logging commit
(c0299e6edf)
```
2026-05-04T16:06:09.251956Z  INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: starting realtime conversation
2026-05-04T16:06:09.251980Z  INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: creating realtime call transport="webrtc"
2026-05-04T16:06:10.365722Z  INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: realtime call created; sdp answer ready transport="webrtc" call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy elapsed_ms=1113 total_elapsed_ms=1113
2026-05-04T16:06:10.365843Z  INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: connecting realtime sideband websocket call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy
2026-05-04T16:06:10.784528Z  INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: connected realtime sideband websocket call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy elapsed_ms=418 total_elapsed_ms=1532
2026-05-04T16:06:10.784665Z  INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: realtime conversation started
```

### After the Change
Logging commit
(c8b00ac21a)
```
2026-05-04T15:41:24.080363Z  INFO ... codex_core::realtime_conversation: starting realtime conversation
2026-05-04T15:41:24.080434Z  INFO ... codex_core::realtime_conversation: creating realtime call transport="webrtc"
2026-05-04T15:41:25.106906Z  INFO ... codex_core::realtime_conversation: realtime call created; sdp answer ready transport="webrtc" call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy elapsed_ms=1026 total_elapsed_ms=1026
2026-05-04T15:41:25.107067Z  INFO ... codex_core::realtime_conversation: spawned realtime sideband connection task transport="webrtc" total_elapsed_ms=1026
2026-05-04T15:41:25.107160Z  INFO ... codex_core::realtime_conversation: realtime conversation started
2026-05-04T15:41:25.107185Z  INFO codex_core::realtime_conversation: connecting realtime sideband websocket call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy
2026-05-04T15:41:25.107352Z  INFO ... codex_core::realtime_conversation: sent realtime sdp answer to client
2026-05-04T15:41:26.076685Z  INFO codex_core::realtime_conversation: connected realtime sideband websocket call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy elapsed_ms=969 total_elapsed_ms=1996
2026-05-04T15:41:26.573893Z  INFO codex_core::realtime_conversation: realtime session updated realtime_session_id=sess_u0_Dbpi8nhak5eLjQZ73yhAy
2026-05-04T15:41:26.573970Z  INFO codex_core::realtime_conversation: received realtime conversation event event=SessionUpdated { ... }
```

### Conclusion
Here we see that we saved about a half a second in conversation startup
(1532ms -> 969ms). This also checks out with my sanity tests; I was
seeing at most a second of saving.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 22:28:14 +00:00
Felipe Coury
36912ce3de fix(tui): use shared paste burst interval on Windows (#18914)
## Summary

Fixes #11678 by removing the Windows-specific
`PASTE_BURST_CHAR_INTERVAL` override. Windows now uses the same `8ms`
paste-burst character interval as macOS and Linux, which removes the
extra per-character hold that made fast typing and key repeat feel
delayed on Windows.

The paste-burst heuristic itself is unchanged, and the Windows-specific
active idle timeout remains in place. This PR only restores the shared
character-to-character burst threshold that decides whether adjacent
plain character events are part of a paste.

## Motivation

PR #9348 raised the Windows character interval from `8ms` to `30ms` to
protect the multiline paste behavior tracked in #2137, where pasted
newlines could be interpreted as submits in Windows terminals. That
fixed the paste failure, but it also made ordinary typing visibly laggy
because the TUI waits briefly before flushing a single typed character
while it checks whether a paste burst is forming.

The deployed behavior here is to remove that Windows-only delay and
return to the cross-platform threshold. Manual Windows validation of the
critical VS Code integrated terminal path shows multiline paste still
works with the final `8ms` value, including testing on VS Code
`1.107.0`.

## Testing

- `cargo test -p codex-tui`
- Manual Windows validation in VS Code integrated PowerShell with the
final `8ms` interval
2026-05-04 20:39:11 +00:00
Michael Bolin
30de54da36 bazel: run sharded rust integration tests (#21057)
## Why

Bazel CI was not actually exercising some sharded Rust integration-test
targets on macOS. The `rules_rust` sharding wrapper expects a symlink
runfiles tree, but this repo runs Bazel with `--noenable_runfiles`. In
that configuration the wrapper could fail to find the generated test
binary, produce an empty test list, and exit successfully. That made
targets such as `//codex-rs/core:core-all-test` look green even when
Cargo CI could still catch failures in the same Rust tests.

The coverage gap appears to have been introduced by
[#18082](https://github.com/openai/codex/pull/18082), which enabled
rules_rust native sharding on `//codex-rs/core:core-all-test` and the
other large Rust test labels. The manifest-runfiles setup itself
predates that change in
[#10098](https://github.com/openai/codex/pull/10098), but #18082 is
where the affected integration tests started running through the
incompatible rules_rust sharding wrapper.
[#18913](https://github.com/openai/codex/pull/18913) fixed the same
class of issue for wrapped unit-test shards, but integration-test shards
were still going through the rules_rust wrapper until this PR.

We still do not have the V8/code-mode pieces stable under the Bazel CI
cross-compile setup, so this keeps those tests out of Bazel while
restoring coverage for the rest of the sharded Rust integration suites.
Cargo CI remains responsible for V8/code-mode coverage for now.

This change did uncover a real failing core test on `main`:
`approved_folder_write_request_permissions_unblocks_later_apply_patch`.
That fix is split into
[#21060](https://github.com/openai/codex/pull/21060), which enables the
`apply_patch` tool in the test, teaches the aggregate core test binary
to dispatch the sandboxed filesystem helper, canonicalizes the macOS
temp patch target, and isolates the core test harness from managed
local/enterprise config. Keeping that fix separate lets this PR stay
focused on restoring Bazel coverage while documenting the first failure
it exposed.

## What changed

- Build sharded Rust integration tests as manual `*-bin` binaries and
run them through the existing manifest-aware `workspace_root_test`
launcher.
- Keep Bazel sharding on the launcher target so Rust test cases are
still distributed by stable test-name hashing.
- Configure Bazel CI to skip Rust tests whose names contain
`suite::code_mode::`.
- Exclude the standalone `codex-rs/code-mode` and `codex-rs/v8-poc`
unit-test targets from `bazel.yml`.

## Verification

- `bazel query --output=build //codex-rs/core:core-all-test` now shows
`workspace_root_test` wrapping `//codex-rs/core:core-all-test-bin`.
- `bazel test --test_output=all --nocache_test_results
--test_sharding_strategy=disabled //codex-rs/core:core-all-test
--test_filter=suite::request_permissions_tool::approved_folder_write_request_permissions_unblocks_later_apply_patch`
runs the actual Rust test body and passes.
- `bazel test --test_output=errors --nocache_test_results
--test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode::
//codex-rs/core:core-all-test` runs the sharded target with code-mode
skipped and passes overall locally, with one flaky attempt retried by
the existing `flaky = True` setting.
2026-05-04 13:33:14 -07:00
Felipe Coury
87d2235b54 fix(tui): support modified backspace/delete keys (#21058)
## Why

Fixes #21046.

Codex TUI 0.128.0 can show Backspace/Delete-related editor shortcuts in
`/keymap`, but Windows-style modified Backspace/Delete events were still
dropped by the composer because the default editor keymap did not
include those modified special-key variants. On Windows/CMD this meant
`Shift+Backspace` and `Shift+Delete` did not fall through to normal
character deletion, and `Ctrl+Backspace` / `Ctrl+Delete` did not perform
the word deletion users expect from Windows text inputs.

## What Changed

- Added default editor bindings for `shift-backspace` and `shift-delete`
so shifted delete keys keep normal grapheme deletion behavior.
- Added default editor bindings for `ctrl-backspace`,
`ctrl-shift-backspace`, `ctrl-delete`, and `ctrl-shift-delete` so
Windows-style word deletion works when terminals preserve those
modifiers.
- Added regression coverage for the resolved default keymap and textarea
behavior.

## How to Test

1. Start Codex in the TUI on Windows CMD or another terminal that
reports modified Backspace/Delete keys distinctly.
2. Type `hello world` in the composer.
3. Press `Ctrl+Backspace`; confirm `world` is removed and `hello `
remains.
4. Type `world` again, move the cursor before it, then press
`Ctrl+Delete`; confirm the next word is removed.
5. Type a few characters and press `Shift+Backspace` and `Shift+Delete`;
confirm they delete one character in the expected direction instead of
doing nothing.
6. Open `/keymap`, inspect the Editor deletion actions, and confirm the
modified Backspace/Delete aliases are visible as configurable defaults.

Targeted tests:
- `cargo test -p codex-tui keymap::tests`
- `cargo test -p codex-tui bottom_pane::textarea::tests`
- `cargo test -p codex-tui keymap_setup::tests`
2026-05-04 17:16:41 -03:00
charley-openai
a6599b8202 Add reasoning effort to turn tracing spans (#20060)
Why
#19432 added token usage to the turn and response spans. This follow-up
adds the configured reasoning effort so performance traces can be
filtered by model effort.

[example
trace](https://openai.datadoghq.com/apm/trace/1ff708a87159ff4898bdc8bd6091ec18?graphType=waterfall&shouldShowLegend=true&spanID=6596351544047485652&traceQuery=)
<img width="533" height="434" alt="Screenshot 2026-04-28 at 3 52 12 PM"
src="https://github.com/user-attachments/assets/77ef32fc-d7cd-4eec-87b4-26c6798f1af8"
/>


What Changed
- Adds `codex.turn.reasoning_effort` to the turn span.
- Adds `codex.request.reasoning_effort` to `handle_responses`.
- Extends the span test to cover explicit `high` effort with token
usage.

Testing
- `cargo test -p codex-core
turn_and_completed_response_spans_record_token_usage`
- `cargo test -p codex-otel`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-otel`
2026-05-04 12:57:05 -07:00
Michael Bolin
229b40aa21 core: fix apply_patch request permissions test (#21060)
## Why

The Bazel test coverage change exposed
`approved_folder_write_request_permissions_unblocks_later_apply_patch`,
and `rust-ci-full.yml` showed the same test failing on `main` on macOS.
There were two separate classes of problems here.

### Clean CI failure

The test emits an `apply_patch` tool call, but its config did not enable
the `apply_patch` tool, so the mocked response completed without an
`apply-patch-call` output. After enabling the tool, the same path also
needs the aggregate `codex-core` test binary to dispatch
`--codex-run-as-fs-helper`; sandboxed `apply_patch` uses that helper
under macOS Seatbelt.

The test now also canonicalizes the temporary patch target before
building the patch payload so the path matches normalized grants on
macOS, where `/var` paths often normalize to `/private/var`.

### Local/enterprise config isolation

The core test harness now builds its default test config with managed
config disabled, so host-managed enterprise config cannot alter these
tests. The request-permissions turns in this test also explicitly use
the user reviewer path, keeping the assertions focused on
`request_permissions` behavior rather than reviewer defaults from the
host.

## What Changed

- Enable `apply_patch` in
`approved_folder_write_request_permissions_unblocks_later_apply_patch`.
- Teach the core integration test binary to dispatch
`CODEX_FS_HELPER_ARG1`, matching the existing apply-patch and
linux-sandbox dispatch paths.
- Canonicalize the tempdir-backed patch target before creating the
patch.
- Ignore managed config in default core test configs and explicitly pin
this test to `ApprovalsReviewer::User`.

## Verification

Run outside the Codex app sandbox because these macOS tests
intentionally spawn Seatbelt:

- `cargo test -p codex-core
approved_folder_write_request_permissions_unblocks_later_apply_patch`
- `cargo test -p codex-core
approved_folder_write_request_permissions_unblocks_later_exec_without_sandbox_args`
2026-05-04 12:48:59 -07:00
sayan-oai
8126af3879 core: preserve last model ids in feedback tags (#21026)
## Why

Feedback reports do not currently surface a direct pointer to the last
model call, so investigations may require searching through many
requests in a session to find the bad response. Preserve the last
model-side IDs at response-stream time so immediate feedback reports
carry that breadcrumb.

## What changed

- Record `last_model_request_id` when a Responses stream exposes an
upstream request ID.
- Record `last_model_response_id` when the model response completes.
- Add unit coverage for the emitted feedback tags.

## Verification

- `cargo test -p codex-core
client::tests::response_stream_records_last_model_feedback_ids`
2026-05-04 12:46:08 -07:00
sayan-oai
b9e8df47da Use MCP server instructions in deferred namespace descriptions (#21053)
## Why

MCP servers can provide `instructions` that explain what their tools are
for. Directly exposed MCP namespaces already use those instructions when
a connector description is not available, but deferred `tool_search`
results did not preserve that fallback. The direct path falls back from
connector metadata to server instructions, while the deferred path only
carried `connector_description` and otherwise fell back to generic
namespace text.

That meant a plain MCP server could provide useful model-facing guidance
and still appear as `Tools in the X namespace.` whenever it was
discovered lazily through `tool_search`.

## What changed

- Store one model-facing `namespace_description` on `ToolInfo`, using
connector descriptions for connector-backed tools and server
instructions for plain MCP servers.
- Thread that namespace description through the `tool_search` source
list, search indexing, and returned namespace metadata.
- Add an end-to-end regression test for deferred non-app MCP search
results exposing server instructions as the namespace description.

## Verification

- `cargo test -p codex-tools
search_tool_description_lists_each_mcp_source_once --lib`
- `cargo test -p codex-core --test all
tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`
2026-05-04 19:36:07 +00:00
Felipe Coury
48402be6fa feat(tui): improve TUI keymap coverage (#20798)
## Summary
- normalize terminal-emitted C0 control characters through configurable
editor keymaps, covering raw control-key fallbacks like
Shift+Enter-as-LF in terminals from #20555 and #20898, plus part of the
modified-Enter behavior in #20580
- add default-unbound keymap actions for toggling Fast mode and killing
the current composer line, giving #20698 users a configurable zsh-style
Ctrl+U option without changing the existing default Ctrl+U behavior
- wire the new actions through gated /keymap picker entries, schema
generation, and snapshot coverage

Fixes #20555.
Fixes #20898.

## Testing
- just write-config-schema
- just fmt
- cargo test -p codex-config
- cargo test -p codex-tui keymap::tests
- cargo test -p codex-tui bottom_pane::textarea::tests
- cargo test -p codex-tui keymap_setup::tests
- cargo insta pending-snapshots
- just fix -p codex-tui
- git diff --check
- just argument-comment-lint
2026-05-04 19:18:56 +00:00
Felipe Coury
cc16995cc6 feat(tui): add PR summary statusline items (#20892)
## Why?

The Codex App already exposes branch and PR context in its
branch-details UI. This brings the same context into the CLI footer as
opt-in statusline items, so users can choose the extra signal without
making the default footer busier.

## What?

Add optional `pull-request-number` and `branch-changes` items to the
configurable TUI status line.

- `pull-request-number` shows the open PR for the current checkout and
renders as a clickable terminal hyperlink when OSC 8 links are
supported.
- `branch-changes` shows committed additions/deletions against the
repository default branch, or `No changes` when the branch has no
committed diff.

<img width="1257" height="261" alt="CleanShot 2026-05-03 at 20 44 15"
src="https://github.com/user-attachments/assets/10b4380b-c3e9-4729-9ee1-3f742068fa47"
/>

## Architecture

This follows the same client/app-server split as the Codex App: the TUI
owns presentation, caching, and optional rendering, while
workspace-sensitive `git` and `gh` discovery runs through app-server.

The new TUI-local `workspace_command` layer sends bounded,
non-interactive `command/exec` requests to the active app-server. That
makes the implementation remote-friendly: the TUI does not decide
whether commands run in an embedded local workspace or a remote
workspace, and it does not bypass app-server sandbox or permission
policy.

The branch summary logic stays internal to `codex-tui` because this PR
only needs TUI statusline behavior. The command boundary is still
isolated behind `WorkspaceCommandExecutor`, so the lookup code can be
lifted or reused later without changing statusline rendering.

## How?

- Add a TUI `WorkspaceCommandExecutor` abstraction backed by app-server
`command/exec`.
- Add branch summary probes for:
  - current branch name,
  - open PR metadata,
  - committed branch diff stats against the default branch.
- Prefer remote-tracking default branch refs for diff stats, avoiding
stale or absent local `main` branches.
- Resolve PRs with `gh pr view` first, then fall back to
commit-associated PR lookup across parent/fork repos.
- Add `/statusline` picker entries, preview values, rendering, and OSC 8
clickable PR links.
- Keep all probes best-effort so missing `git`, missing `gh`, auth
failures, or non-git directories hide optional items instead of
surfacing footer errors.

## Validation

- `cargo test -p codex-tui branch_summary -- --nocapture`
- Snapshot coverage for the `/statusline` preview/setup rendering paths
- Hyperlink rendering coverage for clickable PR statusline cells
2026-05-04 16:11:15 -03:00
Owen Lin
c2fed01550 rollout: store web search and mcp tool calls (#21054)
Codex App would like these.
2026-05-04 18:54:20 +00:00
Ruslan Nigmatullin
4d201e340e state: pass state db handles through consumers (#20561)
## Why

SQLite state was still being opened from consumer paths, including lazy
`OnceCell`-backed thread-store call sites. That let one process
construct multiple state DB connections for the same Codex home, which
makes SQLite lock contention and `database is locked` failures much
easier to hit.

State DB lifetime should be chosen by main-like entrypoints and tests,
then passed through explicitly. Consumers should use the supplied
`Option<StateDbHandle>` or `StateDbHandle` and keep their existing
filesystem fallback or error behavior when no handle is available.

The startup path also needs to keep the rollout crate in charge of
SQLite state initialization. Opening `codex_state::StateRuntime`
directly bypasses rollout metadata backfill, so entrypoints should
initialize through `codex_rollout::state_db` and receive a handle only
after required rollout backfills have completed.

## What Changed

- Initialize the state DB in main-like entrypoints for CLI, TUI,
app-server, exec, MCP server, and the thread-manager sample.
- Pass `Option<StateDbHandle>` through `ThreadManager`,
`LocalThreadStore`, app-server processors, TUI app wiring, rollout
listing/recording, personality migration, shell snapshot cleanup,
session-name lookup, and memory/device-key consumers.
- Remove the lazy local state DB wrapper from the thread store so
non-test consumers use only the supplied handle or their existing
fallback path.
- Make `codex_rollout::state_db::init` the local state startup path: it
opens/migrates SQLite, runs rollout metadata backfill when needed, waits
for concurrent backfill workers up to a bounded timeout, verifies
completion, and then returns the initialized handle.
- Keep optional/non-owning SQLite helpers, such as remote TUI local
reads, as open-only paths that do not run startup backfill.
- Switch app-server startup from direct
`codex_state::StateRuntime::init` to the rollout state initializer so
app-server cannot skip rollout backfill.
- Collapse split rollout lookup/list APIs so callers use the normal
methods with an optional state handle instead of `_with_state_db`
variants.
- Restore `getConversationSummary(ThreadId)` to delegate through
`ThreadStore::read_thread` instead of a LocalThreadStore-specific
rollout path special case.
- Keep DB-backed rollout path lookup keyed on the DB row and file
existence, without imposing the filesystem filename convention on
existing DB rows.
- Verify readable DB-backed rollout paths against `session_meta.id`
before returning them, so a stale SQLite row that points at another
thread's JSONL falls back to filesystem search and read-repairs the DB
row.
- Keep `debug prompt-input` filesystem-only so a one-off debug command
does not initialize or backfill SQLite state just to print prompt input.
- Keep goal-session test Codex homes alive only in the goal-specific
helper, rather than leaking tempdirs from the shared session test
helper.
- Update tests and call sites to pass explicit state handles where DB
behavior is expected and explicit `None` where filesystem-only behavior
is intended.

## Validation

- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
codex-tui -p codex-exec -p codex-cli --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout state_db_`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout try_init_ -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
codex-rollout --lib -- -D warnings`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store
read_thread_falls_back_when_sqlite_path_points_to_another_thread --
--nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
shell_snapshot`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all personality_migration`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find::find_prefers_sqlite_path_by_id --
--nocapture`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
interrupt_accounts_active_goal_before_pausing`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server get_auth_status -- --test-threads=1`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server --lib`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
-p codex-app-server --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
-p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
codex-exec -p codex-cli`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
codex-app-server`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
codex-rollout`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
- `just argument-comment-lint -p codex-core`
- `just argument-comment-lint -p codex-rollout`

Focused coverage added in `codex-rollout`:

- `recorder::tests::state_db_init_backfills_before_returning` verifies
the rollout metadata row exists before startup init returns.
- `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
verifies startup waits for another worker to finish backfill instead of
disabling the handle for the process.
-
`state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
verifies startup does not hang indefinitely on a stuck backfill lease.
-
`tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
verifies DB-backed lookup accepts valid existing rollout paths even when
the filename does not include the thread UUID.
-
`tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
verifies DB-backed lookup ignores a stale row whose existing path
belongs to another thread and read-repairs the row after filesystem
fallback.

Focused coverage updated in `codex-core`:

- `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
DB-preferred rollout file with matching `session_meta.id`, so it still
verifies that valid SQLite paths win without depending on stale/empty
rollout contents.

`cargo test -p codex-app-server thread_list_respects_search_term_filter
-- --test-threads=1 --nocapture` was attempted locally but timed out
waiting for the app-server test harness `initialize` response before
reaching the changed thread-list code path.

`bazel test //codex-rs/thread-store:thread-store-unit-tests
--test_output=errors` was attempted locally after the thread-store fix,
but this container failed before target analysis while fetching `v8+`
through BuildBuddy/direct GitHub. The equivalent local crate coverage,
including `cargo test -p codex-thread-store`, passes.

A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
also requires system `libcap.pc` for `codex-linux-sandbox`; the
follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
this container.
2026-05-04 11:46:03 -07:00
starr-openai
0035d7bd18 Add stdio exec-server listener (#20663)
## Why

This stack adds configured exec-server environments, including
environments reached over stdio. Before client-side stdio transports or
config can use that path, the exec-server binary itself needs a
first-class stdio listen mode so it can speak the same JSON-RPC protocol
over stdin/stdout that it already speaks over websockets.

**Stack position:** this is PR 1 of 5. It is the server-side transport
foundation for the stack.

## What Changed

- Accept `stdio` and `stdio://` for `codex exec-server --listen`.
- Promote the existing stdio `JsonRpcConnection` helper from test-only
code into normal exec-server transport code.
- Add parse coverage for stdio listen URLs while preserving the existing
websocket default.

## Stack

- **1. This PR:** https://github.com/openai/codex/pull/20663 - Add stdio
exec-server listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME

Split from original draft: https://github.com/openai/codex/pull/20508

## Validation

Not run locally; this was split out of the original draft stack.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 11:40:03 -07:00
iceweasel-oai
5d5500650b Fix Windows PTY teardown by preserving ConPTY ownership (#20685)
## Why

On Windows, background terminals could stay visible after their shell
process had already exited. The elevated runner waits for the PTY output
reader to reach EOF before it sends the final exit message, but the
ConPTY helper was reducing ownership down to raw handles too early. That
left the pseudoconsole's borrowed pipe handles alive past teardown, so
EOF never propagated and the session stayed `running`.

## What changed

- change `utils/pty/src/win/conpty.rs` to hand off owned ConPTY
resources instead of leaking only raw handles
- make `windows-sandbox-rs/src/conpty/mod.rs` keep the pseudoconsole
owner and the backing pipe handles together until teardown
- update the elevated runner and the legacy unified-exec backend to keep
that `ConptyInstance` alive, take only the specific pipe handles they
need, and drop the owner at teardown instead of trying to close a
detached pseudoconsole handle later

## Testing

- desktop app in `Auto-review`: 11 x `cmd /c "ping -n 3 google.com"` all
exited cleanly and did not accumulate in the UI
- desktop app in `Auto-review`: 5 x `cmd /c "ping -n 30 google.com"`
appeared in the UI and drained back out on their own
2026-05-04 18:40:00 +00:00
starr-openai
905987c08f Prepare selected environment plumbing (#20669)
## Why
This is a prep PR in the multi-environment process-tool stack. It
separates ownership/config cleanup from the behavior change that teaches
process tools to route by selected environment, so the follow-up PR can
focus on model-facing `environment_id` behavior.

## Stack
1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext`
rendering for selected environments
2. https://github.com/openai/codex/pull/20669 - selected-environment
ownership and tool config prep (this PR)
3. https://github.com/openai/codex/pull/20647 - process-tool
`environment_id` routing

## What Changed
- keep the resolved turn environment list wrapped in
`ResolvedTurnEnvironments` through `TurnContext` instead of unwrapping
it back to a raw `Vec`
- add `TurnContext::resolve_path_against` so cwd-relative path
resolution has one shared helper
- replace the old tool config boolean with `ToolEnvironmentMode::{None,
Single, Multiple}`

## Testing
- Tests not run locally; this prep refactor is covered by GitHub CI for
the stack.

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 17:55:49 +00:00
Won Park
5c1ec8f4fd tui: retire /approvals and rename /autoreview to /approve (#21034)
## Why

The TUI currently exposes overlapping command names for the same
permissions flow: `/permissions` and the older `/approvals` alias. It
also uses `/autoreview` for the manual retry flow, even though the
action users take there is approving one denied auto-review request.

This change makes the command surface consistent with the hard rebrand:
- `/permissions` is the only command for permission settings.
- `/approve` is the command for approving a recent auto-review denial.

## What changed

- Removed the legacy `/approvals` slash command and its dispatch path.
- Kept `/permissions` as the single permissions command shown and
accepted by the TUI.
- Renamed the auto-review denial command from `/autoreview` to
`/approve`.
- Updated nearby comments so they refer to `/permissions` rather than
the retired `/approvals` name.

## Verification

- Updated the slash-command unit test to assert that `AutoReview` now
renders and parses as `approve`.
2026-05-04 17:50:34 +00:00
Felipe Coury
94800ecbbf feat(tui): add keymap debug inspector (#20794)
## Why

We constantly get bug reports about keys not being recognized by Codex
when the terminal is not handling the key press. Running `/keymap debug`
or `/keymap` and going to the Debug tab, we can allow the user to either
understand that the key being pressed is not being recognized or to
check what it's being recognized as and report or reassign that key.

| Menu | Inspector | Hint |
|---|---|---|
| <img width="1369" height="796" alt="CleanShot 2026-05-02 at 12 57 12"
src="https://github.com/user-attachments/assets/512b6faa-344e-4aee-9c00-b4bdc633a662"
/> | <img width="1261" height="754" alt="CleanShot 2026-05-02 at 12 56
36"
src="https://github.com/user-attachments/assets/a6ddae7d-e174-4ee4-893f-e6bec4fff4ab"
/> | <img width="1369" height="796" alt="CleanShot 2026-05-02 at 12 57
30"
src="https://github.com/user-attachments/assets/db507784-f40a-4cff-ac23-a61d9703769b"
/> |
## Summary
- add a Debug tab to `/keymap` and support `/keymap debug` for direct
access
- show what key Codex receives, the config key representation, raw event
details, and matching actions
- add a progressive missing-key hint that escalates after a few seconds
with no detected keypress

## Validation
- `just fmt`
- `cargo test -p codex-tui keymap_setup::tests::debug_view`
- `cargo test -p codex-tui keymap_setup::tests`
- `cargo test -p codex-tui slash_keymap`
- `cargo test -p codex-tui` (unit tests passed; integration test
`suite::model_availability_nux::resume_startup_does_not_consume_model_availability_nux_count`
failed locally by itself with `codex resume` exiting 1 and terminal
probe escape output)
- `just fix -p codex-tui`
- `just argument-comment-lint`
- `cargo insta pending-snapshots`
- `git diff --check`
2026-05-04 14:40:50 -03:00
viyatb-oai
5b80f87c97 fix(linux-sandbox): fall back when system bwrap lacks perms (#20628)
## Why

Codex `0.128` started using `--perms` in more routine Linux sandbox
construction when protected workspace metadata mounts landed in #19852.
Upstream bubblewrap added `--perms` in `v0.5.0`, so system `bwrap`
versions older than that, including the `v0.4.0` and `v0.4.1` family, do
not support the flag. The launcher still selected those binaries as long
as they existed on `PATH`.

That means affected hosts can fail every sandboxed command up front
with:

```text
bwrap: Unknown option --perms
```

The reports in #20590 and duplicate #20623 match that compatibility gap;
#20623 explicitly shows system bubblewrap `0.4.0`.

## What changed

- Replace the single `--argv0` probe with a small system-bwrap
capability probe in `codex-rs/linux-sandbox/src/launcher.rs`.
- Continue using the old-system `--argv0` compatibility path when
needed, but only select a system `bwrap` if it also advertises
`--perms`.
- Fall back to the vendored `bwrap` when the system binary is too old
for the flags Codex now requires.
- Add regression coverage for the old-system-bwrap case so binaries
without `--perms` stay on the vendored path.

## Verification

- Added `falls_back_to_vendored_when_system_bwrap_lacks_perms` to cover
the reported compatibility gap.
- Ran `cargo test -p codex-linux-sandbox` and `cargo clippy -p
codex-linux-sandbox --tests` locally. On macOS, the crate builds but its
Linux-only tests are cfg-gated out, so the new regression test still
needs Linux CI or a Linux devbox run for real execution coverage.

## Related issues

- Fixes #20590
- Duplicate report: #20623
2026-05-04 10:38:31 -07:00
Owen Lin
541e99cf09 feat(app-server): always return limited thread history (#20682)
## Why

Whenever we return a thread's history (turns and items) over app-server,
always return the limited form as specified by the rollout policy
`EventPersistenceMode::Limited`, even if the thread was previously
started with `EventPersistenceMode::Extended`.

We're finding it is quite unscalable to be returning the extended
history, so let's apply the same filtering logic of the rollout policy
when we load and return the thread's history.

## What Changed

- Reuse the rollout persistence policy when reconstructing app-server
`ThreadItem` history so only `EventPersistenceMode::Limited` rollout
items are replayed into API turns.
- Route `thread/read`, `thread/resume`, `thread/fork`,
`thread/turns/list`, and rollback responses through the same filtered
app-server history projection.
- Keep live active turns intact when composing a response for a
currently running thread.
- Update command execution coverage so persisted extended command events
are excluded from returned history for `thread/read`, `thread/fork`, and
`thread/turns/list`.

## Test Plan

- `cargo test -p codex-app-server limited`
- `cargo test -p codex-app-server thread_shell_command`
- `cargo test -p codex-app-server thread_read`
- `cargo test -p codex-app-server thread_rollback`
- `cargo test -p codex-app-server thread_fork`
- `cargo test -p codex-app-server-protocol`
2026-05-04 10:37:35 -07:00
Matthew Zeng
1b900bee8a Unify skip-review handling for approval_mode = "approve" (#20750)
## Summary
- Treat `approval_mode = "approve"` as skip-review across all permission
modes.
- Remove the mode-specific split in the MCP auto-approval gate so
approved tools bypass review consistently.
- Expand regression coverage in the shared MCP helper and the core
tool-call flow.

## Testing
- `just fmt`
- `cargo test -p codex-mcp`
- `cargo test -p codex-core
approve_mode_skips_arc_and_guardian_in_every_permission_mode`
- `git diff --check`
- Full `cargo test -p codex-core` was also attempted, but the suite hit
an unrelated pre-existing stack overflow in an existing multi-agent test
2026-05-04 10:30:47 -07:00
Matthew Zeng
83a4e3b66b [mcp-apps] Persist MCP Apps specific tool call end event. (#20853)
- [x] Persist a special type of MCP tool calls for triggering MCP App,
this type of mcp tool calls has 'mcpAppResourceUri` set. These events
are needed so that the Codex App can correctly render the MCP App after
resume.
2026-05-04 10:20:58 -07:00
jif-oai
e3451ce6be core: share responses request builder with compact requests (#20989)
## Why

`ModelClientSession` and `compact_conversation_history()` were still
rebuilding the same `ResponsesApiRequest` fields separately. That
duplication makes it easy for normal `/responses` turns and compact
requests to drift when request-shape changes land later, which is
exactly the kind of cache-affecting divergence we want to avoid.

This follow-up keeps the scope small by extracting the shared
request-construction logic into one helper and using it from both paths.

## What changed

- move `ResponsesApiRequest` construction into a shared
`ModelClient::build_responses_request(...)` helper in
`core/src/client.rs`
- update the normal `/responses` streaming path to call that helper
instead of the old `ModelClientSession`-local implementation
- update `compact_conversation_history()` to derive its compact payload
from the same helper so `model`, `instructions`, `input`, `tools`,
`parallel_tool_calls`, `reasoning`, and `text` stay aligned with normal
request building
- add a unit test covering the shared helper's prompt cache key,
installation metadata, and `service_tier` behavior

## Verification

- `cargo test -p codex-core
build_responses_request_sets_shared_cache_and_metadata_fields`
- `cargo test -p codex-core --test all
remote_compact_v2_reuses_context_compaction_for_followups`

## Docs

No docs update needed.
2026-05-04 17:18:38 +00:00
jif-oai
4fd7dfe223 memories-mcp: reject symlink traversal in local backend (#21010)
## Why

The local memories MCP backend only rejected symlinks after resolving
the final path. That left room for scoped requests like
`skills/secret.md` to walk through a symlinked ancestor directory and
escape the configured memories root.

This change also makes missing scoped paths fail explicitly instead of
looking like an empty `list` / `search` result or a `NotFile` read
error.

## What Changed

- walk each scoped path component in
`LocalMemoriesBackend::resolve_scoped_path` and reject symlinked
ancestors before accessing the target
- reject scoped paths that traverse through a non-directory intermediate
component
- add a `NotFound` backend error for missing `read`, `list`, and
`search` paths and map it through the MCP server error conversion
- add coverage for missing paths and symlinked ancestor directories in
`codex-rs/memories/mcp/src/local_tests.rs`

## Testing

- added unit coverage in `codex-rs/memories/mcp/src/local_tests.rs` for
missing paths and symlinked ancestor directories across `read`, `list`,
and `search`
2026-05-04 18:40:28 +02:00
jif-oai
f20f8a719e memories/mcp: generate tool schemas with schemars (#21012)
## Why

The memories MCP server currently keeps handwritten JSON Schema beside
the Rust types that actually serialize and deserialize the tool
payloads:
[`schema.rs`](2f5c06a29c/codex-rs/memories/mcp/src/schema.rs (L4-L133)),
[`server.rs`](2f5c06a29c/codex-rs/memories/mcp/src/server.rs (L44-L75)),
and
[`backend.rs`](2f5c06a29c/codex-rs/memories/mcp/src/backend.rs (L41-L117)).
That duplicates the tool contract and makes schema drift easier as the
API evolves.

## What changed

- derive `JsonSchema` for the memories tool arguments, responses, and
nested response types
- replace the handwritten schema builders with shared `schemars`
generation
- preserve the existing wire shape while generating schemas, including
nullable output `Option` fields and non-nullable optional input fields
- wire the `list`, `read`, and `search` tools to the generated schemas

## Verification

- CI pending
2026-05-04 18:40:17 +02:00
jif-oai
161541310f typo (#21023) 2026-05-04 18:39:46 +02:00
pakrym-oai
33b19bcfde [codex] Split app-server request processors (#20940)
## Why

The app-server request path had grown around a large
`CodexMessageProcessor` plus separate API wrapper/helper modules. That
made the dependency graph hard to see and forced unrelated request
families to share broad processor state.

This PR makes the split mechanical and command-prefix oriented so
request families own only the dependencies they use.

## What changed

- Replaced `CodexMessageProcessor` with command-prefix request
processors under `app-server/src/request_processors/`.
- Removed the old config, device-key, external-agent-config, and fs API
wrapper files by moving their API handling into processors.
- Split apps, plugins, marketplace, catalog, account, MCP, command exec,
fs, git, feedback, thread, turn, thread goals, and Windows sandbox
handling into dedicated processors.
- Kept shared lifecycle, summary conversion, token usage replay, and
shared error mapping only where multiple processors use them; single-use
helpers were inlined into their owning processor.
- Removed the fallback processor path and moved processor tests to
`_tests` files.

## Validation

- `cargo test -p codex-app-server`
- `cargo check -p codex-app-server`
- `just fix -p codex-app-server`
2026-05-04 09:34:11 -07:00
Eric Traut
12a729f2b2 Keep paused goals paused on thread resume (#20790)
## Summary

Early adopters of the `/goal` feature have provided feedback that they
expect a goal they explicitly paused to remain paused when they resume a
thread. Previously, resuming a thread would reactivate a paused goal.

This PR keeps persisted goal status unchanged during thread resume. This
honors the user feedback while also simplifying the core goal logic.

Rather than have the core logic automatically resume a paused goal, that
responsibility is transferred to the client. The TUI now detects a
resumed thread with a paused goal and asks the user whether to `Resume
goal` or `Leave paused`. The prompt appears only for quiet resume flows,
so users who resume with an immediate prompt are not interrupted.

<img width="544" height="111" alt="image"
src="https://github.com/user-attachments/assets/0ac9de1c-6ee6-47ba-b223-c03c8eb4c192"
/>
2026-05-04 09:04:30 -07:00
Eric Traut
f072119ccf Speed up /side parent restore replay (#20815)
## Why

Returning from a `/side` conversation restores the parent thread by
replaying its snapshot into the TUI. For very long parent threads,
replaying every transcript row can take noticeable time even though most
rows immediately scroll out of terminal history.

## What Changed

- Buffer thread-switch replay for parent restores when terminal resize
reflow is enabled.
- Reuse the existing resize-reflow tail renderer so only the retained
transcript tail is written back to scrollback when a row cap is
configured.
2026-05-04 09:00:30 -07:00
Eric Traut
3c2dcbef85 Keep paused goals paused on thread resume (#20790)
## Summary

Early adopters of the `/goal` feature have provided feedback that they
expect a goal they explicitly paused to remain paused when they resume a
thread. Previously, resuming a thread would reactivate a paused goal.

This PR keeps persisted goal status unchanged during thread resume. This
honors the user feedback while also simplifying the core goal logic.

Rather than have the core logic automatically resume a paused goal, that
responsibility is transferred to the client. The TUI now detects a
resumed thread with a paused goal and asks the user whether to `Resume
goal` or `Leave paused`. The prompt appears only for quiet resume flows,
so users who resume with an immediate prompt are not interrupted.

<img width="544" height="111" alt="image"
src="https://github.com/user-attachments/assets/0ac9de1c-6ee6-47ba-b223-c03c8eb4c192"
/>
2026-05-04 08:58:07 -07:00
jif-oai
2f5c06a29c nit: legacy (#21006) 2026-05-04 16:04:29 +02:00
jif-oai
8ba294ea13 feat: support multi-query memories search (#21004)
## Why
The memories MCP `search` tool only accepts a single substring today,
which makes it hard for clients to express combined queries or explain
why a line matched. This change adds the richer search shape needed for
the next client iteration while keeping the legacy single-`query` call
working.

## What changed
- accept either the legacy `query` field or a new `queries` array, plus
`match_mode: any|all`
- teach the local memories backend to evaluate multi-query line matches
and return `matched_queries` on each hit
- update the MCP input/output schema and add coverage for parser
behavior, ordering, pagination, case sensitivity, and match modes

## Testing
- added unit coverage in `memories/mcp/src/local_tests.rs` and
`memories/mcp/src/server.rs`
2026-05-04 15:55:06 +02:00
jif-oai
5512b23c95 nit: renaming (#20998) 2026-05-04 15:43:58 +02:00
jif-oai
0269a46ab1 feat: add context lines to memories MCP search (#20997)
## Why

The paginated memories MCP `search` tool still returned only the
matching line text, which made it harder for clients to present useful
search results or decide whether they needed to follow up with a
separate `read` call. Adding a small amount of surrounding context makes
individual hits much more usable while keeping the search response
deterministic and line-addressable.

## What changed

- add an optional `context_lines` search argument and thread it through
the MCP server into the local memories backend
- change search matches to return the matched `line_number` plus a
`start_line_number` and multi-line `content` block for the requested
context window
- update the search tool schema and description to document the new
request/response shape
- extend the local backend tests to cover zero-context matches,
contextual results, pagination, and invalid cursors that point past the
end of the result set

## Testing

- Added targeted unit coverage in `memories/mcp/src/local_tests.rs`
- GitHub Actions are running for the branch

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 15:32:57 +02:00
jif-oai
554223ab80 feat: paginate memories MCP search results (#20996)
## Why

The memories MCP `search` tool previously stopped once it hit
`max_results`, so callers could tell there were more matches via
`truncated` but had no way to fetch the rest of the result set. That
made large searches awkward for clients that need to keep paging through
a stable, deterministic view of the matches.

## What changed

- add an optional `cursor` field to `SearchMemoriesRequest` / tool input
and return `next_cursor` in `SearchMemoriesResponse`
- update the MCP schemas and tool wiring so clients can request
subsequent pages explicitly
- change the local memories backend to collect and sort the full scoped
match list, then slice the requested page and reject invalid cursors
- add unit coverage for paginated search results and invalid cursor
handling in `memories/mcp/src/local_tests.rs`

## Testing

- Added targeted unit coverage in `memories/mcp/src/local_tests.rs`
- GitHub Actions are running for the branch
2026-05-04 15:23:10 +02:00
jif-oai
29352569b3 feat: make memories MCP list shallow (#20994)
## Why
The memories MCP `list` tool should behave like a directory listing, not
a recursive tree walk. Recursive results make pagination harder to
reason about, return unexpectedly deep paths for scoped requests, and no
longer match the intended tool contract.

## What Changed
- Changed the local memories backend so `list` returns only the
immediate children of the requested path.
- Preserved file-scoped requests by returning the file itself, and
missing paths by returning an empty result.
- Updated cursor handling to paginate over the shallow sibling set and
reject cursors past the available results.
- Updated the MCP tool description to say it lists immediate files and
directories under a path.
- Reworked the local backend tests to cover shallow top-level listing,
shallow scoped listing, sibling ordering, and pagination.

## Testing
- `cargo test -p codex-memories-mcp`
2026-05-04 15:08:34 +02:00
jif-oai
5730615e75 feat: paginate MCP memories list (#20993)
## Why

Large memories trees do not fit well into a single MCP `list` response.
This change makes the memories MCP server page `list` results so callers
can continue walking the tree without overfetching or relying on
ambiguous truncation.

## What changed

- add an optional `cursor` input to the memories MCP `list` API and
return `next_cursor` alongside `truncated` in the response
- paginate recursive local-memory traversal while preserving
lexicographic path order across directories
- reject malformed and out-of-range cursors as invalid MCP requests
- update the server/schema wiring and add coverage for pagination,
ordering, and cursor validation in `memories/mcp/src/local_tests.rs`

## Testing

- `cargo test -p codex-memories-mcp`
2026-05-04 14:59:56 +02:00
jif-oai
6b6581ac59 feat: add max_lines to memories MCP read (#20991)
## Why

The memories MCP `read` tool already supports `line_offset`, but it
cannot return a bounded line range. That makes it awkward to page
through large memory files or request a small slice without relying on
token truncation.

## What changed

- add an optional `max_lines` parameter to the memories MCP `read` tool
schema and request parsing
- cap local backend reads to the requested number of lines before token
truncation
- treat `max_lines = 0` as an invalid request and surface it as
`invalid_params`
- add backend tests for bounded reads and invalid line request
validation

## Testing

- added coverage in `memories/mcp/src/local_tests.rs` for `max_lines`
reads and invalid `max_lines` / `line_offset` requests
2026-05-04 14:45:38 +02:00
jif-oai
019755d570 feat: add line offsets to memory read MCP (#20986)
## Why

Memory clients sometimes need to continue reading a file from a known
line instead of starting over from the top. Adding a line offset to the
`read` MCP keeps that resume logic simple and avoids re-reading
already-consumed content.

## What changed

- Added an optional `line_offset` argument to the memory `read` tool,
defaulting to `1`.
- Read content starting at the requested 1-indexed line before token
truncation, and return `start_line_number` in the response.
- Treat invalid offsets as invalid params errors and cover the new
behavior in `codex-rs/memories/mcp/src/local_tests.rs`.

## Testing

- Added unit tests for reading from a non-default starting line.
- Added unit tests for rejecting `0` and past-end line offsets.
2026-05-04 14:26:37 +02:00
jif-oai
d927f61208 feat: add remote compaction v2 Responses client path (#20773)
## Why

This adds the `remote_compaction_v2` client path so remote compaction
can run through the normal Responses stream and install a
`context_compaction` item that trigger a compaction.

The goal is to migrate some of the compaction logic on the client side

We keeps the v2 transport behind a feature flag while letting follow-up
requests reuse the compacted context instead of falling back to the
legacy compaction item shape.

## What changed

- add `ResponseItem::ContextCompaction` and refresh the generated
app-server / schema / TypeScript fixtures that expose response items on
the wire
- add `core/src/compact_remote_v2.rs` to send compaction through the
standard streamed Responses client, require exactly one
`context_compaction` output item, and install that item into compacted
history
- route manual compact and auto-compaction through the v2 path when
`remote_compaction_v2` is enabled, while keeping the existing remote
compaction path as the fallback
- preserve the new item type across history retention, follow-up request
construction, telemetry, rollout persistence, and rollout-trace
normalization
- add targeted coverage for the feature flag, `context_compaction`
serialization, rollout-trace normalization, and remote-compaction
follow-up behavior

## Verification

- added protocol tests for `context_compaction`
serialization/deserialization in `protocol/src/models.rs`
- added rollout-trace coverage for `context_compaction` normalization in
`rollout-trace/src/reducer/conversation_tests.rs`
- added remote compaction integration coverage for v2 follow-up reuse
and mixed compaction output streams in
`core/tests/suite/compact_remote.rs`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 14:15:01 +02:00
jif-oai
d013155f40 feat: memories mcp v1 (#20622)
Add an experimental MCP on memories
This must never be used and is only here for testing purpose

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 13:51:03 +02:00
jif-oai
f48b777717 feat: support template interpolation in multi-agent usage hints (#20973)
## Why

`multi_agent_v2` usage hints sometimes need to reference resolved config
values such as the effective thread limit. Those values only exist after
config layering, defaulting, and feature materialization, so the raw
TOML alone was not enough to render them.

## What changed

- allow
`features.multi_agent_v2.{usage_hint_text,root_agent_usage_hint_text,subagent_usage_hint_text}`
to use `{{ ... }}` placeholders backed by the materialized effective
config
- fail config loading with a targeted error when a referenced
placeholder does not exist or does not resolve to a scalar value
- move resolved-config materialization into a shared helper so config
interpolation and config-lock export/replay both serialize the same
resolved feature, memory, and agent settings

## Example
```
[features.multi_agent_v2]
enabled = true
usage_hint_text = "lorem {{ features.multi_agent_v2.max_concurrent_threads_per_session }} ipsum"
```
gets rendered as 
```
        "description": String("... \lorem 4 ipsum"),
```
2026-05-04 11:50:01 +02:00
pakrym-oai
c8c30d9d75 [codex] Emit MCP tool calls as turn items (#20677)
## Why

`McpToolCall` was still an app-server item synthesized from deprecated
legacy begin/end events. Recent item migrations moved this ownership
into core `TurnItem`s, so MCP tool calls now follow the same canonical
lifecycle and leave legacy events as compatibility fanout.

Keeping the core item close to the v2 `ThreadItem::McpToolCall` shape
also avoids spreading MCP result semantics across app-server conversion
code. Core now owns whether a completed call is `completed` or `failed`,
and whether the payload is a tool result or an error.

## What changed

- Added core `TurnItem::McpToolCall` with flattened `server`, `tool`,
`arguments`, `status`, `result`, and `error` fields.
- Updated MCP tool call emitters, including MCP resource tools, to emit
`ItemStarted`/`ItemCompleted` around directly constructed core MCP
items.
- Updated app-server v2 conversion to project the core MCP item into
`ThreadItem::McpToolCall` without deriving status or splitting `Result`
locally.
- Ignored live deprecated MCP legacy fanout in app-server v2 to avoid
duplicate item notifications, while keeping thread history replay on the
legacy event path.

## Verification

- `cargo test -p codex-protocol`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-core --lib mcp_tool_call`
- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server
mcp_tool_call_completion_notification_contains_truncated_large_result`
2026-05-03 22:50:13 -07:00
pakrym-oai
9ddfda9db7 [codex] Refactor app-server dispatch result flow (#20897)
## Why

App-server request handling had response sending spread across many
individual handlers, which made it harder to see which requests return
payloads, which methods send their own delayed response, and which
branches emit notifications after a response.

## What changed

- Centralized normal `ClientResponsePayload` sending in the dispatch
path.
- Kept explicit-response methods explicit where they need custom
ordering or delayed delivery.
- Removed forward-only handler wrappers and immediate `async { ...
}.await` bodies where they were not needed.
- Moved branch-specific post-response notifications into the branches
that own the response ordering.
- Replaced unreachable delegated request-family error arms with explicit
`unreachable!` cases.

## Verification

- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server thread_goal`
- `just fix -p codex-app-server`
2026-05-03 18:57:46 -07:00
Eric Traut
67849d950d Remove local docs and specs (#20896)
## Summary

We should not check local-only docs or planning specs into this
repository. Keeping those files here duplicates the canonical Codex
documentation surface and makes transient implementation notes look like
supported docs.

This PR removes the local-only docs/spec files from `docs/` and trims
`docs/config.md` back to links for the maintained configuration
documentation on developers.openai.com.
2026-05-03 10:23:09 -07:00
Eric Traut
39555036a3 [codex] Add issue labeler area labels (#20893)
## Why

The automated issue labeler needs more precise area labels for newly
opened GitHub issues so triage can distinguish new Codex app and agent
feature surfaces without falling back to broad labels.

## What Changed

- Added labeler prompt entries for `computer-use`, `browser`, `memory`,
`imagen`, `remote`, `performance`, `automations`, and `pets` in
`.github/workflows/issue-labeler.yml`.
- Updated the agent-area guidance so `memory` is used for agentic memory
storage/retrieval and `performance` is used for slow behavior, high
memory utilization, and leaks.
- Expanded the fallback `agent` guidance so Codex prefers the new
specific labels when applicable.

## Verification

- Parsed `.github/workflows/issue-labeler.yml` with `yq e '.'`.
- Ran `git diff --check` for the workflow change.
2026-05-03 09:25:42 -07:00
pakrym-oai
35aaa5d9fc Bound websocket request sends with idle timeout (#20751)
## Why

We saw Responses websocket sessions recover only after a long quiet
period when the server had already logged the websocket as disconnected.
The normal connect path is already bounded by
`websocket_connect_timeout_ms`, but the first request send on an
established websocket reused only the receive-side idle timeout after
the write completed. If the socket write/pump stalls, the client can sit
in `ws_stream.send(...)` without reaching the existing receive timeout.
2026-05-01 23:33:32 -07:00
Matthew Zeng
f88701f5c8 [tool_suggest] More prompt polishes. (#20566)
Tool suggest still misfires when model needs tool_search, updating the
prompts to further disambiguate it:

- [x] rename it from `tool_suggest` to `request_plugin_install`
- [x] rephrase "suggestion" to "install" in the tool descriptions.
- [x] disambiguate "the tool" vs "the plugin/connector". 

Tested with the Codex App and verified it still works.
2026-05-02 04:22:12 +00:00
Felipe Coury
127434cd8b fix(tui): bound startup terminal probes (#20654)
## Summary

Bound TUI startup terminal response probes so unsupported terminals
cannot stall startup for multiple seconds.

This replaces the Unix startup uses of crossterm's blocking response
probes with short `/dev/tty` probes that use nonblocking reads and
`poll` with a 100ms timeout. It covers the initial cursor-position
query, keyboard enhancement support detection, and OSC 10/11
default-color detection. The default-color probe uses one shared
deadline for foreground and background instead of allowing two
independent full waits.

The diagnostic mode/trace env vars from the investigation branch are
intentionally not included. The shipped behavior is simply bounded
probing by default, while non-Unix keeps the existing crossterm fallback
path.

## Details

- Add a private `terminal_probe` module for bounded Unix terminal probes
and response parsers.
- Let `custom_terminal::Terminal` accept a caller-provided initial
cursor position so startup can compute it before constructing the
terminal.
- Use bounded cursor, keyboard enhancement, and default-color probes on
Unix startup.
- Preserve default-color cache behavior so a failed attempted query does
not retry forever.

## Validation

- `cd codex-rs && just fmt`
- `cd codex-rs && cargo test -p codex-tui terminal_probe`
- `cd codex-rs && just fix -p codex-tui`
- `cd codex-rs && just argument-comment-lint`
- `git diff --check`
- `git diff --cached --check`

`cd codex-rs && cargo test -p codex-tui` still aborts on the
pre-existing local stack overflow in
`app::tests::discard_side_thread_keeps_local_state_when_server_close_fails`;
I reproduced that same focused failure on `main` before this PR work, so
it is not introduced by this change.

Manual validation in the VM showed the original crossterm path taking
about 2s per unanswered probe, while bounded probing returned in about
100ms per probe.
2026-05-02 01:20:57 +00:00
jgershen-oai
9e905528bb Fix custom CA login behind TLS-inspecting proxies (#20676)
Refs:
https://linear.app/openai/issue/SE-6311/login-fails-for-experian-users-behind-tls-inspecting-proxy

## Summary
- When a custom CA bundle is configured, force the shared `codex-client`
reqwest builder onto rustls before registering custom roots.
- Add the `rustls-tls-native-roots` reqwest feature so the rustls client
preserves native roots plus the enterprise CA bundle.
- Add subprocess TLS coverage for both a direct local TLS 1.3 server and
a hermetic local CONNECT TLS-intercepting proxy that forwards a
token-exchange-shaped POST to a local origin.

## Plain-language explanation
Experian users are behind a TLS-inspecting proxy, so the login token
exchange needs to trust the enterprise CA bundle from
`CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE`. Before this change, that
custom-CA branch still used reqwest default TLS selection, which could
fail in the proxy environment. Now, only when a custom CA is configured,
Codex selects rustls first and then adds the custom CA roots, matching
the validated behavior from the Experian test build while leaving normal
system-root clients unchanged.

The new regression test recreates the enterprise-proxy shape locally:
the probe client sends an HTTPS `POST /oauth/token` through an explicit
HTTP CONNECT proxy, the proxy presents a leaf certificate signed by a
runtime-generated test CA, decrypts the request, forwards it to a local
origin, and relays the `ok` response back.

## Scope note
- The actual production fix is the first commit: `8368119282 Fix custom
CA reqwest clients to use rustls`.
- The second commit is integration-test coverage only. It generates all
test CA and localhost certificate material at runtime.

## Validation
- `cd codex-rs && cargo test -p codex-client --test ca_env
posts_to_token_origin_through_tls_intercepting_proxy_with_custom_ca_bundle
-- --nocapture`
- `cd codex-rs && cargo test -p codex-client`
- `cd codex-rs && cargo test -p codex-login`
- `cd codex-rs && just fmt`
- `cd codex-rs && just bazel-lock-update`
- `cd codex-rs && just bazel-lock-check`
- `cd codex-rs && just fix -p codex-client`
2026-05-01 17:51:49 -07:00
Michael Bolin
cd2760fc08 ci: cross-compile Windows Bazel clippy (#20701)
## Why

#20585 moved the Windows Bazel test job to the cross-compile path, but
the Windows Bazel clippy and verify-release-build jobs were still using
the native Windows/MSVC-host fallback. Those two jobs became the slowest
Windows PR legs, even though both are build-only signal and do not need
to execute the resulting binaries.

## What Changed

- Switches the Windows Bazel clippy job from
`--windows-msvc-host-platform` to `--windows-cross-compile`, so clippy
build actions use Linux RBE while still targeting
`x86_64-pc-windows-gnullvm`.
- Switches the Windows Bazel verify-release-build job to
`--windows-cross-compile` as well. This job only compiles
`cfg(not(debug_assertions))` Rust code under `fastbuild`, so it does not
need a native Windows build host.
- Keeps the old `--skip_incompatible_explicit_targets` behavior only for
fork/community PRs without `BUILDBUDDY_API_KEY`, where `run-bazel-ci.sh`
falls back to the local Windows MSVC-host shape.
- Adds `--windows-cross-compile` support to
`.github/scripts/run-bazel-query-ci.sh`, so target-discovery queries
select the same `ci-windows-cross` config as the subsequent build.
- Threads that option through `scripts/list-bazel-clippy-targets.sh` so
the Windows clippy job discovers targets under the same platform shape
as the subsequent clippy build.

## Verification

Local checks:

```shell
bash -n .github/scripts/run-bazel-query-ci.sh
bash -n scripts/list-bazel-clippy-targets.sh
ruby -e 'require "yaml"; YAML.load_file(".github/workflows/bazel.yml"); puts "ok"'
RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh | grep -c -- '-windows-cross-bin$'
RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh --windows-cross-compile | grep -c -- '-windows-cross-bin$'
```

The Linux target-list check reported `0` Windows-cross internal test
binaries, while the Windows cross target-list check reported `47`,
preserving the test-code clippy coverage shape from the existing Windows
job.
2026-05-01 16:40:29 -07:00
Michael Bolin
466798aa83 ci: cross-compile Windows Bazel tests (#20585)
## Status

This is the Bazel PR-CI cross-compilation follow-up to #20485. It is
intentionally split from the Cargo/cargo-xwin release-build PoC so
#20485 can stay as the historical release-build exploration. The
unrelated async-utils test cleanup has been moved to #20686, so this PR
is focused on the Windows Bazel CI path.

The intended tradeoff is now explicit in `.github/workflows/bazel.yml`:
pull requests get the fast Windows cross-compiled Bazel test leg, while
post-merge pushes to `main` run both that fast cross leg and a fully
native Windows Bazel test leg. The native main-only job keeps full
V8/code-mode coverage and gets a 40-minute timeout because it is less
latency-sensitive than PR CI. All other Bazel jobs remain at 30 minutes.

## Why

Windows Bazel PR CI currently does the expensive part of the build on
Windows. A native Windows Bazel test job on `main` completed in about
28m12s, leaving very little headroom under the 30-minute job timeout and
making Windows the slowest PR signal.

#20485 showed that Windows cross-compilation can be materially faster
for Cargo release builds, but PR CI needs Bazel because Bazel owns our
test sharding, flaky-test retries, and integration-test layout. This PR
applies the same high-level shape we already use for macOS Bazel CI:
compile with remote Linux execution, then run platform-specific tests on
the platform runner.

The compromise is deliberately signal-aware: code-mode/V8 changes are
rare enough that PR CI can accept losing the direct V8/code-mode
smoke-test signal temporarily, while `main` still runs the native
Windows job post-merge to catch that class of regression. A follow-up PR
should investigate making the cross-built Windows gnullvm V8 archive
pass the direct V8/code-mode tests so this tradeoff can eventually go
away.

## What Changed

- Adds a `ci-windows-cross` Bazel config that targets
`x86_64-pc-windows-gnullvm`, uses Linux RBE for build actions, and keeps
`TestRunner` actions local on the Windows runner.
- Adds explicit Windows platform definitions for
`windows_x86_64_gnullvm`, `windows_x86_64_msvc`, and a bridge toolchain
that lets gnullvm test targets execute under the Windows MSVC host
platform.
- Updates the Windows Bazel PR test leg to opt into the cross-compile
path via `--windows-cross-compile` and `--remote-download-toplevel`.
- Adds a `test-windows-native-main` job that runs only for `push` events
on `refs/heads/main`, uses the native Windows Bazel path, includes
V8/code-mode smoke tests, and has `timeout-minutes: 40`.
- Keeps fork/community PRs without `BUILDBUDDY_API_KEY` on the previous
local Windows MSVC-host fallback, including
`--host_platform=//:local_windows_msvc` and `--jobs=8`.
- Preserves the existing integration-test shape on non-gnullvm
platforms, while generating Windows-cross wrapper targets only for
`windows_gnullvm`.
- Resolves `CARGO_BIN_EXE_*` values from runfiles at test runtime,
avoiding hard-coded Cargo paths and duplicate test runfiles.
- Extends the V8 Bazel patches enough for the
`x86_64-pc-windows-gnullvm` target and Linux remote execution path.
- Makes the Windows sandbox test cwd derive from `INSTA_WORKSPACE_ROOT`
at runtime when Bazel provides it, because cross-compiled binaries may
contain Linux compile-time paths.
- Keeps the direct V8/code-mode unit smoke tests out of the Windows
cross PR path for now while native Windows CI continues to cover them
post-merge.

## Command Shape

The fast Windows PR test leg invokes the normal Bazel CI wrapper like
this:

```shell
./.github/scripts/run-bazel-ci.sh \
  --print-failed-action-summary \
  --print-failed-test-logs \
  --windows-cross-compile \
  --remote-download-toplevel \
  -- \
  test \
  --test_tag_filters=-argument-comment-lint \
  --test_verbose_timeout_warnings \
  --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
  -- \
  //... \
  -//third_party/v8:all \
  -//codex-rs/code-mode:code-mode-unit-tests \
  -//codex-rs/v8-poc:v8-poc-unit-tests
```

With the BuildBuddy secret available on Windows, the wrapper selects
`--config=ci-windows-cross` and appends the important Windows-cross
overrides after rc expansion:

```shell
--host_platform=//:rbe
--shell_executable=/bin/bash
--action_env=PATH=/usr/bin:/bin
--host_action_env=PATH=/usr/bin:/bin
--test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}
```

The native post-merge Windows job intentionally omits
`--windows-cross-compile` and does not exclude the V8/code-mode unit
targets:

```shell
./.github/scripts/run-bazel-ci.sh \
  --print-failed-action-summary \
  --print-failed-test-logs \
  -- \
  test \
  --test_tag_filters=-argument-comment-lint \
  --test_verbose_timeout_warnings \
  --build_metadata=COMMIT_SHA=${GITHUB_SHA} \
  --build_metadata=TAG_windows_native_main=true \
  -- \
  //... \
  -//third_party/v8:all
```

## Research Notes

The existing macOS Bazel CI config already uses the model we want here:
build actions run remotely with `--strategy=remote`, but `TestRunner`
actions execute on the macOS runner. This PR mirrors that pattern for
Windows with `--strategy=TestRunner=local`.

The important Bazel detail is that `rules_rs` is already targeting
`x86_64-pc-windows-gnullvm` for Windows Bazel PR tests. This PR changes
where the build actions execute; it does not switch the Bazel PR test
target to Cargo, `cargo-nextest`, or the MSVC release target.

Cargo release builds differ from this Bazel path for V8: the normal
Windows Cargo release target is MSVC, and `rusty_v8` publishes prebuilt
Windows MSVC `.lib.gz` archives. The Bazel PR path targets
`windows-gnullvm`; `rusty_v8` does not publish a prebuilt Windows
GNU/gnullvm archive, so this PR builds that archive in-tree. That
Linux-RBE-built gnullvm archive currently crashes in direct V8/code-mode
smoke tests, which is why the workflow keeps native Windows coverage on
`main`.

The less obvious Bazel detail is test wrapper selection. Bazel chooses
the Windows test wrapper (`tw.exe`) from the test action execution
platform, not merely from the Rust target triple. The outer
`workspace_root_test` therefore declares the default test toolchain and
uses the bridge toolchain above so the test action executes on Windows
while its inner Rust binary is built for gnullvm.

The V8 investigation exposed a Windows-client gotcha: even when an
action execution platform is Linux RBE, Bazel can still derive the
genrule shell path from the Windows client. That produced remote
commands trying to run `C:\Program Files\Git\usr\bin\bash.exe` on Linux
workers. The wrapper now passes `--shell_executable=/bin/bash` with
`--host_platform=//:rbe` for the Windows cross path.

The same Windows-client/Linux-RBE boundary also affected
`third_party/v8:binding_cc`: a multiline genrule command can carry CRLF
line endings into Linux remote bash, which failed as `$'\r'`. That
genrule now keeps the `sed` command on one physical shell line while
using an explicit Starlark join so the shell arguments stay readable.

## Verification

Local checks included:

```shell
bash -n .github/scripts/run-bazel-ci.sh
bash -n workspace_root_test_launcher.sh.tpl
ruby -e "require %q{yaml}; YAML.load_file(%q{.github/workflows/bazel.yml}); puts %q{ok}"
RUNNER_OS=Linux ./scripts/list-bazel-clippy-targets.sh
RUNNER_OS=Windows ./scripts/list-bazel-clippy-targets.sh
RUNNER_OS=Linux ./tools/argument-comment-lint/list-bazel-targets.sh
RUNNER_OS=Windows ./tools/argument-comment-lint/list-bazel-targets.sh
```

The Linux clippy and argument-comment target lists contain zero
`*-windows-cross-bin` labels, while the Windows lists still include 47
Windows-cross internal test binaries.

CI evidence:

- Baseline native Windows Bazel test on `main`: success in about 28m12s,
https://github.com/openai/codex/actions/runs/25206257208/job/73907325959
- Green Windows-cross Bazel run on the split PR before adding the
main-only native leg: Windows test 9m16s, Windows release verify 5m10s,
Windows clippy 4m43s,
https://github.com/openai/codex/actions/runs/25231890068
- The latest SHA adds the explicit PR-vs-main tradeoff in `bazel.yml`;
CI is rerunning on that focused diff.

## Follow-Up

A subsequent PR should investigate making a cross-built Windows binary
work with V8/code-mode enabled. Likely options are either making the
Linux-RBE-built `windows-gnullvm` V8 archive correct at runtime, or
evaluating whether a Bazel MSVC target/toolchain can reuse the same
prebuilt MSVC `rusty_v8` archive shape that Cargo release builds already
use.
2026-05-01 15:55:28 -07:00
Channing Conger
a5fbcf1ab4 Prune unused code-mode globals (#20542)
Hide Atomics, SharedArrayBuffer, and WebAssembly from the code-mode
runtime since the harness does not expose worker support or need those
APIs.
2026-05-01 15:11:22 -07:00
starr-openai
2952beb009 Surface multi-environment choices in environment context (#20646)
## Why
The model needs a way to see which environments are available during a
multi-environment turn without changing the legacy single-environment
prompt surface or pulling replay/persistence changes into the same
review.

## Stack
1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext`
rendering for selected environments (this PR)
2. https://github.com/openai/codex/pull/20669 - selected-environment
ownership and tool config prep
3. https://github.com/openai/codex/pull/20647 - process-tool
`environment_id` routing

## What Changed
- extend `environment_context` so multi-environment turns render an
`<environments>` block with the selected environment ids and cwd values
- keep zero- and single-environment turns on the existing cwd-only
render path
- keep replay and persistence paths on the legacy surface for now so
this PR stays scoped to live prompt rendering
- add focused coverage in
`codex-rs/core/src/context/environment_context_tests.rs`

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-01 22:11:06 +00:00
Abhinav
d55479488e Clear live hook rows when turns finalize (#20674)
# Why

When a user interrupts a turn while a hook is still running, the normal
turn status is cleared but the separate live hook row can remain visible
as `Running` because the TUI may never receive a matching
`HookCompleted` event before cancellation. Once the turn itself is
finalized, that turn-scoped live state should not remain on screen.

# What

- clear any still-live `active_hook_cell` during turn finalization
- add a regression snapshot covering an interrupted turn with a visible
`PreToolUse` hook row

# Testing

- `cargo test -p codex-tui interrupted_turn_clears_visible_running_hook`
- attempted `cargo test -p codex-tui` (currently aborts on unrelated
existing stack overflow in
`app::tests::discard_side_thread_removes_agent_navigation_entry`)
2026-05-01 14:48:22 -07:00
Abhinav
443f6b831e Use the 2025-06-18 elicitation capability shape (#20562)
# Why

Codex currently negotiates MCP `2025-06-18`, where the client
elicitation capability is represented as an empty object. We were still
serializing `capabilities.elicitation.form`, which belongs to the later
capability shape and can cause strict `2025-06-18` servers to reject
`initialize` with an unrecognized-field error.

This keeps the handshake aligned with the protocol version Codex
actually negotiates and fixes the compatibility regression tracked in
#17492.

# What

- Serialize the client elicitation capability as `elicitation: {}` for
`2025-06-18`.
- Keep elicitation advertised for both Codex Apps and custom MCP
servers.
- Tighten regression coverage so the unit test asserts both the Rust
value and the serialized wire shape.
- Add an app-server integration test that round-trips a form elicitation
from a custom MCP server; the existing connector round-trip continues to
cover the connector path.

# Verification

- `cargo test -p codex-mcp`
- `cargo test -p codex-app-server mcp_server_elicitation_round_trip`
- `cargo test -p codex-app-server
mcp_server_tool_call_round_trips_elicitation`

# Next steps

- Decide whether `tool_call_mcp_elicitation=false` should also suppress
capability advertisement during `initialize`.
- Revisit `form` / `url` capability advertisement when Codex is ready to
negotiate MCP `2025-11-25`, which defines that newer shape.
2026-05-01 14:16:22 -07:00
pakrym-oai
aed74e5ee4 [codex] Emit image view as core item (#20512)
## Why

Image-view results should be represented as a core-produced turn item
instead of being reconstructed by app-server. At the same time, existing
rollout/history paths still understand the legacy `ViewImageToolCall`
event, so this keeps that event as compatibility output generated from
the new item lifecycle.

## What changed

- Added `TurnItem::ImageView` to `codex-protocol`.
- Emitted image-view item start/completion directly from the core
`view_image` handler.
- Kept `ViewImageToolCall` as a legacy event and generate it from
completed `TurnItem::ImageView` items.
- Kept `thread_history.rs` on the legacy `ViewImageToolCall` replay
path, with `ImageView` item lifecycle events ignored there.
- Updated app-server protocol conversion, rollout persistence, and
affected exhaustive event matches for the new item plus legacy fan-out
shape.

## Verification

- `cargo test -p codex-protocol -p codex-app-server-protocol -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server -p
codex-app-server --lib`
- `cargo test -p codex-core --test all
view_image_tool_attaches_local_image`
- `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol
-p codex-app-server -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `git diff --check`
2026-05-01 11:28:30 -07:00
canvrno-oai
610eefb86b /plugins: add marketplace upgrade flow (#20478)
This PR adds marketplace upgrade to the `/plugins` menu so users can
update configured marketplaces. It adds a `Ctrl+U` shortcut on eligible
marketplace tabs, a loading state, and the app-server request flow
needed to perform `marketplace/upgrade`. After a successful upgrade, the
TUI refreshes plugin data, plugin mentions, and user config so updated
marketplace contents show up across the menu and other plugin surfaces.
It also preserves the current marketplace tab on no-op and failure paths
and surfaces backend error details directly in the TUI.

- Add a `Ctrl+U` upgrade option for user-configured marketplace tabs in
`/plugins`
- Show the upgrade footer hint only on upgradeable marketplace tabs
- Show a loading state during `marketplace/upgrade`
- Surface already-up-to-date and per-marketplace failure results from
the backend
- Refresh plugin data, plugin mentions, and user config after successful
upgrades
- Add tests and snapshot updates for the shortcut flow, loading state,
and failure messaging

Steps to test:
1. Add a `/plugin` marketplace to Codex TUI.
2. Open `/plugins`, move to that marketplace tab, and confirm the footer
shows `Ctrl+U` to upgrade.
3. Press `Ctrl+U` and confirm the popup switches into an upgrade loading
state.
4. When the request finishes, confirm you see the expected result:
updated marketplace contents on success, an already-up-to-date message
on no-op, or backend error details on failure. On no-op or failure,
confirm the popup stays on the same marketplace tab.
2026-05-01 11:26:29 -07:00
jif-oai
2817866a32 fix: reduce ConfigBuilder::build stack usage (#20650)
## Why

`ConfigBuilder::build` performs a large amount of async config loading.
Leaving that entire future on the caller stack makes config startup more
fragile on small runtime worker stacks.

## What changed

- keep `ConfigBuilder::build` as a thin wrapper that boxes the
config-loading future before awaiting it
- move the existing implementation into a private `build_inner` method
so the large async state machine lives on the heap instead of the
runtime thread stack

## Testing

- Not run locally
2026-05-01 20:24:17 +02:00
Felipe Coury
ff66b3c7eb fix(tui): restore alt-enter newline alias (#20535)
Fixes https://github.com/openai/codex/issues/20501

## Summary
- add Alt+Enter to the built-in editor newline aliases
- update keymap tests that used Alt+Enter as a custom submit binding now
that it conflicts with newline
- refresh the keymap action-menu snapshot fixture

## Test Plan
- `just fmt`
- `cargo test -p codex-tui keymap::tests`
- `cargo test -p codex-tui bottom_pane::textarea::tests`
- `cargo test -p codex-tui keymap_setup::tests`
- `cargo test -p codex-tui`
- `cargo insta pending-snapshots`
- `git diff --check`
- `just argument-comment-lint`
2026-05-01 15:22:02 -03:00
starr-openai
be71b6fcd1 Use selected turn environments for runtime context (#20281)
## Summary
- make selected turn environments the source of truth for session
runtime cwd and MCP runtime environment selection
- keep local/no-selection fallback behavior intact
- add coverage for duplicate selected environments, cwd resolution, and
MCP runtime environment selection

## Validation
- git diff --check
- rustfmt was run on touched Rust files during the implementation
workflow

CI should provide the full Bazel/test signal.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-01 11:00:14 -07:00
Tom
e4d6675632 [codex] Migrate loaded thread/read history to ThreadStore (#20486)
## Summary

- Route loaded `thread/read` + `includeTurns` through
`CodexThread::load_history` / ThreadStore history instead of direct
rollout JSONL reads.
- Add an in-memory ThreadStore regression test covering loaded
`thread/read includeTurns` without a local rollout path.
2026-05-01 10:55:04 -07:00
Abhinav
78baa20780 deprecate legacy notify (#20524)
# Why

`notify` is the remaining compatibility surface from the legacy hook
implementation. The newer lifecycle hook engine now owns the active hook
system, so we should start steering users away from adding new `notify`
configs before removing the old path entirely. This also adds a
lightweight watchpoint for the deprecation so we can see how much legacy
usage remains before the clean drop.

# What

- emit a startup deprecation notice when a non-empty `notify` command is
configured
- emit `codex.notify.configured` when a session starts with legacy
`notify` configured
- emit `codex.notify.run` when the legacy notify path fires after a
completed turn
- mark `notify` as deprecated in the config schema and repo docs
- remove the orphaned `codex-rs/hooks/src/user_notification.rs` file
that is no longer compiled
- add regression coverage for the new deprecation notice

# Next steps

A follow-up PR can remove the legacy notify path entirely once we are
ready for the clean drop. Before then, we can watch
`codex.notify.configured` and `codex.notify.run` to understand the
deprecation impact and remaining active usage. The cleanup PR should
then delete the `notify` config field, the `legacy_notify`
implementation, the old compatibility dispatch types and callsites that
only exist for the legacy path, and the remaining compatibility
docs/tests.

# Testing

- `cargo test -p codex-hooks`
- `cargo test -p codex-config`
- `cargo test -p codex-core emits_deprecation_notice_for_notify`
2026-05-01 17:35:21 +00:00
pakrym-oai
9b8d585075 [codex] Add Codex environment config (#20630)
## Why

This adds a checked-in Codex environment configuration so the repo
exposes a ready-to-run Codex action from the app environment metadata.

## What changed

- Added `.codex/environments/environment.toml` with a generated `Run`
action.
- The action runs the `codex` binary from `codex-rs/Cargo.toml` with
`mcp_oauth_credentials_store=file`.

## Verification

- Not run; configuration-only change.
2026-05-01 10:01:45 -07:00
Eric Traut
6784db51c0 Add /ide context support to the TUI (#20294)
## Why

Users have asked for a `/ide` command in the TUI so Codex can use the
active IDE session for live context such as the current file, open tabs,
and selected ranges. We already support a similar feature in the Codex
desktop app, so bringing it to the TUI makes sense.

One subtle compatibility constraint is that the injected prompt wrapper
and transcript stripping should match the desktop app and IDE extension.
By using the same `## My request for Codex:` delimiter and hiding the
injected context from transcript rendering the same way, threads created
in the TUI render correctly in desktop and IDE surfaces, and threads
created there replay correctly in the TUI, even when IDE context was
included.

Addresses https://github.com/openai/codex/issues/13834.

## What changed
### Summary
This PR consists of four four pieces:
1. An IPC client that uses a socket (Mac/Linux) or named pipe (Windows)
to talk to the IDE Extension
2. Logic that establishes the IPC connection and requests IDE context
(open files, selection) on demand
3. Logic that injects this context into the user prompt (using the same
technique as the desktop app) and hides the added context when rendering
the prompt in the TUI transcript
4. A new slash command for enabling/disabling this mode and text within
the footer to indicate when it's enabled

### Details
- Added `/ide [on|off|status]` to the TUI, with bare `/ide` toggling IDE
context on or off.
- Added a Rust IDE context client that connects to the local Codex IDE
IPC route as a client and requests context from the IDE extension flow.
- Injected IDE context using the same prompt delimiter and
transcript-stripping convention as the desktop app and IDE extension so
shared threads render consistently across surfaces.
- Added an `IDE context` status-line indicator while the feature is
active and cleared it when enabling or fetching context fails.
- Added handling for multiple selection ranges, oversized selections,
interleaved IPC messages, and transient reconnect timing after quick
toggles.

## Verification

Did extensive manual testing in addition to running automated unit and
regression tests.

To test:

- Launch VS Code (or Cursor) with the IDE extension.
- Open one or more files in the IDE and select a range of text within
one of them.
- Start the TUI.
- Ask the agent which files you have open in your IDE, and it should say
that it does not know.
- Enable `/ide` mode; note that `IDE context` appears in the lower
right.
- Ask the agent what files you have open in your IDE and what text is
selected.
2026-05-01 09:39:48 -07:00
Ruslan Nigmatullin
41e171fcf2 app-server: move transport into dedicated crate (#20545)
## Why

`codex-app-server` currently owns both request-processing code and
transport implementation details. Splitting the transport layer into its
own crate makes that boundary explicit, reduces the amount of
transport-specific dependency surface carried by `codex-app-server`, and
gives future transport work a narrower place to evolve.

## What changed

- Added `codex-app-server-transport` and moved the existing transport
tree into it, including stdio, unix socket, websocket, remote-control
transport, and websocket auth.
- Moved shared transport-facing message types into the new crate so both
the transport implementation and `codex-app-server` use the same
definitions.
- Kept processor-facing connection state and outbound routing in
`codex-app-server`, with the routing tests moved next to that local
wrapper.
- Updated workspace metadata, Bazel crate metadata, and
`codex-app-server` dependencies for the new crate boundary.

## Validation

- `cargo metadata --locked --no-deps`
- `git diff --check`
- Attempted `cargo test -p codex-app-server-transport`, `cargo test -p
codex-app-server`, `just fix -p codex-app-server-transport`, and `just
fix -p codex-app-server`; all were blocked before compilation by the
existing `packageproxy` resolution failure for locked `rustls-webpki =
0.103.13`.
- Attempted Bazel build / lockfile validation; those were blocked by
external fetch failures against BuildBuddy / GitHub while resolving
`v8`.
2026-05-01 09:23:47 -07:00
jif-oai
5744b85b9a fix: cargo deny (#20627)
Fix cargo deny by ack the `RUSTSEC` while a fix land
```
  RUSTSEC-2026-0118
  NSEC3 closest-encloser proof validation enters unbounded loop on cross-zone responses

  RUSTSEC-2026-0119
  CPU exhaustion during message encoding due to O(n²) name compression

  Dependency path:

  hickory-proto 0.25.2
  └── hickory-resolver 0.25.2
      └── rama-dns 0.3.0-alpha.4
          └── rama-tcp 0.3.0-alpha.4
              └── codex-network-proxy
```

Also upgrade some workers version to prevent this:
```
warning[license-not-encountered]: license was not encountered
    ┌─ ./codex-rs/deny.toml:131:6
    │
131 │     "OpenSSL",
    │      ━━━━━━━ unmatched license allowance

warning[duplicate]: found 2 duplicate entries for crate 'base64'
   ┌─ /github/workspace/codex-rs/Cargo.lock:79:1
   │
79 │ ╭ base64 0.21.7 registry+https://github.com/rust-lang/crates.io-index
80 │ │ base64 0.22.1 registry+https://github.com/rust-lang/crates.io-index
   │ ╰───────────────────────────────────────────────────────────────────┘ lock entries
```
2026-05-01 18:15:38 +02:00
Eric Traut
3d1d164aee Remove no-tool goal continuation suppression (#20523)
## Why

`/goal` is supposed to keep Codex working until the goal is actually
done. The previous continuation logic had two ways to stop early: the
continuation prompt told the model to wait for new input when it felt
blocked, and the runtime suppressed another continuation turn after a
continuation finished without any tool calls.

That made goals stop short even when the agent could still keep making
progress (I received a few reports of this from users). It also relied
on a brittle heuristic that treated "no registry tool calls" as
equivalent to "should stop."

## What changed

- removed the continuation prompt sentence that told the model to stop
and wait for new input when it could not continue productively
- removed the goal runtime suppression heuristic that stopped
auto-continuation after a no-tool continuation turn
- deleted the continuation-activity bookkeeping and left `tool_calls` as
telemetry only
- added focused regressions for the two intended behaviors: completed
no-tool continuation turns still continue, while `request_user_input`
keeps the existing turn open instead of spawning a new continuation
2026-05-01 09:09:55 -07:00
Eric Traut
227bee0445 Enforce animations = false for screen readers (#20564)
## Why

Issue #20489 calls out that animated TUI affordances can be noisy for
screen-reader users. Codex already has `tui.animations = false` as a
reduced-motion setting, but some live activity rows render spinner-style
prefixes in that mode. These were relatively recent regressions.

We have also regressed this pattern more than once by adding new
spinner/shimmer callsites that do not think through the reduced-motion
path, so this PR adds a small guardrail while fixing the current
surfaces.

## What changed

- Omit the live status-row spinner when animations are disabled, so the
row starts with stable text like `Working (...)`.
- Render running hook headers without the spinner prefix when animations
are disabled, while preserving shimmer/spinner behavior when animations
are enabled.
- Centralize TUI activity indicators in `tui/src/motion.rs`, with
explicit reduced-motion choices for hidden prefixes, static bullets, and
plain shimmer-text fallbacks.
- Route existing spinner/shimmer callsites through the central motion
helper, including exec rows, MCP/web-search/loading rows, hook rows,
plugin loading, and onboarding loading text.
- Add a source-scan regression test that rejects direct `spinner(...)`
or `shimmer_spans(...)` usage outside the central module and primitive
definition.
- Add focused coverage that reduced-motion active exec rows are stable,
status rows start without a spinner, running hooks omit the spinner, and
MCP inventory loading stays stable.
- Update the one affected status-indicator snapshot; the existing detail
tree prefix remains unchanged.

## Verification

- `cargo test -p codex-tui`
2026-05-01 09:07:56 -07:00
pakrym-oai
f476338f93 Move apply-patch file changes into turn items (#20540)
## Why

Apply-patch file changes are now part of the core turn item stream, so
v2 clients can consume the same first-class item lifecycle path used by
other turn items instead of relying on app-server-specific remapping
from legacy patch events.

## What changed

- Added a core `TurnItem::FileChange` carrying apply-patch changes and
completion metadata.
- Updated the apply-patch tool emitter to send `ItemStarted` /
`ItemCompleted` with the new `FileChange` item while preserving legacy
`PatchApplyBegin` / `PatchApplyEnd` fan-out.
- Updated app-server v2 conversion to render the new core item directly
and stopped `event_mapping` from remapping old patch begin/end events
into item notifications.
- Kept thread history reconstruction based on the existing old
apply-patch events for rollout compatibility.

## Verification

- `cargo test -p codex-protocol -p codex-app-server-protocol`
- `cargo test -p codex-core --test all
apply_patch_tool_executes_and_emits_patch_events`
- `cargo test -p codex-app-server bespoke_event_handling`
2026-05-01 08:47:18 -07:00
jif-oai
0b04d1b3cc feat: export and replay effective config locks (#20405)
## Why

For reproducibility. A hand-written `config.toml` is not enough to
recreate what a Codex session actually ran with because layered config,
CLI overrides, defaults, feature aliases, resolved feature config,
prompt setup, and model-catalog/session values can all affect the final
runtime behavior.

This PR adds an effective config lockfile path: one run can export the
resolved session config, and a later run can replay that lockfile and
fail early if the regenerated effective config drifts.

## What Changed

- Add a dedicated `ConfigLockfileToml` wrapper with top-level lockfile
metadata plus the replayable config:

  ```toml
  version = 1
  codex_version = "..."

  [config]
  # effective ConfigToml fields
  ```

- Keep lockfile metadata out of regular `ConfigToml`; replay loads
`ConfigLockfileToml` and then uses its nested `config` as the
authoritative config layer.
- Add `debug.config_lockfile.export_dir` to write
`<thread_id>.config.lock.toml` when a root session starts.
- Add `debug.config_lockfile.load_path` to replay a saved lockfile and
validate the regenerated session lockfile against it.
- Add `debug.config_lockfile.allow_codex_version_mismatch` to optionally
tolerate Codex binary version drift while still comparing the rest of
the lockfile.
- Add `debug.config_lockfile.save_fields_resolved_from_model_catalog` so
lock creation can either save model-catalog/session-resolved fields or
intentionally leave those fields dynamic.
- Build lockfiles from the effective config plus resolved runtime values
such as model selection, reasoning settings, prompts, service tier, web
search mode, feature states/config, memories config, skill instructions,
and agent limits.
- Materialize feature aliases and custom feature config into the
lockfile so replay compares canonical resolved behavior instead of
user-authored alias shape.
- Strip profile/debug/file-include/environment-specific inputs from
generated lockfiles so they contain replayable values rather than the
inputs that produced those values.
- Surface JSON-RPC server error code/data in app-server client and TUI
bootstrap errors so config-lock replay failures include the actual TOML
diff.
- Regenerate the config schema for the new debug config keys.

## Review Notes

The main flow is split across these files:

- `config/src/config_toml.rs`: lockfile/debug TOML shapes.
- `core/src/config/mod.rs`: loading `debug.config_lockfile.*`, replaying
a lockfile as a config layer, and preserving the expected lockfile for
validation.
- `core/src/session/config_lock.rs`: exporting the current session
lockfile and materializing resolved session/config values.
- `core/src/config_lock.rs`: lockfile parsing, metadata/version checks,
replay comparison, and diff formatting.

## Usage

Export a lockfile from a normal session:

```sh
codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"'
```

Export a lockfile without saving model-catalog/session-resolved fields:

```sh
codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"' \
  -c 'debug.config_lockfile.save_fields_resolved_from_model_catalog=false'
```

Replay a saved lockfile in a later session:

```sh
codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"'
```

If replay resolves to a different effective config, startup fails with a
TOML diff.

To tolerate Codex binary version drift during replay:

```sh
codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"' \
  -c 'debug.config_lockfile.allow_codex_version_mismatch=true'
```

## Limitations

This does not support custom rules/network policies.

## Verification

- `cargo test -p codex-core config_lock`
- `cargo test -p codex-config`
- `cargo test -p codex-thread-manager-sample`
2026-05-01 17:46:02 +02:00
jif-oai
ff27d01676 feat: seed ad-hoc memory extension instructions (#20606)
## Summary

Ad-hoc memory notes are written under `memories/extensions/ad_hoc/`, but
the consolidation agent only knows how to interpret an extension when
the extension folder has an `instructions.md`. Seed those instructions
from the memories write pipeline so an enabled memories startup creates
the expected ad-hoc extension layout automatically.

This also moves extension-specific write behavior behind a dedicated
`memories/write/src/extensions/` module. `ad_hoc` owns the seeded
instructions template, while the existing resource-retention cleanup
lives in its own `prune` module so future memory extensions can add
their own write-side setup without growing a flat helper file.

## Changes

- Seed `memories/extensions/ad_hoc/instructions.md` during eligible
memory startup without overwriting an existing file.
- Store the ad-hoc instructions template under
`memories/write/templates/extensions/ad_hoc/`, keeping ownership in
`codex-memories-write`.
- Split memory extension support into `extensions::ad_hoc` and
`extensions::prune`.
- Keep the existing old-resource pruning behavior unchanged.

## Verification

- `cargo test -p codex-memories-write`
- `bazel build //codex-rs/memories/write:write`

---------

Co-authored-by: chatgpt-codex-connector[bot] <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
2026-05-01 14:43:58 +02:00
jif-oai
70fc55b8f3 chore: improve remember prompt (#20610) 2026-05-01 14:38:07 +02:00
jif-oai
97aae46800 feat: ad-hoc instructions (#20602) 2026-05-01 13:42:54 +02:00
jif-oai
ad404c8400 chore: allow memories edition (#20600) 2026-05-01 13:27:37 +02:00
xl-openai
48791920a8 feat: Track local paths for shared plugins (#20560)
When a local plugin is shared, Codex now records the local plugin path
by remote plugin id under CODEX_HOME/.tmp.

plugin/share/list includes the remote share URL and the matching local
plugin path when available, and plugin/share/delete
clears the local mapping after deleting the remote share.

Also add sharedURL to plugin/share/list.
2026-05-01 00:50:12 -07:00
xli-oai
96d2ea9058 Add remote plugin skill read API (#20150)
## Summary

Adds an app-server `plugin/skill/read` method for remote plugin skill
markdown. The new method calls the plugin-service skill detail endpoint
and returns `skill_md_contents`, so clients can preview skills for
remote plugins before the bundle is installed locally.

## Why

Uninstalled remote plugin skills do not have local `SKILL.md` files.
Without an on-demand remote read, the desktop plugin details UI cannot
render the skill details modal for those skills.

## Validation

- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server --test all --
suite::v2::plugin_read::plugin_skill_read_reads_remote_skill_contents_when_remote_plugin_enabled
--exact`
- `just fix -p codex-app-server-protocol -p codex-core-plugins -p
codex-app-server`
2026-05-01 00:16:25 -07:00
xli-oai
a62b52f826 Refresh remote plugin cache on auth changes (#20265)
## Summary
- Refresh the remote installed-plugin cache after login/logout instead
of keying it by account or eagerly clearing it.
- Reuse the existing single-flight remote installed refresh loop so
newer queued auth refreshes replace older pending requests and the API
result eventually overwrites or clears the cache.
- Keep derived plugin/skills cache and MCP refresh side effects behind
the existing effective-plugin-changed task when the refreshed installed
state changes.
- Leave `clear_plugin_related_caches` scoped to derived plugin/skills
caches so share mutations do not drop remote installed plugins.

## Tests
- `cargo fmt --all --manifest-path codex-rs/Cargo.toml` (passes; stable
rustfmt warns that `imports_granularity = Item` is nightly-only)
- `cargo test -p codex-core-plugins remote_installed_cache`
- `cargo test -p codex-app-server
skills_list_loads_remote_installed_plugin_skills_from_cache`
2026-04-30 23:11:14 -07:00
Eric Traut
a93c89f497 Color TUI statusline from active theme (#19631)
## Why

Users have shared that the TUI can feel too visually flat because themes
mostly show up in code syntax highlighting. The configurable statusline
is a natural place to make the active theme more visible, while still
letting users keep the existing monotone statusline if they prefer it.

## What Changed

- Added a statusline styling helper that builds the rendered statusline
from `(StatusLineItem, text)` segments, preserving item identity while
keeping the plain text output unchanged.
- Derived foreground accent colors from the active syntax theme by
looking up TextMate scopes through the existing syntax highlighter, with
conservative ANSI fallbacks when a scope does not provide a foreground.
- Tuned theme-derived colors to keep the accents visible without making
the statusline feel overly bright.
- Added `[tui].status_line_use_colors`, defaulting to `true`, plus a
separated `/statusline` toggle so users can enable or disable
theme-derived statusline colors from the setup UI.
- Updated the live statusline and `/statusline` preview to use the same
styled builder, while keeping terminal-title preview text plain.
- Kept statusline separators and active-agent add-ons subdued while
removing blanket dimming from the whole passive statusline.

## Verification

- `cargo test -p codex-tui status_line`
- `cargo test -p codex-tui theme_picker`
- `cargo test -p codex-tui foreground_style_for_scopes`
- `cargo test -p codex-tui`
- `cargo test -p codex-config`
- `cargo test -p codex-core status_line_use_colors`
- `cargo insta pending-snapshots --manifest-path tui/Cargo.toml`

## Visual

<img width="369" height="23" alt="Screenshot 2026-04-30 at 6 16 08 PM"
src="https://github.com/user-attachments/assets/11d03efb-8e4f-4450-8f4d-00a9659ef4cd"
/>

<img width="385" height="23" alt="Screenshot 2026-04-30 at 6 16 02 PM"
src="https://github.com/user-attachments/assets/a3d89f36-bdc1-42e8-8e84-61350e3999e2"
/>
2026-04-30 22:42:48 -07:00
Eric Traut
d898cc8f3f Format multi-day goal durations in the TUI (#20558)
## Why

Goal mode shows elapsed time in compact hour/minute form. That is easy
to scan for shorter runs, but once a goal runs past 24 hours, large hour
counts become harder to read at a glance.

## What changed

Updated `codex-rs/tui/src/goal_display.rs` so unbudgeted goal elapsed
time keeps the existing compact format below one day, then switches to a
day-aware format once the elapsed time reaches 24 hours:

- `23h 59m`
- `1d 0h 0m`
- `2d 23h 42m`

The formatter now covers the 24-hour boundary in unit tests, and the TUI
status-line snapshot for a completed elapsed goal now exercises the
multi-day display.

## Verification

- `cargo test -p codex-tui`

Here's my longest-running test task:

<img width="186" height="23" alt="image"
src="https://github.com/user-attachments/assets/cedfcdab-7f6e-44e6-8495-8a39f63973fb"
/>
2026-04-30 22:42:07 -07:00
Tom
fe05acad23 Make thread store process-scoped (#19474)
- Build one app-server process ThreadStore from startup config and share
it with ThreadManager and CodexMessageProcessor.
- Remove per-thread/fork store reconstruction so effective thread config
cannot switch the persistence backend.
- Add params to ThreadStore create/resume for specifying thread
metadata, since otherwise the metadata from store creation would be used
(incorrectly).
2026-04-30 21:24:59 -07:00
pakrym-oai
f50c02d7bc [codex] Remove unused event messages (#20511)
## Why

Several legacy `EventMsg` variants were still emitted or mapped even
though clients either ignored them or had moved to item/lifecycle
events. `Op::Undo` had also degraded to an unavailable shim, so this
removes that dead task path instead of preserving a command that cannot
do useful work.

`McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are
intentionally kept because useful consumers still depend on them: MCP
startup completion drives readiness behavior, and the begin events let
app-server/core consumers surface in-progress web-search and
image-generation items before the final payload arrives.

## What Changed

- Removed weak legacy event variants and payloads from `codex-protocol`,
including legacy agent deltas, background events, and undo lifecycle
events.
- Kept/restored `EventMsg::McpStartupComplete`,
`EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with
serializer and emission coverage.
- Updated core, rollout, MCP server, app-server thread history,
review/delegate filtering, and tests to rely on the useful replacement
events that remain.
- Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI
slash-command comments.
- Stopped agent job/background progress and compaction retry notices
from emitting `BackgroundEvent` payloads.

## Verification

- `cargo check -p codex-protocol -p codex-app-server-protocol -p
codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
- `cargo test -p codex-protocol -p codex-app-server-protocol -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
- `cargo test -p codex-core --test all suite::items`
- `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core
-p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
- Earlier coverage on this PR also included `codex-mcp`, `codex-tui`,
core library tests, MCP/plugin/delegate/review/agent job tests, and MCP
startup TUI tests.
2026-04-30 20:03:26 -07:00
xli-oai
bb60b78c46 Surface admin-disabled remote plugin status (#20298)
## Summary

Remote plugin-service returns plugin availability separately from a
user's installed/enabled state. This adds `PluginAvailabilityStatus` to
the app-server protocol, propagates remote catalog `status` into
`PluginSummary`, and rejects install attempts for remote plugins marked
`DISABLED_BY_ADMIN` before downloading or caching the bundle.

This is the `openai/codex` half of the change. The companion
`openai/openai` webview PR is
https://github.com/openai/openai/pull/873269.

## Validation

- `cargo run -p codex-app-server-protocol --bin write_schema_fixtures`
- `cargo test -p codex-app-server --test all
plugin_list_marks_remote_plugin_disabled_by_admin`
- `cargo test -p codex-app-server --test all
plugin_list_includes_remote_marketplaces_when_remote_plugin_enabled`
- `cargo test -p codex-app-server --test all
plugin_install_rejects_remote_plugin_disabled_by_admin_before_download`
- `cargo test -p codex-app-server-protocol schema_fixtures`
2026-04-30 20:00:07 -07:00
Tom
c39824c2fd [codex] Improve PR babysitter CI diagnostics and guardrails (#20484)
## Summary

- Surface failed GitHub Actions jobs in the PR babysitter watcher so
Codex can fetch job logs as soon as a job fails, instead of waiting for
the overall workflow run to complete.
- Update babysit-pr skill instructions, GitHub API notes, and heuristics
to prefer direct job log archives before falling back to `gh run view
--log-failed`.
- Add guardrails requiring explicit user confirmation before posting
replies to human-authored review comments.
- Add guardrails preventing Codex from patching unrelated flaky tests,
CI infrastructure, runner issues, dependency outages, or other failures
not caused by the PR branch.

## Validation

- `python3 -m pytest
.codex/skills/babysit-pr/scripts/test_gh_pr_watch.py`
2026-04-30 19:58:19 -07:00
rhan-oai
6b1b227804 [codex-analytics] centralize thread analytics state (#20300)
## Why

Several analytics event families need the same per-thread attribution
state: the app-server client/runtime associated with a thread and, for
lifecycle-oriented events, the thread metadata captured during
initialization. Keeping connection ids and lifecycle metadata in
separate maps made each consumer rebuild the same thread context and
made subagent attribution harder to resolve consistently.

## What changed

- Replaces the separate thread connection and metadata maps with one
reducer-owned `threads` map.
- Routes guardian, compaction, turn-steer, and turn analytics through
shared thread-state lookups while preserving turn-origin attribution for
turn events and request-origin attribution for steer events.
- Lets newly observed spawned subagent threads inherit their parent
thread connection so later thread-scoped analytics can resolve through
the same state model.
- Adds regression coverage for standalone `SubAgentThreadStarted`
publication plus the `SubAgentSource::ThreadSpawn` parent fallback
through a thread-scoped consumer that depends on inherited connection
state.

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20300).
* #18748
* #18747
* #17090
* #17089
* #20239
* #20515
* #20514
* __->__ #20300
2026-04-30 18:58:50 -07:00
Ruslan Nigmatullin
972b819213 app-server: switch remote control to protocol v3 segmentation (#20341)
## Why

Remote-control protocol v3 makes segmentation an explicit wire-level
feature. The app-server transport needs to support that protocol
directly so large messages can be chunked, acknowledged, replayed, and
reassembled consistently.

## What changed

- Bump the remote-control websocket protocol version from `2` to `3`.
- Add explicit client/server chunk envelope variants plus chunk-aware
acknowledgements.
- Split oversized outbound server messages into bounded transport
chunks.
- Reassemble ordered inbound client chunks with bounded memory usage and
stream/client invalidation handling.
- Track inbound chunk cursors and outbound ack cursors as `(seq_id,
segment_id)` so duplicate chunks and partial replays behave correctly.
- Add focused coverage for chunk splitting, reassembly, duplicate
suppression, and stream replacement behavior.

## Validation

- Added targeted unit coverage for segmented message handling in
`remote_control`.
- Local validation is currently blocked before compilation because
`packageproxy` does not serve the locked `rustls-webpki 0.103.13`
dependency required by the workspace.
2026-04-30 18:27:16 -07:00
Dylan Hurd
af089fb21d fix(exec_policy) heredoc parsing file_redirect (#20113)
## Summary
Fixes a regression introduced in #10941 so that heredocs do not permit
file redirects to be approved by rules, and adds scenario tests to cover
this behavior.


Previously, heredoc command parsing would allow redirects and
environment variables:
```bash
# commands_for_exec_policy() would parse this via parse_shell_lc_single_command_prefix
PATH=/tmp/bad:$PATH cat <<'EOF' > /tmp/bad/hello.txt
hello
EOF
```
This conflicts with the Codex Rules documentation; heredoc parsing logic
should abide by the same strictness of parsing.


## Tests
- [x] Updated unit tests accordingly
- [x] Added scenario tests for these cases

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-01 01:05:02 +00:00
iceweasel-oai
4f96001fa7 execpolicy: unwrap PowerShell -Command wrappers on Windows (#20336)
## Why
On Windows, Codex runs shell commands through a top-level
`powershell.exe -NoProfile -Command ...` wrapper. `execpolicy` was
matching that wrapper instead of the inner command, so prefix rules like
`["git", "push"]` did not fire for PowerShell-wrapped commands even
though the same normalization already happens for `bash -lc` on Unix.

This change makes the Windows shell wrapper transparent to rule matching
while preserving the existing Windows unmatched-command safelist and
dangerous-command heuristics.

## What changed
- add `parse_powershell_command_plain_commands()` in
`shell-command/src/powershell.rs` to unwrap the top-level PowerShell
`-Command` body with `extract_powershell_command()` and parse it with
the existing PowerShell AST parser
- update `core/src/exec_policy.rs` so `commands_for_exec_policy()`
treats top-level PowerShell wrappers like `bash -lc` and evaluates rules
against the parsed inner commands
- carry a small `ExecPolicyCommandOrigin` through unmatched-command
evaluation and expose `is_safe_powershell_words()` /
`is_dangerous_powershell_words()` so Windows safelist and
dangerous-command checks still work after unwrap
- add Windows-focused tests for wrapped PowerShell prompt/allow matches,
wrapper parsing, and unmatched safe/dangerous inner commands, and
re-enable the end-to-end `execpolicy_blocks_shell_invocation` test on
Windows

## Testing
- `cargo test -p codex-shell-command`
2026-05-01 00:56:20 +00:00
Abhinav
0d9a5d20ec Alias codex_hooks feature as hooks (#20522)
# Why

The hooks feature flag should use the concise canonical name `hooks`,
while existing configs that still use `codex_hooks` continue to work
during the rename.

# What

- change the canonical `Feature::CodexHooks` key from `codex_hooks` to
`hooks`
- register `codex_hooks` through the existing legacy-alias path
- update the config schema and canonical config fixtures to prefer
`hooks`
- add regression coverage that both `hooks` and `codex_hooks` resolve to
`Feature::CodexHooks`

# Verification

- `cargo test -p codex-features`
- `cargo test -p codex-core config::schema_tests`
- `cargo test -p codex-core
pre_tool_use_blocks_shell_when_defined_in_config_toml`
- `cargo test -p codex-app-server
hooks_list_uses_each_cwds_effective_feature_enablement`
2026-05-01 00:46:33 +00:00
Owen Lin
5affb7f9d5 fix(app-server): mark thread/turns/list and exclude_turns as experime… (#20499)
…ntal

We have some bugs to work out and it is not quite ready to consume as a
public API.
2026-04-30 17:39:08 -07:00
xli-oai
acdf908268 Emit analytics for remote plugin installs (#20267)
## Summary

- emit `codex_plugin_installed` after a remote plugin install succeeds
- keep local installs unchanged, but let remote installs override the
analytics `plugin_id` with the backend remote plugin id
(`plugins~Plugin_...`)
- preserve the local/display identity in `plugin_name` and
`marketplace_name`, plus capability metadata from the installed bundle
- add regression coverage for local install analytics, remote install
analytics, and analytics id override serialization

## Testing

- `just fmt`
- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server`
2026-04-30 17:27:16 -07:00
Felipe Coury
b6f81257f8 feat(tui): add vim composer mode (#18595)
## Why

Codex now has configurable TUI keymaps, but the composer still behaves
like a plain text field. Users who prefer modal editing need a way to
keep Vim muscle memory while drafting prompts, and the keymap picker
needs to expose Vim-specific actions if those bindings are configurable
instead of hardcoded.

## What Changed

- Adds composer Vim mode with insert/normal state, common normal-mode
movement and editing commands, `d`/`y` operator-pending flows, and
mode-aware footer and cursor indicators.
- Adds `/vim`, an optional global `toggle_vim_mode` binding, and
`tui.vim_mode_default` so Vim mode can be toggled per session or enabled
as the default composer state.
- Extends runtime and config keymaps with `vim_normal` and
`vim_operator` contexts, exposes those contexts in `/keymap`, refreshes
the config schema, and validates Vim bindings separately.
- Integrates Vim normal mode with existing composer behavior: `/` opens
slash command entry, `!` enters shell mode, `j`/`k` navigate history at
history boundaries, successful submissions reset back to normal mode,
and paste burst handling remains insert-mode only.
- Teaches the TUI render path to apply and restore cursor style so Vim
insert mode can use a bar cursor without leaving the terminal in that
state after exit.

## Validation

- `cargo test -p codex-tui keymap -- --nocapture` on the keymap/Vim
coverage
- `cargo insta pending-snapshots`

## Docs

This introduces user-facing `/vim`, `tui.vim_mode_default`, and Vim
keymap contexts under `tui.keymap`, so the public CLI configuration and
slash-command docs should be updated before the feature ships.
2026-04-30 17:20:51 -07:00
maja-openai
a5ebedef67 Bypass review for always-allow MCP tools in auto-review (#20069)
## Why

When an MCP or app tool is configured with approval mode `approve`
(always allow), users expect that decision to be authoritative. In
guardian auto-review mode, ARC could still return `ask-user`, which then
routed the approval question into guardian with the ARC reason as
context. That meant a tool explicitly configured as always allowed still
went through both safety monitors before running.

This change keeps the existing ARC behavior for non-auto-review
sessions, but avoids the ARC-to-guardian sequence when
`approvals_reviewer = auto_review` and the tool approval mode is
`approve`.

## What changed

- Short-circuit MCP tool approval handling when `approval_mode ==
approve` and `approvals_reviewer == auto_review`.
- Updated the MCP approval regression test so the auto-review case
asserts neither ARC nor guardian is called.
- Preserved existing tests that verify ARC can still block always-allow
MCP tools outside guardian auto-review mode.

## Verification

- `cargo test -p codex-core --lib mcp_tool_call`
2026-04-30 16:44:09 -07:00
Owen Lin
5de7992ee5 fix(tui): set persist_extended_history: false (#20502)
Large rollouts are no good. This updates the TUI to behave the same as
the Codex App, which is also turning it off.
2026-04-30 23:31:31 +00:00
xli-oai
2686873e77 Sync remote installed plugin bundles (#20268)
## Summary
- Download missing remote installed plugin bundles during app-server
startup and plugin/list refresh.
- Upgrade cached remote installed bundles when the backend installed
version changes.
- Remove stale remote installed bundle caches without writing remote
plugin state into config.toml.

## Review note
This is a clean PR branch cut from the current diff on top of latest
`origin/main`. The diff intentionally has no `codex-rs/core/**` files,
so CODEOWNERS should not request the core-directory owner review from
stale PR history.

## Validation
Already run on the source branch before creating this clean PR:
- `just fmt`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-app-server --test all
app_server_startup_sync_downloads_remote_installed_plugin_bundles --
--nocapture`
- `cargo test -p codex-app-server --test all
plugin_list_sync_upgrades_and_removes_remote_installed_plugin_bundles --
--nocapture`
- `cargo test -p codex-app-server --test all
app_server_startup_remote_plugin_sync_runs_once -- --nocapture`
- `just fix -p codex-core-plugins`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-30 16:05:14 -07:00
Owen Lin
9ddb267e9c fix: ignore dangerous project-level config keys (#20098)
## Description
Ignore these top-level config keys when loading project-scoped
config.toml files:
```
    "openai_base_url",
    "chatgpt_base_url",
    "model_provider",
    "model_providers",
    "profile",
    "profiles",
    "experimental_realtime_ws_base_url",
```

## What changed

- Add a project-local config denylist for credential-routing fields such
as `openai_base_url`, `chatgpt_base_url`, `model_provider`,
`model_providers`, `profile`, `profiles`, and
`experimental_realtime_ws_base_url`.
- Strip those fields from project config layers before they participate
in effective config merging, while leaving safe project-local settings
intact.
- Track ignored project-local keys on config layers and surface a
startup warning telling users to move those settings to user-level
`config.toml` if they intentionally need them.
- Update profile behavior coverage so project-local `profile` /
`profiles` entries are ignored instead of overriding user-level profile
selection.

## Verification

- `cargo test -p codex-config`
- `cargo test -p codex-core
project_layer_ignores_unsupported_config_keys`
- `cargo test -p codex-core project_profiles_are_ignored`
- `cargo test -p codex-core config::config_loader_tests`
2026-04-30 23:03:01 +00:00
Owen Lin
6014b6679f fix flaky test falls_back_to_registered_fallback_port_when_default_po… (#20504)
…rt_is_in_use
2026-04-30 22:06:04 +00:00
Akshay Nathan
8426edf71e Stateful streaming apply_patch parser 2026-04-30 21:41:15 +00:00
xl-openai
7b3de63041 Move plugin out of core. (#20348) 2026-04-30 14:26:14 -07:00
Tom
127be0612c [codex] Migrate thread turns list to thread store (#19280)
- migrate `thread/turns/list` to ThreadStore. Uses ThreadStore for most
data now but merges in the in-memory state from thread manager
- keep v2 `thread/list` pathless-store friendly by converting
`StoredThread` directly to API `Thread`
- add regression coverage for pathless store history/listing
2026-04-30 14:16:42 -07:00
alexsong-oai
9121132c8f Send external import completion for sync imports (#20379) 2026-04-30 13:03:21 -07:00
Matthew Zeng
70090c9ff7 [plugin] Add Canva to suggesteable list. (#20474)
- [x] Add Canva to suggesteable list.
2026-04-30 12:39:52 -07:00
iceweasel-oai
8121710ffe install WFP filters for Windows sandbox setup (#20101)
## Summary

This PR installs a first wave of WFP (Windows Filtering Platform)
filters that reduce the surface area of network egress vulnerabilities
for the Windows Sandbox.

- Add persistent Windows Filtering Platform provider, sublayer, and
filters for the Windows sandbox offline account.
- Install WFP filters during elevated full setup, log failures
non-fatally, and emit setup metrics when analytics are enabled.
- Bump the Windows sandbox setup version so existing users rerun full
setup and receive the new filters.

## What WFP is
Windows Filtering Platform (WFP) is the low-level Windows networking
policy engine underneath things like Windows Firewall. It lets
privileged code install persistent filtering rules at specific network
stack layers, with conditions like "only traffic from this Windows
account" or "only this remote port," and an action like block.

In this change, we create a Codex-owned persistent WFP provider and
sublayer, then install block filters scoped to the Windows sandbox's
offline user account via `ALE_USER_ID`. That means the filters are
targeted at sandboxed processes running as that account, rather than
globally affecting the host.

## Initial filter set
We are starting with 12 concrete WFP filters across a few high-value
bypass surfaces. The table below describes the filter families rather
than one filter per row:

| Area | Concrete filters | Purpose |
| --- | --- | --- |
| ICMP | 4 filters: ICMP v4/v6 on `ALE_AUTH_CONNECT` and
`ALE_RESOURCE_ASSIGNMENT` | Block direct ping-style network reachability
checks from the offline account. |
| DNS | 2 filters: remote port `53` on `ALE_AUTH_CONNECT_V4/V6` | Block
direct DNS queries that bypass our intended proxy/offline path. |
| DNS-over-TLS | 2 filters: remote port `853` on
`ALE_AUTH_CONNECT_V4/V6` | Block encrypted DNS attempts that could
bypass ordinary DNS interception. |
| SMB / NetBIOS | 4 filters: remote ports `445` and `139` on
`ALE_AUTH_CONNECT_V4/V6` | Block Windows file-sharing/network share
traffic from sandboxed processes. |

For IPv4/IPv6 coverage, the port-based filters are installed on both
`ALE_AUTH_CONNECT_V4` and `ALE_AUTH_CONNECT_V6`. ICMP also gets both
connect-layer and resource-assignment-layer coverage because ICMP
traffic is shaped differently from ordinary TCP/UDP port traffic.

## Validation
- `cargo fmt -p codex-windows-sandbox` (completed with existing
stable-rustfmt warnings about `imports_granularity = Item`)
- `cargo test -p codex-windows-sandbox wfp::tests`
- `cargo test -p codex-windows-sandbox` (fails in existing legacy
PowerShell sandbox tests because `Microsoft.PowerShell.Utility` could
not be loaded; WFP tests passed before that failure)
2026-04-30 12:39:01 -07:00
Owen Lin
7dd08e304c feat(rollouts): store EventMsg::ApplyPatchEnd in limited history mode (#20463)
The Codex App treats apply patch tool calls quite load-bearing in the UI
(always shown on a completed turn), so we'd like to persist
`EventMsg::ApplyPatchEnd` to guarantee that when a client reconnects to
app-server mid-turn, we always have the full diff to display at the end
of that turn.
2026-04-30 12:11:02 -07:00
iceweasel-oai
06f3b4836a [codex] Fix elevated Windows sandbox named-pipe access (#20270)
## Summary
- add elevated-only token constructors that include the current token
user SID in the restricted SID list
- switch the elevated Windows command runner to use those constructors
- leave the unelevated restricted-token path unchanged

## Why
Windows named pipes created by tools like Ninja use the platform's
default named-pipe ACL when no explicit security descriptor is provided.
In the elevated sandbox, the pipe owner has access, but the
write-restricted token can still fail its restricted-SID access check
because the sandbox user SID was not in the restricting SID set. That
causes child processes to exit successfully while Ninja never receives
the expected pipe completion/close behavior and hangs.

Including the elevated sandbox user's SID in the restricting SID list
lets the restricted check succeed for these owner-scoped pipe objects
without broadening the unelevated sandbox to the real signed-in user.

## Impact
- fixes the minimal Ninja hang repro in the elevated Windows sandbox
- preserves the existing unelevated sandbox behavior and write
protections
- keeps the change scoped to the elevated runner rather than changing
shared token semantics
- this does not affect file-writes for the sandbox because the sandbox
users themselves do not receive any additional permissions over what the
capability SIDs already have. In fact we don't even explicitly grant the
sandbox user ACLs anywhere.

## Validation
- `cargo build -p codex-windows-sandbox --quiet`
- verified the stock `ninja.exe` minimal repro exits normally on host
and in the elevated sandbox
- verified the same repro still hangs in the unelevated sandbox, which
is the intended scope of this change
2026-04-30 12:06:11 -07:00
Celia Chen
31f8813e3e fix: show correct Bedrock runtime endpoint in /status (#20275)
## Why

`/status` was showing the configured `ModelProviderInfo.base_url` for
Amazon Bedrock, which can be stale or misleading because the actual
Bedrock Mantle endpoint is derived at runtime from the resolved AWS
region. This made sessions report the wrong provider endpoint even
though requests used the correct runtime URL.

## What changed

- Added `ModelProvider::runtime_base_url()` so provider implementations
can expose the request-time base URL through the shared runtime provider
abstraction.
- Moved Bedrock region-to-Mantle URL resolution into
`amazon_bedrock::mantle::runtime_base_url()`, keeping region resolution
private to the Mantle module.
- Overrode `runtime_base_url()` for Amazon Bedrock so it returns the
resolved Mantle endpoint instead of the configured default.
- Resolved and cached the runtime provider base URL during TUI startup,
then used that cached value when rendering `/status`.
- Added status coverage that verifies Bedrock displays the runtime URL
and ignores the configured Bedrock `base_url` when they differ.

## Verification
model provider is resolved correctly in local build:
<img width="696" height="245" alt="Screenshot 2026-04-29 at 5 01 36 PM"
src="https://github.com/user-attachments/assets/a13c10a5-3720-41ab-8ace-3c4bc573f971"
/>
2026-04-30 19:02:34 +00:00
Abhinav
93d53f655b Add /hooks browser for lifecycle hooks (#19882)
## Why

`hooks/list` and `hooks/config/write` give us read/write access to hooks
and their state. This hooks up the TUI as a client so users can inspect
and manage that state directly.

## What

- add a two-page `/hooks` browser in the TUI: an event overview with
installed/active counts, followed by a per-event handler page with
toggle controls and detail rendering
- thread managed-state metadata through hook discovery and `hooks/list`
so the UI can label admin-managed hooks and suppress toggles for them
- persist hook toggles through the existing config-write path and add
snapshot coverage for the event list, handler list, managed-hook, and
empty states

## Stack

1. openai/codex#19705
2. openai/codex#19778
3. openai/codex#19840
4. This PR - openai/codex#19882

## Reviewer Notes

- Main UI logic is in
`codex-rs/tui/src/bottom_pane/hooks_browser_view.rs`; most of the diff
is the new view plus its snapshot coverage
- Request / write plumbing for opening the browser and persisting
toggles is in `codex-rs/tui/src/app/background_requests.rs` and
`codex-rs/tui/src/chatwidget/hooks.rs`
- Outside the TUI, the only behavioral change in this PR is threading
`is_managed` through hook discovery and `hooks/list` so managed hooks
render as non-toggleable
- The `codex-rs/tui/src/status/snapshots/` churn is unrelated merge
fallout from the stacked base branch's newer permission-label rendering

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 11:58:27 -07:00
khoi
719431da6e [Codex] Add browser use external feature flag (#20245)
## Summary

- Adds a separate feature control for external-browser Browser Use
integrations.
- Registers `browser_use_external` as a stable, default-enabled
requirements-owned feature key.
- Updates feature registry tests and regenerates the config schema.

Codex validation:
- `cargo fmt -- --config imports_granularity=Item`
- `cargo run -p codex-core --bin codex-write-config-schema`
- `cargo test -p codex-features`

## Addendum

This gives enterprise policy a coarse control for Browser Use outside
the Codex-managed in-app browser. The existing `browser_use` feature is
the Browser Use control, while `browser_use_external` can gate
extension/native integrations for external browsers as that surface
grows
2026-04-30 11:53:19 -07:00
pakrym-oai
b52083146c Stop emitting item/fileChange/outputDelta output delta notifications (#20471)
## Why

`item/fileChange/outputDelta` text output was only the tool's summary or
error text and not used by client surfaces.

We keep `item/fileChange/outputDelta` in the app-server protocol as a
deprecated compatibility entry, but the server no longer emits it.

## What changed

- stop the `apply_patch` runtime from emitting `ExecCommandOutputDelta`
events
- simplify `item_event_to_server_notification` so command output deltas
always map to `item/commandExecution/outputDelta`
- remove the app-server bookkeeping that tried to detect whether an
output delta belonged to a file change
- mark `item/fileChange/outputDelta` as a deprecated legacy protocol
entry in the v2 types, schema, and README
- simplify the file-change approval tests so they only wait for
completion instead of expecting output-delta notifications

## Testing

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-thread-manager-sample`
- `cargo test -p codex-app-server-protocol
protocol::event_mapping::tests::exec_command_output_delta_maps_to_command_execution_output_delta
-- --exact`
- `cargo test -p codex-app-server
turn_start_file_change_approval_accept_for_session_persists_v2 --
--exact` *(failed before the test assertions because the wiremock
`/responses` mock received 0 requests in setup)*
2026-04-30 11:42:07 -07:00
Eric Traut
f2bc2f26a9 Remove core protocol dependency [2/2] (#20325)
## Why

With the local model layer and app-server routing in place from PR1,
this PR moves the active TUI runtime onto app-server notifications. The
affected pieces share the same event flow, so the command surface,
session state, bottom-pane prompts, chat rendering, history/status
views, and tests move together to keep the stacked branch buildable.

This PR also removes the obsolete compatibility surface that is no
longer used after the migration. The proposed protocol-boundary verifier
layer was dropped from the stack; enforcing that final boundary will be
simpler once `codex-tui` no longer needs any `codex_protocol`
references.

This PR is part 2 of a 2-PR stack:

1. Add TUI-owned replacement models and extract app-server event
routing.
2. Move the active TUI flow to app-server notifications and delete
obsolete adapter code.

## What changed

- Rewired app command and session handling to use app-server request and
notification shapes.
- Moved approval overlays, request-user-input flows, MCP elicitation,
realtime events, and review commands onto the app-server-facing model
surface.
- Updated chat rendering, history cells, status views, multi-agent UI,
replay state, and TUI tests to use app-server notifications plus the
local models introduced in PR1.
- Deleted `codex-rs/tui/src/app/app_server_adapter.rs` and the
superseded `chatwidget/tests/background_events.rs` fixture path.

## Verification

- `cargo check -p codex-tui --tests`
- Top of stack: `cargo test -p codex-tui`
2026-04-30 11:34:34 -07:00
pakrym-oai
5cc5f12efc Move item event mapping into app-server-protocol (#20299)
## Why

Follow-up to #20291.

The v2 item-event-to-notification translation had been embedded in
`app-server/src/bespoke_event_handling.rs`, which made it hard to reuse
anywhere else. This PR moves that stateless mapping into shared protocol
code so other entry points can produce the same `ServerNotification`
payloads without copying app-server logic.

That also lets `thread-manager-sample` demonstrate the same notification
surface that the app server exposes, instead of only printing the final
assistant message.

## What changed

- move `item_event_to_server_notification` into
`codex-app-server-protocol::protocol::event_mapping`
- keep the mapper tests next to the shared implementation in
`codex-app-server-protocol`
- re-export the mapper from `codex-core-api` so lightweight consumers
can use it without reaching into `app-server-protocol` directly
- simplify `app-server/src/bespoke_event_handling.rs` so it delegates
the stateless event-to-notification projection to the shared helper
- update `thread-manager-sample` to:
  - print mapped notifications as newline-delimited JSON
  - use the shared mapper through `codex-core-api`
- enable the default feature set so the sample exposes the normal tool
surface
- use a `read_only` permission profile so shell commands can run in the
sample without widening permissions

## Testing

- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-core-api`
- `cargo test -p codex-app-server bespoke_event_handling::tests`
- `cargo test -p codex-thread-manager-sample`
- `cargo run -p codex-thread-manager-sample -- "briefly explore the repo
with pwd and ls, then summarize it"`
2026-04-30 11:02:13 -07:00
Eric Traut
c70cdc108f Remove core protocol dependency [1/2] (#20324)
## Why

This stack moves `codex-tui` away from the core protocol event surface
and toward app-server API shapes plus TUI-owned local models. This first
PR sets up the lower-risk foundation: it introduces the local model
surface and extracts app-server event routing into focused TUI modules
while preserving the existing behavior for the larger migration in PR2.

This PR is part 1 of a 2-PR stack:

1. Add TUI-owned replacement models and extract app-server event
routing.
2. Move the active TUI flow to app-server notifications and delete
obsolete adapter code.

## What changed

- Added TUI-owned approval, diff, session state, session resume, token
usage, and user-message models.
- Added `app/app_server_event_targets.rs` and `app/app_server_events.rs`
to hold app-server event targeting and dispatch logic outside `app.rs`.
- Updated app/status tests to use the local model layer and added
focused routing coverage.
- Boxed a few large async TUI test futures so this base layer remains
checkable without overflowing the default test stack.

## Verification

- `cargo check -p codex-tui --tests`
2026-04-30 10:52:19 -07:00
teddywyly-oai
487716ae74 [Extension] Allowlist Chrome Extension in the tool_suggest tool (#20458)
### Summary
Allowlist chrome extension in tool_suggest tool

### Screenshot
Allowlist chrome extension in tool_suggest tool
<img width="808" height="309" alt="chrome_internal"
src="https://github.com/user-attachments/assets/ed769d77-b635-4a40-a0c5-fbff05af3036"
/>
2026-04-30 10:29:03 -07:00
canvrno-oai
a85d265097 /plugins: remove marketplace (#19843)
This PR adds marketplace removal to the /plugins menu, giving users a
way to remove user-configured plugin marketplaces. It adds a `Ctrl+R`
shortcut to remove selected marketplace tabs, a confirmation prompt,
loading and error states, and the app-server request flow needed to
perform marketplace/remove. After a successful removal, the TUI
refreshes config, plugin mentions, user config, and plugin data so the
removed marketplace disappears from the menu and other surfaces in the
TUI.

- Add `Ctrl+R` removal option for user-configured marketplace tabs
- Show marketplace removal confirmation, loading, and error states
- Route `marketplace/remove` through the TUI background request flow
- Refresh config, plugin mentions, and plugin data after successful
removal
- Adds reusable per-tab footer hints so removal guidance only appears on
applicable tabs
- Add test coverage for `Ctrl+R` behavior while plugin search is active

Steps to test:
- Add a marketplace using the TUI /plugins menu
- Use Ctrl+R to remove the marketplace
- Accept the confirmation prompt
- Confirm the marketplace is removed when the process completes.
2026-04-30 10:25:07 -07:00
Eric Traut
c02814c106 Mark goals feature as experimental (#20083)
## Why

The `goals` feature flag is ready to move out of the hidden
under-development bucket and into the user-facing experimental surface.
Marking it experimental lets users discover it through the experimental
features UI while still making clear that it is opt-in.

## What changed

- Changed `goals` from `Stage::UnderDevelopment` to
`Stage::Experimental` in `codex-rs/features/src/lib.rs`.
- Added experimental menu metadata for the feature with the description
`Set a persistent goal Codex can continue over time`.

## Verification

- `cargo test -p codex-features`
2026-04-30 10:06:44 -07:00
Owen Lin
3516cb9751 fix(core): truncate large mcp tool outputs in rollouts (#20260)
## Why
Large MCP tool call outputs can make rollout JSONL files enormous. In
the session that motivated this change, the biggest JSONL records were:
- `event_msg/mcp_tool_call_end`
- `response_item/function_call_output`

both containing the same unbounded MCP payloads - just 3 MCP tool calls
that each were multi-hundred MBs 😱

This PR truncates both of those JSONL records.

## How

#### For `response_item/function_call_output`
Unified exec already bounds tool output before it is injected into
model-facing history, which also keeps the corresponding rollout
`response_item/function_call_output` records small.

MCP should follow the same pattern: truncate the model-facing tool
output at the tool-output boundary, while leaving code-mode/raw hook
consumers alone.

#### For `event_msg/mcp_tool_call_end`
`McpToolCallEnd` also needs its own bounded event copy because it is the
app-server/replay/UI event shape that backs `ThreadItem::McpToolCall`.
Unfortunately this is _not_ downstream of the `ToolOutput` trait.

## Model behavior 
Model behavior is actually unchanged as a result of this PR. 

Before this PR, MCP output was:
1. Converted to `FunctionCallOutput`.
2. Recorded into in-memory history.
3. Truncated by `ContextManager::record_items()` before later model
turns saw it.

After this branch, MCP output is truncated earlier, in
`McpToolOutput::response_payload()`, using the same helper. Then
`ContextManager::record_items()` sees an already-truncated output and
effectively has little/no additional work to do.

So the model should still see the same kind of truncated function-call
output. The practical difference is where truncation happens: earlier,
before rollout persistence/app-server emission can see the giant
payload.

## Verification

- `cargo test -p codex-core mcp_tool_output`
- `cargo test -p codex-core
mcp_tool_call::tests::truncate_mcp_tool_result_for_event`
- `cargo test -p codex-core
mcp_post_tool_use_payload_uses_model_tool_name_args_and_result`
- `just fmt`
- `just fix -p codex-core`
- `git diff --check`
2026-04-30 16:30:43 +00:00
Ahmed Ibrahim
8a97f3cf03 realtime: rename provider session ids (#20361)
## Summary

Codex is repurposing `session` to mean a thread group, so the realtime
provider session id should no longer use `session_id` / `sessionId` in
Codex-facing protocol payloads. This PR renames that provider-specific
field to `realtime_session_id` / `realtimeSessionId` and intentionally
breaks clients that still send the old field names.

## What Changed

- Renamed realtime provider session fields in `ConversationStartParams`,
`RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
- Renamed app-server v2 realtime request and notification fields to
`realtimeSessionId`.
- Removed legacy serde aliases for `session_id` / `sessionId`; clients
must send the new names.
- Propagated the rename through core realtime startup, app-server
adapters, codex-api websocket handling, and TUI realtime state.
- Regenerated app-server protocol schema/TypeScript outputs and updated
app-server README examples.
- Kept upstream Realtime API concepts unchanged: provider `session.id`
parsing and `x-session-id` headers still use the upstream wire names.

## Testing

- CI is running on the latest pushed commit.
- Earlier local verification on this PR:
  - `cargo test -p codex-protocol`
- `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
realtime_conversation`
  - `cargo test -p codex-app-server-protocol`
- `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
realtime_conversation`
- attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
linker bus error while linking the test binary)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 13:39:48 +03:00
jif-oai
c37f7434ba Gate multi-agent v2 tools independently of collab (#20246)
## Why

`multi_agents_v2` is meant to be independently gated from the older
`collab` feature. The tool registry still treated the
collaboration-style agent tools as `collab`-only, so enabling
`multi_agents_v2` without `collab` omitted the v2 agent tools. Review
and guardian sub-sessions also need to keep agent spawning disabled even
when the outer session has `multi_agents_v2` enabled.

## What changed

- Include the collab-backed agent tools when either `multi_agents_v2` or
`collab` is enabled.
- Explicitly disable `multi_agents_v2` for review and guardian review
sub-sessions, matching the existing `spawn_csv` and `collab`
restrictions.
- Add a registry test that enables `multi_agents_v2`, disables `collab`,
and verifies the v2 agent tools are present while legacy `send_input`
and `resume_agent` remain hidden.

## Testing

- Added
`test_build_specs_multi_agent_v2_does_not_require_collab_feature`.
2026-04-30 10:23:31 +02:00
Eric Traut
a73403a890 Make missing config clears no-ops (#20334)
## Why

Fixes #20145.

`config/value/write` treats a JSON `null` value as a request to clear
the config key. Clearing a key that is already absent should be
idempotent, but clearing a nested key such as `features.personality`
from an empty `config.toml` returned `configPathNotFound` because
`clear_path` treated the missing `features` parent table as an error.

That makes app-server reset flows brittle because clients have to read
first and avoid sending a clear request unless the parent path already
exists.

## What Changed

- Updated app-server config clearing so missing intermediate tables, or
non-table parents, are treated as an unchanged no-op.
- Removed the now-unreachable `MergeError::PathNotFound` path from
config write merging.
- Added a regression test covering `features.personality = null` against
an empty user config.

## Verification

- `cargo test -p codex-app-server clear_missing_nested_config_is_noop`
- `cargo test -p codex-app-server` was run; the config manager unit
suite passed, but one unrelated integration test failed because
`turn_start_emits_thread_scoped_warning_notification_for_trimmed_skills`
expected `7` trimmed skills and observed `8`.
- `just fix -p codex-app-server`
2026-04-30 10:13:33 +02:00
xl-openai
87d0cf1a62 feat: Add workspace plugin sharing APIs (#20278)
1. Adds v2 plugin/share/save, plugin/share/list, and plugin/share/delete
RPCs.
2. Implements save by archiving a local plugin root, enforcing a size
limit, uploading through the workspace upload flow, and supporting
updates via remotePluginId.
3. Lists created workspace plugins
4. Deletes a previously uploaded/shared plugin.
2026-04-29 23:49:20 -07:00
Michael Bolin
ae863e72a2 ci: increase Windows release workflow timeouts (#20343)
## Why

#20271 increased the `90`-minute timeout in `rust-release.yml`, but it
did not update the reusable Windows workflow in
`rust-release-windows.yml`. As a result, the Windows release compile
jobs were still capped at `60` minutes and the `windows-x64` primary
build could continue timing out.

We are keeping the existing `90`-minute timeout in `rust-release.yml`.
That increase was still directionally correct because the top-level
release build benefits from extra headroom; the mistake was assuming it
also covered the reusable Windows jobs.

## What Changed
- increase the reusable Windows release workflow timeouts in
`rust-release-windows.yml` from `60` minutes to `90` minutes
- update the comment in `rust-release.yml` so it no longer implies that
the top-level timeout covers the Windows reusable jobs
2026-04-29 23:27:04 -07:00
Abhinav
8f3c06cc97 Add persisted hook enablement state (#19840)
## Why

After `hooks/list` exposes the hook inventory, clients need a way to
persist user hook preferences, make those changes effective in
already-open sessions, and distinguish user-controllable hooks from
managed requirements without adding another bespoke app-server write
API.

## What

- Extends `hooks/list` entries with effective `enabled` state.
- Persists user-level hook state under `hooks.state.<hook-id>` so the
model can grow beyond a single boolean over time.
- Uses the existing `config/batchWrite` path for hook state updates
instead of introducing a dedicated hook write RPC.
- Refreshes live session hook engines after config writes so
already-open threads observe updated enablement without a restart.

## Stack

1. openai/codex#19705
2. openai/codex#19778
3. This PR - openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

The generated schema files account for much of the raw diff. The core
behavior is in:

- `hooks/src/config_rules.rs`, which resolves per-hook user state from
the config layer stack.
- `hooks/src/engine/discovery.rs`, which projects effective enablement
into `hooks/list` from source-derived managedness.
- `config/src/hook_config.rs`, which defines the new `hooks.state`
representation.
- `core/src/session/mod.rs`, which rebuilds live hook state after user
config reloads.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 04:46:32 +00:00
Michael Bolin
ac4332c05b permissions: expose active profile metadata (#20095) 2026-04-29 20:54:59 -07:00
Matthew Zeng
ebe602d005 [plugins] Allow MSFT curated plugins in tool_suggest (#20304)
## Summary
- [x] Move the allowlist out of core crate
- [x] Add Teams, SharePoint, Outlook Email, and Outlook Calendar to the
tool_suggest discoverable plugin allowlist
- [x] Add focused coverage for Microsoft curated plugin discovery

## Testing
- just fmt
- cargo test -p codex-core-plugins
- cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_
2026-04-29 19:45:52 -07:00
pakrym-oai
4e677d62da app-server: remove dead api version handling from bespoke events (#20291)
Remove ApiVersion::V1
2026-04-30 01:55:44 +00:00
rhan-oai
bb536d65bd [codex-analytics] prevent stale guardian events from satisfying reused reviews (#20080)
## Why

Reused Guardian review trunks can still have older child-turn events
queued when a later review starts. The review waiter currently accepts
the first terminal event it sees from the shared child session, so a
stale `TurnComplete` can be attributed to the new review. That produces
impossible analytics combinations such as non-null TTFT with sub-10 ms
completion latency and zero token deltas on `trunk_reused` reviews.

## What changed

- Preserve the child turn id returned by the Guardian review
`Op::UserTurn` submission.
- Restrict Guardian review waiting to events correlated with that
submitted child turn.
- Restrict timeout/abort draining to terminal events for the same child
turn.
- Add regression coverage for stale prior-turn completions, stale
prior-turn errors, and interrupt draining in
`codex-rs/core/src/guardian/review_session.rs`.

## Verification

- `cargo test -p codex-core guardian::review_session::tests::`
- `cargo clippy -p codex-core --tests -- -D warnings`
2026-04-29 18:26:39 -07:00
Alex Zamoshchin
8b07132e09 update codex_plugins_beta_setting (from workspace settings) (#20250)
update the name after rename internally

see https://github.com/openai/openai/pull/871006
2026-04-30 00:40:25 +00:00
Eric Traut
515aa9a4fb tui: return from side chat on Ctrl-D (#20282)
## Why

Fixes #20264.

Side conversations are an ephemeral layer on top of the main chat.
Pressing `Ctrl+D` from an empty side-chat composer should unwind back to
the parent thread, matching the existing side-return behavior, instead
of falling through to the global quit shortcut and exiting Codex.

## What changed

The side-return shortcut matcher now treats `Ctrl+D` the same way it
already treats `Esc` and `Ctrl+C`. Because app-level side-return
handling runs before the chat widget's global quit handling, this
returns from `/side` while preserving normal `Ctrl+D` quit behavior
outside side conversations.

The existing shortcut coverage was updated to include lowercase and
uppercase `Ctrl+D` key events.

## Verification

- `cargo test -p codex-tui
side_return_shortcuts_match_esc_ctrl_c_and_ctrl_d`
- `cargo test -p codex-tui` starts successfully and the new shortcut
test passes, but the broader suite later aborts in the unrelated
existing test
`app::tests::attach_live_thread_for_selection_rejects_unmaterialized_fallback_threads`
with a stack overflow.
2026-04-29 17:26:11 -07:00
pakrym-oai
fedcefe9da Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider
2026-04-29 17:22:41 -07:00
stefanstokic-oai
c8abcbf925 Import external agent sessions in background (#20284)
Summary:
- Return from external agent import before session history import
finishes
- Run session import work in the background and emit the existing
completion notification when it is done
- Serialize session imports so duplicate requests do not create
duplicate imported threads

Verification:
- cargo test -p codex-app-server external_agent_config_
- cargo test -p codex-external-agent-sessions
- just fix -p codex-app-server
- just fix -p codex-external-agent-sessions
- git diff --check
2026-04-30 00:00:41 +00:00
alexsong-oai
7bcd4626c4 Consume ai-title from external sessions and add end marker (#20261)
## Summary
- Support Claude Code `ai-title` / `aiTitle` records when detecting and
importing external agent sessions.
- Preserve existing `custom-title` / `customTitle` precedence; only fall
back to `aiTitle` when no custom title is present.
- Add coverage for both detection and import title selection, including
the custom-title-over-ai-title case.

## Testing
- `cargo test -p codex-external-agent-sessions`
- `just fix -p codex-external-agent-sessions`
2026-04-30 00:00:13 +00:00
Abhinav
8774229a89 Add hooks/list app-server RPC (#19778)
## Why

We need a way to list the available hooks to expose via the TUI and App
so users can view and manage their hooks

## What

- Adds `hooks/list` for one or more `cwd` values that returns discovered
hook metadata

## Stack

1. openai/codex#19705
2. This PR - openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Review Notes

The generated schema files account for most of the raw diff, these files
have the core change:

- `hooks/src/engine/discovery.rs` builds the inventory entries during
hook discovery while leaving runtime handlers focused on execution.
- `app-server/src/codex_message_processor.rs` wires `hooks/list` into
the app-server flow for each requested `cwd`.
- `app-server-protocol/src/protocol/v2.rs` defines the new v2
request/response payloads exposed on the wire.

### Core Changes

`core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so
`skills/list` and `hooks/list`can resolve plugin state for each
requested `cwd`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 23:39:57 +00:00
Michael Bolin
6eab7519b4 chore: increase release build timeout from 60 min to 90 (#20271)
Build times are creeping up, so increase the timeout as a precaution.
2026-04-29 16:19:59 -07:00
rafael-jac
98f67b15d3 Update Codex login success page UX (#20136)
## Summary

update the local login success page to match the Codex desktop auth UX
use theme-aware colors and an inline 20px Codex mark
keep the actual localhost success page aligned with the browser auth UX
PR

## Tests

<img width="1728" height="1117" alt="Screenshot 2026-04-29 at 12 00
34 PM"
src="https://github.com/user-attachments/assets/76a40c3f-07c3-452c-97da-e7c43717cd2c"
/>
2026-04-29 19:14:53 -04:00
evawong-oai
74f06dcdfb Enforce workspace metadata protections in Linux sandbox (#19852)
## Summary

Enforce FileSystemSandboxPolicy protected metadata names in the Linux
bubblewrap adapter so `.git`, `.agents`, and `.codex` remain read only
inside writable workspace roots unless the policy grants an explicit
write carveout.

## Scope

1. Translate protected metadata names from FileSystemSandboxPolicy into
bubblewrap masks for existing metadata paths.
2. Represent missing protected metadata paths as guarded mount targets
so agents cannot create `.git`, `.agents`, or `.codex` under writable
roots.
3. Preserve normal git discovery for existing repos, worktrees, and
parent repos.
4. Keep explicit user write grants working when policy allows a
protected metadata path directly.

## Not in scope

1. No shell preflight UX.
2. No TUI runtime profile propagation.
3. No macOS Seatbelt changes in this PR.

## Reviewer focus

1. This should be reviewed as the Linux enforcement adapter for the
policy primitive from PR 19846.
2. macOS enforcement already landed in PR 19847.
3. The important invariant is that `FileSystemSandboxPolicy` is the
source of truth for `.git`, `.agents`, and `.codex`.

## Validation

1. `git diff` whitespace check passed.
2. `cargo fmt` check passed with the existing stable rustfmt warning
about `imports_granularity`.
3. Full Linux sandbox Cargo test suite passed on the devbox.
4. Devbox forty six case suite passed at head
`012accb703c13bd28df5b40079a9bf183036336a`.
5. Devbox summary: pass 46, fail 0.
6. The devbox suite was run through `just c sandbox linux`.
7. Focused repo test for Viyat parent repo case passed on the devbox.
2026-04-29 16:14:14 -07:00
iceweasel-oai
13dbcda28f stop blocking unified_exec on Windows (#19435)
## Summary
- remove the Windows-specific unified-exec environment block from tool
selection
- keep `unified_exec` default-off on Windows unless the feature is
explicitly enabled
- normalize model-provided `shell_type = unified_exec` to
`shell_command` when the feature is disabled
- drop obsolete tests tied to the removed environment gate and keep the
feature-flag regression coverage

## Why
Now that the session/long-lived process backend is implemented for the
Windows sandbox, we don't need to hard disable it anymore. We will be
rolling out slowly using a feature gate.

## Impact
This allows manual Windows opt-in in CLI and app-backed flows while
preserving the existing default-off behavior for Windows users.

---------

Co-authored-by: canvrno-oai <kbond@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-04-29 16:06:33 -07:00
pakrym-oai
8de2a7a16d Add codex-core public API listing (#20243)
Summary:
- Add a checked-in codex-core public API listing generated by
cargo-public-api.
- Add scripts/regen-public-api.sh with an embedded crate list,
auto-install for cargo-public-api 0.51.0, pinned nightly, and --check
mode.
- Add Rust CI jobs on the codex Linux x64 runner pool to verify the
listing stays up to date.

Testing:
- bash -n scripts/regen-public-api.sh
- just regen-public-api --check
- yq '.' .github/workflows/rust-ci.yml
.github/workflows/rust-ci-full.yml
- git diff --check
2026-04-29 22:58:08 +00:00
Rasmus Rygaard
782191547c Add agent graph store interface (#19229)
## Summary

Persisted subagent parent/child topology currently leaks through
`StateRuntime`'s SQLite-specific thread-spawn helpers. This PR
introduces a narrow `AgentGraphStore` boundary so follow-up work can
route graph operations through a local or remote store without coupling
orchestration code directly to the state DB graph API.

## Changes

- Adds the new `codex-agent-graph-store` crate.
- Defines a flat `AgentGraphStore` trait for the v1 graph surface:
upsert edge, set edge status, list direct children, and list
descendants.
- Adds public graph types for `ThreadSpawnEdgeStatus`,
`AgentGraphStoreError`, and `AgentGraphStoreResult`.
- Implements `LocalAgentGraphStore` on top of an existing
`codex_state::StateRuntime`, preserving today's SQLite-backed
`thread_spawn_edges` behavior.
- Registers the crate in Cargo/Bazel metadata.

This PR only adds the local contract and implementation; call-site
migration and the remote gRPC store are left to the follow-up PRs in the
stack.

## Testing

- `cargo test -p codex-agent-graph-store`

The new unit tests cover local parity with the existing `StateRuntime`
graph methods, `Open`/`Closed` filtering, status updates, and stable
breadth-first descendant ordering.
2026-04-29 22:48:26 +00:00
Matthew Zeng
e20391e567 [mcp] Fix plugin MCP approval policy. (#19537)
Plugin MCP servers are loaded from plugin manifests rather than
top-level `[mcp_servers]`, so their tool approval preferences need to be
stored and applied through the owning plugin config. Without this,
choosing "Always allow" for a plugin MCP tool could write a preference
that was not reliably used on later tool calls.

## Summary
- Add plugin-scoped MCP policy config under
`plugins.<plugin>.mcp_servers`, including server enablement, tool
allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when
plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to
the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config
entry.
- Reload user config after persisting an approval and make the plugin
load cache config-aware so stale plugin MCP policy is not reused after
`config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy
loading, approval lookup, persistence, and stale-cache prevention.

## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`
2026-04-29 15:40:03 -07:00
Eric Traut
4241df4d79 Escape turn metadata headers as ASCII JSON (#19620)
## Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex
was serializing the metadata JSON with raw UTF-8 string contents. When a
workspace path contains non-ASCII characters, common HTTP stacks can
reject or corrupt that header before the request reaches the provider.

Fixes #17468. Also addresses the duplicate WebSocket report in #19581.

## What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that
serializes JSON normally while escaping non-ASCII string content as
`\uXXXX`.
- Switched turn metadata header serialization, including merged
Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client
metadata while preserving the same parsed JSON values.

## Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`
2026-04-29 15:35:33 -07:00
Michael Bolin
b1546008fc docs: discourage #[async_trait] and #[allow(async_fn_in_trait)] (#20242)
## Why

We have run into two avoidable problems when introducing async trait
APIs in Rust:

- `#[async_trait]` has caused materially worse build times in this
repository.
- `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait
without spelling out whether the returned future is `Send`, which hides
an important part of the trait contract.

We already have a good example of the preferred alternative in
[#16630](https://github.com/openai/codex/pull/16630) /
[`3c7f013f9735`](https://github.com/openai/codex/commit/3c7f013f9735),
but that guidance currently lives only as prior art in the codebase.
This PR documents the rule in `AGENTS.md` so contributors are more
likely to follow the native RPITIT pattern before these two shortcuts
spread further.

## What Changed

- added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]`
and `#[allow(async_fn_in_trait)]`
- pointed contributors to the native RPITIT pattern with explicit `Send`
bounds on the returned future
- clarified that implementations may still use `async fn` when they
satisfy that trait contract

## Verification

- docs-only change; no tests run
2026-04-29 15:29:29 -07:00
Alex Daley
f63b19bedd [apps] Add apps MCP path override (#20231)
Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.

Test Plan

- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`
2026-04-29 18:08:06 -04:00
xli-oai
8d5da3ffe5 Fallback login callback port when default is busy (#19334)
## Summary
- Keep the preferred ChatGPT login callback port `1455` first.
- Preserve the existing `/cancel` recovery for stale Codex login
servers.
- Fall back to the registered localhost callback port `1457` when `1455`
remains unavailable.

## Why
Cursor and Codex Desktop both use the ChatGPT account login callback
server. On Windows, Cursor can already be listening on `127.0.0.1:1455`
/ `[::1]:1455`, causing Codex Desktop sign-in to fail with:

`Local callback port 1455 is already in use on this machine.`

Codex already attempted to cancel a stale Codex login server on that
port, but if the listener does not release the port, the old behavior
was to fail. The new behavior falls back to `1457`, which matches the
fixed redirect URI being registered server-side in
`openai/openai#863817`. This keeps the OAuth `redirect_uri` inside
Hydra's exact allow-list instead of choosing an arbitrary ephemeral
port.

## Validation
- `just fmt`
- `cargo test -p codex-login`
- `git diff --check HEAD~1..HEAD`
2026-04-29 14:45:27 -07:00
rhan-oai
72a39e3a96 [app-server] centralize client response analytics (#20059)
## Why

The precursor PR keeps successful client responses typed until
app-server's outgoing response seam. This follow-up uses that seam to
move successful client-response analytics out of individual handlers and
into the shared sender path, while keeping filtering decisions inside
`codex-analytics`.

## What changed

- Emit successful client-response analytics centrally from
`OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current
thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so
unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and
normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the
centralized forwarding path.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`
2026-04-29 21:22:39 +00:00
xli-oai
afbddabc8b Require remote plugin detail before uninstall (#19966)
## Summary
- Fetch remote plugin detail before sending the uninstall request.
- Use the detail response to derive the marketplace namespace and plugin
name for cache cleanup.
- Stop the uninstall before the backend POST if detail lookup fails, so
backend state and local cache state do not diverge.

## Testing
- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo test -p codex-core-plugins`
- `git diff --check`
2026-04-29 14:01:11 -07:00
rhan-oai
973c5c823e [app-server] type client response payloads (#20050)
## Why

`pr17088` adds typed server-originated request/response plumbing, but
successful client responses are still erased into bare JSON-RPC `result`
values before app-server can make any typed decision about them.

This precursor PR keeps successful client responses typed until the
outgoing response seam. It is intentionally limited to
protocol/app-server plumbing so the analytics behavior change can review
separately on top.

## What changed

- Add `ClientResponsePayload` as the pre-serialization client response
body type.
- Route app-server successful response paths through the typed payload
seam while preserving existing handler-local analytics behavior.
- Keep `InterruptConversation` JSON-RPC-only because it has no
`ClientResponse` variant.
- Move the new payload conversion tests into a dedicated protocol test
module.

## Verification

- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server-protocol`
2026-04-29 20:50:47 +00:00
sayan-oai
b15074d0a4 app-server: fix outgoing sender test setup (#20258)
## Why

[#17088](https://github.com/openai/codex/pull/17088) changed
`OutgoingMessageSender::new` to require an `AnalyticsEventsClient`, but
one `command_exec` test added earlier on `main` still called the old
one-argument constructor. That leaves current `main` failing to compile
in Bazel and argument-comment-lint jobs.

## What changed

- Pass `AnalyticsEventsClient::disabled()` to the missed
`OutgoingMessageSender::new` test call site in `command_exec.rs`.

## Verification

- `cargo test -p codex-app-server
timeout_or_cancellation_reports_cancellation_without_timeout_exit_code`
2026-04-29 20:47:20 +00:00
Matthew Zeng
8ce48f9968 [tool_suggest] Improve tool_suggest triggering conditions. (#20091)
## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`
2026-04-29 13:41:12 -07:00
rhan-oai
0690ab0842 [codex-analytics] ingest server requests and responses (#17088)
## Why

Codex analytics needs a typed seam for app-server-originated
request/response traffic so future tool-approval analytics can consume
those facts without adding bespoke callsite tracking each time. Server
responses arrive as JSON-RPC `id + result` payloads, so analytics has to
reconstruct the matching typed response from the original typed request
while that request context still exists in app-server.

This also puts analytics on the app-server outbound path, which needs to
avoid keeping the runtime alive during shutdown. The final ownership fix
keeps the normal strong auth-manager retention in analytics and makes
the external-auth refresh bridge hold a weak back-reference to
`OutgoingMessageSender`, breaking the runtime cycle at the bridge
boundary instead of exposing retention policy through the analytics
client API.

## What changed

- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus
`AnalyticsEventsClient::track_server_request` and
`track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and
`ClientResponse` so reducers can distinguish client-to-server traffic
from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed
request to decode the matching typed server response from a raw JSON-RPC
result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and
records targeted server requests, replayed targeted requests, and
matching targeted responses with the responding connection id needed for
correlation.
- Intentionally leaves broadcast server requests/responses out of
analytics for now because the current model is per connection, while
broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing
`Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and
upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for
now; this PR is plumbing for later tool-approval analytics work.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed,
broadcast-exclusion, and response-attribution analytics paths.

## Follow-up

This PR intentionally stops at ingestion plumbing, so `ServerRequest`
and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR
adds real downstream analytics output for those facts:

- replace the temporary pre-reducer observation seam with reducer tests
for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs`
for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor
of the real-output coverage above.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088).
* #18748
* #18747
* #17090
* #17089
* #20241
* #20239
* __->__ #17088
2026-04-29 19:56:41 +00:00
iceweasel-oai
9d1e5df4b2 expand the set of core shell env vars for Windows. (#20089)
https://github.com/openai/codex/issues/13917 and
https://github.com/openai/codex/issues/18248 correctly identify that

```
[shell_environment_policy]
inherit = "core"
```
is not functional on Windows because it carries an insufficient set of
env vars.
This PR expands that to match the more functional set from the MCP
client
2026-04-29 19:23:46 +00:00
viyatb-oai
07c8b8c77c fix: handle deferred network proxy denials (#19184)
## Why

This bug is exposed by Guardian/auto-review approvals. With the managed
network proxy enabled, a blocked network request can be reported back
through the network approval service as an approval denial after the
command has already started. Before this change, the shell and unified
exec runtimes registered those network approval calls, but did not have
a way to observe an async proxy denial as a cancellation/failure signal
for the running process.

The result was confusing: Guardian/auto-review could correctly deny
network access, but the command path could keep running or unregister
the approval without surfacing the denial as the command failure.

## What Changed

- `NetworkApprovalService` now attaches a cancellation token to active
and deferred network approvals.
- Proxy-denial outcomes are recorded only for active registrations,
cancel the owning token, and are consumed when the approval is
finalized.
- The shell runtime combines the normal command timeout with the
network-denial cancellation token.
- Unified exec stores the deferred network approval object, terminates
tracked processes when the proxy denial arrives, and returns the denial
as a process failure while polling or completing the process.
- Tool orchestration passes the active network approval cancellation
token into the sandbox attempt and preserves deferred approval errors
instead of silently unregistering them.
- App-server `command/exec` now handles the combined
timeout-or-cancellation expiration variant used by the runtime.

## Verification

- `cargo test -p codex-core network_approval --lib`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `cargo clippy -p codex-core --all-targets -- -D warnings`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 19:13:57 +00:00
xl-openai
73cd831952 feat: Use remote installed plugin cache for skills and MCP (#20096)
- Fetches and caches remote /installed plugin state
- Lets skills/list load skills from remote-installed cached plugins
without requiring a local marketplace entry
- Routes plugin list/startup/install/uninstall changes through async
plugin cache invalidation and MCP refresh
2026-04-29 12:09:49 -07:00
Won Park
5cf0adba93 Include auto-review rollout in feedback uploads (#20064)
## Summary

- include the live auto-review trunk rollout when `/feedback` uploads
logs
- upload that attachment as
`auto-review-rollout-<parent-thread-id>.jsonl` so it is distinguishable
from the parent rollout
- show the same auto-review attachment name in the TUI consent popup

## Scope

- this only covers the live cached auto-review trunk for the current
parent thread
- it does not add durable historical parent->auto-review lookup
- it does not add persisted rollout support for ephemeral parallel
review forks

## UI 

<img width="599" height="185" alt="Screenshot 2026-04-28 at 1 17 18 PM"
src="https://github.com/user-attachments/assets/6a0e79c2-5d21-4702-8a89-f765778bc9e9"
/>

## Validation

- `cargo test -p codex-core
cached_guardian_subagent_exposes_its_rollout_path`
- `cargo test -p codex-feedback`
- `cargo test -p codex-app-server`
- `cargo test -p codex-tui feedback_upload_consent_popup_snapshot`
- `cargo test -p codex-tui
feedback_good_result_consent_popup_includes_connectivity_diagnostics_filename`

## Known unrelated local failures

- `cargo test -p codex-core` currently fails in the pre-existing proxy
env snapshot test
`tools::runtimes::tests::maybe_wrap_shell_lc_with_snapshot_keeps_user_proxy_env_when_proxy_inactive`
- `cargo test -p codex-tui` currently hits pre-existing `status::*`
snapshot drift unrelated to this change

## Follow-Up 
- persist parallel auto-review fork sessions so /feedback can include
their rollout history too
- attach each persisted fork as its own clearly named file, for example
auto-review-rollout-<parent-thread-id>-fork <n>.jsonl, instead of
merging multiple Guardian sessions into one attachment
- keep the same live-session-only scope initially; durable historical
parent -> auto-review lookup can remain a separate decision if we later
need feedback from resumed sessions
2026-04-29 11:44:55 -07:00
friel-openai
05fd904572 test protocol: lock inter-agent commentary phase (#20046)
## Summary
- add a regression test for
`InterAgentCommunication::to_response_input_item`
- assert replayed inter-agent messages keep `phase:
Some(MessagePhase::Commentary)`

## Test plan
- `cargo test -p codex-protocol`
- `just argument-comment-lint`
2026-04-29 11:24:17 -07:00
pakrym-oai
8356806fc9 Add ThreadManager sample crate (#20141)
Summary:
- Add codex-thread-manager-sample, a one-shot binary that starts a
ThreadManager thread, submits a prompt, and prints the final assistant
output.
- Pass ThreadStore into ThreadManager::new and expose
thread_store_from_config for existing callsites.
- Build the sample Config directly with only --model and prompt inputs.

Verification:
- just fmt
- cargo check -p codex-thread-manager-sample -p codex-app-server -p
codex-mcp-server
- git diff --check

Tests: Not run per request.
2026-04-29 11:21:06 -07:00
joeytrasatti-openai
47fba5df4a [codex-backend] Prefer sqlite git info for rollout-path reads (#20228)
### Summary

- Path-based local thread reads currently return rollout/session git
metadata directly, so `thread/resume` can disagree with persisted SQLite
metadata for the same thread.
- Merge non-null SQLite git fields over rollout-path reads while keeping
rollout values as fallbacks for fields SQLite does not know.
- Add focused regression coverage for rollout-path reads so persisted
branch updates are preserved during resume.

### Testing

- `cargo test -p codex-thread-store`
2026-04-29 17:54:37 +00:00
Eric Traut
d0204c3dcc TUI: Remove core protocol dependency [3/7] (#20174)
## Why

This is part 3 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

With `AppCommand` now explicit, the internal app event bus can carry TUI
commands directly instead of bouncing through core `Op` values.

## What changed

- Changed `AppEvent::CodexOp` and `AppEvent::SubmitThreadOp` to carry
`AppCommand`.
- Updated app-event senders and direct emitters to submit `AppCommand`
values.
- Adjusted tests to match `AppCommand` or convert back through
`into_core()` where they intentionally assert legacy payload equality.

## Verification

- `cargo test -p codex-tui --no-run`
2026-04-29 10:52:10 -07:00
Eric Traut
445629815c TUI: Remove core protocol dependency [2/7] (#20173)
## Why

This is part 2 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

Before the TUI event bus can stop carrying core `Op` values,
`AppCommand` needs to be an owned TUI command shape rather than a thin
wrapper around `Op`.

## What changed

- Replaced the opaque `AppCommand(Op)` wrapper with explicit owned
variants for the commands the TUI submits.
- Preserved `into_core()` so this layer does not yet change the
app/thread submission boundary.
- Kept existing core leaf types for now so this remains a mechanical
command-shape refactor.

## Verification

- `cargo check -p codex-tui`
2026-04-29 10:28:04 -07:00
cassirer-openai
df966996a7 [rollout-tracer] Match analysis messages on encrypted id. (#20123)
In some setups the summary or raw content can be dropped between
requests. This triggers a check in the reducer which expects that the
messages should remain identical between requests.

This PR relaxes the checks to only focus on the encrypted ID instead. It
also changes the reducer to keep the most rich version of the message
observed during the rollout (this ensures that we don't accidentally
lose the CoT nor summary when available).
2026-04-29 17:22:24 +00:00
iceweasel-oai
cecca5ae06 Improve Windows process management edge cases (#19211)
## Summary

Some improvements to Windows process-management issues from
https://github.com/openai/codex/pull/15578

- bound the elevated runner pipe-connect handshake instead of waiting
forever on blocking pipe connects
- terminate the spawned runner if that handshake fails, so timeout/error
paths do not leave a stray `codex-command-runner.exe`
- loop on partial `WriteFile` results when forwarding stdin in the
elevated runner, so input is not silently truncated
- fix the concrete HANDLE/SID cleanup paths in the runner setup code
- keep draining driver-backed stdout/stderr after exit until the backend
closes, instead of dropping the tail after a fixed 200ms grace period
- reuse `LocalSid` for SID ownership and add more explanatory comments
around the ownership/concurrency-sensitive code paths

## Why

The original PR fixed a lot of Windows session plumbing, but there were
still a few sharp process-lifecycle edges:

- some elevated runner handshakes could block forever
- the new timeout path could still orphan the spawned runner process
- stdin forwarding still assumed a single `WriteFile` consumed the whole
buffer
- a few raw HANDLE/SID error paths still leaked
- driver-backed output could still lose the last chunk of stdout/stderr
on slower backends

## Validation

- `cargo fmt -p codex-windows-sandbox -p codex-utils-pty`
- `cargo test -p codex-utils-pty`
- `cargo test -p codex-windows-sandbox finish_driver_spawn`
- `cargo test -p codex-windows-sandbox runner_`

Ran a local test matrix of unified-exec and shell_tool tests, all
passing
2026-04-29 10:00:01 -07:00
Eric Traut
1c420a90cd TUI: Remove core protocol dependency [1/7] (#20172)
## Why

This is part 1 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

This first layer reduces the size of the later `chatwidget` diff by
mechanically moving MCP startup bookkeeping out of the central widget
file without changing the event shapes or behavior.

## What changed

- Extracted MCP startup status handling into
`tui/src/chatwidget/mcp_startup.rs`.
- Kept the existing core event types in place for this purely mechanical
move.
- Updated the MCP startup tests to import the moved test-only event
types directly.

## Verification

- `cargo test -p codex-tui chatwidget::tests::mcp_startup`
2026-04-29 09:10:22 -07:00
Eric Traut
91ca551df8 Use /goal resume for paused goals (#20082)
## Why

The paused goal statusline currently points users at `/goal` to unpause
a goal, but bare `/goal` is the summary command and does not change the
goal state. Instead of making `/goal` mutate state only when a goal is
paused, this gives the action an explicit command that reads naturally
in the UI.

## What Changed

- Replace `/goal unpause` with `/goal resume` for reactivating a paused
goal.
- Update the paused goal statusline and `/goal` summary copy to point at
`/goal resume`.
2026-04-29 08:56:02 -07:00
jif-oai
70ac0f123c Make multi-agent v2 ignore agents.max_depth (#20180)
## Why

`agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses
task-path routing and its own session/thread limits, so v2 should not
reject nested `spawn_agent` calls just because the thread-spawn depth
has reached the v1 maximum.

Keeping the v1 depth guard active in v2 prevents deeper task trees even
though the v2 path still needs the depth value only for lineage and
task-path metadata.

## What Changed

- Removed the depth-limit rejection from the multi-agent v2
`spawn_agent` handler while still computing child depth for lineage/path
metadata.
- Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools
apply only when `Feature::MultiAgentV2` is disabled.
- Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to
cover a v2 child spawning another agent when `agent_max_depth = 1`,
while the existing v1 depth-limit tests continue to enforce the legacy
behavior.

## Verification

- `cargo test -p codex-core
multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture`
- `cargo test -p codex-core depth_limit -- --nocapture`
- `cargo test -p codex-core tools::handlers::multi_agents::tests --
--nocapture`
2026-04-29 12:23:00 +02:00
jif-oai
c41b74c453 nit: drop old memories things (#20186)
Drop legacy code
2026-04-29 12:19:50 +02:00
iceweasel-oai
5cac3f896d Fix Windows pseudoconsole attribute handling for sandboxed PTY sessions (#20042)
## Summary
Fix the Windows sandbox PTY spawn path to pass the pseudoconsole handle
value directly into `UpdateProcThreadAttribute`.

## Why
Sandboxed `unified_exec` PTY sessions on Windows were failing during
child process startup with `0xc0000142` (`STATUS_DLL_INIT_FAILED`). In
practice this showed up as PowerShell DLL init popups when the sandboxed
background-terminal path tried to launch an interactive shell.

The root cause was that we were passing a pointer to a local `isize`
variable instead of the pseudoconsole handle value in the form Windows
expects for `PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE`.

## Validation
- `cargo build -p codex-windows-sandbox --bins`
- Reproduced the real sandboxed `codex exec` flow with
`windows.sandbox_private_desktop=true`
- Verified a `tty=true` interactive session launched through the normal
PowerShell wrapper, printed `READY`, accepted follow-up stdin, and
exited cleanly
- Confirmed no new `0xc0000142` / `Application Popup` events appeared
after the successful repro
2026-04-29 11:59:45 +02:00
alexsong-oai
d92c909ee4 Fix migrated hook path rewriting (#20144)
## Summary
- Rewrite migrated external-agent hook commands by replacing the full
hook script path token instead of only the `.claude/hooks/` segment.
- Preserve quoting around the full rewritten target path so script names
with spaces, absolute paths, and shell operators/redirection continue to
work.
- Apply `.claude/settings.local.json` over `.claude/settings.json` for
config, MCP, and plugin migration so local scope matches Claude settings
precedence.
- Skip legacy command markdown without `description` frontmatter,
including README-style docs under `.claude/commands`.

## Root Cause
The previous hook rewrite handled `.claude/hooks/` as a substring
replacement. For absolute source commands, that left the original
project-root prefix before the newly quoted `.codex/hooks` directory,
producing invalid commands like
`project/'project/.codex/hooks'/script.sh`.

The migration also only used project `settings.json` for
config/MCP/plugin decisions, so local settings such as
`disabledMcpjsonServers` could be ignored even though Claude gives local
settings higher precedence than project settings.

## Validation
- `just fmt`
- `cargo test -p codex-external-agent-migration`
- `cargo test -p codex-app-server external_agent_config`
- `just fix -p codex-external-agent-migration`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-29 00:46:11 -07:00
viyatb-oai
5597925155 feat(cli): add sandbox profile config controls (#20118)
## Why

The explicit profile path from #20117 is meant for standalone testing,
but it still inherited the
shell cwd and all managed requirements implicitly. The pre-existing
launcher path even called out
that it did not support a separate cwd yet in

[`debug_sandbox.rs`](509453f688/codex-rs/cli/src/debug_sandbox.rs (L174-L179)).

For a standalone command, the useful default is to let the caller choose
the project directory being
tested and to avoid administrator-provided constraints unless the caller
explicitly wants to test
those too.

## What changed

- Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both
profile resolution and command
  execution.
- Add explicit-profile-only `--include-managed-config`.
- Make explicit profile mode skip managed requirement sources by
default, including cloud
requirements, MDM requirements, `/etc/codex/requirements.toml`, and the
legacy managed-config
  requirements projection.
- Preserve all existing invocations outside the explicit-profile path.

## Stack

1. #20117 `sandbox-ui-profile`
2. #20118 `sandbox-ui-config` --> this PR

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_`
- `cargo test -p codex-core
load_config_layers_can_ignore_managed_requirements`
- `cargo test -p codex-core
load_config_layers_includes_cloud_requirements`
- macOS branch-binary smoke on the rebased top of stack: `-C` changed
execution cwd, explicit
profile mode omitted managed proxy env under `env -i`, and
`--include-managed-config` restored it.
- Linux devbox branch-binary smoke on the rebased top of stack: `-C`
changed execution cwd for
  built-in and user-defined explicit profiles.
2026-04-29 06:55:51 +00:00
1962 changed files with 226752 additions and 102863 deletions

View File

@@ -79,6 +79,10 @@ common:ci --disk_cache=
# Shared config for the main Bazel CI workflow.
common:ci-bazel --config=ci
common:ci-bazel --build_metadata=TAG_workflow=bazel
# Bazel CI cross-compiles in several legs, and the V8-backed code-mode tests
# are not stable in that setup yet. Keep running the rest of the Rust
# integration suites through the workspace-root launcher.
common:ci-bazel --test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode::
# Shared config for Bazel-backed Rust linting.
build:clippy --aspects=@rules_rust//rust:defs.bzl%rust_clippy_aspect
@@ -153,6 +157,25 @@ common:ci-macos --config=remote
common:ci-macos --strategy=remote
common:ci-macos --strategy=TestRunner=darwin-sandbox,local
# On Windows, use Linux remote execution for build actions but keep test actions
# on the Windows runner so Bazel's normal test sharding and flaky-test retries
# still run against Windows binaries.
common:ci-windows-cross --config=ci-windows
common:ci-windows-cross --build_metadata=TAG_windows_cross_compile=true
common:ci-windows-cross --config=remote
common:ci-windows-cross --host_platform=//:rbe
common:ci-windows-cross --strategy=remote
common:ci-windows-cross --strategy=TestRunner=local
common:ci-windows-cross --local_test_jobs=4
common:ci-windows-cross --test_env=RUST_TEST_THREADS=1
# Native Windows CI still covers the PowerShell tests. The cross-built gnullvm
# binaries currently hang in PowerShell AST parser tests when those binaries are
# run on the Windows runner.
common:ci-windows-cross --test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode::,powershell
common:ci-windows-cross --platforms=//:windows_x86_64_gnullvm
common:ci-windows-cross --extra_execution_platforms=//:rbe,//:windows_x86_64_msvc
common:ci-windows-cross --extra_toolchains=//:windows_gnullvm_tests_on_msvc_host_toolchain
# Linux-only V8 CI config.
common:ci-v8 --config=ci
common:ci-v8 --build_metadata=TAG_workflow=v8
@@ -160,5 +183,15 @@ common:ci-v8 --build_metadata=TAG_os=linux
common:ci-v8 --config=remote
common:ci-v8 --strategy=remote
# Source-built Bazel V8 artifacts use the in-process sandbox by default. This
# does not affect Cargo's default prebuilt rusty_v8 path.
common --@v8//:v8_enable_pointer_compression=True
common --@v8//:v8_enable_sandbox=True
# Keep currently published rusty_v8 release artifacts non-sandboxed until the
# artifact migration ships matching Rust feature selection for Cargo consumers.
common:v8-release-compat --@v8//:v8_enable_pointer_compression=False
common:v8-release-compat --@v8//:v8_enable_sandbox=False
# Optional per-user local overrides.
try-import %workspace%/user.bazelrc

View File

@@ -0,0 +1,11 @@
# THIS IS AUTOGENERATED. DO NOT EDIT MANUALLY
version = 1
name = "codex"
[setup]
script = ""
[[actions]]
name = "Run"
icon = "run"
command = "cargo +1.93.0 run --manifest-path=codex-rs/Cargo.toml --bin codex -- -c mcp_oauth_credentials_store=file"

View File

@@ -27,10 +27,10 @@ Accept any of the following:
2. Run the watcher script to snapshot PR/review/CI state (or consume each streamed snapshot from `--watch`).
3. Inspect the `actions` list in the JSON response.
4. If `diagnose_ci_failure` is present, inspect failed run logs and classify the failure.
5. If the failure is likely caused by the current branch, patch code locally, commit, and push.
5. If the failure is likely caused by the current branch, patch code locally, commit, and push. Do not patch random flaky tests, CI infrastructure, dependency outages, runner issues, or other failures that are unrelated to the branch.
6. If `process_review_comment` is present, inspect surfaced review items and decide whether to address them.
7. If a review item is actionable and correct, patch code locally, commit, push, and then mark the associated review thread/comment as resolved once the fix is on GitHub.
8. If a review item from another author is non-actionable, already addressed, or not valid, post one reply on the comment/thread explaining that decision (for example answering the question or explaining why no change is needed). Prefix the GitHub reply body with `[codex]` so it is clear the response is automated. If the watcher later surfaces your own reply, treat that self-authored item as already handled and do not reply again.
8. Do not post replies to human-authored review comments/threads unless the user explicitly confirms the exact response. If a human review item is non-actionable, already addressed, or not valid, surface the item and recommended response to the user instead of replying on GitHub.
9. If the failure is likely flaky/unrelated and `retry_failed_checks` is present, rerun failed jobs with `--retry-failed-now`.
10. If both actionable review feedback and `retry_failed_checks` are present, prioritize review feedback first; a new commit will retrigger CI, so avoid rerunning flaky checks on the old SHA unless you intentionally defer the review change.
11. On every loop, look for newly surfaced review feedback before acting on CI failures or mergeability state, then verify mergeability / merge-conflict status (for example via `gh pr view`) alongside CI.
@@ -69,12 +69,18 @@ python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr <number-or-url> --o
Use `gh` commands to inspect failed runs before deciding to rerun.
- `gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha`
- `gh run view <run-id> --log-failed`
- `gh api repos/<owner>/<repo>/actions/runs/<run-id>/jobs -X GET -f per_page=100`
- `gh api repos/<owner>/<repo>/actions/jobs/<job-id>/logs > /tmp/codex-gh-job-<job-id>-logs.zip`
- `gh run view <run-id> --log-failed` as a fallback after the overall workflow run is complete
Prefer treating failures as branch-related when logs point to changed code (compile/test/lint/typecheck/snapshots/static analysis in touched areas).
`gh run view --log-failed` is workflow-run scoped and may not expose failed-job logs until the overall run finishes. For faster diagnosis, poll the run's jobs first and, as soon as a specific job has failed, fetch that job's logs directly from the Actions job logs endpoint. The watcher includes a `failed_jobs` list with each failed job's `job_id` and `logs_endpoint` when GitHub exposes one.
Prefer treating failures as branch-related when failed-job logs point to changed code (compile/test/lint/typecheck/snapshots/static analysis in touched areas).
Prefer treating failures as flaky/unrelated when logs show transient infra/external issues (timeouts, runner provisioning failures, registry/network outages, GitHub Actions infra errors).
Do not attempt to fix flaky/unrelated failures by changing tests, build scripts, CI configuration, dependency pins, or infrastructure-adjacent code unless the logs clearly connect the failure to the PR branch. For flaky/unrelated failures, rerun only when the watcher recommends `retry_failed_checks`; otherwise wait or stop for user help.
If classification is ambiguous, perform one manual diagnosis attempt before choosing rerun.
Read `.codex/skills/babysit-pr/references/heuristics.md` for a concise checklist.
@@ -99,7 +105,8 @@ When you agree with a comment and it is actionable:
5. Resume watching on the new SHA immediately (do not stop after reporting the push).
6. If monitoring was running in `--watch` mode, restart `--watch` immediately after the push in the same turn; do not wait for the user to ask again.
If you disagree or the comment is non-actionable/already addressed, reply once directly on the GitHub comment/thread so the reviewer gets an explicit answer, then continue the watcher loop. Prefix any GitHub reply to a code review comment/thread with `[codex]` so it is clear the response is automated and not from the human user. If the watcher later surfaces your own reply because the authenticated operator is treated as a trusted review author, treat that self-authored item as already handled and do not reply again.
Do not post replies to human-authored GitHub review comments/threads automatically. If you disagree with a human comment, believe it is non-actionable/already addressed, or need to answer a question, report the item to the user with a suggested response and wait for explicit confirmation before posting anything on GitHub. If the user approves a response, prefix it with `[codex]` so it is clear the response is automated and not from the human user.
If the watcher later surfaces your own approved reply because the authenticated operator is treated as a trusted review author, treat that self-authored item as already handled and do not reply again.
If a code review comment/thread is already marked as resolved in GitHub, treat it as non-actionable and safely ignore it unless new unresolved follow-up feedback appears.
## Git Safety Rules
@@ -125,11 +132,11 @@ Use this loop in a live Codex session:
2. Read `actions`.
3. First check whether the PR is now merged or otherwise closed; if so, report that terminal state and stop polling immediately.
4. Check CI summary, new review items, and mergeability/conflict status.
5. Diagnose CI failures and classify branch-related vs flaky/unrelated.
6. For each surfaced review item from another author, either reply once with an explanation if it is non-actionable or patch/commit/push and then resolve it if it is actionable. If a later snapshot surfaces your own reply, treat it as informational and continue without responding again.
5. Diagnose CI failures and classify branch-related vs flaky/unrelated. If the overall run is still pending but `failed_jobs` already includes a failed job, fetch that job's logs and diagnose immediately instead of waiting for the whole workflow run to finish. Patch only when the failure is branch-related.
6. For each surfaced review item from another author, patch/commit/push and then resolve it if it is actionable. If it is non-actionable, already addressed, or requires a written answer, surface it to the user with a suggested response instead of posting automatically. If a later snapshot surfaces your own approved reply, treat it as informational and continue without responding again.
7. Process actionable review comments before flaky reruns when both are present; if a review fix requires a commit, push it and skip rerunning failed checks on the old SHA.
8. Retry failed checks only when `retry_failed_checks` is present and you are not about to replace the current SHA with a review/CI fix commit.
9. If you pushed a commit, resolved a review thread, replied to a review comment, or triggered a rerun, report the action briefly and continue polling (do not stop).
8. Retry failed checks only when `retry_failed_checks` is present and you are not about to replace the current SHA with a review/CI fix commit. Do not make code changes for unrelated flakes or infrastructure failures just to get CI green.
9. If you pushed a commit, resolved a review thread, or triggered a rerun, report the action briefly and continue polling (do not stop). If a human review comment needs a written GitHub response, stop and ask for confirmation before posting.
10. After a review-fix push, proactively restart continuous monitoring (`--watch`) in the same turn unless a strict stop condition has already been reached.
11. If everything is passing, mergeable, not blocked on required review approval, and there are no unaddressed review items, report that the PR is currently ready to merge but keep the watcher running so new review comments are surfaced quickly while the PR remains open.
12. If blocked on a user-help-required issue (infra outage, exhausted flaky retries, unclear reviewer request, permissions), report the blocker and stop.

View File

@@ -1,4 +1,4 @@
interface:
display_name: "PR Babysitter"
short_description: "Watch PR review comments, CI, and merge conflicts"
default_prompt: "Babysit the current PR: monitor reviewer comments, CI, and merge-conflict status (prefer the watchers --watch mode for live monitoring); surface new review feedback before acting on CI or mergeability work, fix valid issues, push updates, and rerun flaky failures up to 3 times. Keep exactly one watcher session active for the PR (do not leave duplicate --watch terminals running). If you pause monitoring to patch review/CI feedback, restart --watch yourself immediately after the push in the same turn. If a watcher is still running and no strict stop condition has been reached, the task is still in progress: keep consuming watcher output and sending progress updates instead of ending the turn. Do not treat a green + mergeable PR as a terminal stop while it is still open; continue polling autonomously after any push/rerun so newly posted review comments are surfaced until a strict terminal stop condition is reached or the user interrupts."
default_prompt: "Babysit the current PR: monitor reviewer comments, CI, and merge-conflict status (prefer the watchers --watch mode for live monitoring); surface new review feedback before acting on CI or mergeability work, fix valid issues, push updates, and rerun flaky failures up to 3 times. Do not post replies to human-authored review comments unless the user explicitly confirms the exact response. Do not patch unrelated flaky tests, CI infrastructure, dependency outages, runner issues, or other failures that are not caused by the branch. Keep exactly one watcher session active for the PR (do not leave duplicate --watch terminals running). If you pause monitoring to patch review/CI feedback, restart --watch yourself immediately after the push in the same turn. If a watcher is still running and no strict stop condition has been reached, the task is still in progress: keep consuming watcher output and sending progress updates instead of ending the turn. Do not treat a green + mergeable PR as a terminal stop while it is still open; continue polling autonomously after any push/rerun so newly posted review comments are surfaced until a strict terminal stop condition is reached or the user interrupts."

View File

@@ -23,9 +23,11 @@ Used to discover failed workflow runs and rerunnable run IDs.
### Failed log inspection
- `gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha`
- `gh api repos/{owner}/{repo}/actions/runs/{run_id}/jobs -X GET -f per_page=100`
- `gh api repos/{owner}/{repo}/actions/jobs/{job_id}/logs > /tmp/codex-gh-job-{job_id}-logs.zip`
- `gh run view <run-id> --log-failed`
Used by Codex to classify branch-related vs flaky/unrelated failures.
Used by Codex to classify branch-related vs flaky/unrelated failures. Prefer the direct job log endpoint as soon as a job has failed because `gh run view --log-failed` may not produce failed-job logs until the overall workflow run completes.
### Retry failed jobs only
@@ -70,3 +72,11 @@ Reruns only failed jobs (and dependencies) for a workflow run.
- `conclusion`
- `html_url`
- `head_sha`
### Actions run jobs API (`jobs[]`)
- `id`
- `name`
- `status`
- `conclusion`
- `html_url`

View File

@@ -18,6 +18,8 @@ Treat as **likely flaky or unrelated** when evidence points to transient or exte
- Cloud/service rate limits or transient API outages
- Non-deterministic failures in unrelated integration tests with known flake patterns
Do not patch likely flaky/unrelated failures. Use the retry budget for rerunnable failures, wait for pending jobs, or stop and report the blocker when the failure is persistent or infrastructure-owned.
If uncertain, inspect failed logs once before choosing rerun.
## Decision tree (fix vs rerun vs stop)
@@ -25,9 +27,11 @@ If uncertain, inspect failed logs once before choosing rerun.
1. If PR is merged/closed: stop.
2. If there are failed checks:
- Diagnose first.
- If checks are still pending but an individual job has already failed: fetch that job's logs and diagnose now.
- If branch-related: fix locally, commit, push.
- If likely flaky/unrelated and all checks for the current SHA are terminal: rerun failed jobs.
- If checks are still pending: wait.
- If likely flaky/unrelated and not safely rerunnable: stop and report the blocker; do not edit unrelated tests, build scripts, CI configuration, dependency pins, or infrastructure code.
- If checks are still pending and no failed job is available yet: wait.
3. If flaky reruns for the same SHA reach the configured limit (default 3): stop and report persistent failure.
4. Independently, process any new human review comments.
@@ -40,12 +44,15 @@ Address the comment when:
- The requested change does not conflict with the users intent or recent guidance.
- The change can be made safely without unrelated refactors.
Fix valid human review feedback in code when possible, but do not post a GitHub reply to a human-authored comment/thread unless the user explicitly confirms the exact response.
Do not auto-fix when:
- The comment is ambiguous and needs clarification.
- The request conflicts with explicit user instructions.
- The proposed change requires product/design decisions the user has not made.
- The codebase is in a dirty/unrelated state that makes safe editing uncertain.
- The comment only needs a written answer or disagreement response; propose the reply to the user instead of posting it automatically.
## Stop-and-ask conditions
@@ -56,3 +63,4 @@ Stop and ask the user instead of continuing automatically when:
- The PR branch cannot be pushed.
- CI failures persist after the flaky retry budget.
- Reviewer feedback requires a product decision or cross-team coordination.
- A human review comment requires a written GitHub reply instead of a code change.

View File

@@ -338,6 +338,66 @@ def failed_runs_from_workflow_runs(runs, head_sha):
return failed_runs
def get_jobs_for_run(repo, run_id):
endpoint = f"repos/{repo}/actions/runs/{run_id}/jobs"
data = gh_json(["api", endpoint, "-X", "GET", "-f", "per_page=100"], repo=repo)
if not isinstance(data, dict):
raise GhCommandError("Unexpected payload from actions run jobs API")
jobs = data.get("jobs") or []
if not isinstance(jobs, list):
raise GhCommandError("Expected `jobs` to be a list")
return jobs
def failed_jobs_from_workflow_runs(repo, runs, head_sha):
failed_jobs = []
for run in runs:
if not isinstance(run, dict):
continue
if str(run.get("head_sha") or "") != head_sha:
continue
run_id = run.get("id")
if run_id in (None, ""):
continue
run_status = str(run.get("status") or "")
run_conclusion = str(run.get("conclusion") or "")
if run_status.lower() == "completed" and run_conclusion not in FAILED_RUN_CONCLUSIONS:
continue
jobs = get_jobs_for_run(repo, run_id)
for job in jobs:
if not isinstance(job, dict):
continue
conclusion = str(job.get("conclusion") or "")
if conclusion not in FAILED_RUN_CONCLUSIONS:
continue
job_id = job.get("id")
logs_endpoint = None
if job_id not in (None, ""):
logs_endpoint = f"repos/{repo}/actions/jobs/{job_id}/logs"
failed_jobs.append(
{
"run_id": run_id,
"workflow_name": run.get("name") or run.get("display_title") or "",
"run_status": run_status,
"run_conclusion": run_conclusion,
"job_id": job_id,
"job_name": str(job.get("name") or ""),
"status": str(job.get("status") or ""),
"conclusion": conclusion,
"html_url": str(job.get("html_url") or ""),
"logs_endpoint": logs_endpoint,
}
)
failed_jobs.sort(
key=lambda item: (
str(item.get("workflow_name") or ""),
str(item.get("job_name") or ""),
str(item.get("job_id") or ""),
)
)
return failed_jobs
def get_authenticated_login():
data = gh_json(["api", "user"])
if not isinstance(data, dict) or not data.get("login"):
@@ -568,7 +628,7 @@ def is_pr_ready_to_merge(pr, checks_summary, new_review_items):
return True
def recommend_actions(pr, checks_summary, failed_runs, new_review_items, retries_used, max_retries):
def recommend_actions(pr, checks_summary, failed_runs, failed_jobs, new_review_items, retries_used, max_retries):
actions = []
if pr["closed"] or pr["merged"]:
if new_review_items:
@@ -583,7 +643,7 @@ def recommend_actions(pr, checks_summary, failed_runs, new_review_items, retries
if new_review_items:
actions.append("process_review_comment")
has_failed_pr_checks = checks_summary["failed_count"] > 0
has_failed_pr_checks = checks_summary["failed_count"] > 0 or bool(failed_jobs)
if has_failed_pr_checks:
if checks_summary["all_terminal"] and retries_used >= max_retries:
actions.append("stop_exhausted_retries")
@@ -621,12 +681,14 @@ def collect_snapshot(args):
checks_summary = summarize_checks(checks)
workflow_runs = get_workflow_runs_for_sha(pr["repo"], pr["head_sha"])
failed_runs = failed_runs_from_workflow_runs(workflow_runs, pr["head_sha"])
failed_jobs = failed_jobs_from_workflow_runs(pr["repo"], workflow_runs, pr["head_sha"])
retries_used = current_retry_count(state, pr["head_sha"])
actions = recommend_actions(
pr,
checks_summary,
failed_runs,
failed_jobs,
new_review_items,
retries_used,
args.max_flaky_retries,
@@ -641,6 +703,7 @@ def collect_snapshot(args):
"pr": pr,
"checks": checks_summary,
"failed_runs": failed_runs,
"failed_jobs": failed_jobs,
"new_review_items": new_review_items,
"actions": actions,
"retry_state": {

View File

@@ -75,6 +75,11 @@ def test_collect_snapshot_fetches_review_items_before_ci(monkeypatch, tmp_path):
"failed_runs_from_workflow_runs",
lambda *args, **kwargs: call_order.append("failed_runs") or [],
)
monkeypatch.setattr(
gh_pr_watch,
"failed_jobs_from_workflow_runs",
lambda *args, **kwargs: call_order.append("failed_jobs") or [],
)
monkeypatch.setattr(
gh_pr_watch,
"recommend_actions",
@@ -100,6 +105,7 @@ def test_recommend_actions_prioritizes_review_comments():
sample_pr(),
sample_checks(failed_count=1),
[{"run_id": 99}],
[],
[{"kind": "review_comment", "id": "1"}],
0,
3,
@@ -119,6 +125,7 @@ def test_run_watch_keeps_polling_open_ready_to_merge_pr(monkeypatch):
"pr": sample_pr(),
"checks": sample_checks(),
"failed_runs": [],
"failed_jobs": [],
"new_review_items": [],
"actions": ["ready_to_merge"],
"retry_state": {
@@ -153,3 +160,58 @@ def test_run_watch_keeps_polling_open_ready_to_merge_pr(monkeypatch):
assert sleeps == [30, 30]
assert [event for event, _ in events] == ["snapshot", "snapshot"]
def test_failed_jobs_include_direct_logs_endpoint(monkeypatch):
jobs_by_run = {
99: [
{
"id": 555,
"name": "unit tests",
"status": "completed",
"conclusion": "failure",
"html_url": "https://github.com/openai/codex/actions/runs/99/job/555",
},
{
"id": 556,
"name": "lint",
"status": "completed",
"conclusion": "success",
},
]
}
monkeypatch.setattr(
gh_pr_watch,
"get_jobs_for_run",
lambda repo, run_id: jobs_by_run[run_id],
)
failed_jobs = gh_pr_watch.failed_jobs_from_workflow_runs(
"openai/codex",
[
{
"id": 99,
"name": "CI",
"status": "in_progress",
"conclusion": "",
"head_sha": "abc123",
}
],
"abc123",
)
assert failed_jobs == [
{
"run_id": 99,
"workflow_name": "CI",
"run_status": "in_progress",
"run_conclusion": "",
"job_id": 555,
"job_name": "unit tests",
"status": "completed",
"conclusion": "failure",
"html_url": "https://github.com/openai/codex/actions/runs/99/job/555",
"logs_endpoint": "repos/openai/codex/actions/jobs/555/logs",
}
]

View File

@@ -53,7 +53,7 @@ Use `--window "past week"` or `--window-hours 168` when the user asks for a non-
## Summary
No major issues reported by users.
Source: collector v4, git `abc123def456`, window `2026-04-27T00:00:00Z` to `2026-04-28T00:00:00Z`.
Source: collector v5, git `abc123def456`, window `2026-04-27T00:00:00Z` to `2026-04-28T00:00:00Z`.
Want details? I can expand this into the issue table.
```
@@ -65,7 +65,7 @@ Two issues are being surfaced by users:
🔥🔥 Terminal launch hangs on startup [1](https://github.com/openai/codex/issues/123)
🔥 Resume switches model providers unexpectedly [2](https://github.com/openai/codex/issues/456)
Source: collector v4, git `abc123def456`, window `2026-04-27T00:00:00Z` to `2026-04-28T00:00:00Z`.
Source: collector v5, git `abc123def456`, window `2026-04-27T00:00:00Z` to `2026-04-28T00:00:00Z`.
Want details? I can expand this into the issue table.
```
5. In `## Details`, when details are requested, include a compact table only when useful:
@@ -76,7 +76,7 @@ Want details? I can expand this into the issue table.
- A clear quiet/no-concern sentence when there is no meaningful signal.
6. Use the JSON `attention_marker` exactly. It is empty for normal rows, `🔥` for elevated rows, and `🔥🔥` for very high-attention rows. The actual cutoffs are in `attention_thresholds`.
7. Use inline numbered references where a row or bullet points to issues, for example `Compaction bugs [1](https://github.com/openai/codex/issues/123), [2](https://github.com/openai/codex/issues/456)`. Do not add a separate footnotes section.
8. Label `interactions` as `Interactions`; it counts posts/comments/reactions during the requested window, not unique people.
8. Label `interactions` as `Interactions`; it counts unique human GitHub users who created a new issue, added a new comment, or reacted during the requested window. Multiple posts/reactions from the same user on the same issue count once.
9. Mention the collector `script_version`, repo checkout `git_head`, and time window in one compact source line. In default mode, put this before the details prompt so the final line still asks whether the user wants details. In details-upfront mode, it can be the footer.
## Reaction Handling
@@ -89,7 +89,7 @@ GitHub issue search is still seeded by issue `updated_at`, so a purely reaction-
## Attention Markers
The collector scales attention markers by the requested time window. The baseline is 5 human user interactions for `🔥` and 10 for `🔥🔥` over 24 hours; longer or shorter windows scale those cutoffs linearly and round up. For example, a one-week report uses 35 and 70 interactions. Human user interactions are human-authored new issue posts, human-authored new comments, and human reactions created during the window, including upvotes. Bot posts and bot reactions are excluded. In prose, explain this as high user interaction rather than naming the emoji.
The collector scales attention markers by the requested time window. The baseline is 5 unique human users for `🔥` and 10 unique human users for `🔥🔥` over 24 hours; longer or shorter windows scale those cutoffs linearly and round up. For example, a one-week report uses 35 and 70 interactions. Unique human users are users who authored a new issue, authored a new comment, or reacted during the window, including upvotes. Multiple actions from the same user on the same issue count once. Bot posts and bot reactions are excluded. In prose, explain this as high user interaction rather than naming the emoji.
## Freshness

View File

@@ -11,7 +11,7 @@ from datetime import datetime, timedelta, timezone
from pathlib import Path
from urllib.parse import quote
SCRIPT_VERSION = 4
SCRIPT_VERSION = 5
QUALIFYING_KIND_LABELS = ("bug", "enhancement")
REACTION_KEYS = ("+1", "-1", "laugh", "hooray", "confused", "heart", "rocket", "eyes")
BASE_ATTENTION_WINDOW_HOURS = 24.0
@@ -393,9 +393,15 @@ def is_bot_login(login):
return bool(login) and login.lower().endswith("[bot]")
def is_human_user(user_obj):
def human_login_key(user_obj):
login = extract_login(user_obj)
return bool(login) and not is_bot_login(login)
if not login or is_bot_login(login):
return ""
return login.casefold()
def is_human_user(user_obj):
return bool(human_login_key(user_obj))
def label_names(issue):
@@ -467,22 +473,26 @@ def reaction_summary(item):
def reaction_event_summary(reactions, since, until):
counts = {}
total = 0
users = set()
for reaction in reactions or []:
if not isinstance(reaction, dict):
continue
if not is_in_window(str(reaction.get("created_at") or ""), since, until):
continue
if not is_human_user(reaction.get("user")):
user_key = human_login_key(reaction.get("user"))
if not user_key:
continue
content = str(reaction.get("content") or "")
if not content:
continue
counts[content] = counts.get(content, 0) + 1
total += 1
users.add(user_key)
return {
"total": total,
"counts": counts,
"upvotes": counts.get("+1", 0),
"users": sorted(users, key=str.casefold),
}
@@ -618,13 +628,21 @@ def summarize_issue(
new_comment_reaction_total = sum(
comment["reaction_total"] for comment in new_comments
)
new_issue_user_interaction = new_issue and is_human_user(issue.get("user"))
new_issue_user_key = human_login_key(issue.get("user")) if new_issue else ""
new_issue_user_interaction = bool(new_issue_user_key)
new_comment_user_interactions = sum(
1 for comment in new_comments if comment["human_user_interaction"]
)
user_interactions = (
int(new_issue_user_interaction) + new_comment_user_interactions + new_reactions
interaction_user_keys = set(issue_reaction_events_summary["users"])
interaction_user_keys.update(comment_reaction_events_summary["users"])
if new_issue_user_key:
interaction_user_keys.add(new_issue_user_key)
interaction_user_keys.update(
comment["author"].casefold()
for comment in new_comments
if comment["human_user_interaction"]
)
user_interactions = len(interaction_user_keys)
attention_level = attention_level_for(user_interactions, attention_thresholds)
attention_marker = attention_marker_for(user_interactions, attention_thresholds)
updated_without_visible_new_post = (
@@ -957,6 +975,7 @@ def collect_digest(args):
"New issue comments are filtered by comment creation time within the window from the fetched comment set.",
"Reaction events are counted by GitHub reaction created_at timestamps for hydrated issues and fetched comments.",
"Current reaction totals are standing engagement signals; new_reactions and new_upvotes are windowed activity.",
"user_interactions counts unique human users per issue across new issues, new comments, and new reactions; repeated actions by the same user count once.",
"The collector does not assign semantic clusters; use summary_inputs as model-ready evidence for report-time clustering.",
"Pure reaction-only issues may be missed if GitHub issue search does not surface them via updated_at.",
"Issues updated during the window without a new issue body or new comment are retained because label/status edits can still be useful owner signals.",

View File

@@ -494,6 +494,70 @@ def test_reactions_count_toward_attention_markers():
assert summary["new_comments"][0]["new_upvotes"] == 0
def test_user_interactions_are_deduped_by_human_login():
since = collect_issue_digest.parse_timestamp("2026-04-25T00:00:00Z", "--since")
until = collect_issue_digest.parse_timestamp("2026-04-26T00:00:00Z", "--until")
def comment(comment_id, login):
return {
"id": comment_id,
"created_at": f"2026-04-25T0{comment_id + 1}:00:00Z",
"updated_at": f"2026-04-25T0{comment_id + 1}:00:00Z",
"user": {"login": login},
"body": "same issue",
}
def reaction(content, login, created_at="2026-04-25T10:00:00Z"):
return {
"content": content,
"created_at": created_at,
"user": {"login": login},
}
issue = {
"number": 790,
"title": "Repeated pings should not boost attention",
"html_url": "https://github.com/openai/codex/issues/790",
"state": "open",
"created_at": "2026-04-25T01:00:00Z",
"updated_at": "2026-04-25T12:00:00Z",
"user": {"login": "Alice"},
"labels": [{"name": "bug"}, {"name": "tui"}],
}
comments = [comment(1, "alice"), comment(2, "ALICE"), comment(3, "bob")]
comments.append(comment(4, "github-actions[bot]"))
issue_reactions = [
reaction("+1", "alice"),
reaction("rocket", "Alice"),
reaction("+1", "bob"),
reaction("+1", "github-actions[bot]"),
reaction("+1", "carol", created_at="2026-04-24T23:00:00Z"),
]
comment_reactions_by_id = {
1: [reaction("heart", "alice")],
2: [reaction("+1", "bob")],
3: [reaction("eyes", "carol")],
}
summary = collect_issue_digest.summarize_issue(
issue,
comments,
["tui"],
since,
until,
body_chars=100,
comment_chars=100,
issue_reaction_events=issue_reactions,
comment_reactions_by_id=comment_reactions_by_id,
)
assert summary["activity"]["new_human_comments"] == 3
assert summary["new_reactions"] == 6
assert summary["user_interactions"] == 3
assert summary["attention"] is False
assert summary["attention_marker"] == ""
def test_digest_rows_are_table_ready_with_concise_descriptions():
rows = collect_issue_digest.digest_rows(
[

1
.github/CODEOWNERS vendored
View File

@@ -1,5 +1,6 @@
# Core crate ownership.
/codex-rs/core/ @openai/codex-core-agent-team
/codex-rs/ext/extension-api/ @openai/codex-core-agent-team
# Keep ownership changes reviewed by the same team.
/.github/CODEOWNERS @openai/codex-core-agent-team

View File

@@ -2,7 +2,6 @@ name: 💻 CLI Bug
description: Report an issue in the Codex CLI
labels:
- bug
- needs triage
body:
- type: markdown
attributes:
@@ -12,6 +11,8 @@ body:
Make sure you are running the [latest](https://npmjs.com/package/@openai/codex) version of Codex CLI. The bug you are experiencing may already have been fixed.
If your version supports it, please run `codex doctor --json` and paste the output in the "Codex doctor report" field below. This helps us diagnose install, config, auth, terminal, MCP, network, and local state issues.
- type: input
id: version
attributes:
@@ -41,9 +42,19 @@ body:
id: terminal
attributes:
label: What terminal emulator and version are you using (if applicable)?
description: Also note any multiplexer in use (screen / tmux / zellij)
description: |
E.g, VSCode, Terminal.app, iTerm2, Ghostty, Windows Terminal (WSL / PowerShell)
Also note any multiplexer in use (screen / tmux / zellij).
E.g., VS Code, Terminal.app, iTerm2, Ghostty, Windows Terminal (WSL / PowerShell)
- type: textarea
id: doctor
attributes:
label: Codex doctor report
description: |
If available, run `codex doctor --json` and paste the full output here.
The report is designed to redact secrets, but please review it before submitting.
If your Codex version does not support `doctor`, write `not available`.
render: json
- type: textarea
id: actual
attributes:

View File

@@ -10,7 +10,7 @@ body:
Before you submit a feature:
1. Search existing issues for similar features. If you find one, 👍 it rather than opening a new one.
2. The Codex team will try to balance the varying needs of the community when prioritizing or rejecting new features. Not all features will be accepted. See [Contributing](https://github.com/openai/codex#contributing) for more details.
2. The Codex team will try to balance the varying needs of the community when prioritizing or rejecting new features. Not all features will be accepted. See [Contributing](https://github.com/openai/codex/blob/main/docs/contributing.md) for more details.
- type: input
id: variant

View File

@@ -1,6 +1,6 @@
name: 📗 Documentation Issue
description: Tell us if there is missing or incorrect documentation
labels: [docs]
labels: [documentation]
body:
- type: markdown
attributes:
@@ -24,4 +24,4 @@ body:
- type: textarea
attributes:
label: Where did you find it?
description: If possible, please provide the URL(s) where you found this issue.
description: If possible, please provide the URL(s) where you found this issue.

View File

@@ -50,7 +50,7 @@ runs:
- name: Restore bazel repository cache
id: cache_bazel_repository_restore
continue-on-error: true
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.setup_bazel.outputs.repository-cache-path }}
key: ${{ steps.cache_bazel_repository_key.outputs.repository-cache-key }}

View File

@@ -30,7 +30,7 @@ runs:
using: composite
steps:
- name: Azure login for Trusted Signing (OIDC)
uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2
uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0
with:
client-id: ${{ inputs.client-id }}
tenant-id: ${{ inputs.tenant-id }}
@@ -54,7 +54,7 @@ runs:
} >> "$GITHUB_OUTPUT"
- name: Sign Windows binaries with Azure Trusted Signing
uses: azure/trusted-signing-action@1d365fec12862c4aa68fcac418143d73f0cea293 # v0
uses: azure/trusted-signing-action@1d365fec12862c4aa68fcac418143d73f0cea293 # v0.5.11
with:
endpoint: ${{ inputs.endpoint }}
trusted-signing-account-name: ${{ inputs.account-name }}

View File

@@ -6,25 +6,37 @@ updates:
directory: .github/actions/codex
schedule:
interval: weekly
cooldown:
default-days: 7
- package-ecosystem: cargo
directories:
- codex-rs
- codex-rs/*
schedule:
interval: weekly
cooldown:
default-days: 7
- package-ecosystem: devcontainers
directory: /
schedule:
interval: weekly
cooldown:
default-days: 7
- package-ecosystem: docker
directory: codex-cli
schedule:
interval: weekly
cooldown:
default-days: 7
- package-ecosystem: github-actions
directory: /
schedule:
interval: weekly
cooldown:
default-days: 7
- package-ecosystem: rust-toolchain
directory: codex-rs
schedule:
interval: weekly
cooldown:
default-days: 7

View File

@@ -11,11 +11,11 @@
"path": "codex"
},
"linux-x86_64": {
"regex": "^codex-x86_64-unknown-linux-musl\\.zst$",
"regex": "^codex-x86_64-unknown-linux-musl-bundle\\.tar\\.zst$",
"path": "codex"
},
"linux-aarch64": {
"regex": "^codex-aarch64-unknown-linux-musl\\.zst$",
"regex": "^codex-aarch64-unknown-linux-musl-bundle\\.tar\\.zst$",
"path": "codex"
},
"windows-x86_64": {
@@ -84,6 +84,18 @@
}
}
},
"bwrap": {
"platforms": {
"linux-x86_64": {
"regex": "^bwrap-x86_64-unknown-linux-musl\\.zst$",
"path": "bwrap"
},
"linux-aarch64": {
"regex": "^bwrap-aarch64-unknown-linux-musl\\.zst$",
"path": "bwrap"
}
}
},
"codex-command-runner": {
"platforms": {
"windows-x86_64": {

View File

@@ -5,9 +5,9 @@ tool entries, such as Maven, that can change independently of this repo and
cause avoidable cache misses.
This script derives a smaller, cache-stable PATH that keeps the Windows
toolchain entries Bazel-backed CI tasks need: MSVC and Windows SDK paths, Git,
PowerShell, Node, Python, DotSlash, and the standard Windows system
directories.
toolchain entries Bazel-backed CI tasks need: MSVC and Windows SDK paths,
MinGW runtime DLL paths for gnullvm-built tests, Git, PowerShell, Node, Python,
DotSlash, and the standard Windows system directories.
`setup-bazel-ci` runs this after exporting the MSVC environment, and the script
publishes the result via `GITHUB_ENV` as `CODEX_BAZEL_WINDOWS_PATH` so later
steps can pass that explicit PATH to Bazel.
@@ -49,6 +49,8 @@ foreach ($pathEntry in ($env:PATH -split ';')) {
$pathEntry -like '*Microsoft Visual Studio*' -or
$pathEntry -like '*Windows Kits*' -or
$pathEntry -like '*Microsoft SDKs*' -or
$pathEntry -eq 'C:\mingw64\bin' -or
$pathEntry -like 'C:\msys64\*\bin' -or
$pathEntry -like 'C:\Program Files\Git\*' -or
$pathEntry -like 'C:\Program Files\PowerShell\*' -or
$pathEntry -like 'C:\hostedtoolcache\windows\node\*' -or
@@ -85,6 +87,12 @@ if ($pwshCommand) {
Add-StablePathEntry (Split-Path $pwshCommand.Source -Parent)
}
foreach ($mingwPath in @('C:\mingw64\bin', 'C:\msys64\mingw64\bin', 'C:\msys64\ucrt64\bin')) {
if (Test-Path $mingwPath) {
Add-StablePathEntry $mingwPath
}
}
if ($windowsAppsPath) {
Add-StablePathEntry $windowsAppsPath
}

View File

@@ -6,6 +6,7 @@ print_failed_bazel_test_logs=0
print_failed_bazel_action_summary=0
remote_download_toplevel=0
windows_msvc_host_platform=0
windows_cross_compile=0
while [[ $# -gt 0 ]]; do
case "$1" in
@@ -25,6 +26,10 @@ while [[ $# -gt 0 ]]; do
windows_msvc_host_platform=1
shift
;;
--windows-cross-compile)
windows_cross_compile=1
shift
;;
--)
shift
break
@@ -37,7 +42,7 @@ while [[ $# -gt 0 ]]; do
done
if [[ $# -eq 0 ]]; then
echo "Usage: $0 [--print-failed-test-logs] [--print-failed-action-summary] [--remote-download-toplevel] [--windows-msvc-host-platform] -- <bazel args> -- <targets>" >&2
echo "Usage: $0 [--print-failed-test-logs] [--print-failed-action-summary] [--remote-download-toplevel] [--windows-msvc-host-platform] [--windows-cross-compile] -- <bazel args> -- <targets>" >&2
exit 1
fi
@@ -61,7 +66,11 @@ case "${RUNNER_OS:-}" in
ci_config=ci-macos
;;
Windows)
ci_config=ci-windows
if [[ $windows_cross_compile -eq 1 ]]; then
ci_config=ci-windows-cross
else
ci_config=ci-windows
fi
;;
esac
@@ -105,8 +114,8 @@ print_bazel_test_log_tails() {
while IFS= read -r target; do
failed_targets+=("$target")
done < <(
grep -E '^FAIL: //' "$console_log" \
| sed -E 's#^FAIL: (//[^ ]+).*#\1#' \
grep -E '^(FAIL: //|ERROR: .* Testing //)' "$console_log" \
| sed -E 's#^FAIL: (//[^ ]+).*#\1#; s#^ERROR: .* Testing (//[^ ]+) failed:.*#\1#' \
| sort -u
)
@@ -244,6 +253,12 @@ if [[ ${#bazel_args[@]} -eq 0 || ${#bazel_targets[@]} -eq 0 ]]; then
exit 1
fi
if [[ "${RUNNER_OS:-}" == "Windows" && $windows_cross_compile -eq 1 && -z "${BUILDBUDDY_API_KEY:-}" ]]; then
# Fork PRs do not receive the BuildBuddy secret needed for the remote
# cross-compile config. Preserve the previous local Windows build shape.
windows_msvc_host_platform=1
fi
post_config_bazel_args=()
if [[ "${RUNNER_OS:-}" == "Windows" && $windows_msvc_host_platform -eq 1 ]]; then
has_host_platform_override=0
@@ -269,6 +284,25 @@ if [[ $remote_download_toplevel -eq 1 ]]; then
post_config_bazel_args+=(--remote_download_toplevel)
fi
if [[ "${RUNNER_OS:-}" == "Windows" && $windows_cross_compile -eq 1 && -n "${BUILDBUDDY_API_KEY:-}" ]]; then
# `--enable_platform_specific_config` expands `common:windows` on Windows
# hosts after ordinary rc configs, which can override `ci-windows-cross`'s
# RBE host platform. Repeat the host platform on the command line so V8 and
# other genrules execute on Linux RBE workers instead of Git Bash locally.
#
# Bazel also derives the default genrule shell from the client host. Without
# an explicit shell executable, remote Linux actions can be asked to run
# `C:\Program Files\Git\usr\bin\bash.exe`.
post_config_bazel_args+=(--host_platform=//:rbe --shell_executable=/bin/bash)
fi
if [[ "${RUNNER_OS:-}" == "Windows" && $windows_cross_compile -eq 1 && -z "${BUILDBUDDY_API_KEY:-}" ]]; then
# The Windows cross-compile config depends on remote execution. Fork PRs do
# not receive the BuildBuddy secret, so fall back to the existing local build
# shape and keep its lower concurrency cap.
post_config_bazel_args+=(--jobs=8)
fi
if [[ -n "${BAZEL_REPO_CONTENTS_CACHE:-}" ]]; then
# Windows self-hosted runners can run multiple Bazel jobs concurrently. Give
# each job its own repo contents cache so they do not fight over the shared
@@ -287,37 +321,57 @@ if [[ -n "${CODEX_BAZEL_EXECUTION_LOG_COMPACT_DIR:-}" ]]; then
fi
if [[ "${RUNNER_OS:-}" == "Windows" ]]; then
windows_action_env_vars=(
INCLUDE
LIB
LIBPATH
UCRTVersion
UniversalCRTSdkDir
VCINSTALLDIR
VCToolsInstallDir
WindowsLibPath
WindowsSdkBinPath
WindowsSdkDir
WindowsSDKLibVersion
WindowsSDKVersion
)
pass_windows_build_env=1
if [[ $windows_cross_compile -eq 1 && -n "${BUILDBUDDY_API_KEY:-}" ]]; then
# Remote build actions execute on Linux RBE workers. Passing the Windows
# runner's build environment there makes Bazel genrules try to execute
# C:\Program Files\Git\usr\bin\bash.exe on Linux.
pass_windows_build_env=0
fi
for env_var in "${windows_action_env_vars[@]}"; do
if [[ -n "${!env_var:-}" ]]; then
post_config_bazel_args+=("--action_env=${env_var}" "--host_action_env=${env_var}")
fi
done
if [[ $pass_windows_build_env -eq 1 ]]; then
windows_action_env_vars=(
INCLUDE
LIB
LIBPATH
UCRTVersion
UniversalCRTSdkDir
VCINSTALLDIR
VCToolsInstallDir
WindowsLibPath
WindowsSdkBinPath
WindowsSdkDir
WindowsSDKLibVersion
WindowsSDKVersion
)
for env_var in "${windows_action_env_vars[@]}"; do
if [[ -n "${!env_var:-}" ]]; then
post_config_bazel_args+=("--action_env=${env_var}" "--host_action_env=${env_var}")
fi
done
fi
if [[ -z "${CODEX_BAZEL_WINDOWS_PATH:-}" ]]; then
echo "CODEX_BAZEL_WINDOWS_PATH must be set for Windows Bazel CI." >&2
exit 1
fi
post_config_bazel_args+=(
"--action_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}"
"--host_action_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}"
"--test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}"
)
if [[ $pass_windows_build_env -eq 1 ]]; then
post_config_bazel_args+=(
"--action_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}"
"--host_action_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}"
)
elif [[ $windows_cross_compile -eq 1 ]]; then
# Remote build actions run on Linux RBE workers. Give their shell snippets
# a Linux PATH while preserving CODEX_BAZEL_WINDOWS_PATH below for local
# Windows test execution.
post_config_bazel_args+=(
"--action_env=PATH=/usr/bin:/bin"
"--host_action_env=PATH=/usr/bin:/bin"
)
fi
post_config_bazel_args+=("--test_env=PATH=${CODEX_BAZEL_WINDOWS_PATH}")
fi
bazel_console_log="$(mktemp)"

View File

@@ -6,8 +6,13 @@ set -euo pipefail
# invocation so target-discovery queries can reuse the same Bazel server.
query_args=()
windows_cross_compile=0
while [[ $# -gt 0 ]]; do
case "$1" in
--windows-cross-compile)
windows_cross_compile=1
shift
;;
--)
shift
break
@@ -20,7 +25,7 @@ while [[ $# -gt 0 ]]; do
done
if [[ $# -ne 1 ]]; then
echo "Usage: $0 [<bazel query args>...] -- <query expression>" >&2
echo "Usage: $0 [--windows-cross-compile] [<bazel query args>...] -- <query expression>" >&2
exit 1
fi
@@ -32,7 +37,11 @@ case "${RUNNER_OS:-}" in
ci_config=ci-macos
;;
Windows)
ci_config=ci-windows
if [[ $windows_cross_compile -eq 1 ]]; then
ci_config=ci-windows-cross
else
ci_config=ci-windows
fi
;;
esac

View File

@@ -63,8 +63,10 @@ def bazel_output_files(
platform: str,
labels: list[str],
compilation_mode: str = "fastbuild",
bazel_configs: list[str] | None = None,
) -> list[Path]:
expression = "set(" + " ".join(labels) + ")"
bazel_configs = bazel_configs or []
result = subprocess.run(
[
"bazel",
@@ -72,6 +74,7 @@ def bazel_output_files(
"-c",
compilation_mode,
f"--platforms=@llvm//platforms:{platform}",
*[f"--config={config}" for config in bazel_configs],
"--output=files",
expression,
],
@@ -87,7 +90,9 @@ def bazel_build(
platform: str,
labels: list[str],
compilation_mode: str = "fastbuild",
bazel_configs: list[str] | None = None,
) -> None:
bazel_configs = bazel_configs or []
subprocess.run(
[
"bazel",
@@ -95,6 +100,7 @@ def bazel_build(
"-c",
compilation_mode,
f"--platforms=@llvm//platforms:{platform}",
*[f"--config={config}" for config in bazel_configs],
*labels,
],
cwd=ROOT,
@@ -106,13 +112,14 @@ def ensure_bazel_output_files(
platform: str,
labels: list[str],
compilation_mode: str = "fastbuild",
bazel_configs: list[str] | None = None,
) -> list[Path]:
outputs = bazel_output_files(platform, labels, compilation_mode)
outputs = bazel_output_files(platform, labels, compilation_mode, bazel_configs)
if all(path.exists() for path in outputs):
return outputs
bazel_build(platform, labels, compilation_mode)
outputs = bazel_output_files(platform, labels, compilation_mode)
bazel_build(platform, labels, compilation_mode, bazel_configs)
outputs = bazel_output_files(platform, labels, compilation_mode, bazel_configs)
missing = [str(path) for path in outputs if not path.exists()]
if missing:
raise SystemExit(f"missing built outputs for {labels}: {missing}")
@@ -187,8 +194,9 @@ def single_bazel_output_file(
platform: str,
label: str,
compilation_mode: str = "fastbuild",
bazel_configs: list[str] | None = None,
) -> Path:
outputs = ensure_bazel_output_files(platform, [label], compilation_mode)
outputs = ensure_bazel_output_files(platform, [label], compilation_mode, bazel_configs)
if len(outputs) != 1:
raise SystemExit(f"expected exactly one output for {label}, found {outputs}")
return outputs[0]
@@ -198,11 +206,17 @@ def merged_musl_archive(
platform: str,
lib_path: Path,
compilation_mode: str = "fastbuild",
bazel_configs: list[str] | None = None,
) -> Path:
llvm_ar = single_bazel_output_file(platform, LLVM_AR_LABEL, compilation_mode)
llvm_ranlib = single_bazel_output_file(platform, LLVM_RANLIB_LABEL, compilation_mode)
llvm_ar = single_bazel_output_file(platform, LLVM_AR_LABEL, compilation_mode, bazel_configs)
llvm_ranlib = single_bazel_output_file(
platform,
LLVM_RANLIB_LABEL,
compilation_mode,
bazel_configs,
)
runtime_archives = [
single_bazel_output_file(platform, label, compilation_mode)
single_bazel_output_file(platform, label, compilation_mode, bazel_configs)
for label in MUSL_RUNTIME_ARCHIVE_LABELS
]
@@ -233,11 +247,13 @@ def stage_release_pair(
target: str,
output_dir: Path,
compilation_mode: str = "fastbuild",
bazel_configs: list[str] | None = None,
) -> None:
outputs = ensure_bazel_output_files(
platform,
[release_pair_label(target)],
compilation_mode,
bazel_configs,
)
try:
@@ -254,7 +270,7 @@ def stage_release_pair(
staged_library = output_dir / staged_archive_name(target, lib_path)
staged_binding = output_dir / f"src_binding_release_{target}.rs"
source_archive = (
merged_musl_archive(platform, lib_path, compilation_mode)
merged_musl_archive(platform, lib_path, compilation_mode, bazel_configs)
if is_musl_archive_target(target, lib_path)
else lib_path
)
@@ -293,6 +309,12 @@ def parse_args() -> argparse.Namespace:
stage_release_pair_parser.add_argument("--platform", required=True)
stage_release_pair_parser.add_argument("--target", required=True)
stage_release_pair_parser.add_argument("--output-dir", required=True)
stage_release_pair_parser.add_argument(
"--bazel-config",
action="append",
default=[],
dest="bazel_configs",
)
stage_release_pair_parser.add_argument(
"--compilation-mode",
default="fastbuild",
@@ -330,6 +352,7 @@ def main() -> int:
target=args.target,
output_dir=Path(args.output_dir),
compilation_mode=args.compilation_mode,
bazel_configs=args.bazel_configs,
)
return 0
if args.command == "resolved-v8-crate-version":

View File

@@ -25,7 +25,10 @@ TOP_LEVEL_NAME_EXCEPTIONS = {
UTILITY_NAME_EXCEPTIONS = {
"path-utils": "codex-utils-path",
}
MANIFEST_FEATURE_EXCEPTIONS = {}
MANIFEST_FEATURE_EXCEPTIONS = {
"codex-rs/code-mode/Cargo.toml": {"sandbox": ("v8/v8_enable_sandbox",)},
"codex-rs/v8-poc/Cargo.toml": {"sandbox": ("v8/v8_enable_sandbox",)},
}
OPTIONAL_DEPENDENCY_EXCEPTIONS = set()
INTERNAL_DEPENDENCY_FEATURE_EXCEPTIONS = {}

View File

@@ -17,13 +17,10 @@ concurrency:
cancel-in-progress: ${{ github.ref_name != 'main' }}
jobs:
test:
# Even though a no-cache-hit Windows build seems to exceed the 30-minute
# limit on occasion, the more common reason for exceeding the limit is a
# true test failure in a rust_test() marked "flaky" that gets run 3x.
# In that case, extra time generally does not give us more signal.
#
# Ultimately we need true distributed builds (e.g.,
# https://www.buildbuddy.io/docs/rbe-setup/) to speed things up.
# PRs use the sharded Windows cross-compiled test jobs below. Post-merge
# pushes to main also run the native Windows test job for broader Windows
# signal without putting PR latency back on the critical path. Cargo CI
# owns V8/code-mode test coverage for now.
timeout-minutes: 30
strategy:
fail-fast: false
@@ -47,16 +44,16 @@ jobs:
# - os: ubuntu-24.04-arm
# target: aarch64-unknown-linux-gnu
# Windows
- os: windows-latest
target: x86_64-pc-windows-gnullvm
runs-on: ${{ matrix.os }}
# Configure a human readable name for each job
name: Local Bazel build on ${{ matrix.os }} for ${{ matrix.target }}
name: Bazel test on ${{ matrix.os }} for ${{ matrix.target }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Check rusty_v8 MODULE.bazel checksums
if: matrix.os == 'ubuntu-24.04' && matrix.target == 'x86_64-unknown-linux-gnu'
@@ -88,9 +85,15 @@ jobs:
# path. V8 consumers under `//codex-rs/...` still participate
# transitively through `//...`.
-//third_party/v8:all
# V8-backed code-mode tests are covered by Cargo CI. Bazel CI
# cross-compiles in several legs, and those tests are not stable in
# that setup yet.
-//codex-rs/code-mode:code-mode-unit-tests
-//codex-rs/v8-poc:v8-poc-unit-tests
)
bazel_wrapper_args=(
--print-failed-action-summary
--print-failed-test-logs
)
bazel_test_args=(
@@ -99,11 +102,6 @@ jobs:
--test_verbose_timeout_warnings
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
)
if [[ "${RUNNER_OS}" == "Windows" ]]; then
bazel_wrapper_args+=(--windows-msvc-host-platform)
bazel_test_args+=(--jobs=8)
fi
./.github/scripts/run-bazel-ci.sh \
"${bazel_wrapper_args[@]}" \
-- \
@@ -114,7 +112,7 @@ jobs:
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-test-${{ matrix.target }}
path: ${{ runner.temp }}/bazel-execution-logs
@@ -125,7 +123,195 @@ jobs:
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}
test-windows-shard:
# Split the Windows Bazel test leg across separate Windows
# hosts. Each shard still uses Linux RBE for build actions, but the test
# execution itself happens on its own Windows runner.
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
shard:
- 1
- 2
- 3
- 4
runs-on: windows-latest
name: Bazel test on windows-latest for x86_64-pc-windows-gnullvm shard ${{ matrix.shard }}/4
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: x86_64-pc-windows-gnullvm
# Reuse the former monolithic Windows test cache for restores. Do
# not save it from every shard below; duplicate uploads would sit on
# the PR-blocking critical path after the useful test work is done.
cache-scope: bazel-test
install-test-prereqs: "true"
- name: bazel test shard
env:
BAZEL_TEST_SHARD: ${{ matrix.shard }}
BAZEL_TEST_SHARD_COUNT: 4
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
set -euo pipefail
bazel_test_query='tests(//...) except tests(//third_party/v8:all) except //codex-rs/code-mode:code-mode-unit-tests except //codex-rs/v8-poc:v8-poc-unit-tests except attr(tags, "manual", tests(//...))'
mapfile -t bazel_targets < <(
MSYS2_ARG_CONV_EXCL='*' bazel query --output=label "${bazel_test_query}" \
| LC_ALL=C sort
)
selected_targets=()
for bazel_target in "${bazel_targets[@]}"; do
target_bucket="$(
printf '%s\n' "${bazel_target}" \
| cksum \
| awk -v shard_count="${BAZEL_TEST_SHARD_COUNT}" '{ print ($1 % shard_count) + 1 }'
)"
if [[ "${target_bucket}" == "${BAZEL_TEST_SHARD}" ]]; then
selected_targets+=("${bazel_target}")
fi
done
if [[ ${#selected_targets[@]} -eq 0 ]]; then
echo "No Bazel test targets selected for Windows shard ${BAZEL_TEST_SHARD}/${BAZEL_TEST_SHARD_COUNT}." >&2
exit 1
fi
echo "Selected ${#selected_targets[@]} of ${#bazel_targets[@]} Bazel test targets for Windows shard ${BAZEL_TEST_SHARD}/${BAZEL_TEST_SHARD_COUNT}."
bazel_test_args=(
test
--skip_incompatible_explicit_targets
--test_tag_filters=-argument-comment-lint
--test_verbose_timeout_warnings
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
--build_metadata=TAG_windows_test_shard=${BAZEL_TEST_SHARD}
)
./.github/scripts/run-bazel-ci.sh \
--print-failed-action-summary \
--print-failed-test-logs \
--windows-cross-compile \
--remote-download-toplevel \
-- \
"${bazel_test_args[@]}" \
-- \
"${selected_targets[@]}"
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-test-x86_64-pc-windows-gnullvm-shard-${{ matrix.shard }}
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
test-windows:
# Preserve the existing required-check surface while the real work happens
# in the sharded Windows jobs above.
if: always()
needs: test-windows-shard
runs-on: ubuntu-24.04
name: Bazel test on windows-latest for x86_64-pc-windows-gnullvm
steps:
- name: Confirm Windows Bazel test shards passed
shell: bash
run: |
if [[ "${{ needs.test-windows-shard.result }}" != "success" ]]; then
echo "Windows Bazel test shards finished with result: ${{ needs.test-windows-shard.result }}" >&2
exit 1
fi
test-windows-native-main:
# Native Windows Bazel tests are slower and frequently approach the
# 30-minute PR budget. Run this only for post-merge commits to main and give
# it a larger timeout.
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
timeout-minutes: 40
runs-on: windows-latest
name: Bazel test on windows-latest for x86_64-pc-windows-gnullvm (native main)
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
uses: ./.github/actions/prepare-bazel-ci
with:
target: x86_64-pc-windows-gnullvm
cache-scope: bazel-${{ github.job }}
install-test-prereqs: "true"
- name: bazel test //...
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
bazel_targets=(
//...
# Keep standalone V8 library targets out of the ordinary Bazel CI
# path. V8 consumers under `//codex-rs/...` still participate
# transitively through `//...`.
-//third_party/v8:all
# Keep this aligned with the main Bazel job. The native Windows
# job preserves broad post-merge coverage, but code-mode/V8 tests
# are covered by Cargo CI rather than Bazel for now.
-//codex-rs/code-mode:code-mode-unit-tests
-//codex-rs/v8-poc:v8-poc-unit-tests
)
bazel_test_args=(
test
--test_tag_filters=-argument-comment-lint
--test_verbose_timeout_warnings
--build_metadata=COMMIT_SHA=${GITHUB_SHA}
--build_metadata=TAG_windows_native_main=true
)
./.github/scripts/run-bazel-ci.sh \
--print-failed-action-summary \
--print-failed-test-logs \
-- \
"${bazel_test_args[@]}" \
-- \
"${bazel_targets[@]}"
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-test-windows-native-x86_64-pc-windows-gnullvm
path: ${{ runner.temp }}/bazel-execution-logs
if-no-files-found: ignore
# Save the job-scoped Bazel repository cache after cache misses. Keep the
# upload non-fatal so cache service issues never fail the job itself.
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}
@@ -150,7 +336,10 @@ jobs:
name: Bazel clippy on ${{ matrix.os }} for ${{ matrix.target }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
@@ -170,17 +359,24 @@ jobs:
--build_metadata=TAG_job=clippy
)
bazel_wrapper_args=()
bazel_target_list_args=()
if [[ "${RUNNER_OS}" == "Windows" ]]; then
# Keep this aligned with the Windows Bazel test job. With the
# default `//:local_windows` host platform, Windows `rust_test`
# targets such as `//codex-rs/core:core-all-test` can be skipped
# by `--skip_incompatible_explicit_targets`, which hides clippy
# diagnostics from integration-test modules.
bazel_wrapper_args+=(--windows-msvc-host-platform)
bazel_clippy_args+=(--skip_incompatible_explicit_targets)
# Keep this aligned with the fast Windows Bazel test job: use
# Linux RBE for clippy build actions while targeting Windows
# gnullvm. Fork/community PRs without the BuildBuddy secret fall
# back inside `run-bazel-ci.sh` to the previous local Windows MSVC
# host-platform shape.
bazel_wrapper_args+=(--windows-cross-compile)
bazel_target_list_args+=(--windows-cross-compile)
if [[ -z "${BUILDBUDDY_API_KEY:-}" ]]; then
# The fork fallback can see incompatible explicit Windows-cross
# internal test binaries in the generated target list. Preserve
# the old local-fallback behavior there.
bazel_clippy_args+=(--skip_incompatible_explicit_targets)
fi
fi
bazel_target_lines="$(./scripts/list-bazel-clippy-targets.sh)"
bazel_target_lines="$(./scripts/list-bazel-clippy-targets.sh "${bazel_target_list_args[@]}")"
bazel_targets=()
while IFS= read -r target; do
bazel_targets+=("${target}")
@@ -198,7 +394,7 @@ jobs:
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-clippy-${{ matrix.target }}
path: ${{ runner.temp }}/bazel-execution-logs
@@ -209,7 +405,7 @@ jobs:
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}
@@ -230,7 +426,10 @@ jobs:
name: Verify release build on ${{ matrix.os }} for ${{ matrix.target }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Prepare Bazel CI
id: prepare_bazel
@@ -252,7 +451,12 @@ jobs:
# Rust debug assertions explicitly.
bazel_wrapper_args=()
if [[ "${RUNNER_OS}" == "Windows" ]]; then
bazel_wrapper_args+=(--windows-msvc-host-platform)
# This is build-only signal, so use the same Linux-RBE
# cross-compile path as the fast Windows test and clippy jobs.
# Fork/community PRs without the BuildBuddy secret fall back
# inside `run-bazel-ci.sh` to the previous local Windows MSVC
# host-platform shape.
bazel_wrapper_args+=(--windows-cross-compile)
fi
bazel_build_args=(
@@ -278,10 +482,26 @@ jobs:
-- \
"${bazel_targets[@]}"
- name: Verify Bazel builds bwrap
if: runner.os == 'Linux'
env:
BUILDBUDDY_API_KEY: ${{ secrets.BUILDBUDDY_API_KEY }}
shell: bash
run: |
./.github/scripts/run-bazel-ci.sh \
--remote-download-toplevel \
--print-failed-action-summary \
-- \
build \
--build_metadata=COMMIT_SHA=${GITHUB_SHA} \
--build_metadata=TAG_job=verify-bwrap \
-- \
//codex-rs/bwrap:bwrap
- name: Upload Bazel execution logs
if: always() && !cancelled()
continue-on-error: true
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: bazel-execution-logs-verify-release-build-${{ matrix.target }}
path: ${{ runner.temp }}/bazel-execution-logs
@@ -292,7 +512,7 @@ jobs:
- name: Save bazel repository cache
if: always() && !cancelled() && steps.prepare_bazel.outputs.repository-cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ steps.prepare_bazel.outputs.repository-cache-path }}
key: ${{ steps.prepare_bazel.outputs.repository-cache-key }}

View File

@@ -8,17 +8,19 @@ jobs:
name: Blob size policy
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
fetch-depth: 0
persist-credentials: false
- name: Determine PR comparison range
id: range
shell: bash
run: |
set -euo pipefail
echo "base=$(git rev-parse HEAD^1)" >> "$GITHUB_OUTPUT"
echo "head=$(git rev-parse HEAD^2)" >> "$GITHUB_OUTPUT"
echo "base=${{ github.event.pull_request.base.sha }}" >> "$GITHUB_OUTPUT"
echo "head=${{ github.event.pull_request.head.sha }}" >> "$GITHUB_OUTPUT"
- name: Check changed blob sizes
env:

View File

@@ -14,13 +14,16 @@ jobs:
working-directory: ./codex-rs
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- name: Run cargo-deny
uses: EmbarkStudios/cargo-deny-action@82eb9f621fbc699dd0918f3ea06864c14cc84246 # v2
with:
rust-version: stable
rust-version: 1.93.0
manifest-path: ./codex-rs/Cargo.toml

View File

@@ -12,7 +12,10 @@ jobs:
NODE_OPTIONS: --max-old-space-size=4096
steps:
- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Verify codex-rs Cargo manifests inherit workspace settings
run: python3 .github/scripts/verify_cargo_workspace_manifests.py
@@ -29,7 +32,7 @@ jobs:
run_install: false
- name: Setup Node.js
uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6
uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
with:
node-version: 22
@@ -52,16 +55,18 @@ jobs:
CODEX_VERSION=0.125.0
WORKFLOW_URL="https://github.com/openai/codex/actions/runs/24901475298"
OUTPUT_DIR="${RUNNER_TEMP}"
# This reused workflow predates the standalone bwrap artifact.
python3 ./scripts/stage_npm_packages.py \
--release-version "$CODEX_VERSION" \
--workflow-url "$WORKFLOW_URL" \
--package codex \
--allow-missing-native-component bwrap \
--output-dir "$OUTPUT_DIR"
PACK_OUTPUT="${OUTPUT_DIR}/codex-npm-${CODEX_VERSION}.tgz"
echo "pack_output=$PACK_OUTPUT" >> "$GITHUB_OUTPUT"
- name: Upload staged npm package artifact
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: codex-npm-staging
path: ${{ steps.stage_npm_package.outputs.pack_output }}

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Close inactive PRs from contributors
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8.0.0
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |

View File

@@ -18,9 +18,12 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Annotate locations with typos
uses: codespell-project/codespell-problem-matcher@b80729f885d32f78a716c2f107b4db1025001c42 # v1
uses: codespell-project/codespell-problem-matcher@b80729f885d32f78a716c2f107b4db1025001c42 # v1.1.0
- name: Codespell
uses: codespell-project/actions-codespell@8f01853be192eb0f849a5c7d721450e7a467c579 # v2.2
with:

View File

@@ -15,12 +15,8 @@ jobs:
permissions:
contents: read
outputs:
issues_json: ${{ steps.normalize-all.outputs.issues_json }}
reason: ${{ steps.normalize-all.outputs.reason }}
has_matches: ${{ steps.normalize-all.outputs.has_matches }}
codex_output: ${{ steps.codex-all.outputs.final-message }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Prepare Codex inputs
env:
GH_TOKEN: ${{ github.token }}
@@ -65,6 +61,8 @@ jobs:
with:
openai-api-key: ${{ secrets.CODEX_OPENAI_API_KEY }}
allow-users: "*"
safety-strategy: drop-sudo
sandbox: read-only
prompt: |
You are an assistant that triages new GitHub issues by identifying potential duplicates.
@@ -98,10 +96,21 @@ jobs:
"additionalProperties": false
}
normalize-duplicates-all:
name: Normalize pass 1 output
needs: gather-duplicates-all
if: ${{ needs.gather-duplicates-all.result == 'success' }}
runs-on: ubuntu-latest
permissions: {}
outputs:
issues_json: ${{ steps.normalize-all.outputs.issues_json }}
reason: ${{ steps.normalize-all.outputs.reason }}
has_matches: ${{ steps.normalize-all.outputs.has_matches }}
steps:
- id: normalize-all
name: Normalize pass 1 output
env:
CODEX_OUTPUT: ${{ steps.codex-all.outputs.final-message }}
CODEX_OUTPUT: ${{ needs.gather-duplicates-all.outputs.codex_output }}
CURRENT_ISSUE_NUMBER: ${{ github.event.issue.number }}
run: |
set -eo pipefail
@@ -144,19 +153,15 @@ jobs:
gather-duplicates-open:
name: Identify potential duplicates (open issues fallback)
# Pass 1 may drop sudo on the runner, so run the fallback in a fresh job.
needs: gather-duplicates-all
if: ${{ needs.gather-duplicates-all.result == 'success' && needs.gather-duplicates-all.outputs.has_matches != 'true' }}
# Pass 1 Codex execution drops sudo on its runner, so run the fallback in a fresh job.
needs: normalize-duplicates-all
if: ${{ needs.normalize-duplicates-all.result == 'success' && needs.normalize-duplicates-all.outputs.has_matches != 'true' }}
runs-on: ubuntu-latest
permissions:
contents: read
outputs:
issues_json: ${{ steps.normalize-open.outputs.issues_json }}
reason: ${{ steps.normalize-open.outputs.reason }}
has_matches: ${{ steps.normalize-open.outputs.has_matches }}
codex_output: ${{ steps.codex-open.outputs.final-message }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Prepare Codex inputs
env:
GH_TOKEN: ${{ github.token }}
@@ -199,6 +204,8 @@ jobs:
with:
openai-api-key: ${{ secrets.CODEX_OPENAI_API_KEY }}
allow-users: "*"
safety-strategy: drop-sudo
sandbox: read-only
prompt: |
You are an assistant that triages new GitHub issues by identifying potential duplicates.
@@ -232,10 +239,21 @@ jobs:
"additionalProperties": false
}
normalize-duplicates-open:
name: Normalize pass 2 output
needs: gather-duplicates-open
if: ${{ needs.gather-duplicates-open.result == 'success' }}
runs-on: ubuntu-latest
permissions: {}
outputs:
issues_json: ${{ steps.normalize-open.outputs.issues_json }}
reason: ${{ steps.normalize-open.outputs.reason }}
has_matches: ${{ steps.normalize-open.outputs.has_matches }}
steps:
- id: normalize-open
name: Normalize pass 2 output
env:
CODEX_OUTPUT: ${{ steps.codex-open.outputs.final-message }}
CODEX_OUTPUT: ${{ needs.gather-duplicates-open.outputs.codex_output }}
CURRENT_ISSUE_NUMBER: ${{ github.event.issue.number }}
run: |
set -eo pipefail
@@ -279,9 +297,9 @@ jobs:
select-final:
name: Select final duplicate set
needs:
- gather-duplicates-all
- gather-duplicates-open
if: ${{ always() && needs.gather-duplicates-all.result == 'success' && (needs.gather-duplicates-open.result == 'success' || needs.gather-duplicates-open.result == 'skipped') }}
- normalize-duplicates-all
- normalize-duplicates-open
if: ${{ always() && needs.normalize-duplicates-all.result == 'success' && (needs.normalize-duplicates-open.result == 'success' || needs.normalize-duplicates-open.result == 'skipped') }}
runs-on: ubuntu-latest
permissions:
contents: read
@@ -291,12 +309,12 @@ jobs:
- id: select-final
name: Select final duplicate set
env:
PASS1_ISSUES: ${{ needs.gather-duplicates-all.outputs.issues_json }}
PASS1_REASON: ${{ needs.gather-duplicates-all.outputs.reason }}
PASS2_ISSUES: ${{ needs.gather-duplicates-open.outputs.issues_json }}
PASS2_REASON: ${{ needs.gather-duplicates-open.outputs.reason }}
PASS1_HAS_MATCHES: ${{ needs.gather-duplicates-all.outputs.has_matches }}
PASS2_HAS_MATCHES: ${{ needs.gather-duplicates-open.outputs.has_matches }}
PASS1_ISSUES: ${{ needs.normalize-duplicates-all.outputs.issues_json }}
PASS1_REASON: ${{ needs.normalize-duplicates-all.outputs.reason }}
PASS2_ISSUES: ${{ needs.normalize-duplicates-open.outputs.issues_json }}
PASS2_REASON: ${{ needs.normalize-duplicates-open.outputs.reason }}
PASS1_HAS_MATCHES: ${{ needs.normalize-duplicates-all.outputs.has_matches }}
PASS2_HAS_MATCHES: ${{ needs.normalize-duplicates-open.outputs.has_matches }}
run: |
set -eo pipefail
@@ -342,7 +360,7 @@ jobs:
issues: write
steps:
- name: Comment on issue
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8.0.0
env:
CODEX_OUTPUT: ${{ needs.select-final.outputs.codex_output }}
with:

View File

@@ -17,13 +17,13 @@ jobs:
outputs:
codex_output: ${{ steps.codex.outputs.final-message }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- id: codex
uses: openai/codex-action@5c3f4ccdb2b8790f73d6b21751ac00e602aa0c02 # v1.7
with:
openai-api-key: ${{ secrets.CODEX_OPENAI_API_KEY }}
allow-users: "*"
safety-strategy: drop-sudo
sandbox: read-only
prompt: |
You are an assistant that reviews GitHub issues for the repository.
@@ -44,7 +44,7 @@ jobs:
6. iOS — Issues with the Codex iOS app.
- Additionally add zero or more of the following labels that are relevant to the issue content. Prefer a small set of precise labels over many broad ones.
- For agent-area issues, prefer the most specific applicable label. Use "agent" only as a fallback for agent-related issues that do not fit a more specific agent-area label. Prefer "app-server" over "session" or "config" when the issue is about app-server protocol, API, RPC, schema, launch, or bridge behavior.
- For agent-area issues, prefer the most specific applicable label. Use "agent" only as a fallback for agent-related issues that do not fit a more specific agent-area label. Prefer "app-server" over "session" or "config" when the issue is about app-server protocol, API, RPC, schema, launch, or bridge behavior. Use "memory" for agentic memory storage/retrieval and "performance" for high process memory utilization or memory leaks.
1. windows-os — Bugs or friction specific to Windows environments (always when PowerShell is mentioned, path handling, copy/paste, OS-specific auth or tooling failures).
2. mcp — Topics involving Model Context Protocol servers/clients.
3. mcp-server — Problems related to the codex mcp-server command, where codex runs as an MCP server.
@@ -68,7 +68,15 @@ jobs:
21. session - Issues involving session or thread management, including resume, fork, archive, rename/title, thread history, rollout persistence, compaction, checkpoints, retention, and cross-session state.
22. config - Issues involving config.toml, config keys, config key merging, config updates, profiles, hooks config, project config, agent role TOMLs, instruction/personality config, and config schema behavior.
23. plan - Issues involving plan mode, planning workflows, or plan-specific tools/behavior.
24. agent - Fallback only for core agent loop or agent-related issues that do not fit app-server, connectivity, subagent, session, config, or plan.
24. computer-use - Issues involving agentic computer use or SkyComputerUseService.
25. browser - Issues involving agentic browser use, IAB, or the built-in browser within the Codex app.
26. memory - Issues involving agentic memory storage and retrieval.
27. imagen - Issues involving image generation.
28. remote - Issues involving remote access, remote control, or SSH.
29. performance - Issues involving slow, laggy performance, high memory utilization, or memory leaks.
30. automations - Issues involving scheduled automation tasks or heartbeats.
31. pets - Issues involving pets avatars and animations.
32. agent - Fallback only for core agent loop or agent-related issues that do not fit app-server, connectivity, subagent, session, config, plan, computer-use, browser, memory, imagen, remote, performance, automations, or pets.
Issue number: ${{ github.event.issue.number }}

View File

@@ -7,6 +7,11 @@ on:
workflow_dispatch:
# CI builds in debug (dev) for faster signal.
env:
# Cargo's libgit2 transport has been flaky on macOS when fetching git
# dependencies with nested submodules. Use the system git CLI, which has
# better network/proxy behavior and matches Cargo's own suggested fallback.
CARGO_NET_GIT_FETCH_WITH_CLI: "true"
jobs:
# --- CI that doesn't need specific targets ---------------------------------
@@ -17,7 +22,9 @@ jobs:
run:
working-directory: codex-rs
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
with:
components: rustfmt
@@ -31,14 +38,15 @@ jobs:
run:
working-directory: codex-rs
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
tool: cargo-shear
version: 1.5.1
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2.62.49
with:
tool: cargo-shear@1.11.2
- name: cargo shear
run: cargo shear
run: cargo shear --deny-warnings
argument_comment_lint_package:
name: Argument comment lint package
@@ -47,14 +55,16 @@ jobs:
CARGO_DYLINT_VERSION: 5.0.0
DYLINT_LINK_VERSION: 5.0.0
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
with:
toolchain: nightly-2025-09-18
components: llvm-tools-preview, rustc-dev, rust-src
- name: Cache cargo-dylint tooling
id: cargo_dylint_cache
uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/cargo-dylint
@@ -97,7 +107,9 @@ jobs:
group: codex-runners
labels: codex-windows-x64
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: ./.github/actions/setup-bazel-ci
with:
target: ${{ runner.os }}
@@ -233,7 +245,9 @@ jobs:
labels: codex-windows-arm64
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Install Linux build dependencies
if: ${{ runner.os == 'Linux' }}
shell: bash
@@ -276,7 +290,7 @@ jobs:
# avoid caching the large target dir on the gnu-dev job.
- name: Restore cargo home cache
id: cache_cargo_home_restore
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/
@@ -294,7 +308,7 @@ jobs:
# Install and restore sccache cache
- name: Install sccache
if: ${{ env.USE_SCCACHE == 'true' }}
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2.62.49
with:
tool: sccache
version: 0.7.5
@@ -321,7 +335,7 @@ jobs:
- name: Restore sccache cache (fallback)
if: ${{ env.USE_SCCACHE == 'true' && env.SCCACHE_GHA_ENABLED != 'true' }}
id: cache_sccache_restore
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ github.workspace }}/.sccache/
key: sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-${{ steps.lockhash.outputs.hash }}-${{ github.run_id }}
@@ -348,7 +362,7 @@ jobs:
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl'}}
name: Restore APT cache (musl)
id: cache_apt_restore
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
/var/cache/apt
@@ -356,7 +370,7 @@ jobs:
- if: ${{ matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl'}}
name: Install Zig
uses: mlugg/setup-zig@d1434d08867e3ee9daa34448df10607b98908d29 # v2
uses: mlugg/setup-zig@d1434d08867e3ee9daa34448df10607b98908d29 # v2.2.1
with:
version: 0.14.0
@@ -430,7 +444,7 @@ jobs:
- name: Install cargo-chef
if: ${{ matrix.profile == 'release' }}
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2.62.49
with:
tool: cargo-chef
version: 0.1.71
@@ -449,7 +463,7 @@ jobs:
- name: Upload Cargo timings (clippy)
if: always()
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: cargo-timings-rust-ci-clippy-${{ matrix.target }}-${{ matrix.profile }}
path: codex-rs/target/**/cargo-timings/cargo-timing.html
@@ -460,7 +474,7 @@ jobs:
- name: Save cargo home cache
if: always() && !cancelled() && steps.cache_cargo_home_restore.outputs.cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/
@@ -476,7 +490,7 @@ jobs:
- name: Save sccache cache (fallback)
if: always() && !cancelled() && env.USE_SCCACHE == 'true' && env.SCCACHE_GHA_ENABLED != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ github.workspace }}/.sccache/
key: sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-${{ steps.lockhash.outputs.hash }}-${{ github.run_id }}
@@ -501,7 +515,7 @@ jobs:
- name: Save APT cache (musl)
if: always() && !cancelled() && (matrix.target == 'x86_64-unknown-linux-musl' || matrix.target == 'aarch64-unknown-linux-musl') && steps.cache_apt_restore.outputs.cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
/var/cache/apt
@@ -559,7 +573,9 @@ jobs:
labels: codex-windows-arm64
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Install Linux build dependencies
if: ${{ runner.os == 'Linux' }}
shell: bash
@@ -567,7 +583,7 @@ jobs:
set -euo pipefail
if command -v apt-get >/dev/null 2>&1; then
sudo apt-get update -y
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends pkg-config libcap-dev
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends pkg-config libcap-dev bubblewrap
fi
# Some integration tests rely on DotSlash being installed.
@@ -590,7 +606,7 @@ jobs:
- name: Restore cargo home cache
id: cache_cargo_home_restore
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/
@@ -603,7 +619,7 @@ jobs:
- name: Install sccache
if: ${{ env.USE_SCCACHE == 'true' }}
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2.62.49
with:
tool: sccache
version: 0.7.5
@@ -630,7 +646,7 @@ jobs:
- name: Restore sccache cache (fallback)
if: ${{ env.USE_SCCACHE == 'true' && env.SCCACHE_GHA_ENABLED != 'true' }}
id: cache_sccache_restore
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/restore@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ github.workspace }}/.sccache/
key: sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-${{ steps.lockhash.outputs.hash }}-${{ github.run_id }}
@@ -638,7 +654,7 @@ jobs:
sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-${{ steps.lockhash.outputs.hash }}-
sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2.62.49
with:
tool: nextest
version: 0.9.103
@@ -674,7 +690,7 @@ jobs:
- name: Upload Cargo timings (nextest)
if: always()
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: cargo-timings-rust-ci-nextest-${{ matrix.target }}-${{ matrix.profile }}
path: codex-rs/target/**/cargo-timings/cargo-timing.html
@@ -683,7 +699,7 @@ jobs:
- name: Save cargo home cache
if: always() && !cancelled() && steps.cache_cargo_home_restore.outputs.cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/
@@ -695,7 +711,7 @@ jobs:
- name: Save sccache cache (fallback)
if: always() && !cancelled() && env.USE_SCCACHE == 'true' && env.SCCACHE_GHA_ENABLED != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: ${{ github.workspace }}/.sccache/
key: sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-${{ steps.lockhash.outputs.hash }}-${{ github.run_id }}
@@ -722,10 +738,12 @@ jobs:
shell: bash
run: |
set +e
if [[ "${{ steps.test.outcome }}" != "success" ]]; then
if [[ "${STEPS_TEST_OUTCOME}" != "success" ]]; then
docker logs codex-remote-test-env || true
fi
docker rm -f codex-remote-test-env >/dev/null 2>&1 || true
env:
STEPS_TEST_OUTCOME: ${{ steps.test.outcome }}
- name: verify tests passed
if: steps.test.outcome == 'failure'

View File

@@ -14,9 +14,11 @@ jobs:
codex: ${{ steps.detect.outputs.codex }}
workflows: ${{ steps.detect.outputs.workflows }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
fetch-depth: 0
persist-credentials: false
- name: Detect changed paths (no external action)
id: detect
shell: bash
@@ -61,7 +63,10 @@ jobs:
run:
working-directory: codex-rs
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
with:
components: rustfmt
@@ -77,14 +82,16 @@ jobs:
run:
working-directory: codex-rs
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
tool: cargo-shear
version: 1.5.1
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2.62.49
with:
tool: cargo-shear@1.11.2
- name: cargo shear
run: cargo shear
run: cargo shear --deny-warnings
argument_comment_lint_package:
name: Argument comment lint package
@@ -95,7 +102,10 @@ jobs:
CARGO_DYLINT_VERSION: 5.0.0
DYLINT_LINK_VERSION: 5.0.0
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- name: Install nightly argument-comment-lint toolchain
shell: bash
@@ -109,7 +119,7 @@ jobs:
rustup default nightly-2025-09-18
- name: Cache cargo-dylint tooling
id: cargo_dylint_cache
uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cargo/bin/cargo-dylint
@@ -170,8 +180,11 @@ jobs:
echo "No argument-comment-lint relevant changes."
echo "run=false" >> "$GITHUB_OUTPUT"
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
if: ${{ steps.argument_comment_lint_gate.outputs.run == 'true' }}
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Run argument comment lint on codex-rs via Bazel
if: ${{ steps.argument_comment_lint_gate.outputs.run == 'true' }}
uses: ./.github/actions/run-argument-comment-lint
@@ -203,20 +216,25 @@ jobs:
# If nothing relevant changed (PR touching only root README, etc.),
# declare success regardless of other jobs.
if [[ '${{ needs.changed.outputs.argument_comment_lint }}' != 'true' && '${{ needs.changed.outputs.codex }}' != 'true' && '${{ needs.changed.outputs.workflows }}' != 'true' ]]; then
if [[ "${NEEDS_CHANGED_OUTPUTS_ARGUMENT_COMMENT_LINT}" != 'true' && "${NEEDS_CHANGED_OUTPUTS_CODEX}" != 'true' && "${NEEDS_CHANGED_OUTPUTS_WORKFLOWS}" != 'true' ]]; then
echo 'No relevant changes -> CI not required.'
exit 0
fi
if [[ '${{ needs.changed.outputs.argument_comment_lint_package }}' == 'true' ]]; then
if [[ "${NEEDS_CHANGED_OUTPUTS_ARGUMENT_COMMENT_LINT_PACKAGE}" == 'true' ]]; then
[[ '${{ needs.argument_comment_lint_package.result }}' == 'success' ]] || { echo 'argument_comment_lint_package failed'; exit 1; }
fi
if [[ '${{ needs.changed.outputs.argument_comment_lint }}' == 'true' || '${{ needs.changed.outputs.workflows }}' == 'true' ]]; then
if [[ "${NEEDS_CHANGED_OUTPUTS_ARGUMENT_COMMENT_LINT}" == 'true' || "${NEEDS_CHANGED_OUTPUTS_WORKFLOWS}" == 'true' ]]; then
[[ '${{ needs.argument_comment_lint_prebuilt.result }}' == 'success' ]] || { echo 'argument_comment_lint_prebuilt failed'; exit 1; }
fi
if [[ '${{ needs.changed.outputs.codex }}' == 'true' || '${{ needs.changed.outputs.workflows }}' == 'true' ]]; then
if [[ "${NEEDS_CHANGED_OUTPUTS_CODEX}" == 'true' || "${NEEDS_CHANGED_OUTPUTS_WORKFLOWS}" == 'true' ]]; then
[[ '${{ needs.general.result }}' == 'success' ]] || { echo 'general failed'; exit 1; }
[[ '${{ needs.cargo_shear.result }}' == 'success' ]] || { echo 'cargo_shear failed'; exit 1; }
fi
env:
NEEDS_CHANGED_OUTPUTS_ARGUMENT_COMMENT_LINT: ${{ needs.changed.outputs.argument_comment_lint }}
NEEDS_CHANGED_OUTPUTS_CODEX: ${{ needs.changed.outputs.codex }}
NEEDS_CHANGED_OUTPUTS_WORKFLOWS: ${{ needs.changed.outputs.workflows }}
NEEDS_CHANGED_OUTPUTS_ARGUMENT_COMMENT_LINT_PACKAGE: ${{ needs.changed.outputs.argument_comment_lint_package }}

View File

@@ -56,7 +56,9 @@ jobs:
labels: codex-windows-x64
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
with:
@@ -100,7 +102,7 @@ jobs:
(cd "${RUNNER_TEMP}" && tar -czf "$GITHUB_WORKSPACE/$archive_path" argument-comment-lint)
fi
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: argument-comment-lint-${{ matrix.target }}
path: dist/argument-comment-lint/${{ matrix.target }}/*

View File

@@ -16,12 +16,16 @@ jobs:
prepare:
# Prevent scheduled runs on forks (no secrets, wastes Actions minutes)
if: github.repository == 'openai/codex'
environment:
name: rust-release-prepare
deployment: false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: main
fetch-depth: 0
persist-credentials: false
- name: Update models.json
env:
@@ -43,7 +47,7 @@ jobs:
curl --http1.1 --fail --show-error --location "${headers[@]}" "${url}" | jq '.' > codex-rs/models-manager/models.json
- name: Open pull request (if changed)
uses: peter-evans/create-pull-request@c0f553fe549906ede9cf27b5156039d195d2ece0 # v8
uses: peter-evans/create-pull-request@c0f553fe549906ede9cf27b5156039d195d2ece0 # v8.1.0
with:
commit-message: "Update models.json"
title: "Update models.json"

View File

@@ -24,7 +24,9 @@ jobs:
build-windows-binaries:
name: Build Windows binaries - ${{ matrix.runner }} - ${{ matrix.target }} - ${{ matrix.bundle }}
runs-on: ${{ matrix.runs_on }}
timeout-minutes: 60
# Windows release builds can exceed an hour on fat-LTO mainline releases,
# so keep the timeout aligned with the top-level release build headroom.
timeout-minutes: 90
permissions:
contents: read
defaults:
@@ -81,7 +83,9 @@ jobs:
labels: codex-windows-arm64
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Print runner specs (Windows)
shell: powershell
run: |
@@ -110,7 +114,7 @@ jobs:
cargo build --target ${{ matrix.target }} --release --timings "${build_args[@]}"
- name: Upload Cargo timings
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: cargo-timings-rust-release-windows-${{ matrix.target }}-${{ matrix.bundle }}
path: codex-rs/target/**/cargo-timings/cargo-timing.html
@@ -126,7 +130,7 @@ jobs:
done
- name: Upload Windows binaries
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: windows-binaries-${{ matrix.target }}-${{ matrix.bundle }}
path: |
@@ -137,7 +141,7 @@ jobs:
- build-windows-binaries
name: Build - ${{ matrix.runner }} - ${{ matrix.target }}
runs-on: ${{ matrix.runs_on }}
timeout-minutes: 60
timeout-minutes: 90
permissions:
contents: read
id-token: write
@@ -163,22 +167,24 @@ jobs:
labels: codex-windows-arm64
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Download prebuilt Windows primary binaries
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
name: windows-binaries-${{ matrix.target }}-primary
path: codex-rs/target/${{ matrix.target }}/release
- name: Download prebuilt Windows helper binaries
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
name: windows-binaries-${{ matrix.target }}-helpers
path: codex-rs/target/${{ matrix.target }}/release
- name: Download prebuilt Windows app-server binary
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
name: windows-binaries-${{ matrix.target }}-app-server
path: codex-rs/target/${{ matrix.target }}/release
@@ -214,6 +220,48 @@ jobs:
"$dest/${binary}-${{ matrix.target }}.exe"
done
- name: Build Python runtime wheel
shell: bash
run: |
set -euo pipefail
case "${{ matrix.target }}" in
aarch64-pc-windows-msvc)
platform_tag="win_arm64"
;;
x86_64-pc-windows-msvc)
platform_tag="win_amd64"
;;
*)
echo "No Python runtime wheel platform tag for ${{ matrix.target }}"
exit 1
;;
esac
python -m venv "${RUNNER_TEMP}/python-runtime-build-venv"
"${RUNNER_TEMP}/python-runtime-build-venv/Scripts/python.exe" -m pip install build
stage_dir="${RUNNER_TEMP}/openai-codex-cli-bin-${{ matrix.target }}"
wheel_dir="${GITHUB_WORKSPACE}/python-runtime-dist/${{ matrix.target }}"
# Keep the helpers next to codex.exe in the runtime wheel so Windows
# sandbox/elevation lookup matches the standalone release zip.
python "${GITHUB_WORKSPACE}/sdk/python/scripts/update_sdk_artifacts.py" \
stage-runtime \
"$stage_dir" \
"${GITHUB_WORKSPACE}/codex-rs/target/${{ matrix.target }}/release/codex.exe" \
--codex-version "${GITHUB_REF_NAME}" \
--platform-tag "$platform_tag" \
--resource-binary "${GITHUB_WORKSPACE}/codex-rs/target/${{ matrix.target }}/release/codex-command-runner.exe" \
--resource-binary "${GITHUB_WORKSPACE}/codex-rs/target/${{ matrix.target }}/release/codex-windows-sandbox-setup.exe"
"${RUNNER_TEMP}/python-runtime-build-venv/Scripts/python.exe" -m build --wheel --outdir "$wheel_dir" "$stage_dir"
- name: Upload Python runtime wheel
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: python-runtime-wheel-${{ matrix.target }}
path: python-runtime-dist/${{ matrix.target }}/*.whl
if-no-files-found: error
- name: Install DotSlash
uses: facebook/install-dotslash@1e4e7b3e07eaca387acb98f1d4720e0bee8dbb6a # v2
@@ -279,7 +327,7 @@ jobs:
"${GITHUB_WORKSPACE}/.github/workflows/zstd" -T0 -19 "$dest/$base"
done
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: ${{ matrix.target }}
path: |

View File

@@ -45,7 +45,9 @@ jobs:
git \
libncursesw5-dev
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Build, smoke-test, and stage zsh artifact
shell: bash
@@ -53,7 +55,7 @@ jobs:
"${GITHUB_WORKSPACE}/.github/scripts/build-zsh-release-artifact.sh" \
"dist/zsh/${{ matrix.target }}/${{ matrix.archive_name }}"
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: codex-zsh-${{ matrix.target }}
path: dist/zsh/${{ matrix.target }}/*
@@ -81,7 +83,9 @@ jobs:
brew install autoconf
fi
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Build, smoke-test, and stage zsh artifact
shell: bash
@@ -89,7 +93,7 @@ jobs:
"${GITHUB_WORKSPACE}/.github/scripts/build-zsh-release-artifact.sh" \
"dist/zsh/${{ matrix.target }}/${{ matrix.archive_name }}"
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
- uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: codex-zsh-${{ matrix.target }}
path: dist/zsh/${{ matrix.target }}/*

File diff suppressed because it is too large Load Diff

View File

@@ -1,20 +1,12 @@
name: rusty-v8-release
on:
workflow_dispatch:
inputs:
release_tag:
description: Optional release tag. Defaults to rusty-v8-v<resolved_v8_version>.
required: false
type: string
publish:
description: Publish the staged musl artifacts to a GitHub release.
required: false
default: true
type: boolean
push:
tags:
- "rusty-v8-v*.*.*"
concurrency:
group: ${{ github.workflow }}::${{ inputs.release_tag || github.run_id }}
group: ${{ github.workflow }}::${{ github.ref_name }}
cancel-in-progress: false
jobs:
@@ -25,10 +17,12 @@ jobs:
v8_version: ${{ steps.v8_version.outputs.version }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
@@ -43,15 +37,17 @@ jobs:
- name: Resolve release tag
id: release_tag
env:
RELEASE_TAG_INPUT: ${{ inputs.release_tag }}
GITHUB_REF_NAME: ${{ github.ref_name }}
V8_VERSION: ${{ steps.v8_version.outputs.version }}
shell: bash
run: |
set -euo pipefail
release_tag="${RELEASE_TAG_INPUT}"
if [[ -z "${release_tag}" ]]; then
release_tag="rusty-v8-v${V8_VERSION}"
expected_release_tag="rusty-v8-v${V8_VERSION}"
release_tag="${GITHUB_REF_NAME}"
if [[ "${release_tag}" != "${expected_release_tag}" ]]; then
echo "Tag ${release_tag} does not match resolved v8 crate version ${V8_VERSION}." >&2
exit 1
fi
echo "release_tag=${release_tag}" >> "$GITHUB_OUTPUT"
@@ -75,7 +71,9 @@ jobs:
target: aarch64-unknown-linux-musl
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Set up Bazel
uses: ./.github/actions/setup-bazel-ci
@@ -83,7 +81,7 @@ jobs:
target: ${{ matrix.target }}
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
@@ -111,6 +109,7 @@ jobs:
-c
opt
"--platforms=@llvm//platforms:${PLATFORM}"
--config=v8-release-compat
"${pair_target}"
"${extra_targets[@]}"
--build_metadata=COMMIT_SHA=$(git rev-parse HEAD)
@@ -134,16 +133,16 @@ jobs:
--platform "${PLATFORM}" \
--target "${TARGET}" \
--compilation-mode opt \
--bazel-config v8-release-compat \
--output-dir "dist/${TARGET}"
- name: Upload staged musl artifacts
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: rusty-v8-${{ needs.metadata.outputs.v8_version }}-${{ matrix.target }}
path: dist/${{ matrix.target }}/*
publish-release:
if: ${{ inputs.publish }}
needs:
- metadata
- build
@@ -153,16 +152,6 @@ jobs:
actions: read
steps:
- name: Ensure publishing from default branch
if: ${{ github.ref_name != github.event.repository.default_branch }}
env:
DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
shell: bash
run: |
set -euo pipefail
echo "Publishing is only allowed from ${DEFAULT_BRANCH}; current ref is ${GITHUB_REF_NAME}." >&2
exit 1
- name: Ensure release tag is new
env:
GH_TOKEN: ${{ github.token }}
@@ -176,12 +165,12 @@ jobs:
exit 1
fi
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
path: dist
- name: Create GitHub Release
uses: softprops/action-gh-release@153bb8e04406b158c6c84fc1615b65b24149a1fe # v2
uses: softprops/action-gh-release@153bb8e04406b158c6c84fc1615b65b24149a1fe # v2.6.1
with:
tag_name: ${{ needs.metadata.outputs.release_tag }}
name: ${{ needs.metadata.outputs.release_tag }}

View File

@@ -6,6 +6,41 @@ on:
pull_request: {}
jobs:
python-sdk:
runs-on:
group: codex-runners
labels: codex-linux-x64
timeout-minutes: 10
steps:
- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Test Python SDK
shell: bash
run: |
set -euo pipefail
# Run inside Alpine so dependency resolution exercises the pinned
# runtime wheel on the same Linux wheel family that CI installs.
docker run --rm \
--user "$(id -u):$(id -g)" \
-e HOME=/tmp/codex-python-sdk-home \
-e UV_LINK_MODE=copy \
-v "${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}" \
-w "${GITHUB_WORKSPACE}/sdk/python" \
python:3.12-alpine \
sh -euxc '
python -m venv /tmp/uv
/tmp/uv/bin/python -m pip install uv==0.11.3
/tmp/uv/bin/uv sync --extra dev --frozen
/tmp/uv/bin/uv run --extra dev ruff check --output-format=github .
/tmp/uv/bin/uv run --extra dev ruff format --check .
/tmp/uv/bin/uv run --extra dev pytest
'
sdks:
runs-on:
group: codex-runners
@@ -13,7 +48,10 @@ jobs:
timeout-minutes: 10
steps:
- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Install Linux bwrap build dependencies
shell: bash
@@ -28,7 +66,7 @@ jobs:
run_install: false
- name: Setup Node.js
uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6
uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
with:
node-version: 22
cache: pnpm
@@ -115,7 +153,7 @@ jobs:
- name: Save bazel repository cache
if: always() && !cancelled() && steps.setup_bazel.outputs.cache-hit != 'true'
continue-on-error: true
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5
uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
with:
path: |
~/.cache/bazel-repo-cache

View File

@@ -40,10 +40,13 @@ jobs:
v8_version: ${{ steps.v8_version.outputs.version }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
@@ -74,7 +77,10 @@ jobs:
target: aarch64-unknown-linux-musl
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
persist-credentials: false
- name: Set up Bazel
uses: ./.github/actions/setup-bazel-ci
@@ -82,7 +88,7 @@ jobs:
target: ${{ matrix.target }}
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.12"
@@ -105,6 +111,7 @@ jobs:
bazel_args=(
build
"--platforms=@llvm//platforms:${PLATFORM}"
--config=v8-release-compat
"${pair_target}"
"${extra_targets[@]}"
--build_metadata=COMMIT_SHA=$(git rev-parse HEAD)
@@ -127,10 +134,11 @@ jobs:
python3 .github/scripts/rusty_v8_bazel.py stage-release-pair \
--platform "${PLATFORM}" \
--target "${TARGET}" \
--bazel-config v8-release-compat \
--output-dir "dist/${TARGET}"
- name: Upload staged musl artifacts
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
with:
name: v8-canary-${{ needs.metadata.outputs.v8_version }}-${{ matrix.target }}
path: dist/${{ matrix.target }}/*

View File

@@ -1,6 +1,7 @@
{
"recommendations": [
"rust-lang.rust-analyzer",
"charliermarsh.ruff",
"tamasfe.even-better-toml",
"vadimcn.vscode-lldb",

View File

@@ -12,6 +12,14 @@
"editor.defaultFormatter": "tamasfe.even-better-toml",
"editor.formatOnSave": true,
},
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit",
},
},
// Array order for options in ~/.codex/config.toml such as `notify` and the
// `args` for an MCP server is significant, so we disable reordering.
"evenBetterToml.formatter.reorderArrays": false,

View File

@@ -19,8 +19,14 @@ In the codex-rs folder where the rust code lives:
- You can run `just argument-comment-lint` to run the lint check locally. This is powered by Bazel, so running it the first time can be slow if Bazel is not warmed up, though incremental invocations should take <15s. Most of the time, it is best to update the PR and let CI take responsibility for checking this (or run it asynchronously in the background after submitting the PR). Note CI checks all three platforms, which the local run does not.
- When possible, make `match` statements exhaustive and avoid wildcard arms.
- Newly added traits should include doc comments that explain their role and how implementations are expected to use them.
- Discourage both `#[async_trait]` and `#[allow(async_fn_in_trait)]` in Rust traits.
- Prefer native RPITIT trait methods with explicit `Send` bounds on the returned future, as in `3c7f013f9735` / `#16630`.
- Preferred trait shape:
`fn foo(&self, ...) -> impl std::future::Future<Output = T> + Send;`
- Implementations may still use `async fn foo(&self, ...) -> T` when they satisfy that contract.
- Do not use `#[allow(async_fn_in_trait)]` as a shortcut around spelling the future contract explicitly.
- When writing tests, prefer comparing the equality of entire objects over fields one by one.
- When making a change that adds or changes an API, ensure that the documentation in the `docs/` folder is up to date if applicable.
- Do not add general product or user-facing documentation to the `docs/` folder. The official Codex documentation lives elsewhere. The exception is app-server API documentation, which is covered by the app-server guidance below.
- Prefer private modules and explicitly exported public crate API.
- If you change `ConfigToml` or nested config types, run `just write-config-schema` to update `codex-rs/core/config.schema.json`.
- When working with MCP tool calls, prefer using `codex-rs/codex-mcp/src/mcp_connection_manager.rs` to handle mutation of tools and tool calls. Aim to minimize the footprint of changes and leverage existing abstractions rather than plumbing code through multiple levels of function calls.
@@ -124,7 +130,7 @@ When UI or text output changes intentionally, update the snapshots as follows:
If you dont have the tool:
- `cargo install cargo-insta`
- `cargo install --locked cargo-insta`
### Test assertions
@@ -204,7 +210,7 @@ These guidelines apply to app-server protocol work in `codex-rs`, especially:
### Development Workflow
- Update docs/examples when API behavior changes (at minimum `app-server/README.md`).
- Update app-server docs/examples when API behavior changes (at minimum `app-server/README.md`).
- Regenerate schema fixtures when API shapes change:
`just write-app-server-schema`
(and `just write-app-server-schema --experimental` when experimental API fixtures are affected).

View File

@@ -30,6 +30,40 @@ platform(
parents = ["@platforms//host"],
)
platform(
name = "windows_x86_64_gnullvm",
constraint_values = [
"@platforms//cpu:x86_64",
"@platforms//os:windows",
"@rules_rs//rs/experimental/platforms/constraints:windows_gnullvm",
],
)
platform(
name = "windows_x86_64_msvc",
constraint_values = [
"@platforms//cpu:x86_64",
"@platforms//os:windows",
"@rules_rs//rs/experimental/platforms/constraints:windows_msvc",
],
)
toolchain(
name = "windows_gnullvm_tests_on_msvc_host_toolchain",
exec_compatible_with = [
"@platforms//cpu:x86_64",
"@platforms//os:windows",
"@rules_rs//rs/experimental/platforms/constraints:windows_msvc",
],
target_compatible_with = [
"@platforms//cpu:x86_64",
"@platforms//os:windows",
"@rules_rs//rs/experimental/platforms/constraints:windows_gnullvm",
],
toolchain = "@bazel_tools//tools/test:empty_toolchain",
toolchain_type = "@bazel_tools//tools/test:default_test_toolchain_type",
)
alias(
name = "rbe",
actual = "@rbe_platform",

View File

@@ -327,6 +327,18 @@ crate.annotation(
"RUSTY_V8_SRC_BINDING_PATH": "$(execpath @v8_targets//:rusty_v8_binding_for_target)",
},
crate = "v8",
# Keep the Rust feature aligned with the source-built Bazel artifacts.
# Windows MSVC still consumes upstream non-sandboxed prebuilts.
crate_features_select = {
"aarch64-apple-darwin": ["v8_enable_sandbox"],
"aarch64-pc-windows-gnullvm": ["v8_enable_sandbox"],
"aarch64-unknown-linux-gnu": ["v8_enable_sandbox"],
"aarch64-unknown-linux-musl": ["v8_enable_sandbox"],
"x86_64-apple-darwin": ["v8_enable_sandbox"],
"x86_64-pc-windows-gnullvm": ["v8_enable_sandbox"],
"x86_64-unknown-linux-gnu": ["v8_enable_sandbox"],
"x86_64-unknown-linux-musl": ["v8_enable_sandbox"],
},
gen_build_script = "on",
patch_args = ["-p1"],
patches = [

3
MODULE.bazel.lock generated
View File

@@ -665,6 +665,7 @@
"aws-lc-rs_1.16.2": "{\"dependencies\":[{\"name\":\"aws-lc-fips-sys\",\"optional\":true,\"req\":\"^0.13.1\"},{\"default_features\":false,\"name\":\"aws-lc-sys\",\"optional\":true,\"req\":\"^0.39.0\"},{\"features\":[\"derive\"],\"kind\":\"dev\",\"name\":\"clap\",\"req\":\"^4.4\"},{\"kind\":\"dev\",\"name\":\"hex\",\"req\":\"^0.4.3\"},{\"kind\":\"dev\",\"name\":\"lazy_static\",\"req\":\"^1.5.0\"},{\"kind\":\"dev\",\"name\":\"paste\",\"req\":\"^1.0.15\"},{\"kind\":\"dev\",\"name\":\"regex\",\"req\":\"^1.11.1\"},{\"name\":\"untrusted\",\"optional\":true,\"req\":\"^0.7.1\"},{\"name\":\"zeroize\",\"req\":\"^1.8.1\"}],\"features\":{\"alloc\":[],\"asan\":[\"aws-lc-sys?/asan\",\"aws-lc-fips-sys?/asan\"],\"bindgen\":[\"aws-lc-sys?/bindgen\",\"aws-lc-fips-sys?/bindgen\"],\"default\":[\"aws-lc-sys\",\"alloc\",\"ring-io\",\"ring-sig-verify\"],\"dev-tests-only\":[],\"fips\":[\"dep:aws-lc-fips-sys\"],\"non-fips\":[\"aws-lc-sys\"],\"prebuilt-nasm\":[\"aws-lc-sys?/prebuilt-nasm\"],\"ring-io\":[\"dep:untrusted\"],\"ring-sig-verify\":[\"dep:untrusted\"],\"test_logging\":[],\"unstable\":[]}}",
"aws-lc-sys_0.39.0": "{\"dependencies\":[{\"kind\":\"build\",\"name\":\"bindgen\",\"optional\":true,\"req\":\"^0.72.0\"},{\"features\":[\"parallel\"],\"kind\":\"build\",\"name\":\"cc\",\"req\":\"^1.2.26\"},{\"kind\":\"build\",\"name\":\"cmake\",\"req\":\"^0.1.54\"},{\"kind\":\"build\",\"name\":\"dunce\",\"req\":\"^1.0.5\"},{\"kind\":\"build\",\"name\":\"fs_extra\",\"req\":\"^1.3.0\"}],\"features\":{\"all-bindings\":[],\"asan\":[],\"bindgen\":[\"dep:bindgen\"],\"default\":[\"all-bindings\"],\"disable-prebuilt-nasm\":[],\"fips\":[\"dep:bindgen\"],\"prebuilt-nasm\":[],\"ssl\":[\"bindgen\",\"all-bindings\"]}}",
"aws-runtime_1.5.17": "{\"dependencies\":[{\"kind\":\"dev\",\"name\":\"arbitrary\",\"req\":\"^1.3\"},{\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"features\":[\"http0-compat\"],\"name\":\"aws-sigv4\",\"req\":\"^1.3.7\"},{\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"name\":\"aws-smithy-eventstream\",\"optional\":true,\"req\":\"^0.60.14\"},{\"name\":\"aws-smithy-http\",\"req\":\"^0.62.6\"},{\"kind\":\"dev\",\"name\":\"aws-smithy-protocol-test\",\"req\":\"^0.63.7\"},{\"features\":[\"client\"],\"name\":\"aws-smithy-runtime\",\"req\":\"^1.9.5\"},{\"features\":[\"client\"],\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"name\":\"aws-types\",\"req\":\"^1.3.11\"},{\"name\":\"bytes\",\"req\":\"^1.10.0\"},{\"kind\":\"dev\",\"name\":\"bytes-utils\",\"req\":\"^0.1.2\"},{\"kind\":\"dev\",\"name\":\"convert_case\",\"req\":\"^0.6.0\"},{\"name\":\"fastrand\",\"req\":\"^2.3.0\"},{\"default_features\":false,\"kind\":\"dev\",\"name\":\"futures-util\",\"req\":\"^0.3.29\"},{\"name\":\"http-02x\",\"package\":\"http\",\"req\":\"^0.2.9\"},{\"name\":\"http-1x\",\"optional\":true,\"package\":\"http\",\"req\":\"^1.1.0\"},{\"name\":\"http-body-04x\",\"package\":\"http-body\",\"req\":\"^0.4.5\"},{\"name\":\"http-body-1x\",\"optional\":true,\"package\":\"http-body\",\"req\":\"^1.0.0\"},{\"name\":\"percent-encoding\",\"req\":\"^2.3.1\"},{\"name\":\"pin-project-lite\",\"req\":\"^0.2.14\"},{\"kind\":\"dev\",\"name\":\"proptest\",\"req\":\"^1.2\"},{\"name\":\"regex-lite\",\"optional\":true,\"req\":\"^0.1.5\"},{\"features\":[\"derive\"],\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1\"},{\"kind\":\"dev\",\"name\":\"serde_json\",\"req\":\"^1\"},{\"features\":[\"macros\",\"rt\",\"time\"],\"kind\":\"dev\",\"name\":\"tokio\",\"req\":\"^1.23.1\"},{\"name\":\"tracing\",\"req\":\"^0.1.40\"},{\"features\":[\"env-filter\"],\"kind\":\"dev\",\"name\":\"tracing-subscriber\",\"req\":\"^0.3.17\"},{\"kind\":\"dev\",\"name\":\"tracing-test\",\"req\":\"^0.2.4\"},{\"name\":\"uuid\",\"req\":\"^1\"}],\"features\":{\"event-stream\":[\"dep:aws-smithy-eventstream\",\"aws-sigv4/sign-eventstream\"],\"http-02x\":[],\"http-1x\":[\"dep:http-1x\",\"dep:http-body-1x\"],\"sigv4a\":[\"aws-sigv4/sigv4a\"],\"test-util\":[\"dep:regex-lite\"]}}",
"aws-sdk-signin_1.2.0": "{\"dependencies\":[{\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"name\":\"aws-runtime\",\"req\":\"^1.5.17\"},{\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"name\":\"aws-smithy-http\",\"req\":\"^0.62.6\"},{\"name\":\"aws-smithy-json\",\"req\":\"^0.61.8\"},{\"features\":[\"client\"],\"name\":\"aws-smithy-runtime\",\"req\":\"^1.9.5\"},{\"features\":[\"client\",\"http-02x\"],\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"name\":\"aws-types\",\"req\":\"^1.3.11\"},{\"name\":\"bytes\",\"req\":\"^1.4.0\"},{\"name\":\"fastrand\",\"req\":\"^2.0.0\"},{\"name\":\"http\",\"req\":\"^0.2.9\"},{\"kind\":\"dev\",\"name\":\"proptest\",\"req\":\"^1\"},{\"name\":\"regex-lite\",\"req\":\"^0.1.5\"},{\"features\":[\"macros\",\"test-util\",\"rt-multi-thread\"],\"kind\":\"dev\",\"name\":\"tokio\",\"req\":\"^1.23.1\"},{\"name\":\"tracing\",\"req\":\"^0.1\"}],\"features\":{\"behavior-version-latest\":[],\"default\":[\"rustls\",\"default-https-client\",\"rt-tokio\"],\"default-https-client\":[\"aws-smithy-runtime/default-https-client\"],\"gated-tests\":[],\"rt-tokio\":[\"aws-smithy-async/rt-tokio\",\"aws-smithy-types/rt-tokio\"],\"rustls\":[\"aws-smithy-runtime/tls-rustls\"],\"test-util\":[\"aws-credential-types/test-util\",\"aws-smithy-runtime/test-util\"]}}",
"aws-sdk-sso_1.91.0": "{\"dependencies\":[{\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"name\":\"aws-runtime\",\"req\":\"^1.5.17\"},{\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"name\":\"aws-smithy-http\",\"req\":\"^0.62.6\"},{\"name\":\"aws-smithy-json\",\"req\":\"^0.61.8\"},{\"features\":[\"client\"],\"name\":\"aws-smithy-runtime\",\"req\":\"^1.9.5\"},{\"features\":[\"client\",\"http-02x\"],\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"name\":\"aws-types\",\"req\":\"^1.3.11\"},{\"name\":\"bytes\",\"req\":\"^1.4.0\"},{\"name\":\"fastrand\",\"req\":\"^2.0.0\"},{\"name\":\"http\",\"req\":\"^0.2.9\"},{\"kind\":\"dev\",\"name\":\"proptest\",\"req\":\"^1\"},{\"name\":\"regex-lite\",\"req\":\"^0.1.5\"},{\"features\":[\"macros\",\"test-util\",\"rt-multi-thread\"],\"kind\":\"dev\",\"name\":\"tokio\",\"req\":\"^1.23.1\"},{\"name\":\"tracing\",\"req\":\"^0.1\"}],\"features\":{\"behavior-version-latest\":[],\"default\":[\"rustls\",\"default-https-client\",\"rt-tokio\"],\"default-https-client\":[\"aws-smithy-runtime/default-https-client\"],\"gated-tests\":[],\"rt-tokio\":[\"aws-smithy-async/rt-tokio\",\"aws-smithy-types/rt-tokio\"],\"rustls\":[\"aws-smithy-runtime/tls-rustls\"],\"test-util\":[\"aws-credential-types/test-util\",\"aws-smithy-runtime/test-util\"]}}",
"aws-sdk-ssooidc_1.93.0": "{\"dependencies\":[{\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"name\":\"aws-runtime\",\"req\":\"^1.5.17\"},{\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"name\":\"aws-smithy-http\",\"req\":\"^0.62.6\"},{\"name\":\"aws-smithy-json\",\"req\":\"^0.61.8\"},{\"features\":[\"client\"],\"name\":\"aws-smithy-runtime\",\"req\":\"^1.9.5\"},{\"features\":[\"client\",\"http-02x\"],\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"name\":\"aws-types\",\"req\":\"^1.3.11\"},{\"name\":\"bytes\",\"req\":\"^1.4.0\"},{\"name\":\"fastrand\",\"req\":\"^2.0.0\"},{\"name\":\"http\",\"req\":\"^0.2.9\"},{\"kind\":\"dev\",\"name\":\"proptest\",\"req\":\"^1\"},{\"name\":\"regex-lite\",\"req\":\"^0.1.5\"},{\"features\":[\"macros\",\"test-util\",\"rt-multi-thread\"],\"kind\":\"dev\",\"name\":\"tokio\",\"req\":\"^1.23.1\"},{\"name\":\"tracing\",\"req\":\"^0.1\"}],\"features\":{\"behavior-version-latest\":[],\"default\":[\"rustls\",\"default-https-client\",\"rt-tokio\"],\"default-https-client\":[\"aws-smithy-runtime/default-https-client\"],\"gated-tests\":[],\"rt-tokio\":[\"aws-smithy-async/rt-tokio\",\"aws-smithy-types/rt-tokio\"],\"rustls\":[\"aws-smithy-runtime/tls-rustls\"],\"test-util\":[\"aws-credential-types/test-util\",\"aws-smithy-runtime/test-util\"]}}",
"aws-sdk-sts_1.95.0": "{\"dependencies\":[{\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-credential-types\",\"req\":\"^1.2.11\"},{\"name\":\"aws-runtime\",\"req\":\"^1.5.17\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-runtime\",\"req\":\"^1.5.17\"},{\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-async\",\"req\":\"^1.2.7\"},{\"name\":\"aws-smithy-http\",\"req\":\"^0.62.6\"},{\"features\":[\"test-util\",\"wire-mock\"],\"kind\":\"dev\",\"name\":\"aws-smithy-http-client\",\"req\":\"^1.1.5\"},{\"name\":\"aws-smithy-json\",\"req\":\"^0.61.8\"},{\"kind\":\"dev\",\"name\":\"aws-smithy-protocol-test\",\"req\":\"^0.63.7\"},{\"name\":\"aws-smithy-query\",\"req\":\"^0.60.9\"},{\"features\":[\"client\"],\"name\":\"aws-smithy-runtime\",\"req\":\"^1.9.5\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-runtime\",\"req\":\"^1.9.5\"},{\"features\":[\"client\",\"http-02x\"],\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-runtime-api\",\"req\":\"^1.9.3\"},{\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"features\":[\"test-util\"],\"kind\":\"dev\",\"name\":\"aws-smithy-types\",\"req\":\"^1.3.5\"},{\"name\":\"aws-smithy-xml\",\"req\":\"^0.60.13\"},{\"name\":\"aws-types\",\"req\":\"^1.3.11\"},{\"name\":\"fastrand\",\"req\":\"^2.0.0\"},{\"default_features\":false,\"features\":[\"alloc\"],\"kind\":\"dev\",\"name\":\"futures-util\",\"req\":\"^0.3.25\"},{\"name\":\"http\",\"req\":\"^0.2.9\"},{\"kind\":\"dev\",\"name\":\"http-1x\",\"package\":\"http\",\"req\":\"^1\"},{\"kind\":\"dev\",\"name\":\"proptest\",\"req\":\"^1\"},{\"name\":\"regex-lite\",\"req\":\"^0.1.5\"},{\"kind\":\"dev\",\"name\":\"serde_json\",\"req\":\"^1.0.0\"},{\"features\":[\"macros\",\"test-util\",\"rt-multi-thread\"],\"kind\":\"dev\",\"name\":\"tokio\",\"req\":\"^1.23.1\"},{\"name\":\"tracing\",\"req\":\"^0.1\"},{\"features\":[\"env-filter\",\"json\"],\"kind\":\"dev\",\"name\":\"tracing-subscriber\",\"req\":\"^0.3.16\"}],\"features\":{\"behavior-version-latest\":[],\"default\":[\"rustls\",\"default-https-client\",\"rt-tokio\"],\"default-https-client\":[\"aws-smithy-runtime/default-https-client\"],\"gated-tests\":[],\"rt-tokio\":[\"aws-smithy-async/rt-tokio\",\"aws-smithy-types/rt-tokio\"],\"rustls\":[\"aws-smithy-runtime/tls-rustls\"],\"test-util\":[\"aws-credential-types/test-util\",\"aws-smithy-runtime/test-util\"]}}",
@@ -886,7 +887,6 @@
"enum-as-inner_0.6.1": "{\"dependencies\":[{\"name\":\"heck\",\"req\":\"^0.5\"},{\"name\":\"proc-macro2\",\"req\":\"^1.0\"},{\"name\":\"quote\",\"req\":\"^1.0\"},{\"name\":\"syn\",\"req\":\"^2.0\"}],\"features\":{}}",
"enumflags2_0.7.12": "{\"dependencies\":[{\"name\":\"enumflags2_derive\",\"req\":\"=0.7.12\"},{\"default_features\":false,\"name\":\"serde\",\"optional\":true,\"req\":\"^1.0.0\"}],\"features\":{\"std\":[]}}",
"enumflags2_derive_0.7.12": "{\"dependencies\":[{\"name\":\"proc-macro2\",\"req\":\"^1.0\"},{\"name\":\"quote\",\"req\":\"^1.0\"},{\"default_features\":false,\"features\":[\"parsing\",\"printing\",\"derive\",\"proc-macro\"],\"name\":\"syn\",\"req\":\"^2.0\"}],\"features\":{}}",
"env-flags_0.1.1": "{\"dependencies\":[],\"features\":{}}",
"env_filter_1.0.0": "{\"dependencies\":[{\"features\":[\"std\"],\"name\":\"log\",\"req\":\"^0.4.8\"},{\"default_features\":false,\"features\":[\"std\",\"perf\"],\"name\":\"regex\",\"optional\":true,\"req\":\"^1.0.3\"},{\"kind\":\"dev\",\"name\":\"snapbox\",\"req\":\"^0.6\"}],\"features\":{\"default\":[\"regex\"],\"regex\":[\"dep:regex\"]}}",
"env_filter_1.0.1": "{\"dependencies\":[{\"features\":[\"std\"],\"name\":\"log\",\"req\":\"^0.4.29\"},{\"default_features\":false,\"features\":[\"std\",\"perf\"],\"name\":\"regex\",\"optional\":true,\"req\":\"^1.12.3\"},{\"kind\":\"dev\",\"name\":\"snapbox\",\"req\":\"^1.0\"}],\"features\":{\"default\":[\"regex\"],\"regex\":[\"dep:regex\"]}}",
"env_home_0.1.0": "{\"dependencies\":[],\"features\":{}}",
@@ -1482,6 +1482,7 @@
"serde_derive_1.0.228": "{\"dependencies\":[{\"default_features\":false,\"features\":[\"proc-macro\"],\"name\":\"proc-macro2\",\"req\":\"^1.0.74\"},{\"default_features\":false,\"features\":[\"proc-macro\"],\"name\":\"quote\",\"req\":\"^1.0.35\"},{\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1\"},{\"default_features\":false,\"features\":[\"clone-impls\",\"derive\",\"parsing\",\"printing\",\"proc-macro\"],\"name\":\"syn\",\"req\":\"^2.0.81\"}],\"features\":{\"default\":[],\"deserialize_in_place\":[]}}",
"serde_derive_internals_0.29.1": "{\"dependencies\":[{\"default_features\":false,\"name\":\"proc-macro2\",\"req\":\"^1.0.74\"},{\"default_features\":false,\"name\":\"quote\",\"req\":\"^1.0.35\"},{\"default_features\":false,\"features\":[\"clone-impls\",\"derive\",\"parsing\",\"printing\"],\"name\":\"syn\",\"req\":\"^2.0.46\"}],\"features\":{}}",
"serde_html_form_0.3.2": "{\"dependencies\":[{\"kind\":\"dev\",\"name\":\"assert_matches2\",\"req\":\"^0.1.0\"},{\"kind\":\"dev\",\"name\":\"divan\",\"req\":\"^0.1.11\"},{\"default_features\":false,\"features\":[\"alloc\"],\"name\":\"form_urlencoded\",\"req\":\"^1.0.1\"},{\"default_features\":false,\"name\":\"indexmap\",\"req\":\"^2.0.0\"},{\"kind\":\"dev\",\"name\":\"insta\",\"req\":\"^1.45.0\"},{\"name\":\"itoa\",\"req\":\"^1.0.1\"},{\"name\":\"ryu\",\"optional\":true,\"req\":\"^1.0.9\"},{\"features\":[\"derive\"],\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1.0.221\"},{\"default_features\":false,\"features\":[\"alloc\"],\"name\":\"serde_core\",\"req\":\"^1.0.221\"},{\"kind\":\"dev\",\"name\":\"serde_urlencoded\",\"req\":\"^0.7.1\"}],\"features\":{\"default\":[\"ryu\",\"std\"],\"std\":[]}}",
"serde_ignored_0.1.14": "{\"dependencies\":[{\"default_features\":false,\"name\":\"serde\",\"req\":\"^1.0.220\",\"target\":\"cfg(any())\"},{\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1.0.220\"},{\"default_features\":false,\"features\":[\"alloc\"],\"name\":\"serde_core\",\"req\":\"^1.0.220\"},{\"kind\":\"dev\",\"name\":\"serde_derive\",\"req\":\"^1.0.220\"},{\"kind\":\"dev\",\"name\":\"serde_json\",\"req\":\"^1.0.110\"}],\"features\":{}}",
"serde_json_1.0.149": "{\"dependencies\":[{\"kind\":\"dev\",\"name\":\"automod\",\"req\":\"^1.0.11\"},{\"name\":\"indexmap\",\"optional\":true,\"req\":\"^2.2.3\"},{\"kind\":\"dev\",\"name\":\"indoc\",\"req\":\"^2.0.2\"},{\"name\":\"itoa\",\"req\":\"^1.0\"},{\"default_features\":false,\"name\":\"memchr\",\"req\":\"^2\"},{\"kind\":\"dev\",\"name\":\"ref-cast\",\"req\":\"^1.0.18\"},{\"kind\":\"dev\",\"name\":\"rustversion\",\"req\":\"^1.0.13\"},{\"default_features\":false,\"name\":\"serde\",\"req\":\"^1.0.220\",\"target\":\"cfg(any())\"},{\"features\":[\"derive\"],\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1.0.194\"},{\"kind\":\"dev\",\"name\":\"serde_bytes\",\"req\":\"^0.11.10\"},{\"default_features\":false,\"name\":\"serde_core\",\"req\":\"^1.0.220\"},{\"kind\":\"dev\",\"name\":\"serde_derive\",\"req\":\"^1.0.166\"},{\"kind\":\"dev\",\"name\":\"serde_stacker\",\"req\":\"^0.1.8\"},{\"features\":[\"diff\"],\"kind\":\"dev\",\"name\":\"trybuild\",\"req\":\"^1.0.108\"},{\"name\":\"zmij\",\"req\":\"^1.0\"}],\"features\":{\"alloc\":[\"serde_core/alloc\"],\"arbitrary_precision\":[],\"default\":[\"std\"],\"float_roundtrip\":[],\"preserve_order\":[\"indexmap\",\"std\"],\"raw_value\":[],\"std\":[\"memchr/std\",\"serde_core/std\"],\"unbounded_depth\":[]}}",
"serde_path_to_error_0.1.20": "{\"dependencies\":[{\"name\":\"itoa\",\"req\":\"^1.0\"},{\"default_features\":false,\"name\":\"serde\",\"req\":\"^1.0.220\",\"target\":\"cfg(any())\"},{\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1.0.220\"},{\"default_features\":false,\"features\":[\"alloc\"],\"name\":\"serde_core\",\"req\":\"^1.0.220\"},{\"kind\":\"dev\",\"name\":\"serde_derive\",\"req\":\"^1.0.220\"},{\"kind\":\"dev\",\"name\":\"serde_json\",\"req\":\"^1.0.100\"}],\"features\":{}}",
"serde_repr_0.1.20": "{\"dependencies\":[{\"name\":\"proc-macro2\",\"req\":\"^1.0.74\"},{\"name\":\"quote\",\"req\":\"^1.0.35\"},{\"kind\":\"dev\",\"name\":\"rustversion\",\"req\":\"^1.0.13\"},{\"kind\":\"dev\",\"name\":\"serde\",\"req\":\"^1.0.166\"},{\"kind\":\"dev\",\"name\":\"serde_json\",\"req\":\"^1.0.100\"},{\"name\":\"syn\",\"req\":\"^2.0.46\"},{\"features\":[\"diff\"],\"kind\":\"dev\",\"name\":\"trybuild\",\"req\":\"^1.0.81\"}],\"features\":{}}",

View File

@@ -2,7 +2,7 @@
// Unified entry point for the Codex CLI.
import { spawn } from "node:child_process";
import { existsSync } from "fs";
import { existsSync, realpathSync } from "fs";
import { createRequire } from "node:module";
import path from "path";
import { fileURLToPath } from "url";
@@ -171,6 +171,7 @@ const packageManagerEnvVar =
? "CODEX_MANAGED_BY_BUN"
: "CODEX_MANAGED_BY_NPM";
env[packageManagerEnvVar] = "1";
env.CODEX_MANAGED_PACKAGE_ROOT = realpathSync(path.join(__dirname, ".."));
const child = spawn(binaryPath, process.argv.slice(2), {
stdio: "inherit",

View File

@@ -69,8 +69,8 @@ PACKAGE_EXPANSIONS: dict[str, list[str]] = {
PACKAGE_NATIVE_COMPONENTS: dict[str, list[str]] = {
"codex": [],
"codex-linux-x64": ["codex", "rg"],
"codex-linux-arm64": ["codex", "rg"],
"codex-linux-x64": ["bwrap", "codex", "rg"],
"codex-linux-arm64": ["bwrap", "codex", "rg"],
"codex-darwin-x64": ["codex", "rg"],
"codex-darwin-arm64": ["codex", "rg"],
"codex-win32-x64": ["codex", "rg", "codex-windows-sandbox-setup", "codex-command-runner"],
@@ -87,6 +87,7 @@ PACKAGE_TARGET_FILTERS: dict[str, str] = {
PACKAGE_CHOICES = tuple(PACKAGE_NATIVE_COMPONENTS)
COMPONENT_DEST_DIR: dict[str, str] = {
"bwrap": "codex-resources",
"codex": "codex",
"codex-responses-api-proxy": "codex-responses-api-proxy",
"codex-windows-sandbox-setup": "codex",
@@ -137,6 +138,16 @@ def parse_args() -> argparse.Namespace:
type=Path,
help="Directory containing pre-installed native binaries to bundle (vendor root).",
)
parser.add_argument(
"--allow-missing-native-component",
dest="allow_missing_native_components",
action="append",
default=[],
help=(
"Native component that may be absent from --vendor-src. Intended for CI "
"compatibility with older artifact workflows; releases should not use this."
),
)
return parser.parse_args()
@@ -177,6 +188,7 @@ def main() -> int:
staging_dir,
native_components,
target_filter={target_filter} if target_filter else None,
allow_missing_components=set(args.allow_missing_native_components),
)
if release_version:
@@ -365,12 +377,14 @@ def copy_native_binaries(
staging_dir: Path,
components: list[str],
target_filter: set[str] | None = None,
allow_missing_components: set[str] | None = None,
) -> None:
vendor_src = vendor_src.resolve()
if not vendor_src.exists():
raise RuntimeError(f"Vendor source directory not found: {vendor_src}")
components_set = {component for component in components if component in COMPONENT_DEST_DIR}
allow_missing_components = allow_missing_components or set()
if not components_set:
return
@@ -399,6 +413,8 @@ def copy_native_binaries(
src_component_dir = target_dir / dest_dir_name
if not src_component_dir.exists():
if component in allow_missing_components:
continue
raise RuntimeError(
f"Missing native component '{component}' in vendor source: {src_component_dir}"
)

View File

@@ -1,5 +1,5 @@
#!/usr/bin/env python3
"""Install Codex native binaries (Rust CLI plus ripgrep helpers)."""
"""Install Codex native binaries (Rust CLI, bwrap, and ripgrep helpers)."""
import argparse
from contextlib import contextmanager
@@ -42,8 +42,15 @@ class BinaryComponent:
WINDOWS_TARGETS = tuple(target for target in BINARY_TARGETS if "windows" in target)
LINUX_TARGETS = tuple(target for target in BINARY_TARGETS if "linux" in target)
BINARY_COMPONENTS = {
"bwrap": BinaryComponent(
artifact_prefix="bwrap",
dest_dir="codex-resources",
binary_basename="bwrap",
targets=LINUX_TARGETS,
),
"codex": BinaryComponent(
artifact_prefix="codex",
dest_dir="codex",
@@ -135,7 +142,7 @@ def parse_args() -> argparse.Namespace:
choices=tuple(list(BINARY_COMPONENTS) + ["rg"]),
help=(
"Limit installation to the specified components."
" May be repeated. Defaults to codex, codex-windows-sandbox-setup,"
" May be repeated. Defaults to bwrap, codex, codex-windows-sandbox-setup,"
" codex-command-runner, and rg."
),
)
@@ -159,6 +166,7 @@ def main() -> int:
vendor_dir.mkdir(parents=True, exist_ok=True)
components = args.components or [
"bwrap",
"codex",
"codex-windows-sandbox-setup",
"codex-command-runner",

View File

@@ -6,4 +6,6 @@ ignore = [
"RUSTSEC-2024-0436", # paste 1.0.15 via starlark/ratatui; upstream crate is unmaintained
"RUSTSEC-2024-0320", # yaml-rust via syntect; remove when syntect drops or updates it
"RUSTSEC-2025-0141", # bincode via syntect; remove when syntect drops or updates it
"RUSTSEC-2026-0118", # hickory-proto via rama-dns/rama-tcp; remove when rama updates to hickory 0.26.1 or hickory-net
"RUSTSEC-2026-0119", # hickory-proto via rama-dns/rama-tcp; remove when rama updates to hickory 0.26.1 or hickory-net
]

View File

@@ -17,7 +17,7 @@ jobs:
working-directory: codex-rs
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: dtolnay/rust-toolchain@a0b273b48ed29de4470960879e8381ff45632f26 # 1.93.0
- name: Install cargo-audit
uses: taiki-e/install-action@v2
with:

358
codex-rs/Cargo.lock generated
View File

@@ -757,6 +757,7 @@ checksum = "96571e6996817bf3d58f6b569e4b9fd2e9d2fcf9f7424eed07b2ce9bb87535e5"
dependencies = [
"aws-credential-types",
"aws-runtime",
"aws-sdk-signin",
"aws-sdk-sso",
"aws-sdk-ssooidc",
"aws-sdk-sts",
@@ -767,15 +768,20 @@ dependencies = [
"aws-smithy-runtime-api",
"aws-smithy-types",
"aws-types",
"base64-simd",
"bytes",
"fastrand",
"hex",
"http 1.4.0",
"p256",
"rand 0.8.5",
"ring",
"sha2",
"time",
"tokio",
"tracing",
"url",
"uuid",
"zeroize",
]
@@ -838,6 +844,28 @@ dependencies = [
"uuid",
]
[[package]]
name = "aws-sdk-signin"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c084bd63941916e1348cb8d9e05ac2e49bdd40a380e9167702683184c6c6be53"
dependencies = [
"aws-credential-types",
"aws-runtime",
"aws-smithy-async",
"aws-smithy-http",
"aws-smithy-json",
"aws-smithy-runtime",
"aws-smithy-runtime-api",
"aws-smithy-types",
"aws-types",
"bytes",
"fastrand",
"http 0.2.12",
"regex-lite",
"tracing",
]
[[package]]
name = "aws-sdk-sso"
version = "1.91.0"
@@ -1748,6 +1776,21 @@ dependencies = [
"unicode-width 0.2.1",
]
[[package]]
name = "codex-agent-graph-store"
version = "0.0.0"
dependencies = [
"async-trait",
"codex-protocol",
"codex-state",
"pretty_assertions",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.18",
"tokio",
]
[[package]]
name = "codex-agent-identity"
version = "0.0.0"
@@ -1842,8 +1885,8 @@ dependencies = [
"chrono",
"clap",
"codex-analytics",
"codex-api",
"codex-app-server-protocol",
"codex-app-server-transport",
"codex-arg0",
"codex-backend-client",
"codex-chatgpt",
@@ -1851,21 +1894,26 @@ dependencies = [
"codex-config",
"codex-core",
"codex-core-plugins",
"codex-device-key",
"codex-exec-server",
"codex-extension-api",
"codex-external-agent-migration",
"codex-external-agent-sessions",
"codex-features",
"codex-feedback",
"codex-file-search",
"codex-file-watcher",
"codex-git-utils",
"codex-guardian",
"codex-hooks",
"codex-login",
"codex-mcp",
"codex-memories-extension",
"codex-memories-write",
"codex-model-provider",
"codex-model-provider-info",
"codex-models-manager",
"codex-otel",
"codex-plugin",
"codex-protocol",
"codex-rmcp-client",
"codex-rollout",
@@ -1874,23 +1922,17 @@ dependencies = [
"codex-state",
"codex-thread-store",
"codex-tools",
"codex-uds",
"codex-utils-absolute-path",
"codex-utils-cargo-bin",
"codex-utils-cli",
"codex-utils-json-to-toml",
"codex-utils-pty",
"codex-utils-rustls-provider",
"constant_time_eq 0.3.1",
"core_test_support",
"flate2",
"futures",
"gethostname",
"hmac",
"jsonwebtoken",
"opentelemetry",
"opentelemetry_sdk",
"owo-colors",
"pretty_assertions",
"reqwest",
"rmcp",
@@ -1928,11 +1970,14 @@ dependencies = [
"codex-exec-server",
"codex-feedback",
"codex-protocol",
"codex-uds",
"codex-utils-absolute-path",
"codex-utils-rustls-provider",
"futures",
"pretty_assertions",
"serde",
"serde_json",
"tempfile",
"tokio",
"tokio-tungstenite",
"toml 0.9.11+spec-1.1.0",
@@ -1940,6 +1985,27 @@ dependencies = [
"url",
]
[[package]]
name = "codex-app-server-daemon"
version = "0.0.0"
dependencies = [
"anyhow",
"codex-app-server-protocol",
"codex-app-server-transport",
"codex-uds",
"codex-utils-home-dir",
"futures",
"libc",
"pretty_assertions",
"reqwest",
"serde",
"serde_json",
"sha2",
"tempfile",
"tokio",
"tokio-tungstenite",
]
[[package]]
name = "codex-app-server-protocol"
version = "0.0.0"
@@ -1988,6 +2054,45 @@ dependencies = [
"uuid",
]
[[package]]
name = "codex-app-server-transport"
version = "0.0.0"
dependencies = [
"anyhow",
"axum",
"base64 0.22.1",
"chrono",
"clap",
"codex-api",
"codex-app-server-protocol",
"codex-config",
"codex-core",
"codex-login",
"codex-model-provider",
"codex-state",
"codex-uds",
"codex-utils-absolute-path",
"codex-utils-rustls-provider",
"constant_time_eq 0.3.1",
"futures",
"gethostname",
"hmac",
"jsonwebtoken",
"owo-colors",
"pretty_assertions",
"serde",
"serde_json",
"sha2",
"tempfile",
"time",
"tokio",
"tokio-tungstenite",
"tokio-util",
"tracing",
"url",
"uuid",
]
[[package]]
name = "codex-apply-patch"
version = "0.0.0"
@@ -2075,6 +2180,15 @@ dependencies = [
"serde_with",
]
[[package]]
name = "codex-bwrap"
version = "0.0.0"
dependencies = [
"cc",
"libc",
"pkg-config",
]
[[package]]
name = "codex-chatgpt"
version = "0.0.0"
@@ -2084,9 +2198,11 @@ dependencies = [
"codex-app-server-protocol",
"codex-connectors",
"codex-core",
"codex-core-plugins",
"codex-git-utils",
"codex-login",
"codex-model-provider",
"codex-plugin",
"codex-utils-cargo-bin",
"codex-utils-cli",
"pretty_assertions",
@@ -2105,7 +2221,9 @@ dependencies = [
"assert_matches",
"clap",
"clap_complete",
"codex-api",
"codex-app-server",
"codex-app-server-daemon",
"codex-app-server-protocol",
"codex-app-server-test-client",
"codex-arg0",
@@ -2118,11 +2236,14 @@ dependencies = [
"codex-exec-server",
"codex-execpolicy",
"codex-features",
"codex-install-context",
"codex-login",
"codex-mcp",
"codex-mcp-server",
"codex-memories-write",
"codex-model-provider",
"codex-models-manager",
"codex-plugin",
"codex-protocol",
"codex-responses-api-proxy",
"codex-rmcp-client",
@@ -2137,11 +2258,14 @@ dependencies = [
"codex-utils-cli",
"codex-utils-path",
"codex-windows-sandbox",
"crossterm",
"http 1.4.0",
"libc",
"owo-colors",
"predicates",
"pretty_assertions",
"regex-lite",
"serde",
"serde_json",
"sqlx",
"supports-color 3.0.2",
@@ -2168,6 +2292,7 @@ dependencies = [
"opentelemetry_sdk",
"pretty_assertions",
"rand 0.9.3",
"rcgen",
"reqwest",
"rustls",
"rustls-native-certs",
@@ -2314,6 +2439,7 @@ dependencies = [
"prost 0.14.3",
"schemars 0.8.22",
"serde",
"serde_ignored",
"serde_json",
"serde_path_to_error",
"sha2",
@@ -2340,7 +2466,11 @@ dependencies = [
"codex-app-server-protocol",
"pretty_assertions",
"serde",
"serde_json",
"sha1",
"tempfile",
"tokio",
"tracing",
"urlencoding",
]
@@ -2370,11 +2500,11 @@ dependencies = [
"codex-core-skills",
"codex-exec-server",
"codex-execpolicy",
"codex-extension-api",
"codex-features",
"codex-feedback",
"codex-git-utils",
"codex-hooks",
"codex-kernel",
"codex-login",
"codex-mcp",
"codex-memories-read",
@@ -2406,7 +2536,6 @@ dependencies = [
"codex-utils-path",
"codex-utils-plugins",
"codex-utils-pty",
"codex-utils-readiness",
"codex-utils-stream-parser",
"codex-utils-string",
"codex-utils-template",
@@ -2416,7 +2545,6 @@ dependencies = [
"ctor 0.6.3",
"dirs",
"dunce",
"env-flags",
"eventsource-stream",
"futures",
"http 1.4.0",
@@ -2426,7 +2554,6 @@ dependencies = [
"insta",
"libc",
"maplit",
"notify",
"once_cell",
"openssl-sys",
"opentelemetry",
@@ -2465,17 +2592,38 @@ dependencies = [
"zstd 0.13.3",
]
[[package]]
name = "codex-core-api"
version = "0.0.0"
dependencies = [
"codex-analytics",
"codex-app-server-protocol",
"codex-arg0",
"codex-config",
"codex-core",
"codex-exec-server",
"codex-extension-api",
"codex-features",
"codex-login",
"codex-model-provider-info",
"codex-models-manager",
"codex-protocol",
"codex-utils-absolute-path",
]
[[package]]
name = "codex-core-plugins"
version = "0.0.0"
dependencies = [
"anyhow",
"chrono",
"codex-analytics",
"codex-app-server-protocol",
"codex-config",
"codex-core-skills",
"codex-exec-server",
"codex-git-utils",
"codex-hooks",
"codex-login",
"codex-model-provider",
"codex-otel",
@@ -2544,22 +2692,6 @@ dependencies = [
"serde_json",
]
[[package]]
name = "codex-device-key"
version = "0.0.0"
dependencies = [
"async-trait",
"base64 0.22.1",
"p256",
"pretty_assertions",
"rand 0.9.3",
"serde",
"serde_json",
"thiserror 2.0.18",
"tokio",
"url",
]
[[package]]
name = "codex-exec"
version = "0.0.0"
@@ -2584,6 +2716,7 @@ dependencies = [
"codex-utils-cargo-bin",
"codex-utils-cli",
"codex-utils-oss",
"codex-utils-sandbox-summary",
"core_test_support",
"libc",
"opentelemetry",
@@ -2612,6 +2745,7 @@ dependencies = [
"anyhow",
"arc-swap",
"async-trait",
"axum",
"base64 0.22.1",
"bytes",
"codex-app-server-protocol",
@@ -2622,9 +2756,11 @@ dependencies = [
"codex-test-binary-support",
"codex-utils-absolute-path",
"codex-utils-pty",
"codex-utils-rustls-provider",
"ctor 0.6.3",
"futures",
"pretty_assertions",
"prost 0.14.3",
"reqwest",
"serde",
"serde_json",
@@ -2635,8 +2771,10 @@ dependencies = [
"tokio",
"tokio-tungstenite",
"tokio-util",
"toml 0.9.11+spec-1.1.0",
"tracing",
"uuid",
"wiremock",
]
[[package]]
@@ -2685,6 +2823,14 @@ dependencies = [
"syn 2.0.114",
]
[[package]]
name = "codex-extension-api"
version = "0.0.0"
dependencies = [
"codex-protocol",
"codex-tools",
]
[[package]]
name = "codex-external-agent-migration"
version = "0.0.0"
@@ -2763,6 +2909,17 @@ dependencies = [
"serde",
]
[[package]]
name = "codex-file-watcher"
version = "0.0.0"
dependencies = [
"notify",
"pretty_assertions",
"tempfile",
"tokio",
"tracing",
]
[[package]]
name = "codex-git-utils"
version = "0.0.0"
@@ -2787,6 +2944,15 @@ dependencies = [
"walkdir",
]
[[package]]
name = "codex-guardian"
version = "0.0.0"
dependencies = [
"codex-core",
"codex-extension-api",
"codex-protocol",
]
[[package]]
name = "codex-hooks"
version = "0.0.0"
@@ -2797,6 +2963,7 @@ dependencies = [
"codex-plugin",
"codex-protocol",
"codex-utils-absolute-path",
"codex-utils-output-truncation",
"futures",
"pretty_assertions",
"regex",
@@ -2805,6 +2972,8 @@ dependencies = [
"serde_json",
"tempfile",
"tokio",
"tracing",
"uuid",
]
[[package]]
@@ -2816,20 +2985,6 @@ dependencies = [
"tempfile",
]
[[package]]
name = "codex-kernel"
version = "0.0.0"
dependencies = [
"codex-protocol",
"codex-tools",
"pretty_assertions",
"rand 0.9.3",
"serde",
"serde_json",
"tokio",
"tokio-util",
]
[[package]]
name = "codex-keyring-store"
version = "0.0.0"
@@ -2842,7 +2997,6 @@ dependencies = [
name = "codex-linux-sandbox"
version = "0.0.0"
dependencies = [
"cc",
"clap",
"codex-core",
"codex-process-hardening",
@@ -2852,11 +3006,11 @@ dependencies = [
"globset",
"landlock",
"libc",
"pkg-config",
"pretty_assertions",
"seccompiler",
"serde",
"serde_json",
"sha2",
"tempfile",
"tokio",
"url",
@@ -2959,9 +3113,8 @@ dependencies = [
"codex-config",
"codex-core",
"codex-exec-server",
"codex-features",
"codex-extension-api",
"codex-login",
"codex-models-manager",
"codex-protocol",
"codex-shell-command",
"codex-utils-absolute-path",
@@ -2983,6 +3136,44 @@ dependencies = [
"wiremock",
]
[[package]]
name = "codex-memories-extension"
version = "0.0.0"
dependencies = [
"async-trait",
"codex-core",
"codex-extension-api",
"codex-features",
"codex-memories-read",
"codex-tools",
"codex-utils-absolute-path",
"codex-utils-output-truncation",
"pretty_assertions",
"schemars 0.8.22",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.18",
"tokio",
]
[[package]]
name = "codex-memories-mcp"
version = "0.0.0"
dependencies = [
"anyhow",
"codex-utils-absolute-path",
"codex-utils-output-truncation",
"pretty_assertions",
"rmcp",
"schemars 0.8.22",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.18",
"tokio",
]
[[package]]
name = "codex-memories-read"
version = "0.0.0"
@@ -3032,6 +3223,19 @@ dependencies = [
"wiremock",
]
[[package]]
name = "codex-message-history"
version = "0.0.0"
dependencies = [
"codex-config",
"pretty_assertions",
"serde",
"serde_json",
"tempfile",
"tokio",
"tracing",
]
[[package]]
name = "codex-model-provider"
version = "0.0.0"
@@ -3207,7 +3411,6 @@ dependencies = [
"codex-utils-absolute-path",
"codex-utils-image",
"codex-utils-string",
"codex-utils-template",
"encoding_rs",
"globset",
"http 1.4.0",
@@ -3278,6 +3481,7 @@ version = "0.0.0"
dependencies = [
"anyhow",
"axum",
"base64 0.22.1",
"bytes",
"codex-api",
"codex-client",
@@ -3341,6 +3545,7 @@ dependencies = [
"anyhow",
"codex-code-mode",
"codex-protocol",
"http 1.4.0",
"pretty_assertions",
"serde",
"serde_json",
@@ -3492,6 +3697,17 @@ dependencies = [
"tempfile",
]
[[package]]
name = "codex-thread-manager-sample"
version = "0.0.0"
dependencies = [
"anyhow",
"clap",
"codex-core-api",
"serde_json",
"tracing",
]
[[package]]
name = "codex-thread-store"
version = "0.0.0"
@@ -3502,17 +3718,13 @@ dependencies = [
"codex-protocol",
"codex-rollout",
"codex-state",
"codex-utils-path",
"pretty_assertions",
"prost 0.14.3",
"serde",
"serde_json",
"tempfile",
"thiserror 2.0.18",
"tokio",
"tokio-stream",
"tonic",
"tonic-prost",
"tonic-prost-build",
"tracing",
"uuid",
]
@@ -3521,16 +3733,19 @@ dependencies = [
name = "codex-tools"
version = "0.0.0"
dependencies = [
"async-trait",
"codex-app-server-protocol",
"codex-code-mode",
"codex-features",
"codex-protocol",
"codex-utils-absolute-path",
"codex-utils-pty",
"codex-utils-string",
"pretty_assertions",
"rmcp",
"serde",
"serde_json",
"thiserror 2.0.18",
"tracing",
]
@@ -3563,6 +3778,8 @@ dependencies = [
"codex-install-context",
"codex-login",
"codex-mcp",
"codex-message-history",
"codex-model-provider",
"codex-model-provider-info",
"codex-models-manager",
"codex-otel",
@@ -3570,6 +3787,7 @@ dependencies = [
"codex-protocol",
"codex-realtime-webrtc",
"codex-rollout",
"codex-sandboxing",
"codex-shell-command",
"codex-state",
"codex-terminal-detection",
@@ -3579,15 +3797,16 @@ dependencies = [
"codex-utils-cli",
"codex-utils-elapsed",
"codex-utils-fuzzy-match",
"codex-utils-home-dir",
"codex-utils-oss",
"codex-utils-path",
"codex-utils-plugins",
"codex-utils-pty",
"codex-utils-sandbox-summary",
"codex-utils-sleep-inhibitor",
"codex-utils-string",
"codex-windows-sandbox",
"color-eyre",
"core_test_support",
"cpal",
"crossterm",
"derive_more 2.1.1",
@@ -3611,6 +3830,7 @@ dependencies = [
"serde",
"serde_json",
"serial_test",
"sha2",
"shlex",
"strum 0.27.2",
"strum_macros 0.28.0",
@@ -3637,6 +3857,7 @@ dependencies = [
"which 8.0.0",
"windows-sys 0.52.0",
"winsplit",
"wiremock",
]
[[package]]
@@ -3696,6 +3917,7 @@ version = "0.0.0"
dependencies = [
"clap",
"codex-protocol",
"codex-shell-command",
"pretty_assertions",
"serde",
"toml 0.9.11+spec-1.1.0",
@@ -3850,6 +4072,8 @@ version = "0.0.0"
dependencies = [
"pretty_assertions",
"regex-lite",
"serde",
"serde_json",
]
[[package]]
@@ -3874,6 +4098,7 @@ dependencies = [
"anyhow",
"base64 0.22.1",
"chrono",
"codex-otel",
"codex-protocol",
"codex-utils-absolute-path",
"codex-utils-pty",
@@ -4128,7 +4353,9 @@ dependencies = [
"codex-config",
"codex-core",
"codex-exec-server",
"codex-extension-api",
"codex-features",
"codex-hooks",
"codex-login",
"codex-model-provider-info",
"codex-models-manager",
@@ -4145,6 +4372,7 @@ dependencies = [
"reqwest",
"serde_json",
"shlex",
"similar",
"tempfile",
"tokio",
"tokio-tungstenite",
@@ -5179,12 +5407,6 @@ dependencies = [
"syn 2.0.114",
]
[[package]]
name = "env-flags"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dbfd0e7fc632dec5e6c9396a27bc9f9975b4e039720e1fd3e34021d3ce28c415"
[[package]]
name = "env_filter"
version = "1.0.0"
@@ -5236,7 +5458,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb"
dependencies = [
"libc",
"windows-sys 0.61.2",
"windows-sys 0.52.0",
]
[[package]]
@@ -8784,7 +9006,7 @@ version = "5.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "51e219e79014df21a225b1860a479e2dcd7cbd9130f4defd4bd0e191ea31d67d"
dependencies = [
"base64 0.21.7",
"base64 0.22.1",
"chrono",
"getrandom 0.2.17",
"http 1.4.0",
@@ -9252,7 +9474,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7d8fae84b431384b68627d0f9b3b1245fcf9f46f6c0e3dc902e9dce64edd1967"
dependencies = [
"libc",
"windows-sys 0.45.0",
"windows-sys 0.61.2",
]
[[package]]
@@ -11423,6 +11645,16 @@ dependencies = [
"serde_core",
]
[[package]]
name = "serde_ignored"
version = "0.1.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "115dffd5f3853e06e746965a20dcbae6ee747ae30b543d91b0e089668bb07798"
dependencies = [
"serde",
"serde_core",
]
[[package]]
name = "serde_json"
version = "1.0.149"
@@ -13896,7 +14128,7 @@ version = "0.1.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c2a7b1c03c876122aa43f3020e6c3c3ee5c05081c9a00739faf7503aeba10d22"
dependencies = [
"windows-sys 0.48.0",
"windows-sys 0.61.2",
]
[[package]]

View File

@@ -2,11 +2,15 @@
members = [
"aws-auth",
"analytics",
"agent-graph-store",
"agent-identity",
"backend-client",
"bwrap",
"ansi-escape",
"async-utils",
"app-server",
"app-server-transport",
"app-server-daemon",
"app-server-client",
"app-server-protocol",
"app-server-test-client",
@@ -26,11 +30,11 @@ members = [
"collaboration-mode-templates",
"connectors",
"config",
"device-key",
"shell-command",
"shell-escalation",
"skills",
"core",
"core-api",
"core-plugins",
"core-skills",
"hooks",
@@ -40,16 +44,20 @@ members = [
"exec-server",
"execpolicy",
"execpolicy-legacy",
"ext/extension-api",
"ext/guardian",
"ext/memories",
"external-agent-migration",
"external-agent-sessions",
"keyring-store",
"kernel",
"file-search",
"file-watcher",
"linux-sandbox",
"lmstudio",
"login",
"codex-mcp",
"mcp-server",
"memories/mcp",
"memories/read",
"memories/write",
"model-provider-info",
@@ -98,6 +106,7 @@ members = [
"state",
"terminal-detection",
"test-binary-support",
"thread-manager-sample",
"thread-store",
"uds",
"codex-experimental-api-macros",
@@ -119,11 +128,14 @@ license = "Apache-2.0"
# Internal
app_test_support = { path = "app-server/tests/common" }
codex-analytics = { path = "analytics" }
codex-agent-graph-store = { path = "agent-graph-store" }
codex-agent-identity = { path = "agent-identity" }
codex-ansi-escape = { path = "ansi-escape" }
codex-api = { path = "codex-api" }
codex-aws-auth = { path = "aws-auth" }
codex-app-server = { path = "app-server" }
codex-app-server-transport = { path = "app-server-transport" }
codex-app-server-daemon = { path = "app-server-daemon" }
codex-app-server-client = { path = "app-server-client" }
codex-app-server-protocol = { path = "app-server-protocol" }
codex-app-server-test-client = { path = "app-server-test-client" }
@@ -142,13 +154,15 @@ codex-code-mode = { path = "code-mode" }
codex-config = { path = "config" }
codex-connectors = { path = "connectors" }
codex-core = { path = "core" }
codex-core-api = { path = "core-api" }
codex-core-plugins = { path = "core-plugins" }
codex-core-skills = { path = "core-skills" }
codex-device-key = { path = "device-key" }
codex-exec = { path = "exec" }
codex-file-system = { path = "file-system" }
codex-exec-server = { path = "exec-server" }
codex-execpolicy = { path = "execpolicy" }
codex-extension-api = { path = "ext/extension-api" }
codex-guardian = { path = "ext/guardian" }
codex-external-agent-migration = { path = "external-agent-migration" }
codex-external-agent-sessions = { path = "external-agent-sessions" }
codex-experimental-api-macros = { path = "codex-experimental-api-macros" }
@@ -156,13 +170,15 @@ codex-features = { path = "features" }
codex-feedback = { path = "feedback" }
codex-install-context = { path = "install-context" }
codex-file-search = { path = "file-search" }
codex-file-watcher = { path = "file-watcher" }
codex-git-utils = { path = "git-utils" }
codex-hooks = { path = "hooks" }
codex-keyring-store = { path = "keyring-store" }
codex-kernel = { path = "kernel" }
codex-linux-sandbox = { path = "linux-sandbox" }
codex-lmstudio = { path = "lmstudio" }
codex-login = { path = "login" }
codex-message-history = { path = "message-history" }
codex-memories-extension = { path = "ext/memories" }
codex-memories-read = { path = "memories/read" }
codex-memories-write = { path = "memories/write" }
codex-mcp = { path = "codex-mcp" }
@@ -210,7 +226,6 @@ codex-utils-output-truncation = { path = "utils/output-truncation" }
codex-utils-path = { path = "utils/path-utils" }
codex-utils-plugins = { path = "utils/plugins" }
codex-utils-pty = { path = "utils/pty" }
codex-utils-readiness = { path = "utils/readiness" }
codex-utils-rustls-provider = { path = "utils/rustls-provider" }
codex-utils-sandbox-summary = { path = "utils/sandbox-summary" }
codex-utils-sleep-inhibitor = { path = "utils/sleep-inhibitor" }
@@ -263,7 +278,6 @@ dotenvy = "0.15.7"
dunce = "1.0.4"
ed25519-dalek = { version = "2.2.0", features = ["pkcs8"] }
encoding_rs = "0.8.35"
env-flags = "0.1.1"
env_logger = "0.11.9"
eventsource-stream = "0.2.3"
flate2 = "1.1.8"
@@ -308,7 +322,6 @@ os_info = "3.12.0"
owo-colors = "4.3.0"
path-absolutize = "3.1.1"
pathdiff = "0.2"
p256 = "0.13.2"
portable-pty = "0.9.0"
predicates = "3"
pretty_assertions = "1.4.1"
@@ -317,6 +330,10 @@ quick-xml = "0.38.4"
rand = "0.9"
ratatui = "0.29.0"
ratatui-macros = "0.6.0"
rcgen = { version = "0.14.7", default-features = false, features = [
"aws_lc_rs",
"pem",
] }
regex = "1.12.3"
regex-lite = "0.1.8"
reqwest = { version = "0.12", features = ["cookies"] }
@@ -333,6 +350,7 @@ seccompiler = "0.5.0"
semver = "1.0"
sentry = "0.46.0"
serde = "1"
serde_ignored = "0.1.14"
serde_json = "1"
serde_path_to_error = "0.1.20"
serde_with = "3.17"
@@ -451,23 +469,22 @@ unwrap_used = "deny"
# silence the false positive here instead of deleting a real dependency.
[workspace.metadata.cargo-shear]
ignored = [
"codex-agent-graph-store",
"icu_provider",
"openssl-sys",
"codex-utils-readiness",
"codex-utils-template",
"codex-v8-poc",
]
[profile.dev]
# Keep line tables/backtraces while avoiding expensive full variable debug info
# across local dev builds.
debug = 1
debug = "limited"
[profile.dev-small]
inherits = "dev"
opt-level = 0
debug = 0
strip = true
debug = "none"
strip = "symbols"
[profile.release]
lto = "fat"
@@ -479,8 +496,15 @@ strip = "symbols"
# See https://github.com/openai/codex/issues/1411 for details.
codegen-units = 1
[profile.profiling]
inherits = "release"
debug = "full"
lto = false
strip = false
[profile.ci-test]
debug = 1 # Reduce debug symbol size
# Reduce binary size to reduce disk pressure.
debug = "limited"
inherits = "test"
opt-level = 0

View File

@@ -0,0 +1,6 @@
load("//:defs.bzl", "codex_rust_crate")
codex_rust_crate(
name = "agent-graph-store",
crate_name = "codex_agent_graph_store",
)

View File

@@ -1,25 +1,26 @@
[package]
edition.workspace = true
license.workspace = true
name = "codex-kernel"
name = "codex-agent-graph-store"
version.workspace = true
[lib]
doctest = false
name = "codex_kernel"
name = "codex_agent_graph_store"
path = "src/lib.rs"
doctest = false
[lints]
workspace = true
[dependencies]
async-trait = { workspace = true }
codex-protocol = { workspace = true }
codex-tools = { workspace = true }
rand = { workspace = true }
codex-state = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
tokio = { workspace = true, features = ["time"] }
tokio-util = { workspace = true, features = ["rt"] }
thiserror = { workspace = true }
[dev-dependencies]
pretty_assertions = { workspace = true }
serde_json = { workspace = true }
tempfile = { workspace = true }
tokio = { workspace = true, features = ["macros", "rt-multi-thread"] }

View File

@@ -0,0 +1,20 @@
/// Result type returned by agent graph store operations.
pub type AgentGraphStoreResult<T> = Result<T, AgentGraphStoreError>;
/// Error type shared by agent graph store implementations.
#[derive(Debug, thiserror::Error)]
pub enum AgentGraphStoreError {
/// The caller supplied invalid request data.
#[error("invalid agent graph store request: {message}")]
InvalidRequest {
/// User-facing explanation of the invalid request.
message: String,
},
/// Catch-all for implementation failures that do not fit a more specific category.
#[error("agent graph store internal error: {message}")]
Internal {
/// User-facing explanation of the implementation failure.
message: String,
},
}

View File

@@ -0,0 +1,12 @@
//! Storage-neutral parent/child topology for thread-spawned agents.
mod error;
mod local;
mod store;
mod types;
pub use error::AgentGraphStoreError;
pub use error::AgentGraphStoreResult;
pub use local::LocalAgentGraphStore;
pub use store::AgentGraphStore;
pub use types::ThreadSpawnEdgeStatus;

View File

@@ -0,0 +1,325 @@
use async_trait::async_trait;
use codex_protocol::ThreadId;
use codex_state::StateRuntime;
use std::sync::Arc;
use crate::AgentGraphStore;
use crate::AgentGraphStoreError;
use crate::AgentGraphStoreResult;
use crate::ThreadSpawnEdgeStatus;
/// SQLite-backed implementation of [`AgentGraphStore`] using an existing state runtime.
#[derive(Clone)]
pub struct LocalAgentGraphStore {
state_db: Arc<StateRuntime>,
}
impl std::fmt::Debug for LocalAgentGraphStore {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("LocalAgentGraphStore")
.field("codex_home", &self.state_db.codex_home())
.finish_non_exhaustive()
}
}
impl LocalAgentGraphStore {
/// Create a local graph store from an already-initialized state runtime.
pub fn new(state_db: Arc<StateRuntime>) -> Self {
Self { state_db }
}
}
#[async_trait]
impl AgentGraphStore for LocalAgentGraphStore {
async fn upsert_thread_spawn_edge(
&self,
parent_thread_id: ThreadId,
child_thread_id: ThreadId,
status: ThreadSpawnEdgeStatus,
) -> AgentGraphStoreResult<()> {
self.state_db
.upsert_thread_spawn_edge(parent_thread_id, child_thread_id, to_state_status(status))
.await
.map_err(internal_error)
}
async fn set_thread_spawn_edge_status(
&self,
child_thread_id: ThreadId,
status: ThreadSpawnEdgeStatus,
) -> AgentGraphStoreResult<()> {
self.state_db
.set_thread_spawn_edge_status(child_thread_id, to_state_status(status))
.await
.map_err(internal_error)
}
async fn list_thread_spawn_children(
&self,
parent_thread_id: ThreadId,
status_filter: Option<ThreadSpawnEdgeStatus>,
) -> AgentGraphStoreResult<Vec<ThreadId>> {
if let Some(status) = status_filter {
return self
.state_db
.list_thread_spawn_children_with_status(parent_thread_id, to_state_status(status))
.await
.map_err(internal_error);
}
self.state_db
.list_thread_spawn_children(parent_thread_id)
.await
.map_err(internal_error)
}
async fn list_thread_spawn_descendants(
&self,
root_thread_id: ThreadId,
status_filter: Option<ThreadSpawnEdgeStatus>,
) -> AgentGraphStoreResult<Vec<ThreadId>> {
match status_filter {
Some(status) => self
.state_db
.list_thread_spawn_descendants_with_status(root_thread_id, to_state_status(status))
.await
.map_err(internal_error),
None => self
.state_db
.list_thread_spawn_descendants(root_thread_id)
.await
.map_err(internal_error),
}
}
}
fn to_state_status(status: ThreadSpawnEdgeStatus) -> codex_state::DirectionalThreadSpawnEdgeStatus {
match status {
ThreadSpawnEdgeStatus::Open => codex_state::DirectionalThreadSpawnEdgeStatus::Open,
ThreadSpawnEdgeStatus::Closed => codex_state::DirectionalThreadSpawnEdgeStatus::Closed,
}
}
fn internal_error(err: impl std::fmt::Display) -> AgentGraphStoreError {
AgentGraphStoreError::Internal {
message: err.to_string(),
}
}
#[cfg(test)]
mod tests {
use super::*;
use codex_state::DirectionalThreadSpawnEdgeStatus;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
struct TestRuntime {
state_db: Arc<StateRuntime>,
_codex_home: TempDir,
}
fn thread_id(suffix: u128) -> ThreadId {
ThreadId::from_string(&format!("00000000-0000-0000-0000-{suffix:012}"))
.expect("valid thread id")
}
async fn state_runtime() -> TestRuntime {
let codex_home = TempDir::new().expect("tempdir should be created");
let state_db =
StateRuntime::init(codex_home.path().to_path_buf(), "test-provider".to_string())
.await
.expect("state db should initialize");
TestRuntime {
state_db,
_codex_home: codex_home,
}
}
#[tokio::test]
async fn local_store_upserts_and_lists_direct_children_with_status_filters() {
let fixture = state_runtime().await;
let state_db = fixture.state_db;
let store = LocalAgentGraphStore::new(state_db.clone());
let parent_thread_id = thread_id(/*suffix*/ 1);
let first_child_thread_id = thread_id(/*suffix*/ 2);
let second_child_thread_id = thread_id(/*suffix*/ 3);
store
.upsert_thread_spawn_edge(
parent_thread_id,
second_child_thread_id,
ThreadSpawnEdgeStatus::Closed,
)
.await
.expect("closed child edge should insert");
store
.upsert_thread_spawn_edge(
parent_thread_id,
first_child_thread_id,
ThreadSpawnEdgeStatus::Open,
)
.await
.expect("open child edge should insert");
let all_children = store
.list_thread_spawn_children(parent_thread_id, /*status_filter*/ None)
.await
.expect("all children should load");
assert_eq!(
all_children,
vec![first_child_thread_id, second_child_thread_id]
);
let open_children = store
.list_thread_spawn_children(parent_thread_id, Some(ThreadSpawnEdgeStatus::Open))
.await
.expect("open children should load");
let state_open_children = state_db
.list_thread_spawn_children_with_status(
parent_thread_id,
DirectionalThreadSpawnEdgeStatus::Open,
)
.await
.expect("state open children should load");
assert_eq!(open_children, state_open_children);
assert_eq!(open_children, vec![first_child_thread_id]);
let closed_children = store
.list_thread_spawn_children(parent_thread_id, Some(ThreadSpawnEdgeStatus::Closed))
.await
.expect("closed children should load");
assert_eq!(closed_children, vec![second_child_thread_id]);
}
#[tokio::test]
async fn local_store_updates_edge_status() {
let fixture = state_runtime().await;
let state_db = fixture.state_db;
let store = LocalAgentGraphStore::new(state_db);
let parent_thread_id = thread_id(/*suffix*/ 10);
let child_thread_id = thread_id(/*suffix*/ 11);
store
.upsert_thread_spawn_edge(
parent_thread_id,
child_thread_id,
ThreadSpawnEdgeStatus::Open,
)
.await
.expect("child edge should insert");
store
.set_thread_spawn_edge_status(child_thread_id, ThreadSpawnEdgeStatus::Closed)
.await
.expect("child edge should close");
let open_children = store
.list_thread_spawn_children(parent_thread_id, Some(ThreadSpawnEdgeStatus::Open))
.await
.expect("open children should load");
assert_eq!(open_children, Vec::<ThreadId>::new());
let closed_children = store
.list_thread_spawn_children(parent_thread_id, Some(ThreadSpawnEdgeStatus::Closed))
.await
.expect("closed children should load");
assert_eq!(closed_children, vec![child_thread_id]);
}
#[tokio::test]
async fn local_store_lists_descendants_breadth_first_with_status_filters() {
let fixture = state_runtime().await;
let state_db = fixture.state_db;
let store = LocalAgentGraphStore::new(state_db.clone());
let root_thread_id = thread_id(/*suffix*/ 20);
let later_child_thread_id = thread_id(/*suffix*/ 22);
let earlier_child_thread_id = thread_id(/*suffix*/ 21);
let closed_grandchild_thread_id = thread_id(/*suffix*/ 23);
let open_grandchild_thread_id = thread_id(/*suffix*/ 24);
let closed_child_thread_id = thread_id(/*suffix*/ 25);
let closed_great_grandchild_thread_id = thread_id(/*suffix*/ 26);
for (parent_thread_id, child_thread_id, status) in [
(
root_thread_id,
later_child_thread_id,
ThreadSpawnEdgeStatus::Open,
),
(
root_thread_id,
earlier_child_thread_id,
ThreadSpawnEdgeStatus::Open,
),
(
earlier_child_thread_id,
open_grandchild_thread_id,
ThreadSpawnEdgeStatus::Open,
),
(
later_child_thread_id,
closed_grandchild_thread_id,
ThreadSpawnEdgeStatus::Closed,
),
(
root_thread_id,
closed_child_thread_id,
ThreadSpawnEdgeStatus::Closed,
),
(
closed_child_thread_id,
closed_great_grandchild_thread_id,
ThreadSpawnEdgeStatus::Closed,
),
] {
store
.upsert_thread_spawn_edge(parent_thread_id, child_thread_id, status)
.await
.expect("edge should insert");
}
let all_descendants = store
.list_thread_spawn_descendants(root_thread_id, /*status_filter*/ None)
.await
.expect("all descendants should load");
assert_eq!(
all_descendants,
vec![
earlier_child_thread_id,
later_child_thread_id,
closed_child_thread_id,
closed_grandchild_thread_id,
open_grandchild_thread_id,
closed_great_grandchild_thread_id,
]
);
let open_descendants = store
.list_thread_spawn_descendants(root_thread_id, Some(ThreadSpawnEdgeStatus::Open))
.await
.expect("open descendants should load");
let state_open_descendants = state_db
.list_thread_spawn_descendants_with_status(
root_thread_id,
DirectionalThreadSpawnEdgeStatus::Open,
)
.await
.expect("state open descendants should load");
assert_eq!(open_descendants, state_open_descendants);
assert_eq!(
open_descendants,
vec![
earlier_child_thread_id,
later_child_thread_id,
open_grandchild_thread_id,
]
);
let closed_descendants = store
.list_thread_spawn_descendants(root_thread_id, Some(ThreadSpawnEdgeStatus::Closed))
.await
.expect("closed descendants should load");
assert_eq!(
closed_descendants,
vec![closed_child_thread_id, closed_great_grandchild_thread_id]
);
}
}

View File

@@ -0,0 +1,55 @@
use async_trait::async_trait;
use codex_protocol::ThreadId;
use crate::AgentGraphStoreResult;
use crate::ThreadSpawnEdgeStatus;
/// Storage-neutral boundary for persisted thread-spawn parent/child topology.
///
/// Implementations are expected to return stable ordering for list methods so callers can merge
/// persisted graph state with live in-memory state without introducing nondeterministic output.
#[async_trait]
pub trait AgentGraphStore: Send + Sync {
/// Insert or replace the directional parent/child edge for a spawned thread.
///
/// `child_thread_id` has at most one persisted parent. Re-inserting the same child should
/// update both the parent and status to match the supplied values.
async fn upsert_thread_spawn_edge(
&self,
parent_thread_id: ThreadId,
child_thread_id: ThreadId,
status: ThreadSpawnEdgeStatus,
) -> AgentGraphStoreResult<()>;
/// Update the persisted lifecycle status of a spawned thread's incoming edge.
///
/// Implementations should treat missing children as a successful no-op.
async fn set_thread_spawn_edge_status(
&self,
child_thread_id: ThreadId,
status: ThreadSpawnEdgeStatus,
) -> AgentGraphStoreResult<()>;
/// List direct spawned children of a parent thread.
///
/// When `status_filter` is `Some`, only child edges with that exact status are returned. When
/// it is `None`, all direct child edges are returned regardless of status, including statuses
/// that may be added by a future store implementation.
async fn list_thread_spawn_children(
&self,
parent_thread_id: ThreadId,
status_filter: Option<ThreadSpawnEdgeStatus>,
) -> AgentGraphStoreResult<Vec<ThreadId>>;
/// List spawned descendants breadth-first by depth, then by thread id.
///
/// `status_filter` is applied to every traversed edge, not just to the returned descendants.
/// For example, `Some(Open)` walks only open edges, so descendants under a closed edge are not
/// included even if their own incoming edge is open. `None` walks and returns every persisted
/// edge regardless of status.
async fn list_thread_spawn_descendants(
&self,
root_thread_id: ThreadId,
status_filter: Option<ThreadSpawnEdgeStatus>,
) -> AgentGraphStoreResult<Vec<ThreadId>>;
}

View File

@@ -0,0 +1,42 @@
use serde::Deserialize;
use serde::Serialize;
/// Lifecycle status attached to a directional thread-spawn edge.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum ThreadSpawnEdgeStatus {
/// The child thread is still live or resumable as an open spawned agent.
Open,
/// The child thread has been closed from the parent/child graph's perspective.
Closed,
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
#[test]
fn thread_spawn_edge_status_serializes_as_snake_case() {
assert_eq!(
serde_json::to_string(&ThreadSpawnEdgeStatus::Open)
.expect("open status should serialize"),
"\"open\""
);
assert_eq!(
serde_json::to_string(&ThreadSpawnEdgeStatus::Closed)
.expect("closed status should serialize"),
"\"closed\""
);
assert_eq!(
serde_json::from_str::<ThreadSpawnEdgeStatus>("\"open\"")
.expect("open status should deserialize"),
ThreadSpawnEdgeStatus::Open
);
assert_eq!(
serde_json::from_str::<ThreadSpawnEdgeStatus>("\"closed\"")
.expect("closed status should deserialize"),
ThreadSpawnEdgeStatus::Closed
);
}
}

View File

@@ -0,0 +1,256 @@
use crate::events::CodexAcceptedLineFingerprintsEventParams;
use crate::events::CodexAcceptedLineFingerprintsEventRequest;
use crate::events::TrackEventRequest;
use crate::facts::AcceptedLineFingerprint;
use codex_git_utils::canonicalize_git_remote_url;
use codex_git_utils::get_git_remote_urls_assume_git_repo;
use sha1::Digest;
use std::path::Path;
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AcceptedLineFingerprintSummary {
pub accepted_added_lines: u64,
pub accepted_deleted_lines: u64,
pub line_fingerprints: Vec<AcceptedLineFingerprint>,
}
pub(crate) struct AcceptedLineFingerprintEventInput {
pub(crate) event_type: &'static str,
pub(crate) turn_id: String,
pub(crate) thread_id: String,
pub(crate) product_surface: Option<String>,
pub(crate) model_slug: Option<String>,
pub(crate) completed_at: u64,
pub(crate) repo_hash: Option<String>,
pub(crate) accepted_added_lines: u64,
pub(crate) accepted_deleted_lines: u64,
pub(crate) line_fingerprints: Vec<AcceptedLineFingerprint>,
}
pub fn accepted_line_fingerprints_from_unified_diff(
unified_diff: &str,
) -> AcceptedLineFingerprintSummary {
let mut current_path: Option<String> = None;
let mut in_hunk = false;
let mut accepted_added_lines = 0;
let mut accepted_deleted_lines = 0;
let mut line_fingerprints = Vec::new();
for line in unified_diff.lines() {
if line.starts_with("diff --git ") {
current_path = None;
in_hunk = false;
continue;
}
if line.starts_with("@@ ") {
in_hunk = true;
continue;
}
if !in_hunk && let Some(path) = line.strip_prefix("+++ ") {
current_path = normalize_diff_path(path);
continue;
}
if !in_hunk && line.starts_with("--- ") {
continue;
}
if let Some(added_line) = line.strip_prefix('+') {
accepted_added_lines += 1;
if let Some(path) = current_path.as_deref()
&& let Some(normalized_line) = normalize_effective_line(added_line)
{
line_fingerprints.push(AcceptedLineFingerprint {
path_hash: fingerprint_hash("path", path),
line_hash: fingerprint_hash("line", &normalized_line),
});
}
continue;
}
if line.starts_with('-') {
accepted_deleted_lines += 1;
}
}
AcceptedLineFingerprintSummary {
accepted_added_lines,
accepted_deleted_lines,
line_fingerprints,
}
}
pub fn fingerprint_hash(domain: &str, value: &str) -> String {
let mut hasher = sha1::Sha1::new();
hasher.update(b"file-line-v1\0");
hasher.update(domain.as_bytes());
hasher.update(b"\0");
hasher.update(value.as_bytes());
format!("{:x}", hasher.finalize())
}
pub(crate) fn accepted_line_fingerprint_event_requests(
input: AcceptedLineFingerprintEventInput,
) -> Vec<TrackEventRequest> {
let AcceptedLineFingerprintEventInput {
event_type,
turn_id,
thread_id,
product_surface,
model_slug,
completed_at,
repo_hash,
accepted_added_lines,
accepted_deleted_lines,
line_fingerprints: _line_fingerprints,
} = input;
vec![TrackEventRequest::AcceptedLineFingerprints(Box::new(
CodexAcceptedLineFingerprintsEventRequest {
event_type: "codex_accepted_line_fingerprints",
event_params: CodexAcceptedLineFingerprintsEventParams {
event_type,
turn_id,
thread_id,
product_surface,
model_slug,
completed_at,
repo_hash,
accepted_added_lines,
accepted_deleted_lines,
// Keep computing local fingerprints for parsing tests and future attribution,
// but do not upload path/line hashes in the analytics event payload.
line_fingerprints: Vec::new(),
},
},
))]
}
pub async fn accepted_line_repo_hash_for_cwd(cwd: &Path) -> Option<String> {
let remotes = get_git_remote_urls_assume_git_repo(cwd).await?;
remotes
.get("origin")
.or_else(|| remotes.values().next())
.map(|remote_url| {
let canonical_remote_url =
canonicalize_git_remote_url(remote_url).unwrap_or_else(|| remote_url.to_string());
fingerprint_hash("repo", &canonical_remote_url)
})
}
fn normalize_diff_path(path: &str) -> Option<String> {
let path = path.trim();
if path == "/dev/null" {
return None;
}
Some(
path.strip_prefix("b/")
.or_else(|| path.strip_prefix("a/"))
.unwrap_or(path)
.to_string(),
)
}
fn normalize_effective_line(line: &str) -> Option<String> {
let normalized = line.split_whitespace().collect::<Vec<_>>().join(" ");
if normalized.len() <= 3 {
return None;
}
if !normalized
.chars()
.any(|ch| ch.is_alphanumeric() || ch == '_')
{
return None;
}
Some(normalized)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parses_counts_and_effective_added_fingerprints() {
let diff = "\
diff --git a/src/lib.rs b/src/lib.rs
index 1111111..2222222
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,3 +1,5 @@
-old line
+fn useful() {
+}
+ return user.id;
context
";
let summary = accepted_line_fingerprints_from_unified_diff(diff);
assert_eq!(
summary,
AcceptedLineFingerprintSummary {
accepted_added_lines: 3,
accepted_deleted_lines: 1,
line_fingerprints: vec![
AcceptedLineFingerprint {
path_hash: fingerprint_hash("path", "src/lib.rs"),
line_hash: fingerprint_hash("line", "fn useful() {"),
},
AcceptedLineFingerprint {
path_hash: fingerprint_hash("path", "src/lib.rs"),
line_hash: fingerprint_hash("line", "return user.id;"),
},
],
}
);
}
#[test]
fn skips_added_file_metadata_headers() {
let diff = "\
diff --git a/new.py b/new.py
new file mode 100644
index 0000000..1111111
--- /dev/null
+++ b/new.py
@@ -0,0 +1 @@
+print('hello')
";
let summary = accepted_line_fingerprints_from_unified_diff(diff);
assert_eq!(summary.accepted_added_lines, 1);
assert_eq!(summary.accepted_deleted_lines, 0);
assert_eq!(summary.line_fingerprints.len(), 1);
}
#[test]
fn parses_hunk_lines_that_look_like_file_headers() {
let diff = "\
diff --git a/src/lib.rs b/src/lib.rs
index 1111111..2222222
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,2 +1,2 @@
--- old value
+++ new value
";
let summary = accepted_line_fingerprints_from_unified_diff(diff);
assert_eq!(
summary,
AcceptedLineFingerprintSummary {
accepted_added_lines: 1,
accepted_deleted_lines: 1,
line_fingerprints: vec![AcceptedLineFingerprint {
path_hash: fingerprint_hash("path", "src/lib.rs"),
line_hash: fingerprint_hash("line", "++ new value"),
}],
}
);
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -22,14 +22,18 @@ use crate::facts::TurnResolvedConfigFact;
use crate::facts::TurnTokenUsageFact;
use crate::reducer::AnalyticsReducer;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::ClientResponse;
use codex_app_server_protocol::ClientResponsePayload;
use codex_app_server_protocol::InitializeParams;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequest;
use codex_app_server_protocol::ServerResponse;
use codex_login::AuthManager;
use codex_login::CodexAuth;
use codex_login::default_client::create_client;
use codex_plugin::PluginTelemetryMetadata;
use codex_protocol::request_permissions::RequestPermissionsResponse;
use std::collections::HashSet;
use std::sync::Arc;
use std::sync::Mutex;
@@ -49,8 +53,7 @@ pub(crate) struct AnalyticsEventsQueue {
#[derive(Clone)]
pub struct AnalyticsEventsClient {
queue: AnalyticsEventsQueue,
analytics_enabled: Option<bool>,
queue: Option<AnalyticsEventsQueue>,
}
impl AnalyticsEventsQueue {
@@ -119,11 +122,15 @@ impl AnalyticsEventsClient {
analytics_enabled: Option<bool>,
) -> Self {
Self {
queue: AnalyticsEventsQueue::new(Arc::clone(&auth_manager), base_url),
analytics_enabled,
queue: (analytics_enabled != Some(false))
.then(|| AnalyticsEventsQueue::new(Arc::clone(&auth_manager), base_url)),
}
}
pub fn disabled() -> Self {
Self { queue: None }
}
pub fn track_skill_invocations(
&self,
tracking: TrackEventsContext,
@@ -166,9 +173,10 @@ impl AnalyticsEventsClient {
&self,
tracking: &GuardianReviewTrackContext,
result: GuardianReviewAnalyticsResult,
completed_at_ms: u64,
) {
self.record_fact(AnalyticsFact::Custom(CustomAnalyticsFact::GuardianReview(
Box::new(tracking.event_params(result)),
Box::new(tracking.event_params(result, completed_at_ms)),
)));
}
@@ -181,16 +189,30 @@ impl AnalyticsEventsClient {
)));
}
pub fn track_request(&self, connection_id: u64, request_id: RequestId, request: ClientRequest) {
self.record_fact(AnalyticsFact::Request {
pub fn track_request(
&self,
connection_id: u64,
request_id: RequestId,
request: &ClientRequest,
) {
if !matches!(
request,
ClientRequest::TurnStart { .. } | ClientRequest::TurnSteer { .. }
) {
return;
}
self.record_fact(AnalyticsFact::ClientRequest {
connection_id,
request_id,
request: Box::new(request),
request: Box::new(request.clone()),
});
}
pub fn track_app_used(&self, tracking: TrackEventsContext, app: AppInvocation) {
if !self.queue.should_enqueue_app_used(&tracking, &app) {
let Some(queue) = self.queue.as_ref() else {
return;
};
if !queue.should_enqueue_app_used(&tracking, &app) {
return;
}
self.record_fact(AnalyticsFact::Custom(CustomAnalyticsFact::AppUsed(
@@ -205,7 +227,10 @@ impl AnalyticsEventsClient {
}
pub fn track_plugin_used(&self, tracking: TrackEventsContext, plugin: PluginTelemetryMetadata) {
if !self.queue.should_enqueue_plugin_used(&tracking, &plugin) {
let Some(queue) = self.queue.as_ref() else {
return;
};
if !queue.should_enqueue_plugin_used(&tracking, &plugin) {
return;
}
self.record_fact(AnalyticsFact::Custom(CustomAnalyticsFact::PluginUsed(
@@ -268,15 +293,30 @@ impl AnalyticsEventsClient {
}
pub(crate) fn record_fact(&self, input: AnalyticsFact) {
if self.analytics_enabled == Some(false) {
return;
if let Some(queue) = self.queue.as_ref() {
queue.try_send(input);
}
self.queue.try_send(input);
}
pub fn track_response(&self, connection_id: u64, response: ClientResponse) {
self.record_fact(AnalyticsFact::Response {
pub fn track_response(
&self,
connection_id: u64,
request_id: RequestId,
response: ClientResponsePayload,
) {
if !matches!(
response,
ClientResponsePayload::ThreadStart(_)
| ClientResponsePayload::ThreadResume(_)
| ClientResponsePayload::ThreadFork(_)
| ClientResponsePayload::TurnStart(_)
| ClientResponsePayload::TurnSteer(_)
) {
return;
}
self.record_fact(AnalyticsFact::ClientResponse {
connection_id,
request_id,
response: Box::new(response),
});
}
@@ -296,7 +336,53 @@ impl AnalyticsEventsClient {
});
}
pub fn track_server_request(&self, connection_id: u64, request: ServerRequest) {
self.record_fact(AnalyticsFact::ServerRequest {
connection_id,
request: Box::new(request),
});
}
pub fn track_server_response(&self, completed_at_ms: u64, response: ServerResponse) {
self.record_fact(AnalyticsFact::ServerResponse {
completed_at_ms,
response: Box::new(response),
});
}
pub fn track_effective_permissions_approval_response(
&self,
completed_at_ms: u64,
request_id: RequestId,
response: RequestPermissionsResponse,
) {
self.record_fact(AnalyticsFact::EffectivePermissionsApprovalResponse {
completed_at_ms,
request_id,
response: Box::new(response),
});
}
pub fn track_server_request_aborted(&self, completed_at_ms: u64, request_id: RequestId) {
self.record_fact(AnalyticsFact::ServerRequestAborted {
completed_at_ms,
request_id,
});
}
pub fn track_notification(&self, notification: ServerNotification) {
if !matches!(
notification,
ServerNotification::TurnStarted(_)
| ServerNotification::TurnCompleted(_)
| ServerNotification::TurnDiffUpdated(_)
| ServerNotification::ItemStarted(_)
| ServerNotification::ItemCompleted(_)
| ServerNotification::ItemGuardianApprovalReviewStarted(_)
| ServerNotification::ItemGuardianApprovalReviewCompleted(_)
) {
return;
}
self.record_fact(AnalyticsFact::Notification(Box::new(notification)));
}
}
@@ -309,6 +395,7 @@ async fn send_track_events(
if events.is_empty() {
return;
}
let Some(auth) = auth_manager.auth().await else {
return;
};
@@ -318,12 +405,45 @@ async fn send_track_events(
let base_url = base_url.trim_end_matches('/');
let url = format!("{base_url}/codex/analytics-events/events");
for events in track_event_request_batches(events) {
send_track_events_request(&auth, &url, events).await;
}
}
fn track_event_request_batches(events: Vec<TrackEventRequest>) -> Vec<Vec<TrackEventRequest>> {
let mut batches = Vec::new();
let mut current_batch = Vec::new();
for event in events {
if event.should_send_in_isolated_request() {
if !current_batch.is_empty() {
batches.push(current_batch);
current_batch = Vec::new();
}
batches.push(vec![event]);
} else {
current_batch.push(event);
}
}
if !current_batch.is_empty() {
batches.push(current_batch);
}
batches
}
async fn send_track_events_request(auth: &CodexAuth, url: &str, events: Vec<TrackEventRequest>) {
if events.is_empty() {
return;
}
let payload = TrackEventsRequest { events };
let response = create_client()
.post(&url)
.post(url)
.timeout(ANALYTICS_EVENTS_TIMEOUT)
.headers(codex_model_provider::auth_provider_from_auth(&auth).to_auth_headers())
.headers(codex_model_provider::auth_provider_from_auth(auth).to_auth_headers())
.header("Content-Type", "application/json")
.json(&payload)
.send()
@@ -341,3 +461,7 @@ async fn send_track_events(
}
}
}
#[cfg(test)]
#[path = "client_tests.rs"]
mod tests;

View File

@@ -0,0 +1,283 @@
use super::AnalyticsEventsClient;
use super::AnalyticsEventsQueue;
use super::track_event_request_batches;
use crate::events::CodexAcceptedLineFingerprintsEventParams;
use crate::events::CodexAcceptedLineFingerprintsEventRequest;
use crate::events::SkillInvocationEventParams;
use crate::events::SkillInvocationEventRequest;
use crate::events::TrackEventRequest;
use crate::facts::AnalyticsFact;
use crate::facts::InvocationType;
use codex_app_server_protocol::ApprovalsReviewer as AppServerApprovalsReviewer;
use codex_app_server_protocol::AskForApproval as AppServerAskForApproval;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::ClientResponsePayload;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::SandboxPolicy as AppServerSandboxPolicy;
use codex_app_server_protocol::SessionSource as AppServerSessionSource;
use codex_app_server_protocol::Thread;
use codex_app_server_protocol::ThreadArchiveParams;
use codex_app_server_protocol::ThreadArchiveResponse;
use codex_app_server_protocol::ThreadForkResponse;
use codex_app_server_protocol::ThreadResumeResponse;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::ThreadStatus as AppServerThreadStatus;
use codex_app_server_protocol::Turn;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStatus as AppServerTurnStatus;
use codex_app_server_protocol::TurnSteerParams;
use codex_app_server_protocol::TurnSteerResponse;
use codex_utils_absolute_path::test_support::PathBufExt;
use codex_utils_absolute_path::test_support::test_path_buf;
use std::collections::HashSet;
use std::sync::Arc;
use std::sync::Mutex;
use tokio::sync::mpsc;
use tokio::sync::mpsc::error::TryRecvError;
fn sample_accepted_line_fingerprint_event(thread_id: &str) -> TrackEventRequest {
TrackEventRequest::AcceptedLineFingerprints(Box::new(
CodexAcceptedLineFingerprintsEventRequest {
event_type: "codex_accepted_line_fingerprints",
event_params: CodexAcceptedLineFingerprintsEventParams {
event_type: "codex.accepted_line_fingerprints",
turn_id: "turn-1".to_string(),
thread_id: thread_id.to_string(),
product_surface: Some("codex".to_string()),
model_slug: Some("gpt-5.1-codex".to_string()),
completed_at: 1,
repo_hash: None,
accepted_added_lines: 1,
accepted_deleted_lines: 0,
line_fingerprints: Vec::new(),
},
},
))
}
fn sample_regular_track_event(thread_id: &str) -> TrackEventRequest {
TrackEventRequest::SkillInvocation(SkillInvocationEventRequest {
event_type: "skill_invocation",
skill_id: format!("skill-{thread_id}"),
skill_name: "doc".to_string(),
event_params: SkillInvocationEventParams {
product_client_id: None,
skill_scope: None,
plugin_id: None,
repo_url: None,
thread_id: Some(thread_id.to_string()),
turn_id: Some("turn-1".to_string()),
invoke_type: Some(InvocationType::Explicit),
model_slug: Some("gpt-5.1-codex".to_string()),
},
})
}
fn client_with_receiver() -> (AnalyticsEventsClient, mpsc::Receiver<AnalyticsFact>) {
let (sender, receiver) = mpsc::channel(8);
let queue = AnalyticsEventsQueue {
sender,
app_used_emitted_keys: Arc::new(Mutex::new(HashSet::new())),
plugin_used_emitted_keys: Arc::new(Mutex::new(HashSet::new())),
};
(AnalyticsEventsClient { queue: Some(queue) }, receiver)
}
fn sample_turn_start_request() -> ClientRequest {
ClientRequest::TurnStart {
request_id: RequestId::Integer(1),
params: TurnStartParams {
thread_id: "thread-1".to_string(),
input: Vec::new(),
..Default::default()
},
}
}
fn sample_turn_steer_request() -> ClientRequest {
ClientRequest::TurnSteer {
request_id: RequestId::Integer(2),
params: TurnSteerParams {
thread_id: "thread-1".to_string(),
expected_turn_id: "turn-1".to_string(),
input: Vec::new(),
responsesapi_client_metadata: None,
},
}
}
fn sample_thread_archive_request() -> ClientRequest {
ClientRequest::ThreadArchive {
request_id: RequestId::Integer(3),
params: ThreadArchiveParams {
thread_id: "thread-1".to_string(),
},
}
}
fn sample_thread(thread_id: &str) -> Thread {
Thread {
id: thread_id.to_string(),
session_id: format!("session-{thread_id}"),
forked_from_id: None,
preview: "first prompt".to_string(),
ephemeral: false,
model_provider: "openai".to_string(),
created_at: 1,
updated_at: 2,
status: AppServerThreadStatus::Idle,
path: None,
cwd: test_path_buf("/tmp").abs(),
cli_version: "0.0.0".to_string(),
source: AppServerSessionSource::Exec,
thread_source: None,
agent_nickname: None,
agent_role: None,
git_info: None,
name: None,
turns: Vec::new(),
}
}
fn sample_thread_start_response() -> ClientResponsePayload {
ClientResponsePayload::ThreadStart(ThreadStartResponse {
thread: sample_thread("thread-1"),
model: "gpt-5".to_string(),
model_provider: "openai".to_string(),
service_tier: None,
cwd: test_path_buf("/tmp").abs(),
runtime_workspace_roots: Vec::new(),
instruction_sources: Vec::new(),
approval_policy: AppServerAskForApproval::OnFailure,
approvals_reviewer: AppServerApprovalsReviewer::User,
sandbox: AppServerSandboxPolicy::DangerFullAccess,
active_permission_profile: None,
reasoning_effort: None,
})
}
fn sample_thread_resume_response() -> ClientResponsePayload {
ClientResponsePayload::ThreadResume(ThreadResumeResponse {
thread: sample_thread("thread-2"),
model: "gpt-5".to_string(),
model_provider: "openai".to_string(),
service_tier: None,
cwd: test_path_buf("/tmp").abs(),
runtime_workspace_roots: Vec::new(),
instruction_sources: Vec::new(),
approval_policy: AppServerAskForApproval::OnFailure,
approvals_reviewer: AppServerApprovalsReviewer::User,
sandbox: AppServerSandboxPolicy::DangerFullAccess,
active_permission_profile: None,
reasoning_effort: None,
})
}
fn sample_thread_fork_response() -> ClientResponsePayload {
ClientResponsePayload::ThreadFork(ThreadForkResponse {
thread: sample_thread("thread-3"),
model: "gpt-5".to_string(),
model_provider: "openai".to_string(),
service_tier: None,
cwd: test_path_buf("/tmp").abs(),
runtime_workspace_roots: Vec::new(),
instruction_sources: Vec::new(),
approval_policy: AppServerAskForApproval::OnFailure,
approvals_reviewer: AppServerApprovalsReviewer::User,
sandbox: AppServerSandboxPolicy::DangerFullAccess,
active_permission_profile: None,
reasoning_effort: None,
})
}
fn sample_turn_start_response() -> ClientResponsePayload {
ClientResponsePayload::TurnStart(TurnStartResponse {
turn: Turn {
id: "turn-1".to_string(),
items_view: codex_app_server_protocol::TurnItemsView::Full,
items: Vec::new(),
status: AppServerTurnStatus::InProgress,
error: None,
started_at: None,
completed_at: None,
duration_ms: None,
},
})
}
fn sample_turn_steer_response() -> ClientResponsePayload {
ClientResponsePayload::TurnSteer(TurnSteerResponse {
turn_id: "turn-2".to_string(),
})
}
#[test]
fn track_request_only_enqueues_analytics_relevant_requests() {
let (client, mut receiver) = client_with_receiver();
for (request_id, request) in [
(RequestId::Integer(1), sample_turn_start_request()),
(RequestId::Integer(2), sample_turn_steer_request()),
] {
client.track_request(/*connection_id*/ 7, request_id, &request);
assert!(matches!(
receiver.try_recv(),
Ok(AnalyticsFact::ClientRequest { .. })
));
}
let ignored_request = sample_thread_archive_request();
client.track_request(
/*connection_id*/ 7,
RequestId::Integer(3),
&ignored_request,
);
assert!(matches!(receiver.try_recv(), Err(TryRecvError::Empty)));
}
#[test]
fn track_response_only_enqueues_analytics_relevant_responses() {
let (client, mut receiver) = client_with_receiver();
for (request_id, response) in [
(RequestId::Integer(1), sample_thread_start_response()),
(RequestId::Integer(2), sample_thread_resume_response()),
(RequestId::Integer(3), sample_thread_fork_response()),
(RequestId::Integer(4), sample_turn_start_response()),
(RequestId::Integer(5), sample_turn_steer_response()),
] {
client.track_response(/*connection_id*/ 7, request_id, response);
assert!(matches!(
receiver.try_recv(),
Ok(AnalyticsFact::ClientResponse { .. })
));
}
client.track_response(
/*connection_id*/ 7,
RequestId::Integer(6),
ClientResponsePayload::ThreadArchive(ThreadArchiveResponse {}),
);
assert!(matches!(receiver.try_recv(), Err(TryRecvError::Empty)));
}
#[test]
fn track_event_request_batches_only_isolates_accepted_line_fingerprint_events() {
let batches = track_event_request_batches(vec![
sample_regular_track_event("thread-1"),
sample_regular_track_event("thread-2"),
sample_accepted_line_fingerprint_event("thread-3"),
sample_accepted_line_fingerprint_event("thread-4"),
sample_regular_track_event("thread-5"),
sample_regular_track_event("thread-6"),
]);
assert_eq!(batches.len(), 4);
assert_eq!(batches[0].len(), 2);
assert_eq!(batches[1].len(), 1);
assert_eq!(batches[2].len(), 1);
assert_eq!(batches[3].len(), 2);
assert!(batches[1][0].should_send_in_isolated_request());
assert!(batches[2][0].should_send_in_isolated_request());
}

View File

@@ -1,5 +1,6 @@
use std::time::Instant;
use crate::facts::AcceptedLineFingerprint;
use crate::facts::AppInvocation;
use crate::facts::CodexCompactionEvent;
use crate::facts::CompactionImplementation;
@@ -18,8 +19,9 @@ use crate::facts::TurnStatus;
use crate::facts::TurnSteerRejectionReason;
use crate::facts::TurnSteerResult;
use crate::facts::TurnSubmissionType;
use crate::now_unix_seconds;
use crate::now_unix_millis;
use codex_app_server_protocol::CodexErrorInfo;
use codex_app_server_protocol::CommandExecutionSource;
use codex_login::default_client::originator;
use codex_plugin::PluginTelemetryMetadata;
use codex_protocol::approvals::NetworkApprovalProtocol;
@@ -33,6 +35,7 @@ use codex_protocol::protocol::HookEventName;
use codex_protocol::protocol::HookRunStatus;
use codex_protocol::protocol::HookSource;
use codex_protocol::protocol::SubAgentSource;
use codex_protocol::protocol::ThreadSource;
use codex_protocol::protocol::TokenUsage;
use serde::Serialize;
@@ -61,6 +64,16 @@ pub(crate) enum TrackEventRequest {
Compaction(Box<CodexCompactionEventRequest>),
TurnEvent(Box<CodexTurnEventRequest>),
TurnSteer(CodexTurnSteerEventRequest),
CommandExecution(CodexCommandExecutionEventRequest),
FileChange(CodexFileChangeEventRequest),
McpToolCall(CodexMcpToolCallEventRequest),
DynamicToolCall(CodexDynamicToolCallEventRequest),
CollabAgentToolCall(CodexCollabAgentToolCallEventRequest),
WebSearch(CodexWebSearchEventRequest),
ImageGeneration(CodexImageGenerationEventRequest),
AcceptedLineFingerprints(Box<CodexAcceptedLineFingerprintsEventRequest>),
#[allow(dead_code)]
ReviewEvent(CodexReviewEventRequest),
PluginUsed(CodexPluginUsedEventRequest),
PluginInstalled(CodexPluginEventRequest),
PluginUninstalled(CodexPluginEventRequest),
@@ -68,6 +81,32 @@ pub(crate) enum TrackEventRequest {
PluginDisabled(CodexPluginEventRequest),
}
impl TrackEventRequest {
pub(crate) fn should_send_in_isolated_request(&self) -> bool {
matches!(self, Self::AcceptedLineFingerprints(_))
}
}
#[derive(Serialize)]
pub(crate) struct CodexAcceptedLineFingerprintsEventParams {
pub(crate) event_type: &'static str,
pub(crate) turn_id: String,
pub(crate) thread_id: String,
pub(crate) product_surface: Option<String>,
pub(crate) model_slug: Option<String>,
pub(crate) completed_at: u64,
pub(crate) repo_hash: Option<String>,
pub(crate) accepted_added_lines: u64,
pub(crate) accepted_deleted_lines: u64,
pub(crate) line_fingerprints: Vec<AcceptedLineFingerprint>,
}
#[derive(Serialize)]
pub(crate) struct CodexAcceptedLineFingerprintsEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexAcceptedLineFingerprintsEventParams,
}
#[derive(Serialize)]
pub(crate) struct SkillInvocationEventRequest {
pub(crate) event_type: &'static str,
@@ -80,8 +119,10 @@ pub(crate) struct SkillInvocationEventRequest {
pub(crate) struct SkillInvocationEventParams {
pub(crate) product_client_id: Option<String>,
pub(crate) skill_scope: Option<String>,
pub(crate) plugin_id: Option<String>,
pub(crate) repo_url: Option<String>,
pub(crate) thread_id: Option<String>,
pub(crate) turn_id: Option<String>,
pub(crate) invoke_type: Option<InvocationType>,
pub(crate) model_slug: Option<String>,
}
@@ -110,7 +151,7 @@ pub(crate) struct ThreadInitializedEventParams {
pub(crate) runtime: CodexRuntimeMetadata,
pub(crate) model: String,
pub(crate) ephemeral: bool,
pub(crate) thread_source: Option<&'static str>,
pub(crate) thread_source: Option<ThreadSource>,
pub(crate) initialization_mode: ThreadInitializationMode,
pub(crate) subagent_source: Option<String>,
pub(crate) parent_thread_id: Option<String>,
@@ -248,7 +289,7 @@ pub struct GuardianReviewTrackContext {
approval_request_source: GuardianApprovalRequestSource,
reviewed_action: GuardianReviewedAction,
review_timeout_ms: u64,
started_at: u64,
pub started_at_ms: u64,
started_instant: Instant,
}
@@ -270,7 +311,7 @@ impl GuardianReviewTrackContext {
approval_request_source,
reviewed_action,
review_timeout_ms,
started_at: now_unix_seconds(),
started_at_ms: now_unix_millis(),
started_instant: Instant::now(),
}
}
@@ -278,6 +319,7 @@ impl GuardianReviewTrackContext {
pub(crate) fn event_params(
&self,
result: GuardianReviewAnalyticsResult,
completed_at_ms: u64,
) -> GuardianReviewEventParams {
GuardianReviewEventParams {
thread_id: self.thread_id.clone(),
@@ -303,8 +345,8 @@ impl GuardianReviewTrackContext {
tool_call_count: None,
time_to_first_token_ms: result.time_to_first_token_ms,
completion_latency_ms: Some(self.started_instant.elapsed().as_millis() as u64),
started_at: self.started_at,
completed_at: Some(now_unix_seconds()),
started_at: self.started_at_ms / 1_000,
completed_at: Some(completed_at_ms / 1_000),
input_tokens: result.token_usage.as_ref().map(|usage| usage.input_tokens),
cached_input_tokens: result
.token_usage
@@ -384,6 +426,276 @@ pub(crate) struct GuardianReviewEventPayload {
pub(crate) guardian_review: GuardianReviewEventParams,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum FinalApprovalOutcome {
Unknown,
NotNeeded,
ConfigAllowed,
PolicyForbidden,
GuardianApproved,
GuardianDenied,
GuardianAborted,
UserApproved,
UserApprovedForSession,
UserDenied,
UserAborted,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum ToolItemTerminalStatus {
Completed,
Failed,
Rejected,
Interrupted,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum ToolItemFailureKind {
ToolError,
ApprovalDenied,
ApprovalAborted,
SandboxDenied,
PolicyForbidden,
}
#[derive(Serialize)]
pub(crate) struct CodexToolItemEventBase {
pub(crate) thread_id: String,
pub(crate) turn_id: String,
/// App-server ThreadItem.id. For tool-originated items this generally
/// corresponds to the originating core call_id.
pub(crate) item_id: String,
pub(crate) app_server_client: CodexAppServerClientMetadata,
pub(crate) runtime: CodexRuntimeMetadata,
pub(crate) thread_source: Option<ThreadSource>,
pub(crate) subagent_source: Option<String>,
pub(crate) parent_thread_id: Option<String>,
pub(crate) tool_name: String,
pub(crate) started_at_ms: u64,
pub(crate) completed_at_ms: u64,
// Observed item lifecycle duration. This may undercount end-to-end execution
// for tools where app-server only sees part of the upstream flow.
pub(crate) duration_ms: Option<u64>,
pub(crate) execution_duration_ms: Option<u64>,
pub(crate) review_count: u64,
pub(crate) guardian_review_count: u64,
pub(crate) user_review_count: u64,
pub(crate) final_approval_outcome: FinalApprovalOutcome,
pub(crate) terminal_status: ToolItemTerminalStatus,
pub(crate) failure_kind: Option<ToolItemFailureKind>,
pub(crate) requested_additional_permissions: bool,
pub(crate) requested_network_access: bool,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum ReviewSubjectKind {
CommandExecution,
FileChange,
McpToolCall,
Permissions,
NetworkAccess,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum Reviewer {
Guardian,
User,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum ReviewTrigger {
Initial,
SandboxDenial,
NetworkPolicyDenial,
ExecveIntercept,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum ReviewStatus {
Approved,
Denied,
Aborted,
TimedOut,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum ReviewResolution {
None,
SessionApproval,
ExecPolicyAmendment,
NetworkPolicyAmendment,
}
#[derive(Serialize)]
pub(crate) struct CodexReviewEventParams {
pub(crate) thread_id: String,
pub(crate) turn_id: String,
pub(crate) item_id: Option<String>,
pub(crate) review_id: String,
pub(crate) app_server_client: CodexAppServerClientMetadata,
pub(crate) runtime: CodexRuntimeMetadata,
pub(crate) thread_source: Option<ThreadSource>,
pub(crate) subagent_source: Option<String>,
pub(crate) parent_thread_id: Option<String>,
pub(crate) subject_kind: ReviewSubjectKind,
pub(crate) subject_name: String,
pub(crate) reviewer: Reviewer,
pub(crate) trigger: ReviewTrigger,
pub(crate) status: ReviewStatus,
pub(crate) resolution: ReviewResolution,
pub(crate) started_at_ms: u64,
pub(crate) completed_at_ms: u64,
pub(crate) duration_ms: Option<u64>,
}
#[derive(Serialize)]
pub(crate) struct CodexReviewEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexReviewEventParams,
}
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, Serialize)]
#[serde(rename_all = "snake_case")]
pub(crate) enum WebSearchActionKind {
Search,
OpenPage,
FindInPage,
Other,
}
#[derive(Serialize)]
pub(crate) struct CodexCommandExecutionEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) command_execution_source: CommandExecutionSource,
pub(crate) exit_code: Option<i32>,
pub(crate) command_total_action_count: u64,
pub(crate) command_read_action_count: u64,
pub(crate) command_list_files_action_count: u64,
pub(crate) command_search_action_count: u64,
pub(crate) command_unknown_action_count: u64,
}
#[derive(Serialize)]
pub(crate) struct CodexCommandExecutionEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexCommandExecutionEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexFileChangeEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) file_change_count: u64,
pub(crate) file_add_count: u64,
pub(crate) file_update_count: u64,
pub(crate) file_delete_count: u64,
pub(crate) file_move_count: u64,
}
#[derive(Serialize)]
pub(crate) struct CodexFileChangeEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexFileChangeEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexMcpToolCallEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) mcp_server_name: String,
pub(crate) mcp_tool_name: String,
pub(crate) mcp_error_present: bool,
}
#[derive(Serialize)]
pub(crate) struct CodexMcpToolCallEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexMcpToolCallEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexDynamicToolCallEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) dynamic_tool_name: String,
pub(crate) success: Option<bool>,
pub(crate) output_content_item_count: Option<u64>,
pub(crate) output_text_item_count: Option<u64>,
pub(crate) output_image_item_count: Option<u64>,
}
#[derive(Serialize)]
pub(crate) struct CodexDynamicToolCallEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexDynamicToolCallEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexCollabAgentToolCallEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) sender_thread_id: String,
pub(crate) receiver_thread_count: u64,
pub(crate) receiver_thread_ids: Option<Vec<String>>,
pub(crate) requested_model: Option<String>,
pub(crate) requested_reasoning_effort: Option<String>,
pub(crate) agent_state_count: Option<u64>,
pub(crate) completed_agent_count: Option<u64>,
pub(crate) failed_agent_count: Option<u64>,
}
#[derive(Serialize)]
pub(crate) struct CodexCollabAgentToolCallEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexCollabAgentToolCallEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexWebSearchEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) web_search_action: Option<WebSearchActionKind>,
pub(crate) query_present: bool,
pub(crate) query_count: Option<u64>,
}
#[derive(Serialize)]
pub(crate) struct CodexWebSearchEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexWebSearchEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexImageGenerationEventParams {
#[serde(flatten)]
pub(crate) base: CodexToolItemEventBase,
pub(crate) revised_prompt_present: bool,
pub(crate) saved_path_present: bool,
}
#[derive(Serialize)]
pub(crate) struct CodexImageGenerationEventRequest {
pub(crate) event_type: &'static str,
pub(crate) event_params: CodexImageGenerationEventParams,
}
#[derive(Serialize)]
pub(crate) struct CodexAppMetadata {
pub(crate) connector_id: Option<String>,
@@ -429,7 +741,7 @@ pub(crate) struct CodexCompactionEventParams {
pub(crate) turn_id: String,
pub(crate) app_server_client: CodexAppServerClientMetadata,
pub(crate) runtime: CodexRuntimeMetadata,
pub(crate) thread_source: Option<&'static str>,
pub(crate) thread_source: Option<ThreadSource>,
pub(crate) subagent_source: Option<String>,
pub(crate) parent_thread_id: Option<String>,
pub(crate) trigger: CompactionTrigger,
@@ -462,7 +774,7 @@ pub(crate) struct CodexTurnEventParams {
pub(crate) app_server_client: CodexAppServerClientMetadata,
pub(crate) runtime: CodexRuntimeMetadata,
pub(crate) ephemeral: bool,
pub(crate) thread_source: Option<String>,
pub(crate) thread_source: Option<ThreadSource>,
pub(crate) initialization_mode: ThreadInitializationMode,
pub(crate) subagent_source: Option<String>,
pub(crate) parent_thread_id: Option<String>,
@@ -482,8 +794,6 @@ pub(crate) struct CodexTurnEventParams {
pub(crate) status: Option<TurnStatus>,
pub(crate) turn_error: Option<CodexErrorInfo>,
pub(crate) steer_count: Option<usize>,
// TODO(rhan-oai): Populate these once tool-call accounting is emitted from
// core; the schema is reserved but these fields are currently always None.
pub(crate) total_tool_call_count: Option<usize>,
pub(crate) shell_command_count: Option<usize>,
pub(crate) file_change_count: Option<usize>,
@@ -515,7 +825,7 @@ pub(crate) struct CodexTurnSteerEventParams {
pub(crate) accepted_turn_id: Option<String>,
pub(crate) app_server_client: CodexAppServerClientMetadata,
pub(crate) runtime: CodexRuntimeMetadata,
pub(crate) thread_source: Option<String>,
pub(crate) thread_source: Option<ThreadSource>,
pub(crate) subagent_source: Option<String>,
pub(crate) parent_thread_id: Option<String>,
pub(crate) num_input_images: usize,
@@ -587,11 +897,16 @@ pub(crate) fn codex_app_metadata(
}
pub(crate) fn codex_plugin_metadata(plugin: PluginTelemetryMetadata) -> CodexPluginMetadata {
let capability_summary = plugin.capability_summary;
let PluginTelemetryMetadata {
plugin_id,
remote_plugin_id,
capability_summary,
} = plugin;
let event_plugin_id = remote_plugin_id.unwrap_or_else(|| plugin_id.as_key());
CodexPluginMetadata {
plugin_id: Some(plugin.plugin_id.as_key()),
plugin_name: Some(plugin.plugin_id.plugin_name),
marketplace_name: Some(plugin.plugin_id.marketplace_name),
plugin_id: Some(event_plugin_id),
plugin_name: Some(plugin_id.plugin_name),
marketplace_name: Some(plugin_id.marketplace_name),
has_skills: capability_summary
.as_ref()
.map(|summary| summary.has_skills),
@@ -613,7 +928,7 @@ pub(crate) fn codex_compaction_event_params(
input: CodexCompactionEvent,
app_server_client: CodexAppServerClientMetadata,
runtime: CodexRuntimeMetadata,
thread_source: Option<&'static str>,
thread_source: Option<ThreadSource>,
subagent_source: Option<String>,
parent_thread_id: Option<String>,
) -> CodexCompactionEventParams {
@@ -671,6 +986,8 @@ fn analytics_hook_event_name(event_name: HookEventName) -> &'static str {
HookEventName::PreToolUse => "PreToolUse",
HookEventName::PermissionRequest => "PermissionRequest",
HookEventName::PostToolUse => "PostToolUse",
HookEventName::PreCompact => "PreCompact",
HookEventName::PostCompact => "PostCompact",
HookEventName::SessionStart => "SessionStart",
HookEventName::UserPromptSubmit => "UserPromptSubmit",
HookEventName::Stop => "Stop",
@@ -685,6 +1002,7 @@ fn analytics_hook_source(source: HookSource) -> &'static str {
HookSource::Mdm => "mdm",
HookSource::SessionFlags => "session_flags",
HookSource::Plugin => "plugin",
HookSource::CloudRequirements => "cloud_requirements",
HookSource::LegacyManagedConfigFile => "legacy_managed_config_file",
HookSource::LegacyManagedConfigMdm => "legacy_managed_config_mdm",
HookSource::Unknown => "unknown",
@@ -716,7 +1034,7 @@ pub(crate) fn subagent_thread_started_event_request(
runtime: current_runtime_metadata(),
model: input.model,
ephemeral: input.ephemeral,
thread_source: Some("subagent"),
thread_source: Some(ThreadSource::Subagent),
initialization_mode: ThreadInitializationMode::New,
subagent_source: Some(subagent_source_name(&input.subagent_source)),
parent_thread_id: input

View File

@@ -2,11 +2,13 @@ use crate::events::AppServerRpcTransport;
use crate::events::CodexRuntimeMetadata;
use crate::events::GuardianReviewEventParams;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::ClientResponse;
use codex_app_server_protocol::ClientResponsePayload;
use codex_app_server_protocol::InitializeParams;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequest;
use codex_app_server_protocol::ServerResponse;
use codex_plugin::PluginTelemetryMetadata;
use codex_protocol::config_types::ApprovalsReviewer;
use codex_protocol::config_types::ModeKind;
@@ -23,9 +25,16 @@ use codex_protocol::protocol::SessionSource;
use codex_protocol::protocol::SkillScope;
use codex_protocol::protocol::SubAgentSource;
use codex_protocol::protocol::TokenUsage;
use codex_protocol::request_permissions::RequestPermissionsResponse;
use serde::Serialize;
use std::path::PathBuf;
#[derive(Clone, Debug, PartialEq, Eq, Serialize)]
pub struct AcceptedLineFingerprint {
pub path_hash: String,
pub line_hash: String,
}
#[derive(Clone)]
pub struct TrackEventsContext {
pub model_slug: String,
@@ -171,6 +180,7 @@ pub struct SkillInvocation {
pub skill_name: String,
pub skill_scope: SkillScope,
pub skill_path: PathBuf,
pub plugin_id: Option<String>,
pub invocation_type: InvocationType,
}
@@ -272,14 +282,15 @@ pub(crate) enum AnalyticsFact {
runtime: CodexRuntimeMetadata,
rpc_transport: AppServerRpcTransport,
},
Request {
ClientRequest {
connection_id: u64,
request_id: RequestId,
request: Box<ClientRequest>,
},
Response {
ClientResponse {
connection_id: u64,
response: Box<ClientResponse>,
request_id: RequestId,
response: Box<ClientResponsePayload>,
},
ErrorResponse {
connection_id: u64,
@@ -287,6 +298,23 @@ pub(crate) enum AnalyticsFact {
error: JSONRPCErrorError,
error_type: Option<AnalyticsJsonRpcError>,
},
ServerRequest {
connection_id: u64,
request: Box<ServerRequest>,
},
ServerResponse {
completed_at_ms: u64,
response: Box<ServerResponse>,
},
EffectivePermissionsApprovalResponse {
completed_at_ms: u64,
request_id: RequestId,
response: Box<RequestPermissionsResponse>,
},
ServerRequestAborted {
completed_at_ms: u64,
request_id: RequestId,
},
Notification(Box<ServerNotification>),
// Facts that do not naturally exist on the app-server protocol surface, or
// would require non-trivial protocol reshaping on this branch.

View File

@@ -1,3 +1,4 @@
mod accepted_lines;
mod client;
mod events;
mod facts;
@@ -6,6 +7,8 @@ mod reducer;
use std::time::SystemTime;
use std::time::UNIX_EPOCH;
pub use accepted_lines::accepted_line_fingerprints_from_unified_diff;
pub use accepted_lines::fingerprint_hash;
pub use client::AnalyticsEventsClient;
pub use events::AppServerRpcTransport;
pub use events::GuardianApprovalRequestSource;
@@ -17,6 +20,7 @@ pub use events::GuardianReviewSessionKind;
pub use events::GuardianReviewTerminalStatus;
pub use events::GuardianReviewTrackContext;
pub use events::GuardianReviewedAction;
pub use facts::AcceptedLineFingerprint;
pub use facts::AnalyticsJsonRpcError;
pub use facts::AppInvocation;
pub use facts::CodexCompactionEvent;
@@ -51,3 +55,27 @@ pub fn now_unix_seconds() -> u64 {
.unwrap_or_default()
.as_secs()
}
pub fn now_unix_millis() -> u64 {
u64::try_from(
SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap_or_default()
.as_millis(),
)
.unwrap_or(u64::MAX)
}
pub(crate) fn serialize_enum_as_string<T: serde::Serialize>(value: &T) -> Option<String> {
serde_json::to_value(value)
.ok()
.and_then(|value| value.as_str().map(str::to_string))
}
pub(crate) fn usize_to_u64(value: usize) -> u64 {
u64::try_from(value).unwrap_or(u64::MAX)
}
pub(crate) fn option_i64_to_u64(value: Option<i64>) -> Option<u64> {
value.and_then(|value| u64::try_from(value).ok())
}

File diff suppressed because it is too large Load Diff

View File

@@ -7,6 +7,8 @@ license.workspace = true
[lib]
name = "codex_ansi_escape"
path = "src/lib.rs"
test = false
doctest = false
[lints]
workspace = true

View File

@@ -7,6 +7,7 @@ license.workspace = true
[lib]
name = "codex_app_server_client"
path = "src/lib.rs"
doctest = false
[lints]
workspace = true
@@ -20,6 +21,8 @@ codex-core = { workspace = true }
codex-exec-server = { workspace = true }
codex-feedback = { workspace = true }
codex-protocol = { workspace = true }
codex-uds = { workspace = true }
codex-utils-absolute-path = { workspace = true }
codex-utils-rustls-provider = { workspace = true }
futures = { workspace = true }
serde = { workspace = true }
@@ -33,4 +36,5 @@ url = { workspace = true }
[dev-dependencies]
pretty_assertions = { workspace = true }
serde_json = { workspace = true }
tempfile = { workspace = true }
tokio = { workspace = true, features = ["macros", "rt-multi-thread"] }

View File

@@ -25,10 +25,12 @@ use std::io::Result as IoResult;
use std::sync::Arc;
use std::time::Duration;
pub use codex_app_server::app_server_control_socket_path;
pub use codex_app_server::in_process::DEFAULT_IN_PROCESS_CHANNEL_CAPACITY;
pub use codex_app_server::in_process::InProcessServerEvent;
use codex_app_server::in_process::InProcessStartArgs;
use codex_app_server::in_process::LogDbLayer;
pub use codex_app_server::in_process::StateDbHandle;
use codex_app_server_protocol::ClientInfo;
use codex_app_server_protocol::ClientNotification;
use codex_app_server_protocol::ClientRequest;
@@ -48,7 +50,6 @@ use codex_config::RemoteThreadConfigLoader;
use codex_config::ThreadConfigLoader;
use codex_core::config::Config;
pub use codex_exec_server::EnvironmentManager;
pub use codex_exec_server::EnvironmentManagerArgs;
pub use codex_exec_server::ExecServerRuntimePaths;
use codex_feedback::CodexFeedback;
use codex_protocol::protocol::SessionSource;
@@ -61,6 +62,7 @@ use tracing::warn;
pub use crate::remote::RemoteAppServerClient;
pub use crate::remote::RemoteAppServerConnectArgs;
pub use crate::remote::RemoteAppServerEndpoint;
/// Transitional access to core-only embedded app-server types.
///
@@ -71,12 +73,9 @@ pub mod legacy_core {
pub use codex_core::DEFAULT_AGENTS_MD_FILENAME;
pub use codex_core::LOCAL_AGENTS_MD_FILENAME;
pub use codex_core::McpManager;
pub use codex_core::append_message_history_entry;
pub use codex_core::check_execpolicy_for_warnings;
pub use codex_core::format_exec_policy_error_with_source;
pub use codex_core::grant_read_root_non_elevated;
pub use codex_core::lookup_message_history_entry;
pub use codex_core::message_history_metadata;
pub use codex_core::web_search_detail;
pub mod config {
@@ -99,10 +98,6 @@ pub mod legacy_core {
pub use codex_core::personality_migration::*;
}
pub mod plugins {
pub use codex_core::plugins::PluginsManager;
}
pub mod review_format {
pub use codex_core::review_format::*;
}
@@ -304,7 +299,15 @@ impl fmt::Display for TypedRequestError {
write!(f, "{method} transport error: {source}")
}
Self::Server { method, source } => {
write!(f, "{method} failed: {}", source.message)
write!(
f,
"{method} failed: {} (code {})",
source.message, source.code
)?;
if let Some(data) = source.data.as_ref() {
write!(f, ", data: {data}")?;
}
Ok(())
}
Self::Deserialize { method, source } => {
write!(f, "{method} response decode error: {source}")
@@ -333,12 +336,16 @@ pub struct InProcessClientStartArgs {
pub cli_overrides: Vec<(String, TomlValue)>,
/// Loader override knobs used by config API paths.
pub loader_overrides: LoaderOverrides,
/// Whether config API paths should reject unknown config fields.
pub strict_config: bool,
/// Preloaded cloud requirements provider.
pub cloud_requirements: CloudRequirementsLoader,
/// Feedback sink used by app-server/core telemetry and logs.
pub feedback: CodexFeedback,
/// SQLite tracing layer used to flush recently emitted logs before feedback upload.
pub log_db: Option<LogDbLayer>,
/// Process-wide SQLite state handle shared with the embedded app-server.
pub state_db: Option<StateDbHandle>,
/// Environment manager used by core execution and filesystem operations.
pub environment_manager: Arc<EnvironmentManager>,
/// Startup warnings emitted after initialize succeeds.
@@ -371,6 +378,7 @@ impl InProcessClientStartArgs {
pub fn initialize_params(&self) -> InitializeParams {
let capabilities = InitializeCapabilities {
experimental_api: self.experimental_api,
request_attestation: false,
opt_out_notification_methods: if self.opt_out_notification_methods.is_empty() {
None
} else {
@@ -396,10 +404,12 @@ impl InProcessClientStartArgs {
config: self.config,
cli_overrides: self.cli_overrides,
loader_overrides: self.loader_overrides,
strict_config: self.strict_config,
cloud_requirements: self.cloud_requirements,
thread_config_loader,
feedback: self.feedback,
log_db: self.log_db,
state_db: self.state_db,
environment_manager: self.environment_manager,
config_warnings: self.config_warnings,
session_source: self.session_source,
@@ -946,12 +956,19 @@ mod tests {
use codex_app_server_protocol::ToolRequestUserInputParams;
use codex_app_server_protocol::ToolRequestUserInputQuestion;
use codex_core::config::ConfigBuilder;
use codex_core::init_state_db;
use codex_uds::UnixListener;
use codex_utils_absolute_path::AbsolutePathBuf;
use futures::SinkExt;
use futures::StreamExt;
use pretty_assertions::assert_eq;
use std::ops::Deref;
use std::path::Path;
use tempfile::TempDir;
use tokio::net::TcpListener;
use tokio::time::Duration;
use tokio::time::timeout;
use tokio_tungstenite::accept_async;
use tokio_tungstenite::accept_hdr_async;
use tokio_tungstenite::tungstenite::Message;
use tokio_tungstenite::tungstenite::handshake::server::Request as WebSocketRequest;
@@ -967,18 +984,60 @@ mod tests {
}
}
async fn build_test_config_for_codex_home(codex_home: &Path) -> Config {
match ConfigBuilder::default()
.codex_home(codex_home.to_path_buf())
.build()
.await
{
Ok(config) => config,
Err(_) => Config::load_default_with_cli_overrides_for_codex_home(
codex_home.to_path_buf(),
Vec::new(),
)
.await
.expect("default config should load"),
}
}
struct TestClient {
_codex_home: TempDir,
client: InProcessAppServerClient,
}
impl Deref for TestClient {
type Target = InProcessAppServerClient;
fn deref(&self) -> &Self::Target {
&self.client
}
}
impl TestClient {
async fn shutdown(self) -> IoResult<()> {
self.client.shutdown().await
}
}
async fn start_test_client_with_capacity(
session_source: SessionSource,
channel_capacity: usize,
) -> InProcessAppServerClient {
InProcessAppServerClient::start(InProcessClientStartArgs {
) -> TestClient {
let codex_home = TempDir::new().expect("temp dir");
let config = Arc::new(build_test_config_for_codex_home(codex_home.path()).await);
let state_db = init_state_db(config.as_ref())
.await
.expect("state db should initialize for in-process test");
let client = InProcessAppServerClient::start(InProcessClientStartArgs {
arg0_paths: Arg0DispatchPaths::default(),
config: Arc::new(build_test_config().await),
config,
cli_overrides: Vec::new(),
loader_overrides: LoaderOverrides::default(),
strict_config: false,
cloud_requirements: CloudRequirementsLoader::default(),
feedback: CodexFeedback::new(),
log_db: None,
state_db: Some(state_db),
environment_manager: Arc::new(EnvironmentManager::default_for_tests()),
config_warnings: Vec::new(),
session_source,
@@ -990,10 +1049,15 @@ mod tests {
channel_capacity,
})
.await
.expect("in-process app-server client should start")
.expect("in-process app-server client should start");
TestClient {
_codex_home: codex_home,
client,
}
}
async fn start_test_client(session_source: SessionSource) -> InProcessAppServerClient {
async fn start_test_client(session_source: SessionSource) -> TestClient {
start_test_client_with_capacity(session_source, DEFAULT_IN_PROCESS_CHANNEL_CAPACITY).await
}
@@ -1045,9 +1109,10 @@ mod tests {
format!("ws://{addr}")
}
async fn expect_remote_initialize(
websocket: &mut tokio_tungstenite::WebSocketStream<tokio::net::TcpStream>,
) {
async fn expect_remote_initialize<S>(websocket: &mut tokio_tungstenite::WebSocketStream<S>)
where
S: tokio::io::AsyncRead + tokio::io::AsyncWrite + Unpin,
{
let JSONRPCMessage::Request(request) = read_websocket_message(websocket).await else {
panic!("expected initialize request");
};
@@ -1068,9 +1133,12 @@ mod tests {
assert_eq!(notification.method, "initialized");
}
async fn read_websocket_message(
websocket: &mut tokio_tungstenite::WebSocketStream<tokio::net::TcpStream>,
) -> JSONRPCMessage {
async fn read_websocket_message<S>(
websocket: &mut tokio_tungstenite::WebSocketStream<S>,
) -> JSONRPCMessage
where
S: tokio::io::AsyncRead + tokio::io::AsyncWrite + Unpin,
{
loop {
let frame = websocket
.next()
@@ -1090,10 +1158,12 @@ mod tests {
}
}
async fn write_websocket_message(
websocket: &mut tokio_tungstenite::WebSocketStream<tokio::net::TcpStream>,
async fn write_websocket_message<S>(
websocket: &mut tokio_tungstenite::WebSocketStream<S>,
message: JSONRPCMessage,
) {
) where
S: tokio::io::AsyncRead + tokio::io::AsyncWrite + Unpin,
{
websocket
.send(Message::Text(
serde_json::to_string(&message)
@@ -1130,6 +1200,7 @@ mod tests {
ServerNotification::ItemCompleted(codex_app_server_protocol::ItemCompletedNotification {
thread_id: "thread".to_string(),
turn_id: "turn".to_string(),
completed_at_ms: 0,
item: codex_app_server_protocol::ThreadItem::AgentMessage {
id: "item".to_string(),
text: text.to_string(),
@@ -1144,6 +1215,7 @@ mod tests {
thread_id: "thread".to_string(),
turn: codex_app_server_protocol::Turn {
id: "turn".to_string(),
items_view: codex_app_server_protocol::TurnItemsView::Full,
items: Vec::new(),
status: codex_app_server_protocol::TurnStatus::Completed,
error: None,
@@ -1156,8 +1228,10 @@ mod tests {
fn test_remote_connect_args(websocket_url: String) -> RemoteAppServerConnectArgs {
RemoteAppServerConnectArgs {
websocket_url,
auth_token: None,
endpoint: RemoteAppServerEndpoint::WebSocket {
websocket_url,
auth_token: None,
},
client_name: "codex-app-server-client-test".to_string(),
client_version: "0.0.0-test".to_string(),
experimental_api: true,
@@ -1396,6 +1470,64 @@ mod tests {
client.shutdown().await.expect("shutdown should complete");
}
#[tokio::test]
async fn remote_unix_socket_typed_request_roundtrip_works() {
let socket_dir = TempDir::new().expect("socket dir");
let socket_path = AbsolutePathBuf::from_absolute_path(socket_dir.path().join("codex.sock"))
.expect("socket path should resolve");
let mut listener = UnixListener::bind(socket_path.as_path())
.await
.expect("listener should bind");
tokio::spawn(async move {
let stream = listener.accept().await.expect("accept should succeed");
let mut websocket = accept_async(stream)
.await
.expect("websocket upgrade should succeed");
expect_remote_initialize(&mut websocket).await;
let JSONRPCMessage::Request(request) = read_websocket_message(&mut websocket).await
else {
panic!("expected account/read request");
};
assert_eq!(request.method, "account/read");
write_websocket_message(
&mut websocket,
JSONRPCMessage::Response(JSONRPCResponse {
id: request.id,
result: serde_json::to_value(GetAccountResponse {
account: None,
requires_openai_auth: false,
})
.expect("response should serialize"),
}),
)
.await;
websocket.close(None).await.expect("close should succeed");
});
let client = RemoteAppServerClient::connect(RemoteAppServerConnectArgs {
endpoint: RemoteAppServerEndpoint::UnixSocket { socket_path },
client_name: "codex-app-server-client-test".to_string(),
client_version: "0.0.0-test".to_string(),
experimental_api: true,
opt_out_notification_methods: Vec::new(),
channel_capacity: 8,
})
.await
.expect("remote client should connect");
let response: GetAccountResponse = client
.request_typed(ClientRequest::GetAccount {
request_id: RequestId::Integer(1),
params: codex_app_server_protocol::GetAccountParams {
refresh_token: false,
},
})
.await
.expect("typed request should succeed");
assert_eq!(response.account, None);
client.shutdown().await.expect("shutdown should complete");
}
#[tokio::test]
async fn remote_typed_request_accepts_large_single_frame_response() {
let padding = "x".repeat((17 << 20) + 1024);
@@ -1457,8 +1589,15 @@ mod tests {
)
.await;
let client = RemoteAppServerClient::connect(RemoteAppServerConnectArgs {
auth_token: Some(auth_token),
..test_remote_connect_args(websocket_url)
endpoint: RemoteAppServerEndpoint::WebSocket {
websocket_url,
auth_token: Some(auth_token),
},
client_name: "codex-app-server-client-test".to_string(),
client_version: "0.0.0-test".to_string(),
experimental_api: true,
opt_out_notification_methods: Vec::new(),
channel_capacity: 8,
})
.await
.expect("remote client should connect");
@@ -1469,9 +1608,15 @@ mod tests {
#[tokio::test]
async fn remote_connect_rejects_non_loopback_ws_when_auth_configured() {
let result = RemoteAppServerClient::connect(RemoteAppServerConnectArgs {
websocket_url: "ws://example.com:4500".to_string(),
auth_token: Some("remote-bearer-token".to_string()),
..test_remote_connect_args("ws://127.0.0.1:1".to_string())
endpoint: RemoteAppServerEndpoint::WebSocket {
websocket_url: "ws://example.com:4500".to_string(),
auth_token: Some("remote-bearer-token".to_string()),
},
client_name: "codex-app-server-client-test".to_string(),
client_version: "0.0.0-test".to_string(),
experimental_api: true,
opt_out_notification_methods: Vec::new(),
channel_capacity: 8,
})
.await;
let err = match result {
@@ -1649,13 +1794,8 @@ mod tests {
})
.await;
let mut client = RemoteAppServerClient::connect(RemoteAppServerConnectArgs {
websocket_url,
auth_token: None,
client_name: "codex-app-server-client-test".to_string(),
client_version: "0.0.0-test".to_string(),
experimental_api: true,
opt_out_notification_methods: Vec::new(),
channel_capacity: 1,
..test_remote_connect_args(websocket_url)
})
.await
.expect("remote client should connect");
@@ -1919,11 +2059,15 @@ mod tests {
method: "thread/read".to_string(),
source: JSONRPCErrorError {
code: -32603,
data: None,
data: Some(serde_json::json!({"detail": "config lock mismatch"})),
message: "internal".to_string(),
},
};
assert_eq!(std::error::Error::source(&server).is_some(), false);
assert_eq!(
server.to_string(),
"thread/read failed: internal (code -32603), data: {\"detail\":\"config lock mismatch\"}"
);
let deserialize = TypedRequestError::Deserialize {
method: "thread/start".to_string(),
@@ -1970,6 +2114,7 @@ mod tests {
thread_id: "thread".to_string(),
turn: codex_app_server_protocol::Turn {
id: "turn".to_string(),
items_view: codex_app_server_protocol::TurnItemsView::Full,
items: Vec::new(),
status: codex_app_server_protocol::TurnStatus::Completed,
error: None,
@@ -1999,6 +2144,7 @@ mod tests {
codex_app_server_protocol::ItemCompletedNotification {
thread_id: "thread".to_string(),
turn_id: "turn".to_string(),
completed_at_ms: 0,
item: codex_app_server_protocol::ThreadItem::AgentMessage {
id: "item".to_string(),
text: "hello".to_string(),
@@ -2046,9 +2192,11 @@ mod tests {
config: config.clone(),
cli_overrides: Vec::new(),
loader_overrides: LoaderOverrides::default(),
strict_config: false,
cloud_requirements: CloudRequirementsLoader::default(),
feedback: CodexFeedback::new(),
log_db: None,
state_db: None,
environment_manager: environment_manager.clone(),
config_warnings: Vec::new(),
session_source: SessionSource::Exec,
@@ -2085,9 +2233,11 @@ mod tests {
config: Arc::new(config),
cli_overrides: Vec::new(),
loader_overrides: LoaderOverrides::default(),
strict_config: false,
cloud_requirements: CloudRequirementsLoader::default(),
feedback: CodexFeedback::new(),
log_db: None,
state_db: None,
environment_manager: Arc::new(EnvironmentManager::default_for_tests()),
config_warnings: Vec::new(),
session_source: SessionSource::Exec,

View File

@@ -1,11 +1,13 @@
/*
This module implements the websocket-backed app-server client transport.
This module implements the remote app-server client transport.
It owns the remote connection lifecycle, including the initialize/initialized
handshake, JSON-RPC request/response routing, server-request resolution, and
notification streaming. The rest of the crate uses the same `AppServerEvent`
surface for both in-process and remote transports, so callers such as the TUI
can switch between them without changing their higher-level session logic.
notification streaming. Remote connections always carry WebSocket frames, over
either TCP WebSocket URLs or local Unix sockets. The rest of the crate uses the
same `AppServerEvent` surface for both in-process and remote transports, so
callers such as the TUI can switch between them without changing their
higher-level session logic.
*/
use std::collections::HashMap;
@@ -35,17 +37,23 @@ use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::Result as JsonRpcResult;
use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequest;
use codex_uds::UnixStream;
use codex_utils_absolute_path::AbsolutePathBuf;
use codex_utils_rustls_provider::ensure_rustls_crypto_provider;
use futures::SinkExt;
use futures::StreamExt;
use serde::de::DeserializeOwned;
use tokio::io::AsyncRead;
use tokio::io::AsyncWrite;
use tokio::net::TcpStream;
use tokio::sync::mpsc;
use tokio::sync::oneshot;
use tokio::time::timeout;
use tokio_tungstenite::MaybeTlsStream;
use tokio_tungstenite::WebSocketStream;
use tokio_tungstenite::client_async_with_config;
use tokio_tungstenite::connect_async_with_config;
use tokio_tungstenite::tungstenite::Error as TungsteniteError;
use tokio_tungstenite::tungstenite::Message;
use tokio_tungstenite::tungstenite::client::IntoClientRequest;
use tokio_tungstenite::tungstenite::http::HeaderValue;
@@ -57,22 +65,35 @@ use url::Url;
const CONNECT_TIMEOUT: Duration = Duration::from_secs(10);
const INITIALIZE_TIMEOUT: Duration = Duration::from_secs(10);
const REMOTE_APP_SERVER_MAX_WEBSOCKET_MESSAGE_SIZE: usize = 128 << 20;
// Tungstenite still needs an HTTP request URI for the WebSocket handshake;
// the bytes travel over the Unix socket, not TCP.
const UDS_WEBSOCKET_HANDSHAKE_URL: &str = "ws://localhost/rpc";
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RemoteAppServerEndpoint {
WebSocket {
websocket_url: String,
auth_token: Option<String>,
},
UnixSocket {
socket_path: AbsolutePathBuf,
},
}
#[derive(Debug, Clone)]
pub struct RemoteAppServerConnectArgs {
pub websocket_url: String,
pub auth_token: Option<String>,
pub endpoint: RemoteAppServerEndpoint,
pub client_name: String,
pub client_version: String,
pub experimental_api: bool,
pub opt_out_notification_methods: Vec<String>,
pub channel_capacity: usize,
}
impl RemoteAppServerConnectArgs {
fn initialize_params(&self) -> InitializeParams {
let capabilities = InitializeCapabilities {
experimental_api: self.experimental_api,
request_attestation: false,
opt_out_notification_methods: if self.opt_out_notification_methods.is_empty() {
None
} else {
@@ -140,69 +161,39 @@ pub struct RemoteAppServerRequestHandle {
impl RemoteAppServerClient {
pub async fn connect(args: RemoteAppServerConnectArgs) -> IoResult<Self> {
let channel_capacity = args.channel_capacity.max(1);
let websocket_url = args.websocket_url.clone();
let url = Url::parse(&websocket_url).map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid websocket URL `{websocket_url}`: {err}"),
)
})?;
if args.auth_token.is_some() && !websocket_url_supports_auth_token(&url) {
return Err(IoError::new(
ErrorKind::InvalidInput,
format!(
"remote auth tokens require `wss://` or loopback `ws://` URLs; got `{websocket_url}`"
),
));
let initialize_params = args.initialize_params();
match args.endpoint {
RemoteAppServerEndpoint::WebSocket {
websocket_url,
auth_token,
} => {
let (endpoint, stream) =
connect_websocket_endpoint(websocket_url, auth_token).await?;
Self::connect_with_stream(channel_capacity, endpoint, stream, initialize_params)
.await
}
RemoteAppServerEndpoint::UnixSocket { socket_path } => {
let (endpoint, stream) = connect_unix_socket_endpoint(socket_path).await?;
Self::connect_with_stream(channel_capacity, endpoint, stream, initialize_params)
.await
}
}
let mut request = url.as_str().into_client_request().map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid websocket URL `{websocket_url}`: {err}"),
)
})?;
if let Some(auth_token) = args.auth_token.as_deref() {
let header_value =
HeaderValue::from_str(&format!("Bearer {auth_token}")).map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid remote authorization header value: {err}"),
)
})?;
request.headers_mut().insert(AUTHORIZATION, header_value);
}
ensure_rustls_crypto_provider();
// Remote resume responses can legitimately carry large thread histories.
// Keep a bounded cap, but raise it above tungstenite's 16 MiB frame default.
let websocket_config = WebSocketConfig::default()
.max_frame_size(Some(REMOTE_APP_SERVER_MAX_WEBSOCKET_MESSAGE_SIZE))
.max_message_size(Some(REMOTE_APP_SERVER_MAX_WEBSOCKET_MESSAGE_SIZE));
let stream = timeout(
CONNECT_TIMEOUT,
connect_async_with_config(
request,
Some(websocket_config),
/*disable_nagle*/ false,
),
)
.await
.map_err(|_| {
IoError::new(
ErrorKind::TimedOut,
format!("timed out connecting to remote app server at `{websocket_url}`"),
)
})?
.map(|(stream, _response)| stream)
.map_err(|err| {
IoError::other(format!(
"failed to connect to remote app server at `{websocket_url}`: {err}"
))
})?;
}
async fn connect_with_stream<S>(
channel_capacity: usize,
endpoint: String,
stream: WebSocketStream<S>,
initialize_params: InitializeParams,
) -> IoResult<Self>
where
S: AsyncRead + AsyncWrite + Unpin + Send + 'static,
{
let mut stream = stream;
let pending_events = initialize_remote_connection(
&mut stream,
&websocket_url,
args.initialize_params(),
&endpoint,
initialize_params,
INITIALIZE_TIMEOUT,
)
.await?;
@@ -234,13 +225,13 @@ impl RemoteAppServerClient {
if let Err(err) = write_jsonrpc_message(
&mut stream,
JSONRPCMessage::Request(jsonrpc_request_from_client_request(*request)),
&websocket_url,
&endpoint,
)
.await
{
let err_message = err.to_string();
let message = format!(
"remote app server at `{websocket_url}` write failed: {err_message}"
"remote app server at `{endpoint}` write failed: {err_message}"
);
if let Some(response_tx) = pending_requests.remove(&request_id) {
let _ = response_tx.send(Err(err));
@@ -261,7 +252,7 @@ impl RemoteAppServerClient {
JSONRPCMessage::Notification(
jsonrpc_notification_from_client_notification(notification),
),
&websocket_url,
&endpoint,
)
.await;
let _ = response_tx.send(result);
@@ -277,7 +268,7 @@ impl RemoteAppServerClient {
id: request_id,
result,
}),
&websocket_url,
&endpoint,
)
.await;
let _ = response_tx.send(result);
@@ -293,16 +284,20 @@ impl RemoteAppServerClient {
error,
id: request_id,
}),
&websocket_url,
&endpoint,
)
.await;
let _ = response_tx.send(result);
}
RemoteClientCommand::Shutdown { response_tx } => {
let close_result = stream.close(None).await.map_err(|err| {
IoError::other(format!(
"failed to close websocket app server `{websocket_url}`: {err}"
))
let close_result = stream.close(None).await.or_else(|err| {
if websocket_close_error_is_already_closed(&err) {
Ok(())
} else {
Err(IoError::other(format!(
"failed to close websocket app server `{endpoint}`: {err}"
)))
}
});
let _ = response_tx.send(close_result);
break;
@@ -363,13 +358,13 @@ impl RemoteAppServerClient {
},
id: request_id,
}),
&websocket_url,
&endpoint,
)
.await
{
let err_message = reject_err.to_string();
let message = format!(
"remote app server at `{websocket_url}` write failed: {err_message}"
"remote app server at `{endpoint}` write failed: {err_message}"
);
let _ = deliver_event(
&event_tx,
@@ -386,7 +381,7 @@ impl RemoteAppServerClient {
}
Err(err) => {
let message = format!(
"remote app server at `{websocket_url}` sent invalid JSON-RPC: {err}"
"remote app server at `{endpoint}` sent invalid JSON-RPC: {err}"
);
let _ = deliver_event(
&event_tx,
@@ -407,7 +402,7 @@ impl RemoteAppServerClient {
.filter(|reason| !reason.is_empty())
.unwrap_or_else(|| "connection closed".to_string());
let message = format!(
"remote app server at `{websocket_url}` disconnected: {reason}"
"remote app server at `{endpoint}` disconnected: {reason}"
);
let _ = deliver_event(
&event_tx,
@@ -427,7 +422,7 @@ impl RemoteAppServerClient {
| Some(Ok(Message::Frame(_))) => {}
Some(Err(err)) => {
let message = format!(
"remote app server at `{websocket_url}` transport failed: {err}"
"remote app server at `{endpoint}` transport failed: {err}"
);
let _ = deliver_event(
&event_tx,
@@ -440,7 +435,7 @@ impl RemoteAppServerClient {
}
None => {
let message = format!(
"remote app server at `{websocket_url}` closed the connection"
"remote app server at `{endpoint}` closed the connection"
);
let _ = deliver_event(
&event_tx,
@@ -677,12 +672,131 @@ impl RemoteAppServerRequestHandle {
}
}
async fn initialize_remote_connection(
stream: &mut WebSocketStream<MaybeTlsStream<TcpStream>>,
websocket_url: &str,
async fn connect_websocket_endpoint(
websocket_url: String,
auth_token: Option<String>,
) -> IoResult<(String, WebSocketStream<MaybeTlsStream<TcpStream>>)> {
let url = Url::parse(&websocket_url).map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid websocket URL `{websocket_url}`: {err}"),
)
})?;
if auth_token.is_some() && !websocket_url_supports_auth_token(&url) {
return Err(IoError::new(
ErrorKind::InvalidInput,
format!(
"remote auth tokens require `wss://` or loopback `ws://` URLs; got `{websocket_url}`"
),
));
}
let mut request = url.as_str().into_client_request().map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid websocket URL `{websocket_url}`: {err}"),
)
})?;
if let Some(auth_token) = auth_token.as_deref() {
let header_value =
HeaderValue::from_str(&format!("Bearer {auth_token}")).map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid remote authorization header value: {err}"),
)
})?;
request.headers_mut().insert(AUTHORIZATION, header_value);
}
ensure_rustls_crypto_provider();
let websocket_config = remote_websocket_config();
let stream = timeout(
CONNECT_TIMEOUT,
connect_async_with_config(
request,
Some(websocket_config),
/*disable_nagle*/ false,
),
)
.await
.map_err(|_| {
IoError::new(
ErrorKind::TimedOut,
format!("timed out connecting to remote app server at `{websocket_url}`"),
)
})?
.map(|(stream, _response)| stream)
.map_err(|err| {
IoError::other(format!(
"failed to connect to remote app server at `{websocket_url}`: {err}"
))
})?;
Ok((websocket_url, stream))
}
async fn connect_unix_socket_endpoint(
socket_path: AbsolutePathBuf,
) -> IoResult<(String, WebSocketStream<UnixStream>)> {
let endpoint = format!("unix://{}", socket_path.display());
let request = UDS_WEBSOCKET_HANDSHAKE_URL
.into_client_request()
.map_err(|err| {
IoError::new(
ErrorKind::InvalidInput,
format!("invalid UDS websocket handshake URL: {err}"),
)
})?;
let stream = timeout(CONNECT_TIMEOUT, UnixStream::connect(socket_path.as_path()))
.await
.map_err(|_| {
IoError::new(
ErrorKind::TimedOut,
format!("timed out connecting to remote app server at `{endpoint}`"),
)
})?
.map_err(|err| {
IoError::other(format!(
"failed to connect to remote app server at `{endpoint}`: {err}"
))
})?;
let websocket_config = remote_websocket_config();
let stream = timeout(
CONNECT_TIMEOUT,
client_async_with_config(request, stream, Some(websocket_config)),
)
.await
.map_err(|_| {
IoError::new(
ErrorKind::TimedOut,
format!("timed out upgrading remote app server at `{endpoint}`"),
)
})?
.map(|(stream, _response)| stream)
.map_err(|err| {
IoError::other(format!(
"failed to upgrade remote app server at `{endpoint}`: {err}"
))
})?;
Ok((endpoint, stream))
}
fn remote_websocket_config() -> WebSocketConfig {
WebSocketConfig::default()
.max_frame_size(Some(REMOTE_APP_SERVER_MAX_WEBSOCKET_MESSAGE_SIZE))
.max_message_size(Some(REMOTE_APP_SERVER_MAX_WEBSOCKET_MESSAGE_SIZE))
}
async fn initialize_remote_connection<S>(
stream: &mut WebSocketStream<S>,
endpoint: &str,
params: InitializeParams,
initialize_timeout: Duration,
) -> IoResult<Vec<AppServerEvent>> {
) -> IoResult<Vec<AppServerEvent>>
where
S: AsyncRead + AsyncWrite + Unpin,
{
let initialize_request_id = RequestId::String("initialize".to_string());
let mut pending_events = Vec::new();
write_jsonrpc_message(
@@ -693,7 +807,7 @@ async fn initialize_remote_connection(
params,
},
)),
websocket_url,
endpoint,
)
.await?;
@@ -703,7 +817,7 @@ async fn initialize_remote_connection(
Some(Ok(Message::Text(text))) => {
let message = serde_json::from_str::<JSONRPCMessage>(&text).map_err(|err| {
IoError::other(format!(
"remote app server at `{websocket_url}` sent invalid initialize response: {err}"
"remote app server at `{endpoint}` sent invalid initialize response: {err}"
))
})?;
match message {
@@ -712,7 +826,7 @@ async fn initialize_remote_connection(
}
JSONRPCMessage::Error(error) if error.id == initialize_request_id => {
break Err(IoError::other(format!(
"remote app server at `{websocket_url}` rejected initialize: {}",
"remote app server at `{endpoint}` rejected initialize: {}",
error.error.message
)));
}
@@ -742,7 +856,7 @@ async fn initialize_remote_connection(
},
id: request_id,
}),
websocket_url,
endpoint,
)
.await?;
}
@@ -764,19 +878,19 @@ async fn initialize_remote_connection(
break Err(IoError::new(
ErrorKind::ConnectionAborted,
format!(
"remote app server at `{websocket_url}` closed during initialize: {reason}"
"remote app server at `{endpoint}` closed during initialize: {reason}"
),
));
}
Some(Err(err)) => {
break Err(IoError::other(format!(
"remote app server at `{websocket_url}` transport failed during initialize: {err}"
"remote app server at `{endpoint}` transport failed during initialize: {err}"
)));
}
None => {
break Err(IoError::new(
ErrorKind::UnexpectedEof,
format!("remote app server at `{websocket_url}` closed during initialize"),
format!("remote app server at `{endpoint}` closed during initialize"),
));
}
}
@@ -786,7 +900,7 @@ async fn initialize_remote_connection(
.map_err(|_| {
IoError::new(
ErrorKind::TimedOut,
format!("timed out waiting for initialize response from `{websocket_url}`"),
format!("timed out waiting for initialize response from `{endpoint}`"),
)
})??;
@@ -795,7 +909,7 @@ async fn initialize_remote_connection(
JSONRPCMessage::Notification(jsonrpc_notification_from_client_notification(
ClientNotification::Initialized,
)),
websocket_url,
endpoint,
)
.await?;
@@ -849,21 +963,35 @@ fn jsonrpc_notification_from_client_notification(
}
}
async fn write_jsonrpc_message(
stream: &mut WebSocketStream<MaybeTlsStream<TcpStream>>,
async fn write_jsonrpc_message<S>(
stream: &mut WebSocketStream<S>,
message: JSONRPCMessage,
websocket_url: &str,
) -> IoResult<()> {
endpoint: &str,
) -> IoResult<()>
where
S: AsyncRead + AsyncWrite + Unpin,
{
let payload = serde_json::to_string(&message).map_err(IoError::other)?;
stream
.send(Message::Text(payload.into()))
.await
.map_err(|err| {
IoError::other(format!(
"failed to write websocket message to `{websocket_url}`: {err}"
"failed to write websocket message to `{endpoint}`: {err}"
))
})
}
fn websocket_close_error_is_already_closed(err: &TungsteniteError) -> bool {
match err {
TungsteniteError::ConnectionClosed | TungsteniteError::AlreadyClosed => true,
TungsteniteError::Io(err) => matches!(
err.kind(),
ErrorKind::BrokenPipe | ErrorKind::ConnectionReset | ErrorKind::NotConnected
),
_ => false,
}
}
#[cfg(test)]
mod tests {
use super::*;

View File

@@ -0,0 +1,6 @@
load("//:defs.bzl", "codex_rust_crate")
codex_rust_crate(
name = "app-server-daemon",
crate_name = "codex_app_server_daemon",
)

View File

@@ -0,0 +1,40 @@
[package]
name = "codex-app-server-daemon"
version.workspace = true
edition.workspace = true
license.workspace = true
[lib]
name = "codex_app_server_daemon"
path = "src/lib.rs"
doctest = false
[lints]
workspace = true
[dependencies]
anyhow = { workspace = true }
codex-app-server-protocol = { workspace = true }
codex-app-server-transport = { workspace = true }
codex-utils-home-dir = { workspace = true }
codex-uds = { workspace = true }
futures = { workspace = true }
libc = { workspace = true }
reqwest = { workspace = true, features = ["rustls-tls"] }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
sha2 = { workspace = true }
tokio = { workspace = true, features = [
"fs",
"io-util",
"macros",
"process",
"rt-multi-thread",
"signal",
"time",
] }
tokio-tungstenite = { workspace = true }
[dev-dependencies]
pretty_assertions = { workspace = true }
tempfile = { workspace = true }

View File

@@ -0,0 +1,113 @@
# codex-app-server-daemon
> `codex-app-server-daemon` is experimental and its lifecycle contract may
> change while the remote-management flow is still being developed.
`codex-app-server-daemon` backs the machine-readable `codex app-server`
lifecycle commands used by remote clients such as the desktop and mobile apps.
It is intended for Codex instances launched over SSH, including fresh developer
machines that should expose app-server with `remote_control` enabled.
## Platform support
The current daemon implementation is Unix-only. It uses pidfile-backed
daemonization plus Unix process and file-locking primitives, and does not yet
support Windows lifecycle management.
## Commands
```sh
codex app-server daemon start
codex app-server daemon restart
codex app-server daemon enable-remote-control
codex app-server daemon disable-remote-control
codex app-server daemon stop
codex app-server daemon version
codex app-server daemon bootstrap --remote-control
```
On success, every command writes exactly one JSON object to stdout. Consumers
should parse that JSON rather than relying on human-readable text. Lifecycle
responses report the resolved backend, socket path, local CLI version, and
running app-server version when applicable.
## Bootstrap flow
For a new remote machine:
```sh
curl -fsSL https://chatgpt.com/codex/install.sh | sh
$HOME/.codex/packages/standalone/current/codex app-server daemon bootstrap --remote-control
```
`bootstrap` requires the standalone managed install. It records the daemon
settings under `CODEX_HOME/app-server-daemon/`, starts app-server as a
pidfile-backed detached process, and launches a detached updater loop.
## Installation and update cases
The daemon assumes Codex is installed through `install.sh` and always launches
the standalone managed binary under `CODEX_HOME`.
| Situation | What starts | Does this daemon fetch new binaries? | Does a running app-server eventually move to a newer binary on its own? |
| --- | --- | --- | --- |
| `install.sh` has run, but only `start` is used | `start` uses `CODEX_HOME/packages/standalone/current/codex` | No | No. The managed path is used when starting or restarting, but no updater is installed. |
| `install.sh` has run, then `bootstrap` is used | The pidfile backend uses `CODEX_HOME/packages/standalone/current/codex` | Yes. Bootstrap launches a detached updater loop that runs `install.sh` hourly. | Yes, while that updater process is alive and app-server is already running. After a successful fetch, the updater restarts app-server with the refreshed binary and only then replaces its own process image. |
| Some other tool updates the managed binary path | The next fresh start or restart uses the updated file at that path | Only if `bootstrap` is active, because the updater still runs `install.sh` on its normal cadence. | Without `bootstrap`, no. With `bootstrap`, the next successful updater pass compares the managed binary contents after `install.sh` runs; if app-server is running and they differ from the updater's current image, it refreshes app-server first and then itself. |
### Standalone installs
For installs created by `install.sh`:
- lifecycle commands always use the standalone managed binary path
- `bootstrap` is supported
- `bootstrap` starts a detached pid-backed updater loop that fetches via
`install.sh`
- after a successful refresh, if app-server is running and the managed binary
contents changed, the updater restarts app-server with that binary first and
only then replaces its own process image
- the updater loop is not reboot-persistent; it must be started again by
rerunning `bootstrap` after a reboot
### Out-of-band updates
This daemon does not watch arbitrary executable files for replacement. If some
other tool updates the managed binary path:
- without `bootstrap`, a currently running app-server remains on the old
executable image until an explicit `restart`
- with `bootstrap`, the detached updater loop notices the changed managed
binary on its next successful scheduled pass after running `install.sh`; if
app-server is running, it refreshes app-server first and then refreshes itself
once that replacement starts successfully
## Lifecycle semantics
`start` is idempotent and returns after app-server is ready to answer the normal
JSON-RPC initialize handshake on the Unix control socket.
`restart` stops any managed daemon and starts it again.
`enable-remote-control` and `disable-remote-control` persist the launch setting
for future starts. If a managed app-server is already running, they restart it
so the new setting takes effect immediately.
Top-level `codex remote-control` bootstraps with `--remote-control` when the
updater loop is not running. Otherwise it enables remote control and starts the
daemon normally.
`stop` sends a graceful termination request first, then sends a second
termination signal after the grace window if the process is still alive.
All mutating lifecycle commands are serialized per `CODEX_HOME`, so a concurrent
`start`, `restart`, `enable-remote-control`, `disable-remote-control`, `stop`,
or `bootstrap` does not race another in-flight lifecycle operation.
## State
The daemon stores its local state under `CODEX_HOME/app-server-daemon/`:
- `settings.json` for persisted launch settings
- `app-server.pid` for the app-server process record
- `app-server-updater.pid` for the pid-backed standalone updater loop
- `daemon.lock` for daemon-wide lifecycle serialization

View File

@@ -0,0 +1,33 @@
mod pid;
use std::path::PathBuf;
use serde::Serialize;
pub(crate) use pid::PidBackend;
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub enum BackendKind {
Pid,
}
#[derive(Debug, Clone)]
pub(crate) struct BackendPaths {
pub(crate) codex_bin: PathBuf,
pub(crate) pid_file: PathBuf,
pub(crate) update_pid_file: PathBuf,
pub(crate) remote_control_enabled: bool,
}
pub(crate) fn pid_backend(paths: BackendPaths) -> PidBackend {
PidBackend::new(
paths.codex_bin,
paths.pid_file,
paths.remote_control_enabled,
)
}
pub(crate) fn pid_update_loop_backend(paths: BackendPaths) -> PidBackend {
PidBackend::new_update_loop(paths.codex_bin, paths.update_pid_file)
}

View File

@@ -0,0 +1,594 @@
use std::path::Path;
use std::path::PathBuf;
#[cfg(unix)]
use std::process::Stdio;
use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::bail;
use serde::Deserialize;
use serde::Serialize;
use tokio::fs;
#[cfg(unix)]
use tokio::process::Command;
use tokio::time::sleep;
const STOP_POLL_INTERVAL: Duration = Duration::from_millis(50);
const STOP_GRACE_PERIOD: Duration = Duration::from_secs(60);
const STOP_TIMEOUT: Duration = Duration::from_secs(70);
const START_TIMEOUT: Duration = Duration::from_secs(10);
#[derive(Debug)]
#[cfg_attr(not(unix), allow(dead_code))]
pub(crate) struct PidBackend {
codex_bin: PathBuf,
pid_file: PathBuf,
lock_file: PathBuf,
command_kind: PidCommandKind,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct PidRecord {
pid: u32,
process_start_time: String,
}
#[derive(Debug, Clone, PartialEq, Eq)]
enum PidFileState {
Missing,
Starting,
Running(PidRecord),
}
#[derive(Debug, Clone, Copy)]
#[cfg_attr(not(unix), allow(dead_code))]
enum PidCommandKind {
AppServer { remote_control_enabled: bool },
UpdateLoop,
}
impl PidBackend {
pub(crate) fn new(codex_bin: PathBuf, pid_file: PathBuf, remote_control_enabled: bool) -> Self {
let lock_file = pid_file.with_extension("pid.lock");
Self {
codex_bin,
pid_file,
lock_file,
command_kind: PidCommandKind::AppServer {
remote_control_enabled,
},
}
}
pub(crate) fn new_update_loop(codex_bin: PathBuf, pid_file: PathBuf) -> Self {
let lock_file = pid_file.with_extension("pid.lock");
Self {
codex_bin,
pid_file,
lock_file,
command_kind: PidCommandKind::UpdateLoop,
}
}
pub(crate) async fn is_starting_or_running(&self) -> Result<bool> {
loop {
match self.read_pid_file_state().await? {
PidFileState::Missing => return Ok(false),
PidFileState::Starting => return Ok(true),
PidFileState::Running(record) => {
if self.record_is_active(&record).await? {
return Ok(true);
}
match self.refresh_after_stale_record(&record).await? {
PidFileState::Missing => return Ok(false),
PidFileState::Starting | PidFileState::Running(_) => continue,
}
}
}
}
}
#[cfg(unix)]
pub(crate) async fn start(&self) -> Result<Option<u32>> {
if let Some(parent) = self.pid_file.parent() {
fs::create_dir_all(parent)
.await
.with_context(|| format!("failed to create pid directory {}", parent.display()))?;
}
let reservation_lock = self.acquire_reservation_lock().await?;
let _pid_file = loop {
match fs::OpenOptions::new()
.create_new(true)
.write(true)
.open(&self.pid_file)
.await
{
Ok(pid_file) => break pid_file,
Err(err) if err.kind() == std::io::ErrorKind::AlreadyExists => {
match self.read_pid_file_state_with_lock_held().await? {
PidFileState::Missing => continue,
PidFileState::Running(record) => {
if self.record_is_active(&record).await? {
return Ok(None);
}
let _ = fs::remove_file(&self.pid_file).await;
continue;
}
PidFileState::Starting => {
unreachable!("lock holder cannot observe starting")
}
}
}
Err(err) => {
return Err(err).with_context(|| {
format!("failed to reserve pid file {}", self.pid_file.display())
});
}
}
};
let mut command = Command::new(&self.codex_bin);
command
.args(self.command_args())
.stdin(Stdio::null())
.stdout(Stdio::null())
.stderr(Stdio::null());
#[cfg(unix)]
{
unsafe {
command.pre_exec(|| {
if libc::setsid() == -1 {
return Err(std::io::Error::last_os_error());
}
Ok(())
});
}
}
let child = match command.spawn() {
Ok(child) => child,
Err(err) => {
let _ = fs::remove_file(&self.pid_file).await;
return Err(err).with_context(|| {
format!(
"failed to spawn detached app-server process using {}",
self.codex_bin.display()
)
});
}
};
let pid = child
.id()
.context("spawned app-server process has no pid")?;
let record = match read_process_start_time(pid).await {
Ok(process_start_time) => PidRecord {
pid,
process_start_time,
},
Err(err) => {
let _ = self.terminate_process(pid);
let _ = fs::remove_file(&self.pid_file).await;
return Err(err);
}
};
let contents = serde_json::to_vec(&record).context("failed to serialize pid record")?;
let temp_pid_file = self.pid_file.with_extension("pid.tmp");
if let Err(err) = fs::write(&temp_pid_file, &contents).await {
let _ = self.terminate_process(pid);
let _ = fs::remove_file(&self.pid_file).await;
return Err(err).with_context(|| {
format!("failed to write pid temp file {}", temp_pid_file.display())
});
}
if let Err(err) = fs::rename(&temp_pid_file, &self.pid_file).await {
let _ = self.terminate_process(pid);
let _ = fs::remove_file(&temp_pid_file).await;
let _ = fs::remove_file(&self.pid_file).await;
return Err(err).with_context(|| {
format!("failed to publish pid file {}", self.pid_file.display())
});
}
drop(reservation_lock);
Ok(Some(pid))
}
#[cfg(not(unix))]
pub(crate) async fn start(&self) -> Result<Option<u32>> {
bail!("pid-managed app-server startup is unsupported on this platform")
}
pub(crate) async fn stop(&self) -> Result<()> {
loop {
let Some(record) = self.wait_for_pid_start().await? else {
return Ok(());
};
if !self.record_is_active(&record).await? {
match self.refresh_after_stale_record(&record).await? {
PidFileState::Missing => return Ok(()),
PidFileState::Starting | PidFileState::Running(_) => continue,
}
}
let pid = record.pid;
self.terminate_process(pid)?;
let started_at = tokio::time::Instant::now();
let deadline = tokio::time::Instant::now() + STOP_TIMEOUT;
let mut forced = false;
while tokio::time::Instant::now() < deadline {
if !self.record_is_active(&record).await? {
match self.refresh_after_stale_record(&record).await? {
PidFileState::Missing => return Ok(()),
PidFileState::Starting | PidFileState::Running(_) => break,
}
}
if !forced && started_at.elapsed() >= STOP_GRACE_PERIOD {
self.force_terminate_process(pid)?;
forced = true;
}
sleep(STOP_POLL_INTERVAL).await;
}
if self.record_is_active(&record).await? {
bail!("timed out waiting for pid-managed app server {pid} to stop");
}
}
}
async fn wait_for_pid_start(&self) -> Result<Option<PidRecord>> {
let deadline = tokio::time::Instant::now() + START_TIMEOUT;
loop {
match self.read_pid_file_state().await? {
PidFileState::Missing => return Ok(None),
PidFileState::Running(record) => return Ok(Some(record)),
PidFileState::Starting if tokio::time::Instant::now() < deadline => {
sleep(STOP_POLL_INTERVAL).await;
}
PidFileState::Starting => {
bail!(
"timed out waiting for pid reservation in {} to finish initializing",
self.pid_file.display()
);
}
}
}
}
async fn read_pid_file_state(&self) -> Result<PidFileState> {
let contents = match fs::read_to_string(&self.pid_file).await {
Ok(contents) => contents,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
return if reservation_lock_is_active(&self.lock_file).await? {
Ok(PidFileState::Starting)
} else {
Ok(PidFileState::Missing)
};
}
Err(err) => {
return Err(err).with_context(|| {
format!("failed to read pid file {}", self.pid_file.display())
});
}
};
if contents.trim().is_empty() {
match inspect_empty_pid_reservation(&self.pid_file, &self.lock_file).await? {
EmptyPidReservation::Active => {
return Ok(PidFileState::Starting);
}
EmptyPidReservation::Stale => {
return Ok(PidFileState::Missing);
}
EmptyPidReservation::Record(record) => return Ok(PidFileState::Running(record)),
}
}
let record = serde_json::from_str(&contents)
.with_context(|| format!("invalid pid file contents in {}", self.pid_file.display()))?;
Ok(PidFileState::Running(record))
}
async fn read_pid_file_state_with_lock_held(&self) -> Result<PidFileState> {
let contents = match fs::read_to_string(&self.pid_file).await {
Ok(contents) => contents,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
return Ok(PidFileState::Missing);
}
Err(err) => {
return Err(err).with_context(|| {
format!("failed to read pid file {}", self.pid_file.display())
});
}
};
if contents.trim().is_empty() {
let _ = fs::remove_file(&self.pid_file).await;
return Ok(PidFileState::Missing);
}
let record = serde_json::from_str(&contents)
.with_context(|| format!("invalid pid file contents in {}", self.pid_file.display()))?;
Ok(PidFileState::Running(record))
}
async fn refresh_after_stale_record(&self, expected: &PidRecord) -> Result<PidFileState> {
let reservation_lock = self.acquire_reservation_lock().await?;
let state = match self.read_pid_file_state_with_lock_held().await? {
PidFileState::Running(record) if record == *expected => {
let _ = fs::remove_file(&self.pid_file).await;
PidFileState::Missing
}
state => state,
};
drop(reservation_lock);
Ok(state)
}
async fn acquire_reservation_lock(&self) -> Result<fs::File> {
let reservation_lock = fs::OpenOptions::new()
.create(true)
.truncate(false)
.write(true)
.open(&self.lock_file)
.await
.with_context(|| {
format!("failed to open pid lock file {}", self.lock_file.display())
})?;
let lock_deadline = tokio::time::Instant::now() + START_TIMEOUT;
while !try_lock_file(&reservation_lock)? {
if tokio::time::Instant::now() >= lock_deadline {
bail!(
"timed out waiting for pid lock {}",
self.lock_file.display()
);
}
sleep(STOP_POLL_INTERVAL).await;
}
Ok(reservation_lock)
}
#[cfg(unix)]
fn command_args(&self) -> Vec<&'static str> {
match self.command_kind {
PidCommandKind::AppServer {
remote_control_enabled: true,
} => vec!["app-server", "--remote-control", "--listen", "unix://"],
PidCommandKind::AppServer {
remote_control_enabled: false,
} => vec!["app-server", "--listen", "unix://"],
PidCommandKind::UpdateLoop => vec!["app-server", "daemon", "pid-update-loop"],
}
}
fn terminate_process(&self, pid: u32) -> Result<()> {
match self.command_kind {
PidCommandKind::AppServer { .. } => terminate_process(pid),
PidCommandKind::UpdateLoop => terminate_process(pid),
}
}
fn force_terminate_process(&self, pid: u32) -> Result<()> {
match self.command_kind {
PidCommandKind::AppServer { .. } => force_terminate_process(pid),
PidCommandKind::UpdateLoop => force_terminate_process_group(pid),
}
}
async fn record_is_active(&self, record: &PidRecord) -> Result<bool> {
process_matches_record(record).await
}
}
#[cfg(unix)]
fn process_exists(pid: u32) -> bool {
let Ok(pid) = libc::pid_t::try_from(pid) else {
return false;
};
let result = unsafe { libc::kill(pid, 0) };
result == 0 || std::io::Error::last_os_error().raw_os_error() == Some(libc::EPERM)
}
#[cfg(unix)]
fn terminate_process(pid: u32) -> Result<()> {
let raw_pid = libc::pid_t::try_from(pid)
.with_context(|| format!("pid-managed app server pid {pid} is out of range"))?;
let result = unsafe { libc::kill(raw_pid, libc::SIGTERM) };
if result == 0 {
return Ok(());
}
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::ESRCH) {
return Ok(());
}
Err(err).with_context(|| format!("failed to terminate pid-managed app server {pid}"))
}
#[cfg(unix)]
fn force_terminate_process(pid: u32) -> Result<()> {
let raw_pid = libc::pid_t::try_from(pid)
.with_context(|| format!("pid-managed app server pid {pid} is out of range"))?;
let result = unsafe { libc::kill(raw_pid, libc::SIGKILL) };
if result == 0 {
return Ok(());
}
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::ESRCH) {
return Ok(());
}
Err(err).with_context(|| format!("failed to force terminate pid-managed app server {pid}"))
}
#[cfg(unix)]
fn force_terminate_process_group(pid: u32) -> Result<()> {
let raw_pid = libc::pid_t::try_from(pid)
.with_context(|| format!("pid-managed updater pid {pid} is out of range"))?;
let result = unsafe { libc::kill(-raw_pid, libc::SIGKILL) };
if result == 0 {
return Ok(());
}
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::ESRCH) {
return Ok(());
}
Err(err).with_context(|| format!("failed to force terminate pid-managed updater group {pid}"))
}
#[cfg(not(unix))]
fn terminate_process(_pid: u32) -> Result<()> {
bail!("pid-managed app-server shutdown is unsupported on this platform")
}
#[cfg(not(unix))]
fn force_terminate_process(_pid: u32) -> Result<()> {
bail!("pid-managed app-server shutdown is unsupported on this platform")
}
#[cfg(not(unix))]
fn force_terminate_process_group(_pid: u32) -> Result<()> {
bail!("pid-managed updater shutdown is unsupported on this platform")
}
#[cfg(unix)]
async fn process_matches_record(record: &PidRecord) -> Result<bool> {
if !process_exists(record.pid) {
return Ok(false);
}
match read_process_start_time(record.pid).await {
Ok(start_time) => Ok(start_time == record.process_start_time),
Err(_err) if !process_exists(record.pid) => Ok(false),
Err(err) => Err(err),
}
}
#[cfg(not(unix))]
async fn process_matches_record(_record: &PidRecord) -> Result<bool> {
Ok(false)
}
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(not(unix), allow(dead_code))]
enum EmptyPidReservation {
Active,
Stale,
Record(PidRecord),
}
#[cfg(unix)]
fn try_lock_file(file: &fs::File) -> Result<bool> {
use std::os::fd::AsRawFd;
let result = unsafe { libc::flock(file.as_raw_fd(), libc::LOCK_EX | libc::LOCK_NB) };
if result == 0 {
return Ok(true);
}
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::EWOULDBLOCK) {
return Ok(false);
}
Err(err).context("failed to lock pid reservation")
}
#[cfg(not(unix))]
fn try_lock_file(_file: &fs::File) -> Result<bool> {
bail!("pid-managed app-server startup is unsupported on this platform")
}
#[cfg(unix)]
async fn reservation_lock_is_active(path: &Path) -> Result<bool> {
let file = match fs::OpenOptions::new()
.write(true)
.create(true)
.truncate(false)
.open(path)
.await
{
Ok(file) => file,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
return Ok(false);
}
Err(err) => {
return Err(err)
.with_context(|| format!("failed to inspect pid lock file {}", path.display()));
}
};
Ok(!try_lock_file(&file)?)
}
#[cfg(not(unix))]
async fn reservation_lock_is_active(_path: &Path) -> Result<bool> {
Ok(false)
}
#[cfg(unix)]
async fn inspect_empty_pid_reservation(
pid_path: &Path,
lock_path: &Path,
) -> Result<EmptyPidReservation> {
let file = match fs::OpenOptions::new()
.write(true)
.create(true)
.truncate(false)
.open(lock_path)
.await
{
Ok(file) => file,
Err(err) => {
return Err(err).with_context(|| {
format!("failed to inspect pid lock file {}", lock_path.display())
});
}
};
if !try_lock_file(&file)? {
return Ok(EmptyPidReservation::Active);
}
let contents = match fs::read_to_string(pid_path).await {
Ok(contents) => contents,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
return Ok(EmptyPidReservation::Stale);
}
Err(err) => {
return Err(err)
.with_context(|| format!("failed to reread pid file {}", pid_path.display()));
}
};
if contents.trim().is_empty() {
let _ = fs::remove_file(pid_path).await;
return Ok(EmptyPidReservation::Stale);
}
let record = serde_json::from_str(&contents)
.with_context(|| format!("invalid pid file contents in {}", pid_path.display()))?;
Ok(EmptyPidReservation::Record(record))
}
#[cfg(not(unix))]
async fn inspect_empty_pid_reservation(
_pid_path: &Path,
_lock_path: &Path,
) -> Result<EmptyPidReservation> {
Ok(EmptyPidReservation::Stale)
}
#[cfg(unix)]
async fn read_process_start_time(pid: u32) -> Result<String> {
let output = Command::new("ps")
.args(["-p", &pid.to_string(), "-o", "lstart="])
.output()
.await
.context("failed to invoke ps for pid-managed app server")?;
if !output.status.success() {
bail!("failed to read start time for pid-managed app server {pid}");
}
let start_time = String::from_utf8(output.stdout)
.context("pid-managed app server start time was not utf-8")?;
let start_time = start_time.trim();
if start_time.is_empty() {
bail!("pid-managed app server {pid} has no recorded start time");
}
Ok(start_time.to_string())
}
#[cfg(all(test, unix))]
#[path = "pid_tests.rs"]
mod tests;

View File

@@ -0,0 +1,172 @@
use std::time::Duration;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use super::PidBackend;
use super::PidCommandKind;
use super::PidFileState;
use super::PidRecord;
use super::try_lock_file;
#[tokio::test]
async fn locked_empty_pid_file_is_treated_as_active_reservation() {
let temp_dir = TempDir::new().expect("temp dir");
let pid_file = temp_dir.path().join("app-server.pid");
tokio::fs::write(&pid_file, "")
.await
.expect("write pid file");
let backend = PidBackend::new(
temp_dir.path().join("codex"),
pid_file.clone(),
/*remote_control_enabled*/ false,
);
let reservation = tokio::fs::OpenOptions::new()
.create(true)
.truncate(false)
.write(true)
.open(&backend.lock_file)
.await
.expect("open pid lock file");
assert!(try_lock_file(&reservation).expect("lock reservation"));
assert_eq!(
backend.read_pid_file_state().await.expect("read pid"),
PidFileState::Starting
);
assert!(pid_file.exists());
}
#[tokio::test]
async fn unlocked_empty_pid_file_is_treated_as_stale_reservation() {
let temp_dir = TempDir::new().expect("temp dir");
let pid_file = temp_dir.path().join("app-server.pid");
tokio::fs::write(&pid_file, "")
.await
.expect("write pid file");
let backend = PidBackend::new(
temp_dir.path().join("codex"),
pid_file.clone(),
/*remote_control_enabled*/ false,
);
assert_eq!(
backend.read_pid_file_state().await.expect("read pid"),
PidFileState::Missing
);
assert!(!pid_file.exists());
}
#[tokio::test]
async fn stop_waits_for_live_reservation_to_resolve() {
let temp_dir = TempDir::new().expect("temp dir");
let pid_file = temp_dir.path().join("app-server.pid");
tokio::fs::write(&pid_file, "")
.await
.expect("write pid file");
let backend = PidBackend::new(
temp_dir.path().join("codex"),
pid_file.clone(),
/*remote_control_enabled*/ false,
);
let reservation = tokio::fs::OpenOptions::new()
.create(true)
.truncate(false)
.write(true)
.open(&backend.lock_file)
.await
.expect("open pid lock file");
assert!(try_lock_file(&reservation).expect("lock reservation"));
let cleanup = tokio::spawn(async move {
tokio::time::sleep(Duration::from_millis(50)).await;
drop(reservation);
tokio::fs::remove_file(pid_file)
.await
.expect("remove pid file");
});
backend.stop().await.expect("stop");
cleanup.await.expect("cleanup task");
}
#[tokio::test]
async fn start_retries_stale_empty_pid_file_under_its_own_lock() {
let temp_dir = TempDir::new().expect("temp dir");
let pid_file = temp_dir.path().join("app-server.pid");
tokio::fs::write(&pid_file, "")
.await
.expect("write pid file");
let backend = PidBackend::new(
temp_dir.path().join("missing-codex"),
pid_file,
/*remote_control_enabled*/ false,
);
let err = backend.start().await.expect_err("start");
assert!(
err.to_string()
.starts_with("failed to spawn detached app-server process using ")
);
}
#[tokio::test]
async fn stale_record_cleanup_preserves_replacement_record() {
let temp_dir = TempDir::new().expect("temp dir");
let pid_file = temp_dir.path().join("app-server.pid");
let backend = PidBackend::new(
temp_dir.path().join("codex"),
pid_file.clone(),
/*remote_control_enabled*/ false,
);
let stale = PidRecord {
pid: 1,
process_start_time: "old".to_string(),
};
let replacement = PidRecord {
pid: 2,
process_start_time: "new".to_string(),
};
tokio::fs::write(
&pid_file,
serde_json::to_vec(&replacement).expect("serialize replacement"),
)
.await
.expect("write replacement pid file");
assert_eq!(
backend
.refresh_after_stale_record(&stale)
.await
.expect("cleanup"),
PidFileState::Running(replacement)
);
}
#[test]
fn update_loop_uses_hidden_app_server_subcommand() {
let backend = PidBackend {
codex_bin: "codex".into(),
pid_file: "updater.pid".into(),
lock_file: "updater.pid.lock".into(),
command_kind: PidCommandKind::UpdateLoop,
};
assert_eq!(
backend.command_args(),
vec!["app-server", "daemon", "pid-update-loop"]
);
}
#[test]
fn app_server_remote_control_uses_runtime_flag() {
let backend = PidBackend::new(
"codex".into(),
"app-server.pid".into(),
/*remote_control_enabled*/ true,
);
assert_eq!(
backend.command_args(),
vec!["app-server", "--remote-control", "--listen", "unix://"]
);
}

View File

@@ -0,0 +1,131 @@
use std::path::Path;
use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use codex_app_server_protocol::ClientInfo;
use codex_app_server_protocol::InitializeParams;
use codex_app_server_protocol::InitializeResponse;
use codex_app_server_protocol::JSONRPCMessage;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCRequest;
use codex_app_server_protocol::RequestId;
use codex_uds::UnixStream;
use futures::SinkExt;
use futures::StreamExt;
use tokio::time::timeout;
use tokio_tungstenite::client_async;
use tokio_tungstenite::tungstenite::Message;
const PROBE_TIMEOUT: Duration = Duration::from_secs(2);
const CLIENT_NAME: &str = "codex_app_server_daemon";
#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) struct ProbeInfo {
pub(crate) app_server_version: String,
}
pub(crate) async fn probe(socket_path: &Path) -> Result<ProbeInfo> {
timeout(PROBE_TIMEOUT, probe_inner(socket_path))
.await
.with_context(|| {
format!(
"timed out probing app-server control socket {}",
socket_path.display()
)
})?
}
async fn probe_inner(socket_path: &Path) -> Result<ProbeInfo> {
let stream = UnixStream::connect(socket_path)
.await
.with_context(|| format!("failed to connect to {}", socket_path.display()))?;
let (mut websocket, _response) = client_async("ws://localhost/", stream)
.await
.with_context(|| format!("failed to upgrade {}", socket_path.display()))?;
let initialize = JSONRPCMessage::Request(JSONRPCRequest {
id: RequestId::Integer(1),
method: "initialize".to_string(),
params: Some(serde_json::to_value(InitializeParams {
client_info: ClientInfo {
name: CLIENT_NAME.to_string(),
title: Some("Codex App Server Daemon".to_string()),
version: env!("CARGO_PKG_VERSION").to_string(),
},
capabilities: None,
})?),
trace: None,
});
websocket
.send(Message::Text(serde_json::to_string(&initialize)?.into()))
.await
.context("failed to send initialize request")?;
let response = loop {
let frame = websocket
.next()
.await
.ok_or_else(|| anyhow!("app-server closed before initialize response"))??;
let Message::Text(payload) = frame else {
continue;
};
let message = serde_json::from_str::<JSONRPCMessage>(&payload)?;
if let JSONRPCMessage::Response(response) = message
&& response.id == RequestId::Integer(1)
{
break response;
}
};
let initialize_response = serde_json::from_value::<InitializeResponse>(response.result)?;
let initialized = JSONRPCMessage::Notification(JSONRPCNotification {
method: "initialized".to_string(),
params: None,
});
websocket
.send(Message::Text(serde_json::to_string(&initialized)?.into()))
.await
.context("failed to send initialized notification")?;
websocket.close(None).await.ok();
Ok(ProbeInfo {
app_server_version: parse_version_from_user_agent(&initialize_response.user_agent)?,
})
}
fn parse_version_from_user_agent(user_agent: &str) -> Result<String> {
let (_originator, rest) = user_agent
.split_once('/')
.ok_or_else(|| anyhow!("app-server user-agent omitted version separator"))?;
let version = rest
.split_whitespace()
.next()
.filter(|version| !version.is_empty())
.ok_or_else(|| anyhow!("app-server user-agent omitted version"))?;
Ok(version.to_string())
}
#[cfg(all(test, unix))]
mod tests {
use pretty_assertions::assert_eq;
use super::parse_version_from_user_agent;
#[test]
fn parses_version_from_codex_user_agent() {
assert_eq!(
parse_version_from_user_agent(
"codex_app_server_daemon/1.2.3 (Linux 6.8.0; x86_64) codex_cli_rs/1.2.3",
)
.expect("version"),
"1.2.3"
);
}
#[test]
fn rejects_user_agent_without_version() {
assert!(parse_version_from_user_agent("codex_app_server_daemon").is_err());
}
}

View File

@@ -0,0 +1,878 @@
mod backend;
mod client;
mod managed_install;
mod settings;
mod update_loop;
use std::path::Path;
use std::path::PathBuf;
use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
pub use backend::BackendKind;
use backend::BackendPaths;
use codex_app_server_transport::app_server_control_socket_path;
use codex_utils_home_dir::find_codex_home;
use managed_install::managed_codex_bin;
#[cfg(unix)]
use managed_install::managed_codex_version;
use serde::Serialize;
use settings::DaemonSettings;
use tokio::time::sleep;
const START_POLL_INTERVAL: Duration = Duration::from_millis(50);
const START_TIMEOUT: Duration = Duration::from_secs(10);
const OPERATION_LOCK_TIMEOUT: Duration = Duration::from_secs(75);
const PID_FILE_NAME: &str = "app-server.pid";
const UPDATE_PID_FILE_NAME: &str = "app-server-updater.pid";
const OPERATION_LOCK_FILE_NAME: &str = "daemon.lock";
const SETTINGS_FILE_NAME: &str = "settings.json";
const STATE_DIR_NAME: &str = "app-server-daemon";
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleCommand {
Start,
Restart,
Stop,
Version,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub enum LifecycleStatus {
AlreadyRunning,
Started,
Restarted,
Stopped,
NotRunning,
Running,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct LifecycleOutput {
pub status: LifecycleStatus,
#[serde(skip_serializing_if = "Option::is_none")]
pub backend: Option<BackendKind>,
#[serde(skip_serializing_if = "Option::is_none")]
pub pid: Option<u32>,
pub socket_path: PathBuf,
#[serde(skip_serializing_if = "Option::is_none")]
pub cli_version: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub app_server_version: Option<String>,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BootstrapOptions {
pub remote_control_enabled: bool,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub enum BootstrapStatus {
Bootstrapped,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct BootstrapOutput {
pub status: BootstrapStatus,
pub backend: BackendKind,
pub auto_update_enabled: bool,
pub remote_control_enabled: bool,
pub managed_codex_path: PathBuf,
pub socket_path: PathBuf,
pub cli_version: String,
pub app_server_version: String,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
#[serde(untagged)]
pub enum RemoteControlStartOutput {
Bootstrap(BootstrapOutput),
Start(LifecycleOutput),
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RemoteControlMode {
Enabled,
Disabled,
}
impl RemoteControlMode {
fn is_enabled(self) -> bool {
matches!(self, Self::Enabled)
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub enum RemoteControlStatus {
Enabled,
Disabled,
AlreadyEnabled,
AlreadyDisabled,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct RemoteControlOutput {
pub status: RemoteControlStatus,
#[serde(skip_serializing_if = "Option::is_none")]
pub backend: Option<BackendKind>,
pub remote_control_enabled: bool,
pub socket_path: PathBuf,
pub cli_version: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub app_server_version: Option<String>,
}
#[cfg(unix)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum RestartIfRunningOutcome {
Busy,
NotRunning,
NotReady,
AlreadyCurrent,
Restarted,
}
#[cfg(unix)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum RestartMode {
IfVersionChanged,
Always,
}
#[cfg(unix)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum UpdaterRefreshMode {
None,
ReexecIfManagedBinaryChanged,
}
#[cfg(unix)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RestartDecision {
NotReady,
AlreadyCurrent,
Restart,
}
pub async fn run(command: LifecycleCommand) -> Result<LifecycleOutput> {
ensure_supported_platform()?;
Daemon::from_environment()?.run(command).await
}
pub async fn bootstrap(options: BootstrapOptions) -> Result<BootstrapOutput> {
ensure_supported_platform()?;
Daemon::from_environment()?.bootstrap(options).await
}
pub async fn ensure_remote_control_started() -> Result<RemoteControlStartOutput> {
ensure_supported_platform()?;
Daemon::from_environment()?
.ensure_remote_control_started()
.await
}
pub async fn set_remote_control(mode: RemoteControlMode) -> Result<RemoteControlOutput> {
ensure_supported_platform()?;
Daemon::from_environment()?.set_remote_control(mode).await
}
pub async fn run_pid_update_loop() -> Result<()> {
ensure_supported_platform()?;
update_loop::run().await
}
#[cfg(unix)]
fn ensure_supported_platform() -> Result<()> {
Ok(())
}
#[cfg(not(unix))]
fn ensure_supported_platform() -> Result<()> {
Err(anyhow!(
"codex app-server daemon lifecycle is only supported on Unix platforms"
))
}
struct Daemon {
socket_path: PathBuf,
pid_file: PathBuf,
update_pid_file: PathBuf,
operation_lock_file: PathBuf,
settings_file: PathBuf,
managed_codex_bin: PathBuf,
}
impl Daemon {
fn from_environment() -> Result<Self> {
let codex_home = find_codex_home().context("failed to resolve CODEX_HOME")?;
let socket_path = app_server_control_socket_path(codex_home.as_path())?
.as_path()
.to_path_buf();
let state_dir = codex_home.as_path().join(STATE_DIR_NAME);
Ok(Self {
socket_path,
pid_file: state_dir.join(PID_FILE_NAME),
update_pid_file: state_dir.join(UPDATE_PID_FILE_NAME),
operation_lock_file: state_dir.join(OPERATION_LOCK_FILE_NAME),
settings_file: state_dir.join(SETTINGS_FILE_NAME),
managed_codex_bin: managed_codex_bin(codex_home.as_path()),
})
}
async fn run(&self, command: LifecycleCommand) -> Result<LifecycleOutput> {
match command {
LifecycleCommand::Start => {
let _operation_lock = self.acquire_operation_lock().await?;
self.start().await
}
LifecycleCommand::Restart => {
let _operation_lock = self.acquire_operation_lock().await?;
self.restart().await
}
LifecycleCommand::Stop => {
let _operation_lock = self.acquire_operation_lock().await?;
self.stop().await
}
LifecycleCommand::Version => self.version().await,
}
}
async fn start(&self) -> Result<LifecycleOutput> {
let settings = self.load_settings().await?;
if let Ok(info) = client::probe(&self.socket_path).await {
return Ok(self.output(
LifecycleStatus::AlreadyRunning,
self.running_backend(&settings).await?,
/*pid*/ None,
Some(info.app_server_version),
));
}
if self.running_backend_instance(&settings).await?.is_some() {
let info = self.wait_until_ready().await?;
return Ok(self.output(
LifecycleStatus::AlreadyRunning,
Some(BackendKind::Pid),
/*pid*/ None,
Some(info.app_server_version),
));
}
self.ensure_managed_codex_bin()?;
let pid = self.start_managed_backend(&settings).await?;
let info = self.wait_until_ready().await?;
Ok(self.output(
LifecycleStatus::Started,
Some(BackendKind::Pid),
pid,
Some(info.app_server_version),
))
}
async fn restart(&self) -> Result<LifecycleOutput> {
let settings = self.load_settings().await?;
if client::probe(&self.socket_path).await.is_ok()
&& self.running_backend(&settings).await?.is_none()
{
return Err(anyhow!(
"app server is running but is not managed by codex app-server daemon"
));
}
self.ensure_managed_codex_bin()?;
if let Some(backend) = self.running_backend_instance(&settings).await? {
backend.stop().await?;
}
let pid = self.start_managed_backend(&settings).await?;
let info = self.wait_until_ready().await?;
Ok(self.output(
LifecycleStatus::Restarted,
Some(BackendKind::Pid),
pid,
Some(info.app_server_version),
))
}
#[cfg(unix)]
pub(crate) async fn try_restart_if_running(
&self,
mode: RestartMode,
updater_refresh_mode: UpdaterRefreshMode,
managed_codex_bin: &Path,
) -> Result<RestartIfRunningOutcome> {
let operation_lock = self.open_operation_lock_file().await?;
if !try_lock_file(&operation_lock)? {
return Ok(RestartIfRunningOutcome::Busy);
}
let settings = self.load_settings().await?;
let outcome = if let Some(backend) = self.running_backend_instance(&settings).await? {
let info = client::probe(&self.socket_path).await.ok();
let managed_version = if info.is_some() {
Some(managed_codex_version(managed_codex_bin).await?)
} else {
None
};
match restart_decision(mode, info.as_ref(), managed_version.as_deref()) {
RestartDecision::NotReady => return Ok(RestartIfRunningOutcome::NotReady),
RestartDecision::AlreadyCurrent => RestartIfRunningOutcome::AlreadyCurrent,
RestartDecision::Restart => {
backend.stop().await?;
let _ = self
.start_managed_backend_with_bin(&settings, managed_codex_bin)
.await?;
self.wait_until_ready().await?;
RestartIfRunningOutcome::Restarted
}
}
} else if client::probe(&self.socket_path).await.is_ok() {
return Err(anyhow!(
"app server is running but is not managed by codex app-server daemon"
));
} else {
RestartIfRunningOutcome::NotRunning
};
if should_reexec_updater(updater_refresh_mode, outcome) {
crate::update_loop::reexec_managed_updater(managed_codex_bin)?;
}
Ok(outcome)
}
async fn stop(&self) -> Result<LifecycleOutput> {
let settings = self.load_settings().await?;
if let Some(backend) = self.running_backend_instance(&settings).await? {
backend.stop().await?;
return Ok(self.output(
LifecycleStatus::Stopped,
Some(BackendKind::Pid),
/*pid*/ None,
/*app_server_version*/ None,
));
}
if client::probe(&self.socket_path).await.is_ok() {
return Err(anyhow!(
"app server is running but is not managed by codex app-server daemon"
));
}
Ok(self.output(
LifecycleStatus::NotRunning,
/*backend*/ None,
/*pid*/ None,
/*app_server_version*/ None,
))
}
async fn version(&self) -> Result<LifecycleOutput> {
let settings = self.load_settings().await?;
let info = client::probe(&self.socket_path).await?;
Ok(self.output(
LifecycleStatus::Running,
self.running_backend(&settings).await?,
/*pid*/ None,
Some(info.app_server_version),
))
}
async fn wait_until_ready(&self) -> Result<client::ProbeInfo> {
let deadline = tokio::time::Instant::now() + START_TIMEOUT;
loop {
match client::probe(&self.socket_path).await {
Ok(info) => return Ok(info),
Err(err) if tokio::time::Instant::now() < deadline => {
let _ = err;
sleep(START_POLL_INTERVAL).await;
}
Err(err) => {
return Err(err).with_context(|| {
format!(
"app server did not become ready on {}",
self.socket_path.display()
)
});
}
}
}
}
async fn bootstrap(&self, options: BootstrapOptions) -> Result<BootstrapOutput> {
let _operation_lock = self.acquire_operation_lock().await?;
self.bootstrap_locked(options).await
}
async fn ensure_remote_control_started(&self) -> Result<RemoteControlStartOutput> {
let _operation_lock = self.acquire_operation_lock().await?;
let settings = self.load_settings().await?;
if self.is_bootstrapped(&settings).await? {
let _ = self
.set_remote_control_locked(RemoteControlMode::Enabled)
.await?;
let output = self.start().await?;
return Ok(RemoteControlStartOutput::Start(output));
}
let output = self
.bootstrap_locked(BootstrapOptions {
remote_control_enabled: true,
})
.await?;
Ok(RemoteControlStartOutput::Bootstrap(output))
}
async fn set_remote_control(&self, mode: RemoteControlMode) -> Result<RemoteControlOutput> {
let _operation_lock = self.acquire_operation_lock().await?;
self.set_remote_control_locked(mode).await
}
async fn set_remote_control_locked(
&self,
mode: RemoteControlMode,
) -> Result<RemoteControlOutput> {
let previous_settings = self.load_settings().await?;
let mut settings = previous_settings.clone();
let remote_control_enabled = mode.is_enabled();
let backend = self.running_backend_instance(&previous_settings).await?;
if backend.is_none() && client::probe(&self.socket_path).await.is_ok() {
return Err(anyhow!(
"app server is running but is not managed by codex app-server daemon"
));
}
if settings.remote_control_enabled == remote_control_enabled {
let info = if backend.is_some() {
Some(self.wait_until_ready().await?)
} else {
None
};
return Ok(self.remote_control_output(
already_remote_control_status(mode),
backend.map(|_| BackendKind::Pid),
remote_control_enabled,
info.map(|info| info.app_server_version),
));
}
settings.remote_control_enabled = remote_control_enabled;
settings.save(&self.settings_file).await?;
let app_server_version = if let Some(backend) = backend {
self.ensure_managed_codex_bin()?;
backend.stop().await?;
let _ = self.start_managed_backend(&settings).await?;
Some(self.wait_until_ready().await?.app_server_version)
} else {
None
};
Ok(self.remote_control_output(
remote_control_status(mode),
app_server_version.as_ref().map(|_| BackendKind::Pid),
remote_control_enabled,
app_server_version,
))
}
async fn bootstrap_locked(&self, options: BootstrapOptions) -> Result<BootstrapOutput> {
self.ensure_managed_codex_bin()?;
let settings = DaemonSettings {
remote_control_enabled: options.remote_control_enabled,
};
if client::probe(&self.socket_path).await.is_ok()
&& self.running_backend(&settings).await?.is_none()
{
return Err(anyhow!(
"app server is running but is not managed by codex app-server daemon"
));
}
settings.save(&self.settings_file).await?;
if let Some(backend) = self.running_backend_instance(&settings).await? {
backend.stop().await?;
}
let backend = backend::pid_backend(self.backend_paths(&settings));
backend.start().await?;
let updater = backend::pid_update_loop_backend(self.backend_paths(&settings));
if updater.is_starting_or_running().await? {
updater.stop().await?;
}
updater.start().await?;
let info = self.wait_until_ready().await?;
Ok(BootstrapOutput {
status: BootstrapStatus::Bootstrapped,
backend: BackendKind::Pid,
auto_update_enabled: true,
remote_control_enabled: settings.remote_control_enabled,
managed_codex_path: self.managed_codex_bin.clone(),
socket_path: self.socket_path.clone(),
cli_version: env!("CARGO_PKG_VERSION").to_string(),
app_server_version: info.app_server_version,
})
}
async fn running_backend(&self, settings: &DaemonSettings) -> Result<Option<BackendKind>> {
Ok(self
.running_backend_instance(settings)
.await?
.map(|_| BackendKind::Pid))
}
async fn running_backend_instance(
&self,
settings: &DaemonSettings,
) -> Result<Option<backend::PidBackend>> {
let backend = backend::pid_backend(self.backend_paths(settings));
if backend.is_starting_or_running().await? {
return Ok(Some(backend));
}
Ok(None)
}
async fn start_managed_backend(&self, settings: &DaemonSettings) -> Result<Option<u32>> {
self.start_managed_backend_with_bin(settings, &self.managed_codex_bin)
.await
}
async fn start_managed_backend_with_bin(
&self,
settings: &DaemonSettings,
managed_codex_bin: &Path,
) -> Result<Option<u32>> {
let backend =
backend::pid_backend(self.backend_paths_with_bin(settings, managed_codex_bin));
backend.start().await
}
async fn is_bootstrapped(&self, settings: &DaemonSettings) -> Result<bool> {
let updater = backend::pid_update_loop_backend(self.backend_paths(settings));
updater.is_starting_or_running().await
}
fn ensure_managed_codex_bin(&self) -> Result<()> {
if self.managed_codex_bin.is_file() {
return Ok(());
}
let managed_codex_path = self.managed_codex_bin.display();
Err(anyhow!(
"managed standalone Codex install not found at {managed_codex_path}\n\n\
This command requires the standalone install managed by the Codex installer, because \
the daemon starts and updates app-server from that fixed path.\n\n\
Install it with:\n curl -fsSL https://chatgpt.com/codex/install.sh | sh\n\n\
Then rerun the command you just tried."
))
}
fn backend_paths(&self, settings: &DaemonSettings) -> BackendPaths {
self.backend_paths_with_bin(settings, &self.managed_codex_bin)
}
fn backend_paths_with_bin(
&self,
settings: &DaemonSettings,
managed_codex_bin: &Path,
) -> BackendPaths {
BackendPaths {
codex_bin: managed_codex_bin.to_path_buf(),
pid_file: self.pid_file.clone(),
update_pid_file: self.update_pid_file.clone(),
remote_control_enabled: settings.remote_control_enabled,
}
}
async fn load_settings(&self) -> Result<DaemonSettings> {
DaemonSettings::load(&self.settings_file).await
}
async fn acquire_operation_lock(&self) -> Result<tokio::fs::File> {
let operation_lock = self.open_operation_lock_file().await?;
let deadline = tokio::time::Instant::now() + OPERATION_LOCK_TIMEOUT;
while !try_lock_file(&operation_lock)? {
if tokio::time::Instant::now() >= deadline {
return Err(anyhow!(
"timed out waiting for daemon operation lock {}",
self.operation_lock_file.display()
));
}
sleep(START_POLL_INTERVAL).await;
}
Ok(operation_lock)
}
async fn open_operation_lock_file(&self) -> Result<tokio::fs::File> {
if let Some(parent) = self.operation_lock_file.parent() {
tokio::fs::create_dir_all(parent).await.with_context(|| {
format!(
"failed to create daemon state directory {}",
parent.display()
)
})?;
}
tokio::fs::OpenOptions::new()
.create(true)
.truncate(false)
.write(true)
.open(&self.operation_lock_file)
.await
.with_context(|| {
format!(
"failed to open daemon operation lock {}",
self.operation_lock_file.display()
)
})
}
fn output(
&self,
status: LifecycleStatus,
backend: Option<BackendKind>,
pid: Option<u32>,
app_server_version: Option<String>,
) -> LifecycleOutput {
LifecycleOutput {
status,
backend,
pid,
socket_path: self.socket_path.clone(),
cli_version: Some(env!("CARGO_PKG_VERSION").to_string()),
app_server_version,
}
}
fn remote_control_output(
&self,
status: RemoteControlStatus,
backend: Option<BackendKind>,
remote_control_enabled: bool,
app_server_version: Option<String>,
) -> RemoteControlOutput {
RemoteControlOutput {
status,
backend,
remote_control_enabled,
socket_path: self.socket_path.clone(),
cli_version: env!("CARGO_PKG_VERSION").to_string(),
app_server_version,
}
}
}
fn remote_control_status(mode: RemoteControlMode) -> RemoteControlStatus {
match mode {
RemoteControlMode::Enabled => RemoteControlStatus::Enabled,
RemoteControlMode::Disabled => RemoteControlStatus::Disabled,
}
}
fn already_remote_control_status(mode: RemoteControlMode) -> RemoteControlStatus {
match mode {
RemoteControlMode::Enabled => RemoteControlStatus::AlreadyEnabled,
RemoteControlMode::Disabled => RemoteControlStatus::AlreadyDisabled,
}
}
#[cfg(unix)]
fn restart_decision(
mode: RestartMode,
info: Option<&client::ProbeInfo>,
managed_version: Option<&str>,
) -> RestartDecision {
match (mode, info, managed_version) {
(RestartMode::IfVersionChanged, None, _) => RestartDecision::NotReady,
(RestartMode::IfVersionChanged, Some(info), Some(managed_version))
if info.app_server_version == managed_version =>
{
RestartDecision::AlreadyCurrent
}
_ => RestartDecision::Restart,
}
}
#[cfg(unix)]
fn should_reexec_updater(
updater_refresh_mode: UpdaterRefreshMode,
outcome: RestartIfRunningOutcome,
) -> bool {
updater_refresh_mode == UpdaterRefreshMode::ReexecIfManagedBinaryChanged
&& outcome == RestartIfRunningOutcome::Restarted
}
#[cfg(unix)]
fn try_lock_file(file: &tokio::fs::File) -> Result<bool> {
use std::os::fd::AsRawFd;
let result = unsafe { libc::flock(file.as_raw_fd(), libc::LOCK_EX | libc::LOCK_NB) };
if result == 0 {
return Ok(true);
}
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::EWOULDBLOCK) {
return Ok(false);
}
Err(err).context("failed to lock daemon operation")
}
#[cfg(not(unix))]
fn try_lock_file(_file: &tokio::fs::File) -> Result<bool> {
Ok(true)
}
#[cfg(all(test, unix))]
mod tests {
use pretty_assertions::assert_eq;
use super::BackendKind;
use super::BootstrapOutput;
use super::BootstrapStatus;
use super::LifecycleOutput;
use super::LifecycleStatus;
use super::RemoteControlStartOutput;
use super::RemoteControlStatus;
use super::RestartDecision;
use super::RestartIfRunningOutcome;
use super::RestartMode;
use super::UpdaterRefreshMode;
use super::restart_decision;
use super::should_reexec_updater;
use crate::client::ProbeInfo;
#[test]
fn lifecycle_status_uses_camel_case_json() {
assert_eq!(
serde_json::to_string(&LifecycleStatus::AlreadyRunning).expect("serialize"),
"\"alreadyRunning\""
);
}
#[test]
fn bootstrap_status_uses_camel_case_json() {
assert_eq!(
serde_json::to_string(&BootstrapStatus::Bootstrapped).expect("serialize"),
"\"bootstrapped\""
);
}
#[test]
fn remote_control_status_uses_camel_case_json() {
assert_eq!(
serde_json::to_string(&RemoteControlStatus::AlreadyEnabled).expect("serialize"),
"\"alreadyEnabled\""
);
}
#[test]
fn updater_reexec_waits_for_validated_restart() {
assert_eq!(
[
RestartIfRunningOutcome::Busy,
RestartIfRunningOutcome::NotReady,
RestartIfRunningOutcome::AlreadyCurrent,
RestartIfRunningOutcome::NotRunning,
RestartIfRunningOutcome::Restarted,
]
.map(|outcome| {
should_reexec_updater(UpdaterRefreshMode::ReexecIfManagedBinaryChanged, outcome)
}),
[false, false, false, false, true]
);
}
#[test]
fn unchanged_updater_never_reexecs() {
assert_eq!(
[
RestartIfRunningOutcome::Busy,
RestartIfRunningOutcome::NotReady,
RestartIfRunningOutcome::AlreadyCurrent,
RestartIfRunningOutcome::NotRunning,
RestartIfRunningOutcome::Restarted,
]
.map(|outcome| should_reexec_updater(UpdaterRefreshMode::None, outcome)),
[false, false, false, false, false]
);
}
#[test]
fn restart_decision_preserves_forced_refreshes() {
let current_info = ProbeInfo {
app_server_version: "0.1.0".to_string(),
};
assert_eq!(
[
restart_decision(
RestartMode::IfVersionChanged,
Some(&current_info),
Some("0.1.0"),
),
restart_decision(
RestartMode::IfVersionChanged,
/*info*/ None,
/*managed_version*/ None,
),
restart_decision(RestartMode::Always, Some(&current_info), Some("0.1.0")),
restart_decision(
RestartMode::Always,
/*info*/ None,
/*managed_version*/ None,
),
],
[
RestartDecision::AlreadyCurrent,
RestartDecision::NotReady,
RestartDecision::Restart,
RestartDecision::Restart,
]
);
}
#[test]
fn remote_control_start_output_serializes_inner_output_without_tag() {
let lifecycle_output = LifecycleOutput {
status: LifecycleStatus::AlreadyRunning,
backend: Some(BackendKind::Pid),
pid: None,
socket_path: "codex.sock".into(),
cli_version: Some("1.2.3".to_string()),
app_server_version: Some("1.2.4".to_string()),
};
let output = RemoteControlStartOutput::Start(lifecycle_output.clone());
assert_eq!(
serde_json::to_value(output).expect("serialize"),
serde_json::to_value(lifecycle_output).expect("serialize")
);
let bootstrap_output = BootstrapOutput {
status: BootstrapStatus::Bootstrapped,
backend: BackendKind::Pid,
auto_update_enabled: true,
remote_control_enabled: true,
managed_codex_path: "codex".into(),
socket_path: "codex.sock".into(),
cli_version: "1.2.3".to_string(),
app_server_version: "1.2.4".to_string(),
};
let output = RemoteControlStartOutput::Bootstrap(bootstrap_output.clone());
assert_eq!(
serde_json::to_value(output).expect("serialize"),
serde_json::to_value(bootstrap_output).expect("serialize")
);
}
}

View File

@@ -0,0 +1,103 @@
use std::path::Path;
use std::path::PathBuf;
#[cfg(unix)]
use anyhow::Context;
#[cfg(unix)]
use anyhow::Result;
#[cfg(unix)]
use anyhow::anyhow;
#[cfg(unix)]
use sha2::Digest;
#[cfg(unix)]
use sha2::Sha256;
#[cfg(unix)]
use tokio::fs;
#[cfg(unix)]
use tokio::process::Command;
pub(crate) fn managed_codex_bin(codex_home: &Path) -> PathBuf {
codex_home
.join("packages")
.join("standalone")
.join("current")
.join(managed_codex_file_name())
}
#[cfg(unix)]
pub(crate) async fn resolved_managed_codex_bin(codex_bin: &Path) -> Result<PathBuf> {
fs::canonicalize(codex_bin).await.with_context(|| {
format!(
"failed to resolve managed Codex binary {}",
codex_bin.display()
)
})
}
#[cfg(unix)]
pub(crate) async fn managed_codex_version(codex_bin: &Path) -> Result<String> {
let output = Command::new(codex_bin)
.arg("--version")
.output()
.await
.with_context(|| {
format!(
"failed to invoke managed Codex binary {}",
codex_bin.display()
)
})?;
if !output.status.success() {
return Err(anyhow!(
"managed Codex binary {} exited with status {}",
codex_bin.display(),
output.status
));
}
let stdout = String::from_utf8(output.stdout).with_context(|| {
format!(
"managed Codex version was not utf-8: {}",
codex_bin.display()
)
})?;
parse_codex_version(&stdout)
}
#[cfg(unix)]
#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) struct ExecutableIdentity {
digest: [u8; 32],
}
#[cfg(unix)]
pub(crate) async fn executable_identity(executable: &Path) -> Result<ExecutableIdentity> {
let bytes = fs::read(executable)
.await
.with_context(|| format!("failed to read executable {}", executable.display()))?;
Ok(executable_identity_from_bytes(&bytes))
}
#[cfg(unix)]
pub(crate) fn executable_identity_from_bytes(bytes: &[u8]) -> ExecutableIdentity {
ExecutableIdentity {
digest: Sha256::digest(bytes).into(),
}
}
fn managed_codex_file_name() -> &'static str {
if cfg!(windows) { "codex.exe" } else { "codex" }
}
#[cfg(unix)]
fn parse_codex_version(output: &str) -> Result<String> {
let version = output
.split_whitespace()
.nth(1)
.filter(|version| !version.is_empty())
.ok_or_else(|| anyhow!("managed Codex version output was malformed"))?;
Ok(version.to_string())
}
#[cfg(all(test, unix))]
#[path = "managed_install_tests.rs"]
mod tests;

View File

@@ -0,0 +1,27 @@
use pretty_assertions::assert_eq;
use super::executable_identity_from_bytes;
use super::parse_codex_version;
#[test]
fn parses_codex_cli_version_output() {
assert_eq!(
parse_codex_version("codex 1.2.3\n").expect("version"),
"1.2.3"
);
}
#[test]
fn rejects_malformed_codex_cli_version_output() {
assert!(parse_codex_version("codex\n").is_err());
}
#[test]
fn executable_identity_uses_binary_contents() {
let old = executable_identity_from_bytes(b"old");
let same = executable_identity_from_bytes(b"old");
let new = executable_identity_from_bytes(b"new");
assert_eq!(old, same);
assert_ne!(old, new);
}

View File

@@ -0,0 +1,63 @@
use std::path::Path;
use anyhow::Context;
use anyhow::Result;
use serde::Deserialize;
use serde::Serialize;
use tokio::fs;
#[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub(crate) struct DaemonSettings {
pub(crate) remote_control_enabled: bool,
}
impl DaemonSettings {
pub(crate) async fn load(path: &Path) -> Result<Self> {
let contents = match fs::read_to_string(path).await {
Ok(contents) => contents,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => return Ok(Self::default()),
Err(err) => {
return Err(err)
.with_context(|| format!("failed to read daemon settings {}", path.display()));
}
};
serde_json::from_str(&contents)
.with_context(|| format!("failed to parse daemon settings {}", path.display()))
}
pub(crate) async fn save(&self, path: &Path) -> Result<()> {
if let Some(parent) = path.parent() {
fs::create_dir_all(parent).await.with_context(|| {
format!(
"failed to create daemon settings directory {}",
parent.display()
)
})?;
}
let contents = serde_json::to_vec_pretty(self).context("failed to serialize settings")?;
fs::write(path, contents)
.await
.with_context(|| format!("failed to write daemon settings {}", path.display()))
}
}
#[cfg(all(test, unix))]
mod tests {
use pretty_assertions::assert_eq;
use super::DaemonSettings;
#[test]
fn daemon_settings_use_camel_case_json() {
assert_eq!(
serde_json::to_string(&DaemonSettings {
remote_control_enabled: true,
})
.expect("serialize"),
r#"{"remoteControlEnabled":true}"#
);
}
}

View File

@@ -0,0 +1,197 @@
#[cfg(unix)]
use std::process::Command as StdCommand;
#[cfg(unix)]
use std::process::Stdio;
#[cfg(unix)]
use std::time::Duration;
#[cfg(unix)]
use anyhow::Context;
use anyhow::Result;
#[cfg(not(unix))]
use anyhow::bail;
#[cfg(unix)]
use futures::FutureExt;
#[cfg(unix)]
use std::os::unix::process::CommandExt;
#[cfg(unix)]
use tokio::io::AsyncWriteExt;
#[cfg(unix)]
use tokio::process::Command;
#[cfg(unix)]
use tokio::signal::unix::Signal;
#[cfg(unix)]
use tokio::signal::unix::SignalKind;
#[cfg(unix)]
use tokio::signal::unix::signal;
#[cfg(unix)]
use tokio::time::sleep;
#[cfg(unix)]
use crate::Daemon;
#[cfg(unix)]
use crate::RestartIfRunningOutcome;
#[cfg(unix)]
use crate::RestartMode;
#[cfg(unix)]
use crate::UpdaterRefreshMode;
#[cfg(unix)]
use crate::managed_install::ExecutableIdentity;
#[cfg(unix)]
use crate::managed_install::executable_identity;
#[cfg(unix)]
use crate::managed_install::resolved_managed_codex_bin;
#[cfg(unix)]
const INITIAL_UPDATE_DELAY: Duration = Duration::from_secs(5 * 60);
#[cfg(unix)]
const RESTART_RETRY_INTERVAL: Duration = Duration::from_millis(50);
#[cfg(unix)]
const UPDATE_INTERVAL: Duration = Duration::from_secs(60 * 60);
#[cfg(unix)]
pub(crate) async fn run() -> Result<()> {
let mut terminate =
signal(SignalKind::terminate()).context("failed to install updater shutdown handler")?;
let running_updater_identity = current_updater_identity().await?;
if sleep_or_terminate(INITIAL_UPDATE_DELAY, &mut terminate).await {
return Ok(());
}
loop {
match update_once(&running_updater_identity, &mut terminate).await {
Ok(UpdateLoopControl::Continue) | Err(_) => {}
Ok(UpdateLoopControl::Stop) => return Ok(()),
}
if sleep_or_terminate(UPDATE_INTERVAL, &mut terminate).await {
return Ok(());
}
}
}
#[cfg(not(unix))]
pub(crate) async fn run() -> Result<()> {
bail!("pid-managed updater loop is unsupported on this platform")
}
#[cfg(unix)]
async fn sleep_or_terminate(duration: Duration, terminate: &mut Signal) -> bool {
tokio::select! {
_ = sleep(duration) => false,
_ = terminate.recv() => true,
}
}
#[cfg(unix)]
enum UpdateLoopControl {
Continue,
Stop,
}
#[cfg(unix)]
async fn update_once(
running_updater_identity: &ExecutableIdentity,
terminate: &mut Signal,
) -> Result<UpdateLoopControl> {
install_latest_standalone().await?;
let daemon = Daemon::from_environment()?;
let managed_codex_bin = resolved_managed_codex_bin(&daemon.managed_codex_bin).await?;
let managed_identity = executable_identity(&managed_codex_bin).await?;
let (restart_mode, updater_refresh_mode) =
update_modes_for_identities(running_updater_identity, &managed_identity);
loop {
if terminate.recv().now_or_never().flatten().is_some() {
return Ok(UpdateLoopControl::Stop);
}
match daemon
.try_restart_if_running(restart_mode, updater_refresh_mode, &managed_codex_bin)
.await?
{
RestartIfRunningOutcome::Busy => {
if sleep_or_terminate(RESTART_RETRY_INTERVAL, terminate).await {
return Ok(UpdateLoopControl::Stop);
}
}
_ => return Ok(UpdateLoopControl::Continue),
}
}
}
#[cfg(unix)]
async fn current_updater_identity() -> Result<ExecutableIdentity> {
let current_exe =
std::env::current_exe().context("failed to resolve current updater executable")?;
executable_identity(&current_exe).await
}
#[cfg(unix)]
fn update_modes_for_identities(
running_updater_identity: &ExecutableIdentity,
managed_identity: &ExecutableIdentity,
) -> (RestartMode, UpdaterRefreshMode) {
if running_updater_identity == managed_identity {
(RestartMode::IfVersionChanged, UpdaterRefreshMode::None)
} else {
(
RestartMode::Always,
UpdaterRefreshMode::ReexecIfManagedBinaryChanged,
)
}
}
#[cfg(unix)]
pub(crate) fn reexec_managed_updater(managed_codex_bin: &std::path::Path) -> Result<()> {
let err = StdCommand::new(managed_codex_bin)
.args(["app-server", "daemon", "pid-update-loop"])
.exec();
Err(err).with_context(|| {
format!(
"failed to replace updater with managed Codex binary {}",
managed_codex_bin.display()
)
})
}
#[cfg(unix)]
async fn install_latest_standalone() -> Result<()> {
let script = reqwest::get("https://chatgpt.com/codex/install.sh")
.await
.context("failed to fetch standalone Codex updater")?
.error_for_status()
.context("standalone Codex updater request failed")?
.bytes()
.await
.context("failed to read standalone Codex updater")?;
let mut child = Command::new("/bin/sh")
.arg("-s")
.stdin(Stdio::piped())
.stdout(Stdio::null())
.stderr(Stdio::null())
.spawn()
.context("failed to invoke standalone Codex updater")?;
let mut stdin = child
.stdin
.take()
.context("standalone Codex updater stdin was unavailable")?;
stdin
.write_all(&script)
.await
.context("failed to pass standalone Codex updater to shell")?;
drop(stdin);
let status = child
.wait()
.await
.context("failed to wait for standalone Codex updater")?;
if status.success() {
Ok(())
} else {
anyhow::bail!("standalone Codex updater exited with status {status}")
}
}
#[cfg(all(test, unix))]
#[path = "update_loop_tests.rs"]
mod tests;

View File

@@ -0,0 +1,31 @@
use pretty_assertions::assert_eq;
use super::update_modes_for_identities;
use crate::RestartMode;
use crate::UpdaterRefreshMode;
use crate::managed_install::executable_identity_from_bytes;
#[test]
fn unchanged_updater_uses_version_based_restart() {
assert_eq!(
update_modes_for_identities(
&executable_identity_from_bytes(b"same"),
&executable_identity_from_bytes(b"same"),
),
(RestartMode::IfVersionChanged, UpdaterRefreshMode::None)
);
}
#[test]
fn changed_updater_forces_refresh_even_when_version_may_match() {
assert_eq!(
update_modes_for_identities(
&executable_identity_from_bytes(b"old"),
&executable_identity_from_bytes(b"new"),
),
(
RestartMode::Always,
UpdaterRefreshMode::ReexecIfManagedBinaryChanged,
)
);
}

View File

@@ -7,6 +7,7 @@ license.workspace = true
[lib]
name = "codex_app_server_protocol"
path = "src/lib.rs"
doctest = false
[lints]
workspace = true

View File

@@ -0,0 +1,5 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "AttestationGenerateParams",
"type": "object"
}

View File

@@ -1,14 +1,14 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "Fetch a controller-local device key public key by id.",
"properties": {
"keyId": {
"token": {
"description": "Opaque client attestation token.",
"type": "string"
}
},
"required": [
"keyId"
"token"
],
"title": "DeviceKeyPublicParams",
"title": "AttestationGenerateResponse",
"type": "object"
}

File diff suppressed because it is too large Load Diff

View File

@@ -593,6 +593,11 @@
"null"
]
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this approval request started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -602,6 +607,7 @@
},
"required": [
"itemId",
"startedAtMs",
"threadId",
"turnId"
],

View File

@@ -18,6 +18,11 @@
"null"
]
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this approval request started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -27,6 +32,7 @@
},
"required": [
"itemId",
"startedAtMs",
"threadId",
"turnId"
],

View File

@@ -297,6 +297,11 @@
"null"
]
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this approval request started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -308,6 +313,7 @@
"cwd",
"itemId",
"permissions",
"startedAtMs",
"threadId",
"turnId"
],

View File

@@ -1032,6 +1032,7 @@
"type": "object"
},
"FileChangeOutputDeltaNotification": {
"description": "Deprecated legacy notification for `apply_patch` textual output.\n\nThe server no longer emits this notification.",
"properties": {
"delta": {
"type": "string"
@@ -1735,6 +1736,8 @@
"preToolUse",
"permissionRequest",
"postToolUse",
"preCompact",
"postCompact",
"sessionStart",
"userPromptSubmit",
"stop"
@@ -1901,6 +1904,7 @@
"mdm",
"sessionFlags",
"plugin",
"cloudRequirements",
"legacyManagedConfigFile",
"legacyManagedConfigMdm",
"unknown"
@@ -1928,8 +1932,20 @@
],
"type": "object"
},
"ImageDetail": {
"enum": [
"high",
"original"
],
"type": "string"
},
"ItemCompletedNotification": {
"properties": {
"completedAtMs": {
"description": "Unix timestamp (in milliseconds) when this item lifecycle completed.",
"format": "int64",
"type": "integer"
},
"item": {
"$ref": "#/definitions/ThreadItem"
},
@@ -1941,6 +1957,7 @@
}
},
"required": [
"completedAtMs",
"item",
"threadId",
"turnId"
@@ -1953,6 +1970,11 @@
"action": {
"$ref": "#/definitions/GuardianApprovalReviewAction"
},
"completedAtMs": {
"description": "Unix timestamp (in milliseconds) when this review completed.",
"format": "int64",
"type": "integer"
},
"decisionSource": {
"$ref": "#/definitions/AutoReviewDecisionSource"
},
@@ -1963,6 +1985,11 @@
"description": "Stable identifier for this review.",
"type": "string"
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this review started.",
"format": "int64",
"type": "integer"
},
"targetItemId": {
"description": "Identifier for the reviewed item or tool call when one exists.\n\nIn most cases, one review maps to one target item. The exceptions are - execve reviews, where a single command may contain multiple execve calls to review (only possible when using the shell_zsh_fork feature) - network policy reviews, where there is no target item\n\nA network call is triggered by a CommandExecution item, so having a target_item_id set to the CommandExecution item would be misleading because the review is about the network call, not the command execution. Therefore, target_item_id is set to None for network policy reviews.",
"type": [
@@ -1979,9 +2006,11 @@
},
"required": [
"action",
"completedAtMs",
"decisionSource",
"review",
"reviewId",
"startedAtMs",
"threadId",
"turnId"
],
@@ -2000,6 +2029,11 @@
"description": "Stable identifier for this review.",
"type": "string"
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this review started.",
"format": "int64",
"type": "integer"
},
"targetItemId": {
"description": "Identifier for the reviewed item or tool call when one exists.\n\nIn most cases, one review maps to one target item. The exceptions are - execve reviews, where a single command may contain multiple execve calls to review (only possible when using the shell_zsh_fork feature) - network policy reviews, where there is no target item\n\nA network call is triggered by a CommandExecution item, so having a target_item_id set to the CommandExecution item would be misleading because the review is about the network call, not the command execution. Therefore, target_item_id is set to None for network policy reviews.",
"type": [
@@ -2018,6 +2052,7 @@
"action",
"review",
"reviewId",
"startedAtMs",
"threadId",
"turnId"
],
@@ -2028,6 +2063,11 @@
"item": {
"$ref": "#/definitions/ThreadItem"
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this item lifecycle started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -2037,6 +2077,7 @@
},
"required": [
"item",
"startedAtMs",
"threadId",
"turnId"
],
@@ -2401,6 +2442,96 @@
],
"type": "string"
},
"ProcessExitedNotification": {
"description": "Final process exit notification for `process/spawn`.",
"properties": {
"exitCode": {
"description": "Process exit code.",
"format": "int32",
"type": "integer"
},
"processHandle": {
"description": "Client-supplied, connection-scoped `processHandle` from `process/spawn`.",
"type": "string"
},
"stderr": {
"description": "Buffered stderr capture.\n\nEmpty when stderr was streamed via `process/outputDelta`.",
"type": "string"
},
"stderrCapReached": {
"description": "Whether stderr reached `outputBytesCap`.\n\nIn streaming mode, stderr is empty and cap state is also reported on the final stderr `process/outputDelta` notification.",
"type": "boolean"
},
"stdout": {
"description": "Buffered stdout capture.\n\nEmpty when stdout was streamed via `process/outputDelta`.",
"type": "string"
},
"stdoutCapReached": {
"description": "Whether stdout reached `outputBytesCap`.\n\nIn streaming mode, stdout is empty and cap state is also reported on the final stdout `process/outputDelta` notification.",
"type": "boolean"
}
},
"required": [
"exitCode",
"processHandle",
"stderr",
"stderrCapReached",
"stdout",
"stdoutCapReached"
],
"type": "object"
},
"ProcessOutputDeltaNotification": {
"description": "Base64-encoded output chunk emitted for a streaming `process/spawn` request.",
"properties": {
"capReached": {
"description": "True on the final streamed chunk for this stream when output was truncated by `outputBytesCap`.",
"type": "boolean"
},
"deltaBase64": {
"description": "Base64-encoded output bytes.",
"type": "string"
},
"processHandle": {
"description": "Client-supplied, connection-scoped `processHandle` from `process/spawn`.",
"type": "string"
},
"stream": {
"allOf": [
{
"$ref": "#/definitions/ProcessOutputStream"
}
],
"description": "Output stream this chunk belongs to."
}
},
"required": [
"capReached",
"deltaBase64",
"processHandle",
"stream"
],
"type": "object"
},
"ProcessOutputStream": {
"description": "Stream label for `process/outputDelta` notifications.",
"oneOf": [
{
"description": "stdout stream. PTY mode multiplexes terminal output here.",
"enum": [
"stdout"
],
"type": "string"
},
{
"description": "stderr stream.",
"enum": [
"stderr"
],
"type": "string"
}
]
},
"RateLimitReachedType": {
"enum": [
"rate_limit_reached",
@@ -2613,7 +2744,7 @@
"type": "string"
},
"RemoteControlStatusChangedNotification": {
"description": "Current remote-control connection status and environment id exposed to clients.",
"description": "Current remote-control connection status and remote identity exposed to clients.",
"properties": {
"environmentId": {
"type": [
@@ -2621,11 +2752,19 @@
"null"
]
},
"installationId": {
"type": "string"
},
"serverName": {
"type": "string"
},
"status": {
"$ref": "#/definitions/RemoteControlConnectionStatus"
}
},
"required": [
"installationId",
"serverName",
"status"
],
"type": "object"
@@ -2968,6 +3107,10 @@
"description": "Usually the first user message in the thread, if available.",
"type": "string"
},
"sessionId": {
"description": "Session id shared by threads that belong to the same session tree.",
"type": "string"
},
"source": {
"allOf": [
{
@@ -2984,6 +3127,17 @@
],
"description": "Current runtime status for the thread."
},
"threadSource": {
"anyOf": [
{
"$ref": "#/definitions/ThreadSource"
},
{
"type": "null"
}
],
"description": "Optional analytics source classification for this thread."
},
"turns": {
"description": "Only populated on `thread/resume`, `thread/rollback`, `thread/fork`, and `thread/read` (when `includeTurns` is true) responses. For all other responses and notifications returning a Thread, the turns field will be an empty list.",
"items": {
@@ -3005,6 +3159,7 @@
"id",
"modelProvider",
"preview",
"sessionId",
"source",
"status",
"turns",
@@ -3929,7 +4084,7 @@
"ThreadRealtimeStartedNotification": {
"description": "EXPERIMENTAL - emitted when thread realtime startup is accepted.",
"properties": {
"sessionId": {
"realtimeSessionId": {
"type": [
"string",
"null"
@@ -3990,6 +4145,14 @@
],
"type": "object"
},
"ThreadSource": {
"enum": [
"user",
"subagent",
"memory_consolidation"
],
"type": "string"
},
"ThreadStartedNotification": {
"properties": {
"thread": {
@@ -4208,12 +4371,21 @@
"type": "string"
},
"items": {
"description": "Only populated on a `thread/resume` or `thread/fork` response. For all other responses and notifications returning a Turn, the items field will be an empty list.",
"description": "Thread items currently included in this turn payload.",
"items": {
"$ref": "#/definitions/ThreadItem"
},
"type": "array"
},
"itemsView": {
"allOf": [
{
"$ref": "#/definitions/TurnItemsView"
}
],
"default": "full",
"description": "Describes how much of `items` has been loaded for this turn."
},
"startedAt": {
"description": "Unix timestamp (in seconds) when the turn started.",
"format": "int64",
@@ -4296,6 +4468,31 @@
],
"type": "object"
},
"TurnItemsView": {
"oneOf": [
{
"description": "`items` was not loaded for this turn. The field is intentionally empty.",
"enum": [
"notLoaded"
],
"type": "string"
},
{
"description": "`items` contains only a display summary for this turn.",
"enum": [
"summary"
],
"type": "string"
},
{
"description": "`items` contains every ThreadItem available from persisted app-server history for this turn.",
"enum": [
"full"
],
"type": "string"
}
]
},
"TurnPlanStep": {
"properties": {
"status": {
@@ -4403,6 +4600,17 @@
},
{
"properties": {
"detail": {
"anyOf": [
{
"$ref": "#/definitions/ImageDetail"
},
{
"type": "null"
}
],
"default": null
},
"type": {
"enum": [
"image"
@@ -4423,6 +4631,17 @@
},
{
"properties": {
"detail": {
"anyOf": [
{
"$ref": "#/definitions/ImageDetail"
},
{
"type": "null"
}
],
"default": null
},
"path": {
"type": "string"
},
@@ -5149,6 +5368,48 @@
"title": "Command/exec/outputDeltaNotification",
"type": "object"
},
{
"description": "Stream base64-encoded stdout/stderr chunks for a running `process/spawn` session.",
"properties": {
"method": {
"enum": [
"process/outputDelta"
],
"title": "Process/outputDeltaNotificationMethod",
"type": "string"
},
"params": {
"$ref": "#/definitions/ProcessOutputDeltaNotification"
}
},
"required": [
"method",
"params"
],
"title": "Process/outputDeltaNotification",
"type": "object"
},
{
"description": "Final exit notification for a `process/spawn` session.",
"properties": {
"method": {
"enum": [
"process/exited"
],
"title": "Process/exitedNotificationMethod",
"type": "string"
},
"params": {
"$ref": "#/definitions/ProcessExitedNotification"
}
},
"required": [
"method",
"params"
],
"title": "Process/exitedNotification",
"type": "object"
},
{
"properties": {
"method": {
@@ -5190,6 +5451,7 @@
"type": "object"
},
{
"description": "Deprecated legacy apply_patch output stream notification.",
"properties": {
"method": {
"enum": [

View File

@@ -121,6 +121,9 @@
],
"type": "object"
},
"AttestationGenerateParams": {
"type": "object"
},
"ChatgptAuthTokensRefreshParams": {
"properties": {
"previousAccountId": {
@@ -417,6 +420,11 @@
"null"
]
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this approval request started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -426,6 +434,7 @@
},
"required": [
"itemId",
"startedAtMs",
"threadId",
"turnId"
],
@@ -598,6 +607,11 @@
"null"
]
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this approval request started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -607,6 +621,7 @@
},
"required": [
"itemId",
"startedAtMs",
"threadId",
"turnId"
],
@@ -1587,6 +1602,11 @@
"null"
]
},
"startedAtMs": {
"description": "Unix timestamp (in milliseconds) when this approval request started.",
"format": "int64",
"type": "integer"
},
"threadId": {
"type": "string"
},
@@ -1598,6 +1618,7 @@
"cwd",
"itemId",
"permissions",
"startedAtMs",
"threadId",
"turnId"
],
@@ -1900,6 +1921,31 @@
"title": "Account/chatgptAuthTokens/refreshRequest",
"type": "object"
},
{
"description": "Generate a fresh upstream attestation result on demand.",
"properties": {
"id": {
"$ref": "#/definitions/RequestId"
},
"method": {
"enum": [
"attestation/generate"
],
"title": "Attestation/generateRequestMethod",
"type": "string"
},
"params": {
"$ref": "#/definitions/AttestationGenerateParams"
}
},
"required": [
"id",
"method",
"params"
],
"title": "Attestation/generateRequest",
"type": "object"
},
{
"description": "DEPRECATED APIs below Request to approve a patch. This request is used for Turns started via the legacy APIs (i.e. SendUserTurn, SendUserMessage).",
"properties": {

View File

@@ -39,6 +39,11 @@
"array",
"null"
]
},
"requestAttestation": {
"default": false,
"description": "Opt into `attestation/generate` requests for upstream `x-oai-attestation`.",
"type": "boolean"
}
},
"type": "object"

View File

@@ -228,6 +228,13 @@
"null"
]
},
"desktop": {
"additionalProperties": true,
"type": [
"object",
"null"
]
},
"developer_instructions": {
"type": [
"string",
@@ -235,9 +242,13 @@
]
},
"forced_chatgpt_workspace_id": {
"type": [
"string",
"null"
"anyOf": [
{
"$ref": "#/definitions/ForcedChatgptWorkspaceIds"
},
{
"type": "null"
}
]
},
"forced_login_method": {
@@ -352,13 +363,9 @@
]
},
"service_tier": {
"anyOf": [
{
"$ref": "#/definitions/ServiceTier"
},
{
"type": "null"
}
"type": [
"string",
"null"
]
},
"tools": {
@@ -486,6 +493,13 @@
],
"description": "This is the path to the user's config.toml file, though it is not guaranteed to exist."
},
"profile": {
"description": "Name of the selected profile-v2 config layered on top of the base user config, when this layer represents one.",
"type": [
"string",
"null"
]
},
"type": {
"enum": [
"user"
@@ -578,6 +592,20 @@
}
]
},
"ForcedChatgptWorkspaceIds": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
],
"description": "Backward-compatible API shape for ChatGPT workspace login restrictions."
},
"ForcedLoginMethod": {
"enum": [
"chatgpt",
@@ -658,13 +686,9 @@
]
},
"service_tier": {
"anyOf": [
{
"$ref": "#/definitions/ServiceTier"
},
{
"type": "null"
}
"type": [
"string",
"null"
]
},
"tools": {
@@ -754,21 +778,8 @@
},
"type": "object"
},
"ServiceTier": {
"enum": [
"fast",
"flex"
],
"type": "string"
},
"ToolsV2": {
"properties": {
"view_image": {
"type": [
"boolean",
"null"
]
},
"web_search": {
"anyOf": [
{

Some files were not shown because too many files have changed in this diff Show More