Disable the sccache daemon idle timeout in rust-ci-full so long test phases can still report the compile-cache stats collected during the build phase.
Co-authored-by: Codex <noreply@openai.com>
Let Windows rust-ci-full jobs use sccache again, store the fallback cache on the configured work drive, and set Cargo's rustc wrapper to an absolute sccache path so Windows subprocesses resolve it consistently.
Co-authored-by: Codex <noreply@openai.com>
Configure Windows Rust CI jobs and the shared Bazel CI setup to put temp, repository-cache, and output-root paths on the runner's fast work drive when available. Fall back to C: if no secondary drive or Dev Drive provisioning path is available.
Co-authored-by: Codex <noreply@openai.com>
Let the Windows arm64 test matrix use a longer timeout after CI showed the lane spending most of the default 45 minutes compiling before nextest could finish.
Also pin nextest through taiki-e/install-action's supported tool version syntax so the requested version is not ignored.
Co-authored-by: Codex <noreply@openai.com>
Windows rust-ci-full repeatedly times out in subprocess-heavy tests even when the global nextest thread count is capped. Isolate the recurring Windows-only families with nextest overrides so the rest of the suite can keep normal parallelism.
Co-authored-by: Codex <noreply@openai.com>
Replace the realtime websocket accept-delay race with an explicit test-server gate so close is issued while the sideband connection is pending, then prove the closed conversation does not emit stale events or send sideband websocket requests.
Co-authored-by: Codex <noreply@openai.com>
Keep polling when Windows temporarily denies metadata reads while the phase 2 memory workspace is being cleaned up, so the test still verifies the file is removed and the baseline becomes clean.
Co-authored-by: Codex <noreply@openai.com>
Subscribe before test shutdown and close operations, then wait for the Shutdown status before resuming the same thread IDs. This removes the Windows live-writer race exposed by the full nextest run.
Co-authored-by: Codex <noreply@openai.com>
Use a Windows command form that exits successfully in constrained CI shells and trim the expected newline in the delegated realtime shell-tool assertion.
Co-authored-by: Codex <noreply@openai.com>
Avoid PowerShell command forms that depend on method invocation for the delegated realtime shell-tool test, and wait for a shutdown status before resuming the same subagent thread in the nickname/role restore test.
Co-authored-by: Codex <noreply@openai.com>
The legacy sandbox runs PowerShell in constrained language mode, so method calls fail and module-backed cmdlets may not autoload. Use literal string expressions for the PowerShell I/O smoke tests so they exercise process output without depending on cmdlets or method invocation.
Co-authored-by: Codex <noreply@openai.com>
Windows arm64 can launch pwsh in the legacy sandbox while still failing Write-Output because Microsoft.PowerShell.Utility cannot autoload. Use Console output in the legacy PowerShell smoke tests so they continue to verify sandbox process I/O without depending on module autoload.
Co-authored-by: Codex <noreply@openai.com>
Dev Drive setup can put temporary Codex homes on D:, which exposed test fixtures that wrote root-relative '/' rollout cwd values while assertions expected the Windows-aware C:\ root helper. Use the same test_path_buf helper when creating and expecting fake rollout cwd values so the tests remain independent of the process temp drive.
Co-authored-by: Codex <noreply@openai.com>
Use the existing mock server as the sideband failure endpoint instead of relying on an OS-level connection refusal from 127.0.0.1:1. Disable retries in this failure-path test so Windows CI does not spend the default retry budget before emitting the expected error/close events.
Co-authored-by: Codex <noreply@openai.com>
Claim job items before spawning workers and allow reports to complete unassigned running items, so fast workers cannot lose stop=true reports before the parent records their thread id.
Co-authored-by: Codex <noreply@openai.com>
A worker stop request used to record the item result and job cancellation in separate updates, so the job runner could observe the item completion first and continue spawning pending work. Commit both state updates together and prevent completion from overwriting a final cancellation.
Co-authored-by: Codex <noreply@openai.com>
This builds on top of https://github.com/openai/codex/pull/15828 by
ensuring that hash-pinned actions with version comments are fully
qualified, rather than referencing floating/mutable comments like "v7".
This makes actions management tools behave more consistently.
This shouldn't break anything, since it's comment only. But if it does,
ping ww@ 🙂
A clean release build takes ~18m and an incremental build takes ~12m.
This is far too slow to iterate on performance related changes and the
build time is dominated by LTO.
This pull request adds a `profiling` profile for Cargo which takes ~13m
clean and ~6m incremental, the primary change is that LTO is disabled.
This matches a profile used in uv and follows the great work at
https://github.com/astral-sh/uv/pull/5955 — there's a bit of commentary
there about the trade-offs this implies.
We've found that this does not inhibit the ability to accurately
benchmark as measurements with LTO disabled are generally consistent
with the results with LTO enabled and it makes it much faster (~2x) to
rebuild after making a change.
This is motivated by my interest in improving Codex TUI performance,
which is blocked by the tragically builds right now.
I tested incremental build times by making a no-op change to the
`codex-cli` crate.
## Why
Codex desktop copies bundled Windows binaries out of `WindowsApps` into
a LocalAppData runtime cache before launching `codex.exe`. Sandboxed
commands can then need to execute helpers from that cache, but the
sandbox user group may not have read/execute access to the runtime bin
directory.
This makes the Windows sandbox refresh path repair that access directly
so the packaged desktop runtime remains usable from sandboxed sessions.
## What changed
- Added `setup_runtime_bin` to locate `%LOCALAPPDATA%\OpenAI\Codex\bin`,
matching the desktop bundled-binaries destination path, with the same
`USERPROFILE\AppData\Local` fallback shape.
- During refresh setup, check whether `CodexSandboxUsers` already has
read/execute access to the runtime bin directory.
- If access is missing, grant `CodexSandboxUsers` `OI/CI/RX` inheritance
on that directory.
- If the runtime bin directory does not exist, no-op cleanly.
## Verification
- `cargo build -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- `cargo test -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
- Manual Windows ACL exercise against the installed packaged runtime
bin:
- existing inherited `CodexSandboxUsers:(I)(OI)(CI)(RX)` no-ops without
changing SDDL
- after disabling inheritance and removing the group ACE, setup adds
`CodexSandboxUsers:(OI)(CI)(RX)`
- with `LOCALAPPDATA` pointed at a fake location without
`OpenAI\Codex\bin`, setup exits successfully and does not create the
directory
- restored the real runtime bin with inherited ACLs and confirmed the
final SDDL matched the baseline exactly
- Route ThreadManager rollout-path resume/fork through ThreadStore
history reads.
- Add in-memory store coverage proving path-addressed reads are used.
This isn't strictly necessary for the ThreadStore migration, since these
ThreadManager methods _only_ work for path-based lookups, but I'm trying
to migrate all the rollout recorder callsites to use the threadstore
were possible for consistency.
## Summary
Fix `getConversationSummary` so thread-id summaries work for stored
threads that do not have a local rollout path, such as remote thread
stores.
The root cause was that `summary_from_stored_thread` returned `None`
when `StoredThread.rollout_path` was absent, and
`get_thread_summary_response_inner` treated that as an internal error.
This made conversation-id lookups depend on a local-only field even
though the thread store can address the thread by id.
- Route live thread renames through `ThreadStore` metadata updates.
- Read resumed thread names from store metadata with legacy local
fallback preserved in the store.
## Why
This is the next stacked step after deleting the tool-handler kind
indirection. Specs should come from the registered handlers themselves
so registry construction has a single source of truth for handler
behavior and exposed tool definitions.
## What changed
- Added `ToolHandler::spec()` plus handler-provided parallel/code-mode
metadata, and made `ToolRegistryBuilder::register_handler` automatically
collect specs from registered handlers.
- Moved builtin tool spec construction into the corresponding handlers
and their adjacent `_spec` modules, including shell, unified exec, apply
patch, view image, request plugin install, tool search, MCP resource,
goals, planning, permissions, agent jobs, and multi-agent tools.
- Reworked configurable handlers to receive their tool-building options
through constructors, with non-optional handler options where the
handler is always spec-backed. Shell fallback handlers keep an explicit
no-spec mode because they are also registered as hidden dispatch
aliases.
- Kept `CodeModeExecuteHandler` on the explicit configured wrapper so
the code-mode exec spec can still be built from the nested registry.
## Verification
- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `cargo test -p codex-core tools::handlers::multi_agents_spec::tests`
- `RUST_MIN_STACK=16777216 cargo test -p codex-core
tools::handlers::multi_agents::tests`
- `cargo test -p codex-core tools::handlers::apply_patch::tests`
- `cargo test -p codex-core tools::handlers::unified_exec::tests`
- `just fix -p codex-core`
- `git diff --check`
## Why
App-server config writes were leaving existing threads partially stale.
After a config mutation, the app-server told each live thread to run
`Op::ReloadUserConfig`, but that path only re-read the user
`config.toml` layer. Settings that came from the app-server's
materialized config snapshot did not propagate to existing threads until
restart.
This change prevent a FS access from `core` for CCA.
## What changed
- add `CodexThread::refresh_runtime_config()` and
`Session::refresh_runtime_config()` so the app-server can push a freshly
rebuilt config snapshot into a live thread
- rebuild the latest config with each thread's `cwd` after config
mutations, then refresh the thread from that snapshot instead of asking
it to reload only `config.toml`
- keep session-static settings unchanged during refresh, while updating
runtime-refreshable state such as the config layer stack,
`tool_suggest`, and derived hook/plugin/skill state
- keep `reload_user_config_layer()` as the file-backed fallback for
legacy local reload flows, but route the shared refresh logic through
the new runtime refresh path
## Testing
- add a session test that verifies `refresh_runtime_config()` rebuilds
hooks from refreshed config
- add a session test that verifies runtime-refreshable fields update
while session-static settings like `model` and `notify` stay unchanged
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
`codex --enable remote_control app-server --listen off` is the current
way to start a headless, remote-controllable app-server, but it is hard
to remember and exposes implementation details.
This adds `codex remote-control` as a friendly top-level wrapper for
that flow. The command starts a foreground app-server with local
transports disabled and enables `remote_control` only for that
invocation.
## Changes
- Add a visible `codex remote-control` CLI subcommand.
- Launch app-server with `AppServerTransport::Off`.
- Append `features.remote_control=true` after root feature toggles so
the explicit command wins over `--disable remote_control`.
- Reject root `--remote` / `--remote-auth-token-env`, matching other
non-TUI subcommands.
- Add tests for parsing, launch defaults, override ordering, and remote
flag rejection.
## Verification
- `cargo test -p codex-cli`
- `just fix -p codex-cli`
## Summary
This PR removes the synthetic `HashMap<String, ToolInfo>` keys from MCP
tool discovery. `McpConnectionManager::list_all_tools()` now returns
normalized `Vec<ToolInfo>`, and downstream code derives identity from
`ToolInfo::canonical_tool_name()`.
The motivation is to keep model-visible tool identity on
`ToolName`/`ToolInfo` instead of parallel string map keys, so future
namespace changes do not have to preserve otherwise-unused lookup keys.
## Changes
- Rename the MCP normalization path from `qualify_tools` to
`normalize_tools_for_model` and return tool values directly.
- Flow MCP tool lists through connectors, plugin injection, router/spec
building, code mode, and tool search as vectors/slices.
- Keep direct/deferred subtraction local to `mcp_tool_exposure`, using
`ToolName` values.
- Update tests to compare `ToolName` instances where MCP identity
matters.
## Validation
- `cargo test -p codex-mcp test_normalize_tools`
- `cargo test -p codex-core mcp_tool_exposure`
- `cargo test -p codex-core
direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
## Why
We want to emit terminal review analytics for tool-related approval
flows, but the event contract needs to exist before the reducer can
publish anything.
This PR is the schema-only slice for the Codex review event family.
## What changed
- add the `ReviewEvent` analytics envelope in
`codex-rs/analytics/src/events.rs`
- define the review subject kind, reviewer, trigger, terminal status,
and post-review resolution enums
- define the review event payload with thread, turn, item, lineage,
tool, and timing fields that the emitter stack will populate
## Verification
- stacked verification in dependent PRs: `cargo test -p codex-analytics
analytics_client_tests --manifest-path codex-rs/Cargo.toml`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18747).
* #18748
* #21434
* __->__ #18747
* #17090
* #17089
* #20514
## Why
Follow-up to #21180: turn diffs are operation-backed now, but a failed
`apply_patch` can still leave exact filesystem mutations behind. For
example, a move can write the destination file before failing to remove
the source. Treating the whole call as unknowable then drops a change
that Codex actually knows happened, so the emitted turn diff can drift
from the workspace.
## What changed
-
[`apply-patch`](f55724e027/codex-rs/apply-patch/src/lib.rs (L248-L345))
now returns `ApplyPatchFailure` with the exact committed prefix
accumulated before an error. If a write failure may already have mutated
the target, the delta is marked inexact instead of being reused blindly.
- Move handling now records the destination write before attempting
source removal, so a partially failed move can still report the
destination file that definitely landed
([code](f55724e027/codex-rs/apply-patch/src/lib.rs (L463-L521))).
-
[`ApplyPatchRuntime`](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L49-L67))
now accumulates committed deltas across attempts and forwards them even
when the visible tool result is failed or sandbox-denied ([runtime
path](f55724e027/codex-rs/core/src/tools/runtimes/apply_patch.rs (L223-L250)),
[event
path](f55724e027/codex-rs/core/src/tools/events.rs (L215-L225))).
- `TurnDiffTracker` now consumes committed exact deltas rather than only
fully successful patches; exact-empty failures leave the aggregate
unchanged, while inexact deltas still invalidate it.
## Verification
- Added a regression test covering a failed move that still emits the
committed destination diff:
[`apply_patch_failed_move_preserves_committed_destination_diff`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1517-L1586)).
- Kept explicit coverage that an inexact delta clears the aggregate
instead of publishing a guessed diff:
[`apply_patch_clears_aggregated_diff_after_inexact_delta`](f55724e027/codex-rs/core/tests/suite/apply_patch_cli.rs (L1589-L1655)).
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Feedback uploads already carry auth-derived context like
`chatgpt_user_id`, but they do not include the authenticated
workspace/account id. Adding `account_id` makes feedback triage easier
when a user can operate across multiple ChatGPT workspaces.
## What changed
- emit auth-derived `account_id` into feedback tags in `app-server`
before the feedback snapshot is uploaded
- preserve that tag through `codex-feedback` upload tag assembly
alongside the existing merge behavior for other tags
- extend `codex-feedback` coverage to assert that snapshot-derived
`account_id` is present in uploaded tags
## Verification
- `cargo test -p codex-feedback
upload_tags_include_client_tags_and_preserve_reserved_fields`
- `cargo test -p codex-app-server --lib feedback_processor`
## Summary
- replace filesystem-based turn diff tracking with an operation-backed
accumulator
- preserve enough verified apply_patch state to render move-overwrite
cases correctly
- keep the turn/diff/updated contract intact while removing remote-only
turn-diff test skips
This takes the assumption that no 3P services rely on the output format
of `apply_patch`
## Why
For the CCA file system isolation push
---------
Co-authored-by: Codex <noreply@openai.com>
## DISCLAIMER
This is experimental and no production service must rely on this
## Why
Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.
## What changed
- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.
## Test plan
- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`
---------
Co-authored-by: Codex <noreply@openai.com>
Supersedes the abandoned #19859, rebuilt on latest `main`.
# Why
PR #19705 adds discovery for hooks bundled with plugins, but `/plugins`
still only shows skills, apps, and MCP servers. This follow-up makes
bundled hooks visible in the same plugin detail view so users can
inspect the full plugin surface in one place.
We also need `PluginHookSummary` to populate Plugin Hooks in the app;
`hooks/list` is not enough there because plugin detail needs to show
hooks for disabled plugins too.
# What
- extend `plugin/read` with `PluginHookSummary` entries for bundled
hooks
- summarize plugin hooks while loading plugin details
- render a `Hooks` row in the `/plugins` detail popup
<img width="3456" height="848" alt="CleanShot 2026-04-27 at 11 45 34@2x"
src="https://github.com/user-attachments/assets/fe3a38d6-a260-4351-8513-fb04c93d725b"
/>
## Summary
- update the app-server protocol test fixture to include the required
`marketplace_kinds` field on `PluginListParams`
## Why
`PluginListParams` now requires `marketplace_kinds`, but a later-added
test fixture in `common.rs` still constructed the older shape with only
`cwds`. That stale initializer breaks the main build with `missing field
marketplace_kinds`.
## Impact
This is a test-only repair. It restores compilation without changing the
JSON-RPC schema or runtime behavior.
## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
## Why
Reverts #20689 to restore the previous optional state DB plumbing. The
conflict resolution keeps the newer installation ID and session/thread
identity changes that landed after #20689, while removing the mandatory
state DB and agent graph store dependency from ThreadManager
construction.
## What changed
- Restored `Option<StateDbHandle>` through app-server, MCP server,
prompt debug, and test entry points.
- Removed the `codex-core` dependency on `codex-agent-graph-store` and
reverted descendant lookup back to the existing state DB path when
available.
- Kept newer `installation_id` forwarding by passing it beside the
optional DB handle.
- Kept local thread-name updates working when the optional state DB
handle is absent.
## Validation
- `git diff --check`
- `cargo test -p codex-thread-store`
- `cargo test -p codex-state -p codex-rollout -p
codex-app-server-protocol`
- Attempted `env CARGO_INCREMENTAL=0 cargo test -p codex-core -p
codex-app-server -p codex-app-server-client -p codex-mcp-server -p
codex-thread-manager-sample -p codex-tui`; blocked locally by a rustc
ICE while compiling `v8 v146.4.0` with `rustc 1.93.0 (254b59607
2026-01-19)` on `aarch64-apple-darwin`.
## Summary
- process `skills/list` cwd entries with bounded concurrency of 5
- preserve the caller's requested cwd order in the response
- add coverage that verifies response ordering remains stable
## Why
Cold-start desktop traces showed that `skills/list` can dominate the
shared config queue when it scans many workspace roots serially. The
expensive work is largely independent per cwd, so the request was paying
the sum of all cwd costs instead of the cost of the slowest bounded
batch.
## Impact
This keeps current request semantics intact while reducing the
wall-clock time of large multi-root `skills/list` calls. That should
also reduce how long later config-family requests, such as
`plugin/list`, wait behind `skills/list` during startup.
## Validation
- `just fmt`
- `cargo test -p codex-app-server`
- `cargo test -p codex-app-server
skills_list_preserves_requested_cwd_order`
## Summary
- add a shared-read serialization mode for global app-server request
families
- let consecutive leading shared reads for the same family run together
while keeping exclusive requests ordered
- mark only `skills/list`, `config/read` and `plugin/list` as shared
reads for now
## Why
`skills/list` and `plugin/list` are read-only config-family requests,
but the app-server queue currently treats every config request as
exclusive. That means one long `skills/list` can make a later
`plugin/list` wait even though the two requests do not mutate config.
This change keeps the existing queue order but lets adjacent reads
overlap. If a write is already waiting, later reads still stay behind
it, so writes do not starve.
## Scope
This intentionally keeps the first pass narrow:
- shared reads: `skills/list`, `plugin/list`
- still exclusive: `plugin/install`, `marketplace/*`,
`skills/config/write`, `config/*write`, `config/read`, and the rest of
the config family
## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
## Desktop verification
I ran the dev desktop app against this branch's built binary with the
existing UI timing logs enabled. The app did use
`/Users/xli/code/codex_6/codex-rs/target/debug/codex`.
The new scheduler behavior works, but this narrow change does not remove
every cold-start delay: in the observed trace, an earlier exclusive
`config/read` was already queued ahead of the later `skills/list` and
`plugin/list` requests, so the page-open plugin requests still waited
behind that earlier exclusive config-family request before they could
run together.
That means this PR is the scheduler primitive needed for shared reads,
not the complete end-to-end latency fix by itself.
## Not run
- full workspace test suite, because repo policy requires explicit
approval before running it after touching `app-server-protocol`
## Summary
Add `openai-developers@openai-curated` to
`TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` so the OpenAI Developers
plugin can be surfaced through tool suggestions once it is available in
the Built by OpenAI marketplace.
Update the discoverable plugin test fixture to assert the plugin is
returned from the curated marketplace allowlist path.
## Validation
- `cargo fmt --check` passed; rustfmt emitted the existing
stable-channel warnings about `imports_granularity`.
- `cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_uninstalled_curated_plugins`
passed.
## Why
The spec split in the parent PR still left an intermediate registry plan
that recorded `ToolHandlerKind` values and translated them into concrete
handlers later. That kept tool registration dependent on static enum
bookkeeping instead of registering handlers from the same code that
assembles their specs.
## What Changed
- Make `build_tool_registry_builder` register concrete handlers directly
while adding specs.
- Add small `ToolRegistryBuilder` helpers for spec augmentation and
nested code-mode inspection.
- Remove `ToolHandlerKind`, `ToolHandlerSpec`, and `ToolRegistryPlan`.
- Update spec-plan tests to assert against the built `ToolRegistry`
instead of static handler descriptors.
## Validation
- `cargo check -p codex-core`
- `cargo test -p codex-core tools::spec_plan::tests`
- `cargo test -p codex-core tools::spec::tests`
- `just fix -p codex-core`
## Why
The alpha TUI can render the initial trust-directory prompt with stale
terminal text showing through spaces when startup begins below existing
shell output. The first inline viewport transition can happen while the
previous viewport is still empty, so the old clear path no-ops before
Ratatui draws the prompt. Ratatui then skips blank cells because its
previous buffer also thinks those cells are blank, leaving old terminal
contents visible inside the prompt.
## What Changed
- Clear from the new inline viewport top when the previous viewport is
empty during a viewport transition.
- Keep the existing clear-from-old-viewport behavior for normal viewport
updates.
- Add a VT100-backed regression test that pre-fills terminal contents,
performs the first viewport clear, and verifies stale text inside the
new viewport is removed while shell content above the viewport remains.
## How to Test
1. Start Codex alpha in a terminal that already has visible shell output
above the cursor.
2. Use a fresh untrusted project directory so the trust-directory prompt
appears.
3. Confirm the prompt text renders cleanly, with spaces staying blank
instead of showing fragments of previous shell output.
4. As a regression check, confirm content above the inline viewport is
still preserved in terminal scrollback.
Targeted tests:
- `cargo test -p codex-tui
first_viewport_change_clears_from_new_viewport_when_old_viewport_is_empty
-- --nocapture`
- `cargo test -p codex-tui`
Based on work from Vincent K -
https://github.com/openai/codex/pull/19060
<img width="1836" height="642" alt="CleanShot 2026-04-29 at 20 47 40@2x"
src="https://github.com/user-attachments/assets/b647bb89-65fe-40c8-80b0-7a6b7c984634"
/>
## Why
Compaction rewrites the conversation context that future model turns
receive, but hooks currently have no deterministic lifecycle point
around that rewrite. This adds compact lifecycle hooks so users can
audit manual and automatic compaction, surface hook messages in the UI,
and run post-compaction follow-up without overloading tool or prompt
hooks.
## What Changed
- Added `PreCompact` and `PostCompact` hook events across hook config,
discovery, dispatch, generated schemas, app-server notifications,
analytics, and TUI hook rendering.
- Added trigger matching for compact hooks with the documented `manual`
and `auto` matcher values.
- Wired `PreCompact` before both local and remote compaction, and
`PostCompact` after successful local or remote compaction.
- Kept compact hook command input to lifecycle metadata: session id,
Codex turn id, transcript path, cwd, hook event name, model, and
trigger.
- Made compact stdout handling consistent with other hooks: plain stdout
is ignored as debug output, while malformed JSON-looking stdout is
reported as failed hook output.
- Added integration coverage for compact hook dispatch, trigger
matching, post-compact execution, and the audited behavior that
`decision:"block"` does not block compaction.
## Out of Scope
- Hook-specific compaction blocking is not implemented;
`decision:"block"` and exit-code-2 blocking semantics are intentionally
unsupported for `PreCompact`.
- Custom compaction instructions are not exposed to compact hooks in
this PR.
- Compact summaries, summary character counts, and summary previews are
not exposed to compact hooks in this PR.
## Verification
- `cargo test -p codex-hooks`
- `cargo test -p codex-core
manual_pre_compact_block_decision_does_not_block_compaction`
- `cargo test -p codex-app-server hooks_list`
- `cargo test -p codex-core config_schema_matches_fixture`
- `cargo test -p codex-tui hooks_browser`
## Docs
The developer documentation for Codex hooks should be updated alongside
this feature to document `PreCompact` and `PostCompact`, the
`manual`/`auto` matcher values, and the compact hook payload fields.
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Adds marketplaceKinds to plugin/list for local, workspace-directory, and
shared-with-me; omitted params keep default local plus gated global
behavior, while explicit kinds are exact.
Exposes shareContext on plugin summaries from local share mappings and
remote workspace/shared responses, including remotePluginId and nullable
creator metadata.
Adds shared-with-me listing through /ps/plugins/workspace/shared,
renames the workspace remote namespace to workspace-directory, and keeps
direct remote read/share/install/update/delete paths gated by plugins
rather than remote_plugin.
## Why
This is the first mechanical slice of moving tool spec ownership toward
the handlers. `codex-tools` should keep shared primitives and conversion
helpers, while builtin tool specs and registration planning live in
`codex-core` with the handlers that own those tools.
Keeping this PR to relocation and import updates isolates the copy/move
review from the later logic change that wires specs through registered
handlers.
## What changed
- Moved builtin tool spec constructors from `codex-rs/tools/src` into
`codex-rs/core/src/tools/handlers/*_spec.rs` or nearby core tool
modules.
- Moved the registry planning code into
`codex-rs/core/src/tools/spec_plan.rs` and its associated types/tests
into core.
- Kept shared primitives in `codex-tools`, including `ToolSpec`,
schema/types, discovery/config primitives, dynamic/MCP conversion
helpers, and code-mode collection helpers.
- Updated handlers that referenced moved argument types or tool-name
constants to use the core spec modules.
- Moved spec tests next to the moved spec modules.
## Verification
- `cargo check -p codex-tools`
- `cargo check -p codex-core`
- `cargo test -p codex-tools`
- `cargo test -p codex-core _spec::tests`
- `cargo test -p codex-core tools::spec_plan::tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`
Note: I also tried the broader `cargo test -p codex-core tools::`; it
reached the moved spec-plan/spec tests successfully, then aborted with a
stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`,
which is outside this spec relocation.
## Why
Skills update notifications are app-server API behavior, but the watcher
lived in `codex-core` and surfaced through
`EventMsg::SkillsUpdateAvailable`. Moving the watcher out keeps core
focused on thread execution and lets app-server own both cache
invalidation and the `skills/changed` notification.
## What changed
- Added an app-server-owned skills watcher that watches local skill
roots, clears the shared skills cache, and emits `skills/changed`
directly.
- Registers skill watches from the common app-server thread listener
attach path, including direct starts, resumes, and app-server-observed
child or forked threads.
- Stores the `WatchRegistration` on `ThreadState`, so listener
replacement, thread teardown, idle unload, and app-server shutdown
deregister by dropping the RAII guard.
- Removed `EventMsg::SkillsUpdateAvailable`, the core watcher, and the
old core live-reload test.
- Extended the app-server skills change test to verify a cached skills
list is refreshed after a filesystem change without forcing reload.
## Validation
- `cargo check -p codex-core -p codex-app-server -p codex-mcp-server -p
codex-rollout -p codex-rollout-trace`
- `cargo test -p codex-app-server
skills_changed_notification_is_emitted_after_skill_change`