Summary
- rename the `collab` handlers and UI files to `multi_agents` to match
the new naming
- update module references and specs so the handlers and TUI widgets
consistently use the renamed files
- keep the existing functionality while aligning file and module names
with the multi-agent terminology
The idea is to have 2 family of agents.
1. Built-in that we packaged directly with Codex
2. User defined that are defined using the `agents_config.toml` file. It
can reference config files that will override the agent config. This
looks like this:
```
version = 1
[agents.explorer]
description = """Use `explorer` for all codebase questions.
Explorers are fast and authoritative.
Always prefer them over manual search or file reading.
Rules:
- Ask explorers first and precisely.
- Do not re-read or re-search code they cover.
- Trust explorer results without verification.
- Run explorers in parallel when useful.
- Reuse existing explorers for related questions."""
config_file = "explorer.toml"
```
Summary
- rename the collab feature key to multi_agent while keeping the Feature
enum unchanged
- add legacy alias support so both "multi_agent" and "collab" map to the
same feature
- cover the alias behavior with a new unit test
## Summary
- make `rust_test` targets generated from `tests/*.rs` use Cargo-style
crate names (file stem) so snapshot names match Cargo (`all__...`
instead of Bazel-derived names)
- split lib vs `tests/*.rs` test env wiring in `codex_rust_crate` to
keep existing lib snapshot behavior while applying Bazel
runfiles-compatible workspace root for `tests/*.rs`
- compute the `tests/*.rs` snapshot workspace root from package depth so
`insta` resolves committed snapshots under Bazel `--noenable_runfiles`
## Validation
- `bazelisk test //codex-rs/core:core-all-test
--test_arg=suite::compact:: --cache_test_results=no`
- `bazelisk test //codex-rs/core:core-all-test
--test_arg=suite::compact_remote:: --cache_test_results=no`
###### Context
unknown model warning added in #11690 has
[issues](https://github.com/openai/codex/actions/runs/22047424710/job/63700733887)
on ubuntu runners because we potentially emit it on all new turns,
including ones with intentionally fake models (i.e., `mock-model` in a
test).
###### Fix
change the warning to only emit on user turns/review turns.
###### Tests
CI now passes on ubuntu, still passes locally
### What changed
1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`.
2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in
`core/src/state/session.rs` for authoritative restore behavior (deduped,
order-preserving, empty clears).
3. Added rollout parsing in `core/src/codex.rs` to recover
`active_selected_tools` from prior `search_tool_bm25` outputs:
- tracks matching `call_id`s
- parses function output text JSON
- extracts `active_selected_tools`
- latest valid payload wins
- malformed/non-matching payloads are ignored
4. Applied restore logic to resumed and forked startup paths in
`core/src/codex.rs`.
5. Updated instruction text to session/thread scope in
`core/templates/search_tool/tool_description.md`.
6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit
coverage in:
- `core/src/codex.rs`
- `core/src/state/session.rs`
### Behavior after change
1. Search activates matched tools.
2. Additional searches union into active selection.
3. Selection survives new turns in the same thread.
4. Resume/fork restores selection from rollout history.
5. Separate threads do not inherit selection unless forked.
### What
It's currently unclear when the harness falls back to the default,
generic `ModelInfo`. This happens when the `remote_models` feature is
disabled or the model is truly unknown, and can lead to bad performance
and issues in the harness.
Add a user-facing warning when this happens so they are aware when their
setup is broken.
### Tests
Added tests, tested locally.
This PR keeps compaction context-layout test coverage separate from
runtime compaction behavior changes, so runtime logic review can stay
focused.
## Included
- Adds reusable context snapshot helpers in
`core/tests/common/context_snapshot.rs` for rendering model-visible
request/history shapes.
- Standardizes helper naming for readability:
- `format_request_input_snapshot`
- `format_response_items_snapshot`
- `format_labeled_requests_snapshot`
- `format_labeled_items_snapshot`
- Expands snapshot coverage for both local and remote compaction flows:
- pre-turn auto-compaction
- pre-turn failure/context-window-exceeded paths
- mid-turn continuation compaction
- manual `/compact` with and without prior user turns
- Captures both sides where relevant:
- compaction request shape
- post-compaction history layout shape
- Adds/uses shared request-inspection helpers so assertions target
structured request content instead of ad-hoc JSON string parsing.
- Aligns snapshots/assertions to current behavior and leaves explicit
`TODO(ccunningham)` notes where behavior is known and intentionally
deferred.
## Not Included
- No runtime compaction logic changes.
- No model-visible context/state behavior changes.
## Summary
This PR is the first slice of the per-session `/feedback` logging work:
it adds a process-unique identifier to SQLite log rows.
It does **not** change `/feedback` sourcing behavior yet.
## Changes
- Add migration `0009_logs_process_id.sql` to extend `logs` with:
- `process_uuid TEXT`
- `idx_logs_process_uuid` index
- Extend state log models:
- `LogEntry.process_uuid: Option<String>`
- `LogRow.process_uuid: Option<String>`
- Stamp each log row with a stable per-process UUID in the sqlite log
layer:
- generated once per process as `pid:<pid>:<uuid>`
- Update sqlite log insert/query paths to persist and read
`process_uuid`:
- `INSERT INTO logs (..., process_uuid, ...)`
- `SELECT ..., process_uuid, ... FROM logs`
## Why
App-server runs many sessions in one process. This change provides a
process-scoping primitive we need for follow-up `/feedback` work, so
threadless/process-level logs can be associated with the emitting
process without mixing across processes.
## Non-goals in this PR
- No `/feedback` transport/source changes
- No attachment size changes
- No sqlite retention/trim policy changes
## Testing
- `just fmt`
- CI will run the full checks
## Summary
- add a distinct `linux_bubblewrap` sandbox tag when the Linux
bubblewrap pipeline feature is enabled
- thread the bubblewrap feature flag into sandbox tag generation for:
- turn metadata header emission
- tool telemetry metric tags and after-tool-use hooks
- add focused unit tests for `sandbox_tag` precedence and Linux
bubblewrap behavior
## Validation
- `just fmt`
- `cargo clippy -p codex-core --all-targets`
- `cargo test -p codex-core sandbox_tags::tests`
- started `cargo test -p codex-core` and stopped it per request
Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
### Description
#### Summary
Adds the TUI UX layer for structured network approvals
#### What changed
- Updated approval overlay to display network-specific approval context
(host/protocol).
- Added/updated TUI wiring so approval prompts show correct network
messaging.
- Added tests covering the new approval overlay behavior.
#### Why
Core orchestration can now request structured network approvals; this
ensures users see clear, contextual prompts in the TUI.
#### Notes
- UX behavior activates only when network approval context is present.
---------
Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
### Description
#### Summary
Introduces the core plumbing required for structured network approvals
#### What changed
- Added structured network policy decision modeling in core.
- Added approval payload/context types needed for network approval
semantics.
- Wired shell/unified-exec runtime plumbing to consume structured
decisions.
- Updated related core error/event surfaces for structured handling.
- Updated protocol plumbing used by core approval flow.
- Included small CLI debug sandbox compatibility updates needed by this
layer.
#### Why
establishes the minimal backend foundation for network approvals without
yet changing high-level orchestration or TUI behavior.
#### Notes
- Behavior remains constrained by existing requirements/config gating.
- Follow-up PRs in the stack handle orchestration, UX, and app-server
integration.
---------
Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
## Why this change
When Cargo dependencies change, it is easy to end up with an unexpected
local diff in
`MODULE.bazel.lock` after running Bazel. That creates noisy working
copies and pushes lockfile fixes
later in the cycle. This change addresses that pain point directly.
## What this change enforces
The expected invariant is: after dependency updates, `MODULE.bazel.lock`
is already in sync with
Cargo resolution. In practice, running `bazel mod deps` should not
mutate the lockfile in a clean
state. If it does, the dependency update is incomplete.
## How this is enforced
This change adds a single lockfile check script that snapshots
`MODULE.bazel.lock`, runs
`bazel mod deps`, and fails if the file changes. The same check is wired
into local workflow
commands (`just bazel-lock-update` and `just bazel-lock-check`) and into
Bazel CI (Linux x86_64 job)
so drift is caught early and consistently. The developer documentation
is updated in
`codex-rs/docs/bazel.md` and `AGENTS.md` to make the expected flow
explicit.
`MODULE.bazel.lock` is also refreshed in this PR to match the current
Cargo dependency resolution.
## Expected developer workflow
After changing `Cargo.toml` or `Cargo.lock`, run `just
bazel-lock-update`, then run
`just bazel-lock-check`, and include any resulting `MODULE.bazel.lock`
update in the same change.
## Testing
Ran `just bazel-lock-check` locally.
## Summary
This PR fixes a race in `js_repl` tool-call draining that could leave an
exec waiting indefinitely for in-flight tool calls to finish.
The fix is in:
-
`/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`
## Problem
`js_repl` tracks in-flight tool calls per exec and waits for them to
drain on completion/timeout/cancel paths.
The previous wait logic used a check-then-wait pattern with `Notify`
that could miss a wakeup:
1. Observe `in_flight > 0`
2. Drop lock
3. Register wait (`notified().await`)
If `notify_waiters()` happened between (2) and (3), the waiter could
sleep until another notification that never comes.
## What changed
- Updated all exec-tool-call wait loops to create an owned notification
future while holding the lock:
- use `Arc<Notify>::notified_owned()` instead of cloning notify and
awaiting later.
- Applied this consistently to:
- `wait_for_exec_tool_calls`
- `wait_for_all_exec_tool_calls`
- `wait_for_exec_tool_calls_map`
This preserves existing behavior while eliminating the lost-wakeup
window.
## Test coverage
Added a regression test:
- `wait_for_exec_tool_calls_map_drains_inflight_calls_without_hanging`
The test repeatedly races waiter/finisher tasks and asserts bounded
completion to catch hangs.
## Impact
- No API changes.
- No user-facing behavior changes intended.
- Improves reliability of exec lifecycle boundaries when tool calls are
still in flight.
#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/11796
- 👉 `2` https://github.com/openai/codex/pull/11800
- ⏳ `3` https://github.com/openai/codex/pull/10673
- ⏳ `4` https://github.com/openai/codex/pull/10670
## Summary
Fixes a flaky/panicking `js_repl` image-path test by running it on a
multi-thread Tokio runtime and tightening assertions to focus on real
behavior.
## Problem
`js_repl_can_attach_image_via_view_image_tool` in
`/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`
can panic under single-thread test runtime with:
`can call blocking only when running on the multi-threaded runtime`
It also asserted a brittle user-facing text string.
## Changes
1. Updated the test runtime to:
`#[tokio::test(flavor = "multi_thread", worker_threads = 2)]`
2. Removed the brittle `"attached local image path"` string assertion.
3. Kept the concrete side-effect assertions:
- tool call succeeds
- image is actually injected into pending input (`InputImage` with
`data:image/png;base64,...`)
## Why this is safe
This is test-only behavior. No production runtime code paths are
changed.
## Validation
- Ran:
`cargo test -p codex-core
tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool --
--nocapture`
- Result: pass
#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/11796
- ⏳ `2` https://github.com/openai/codex/pull/11800
- ⏳ `3` https://github.com/openai/codex/pull/10673
- ⏳ `4` https://github.com/openai/codex/pull/10670
Fixes Bazel build failure in //codex-rs/protocol:protocol-unit-tests.
The test used include_bytes! to read a PNG from codex-core assets; Cargo
can read it,
but Bazel sandboxing can't, so the crate fails to compile.
This change inlines a tiny valid PNG in the test to keep it hermetic.
Related regression: #10590 (cc: @charley-oai)
### What
to unblock filtering models in VSCE, change `model/list` app-server
endpoint to send all models + visibility field `showInPicker` so
filtering can be done in VSCE if desired.
### Tests
Updated tests.
## Summary
- always rejoin an in-memory running thread on `thread/resume`, even
when overrides are present
- reject `thread/resume` when `history` is provided for a running thread
- reject `thread/resume` when `path` mismatches the running thread
rollout path
- warn (but do not fail) on override mismatches for running threads
- add more `thread_resume` integration tests and fixes; including
restart-based resume-with-overrides coverage
## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all thread_resume`
- manual test with app-server-test-client
https://github.com/openai/codex/pull/11755
- manual test both stdio and websocket in app
## Summary
This PR makes app-server-provided image URLs first-class attachments in
TUI, so they survive resume/backtrack/history recall and are resubmitted
correctly.
<img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
/>
Can delete the attached image upon backtracking:
<img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
/>
In both history and composer, remote images are rendered as normal
`[Image #N]` placeholders, with numbering unified with local images.
## What changed
- Plumb remote image URLs through TUI message state:
- `UserHistoryCell`
- `BacktrackSelection`
- `ChatComposerHistory::HistoryEntry`
- `ChatWidget::UserMessage`
- Show remote images as placeholder rows inside the composer box (above
textarea), and in history cells.
- Support keyboard selection/deletion for remote image rows in composer
(`Up`/`Down`, `Delete`/`Backspace`).
- Preserve remote-image-only turns in local composer history (Up/Down
recall), including restore after backtrack.
- Ensure submit/queue/backtrack resubmit include remote images in model
input (`UserInput::Image`), and keep request shape stable for
remote-image-only turns.
- Keep image numbering contiguous across remote + local images:
- remote images occupy `[Image #1]..[Image #M]`
- local images start at `[Image #M+1]`
- deletion renumbers consistently.
- In protocol conversion, increment shared image index for remote images
too, so mixed remote/local image tags stay in a single sequence.
- Simplify restore logic to trust in-memory attachment order (no
placeholder-number parsing path).
- Backtrack/replay rollback handling now queues trims through
`AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
lines after trims, so overlay/transcript state stays consistent.
- Trim trailing blank rendered lines from user history rendering to
avoid oversized blank padding.
## Docs + tests
- Updated: `docs/tui-chat-composer.md` (remote image flow,
selection/deletion, numbering offsets)
- Added/updated tests across `tui/src/chatwidget/tests.rs`,
`tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
and `tui/src/bottom_pane/chat_composer.rs`
- Added snapshot coverage for remote image composer states, including
deleting the first of two remote images.
## Validation
- `just fmt`
- `cargo test -p codex-tui`
## Codex author
`codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
## Summary
- When building via `nix build`, the binary reports `codex-cli 0.0.0`
because the workspace `Cargo.toml` uses `0.0.0` as a placeholder on
`main`. This causes the update checker to always prompt users to upgrade
even when running the latest code.
- Reads the version from `codex-rs/Cargo.toml` at flake evaluation time
using `builtins.fromTOML` and patches it into the workspace `Cargo.toml`
before cargo builds via `postPatch`.
- On release commits (e.g. tag `rust-v0.101.0`), the real version is
used as-is. On `main` branch builds, falls back to
`0.0.0-dev+<shortRev>` (or `0.0.0-dev+dirty`), which the update
checker's `parse_version` ignores — suppressing the spurious upgrade
prompt.
| Scenario | Cargo.toml version | Nix `version` | Binary reports |
Upgrade nag? |
|---|---|---|---|---|
| Release commit (e.g. `rust-v0.101.0`) | `0.101.0` | `0.101.0` |
`codex-cli 0.101.0` | Only if newer exists |
| Main branch (committed) | `0.0.0` | `0.0.0-dev+b934ffc` | `codex-cli
0.0.0-dev+b934ffc` | No |
| Main branch (uncommitted) | `0.0.0` | `0.0.0-dev+dirty` | `codex-cli
0.0.0-dev+dirty` | No |
## Test plan
- [ ] `nix build` from `main` branch and verify `codex --version`
reports `0.0.0-dev+<shortRev>` instead of `0.0.0`
- [ ] Verify the update checker does not show a spurious upgrade prompt
for dev builds
- [ ] Confirm that on a release commit where `Cargo.toml` has a real
version, the binary reports that version correctly
## Summary
- Created branch zuxin/read-path-update from main.
- Copied codex-rs/core/templates/memories/read_path.md from the current
branch.
- Committed the content change.
## Testing
Not run (content copy + commit only).
Currently, if there are syntax errors detected in the starlark rules
file, the entire policy is silently ignored by the CLI. The app server
correctly emits a message that can be displayed in a GUI.
This PR changes the CLI (both the TUI and non-interactive exec) to fail
when the rules file can't be parsed. It then prints out an error message
and exits with a non-zero exit code. This is consistent with the
handling of errors in the config file.
This addresses #11603
## Summary
- add a shared `codex-core` sleep inhibitor that uses native macOS IOKit
assertions (`IOPMAssertionCreateWithName` / `IOPMAssertionRelease`)
instead of spawning `caffeinate`
- wire sleep inhibition to turn lifecycle in `tui` (`TurnStarted`
enables; `TurnComplete` and abort/error finalization disable)
- gate this behavior behind a `/experimental` feature toggle
(`[features].prevent_idle_sleep`) instead of a dedicated `[tui]` config
flag
- expose the toggle in `/experimental` on macOS; keep it under
development on other platforms
- keep behavior no-op on non-macOS targets
<img width="1326" height="577" alt="image"
src="https://github.com/user-attachments/assets/73fac06b-97ae-46a2-800a-30f9516cf8a3"
/>
## Testing
- `cargo check -p codex-core -p codex-tui`
- `cargo test -p codex-core sleep_inhibitor::tests -- --nocapture`
- `cargo test -p codex-core
tui_config_missing_notifications_field_defaults_to_enabled --
--nocapture`
- `cargo test -p codex-core prevent_idle_sleep_is_ -- --nocapture`
## Semantics and API references
- This PR targets `caffeinate -i` semantics: prevent *idle system sleep*
while allowing display idle sleep.
- `caffeinate -i` mapping in Apple open source (`assertionMap`):
- `kIdleAssertionFlag -> kIOPMAssertionTypePreventUserIdleSystemSleep`
- Source:
https://github.com/apple-oss-distributions/PowerManagement/blob/PowerManagement-1846.60.12/caffeinate/caffeinate.c#L52-L54
- Apple IOKit docs for assertion types and API:
-
https://developer.apple.com/documentation/iokit/iopmlib_h/iopmassertiontypes
-
https://developer.apple.com/documentation/iokit/1557092-iopmassertioncreatewithname
- https://developer.apple.com/library/archive/qa/qa1340/_index.html
## Codex Electron vs this PR (full stack path)
- Codex Electron app requests sleep blocking with
`powerSaveBlocker.start("prevent-app-suspension")`:
-
https://github.com/openai/codex/blob/main/codex/codex-vscode/electron/src/electron-message-handler.ts
- Electron maps that string to Chromium wake lock type
`kPreventAppSuspension`:
-
https://github.com/electron/electron/blob/main/shell/browser/api/electron_api_power_save_blocker.cc
- Chromium macOS backend maps wake lock types to IOKit assertion
constants and calls IOKit:
- `kPreventAppSuspension -> kIOPMAssertionTypeNoIdleSleep`
- `kPreventDisplaySleep / kPreventDisplaySleepAllowDimming ->
kIOPMAssertionTypeNoDisplaySleep`
-
https://github.com/chromium/chromium/blob/main/services/device/wake_lock/power_save_blocker/power_save_blocker_mac.cc
## Why this PR uses a different macOS constant name
- This PR uses `"PreventUserIdleSystemSleep"` directly, via
`IOPMAssertionCreateWithName`, in
`codex-rs/core/src/sleep_inhibitor.rs`.
- Apple’s IOKit header documents `kIOPMAssertionTypeNoIdleSleep` as
deprecated and recommends `kIOPMAssertPreventUserIdleSystemSleep` /
`kIOPMAssertionTypePreventUserIdleSystemSleep`:
-
https://github.com/apple-oss-distributions/IOKitUser/blob/IOKitUser-100222.60.2/pwr_mgt.subproj/IOPMLib.h#L1000-L1030
- So Chromium and this PR are using different constant names, but
semantically equivalent idle-system-sleep prevention behavior.
## Future platform support
The architecture is intentionally set up for multi-platform extensions:
- UI code (`tui`) only calls `SleepInhibitor::set_turn_running(...)` on
turn lifecycle boundaries.
- Platform-specific behavior is isolated in
`codex-rs/core/src/sleep_inhibitor.rs` behind `cfg(...)` blocks.
- Feature exposure is centralized in `core/src/features.rs` and surfaced
via `/experimental`.
- Adding new OS backends should not require additional TUI wiring; only
the backend internals and feature stage metadata need to change.
Potential follow-up implementations:
- Windows:
- Add a backend using Win32 power APIs
(`SetThreadExecutionState(ES_CONTINUOUS | ES_SYSTEM_REQUIRED)` as
baseline).
- Optionally move to `PowerCreateRequest` / `PowerSetRequest` /
`PowerClearRequest` for richer assertion semantics.
- Linux:
- Add a backend using logind inhibitors over D-Bus
(`org.freedesktop.login1.Manager.Inhibit` with `what="sleep"`).
- Keep a no-op fallback where logind/D-Bus is unavailable.
This PR keeps the cross-platform API surface minimal so future PRs can
add Windows/Linux support incrementally with low churn.
---------
Co-authored-by: jif-oai <jif@openai.com>
[codex-generated]
## Updated PR Description (Ready To Paste)
## Problem
When a sub-agent thread emits `ShutdownComplete`, the TUI switches back
to the primary thread.
That was also happening for user-requested exits (for example `Ctrl+C`),
which could prevent a
clean app exit and unexpectedly resurrect the main thread.
## Mental model
The app has one primary thread and one active thread. A non-primary
active thread shutting down
usually means "agent died, fail back to primary," but during
`ExitMode::ShutdownFirst` shutdown
means "the user is exiting," not "recover this session."
## Non-goals
No change to thread lifecycle, thread-manager ownership, or shutdown
protocol wire format.
No behavioral changes to non-shutdown events.
## Tradeoffs
This adds a small local marker (`pending_shutdown_exit_thread_id`)
instead of inferring intent
from event timing. It is deterministic and simple, but relies on
correctly setting and clearing
that marker around exit.
## Architecture
`App` tracks which thread is intentionally being shut down for exit.
`active_non_primary_shutdown_target` centralizes failover eligibility
for `ShutdownComplete` and
skips failover when shutdown matches the pending-exit thread.
`handle_active_thread_event` handles non-primary failover before generic
forwarding and clears the
pending-exit marker only when the matching active thread completes
shutdown.
## Observability
User-facing info/error messages continue to indicate whether failover to
the main thread succeeded.
The shutdown-intent path is now explicitly documented inline for easier
debugging.
## Tests
Added targeted tests for `active_non_primary_shutdown_target` covering
non-shutdown events,
primary-thread shutdown, non-primary shutdown failover, pending exit on
active thread (no failover),
and pending exit for another thread (still failover).
Validated with:
- `cargo test -p codex-tui` (pass)
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
## Why
`compact_resume_after_second_compaction_preserves_history` has been
intermittently flaky in Windows CI.
The test had two one-shot request matchers in the second compact/resume
phase that could overlap, and it waited for the first `Warning` event
after compaction. In practice, that made the test sensitive to
platform/config-specific prompt shape and unrelated warning timing.
## What Changed
- Hardened the second compaction matcher in
`codex-rs/core/tests/suite/compact_resume_fork.rs` so it accepts
expected compact-request variants while explicitly excluding the
`AFTER_SECOND_RESUME` payload.
- Updated `compact_conversation()` to wait for the specific compaction
warning (`COMPACT_WARNING_MESSAGE`) rather than any `Warning` event.
- Added an inline comment explaining why the matcher is intentionally
broad but disjoint from the follow-up resume matcher.
## Test Plan
- `cargo test -p codex-core --test all
suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
-- --exact`
- Repeated the same test in a loop (40 runs) to check for local
nondeterminism.
## Summary
- Limit `search_tool_bm25` indexing to `codex_apps` tools only, so
non-Apps MCP servers are no longer discoverable through this search
path.
- Move search-tool discovery guidance into the `search_tool_bm25` tool
description (via template include) instead of injecting it as a separate
developer message.
- Update Apps discovery guidance wording to clarify when to use
`search_tool_bm25` for Apps-backed systems (for example Slack, Google
Drive, Jira, Notion) and when to call tools directly.
- Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and
`codex_apps_connector_id`) that is no longer used after the
tool-selection refactor.
- Update `core` search-tool tests to assert codex-apps-only behavior and
to validate guidance from the tool description.
## Validation
- ✅ `just fmt`
- ✅ `cargo test -p codex-core search_tool`
- ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly
stalled on
`tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`.
## Tickets
- None
Summary
- make the phase1 memories schema require `rollout_slug` while still
allowing it to be `null`
- update the corresponding test to check the required fields and
nullable type list
Testing
- Not run (not requested)
Switched arg0 runtime initialization from tokio::runtime::Runtime::new()
to an explicit multi-thread builder that sets the thread stack size to
16MiB.
This is only for Windows for now but we might need to do this for others
in the future. This is required because Codex becomes quite large and
Windows tends to consume stack a little bit faster (this is a known
thing even though everyone seems to have different theory on it)