Update the TUI config consumer to unwrap the lenient CLI auth credential store before passing it to the cloud requirements loader.
Co-authored-by: Codex <noreply@openai.com>
Remove the local warning unwrap helpers so runtime config consumption calls Lenient::into_valid_with_warning directly instead of hiding the API behind wrappers.
Co-authored-by: Codex <noreply@openai.com>
Keep Lenient::into_valid as the silent projection helper and add into_valid_with_warning for runtime config consumption that should emit startup warnings.
Co-authored-by: Codex <noreply@openai.com>
Remove the explicit invalid_enum_warnings tree walk and have Lenient::into_valid append warning messages when invalid values are consumed. Keep higher-level resolvers returning values instead of accepting active profile and warning sink plumbing.
Co-authored-by: Codex <noreply@openai.com>
Cover the full ConfigBuilder load path for invalid enum values so startup warnings are asserted alongside valid config that still applies.
Co-authored-by: Codex <noreply@openai.com>
Store selected config enum fields as Lenient<T> so invalid values remain visible after deserialization and can be reported as startup warnings while valid consumers unwrap at runtime. Remove the older retry-loop sanitizer now that warnings come from the typed config tree.
Co-authored-by: Codex <noreply@openai.com>
Replace the explicit config enum sanitizer with a generic deserialize retry loop over the assembled TOML. Unknown enum variant errors remove the offending field, append a warning with the field path and invalid value, and retry deserialization.
Co-authored-by: Codex <noreply@openai.com>
Defer invalid enum handling until after config layers are assembled. This keeps layer loading raw, removes invalid enum values from the final effective config, and reports warnings with the dotted field path and invalid value only.
Co-authored-by: Codex <noreply@openai.com>
Allow config loading to continue when enum-valued settings contain invalid values. Invalid enum entries are removed from the layer before merging and surfaced through startup config warnings, while unrelated valid settings keep loading normally.
Co-authored-by: Codex <noreply@openai.com>
## Why
App-server clients sometimes need argv-based local process execution
while sandbox policy is controlled outside Codex. Those environments can
reject sandbox-disabling paths before a command ever starts, even when
the caller intentionally wants unsandboxed execution.
This PR adds a distinct `process/*` API for that use case instead of
extending `command/exec` with another sandbox-disabling shape. Keeping
the new surface separate also makes the future removal of `command/exec`
simpler: clients that need explicit process lifecycle control can move
to the newer handle-based API without depending on `command/exec`
business logic.
## What changed
- Added v2 process lifecycle methods: `process/spawn`,
`process/writeStdin`, `process/resizePty`, and `process/kill`.
- Added process notifications: `process/outputDelta` for streamed
stdout/stderr chunks and `process/exited` for final exit status and
buffered output.
- Made `process/spawn` intentionally unsandboxed and omitted
sandbox-selection fields such as `sandboxPolicy` and
`permissionProfile`.
- Added client-supplied, connection-scoped `processHandle` values for
follow-up control requests and notification routing.
- Supported cwd, environment overrides, PTY mode and size, stdin
streaming, stdout/stderr streaming, per-stream output caps, and timeout
controls.
- Killed active process sessions when the originating app-server
connection closes.
- Wired the implementation through the modular `request_processors/`
app-server layout, with process-handle request serialization for
follow-up control calls.
- Updated generated JSON/TypeScript schema fixtures and documented the
new API in `codex-rs/app-server/README.md`.
- Added v2 app-server integration coverage in
`codex-rs/app-server/tests/suite/v2/process_exec.rs` for spawn
acknowledgement before exit, buffered output caps, and process
termination.
## Verification
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
---------
Co-authored-by: Owen Lin <owen@openai.com>
## Summary
Remove the hardcoded remote plugin ID prefix allow-list from app-server
uninstall routing. IDs that do not parse as local `plugin@marketplace`
IDs now flow through the remote uninstall path, where the existing
remote ID safety validation still rejects empty IDs, spaces, slashes,
and other unsafe characters before URL/cache use.
## Why
Plugin-service owns the backend remote plugin ID contract. Codex should
not require remote IDs to start with the local hardcoded prefixes
`plugins~`, `plugins_`, `app_`, `asdk_app_`, or `connector_`, because
newer backend ID families could otherwise be rejected before
plugin-service sees the request.
## Validation
- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `just fix -p codex-app-server`
- `git diff --check`
## Why
Tool families already disagree on what their existing `duration` fields
mean, so lifecycle latency should live on the shared item envelope
instead of being inferred from per-tool execution fields. Carrying that
envelope through app-server notifications gives downstream consumers one
reusable timing signal without pretending every tool has the same
execution semantics.
## What changed
- Adds `started_at_ms` to core `ItemStartedEvent` values and
`completed_at_ms` to core `ItemCompletedEvent` values.
- Populates those timestamps in the shared session lifecycle emitters,
so protocol-native items get timing without each producer tracking its
own clock state.
- Exposes `startedAtMs` on app-server `item/started` notifications and
`completedAtMs` on `item/completed` notifications.
- Maps the lifecycle timestamps through the app-server boundary while
leaving legacy-converted notifications nullable when no lifecycle
timestamp exists.
- Regenerates the app-server JSON schema and TypeScript fixtures for the
notification-envelope change and updates downstream fixtures that
construct those notifications directly.
- Extends the existing web-search and image-generation integration flows
to assert the new lifecycle timestamps on the native item events.
## Verification
- `cargo check -p codex-protocol -p codex-core -p
codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec
-p codex-app-server-client`
- `cargo test -p codex-core --test all web_search_item_is_emitted`
- `cargo test -p codex-core --test all
image_generation_call_event_is_emitted`
- `cargo test -p codex-app-server-protocol`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514).
* #18748
* #18747
* #17090
* #17089
* __->__ #20514
## Summary
Moves the WebRTC realtime sideband websocket join out of the voice start
critical path. Call creation still posts the SDP offer and session
config synchronously so the client gets the SDP answer, but the sideband
websocket now connects in the input task async and doesn't block
conversation state installation.
This lets the normal realtime input channels buffer text, handoff
output, and audio while the WebRTC sideband websocket is connecting. If
the sideband join fails while the conversation is still active, the task
sends a RealtimeEvent::Error through the existing events_tx / fanout
path.
To rephrase this:
* No longer blocked on sideband: the client can receive the SDP answer
earlier, set up the WebRTC peer connection, and let the media leg
progress while the sideband websocket joins.
* Still blocked on sideband: queued text, handoff output, and sideband
server events cannot flow until connect_webrtc_sideband(...).await
finishes and then run_realtime_input_task(...) starts
## Validation
- `env CODEX_SKIP_VENDORED_BWRAP=1 cargo test --manifest-path
codex-rs/Cargo.toml -p codex-core --test all
conversation_webrtc_start_posts_generated_session`
`CODEX_SKIP_VENDORED_BWRAP=1` is needed in this local environment
because `libcap.pc` is not installed for the vendored bubblewrap build.
## Testing
I tested this locally by running `cargo run -p codex-cli --bin codex --
--enable realtime_conversation` and invoking `/realtime`. Then, we get
logs emitted in `~/.codex/log/codex-tui.log`.
### Before the Change
Logging commit
(c0299e6edf)
```
2026-05-04T16:06:09.251956Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: starting realtime conversation
2026-05-04T16:06:09.251980Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: creating realtime call transport="webrtc"
2026-05-04T16:06:10.365722Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: realtime call created; sdp answer ready transport="webrtc" call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy elapsed_ms=1113 total_elapsed_ms=1113
2026-05-04T16:06:10.365843Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: connecting realtime sideband websocket call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy
2026-05-04T16:06:10.784528Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: connected realtime sideband websocket call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy elapsed_ms=418 total_elapsed_ms=1532
2026-05-04T16:06:10.784665Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: realtime conversation started
```
### After the Change
Logging commit
(c8b00ac21a)
```
2026-05-04T15:41:24.080363Z INFO ... codex_core::realtime_conversation: starting realtime conversation
2026-05-04T15:41:24.080434Z INFO ... codex_core::realtime_conversation: creating realtime call transport="webrtc"
2026-05-04T15:41:25.106906Z INFO ... codex_core::realtime_conversation: realtime call created; sdp answer ready transport="webrtc" call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy elapsed_ms=1026 total_elapsed_ms=1026
2026-05-04T15:41:25.107067Z INFO ... codex_core::realtime_conversation: spawned realtime sideband connection task transport="webrtc" total_elapsed_ms=1026
2026-05-04T15:41:25.107160Z INFO ... codex_core::realtime_conversation: realtime conversation started
2026-05-04T15:41:25.107185Z INFO codex_core::realtime_conversation: connecting realtime sideband websocket call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy
2026-05-04T15:41:25.107352Z INFO ... codex_core::realtime_conversation: sent realtime sdp answer to client
2026-05-04T15:41:26.076685Z INFO codex_core::realtime_conversation: connected realtime sideband websocket call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy elapsed_ms=969 total_elapsed_ms=1996
2026-05-04T15:41:26.573893Z INFO codex_core::realtime_conversation: realtime session updated realtime_session_id=sess_u0_Dbpi8nhak5eLjQZ73yhAy
2026-05-04T15:41:26.573970Z INFO codex_core::realtime_conversation: received realtime conversation event event=SessionUpdated { ... }
```
### Conclusion
Here we see that we saved about a half a second in conversation startup
(1532ms -> 969ms). This also checks out with my sanity tests; I was
seeing at most a second of saving.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Fixes#11678 by removing the Windows-specific
`PASTE_BURST_CHAR_INTERVAL` override. Windows now uses the same `8ms`
paste-burst character interval as macOS and Linux, which removes the
extra per-character hold that made fast typing and key repeat feel
delayed on Windows.
The paste-burst heuristic itself is unchanged, and the Windows-specific
active idle timeout remains in place. This PR only restores the shared
character-to-character burst threshold that decides whether adjacent
plain character events are part of a paste.
## Motivation
PR #9348 raised the Windows character interval from `8ms` to `30ms` to
protect the multiline paste behavior tracked in #2137, where pasted
newlines could be interpreted as submits in Windows terminals. That
fixed the paste failure, but it also made ordinary typing visibly laggy
because the TUI waits briefly before flushing a single typed character
while it checks whether a paste burst is forming.
The deployed behavior here is to remove that Windows-only delay and
return to the cross-platform threshold. Manual Windows validation of the
critical VS Code integrated terminal path shows multiline paste still
works with the final `8ms` value, including testing on VS Code
`1.107.0`.
## Testing
- `cargo test -p codex-tui`
- Manual Windows validation in VS Code integrated PowerShell with the
final `8ms` interval
## Why
Bazel CI was not actually exercising some sharded Rust integration-test
targets on macOS. The `rules_rust` sharding wrapper expects a symlink
runfiles tree, but this repo runs Bazel with `--noenable_runfiles`. In
that configuration the wrapper could fail to find the generated test
binary, produce an empty test list, and exit successfully. That made
targets such as `//codex-rs/core:core-all-test` look green even when
Cargo CI could still catch failures in the same Rust tests.
The coverage gap appears to have been introduced by
[#18082](https://github.com/openai/codex/pull/18082), which enabled
rules_rust native sharding on `//codex-rs/core:core-all-test` and the
other large Rust test labels. The manifest-runfiles setup itself
predates that change in
[#10098](https://github.com/openai/codex/pull/10098), but #18082 is
where the affected integration tests started running through the
incompatible rules_rust sharding wrapper.
[#18913](https://github.com/openai/codex/pull/18913) fixed the same
class of issue for wrapped unit-test shards, but integration-test shards
were still going through the rules_rust wrapper until this PR.
We still do not have the V8/code-mode pieces stable under the Bazel CI
cross-compile setup, so this keeps those tests out of Bazel while
restoring coverage for the rest of the sharded Rust integration suites.
Cargo CI remains responsible for V8/code-mode coverage for now.
This change did uncover a real failing core test on `main`:
`approved_folder_write_request_permissions_unblocks_later_apply_patch`.
That fix is split into
[#21060](https://github.com/openai/codex/pull/21060), which enables the
`apply_patch` tool in the test, teaches the aggregate core test binary
to dispatch the sandboxed filesystem helper, canonicalizes the macOS
temp patch target, and isolates the core test harness from managed
local/enterprise config. Keeping that fix separate lets this PR stay
focused on restoring Bazel coverage while documenting the first failure
it exposed.
## What changed
- Build sharded Rust integration tests as manual `*-bin` binaries and
run them through the existing manifest-aware `workspace_root_test`
launcher.
- Keep Bazel sharding on the launcher target so Rust test cases are
still distributed by stable test-name hashing.
- Configure Bazel CI to skip Rust tests whose names contain
`suite::code_mode::`.
- Exclude the standalone `codex-rs/code-mode` and `codex-rs/v8-poc`
unit-test targets from `bazel.yml`.
## Verification
- `bazel query --output=build //codex-rs/core:core-all-test` now shows
`workspace_root_test` wrapping `//codex-rs/core:core-all-test-bin`.
- `bazel test --test_output=all --nocache_test_results
--test_sharding_strategy=disabled //codex-rs/core:core-all-test
--test_filter=suite::request_permissions_tool::approved_folder_write_request_permissions_unblocks_later_apply_patch`
runs the actual Rust test body and passes.
- `bazel test --test_output=errors --nocache_test_results
--test_env=CODEX_BAZEL_TEST_SKIP_FILTERS=suite::code_mode::
//codex-rs/core:core-all-test` runs the sharded target with code-mode
skipped and passes overall locally, with one flaky attempt retried by
the existing `flaky = True` setting.
## Why
Fixes#21046.
Codex TUI 0.128.0 can show Backspace/Delete-related editor shortcuts in
`/keymap`, but Windows-style modified Backspace/Delete events were still
dropped by the composer because the default editor keymap did not
include those modified special-key variants. On Windows/CMD this meant
`Shift+Backspace` and `Shift+Delete` did not fall through to normal
character deletion, and `Ctrl+Backspace` / `Ctrl+Delete` did not perform
the word deletion users expect from Windows text inputs.
## What Changed
- Added default editor bindings for `shift-backspace` and `shift-delete`
so shifted delete keys keep normal grapheme deletion behavior.
- Added default editor bindings for `ctrl-backspace`,
`ctrl-shift-backspace`, `ctrl-delete`, and `ctrl-shift-delete` so
Windows-style word deletion works when terminals preserve those
modifiers.
- Added regression coverage for the resolved default keymap and textarea
behavior.
## How to Test
1. Start Codex in the TUI on Windows CMD or another terminal that
reports modified Backspace/Delete keys distinctly.
2. Type `hello world` in the composer.
3. Press `Ctrl+Backspace`; confirm `world` is removed and `hello `
remains.
4. Type `world` again, move the cursor before it, then press
`Ctrl+Delete`; confirm the next word is removed.
5. Type a few characters and press `Shift+Backspace` and `Shift+Delete`;
confirm they delete one character in the expected direction instead of
doing nothing.
6. Open `/keymap`, inspect the Editor deletion actions, and confirm the
modified Backspace/Delete aliases are visible as configurable defaults.
Targeted tests:
- `cargo test -p codex-tui keymap::tests`
- `cargo test -p codex-tui bottom_pane::textarea::tests`
- `cargo test -p codex-tui keymap_setup::tests`
## Why
The Bazel test coverage change exposed
`approved_folder_write_request_permissions_unblocks_later_apply_patch`,
and `rust-ci-full.yml` showed the same test failing on `main` on macOS.
There were two separate classes of problems here.
### Clean CI failure
The test emits an `apply_patch` tool call, but its config did not enable
the `apply_patch` tool, so the mocked response completed without an
`apply-patch-call` output. After enabling the tool, the same path also
needs the aggregate `codex-core` test binary to dispatch
`--codex-run-as-fs-helper`; sandboxed `apply_patch` uses that helper
under macOS Seatbelt.
The test now also canonicalizes the temporary patch target before
building the patch payload so the path matches normalized grants on
macOS, where `/var` paths often normalize to `/private/var`.
### Local/enterprise config isolation
The core test harness now builds its default test config with managed
config disabled, so host-managed enterprise config cannot alter these
tests. The request-permissions turns in this test also explicitly use
the user reviewer path, keeping the assertions focused on
`request_permissions` behavior rather than reviewer defaults from the
host.
## What Changed
- Enable `apply_patch` in
`approved_folder_write_request_permissions_unblocks_later_apply_patch`.
- Teach the core integration test binary to dispatch
`CODEX_FS_HELPER_ARG1`, matching the existing apply-patch and
linux-sandbox dispatch paths.
- Canonicalize the tempdir-backed patch target before creating the
patch.
- Ignore managed config in default core test configs and explicitly pin
this test to `ApprovalsReviewer::User`.
## Verification
Run outside the Codex app sandbox because these macOS tests
intentionally spawn Seatbelt:
- `cargo test -p codex-core
approved_folder_write_request_permissions_unblocks_later_apply_patch`
- `cargo test -p codex-core
approved_folder_write_request_permissions_unblocks_later_exec_without_sandbox_args`
## Why
Feedback reports do not currently surface a direct pointer to the last
model call, so investigations may require searching through many
requests in a session to find the bad response. Preserve the last
model-side IDs at response-stream time so immediate feedback reports
carry that breadcrumb.
## What changed
- Record `last_model_request_id` when a Responses stream exposes an
upstream request ID.
- Record `last_model_response_id` when the model response completes.
- Add unit coverage for the emitted feedback tags.
## Verification
- `cargo test -p codex-core
client::tests::response_stream_records_last_model_feedback_ids`
## Why
MCP servers can provide `instructions` that explain what their tools are
for. Directly exposed MCP namespaces already use those instructions when
a connector description is not available, but deferred `tool_search`
results did not preserve that fallback. The direct path falls back from
connector metadata to server instructions, while the deferred path only
carried `connector_description` and otherwise fell back to generic
namespace text.
That meant a plain MCP server could provide useful model-facing guidance
and still appear as `Tools in the X namespace.` whenever it was
discovered lazily through `tool_search`.
## What changed
- Store one model-facing `namespace_description` on `ToolInfo`, using
connector descriptions for connector-backed tools and server
instructions for plain MCP servers.
- Thread that namespace description through the `tool_search` source
list, search indexing, and returned namespace metadata.
- Add an end-to-end regression test for deferred non-app MCP search
results exposing server instructions as the namespace description.
## Verification
- `cargo test -p codex-tools
search_tool_description_lists_each_mcp_source_once --lib`
- `cargo test -p codex-core --test all
tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`
## Summary
- normalize terminal-emitted C0 control characters through configurable
editor keymaps, covering raw control-key fallbacks like
Shift+Enter-as-LF in terminals from #20555 and #20898, plus part of the
modified-Enter behavior in #20580
- add default-unbound keymap actions for toggling Fast mode and killing
the current composer line, giving #20698 users a configurable zsh-style
Ctrl+U option without changing the existing default Ctrl+U behavior
- wire the new actions through gated /keymap picker entries, schema
generation, and snapshot coverage
Fixes#20555.
Fixes#20898.
## Testing
- just write-config-schema
- just fmt
- cargo test -p codex-config
- cargo test -p codex-tui keymap::tests
- cargo test -p codex-tui bottom_pane::textarea::tests
- cargo test -p codex-tui keymap_setup::tests
- cargo insta pending-snapshots
- just fix -p codex-tui
- git diff --check
- just argument-comment-lint
## Why?
The Codex App already exposes branch and PR context in its
branch-details UI. This brings the same context into the CLI footer as
opt-in statusline items, so users can choose the extra signal without
making the default footer busier.
## What?
Add optional `pull-request-number` and `branch-changes` items to the
configurable TUI status line.
- `pull-request-number` shows the open PR for the current checkout and
renders as a clickable terminal hyperlink when OSC 8 links are
supported.
- `branch-changes` shows committed additions/deletions against the
repository default branch, or `No changes` when the branch has no
committed diff.
<img width="1257" height="261" alt="CleanShot 2026-05-03 at 20 44 15"
src="https://github.com/user-attachments/assets/10b4380b-c3e9-4729-9ee1-3f742068fa47"
/>
## Architecture
This follows the same client/app-server split as the Codex App: the TUI
owns presentation, caching, and optional rendering, while
workspace-sensitive `git` and `gh` discovery runs through app-server.
The new TUI-local `workspace_command` layer sends bounded,
non-interactive `command/exec` requests to the active app-server. That
makes the implementation remote-friendly: the TUI does not decide
whether commands run in an embedded local workspace or a remote
workspace, and it does not bypass app-server sandbox or permission
policy.
The branch summary logic stays internal to `codex-tui` because this PR
only needs TUI statusline behavior. The command boundary is still
isolated behind `WorkspaceCommandExecutor`, so the lookup code can be
lifted or reused later without changing statusline rendering.
## How?
- Add a TUI `WorkspaceCommandExecutor` abstraction backed by app-server
`command/exec`.
- Add branch summary probes for:
- current branch name,
- open PR metadata,
- committed branch diff stats against the default branch.
- Prefer remote-tracking default branch refs for diff stats, avoiding
stale or absent local `main` branches.
- Resolve PRs with `gh pr view` first, then fall back to
commit-associated PR lookup across parent/fork repos.
- Add `/statusline` picker entries, preview values, rendering, and OSC 8
clickable PR links.
- Keep all probes best-effort so missing `git`, missing `gh`, auth
failures, or non-git directories hide optional items instead of
surfacing footer errors.
## Validation
- `cargo test -p codex-tui branch_summary -- --nocapture`
- Snapshot coverage for the `/statusline` preview/setup rendering paths
- Hyperlink rendering coverage for clickable PR statusline cells
## Why
SQLite state was still being opened from consumer paths, including lazy
`OnceCell`-backed thread-store call sites. That let one process
construct multiple state DB connections for the same Codex home, which
makes SQLite lock contention and `database is locked` failures much
easier to hit.
State DB lifetime should be chosen by main-like entrypoints and tests,
then passed through explicitly. Consumers should use the supplied
`Option<StateDbHandle>` or `StateDbHandle` and keep their existing
filesystem fallback or error behavior when no handle is available.
The startup path also needs to keep the rollout crate in charge of
SQLite state initialization. Opening `codex_state::StateRuntime`
directly bypasses rollout metadata backfill, so entrypoints should
initialize through `codex_rollout::state_db` and receive a handle only
after required rollout backfills have completed.
## What Changed
- Initialize the state DB in main-like entrypoints for CLI, TUI,
app-server, exec, MCP server, and the thread-manager sample.
- Pass `Option<StateDbHandle>` through `ThreadManager`,
`LocalThreadStore`, app-server processors, TUI app wiring, rollout
listing/recording, personality migration, shell snapshot cleanup,
session-name lookup, and memory/device-key consumers.
- Remove the lazy local state DB wrapper from the thread store so
non-test consumers use only the supplied handle or their existing
fallback path.
- Make `codex_rollout::state_db::init` the local state startup path: it
opens/migrates SQLite, runs rollout metadata backfill when needed, waits
for concurrent backfill workers up to a bounded timeout, verifies
completion, and then returns the initialized handle.
- Keep optional/non-owning SQLite helpers, such as remote TUI local
reads, as open-only paths that do not run startup backfill.
- Switch app-server startup from direct
`codex_state::StateRuntime::init` to the rollout state initializer so
app-server cannot skip rollout backfill.
- Collapse split rollout lookup/list APIs so callers use the normal
methods with an optional state handle instead of `_with_state_db`
variants.
- Restore `getConversationSummary(ThreadId)` to delegate through
`ThreadStore::read_thread` instead of a LocalThreadStore-specific
rollout path special case.
- Keep DB-backed rollout path lookup keyed on the DB row and file
existence, without imposing the filesystem filename convention on
existing DB rows.
- Verify readable DB-backed rollout paths against `session_meta.id`
before returning them, so a stale SQLite row that points at another
thread's JSONL falls back to filesystem search and read-repairs the DB
row.
- Keep `debug prompt-input` filesystem-only so a one-off debug command
does not initialize or backfill SQLite state just to print prompt input.
- Keep goal-session test Codex homes alive only in the goal-specific
helper, rather than leaking tempdirs from the shared session test
helper.
- Update tests and call sites to pass explicit state handles where DB
behavior is expected and explicit `None` where filesystem-only behavior
is intended.
## Validation
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p
codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p
codex-tui -p codex-exec -p codex-cli --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout state_db_`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout find_thread_path -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout try_init_ -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-rollout`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p
codex-rollout --lib -- -D warnings`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store
read_thread_falls_back_when_sqlite_path_points_to_another_thread --
--nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-thread-store`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
shell_snapshot`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all personality_migration`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find::find_prefers_sqlite_path_by_id --
--nocapture`
- `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
--test all rollout_list_find -- --nocapture`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core
interrupt_accounts_active_goal_before_pausing`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server get_auth_status -- --test-threads=1`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p
codex-app-server --lib`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout
-p codex-app-server --tests`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout
-p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p
codex-exec -p codex-cli`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p
codex-app-server`
- `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p
codex-rollout`
- `CODEX_SKIP_VENDORED_BWRAP=1
CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core`
- `just argument-comment-lint -p codex-core`
- `just argument-comment-lint -p codex-rollout`
Focused coverage added in `codex-rollout`:
- `recorder::tests::state_db_init_backfills_before_returning` verifies
the rollout metadata row exists before startup init returns.
- `state_db::tests::try_init_waits_for_concurrent_startup_backfill`
verifies startup waits for another worker to finish backfill instead of
disabling the handle for the process.
-
`state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill`
verifies startup does not hang indefinitely on a stuck backfill lease.
-
`tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename`
verifies DB-backed lookup accepts valid existing rollout paths even when
the filename does not include the thread UUID.
-
`tests::find_thread_path_falls_back_when_db_path_points_to_another_thread`
verifies DB-backed lookup ignores a stale row whose existing path
belongs to another thread and read-repairs the row after filesystem
fallback.
Focused coverage updated in `codex-core`:
- `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a
DB-preferred rollout file with matching `session_meta.id`, so it still
verifies that valid SQLite paths win without depending on stale/empty
rollout contents.
`cargo test -p codex-app-server thread_list_respects_search_term_filter
-- --test-threads=1 --nocapture` was attempted locally but timed out
waiting for the app-server test harness `initialize` response before
reaching the changed thread-list code path.
`bazel test //codex-rs/thread-store:thread-store-unit-tests
--test_output=errors` was attempted locally after the thread-store fix,
but this container failed before target analysis while fetching `v8+`
through BuildBuddy/direct GitHub. The equivalent local crate coverage,
including `cargo test -p codex-thread-store`, passes.
A plain local `cargo check -p codex-rollout -p codex-app-server --tests`
also requires system `libcap.pc` for `codex-linux-sandbox`; the
follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in
this container.
## Why
This stack adds configured exec-server environments, including
environments reached over stdio. Before client-side stdio transports or
config can use that path, the exec-server binary itself needs a
first-class stdio listen mode so it can speak the same JSON-RPC protocol
over stdin/stdout that it already speaks over websockets.
**Stack position:** this is PR 1 of 5. It is the server-side transport
foundation for the stack.
## What Changed
- Accept `stdio` and `stdio://` for `codex exec-server --listen`.
- Promote the existing stdio `JsonRpcConnection` helper from test-only
code into normal exec-server transport code.
- Add parse coverage for stdio listen URLs while preserving the existing
websocket default.
## Stack
- **1. This PR:** https://github.com/openai/codex/pull/20663 - Add stdio
exec-server listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME
Split from original draft: https://github.com/openai/codex/pull/20508
## Validation
Not run locally; this was split out of the original draft stack.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
On Windows, background terminals could stay visible after their shell
process had already exited. The elevated runner waits for the PTY output
reader to reach EOF before it sends the final exit message, but the
ConPTY helper was reducing ownership down to raw handles too early. That
left the pseudoconsole's borrowed pipe handles alive past teardown, so
EOF never propagated and the session stayed `running`.
## What changed
- change `utils/pty/src/win/conpty.rs` to hand off owned ConPTY
resources instead of leaking only raw handles
- make `windows-sandbox-rs/src/conpty/mod.rs` keep the pseudoconsole
owner and the backing pipe handles together until teardown
- update the elevated runner and the legacy unified-exec backend to keep
that `ConptyInstance` alive, take only the specific pipe handles they
need, and drop the owner at teardown instead of trying to close a
detached pseudoconsole handle later
## Testing
- desktop app in `Auto-review`: 11 x `cmd /c "ping -n 3 google.com"` all
exited cleanly and did not accumulate in the UI
- desktop app in `Auto-review`: 5 x `cmd /c "ping -n 30 google.com"`
appeared in the UI and drained back out on their own
## Why
This is a prep PR in the multi-environment process-tool stack. It
separates ownership/config cleanup from the behavior change that teaches
process tools to route by selected environment, so the follow-up PR can
focus on model-facing `environment_id` behavior.
## Stack
1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext`
rendering for selected environments
2. https://github.com/openai/codex/pull/20669 - selected-environment
ownership and tool config prep (this PR)
3. https://github.com/openai/codex/pull/20647 - process-tool
`environment_id` routing
## What Changed
- keep the resolved turn environment list wrapped in
`ResolvedTurnEnvironments` through `TurnContext` instead of unwrapping
it back to a raw `Vec`
- add `TurnContext::resolve_path_against` so cwd-relative path
resolution has one shared helper
- replace the old tool config boolean with `ToolEnvironmentMode::{None,
Single, Multiple}`
## Testing
- Tests not run locally; this prep refactor is covered by GitHub CI for
the stack.
Co-authored-by: Codex <noreply@openai.com>
## Why
The TUI currently exposes overlapping command names for the same
permissions flow: `/permissions` and the older `/approvals` alias. It
also uses `/autoreview` for the manual retry flow, even though the
action users take there is approving one denied auto-review request.
This change makes the command surface consistent with the hard rebrand:
- `/permissions` is the only command for permission settings.
- `/approve` is the command for approving a recent auto-review denial.
## What changed
- Removed the legacy `/approvals` slash command and its dispatch path.
- Kept `/permissions` as the single permissions command shown and
accepted by the TUI.
- Renamed the auto-review denial command from `/autoreview` to
`/approve`.
- Updated nearby comments so they refer to `/permissions` rather than
the retired `/approvals` name.
## Verification
- Updated the slash-command unit test to assert that `AutoReview` now
renders and parses as `approve`.
## Why
We constantly get bug reports about keys not being recognized by Codex
when the terminal is not handling the key press. Running `/keymap debug`
or `/keymap` and going to the Debug tab, we can allow the user to either
understand that the key being pressed is not being recognized or to
check what it's being recognized as and report or reassign that key.
| Menu | Inspector | Hint |
|---|---|---|
| <img width="1369" height="796" alt="CleanShot 2026-05-02 at 12 57 12"
src="https://github.com/user-attachments/assets/512b6faa-344e-4aee-9c00-b4bdc633a662"
/> | <img width="1261" height="754" alt="CleanShot 2026-05-02 at 12 56
36"
src="https://github.com/user-attachments/assets/a6ddae7d-e174-4ee4-893f-e6bec4fff4ab"
/> | <img width="1369" height="796" alt="CleanShot 2026-05-02 at 12 57
30"
src="https://github.com/user-attachments/assets/db507784-f40a-4cff-ac23-a61d9703769b"
/> |
## Summary
- add a Debug tab to `/keymap` and support `/keymap debug` for direct
access
- show what key Codex receives, the config key representation, raw event
details, and matching actions
- add a progressive missing-key hint that escalates after a few seconds
with no detected keypress
## Validation
- `just fmt`
- `cargo test -p codex-tui keymap_setup::tests::debug_view`
- `cargo test -p codex-tui keymap_setup::tests`
- `cargo test -p codex-tui slash_keymap`
- `cargo test -p codex-tui` (unit tests passed; integration test
`suite::model_availability_nux::resume_startup_does_not_consume_model_availability_nux_count`
failed locally by itself with `codex resume` exiting 1 and terminal
probe escape output)
- `just fix -p codex-tui`
- `just argument-comment-lint`
- `cargo insta pending-snapshots`
- `git diff --check`
## Why
Codex `0.128` started using `--perms` in more routine Linux sandbox
construction when protected workspace metadata mounts landed in #19852.
Upstream bubblewrap added `--perms` in `v0.5.0`, so system `bwrap`
versions older than that, including the `v0.4.0` and `v0.4.1` family, do
not support the flag. The launcher still selected those binaries as long
as they existed on `PATH`.
That means affected hosts can fail every sandboxed command up front
with:
```text
bwrap: Unknown option --perms
```
The reports in #20590 and duplicate #20623 match that compatibility gap;
#20623 explicitly shows system bubblewrap `0.4.0`.
## What changed
- Replace the single `--argv0` probe with a small system-bwrap
capability probe in `codex-rs/linux-sandbox/src/launcher.rs`.
- Continue using the old-system `--argv0` compatibility path when
needed, but only select a system `bwrap` if it also advertises
`--perms`.
- Fall back to the vendored `bwrap` when the system binary is too old
for the flags Codex now requires.
- Add regression coverage for the old-system-bwrap case so binaries
without `--perms` stay on the vendored path.
## Verification
- Added `falls_back_to_vendored_when_system_bwrap_lacks_perms` to cover
the reported compatibility gap.
- Ran `cargo test -p codex-linux-sandbox` and `cargo clippy -p
codex-linux-sandbox --tests` locally. On macOS, the crate builds but its
Linux-only tests are cfg-gated out, so the new regression test still
needs Linux CI or a Linux devbox run for real execution coverage.
## Related issues
- Fixes#20590
- Duplicate report: #20623
## Why
Whenever we return a thread's history (turns and items) over app-server,
always return the limited form as specified by the rollout policy
`EventPersistenceMode::Limited`, even if the thread was previously
started with `EventPersistenceMode::Extended`.
We're finding it is quite unscalable to be returning the extended
history, so let's apply the same filtering logic of the rollout policy
when we load and return the thread's history.
## What Changed
- Reuse the rollout persistence policy when reconstructing app-server
`ThreadItem` history so only `EventPersistenceMode::Limited` rollout
items are replayed into API turns.
- Route `thread/read`, `thread/resume`, `thread/fork`,
`thread/turns/list`, and rollback responses through the same filtered
app-server history projection.
- Keep live active turns intact when composing a response for a
currently running thread.
- Update command execution coverage so persisted extended command events
are excluded from returned history for `thread/read`, `thread/fork`, and
`thread/turns/list`.
## Test Plan
- `cargo test -p codex-app-server limited`
- `cargo test -p codex-app-server thread_shell_command`
- `cargo test -p codex-app-server thread_read`
- `cargo test -p codex-app-server thread_rollback`
- `cargo test -p codex-app-server thread_fork`
- `cargo test -p codex-app-server-protocol`
## Summary
- Treat `approval_mode = "approve"` as skip-review across all permission
modes.
- Remove the mode-specific split in the MCP auto-approval gate so
approved tools bypass review consistently.
- Expand regression coverage in the shared MCP helper and the core
tool-call flow.
## Testing
- `just fmt`
- `cargo test -p codex-mcp`
- `cargo test -p codex-core
approve_mode_skips_arc_and_guardian_in_every_permission_mode`
- `git diff --check`
- Full `cargo test -p codex-core` was also attempted, but the suite hit
an unrelated pre-existing stack overflow in an existing multi-agent test
- [x] Persist a special type of MCP tool calls for triggering MCP App,
this type of mcp tool calls has 'mcpAppResourceUri` set. These events
are needed so that the Codex App can correctly render the MCP App after
resume.
## Why
`ModelClientSession` and `compact_conversation_history()` were still
rebuilding the same `ResponsesApiRequest` fields separately. That
duplication makes it easy for normal `/responses` turns and compact
requests to drift when request-shape changes land later, which is
exactly the kind of cache-affecting divergence we want to avoid.
This follow-up keeps the scope small by extracting the shared
request-construction logic into one helper and using it from both paths.
## What changed
- move `ResponsesApiRequest` construction into a shared
`ModelClient::build_responses_request(...)` helper in
`core/src/client.rs`
- update the normal `/responses` streaming path to call that helper
instead of the old `ModelClientSession`-local implementation
- update `compact_conversation_history()` to derive its compact payload
from the same helper so `model`, `instructions`, `input`, `tools`,
`parallel_tool_calls`, `reasoning`, and `text` stay aligned with normal
request building
- add a unit test covering the shared helper's prompt cache key,
installation metadata, and `service_tier` behavior
## Verification
- `cargo test -p codex-core
build_responses_request_sets_shared_cache_and_metadata_fields`
- `cargo test -p codex-core --test all
remote_compact_v2_reuses_context_compaction_for_followups`
## Docs
No docs update needed.
## Why
The local memories MCP backend only rejected symlinks after resolving
the final path. That left room for scoped requests like
`skills/secret.md` to walk through a symlinked ancestor directory and
escape the configured memories root.
This change also makes missing scoped paths fail explicitly instead of
looking like an empty `list` / `search` result or a `NotFile` read
error.
## What Changed
- walk each scoped path component in
`LocalMemoriesBackend::resolve_scoped_path` and reject symlinked
ancestors before accessing the target
- reject scoped paths that traverse through a non-directory intermediate
component
- add a `NotFound` backend error for missing `read`, `list`, and
`search` paths and map it through the MCP server error conversion
- add coverage for missing paths and symlinked ancestor directories in
`codex-rs/memories/mcp/src/local_tests.rs`
## Testing
- added unit coverage in `codex-rs/memories/mcp/src/local_tests.rs` for
missing paths and symlinked ancestor directories across `read`, `list`,
and `search`
## Why
The memories MCP server currently keeps handwritten JSON Schema beside
the Rust types that actually serialize and deserialize the tool
payloads:
[`schema.rs`](2f5c06a29c/codex-rs/memories/mcp/src/schema.rs (L4-L133)),
[`server.rs`](2f5c06a29c/codex-rs/memories/mcp/src/server.rs (L44-L75)),
and
[`backend.rs`](2f5c06a29c/codex-rs/memories/mcp/src/backend.rs (L41-L117)).
That duplicates the tool contract and makes schema drift easier as the
API evolves.
## What changed
- derive `JsonSchema` for the memories tool arguments, responses, and
nested response types
- replace the handwritten schema builders with shared `schemars`
generation
- preserve the existing wire shape while generating schemas, including
nullable output `Option` fields and non-nullable optional input fields
- wire the `list`, `read`, and `search` tools to the generated schemas
## Verification
- CI pending
## Why
The app-server request path had grown around a large
`CodexMessageProcessor` plus separate API wrapper/helper modules. That
made the dependency graph hard to see and forced unrelated request
families to share broad processor state.
This PR makes the split mechanical and command-prefix oriented so
request families own only the dependencies they use.
## What changed
- Replaced `CodexMessageProcessor` with command-prefix request
processors under `app-server/src/request_processors/`.
- Removed the old config, device-key, external-agent-config, and fs API
wrapper files by moving their API handling into processors.
- Split apps, plugins, marketplace, catalog, account, MCP, command exec,
fs, git, feedback, thread, turn, thread goals, and Windows sandbox
handling into dedicated processors.
- Kept shared lifecycle, summary conversion, token usage replay, and
shared error mapping only where multiple processors use them; single-use
helpers were inlined into their owning processor.
- Removed the fallback processor path and moved processor tests to
`_tests` files.
## Validation
- `cargo test -p codex-app-server`
- `cargo check -p codex-app-server`
- `just fix -p codex-app-server`
## Summary
Early adopters of the `/goal` feature have provided feedback that they
expect a goal they explicitly paused to remain paused when they resume a
thread. Previously, resuming a thread would reactivate a paused goal.
This PR keeps persisted goal status unchanged during thread resume. This
honors the user feedback while also simplifying the core goal logic.
Rather than have the core logic automatically resume a paused goal, that
responsibility is transferred to the client. The TUI now detects a
resumed thread with a paused goal and asks the user whether to `Resume
goal` or `Leave paused`. The prompt appears only for quiet resume flows,
so users who resume with an immediate prompt are not interrupted.
<img width="544" height="111" alt="image"
src="https://github.com/user-attachments/assets/0ac9de1c-6ee6-47ba-b223-c03c8eb4c192"
/>
## Why
Returning from a `/side` conversation restores the parent thread by
replaying its snapshot into the TUI. For very long parent threads,
replaying every transcript row can take noticeable time even though most
rows immediately scroll out of terminal history.
## What Changed
- Buffer thread-switch replay for parent restores when terminal resize
reflow is enabled.
- Reuse the existing resize-reflow tail renderer so only the retained
transcript tail is written back to scrollback when a row cap is
configured.
## Summary
Early adopters of the `/goal` feature have provided feedback that they
expect a goal they explicitly paused to remain paused when they resume a
thread. Previously, resuming a thread would reactivate a paused goal.
This PR keeps persisted goal status unchanged during thread resume. This
honors the user feedback while also simplifying the core goal logic.
Rather than have the core logic automatically resume a paused goal, that
responsibility is transferred to the client. The TUI now detects a
resumed thread with a paused goal and asks the user whether to `Resume
goal` or `Leave paused`. The prompt appears only for quiet resume flows,
so users who resume with an immediate prompt are not interrupted.
<img width="544" height="111" alt="image"
src="https://github.com/user-attachments/assets/0ac9de1c-6ee6-47ba-b223-c03c8eb4c192"
/>
## Why
The memories MCP `search` tool only accepts a single substring today,
which makes it hard for clients to express combined queries or explain
why a line matched. This change adds the richer search shape needed for
the next client iteration while keeping the legacy single-`query` call
working.
## What changed
- accept either the legacy `query` field or a new `queries` array, plus
`match_mode: any|all`
- teach the local memories backend to evaluate multi-query line matches
and return `matched_queries` on each hit
- update the MCP input/output schema and add coverage for parser
behavior, ordering, pagination, case sensitivity, and match modes
## Testing
- added unit coverage in `memories/mcp/src/local_tests.rs` and
`memories/mcp/src/server.rs`
## Why
The paginated memories MCP `search` tool still returned only the
matching line text, which made it harder for clients to present useful
search results or decide whether they needed to follow up with a
separate `read` call. Adding a small amount of surrounding context makes
individual hits much more usable while keeping the search response
deterministic and line-addressable.
## What changed
- add an optional `context_lines` search argument and thread it through
the MCP server into the local memories backend
- change search matches to return the matched `line_number` plus a
`start_line_number` and multi-line `content` block for the requested
context window
- update the search tool schema and description to document the new
request/response shape
- extend the local backend tests to cover zero-context matches,
contextual results, pagination, and invalid cursors that point past the
end of the result set
## Testing
- Added targeted unit coverage in `memories/mcp/src/local_tests.rs`
- GitHub Actions are running for the branch
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
The memories MCP `search` tool previously stopped once it hit
`max_results`, so callers could tell there were more matches via
`truncated` but had no way to fetch the rest of the result set. That
made large searches awkward for clients that need to keep paging through
a stable, deterministic view of the matches.
## What changed
- add an optional `cursor` field to `SearchMemoriesRequest` / tool input
and return `next_cursor` in `SearchMemoriesResponse`
- update the MCP schemas and tool wiring so clients can request
subsequent pages explicitly
- change the local memories backend to collect and sort the full scoped
match list, then slice the requested page and reject invalid cursors
- add unit coverage for paginated search results and invalid cursor
handling in `memories/mcp/src/local_tests.rs`
## Testing
- Added targeted unit coverage in `memories/mcp/src/local_tests.rs`
- GitHub Actions are running for the branch
## Why
The memories MCP `list` tool should behave like a directory listing, not
a recursive tree walk. Recursive results make pagination harder to
reason about, return unexpectedly deep paths for scoped requests, and no
longer match the intended tool contract.
## What Changed
- Changed the local memories backend so `list` returns only the
immediate children of the requested path.
- Preserved file-scoped requests by returning the file itself, and
missing paths by returning an empty result.
- Updated cursor handling to paginate over the shallow sibling set and
reject cursors past the available results.
- Updated the MCP tool description to say it lists immediate files and
directories under a path.
- Reworked the local backend tests to cover shallow top-level listing,
shallow scoped listing, sibling ordering, and pagination.
## Testing
- `cargo test -p codex-memories-mcp`
## Why
Large memories trees do not fit well into a single MCP `list` response.
This change makes the memories MCP server page `list` results so callers
can continue walking the tree without overfetching or relying on
ambiguous truncation.
## What changed
- add an optional `cursor` input to the memories MCP `list` API and
return `next_cursor` alongside `truncated` in the response
- paginate recursive local-memory traversal while preserving
lexicographic path order across directories
- reject malformed and out-of-range cursors as invalid MCP requests
- update the server/schema wiring and add coverage for pagination,
ordering, and cursor validation in `memories/mcp/src/local_tests.rs`
## Testing
- `cargo test -p codex-memories-mcp`
## Why
The memories MCP `read` tool already supports `line_offset`, but it
cannot return a bounded line range. That makes it awkward to page
through large memory files or request a small slice without relying on
token truncation.
## What changed
- add an optional `max_lines` parameter to the memories MCP `read` tool
schema and request parsing
- cap local backend reads to the requested number of lines before token
truncation
- treat `max_lines = 0` as an invalid request and surface it as
`invalid_params`
- add backend tests for bounded reads and invalid line request
validation
## Testing
- added coverage in `memories/mcp/src/local_tests.rs` for `max_lines`
reads and invalid `max_lines` / `line_offset` requests
## Why
Memory clients sometimes need to continue reading a file from a known
line instead of starting over from the top. Adding a line offset to the
`read` MCP keeps that resume logic simple and avoids re-reading
already-consumed content.
## What changed
- Added an optional `line_offset` argument to the memory `read` tool,
defaulting to `1`.
- Read content starting at the requested 1-indexed line before token
truncation, and return `start_line_number` in the response.
- Treat invalid offsets as invalid params errors and cover the new
behavior in `codex-rs/memories/mcp/src/local_tests.rs`.
## Testing
- Added unit tests for reading from a non-default starting line.
- Added unit tests for rejecting `0` and past-end line offsets.