## Summary
- Make command/exec output-delta tests accumulate streamed chunks
instead of assuming complete logical output in a single notification.
- Collect stdout and stderr independently so stream interleaving does
not fail the pipe streaming test.
## Why
The command/exec protocol exposes output as deltas, so tests should not
rely on chunk boundaries being stable. A line like `out-start\n` may
arrive split across multiple notifications, and stdout/stderr
notifications may interleave.
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-app-server suite::v2::command_exec`
## Summary
- wrap realtime startup context in
`<startup_context>...</startup_context>` tags
- prefix V2 mirrored user text and relayed backend text with `[USER]` /
`[BACKEND]`
- remove the V2 progress suffix and replace the final V2 handoff output
with a short completion acknowledgement while preserving the existing V1
wrapper
## Testing
- cargo test -p codex-api
realtime_v2_session_update_includes_background_agent_tool_and_handoff_output_item
-- --exact
- cargo test -p codex-app-server webrtc_v2_background_agent_
- cargo test -p codex-app-server webrtc_v2_text_input_is_
- cargo test -p codex-core conversation_user_text_turn_is_
## Summary
1. Revert https://github.com/openai/codex/pull/17848 so the Bazel and
`BUILD` file changes leave `main`.
2. Prepare for a narrower follow up that restores only `SECURITY.md`.
## Validation
1. Reviewed the revert diff against `main`.
2. Ran a clean diff check before push.
Dismiss stale TUI app-server approvals after remote resolution
When an approval, user-input prompt, or elicitation request is resolved
by another client, the TUI now dismisses the matching local UI instead
of leaving stale prompts behind and emitting a misleading local
cancellation.
This change teaches pending app-server request tracking to map
`serverRequest/resolved` notifications back to the concrete request type
and stable request key, then propagates that resolved request into TUI
prompt state. Approval, request-user-input, and MCP elicitation overlays
now drop the resolved current or queued request quietly, advance to the
next queued request when present, and avoid emitting abort/cancel events
for stale UI.
The latest update also retires matching prompts while they are still
deferred behind active streaming and suppresses buffered active-thread
requests whose app-server request id has already been resolved before
drain. `ChatWidget` removes a resolved request from both the deferred
interrupt queue and the materialized bottom-pane stack, while
active-thread request handling verifies the app-server request is still
pending before showing a prompt. Lifecycle events such as exec begin/end
remain queued so approved work can still render normally.
Tests cover resolved-request mapping, overlay dismissal behavior,
deferred prompt pruning for same-turn user input, exec approval IDs,
lifecycle-event retention, and the buffered active-thread ordering
regression.
Validation:
- `just fmt`
- `git diff --check`
- `cargo test -p codex-tui
resolved_buffered_approval_does_not_become_actionable_after_drain`
- `cargo test -p codex-tui
enqueue_primary_thread_session_replays_buffered_approval_after_attach`
- `cargo test -p codex-tui chatwidget::interrupts`
- `just fix -p codex-tui`
---------
Co-authored-by: Codex <noreply@openai.com>
# Summary
- implement local ThreadStore archive/unarchive operations
- implement local ThreadStore read_thread operation
- break up the various ThreadStore local method implementations into
separate files
- migrate app-server archive/unarchive and core archive fixture to use
ThreadStore (but not all read operations yet!)
- use the ThreadStore's read operation as a proxy check for thread
persistence/existence in the app server code
- move all other filesystem operations related to archive (path
validation etc) into the local thread store.
# Tests
- add dedicated local store archive/unarchive tests
## Summary
1. Add a Security Boundaries section to `SECURITY.md`.
2. Point readers to the Codex Agent approvals and security documentation
for sandboxing, approvals, and network controls.
## Validation
1. Reviewed the `SECURITY.md` diff in a clean worktree.
2. No tests run. Docs only change.
## Summary
- Track outbound remote-control sequence IDs independently for each
client stream.
- Retain unacked outbound messages per stream using FIFO buffers.
- Require stream-scoped acks and update tests for contiguous per-stream
sequencing.
## Why
The remote-control peer uses outbound sequence gaps to detect lost
messages and re-initialize. A single global outbound sequence counter
can create apparent gaps on an individual stream when another stream
receives an interleaved message.
## Validation
- `just fmt`
- `cargo test -p codex-app-server remote_control`
- `just fix -p codex-app-server`
- `git diff --check`
Builds on top of #17659
Move the filesystem + sqlite thread listing-related operations inside of
a local ThreadStore implementation and call ThreadStore from the places
that used to perform these filesystem/sqlite operations.
This is the first of a series of PRs that will implement the rest of the
local ThreadStore.
Testing:
- added unit tests for the thread store implementation
- adjusted some unit tests in the realtime + personality packages whose
callsites changed. Specifically I'm trying to hide ThreadMetadata inside
of the local implementation and make ThreadMetadata a sqlite
implementation detail concern rather than a public interface, preferring
the more generate StoredThread interface instead
- added a corner case test for the personality migration package that
wasn't covered by the existing test suite
- adjust the behavior of searched thread listing to run the existing
local rollout repair/backfill pass _before_ querying SQLite results, so
callers using ThreadStore::list_threads do not miss matches after a
partial metadata warm-up
## Summary
Stack PR 2 of 4 for feature-gated agent identity support.
This PR adds agent identity registration behind
`features.use_agent_identity`. It keeps the app-server protocol
unchanged and starts registration after ChatGPT auth exists rather than
requiring a client restart.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - this PR
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR4: https://github.com/openai/codex/pull/17388 - use `AgentAssertion`
downstream when enabled
## Validation
Covered as part of the local stack validation pass:
- `just fmt`
- `cargo test -p codex-core --lib agent_identity`
- `cargo test -p codex-core --lib agent_assertion`
- `cargo test -p codex-core --lib websocket_agent_task`
- `cargo test -p codex-api api_bridge`
- `cargo build -p codex-cli --bin codex`
## Notes
The full local app-server E2E path is still being debugged after PR
creation. The current branch stack is directionally ready for review
while that follow-up continues.
stacked on #17402.
MCP tools returned by `tool_search` (deferred tools) get registered in
our `ToolRegistry` with a different format than directly available
tools. this leads to two different ways of accessing MCP tools from our
tool catalog, only one of which works for each. fix this by registering
all MCP tools with the namespace format, since this info is already
available.
also, direct MCP tools are registered to responsesapi without a
namespace, while deferred MCP tools have a namespace. this means we can
receive MCP `FunctionCall`s in both formats from namespaces. fix this by
always registering MCP tools with namespace, regardless of deferral
status.
make code mode track `ToolName` provenance of tools so it can map the
literal JS function name string to the correct `ToolName` for
invocation, rather than supporting both in core.
this lets us unify to a single canonical `ToolName` representation for
each MCP tool and force everywhere to use that one, without supporting
fallbacks.
## Summary
- Port marketplace source support into the shared core marketplace-add
flow
- Support local marketplace directory sources
- Support direct `marketplace.json` URL sources
- Persist the new source types in config/schema and cover them in CLI
and app-server tests
## Validation
- `cargo test -p codex-core marketplace_add`
- `cargo test -p codex-cli marketplace_add`
- `cargo test -p codex-app-server marketplace_add`
- `just write-config-schema`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-cli`
## Context
Current `main` moved marketplace-add behavior into shared core code and
still assumed only git-backed sources. This change keeps that structure
but restores support for local directories and direct manifest URLs in
the shared path.
Problem: PR #17372 moved initialized request handling into
`dispatch_initialized_client_request`, leaving analytics code that uses
`connection_id` without a local binding and breaking `codex-app-server`
builds.
Solution: Restore the `connection_id` binding from
`connection_request_id` before initialized request validation and
analytics tracking.
## Summary
- Refactors `MessageProcessor` and per-connection session state so
initialized service RPC handling can be moved into spawned tasks in a
follow-up PR.
- Shares the processor and initialized session data with
`Arc`/`OnceLock` instead of mutable borrowed connection state.
- Keeps initialized request handling synchronous in this PR; it does
**not** call `tokio::spawn` for service RPCs yet.
## Testing
- `just fmt`
- `cargo test -p codex-app-server` *(fails on existing hardening gaps
covered by #17375, #17376, and #17377; the pipelined config regression
passed before the unrelated failures)*
- `just fix -p codex-app-server`
To allow the ability to have guaranteed-unique cursors, we make two
important updates:
* Add new updated_at_ms and created_at_ms columns that are in
millisecond precision
* Guarantee uniqueness -- if multiple items are inserted at the same
millisecond, bump the new one by one millisecond until it becomes unique
This lets us use single-number cursors for forwards and backwards paging
through resultsets and guarantee that the cursor is a fixed point to do
(timestamp > cursor) and get new items only.
This updated implementation is backwards-compatible since multiple
appservers can be running and won't handle the previous method well.
- Add outputModality to thread/realtime/start and wire text/audio output
selection through app-server, core, API, and TUI.\n- Rename the realtime
transcript delta notification and add a separate transcript done
notification that forwards final text from item done without correlating
it with deltas.
## Summary
Move `codex marketplace add` onto a shared core implementation so the
CLI and app-server path can use one source of truth.
This change:
- adds shared marketplace-add orchestration in `codex-core`
- switches the CLI command to call that shared implementation
- removes duplicated CLI-only marketplace add helpers
- preserves focused parser and add-path coverage while moving the shared
behavior into core tests
## Why
The new `marketplace/add` RPC should reuse the same underlying
marketplace-add flow as the CLI. This refactor lands that consolidation
first so the follow-up app-server PR can be mostly protocol and handler
wiring.
## Validation
- `cargo test -p codex-core marketplace_add`
- `cargo test -p codex-cli marketplace_cmd`
- `just fix -p codex-core`
- `just fix -p codex-cli`
- `just fmt`
## Summary
- Add `turn/inject_items` app-server v2 request support for appending
raw Responses API items to a loaded thread history without starting a
turn.
- Generate JSON schema and TypeScript protocol artifacts for the new
params and empty response.
- Document the new endpoint and include a request/response example.
- Preserve compatibility with the typo alias `turn/injet_items` while
returning the canonical method name.
## Testing
- Not run (not requested)
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
Currently app-server may unload actively running threads once the last
connection disconnects, which is not expected.
Instead track when was the last active turn & when there were any
subscribers the last time, also add 30 minute idleness/no subscribers
timer to reduce the churn.
Addresses #17593
Problem: A regression introduced in
https://github.com/openai/codex/pull/16492 made thread/start fail when
Codex could not persist trusted project state, which crashes startup for
users with read-only config.toml.
Solution: Treat trusted project persistence as best effort and keep the
current thread's config trusted in memory when writing config.toml
fails.
Addresses #17514
Problem: PR #16966 made the TUI render the deprecated context-compaction
notification, while v2 could also receive legacy unified-exec
interaction items alongside terminal-interaction notifications, causing
duplicate "Context compacted" and "Waited for background terminal"
messages.
Solution: Suppress deprecated context-compaction notifications and
legacy unified-exec interaction command items from the app-server v2
projection, and render canonical context-compaction items through the
existing TUI info-event path.
Addresses #17498
Problem: The TUI derived /status instruction source paths from the local
client environment, which could show stale <none> output or incorrect
paths when connected to a remote app server.
Solution: Add an app-server v2 instructionSources snapshot to thread
start/resume/fork responses, default it to an empty list when older
servers omit it, and render TUI /status from that server-provided
session data.
Additional context: The app-server field is intentionally named
instructionSources rather than AGENTS.md-specific terminology because
the loaded instruction sources can include global instructions, project
AGENTS.md files, AGENTS.override.md, user-defined instruction files, and
future dynamic sources.
- Let typed user messages submit while realtime is active and mirror
accepted text into the realtime text stream.
- Add integration coverage and snapshot for outbound realtime text.
## Summary
- Add an optional `tags` dictionary to feedback upload params.
- Capture the active app-server turn id in the TUI and submit it as
`tags.turn_id` with `/feedback` uploads.
- Merge client-provided feedback tags into Sentry feedback tags while
preserving reserved system fields like `thread_id`, `classification`,
`cli_version`, `session_source`, and `reason`.
## Behavior / impact
Existing feedback upload callers remain compatible because `tags` is
optional and nullable. The wire shape is still a normal JSON object /
TypeScript dictionary, so adding future feedback metadata will not
require a new top-level protocol field each time. This change only adds
feedback metadata for Codex CLI/TUI uploads; it does not affect existing
pipelines, DAGs, exports, or downstream consumers unless they choose to
read the new `turn_id` feedback tag.
## Tests
- `cargo fmt -- --config imports_granularity=Item` passed; stable
rustfmt warned that `imports_granularity` is nightly-only.
- `cargo run -p codex-app-server-protocol --bin write_schema_fixtures`
- `cargo test -p codex-feedback
upload_tags_include_client_tags_and_preserve_reserved_fields`
- `cargo test -p codex-app-server-protocol
schema_fixtures_match_generated`
- `cargo test -p codex-tui build_feedback_upload_params`
- `cargo test -p codex-tui
live_app_server_turn_started_sets_feedback_turn_id`
- `cargo check -p codex-app-server --tests`
- `git diff --check`
---------
Co-authored-by: Codex <noreply@openai.com>
Addresses #17302
Problem: `thread/list` compared cwd filters with raw path equality, so
`resume --last` could miss Windows sessions when the saved cwd used a
verbatim path form and the current cwd did not.
Solution: Normalize cwd comparisons through the existing path comparison
utilities before falling back to direct equality, and add Windows
regression coverage for verbatim paths. I made this a general utility
function and replaced all of the duplicated instance of it across the
code base.