## Why
Codex assisted-code attribution needs a client-side accepted-code source
that does not upload raw code. This adds a hash-only analytics event
derived from the turn diff so downstream attribution can compare
accepted Codex lines against commit or PR diffs.
## What Changed
- Parse accepted/effective added lines from the final turn diff and emit
`codex_accepted_line_fingerprints` analytics.
- Hash repo, path, and normalized line content before upload; raw code
and raw diffs are not included in the event.
- Chunk large fingerprint payloads and send accepted-line fingerprint
events in isolated requests while preserving normal batching for other
analytics events.
- Canonicalize Git remote URLs before repo hashing so SSH/HTTPS GitHub
remotes join to the same repo hash.
- Add parser coverage for unified diff hunk lines that look like `+++`
or `---` file headers.
## Verification
- `cargo test -p codex-analytics`
- `cargo test -p codex-git-utils canonicalize_git_remote_url`
- `just fix -p codex-analytics`
- `just bazel-lock-check`
- `git diff --check`
## Why
`session_id` and `thread_id` are separate identities after #20437, but
app-server only surfaced `sessionId` on the `thread/start`,
`thread/resume`, and `thread/fork` response envelopes. Other
thread-bearing surfaces such as `thread/list`, `thread/read`,
`thread/started`, `thread/rollback`, `thread/metadata/update`, and
`thread/unarchive` either lacked the grouping key or forced clients to
special-case those three responses.
Making `sessionId` part of the reusable `Thread` payload gives every v2
API surface one place to expose session-tree identity.
## Mental model
1. thread.sessionId lives on `Thread`
2. It is a view/runtime identity for the current live session tree, not
durable stored lineage metadata
3. When app-server has a live loaded thread, it copies the real value
from core’s session_configured.session_id
4. When it only has stored/unloaded data, it falls back to
thread.sessionId = thread.id
## What changed
- Added `sessionId` to the v2
[`Thread`](8fc9e9b4cf/codex-rs/app-server-protocol/src/protocol/v2/thread_data.rs (L105-L109)).
- Removed the duplicate top-level `sessionId` fields from
`thread/start`, `thread/resume`, and `thread/fork`; clients should now
read `response.thread.sessionId`.
- Populated `thread.sessionId` when building live thread responses,
replaying loaded threads, and returning stored-thread summaries so the
field is present across start, resume, fork, list, read, rollback,
metadata-update, unarchive, and `thread/started` paths. See
[`load_thread_from_resume_source_or_send_internal`](8fc9e9b4cf/codex-rs/app-server/src/request_processors/thread_processor.rs (L2824-L2918))
and
[`thread_from_stored_thread`](8fc9e9b4cf/codex-rs/app-server/src/request_processors/thread_processor.rs (L3671-L3719)).
- Preserved the stored-thread fallback: if a thread has not been loaded
into a live session tree yet, `thread.sessionId` falls back to
`thread.id`; once the thread is live again, the field reports the active
session tree root.
- Regenerated the JSON/TypeScript schemas and updated the app-server
README examples to show
[`thread.sessionId`](8fc9e9b4cf/codex-rs/app-server/README.md (L306-L310))
on the thread object.
## Why
`thread/start` and `thread/resume` already return `sessionId`, but
`thread/fork` only returned the new thread. That left clients to infer
the forked thread's session identity from `thread.id`, which kept the
new `session_id` / `thread_id` split implicit at one lifecycle boundary.
Follow-up to #20437.
## What changed
- Add `sessionId` to `ThreadForkResponse`.
- Populate it from the forked session configuration.
- Regenerate the v2 JSON/TypeScript schema fixtures and update the
app-server docs/example.
- Extend the fork integration test to assert the returned `sessionId`.
## Verification
- Added coverage in `thread_fork_creates_new_thread_and_emits_started`
for the new response field.
## Summary
Related to
https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
TLDR:
We update the meaning of session ids and thread ids:
* thread_id stays as now
* session_id become a shared id between every thread under a /root
thread (i.e. every sub-agent share the same session id)
This PR introduces an explicit `SessionId` and threads it through the
protocol/client boundary so `session_id` and `thread_id` can diverge
when they need to, while preserving compatibility for older serialized
`session_configured` events.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification
## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.
Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.
## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.
## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
## Why
`Turn.items` currently overloads an empty array to mean either that no
items exist or that the server intentionally did not load them for this
response. That ambiguity blocks future lazy-loading work where clients
need to distinguish unloaded, summary, and fully hydrated turn payloads.
## What changed
- add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full`
variants
- add required `itemsView` metadata to app-server `Turn` payloads
- mark reconstructed persisted history as `full` and live shell-style
turn payloads as `notLoaded`
- keep current `thread/turns/list` behavior unchanged and document that
it still returns `full` turns today
- regenerate the JSON and TypeScript protocol fixtures
## Verification
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server thread_read_can_include_turns`
- `cargo test -p codex-app-server
thread_turns_list_can_page_backward_and_forward`
- `cargo test -p codex-app-server
thread_resume_rejects_history_when_thread_is_running`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fmt`
## Why
The precursor PR keeps successful client responses typed until
app-server's outgoing response seam. This follow-up uses that seam to
move successful client-response analytics out of individual handlers and
into the shared sender path, while keeping filtering decisions inside
`codex-analytics`.
## What changed
- Emit successful client-response analytics centrally from
`OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current
thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so
unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and
normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the
centralized forwarding path.
## Verification
- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`
## Why
`pr17088` adds typed server-originated request/response plumbing, but
successful client responses are still erased into bare JSON-RPC `result`
values before app-server can make any typed decision about them.
This precursor PR keeps successful client responses typed until the
outgoing response seam. It is intentionally limited to
protocol/app-server plumbing so the analytics behavior change can review
separately on top.
## What changed
- Add `ClientResponsePayload` as the pre-serialization client response
body type.
- Route app-server successful response paths through the typed payload
seam while preserving existing handler-local analytics behavior.
- Keep `InterruptConversation` JSON-RPC-only because it has no
`ClientResponse` variant.
- Move the new payload conversion tests into a dedicated protocol test
module.
## Verification
- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server-protocol`