## Why
Review telemetry should describe reviews as first-class events, not only
as counters denormalized onto terminal tool-item events. That lets us
analyze guardian and user reviews consistently across command execution,
file changes, permissions, and network access, while still preserving
the terminal item summaries that existing tool analytics need.
To make those review events accurate, analytics also needs the observed
completion time for each review and enough command metadata to
distinguish `shell` from `unified_exec` reviews.
## What changed
- emit generic `codex_review_event` rows for completed user and guardian
reviews, with review subjects, reviewer, trigger, terminal status,
resolution, and observed duration
- reduce approval request / response / abort facts into review events
for command execution, file change, and permissions flows
- keep denormalized review counts, final approval outcome, and
permission-request flags on terminal tool-item events for
item-associated reviews
- plumb review completion timing so user-review responses and aborts use
app-server-observed completion times, while guardian analytics reuse the
same terminal timestamps emitted on guardian assessment events
- carry command approval `source` through the protocol and app-server
layers so review analytics can distinguish `shell` from `unified_exec`
- add analytics coverage for user-review emission, guardian-review
emission, permission reviews that should not denormalize onto tool
items, item-summary isolation across threads, and the serialized
review-event shape
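As a rough illustration of the request / response / abort reduction described above, the sketch below pairs a pending request with its terminal fact and emits one review event with an observed duration. All type, field, and variant names here (`ReviewFact`, `ReviewReducer`, `request_id`, and so on) are hypothetical stand-ins, not the actual `codex-analytics` types.

```rust
use std::collections::HashMap;

/// Hypothetical facts seen by the reducer; names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum ReviewFact {
    Requested { request_id: u64, started_at_ms: u64 },
    Responded { request_id: u64, approved: bool, completed_at_ms: u64 },
    Aborted { request_id: u64, completed_at_ms: u64 },
}

#[derive(Debug, Clone, PartialEq)]
enum ReviewStatus {
    Approved,
    Denied,
    Aborted,
}

/// Simplified stand-in for a completed review row.
#[derive(Debug, Clone, PartialEq)]
struct ReviewEvent {
    request_id: u64,
    status: ReviewStatus,
    duration_ms: u64,
}

#[derive(Default)]
struct ReviewReducer {
    /// request_id -> started_at_ms for reviews that have not completed yet.
    pending: HashMap<u64, u64>,
}

impl ReviewReducer {
    /// Returns a review event only when a terminal fact matches a pending request.
    fn ingest(&mut self, fact: ReviewFact) -> Option<ReviewEvent> {
        match fact {
            ReviewFact::Requested { request_id, started_at_ms } => {
                self.pending.insert(request_id, started_at_ms);
                None
            }
            ReviewFact::Responded { request_id, approved, completed_at_ms } => {
                let status = if approved { ReviewStatus::Approved } else { ReviewStatus::Denied };
                self.complete(request_id, status, completed_at_ms)
            }
            ReviewFact::Aborted { request_id, completed_at_ms } => {
                self.complete(request_id, ReviewStatus::Aborted, completed_at_ms)
            }
        }
    }

    fn complete(
        &mut self,
        request_id: u64,
        status: ReviewStatus,
        completed_at_ms: u64,
    ) -> Option<ReviewEvent> {
        // Terminal facts without a matching request are dropped in this sketch.
        let started_at_ms = self.pending.remove(&request_id)?;
        Some(ReviewEvent {
            request_id,
            status,
            // Duration comes from the emitter's timestamps, not a reducer clock.
            duration_ms: completed_at_ms.saturating_sub(started_at_ms),
        })
    }
}
```

In this sketch a request at `1_000` ms followed by an approval at `1_250` ms yields one event with `duration_ms: 250`, while an abort for an unknown request yields nothing.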
## Verification
- `cargo test -p codex-analytics`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18748).
* __->__ #18748
* #21434
* #18747
* #17090
* #17089
* #20514
## Summary
- accumulate completed tool-item counts per turn from the item lifecycle
- populate the reserved count fields on `codex_turn_event`
- add reducer coverage for zero-count turns and mixed completed tool
items
## Why
PR #17090 moved tool-item analytics onto the item lifecycle, so the turn
reducer can now derive the per-turn tool counts from the same completed
items instead of leaving the reserved fields null.
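A minimal sketch of per-turn accumulation, assuming completed items arrive tagged with a turn id and a tool kind. The `ToolKind` variants, the `TurnToolCounts` fields, and the `TurnCounter` type are illustrative assumptions; the real `codex_turn_event` count fields may be named and grouped differently.

```rust
use std::collections::HashMap;

/// Illustrative subset of tool-item kinds (assumed, not the real enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToolKind {
    CommandExecution,
    FileChange,
    McpCall,
    WebSearch,
}

/// Hypothetical per-turn counts; defaults to all zeros.
#[derive(Debug, Default, PartialEq)]
struct TurnToolCounts {
    command_execution: u32,
    file_change: u32,
    mcp_call: u32,
    web_search: u32,
}

#[derive(Default)]
struct TurnCounter {
    by_turn: HashMap<String, TurnToolCounts>,
}

impl TurnCounter {
    /// Called once for each completed tool item in the item lifecycle.
    fn on_item_completed(&mut self, turn_id: &str, kind: ToolKind) {
        let counts = self.by_turn.entry(turn_id.to_string()).or_default();
        match kind {
            ToolKind::CommandExecution => counts.command_execution += 1,
            ToolKind::FileChange => counts.file_change += 1,
            ToolKind::McpCall => counts.mcp_call += 1,
            ToolKind::WebSearch => counts.web_search += 1,
        }
    }

    /// Called when the turn ends; a turn with no tool items yields zeros.
    fn finish_turn(&mut self, turn_id: &str) -> TurnToolCounts {
        self.by_turn.remove(turn_id).unwrap_or_default()
    }
}
```

Removing the entry in `finish_turn` keeps the map from growing across turns, and the `unwrap_or_default` fallback is what makes a zero-count turn report `0` rather than a missing value.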
## Validation
- `just fmt`
- `cargo test -p codex-analytics`
## Why
Codex assisted-code attribution needs a client-side accepted-code source
that does not upload raw code. This adds a hash-only analytics event
derived from the turn diff so downstream attribution can compare
accepted Codex lines against commit or PR diffs.
## What Changed
- Parse accepted/effective added lines from the final turn diff and emit
`codex_accepted_line_fingerprints` analytics.
- Hash repo, path, and normalized line content before upload; raw code
and raw diffs are not included in the event.
- Chunk large fingerprint payloads and send accepted-line fingerprint
events in isolated requests while preserving normal batching for other
analytics events.
- Canonicalize Git remote URLs before repo hashing so SSH/HTTPS GitHub
remotes join to the same repo hash.
- Add parser coverage for unified diff hunk lines that look like `+++`
or `---` file headers.
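The header-lookalike case can be sketched as follows: tracking the line counts announced by each `@@` hunk header lets the parser treat `+++ ...` inside a hunk as an added line rather than a file header. This is a simplified sketch, not the actual parser; the function names are hypothetical, the whitespace normalization rule is an assumption, and `DefaultHasher` is only a standard-library stand-in for whatever hash the real event uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Parses `@@ -a,b +c,d @@` and returns (old_count, new_count).
/// A missing count defaults to 1, as in the unified diff format.
fn parse_hunk_header(line: &str) -> Option<(usize, usize)> {
    let rest = line.strip_prefix("@@ -")?;
    let mut parts = rest.split_whitespace();
    let old = parts.next()?;
    let new = parts.next()?.strip_prefix('+')?;
    let count = |range: &str| -> Option<usize> {
        match range.split_once(',') {
            Some((_, n)) => n.parse().ok(),
            None => Some(1),
        }
    };
    Some((count(old)?, count(new)?))
}

/// Collects added lines, using hunk line counts to tell hunk content
/// apart from `+++` / `---` file headers.
fn added_lines(diff: &str) -> Vec<String> {
    let mut added = Vec::new();
    let (mut old_left, mut new_left) = (0usize, 0usize);
    for line in diff.lines() {
        if old_left == 0 && new_left == 0 {
            // Outside a hunk: only a hunk header changes state here.
            if let Some((old, new)) = parse_hunk_header(line) {
                old_left = old;
                new_left = new;
            }
            continue;
        }
        if let Some(text) = line.strip_prefix('+') {
            added.push(text.to_string());
            new_left = new_left.saturating_sub(1);
        } else if line.starts_with('-') {
            old_left = old_left.saturating_sub(1);
        } else if line.starts_with('\\') {
            // "\ No newline at end of file" marker: consumes no line count.
        } else {
            // Context line: counts against both sides of the hunk.
            old_left = old_left.saturating_sub(1);
            new_left = new_left.saturating_sub(1);
        }
    }
    added
}

/// Hash-only fingerprint; whitespace collapsing is an assumed normalization.
fn fingerprint(repo: &str, path: &str, line: &str) -> u64 {
    let normalized = line.split_whitespace().collect::<Vec<_>>().join(" ");
    let mut hasher = DefaultHasher::new();
    (repo, path, normalized.as_str()).hash(&mut hasher);
    hasher.finish()
}
```

For a hunk declared as `@@ -1,2 +1,3 @@` that contains the line `+++ not a header`, `added_lines` returns `++ not a header` as accepted content, and only the hashed form of that line would leave the client.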
## Verification
- `cargo test -p codex-analytics`
- `cargo test -p codex-git-utils canonicalize_git_remote_url`
- `just fix -p codex-analytics`
- `just bazel-lock-check`
- `git diff --check`
## Why
We want terminal tool review analytics, but the reducer should not stamp
review timing from its own wall clock.
This PR plumbs review timing through the real protocol and app-server
seams so downstream analytics can consume the emitter's timestamps
directly. Guardian reviews keep their enriched `started_at` /
`completed_at` analytics fields by deriving those legacy second-based
values from the same protocol-native millisecond lifecycle timestamps,
rather than sampling a separate analytics clock.
## What changed
- add `started_at_ms` to user approval request payloads
- add `started_at_ms` / `completed_at_ms` to guardian review
notifications
- preserve Guardian review `started_at` / `completed_at` enrichment from
the protocol-native timing source
- stamp typed `ServerResponse` analytics facts with app-server-observed
`completed_at_ms`
- thread the new timing fields through core, protocol, app-server, TUI,
and analytics fixtures
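To illustrate deriving the legacy second-based fields from the millisecond lifecycle timestamps, here is a small sketch. The struct names are hypothetical, and truncating toward whole seconds is an assumption; the PR text does not state the rounding rule.

```rust
/// Protocol-native lifecycle timestamps in milliseconds (illustrative struct).
#[derive(Debug, Clone, Copy, PartialEq)]
struct GuardianTimingMs {
    started_at_ms: u64,
    completed_at_ms: u64,
}

/// Legacy analytics fields in whole seconds (illustrative struct).
#[derive(Debug, Clone, Copy, PartialEq)]
struct LegacyTimingSecs {
    started_at: u64,
    completed_at: u64,
}

impl GuardianTimingMs {
    /// Derives both legacy values from the same source timestamps,
    /// so no second clock is sampled. Truncation is an assumed rounding rule.
    fn to_legacy_seconds(self) -> LegacyTimingSecs {
        LegacyTimingSecs {
            started_at: self.started_at_ms / 1000,
            completed_at: self.completed_at_ms / 1000,
        }
    }
}
```

Because both representations come from one pair of timestamps, the second-based and millisecond-based fields cannot drift apart the way two independently sampled clocks could.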
## Verification
- `cargo test -p codex-app-server outgoing_message --manifest-path
codex-rs/Cargo.toml`
- `cargo test -p codex-app-server-protocol guardian --manifest-path
codex-rs/Cargo.toml`
- `cargo test -p codex-tui guardian --manifest-path codex-rs/Cargo.toml`
- `cargo test -p codex-analytics analytics_client_tests --manifest-path
codex-rs/Cargo.toml`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/21434).
* #18748
* __->__ #21434
* #18747
* #17090
* #17089
* #20514
## Why
We want to emit terminal review analytics for tool-related approval
flows, but the event contract needs to exist before the reducer can
publish anything.
This PR is the schema-only slice for the Codex review event family.
## What changed
- add the `ReviewEvent` analytics envelope in
`codex-rs/analytics/src/events.rs`
- define the review subject kind, reviewer, trigger, terminal status,
and post-review resolution enums
- define the review event payload with thread, turn, item, lineage,
tool, and timing fields that the emitter stack will populate
## Verification
- stacked verification in dependent PRs: `cargo test -p codex-analytics
analytics_client_tests --manifest-path codex-rs/Cargo.toml`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18747).
* #18748
* #21434
* __->__ #18747
* #17090
* #17089
* #20514
## Why
After the tool-item schemas are in place, analytics needs to emit them
from the app-server item lifecycle rather than requiring bespoke
tracking at each callsite. The reducer should also reuse the shared
thread analytics context introduced below it in the stack so later event
families do not repeat the same reducer joins or missing-state ladder.
## What changed
- Tracks tool-item completion notifications and emits the matching tool
analytics event when a terminal item arrives.
- Derives event-specific payload details for command execution, file
changes, MCP calls, dynamic tools, collaboration tools, web search, and
image generation.
- Denormalizes thread, app-server client, runtime, and subagent
provenance metadata through the shared thread analytics context.
- Adds reducer coverage for item lifecycle emission and subagent
metadata inheritance.
## Duration semantics
`duration_ms` is computed from the app-server item lifecycle timestamps:
`completed_at_ms - started_at_ms`. That makes it the duration of the
lifecycle Codex observed locally, not necessarily the upstream
provider's full execution time.
- Web search usually has a meaningful observed lifecycle because
Responses can send `response.output_item.added` before
`response.output_item.done`; in that case `started_at_ms` comes from the
added event and `completed_at_ms` comes from the done event.
- Image generation can be much less precise. In the current observed
stream, image generation often arrives only as a completed
`response.output_item.done`; when there is no earlier added event, Codex
synthesizes the started item immediately before completion, so
`duration_ms` can be `0` even though upstream image generation took
longer.
- Standalone web search and standalone image generation work is expected
to land after this stack. Those paths may introduce more direct
lifecycle events or timing points, so the current
web-search/image-generation duration semantics should be treated as the
best available item-lifecycle approximation, not the final latency
contract for those tool families.
- `execution_duration_ms` is populated only where the completed item
already carries a native execution duration; otherwise it remains `null`
while `duration_ms` still reflects the local lifecycle interval.
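The duration rules above can be sketched as a small function. The `ItemLifecycle` and `Durations` types are hypothetical, and modeling a synthesized start as exactly equal to the completion time is a simplification of "immediately before completion".

```rust
/// Timestamps observed for one item in the app-server lifecycle (illustrative).
#[derive(Debug, Clone, Copy)]
struct ItemLifecycle {
    /// `None` when no earlier `added` event was observed.
    started_at_ms: Option<u64>,
    completed_at_ms: u64,
    /// Native execution duration, if the completed item carries one.
    native_execution_ms: Option<u64>,
}

#[derive(Debug, PartialEq)]
struct Durations {
    duration_ms: u64,
    execution_duration_ms: Option<u64>,
}

fn durations(item: ItemLifecycle) -> Durations {
    // Assumption: a synthesized start is modeled as the completion time itself.
    let started_at_ms = item.started_at_ms.unwrap_or(item.completed_at_ms);
    Durations {
        // Locally observed lifecycle interval, not upstream execution time.
        duration_ms: item.completed_at_ms.saturating_sub(started_at_ms),
        // Stays `None` unless the item reports its own execution duration.
        execution_duration_ms: item.native_execution_ms,
    }
}
```

Under these assumptions, a web search observed from `100` ms to `450` ms reports `duration_ms: 350`, while an image generation item that arrives only as a completed event reports `duration_ms: 0` regardless of how long the upstream work took.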
## Currently placeholder / partial fields
Some fields are included in the schema for the intended steady-state
contract, but this PR does not yet populate them from real
approval/review state:
- `review_count`, `guardian_review_count`, and `user_review_count`
currently default to `0`.
- `final_approval_outcome` currently defaults to `unknown`.
- `requested_additional_permissions` and `requested_network_access`
currently default to `false`.
## Verification
- `cargo test -p codex-analytics`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17090).
* #18748
* #18747
* __->__ #17090
* #17089
* #20514
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification
## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.
Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.
## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.
## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
## Why
Several analytics event families need the same per-thread attribution
state: the app-server client/runtime associated with a thread and, for
lifecycle-oriented events, the thread metadata captured during
initialization. Keeping connection ids and lifecycle metadata in
separate maps made each consumer rebuild the same thread context and
made subagent attribution harder to resolve consistently.
## What changed
- Replaces the separate thread connection and metadata maps with one
reducer-owned `threads` map.
- Routes guardian, compaction, turn-steer, and turn analytics through
shared thread-state lookups while preserving turn-origin attribution for
turn events and request-origin attribution for steer events.
- Lets newly observed spawned subagent threads inherit their parent
thread connection so later thread-scoped analytics can resolve through
the same state model.
- Adds regression coverage for standalone `SubAgentThreadStarted`
publication plus the `SubAgentSource::ThreadSpawn` parent fallback
through a thread-scoped consumer that depends on inherited connection
state.
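A minimal sketch of the single `threads` map with parent-connection inheritance, assuming string thread ids and numeric connection ids. Apart from the `threads` field name, every name here is a hypothetical stand-in for the reducer's real state model.

```rust
use std::collections::HashMap;

/// Per-thread attribution state kept in one place (illustrative fields).
#[derive(Debug, Clone, PartialEq)]
struct ThreadState {
    connection_id: Option<u64>,
    parent_thread_id: Option<String>,
}

#[derive(Default)]
struct Reducer {
    /// One reducer-owned map replaces separate connection and metadata maps.
    threads: HashMap<String, ThreadState>,
}

impl Reducer {
    fn on_thread_initialized(&mut self, thread_id: &str, connection_id: u64) {
        self.threads.insert(
            thread_id.to_string(),
            ThreadState { connection_id: Some(connection_id), parent_thread_id: None },
        );
    }

    /// A newly observed spawned subagent thread inherits its parent's connection.
    fn on_subagent_spawned(&mut self, thread_id: &str, parent_thread_id: &str) {
        let inherited = self
            .threads
            .get(parent_thread_id)
            .and_then(|parent| parent.connection_id);
        self.threads.entry(thread_id.to_string()).or_insert(ThreadState {
            connection_id: inherited,
            parent_thread_id: Some(parent_thread_id.to_string()),
        });
    }

    /// Shared lookup used by thread-scoped consumers.
    fn connection_for(&self, thread_id: &str) -> Option<u64> {
        self.threads.get(thread_id)?.connection_id
    }
}
```

With this shape, a thread-scoped consumer only calls `connection_for` and does not need to know whether the connection was recorded directly or inherited from a parent thread.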
## Verification
- `cargo test -p codex-analytics`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20300).
* #18748
* #18747
* #17090
* #17089
* #20239
* #20515
* #20514
* __->__ #20300
## Why
The precursor PR keeps successful client responses typed until
app-server's outgoing response seam. This follow-up uses that seam to
move successful client-response analytics out of individual handlers and
into the shared sender path, while keeping filtering decisions inside
`codex-analytics`.
## What changed
- Emit successful client-response analytics centrally from
`OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current
thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so
unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and
normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the
centralized forwarding path.
## Verification
- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`
## Why
Codex analytics needs a typed seam for app-server-originated
request/response traffic so future tool-approval analytics can consume
those facts without adding bespoke callsite tracking each time. Server
responses arrive as JSON-RPC `id + result` payloads, so analytics has to
reconstruct the matching typed response from the original typed request
while that request context still exists in app-server.
This also puts analytics on the app-server outbound path, which needs to
avoid keeping the runtime alive during shutdown. The final ownership fix
keeps the normal strong auth-manager retention in analytics and makes
the external-auth refresh bridge hold a weak back-reference to
`OutgoingMessageSender`, breaking the runtime cycle at the bridge
boundary instead of exposing retention policy through the analytics
client API.
## What changed
- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus
`AnalyticsEventsClient::track_server_request` and
`track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and
`ClientResponse` so reducers can distinguish client-to-server traffic
from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed
request to decode the matching typed server response from a raw JSON-RPC
result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and
records targeted server requests, replayed targeted requests, and
matching targeted responses with the responding connection id needed for
correlation.
- Intentionally leaves broadcast server requests/responses out of
analytics for now because the current model is per connection, while
broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing
`Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and
upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for
now; this PR is plumbing for later tool-approval analytics work.
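The weak back-reference idea can be sketched with the standard library alone. `OutgoingSender` and `RefreshBridge` below are simplified stand-ins for `OutgoingMessageSender` and `ExternalAuthRefreshBridge`; their fields and methods are assumptions made for illustration.

```rust
use std::sync::{Arc, Weak};

/// Simplified stand-in for the outgoing message sender.
struct OutgoingSender {
    name: &'static str,
}

impl OutgoingSender {
    fn send_refresh_request(&self) -> String {
        format!("{}: refresh requested", self.name)
    }
}

/// Simplified stand-in for the refresh bridge. Holding a `Weak` means the
/// bridge does not keep the sender (and its runtime) alive on its own.
struct RefreshBridge {
    sender: Weak<OutgoingSender>,
}

impl RefreshBridge {
    /// Upgrades only when a refresh is actually requested; returns `None`
    /// if the sender has already been dropped during shutdown.
    fn refresh(&self) -> Option<String> {
        let sender = self.sender.upgrade()?;
        Some(sender.send_refresh_request())
    }
}
```

Creating the bridge with `Arc::downgrade(&sender)` leaves the strong count at one, so dropping the last strong `Arc` releases the sender even though the bridge still exists, and later `refresh` calls return `None`.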
## Verification
- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed,
broadcast-exclusion, and response-attribution analytics paths.
## Follow-up
This PR intentionally stops at ingestion plumbing, so `ServerRequest`
and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR
adds real downstream analytics output for those facts:
- replace the temporary pre-reducer observation seam with reducer tests
for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs`
for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor
of the real-output coverage above.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088).
* #18748
* #18747
* #17090
* #17089
* #20241
* #20239
* __->__ #17088
Continue extracting memories out of core by moving the write trigger into
the app-server.
This is temporary; the trigger should move to the client level as a follow-up.
This makes core fully independent of `codex-memories-write`.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Runtime decisions should not infer permissions from the lossy legacy
sandbox projection once `PermissionProfile` is available. In particular,
`Disabled` and `External` need to remain distinct, and managed profiles
with split filesystem or deny-read rules should not be collapsed before
approval, network, safety, or analytics code makes decisions.
## What Changed
- Changes managed network proxy setup and network approval logic to use
`PermissionProfile` when deciding whether a managed sandbox is active.
- Migrates patch safety, Guardian/user-shell approval paths, Landlock
helper setup, analytics sandbox classification, and selected
turn/session code to profile-backed permissions.
- Validates command-level profile overrides against the constrained
`PermissionProfile` rather than a strict `SandboxPolicy` round trip.
- Preserves configured deny-read restrictions when command profiles are
narrowed.
- Adds coverage for profile-backed trust, network proxy/approval
behavior, patch safety, analytics classification, and command-profile
narrowing.
## Verification
- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19393).
* #19395
* #19394
* __->__ #19393
# Why
Add product analytics for hook handler executions so we can understand
which hooks are running, where they came from, and whether they
completed, failed, stopped, or blocked work.
# What
- add the new `codex_hook_run` analytics event and payload plumbing in
`codex-rs/analytics`
- emit hook-run analytics from the shared hook completion path in
`codex-rs/core`
- classify hook source from the loaded hook path as `system`, `user`,
`project`, or `unknown`
```
{
"event_type": "codex_hook_run",
"event_params": {
"thread_id": "string",
"turn_id": "string",
"model_slug": "string",
"hook_name": "string", // any HookEventName
"hook_source": "system | user | project | unknown",
"status": "completed | failed | stopped | blocked"
}
}
```
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Adds a `thread_source` field to the existing Codex turn metadata sent to
the Responses API:
- Sends `thread_source: "user"` for user-initiated sessions: CLI, VS
Code, and Exec
- Sends `thread_source: "subagent"` for subagent sessions
- Omits `thread_source` for MCP, custom, and unknown session sources
- Uses the existing turn metadata transport:
- HTTP requests send through the `x-codex-turn-metadata` header
- WebSocket `response.create` requests send through
`client_metadata["x-codex-turn-metadata"]`
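The classification rules above can be sketched as a single mapping function. The `SessionSource` enum here is a simplified, assumed shape (the real type may carry data on some variants), and `thread_source_name` is a hypothetical function name.

```rust
/// Simplified session sources; the variant set is an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SessionSource {
    Cli,
    VsCode,
    Exec,
    SubAgent,
    Mcp,
    Custom,
    Unknown,
}

/// Returns the `thread_source` value to send, or `None` to omit the field.
fn thread_source_name(source: SessionSource) -> Option<&'static str> {
    match source {
        // User-initiated sessions.
        SessionSource::Cli | SessionSource::VsCode | SessionSource::Exec => Some("user"),
        SessionSource::SubAgent => Some("subagent"),
        // No classification is sent for these sources.
        SessionSource::Mcp | SessionSource::Custom | SessionSource::Unknown => None,
    }
}
```

Returning `Option` keeps "omit the field" distinct from sending an empty or placeholder string, which matches the omit behavior listed for MCP, custom, and unknown sources.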
## Testing
- `cargo test -p codex-protocol
session_source_thread_source_name_classifies_user_and_subagent_sources`
- `cargo test -p codex-core turn_metadata_state`
- `cargo test -p codex-core --test responses_headers
responses_stream_includes_turn_metadata_header_for_git_workspace_e2e --
--nocapture`