Mirror of https://github.com/openai/codex.git (synced 2026-05-30 07:50:17 +00:00)
## Why

Guardian reviews already emit analytics events, but we do not expose aggregate OpenTelemetry metrics for review volume, latency, token usage, or terminal outcomes. That makes it harder to monitor Guardian behavior during rollouts and to compare review outcomes by source, action type, session kind, model, and failure mode.

## What Changed

- Added Guardian review metric names for count, total duration, time to first token, and token usage in `codex-rs/otel`.
- Added `core/src/guardian/metrics.rs` to convert `GuardianReviewAnalyticsResult` into sanitized metric tags covering decision, terminal status, failure reason, approval request source, reviewed action, session kind, risk/outcome, model, reasoning effort, and context/truncation state.
- Emitted the new metrics from `track_guardian_review` for each terminal Guardian review result.

## Testing

- Added `guardian_review_metrics_record_counts_durations_and_token_usage`, which verifies the emitted count, duration, TTFT, token usage histograms, and tag set through the in-memory metrics exporter.
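
For readers unfamiliar with the tag-sanitization step described under What Changed, here is a minimal, standard-library-only sketch of the general idea. It is not the code in `core/src/guardian/metrics.rs`: the `ReviewResult` struct, its fields, the helper names `sanitize_tag_value` and `review_metric_tags`, the 64-character cap, and the `unknown`/`none` fallbacks are all assumptions made for illustration. The real `GuardianReviewAnalyticsResult` carries more fields than shown here.

```rust
// Hypothetical sketch: these types and names are illustrative and are
// NOT taken from the codex repository.

/// Simplified stand-in for a terminal review result (assumed shape).
struct ReviewResult {
    decision: String,
    model: Option<String>,
    failure_reason: Option<String>,
    truncated: bool,
}

/// Restrict a tag value to a small, safe alphabet and a bounded length so
/// that free-form strings cannot inflate metric cardinality.
fn sanitize_tag_value(raw: &str) -> String {
    const MAX_LEN: usize = 64; // assumed limit, chosen for the example
    let cleaned: String = raw
        .trim()
        .chars()
        .take(MAX_LEN)
        .map(|c| {
            if c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.') {
                c.to_ascii_lowercase()
            } else {
                '_'
            }
        })
        .collect();
    if cleaned.is_empty() {
        "unknown".to_string()
    } else {
        cleaned
    }
}

/// Convert a review result into a flat list of (key, value) metric tags.
fn review_metric_tags(result: &ReviewResult) -> Vec<(&'static str, String)> {
    vec![
        ("decision", sanitize_tag_value(&result.decision)),
        (
            "model",
            sanitize_tag_value(result.model.as_deref().unwrap_or("unknown")),
        ),
        (
            "failure_reason",
            sanitize_tag_value(result.failure_reason.as_deref().unwrap_or("none")),
        ),
        ("truncated", result.truncated.to_string()),
    ]
}
```

Mapping optional fields to fixed fallback values keeps the tag set the same shape for every review. That tends to make dashboards easier to build than tags that appear only on some data points.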