Files
codex/codex-rs/otel
Eric Traut 91b7350187 Add goal lifecycle metrics (#20799)
## Why

Adding goal metrics makes it possible to track how often goals are
created, completed, and stopped by budget limits, plus the final token
and wall-clock usage for terminal outcomes.

## What Changed

- Added OpenTelemetry metric constants for goal lifecycle tracking:
- `codex.goal.created`: increments each time a new persisted goal is
created or an existing goal is replaced with a new objective.
- `codex.goal.completed`: increments when a goal transitions to
`complete`.
- `codex.goal.budget_limited`: increments when a goal transitions to
`budget_limited` because its token budget has been reached.
- `codex.goal.token_count`: records the final persisted token count when
a goal transitions to `complete` or `budget_limited`.
- `codex.goal.duration_s`: records the final persisted elapsed
wall-clock time, in seconds, when a goal transitions to `complete` or
`budget_limited`.
- Emitted creation metrics when a goal is created or replaced.
- Emitted terminal outcome counters and final usage histograms when a
goal transitions to `complete` or `budget_limited`, avoiding
double-counting later in-flight accounting for already budget-limited
goals.
- Added focused `codex-core` tests for create/complete metrics and
one-time budget-limit metrics.
2026-05-05 09:21:54 -07:00
..
2026-05-05 09:21:54 -07:00

codex-otel

codex-otel is the OpenTelemetry integration crate for Codex. It provides:

  • Provider wiring for log/trace/metric exporters (codex_otel::OtelProvider and codex_otel::provider).
  • Session-scoped business event emission via codex_otel::SessionTelemetry.
  • Low-level metrics APIs via codex_otel::metrics.
  • Trace-context helpers via codex_otel::trace_context and crate-root re-exports.

Tracing and logs

Create an OTEL provider from OtelSettings. The provider also configures metrics (when enabled), then attach its layers to your tracing_subscriber registry:

use codex_otel::config::OtelExporter;
use codex_otel::config::OtelHttpProtocol;
use codex_otel::config::OtelSettings;
use codex_otel::OtelProvider;
use tracing_subscriber::prelude::*;

let settings = OtelSettings {
    environment: "dev".to_string(),
    service_name: "codex-cli".to_string(),
    service_version: env!("CARGO_PKG_VERSION").to_string(),
    codex_home: std::path::PathBuf::from("/tmp"),
    exporter: OtelExporter::OtlpHttp {
        endpoint: "https://otlp.example.com".to_string(),
        headers: std::collections::HashMap::new(),
        protocol: OtelHttpProtocol::Binary,
        tls: None,
    },
    trace_exporter: OtelExporter::OtlpHttp {
        endpoint: "https://otlp.example.com".to_string(),
        headers: std::collections::HashMap::new(),
        protocol: OtelHttpProtocol::Binary,
        tls: None,
    },
    metrics_exporter: OtelExporter::None,
};

if let Some(provider) = OtelProvider::from(&settings)? {
    let registry = tracing_subscriber::registry()
        .with(provider.logger_layer())
        .with(provider.tracing_layer());
    registry.init();
}

SessionTelemetry (events)

SessionTelemetry adds consistent metadata to tracing events and helps record Codex-specific session events. Rich session/business events should go through SessionTelemetry; subsystem-owned audit events can stay with the owning subsystem.

use codex_otel::SessionTelemetry;

let manager = SessionTelemetry::new(
    conversation_id,
    model,
    slug,
    account_id,
    account_email,
    auth_mode,
    originator,
    log_user_prompts,
    terminal_type,
    session_source,
);

manager.user_prompt(&prompt_items);

Metrics (OTLP or in-memory)

Modes:

  • OTLP: exports metrics via the OpenTelemetry OTLP exporter (HTTP or gRPC).
  • In-memory: records via opentelemetry_sdk::metrics::InMemoryMetricExporter for tests/assertions; call shutdown() to flush.

codex-otel also provides OtelExporter::Statsig, a shorthand for exporting OTLP/HTTP JSON metrics to Statsig using Codex-internal defaults.

Statsig ingestion (OTLP/HTTP JSON) example:

use codex_otel::config::{OtelExporter, OtelHttpProtocol};

let metrics = MetricsClient::new(MetricsConfig::otlp(
    "dev",
    "codex-cli",
    env!("CARGO_PKG_VERSION"),
    OtelExporter::OtlpHttp {
        endpoint: "https://api.statsig.com/otlp".to_string(),
        headers: std::collections::HashMap::from([(
            "statsig-api-key".to_string(),
            std::env::var("STATSIG_SERVER_SDK_SECRET")?,
        )]),
        protocol: OtelHttpProtocol::Json,
        tls: None,
    },
))?;

metrics.counter("codex.session_started", 1, &[("source", "tui")])?;
metrics.histogram("codex.request_latency", 83, &[("route", "chat")])?;

In-memory (tests):

let exporter = InMemoryMetricExporter::default();
let metrics = MetricsClient::new(MetricsConfig::in_memory(
    "test",
    "codex-cli",
    env!("CARGO_PKG_VERSION"),
    exporter.clone(),
))?;
metrics.counter("codex.turns", 1, &[("model", "gpt-5.1")])?;
metrics.shutdown()?; // flushes in-memory exporter

Trace context

Trace propagation helpers remain separate from the session event emitter:

use codex_otel::current_span_w3c_trace_context;
use codex_otel::set_parent_from_w3c_trace_context;

Shutdown

  • OtelProvider::shutdown() stops the OTEL exporter.
  • SessionTelemetry::shutdown_metrics() flushes and shuts down the metrics provider.

Both are optional because drop performs best-effort shutdown, but calling them explicitly gives deterministic flushing (or a shutdown error if flushing does not complete in time).