Commit Graph

10 Commits

Author SHA1 Message Date
Anton Panasenko
8660ad6c64 feat: show runtime metrics in console (#10278)
Summary of changes:

- Adds a new feature flag: runtime_metrics
  - Declared in core/src/features.rs
  - Added to core/config.schema.json
  - Wired into OTEL init in core/src/otel_init.rs

- Enables on-demand runtime metric snapshots in OTEL
  - Adds runtime_metrics: bool to otel/src/config.rs
  - Enables experimental custom reader features in otel/Cargo.toml
  - Adds snapshot/reset/summary APIs in:
    - otel/src/lib.rs
    - otel/src/metrics/client.rs
    - otel/src/metrics/config.rs
    - otel/src/metrics/error.rs

- Defines metric names and a runtime summary builder
  - New files:
    - otel/src/metrics/names.rs
    - otel/src/metrics/runtime_metrics.rs
  - Summarizes totals for:
    - Tool calls
    - API requests
    - SSE/streaming events

- Instruments metrics collection in OTEL manager
  - otel/src/traces/otel_manager.rs now records:
    - API call counts + durations
    - SSE event counts + durations (success/failure)
    - Tool call metrics now use shared constants

- Surfaces runtime metrics in the TUI
  - Resets runtime metrics at turn start in tui/src/chatwidget.rs
- Displays metrics in the final separator line in
tui/src/history_cell.rs

- Adds tests
  - New OTEL tests:
    - otel/tests/suite/snapshot.rs
    - otel/tests/suite/runtime_summary.rs
  - New TUI test:
- final_message_separator_includes_runtime_metrics in
tui/src/history_cell.rs

Scope:
- 19 files changed
- ~652 insertions, 38 deletions


<img width="922" height="169" alt="Screenshot 2026-01-30 at 4 11 34 PM"
src="https://github.com/user-attachments/assets/1efd754d-a16d-4564-83a5-f4442fd2f998"
/>
2026-01-30 22:20:02 -08:00
jif-oai
16c66c37eb chore: move otel provider outside of trace module (#8968) 2026-01-09 12:42:54 +00:00
jif-oai
634650dd25 feat: metrics capabilities (#8318)
Add metrics capabilities to Codex. The `README.md` is up to date.

This will not be merged with the metrics before this PR of course:
https://github.com/openai/codex/pull/8350
2026-01-08 11:47:36 +00:00
dependabot[bot]
c673e7adb6 chore(deps): bump tracing-opentelemetry from 0.31.0 to 0.32.0 in /codex-rs (#8415)
Bumps
[tracing-opentelemetry](https://github.com/tokio-rs/tracing-opentelemetry)
from 0.31.0 to 0.32.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/tracing-opentelemetry/releases">tracing-opentelemetry's
releases</a>.</em></p>
<blockquote>
<h2>0.32.0</h2>
<h3>Added</h3>
<ul>
<li>Add configuration for including <code>target</code> in spans (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/222">#222</a>)</li>
</ul>
<h3>Changed</h3>
<ul>
<li>OpenTelemetry context activation (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/202">#202</a>)
<ul>
<li>Trace ID and span ID can be obtained from <code>OtelData</code> via
dedicated functions. Note that these
will be available only if the context has already been built. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/233">#233</a>)</li>
</ul>
</li>
<li>Correctly track entered and exited state for timings (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/212">#212</a>)</li>
<li>Slightly improve error message on version mismatch (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/211">#211</a>)</li>
<li>Remove Lazy for thread_local static (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/215">#215</a>)</li>
<li>Update description of special fields and semantic conventions</li>
</ul>
<h3>Breaking Changes</h3>
<ul>
<li>The attributes <code>code.filepath</code>, <code>code.lineno</code>,
and <code>code.namespace</code> have been renamed to
<code>code.file.path</code>, and <code>code.line.number</code>, and
<code>code.module.name</code>, to align with the opentelemetry
semantic conventions for code. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/225">#225</a>)</li>
<li>Upgrade from opentelemetry to 0.31.0. Refer to the upstream
<a
href="https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/CHANGELOG.md#0310">changelog</a>
for more information. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/230">#230</a>)</li>
<li>Hold onto <code>MetricsProvider</code> in <code>MetricsLayer</code>
(<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/224">#224</a>)</li>
<li>The attribute <code>otel.status_message</code> was changed to
<code>otel.status_description</code> to align with the
opentelemetry semantic conventions for code. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/209">#209</a>)</li>
<li>Remove the <code>metrics_gauge_unstable</code> feature.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.1.x/CHANGELOG.md">tracing-opentelemetry's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/tokio-rs/tracing-opentelemetry/compare/v0.31.0...v0.32.0">0.32.0</a>
- 2025-09-29</h2>
<h3>Added</h3>
<ul>
<li>Add configuration for including <code>target</code> in spans (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/222">#222</a>)</li>
</ul>
<h3>Changed</h3>
<ul>
<li>OpenTelemetry context activation (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/202">#202</a>)
<ul>
<li>Trace ID and span ID can be obtained from <code>OtelData</code> via
dedicated functions. Note that these
will be available only if the context has already been built. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/233">#233</a>)</li>
</ul>
</li>
<li>Correctly track entered and exited state for timings (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/212">#212</a>)</li>
<li>Slightly improve error message on version mismatch (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/211">#211</a>)</li>
<li>Remove Lazy for thread_local static (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/215">#215</a>)</li>
<li>Update description of special fields and semantic conventions</li>
</ul>
<h3>Breaking Changes</h3>
<ul>
<li>The attributes <code>code.filepath</code>, <code>code.lineno</code>,
and <code>code.namespace</code> have been renamed to
<code>code.file.path</code>, and <code>code.line.number</code>, and
<code>code.module.name</code>, to align with the opentelemetry
semantic conventions for code. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/225">#225</a>)</li>
<li>Upgrade from opentelemetry to 0.31.0. Refer to the upstream
<a
href="https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/CHANGELOG.md#0310">changelog</a>
for more information. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/230">#230</a>)</li>
<li>Hold onto <code>MetricsProvider</code> in <code>MetricsLayer</code>
(<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/224">#224</a>)</li>
<li>The attribute <code>otel.status_message</code> was changed to
<code>otel.status_description</code> to align with the
opentelemetry semantic conventions for code. (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/pull/209">#209</a>)</li>
<li>Remove the <code>metrics_gauge_unstable</code> feature.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f663332dd8"><code>f663332</code></a>
chore: prepare release of 0.32.0</li>
<li><a
href="0154fa470b"><code>0154fa4</code></a>
chore: fix docs link</li>
<li><a
href="d684c2ee36"><code>d684c2e</code></a>
chore: delete removed docs.rs feature</li>
<li><a
href="73a6baf71d"><code>73a6baf</code></a>
feat: make trace ID and span ID public on <code>OtelData</code> (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/233">#233</a>)</li>
<li><a
href="4ebae2c537"><code>4ebae2c</code></a>
Upgrade to <code>opentelemetry</code> 0.31 (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/230">#230</a>)</li>
<li><a
href="4fdf56048d"><code>4fdf560</code></a>
fix(layer)!: use otel semantic conventions for code (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/225">#225</a>)</li>
<li><a
href="612b5b2601"><code>612b5b2</code></a>
chore: fix clippy lints (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/226">#226</a>)</li>
<li><a
href="c4fe96ac2a"><code>c4fe96a</code></a>
feat: OpenTelemetry context activation (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/202">#202</a>)</li>
<li><a
href="764cd7365f"><code>764cd73</code></a>
fix(metrics)!: hold onto <code>MetricsProvider</code> in
<code>MetricsLayer</code> (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/224">#224</a>)</li>
<li><a
href="fd0a58a7f4"><code>fd0a58a</code></a>
feat(layer): add configuration for including <code>target</code> in
spans (<a
href="https://redirect.github.com/tokio-rs/tracing-opentelemetry/issues/222">#222</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tokio-rs/tracing-opentelemetry/compare/v0.31.0...v0.32.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tracing-opentelemetry&package-manager=cargo&previous-version=0.31.0&new-version=0.32.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anton Panasenko <apanasenko@openai.com>
2026-01-02 16:38:16 -08:00
jif-oai
d7482510b1 nit: trace span for regular task (#8053)
Logs are too spammy

---------

Co-authored-by: Anton Panasenko <apanasenko@openai.com>
2025-12-16 16:53:15 +00:00
Anton Panasenko
ad7b9d63c3 [codex] add otel tracing (#7844) 2025-12-12 17:07:17 -08:00
Michael Bolin
fa4cac1e6b fix: introduce AbsolutePathBuf and resolve relative paths in config.toml (#7796)
This PR attempts to solve two problems by introducing a
`AbsolutePathBuf` type with a special deserializer:

- `AbsolutePathBuf` attempts to be a generally useful abstraction, as it
ensures, by constructing, that it represents a value that is an
absolute, normalized path, which is a stronger guarantee than an
arbitrary `PathBuf`.
- Values in `config.toml` that can be either an absolute or relative
path should be resolved against the folder containing the `config.toml`
in the relative path case. This PR makes this easy to support: the main
cost is ensuring `AbsolutePathBufGuard` is used inside
`deserialize_config_toml_with_base()`.

While `AbsolutePathBufGuard` may seem slightly distasteful because it
relies on thread-local storage, this seems much cleaner to me than using
than my various experiments with
https://docs.rs/serde/latest/serde/de/trait.DeserializeSeed.html.
Further, since the `deserialize()` method from the `Deserialize` trait
is not async, we do not really have to worry about the deserialization
work being spread across multiple threads in a way that would interfere
with `AbsolutePathBufGuard`.

To start, this PR introduces the use of `AbsolutePathBuf` in
`OtelTlsConfig`. Note how this simplifies `otel_provider.rs` because it
no longer requires `settings.codex_home` to be threaded through.
Furthermore, this sets us up better for a world where multiple
`config.toml` files from different folders could be loaded and then
merged together, as the absolutifying of the paths must be done against
the correct parent folder.
2025-12-09 17:37:52 -08:00
Alexander
f521d29726 fix: OTEL HTTP exporter panic and mTLS support (#7651)
This fixes two issues with the OTEL HTTP exporter:

1. **Runtime panic with async reqwest client**

The `opentelemetry_sdk` `BatchLogProcessor` spawns a dedicated OS thread
that uses `futures_executor::block_on()` rather than tokio's runtime.
When the async reqwest client's timeout mechanism calls
`tokio::time::sleep()`, it panics with "there is no reactor running,
must be called from the context of a Tokio 1.x runtime".

The fix is to use `reqwest::blocking::Client` instead, which doesn't
depend on tokio for timeouts. However, the blocking client creates its
own internal tokio runtime during construction, which would panic if
built from within an async context. We wrap the construction in
`tokio::task::block_in_place()` to handle this.

2. **mTLS certificate handling**

The HTTP client wasn't properly configured for mTLS, matching the fixes
previously done for the model provider client:

- Added `.tls_built_in_root_certs(false)` when using a custom CA
certificate to ensure only our CA is trusted
- Added `.https_only(true)` when using client identity
- Added `rustls-tls` feature to ensure rustls is used (required for
`Identity::from_pem()` to work correctly)
2025-12-05 20:46:44 -08:00
Anton Panasenko
f7a921039c [codex][otel] support mtls configuration (#6228)
fix for https://github.com/openai/codex/issues/6153

supports mTLS configuration and includes TLS features in the library
build to enable secure HTTPS connections with custom root certificates.

grpc:
https://docs.rs/tonic/0.13.1/src/tonic/transport/channel/endpoint.rs.html#63
https:
https://docs.rs/reqwest/0.12.23/src/reqwest/async_impl/client.rs.html#516
2025-11-18 14:01:01 -08:00
vishnu-oai
04c1782e52 OpenTelemetry events (#2103)
### Title

## otel

Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events**
that
describe each run: outbound API requests, streamed responses, user
input,
tool-approval decisions, and the result of every tool invocation. Export
is
**disabled by default** so local runs remain self-contained. Opt in by
adding an
`[otel]` table and choosing an exporter.

```toml
[otel]
environment = "staging"   # defaults to "dev"
exporter = "none"          # defaults to "none"; set to otlp-http or otlp-grpc to send events
log_user_prompt = false    # defaults to false; redact prompt text unless explicitly enabled
```

Codex tags every exported event with `service.name = "codex-cli"`, the
CLI
version, and an `env` attribute so downstream collectors can distinguish
dev/staging/prod traffic. Only telemetry produced inside the
`codex_otel`
crate—the events listed below—is forwarded to the exporter.

### Event catalog

Every event shares a common set of metadata fields: `event.timestamp`,
`conversation.id`, `app.version`, `auth_mode` (when available),
`user.account_id` (when available), `terminal.type`, `model`, and
`slug`.

With OTEL enabled Codex emits the following event types (in addition to
the
metadata above):

- `codex.api_request`
  - `cf_ray` (optional)
  - `attempt`
  - `duration_ms`
  - `http.response.status_code` (optional)
  - `error.message` (failures)
- `codex.sse_event`
  - `event.kind`
  - `duration_ms`
  - `error.message` (failures)
  - `input_token_count` (completion only)
  - `output_token_count` (completion only)
  - `cached_token_count` (completion only, optional)
  - `reasoning_token_count` (completion only, optional)
  - `tool_token_count` (completion only)
- `codex.user_prompt`
  - `prompt_length`
  - `prompt` (redacted unless `log_user_prompt = true`)
- `codex.tool_decision`
  - `tool_name`
  - `call_id`
- `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
  - `source` (`config` or `user`)
- `codex.tool_result`
  - `tool_name`
  - `call_id`
  - `arguments`
  - `duration_ms` (execution time for the tool)
  - `success` (`"true"` or `"false"`)
  - `output`

### Choosing an exporter

Set `otel.exporter` to control where events go:

- `none` – leaves instrumentation active but skips exporting. This is
the
  default.
- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector.
Specify the
  endpoint, protocol, and headers your collector expects:

  ```toml
  [otel]
  exporter = { otlp-http = {
    endpoint = "https://otel.example.com/v1/logs",
    protocol = "binary",
    headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
  }}
  ```

- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint
and any
  metadata headers:

  ```toml
  [otel]
  exporter = { otlp-grpc = {
    endpoint = "https://otel.example.com:4317",
    headers = { "x-otlp-meta" = "abc123" }
  }}
  ```

If the exporter is `none` nothing is written anywhere; otherwise you
must run or point to your
own collector. All exporters run on a background batch worker that is
flushed on
shutdown.

If you build Codex from source the OTEL crate is still behind an `otel`
feature
flag; the official prebuilt binaries ship with the feature enabled. When
the
feature is disabled the telemetry hooks become no-ops so the CLI
continues to
function without the extra dependencies.

---------

Co-authored-by: Anton Panasenko <apanasenko@openai.com>
2025-09-29 11:30:55 -07:00