codex

mirror of https://github.com/openai/codex.git synced 2026-05-04 03:16:31 +00:00

Author	SHA1	Message	Date
charley-openai	de2ccf9473	[codex] Add token usage to turn tracing spans (#19432 ) ## Why Slow Codex turns are easier to debug when token usage is visible in the trace itself, without joining against separate analytics. This adds token usage to existing turn-handling spans for regular user turns only. [Example turn](https://openai.datadoghq.com/apm/trace/9d353efa2cb5de1f4c5b93dc33c3df04?colorBy=service&graphType=flamegraph&shouldShowLegend=true&sort=time&spanID=3555541504891512675&spanViewType=metadata&traceQuery=) <img width="1447" height="967" alt="Screenshot 2026-04-24 at 3 03 07 PM" src="https://github.com/user-attachments/assets/ab7bb187-e7fc-41f0-a366-6c44610b2b2c" /> ## What Changed Added response-level token fields on completed handle_responses spans: gen_ai.usage.input_tokens gen_ai.usage.cache_read.input_tokens gen_ai.usage.output_tokens codex.usage.reasoning_output_tokens codex.usage.total_tokens Added aggregate token fields on regular turn spans: codex.turn.token_usage.* Added an explicit regular-turn opt-in via SessionTask::records_turn_token_usage_on_span() so this is not coupled to span-name strings. ## Testing - `cargo test -p codex-otel` - `cargo test -p codex-core turn_and_completed_response_spans_record_token_usage` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-otel` - Manual local Electron/app-server smoke test: regular user turn emits the new span fields Known status: `cargo test -p codex-core` was attempted and failed in unrelated existing areas: config approvals, request-permissions, git-info ordering, and subagent metadata persistence.	2026-04-28 11:41:32 -07:00
pakrym-oai	4e05f3053c	Remove ghost snapshots (#19481 ) ## Summary - Remove `ghost_snapshot` / `GhostCommit` from the Responses API surface and generated SDK/schema artifacts. - Keep legacy config loading compatible, but make undo a no-op that reports the feature is unavailable. - Clean up core history, compaction, telemetry, rollout, and tests to stop carrying ghost snapshot items. ## Testing - Unit tests passed for `codex-protocol`, `codex-core` targeted undo and compaction flows, `codex-rollout`, and `codex-app-server-protocol`. - Regenerated config and app-server schemas plus Python SDK artifacts and verified they match the checked-in outputs.	2026-04-27 18:48:57 -07:00
Eric Traut	bbff4ee61a	Add safety check notification and error handling (#19055 ) Adds a new app-server notification that fires when a user account has been flagged for potential safety reasons.	2026-04-22 22:24:12 -07:00
Akshay Nathan	7995c66032	Stream apply_patch changes (#17862 ) Adds new events for streaming apply_patch changes from responses api. This is to enable clients to show progress during file writes. Caveat: This does not work with apply_patch in function call mode, since that required adding streaming json parsing.	2026-04-16 18:12:19 -07:00
pakrym-oai	413c1e1fdf	[codex] reduce module visibility (#16978 ) ## Summary - reduce public module visibility across Rust crates, preferring private or crate-private modules with explicit crate-root public exports - update external call sites and tests to use the intended public crate APIs instead of reaching through module trees - add the module visibility guideline to AGENTS.md ## Validation - `cargo check --workspace --all-targets --message-format=short` passed before the final fix/format pass - `just fix` completed successfully - `just fmt` completed successfully - `git diff --check` passed	2026-04-07 08:03:35 -07:00
jif-oai	2cf4d5ef35	chore: add metrics for profile (#15180 )	2026-03-19 15:48:02 +00:00
xl-openai	86982ca1f9	Revert "fix: harden plugin feature gating" (#15102 ) Reverts openai/codex#15020 I messed up the commit in my PR and accidentally merged changes that were still under review.	2026-03-18 15:19:29 -07:00
xl-openai	580f32ad2a	fix: harden plugin feature gating (#15020 ) 1. Use requirement-resolved config.features as the plugin gate. 2. Guard plugin/list, plugin/read, and related flows behind that gate. 3. Skip bad marketplace.json files instead of failing the whole list. 4. Simplify plugin state and caching.	2026-03-18 10:11:43 -07:00
Colin Young	0d2ff40a58	Add auth env observability (#14905 ) CXC-410 Emit Env Var Status with `/feedback` report Add more observability on top of #14611 [Unset](https://openai.sentry.io/issues/7340419168/?project=4510195390611458&query=019cfa8d-c1ba-7002-96fa-e35fc340551d&referrer=issue-stream) [Set](https://openai.sentry.io/issues/7340426331/?project=4510195390611458&query=019cfa91-aba1-7823-ab7e-762edfbc0ed4&referrer=issue-stream) <img width="1063" height="610" alt="image" src="https://github.com/user-attachments/assets/937ab026-1c2d-4757-81d5-5f31b853113e" /> ###### Summary - Adds auth-env telemetry that records whether key auth-related env overrides were present on session start and request paths. - Threads those auth-env fields through `/responses`, websocket, and `/models` telemetry and feedback metadata. - Buckets custom provider `env_key` configuration to a safe `"configured"` value instead of emitting raw config text. - Keeps the slice observability-only: no raw token values or raw URLs are emitted. ###### Rationale (from spec findings) - 401 and auth-path debugging needs a way to distinguish env-driven auth paths from sessions with no auth env override. - Startup and model-refresh failures need the same auth-env diagnostics as normal request failures. - Feedback and Sentry tags need the same auth-env signal as OTel events so reports can be triaged consistently. - Custom provider config is user-controlled text, so the telemetry contract must stay presence-only / bucketed. ###### Scope - Adds a small `AuthEnvTelemetry` bundle for env presence collection and threads it through the main request/session telemetry paths. - Does not add endpoint/base-url/provider-header/geo routing attribution or broader telemetry API redesign. ###### Trade-offs - `provider_env_key_name` is bucketed to `"configured"` instead of preserving the literal configured env var name. - `/models` is included because startup/model-refresh auth failures need the same diagnostics, but broader parity work remains out of scope. - This slice keeps the existing telemetry APIs and layers auth-env fields onto them rather than redesigning the metadata model. ###### Client follow-up - Add the separate endpoint/base-url attribution slice if routing-source diagnosis is still needed. - Add provider-header or residency attribution only if auth-env presence proves insufficient in real reports. - Revisit whether any additional auth-related env inputs need safe bucketing after more 401 triage data. ###### Testing - `cargo test -p codex-core emit_feedback_request_tags -- --nocapture` - `cargo test -p codex-core collect_auth_env_telemetry_buckets_provider_env_key_name -- --nocapture` - `cargo test -p codex-core models_request_telemetry_emits_auth_env_feedback_tags_on_failure -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability -- --nocapture` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability -- --nocapture` - `cargo test -p codex-core --no-run --message-format short` - `cargo test -p codex-otel --no-run --message-format short` --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-17 14:26:27 -07:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. ## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
Colin Young	d692b74007	Add auth 401 observability to client bug reports (#14611 ) CXC-392 [With 401](https://openai.sentry.io/issues/7333870443/?project=4510195390611458&query=019ce8f8-560c-7f10-a00a-c59553740674&referrer=issue-stream) <img width="1909" height="555" alt="401 auth tags in Sentry" src="https://github.com/user-attachments/assets/412ea950-61c4-4780-9697-15c270971ee3" /> - auth_401_: preserved facts from the latest unauthorized response snapshot - auth_: latest auth-related facts from the latest request attempt - auth_recovery_: unauthorized recovery state and follow-up result Without 401 <img width="1917" height="522" alt="happy-path auth tags in Sentry" src="https://github.com/user-attachments/assets/3381ed28-8022-43b0-b6c0-623a630e679f" /> ###### Summary - Add client-visible 401 diagnostics for auth attachment, upstream auth classification, and 401 request id / cf-ray correlation. - Record unauthorized recovery mode, phase, outcome, and retry/follow-up status without changing auth behavior. - Surface the highest-signal auth and recovery fields on uploaded client bug reports so they are usable in Sentry. - Preserve original unauthorized evidence under `auth_401_` while keeping follow-up result tags separate. ###### Rationale (from spec findings) - The dominant bucket needed proof of whether the client attached auth before send or upstream still classified the request as missing auth. - Client uploads needed to show whether unauthorized recovery ran and what the client tried next. - Request id and cf-ray needed to be preserved on the unauthorized response so server-side correlation is immediate. - The bug-report path needed the same auth evidence as the request telemetry path, otherwise the observability would not be operationally useful. ###### Scope - Add auth 401 and unauthorized-recovery observability in `codex-rs/core`, `codex-rs/codex-api`, and `codex-rs/otel`, including feedback-tag surfacing. - Keep auth semantics, refresh behavior, retry behavior, endpoint classification, and geo-denial follow-up work out of this PR. ###### Trade-offs - This exports only safe auth evidence: header presence/name, upstream auth classification, request ids, and recovery state. It does not export token values or raw upstream bodies. - This keeps websocket connection reuse as a transport clue because it can help distinguish stale reused sessions from fresh reconnects. - Misroute/base-url classification and geo-denial are intentionally deferred to a separate follow-up PR so this review stays focused on the dominant auth 401 bucket. ###### Client follow-up - PR 2 will add misroute/provider and geo-denial observability plus the matching feedback-tag surfacing. - A separate host/app-server PR should log auth-decision inputs so pre-send host auth state can be correlated with client request evidence. - `device_id` remains intentionally separate until there is a safe existing source on the feedback upload path. ###### Testing - `cargo test -p codex-core refresh_available_models_sorts_by_priority` - `cargo test -p codex-core emit_feedback_request_tags_` - `cargo test -p codex-core emit_feedback_auth_recovery_tags_` - `cargo test -p codex-core auth_request_telemetry_context_tracks_attached_auth_and_retry_phase` - `cargo test -p codex-core extract_response_debug_context_decodes_identity_headers` - `cargo test -p codex-core identity_auth_details` - `cargo test -p codex-core telemetry_error_messages_preserve_non_http_details` - `cargo test -p codex-core --all-features --no-run` - `cargo test -p codex-otel otel_export_routing_policy_routes_api_request_auth_observability` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_connect_auth_observability` - `cargo test -p codex-otel otel_export_routing_policy_routes_websocket_request_transport_observability`	2026-03-14 15:38:51 -07:00
Anton Panasenko	77b0c75267	feat: search_tool migrate to bring you own tool of Responses API (#14274 ) ## Why to support a new bring your own search tool in Responses API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search) we migrating our bm25 search tool to use official way to execute search on client and communicate additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch	2026-03-11 17:51:51 -07:00
Owen Lin	da991bdf3a	feat(otel): Centralize OTEL metric names and shared tag builders (#14117 ) This cleans up a bunch of metric plumbing that had started to drift. The main change is making `codex-otel` the canonical home for shared metric definitions and metric tag helpers. I moved the `turn/thread` metric names that were still duplicated into the OTEL metric registry, added a shared `metrics::tags` module for common tag keys and session tag construction, and updated `SessionTelemetry` to build its metadata tags through that shared path. On the codex-core side, TTFT/TTFM now use the shared metric-name constants instead of local string definitions. I also switched the obvious remaining turn/thread metric callsites over to the shared constants, and added a small helper so TTFT/TTFM can attach an optional sanitized client.name tag from TurnContext. This should make follow-on telemetry work less ad hoc: - one canonical place for metric names - one canonical place for common metric tag keys/builders - less duplication between `codex-core` and `codex-otel`	2026-03-09 12:46:42 -07:00
Owen Lin	289ed549cf	chore(otel): rename OtelManager to SessionTelemetry (#13808 ) ## Summary This is a purely mechanical refactor of `OtelManager` -> `SessionTelemetry` to better convey what the struct is doing. No behavior change. ## Why `OtelManager` ended up sounding much broader than what this type actually does. It doesn't manage OTEL globally; it's the session-scoped telemetry surface for emitting log/trace events and recording metrics with consistent session metadata (`app_version`, `model`, `slug`, `originator`, etc.). `SessionTelemetry` is a more accurate name, and updating the call sites makes that boundary a lot easier to follow. ## Validation - `just fmt` - `cargo test -p codex-otel` - `cargo test -p codex-core`	2026-03-06 16:23:30 -08:00
Owen Lin	dd4a5216c9	chore(otel): reorganize codex-otel crate (#13800 ) ## Summary This is a structural cleanup of `codex-otel` to make the ownership boundaries a lot clearer. For example, previously it was quite confusing that `OtelManager` which emits log + trace event telemetry lived under `codex-rs/otel/src/traces/`. Also, there were two places that defined methods on OtelManager via `impl OtelManager` (`lib.rs` and `otel_manager.rs`). What changed: - move the `OtelProvider` implementation into `src/provider.rs` - move `OtelManager` and session-scoped event emission into `src/events/otel_manager.rs` - collapse the shared log/trace event helpers into `src/events/shared.rs` - pull target classification into `src/targets.rs` - move `traceparent_context_from_env()` into `src/trace_context.rs` - keep `src/otel_provider.rs` as a compatibility shim for existing imports - update the `codex-otel` README to reflect the new layout ## Why `lib.rs` and `otel_provider.rs` were doing too many different jobs at once: provider setup, export routing, trace-context helpers, and session event emission all lived together. This refactor separates those concerns without trying to change the behavior of the crate. The goal is to make future OTEL work easier to reason about and easier to review. ## Notes - no intended behavior change - `OtelManager` remains the session-scoped event emitter in this PR - the `otel_provider` shim keeps downstream churn low while the internals move around ## Validation - `just fmt` - `cargo test -p codex-otel` - `just fix -p codex-otel`	2026-03-06 14:58:18 -08:00

15 Commits