feat(network-proxy): add embedded OTEL policy audit logging (#12046)

**PR Summary**

This PR adds embedded-only OTEL policy audit logging for
`codex-network-proxy` and threads audit metadata from `codex-core` into
managed proxy startup.

### What changed
- Added structured audit event emission in `network_policy.rs` with
target `codex_otel.network_proxy`.
- Emitted:
- `codex.network_proxy.domain_policy_decision` once per domain-policy
evaluation.
  - `codex.network_proxy.block_decision` for non-domain denies.
- Added required policy/network fields, RFC3339 UTC millisecond
`event.timestamp`, and fallback defaults (`http.request.method="none"`,
`client.address="unknown"`).
- Added non-domain deny audit emission in HTTP/SOCKS handlers for
mode-guard and proxy-state denies, including unix-socket deny paths.
- Added `REASON_UNIX_SOCKET_UNSUPPORTED` and used it for unsupported
unix-socket auditing.
- Added `NetworkProxyAuditMetadata` to runtime/state, re-exported from
`lib.rs` and `state.rs`.
- Added `start_proxy_with_audit_metadata(...)` in core config, with
`start_proxy()` delegating to default metadata.
- Wired metadata construction in `codex.rs` from session/auth context,
including originator sanitization for OTEL-safe tagging.
- Updated `network-proxy/README.md` with embedded-mode audit schema and
behavior notes.
- Refactored HTTP block-audit emission to a small local helper to reduce
duplication.
- Preserved existing unix-socket proxy-disabled host/path behavior for
responses and blocked history while using an audit-only endpoint
override (`server.address="unix-socket"`, `server.port=0`).

### Explicit exclusions
- No standalone proxy OTEL startup work.
- No `main.rs` binary wiring.
- No `standalone_otel.rs`.
- No standalone docs/tests.

### Tests
- Extended `network_policy.rs` tests for event mapping, metadata
propagation, fallbacks, timestamp format, and target prefix.
- Extended HTTP tests to assert unix-socket deny block audit events.
- Extended SOCKS tests to cover deny emission from handler deny
branches.
- Added/updated core tests to verify audit metadata threading into
managed proxy state.

### Validation run
- `just fmt`
- `cargo test -p codex-network-proxy` 
- `cargo test -p codex-core` ran with one unrelated flaky timeout
(`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`), and
the test passed when rerun directly 

---------

Co-authored-by: viyatb-oai <viyatb@openai.com>
This commit is contained in:
mcgrew-oai
2026-02-25 11:46:37 -05:00
committed by GitHub
parent 8362b79cb4
commit 9a393c9b6f
16 changed files with 1592 additions and 657 deletions

View File

@@ -137,6 +137,45 @@ the decider can auto-allow network requests originating from that command.
**Important:** Explicit deny rules still win. The decider only gets a chance to override
`not_allowed` (allowlist misses), not `denied` or `not_allowed_local`.
## OTEL Audit Events (embedded/managed)
When `codex-network-proxy` is embedded in managed Codex runtime, policy decisions emit structured
OTEL-compatible events with `target=codex_otel.network_proxy`.
Event name:
- `codex.network_proxy.policy_decision`
- emitted for each policy decision (`domain` and `non_domain`).
- `network.policy.scope = "domain"` for host-policy evaluations (`evaluate_host_policy`).
- `network.policy.scope = "non_domain"` for mode-guard/proxy-state checks (including unix-socket guard paths and unix-socket allow decisions).
Common fields:
- `event.name`
- `event.timestamp` (RFC3339 UTC, millisecond precision)
- optional metadata:
- `conversation.id`
- `app.version`
- `user.account_id`
- policy/network:
- `network.policy.scope` (`domain` or `non_domain`)
- `network.policy.decision` (`allow`, `deny`, or `ask`)
- `network.policy.source` (`baseline_policy`, `mode_guard`, `proxy_state`, `decider`)
- `network.policy.reason`
- `network.transport.protocol`
- `server.address`
- `server.port`
- `http.request.method` (defaults to `"none"` when absent)
- `client.address` (defaults to `"unknown"` when absent)
- `network.policy.override` (`true` only when decider-allow overrides baseline `not_allowed`)
Unix-socket block-path audits use sentinel endpoint values:
- `server.address = "unix-socket"`
- `server.port = 0`
Audit events intentionally avoid logging full URL/path/query data.
## Admin API
The admin API is a small HTTP server intended for debugging and runtime adjustments.