mirror of
https://github.com/openai/codex.git
synced 2026-05-26 05:55:36 +00:00
## Stacked PRs

This work is now effectively split across two steps:

- #14178: add custom CA support for browser and device-code login flows, docs, and hermetic subprocess tests
- #14239: extend that shared custom CA handling across Codex HTTPS clients and secure websocket TLS

Note: #14240 was merged into this branch while it was stacked on top of this PR. This PR now subsumes that websocket follow-up and should be treated as the combined change.

Builds on top of #14178.

## Problem

Custom CA support landed first in the login path, but the real requirement is broader. Codex constructs outbound TLS clients in multiple places, and both HTTPS and secure websocket paths can fail behind enterprise TLS interception if they do not honor `CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.

This PR broadens the shared custom-CA logic beyond login and applies the same policy to websocket TLS, so the enterprise-proxy story is no longer split between “HTTPS works” and “websockets still fail”.

## What This Delivers

Custom CA support is no longer limited to login. Codex outbound HTTPS clients and secure websocket connections can now honor the same `CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise proxy/intercept setups work more consistently end-to-end.

For users and operators, nothing new needs to be configured beyond the same CA env vars introduced in #14178. The change is that more of Codex now respects them, including websocket-backed flows that were previously still using default trust roots.

I also manually validated the proxy path locally with mitmproxy using: `CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem HTTPS_PROXY=http://127.0.0.1:8080 just codex` with mitmproxy installed via `brew install mitmproxy` and configured as the macOS system proxy.

## Mental model

`codex-client` is now the owner of shared custom-CA policy for outbound TLS client construction. Reqwest callers start from the builder configuration they already need, then pass that builder through `build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the same module for a rustls client config when a custom CA bundle is configured.

The env precedence is the same everywhere:

- `CODEX_CA_CERTIFICATE` wins
- otherwise fall back to `SSL_CERT_FILE`
- otherwise use system roots

The helper is intentionally narrow. It loads every usable certificate from the configured PEM bundle into the appropriate root store and returns either a configured transport or a typed error that explains what went wrong.

## Non-goals

This does not add handshake-level integration tests against a live TLS endpoint. It does not validate that the configured bundle forms a meaningful certificate chain. It also does not try to force every transport in the repo through one abstraction; it extends the shared CA policy across the reqwest and websocket paths that actually needed it.

## Tradeoffs

The main tradeoff is centralizing CA behavior in `codex-client` while still leaving adoption up to call sites. That keeps the implementation additive and reviewable, but it means the rule "outbound Codex TLS that should honor enterprise roots must use the shared helper" is still partly enforced socially rather than by types.

For websockets, the shared helper only builds an explicit rustls config when a custom CA bundle is configured. When no override env var is set, websocket callers still use their ordinary default connector path.

## Architecture

`codex-client::custom_ca` now owns CA bundle selection, PEM normalization, mixed-section parsing, certificate extraction, typed CA-loading errors, and optional rustls client-config construction for websocket TLS.

The affected consumers now call into that shared helper directly rather than carrying login-local CA behavior:

- backend-client
- cloud-tasks
- RMCP client paths that use `reqwest`
- TUI voice HTTP paths
- `codex-core` default reqwest client construction
- `codex-api` websocket clients for both responses and realtime websocket connections

The subprocess CA probe, env-sensitive integration tests, and shared PEM fixtures also live in `codex-client`, which is now the actual owner of the behavior they exercise.

## Observability

The shared CA path logs:

- which environment variable selected the bundle
- which path was loaded
- how many certificates were accepted
- when `TRUSTED CERTIFICATE` labels were normalized
- when CRLs were ignored
- where client construction failed

Returned errors remain user-facing and include the relevant env var, path, and remediation hint. That same error model now applies whether the failure surfaced while building a reqwest client or websocket TLS configuration.

## Tests

Pure unit tests in `codex-client` cover env precedence and PEM normalization behavior. Real client construction remains in subprocess tests so the suite can control process env and avoid the macOS seatbelt panic path that motivated the hermetic test split.

The subprocess coverage verifies:

- `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
- fallback to `SSL_CERT_FILE`
- single-cert and multi-cert bundles
- malformed and empty-file errors
- OpenSSL `TRUSTED CERTIFICATE` handling
- CRL tolerance for well-formed CRL sections

The websocket side is covered by the existing `codex-api` / `codex-core` websocket test suites plus the manual mitmproxy validation above.

---------

Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
Co-authored-by: Codex <noreply@openai.com>
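The env precedence described under "Mental model" can be illustrated with a minimal sketch. This is not the code in `codex-client::custom_ca`: the `CaSource` enum and the `select_ca_source` function are hypothetical names introduced only for illustration, and the sketch takes a lookup closure so the ordering can be exercised without touching the real process environment.

```rust
use std::path::PathBuf;

/// Hypothetical type: where the trusted roots would come from.
#[derive(Debug, PartialEq)]
enum CaSource {
    CodexCaCertificate(PathBuf),
    SslCertFile(PathBuf),
    SystemRoots,
}

/// Hypothetical helper: applies the precedence from the PR description.
/// `lookup` stands in for reading an environment variable.
fn select_ca_source(lookup: impl Fn(&str) -> Option<String>) -> CaSource {
    // `CODEX_CA_CERTIFICATE` wins when it is set.
    if let Some(path) = lookup("CODEX_CA_CERTIFICATE") {
        return CaSource::CodexCaCertificate(PathBuf::from(path));
    }
    // Otherwise fall back to `SSL_CERT_FILE`.
    if let Some(path) = lookup("SSL_CERT_FILE") {
        return CaSource::SslCertFile(PathBuf::from(path));
    }
    // Otherwise use the system roots.
    CaSource::SystemRoots
}

fn main() {
    // Read from the real process environment.
    let source = select_ca_source(|name| std::env::var(name).ok());
    println!("{source:?}");
}
```

Passing the lookup as a closure is a design choice of this sketch; it mirrors the PR's motivation for keeping env-sensitive behavior testable, but the real helper may read the environment differently.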
240 lines
8.7 KiB
Rust
use crate::config_loader::ResidencyRequirement;
use crate::spawn::CODEX_SANDBOX_ENV_VAR;
use codex_client::BuildCustomCaTransportError;
use codex_client::CodexHttpClient;
pub use codex_client::CodexRequestBuilder;
use codex_client::build_reqwest_client_with_custom_ca;
use reqwest::header::HeaderMap;
use reqwest::header::HeaderValue;
use std::sync::LazyLock;
use std::sync::Mutex;
use std::sync::RwLock;

/// Set this to add a suffix to the User-Agent string.
///
/// It is not ideal that we're using a global singleton for this.
/// This is primarily designed to differentiate MCP clients from each other.
/// Because there can only be one MCP server per process, it should be safe for this to be a global static.
/// However, future users of this should use this with caution as a result.
/// In addition, we want to be confident that this value is used for ALL clients and doing that requires a
/// lot of wiring and it's easy to miss code paths by doing so.
/// See https://github.com/openai/codex/pull/3388/files for an example of what that would look like.
/// Finally, we want to make sure this is set for ALL mcp clients without needing to know a special env var
/// or having to set data that they already specified in the mcp initialize request somewhere else.
///
/// A space is automatically added between the suffix and the rest of the User-Agent string.
/// The full user agent string is returned from the mcp initialize response.
/// Parentheses will be added by Codex. This should only specify what goes inside of the parentheses.
pub static USER_AGENT_SUFFIX: LazyLock<Mutex<Option<String>>> = LazyLock::new(|| Mutex::new(None));
pub const DEFAULT_ORIGINATOR: &str = "codex_cli_rs";
pub const CODEX_INTERNAL_ORIGINATOR_OVERRIDE_ENV_VAR: &str = "CODEX_INTERNAL_ORIGINATOR_OVERRIDE";
pub const RESIDENCY_HEADER_NAME: &str = "x-openai-internal-codex-residency";

#[derive(Debug, Clone)]
pub struct Originator {
    pub value: String,
    pub header_value: HeaderValue,
}
static ORIGINATOR: LazyLock<RwLock<Option<Originator>>> = LazyLock::new(|| RwLock::new(None));
static REQUIREMENTS_RESIDENCY: LazyLock<RwLock<Option<ResidencyRequirement>>> =
    LazyLock::new(|| RwLock::new(None));

#[derive(Debug)]
pub enum SetOriginatorError {
    InvalidHeaderValue,
    AlreadyInitialized,
}

fn get_originator_value(provided: Option<String>) -> Originator {
    let value = std::env::var(CODEX_INTERNAL_ORIGINATOR_OVERRIDE_ENV_VAR)
        .ok()
        .or(provided)
        .unwrap_or(DEFAULT_ORIGINATOR.to_string());

    match HeaderValue::from_str(&value) {
        Ok(header_value) => Originator {
            value,
            header_value,
        },
        Err(e) => {
            tracing::error!("Unable to turn originator override {value} into header value: {e}");
            Originator {
                value: DEFAULT_ORIGINATOR.to_string(),
                header_value: HeaderValue::from_static(DEFAULT_ORIGINATOR),
            }
        }
    }
}

pub fn set_default_originator(value: String) -> Result<(), SetOriginatorError> {
    if HeaderValue::from_str(&value).is_err() {
        return Err(SetOriginatorError::InvalidHeaderValue);
    }
    let originator = get_originator_value(Some(value));
    let Ok(mut guard) = ORIGINATOR.write() else {
        return Err(SetOriginatorError::AlreadyInitialized);
    };
    if guard.is_some() {
        return Err(SetOriginatorError::AlreadyInitialized);
    }
    *guard = Some(originator);
    Ok(())
}

pub fn set_default_client_residency_requirement(enforce_residency: Option<ResidencyRequirement>) {
    let Ok(mut guard) = REQUIREMENTS_RESIDENCY.write() else {
        tracing::warn!("Failed to acquire requirements residency lock");
        return;
    };
    *guard = enforce_residency;
}

pub fn originator() -> Originator {
    if let Ok(guard) = ORIGINATOR.read()
        && let Some(originator) = guard.as_ref()
    {
        return originator.clone();
    }

    if std::env::var(CODEX_INTERNAL_ORIGINATOR_OVERRIDE_ENV_VAR).is_ok() {
        let originator = get_originator_value(None);
        if let Ok(mut guard) = ORIGINATOR.write() {
            match guard.as_ref() {
                Some(originator) => return originator.clone(),
                None => *guard = Some(originator.clone()),
            }
        }
        return originator;
    }

    get_originator_value(None)
}

pub fn is_first_party_originator(originator_value: &str) -> bool {
    originator_value == DEFAULT_ORIGINATOR
        || originator_value == "codex_vscode"
        || originator_value.starts_with("Codex ")
}

pub fn is_first_party_chat_originator(originator_value: &str) -> bool {
    originator_value == "codex_atlas" || originator_value == "codex_chatgpt_desktop"
}

pub fn get_codex_user_agent() -> String {
    let build_version = env!("CARGO_PKG_VERSION");
    let os_info = os_info::get();
    let originator = originator();
    let prefix = format!(
        "{}/{build_version} ({} {}; {}) {}",
        originator.value.as_str(),
        os_info.os_type(),
        os_info.version(),
        os_info.architecture().unwrap_or("unknown"),
        crate::terminal::user_agent()
    );
    let suffix = USER_AGENT_SUFFIX
        .lock()
        .ok()
        .and_then(|guard| guard.clone());
    let suffix = suffix
        .as_deref()
        .map(str::trim)
        .filter(|value| !value.is_empty())
        .map_or_else(String::new, |value| format!(" ({value})"));

    let candidate = format!("{prefix}{suffix}");
    sanitize_user_agent(candidate, &prefix)
}

/// Sanitize the user agent string.
///
/// Invalid characters are replaced with an underscore.
///
/// If the user agent fails to parse, it falls back to fallback and then to ORIGINATOR.
fn sanitize_user_agent(candidate: String, fallback: &str) -> String {
    if HeaderValue::from_str(candidate.as_str()).is_ok() {
        return candidate;
    }

    let sanitized: String = candidate
        .chars()
        .map(|ch| if matches!(ch, ' '..='~') { ch } else { '_' })
        .collect();
    if !sanitized.is_empty() && HeaderValue::from_str(sanitized.as_str()).is_ok() {
        tracing::warn!(
            "Sanitized Codex user agent because provided suffix contained invalid header characters"
        );
        sanitized
    } else if HeaderValue::from_str(fallback).is_ok() {
        tracing::warn!(
            "Falling back to base Codex user agent because provided suffix could not be sanitized"
        );
        fallback.to_string()
    } else {
        tracing::warn!(
            "Falling back to default Codex originator because base user agent string is invalid"
        );
        originator().value
    }
}

/// Create an HTTP client with default `originator` and `User-Agent` headers set.
pub fn create_client() -> CodexHttpClient {
    let inner = build_reqwest_client();
    CodexHttpClient::new(inner)
}

/// Builds the default reqwest client used for ordinary Codex HTTP traffic.
///
/// This starts from the standard Codex user agent, default headers, and sandbox-specific proxy
/// policy, then layers in shared custom CA handling from `CODEX_CA_CERTIFICATE` /
/// `SSL_CERT_FILE`. The function remains infallible for compatibility with existing call sites, so
/// a custom-CA or builder failure is logged and falls back to `reqwest::Client::new()`.
pub fn build_reqwest_client() -> reqwest::Client {
    try_build_reqwest_client().unwrap_or_else(|error| {
        tracing::warn!(error = %error, "failed to build default reqwest client");
        reqwest::Client::new()
    })
}

/// Tries to build the default reqwest client used for ordinary Codex HTTP traffic.
///
/// Callers that need a structured CA-loading failure instead of the legacy logged fallback can use
/// this method directly.
pub fn try_build_reqwest_client() -> Result<reqwest::Client, BuildCustomCaTransportError> {
    let ua = get_codex_user_agent();

    let mut builder = reqwest::Client::builder()
        // Set UA via dedicated helper to avoid header validation pitfalls
        .user_agent(ua)
        .default_headers(default_headers());
    if is_sandboxed() {
        builder = builder.no_proxy();
    }

    build_reqwest_client_with_custom_ca(builder)
}

pub fn default_headers() -> HeaderMap {
    let mut headers = HeaderMap::new();
    headers.insert("originator", originator().header_value);
    if let Ok(guard) = REQUIREMENTS_RESIDENCY.read()
        && let Some(requirement) = guard.as_ref()
        && !headers.contains_key(RESIDENCY_HEADER_NAME)
    {
        let value = match requirement {
            ResidencyRequirement::Us => HeaderValue::from_static("us"),
        };
        headers.insert(RESIDENCY_HEADER_NAME, value);
    }
    headers
}

fn is_sandboxed() -> bool {
    std::env::var(CODEX_SANDBOX_ENV_VAR).as_deref() == Ok("seatbelt")
}

#[cfg(test)]
#[path = "default_client_tests.rs"]
mod tests;
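
The character-replacement step inside `sanitize_user_agent` can be tried in isolation. The following is a minimal sketch, assuming only the standard library: `replace_invalid_header_chars` is a hypothetical name introduced here, and the sketch deliberately omits the `HeaderValue::from_str` checks and the fallback chain that the real function performs around this step.

```rust
/// Hypothetical standalone helper: keeps printable ASCII (' ' through '~')
/// and replaces every other character with an underscore, mirroring the
/// `.map(...)` closure in `sanitize_user_agent`.
fn replace_invalid_header_chars(candidate: &str) -> String {
    candidate
        .chars()
        .map(|ch| if matches!(ch, ' '..='~') { ch } else { '_' })
        .collect()
}

fn main() {
    // A newline and a non-ASCII character are each replaced by one underscore.
    let sanitized = replace_invalid_header_chars("codex_cli_rs/1.0 (my\nclient\u{e9})");
    println!("{sanitized}"); // prints "codex_cli_rs/1.0 (my_client_)"
}
```

Because the mapping works per `char`, a multi-byte character such as `é` becomes a single underscore rather than one underscore per byte.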