mirror of
https://github.com/openai/codex.git
synced 2026-05-17 09:43:19 +00:00
## Why Users and support need a single command that captures the local Codex runtime, configuration, auth, terminal, network, and state shape without asking the user to know which diagnostic depth to choose first. `codex doctor` now runs the useful checks by default and makes the detailed human output the default because the command is usually run when someone already needs context. The command also targets concrete support failure modes we have seen while iterating on the design: - update-target mismatches like #21956, where the installed package manager target can differ from the running executable - terminal and multiplexer issues that depend on `TERM`, tmux/zellij state, color handling, and TTY metadata - provider-specific HTTP/WebSocket connectivity, including ChatGPT WebSocket handshakes and API-key/provider endpoint reachability - local state/log SQLite integrity problems and large rollout directories - feedback reports that need an attached, redacted diagnostic snapshot without asking the user to run a second command ## What Changed - Adds `codex doctor` as a grouped CLI diagnostic report with default detailed output and `--summary` for the compact view. - Adds stable report sections for Environment, Configuration, Updates, Connectivity, and Background Server, plus a top Notes block that promotes anomalies such as available updates, large rollout directories, optional MCP issues, and mixed auth signals. - Adds runtime provenance, install consistency, bundled/system search readiness, terminal/multiplexer metadata, `config.toml` parse status, auth mode details, sandbox details, feature flag summaries, update cache/latest-version state, app-server daemon state, SQLite integrity checks, rollout statistics, and provider-aware network diagnostics. - Adds ChatGPT WebSocket diagnostics that report the negotiated HTTP upgrade as `HTTP 101 Switching Protocols` and include timeout, DNS, auth, and provider context in detailed output. - Makes reachability provider-aware: API-key OpenAI setups check the API endpoint, ChatGPT auth checks the ChatGPT path, and custom/AWS/local providers check configured HTTP endpoints when available. - Adds structured, redacted JSON output where `checks` is keyed by check id and `details` is a key/value object for support tooling. - Integrates doctor with feedback uploads by attaching a best-effort `codex-doctor-report.json` report and adding derived Sentry tags for overall status and failing/warning checks. - Updates the TUI feedback consent copy so users can see that the doctor report is included when logs/diagnostics are uploaded. - Updates the CLI bug issue template to ask reporters for `codex doctor --json` and render pasted reports as JSON. ## Example Output The examples below are sanitized from local smoke runs with `--no-color` so the structure is reviewable in plain text. ### `codex doctor` ```text Codex Doctor v0.0.0 · macos-aarch64 Notes ↑ updates 0.130.0 available (current 0.0.0, dismissed 0.128.0) ⚠ rollouts 1,526 active files · 2.53 GB on disk ⚠ mcp MCP configuration has optional issues ⚠ auth mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode ───────────────────────────────────────────────────────────── Environment ✓ runtime local debug build version 0.0.0 install method other commit unknown executable ~/code/codex.fcoury-doct…x-rs/target/debug/codex ✓ install consistent context other managed by npm: no · bun: no · package root — PATH entries (2) ~/.local/share/mise/installs/node/24/bin/codex ~/.local/share/mise/shims/codex ✓ search ripgrep 15.1.0 (system, `rg`) ✓ terminal Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color terminal Ghostty TERM_PROGRAM ghostty terminal version 1.3.2-main-+b0f827665 TERM xterm-256color multiplexer tmux 3.6a tmux extended-keys on tmux allow-passthrough on tmux set-clipboard on ✓ state databases healthy CODEX_HOME ~/.codex (dir) state DB ~/.codex/state_5.sqlite (file) · integrity ok log DB ~/.codex/logs_2.sqlite (file) · integrity ok active rollouts 1,526 files · 2.53 GB (avg 1.70 MB) archived rollouts 8 files · 3.84 MB (avg 491.11 KB) Configuration ✓ config loaded model gpt-5.5 · openai cwd ~/code/codex.fcoury-doctor/codex-rs config.toml ~/.codex/config.toml config.toml parse ok MCP servers 1 feature flags 36 enabled · 7 overridden (full list with --all) overrides code_mode, code_mode_only, memories, chronicle, goals, remote_control, prevent_idle_sleep ✓ auth auth is configured auth storage mode File auth file ~/.codex/auth.json auth env vars present OPENAI_API_KEY stored auth mode chatgpt stored API key false stored ChatGPT tokens true stored agent identity false ⚠ mcp MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server. configured servers 1 disabled servers 0 streamable_http servers 1 optional reachability openaiDeveloperDocs: https://developers.openai.com/mcp (HEAD connect failed; GET connect failed) ✓ sandbox restricted fs + restricted network · approval OnRequest approval policy OnRequest filesystem sandbox restricted network sandbox restricted Connectivity ✓ network network-related environment looks readable ✓ websocket connected (HTTP 101 Switching Protocols) · 15s timeout model provider openai provider name OpenAI wire API responses supports websockets true connect timeout 15000 ms auth mode chatgpt endpoint wss://chatgpt.com/backend-api/<redacted> DNS 2 IPv4, 2 IPv6, first IPv6 handshake result HTTP 101 Switching Protocols ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration. reachability mode API key auth openai API https://api.openai.com/v1 connect failed (required) Background Server ○ app-server not running (ephemeral mode) ───────────────────────────────────────────────────────────── 11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed --summary compact output --all expand truncated lists --json redacted report ``` ### `codex doctor --summary` ```text Codex Doctor v0.0.0 · macos-aarch64 Notes ↑ updates 0.130.0 available (current 0.0.0, dismissed 0.128.0) ⚠ rollouts 1,526 active files · 2.53 GB on disk ⚠ mcp MCP configuration has optional issues ⚠ auth mixed auth signals: ChatGPT login plus API key env var; HTTP reachability uses API-key mode ───────────────────────────────────────────────────────────── Environment ✓ runtime local debug build ✓ install consistent ✓ search ripgrep 15.1.0 (system, `rg`) ✓ terminal Ghostty 1.3.2-main-+b0f827665 · tmux 3.6a · TERM=xterm-256color ✓ state databases healthy Configuration ✓ config loaded ✓ auth auth is configured ⚠ mcp MCP configuration has optional issues — Set the missing MCP env vars or disable the affected server. ✓ sandbox restricted fs + restricted network · approval OnRequest Updates ✓ updates update configuration is locally consistent Connectivity ✓ network network-related environment looks readable ✓ websocket connected (HTTP 101 Switching Protocols) · 15s timeout ✗ reachability one or more required provider endpoints are unreachable over HTTP — Check proxy, VPN, firewall, DNS, and custom CA configuration. Background Server ○ app-server not running (ephemeral mode) ───────────────────────────────────────────────────────────── 11 ok · 1 idle · 4 notes · 1 warn · 1 fail failed Run codex doctor without --summary for detailed diagnostics. --all expand truncated lists --json redacted report ``` ### `codex doctor --json` shape ```json { "schema_version": 1, "overall_status": "fail", "checks": { "runtime.provenance": { "id": "runtime.provenance", "category": "Environment", "status": "ok", "summary": "local debug build", "details": { "version": "0.0.0", "install method": "other", "commit": "unknown" } }, "sandbox.helpers": { "id": "sandbox.helpers", "category": "Configuration", "status": "ok", "summary": "restricted fs + restricted network · approval OnRequest", "details": { "approval policy": "OnRequest", "filesystem sandbox": "restricted", "network sandbox": "restricted" } } } } ``` ### `/feedback` new sentry attachment <img width="938" height="798" alt="CleanShot 2026-05-13 at 15 36 14" src="https://github.com/user-attachments/assets/715e62e0-d7b4-4fea-a35a-fd5d5d33c4c0" /> ### New section in CLI issue template <img width="1164" height="435" alt="CleanShot 2026-05-13 at 15 47 24" src="https://github.com/user-attachments/assets/9081dc25-a28c-4afa-8ba1-e299c2b4031d" /> ## How to Test 1. Run `cargo run --bin codex -- doctor --no-color`. 2. Confirm the detailed report is the default and includes promoted Notes, grouped sections, terminal details, state DB integrity, rollout stats, provider reachability, WebSocket diagnostics, and app-server status. 3. Run `cargo run --bin codex -- doctor --summary --no-color`. 4. Confirm the compact view keeps the same sections and summary counts but omits detailed key/value rows. 5. Run `cargo run --bin codex -- doctor --json`. 6. Confirm the output is redacted JSON, `checks` is an object keyed by check id, and each check's `details` is a key/value object. 7. Preview the CLI bug issue template and confirm the `Codex doctor report` field appears after the terminal field, asks for `codex doctor --json`, and renders pasted output as JSON. 8. Start a feedback flow that includes logs. 9. Confirm the upload consent copy lists `codex-doctor-report.json` alongside the log attachments. Targeted tests: - `cargo test -p codex-cli doctor` - `cargo test -p codex-app-server doctor_report_tags_summarize_status_counts` - `cargo test -p codex-feedback` - `cargo test -p codex-tui feedback_view` - `just argument-comment-lint` - `git diff --check`
520 lines
16 KiB
Rust
520 lines
16 KiB
Rust
use crate::AgentJob;
|
|
use crate::AgentJobCreateParams;
|
|
use crate::AgentJobItem;
|
|
use crate::AgentJobItemCreateParams;
|
|
use crate::AgentJobItemStatus;
|
|
use crate::AgentJobProgress;
|
|
use crate::AgentJobStatus;
|
|
use crate::LOGS_DB_FILENAME;
|
|
use crate::LogEntry;
|
|
use crate::LogQuery;
|
|
use crate::LogRow;
|
|
use crate::STATE_DB_FILENAME;
|
|
use crate::SortKey;
|
|
use crate::ThreadMetadata;
|
|
use crate::ThreadMetadataBuilder;
|
|
use crate::ThreadsPage;
|
|
use crate::apply_rollout_item;
|
|
use crate::migrations::runtime_logs_migrator;
|
|
use crate::migrations::runtime_state_migrator;
|
|
use crate::model::AgentJobRow;
|
|
use crate::model::ThreadGoalRow;
|
|
use crate::model::ThreadRow;
|
|
use crate::model::anchor_from_item;
|
|
use crate::model::datetime_to_epoch_millis;
|
|
use crate::model::datetime_to_epoch_seconds;
|
|
use crate::model::epoch_millis_to_datetime;
|
|
use crate::paths::file_modified_time_utc;
|
|
use crate::telemetry::DbKind;
|
|
use crate::telemetry::DbTelemetry;
|
|
use chrono::DateTime;
|
|
use chrono::Utc;
|
|
use codex_protocol::ThreadId;
|
|
use codex_protocol::dynamic_tools::DynamicToolSpec;
|
|
use codex_protocol::protocol::RolloutItem;
|
|
use log::LevelFilter;
|
|
use serde_json::Value;
|
|
use sqlx::ConnectOptions;
|
|
use sqlx::QueryBuilder;
|
|
use sqlx::Row;
|
|
use sqlx::Sqlite;
|
|
use sqlx::SqliteConnection;
|
|
use sqlx::SqlitePool;
|
|
use sqlx::migrate::Migrator;
|
|
use sqlx::sqlite::SqliteAutoVacuum;
|
|
use sqlx::sqlite::SqliteConnectOptions;
|
|
use sqlx::sqlite::SqliteJournalMode;
|
|
use sqlx::sqlite::SqlitePoolOptions;
|
|
use sqlx::sqlite::SqliteSynchronous;
|
|
use std::collections::BTreeSet;
|
|
use std::path::Path;
|
|
use std::path::PathBuf;
|
|
use std::sync::Arc;
|
|
use std::sync::atomic::AtomicI64;
|
|
use std::time::Duration;
|
|
use std::time::Instant;
|
|
use tracing::warn;
|
|
|
|
mod agent_jobs;
|
|
mod backfill;
|
|
mod goals;
|
|
mod logs;
|
|
mod memories;
|
|
mod remote_control;
|
|
#[cfg(test)]
|
|
mod test_support;
|
|
mod threads;
|
|
|
|
pub use goals::ThreadGoalAccountingMode;
|
|
pub use goals::ThreadGoalAccountingOutcome;
|
|
pub use goals::ThreadGoalUpdate;
|
|
pub use remote_control::RemoteControlEnrollmentRecord;
|
|
pub use threads::ThreadFilterOptions;
|
|
|
|
// "Partition" is the retained-log-content bucket we cap at 10 MiB:
|
|
// - one bucket per non-null thread_id
|
|
// - one bucket per threadless (thread_id IS NULL) non-null process_uuid
|
|
// - one bucket for threadless rows with process_uuid IS NULL
|
|
// This budget tracks each row's persisted rendered log body plus non-body
|
|
// metadata, rather than the exact sum of all persisted SQLite column bytes.
|
|
const LOG_PARTITION_SIZE_LIMIT_BYTES: i64 = 10 * 1024 * 1024;
|
|
const LOG_PARTITION_ROW_LIMIT: i64 = 1_000;
|
|
|
|
#[derive(Clone)]
|
|
pub struct StateRuntime {
|
|
codex_home: PathBuf,
|
|
default_provider: String,
|
|
pool: Arc<sqlx::SqlitePool>,
|
|
logs_pool: Arc<sqlx::SqlitePool>,
|
|
thread_updated_at_millis: Arc<AtomicI64>,
|
|
}
|
|
|
|
impl StateRuntime {
|
|
/// Initialize the state runtime using the provided Codex home and default provider.
|
|
///
|
|
/// This opens (and migrates) the SQLite databases under `codex_home`,
|
|
/// keeping logs in a dedicated file to reduce lock contention with the
|
|
/// rest of the state store.
|
|
pub async fn init(codex_home: PathBuf, default_provider: String) -> anyhow::Result<Arc<Self>> {
|
|
Self::init_inner(
|
|
codex_home,
|
|
default_provider,
|
|
/*telemetry_override*/ None,
|
|
)
|
|
.await
|
|
}
|
|
|
|
#[cfg(test)]
|
|
pub(crate) async fn init_with_telemetry_for_tests(
|
|
codex_home: PathBuf,
|
|
default_provider: String,
|
|
telemetry_override: &dyn DbTelemetry,
|
|
) -> anyhow::Result<Arc<Self>> {
|
|
Self::init_inner(codex_home, default_provider, Some(telemetry_override)).await
|
|
}
|
|
|
|
async fn init_inner(
|
|
codex_home: PathBuf,
|
|
default_provider: String,
|
|
telemetry_override: Option<&dyn DbTelemetry>,
|
|
) -> anyhow::Result<Arc<Self>> {
|
|
tokio::fs::create_dir_all(&codex_home).await?;
|
|
let state_migrator = runtime_state_migrator();
|
|
let logs_migrator = runtime_logs_migrator();
|
|
let state_path = state_db_path(codex_home.as_path());
|
|
let logs_path = logs_db_path(codex_home.as_path());
|
|
let pool = match open_state_sqlite(&state_path, &state_migrator, telemetry_override).await {
|
|
Ok(db) => Arc::new(db),
|
|
Err(err) => {
|
|
warn!("failed to open state db at {}: {err}", state_path.display());
|
|
return Err(err);
|
|
}
|
|
};
|
|
let logs_pool = match open_logs_sqlite(&logs_path, &logs_migrator, telemetry_override).await
|
|
{
|
|
Ok(db) => Arc::new(db),
|
|
Err(err) => {
|
|
warn!("failed to open logs db at {}: {err}", logs_path.display());
|
|
return Err(err);
|
|
}
|
|
};
|
|
let started = Instant::now();
|
|
let backfill_state_result = ensure_backfill_state_row_in_pool(pool.as_ref()).await;
|
|
crate::telemetry::record_init_result(
|
|
telemetry_override,
|
|
DbKind::State,
|
|
"ensure_backfill_state",
|
|
started.elapsed(),
|
|
&backfill_state_result,
|
|
);
|
|
backfill_state_result?;
|
|
let started = Instant::now();
|
|
let thread_updated_at_millis_result: anyhow::Result<Option<i64>> =
|
|
sqlx::query_scalar("SELECT MAX(threads.updated_at_ms) FROM threads")
|
|
.fetch_one(pool.as_ref())
|
|
.await
|
|
.map_err(anyhow::Error::from);
|
|
crate::telemetry::record_init_result(
|
|
telemetry_override,
|
|
DbKind::State,
|
|
"post_init_query",
|
|
started.elapsed(),
|
|
&thread_updated_at_millis_result,
|
|
);
|
|
let thread_updated_at_millis = thread_updated_at_millis_result?;
|
|
let thread_updated_at_millis = thread_updated_at_millis.unwrap_or(0);
|
|
let runtime = Arc::new(Self {
|
|
pool,
|
|
logs_pool,
|
|
codex_home,
|
|
default_provider,
|
|
thread_updated_at_millis: Arc::new(AtomicI64::new(thread_updated_at_millis)),
|
|
});
|
|
if let Err(err) = runtime.run_logs_startup_maintenance().await {
|
|
warn!(
|
|
"failed to run startup maintenance for logs db at {}: {err}",
|
|
logs_path.display(),
|
|
);
|
|
}
|
|
Ok(runtime)
|
|
}
|
|
|
|
/// Return the configured Codex home directory for this runtime.
|
|
pub fn codex_home(&self) -> &Path {
|
|
self.codex_home.as_path()
|
|
}
|
|
}
|
|
|
|
fn base_sqlite_options(path: &Path) -> SqliteConnectOptions {
|
|
SqliteConnectOptions::new()
|
|
.filename(path)
|
|
.create_if_missing(true)
|
|
.journal_mode(SqliteJournalMode::Wal)
|
|
.synchronous(SqliteSynchronous::Normal)
|
|
.busy_timeout(Duration::from_secs(5))
|
|
.log_statements(LevelFilter::Off)
|
|
}
|
|
|
|
async fn open_state_sqlite(
|
|
path: &Path,
|
|
migrator: &Migrator,
|
|
telemetry_override: Option<&dyn DbTelemetry>,
|
|
) -> anyhow::Result<SqlitePool> {
|
|
// New state DBs should use incremental auto-vacuum, but retrofitting an
|
|
// existing DB requires a full VACUUM. Do not attempt that during process
|
|
// startup: it is maintenance work that can contend with foreground writers.
|
|
open_sqlite(
|
|
path,
|
|
migrator,
|
|
DbKind::State,
|
|
"open_state",
|
|
"migrate_state",
|
|
telemetry_override,
|
|
)
|
|
.await
|
|
}
|
|
|
|
async fn open_logs_sqlite(
|
|
path: &Path,
|
|
migrator: &Migrator,
|
|
telemetry_override: Option<&dyn DbTelemetry>,
|
|
) -> anyhow::Result<SqlitePool> {
|
|
open_sqlite(
|
|
path,
|
|
migrator,
|
|
DbKind::Logs,
|
|
"open_logs",
|
|
"migrate_logs",
|
|
telemetry_override,
|
|
)
|
|
.await
|
|
}
|
|
|
|
async fn open_sqlite(
|
|
path: &Path,
|
|
migrator: &Migrator,
|
|
db: DbKind,
|
|
open_phase: &'static str,
|
|
migrate_phase: &'static str,
|
|
telemetry_override: Option<&dyn DbTelemetry>,
|
|
) -> anyhow::Result<SqlitePool> {
|
|
let options = base_sqlite_options(path).auto_vacuum(SqliteAutoVacuum::Incremental);
|
|
let started = Instant::now();
|
|
let pool_result = SqlitePoolOptions::new()
|
|
.max_connections(5)
|
|
.connect_with(options)
|
|
.await
|
|
.map_err(anyhow::Error::from);
|
|
crate::telemetry::record_init_result(
|
|
telemetry_override,
|
|
db,
|
|
open_phase,
|
|
started.elapsed(),
|
|
&pool_result,
|
|
);
|
|
let pool = pool_result?;
|
|
let started = Instant::now();
|
|
let migrate_result = migrator.run(&pool).await.map_err(anyhow::Error::from);
|
|
crate::telemetry::record_init_result(
|
|
telemetry_override,
|
|
db,
|
|
migrate_phase,
|
|
started.elapsed(),
|
|
&migrate_result,
|
|
);
|
|
migrate_result?;
|
|
Ok(pool)
|
|
}
|
|
|
|
pub(super) async fn ensure_backfill_state_row_in_pool(
|
|
pool: &sqlx::SqlitePool,
|
|
) -> anyhow::Result<()> {
|
|
sqlx::query(
|
|
r#"
|
|
INSERT INTO backfill_state (id, status, last_watermark, last_success_at, updated_at)
|
|
VALUES (?, ?, NULL, NULL, ?)
|
|
ON CONFLICT(id) DO NOTHING
|
|
"#,
|
|
)
|
|
.bind(1_i64)
|
|
.bind(crate::BackfillStatus::Pending.as_str())
|
|
.bind(Utc::now().timestamp())
|
|
.execute(pool)
|
|
.await?;
|
|
Ok(())
|
|
}
|
|
|
|
pub fn state_db_filename() -> String {
|
|
STATE_DB_FILENAME.to_string()
|
|
}
|
|
|
|
pub fn state_db_path(codex_home: &Path) -> PathBuf {
|
|
codex_home.join(state_db_filename())
|
|
}
|
|
|
|
pub fn logs_db_filename() -> String {
|
|
LOGS_DB_FILENAME.to_string()
|
|
}
|
|
|
|
pub fn logs_db_path(codex_home: &Path) -> PathBuf {
|
|
codex_home.join(logs_db_filename())
|
|
}
|
|
|
|
/// Run SQLite's built-in integrity check against an existing database file.
|
|
pub async fn sqlite_integrity_check(path: &Path) -> anyhow::Result<Vec<String>> {
|
|
let options = SqliteConnectOptions::new()
|
|
.filename(path)
|
|
.create_if_missing(false)
|
|
.read_only(true)
|
|
.log_statements(LevelFilter::Off);
|
|
let pool = SqlitePoolOptions::new()
|
|
.max_connections(1)
|
|
.connect_with(options)
|
|
.await?;
|
|
let rows = sqlx::query_scalar::<_, String>("PRAGMA integrity_check")
|
|
.fetch_all(&pool)
|
|
.await?;
|
|
pool.close().await;
|
|
Ok(rows)
|
|
}
|
|
|
|
#[cfg(test)]
|
|
mod tests {
|
|
use super::StateRuntime;
|
|
use super::open_state_sqlite;
|
|
use super::runtime_state_migrator;
|
|
use super::sqlite_integrity_check;
|
|
use super::state_db_path;
|
|
use super::test_support::unique_temp_dir;
|
|
use crate::DB_INIT_METRIC;
|
|
use crate::DbTelemetry;
|
|
use crate::migrations::STATE_MIGRATOR;
|
|
use pretty_assertions::assert_eq;
|
|
use sqlx::SqlitePool;
|
|
use sqlx::migrate::MigrateError;
|
|
use sqlx::sqlite::SqliteConnectOptions;
|
|
use std::collections::BTreeMap;
|
|
use std::collections::BTreeSet;
|
|
use std::path::Path;
|
|
use std::sync::Mutex;
|
|
|
|
#[derive(Default)]
|
|
struct TestTelemetry {
|
|
counters: Mutex<Vec<MetricEvent>>,
|
|
}
|
|
|
|
#[derive(Debug, Eq, PartialEq)]
|
|
struct MetricEvent {
|
|
name: String,
|
|
tags: BTreeMap<String, String>,
|
|
}
|
|
|
|
impl TestTelemetry {
|
|
fn counters(&self) -> Vec<MetricEvent> {
|
|
self.counters
|
|
.lock()
|
|
.expect("telemetry lock")
|
|
.iter()
|
|
.map(|event| MetricEvent {
|
|
name: event.name.clone(),
|
|
tags: event.tags.clone(),
|
|
})
|
|
.collect()
|
|
}
|
|
}
|
|
|
|
impl DbTelemetry for TestTelemetry {
|
|
fn counter(&self, name: &str, _inc: i64, tags: &[(&str, &str)]) {
|
|
self.counters
|
|
.lock()
|
|
.expect("telemetry lock")
|
|
.push(MetricEvent {
|
|
name: name.to_string(),
|
|
tags: tags_to_map(tags),
|
|
});
|
|
}
|
|
|
|
fn record_duration(
|
|
&self,
|
|
_name: &str,
|
|
_duration: std::time::Duration,
|
|
_tags: &[(&str, &str)],
|
|
) {
|
|
}
|
|
}
|
|
|
|
fn tags_to_map(tags: &[(&str, &str)]) -> BTreeMap<String, String> {
|
|
tags.iter()
|
|
.map(|(key, value)| ((*key).to_string(), (*value).to_string()))
|
|
.collect()
|
|
}
|
|
|
|
async fn open_db_pool(path: &Path) -> SqlitePool {
|
|
SqlitePool::connect_with(
|
|
SqliteConnectOptions::new()
|
|
.filename(path)
|
|
.create_if_missing(false),
|
|
)
|
|
.await
|
|
.expect("open sqlite pool")
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn sqlite_integrity_check_reports_ok_for_valid_db() {
|
|
let codex_home = unique_temp_dir();
|
|
tokio::fs::create_dir_all(&codex_home)
|
|
.await
|
|
.expect("create codex home");
|
|
let path = state_db_path(codex_home.as_path());
|
|
let pool = SqlitePool::connect_with(
|
|
SqliteConnectOptions::new()
|
|
.filename(&path)
|
|
.create_if_missing(true),
|
|
)
|
|
.await
|
|
.expect("open sqlite db");
|
|
sqlx::query("CREATE TABLE sample (id INTEGER PRIMARY KEY)")
|
|
.execute(&pool)
|
|
.await
|
|
.expect("create sample table");
|
|
pool.close().await;
|
|
|
|
let result = sqlite_integrity_check(&path)
|
|
.await
|
|
.expect("integrity check should run");
|
|
|
|
assert_eq!(result, vec!["ok".to_string()]);
|
|
let _ = tokio::fs::remove_dir_all(codex_home).await;
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn open_state_sqlite_tolerates_newer_applied_migrations() {
|
|
let codex_home = unique_temp_dir();
|
|
tokio::fs::create_dir_all(&codex_home)
|
|
.await
|
|
.expect("create codex home");
|
|
let state_path = state_db_path(codex_home.as_path());
|
|
let pool = SqlitePool::connect_with(
|
|
SqliteConnectOptions::new()
|
|
.filename(&state_path)
|
|
.create_if_missing(true),
|
|
)
|
|
.await
|
|
.expect("open state db");
|
|
STATE_MIGRATOR
|
|
.run(&pool)
|
|
.await
|
|
.expect("apply current state schema");
|
|
sqlx::query(
|
|
"INSERT INTO _sqlx_migrations (version, description, success, checksum, execution_time) VALUES (?, ?, ?, ?, ?)",
|
|
)
|
|
.bind(9_999_i64)
|
|
.bind("future migration")
|
|
.bind(true)
|
|
.bind(vec![1_u8, 2, 3, 4])
|
|
.bind(1_i64)
|
|
.execute(&pool)
|
|
.await
|
|
.expect("insert future migration record");
|
|
pool.close().await;
|
|
|
|
let strict_pool = open_db_pool(state_path.as_path()).await;
|
|
let strict_err = STATE_MIGRATOR
|
|
.run(&strict_pool)
|
|
.await
|
|
.expect_err("strict migrator should reject newer applied migrations");
|
|
assert!(matches!(strict_err, MigrateError::VersionMissing(9_999)));
|
|
strict_pool.close().await;
|
|
|
|
let tolerant_migrator = runtime_state_migrator();
|
|
let tolerant_pool = open_state_sqlite(
|
|
state_path.as_path(),
|
|
&tolerant_migrator,
|
|
/*telemetry_override*/ None,
|
|
)
|
|
.await
|
|
.expect("runtime migrator should tolerate newer applied migrations");
|
|
tolerant_pool.close().await;
|
|
|
|
let _ = tokio::fs::remove_dir_all(codex_home).await;
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn init_records_successful_sqlite_init_phases_to_explicit_telemetry() {
|
|
let codex_home = unique_temp_dir();
|
|
let telemetry = TestTelemetry::default();
|
|
|
|
let runtime = StateRuntime::init_with_telemetry_for_tests(
|
|
codex_home.clone(),
|
|
"test-provider".to_string(),
|
|
&telemetry,
|
|
)
|
|
.await
|
|
.expect("state runtime should initialize");
|
|
|
|
let phases = telemetry
|
|
.counters()
|
|
.into_iter()
|
|
.filter(|event| event.name == DB_INIT_METRIC)
|
|
.filter(|event| event.tags.get("status").map(String::as_str) == Some("success"))
|
|
.filter_map(|event| event.tags.get("phase").cloned())
|
|
.collect::<BTreeSet<_>>();
|
|
let expected = [
|
|
"open_state",
|
|
"migrate_state",
|
|
"open_logs",
|
|
"migrate_logs",
|
|
"ensure_backfill_state",
|
|
"post_init_query",
|
|
]
|
|
.into_iter()
|
|
.map(str::to_string)
|
|
.collect::<BTreeSet<_>>();
|
|
assert_eq!(phases, expected);
|
|
|
|
runtime.pool.close().await;
|
|
runtime.logs_pool.close().await;
|
|
let _ = tokio::fs::remove_dir_all(codex_home).await;
|
|
}
|
|
}
|