mirror of
https://github.com/openai/codex.git
synced 2026-05-02 10:26:45 +00:00
[codex-backend] Make thread metadata updates tolerate pending backfill (#16877)
### Summary

Fix `thread/metadata/update` so it can still patch stored thread metadata when the list/backfill-gated `get_state_db(...)` path is unavailable.

What was happening:
- The app logs showed `thread/metadata/update` failing with `sqlite state db unavailable for thread ...`.
- This was not isolated to one bad thread. Once the failure started for a user, branch metadata updates failed 100% of the time for that user.
- Reports were staggered across users, which points at local app-server / local SQLite state rather than one global server-side failure.
- Turns could still start immediately after the metadata update failed, which suggests the thread itself was valid and the failure was in the metadata endpoint DB-handle path.

The fix:
- Keep using the loaded thread state DB and the normal `get_state_db(...)` fallback first.
- If that still returns `None`, open `StateRuntime::init(...)` directly for this targeted metadata update path.
- Log the direct state runtime init error if that final fallback also fails, so future reports have the real DB-open cause instead of only the generic unavailable error.
- Add a regression test where the DB exists but backfill is not complete, and verify `thread/metadata/update` can still repair the stored rollout thread and patch `gitInfo`.

Relevant context / suspect PRs:
- #16434 changed state DB startup to run auto-vacuum / incremental vacuum. This is the most suspicious timing match for per-user, staggered local SQLite availability failures.
- #16433 dropped the old log table from the state DB, also near the timing window.
- #13280 introduced this endpoint and made it rely on SQLite for git metadata without resuming the thread.
- #14859 and #14888 added/consumed persisted model + reasoning effort metadata. I checked these because of the new thread metadata fields, but this failure happens before the endpoint reaches thread-row update/load logic, so they seem less likely as the direct cause.
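The fallback order listed under "The fix" can be sketched in isolation. This is a minimal synchronous sketch, not the code from this commit: `StateDb`, `resolve_state_db`, and the closure parameters are hypothetical stand-ins for the loaded thread state DB, the gated `get_state_db(...)` lookup, and the direct `StateRuntime::init(...)` call (which is async in the real code).

```rust
/// Hypothetical stand-in for a state DB handle; the real type lives in the codex crates.
#[derive(Debug, Clone, PartialEq)]
struct StateDb {
    /// Records which path produced the handle, for illustration only.
    source: &'static str,
}

/// Returns the first available handle, trying the sources in order:
/// 1. the handle already loaded with the thread,
/// 2. the normal (backfill-gated) lookup,
/// 3. a direct open, used only when both earlier sources yield `None`.
fn resolve_state_db(
    loaded: Option<StateDb>,
    gated_lookup: impl FnOnce() -> Option<StateDb>,
    direct_init: impl FnOnce() -> Result<StateDb, String>,
) -> Option<StateDb> {
    loaded.or_else(gated_lookup).or_else(|| match direct_init() {
        Ok(db) => Some(db),
        Err(err) => {
            // Log the real open error so a later "unavailable" report has a cause attached.
            eprintln!("failed to initialize state db for thread metadata update: {err}");
            None
        }
    })
}
```

Because each later source is passed as a closure, it is only evaluated when every earlier source returned `None`, so the direct open stays limited to the case the summary describes. The caller still has to handle a final `None`, as the endpoint does by sending an internal error.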
### Testing

- `cargo fmt -- --config imports_granularity=Item` completed; local stable rustfmt emitted warnings that `imports_granularity` is unstable
- `cargo test -p codex-app-server thread_metadata_update`
- `git diff --check`
committed by
GitHub
parent 54dbbb839e
commit 4ce97cef02
@@ -2754,6 +2754,24 @@ impl CodexMessageProcessor {
         if state_db_ctx.is_none() {
             state_db_ctx = get_state_db(&self.config).await;
         }
+        if state_db_ctx.is_none() {
+            match StateRuntime::init(
+                self.config.sqlite_home.clone(),
+                self.config.model_provider_id.clone(),
+            )
+            .await
+            {
+                Ok(ctx) => {
+                    state_db_ctx = Some(ctx);
+                }
+                Err(err) => {
+                    warn!(
+                        "failed to initialize state db for thread metadata update at {}: {err}",
+                        self.config.sqlite_home.display()
+                    );
+                }
+            }
+        }
         let Some(state_db_ctx) = state_db_ctx else {
             self.send_internal_error(
                 request_id,