Commit Graph

16 Commits

Author SHA1 Message Date
jif-oai
74ecd6e3b2 state: add memory consolidation lock primitives (#11199)
## Summary
- add a migration for memory_consolidation_locks
- add acquire/release lock primitives to codex-state runtime
- add core/state_db wrappers and cwd normalization for memory queries
and lock keys

## Testing
- cargo test -p codex-state memory_consolidation_lock_
- cargo test -p codex-core --lib state_db::
2026-02-09 21:04:20 +00:00
jif-oai
3800173459 feat: backfill async again (#10894) 2026-02-06 15:41:52 +01:00
jif-oai
428a9f6035 feat: wait for backfill to be ready (#10790) 2026-02-05 20:45:16 +00:00
jif-oai
68e82e5dc9 nit: add DB version is discrepancy recording (#10762) 2026-02-05 16:24:18 +00:00
jif-oai
901215e310 feat: repair DB in case of missing lines (#10751) 2026-02-05 16:21:49 +00:00
jif-oai
4033f905c6 feat: resumable backfill (#10745)
## Summary

This PR makes SQLite rollout backfill resumable and repeatable instead
of one-shot-on-db-create.

## What changed

- Added a persisted backfill state table:
  - state/migrations/0008_backfill_state.sql
- Tracks status (pending|running|complete), last_watermark, and
last_success_at.
- Added backfill state model/types in codex-state:
  - BackfillState, BackfillStatus (state/src/model/backfill_state.rs)
- Added runtime APIs to manage backfill lifecycle/progress:
  - get_backfill_state
  - mark_backfill_running
  - checkpoint_backfill
  - mark_backfill_complete
- Updated core startup behavior:
- Backfill now runs whenever state is not Complete (not only when DB
file is newly created).
- Reworked backfill execution:
- Collect rollout files, derive deterministic watermark per path, sort,
resume from last_watermark.
- Process in batches (BACKFILL_BATCH_SIZE = 200), checkpoint after each
batch.
  - Mark complete with last_success_at at the end.

## Why

Previous behavior could leave users permanently partially backfilled if
the process exited during initial async backfill. This change allows
safe continuation across restarts and avoids restarting from scratch.
2026-02-05 14:34:34 +00:00
jif-oai
4922b3e571 feat: add phase 1 mem db (#10634)
- Schema: thread_id (PK, FK to threads.id with cascade delete),
trace_summary, memory_summary, updated_at.
- Migration: creates the table and an index on (updated_at DESC,
thread_id DESC) for efficient recent-first reads.
  - Runtime API (DB-only):
      - `get_thread_memory(thread_id)`: fetch one memory row.
- `upsert_thread_memory(thread_id, trace_summary, memory_summary)`:
insert/update by thread id and always advance updated_at.
- `get_last_n_thread_memories_for_cwd(cwd, n)`: join thread_memory with
threads and return newest n rows for an exact cwd match.
- Model layer: introduced ThreadMemory and row conversion types to keep
query decoding typed and consistent with existing state models.
2026-02-04 21:38:39 +00:00
jif-oai
583e5d4f41 Migrate state DB path helpers to versioned filename (#10623)
Summary
- add versioned state sqlite filename helpers and re-export them from
the state crate
- remove legacy state files when initializing the runtime and update
consumers/tests to use the new helpers
- tweak logs client description and database resolution to match the new
path
2026-02-04 14:31:12 +00:00
jif-oai
100eb6e6f0 Prefer state DB thread listings before filesystem (#10544)
Summary
- add Cursor/ThreadsPage conversions so state DB listings can be mapped
back into the rollout list model
- make recorder list helpers query the state DB first (archived flag
included) and only fall back to file traversal if needed, along with
populating head bytes lazily
- add extensive tests to ensure the DB path is honored for active and
archived threads and that the fallback works

Testing
- Not run (not requested)

<img width="1196" height="693" alt="Screenshot 2026-02-03 at 20 42 33"
src="https://github.com/user-attachments/assets/826b3c7a-ef11-4b27-802a-3c343695794a"
/>
2026-02-04 09:27:24 +00:00
Celia Chen
fb2df99cf1 [feat] persist thread_dynamic_tools in db (#10252)
Persist thread_dynamic_tools in sqlite and read first from it. Fall back
to rollout files if it's not found. Persist dynamic tools to both sqlite
and rollout files.

Saw that new sessions get populated to db correctly & old sessions get
backfilled correctly at startup:
```
celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \      "select thread_id, position,name,description,input_schema from thread_dynamic_tools;"
019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
....
019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
```
2026-02-03 00:06:44 +00:00
jif-oai
e9a774e7ae fix: thread listing (#10383) 2026-02-02 12:52:49 +00:00
jif-oai
887bec0dee chore: do not clean the DB anymore (#10232) 2026-01-30 18:23:00 +01:00
jif-oai
25ad414680 chore: unify metric (#10220) 2026-01-30 11:13:43 +01:00
jif-oai
714dc8d8bd feat: async backfill (#10089) 2026-01-29 09:57:50 +00:00
jif-oai
780482da84 feat: add log db (#10086)
Add a log DB. The goal is just to store our logs in a `.sqlite` DB to
make it easier to crawl them and drop the oldest ones.
2026-01-29 10:23:03 +01:00
jif-oai
3878c3dc7c feat: sqlite 1 (#10004)
Add a `.sqlite` database to be used to store rollout metatdata (and
later logs)
This PR is phase 1:
* Add the database and the required infrastructure
* Add a backfill of the database
* Persist the newly created rollout both in files and in the DB
* When we need to get metadata or a rollout, consider the `JSONL` as the
source of truth but compare the results with the DB and show any errors
2026-01-28 15:29:14 +01:00