## Summary
Adds experimental `additionalContext` support to `turn/start` and
`turn/steer` so clients can provide ephemeral external context, such as
browser or automation state, without turning that plumbing into a
visible user prompt or triggering user-prompt lifecycle behavior.
## API Shape
The parameter shape is:
```ts
additionalContext?: Record<string, {
value: string
kind: "untrusted" | "application"
}> | null
```
Example:
```json
{
"additionalContext": {
"browser_info": {
"value": "Active tab is CI failures.",
"kind": "untrusted"
},
"automation_info": {
"value": "CI rerun is in progress.",
"kind": "application"
}
}
}
```
The keys are opaque and caller-defined.
## Context Injection
When provided, accepted entries are inserted into model context as
hidden contextual message items, not as visible thread user-message
items.
`kind: "untrusted"` entries are inserted with role `user`:
```text
<external_${key}>${value}</external_${key}>
```
`kind: "application"` entries are inserted with role `developer`:
```text
<${key}>${value}</${key}>
```
Values are not escaped. Each value is truncated to roughly 1k tokens
(by approximate token count) before wrapping.
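
As a rough illustration of the wrapping rules above, this TypeScript sketch maps one entry to a role and a wrapped text payload. `renderContextEntry`, `ContextItem`, and the character-based truncation budget are assumptions made for the example; the description only says values are truncated to about 1k approximate tokens and does not say how tokens are estimated.

```typescript
interface AdditionalContextEntry {
  value: string;
  kind: "untrusted" | "application";
}

// Hypothetical shape for a hidden contextual message item.
interface ContextItem {
  role: "user" | "developer";
  text: string;
}

// Assumption: roughly 4 characters per token. The real estimator may differ.
const APPROX_CHARS_PER_TOKEN = 4;
const MAX_VALUE_CHARS = 1000 * APPROX_CHARS_PER_TOKEN;

function renderContextEntry(key: string, entry: AdditionalContextEntry): ContextItem {
  // Truncate first, then wrap. The value is inserted without escaping.
  const value = entry.value.slice(0, MAX_VALUE_CHARS);
  if (entry.kind === "untrusted") {
    return { role: "user", text: `<external_${key}>${value}</external_${key}>` };
  }
  return { role: "developer", text: `<${key}>${value}</${key}>` };
}
```

Because values are not escaped, a value that itself contains a closing tag would appear verbatim inside the wrapper; the sketch makes no attempt to guard against that.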
For `turn/start`, accepted additional context is inserted before normal
user input. For `turn/steer`, additional context is merged only when the
steer includes non-empty user input; context-only steers are still
rejected as empty input.
## Dedupe Strategy
`AdditionalContextStore` lives on session state and stores the latest
complete additional-context map.
Each `turn/start` or non-empty `turn/steer` treats its
`additionalContext` as the current complete set of values. Entries are
injected only when the key is new or the exact entry for that key
changed, including `value` or `kind`. After merging, the store is
replaced with the provided map, so omitted keys are removed from the
retained set and can be injected again later if reintroduced.
Omitting `additionalContext`, passing `null`, or passing an empty object
resets the store to empty and injects nothing.
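
The dedupe rules above can be modeled with a small store. This is an illustrative TypeScript sketch, not the actual implementation: only the name `AdditionalContextStore` comes from the description, while the `merge` method and its return value (the keys to inject) are hypothetical.

```typescript
interface AdditionalContextEntry {
  value: string;
  kind: "untrusted" | "application";
}

type AdditionalContextMap = Record<string, AdditionalContextEntry>;

class AdditionalContextStore {
  // Latest complete additional-context map seen by the session.
  private latest = new Map<string, AdditionalContextEntry>();

  // Returns the keys that should be injected for this turn, then replaces
  // the retained set with the provided map (hypothetical method).
  merge(provided?: AdditionalContextMap | null): string[] {
    const source = provided ?? {}; // omitted or null behaves like an empty map
    const next = new Map<string, AdditionalContextEntry>();
    const toInject: string[] = [];
    for (const key of Object.keys(source)) {
      const cur = source[key] as AdditionalContextEntry;
      const prev = this.latest.get(key);
      // Inject when the key is new or either field of the entry changed.
      if (prev === undefined || prev.value !== cur.value || prev.kind !== cur.kind) {
        toInject.push(key);
      }
      next.set(key, { value: cur.value, kind: cur.kind });
    }
    // Replace rather than union, so omitted keys are forgotten and can be
    // injected again if they are reintroduced later.
    this.latest = next;
    return toInject;
  }
}
```

Replacing the map on every call is what makes an omitted key eligible for re-injection: once it drops out of the retained set, a later turn that supplies it again sees it as new.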
## What Changed
- Threads experimental v2 `additionalContext` through app-server into
core turn start and steer handling.
- Adds separate contextual fragment types for untrusted user-role
context and application developer-role context.
- Uses pending response input items so additional context can be
combined with normal user input without treating it as prompt text.
- Adds integration coverage for start/steer flow, role routing,
dedupe/reset behavior, deletion/re-add behavior, hook-blocked input
behavior, empty context-only steer rejection, external-fragment marker
matching, and truncation.
# Memories
This directory owns reusable memory crates and the memory pipeline documentation.
Runtime orchestration for Phase 1 and Phase 2 still lives in `codex-core` under
`codex-rs/core/src/memories/`.
## Crates
- `codex-rs/memories/read` (`codex-memories-read`) owns the read path: memory developer-instruction injection, memory citation parsing, and read-usage telemetry classification.
- `codex-rs/memories/write` (`codex-memories-write`) owns the write path: Phase 1 and Phase 2 prompt rendering, filesystem artifact helpers, workspace diff helpers, and extension resource pruning.
## Prompt Templates
Memory prompt templates live with the crate that uses them:
- The undated template files are the canonical latest versions used at runtime:
  - `read/templates/memories/read_path.md`
  - `write/templates/memories/stage_one_system.md`
  - `write/templates/memories/stage_one_input.md`
  - `write/templates/memories/consolidation.md`
- In `codex`, edit those undated template files in place.
- The dated snapshot-copy workflow is used in the separate `openai/project/agent_memory/write` harness repo, not here.
## When it runs
The pipeline is triggered when a root session starts, and only if:
- the session is not ephemeral
- the memory feature is enabled
- the session is not a sub-agent session
- the state DB is available
It runs asynchronously in the background and executes two phases in order: Phase 1, then Phase 2.
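
The startup gate and phase ordering above can be sketched as follows. All names here (`SessionStartInfo`, `shouldStartMemoryPipeline`, `maybeStartMemoryPipeline`) are hypothetical and only mirror the four listed conditions; the real orchestration lives in `codex-core` and is not shown in this document.

```typescript
// Hypothetical view of the facts known when a root session starts.
interface SessionStartInfo {
  ephemeral: boolean;
  memoryFeatureEnabled: boolean;
  isSubAgent: boolean;
  stateDbAvailable: boolean;
}

function shouldStartMemoryPipeline(s: SessionStartInfo): boolean {
  return !s.ephemeral && s.memoryFeatureEnabled && !s.isSubAgent && s.stateDbAvailable;
}

// Runs Phase 1 to completion before Phase 2 starts.
async function runMemoryPipeline(
  phase1: () => Promise<void>,
  phase2: () => Promise<void>,
): Promise<void> {
  await phase1();
  await phase2();
}

// Returns null when the gate fails. Otherwise returns the in-flight promise,
// which the caller can leave un-awaited so the pipeline runs in the background.
function maybeStartMemoryPipeline(
  s: SessionStartInfo,
  phase1: () => Promise<void>,
  phase2: () => Promise<void>,
): Promise<void> | null {
  if (!shouldStartMemoryPipeline(s)) return null;
  return runMemoryPipeline(phase1, phase2);
}
```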
## Phase 1: Rollout Extraction (per-thread)
Phase 1 finds recent eligible rollouts and extracts a structured memory from each one.
Eligible rollouts are selected from the state DB using startup claim rules. In practice this means the pipeline only considers rollouts that are:
- from allowed interactive session sources
- within the configured age window
- idle long enough (to avoid summarizing still-active/fresh rollouts)
- not already owned by another in-flight phase-1 worker
- within startup scan/claim limits (bounded work per startup)
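
As a sketch of how these eligibility rules combine, the filter below applies each condition in turn and then caps the result. Every field and config name is hypothetical, and the real selection happens inside the state DB rather than in application code.

```typescript
// Hypothetical candidate row; timestamps are epoch milliseconds.
interface RolloutCandidate {
  source: string;
  createdAt: number;
  lastActivityAt: number;
  leasedByOtherWorker: boolean;
}

// Hypothetical configuration mirroring the listed rules.
interface ClaimConfig {
  allowedSources: readonly string[];
  maxAgeMs: number;
  minIdleMs: number;
  maxClaimed: number;
}

function claimEligible(
  candidates: readonly RolloutCandidate[],
  cfg: ClaimConfig,
  now: number,
): RolloutCandidate[] {
  return candidates
    .filter((c) =>
      cfg.allowedSources.indexOf(c.source) !== -1 && // allowed interactive source
      now - c.createdAt <= cfg.maxAgeMs &&           // within the age window
      now - c.lastActivityAt >= cfg.minIdleMs &&     // idle long enough
      !c.leasedByOtherWorker)                        // not owned by another worker
    .slice(0, cfg.maxClaimed);                       // bounded work per startup
}
```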
What it does:
- claims a bounded set of rollout jobs from the state DB (startup claim)
- filters rollout content down to memory-relevant response items
- sends each rollout to a model (in parallel, with a concurrency cap)
- expects structured output containing:
  - a detailed `raw_memory`
  - a compact `rollout_summary`
  - an optional `rollout_slug`
- redacts secrets from the generated memory fields
- stores successful outputs back into the state DB as stage-1 outputs
Concurrency / coordination:
- Phase 1 runs multiple extraction jobs in parallel (with a fixed concurrency cap) so startup memory generation can process several rollouts at once.
- Each job is leased/claimed in the state DB before processing, which prevents duplicate work across concurrent workers/startups.
- Failed jobs are marked with retry backoff, so they are retried later instead of hot-looping.
Job outcomes:
- `succeeded` (memory produced)
- `succeeded_no_output` (valid run but nothing useful generated)
- `failed` (with retry backoff/lease handling in DB)
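
The capped parallelism and the three job outcomes can be sketched with a small worker-lane pool. This is an illustration under assumptions: `runPhase1Jobs` and the `extract` callback are hypothetical stand-ins for the model call, and DB leasing and retry backoff are only noted in comments.

```typescript
type Phase1Outcome = "succeeded" | "succeeded_no_output" | "failed";

// `extract` resolves to null when the run was valid but produced nothing useful.
async function runPhase1Jobs<T>(
  rollouts: readonly T[],
  cap: number,
  extract: (rollout: T) => Promise<string | null>,
): Promise<Phase1Outcome[]> {
  const outcomes = new Array<Phase1Outcome>(rollouts.length).fill("failed");
  let next = 0;

  // Each lane pulls the next unprocessed index until the queue is drained,
  // so at most `cap` extractions are in flight at once.
  async function lane(): Promise<void> {
    while (next < rollouts.length) {
      const index = next++;
      try {
        const memory = await extract(rollouts[index] as T);
        outcomes[index] = memory === null ? "succeeded_no_output" : "succeeded";
      } catch {
        // The real pipeline would also record retry backoff in the state DB.
        outcomes[index] = "failed";
      }
    }
  }

  // Always start at least one lane, even if `cap` is zero or negative.
  const laneCount = Math.max(1, Math.min(cap, rollouts.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return outcomes;
}
```

A failure in one job only affects its own outcome slot; the other lanes keep draining the queue.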
Phase 1 is the stage that turns individual rollouts into DB-backed memory records.
## Phase 2: Global Consolidation
Phase 2 consolidates the latest stage-1 outputs into the filesystem memory artifacts and then runs a dedicated consolidation agent.
What it does:
- claims a single global phase-2 lock before touching the memories root (so only one consolidation inspects or mutates the workspace at a time)
- loads a bounded set of stage-1 outputs from the state DB using phase-2 selection rules:
  - ignores memories whose `last_usage` falls outside the configured `max_unused_days` window
  - for memories with no `last_usage`, falls back to `generated_at` so fresh never-used memories can still be selected
  - ranks eligible memories by `usage_count` first, then by the most recent `last_usage` / `generated_at`
- computes a completion watermark from the claimed watermark + newest input timestamps
- syncs local memory artifacts under the memories root:
  - `raw_memories.md` (merged raw memories, stable ascending thread-id order)
  - `rollout_summaries/` (one summary file per selected rollout)
- keeps the memories root itself as a git-baseline directory, initialized under `~/.codex/memories/.git` by `codex-git-utils`
- prunes stale rollout summaries that are no longer selected
- prunes memory extension resource files older than the extension retention window, so cleanup appears in the workspace diff
- writes `phase2_workspace_diff.md` in the memories root with the git-style diff from the previous successful Phase 2 baseline to the current worktree
- if the memory workspace has no changes after artifact sync/pruning, marks the job successful and exits
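
The phase-2 selection rules in the list above can be sketched as a filter followed by a two-key sort. The `Stage1Output` shape, the millisecond timestamps, and the function names are assumptions for illustration; the actual selection is performed against the state DB.

```typescript
// Hypothetical in-memory view of a stage-1 row; timestamps are epoch milliseconds.
interface Stage1Output {
  threadId: string;
  usageCount: number;
  lastUsage: number | null;
  generatedAt: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Recency falls back to generatedAt when the memory has never been used.
function recency(m: Stage1Output): number {
  return m.lastUsage ?? m.generatedAt;
}

function selectForPhase2(
  outputs: readonly Stage1Output[],
  now: number,
  maxUnusedDays: number,
  limit: number,
): Stage1Output[] {
  const cutoff = now - maxUnusedDays * DAY_MS;
  return outputs
    .filter((m) => recency(m) >= cutoff) // drop memories outside the window
    .sort((a, b) => b.usageCount - a.usageCount || recency(b) - recency(a))
    .slice(0, limit); // bounded top-N selection
}

// raw_memories.md is rendered in ascending thread-id order, independent of rank.
function renderOrder(selected: readonly Stage1Output[]): Stage1Output[] {
  return [...selected].sort((a, b) =>
    a.threadId < b.threadId ? -1 : a.threadId > b.threadId ? 1 : 0);
}
```

Keeping the render order separate from the ranking means a change in usage rank alone does not reorder `raw_memories.md`, which keeps the workspace diff quiet.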
If the memory workspace has changes, it then:
- spawns an internal consolidation sub-agent
- builds the Phase 2 prompt with the path to the generated workspace diff
- points the agent at `phase2_workspace_diff.md` for the detailed diff context
- runs it with no approvals, no network, and local write access only
- disables collab for that agent (to prevent recursive delegation)
- watches the agent status and heartbeats the global job lease while it runs
- resets the memory git baseline after the agent completes successfully; the generated diff file is removed before this reset so deleted content is not kept in the prompt artifact or unreachable git objects
- marks the phase-2 job success/failure in the state DB when the agent finishes
Selection and workspace-diff behavior:
- successful Phase 2 runs mark the exact stage-1 snapshots they consumed with `selected_for_phase2 = 1` and persist the matching `selected_for_phase2_source_updated_at`
- Phase 1 upserts preserve the previous `selected_for_phase2` baseline until the next successful Phase 2 run rewrites it
- Phase 2 loads only the current top-N selected stage-1 inputs, syncs `rollout_summaries/` directly to that selection, renders `raw_memories.md` in stable ascending thread-id order to avoid usage-rank churn, then lets the git-style workspace diff surface additions, modifications, and deletions against the previous successful memory baseline
- when the selected input set is empty, stale `rollout_summaries/` files are removed and `raw_memories.md` is rewritten to the empty-input placeholder; consolidated outputs such as `MEMORY.md`, `memory_summary.md`, and `skills/` are left for the agent to update
Watermark behavior:
- The global phase-2 lock does not use DB watermarks as a dirty check; git workspace dirtiness decides whether an agent needs to run.
- The global phase-2 job row still tracks an input watermark as bookkeeping for the latest DB input timestamp known when the job was claimed.
- Phase 2 recomputes a `new_watermark` using the max of:
  - the claimed watermark
  - the newest `source_updated_at` timestamp in the stage-1 inputs it actually loaded
- On success, Phase 2 stores that completion watermark in the DB.
- This avoids moving the recorded completion watermark backwards, but does not decide whether Phase 2 has work.
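
The watermark rule above reduces to a single max computation. In this sketch, `completionWatermark` is a hypothetical helper and timestamps are assumed to be plain numbers.

```typescript
// Returns the larger of the claimed watermark and the newest
// source_updated_at among the stage-1 inputs that were actually loaded.
function completionWatermark(
  claimedWatermark: number,
  loadedSourceUpdatedAt: readonly number[],
): number {
  return loadedSourceUpdatedAt.reduce((max, ts) => Math.max(max, ts), claimedWatermark);
}
```

Seeding the reduction with the claimed watermark is what keeps the stored value from moving backwards, including when no inputs were loaded.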
In practice, this phase is responsible for refreshing the on-disk memory workspace and producing/updating the higher-level consolidated memory outputs.
## Why it is split into two phases
- Phase 1 scales across many rollouts and produces normalized per-rollout memory records.
- Phase 2 serializes global consolidation so the shared memory artifacts are updated safely and consistently.