feat: export and replay effective config locks (#20405)

## Why

A hand-written `config.toml` is not enough to reproduce what a Codex
session actually ran with: layered config, CLI overrides, defaults,
feature aliases, resolved feature config, prompt setup, and
model-catalog/session values can all affect the final runtime behavior.

This PR adds an effective-config lockfile workflow: one run exports the
resolved session config, and a later run replays that lockfile and fails
early if the regenerated effective config drifts.

## What Changed

- Add a dedicated `ConfigLockfileToml` wrapper with top-level lockfile
metadata plus the replayable config:

  ```toml
  version = 1
  codex_version = "..."

  [config]
  # effective ConfigToml fields
  ```

- Keep lockfile metadata out of regular `ConfigToml`; replay loads
`ConfigLockfileToml` and then uses its nested `config` as the
authoritative config layer.
- Add `debug.config_lockfile.export_dir` to write
`<thread_id>.config.lock.toml` when a root session starts.
- Add `debug.config_lockfile.load_path` to replay a saved lockfile and
validate the regenerated session lockfile against it.
- Add `debug.config_lockfile.allow_codex_version_mismatch` to optionally
tolerate Codex binary version drift while still comparing the rest of
the lockfile.
- Add `debug.config_lockfile.save_fields_resolved_from_model_catalog` so
lock creation can either save model-catalog/session-resolved fields or
intentionally leave those fields dynamic.
- Build lockfiles from the effective config plus resolved runtime values
such as model selection, reasoning settings, prompts, service tier, web
search mode, feature states/config, memories config, skill instructions,
and agent limits.
- Materialize feature aliases and custom feature config into the
lockfile so replay compares canonical resolved behavior instead of
user-authored alias shape.
- Strip profile/debug/file-include/environment-specific inputs from
generated lockfiles so they contain replayable values rather than the
inputs that produced those values.
- Surface JSON-RPC server error code/data in app-server client and TUI
bootstrap errors so config-lock replay failures include the actual TOML
diff.
- Regenerate the config schema for the new debug config keys.
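
As a rough illustration of the replay checks listed above, the sketch below validates a regenerated lockfile against a saved one. It is a minimal sketch, not the PR's code: `LockfileSketch`, `ReplayError`, `validate_replay`, and the flattened string map standing in for the `[config]` table are all hypothetical, and rejecting any lockfile `version` other than 1 is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the lockfile wrapper: metadata plus replayable config.
#[derive(Debug, Clone, PartialEq)]
pub struct LockfileSketch {
    pub version: u32,
    pub codex_version: String,
    /// Flattened `key -> value` view of the `[config]` table (assumed for brevity).
    pub config: BTreeMap<String, String>,
}

/// Hypothetical error type; the real PR reports a TOML diff instead.
#[derive(Debug, PartialEq)]
pub enum ReplayError {
    UnsupportedLockVersion(u32),
    CodexVersionMismatch { expected: String, actual: String },
    ConfigDrift,
}

/// Assumed: only lockfile format version 1 is accepted.
pub const SUPPORTED_LOCK_VERSION: u32 = 1;

/// Compare the saved lockfile with the one regenerated during replay.
pub fn validate_replay(
    expected: &LockfileSketch,
    regenerated: &LockfileSketch,
    allow_codex_version_mismatch: bool,
) -> Result<(), ReplayError> {
    if expected.version != SUPPORTED_LOCK_VERSION {
        return Err(ReplayError::UnsupportedLockVersion(expected.version));
    }
    // Binary version drift is an error unless explicitly tolerated.
    if !allow_codex_version_mismatch && expected.codex_version != regenerated.codex_version {
        return Err(ReplayError::CodexVersionMismatch {
            expected: expected.codex_version.clone(),
            actual: regenerated.codex_version.clone(),
        });
    }
    // The rest of the lockfile is always compared.
    if expected.config != regenerated.config {
        return Err(ReplayError::ConfigDrift);
    }
    Ok(())
}
```

In this sketch, tolerating a version mismatch skips only the `codex_version` comparison; the config comparison still runs, matching the description of `allow_codex_version_mismatch`.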

## Review Notes

The main flow is split across these files:

- `config/src/config_toml.rs`: lockfile/debug TOML shapes.
- `core/src/config/mod.rs`: loading `debug.config_lockfile.*`, replaying
a lockfile as a config layer, and preserving the expected lockfile for
validation.
- `core/src/session/config_lock.rs`: exporting the current session
lockfile and materializing resolved session/config values.
- `core/src/config_lock.rs`: lockfile parsing, metadata/version checks,
replay comparison, and diff formatting.
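
The materialization of feature aliases mentioned under What Changed can be pictured with a simplified, standalone sketch. It assumes a flat map of boolean feature entries; `ResolvedFeature` and `materialize_resolved` are hypothetical names, and the keys used with it are placeholders rather than real Codex feature keys.

```rust
use std::collections::BTreeMap;

/// Hypothetical registry entry: a canonical feature key and its resolved state.
pub struct ResolvedFeature {
    pub key: &'static str,
    pub enabled: bool,
}

/// Rewrite user-authored entries into canonical, fully resolved form:
/// drop legacy alias keys, then record every canonical key explicitly.
pub fn materialize_resolved(
    entries: &mut BTreeMap<String, bool>,
    legacy_keys: &[&str],
    resolved: &[ResolvedFeature],
) {
    // Aliases are removed so the lockfile never depends on alias shape.
    for key in legacy_keys {
        entries.remove(*key);
    }
    // Every known feature is written, including ones left at their default.
    for feature in resolved {
        entries.insert(feature.key.to_string(), feature.enabled);
    }
}
```

Writing every canonical key, rather than only the ones the user set, is what lets a later replay detect a changed default as drift.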

## Usage

Export a lockfile from a normal session:

```sh
codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"'
```

Export a lockfile without saving model-catalog/session-resolved fields:

```sh
codex -c 'debug.config_lockfile.export_dir="/tmp/codex-locks"' \
  -c 'debug.config_lockfile.save_fields_resolved_from_model_catalog=false'
```

Replay a saved lockfile in a later session:

```sh
codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"'
```

If replay resolves to a different effective config, startup fails with a
TOML diff.
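
To give a feel for what such a drift report involves, here is a minimal sketch that diffs two flattened key/value views of a config. The function name `diff_lines` and the `-`/`+` line format are assumptions for illustration; the actual diff formatting lives in `core/src/config_lock.rs` and may differ.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Produce unified-diff-like lines for keys whose values differ between
/// the saved lockfile (`expected`) and the regenerated one (`actual`).
pub fn diff_lines(
    expected: &BTreeMap<String, String>,
    actual: &BTreeMap<String, String>,
) -> Vec<String> {
    // Visit the union of keys in sorted order so output is deterministic.
    let keys: BTreeSet<&String> = expected.keys().chain(actual.keys()).collect();
    let mut lines = Vec::new();
    for key in keys {
        match (expected.get(key), actual.get(key)) {
            // Unchanged values produce no output.
            (Some(e), Some(a)) if e == a => {}
            (e, a) => {
                if let Some(e) = e {
                    lines.push(format!("-{key} = {e}"));
                }
                if let Some(a) = a {
                    lines.push(format!("+{key} = {a}"));
                }
            }
        }
    }
    lines
}
```

An empty result means the replay matched; any non-empty result would correspond to a startup failure.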

To tolerate Codex binary version drift during replay:

```sh
codex -c 'debug.config_lockfile.load_path="/tmp/codex-locks/<thread_id>.config.lock.toml"' \
  -c 'debug.config_lockfile.allow_codex_version_mismatch=true'
```

## Limitations

This does not support custom rules/network policies.

## Verification

- `cargo test -p codex-core config_lock`
- `cargo test -p codex-config`
- `cargo test -p codex-thread-manager-sample`

Author: jif-oai
Date: 2026-05-01 17:46:02 +02:00
Committed by: GitHub
Parent: ff27d01676
Commit: 0b04d1b3cc
17 changed files with 977 additions and 16 deletions

@@ -30,6 +30,10 @@ impl FeatureConfig for MultiAgentV2ConfigToml {
    fn enabled(&self) -> Option<bool> {
        self.enabled
    }

    fn set_enabled(&mut self, enabled: bool) {
        self.enabled = Some(enabled);
    }
}

#[derive(Serialize, Deserialize, Debug, Clone, Default, PartialEq, Eq, JsonSchema)]
@@ -45,4 +49,8 @@ impl FeatureConfig for AppsMcpPathOverrideConfigToml {
    fn enabled(&self) -> Option<bool> {
        self.enabled.or(self.path.as_ref().map(|_| true))
    }

    fn set_enabled(&mut self, enabled: bool) {
        self.enabled = Some(enabled);
    }
}

@@ -593,6 +593,37 @@ impl FeaturesToml {
        }
        entries
    }

    pub fn materialize_resolved_enabled(&mut self, features: &Features) {
        let Self {
            multi_agent_v2,
            apps_mcp_path_override,
            entries,
        } = self;
        for key in legacy::legacy_feature_keys() {
            entries.remove(key);
        }
        for spec in FEATURES {
            let enabled = features.enabled(spec.id);
            if spec.id == Feature::MultiAgentV2 {
                materialize_resolved_feature_enabled(multi_agent_v2, enabled);
            } else if spec.id == Feature::AppsMcpPathOverride {
                materialize_resolved_feature_enabled(apps_mcp_path_override, enabled);
            } else {
                entries.insert(spec.key.to_string(), enabled);
            }
        }
    }
}

fn materialize_resolved_feature_enabled<T: FeatureConfig>(
    feature: &mut Option<FeatureToml<T>>,
    enabled: bool,
) {
    match feature {
        Some(feature) => feature.set_enabled(enabled),
        None => *feature = Some(FeatureToml::Enabled(enabled)),
    }
}

impl From<BTreeMap<String, bool>> for FeaturesToml {
@@ -620,12 +651,20 @@ impl<T: FeatureConfig> FeatureToml<T> {
            Self::Config(config) => config.enabled(),
        }
    }

    pub fn set_enabled(&mut self, enabled: bool) {
        match self {
            Self::Enabled(value) => *value = enabled,
            Self::Config(config) => config.set_enabled(enabled),
        }
    }
}

// A trait to be implemented by custom feature config structs when defining a feature that needs more configuration than
// just enabled/disabled.
pub trait FeatureConfig {
    fn enabled(&self) -> Option<bool>;
    fn set_enabled(&mut self, enabled: bool);
}

/// Single, easy-to-read registry of all feature definitions.

@@ -490,6 +490,54 @@ usage_hint_enabled = false
    );
}

#[test]
fn materialize_resolved_enabled_writes_all_features_and_preserves_custom_config() {
    let mut features = Features::with_defaults();
    features.enable(Feature::CodeMode);
    features.enable(Feature::MultiAgentV2);
    features.disable(Feature::ToolSearch);

    let mut features_toml = FeaturesToml {
        multi_agent_v2: Some(FeatureToml::Config(crate::MultiAgentV2ConfigToml {
            enabled: Some(false),
            min_wait_timeout_ms: Some(2500),
            ..Default::default()
        })),
        entries: BTreeMap::from([("include_apply_patch_tool".to_string(), true)]),
        ..Default::default()
    };
    features_toml.materialize_resolved_enabled(&features);

    let entries = features_toml.entries();
    assert_eq!(entries.get("include_apply_patch_tool"), None);
    for spec in crate::FEATURES {
        assert_eq!(
            entries.get(spec.key),
            Some(&features.enabled(spec.id)),
            "{}",
            spec.key
        );
    }
    assert_eq!(
        features_toml.multi_agent_v2,
        Some(FeatureToml::Config(crate::MultiAgentV2ConfigToml {
            enabled: Some(true),
            min_wait_timeout_ms: Some(2500),
            ..Default::default()
        }))
    );

    let replayed = Features::from_sources(
        FeatureConfigSource {
            features: Some(&features_toml),
            ..Default::default()
        },
        FeatureConfigSource::default(),
        FeatureOverrides::default(),
    );
    assert_eq!(replayed.enabled(Feature::ApplyPatchFreeform), false);
}

#[test]
fn unstable_warning_event_only_mentions_enabled_under_development_features() {
    let mut configured_features = Table::new();