codex

mirror of https://github.com/openai/codex.git synced 2026-04-27 16:15:09 +00:00

Author	SHA1	Message	Date
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388)

## Why

Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them.

This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session.

It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path.

The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead.

## What Changed

Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table:

```toml
[features]
personality = true
unified_exec = false
```

Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained.
- Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility).
- Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README.
- Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers.
- Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary.
- Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`.
- Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization.
- Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations.
- Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value.
- Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments.
- Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected.
- Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed.

## Verification

- `cargo test -p codex-config`
- `cargo test -p codex-core feature_requirements_ -- --nocapture`
- `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture`
- `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture`
- `cargo test -p codex-core compact_resume_fork -- --nocapture`
- Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro.
- `cargo test -p codex-core`
  - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches.

## Docs

`developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
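As an illustration of the constrained-setter shape described in this commit, here is a minimal sketch. It reuses the names `ManagedFeatures`, `Features`, `ConstraintResult`, `can_set`, `set`, and `set_enabled` from the description, but every field, signature detail, and the `ConstraintError` type are assumptions made for this example; the actual codex types (for instance `ConstrainedWithSource<Features>`) are not reproduced here.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the effective feature set (not the real codex type).
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Features(BTreeMap<String, bool>);

impl Features {
    /// Features that were never set count as disabled in this sketch.
    pub fn enabled(&self, key: &str) -> bool {
        self.0.get(key).copied().unwrap_or(false)
    }

    pub fn set_enabled(&mut self, key: &str, value: bool) {
        self.0.insert(key.to_string(), value);
    }
}

/// Hypothetical error: which pinned key was violated and what value it requires.
#[derive(Debug, PartialEq, Eq)]
pub struct ConstraintError {
    pub key: String,
    pub required: bool,
}

pub type ConstraintResult<T> = Result<T, ConstraintError>;

/// Wrapper that only accepts feature sets satisfying every pinned value.
pub struct ManagedFeatures {
    value: Features,
    pinned: BTreeMap<String, bool>,
}

impl ManagedFeatures {
    /// Start from a value that already satisfies every pin.
    pub fn new(pinned: BTreeMap<String, bool>) -> Self {
        let mut value = Features::default();
        for (key, required) in &pinned {
            value.set_enabled(key, *required);
        }
        Self { value, pinned }
    }

    /// Check a candidate without mutating anything.
    pub fn can_set(&self, candidate: &Features) -> ConstraintResult<()> {
        for (key, required) in &self.pinned {
            if candidate.enabled(key) != *required {
                return Err(ConstraintError { key: key.clone(), required: *required });
            }
        }
        Ok(())
    }

    /// Replace the effective value only if the candidate passes `can_set`.
    pub fn set(&mut self, candidate: Features) -> ConstraintResult<()> {
        self.can_set(&candidate)?;
        self.value = candidate;
        Ok(())
    }

    /// Mutate a plain copy and reapply it through `set`, so the wrapper
    /// stays the single enforcement point.
    pub fn set_enabled(&mut self, key: &str, value: bool) -> ConstraintResult<()> {
        let mut next = self.value.clone();
        next.set_enabled(key, value);
        self.set(next)
    }

    pub fn get(&self) -> &Features {
        &self.value
    }
}
```

In this sketch a rejected write leaves the effective value untouched, which loosely mirrors the description's point that callers apply the post-constraint value rather than the requested one.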
Curtis 'Fjord' Hawthorne	40ab71a985	Disable js_repl when Node is incompatible at startup (#12824)

## Summary

- validate `js_repl` Node compatibility during session startup when the experiment is enabled
- if Node is missing or too old, disable `js_repl` and `js_repl_tools_only` for the session before tools and instructions are built
- surface that startup disablement to users through the existing startup warning flow instead of only logging it
- reuse the same compatibility check in js_repl kernel startup so startup gating and runtime behavior stay aligned
- add a regression test that verifies the warning is emitted and that the first advertised tool list omits `js_repl` and `js_repl_reset` when Node is incompatible

## Why

Today `js_repl` can be advertised based only on the feature flag, then fail later when the kernel starts. That makes the available tool list inaccurate at the start of a conversation, and users do not get a clear explanation for why the tool is unavailable.

This change makes tool availability reflect real startup checks, keeps the advertised tool set stable for the lifetime of the session, and gives users a visible warning when `js_repl` is disabled.

## Testing

- `just fmt`
- `cargo test -p codex-core --test all js_repl_is_not_advertised_when_startup_node_is_incompatible`	2026-02-26 01:14:51 +00:00
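The startup gating described in this commit could look roughly like the sketch below. All names here (`StartupState`, `gate_js_repl`, `advertised_tools`, `parse_node_version`, the `shell` tool) and the minimum version passed in are hypothetical, chosen only to illustrate the flow of "check Node, disable both flags, record a warning, then build the tool list"; the real codex code is not shown in the source.

```rust
/// Parse `node --version` style output such as "v22.3.0" into (major, minor, patch).
pub fn parse_node_version(raw: &str) -> Option<(u32, u32, u32)> {
    let trimmed = raw.trim().trim_start_matches('v');
    let mut parts = trimmed.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    // Keep only the leading digits so a suffix like "0-nightly" still parses.
    let patch_digits = parts.next()?.split(|c: char| !c.is_ascii_digit()).next()?;
    let patch = patch_digits.parse().ok()?;
    Some((major, minor, patch))
}

/// Hypothetical per-session state: two feature flags plus user-visible warnings.
pub struct StartupState {
    pub js_repl: bool,
    pub js_repl_tools_only: bool,
    pub warnings: Vec<String>,
}

/// Disable js_repl for the session when Node is missing, unparsable, or too old.
/// `node_version` is the raw version string, or `None` if Node was not found.
pub fn gate_js_repl(state: &mut StartupState, node_version: Option<&str>, min: (u32, u32, u32)) {
    if !state.js_repl {
        return; // Experiment is off; nothing to validate.
    }
    let problem = match node_version.map(parse_node_version) {
        None => Some("Node was not found".to_string()),
        Some(None) => Some("Node version could not be parsed".to_string()),
        // Tuples compare lexicographically: major, then minor, then patch.
        Some(Some(found)) if found < min => Some(format!(
            "Node {}.{}.{} is older than the required {}.{}.{}",
            found.0, found.1, found.2, min.0, min.1, min.2
        )),
        Some(Some(_)) => None,
    };
    if let Some(reason) = problem {
        state.js_repl = false;
        state.js_repl_tools_only = false;
        state.warnings.push(format!("js_repl disabled for this session: {reason}"));
    }
}

/// Build the advertised tool list after gating, so it stays stable afterwards.
pub fn advertised_tools(state: &StartupState) -> Vec<&'static str> {
    let mut tools = vec!["shell"]; // Placeholder for the session's other tools.
    if state.js_repl {
        tools.push("js_repl");
        tools.push("js_repl_reset");
    }
    tools
}
```

Running the gate before the tool list is built is what keeps the first advertised list consistent with what the kernel can actually start.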
Curtis 'Fjord' Hawthorne	9501669a24	tests(js_repl): remove node-related skip paths from js_repl tests (#12185)

## Summary

Remove js_repl/node test-skip paths and make Node setup explicit in CI so js_repl tests always run instead of silently skipping.

## Why

We had multiple “expediency” skip paths that let js_repl tests pass without actually exercising Node-backed behavior. This reduced CI signal and hid runtime/environment regressions.

## What changed

### CI

- Added Node setup using `codex-rs/node-version.txt` in:
  - `.github/workflows/rust-ci.yml`
  - `.github/workflows/bazel.yml`
- Added a Unix PATH copy step in Bazel workflow to expose the setup-node binary in common paths.

### js_repl test harness

- Added explicit js_repl sandbox test configuration helpers in:
  - `codex-rs/core/src/tools/js_repl/mod.rs`
  - `codex-rs/core/src/tools/handlers/js_repl.rs`
- Added Linux arg0 dispatch glue for js_repl tests so sandbox subprocess entrypoint behavior is correct under Linux test execution.

### Removed skip behavior

- Deleted runtime guard function and early-return skips in js_repl tests (`can_run_js_repl_runtime_tests` and related per-test short-circuits).
- Removed view_image integration test skip behavior:
  - dropped `skip_if_no_network!(Ok(()))`
  - removed “skip on Node missing/too old” branch after js_repl output inspection.

## Impact

- js_repl/node tests now consistently execute and fail loudly when the environment is not correctly provisioned.
- CI has stronger signal for js_repl regressions instead of false green from conditional skips.

## Testing

- `cargo test -p codex-core` (locally) to validate js_repl unit/integration behavior with skips removed.
- CI expected to surface any remaining environment/runtime gaps directly (rather than masking them).
#### [git stack](https://github.com/magus/git-stack-cli)

- ✅ `1` https://github.com/openai/codex/pull/12300
- ✅ `2` https://github.com/openai/codex/pull/12275
- ✅ `3` https://github.com/openai/codex/pull/12205
- ✅ `4` https://github.com/openai/codex/pull/12407
- ✅ `5` https://github.com/openai/codex/pull/12372
- 👉 `6` https://github.com/openai/codex/pull/12185
- ⏳ `7` https://github.com/openai/codex/pull/10673	2026-02-24 22:52:14 -08:00
Curtis 'Fjord' Hawthorne	8f3f2c3c02	tests(js_repl): stabilize CI runtime test execution (#12407)

## Summary

Stabilize `js_repl` runtime test setup in CI and move tool-facing `js_repl` behavior coverage into integration tests.

This is a test/CI change only. No production `js_repl` behavior change is intended.

## Why

- Bazel test sandboxes (especially on macOS) could resolve a different `node` than the one installed by `actions/setup-node`, which caused `js_repl` runtime/version failures.
- `js_repl` runtime tests depend on platform-specific sandbox/test-harness behavior, so they need explicit gating in a base-stability commit.
- Several tests in the `js_repl` unit test module were actually black-box/tool-level behavior tests and fit better in the integration suite.

## Changes

- Add `actions/setup-node` to the Bazel and Rust `Tests` workflows, using the exact version pinned in the repo’s Node version file.
- In Bazel (non-Windows), pass `CODEX_JS_REPL_NODE_PATH=$(which node)` into test env so `js_repl` uses the `actions/setup-node` runtime inside Bazel tests.
- Add a new integration test suite for `js_repl` tool behavior and register it in the core integration test suite module.
- Move black-box `js_repl` behavior tests into the integration suite (persistence/TLA, builtin tool invocation, recursive self-call rejection, `process` isolation, blocked builtin imports).
- Keep white-box manager/kernel tests in the `js_repl` unit test module.
- Gate `js_repl` runtime tests to run only on macOS and only when a usable Node runtime is available (skip on other platforms / missing Node in this commit).

## Impact

- Reduces `js_repl` CI failures caused by Node resolution drift in Bazel.
- Improves test organization by separating tool-facing behavior tests from white-box manager/kernel tests.
- Keeps the base commit stable while expanding `js_repl` runtime coverage.
#### [git stack](https://github.com/magus/git-stack-cli)

- ✅ `1` https://github.com/openai/codex/pull/12372
- 👉 `2` https://github.com/openai/codex/pull/12407
- ⏳ `3` https://github.com/openai/codex/pull/12185
- ⏳ `4` https://github.com/openai/codex/pull/10673	2026-02-24 21:04:34 -08:00
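The `CODEX_JS_REPL_NODE_PATH` override mentioned in this commit suggests a lookup along the following lines. This is only a sketch: the environment variable name comes from the description, while the function names, the injected `lookup` closure, and the fallback to a bare `node` resolved via `PATH` are assumptions made for illustration.

```rust
use std::path::PathBuf;

/// Environment variable named in the commit description.
pub const NODE_PATH_ENV: &str = "CODEX_JS_REPL_NODE_PATH";

/// Resolve the Node binary, preferring the explicit override.
/// `lookup` is injected so the logic can be tested without touching the
/// process environment (a choice made for this sketch).
pub fn resolve_node_path<F>(lookup: F) -> PathBuf
where
    F: Fn(&str) -> Option<String>,
{
    match lookup(NODE_PATH_ENV) {
        // A non-empty override wins, e.g. the value of `$(which node)` in CI.
        Some(value) if !value.trim().is_empty() => PathBuf::from(value.trim()),
        // Assumed fallback: let the OS resolve `node` through PATH.
        _ => PathBuf::from("node"),
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn resolve_node_path_from_env() -> PathBuf {
    resolve_node_path(|key| std::env::var(key).ok())
}
```

Pinning the path through an explicit variable avoids depending on whichever `node` a sandboxed test environment happens to find first.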

4 Commits