## Summary
https://github.com/openai/codex/pull/13860 changed the serialized output
format of Unified Exec. This PR reverts those changes and some related
test changes.
## Testing
- [x] Update tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Remove the stale `?` after `AbsolutePathBuf::join` in the unified exec
integration test helper.
## Root Cause
- `AbsolutePathBuf::join` was made infallible, but
`core/tests/suite/unified_exec.rs` still treated it as a `Result`, which
broke the Windows test build for the `all` integration test target.
## Validation
- `just fmt`
- `cargo test -p codex-core --test all
unified_exec_resolves_relative_workdir`
## Summary
- Convert unified exec integration tests that can run against the remote
executor to use the remote-aware test harness.
- Create workspace directories through the executor filesystem for
remote runs.
- Install `python3` and `zsh` in the remote test container so restored
Python/zsh-based test commands work in fresh Ubuntu containers.
## Validation
- `just fmt`
- `cargo test -p codex-core --test all unified_exec_defaults_to_pipe`
- `cargo test -p codex-core --test all unified_exec_can_enable_tty`
- `cargo test -p codex-core --test all unified_exec`
- Remote on `codex-remote`: `source scripts/test-remote-env.sh && cd
codex-rs && cargo test -p codex-core --test all unified_exec`
- `just fix -p codex-core`
## Summary
Adds support for `approvals_reviewer` to `Op::UserTurn` so we can migrate
`[CodexMessageProcessor::turn_start]` to use `Op::UserTurn`.
## Testing
- [x] Adds quick test for the new field
Co-authored-by: Codex <noreply@openai.com>
Moves Code Mode to a new crate with no dependencies on codex. This
crate encodes the code mode semantics that we want for lifetime,
mounting, and tool calling.
The model-facing surface is mostly unchanged. `exec` still runs raw
JavaScript, `wait` still resumes or terminates a `cell_id`, nested tools
are still available through `tools.*`, and helpers like `text`, `image`,
`store`, `load`, `notify`, `yield_control`, and `exit` still exist.
The major change is underneath that surface:
- Old code mode was an external Node runtime.
- New code mode is an in-process V8 runtime embedded directly in Rust.
- Old code mode managed cells inside a long-lived Node runner process.
- New code mode manages cells in Rust, with one V8 runtime thread per
active `exec`.
- Old code mode used JSON protocol messages over child stdin/stdout plus
Node worker-thread messages.
- New code mode uses Rust channels and direct V8 callbacks/events.
This PR also fixes the two migration regressions that fell out of that
substrate change:
- `wait { terminate: true }` now waits for the V8 runtime to actually
stop before reporting termination.
- synchronous top-level `exit()` now succeeds again instead of surfacing
as a script error.
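As a rough illustration of the "one runtime thread per `exec`, Rust channels instead of stdin/stdout JSON" shape described above, here is a minimal std-only sketch. It is not the crate's code: `CellCommand`, `CellEvent`, `CellHandle`, and `spawn_cell` are hypothetical names, and the loop that echoes the script stands in for the embedded V8 runtime. The point it shows is the termination fix: `terminate` joins the runtime thread before it reports success.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Commands sent from the session-control side to a cell's runtime thread.
/// Hypothetical names; the real message types may differ.
enum CellCommand {
    /// Evaluate a script and report the result on the event channel.
    Run(String),
    /// Ask the runtime loop to stop.
    Terminate,
}

/// Events emitted by the runtime thread.
#[derive(Debug, PartialEq)]
enum CellEvent {
    Output(String),
    Stopped,
}

/// Handle owned by the session side: one runtime thread per `exec`.
struct CellHandle {
    commands: Sender<CellCommand>,
    events: Receiver<CellEvent>,
    thread: Option<JoinHandle<()>>,
}

fn spawn_cell() -> CellHandle {
    let (cmd_tx, cmd_rx) = mpsc::channel::<CellCommand>();
    let (evt_tx, evt_rx) = mpsc::channel::<CellEvent>();
    let thread = thread::spawn(move || {
        // Stand-in for the V8 event loop: echo each script back.
        for command in cmd_rx {
            match command {
                CellCommand::Run(source) => {
                    let _ = evt_tx.send(CellEvent::Output(format!("ran: {source}")));
                }
                CellCommand::Terminate => break,
            }
        }
        let _ = evt_tx.send(CellEvent::Stopped);
    });
    CellHandle { commands: cmd_tx, events: evt_rx, thread: Some(thread) }
}

impl CellHandle {
    /// Send a script and block until the runtime thread answers.
    fn run(&self, source: &str) -> Option<CellEvent> {
        self.commands.send(CellCommand::Run(source.to_string())).ok()?;
        self.events.recv().ok()
    }

    /// Report termination only after the runtime thread has actually exited.
    /// Returns false if the cell was already terminated.
    fn terminate(&mut self) -> bool {
        let _ = self.commands.send(CellCommand::Terminate);
        match self.thread.take() {
            Some(handle) => handle.join().is_ok(),
            None => false,
        }
    }
}
```

Joining the thread inside `terminate` is what makes the reported state trustworthy: a caller that sees `true` knows no script code is still running.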
---
- `core/src/tools/code_mode/*` is now mostly an adapter layer for the
public `exec` / `wait` tools.
- `code-mode/src/service.rs` owns cell sessions and async control flow
in Rust.
- `code-mode/src/runtime/*.rs` owns the embedded V8 isolate and
JavaScript execution.
- each `exec` spawns a dedicated runtime thread plus a Rust
session-control task.
- helper globals are installed directly into the V8 context instead of
being injected through a source prelude.
- helper modules like `tools.js` and `@openai/code_mode` are synthesized
through V8 module resolution callbacks in Rust.
---
Also added a benchmark for showing the speed of init and use of a code
mode env:
```
$ cargo bench -p codex-code-mode --bench exec_overhead -- --samples 30 --warm-iterations 25 --tool-counts 0,32,128
Finished `bench` profile [optimized] target(s) in 0.18s
Running benches/exec_overhead.rs (target/release/deps/exec_overhead-008c440d800545ae)
exec_overhead: samples=30, warm_iterations=25, tool_counts=[0, 32, 128]
scenario   tools  samples  warmups  iters  mean/exec  p95/exec   rssΔ p50   rssΔ max
cold_exec      0       30        0      1     1.13ms    1.20ms    8.05MiB    8.06MiB
warm_exec      0       30        1     25   473.43us  512.49us  912.00KiB    1.33MiB
cold_exec     32       30        0      1     1.03ms    1.15ms    8.08MiB    8.11MiB
warm_exec     32       30        1     25   509.73us  545.76us  960.00KiB    1.30MiB
cold_exec    128       30        0      1     1.14ms    1.19ms    8.30MiB    8.34MiB
warm_exec    128       30        1     25   575.08us  591.03us  736.00KiB  864.00KiB
memory uses a fresh-process max RSS delta for each scenario
```
---------
Co-authored-by: Codex <noreply@openai.com>
- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.
Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
### Motivation
- Interrupting a running turn (Ctrl+C / Esc) currently also terminates
long‑running background shells, which is surprising for workflows like
local dev servers or file watchers.
- The existing cleanup command name was confusing; callers expect an
explicit command to stop background terminals rather than a UI clear
action.
- Make background‑shell termination explicit and surface a clearer
command name while preserving backward compatibility.
### Description
- Renamed the background‑terminal cleanup slash command from `Clean`
(`/clean`) to `Stop` (`/stop`) and kept `clean` as an alias in the
command parsing/visibility layer; updated the user-facing descriptions
and command popup wiring accordingly.
- Updated the unified‑exec footer text and snapshots to point to `/stop`
(and trimmed corresponding snapshot output to match the new label).
- Changed interrupt behavior so `Op::Interrupt` (Ctrl+C / Esc interrupt)
no longer closes or clears tracked unified exec / background terminal
processes in the TUI or core cleanup path; background shells are now
preserved after an interrupt.
- Updated protocol/docs to clarify that `turn/interrupt` (or
`Op::Interrupt`) interrupts the active turn but does not terminate
background terminals, and that `thread/backgroundTerminals/clean` is the
explicit API to stop those shells.
- Updated unit/integration tests and insta snapshots in the TUI and core
unified‑exec suites to reflect the new semantics and command name.
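The rename-with-alias behavior described above can be sketched as follows. This is a simplified stand-in, not the TUI's actual command table: the `SlashCommand` enum here models only the one variant under discussion, and `parse` / `name` are hypothetical helper names.

```rust
/// Minimal stand-in for the TUI slash-command enum; only the variant
/// discussed above is modeled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SlashCommand {
    Stop,
}

impl SlashCommand {
    /// Canonical name shown in the popup and footer.
    fn name(self) -> &'static str {
        match self {
            SlashCommand::Stop => "stop",
        }
    }

    /// Parse user input such as "/stop" or the legacy "/clean" alias.
    fn parse(input: &str) -> Option<SlashCommand> {
        let name = input.trim().strip_prefix('/')?;
        match name {
            // `clean` is kept as an alias for backward compatibility.
            "stop" | "clean" => Some(SlashCommand::Stop),
            _ => None,
        }
    }
}
```

Because the alias resolves to the same variant, anything that displays the command (popup, footer) only ever shows the canonical `/stop` name.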
### Testing
- Ran formatting with `just fmt` in `codex-rs` (succeeded).
- Ran `cargo test -p codex-protocol` (succeeded).
- Attempted `cargo test -p codex-tui` but the build could not complete
in this environment due to a native build dependency that requires
`libcap` development headers (the `codex-linux-sandbox` vendored build
step); install `libcap-dev` / make `libcap.pc` available in
`PKG_CONFIG_PATH` to run the TUI test suite locally.
- Updated and accepted the affected `insta` snapshots for the TUI
changes so visual diffs reflect the new `/stop` wording and preserved
interrupt behavior.
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69b39c44b6dc8323bd133ae206310fae)
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved
## Runtime model
This PR does not introduce a new `approval_policy`.
Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
- `user`
- `guardian_subagent`
`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.
The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.
When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`
Users can still change `/approvals` afterward.
Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior
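The config-load rules above can be read as a small resolution function. The sketch below is an interpretation under stated assumptions, not the real loader: `ScopeConfig` and `effective_reviewer` are hypothetical names, a single scope is modeled, and `User` is assumed to be the default reviewer when nothing is configured.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ApprovalsReviewer {
    User,
    GuardianSubagent,
}

/// Simplified view of one config scope. Field names follow the config keys
/// mentioned in the text; the struct itself is hypothetical.
#[derive(Default)]
struct ScopeConfig {
    smart_approvals: Option<bool>,
    /// Deprecated alias that predates `approvals_reviewer`.
    guardian_approval: Option<bool>,
    approvals_reviewer: Option<ApprovalsReviewer>,
}

fn effective_reviewer(scope: &ScopeConfig) -> ApprovalsReviewer {
    // `smart_approvals` is only the rollout/UI gate, so it is read here
    // purely to make explicit that it does not influence the result.
    let _gate_only = scope.smart_approvals;

    // An explicitly configured reviewer always wins.
    if let Some(reviewer) = scope.approvals_reviewer {
        return reviewer;
    }
    // The deprecated alias backfills the guardian reviewer in the same scope.
    if scope.guardian_approval == Some(true) {
        return ApprovalsReviewer::GuardianSubagent;
    }
    // Assumed default: manual review by the user.
    ApprovalsReviewer::User
}
```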
ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.
## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.
- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface
## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.
It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`
`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`
`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.
These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.
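For clients, one plausible way to consume these notifications is to key pending reviews by `targetItemId`, which also covers parallel reviews. The sketch below assumes that approach; `ReviewNotification` and `PendingReviews` are hypothetical client-side types that model only a subset of the payload, and no wire (de)serialization is shown.

```rust
use std::collections::BTreeMap;

/// Mirrors the `status` values listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReviewStatus {
    InProgress,
    Approved,
    Denied,
    Aborted,
}

/// Subset of the notification payload used by this sketch.
struct ReviewNotification {
    target_item_id: String,
    status: ReviewStatus,
}

/// Hypothetical client-side tracker for standalone pending-review UI.
#[derive(Default)]
struct PendingReviews {
    by_target: BTreeMap<String, ReviewStatus>,
}

impl PendingReviews {
    /// Handle `item/autoApprovalReview/started`.
    fn on_started(&mut self, n: &ReviewNotification) {
        self.by_target.insert(n.target_item_id.clone(), n.status);
    }

    /// Handle `item/autoApprovalReview/completed`. Returns the final status
    /// if the review was being tracked, otherwise `None`.
    fn on_completed(&mut self, n: &ReviewNotification) -> Option<ReviewStatus> {
        self.by_target.remove(&n.target_item_id).map(|_| n.status)
    }

    /// Number of reviews still in flight (used for footer aggregation).
    fn pending_count(&self) -> usize {
        self.by_target.len()
    }
}
```

Tracking by target item rather than by tool item means the UI can render a pending review even when the underlying tool item has not been emitted yet.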
This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.
## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
- resolved approval/denial state in history
## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)
Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
Enterprises can already constrain approvals, sandboxing, and web search
through `requirements.toml` and MDM, but feature flags were still only
configurable as managed defaults. That meant an enterprise could suggest
feature values, but it could not actually pin them.
This change closes that gap and makes enterprise feature requirements
behave like the other constrained settings. The effective feature set
now stays consistent with enterprise requirements during config load,
when config writes are validated, and when runtime code mutates feature
flags later in the session.
It also tightens the runtime API for managed features. `ManagedFeatures`
now follows the same constraint-oriented shape as `Constrained<T>`
instead of exposing panic-prone mutation helpers, and production code
can no longer construct it through an unconstrained `From<Features>`
path.
The PR also hardens the `compact_resume_fork` integration coverage on
Windows. After the feature-management changes,
`compact_resume_after_second_compaction_preserves_history` was
overflowing the libtest/Tokio thread stacks on Windows, so the test now
uses an explicit larger-stack harness as a pragmatic mitigation. That
may not be the ideal root-cause fix, and it merits a parallel
investigation into whether part of the async future chain should be
boxed to reduce stack pressure instead.
## What Changed
Enterprises can now pin feature values in `requirements.toml` with the
requirements-side `features` table:
```toml
[features]
personality = true
unified_exec = false
```
Only canonical feature keys are allowed in the requirements `features`
table; omitted keys remain unconstrained.
- Added a requirements-side pinned feature map to
`ConfigRequirementsToml`, threaded it through source-preserving
requirements merge and normalization in `codex-config`, and made the
TOML surface use `[features]` (while still accepting legacy
`[feature_requirements]` for compatibility).
- Exposed `featureRequirements` from `configRequirements/read`,
regenerated the JSON/TypeScript schema artifacts, and updated the
app-server README.
- Wrapped the effective feature set in `ManagedFeatures`, backed by
`ConstrainedWithSource<Features>`, and changed its API to mirror
`Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`,
and result-returning `enable` / `disable` / `set_enabled` helpers.
- Removed the legacy-usage and bulk-map passthroughs from
`ManagedFeatures`; callers that need those behaviors now mutate a plain
`Features` value and reapply it through `set(...)`, so the constrained
wrapper remains the enforcement boundary.
- Removed the production loophole for constructing unconstrained
`ManagedFeatures`. Non-test code now creates it through the configured
feature-loading path, and `impl From<Features> for ManagedFeatures` is
restricted to `#[cfg(test)]`.
- Rejected legacy feature aliases in enterprise feature requirements,
and return a load error when a pinned combination cannot survive
dependency normalization.
- Validated config writes against enterprise feature requirements before
persisting changes, including explicit conflicting writes and
profile-specific feature states that normalize into invalid
combinations.
- Updated runtime and TUI feature-toggle paths to use the constrained
setter API and to persist or apply the effective post-constraint value
rather than the requested value.
- Updated the `core_test_support` Bazel target to include the bundled
core model-catalog fixtures in its runtime data, so helper code that
resolves `core/models.json` through runfiles works in remote Bazel test
environments.
- Renamed the core config test coverage to emphasize that effective
feature values are normalized at runtime, while conflicting persisted
config writes are rejected.
- Ran `compact_resume_after_second_compaction_preserves_history` inside
an explicit 8 MiB test thread and Tokio runtime worker stack, following
the existing larger-stack integration-test pattern, to keep the Windows
`compact_resume_fork` test slice from aborting while a parallel
investigation continues into whether some of the underlying async
futures should be boxed.
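To make the constrained-setter shape concrete, here is a small self-contained sketch. It borrows the names `ManagedFeatures`, `Features`, `can_set`, `set`, and `ConstraintResult` from the description above, but the bodies are illustrative assumptions: features are modeled as a plain string-keyed map where a missing key means disabled, `ConstraintError` is a hypothetical error type, and dependency normalization is not modeled.

```rust
use std::collections::BTreeMap;

/// Plain feature set. A missing key is treated as disabled, which is an
/// assumption of this sketch.
#[derive(Debug, Clone, Default, PartialEq)]
struct Features(BTreeMap<String, bool>);

impl Features {
    fn is_enabled(&self, key: &str) -> bool {
        self.0.get(key).copied().unwrap_or(false)
    }

    fn set_enabled(&mut self, key: &str, enabled: bool) {
        self.0.insert(key.to_string(), enabled);
    }
}

/// Hypothetical error: which pinned key was violated and what it requires.
#[derive(Debug, PartialEq)]
struct ConstraintError {
    key: String,
    required: bool,
}

type ConstraintResult<T> = Result<T, ConstraintError>;

/// Effective feature set plus the values pinned by requirements.
struct ManagedFeatures {
    value: Features,
    pinned: BTreeMap<String, bool>,
}

impl ManagedFeatures {
    fn new(initial: Features, pinned: BTreeMap<String, bool>) -> ConstraintResult<Self> {
        let mut managed = ManagedFeatures { value: Features::default(), pinned };
        managed.set(initial)?;
        Ok(managed)
    }

    /// Check a candidate against every pinned key without applying it.
    fn can_set(&self, candidate: &Features) -> ConstraintResult<()> {
        for (key, required) in &self.pinned {
            if candidate.is_enabled(key) != *required {
                return Err(ConstraintError { key: key.clone(), required: *required });
            }
        }
        Ok(())
    }

    /// Apply a candidate only if it satisfies the constraints.
    fn set(&mut self, candidate: Features) -> ConstraintResult<()> {
        self.can_set(&candidate)?;
        self.value = candidate;
        Ok(())
    }

    /// Mutate a plain copy, then reapply it through `set`.
    fn set_enabled(&mut self, key: &str, enabled: bool) -> ConstraintResult<()> {
        let mut next = self.value.clone();
        next.set_enabled(key, enabled);
        self.set(next)
    }

    fn get(&self) -> &Features {
        &self.value
    }
}
```

Routing every mutation through `set` keeps the wrapper as the single enforcement boundary, which is the property the API change above is after.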
## Verification
- `cargo test -p codex-config`
- `cargo test -p codex-core feature_requirements_ -- --nocapture`
- `cargo test -p codex-core
load_requirements_toml_produces_expected_constraints -- --nocapture`
- `cargo test -p codex-core
compact_resume_after_second_compaction_preserves_history -- --nocapture`
- `cargo test -p codex-core compact_resume_fork -- --nocapture`
- Re-ran the built `codex-core` `tests/all` binary with
`RUST_MIN_STACK=262144` for
`compact_resume_after_second_compaction_preserves_history` to confirm
the explicit-stack harness fixes the deterministic low-stack repro.
- `cargo test -p codex-core`
- This still fails locally in unrelated integration areas that expect
the `codex` / `test_stdio_server` binaries or hit existing `search_tool`
wiremock mismatches.
## Docs
`developers.openai.com/codex` should document the requirements-side
`[features]` table for enterprise and MDM-managed configuration,
including that it only accepts canonical feature keys and that
conflicting config writes are rejected.
- add a local Fast mode setting in codex-core (similar to how model id
is currently stored on disk locally)
- send `service_tier=priority` on requests when Fast is enabled
- add `/fast` in the TUI and persist it locally
- feature flag
Summary is a required parameter on UserTurn. Ideally we'd like the core
to decide the appropriate summary level.
Make the summary optional and don't send it when not needed.
## Why
`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.
This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.
## What Changed
- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
- `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
- `codex_protocol::protocol`
- `codex_protocol::config_types`
- `codex_protocol::models`
- `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
- `codex-utils-approval-presets`
- `codex-utils-cli`
## Verification
- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.
It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.
## What
- Added `ReadOnlyAccess` in protocol with:
- `Restricted { include_platform_defaults, readable_roots }`
- `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
- `ReadOnly { access: ReadOnlyAccess }`
- `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.
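The policy shape and the fail-closed rule listed above might look roughly like this. The variant and field names `ReadOnlyAccess`, `Restricted`, `FullAccess`, `ReadOnly { access }`, and `read_only_access` come from the description; everything else (`Backend`, `SandboxError`, `check_backend_support`, and the `writable_roots` field standing in for the elided `WorkspaceWrite` fields) is a hypothetical simplification.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, PartialEq)]
enum ReadOnlyAccess {
    Restricted {
        include_platform_defaults: bool,
        readable_roots: Vec<PathBuf>,
    },
    FullAccess,
}

#[derive(Debug, Clone, PartialEq)]
enum SandboxPolicy {
    ReadOnly {
        access: ReadOnlyAccess,
    },
    WorkspaceWrite {
        // Stand-in for the fields elided as `...` in the description.
        writable_roots: Vec<PathBuf>,
        read_only_access: ReadOnlyAccess,
    },
}

impl SandboxPolicy {
    fn read_access(&self) -> &ReadOnlyAccess {
        match self {
            SandboxPolicy::ReadOnly { access } => access,
            SandboxPolicy::WorkspaceWrite { read_only_access, .. } => read_only_access,
        }
    }
}

/// Hypothetical backend selector for this sketch.
#[derive(Debug, Clone, Copy)]
enum Backend {
    Seatbelt,
    Linux,
    Windows,
}

#[derive(Debug, PartialEq)]
enum SandboxError {
    UnsupportedOperation(&'static str),
}

/// Fail closed: refuse restricted-read policies on backends that cannot
/// enforce them yet, rather than silently granting full read access.
fn check_backend_support(policy: &SandboxPolicy, backend: Backend) -> Result<(), SandboxError> {
    match (policy.read_access(), backend) {
        (ReadOnlyAccess::FullAccess, _) => Ok(()),
        (ReadOnlyAccess::Restricted { .. }, Backend::Seatbelt) => Ok(()),
        (ReadOnlyAccess::Restricted { .. }, Backend::Linux | Backend::Windows) => Err(
            SandboxError::UnsupportedOperation("restricted read access is not implemented"),
        ),
    }
}
```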
## Compatibility / rollout
- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
With this PR we do not close the unified exec processes (i.e. background
terminals) at the end of a turn unless:
* The user interrupts the turn
* The user decides to clean the processes through `app-server` or
`/clean`
I made sure that `codex exec` correctly kills all the processes.
## Summary
Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
is explicitly unused by the clients so far, to simplify PRs - app-server
and tui implementations will be follow-ups.
## Testing
- [x] added integration tests
- Make Config.model optional and centralize default-selection logic in
ModelsManager, including a default_model helper (with
codex-auto-balanced when available) so sessions now carry an explicit
chosen model separate from the base config.
- Resolve `model` once in `core` and `tui` from config, then store the
resolved value on the other structs that need it.
- Refresh models before resolving the default model.
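A minimal sketch of the selection order described above, assuming the explicit config value wins and `codex-auto-balanced` is preferred when the refreshed model list contains it. The `fallback` parameter and the `resolve_model` function are hypothetical; the description does not say what the real helper falls back to.

```rust
/// Preferred default named in the description above.
const PREFERRED_DEFAULT: &str = "codex-auto-balanced";

/// Pick the default model from the (already refreshed) list of available
/// models. `fallback` is a hypothetical parameter for this sketch.
fn default_model<'a>(available: &[&'a str], fallback: &'a str) -> &'a str {
    if available.contains(&PREFERRED_DEFAULT) {
        PREFERRED_DEFAULT
    } else {
        fallback
    }
}

/// Resolve the session model once: an explicit config value wins,
/// otherwise use the default-selection helper.
fn resolve_model(config_model: Option<&str>, available: &[&str], fallback: &str) -> String {
    match config_model {
        Some(model) => model.to_string(),
        None => default_model(available, fallback).to_string(),
    }
}
```

This is also why the refresh has to happen first: the default can only prefer a model that is already present in the available list.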
**Change**: Seatbelt now allows file-ioctl on /dev/ttys[0-9]+ even
without the sandbox extension so pre-created PTYs remain interactive
(Python REPL, shells).
**Risk**: A seatbelted process that already holds a PTY fd (including
one it shouldn’t) could issue tty ioctls like TIOCSTI or termios changes
on that fd. This doesn’t allow opening new PTYs or reading/writing them;
it only broadens ioctl capability on existing fds.
**Why acceptable**: We already hand the child its PTY for interactive
use; restoring ioctls is required for isatty() and prompts to work. The
attack requires being given or inheriting a sensitive PTY fd; by design
we don’t hand untrusted processes other users’ PTYs (we don't hand them
any PTYs actually), so the practical exposure is limited to the PTY
intentionally allocated for the session.
**Validation**:
Running
```
start a python interpreter and keep it running
```
Followed by:
* `calculate 1+1 using it` -> works as expected
* `Use this Python session to run the command just fix in
/Users/jif/code/codex/codex-rs` -> does not work as expected
# Unified Exec Shell Selection on Windows
## Problem
Reference issue: #7466
The `unified_exec` handler currently deserializes model-provided tool
calls into the `ExecCommandArgs` struct:
```rust
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
cmd: String,
#[serde(default)]
workdir: Option<String>,
#[serde(default = "default_shell")]
shell: String,
#[serde(default = "default_login")]
login: bool,
#[serde(default = "default_exec_yield_time_ms")]
yield_time_ms: u64,
#[serde(default)]
max_output_tokens: Option<usize>,
#[serde(default)]
with_escalated_permissions: Option<bool>,
#[serde(default)]
justification: Option<String>,
}
```
The `shell` field uses a hard-coded default:
```rust
fn default_shell() -> String {
"/bin/bash".to_string()
}
```
When the model returns a tool call JSON that only contains `cmd` (which
is the common case), Serde fills in `shell` with this default value.
Later, `get_command` uses that value as if it were a model-provided
shell path:
```rust
fn get_command(args: &ExecCommandArgs) -> Vec<String> {
let shell = get_shell_by_model_provided_path(&PathBuf::from(args.shell.clone()));
shell.derive_exec_args(&args.cmd, args.login)
}
```
On Unix, this usually resolves to `/bin/bash` and works as expected.
However, on Windows this behavior is problematic:
- The hard-coded `"/bin/bash"` is not a valid Windows path.
- `get_shell_by_model_provided_path` treats this as a model-specified
shell, and tries to resolve it (e.g. via `which::which("bash")`), which
may or may not exist and may not behave as intended.
- In practice, this leads to commands being executed under a non-default
or non-existent shell on Windows (for example, WSL bash), instead of the
expected Windows PowerShell or `cmd.exe`.
The core of the issue is that **"model did not specify `shell`" is
currently interpreted as "the model explicitly requested `/bin/bash`"**,
which is both Unix-specific and wrong on Windows.
## Proposed Solution
Instead of hard-coding `"/bin/bash"` into `ExecCommandArgs`, we should
distinguish between:
1. **The model explicitly specifying a shell**, e.g.:
```json
{
"cmd": "echo hello",
"shell": "pwsh"
}
```
In this case, we *do* want to respect the model’s choice and use
`get_shell_by_model_provided_path`.
2. **The model omitting the `shell` field entirely**, e.g.:
```json
{
"cmd": "echo hello"
}
```
In this case, we should *not* assume `/bin/bash`. Instead, we should use
`default_user_shell()` and let the platform decide.
To express this distinction, we can:
1. Change `shell` to be optional in `ExecCommandArgs`:
```rust
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
cmd: String,
#[serde(default)]
workdir: Option<String>,
#[serde(default)]
shell: Option<String>,
#[serde(default = "default_login")]
login: bool,
#[serde(default = "default_exec_yield_time_ms")]
yield_time_ms: u64,
#[serde(default)]
max_output_tokens: Option<usize>,
#[serde(default)]
with_escalated_permissions: Option<bool>,
#[serde(default)]
justification: Option<String>,
}
```
Here, the absence of `shell` in the JSON is represented as `shell:
None`, rather than a hard-coded string value.
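With `shell` optional, `get_command` could branch on whether the model supplied a value. The sketch below is one way that might look, under heavy simplification: `Shell`, `derive_exec_args`, `get_shell_by_model_provided_path`, and `default_user_shell` are local stand-ins with made-up bodies (the real functions do path resolution and platform detection), and `ExecCommandArgs` is reduced to the three fields that matter here.

```rust
/// Stand-in for the real shell type.
#[derive(Debug, Clone, PartialEq)]
struct Shell {
    program: String,
}

impl Shell {
    /// Simplified stand-in: pick a flag based on the program name.
    fn derive_exec_args(&self, cmd: &str, login: bool) -> Vec<String> {
        let lower = self.program.to_ascii_lowercase();
        let flag = if lower.contains("pwsh") || lower.contains("powershell") {
            "-Command"
        } else if login {
            "-lc"
        } else {
            "-c"
        };
        vec![self.program.clone(), flag.to_string(), cmd.to_string()]
    }
}

/// Reduced version of the args struct: `shell` is now optional.
struct ExecCommandArgs {
    cmd: String,
    shell: Option<String>,
    login: bool,
}

/// Stand-in: the real function resolves the path on the host.
fn get_shell_by_model_provided_path(path: &str) -> Shell {
    Shell { program: path.to_string() }
}

/// Stand-in: the real function detects the user's platform shell.
fn default_user_shell() -> Shell {
    let program = if cfg!(windows) { "powershell.exe" } else { "/bin/bash" };
    Shell { program: program.to_string() }
}

fn get_command(args: &ExecCommandArgs) -> Vec<String> {
    let shell = match args.shell.as_deref() {
        // The model explicitly asked for a shell: respect it.
        Some(path) => get_shell_by_model_provided_path(path),
        // The model omitted `shell`: let the platform decide.
        None => default_user_shell(),
    };
    shell.derive_exec_args(&args.cmd, args.login)
}
```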
Second attempt to fix this test after
https://github.com/openai/codex/pull/6884. I think the flakiness
happens because `yield_time` is too small for a 10,000-step loop in
Python.
Thread an `exit_notify` tokio `Notify` through to the
`UnifiedExecSession` so that we can return early if the command
terminates before `yield_time_ms`.
As Codex review correctly pointed out below 🙌 we also need an
`exit_signaled` flag so that commands which finish before we start
waiting can also exit early.
Since the default `yield_time_ms` is now 10s, this means that we don't
have to wait 10s for trivial commands like ls, sed, etc (which are the
majority of agent commands 😅)
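The reason a flag is needed in addition to the notification can be shown with a std-only analogue. This sketch swaps in `std::sync::Condvar` plus a `Mutex<bool>` for tokio's `Notify`, so it is not the PR's implementation; `ExitSignal`, `signal_exit`, and `wait_for_exit` are hypothetical names. The boolean plays the role of `exit_signaled`: a waiter that arrives after the process has already exited sees the flag and returns immediately instead of sleeping for the full yield time.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Flag plus wake-up primitive: a std-only analogue of pairing an
/// `exit_signaled` flag with an exit notification.
#[derive(Default)]
struct ExitSignal {
    exited: Mutex<bool>,
    notify: Condvar,
}

impl ExitSignal {
    /// Called when the command terminates.
    fn signal_exit(&self) {
        *self.exited.lock().unwrap() = true;
        self.notify.notify_all();
    }

    /// Wait up to `yield_time` for the command to exit. Returns true if it
    /// exited (including before this call), false on timeout.
    fn wait_for_exit(&self, yield_time: Duration) -> bool {
        let guard = self.exited.lock().unwrap();
        // The predicate is checked before sleeping, so an exit that was
        // signaled earlier is never missed.
        let (guard, _timeout) = self
            .notify
            .wait_timeout_while(guard, yield_time, |exited| !*exited)
            .unwrap();
        *guard
    }
}
```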
---------
Co-authored-by: jif-oai <jif@openai.com>
- This PR puts us on the path to truncating by tokens. This path will
initially be used by unified exec and the context manager (mainly
responsible for MCP calls).
- Expose a new config option, `calls_output_max_tokens`.
- Use `tokens` as the main budget unit, but truncate based on the model
family by introducing `TruncationPolicy`.
- Introduce `truncate_text` as a router for truncation based on the
mode.
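A rough sketch of what a policy-routed `truncate_text` could look like. Only the names `TruncationPolicy` and `truncate_text` come from the list above; the variants, the four-bytes-per-token estimate, and the prefix-only truncation are assumptions made for illustration, and the real implementation may count tokens differently or keep both the head and tail of the output.

```rust
/// Hypothetical variants: budget expressed in bytes or in tokens.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TruncationPolicy {
    Bytes(usize),
    Tokens(usize),
}

/// Assumed heuristic for this sketch, not a real tokenizer.
const APPROX_BYTES_PER_TOKEN: usize = 4;

/// Route truncation according to the policy's mode.
fn truncate_text(text: &str, policy: TruncationPolicy) -> String {
    let byte_budget = match policy {
        TruncationPolicy::Bytes(n) => n,
        TruncationPolicy::Tokens(n) => n.saturating_mul(APPROX_BYTES_PER_TOKEN),
    };
    truncate_to_bytes(text, byte_budget)
}

/// Keep the longest prefix that fits the budget without splitting a
/// UTF-8 character.
fn truncate_to_bytes(text: &str, budget: usize) -> String {
    if text.len() <= budget {
        return text.to_string();
    }
    let mut end = budget;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    text[..end].to_string()
}
```

Keeping tokens as the caller-facing unit and converting inside the router means call sites such as unified exec do not need to know which strategy a given model family uses.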
In the next PRs:
- Remove `truncate_with_line_bytes_budget`.
- Add the ability for the model to override the token budget.
## Summary
- update documentation, example configs, and automation defaults to
reference gpt-5.1 / gpt-5.1-codex
- bump the CLI and core configuration defaults, model presets, and error
messaging to the new models while keeping the model-family/tool coverage
for legacy slugs
- refresh tests, fixtures, and TUI snapshots so they expect the upgraded
defaults
## Testing
- `cargo test -p codex-core
config::tests::test_precedence_fixture_with_gpt5_profile`
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)