Summary
- propagate approval policy from parent to spawned agents and drop the
Never override so sub-agents respect the caller’s request
- refresh the pending-approval list whenever events arrive or the active
thread changes and surface the list above the composer for inactive
threads
- add widgets, helpers, and tests covering the new pending-thread
approval UI state
![Uploading Screenshot 2026-02-25 at 11.02.18.png…]()
## Why
`unix_escalation.rs` checks a session-scoped approval cache before
prompting again for an execve-intercepted skill script. Without also
recording `ReviewDecision::ApprovedForSession`, that cache never gets
populated, so the same skill script can still trigger repeated approval
prompts within one session.
## What Changed
- Add `execve_session_approvals` to `SessionServices` so the session can
track approved skill script paths.
- Record the script path when a skill-script prompt returns
`ReviewDecision::ApprovedForSession`, but only for the skill-script path
rather than broader prefix-rule approvals.
- Reuse the cached approval on later execve callbacks by treating an
already-approved skill script as `Decision::Allow`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12756).
* #12758
* __->__ #12756
## Summary
- Preserve each skill’s raw permissions block as a permission_profile on
SkillMetadata during skill loading.
- Keep compiling that same metadata into the existing runtime
Permissions object, so current enforcement
behavior stays intact.
- When zsh-fork intercepts execution of a script that belongs to a
skill, include the skill’s
permission_profile in the exec approval request.
- This lets approval UIs show the extra filesystem access the skill
declared when prompting for approval.
## Why
In the `shell_zsh_fork` flow, `codex-shell-escalation` receives the
executable path exactly as the shell passed it to `execve()`. That path
is not guaranteed to be absolute.
For commands such as `./scripts/hello-mbolin.sh`, if the shell was
launched with a different `workdir`, resolving the intercepted `file`
against the server process working directory makes policy checks and
skill matching inspect the wrong executable. This change pushes that fix
a step further by keeping the normalized path typed as `AbsolutePathBuf`
throughout the rest of the escalation pipeline.
That makes the absolute-path invariant explicit, so later code cannot
accidentally treat the resolved executable path as an arbitrary
`PathBuf`.
## What Changed
- record the wrapper process working directory as an `AbsolutePathBuf`
- update the escalation protocol so `workdir` is explicitly absolute
while `file` remains the raw intercepted exec path
- resolve a relative intercepted `file` against the request `workdir` as
soon as the server receives the request
- thread `AbsolutePathBuf` through `EscalationPolicy`,
`CoreShellActionProvider`, and command normalization helpers so the
resolved executable path stays type-checked as absolute
- replace the `path-absolutize` dependency in `codex-shell-escalation`
with `codex-utils-absolute-path`
- add a regression test that covers a relative `file` with a distinct
`workdir`
## Verification
- `cargo test -p codex-shell-escalation`
Direct skill-script matches force `Decision::Prompt`, so skill-backed
scripts require explicit approval before they run. (Note "allow for
session" is not supported in this PR, but will be done in a follow-up.)
In the process of implementing this, I fixed an important bug:
`ShellZshFork` is supposed to keep ordinary allowed execs on the
client-side `Run` path so later `execve()` calls are still intercepted
and reviewed. After the shell-escalation port, `Decision::Allow` still
mapped to `Escalate`, which moved `zsh` to server-side execution too
early. That broke the intended flow for skill-backed scripts and made
the approval prompt depend on the wrong execution path.
## What changed
- In `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs`,
`Decision::Allow` now returns `Run` unless escalation is actually
required.
- Removed the zsh-specific `argv[0]` fallback. With the `Allow -> Run`
fix in place, zsh's later `execve()` of the script is intercepted
normally, so the skill match happens on the script path itself.
- Kept the skill-path handling in `determine_action()` focused on the
direct `program` match path.
## Verification
- Updated `shell_zsh_fork_prompts_for_skill_script_execution` in
`codex-rs/core/tests/suite/skill_approval.rs` (gated behind `cfg(unix)`)
to:
- run under `SandboxPolicy::new_workspace_write_policy()` instead of
`DangerFullAccess`
- assert the approval command contains only the script path
- assert the approved run returns both stdout and stderr markers in the
shell output
- Ran `cargo test -p codex-core
shell_zsh_fork_prompts_for_skill_script_execution -- --nocapture`
## Manual Testing
Run the dev build:
```
just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork
```
I have created `/Users/mbolin/.agents/skills/mbolin-test-skill` with:
```
├── scripts
│ └── hello-mbolin.sh
└── SKILL.md
```
The skill:
```
---
name: mbolin-test-skill
description: Used to exercise various features of skills.
---
When this skill is invoked, run the `hello-mbolin.sh` script and report the output.
```
The script:
```
set -e
# Note this script will fail if run with network disabled.
curl --location openai.com
```
Use `$mbolin-test-skill` to invoke the skill manually and verify that I
get prompted to run `hello-mbolin.sh`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12730).
* #12750
* __->__ #12730
## Summary
Remove js_repl/node test-skip paths and make Node setup explicit in CI
so js_repl tests always run instead of silently skipping.
## Why
We had multiple “expediency” skip paths that let js_repl tests pass
without actually exercising Node-backed behavior. This reduced CI signal
and hid runtime/environment regressions.
## What changed
### CI
- Added Node setup using `codex-rs/node-version.txt` in:
- `.github/workflows/rust-ci.yml`
- `.github/workflows/bazel.yml`
- Added a Unix PATH copy step in Bazel workflow to expose the setup-node
binary in common paths.
### js_repl test harness
- Added explicit js_repl sandbox test configuration helpers in:
- `codex-rs/core/src/tools/js_repl/mod.rs`
- `codex-rs/core/src/tools/handlers/js_repl.rs`
- Added Linux arg0 dispatch glue for js_repl tests so sandbox subprocess
entrypoint behavior is correct under Linux test execution.
### Removed skip behavior
- Deleted runtime guard function and early-return skips in js_repl tests
(`can_run_js_repl_runtime_tests` and related per-test short-circuits).
- Removed view_image integration test skip behavior:
- dropped `skip_if_no_network!(Ok(()))`
- removed “skip on Node missing/too old” branch after js_repl output
inspection.
## Impact
- js_repl/node tests now consistently execute and fail loudly when the
environment is not correctly provisioned.
- CI has stronger signal for js_repl regressions instead of false green
from conditional skips.
## Testing
- `cargo test -p codex-core` (locally) to validate js_repl
unit/integration behavior with skips removed.
- CI expected to surface any remaining environment/runtime gaps directly
(rather than masking them).
#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/12300
- ✅ `2` https://github.com/openai/codex/pull/12275
- ✅ `3` https://github.com/openai/codex/pull/12205
- ✅ `4` https://github.com/openai/codex/pull/12407
- ✅ `5` https://github.com/openai/codex/pull/12372
- 👉 `6` https://github.com/openai/codex/pull/12185
- ⏳ `7` https://github.com/openai/codex/pull/10673
## Summary
Stabilize `js_repl` runtime test setup in CI and move tool-facing
`js_repl` behavior coverage into integration tests.
This is a test/CI change only. No production `js_repl` behavior change
is intended.
## Why
- Bazel test sandboxes (especially on macOS) could resolve a different
`node` than the one installed by `actions/setup-node`, which caused
`js_repl` runtime/version failures.
- `js_repl` runtime tests depend on platform-specific
sandbox/test-harness behavior, so they need explicit gating in a
base-stability commit.
- Several tests in the `js_repl` unit test module were actually
black-box/tool-level behavior tests and fit better in the integration
suite.
## Changes
- Add `actions/setup-node` to the Bazel and Rust `Tests` workflows,
using the exact version pinned in the repo’s Node version file.
- In Bazel (non-Windows), pass `CODEX_JS_REPL_NODE_PATH=$(which node)`
into test env so `js_repl` uses the `actions/setup-node` runtime inside
Bazel tests.
- Add a new integration test suite for `js_repl` tool behavior and
register it in the core integration test suite module.
- Move black-box `js_repl` behavior tests into the integration suite
(persistence/TLA, builtin tool invocation, recursive self-call
rejection, `process` isolation, blocked builtin imports).
- Keep white-box manager/kernel tests in the `js_repl` unit test module.
- Gate `js_repl` runtime tests to run only on macOS and only when a
usable Node runtime is available (skip on other platforms / missing Node
in this commit).
## Impact
- Reduces `js_repl` CI failures caused by Node resolution drift in
Bazel.
- Improves test organization by separating tool-facing behavior tests
from white-box manager/kernel tests.
- Keeps the base commit stable while expanding `js_repl` runtime
coverage.
#### [git stack](https://github.com/magus/git-stack-cli)
- ✅ `1` https://github.com/openai/codex/pull/12372
- 👉 `2` https://github.com/openai/codex/pull/12407
- ⏳ `3` https://github.com/openai/codex/pull/12185
- ⏳ `4` https://github.com/openai/codex/pull/10673
This PR replaces the old `additional_permissions.fs_read/fs_write` shape
with a shared `PermissionProfile`
model and wires it through the command approval, sandboxing, protocol,
and TUI layers. The schema is adopted from the
`SkillManifestPermissions`, which is also refactored to use this unified
struct. This helps us easily expose permission profiles in app
server/core as a follow-up.
## Summary
- Fix `js_repl` so `await codex.tool("view_image", { path })` actually
attaches the image to the active turn when called from inside the JS
REPL.
- Restore the behavior expected by the existing `js_repl`
image-attachment test.
- This is a follow-up to
[#12553](https://github.com/openai/codex/pull/12553), which changed
`view_image` to return structured image content.
## Root Cause
- [#12553](https://github.com/openai/codex/pull/12553) changed
`view_image` from directly injecting a pending user image message to
returning structured `function_call_output` content items.
- The nested tool-call bridge inside `js_repl` serialized that tool
response back to the JS runtime, but it did not mirror returned image
content into the active turn.
- As a result, `view_image` appeared to succeed inside `js_repl`, but no
`input_image` was actually attached for the outer turn.
## What Changed
- Updated the nested tool-call path in `js_repl` to inspect function
tool responses for structured content items.
- When a nested tool response includes `input_image` content, `js_repl`
now injects a corresponding user `Message` into the active turn before
returning the raw tool result back to the JS runtime.
- Kept the normal JSON result flow intact, so `codex.tool(...)` still
returns the original tool output object to JavaScript.
## Why
- `js_repl` documentation and tests already assume that `view_image` can
be used from inside the REPL to attach generated images to the model.
- Without this fix, the nested call path silently dropped that
attachment behavior.
## Why
`codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` previously
located `codex-execve-wrapper` by scanning `PATH` and sibling
directories. That lookup is brittle and can select the wrong binary when
the runtime environment differs from startup assumptions.
We already pass `codex-linux-sandbox` from `codex-arg0`;
`codex-execve-wrapper` should use the same startup-driven path plumbing.
## What changed
- Introduced `Arg0DispatchPaths` in `codex-arg0` to carry both helper
executable paths:
- `codex_linux_sandbox_exe`
- `main_execve_wrapper_exe`
- Updated `arg0_dispatch_or_else()` to pass `Arg0DispatchPaths` to
top-level binaries and preserve helper paths created in
`prepend_path_entry_for_codex_aliases()`.
- Threaded `Arg0DispatchPaths` through entrypoints in `cli`, `exec`,
`tui`, `app-server`, and `mcp-server`.
- Added `main_execve_wrapper_exe` to core configuration plumbing
(`Config`, `ConfigOverrides`, and `SessionServices`).
- Updated zsh-fork shell escalation to consume the configured
`main_execve_wrapper_exe` and removed path-sniffing fallback logic.
- Updated app-server config reload paths so reloaded configs keep the
same startup-provided helper executable paths.
## References
- [`Arg0DispatchPaths`
definition](e355b43d5c/codex-rs/arg0/src/lib.rs (L20-L24))
- [`arg0_dispatch_or_else()` forwarding both
paths](e355b43d5c/codex-rs/arg0/src/lib.rs (L145-L176))
- [zsh-fork escalation using configured wrapper
path](e355b43d5c/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L109-L150))
## Testing
- `cargo check -p codex-arg0 -p codex-core -p codex-exec -p codex-tui -p
codex-mcp-server -p codex-app-server`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core tools::runtimes::shell::unix_escalation:: --
--nocapture`
## Summary
Improve `js_repl` behavior when the Node kernel hits a process-level
failure (for example, an uncaught exception or unhandled Promise
rejection).
Instead of only surfacing a generic `js_repl kernel exited unexpectedly`
after stdout EOF, `js_repl` now returns a clearer exec error for the
active request, then resets the kernel cleanly.
## Why
Some sandbox-denied operations can trigger Node errors that become
process-level failures (for example, an unhandled EventEmitter `'error'`
event). In that case:
- the kernel process exits,
- the host sees stdout EOF,
- the user gets a generic kernel-exit error,
- and the next request can briefly race with stale kernel state.
This change improves that failure mode without monkeypatching Node APIs.
## Changes
### Kernel-side (`js_repl` Node process)
- Add process-level handlers for:
- `uncaughtException`
- `unhandledRejection`
- When one of these fires:
- best-effort emit a normal `exec_result` error for the active exec
- include actionable guidance to catch/handle async errors (including
Promise rejections and EventEmitter `'error'` events)
- exit intentionally so the host can reset/restart the kernel
### Host-side (`JsReplManager`)
- Clear dead kernel state as soon as the stdout reader observes
unexpected kernel exit/EOF.
- This lets the next `js_repl` exec start a fresh kernel instead of
hitting a stale broken-pipe path.
### Tests
- Add regression coverage for:
- uncaught async exception -> exec error + kernel recovery on next exec
- Update forced-kernel-exit test to validate recovery behavior (next
exec restarts cleanly)
## Impact
- Better user-facing error for kernel crashes caused by
uncaught/unhandled async failures.
- Cleaner recovery behavior after kernel exit.
## Validation
- `cargo test -p codex-core --lib
tools::js_repl::tests::js_repl_uncaught_exception_returns_exec_error_and_recovers
-- --exact`
- `cargo test -p codex-core --lib
tools::js_repl::tests::js_repl_forced_kernel_exit_recovers_on_next_exec
-- --exact`
- `just fmt`
## Why
`codex-shell-escalation` exposed a `codex-core`-specific adapter layer
(`ShellActionProvider`, `ShellPolicyFactory`, and `run_escalate_server`)
that existed only to bridge `codex-core` to `EscalateServer`. That
indirection increased API surface and obscured crate ownership without
adding behavior.
This change moves orchestration into `codex-core` so boundaries are
clearer: `codex-shell-escalation` provides reusable escalation
primitives, and `codex-core` provides shell-tool policy decisions.
Admittedly, @pakrym rightfully requested this sort of cleanup as part of
https://github.com/openai/codex/pull/12649, though this avoids moving
all of `codex-shell-escalation` into `codex-core`.
## What changed
- Made `EscalateServer` public and exported it from `shell-escalation`.
- Removed the adapter layer from `shell-escalation`:
- deleted `shell-escalation/src/unix/core_shell_escalation.rs`
- removed exports for `ShellActionProvider`, `ShellPolicyFactory`,
`EscalationPolicyFactory`, and `run_escalate_server`
- Updated `core/src/tools/runtimes/shell/unix_escalation.rs` to:
- create `Stopwatch`/cancellation in `codex-core`
- instantiate `EscalateServer` directly
- implement `EscalationPolicy` directly on `CoreShellActionProvider`
Net effect: same escalation flow with fewer wrappers and a smaller
public API.
## Verification
- Manually reviewed the old vs. new escalation call flow to confirm
timeout/cancellation behavior and approval policy decisions are
preserved while removing wrapper types.
Summary
- detect skill-invoking shell commands based on the original command
string, request approvals when needed, and cache positive decisions per
session
- keep implicit skill invocation emitted after approval and keep skill
approval decline messaging centralized to the shell handler
- expand and adjust skill approval tests to cover shell-based skill
scripts while matching the new detection expectations
Testing
- Not run (not requested)
## Why
This PR switches the `shell_command` zsh-fork path over to
`codex-shell-escalation` so the new shell tool can use the shared
exec-wrapper/escalation protocol instead of the `zsh_exec_bridge`
implementation that was introduced in
https://github.com/openai/codex/pull/12052. `zsh_exec_bridge` relied on
UNIX domain sockets, which is not as tamper-proof as the FD-based
approach in `codex-shell-escalation`.
## What Changed
- Added a Unix zsh-fork runtime adapter in `core`
(`core/src/tools/runtimes/shell/unix_escalation.rs`) that:
- runs zsh-fork commands through
`codex_shell_escalation::run_escalate_server`
- bridges exec-policy / approval decisions into `ShellActionProvider`
- executes escalated commands via a `ShellCommandExecutor` that calls
`process_exec_tool_call`
- Updated `ShellRuntime` / `ShellCommandHandler` / tool spec wiring to
select a `shell_command` backend (`classic` vs `zsh-fork`) while leaving
the generic `shell` tool path unchanged.
- Removed the `zsh_exec_bridge`-based session service and deleted
`core/src/zsh_exec_bridge/mod.rs`.
- Moved exec-wrapper entrypoint dispatch to `arg0` by handling the
`codex-execve-wrapper` arg0 alias there, and removed the old
`codex_core::maybe_run_zsh_exec_wrapper_mode()` hooks from `cli` and
`app-server` mains.
- Added the needed `codex-shell-escalation` dependencies for `core` and
`arg0`.
## Tests
- `cargo test -p codex-core
shell_zsh_fork_prefers_shell_command_over_unified_exec`
- `cargo test -p codex-app-server turn_start_shell_zsh_fork --
--nocapture`
- verifies zsh-fork command execution and approval flows through the new
backend
- includes subcommand approve/decline coverage using the shared zsh
DotSlash fixture in `app-server/tests/suite/zsh`
- To test manually, I added the following to `~/.codex/config.toml`:
```toml
zsh_path = "/Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh"
[features]
shell_zsh_fork = true
```
Then I ran `just c` to run the dev build of Codex with these changes and
sent it the message:
```
run `echo $0`
```
And it replied with:
```
echo $0 printed:
/Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh
In this tool context, $0 reflects the script path used to invoke the shell, not just zsh.
```
so the tool appears to be wired up correctly.
## Notes
- The zsh subcommand-decline integration test now uses `rm` under a
`WorkspaceWrite` sandbox. The previous `/usr/bin/true` scenario is
auto-allowed by the new `shell-escalation` policy path, which no longer
produces subcommand approval prompts.
## Summary
Introduces the initial implementation of Feature::RequestPermissions.
RequestPermissions allows the model to request that a command be run
inside the sandbox, with additional permissions, like writing to a
specific folder. Eventually this will include other rules as well, and
the ability to persist these permissions, but this PR is already quite
large - let's get the core flow working and go from there!
<img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM"
src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368"
/>
## Testing
- [x] Added tests
- [x] Tested locally
- [x] Feature
- use `skills_for_cwd` lookup to scope allowed skills and build
invocation context for downstream processing
- add detection in `stream_events_utils` to classify tool calls as
implicit skill invocations per the proposal (script runners, extensions,
`scripts` dirs, and SKILL.md reads)
- deduplicate invocations per turn and emit analytics/OTEL events on the
same background queue as explicit invokes
## Summary
Persist network approval allow/deny decisions as `network_rule(...)`
entries in execpolicy (not proxy config)
It adds `network_rule` parsing + append support in `codex-execpolicy`,
including `decision="prompt"` (parse-only; not compiled into proxy
allow/deny lists)
- compile execpolicy network rules into proxy allow/deny lists and
update the live proxy state on approval
- preserve requirements execpolicy `network_rule(...)` entries when
merging with file-based execpolicy
- reject broad wildcard hosts (for example `*`) for persisted
`network_rule(...)`
## Why
Tool handlers and runtimes needed to pass the same turn/session context
for shell and non-shell workflows without duplicative ownership churn.
Using shared pointers avoids temporary lifetimes and keeps existing
behavior unchanged while simplifying call sites.
## What changed
- Converted `ToolCtx` to store shared context handles (`Arc`-based),
including updates across shell, apply-patch, and unified-exec paths.
- Updated orchestrator/runtime call sites to consume the shared context
consistently and remove brittle move/borrow patterns.
- Kept behavior unchanged while preparing the type surface for the new
shell escalation integration in the next stack commit.
## Verification
- Validated this commit stack point with `just clippy` and confirmed
workspace compiles cleanly in this stack state.
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12583).
* #12584
* __->__ #12583
* #12556
## Why
`codex-rs/arg0` only needed two things from `codex-core`:
- the `find_codex_home()` wrapper
- the special argv flag used for the internal `apply_patch`
self-invocation path
That made `codex-arg0` depend on `codex-core` for a very small surface
area. This change removes that dependency edge and moves the shared
`apply_patch` invocation flag to a more natural boundary
(`codex-apply-patch`) while keeping the contract explicitly documented.
## What Changed
- Moved the internal `apply_patch` argv[1] flag constant out of
`codex-core` and into `codex-apply-patch`.
- Renamed the constant to `CODEX_CORE_APPLY_PATCH_ARG1` and documented
that it is part of the Codex core process-invocation contract (even
though it now lives in `codex-apply-patch`).
- Updated `arg0`, the core apply-patch runtime, and the `codex-exec`
apply-patch test to import the constant from `codex-apply-patch`.
- Updated `codex-rs/arg0` to call
`codex_utils_home_dir::find_codex_home()` directly instead of
`codex_core::config::find_codex_home()`.
- Removed the `codex-core` dependency from `codex-rs/arg0` and added the
needed direct dependency on `codex-utils-home-dir`.
- Added `codex-apply-patch` as a dev-dependency for `codex-rs/exec`
tests (the apply-patch test now imports the moved constant directly).
## Verification
- `cargo test -p codex-apply-patch`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core --lib apply_patch`
- `cargo test -p codex-exec
test_standalone_exec_cli_can_use_apply_patch`
- `cargo shear`
## Summary
Tighten the `js_repl` freeform Lark grammar to block the most common
malformed payload wrappers before they reach runtime validation.
## What Changed
- Replaced the overly permissive `js_repl` freeform grammar (`start:
/[\s\S]*/`) with a structured grammar that still supports:
- plain JS source
- optional first-line `// codex-js-repl:` pragma followed by JS source
- Added grammar-level filtering for common bad payload shapes by
rejecting inputs whose first significant token starts with:
- `{` (JSON object wrapper like `{"code":"..."}`)
- `"` (quoted code string)
- `` ``` `` (markdown code fences)
- Implemented the grammar without regex lookahead/lookbehind because the
API-side Lark regex engine does not support look-around.
- Added a unit test to validate the grammar shape and guard against
reintroducing unsupported lookaround.
## Why
`js_repl` is a freeform tool, but the model sometimes emits wrapped
payloads (JSON, quoted strings, markdown fences) instead of raw
JavaScript. We already reject those at runtime, but this change moves
the constraint into the tool grammar so the model is less likely to
generate invalid tool-call payloads in the first place.
## Testing
- `cargo test -p codex-core
js_repl_freeform_grammar_blocks_common_non_js_prefixes`
- `cargo test -p codex-core parse_freeform_args_rejects_`
## Notes
- This intentionally over-blocks a few uncommon valid JS starts (for
example top-level `{ ... }` blocks or top-level quoted directives like
`"use strict";`) in exchange for preventing the common wrapped-payload
mistakes.
#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/12300
- ⏳ `2` https://github.com/openai/codex/pull/12275
- ⏳ `3` https://github.com/openai/codex/pull/12205
- ⏳ `4` https://github.com/openai/codex/pull/12185
- ⏳ `5` https://github.com/openai/codex/pull/10673
## Summary
Simplify network approvals by removing per-attempt proxy correlation and
moving to session-level approval dedupe keyed by (host, protocol, port).
Instead of encoding attempt IDs into proxy credentials/URLs, we now
treat approvals as a destination policy decision.
- Concurrent calls to the same destination share one approval prompt.
- Different destinations (or same host on different ports) get separate
prompts.
- Allow once approves the current queued request group only.
- Allow for session caches that (host, protocol, port) and auto-allows
future matching requests.
- Never policy continues to deny without prompting.
Example:
- 3 calls:
- a.com (line 443)
- b.com (line 443)
- a.com (line 443)
=> 2 prompts total (a, b), second a waits on the first decision.
- a.com:80 is treated separately from a.com line 443
## Testing
- `just fmt` (in `codex-rs`)
- `cargo test -p codex-core tools::network_approval::tests`
- `cargo test -p codex-core` (unit tests pass; existing
integration-suite failures remain in this environment)
Summary
- capture the origin for each configured MCP server and expose it via
the connection manager
- plumb MCP server name/origin into tool logging and emit
codex.tool_result events with those fields
- add unit coverage for origin parsing and extend OTEL tests to assert
empty MCP fields for non-MCP tools
- currently not logging full urls or url paths to prevent logging
potentially sensitive data
Testing
- Not run (not requested)
Summary
- expose `agents.max_depth` in config schema and toml parsing, with
defaults and validation
- thread-spawn depth guards and multi-agent handler now respect the
configured limit instead of a hardcoded value
- ensure documentation and helpers account for agent depth limits
## Summary
Fix `js_repl` package-resolution boundary checks for macOS temp
directory path aliasing (`/var` vs `/private/var`).
## Problem
`js_repl` verifies that resolved bare-package imports stay inside a
configured `node_modules` root.
On macOS, temp directories are commonly exposed as `/var/...` but
canonicalize to `/private/var/...`.
Because the boundary check compared raw paths with `path.relative(...)`,
valid resolutions under temp dirs could be misclassified as escaping the
allowed base, causing false `Module not found` errors.
## Changes
- Add `fs` import in the JS kernel.
- Add `canonicalizePath()` using `fs.realpathSync.native(...)` (with
safe fallback).
- Canonicalize both `base` and `resolvedPath` before running the
`node_modules` containment check.
## Impact
- Fixes false-negative boundary checks for valid package resolutions in
macOS temp-dir scenarios.
- Keeps the existing security boundary behavior intact.
- Scope is limited to `js_repl` kernel module path validation logic.
#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/12177
- ⏳ `2` https://github.com/openai/codex/pull/10673
## Summary
This change removes tool-list filtering in `js_repl_tools_only` mode and
relies on the normal model tool descriptions, while still enforcing that
tool execution must go through `js_repl` + `codex.tool(...)`.
## Motivation
The previous `js_repl_tools_only` filtering hid most tools from the
model request, which diverged from standard tool-list behavior and made
signatures less discoverable. I tested that this filtering is not
needed, and the model can follow the prompt to only call tools via
`js_repl`.
## What Changed
- `filter_tools_for_model(...)` in `core/src/tools/spec.rs` is now a
pass-through (no filtering when `js_repl_tools_only` is enabled).
- Updated tests to assert that model tools are not filtered in
`js_repl_tools_only` mode.
- Updated dynamic-tool test to assert dynamic tools remain visible in
model tool specs.
- Removed obsolete test helper used only by the old filtering
assertions.
## Safety / Behavior
- This commit does **not** relax execution policy.
- Direct model tool calls remain blocked in `js_repl_tools_only` mode
(except internal `js_repl` tools), and callers are instructed to use
`js_repl` + `codex.tool(...)`.
## Testing
- `cargo test -p codex-core js_repl_tools_only`
- Manual rollout validation showed the model can follow the `js_repl`
routing instructions without needing filtered tool lists.
#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/12069
- ⏳ `2` https://github.com/openai/codex/pull/10673
- ⏳ `3` https://github.com/openai/codex/pull/10670
# External (non-OpenAI) Pull Request Requirements
In `js_repl` mode, module resolution currently starts from
`js_repl_kernel.js`, which is written to a per-kernel temp dir. This
effectively means that bare imports will not resolve.
This PR adds a new config option, `js_repl_node_module_dirs`, which is a
list of dirs that are used (in order) to resolve a bare import. If none
of those work, the current working directory of the thread is used.
For example:
```toml
js_repl_node_module_dirs = [
"/path/to/node_modules/",
"/other/path/to/node_modules/",
]
```
zsh fork PR stack:
- https://github.com/openai/codex/pull/12051
- https://github.com/openai/codex/pull/12052👈
### Summary
This PR introduces a feature-gated native shell runtime path that routes
shell execution through a patched zsh exec bridge, removing MCP-specific
behavior from the shell hot path while preserving existing
CommandExecution lifecycle semantics.
When shell_zsh_fork is enabled, shell commands run via patched zsh with
per-`execve` interception through EXEC_WRAPPER. Core receives wrapper
IPC requests over a Unix socket, applies existing approval policy, and
returns allow/deny before the subcommand executes.
### What’s included
**1) New zsh exec bridge runtime in core**
- Wrapper-mode entrypoint (maybe_run_zsh_exec_wrapper_mode) for
EXEC_WRAPPER invocations.
- Per-execution Unix-socket IPC handling for wrapper requests/responses.
- Approval callback integration using existing core approval
orchestration.
- Streaming stdout/stderr deltas to existing command output event
pipeline.
- Error handling for malformed IPC, denial/abort, and execution
failures.
**2) Session lifecycle integration**
SessionServices now owns a `ZshExecBridge`.
Session startup initializes bridge state; shutdown tears it down
cleanly.
**3) Shell runtime routing (feature-gated)**
When `shell_zsh_fork` is enabled:
- Build execution env/spec as usual.
- Add wrapper socket env wiring.
- Execute via `zsh_exec_bridge.execute_shell_request(...)` instead of
the regular shell path.
- Non-zsh-fork behavior remains unchanged.
**4) Config + feature wiring**
- Added `Feature::ShellZshFork` (under development).
- Added config support for `zsh_path` (optional absolute path to patched
zsh):
- `Config`, `ConfigToml`, `ConfigProfile`, overrides, and schema.
- Session startup validates that `zsh_path` exists/usable when zsh-fork
is enabled.
- Added startup test for missing `zsh_path` failure mode.
**5) Seatbelt/sandbox updates for wrapper IPC**
- Extended seatbelt policy generation to optionally allow outbound
connection to explicitly permitted Unix sockets.
- Wired sandboxing path to pass wrapper socket path through to seatbelt
policy generation.
- Added/updated seatbelt tests for explicit socket allow rule and
argument emission.
**6) Runtime entrypoint hooks**
- This allows the same binary to act as the zsh wrapper subprocess when
invoked via `EXEC_WRAPPER`.
**7) Tool selection behavior**
- ToolsConfig now prefers ShellCommand type when shell_zsh_fork is
enabled.
- Added test coverage for precedence with unified-exec enabled.
zsh fork PR stack:
- https://github.com/openai/codex/pull/12051👈
- https://github.com/openai/codex/pull/12052
With upcoming support for a fork of zsh that allows us to intercept
`execve` and run execpolicy checks for each subcommand as part of a
`CommandExecution`, it will be possible for there to be multiple
approval requests for a shell command like `/path/to/zsh -lc 'git status
&& rg \"TODO\" src && make test'`.
To support that, this PR introduces a new `approval_id` field across
core, protocol, and app-server so that we can associate approvals
properly for subcommands.