codex

mirror of https://github.com/openai/codex.git synced 2026-05-03 10:56:37 +00:00

Author	SHA1	Message	Date
Vivian Fang	d47b755aa2	Render namespace description for tools (#16879 )	2026-04-08 02:39:40 -07:00
pakrym-oai	f1a2b920f9	[codex] Make AbsolutePathBuf joins infallible (#16981 ) Having to check for errors every time join is called is painful and unnecessary.	2026-04-07 10:52:08 -07:00
Michael Bolin	aa2403e2eb	core: remove cross-crate re-exports from lib.rs (#16512 ) ## Why `codex-core` was re-exporting APIs owned by sibling `codex-` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.	2026-04-01 23:06:24 -07:00
Michael Bolin	d4464125c5	Remove client_common tool re-exports (#16482 ) ## Why `codex-rs/core/src/client_common.rs` still had a `tools` re-export module that forwarded `codex_tools` types back into `codex-core`. After the earlier extraction work in #16379, #16471, #16477, and #16481, that extra layer no longer adds value. Removing it keeps dependencies explicit: the `codex-core` modules that actually use `ToolSpec` and related types now depend on `codex_tools` directly instead of reaching through `client_common`. ## What Changed - removed the `client_common::tools` re-export module from `core/src/client_common.rs` - updated the remaining `codex-core` consumers to import `codex_tools` directly - adjusted the affected test code to reference `codex_tools::ResponsesApiTool` directly as well This is a mechanical cleanup only. It does not change tool behavior or runtime logic. ## Testing - `cargo test -p codex-core client_common::tests` - `cargo test -p codex-core tools::router::tests` - `cargo test -p codex-core tools::context::tests` - `cargo test -p codex-core tools::spec::tests`	2026-04-01 19:15:15 -07:00
Michael Bolin	61dfe0b86c	chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054 ) ## Why `argument-comment-lint` was green in CI even though the repo still had many uncommented literal arguments. The main gap was target coverage: the repo wrapper did not force Cargo to inspect test-only call sites, so examples like the `latest_session_lookup_params(true, ...)` tests in `codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path. This change cleans up the existing backlog, makes the default repo lint path cover all Cargo targets, and starts rolling that stricter CI enforcement out on the platform where it is currently validated. ## What changed - mechanically fixed existing `argument-comment-lint` violations across the `codex-rs` workspace, including tests, examples, and benches - updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and `tools/argument-comment-lint/run.sh` so non-`--fix` runs default to `--all-targets` unless the caller explicitly narrows the target set - fixed both wrappers so forwarded cargo arguments after `--` are preserved with a single separator - documented the new default behavior in `tools/argument-comment-lint/README.md` - updated `rust-ci` so the macOS lint lane keeps the plain wrapper invocation and therefore enforces `--all-targets`, while Linux and Windows temporarily pass `-- --lib --bins` That temporary CI split keeps the stricter all-targets check where it is already cleaned up, while leaving room to finish the remaining Linux- and Windows-specific target-gated cleanup before enabling `--all-targets` on those runners. The Linux and Windows failures on the intermediate revision were caused by the wrapper forwarding bug, not by additional lint findings in those lanes. ## Validation - `bash -n tools/argument-comment-lint/run.sh` - `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh` - shell-level wrapper forwarding check for `-- --lib --bins` - shell-level wrapper forwarding check for `-- --tests` - `just argument-comment-lint` - `cargo test` in `tools/argument-comment-lint` - `cargo test -p codex-terminal-detection` ## Follow-up - Clean up remaining Linux-only target-gated callsites, then switch the Linux lint lane back to the plain wrapper invocation. - Clean up remaining Windows-only target-gated callsites, then switch the Windows lint lane back to the plain wrapper invocation.	2026-03-27 19:00:44 -07:00
Michael Bolin	e6e2999209	permissions: remove macOS seatbelt extension profiles (#15918 ) ## Why `PermissionProfile` should only describe the per-command permissions we still want to grant dynamically. Keeping `MacOsSeatbeltProfileExtensions` in that surface forced extra macOS-only approval, protocol, schema, and TUI branches for a capability we no longer want to expose. ## What changed - Removed the macOS-specific permission-profile types from `codex-protocol`, the app-server v2 API, and the generated schema/TypeScript artifacts. - Deleted the core and sandboxing plumbing that threaded `MacOsSeatbeltProfileExtensions` through execution requests and seatbelt construction. - Simplified macOS seatbelt generation so it always includes the fixed read-only preferences allowlist instead of carrying a configurable profile extension. - Removed the macOS additional-permissions UI/docs/test coverage and deleted the obsolete macOS permission modules. - Tightened `request_permissions` intersection handling so explicitly empty requested read lists are preserved only when that field was actually granted, avoiding zero-grant responses being stored as active permissions.	2026-03-26 17:12:45 -07:00
Michael Bolin	dfb36573cd	sandboxing: use OsString for SandboxCommand.program (#15897 ) ## Why `SandboxCommand.program` represents an executable path, but keeping it as `String` forced path-backed callers to run `to_string_lossy()` before the sandbox layer ever touched the command. That loses fidelity earlier than necessary and adds avoidable conversions in runtimes that already have a `PathBuf`. ## What changed - Changed `SandboxCommand.program` to `OsString`. - Updated `SandboxManager::transform` to keep the program and argv in `OsString` form until the `SandboxExecRequest` conversion boundary. - Switched the path-backed `apply_patch` and `js_repl` runtimes to pass `into_os_string()` instead of `to_string_lossy()`. - Updated the remaining string-backed builders and tests to match the new type while preserving the existing Linux helper `arg0` behavior. ## Verification - `cargo test -p codex-sandboxing` - `just argument-comment-lint -p codex-core -p codex-sandboxing` - `cargo test -p codex-core` currently fails in unrelated existing config tests: `config::tests::approvals_reviewer_` and `config::tests::smart_approvals_alias_`	2026-03-26 20:38:33 +00:00
pakrym-oai	504aeb0e09	Use AbsolutePathBuf for cwd state (#15710 ) Migrate `cwd` and related session/config state to `AbsolutePathBuf` so downstream consumers consistently see absolute working directories. Add test-only `.abs()` helpers for `Path`, `PathBuf`, and `TempDir`, and update branch-local tests to use them instead of `AbsolutePathBuf::try_from(...)`. For the remaining TUI/app-server snapshot coverage that renders absolute cwd values, keep the snapshots unchanged and skip the Windows-only cases where the platform-specific absolute path layout differs.	2026-03-25 16:02:22 +00:00
Ahmed Ibrahim	062fa7a2bb	Move string truncation helpers into codex-utils-string (#15572 ) - move the shared byte-based middle truncation logic from `core` into `codex-utils-string` - keep token-specific truncation in `codex-core` so rollout can reuse the shared helper in the next stacked PR --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-24 15:45:40 -07:00
pakrym-oai	0b619afc87	Drop sandbox_permissions from sandbox exec requests (#15665 ) ## Summary - drop `sandbox_permissions` from the sandboxing `ExecOptions` and `ExecRequest` adapter types - remove the now-unused plumbing from shell, unified exec, JS REPL, and apply-patch runtime call sites - default reconstructed `ExecParams` to `SandboxPermissions::UseDefault` where the lower-level API still requires the field ## Testing - `just fmt` - `just argument-comment-lint` - `cargo test -p codex-core` (still running locally; first failures observed in `suite::cli_stream::responses_mode_stream_cli`, `suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_config_override`, and `suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_env_fallback`)	2026-03-24 15:42:45 -07:00
pakrym-oai	f49eb8e9d7	Extract sandbox manager and transforms into codex-sandboxing (#15603 ) Extract sandbox manager	2026-03-24 08:20:57 -07:00
Ahmed Ibrahim	2e22885e79	Split features into codex-features crate (#15253 ) - Split the feature system into a new `codex-features` crate. - Cut `codex-core` and workspace consumers over to the new config and warning APIs. Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-19 20:12:07 -07:00
Michael Bolin	a3e59e9e85	core: add a full-buffer exec capture policy (#15254 )	2026-03-20 02:38:12 +00:00
pakrym-oai	88e5382fc4	Propagate tool errors to code mode (#15075 ) Clean up error flow to push the FunctionCallError all the way up to dispatcher and allow code mode to surface as exception.	2026-03-18 13:57:55 -07:00
pakrym-oai	606d85055f	Add notify to code-mode (#14842 ) Allows model to send an out-of-band notification. The notification is injected as another tool call output for the same call_id.	2026-03-18 09:37:13 -07:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. ## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
Channing Conger	70eddad6b0	dynamic tool calls: add param `exposeToContext` to optionally hide tool (#14501 ) This extends dynamic_tool_calls to allow us to hide a tool from the model context but still use it as part of the general tool calling runtime (for ex from js_repl/code_mode)	2026-03-14 01:58:43 -07:00
iceweasel-oai	6b3d82daca	Use a private desktop for Windows sandbox instead of Winsta0\Default (#14400 ) ## Summary - launch Windows sandboxed children on a private desktop instead of `Winsta0\Default` - make private desktop the default while keeping `windows.sandbox_private_desktop=false` as the escape hatch - centralize process launch through the shared `create_process_as_user(...)` path - scope the private desktop ACL to the launching logon SID ## Why Today sandboxed Windows commands run on the visible shared desktop. That leaves an avoidable same-desktop attack surface for window interaction, spoofing, and related UI/input issues. This change moves sandboxed commands onto a dedicated per-launch desktop by default so the sandbox no longer shares `Winsta0\Default` with the user session. The implementation stays conservative on security with no silent fallback back to `Winsta0\Default` If private-desktop setup fails on a machine, users can still opt out explicitly with `windows.sandbox_private_desktop=false`. ## Validation - `cargo build -p codex-cli` - elevated-path `codex exec` desktop-name probe returned `CodexSandboxDesktop-*` - elevated-path `codex exec` smoke sweep for shell commands, nested `pwsh`, jobs, and hidden `notepad` launch - unelevated-path full private-desktop compatibility sweep via `codex exec` with `-c windows.sandbox=unelevated`	2026-03-13 10:13:39 -07:00
aaronl-openai	d9a403a8c0	[js_repl] Hard-stop active js_repl execs on explicit user interrupts (#13329 ) ## Summary - hard-stop `js_repl` only for `TurnAbortReason::Interrupted`, preserving the persistent REPL across replaced turns - track the current top-level exec by turn and only reset when the interrupted turn owns submitted work or a freshly started kernel for the current exec attempt - close both interrupt races: the write-window race by marking the exec as submitted before async pipe writes begin, and the startup-window race by tracking fresh-kernel ownership until submission - add regression coverage for interrupted in-flight execs and the pending-kernel-start window ## Why Stopping a turn previously surfaced `aborted by user after Xs` even though the underlying `js_repl` kernel could continue executing. Earlier fixes also risked resetting the session-scoped REPL too broadly or missing already-dispatched work. This change keeps cleanup scoped to explicit stop semantics and makes the interrupt path line up with both submitted execs and newly started kernels. ## Testing - `just fmt` - `cargo test -p codex-core` - `just fix -p codex-core` `cargo test -p codex-core` passes the updated `js_repl` coverage, including the new startup-window regression test, but still has unrelated integration failures in this environment outside `js_repl`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-12 17:51:56 -07:00
Curtis 'Fjord' Hawthorne	b560494c9f	Persist js_repl codex helpers across cells (#14503 ) ## Summary This changes `js_repl` so saved references to `codex.tool(...)` and `codex.emitImage(...)` keep working across cells. Previously, those helpers were recreated per exec and captured that exec's `message.id`. If a persisted object or saved closure reused an old helper in a later cell, the nested tool/image call could fail with `js_repl exec context not found`. This patch: - keeps stable `codex.tool` and `codex.emitImage` helper identities in the kernel - resolves the current exec dynamically at call time using `AsyncLocalStorage` - adds regression coverage for persisted helper references across cells - updates the js_repl docs and project-doc instructions to describe the new behavior and its limits ## Why We already support persistent top-level bindings across `js_repl` cells, so persisted objects should be able to reuse `codex` helpers in later active cells. The bug was that helper identity was exec-scoped, not kernel-scoped. Using `AsyncLocalStorage` fixes the cross-cell reuse case without falling back to a single global active exec that could accidentally attribute stale background callbacks to the wrong cell.	2026-03-12 15:41:54 -07:00
aaronl-openai	f35d46002a	Fix js_repl hangs on U+2028/U+2029 dynamic tool responses (#14421 ) ## Summary Dynamic tool responses containing literal U+2028 / U+2029 would cause await codex.tool(...) to hang even though the response had already arrived. This PR replaces the kernel’s readline-based stdin handling with byte-oriented JSONL framing that handles these characters properly. ## Testing - `cargo test -p codex-core` - tested the binary on a repro case and confirmed it's fixed --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-12 13:01:02 -07:00
Michael Bolin	0c8a36676a	fix: move inline codex-rs/core unit tests into sibling files (#14444 ) ## Why PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This applies the same extraction pattern across the rest of `codex-rs/core` so the production modules stay focused on runtime code instead of large inline test blocks. Keeping the tests in sibling files also makes follow-up edits easier to review because product changes no longer have to share a file with hundreds or thousands of lines of test scaffolding. ## What changed - replaced each inline `mod tests { ... }` in `codex-rs/core/src/*` with a path-based module declaration - moved each extracted unit test module into a sibling `_tests.rs` file, using `mod_tests.rs` for `mod.rs` modules - preserved the existing `cfg(...)` guards and module-local structure so the refactor remains structural rather than behavioral ## Testing - `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`) - `just fix -p codex-core` - `cargo fmt --check` - `cargo shear`	2026-03-12 08:16:36 -07:00
viyatb-oai	04892b4ceb	refactor: make bubblewrap the default Linux sandbox (#13996 ) ## Summary - make bubblewrap the default Linux sandbox and keep `use_legacy_landlock` as the only override - remove `use_linux_sandbox_bwrap` from feature, config, schema, and docs surfaces - update Linux sandbox selection, CLI/config plumbing, and related tests/docs to match the new default - fold in the follow-up CI fixes for request-permissions responses and Linux read-only sandbox error text	2026-03-11 23:31:18 -07:00
Matthew Zeng	ba5b94287e	[apps] Add tool_suggest tool. (#14287 ) - [x] Add tool_suggest tool. - [x] Move chatgpt/src/connectors.rs and core/src/connectors.rs into a dedicated mod so that we have all the logic and global cache in one place. - [x] Update TUI app link view to support rendering the installation view for mcp elicitation. --------- Co-authored-by: Shaqayeq <shaqayeq@openai.com> Co-authored-by: Eric Traut <etraut@openai.com> Co-authored-by: pakrym-oai <pakrym@openai.com> Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com> Co-authored-by: guinness-oai <guinness@openai.com> Co-authored-by: Eugene Brevdo <ebrevdo@users.noreply.github.com> Co-authored-by: Charlie Guo <cguo@openai.com> Co-authored-by: Fouad Matin <fouad@openai.com> Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com> Co-authored-by: xl-openai <xl@openai.com> Co-authored-by: alexsong-oai <alexsong@openai.com> Co-authored-by: Owen Lin <owenlin0@gmail.com> Co-authored-by: sdcoffey <stevendcoffey@gmail.com> Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Won Park <won@openai.com> Co-authored-by: Dylan Hurd <dylan.hurd@openai.com> Co-authored-by: celia-oai <celia@openai.com> Co-authored-by: gabec-openai <gabec@openai.com> Co-authored-by: joeytrasatti-openai <joey.trasatti@openai.com> Co-authored-by: Leo Shimonaka <leoshimo@openai.com> Co-authored-by: Rasmus Rygaard <rasmus@openai.com> Co-authored-by: maja-openai <163171781+maja-openai@users.noreply.github.com> Co-authored-by: pash-openai <pash@openai.com> Co-authored-by: Josh McKinney <joshka@openai.com>	2026-03-11 22:06:59 -07:00
Anton Panasenko	77b0c75267	feat: search_tool migrate to bring you own tool of Responses API (#14274 ) ## Why to support a new bring your own search tool in Responses API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search) we migrating our bm25 search tool to use official way to execute search on client and communicate additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch	2026-03-11 17:51:51 -07:00
Curtis 'Fjord' Hawthorne	8791f0ab9a	Let models opt into original image detail (#14175 ) ## Summary This PR narrows original image detail handling to a single opt-in feature: - `image_detail_original` lets the model request `detail: "original"` on supported models - Omitting `detail` preserves the default resized behavior The model only sees `detail: "original"` guidance when the active model supports it: - JS REPL instructions include the guidance and examples only on supported models - `view_image` only exposes a `detail` parameter when the feature and model can use it The image detail API is intentionally narrow and consistent across both paths: - `view_image.detail` supports only `"original"`; otherwise omit the field - `codex.emitImage(..., detail)` supports only `"original"`; otherwise omit the field - Unsupported explicit values fail clearly at the API boundary instead of being silently reinterpreted - Unsupported explicit `detail: "original"` requests fall back to normal behavior when the feature is disabled or the model does not support original detail	2026-03-11 15:25:07 -07:00
Curtis 'Fjord' Hawthorne	5a89660ae4	Add js_repl cwd and homeDir helpers (#14385 ) ## Summary This PR adds two read-only path helpers to `js_repl`: - `codex.cwd` - `codex.homeDir` They are exposed alongside the existing `codex.tmpDir` helper so the REPL can reference basic host path context without reopening direct `process` access. ## Implementation - expose `codex.cwd` and `codex.homeDir` from the js_repl kernel - make `codex.homeDir` come from the kernel process environment - pass session dependency env through js_repl kernel startup so `codex.homeDir` matches the env a shell-launched process would see - keep existing shell `HOME` population behavior unchanged - update js_repl prompt/docs and add runtime/integration coverage for the new helpers	2026-03-11 14:44:44 -07:00
pakrym-oai	ee8f84153e	Add output schema to MCP tools and expose MCP tool results in code mode (#14236 ) Summary - drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers to keep MCP tooling focused on the final result shape - wire the new schema definitions through code mode, context, handlers, and spec modules so MCP tools serialize the exact output shape expected by the model - extend code mode tests to cover multiple MCP call scenarios and ensure the serialized data matches the new schema - refresh JS runner helpers and protocol models alongside the schema changes Testing - Not run (not requested)	2026-03-11 12:33:08 -07:00
pakrym-oai	c4d35084f5	Reuse McpToolOutput in McpHandler (#14229 ) We already have a type to represent the MCP tool output, reuse it instead of the custom McpHandlerOutput	2026-03-11 12:33:07 -07:00
Michael Bolin	bf5c2f48a5	seatbelt: honor split filesystem sandbox policies (#13448 ) ## Why After `#13440` and `#13445`, macOS Seatbelt policy generation was still deriving filesystem and network behavior from the legacy `SandboxPolicy` projection. That projection loses explicit unreadable carveouts and conflates split network decisions, so the generated Seatbelt policy could still be wider than the split policy that Codex had already computed. ## What changed - added Seatbelt entrypoints that accept `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` directly - built read and write policy stanzas from access roots plus excluded subpaths so explicit unreadable carveouts survive into the generated Seatbelt policy - switched network policy generation to consult `NetworkSandboxPolicy` directly - failed closed when managed-network or proxy-constrained sessions do not yield usable loopback proxy endpoints - updated the macOS callers and test helpers that now need to carry the split policies explicitly ## Verification - added regression coverage in `core/src/seatbelt.rs` for unreadable carveouts under both full-disk and scoped-readable policies - verified the current PR state with `just clippy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13448). * #13453 * #13452 * #13451 * #13449 * __->__ #13448 * #13445 * #13440 * #13439 --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-03-08 00:35:19 +00:00
Michael Bolin	22ac6b9aaa	sandboxing: plumb split sandbox policies through runtime (#13439 ) ## Why `#13434` introduces split `FileSystemSandboxPolicy` and `NetworkSandboxPolicy`, but the runtime still made most execution-time sandbox decisions from the legacy `SandboxPolicy` projection. That projection loses information about combinations like unrestricted filesystem access with restricted network access. In practice, that means the runtime can choose the wrong platform sandbox behavior or set the wrong network-restriction environment for a command even when config has already separated those concerns. This PR carries the split policies through the runtime so sandbox selection, process spawning, and exec handling can consult the policy that actually matters. ## What changed - threaded `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` through `TurnContext`, `ExecRequest`, sandbox attempts, shell escalation state, unified exec, and app-server exec overrides - updated sandbox selection in `core/src/sandboxing/mod.rs` and `core/src/exec.rs` to key off `FileSystemSandboxPolicy.kind` plus `NetworkSandboxPolicy`, rather than inferring behavior only from the legacy `SandboxPolicy` - updated process spawning in `core/src/spawn.rs` and the platform wrappers to use `NetworkSandboxPolicy` when deciding whether to set `CODEX_SANDBOX_NETWORK_DISABLED` - kept additional-permissions handling and legacy `ExternalSandbox` compatibility projections aligned with the split policies, including explicit user-shell execution and Windows restricted-token routing - updated callers across `core`, `app-server`, and `linux-sandbox` to pass the split policies explicitly ## Verification - added regression coverage in `core/tests/suite/user_shell_cmd.rs` to verify `RunUserShellCommand` does not inherit `CODEX_SANDBOX_NETWORK_DISABLED` from the active turn - added coverage in `core/src/exec.rs` for Windows restricted-token sandbox selection when the legacy projection is `ExternalSandbox` - updated Linux sandbox coverage in `linux-sandbox/tests/suite/landlock.rs` to exercise the split-policy exec path - verified the current PR state with `just clippy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13439). * #13453 * #13452 * #13451 * #13449 * #13448 * #13445 * #13440 * __->__ #13439 --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-03-07 02:30:21 +00:00
Curtis 'Fjord' Hawthorne	cfbbbb1dda	Harden js_repl emitImage to accept only data: URLs (#13507 ) ### Motivation - Prevent untrusted js_repl code from supplying arbitrary external URLs that the host would forward into model input and cause external fetches / data exfiltration. This change narrows the emitImage contract to safe, self-contained data URLs. ### Description - Kernel: added `normalizeEmitImageUrl` and enforce that string-valued `codex.emitImage(...)` inputs and `input_image`/content-item paths only accept non-empty `data:` URLs; byte-based paths still produce data URLs as before (`kernel.js`). - Host: added `validate_emitted_image_url` and check `EmitImage` requests before creating `FunctionCallOutputContentItem::InputImage`, returning an error to the kernel if the URL is not a `data:` URL (`mod.rs`). - Tests/docs: added a runtime test `js_repl_emit_image_rejects_non_data_url` to assert rejection of non-data URLs and updated user-facing docs/instruction text to state `data URL` support instead of generic direct image URLs (`mod.rs`, `docs/js_repl.md`, `project_doc.rs`). ### Testing - Ran `just fmt` in `codex-rs`; it completed successfully. - Added a runtime test (`cargo test -p codex-core js_repl_emit_image_rejects_non_data_url`) but executing the test in this environment failed due to a missing system dependency required by `codex-linux-sandbox` (the vendored `bubblewrap` build requires `libcap.pc` via `pkg-config`), so the test could not be run here. - Attempted a focused `cargo test` invocation with and without default features; both compile/test attempts were blocked by the same missing system `libcap` dependency in this environment. ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69a7837bce98832d91db92d5f76d6cbe)	2026-03-05 12:12:32 -08:00
Curtis 'Fjord' Hawthorne	657841e7f5	Persist initialized js_repl bindings after failed cells (#13482 ) ## Summary - Change `js_repl` failed-cell persistence so later cells keep prior bindings plus only the current-cell bindings whose initialization definitely completed before the throw. - Preserve initialized lexical bindings across failed cells via module-namespace readability, including top-level destructuring that partially succeeds before a later throw. - Preserve hoisted `var` and `function` bindings only when execution clearly reached their declaration site, and preserve direct top-level pre-declaration `var` writes and updates through explicit write-site markers. - Preserve top-level `for...in` / `for...of` `var` bindings when the loop body executes at least once, using a first-iteration guard to avoid per-iteration bookkeeping overhead. - Keep prior module state intact across link-time failures and evaluation failures before the prelude runs, while still allowing failed cells that already recreated prior bindings to persist updates to those existing bindings. - Hide internal commit hooks from user `js_repl` code after the prelude aliases them, so snippets cannot spoof committed bindings by calling the raw `import.meta` hooks directly. - Add focused regression coverage for the supported failed-cell behaviors and the intentionally unsupported boundaries. - Update `js_repl` docs and generated instructions to describe the new, narrower failed-cell persistence model. ## Motivation We saw `js_repl` drop bindings that had already been initialized successfully when a later statement in the same cell threw, for example: const { context: liveContext, session } = await initializeGoogleSheetsLiveForTab(tab); // later statement throws That was surprising in practice because successful earlier work disappeared from the next cell. This change makes failed-cell persistence more useful without trying to model every possible partially executed JavaScript edge case. The resulting behavior is narrower and easier to reason about: - prior bindings are always preserved - lexical bindings persist when their initialization completed before the throw - hoisted `var` / `function` bindings persist only when execution clearly reached their declaration or a supported top-level `var` write site - failed cells that already recreated prior bindings can persist writes to those existing bindings even if they introduce no new bindings The detailed edge-case matrix stays in `docs/js_repl.md`. The model-facing `project_doc` guidance is intentionally shorter and focused on generation-relevant behavior. ## Supported Failed-Cell Behavior - Prior bindings remain available after a failed cell. - Initialized lexical bindings remain available after a failed cell. - Top-level destructuring like `const { a, b } = ...` preserves names whose initialization completed before a later throw. - Hoisted `function` bindings persist when execution reached the declaration statement before the throw. - Direct top-level pre-declaration `var` writes and updates persist, for example: - `x = 1` - `x += 1` - `x++` - short-circuiting logical assignments only persist when the write branch actually runs - Non-empty top-level `for...in` / `for...of` `var` loops persist their loop bindings. - Failed cells can persist updates to existing carried bindings after the prelude has run, even when the cell commits no new bindings. - Link failures and eval failures before the prelude do not poison `@prev`. ## Intentionally Unsupported Failed-Cell Cases - Hoisted function reads before the declaration, such as `foo(); ...; function foo() {}` - Aliasing or inference-based recovery from reads before declaration - Nested writes inside already-instrumented assignment RHS expressions - Destructuring-assignment recovery for hoisted `var` - Partial `var` destructuring recovery - Pre-declaration `undefined` reads for hoisted `var` - Empty top-level `for...in` / `for...of` loop vars - Nested or scope-sensitive pre-declaration `var` writes outside direct top-level expression statements	2026-03-05 11:01:46 -08:00
aaronl-openai	ff0341dc94	[js_repl] Support local ESM file imports (#13437 ) ## Summary - add `js_repl` support for dynamic imports of relative and absolute local ESM `.js` / `.mjs` files - keep bare package imports on the native Node path and resolved from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then `cwd`), even when they originate from imported local files - restrict static imports inside imported local files to other local relative/absolute `.js` / `.mjs` files, and surface a clear error for unsupported top-level static imports in the REPL cell - run imported local files inside the REPL VM context so they can access `codex.tmpDir`, `codex.tool`, captured `console`, and Node-like `import.meta` helpers - reload local files between execs so later `await import("./file.js")` calls pick up edits and fixed failures, while preserving package/builtin caching and persistent top-level REPL bindings - make `import.meta.resolve()` self-consistent by allowing the returned `file://...` URLs to round-trip through `await import(...)` - update both public and injected `js_repl` docs to clarify the narrowed contract, including global bare-import resolution behavior for local absolute files ## Testing - `cargo test -p codex-core js_repl_` - built codex binary and verified behavior --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-04 22:40:31 -08:00
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388 ) ## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. - Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. - Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
Curtis 'Fjord' Hawthorne	c4cb594e73	Make js_repl image output controllable (#13331 ) ## Summary Instead of always adding inner function call outputs to the model context, let js code decide which ones to return. - Stop auto-hoisting nested tool outputs from `codex.tool(...)` into the outer `js_repl` function output. - Keep `codex.tool(...)` return values unchanged as structured JS objects. - Add `codex.emitImage(...)` as the explicit path for attaching an image to the outer `js_repl` function output. - Support emitting from a direct image URL, a single `input_image` item, an explicit `{ bytes, mimeType }` object, or a raw tool response object containing exactly one image. - Preserve existing `view_image` original-resolution behavior when JS emits the raw `view_image` tool result. - Suppress the special `ViewImageToolCall` event for `js_repl`-sourced `view_image` calls so nested inspection stays side-effect free until JS explicitly emits. - Update the `js_repl` docs and generated project instructions with both recommended patterns: - `await codex.emitImage(codex.tool("view_image", { path }))` - `await codex.emitImage({ bytes: await page.screenshot({ type: "jpeg", quality: 85 }), mimeType: "image/jpeg" })` #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/13050 - 👉 `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 16:25:59 -08:00
Curtis 'Fjord' Hawthorne	b92146d48b	Add under-development original-resolution view_image support (#13050 ) ## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. - History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 15:56:54 -08:00
Curtis 'Fjord' Hawthorne	7e980d7db6	Support multimodal custom tool outputs (#12948 ) ## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948	2026-02-26 18:17:46 -08:00
Michael Bolin	7fa9d9ae35	feat: include sandbox config with escalation request (#12839 ) ## Why Before this change, an escalation approval could say that a command should be rerun, but it could not carry the sandbox configuration that should still apply when the escalated command is actually spawned. That left an unsafe gap in the `zsh-fork` skill path: skill scripts under `scripts/` that did not declare permissions could be escalated without a sandbox, and scripts that did declare permissions could lose their bounded sandbox on rerun or cached session approval. This PR extends the escalation protocol so approvals can optionally carry sandbox configuration all the way through execution. That lets the shell runtime preserve the intended sandbox instead of silently widening access. We likely want a single permissions type for this codepath eventually, probably centered on `Permissions`. For now, the protocol needs to represent both the existing `PermissionProfile` form and the fuller `Permissions` form, so this introduces a temporary disjoint union, `EscalationPermissions`, to carry either one. Further, this means that today, a skill either: - does not declare any permissions, in which case it is run using the default sandbox for the turn - specifies permissions, in which case the skill is run using that exact sandbox, which might be more restrictive than the default sandbox for the turn We will likely change the skill's permissions to be additive to the existing permissions for the turn. ## What Changed - Added `EscalationPermissions` to `codex-protocol` so escalation requests can carry either a `PermissionProfile` or a full `Permissions` payload. - Added an explicit `EscalationExecution` mode to the shell escalation protocol so reruns distinguish between `Unsandboxed`, `TurnDefault`, and `Permissions(...)` instead of overloading `None`. - Updated `zsh-fork` shell reruns to resolve `TurnDefault` at execution time, which keeps ordinary `UseDefault` commands on the turn sandbox and preserves turn-level macOS seatbelt profile extensions. - Updated the `zsh-fork` skill path so a skill with no declared permissions inherits the conversation's effective sandbox instead of escalating unsandboxed. - Updated the `zsh-fork` skill path so a skill with declared permissions reruns with exactly those permissions, including when a cached session approval is reused. ## Testing - Added unit coverage in `core/src/tools/runtimes/shell/unix_escalation.rs` for the explicit `UseDefault` / `RequireEscalated` / `WithAdditionalPermissions` execution mapping. - Added unit coverage in `core/src/tools/runtimes/shell/unix_escalation.rs` for macOS seatbelt extension preservation in both the `TurnDefault` and explicit-permissions rerun paths. - Added integration coverage in `core/tests/suite/skill_approval.rs` for permissionless skills inheriting the turn sandbox and explicit skill permissions remaining bounded across cached approval reuse.	2026-02-26 12:00:18 -08:00
Curtis 'Fjord' Hawthorne	eb77db2957	Log js_repl nested tool responses in rollout history (#12837 ) ## Summary - add tracing-based diagnostics for nested `codex.tool(...)` calls made from `js_repl` - emit a bounded, sanitized summary at `info!` - emit the exact raw serialized response object or error string seen by JavaScript at `trace!` - document how to enable these logs and where to find them, especially for `codex app-server` ## Why Nested `codex.tool(...)` calls inside `js_repl` are a debugging boundary: JavaScript sees the tool result, but that result is otherwise hard to inspect from outside the kernel. This change adds explicit tracing for that path using the repo’s normal observability pattern: - `info` for compact summaries - `trace` for exact raw payloads when deep debugging is needed ## What changed - `js_repl` now summarizes nested tool-call results across the response shapes it can receive: - message content - function-call outputs - custom tool outputs - MCP tool results and MCP error results - direct error strings - each nested `codex.tool(...)` completion logs: - `exec_id` - `tool_call_id` - `tool_name` - `ok` - a bounded summary struct describing the payload shape - at `trace`, the same path also logs the exact serialized response object or error string that JavaScript received - docs now include concrete logging examples for `codex app-server` - unit coverage was added for multimodal function output summaries and error summaries ## How to use it ### Summary-only logging Set: ```sh RUST_LOG=codex_core::tools::js_repl=info ``` For `codex app-server`, tracing output is written to the server process `stderr`. Example: ```sh RUST_LOG=codex_core::tools::js_repl=info \ LOG_FORMAT=json \ codex app-server \ 2> /tmp/codex-app-server.log ``` This emits bounded summary lines for nested `codex.tool(...)` calls. ### Full raw debugging Set: ```sh RUST_LOG=codex_core::tools::js_repl=trace ``` Example: ```sh RUST_LOG=codex_core::tools::js_repl=trace \ LOG_FORMAT=json \ codex app-server \ 2> /tmp/codex-app-server.log ``` At `trace`, you get: - the same `info` summary line - a `trace` line with the exact serialized response object seen by JavaScript - or the exact error string if the nested tool call failed ### Where the logs go For `codex app-server`, these logs go to process `stderr`, so redirect or capture `stderr` to inspect them. Example: ```sh RUST_LOG=codex_core::tools::js_repl=trace \ LOG_FORMAT=json \ /Users/fjord/code/codex/codex-rs/target/debug/codex app-server \ 2> /tmp/codex-app-server.log ``` Then inspect: ```sh rg "js_repl nested tool call" /tmp/codex-app-server.log ``` Without an explicit `RUST_LOG` override, these `js_repl` nested tool-call logs are typically not visible.	2026-02-26 10:12:28 -08:00
Curtis 'Fjord' Hawthorne	40ab71a985	Disable js_repl when Node is incompatible at startup (#12824 ) ## Summary - validate `js_repl` Node compatibility during session startup when the experiment is enabled - if Node is missing or too old, disable `js_repl` and `js_repl_tools_only` for the session before tools and instructions are built - surface that startup disablement to users through the existing startup warning flow instead of only logging it - reuse the same compatibility check in js_repl kernel startup so startup gating and runtime behavior stay aligned - add a regression test that verifies the warning is emitted and that the first advertised tool list omits `js_repl` and `js_repl_reset` when Node is incompatible ## Why Today `js_repl` can be advertised based only on the feature flag, then fail later when the kernel starts. That makes the available tool list inaccurate at the start of a conversation, and users do not get a clear explanation for why the tool is unavailable. This change makes tool availability reflect real startup checks, keeps the advertised tool set stable for the lifetime of the session, and gives users a visible warning when `js_repl` is disabled. ## Testing - `just fmt` - `cargo test -p codex-core --test all js_repl_is_not_advertised_when_startup_node_is_incompatible`	2026-02-26 01:14:51 +00:00
Curtis 'Fjord' Hawthorne	9501669a24	tests(js_repl): remove node-related skip paths from js_repl tests (#12185 ) ## Summary Remove js_repl/node test-skip paths and make Node setup explicit in CI so js_repl tests always run instead of silently skipping. ## Why We had multiple “expediency” skip paths that let js_repl tests pass without actually exercising Node-backed behavior. This reduced CI signal and hid runtime/environment regressions. ## What changed ### CI - Added Node setup using `codex-rs/node-version.txt` in: - `.github/workflows/rust-ci.yml` - `.github/workflows/bazel.yml` - Added a Unix PATH copy step in Bazel workflow to expose the setup-node binary in common paths. ### js_repl test harness - Added explicit js_repl sandbox test configuration helpers in: - `codex-rs/core/src/tools/js_repl/mod.rs` - `codex-rs/core/src/tools/handlers/js_repl.rs` - Added Linux arg0 dispatch glue for js_repl tests so sandbox subprocess entrypoint behavior is correct under Linux test execution. ### Removed skip behavior - Deleted runtime guard function and early-return skips in js_repl tests (`can_run_js_repl_runtime_tests` and related per-test short-circuits). - Removed view_image integration test skip behavior: - dropped `skip_if_no_network!(Ok(()))` - removed “skip on Node missing/too old” branch after js_repl output inspection. ## Impact - js_repl/node tests now consistently execute and fail loudly when the environment is not correctly provisioned. - CI has stronger signal for js_repl regressions instead of false green from conditional skips. ## Testing - `cargo test -p codex-core` (locally) to validate js_repl unit/integration behavior with skips removed. - CI expected to surface any remaining environment/runtime gaps directly (rather than masking them). #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12300 - ✅ `2` https://github.com/openai/codex/pull/12275 - ✅ `3` https://github.com/openai/codex/pull/12205 - ✅ `4` https://github.com/openai/codex/pull/12407 - ✅ `5` https://github.com/openai/codex/pull/12372 - 👉 `6` https://github.com/openai/codex/pull/12185 - ⏳ `7` https://github.com/openai/codex/pull/10673	2026-02-24 22:52:14 -08:00
Curtis 'Fjord' Hawthorne	8f3f2c3c02	tests(js_repl): stabilize CI runtime test execution (#12407 ) ## Summary Stabilize `js_repl` runtime test setup in CI and move tool-facing `js_repl` behavior coverage into integration tests. This is a test/CI change only. No production `js_repl` behavior change is intended. ## Why - Bazel test sandboxes (especially on macOS) could resolve a different `node` than the one installed by `actions/setup-node`, which caused `js_repl` runtime/version failures. - `js_repl` runtime tests depend on platform-specific sandbox/test-harness behavior, so they need explicit gating in a base-stability commit. - Several tests in the `js_repl` unit test module were actually black-box/tool-level behavior tests and fit better in the integration suite. ## Changes - Add `actions/setup-node` to the Bazel and Rust `Tests` workflows, using the exact version pinned in the repo’s Node version file. - In Bazel (non-Windows), pass `CODEX_JS_REPL_NODE_PATH=$(which node)` into test env so `js_repl` uses the `actions/setup-node` runtime inside Bazel tests. - Add a new integration test suite for `js_repl` tool behavior and register it in the core integration test suite module. - Move black-box `js_repl` behavior tests into the integration suite (persistence/TLA, builtin tool invocation, recursive self-call rejection, `process` isolation, blocked builtin imports). - Keep white-box manager/kernel tests in the `js_repl` unit test module. - Gate `js_repl` runtime tests to run only on macOS and only when a usable Node runtime is available (skip on other platforms / missing Node in this commit). ## Impact - Reduces `js_repl` CI failures caused by Node resolution drift in Bazel. - Improves test organization by separating tool-facing behavior tests from white-box manager/kernel tests. - Keeps the base commit stable while expanding `js_repl` runtime coverage. #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12372 - 👉 `2` https://github.com/openai/codex/pull/12407 - ⏳ `3` https://github.com/openai/codex/pull/12185 - ⏳ `4` https://github.com/openai/codex/pull/10673	2026-02-24 21:04:34 -08:00
Curtis 'Fjord' Hawthorne	125fbec317	Fix js_repl view_image attachments in nested tool calls (#12725 ) ## Summary - Fix `js_repl` so `await codex.tool("view_image", { path })` actually attaches the image to the active turn when called from inside the JS REPL. - Restore the behavior expected by the existing `js_repl` image-attachment test. - This is a follow-up to [#12553](https://github.com/openai/codex/pull/12553), which changed `view_image` to return structured image content. ## Root Cause - [#12553](https://github.com/openai/codex/pull/12553) changed `view_image` from directly injecting a pending user image message to returning structured `function_call_output` content items. - The nested tool-call bridge inside `js_repl` serialized that tool response back to the JS runtime, but it did not mirror returned image content into the active turn. - As a result, `view_image` appeared to succeed inside `js_repl`, but no `input_image` was actually attached for the outer turn. ## What Changed - Updated the nested tool-call path in `js_repl` to inspect function tool responses for structured content items. - When a nested tool response includes `input_image` content, `js_repl` now injects a corresponding user `Message` into the active turn before returning the raw tool result back to the JS runtime. - Kept the normal JSON result flow intact, so `codex.tool(...)` still returns the original tool output object to JavaScript. ## Why - `js_repl` documentation and tests already assume that `view_image` can be used from inside the REPL to attach generated images to the model. - Without this fix, the nested call path silently dropped that attachment behavior.	2026-02-24 18:23:53 -08:00
Curtis 'Fjord' Hawthorne	63c2ac96cd	fix(js_repl): surface uncaught kernel errors and reset cleanly (#12636 ) ## Summary Improve `js_repl` behavior when the Node kernel hits a process-level failure (for example, an uncaught exception or unhandled Promise rejection). Instead of only surfacing a generic `js_repl kernel exited unexpectedly` after stdout EOF, `js_repl` now returns a clearer exec error for the active request, then resets the kernel cleanly. ## Why Some sandbox-denied operations can trigger Node errors that become process-level failures (for example, an unhandled EventEmitter `'error'` event). In that case: - the kernel process exits, - the host sees stdout EOF, - the user gets a generic kernel-exit error, - and the next request can briefly race with stale kernel state. This change improves that failure mode without monkeypatching Node APIs. ## Changes ### Kernel-side (`js_repl` Node process) - Add process-level handlers for: - `uncaughtException` - `unhandledRejection` - When one of these fires: - best-effort emit a normal `exec_result` error for the active exec - include actionable guidance to catch/handle async errors (including Promise rejections and EventEmitter `'error'` events) - exit intentionally so the host can reset/restart the kernel ### Host-side (`JsReplManager`) - Clear dead kernel state as soon as the stdout reader observes unexpected kernel exit/EOF. - This lets the next `js_repl` exec start a fresh kernel instead of hitting a stale broken-pipe path. ### Tests - Add regression coverage for: - uncaught async exception -> exec error + kernel recovery on next exec - Update forced-kernel-exit test to validate recovery behavior (next exec restarts cleanly) ## Impact - Better user-facing error for kernel crashes caused by uncaught/unhandled async failures. - Cleaner recovery behavior after kernel exit. ## Validation - `cargo test -p codex-core --lib tools::js_repl::tests::js_repl_uncaught_exception_returns_exec_error_and_recovers -- --exact` - `cargo test -p codex-core --lib tools::js_repl::tests::js_repl_forced_kernel_exit_recovers_on_next_exec -- --exact` - `just fmt`	2026-02-24 17:12:02 -08:00
Dylan Hurd	f6053fdfb3	feat(core) Introduce Feature::RequestPermissions (#11871 ) ## Summary Introduces the initial implementation of Feature::RequestPermissions. RequestPermissions allows the model to request that a command be run inside the sandbox, with additional permissions, like writing to a specific folder. Eventually this will include other rules as well, and the ability to persist these permissions, but this PR is already quite large - let's get the core flow working and go from there! <img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM" src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368" /> ## Testing - [x] Added tests - [x] Tested locally - [x] Feature	2026-02-24 09:48:57 -08:00
Michael Bolin	66d5d34e6e	core: preserve constrained approval/sandbox policies in TurnContext (#12473 )	2026-02-21 14:40:24 -08:00
Curtis 'Fjord' Hawthorne	097620218d	js_repl: remove codex.state helper references (#12275 ) ## Summary This PR removes `codex.state` from the `js_repl` helper surface and removes all corresponding documentation/instruction references. ## Motivation Top-level bindings in `js_repl` now persist across cells, so the extra `codex.state` helper is redundant and adds unnecessary API/docs surface. ## Changes - Removed the long-lived `state` object from the Node kernel helper wiring. - Stopped exposing `codex.state` (and `context.state`) during `js_repl` execution. - Updated user-facing `js_repl` docs to remove `codex.state`. - Updated generated instruction text and related test expectations to list only: - `codex.tmpDir` - `codex.tool(name, args?)` #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12300 - 👉 `2` https://github.com/openai/codex/pull/12275 - ⏳ `3` https://github.com/openai/codex/pull/12205 - ⏳ `4` https://github.com/openai/codex/pull/12185 - ⏳ `5` https://github.com/openai/codex/pull/10673	2026-02-20 11:20:45 -08:00
Curtis 'Fjord' Hawthorne	cc248e4681	js_repl: canonicalize paths for node_modules boundary checks (#12177 ) ## Summary Fix `js_repl` package-resolution boundary checks for macOS temp directory path aliasing (`/var` vs `/private/var`). ## Problem `js_repl` verifies that resolved bare-package imports stay inside a configured `node_modules` root. On macOS, temp directories are commonly exposed as `/var/...` but canonicalize to `/private/var/...`. Because the boundary check compared raw paths with `path.relative(...)`, valid resolutions under temp dirs could be misclassified as escaping the allowed base, causing false `Module not found` errors. ## Changes - Add `fs` import in the JS kernel. - Add `canonicalizePath()` using `fs.realpathSync.native(...)` (with safe fallback). - Canonicalize both `base` and `resolvedPath` before running the `node_modules` containment check. ## Impact - Fixes false-negative boundary checks for valid package resolutions in macOS temp-dir scenarios. - Keeps the existing security boundary behavior intact. - Scope is limited to `js_repl` kernel module path validation logic. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12177 - ⏳ `2` https://github.com/openai/codex/pull/10673	2026-02-18 11:56:45 -08:00
aaronl-openai	f600453699	[js_repl] paths for node module resolution can be specified for js_repl (#11944 ) # External (non-OpenAI) Pull Request Requirements In `js_repl` mode, module resolution currently starts from `js_repl_kernel.js`, which is written to a per-kernel temp dir. This effectively means that bare imports will not resolve. This PR adds a new config option, `js_repl_node_module_dirs`, which is a list of dirs that are used (in order) to resolve a bare import. If none of those work, the current working directory of the thread is used. For example: ```toml js_repl_node_module_dirs = [ "/path/to/node_modules/", "/other/path/to/node_modules/", ] ```	2026-02-17 23:29:49 -08:00

1 2

59 Commits