Commit Graph

101 Commits

Author SHA1 Message Date
Owen Lin
cf91ea808a some other way 2026-04-09 16:08:43 -07:00
Owen Lin
82dcda2910 move initial context to session, not config 2026-04-09 14:06:20 -07:00
Leo Shimonaka
01537f0bd2 Auto-approve MCP server elicitations in Full Access mode (#17164)
Currently, when a MCP server sends an elicitation to Codex running in
Full Access (`sandbox_policy: DangerFullAccess` + `approval_policy:
Never`), the elicitations are auto-cancelled.

This PR updates the automatic handling of MCP elicitations to be
consistent with other approvals in full-access, where they are
auto-approved. Because MCP elicitations may actually require user input,
this mechanism is limited to empty form elicitations.

## Changeset
- Add policy helper shared with existing MCP tool call approval
auto-approve
- Update `ElicitationRequestManager` to auto-approve elicitations in
full access when `can_auto_accept_elicitation` is true.
- Add tests

Co-authored-by: Codex <noreply@openai.com>
2026-04-08 16:41:02 -07:00
maja-openai
dcbc91fd39 Update guardian output schema (#17061)
## Summary
- Update guardian output schema to separate risk, authorization,
outcome, and rationale.
- Feed guardian rationale into rejection messages.
- Split the guardian policy into template and tenant-config sections.

## Validation
- `cargo test -p codex-core mcp_tool_call`
- `env -u CODEX_SANDBOX_NETWORK_DISABLED INSTA_UPDATE=always cargo test
-p codex-core guardian::`

---------

Co-authored-by: Owen Lin <owen@openai.com>
2026-04-08 15:47:29 -07:00
pakrym-oai
35b5720e8d Use AbsolutePathBuf for exec cwd plumbing (#17063)
## Summary
- Carry `AbsolutePathBuf` through tool cwd parsing/resolution instead of
resolving workdirs to raw `PathBuf`s.
- Type exec/sandbox request cwd fields as `AbsolutePathBuf` through
`ExecParams`, `ExecRequest`, `SandboxCommand`, and unified exec runtime
requests.
- Keep `PathBuf` conversions at external/event boundaries and update
existing tests/fixtures for the typed cwd.

## Validation
- `cargo check -p codex-core --tests`
- `cargo check -p codex-sandboxing --tests`
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core --lib tools::handlers::`
- `just fix -p codex-sandboxing`
- `just fix -p codex-core`
- `just fmt`

Full `codex-core` test suite was not run locally; per repo guidance I
kept local validation targeted.
2026-04-08 10:54:12 -07:00
Vivian Fang
d47b755aa2 Render namespace description for tools (#16879) 2026-04-08 02:39:40 -07:00
viyatb-oai
3c1adbabcd fix: refresh network proxy settings when sandbox mode changes (#17040)
## Summary

Fix network proxy sessions so changing sandbox mode recomputes the
effective managed network policy and applies it to the already-running
per-session proxy.

## Root Cause

`danger_full_access_denylist_only` injects `"*"` only while building the
proxy spec for Full Access. Sessions built that spec once at startup, so
a later permission switch to Full Access left the live proxy in its
original restricted policy. Switching back needed the same recompute
path to remove the synthetic wildcard again.

## What Changed

- Preserve the original managed network proxy config/requirements so the
effective spec can be recomputed for a new sandbox policy.
- Refresh the current session proxy when sandbox settings change, then
reapply exec-policy network overlays.
- Add an in-place proxy state update path while rejecting
listener/port/SOCKS changes that cannot be hot-reloaded.
- Keep runtime proxy settings cheap to snapshot and update.
- Add regression coverage for workspace-write -> Full Access ->
workspace-write.
2026-04-08 03:07:55 +00:00
Dylan Hurd
6c36e7d688 fix(app-server) revert null instructions changes (#17047) 2026-04-07 15:18:34 -07:00
pakrym-oai
f1a2b920f9 [codex] Make AbsolutePathBuf joins infallible (#16981)
Having to check for errors every time join is called is painful and
unnecessary.
2026-04-07 10:52:08 -07:00
Owen Lin
5d1671ca70 feat(analytics): generate an installation_id and pass it in responsesapi client_metadata (#16912)
## Summary

This adds a stable Codex installation ID and includes it on Responses
API requests via `x-codex-installation-id` passed in via the
`client_metadata` field for analytics/debugging.

The main pieces are:
- persist a UUID in `$CODEX_HOME/installation_id`
- thread the installation ID into `ModelClient`
- send it in `client_metadata` on Responses requests so it works
consistently across HTTP and WebSocket transports
2026-04-07 09:52:17 -07:00
Ahmed Ibrahim
cd591dc457 Preserve null developer instructions (#16976)
Preserve explicit null developer-instruction overrides across app-server
resume and fork flows.
2026-04-07 09:32:14 -07:00
pakrym-oai
413c1e1fdf [codex] reduce module visibility (#16978)
## Summary
- reduce public module visibility across Rust crates, preferring private
or crate-private modules with explicit crate-root public exports
- update external call sites and tests to use the intended public crate
APIs instead of reaching through module trees
- add the module visibility guideline to AGENTS.md

## Validation
- `cargo check --workspace --all-targets --message-format=short` passed
before the final fix/format pass
- `just fix` completed successfully
- `just fmt` completed successfully
- `git diff --check` passed
2026-04-07 08:03:35 -07:00
Won Park
90320fc51a collapse dev message into one (#16988)
collapse image-gen dev message into one
2026-04-06 23:49:47 -07:00
Ahmed Ibrahim
24c598e8a9 Honor null thread instructions (#16964)
- Treat explicit null thread instructions as a blank-slate override
while preserving omitted-field fallback behavior.
- Preserve null through rollout resume/fork and keep explicit empty
strings distinct.
- Add app-server v2 start/fork coverage for the tri-state instruction
params.
2026-04-07 04:10:19 +00:00
pakrym-oai
4bb507d2c4 Make AGENTS.md discovery FS-aware (#15826)
## Summary
- make AGENTS.md discovery and loading fully FS-aware and remove the
non-FS discover helper
- migrate remote-aware codex-core tests to use TestEnv workspace setup
instead of syncing a local workspace copy
- add AGENTS.md corner-case coverage, including directory fallbacks and
remote-aware integration coverage

## Testing
- cargo test -p codex-core project_doc -- --nocapture
- cargo test -p codex-core hierarchical_agents -- --nocapture
- cargo test -p codex-core agents_md -- --nocapture
- cargo test -p codex-tui status -- --nocapture
- cargo test -p codex-tui-app-server status -- --nocapture
- just fix
- just fmt
- just bazel-lock-update
- just bazel-lock-check
- just argument-comment-lint
- remote Linux executor tests in progress via scripts/test-remote-env.sh
2026-04-06 20:26:21 -07:00
starr-openai
a504d8f0fa Disable env-bound tools when exec server is none (#16349)
## Summary
- make `CODEX_EXEC_SERVER_URL=none` map to an explicit disabled
environment mode instead of inferring from a missing URL
- expose environment capabilities (`exec_enabled`, `filesystem_enabled`)
so tool building can gate behavior explicitly and future
multi-environment work has a clearer seam
- suppress env-backed tools when the relevant capability is unavailable,
including exec tools, `js_repl`, `apply_patch`, `list_dir`, and
`view_image`
- keep handler/runtime backstops so disabled environments still reject
execution if a tool path somehow bypasses registration

## Testing
- `just fmt`
- `cargo test -p codex-exec-server`
- `cargo test -p codex-tools
disabled_environment_omits_environment_backed_tools`
- `cargo test -p codex-tools
environment_capabilities_gate_exec_and_filesystem_tools_independently`
- remote devbox Bazel build via `codex-applied-devbox`:
`//codex-rs/cli:cli`
2026-04-06 17:22:06 -07:00
rhan-oai
756c45ec61 [codex-analytics] add protocol-native turn timestamps (#16638)
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638).
* #16870
* #16706
* #16659
* #16641
* #16640
* __->__ #16638
2026-04-06 16:22:59 -07:00
Owen Lin
ded559680d feat(requirements): support allowed_approval_reviewers (#16701)
## Description

Add requirements.toml support for `allowed_approvals_reviewers =
["user", "guardian_subagent"]`, so admins can now restrict the use of
guardian mode.

Note: If a user sets a reviewer that isn’t allowed by requirements.toml,
config loading falls back to the first allowed reviewer and emits a
startup warning.

The table below describes the possible admin controls.
| Admin intent | `requirements.toml` | User `config.toml` | End result |
|---|---|---|---|
| Leave Guardian optional | omit `allowed_approvals_reviewers` or set
`["user", "guardian_subagent"]` | user chooses `approvals_reviewer =
"user"` or `"guardian_subagent"` | Guardian off for `user`, on for
`guardian_subagent` + `approval_policy = "on-request"` |
| Force Guardian off | `allowed_approvals_reviewers = ["user"]` | any
user value | Effective reviewer is `user`; Guardian off |
| Force Guardian on | `allowed_approvals_reviewers =
["guardian_subagent"]` and usually `allowed_approval_policies =
["on-request"]` | any user reviewer value; user should also have
`approval_policy = "on-request"` unless policy is forced | Effective
reviewer is `guardian_subagent`; Guardian on when effective approval
policy is `on-request` |
| Allow both, but default to manual if user does nothing |
`allowed_approvals_reviewers = ["user", "guardian_subagent"]` | omit
`approvals_reviewer` | Effective reviewer is `user`; Guardian off |
| Allow both, and user explicitly opts into Guardian |
`allowed_approvals_reviewers = ["user", "guardian_subagent"]` |
`approvals_reviewer = "guardian_subagent"` and `approval_policy =
"on-request"` | Guardian on |
| Invalid admin config | `allowed_approvals_reviewers = []` | anything |
Config load error |
2026-04-06 11:11:44 -07:00
Eric Traut
b5edeb98a0 Fix flaky permissions escalation test on Windows (#16825)
Problem: `rejects_escalated_permissions_when_policy_not_on_request`
retried a real shell command after asserting the escalation rejection,
so Windows CI could fail on command startup timing instead of approval
behavior.

Solution: Keep the rejection assertion, verify no turn permissions were
granted, and assert through exec-policy evaluation that the same command
would be allowed without escalation instead of timing a subprocess.
2026-04-05 10:51:01 -07:00
rhan-oai
4fd5c35c4f [codex-analytics] subagent analytics (#15915)
- creates custom event that emits subagent thread analytics from core
- wires client metadata (`product_client_id, client_name,
client_version`), through from app-server
- creates `created_at `timestamp in core
- subagent analytics are behind `FeatureFlag::GeneralAnalytics`

PR stack
- [[telemetry] thread events
#15690](https://github.com/openai/codex/pull/15690)
- --> [[telemetry] subagent events
#15915](https://github.com/openai/codex/pull/15915)
- [[telemetry] turn events
#15591](https://github.com/openai/codex/pull/15591)
- [[telemetry] steer events
#15697](https://github.com/openai/codex/pull/15697)
- [[telemetry] queued prompt data
#15804](https://github.com/openai/codex/pull/15804)

Notes:
- core does not spawn a subagent thread for compact, but represented in
mapping for consistency

`INFO | 2026-04-01 13:08:12 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:399 | Tracked
codex_thread_initialized event params={'thread_id':
'019d4aa9-233b-70f2-a958-c3dbae1e30fa', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': None}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'model': 'gpt-5.3-codex', 'ephemeral':
False, 'initialization_mode': 'new', 'created_at': 1775074091,
'thread_source': 'subagent', 'subagent_source': 'thread_spawn',
'parent_thread_id': '019d4aa8-51ec-77e3-bafb-2c1b8e29e385'} | `

`INFO | 2026-04-01 13:08:41 | codex_backend.routers.analytics_events |
analytics_events.track_analytics_events:399 | Tracked
codex_thread_initialized event params={'thread_id':
'019d4aa9-94e3-75f1-8864-ff8ad0e55e1e', 'product_surface': 'codex',
'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name':
'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process',
'experimental_api_enabled': None}, 'runtime': {'codex_rs_version':
'0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0',
'runtime_arch': 'aarch64'}, 'model': 'gpt-5.3-codex', 'ephemeral':
False, 'initialization_mode': 'new', 'created_at': 1775074120,
'thread_source': 'subagent', 'subagent_source': 'review',
'parent_thread_id': None} | `

---------

Co-authored-by: jif-oai <jif@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-04-04 11:06:43 -07:00
Thibault Sottiaux
91ca49e53c [codex] allow disabling environment context injection (#16745)
This adds an `include_environment_context` config/profile flag that
defaults on, and guards both initial injection and later environment
updates to allow skipping injection of `<environment_context>`.
2026-04-03 18:06:52 -07:00
Ahmed Ibrahim
af8a9d2d2b remove temporary ownership re-exports (#16626)
Stacked on #16508.

This removes the temporary `codex-core` / `codex-login` re-export shims
from the ownership split and rewrites callsites to import directly from
`codex-model-provider-info`, `codex-models-manager`, `codex-api`,
`codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.

No behavior change intended; this is the mechanical import cleanup layer
split out from the ownership move.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-03 00:33:34 -07:00
Ahmed Ibrahim
6fff9955f1 extract models manager and related ownership from core (#16508)
## Summary
- split `models-manager` out of `core` and add `ModelsManagerConfig`
plus `Config::to_models_manager_config()` so model metadata paths stop
depending on `core::Config`
- move login-owned/auth-owned code out of `core` into `codex-login`,
move model provider config into `codex-model-provider-info`, move API
bridge mapping into `codex-api`, move protocol-owned types/impls into
`codex-protocol`, and move response debug helpers into a dedicated
`response-debug-context` crate
- move feedback tag emission into `codex-feedback`, relocate tests to
the crates that now own the code, and keep broad temporary re-exports so
this PR avoids a giant import-only rewrite

## Major moves and decisions
- created `codex-models-manager` as the owner for model
cache/catalog/config/model info logic, including the new
`ModelsManagerConfig` struct
- created `codex-model-provider-info` as the owner for provider config
parsing/defaults and kept temporary `codex-login`/`codex-core`
re-exports for old import paths
- moved `api_bridge` error mapping + `CoreAuthProvider` into
`codex-api`, while `codex-login::api_bridge` temporarily re-exports
those symbols and keeps the `auth_provider_from_auth` wrapper
- moved `auth_env_telemetry` and `provider_auth` ownership to
`codex-login`
- moved `CodexErr` ownership to `codex-protocol::error`, plus
`StreamOutput`, `bytes_to_string_smart`, and network policy helpers to
protocol-owned modules
- created `codex-response-debug-context` for
`extract_response_debug_context`, `telemetry_transport_error_message`,
and related response-debug plumbing instead of leaving that behavior in
`core`
- moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and
`emit_feedback_request_tags_with_auth_env` to `codex-feedback`
- deferred removal of temporary re-exports and the mechanical import
rewrites to a stacked follow-up PR so this PR stays reviewable

## Test moves
- moved auth refresh coverage from `core/tests/suite/auth_refresh.rs` to
`login/tests/suite/auth_refresh.rs`
- moved text encoding coverage from
`core/tests/suite/text_encoding_fix.rs` to
`protocol/src/exec_output_tests.rs`
- moved model info override coverage from
`core/tests/suite/model_info_overrides.rs` to
`models-manager/src/model_info_overrides_tests.rs`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-02 23:00:02 -07:00
Michael Bolin
7a3eec6fdb core: cut codex-core compile time 48% with native async SessionTask (#16631)
## Why

This continues the compile-time cleanup from #16630. `SessionTask`
implementations are monomorphized, but `Session` stores the task behind
a `dyn` boundary so it can drive and abort heterogenous turn tasks
uniformly. That means we can move the `#[async_trait]` expansion off the
implementation trait, keep a small boxed adapter only at the storage
boundary, and preserve the existing task lifecycle semantics while
reducing the amount of generated async-trait glue in `codex-core`.

One measurement caveat showed up while exploring this: a warm
incremental benchmark based on `touch core/src/tasks/mod.rs && cargo
check -p codex-core --lib` was basically flat, but that was the wrong
benchmark for this change. Using package-clean `codex-core` rebuilds,
like #16630, shows the real win.

Relevant pre-change code:

- [`SessionTask` with
`#[async_trait]`](3c7f013f97/codex-rs/core/src/tasks/mod.rs (L129-L182))
- [`RunningTask` storing `Arc<dyn
SessionTask>`](3c7f013f97/codex-rs/core/src/state/turn.rs (L69-L77))

## What changed

- Switched `SessionTask::{run, abort}` to native RPITIT futures with
explicit `Send` bounds.
- Added a private `AnySessionTask` adapter that boxes those futures only
at the `Arc<dyn ...>` storage boundary.
- Updated `RunningTask` to store `Arc<dyn AnySessionTask>` and removed
`#[async_trait]` from the concrete task impls plus test-only
`SessionTask` impls.

## Timing

Benchmarked package-clean `codex-core` rebuilds with dependencies left
warm:

```shell
cargo check -p codex-core --lib >/dev/null
cargo clean -p codex-core >/dev/null
/usr/bin/time -p cargo +nightly rustc -p codex-core --lib -- \
  -Z time-passes \
  -Z time-passes-format=json >/dev/null
```

| revision | rustc `total` | process `real` | `generate_crate_metadata`
| `MIR_borrow_checking` | `monomorphization_collector_graph_walk` |
| --- | ---: | ---: | ---: | ---: | ---: |
| parent `3c7f013f9735` | 67.21s | 67.71s | 24.61s | 23.43s | 22.43s |
| this PR `2cafd783ac22` | 35.08s | 35.60s | 8.01s | 7.25s | 7.15s |
| delta | -47.8% | -47.4% | -67.5% | -69.1% | -68.1% |

For completeness, the warm touched-file benchmark stayed flat (`1.96s`
parent vs `1.97s` this PR), which is why that benchmark should not be
used to evaluate this refactor.

## Verification

- Ran `cargo test -p codex-core`; this change compiled and task-related
tests passed before hitting the same unrelated 5
`config::tests::*guardian*` failures already present on the parent
stack.
2026-04-02 23:39:56 +00:00
jif-oai
ab6cce62b8 chore: rework state machine further (#16567) 2026-04-02 16:15:28 +02:00
jif-oai
e47ed5e57f fix: races in end of turn (#16566) 2026-04-02 15:55:55 +02:00
jif-oai
627299c551 fix: race pending (#16561) 2026-04-02 15:31:30 +02:00
Michael Bolin
c1d18ceb6f [codex] Remove codex-core config type shim (#16529)
## Why

This finishes the config-type move out of `codex-core` by removing the
temporary compatibility shim in `codex_core::config::types`. Callers now
depend on `codex-config` directly, which keeps these config model types
owned by the config crate instead of re-expanding `codex-core` as a
transitive API surface.

## What Changed

- Removed the `codex-rs/core/src/config/types.rs` re-export shim and the
`core::config::ApprovalsReviewer` re-export.
- Updated `codex-core`, `codex-cli`, `codex-tui`, `codex-app-server`,
`codex-mcp-server`, and `codex-linux-sandbox` call sites to import
`codex_config::types` directly.
- Added explicit `codex-config` dependencies to downstream crates that
previously relied on the `codex-core` re-export.
- Regenerated `codex-rs/core/config.schema.json` after updating the
config docs path reference.
2026-04-02 01:19:44 -07:00
Michael Bolin
aa2403e2eb core: remove cross-crate re-exports from lib.rs (#16512)
## Why

`codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
which made downstream crates depend on `codex-core` as a proxy module
instead of the actual owner crate.

Removing those forwards makes crate boundaries explicit and lets leaf
crates drop unnecessary `codex-core` dependencies. In this PR, this
reduces the dependency on `codex-core` to `codex-login` in the following
files:

```
codex-rs/backend-client/Cargo.toml
codex-rs/mcp-server/tests/common/Cargo.toml
```

## What

- Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
`codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
`codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
`codex-tools`, and `codex-utils-path`.
- Delete the `default_client` forwarding shim in `codex-rs/core`.
- Update in-crate and downstream callsites to import directly from the
owning `codex-*` crate.
- Add direct Cargo dependencies where callsites now target the owner
crate, and remove `codex-core` from `codex-rs/backend-client`.
2026-04-01 23:06:24 -07:00
Ahmed Ibrahim
59b68f5519 Extract MCP into codex-mcp crate (#15919)
- Split MCP runtime/server code out of `codex-core` into the new
`codex-mcp` crate. New/moved public structs/types include `McpConfig`,
`McpConnectionManager`, `ToolInfo`, `ToolPluginProvenance`,
`CodexAppsToolsCacheKey`, and the `McpManager` API
(`codex_mcp::mcp::McpManager` plus the `codex_core::mcp::McpManager`
wrapper/shim). New/moved functions include `with_codex_apps_mcp`,
`configured_mcp_servers`, `effective_mcp_servers`,
`collect_mcp_snapshot`, `collect_mcp_snapshot_from_manager`,
`qualified_mcp_tool_name_prefix`, and the MCP auth/skill-dependency
helpers. Why: this creates a focused MCP crate boundary and shrinks
`codex-core` without forcing every consumer to migrate in the same PR.

- Move MCP server config schema and persistence into `codex-config`.
New/moved structs/enums include `AppToolApproval`,
`McpServerToolConfig`, `McpServerConfig`, `RawMcpServerConfig`,
`McpServerTransportConfig`, `McpServerDisabledReason`, and
`codex_config::ConfigEditsBuilder`. New/moved functions include
`load_global_mcp_servers` and
`ConfigEditsBuilder::replace_mcp_servers`/`apply`. Why: MCP TOML
parsing/editing is config ownership, and this keeps config
validation/round-tripping (including per-tool approval overrides and
inline bearer-token rejection) in the config crate instead of
`codex-core`.

- Rewire `codex-core`, app-server, and plugin call sites onto the new
crates. Updated `Config::to_mcp_config(&self, plugins_manager)`,
`codex-rs/core/src/mcp.rs`, `codex-rs/core/src/connectors.rs`,
`codex-rs/core/src/codex.rs`,
`CodexMessageProcessor::list_mcp_server_status_task`, and
`utils/plugins/src/mcp_connector.rs` to build/pass the new MCP
config/runtime types. Why: plugin-provided MCP servers still merge with
user-configured servers, and runtime auth (`CodexAuth`) is threaded into
`with_codex_apps_mcp` / `collect_mcp_snapshot` explicitly so `McpConfig`
stays config-only.
2026-04-01 19:03:26 -07:00
jif-oai
213756c9ab feat: add mailbox concept for wait (#16010)
Add a mailbox we can use for inter-agent communication
`wait` is now based on it and don't take target anymore
2026-03-30 11:47:20 +02:00
Michael Bolin
61dfe0b86c chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
## Why

`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.

This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.

## What changed

- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`

That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.

## Validation

- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`

## Follow-up

- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.
2026-03-27 19:00:44 -07:00
Celia Chen
dd30c8eedd chore: refactor network permissions to use explicit domain and unix socket rule maps (#15120)
## Summary

This PR replaces the legacy network allow/deny list model with explicit
rule maps for domains and unix sockets across managed requirements,
permissions profiles, the network proxy config, and the app server
protocol.

Concretely, it:

- introduces typed domain (`allow` / `deny`) and unix socket permission
(`allow` / `none`) entries instead of separate `allowed_domains`,
`denied_domains`, and `allow_unix_sockets` lists
- updates config loading, managed requirements merging, and exec-policy
overlays to read and upsert rule entries consistently
- exposes the new shape through protocol/schema outputs, debug surfaces,
and app-server config APIs
- rejects the legacy list-based keys and updates docs/tests to reflect
the new config format

## Why

The previous representation split related network policy across multiple
parallel lists, which made merging and overriding rules harder to reason
about. Moving to explicit keyed permission maps gives us a single source
of truth per host/socket entry, makes allow/deny precedence clearer, and
gives protocol consumers access to the full rule state instead of
derived projections only.

## Backward Compatibility

### Backward compatible

- Managed requirements still accept the legacy
`experimental_network.allowed_domains`,
`experimental_network.denied_domains`, and
`experimental_network.allow_unix_sockets` fields. They are normalized
into the new canonical `domains` and `unix_sockets` maps internally.
- App-server v2 still deserializes legacy `allowedDomains`,
`deniedDomains`, and `allowUnixSockets` payloads, so older clients can
continue reading managed network requirements.
- App-server v2 responses still populate `allowedDomains`,
`deniedDomains`, and `allowUnixSockets` as legacy compatibility views
derived from the canonical maps.
- `managed_allowed_domains_only` keeps the same behavior after
normalization. Legacy managed allowlists still participate in the same
enforcement path as canonical `domains` entries.

### Not backward compatible

- Permissions profiles under `[permissions.<profile>.network]` no longer
accept the legacy list-based keys. Those configs must use the canonical
`[domains]` and `[unix_sockets]` tables instead of `allowed_domains`,
`denied_domains`, or `allow_unix_sockets`.
- Managed `experimental_network` config cannot mix canonical and legacy
forms in the same block. For example, `domains` cannot be combined with
`allowed_domains` or `denied_domains`, and `unix_sockets` cannot be
combined with `allow_unix_sockets`.
- The canonical format can express explicit `"none"` entries for unix
sockets, but those entries do not round-trip through the legacy
compatibility fields because the legacy fields only represent allow/deny
lists.
## Testing
`/target/debug/codex sandbox macos --log-denials /bin/zsh -c 'curl
https://www.example.com' ` gives 200 with config
```
[permissions.workspace.network.domains]
"www.example.com" = "allow"
```
and fails when set to deny: `curl: (56) CONNECT tunnel failed, response
403`.

Also tested backward compatibility path by verifying that adding the
following to `/etc/codex/requirements.toml` works:
```
[experimental_network]
allowed_domains = ["www.example.com"]
```
2026-03-27 06:17:59 +00:00
rreichel3-oai
86764af684 Protect first-time project .codex creation across Linux and macOS sandboxes (#15067)
## Problem

Codex already treated an existing top-level project `./.codex` directory
as protected, but there was a gap on first creation.

If `./.codex` did not exist yet, a turn could create files under it,
such as `./.codex/config.toml`, without going through the same approval
path as later modifications. That meant the initial write could bypass
the intended protection for project-local Codex state.

## What this changes

This PR closes that first-creation gap in the Unix enforcement layers:

- `codex-protocol`
- treat the top-level project `./.codex` path as a protected carveout
even when it does not exist yet
- avoid injecting the default carveout when the user already has an
explicit rule for that exact path
- macOS Seatbelt
- deny writes to both the exact protected path and anything beneath it,
so creating `./.codex` itself is blocked in addition to writes inside it
- Linux bubblewrap
- preserve the same protected-path behavior for first-time creation
under `./.codex`
- tests
- add protocol regressions for missing `./.codex` and explicit-rule
collisions
- add Unix sandbox coverage for blocking first-time `./.codex` creation
  - tighten Seatbelt policy assertions around excluded subpaths

## Scope

This change is intentionally scoped to protecting the top-level project
`.codex` subtree from agent writes.

It does not make `.codex` unreadable, and it does not change the product
behavior around loading project skills from `.codex` when project config
is untrusted.

## Why this shape

The fix is pointed rather than broad:
- it preserves the current model of “project `.codex` is protected from
writes”
- it closes the security-relevant first-write hole
- it avoids folding a larger permissions-model redesign into this PR

## Validation

- `cargo test -p codex-protocol`
- `cargo test -p codex-sandboxing seatbelt`
- `cargo test -p codex-exec --test all
sandbox_blocks_first_time_dot_codex_creation -- --nocapture`

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-03-26 16:06:53 -04:00
Michael Bolin
01fa4f0212 core: remove special execve handling for skill scripts (#15812) 2026-03-26 07:46:04 -07:00
pakrym-oai
8fa88fa8ca Add cached environment manager for exec server URL (#15785)
Add environment manager that is a singleton and is created early in
app-server (before skill manager, before config loading).

Use an environment variable to point to a running exec server.
2026-03-25 16:14:36 -07:00
Ahmed Ibrahim
9dbe098349 Extract codex-core-skills crate (#15749)
## Summary
- move skill loading and management into codex-core-skills
- leave codex-core with the thin integration layer and shared wiring

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 12:57:42 -07:00
Ahmed Ibrahim
d273efc0f3 Extract codex-analytics crate (#15748)
## Summary
- move the analytics events client into codex-analytics
- update codex-core and app-server callsites to use the new crate

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-25 11:08:05 -07:00
pakrym-oai
504aeb0e09 Use AbsolutePathBuf for cwd state (#15710)
Migrate `cwd` and related session/config state to `AbsolutePathBuf` so
downstream consumers consistently see absolute working directories.

Add test-only `.abs()` helpers for `Path`, `PathBuf`, and `TempDir`, and
update branch-local tests to use them instead of
`AbsolutePathBuf::try_from(...)`.

For the remaining TUI/app-server snapshot coverage that renders absolute
cwd values, keep the snapshots unchanged and skip the Windows-only cases
where the platform-specific absolute path layout differs.
2026-03-25 16:02:22 +00:00
Charley Cunningham
d72fa2a209 [codex] Defer fork context injection until first turn (#15699)
## Summary
- remove the fork-startup `build_initial_context` injection
- keep the reconstructed `reference_context_item` as the fork baseline
until the first real turn
- update fork-history tests and the request snapshot, and add a
`TODO(ccunningham)` for remaining nondiffable initial-context inputs

## Why
Fork startup was appending current-session initial context immediately
after reconstructing the parent rollout, then the first real turn could
emit context updates again. That duplicated model-visible context in the
child rollout.

## Impact
Forked sessions now behave like resume for context seeding: startup
reconstructs history and preserves the prior baseline, and the first
real turn handles any current-session context emission.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-24 18:34:44 -07:00
Ahmed Ibrahim
062fa7a2bb Move string truncation helpers into codex-utils-string (#15572)
- move the shared byte-based middle truncation logic from `core` into
`codex-utils-string`
- keep token-specific truncation in `codex-core` so rollout can reuse
the shared helper in the next stacked PR

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-24 15:45:40 -07:00
Ruslan Nigmatullin
daf5e584c2 core: Make FileWatcher reusable (#15093)
### Summary
Make `FileWatcher` a reusable core component which can be built upon.
Extract skills-related logic into a separate `SkillWatcher`.
Introduce a composable `ThrottledWatchReceiver` to throttle filesystem
events, coalescing affected paths among them.

### Testing
Updated existing unit tests.
2026-03-24 11:04:47 -07:00
Charley Cunningham
910cf49269 [codex] Stabilize second compaction history test (#15605)
## Summary
- replace the second-compaction test fixtures with a single ordered
`/responses` sequence
- assert against the real recorded request order instead of aggregating
per-mock captures
- realign the second-summary assertion to the first post-compaction user
turn where the summary actually appears

## Root cause
`compact_resume_after_second_compaction_preserves_history` collected
requests from multiple `mount_sse_once_match` recorders. Overlapping
matchers could record the same HTTP request more than once, so the test
indexed into a duplicated synthetic list rather than the true request
stream. That made the summary assertion depend on matcher evaluation
order and platform-specific behavior.

## Impact
- makes the flaky test deterministic by removing duplicate request
capture from the assertion path
- keeps the change scoped to the test only

## Validation
- `just fmt`
- `just argument-comment-lint`
- `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core
compact_resume_after_second_compaction_preserves_history -- --nocapture`
- repeated the same targeted test 10 times

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-24 10:14:21 -07:00
Dylan Hurd
67c1c7c054 chore(core) Add approvals reviewer to UserTurn (#15426)
## Summary
Adds support for approvals_reviewer to `Op::UserTurn` so we can migrate
`[CodexMessageProcessor::turn_start]` to use Op::UserTurn

## Testing
- [x] Adds quick test for the new field

Co-authored-by: Codex <noreply@openai.com>
2026-03-23 15:19:01 -07:00
jif-oai
37ac0c093c feat: structured multi-agent output (#15515)
Send input now sends messages as assistant message and with this format:

```
author: /root/worker_a
recipient: /root/worker_a/tester
other_recipients: []
Content: bla bla bla. Actual content. Only text for now
```
2026-03-23 18:53:54 +00:00
Charley Cunningham
e838645fa2 tui: queue follow-ups during manual /compact (#15259)
## Summary
- queue input after the user submits `/compact` until that manual
compact turn ends
- mirror the same behavior in the app-server TUI
- add regressions for input queued before compact starts and while it is
running

Co-authored-by: Codex <noreply@openai.com>
2026-03-23 10:19:44 -07:00
Charley Cunningham
85065ea1b8 core: snapshot fork startup context injection (#15443)
## Summary
- add a snapshot-style core test for fork startup context injection
followed by first-turn diff injection
- capture the current duplicated startup-plus-turn context behavior
without changing runtime logic

## Testing
- not run locally; relying on CI
- just fmt

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-22 18:24:14 -07:00
Matthew Zeng
06e06ab173 [plugins] Fix plugin explicit mention context management. (#15372)
- [x] Fix plugin explicit mention context management.
2026-03-21 00:29:29 -07:00
Won Park
461ba012fc Feat/restore image generation history (#15223)
Restore image generation items in resumed thread history
2026-03-19 22:57:16 -07:00
Ahmed Ibrahim
2e22885e79 Split features into codex-features crate (#15253)
- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.

Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
2026-03-19 20:12:07 -07:00