codex

mirror of https://github.com/openai/codex.git synced 2026-04-30 09:26:44 +00:00

Author	SHA1	Message	Date
Michael Bolin	b4e9baaaff	sandboxing: plumb split sandbox policies through runtime	2026-03-06 13:03:45 -08:00
Michael Bolin	929eeaf2c9	config: add v3 filesystem permission profiles	2026-03-06 13:03:45 -08:00
Michael Bolin	488875f24d	fix: move unit tests in codex-rs/core/src/codex.rs into their own file (#13783 ) This is analogous to https://github.com/openai/codex/pull/13780.	2026-03-06 11:56:49 -08:00
Michael Bolin	39869f7443	fix: move unit tests in codex-rs/core/src/config/mod.rs into their own file (#13780 ) At over 7,000 lines, `codex-rs/core/src/config/mod.rs` was getting a bit unwieldy. This PR does the same type of move as https://github.com/openai/codex/pull/12957 to put unit tests in their own file, though I decided `config_tests.rs` is a more intuitive name than `mod_tests.rs`. Ultimately, I'll codemod the rest of the codebase to follow suit, but I want to do it in stages to reduce merge conflicts for people.	2026-03-06 11:21:58 -08:00
sayan-oai	8a54d3caaa	feat: structured plugin parsing (#13711 ) #### What Add structured `@plugin` parsing and TUI support for plugin mentions. - Core: switch from plain-text `@display_name` parsing to structured `plugin://...` mentions via `UserInput::Mention` and `[$...](plugin://...)` links in text, same pattern as apps/skills. - TUI: add plugin mention popup, autocomplete, and chips when typing `$`. Load plugin capability summaries and feed them into the composer; plugin mentions appear alongside skills and apps. - Generalize mention parsing to a sigil parameter, which still defaults to `$`. Builds on #13510. Currently clients have to build their own `id` via `plugin@marketplace` and filter plugins to show by `enabled`, but we will add `id` and `available` as fields returned from `plugin/list` soon. #### Tests Added tests, verified locally.	2026-03-06 11:08:36 -08:00
jif-oai	0e41a5c4a8	chore: improve DB flushing (#13620 ) This branch: * Avoids flushing the DB when not necessary * Filters the events for which we perform an `upsert` into the DB * Adds a dedicated, lighter update function for `thread:updated_at` This should significantly reduce DB lock contention. If it is not sufficient, we can de-sync the DB flush for `updated_at`.	2026-03-06 19:58:14 +01:00
Owen Lin	3449e00bc9	feat(otel, core): record turn TTFT and TTFM metrics in codex-core (#13630 ) ### Summary This adds turn-level latency metrics for the first model output and the first completed agent message. - `codex.turn.ttft.duration_ms` starts at turn start and records on the first output signal we see from the model. That includes normal assistant text, reasoning deltas, and non-text outputs like tool-call items. - `codex.turn.ttfm.duration_ms` also starts at turn start, but it records when the first agent message finishes streaming rather than when its first delta arrives. ### Implementation notes The timing is tracked in codex-core, not app-server, so the definition stays consistent across CLI, TUI, and app-server clients. I reused the existing turn lifecycle boundary that already drives `codex.turn.e2e_duration_ms`, stored the turn start timestamp in turn state, and record each metric once per turn. I also wired the new metric names into the OTEL runtime metrics summary so they show up in the same in-memory/debug snapshot path as the existing timing metrics.	2026-03-06 10:23:48 -08:00
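The record-once-per-turn behavior described in the TTFT/TTFM commit above can be sketched roughly as follows. This is a minimal illustration, not the codex-core implementation: `TurnLatency` and its method names are hypothetical, and the caller is assumed to pass in timestamps and forward any returned duration to the real metric (`codex.turn.ttft.duration_ms` or `codex.turn.ttfm.duration_ms`).

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-turn latency tracker (names are illustrative only).
pub struct TurnLatency {
    started_at: Instant,
    ttft: Option<Duration>,
    ttfm: Option<Duration>,
}

impl TurnLatency {
    /// Both metrics share the same starting point: the start of the turn.
    pub fn start(now: Instant) -> Self {
        Self { started_at: now, ttft: None, ttfm: None }
    }

    /// Call on the first output signal from the model (assistant text,
    /// reasoning delta, or a non-text item such as a tool call).
    /// Returns the duration only the first time, so it is recorded once per turn.
    pub fn on_first_output(&mut self, now: Instant) -> Option<Duration> {
        if self.ttft.is_some() {
            return None;
        }
        let elapsed = now.duration_since(self.started_at);
        self.ttft = Some(elapsed);
        Some(elapsed)
    }

    /// Call when an agent message finishes streaming (not on its first delta).
    /// Returns the duration only for the first completed message of the turn.
    pub fn on_agent_message_completed(&mut self, now: Instant) -> Option<Duration> {
        if self.ttfm.is_some() {
            return None;
        }
        let elapsed = now.duration_since(self.started_at);
        self.ttfm = Some(elapsed);
        Some(elapsed)
    }
}
```

Returning `Option<Duration>` keeps the "emit at most once" decision in one place, so callers can report a metric whenever they get `Some(..)` without tracking state themselves.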
Charley Cunningham	cb1a182bbe	Clarify sandbox permission override helper semantics (#13703 ) ## Summary Today `SandboxPermissions::requires_additional_permissions()` does not actually mean "is `WithAdditionalPermissions`". It returns `true` for any non-default sandbox override, including `RequireEscalated`. That broad behavior is relied on in multiple `main` callsites. The naming is security-sensitive because `SandboxPermissions` is used on shell-like tool calls to tell the executor how a single command should relate to the turn sandbox: - `UseDefault`: run with the turn sandbox unchanged - `RequireEscalated`: request execution outside the sandbox - `WithAdditionalPermissions`: stay sandboxed but widen permissions for that command only ## Problem The old helper name reads as if it only applies to the `WithAdditionalPermissions` variant. In practice it means "this command requested any explicit sandbox override." That ambiguity made it easy to read production checks incorrectly and made the guardian change look like a standalone `main` fix when it is not. On `main` today: - `shell` and `unified_exec` intentionally reject any explicit `sandbox_permissions` request unless approval policy is `OnRequest` - `exec_policy` intentionally treats any explicit sandbox override as prompt-worthy in restricted sandboxes - tests intentionally serialize both `RequireEscalated` and `WithAdditionalPermissions` as explicit sandbox override requests So changing those callsites from the broad helper to a narrow `WithAdditionalPermissions` check would be a behavior change, not a pure cleanup. 
## What This PR Does - documents `SandboxPermissions` as a per-command sandbox override, not a generic permissions bag - adds `requests_sandbox_override()` for the broad meaning: anything except `UseDefault` - adds `uses_additional_permissions()` for the narrow meaning: only `WithAdditionalPermissions` - keeps `requires_additional_permissions()` as a compatibility alias to the broad meaning for now - updates the current broad callsites to use the accurately named broad helper - adds unit coverage that locks in the semantics of all three helpers ## What This PR Does Not Do This PR does not change runtime behavior. That is intentional. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-06 09:57:48 -08:00
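The three helper meanings described in the commit above can be illustrated with a simplified stand-in. The variant and method names come from the PR description; everything else is an assumption (the real `SandboxPermissions` in codex-rs may carry data, take `&self`, or differ in other ways).

```rust
/// Simplified stand-in for the per-command sandbox override.
/// Assumption: unit variants only; the real type may carry payloads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxPermissions {
    /// Run with the turn sandbox unchanged.
    UseDefault,
    /// Request execution outside the sandbox.
    RequireEscalated,
    /// Stay sandboxed but widen permissions for this command only.
    WithAdditionalPermissions,
}

impl SandboxPermissions {
    /// Broad meaning: anything except `UseDefault`.
    pub fn requests_sandbox_override(self) -> bool {
        !matches!(self, Self::UseDefault)
    }

    /// Narrow meaning: only `WithAdditionalPermissions`.
    pub fn uses_additional_permissions(self) -> bool {
        matches!(self, Self::WithAdditionalPermissions)
    }

    /// Compatibility alias that keeps the broad meaning.
    pub fn requires_additional_permissions(self) -> bool {
        self.requests_sandbox_override()
    }
}
```

The case worth noticing is `RequireEscalated`: the broad helper and the compatibility alias return `true` for it, while the narrow helper returns `false`.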
jif-oai	f891f516a5	feat: drop discrepancy metrics (#13753 )	2026-03-06 18:32:25 +01:00
jif-oai	fa16c26908	feat: drop sqlite db feature flag (#13750 )	2026-03-06 17:57:52 +01:00
jif-oai	5d4303510c	fix: windows normalization (#13742 )	2026-03-06 15:50:44 +01:00
jif-oai	8ad768eb76	feat: prune old memories in DB (#13734 ) To save memory	2026-03-06 15:10:49 +01:00
Matthew Zeng	98dca99db7	[elicitations] Switch to use MCP style elicitation payload for mcp tool approvals. (#13621 ) - [x] Switch to use MCP style elicitation payload for mcp tool approvals. - [ ] TODO: Update the UI to support the full spec.	2026-03-06 01:50:26 -08:00
Won Park	ee1a20258a	Enabling CWD Saving for Image-Gen (#13607 ) Codex now saves the generated image to your current working directory.	2026-03-06 00:47:21 -08:00
sayan-oai	014a59fb0b	check app auth in plugin/install (#13685 ) #### What on `plugin/install`, check if installed apps are already authed on chatgpt, and return list of all apps that are not. clients can use this list to trigger auth workflows as needed. checks are best effort based on `codex_apps` loading, much like `app/list`. #### Tests Added integration tests, tested locally.	2026-03-06 06:45:00 +00:00
Dylan Hurd	4c9b1c38f6	fix(tui) remove config check for trusted setting (#11874 ) ## Summary Simplify the trusted directory flow. This logic was originally designed several months ago, to determine if codex should start in read-only or workspace-write mode. However, that's no longer the purpose of directory trust - and therefore we should get rid of this logic. ## Testing - [x] Unit tests pass	2026-03-05 22:29:34 -08:00
iceweasel-oai	14de492985	copy current exe to CODEX_HOME/.sandbox-bin for apply_patch (#13669 ) We do this for codex-command-runner.exe as well for the same reason. Windows sandbox users cannot execute binaries in the WindowsApp/ installed directory for the Codex App. This causes apply-patch to fail because it tries to execute codex.exe as the sandbox user.	2026-03-05 22:15:10 -08:00
viyatb-oai	6a79ed5920	refactor: remove proxy admin endpoint (#13687 ) ## Summary - delete the network proxy admin server and its runtime listener/task plumbing - remove the admin endpoint config, runtime, requirement, protocol, schema, and debug-surface fields - update proxy docs to reflect the remaining HTTP and SOCKS listeners only	2026-03-05 22:03:16 -08:00
xl-openai	520ed724d2	support plugin/list. (#13540 ) Introduce a plugin/list which reads from local marketplace.json. Also update the signature for plugin/install.	2026-03-05 21:58:50 -05:00
Ahmed Ibrahim	629cb15bc6	Replay thread rollback from rollout history (#13615 ) - Replay thread rollback from the persisted rollout history instead of truncating in-memory state. - Add rollback coverage, including rollback-behind-compaction snapshot coverage.	2026-03-05 16:40:09 -08:00
Ahmed Ibrahim	6cf0ed4e79	Refine realtime startup context formatting (#13560 ) ## Summary - group recent work by git repo when available, otherwise by directory - render recent work as bounded user asks with per-thread cwd context - exclude hidden files and directories from workspace trees	2026-03-05 16:31:20 -08:00
Owen Lin	c3736cff0a	feat(otel): safe tracing (#13626 ) ### Motivation Today config.toml has three different OTEL knobs under `[otel]`: - `exporter` controls where OTEL logs go - `trace_exporter` controls where OTEL traces go - `metrics_exporter` controls where metrics go Those often (pretty much always?) serve different purposes. For example, for OpenAI internal usage, the log exporter is already being used for IT/security telemetry, and that use case is intentionally content-rich: tool calls, arguments, outputs, MCP payloads, and in some cases user content are all useful there. `log_user_prompt` is a good example of that distinction. When it’s enabled, we include raw prompt text in OTEL logs, which is acceptable for the security use case. The trace exporter is a different story. The goal there is to give OpenAI engineers visibility into latency and request behavior when they run Codex locally, without sending sensitive prompt or tool data as trace event data. In other words, traces should help answer “what was slow?” or “where did time go?”, not “what did the user say?” or “what did the tool return?” The complication is that Rust’s `tracing` crate does not make a hard distinction between “logs” and “trace events.” It gives us one instrumentation API for logs and trace events (via `tracing::event!`), and subscribers decide what gets treated as logs, trace events, or both. Before this change, our OTEL trace layer was effectively attached to the general tracing stream, which meant turning on `trace_exporter` could pick up content-rich events that were originally written with logging (and the `log_exporter`) in mind. That made it too easy for sensitive data to end up in exported traces by accident. ### Concrete example In `otel_manager.rs`, this `tracing::event!` call would be exported in both logs AND traces (as a trace event).
```
pub fn user_prompt(&self, items: &[UserInput]) {
    let prompt = items
        .iter()
        .flat_map(|item| match item {
            UserInput::Text { text, .. } => Some(text.as_str()),
            _ => None,
        })
        .collect::<String>();
    let prompt_to_log = if self.metadata.log_user_prompts {
        prompt.as_str()
    } else {
        "[REDACTED]"
    };
    tracing::event!(
        tracing::Level::INFO,
        event.name = "codex.user_prompt",
        event.timestamp = %timestamp(),
        // ...
        prompt = %prompt_to_log,
    );
}
```
Instead of `tracing::event!`, we should now use `log_event!` and `trace_event!` to indicate more clearly which sink (logs vs. traces) an event should be exported to. ### What changed This PR makes the log and trace export distinct instead of treating them as two sinks for the same data. On the provider side, OTEL logs and traces now have separate routing/filtering policy. The log exporter keeps receiving the existing `codex_otel` events, while trace export is limited to spans and trace events. On the event side, `OtelManager` now emits two flavors of telemetry where needed: - a log-only event with the current rich payloads - a tracing-safe event with summaries only It also has a convenience `log_and_trace_event!` macro for emitting to both logs and traces when it's safe to do so, as well as log- and trace-specific fields. That means prompts, tool args, tool output, account email, MCP metadata, and similar content stay in the log lane, while traces get the pieces that are actually useful for performance work: durations, counts, sizes, status, token counts, tool origin, and normalized error classes. This preserves current IT/security logging behavior while making it safe to turn on trace export for employees. 
### Full list of things removed from trace export - raw user prompt text from `codex.user_prompt` - raw tool arguments and output from `codex.tool_result` - MCP server metadata from `codex.tool_result` (mcp_server, mcp_server_origin) - account identity fields like `user.email` and `user.account_id` from trace-safe OTEL events - `host.name` from trace resources - generic `codex.tool_decision` events from traces - generic `codex.sse_event` events from traces - the full ToolCall debug payload from the `handle_tool_call` span What traces now keep instead is mostly: - spans - trace-safe OTEL events - counts, lengths, durations, status, token counts, and tool origin summaries	2026-03-05 16:30:53 -08:00
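The split between a content-rich log lane and a summaries-only trace lane, as described in the commit above, can be sketched without any OTEL machinery. This is only an illustration of the idea: `ToolResult`, `TraceSafeToolResult`, and `trace_safe` are hypothetical names, not types from `codex_otel`, and the real `log_event!` / `trace_event!` macros are not modeled here.

```rust
/// Hypothetical rich payload, suitable only for the log lane.
pub struct ToolResult<'a> {
    pub arguments: &'a str,
    pub output: &'a str,
    pub duration_ms: u64,
    pub success: bool,
}

/// Hypothetical trace-safe summary: sizes, duration, and status, but no content.
#[derive(Debug, PartialEq, Eq)]
pub struct TraceSafeToolResult {
    pub arguments_len: usize,
    pub output_len: usize,
    pub duration_ms: u64,
    pub success: bool,
}

impl ToolResult<'_> {
    /// Derive the summary that would be attached to a trace event.
    /// Raw arguments and output are reduced to their byte lengths.
    pub fn trace_safe(&self) -> TraceSafeToolResult {
        TraceSafeToolResult {
            arguments_len: self.arguments.len(),
            output_len: self.output.len(),
            duration_ms: self.duration_ms,
            success: self.success,
        }
    }
}
```

Building the summary as a separate type means code on the trace path has no field through which raw arguments or output could be exported by accident.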
Ahmed Ibrahim	3ff618b493	Update models.json (#13617 ) - Update `models.json` to surface the new model entry. - Refresh the TUI model picker snapshot to match the updated catalog ordering. --------- Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>	2026-03-05 16:22:39 -08:00
Celia Chen	aaefee04cd	core/protocol: add structured macOS additional permissions and merge them into sandbox execution (#13499 ) ## Summary - Introduce strongly-typed macOS additional permissions across protocol/core/app-server boundaries. - Merge additional permissions into effective sandbox execution, including macOS seatbelt profile extensions. - Expand docs, schema/tool definitions, UI rendering, and tests for `network`, `file_system`, and `macos` additional permissions.	2026-03-05 16:21:45 -08:00
sayan-oai	4e77ea0ec7	add @plugin mentions (#13510 ) ## Note: this adds plugin mentions via `@`, which conflicts with file mentions. Depends on and builds upon #13433. - introduces explicit `@plugin` mentions. this injects the plugin's mcp servers, app names, and skill name format into turn context as a dev message. - we do not yet have UI for these mentions, so we currently parse raw text (as opposed to skills and apps which have UI chips, autocomplete, etc.) this depends on a `plugins/list` app-server endpoint we can feed the UI with, which is upcoming - also annotate mcp and app tool descriptions with the plugin(s) they come from. this gives the model a first class way of understanding what tools come from which plugins, which will help implicit invocation. ### Tests Added and updated tests, unit and integration. Also confirmed locally a raw `@plugin` injects the dev message, and the model knows about its apps, mcps, and skills.	2026-03-06 00:03:39 +00:00
Curtis 'Fjord' Hawthorne	1ed542bf31	Clarify js_repl image emission and encoding guidance (#13639 ) ## Summary This updates the `js_repl` prompt and docs to make the image guidance less confusing. ## What changed - Clarified that `codex.emitImage(...)` adds one image per call and can be called multiple times to emit multiple images. - Reworded the image-encoding guidance to be general `js_repl` advice instead of `ImageDetailOriginal`-specific behavior. - Updated the guidance to recommend JPEG at about quality 85 when lossy compression is acceptable, and PNG when transparency or lossless detail matters. - Mirrored the same wording in the public `js_repl` docs.	2026-03-05 16:02:37 -08:00
viyatb-oai	9203f17b0e	Improve macOS Seatbelt network and unix socket handling (#12702 ) This improves macOS Seatbelt handling for sandboxed tool processes. ## Changes - Allow dual-stack local binding in proxy-managed sessions, while still keeping traffic limited to loopback and configured proxy endpoints. - Replace the old generic unix-socket path rule with explicit AF_UNIX permissions for socket creation, bind, and outbound connect. - Keep explicitly approved wrapper sockets connect-only. Local helper servers are less likely to fail when binding on macOS. Tools using local unix-socket IPC should work more reliably under the sandbox. Full-network sessions, proxy fail-closed behavior, and proxy lifecycle are unchanged.	2026-03-05 15:39:54 -08:00
Owen Lin	aa3fe8abf8	feat(core): persist trace_id for turns in RolloutItem::TurnContext (#13602 ) This PR adds a durable trace linkage for each turn by storing the active trace ID on the rollout TurnContext record stored in session rollout files. Before this change, we propagated trace context at runtime but didn’t persist a stable per-turn trace key in rollout history. That made after-the-fact debugging harder (for example, mapping a historical turn to the corresponding trace in datadog). This sets us up for much easier debugging in the future. ### What changed - Added an optional `trace_id` to TurnContextItem (rollout schema). - Added a small OTEL helper to read the current span trace ID. - Captured `trace_id` when creating `TurnContext` and included it in `to_turn_context_item()`. - Updated tests and fixtures that construct TurnContextItem so older/no-trace cases still work. ### Why this approach TurnContext is already the canonical durable per-turn metadata in rollout. This keeps ownership clean: trace linkage lives with other persisted turn metadata.	2026-03-05 13:26:48 -08:00
Curtis 'Fjord' Hawthorne	cfbbbb1dda	Harden js_repl emitImage to accept only data: URLs (#13507 ) ### Motivation - Prevent untrusted js_repl code from supplying arbitrary external URLs that the host would forward into model input and cause external fetches / data exfiltration. This change narrows the emitImage contract to safe, self-contained data URLs. ### Description - Kernel: added `normalizeEmitImageUrl` and enforce that string-valued `codex.emitImage(...)` inputs and `input_image`/content-item paths only accept non-empty `data:` URLs; byte-based paths still produce data URLs as before (`kernel.js`). - Host: added `validate_emitted_image_url` and check `EmitImage` requests before creating `FunctionCallOutputContentItem::InputImage`, returning an error to the kernel if the URL is not a `data:` URL (`mod.rs`). - Tests/docs: added a runtime test `js_repl_emit_image_rejects_non_data_url` to assert rejection of non-data URLs and updated user-facing docs/instruction text to state `data URL` support instead of generic direct image URLs (`mod.rs`, `docs/js_repl.md`, `project_doc.rs`). ### Testing - Ran `just fmt` in `codex-rs`; it completed successfully. - Added a runtime test (`cargo test -p codex-core js_repl_emit_image_rejects_non_data_url`) but executing the test in this environment failed due to a missing system dependency required by `codex-linux-sandbox` (the vendored `bubblewrap` build requires `libcap.pc` via `pkg-config`), so the test could not be run here. - Attempted a focused `cargo test` invocation with and without default features; both compile/test attempts were blocked by the same missing system `libcap` dependency in this environment. ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69a7837bce98832d91db92d5f76d6cbe)	2026-03-05 12:12:32 -08:00
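The host-side check described in the commit above might look roughly like this. The function name `validate_emitted_image_url` comes from the PR description, but the signature, error type, trimming, and case-insensitive scheme match are all assumptions made for illustration.

```rust
/// Sketch of a host-side guard that accepts only non-empty `data:` URLs.
/// Assumptions: `&str` in, `Result<&str, String>` out; the real helper may differ.
pub fn validate_emitted_image_url(url: &str) -> Result<&str, String> {
    let trimmed = url.trim();
    if trimmed.is_empty() {
        return Err("image URL must not be empty".to_string());
    }
    // URL schemes are case-insensitive, so compare the first five bytes
    // without regard to ASCII case. `get(..5)` returns `None` for short
    // strings or a non-character boundary, which is treated as a rejection.
    let is_data_url = trimmed
        .get(..5)
        .map_or(false, |prefix| prefix.eq_ignore_ascii_case("data:"));
    if is_data_url {
        Ok(trimmed)
    } else {
        // Avoid echoing the rejected URL back in the error message.
        Err("only data: URLs are accepted for emitted images".to_string())
    }
}
```

Rejecting everything that is not a self-contained `data:` URL means the host never forwards a URL that would trigger an external fetch.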
Celia Chen	a63624a61a	feat: merge skill permission profiles into the turn sandbox for zsh-fork execs (#13496 ) ## Summary This changes the Unix shell escalation path for skill-matched executables to apply a skill's `PermissionProfile` as additive permissions on top of the existing turn/request sandbox policy. Previously, skill-matched executables compiled the skill permission profile into a standalone sandbox policy and executed against that replacement policy. Now they go through the same `additional_permissions` merge path used elsewhere in shell sandbox preparation. ## What Changed - Changed `skill_escalation_execution()` to return `EscalationPermissions::PermissionProfile(...)` for non-empty skill permission profiles. - Kept empty or missing skill permission profiles on the `TurnDefault` path. - Added tests covering the new additive skill-permission behavior. - Added inline comments in `prepare_escalated_exec()` clarifying the difference between additive permission merging and fully specified replacement sandbox policies. - Removed the now-unused skill permission compiler module after switching this path away from standalone compiled skill sandbox policies. ## Testing - Ran `just fmt` in `codex-rs` - Ran `cargo test -p codex-core` `cargo test -p codex-core` still hits an unrelated existing failure: `shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin` ## Follow-up This change intentionally does not merge skill-specific macOS seatbelt profile extensions through the `additional_permissions` path yet. Filesystem and network permissions now follow the additive merge path, but seatbelt extension permissions still need separate handling in a follow-up PR.	2026-03-05 20:05:35 +00:00
Curtis 'Fjord' Hawthorne	657841e7f5	Persist initialized js_repl bindings after failed cells (#13482 ) ## Summary - Change `js_repl` failed-cell persistence so later cells keep prior bindings plus only the current-cell bindings whose initialization definitely completed before the throw. - Preserve initialized lexical bindings across failed cells via module-namespace readability, including top-level destructuring that partially succeeds before a later throw. - Preserve hoisted `var` and `function` bindings only when execution clearly reached their declaration site, and preserve direct top-level pre-declaration `var` writes and updates through explicit write-site markers. - Preserve top-level `for...in` / `for...of` `var` bindings when the loop body executes at least once, using a first-iteration guard to avoid per-iteration bookkeeping overhead. - Keep prior module state intact across link-time failures and evaluation failures before the prelude runs, while still allowing failed cells that already recreated prior bindings to persist updates to those existing bindings. - Hide internal commit hooks from user `js_repl` code after the prelude aliases them, so snippets cannot spoof committed bindings by calling the raw `import.meta` hooks directly. - Add focused regression coverage for the supported failed-cell behaviors and the intentionally unsupported boundaries. - Update `js_repl` docs and generated instructions to describe the new, narrower failed-cell persistence model. ## Motivation We saw `js_repl` drop bindings that had already been initialized successfully when a later statement in the same cell threw, for example: const { context: liveContext, session } = await initializeGoogleSheetsLiveForTab(tab); // later statement throws That was surprising in practice because successful earlier work disappeared from the next cell. This change makes failed-cell persistence more useful without trying to model every possible partially executed JavaScript edge case. 
The resulting behavior is narrower and easier to reason about: - prior bindings are always preserved - lexical bindings persist when their initialization completed before the throw - hoisted `var` / `function` bindings persist only when execution clearly reached their declaration or a supported top-level `var` write site - failed cells that already recreated prior bindings can persist writes to those existing bindings even if they introduce no new bindings The detailed edge-case matrix stays in `docs/js_repl.md`. The model-facing `project_doc` guidance is intentionally shorter and focused on generation-relevant behavior. ## Supported Failed-Cell Behavior - Prior bindings remain available after a failed cell. - Initialized lexical bindings remain available after a failed cell. - Top-level destructuring like `const { a, b } = ...` preserves names whose initialization completed before a later throw. - Hoisted `function` bindings persist when execution reached the declaration statement before the throw. - Direct top-level pre-declaration `var` writes and updates persist, for example: - `x = 1` - `x += 1` - `x++` - short-circuiting logical assignments only persist when the write branch actually runs - Non-empty top-level `for...in` / `for...of` `var` loops persist their loop bindings. - Failed cells can persist updates to existing carried bindings after the prelude has run, even when the cell commits no new bindings. - Link failures and eval failures before the prelude do not poison `@prev`. 
## Intentionally Unsupported Failed-Cell Cases - Hoisted function reads before the declaration, such as `foo(); ...; function foo() {}` - Aliasing or inference-based recovery from reads before declaration - Nested writes inside already-instrumented assignment RHS expressions - Destructuring-assignment recovery for hoisted `var` - Partial `var` destructuring recovery - Pre-declaration `undefined` reads for hoisted `var` - Empty top-level `for...in` / `for...of` loop vars - Nested or scope-sensitive pre-declaration `var` writes outside direct top-level expression statements	2026-03-05 11:01:46 -08:00
Owen Lin	926b2f19e8	feat(app-server): support mcp elicitations in v2 api (#13425 ) This adds a first-class server request for MCP server elicitations: `mcpServer/elicitation/request`. Until now, MCP elicitation requests only showed up as a raw `codex/event/elicitation_request` event from core. That made it hard for v2 clients to handle elicitations using the same request/response flow as other server-driven interactions (like shell and `apply_patch` tools). This also updates the underlying MCP elicitation request handling in core to pass through the full MCP request (including URL and form data) so we can expose it properly in app-server. ### Why not `item/mcpToolCall/elicitationRequest`? This is because MCP elicitations are related to MCP servers first, and only optionally to a specific MCP tool call. In the MCP protocol, elicitation is a server-to-client capability: the server sends `elicitation/create`, and the client replies with an elicitation result. RMCP models it that way as well. In practice an elicitation is often triggered by an MCP tool call, but not always. ### What changed - add `mcpServer/elicitation/request` to the v2 app-server API - translate core `codex/event/elicitation_request` events into the new v2 server request - map client responses back into `Op::ResolveElicitation` so the MCP server can continue - update app-server docs and generated protocol schema - add an end-to-end app-server test that covers the full round trip through a real RMCP elicitation flow - The new test exercises a realistic case where an MCP tool call triggers an elicitation, the app-server emits mcpServer/elicitation/request, the client accepts it, and the tool call resumes and completes successfully. ### app-server API flow - Client starts a thread with `thread/start`. - Client starts a turn with `turn/start`. - App-server sends `item/started` for the `mcpToolCall`. - While that tool call is in progress, app-server sends `mcpServer/elicitation/request`. 
- Client responds to that request with `{ action: "accept" | "decline" | "cancel" }`. - App-server sends `serverRequest/resolved`. - App-server sends `item/completed` for the mcpToolCall. - App-server sends `turn/completed`. - If the turn is interrupted while the elicitation is pending, app-server still sends `serverRequest/resolved` before the turn finishes.	2026-03-05 07:20:20 -08:00
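The three client responses in the flow above (`accept`, `decline`, `cancel`) map naturally onto an enum. The following is a minimal sketch with a hand-written parser; `ElicitationAction` is a hypothetical name, and the real app-server protocol types are generated from a schema and are not reproduced here.

```rust
use std::str::FromStr;

/// Hypothetical client decision for an `mcpServer/elicitation/request`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ElicitationAction {
    Accept,
    Decline,
    Cancel,
}

impl FromStr for ElicitationAction {
    type Err = String;

    /// Parse the wire value of the `action` field; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "accept" => Ok(Self::Accept),
            "decline" => Ok(Self::Decline),
            "cancel" => Ok(Self::Cancel),
            other => Err(format!("unknown elicitation action: {other}")),
        }
    }
}
```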
jif-oai	0cc6835416	feat: ultra polish package manager (#13573 ) See the readme	2026-03-05 13:02:30 +00:00
jif-oai	f304b2ef62	feat: bind package manager (#13571 )	2026-03-05 11:57:13 +00:00
Michael Bolin	b4cb989563	refactor: prepare unified exec for zsh-fork backend (#13392 ) ## Why `shell_zsh_fork` already provides stronger guarantees around which executables receive elevated permissions. To reuse that machinery from unified exec without pushing Unix-specific escalation details through generic runtime code, the escalation bootstrap and session lifetime handling need a cleaner boundary. That boundary also needs to be safe for long-lived sessions: when an intercepted shell session is closed or pruned, any in-flight approval workers and any already-approved escalated child they spawned must be torn down with the session, and the inherited escalation socket must not leak into unrelated subprocesses. ## What Changed - Extracted a reusable `EscalationSession` and `EscalateServer::start_session(...)` in `shell-escalation` so callers can get the wrapper/socket env overlay and keep the escalation server alive without immediately running a one-shot command. - Documented that `EscalationSession::env()` and `ShellCommandExecutor::run(...)` exchange only that env overlay, which callers must merge into their own base shell environment. - Clarified the prepared-exec helper boundary in `core` by naming the new helper APIs around `ExecRequest`, while keeping the legacy `execute_env(...)` entrypoints as thin compatibility wrappers for existing callers that still use the older naming. - Added a small post-spawn hook on the prepared execution path so the parent copy of the inheritable escalation socket is closed immediately after both the existing one-shot shell-command spawn and the unified-exec spawn. - Made session teardown explicit with session-scoped cancellation: dropping an `EscalationSession` or canceling its parent request now stops intercept workers, and the server-spawned escalated child uses `kill_on_drop(true)` so teardown cannot orphan an already-approved child. 
- Added `UnifiedExecBackendConfig` plumbing through `ToolsConfig`, a `shell::zsh_fork_backend` facade, and an opaque unified-exec spawn-lifecycle hook so unified exec can prepare a wrapped `zsh -c/-lc` request without storing `EscalationSession` directly in generic process/runtime code. - Kept the existing `shell_command` zsh-fork behavior intact on top of the new bootstrap path. Tool selection is unchanged in this PR: when `shell_zsh_fork` is enabled, `ShellCommand` still wins over `exec_command`. ## Verification - `cargo test -p codex-shell-escalation` - includes coverage for `start_session_exposes_wrapper_env_overlay` - includes coverage for `exec_closes_parent_socket_after_shell_spawn` - includes coverage for `dropping_session_aborts_intercept_workers_and_kills_spawned_child` - `cargo test -p codex-core shell_zsh_fork_prefers_shell_command_over_unified_exec` - `cargo test -p codex-core --test all shell_zsh_fork_prompts_for_skill_script_execution` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13392). * #13432 * __->__ #13392	2026-03-05 08:55:12 +00:00
sayan-oai	03d55f0e6f	chore: add web_search_tool_type for image support (#13538 ) add `web_search_tool_type` on model_info that can be populated from backend. will be used to filter which models can use `web_search` with images and which cant. added small unit test.	2026-03-05 07:02:27 +00:00
Ahmed Ibrahim	8f828f8a43	Reduce realtime audio submission log noise (#13539 ) - lower `submission_dispatch` span logging to debug for realtime audio submissions only - keep other submission spans at info and add a targeted test for the level selection --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-04 22:44:14 -08:00
aaronl-openai	ff0341dc94	[js_repl] Support local ESM file imports (#13437 ) ## Summary - add `js_repl` support for dynamic imports of relative and absolute local ESM `.js` / `.mjs` files - keep bare package imports on the native Node path and resolved from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then `cwd`), even when they originate from imported local files - restrict static imports inside imported local files to other local relative/absolute `.js` / `.mjs` files, and surface a clear error for unsupported top-level static imports in the REPL cell - run imported local files inside the REPL VM context so they can access `codex.tmpDir`, `codex.tool`, captured `console`, and Node-like `import.meta` helpers - reload local files between execs so later `await import("./file.js")` calls pick up edits and fixed failures, while preserving package/builtin caching and persistent top-level REPL bindings - make `import.meta.resolve()` self-consistent by allowing the returned `file://...` URLs to round-trip through `await import(...)` - update both public and injected `js_repl` docs to clarify the narrowed contract, including global bare-import resolution behavior for local absolute files ## Testing - `cargo test -p codex-core js_repl_` - built codex binary and verified behavior --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-04 22:40:31 -08:00
pash-openai	394e538640	[core] Enable fast mode by default (#13450 ) Co-authored-by: Codex <noreply@openai.com>	2026-03-04 20:06:35 -08:00
sayan-oai	d44398905b	feat: track plugins mcps/apps and add plugin info to user_instructions (#13433 ) ### first half of changes, followed by #13510 Track plugin capabilities as derived summaries on `PluginLoadOutcome` for enabled plugins with at least one skill/app/mcp. Also add `Plugins` section to `user_instructions` injected on session start. These introduce the plugins concept and list enabled plugins, but do NOT currently include paths to enabled plugins or details on what apps/mcps the plugins contain (current plan is to inject this on @-mention). that can be adjusted in a follow up and based on evals. ### tests Added/updated tests, confirmed locally that new `Plugins` section + currently enabled plugins show up in `user_instructions`.	2026-03-04 19:46:13 -08:00
Won Park	229e6d0347	image-gen-event/client_processing (#13512 ) enabling client-side to process with image-generation capabilities (setting app-server)	2026-03-04 16:54:38 -08:00
Ahmed Ibrahim	7b088901c2	Log non-audio realtime events (#13516 ) Improve observability of realtime conversation event handling by logging non-audio events with payload details in the event loop, while skipping audio-out events to reduce noise.	2026-03-04 16:30:18 -08:00
xl-openai	1e877ccdd2	plugin: support local-based marketplace.json + install endpoint. (#13422 ) Support marketplace.json that points to a local file, with `"source": { "source": "local", "path": "./plugin-1" },`. Add a new plugin/install endpoint which adds the plugin to the cache folder and enables it in config.toml.	2026-03-04 19:08:18 -05:00
Ahmed Ibrahim	294079b0b1	Prefix handoff messages with role (#13505 ) Format handoff context by prefixing each message with its role (for example "user:" and "assistant:") before forwarding to the agent.	2026-03-04 15:37:31 -08:00
alexsong-oai	ce139bb1af	add metrics for external config import (#13501 )	2026-03-04 13:59:50 -08:00
jif-oai	2322e49549	feat: external artifacts builder (#13485 ) This PR reverts the built-in artifact render while a decision is being reached. No impact expected on any features	2026-03-04 20:22:34 +00:00
Owen Lin	27724f6ead	feat(core, tracing): add a span representing a turn (#13424 ) This is PR 3 of the app-server tracing rollout. PRs https://github.com/openai/codex/pull/13285 and https://github.com/openai/codex/pull/13368 gave us inbound request spans in app-server and propagated trace context through Submission. This change finishes the next piece in core: when a request actually starts a turn, we now create a core-owned long-lived span that stays open for the real lifetime of the turn. What changed: - `Session::spawn_task` can now optionally create a long-lived turn span and run the spawned task inside it - `turn/start` uses that path, so normal turn execution stays under a single core-owned span after the async handoff - `review/start` uses the same pattern - added a unit test that verifies the spawned turn task inherits the submission dispatch trace ancestry Why: The app-server request span is intentionally short-lived. Once work crosses into core, we still want one span that covers the actual execution window until completion or interruption. This keeps that ownership where it belongs: in the layer that owns the runtime lifecycle.	2026-03-04 11:09:17 -08:00
Alex Daley	8a59386273	add new scopes to login (#12383 ) Validated login + refresh flows. Removing scopes from the refresh request until we have upgrade flow in place. Confirmed that tokens refresh with existing scopes.	2026-03-04 16:41:54 +00:00
jif-oai	f72ab43fd1	feat: memories in workspace write (#13467 )	2026-03-04 13:00:26 +00:00
jif-oai	df619474f5	nit: citation prompt (#13468 )	2026-03-04 13:00:11 +00:00
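
Commit 294079b0b1 above describes prefixing each handoff message with its role before forwarding it to the agent. A minimal sketch of that idea follows. The `Role` and `Message` types, the `format_handoff` function name, the single space after the colon, and the newline separator are all assumptions made for illustration and are not taken from the codex source.

```rust
/// Hypothetical role type for illustration; not the actual codex type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

/// Hypothetical message type for illustration; not the actual codex type.
#[derive(Debug, Clone)]
struct Message {
    role: Role,
    text: String,
}

/// Prefix each message with its role ("user:" or "assistant:") and join
/// the results with newlines. The separator and spacing are assumptions.
fn format_handoff(messages: &[Message]) -> String {
    messages
        .iter()
        .map(|m| {
            let role = match m.role {
                Role::User => "user",
                Role::Assistant => "assistant",
            };
            format!("{}: {}", role, m.text)
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Under these assumptions, a two-message exchange would be forwarded as one line per message, each starting with the speaker's role.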

2078 Commits