Commit Graph

40 Commits

Author SHA1 Message Date
Michael Bolin
abbd74e2be feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.

It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.

## What

- Added `ReadOnlyAccess` in protocol with:
  - `Restricted { include_platform_defaults, readable_roots }`
  - `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
  - `ReadOnly { access: ReadOnlyAccess }`
  - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.

## Compatibility / rollout

- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
Dylan Hurd
168c359b71 Adjust shell command timeouts for Windows (#11247)
Summary
- add platform-aware defaults for shell command timeouts so Windows
tests get longer waits
- keep medium timeout longer on Windows to ensure flakiness is reduced

Testing
- Not run (not requested)
2026-02-09 20:03:32 -08:00
Charley Cunningham
e6662d6387 app-server: treat null mode developer instructions as built-in defaults (#10983)
## Summary
- make `turn/start` normalize
`collaborationMode.settings.developer_instructions: null` to the
built-in instructions for the selected mode
- prevent app-server clients from accidentally clearing mode-switch
developer instructions by sending `null`
- document this behavior in the v2 protocol and app-server docs

## What changed
- `codex-rs/app-server/src/codex_message_processor.rs`
  - added a small `normalize_turn_start_collaboration_mode` helper
  - in `turn_start`, apply normalization before `OverrideTurnContext`
- `codex-rs/app-server/tests/suite/v2/turn_start.rs`
- extended `turn_start_accepts_collaboration_mode_override_v2` to assert
the outgoing request includes default-mode instruction text when the
client sends `developer_instructions: null`
- `codex-rs/app-server-protocol/src/protocol/v2.rs`
- clarified `TurnStartParams.collaboration_mode` docs:
`settings.developer_instructions: null` means use built-in mode
instructions
- regenerated schema fixture:
- `codex-rs/app-server-protocol/schema/typescript/v2/TurnStartParams.ts`
- docs:
  - `codex-rs/app-server/README.md`
  - `codex-rs/docs/codex_mcp_interface.md`
2026-02-07 12:59:41 -08:00
jif-oai
c67120f4a0 fix: flaky landlock (#10689)
https://openai.slack.com/archives/C095U48JNL9/p1770243347893959
2026-02-05 10:30:18 +00:00
Dylan Hurd
a05aadfa1b chore(config) Default Personality Pragmatic (#10705)
## Summary
Switch back to Pragmatic personality

## Testing
- [x] Updated unit tests
2026-02-04 21:22:47 -08:00
Dylan Hurd
73f32840c6 chore(core) personality migration tests (#10650)
## Summary
Adds additional tests for personality edge cases

## Testing
- [x] These are tests
2026-02-04 19:03:14 -08:00
gt-oai
7c6d21a414 Fix test_shell_command_interruption flake (#10649)
## Human summary
Sandboxing (specifically `LandlockRestrict`) is means that e.g. `sleep
10` fails immediately. Therefore it cannot be interrupted.

In suite::interrupt::test_shell_command_interruption, sleep 10 is issued
at 17:28:16.554 (ToolCall: shell_command {"command":"sleep 10"...}),
then fails at 17:28:16.589 with duration_ms=34, success=false,
exit_code=101, and
    Sandbox(LandlockRestrict).

## Codex summary
- set `sandbox_mode = "danger-full-access"` in `interrupt` and
`v2/turn_interrupt` integration tests
- set `sandbox: Some(SandboxMode::DangerFullAccess)` in
`test_codex_jsonrpc_conversation_flow`
- set `sandbox_policy: Some(SandboxPolicy::DangerFullAccess)` in
`command_execution_notifications_include_process_id`

## Why
On some Linux CI environments, command execution fails immediately with
`LandlockRestrict` when sandboxed. These tests are intended to validate
JSON-RPC/task lifecycle behavior (interrupt semantics, command
notification shape/process id, request flow), but early sandbox startup
failure changes turn flow and can trigger extra follow-up requests,
causing flakes.

This change removes environment-specific sandbox startup dependency from
these tests while preserving their primary intent.

## Testing
- not run in this environment (per request)
2026-02-04 22:19:06 +00:00
Charley Cunningham
d509df676b Cleanup collaboration mode variants (#10404)
## Summary

This PR simplifies collaboration modes to the visible set `default |
plan`, while preserving backward compatibility for older partners that
may still send legacy mode
names.

Specifically:
- Renames the old Code behavior to **Default**.
- Keeps **Plan** as-is.
- Removes **Custom** mode behavior (fallbacks now resolve to Default).
- Keeps `PairProgramming` and `Execute` internally for compatibility
plumbing, while removing them from schema/API and UI visibility.
- Adds legacy input aliasing so older clients can still send old mode
names.

## What Changed

1. Mode enum and compatibility
- `ModeKind` now uses `Plan` + `Default` as active/public modes.
- `ModeKind::Default` deserialization accepts legacy values:
  - `code`
  - `pair_programming`
  - `execute`
  - `custom`
- `PairProgramming` and `Execute` variants remain in code but are hidden
from protocol/schema generation.
- `Custom` variant is removed; previous custom fallbacks now map to
`Default`.

2. Collaboration presets and templates
- Built-in presets now return only:
  - `Plan`
  - `Default`
- Template rename:
  - `core/templates/collaboration_mode/code.md` -> `default.md`
- `execute.md` and `pair_programming.md` remain on disk but are not
surfaced in visible preset lists.

3. TUI updates
- Updated user-facing naming and prompts from “Code” to “Default”.
- Updated mode-cycle and indicator behavior to reflect only visible
`Plan` and `Default`.
- Updated corresponding tests and snapshots.

4. request_user_input behavior
- `request_user_input` remains allowed only in `Plan` mode.
- Rejection messaging now consistently treats non-plan modes as
`Default`.

5. Schemas
- Regenerated config and app-server schemas.
- Public schema types now advertise mode values as:
  - `plan`
  - `default`

## Backward Compatibility Notes

- Incoming legacy mode names (`code`, `pair_programming`, `execute`,
`custom`) are accepted and coerced to `default`.
- Outgoing/public schema surfaces intentionally expose only `plan |
default`.
- This allows tolerant ingestion of older partner payloads while
standardizing new integrations on the reduced mode set.

## Codex author
`codex fork 019c1fae-693b-7840-b16e-9ad38ea0bd00`
2026-02-03 09:23:53 -08:00
Dylan Hurd
8a461765f3 chore(core) Default to friendly personality (#10305)
## Summary
Update default personality to friendly

## Testing
- [x] Unit tests pass
2026-01-31 17:11:32 -07:00
Dylan Hurd
ed9e02c9dc chore(app-server) add personality update test (#10306)
## Summary
Add some additional validation to ensure app-server handles Personality
changes

## Testing
- [x] These are tests
2026-01-31 14:49:55 -07:00
Dylan Hurd
e3ab0bd973 chore(personality) new schema with fallbacks (#10147)
## Summary
Let's dial in this api contract in a bit more with more robust fallback
behavior when model_instructions_template is false.

Switches to a more explicit template / variables structure, with more
fallbacks.

## Testing
- [x] Adding unit tests
- [x] Tested locally
2026-01-30 00:10:12 -07:00
Dylan Hurd
25fccc3d4d chore(core) move model_instructions_template config (#9871)
## Summary
Move `model_instructions_template` config to the experimental slug while
we iterate on this feature

## Testing
- [x] Tested locally, unit tests still pass
2026-01-26 07:02:11 +00:00
Ahmed Ibrahim
69cfc73dc6 change collaboration mode to struct (#9793)
Shouldn't cause behavioral change
2026-01-23 17:00:23 -08:00
Dylan Hurd
2b1ee24e11 feat(app-server) Expose personality (#9674)
### Motivation
Exposes a per-thread / per-turn `personality` override in the v2
app-server API so clients can influence model communication style at
thread/turn start. Ensures the override is passed into the session
configuration resolution so it becomes effective for subsequent turns
and headless runners.

### Testing
- [x] Add an integration-style test
`turn_start_accepts_personality_override_v2` in
`codex-rs/app-server/tests/suite/v2/turn_start.rs` that verifies a
`/personality` override results in a developer update message containing
`<personality_spec>` in the outbound model request.

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6971d646b1c08322a689a54d2649f3fe)
2026-01-22 18:00:20 -08:00
charley-oai
be9e55c5fc Add total (non-partial) TextElement placeholder accessors (#9545)
## Summary
- Make `TextElement` placeholders private and add a text-backed accessor
to avoid assuming `Some`.
- Since they are optional in the protocol, we want to make sure any
accessors properly handle the None case (getting the placeholder using
the byte range in the text)
- Preserve placeholders during protocol/app-server conversions using the
accessor fallback.
- Update TUI composer/remap logic and tests to use the new
constructor/accessor.
2026-01-20 14:04:11 -08:00
Shijie Rao
57ec3a8277 Feat: request user input tool (#9472)
### Summary
* Add `requestUserInput` tool that the model can use for gather
feedback/asking question mid turn.


### Tool input schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput input",
  "type": "object",
  "additionalProperties": false,
  "required": ["questions"],
  "properties": {
    "questions": {
      "type": "array",
      "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
      "minItems": 1,
      "maxItems": 3,
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["id", "header", "question"],
        "properties": {
          "id": {
            "type": "string",
            "description": "Stable identifier for mapping answers (snake_case)."
          },
          "header": {
            "type": "string",
            "description": "Short header label shown in the UI (12 or fewer chars)."
          },
          "question": {
            "type": "string",
            "description": "Single-sentence prompt shown to the user."
          },
          "options": {
            "type": "array",
            "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
            "minItems": 2,
            "maxItems": 3,
            "items": {
              "type": "object",
              "additionalProperties": false,
              "required": ["value", "label", "description"],
              "properties": {
                "value": {
                  "type": "string",
                  "description": "Machine-readable value (snake_case)."
                },
                "label": {
                  "type": "string",
                  "description": "User-facing label (1-5 words)."
                },
                "description": {
                  "type": "string",
                  "description": "One short sentence explaining impact/tradeoff if selected."
                }
              }
            }
          }
        }
      }
    }
  }
}
```

### Tool output schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput output",
  "type": "object",
  "additionalProperties": false,
  "required": ["answers"],
  "properties": {
    "answers": {
      "type": "object",
      "description": "Map of question id to user answer.",
      "additionalProperties": {
        "type": "object",
        "additionalProperties": false,
        "required": ["selected"],
        "properties": {
          "selected": {
            "type": "array",
            "items": { "type": "string" }
          },
          "other": {
            "type": ["string", "null"]
          }
        }
      }
    }
  }
}
```
2026-01-19 10:17:30 -08:00
Ahmed Ibrahim
146d54cede Add collaboration_mode override to turns (#9408) 2026-01-16 21:51:25 -08:00
charley-oai
1fa8350ae7 Add text element metadata to protocol, app server, and core (#9331)
The second part of breaking up PR
https://github.com/openai/codex/pull/9116

Summary:

- Add `TextElement` / `ByteRange` to protocol user inputs and user
message events with defaults.
- Thread `text_elements` through app-server v1/v2 request handling and
history rebuild.
- Preserve UI metadata only in user input/events (not `ContentItem`)
while keeping local image attachments in user events for rehydration.

Details:

- Protocol: `UserInput::Text` carries `text_elements`;
`UserMessageEvent` carries `text_elements` + `local_images`.
Serialization includes empty vectors for backward compatibility.
- app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
camelCase with conversions; v2 uses its own camelCase wrapper.
- app-server: v1/v2 input mapping includes `text_elements`; thread
history rebuilds include them.
- Core: user event emission preserves UI metadata while model history
stays clean; history replay round-trips the metadata.
2026-01-15 17:26:41 -08:00
charley-oai
4a9c2bcc5a Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
Owen Lin
fbe883318d fix(app-server): set originator header from initialize (re-revert) (#8988)
Reapplies https://github.com/openai/codex/pull/8873 which was reverted
due to merge conflicts
2026-01-09 12:09:30 -08:00
jif-oai
5c380d5b1e Revert "fix(app-server): set originator header from initialize JSON-RPC request" (#8986)
Reverts openai/codex#8873
2026-01-09 17:00:53 +00:00
Owen Lin
ea56186c2b fix(app-server): set originator header from initialize JSON-RPC request (#8873)
**Motivation**
The `originator` header is important for codex-backend’s Responses API
proxy because it identifies the real end client (codex cli, codex vscode
extension, codex exec, future IDEs) and is used to categorize requests
by client for our enterprise compliance API.

Today the `originator` header is set by either:
- the `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` env var (our VSCode extension
does this)
- calling `set_default_originator()` which sets a global immutable
singleton (`codex exec` does this)

For `codex app-server`, we want the `initialize` JSON-RPC request to set
that header because it is a natural place to do so. Example:
```json
{
  "method": "initialize",
  "id": 0,
  "params": {
    "clientInfo": {
      "name": "codex_vscode",
      "title": "Codex VS Code Extension",
      "version": "0.1.0"
    }
  }
}
```
and when app-server receives that request, it can call
`set_default_originator()`. This is a much more natural interface than
asking third party developers to set an env var.

One hiccup is that `originator()` reads the global singleton and locks
in the value, preventing a later `set_default_originator()` call from
setting it. This would be fine but is brittle, since any codepath that
calls `originator()` before app-server can process an `initialize`
JSON-RPC call would prevent app-server from setting it. This was
actually the case with OTEL initialization which runs on boot, but I
also saw this behavior in certain tests.

Instead, what we now do is:
- [unchanged] If `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` env var is set,
`originator()` would return that value and `set_default_originator()`
with some other value does NOT override it.
- [new] If no env var is set, `originator()` would return the default
value which is `codex_cli_rs` UNTIL `set_default_originator()` is called
once, in which case it is set to the new value and becomes immutable.
Later calls to `set_default_originator()` returns
`SetOriginatorError::AlreadyInitialized`.

**Other notes**
- I updated `codex_core::otel_init::build_provider` to accepts a service
name override, and app-server sends a hardcoded `codex_app_server`
service name to distinguish it from `codex_cli_rs` used by default (e.g.
TUI).

**Next steps**
- Update VSCE to set the proper value for `clientInfo.name` on
`initialize` and drop the `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` env var.
- Delete support for `CODEX_INTERNAL_ORIGINATOR_OVERRIDE` in codex-rs.
2026-01-09 08:17:13 -08:00
Celia Chen
be4364bb80 [chore] move app server tests from chat completion to responses (#8939)
We are deprecating chat completions. Move all app server tests from chat
completion to responses.
2026-01-08 22:27:55 +00:00
Celia Chen
051bf81df9 [fix] app server flaky send_messages test (#8874)
Fix flakiness of CI test:
https://github.com/openai/codex/actions/runs/20350530276/job/58473691434?pr=8282

This PR does two things:
1. move the flakiness test to use responses API instead of chat
completion API
2. make mcp_process agnostic to the order of
responses/notifications/requests that come in, by buffering messages not
read
2026-01-08 20:41:21 +00:00
Owen Lin
66450f0445 fix: implement 'Allow this session' for apply_patch approvals (#8451)
**Summary**
This PR makes “ApprovalDecision::AcceptForSession / don’t ask again this
session” actually work for `apply_patch` approvals by caching approvals
based on absolute file paths in codex-core, properly wiring it through
app-server v2, and exposing the choice in both TUI and TUI2.
- This brings `apply_patch` calls to be at feature-parity with general
shell commands, which also have a "Yes, and don't ask again" option.
- This also fixes VSCE's "Allow this session" button to actually work.

While we're at it, also split the app-server v2 protocol's
`ApprovalDecision` enum so execpolicy amendments are only available for
command execution approvals.

**Key changes**
- Core: per-session patch approval allowlist keyed by absolute file
paths
- Handles multi-file patches and renames/moves by recording both source
and destination paths for `Update { move_path: Some(...) }`.
- Extend the `Approvable` trait and `ApplyPatchRuntime` to work with
multiple keys, because an `apply_patch` tool call can modify multiple
files. For a request to be auto-approved, we will need to check that all
file paths have been approved previously.
- App-server v2: honor AcceptForSession for file changes
- File-change approval responses now map AcceptForSession to
ReviewDecision::ApprovedForSession (no longer downgraded to plain
Approved).
- Replace `ApprovalDecision` with two enums:
`CommandExecutionApprovalDecision` and `FileChangeApprovalDecision`
- TUI / TUI2: expose “don’t ask again for these files this session”
- Patch approval overlays now include a third option (“Yes, and don’t
ask again for these files this session (s)”).
    - Snapshot updates for the approval modal.

**Tests added/updated**
- Core:
- Integration test that proves ApprovedForSession on a patch skips the
next patch prompt for the same file
- App-server:
- v2 integration test verifying
FileChangeApprovalDecision::AcceptForSession works properly

**User-visible behavior**
- When the user approves a patch “for session”, future patches touching
only those previously approved file(s) will no longer prompt gain during
that session (both via app-server v2 and TUI/TUI2).

**Manual testing**
Tested both TUI and TUI2 - see screenshots below.

TUI:
<img width="1082" height="355" alt="image"
src="https://github.com/user-attachments/assets/adcf45ad-d428-498d-92fc-1a0a420878d9"
/>


TUI2:
<img width="1089" height="438" alt="image"
src="https://github.com/user-attachments/assets/dd768b1a-2f5f-4bd6-98fd-e52c1d3abd9e"
/>
2026-01-07 20:11:12 +00:00
Anton Panasenko
807f8a43c2 feat: expose outputSchema to user_turn/turn_start app_server API (#8377)
What changed
- Added `outputSchema` support to the app-server APIs, mirroring `codex
exec --output-schema` behavior.
- V1 `sendUserTurn` now accepts `outputSchema` and constrains the final
assistant message for that turn.
- V2 `turn/start` now accepts `outputSchema` and constrains the final
assistant message for that turn (explicitly per-turn only).

Core behavior
- `Op::UserTurn` already supported `final_output_json_schema`; now V1
`sendUserTurn` forwards `outputSchema` into that field.
- `Op::UserInput` now carries `final_output_json_schema` for per-turn
settings updates; core maps it into
`SessionSettingsUpdate.final_output_json_schema` so it applies to the
created turn context.
- V2 `turn/start` does NOT persist the schema via `OverrideTurnContext`
(it’s applied only for the current turn). Other overrides
(cwd/model/etc) keep their existing persistent behavior.

API / docs
- `codex-rs/app-server-protocol/src/protocol/v1.rs`: add `output_schema:
Option<serde_json::Value>` to `SendUserTurnParams` (serialized as
`outputSchema`).
- `codex-rs/app-server-protocol/src/protocol/v2.rs`: add `output_schema:
Option<JsonValue>` to `TurnStartParams` (serialized as `outputSchema`).
- `codex-rs/app-server/README.md`: document `outputSchema` for
`turn/start` and clarify it applies only to the current turn.
- `codex-rs/docs/codex_mcp_interface.md`: document `outputSchema` for v1
`sendUserTurn` and v2 `turn/start`.

Tests added/updated
- New app-server integration tests asserting `outputSchema` is forwarded
into outbound `/responses` requests as `text.format`:
  - `codex-rs/app-server/tests/suite/output_schema.rs`
  - `codex-rs/app-server/tests/suite/v2/output_schema.rs`
- Added per-turn semantics tests (schema does not leak to the next
turn):
  - `send_user_turn_output_schema_is_per_turn_v1`
  - `turn_start_output_schema_is_per_turn_v2`
- Added protocol wire-compat tests for the merged op:
  - serialize omits `final_output_json_schema` when `None`
  - deserialize works when field is missing
  - serialize includes `final_output_json_schema` when `Some(schema)`

Call site updates (high level)
- Updated all `Op::UserInput { .. }` constructions to include
`final_output_json_schema`:
  - `codex-rs/app-server/src/codex_message_processor.rs`
  - `codex-rs/core/src/codex_delegate.rs`
  - `codex-rs/mcp-server/src/codex_tool_runner.rs`
  - `codex-rs/tui/src/chatwidget.rs`
  - `codex-rs/tui2/src/chatwidget.rs`
  - plus impacted core tests.

Validation
- `just fmt`
- `cargo test -p codex-core`
- `cargo test -p codex-app-server`
- `cargo test -p codex-mcp-server`
- `cargo test -p codex-tui`
- `cargo test -p codex-tui2`
- `cargo test -p codex-protocol`
- `cargo clippy --all-features --tests --profile dev --fix -- -D
warnings`
2026-01-05 10:27:00 -08:00
Michael Bolin
642b7566df fix: introduce AbsolutePathBuf as part of sandbox config (#7856)
Changes the `writable_roots` field of the `WorkspaceWrite` variant of
the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`.
This is helpful because now callers can be sure the value is an absolute
path rather than a relative one. (Though when using an absolute path in
a Seatbelt config policy, we still have to _canonicalize_ it first.)

Because `writable_roots` can be read from a config file, it is important
that we are able to resolve relative paths properly using the parent
folder of the config file as the base path.
2025-12-12 15:25:22 -08:00
zhao-oai
0a32acaa2d updating app server types to support execpoilcy amendment (#7747)
also includes minor refactor merging `ApprovalDecision` with
`CommandExecutionRequestAcceptSettings`
2025-12-08 13:56:22 -08:00
Ahmed Ibrahim
71504325d3 Migrate model preset (#7542)
- Introduce `openai_models` in `/core`
- Move `PRESETS` under it
- Move `ModelPreset`, `ModelUpgrade`, `ReasoningEffortPreset`,
`ReasoningEffortPreset`, and `ReasoningEffortPreset` to `protocol`
- Introduce `Op::ListModels` and `EventMsg::AvailableModels`

Next steps:
- migrate `app-server` and `tui` to use the introduced Operation
2025-12-03 20:30:43 +00:00
Owen Lin
8532876ad8 [app-server] fix: emit item/fileChange/outputDelta for file change items (#7399) 2025-12-01 17:52:34 +00:00
jif-oai
28ff364c3a feat: update process ID for event handling (#7261) 2025-11-25 14:21:05 -08:00
Owen Lin
157a16cefa [app-server] feat: add thread_id and turn_id to item and error notifications (#7124)
Add `thread_id` and `turn_id` to `item/started`, `item/completed`, and
`error` notifications. Otherwise the client will have a hard time
knowing which thread & turn (if multiple threads are running in
parallel) a new item/error is for.

Also add `thread_id` to `turn/started` and `turn/completed`.
2025-11-25 08:05:47 -08:00
Dylan Hurd
1e832b1438 fix(windows) support apply_patch parsing in powershell (#7221)
## Summary
Support powershell parsing of apply_patch

## Testing
- [x] Enable apply_patch unit tests

---------

Co-authored-by: jif-oai <jif@openai.com>
2025-11-24 19:32:47 +00:00
Owen Lin
2ae1f81d84 [app-server] feat: add Declined status for command exec (#7101)
Add a `Declined` status for when we request an approval from the user
and the user declines. This allows us to distinguish from commands that
actually ran, but failed.

This behaves similarly to apply_patch / FileChange, which does the same
thing.
2025-11-21 09:19:39 -08:00
pakrym-oai
767b66f407 Migrate coverage to shell_command (#7042) 2025-11-21 03:44:00 +00:00
Owen Lin
d6c30ed25e [app-server] feat: v2 apply_patch approval flow (#6760)
This PR adds the API V2 version of the apply_patch approval flow, which
centers around `ThreadItem::FileChange`.

This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
and related events (`item/started`, `item/completed` for
`ThreadItem::FileChange`, which are emitted in both V1 and V2) through
the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.

Similar to https://github.com/openai/codex/pull/6758, the approach I
took was to make as few changes to the Codex core as possible,
leveraging existing `EventMsg` core events, and translating those in
app-server. I did have to add a few additional fields to
`EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
were fairly lightweight.

However, the `EventMsg`s emitted by core are the following:
```
1) Auto-approved (no request for approval)

- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

2) Approved by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

3) Declined by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
```

For a request triggering an approval, this would result in:
```
item/fileChange/requestApproval
item/started
item/completed
```

which is different from the `ThreadItem::CommandExecution` flow
introduced in https://github.com/openai/codex/pull/6758, which does the
below and is preferable:
```
item/started
item/commandExecution/requestApproval
item/completed
```

To fix this, we leverage `TurnSummaryStore` on codex_message_processor
to store a little bit of state, allowing us to fire `item/started` and
`item/fileChange/requestApproval` whenever we receive the underlying
`EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
`EventMsg::PatchApplyBegin` later.

This is much less invasive than modifying the order of EventMsg within
core (I tried).

The resulting payloads:
```
{
  "method": "item/started",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "inProgress",
      "type": "fileChange"
    }
  }
}
```

```
{
  "id": 0,
  "method": "item/fileChange/requestApproval",
  "params": {
    "grantRoot": null,
    "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
    "reason": null,
    "threadId": "019a9e11-8295-7883-a283-779e06502c6f",
    "turnId": "1"
  }
}
```

```
{
  "id": 0,
  "result": {
    "decision": "accept"
  }
}
```

```
{
  "method": "item/completed",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "completed",
      "type": "fileChange"
    }
  }
}
```
2025-11-19 20:13:31 -08:00
Michael Bolin
a75321a64c fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
This adds the following fields to `ThreadStartResponse` and
`ThreadResumeResponse`:

```rust
    pub model: String,
    pub model_provider: String,
    pub cwd: PathBuf,
    pub approval_policy: AskForApproval,
    pub sandbox: SandboxPolicy,
    pub reasoning_effort: Option<ReasoningEffort>,
```

This is important because these fields are optional in
`ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
able to determine what values were ultimately used to start/resume the
conversation. (Though note that any of these could be changed later
between turns in the conversation.)

Though to get this information reliably, it must be read from the
internal `SessionConfiguredEvent` that is created in response to the
start of a conversation. Because `SessionConfiguredEvent` (as defined in
`codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
number of them had to be added as part of this PR.

Because `SessionConfiguredEvent` is referenced in many tests, test
instances of `SessionConfiguredEvent` had to be updated, as well, which
is why this PR touches so many files.
2025-11-18 21:18:43 -08:00
Celia Chen
b395dc1be6 [app-server] introduce turn/completed v2 event (#6800)
similar to logic in
`codex/codex-rs/exec/src/event_processor_with_jsonl_output.rs`.
translation of v1 -> v2 events:
`codex/event/task_complete` -> `turn/completed`
`codex/event/turn_aborted` -> `turn/completed` with `interrupted` status
`codex/event/error` -> `turn/completed` with `error` status

this PR also makes `items` field in `Turn` optional. For now, we only
populate it when we resume a thread, and leave it as None for all other
places until we properly rewrite core to keep track of items.

tested using the codex app server client. example new event:
```
< {
<   "method": "turn/completed",
<   "params": {
<     "turn": {
<       "id": "0",
<       "items": [],
<       "status": "interrupted"
<     }
<   }
< }
```
2025-11-19 01:55:24 +00:00
Owen Lin
cecbd5b021 [app-server] feat: add v2 command execution approval flow (#6758)
This PR adds the API V2 version of the command‑execution approval flow
for the shell tool.

This PR wires the new RPC (`item/commandExecution/requestApproval`, V2
only) and related events (`item/started`, `item/completed`, and
`item/commandExecution/delta`, which are emitted in both V1 and V2)
through the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.

The approach I took was to make as few changes to the Codex core as
possible, leveraging existing `EventMsg` core events, and translating
those in app-server. I did have to add additional fields to
`EventMsg::ExecCommandEndEvent` to capture the command's input so that
app-server can statelessly transform these events to a
`ThreadItem::CommandExecution` item for the `item/completed` event.

Once we stabilize the API and it's complete enough for our partners, we
can work on migrating the core to be aware of command execution items as
a first-class concept.

**Note**: We'll need followup work to make sure these APIs work for the
unified exec tool, but will wait til that's stable and landed before
doing a pass on app-server.

Example payloads below:
```
{
  "method": "item/started",
  "params": {
    "item": {
      "aggregatedOutput": null,
      "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
      "cwd": "/Users/owen/repos/codex/codex-rs",
      "durationMs": null,
      "exitCode": null,
      "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
      "parsedCmd": [
        {
          "cmd": "touch /tmp/should-trigger-approval",
          "type": "unknown"
        }
      ],
      "status": "inProgress",
      "type": "commandExecution"
    }
  }
}
```

```
{
  "id": 0,
  "method": "item/commandExecution/requestApproval",
  "params": {
    "itemId": "call_lNWWsbXl1e47qNaYjFRs0dyU",
    "parsedCmd": [
      {
        "cmd": "touch /tmp/should-trigger-approval",
        "type": "unknown"
      }
    ],
    "reason": "Need to create file in /tmp which is outside workspace sandbox",
    "risk": null,
    "threadId": "019a93e8-0a52-7fe3-9808-b6bc40c0989a",
    "turnId": "1"
  }
}
```

```
{
  "id": 0,
  "result": {
    "acceptSettings": {
      "forSession": false
    },
    "decision": "accept"
  }
}
```

```
{
  "params": {
    "item": {
      "aggregatedOutput": null,
      "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
      "cwd": "/Users/owen/repos/codex/codex-rs",
      "durationMs": 224,
      "exitCode": 0,
      "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
      "parsedCmd": [
        {
          "cmd": "touch /tmp/should-trigger-approval",
          "type": "unknown"
        }
      ],
      "status": "completed",
      "type": "commandExecution"
    }
  }
}
```
2025-11-18 00:23:54 +00:00
Owen Lin
6582554926 [app-server] feat: v2 Turn APIs (#6216)
Implements:
```
turn/start
turn/interrupt
```

along with their integration tests. These are relatively light wrappers
around the existing core logic, and changes to core logic are minimal.

However, an improvement made for developer ergonomics:
- `turn/start` replaces both `SendUserMessage` (no turn overrides) and
`SendUserTurn` (can override model, approval policy, etc.)
2025-11-06 16:36:36 +00:00