codex

mirror of https://github.com/openai/codex.git synced 2026-05-28 15:00:16 +00:00

Author	SHA1	Message	Date
jimmyfraiture	601f5715c6	fix merge	2025-10-03 14:31:38 +01:00
jimmyfraiture	39cfc8465c	Merge remote-tracking branch 'origin/main' into jif/parallel-tool-calls # Conflicts: # codex-rs/core/src/client_common.rs # codex-rs/core/src/codex.rs # codex-rs/core/src/config.rs # codex-rs/core/src/lib.rs # codex-rs/core/src/tools/handlers/apply_patch.rs # codex-rs/core/src/tools/handlers/mod.rs # codex-rs/core/src/tools/handlers/plan.rs # codex-rs/core/src/tools/handlers/read_file.rs # codex-rs/core/src/tools/handlers/view_image.rs # codex-rs/core/src/tools/mod.rs # codex-rs/core/src/tools/registry.rs # codex-rs/core/src/tools/router.rs # codex-rs/core/src/tools/spec.rs # codex-rs/core/tests/suite/mod.rs # codex-rs/core/tests/suite/read_file.rs # codex-rs/core/tests/suite/view_image.rs	2025-10-03 14:00:51 +01:00
jif-oai	33d3ecbccc	chore: refactor tool handling (#4510 ) # Tool System Refactor - Centralizes tool definitions and execution in `core/src/tools/`: specs (`spec.rs`), handlers (`handlers/`), router (`router.rs`), registry/dispatch (`registry.rs`), and shared context (`context.rs`). One registry now builds the model-visible tool list and binds handlers. - Router converts model responses to tool calls; Registry dispatches with consistent telemetry via `codex-rs/otel` and unified error handling. Function, Local Shell, MCP, and experimental `unified_exec` all flow through this path; legacy shell aliases still work. - Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and make adding tools predictable and testable. Example: `read_file` - Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`, registered by `build_specs`). - Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`, 1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation). - E2E test: `core/tests/suite/read_file.rs` validates the tool returns the requested lines. ## Next steps: - Decompose `handle_container_exec_with_params` - Add parallel tool calls	2025-10-03 13:21:06 +01:00
jif-oai	69cb72f842	chore: sandbox refactor 2 (#4653 ) Revert the revert and fix the UI issue	2025-10-03 11:17:39 +01:00
Ahmed Ibrahim	ed5d656fa8	Revert "chore: sanbox extraction" (#4626 ) Reverts openai/codex#4286	2025-10-02 21:09:21 +00:00
pakrym-oai	4c566d484a	Separate interactive and non-interactive sessions (#4612 ) Do not show exec session in VSCode/TUI selector.	2025-10-02 13:06:21 -07:00
Jeremy Rose	45936f8fbd	show "Viewed Image" when the model views an image (#4475 ) <img width="1022" height="339" alt="Screenshot 2025-09-29 at 4 22 00 PM" src="https://github.com/user-attachments/assets/12da7358-19be-4010-a71b-496ede6dfbbf" />	2025-10-02 18:36:03 +00:00
jimmyfraiture	ad29d2bf39	V2	2025-10-01 17:12:29 +01:00
jimmyfraiture	a6d3e2a334	V1	2025-10-01 16:42:01 +01:00
jimmyfraiture	e23cbcaaf6	Fix regression 1	2025-10-01 12:37:08 +01:00
jimmyfraiture	0718b00b80	Fix merge 2	2025-10-01 12:18:37 +01:00
jimmyfraiture	d3687a7a65	Fix merge 1	2025-10-01 12:10:07 +01:00
jimmyfraiture	5305f247aa	Merge remote-tracking branch 'origin/main' into jif/tool_refactors-1 # Conflicts: # codex-rs/core/src/codex.rs # codex-rs/core/src/executor/mod.rs # codex-rs/core/src/executor/runner.rs	2025-10-01 12:09:18 +01:00
jif-oai	b8195a17e5	chore: sanbox extraction (#4286 ) # Extract and Centralize Sandboxing - Goal: Improve safety and clarity by centralizing sandbox planning and execution. - Approach: - Add planner (ExecPlan) and backend registry (Direct/Seatbelt/Linux) with run_with_plan. - Refactor codex.rs to plan-then-execute; handle failures/escalation via the plan. - Delegate apply_patch to the codex binary and run it with an empty env for determinism.	2025-10-01 12:05:12 +01:00
Michael Bolin	5881c0d6d4	fix: remove mcp-types from app server protocol (#4537 ) We continue the separation between `codex app-server` and `codex mcp-server`. In particular, we introduce a new crate, `codex-app-server-protocol`, and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it `codex-rs/app-server-protocol/src/protocol.rs`. Because `ConversationId` was defined in `mcp_protocol.rs`, we move it into its own file, `codex-rs/protocol/src/conversation_id.rs`, and because it is referenced in a ton of places, we have to touch a lot of files as part of this PR. We also decide to get away from proper JSON-RPC 2.0 semantics, so we also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which is basically the same `JSONRPCMessage` type defined in `mcp-types` except with all of the `"jsonrpc": "2.0"` removed. Getting rid of `"jsonrpc": "2.0"` makes our serialization logic considerably simpler, as we can lean heavier on serde to serialize directly into the wire format that we use now.	2025-10-01 02:16:26 +00:00
jimmyfraiture	ee5f5e85cd	Fix tests and clippy	2025-09-30 21:03:35 +01:00
jimmyfraiture	9e04b908e1	Add read file	2025-09-30 19:07:22 +01:00
jimmyfraiture	150765dbe3	V2	2025-09-30 17:19:23 +01:00
jimmyfraiture	dccf499850	V1	2025-09-30 17:02:58 +01:00
jimmyfraiture	5c00e1596a	Restore otel	2025-09-30 13:28:25 +01:00
jimmyfraiture	4533dceafa	Merge remote-tracking branch 'origin/main' into jif/sandbox-1 # Conflicts: # codex-rs/core/src/codex.rs	2025-09-30 12:45:23 +01:00
jimmyfraiture	43c0abb31e	RV 6	2025-09-30 12:42:36 +01:00
jimmyfraiture	8c09db17c3	RV 5	2025-09-30 12:04:44 +01:00
jimmyfraiture	1d87628d41	RV 4	2025-09-30 11:55:43 +01:00
jimmyfraiture	4656160e31	RV 3	2025-09-30 11:40:44 +01:00
jimmyfraiture	2dd226891a	RV 2	2025-09-30 11:36:51 +01:00
jimmyfraiture	ed45f85209	RV 1	2025-09-30 11:27:23 +01:00
jimmyfraiture	5b74f10a7b	Sandboxing iteration 2	2025-09-29 19:34:12 +01:00
vishnu-oai	04c1782e52	OpenTelemetry events (#2103 ) ### Title ## otel Codex can emit [OpenTelemetry](https://opentelemetry.io/) log events that describe each run: outbound API requests, streamed responses, user input, tool-approval decisions, and the result of every tool invocation. Export is disabled by default so local runs remain self-contained. Opt in by adding an `[otel]` table and choosing an exporter. ```toml [otel] environment = "staging" # defaults to "dev" exporter = "none" # defaults to "none"; set to otlp-http or otlp-grpc to send events log_user_prompt = false # defaults to false; redact prompt text unless explicitly enabled ``` Codex tags every exported event with `service.name = "codex-cli"`, the CLI version, and an `env` attribute so downstream collectors can distinguish dev/staging/prod traffic. Only telemetry produced inside the `codex_otel` crate—the events listed below—is forwarded to the exporter. ### Event catalog Every event shares a common set of metadata fields: `event.timestamp`, `conversation.id`, `app.version`, `auth_mode` (when available), `user.account_id` (when available), `terminal.type`, `model`, and `slug`. With OTEL enabled Codex emits the following event types (in addition to the metadata above): - `codex.api_request` - `cf_ray` (optional) - `attempt` - `duration_ms` - `http.response.status_code` (optional) - `error.message` (failures) - `codex.sse_event` - `event.kind` - `duration_ms` - `error.message` (failures) - `input_token_count` (completion only) - `output_token_count` (completion only) - `cached_token_count` (completion only, optional) - `reasoning_token_count` (completion only, optional) - `tool_token_count` (completion only) - `codex.user_prompt` - `prompt_length` - `prompt` (redacted unless `log_user_prompt = true`) - `codex.tool_decision` - `tool_name` - `call_id` - `decision` (`approved`, `approved_for_session`, `denied`, or `abort`) - `source` (`config` or `user`) - `codex.tool_result` - `tool_name` - `call_id` - `arguments` - `duration_ms` (execution time for the tool) - `success` (`"true"` or `"false"`) - `output` ### Choosing an exporter Set `otel.exporter` to control where events go: - `none` – leaves instrumentation active but skips exporting. This is the default. - `otlp-http` – posts OTLP log records to an OTLP/HTTP collector. Specify the endpoint, protocol, and headers your collector expects: ```toml [otel] exporter = { otlp-http = { endpoint = "https://otel.example.com/v1/logs", protocol = "binary", headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" } }} ``` - `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint and any metadata headers: ```toml [otel] exporter = { otlp-grpc = { endpoint = "https://otel.example.com:4317", headers = { "x-otlp-meta" = "abc123" } }} ``` If the exporter is `none` nothing is written anywhere; otherwise you must run or point to your own collector. All exporters run on a background batch worker that is flushed on shutdown. If you build Codex from source the OTEL crate is still behind an `otel` feature flag; the official prebuilt binaries ship with the feature enabled. When the feature is disabled the telemetry hooks become no-ops so the CLI continues to function without the extra dependencies. --------- Co-authored-by: Anton Panasenko <apanasenko@openai.com>	2025-09-29 11:30:55 -07:00
jif-oai	7b6d8b60c9	Merge branch 'main' into jif/sandbox-1	2025-09-29 09:35:29 +01:00
Gabriel Peal	e555a36c6a	[MCP] Introduce an experimental official rust sdk based mcp client (#4252 ) The [official Rust SDK](`57fc428c57`) has come a long way since we first started our mcp client implementation 5 months ago and, today, it is much more complete than our own stdio-only implementation. This PR introduces a new config flag `experimental_use_rmcp_client` which will use a new mcp client powered by the sdk instead of our own. To keep this PR simple, I've only implemented the same stdio MCP functionality that we had but will expand on it with future PRs. --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-09-26 13:13:37 -04:00
jif-oai	1fc3413a46	ref: state - 2 (#4229 ) Extracting tasks in a module and start abstraction behind a Trait (more to come on this but each task will be tackled in a dedicated PR) The goal was to drop the ActiveTask and to have a (potentially) set of tasks during each turn	2025-09-26 13:49:08 +00:00
jimmyfraiture	caab5a19ee	Move some stuff around	2025-09-26 14:46:07 +02:00
jimmyfraiture	805de19381	V1	2025-09-26 13:42:58 +02:00
jif-oai	250b244ab4	ref: full state refactor (#4174 ) ## Current State Observations - `Session` currently holds many unrelated responsibilities (history, approval queues, task handles, rollout recorder, shell discovery, token tracking, etc.), making it hard to reason about ownership and lifetimes. - The anonymous `State` struct inside `codex.rs` mixes session-long data with turn-scoped queues and approval bookkeeping. - Turn execution (`run_task`) relies on ad-hoc local variables that should conceptually belong to a per-turn state object. - External modules (`codex::compact`, tests) frequently poke the raw `Session.state` mutex, which couples them to implementation details. - Interrupts, approvals, and rollout persistence all have bespoke cleanup paths, contributing to subtle bugs when a turn is aborted mid-flight. ## Desired End State - Keep a slim `Session` object that acts as the orchestrator and façade. It should expose a focused API (submit, approvals, interrupts, event emission) without storing unrelated fields directly. - Introduce a `state` module that encapsulates all mutable data structures: - `SessionState`: session-persistent data (history, approved commands, token/rate-limit info, maybe user preferences). - `ActiveTurn`: metadata for the currently running turn (sub-id, task kind, abort handle) and an `Arc<TurnState>`. - `TurnState`: all turn-scoped pieces (pending inputs, approval waiters, diff tracker, review history, auto-compact flags, last agent message, outstanding tool call bookkeeping). - Group long-lived helpers/managers into a dedicated `SessionServices` struct so `Session` does not accumulate "random" fields. - Provide clear, lock-safe APIs so other modules never touch raw mutexes. - Ensure every turn creates/drops a `TurnState` and that interrupts/finishes delegate cleanup to it.	2025-09-25 12:16:06 +02:00
pakrym-oai	addc946d13	Simplify tool implemetations (#4160 ) Use Result<String, FunctionCallError> for all tool handling code and rely on error propagation instead of creating failed items everywhere.	2025-09-24 17:27:35 +00:00
Ahmed Ibrahim	8227a5ba1b	Send limits when getting rate limited (#4102 ) Users need visibility on rate limits when they are rate limited.	2025-09-23 22:56:34 +00:00
pakrym-oai	fdb8dadcae	Add exec output-schema parameter (#4079 ) Adds structured output to `exec` via the `--structured-output` parameter.	2025-09-23 13:59:16 -07:00
jif-oai	b84a920067	chore: compact do not modify instructions (#4088 ) Keep the developer instruction and insert the summarisation message as a user message instead	2025-09-23 17:59:17 +01:00
pakrym-oai	5c7d9e27b1	Add notifier tests (#4064 ) Proposal: 1. Use anyhow for tests and avoid unwrap 2. Extract a helper for starting a test instance of codex	2025-09-23 14:25:46 +00:00
jif-oai	be366a31ab	chore: clippy on redundant closure (#4058 ) Add redundant closure clippy rules and let Codex fix it by minimising FQP	2025-09-22 19:30:16 +00:00
Jeremy Rose	19f46439ae	timeouts for mcp tool calls (#3959 ) defaults to 60sec, overridable with MCP_TOOL_TIMEOUT or on a per-server basis in the config.	2025-09-22 10:30:59 -07:00
Ahmed Ibrahim	04504d8218	Forward Rate limits to the UI (#3965 ) We currently get information about rate limits in the response headers. We want to forward them to the clients to have better transparency. UI/UX plans have been discussed and this information is needed.	2025-09-20 21:26:16 -07:00
Ahmed Ibrahim	a7fda70053	Use a unified shell tell to not break cache (#3814 ) Currently, we change the tool description according to the sandbox policy and approval policy. This breaks the cache when the user hits `/approvals`. This PR does the following: - Always use the shell with escalation parameter: - removes `create_shell_tool_for_sandbox` and always uses unified tool via `create_shell_tool` - Reject the func call when the model uses escalation parameter when it cannot.	2025-09-19 00:08:28 +00:00
Michael Bolin	8595237505	fix: ensure cwd for conversation and sandbox are separate concerns (#3874 ) Previous to this PR, both of these functions take a single `cwd`: `71038381aa/codex-rs/core/src/seatbelt.rs (L19-L25)` `71038381aa/codex-rs/core/src/landlock.rs (L16-L23)` whereas `cwd` and `sandbox_cwd` should be set independently (fixed in this PR). Added `sandbox_distinguishes_command_and_policy_cwds()` to `codex-rs/exec/tests/suite/sandbox.rs` to verify this.	2025-09-18 14:37:06 -07:00
dedrisian-oai	62258df92f	feat: /review (#3774 ) Adds `/review` action in TUI <img width="637" height="370" alt="Screenshot 2025-09-17 at 12 41 19 AM" src="https://github.com/user-attachments/assets/b1979a6e-844a-4b97-ab20-107c185aec1d" />	2025-09-18 14:14:16 -07:00
Jeremy Rose	b34e906396	Reland "refactor transcript view to handle HistoryCells" (#3753 ) Reland of #3538	2025-09-18 20:55:53 +00:00
jif-oai	277fc6254e	chore: use tokio mutex and async function to prevent blocking a worker (#3850 ) ### Why Use `tokio::sync::Mutex` `std::sync::Mutex` are not _async-aware_. As a result, they will block the entire thread instead of just yielding the task. Furthermore they can be poisoned which is not the case of `tokio` Mutex. This allows the Tokio runtime to continue running other tasks while waiting for the lock, preventing deadlocks and performance bottlenecks. In general, this is preferred in async environment	2025-09-18 18:21:52 +01:00
dedrisian-oai	72733e34c4	Add dev message upon review out (#3758 ) Proposal: We want to record a dev message like so: ``` { "type": "message", "role": "user", "content": [ { "type": "input_text", "text": "<user_action> <context>User initiated a review task. Here's the full review output from reviewer model. User may select one or more comments to resolve.</context> <action>review</action> <results> {findings_str} </results> </user_action>" } ] }, ``` Without showing in the chat transcript. Rough idea, but it fixes issue where the user finishes a review thread, and asks the parent "fix the rest of the review issues" thinking that the parent knows about it. ### Question: Why not a tool call? Because the agent didn't make the call, it was a human. + we haven't implemented sub-agents yet, and we'll need to think about the way we represent these human-led tool calls for the agent.	2025-09-16 18:43:32 -07:00
dedrisian-oai	7fe4021f95	Review mode core updates (#3701 ) 1. Adds the environment prompt (including cwd) to review thread 2. Prepends the review prompt as a user message (temporary fix so the instructions are not replaced on backend) 3. Sets reasoning to low 4. Sets default review model to `gpt-5-codex`	2025-09-16 13:36:51 -07:00

1 2 3 4 5

212 Commits