codex

mirror of https://github.com/openai/codex.git synced 2026-05-24 21:14:51 +00:00

Author	SHA1	Message	Date
Michael Bolin	491a3058f6	fix(exec-server): retain output until streams close (#18946 ) ## Why A Mac Bazel run hit a flake in `server::handler::tests::output_and_exit_are_retained_after_notification_receiver_closes` where the read path observed process exit but lost the expected buffered stdout (`first\nsecond\n`). See the [GitHub Actions job](https://github.com/openai/codex/actions/runs/24758468552/job/72436716505) and [BuildBuddy invocation](https://app.buildbuddy.io/invocation/37475a12-4ef2-45fb-ab8a-e49a2aba1d59). The underlying race is that process exit is not the same thing as stdout/stderr closure. If a child or grandchild inherits the pipe write end, or a process duplicates it with `dup2`, the watched process can exit while the stream is still open and more output can still arrive. The exec-server was starting exited-process retention cleanup from the exit event, so the process entry could be removed before the output streams had actually closed. While stress-testing the exec-server unit suite, `server::handler::tests::long_poll_read_fails_after_session_resume` exposed a separate test race: it started a short-lived command that could exit and wake the pending long-poll read before the session-resume assertion observed the resumed-session error. That test is intended to cover resume eviction, not process-exit delivery, so this change keeps the process alive and quiet while the second connection resumes the session. ## What changed - Keep exec-server process entries retained until stdout/stderr streams close, then start the post-exit retention timer from the closed event. - Wake long-poll readers when the closed event is emitted. - Add focused `local_process` unit coverage that proves late output is still retained after the short test retention interval has elapsed, and that closed process entries are eventually evicted. - Add a local and remote regression test where a parent exits while a child keeps inherited stdout open. The child waits on an explicit release file, so the test deterministically observes exit first, releases the child, then requires a nonzero-wait read from the exit sequence to receive the late output. - In `codex-rs/exec-server/src/server/handler/tests.rs`, make `long_poll_read_fails_after_session_resume` run a long-lived silent command instead of a short command that prints and exits. This isolates the test to session-resume behavior and prevents a normal process exit from satisfying the pending long-poll read first. ## Testing - `cargo test -p codex-exec-server exec_process_retains_output_after_exit_until_streams_close` - `cargo test -p codex-exec-server local_process::tests` - `cargo test -p codex-exec-server` - `just fix -p codex-exec-server` - `bazel test //codex-rs/exec-server:exec-server-unit-tests //codex-rs/exec-server:exec-server-exec_process-test //codex-rs/exec-server:exec-server-file_system-test //codex-rs/exec-server:exec-server-http_client-test //codex-rs/exec-server:exec-server-initialize-test //codex-rs/exec-server:exec-server-process-test //codex-rs/exec-server:exec-server-websocket-test` - `bazel test --runs_per_test=25 //codex-rs/exec-server:exec-server-unit-tests` ## Documentation No docs update needed; this is an internal exec-server correctness fix.	2026-04-23 19:49:58 +00:00
Michael Bolin	d3dd0d759b	exec-server: expose arg0 alias root to fs sandbox (#19016 ) ## Why The post-merge `rust-ci-full` run for #18999 still failed the Ubuntu remote `suite::remote_env` sandboxed filesystem tests. That run checked out merge commit `ddde50c611e4800cb805f243ed3c50bbafe7d011`, so the arg0 guard lifetime fix was present. The Docker-backed failure had two remaining pieces: - The sandboxed filesystem helper needs to execute Codex through the `codex-linux-sandbox` arg0 alias path. The helper sandbox was only granting read access to the real Codex executable parent, so the alias parent also has to be visible inside the helper sandbox. - The remote-env tests were building sandbox contexts with `FileSystemSandboxContext::new()`, which captures the local test runner cwd. In the Docker remote exec-server, that host checkout path does not exist, so spawning the filesystem helper failed with `No such file or directory` before the helper could process the request. ## What Changed - Track all helper runtime read roots instead of a single root. - Add both the real Codex executable parent and the `codex-linux-sandbox` alias parent to sandbox readable roots. - Avoid sending an unused local cwd in remote filesystem sandbox contexts when the permission profile has no cwd-dependent entries. - Build the Docker remote-env test sandbox contexts with a cwd path that exists inside the container. - Add unit coverage for the alias-parent root and remote sandbox cwd handling. ## Verification - `cargo test -p codex-exec-server` - `cargo test -p codex-core remote_test_env_sandboxed_read_allows_readable_root` - `just fix -p codex-exec-server` - `just fix -p codex-core`	2026-04-22 21:34:22 +00:00
efrazer-oai	be75785504	fix: fully revert agent identity runtime wiring (#18757 ) ## Summary This PR fully reverts the previously merged Agent Identity runtime integration from the old stack: https://github.com/openai/codex/pull/17387/changes It removes the Codex-side task lifecycle wiring, rollout/session persistence, feature flag plumbing, lazy `auth.json` mutation, background task auth paths, and request callsite changes introduced by that stack. This leaves the repo in a clean pre-AgentIdentity integration state so the follow-up PRs can reintroduce the pieces in smaller reviewable layers. ## Stack 1. This PR: full revert 2. https://github.com/openai/codex/pull/18871: move Agent Identity business logic into a crate 3. https://github.com/openai/codex/pull/18785: add explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate auth callsites through AuthProvider ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 14:30:55 -07:00
Ahmed Ibrahim	2ca270d08d	[2/8] Support piped stdin in exec process API (#18086 ) ## Summary - Add an explicit stdin mode to process/start. - Keep normal non-interactive exec stdin closed while allowing pipe-backed processes. ## Stack ```text o #18027 [8/8] Fail exec client operations after disconnect │ o #18025 [7/8] Cover MCP stdio tests with executor placement │ o #18089 [6/8] Wire remote MCP stdio through executor │ o #18088 [5/8] Add executor process transport for MCP stdio │ o #18087 [4/8] Abstract MCP stdio server launching │ o #18020 [3/8] Add pushed exec process events │ @ #18086 [2/8] Support piped stdin in exec process API │ o #18085 [1/8] Add MCP server environment config │ o main ``` Co-authored-by: Codex <noreply@openai.com>	2026-04-16 10:30:10 -07:00
jif-oai	bacb92b1d7	Build remote exec env from exec-server policy (#17216 ) ## Summary - add an exec-server `envPolicy` field; when present, the server starts from its own process env and applies the shell environment policy there - keep `env` as the exact environment for local/embedded starts, but make it an overlay for remote unified-exec starts - move the shell-environment-policy builder into `codex-config` so Core and exec-server share the inherit/filter/set/include behavior - overlay only runtime/sandbox/network deltas from Core onto the exec-server-derived env ## Why Remote unified exec was materializing the shell env inside Core and forwarding the whole map to exec-server, so remote processes could inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the base env on the executor while preserving Core-owned runtime additions like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and sandbox marker env. ## Validation - `just fmt` - `git diff --check` - `cargo test -p codex-exec-server --lib` - `cargo test -p codex-core --lib unified_exec::process_manager::tests` - `cargo test -p codex-core --lib exec_env::tests` - `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter matched 0 tests) - `cargo test -p codex-config --lib shell_environment` (compile-only; filter matched 0 tests) - `just bazel-lock-update` ## Known local validation issue - `just bazel-lock-check` is not runnable in this checkout: it invokes `./scripts/check-module-bazel-lock.sh`, which is missing. --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: pakrym-oai <pakrym@openai.com>	2026-04-13 09:59:08 +01:00
starr-openai	d626dc3895	Run exec-server fs operations through sandbox helper (#17294 ) ## Summary - run exec-server filesystem RPCs requiring sandboxing through a `codex-fs` arg0 helper over stdin/stdout - keep direct local filesystem execution for `DangerFullAccess` and external sandbox policies - remove the standalone exec-server binary path in favor of top-level arg0 dispatch/runtime paths - add sandbox escape regression coverage for local and remote filesystem paths ## Validation - `just fmt` - `git diff --check` - remote devbox: `cd codex-rs && bazel test --bes_backend= --bes_results_url= //codex-rs/exec-server:all` (6/6 passed) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-12 18:36:03 -07:00
Adrian	39cc85310f	Add use_agent_identity feature flag (#17385 )	2026-04-11 09:52:06 -07:00
Eric Traut	824ec94eab	Fix Windows exec-server output test flake (#17409 ) Problem: The Windows exec-server test command could let separator whitespace become part of `echo` output, making the exact retained-output assertion flaky. Solution: Tighten the Windows `cmd.exe` command by placing command separators directly after the echoed tokens so stdout remains deterministic while preserving the exact assertion.	2026-04-10 19:24:40 -07:00
jif-oai	085ffb4456	feat: move exec-server ownership (#16344 ) This introduces session-scoped ownership for exec-server so ws disconnects no longer immediately kill running remote exec processes, and it prepares the protocol for reconnect-based resume. - add session_id / resume_session_id to the exec-server initialize handshake - move process ownership under a shared session registry - detach sessions on websocket disconnect and expire them after a TTL instead of killing processes immediately (we will resume based on this) - allow a new connection to resume an existing session and take over notifications/ownership - I use UUID to make them not predictable as we don't have auth for now - make detached-session expiry authoritative at resume time so teardown wins at the TTL boundary - reject long-poll process/read calls that get resumed out from under an older attachment --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-10 14:11:47 +01:00
jif-oai	6d2f4aaafc	feat: use `ProcessId` in `exec-server` (#15866 ) Use a full struct for the ProcessId to increase readability and make it easier in the future to make it evolve if needed	2026-03-26 16:45:36 +01:00
starr-openai	1d210f639e	Add exec-server exec RPC implementation (#15090 ) Stacked PR 2/3, based on the stub PR. Adds the exec RPC implementation and process/event flow in exec-server only. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-19 19:00:36 +00:00

11 Commits