mirror of
https://github.com/openai/codex.git
synced 2026-04-24 22:54:54 +00:00
5.9 KiB
5.9 KiB
codex-sdk-v2
codex-sdk-v2 is an experimental Python prototype that borrows the host/runtime split from Universal Computer but uses codex app-server as the execution runtime.
Prototype shape:
- The host SDK owns workspace materialization, Codex process startup, and Responses API transport.
- A host bridge exposes
/v1/responsesto the locally running Codex runtime. - Codex runs with
codex app-server --listen stdio://. - The SDK talks to app-server over stdio.
- Thread startup uses
thread/start.sdkDelegationto point Codex at the host bridge. - The SDK owns the bridge lifecycle and
await task.close()tears down both the app-server session and the bridge. - The prototype uses a local attached-process backend so it can run against the host-installed Codex binary without cross-compiling a Linux container binary.
Capability model:
- Capabilities are the SDK’s grouping abstraction for UC-style bundles.
- A
Capabilitycan contribute:tools()instructions()process_manifest(manifest)
- The capability API intentionally uses a single
tools()method; the built-in vs function-tool split stays internal to the SDK runtime. - The default capability set is
UnifiedExecCapability(), which enablesExecCommandandWriteStdin.
Tool model:
- Built-in Codex tools are exposed as Python classes such as
ExecCommand,WriteStdin,ApplyPatch,ReadFile, andViewImage. - The SDK sends those classes to app-server as an exact
thread/start.builtinToolsallowlist. - Defaults come from
UnifiedExecCapability(), which enablesExecCommandplusWriteStdin. - Host-side custom tools subclass
FunctionTool; the SDK registers them as dynamic tools internally and answersitem/tool/callrequests on the host. - SDK users do not need to work with raw app-server
dynamicToolspayloads directly. - Custom
FunctionTools can contribute instruction fragments; the SDK folds those fragments intodeveloperInstructions. - Built-in tool instructions are owned by Codex itself and are composed in Rust from the enabled built-in capability set.
Example capability:
from codex_sdk_v2 import Capability, ExecCommand, WriteStdin
class UnifiedExec(Capability):
def tools(self):
return (ExecCommand, WriteStdin)
Pending tool call API:
task.pending_tool_calls()returns unresolved tool calls.- Each pending tool call supports
describe()andawait tool_call(task). - The pending tool call subclasses are:
PendingCommandExecutionPendingFileChangePendingFunctionToolCall
- The explicit host helpers are:
task.approve(...)task.reject(...)task.replace_command(...)task.run_function_tool(...)task.submit_tool_result(...)
Decision model:
ApproveDecision()RejectDecision()DeferDecision()ReplaceCommandDecision(command=[...])RunDecision(arguments=...)RespondDecision(result=...)
Approval model:
- Manual is the default.
- If a tool does not make a decision, its call stays pending in
task.pending_tool_calls(). FunctionTool.approve(call)can resolve or defer a function tool call.BuiltinTool.with_approval_policy(policy=...)can resolve or defer a built-in call.- There is no agent-wide global approval policy in the prototype.
Example:
from codex_sdk_v2 import Agent, ApproveDecision, DeferDecision
from codex_sdk_v2 import ExecCommand, FunctionTool, Manifest
from codex_sdk_v2 import PendingCommandExecution, ReplaceCommandDecision, WriteStdin
class LookupRefundStatus(FunctionTool):
name = "lookup_refund_status"
description = "Return a canned refund status for a demo taxpayer id."
input_schema = {
"type": "object",
"properties": {"taxpayer_id": {"type": "string"}},
"required": ["taxpayer_id"],
"additionalProperties": False,
}
async def approve(self, call):
if call.arguments["taxpayer_id"].startswith("demo_"):
return ApproveDecision()
return DeferDecision()
async def run(self, arguments):
return f"Refund status for {arguments['taxpayer_id']}: approved"
async def approve_exec(call: PendingCommandExecution):
if call.command and call.command.startswith("ls"):
return ApproveDecision()
if call.command and call.command.startswith("cat"):
return ReplaceCommandDecision(command=["sed", "-n", "1,20p", "README.md"])
return DeferDecision()
agent = Agent(
manifest=Manifest(root="/workspace"),
tools=(
ExecCommand.with_approval_policy(policy=approve_exec),
WriteStdin,
LookupRefundStatus(),
),
)
task = await agent.start()
await stream_turn(task, start_text="Help me with my taxes")
while task.pending_tool_calls():
for tool_call in task.pending_tool_calls():
print(tool_call.describe())
await tool_call(task)
await stream_turn(task)
Current delegation shape:
- The SDK starts a local HTTP bridge on the host.
thread/start.sdkDelegation.bridgeUrltells Codex to use that host bridge as its Responses base URL for the thread.- Codex sends the raw Responses request body to the host bridge.
- The host bridge adds the upstream
Authorizationheader on the host side and forwards the request to OpenAI. - The bridge streams the upstream response back to Codex unchanged.
This means the prototype is bridge-based delegation, not the full event-by-event delegated Responses flow from the RFC yet.
Debugging:
- Set
CODEX_SDK_V2_DEBUG=1to print JSON-RPC traffic and app-server stderr while running an example. - The local backend prefers a repo-built
codex-rs/target/debug/codex-app-serverbinary when present; otherwise it falls back tocodexon yourPATH.
Current limitation:
- The UC-style pending-tool-call flow is now present in-memory on the SDK task object. Persisting unresolved tool calls cleanly across a full host process restart still depends on replay behavior from app-server for the underlying pending request type.