mirror of
https://github.com/openai/codex.git
synced 2026-05-29 23:40:29 +00:00
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved
## Runtime model
This PR does not introduce a new `approval_policy`.
Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
- `user`
- `guardian_subagent`
`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.
The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.
When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`
Users can still change `/approvals` afterward.
Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior
ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.
## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.
- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface
## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.
It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`
`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`
`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.
These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.
This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.
## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
- resolved approval/denial state in history
## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)
Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)
---------
Co-authored-by: Codex <noreply@openai.com>
407 lines
14 KiB
Rust
407 lines
14 KiB
Rust
use super::*;
|
|
use crate::mcp_tool_call::MCP_TOOL_APPROVAL_DECLINE_SYNTHETIC;
|
|
use crate::mcp_tool_call::MCP_TOOL_APPROVAL_QUESTION_ID_PREFIX;
|
|
use async_channel::bounded;
|
|
use codex_protocol::config_types::ApprovalsReviewer;
|
|
use codex_protocol::models::NetworkPermissions;
|
|
use codex_protocol::models::ResponseItem;
|
|
use codex_protocol::protocol::AgentStatus;
|
|
use codex_protocol::protocol::AskForApproval;
|
|
use codex_protocol::protocol::EventMsg;
|
|
use codex_protocol::protocol::ExecApprovalRequestEvent;
|
|
use codex_protocol::protocol::GuardianAssessmentEvent;
|
|
use codex_protocol::protocol::GuardianAssessmentStatus;
|
|
use codex_protocol::protocol::McpInvocation;
|
|
use codex_protocol::protocol::RawResponseItemEvent;
|
|
use codex_protocol::protocol::ReviewDecision;
|
|
use codex_protocol::protocol::TurnAbortReason;
|
|
use codex_protocol::protocol::TurnAbortedEvent;
|
|
use codex_protocol::request_permissions::RequestPermissionProfile;
|
|
use codex_protocol::request_permissions::RequestPermissionsEvent;
|
|
use codex_protocol::request_permissions::RequestPermissionsResponse;
|
|
use codex_protocol::request_user_input::RequestUserInputAnswer;
|
|
use codex_protocol::request_user_input::RequestUserInputEvent;
|
|
use codex_protocol::request_user_input::RequestUserInputQuestion;
|
|
use pretty_assertions::assert_eq;
|
|
use serde_json::json;
|
|
use std::collections::HashMap;
|
|
use std::path::PathBuf;
|
|
use std::sync::Arc;
|
|
use tokio::sync::Mutex;
|
|
use tokio::sync::watch;
|
|
use tokio::time::timeout;
|
|
|
|
#[tokio::test]
|
|
async fn forward_events_cancelled_while_send_blocked_shuts_down_delegate() {
|
|
let (tx_events, rx_events) = bounded(1);
|
|
let (tx_sub, rx_sub) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_agent_status_tx, agent_status) = watch::channel(AgentStatus::PendingInit);
|
|
let (session, ctx, _rx_evt) = crate::codex::make_session_and_context_with_rx().await;
|
|
let codex = Arc::new(Codex {
|
|
tx_sub,
|
|
rx_event: rx_events,
|
|
agent_status,
|
|
session: Arc::clone(&session),
|
|
session_loop_termination: completed_session_loop_termination(),
|
|
});
|
|
|
|
let (tx_out, rx_out) = bounded(1);
|
|
tx_out
|
|
.send(Event {
|
|
id: "full".to_string(),
|
|
msg: EventMsg::TurnAborted(TurnAbortedEvent {
|
|
turn_id: Some("turn-1".to_string()),
|
|
reason: TurnAbortReason::Interrupted,
|
|
}),
|
|
})
|
|
.await
|
|
.unwrap();
|
|
|
|
let cancel = CancellationToken::new();
|
|
let forward = tokio::spawn(forward_events(
|
|
Arc::clone(&codex),
|
|
tx_out.clone(),
|
|
session,
|
|
ctx,
|
|
Arc::new(Mutex::new(HashMap::new())),
|
|
cancel.clone(),
|
|
));
|
|
|
|
tx_events
|
|
.send(Event {
|
|
id: "evt".to_string(),
|
|
msg: EventMsg::RawResponseItem(RawResponseItemEvent {
|
|
item: ResponseItem::CustomToolCall {
|
|
id: None,
|
|
status: None,
|
|
call_id: "call-1".to_string(),
|
|
name: "tool".to_string(),
|
|
input: "{}".to_string(),
|
|
},
|
|
}),
|
|
})
|
|
.await
|
|
.unwrap();
|
|
|
|
drop(tx_events);
|
|
cancel.cancel();
|
|
timeout(std::time::Duration::from_millis(1000), forward)
|
|
.await
|
|
.expect("forward_events hung")
|
|
.expect("forward_events join error");
|
|
|
|
let received = rx_out.recv().await.expect("prefilled event missing");
|
|
assert_eq!("full", received.id);
|
|
let mut ops = Vec::new();
|
|
while let Ok(sub) = rx_sub.try_recv() {
|
|
ops.push(sub.op);
|
|
}
|
|
assert!(
|
|
ops.iter().any(|op| matches!(op, Op::Interrupt)),
|
|
"expected Interrupt op after cancellation"
|
|
);
|
|
assert!(
|
|
ops.iter().any(|op| matches!(op, Op::Shutdown)),
|
|
"expected Shutdown op after cancellation"
|
|
);
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn forward_ops_preserves_submission_trace_context() {
|
|
let (tx_sub, rx_sub) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_tx_events, rx_events) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_agent_status_tx, agent_status) = watch::channel(AgentStatus::PendingInit);
|
|
let (session, _ctx, _rx_evt) = crate::codex::make_session_and_context_with_rx().await;
|
|
let codex = Arc::new(Codex {
|
|
tx_sub,
|
|
rx_event: rx_events,
|
|
agent_status,
|
|
session,
|
|
session_loop_termination: completed_session_loop_termination(),
|
|
});
|
|
let (tx_ops, rx_ops) = bounded(1);
|
|
let cancel = CancellationToken::new();
|
|
let forward = tokio::spawn(forward_ops(Arc::clone(&codex), rx_ops, cancel));
|
|
|
|
let submission = Submission {
|
|
id: "sub-1".to_string(),
|
|
op: Op::Interrupt,
|
|
trace: Some(codex_protocol::protocol::W3cTraceContext {
|
|
traceparent: Some(
|
|
"00-1234567890abcdef1234567890abcdef-1234567890abcdef-01".to_string(),
|
|
),
|
|
tracestate: Some("vendor=state".to_string()),
|
|
}),
|
|
};
|
|
tx_ops.send(submission.clone()).await.unwrap();
|
|
drop(tx_ops);
|
|
|
|
let forwarded = timeout(Duration::from_secs(1), rx_sub.recv())
|
|
.await
|
|
.expect("forward_ops hung")
|
|
.expect("forwarded submission missing");
|
|
assert_eq!(submission.id, forwarded.id);
|
|
assert_eq!(submission.op, forwarded.op);
|
|
assert_eq!(submission.trace, forwarded.trace);
|
|
|
|
timeout(Duration::from_secs(1), forward)
|
|
.await
|
|
.expect("forward_ops did not exit")
|
|
.expect("forward_ops join error");
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn handle_request_permissions_uses_tool_call_id_for_round_trip() {
|
|
let (parent_session, parent_ctx, rx_events) =
|
|
crate::codex::make_session_and_context_with_rx().await;
|
|
*parent_session.active_turn.lock().await = Some(crate::state::ActiveTurn::default());
|
|
|
|
let (tx_sub, rx_sub) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_tx_events, rx_events_child) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_agent_status_tx, agent_status) = watch::channel(AgentStatus::PendingInit);
|
|
let codex = Arc::new(Codex {
|
|
tx_sub,
|
|
rx_event: rx_events_child,
|
|
agent_status,
|
|
session: Arc::clone(&parent_session),
|
|
session_loop_termination: completed_session_loop_termination(),
|
|
});
|
|
|
|
let call_id = "tool-call-1".to_string();
|
|
let expected_response = RequestPermissionsResponse {
|
|
permissions: RequestPermissionProfile {
|
|
network: Some(NetworkPermissions {
|
|
enabled: Some(true),
|
|
}),
|
|
..RequestPermissionProfile::default()
|
|
},
|
|
scope: PermissionGrantScope::Turn,
|
|
};
|
|
let cancel_token = CancellationToken::new();
|
|
let request_call_id = call_id.clone();
|
|
|
|
let handle = tokio::spawn({
|
|
let codex = Arc::clone(&codex);
|
|
let parent_session = Arc::clone(&parent_session);
|
|
let parent_ctx = Arc::clone(&parent_ctx);
|
|
let cancel_token = cancel_token.clone();
|
|
async move {
|
|
handle_request_permissions(
|
|
codex.as_ref(),
|
|
&parent_session,
|
|
&parent_ctx,
|
|
RequestPermissionsEvent {
|
|
call_id: request_call_id,
|
|
turn_id: "child-turn-1".to_string(),
|
|
reason: Some("need access".to_string()),
|
|
permissions: RequestPermissionProfile {
|
|
network: Some(NetworkPermissions {
|
|
enabled: Some(true),
|
|
}),
|
|
..RequestPermissionProfile::default()
|
|
},
|
|
},
|
|
&cancel_token,
|
|
)
|
|
.await;
|
|
}
|
|
});
|
|
|
|
let request_event = timeout(Duration::from_secs(1), rx_events.recv())
|
|
.await
|
|
.expect("request_permissions event timed out")
|
|
.expect("request_permissions event missing");
|
|
let EventMsg::RequestPermissions(request) = request_event.msg else {
|
|
panic!("expected RequestPermissions event");
|
|
};
|
|
assert_eq!(request.call_id, call_id.clone());
|
|
|
|
parent_session
|
|
.notify_request_permissions_response(&call_id, expected_response.clone())
|
|
.await;
|
|
|
|
timeout(Duration::from_secs(1), handle)
|
|
.await
|
|
.expect("handle_request_permissions hung")
|
|
.expect("handle_request_permissions join error");
|
|
|
|
let submission = timeout(Duration::from_secs(1), rx_sub.recv())
|
|
.await
|
|
.expect("request_permissions response timed out")
|
|
.expect("request_permissions response missing");
|
|
assert_eq!(
|
|
submission.op,
|
|
Op::RequestPermissionsResponse {
|
|
id: call_id,
|
|
response: expected_response,
|
|
}
|
|
);
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn handle_exec_approval_uses_call_id_for_guardian_review_and_approval_id_for_reply() {
|
|
let (parent_session, parent_ctx, rx_events) =
|
|
crate::codex::make_session_and_context_with_rx().await;
|
|
let mut parent_ctx = Arc::try_unwrap(parent_ctx).expect("single turn context ref");
|
|
let mut config = (*parent_ctx.config).clone();
|
|
config.approvals_reviewer = ApprovalsReviewer::GuardianSubagent;
|
|
parent_ctx.config = Arc::new(config);
|
|
parent_ctx
|
|
.approval_policy
|
|
.set(AskForApproval::OnRequest)
|
|
.expect("set on-request policy");
|
|
let parent_ctx = Arc::new(parent_ctx);
|
|
|
|
let (tx_sub, rx_sub) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_tx_events, rx_events_child) = bounded(SUBMISSION_CHANNEL_CAPACITY);
|
|
let (_agent_status_tx, agent_status) = watch::channel(AgentStatus::PendingInit);
|
|
let codex = Arc::new(Codex {
|
|
tx_sub,
|
|
rx_event: rx_events_child,
|
|
agent_status,
|
|
session: Arc::clone(&parent_session),
|
|
session_loop_termination: completed_session_loop_termination(),
|
|
});
|
|
|
|
let cancel_token = CancellationToken::new();
|
|
let handle = tokio::spawn({
|
|
let codex = Arc::clone(&codex);
|
|
let parent_session = Arc::clone(&parent_session);
|
|
let parent_ctx = Arc::clone(&parent_ctx);
|
|
let cancel_token = cancel_token.clone();
|
|
async move {
|
|
handle_exec_approval(
|
|
codex.as_ref(),
|
|
"child-turn-1".to_string(),
|
|
&parent_session,
|
|
&parent_ctx,
|
|
ExecApprovalRequestEvent {
|
|
call_id: "command-item-1".to_string(),
|
|
approval_id: Some("callback-approval-1".to_string()),
|
|
turn_id: "child-turn-1".to_string(),
|
|
command: vec!["rm".to_string(), "-rf".to_string(), "tmp".to_string()],
|
|
cwd: PathBuf::from("/tmp"),
|
|
reason: Some("unsafe subcommand".to_string()),
|
|
network_approval_context: None,
|
|
proposed_execpolicy_amendment: None,
|
|
proposed_network_policy_amendments: None,
|
|
additional_permissions: None,
|
|
skill_metadata: None,
|
|
available_decisions: Some(vec![
|
|
ReviewDecision::Approved,
|
|
ReviewDecision::Abort,
|
|
]),
|
|
parsed_cmd: Vec::new(),
|
|
},
|
|
&cancel_token,
|
|
)
|
|
.await;
|
|
}
|
|
});
|
|
|
|
let assessment_event = timeout(Duration::from_secs(2), async {
|
|
loop {
|
|
let event = rx_events.recv().await.expect("guardian assessment event");
|
|
if let EventMsg::GuardianAssessment(assessment) = event.msg {
|
|
return assessment;
|
|
}
|
|
}
|
|
})
|
|
.await
|
|
.expect("timed out waiting for guardian assessment");
|
|
assert_eq!(
|
|
assessment_event,
|
|
GuardianAssessmentEvent {
|
|
id: "command-item-1".to_string(),
|
|
turn_id: parent_ctx.sub_id.clone(),
|
|
status: GuardianAssessmentStatus::InProgress,
|
|
risk_score: None,
|
|
risk_level: None,
|
|
rationale: None,
|
|
action: Some(json!({
|
|
"tool": "shell",
|
|
"command": "rm -rf tmp",
|
|
"cwd": "/tmp",
|
|
})),
|
|
}
|
|
);
|
|
|
|
cancel_token.cancel();
|
|
|
|
timeout(Duration::from_secs(2), handle)
|
|
.await
|
|
.expect("handle_exec_approval hung")
|
|
.expect("handle_exec_approval join error");
|
|
|
|
let submission = timeout(Duration::from_secs(2), rx_sub.recv())
|
|
.await
|
|
.expect("exec approval response timed out")
|
|
.expect("exec approval response missing");
|
|
assert_eq!(
|
|
submission.op,
|
|
Op::ExecApproval {
|
|
id: "callback-approval-1".to_string(),
|
|
turn_id: Some("child-turn-1".to_string()),
|
|
decision: ReviewDecision::Abort,
|
|
}
|
|
);
|
|
}
|
|
|
|
#[tokio::test]
|
|
async fn delegated_mcp_guardian_abort_returns_synthetic_decline_answer() {
|
|
let (parent_session, parent_ctx, _rx_events) =
|
|
crate::codex::make_session_and_context_with_rx().await;
|
|
let mut parent_ctx = Arc::try_unwrap(parent_ctx).expect("single turn context ref");
|
|
let mut config = (*parent_ctx.config).clone();
|
|
config.approvals_reviewer = ApprovalsReviewer::GuardianSubagent;
|
|
parent_ctx.config = Arc::new(config);
|
|
parent_ctx
|
|
.approval_policy
|
|
.set(AskForApproval::OnRequest)
|
|
.expect("set on-request policy");
|
|
let parent_ctx = Arc::new(parent_ctx);
|
|
|
|
let pending_mcp_invocations = Arc::new(Mutex::new(HashMap::from([(
|
|
"call-1".to_string(),
|
|
McpInvocation {
|
|
server: "custom_server".to_string(),
|
|
tool: "dangerous_tool".to_string(),
|
|
arguments: None,
|
|
},
|
|
)])));
|
|
let cancel_token = CancellationToken::new();
|
|
cancel_token.cancel();
|
|
|
|
let response = maybe_auto_review_mcp_request_user_input(
|
|
&parent_session,
|
|
&parent_ctx,
|
|
&pending_mcp_invocations,
|
|
&RequestUserInputEvent {
|
|
call_id: "call-1".to_string(),
|
|
turn_id: "child-turn-1".to_string(),
|
|
questions: vec![RequestUserInputQuestion {
|
|
id: format!("{MCP_TOOL_APPROVAL_QUESTION_ID_PREFIX}_call-1"),
|
|
header: "Approve app tool call?".to_string(),
|
|
question: "Allow this app tool?".to_string(),
|
|
is_other: false,
|
|
is_secret: false,
|
|
options: None,
|
|
}],
|
|
},
|
|
&cancel_token,
|
|
)
|
|
.await;
|
|
|
|
assert_eq!(
|
|
response,
|
|
Some(RequestUserInputResponse {
|
|
answers: HashMap::from([(
|
|
format!("{MCP_TOOL_APPROVAL_QUESTION_ID_PREFIX}_call-1"),
|
|
RequestUserInputAnswer {
|
|
answers: vec![MCP_TOOL_APPROVAL_DECLINE_SYNTHETIC.to_string()],
|
|
},
|
|
)]),
|
|
})
|
|
);
|
|
}
|