mirror of
https://github.com/openai/codex.git
synced 2026-05-03 02:46:39 +00:00
[codex] Improve PR babysitter CI diagnostics and guardrails (#20484)
## Summary

- Surface failed GitHub Actions jobs in the PR babysitter watcher so Codex can fetch job logs as soon as a job fails, instead of waiting for the overall workflow run to complete.
- Update babysit-pr skill instructions, GitHub API notes, and heuristics to prefer direct job log archives before falling back to `gh run view --log-failed`.
- Add guardrails requiring explicit user confirmation before posting replies to human-authored review comments.
- Add guardrails preventing Codex from patching unrelated flaky tests, CI infrastructure, runner issues, dependency outages, or other failures not caused by the PR branch.

## Validation

- `python3 -m pytest .codex/skills/babysit-pr/scripts/test_gh_pr_watch.py`
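The log-fetching preference described in the summary can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the helper name `log_commands` and its arguments are hypothetical and are not taken from the watcher script; the two `gh` invocations are the ones listed in the API notes changed by this commit.

```python
def log_commands(owner: str, repo: str, run_id: int,
                 failed_job_ids: list[int]) -> list[list[str]]:
    """Return gh argv lists in the order they would be tried (hypothetical helper)."""
    # Preferred: one direct job-log request per job that has already failed.
    commands = [
        ["gh", "api", f"repos/{owner}/{repo}/actions/jobs/{job_id}/logs"]
        for job_id in failed_job_ids
    ]
    # Fallback: run-level failed logs, which may not be available
    # until the overall workflow run completes.
    commands.append(["gh", "run", "view", str(run_id), "--log-failed"])
    return commands


if __name__ == "__main__":
    # Placeholder owner, repo, and IDs for illustration only.
    for argv in log_commands("example-owner", "example-repo", 7, [101, 102]):
        print(" ".join(argv))
```

The sketch only builds the command lists; actually running them (and redirecting the log archive to a file) is left to the caller.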
@@ -23,9 +23,11 @@ Used to discover failed workflow runs and rerunnable run IDs.
 ### Failed log inspection
 
 - `gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha`
+- `gh api repos/{owner}/{repo}/actions/runs/{run_id}/jobs -X GET -f per_page=100`
+- `gh api repos/{owner}/{repo}/actions/jobs/{job_id}/logs > /tmp/codex-gh-job-{job_id}-logs.zip`
 - `gh run view <run-id> --log-failed`
 
-Used by Codex to classify branch-related vs flaky/unrelated failures.
+Used by Codex to classify branch-related vs flaky/unrelated failures. Prefer the direct job log endpoint as soon as a job has failed because `gh run view --log-failed` may not produce failed-job logs until the overall workflow run completes.
 
 ### Retry failed jobs only
 
@@ -70,3 +72,11 @@ Reruns only failed jobs (and dependencies) for a workflow run.
 - `conclusion`
 - `html_url`
 - `head_sha`
+
+### Actions run jobs API (`jobs[]`)
+
+- `id`
+- `name`
+- `status`
+- `conclusion`
+- `html_url`
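As a rough illustration of how the `jobs[]` fields listed in the diff might be used to spot failed jobs while the run is still in progress, here is a minimal Python sketch. The function name `failed_jobs`, the `FAILED_CONCLUSIONS` set, and the sample payload are assumptions made for illustration; they are not code from this commit.

```python
from typing import Any

# Which conclusions count as "failed" is an assumption of this sketch.
FAILED_CONCLUSIONS = {"failure", "timed_out"}

# The jobs[] fields named in the API notes.
JOB_FIELDS = ("id", "name", "status", "conclusion", "html_url")


def failed_jobs(jobs_payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the documented fields for every job that has already failed."""
    picked = []
    for job in jobs_payload.get("jobs", []):
        # A job can be completed and failed while sibling jobs are still running.
        if job.get("status") == "completed" and job.get("conclusion") in FAILED_CONCLUSIONS:
            picked.append({field: job.get(field) for field in JOB_FIELDS})
    return picked


if __name__ == "__main__":
    # Hypothetical payload shaped like the run jobs API response.
    sample = {
        "jobs": [
            {"id": 1, "name": "build", "status": "in_progress",
             "conclusion": None, "html_url": "https://example.invalid/1"},
            {"id": 2, "name": "test", "status": "completed",
             "conclusion": "failure", "html_url": "https://example.invalid/2"},
        ]
    }
    print([job["name"] for job in failed_jobs(sample)])  # ['test']
```

Each returned `id` could then be substituted for `{job_id}` in the direct job log endpoint shown in the notes.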