mirror of https://github.com/openai/codex.git synced 2026-04-25 07:05:38 +00:00

Eric Traut 7e569f1162 Add PR babysitting skill for this repo (#12513)

## PR Notes

This PR adds a project-scoped `babysit-pr` skill for ongoing PR
monitoring (CI, reviews, mergeability).

Invoke this skill after creating a PR, and Codex will do its best
to get it to a mergeable state.

### What the skill does
* Fixes CI failures related to the PR
* Retries CI failures due to flaky tests
* Addresses code review comments if it agrees with them
* Resolves merge conflicts with the main branch

### How the skill works
- Polls PR status on a loop (CI checks, workflow runs, review activity,
mergeability, and review decision).
- Detects new review feedback (including inline comments and automated
Codex review comments) and prompts/handles follow-up work.
- Distinguishes pending vs failed vs passed CI and identifies likely
flaky failures.
- Can retry failed checks/workflows when appropriate.
- Prioritizes actionable code review feedback over flaky CI retries (to
avoid rerunning CI on a SHA that is about to be replaced).
- Continues monitoring after fixes are applied and pushed, rather than
stopping after a progress update.
- Uses a slower backoff polling cadence once CI is green, while still
watching for new review feedback or state changes.
- Treats required review/approval as a blocking condition and keeps
watching until the PR is actually merge-ready (or merged/closed, or
human intervention is needed).
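
The slower polling cadence once CI is green could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `next_poll_interval` and the values for `base`, `factor`, and `cap` are assumptions chosen for the example.

```python
def next_poll_interval(ci_green: bool, current: float,
                       base: float = 30.0, factor: float = 2.0,
                       cap: float = 600.0) -> float:
    """Return the number of seconds to wait before the next poll.

    All defaults are hypothetical; the source only says that polling
    slows down once CI is green.
    """
    if not ci_green:
        # CI is pending or failing: poll at the fast base cadence.
        return base
    # CI is green: back off geometrically, but never beyond the cap,
    # so new review feedback is still noticed within a bounded time.
    return min(max(current, base) * factor, cap)
```

Under these assumptions, a caller would feed the previous interval back in on each iteration and fall back to `base` whenever a state change (new commit, new review comment, failed check) is observed.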

### Intended outcome

Keep the PR moving with minimal manual babysitting by continuously
watching for CI failures, reviewer feedback, and merge blockers, and
responding in the right order until the PR is ready to merge.

2026-02-22 15:36:28 -08:00

# CI / Review Heuristics

## CI classification checklist

Treat as branch-related when logs clearly indicate a regression caused by the PR branch:

- Compile/typecheck/lint failures in files or modules touched by the branch
- Deterministic unit/integration test failures in changed areas
- Snapshot output changes caused by UI/text changes in the branch
- Static analysis violations introduced by the latest push
- Build script/config changes in the PR causing a deterministic failure

Treat as likely flaky or unrelated when evidence points to transient or external issues:

- DNS/network/registry timeout errors while fetching dependencies
- Runner image provisioning or startup failures
- GitHub Actions infrastructure/service outages
- Cloud/service rate limits or transient API outages
- Non-deterministic failures in unrelated integration tests with known flake patterns

If uncertain, inspect the failed logs once before choosing to rerun.
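
A rough sketch of how the checklist might be applied to a failed job is shown below. The function name, the three labels, and the `TRANSIENT_PATTERNS` list are all hypothetical; a real classifier would need patterns tuned to the repository's actual CI logs.

```python
import re

# Hypothetical patterns hinting at transient or external failures.
TRANSIENT_PATTERNS = [
    r"could not resolve host",
    r"connection (reset|timed out)",
    r"rate limit",
    r"\b50[234]\b",
]


def classify_failure(log_text: str,
                     failing_files: list[str],
                     touched_files: list[str]) -> str:
    """Classify a failed check as branch-related, likely-flaky, or uncertain."""
    # A failure in a file the branch touched is treated as branch-related.
    if set(failing_files) & set(touched_files):
        return "branch-related"
    lowered = log_text.lower()
    # Otherwise look for signs of network, registry, or service trouble.
    if any(re.search(pattern, lowered) for pattern in TRANSIENT_PATTERNS):
        return "likely-flaky"
    # Neither signal is present: the logs need a closer look.
    return "uncertain"
```

The `"uncertain"` result corresponds to the case where the failed logs should be inspected before deciding on a rerun.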

## Decision tree (fix vs rerun vs stop)

- If PR is merged/closed: stop.
- If there are failed checks:
  - Diagnose first.
  - If branch-related: fix locally, commit, push.
  - If likely flaky/unrelated and all checks for the current SHA are terminal: rerun failed jobs.
  - If checks are still pending: wait.
- If flaky reruns for the same SHA reach the configured limit (default 3): stop and report persistent failure.
- Independently, process any new human review comments.
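
The decision tree above can be read as a small pure function. The sketch below is one possible encoding, not the skill's own code: the `Action` names and the `decide` signature are assumptions, the rerun limit is checked before a rerun is issued (an interpretation of the ordering), and review comments are left out because they are processed independently.

```python
from enum import Enum


class Action(Enum):
    STOP = "stop"
    FIX_AND_PUSH = "fix_and_push"
    RERUN_FAILED = "rerun_failed"
    WAIT = "wait"
    STOP_AND_REPORT = "stop_and_report"


def decide(pr_open: bool, has_failed_checks: bool, branch_related: bool,
           all_checks_terminal: bool, reruns_used: int,
           rerun_limit: int = 3) -> Action:
    """Pick the next CI action for the current SHA."""
    if not pr_open:
        return Action.STOP                  # merged or closed
    if not has_failed_checks:
        return Action.WAIT                  # nothing to fix; keep watching
    if branch_related:
        return Action.FIX_AND_PUSH          # fix locally, commit, push
    if not all_checks_terminal:
        return Action.WAIT                  # some checks are still pending
    if reruns_used >= rerun_limit:
        return Action.STOP_AND_REPORT       # flaky retry budget exhausted
    return Action.RERUN_FAILED              # likely flaky or unrelated
```

For example, `decide(True, True, False, True, 2)` yields `Action.RERUN_FAILED`, while the same call with `reruns_used=3` yields `Action.STOP_AND_REPORT`.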

## Review comment agreement criteria

Address the comment when:

- The comment is technically correct.
- The change is actionable in the current branch.
- The requested change does not conflict with the user’s intent or recent guidance.
- The change can be made safely without unrelated refactors.

Do not auto-fix when:

- The comment is ambiguous and needs clarification.
- The request conflicts with explicit user instructions.
- The proposed change requires product/design decisions the user has not made.
- The codebase is in a dirty/unrelated state that makes safe editing uncertain.
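
One way to combine the two lists is to treat the "do not auto-fix" items as vetoes that override the "address" items. The sketch below assumes that reading; the `CommentAssessment` type and its field names are hypothetical, and in practice each flag is a judgment call rather than a computed value.

```python
from dataclasses import dataclass


@dataclass
class CommentAssessment:
    technically_correct: bool
    actionable_in_branch: bool
    conflicts_with_user_intent: bool
    needs_unrelated_refactor: bool
    ambiguous: bool
    needs_product_decision: bool
    worktree_dirty: bool


def should_auto_address(a: CommentAssessment) -> bool:
    """Return True only if every 'address' criterion holds and no veto applies."""
    vetoed = (a.ambiguous
              or a.conflicts_with_user_intent
              or a.needs_product_decision
              or a.worktree_dirty)
    if vetoed:
        return False
    return (a.technically_correct
            and a.actionable_in_branch
            and not a.needs_unrelated_refactor)
```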

## Stop-and-ask conditions

Stop and ask the user instead of continuing automatically when:

- The local worktree has unrelated uncommitted changes.
- `gh` authentication or permission checks fail.
- The PR branch cannot be pushed.
- CI failures persist after the flaky retry budget.
- Reviewer feedback requires a product decision or cross-team coordination.
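
As an illustration of the first condition, the sketch below scans the text produced by `git status --porcelain` for changed paths outside the set the current fix is expected to touch. The function name and the `expected_paths` parameter are assumptions, and the parsing is deliberately simple: it handles the short `XY path` format and renames, but not quoted paths with special characters.

```python
def unrelated_dirty_paths(porcelain_output: str,
                          expected_paths: set[str]) -> list[str]:
    """List changed paths that are not part of the expected edit set."""
    unrelated = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        # Short format: two status characters, a space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        if path not in expected_paths:
            unrelated.append(path)
    return unrelated
```

A non-empty result would be a reason to stop and ask the user rather than commit on top of changes the skill did not make.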