From 2a0197e654e5bef8832fec71a7fdd902dc3c5dcb Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 12 Dec 2025 01:43:30 +0000 Subject: [PATCH] feat: Add swarm-coordination plugin for multi-agent conflict prevention Implements three complementary patterns for coordinating multi-agent swarms: 1. Status Polling (Fix 1): Orchestrator periodically spawns status-checker agents to monitor swarm health, detect stuck agents, and identify conflicts early. 2. File Claiming (Fix 2): Agents claim file ownership before editing via a claims registry (.claude/file-claims.md). Prevents multiple agents from editing the same file simultaneously. 3. Checkpoint-Based Orchestration (Fix 5): Separates swarm execution into phases - planning (read-only), conflict detection, resolution, then implementation with monitoring. Plugin contents: - /swarm command for full orchestrated workflow - status-checker agent (haiku, lightweight polling) - conflict-detector agent (analyzes plans for overlaps) - plan-reviewer agent (validates individual plans) - swarm-patterns skill with comprehensive documentation --- .../.claude-plugin/plugin.json | 7 + plugins/swarm-coordination/README.md | 152 ++++++++++ .../agents/conflict-detector.md | 108 +++++++ .../agents/plan-reviewer.md | 124 ++++++++ .../agents/status-checker.md | 81 +++++ plugins/swarm-coordination/commands/swarm.md | 287 ++++++++++++++++++ .../skills/swarm-patterns/SKILL.md | 80 +++++ .../swarm-patterns/examples/simple-swarm.md | 260 ++++++++++++++++ .../references/checkpoint-flow.md | 287 ++++++++++++++++++ .../references/file-claiming.md | 233 ++++++++++++++ .../references/status-polling.md | 152 ++++++++++ 11 files changed, 1771 insertions(+) create mode 100644 plugins/swarm-coordination/.claude-plugin/plugin.json create mode 100644 plugins/swarm-coordination/README.md create mode 100644 plugins/swarm-coordination/agents/conflict-detector.md create mode 100644 plugins/swarm-coordination/agents/plan-reviewer.md create mode 100644 
plugins/swarm-coordination/agents/status-checker.md create mode 100644 plugins/swarm-coordination/commands/swarm.md create mode 100644 plugins/swarm-coordination/skills/swarm-patterns/SKILL.md create mode 100644 plugins/swarm-coordination/skills/swarm-patterns/examples/simple-swarm.md create mode 100644 plugins/swarm-coordination/skills/swarm-patterns/references/checkpoint-flow.md create mode 100644 plugins/swarm-coordination/skills/swarm-patterns/references/file-claiming.md create mode 100644 plugins/swarm-coordination/skills/swarm-patterns/references/status-polling.md diff --git a/plugins/swarm-coordination/.claude-plugin/plugin.json b/plugins/swarm-coordination/.claude-plugin/plugin.json new file mode 100644 index 00000000..57e0d3a0 --- /dev/null +++ b/plugins/swarm-coordination/.claude-plugin/plugin.json @@ -0,0 +1,7 @@ +{ + "name": "swarm-coordination", + "version": "1.0.0", + "description": "Coordinates multi-agent swarms with status polling, file claiming, and checkpoint-based orchestration to prevent conflicts and enable proactive monitoring", + "author": "Anthropic", + "keywords": ["swarm", "multi-agent", "coordination", "orchestration", "parallel"] +} diff --git a/plugins/swarm-coordination/README.md b/plugins/swarm-coordination/README.md new file mode 100644 index 00000000..65cbaf21 --- /dev/null +++ b/plugins/swarm-coordination/README.md @@ -0,0 +1,152 @@ +# Swarm Coordination Plugin + +Coordinate multi-agent swarms with conflict prevention, status polling, and checkpoint-based orchestration. + +## The Problem + +When multiple agents work in parallel on the same codebase, they can: +- Edit the same files simultaneously, creating conflicts +- Make changes that overwrite each other +- Get stuck in endless loops trying to "fix" each other's code +- Waste effort with duplicate work + +## The Solution + +This plugin implements three complementary coordination patterns: + +### 1. 
Status Polling (Proactive Monitoring) + +The orchestrator periodically spawns lightweight status-checker agents to monitor swarm health: +- Detect stuck or failed agents early +- Identify file conflicts as they emerge +- Enable dynamic load balancing +- Provide real-time progress visibility + +### 2. File Claiming (Ownership Convention) + +Agents claim file ownership before editing: +- Prevents multiple agents from editing the same file +- Clear ownership registry in `.claude/file-claims.md` +- Agents skip files claimed by others +- Claims released after completion + +### 3. Checkpoint-Based Orchestration (Phased Execution) + +Separate swarm execution into controlled phases: +1. **Planning** - Agents analyze and plan (read-only, parallel) +2. **Review** - Detect conflicts before implementation +3. **Resolution** - Resolve conflicts with user input +4. **Implementation** - Execute with monitoring +5. **Verification** - Validate results + +## Quick Start + +### Using the `/swarm` Command + +``` +/swarm Implement user authentication with JWT tokens and session management +``` + +The command will guide you through: +1. Initializing coordination files +2. Launching planning agents +3. Reviewing and resolving conflicts +4. Executing implementation with monitoring +5. Verifying completion + +### Manual Coordination + +For custom workflows, use the individual components: + +1. Create coordination files: + - `.claude/swarm-status.json` + - `.claude/file-claims.md` + - `.claude/swarm-plans/` + +2. Include file claiming instructions in agent prompts + +3. 
Launch status-checker periodically during execution + +## Plugin Contents + +### Commands +- `/swarm [task]` - Full orchestrated swarm workflow + +### Agents +- `status-checker` - Monitors swarm health (haiku, fast) +- `conflict-detector` - Analyzes plans for conflicts +- `plan-reviewer` - Validates individual agent plans + +### Skills +- `swarm-patterns` - Documentation and examples + +## Coordination Files + +### `.claude/swarm-status.json` +```json +{ + "swarm_id": "feature-impl-001", + "task": "Implement new feature", + "phase": "implementing", + "agents": { + "agent-1": {"status": "working"}, + "agent-2": {"status": "completed"} + } +} +``` + +### `.claude/file-claims.md` +```markdown +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +| agent-1 | src/api/handler.ts | 2025-01-15T10:00:00Z | claimed | +| agent-2 | src/db/schema.ts | 2025-01-15T10:00:00Z | released | +``` + +## Best Practices + +1. **Always use planning phase** - Never skip to implementation +2. **Resolve all conflicts** - Don't proceed with overlapping claims +3. **Poll regularly** - Every 30-60 seconds during execution +4. **Use haiku for status checks** - Fast and cheap +5. 
**Release claims promptly** - Don't hold after completion + +## When to Use + +Use this plugin when: +- Multiple agents need to work on the same codebase +- Tasks require parallel execution for speed +- You've experienced agent conflicts before +- You need visibility into swarm progress + +## When NOT to Use + +Skip this plugin when: +- Single agent is sufficient +- Agents work on completely separate codebases +- Tasks are purely read-only (no file modifications) + +## Troubleshooting + +### Agents Still Conflict +- Ensure all agents include file claiming instructions +- Verify conflict detection ran before implementation +- Check that claims registry is being read + +### Status Checker Shows Stuck Agents +- Check agent logs for errors +- Consider increasing timeout +- May need to reassign work + +### Claims Not Releasing +- Verify agent completion is being tracked +- Manually update claims if needed +- Check for orchestrator errors + +## Learn More + +See the `swarm-patterns` skill for detailed documentation: +- `references/status-polling.md` - Polling patterns +- `references/file-claiming.md` - Claiming conventions +- `references/checkpoint-flow.md` - Phased orchestration +- `examples/simple-swarm.md` - Complete example diff --git a/plugins/swarm-coordination/agents/conflict-detector.md b/plugins/swarm-coordination/agents/conflict-detector.md new file mode 100644 index 00000000..6fc7a22c --- /dev/null +++ b/plugins/swarm-coordination/agents/conflict-detector.md @@ -0,0 +1,108 @@ +--- +name: conflict-detector +description: Analyzes agent implementation plans to detect file conflicts before execution. Used in checkpoint-based orchestration to review plans and identify overlapping file edits. +tools: Read, Glob, Grep +model: sonnet +color: orange +--- + +You are an expert conflict analyst specializing in detecting potential file conflicts between multiple agent implementation plans. 
+ +## Core Mission + +Review planned changes from multiple agents and identify any files that would be modified by more than one agent, enabling conflict resolution BEFORE implementation begins. + +## Analysis Process + +**1. Gather Plans** +- Read `.claude/swarm-plans/` directory for all agent plans +- Parse each plan to extract: + - Files to be created + - Files to be modified + - Files to be deleted + - Dependencies on other files + +**2. Build File Map** +Create a mapping of file → agents planning to touch it: +``` +src/api/handler.ts → [agent-1 (modify), agent-3 (modify)] +src/utils/helper.ts → [agent-2 (create)] +src/types/index.ts → [agent-1 (modify), agent-2 (modify), agent-3 (modify)] +``` + +**3. Identify Conflicts** +- **Direct conflicts**: Multiple agents modifying same file +- **Creation conflicts**: Multiple agents creating same file +- **Dependency conflicts**: Agent B depends on file Agent A will modify +- **Deletion conflicts**: Agent modifying file another will delete + +**4. Assess Severity** +- **Critical**: Same function/class being modified differently +- **Major**: Same file, different sections +- **Minor**: Related files that might have import issues +- **Info**: Same directory but different files + +**5. Generate Resolution Strategies** +For each conflict, suggest: +- Which agent should handle the file +- How to sequence the work +- Alternative approaches to avoid conflict + +## Output Format + +```markdown +## Conflict Analysis Report + +### Summary +- Total files planned for modification: [N] +- Files with conflicts: [N] +- Critical conflicts: [N] +- Agents analyzed: [list] + +### Critical Conflicts (Must Resolve) + +#### Conflict 1: `src/api/handler.ts` +**Agents involved**: agent-1, agent-3 +**Nature**: Both agents plan to modify the `handleRequest` function +**Agent-1 plan**: Add authentication check +**Agent-3 plan**: Add rate limiting wrapper + +**Resolution options**: +1. 
**Sequence**: Have agent-1 complete first, then agent-3 builds on top +2. **Merge**: Combine both changes into a single agent's scope +3. **Split**: Agent-1 handles auth in middleware, agent-3 handles rate limiting in handler + +**Recommended**: Option 1 - Sequential execution + +--- + +### Major Conflicts (Should Review) +[Similar format] + +### Minor Conflicts (Informational) +[Similar format] + +### Conflict-Free Assignments +These agents can proceed in parallel without issues: +- agent-2: Only touches `src/utils/` (no overlap) +- agent-4: Only touches `tests/` (no overlap) + +### Recommended Execution Order +1. **Parallel batch 1**: agent-2, agent-4 (no conflicts) +2. **Sequential**: agent-1 (depends on nothing, blocks agent-3) +3. **Sequential**: agent-3 (depends on agent-1 completion) +``` + +## Quality Standards + +- Every conflict includes specific file paths +- Resolution options are actionable +- Recommended execution order is provided +- False positives minimized (understand semantic conflicts, not just file overlap) + +## Edge Cases + +- **No plans found**: Report "No agent plans to analyze" +- **No conflicts**: Report "All agents have non-overlapping scopes" +- **Circular dependencies**: Flag as critical, require manual resolution +- **Unclear plan scope**: Flag for clarification rather than assuming diff --git a/plugins/swarm-coordination/agents/plan-reviewer.md b/plugins/swarm-coordination/agents/plan-reviewer.md new file mode 100644 index 00000000..b5c36af3 --- /dev/null +++ b/plugins/swarm-coordination/agents/plan-reviewer.md @@ -0,0 +1,124 @@ +--- +name: plan-reviewer +description: Reviews an individual agent's implementation plan for completeness, feasibility, and clarity. Used during the planning phase of checkpoint-based orchestration. +tools: Read, Glob, Grep +model: sonnet +color: blue +--- + +You are an expert plan reviewer specializing in validating implementation plans for autonomous agents. 
+ +## Core Mission + +Review an agent's implementation plan to ensure it is complete, feasible, and specific enough to execute without ambiguity. Flag issues before the agent begins implementation. + +## Review Process + +**1. Parse Plan Structure** +- Verify plan follows expected format +- Check all required sections are present +- Ensure file lists are explicit + +**2. Validate Scope** +- Files to modify are clearly listed with full paths +- Changes are described with enough detail +- No vague statements like "update as needed" + +**3. Check Feasibility** +- Files mentioned actually exist (or creation is explicit) +- Dependencies are identified +- No impossible or conflicting requirements + +**4. Assess Risk** +- High-risk changes flagged (deleting files, changing interfaces) +- Breaking changes identified +- Rollback complexity noted + +**5. Verify Completeness** +- All aspects of the task are addressed +- Edge cases considered +- Testing approach included (if applicable) + +## Plan Format Expected + +```markdown +## Agent Plan: [agent-id] + +### Task Summary +[What this agent will accomplish] + +### Files to Modify +- `path/to/file1.ts`: [Description of changes] +- `path/to/file2.ts`: [Description of changes] + +### Files to Create +- `path/to/new-file.ts`: [Purpose and contents summary] + +### Files to Delete +- `path/to/old-file.ts`: [Reason for deletion] + +### Dependencies +- Requires: [files/features this depends on] +- Blocks: [what cannot proceed until this completes] + +### Implementation Steps +1. [Step 1] +2. [Step 2] +... + +### Risks and Mitigations +- [Risk]: [Mitigation] +``` + +## Output Format + +```markdown +## Plan Review: [agent-id] + +### Overall Assessment: [APPROVED|NEEDS_REVISION|REJECTED] + +### Checklist +- [x] Clear task summary +- [x] Explicit file list +- [ ] Missing: dependency identification +- [x] Feasible changes +- [ ] Issue: vague step description + +### Issues Found + +#### Critical (Must Fix) +1. 
**Vague file reference**: "update the handler" - which handler? Specify full path. +2. **Missing dependency**: Plan modifies `types/index.ts` but doesn't list it + +#### Warnings (Should Address) +1. **High-risk change**: Deleting `utils/legacy.ts` - confirm no other imports +2. **Missing test plan**: No testing approach specified + +#### Suggestions (Optional) +1. Consider breaking step 3 into smaller sub-steps +2. Add rollback strategy for interface changes + +### Required Changes for Approval +1. Specify exact file path for "handler" +2. Add `types/index.ts` to files list +3. Confirm deletion safety for legacy file + +### Approved File Claims +If approved, agent may claim: +- `src/api/auth.ts` +- `src/middleware/validate.ts` +``` + +## Quality Standards + +- Review is thorough but fast (plans should be concise) +- Issues are specific with suggested fixes +- Approval status is clear and actionable +- File claims are explicit for coordination + +## Edge Cases + +- **Empty plan**: Reject with "No plan content found" +- **Overly broad scope**: Flag and suggest breaking into multiple agents +- **Conflicts with other plans**: Defer to conflict-detector agent +- **Already-implemented changes**: Flag as potential duplicate work diff --git a/plugins/swarm-coordination/agents/status-checker.md b/plugins/swarm-coordination/agents/status-checker.md new file mode 100644 index 00000000..35a44b8b --- /dev/null +++ b/plugins/swarm-coordination/agents/status-checker.md @@ -0,0 +1,81 @@ +--- +name: status-checker +description: Monitors swarm progress by reading status files, identifying conflicts, stuck agents, and overall health. Launch periodically during swarm execution to enable proactive coordination. +tools: Read, Glob, Grep +model: haiku +color: cyan +--- + +You are an expert swarm health monitor specializing in tracking multi-agent coordination status. 
+ +## Core Mission + +Quickly assess swarm health by reading status files and identifying any issues that require orchestrator intervention. + +## Status Check Process + +**1. Read Swarm Status** +- Read `.claude/swarm-status.json` for current agent states +- Check timestamps to identify stale/stuck agents (>2 minutes without update) +- Note which agents are active, completed, or failed + +**2. Check File Claims** +- Read `.claude/file-claims.md` for current file ownership +- Identify any conflicts (multiple agents claiming same file) +- Note stale claims (agent completed but claim not released) + +**3. Analyze Progress** +- Calculate overall completion percentage +- Identify bottlenecks (agents waiting on others) +- Detect circular dependencies or deadlocks + +**4. Identify Issues** +- **Conflicts**: Multiple agents editing same files +- **Stuck Agents**: No progress for >2 minutes +- **Failed Agents**: Agents that reported errors +- **Stale Claims**: File claims from completed agents + +## Output Format + +Return a JSON status report: + +```json +{ + "timestamp": "[current time]", + "overall_health": "healthy|warning|critical", + "completion_percentage": [0-100], + "active_agents": [ + {"id": "agent-1", "task": "description", "status": "working", "last_update": "timestamp"} + ], + "completed_agents": ["agent-2", "agent-3"], + "issues": { + "conflicts": [ + {"file": "path/to/file.ts", "agents": ["agent-1", "agent-4"], "severity": "critical"} + ], + "stuck_agents": [ + {"id": "agent-5", "last_update": "timestamp", "duration_seconds": 180} + ], + "stale_claims": [ + {"file": "path/to/file.ts", "agent": "agent-2", "reason": "agent completed"} + ] + }, + "recommendations": [ + {"action": "pause", "target": "agent-4", "reason": "file conflict with agent-1"}, + {"action": "reassign", "target": "agent-5", "reason": "stuck for 3 minutes"} + ] +} +``` + +## Quality Standards + +- Fast execution (this runs frequently, keep it lightweight) +- Accurate conflict detection (no 
false positives) +- Clear, actionable recommendations +- Machine-readable JSON output for orchestrator parsing + +## Edge Cases + +- **No status file exists**: Report as "no swarm active" +- **Empty status file**: Report as "swarm initializing" +- **All agents completed**: Report healthy with 100% completion +- **Multiple critical issues**: Prioritize by severity (conflicts > stuck > stale) diff --git a/plugins/swarm-coordination/commands/swarm.md b/plugins/swarm-coordination/commands/swarm.md new file mode 100644 index 00000000..08e47d0d --- /dev/null +++ b/plugins/swarm-coordination/commands/swarm.md @@ -0,0 +1,287 @@ +--- +description: Coordinate multi-agent swarm with conflict prevention, status polling, and checkpoint-based orchestration +argument-hint: [task description] +--- + +# Coordinated Swarm Orchestration + +You are orchestrating a multi-agent swarm to complete a complex task. Follow this checkpoint-based workflow to prevent conflicts and enable proactive monitoring. + +## Task Description +$ARGUMENTS + +--- + +## Phase 1: Initialization + +**Goal**: Set up swarm coordination infrastructure + +**Actions**: +1. Create coordination files: + - `.claude/swarm-status.json` - Agent status tracking + - `.claude/file-claims.md` - File ownership registry + - `.claude/swarm-plans/` - Directory for agent plans + +2. Initialize status file: +```json +{ + "swarm_id": "[generated-id]", + "task": "[task description]", + "started": "[timestamp]", + "phase": "planning", + "agents": {} +} +``` + +3. Initialize file claims: +```markdown +# File Claims Registry + +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +``` + +4. Create todo list tracking all phases + +--- + +## Phase 2: Planning (Parallel, Read-Only) + +**Goal**: Have multiple agents analyze the codebase and create implementation plans WITHOUT making changes + +**Actions**: +1. Launch 2-4 planning agents in parallel, depending on task complexity. 
Each agent should: + - Analyze a different aspect of the task + - Create a detailed implementation plan + - List ALL files they intend to modify/create/delete + - Identify dependencies on other files or agents + - **CRITICAL**: Agents must NOT edit any files - planning only + +2. Each agent writes their plan to `.claude/swarm-plans/[agent-id].md`: +```markdown +## Agent Plan: [agent-id] + +### Task Summary +[What this agent will accomplish] + +### Files to Modify +- `path/to/file.ts`: [Description of changes] + +### Files to Create +- `path/to/new-file.ts`: [Purpose] + +### Dependencies +- Requires: [what this depends on] +- Blocks: [what depends on this] + +### Implementation Steps +1. [Step 1] +2. [Step 2] +``` + +3. Update swarm status as agents complete: +```json +{ + "agents": { + "agent-1": {"status": "plan_complete", "plan_file": ".claude/swarm-plans/agent-1.md"} + } +} +``` + +--- + +## Phase 3: Conflict Detection + +**Goal**: Review all plans and identify conflicts before implementation + +**Actions**: +1. Wait for ALL planning agents to complete +2. Read all plans from `.claude/swarm-plans/` +3. Launch the **conflict-detector** agent to analyze all plans +4. Review the conflict report + +**If conflicts found**: +- Present conflict report to user +- Ask for resolution preference: + - **Sequence**: Execute conflicting agents one at a time + - **Reassign**: Move conflicting files to single agent + - **Manual**: User provides custom resolution +- Update plans based on resolution +- Re-run conflict detection to confirm resolution + +**If no conflicts**: +- Proceed to Phase 4 + +--- + +## Phase 4: File Claiming + +**Goal**: Register file ownership before implementation begins + +**Actions**: +1. 
For each approved plan, register file claims in `.claude/file-claims.md`: +```markdown +| agent-1 | src/api/handler.ts | 2025-01-15T10:30:00Z | claimed | +| agent-1 | src/utils/auth.ts | 2025-01-15T10:30:00Z | claimed | +| agent-2 | src/db/queries.ts | 2025-01-15T10:30:00Z | claimed | +``` + +2. Determine execution order based on conflict analysis: + - **Parallel batch 1**: Agents with no conflicts or dependencies + - **Sequential queue**: Agents that must wait for others + +3. Update swarm status: +```json +{ + "phase": "implementing", + "execution_order": [ + {"batch": 1, "agents": ["agent-1", "agent-2"], "parallel": true}, + {"batch": 2, "agents": ["agent-3"], "parallel": false, "waits_for": ["agent-1"]} + ] +} +``` + +--- + +## Phase 5: Implementation with Monitoring + +**Goal**: Execute implementation with proactive status monitoring + +**Actions**: +1. Launch first batch of implementation agents + +2. **Status Polling Loop** (every 30-60 seconds during execution): + - Launch a **status-checker** agent (haiku model for speed) + - Review status report + - If issues detected: + - **Conflict**: Pause later agent, let first complete + - **Stuck agent**: Check logs, consider reassignment + - **Failed agent**: Report to user, decide whether to retry or skip + +3. As each agent completes: + - Update swarm status: `"status": "completed"` + - Release file claims in `.claude/file-claims.md`: change status to `released` + - Launch next queued agents that were waiting + +4. **Agent Instructions** (include in each implementation agent's prompt): +```markdown +## Coordination Requirements + +Before editing any file: +1. Read `.claude/file-claims.md` +2. Verify the file is claimed by YOU (your agent ID) +3. If claimed by another agent, SKIP and note in your results +4. If not claimed, DO NOT edit - report the missing claim + +After completing work: +1. Update your status in swarm communication +2. Report files modified for claim release + +If you encounter a conflict: +1. 
STOP editing the conflicted file +2. Report the conflict immediately +3. Wait for orchestrator resolution +``` + +--- + +## Phase 6: Verification + +**Goal**: Verify swarm completed successfully + +**Actions**: +1. Check all agents completed: + - Read final swarm status + - Verify all planned files were modified + - Check for any orphaned claims + +2. Run integration checks: + - Build/compile if applicable + - Run tests if applicable + - Check for import/type errors + +3. Clean up coordination files: + - Archive swarm status to `.claude/swarm-history/` + - Clear file claims + - Remove plan files + +--- + +## Phase 7: Summary + +**Goal**: Report swarm execution results + +**Actions**: +1. Summarize: + - Total agents launched + - Files modified/created/deleted + - Conflicts detected and resolved + - Issues encountered + - Total execution time + +2. Present to user: + - What was accomplished + - Any items requiring follow-up + - Suggested next steps + +--- + +## Error Handling + +**Agent Failure**: +1. Log failure in swarm status +2. Release failed agent's file claims +3. Ask user: retry, skip, or abort swarm + +**Unresolvable Conflict**: +1. Pause all conflicting agents +2. Present options to user +3. Wait for manual resolution + +**Stuck Swarm**: +1. If no progress for 5+ minutes, alert user +2. Provide diagnostic information +3. Offer to abort and roll back + +--- + +## File Claim Convention (For All Agents) + +Include this instruction block in every implementation agent's system prompt: + +```markdown +## File Claiming Protocol + +You are part of a coordinated swarm. Follow these rules strictly: + +1. **Before ANY file edit**: + - Read `.claude/file-claims.md` + - Find your agent ID in the registry + - Only edit files claimed by YOUR agent ID + +2. **If file is claimed by another agent**: + - DO NOT edit the file + - Note in your results: "Skipped [file] - claimed by [other-agent]" + - Continue with other work + +3. 
**If file is not in claims registry**: + - DO NOT edit the file + - Report: "Cannot edit [file] - not in approved claims" + - This indicates a planning oversight + +4. **Update your progress**: + - After each significant step, your status will be tracked + - If you encounter issues, report them clearly +``` + +--- + +## Status Polling Schedule + +During Phase 5, launch status-checker agent: +- After initial batch launch: wait 30 seconds, then check +- During active execution: check every 45-60 seconds +- After agent completion: immediate check to launch next batch +- On any reported issue: immediate check + +Use **haiku model** for status-checker to minimize latency and cost. diff --git a/plugins/swarm-coordination/skills/swarm-patterns/SKILL.md b/plugins/swarm-coordination/skills/swarm-patterns/SKILL.md new file mode 100644 index 00000000..159a0b39 --- /dev/null +++ b/plugins/swarm-coordination/skills/swarm-patterns/SKILL.md @@ -0,0 +1,80 @@ +# Swarm Coordination Patterns + +Comprehensive guidance for coordinating multi-agent swarms to prevent conflicts and enable proactive monitoring. + +## When to Activate + +Activate this skill when: +- Orchestrating multiple agents working on the same codebase +- Implementing features that require parallel agent execution +- Designing workflows where agents might edit overlapping files +- Debugging swarm coordination issues + +## Core Concepts + +### The Problem with Uncoordinated Swarms + +When multiple agents work in parallel without coordination: +1. **File Conflicts**: Multiple agents edit the same file simultaneously +2. **Merge Conflicts**: Changes overwrite each other +3. **Endless Loops**: Agents "fix" each other's code in circles +4. **Wasted Work**: Duplicate effort on same files + +### Three-Pillar Solution + +This skill teaches three complementary patterns: + +1. **Status Polling (Fix 1)**: Orchestrator proactively monitors agent progress +2. **File Claiming (Fix 2)**: Agents claim ownership before editing +3. 
**Checkpoint Orchestration (Fix 5)**: Plan first, detect conflicts, then implement + +## Key Files + +### Coordination Files +- `.claude/swarm-status.json` - Central status tracking +- `.claude/file-claims.md` - File ownership registry +- `.claude/swarm-plans/` - Agent implementation plans + +### Status File Format +```json +{ + "swarm_id": "swarm-20250115-abc123", + "task": "Implement user authentication", + "started": "2025-01-15T10:00:00Z", + "phase": "implementing", + "agents": { + "auth-impl": {"status": "working", "last_update": "2025-01-15T10:05:00Z"}, + "db-schema": {"status": "completed", "last_update": "2025-01-15T10:03:00Z"} + }, + "execution_order": [ + {"batch": 1, "agents": ["db-schema"], "parallel": false}, + {"batch": 2, "agents": ["auth-impl", "api-routes"], "parallel": true} + ] +} +``` + +### File Claims Format +```markdown +# File Claims Registry + +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +| auth-impl | src/auth/handler.ts | 2025-01-15T10:00:00Z | claimed | +| auth-impl | src/auth/types.ts | 2025-01-15T10:00:00Z | claimed | +| db-schema | src/db/schema.ts | 2025-01-15T10:00:00Z | released | +``` + +## References + +- `references/status-polling.md` - Detailed polling patterns +- `references/file-claiming.md` - File ownership conventions +- `references/checkpoint-flow.md` - Phase-based orchestration +- `examples/simple-swarm.md` - Basic two-agent swarm +- `examples/complex-swarm.md` - Multi-phase feature implementation + +## Quick Start + +1. Use `/swarm [task]` command for full orchestrated flow +2. For manual coordination, create the three coordination files +3. Include file claiming instructions in all implementation agents +4. 
Launch status-checker every 30-60 seconds during execution diff --git a/plugins/swarm-coordination/skills/swarm-patterns/examples/simple-swarm.md b/plugins/swarm-coordination/skills/swarm-patterns/examples/simple-swarm.md new file mode 100644 index 00000000..603cd072 --- /dev/null +++ b/plugins/swarm-coordination/skills/swarm-patterns/examples/simple-swarm.md @@ -0,0 +1,260 @@ +# Simple Swarm Example + +A two-agent swarm implementing a feature with coordinated file claiming. + +## Scenario + +Task: Add user authentication to an Express API + +## Initial Setup + +### Swarm Status File +`.claude/swarm-status.json`: +```json +{ + "swarm_id": "auth-feature-001", + "task": "Add user authentication", + "started": "2025-01-15T10:00:00Z", + "phase": "initialized", + "agents": {} +} +``` + +### File Claims Registry +`.claude/file-claims.md`: +```markdown +# File Claims Registry + +Last updated: 2025-01-15T10:00:00Z +Swarm ID: auth-feature-001 + +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +``` + +## Phase 1: Planning + +Launch two planning agents: + +**Agent 1 Prompt**: +``` +Analyze the codebase and create an implementation plan for: +Adding JWT token validation middleware + +You are in PLANNING MODE - DO NOT modify any files. +Output a structured plan with all files you need to modify. +``` + +**Agent 2 Prompt**: +``` +Analyze the codebase and create an implementation plan for: +Adding user login/logout API endpoints + +You are in PLANNING MODE - DO NOT modify any files. +Output a structured plan with all files you need to modify. +``` + +### Agent 1 Plan Output +`.claude/swarm-plans/jwt-middleware.md`: +```markdown +## Agent Plan: jwt-middleware + +### Task Summary +Implement JWT token validation middleware for protected routes. 
+ +### Files to Modify +- `src/middleware/index.ts`: Export new auth middleware + +### Files to Create +- `src/middleware/auth.ts`: JWT validation middleware +- `src/types/auth.ts`: Token payload types + +### Dependencies +- Requires: None +- Blocks: Protected routes need this middleware + +### Implementation Steps +1. Create auth types +2. Implement JWT validation middleware +3. Export from middleware index +``` + +### Agent 2 Plan Output +`.claude/swarm-plans/auth-endpoints.md`: +```markdown +## Agent Plan: auth-endpoints + +### Task Summary +Implement login and logout API endpoints. + +### Files to Modify +- `src/routes/index.ts`: Add auth routes +- `src/middleware/index.ts`: Import auth utilities + +### Files to Create +- `src/routes/auth.ts`: Login/logout endpoints +- `src/services/auth.ts`: Authentication service + +### Dependencies +- Requires: JWT middleware for logout validation +- Blocks: None + +### Implementation Steps +1. Create auth service +2. Implement auth routes +3. Update routes index +4. Update middleware index +``` + +## Phase 2: Conflict Detection + +Analyzing plans: + +``` +File: src/middleware/index.ts + - jwt-middleware: modify (export new middleware) + - auth-endpoints: modify (import auth utilities) + → CONFLICT DETECTED +``` + +### Conflict Report +```markdown +## Conflict Analysis + +### Conflicts Found: 1 + +#### Conflict 1: src/middleware/index.ts +Agents: jwt-middleware, auth-endpoints +Nature: Both agents plan to modify this file +- jwt-middleware: Add export for auth middleware +- auth-endpoints: Import auth utilities + +**Resolution Options**: +1. Sequential: jwt-middleware first, then auth-endpoints +2. 
Merge: Have jwt-middleware handle all middleware/index.ts changes +``` + +## Phase 3: Resolution + +**Chosen Resolution**: Option 1 - Sequential execution + +Updated execution plan: +- Batch 1: jwt-middleware (no dependencies) +- Batch 2: auth-endpoints (after jwt-middleware completes) + +## Phase 4: File Claiming + +Updated `.claude/file-claims.md`: +```markdown +# File Claims Registry + +Last updated: 2025-01-15T10:05:00Z +Swarm ID: auth-feature-001 + +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +| jwt-middleware | src/middleware/auth.ts | 2025-01-15T10:05:00Z | claimed | +| jwt-middleware | src/middleware/index.ts | 2025-01-15T10:05:00Z | claimed | +| jwt-middleware | src/types/auth.ts | 2025-01-15T10:05:00Z | claimed | +| auth-endpoints | src/routes/auth.ts | 2025-01-15T10:05:00Z | pending | +| auth-endpoints | src/routes/index.ts | 2025-01-15T10:05:00Z | pending | +| auth-endpoints | src/services/auth.ts | 2025-01-15T10:05:00Z | pending | +``` + +Note: auth-endpoints claims are "pending" until jwt-middleware completes. 
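
The registry check that each agent performs before an edit can be sketched in Python. This is a minimal illustration, not part of the plugin: the names `Claim`, `parse_claims`, and `may_edit` are hypothetical, and it assumes the four-column table layout shown above, with only a `claimed` row for the agent's own ID permitting an edit.

```python
from typing import NamedTuple


class Claim(NamedTuple):
    agent_id: str
    file_path: str
    claimed_at: str
    status: str


def parse_claims(registry_text: str) -> list[Claim]:
    """Parse the data rows of the claims table, skipping header and separator rows."""
    claims = []
    for line in registry_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row (title, blank line, metadata)
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4 or cells[0] == "Agent ID" or set(cells[0]) <= {"-"}:
            continue  # header row, separator row, or malformed row
        claims.append(Claim(*cells))
    return claims


def may_edit(claims: list[Claim], agent_id: str, file_path: str) -> bool:
    """Return True only if this agent holds an active ("claimed") claim on the file."""
    return any(
        c.agent_id == agent_id and c.file_path == file_path and c.status == "claimed"
        for c in claims
    )


# Hypothetical excerpt of .claude/file-claims.md, mirroring the table above.
REGISTRY = """\
| Agent ID | File Path | Claimed At | Status |
|----------|-----------|------------|--------|
| jwt-middleware | src/middleware/index.ts | 2025-01-15T10:05:00Z | claimed |
| auth-endpoints | src/routes/auth.ts | 2025-01-15T10:05:00Z | pending |
"""

claims = parse_claims(REGISTRY)
print(may_edit(claims, "jwt-middleware", "src/middleware/index.ts"))
print(may_edit(claims, "auth-endpoints", "src/routes/auth.ts"))
```

Treating `pending` as not editable matches the note above: auth-endpoints has to wait until its claims are activated.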
+ +## Phase 5: Implementation + +### Batch 1: jwt-middleware + +Launch jwt-middleware agent with implementation instructions: +``` +Execute your approved plan from .claude/swarm-plans/jwt-middleware.md +Only modify files claimed by jwt-middleware in .claude/file-claims.md +``` + +**Status after 45 seconds** (from status-checker): +```json +{ + "overall_health": "healthy", + "agents": { + "jwt-middleware": {"status": "working", "progress": "Creating middleware"} + } +} +``` + +**Status after 2 minutes**: +```json +{ + "overall_health": "healthy", + "agents": { + "jwt-middleware": {"status": "completed"} + } +} +``` + +### Release Claims & Activate Batch 2 + +Updated `.claude/file-claims.md`: +```markdown +| jwt-middleware | src/middleware/auth.ts | 2025-01-15T10:05:00Z | released | +| jwt-middleware | src/middleware/index.ts | 2025-01-15T10:05:00Z | released | +| jwt-middleware | src/types/auth.ts | 2025-01-15T10:05:00Z | released | +| auth-endpoints | src/routes/auth.ts | 2025-01-15T10:07:00Z | claimed | +| auth-endpoints | src/routes/index.ts | 2025-01-15T10:07:00Z | claimed | +| auth-endpoints | src/services/auth.ts | 2025-01-15T10:07:00Z | claimed | +| auth-endpoints | src/middleware/index.ts | 2025-01-15T10:07:00Z | claimed | +``` + +Note: auth-endpoints now also claims middleware/index.ts after jwt-middleware released it. + +### Batch 2: auth-endpoints + +Launch auth-endpoints agent with implementation instructions. 
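
The release-and-activate step above could be automated along these lines. This is a rough sketch under stated assumptions: `advance_batch` and `handover_files` are hypothetical names, and registry rows are modelled as plain `(agent_id, file_path, claimed_at, status)` tuples rather than parsed from the Markdown file.

```python
def advance_batch(rows, finished_agent, next_agent, now, handover_files=()):
    """Release the finished agent's claims and activate the next agent's pending ones.

    rows: list of (agent_id, file_path, claimed_at, status) tuples.
    handover_files: files the next agent additionally claims once they are free.
    """
    updated = []
    for agent_id, file_path, claimed_at, status in rows:
        if agent_id == finished_agent and status == "claimed":
            status = "released"
        elif agent_id == next_agent and status == "pending":
            claimed_at, status = now, "claimed"
        updated.append((agent_id, file_path, claimed_at, status))

    for file_path in handover_files:
        # Keep the "one file, one owner" rule: refuse if someone else still holds it.
        still_held = any(
            path == file_path and status == "claimed" and agent != next_agent
            for agent, path, _, status in updated
        )
        if still_held:
            raise ValueError(f"{file_path} is still claimed by another agent")
        updated.append((next_agent, file_path, now, "claimed"))
    return updated


# Two rows taken from the example registry above.
ROWS = [
    ("jwt-middleware", "src/middleware/index.ts", "2025-01-15T10:05:00Z", "claimed"),
    ("auth-endpoints", "src/routes/auth.ts", "2025-01-15T10:05:00Z", "pending"),
]

for row in advance_batch(
    ROWS,
    finished_agent="jwt-middleware",
    next_agent="auth-endpoints",
    now="2025-01-15T10:07:00Z",
    handover_files=["src/middleware/index.ts"],
):
    print(" | ".join(row))
```

Doing the release and the hand-over in one pass means the registry never shows two active owners for `src/middleware/index.ts`.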
+ +**Status after 3 minutes**: +```json +{ + "overall_health": "healthy", + "agents": { + "jwt-middleware": {"status": "completed"}, + "auth-endpoints": {"status": "completed"} + }, + "completion_percentage": 100 +} +``` + +## Phase 6: Verification + +Checks: +- [x] jwt-middleware completed +- [x] auth-endpoints completed +- [x] All files modified as planned +- [x] No orphaned claims +- [x] Build succeeds +- [x] Tests pass + +## Phase 7: Summary + +```markdown +## Swarm Completion Report + +### Task: Add user authentication +### Duration: 8 minutes +### Agents: 2 + +### Files Created +- src/middleware/auth.ts +- src/types/auth.ts +- src/routes/auth.ts +- src/services/auth.ts + +### Files Modified +- src/middleware/index.ts +- src/routes/index.ts + +### Conflicts Resolved +- 1 conflict on src/middleware/index.ts (sequential resolution) + +### Status: SUCCESS +``` diff --git a/plugins/swarm-coordination/skills/swarm-patterns/references/checkpoint-flow.md b/plugins/swarm-coordination/skills/swarm-patterns/references/checkpoint-flow.md new file mode 100644 index 00000000..eaed7ae8 --- /dev/null +++ b/plugins/swarm-coordination/skills/swarm-patterns/references/checkpoint-flow.md @@ -0,0 +1,287 @@ +# Checkpoint-Based Orchestration + +A phased approach to swarm execution that prevents conflicts through planning, review, and controlled implementation. + +## Overview + +Checkpoint-based orchestration separates swarm execution into distinct phases: + +1. **Planning** - Agents analyze and plan (read-only) +2. **Review** - Orchestrator detects conflicts +3. **Resolution** - Conflicts resolved before implementation +4. **Claiming** - Files assigned to agents +5. **Implementation** - Agents execute plans +6. **Verification** - Results validated + +## Why Checkpoints? + +### Without Checkpoints +``` +Launch agents → Agents work in parallel → CONFLICT! 
→ +Agents overwrite each other → Endless fix loops → Chaos +``` + +### With Checkpoints +``` +Launch planning agents → Collect plans → Detect conflicts → +Resolve conflicts → Claim files → Sequential/parallel execution → Success +``` + +## Phase Details + +### Phase 1: Planning (Parallel, Read-Only) + +**Purpose**: Gather implementation plans without making changes + +**Key Rules**: +- Agents may READ any file +- Agents must NOT WRITE any file +- Each agent produces a structured plan + +**Agent Instructions**: +```markdown +You are in PLANNING MODE. Analyze the codebase and create an implementation plan. + +CRITICAL RESTRICTIONS: +- DO NOT use Edit, Write, or any file modification tools +- DO NOT execute commands that modify files +- ONLY use Read, Glob, Grep for analysis + +Your output must be a structured plan listing: +- All files you need to modify (with full paths) +- All files you need to create +- All files you need to delete +- Dependencies on other components +- Step-by-step implementation approach +``` + +**Plan Format**: +```markdown +## Agent Plan: [agent-id] + +### Task Summary +[1-2 sentence description of what this agent will accomplish] + +### Files to Modify +- `src/auth/handler.ts`: Add validateToken() function and update handleRequest() +- `src/types/auth.ts`: Add TokenPayload interface + +### Files to Create +- `src/auth/tokens.ts`: Token generation and validation utilities + +### Files to Delete +- `src/auth/legacy-auth.ts`: Replaced by new implementation + +### Dependencies +- **Requires**: Database schema must include users table +- **Blocks**: API routes cannot be updated until auth is complete + +### Implementation Steps +1. Create TokenPayload interface in types +2. Implement token utilities in new file +3. Update handler with validation logic +4. 
Remove legacy file after verification + +### Estimated Scope +- Files touched: 4 +- Lines added: ~150 +- Lines removed: ~80 +- Risk level: Medium (touching auth system) +``` + +### Phase 2: Conflict Detection + +**Purpose**: Identify overlapping file edits before they happen + +**Process**: +1. Collect all agent plans +2. Build file → agent mapping +3. Identify conflicts: + - Same file modified by multiple agents + - Delete conflicts with modify + - Creation conflicts + - Dependency cycles + +**Conflict Types**: + +| Type | Severity | Example | +|------|----------|---------| +| Same file modify | Critical | agent-1 and agent-2 both modify handler.ts | +| Create collision | Critical | Both agents create utils/helper.ts | +| Delete + Modify | Critical | agent-1 deletes file agent-2 modifies | +| Dependency cycle | Critical | agent-1 waits for agent-2, agent-2 waits for agent-1 | +| Same directory | Warning | Both agents add files to src/utils/ | +| Import chain | Info | agent-1's file imports from agent-2's file | + +### Phase 3: Resolution + +**Purpose**: Resolve all conflicts before implementation begins + +**Resolution Strategies**: + +**Sequential Execution**: +```markdown +Conflict: agent-1 and agent-2 both modify src/api/index.ts + +Resolution: Execute sequentially +- Execution order: agent-1 first, then agent-2 +- agent-2 will see agent-1's changes before starting +``` + +**Scope Reassignment**: +```markdown +Conflict: agent-1 (auth) and agent-2 (logging) both modify middleware.ts + +Resolution: Reassign to single agent +- Expand agent-1's scope to include logging changes +- Remove middleware.ts from agent-2's plan +``` + +**File Splitting**: +```markdown +Conflict: agent-1 and agent-2 both modify large config.ts + +Resolution: Split the file +- Create config/auth.ts (agent-1) +- Create config/db.ts (agent-2) +- Update config/index.ts to re-export +``` + +**User Decision**: +```markdown +Conflict: Complex dependency between agent-1 and agent-3 + +Resolution: 
Present to user +"Agents 1 and 3 have interleaved dependencies. Options: +1. Merge into single agent +2. Manual sequencing with intermediate reviews +3. Redesign the task split" +``` + +### Phase 4: File Claiming + +**Purpose**: Register file ownership before implementation + +**Process**: +1. For each resolved plan, register claims +2. Update `.claude/file-claims.md` +3. Determine execution batches + +**Execution Order Determination**: +```markdown +Given resolved plans: +- agent-1: No dependencies +- agent-2: No dependencies +- agent-3: Depends on agent-1 +- agent-4: Depends on agent-2 and agent-3 + +Execution order: +Batch 1 (parallel): agent-1, agent-2 +Batch 2 (after batch 1): agent-3 +Batch 3 (after agent-3): agent-4 +``` + +### Phase 5: Implementation with Monitoring + +**Purpose**: Execute plans with status tracking + +**Process**: +1. Launch batch 1 agents +2. Start polling loop (every 30-60 seconds) +3. As agents complete: + - Release their file claims + - Launch dependent agents +4. Handle issues as detected: + - Stuck agents → investigate/reassign + - Conflicts → pause and resolve + - Failures → report and decide + +**Agent Instructions for Implementation**: +```markdown +You are now in IMPLEMENTATION MODE. Execute your approved plan. + +Your approved plan is in: .claude/swarm-plans/[your-agent-id].md +Your claimed files are in: .claude/file-claims.md + +RULES: +1. Only modify files that are claimed by YOUR agent ID +2. Follow your plan exactly - do not expand scope +3. If you need to modify an unclaimed file, STOP and report +4. 
Update progress by completing your assigned tasks +``` + +### Phase 6: Verification + +**Purpose**: Validate swarm completed successfully + +**Checks**: +- [ ] All agents reported completion +- [ ] All planned files were modified +- [ ] No orphaned file claims +- [ ] Build succeeds (if applicable) +- [ ] Tests pass (if applicable) +- [ ] No unexpected files modified + +## Checkpoint Gates + +Each phase has a gate that must pass before proceeding: + +| Gate | Condition | Failure Action | +|------|-----------|----------------| +| Planning → Review | All planning agents completed | Wait or timeout | +| Review → Resolution | Conflict report generated | Re-run detection | +| Resolution → Claiming | All conflicts resolved | Return to resolution | +| Claiming → Implementation | All files claimed, no overlaps | Fix claim issues | +| Implementation → Verification | All agents completed | Investigate failures | +| Verification → Complete | All checks pass | Fix issues or report | + +## State Machine + +``` +┌─────────────┐ +│ INITIALIZED │ +└──────┬──────┘ + │ Start swarm + ▼ +┌─────────────┐ +│ PLANNING │◄────────────────┐ +└──────┬──────┘ │ + │ All plans received │ + ▼ │ +┌─────────────┐ │ +│ REVIEWING │ │ +└──────┬──────┘ │ + │ Conflicts identified │ + ▼ │ +┌─────────────┐ │ +│ RESOLVING │─────────────────┘ +└──────┬──────┘ Need re-plan + │ All resolved + ▼ +┌─────────────┐ +│ CLAIMING │ +└──────┬──────┘ + │ Files assigned + ▼ +┌─────────────┐ +│IMPLEMENTING │◄───┐ +└──────┬──────┘ │ + │ │ Next batch + ▼ │ +┌─────────────┐ │ +│ VERIFYING │────┘ +└──────┬──────┘ More batches + │ All verified + ▼ +┌─────────────┐ +│ COMPLETED │ +└─────────────┘ +``` + +## Benefits + +1. **No Conflicts**: Detected and resolved before implementation +2. **Visibility**: Know exactly what each agent will do +3. **Control**: Orchestrator maintains full oversight +4. **Recovery**: Can roll back or adjust between phases +5. 
**Efficiency**: Parallel execution where safe, sequential where needed diff --git a/plugins/swarm-coordination/skills/swarm-patterns/references/file-claiming.md b/plugins/swarm-coordination/skills/swarm-patterns/references/file-claiming.md new file mode 100644 index 00000000..a7d3476e --- /dev/null +++ b/plugins/swarm-coordination/skills/swarm-patterns/references/file-claiming.md @@ -0,0 +1,233 @@ +# File Claiming Convention + +A coordination protocol where agents claim file ownership before editing to prevent conflicts. + +## Overview + +File claiming is a simple but effective convention: +1. Before editing any file, the agent checks the claims registry +2. If the file is claimed by the agent itself, proceed +3. If it is claimed by another agent or missing from the registry, skip and report +4. After completion, report the modified files so the orchestrator can release the claims + +## The Claims Registry + +Location: `.claude/file-claims.md` + +### Format + +```markdown +# File Claims Registry + +Last updated: 2025-01-15T10:30:00Z +Swarm ID: swarm-20250115-abc123 + +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +| auth-impl | src/auth/handler.ts | 2025-01-15T10:00:00Z | claimed | +| auth-impl | src/auth/types.ts | 2025-01-15T10:00:00Z | claimed | +| auth-impl | src/auth/middleware.ts | 2025-01-15T10:00:00Z | claimed | +| db-agent | src/db/schema.ts | 2025-01-15T10:00:00Z | released | +| db-agent | src/db/queries.ts | 2025-01-15T10:00:00Z | released | +``` + +### Status Values + +| Status | Meaning | +|--------|---------| +| `claimed` | Agent is actively working on this file | +| `released` | Agent completed, file available | +| `conflict` | Multiple agents claimed (needs resolution) | + +## Agent Instructions + +Include this block in every implementation agent's system prompt: + +```markdown +## File Claiming Protocol + +You are part of a coordinated swarm. Follow these rules strictly: + +### Before ANY File Edit + +1. Read `.claude/file-claims.md` +2. Find the file you want to edit in the registry +3.
Check the claim status: + +**If claimed by YOUR agent ID** → Proceed with edit +**If claimed by ANOTHER agent** → DO NOT edit, report: + "Skipped [file] - claimed by [other-agent]" +**If file NOT in registry** → DO NOT edit, report: + "Cannot edit [file] - not in approved claims" + +### During Execution + +- Only edit files explicitly claimed by you +- If you discover a need to edit an unclaimed file, report it +- Do not modify the claims registry yourself + +### After Completion + +Report all files you modified so claims can be released. +``` + +## Claim Lifecycle + +``` +┌─────────────────────────────────────────────────────────┐ +│ PLANNING PHASE │ +│ Agent creates plan → Lists files to modify │ +└────────────────────────┬────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ CONFLICT DETECTION │ +│ Orchestrator reviews all plans → Identifies overlaps │ +│ Resolves conflicts → Determines execution order │ +└────────────────────────┬────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ CLAIM REGISTRATION │ +│ Orchestrator writes claims to registry │ +│ Each file → exactly one agent │ +└────────────────────────┬────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ IMPLEMENTATION │ +│ Agents check registry before each edit │ +│ Only edit files claimed by self │ +└────────────────────────┬────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ CLAIM RELEASE │ +│ Agent completes → Reports to orchestrator │ +│ Orchestrator marks claims as "released" │ +└─────────────────────────────────────────────────────────┘ +``` + +## Conflict Resolution Strategies + +When multiple agents need the same file: + +### Strategy 1: Sequential Execution + +```markdown +Conflict: agent-1 and agent-3 both need src/api/handler.ts + +Resolution: +- agent-1 claims file, executes first +- 
After agent-1 completes, release claim +- agent-3 claims file, executes second +``` + +### Strategy 2: Scope Partition + +```markdown +Conflict: agent-1 and agent-2 both need src/types/index.ts + +Resolution: +- Split file into src/types/auth.ts and src/types/user.ts +- agent-1 claims auth.ts +- agent-2 claims user.ts +- Update index.ts to re-export (claimed by orchestrator) +``` + +### Strategy 3: Merge Responsibility + +```markdown +Conflict: agent-1 (auth) and agent-2 (validation) both need middleware.ts + +Resolution: +- Expand agent-1's scope to include validation changes +- Remove middleware.ts from agent-2's plan +- agent-1 handles all middleware changes +``` + +### Strategy 4: Section-Based Claims + +```markdown +Conflict: Multiple agents need same config file + +Resolution: +- Claim specific sections rather than whole file +- agent-1 claims: config.ts lines 1-50 (auth section) +- agent-2 claims: config.ts lines 51-100 (db section) +- Requires careful merge at end +``` + +## Handling Violations + +### Agent Edits Unclaimed File + +```markdown +Detected: agent-2 modified src/utils/helper.ts (not in claims) + +Response: +1. Flag as violation in status report +2. Options: + a. Add retroactive claim if no conflict + b. Revert change if conflicts with another agent + c. Pause agent and request clarification +``` + +### Agent Edits Another's File + +```markdown +Detected: agent-2 modified src/auth/handler.ts (claimed by agent-1) + +Response: +1. CRITICAL violation +2. Pause agent-2 immediately +3. Check if agent-1's work is corrupted +4. Options: + a. Revert agent-2's changes + b. Have agent-1 re-do affected work + c. Manual merge by orchestrator +``` + +## Best Practices + +1. **Register claims BEFORE launching agents** - Not during +2. **One file, one owner** - Never have overlapping claims +3. **Include all touched files** - Even read-heavy files if modified +4. **Release promptly** - Don't hold claims after completion +5. 
**Verify at completion** - Check all claimed files were handled +6. **Track unclaimed edits** - They indicate planning gaps + +## Claims Registry Management + +### Creating the Registry + +```markdown +# File Claims Registry + +Last updated: [timestamp] +Swarm ID: [swarm-id] + +| Agent ID | File Path | Claimed At | Status | +|----------|-----------|------------|--------| +``` + +### Adding Claims (Orchestrator Only) + +```markdown +| new-agent | src/new/file.ts | [timestamp] | claimed | +``` + +### Releasing Claims + +Change status from `claimed` to `released`: + +```markdown +| agent-id | src/file.ts | [timestamp] | released | +``` + +### Cleaning Up + +After swarm completion: +1. Archive registry to `.claude/swarm-history/` +2. Delete or clear current registry +3. Ready for next swarm diff --git a/plugins/swarm-coordination/skills/swarm-patterns/references/status-polling.md b/plugins/swarm-coordination/skills/swarm-patterns/references/status-polling.md new file mode 100644 index 00000000..83478be8 --- /dev/null +++ b/plugins/swarm-coordination/skills/swarm-patterns/references/status-polling.md @@ -0,0 +1,152 @@ +# Status Polling Pattern + +Proactive orchestrator monitoring for swarm health and conflict detection. + +## Overview + +Instead of fire-and-forget agent launching, the orchestrator periodically spawns lightweight "status checker" agents to monitor swarm progress and identify issues early. 
+ +## Why Polling Matters + +Without polling: +- Orchestrator has no visibility into agent progress +- Conflicts discovered only after damage is done +- Stuck agents waste time until final timeout +- No opportunity for mid-execution corrections + +With polling: +- Real-time visibility into agent status +- Conflicts detected and resolved quickly +- Stuck agents identified and reassigned +- Dynamic load balancing possible + +## Polling Schedule + +### Recommended Intervals + +| Phase | Interval | Reason | +|-------|----------|--------| +| Initial launch | 30 seconds | Catch early failures fast | +| Active execution | 45-60 seconds | Balance visibility vs overhead | +| Near completion | 30 seconds | Ensure clean handoffs | +| Post-completion | Immediate | Verify success, launch next batch | + +### Adaptive Polling + +Adjust frequency based on: +- **More frequent**: High-conflict swarms, many parallel agents +- **Less frequent**: Simple tasks, sequential execution +- **Immediate**: After any agent reports an issue + +## Status Checker Agent + +The status-checker agent is designed for fast, lightweight execution: + +```yaml +model: haiku # Fast and cheap +tools: Read, Glob, Grep # Read-only, no edits +``` + +### What It Checks + +1. **Agent Status** + - Last update timestamp + - Current task progress + - Reported errors or warnings + +2. **File Claims** + - Ownership conflicts + - Stale claims from completed agents + - Unclaimed files being edited + +3. 
**Overall Health** + - Completion percentage + - Estimated time remaining + - Bottlenecks and blockers + +### Output Format + +```json +{ + "timestamp": "2025-01-15T10:35:00Z", + "overall_health": "warning", + "completion_percentage": 65, + "issues": { + "conflicts": [{ + "file": "src/api/handler.ts", + "agents": ["agent-1", "agent-3"], + "severity": "critical" + }], + "stuck_agents": [{ + "id": "agent-2", + "last_update": "2025-01-15T10:30:00Z", + "duration_seconds": 300 + }] + }, + "recommendations": [ + {"action": "pause", "target": "agent-3", "reason": "resolve conflict"} + ] +} +``` + +## Responding to Status Reports + +### Healthy Status +```json +{"overall_health": "healthy"} +``` +- Continue execution +- Schedule next poll at normal interval + +### Warning Status +```json +{"overall_health": "warning", "issues": {...}} +``` +- Review specific issues +- Take corrective action if needed +- Increase polling frequency temporarily + +### Critical Status +```json +{"overall_health": "critical", "issues": {...}} +``` +- Pause affected agents immediately +- Resolve conflicts before continuing +- Consider notifying user for input + +## Implementation Example + +```markdown +## During Implementation Phase + +1. Launch batch 1 agents (agent-1, agent-2) +2. Wait 30 seconds +3. Launch status-checker agent +4. If healthy: continue, schedule next check in 45 seconds +5. If issues: + - Conflicts: Pause later agent, let first complete + - Stuck: Check logs, consider timeout or reassignment + - Failed: Report to user, decide on retry/skip +6. 
Repeat until all agents complete +``` + +## Polling vs Event-Driven + +| Approach | Pros | Cons | +|----------|------|------| +| Polling | Simple, no agent modification needed | Some latency in detection | +| Events | Immediate detection | Requires agent cooperation | + +This plugin uses polling because: +- Works with any agent without modification +- Orchestrator maintains full control +- Simpler implementation +- Haiku model makes polling cheap + +## Best Practices + +1. **Use haiku for status checks** - Fast and cheap +2. **Don't poll too frequently** - 30 seconds minimum +3. **Act on issues promptly** - Don't just log and continue +4. **Track polling history** - Useful for debugging +5. **Combine with file claims** - Polling detects, claims prevent
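
As a rough illustration of how an orchestrator might act on these practices, the sketch below flags stuck agents from their `last_update` timestamps and picks the next polling interval. The names `find_stuck_agents` and `next_poll_interval`, the per-agent `status`/`last_update` dictionary shape, and the 300-second threshold are assumptions made for this example, not part of the plugin. The 45-second and 30-second intervals follow the schedule described above.

```python
from datetime import datetime

STUCK_AFTER_SECONDS = 300  # assumed threshold; tune per swarm


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as 2025-01-15T10:35:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_stuck_agents(agents: dict, now: datetime, threshold: int = STUCK_AFTER_SECONDS) -> list:
    """Return working agents whose last update is at least `threshold` seconds old."""
    stuck = []
    for agent_id, info in agents.items():
        if info["status"] != "working":
            continue  # completed or failed agents cannot be stuck
        idle = int((now - parse_ts(info["last_update"])).total_seconds())
        if idle >= threshold:
            stuck.append(
                {"id": agent_id, "last_update": info["last_update"], "duration_seconds": idle}
            )
    return stuck


def next_poll_interval(overall_health: str) -> int:
    """Poll every 45 s while healthy; tighten to the 30 s minimum otherwise."""
    return 45 if overall_health == "healthy" else 30


# Hypothetical swarm state, shaped after the status report example above.
AGENTS = {
    "agent-1": {"status": "working", "last_update": "2025-01-15T10:34:30Z"},
    "agent-2": {"status": "working", "last_update": "2025-01-15T10:30:00Z"},
    "agent-3": {"status": "completed", "last_update": "2025-01-15T10:20:00Z"},
}

now = parse_ts("2025-01-15T10:35:00Z")
stuck = find_stuck_agents(AGENTS, now)
health = "warning" if stuck else "healthy"
print(stuck, next_poll_interval(health))
```

The status-checker agent itself stays read-only. Logic like this would live in the orchestrator, which decides whether to pause, reassign, or keep waiting.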