epic-team

Orchestrate a Claude Code agent team to autonomously work a Linear epic across multiple repos

OmniNode-ai 3 2 Updated 4mo ago

Resources

GitHub

Install

npx skillscat add omninode-ai/omniclaude/epic-team

Install via the SkillsCat registry.

SKILL.md

Epic Team Orchestration

Session lifetime: The monitoring phase is alive only while this session runs. Use /epic-team {epic_id} --resume to re-enter after a disconnection.

Architecture note (v2.0.0): epic-team is a thin orchestrator. All business logic lives in
independently-invocable composable sub-skills. epic-team's job is coordination, state, and routing
— not implementation.

Overview

Decompose a Linear epic into per-repo workstreams and autonomously drive them to completion.
The team lead (this session) owns planning, dispatch, state persistence, and lifecycle
notifications. Tickets are executed by dispatching ticket-pipeline as sequential Task()
subagents directly from the team-lead session, in dependency-respecting waves.

Key constraint: Workers spawned as team members (via TeamCreate + Task(team_name=...))
go idle immediately and never process tasks. The proven working pattern is direct dispatch
from the team-lead session — see the Architecture section below.

If the epic has zero child tickets, epic-team invokes decompose-epic to create sub-tickets,
then posts a Slack LOW_RISK gate. Silence for 30 minutes = proceed.

Composable Sub-Skills

epic-team orchestrates these independently-invocable primitives:

Sub-Skill	Purpose	Ticket
`decompose-epic`	Analyze epic → create Linear child tickets	OMN-2522
`slack-gate`	LOW_RISK / MEDIUM_RISK / HIGH_RISK human gates	OMN-2521
`ticket-pipeline`	Per-ticket pipeline (implement → review → PR → CI → merge)	—

ticket-pipeline in turn composes:

Sub-Skill	Purpose	Ticket
`ticket-work`	Implement ticket (autonomous mode)	OMN-2526
`local-review`	Review + fix loop	—
`ci-watch`	Poll CI, auto-fix failures	OMN-2523
`pr-watch`	Poll PR reviews, auto-fix comments	OMN-2524
`auto-merge`	Merge PR with HIGH_RISK Slack gate	OMN-2525

Each layer is independently invocable:

ticket-pipeline runs standalone without epic-team
ticket-work runs standalone without ticket-pipeline
ci-watch runs standalone on any PR

If the epic has zero child tickets, epic-team automatically invokes the decompose-epic sub-skill to analyze the epic description and create sub-tickets, then posts a Slack LOW_RISK gate. Silence for 30 minutes means proceed with newly created tickets.

Usage Examples

# Dry run — see decomposition without spawning agents
/epic-team OMN-2000 --dry-run

# Full run
/epic-team OMN-2000

# Resume after session disconnect
/epic-team OMN-2000 --resume

# Force restart (archive existing state; pauses if workers active)
/epic-team OMN-2000 --force

# Force restart even with active workers (dangerous)
/epic-team OMN-2000 --force --force-kill

# Route unmatched tickets to omniplan triage
/epic-team OMN-2000 --force-unmatched

Orchestration Flow

epic-team OMN-XXXX
  → Fetch child tickets from Linear
  → If 0 child tickets:
      → Dispatch decompose-epic (composable, returns skill_result)
      → Read ~/.claude/skill-results/{context_id}/decompose-epic.json
      → Dispatch slack-gate (LOW_RISK, 30 min, silence=proceed)
      → Read ~/.claude/skill-results/{context_id}/slack-gate.json
      → If rejected: stop
      → Re-fetch newly created tickets
  → Assign tickets to repos via repo_manifest
  → Build dependency waves:
      Wave 0: independent tickets + cross-repo Part 1 splits (run in parallel)
      Wave 1: cross-repo Part 2 splits (run after Wave 0 completes)
  → For each wave: dispatch ticket-pipeline per ticket as Task() from team-lead session
  → Await all Task() calls in wave before starting next wave
  → Collect results (status, pr_url, branch) from each Task()
  → Post-wave integration check (OMN-3345): run gap cycle --no-fix per repo touched
      → GREEN/YELLOW/RED per repo → post to Slack epic thread
      → Write integration_check section to state.yaml (non-blocking — always advances)
  → Send Slack lifecycle notifications (started, ticket done, epic done)
  → Persist state to ~/.claude/epics/{epic_id}/state.yaml

Dispatch: decompose-epic

When epic has 0 child tickets:

Task(
  subagent_type="onex:polymorphic-agent",
  description="epic-team: auto-decompose empty epic {epic_id}",
  prompt="The epic {epic_id} has no child tickets.

    Invoke: Skill(skill=\"onex:decompose-epic\", args=\"{epic_id}\")

    Read result from ~/.claude/skill-results/{context_id}/decompose-epic.json
    Report back: created_tickets (list of IDs and titles), count."
)

Dispatch: Ticket-Pipeline per Ticket (Direct Dispatch Pattern)

For each ticket in a wave, dispatch ticket-pipeline as a Task() from the team-lead session:

Task(
  subagent_type="onex:polymorphic-agent",
  description="epic-team: ticket-pipeline for {ticket_id} [{repo}]",
  prompt="You are executing ticket {ticket_id} for epic {epic_id}.

    Ticket: {ticket_id} - {title}
    URL: {url}
    Repo: {repo} at {repo_path}
    Epic: {epic_id}  Run: {run_id}

    Invoke: Skill(skill=\"onex:ticket-pipeline\", args=\"{ticket_id}\")

    After ticket-pipeline completes, report back:
    - ticket_id: {ticket_id}
    - status: (merged/failed/blocked)
    - pr_url: (if available)
    - branch: (branch name used)"
)

Wave parallelism: All Task() calls within a wave MUST be dispatched in the same response
(same message) for true parallelism. Do NOT dispatch tickets sequentially within a wave.

Wave serialization: Wave N+1 starts only after all Task() calls from Wave N have returned.

DEPRECATED: Spawning per-repo workers via TeamCreate + Task(team_name=...) + a
WORKER_TEMPLATE is no longer used. See prompt.md for the deprecated WORKER_TEMPLATE
preserved for historical reference.

Skill Result Communication

All sub-skills write their output to ~/.claude/skill-results/{context_id}/:

Sub-Skill	Output File	Key Fields
`decompose-epic`	`decompose-epic.json`	status, created_tickets, count
`slack-gate`	`slack-gate.json`	status (accepted/rejected/timeout)
`ticket-pipeline`	`ticket-pipeline.json`	status, ticket_id, pr_url
`ticket-work`	`ticket-work.json`	status, pr_url, phase_reached
`local-review`	`local-review.json`	status, iterations_run
`ci-watch`	`ci-watch.json`	status, fix_cycles_used
`pr-watch`	`pr-watch.json`	status, fix_cycles_used
`auto-merge`	`auto-merge.json`	status, merge_commit

State Persistence

Runtime state is persisted to ~/.claude/epics/{epic_id}/state.yaml:

epic_id: OMN-XXXX
run_id: f084b6c3
status: monitoring  # queued | monitoring | done | failed
checkpoint:
  schema_version: 1
  last_completed_wave: 0
  waves:
    - wave_id: 0
      tickets: [OMN-2001, OMN-2002]
      status: completed  # pending | in_progress | completed | failed
      completed_at: "2026-03-06T..."
    - wave_id: 1
      tickets: [OMN-2003]
      status: pending
  open_prs:
    OMN-2001: {pr_number: 45, repo: "omniclaude", branch: "jonah/omn-2001-..."}
  failures:
    OMN-2003: {class: "ci_failure_ruff", attempts: 1, last_error: "..."}
  last_update_utc: "2026-03-06T..."
workers:
  - repo: omniclaude
    tickets: [OMN-2001, OMN-2002]
    status: running  # running | done | failed
ticket_status:
  OMN-2001: merged
  OMN-2002: running

Use --resume to re-enter monitoring from persisted state after session disconnect.

Empty Epic Auto-Decompose

When epic has 0 child tickets:

[LOW_RISK] epic-team: Auto-decomposed OMN-XXXX

Epic had no child tickets. Created N sub-tickets:
  - OMN-YYYY: [title]
  - OMN-ZZZZ: [title]
  ...

Reply "reject" within 30 minutes to cancel. Silence = proceed with orchestration.

--dry-run behavior for empty epic

Invoke decompose-epic --dry-run (returns plan, no tickets created). Print plan. Do not post
Slack gate.

Repo Manifest

Ticket-to-repo assignment uses plugins/onex/skills/epic-team/repo_manifest.yaml:

MIN_TOP_SCORE: 4

repos:
  - name: omniclaude
    path: ~/Code/omniclaude
    keywords: [hooks, skills, agents, claude, plugin, ticket-pipeline]
  - name: omnibase_core
    path: ~/Code/omnibase_core
    keywords: [nodes, contracts, runtime, onex]
  - name: omnibase_infra
    path: ~/Code/omnibase_infra
    keywords: [kubernetes, deploy, infra, helm]

Keyword matching is case-insensitive. Tickets with no keyword match are UNMATCHED.
Use --force-unmatched to route them to omniplan as TRIAGE tasks.

Worktree Policy

Workers create isolated git worktrees at:

~/.claude/worktrees/{epic_id}/{run_id}/{ticket_id}/

Stale worktrees are cleaned up automatically after merge when auto_cleanup_merged_worktrees: true
(default).

Architecture

epic-team is a thin composition layer. It owns:

Epic decomposition (via decompose-epic)
Ticket-to-repo assignment (via repo_manifest)
Wave construction (group tickets by dependency into parallel waves)
Direct Task() dispatch of ticket-pipeline per ticket (from team-lead session)
State persistence (~/.claude/epics/{epic_id}/state.yaml)
Slack lifecycle notifications (started, ticket done, epic done)

It does NOT own:

Ticket implementation (delegated to ticket-pipeline → ticket-work)
Code review (delegated to local-review)
CI polling (delegated to ci-watch)
PR review polling (delegated to pr-watch)
Merge execution (delegated to auto-merge)

Execution Model

Direct dispatch from team-lead session is the authoritative execution pattern:

Team-lead constructs waves of tickets grouped by dependency
For each wave, team-lead dispatches one Task() per ticket in parallel
Team-lead awaits all Task() completions in a wave before starting the next wave
Results (status, pr_url, branch) are collected directly from Task() return values
No background workers, no TaskList polling loop, no SendMessage coordination

Why not per-repo workers? Workers spawned as team members via TeamCreate + Task(team_name=...)
go idle immediately (idleReason: available) and never process tasks from the task queue.
This is a confirmed behavior across multiple epic runs. The direct dispatch pattern is the
only execution model that reliably completes tickets.

Failure Taxonomy and Recovery Strategies

Failure Class	Symptoms	Recovery Strategy
`rate_limit`	Sub-agent exits with rate limit error	Wait 60s, retry the ticket via `ticket-pipeline` sequentially
`context_limit`	Sub-agent hits max context length mid-ticket	Spawn fresh sub-agent; previous work is in the branch
`ci_failure_uv`	CI fails with lock file or uv version error	Verify CI uv version; regenerate lock with matching version
`ci_failure_ruff`	CI fails with lint/format error	Run `uv run ruff check --fix` + `uv run ruff format`, recommit
`stale_branch`	PR fails to merge — "main has moved"	`git rebase origin/main`, re-push, re-enable auto-merge
`wrong_repo`	Ticket worked in wrong repo	Look up target repo in Linear ticket `repo` field; re-dispatch
`blocker_unresolved`	Ticket blocked by another in-progress ticket	Move to end of queue; complete blocking ticket first
`pr_template_blocked`	Mergeability gate rejected — missing PR body sections	Update PR body with required sections (Summary/Risk/Test Evidence/Rollback), rerun gate
`unknown`	Failure doesn't match above patterns	Escalate to user: ticket ID, last command output, branch state

Failure Attempt Definition

A failure attempt is counted when ticket-pipeline returns non-zero exit OR the summary contains an explicit failure class. Partial completions (branch exists, some steps done) count as one attempt.

Self-Healing Rules

Attempt retry up to 2 times within the same run only. On interruption + re-run, counts reset.
After 2 failed attempts: produce failure report and stop dispatching new tickets.
Escalation = log the failure report and continue with remaining unblocked tickets.

epic-team

Resources

Install

Epic Team Orchestration

Overview

Composable Sub-Skills

Usage Examples

Orchestration Flow

Dispatch: decompose-epic

Dispatch: Ticket-Pipeline per Ticket (Direct Dispatch Pattern)

Skill Result Communication

State Persistence

Empty Epic Auto-Decompose

--dry-run behavior for empty epic

Repo Manifest

Worktree Policy

Architecture

Execution Model

Failure Taxonomy and Recovery Strategies

Failure Attempt Definition

Self-Healing Rules

See Also

Categories

Install

epic-team

Resources

Install

Epic Team Orchestration

Overview

Composable Sub-Skills

Usage Examples

Orchestration Flow

Dispatch: decompose-epic

Dispatch: Ticket-Pipeline per Ticket (Direct Dispatch Pattern)

Skill Result Communication

State Persistence

Empty Epic Auto-Decompose

--dry-run behavior for empty epic

Repo Manifest

Worktree Policy

Architecture

Execution Model

Failure Taxonomy and Recovery Strategies

Failure Attempt Definition

Self-Healing Rules

See Also

Categories

Install

Recommended Skills