design

"Design Ops v3.1. Journey → PRP → Issues → TDD. Tiered pipeline with invariant enforcement, devil's advocate, and e2e testing. USE WHEN design, PRP, validate, requirements, init project, review implementation."

saselvan 0 Updated 4mo ago

Install

npx skillscat add saselvan/design-ops-plugin

Install via the SkillsCat registry.

SKILL.md

Design Ops v3.1

Transform intent into executable PRPs. Issues own the vertical slicing. TDD per issue.

Tier Selection (AI-Proposed, Human-Confirmed)

At the start of any coding task, Claude MUST propose a tier and wait for confirmation:

"This looks like a MEDIUM task — want me to generate a PRP, or just implement with tests?"

Do not skip this. Do not default to SMALL to save tokens. Propose honestly based on:

SMALL (single file, obvious scope, clear pattern):
  → Just implement with tests. No PRP needed.
  → Do NOT nag about PRPs for small work.

MEDIUM (multi-file, < 1 day, known domain):
  → /design prp {journey}        Generate PRP + red-team review + auto-validate
  → /prp-to-issues {prp}         Interactive slicing into GitHub issues
  → /design build {issues}       TDD per issue (red-green-refactor)

LARGE (multi-day, architectural impact, high risk, new domain):
  → /design discover {journey}   Explore + grill-me → decisions log file
  → /design prp {journey}        Generate PRP + red-team + auto-validate
  → /prp-to-issues {prp}         Interactive slicing into GitHub issues
  → /design build {issues}       TDD per issue (red-green-refactor)
  → /design retro                Only if something surprised you

Signal	Tier
Bug fix, add a field, simple UI change	SMALL
New page, new feature, multi-file change	MEDIUM
New architecture, compliance-critical, unknown domain	LARGE
Confidence score < 5	Escalate to LARGE
New domain or tech stack	LARGE
Multiple teams or stakeholders	LARGE

When in doubt, start MEDIUM. You can escalate to LARGE if you hit uncertainty during PRP generation.
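
The signal table can be read as a small decision function. Below is a minimal Python sketch under stated assumptions: `TaskSignals`, its field names, and `propose_tier` are hypothetical and not part of Design Ops. The result is only a proposal; the human still confirms the tier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskSignals:
    """Hypothetical summary of a task; field names are illustrative only."""
    files_touched: int
    obvious_scope: bool = True
    new_domain_or_stack: bool = False
    architectural_impact: bool = False
    multiple_stakeholders: bool = False
    confidence_score: Optional[float] = None  # None until a PRP has been scored


def propose_tier(s: TaskSignals) -> str:
    """Return the tier to propose to the human."""
    # Any LARGE signal from the table wins outright.
    if (
        s.new_domain_or_stack
        or s.architectural_impact
        or s.multiple_stakeholders
        or (s.confidence_score is not None and s.confidence_score < 5)
    ):
        return "LARGE"
    # SMALL is reserved for single-file work with obvious scope.
    if s.files_touched <= 1 and s.obvious_scope:
        return "SMALL"
    # Multi-file work, and anything in doubt, starts at MEDIUM.
    return "MEDIUM"
```

For example, `propose_tier(TaskSignals(files_touched=4))` proposes MEDIUM, and the same task with `confidence_score=4` escalates to LARGE.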

Command Reference

/design discover {journey-or-description}

Interactive exploration before PRP generation. LARGE tier only.

What happens:

1. Read the journey/problem statement
2. Explore the codebase for relevant patterns and conventions
3. Run /grill-me: the devil's advocate walks the decision tree and challenges assumptions
4. Resolve each branch interactively
5. Write a decisions log file (not just conversation; the file survives context compression)

Output: Decisions log file at docs/design/discoveries/{feature-name}.md:

# Discovery: {Feature Name}
Date: {date}

## Decisions
1. {Decision 1} — {rationale}
2. {Decision 2} — {rationale}

## Open Questions
- {Unresolved question}

## Confidence Concerns
- {Factor}: {concern} (estimated score: {X}/10)

## Codebase Patterns Found
- {Pattern 1}: {where found, how to reuse}

When to skip: When you already know the approach and just need to formalize it.

/design prp {input} [--domain domain] [--tier medium|large]

Generate a PRP from a journey, problem statement, or discovery decisions log.

Input: Journey file, description, discovery log, or "from conversation."

Domain detection:

1. Use the --domain flag if specified (it overrides .designops)
2. Otherwise, check for a .designops file in the project root → use declared domain(s)
3. Fall back to universal-only if neither exists
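
A minimal Python sketch of domain resolution follows. It assumes the --domain flag wins when both sources are present, as the Domain Configuration section states. `resolve_domains` and its parameters are hypothetical names, and the sketch takes the already-parsed `domains` list rather than reading the .designops file itself.

```python
from typing import Optional, Sequence


def resolve_domains(
    flag_domain: Optional[str],
    config_domains: Optional[Sequence[str]],
) -> list[str]:
    """Pick the domains whose invariants should be loaded.

    flag_domain:    value of --domain, or None if the flag was not given
    config_domains: `domains` list from .designops, or None if the file is absent
    """
    if flag_domain:
        # Assumption: an explicit flag overrides the project config.
        return [flag_domain]
    if config_domains:
        return list(config_domains)
    # Universal-only: no domain invariants, universal ones still apply.
    return []
```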

What happens:

1. Load domain from .designops config or --domain flag
2. Explore codebase → detect patterns, conventions, tech stack
3. Generate PRP using domain-aware template (6 core sections + domain extensions)
4. Validate invariants on the generated PRP (built-in)
5. Score confidence (1-10 scale, 5 weighted factors)
6. Red-team review (MEDIUM and LARGE), 7 adversarial questions:
   - What failure paths are missing?
   - What assumptions are hidden?
   - What edge cases aren't covered?
   - What are the component dependencies?
   - Where could integration break?
   - What's over-engineered?
   - What would a user actually do differently?
7. Auto-validate: run validate-prp.sh on the generated file

Output: PRP markdown file with confidence score and validation results.

Hard gate: If the confidence score is RED (< 4), the pipeline STOPS. The human must either type "proceed with risk" explicitly or fix the gaps. Claude cannot override this gate.

Invariant violations: Violations of universal invariants 1-10 are BLOCKING. Violations of domain invariants are ADVISORY (warn, don't reject), except in healthcare and security domains, where ALL invariants are blocking.
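
The blocking rule can be sketched as a predicate. This is an illustration only: `is_blocking`, the `"universal"`/`"domain"` labels, and the substring match on domain names are assumptions, not the behavior of the actual validator.

```python
# Assumption: strict domains are recognized by substring, so "healthcare-ai"
# counts as a healthcare domain.
STRICT_DOMAIN_KEYWORDS = ("healthcare", "security")


def is_blocking(invariant_scope: str, project_domains: list[str]) -> bool:
    """Return True if a violation should halt the pipeline.

    invariant_scope: "universal" (invariants 1-10) or "domain" (hypothetical labels)
    project_domains: domains declared for the project
    """
    if invariant_scope == "universal":
        return True
    # Domain invariants only warn, unless the project is in a strict domain.
    return any(
        keyword in domain
        for domain in project_domains
        for keyword in STRICT_DOMAIN_KEYWORDS
    )
```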

/prp-to-issues {prp}

This is where vertical slicing happens. The PRP defines WHAT (scope, success criteria, dependencies). Issues define HOW (vertical slices, build order).

See the prp-to-issues skill for full details. Key points:

Interactive quiz loop to refine slices with the user
Each slice is a thin end-to-end tracer bullet (schema → API → UI → tests)
HITL vs AFK classification
Dependency ordering
Issues link back to PRP success criteria
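
Dependency ordering is a topological sort. A minimal sketch using Python's standard `graphlib` (Python 3.9+) is below; `build_order` and the issue IDs are hypothetical, and the prp-to-issues skill may order slices differently.

```python
from graphlib import TopologicalSorter


def build_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return issue IDs so that each issue comes after the issues it depends on.

    depends_on maps an issue ID to the set of issue IDs it depends on.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical slices: #2 needs #1, and #3 needs both.
order = build_order({"#1": set(), "#2": {"#1"}, "#3": {"#1", "#2"}})
print(order)  # ['#1', '#2', '#3']
```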

/design build {issues-or-prp}

True TDD per issue. Replaces the old /design implement + /design run split.

Per-issue loop (in dependency order):

1. Read issue's acceptance criteria
2. Write failing tests for THIS issue only          (RED)
3. Write minimal code to pass                        (GREEN)
4. Run integration test (this + all previous issues)
5. Run e2e smoke test (if within domain time budget)
6. Refactor if needed
7. Commit
8. Next issue

Progress-based circuit breaker:

After each fix attempt:
  - Did failing test count decrease?    → PROGRESS, continue
  - Did error messages change?          → PROGRESS, continue
  - Same failures, same errors?         → STUCK

STUCK after 2 identical failures → escalate immediately with:
  - What failed
  - What was tried
  - Diagnosis: code problem / issue problem / PRP problem
  - Recommended fix at the right level

Hard max: 5 attempts per issue regardless of progress
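
The circuit breaker above can be sketched in Python as follows. This is one reading of the rules, not the actual implementation: `RunResult` and `run_with_breaker` are hypothetical names, "2 identical failures" is taken to mean two consecutive attempts with the same failing count and the same error messages, and any other change is treated as progress.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_ATTEMPTS = 5      # hard max per issue, regardless of progress
STUCK_THRESHOLD = 2   # identical failures in a row before escalating


@dataclass(frozen=True)
class RunResult:
    failing: int              # number of failing tests after the attempt
    errors: frozenset[str]    # distinct error messages


def run_with_breaker(attempt_fix: Callable[[], RunResult]) -> str:
    """Call attempt_fix until tests pass, the loop is stuck, or attempts run out."""
    previous: Optional[RunResult] = None
    identical = 0
    for _ in range(MAX_ATTEMPTS):
        result = attempt_fix()
        if result.failing == 0:
            return "PASSED"
        # Same failing count and same errors as last time: no progress.
        identical = identical + 1 if result == previous else 1
        if identical >= STUCK_THRESHOLD:
            return "STUCK"  # escalate with what failed, what was tried, diagnosis
        previous = result
    return "MAX_ATTEMPTS"
```

A caller would pass a function that applies one fix and reruns the tests, then escalate on either `"STUCK"` or `"MAX_ATTEMPTS"`.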

Testing pyramid per issue:

Unit tests       → Does this issue's logic work?
Contract test    → Does output match the defined interface?
Integration test → Does this issue work WITH previous issues?
E2E smoke test   → Does the full workflow still work?

Implementation invariants (Claude Code specific):

API contract changes → test ALL consumers (INV-IMPL-001)
Verification evidence required — snapshots, not claims (INV-IMPL-002)
No ad-hoc changes outside the pipeline for LARGE tier
Maintain dependency awareness (API → consumer map)
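
INV-IMPL-001 combined with an API → consumer map amounts to a lookup. The sketch below is an assumption about how such a map might be represented; `consumers_to_test` and the example paths are hypothetical.

```python
def consumers_to_test(
    changed_apis: set[str],
    consumer_map: dict[str, set[str]],
) -> set[str]:
    """Return every consumer of any API whose contract changed."""
    affected: set[str] = set()
    for api in changed_apis:
        # APIs missing from the map contribute no known consumers.
        affected |= consumer_map.get(api, set())
    return affected


# Hypothetical map: endpoint → modules that call it.
consumer_map = {
    "/api/orders": {"checkout-page", "admin-dashboard"},
    "/api/users": {"profile-page"},
}
print(sorted(consumers_to_test({"/api/orders"}, consumer_map)))
# ['admin-dashboard', 'checkout-page']
```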

Completion summary: When all issues pass, output:

## Build Complete: {Feature Name}

### Proven by tests
- {Success criterion 1} ✓ (verified by: {test name})
- {Success criterion 2} ✓ (verified by: {test name})

### Requires production observation
- {Success criterion 3} — monitor: {metric, dashboard, or method}
- {Success criterion 4} — verify after: {timeframe}

### Open risks
- {Risk from PRP that wasn't fully mitigated}

### Issues completed
- #{issue1}: {title} ✓
- #{issue2}: {title} ✓

/design retro

Extract learnings after implementation. Only run when something surprised you.

What to capture:

What invariant would have caught this earlier?
What was the gap between the PRP and reality?
Should the confidence rubric be updated?

Rule: New invariants come from pain, not theory. Each new invariant must cite the specific failure it would have prevented.

/design init {project-name} [--domain domain]

Bootstrap project structure with domain config.

{project-name}/
├── docs/design/
│   ├── journeys/
│   ├── discoveries/        ← NEW: decisions logs from /design discover
│   ├── PRPs/
│   └── deltas/
├── .designops              ← Domain config (read by /design prp)
├── CONVENTIONS.md
└── README.md

.designops file format:

domains:
  - consumer-product
e2e:
  tool: playwright
  time_budget: 120s

Domain Configuration (.designops)

Per-project config file. Eliminates per-command --domain flags.

# .designops
domains:
  - healthcare-ai
  - data-architecture
e2e:
  tool: pytest           # playwright | pytest | notebook | manual
  time_budget: 300s      # max time for e2e smoke test
  run_frequency: every_slice  # every_slice | every_2_slices | at_gates

Domain auto-loading: /design prp reads this file automatically. The --domain flag overrides it.

E2E Smoke Test (Domain-Specific)

E2E means different things per domain. Define it in .designops and in the PRP's domain extension.

Domain	E2E tool	What it verifies	Typical time
consumer-product	Playwright	Browser click-through of critical user path	30-120s
data-architecture	pytest / notebook	Pipeline run with test data → output schema + row counts + quality	60-300s
healthcare-ai	pytest + audit check	Above + PHI absent from output + audit log populated	120-600s
integration	pytest / curl	Request → response → contract match + side effects	15-60s
physical-construction	manual checklist	Inspection gate completion	N/A (human)

Time budget rule: If e2e exceeds the time budget, run it every 2-3 issues instead of every issue. Always run at final build completion.
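
The time budget rule can be sketched as a scheduling check. The helper names and the `fallback_interval` parameter are hypothetical; the sketch assumes `time_budget` is written as whole seconds with an `s` suffix, as in the .designops examples (Python 3.9+ for `str.removesuffix`).

```python
def parse_seconds(budget: str) -> int:
    """Parse a value such as '120s' into seconds."""
    return int(budget.strip().removesuffix("s"))


def should_run_e2e(
    issue_number: int,        # 1-based position in the build order
    total_issues: int,
    last_e2e_seconds: float,  # duration of the most recent e2e run
    time_budget: str,         # e.g. "120s" from .designops
    fallback_interval: int = 2,
) -> bool:
    """Decide whether to run the e2e smoke test after this issue."""
    # Always run at final build completion.
    if issue_number == total_issues:
        return True
    # Within budget: run after every issue.
    if last_e2e_seconds <= parse_seconds(time_budget):
        return True
    # Over budget: run only every `fallback_interval` issues (2 or 3 per the rule).
    return issue_number % fallback_interval == 0
```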

Confidence Scoring

Quantitative risk assessment. 5 weighted factors:

Factor	Weight	What it measures
Requirement Clarity	30%	Are requirements unambiguous and testable?
Pattern Availability	25%	Do proven patterns exist for this?
Test Coverage Plan	20%	How well-defined is validation?
Edge Case Handling	15%	Are failure modes identified?
Tech Familiarity	10%	How well do you know the tech?

Score → Action:

1-3 (Red): HARD STOP. Cannot proceed without explicit human override ("proceed with risk"). Escalate to LARGE tier.
4-6 (Yellow): PROCEED with explicit risk acknowledgment in PRP.
7-9 (Green): PROCEED normally.
10 (Perfect): Suspicious. Verify nothing was missed.
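
A worked sketch of the scoring arithmetic: each factor is rated 1-10, multiplied by its weight, and summed. The function names and factor keys are hypothetical, and how fractional scores map to bands is an assumption here (anything below 4 is treated as Red, matching the hard gate's "< 4"); the confidence rubric may define this differently.

```python
# Weights in percent, taken from the factor table.
WEIGHTS = {
    "requirement_clarity": 30,
    "pattern_availability": 25,
    "test_coverage_plan": 20,
    "edge_case_handling": 15,
    "tech_familiarity": 10,
}


def confidence_score(factors: dict[str, int]) -> float:
    """Weighted average of the five factor ratings (each 1-10)."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS) / 100


def action_for(score: float) -> str:
    """Map a score to the action band (band edges for fractions are assumed)."""
    if score < 4:
        return "RED"      # hard stop, needs "proceed with risk"
    if score < 7:
        return "YELLOW"   # proceed with risk acknowledgment in the PRP
    if score < 10:
        return "GREEN"    # proceed normally
    return "PERFECT"      # suspicious, verify nothing was missed


score = confidence_score({
    "requirement_clarity": 8,
    "pattern_availability": 7,
    "test_coverage_plan": 6,
    "edge_case_handling": 5,
    "tech_familiarity": 9,
})
print(score, action_for(score))  # 7.0 GREEN
```

The example works out as (30·8 + 25·7 + 20·6 + 15·5 + 10·9) / 100 = 700 / 100 = 7.0.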

Invariant Enforcement

Universal Invariants (always enforced, blocking)

#	Invariant	Key test
1	Ambiguity is Invalid	No "properly", "easily" without definition
2	State Must Be Explicit	Every verb has before→action→after
3	Emotional Intent Must Compile	"Feel X" becomes ":= concrete mechanism"
4	No Irreversible Without Recovery	Destructive verbs have undo/backup
5	Execution Must Fail Loudly	No "gracefully" or "silently"
6	Scope Must Be Bounded	No "all" without limits
7	Validation Must Be Executable	Metrics + thresholds, not "looks good"
8	Cost Boundaries Must Be Explicit	Limits on API/storage/money
9	Blast Radius Must Be Declared	Write ops declare affected scope
10	Degradation Path Must Exist	External deps have fallbacks
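
Some of the key tests are word-level checks. The sketch below flags only the words quoted in the table for invariants 1 and 5; it is not the logic of validate-prp.sh, and it does not check whether a flagged word is defined elsewhere in the PRP. `find_flagged_terms` is a hypothetical name.

```python
import re

# Invariant number → words quoted in the table above.
FLAGGED_TERMS = {
    1: ("properly", "easily"),       # Ambiguity is Invalid
    5: ("gracefully", "silently"),   # Execution Must Fail Loudly
}


def find_flagged_terms(prp_text: str) -> list[tuple[int, int, str]]:
    """Return (line_number, invariant_number, term) for each flagged word."""
    hits = []
    for line_no, line in enumerate(prp_text.splitlines(), start=1):
        for invariant, terms in FLAGGED_TERMS.items():
            for term in terms:
                # Whole-word, case-insensitive match.
                if re.search(rf"\b{term}\b", line, flags=re.IGNORECASE):
                    hits.append((line_no, invariant, term))
    return hits


sample = (
    "Errors are handled gracefully.\n"
    "The page loads in under 2 seconds.\n"
    "Users can easily export."
)
print(find_flagged_terms(sample))
# [(1, 5, 'gracefully'), (3, 1, 'easily')]
```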

Domain Invariants (loaded per project, advisory by default)

Domain invariants are loaded from the .designops config. In healthcare and security domains they are BLOCKING.

Code-Level Invariants (during /design build)

ID	Rule
TYPE-001	Single canonical location for database/domain types
TYPE-002	TypeScript interfaces must match DB schema nullability
TYPE-003	No `as any` for known tables
FRAME-001	Use correct framework version patterns
INV-IMPL-001	API contract changes → test all consumers
INV-IMPL-002	Verification evidence required (snapshots, not claims)

Two Agents

Agent	What it does	When it runs
validator	Checks PRP against universal invariants (1-10) + domain invariants. BLOCKING or ADVISORY per domain.	During `/design prp`
red-team	Devil's advocate. 7 adversarial questions. BLOCKING findings halt the pipeline.	During `/design prp` (MEDIUM and LARGE)

PRP Structure (6 Core Sections)

The PRP defines WHAT must be true. Issues define HOW to get there.

1. Meta + Confidence Score: domain, risk quantification (1-10), tier
2. Problem & Solution: what's broken, what we're building, scope
3. Success Criteria: pseudo-code conditions (SUCCESS := ALL(...), FAILURE := ANY(...))
4. Scope & Dependencies: components, their relationships, what depends on what
5. Risks & Fallbacks: circuit breakers, degradation paths
6. Validation Commands: integration, e2e smoke test (domain-specific), build/quality
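
The Success Criteria pseudo-code maps directly onto `all()` and `any()`. The sketch below is an illustration: `evaluate`, the check names, and the `"INCOMPLETE"` outcome are assumptions, and giving failure conditions precedence over success is a design choice of this sketch rather than something the template specifies.

```python
def evaluate(success_checks: dict[str, bool], failure_checks: dict[str, bool]) -> str:
    """Evaluate SUCCESS := ALL(...) and FAILURE := ANY(...)."""
    # FAILURE := ANY(...): a single true failure condition is enough.
    if any(failure_checks.values()):
        return "FAILURE"
    # SUCCESS := ALL(...): every success condition must hold.
    if all(success_checks.values()):
        return "SUCCESS"
    return "INCOMPLETE"  # hypothetical label: not failed, not yet proven


# Hypothetical checks for an export feature.
print(evaluate(
    {"export_completes_under_5s": True, "csv_row_count_matches": True},
    {"data_loss_detected": False},
))  # SUCCESS
```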

Domain extensions appended when relevant. Template: ~/.claude/design-ops/templates/prp-template.md

Key Files

design-ops/
├── SKILL.md                    # This file (v3.1 command reference)
├── design.md                   # Skill loaded into context
├── system-invariants.md        # Universal invariants 1-10
├── validate-prp.sh             # Auto-validator (runs after /design prp)
├── domains/                    # Domain-specific invariants
├── templates/
│   ├── prp-template.md         # Domain-aware PRP template
│   ├── confidence-rubric.md    # Scoring guidelines
│   └── prp-examples/           # Filled examples
└── _archive/                   # v2.x files (preserved, not loaded)

Version: 3.1
Predecessor: v3.0 (refined by grill-me session)
Last updated: 2026-03-22
