red-green-refactor

"Domain-agnostic TDD methodology for iterative improvement. Enforces baseline measurement, minimal changes, and rigorous testing through commitment devices. Use when applying red-green-refactor workflow to any domain: prompts, skills, code, tests. Triggers: TDD methodology, baseline measurement, iterative improvement, red-green-refactor."

sebnow 8 Updated 5mo ago

GitHub

Install

npx skillscat add sebnow/configs/red-green-refactor

Install via the SkillsCat registry.

SKILL.md

Red-Green-Refactor Methodology

Test-Driven Development applies beyond code:
observe failures with current approach,
make minimal improvements,
then test rigorously.

This skill defines the domain-agnostic methodology.
Consuming skills define what "baseline", "minimal change", and "testing" mean
in their domain.

Red Phase: Establish Baseline

Before changing anything, measure current performance.

State: "Beginning Red phase - establishing baseline"

Required steps:

Define success criteria:
- What does good output look like?
- How will you measure it?
- What's the acceptable threshold?
Observe and measure baseline:
- Run current approach against criteria
- Document specific failures and patterns
- Record baseline metrics
Document failures:
- What patterns appear in failures?
- What rationalizations or mistakes occur?
- What concrete symptoms indicate the problem?

Phase gate: You cannot proceed to Green phase
without documented baseline measurements.

Forbidden rationalizations:

"This is simple enough to skip testing"
"I know what the output will be"
"I'll improve it then test later"
"Quick changes are better than none"

If you haven't measured baseline, you don't know if changes help.

Failure mode:
Skipping Red creates changes that may perform worse than the original.

Green Phase: Minimal Improvements

Apply one change at a time. Measure after each change.

State: "Beginning Green phase - making minimal improvements"

Required steps:

Address one observed failure at a time
Make the smallest change that could fix it
Measure impact against baseline after each change
Only proceed to next failure when current one is addressed

Do not change multiple things simultaneously.
This isolates what helps vs what hurts.

Phase gate: Each change must show measurable improvement
or be reverted before trying alternatives.

Refactor Phase: Rigorous Testing

Test comprehensively. Compare against baseline. Close loopholes.

State: "Beginning Refactor phase - testing rigorously"

Required steps:

Run against all test cases, not just the ones you fixed
Compare results against original baseline
Verify no regressions from Green phase changes
Close loopholes discovered during testing

Common loopholes:

Required steps get skipped
Rationalizations bypass the process
Edge cases missed in initial testing
Changes that help one case hurt another

Only after rigorous testing can you claim improvement.

Commitment Devices

These mechanisms prevent process shortcuts:

State announcements -
Declare current phase before starting work.
This creates accountability.

Phase gates -
Cannot proceed to next phase without completing current one.
Red requires baseline. Green requires measured improvement.

Forbidden rationalizations -
Explicit list of thoughts that signal process violation.
When you think these, stop and follow the methodology.

Failure mode warnings -
Each phase documents what goes wrong when skipped,
making the cost of shortcuts concrete.

Domain Specialization

Consuming skills define domain-specific meanings:

Baseline: What to measure
(test results, agent behavior, prompt output, code health)
Minimal change: What counts as one change
(one technique, one instruction, one code fix)
Testing: How to verify improvement
(test cases, multi-run statistics, fresh instances, test suites)
Loopholes: Domain-specific failure patterns to watch for

The methodology stays the same. The domain provides the vocabulary.

red-green-refactor

Install

Red-Green-Refactor Methodology

Red Phase: Establish Baseline

Green Phase: Minimal Improvements

Refactor Phase: Rigorous Testing

Commitment Devices

Domain Specialization

Categories

Install

Recommended Skills