On-demand adversarial quality reviews using strategy templates. Selects strategies by criticality level, executes adversarial templates against deliverables, and scores quality using LLM-as-Judge rubric. Integrates with quality-enforcement.md SSOT.
Resources
3Install
npx skillscat add geekatron/jerry/adversary Install via the SkillsCat registry.
Adversary Skill
Version: 1.0.0
Framework: Jerry Adversarial Quality (ADV)
Constitutional Compliance: Jerry Constitution v1.0
SSOT Reference:.context/rules/quality-enforcement.md
Document Audience (Triple-Lens)
This SKILL.md serves multiple audiences:
| Level | Audience | Sections to Focus On |
|---|---|---|
| L0 (ELI5) | New users, stakeholders | Purpose, When to Use, Routing Disambiguation, When to Use /adversary vs ps-critic, Quick Reference |
| L1 (Engineer) | Developers invoking agents | Invoking an Agent, Available Agents, Dependencies, Adversarial Quality Mode |
| L2 (Architect) | Workflow designers | P-003 Compliance, H-14 Integration, Constitutional Compliance, Strategy Templates |
Purpose
The Adversary skill provides on-demand adversarial quality reviews using strategy templates from the Jerry quality framework. Unlike the problem-solving skill's integrated adversarial mode (which operates within creator-critic loops), the adversary skill is invoked explicitly when you need a standalone adversarial assessment of any deliverable.
Key Capabilities
- Strategy Selection - Maps criticality levels (C1-C4) to the correct adversarial strategy sets per SSOT
- Strategy Execution - Loads and runs strategy templates against deliverables with structured finding classification
- Quality Scoring - Implements S-014 LLM-as-Judge rubric scoring with 6-dimension weighted composite
- Criticality-Aware - Automatically adjusts review depth based on deliverable criticality
- Template-Driven - All strategies follow standardized templates from
.context/templates/adversarial/
When to Use This Skill
Activate when:
- Applying adversarial strategies to a completed deliverable outside a creator-critic loop
- Scoring deliverable quality using the SSOT 6-dimension rubric
- Selecting the appropriate strategy set for a given criticality level
- Running a full C4 tournament review with all 10 strategies
- Pairing S-003 (Steelman) before S-002 (Devil's Advocate) per H-16
- Needing a standalone quality assessment without revision cycles
NEVER invoke this skill when:
- Task requires iterative creator-critic-revision loop -- Consequence: Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; use
/problem-solvingwith ps-critic instead - Task is routine code review for quick defect checks -- Consequence: Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading; use ps-reviewer instead
- Task is binary constraint validation (pass/fail compliance) -- Consequence: Adversarial strategies assess quality dimensions, not binary constraint compliance; traceability matrices not generated; use ps-validator instead
- Work is routine code changes at C1 criticality -- Consequence: Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work; use self-review (S-010) only
- Defects or bugs have obvious solutions -- Consequence: Adversarial quality assessment evaluates existing deliverables, not diagnose root causes; use
/problem-solvingfor root-cause analysis instead - User explicitly requests a quick review without adversarial rigor -- Consequence: Overriding user preference violates P-020 (user authority); respect the request
See Routing Disambiguation for full exclusion conditions with consequences.
Note: Use
/adversaryfor adversarial code review (e.g., red team security review, tournament quality assessment of code artifacts). Useps-reviewerfor routine defect detection.
Relationship to ps-critic
The adv-scorer and ps-critic agents share the same S-014 LLM-as-Judge rubric and 6-dimension weighted composite scoring methodology. They serve different workflow positions:
| Aspect | adv-scorer | ps-critic |
|---|---|---|
| Workflow | Standalone/on-demand scoring | Embedded in creator-critic-revision loops |
| Output | Focused score report with L0 summary | L0/L1/L2 multi-level critique report |
| Iteration | May be invoked once or re-invoked for re-scoring | Iterates within the H-14 cycle |
| Invocation | Via /adversary skill |
Via /problem-solving skill |
Both agents produce comparable scores from the same rubric; the choice depends on whether you need standalone assessment (adv-scorer) or iterative critique with revision guidance (ps-critic).
Available Agents
| Agent | Role | Model | Output Location |
|---|---|---|---|
adv-selector |
Strategy Selector - Maps criticality to strategy sets | haiku | Strategy selection plan |
adv-executor |
Strategy Executor - Runs strategy templates against deliverables | sonnet | Strategy execution reports |
adv-scorer |
Quality Scorer - LLM-as-Judge rubric scoring | sonnet | Quality score reports |
P-003 Compliance
All adversary agents are workers, NOT orchestrators. The MAIN CONTEXT (Claude session) orchestrates the workflow.
P-003 AGENT HIERARCHY:
======================
+-------------------+
| MAIN CONTEXT | <-- Orchestrator (Claude session)
| (orchestrator) |
+-------------------+
| | |
v v v
+------+ +------+ +------+
| adv- | | adv- | | adv- | <-- Workers (max 1 level)
|select| |exec | |scorer|
+------+ +------+ +------+
Agents CANNOT invoke other agents.
Agents CANNOT spawn subagents.
Only MAIN CONTEXT orchestrates the sequence.Invoking an Agent
Option 1: Natural Language Request
Simply describe what you need:
"Run an adversarial review of this ADR at C3 criticality"
"Score this deliverable with LLM-as-Judge"
"What strategies should I apply for a C2 review?"
"Run Devil's Advocate and Steelman on this design document"
"Execute a full C4 tournament review on the architecture proposal"The orchestrator will select the appropriate agent(s) based on keywords and context.
Option 2: Explicit Agent Request
Request a specific agent:
"Use adv-selector to pick strategies for C3 criticality"
"Have adv-executor run S-002 Devil's Advocate on the ADR"
"I need adv-scorer to produce a quality score for this synthesis"Option 3: Task Tool Invocation
For programmatic invocation within workflows:
Task(
description="adv-selector: Strategy selection for C3",
subagent_type="general-purpose",
prompt="""
You are the adv-selector agent (v1.0.0).
## ADV CONTEXT (REQUIRED)
- **Criticality Level:** C3
- **Deliverable Type:** Architecture Decision Record
- **Deliverable Path:** docs/decisions/adr-042-persistence.md
## MANDATORY PERSISTENCE (P-002)
Create file at: {output_path}
## TASK
Select the strategy set for C3 criticality per SSOT.
"""
)Dependencies / Prerequisites
The adversary skill depends on external artifacts created by other enablers. These MUST be in place before the skill is fully operational.
Strategy Template Files
All 10 strategy templates in .context/templates/adversarial/ are created by separate enablers:
| Template | Source Enabler | Status |
|---|---|---|
s-001-red-team.md |
EN-809 | Created by EN-809 |
s-002-devils-advocate.md |
EN-806 | Created by EN-806 |
s-003-steelman.md |
EN-807 | Created by EN-807 |
s-004-pre-mortem.md |
EN-808 | Created by EN-808 |
s-007-constitutional-ai.md |
EN-805 | Created by EN-805 |
s-010-self-refine.md |
EN-804 | Created by EN-804 |
s-011-cove.md |
EN-809 | Created by EN-809 |
s-012-fmea.md |
EN-808 | Created by EN-808 |
s-013-inversion.md |
EN-808 | Created by EN-808 |
s-014-llm-as-judge.md |
EN-803 | Created by EN-803 |
Naming Convention: Templates follow the pattern s-{NNN}-{slug}.md where {NNN} is the strategy ID from the quality-enforcement SSOT and {slug} is a hyphenated descriptor (e.g., s-002-devils-advocate.md). These are static reference documents versioned alongside the codebase — they are not dynamically generated.
Fallback behavior: If a template file is not found, adv-executor MUST:
- Emit a WARNING to the orchestrator with the missing template path
- Request the corrected path from the orchestrator
- Do NOT silently skip the strategy — the orchestrator decides whether to skip or provide an alternative
The skill skeleton (EN-802) defines the structure; the template enablers populate the content.
SSOT File
.context/rules/quality-enforcement.md-- MUST exist. All thresholds, strategy IDs, criticality levels, and quality dimensions are sourced from here.
Adversarial Quality Mode
SSOT Reference:
.context/rules/quality-enforcement.md-- all thresholds, strategy IDs, criticality levels, and quality dimensions are defined there. NEVER hardcode values; always reference the SSOT.
Strategy Catalog
The quality framework provides 10 selected adversarial strategies across 4 mechanistic families. See .context/rules/quality-enforcement.md (Strategy Catalog section) for the authoritative list.
| Family | Strategies | Adversary Application |
|---|---|---|
| Iterative Self-Correction | S-014 (LLM-as-Judge), S-007 (Constitutional AI Critique), S-010 (Self-Refine) | Quality scoring, constitutional compliance, self-review |
| Dialectical Synthesis | S-003 (Steelman Technique) | Strengthen arguments before critique (H-16 REQUIRED) |
| Role-Based Adversarialism | S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis) | Challenge assumptions, anticipate failures, adversarial exploration |
| Structured Decomposition | S-013 (Inversion Technique), S-012 (FMEA), S-011 (Chain-of-Verification) | Systematic failure mode analysis, verification chains |
Strategy Templates
All strategies use standardized templates from .context/templates/adversarial/:
| Template | Strategy | Purpose |
|---|---|---|
s-001-red-team.md |
S-001 Red Team Analysis | Adversarial exploration of attack surfaces |
s-002-devils-advocate.md |
S-002 Devil's Advocate | Challenge assumptions and key claims |
s-003-steelman.md |
S-003 Steelman Technique | Strengthen the best version of the argument |
s-004-pre-mortem.md |
S-004 Pre-Mortem Analysis | Anticipate failure modes |
s-007-constitutional-ai.md |
S-007 Constitutional AI Critique | Constitutional compliance verification |
s-010-self-refine.md |
S-010 Self-Refine | Iterative self-improvement |
s-011-cove.md |
S-011 Chain-of-Verification | Systematic claim verification |
s-012-fmea.md |
S-012 FMEA | Failure Mode and Effects Analysis |
s-013-inversion.md |
S-013 Inversion Technique | Invert key claims to find blind spots |
s-014-llm-as-judge.md |
S-014 LLM-as-Judge | Rubric-based quality scoring |
Criticality-Based Strategy Selection
Per SSOT, strategy activation follows criticality levels:
| Level | Required Strategies | Optional Strategies |
|---|---|---|
| C1 (Routine) | S-010 | S-003, S-014 |
| C2 (Standard) | S-007, S-002, S-014 | S-003, S-010 |
| C3 (Significant) | C2 + S-004, S-012, S-013 | S-001, S-003, S-010, S-011 |
| C4 (Critical) | All 10 selected strategies | None (all required) |
H-16 Ordering Constraint
HARD rule: S-003 (Steelman) MUST be applied before S-002 (Devil's Advocate). Always strengthen the argument before challenging it.
Quality Scoring (S-014)
The SSOT defines 6 quality dimensions with weights:
| Dimension | Weight |
|---|---|
| Completeness | 0.20 |
| Internal Consistency | 0.20 |
| Methodological Rigor | 0.20 |
| Evidence Quality | 0.15 |
| Actionability | 0.15 |
| Traceability | 0.10 |
Threshold: >= 0.92 weighted composite for C2+ deliverables (H-13)
Leniency bias counteraction: Score strictly against rubric criteria. When uncertain between adjacent scores, choose the lower one.
When to Use /adversary vs ps-critic
Both /adversary and ps-critic apply adversarial strategies, but serve different workflow positions:
| Aspect | /adversary Skill | ps-critic Agent |
|---|---|---|
| Use Case | Standalone adversarial reviews, tournament scoring, strategy template execution | Embedded quality critique within creator-critic-revision loops |
| Invocation | Explicit on-demand (/adversary or natural language request) |
Invoked by orchestrator within H-14 cycle |
| Output Focus | Strategy-specific findings (adv-executor) + quality score (adv-scorer) | L0/L1/L2 multi-level critique with dimension-level improvement guidance |
| Iteration | May be used once or re-invoked for re-scoring after revision | Iterates within the H-14 minimum 3-iteration cycle |
| Strategy Coverage | Full strategy set per criticality (C1-C4), including tournament mode (all 10) | Applies strategies appropriate to criticality, embedded in workflow |
| Agents | adv-selector, adv-executor, adv-scorer | ps-critic (single agent) |
| Output Artifacts | Strategy execution reports + quality score report | Critique report with improvement recommendations |
When to Use Each:
Use
/adversarywhen:- You need a formal adversarial review outside a creator-critic loop
- You need tournament mode scoring with all 10 strategies (C4)
- You need strategy-specific finding reports (adv-executor outputs)
- You need standalone quality scoring with S-014 rubric
- You need to apply a specific strategy template (e.g., "run S-002 Devil's Advocate on this ADR")
Use
ps-criticwhen:- You are within an orchestrated creator-critic-revision workflow
- You need iterative improvement guidance across multiple revision cycles
- You need lightweight iteration-level critique without full tournament overhead
- You are working within
/problem-solvingor/orchestrationskills
Complementary Use:
Both can work together:
ps-criticapplies strategies within workflows for embedded quality cycles/adversaryorchestrates cross-strategy tournament reviews for C4 critical deliverablesps-criticuses the same S-014 rubric and dimension scoring as adv-scorer for consistency
Tournament Mode
Tournament mode executes all 10 adversarial strategies against a C4 (Critical) deliverable in a deterministic sequence. This is the most comprehensive review level, required for irreversible decisions.
Execution Order
All 10 strategies run in the recommended order from skills/adversary/agents/adv-selector.md:
- Group A — Self-Review: S-010 (Self-Refine)
- Group B — Strengthen: S-003 (Steelman Technique)
- Group C — Challenge: S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis)
- Group D — Verify: S-007 (Constitutional AI Critique), S-011 (Chain-of-Verification)
- Group E — Decompose: S-012 (FMEA), S-013 (Inversion Technique)
- Group F — Score: S-014 (LLM-as-Judge) — ALWAYS LAST
Aggregation
Findings from all strategy execution reports are collected across all 9 executor runs. The adv-scorer agent (S-014) receives these aggregated findings as input evidence when producing the final composite score. Critical findings from any strategy block PASS regardless of score.
Timing Expectations
A C4 tournament with all 10 strategies requires approximately 11 agent invocations:
- adv-selector (1 invocation) — Strategy selection
- adv-executor (9 invocations) — One per strategy from Groups A-E
- adv-scorer (1 invocation) — Final scoring with S-014
Typical duration depends on deliverable size and complexity. Expect longer processing times for large architecture documents or governance changes.
Integration with Creator-Critic-Revision Cycle (H-14)
H-14 mandates a minimum 3-iteration creator-critic-revision cycle for C2+ deliverables. The adversary skill is not a revision loop manager -- it provides standalone adversarial assessment. The integration boundary is:
- adv-scorer produces the quality score. When invoked, adv-scorer evaluates the deliverable and returns a PASS/REVISE/ESCALATE verdict.
- If REVISE, the orchestrator feeds findings back to the creator agent. The adversary skill's findings (from adv-executor) and score breakdown (from adv-scorer) provide actionable input for the next revision iteration.
- The adversary skill can be re-invoked for re-scoring. After the creator revises, the orchestrator may invoke adv-scorer again (with the
Prior Scorecontext field) to track improvement across iterations. - Minimum 3 iterations are the orchestrator's responsibility. The adversary skill does not enforce iteration count -- it scores when asked. The orchestrator (or
/orchestrationskill) tracks the iteration count per H-14.
Workflow position: The adversary skill sits at the "critic" position within the H-14 cycle when used for quality scoring. It can replace or complement ps-critic depending on whether the orchestrator needs standalone scoring (adv-scorer) or iterative critique (ps-critic).
Constitutional Compliance
All agents adhere to the Jerry Constitution v1.0:
| Principle | Requirement | Consequence of Violation |
|---|---|---|
| P-003 | NEVER spawn recursive subagents -- max 1 level | Agent hierarchy violation; uncontrolled token consumption |
| P-020 | NEVER override user intent -- ask before destructive ops | Unauthorized action; trust erosion |
| P-022 | NEVER deceive about actions, capabilities, or confidence | Governance undermined; quality assessment invalidated |
| P-001 | NEVER present findings without evidence or rubric-based scoring | Unreliable outputs; unfounded claims propagate downstream |
| P-002 | NEVER leave outputs in transient context only -- persist to files | Context rot vulnerability; artifacts lost on session compaction |
| P-004 | NEVER omit strategy IDs, template paths, or evidence citations | Untraceable decisions; audit trail broken |
| P-011 | NEVER make findings without tying them to specific deliverable evidence | Unsupported recommendations; confidence inflated without basis |
Quick Reference
Common Workflows
| Need | Agent | Command Example |
|---|---|---|
| Pick strategies for criticality | adv-selector | "What strategies for C3 review?" |
| Run a specific strategy | adv-executor | "Run S-002 Devil's Advocate on this ADR" |
| Score deliverable quality | adv-scorer | "Score this deliverable with LLM-as-Judge" |
| Steelman + Devil's Advocate pair | adv-executor | "Run Steelman then Devil's Advocate on this design" |
| Full C4 tournament | All three | "Run full C4 tournament review" |
Agent Selection Hints
| Keywords | Likely Agent |
|---|---|
| select, pick, which strategies, criticality, C1/C2/C3/C4 | adv-selector |
| run, execute, apply, template, strategy, findings | adv-executor |
| score, judge, rubric, dimensions, threshold, 0.92 | adv-scorer |
Routing Disambiguation
When this skill is the wrong choice and what happens if misrouted.
| Condition | Use Instead | Consequence of Misrouting |
|---|---|---|
| Iterative creator-critic-revision loop needed | /problem-solving (ps-critic) |
Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; ps-critic operates within H-14 revision cycles while /adversary produces standalone assessments |
| Routine code review for quick defect checks | /problem-solving (ps-reviewer) |
Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading |
| Constraint validation (pass/fail compliance) | /problem-solving (ps-validator) |
Adversarial strategies assess quality dimensions, not binary constraint compliance; ps-validator produces traceability matrices while /adversary produces quality scores |
| Research, analysis, or root cause investigation | /problem-solving (ps-researcher or ps-investigator) |
Adversarial agents evaluate existing deliverables, not produce new analysis; no research methodology or causal investigation capability |
| Security-hardened software design or threat modeling | /eng-team |
/adversary applies quality assessment strategies (S-001 Red Team Analysis is quality-focused); /eng-team provides STRIDE/DREAD threat modeling and OWASP compliance |
| Offensive security testing or penetration testing | /red-team |
/adversary "red team" keyword refers to S-001 quality strategy; /red-team provides MITRE ATT&CK kill chain methodology for authorized penetration testing |
| C1 routine work with obvious solutions | Self-review (S-010) only | Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work |
References
| Source | Content |
|---|---|
.context/rules/quality-enforcement.md |
SSOT for thresholds, strategies, criticality levels |
.context/templates/adversarial/ |
Strategy execution templates |
skills/problem-solving/SKILL.md |
Integrated adversarial quality mode (ps-critic) |
docs/governance/JERRY_CONSTITUTION.md |
Constitutional principles |
| ADR-EPIC002-001 | Strategy selection and composite scores |
| ADR-EPIC002-002 | 5-layer enforcement architecture |
Skill Version: 1.0.0
Constitutional Compliance: Jerry Constitution v1.0
SSOT: .context/rules/quality-enforcement.md
Created: 2026-02-15