adversary

On-demand adversarial quality reviews using strategy templates. Selects strategies by criticality level, executes adversarial templates against deliverables, and scores quality using LLM-as-Judge rubric. Integrates with quality-enforcement.md SSOT.

geekatron 26 5 Updated 4mo ago

Resources

GitHub

Install

npx skillscat add geekatron/jerry/adversary

Install via the SkillsCat registry.

SKILL.md

Adversary Skill

Version: 1.0.0
Framework: Jerry Adversarial Quality (ADV)
Constitutional Compliance: Jerry Constitution v1.0
SSOT Reference: .context/rules/quality-enforcement.md

Document Audience (Triple-Lens)

This SKILL.md serves multiple audiences:

Level	Audience	Sections to Focus On
L0 (ELI5)	New users, stakeholders	Purpose, When to Use, Routing Disambiguation, When to Use /adversary vs ps-critic, Quick Reference
L1 (Engineer)	Developers invoking agents	Invoking an Agent, Available Agents, Dependencies, Adversarial Quality Mode
L2 (Architect)	Workflow designers	P-003 Compliance, H-14 Integration, Constitutional Compliance, Strategy Templates

Purpose

The Adversary skill provides on-demand adversarial quality reviews using strategy templates from the Jerry quality framework. Unlike the problem-solving skill's integrated adversarial mode (which operates within creator-critic loops), the adversary skill is invoked explicitly when you need a standalone adversarial assessment of any deliverable.

Key Capabilities

Strategy Selection - Maps criticality levels (C1-C4) to the correct adversarial strategy sets per SSOT
Strategy Execution - Loads and runs strategy templates against deliverables with structured finding classification
Quality Scoring - Implements S-014 LLM-as-Judge rubric scoring with 6-dimension weighted composite
Criticality-Aware - Automatically adjusts review depth based on deliverable criticality
Template-Driven - All strategies follow standardized templates from .context/templates/adversarial/

When to Use This Skill

Activate when:

Applying adversarial strategies to a completed deliverable outside a creator-critic loop
Scoring deliverable quality using the SSOT 6-dimension rubric
Selecting the appropriate strategy set for a given criticality level
Running a full C4 tournament review with all 10 strategies
Pairing S-003 (Steelman) before S-002 (Devil's Advocate) per H-16
Needing a standalone quality assessment without revision cycles

NEVER invoke this skill when:

Task requires iterative creator-critic-revision loop -- Consequence: Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; use /problem-solving with ps-critic instead
Task is routine code review for quick defect checks -- Consequence: Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading; use ps-reviewer instead
Task is binary constraint validation (pass/fail compliance) -- Consequence: Adversarial strategies assess quality dimensions, not binary constraint compliance; traceability matrices not generated; use ps-validator instead
Work is routine code changes at C1 criticality -- Consequence: Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work; use self-review (S-010) only
Defects or bugs have obvious solutions -- Consequence: Adversarial quality assessment evaluates existing deliverables, not diagnose root causes; use /problem-solving for root-cause analysis instead
User explicitly requests a quick review without adversarial rigor -- Consequence: Overriding user preference violates P-020 (user authority); respect the request

See Routing Disambiguation for full exclusion conditions with consequences.

Note: Use /adversary for adversarial code review (e.g., red team security review, tournament quality assessment of code artifacts). Use ps-reviewer for routine defect detection.

Relationship to ps-critic

The adv-scorer and ps-critic agents share the same S-014 LLM-as-Judge rubric and 6-dimension weighted composite scoring methodology. They serve different workflow positions:

Aspect	adv-scorer	ps-critic
Workflow	Standalone/on-demand scoring	Embedded in creator-critic-revision loops
Output	Focused score report with L0 summary	L0/L1/L2 multi-level critique report
Iteration	May be invoked once or re-invoked for re-scoring	Iterates within the H-14 cycle
Invocation	Via `/adversary` skill	Via `/problem-solving` skill

Both agents produce comparable scores from the same rubric; the choice depends on whether you need standalone assessment (adv-scorer) or iterative critique with revision guidance (ps-critic).

Available Agents

Agent	Role	Model	Output Location
`adv-selector`	Strategy Selector - Maps criticality to strategy sets	haiku	Strategy selection plan
`adv-executor`	Strategy Executor - Runs strategy templates against deliverables	sonnet	Strategy execution reports
`adv-scorer`	Quality Scorer - LLM-as-Judge rubric scoring	sonnet	Quality score reports

P-003 Compliance

All adversary agents are workers, NOT orchestrators. The MAIN CONTEXT (Claude session) orchestrates the workflow.

P-003 AGENT HIERARCHY:
======================

  +-------------------+
  | MAIN CONTEXT      |  <-- Orchestrator (Claude session)
  | (orchestrator)    |
  +-------------------+
     |        |        |
     v        v        v
  +------+ +------+ +------+
  | adv- | | adv- | | adv- |   <-- Workers (max 1 level)
  |select| |exec  | |scorer|
  +------+ +------+ +------+

  Agents CANNOT invoke other agents.
  Agents CANNOT spawn subagents.
  Only MAIN CONTEXT orchestrates the sequence.

Invoking an Agent

Option 1: Natural Language Request

Simply describe what you need:

"Run an adversarial review of this ADR at C3 criticality"
"Score this deliverable with LLM-as-Judge"
"What strategies should I apply for a C2 review?"
"Run Devil's Advocate and Steelman on this design document"
"Execute a full C4 tournament review on the architecture proposal"

The orchestrator will select the appropriate agent(s) based on keywords and context.

Option 2: Explicit Agent Request

Request a specific agent:

"Use adv-selector to pick strategies for C3 criticality"
"Have adv-executor run S-002 Devil's Advocate on the ADR"
"I need adv-scorer to produce a quality score for this synthesis"

Option 3: Task Tool Invocation

For programmatic invocation within workflows:

Task(
    description="adv-selector: Strategy selection for C3",
    subagent_type="general-purpose",
    prompt="""
You are the adv-selector agent (v1.0.0).

## ADV CONTEXT (REQUIRED)
- **Criticality Level:** C3
- **Deliverable Type:** Architecture Decision Record
- **Deliverable Path:** docs/decisions/adr-042-persistence.md

## MANDATORY PERSISTENCE (P-002)
Create file at: {output_path}

## TASK
Select the strategy set for C3 criticality per SSOT.
"""
)

Dependencies / Prerequisites

The adversary skill depends on external artifacts created by other enablers. These MUST be in place before the skill is fully operational.

Strategy Template Files

All 10 strategy templates in .context/templates/adversarial/ are created by separate enablers:

Template	Source Enabler	Status
`s-001-red-team.md`	EN-809	Created by EN-809
`s-002-devils-advocate.md`	EN-806	Created by EN-806
`s-003-steelman.md`	EN-807	Created by EN-807
`s-004-pre-mortem.md`	EN-808	Created by EN-808
`s-007-constitutional-ai.md`	EN-805	Created by EN-805
`s-010-self-refine.md`	EN-804	Created by EN-804
`s-011-cove.md`	EN-809	Created by EN-809
`s-012-fmea.md`	EN-808	Created by EN-808
`s-013-inversion.md`	EN-808	Created by EN-808
`s-014-llm-as-judge.md`	EN-803	Created by EN-803

Naming Convention: Templates follow the pattern s-{NNN}-{slug}.md where {NNN} is the strategy ID from the quality-enforcement SSOT and {slug} is a hyphenated descriptor (e.g., s-002-devils-advocate.md). These are static reference documents versioned alongside the codebase — they are not dynamically generated.

Fallback behavior: If a template file is not found, adv-executor MUST:

Emit a WARNING to the orchestrator with the missing template path
Request the corrected path from the orchestrator
Do NOT silently skip the strategy — the orchestrator decides whether to skip or provide an alternative

The skill skeleton (EN-802) defines the structure; the template enablers populate the content.

SSOT File

.context/rules/quality-enforcement.md -- MUST exist. All thresholds, strategy IDs, criticality levels, and quality dimensions are sourced from here.

Adversarial Quality Mode

SSOT Reference: .context/rules/quality-enforcement.md -- all thresholds, strategy IDs, criticality levels, and quality dimensions are defined there. NEVER hardcode values; always reference the SSOT.

Strategy Catalog

The quality framework provides 10 selected adversarial strategies across 4 mechanistic families. See .context/rules/quality-enforcement.md (Strategy Catalog section) for the authoritative list.

Family	Strategies	Adversary Application
Iterative Self-Correction	S-014 (LLM-as-Judge), S-007 (Constitutional AI Critique), S-010 (Self-Refine)	Quality scoring, constitutional compliance, self-review
Dialectical Synthesis	S-003 (Steelman Technique)	Strengthen arguments before critique (H-16 REQUIRED)
Role-Based Adversarialism	S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis)	Challenge assumptions, anticipate failures, adversarial exploration
Structured Decomposition	S-013 (Inversion Technique), S-012 (FMEA), S-011 (Chain-of-Verification)	Systematic failure mode analysis, verification chains

Strategy Templates

All strategies use standardized templates from .context/templates/adversarial/:

Template	Strategy	Purpose
`s-001-red-team.md`	S-001 Red Team Analysis	Adversarial exploration of attack surfaces
`s-002-devils-advocate.md`	S-002 Devil's Advocate	Challenge assumptions and key claims
`s-003-steelman.md`	S-003 Steelman Technique	Strengthen the best version of the argument
`s-004-pre-mortem.md`	S-004 Pre-Mortem Analysis	Anticipate failure modes
`s-007-constitutional-ai.md`	S-007 Constitutional AI Critique	Constitutional compliance verification
`s-010-self-refine.md`	S-010 Self-Refine	Iterative self-improvement
`s-011-cove.md`	S-011 Chain-of-Verification	Systematic claim verification
`s-012-fmea.md`	S-012 FMEA	Failure Mode and Effects Analysis
`s-013-inversion.md`	S-013 Inversion Technique	Invert key claims to find blind spots
`s-014-llm-as-judge.md`	S-014 LLM-as-Judge	Rubric-based quality scoring

Criticality-Based Strategy Selection

Per SSOT, strategy activation follows criticality levels:

Level	Required Strategies	Optional Strategies
C1 (Routine)	S-010	S-003, S-014
C2 (Standard)	S-007, S-002, S-014	S-003, S-010
C3 (Significant)	C2 + S-004, S-012, S-013	S-001, S-003, S-010, S-011
C4 (Critical)	All 10 selected strategies	None (all required)

H-16 Ordering Constraint

HARD rule: S-003 (Steelman) MUST be applied before S-002 (Devil's Advocate). Always strengthen the argument before challenging it.

Quality Scoring (S-014)

The SSOT defines 6 quality dimensions with weights:

Dimension	Weight
Completeness	0.20
Internal Consistency	0.20
Methodological Rigor	0.20
Evidence Quality	0.15
Actionability	0.15
Traceability	0.10

Threshold: >= 0.92 weighted composite for C2+ deliverables (H-13)

Leniency bias counteraction: Score strictly against rubric criteria. When uncertain between adjacent scores, choose the lower one.

When to Use /adversary vs ps-critic

Both /adversary and ps-critic apply adversarial strategies, but serve different workflow positions:

Aspect	/adversary Skill	ps-critic Agent
Use Case	Standalone adversarial reviews, tournament scoring, strategy template execution	Embedded quality critique within creator-critic-revision loops
Invocation	Explicit on-demand (`/adversary` or natural language request)	Invoked by orchestrator within H-14 cycle
Output Focus	Strategy-specific findings (adv-executor) + quality score (adv-scorer)	L0/L1/L2 multi-level critique with dimension-level improvement guidance
Iteration	May be used once or re-invoked for re-scoring after revision	Iterates within the H-14 minimum 3-iteration cycle
Strategy Coverage	Full strategy set per criticality (C1-C4), including tournament mode (all 10)	Applies strategies appropriate to criticality, embedded in workflow
Agents	adv-selector, adv-executor, adv-scorer	ps-critic (single agent)
Output Artifacts	Strategy execution reports + quality score report	Critique report with improvement recommendations

When to Use Each:

Use /adversary when:
- You need a formal adversarial review outside a creator-critic loop
- You need tournament mode scoring with all 10 strategies (C4)
- You need strategy-specific finding reports (adv-executor outputs)
- You need standalone quality scoring with S-014 rubric
- You need to apply a specific strategy template (e.g., "run S-002 Devil's Advocate on this ADR")
Use ps-critic when:
- You are within an orchestrated creator-critic-revision workflow
- You need iterative improvement guidance across multiple revision cycles
- You need lightweight iteration-level critique without full tournament overhead
- You are working within /problem-solving or /orchestration skills

Complementary Use:

Both can work together:

ps-critic applies strategies within workflows for embedded quality cycles
/adversary orchestrates cross-strategy tournament reviews for C4 critical deliverables
ps-critic uses the same S-014 rubric and dimension scoring as adv-scorer for consistency

Tournament Mode

Tournament mode executes all 10 adversarial strategies against a C4 (Critical) deliverable in a deterministic sequence. This is the most comprehensive review level, required for irreversible decisions.

Execution Order

All 10 strategies run in the recommended order from skills/adversary/agents/adv-selector.md:

Group A — Self-Review: S-010 (Self-Refine)
Group B — Strengthen: S-003 (Steelman Technique)
Group C — Challenge: S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis)
Group D — Verify: S-007 (Constitutional AI Critique), S-011 (Chain-of-Verification)
Group E — Decompose: S-012 (FMEA), S-013 (Inversion Technique)
Group F — Score: S-014 (LLM-as-Judge) — ALWAYS LAST

Aggregation

Findings from all strategy execution reports are collected across all 9 executor runs. The adv-scorer agent (S-014) receives these aggregated findings as input evidence when producing the final composite score. Critical findings from any strategy block PASS regardless of score.

Timing Expectations

A C4 tournament with all 10 strategies requires approximately 11 agent invocations:

adv-selector (1 invocation) — Strategy selection
adv-executor (9 invocations) — One per strategy from Groups A-E
adv-scorer (1 invocation) — Final scoring with S-014

Typical duration depends on deliverable size and complexity. Expect longer processing times for large architecture documents or governance changes.

Integration with Creator-Critic-Revision Cycle (H-14)

H-14 mandates a minimum 3-iteration creator-critic-revision cycle for C2+ deliverables. The adversary skill is not a revision loop manager -- it provides standalone adversarial assessment. The integration boundary is:

adv-scorer produces the quality score. When invoked, adv-scorer evaluates the deliverable and returns a PASS/REVISE/ESCALATE verdict.
If REVISE, the orchestrator feeds findings back to the creator agent. The adversary skill's findings (from adv-executor) and score breakdown (from adv-scorer) provide actionable input for the next revision iteration.
The adversary skill can be re-invoked for re-scoring. After the creator revises, the orchestrator may invoke adv-scorer again (with the Prior Score context field) to track improvement across iterations.
Minimum 3 iterations are the orchestrator's responsibility. The adversary skill does not enforce iteration count -- it scores when asked. The orchestrator (or /orchestration skill) tracks the iteration count per H-14.

Workflow position: The adversary skill sits at the "critic" position within the H-14 cycle when used for quality scoring. It can replace or complement ps-critic depending on whether the orchestrator needs standalone scoring (adv-scorer) or iterative critique (ps-critic).

Constitutional Compliance

All agents adhere to the Jerry Constitution v1.0:

Principle	Requirement	Consequence of Violation
P-003	NEVER spawn recursive subagents -- max 1 level	Agent hierarchy violation; uncontrolled token consumption
P-020	NEVER override user intent -- ask before destructive ops	Unauthorized action; trust erosion
P-022	NEVER deceive about actions, capabilities, or confidence	Governance undermined; quality assessment invalidated
P-001	NEVER present findings without evidence or rubric-based scoring	Unreliable outputs; unfounded claims propagate downstream
P-002	NEVER leave outputs in transient context only -- persist to files	Context rot vulnerability; artifacts lost on session compaction
P-004	NEVER omit strategy IDs, template paths, or evidence citations	Untraceable decisions; audit trail broken
P-011	NEVER make findings without tying them to specific deliverable evidence	Unsupported recommendations; confidence inflated without basis

Quick Reference

Common Workflows

Need	Agent	Command Example
Pick strategies for criticality	adv-selector	"What strategies for C3 review?"
Run a specific strategy	adv-executor	"Run S-002 Devil's Advocate on this ADR"
Score deliverable quality	adv-scorer	"Score this deliverable with LLM-as-Judge"
Steelman + Devil's Advocate pair	adv-executor	"Run Steelman then Devil's Advocate on this design"
Full C4 tournament	All three	"Run full C4 tournament review"

Agent Selection Hints

Keywords	Likely Agent
select, pick, which strategies, criticality, C1/C2/C3/C4	adv-selector
run, execute, apply, template, strategy, findings	adv-executor
score, judge, rubric, dimensions, threshold, 0.92	adv-scorer

Routing Disambiguation

When this skill is the wrong choice and what happens if misrouted.

Condition	Use Instead	Consequence of Misrouting
Iterative creator-critic-revision loop needed	`/problem-solving` (ps-critic)	Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; ps-critic operates within H-14 revision cycles while /adversary produces standalone assessments
Routine code review for quick defect checks	`/problem-solving` (ps-reviewer)	Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading
Constraint validation (pass/fail compliance)	`/problem-solving` (ps-validator)	Adversarial strategies assess quality dimensions, not binary constraint compliance; ps-validator produces traceability matrices while /adversary produces quality scores
Research, analysis, or root cause investigation	`/problem-solving` (ps-researcher or ps-investigator)	Adversarial agents evaluate existing deliverables, not produce new analysis; no research methodology or causal investigation capability
Security-hardened software design or threat modeling	`/eng-team`	/adversary applies quality assessment strategies (S-001 Red Team Analysis is quality-focused); /eng-team provides STRIDE/DREAD threat modeling and OWASP compliance
Offensive security testing or penetration testing	`/red-team`	/adversary "red team" keyword refers to S-001 quality strategy; /red-team provides MITRE ATT&CK kill chain methodology for authorized penetration testing
C1 routine work with obvious solutions	Self-review (S-010) only	Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work

References

Source	Content
`.context/rules/quality-enforcement.md`	SSOT for thresholds, strategies, criticality levels
`.context/templates/adversarial/`	Strategy execution templates
`skills/problem-solving/SKILL.md`	Integrated adversarial quality mode (ps-critic)
`docs/governance/JERRY_CONSTITUTION.md`	Constitutional principles
ADR-EPIC002-001	Strategy selection and composite scores
ADR-EPIC002-002	5-layer enforcement architecture

Skill Version: 1.0.0
Constitutional Compliance: Jerry Constitution v1.0
SSOT: .context/rules/quality-enforcement.md
Created: 2026-02-15

adversary

Resources

Install

Adversary Skill

Document Audience (Triple-Lens)

Purpose

Key Capabilities

When to Use This Skill

Relationship to ps-critic

Available Agents

P-003 Compliance

Invoking an Agent

Option 1: Natural Language Request

Option 2: Explicit Agent Request

Option 3: Task Tool Invocation

Dependencies / Prerequisites

Strategy Template Files

SSOT File

Adversarial Quality Mode

Strategy Catalog

Strategy Templates

Criticality-Based Strategy Selection

H-16 Ordering Constraint

Quality Scoring (S-014)

When to Use /adversary vs ps-critic

Tournament Mode

Execution Order

Aggregation

Timing Expectations

Integration with Creator-Critic-Revision Cycle (H-14)

Constitutional Compliance

Quick Reference

Common Workflows

Agent Selection Hints

Routing Disambiguation

References

Categories

Install

Recommended Skills