devils-advocate

"Challenge research assumptions and identify weaknesses in arguments. Stress-test papers before submission or revision."

mseok 5 1 Updated 4mo ago

Resources

GitHub

Install

npx skillscat add mseok/dot/devils-advocate

Install via the SkillsCat registry.

SKILL.md

Devil's Advocate Skill

Challenge research assumptions and identify weaknesses in your arguments.

Purpose

Based on Scott Cunningham's Part 3: "Creating Devil's Advocate Agents for Tough Problems" - addressing the "LLM thing of over-confidence in diagnosing a problem."

For formal code audits with replication scripts and referee reports, use the Referee 2 agent instead (.codex/agents/referee2-reviewer.md). This skill is for quick adversarial feedback on arguments, not systematic audits.

When to Use

Before submitting a paper
When stuck on a research problem
When you want to stress-test an argument
During paper revision planning

When NOT to Use

Code audits — use the Referee 2 agent instead
Replication verification — use the Referee 2 agent instead
Quick proofreading — just ask for a read-through
When you want validation — this skill is designed to challenge, not affirm

Workflow

Understand the claim — Read the paper/argument being evaluated
Generate competing hypotheses — If evaluating a research question or design, load references/competing-hypotheses.md and generate 3-5 rival explanations before critiquing
Run the debate — Use the multi-turn debate protocol below (default) or single-shot mode for quick checks
Deliver the verdict — Synthesize surviving critiques with severity ratings

Multi-Turn Debate Protocol (Default)

Inspired by the simulated scientific debates in Google's AI Co-Scientist. A one-shot critique is easy for an LLM to produce but often superficial. Multi-turn debates force each critique to survive a defense, filtering out weak objections and sharpening the strong ones.

Round 1: Adversarial Critic

Adopt the persona of a hostile but competent reviewer. Challenge on:

Theoretical foundations — Are the assumptions justified?
Methodology — Limitations? Alternative approaches?
Data — Selection bias? Measurement issues? External validity?
Causal claims — Alternative explanations? Confounders?
Contribution — Novel enough? Does it matter?

Produce numbered critiques (aim for 5-8), each with a concrete statement of the problem.

Round 2: Defense

Switch persona to the paper's author. For each numbered critique, provide the strongest possible defense:

Cite evidence from the paper that addresses the concern
Explain design choices that mitigate the issue
Acknowledge limitations honestly where the defense is weak
Propose concrete fixes where the critique has merit

Round 3: Adjudication

Switch to an impartial senior reviewer. For each critique-defense pair, rule:

Critique stands — the defense is insufficient; this is a real weakness
Critique partially addressed — defense has merit but issue remains
Critique resolved — the defense adequately addresses the concern

Final Synthesis

Produce a structured report with only the surviving critiques (stands + partially addressed), ranked by severity:

## Devil's Advocate Report

### Critical (must fix before submission)
1. [Critique] — [Why the defense failed] — [Suggested fix]

### Major (reviewers will likely raise)
2. [Critique] — [What remains after defense] — [Suggested fix]

### Minor (worth acknowledging)
3. [Critique] — [Residual concern] — [How to preempt]

### Dismissed
- [Critiques that were resolved in Round 2, listed briefly for transparency]

Single-Shot Mode

For quick checks (e.g., "just poke holes in this argument"), skip the multi-turn protocol and produce a direct critique. Use when the user says "quick", "just challenge this", or the input is a paragraph rather than a full paper.

Example Use

"Play devil's advocate on my MCDM paper about preference drift - specifically challenge my identification strategy and the assumptions about utility functions."

Cross-References

Skill	When to use instead/alongside
`$interview-me`	To develop the idea further through structured interview
`$research-ideation`	To generate alternative research questions on the same topic
`$proofread`	For language/formatting review rather than argument critique

devils-advocate

Resources

Install

Devil's Advocate Skill

Purpose

When to Use

When NOT to Use

Workflow

Multi-Turn Debate Protocol (Default)

Round 1: Adversarial Critic

Round 2: Defense

Round 3: Adjudication

Final Synthesis

Single-Shot Mode

Example Use

Cross-References

Categories

Install

Recommended Skills