Harden a plan/design/algorithm/decision via a thesis→antithesis→synthesis loop: each round, two independent reviewers (a different model + a freshly-spawned subagent) critique the current draft; fold in only findings the draft doesn't already resolve, adjudicating reviewer disagreements by evidence; stop after two consecutive clean rounds (cap 10, usually 3–5). Independence is the engine — a different model and an isolated perspective expose blind spots one model can't see in itself. Use when the user says "dialectic" or "정반합". NOT for trivial or no-real-tension changes.
Resources
2Install
npx skillscat add beingcognitive/unprimed-dialectic Install via the SkillsCat registry.
/dialectic — convergence loop
Harden a plan / design / algorithm / decision until independent review stops finding things. A single
model can't see its own blind spots; a different model and a fresh, uncontaminated perspective can.
That independence — and the disagreement it produces — is the engine.
It hardens reasoning; it does not prove correctness. For empirical / code / financial / security /
performance / UX claims, dialectic agreement is not evidence — carry through to real verification (run
it, test it, benchmark it, check the source). Don't use for trivial or no-real-tension changes.
Step 0 — Seed the thesis
The loop hardens a draft; it doesn't author one from nothing. If the input isn't already a structured
draft, write v0 yourself: goal · constraints · key assumptions · the approach · alternatives
considered · known risks · how it will be verified. If a missing hard requirement could change the
plan, pause for the user's answer before starting rounds; otherwise state your assumption and proceed.
Show v0 — that's round 1's thesis.
Right-size
Always two independent reviewers per round — that pair is the method's load-bearing minimum. What
you right-size is the expected round count and whether to add a third domain/human reviewer, not
the core two:
- Small or already-mature artifact → often hits the two-clean-round minimum in ~2–3 rounds.
- High-stakes → add a domain/human reviewer; may run to the cap.
- So trivial a single gut-check would do → below this skill's bar; don't run the loop.
State the expected round count to the user up front.
Each round
Send the complete current draft (+ an optional neutral Settled constraints / non-goals block)
to each reviewer — in parallel, in one message, each blind to the other's output. Nothing else:
no prior rounds, ledgers, verdicts, or your rationale. ("Parallel" = the two dispatches; any
verify-a-fact step happens later, in synthesis, after both return.)Reviewer A = model-independent: a different model/provider when available (e.g.
/codex). If none
is available — or it's your own model family — use the most independent reviewer you have, label the
round "reduced independence/confidence," and don't call it model-independent.
Reviewer B = perspective-independent: spawn a new subagent (Agent tool); pass it only the
complete draft + the reviewer prompt; do not pass conversation history, prior rounds, ledgers, or
your rationale (same base model as you, but isolated context and a different review axis).
Give the two different review axes, not different tones — e.g. A = adversarial correctness;
B = simplicity / over-engineering, or a risk axis the artifact needs (security / data-loss / ops / cost).
Match each axis to the reviewer's comparative advantage — in particular, route checkable/factual
claims to the web-search-capable reviewer (codex), and judgment/taste/argument axes to the other.Reviewer prompt (stable, neutral):
Critique this draft for correctness, missing constraints, hidden assumptions, edge cases,
over-engineering, under-specification, implementation risk, and verification gaps. Each finding must
cite a specific draft element (or a specific missing one) and its failure mechanism — no generic
checklist worries. Severity-grouped; for each: observation · why it matters · concrete fix. Don't
rewrite the draft. If it's solid, say so.Synthesize into the per-round ledger (below); the revised draft becomes next round's thesis.
Per-round ledger (fill literally)
Round N
finding | source(A/B) | verdict(accept/reject) | one-line reason
...
Disagreements (only if any): claim A | claim B | deciding evidence/mechanism | resolution | confidence | needs-empirical?
Accepted this round: <n> (the round is CLEAN iff 0)
Self-bias: <one way this synthesis could be wrong; if plausible, re-open the last reject>Synthesis rules
- Accept a finding only if the draft doesn't already adequately resolve it — adequate = the draft
states the decision + rationale + constraints + a verification/mitigation path at an actionable level.
Merely naming the topic ≠ resolved. Also accept: unresolved, materially under-specified, or contradicted. - Adjudicate disagreements; don't average. A divergence is exactly where a lone model would adopt a
wrong premise. If it turns on a fact you can verify now, pause and verify before revising; if not,
record it as an unresolved risk, mark confidence, and make the draft conditional on the assumption. - Discard non-actionable / out-of-scope / mechanism-less / already-resolved findings.
- Re-run a reviewer at most once, and only for a format/process failure (e.g. no severity grouping,
empty, ignored the prompt) — never because findings are sparse, negative, or inconvenient (that's
selection bias); log any re-run. A re-run's findings count normally toward acceptance, and a re-run
never by itself resets the clean-round counter; a round whose only findings came from a re-run still
counts as CLEAN if synthesis accepts none.
Stop
- A round is CLEAN when synthesis accepts no finding (the draft needs no change). Findings discarded
as already-resolved / cosmetic / out-of-scope / non-actionable do not count and do not reset the
counter. Stop after two consecutive clean rounds. - A clean round — including round 1 — still triggers one more round with freshly-spawned reviewers on
the unchanged draft (independence is what makes the confirmation meaningful). If the artifact had no real
tension, note it was likely too trivial for this method. - Hard cap = 10 rounds. At any stop (clean-rounds or cap), the exit checklist must pass to
declare convergence; if it doesn't (including hitting the cap), report the unresolved items to the user
and stop without declaring convergence. - Exit checklist: objectives met · assumptions explicit · risks handled · alternatives + why-rejected
recorded · unresolved disagreements documented · verification plan named. - High-stakes: the domain/human reviewer reviews at least the initial and the final pre-stop draft
(and any round their concern is materially revised); their unresolved objections block convergence. - Mode-shift = the real signal: early rounds change the draft (defects, design); near convergence,
feedback turns to traceability/tests/verification. When critique becomes verification, you're there. - On stop, deliver the Output (below). Proceed to implementation + verification only if the user
asked for it; otherwise stop at the hardened draft.
Guardrails (skip these and the result is fake — single source; don't restate them inline)
- Don't prime the reviewers. Keep expected verdicts, "already-rejected" lists, "say READY if nothing's
new," and round numbers OUT of the reviewer prompt. Priming hands them the answer and suppresses real
findings — "convergence" becomes compliance. Neutral context (settled constraints, non-goals) is fine;
a prescribed conclusion is not. Tell your prediction to the user only. - Don't stop on a feeling. Stop only by the rule, the stated budget cap, or a user-approved scope —
never because the draft "feels done" (the classic confirmation-bias tell). - Judge novelty against the draft's content, not the reviewer's wording.
- Independence is earned, not assumed. If both reviewers are effectively the same model, agreement
proves little; diversify model / axis / evidence. Verify reviewer B's isolation — some harnesses
spawn a subagent that inherits the whole conversation regardless of what you nominally pass it (observed
with Claude Code's Agent tool: a "fresh" reviewer cited an unrelated detail from earlier in the session).
Probe for it. If B can't be isolated, it is fresh reasoning + a different axis, not context-independent —
your real independence then rests on the different-model reviewer (A). - Privacy preflight. Reviewers (especially
/codex→ OpenAI) see whatever you send. Redact secrets,
credentials, PII, and proprietary context from the draft itself; scoping file reads isn't enough.
"Complete draft" means complete w.r.t. the sanitized draft — replace decision-relevant redactions with
abstract placeholders that preserve the constraint (e.g.[regulated customer data],[vendor contract limit]).
Output
Final draft · Accepted findings by round · Rejected findings + reasons · Decision records ·Unresolved risks · Verification plan · Implementation status.
Notes (read once — rationale, not steps)
- Why two independent reviewers: a single model can't see its own blind spots; a different model (A)
catches different classes of error, and an isolated fresh perspective (B) catches what shared context
would have suppressed. Their disagreement is the highest-signal output — it's exactly where a lone
reviewer would have adopted a wrong premise. - The orchestrator writes the draft and judges the critiques — the loop's weakest link. That's why the
rejection ledger + self-bias line are mandatory: they're your only audit trail against your own bias.
Appendix — codex (environment-specific; not part of the method)
- Use a fresh
codex exec— itsresumerejects the-Cworking-dir flag, so start fresh each round. - Treat codex as GPT unless the user says otherwise.
- stderr capture: a fixed path like
/tmp/codex-err-$$.txt($$= shell PID; ensure it runs in a shell;
BSDmktempfails on a template with a suffix after theXs).