dialectic

Harden a plan/design/algorithm/decision via a thesis→antithesis→synthesis loop: each round, two independent reviewers (a different model + a freshly-spawned subagent) critique the current draft; fold in only findings the draft doesn't already resolve, adjudicating reviewer disagreements by evidence; stop after two consecutive clean rounds (cap 10, usually 3–5). Independence is the engine — a different model and an isolated perspective expose blind spots one model can't see in itself. Use when the user says "dialectic" or "정반합". NOT for trivial or no-real-tension changes.

beingcognitive 0 Updated 1mo ago

Resources

GitHub

Install

npx skillscat add beingcognitive/unprimed-dialectic

Install via the SkillsCat registry.

SKILL.md

/dialectic — convergence loop

Harden a plan / design / algorithm / decision until independent review stops finding things. A single
model can't see its own blind spots; a different model and a fresh, uncontaminated perspective can.
That independence — and the disagreement it produces — is the engine.

It hardens reasoning; it does not prove correctness. For empirical / code / financial / security /
performance / UX claims, dialectic agreement is not evidence — carry through to real verification (run
it, test it, benchmark it, check the source). Don't use for trivial or no-real-tension changes.

Step 0 — Seed the thesis

The loop hardens a draft; it doesn't author one from nothing. If the input isn't already a structured
draft, write v0 yourself: goal · constraints · key assumptions · the approach · alternatives
considered · known risks · how it will be verified. If a missing hard requirement could change the
plan, pause for the user's answer before starting rounds; otherwise state your assumption and proceed.
Show v0 — that's round 1's thesis.

Right-size

Always two independent reviewers per round — that pair is the method's load-bearing minimum. What
you right-size is the expected round count and whether to add a third domain/human reviewer, not
the core two:

Small or already-mature artifact → often hits the two-clean-round minimum in ~2–3 rounds.
High-stakes → add a domain/human reviewer; may run to the cap.
So trivial a single gut-check would do → below this skill's bar; don't run the loop.

State the expected round count to the user up front.

Each round

Send the complete current draft (+ an optional neutral Settled constraints / non-goals block)
to each reviewer — in parallel, in one message, each blind to the other's output. Nothing else:
no prior rounds, ledgers, verdicts, or your rationale. ("Parallel" = the two dispatches; any
verify-a-fact step happens later, in synthesis, after both return.)
Reviewer A = model-independent: a different model/provider when available (e.g. /codex). If none
is available — or it's your own model family — use the most independent reviewer you have, label the
round "reduced independence/confidence," and don't call it model-independent.
Reviewer B = perspective-independent: spawn a new subagent (Agent tool); pass it only the
complete draft + the reviewer prompt; do not pass conversation history, prior rounds, ledgers, or
your rationale (same base model as you, but isolated context and a different review axis).
Give the two different review axes, not different tones — e.g. A = adversarial correctness;
B = simplicity / over-engineering, or a risk axis the artifact needs (security / data-loss / ops / cost).
Match each axis to the reviewer's comparative advantage — in particular, route checkable/factual
claims to the web-search-capable reviewer (codex), and judgment/taste/argument axes to the other.

Reviewer prompt (stable, neutral):

Critique this draft for correctness, missing constraints, hidden assumptions, edge cases,
over-engineering, under-specification, implementation risk, and verification gaps. Each finding must
cite a specific draft element (or a specific missing one) and its failure mechanism — no generic
checklist worries. Severity-grouped; for each: observation · why it matters · concrete fix. Don't
rewrite the draft. If it's solid, say so.
Synthesize into the per-round ledger (below); the revised draft becomes next round's thesis.

Per-round ledger (fill literally)

Round N
finding | source(A/B) | verdict(accept/reject) | one-line reason
...
Disagreements (only if any): claim A | claim B | deciding evidence/mechanism | resolution | confidence | needs-empirical?
Accepted this round: <n>   (the round is CLEAN iff 0)
Self-bias: <one way this synthesis could be wrong; if plausible, re-open the last reject>

Synthesis rules

Accept a finding only if the draft doesn't already adequately resolve it — adequate = the draft
states the decision + rationale + constraints + a verification/mitigation path at an actionable level.
Merely naming the topic ≠ resolved. Also accept: unresolved, materially under-specified, or contradicted.
Adjudicate disagreements; don't average. A divergence is exactly where a lone model would adopt a
wrong premise. If it turns on a fact you can verify now, pause and verify before revising; if not,
record it as an unresolved risk, mark confidence, and make the draft conditional on the assumption.
Discard non-actionable / out-of-scope / mechanism-less / already-resolved findings.
Re-run a reviewer at most once, and only for a format/process failure (e.g. no severity grouping,
empty, ignored the prompt) — never because findings are sparse, negative, or inconvenient (that's
selection bias); log any re-run. A re-run's findings count normally toward acceptance, and a re-run
never by itself resets the clean-round counter; a round whose only findings came from a re-run still
counts as CLEAN if synthesis accepts none.

Stop

A round is CLEAN when synthesis accepts no finding (the draft needs no change). Findings discarded
as already-resolved / cosmetic / out-of-scope / non-actionable do not count and do not reset the
counter. Stop after two consecutive clean rounds.
A clean round — including round 1 — still triggers one more round with freshly-spawned reviewers on
the unchanged draft (independence is what makes the confirmation meaningful). If the artifact had no real
tension, note it was likely too trivial for this method.
Hard cap = 10 rounds. At any stop (clean-rounds or cap), the exit checklist must pass to
declare convergence; if it doesn't (including hitting the cap), report the unresolved items to the user
and stop without declaring convergence.
Exit checklist: objectives met · assumptions explicit · risks handled · alternatives + why-rejected
recorded · unresolved disagreements documented · verification plan named.
High-stakes: the domain/human reviewer reviews at least the initial and the final pre-stop draft
(and any round their concern is materially revised); their unresolved objections block convergence.
Mode-shift = the real signal: early rounds change the draft (defects, design); near convergence,
feedback turns to traceability/tests/verification. When critique becomes verification, you're there.
On stop, deliver the Output (below). Proceed to implementation + verification only if the user
asked for it; otherwise stop at the hardened draft.

Guardrails (skip these and the result is fake — single source; don't restate them inline)

Don't prime the reviewers. Keep expected verdicts, "already-rejected" lists, "say READY if nothing's
new," and round numbers OUT of the reviewer prompt. Priming hands them the answer and suppresses real
findings — "convergence" becomes compliance. Neutral context (settled constraints, non-goals) is fine;
a prescribed conclusion is not. Tell your prediction to the user only.
Don't stop on a feeling. Stop only by the rule, the stated budget cap, or a user-approved scope —
never because the draft "feels done" (the classic confirmation-bias tell).
Judge novelty against the draft's content, not the reviewer's wording.
Independence is earned, not assumed. If both reviewers are effectively the same model, agreement
proves little; diversify model / axis / evidence. Verify reviewer B's isolation — some harnesses
spawn a subagent that inherits the whole conversation regardless of what you nominally pass it (observed
with Claude Code's Agent tool: a "fresh" reviewer cited an unrelated detail from earlier in the session).
Probe for it. If B can't be isolated, it is fresh reasoning + a different axis, not context-independent —
your real independence then rests on the different-model reviewer (A).
Privacy preflight. Reviewers (especially /codex → OpenAI) see whatever you send. Redact secrets,
credentials, PII, and proprietary context from the draft itself; scoping file reads isn't enough.
"Complete draft" means complete w.r.t. the sanitized draft — replace decision-relevant redactions with
abstract placeholders that preserve the constraint (e.g. [regulated customer data], [vendor contract limit]).

Output

Final draft · Accepted findings by round · Rejected findings + reasons · Decision records ·
Unresolved risks · Verification plan · Implementation status.

Notes (read once — rationale, not steps)

Why two independent reviewers: a single model can't see its own blind spots; a different model (A)
catches different classes of error, and an isolated fresh perspective (B) catches what shared context
would have suppressed. Their disagreement is the highest-signal output — it's exactly where a lone
reviewer would have adopted a wrong premise.
The orchestrator writes the draft and judges the critiques — the loop's weakest link. That's why the
rejection ledger + self-bias line are mandatory: they're your only audit trail against your own bias.

Appendix — codex (environment-specific; not part of the method)

Use a fresh codex exec — its resume rejects the -C working-dir flag, so start fresh each round.
Treat codex as GPT unless the user says otherwise.
stderr capture: a fixed path like /tmp/codex-err-$$.txt ($$ = shell PID; ensure it runs in a shell;
BSD mktemp fails on a template with a suffix after the Xs).

dialectic

Resources

Install

/dialectic — convergence loop

Step 0 — Seed the thesis

Right-size

Each round

Per-round ledger (fill literally)

Synthesis rules

Stop

Guardrails (skip these and the result is fake — single source; don't restate them inline)

Output

Notes (read once — rationale, not steps)

Appendix — codex (environment-specific; not part of the method)

Categories

Install

Recommended Skills