"Verify technical claims in code, docs, and comments via evidence-backed verdicts before merge."
Install
Install via the SkillsCat registry:
npx skillscat add axiomantic/spellbook/fact-checking
Invariant Principles
- Claims are hypotheses - Every claim requires empirical evidence before verdict
- Evidence before verdict - No verdict without traceable, citable proof
- User controls scope - User selects scope and approves all fixes
- Deduplicate findings - Check AgentDB before verifying; store after
- Learn from trajectories - Store verification trajectories in ReasoningBank
| Pattern | Action |
|---|---|
| RESEARCH_REQUEST ("research", "check", "verify") | Dispatch research subagent |
| UNKNOWN ("don't know", "not sure") | Dispatch analysis subagent |
| CLARIFICATION (ends with ?) | Answer, then re-ask |
| SKIP ("skip", "move on") | Proceed to next item |
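The pattern table above can be read as a small classifier over the user's reply. The sketch below is one possible reading, not the skill's actual implementation: the `classify_response` name, the `DIRECT` fallback, and the precedence order (question mark first, then skip, unknown, research) are assumptions, since the table does not say how overlapping matches are resolved.

```python
import re
from enum import Enum


class Pattern(Enum):
    RESEARCH_REQUEST = "dispatch research subagent"
    UNKNOWN = "dispatch analysis subagent"
    CLARIFICATION = "answer, then re-ask"
    SKIP = "proceed to next item"
    DIRECT = "treat as a direct answer"  # assumption: fallback, not in the table


# Keywords are copied from the table; the ordering is an assumed precedence.
_KEYWORDS = [
    (Pattern.SKIP, ("skip", "move on")),
    (Pattern.UNKNOWN, ("don't know", "not sure")),
    (Pattern.RESEARCH_REQUEST, ("research", "check", "verify")),
]


def classify_response(text: str) -> Pattern:
    """Map a user reply to one of the response patterns."""
    normalized = text.strip().lower()
    # The table defines CLARIFICATION purely by the trailing question mark.
    if normalized.endswith("?"):
        return Pattern.CLARIFICATION
    for pattern, keywords in _KEYWORDS:
        # Whole-word match so "checked" does not trigger "check".
        if any(re.search(rf"\b{re.escape(k)}\b", normalized) for k in keywords):
            return pattern
    return Pattern.DIRECT
```

Checking the question mark first means "Can you verify this?" is answered and re-asked instead of dispatched; swap the order if research requests should win.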
Inputs/Outputs
| Input | Required | Description |
|---|---|---|
| scope | Yes | branch changes, uncommitted, or full repo |
| modes | No | Missing Facts, Extraneous Info, Clarity (default: all) |
| autonomous | No | Skip prompts, use defaults |
| Output | Type | Description |
|---|---|---|
| verification_report | Inline | Summary, findings, bibliography |
| implementation_plan | Inline | Fixes for refuted/stale claims |
| glossary | Inline | Key facts (Clarity Mode) |
| state_checkpoint | File | .fact-checking/state.json |
Shared Data Structures
Verdict Table
| Verdict | Meaning | Evidence Required |
|---|---|---|
| Verified | Claim is accurate | test output, code trace, docs, benchmark |
| Refuted | Claim is false | failing test, contradicting code |
| Incomplete | True but missing context | base verified + missing elements |
| Inconclusive | Cannot determine | document attempts, why insufficient |
| Ambiguous | Wording unclear | multiple interpretations explained |
| Misleading | Technically true, implies falsehood | what reader assumes vs reality |
| Jargon-heavy | Too technical for audience | unexplained terms, accessible version |
| Stale | Was true, no longer applies | when true, what changed, current state |
| Extraneous | Unnecessary/redundant | value analysis shows no added info |
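To illustrate the "evidence before verdict" rule, the verdict table could be modeled so that a finding cannot be constructed without evidence. This is a minimal sketch under assumptions: the `Finding` and `summarize` names and the field layout are hypothetical, and only the nine verdict labels come from the table.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Tuple


class Verdict(Enum):
    VERIFIED = "Verified"
    REFUTED = "Refuted"
    INCOMPLETE = "Incomplete"
    INCONCLUSIVE = "Inconclusive"
    AMBIGUOUS = "Ambiguous"
    MISLEADING = "Misleading"
    JARGON_HEAVY = "Jargon-heavy"
    STALE = "Stale"
    EXTRANEOUS = "Extraneous"


@dataclass(frozen=True)
class Finding:
    claim: str
    verdict: Verdict
    evidence: Tuple[str, ...]  # citable items, e.g. "file:lines - finding"

    def __post_init__(self) -> None:
        # Reject any verdict that carries no non-empty evidence item.
        if not any(item.strip() for item in self.evidence):
            raise ValueError(f"{self.verdict.value} verdict requires evidence")


def summarize(findings: Iterable[Finding]) -> str:
    """Build a one-line tally of verdicts, in table order, skipping zeros."""
    counts = Counter(f.verdict for f in findings)
    return " | ".join(f"{v.value}: {counts[v]}" for v in Verdict if counts[v])
```

Enforcing the rule in the constructor means an "it looks correct" verdict fails at creation time instead of at report time.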
Bibliography Formats
| Type | Format |
|---|---|
| Code trace | file:lines - finding |
| Test | command - result |
| Web source | Title - URL - "excerpt" |
| Git history | commit/issue - finding |
| Documentation | Docs: source section - URL |
| Benchmark | Benchmark: method - results |
| Paper/RFC | Citation - section - URL |
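The formats above map naturally onto string templates. The sketch below assumes hypothetical type keys (`code_trace`, `web`, and so on) and placeholder field names; only the separator layout is taken from the table, and the `[n]` prefix follows the report excerpt later in this document.

```python
# Templates mirror the Bibliography Formats table; keys and field names
# are illustrative assumptions, not part of the skill.
FORMATS = {
    "code_trace": "{location} - {finding}",
    "test": "{command} - {result}",
    "web": '{title} - {url} - "{excerpt}"',
    "git": "{ref} - {finding}",
    "docs": "Docs: {source} {section} - {url}",
    "benchmark": "Benchmark: {method} - {results}",
    "paper": "{citation} - {section} - {url}",
}


def format_entry(index: int, kind: str, **fields: str) -> str:
    """Render one numbered bibliography entry.

    Raises KeyError for an unknown kind or a missing field.
    """
    return f"[{index}] " + FORMATS[kind].format(**fields)
```

For example, `format_entry(1, "code_trace", location="src/auth/password.ts:34-60", finding="bcryptjs hash() call")` yields `[1] src/auth/password.ts:34-60 - bcryptjs hash() call`.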
Workflow
Phase 0: Configuration
Present three optional modes (default: all enabled):
- Missing Facts Detection - gaps where claims lack critical context
- Extraneous Info Detection - redundant/LLM-style over-commenting
- Clarity Mode - generate glossaries for AI config files
If autonomous mode is detected ("Mode: AUTONOMOUS"), enable all modes automatically.
Phase 1: Scope Selection
Ask scope BEFORE extraction. No exceptions.
| Option | Method |
|---|---|
| A. Branch changes | git diff $(git merge-base HEAD main)...HEAD --name-only + unstaged |
| B. Uncommitted | git diff --name-only + git diff --cached --name-only |
| C. Full repo | All code/doc patterns |
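The scope options can be resolved to a file list roughly as follows. This is a sketch under assumptions: `files_in_scope` and `run_git` are hypothetical helpers, the base branch defaults to `main` as in option A, and option C uses `git ls-files` as a stand-in because the "code/doc patterns" are not specified here.

```python
import subprocess
from typing import Callable, List


def run_git(args: List[str]) -> List[str]:
    """Run a git subcommand and return its stdout lines."""
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def files_in_scope(
    option: str,
    run: Callable[[List[str]], List[str]] = run_git,
    base_branch: str = "main",
) -> List[str]:
    """Return the sorted, de-duplicated file list for scope A, B, or C."""
    option = option.strip().upper()
    if option == "A":
        # Committed branch changes since the merge base, plus unstaged edits.
        merge_base = run(["merge-base", "HEAD", base_branch])[0]
        names = run(["diff", f"{merge_base}...HEAD", "--name-only"])
        names += run(["diff", "--name-only"])
    elif option == "B":
        # Unstaged plus staged changes.
        names = run(["diff", "--name-only"])
        names += run(["diff", "--cached", "--name-only"])
    elif option == "C":
        # Assumption: all tracked files; filter by pattern as needed.
        names = run(["ls-files"])
    else:
        raise ValueError(f"unknown scope option: {option!r}")
    return sorted({name for name in names if name})
```

Passing `run` as a parameter keeps the function testable without a repository; in normal use the default shells out to git in the current working directory.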
Phases 2-3: Claim Extraction and Triage
Subagent dispatch: Invoke fact-check-extract command.
Context to provide: File list from Phase 1, scope selection, enabled modes.
Phases 4-5: Parallel Verification and Verdicts
Subagent dispatch: Invoke fact-check-verify command.
Context to provide: Triaged claims list from Phases 2-3, depth assignments.
Phases 6-7: Report and Learning
Subagent dispatch: Invoke fact-check-report command.
Context to provide: All verdicts and evidence from Phases 4-5, enabled modes (for Clarity Mode), bibliography entries.
Phase 8: Fixes
NEVER apply fixes without explicit per-fix user approval.
- Present implementation plan for non-verified claims
- Show proposed change, ask approval
- Apply approved fixes
- Offer re-verification
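The per-fix approval rule amounts to a loop that never calls the apply step unless the user said yes to that specific fix. A minimal sketch, assuming a hypothetical `Fix` record and caller-supplied `ask` and `apply` callbacks:

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Fix:
    location: str   # e.g. "src/cache/store.ts:23"
    proposal: str   # the proposed change, shown to the user


def apply_with_approval(
    fixes: Iterable[Fix],
    ask: Callable[[Fix], bool],
    apply: Callable[[Fix], None],
) -> List[Fix]:
    """Apply only the fixes the user approves, one prompt per fix."""
    applied: List[Fix] = []
    for fix in fixes:
        # No batching: each fix gets its own explicit decision.
        if ask(fix):
            apply(fix)
            applied.append(fix)
    return applied
```

The returned list is what a follow-up re-verification pass would cover.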
Interruption Handling
Checkpoint to .fact-checking/state.json after each claim:
{
"scope": "branch",
"claims": [...],
"completed": [0, 1, 2],
"pending": [3, 4, 5],
"findings": {...},
"bibliography": [...]
}
Offer resume on next invocation.
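Checkpointing after each claim could look like the sketch below. The function names are hypothetical and the write-then-rename step is an added assumption (it avoids a half-written state file if the run is interrupted mid-write); the path and the `completed`/`pending` keys come from the state shown above.

```python
import json
import os
from pathlib import Path
from typing import Optional

STATE_PATH = Path(".fact-checking/state.json")


def save_checkpoint(state: dict, path: Path = STATE_PATH) -> None:
    """Write state to a temp file, then atomically replace the checkpoint."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)


def load_checkpoint(path: Path = STATE_PATH) -> Optional[dict]:
    """Return the saved state, or None when there is nothing to resume."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def mark_completed(state: dict, index: int, path: Path = STATE_PATH) -> None:
    """Move one claim index from pending to completed and checkpoint."""
    state["pending"].remove(index)
    state["completed"].append(index)
    save_checkpoint(state, path)
```

On the next invocation, a non-`None` result from `load_checkpoint()` is the cue to offer a resume starting from the first index in `pending`.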
Verdicts Without Evidence
- No "it looks correct" or "code seems fine" without a trace
- Every verdict requires concrete, citable evidence
Skipping Claims
- No claim is "trivial" - verify individually
- No batching similar claims without individual verification
Applying Fixes Without Approval
- No auto-correcting comments
- Each fix requires explicit user approval
Ignoring AgentDB
- ALWAYS check before verifying
- ALWAYS store findings after verification
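The check-before, store-after rule is a lookup-then-write pattern. AgentDB's real interface is not described here, so the sketch below uses a hypothetical in-memory `FindingStore` as a stand-in; the whitespace and case normalization of the claim text is also an assumption.

```python
import hashlib
from typing import Any, Callable, Dict, Optional


class FindingStore:
    """Hypothetical in-memory stand-in for AgentDB (not its real API)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    @staticmethod
    def _key(claim: str) -> str:
        # Collapse whitespace and case so trivially different copies dedupe.
        normalized = " ".join(claim.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, claim: str) -> Optional[Any]:
        return self._data.get(self._key(claim))

    def put(self, claim: str, finding: Any) -> None:
        self._data[self._key(claim)] = finding


def verify_with_dedup(
    claim: str, store: FindingStore, verify: Callable[[str], Any]
) -> Any:
    """Check the store first; verify and store only on a miss."""
    cached = store.get(claim)
    if cached is not None:
        return cached
    finding = verify(claim)
    store.put(claim, finding)
    return finding
```

A real store would likely also key on file and revision, since a cached finding can go stale when the code it describes changes.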
**User**: "Factcheck my current branch"
Phase 1: Scope selection -> User selects "A. Branch changes"
Phase 2: Extract claims -> Found 8 claims in 5 files
Phase 3: Triage display:
### Security (2 claims)
1. [MEDIUM] src/auth/password.ts:34 - "passwords hashed with bcrypt"
2. [DEEP] src/auth/session.ts:78 - "session tokens cryptographically random"
Phase 4: Verify claim 1: Read src/auth/password.ts:34-60, found import { hash } from 'bcryptjs' and await hash(password, 12). Cost factor 12 meets OWASP.
Verdict: VERIFIED | Evidence: bcryptjs.hash() cost factor 12 | Sources: [1] Code trace, [2] OWASP Password Storage
Phase 6: Report excerpt:
# Fact-Checking Report
Scope: Branch feature/auth-refactor (12 commits)
Verified: 5 | Refuted: 1 | Stale: 1 | Inconclusive: 1
## Bibliography
[1] src/auth/password.ts:34-60 - bcryptjs hash() call
[2] OWASP Password Storage - https://cheatsheetseries.owasp.org/...
## Implementation Plan
1. [ ] src/cache/store.ts:23 - TTL is 60s not 300s, update comment
Before finalizing:
- [ ] Configuration wizard completed (or autonomous mode)
- [ ] Scope explicitly selected by user
- [ ] ALL claims presented for triage before verification
- [ ] Each verdict has CONCRETE evidence
- [ ] AgentDB checked before, updated after
- [ ] Bibliography cites all sources
- [ ] Trajectories stored in ReasoningBank
- [ ] Fixes await explicit per-fix approval
If ANY unchecked: STOP and fix.