Cross-validate technical claims using multiple independent verification strategies (official docs, counter-examples, source code inspection). Use when a tech-feasibility report, design document, or any technical analysis contains claims that were verified by only one method or one AI session. Triggered by "cross-validate", "verify these claims", "double-check this", or "independent verification".
Install
npx skillscat add tomwangowa/agent-skills/research-cross-validator
Install via the SkillsCat registry.
Research Cross-Validator
Overview
Independently verify technical claims using multiple orthogonal
strategies to eliminate single-source bias and AI hallucination risk.
Each claim is tested through 2-3 independent verification paths, and
results are compared for consistency.
Core principle: A claim verified by only one method is an opinion.
A claim verified by three independent methods is evidence.
Announce at start:
"Cross-validating claims — I'll verify each one through multiple
independent strategies and flag any inconsistencies."
When to Use
- After tech-feasibility or critical-research produces a report: cross-validate key verdicts
- When an AI-generated analysis makes technical claims you want to trust
- When a vendor's documentation makes capability claims
- When two sources disagree and you need a tiebreaker
- After assumption-extractor flags CRITICAL assumptions as PARTIALLY VERIFIED
When NOT to use:
- The claim can be tested by running code (use micro-poc-validator instead; empirical beats theoretical)
- The claim is subjective or opinion-based (no objective verification possible)
- You only have one claim to check (use critical-research for single-claim falsification)
Required Input
CLAIMS: List of specific technical claims to verify
(extract from a document, or provide directly)
SOURCE: Where these claims came from
(e.g., "tech-feasibility report from 2026-01-14")
PRIORITY: Which claims matter most? (optional; will prioritize
CRITICAL-impact claims if not specified)

If the user provides a file path, extract the top 5-10 verifiable
technical claims automatically.
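The three inputs above could be held in a small record type. This is a minimal sketch only: the class names (`Claim`, `ValidationRequest`), the field names, and the impact labels assigned to the sample claims are hypothetical and not defined by the skill.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Claim:
    """One verifiable technical claim (hypothetical structure)."""
    claim_id: str
    statement: str
    impact: Optional[str] = None  # e.g. "CRITICAL" or "HIGH"; None if unknown


@dataclass
class ValidationRequest:
    """Holds the CLAIMS, SOURCE and optional PRIORITY inputs."""
    claims: list[Claim]
    source: str
    priority: list[str] = field(default_factory=list)  # claim IDs, optional

    def effective_priority(self) -> list[str]:
        """Use the explicit PRIORITY if given, else the CRITICAL-impact claims."""
        if self.priority:
            return self.priority
        return [c.claim_id for c in self.claims if c.impact == "CRITICAL"]


# Sample data; the impact labels here are illustrative assumptions.
request = ValidationRequest(
    claims=[
        Claim("C1", "nodriver supports connecting to remote browsers via WSS", "CRITICAL"),
        Claim("C2", "Playwright connect_over_cdp() accepts wss:// URLs", "HIGH"),
    ],
    source="tech-feasibility report from 2026-01-14",
)
print(request.effective_priority())
```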
Verification Strategies
Each claim is tested through at least 2 of these independent
strategies:
Strategy A: Official Documentation
Query the authoritative source for the technology in question.
Tools: mcp__context7__query-docs, WebFetch on official docs site
Evidence strength: HIGH (if docs are current and specific)
Process:
- Identify the official documentation source
- Search for the specific capability claimed
- Record: confirmed / denied / not mentioned
- Note the doc version and date
Strategy B: Counter-Evidence Search
Actively search for evidence that the claim is FALSE.
Tools: WebSearch with falsification queries
Evidence strength: HIGH (if counter-evidence is from credible source)
Process:
- Construct falsification queries:
  "[tech] cannot [claimed capability]"
  "[tech] limitation [relevant area]"
  "[tech] [capability] not working"
  "[tech] [capability] broken" / "[tech] [capability] issue"
- Search GitHub Issues for the project
- Record: counter-evidence found / no counter-evidence / ambiguous
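The query templates above can be filled in mechanically for each claim. The helper below is a sketch; the function name `falsification_queries` and its parameters are hypothetical, and the resulting strings would still be passed to whatever search tool is available.

```python
def falsification_queries(tech: str, capability: str, area: str) -> list[str]:
    """Fill the falsification query templates for one claim."""
    return [
        f"{tech} cannot {capability}",
        f"{tech} limitation {area}",
        f"{tech} {capability} not working",
        f"{tech} {capability} broken",
        f"{tech} {capability} issue",
    ]


queries = falsification_queries(
    tech="nodriver",
    capability="connect to remote browser via WSS",
    area="remote connections",
)
for query in queries:
    print(query)
```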
Strategy C: Source Code Inspection
For open-source tools, check the actual implementation.
Tools: Bash (pip show, npm info), Read, Grep, WebFetch on GitHub
Evidence strength: HIGHEST (code is truth)
Process:
- Locate the relevant source file/function
- Check if the claimed capability is implemented
- Check function signatures, parameters, return types
- Record: implemented / not implemented / partially implemented
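For Python libraries, the signature check can be partly automated with the standard `inspect` module. The sketch below is illustrative: `parameter_status` is a hypothetical helper, and two standard-library functions stand in for the library under review.

```python
import inspect
import json
import shutil


def parameter_status(func, name: str) -> str:
    """Report whether a callable's signature declares the named parameter."""
    params = inspect.signature(func).parameters
    if name in params:
        return "declared"
    # A **kwargs catch-all means the signature alone cannot settle the question.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return "unclear (**kwargs)"
    return "not declared"


print(parameter_status(shutil.copy, "follow_symlinks"))
print(parameter_status(shutil.copy, "overwrite"))
print(parameter_status(json.dumps, "pretty"))
```

A "declared" result shows only that the parameter exists, not that the capability works as claimed, so reading the function body is still required.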
Strategy D: Community Corroboration
Check if real users have confirmed or denied the capability.
Tools: WebSearch on Stack Overflow, GitHub Discussions, Reddit
Evidence strength: MEDIUM (anecdotal but valuable in aggregate)
Process:
- Search for users reporting success/failure with the claimed capability
- Check recency (last 6 months preferred)
- Record: corroborated / contradicted / no data
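The recency check can be made explicit by comparing each report's date against a fixed window. This is a sketch under assumptions: "6 months" is approximated as 183 days, and the names `RECENCY_WINDOW` and `is_recent` are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "last 6 months" approximated as 183 days.
RECENCY_WINDOW = timedelta(days=183)


def is_recent(published: date, today: date) -> bool:
    """True if the report falls inside the recency window (and is not future-dated)."""
    age = today - published
    return timedelta(0) <= age <= RECENCY_WINDOW


today = date(2026, 1, 14)
print(is_recent(date(2025, 11, 1), today))
print(is_recent(date(2024, 11, 1), today))
```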
Strategy E: Version-Specific Check
Verify the claim holds for the specific version in use.
Tools: Bash (check installed version), WebSearch for changelogs
Evidence strength: HIGH (version-specific evidence is precise)
Process:
- Identify the version in use (or planned)
- Check changelog for relevant additions/removals/breaking changes
- Record: confirmed for version X / not available until version Y /
removed in version Z
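Once the changelog gives the version that introduced (or removed) a capability, the three possible records follow from a simple comparison. The sketch below assumes plain dotted numeric versions such as "1.20" (no pre-release suffixes). The helper names are hypothetical, and the in-use versions in the example are made up for illustration.

```python
from typing import Optional


def parse_version(version: str, width: int = 3) -> tuple[int, ...]:
    """Turn "1.20" into (1, 20, 0) so versions compare component by component."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))  # pad short versions with zeros
    return tuple(parts)


def version_verdict(in_use: str, introduced: str, removed: Optional[str] = None) -> str:
    """Map a version in use onto one of the three records listed above."""
    current = parse_version(in_use)
    if current < parse_version(introduced):
        return f"not available until version {introduced}"
    if removed is not None and current >= parse_version(removed):
        return f"removed in version {removed}"
    return f"confirmed for version {in_use}"


# "1.20" comes from the Playwright example in this document;
# the in-use versions are hypothetical.
print(version_verdict("1.40.0", introduced="1.20"))
print(version_verdict("1.19.2", introduced="1.20"))
```

For an installed Python package, the in-use version can be read with `importlib.metadata.version("package-name")` instead of being typed by hand.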
Workflow
Step 1: Extract and Prioritize Claims
From the source document, extract concrete technical claims. Prioritize:
- Claims marked as CRITICAL or HIGH impact (from assumption-extractor)
- Claims that are foundational: other claims depend on them
- Claims that came from a single source only
Limit: Max 10 claims per session. If more exist, ask the user to
prioritize.
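The prioritization and the 10-claim cap could be sketched as a sort followed by a slice. The dictionary keys (`impact`, `foundational`, `source_count`) are hypothetical, and treating the three criteria as a strict ordering (impact first, then foundational, then single-source) is an assumption; the text lists them without ranking them against each other.

```python
MAX_CLAIMS = 10
IMPACT_RANK = {"CRITICAL": 0, "HIGH": 1}  # anything else sorts after these


def prioritize(claims: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (selected, deferred): the top claims and the overflow."""
    def sort_key(claim: dict) -> tuple[int, int, int]:
        return (
            IMPACT_RANK.get(claim.get("impact"), 2),
            0 if claim.get("foundational") else 1,
            0 if claim.get("source_count", 1) == 1 else 1,
        )

    ranked = sorted(claims, key=sort_key)  # stable: ties keep document order
    return ranked[:MAX_CLAIMS], ranked[MAX_CLAIMS:]


claims = [{"id": f"C{i}", "impact": "LOW"} for i in range(1, 12)]
claims.append({"id": "C12", "impact": "CRITICAL", "foundational": True})
selected, deferred = prioritize(claims)
print(selected[0]["id"], len(selected), len(deferred))
```

A non-empty `deferred` list is the signal to ask the user which claims to keep.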
Step 2: Select Strategies Per Claim
For each claim, select the 2-3 most appropriate strategies:
| Claim type | Primary strategy | Secondary strategy | Tertiary |
|---|---|---|---|
| Library capability | C (source code) | A (official docs) | B (counter-evidence) |
| API availability | A (official docs) | B (counter-evidence) | D (community) |
| Performance claim | B (counter-evidence) | D (community) | E (version check) |
| Compatibility | C (source code) | D (community) | A (docs) |
| Vendor claim | B (counter-evidence) | D (community) | A (docs) |
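The table above translates directly into a lookup. In this sketch the names `STRATEGY_TABLE` and `select_strategies` are hypothetical, and the `unavailable` parameter is an added convenience for cases such as closed-source tools, where Strategy C cannot be run.

```python
# Strategy letters per claim type, in primary/secondary/tertiary order.
STRATEGY_TABLE = {
    "library capability": ["C", "A", "B"],
    "api availability": ["A", "B", "D"],
    "performance claim": ["B", "D", "E"],
    "compatibility": ["C", "D", "A"],
    "vendor claim": ["B", "D", "A"],
}


def select_strategies(claim_type: str, unavailable: frozenset = frozenset(),
                      minimum: int = 2) -> list[str]:
    """Pick the strategies for a claim type, skipping any that cannot be run."""
    order = STRATEGY_TABLE[claim_type.lower()]
    chosen = [s for s in order if s not in unavailable]
    if len(chosen) < minimum:
        raise ValueError(f"fewer than {minimum} strategies left for {claim_type!r}")
    return chosen


print(select_strategies("Library capability"))
print(select_strategies("Library capability", unavailable=frozenset({"C"})))
```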
Step 3: Execute Verification (Per Claim)
Run each selected strategy independently. Do NOT let the result of one
strategy influence how you search in another — this is the key to
avoiding confirmation bias.
Parallel execution: If verifying multiple claims, use Task agents
to run verifications in parallel where possible.
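The independence requirement can be illustrated outside the agent environment with a thread pool: each checker receives only the claim text and never sees another checker's result. This is an analogy to Task agents, not the skill's mechanism. The three checker functions are hypothetical placeholders that return fixed values.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


# Placeholder checkers (hypothetical): real ones would query docs, search, etc.
def check_docs(claim: str) -> str:
    return "TRUE"


def check_counter_evidence(claim: str) -> str:
    return "TRUE"


def check_source(claim: str) -> str:
    return "FALSE"


def verify(claim: str, strategies: dict[str, Callable[[str], str]]) -> dict[str, str]:
    """Run every strategy on the same claim; results are only combined at the end."""
    with ThreadPoolExecutor(max_workers=len(strategies)) as pool:
        futures = {name: pool.submit(fn, claim) for name, fn in strategies.items()}
        return {name: future.result() for name, future in futures.items()}


results = verify(
    "library X supports feature Y",
    {"A": check_docs, "B": check_counter_evidence, "C": check_source},
)
print(results)
```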
Step 4: Compare Results
For each claim, compare strategy results:
| Outcome | Meaning | Confidence |
|---|---|---|
| All agree: TRUE | Strong confirmation | HIGH |
| All agree: FALSE | Strong refutation | HIGH |
| Majority agree, minority disagrees | Likely true/false with caveats | MEDIUM |
| Evenly split | Genuinely uncertain | LOW |
| Only one strategy found evidence | Weak — needs more data | LOW |
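The comparison rules in this table can be expressed as a small function. The sketch makes two assumptions that the table does not state: results other than TRUE or FALSE (for example "NOT DOCUMENTED", "NO DATA", or "PARTIAL") count as no evidence, and a majority outcome is labelled CONFIRMED or REFUTED at MEDIUM confidence. The function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(results: dict[str, str]) -> tuple[str, str]:
    """Return (consensus, confidence) for one claim's strategy results."""
    # Assumption: only TRUE/FALSE count as evidence; anything else is ignored.
    votes = Counter(r for r in results.values() if r in ("TRUE", "FALSE"))
    true_votes, false_votes = votes["TRUE"], votes["FALSE"]

    if true_votes + false_votes <= 1:
        return ("UNCERTAIN", "LOW")   # at most one strategy found evidence
    if false_votes == 0:
        return ("CONFIRMED", "HIGH")  # all agree: TRUE
    if true_votes == 0:
        return ("REFUTED", "HIGH")    # all agree: FALSE
    if true_votes == false_votes:
        return ("DISPUTED", "LOW")    # evenly split
    # Majority agrees, minority disagrees.
    return ("CONFIRMED" if true_votes > false_votes else "REFUTED", "MEDIUM")


print(consensus({"A": "TRUE", "B": "TRUE", "C": "TRUE"}))
print(consensus({"A": "TRUE", "B": "FALSE"}))
print(consensus({"A": "TRUE", "B": "FALSE", "D": "FALSE"}))
```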
Step 5: Generate Cross-Validation Report
# Cross-Validation Report
**Date**: YYYY-MM-DD
**Source**: [document or origin of claims]
**Claims evaluated**: [N]
## Summary
| Claim | Strategy A | Strategy B | Strategy C | Consensus | Confidence |
|-------|-----------|-----------|-----------|-----------|------------|
| [C1] | TRUE | TRUE | TRUE | CONFIRMED | HIGH |
| [C2] | TRUE | FALSE | — | DISPUTED | LOW |
| [C3] | FALSE | FALSE | FALSE | REFUTED | HIGH |
## Detailed Findings
### Claim 1: [statement]
**Consensus**: CONFIRMED / DISPUTED / REFUTED / UNCERTAIN
**Confidence**: HIGH / MEDIUM / LOW
| Strategy | Result | Evidence | Source |
|----------|--------|----------|--------|
| A: Official docs | TRUE | [what was found] | [URL] |
| B: Counter-evidence | TRUE (no counter-evidence found) | [search queries used] | — |
| C: Source code | TRUE | [function/file that confirms] | [path/URL] |
**Resolution**: [If disputed, explain the discrepancy and which
evidence is more authoritative]
### Claim 2: [statement]
(repeat)
## Implications
### Confirmed Claims (safe to rely on)
- [C1]: [brief restatement]
### Refuted Claims (must update plan)
- [C3]: [brief restatement] — **Impact**: [what this breaks]
### Disputed Claims (need further investigation)
- [C2]: [brief restatement] — **Recommended**: [micro-PoC / expert
consultation / defer]
## Upstream Impact
<!-- How these findings affect the source document -->
| Source Document Section | Affected Claims | Required Update |
|------------------------|----------------|-----------------|
| [Section 2.1] | C2, C3 | Rewrite to account for refuted claims |
Examples
Example 1: Library Capability Claims
Claims from tech-feasibility report:
C1: "nodriver supports connecting to remote browsers via WSS"
Strategy A (docs): FALSE — no WSS documentation found
Strategy C (source): FALSE — start() accepts host+port only
Strategy D (community): FALSE — GitHub issue #47 confirms not supported
Consensus: REFUTED (HIGH confidence)
C2: "Playwright connect_over_cdp() accepts wss:// URLs"
Strategy A (docs): TRUE — official docs show wss:// examples
Strategy C (source): TRUE — connect_over_cdp signature accepts string URL
Strategy E (version): TRUE — available since Playwright 1.20
Consensus: CONFIRMED (HIGH confidence)
Example 2: API Availability Claims
C1: "ScraperAPI provides a structured Amazon Reviews endpoint"
Strategy A (docs): PARTIAL — endpoint listed but marked "beta"
Strategy B (counter): FALSE — multiple reports of 404 since Nov 2024
Strategy D (community): FALSE — users report endpoint returns empty data
Consensus: REFUTED (HIGH confidence)
Resolution: Docs are outdated; real-world testing confirms endpoint is
non-functional
Constraints
- Independence of strategies: do NOT let one strategy's result
  influence another's search. Run them as if each is the first check.
- Minimum 2 strategies per claim: single-strategy verification is
  not cross-validation.
- Max 10 claims per session: cross-validation is thorough by design;
  limit scope to maintain quality.
- Recency matters: prefer evidence from the last 6 months. Flag
  older evidence as potentially stale.
- Source attribution: every finding must link to a specific URL,
  file path, or search query.
- No consensus forcing: if strategies disagree, report DISPUTED
  with LOW confidence. Do not pick a winner without justification.
Error Handling
| Scenario | Action |
|---|---|
| Official docs don't mention the capability at all | Record as "NOT DOCUMENTED" — distinct from FALSE. Absence of evidence ≠ evidence of absence. Recommend micro-PoC. |
| Source code is unavailable (closed-source tool) | Skip Strategy C, increase weight on A + B + D |
| All search queries return no results | Broaden terms; if still nothing, record as "NO DATA" and flag as high uncertainty |
| Conflicting versions of the same doc exist | Use the version matching the user's installed/planned version |
| Strategy execution reveals the claim is poorly defined | Ask the user to clarify before continuing verification |
Security Considerations
- Read-only by default: this skill primarily reads docs and searches.
  The only Bash usage is for checking installed package versions
  (pip show, npm list), not for executing arbitrary code.
- No credential exposure: never include API keys in search queries.
- Source validation: verify URLs point to legitimate domains before
  citing. Flag suspicious sources.
- Search query sanitization: sanitize user-provided claim text
  before constructing search queries.
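The skill does not define what sanitization or source validation involve, so the following is only one possible reading. It redacts long token-like strings (a rough heuristic for API keys), strips unusual punctuation, and accepts a URL only if it uses https and its host matches an allowlist. The function names, the 32-character threshold, and the https requirement are all assumptions.

```python
import re
from urllib.parse import urlparse

# Heuristic assumption: 32+ consecutive key-like characters may be a credential.
TOKEN_LIKE = re.compile(r"[A-Za-z0-9_\-]{32,}")


def sanitize_query(text: str, max_len: int = 200) -> str:
    """Redact token-like strings, drop unusual punctuation, collapse whitespace."""
    text = TOKEN_LIKE.sub("[REDACTED]", text)
    text = re.sub(r"[^\w\s.\-\[\]()/:']", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_len]


def is_trusted_source(url: str, allowed_domains: set[str]) -> bool:
    """Accept only https URLs whose host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


print(sanitize_query('nodriver supports WSS; key=abcdefghijklmnopqrstuvwxyz0123456789ABCD"'))
print(is_trusted_source("https://docs.python.org/3/", {"python.org"}))
print(is_trusted_source("https://python.org.evil.example/x", {"python.org"}))
```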
Related Skills
- tech-feasibility (upstream): produces reports containing claims to
  cross-validate
- critical-research (complementary): focused falsification of a
  single hypothesis; cross-validator handles multiple claims in parallel
- assumption-extractor (upstream): extracts assumptions that become
  claims for cross-validation
- micro-poc-validator (complementary): cross-validator uses desk
  research; micro-poc uses empirical testing. Use micro-poc for claims
  that can be tested with code.
- research-synthesis (downstream): combines cross-validation results
  with other research into decisions
- tech-research-pipeline (orchestrator): invokes this skill after
  critical-research and before research-synthesis