skill-auditor

Perform structured, reproducible audits of agent skills — testing mechanical correctness, agent usability, output quality, context efficiency, EARS compliance, prompt complexity, multi-agent coordination, audit convergence (reproducibility across runs), finding divergence (specificity and tailoring), AGENTS.md adherence (rule absorption verification), documentation/runtime staleness drift detection, CLI discoverability helper coverage, idempotency, error recovery, and credential safety. Covers 25 audit domains (D1–D25) with confidence-scored findings at every level. Use when (1) auditing or health-checking a skill end-to-end, (2) verifying AGENTS.md adherence, audit convergence, or finding divergence, or (3) validating scripts, CLIs, context/token footprint, EARS requirement syntax, prompt complexity, name consistency, dispatch prompt quality, or structured confidence-scored audit reports.

DevGuyRash 0 Updated 4mo ago

Install

npx skillscat add devguyrash/agent-skills/skill-auditor

Install via the SkillsCat registry.

SKILL.md

Skill Auditor

A structured, phased audit framework for evaluating agent skills across 25
domains (D1–D25). The agent performs the audit and writes the report; bundled
scripts only gather deterministic evidence that helps the audit.

Guidance delivery

You SHALL run <skills-file-root>/scripts/audit-skill <command> for guidance.
This CLI provides just-in-time routing. IF the CLI is unavailable, read the
corresponding reference file instead.

| Command | What it provides | Fallback reference |
| --- | --- | --- |
| `audit-skill phases` | List all phases | `references/audit-phases-core.md` |
| `audit-skill phase <N>` | Detailed phase instructions | `references/audit-phases-core.md` or `audit-phases-analysis.md` |
| `audit-skill domains` | List all 25 domains | `references/domains-core.md`, `domains-analysis.md`, `domains-new.md` |
| `audit-skill domain <ID>` | Full domain spec | Corresponding `references/domains-*.md` |
| `audit-skill activate <traits>` | Domains for skill traits | `references/domains-core.md` § Activation Rules |
| `audit-skill hints <ID>` | Actionable hints | Inline in domain specs |
| `audit-skill next-steps` | Ordered audit steps | This file § Getting started |
| `audit-skill step <N>` | Detailed guidance for one audit step | This file § Getting started |
| `audit-skill report-workflow` | Report-writing workflow and finding schema | This file § Report finalization |
| `audit-skill confidence` | Scoring rules | `references/confidence-scoring.md` |
| `audit-skill report-template` | Report structure | `references/report-template.md` |
| `audit-skill check <name> <dir>` | Gather deterministic evidence with one helper script | Direct script invocation |
| `audit-skill check-all <dir>` | Gather deterministic evidence with all helper scripts | Sequential script invocation |
| `audit-skill self-check` | Validate the auditor's helper-script layer | `check-all` on own directory |
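
The CLI-first, reference-second rule above can be sketched in Python as follows. This is a minimal illustration, not part of the skill: the function name `guidance_source` and the `("run" | "read", target)` return shape are assumptions, and only three commands with a single fallback file are included.

```python
import os
from pathlib import Path

# Fallback references copied from the command table; only commands whose
# fallback is a single reference file are listed in this sketch.
FALLBACKS = {
    "phases": "references/audit-phases-core.md",
    "confidence": "references/confidence-scoring.md",
    "report-template": "references/report-template.md",
}


def guidance_source(skill_root: Path, command: str) -> tuple[str, str]:
    """Pick the router CLI when it is runnable, else the fallback reference.

    Hypothetical helper: returns ("run", command line) or ("read", file path).
    """
    cli = skill_root / "scripts" / "audit-skill"
    if cli.is_file() and os.access(cli, os.X_OK):
        return ("run", f"{cli} {command}")
    return ("read", str(skill_root / FALLBACKS[command]))
```

Calling `guidance_source(Path("<skills-file-root>"), "confidence")` on an installation without an executable router would point at `references/confidence-scoring.md` instead of the CLI.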

Workflow overview

Every audit follows five primary phases plus a deterministic duplication gate.
The phases are agent-led: scripts can accelerate evidence gathering, but the
agent decides what matters and writes the final report.

| Phase | Name | Budget | Focus |
| --- | --- | --- | --- |
| 1 | Environment & Build | ~15% | Dead scripts block everything else |
| 2 | API Surface | ~25% | Name mismatches are the #1 agent failure |
| 3 | Workflow Simulation | ~30% | The agent's actual experience |
| 4 | Context & Token | ~15% | Quantifies what phases 1-3 found |
| 4b | Duplication Gate | ~5% | Deterministic duplicate/contradiction gate |
| 5 | Output Quality | ~15% | Depth, consistency, EARS, complexity |

WHEN the skill orchestrates subagents, THEN add Phase 3b: Multi-Agent
Audit between phases 3 and 4. Read
<skills-file-root>/references/multi-agent-audit.md before starting Phase 3.
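
The phase budgets are approximate and, as listed, add up to 105 rather than 100. One way to apply them is to treat the percentages as relative weights. The sketch below does that; the normalization step and the notion of "effort units" (minutes, tool calls, or similar) are assumptions, not something the skill prescribes.

```python
# Approximate budget percentages from the phase table, keyed by phase label.
PHASE_BUDGETS = {"1": 15, "2": 25, "3": 30, "4": 15, "4b": 5, "5": 15}


def allocate(total_units: float) -> dict[str, float]:
    """Split a total effort budget across phases in proportion to the table.

    Assumption: the percentages are relative weights, so they are divided
    by their sum (105) instead of by 100.
    """
    weight_sum = sum(PHASE_BUDGETS.values())
    return {
        phase: total_units * weight / weight_sum
        for phase, weight in PHASE_BUDGETS.items()
    }
```

With a total of 210 units, Phase 3 receives 60 and the Duplication Gate receives 10.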

Getting started

1. Identify the target skill directory and read its SKILL.md end-to-end.
2. Create a running report file immediately; add findings as you discover them.
3. Map the directory tree: scripts, references, binaries, configs.
4. Determine activated domains: run <skills-file-root>/scripts/audit-skill activate <traits...>.
5. List the workflow: run <skills-file-root>/scripts/audit-skill next-steps.
6. Drill into the current step: run <skills-file-root>/scripts/audit-skill step <N>.
7. Read phase instructions: run <skills-file-root>/scripts/audit-skill phase 1.
8. Create a report directory under ~/.local/reports/skill-auditor/<skill-name>/<YYYY-MM-DD>/.
9. Run helper scripts when they add evidence quickly; when a script is missing or weak, perform the equivalent manual inspection and record that gap.
10. Proceed through each phase in order. Run audit-skill phase <N> for guidance.
11. Finalize the actual audit output using <skills-file-root>/scripts/audit-skill report-workflow and <skills-file-root>/scripts/audit-skill report-template.

Trait → domain activation lives in the router manifest and domain references.
Run <skills-file-root>/scripts/audit-skill activate <traits...> for the
current skill instead of relying on a duplicated quick-reference table.

Confidence scoring

Every finding needs confidence: HIGH [H] for deterministic script output,
MEDIUM [M] for traced agent inspection, and LOW [L] for weaker pattern
matching. Run <skills-file-root>/scripts/audit-skill confidence for details.
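
As a rough illustration of the three levels, the sketch below tags findings by evidence source and tallies a distribution for the report. The evidence-kind keys and the helper names are hypothetical; the authoritative rules are whatever `audit-skill confidence` prints.

```python
from collections import Counter
from enum import Enum


class Confidence(Enum):
    HIGH = "H"
    MEDIUM = "M"
    LOW = "L"


# Hypothetical labels for the three evidence sources named in this section.
EVIDENCE_TO_CONFIDENCE = {
    "script_output": Confidence.HIGH,        # deterministic script output
    "traced_inspection": Confidence.MEDIUM,  # traced agent inspection
    "pattern_match": Confidence.LOW,         # weaker pattern matching
}


def tag(finding: str, evidence: str) -> str:
    """Prefix a finding with its bracketed confidence marker, e.g. "[H] ..."."""
    return f"[{EVIDENCE_TO_CONFIDENCE[evidence].value}] {finding}"


def distribution(evidence_kinds: list[str]) -> dict[str, int]:
    """Count findings per confidence level, including levels with zero hits."""
    counts = Counter(EVIDENCE_TO_CONFIDENCE[kind].name for kind in evidence_kinds)
    return {level.name: counts.get(level.name, 0) for level in Confidence}
```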

Severity framework

Use BLOCKER for no-workaround workflow failure, MAJOR for wrong output or
serious waste, MINOR for workaround-required friction, and NIT for polish.
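
To keep severities consistent when ordering a report, the four levels can be modeled as an ordered enum. This is only a sketch: the `Finding` record and `by_severity` helper are illustrative names, and the skill's actual finding schema comes from `audit-skill report-workflow`.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Lower value means more severe, so a plain ascending sort puts blockers first.
    BLOCKER = 0  # no-workaround workflow failure
    MAJOR = 1    # wrong output or serious waste
    MINOR = 2    # workaround-required friction
    NIT = 3      # polish


@dataclass(frozen=True)
class Finding:
    title: str
    severity: Severity


def by_severity(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from most to least severe (stable sort)."""
    return sorted(findings, key=lambda finding: finding.severity)
```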

Error recovery

WHEN a phase fails mid-audit, you SHALL:

- Record all findings discovered so far; partial findings are usable.
- WHEN a deterministic script fails, re-run it individually with
  <skills-file-root>/scripts/audit-skill check <name> <dir> for details.
- You MAY restart the failed phase from its beginning. You SHALL NOT restart
  completed phases unless new information invalidates earlier findings.
- WHEN check-all reports failures, proceed with remaining phases and note
  the failure in the report.
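
The re-run-and-continue behavior described above might look like the following. It is a sketch under assumptions: `rerun_check` and `note_failures` are hypothetical helpers, the check names are placeholders, and only the documented `check <name> <dir>` argument order is taken from this file.

```python
import subprocess
from typing import Callable


def rerun_check(cli: str, name: str, target_dir: str) -> str:
    """Re-run one helper via the documented `check <name> <dir>` form."""
    result = subprocess.run(
        [cli, "check", name, target_dir],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


def note_failures(failed: list[str], rerun: Callable[[str], str]) -> list[str]:
    """Collect one report note per failed check instead of aborting the audit."""
    notes = []
    for name in failed:
        details = rerun(name).strip() or "no output"
        notes.append(f"check '{name}' failed: {details}")
    return notes
```

A caller could pass `lambda name: rerun_check(cli_path, name, skill_dir)` as `rerun`, append the returned notes to the running report, and then move on to the next phase.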

Report finalization

WHEN all phases are complete:

1. Read the report workflow: <skills-file-root>/scripts/audit-skill report-workflow.
2. Read the report template: <skills-file-root>/scripts/audit-skill report-template.
3. Write the report to ~/.local/reports/skill-auditor/<skill-name>/<YYYY-MM-DD>/<HH-MM-SS>_audit.md.
4. Fill in each phase section from accumulated findings as an agent judgment, not as a raw script dump.
5. Use helper-script output as evidence and confidence support, but do not treat check-all or self-check as the audit itself.
6. Write the executive summary LAST.
7. Assign the final verdict: SHIP | SHIP WITH FIXES | DO NOT SHIP.
8. Include aggregate confidence score and distribution.
9. Verify D18 convergence, D19 divergence, D20 adherence, D21 staleness, and D22 discoverability per the domain specs.
10. Use the severity framework consistently: BLOCKER | MAJOR | MINOR | NIT.
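
The report location follows a fixed pattern, which can be built as shown below. This is a minimal sketch: `report_path` and its `home` override are illustrative, and local time is assumed because the skill does not state a timezone.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def report_path(skill_name: str, now: datetime, home: Optional[Path] = None) -> Path:
    """Build ~/.local/reports/skill-auditor/<skill-name>/<YYYY-MM-DD>/<HH-MM-SS>_audit.md."""
    base = (home or Path.home()) / ".local" / "reports" / "skill-auditor"
    day = now.strftime("%Y-%m-%d")
    stamp = now.strftime("%H-%M-%S")
    return base / skill_name / day / f"{stamp}_audit.md"
```

Before writing, the caller would typically create the directory with `path.parent.mkdir(parents=True, exist_ok=True)`.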

What `self-check` means

audit-skill self-check validates only the deterministic helper layer. A real
self-audit still requires the agent to walk phases, synthesize findings, and
write the report.

Reference index

Load only the reference needed for the current phase:

- references/audit-phases-core.md for phases 1–3
- references/audit-phases-analysis.md for phases 4–5
- references/domains-core.md, references/domains-analysis.md, and references/domains-new.md for domain specs
- references/multi-agent-audit.md only when the skill dispatches subagents
- references/confidence-scoring.md and references/report-template.md during report synthesis
- references/cli-implementation-guide.md when auditing another skill's router CLI
