Evaluate skill quality against best practices. Use when asked to "rate this skill", "review skill quality", "check skill formatting", "is this skill good", "evaluate SKILL.md", "grade this skill", or when validating skill files before publishing.
Install
npx skillscat add antjanus/skillbox/rate-skill
Install via the SkillsCat registry.
Rate Skill
Overview
Audit SKILL.md files against the quality standards from generate-skill best practices. Provides a letter grade (A-F) and actionable recommendations.
Core principle: Measure skill quality objectively to improve activation reliability and context efficiency.
When to Use
Always use when:
- Reviewing skills before publishing
- Validating skill structure and formatting
- Checking if skill meets quality standards
- User asks to "rate", "grade", or "review" a skill
Useful for:
- Skill authors validating their work
- Maintainers reviewing PRs with new skills
- Quality audits of skill repositories
- Before submitting skills to marketplaces
Avoid when:
- Evaluating non-skill documentation
- Reviewing code (not skill definitions)
- General code quality auditing
How It Works
- Read specified SKILL.md file
- Evaluate against quality criteria
- Calculate scores per category
- Generate letter grade (A-F)
- Output findings with priorities
- Provide actionable recommendations
Quality Criteria
| Category | Weight | Criteria |
|---|---|---|
| Length | 20% | Under 500 lines (or progressive disclosure) |
| Conciseness | 20% | Clear, scannable, no fluff |
| Repetitiveness | 15% | No redundant content |
| Structure | 15% | Required sections present and ordered |
| Triggers | 15% | At least 3 specific activation phrases (5+ for an A) |
| Examples | 10% | Good/Bad code comparisons |
| Troubleshooting | 5% | Common issues addressed |
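The weights above can be read as point budgets that sum to 100, which matches the `XX/20`-style scores in the report template. The following is a minimal sketch under that assumption; `WEIGHTS` and `overall_score` are hypothetical names, and how points are awarded within a category is left to the per-category rubrics.

```python
# Category weights copied from the Quality Criteria table; they sum to 100.
WEIGHTS = {
    "Length": 20,
    "Conciseness": 20,
    "Repetitiveness": 15,
    "Structure": 15,
    "Triggers": 15,
    "Examples": 10,
    "Troubleshooting": 5,
}


def overall_score(category_points: dict[str, float]) -> float:
    """Sum per-category points, clamping each to the range 0..weight."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        points = category_points.get(name, 0.0)  # missing category counts as 0
        total += max(0.0, min(points, weight))
    return total
```

For instance, points of 20, 18, 14, 15, 15, 9, and 5 across the seven categories add up to 96.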
Length (20%)
Scores: A: <500 or progressive disclosure | B: 500-600 | C: 601-800 | D: 801-1000 | F: >1000
Checks: Line count, reference/ directory, progressive disclosure links
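The length rubric could be sketched as a small function. This assumes that each band's upper bound is inclusive and that progressive disclosure earns an A regardless of line count, as the rubric reads literally; `length_grade` is a hypothetical helper, not part of the skill.

```python
def length_grade(line_count: int, progressive_disclosure: bool = False) -> str:
    """Map a SKILL.md line count to a Length grade (assumed band edges)."""
    if progressive_disclosure or line_count < 500:
        return "A"
    if line_count <= 600:
        return "B"
    if line_count <= 800:
        return "C"
    if line_count <= 1000:
        return "D"
    return "F"
```

Detecting progressive disclosure itself (a reference/ directory plus links into it) is left out of this sketch.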
Conciseness (20%)
Scores: A: High info density, scannable | B: Mostly concise | C: Some wordiness | D: Verbose | F: Excessive
Red flags: Long paragraphs (>5 sentences), redundant explanations, flowery language
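Of the red flags, only the long-paragraph check is mechanical. A rough sketch, assuming paragraphs are separated by blank lines and sentences end in `.`, `!`, or `?` followed by whitespace (`long_paragraphs` is a hypothetical name):

```python
import re


def long_paragraphs(text: str, max_sentences: int = 5) -> list[str]:
    """Return paragraphs containing more than max_sentences sentences."""
    flagged = []
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        # Naive sentence split: break after ., ! or ? when whitespace follows.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        if len(sentences) > max_sentences:
            flagged.append(paragraph)
    return flagged
```

Abbreviations and code snippets will confuse the sentence split, so treat the result as a hint for manual review rather than a score.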
Repetitiveness (15%)
Scores: A: Zero redundancy | B: 1-2 overlaps | C: 3-4 overlaps | D: 5+ overlaps | F: Heavy redundancy
Common: Format in section AND example, repeated "use when", duplicate trigger phrases
Structure (15%)
Scores: A: All required sections | B: Missing 1 optional | C: Missing 2-3 | D: Missing required | F: Severely lacking
Required: Frontmatter, Overview, When to Use, Main content, Examples (Good/Bad), Troubleshooting, Integration
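A presence check for the required sections might look like the sketch below. It assumes frontmatter opens the file with a `---` line and that sections are Markdown `#` headings named exactly as listed; "Main content" has no fixed heading, so it is not checked. `missing_sections` is a hypothetical helper.

```python
import re

# Headings that can be checked by name; "Main content" is skill-specific.
REQUIRED_HEADINGS = ["Overview", "When to Use", "Examples", "Troubleshooting", "Integration"]


def missing_sections(skill_text: str) -> list[str]:
    """List required sections that do not appear in the SKILL.md text."""
    missing = []
    if not skill_text.lstrip().startswith("---"):
        missing.append("Frontmatter")
    headings = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}[ \t]+(.+?)[ \t]*$", skill_text, flags=re.MULTILINE)
    }
    for name in REQUIRED_HEADINGS:
        if name.lower() not in headings:
            missing.append(name)
    return missing
```

This only checks presence; section ordering and the Good/Bad pairing inside Examples would need separate checks.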
Triggers (15%)
Scores: A: 5+ specific | B: 3-4 good | C: 2 phrases | D: 1 vague | F: None
Quality: User language ("when asked to X"), specific situations, multiple contexts, concrete not abstract
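Counting trigger phrases can be approximated by extracting double-quoted phrases from the description, as in the sketch below. The count-to-grade mapping follows the rubric, but specificity ("1 vague" versus one good phrase) still needs human judgment. `trigger_phrases` and `trigger_grade` are hypothetical names.

```python
import re


def trigger_phrases(description: str) -> list[str]:
    """Extract double-quoted phrases from a skill description."""
    return re.findall(r'"([^"]+)"', description)


def trigger_grade(description: str) -> str:
    """Grade by phrase count only; quality of the phrases is not assessed."""
    count = len(trigger_phrases(description))
    if count >= 5:
        return "A"
    if count >= 3:
        return "B"
    if count == 2:
        return "C"
    if count == 1:
        return "D"
    return "F"
```

Applied to this skill's own description, the sketch finds six quoted phrases, which lands in the A band.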
Examples (10%)
Scores: A: 3+ with Good/Bad | B: 2 with comparisons | C: 1 comparison | D: No comparisons | F: None
Quality: Uses tags, includes explanations, real scenarios, syntax highlighting
Troubleshooting (5%)
Scores: A: 5+ pairs | B: 3-4 pairs | C: 1-2 basic | D: Vague | F: None
Quality: Clear problem, cause identified, solution with code, explanation
Output Format
# Skill Rating: [Letter Grade]
## Summary
- **File:** path/to/SKILL.md
- **Lines:** XXX lines
- **Overall Grade:** [A/B/C/D/F] ([Score]/100)
- **Status:** [Production Ready / Needs Work / Not Ready]
## Category Scores
| Category | Score | Grade | Status |
|----------|-------|-------|--------|
| Length | XX/20 | [A-F] | [✅/⚠️/❌] |
| Conciseness | XX/20 | [A-F] | [✅/⚠️/❌] |
| Repetitiveness | XX/15 | [A-F] | [✅/⚠️/❌] |
| Structure | XX/15 | [A-F] | [✅/⚠️/❌] |
| Triggers | XX/15 | [A-F] | [✅/⚠️/❌] |
| Examples | XX/10 | [A-F] | [✅/⚠️/❌] |
| Troubleshooting | XX/5 | [A-F] | [✅/⚠️/❌] |
## Findings by Priority
### ❌ Critical Issues (Fix Before Publishing)
1. [Issue description]
- Impact: [Why this matters]
- Fix: [Specific action to take]
### ⚠️ Important Issues (Should Fix)
1. [Issue description]
- Impact: [Why this matters]
- Fix: [Specific action to take]
### 📋 Nice to Have
1. [Suggestion]
- Benefit: [Why this helps]
## Strengths
- [What this skill does well]
- [Another strength]
## Priority Action Items
1. [Priority 1 action]
2. [Priority 2 action]
3. [Priority 3 action]
## Estimated Improvements
- Fix critical issues: +[X] points
- Address important issues: +[X] points
- Potential grade: [Current] → [Target]
Usage
Basic rating:
/rate-skill skills/example-skill/SKILL.md
Rate after changes:
# Make improvements
[edit SKILL.md]
# Re-rate
/rate-skill skills/example-skill/SKILL.md
Compare before/after:
# Rate original
/rate-skill skills/track-session/SKILL.md
# Make improvements
[condense, remove redundancy]
# Rate again to see improvement
/rate-skill skills/track-session/SKILL.md
Grading Scale
| Grade | Score | Meaning |
|---|---|---|
| A | 90-100 | Excellent - Production ready |
| B | 80-89 | Good - Minor improvements recommended |
| C | 70-79 | Acceptable - Needs work before publishing |
| D | 60-69 | Poor - Significant issues to address |
| F | 0-59 | Failing - Major overhaul needed |
Status mapping:
- A-B: Production Ready ✅
- C: Needs Work ⚠️
- D-F: Not Ready ❌
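The grading scale and status mapping translate directly into two lookups. A minimal sketch with hypothetical function names; it assumes fractional scores round down into the lower band (89.5 is a B).

```python
def letter_grade(score: float) -> str:
    """Convert a 0-100 score to a letter grade per the Grading Scale table."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


# Status per grade, from the status mapping list.
STATUS = {
    "A": "Production Ready",
    "B": "Production Ready",
    "C": "Needs Work",
    "D": "Not Ready",
    "F": "Not Ready",
}


def status_for(score: float) -> str:
    """Return the publishing status for a numeric score."""
    return STATUS[letter_grade(score)]
```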
Examples
Example 1: Rating a High-Quality Skill
Input: /rate-skill skills/track-session/SKILL.md
Output:
# Skill Rating: A
## Summary
- **File:** skills/track-session/SKILL.md
- **Lines:** 489 lines
- **Overall Grade:** A (96/100)
- **Status:** Production Ready
## Category Scores
| Category | Score | Grade | Status |
|----------|-------|-------|--------|
| Length | 20/20 | A | Pass |
| Conciseness | 18/20 | A | Pass |
| Repetitiveness | 14/15 | A | Pass |
| Structure | 15/15 | A | Pass |
| Triggers | 15/15 | A | Pass |
| Examples | 9/10 | A | Pass |
| Troubleshooting | 5/5 | A | Pass |
## Strengths
- Excellent progressive disclosure with reference/VERIFICATION.md
- 10+ diverse trigger phrases
- Strong examples with Good/Bad comparisons
## Priority Action Items
1. Consider adding one more example for edge cases (optional)
Note: High-scoring skills get a short report focused on strengths and optional improvements.
Example 2: Rating a Skill That Needs Work
Input: /rate-skill skills/problematic-skill/SKILL.md
Output:
# Skill Rating: C
## Summary
- **File:** skills/problematic-skill/SKILL.md
- **Lines:** 742 lines
- **Overall Grade:** C (74/100)
- **Status:** Needs Work
## Findings by Priority
### Critical Issues
1. **Length: 742 lines without progressive disclosure**
- Impact: High context usage, harder to scan
- Fix: Move detailed content to reference/ directory
2. **Only 2 trigger phrases in description**
- Impact: Poor activation reliability
- Fix: Add 3-5 specific user phrases and situations
### Important Issues
1. **Verbose mode descriptions (30+ lines each)**
- Fix: Condense to 2-3 lines per mode
## Priority Action Items
1. Implement progressive disclosure (move 200+ lines to reference/)
2. Add 3+ trigger phrases to description
3. Condense verbose sections
## Estimated Improvements
- Fix critical issues: +12 points -> 86 (B)
- Potential grade: C -> A
Note: Lower-scoring skills get detailed findings with specific fixes and an improvement roadmap.
Troubleshooting
Problem: Can't find SKILL.md file
Cause: Path incorrect or file doesn't exist.
Solution:
# Verify file exists
ls skills/skill-name/SKILL.md
# Use correct path
/rate-skill skills/skill-name/SKILL.md
Problem: Rating seems too harsh
Cause: Standards are strict for good reason - quality matters for activation.
Solution:
- Review specific findings
- Compare to high-quality skills
- Focus on critical issues first
- Remember: B grade is still "good"
Problem: Grade improved but still low
Cause: Multiple categories need attention.
Solution:
- Focus on highest-weight categories first (Length, Conciseness)
- Fix critical issues before nice-to-haves
- Re-rate after each major change
- Use "Estimated Improvements" as roadmap
Problem: Don't know how to fix an issue
Cause: Fix recommendation unclear.
Solution:
- Check generate-skill examples for patterns
- Review high-rated skills for reference
- Ask for specific help on that issue
- Consult CLAUDE.md for SkillBox guidelines
Integration
This skill works with:
- generate-skill - Use after generating to validate quality
- Skill development workflow - Rate before committing/publishing
- Quality control - Gate for accepting skills into repositories
- Continuous improvement - Track quality metrics over time
Workflow:
# Create skill
/generate-skill new-feature
# Rate it
/rate-skill skills/new-feature/SKILL.md
# Fix issues
[make improvements]
# Re-rate
/rate-skill skills/new-feature/SKILL.md
# When A or B grade, publish
git add skills/new-feature/
git commit -m "Add new-feature skill"
Quality gates:
- A-B: Merge to main ✅
- C: Request changes ⚠️
- D-F: Reject until improved ❌
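To automate the quality gates, a script could read the grade from a saved report and turn it into an exit code. The sketch below assumes the report contains the `**Overall Grade:**` summary line in the shape shown under Output Format; `gate_decision` and the choice of exit code 1 for anything other than merge are illustrative assumptions, not part of the skill.

```python
import re

# Gate action per grade, from the quality gates list.
GATE_ACTIONS = {
    "A": "merge",
    "B": "merge",
    "C": "request changes",
    "D": "reject",
    "F": "reject",
}

# Matches a summary line such as "- **Overall Grade:** C (74/100)".
GRADE_LINE = re.compile(r"\*\*Overall Grade:\*\*\s*([ABCDF])\s*\((\d+)/100\)")


def gate_decision(report: str) -> tuple[str, int]:
    """Return (action, exit_code) for a rating report; 0 means safe to merge."""
    match = GRADE_LINE.search(report)
    if match is None:
        raise ValueError("no Overall Grade line found in report")
    action = GATE_ACTIONS[match.group(1)]
    exit_code = 0 if action == "merge" else 1
    return action, exit_code
```

In a CI job, the exit code could be passed to `sys.exit` so that C, D, and F reports block the merge.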
References
Based on:
- generate-skill best practices
- SkillBox CLAUDE.md guidelines
- obra/superpowers patterns
- Vercel agent-skills standards