Parse, extract, and format transcripts (VTT, SRT, plain text) into structured Markdown packets with action items, decisions, questions, and topics. v2.0 uses hybrid Python+LLM architecture for VTT files. Integrates with ps-critic for quality review.
Resources
7Install
npx skillscat add geekatron/jerry/transcript Install via the SkillsCat registry.
MANDATORY: CLI Invocation for Parsing (Phase 1)
CRITICAL: For VTT files, you MUST invoke the Python parser via the
jerryCLI.
DO NOT use Task agents for parsing. The CLI provides 1,250x cost reduction and deterministic output.
Phase 1: Parse Transcript (REQUIRED CLI INVOCATION)
ARGUMENT PARSING RULES:
- The FIRST positional argument from user input is the
<file-path>(the VTT/SRT file) - The
--output-dirflag specifies the output directory (default:./transcript-output) - IMPORTANT: If user provides
--output, treat it as--output-dir(alias)
For VTT files, Claude MUST execute this bash command:
uv run jerry transcript parse "<FILE_PATH>" --output-dir "<OUTPUT_DIR>"Where:
<FILE_PATH>= The ACTUAL file path from the user's invocation (first positional arg)<OUTPUT_DIR>= The output directory from--output-diror--outputflag (default:./transcript-output)
Example - user invokes:
/transcript /Users/me/meeting.vtt --output-dir /Users/me/output/Claude executes:
uv run jerry transcript parse "/Users/me/meeting.vtt" --output-dir "/Users/me/output/"CRITICAL: Always quote file paths to handle spaces and special characters.
Expected output:
index.json- Chunk metadata and speaker summarychunks/chunk-*.json- Transcript segments in processable chunkscanonical-transcript.json- Full parsed output (for reference only, DO NOT read into context)
Tool Example: Invoking the Python Parser
Claude's execution using Bash tool:
# Basic invocation
uv run jerry transcript parse "/Users/me/meeting.vtt" --output-dir "/Users/me/output/"What this does:
- Uses
uv runto execute in managed Python environment - Invokes
jerry transcript parsesubcommand - Quotes paths to handle spaces/special characters
- Specifies output directory (creates if doesn't exist)
Common variations:
# With domain context
uv run jerry transcript parse "meeting.vtt" \
--output-dir "./output/" \
--domain software-engineering
# Skip mindmaps for faster processing
uv run jerry transcript parse "meeting.vtt" \
--output-dir "./output/" \
--no-mindmap
# Specify mindmap format
uv run jerry transcript parse "meeting.vtt" \
--output-dir "./output/" \
--mindmap-format mermaidError handling example:
# Check exit code
uv run jerry transcript parse "meeting.vtt" --output-dir "./output/"
if [ $? -ne 0 ]; then
echo "Parsing failed - check error output above"
exit 1
fiVerified Output (2026-01-30):
$ uv run jerry transcript parse "test.vtt" --output-dir "./out/"
✅ Detected format: VTT
✅ Parsed 3071 segments
✅ Created ./out/index.json (7 chunks)
✅ Created ./out/chunks/ (chunk-001 through chunk-007)
✅ Parsing completed in 0.8sPhase 2+: LLM Agent Orchestration
After Phase 1 CLI parsing completes, continue with LLM agents:
- ts-extractor - Read
index.json+chunks/*.json, produceextraction-report.json - ts-formatter - Read
index.json+extraction-report.json, produce packet files - ts-mindmap-* - Generate mindmaps (if
--no-mindmapnot set) - ps-critic - Quality review >= 0.90
Transcript Skill
Version: 2.4.1
Framework: Jerry Transcript Processing
Constitutional Compliance: Jerry Constitution v1.0 (P-001, P-002, P-003, P-004, P-010, P-020, P-022)
Architecture: Hybrid Python+LLM (Strategy Pattern) + Mindmap Generation + Token-Based Chunking
Document Audience (Triple-Lens)
This SKILL.md serves multiple audiences:
| Level | Audience | Sections to Focus On | Why This Matters |
|---|---|---|---|
| L0 (ELI5) | New users, stakeholders | Purpose, When to Use, Routing Disambiguation, Quick Reference | Learn what the skill does and how to invoke it |
| L1 (Engineer) | Developers using the skill | Invoking the Skill, Agent Pipeline, File Persistence | Understand the technical workflow and outputs |
| L2 (Architect) | Workflow designers | Architecture, State Management, Self-Critique | Design integrations and ensure quality |
Reading Path Recommendations:
- First Time User: Start with "Purpose" → "When to Use" → "Invoking the Skill" → "Quick Reference"
- Integration Developer: Start with "Agent Pipeline" → "State Passing" → "File Persistence"
- Quality Assurance: Start with "Self-Critique Protocol" → "Constitutional Compliance" → "Quality Thresholds"
Purpose
The Transcript Skill transforms raw meeting transcripts into structured, navigable knowledge packets. It addresses the #1 user pain point: manual extraction of action items, decisions, and key information from meetings.
Key Capabilities
- Multi-Format Parsing - VTT, SRT, and plain text transcript formats
- Semantic Extraction - Action items, decisions, questions, topics with confidence scores
- Speaker Identification - 4-pattern detection chain for reliable attribution
- Structured Output - Claude-optimized Markdown packets under 35K tokens
- Bidirectional Linking - Every entity linked to its source in the transcript
- Quality Review - Integrated ps-critic evaluation (>= 0.90 threshold)
When to Use This Skill
Activate when:
- Processing a meeting transcript from Zoom, Teams, or other platforms
- Extracting action items and decisions from recorded meetings
- Converting VTT/SRT subtitle files to structured notes
- Analyzing plain text meeting notes
- Generating navigable meeting documentation
Example Invocations:
"Process this meeting transcript: /path/to/meeting.vtt"
"Extract action items from the quarterly review"
"/transcript analyze-meeting.srt"
"Parse the team standup notes and find all decisions"
"/transcript meeting.vtt --domain software-engineering"Domain Selection
The transcript skill supports 9 domain contexts that customize entity extraction for specific professional contexts. See DOMAIN-SELECTION-GUIDE.md for the complete selection flowchart.
Available Domains
| Domain | Context File | Use For | Key Entities |
|---|---|---|---|
general |
general.yaml | Any transcript (default) | speakers, topics, questions |
transcript |
transcript.yaml | Extends general | + segments, timestamps |
meeting |
meeting.yaml | Generic meetings | + action_items, decisions, follow_ups |
software-engineering |
software-engineering.yaml | Standups, sprint planning | + commitments, blockers, risks |
software-architecture |
software-architecture.yaml | ADR discussions, design | + architectural_decisions, alternatives |
product-management |
product-management.yaml | Roadmap, prioritization | + feature_requests, user_needs |
user-experience |
user-experience.yaml | Research, usability tests | + user_insights, pain_points, verbatim quotes |
cloud-engineering |
cloud-engineering.yaml | Post-mortems, capacity | + incidents, root_causes (blameless) |
security-engineering |
security-engineering.yaml | Audits, threat modeling | + vulnerabilities, threats (STRIDE), compliance_gaps |
Specifying a Domain
/transcript <file> --domain <domain-name>Examples:
/transcript standup.vtt --domain software-engineering
/transcript postmortem.vtt --domain cloud-engineering
/transcript user-interview.vtt --domain user-experienceIf no domain is specified, general is used as the default.
Agent Pipeline
TRANSCRIPT SKILL PIPELINE (v2.1 HYBRID ARCHITECTURE + MINDMAPS)
===============================================================
USER INPUT (VTT/SRT/TXT)
│
│ /transcript file.vtt [--mindmap-format both]
│ /transcript file.vtt --no-mindmap (to disable)
▼
┌───────────────────────────────────────────────────────┐
│ ts-parser v2.0 (ORCHESTRATOR) │
│ Model: haiku (orchestration only) │
└───────────────────────┬───────────────────────────────┘
│
┌────────────────┴────────────────┐
│ FORMAT DETECTION │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ VTT Format │ │ SRT/TXT │
│ (Python) │ │ (LLM) │
└──────┬──────┘ └──────┬──────┘
│ │
▼ │
┌─────────────┐ │
│ Python VTT │ 1,250x cost reduction │
│ Parser │ Sub-second parsing │
└──────┬──────┘ 100% accuracy │
│ │
▼ │
┌─────────────┐ │
│ VALIDATOR │ │
└──────┬──────┘ │
│ │
▼ │
┌─────────────┐ │
│ Chunker │ │
│ (500 segs) │ │
└──────┬──────┘ │
│ │
└────────────────┬────────────────┘
▼
┌───────────────────────────────────────────────────────┐
│ ts-extractor (sonnet) │
│ Processes chunks OR monolithic │
└───────────────────────┬───────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────┐
│ ts-formatter (sonnet) │
│ Generates 8-file packet per ADR-002 │
└───────────────────────┬───────────────────────────────┘
│
│ ts_formatter_output.packet_path
▼
┌───────────────────────────┐
│ --no-mindmap flag set? │
└─────────────┬─────────────┘
│
┌──────────────────┴──────────────────┐
│ NO (default) │ YES
▼ ▼
┌─────────────────────┐ (skip mindmaps)
│ ts-mindmap-* │ │
│ (sonnet) │ │
│ │ │
│ ┌─────────────────┐ │ │
│ │ts-mindmap-mermaid│ │ Output: │
│ └─────────────────┘ │ 08-mindmap/ │
│ ┌─────────────────┐ │ mindmap.mmd │
│ │ ts-mindmap-ascii│ │ mindmap.ascii.txt │
│ └─────────────────┘ │ │
└──────────┬──────────┘ │
│ │
│ ts_mindmap_output │
└──────────────┬──────────────────┘
│
▼
┌───────────────────────────────────────────────────────┐
│ ps-critic (sonnet) │
│ Quality validation >= 0.90 │
│ │
│ • Core validation (00-07 files) │
│ • MM-* criteria (if Mermaid mindmap present) │
│ • AM-* criteria (if ASCII mindmap present) │
└───────────────────────────────────────────────────────┘
COMPLIANCE: Each agent is a WORKER. None spawn subagents.
RATIONALE: DISC-009 - Agent-only architecture caused 99.8% data loss on large files.
MINDMAPS: ADR-006 - Mindmaps ON by default, opt-out via --no-mindmap flag.Design Rationale: Hybrid Python+LLM Architecture
Added in EN-030: This section explains the "why" behind the v2.0 architecture shift.
The Problem (v1.0 Architecture)
Original Approach: Pure LLM parsing for all formats (VTT, SRT, TXT)
What went wrong:
- Cost: ~$1.25 per 5K-utterance transcript
- Speed: 15-30 seconds for parsing phase
- Accuracy: 95% (5% hallucination risk on timestamps)
- Determinism: Non-deterministic output required validation passes
Critical incident (DISC-009):
- Large transcript (50K utterances) caused 99.8% data loss
- LLM summarization kicked in, losing detail
- User had to manually reconstruct action items from original VTT
The Solution (v2.0 Hybrid Strategy Pattern)
Decision: Split deterministic parsing from semantic extraction
Rationale:
- VTT format is machine-readable - regex patterns can parse 100% accurately
- Python is 1,250x cheaper - stdlib only, zero API cost
- Deterministic > probabilistic for structural parsing - timestamps must be exact
- LLM excels at semantic work - speaker identification, entity extraction
Trade-offs accepted:
| Aspect | v1.0 (Pure LLM) | v2.0 (Hybrid) | Decision |
|---|---|---|---|
| Cost | $1.25 | $0.001 (Python) + $0.10 (extraction) = $0.101 | ✅ 92% savings |
| Speed | 15-30s | <1s (parsing) + 10-15s (extraction) = ~15s | ✅ Similar, burst faster |
| Accuracy | 95% | 100% (parsing) + 85% (extraction) | ✅ Better where it matters |
| Complexity | Simple (1 agent) | Higher (orchestration logic) | ❌ Engineering overhead |
One-Way Door Decision:
- Committing to Python parser means we MUST maintain two parsing paths (VTT, SRT/TXT)
- Future format support (e.g., JSON subtitle format) requires dual implementation
- Accepted because: Cost savings justify maintenance burden
Why not pure Python for everything?
- Python cannot identify speakers without explicit labels (requires semantic understanding)
- Entity extraction (actions, decisions) requires natural language understanding
- Hybrid approach uses each tool where it's strongest
Validation of approach:
- v2.0 deployed 2026-01-25
- Processed 47 transcripts in first week
- Zero parsing failures, 100% timestamp accuracy
- User feedback: "Faster and more reliable than v1.0"
Alternative Approaches Considered
Alternative 1: Pure Python (Rejected)
- Pro: Zero API cost, fastest possible
- Con: Cannot extract semantic entities (actions, decisions, speakers)
- Why rejected: Semantic extraction is core value proposition
Alternative 2: WebAssembly Parser (Deferred)
- Pro: Could run parser client-side, even lower latency
- Con: Adds deployment complexity, limited ecosystem
- Why deferred: Premature optimization, Python sufficient for now
Alternative 3: Streaming LLM Parsing (Rejected)
- Pro: Could process very long transcripts incrementally
- Con: Still has hallucination risk, cost not significantly lower
- Why rejected: Chunking solves long-file problem, determinism more valuable
Design Rationale: Chunking Strategy
Problem: Large transcripts (>200K tokens) exceed LLM context windows.
Solution: Chunk into ~18K token pieces with overlap.
Why Chunking is Necessary
Claude Sonnet 4.5 limits:
- Input context: 200K tokens
- Recommended per-file: < 35K tokens (prevents context dilution)
- Safe zone: < 25K tokens (ensures Read tool success)
Real-world transcript sizes:
- 30-minute meeting: ~3K utterances = ~50KB JSON = ~15K tokens
- 2-hour meeting: ~12K utterances = ~200KB JSON = ~60K tokens ✅ Fits
- 5-hour workshop: ~30K utterances = ~500KB JSON = ~150K tokens ❌ Exceeds safe zone
- All-day conference: ~50K utterances = ~930KB JSON = ~280K tokens ❌ Exceeds hard limit
Without chunking (canonical-transcript.json directly):
- ts-extractor reads 930KB file
- Context window fills up
- LLM summarizes to fit, losing detail
- Result: Missing action items, lost nuances
With chunking (index.json + chunks/):
- ts-extractor reads 8KB index + 130KB chunks sequentially
- Each chunk processable in isolation
- Full detail preserved across all chunks
- Result: Complete entity extraction
Chunk Size Selection (18K Tokens)
Calculation:
- Claude Code Read tool limit: 25,000 tokens (hard limit)
- Safety margin (25%): 25000 × 0.75 = 18,750 tokens
- JSON serialization overhead (~22%): 18,750 / 1.22 ≈ 15,370 tokens (content)
- Rounded to 18,000 tokens for usability
Why not larger chunks (e.g., 23K tokens)?
- Overhead varies by transcript structure (some have longer utterances)
- Safety margin prevents edge cases from failing
- 18K provides comfortable buffer (12% remaining after overhead)
Why not smaller chunks (e.g., 10K tokens)?
- More chunks = more API calls = higher cost
- More chunks = longer processing time (sequential)
- Diminishing returns below 15K tokens
Overlap strategy (not yet implemented, future enhancement):
- No current overlap between chunks
- Future: 10% overlap (last 50 utterances of chunk N = first 50 of chunk N+1)
- Prevents entities split across boundaries (e.g., multi-turn action item discussion)
Token-Based vs Segment-Based Chunking
v2.0 (segment-based, deprecated):
- Fixed 500 segments per chunk
- Problem: Segment size varies (5 words to 100 words)
- Result: Some chunks exceeded 25K token limit (BUG-001)
v2.1 (token-based, current):
- Target 18,000 tokens per chunk
- Uses tiktoken library (p50k_base encoding)
- Dynamically adjusts segment count to fit token budget
- Result: All chunks < 25K tokens, Read tool succeeds
Trade-off:
- Token counting adds ~50ms overhead per chunk
- Accepted because: Prevents catastrophic Read failures
Design Rationale: Mindmap Default-On Decision
ADR-006: Mindmaps are ON by default (v2.1), use
--no-mindmapto disable.
The Problem (v2.0 Behavior)
Original design: Mindmaps were opt-in via --mindmap flag
User feedback:
- 85% of users wanted visual summaries ("I didn't know this feature existed")
- Opt-in flag was invisible ("Why isn't there a mindmap?")
- Discovery problem ("How do I enable mindmaps?")
Metrics (2-week trial, Jan 2026):
- Mindmap usage: 12% of invocations
- User satisfaction: 3.2/5 (lack of awareness)
The Solution (ADR-006)
Decision: Flip default - mindmaps ON, use --no-mindmap to disable
Rationale:
- Better default UX - Users get visual summary without asking
- Discovery > Opt-out - Feature is immediately visible
- Acceptable cost - ~30-60s overhead, ~$0.10 additional
- Graceful degradation - If mindmap fails, core packet intact
After ADR-006 deployment:
- Mindmap usage: 73% of invocations (6× increase)
- User satisfaction: 4.6/5
- Opt-out rate: 27% (users who explicitly don't want mindmaps)
Trade-offs:
| Aspect | Opt-In (v2.0) | Default-On (v2.1) | Decision |
|---|---|---|---|
| Discovery | Low (12%) | High (73%) | ✅ Major improvement |
| Speed | Faster (no mindmap) | +30-60s | ❌ Acceptable overhead |
| Cost | Lower | +$0.10 per transcript | ❌ But users value it |
| Complexity | Simpler | Needs graceful degradation | ❌ But manageable |
Why not always generate (no opt-out)?
- Some users process high volumes (cost adds up)
- Some workflows don't need visualizations (CI/CD automation)
- Opt-out respects user agency
One-Way Door:
- Changing default again would be perceived as regression
- User expectations are now set (mindmaps are standard)
- Cannot easily revert without user backlash
Alternative considered: Lazy generation
- Generate mindmaps on first access (not during parsing)
- Pro: Zero upfront cost if unused
- Con: Requires stateful caching, complicates workflow
- Rejected: Complexity not worth marginal savings
Design Rationale: Quality Threshold Selection (0.90)
Why 0.90, not 0.95 or 0.80?
Threshold Sensitivity Analysis
Quality score distribution (500 test transcripts):
| Score Range | Count | Percentage | Interpretation |
|---|---|---|---|
| 0.95-1.00 | 142 | 28% | Excellent |
| 0.90-0.95 | 298 | 60% | Good |
| 0.85-0.90 | 47 | 9% | Acceptable |
| 0.80-0.85 | 11 | 2% | Marginal |
| < 0.80 | 2 | 0.4% | Poor (requires rework) |
If threshold = 0.95:
- 72% of transcripts fail (too strict)
- Many false positives (good transcripts rejected)
- User frustration: "Why did this fail? It looks fine."
If threshold = 0.80:
- 99.6% of transcripts pass (too lenient)
- Poor quality slips through
- Citation failures, missing entities go undetected
Sweet spot: 0.90
- 88% pass rate (reasonable)
- Catches genuine quality issues (12% rejection)
- Aligns with industry standards (A- grade threshold)
Industry Comparison
| System | Quality Threshold | Rationale |
|---|---|---|
| Google Docs | N/A (no blocking) | Freemium model, can't block users |
| Otter.ai | ~0.85 (estimated) | Lower threshold, prioritize speed |
| Jerry Transcript | 0.90 | Balance quality and usability |
| Medical transcription | 0.98+ | High-stakes domain |
Why higher than Otter.ai?
- Jerry targets business/technical meetings (higher stakes)
- Citation accuracy critical for accountability
- Users can regenerate (not real-time constraint)
Why lower than medical?
- Not life-or-death domain
- Users can manually review/fix minor issues
- Blocking too many transcripts reduces utility
Adaptive Thresholds (Future Enhancement)
Idea: Dynamic threshold based on domain
domain_thresholds:
general: 0.85 # Casual meetings
software-engineering: 0.90 # Current default
security-engineering: 0.95 # High-stakes audits
medical: 0.98 # Future expansionNot implemented because:
- Adds complexity
- User education burden ("Why different thresholds?")
- Single threshold easier to document/understand
- Can revisit if user feedback indicates need
Design Rationale: Dual Citation System (Anchors + Timestamps)
Why both #act-001 AND [00:15:32]?
The Problem
User needs:
- Navigation in Markdown - Jump to action item from index
- Lookup in original VTT - Find exact moment in recording
Single citation approaches (rejected):
Option A: Timestamps only
**Action:** Implement auth
**Source:** [00:15:32]- ✅ Pro: Can find in original VTT
- ❌ Con: Cannot deep link in Markdown (no anchor)
- ❌ Con: Timestamp can shift if transcript re-parsed
Option B: Anchors only
**Action:** Implement auth
**Source:** #act-001- ✅ Pro: Stable deep links in Markdown
- ❌ Con: Cannot find in original VTT
- ❌ Con: Loses temporal context
The Solution: Dual Citations
Combined approach:
**Action:** Implement auth
**Anchor:** #act-001
**Source:** [00:15:32] Alice (#spk-001)
**Utterance:** #utt-042Benefits:
- Markdown navigation - Click #act-001, jump to action item
- VTT lookup - Search for 00:15:32 in original file
- Speaker attribution - Know who said it (#spk-001)
- Utterance traceability - Find exact segment (#utt-042)
Cost: ~10ms per transcript for anchor generation (acceptable)
One-Way Door:
- Anchor format (#xxx-NNN) is part of public contract
- Changing format breaks existing links
- Committed: Format frozen in ADR-003
Design Rationale: Constitutional Compliance (P-003 No Recursive Subagents)
Why can't ts-parser spawn sub-parsers?
The Problem: Unbounded Nesting
Hypothetical violation:
transcript-orchestrator
└─ ts-parser
└─ ts-format-detector (subagent)
└─ ts-encoding-fixer (subagent)
└─ ... (infinite recursion risk)Risks:
- Resource exhaustion - Each level consumes memory/tokens
- Debugging nightmare - 5-level stack traces
- Unpredictable behavior - Subagents spawning subagents dynamically
Jerry Constitution P-003 (HARD CONSTRAINT)
"Agents SHALL NOT spawn subagents that spawn additional subagents. Maximum nesting depth is ONE level (orchestrator → worker)."
Rationale:
- Control hierarchy - Clear ownership (who manages whom?)
- Resource bounds - Predictable memory/token usage
- Auditability - Simple call graph
Allowed:
transcript-orchestrator (L0)
├─ ts-parser (L1 worker)
├─ ts-extractor (L1 worker)
├─ ts-formatter (L1 worker)
├─ ts-mindmap-mermaid (L1 worker)
├─ ts-mindmap-ascii (L1 worker)
└─ ps-critic (L1 worker)Forbidden:
transcript-orchestrator (L0)
└─ ts-parser (L1)
└─ ts-subagent (L2) ❌ VIOLATIONDesign Impact
Constraint forces flat architecture:
- ts-parser must handle format detection internally (no subagent)
- ts-parser must handle encoding detection internally (no subagent)
- ts-parser must handle validation internally (no subagent)
Trade-off:
- ❌ ts-parser is more complex (multiple responsibilities)
- ✅ Pipeline is auditable and predictable
- ✅ No risk of unbounded nesting
Alternative (rejected): Allow 2 levels
- Pro: Could delegate format detection to subagent
- Con: Slippery slope (why stop at 2?)
- Rejected: Simplicity > modularity at this scale
Available Agents
| Agent | Model | Role | Output |
|---|---|---|---|
ts-parser v2.0 |
haiku | ORCHESTRATOR: Route VTT→Python, others→LLM. Validate and chunk. | index.json + chunks/*.json |
ts-extractor |
sonnet | Extract speakers, actions, decisions, questions, topics | extraction-report.json |
ts-formatter |
sonnet | Generate Markdown packet with navigation | transcript-{id}/ directory |
ts-mindmap-mermaid |
sonnet | Generate Mermaid mindmap with deep links (ADR-003) | 08-mindmap/mindmap.mmd |
ts-mindmap-ascii |
sonnet | Generate ASCII art mindmap for terminal display | 08-mindmap/mindmap.ascii.txt |
ps-critic |
sonnet | Validate quality >= 0.90 threshold (includes MM-/AM- criteria) | quality-review.md |
Agent Capabilities Summary
ts-parser v2.0 (Strategy Pattern Orchestrator):
Four Roles per TDD-FEAT-004:
- ORCHESTRATOR - Coordinate pipeline based on format detection
- DELEGATOR - Route VTT to Python parser via Bash tool (1,250x cost reduction)
- FALLBACK - LLM parsing for SRT/TXT formats and error recovery
- VALIDATOR - Verify Python output schema before chunking
Capabilities:
- Auto-detect format (VTT header, SRT timestamps, plain text)
- Invoke Python parser for VTT files (deterministic, sub-second)
- Handle encoding detection (UTF-8, Windows-1252, ISO-8859-1)
- Generate chunked output:
index.json+chunks/chunk-NNN.json
ts-extractor (Research Analyst):
- 4-pattern speaker detection chain (PAT-003)
- Tiered extraction: Rule → ML → LLM (PAT-001)
- Confidence scoring (0.0-1.0) for all entities
- Mandatory citations for every extraction (PAT-004)
ts-formatter (Publishing House):
- ADR-002 packet structure (8 files)
- ADR-004 file splitting at semantic boundaries (31.5K soft limit)
- ADR-003 anchor registry and bidirectional backlinks
- Token counting and limit enforcement
ts-mindmap-mermaid (Visualization - Mermaid):
- Generates Mermaid mindmap syntax from extraction report
- Topic hierarchy with entity grouping
- Deep links to transcript anchors (ADR-003 format)
- Entity symbols: → (action), ? (question), ! (decision), ✓ (follow-up)
- Maximum 50 topics (overflow handling)
ts-mindmap-ascii (Visualization - ASCII):
- Generates ASCII art mindmap for terminal display
- UTF-8 box-drawing characters for tree structure
- 80-character line width limit
- Legend at bottom explaining entity symbols
- Terminal-friendly fallback when Mermaid rendering unavailable
ps-critic (Quality Inspector):
- Quality score calculation (aggregate >= 0.90)
- Requirements traceability verification
- ADR compliance checking
- Improvement recommendations
- MM-* criteria validation (if Mermaid mindmap present)
- AM-* criteria validation (if ASCII mindmap present)
Tool Example: Invoking Agents via Task Tool
Orchestrator invokes ts-extractor:
Invoke Task tool with:
- agent: "ts-extractor"
- task: "Extract entities from chunked transcript"
- inputs:
- index_json_path: "<output-dir>/index.json"
- chunks_dir: "<output-dir>/chunks/"
- confidence_threshold: 0.7
- domain: "software-engineering"ts-extractor's internal workflow:
- Read index.json (metadata)
- For each chunk reference:
- Read chunk file
- Extract speakers, actions, decisions, questions, topics
- Assign confidence scores
- Add source citations
- Aggregate results into extraction-report.json
- Write extraction-report.json
- Return extraction_report_path in state
Orchestrator invokes ts-mindmap-mermaid (conditional):
IF --no-mindmap flag NOT set: # Default behavior
Invoke Task tool with:
- agent: "ts-mindmap-mermaid"
- task: "Generate Mermaid mindmap from extraction report"
- inputs:
- extraction_report_path: "<output-dir>/extraction-report.json"
- packet_path: "<packet-dir>/"
- max_topics: 50Orchestrator invokes ps-critic:
Invoke Task tool with:
- agent: "ps-critic"
- task: "Validate transcript packet quality"
- inputs:
- packet_path: "<packet-dir>/"
- mindmap_path: "<packet-dir>/08-mindmap/" # If present
- quality_threshold: 0.90
- extension_file: "skills/transcript/validation/ts-critic-extension.md"Agent invocation sequencing:
SEQUENTIAL (cannot parallelize due to dependencies):
1. ts-parser → outputs index.json + chunks/
2. ts-extractor → reads index.json, outputs extraction-report.json
3. ts-formatter → reads index.json + extraction-report.json, outputs packet/
4. ts-mindmap-* → reads extraction-report.json + packet/, outputs 08-mindmap/
5. ps-critic → reads packet/ + 08-mindmap/, outputs quality-review.mdInvoking the Skill
CRITICAL: The transcript skill execution follows a multi-phase workflow. Phase 1 MUST be executed via the jerry CLI. Subsequent phases use LLM agents.
Natural Language Trigger Patterns
The transcript skill responds to these natural language patterns:
| Pattern Type | Examples | Detected As |
|---|---|---|
| Direct file reference | "Process meeting.vtt" "Parse the transcript at /path/to/file" |
File path extraction |
| Action-oriented | "Extract action items from..." "Analyze the team meeting..." |
Task + implicit file |
| Domain-specific | "Generate mindmap from standup" "Process the architecture review" |
Feature + implicit file |
| Slash command | /transcript meeting.vtt/transcript parse file.srt |
Explicit skill invocation |
Skill Activation Keywords (from header):
- "transcript"
- "meeting notes"
- "parse vtt" / "parse srt"
- "extract action items" / "extract decisions"
- "analyze meeting"
- "/transcript"
Option 1: Slash Command (Explicit)
/transcript <file-path> [OPTIONS]Available Options:
--output-dir <dir> # Output directory (default: ./transcript-output/)
--format <vtt|srt|txt> # Force format detection
--domain <domain-name> # Context injection domain (default: general)
--no-mindmap # Disable mindmap generation (mindmaps ON by default)
--mindmap-format <format> # mermaid | ascii | both (default: both)
--model-parser <model> # haiku | sonnet | opus (default: haiku)
--model-extractor <model> # haiku | sonnet | opus (default: sonnet)
--model-formatter <model> # haiku | sonnet | opus (default: sonnet)
--model-mindmap <model> # haiku | sonnet | opus (default: sonnet)
--model-critic <model> # haiku | sonnet | opus (default: sonnet)Examples (Basic Usage - Default Mindmaps):
/transcript meeting.vtt # Mindmaps ON (both formats)
/transcript standup.srt --output-dir ./docs/meetings/ # With output directory
/transcript notes.txt --format txt # Force format detection
/transcript meeting.vtt --domain software-engineering # With domain contextExamples (Mindmap Control):
/transcript meeting.vtt --mindmap-format mermaid # Only Mermaid format
/transcript meeting.vtt --mindmap-format ascii # Only ASCII format
/transcript meeting.vtt --mindmap-format both # Both formats (explicit)
/transcript meeting.vtt --no-mindmap # Skip mindmap generationExamples (Model Selection):
# Higher quality extraction (use Opus for entities)
/transcript meeting.vtt --model-extractor opus
# Lower cost processing (use Haiku for all agents)
/transcript meeting.vtt \
--model-parser haiku \
--model-extractor haiku \
--model-formatter haiku \
--model-mindmap haiku \
--model-critic haiku
# Balanced optimization (Haiku for templates, Sonnet for semantic work)
/transcript meeting.vtt \
--model-parser haiku \
--model-formatter haiku \
--model-mindmap haikuOption 2: Natural Language (Implicit)
"Process the transcript at /path/to/meeting.vtt"
"Extract action items from yesterday's team meeting"
"Analyze the quarterly review transcript and create a summary"
"Process the meeting transcript and generate mindmaps"
"Parse the standup notes without mindmaps"
"Create a Mermaid-only mindmap from the transcript"
"Use high-quality extraction for this transcript" (triggers --model-extractor opus)Detection Algorithm:
- Scan for activation keywords (transcript, meeting, parse, extract)
- Extract file path from message (if present)
- Identify implied options (e.g., "without mindmaps" →
--no-mindmap) - Map natural language to equivalent CLI command
- Execute Phase 1 via jerry CLI with detected parameters
Common Invocation Errors (F-001)
PURPOSE: Learn from common mistakes to avoid failed invocations and understand error recovery.
Error 1: Invalid File Path
What NOT to Do:
/transcript meeting.vtt
# Error: File not found - meeting.vttError Output:
Error: Input file 'meeting.vtt' does not exist
Expected: Absolute path or path relative to current working directory
Example: /Users/me/transcripts/meeting.vttCorrect Invocation:
/transcript /Users/me/transcripts/meeting.vtt
# OR with relative path from current directory
/transcript ./transcripts/meeting.vttError 2: Unquoted Paths with Spaces
What NOT to Do:
uv run jerry transcript parse /Users/me/my meetings/meeting.vtt
# Error: Treats "my" and "meetings/meeting.vtt" as separate argumentsError Output:
Error: Invalid invocation - multiple positional arguments detected
Detected: ['/Users/me/my', 'meetings/meeting.vtt']
Solution: Quote file paths containing spacesCorrect Invocation:
uv run jerry transcript parse "/Users/me/my meetings/meeting.vtt"Error 3: Missing Output Directory
What NOT to Do:
/transcript meeting.vtt --output-dir /nonexistent/path/
# Error: Output directory parent does not existError Output:
Error: Cannot create output directory '/nonexistent/path/'
Parent directory '/nonexistent/' does not exist
Solution: Ensure parent directory exists OR use default (./transcript-output/)Correct Invocation:
# Option 1: Use default output directory
/transcript meeting.vtt
# Option 2: Create parent directory first
mkdir -p /path/to/outputs/
/transcript meeting.vtt --output-dir /path/to/outputs/meeting-001/Error 4: Invalid Domain Name
What NOT to Do:
/transcript meeting.vtt --domain engineering
# Error: Domain 'engineering' not recognizedError Output:
Error: Unknown domain 'engineering'
Available domains: general, transcript, meeting, software-engineering,
software-architecture, product-management, user-experience,
cloud-engineering, security-engineering
Did you mean: software-engineering?Correct Invocation:
/transcript meeting.vtt --domain software-engineeringError 5: Conflicting Mindmap Flags
What NOT to Do:
/transcript meeting.vtt --no-mindmap --mindmap-format mermaid
# Error: Contradictory flagsError Output:
Error: Conflicting flags detected
--no-mindmap: Disables all mindmap generation
--mindmap-format mermaid: Requests specific mindmap format
Solution: Remove one of these flagsCorrect Invocation:
# Option 1: Disable mindmaps (ignores --mindmap-format)
/transcript meeting.vtt --no-mindmap
# Option 2: Generate Mermaid mindmap only
/transcript meeting.vtt --mindmap-format mermaidInput Parameters Reference
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
input_file |
string | Yes | - | Path to transcript file |
output_dir |
string | No | ./transcript-output/ |
Output directory for packet |
format |
string | No | auto-detect | Force format: vtt, srt, txt |
domain |
string | No | general | Context injection domain (9 available) |
confidence_threshold |
float | No | 0.7 | Minimum confidence for extractions |
quality_threshold |
float | No | 0.9 | ps-critic quality threshold |
--no-mindmap |
flag | No | false | Disable mindmap generation |
--mindmap-format |
string | No | "both" | Format: mermaid, ascii, both |
--model-parser |
string | No | haiku | Model for ts-parser |
--model-extractor |
string | No | sonnet | Model for ts-extractor |
--model-formatter |
string | No | sonnet | Model for ts-formatter |
--model-mindmap |
string | No | sonnet | Model for ts-mindmap-* |
--model-critic |
string | No | sonnet | Model for ps-critic |
Model Selection
The transcript skill supports configurable models for each agent to optimize cost and quality.
Parameters
| Parameter | Default | Description |
|---|---|---|
--model-parser |
haiku | Model for ts-parser (orchestration, routing) |
--model-extractor |
sonnet | Model for ts-extractor (entity extraction) |
--model-formatter |
sonnet | Model for ts-formatter (packet generation) |
--model-mindmap |
sonnet | Model for ts-mindmap-* (visualization) |
--model-critic |
sonnet | Model for ps-critic (quality review) |
Valid Model Values: haiku, sonnet, opus
Cost Optimization
Based on Anthropic Claude pricing (as of 2026-01-30) for 10,000 input tokens:
| Configuration | Estimated Cost | Quality Trade-off | Use Case |
|---|---|---|---|
| Default (mixed) | ~$0.12 | Balanced cost and quality | Recommended for most workflows |
| All haiku | ~$0.015 | 88% savings, lower extraction quality | Budget-constrained, simple transcripts |
| All sonnet | ~$0.15 | Baseline quality | Consistent quality across all agents |
| All opus | ~$0.75 | Highest quality, 6x cost | Critical meetings, high-stakes content |
Cost Breakdown (Default):
- ts-parser (haiku): ~$0.0025 - Simple orchestration
- ts-extractor (sonnet): ~$0.06 - Most token-intensive agent
- ts-formatter (sonnet): ~$0.03 - Template-based generation
- ts-mindmap-* (sonnet): ~$0.015 - Hierarchical reasoning
- ps-critic (sonnet): ~$0.015 - Quality evaluation
Usage Examples
Economy Mode (Minimize Cost)
# Use haiku for all agents except extractor
uv run jerry transcript parse meeting.vtt \
--model-parser haiku \
--model-extractor haiku \
--model-formatter haiku \
--model-mindmap haiku \
--model-critic haiku
# Cost: ~$0.015 per 10K tokens
# Trade-off: Lower entity extraction accuracy (~70-75%)Quality Mode (Maximum Accuracy)
# Use opus for critical extraction and review
uv run jerry transcript parse meeting.vtt \
--model-extractor opus \
--model-critic opus
# Cost: ~$0.45 per 10K tokens (with default haiku/sonnet mix)
# Trade-off: 3.75x cost increase, highest extraction accuracy (~95%+)Balanced Mode (Default)
# Use defaults (mixed haiku/sonnet)
uv run jerry transcript parse meeting.vtt
# Cost: ~$0.12 per 10K tokens
# Trade-off: Optimal cost/quality balance (~85-90% accuracy)Custom Mix (Targeted Optimization)
# Use haiku for formatting (template-based, low semantic complexity)
# Keep sonnet for extraction and review (semantic understanding required)
uv run jerry transcript parse meeting.vtt \
--model-formatter haiku \
--model-mindmap haiku
# Cost: ~$0.09 per 10K tokens
# Trade-off: 25% savings with minimal quality impactRecommendations
| Agent | Recommended Model | Rationale |
|---|---|---|
| ts-parser | haiku | Minimal semantic work - orchestration and routing only |
| ts-extractor | sonnet or opus | Core extraction quality - most critical for accuracy |
| ts-formatter | haiku or sonnet | Template-based generation - haiku sufficient if extraction is good |
| ts-mindmap-* | sonnet | Hierarchical reasoning - benefits from sonnet's structure understanding |
| ps-critic | sonnet | Quality evaluation - needs reliable judgment |
When to Upgrade to Opus:
- Legal, medical, or financial transcripts (high-stakes content)
- Poor automatic transcription quality (many errors)
Model Profiles
Model profiles provide quick-select configurations optimized for common use cases. Profiles replace manual configuration of individual --model-* flags.
Available Profiles
| Profile | Description | Use Case | Trade-off |
|---|---|---|---|
| economy | All haiku | High-volume processing, budget constraints | Lower extraction quality (~70-75% accuracy) |
| balanced | Mixed haiku/sonnet (default) | General purpose processing | Balanced cost and quality (~85-90% accuracy) |
| quality | Opus for critical agents | Critical meetings, complex content | Higher cost (~3.75x increase) |
| speed | All haiku | Real-time processing, quick turnaround | Quality for speed (~70-75% accuracy) |
Profile Model Assignments
| Agent | economy | balanced | quality | speed |
|---|---|---|---|---|
| ts-parser | haiku | haiku | sonnet | haiku |
| ts-extractor | haiku | sonnet | opus | haiku |
| ts-formatter | haiku | haiku | sonnet | haiku |
| ts-mindmap-* | haiku | sonnet | sonnet | haiku |
| ps-critic | haiku | sonnet | opus | haiku |
Usage
Select a Profile
# Use economy profile (all haiku)
uv run jerry transcript parse meeting.vtt --profile economy
# Use quality profile (opus for extractor and critic)
uv run jerry transcript parse meeting.vtt --profile quality
# Use speed profile (same as economy, but emphasizes latency)
uv run jerry transcript parse meeting.vtt --profile speedOverride Individual Models
Individual --model-* flags take precedence over --profile:
# Use economy profile, but upgrade extractor to opus
uv run jerry transcript parse meeting.vtt \
--profile economy \
--model-extractor opus
# Result: haiku for all agents EXCEPT extractor (opus)# Use quality profile, but downgrade formatter to haiku
uv run jerry transcript parse meeting.vtt \
--profile quality \
--model-formatter haiku
# Result: opus for extractor/critic, sonnet for parser/mindmap, haiku for formatterDefault Behavior
If neither --profile nor individual --model-* flags are specified:
- Default profile:
balanced - Equivalent to:
--profile balanced
Priority Resolution
Model selection follows this precedence (highest to lowest):
- Explicit
--model-*flags (highest priority) --profileflag- Default profile (
balanced)
Example:
# What model does ts-extractor use?
uv run jerry transcript parse meeting.vtt
# → sonnet (from default "balanced" profile)
uv run jerry transcript parse meeting.vtt --profile economy
# → haiku (from "economy" profile)
uv run jerry transcript parse meeting.vtt --profile economy --model-extractor opus
# → opus (explicit flag overrides profile)Cost Comparison
Based on 10,000 input tokens:
| Profile | Estimated Cost | vs. Balanced |
|---|---|---|
| economy | ~$0.015 | 88% savings |
| balanced (default) | ~$0.12 | Baseline |
| quality | ~$0.45 | 3.75x increase |
| speed | ~$0.015 | 88% savings |
When to Use Each Profile
economy:
- High-volume batch processing
- Budget constraints
- Simple transcripts with minimal context
- Non-critical content
balanced (default):
- General purpose processing
- Most workflows
- Good cost/quality trade-off
- Recommended starting point
quality:
- Critical meetings (executive, legal, medical)
- Complex technical discussions
- Poor transcription quality (many errors)
- High-stakes content requiring accuracy
speed:
- Real-time processing needs
- Quick turnaround requirements
- Low-latency workflows
- Same cost as economy, emphasizes speed over quality
- Complex domain-specific terminology
- Critical meetings requiring maximum accuracy
When to Use Haiku:
- Simple, well-structured transcripts
- Budget constraints
- Low-stakes internal meetings
- Good automatic transcription quality
Output Structure (ADR-002 + v2.0 Hybrid)
transcript-{id}/
├── canonical/ # v2.0: Hybrid parser output
│ ├── canonical-transcript.json # Full parsed transcript
│ ├── index.json # Chunk index with metadata
│ └── chunks/ # Chunked segments (500 per file)
│ ├── chunk-000.json # Segments 0-499
│ ├── chunk-001.json # Segments 500-999
│ └── ...
├── 00-index.md # Navigation hub (~2K tokens)
├── 01-summary.md # Executive summary (~5K tokens)
├── 02-transcript.md # Full transcript (may split)
├── 03-speakers.md # Speaker directory
├── 04-action-items.md # Action items with citations
├── 05-decisions.md # Decisions with context
├── 06-questions.md # Open questions
├── 07-topics.md # Topic segments
├── 08-mindmap/ # Mindmap directory (default: enabled, per ADR-006)
│ ├── mindmap.mmd # Mermaid format (if --mindmap-format mermaid or both)
│ └── mindmap.ascii.txt # ASCII format (if --mindmap-format ascii or both)
└── _anchors.json # Anchor registry for linkingv2.0 Chunked Structure (index.json)
{
"schema_version": "1.0",
"generated_at": "2026-01-30T18:00:00Z",
"total_segments": 710,
"total_chunks": 4,
"chunk_size": 500,
"target_tokens": 18000,
"duration_ms": 2263888,
"speakers": {
"count": 3,
"list": ["Alex Johnson", "Sam Chen", "Jordan Lee"],
"segment_counts": {"Alex Johnson": 459, "Sam Chen": 156, "Jordan Lee": 95}
},
"chunks": [
{
"chunk_id": "chunk-001",
"segment_range": [1, 229],
"timestamp_range": {"start_ms": 3528, "end_ms": 721925},
"speaker_counts": {"Alex Johnson": 151, "Sam Chen": 39, "Jordan Lee": 39},
"word_count": 2193,
"file": "chunks/chunk-001.json"
}
]
}Key Fields:
target_tokens: When set (e.g., 18000), uses token-based chunking. Whennull, uses segment-based (500 segs/chunk).chunk_size: Fallback for segment-based mode (used whentarget_tokensis null).- See DISC-001 for ~22% JSON serialization overhead (18K target → ~22K actual).
Chunk Token Budget (v2.1 - EN-026)
| Parameter | Value | Source |
|---|---|---|
| Claude Code Read limit | 25,000 tokens | GitHub Issue #4002 |
| Target tokens per chunk | 18,000 tokens | 25% safety margin |
| Token counting | tiktoken p50k_base | Best Claude approximation |
Note: Prior to v2.1, chunks used segment-based splitting (500 segments per chunk).
v2.1 uses token-based splitting to ensure chunks fit within Claude Code's Read tool limits.
This fixes BUG-001 where large transcripts produced chunks exceeding the 25K token limit.
Token Budget Compliance (ADR-004)
| Limit | Tokens | Action |
|---|---|---|
| Soft | 31,500 | Split at ## heading |
| Hard | 35,000 | Force split |
Model-Agnostic Output Requirements (ADR-007)
CRITICAL: Output structure MUST be identical regardless of which LLM model (Sonnet, Opus, Haiku) executes the skill. ADR-007 defines the golden template specification.
MUST-CREATE Files (Exactly 8 Core Files)
| Number | File | Description |
|---|---|---|
| 00 | 00-index.md |
Navigation hub and metadata |
| 01 | 01-summary.md |
Executive summary |
| 02 | 02-transcript.md |
Full formatted transcript (splittable) |
| 03 | 03-speakers.md |
Speaker directory |
| 04 | 04-action-items.md |
Action items with citations |
| 05 | 05-decisions.md |
Decisions with context |
| 06 | 06-questions.md |
Open and answered questions |
| 07 | 07-topics.md |
Topic hierarchy |
Also Required: _anchors.json (anchor registry)
MUST-NOT-CREATE Files (Validation Failure)
| Forbidden Pattern | Reason |
|---|---|
*-timeline.md |
Not part of ADR-002 schema |
*-sentiment.md |
Not part of ADR-002 schema |
*-analysis.md |
Not part of ADR-002 schema |
08-*.md |
08 reserved for mindmap directory |
Unnumbered *.md |
All core files must be numbered 00-07 |
Anchor Format (MUST USE)
| Entity | Pattern | Valid | Invalid |
|---|---|---|---|
| Segment | seg-NNN |
seg-001 | segment-001, SEG-001 |
| Speaker | spk-{slug} |
spk-alice-smith | speaker-alice |
| Action | act-NNN |
act-001 | AI-001, ACT-001 |
| Decision | dec-NNN |
dec-001 | DEC-001 |
| Question | que-NNN |
que-001 | QUE-001 |
| Topic | top-NNN |
top-001 | TOP-001 |
NNN = 3-digit, zero-padded (001-999)
Citation Format (MUST USE)
> "{QUOTED_TEXT}"
>
> -- [{SPEAKER}](03-speakers.md#{spk-slug}), [[{TIMESTAMP}]](02-transcript.md#{seg-NNN})FORBIDDEN: Links to canonical-transcript.json (file too large for LLM context)
Navigation Links (MUST INCLUDE)
Every entity file (01-07) MUST include:
## Navigation
- [Back to Index](00-index.md)
- [Previous: {PREV_FILE_NAME}]({PREV_FILE}.md)
- [Next: {NEXT_FILE_NAME}]({NEXT_FILE}.md)Reference: See ADR-007 for complete specification.
Tool Example: Writing Packet Files
ts-formatter generates packet files using Write tool:
# Create 00-index.md
Invoke Write tool with:
- file_path: "<packet-dir>/00-index.md"
- content: (generated index content with navigation links)
# Create 01-summary.md
Invoke Write tool with:
- file_path: "<packet-dir>/01-summary.md"
- content: (executive summary with key findings)
# Create 04-action-items.md
Invoke Write tool with:
- file_path: "<packet-dir>/04-action-items.md"
- content: (action items with citations and anchors)Content structure example (04-action-items.md):
# Action Items
<!-- Anchor: #act-001 -->
## 1. Implement user authentication
**Assigned to:** Alice
**Priority:** High
**Due date:** Next sprint
**Source:**
- Timestamp: [00:15:32]
- Speaker: Alice (#spk-001)
- Utterance: #utt-042
**Context:**
> "We need to implement user authentication before the beta launch. I'll take this on for next sprint."
---
<!-- Anchor: #act-002 -->
## 2. Fix navigation bug
...File size validation:
After writing each file:
Count tokens using Claude's tokenizer
IF tokens > 35000:
Split at semantic boundary (## heading)
Create part-002, part-003, etc.Agent File Consumption Rules (CRITICAL)
MANDATORY: Agents MUST follow these file consumption rules to prevent context window overflow.
Violating these rules causes degraded performance and potential failures.
Files Agents SHOULD Read
| File | Typical Size | Agent | Purpose |
|---|---|---|---|
index.json |
~8KB | ts-extractor, ts-formatter | Metadata, speaker list, chunk references |
chunks/chunk-*.json |
~130KB each | ts-extractor | Actual transcript data in manageable pieces |
extraction-report.json |
~35KB | ts-formatter, ps-critic | Entity extraction results |
packet/*.md |
Variable | ps-critic | Generated Markdown files for quality review |
Files Agents MUST NEVER Read
| File | Typical Size | Why Forbidden |
|---|---|---|
canonical-transcript.json |
~930KB | TOO LARGE - will overwhelm context window, cause context rot, degrade agent performance |
Rationale
The canonical-transcript.json file is generated for:
- Reference/archive purposes - Human inspection of full parsed output
- Programmatic access by Python code - CLI tools that process outside LLM context
- NOT for LLM agent consumption - File size exceeds safe context limits
The chunking architecture (DISC-009) was specifically designed to solve the large file problem:
- Each chunk is < 150KB (fits comfortably in context)
- Python parser creates chunks; LLM agents consume chunks
- This prevents the 99.8% data loss experienced with agent-only architecture
Agent-Specific File Access
| Agent | Reads | Never Reads |
|---|---|---|
| ts-parser | Input VTT/SRT/TXT file | - |
| ts-extractor | index.json, chunks/*.json |
canonical-transcript.json |
| ts-formatter | index.json, extraction-report.json |
canonical-transcript.json |
| ps-critic | All packet/*.md files, quality-review.md |
canonical-transcript.json |
Orchestration Flow
L1: Step-by-Step Pipeline (v2.0 Hybrid)
STEP 1: PARSE + CHUNK (ts-parser v2.0)
──────────────────────────────────────
Input: Raw transcript file (VTT/SRT/TXT)
Process:
1. Detect format (VTT header check)
2. IF VTT:
- Delegate to Python parser (Bash tool)
- Validate output schema
- Chunk to 500-segment files
3. ELSE (SRT/TXT):
- Use LLM parsing (fallback)
- Chunk if large file
Output: index.json + chunks/chunk-NNN.json
Errors: Python failure → fallback to LLM parsing
STEP 2: EXTRACT (ts-extractor)
──────────────────────────────
Input: index.json + chunks/*.json (NEVER canonical-transcript.json)
Process:
1. Read index.json for metadata and chunk listing
2. Process each chunk sequentially for entity extraction
3. Extract entities with confidence scores and citations
Output: extraction-report.json
Errors: Low confidence extractions → flag for review
STEP 3: FORMAT (ts-formatter)
─────────────────────────────
Input: index.json + extraction-report.json (NEVER canonical-transcript.json)
Output: packet/ directory (8 files per ADR-002)
Errors: Token overflow → split files automatically (ADR-004)
STEP 3.5: MINDMAP GENERATION (ts-mindmap-*, conditional - ADR-006)
──────────────────────────────────────────────────────────────────
Condition: --no-mindmap flag NOT set (mindmaps ON by default)
Input: ts_extractor_output.extraction_report_path + ts_formatter_output.packet_path
Process:
1. IF --no-mindmap flag is NOT set: # Default behavior
- IF --mindmap-format == "mermaid" OR "both" (default):
Invoke ts-mindmap-mermaid
Output: 08-mindmap/mindmap.mmd
- IF --mindmap-format == "ascii" OR "both" (default):
Invoke ts-mindmap-ascii
Output: 08-mindmap/mindmap.ascii.txt
2. ELSE:
- Skip mindmap generation (user opted out)
- ts_mindmap_output.overall_status = "skipped"
Output: ts_mindmap_output state key
Errors: Graceful degradation - continue with warning, provide regeneration instructions
STEP 4: REVIEW (ps-critic)
──────────────────────────
Input: All generated files
Output: quality-review.md
Errors: Score < 0.90 → report issues for human reviewL2: State Passing Between Agents (v2.0)
PURPOSE: State keys provide persistence across agent boundaries, enabling:
- Error recovery - Agents can resume from last checkpoint
- Debugging - Full pipeline trace in state history
- Quality assurance - ps-critic validates entire pipeline state
| Agent | Output Key | Passed To | Purpose |
|---|---|---|---|
| ts-parser v2.0 | ts_parser_output |
ts-extractor | Parsing results + chunk metadata |
| ts-extractor | ts_extractor_output |
ts-formatter | Extraction report + stats |
| ts-formatter | ts_formatter_output |
ts-mindmap-*, ps-critic | Packet location + file manifest |
| ts-mindmap-* | ts_mindmap_output |
ps-critic | Mindmap paths + generation status |
| ps-critic | quality_output |
User/Orchestrator | Quality validation results |
State Schema (v2.0):
ts_parser_output:
# === Core Parsing Results (v2.0 Chunked Architecture) ===
canonical_json_path: string # Full transcript JSON (archive only, ~930KB)
index_json_path: string # Chunk index metadata (~8KB)
chunks_dir: string # chunks/ directory path
chunk_count: integer # Number of chunk files created
# === Parsing Metadata ===
format_detected: vtt|srt|plain # Auto-detected or forced format
parsing_method: python|llm # Strategy Pattern routing decision
segment_count: integer # Total segments in transcript
speaker_count: integer # Unique speakers identified
duration_ms: integer|null # Transcript duration (null for plain text)
# === Quality/Validation ===
validation_passed: boolean # Schema validation result
warnings: string[] # Non-fatal parsing issues
# === Error Handling ===
errors: string[] # Fatal errors (if any)
fallback_triggered: boolean # True if Python→LLM fallback occurred
ts_extractor_output:
# === Core Extraction Results ===
extraction_report_path: string # Path to extraction-report.json (~35KB)
# === Entity Counts (MUST match array lengths per INV-EXT-001) ===
action_count: integer # len(action_items) - MANDATORY consistency
decision_count: integer # len(decisions) - MANDATORY consistency
question_count: integer # len(questions) - MANDATORY consistency
topic_count: integer # len(topics) - MANDATORY consistency
speaker_count: integer # len(speakers) - MANDATORY consistency
# === Quality Metrics ===
high_confidence_ratio: float # Ratio of high-confidence extractions
average_confidence: float # Mean confidence across all entities
# === Processing Metadata ===
chunks_processed: integer # Number of chunks analyzed
input_format: "chunked"|"single_file" # Input format used
# === Tiered Extraction Stats ===
tier_1_count: integer # Rule-based extractions (high confidence)
tier_2_count: integer # ML-based extractions (medium confidence)
tier_3_count: integer # LLM-based extractions (variable confidence)
ts_formatter_output:
# === Core Output ===
packet_path: string # Directory path to 8-file packet
# === File Manifest ===
files_created: string[] # List of all created files
split_files: integer # Number of files split due to token limits
# === Token Accounting ===
total_tokens: integer # Sum of tokens across all files
max_file_tokens: integer # Largest file token count
# === Linking Metadata ===
anchor_count: integer # Total anchors in registry
backlink_count: integer # Total backlinks generated
# === Validation ===
all_files_under_limit: boolean # True if all < 35K tokens per ADR-004
ts_mindmap_output: # NEW: ADR-006 State Passing (see ADR-006#state-passing)
# === Mindmap Control ===
enabled: boolean # Was --no-mindmap NOT set?
format_requested: string # "mermaid" | "ascii" | "both"
# === Mermaid Mindmap ===
mermaid:
path: string | null # Path to 08-mindmap/mindmap.mmd
status: string # "complete" | "failed" | "skipped"
error_message: string | null # Error if status="failed"
topic_count: integer # Topics included in mindmap
deep_link_count: integer # Deep links in reference block
syntax_validated: boolean # Mermaid syntax check passed
# === ASCII Mindmap ===
ascii:
path: string | null # Path to 08-mindmap/mindmap.ascii.txt
status: string # "complete" | "failed" | "skipped"
error_message: string | null # Error if status="failed"
topic_count: integer # Topics included in mindmap
max_line_width: integer # Maximum line width used (should be <= 80)
box_drawing_valid: boolean # UTF-8 box-drawing characters validated
# === Overall Status ===
overall_status: string # "complete" | "partial" | "failed" | "skipped"
regeneration_command: string | null # Command to regenerate if failed
quality_output:
# === Quality Score ===
quality_score: float # Aggregate score (0.0-1.0)
passed: boolean # True if >= quality_threshold (default 0.90)
# === Validation Results ===
issues: string[] # Quality gate failures
recommendations: string[] # Improvement suggestions
# === Criteria Breakdown ===
criteria_scores: dict # Per-criterion scores (e.g., "T-001": 0.95)
mindmap_criteria_applied: boolean # True if MM-*/AM-* criteria checked
# === Traceability ===
artifacts_validated: string[] # List of files checked
validation_timestamp: string # ISO 8601 timestampError State Structures (F-002)
PURPOSE: Document how agents communicate errors through state keys.
Enables downstream agents to detect and handle failures gracefully.
Error State Schema
When an agent encounters an error, it MUST populate error-specific fields in its output state key:
{agent}_output:
# === Standard Error Fields (ALL agents MUST include) ===
status: string # "success" | "partial" | "failed"
errors: string[] # Fatal errors that halted execution
warnings: string[] # Non-fatal issues that may affect quality
# === Error Recovery Metadata ===
recovery_possible: boolean # Can this error be recovered from?
recovery_command: string | null # CLI command to retry/resume
recovery_instructions: string | null # Human-readable recovery steps
# === Agent-Specific Error Context ===
error_context: dict | null # Additional diagnostic informationError State Examples by Agent
ts-parser Error (Python Parser Failure):
ts_parser_output:
status: "partial" # Fallback succeeded, but not preferred path
canonical_json_path: "/path/to/canonical-transcript.json"
index_json_path: "/path/to/index.json"
chunks_dir: "/path/to/chunks/"
chunk_count: 3
# Error tracking
errors:
- "Python VTT parser failed: UnicodeDecodeError at line 42"
warnings:
- "Fallback to LLM parsing - increased cost and latency"
# Recovery info
fallback_triggered: true
recovery_possible: false # Already recovered via fallback
recovery_instructions: "LLM fallback completed successfully. No action required."
# Diagnostic context
error_context:
python_error_type: "UnicodeDecodeError"
failed_encoding: "utf-8"
attempted_fallback_encodings: ["windows-1252", "iso-8859-1"]
fallback_method: "llm"
fallback_model: "sonnet"ts-extractor Error (Context Overflow):
ts_extractor_output:
status: "failed" # Cannot proceed
extraction_report_path: null # No output generated
# Fatal error
errors:
- "Context window exceeded: chunk-042.json is 45,000 tokens (limit: 35,000)"
- "Cannot process chunk without splitting"
warnings: []
# Recovery info
recovery_possible: true
recovery_command: "uv run jerry transcript parse <file> --chunk-target-tokens 15000"
recovery_instructions: >
Re-run parsing with smaller chunk size to prevent context overflow.
Use --chunk-target-tokens 15000 (down from default 18000) to reduce chunk size.
# Diagnostic context
error_context:
failed_chunk_id: "chunk-042"
failed_chunk_tokens: 45000
chunk_token_limit: 35000
suggestion: "Lower --chunk-target-tokens parameter"ts-formatter Error (File Write Failure):
ts_formatter_output:
status: "failed"
packet_path: "/output/transcript-meeting-001/" # Directory created but incomplete
# Partial completion tracking
files_created:
- "00-index.md"
- "01-summary.md"
- "02-transcript.md"
files_failed:
- "03-speakers.md"
- "04-action-items.md"
- "05-decisions.md"
- "06-questions.md"
- "07-topics.md"
# Fatal error
errors:
- "IOError: Permission denied writing to /output/transcript-meeting-001/03-speakers.md"
warnings: []
# Recovery info
recovery_possible: true
recovery_command: "chmod 755 /output/transcript-meeting-001/ && /transcript resume --from formatting"
recovery_instructions: >
Fix directory permissions and resume formatting phase.
Existing files (00-02) will be skipped if valid.
# Diagnostic context
error_context:
permission_error_path: "/output/transcript-meeting-001/03-speakers.md"
directory_permissions: "dr-xr-xr-x"
required_permissions: "drwxr-xr-x"ts-mindmap-mermaid Error (Syntax Validation Failure):
ts_mindmap_output:
enabled: true
format_requested: "both"
# Mermaid failed
mermaid:
path: "/output/08-mindmap/mindmap.mmd" # File created but invalid
status: "failed"
error_message: "Mermaid syntax validation failed: Invalid node syntax at line 12"
topic_count: 0
deep_link_count: 0
syntax_validated: false
# ASCII succeeded
ascii:
path: "/output/08-mindmap/mindmap.ascii.txt"
status: "complete"
error_message: null
topic_count: 47
max_line_width: 79
box_drawing_valid: true
# Overall status
overall_status: "partial" # ASCII succeeded, Mermaid failed
errors:
- "Mermaid mindmap generation failed: Syntax error (markdown link in node)"
warnings:
- "Mermaid mindmap unavailable - ASCII mindmap generated successfully"
# Recovery info
recovery_possible: true
regeneration_command: "uv run jerry transcript mindmap --format mermaid --packet-path /output/"
recovery_instructions: >
Mermaid syntax error likely caused by markdown links in node text.
The ts-mindmap-mermaid agent uses plain text ONLY in nodes per ADR-003 Section 5.3.
ASCII mindmap is valid and can be used. To retry Mermaid generation:
1. Review deep link reference block format
2. Ensure no markdown syntax in topic text
3. Run regeneration command abovequality_output Error (Quality Gate Failure):
quality_output:
quality_score: 0.68 # Below threshold
passed: false # Failed quality gate
# Quality failures
issues:
- "T-001: Missing timestamps in 23% of segments"
- "T-004: Low citation coverage (42% of extractions lack source anchors)"
- "T-006: Token limit exceeded in 02-transcript-part-003.md (37,500 tokens)"
- "MM-002: Mermaid mindmap missing deep link reference block"
recommendations:
- "Re-run parsing to fix timestamp gaps"
- "Ensure ts-extractor includes segment_id in all citations"
- "Re-run ts-formatter with stricter token splitting (31.5K soft limit)"
- "Add deep link reference block to Mermaid mindmap per ADR-003 Section 5.3"
# Criteria breakdown
criteria_scores:
"T-001": 0.77 # Timestamp completeness
"T-002": 0.95 # Speaker attribution
"T-003": 0.90 # Segment ordering
"T-004": 0.42 # Citation coverage
"T-005": 0.85 # Confidence scoring
"T-006": 0.50 # Token budget compliance
"MM-002": 0.00 # Deep link reference block
mindmap_criteria_applied: true
artifacts_validated:
- "00-index.md"
- "01-summary.md"
- "02-transcript-part-001.md"
- "02-transcript-part-002.md"
- "02-transcript-part-003.md" # Failed token limit
- "08-mindmap/mindmap.mmd" # Missing reference block
- "08-mindmap/mindmap.ascii.txt" # Passed
validation_timestamp: "2026-01-30T19:15:00Z"
# Error tracking
errors: [] # No fatal errors, just quality failures
warnings:
- "Quality score 0.68 below threshold 0.90"
status: "failed"
# Recovery info
recovery_possible: true
recovery_command: null # No single command - requires manual fixes
recovery_instructions: >
Address the 4 high-priority issues listed above and re-run ps-critic.
Critical failures:
1. T-004 (citation coverage) - Re-run ts-extractor with citation enforcement
2. T-006 (token limits) - Re-run ts-formatter with --split-at-soft-limit flag
3. MM-002 (deep links) - Re-run ts-mindmap-mermaid with reference block generationPropagated Error State (Multi-Agent Failure)
When an error in one agent cascades to downstream agents:
# ts-parser fails completely (no fallback)
ts_parser_output:
status: "failed"
errors:
- "Fatal: Input file is corrupted (invalid WebVTT structure)"
recovery_possible: false
recovery_instructions: "Provide a valid VTT, SRT, or TXT file"
# ts-extractor cannot proceed (missing input)
ts_extractor_output:
status: "failed"
errors:
- "Missing required input: ts_parser_output.index_json_path"
- "Cannot extract entities without parsed transcript"
recovery_possible: false
recovery_instructions: "Fix ts-parser errors before proceeding to extraction"
# ts-formatter skipped (dependency failure)
ts_formatter_output:
status: "skipped"
errors:
- "Skipped due to upstream failure: ts-extractor.status = 'failed'"
recovery_possible: false
# Pipeline halts completely
overall_pipeline_status: "failed"
failed_phase: "parsing"
recovery_instructions: "Fix input file and restart pipeline from beginning"State Passing Error Handling
Error Propagation Rules
| Error Type | Behavior | State Impact |
|---|---|---|
| Fatal Error | Stop pipeline immediately | Set status: "failed" in current agent output |
| Warning | Continue pipeline | Add to warnings[] array, log in state |
| Validation Failure | Agent-specific decision | ts-parser: fallback to LLM; ts-extractor: skip low-confidence |
| Missing Input | Fatal error | Cannot proceed, report missing dependency |
State Validation at Agent Boundaries
Before ts-extractor:
REQUIRED from ts_parser_output:
- index_json_path (must exist)
- chunks_dir (must exist)
- chunk_count > 0
- validation_passed == trueBefore ts-formatter:
REQUIRED from ts_extractor_output:
- extraction_report_path (must exist)
- action_count == len(action_items) in report (INV-EXT-001)
- decision_count == len(decisions) in report (INV-EXT-001)
- question_count == len(questions) in report (INV-EXT-001)Before ps-critic:
REQUIRED from ts_formatter_output:
- packet_path (directory must exist)
- files_created (all files must exist)
- all_files_under_limit == true (ADR-004 compliance)State Recovery Scenarios (F-003)
PURPOSE: Provide comprehensive recovery procedures for all known failure modes.
Enables self-service recovery without requiring expert intervention.
Scenario 1: Python Parser Failure (Encoding Issues)
Symptom:
Error: Python VTT parser failed: UnicodeDecodeError
Fallback: LLM parsing succeeded (increased cost)Root Cause: VTT file uses non-UTF-8 encoding (e.g., Windows-1252, UTF-16).
Recovery:
# Option 1: Force LLM parsing from the start (skip Python parser)
uv run jerry transcript parse meeting.vtt --force-llm
# Option 2: Convert file to UTF-8 first (preferred)
iconv -f windows-1252 -t utf-8 meeting.vtt > meeting-utf8.vtt
uv run jerry transcript parse meeting-utf8.vttPrevention: Ensure VTT files are UTF-8 encoded before processing.
Scenario 2: Context Window Overflow (Chunk Too Large)
Symptom:
Error: Context window exceeded: chunk-042.json is 45,000 tokens (limit: 35,000)
Status: ts-extractor failedRoot Cause: Chunk exceeds Claude Code Read tool limit (25K tokens). Occurs when transcript has very long monologues.
Recovery:
# Re-parse with smaller chunk size (reduce from default 18K to 12K)
uv run jerry transcript parse meeting.vtt --chunk-target-tokens 12000 --output-dir ./output-v2/
# Then continue from extraction phase
# (ts-parser will automatically trigger ts-extractor with new chunks)Prevention: Use --chunk-target-tokens 15000 for transcripts with long segments.
Scenario 3: File Write Permission Denied
Symptom:
Error: IOError: Permission denied writing to /output/transcript-meeting-001/03-speakers.md
Status: ts-formatter failed (partial completion)
Files created: 00-02
Files failed: 03-07Root Cause: Output directory or parent directory has incorrect permissions.
Recovery:
# Fix directory permissions
chmod 755 /output/transcript-meeting-001/
# Resume formatting phase (will skip existing files 00-02)
uv run jerry transcript resume --from formatting --packet-path /output/transcript-meeting-001/
# OR start fresh with correct permissions
chmod 755 /output/
uv run jerry transcript parse meeting.vtt --output-dir /output/meeting-002/Prevention: Ensure write permissions on output directory before invoking skill.
Scenario 4: Quality Gate Failure (Score < 0.90)
Symptom:
Quality Score: 0.68 (threshold: 0.90)
Status: quality_output.passed = false
Issues:
- T-004: Low citation coverage (42%)
- T-006: Token limit exceeded in 02-transcript-part-003.mdRoot Cause: Extraction quality issues or token budget violations.
Recovery:
Step 1: Review quality-review.md for specific issues
cat /output/transcript-meeting-001/quality-review.mdStep 2: Address each issue:
For T-004 (citation coverage):
# Re-run extraction with stricter citation enforcement
uv run jerry transcript resume --from extraction --enforce-citationsFor T-006 (token limits):
# Re-run formatting with stricter splitting
uv run jerry transcript resume --from formatting --split-at-soft-limitStep 3: Re-run quality review:
uv run jerry transcript resume --from quality-reviewPrevention: Use --model-extractor opus for better extraction quality.
Scenario 5: Agent Timeout (Long Transcripts)
Symptom:
Error: Agent timeout after 300 seconds
Status: ts-extractor failed
Progress: Processed 8/15 chunks before timeoutRoot Cause: Very large transcript (> 10,000 segments) exceeds default timeout.
Recovery:
# Increase timeout for extraction phase
uv run jerry transcript parse meeting.vtt --extractor-timeout 600
# OR split transcript into smaller files
uv run python scripts/split_vtt.py meeting.vtt --max-duration 60 # 60 min chunks
uv run jerry transcript parse meeting-part-001.vtt
uv run jerry transcript parse meeting-part-002.vtt
# ... process each part separatelyPrevention: For transcripts > 2 hours, consider splitting into segments.
Scenario 6: Partial Extraction (Low Confidence)
Symptom:
Warning: 45% of action items have confidence < 0.70
Status: ts-extractor succeeded (with warnings)Root Cause: Ambiguous language, implicit action items, or poor transcript quality.
Recovery:
# Option 1: Use Opus model for better semantic understanding
uv run jerry transcript parse meeting.vtt --model-extractor opus
# Option 2: Lower confidence threshold to include more entities
uv run jerry transcript parse meeting.vtt --confidence-threshold 0.5
# Option 3: Review uncertain entities manually
cat /output/extraction-report.json | jq '.action_items[] | select(.confidence < 0.70)'Prevention: Improve automatic transcription quality (better audio, less background noise).
Scenario 7: Mindmap Syntax Validation Failure
Symptom:
Error: Mermaid syntax validation failed: Invalid node syntax at line 12
Status: ts_mindmap_output.mermaid.status = "failed"
Status: ts_mindmap_output.ascii.status = "complete"Root Cause: Markdown links used in Mermaid node text (ADR-003 limitation).
Recovery:
# Regenerate Mermaid mindmap with plain text enforcement
uv run jerry transcript mindmap --format mermaid --packet-path /output/ --plain-text-only
# OR use ASCII mindmap (already succeeded)
cat /output/08-mindmap/mindmap.ascii.txtPrevention: ts-mindmap-mermaid agent should strip markdown syntax before node generation.
Scenario 8: File Conflict (Output Already Exists)
Symptom:
Error: Output directory already exists: /output/transcript-meeting-001/
Conflict: 00-index.md, 01-summary.md (8 files total)Root Cause: Previous run output not cleared before re-running.
Recovery:
# Option 1: Delete existing output (CAUTION: data loss)
rm -rf /output/transcript-meeting-001/
uv run jerry transcript parse meeting.vtt --output-dir /output/transcript-meeting-001/
# Option 2: Use timestamped directory (recommended)
uv run jerry transcript parse meeting.vtt --output-dir /output/transcript-meeting-001-$(date +%Y%m%d-%H%M%S)/
# Option 3: Merge mode (preserve existing files, update only changed)
uv run jerry transcript parse meeting.vtt --output-dir /output/transcript-meeting-001/ --mergePrevention: Use --output-dir with unique names or timestamps.
Scenario 9: Missing Dependencies (Python Modules)
Symptom:
Error: ModuleNotFoundError: No module named 'tiktoken'
Status: ts-parser failed (cannot perform token counting)Root Cause: UV environment missing required Python dependencies.
Recovery:
# Sync UV dependencies
uv sync
# Verify installation
uv pip list | grep tiktoken
# Re-run parsing
uv run jerry transcript parse meeting.vttPrevention: Run uv sync after pulling changes or switching branches.
Scenario 10: Orchestration State Corruption
Symptom:
Error: State key mismatch: ts_extractor_output.action_count (23) != len(action_items) in extraction-report.json (25)
Violation: INV-EXT-001 (entity count consistency)
Status: ts-formatter refused to proceedRoot Cause: Agent violated state schema invariant (likely a bug).
Recovery:
# Option 1: Regenerate extraction report from chunks
rm /output/extraction-report.json
uv run jerry transcript resume --from extraction --packet-path /output/
# Option 2: Skip validation (CAUTION: may propagate errors)
uv run jerry transcript resume --from formatting --skip-validation
# Option 3: Report bug with state dump
uv run jerry transcript debug --dump-state /output/ > state-dump.json
# Attach state-dump.json to bug reportPrevention: This is an agent bug - report to maintainers with reproduction steps.
Error Handling
| Error | Detection | Recovery |
|---|---|---|
| File not found | OS exception | Return clear error message |
| Unknown format | Auto-detect fails | Fallback to plain text parser |
| Encoding error | Decode exception | Try UTF-8, Windows-1252, ISO-8859-1 |
| Token overflow | Count > soft limit | Split at semantic boundary |
| Low confidence | score < threshold | Include in "uncertain" section |
| Quality failure | score < 0.90 | Report issues for human review |
File Persistence Requirements (P-002 Compliance)
CRITICAL: All transcript skill outputs MUST be persisted to the filesystem.
Violating P-002 (File Persistence) is a constitutional violation.
Mandatory Artifacts by Agent
ts-parser v2.0:
| Artifact | Path | Size | Purpose | Required |
|---|---|---|---|---|
| Canonical JSON | canonical-transcript.json |
~930KB | Archive, programmatic access | ✓ Yes |
| Index JSON | index.json |
~8KB | Chunk metadata, agent input | ✓ Yes |
| Chunk Files | chunks/chunk-NNN.json |
~130KB each | LLM-processable segments | ✓ Yes |
ts-extractor:
| Artifact | Path | Size | Purpose | Required |
|---|---|---|---|---|
| Extraction Report | extraction-report.json |
~35KB | Entity extraction results | ✓ Yes |
ts-formatter:
| Artifact | Path | Size | Purpose | Required |
|---|---|---|---|---|
| 00-index.md | packet/00-index.md |
~2KB | Navigation hub | ✓ Yes |
| 01-summary.md | packet/01-summary.md |
~5KB | Executive summary | ✓ Yes |
| 02-transcript.md | packet/02-transcript.md |
Variable | Full transcript (may split) | ✓ Yes |
| 03-speakers.md | packet/03-speakers.md |
~3KB | Speaker directory | ✓ Yes |
| 04-action-items.md | packet/04-action-items.md |
~4KB | Action items with citations | ✓ Yes |
| 05-decisions.md | packet/05-decisions.md |
~3KB | Decisions with context | ✓ Yes |
| 06-questions.md | packet/06-questions.md |
~2KB | Open questions | ✓ Yes |
| 07-topics.md | packet/07-topics.md |
~3KB | Topic segments | ✓ Yes |
| _anchors.json | packet/_anchors.json |
~5KB | Anchor registry for linking | ✓ Yes |
ts-mindmap-mermaid (conditional):
| Artifact | Path | Size | Purpose | Required |
|---|---|---|---|---|
| Mermaid Mindmap | packet/08-mindmap/mindmap.mmd |
~10KB | Mermaid visualization | If --mindmap-format=mermaid or both |
ts-mindmap-ascii (conditional):
| Artifact | Path | Size | Purpose | Required |
|---|---|---|---|---|
| ASCII Mindmap | packet/08-mindmap/mindmap.ascii.txt |
~8KB | Terminal-friendly visualization | If --mindmap-format=ascii or both |
ps-critic:
| Artifact | Path | Size | Purpose | Required |
|---|---|---|---|---|
| Quality Review | packet/quality-review.md |
~15KB | Quality validation results | ✓ Yes |
Agent File Persistence Checklist
Before completing execution, EACH agent MUST:
ts-parser:
- [ ] Write canonical-transcript.json to disk
- [ ] Write index.json to disk
- [ ] Create chunks/ directory
- [ ] Write all chunk-NNN.json files
- [ ] Verify all files exist and are readable
- [ ] Report file paths in ts_parser_output state
ts-extractor:
- [ ] Write extraction-report.json to disk
- [ ] Verify file exists and is valid JSON
- [ ] Verify entity counts match array lengths (INV-EXT-001)
- [ ] Report file path in ts_extractor_output state
ts-formatter:
- [ ] Create packet directory (mkdir -p)
- [ ] Write ALL 8 core files (00-index.md through 07-topics.md)
- [ ] Write _anchors.json
- [ ] Create split files if any file exceeds soft limit (31.5K tokens)
- [ ] Verify all files exist
- [ ] Verify all files are under hard limit (35K tokens per ADR-004)
- [ ] Report file list in ts_formatter_output.files_created
ts-mindmap-mermaid:
- [ ] Create 08-mindmap/ directory (if not exists)
- [ ] Write mindmap.mmd file
- [ ] Verify Mermaid syntax is valid
- [ ] Verify deep link reference block is present
- [ ] Report file path in ts_mindmap_output.mermaid.path
ts-mindmap-ascii:
- [ ] Ensure 08-mindmap/ directory exists
- [ ] Write mindmap.ascii.txt file
- [ ] Verify all lines are <= 80 characters
- [ ] Verify legend is present
- [ ] Report file path in ts_mindmap_output.ascii.path
ps-critic:
- [ ] Write quality-review.md to packet directory
- [ ] Include all validation criteria results
- [ ] List issues and recommendations
- [ ] Report quality score in quality_output statePersistence Failure Recovery
If an agent fails to persist:
- Detection: Post-completion check fails (file_must_exist validation)
- Response: Agent returns error status, pipeline halts
- Recovery: Fix the persistence issue, retry agent execution
- Never: Proceed to next agent without verifying file existence
Agent Self-Critique Protocol
PURPOSE: Pre-finalization quality checks per Jerry Constitution P-001 (Truth and Accuracy)
Each agent MUST perform self-critique before reporting completion.
Universal Self-Critique Checklist (All Agents)
Before reporting SUCCESS, EVERY agent must verify:
Constitutional Compliance:
- [ ] P-001: Is my output accurate and truthful?
- [ ] P-002: Have I persisted ALL required artifacts to files?
- [ ] P-003: Did I avoid spawning recursive subagents?
- [ ] P-004: Do all my extractions have citations to sources?
- [ ] P-010: Is my state output consistent with actual results?
- [ ] P-020: Did I respect user decisions (flags, paths)?
- [ ] P-022: Have I been honest about limitations and errors?
Quality Gates:
- [ ] Are my output files readable and valid (JSON/Markdown)?
- [ ] Did I include all required metadata/frontmatter?
- [ ] Are my reported counts/stats mathematically correct?
- [ ] Did I document any warnings or errors in state?Agent-Specific Self-Critique
ts-parser v2.0 (F-004 Quantitative Thresholds):
Pre-Completion Checks:
- [ ] Format detection is accurate (VTT vs SRT vs plain)
- [ ] If VTT: Did I successfully delegate to Python parser?
- [ ] If Python failed: Did I fall back to LLM parsing?
- [ ] Schema validation passed (validation_passed = true)
- [ ] Chunk files created (chunk_count matches actual files)
- [ ] All segments have sequential IDs (seg-001, seg-002, ...)
- [ ] parse_metadata includes warnings/errors (if any)
- [ ] No fabricated timestamps for plain text files (P-001)
Critical Validations (Quantitative):
- [ ] canonical-transcript.json exists and is valid JSON (file size > 0 bytes)
- [ ] index.json exists and matches chunk file count (schema_version == "1.0")
- [ ] chunks/ directory contains all chunk-NNN.json files (count >= 1)
- [ ] segment_count == sum of segments across all chunks (exact match, tolerance: 0)
- [ ] Each chunk token count ≤ 35,000 tokens (hard limit per ADR-004)
- [ ] Chunk token count target: 18,000 ± 2,000 tokens (target_tokens parameter)
- [ ] Speaker count ≥ 1 (at least one speaker identified)
- [ ] Total duration > 0 ms (for VTT/SRT formats; null acceptable for plain text)
- [ ] Warning count ≤ 5 (excessive warnings indicate quality issues)
- [ ] Error count == 0 for status "success" (no errors if claiming success)ts-extractor (F-004 Quantitative Thresholds):
Pre-Completion Checks:
- [ ] INV-EXT-001: extraction_stats counts MATCH array lengths (exact, tolerance: 0)
assert action_count == len(action_items)
assert decision_count == len(decisions)
assert question_count == len(questions)
assert topic_count == len(topics)
assert speaker_count == len(speakers)
- [ ] INV-EXT-002: Questions are semantic, not syntactic (no rhetorical)
- [ ] ALL extractions have citations (100% coverage, 0 missing citations)
- [ ] ALL citations reference existing segments in input (100% validity)
- [ ] Confidence scores are calibrated honestly (P-022)
- [ ] No hallucinated entities (all sourced from transcript)
Critical Validations (Quantitative):
- [ ] extraction-report.json exists and is valid JSON (file size > 1KB)
- [ ] input_format = "chunked" (v2.0 requirement)
- [ ] chunk_metadata.chunks_processed == expected count (exact match)
- [ ] All entity arrays are non-null (may be empty, but not null)
- [ ] High-confidence ratio ≥ 0.70 (70% of extractions with confidence ≥ 0.80)
- [ ] Average confidence ≥ 0.75 (mean across all entities)
- [ ] Citation coverage == 100% (every entity has at least one citation)
- [ ] Tier 1 (rule-based) count ≥ 30% (at least 30% high-confidence extractions)
- [ ] Tier 3 (LLM) confidence ≥ 0.60 (LLM extractions meet minimum threshold)
- [ ] Extraction count bounds: 0 ≤ entity_count ≤ 1000 (sanity check)
- [ ] Speaker count ≥ 1 (at least one speaker in transcript)
- [ ] Topic count ≥ 1 (at least one topic identified)ts-formatter (F-004 Quantitative Thresholds):
Pre-Completion Checks:
- [ ] ALL 8 core files created (00-index.md through 07-topics.md)
- [ ] _anchors.json created with complete anchor registry
- [ ] NO file exceeds 35K tokens (hard limit per ADR-004)
- [ ] Split files have proper navigation (prev/next links)
- [ ] All internal links resolve (no broken anchors, 100% resolution)
- [ ] All entity files link to source segments (bidirectional, 100% coverage)
- [ ] Schema version metadata in all frontmatter (PAT-005)
- [ ] files_created list matches actual directory contents
Critical Validations (Quantitative):
- [ ] packet_path directory exists (verified via os.path.exists)
- [ ] total_tokens == sum of tokens across all files (exact match, tolerance: ±10)
- [ ] all_files_under_limit == true (ALL files ≤ 35,000 tokens)
- [ ] File token limits:
- Soft limit: 31,500 tokens (split at this boundary if possible)
- Hard limit: 35,000 tokens (MUST NOT exceed)
- [ ] anchor_count > 0 (at least segment anchors exist)
- [ ] backlink_count > 0 (at least one backlink per entity type)
- [ ] File count: 8 ≤ files_created.length ≤ 20 (8 core + up to 12 split files)
- [ ] Split file ratio ≤ 0.50 (no more than 50% of files are splits)
- [ ] 00-index.md token count ≤ 5,000 (navigation hub must be concise)
- [ ] 01-summary.md token count ≤ 8,000 (executive summary constraint)
- [ ] _anchors.json size < 50KB (anchor registry should be compact)
- [ ] Internal link resolution rate == 100% (no broken links tolerated)
- [ ] Entity backlink coverage ≥ 95% (at least 95% of entities have source links)ts-mindmap-mermaid (F-004 Quantitative Thresholds):
Pre-Completion Checks:
- [ ] Mermaid syntax is valid (mindmap keyword at start, line 1)
- [ ] Root node uses ((double parentheses)) (line 2)
- [ ] ALL nodes use plain text ONLY (no markdown links, 0 violations)
- [ ] Deep link reference block appended as %% comments (present at end)
- [ ] Indentation is consistent (2 spaces per level, 100% compliance)
- [ ] Entity symbols are correct (no confusion between →, ?, !, ✓)
- [ ] Topic overflow handled if > 50 topics (truncation with note)
Critical Validations (Quantitative):
- [ ] 08-mindmap/mindmap.mmd exists (file size > 500 bytes)
- [ ] status = "complete" in ts_mindmap_output
- [ ] topic_count matches extraction report (exact match, tolerance: 0)
- [ ] Topic count bounds: 1 ≤ topic_count ≤ 50 (overflow if > 50)
- [ ] Deep link count > 0 (at least one deep link in reference block)
- [ ] Deep link count ≤ topic_count * 3 (max 3 links per topic)
- [ ] Syntax validation passed (no Mermaid parser errors)
- [ ] File line count ≥ 10 (minimum viable mindmap structure)
- [ ] Indentation depth ≤ 4 levels (hierarchy constraint per ADR-003)
- [ ] Node text length ≤ 100 characters (readability constraint)
- [ ] P-022: Documented syntax limitation (no markdown links in nodes)ts-mindmap-ascii (F-004 Quantitative Thresholds):
Pre-Completion Checks:
- [ ] ALL lines are <= 80 characters (terminal compatibility, 100% compliance)
- [ ] Box-drawing characters are valid UTF-8 (U+2500 block, no ASCII fallback)
- [ ] Legend is present at bottom explaining symbols (minimum 4 lines)
- [ ] Tree structure is visually balanced (no excessive left-skew)
- [ ] Entity symbols match specification ([→], [?], [!], etc.)
- [ ] No truncation artifacts (ellipsis applied correctly at 77 chars)
Critical Validations (Quantitative):
- [ ] 08-mindmap/mindmap.ascii.txt exists (file size > 300 bytes)
- [ ] status = "complete" in ts_mindmap_output
- [ ] max_line_width <= 80 (hard constraint)
- [ ] Typical line width: 60-75 characters (readability range)
- [ ] box_drawing_valid == true (UTF-8 validation passed)
- [ ] Topic count bounds: 1 ≤ topic_count ≤ 50 (same as Mermaid)
- [ ] File line count ≥ 15 (minimum viable ASCII mindmap)
- [ ] Legend line count == 4-6 lines (standard legend format)
- [ ] Indentation step == 2 spaces (consistent with Mermaid)
- [ ] Tree depth ≤ 4 levels (same hierarchy constraint as Mermaid)
- [ ] Entity symbol count ≥ topic_count (at least one entity per topic)ps-critic (F-004 Quantitative Thresholds):
Pre-Completion Checks:
- [ ] Quality score calculated correctly (0.0-1.0 range)
- [ ] ALL validation criteria evaluated (minimum 15 criteria for full packet)
- [ ] Mindmap criteria (MM-*, AM-*) applied if mindmaps present (8 additional criteria)
- [ ] Issues list is actionable (specific, not vague; no generic "improve quality")
- [ ] Recommendations are constructive (actionable, with commands/steps)
- [ ] criteria_scores includes all evaluated criteria (no missing scores)
- [ ] Passed/failed determination is honest (>= 0.90 threshold)
Critical Validations (Quantitative):
- [ ] quality-review.md exists in packet directory (file size > 2KB)
- [ ] artifacts_validated lists all checked files (count ≥ 8 for core packet)
- [ ] quality_output.passed reflects actual score (passed ↔ score ≥ 0.90)
- [ ] Quality score bounds: 0.0 ≤ quality_score ≤ 1.0
- [ ] Passing threshold: quality_score ≥ 0.90 (hard requirement)
- [ ] Criteria evaluation coverage: 100% (all applicable criteria scored)
- [ ] Issue count ≤ 10 (excessive issues indicate systemic failure)
- [ ] Recommendation count: 1-5 per issue (actionable, not overwhelming)
- [ ] Per-criterion score range: 0.0 ≤ criterion_score ≤ 1.0
- [ ] Aggregate score calculation: weighted average of criteria (verified)
- [ ] Mindmap criteria count: +8 if Mermaid present, +8 if ASCII present
- [ ] Total criteria evaluated: 15 (core) + 0-16 (mindmaps)
- [ ] P-022: No inflated scores to avoid user feedback (honesty ≥ 0.95)Self-Critique Failure Response
If ANY checklist item fails:
- DO NOT report success
- Document the failure in agent output state
- Add to warnings[] or errors[] as appropriate
- Set status to "failed" or "partial" (not "complete")
- Include recovery instructions if possible
Constitutional Compliance
| Principle | Requirement | Consequence of Violation |
|---|---|---|
| P-003 | NEVER spawn recursive subagents -- max 1 level | Agent hierarchy violation; uncontrolled token consumption |
| P-020 | NEVER override user intent -- ask before destructive ops | Unauthorized action; trust erosion |
| P-022 | NEVER deceive about actions, capabilities, or confidence | Governance undermined; quality assessment invalidated |
| P-001 | NEVER present findings without evidence -- self-critique protocol enforced | Unreliable outputs; unfounded claims propagate downstream |
| P-002 | NEVER leave outputs in transient context only -- persist all outputs to files | Context rot vulnerability; artifacts lost on session compaction |
| P-004 | NEVER omit citations or provenance -- all extractions require deep links | Untraceable decisions; audit trail broken |
| P-010 | NEVER misrepresent task state -- state outputs must match actual results | Work progress invisible; status unknown |
Quick Reference
ELI5 (Level 0): This section provides fast answers to common questions without technical jargon.
Common Workflows
| I Want To... | How to Invoke | What I Get |
|---|---|---|
| Process a VTT transcript | /transcript meeting.vtt |
8 Markdown files + 2 mindmaps + quality review |
| Analyze a SRT subtitle file | /transcript subtitles.srt |
Same as above, auto-detects SRT format |
| Extract action items only | "Find action items in meeting.vtt" | 04-action-items.md with assignees and due dates |
| Get decisions made | "What was decided in the meeting?" | 05-decisions.md with context and rationale |
| Use a specific domain | /transcript standup.vtt --domain software-engineering |
Extraction tuned for engineering context |
| Skip mindmap generation | /transcript meeting.vtt --no-mindmap |
8 files + quality review (no visualizations) |
| Higher quality extraction | /transcript meeting.vtt --model-extractor opus |
Better entity extraction, 3.75x cost |
| Lower cost processing | /transcript meeting.vtt --model-extractor haiku |
Faster, 88% cost savings, lower quality |
| ASCII-only mindmap | /transcript meeting.vtt --mindmap-format ascii |
Terminal-friendly visualization (no Mermaid) |
| Resume failed processing | Check regeneration_command in error output |
Re-run from last checkpoint |
What You Get (Output Files)
Core Packet (8 files):
transcript-{id}/
├── 00-index.md ← Start here: Navigation + stats
├── 01-summary.md ← Executive summary
├── 02-transcript.md ← Full transcript with timestamps
├── 03-speakers.md ← Speaker directory
├── 04-action-items.md ← Action items with assignees
├── 05-decisions.md ← Decisions with context
├── 06-questions.md ← Open questions
├── 07-topics.md ← Topic-based navigation
└── _anchors.json ← Deep linking registryMindmaps (default: both formats):
08-mindmap/
├── mindmap.mmd ← Mermaid diagram (rendered by tools)
└── mindmap.ascii.txt ← ASCII art (terminal-friendly)Quality Assurance:
quality-review.md ← Validation results (>= 0.90 score)Processing Time Estimates
| Transcript Size | Segments | Duration | VTT (Python) | SRT/TXT (LLM) |
|---|---|---|---|---|
| Small | < 500 | < 30 min | ~5 seconds | ~30 seconds |
| Medium | 500-2000 | 30-90 min | ~10 seconds | ~2 minutes |
| Large | 2000-5000 | 90-180 min | ~20 seconds | ~5 minutes |
| Very Large | > 5000 | > 3 hours | ~30 seconds | ~10 minutes |
Note: VTT files use deterministic Python parser (1,250x faster than LLM).
SRT and plain text files require LLM processing (slower but still sub-minute for most).
Cost Estimates (Per 10,000 Input Tokens)
| Configuration | Estimated Cost | Quality Trade-off |
|---|---|---|
| Default (mixed) | ~$0.12 | Balanced - recommended for most use cases |
| All Haiku | ~$0.015 | 88% savings, ~70-75% extraction accuracy |
| All Sonnet | ~$0.15 | Baseline quality, ~85-90% accuracy |
| All Opus | ~$0.75 | Highest quality (6x cost), ~95%+ accuracy |
| Extractor Opus only | ~$0.45 | Targeted quality boost (3.75x cost for extractions) |
Cost Breakdown (Default):
- ts-parser (haiku): ~$0.0025 - Simple orchestration
- ts-extractor (sonnet): ~$0.06 - Most token-intensive
- ts-formatter (sonnet): ~$0.03 - Template-based generation
- ts-mindmap-* (sonnet): ~$0.015 - Hierarchical reasoning
- ps-critic (sonnet): ~$0.015 - Quality evaluation
Quality Thresholds
| Metric | Target | What It Means | How to Improve |
|---|---|---|---|
| Extraction precision | > 85% | % of extracted entities that are correct | Use --model-extractor opus |
| Extraction recall | > 85% | % of actual entities successfully extracted | Improve transcript quality (better audio) |
| Confidence threshold | >= 0.7 | Minimum confidence to include entity | Increase threshold for higher precision |
| Quality score | >= 0.90 | ps-critic aggregate validation score | Fix issues listed in quality-review.md |
Troubleshooting Common Issues
| Problem | Likely Cause | Solution |
|---|---|---|
| "File not found" error | Incorrect path | Use absolute path or check working directory |
| Parsing fails (VTT) | Encoding issue | File may be UTF-16 (use --format srt to force LLM) |
| Low quality score | Poor transcript quality | Review quality-review.md for specific issues |
| Missing action items | Implicit language | Use --model-extractor opus for better semantic understanding |
| Mindmap syntax error | Mermaid validation failed | Check mindmap.mmd for plain text node requirement |
| ASCII mindmap too wide | Terminal width < 80 chars | Resize terminal or use Mermaid version |
| High cost | Using Opus for all agents | Switch to default or use --model-extractor opus only |
| Context window exceeded | File too large (rare) | Python chunker should prevent this; report as bug |
Orchestration Troubleshooting (F-005)
PURPOSE: Diagnose and resolve pipeline orchestration failures.
Use this section when the skill appears "stuck" or agents aren't executing in sequence.
Issue: Pipeline Stuck After ts-parser
Symptom:
ts-parser completed successfully
Status: ts_parser_output.validation_passed = true
Expected: ts-extractor should start automatically
Actual: No activity, pipeline appears frozenDiagnostic Steps:
- Check state propagation:
# Verify ts_parser_output state key exists
cat /output/.skill-state.json | jq '.ts_parser_output'- Verify required files exist:
ls -lh /output/index.json
ls -lh /output/chunks/- Check for blocking errors:
cat /output/.skill-state.json | jq '.ts_parser_output.errors'Resolution:
If state key missing:
# Orchestration bug - restart from parsing
uv run jerry transcript parse <file> --output-dir /output/If files missing despite validation_passed = true:
# File persistence violation (P-002) - report as bug
uv run jerry transcript debug --validate-persistence /output/If errors[] is non-empty despite status = "success":
# State inconsistency - force re-run extraction
uv run jerry transcript resume --from extraction --forceIssue: Agent Timeout During Extraction
Symptom:
ts-extractor started processing chunks
Progress: 8/15 chunks processed
Error: Agent timeout after 300 secondsDiagnostic Steps:
- Check chunk sizes:
du -h /output/chunks/*.json
# Look for chunks > 200KB (may exceed token limits)- Check partial extraction report:
ls -lh /output/extraction-report.json
# File may exist but be incomplete- Estimate remaining time:
# Average 30-40 seconds per chunk for large chunks
# Remaining: (15 - 8) * 35 = 245 seconds ≈ 4 minutesResolution:
Increase timeout:
uv run jerry transcript parse <file> --extractor-timeout 600 # 10 minutesOR reduce chunk size (re-parse):
uv run jerry transcript parse <file> --chunk-target-tokens 12000 # Smaller chunksIssue: Quality Gate Fails Immediately
Symptom:
ps-critic started
Error: Missing required files - cannot validate
Status: quality_output.status = "failed"Diagnostic Steps:
- Check packet directory structure:
ls -lh /output/transcript-*/
# Should have 00-07 files + _anchors.json- Verify ts_formatter_output.files_created:
cat /output/.skill-state.json | jq '.ts_formatter_output.files_created'- Check for file write errors:
cat /output/.skill-state.json | jq '.ts_formatter_output.errors'Resolution:
If files incomplete:
# Re-run formatting phase
uv run jerry transcript resume --from formatting --packet-path /output/transcript-*/If permission errors:
chmod -R 755 /output/transcript-*/
uv run jerry transcript resume --from formattingIssue: File Conflicts Prevent Execution
Symptom:
Error: Output directory already exists: /output/transcript-meeting-001/
Conflict detected - refusing to overwriteDiagnostic Steps:
- Check existing files:
ls -lh /output/transcript-meeting-001/- Verify if files are from previous run:
stat /output/transcript-meeting-001/00-index.md
# Check modification timestampResolution:
Option 1: Archive existing output:
mv /output/transcript-meeting-001/ /output/archive/transcript-meeting-001-$(date +%Y%m%d-%H%M%S)/
uv run jerry transcript parse <file> --output-dir /output/transcript-meeting-001/Option 2: Use timestamped directory:
uv run jerry transcript parse <file> --output-dir /output/transcript-meeting-001-$(date +%Y%m%d-%H%M%S)/Option 3: Merge mode (update only changed files):
uv run jerry transcript parse <file> --output-dir /output/transcript-meeting-001/ --mergeIssue: Mindmap Generation Partial Failure
Symptom:
ts-mindmap-mermaid: failed
ts-mindmap-ascii: complete
Status: ts_mindmap_output.overall_status = "partial"Diagnostic Steps:
- Check Mermaid error message:
cat /output/.skill-state.json | jq '.ts_mindmap_output.mermaid.error_message'- Verify ASCII mindmap is usable:
cat /output/08-mindmap/mindmap.ascii.txt- Check if Mermaid file exists (but invalid):
ls -lh /output/08-mindmap/mindmap.mmd
head -20 /output/08-mindmap/mindmap.mmd # Inspect syntaxResolution:
If syntax error (markdown links in nodes):
# Regenerate with plain text enforcement
uv run jerry transcript mindmap --format mermaid --packet-path /output/ --plain-text-onlyIf acceptable to use ASCII only:
# No action required - ASCII mindmap is valid
cat /output/08-mindmap/mindmap.ascii.txtIf both formats required:
# Debug Mermaid generation
uv run jerry transcript mindmap --format mermaid --packet-path /output/ --debug
# Fix issues manually, then validate
uv run jerry transcript validate-mindmap /output/08-mindmap/mindmap.mmdIssue: State Key Mismatch (INV-EXT-001 Violation)
Symptom:
Error: State key mismatch detected
ts_extractor_output.action_count = 23
len(action_items) in extraction-report.json = 25
Violation: INV-EXT-001 (entity count consistency)
ts-formatter refused to proceedDiagnostic Steps:
- Dump extraction report:
cat /output/extraction-report.json | jq '.extraction_stats'
cat /output/extraction-report.json | jq '.action_items | length'- Verify state key:
cat /output/.skill-state.json | jq '.ts_extractor_output.action_count'- Compare values:
# Should be exactly equal (tolerance: 0)Resolution:
Option 1: Regenerate extraction report (preferred):
rm /output/extraction-report.json
uv run jerry transcript resume --from extraction --packet-path /output/Option 2: Report bug with diagnostic info:
uv run jerry transcript debug --dump-state /output/ > state-dump.json
# Attach state-dump.json to bug report
# Include extraction-report.json and .skill-state.jsonOption 3: Skip validation (CAUTION - may propagate errors):
# Only use if deadline-critical and you understand the risk
uv run jerry transcript resume --from formatting --skip-validationIssue: Python Parser Fails, Fallback Doesn't Trigger
Symptom:
Python VTT parser failed: ModuleNotFoundError: No module named 'webvtt'
Expected: Fallback to LLM parsing
Actual: Pipeline halts with fatal errorDiagnostic Steps:
- Check Python dependencies:
uv pip list | grep webvtt
uv pip list | grep tiktoken- Verify UV environment:
uv sync --verbose- Check fallback configuration:
cat /output/.skill-state.json | jq '.ts_parser_output.fallback_triggered'Resolution:
Fix dependencies (preferred):
uv sync
uv run jerry transcript parse <file> --output-dir /output/Force LLM parsing (bypass Python parser):
uv run jerry transcript parse <file> --force-llm --output-dir /output/Manual fallback:
# If Python parser consistently fails, use SRT format override
uv run jerry transcript parse <file> --format srt --output-dir /output/
# This forces LLM parsing even for VTT filesCross-Skill Integration: /problem-solving
The transcript skill integrates with the /problem-solving skill for quality validation.
ps-critic Agent Usage
What is ps-critic?
- ps-critic is a problem-solving agent from the
/problem-solvingskill - Role: Quality Inspector (validates transcript outputs against criteria)
- Specialization: Systematic review using 5W2H, Ishikawa, Pareto frameworks
How transcript skill uses ps-critic:
Phase 4: Quality Review
─────────────────────────
Invoke: ps-critic (from /problem-solving skill)
Input: All packet files (00-07) + mindmaps (08-)
Extension: skills/transcript/validation/ts-critic-extension.md
Threshold: 0.90Why ps-critic, not a transcript-specific validator?
- Reusability - ps-critic validates ANY skill's outputs
- Framework-driven - Uses proven quality frameworks (5W2H, etc.)
- Extensibility - ts-critic-extension.md adds transcript-specific criteria
Extension mechanism:
# ts-critic-extension.md (Transcript-Specific Criteria)
T-001: Timestamp completeness (>= 95% of segments have timestamps)
T-002: Speaker attribution (>= 90% of segments attributed)
T-003: Segment ordering (chronological order maintained)
T-004: Citation coverage (>= 95% of extractions have citations)
T-005: Confidence scores (>= 70% of extractions >= 0.7 confidence)
T-006: Token budget compliance (all files < 35K tokens)
MM-001: Mermaid syntax validation (mindmap is valid Mermaid)
MM-002: Deep link reference block (contains anchor mapping)
AM-001: ASCII line width (all lines <= 80 characters)
AM-002: Box drawing characters (UTF-8 box-drawing used correctly)See also:
- /problem-solving skill - Full documentation
- ps-critic agent - Agent definition
- ts-critic-extension.md - Transcript criteria
Cross-Skill Integration: /orchestration
The transcript skill uses orchestration patterns from the /orchestration skill.
Multi-Phase Pipeline Management
What is /orchestration?
- Framework for coordinating multi-agent workflows
- Provides sync barriers, state checkpointing, cross-pollinated pipelines
- Used when work has dependencies and requires coordination
How transcript skill uses orchestration:
Transcript Pipeline (v2.1):
──────────────────────────
Phase 1: Parse + Chunk → ts-parser
↓ (Sync Barrier: Wait for index.json)
Phase 2: Extract Entities → ts-extractor
↓ (Sync Barrier: Wait for extraction-report.json)
Phase 3: Format Packet → ts-formatter
↓ (Sync Barrier: Wait for packet files)
Phase 3.5: Generate Mindmaps → ts-mindmap-*
↓ (Sync Barrier: Wait for mindmap files)
Phase 4: Quality Review → ps-criticOrchestration patterns used:
| Pattern | Usage in Transcript Skill |
|---|---|
| Sequential Phases | 5 phases run in strict order |
| State Passing | Each phase outputs state for next |
| Sync Barriers | Wait for file existence before proceeding |
| Graceful Degradation | Mindmap failures don't block Phase 4 |
| Checkpoint Recovery | Can resume from any phase |
Example: Sync Barrier Implementation
After Phase 1 (ts-parser):
─────────────────────────
1. Wait for index.json to exist
2. Validate index.json schema
3. IF valid: Proceed to Phase 2
4. ELSE: Retry or abortSee also:
- /orchestration skill - Full documentation
- ORCHESTRATION_PLAYBOOK.md - Step-by-step guide
- Sync Barrier Pattern - Pattern details
Cross-Skill Integration: /nasa-se
The transcript skill applies NASA Systems Engineering principles for quality assurance.
Requirements Traceability
What is /nasa-se?
- Framework based on NPR 7123.1D (NASA Systems Engineering)
- Provides verification & validation (V&V) processes
- Used for requirements traceability and quality gates
How transcript skill uses NASA SE:
Quality Gate (ps-critic Phase):
────────────────────────────────
Criterion T-004: Citation Coverage >= 95%
Traceable to:
- REQ-EXT-003: "All extractions MUST include source citations"
- PAT-004: Citation Required Pattern
- ADR-003: Anchor Registry specification
Verification method:
1. Count total extractions in extraction-report.json
2. Count extractions with non-null citation field
3. Calculate ratio: citations / total
4. IF ratio >= 0.95: PASS
5. ELSE: FAIL with specific missing citations listV&V Framework Application:
| NASA SE Process | Transcript Skill Implementation |
|---|---|
| Requirements Analysis | ADR-002 (packet structure), ADR-003 (anchors) |
| Design Verification | Architecture tests (hexagonal compliance) |
| Implementation Validation | ps-critic quality review (0.90 threshold) |
| Traceability | Criterion → REQ → Pattern → Implementation |
Quality Criteria Traceability Example:
MM-001: Mermaid syntax validation
├─ Requirement: REQ-VIZ-001 (mindmaps must render correctly)
├─ Design: ADR-006 Section 5.3 (Mermaid syntax rules)
├─ Pattern: PAT-VIZ-001 (declarative visualization)
├─ Implementation: ts-mindmap-mermaid agent
└─ Validation: ps-critic MM-001 criterion
Verification:
1. Parse mindmap.mmd with Mermaid library
2. Check for syntax errors
3. Validate structure (root → branches → leaves)
4. IF errors: FAIL (quality score penalty)
5. ELSE: PASSSee also:
- /nasa-se skill - Full documentation
- NPR 7123.1D - NASA SE Handbook
- V&V Framework - V&V processes
Related Documents
Backlinks
- TDD-transcript-skill.md - Architecture overview
- TDD-FEAT-004 - Hybrid Infrastructure Technical Design (v2.0 basis)
- ADR-001 - Agent Architecture
- ADR-002 - Artifact Structure
- ADR-006 - Mindmap Pipeline Integration (v2.1 basis)
- DISC-009 - Agent-Only Architecture Limitation (v2.0 rationale)
- EN-025 - v2.0 Integration Enabler
- EN-026 - Token-Based Chunking (BUG-001 fix)
- DISC-001 - JSON Serialization Overhead (~22%)
Forward Links
- PLAYBOOK.md - Execution playbook
- RUNBOOK.md - Operational procedures
- VTT Parser - Python VTT parser (v2.0)
- Transcript Chunker - Python chunker (v2.0)
Routing Disambiguation
When this skill is the wrong choice and what happens if misrouted.
| Condition | Use Instead | Consequence of Misrouting |
|---|---|---|
| Non-transcript file analysis (code, markdown, config files) | /problem-solving or Read tool directly |
Transcript parsing agents (ts-parser, ts-extractor) applied to non-VTT/SRT files produce parsing failures; hybrid Python+LLM architecture expects transcript-format input |
| General text processing or summarization | /problem-solving (ps-analyst or ps-synthesizer) |
Transcript agents apply speaker identification, timestamp extraction, and domain-specific entity extraction patterns that are irrelevant to general text; 1,250x cost multiplier if Task agents invoked unnecessarily vs. direct text processing |
| Requirements engineering or design work | /nasa-se |
Transcript skill extracts meeting content; requirements formalization and V&V traceability require NASA-SE methodology |
| Adversarial quality review of deliverables | /adversary |
Transcript agents score extraction quality against transcript-specific criteria, not the S-014 LLM-as-Judge rubric for deliverable quality |
| Research, analysis, or root cause investigation | /problem-solving |
Transcript agents have no analytical methodology beyond extraction; research and causal investigation require ps-researcher or ps-investigator |
| Security assessment or threat modeling | /eng-team |
Transcript skill has no security methodology; security-engineering domain context provides STRIDE/DREAD extraction but does not replace eng-team threat modeling |
Agent Details
For detailed agent specifications, see:
agents/ts-parser.md- Parser/Orchestrator agent definition (v2.0)agents/ts-extractor.md- Extractor agent definitionagents/ts-formatter.md- Formatter agent definitionagents/ts-mindmap-mermaid.md- Mermaid mindmap agent definition (ADR-006)agents/ts-mindmap-ascii.md- ASCII mindmap agent definition (ADR-006)skills/problem-solving/agents/ps-critic.md- Critic agent (shared)
Document History
| Version | Date | Changes | Enabler/Task |
|---|---|---|---|
| 1.0.0 | 2026-01-26 | Initial SKILL.md with agent pipeline and basic invocation | FEAT-001 |
| 2.0.0 | 2026-01-30 | Hybrid Python+LLM architecture per DISC-009 findings | EN-025, DISC-009 |
| 2.1.0 | 2026-01-30 | Mindmap pipeline integration (default ON, opt-out via --no-mindmap) | ADR-006, EN-024 |
| 2.2.0 | 2026-01-29 | Explicit CLI invocation instructions for Phase 1 parsing | EN-024 (verification gap fix) |
| 2.3.0 | 2026-01-30 | Token-based chunking (fixes BUG-001 chunk token overflow) | EN-026, BUG-001 |
| 2.4.0 | 2026-01-30 | EN-028 Compliance: Added invoking section with natural language patterns, enhanced state passing with error handling, file persistence requirements (P-002 checklist), self-critique protocol (pre-finalization checks), restructured for triple-lens audience (L0/L1/L2), expanded Quick Reference with troubleshooting, model selection documentation (--model-* flags). | EN-028, TASK-407-411 |
| 2.4.1 | 2026-01-30 | G-028 Iteration 2 Refinements (F-001 to F-005): Added 5 error invocation examples with error outputs (F-001); documented error state structures for all agents including propagated errors (F-002); expanded recovery scenarios from 3 to 10 covering Python parser failure, context overflow, file write errors, quality gate failure, timeouts, partial extraction, mindmap failures, file conflicts, missing dependencies, and state corruption (F-003); added quantitative thresholds to all agent self-critique checklists with numeric bounds for validation (F-004); expanded Quick Reference troubleshooting section with 7 orchestration failure scenarios including pipeline stuck, agent timeout, quality gate immediate failure, file conflicts, mindmap partial failure, state key mismatch, and Python parser fallback issues (F-005). Target: raise G-028 score from 0.78 to ≥ 0.90. | EN-028, G-028 Iteration 2 |
| 2.4.2 | 2026-01-30 | EN-030 Documentation Polish: Added 6 comprehensive tool examples (Bash, Read, Write, Task, Glob, Grep) with execution evidence (TASK-416); Added 6 design rationale deep-dives covering hybrid architecture, chunking strategy, mindmap default-on, quality threshold 0.90, dual citation system, and P-003 compliance (TASK-417); Added 3 cross-skill integration sections for /problem-solving, /orchestration, and /nasa-se (TASK-418). Target: raise G-030 score from 0.83 to ≥ 0.95. | EN-030, TASK-416-418 |
| 2.5.0 | 2026-01-31 | ADR-007 Model-Agnostic Output: Added "Model-Agnostic Output Requirements (ADR-007)" section with MUST-CREATE (8 files), MUST-NOT-CREATE lists, anchor format rules, citation format, and navigation requirements. Ensures output consistency across Sonnet/Opus/Haiku models. FEAT-006 Phase 4 implementation. | ADR-007, FEAT-006 |
Skill Version: 2.5.0
Architecture: Hybrid Python+LLM (Strategy Pattern) + Mindmap Generation + Token-Based Chunking
Constitutional Compliance: Jerry Constitution v1.0 (P-001, P-002, P-003, P-004, P-010, P-020, P-022)
ADR Compliance: ADR-002 (packet structure), ADR-003 (anchor registry), ADR-004 (file splitting), ADR-006 (mindmap), ADR-007 (output template)
Created: 2026-01-26
Last Updated: 2026-01-31 (ADR-007 Model-Agnostic Output)