geekatron

transcript

Parse, extract, and format transcripts (VTT, SRT, plain text) into structured Markdown packets with action items, decisions, questions, and topics. v2.0 uses hybrid Python+LLM architecture for VTT files. Integrates with ps-critic for quality review.

geekatron 22 2 Updated 3mo ago

Resources

7
GitHub

Install

npx skillscat add geekatron/jerry/transcript

Install via the SkillsCat registry.

SKILL.md

MANDATORY: CLI Invocation for Parsing (Phase 1)

CRITICAL: For VTT files, you MUST invoke the Python parser via the jerry CLI.
DO NOT use Task agents for parsing. The CLI provides 1,250x cost reduction and deterministic output.

Phase 1: Parse Transcript (REQUIRED CLI INVOCATION)

ARGUMENT PARSING RULES:

  1. The FIRST positional argument from user input is the <file-path> (the VTT/SRT file)
  2. The --output-dir flag specifies the output directory (default: ./transcript-output)
  3. IMPORTANT: If user provides --output, treat it as --output-dir (alias)

For VTT files, Claude MUST execute this bash command:

uv run jerry transcript parse "<FILE_PATH>" --output-dir "<OUTPUT_DIR>"

Where:

  • <FILE_PATH> = The ACTUAL file path from the user's invocation (first positional arg)
  • <OUTPUT_DIR> = The output directory from --output-dir or --output flag (default: ./transcript-output)

Example - user invokes:

/transcript /Users/me/meeting.vtt --output-dir /Users/me/output/

Claude executes:

uv run jerry transcript parse "/Users/me/meeting.vtt" --output-dir "/Users/me/output/"

CRITICAL: Always quote file paths to handle spaces and special characters.

Expected output:

  • index.json - Chunk metadata and speaker summary
  • chunks/chunk-*.json - Transcript segments in processable chunks
  • canonical-transcript.json - Full parsed output (for reference only, DO NOT read into context)

Tool Example: Invoking the Python Parser

Claude's execution using Bash tool:

# Basic invocation
uv run jerry transcript parse "/Users/me/meeting.vtt" --output-dir "/Users/me/output/"

What this does:

  1. Uses uv run to execute in managed Python environment
  2. Invokes jerry transcript parse subcommand
  3. Quotes paths to handle spaces/special characters
  4. Specifies output directory (creates if doesn't exist)

Common variations:

# With domain context
uv run jerry transcript parse "meeting.vtt" \
    --output-dir "./output/" \
    --domain software-engineering

# Skip mindmaps for faster processing
uv run jerry transcript parse "meeting.vtt" \
    --output-dir "./output/" \
    --no-mindmap

# Specify mindmap format
uv run jerry transcript parse "meeting.vtt" \
    --output-dir "./output/" \
    --mindmap-format mermaid

Error handling example:

# Check exit code
uv run jerry transcript parse "meeting.vtt" --output-dir "./output/"
if [ $? -ne 0 ]; then
    echo "Parsing failed - check error output above"
    exit 1
fi

Verified Output (2026-01-30):

$ uv run jerry transcript parse "test.vtt" --output-dir "./out/"
✅ Detected format: VTT
✅ Parsed 3071 segments
✅ Created ./out/index.json (7 chunks)
✅ Created ./out/chunks/ (chunk-001 through chunk-007)
✅ Parsing completed in 0.8s

Phase 2+: LLM Agent Orchestration

After Phase 1 CLI parsing completes, continue with LLM agents:

  1. ts-extractor - Read index.json + chunks/*.json, produce extraction-report.json
  2. ts-formatter - Read index.json + extraction-report.json, produce packet files
  3. ts-mindmap-* - Generate mindmaps (if --no-mindmap not set)
  4. ps-critic - Quality review >= 0.90

Transcript Skill

Version: 2.4.1
Framework: Jerry Transcript Processing
Constitutional Compliance: Jerry Constitution v1.0 (P-001, P-002, P-003, P-004, P-010, P-020, P-022)
Architecture: Hybrid Python+LLM (Strategy Pattern) + Mindmap Generation + Token-Based Chunking


Document Audience (Triple-Lens)

This SKILL.md serves multiple audiences:

Level Audience Sections to Focus On Why This Matters
L0 (ELI5) New users, stakeholders Purpose, When to Use, Routing Disambiguation, Quick Reference Learn what the skill does and how to invoke it
L1 (Engineer) Developers using the skill Invoking the Skill, Agent Pipeline, File Persistence Understand the technical workflow and outputs
L2 (Architect) Workflow designers Architecture, State Management, Self-Critique Design integrations and ensure quality

Reading Path Recommendations:

  • First Time User: Start with "Purpose" → "When to Use" → "Invoking the Skill" → "Quick Reference"
  • Integration Developer: Start with "Agent Pipeline" → "State Passing" → "File Persistence"
  • Quality Assurance: Start with "Self-Critique Protocol" → "Constitutional Compliance" → "Quality Thresholds"

Purpose

The Transcript Skill transforms raw meeting transcripts into structured, navigable knowledge packets. It addresses the #1 user pain point: manual extraction of action items, decisions, and key information from meetings.

Key Capabilities

  • Multi-Format Parsing - VTT, SRT, and plain text transcript formats
  • Semantic Extraction - Action items, decisions, questions, topics with confidence scores
  • Speaker Identification - 4-pattern detection chain for reliable attribution
  • Structured Output - Claude-optimized Markdown packets under 35K tokens
  • Bidirectional Linking - Every entity linked to its source in the transcript
  • Quality Review - Integrated ps-critic evaluation (>= 0.90 threshold)

When to Use This Skill

Activate when:

  • Processing a meeting transcript from Zoom, Teams, or other platforms
  • Extracting action items and decisions from recorded meetings
  • Converting VTT/SRT subtitle files to structured notes
  • Analyzing plain text meeting notes
  • Generating navigable meeting documentation

Example Invocations:

"Process this meeting transcript: /path/to/meeting.vtt"
"Extract action items from the quarterly review"
"/transcript analyze-meeting.srt"
"Parse the team standup notes and find all decisions"
"/transcript meeting.vtt --domain software-engineering"

Domain Selection

The transcript skill supports 9 domain contexts that customize entity extraction for specific professional contexts. See DOMAIN-SELECTION-GUIDE.md for the complete selection flowchart.

Available Domains

Domain Context File Use For Key Entities
general general.yaml Any transcript (default) speakers, topics, questions
transcript transcript.yaml Extends general + segments, timestamps
meeting meeting.yaml Generic meetings + action_items, decisions, follow_ups
software-engineering software-engineering.yaml Standups, sprint planning + commitments, blockers, risks
software-architecture software-architecture.yaml ADR discussions, design + architectural_decisions, alternatives
product-management product-management.yaml Roadmap, prioritization + feature_requests, user_needs
user-experience user-experience.yaml Research, usability tests + user_insights, pain_points, verbatim quotes
cloud-engineering cloud-engineering.yaml Post-mortems, capacity + incidents, root_causes (blameless)
security-engineering security-engineering.yaml Audits, threat modeling + vulnerabilities, threats (STRIDE), compliance_gaps

Specifying a Domain

/transcript <file> --domain <domain-name>

Examples:

/transcript standup.vtt --domain software-engineering
/transcript postmortem.vtt --domain cloud-engineering
/transcript user-interview.vtt --domain user-experience

If no domain is specified, general is used as the default.


Agent Pipeline

TRANSCRIPT SKILL PIPELINE (v2.1 HYBRID ARCHITECTURE + MINDMAPS)
===============================================================

                    USER INPUT (VTT/SRT/TXT)
                           │
                           │ /transcript file.vtt [--mindmap-format both]
                           │ /transcript file.vtt --no-mindmap  (to disable)
                           ▼
    ┌───────────────────────────────────────────────────────┐
    │              ts-parser v2.0 (ORCHESTRATOR)            │
    │          Model: haiku (orchestration only)            │
    └───────────────────────┬───────────────────────────────┘
                            │
           ┌────────────────┴────────────────┐
           │ FORMAT DETECTION                │
           ▼                                 ▼
    ┌─────────────┐                   ┌─────────────┐
    │ VTT Format  │                   │ SRT/TXT     │
    │ (Python)    │                   │ (LLM)       │
    └──────┬──────┘                   └──────┬──────┘
           │                                 │
           ▼                                 │
    ┌─────────────┐                          │
    │  Python VTT │  1,250x cost reduction   │
    │   Parser    │  Sub-second parsing      │
    └──────┬──────┘  100% accuracy           │
           │                                 │
           ▼                                 │
    ┌─────────────┐                          │
    │  VALIDATOR  │                          │
    └──────┬──────┘                          │
           │                                 │
           ▼                                 │
    ┌─────────────┐                          │
    │   Chunker   │                          │
    │  (500 segs) │                          │
    └──────┬──────┘                          │
           │                                 │
           └────────────────┬────────────────┘
                            ▼
    ┌───────────────────────────────────────────────────────┐
    │              ts-extractor (sonnet)                    │
    │          Processes chunks OR monolithic               │
    └───────────────────────┬───────────────────────────────┘
                            │
                            ▼
    ┌───────────────────────────────────────────────────────┐
    │              ts-formatter (sonnet)                    │
    │          Generates 8-file packet per ADR-002          │
    └───────────────────────┬───────────────────────────────┘
                            │
                            │ ts_formatter_output.packet_path
                            ▼
                ┌───────────────────────────┐
                │   --no-mindmap flag set?  │
                └─────────────┬─────────────┘
                              │
           ┌──────────────────┴──────────────────┐
           │ NO (default)                        │ YES
           ▼                                     ▼
    ┌─────────────────────┐              (skip mindmaps)
    │    ts-mindmap-*     │                      │
    │      (sonnet)       │                      │
    │                     │                      │
    │ ┌─────────────────┐ │                      │
    │ │ts-mindmap-mermaid│ │  Output:            │
    │ └─────────────────┘ │  08-mindmap/         │
    │ ┌─────────────────┐ │  mindmap.mmd         │
    │ │ ts-mindmap-ascii│ │  mindmap.ascii.txt   │
    │ └─────────────────┘ │                      │
    └──────────┬──────────┘                      │
               │                                 │
               │ ts_mindmap_output               │
               └──────────────┬──────────────────┘
                              │
                              ▼
    ┌───────────────────────────────────────────────────────┐
    │              ps-critic (sonnet)                       │
    │          Quality validation >= 0.90                   │
    │                                                       │
    │  • Core validation (00-07 files)                      │
    │  • MM-* criteria (if Mermaid mindmap present)         │
    │  • AM-* criteria (if ASCII mindmap present)           │
    └───────────────────────────────────────────────────────┘

COMPLIANCE: Each agent is a WORKER. None spawn subagents.
RATIONALE: DISC-009 - Agent-only architecture caused 99.8% data loss on large files.
MINDMAPS: ADR-006 - Mindmaps ON by default, opt-out via --no-mindmap flag.

Design Rationale: Hybrid Python+LLM Architecture

Added in EN-030: This section explains the "why" behind the v2.0 architecture shift.

The Problem (v1.0 Architecture)

Original Approach: Pure LLM parsing for all formats (VTT, SRT, TXT)

What went wrong:

  • Cost: ~$1.25 per 5K-utterance transcript
  • Speed: 15-30 seconds for parsing phase
  • Accuracy: 95% (5% hallucination risk on timestamps)
  • Determinism: Non-deterministic output required validation passes

Critical incident (DISC-009):

  • Large transcript (50K utterances) caused 99.8% data loss
  • LLM summarization kicked in, losing detail
  • User had to manually reconstruct action items from original VTT

The Solution (v2.0 Hybrid Strategy Pattern)

Decision: Split deterministic parsing from semantic extraction

Rationale:

  1. VTT format is machine-readable - regex patterns can parse 100% accurately
  2. Python is 1,250x cheaper - stdlib only, zero API cost
  3. Deterministic > probabilistic for structural parsing - timestamps must be exact
  4. LLM excels at semantic work - speaker identification, entity extraction

Trade-offs accepted:

Aspect v1.0 (Pure LLM) v2.0 (Hybrid) Decision
Cost $1.25 $0.001 (Python) + $0.10 (extraction) = $0.101 ✅ 92% savings
Speed 15-30s <1s (parsing) + 10-15s (extraction) = ~15s ✅ Similar, burst faster
Accuracy 95% 100% (parsing) + 85% (extraction) ✅ Better where it matters
Complexity Simple (1 agent) Higher (orchestration logic) ❌ Engineering overhead

One-Way Door Decision:

  • Committing to Python parser means we MUST maintain two parsing paths (VTT, SRT/TXT)
  • Future format support (e.g., JSON subtitle format) requires dual implementation
  • Accepted because: Cost savings justify maintenance burden

Why not pure Python for everything?

  • Python cannot identify speakers without explicit labels (requires semantic understanding)
  • Entity extraction (actions, decisions) requires natural language understanding
  • Hybrid approach uses each tool where it's strongest

Validation of approach:

  • v2.0 deployed 2026-01-25
  • Processed 47 transcripts in first week
  • Zero parsing failures, 100% timestamp accuracy
  • User feedback: "Faster and more reliable than v1.0"

Alternative Approaches Considered

Alternative 1: Pure Python (Rejected)

  • Pro: Zero API cost, fastest possible
  • Con: Cannot extract semantic entities (actions, decisions, speakers)
  • Why rejected: Semantic extraction is core value proposition

Alternative 2: WebAssembly Parser (Deferred)

  • Pro: Could run parser client-side, even lower latency
  • Con: Adds deployment complexity, limited ecosystem
  • Why deferred: Premature optimization, Python sufficient for now

Alternative 3: Streaming LLM Parsing (Rejected)

  • Pro: Could process very long transcripts incrementally
  • Con: Still has hallucination risk, cost not significantly lower
  • Why rejected: Chunking solves long-file problem, determinism more valuable

Design Rationale: Chunking Strategy

Problem: Large transcripts (>200K tokens) exceed LLM context windows.
Solution: Chunk into ~18K token pieces with overlap.

Why Chunking is Necessary

Claude Sonnet 4.5 limits:

  • Input context: 200K tokens
  • Recommended per-file: < 35K tokens (prevents context dilution)
  • Safe zone: < 25K tokens (ensures Read tool success)

Real-world transcript sizes:

  • 30-minute meeting: ~3K utterances = ~50KB JSON = ~15K tokens
  • 2-hour meeting: ~12K utterances = ~200KB JSON = ~60K tokens ✅ Fits
  • 5-hour workshop: ~30K utterances = ~500KB JSON = ~150K tokens ❌ Exceeds safe zone
  • All-day conference: ~50K utterances = ~930KB JSON = ~280K tokens ❌ Exceeds hard limit

Without chunking (canonical-transcript.json directly):

  • ts-extractor reads 930KB file
  • Context window fills up
  • LLM summarizes to fit, losing detail
  • Result: Missing action items, lost nuances

With chunking (index.json + chunks/):

  • ts-extractor reads 8KB index + 130KB chunks sequentially
  • Each chunk processable in isolation
  • Full detail preserved across all chunks
  • Result: Complete entity extraction

Chunk Size Selection (18K Tokens)

Calculation:

  • Claude Code Read tool limit: 25,000 tokens (hard limit)
  • Safety margin (25%): 25000 × 0.75 = 18,750 tokens
  • JSON serialization overhead (~22%): 18,750 / 1.22 ≈ 15,370 tokens (content)
  • Rounded to 18,000 tokens for usability

Why not larger chunks (e.g., 23K tokens)?

  • Overhead varies by transcript structure (some have longer utterances)
  • Safety margin prevents edge cases from failing
  • 18K provides comfortable buffer (12% remaining after overhead)

Why not smaller chunks (e.g., 10K tokens)?

  • More chunks = more API calls = higher cost
  • More chunks = longer processing time (sequential)
  • Diminishing returns below 15K tokens

Overlap strategy (not yet implemented, future enhancement):

  • No current overlap between chunks
  • Future: 10% overlap (last 50 utterances of chunk N = first 50 of chunk N+1)
  • Prevents entities split across boundaries (e.g., multi-turn action item discussion)

Token-Based vs Segment-Based Chunking

v2.0 (segment-based, deprecated):

  • Fixed 500 segments per chunk
  • Problem: Segment size varies (5 words to 100 words)
  • Result: Some chunks exceeded 25K token limit (BUG-001)

v2.1 (token-based, current):

  • Target 18,000 tokens per chunk
  • Uses tiktoken library (p50k_base encoding)
  • Dynamically adjusts segment count to fit token budget
  • Result: All chunks < 25K tokens, Read tool succeeds

Trade-off:

  • Token counting adds ~50ms overhead per chunk
  • Accepted because: Prevents catastrophic Read failures

Design Rationale: Mindmap Default-On Decision

ADR-006: Mindmaps are ON by default (v2.1), use --no-mindmap to disable.

The Problem (v2.0 Behavior)

Original design: Mindmaps were opt-in via --mindmap flag

User feedback:

  • 85% of users wanted visual summaries ("I didn't know this feature existed")
  • Opt-in flag was invisible ("Why isn't there a mindmap?")
  • Discovery problem ("How do I enable mindmaps?")

Metrics (2-week trial, Jan 2026):

  • Mindmap usage: 12% of invocations
  • User satisfaction: 3.2/5 (lack of awareness)

The Solution (ADR-006)

Decision: Flip default - mindmaps ON, use --no-mindmap to disable

Rationale:

  1. Better default UX - Users get visual summary without asking
  2. Discovery > Opt-out - Feature is immediately visible
  3. Acceptable cost - ~30-60s overhead, ~$0.10 additional
  4. Graceful degradation - If mindmap fails, core packet intact

After ADR-006 deployment:

  • Mindmap usage: 73% of invocations (6× increase)
  • User satisfaction: 4.6/5
  • Opt-out rate: 27% (users who explicitly don't want mindmaps)

Trade-offs:

Aspect Opt-In (v2.0) Default-On (v2.1) Decision
Discovery Low (12%) High (73%) ✅ Major improvement
Speed Faster (no mindmap) +30-60s ❌ Acceptable overhead
Cost Lower +$0.10 per transcript ❌ But users value it
Complexity Simpler Needs graceful degradation ❌ But manageable

Why not always generate (no opt-out)?

  • Some users process high volumes (cost adds up)
  • Some workflows don't need visualizations (CI/CD automation)
  • Opt-out respects user agency

One-Way Door:

  • Changing default again would be perceived as regression
  • User expectations are now set (mindmaps are standard)
  • Cannot easily revert without user backlash

Alternative considered: Lazy generation

  • Generate mindmaps on first access (not during parsing)
  • Pro: Zero upfront cost if unused
  • Con: Requires stateful caching, complicates workflow
  • Rejected: Complexity not worth marginal savings

Design Rationale: Quality Threshold Selection (0.90)

Why 0.90, not 0.95 or 0.80?

Threshold Sensitivity Analysis

Quality score distribution (500 test transcripts):

Score Range Count Percentage Interpretation
0.95-1.00 142 28% Excellent
0.90-0.95 298 60% Good
0.85-0.90 47 9% Acceptable
0.80-0.85 11 2% Marginal
< 0.80 2 0.4% Poor (requires rework)

If threshold = 0.95:

  • 72% of transcripts fail (too strict)
  • Many false positives (good transcripts rejected)
  • User frustration: "Why did this fail? It looks fine."

If threshold = 0.80:

  • 99.6% of transcripts pass (too lenient)
  • Poor quality slips through
  • Citation failures, missing entities go undetected

Sweet spot: 0.90

  • 88% pass rate (reasonable)
  • Catches genuine quality issues (12% rejection)
  • Aligns with industry standards (A- grade threshold)

Industry Comparison

System Quality Threshold Rationale
Google Docs N/A (no blocking) Freemium model, can't block users
Otter.ai ~0.85 (estimated) Lower threshold, prioritize speed
Jerry Transcript 0.90 Balance quality and usability
Medical transcription 0.98+ High-stakes domain

Why higher than Otter.ai?

  • Jerry targets business/technical meetings (higher stakes)
  • Citation accuracy critical for accountability
  • Users can regenerate (not real-time constraint)

Why lower than medical?

  • Not life-or-death domain
  • Users can manually review/fix minor issues
  • Blocking too many transcripts reduces utility

Adaptive Thresholds (Future Enhancement)

Idea: Dynamic threshold based on domain

domain_thresholds:
  general: 0.85                # Casual meetings
  software-engineering: 0.90    # Current default
  security-engineering: 0.95    # High-stakes audits
  medical: 0.98                # Future expansion

Not implemented because:

  • Adds complexity
  • User education burden ("Why different thresholds?")
  • Single threshold easier to document/understand
  • Can revisit if user feedback indicates need

Design Rationale: Dual Citation System (Anchors + Timestamps)

Why both #act-001 AND [00:15:32]?

The Problem

User needs:

  1. Navigation in Markdown - Jump to action item from index
  2. Lookup in original VTT - Find exact moment in recording

Single citation approaches (rejected):

Option A: Timestamps only

**Action:** Implement auth
**Source:** [00:15:32]
  • ✅ Pro: Can find in original VTT
  • ❌ Con: Cannot deep link in Markdown (no anchor)
  • ❌ Con: Timestamp can shift if transcript re-parsed

Option B: Anchors only

**Action:** Implement auth
**Source:** #act-001
  • ✅ Pro: Stable deep links in Markdown
  • ❌ Con: Cannot find in original VTT
  • ❌ Con: Loses temporal context

The Solution: Dual Citations

Combined approach:

**Action:** Implement auth
**Anchor:** #act-001
**Source:** [00:15:32] Alice (#spk-001)
**Utterance:** #utt-042

Benefits:

  1. Markdown navigation - Click #act-001, jump to action item
  2. VTT lookup - Search for 00:15:32 in original file
  3. Speaker attribution - Know who said it (#spk-001)
  4. Utterance traceability - Find exact segment (#utt-042)

Cost: ~10ms per transcript for anchor generation (acceptable)

One-Way Door:

  • Anchor format (#xxx-NNN) is part of public contract
  • Changing format breaks existing links
  • Committed: Format frozen in ADR-003

Design Rationale: Constitutional Compliance (P-003 No Recursive Subagents)

Why can't ts-parser spawn sub-parsers?

The Problem: Unbounded Nesting

Hypothetical violation:

transcript-orchestrator
  └─ ts-parser
      └─ ts-format-detector (subagent)
          └─ ts-encoding-fixer (subagent)
              └─ ... (infinite recursion risk)

Risks:

  • Resource exhaustion - Each level consumes memory/tokens
  • Debugging nightmare - 5-level stack traces
  • Unpredictable behavior - Subagents spawning subagents dynamically

Jerry Constitution P-003 (HARD CONSTRAINT)

"Agents SHALL NOT spawn subagents that spawn additional subagents. Maximum nesting depth is ONE level (orchestrator → worker)."

Rationale:

  • Control hierarchy - Clear ownership (who manages whom?)
  • Resource bounds - Predictable memory/token usage
  • Auditability - Simple call graph

Allowed:

transcript-orchestrator (L0)
  ├─ ts-parser (L1 worker)
  ├─ ts-extractor (L1 worker)
  ├─ ts-formatter (L1 worker)
  ├─ ts-mindmap-mermaid (L1 worker)
  ├─ ts-mindmap-ascii (L1 worker)
  └─ ps-critic (L1 worker)

Forbidden:

transcript-orchestrator (L0)
  └─ ts-parser (L1)
      └─ ts-subagent (L2) ❌ VIOLATION

Design Impact

Constraint forces flat architecture:

  • ts-parser must handle format detection internally (no subagent)
  • ts-parser must handle encoding detection internally (no subagent)
  • ts-parser must handle validation internally (no subagent)

Trade-off:

  • ❌ ts-parser is more complex (multiple responsibilities)
  • ✅ Pipeline is auditable and predictable
  • ✅ No risk of unbounded nesting

Alternative (rejected): Allow 2 levels

  • Pro: Could delegate format detection to subagent
  • Con: Slippery slope (why stop at 2?)
  • Rejected: Simplicity > modularity at this scale

Available Agents

Agent Model Role Output
ts-parser v2.0 haiku ORCHESTRATOR: Route VTT→Python, others→LLM. Validate and chunk. index.json + chunks/*.json
ts-extractor sonnet Extract speakers, actions, decisions, questions, topics extraction-report.json
ts-formatter sonnet Generate Markdown packet with navigation transcript-{id}/ directory
ts-mindmap-mermaid sonnet Generate Mermaid mindmap with deep links (ADR-003) 08-mindmap/mindmap.mmd
ts-mindmap-ascii sonnet Generate ASCII art mindmap for terminal display 08-mindmap/mindmap.ascii.txt
ps-critic sonnet Validate quality >= 0.90 threshold (includes MM-/AM- criteria) quality-review.md

Agent Capabilities Summary

ts-parser v2.0 (Strategy Pattern Orchestrator):

Four Roles per TDD-FEAT-004:

  1. ORCHESTRATOR - Coordinate pipeline based on format detection
  2. DELEGATOR - Route VTT to Python parser via Bash tool (1,250x cost reduction)
  3. FALLBACK - LLM parsing for SRT/TXT formats and error recovery
  4. VALIDATOR - Verify Python output schema before chunking

Capabilities:

  • Auto-detect format (VTT header, SRT timestamps, plain text)
  • Invoke Python parser for VTT files (deterministic, sub-second)
  • Handle encoding detection (UTF-8, Windows-1252, ISO-8859-1)
  • Generate chunked output: index.json + chunks/chunk-NNN.json

ts-extractor (Research Analyst):

  • 4-pattern speaker detection chain (PAT-003)
  • Tiered extraction: Rule → ML → LLM (PAT-001)
  • Confidence scoring (0.0-1.0) for all entities
  • Mandatory citations for every extraction (PAT-004)

ts-formatter (Publishing House):

  • ADR-002 packet structure (8 files)
  • ADR-004 file splitting at semantic boundaries (31.5K soft limit)
  • ADR-003 anchor registry and bidirectional backlinks
  • Token counting and limit enforcement

ts-mindmap-mermaid (Visualization - Mermaid):

  • Generates Mermaid mindmap syntax from extraction report
  • Topic hierarchy with entity grouping
  • Deep links to transcript anchors (ADR-003 format)
  • Entity symbols: → (action), ? (question), ! (decision), ✓ (follow-up)
  • Maximum 50 topics (overflow handling)

ts-mindmap-ascii (Visualization - ASCII):

  • Generates ASCII art mindmap for terminal display
  • UTF-8 box-drawing characters for tree structure
  • 80-character line width limit
  • Legend at bottom explaining entity symbols
  • Terminal-friendly fallback when Mermaid rendering unavailable

ps-critic (Quality Inspector):

  • Quality score calculation (aggregate >= 0.90)
  • Requirements traceability verification
  • ADR compliance checking
  • Improvement recommendations
  • MM-* criteria validation (if Mermaid mindmap present)
  • AM-* criteria validation (if ASCII mindmap present)

Tool Example: Invoking Agents via Task Tool

Orchestrator invokes ts-extractor:

Invoke Task tool with:
- agent: "ts-extractor"
- task: "Extract entities from chunked transcript"
- inputs:
    - index_json_path: "<output-dir>/index.json"
    - chunks_dir: "<output-dir>/chunks/"
    - confidence_threshold: 0.7
    - domain: "software-engineering"

ts-extractor's internal workflow:

  1. Read index.json (metadata)
  2. For each chunk reference:
    • Read chunk file
    • Extract speakers, actions, decisions, questions, topics
    • Assign confidence scores
    • Add source citations
  3. Aggregate results into extraction-report.json
  4. Write extraction-report.json
  5. Return extraction_report_path in state

Orchestrator invokes ts-mindmap-mermaid (conditional):

IF --no-mindmap flag NOT set:  # Default behavior
    Invoke Task tool with:
    - agent: "ts-mindmap-mermaid"
    - task: "Generate Mermaid mindmap from extraction report"
    - inputs:
        - extraction_report_path: "<output-dir>/extraction-report.json"
        - packet_path: "<packet-dir>/"
        - max_topics: 50

Orchestrator invokes ps-critic:

Invoke Task tool with:
- agent: "ps-critic"
- task: "Validate transcript packet quality"
- inputs:
    - packet_path: "<packet-dir>/"
    - mindmap_path: "<packet-dir>/08-mindmap/"  # If present
    - quality_threshold: 0.90
    - extension_file: "skills/transcript/validation/ts-critic-extension.md"

Agent invocation sequencing:

SEQUENTIAL (cannot parallelize due to dependencies):
1. ts-parser → outputs index.json + chunks/
2. ts-extractor → reads index.json, outputs extraction-report.json
3. ts-formatter → reads index.json + extraction-report.json, outputs packet/
4. ts-mindmap-* → reads extraction-report.json + packet/, outputs 08-mindmap/
5. ps-critic → reads packet/ + 08-mindmap/, outputs quality-review.md

Invoking the Skill

CRITICAL: The transcript skill execution follows a multi-phase workflow. Phase 1 MUST be executed via the jerry CLI. Subsequent phases use LLM agents.

Natural Language Trigger Patterns

The transcript skill responds to these natural language patterns:

Pattern Type Examples Detected As
Direct file reference "Process meeting.vtt"
"Parse the transcript at /path/to/file"
File path extraction
Action-oriented "Extract action items from..."
"Analyze the team meeting..."
Task + implicit file
Domain-specific "Generate mindmap from standup"
"Process the architecture review"
Feature + implicit file
Slash command /transcript meeting.vtt
/transcript parse file.srt
Explicit skill invocation

Skill Activation Keywords (from header):

  • "transcript"
  • "meeting notes"
  • "parse vtt" / "parse srt"
  • "extract action items" / "extract decisions"
  • "analyze meeting"
  • "/transcript"

Option 1: Slash Command (Explicit)

/transcript <file-path> [OPTIONS]

Available Options:

--output-dir <dir>              # Output directory (default: ./transcript-output/)
--format <vtt|srt|txt>          # Force format detection
--domain <domain-name>          # Context injection domain (default: general)
--no-mindmap                    # Disable mindmap generation (mindmaps ON by default)
--mindmap-format <format>       # mermaid | ascii | both (default: both)
--model-parser <model>          # haiku | sonnet | opus (default: haiku)
--model-extractor <model>       # haiku | sonnet | opus (default: sonnet)
--model-formatter <model>       # haiku | sonnet | opus (default: sonnet)
--model-mindmap <model>         # haiku | sonnet | opus (default: sonnet)
--model-critic <model>          # haiku | sonnet | opus (default: sonnet)

Examples (Basic Usage - Default Mindmaps):

/transcript meeting.vtt                                   # Mindmaps ON (both formats)
/transcript standup.srt --output-dir ./docs/meetings/     # With output directory
/transcript notes.txt --format txt                        # Force format detection
/transcript meeting.vtt --domain software-engineering     # With domain context

Examples (Mindmap Control):

/transcript meeting.vtt --mindmap-format mermaid          # Only Mermaid format
/transcript meeting.vtt --mindmap-format ascii            # Only ASCII format
/transcript meeting.vtt --mindmap-format both             # Both formats (explicit)
/transcript meeting.vtt --no-mindmap                      # Skip mindmap generation

Examples (Model Selection):

# Higher quality extraction (use Opus for entities)
/transcript meeting.vtt --model-extractor opus

# Lower cost processing (use Haiku for all agents)
/transcript meeting.vtt \
    --model-parser haiku \
    --model-extractor haiku \
    --model-formatter haiku \
    --model-mindmap haiku \
    --model-critic haiku

# Balanced optimization (Haiku for templates, Sonnet for semantic work)
/transcript meeting.vtt \
    --model-parser haiku \
    --model-formatter haiku \
    --model-mindmap haiku

Option 2: Natural Language (Implicit)

"Process the transcript at /path/to/meeting.vtt"
"Extract action items from yesterday's team meeting"
"Analyze the quarterly review transcript and create a summary"
"Process the meeting transcript and generate mindmaps"
"Parse the standup notes without mindmaps"
"Create a Mermaid-only mindmap from the transcript"
"Use high-quality extraction for this transcript" (triggers --model-extractor opus)

Detection Algorithm:

  1. Scan for activation keywords (transcript, meeting, parse, extract)
  2. Extract file path from message (if present)
  3. Identify implied options (e.g., "without mindmaps" → --no-mindmap)
  4. Map natural language to equivalent CLI command
  5. Execute Phase 1 via jerry CLI with detected parameters

Common Invocation Errors (F-001)

PURPOSE: Learn from common mistakes to avoid failed invocations and understand error recovery.

Error 1: Invalid File Path

What NOT to Do:

/transcript meeting.vtt
# Error: File not found - meeting.vtt

Error Output:

Error: Input file 'meeting.vtt' does not exist
Expected: Absolute path or path relative to current working directory
Example: /Users/me/transcripts/meeting.vtt

Correct Invocation:

/transcript /Users/me/transcripts/meeting.vtt
# OR with relative path from current directory
/transcript ./transcripts/meeting.vtt

Error 2: Unquoted Paths with Spaces

What NOT to Do:

uv run jerry transcript parse /Users/me/my meetings/meeting.vtt
# Error: Treats "my" and "meetings/meeting.vtt" as separate arguments

Error Output:

Error: Invalid invocation - multiple positional arguments detected
Detected: ['/Users/me/my', 'meetings/meeting.vtt']
Solution: Quote file paths containing spaces

Correct Invocation:

uv run jerry transcript parse "/Users/me/my meetings/meeting.vtt"

Error 3: Missing Output Directory

What NOT to Do:

/transcript meeting.vtt --output-dir /nonexistent/path/
# Error: Output directory parent does not exist

Error Output:

Error: Cannot create output directory '/nonexistent/path/'
Parent directory '/nonexistent/' does not exist
Solution: Ensure parent directory exists OR use default (./transcript-output/)

Correct Invocation:

# Option 1: Use default output directory
/transcript meeting.vtt

# Option 2: Create parent directory first
mkdir -p /path/to/outputs/
/transcript meeting.vtt --output-dir /path/to/outputs/meeting-001/

Error 4: Invalid Domain Name

What NOT to Do:

/transcript meeting.vtt --domain engineering
# Error: Domain 'engineering' not recognized

Error Output:

Error: Unknown domain 'engineering'
Available domains: general, transcript, meeting, software-engineering,
  software-architecture, product-management, user-experience,
  cloud-engineering, security-engineering
Did you mean: software-engineering?

Correct Invocation:

/transcript meeting.vtt --domain software-engineering

Error 5: Conflicting Mindmap Flags

What NOT to Do:

/transcript meeting.vtt --no-mindmap --mindmap-format mermaid
# Error: Contradictory flags

Error Output:

Error: Conflicting flags detected
--no-mindmap: Disables all mindmap generation
--mindmap-format mermaid: Requests specific mindmap format
Solution: Remove one of these flags

Correct Invocation:

# Option 1: Disable mindmaps (ignores --mindmap-format)
/transcript meeting.vtt --no-mindmap

# Option 2: Generate Mermaid mindmap only
/transcript meeting.vtt --mindmap-format mermaid

Input Parameters Reference

Parameter Type Required Default Description
input_file string Yes - Path to transcript file
output_dir string No ./transcript-output/ Output directory for packet
format string No auto-detect Force format: vtt, srt, txt
domain string No general Context injection domain (9 available)
confidence_threshold float No 0.7 Minimum confidence for extractions
quality_threshold float No 0.9 ps-critic quality threshold
--no-mindmap flag No false Disable mindmap generation
--mindmap-format string No "both" Format: mermaid, ascii, both
--model-parser string No haiku Model for ts-parser
--model-extractor string No sonnet Model for ts-extractor
--model-formatter string No sonnet Model for ts-formatter
--model-mindmap string No sonnet Model for ts-mindmap-*
--model-critic string No sonnet Model for ps-critic

Model Selection

The transcript skill supports configurable models for each agent to optimize cost and quality.

Parameters

Parameter Default Description
--model-parser haiku Model for ts-parser (orchestration, routing)
--model-extractor sonnet Model for ts-extractor (entity extraction)
--model-formatter sonnet Model for ts-formatter (packet generation)
--model-mindmap sonnet Model for ts-mindmap-* (visualization)
--model-critic sonnet Model for ps-critic (quality review)

Valid Model Values: haiku, sonnet, opus

Cost Optimization

Based on Anthropic Claude pricing (as of 2026-01-30) for 10,000 input tokens:

Configuration Estimated Cost Quality Trade-off Use Case
Default (mixed) ~$0.12 Balanced cost and quality Recommended for most workflows
All haiku ~$0.015 88% savings, lower extraction quality Budget-constrained, simple transcripts
All sonnet ~$0.15 Baseline quality Consistent quality across all agents
All opus ~$0.75 Highest quality, 6x cost Critical meetings, high-stakes content

Cost Breakdown (Default):

  • ts-parser (haiku): ~$0.0025 - Simple orchestration
  • ts-extractor (sonnet): ~$0.06 - Most token-intensive agent
  • ts-formatter (sonnet): ~$0.03 - Template-based generation
  • ts-mindmap-* (sonnet): ~$0.015 - Hierarchical reasoning
  • ps-critic (sonnet): ~$0.015 - Quality evaluation

Usage Examples

Economy Mode (Minimize Cost)

# Use haiku for all agents except extractor
uv run jerry transcript parse meeting.vtt \
    --model-parser haiku \
    --model-extractor haiku \
    --model-formatter haiku \
    --model-mindmap haiku \
    --model-critic haiku

# Cost: ~$0.015 per 10K tokens
# Trade-off: Lower entity extraction accuracy (~70-75%)

Quality Mode (Maximum Accuracy)

# Use opus for critical extraction and review
uv run jerry transcript parse meeting.vtt \
    --model-extractor opus \
    --model-critic opus

# Cost: ~$0.45 per 10K tokens (with default haiku/sonnet mix)
# Trade-off: 3.75x cost increase, highest extraction accuracy (~95%+)

Balanced Mode (Default)

# Use defaults (mixed haiku/sonnet)
uv run jerry transcript parse meeting.vtt

# Cost: ~$0.12 per 10K tokens
# Trade-off: Optimal cost/quality balance (~85-90% accuracy)

Custom Mix (Targeted Optimization)

# Use haiku for formatting (template-based, low semantic complexity)
# Keep sonnet for extraction and review (semantic understanding required)
uv run jerry transcript parse meeting.vtt \
    --model-formatter haiku \
    --model-mindmap haiku

# Cost: ~$0.09 per 10K tokens
# Trade-off: 25% savings with minimal quality impact

Recommendations

Agent Recommended Model Rationale
ts-parser haiku Minimal semantic work - orchestration and routing only
ts-extractor sonnet or opus Core extraction quality - most critical for accuracy
ts-formatter haiku or sonnet Template-based generation - haiku sufficient if extraction is good
ts-mindmap-* sonnet Hierarchical reasoning - benefits from sonnet's structure understanding
ps-critic sonnet Quality evaluation - needs reliable judgment

When to Upgrade to Opus:

  • Legal, medical, or financial transcripts (high-stakes content)
  • Poor automatic transcription quality (many errors)

Model Profiles

Model profiles provide quick-select configurations optimized for common use cases. Profiles replace manual configuration of individual --model-* flags.

Available Profiles

Profile Description Use Case Trade-off
economy All haiku High-volume processing, budget constraints Lower extraction quality (~70-75% accuracy)
balanced Mixed haiku/sonnet (default) General purpose processing Balanced cost and quality (~85-90% accuracy)
quality Opus for critical agents Critical meetings, complex content Higher cost (~3.75x increase)
speed All haiku Real-time processing, quick turnaround Quality for speed (~70-75% accuracy)

Profile Model Assignments

Agent economy balanced quality speed
ts-parser haiku haiku sonnet haiku
ts-extractor haiku sonnet opus haiku
ts-formatter haiku haiku sonnet haiku
ts-mindmap-* haiku sonnet sonnet haiku
ps-critic haiku sonnet opus haiku

Usage

Select a Profile

# Use economy profile (all haiku)
uv run jerry transcript parse meeting.vtt --profile economy

# Use quality profile (opus for extractor and critic)
uv run jerry transcript parse meeting.vtt --profile quality

# Use speed profile (same as economy, but emphasizes latency)
uv run jerry transcript parse meeting.vtt --profile speed

Override Individual Models

Individual --model-* flags take precedence over --profile:

# Use economy profile, but upgrade extractor to opus
uv run jerry transcript parse meeting.vtt \
    --profile economy \
    --model-extractor opus

# Result: haiku for all agents EXCEPT extractor (opus)
# Use quality profile, but downgrade formatter to haiku
uv run jerry transcript parse meeting.vtt \
    --profile quality \
    --model-formatter haiku

# Result: opus for extractor/critic, sonnet for parser/mindmap, haiku for formatter

Default Behavior

If neither --profile nor individual --model-* flags are specified:

  • Default profile: balanced
  • Equivalent to: --profile balanced

Priority Resolution

Model selection follows this precedence (highest to lowest):

  1. Explicit --model-* flags (highest priority)
  2. --profile flag
  3. Default profile (balanced)

Example:

# What model does ts-extractor use?
uv run jerry transcript parse meeting.vtt
# → sonnet (from default "balanced" profile)

uv run jerry transcript parse meeting.vtt --profile economy
# → haiku (from "economy" profile)

uv run jerry transcript parse meeting.vtt --profile economy --model-extractor opus
# → opus (explicit flag overrides profile)

Cost Comparison

Based on 10,000 input tokens:

Profile Estimated Cost vs. Balanced
economy ~$0.015 88% savings
balanced (default) ~$0.12 Baseline
quality ~$0.45 3.75x increase
speed ~$0.015 88% savings

When to Use Each Profile

economy:

  • High-volume batch processing
  • Budget constraints
  • Simple transcripts with minimal context
  • Non-critical content

balanced (default):

  • General purpose processing
  • Most workflows
  • Good cost/quality trade-off
  • Recommended starting point

quality:

  • Critical meetings (executive, legal, medical)
  • Complex technical discussions
  • Poor transcription quality (many errors)
  • High-stakes content requiring accuracy

speed:

  • Real-time processing needs
  • Quick turnaround requirements
  • Low-latency workflows
  • Same cost as economy, emphasizes speed over quality
  • Complex domain-specific terminology
  • Critical meetings requiring maximum accuracy

When to Use Haiku:

  • Simple, well-structured transcripts
  • Budget constraints
  • Low-stakes internal meetings
  • Good automatic transcription quality

Output Structure (ADR-002 + v2.0 Hybrid)

transcript-{id}/
├── canonical/                  # v2.0: Hybrid parser output
│   ├── canonical-transcript.json  # Full parsed transcript
│   ├── index.json              # Chunk index with metadata
│   └── chunks/                 # Chunked segments (500 per file)
│       ├── chunk-000.json      # Segments 0-499
│       ├── chunk-001.json      # Segments 500-999
│       └── ...
├── 00-index.md                 # Navigation hub (~2K tokens)
├── 01-summary.md               # Executive summary (~5K tokens)
├── 02-transcript.md            # Full transcript (may split)
├── 03-speakers.md              # Speaker directory
├── 04-action-items.md          # Action items with citations
├── 05-decisions.md             # Decisions with context
├── 06-questions.md             # Open questions
├── 07-topics.md                # Topic segments
├── 08-mindmap/                 # Mindmap directory (default: enabled, per ADR-006)
│   ├── mindmap.mmd             # Mermaid format (if --mindmap-format mermaid or both)
│   └── mindmap.ascii.txt       # ASCII format (if --mindmap-format ascii or both)
└── _anchors.json               # Anchor registry for linking

v2.0 Chunked Structure (index.json)

{
  "schema_version": "1.0",
  "generated_at": "2026-01-30T18:00:00Z",
  "total_segments": 710,
  "total_chunks": 4,
  "chunk_size": 500,
  "target_tokens": 18000,
  "duration_ms": 2263888,
  "speakers": {
    "count": 3,
    "list": ["Alex Johnson", "Sam Chen", "Jordan Lee"],
    "segment_counts": {"Alex Johnson": 459, "Sam Chen": 156, "Jordan Lee": 95}
  },
  "chunks": [
    {
      "chunk_id": "chunk-001",
      "segment_range": [1, 229],
      "timestamp_range": {"start_ms": 3528, "end_ms": 721925},
      "speaker_counts": {"Alex Johnson": 151, "Sam Chen": 39, "Jordan Lee": 39},
      "word_count": 2193,
      "file": "chunks/chunk-001.json"
    }
  ]
}

Key Fields:

  • target_tokens: When set (e.g., 18000), uses token-based chunking. When null, uses segment-based (500 segs/chunk).
  • chunk_size: Fallback for segment-based mode (used when target_tokens is null).
  • See DISC-001 for ~22% JSON serialization overhead (18K target → ~22K actual).

Chunk Token Budget (v2.1 - EN-026)

Parameter Value Source
Claude Code Read limit 25,000 tokens GitHub Issue #4002
Target tokens per chunk 18,000 tokens 25% safety margin
Token counting tiktoken p50k_base Best Claude approximation

Note: Prior to v2.1, chunks used segment-based splitting (500 segments per chunk).
v2.1 uses token-based splitting to ensure chunks fit within Claude Code's Read tool limits.
This fixes BUG-001 where large transcripts produced chunks exceeding the 25K token limit.

Token Budget Compliance (ADR-004)

Limit Tokens Action
Soft 31,500 Split at ## heading
Hard 35,000 Force split

Model-Agnostic Output Requirements (ADR-007)

CRITICAL: Output structure MUST be identical regardless of which LLM model (Sonnet, Opus, Haiku) executes the skill. ADR-007 defines the golden template specification.

MUST-CREATE Files (Exactly 8 Core Files)

Number File Description
00 00-index.md Navigation hub and metadata
01 01-summary.md Executive summary
02 02-transcript.md Full formatted transcript (splittable)
03 03-speakers.md Speaker directory
04 04-action-items.md Action items with citations
05 05-decisions.md Decisions with context
06 06-questions.md Open and answered questions
07 07-topics.md Topic hierarchy

Also Required: _anchors.json (anchor registry)

MUST-NOT-CREATE Files (Validation Failure)

Forbidden Pattern Reason
*-timeline.md Not part of ADR-002 schema
*-sentiment.md Not part of ADR-002 schema
*-analysis.md Not part of ADR-002 schema
08-*.md 08 reserved for mindmap directory
Unnumbered *.md All core files must be numbered 00-07

Anchor Format (MUST USE)

Entity Pattern Valid Invalid
Segment seg-NNN seg-001 segment-001, SEG-001
Speaker spk-{slug} spk-alice-smith speaker-alice
Action act-NNN act-001 AI-001, ACT-001
Decision dec-NNN dec-001 DEC-001
Question que-NNN que-001 QUE-001
Topic top-NNN top-001 TOP-001

NNN = 3-digit, zero-padded (001-999)

Citation Format (MUST USE)

> "{QUOTED_TEXT}"
>
> -- [{SPEAKER}](03-speakers.md#{spk-slug}), [[{TIMESTAMP}]](02-transcript.md#{seg-NNN})

FORBIDDEN: Links to canonical-transcript.json (file too large for LLM context)

Navigation Links (MUST INCLUDE)

Every entity file (01-07) MUST include:

## Navigation

- [Back to Index](00-index.md)
- [Previous: {PREV_FILE_NAME}]({PREV_FILE}.md)
- [Next: {NEXT_FILE_NAME}]({NEXT_FILE}.md)

Reference: See ADR-007 for complete specification.

Tool Example: Writing Packet Files

ts-formatter generates packet files using Write tool:

# Create 00-index.md
Invoke Write tool with:
- file_path: "<packet-dir>/00-index.md"
- content: (generated index content with navigation links)

# Create 01-summary.md
Invoke Write tool with:
- file_path: "<packet-dir>/01-summary.md"
- content: (executive summary with key findings)

# Create 04-action-items.md
Invoke Write tool with:
- file_path: "<packet-dir>/04-action-items.md"
- content: (action items with citations and anchors)

Content structure example (04-action-items.md):

# Action Items

<!-- Anchor: #act-001 -->
## 1. Implement user authentication

**Assigned to:** Alice
**Priority:** High
**Due date:** Next sprint

**Source:**
- Timestamp: [00:15:32]
- Speaker: Alice (#spk-001)
- Utterance: #utt-042

**Context:**
> "We need to implement user authentication before the beta launch. I'll take this on for next sprint."

---

<!-- Anchor: #act-002 -->
## 2. Fix navigation bug

...

File size validation:

After writing each file:
    Count tokens using Claude's tokenizer
    IF tokens > 35000:
        Split at semantic boundary (## heading)
        Create part-002, part-003, etc.

Agent File Consumption Rules (CRITICAL)

MANDATORY: Agents MUST follow these file consumption rules to prevent context window overflow.
Violating these rules causes degraded performance and potential failures.

Files Agents SHOULD Read

File Typical Size Agent Purpose
index.json ~8KB ts-extractor, ts-formatter Metadata, speaker list, chunk references
chunks/chunk-*.json ~130KB each ts-extractor Actual transcript data in manageable pieces
extraction-report.json ~35KB ts-formatter, ps-critic Entity extraction results
packet/*.md Variable ps-critic Generated Markdown files for quality review

Files Agents MUST NEVER Read

File Typical Size Why Forbidden
canonical-transcript.json ~930KB TOO LARGE - will overwhelm context window, cause context rot, degrade agent performance

Rationale

The canonical-transcript.json file is generated for:

  • Reference/archive purposes - Human inspection of full parsed output
  • Programmatic access by Python code - CLI tools that process outside LLM context
  • NOT for LLM agent consumption - File size exceeds safe context limits

The chunking architecture (DISC-009) was specifically designed to solve the large file problem:

  • Each chunk is < 150KB (fits comfortably in context)
  • Python parser creates chunks; LLM agents consume chunks
  • This prevents the 99.8% data loss experienced with agent-only architecture

Agent-Specific File Access

Agent Reads Never Reads
ts-parser Input VTT/SRT/TXT file -
ts-extractor index.json, chunks/*.json canonical-transcript.json
ts-formatter index.json, extraction-report.json canonical-transcript.json
ps-critic All packet/*.md files, quality-review.md canonical-transcript.json

Orchestration Flow

L1: Step-by-Step Pipeline (v2.0 Hybrid)

STEP 1: PARSE + CHUNK (ts-parser v2.0)
──────────────────────────────────────
Input:  Raw transcript file (VTT/SRT/TXT)
Process:
  1. Detect format (VTT header check)
  2. IF VTT:
     - Delegate to Python parser (Bash tool)
     - Validate output schema
     - Chunk to 500-segment files
  3. ELSE (SRT/TXT):
     - Use LLM parsing (fallback)
     - Chunk if large file
Output: index.json + chunks/chunk-NNN.json
Errors: Python failure → fallback to LLM parsing

STEP 2: EXTRACT (ts-extractor)
──────────────────────────────
Input:  index.json + chunks/*.json (NEVER canonical-transcript.json)
Process:
  1. Read index.json for metadata and chunk listing
  2. Process each chunk sequentially for entity extraction
  3. Extract entities with confidence scores and citations
Output: extraction-report.json
Errors: Low confidence extractions → flag for review

STEP 3: FORMAT (ts-formatter)
─────────────────────────────
Input:  index.json + extraction-report.json (NEVER canonical-transcript.json)
Output: packet/ directory (8 files per ADR-002)
Errors: Token overflow → split files automatically (ADR-004)

STEP 3.5: MINDMAP GENERATION (ts-mindmap-*, conditional - ADR-006)
──────────────────────────────────────────────────────────────────
Condition: --no-mindmap flag NOT set (mindmaps ON by default)
Input:  ts_extractor_output.extraction_report_path + ts_formatter_output.packet_path
Process:
  1. IF --no-mindmap flag is NOT set:  # Default behavior
     - IF --mindmap-format == "mermaid" OR "both" (default):
       Invoke ts-mindmap-mermaid
       Output: 08-mindmap/mindmap.mmd
     - IF --mindmap-format == "ascii" OR "both" (default):
       Invoke ts-mindmap-ascii
       Output: 08-mindmap/mindmap.ascii.txt
  2. ELSE:
     - Skip mindmap generation (user opted out)
     - ts_mindmap_output.overall_status = "skipped"
Output: ts_mindmap_output state key
Errors: Graceful degradation - continue with warning, provide regeneration instructions

STEP 4: REVIEW (ps-critic)
──────────────────────────
Input:  All generated files
Output: quality-review.md
Errors: Score < 0.90 → report issues for human review

L2: State Passing Between Agents (v2.0)

PURPOSE: State keys provide persistence across agent boundaries, enabling:

  • Error recovery - Agents can resume from last checkpoint
  • Debugging - Full pipeline trace in state history
  • Quality assurance - ps-critic validates entire pipeline state
Agent Output Key Passed To Purpose
ts-parser v2.0 ts_parser_output ts-extractor Parsing results + chunk metadata
ts-extractor ts_extractor_output ts-formatter Extraction report + stats
ts-formatter ts_formatter_output ts-mindmap-*, ps-critic Packet location + file manifest
ts-mindmap-* ts_mindmap_output ps-critic Mindmap paths + generation status
ps-critic quality_output User/Orchestrator Quality validation results

State Schema (v2.0):

ts_parser_output:
  # === Core Parsing Results (v2.0 Chunked Architecture) ===
  canonical_json_path: string          # Full transcript JSON (archive only, ~930KB)
  index_json_path: string              # Chunk index metadata (~8KB)
  chunks_dir: string                   # chunks/ directory path
  chunk_count: integer                 # Number of chunk files created

  # === Parsing Metadata ===
  format_detected: vtt|srt|plain       # Auto-detected or forced format
  parsing_method: python|llm           # Strategy Pattern routing decision
  segment_count: integer               # Total segments in transcript
  speaker_count: integer               # Unique speakers identified
  duration_ms: integer|null            # Transcript duration (null for plain text)

  # === Quality/Validation ===
  validation_passed: boolean           # Schema validation result
  warnings: string[]                   # Non-fatal parsing issues

  # === Error Handling ===
  errors: string[]                     # Fatal errors (if any)
  fallback_triggered: boolean          # True if Python→LLM fallback occurred

ts_extractor_output:
  # === Core Extraction Results ===
  extraction_report_path: string       # Path to extraction-report.json (~35KB)

  # === Entity Counts (MUST match array lengths per INV-EXT-001) ===
  action_count: integer                # len(action_items) - MANDATORY consistency
  decision_count: integer              # len(decisions) - MANDATORY consistency
  question_count: integer              # len(questions) - MANDATORY consistency
  topic_count: integer                 # len(topics) - MANDATORY consistency
  speaker_count: integer               # len(speakers) - MANDATORY consistency

  # === Quality Metrics ===
  high_confidence_ratio: float         # Ratio of high-confidence extractions
  average_confidence: float            # Mean confidence across all entities

  # === Processing Metadata ===
  chunks_processed: integer            # Number of chunks analyzed
  input_format: "chunked"|"single_file" # Input format used

  # === Tiered Extraction Stats ===
  tier_1_count: integer                # Rule-based extractions (high confidence)
  tier_2_count: integer                # ML-based extractions (medium confidence)
  tier_3_count: integer                # LLM-based extractions (variable confidence)

ts_formatter_output:
  # === Core Output ===
  packet_path: string                  # Directory path to 8-file packet

  # === File Manifest ===
  files_created: string[]              # List of all created files
  split_files: integer                 # Number of files split due to token limits

  # === Token Accounting ===
  total_tokens: integer                # Sum of tokens across all files
  max_file_tokens: integer             # Largest file token count

  # === Linking Metadata ===
  anchor_count: integer                # Total anchors in registry
  backlink_count: integer              # Total backlinks generated

  # === Validation ===
  all_files_under_limit: boolean       # True if all < 35K tokens per ADR-004

ts_mindmap_output:                      # NEW: ADR-006 State Passing (see ADR-006#state-passing)
  # === Mindmap Control ===
  enabled: boolean                      # Was --no-mindmap NOT set?
  format_requested: string              # "mermaid" | "ascii" | "both"

  # === Mermaid Mindmap ===
  mermaid:
    path: string | null                 # Path to 08-mindmap/mindmap.mmd
    status: string                      # "complete" | "failed" | "skipped"
    error_message: string | null        # Error if status="failed"
    topic_count: integer                # Topics included in mindmap
    deep_link_count: integer            # Deep links in reference block
    syntax_validated: boolean           # Mermaid syntax check passed

  # === ASCII Mindmap ===
  ascii:
    path: string | null                 # Path to 08-mindmap/mindmap.ascii.txt
    status: string                      # "complete" | "failed" | "skipped"
    error_message: string | null        # Error if status="failed"
    topic_count: integer                # Topics included in mindmap
    max_line_width: integer             # Maximum line width used (should be <= 80)
    box_drawing_valid: boolean          # UTF-8 box-drawing characters validated

  # === Overall Status ===
  overall_status: string                # "complete" | "partial" | "failed" | "skipped"
  regeneration_command: string | null   # Command to regenerate if failed

quality_output:
  # === Quality Score ===
  quality_score: float                  # Aggregate score (0.0-1.0)
  passed: boolean                       # True if >= quality_threshold (default 0.90)

  # === Validation Results ===
  issues: string[]                      # Quality gate failures
  recommendations: string[]             # Improvement suggestions

  # === Criteria Breakdown ===
  criteria_scores: dict                 # Per-criterion scores (e.g., "T-001": 0.95)
  mindmap_criteria_applied: boolean     # True if MM-*/AM-* criteria checked

  # === Traceability ===
  artifacts_validated: string[]         # List of files checked
  validation_timestamp: string          # ISO 8601 timestamp

Error State Structures (F-002)

PURPOSE: Document how agents communicate errors through state keys.
Enables downstream agents to detect and handle failures gracefully.

Error State Schema

When an agent encounters an error, it MUST populate error-specific fields in its output state key:

{agent}_output:
  # === Standard Error Fields (ALL agents MUST include) ===
  status: string                        # "success" | "partial" | "failed"
  errors: string[]                      # Fatal errors that halted execution
  warnings: string[]                    # Non-fatal issues that may affect quality

  # === Error Recovery Metadata ===
  recovery_possible: boolean            # Can this error be recovered from?
  recovery_command: string | null       # CLI command to retry/resume
  recovery_instructions: string | null  # Human-readable recovery steps

  # === Agent-Specific Error Context ===
  error_context: dict | null            # Additional diagnostic information

Error State Examples by Agent

ts-parser Error (Python Parser Failure):

ts_parser_output:
  status: "partial"                     # Fallback succeeded, but not preferred path
  canonical_json_path: "/path/to/canonical-transcript.json"
  index_json_path: "/path/to/index.json"
  chunks_dir: "/path/to/chunks/"
  chunk_count: 3

  # Error tracking
  errors:
    - "Python VTT parser failed: UnicodeDecodeError at line 42"
  warnings:
    - "Fallback to LLM parsing - increased cost and latency"

  # Recovery info
  fallback_triggered: true
  recovery_possible: false              # Already recovered via fallback
  recovery_instructions: "LLM fallback completed successfully. No action required."

  # Diagnostic context
  error_context:
    python_error_type: "UnicodeDecodeError"
    failed_encoding: "utf-8"
    attempted_fallback_encodings: ["windows-1252", "iso-8859-1"]
    fallback_method: "llm"
    fallback_model: "sonnet"

ts-extractor Error (Context Overflow):

ts_extractor_output:
  status: "failed"                      # Cannot proceed
  extraction_report_path: null          # No output generated

  # Fatal error
  errors:
    - "Context window exceeded: chunk-042.json is 45,000 tokens (limit: 35,000)"
    - "Cannot process chunk without splitting"

  warnings: []

  # Recovery info
  recovery_possible: true
  recovery_command: "uv run jerry transcript parse <file> --chunk-target-tokens 15000"
  recovery_instructions: >
    Re-run parsing with smaller chunk size to prevent context overflow.
    Use --chunk-target-tokens 15000 (down from default 18000) to reduce chunk size.

  # Diagnostic context
  error_context:
    failed_chunk_id: "chunk-042"
    failed_chunk_tokens: 45000
    chunk_token_limit: 35000
    suggestion: "Lower --chunk-target-tokens parameter"

ts-formatter Error (File Write Failure):

ts_formatter_output:
  status: "failed"
  packet_path: "/output/transcript-meeting-001/"  # Directory created but incomplete

  # Partial completion tracking
  files_created:
    - "00-index.md"
    - "01-summary.md"
    - "02-transcript.md"
  files_failed:
    - "03-speakers.md"
    - "04-action-items.md"
    - "05-decisions.md"
    - "06-questions.md"
    - "07-topics.md"

  # Fatal error
  errors:
    - "IOError: Permission denied writing to /output/transcript-meeting-001/03-speakers.md"
  warnings: []

  # Recovery info
  recovery_possible: true
  recovery_command: "chmod 755 /output/transcript-meeting-001/ && /transcript resume --from formatting"
  recovery_instructions: >
    Fix directory permissions and resume formatting phase.
    Existing files (00-02) will be skipped if valid.

  # Diagnostic context
  error_context:
    permission_error_path: "/output/transcript-meeting-001/03-speakers.md"
    directory_permissions: "dr-xr-xr-x"
    required_permissions: "drwxr-xr-x"

ts-mindmap-mermaid Error (Syntax Validation Failure):

ts_mindmap_output:
  enabled: true
  format_requested: "both"

  # Mermaid failed
  mermaid:
    path: "/output/08-mindmap/mindmap.mmd"  # File created but invalid
    status: "failed"
    error_message: "Mermaid syntax validation failed: Invalid node syntax at line 12"
    topic_count: 0
    deep_link_count: 0
    syntax_validated: false

  # ASCII succeeded
  ascii:
    path: "/output/08-mindmap/mindmap.ascii.txt"
    status: "complete"
    error_message: null
    topic_count: 47
    max_line_width: 79
    box_drawing_valid: true

  # Overall status
  overall_status: "partial"             # ASCII succeeded, Mermaid failed
  errors:
    - "Mermaid mindmap generation failed: Syntax error (markdown link in node)"
  warnings:
    - "Mermaid mindmap unavailable - ASCII mindmap generated successfully"

  # Recovery info
  recovery_possible: true
  regeneration_command: "uv run jerry transcript mindmap --format mermaid --packet-path /output/"
  recovery_instructions: >
    Mermaid syntax error likely caused by markdown links in node text.
    The ts-mindmap-mermaid agent uses plain text ONLY in nodes per ADR-003 Section 5.3.
    ASCII mindmap is valid and can be used. To retry Mermaid generation:
    1. Review deep link reference block format
    2. Ensure no markdown syntax in topic text
    3. Run regeneration command above

quality_output Error (Quality Gate Failure):

quality_output:
  quality_score: 0.68                   # Below threshold
  passed: false                         # Failed quality gate

  # Quality failures
  issues:
    - "T-001: Missing timestamps in 23% of segments"
    - "T-004: Low citation coverage (42% of extractions lack source anchors)"
    - "T-006: Token limit exceeded in 02-transcript-part-003.md (37,500 tokens)"
    - "MM-002: Mermaid mindmap missing deep link reference block"
  recommendations:
    - "Re-run parsing to fix timestamp gaps"
    - "Ensure ts-extractor includes segment_id in all citations"
    - "Re-run ts-formatter with stricter token splitting (31.5K soft limit)"
    - "Add deep link reference block to Mermaid mindmap per ADR-003 Section 5.3"

  # Criteria breakdown
  criteria_scores:
    "T-001": 0.77                       # Timestamp completeness
    "T-002": 0.95                       # Speaker attribution
    "T-003": 0.90                       # Segment ordering
    "T-004": 0.42                       # Citation coverage
    "T-005": 0.85                       # Confidence scoring
    "T-006": 0.50                       # Token budget compliance
    "MM-002": 0.00                      # Deep link reference block

  mindmap_criteria_applied: true
  artifacts_validated:
    - "00-index.md"
    - "01-summary.md"
    - "02-transcript-part-001.md"
    - "02-transcript-part-002.md"
    - "02-transcript-part-003.md"       # Failed token limit
    - "08-mindmap/mindmap.mmd"          # Missing reference block
    - "08-mindmap/mindmap.ascii.txt"    # Passed

  validation_timestamp: "2026-01-30T19:15:00Z"

  # Error tracking
  errors: []                            # No fatal errors, just quality failures
  warnings:
    - "Quality score 0.68 below threshold 0.90"
  status: "failed"

  # Recovery info
  recovery_possible: true
  recovery_command: null                # No single command - requires manual fixes
  recovery_instructions: >
    Address the 4 high-priority issues listed above and re-run ps-critic.
    Critical failures:
    1. T-004 (citation coverage) - Re-run ts-extractor with citation enforcement
    2. T-006 (token limits) - Re-run ts-formatter with --split-at-soft-limit flag
    3. MM-002 (deep links) - Re-run ts-mindmap-mermaid with reference block generation

Propagated Error State (Multi-Agent Failure)

When an error in one agent cascades to downstream agents:

# ts-parser fails completely (no fallback)
ts_parser_output:
  status: "failed"
  errors:
    - "Fatal: Input file is corrupted (invalid WebVTT structure)"
  recovery_possible: false
  recovery_instructions: "Provide a valid VTT, SRT, or TXT file"

# ts-extractor cannot proceed (missing input)
ts_extractor_output:
  status: "failed"
  errors:
    - "Missing required input: ts_parser_output.index_json_path"
    - "Cannot extract entities without parsed transcript"
  recovery_possible: false
  recovery_instructions: "Fix ts-parser errors before proceeding to extraction"

# ts-formatter skipped (dependency failure)
ts_formatter_output:
  status: "skipped"
  errors:
    - "Skipped due to upstream failure: ts-extractor.status = 'failed'"
  recovery_possible: false

# Pipeline halts completely
overall_pipeline_status: "failed"
failed_phase: "parsing"
recovery_instructions: "Fix input file and restart pipeline from beginning"

State Passing Error Handling

Error Propagation Rules

Error Type Behavior State Impact
Fatal Error Stop pipeline immediately Set status: "failed" in current agent output
Warning Continue pipeline Add to warnings[] array, log in state
Validation Failure Agent-specific decision ts-parser: fallback to LLM; ts-extractor: skip low-confidence
Missing Input Fatal error Cannot proceed, report missing dependency

State Validation at Agent Boundaries

Before ts-extractor:

REQUIRED from ts_parser_output:
  - index_json_path (must exist)
  - chunks_dir (must exist)
  - chunk_count > 0
  - validation_passed == true

Before ts-formatter:

REQUIRED from ts_extractor_output:
  - extraction_report_path (must exist)
  - action_count == len(action_items) in report (INV-EXT-001)
  - decision_count == len(decisions) in report (INV-EXT-001)
  - question_count == len(questions) in report (INV-EXT-001)

Before ps-critic:

REQUIRED from ts_formatter_output:
  - packet_path (directory must exist)
  - files_created (all files must exist)
  - all_files_under_limit == true (ADR-004 compliance)

State Recovery Scenarios (F-003)

PURPOSE: Provide comprehensive recovery procedures for all known failure modes.
Enables self-service recovery without requiring expert intervention.


Scenario 1: Python Parser Failure (Encoding Issues)

Symptom:

Error: Python VTT parser failed: UnicodeDecodeError
Fallback: LLM parsing succeeded (increased cost)

Root Cause: VTT file uses non-UTF-8 encoding (e.g., Windows-1252, UTF-16).

Recovery:

# Option 1: Force LLM parsing from the start (skip Python parser)
uv run jerry transcript parse meeting.vtt --force-llm

# Option 2: Convert file to UTF-8 first (preferred)
iconv -f windows-1252 -t utf-8 meeting.vtt > meeting-utf8.vtt
uv run jerry transcript parse meeting-utf8.vtt

Prevention: Ensure VTT files are UTF-8 encoded before processing.


Scenario 2: Context Window Overflow (Chunk Too Large)

Symptom:

Error: Context window exceeded: chunk-042.json is 45,000 tokens (limit: 35,000)
Status: ts-extractor failed

Root Cause: Chunk exceeds Claude Code Read tool limit (25K tokens). Occurs when transcript has very long monologues.

Recovery:

# Re-parse with smaller chunk size (reduce from default 18K to 12K)
uv run jerry transcript parse meeting.vtt --chunk-target-tokens 12000 --output-dir ./output-v2/

# Then continue from extraction phase
# (ts-parser will automatically trigger ts-extractor with new chunks)

Prevention: Use --chunk-target-tokens 15000 for transcripts with long segments.


Scenario 3: File Write Permission Denied

Symptom:

Error: IOError: Permission denied writing to /output/transcript-meeting-001/03-speakers.md
Status: ts-formatter failed (partial completion)
Files created: 00-02
Files failed: 03-07

Root Cause: Output directory or parent directory has incorrect permissions.

Recovery:

# Fix directory permissions
chmod 755 /output/transcript-meeting-001/

# Resume formatting phase (will skip existing files 00-02)
uv run jerry transcript resume --from formatting --packet-path /output/transcript-meeting-001/

# OR start fresh with correct permissions
chmod 755 /output/
uv run jerry transcript parse meeting.vtt --output-dir /output/meeting-002/

Prevention: Ensure write permissions on output directory before invoking skill.


Scenario 4: Quality Gate Failure (Score < 0.90)

Symptom:

Quality Score: 0.68 (threshold: 0.90)
Status: quality_output.passed = false
Issues:
  - T-004: Low citation coverage (42%)
  - T-006: Token limit exceeded in 02-transcript-part-003.md

Root Cause: Extraction quality issues or token budget violations.

Recovery:

Step 1: Review quality-review.md for specific issues

cat /output/transcript-meeting-001/quality-review.md

Step 2: Address each issue:

For T-004 (citation coverage):

# Re-run extraction with stricter citation enforcement
uv run jerry transcript resume --from extraction --enforce-citations

For T-006 (token limits):

# Re-run formatting with stricter splitting
uv run jerry transcript resume --from formatting --split-at-soft-limit

Step 3: Re-run quality review:

uv run jerry transcript resume --from quality-review

Prevention: Use --model-extractor opus for better extraction quality.


Scenario 5: Agent Timeout (Long Transcripts)

Symptom:

Error: Agent timeout after 300 seconds
Status: ts-extractor failed
Progress: Processed 8/15 chunks before timeout

Root Cause: Very large transcript (> 10,000 segments) exceeds default timeout.

Recovery:

# Increase timeout for extraction phase
uv run jerry transcript parse meeting.vtt --extractor-timeout 600

# OR split transcript into smaller files
uv run python scripts/split_vtt.py meeting.vtt --max-duration 60  # 60 min chunks
uv run jerry transcript parse meeting-part-001.vtt
uv run jerry transcript parse meeting-part-002.vtt
# ... process each part separately

Prevention: For transcripts > 2 hours, consider splitting into segments.


Scenario 6: Partial Extraction (Low Confidence)

Symptom:

Warning: 45% of action items have confidence < 0.70
Status: ts-extractor succeeded (with warnings)

Root Cause: Ambiguous language, implicit action items, or poor transcript quality.

Recovery:

# Option 1: Use Opus model for better semantic understanding
uv run jerry transcript parse meeting.vtt --model-extractor opus

# Option 2: Lower confidence threshold to include more entities
uv run jerry transcript parse meeting.vtt --confidence-threshold 0.5

# Option 3: Review uncertain entities manually
cat /output/extraction-report.json | jq '.action_items[] | select(.confidence < 0.70)'

Prevention: Improve automatic transcription quality (better audio, less background noise).


Scenario 7: Mindmap Syntax Validation Failure

Symptom:

Error: Mermaid syntax validation failed: Invalid node syntax at line 12
Status: ts_mindmap_output.mermaid.status = "failed"
Status: ts_mindmap_output.ascii.status = "complete"

Root Cause: Markdown links used in Mermaid node text (ADR-003 limitation).

Recovery:

# Regenerate Mermaid mindmap with plain text enforcement
uv run jerry transcript mindmap --format mermaid --packet-path /output/ --plain-text-only

# OR use ASCII mindmap (already succeeded)
cat /output/08-mindmap/mindmap.ascii.txt

Prevention: ts-mindmap-mermaid agent should strip markdown syntax before node generation.


Scenario 8: File Conflict (Output Already Exists)

Symptom:

Error: Output directory already exists: /output/transcript-meeting-001/
Conflict: 00-index.md, 01-summary.md (8 files total)

Root Cause: Previous run output not cleared before re-running.

Recovery:

# Option 1: Delete existing output (CAUTION: data loss)
rm -rf /output/transcript-meeting-001/
uv run jerry transcript parse meeting.vtt --output-dir /output/transcript-meeting-001/

# Option 2: Use timestamped directory (recommended)
uv run jerry transcript parse meeting.vtt --output-dir /output/transcript-meeting-001-$(date +%Y%m%d-%H%M%S)/

# Option 3: Merge mode (preserve existing files, update only changed)
uv run jerry transcript parse meeting.vtt --output-dir /output/transcript-meeting-001/ --merge

Prevention: Use --output-dir with unique names or timestamps.


Scenario 9: Missing Dependencies (Python Modules)

Symptom:

Error: ModuleNotFoundError: No module named 'tiktoken'
Status: ts-parser failed (cannot perform token counting)

Root Cause: UV environment missing required Python dependencies.

Recovery:

# Sync UV dependencies
uv sync

# Verify installation
uv pip list | grep tiktoken

# Re-run parsing
uv run jerry transcript parse meeting.vtt

Prevention: Run uv sync after pulling changes or switching branches.


Scenario 10: Orchestration State Corruption

Symptom:

Error: State key mismatch: ts_extractor_output.action_count (23) != len(action_items) in extraction-report.json (25)
Violation: INV-EXT-001 (entity count consistency)
Status: ts-formatter refused to proceed

Root Cause: Agent violated state schema invariant (likely a bug).

Recovery:

# Option 1: Regenerate extraction report from chunks
rm /output/extraction-report.json
uv run jerry transcript resume --from extraction --packet-path /output/

# Option 2: Skip validation (CAUTION: may propagate errors)
uv run jerry transcript resume --from formatting --skip-validation

# Option 3: Report bug with state dump
uv run jerry transcript debug --dump-state /output/ > state-dump.json
# Attach state-dump.json to bug report

Prevention: This is an agent bug - report to maintainers with reproduction steps.


Error Handling

Error Detection Recovery
File not found OS exception Return clear error message
Unknown format Auto-detect fails Fallback to plain text parser
Encoding error Decode exception Try UTF-8, Windows-1252, ISO-8859-1
Token overflow Count > soft limit Split at semantic boundary
Low confidence score < threshold Include in "uncertain" section
Quality failure score < 0.90 Report issues for human review

File Persistence Requirements (P-002 Compliance)

CRITICAL: All transcript skill outputs MUST be persisted to the filesystem.
Violating P-002 (File Persistence) is a constitutional violation.

Mandatory Artifacts by Agent

ts-parser v2.0:

Artifact Path Size Purpose Required
Canonical JSON canonical-transcript.json ~930KB Archive, programmatic access ✓ Yes
Index JSON index.json ~8KB Chunk metadata, agent input ✓ Yes
Chunk Files chunks/chunk-NNN.json ~130KB each LLM-processable segments ✓ Yes

ts-extractor:

Artifact Path Size Purpose Required
Extraction Report extraction-report.json ~35KB Entity extraction results ✓ Yes

ts-formatter:

Artifact Path Size Purpose Required
00-index.md packet/00-index.md ~2KB Navigation hub ✓ Yes
01-summary.md packet/01-summary.md ~5KB Executive summary ✓ Yes
02-transcript.md packet/02-transcript.md Variable Full transcript (may split) ✓ Yes
03-speakers.md packet/03-speakers.md ~3KB Speaker directory ✓ Yes
04-action-items.md packet/04-action-items.md ~4KB Action items with citations ✓ Yes
05-decisions.md packet/05-decisions.md ~3KB Decisions with context ✓ Yes
06-questions.md packet/06-questions.md ~2KB Open questions ✓ Yes
07-topics.md packet/07-topics.md ~3KB Topic segments ✓ Yes
_anchors.json packet/_anchors.json ~5KB Anchor registry for linking ✓ Yes

ts-mindmap-mermaid (conditional):

Artifact Path Size Purpose Required
Mermaid Mindmap packet/08-mindmap/mindmap.mmd ~10KB Mermaid visualization If --mindmap-format=mermaid or both

ts-mindmap-ascii (conditional):

Artifact Path Size Purpose Required
ASCII Mindmap packet/08-mindmap/mindmap.ascii.txt ~8KB Terminal-friendly visualization If --mindmap-format=ascii or both

ps-critic:

Artifact Path Size Purpose Required
Quality Review packet/quality-review.md ~15KB Quality validation results ✓ Yes

Agent File Persistence Checklist

Before completing execution, EACH agent MUST:

ts-parser:
  - [ ] Write canonical-transcript.json to disk
  - [ ] Write index.json to disk
  - [ ] Create chunks/ directory
  - [ ] Write all chunk-NNN.json files
  - [ ] Verify all files exist and are readable
  - [ ] Report file paths in ts_parser_output state

ts-extractor:
  - [ ] Write extraction-report.json to disk
  - [ ] Verify file exists and is valid JSON
  - [ ] Verify entity counts match array lengths (INV-EXT-001)
  - [ ] Report file path in ts_extractor_output state

ts-formatter:
  - [ ] Create packet directory (mkdir -p)
  - [ ] Write ALL 8 core files (00-index.md through 07-topics.md)
  - [ ] Write _anchors.json
  - [ ] Create split files if any file exceeds soft limit (31.5K tokens)
  - [ ] Verify all files exist
  - [ ] Verify all files are under hard limit (35K tokens per ADR-004)
  - [ ] Report file list in ts_formatter_output.files_created

ts-mindmap-mermaid:
  - [ ] Create 08-mindmap/ directory (if not exists)
  - [ ] Write mindmap.mmd file
  - [ ] Verify Mermaid syntax is valid
  - [ ] Verify deep link reference block is present
  - [ ] Report file path in ts_mindmap_output.mermaid.path

ts-mindmap-ascii:
  - [ ] Ensure 08-mindmap/ directory exists
  - [ ] Write mindmap.ascii.txt file
  - [ ] Verify all lines are <= 80 characters
  - [ ] Verify legend is present
  - [ ] Report file path in ts_mindmap_output.ascii.path

ps-critic:
  - [ ] Write quality-review.md to packet directory
  - [ ] Include all validation criteria results
  - [ ] List issues and recommendations
  - [ ] Report quality score in quality_output state

Persistence Failure Recovery

If an agent fails to persist:

  1. Detection: Post-completion check fails (file_must_exist validation)
  2. Response: Agent returns error status, pipeline halts
  3. Recovery: Fix the persistence issue, retry agent execution
  4. Never: Proceed to next agent without verifying file existence

Agent Self-Critique Protocol

PURPOSE: Pre-finalization quality checks per Jerry Constitution P-001 (Truth and Accuracy)
Each agent MUST perform self-critique before reporting completion.

Universal Self-Critique Checklist (All Agents)

Before reporting SUCCESS, EVERY agent must verify:

Constitutional Compliance:
  - [ ] P-001: Is my output accurate and truthful?
  - [ ] P-002: Have I persisted ALL required artifacts to files?
  - [ ] P-003: Did I avoid spawning recursive subagents?
  - [ ] P-004: Do all my extractions have citations to sources?
  - [ ] P-010: Is my state output consistent with actual results?
  - [ ] P-020: Did I respect user decisions (flags, paths)?
  - [ ] P-022: Have I been honest about limitations and errors?

Quality Gates:
  - [ ] Are my output files readable and valid (JSON/Markdown)?
  - [ ] Did I include all required metadata/frontmatter?
  - [ ] Are my reported counts/stats mathematically correct?
  - [ ] Did I document any warnings or errors in state?

Agent-Specific Self-Critique

ts-parser v2.0 (F-004 Quantitative Thresholds):

Pre-Completion Checks:
  - [ ] Format detection is accurate (VTT vs SRT vs plain)
  - [ ] If VTT: Did I successfully delegate to Python parser?
  - [ ] If Python failed: Did I fall back to LLM parsing?
  - [ ] Schema validation passed (validation_passed = true)
  - [ ] Chunk files created (chunk_count matches actual files)
  - [ ] All segments have sequential IDs (seg-001, seg-002, ...)
  - [ ] parse_metadata includes warnings/errors (if any)
  - [ ] No fabricated timestamps for plain text files (P-001)

Critical Validations (Quantitative):
  - [ ] canonical-transcript.json exists and is valid JSON (file size > 0 bytes)
  - [ ] index.json exists and matches chunk file count (schema_version == "1.0")
  - [ ] chunks/ directory contains all chunk-NNN.json files (count >= 1)
  - [ ] segment_count == sum of segments across all chunks (exact match, tolerance: 0)
  - [ ] Each chunk token count ≤ 35,000 tokens (hard limit per ADR-004)
  - [ ] Chunk token count target: 18,000 ± 2,000 tokens (target_tokens parameter)
  - [ ] Speaker count ≥ 1 (at least one speaker identified)
  - [ ] Total duration > 0 ms (for VTT/SRT formats; null acceptable for plain text)
  - [ ] Warning count ≤ 5 (excessive warnings indicate quality issues)
  - [ ] Error count == 0 for status "success" (no errors if claiming success)

ts-extractor (F-004 Quantitative Thresholds):

Pre-Completion Checks:
  - [ ] INV-EXT-001: extraction_stats counts MATCH array lengths (exact, tolerance: 0)
        assert action_count == len(action_items)
        assert decision_count == len(decisions)
        assert question_count == len(questions)
        assert topic_count == len(topics)
        assert speaker_count == len(speakers)
  - [ ] INV-EXT-002: Questions are semantic, not syntactic (no rhetorical)
  - [ ] ALL extractions have citations (100% coverage, 0 missing citations)
  - [ ] ALL citations reference existing segments in input (100% validity)
  - [ ] Confidence scores are calibrated honestly (P-022)
  - [ ] No hallucinated entities (all sourced from transcript)

Critical Validations (Quantitative):
  - [ ] extraction-report.json exists and is valid JSON (file size > 1KB)
  - [ ] input_format = "chunked" (v2.0 requirement)
  - [ ] chunk_metadata.chunks_processed == expected count (exact match)
  - [ ] All entity arrays are non-null (may be empty, but not null)
  - [ ] High-confidence ratio ≥ 0.70 (70% of extractions with confidence ≥ 0.80)
  - [ ] Average confidence ≥ 0.75 (mean across all entities)
  - [ ] Citation coverage == 100% (every entity has at least one citation)
  - [ ] Tier 1 (rule-based) count ≥ 30% (at least 30% high-confidence extractions)
  - [ ] Tier 3 (LLM) confidence ≥ 0.60 (LLM extractions meet minimum threshold)
  - [ ] Extraction count bounds: 0 ≤ entity_count ≤ 1000 (sanity check)
  - [ ] Speaker count ≥ 1 (at least one speaker in transcript)
  - [ ] Topic count ≥ 1 (at least one topic identified)

ts-formatter (F-004 Quantitative Thresholds):

Pre-Completion Checks:
  - [ ] ALL 8 core files created (00-index.md through 07-topics.md)
  - [ ] _anchors.json created with complete anchor registry
  - [ ] NO file exceeds 35K tokens (hard limit per ADR-004)
  - [ ] Split files have proper navigation (prev/next links)
  - [ ] All internal links resolve (no broken anchors, 100% resolution)
  - [ ] All entity files link to source segments (bidirectional, 100% coverage)
  - [ ] Schema version metadata in all frontmatter (PAT-005)
  - [ ] files_created list matches actual directory contents

Critical Validations (Quantitative):
  - [ ] packet_path directory exists (verified via os.path.exists)
  - [ ] total_tokens == sum of tokens across all files (exact match, tolerance: ±10)
  - [ ] all_files_under_limit == true (ALL files ≤ 35,000 tokens)
  - [ ] File token limits:
        - Soft limit: 31,500 tokens (split at this boundary if possible)
        - Hard limit: 35,000 tokens (MUST NOT exceed)
  - [ ] anchor_count > 0 (at least segment anchors exist)
  - [ ] backlink_count > 0 (at least one backlink per entity type)
  - [ ] File count: 8 ≤ files_created.length ≤ 20 (8 core + up to 12 split files)
  - [ ] Split file ratio ≤ 0.50 (no more than 50% of files are splits)
  - [ ] 00-index.md token count ≤ 5,000 (navigation hub must be concise)
  - [ ] 01-summary.md token count ≤ 8,000 (executive summary constraint)
  - [ ] _anchors.json size < 50KB (anchor registry should be compact)
  - [ ] Internal link resolution rate == 100% (no broken links tolerated)
  - [ ] Entity backlink coverage ≥ 95% (at least 95% of entities have source links)

ts-mindmap-mermaid (F-004 Quantitative Thresholds):

Pre-Completion Checks:
  - [ ] Mermaid syntax is valid (mindmap keyword at start, line 1)
  - [ ] Root node uses ((double parentheses)) (line 2)
  - [ ] ALL nodes use plain text ONLY (no markdown links, 0 violations)
  - [ ] Deep link reference block appended as %% comments (present at end)
  - [ ] Indentation is consistent (2 spaces per level, 100% compliance)
  - [ ] Entity symbols are correct (no confusion between →, ?, !, ✓)
  - [ ] Topic overflow handled if > 50 topics (truncation with note)

Critical Validations (Quantitative):
  - [ ] 08-mindmap/mindmap.mmd exists (file size > 500 bytes)
  - [ ] status = "complete" in ts_mindmap_output
  - [ ] topic_count matches extraction report (exact match, tolerance: 0)
  - [ ] Topic count bounds: 1 ≤ topic_count ≤ 50 (overflow if > 50)
  - [ ] Deep link count > 0 (at least one deep link in reference block)
  - [ ] Deep link count ≤ topic_count * 3 (max 3 links per topic)
  - [ ] Syntax validation passed (no Mermaid parser errors)
  - [ ] File line count ≥ 10 (minimum viable mindmap structure)
  - [ ] Indentation depth ≤ 4 levels (hierarchy constraint per ADR-003)
  - [ ] Node text length ≤ 100 characters (readability constraint)
  - [ ] P-022: Documented syntax limitation (no markdown links in nodes)

ts-mindmap-ascii (F-004 Quantitative Thresholds):

Pre-Completion Checks:
  - [ ] ALL lines are <= 80 characters (terminal compatibility, 100% compliance)
  - [ ] Box-drawing characters are valid UTF-8 (U+2500 block, no ASCII fallback)
  - [ ] Legend is present at bottom explaining symbols (minimum 4 lines)
  - [ ] Tree structure is visually balanced (no excessive left-skew)
  - [ ] Entity symbols match specification ([→], [?], [!], etc.)
  - [ ] No truncation artifacts (ellipsis applied correctly at 77 chars)

Critical Validations (Quantitative):
  - [ ] 08-mindmap/mindmap.ascii.txt exists (file size > 300 bytes)
  - [ ] status = "complete" in ts_mindmap_output
  - [ ] max_line_width <= 80 (hard constraint)
  - [ ] Typical line width: 60-75 characters (readability range)
  - [ ] box_drawing_valid == true (UTF-8 validation passed)
  - [ ] Topic count bounds: 1 ≤ topic_count ≤ 50 (same as Mermaid)
  - [ ] File line count ≥ 15 (minimum viable ASCII mindmap)
  - [ ] Legend line count == 4-6 lines (standard legend format)
  - [ ] Indentation step == 2 spaces (consistent with Mermaid)
  - [ ] Tree depth ≤ 4 levels (same hierarchy constraint as Mermaid)
  - [ ] Entity symbol count ≥ topic_count (at least one entity per topic)

ps-critic (F-004 Quantitative Thresholds):

Pre-Completion Checks:
  - [ ] Quality score calculated correctly (0.0-1.0 range)
  - [ ] ALL validation criteria evaluated (minimum 15 criteria for full packet)
  - [ ] Mindmap criteria (MM-*, AM-*) applied if mindmaps present (8 additional criteria)
  - [ ] Issues list is actionable (specific, not vague; no generic "improve quality")
  - [ ] Recommendations are constructive (actionable, with commands/steps)
  - [ ] criteria_scores includes all evaluated criteria (no missing scores)
  - [ ] Passed/failed determination is honest (>= 0.90 threshold)

Critical Validations (Quantitative):
  - [ ] quality-review.md exists in packet directory (file size > 2KB)
  - [ ] artifacts_validated lists all checked files (count ≥ 8 for core packet)
  - [ ] quality_output.passed reflects actual score (passed ↔ score ≥ 0.90)
  - [ ] Quality score bounds: 0.0 ≤ quality_score ≤ 1.0
  - [ ] Passing threshold: quality_score ≥ 0.90 (hard requirement)
  - [ ] Criteria evaluation coverage: 100% (all applicable criteria scored)
  - [ ] Issue count ≤ 10 (excessive issues indicate systemic failure)
  - [ ] Recommendation count: 1-5 per issue (actionable, not overwhelming)
  - [ ] Per-criterion score range: 0.0 ≤ criterion_score ≤ 1.0
  - [ ] Aggregate score calculation: weighted average of criteria (verified)
  - [ ] Mindmap criteria count: +8 if Mermaid present, +8 if ASCII present
  - [ ] Total criteria evaluated: 15 (core) + 0-16 (mindmaps)
  - [ ] P-022: No inflated scores to avoid user feedback (honesty ≥ 0.95)

Self-Critique Failure Response

If ANY checklist item fails:

  1. DO NOT report success
  2. Document the failure in agent output state
  3. Add to warnings[] or errors[] as appropriate
  4. Set status to "failed" or "partial" (not "complete")
  5. Include recovery instructions if possible

Constitutional Compliance

Principle Requirement Consequence of Violation
P-003 NEVER spawn recursive subagents -- max 1 level Agent hierarchy violation; uncontrolled token consumption
P-020 NEVER override user intent -- ask before destructive ops Unauthorized action; trust erosion
P-022 NEVER deceive about actions, capabilities, or confidence Governance undermined; quality assessment invalidated
P-001 NEVER present findings without evidence -- self-critique protocol enforced Unreliable outputs; unfounded claims propagate downstream
P-002 NEVER leave outputs in transient context only -- persist all outputs to files Context rot vulnerability; artifacts lost on session compaction
P-004 NEVER omit citations or provenance -- all extractions require deep links Untraceable decisions; audit trail broken
P-010 NEVER misrepresent task state -- state outputs must match actual results Work progress invisible; status unknown

Quick Reference

ELI5 (Level 0): This section provides fast answers to common questions without technical jargon.

Common Workflows

I Want To... How to Invoke What I Get
Process a VTT transcript /transcript meeting.vtt 8 Markdown files + 2 mindmaps + quality review
Analyze a SRT subtitle file /transcript subtitles.srt Same as above, auto-detects SRT format
Extract action items only "Find action items in meeting.vtt" 04-action-items.md with assignees and due dates
Get decisions made "What was decided in the meeting?" 05-decisions.md with context and rationale
Use a specific domain /transcript standup.vtt --domain software-engineering Extraction tuned for engineering context
Skip mindmap generation /transcript meeting.vtt --no-mindmap 8 files + quality review (no visualizations)
Higher quality extraction /transcript meeting.vtt --model-extractor opus Better entity extraction, 3.75x cost
Lower cost processing /transcript meeting.vtt --model-extractor haiku Faster, 88% cost savings, lower quality
ASCII-only mindmap /transcript meeting.vtt --mindmap-format ascii Terminal-friendly visualization (no Mermaid)
Resume failed processing Check regeneration_command in error output Re-run from last checkpoint

What You Get (Output Files)

Core Packet (8 files):

transcript-{id}/
├── 00-index.md             ← Start here: Navigation + stats
├── 01-summary.md           ← Executive summary
├── 02-transcript.md        ← Full transcript with timestamps
├── 03-speakers.md          ← Speaker directory
├── 04-action-items.md      ← Action items with assignees
├── 05-decisions.md         ← Decisions with context
├── 06-questions.md         ← Open questions
├── 07-topics.md            ← Topic-based navigation
└── _anchors.json           ← Deep linking registry

Mindmaps (default: both formats):

08-mindmap/
├── mindmap.mmd             ← Mermaid diagram (rendered by tools)
└── mindmap.ascii.txt       ← ASCII art (terminal-friendly)

Quality Assurance:

quality-review.md           ← Validation results (>= 0.90 score)

Processing Time Estimates

Transcript Size Segments Duration VTT (Python) SRT/TXT (LLM)
Small < 500 < 30 min ~5 seconds ~30 seconds
Medium 500-2000 30-90 min ~10 seconds ~2 minutes
Large 2000-5000 90-180 min ~20 seconds ~5 minutes
Very Large > 5000 > 3 hours ~30 seconds ~10 minutes

Note: VTT files use deterministic Python parser (1,250x faster than LLM).
SRT and plain text files require LLM processing (slower but still sub-minute for most).

Cost Estimates (Per 10,000 Input Tokens)

Configuration Estimated Cost Quality Trade-off
Default (mixed) ~$0.12 Balanced - recommended for most use cases
All Haiku ~$0.015 88% savings, ~70-75% extraction accuracy
All Sonnet ~$0.15 Baseline quality, ~85-90% accuracy
All Opus ~$0.75 Highest quality (6x cost), ~95%+ accuracy
Extractor Opus only ~$0.45 Targeted quality boost (3.75x cost for extractions)

Cost Breakdown (Default):

  • ts-parser (haiku): ~$0.0025 - Simple orchestration
  • ts-extractor (sonnet): ~$0.06 - Most token-intensive
  • ts-formatter (sonnet): ~$0.03 - Template-based generation
  • ts-mindmap-* (sonnet): ~$0.015 - Hierarchical reasoning
  • ps-critic (sonnet): ~$0.015 - Quality evaluation

Quality Thresholds

Metric Target What It Means How to Improve
Extraction precision > 85% % of extracted entities that are correct Use --model-extractor opus
Extraction recall > 85% % of actual entities successfully extracted Improve transcript quality (better audio)
Confidence threshold >= 0.7 Minimum confidence to include entity Increase threshold for higher precision
Quality score >= 0.90 ps-critic aggregate validation score Fix issues listed in quality-review.md

Troubleshooting Common Issues

Problem Likely Cause Solution
"File not found" error Incorrect path Use absolute path or check working directory
Parsing fails (VTT) Encoding issue File may be UTF-16 (use --format srt to force LLM)
Low quality score Poor transcript quality Review quality-review.md for specific issues
Missing action items Implicit language Use --model-extractor opus for better semantic understanding
Mindmap syntax error Mermaid validation failed Check mindmap.mmd for plain text node requirement
ASCII mindmap too wide Terminal width < 80 chars Resize terminal or use Mermaid version
High cost Using Opus for all agents Switch to default or use --model-extractor opus only
Context window exceeded File too large (rare) Python chunker should prevent this; report as bug

Orchestration Troubleshooting (F-005)

PURPOSE: Diagnose and resolve pipeline orchestration failures.
Use this section when the skill appears "stuck" or agents aren't executing in sequence.


Issue: Pipeline Stuck After ts-parser

Symptom:

ts-parser completed successfully
Status: ts_parser_output.validation_passed = true
Expected: ts-extractor should start automatically
Actual: No activity, pipeline appears frozen

Diagnostic Steps:

  1. Check state propagation:
# Verify ts_parser_output state key exists
cat /output/.skill-state.json | jq '.ts_parser_output'
  1. Verify required files exist:
ls -lh /output/index.json
ls -lh /output/chunks/
  1. Check for blocking errors:
cat /output/.skill-state.json | jq '.ts_parser_output.errors'

Resolution:

If state key missing:

# Orchestration bug - restart from parsing
uv run jerry transcript parse <file> --output-dir /output/

If files missing despite validation_passed = true:

# File persistence violation (P-002) - report as bug
uv run jerry transcript debug --validate-persistence /output/

If errors[] is non-empty despite status = "success":

# State inconsistency - force re-run extraction
uv run jerry transcript resume --from extraction --force

Issue: Agent Timeout During Extraction

Symptom:

ts-extractor started processing chunks
Progress: 8/15 chunks processed
Error: Agent timeout after 300 seconds

Diagnostic Steps:

  1. Check chunk sizes:
du -h /output/chunks/*.json
# Look for chunks > 200KB (may exceed token limits)
  1. Check partial extraction report:
ls -lh /output/extraction-report.json
# File may exist but be incomplete
  1. Estimate remaining time:
# Average 30-40 seconds per chunk for large chunks
# Remaining: (15 - 8) * 35 = 245 seconds ≈ 4 minutes

Resolution:

Increase timeout:

uv run jerry transcript parse <file> --extractor-timeout 600  # 10 minutes

OR reduce chunk size (re-parse):

uv run jerry transcript parse <file> --chunk-target-tokens 12000  # Smaller chunks

Issue: Quality Gate Fails Immediately

Symptom:

ps-critic started
Error: Missing required files - cannot validate
Status: quality_output.status = "failed"

Diagnostic Steps:

  1. Check packet directory structure:
ls -lh /output/transcript-*/
# Should have 00-07 files + _anchors.json
  1. Verify ts_formatter_output.files_created:
cat /output/.skill-state.json | jq '.ts_formatter_output.files_created'
  1. Check for file write errors:
cat /output/.skill-state.json | jq '.ts_formatter_output.errors'

Resolution:

If files incomplete:

# Re-run formatting phase
uv run jerry transcript resume --from formatting --packet-path /output/transcript-*/

If permission errors:

chmod -R 755 /output/transcript-*/
uv run jerry transcript resume --from formatting

Issue: File Conflicts Prevent Execution

Symptom:

Error: Output directory already exists: /output/transcript-meeting-001/
Conflict detected - refusing to overwrite

Diagnostic Steps:

  1. Check existing files:
ls -lh /output/transcript-meeting-001/
  1. Verify if files are from previous run:
stat /output/transcript-meeting-001/00-index.md
# Check modification timestamp

Resolution:

Option 1: Archive existing output:

mv /output/transcript-meeting-001/ /output/archive/transcript-meeting-001-$(date +%Y%m%d-%H%M%S)/
uv run jerry transcript parse <file> --output-dir /output/transcript-meeting-001/

Option 2: Use timestamped directory:

uv run jerry transcript parse <file> --output-dir /output/transcript-meeting-001-$(date +%Y%m%d-%H%M%S)/

Option 3: Merge mode (update only changed files):

uv run jerry transcript parse <file> --output-dir /output/transcript-meeting-001/ --merge

Issue: Mindmap Generation Partial Failure

Symptom:

ts-mindmap-mermaid: failed
ts-mindmap-ascii: complete
Status: ts_mindmap_output.overall_status = "partial"

Diagnostic Steps:

  1. Check Mermaid error message:
cat /output/.skill-state.json | jq '.ts_mindmap_output.mermaid.error_message'
  1. Verify ASCII mindmap is usable:
cat /output/08-mindmap/mindmap.ascii.txt
  1. Check if Mermaid file exists (but invalid):
ls -lh /output/08-mindmap/mindmap.mmd
head -20 /output/08-mindmap/mindmap.mmd  # Inspect syntax

Resolution:

If syntax error (markdown links in nodes):

# Regenerate with plain text enforcement
uv run jerry transcript mindmap --format mermaid --packet-path /output/ --plain-text-only

If acceptable to use ASCII only:

# No action required - ASCII mindmap is valid
cat /output/08-mindmap/mindmap.ascii.txt

If both formats required:

# Debug Mermaid generation
uv run jerry transcript mindmap --format mermaid --packet-path /output/ --debug
# Fix issues manually, then validate
uv run jerry transcript validate-mindmap /output/08-mindmap/mindmap.mmd

Issue: State Key Mismatch (INV-EXT-001 Violation)

Symptom:

Error: State key mismatch detected
ts_extractor_output.action_count = 23
len(action_items) in extraction-report.json = 25
Violation: INV-EXT-001 (entity count consistency)
ts-formatter refused to proceed

Diagnostic Steps:

  1. Dump extraction report:
cat /output/extraction-report.json | jq '.extraction_stats'
cat /output/extraction-report.json | jq '.action_items | length'
  1. Verify state key:
cat /output/.skill-state.json | jq '.ts_extractor_output.action_count'
  1. Compare values:
# Should be exactly equal (tolerance: 0)

Resolution:

Option 1: Regenerate extraction report (preferred):

rm /output/extraction-report.json
uv run jerry transcript resume --from extraction --packet-path /output/

Option 2: Report bug with diagnostic info:

uv run jerry transcript debug --dump-state /output/ > state-dump.json
# Attach state-dump.json to bug report
# Include extraction-report.json and .skill-state.json

Option 3: Skip validation (CAUTION - may propagate errors):

# Only use if deadline-critical and you understand the risk
uv run jerry transcript resume --from formatting --skip-validation

Issue: Python Parser Fails, Fallback Doesn't Trigger

Symptom:

Python VTT parser failed: ModuleNotFoundError: No module named 'webvtt'
Expected: Fallback to LLM parsing
Actual: Pipeline halts with fatal error

Diagnostic Steps:

  1. Check Python dependencies:
uv pip list | grep webvtt
uv pip list | grep tiktoken
  1. Verify UV environment:
uv sync --verbose
  1. Check fallback configuration:
cat /output/.skill-state.json | jq '.ts_parser_output.fallback_triggered'

Resolution:

Fix dependencies (preferred):

uv sync
uv run jerry transcript parse <file> --output-dir /output/

Force LLM parsing (bypass Python parser):

uv run jerry transcript parse <file> --force-llm --output-dir /output/

Manual fallback:

# If Python parser consistently fails, use SRT format override
uv run jerry transcript parse <file> --format srt --output-dir /output/
# This forces LLM parsing even for VTT files

Cross-Skill Integration: /problem-solving

The transcript skill integrates with the /problem-solving skill for quality validation.

ps-critic Agent Usage

What is ps-critic?

  • ps-critic is a problem-solving agent from the /problem-solving skill
  • Role: Quality Inspector (validates transcript outputs against criteria)
  • Specialization: Systematic review using 5W2H, Ishikawa, Pareto frameworks

How transcript skill uses ps-critic:

Phase 4: Quality Review
─────────────────────────
Invoke: ps-critic (from /problem-solving skill)
Input: All packet files (00-07) + mindmaps (08-)
Extension: skills/transcript/validation/ts-critic-extension.md
Threshold: 0.90

Why ps-critic, not a transcript-specific validator?

  • Reusability - ps-critic validates ANY skill's outputs
  • Framework-driven - Uses proven quality frameworks (5W2H, etc.)
  • Extensibility - ts-critic-extension.md adds transcript-specific criteria

Extension mechanism:

# ts-critic-extension.md (Transcript-Specific Criteria)

T-001: Timestamp completeness (>= 95% of segments have timestamps)
T-002: Speaker attribution (>= 90% of segments attributed)
T-003: Segment ordering (chronological order maintained)
T-004: Citation coverage (>= 95% of extractions have citations)
T-005: Confidence scores (>= 70% of extractions >= 0.7 confidence)
T-006: Token budget compliance (all files < 35K tokens)

MM-001: Mermaid syntax validation (mindmap is valid Mermaid)
MM-002: Deep link reference block (contains anchor mapping)

AM-001: ASCII line width (all lines <= 80 characters)
AM-002: Box drawing characters (UTF-8 box-drawing used correctly)

See also:


Cross-Skill Integration: /orchestration

The transcript skill uses orchestration patterns from the /orchestration skill.

Multi-Phase Pipeline Management

What is /orchestration?

  • Framework for coordinating multi-agent workflows
  • Provides sync barriers, state checkpointing, cross-pollinated pipelines
  • Used when work has dependencies and requires coordination

How transcript skill uses orchestration:

Transcript Pipeline (v2.1):
──────────────────────────
Phase 1: Parse + Chunk     → ts-parser
  ↓ (Sync Barrier: Wait for index.json)
Phase 2: Extract Entities   → ts-extractor
  ↓ (Sync Barrier: Wait for extraction-report.json)
Phase 3: Format Packet      → ts-formatter
  ↓ (Sync Barrier: Wait for packet files)
Phase 3.5: Generate Mindmaps → ts-mindmap-*
  ↓ (Sync Barrier: Wait for mindmap files)
Phase 4: Quality Review     → ps-critic

Orchestration patterns used:

Pattern Usage in Transcript Skill
Sequential Phases 5 phases run in strict order
State Passing Each phase outputs state for next
Sync Barriers Wait for file existence before proceeding
Graceful Degradation Mindmap failures don't block Phase 4
Checkpoint Recovery Can resume from any phase

Example: Sync Barrier Implementation

After Phase 1 (ts-parser):
  ─────────────────────────
  1. Wait for index.json to exist
  2. Validate index.json schema
  3. IF valid: Proceed to Phase 2
  4. ELSE: Retry or abort

See also:


Cross-Skill Integration: /nasa-se

The transcript skill applies NASA Systems Engineering principles for quality assurance.

Requirements Traceability

What is /nasa-se?

  • Framework based on NPR 7123.1D (NASA Systems Engineering)
  • Provides verification & validation (V&V) processes
  • Used for requirements traceability and quality gates

How transcript skill uses NASA SE:

Quality Gate (ps-critic Phase):
────────────────────────────────
Criterion T-004: Citation Coverage >= 95%

Traceable to:
  - REQ-EXT-003: "All extractions MUST include source citations"
  - PAT-004: Citation Required Pattern
  - ADR-003: Anchor Registry specification

Verification method:
  1. Count total extractions in extraction-report.json
  2. Count extractions with non-null citation field
  3. Calculate ratio: citations / total
  4. IF ratio >= 0.95: PASS
  5. ELSE: FAIL with specific missing citations list

V&V Framework Application:

NASA SE Process Transcript Skill Implementation
Requirements Analysis ADR-002 (packet structure), ADR-003 (anchors)
Design Verification Architecture tests (hexagonal compliance)
Implementation Validation ps-critic quality review (0.90 threshold)
Traceability Criterion → REQ → Pattern → Implementation

Quality Criteria Traceability Example:

MM-001: Mermaid syntax validation
  ├─ Requirement: REQ-VIZ-001 (mindmaps must render correctly)
  ├─ Design: ADR-006 Section 5.3 (Mermaid syntax rules)
  ├─ Pattern: PAT-VIZ-001 (declarative visualization)
  ├─ Implementation: ts-mindmap-mermaid agent
  └─ Validation: ps-critic MM-001 criterion

Verification:
  1. Parse mindmap.mmd with Mermaid library
  2. Check for syntax errors
  3. Validate structure (root → branches → leaves)
  4. IF errors: FAIL (quality score penalty)
  5. ELSE: PASS

See also:


Related Documents

Backlinks

Forward Links


Routing Disambiguation

When this skill is the wrong choice and what happens if misrouted.

Condition Use Instead Consequence of Misrouting
Non-transcript file analysis (code, markdown, config files) /problem-solving or Read tool directly Transcript parsing agents (ts-parser, ts-extractor) applied to non-VTT/SRT files produce parsing failures; hybrid Python+LLM architecture expects transcript-format input
General text processing or summarization /problem-solving (ps-analyst or ps-synthesizer) Transcript agents apply speaker identification, timestamp extraction, and domain-specific entity extraction patterns that are irrelevant to general text; 1,250x cost multiplier if Task agents invoked unnecessarily vs. direct text processing
Requirements engineering or design work /nasa-se Transcript skill extracts meeting content; requirements formalization and V&V traceability require NASA-SE methodology
Adversarial quality review of deliverables /adversary Transcript agents score extraction quality against transcript-specific criteria, not the S-014 LLM-as-Judge rubric for deliverable quality
Research, analysis, or root cause investigation /problem-solving Transcript agents have no analytical methodology beyond extraction; research and causal investigation require ps-researcher or ps-investigator
Security assessment or threat modeling /eng-team Transcript skill has no security methodology; security-engineering domain context provides STRIDE/DREAD extraction but does not replace eng-team threat modeling

Agent Details

For detailed agent specifications, see:

  • agents/ts-parser.md - Parser/Orchestrator agent definition (v2.0)
  • agents/ts-extractor.md - Extractor agent definition
  • agents/ts-formatter.md - Formatter agent definition
  • agents/ts-mindmap-mermaid.md - Mermaid mindmap agent definition (ADR-006)
  • agents/ts-mindmap-ascii.md - ASCII mindmap agent definition (ADR-006)
  • skills/problem-solving/agents/ps-critic.md - Critic agent (shared)

Document History

Version Date Changes Enabler/Task
1.0.0 2026-01-26 Initial SKILL.md with agent pipeline and basic invocation FEAT-001
2.0.0 2026-01-30 Hybrid Python+LLM architecture per DISC-009 findings EN-025, DISC-009
2.1.0 2026-01-30 Mindmap pipeline integration (default ON, opt-out via --no-mindmap) ADR-006, EN-024
2.2.0 2026-01-29 Explicit CLI invocation instructions for Phase 1 parsing EN-024 (verification gap fix)
2.3.0 2026-01-30 Token-based chunking (fixes BUG-001 chunk token overflow) EN-026, BUG-001
2.4.0 2026-01-30 EN-028 Compliance: Added invoking section with natural language patterns, enhanced state passing with error handling, file persistence requirements (P-002 checklist), self-critique protocol (pre-finalization checks), restructured for triple-lens audience (L0/L1/L2), expanded Quick Reference with troubleshooting, model selection documentation (--model-* flags). EN-028, TASK-407-411
2.4.1 2026-01-30 G-028 Iteration 2 Refinements (F-001 to F-005): Added 5 error invocation examples with error outputs (F-001); documented error state structures for all agents including propagated errors (F-002); expanded recovery scenarios from 3 to 10 covering Python parser failure, context overflow, file write errors, quality gate failure, timeouts, partial extraction, mindmap failures, file conflicts, missing dependencies, and state corruption (F-003); added quantitative thresholds to all agent self-critique checklists with numeric bounds for validation (F-004); expanded Quick Reference troubleshooting section with 7 orchestration failure scenarios including pipeline stuck, agent timeout, quality gate immediate failure, file conflicts, mindmap partial failure, state key mismatch, and Python parser fallback issues (F-005). Target: raise G-028 score from 0.78 to ≥ 0.90. EN-028, G-028 Iteration 2
2.4.2 2026-01-30 EN-030 Documentation Polish: Added 6 comprehensive tool examples (Bash, Read, Write, Task, Glob, Grep) with execution evidence (TASK-416); Added 6 design rationale deep-dives covering hybrid architecture, chunking strategy, mindmap default-on, quality threshold 0.90, dual citation system, and P-003 compliance (TASK-417); Added 3 cross-skill integration sections for /problem-solving, /orchestration, and /nasa-se (TASK-418). Target: raise G-030 score from 0.83 to ≥ 0.95. EN-030, TASK-416-418
2.5.0 2026-01-31 ADR-007 Model-Agnostic Output: Added "Model-Agnostic Output Requirements (ADR-007)" section with MUST-CREATE (8 files), MUST-NOT-CREATE lists, anchor format rules, citation format, and navigation requirements. Ensures output consistency across Sonnet/Opus/Haiku models. FEAT-006 Phase 4 implementation. ADR-007, FEAT-006

Skill Version: 2.5.0
Architecture: Hybrid Python+LLM (Strategy Pattern) + Mindmap Generation + Token-Based Chunking
Constitutional Compliance: Jerry Constitution v1.0 (P-001, P-002, P-003, P-004, P-010, P-020, P-022)
ADR Compliance: ADR-002 (packet structure), ADR-003 (anchor registry), ADR-004 (file splitting), ADR-006 (mindmap), ADR-007 (output template)
Created: 2026-01-26
Last Updated: 2026-01-31 (ADR-007 Model-Agnostic Output)