breethomas

@breethomas

GitHub

27 Skills

432 Total Stars

March 2026 Joined

Public Skills

build-judge

by breethomas

Build an LLM-as-Judge evaluator for one specific failure mode. Binary pass/fail only. Use when a failure mode requires interpretation (tone, faithfulness, relevance, completeness) and cannot be checked with code. Do NOT use when the failure can be checked with regex, schema validation, or execution tests. Do NOT use before completing error analysis (/upgrade-evals).

Processing 16 4mo ago

calibrate

by breethomas

Post-launch calibration workflow for AI features. Document error patterns, review eval performance, and decide on agency promotion. Based on the CC/CD framework for continuous calibration of AI products.

Code Review 16 4mo ago

generate-test-data

by breethomas

Create diverse synthetic test inputs using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead).

Code Gen 16 4mo ago
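
A minimal sketch of what dimension-based tuple generation could look like, assuming each test input is described by one value per dimension. The dimension names, their values, and the helpers `generate_tuples` and `to_prompt` are hypothetical illustrations, not part of the skill itself.

```python
import itertools
import random

# Hypothetical dimensions for a customer-support assistant; replace with your own.
DIMENSIONS = {
    "persona": ["new user", "power user", "admin"],
    "intent": ["billing question", "bug report", "feature request"],
    "tone": ["neutral", "frustrated"],
}


def generate_tuples(dimensions, sample_size=None, seed=0):
    """Return one dict per combination of dimension values.

    With sample_size=None every combination is returned; otherwise a
    reproducible random subset of that size is drawn.
    """
    names = list(dimensions)
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*(dimensions[n] for n in names))
    ]
    if sample_size is None or sample_size >= len(combos):
        return combos
    return random.Random(seed).sample(combos, sample_size)


def to_prompt(t):
    """Turn one tuple into an instruction for whoever (or whatever) writes the input."""
    return (
        f"Write a message from a {t['persona']} with a {t['intent']} "
        f"in a {t['tone']} tone."
    )


if __name__ == "__main__":
    for t in generate_tuples(DIMENSIONS, sample_size=5):
        print(to_prompt(t))
```

Sampling from the full cross product keeps coverage broad while letting you cap the dataset size; fixing the seed makes the sample reproducible between runs.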

upgrade-evals

by breethomas

Systematic error analysis on real AI traces. Read traces, judge pass/fail, let failure categories emerge from data, compute failure rates, decide what to fix. Use when you have 50+ test cases or are seeing production failures. Do NOT use when you have fewer than 20 test cases (use /start-evals first).

Processing 16 4mo ago
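
As a rough illustration of the "compute failure rates" step, the sketch below assumes each reviewed trace has already been labeled with a pass/fail verdict and, for failures, a category that emerged during review. The field names `passed` and `failure_category` and the function `failure_rates` are assumptions made for this example.

```python
from collections import Counter


def failure_rates(traces):
    """Return {category: share of all traces} for failed traces, most common first.

    Each trace is a dict with a boolean 'passed' and, when it failed,
    a 'failure_category' string (both field names are hypothetical).
    """
    total = len(traces)
    if total == 0:
        return {}
    counts = Counter(t["failure_category"] for t in traces if not t["passed"])
    return {category: n / total for category, n in counts.most_common()}


if __name__ == "__main__":
    labeled = [
        {"passed": True},
        {"passed": False, "failure_category": "hallucinated policy"},
        {"passed": False, "failure_category": "hallucinated policy"},
        {"passed": False, "failure_category": "wrong tone"},
    ]
    for category, rate in failure_rates(labeled).items():
        print(f"{category}: {rate:.0%}")
```

Dividing by all traces rather than by failures alone keeps the rates comparable across review rounds of different sizes.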

eval-rag

by breethomas

Evaluate RAG pipeline retrieval and generation quality separately. Measure Recall@k, Precision@k, MRR, NDCG@k for retrieval. Assess faithfulness and relevance for generation. Use when the AI feature uses retrieval (search, knowledge base, document QA). Do NOT use for non-RAG AI features.

Debugging 16 4mo ago
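
The retrieval metrics named above can be sketched with binary relevance labels as follows. This is an illustrative implementation under stated assumptions (unique document IDs in each ranking, a known set of relevant IDs per query), not the skill's own code; all function names are hypothetical.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Share of relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Share of the top k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """MRR over an iterable of (retrieved, relevant) pairs."""
    queries = list(queries)
    if not queries:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def ndcg_at_k(retrieved, relevant, k):
    """NDCG@k with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(retrieved[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    retrieved = ["d3", "d1", "d7", "d2"]
    relevant = {"d1", "d2"}
    print(recall_at_k(retrieved, relevant, 3))     # 0.5
    print(precision_at_k(retrieved, relevant, 3))
    print(reciprocal_rank(retrieved, relevant))    # 0.5
    print(ndcg_at_k(retrieved, relevant, 3))
```

Faithfulness and relevance of the generated answer are not covered here, since they require interpretation rather than a ranking computation.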

agency-ladder

by breethomas

Plan the v1→v2→v3 agency progression for AI features. Map how autonomy increases over time, define promotion criteria, and generate artifacts for stakeholder alignment. Based on the CC/CD framework.

Code Gen 16 4mo ago

prd-writer

by breethomas

Full 5-stage PRD framework for complex features. Use for deep PRD work via /spec --deep full-prd. For quick feature specs, use /spec --feature instead.

Code Review 16 4mo ago

agent-workflow

by breethomas

Expert system for designing and architecting AI agent workflows based on proven Meta methodologies. Use when users need to build AI agents, create agent workflows, solve problems using agentic systems, integrate multiple tools into agent architectures, or need guidance on agent design patterns. Helps translate business problems into structured agent solutions with clear scope, tool integration, and multi-layer architecture planning.

Agents 16 4mo ago

workspace-calibration

by breethomas

Analyze Linear workspace health and usage patterns before jumping into backlog work. Like a pre-flight check for a new PM joining a team or organization.

Code Gen 16 4mo ago

lno-prioritize

by breethomas

Find out if you're spending time on the wrong things. Categorize backlog by Leverage/Neutral/Overhead and challenge your time allocation.

Automation 16 4mo ago

four-fits

by breethomas

Find which fit is broken before you burn cash scaling. Brian Balfour's framework for validating sustainable growth readiness.

Debugging 16 4mo ago

shape-up

by breethomas

Shape work using the Shape Up methodology (Ryan Singer, Basecamp). Walk through the 4-step shaping process to create pitches ready for betting. Distinguishes between established product mode (fixed time, variable scope) and new product mode (looser constraints). Use when planning cycle work, writing pitches, or coaching PMs on shaping.

Automation 16 4mo ago

strategy-session

by breethomas

Your product sounding board. Work through product decisions conversationally: Claude gathers context, challenges assumptions, captures decisions, and creates Linear issues.

Auth 16 4mo ago

coder

by breethomas

Apply Brian Balfour's CODER framework to drive organizational AI adoption. Constraints, Ownership, Directives, Expectations, Rewards.

Code Gen 16 4mo ago

now-next-later

by breethomas

Generate a Now-Next-Later roadmap using Janna Bastow's framework. Communicates sequence and certainty without false dates.

Code Gen 16 4mo ago

project-health

by breethomas

Deep-dive health check on a single Linear project. Produces an assessment across 7 dimensions with a status of On Track, At Risk, or Stalled.

Agents 16 4mo ago

four-risks

by breethomas

Run Marty Cagan's Four Risks assessment on an issue (value, usability, feasibility, viability). Use when evaluating features before building.

Comments 16 4mo ago

competitive-research

by breethomas

Systematic competitive intelligence with parallel agent analysis. Analyzes competitors thoroughly and synthesizes into actionable insights.

Academic 16 4mo ago

ai-cost-check

by breethomas

Calculate AI feature costs and challenge whether you actually need the feature. Invokes the ai-cost-analyzer agent for detailed economics modeling.

Code Review 16 4mo ago

pm-frameworks

by breethomas

Expert knowledge of proven product management frameworks for discovery, growth, measurement, planning, and AI-era practices.

Automation 16 4mo ago

ai-debug

by breethomas

Diagnose why an AI feature is underperforming, hallucinating, or behaving inconsistently. Uses a 4D audit to work backwards from symptoms to root cause.

Code Review 16 4mo ago

spec

by breethomas

Write specifications at the right depth for any project. Progressive disclosure from quick Linear issues to full AI feature specs. Embeds Linear Method philosophy (brevity, clarity, momentum) with context engineering for AI features. Use for any spec work: quick tasks, features, or AI products.

Code Gen 16 4mo ago

context-engineering

by breethomas

[ARCHIVED] Full 4D Context Canvas reference. For new AI features, use /spec --ai. For debugging, use /ai-debug. For quality checks, use /context-check.

Code Review 16 4mo ago

pmf-survey

by breethomas

Create and analyze a PMF survey using Rahul Vohra's Superhuman framework. The magic 40% benchmark for product-market fit.

Code Gen 16 4mo ago
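
The 40% benchmark refers to the share of respondents who say they would be "very disappointed" if they could no longer use the product. A minimal sketch of that arithmetic, assuming responses arrive as free-text answer labels; the function names and the sample data are hypothetical.

```python
from collections import Counter

BENCHMARK = 0.40  # share of "very disappointed" answers used as the PMF threshold


def very_disappointed_share(responses):
    """Fraction of responses equal to 'very disappointed' (case-insensitive)."""
    counts = Counter(r.strip().lower() for r in responses)
    total = sum(counts.values())
    return counts["very disappointed"] / total if total else 0.0


def meets_benchmark(responses, benchmark=BENCHMARK):
    """True when the 'very disappointed' share reaches the benchmark."""
    return very_disappointed_share(responses) >= benchmark


if __name__ == "__main__":
    # Made-up sample: 9 + 8 + 3 = 20 responses.
    sample = (
        ["Very disappointed"] * 9
        + ["Somewhat disappointed"] * 8
        + ["Not disappointed"] * 3
    )
    print(f"{very_disappointed_share(sample):.0%}")  # 45%
    print(meets_benchmark(sample))                   # True
```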

prompt-engineering

by breethomas

Expert prompt optimization system for building production-ready AI features. Use when users request help improving prompts, want to create system prompts, need prompt review/critique, ask for prompt optimization strategies, want to analyze prompt effectiveness, mention prompt engineering best practices, request prompt templates, or need guidance on structuring AI instructions. Also use when users provide prompts and want suggestions for improvement.

Processing 16 4mo ago

ai-health-check

by breethomas

Pre-launch health check that blocks you from shipping broken AI features. Grades 6 dimensions (model selection, data quality, cost, monitoring, failure UX, optimization).

Code Review 16 4mo ago

growth-loops

by breethomas

Find your growth loop or stay stuck in linear acquisition hell. Identify viral, content, network, and paid loop opportunities using Elena Verna's framework.

Code Gen 16 4mo ago