maragudk
@maragudk Organization
Public Skills
rodney
by maragudk
Guide for automating Chrome browser interactions using the rodney CLI. This skill should be used when performing web automation tasks such as navigating to pages, taking screenshots, clicking elements, filling forms, extracting page content, or any other browser-based interaction.
go
by maragudk
Guide for how to develop Go apps and modules/libraries. Always use this skill when reading or writing Go code.
bluesky
by maragudk
Guide for posting content to the Bluesky social network using the bsky terminal app. This skill should be used proactively when working in public repositories and there is interesting, shareable content (new features, insights, achievements, or announcements worth sharing with the community). Use it when asked to post to Bluesky, or when content seems worth sharing publicly.
design-doc
by maragudk
For when you're asked to write a design doc or specification, especially after a brainstorm or feature design session.
collaboration
by maragudk
Guide for collaborating on GitHub projects. This skill should be used when contributing to projects, creating PRs, reviewing code, or managing issues on GitHub.
marimo
by maragudk
Guide for creating and working with marimo notebooks, the reactive Python notebook that stores as pure .py files. This skill should be used when creating, editing, running, or deploying marimo notebooks.
datastar
by maragudk
Guide for building interactive web UIs with Datastar and gomponents-datastar. Use this skill when adding frontend interactivity to Go web applications with Datastar attributes.
save-web-page
by maragudk
Guide for saving a web page for offline use using the monolith CLI. Use this when instructed to save a web page.
worktrees
by maragudk
Guide for using git worktrees to parallelize development with coding agents. Use this skill when the user requests to work in a new worktree or wants to work on a separate feature in isolation (e.g., "Work in a new worktree", "Create a worktree for feature X").
git
by maragudk
Guide for using git according to my preferences. Use it when you're asked to commit something.
code-review
by maragudk
Guide for making code reviews. Use this when asked to make code reviews, or ask to use it before committing changes.
journal
by maragudk
Guide for using the AI's persistent journal database
brainstorm
by maragudk
Guide for how to brainstorm an idea and turn it into a fully formed design.
gomponents
by maragudk
Guide for working with gomponents, a pure Go HTML component library. Use this skill when reading or writing gomponents code, or when building HTML views in Go applications.
Observable Plot Skill
by maragudk
GitHub Repository
observable-notebooks
by maragudk
Guide for creating Observable Notebooks 2.0, the open-source notebook system for interactive data visualization and exploration. Use this skill when creating, editing, or building Observable notebooks.
nanobanana
by maragudk
Guide for generating and editing images using generative AI with the nanobanana CLI
sql
by maragudk
Guide for working with SQL queries, in particular for SQLite. Use this skill when writing SQL queries, analyzing database schemas, designing migrations, or working with SQLite-related code.
decisions
by maragudk
Guide for recording significant architectural and design decisions in docs/decisions.md. Use this skill when clearly significant architectural decisions are made (database choices, frameworks, core design patterns) or when explicitly asked to document a decision. Be conservative - only suggest for major decisions, not minor implementation details.
trace-annotation-tool
by maragudk
Generate a custom trace annotation web app for open coding during LLM error analysis. Use when the user wants to review LLM traces, annotate failures with freeform comments, and do first-pass qualitative labeling (open coding). Also use when the user mentions "annotate traces", "trace review tool", "open coding tool", "label traces", "build an annotation interface", "review LLM outputs", or wants to manually inspect pipeline traces before building a failure taxonomy. This skill produces a tailored Python web application using FastHTML, TailwindCSS, and HTMX.
prompt-engineering
by maragudk
"Use this skill when crafting, reviewing, or improving prompts for LLM pipelines — including task prompts, system prompts, and LLM-as-Judge prompts. Triggers include: requests to write or refine a prompt, diagnose why an LLM produces inconsistent or incorrect outputs, bridge the gap between intent and model behavior, reduce ambiguity in instructions, add few-shot examples, structure complex prompts, or improve output formatting. Also use when the user needs help distinguishing specification failures (unclear instructions) from generalization failures (model limitations), or when iterating on prompts based on observed failure modes. Do NOT use for general coding tasks, document creation, or non-LLM writing."
failure-taxonomy
by maragudk
Build a structured taxonomy of failure modes from open-coded trace annotations. Use this skill whenever the user has freeform annotations from reviewing LLM traces and wants to cluster them into a coherent, non-overlapping set of binary failure categories (axial coding). Also use when the user mentions "failure modes", "error taxonomy", "axial coding", "cluster annotations", "categorize errors", "failure analysis", or wants to go from raw observation notes to structured evaluation criteria. This skill covers the full pipeline: grouping open codes, defining failure modes, re-labeling traces, and quantifying error rates.
llm-as-a-judge
by maragudk
Build, validate, and deploy LLM-as-Judge evaluators for automated quality assessment of LLM pipeline outputs. Use this skill whenever the user wants to: create an automated evaluator for subjective or nuanced failure modes, write a judge prompt for Pass/Fail assessment, split labeled data for judge development, measure judge alignment (TPR/TNR), estimate true success rates with bias correction, or set up CI evaluation pipelines. Also trigger when the user mentions "judge prompt", "automated eval", "LLM evaluator", "grading prompt", "alignment metrics", "true positive rate", or wants to move from manual trace review to automated evaluation. This skill covers the full lifecycle: prompt design → data splitting → iterative refinement → success rate estimation.