dtsong

@dtsong

GitHub
15 Skills
159 Total Stars
February 2026 Joined

Public Skills

client-delivery

by dtsong

Use this skill when managing a consulting data cleaning engagement. Covers engagement setup, schema profiling, security tier selection, project scaffolding, deliverable generation, and client handoff. Common phrases: "set up a cleaning project", "profile this schema", "data cleaning engagement", "generate deliverables", "client handoff". Do NOT use for writing dbt models (use dbt-transforms), DuckDB queries (use duckdb), or pipeline orchestration (use data-pipelines).

Processing 11 3mo ago

data-pipelines

by dtsong

Use this skill when scheduling, orchestrating, or monitoring data pipelines. Covers Dagster assets, Airflow DAGs, Prefect flows, sensors, retries, alerting, and cross-tool integrations (dagster-dbt, dagster-dlt). Common phrases: "schedule this pipeline", "Dagster vs Airflow", "add retry logic", "pipeline alerting", "consulting pipeline". Do NOT use for building transformations (use dbt-transforms or python-data-engineering) or designing integration patterns (use data-integration).

Automation 11 3mo ago

data-testing

by dtsong

Use this skill when designing testing strategies for data pipelines, writing SQL assertions, validating pipeline output, or packaging tests as client deliverables. Covers dbt test patterns, pipeline validation, SQL assertion libraries, test coverage targets, and test-as-deliverable packaging. Common phrases: "data testing strategy", "pipeline validation", "SQL assertions", "test coverage", "test as deliverable", "data quality tests". Do NOT use for writing dbt models (use dbt-transforms), DuckDB analytical queries (use duckdb), or pipeline scheduling (use data-pipelines).

Processing 11 3mo ago

dbt-transforms

by dtsong

Use this skill when building or reviewing dbt models, tests, or project structure. Triggers on analytics engineering tasks including staging/marts layers, materializations, incremental strategies, Jinja macros, sources, warehouse configuration, DuckDB adapter, data cleaning, and deduplication patterns. Common phrases: "dbt model", "write a dbt test", "incremental strategy", "semantic layer", "dbt DuckDB", "cleaning patterns". Do NOT use for Python DataFrame code (use python-data-engineering), pipeline scheduling (use data-pipelines), or standalone DuckDB queries without dbt (use duckdb).

CI/CD 11 3mo ago

data-integration

by dtsong

Use this skill when designing data integrations or connecting systems. Covers iPaaS platforms (Workato, MuleSoft, Boomi), dlt pipelines, API patterns, CDC, webhooks, and Reverse ETL. Common phrases: "connect these systems", "build a dlt pipeline", "event-driven architecture", "change data capture". Do NOT use for stream processing frameworks (use event-streaming) or pipeline scheduling (use data-pipelines).

API Dev 11 3mo ago

duckdb

by dtsong

Use this skill when working with DuckDB for local data analysis, file ingestion, or data exploration. Covers reading CSV/Excel/Parquet/JSON files into DuckDB, SQL analytics on local data, data profiling, cleaning transformations, and export to various formats. Common phrases: "analyze this CSV", "DuckDB query", "local data analysis", "read Excel in SQL", "profile this data". Do NOT use for dbt model building (use dbt-transforms with DuckDB adapter) or cloud warehouse administration.

Processing 11 3mo ago

python-data-engineering

by dtsong

Use this skill when writing Python code for data pipelines or transformations. Covers Polars, Pandas, PySpark DataFrames, dbt Python models, API extraction scripts, and data validation with Pydantic or Pandera. Common phrases: "Polars vs Pandas", "PySpark DataFrame", "validate this data", "Python extraction script". Do NOT use for SQL-based dbt models (use dbt-transforms) or integration architecture (use data-integration).

API Dev 11 3mo ago

dlt-extract

by dtsong

Use this skill when building DLT pipelines for file-based or consulting data extraction. Covers Excel/CSV/SharePoint ingestion via DLT, destination swapping (DuckDB dev to warehouse prod), schema contracts for cleaning, and portable pipeline patterns. Common phrases: "dlt pipeline for files", "extract Excel with dlt", "portable data pipeline", "dlt filesystem source". Do NOT use for core DLT concepts like REST API or SQL database sources (use data-integration) or pipeline scheduling (use data-pipelines).

CI/CD 11 3mo ago

SKILL_NAME

by dtsong

TRIGGER_DESCRIPTION. Use when USER_CONTEXT. Covers CAPABILITIES.

Code Review 11 3mo ago

ai-data-integration

by dtsong

Use this skill when connecting AI or LLMs to data platforms. Covers MCP servers for warehouses, natural-language-to-SQL, embeddings for data discovery, LLM-powered enrichment, and AI agent data access patterns. Common phrases: "text-to-SQL", "MCP server for Snowflake", "LLM data enrichment", "AI agent access". Do NOT use for general data integration (use data-integration) or dbt modeling (use dbt-transforms).

Processing 10 3mo ago

data-governance

by dtsong

Use this skill when implementing data governance as part of engineering work. Covers data cataloging (dbt docs, external tools), lineage documentation, data classification (PII/PHI taxonomy), access control patterns (RBAC, row-level security), and compliance frameworks (GDPR, HIPAA, SOX, CCPA). Common phrases: "data catalog", "data lineage", "PII classification", "access control", "RBAC", "data governance", "compliance requirements". Do NOT use for writing dbt models (use dbt-transforms), pipeline orchestration (use data-pipelines), or data quality testing (use data-testing).

Processing 10 3mo ago

data-observability

by dtsong

Use this skill when implementing monitoring, alerting, and incident response for data pipelines. Covers freshness monitoring, volume anomaly detection, schema change detection, alerting patterns, and incident response workflows. Common phrases: "data freshness", "pipeline monitoring", "data anomaly", "schema drift", "data alerting", "incident response", "data observability", "stale data". Do NOT use for writing dbt models (use dbt-transforms), pipeline scheduling (use data-pipelines), or data quality testing as deliverables (use data-testing).

Processing 10 3mo ago

tsfm-forecast

by dtsong

Use this skill when generating time-series forecasting pipelines using foundation models. Covers TimesFM, Chronos, MOIRAI, and Lag-Llama model selection, DuckDB-based preprocessing code, Python inference generation, backtesting harnesses, multi-model comparison, and client forecast deliverables. Common phrases: "time-series forecast", "demand forecasting", "TimesFM", "Chronos", "predict future values", "zero-shot forecast". Do NOT use for ML model training or fine-tuning (use python-data-engineering), real-time/streaming forecasts (use event-streaming), or pipeline scheduling (use data-pipelines).

Code Gen 10 3mo ago

event-streaming

by dtsong

Use this skill when building real-time or near-real-time data pipelines. Covers Kafka, Flink, Spark Streaming, Snowpipe, BigQuery streaming, materialized views, and batch-vs-streaming decisions. Common phrases: "real-time pipeline", "Kafka consumer", "streaming vs batch", "low latency ingestion". Do NOT use for batch integration patterns (use data-integration) or pipeline orchestration (use data-pipelines).

Automation 10 3mo ago

SUITE_NAME

by dtsong

TRIGGER_DESCRIPTION. Use when USER_CONTEXT. Routes to specialists for CAPABILITIES.

Processing 10 3mo ago