ML Ops
Machine learning operations
transcribe-and-analyze
by buddyh
Transcribe audio and video from URLs (YouTube, direct media links) using WhisperKit locally. Optionally analyze transcripts with AI when explicitly requested. Use when users provide URLs to media content and request transcription or speech-to-text conversion.
midnight-concepts
by mzf11125
Foundational knowledge about Midnight Network zero-knowledge blockchain technology, privacy mechanisms, and architecture. Use when users need to understand zero-knowledge proofs, privacy mechanisms like Zswap and selective disclosure, partner chain architecture, real-world use cases for private DeFi and voting, when to use Midnight for privacy-preserving applications, and core concepts of the Midnight ecosystem.
data-cleaning-pipeline-generator
by Dexploarer
Generates data cleaning pipelines for pandas/polars with handling for missing values, duplicates, outliers, type conversions, and data validation. Use when user asks to "clean data", "generate data pipeline", "handle missing values", or "remove duplicates from dataset".
jupyter-notebook-assistant
by Dexploarer
Organizes, cleans, and optimizes Jupyter notebooks - removes empty cells, adds structure, extracts functions, generates documentation. Use when user asks to "clean notebook", "organize jupyter", "refactor notebook", or "jupyter best practices".
dlt-extract
by dtsong
Use this skill when building DLT pipelines for file-based or consulting data extraction. Covers Excel/CSV/SharePoint ingestion via DLT, destination swapping (DuckDB dev to warehouse prod), schema contracts for cleaning, and portable pipeline patterns. Common phrases: "dlt pipeline for files", "extract Excel with dlt", "portable data pipeline", "dlt filesystem source". Do NOT use for core DLT concepts like REST API or SQL database sources (use data-integration) or pipeline scheduling (use data-pipelines).
tinker-training-cost
by M4n5ter
Calculate training costs for Tinker fine-tuning jobs. Use when estimating costs for Tinker LLM training, counting tokens in datasets, or comparing Tinker model training prices. Tokenizes datasets using the correct model tokenizer and provides accurate cost estimates.
MLIP Simulation Skill
by fl-sean03
Phonopy: https://phonopy.github.io/phonopy/
gemini-imagegen
by drshailesh88
Generate and edit images using the Gemini API (Nano Banana Pro). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.
perplexity-search
by drshailesh88
Perform AI-powered web searches with real-time information using Perplexity models via LiteLLM and OpenRouter. This skill should be used when conducting web searches for current information, finding recent scientific literature, getting grounded answers with source citations, or accessing information beyond the model's knowledge cutoff. Provides access to multiple Perplexity models including Sonar Pro, Sonar Pro Search (advanced agentic search), and Sonar Reasoning Pro through a single OpenRouter API key.
ipynb-notebooks
by M4n5ter
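Creating, reviewing, refactoring, and presenting .ipynb notebooks (Jupyter / JupyterLab / Google Colab / VS Code). Covers an engineering-oriented directory structure, token-efficient processing, demo/sharing modes, and reproducible uv/venv workflows.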
Knowledge Pipeline Skill
by drshailesh88
tinker
by M4n5ter
Comprehensive guide for Tinker Cookbook supervised fine-tuning covering all patterns including high-level Cookbook abstractions, low-level API usage, streaming datasets, file-based data, Blueprint configuration, and vision-language models.
torch-sim Skill
by fl-sean03
data-analysis skill - Analyzing torch-sim outputs
dspy
by L-yifan
Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming
training-data-curation
by M4n5ter
Guidelines for creating high-quality datasets for LLM post-training (SFT/DPO/RLHF). Use when preparing data for fine-tuning, evaluating data quality, or designing data collection strategies.
multi-model-writer
by drshailesh88
Unified writing system with intelligent model routing. Default: Claude. Options: GLM-4.7 (cheapest), GPT-4o/mini, Gemini, Grok. Includes browser automation for web interfaces. Cost-aware routing based on task complexity.
x-algo-pipeline
by CloudAI-X
Explain the complete X recommendation algorithm pipeline. Use when users ask how posts are ranked, how the algorithm works, or want an overview of the recommendation system.
cnn-vision
by levy-n
Implements CNN architectures for computer vision tasks. Covers convolution operations, pooling, CNN design patterns (LeNet, ResNet, VGG), transfer learning, fine-tuning pretrained models, data augmentation, and image preprocessing. Use when building image classifiers, doing object detection, or when user mentions 'CNN', 'convolution', 'pooling', 'ResNet', 'VGG', 'transfer learning', 'fine-tuning', 'image augmentation', 'ImageNet', 'feature maps', 'MNIST', 'image classification', 'multi-modal', 'image captioning', or 'multimodal network'.
deep-learning-core
by levy-n
Explains neural network fundamentals: the Three Pillars (Model, Loss, Optimizer), backpropagation, gradient descent variants (SGD, Adam), regularization (Dropout, BatchNorm), and MLP architecture design. Use when learning how neural networks work, debugging training issues, or when user asks about 'backpropagation', 'vanishing gradients', 'learning rate', 'loss function', 'overfitting', 'underfitting', 'activation functions', 'why isn't my model learning', 'gradient descent', 'Adam', 'Dropout', 'BatchNorm', 'autoencoder', 'denoising autoencoder', or 'latent space'.
ml-knowledge-index
by levy-n
Routes ML/DL questions to specialized skills. Use FIRST when unsure which skill applies, when user asks broad ML questions, or when multiple topics might be relevant. Maps: regression/classification → ml-fundamentals, ensembles/clustering → ml-advanced, TF-IDF/Word2Vec → nlp-classical, training/backprop → deep-learning-core, PyTorch → pytorch-mastery, CNNs/images → cnn-vision, LSTM/time-series → sequence-models, BERT/HuggingFace → transformers-llm, RAG/embeddings → rag-retrieval, APIs/PDF-parsing → data-pipeline, LoRA/QLoRA/PEFT → fine-tuning-peft, MLflow/W&B/Optuna → mlops-experiment, SHAP/Grad-CAM → model-interpretability, Q-learning/PPO/DQN → reinforcement-learning, GAN/VAE/diffusion → generative-models, explanations → ml-teaching-assistant.
ml-fundamentals
by levy-n
Implements classical ML algorithms for regression and classification. Covers Linear/Polynomial/Logistic Regression, Decision Trees, Ridge/Lasso regularization, train/test splits, cross-validation, and evaluation metrics (R², RMSE, Precision, Recall, F1, ROC-AUC, Confusion Matrix). Use when building predictive models on tabular data, comparing baseline algorithms, handling imbalanced data, or when user mentions 'regression', 'classification', 'overfitting', 'cross-validation', 'confusion matrix', 'feature importance', 'precision/recall', or 'regularization'.
nlp-classical
by levy-n
Implements traditional NLP techniques before transformers. Covers text vectorization (TF-IDF, Bag-of-Words), word embeddings (Word2Vec, FastText, GloVe, Doc2Vec), topic modeling (LDA, Gensim), and text similarity (Jaccard, Cosine, FuzzyWuzzy, record linkage). Use when building text classifiers without deep learning, doing topic extraction, entity matching, or when user mentions 'TF-IDF', 'Word2Vec', 'topic modeling', 'LDA', 'text similarity', 'n-grams', 'document clustering', 'GloVe', 'Doc2Vec', 'FuzzyWuzzy', or 'record linkage'.
ml-dl-expert
by levy-n
Expert ML/DL teaching assistant for Hebrew University AI Engineering course. Activates for ANY machine learning or deep learning question: neural networks, PyTorch, TensorFlow, transformers, BERT, GPT, RAG, embeddings, CNNs, RNNs, LSTM, NLP, computer vision, clustering, regression, classification, training loops, backpropagation, loss functions, optimization, HuggingFace, vector stores, FAISS, ChromaDB, recommender systems, matrix factorization, transfer learning, data augmentation, autoencoders, Word2Vec, TF-IDF, topic modeling, prompt engineering, fine-tuning, LoRA, QLoRA, PEFT, quantization, sentiment analysis, image classification, object detection, time series, XGBoost, Random Forest, PCA, t-SNE, DBSCAN, K-Means, data pipeline, PDF parsing, chunking, function calling, AI agents, MLflow, W&B, experiment tracking, hyperparameter tuning, Optuna, SHAP, feature importance, Grad-CAM, model interpretability, reinforcement learning, Q-learning, DQN, PPO, policy gradient, GANs, VAE, diffusion models, Stable Diffusion, generative AI, model deployment, MLOps, synthetic data, data sourcing, Kaggle, dataset, SMOTE. Routes to 17 specialized sub-skills and provides code examples, visual diagrams, and Hebrew explanations when needed.
reinforcement-learning
by levy-n
Reinforcement learning fundamentals and practical implementations. Covers RL concepts (agent, environment, reward, policy), Q-Learning, Deep Q-Network (DQN), Policy Gradient methods, PPO, Actor-Critic, Gymnasium environments, Stable-Baselines3, reward shaping, and exploration-exploitation trade-off. Use when user asks about 'reinforcement learning', 'RL', 'Q-learning', 'DQN', 'PPO', 'policy gradient', 'reward function', 'agent', 'environment', 'Gym', 'Gymnasium', 'exploration', 'exploitation', 'Stable-Baselines3', 'Actor-Critic', 'SARSA', 'Bellman equation', or 'Markov decision process'.