Scraping

Web scraping and data extraction

Showing 481-504 of 697 skills

firecrawl

by oldwinter

Official Firecrawl CLI skill for web scraping, search, crawling, and browser automation. Returns clean LLM-optimized markdown.

USE FOR:
- Web search and research
- Scraping pages, docs, and articles
- Site mapping and bulk content extraction
- Browser automation for interactive pages

Must be pre-installed and authenticated. See rules/install.md for setup, rules/security.md for output handling.

Processing 3 3mo ago

airflow-workflows

by timequity

Apache Airflow DAG design, operators, and scheduling best practices.

Processing 6 5mo ago

music-downloader

by Nymbo

This skill should be used when users need to download audio or music from online platforms like YouTube, SoundCloud, Spotify, or other streaming services. It provides yt-dlp and spotdl command templates for high-quality audio extraction, playlist downloads, metadata embedding, and multi-platform support.

CLI Tools 6 5mo ago

pdf

by thechandanbhagat

Work with PDF files - read, extract text/images/tables, create PDFs, merge, split, and convert PDFs. Use when the user asks to read, create, modify, or analyze PDF documents.

CLI Tools 6 4mo ago

website-to-vite-scraper

by breverdbidder

Multi-provider website scraper that converts any website (including CSR/SPA) to deployable static sites. Uses Playwright, Apify RAG Browser, Crawl4AI, and Firecrawl for comprehensive scraping. Triggers on requests to clone, reverse-engineer, or convert websites.

Embeddings 5 5mo ago

memory-forensics

by SherifEldeeb

Analyze volatile memory (RAM) dumps for forensic investigation. Use when investigating malware infections, rootkits, process injection, credential theft, or any incident requiring analysis of system memory state. Supports Windows, Linux, and macOS memory images.

Code Review 5 4mo ago

Extract Member Profiles

by lycfyi

To start fresh, delete the profile files manually before extracting.

CLI Tools 5 4mo ago

malware-forensics

by SherifEldeeb

Analyze malware samples for forensic investigation. Use when investigating malware infections, determining malware capabilities, extracting IOCs, or understanding attack techniques. Supports static and dynamic analysis of executables, scripts, and documents.

Automation 5 4mo ago

yt-dlp

by lwmxiaobei

Download videos and extract audio from various platforms using yt-dlp. Use when user provides a video URL, asks to download a video, or when conversation contains video links from YouTube, Twitter/X, Vimeo, TikTok, Instagram, etc.

Processing 4 4mo ago

csharp-refactor

by JeongHeonK

C# code refactoring skill. Applies SOLID principles, extracts methods/classes, introduces design patterns, and modernizes syntax. Use when improving code maintainability, addressing code smells, or modernizing legacy C# code.

Code Gen 4 3mo ago

stitch-mcp-get-project

by partme-ai

Retrieves the detailed metadata of a specific Stitch project.

Processing 4 3mo ago

airflow-dag-patterns

by EngineerWithAI

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

Automation 4 5mo ago

stitch-mcp-list-screens

by partme-ai

Lists all screens contained within a specific project.

Database 4 3mo ago

firmware-extraction

by tangjunyi23

Firmware extraction and filesystem analysis for IoT devices. Use when analyzing firmware binaries, extracting filesystems with binwalk, identifying firmware format/structure, locating key files after extraction, or performing initial reconnaissance on router/camera/IoT firmware images. Triggers on tasks involving .bin/.img/.trx/.chk firmware files.

Code Review 2 3mo ago

nix-packaging-best-practices

by lihaoze123

Best practices for packaging pre-compiled binaries (.deb, .rpm, .tar.gz, AppImage) for NixOS. Use when handling library dependencies or facing "library not found" errors with binary distributions.

Debugging 2 4mo ago

playwright-testing

by dtinth

Playwright testing. Use this skill to write and run automated tests for web applications using Playwright.

Code Gen 2 3mo ago

assessment

by borisghidaglia

Fitness and nutrition assessment. Activate when users want to evaluate their training or diet, identify gaps, get an initial assessment, or ask "what am I doing wrong?" or "where should I start?"

Automation 2 4mo ago

remotion-best-practices

by echoleesong

Best practices for Remotion - Video creation in React. Use this skill when working with Remotion code, creating programmatic videos, React-based animations, or video compositions.

Animation 2 4mo ago

extract-diagrams

by clearsmog

Extract CeTZ diagrams to SVG from Typst files. For TikZ extraction, configure at project level.

File Ops 2 3mo ago

extract-spark-meetings

by johnie

Extract meeting summaries and action items from Spark Mail shared links. Processes single URLs (pass as argument) or batch processes unchecked links from links.md. Use when working with Spark Mail shared meeting links.

File Ops 2 4mo ago

@tank/bdd-e2e-testing

by tankpkg

BDD end-to-end testing against real systems. Covers web apps (Playwright), libraries (pytest-bdd + Docker), APIs, CLIs, message queues. Gherkin writing, step definitions, Page Objects, Screenplay, 3-layer architecture, CI/CD, multi-language (TypeScript, Python, Java, .NET). Triggers: BDD test, Gherkin, Cucumber, feature file, Given When Then, playwright-bdd, pytest-bdd, Behave, Cucumber-JVM, Serenity BDD, Reqnroll, Example Mapping, Three Amigos, living documentation, BDD setup, BDD architecture.

API Dev 1 2mo ago

ask-refactoring-readability

by NavanithanS

Refactor code for readability using DRY, meaningful names, and modularization.

File Ops 1 3mo ago

browserless

by microlinkhq

Automate websites with browserless and Puppeteer for screenshots, PDFs, HTML/text extraction, URL status checks, and Lighthouse audits. Use when the user mentions browserless, @browserless/cli, headless Chrome automation, Puppeteer wrappers, website screenshots, PDF generation from URLs, or extracting rendered page content.

CLI Tools 1 3mo ago

soushen-hunter

by abhinav-bharti-max

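High-performance Bing/Google search engine skill, "搜神猎手" (Soushen Hunter). Uses the low-level Playwright API for deep web search and element extraction.

Features:
1. Bing/Google search execution: returns structured search results (title, link, snippet, source)
2. Deep page analysis: extracts all key elements of a page (links, forms, buttons, scripts, metadata)
3. Configurable search engine: supports switching between Bing and Google

Triggers:
- When the user needs to perform a web search
- When page structure information (links, forms, etc.) needs to be extracted
- When a search solution with no API cost is needed

Usage:
- Basic search: ./soushen "search keywords" [--num N] [--engine ENGINE]
- Deep analysis: ./soushen --deep <target URL>
- Configure engine: ./soushen --set-default-engine bing google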

Scraping 1 2mo ago