- Home
- /
- Categories
- /
- Scraping
Scraping
Web scraping and data extraction
Bright Data Web MCP
by patchy631
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
moai-library-mermaid
by modu-ai
Enterprise Mermaid diagramming skill for Claude Code using MCP Playwright. Use when creating architecture diagrams, flowcharts, sequence diagrams, or visual documentation.
dcf-valuation
by virattt
Performs discounted cash flow (DCF) valuation analysis to estimate intrinsic value per share. Triggers when user asks for fair value, intrinsic value, DCF, valuation, "what is X worth", price target, undervalued/overvalued analysis, or wants to compare current price to fundamental value.
pdf-processing
by MervinPraison
Process and extract information from PDF documents. Use this skill when the user asks to read, analyze, or extract data from PDF files.
summarize
by HKUDS
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).
rendering-hoist-jsx
by TheOrcDev
Extract static JSX elements outside components to avoid re-creation on every render. Apply when rendering static elements repeatedly or in lists.
content-extract
by blessonism
Robust URL-to-Markdown extraction for OpenClaw workflows. Use when the user wants to "extract/summarize/convert a webpage to markdown" (especially WeChat mp.weixin.qq.com) and web_fetch/browser is blocked or messy. Uses a cheap probe via web_fetch first, then falls back to the official MinerU API (via the local mineru-extract skill) and returns a traceable result contract with source links.
playwright-e2e
by comet-ml
Playwright E2E test generation workflow for Opik. Use when generating, fixing, or planning automated tests in tests_end_to_end/.
ffind
by BrownFineSecurity
Advanced file finder with type detection and filesystem extraction for analyzing firmware and extracting embedded filesystems. Use when you need to analyze firmware files, identify file types, or extract ext2/3/4 or F2FS filesystems.
webapp-testing
by MiniMax-AI
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
playwright-cli
by sonofmagic
使用 playwright-cli 自动化浏览器交互,适用于网页测试、表单填写、截图、抓取页面信息与录制可复现操作步骤。
options-payoff
by himself65
Generate an interactive options payoff curve chart with dynamic parameter controls. Use this skill whenever the user shares an options position screenshot, describes an options strategy, or asks to visualize how an options trade makes or loses money. Triggers include: any mention of butterfly, spread (vertical/calendar/diagonal/ratio), straddle, strangle, condor, covered call, protective put, iron condor, or any multi-leg options structure. Also triggers when a user pastes strike prices, premiums, expiry dates, or says things like "show me the payoff", "draw the P&L curve", "what does this trade look like", or uploads a screenshot from a broker (IBKR, TastyTrade, Robinhood, etc). Always use this skill even if the user only provides partial info — extract what you can and use defaults for the rest.
mermaid-tools
by daymade
Extracts Mermaid diagrams from markdown files and generates high-quality PNG images using bundled scripts. Activates when working with Mermaid diagrams, converting diagrams to PNG, extracting diagrams from markdown, or processing markdown files with embedded Mermaid code.
by guanyang
Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.
setup
by bitwize-music-studio
Detects your Python environment and guides you through installing plugin dependencies. Use on first-time setup or when MCP server fails to start.
typescript
by prowler-cloud
TypeScript strict patterns and best practices. Trigger: When implementing or refactoring TypeScript in .ts/.tsx (types, interfaces, generics, const maps, type guards, removing any, tightening unknown).
summarize
by elizaOS
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).
extract-transcripts
by aiskillstore
Extract readable transcripts from Claude Code and Codex CLI session JSONL files
extract-fuzzer-repro
by noir-lang
Extract a Noir reproduction project from fuzzer failure logs in GitHub Actions. Use when a CI fuzzer test fails and you need to create a local reproduction.
sherpa-onnx-tts
by elizaOS
Local text-to-speech via sherpa-onnx (offline, no cloud)
playwright-ux-ui-capture
by raphaelmansuy
Capture EdgeQuake WebUI routes with Playwright and write artifacts immediately (screenshots + per-page request JSON + capture index). Use when adding/updating Playwright E2E capture specs or when asked to automate UI screenshot collection.
refactoring-dbt-models
by AltimateAI
Safely refactors dbt models with downstream impact analysis. Use when restructuring dbt models for: (1) Task mentions "refactor", "restructure", "extract", "split", "break into", or "reorganize" (2) Extracting CTEs to intermediate models or creating macros (3) Modifying model logic that has downstream consumers (4) Renaming columns, changing types, or reorganizing model dependencies Analyzes all downstream dependencies BEFORE making changes.
document-hunter
by bitwize-music-studio
Searches and retrieves documents from free public sources using automated browser navigation. Use when research needs primary source documents like court filings, government reports, or public records.
histolab-wsi-processing
by jaechang-hits
"Whole slide image processing for digital pathology. Tissue detection, tile extraction (random, grid, score-based), filter pipelines for H&E/IHC preprocessing. Use for dataset preparation, tile-based deep learning, and slide quality assessment. For advanced spatial proteomics or multiplexed imaging use pathml."