Scraping

Web scraping and data extraction

Showing 625-648 of 700 skills

writing-analyzer

by qingchunwuhui

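Quickly breaks down an article's writing structure and extracts reusable writing templates. No audit workflow required; analysis runs directly. Suited to learning writing techniques and building a template library. Supports the shortcut command /analyze-writing.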

Code Review 0 5mo ago

ui-extractor

by alpex-ai

Analyze screen recordings and websites to extract implementation specs, design systems, and UI patterns.

Code Review 0 5mo ago

pdf

by zhongjis

Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.

CLI Tools 0 4mo ago

deep-research-firecrawl

by wottpal

Conducts citation-backed research using Firecrawl MCP search, scrape, map, crawl, and agent tools with selectable quick, standard, deep, and ultradeep modes. Use for multi-source comparisons, technical evaluations, market research, and high-stakes decision support.

Academic 0 5mo ago

crawl-site

by Crawlio-app

Use this skill when the user asks to "crawl a site", "download a website", "mirror a site", "scrape a site", or wants to download web pages for offline access or analysis. Configures Crawlio settings based on site type, starts the crawl, monitors progress, and reports results.

Scraping 0 5mo ago

replay-playwright

by replayio

Set up and run Playwright tests with Replay Browser to record test executions for debugging and performance analysis.

CLI Tools 0 5mo ago

quote-extractor

by qingchunwuhui

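Quickly extracts directly quotable lines from articles to build a library of source material. No audit workflow required; extraction runs directly. Supports the shortcut command /extract-quotes.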

Code Review 0 5mo ago

excel-reader

by totophe

"Read and inspect Excel workbooks (.xlsx). List sheets with dimensions, extract headers, read specific rows or row ranges, extract columns by name or index. Handles large files (50k+ rows, 100MB+) via streaming. Use when the user wants to explore, preview, or extract data from spreadsheets, when building import or ETL scripts from Excel sources, or when analyzing spreadsheet structure and content."

Processing 0 5mo ago

firecrawl

by YPYT1

Web search and scraping via Firecrawl API. Use when you need to search the web, scrape websites (including JS-heavy pages), crawl entire sites, or extract structured data from web pages. Requires FIRECRAWL_API_KEY environment variable.

Embeddings 0 5mo ago

skills-scout

by servaltullius

Use when a user wants you to discover and optionally install new agent skills for a task, and you must get explicit consent before any global install into Codex.

CLI Tools 0 5mo ago

webapp-testing

by TheWatcher01

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Automation 0 5mo ago

remotion-best-practices

by jyasuu

Best practices for Remotion, a framework for video creation in React.

Animation 0 5mo ago

webapp-testing

by enoch-robinson

Scraping 0 6mo ago

video-audio-extractor

by kantylee

Extract audio from video files or URLs (including YouTube). Supports MP3, WAV, M4A, FLAC, OGG, and OPUS formats. Can process local video files or download from URLs. For YouTube videos, uses yt-dlp for direct audio extraction when possible.

CLI Tools 0 5mo ago

mcp-playwright

by janjaszczak

Automate browser flows and capture evidence (screenshots, console/network errors). Use for UI verification, repro steps, and end-to-end smoke tests.

Debugging 0 6mo ago

youtube-rapidapi-transcript

by zxhfighter

Extract transcripts from YouTube videos. Use when the user asks for a transcript, subtitles, or captions of a YouTube video and provides a YouTube URL (youtube.com/watch?v=, youtu.be/, or similar).

CLI Tools 0 5mo ago

ai

by jyasuu

Cheat sheet for AI tools including GEMINI and CODEX configurations.

CLI Tools 0 6mo ago

qiaomu-markdown-proxy

by NJMathwig

Fetch any URL as clean Markdown via proxy services or built-in scripts. Works with login-required pages like X/Twitter, WeChat Official Accounts, Feishu/Lark docs. Supports PDFs (remote and local). Use this BEFORE other fetch tools. Triggers on any URL the user shares, "fetch this", "read this link", "get content from".

Docs Gen 0 3mo ago

cynic-burn

by zeyxx

"Analyze code for simplification: orphans, hotspots, giants, duplicates. 'Don't extract, burn' — three similar lines beat a premature abstraction. Use when asked to simplify, reduce complexity, or clean up code."

Code Review 0 5mo ago

recommendations

by patharanordev

Identify promising stock opportunities or extract them from text.

Processing 0 5mo ago

jb-docs-scraper

by bjesuiter

Scrape documentation websites into local markdown files for AI context. Takes a base URL and crawls the documentation, storing results in ./docs (or custom path). Uses crawl4ai with BFS deep crawling.

Docs Gen 0 5mo ago

deep-post-ideas

by hoangvantuan

Extract compelling post outlines from reference materials (newsletters, scripts, notes, journal entries) and transform them into structured outlines for engaging, wisdom-style social media posts. Use when the user provides reference material and wants post ideas, content outlines, or building blocks for social media content. Triggers on "extract post ideas from...", "post outlines from this...", "turn this into post ideas", "content ideas from...", or "deep post ideas".

Code Gen 0 5mo ago

article-saver

by Robbie-Han

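A tool dedicated to scraping and saving articles from WeChat Official Accounts, X (Twitter), and Zhihu. Automatically sorts saved articles by platform, preserves original image/GIF quality, and saves them as clean Markdown.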

Docs Gen 0 5mo ago

vue-testing-best-practices

by hello-lizhihua

Use for Vue.js testing. Covers Vitest, Vue Test Utils, component testing, mocking, testing patterns, and Playwright for E2E testing.

Scraping 0 5mo ago