Scraping

Web scraping and data extraction


tapestry

by Krosebrook

Unified content extraction and action planning. Use when the user says "tapestry <URL>", "weave <URL>", "help me plan <URL>", "extract and plan <URL>", "make this actionable <URL>", or a similar phrase indicating they want to extract content and create an action plan. Automatically detects the content type (YouTube video, article, PDF) and processes it accordingly.

Automation 2 6mo ago

pdf

by Jackiexiao

"Use this skill whenever the user wants to work with PDF files: read/extract, merge/split, rotate, watermark, create, fill forms, encrypt/decrypt, image extraction, and OCR."

CLI Tools 1 3mo ago

remotion-best-practices

by xdrshjr

Best practices for Remotion - Video creation in React

Animation 1 4mo ago

crawling-content

by Git-Fg

"High-speed read-only web extraction. Use when fetching documentation, blogs, and static pages. Do not use for apps requiring login or interaction."

CLI Tools 1 4mo ago

advanced-types

by pluginagentmarketplace

Advanced TypeScript types including generics, conditionals, and mapped types

Code Gen 1 5mo ago

playwright

by brettatoms

Browser automation for web testing and interaction. Use for navigating pages, filling forms, clicking elements, taking screenshots, and inspecting page content. Maintains stateful browser session across commands.

CLI Tools 1 4mo ago

apify-actor-developer

by yfe404

Build and monetize Apify Actors (web scrapers, automation tools, AI agents). Use when user wants to create an Actor, scraper, crawler, web automation, publish to Apify Store, set up pay-per-event/pay-per-result pricing, or integrate with Crawlee. Covers full lifecycle from development to monetization.

Code Gen 1 5mo ago

data-analysis-sql

by pluginagentmarketplace

SQL for data analysis with exploratory analysis, advanced aggregations, statistical functions, outlier detection, and business insights. 50+ real-world analytics queries.

Processing 1 5mo ago

Brian — Knowledge Specialist (Concise)

by mmcmedia

McKinzie decides human sharing

Code Review 1 3mo ago

Playwright Browser Automation

by le-dat

Complete browser automation with Playwright. Auto-detects dev servers and writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, and automate any browser task. Use when the user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing.

Automation 1 5mo ago

pdf-processor

by aig787

Process PDF files for text extraction, form filling, and document analysis. Use when you need to extract content from PDFs, fill forms, or analyze document structure.

File Ops 1 6mo ago

yoitao-jimeng-sessionid

by yoitaoai

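Use when you need to obtain or refresh the sessionid cookie from a logged-in Jimeng (jimeng.jianying.com) session. Automatically opens the Jimeng website, checks the login status, and returns the sessionid value to the caller if already logged in.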

Processing 1 3mo ago

busirocket-tailwindcss-v4

by BusiRocket

Applies Tailwind CSS v4 setup and styling strategy. Use when configuring

Linting 1 4mo ago

site-harvest

by WebSmartTeam

Extract complete website content, design system, and assets for rebuilding or migration. Uses Firecrawl for content/CSS extraction, Chrome for visual comparison. Generates theme skill file for rebuild. Triggers: harvest site, scrape website, extract design, clone website, migrate site, copy website design, grab design tokens.

Processing 1 4mo ago

playwright-cli

by mmcmedia

Browser automation via Playwright CLI. Open pages, interact with elements, take screenshots, and more. Ideal for coding agents and automated testing workflows.

Auth 1 3mo ago

social-fetcher

by JNHFlow21

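Unified scraping of social media content (Twitter/X, Xiaohongshu, Douyin). Uses Playwright with a persistent browser context, saves login state, and supports repeated scraping after a single login.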

Auth 1 3mo ago

playwright-local

by fefogarcia

Build browser automation and web scraping with Playwright on your local machine. Prevents 10 documented errors including CI timeout hangs, extension testing failures, and Ubuntu compatibility issues. Includes stealth mode for anti-bot bypass, authenticated sessions, infinite scroll handling, screenshot/PDF generation, and v1.57 Speedboard performance analysis. Use when: automating browsers, scraping protected sites, testing with real IPs, bypassing bot detection, generating screenshots/PDFs, or troubleshooting "target closed", "page.pause() hangs CI", "permission prompts block tests", or "Ubuntu 25.10 installation" errors.

Debugging 1 3mo ago

xiaohongshu-automation

by wrt820232

Xiaohongshu automation control: connects to the OpenClaw browser via Playwright CDP to publish posts, search, comment, and more.

Scraping 1 3mo ago

beauty-step1

by Within-7

"Document content analysis and merging. Automatically invoked during step 1 of the beauty command to fully understand source document content, extract key information, and establish content structure. 文档内容分析合并。在beauty命令的步骤1执行时自动调用,用于完整理解源文档内容,提取关键信息,建立内容结构。"

Processing 1 4mo ago

qa-run

by ajaywadhara

"8-agent QA loop: browser exploration via Playwright MCP, then analyze, plan, test, audit, heal, expand, snapshot. Quality gate score >= 85 to pass."

Debugging 1 3mo ago

website-crawler

by leobrival

High-performance web crawler for discovering and mapping website structure. Use when users ask to crawl a website, map site structure, discover pages, find all URLs on a site, analyze link relationships, or generate site reports. Supports sitemap discovery, checkpoint/resume, rate limiting, and HTML report generation.

CLI Tools 1 4mo ago

just-scrape

by Jackiexiao

"CLI tool for AI-powered web scraping, data extraction, search, and crawling via ScrapeGraph AI. Use when the user needs to scrape websites, extract structured data from URLs, convert pages to markdown, crawl multi-page sites, search the web for information, automate browser interactions (login, click, fill forms), get raw HTML, discover sitemaps, or generate JSON schemas. Triggers on tasks involving: (1) extracting data from websites, (2) web scraping or crawling, (3) converting webpages to markdown, (4) AI-powered web search with extraction, (5) browser automation, (6) generating output schemas for scraping. The CLI is just-scrape (npm package just-scrape)."

Processing 1 3mo ago

gemini-computer-use

by sarfraznawaz2005

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.

CLI Tools 1 4mo ago