Scraping
Web scraping and data extraction
web-tests
by marcioaltoe
Complete browser automation with Playwright. ALWAYS use when user needs browser testing, E2E testing, screenshots, form testing, or responsive design validation. Auto-detects dev servers, saves test scripts to working directory. Examples - "test this page", "take screenshots of responsive design", "test login flow", "check for broken links", "validate form submission".
firecrawl-web
by BexTuychiev
Fetch web content, take screenshots, extract structured data, search the web, and crawl documentation sites. Use when the user needs current web information, asks to scrape a URL, wants a screenshot, needs to extract specific data from a page, or wants to learn about a framework or library.
play-tight
by slamb2k
Context-efficient browser automation using Playwright scripts and subagent isolation. Use when you need to interact with web pages, extract data from websites, verify page elements, or automate browser tasks while avoiding context window pollution from verbose HTML/accessibility trees. Provides both direct script execution and a specialized subagent pattern for complex investigations that generate large intermediate responses.
by mikefilsaime-groove
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use when Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.
web-automation
by mauromedda
Web automation, debugging, and E2E testing with Playwright. Handles interactive (login, forms, reproduce bugs) and passive modes (network/console capture). Triggers on "e2e test", "browser test", "playwright", "screenshot", "debug UI", "debug frontend", "reproduce bug", "network trace", "console output", "verify fix", "test that", "verify change", "test the flow", "http://localhost", "open browser", "click button", "fill form", "submit form", "check page", "web scraping", "automation script", "headless browser", "browser automation", "selenium alternative", "puppeteer alternative", "page object", "web testing", "UI testing", "frontend testing", "visual regression", "capture network", "intercept requests", "mock API responses". PROACTIVE: Invoke for security verification, UI fix verification, testing forms/dropdowns, or multi-step UI flows. ON SESSION RESUME - check for pending UI verifications.
Research Paper Extractor
by drshailesh88
Zero cost. Maximum utility. Uses what you already pay for.
playwright
by fellipeutaka
Write, debug, and maintain Playwright end-to-end tests for web applications. Use when working with Playwright test files, configuring playwright.config.ts, writing browser automation, debugging flaky E2E tests, setting up authentication for tests, API mocking/interception, visual regression testing, accessibility testing, or CI/CD integration for browser tests. Triggers: Playwright, E2E test, end-to-end, browser test, @playwright/test, playwright.config, page object model, test fixture, visual snapshot, trace viewer.
iconfont-downloader
by JS-mark
The Iconfont icon downloader skill helps users search iconfont.cn and download the best-matching SVG icons.
article-extractor
by ryanhudson
This skill should be used when the user wants to "download article", "extract article", "save blog post", "get article text", or provides a web URL and asks to extract the main content without ads, navigation, or clutter. Saves clean, readable text from web articles and blog posts.
remotion-best-practices
by aiaiohhh
Best practices for Remotion - Video creation in React
article-extractor
by founderjourney
Extract clean article content from URLs, removing ads, navigation, and clutter. Save as readable text files for research, archiving, or offline reading.
chrome-devtools
by samhvw8
Browser automation via Puppeteer CLI scripts (JSON output). Capabilities: screenshots, PDF generation, web scraping, form automation, network monitoring, performance profiling, JavaScript debugging, headless browsing. Actions: screenshot, scrape, automate, test, profile, monitor, debug browser. Keywords: Puppeteer, headless Chrome, screenshot, PDF, web scraping, form fill, click, navigate, network traffic, performance audit, Lighthouse, console logs, DOM manipulation, element selector, wait, scroll, automation script. Use when: taking screenshots, generating PDFs from web, scraping websites, automating form submissions, monitoring network requests, profiling page performance, debugging JavaScript, testing web UIs.
by samhvw8
PDF document processing and manipulation. Tools: Python (PyPDF2, pdfplumber, reportlab), CLI tools. Capabilities: text extraction, table extraction, form filling, merge/split documents, create PDFs, add annotations, watermarks, page manipulation. Actions: extract, create, merge, split, fill, annotate PDFs. Keywords: PDF, text extraction, table extraction, form fill, PDF form, merge PDF, split PDF, create PDF, reportlab, PyPDF2, pdfplumber, annotation, watermark, page rotation, PDF metadata, bookmarks, OCR. Use when: extracting text/tables from PDFs, filling PDF forms, merging/splitting documents, creating PDFs programmatically, adding annotations/watermarks, processing PDFs at scale.
firecrawl
by founderjourney
Web scraping, search, and data extraction using Firecrawl API. Use when users need to fetch web content, discover URLs on sites, search the web, or extract structured data from pages.
digitaliza-data-extractor
by founderjourney
Extract and prepare client data for digitalizaweb.vercel.app LinkTree-style digital cards. Use when: (1) Processing restaurant/business client folders containing screenshots, scraped HTML, or LinkTree data, (2) Extracting brand colors from logos/images, (3) Generating Digitaliza-ready JSON with slug, name, links, colors, and theme configuration, (4) Batch processing multiple client folders for 100+ restaurants project, (5) User mentions "digitaliza", "tarjeta digital", "linktree", "extraer datos de cliente", or "procesar carpeta de restaurante".
by founderjourney
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use when Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.
webapp-testing
by aiaiohhh
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
webapp-testing
by AutumnsGrove
Professional web application testing and automation using Playwright with support for multiple browsers, mobile emulation, screenshot capture, network interception, and comprehensive test assertions. Use for: (1) E2E testing across browsers, (2) UI automation, (3) Form testing and validation, (4) Visual regression testing, (5) API mocking and interception, (6) Mobile responsive testing.
by AutumnsGrove
Comprehensive PDF manipulation, extraction, and generation with support for text extraction, form filling, merging, splitting, annotations, and creation. Use when working with .pdf files for: (1) Extracting text and tables, (2) Filling PDF forms, (3) Merging/splitting PDFs, (4) Creating PDFs programmatically, (5) Adding watermarks/annotations, (6) PDF metadata management.
flashcard-generator
by gked2121
Extract key concepts from any content and create spaced-repetition flashcards. Multiple formats: Anki-compatible, printable PDFs, interactive web.
refactor-assistant
by CuriousLearner
Automated code refactoring suggestions and implementation.
video-download
by csfuwwc
Download videos from Douyin (抖音), Xiaohongshu (小红书), and Bilibili (B站) to local disk. Use when the user shares a video link from these platforms, asks to download a video, or mentions v.douyin.com / xiaohongshu.com / xhslink.com / bilibili.com / b23.tv URLs.
nuxt-seo
by display-design-studio
@nuxtjs/robots module best practices for Nuxt 3 apps — robots.txt generation, crawl control, noindex per page, route-rule blocking, AI bot blocking, and environment-based indexing. Also covers llms.txt for AI-tool documentation access. Use when the user mentions @nuxtjs/robots, robots.txt, crawl, indexing, noindex, disallow, blockAiBots, or llms.txt in a Nuxt project.