Scraping

Web scraping and data extraction

Showing 1-24 of 700 skills

playwright

by openai

Use when the task requires automating a real browser from the terminal (navigation, form filling, snapshots, screenshots, data extraction, UI-flow debugging) via playwright-cli or the bundled wrapper script.

Automation 24K 5mo ago

webclaw

by 0xMassi

Web extraction engine with antibot bypass. Scrape, crawl, extract, summarize, search, map, diff, monitor, research, and analyze any URL — including Cloudflare-protected sites. Use when you need reliable web content, the built-in web_fetch fails, or you need structured data extraction from web pages.

Processing 1.8K 3mo ago

baoyu-post-to-weibo

by JimLiu

Posts content to Weibo (微博). Supports regular posts with text, images, and videos, and headline articles (头条文章) with Markdown input via Chrome CDP. Use when the user asks to "post to Weibo", "发微博", "发布微博", "publish to Weibo", "share on Weibo", "写微博", or "微博头条文章".

Automation 23.9K 3mo ago

dev-browser

by SawyerHood

Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.

Automation 6.5K 4mo ago

defuddle

by kepano

Extract clean markdown content from web pages using Defuddle CLI, removing clutter and navigation to save tokens. Use instead of WebFetch when the user provides a URL to read or analyze, for online documentation, articles, blog posts, or any standard web page. Do NOT use for URLs ending in .md — those are already markdown, use WebFetch directly.

CLI Tools 42.9K 3mo ago

web-access

by eze-is

All networked operations must be handled through this skill, including search, web scraping, post-login actions, and network interactions.

Automation 8.4K 2mo ago

competitive-ads-extractor

by ComposioHQ

Extracts and analyzes competitors' ads from ad libraries (Facebook, LinkedIn, etc.) to understand what messaging, problems, and creative approaches are working. Helps inspire and improve your own ad campaigns.

Analytics 68.2K 9mo ago

cloudflare-browser

by cloudflare

Control headless Chrome via Cloudflare Browser Rendering CDP WebSocket. Use for screenshots, page navigation, scraping, and video capture when browser automation is needed in a Cloudflare Workers environment. Requires CDP_SECRET env var and cdpUrl configured in browser.profiles.

Automation 9.9K 5mo ago

lets-go-rss

by ALBEDO-TABAI

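Lightweight cross-platform RSS subscription manager. Aggregates content updates from YouTube, Vimeo, Behance, Twitter/X, Zhishi Xingqiu (知识星球), Bilibili, Weibo, Douyin, and Xiaohongshu in one click, with incremental deduplication and AI-based classification.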

Automation 95 2mo ago

e2e

by langwatch

Generate and verify E2E tests for a feature. Explores the live app, creates a test plan, generates tests, then runs and fixes them until they pass.

Debugging 3.4K 5mo ago

typescript

by prowler-cloud

TypeScript strict patterns and best practices. Trigger: When implementing or refactoring TypeScript in .ts/.tsx (types, interfaces, generics, const maps, type guards, removing any, tightening unknown).

Code Gen 14.4K 6mo ago

gemini-computer-use

by am-will

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.

CLI Tools 999 6mo ago

webapp-testing

by Project-N-E-K-O

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Automation 2.2K 6mo ago

pdf

by Project-N-E-K-O

Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use when Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.

CLI Tools 2.2K 6mo ago

tavily-usage

by fcakyon

This skill should be used when the user asks to "search the web", "fetch content from URL", "extract page content", "use Tavily search", "scrape this website", "get information from this link", or "web search for X".

Embeddings 807 7mo ago

playwright-testing

by fcakyon

This skill should be used when the user asks about "Playwright", "responsiveness test", "test with playwright", "test login flow", "file upload test", "handle authentication in tests", or "fix flaky tests".

Auth 807 6mo ago

webapp-testing

by henkisdabro

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behaviour, capturing browser screenshots, and viewing browser logs. Use when the user asks to test a web app, verify UI, capture screenshots, check browser logs, or debug frontend issues.

Automation 75 5mo ago

douyin-batch-download

by cat-xierluo

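Douyin video batch download tool, built on the F2 framework for efficient, incremental video downloads. Supports downloading from a single creator or from multiple creators in batch, automatic cookie management, and a differential update mechanism. Use this skill when the user needs to batch-download videos from specific creators, deploy automated downloads on a server, or periodically update a video library.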

Automation 485 5mo ago

refactoring-patterns

by wondelai

Apply named refactoring transformations to improve code structure without changing behavior. Use when the user mentions "refactor this", "code smells", "extract method", "replace conditional", or "technical debt". Covers smell-driven refactoring, safe transformation sequences, and testing guards. For code quality foundations, see clean-code. For managing complexity, see software-design-philosophy.

File Ops 1.7K 4mo ago

extract

by pbakaus

Extract and consolidate reusable components, design tokens, and patterns into your design system. Identifies opportunities for systematic reuse and enriches your component library.

Code Gen 48.5K 4mo ago

bio-atac-seq-nucleosome-positioning

by GPTomics

Extract nucleosome positions from ATAC-seq data using NucleoATAC, ATACseqQC, and fragment analysis. Use when analyzing chromatin organization, identifying nucleosome-free regions at promoters, or characterizing nucleosome occupancy patterns from ATAC-seq fragment size distributions.

CLI Tools 1K 5mo ago

bio-atac-seq-footprinting

by GPTomics

Detect transcription factor binding sites through footprinting analysis in ATAC-seq data using TOBIAS. Use when identifying TF occupancy patterns within accessible regions, as TF binding protects DNA from Tn5 cutting.

CLI Tools 1K 5mo ago

hexdocs-fetcher

by oliver-kriska

Fetches HexDocs documentation efficiently using the WebFetch tool. Automatically converts HTML to markdown to save context.

Prompts 493 5mo ago

test

by Automattic

Testing patterns for PHPUnit and Playwright E2E tests. Use when writing tests, debugging test failures, setting up test coverage, or implementing test patterns for ActivityPub features.

Scraping 574 5mo ago