- Home
- /
- Categories
- /
- Scraping
Scraping
Web scraping and data extraction
baoyu-post-to-weibo
by JimLiu
Posts content to Weibo (微博). Supports regular posts with text, images, and videos, and headline articles (头条文章) with Markdown input via Chrome CDP. Use when user asks to "post to Weibo", "发微博", "发布微博", "publish to Weibo", "share on Weibo", "写微博", or "微博头条文章".
playwright
by openai
"Use when the task requires automating a real browser from the terminal (navigation, form filling, snapshots, screenshots, data extraction, UI-flow debugging) via playwright-cli or the bundled wrapper script."
competitive-ads-extractor
by ComposioHQ
Extracts and analyzes competitors' ads from ad libraries (Facebook, LinkedIn, etc.) to understand what messaging, problems, and creative approaches are working. Helps inspire and improve your own ad campaigns.
web-access
by eze-is
所有联网操作必须通过此 skill 处理,包括:搜索、网页抓取、登录后操作、网络交互等。
defuddle
by kepano
Extract clean markdown content from web pages using Defuddle CLI, removing clutter and navigation to save tokens. Use instead of WebFetch when the user provides a URL to read or analyze, for online documentation, articles, blog posts, or any standard web page. Do NOT use for URLs ending in .md — those are already markdown, use WebFetch directly.
dev-browser
by SawyerHood
Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.
cloudflare-browser
by cloudflare
Control headless Chrome via Cloudflare Browser Rendering CDP WebSocket. Use for screenshots, page navigation, scraping, and video capture when browser automation is needed in a Cloudflare Workers environment. Requires CDP_SECRET env var and cdpUrl configured in browser.profiles.
ctf-osint
by ljagiello
Provides open source intelligence techniques for CTF challenges. Use when gathering information from public sources, social media, geolocation, DNS records, username enumeration, reverse image search, Google dorking, Wayback Machine, Tor relays, FEC filings, or identifying unknown data like hashes and coordinates.
tooluniverse-drug-research
by mims-harvard
Generates comprehensive drug research reports with compound disambiguation, evidence grading, and mandatory completeness sections. Covers identity, chemistry, pharmacology, targets, clinical trials, safety, pharmacogenomics, and ADMET properties. Use when users ask about drugs, medications, therapeutics, or need drug profiling, safety assessment, or clinical development research.
analyzing-pdf-malware-with-pdfid
by mukul975
Analyzes malicious PDF files using PDFiD, pdf-parser, and peepdf to identify embedded JavaScript, shellcode, exploits, and suspicious objects without opening the document. Determines the attack vector and extracts embedded payloads for further analysis. Activates for requests involving PDF malware analysis, malicious document analysis, PDF exploit investigation, or suspicious attachment triage.
analyzing-windows-registry-for-artifacts
by mukul975
Extract and analyze Windows Registry hives to uncover user activity, installed software, autostart entries, and evidence of system compromise.
e2e-testing
by affaan-m
Playwright E2E testing patterns, Page Object Model, configuration, CI/CD integration, artifact management, and flaky test strategies.
extract
by pbakaus
Extract and consolidate reusable components, design tokens, and patterns into your design system. Identifies opportunities for systematic reuse and enriches your component library.
analyzing-golang-malware-with-ghidra
by mukul975
Reverse engineer Go-compiled malware using Ghidra with specialized scripts for function recovery, string extraction, and type reconstruction in stripped Go binaries.
websh
by openprose
A shell for the web. Navigate URLs like directories, query pages with Unix-like commands. Activate on websh command, shell-style web navigation, or when treating URLs as a filesystem.
extract-project-logo
by tradingstrategy-ai
Extract a project's logo from its website, brand kit, or other sources
hexdocs-fetcher
by oliver-kriska
Fetches HexDocs documentation efficiently using WebFetch tool. Converts HTML to markdown automatically for context efficiency.
by iOfficeAI
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.
douyin-batch-download
by cat-xierluo
抖音视频批量下载工具 - 基于 F2 框架实现高效、增量的视频下载功能。支持单个/批量博主下载,自动 Cookie 管理,差量更新机制。本技能应在用户需要批量下载特定博主视频、服务器部署自动化下载、或定期更新视频库时使用。
webapp-testing
by ComposioHQ
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
fetch-wechat-article
by cat-xierluo
抓取微信公众号文章内容,使用 Playwright headless 模式无弹窗后台抓取,支持动态加载内容,自动提取标题和正文并保存为 Markdown 文件。本技能应在用户需要抓取微信公众号文章内容时使用。
paper-write
by xstongxue
本科与硕士学位论文全流程撰写辅助。支持大纲审核(理工科/文科)、结构仿写(通用章节/实验章节/绪论/摘要)、参考文献获取、融合、润色、缩写、扩写、防 AIGC、中英互译、结构化信息提取。当用户提到论文撰写、大纲审核、论文章节仿写、参考文献、论文润色、防 AIGC、论文翻译时使用。
histolab
by K-Dense-AI
Lightweight WSI tile extraction and preprocessing. Use for basic slide processing tissue detection, tile extraction, stain normalization for H&E images. Best for simple pipelines, dataset preparation, quick tile-based analysis. For advanced spatial proteomics, multiplexed imaging, or deep learning pipelines use pathml.
playwright-expert
by Jeffallan
Use when writing E2E tests with Playwright, setting up test infrastructure, or debugging flaky browser tests. Invoke for browser automation, E2E tests, Page Object Model, test flakiness, visual testing.