Scraping

Web scraping and data extraction

Showing 49-72 of 700 skills

paper-write

by xstongxue

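End-to-end writing assistance for undergraduate and master's theses. Supports outline review (science and engineering/humanities), structural imitation writing (general chapters/experiment chapters/introduction/abstract), reference retrieval, merging, polishing, condensing, expanding, anti-AIGC processing, Chinese-English translation, and structured information extraction. Use when the user mentions thesis writing, outline review, imitating thesis chapters, references, thesis polishing, anti-AIGC processing, or thesis translation.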

Code Review 2.3K 4mo ago

youtube-transcribe-skill

by feiskyer

Extract subtitles/transcripts from YouTube videos. Triggers: "youtube transcript", "extract subtitles", "video captions", "视频字幕", "字幕提取", "YouTube转文字", "提取字幕".

CLI Tools 1.6K 5mo ago

playwright-e2e-tests

by onyx-dot-app

Write and maintain Playwright end-to-end tests for the Onyx application. Use when creating new E2E tests, debugging test failures, adding test coverage, or when the user mentions Playwright, E2E tests, or browser testing.

Auth 31.1K 5mo ago

document-hunter

by bitwize-music-studio

Searches and retrieves documents from free public sources using automated browser navigation. Use when research needs primary source documents like court filings, government reports, or public records.

File Ops 366 5mo ago

e2e-testing

by affaan-m

Playwright E2E testing patterns, Page Object Model, configuration, CI/CD integration, artifact management, and flaky test strategies.

Processing 231.8K 5mo ago

phoenix-playwright-tests

by Arize-ai

Write Playwright E2E tests for the Phoenix AI observability platform. Use when creating, updating, or debugging Playwright tests, or when the user asks about testing UI features, writing E2E tests, or automating browser interactions for Phoenix.

Scraping 10.6K 5mo ago

pdftk-server

by github

Skill for using the command-line tool pdftk (PDFtk Server) for working with PDF files. Use when asked to merge PDFs, split PDFs, rotate pages, encrypt or decrypt PDFs, fill PDF forms, apply watermarks, stamp overlays, extract metadata, burst documents into pages, repair corrupted PDFs, attach or extract files, or perform any PDF manipulation from the command line.

CLI Tools 36.9K 5mo ago

fetch-wechat-article

by cat-xierluo

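Fetches the content of WeChat Official Account articles, using Playwright in headless mode to scrape in the background without pop-up windows. Supports dynamically loaded content and automatically extracts the title and body text, saving them as a Markdown file. Use this skill when the user needs to fetch WeChat Official Account article content.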

Docs Gen 485 5mo ago

context-manager

by darrenhinde

Context management skill providing discovery, fetching, harvesting, extraction, compression, organization, cleanup, and guided workflows for project context.

CLI Tools 4.6K 5mo ago

nl-router

by jmagly

Translation table: docs/simple-language-translations.md

Code Gen 165 5mo ago

BrightData

by danielmiessler

Progressive URL scraping. USE WHEN Bright Data, scrape URL, web scraping tiers. SkillSearch('brightdata') for docs.

Processing 16.8K 5mo ago

ExtractWisdom

by danielmiessler

Dynamic wisdom extraction that adapts sections to content. USE WHEN extract wisdom, analyze video, analyze podcast, extract insights, what's interesting, extract from YouTube, what did I miss, key takeaways. Replaces static extract_wisdom with content-adaptive extraction.

Finance 16.8K 5mo ago

pdf

by guanyang

Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.

CLI Tools 934 5mo ago

browsing

by obra

Use when you need direct browser control. Teaches Chrome DevTools Protocol for controlling existing browser sessions, multi-tab management, form automation, and content extraction via the use_browser MCP tool.

Performance 336 4mo ago

Playwright Browser Automation

by smallnest

Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing.

Automation 279 7mo ago

triage-ci-flake

by payloadcms

Use when CI tests fail on main branch after PR merge, or when investigating flaky test failures in CI environments

Debugging 43.7K 6mo ago

Test Engineer Skill

by wasintoh

Debugging 87 7mo ago

bulk-wgcna-analysis-with-omicverse

by Starlitnightly

Assist Claude in running PyWGCNA through omicverse—preprocessing expression matrices, constructing co-expression modules, visualising eigengenes, and extracting hub genes.

Code Gen 1.2K 8mo ago

auto-weixin-video

by zrt-ai-lab

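Skill for automated publishing to WeChat Channels. Use this skill when the user needs to publish a video to WeChat Channels. Features include obtaining the login cookie, uploading videos, setting titles and topics, scheduled publishing, and original-content declarations.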

Automation 251 5mo ago

airflow-dag-patterns

by wshobson

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

Automation 38.1K 6mo ago

summarize

by openclaw

Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).

Processing 383.7K 5mo ago

senior-qa

by alirezarezvani

This skill should be used when the user asks to "generate tests", "write unit tests", "analyze test coverage", "scaffold E2E tests", "set up Playwright", "configure Jest", "implement testing patterns", or "improve test quality". Use for React/Next.js testing with Jest, React Testing Library, and Playwright.

Code Gen 22.9K 5mo ago

pdf-processing

by MervinPraison

Process and extract information from PDF documents. Use this skill when the user asks to read, analyze, or extract data from PDF files.

Code Review 8.5K 7mo ago

ctf-rev

by cyberkaida

Solve CTF reverse engineering challenges using systematic analysis to find flags, keys, or passwords. Use for crackmes, binary bombs, key validators, obfuscated code, algorithm recovery, or any challenge requiring program comprehension to extract hidden information.

Processing 783 8mo ago