Scraping

Web scraping and data extraction

Showing 49-72 of 697 skills

playwright-test

by open-metadata

Generate robust, zero-flakiness Playwright E2E tests following OpenMetadata patterns. Creates comprehensive test files with proper waits, API validation, multi-role permissions, and complete entity lifecycle management.

Code Gen 14.1K 4mo ago

playwright-debug

by voicetreelab

This skill should be used when the user asks to "debug the electron app", "connect playwright to VoiceTree", "take screenshots of the running app", "interact with the live UI", "inspect the running application", or "test UI elements live". Provides step-by-step instructions for connecting Playwright MCP to a running Electron app for live debugging and automation.

Processing 865 3mo ago

writing-playwright-tests

by open-metadata

Use when writing new Playwright E2E tests or adding test cases. Provides testing philosophy, patterns, and best practices from the Playwright Developer Handbook.

Scraping 14.1K 3mo ago

tavily-best-practices

by tavily-ai

Build production-ready Tavily integrations with best practices baked in. Reference documentation for developers using coding assistants (Claude Code, Cursor, etc.) to implement web search, content extraction, crawling, and research in agentic workflows, RAG systems, or autonomous agents.

Academic 356 2mo ago

compatibility-testing

by proffesor-for-testing

Cross-browser, cross-platform, and cross-device compatibility testing that ensures a consistent experience across environments. Use when validating browser support, testing responsive design, or ensuring platform compatibility.

Responsive 371 3mo ago

bio-atac-seq-nucleosome-positioning

by GPTomics

Extract nucleosome positions from ATAC-seq data using NucleoATAC, ATACseqQC, and fragment analysis. Use when analyzing chromatin organization, identifying nucleosome-free regions at promoters, or characterizing nucleosome occupancy patterns from ATAC-seq fragment size distributions.

CLI Tools 839 3mo ago

sherpa-onnx-tts

by openclaw

Local text-to-speech via sherpa-onnx (offline, no cloud)

Git & VCS 376.5K 3mo ago

vue-testing-best-practices

by antfu

Use for Vue.js testing. Covers Vitest, Vue Test Utils, component testing, mocking, testing patterns, and Playwright for E2E testing.

Scraping 5.2K 4mo ago

summarize

by openclaw

Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).

Processing 376.5K 4mo ago

positron-e2e-tests

by posit-dev

This skill should be used when writing, debugging, or maintaining Playwright e2e tests for Positron. Load this skill when creating new test files, adding test cases, fixing flaky tests, or understanding the test infrastructure.

File Ops 4.2K 3mo ago

ctf-malware

by ljagiello

Malware analysis and network traffic techniques for CTF challenges. Use when analyzing obfuscated scripts, malicious packages, custom crypto protocols, C2 traffic, PE/.NET binaries, RC4/AES encrypted communications, or extracting malware configurations and indicators of compromise.

CLI Tools 2.3K 3mo ago

browser-get-content

by openakita

Extract page content and element text from the current webpage. Use when you need to read page information, get element values, scrape data, or verify page content.

Processing 1.8K 4mo ago

video-frames

by openclaw

Extract frames or short clips from videos using ffmpeg.

CLI Tools 376.4K 4mo ago

bio-clinical-databases-hla-typing

by GPTomics

Call HLA alleles from NGS data using OptiType, HLA-HD, or arcasHLA for immunogenomics applications. Use when determining HLA genotype for transplant matching, neoantigen prediction, or pharmacogenomic screening.

Processing 839 3mo ago

tavily-map

by tavily-ai

Discover and list all URLs on a website without extracting content, via the Tavily CLI. Use this skill when the user wants to find a specific page on a large site, list all URLs, see the site structure, find where something is on a domain, or says "map the site", "find the URL for", "what pages are on", "list all pages", or "site structure". Faster than crawling — returns URLs only. Essential when you know the site but not the exact page. Combine with extract for targeted content retrieval.

CLI Tools 356 2mo ago

tavily-extract

by tavily-ai

Extract clean markdown or text content from specific URLs via the Tavily CLI. Use this skill when the user has one or more URLs and wants their content, says "extract", "grab the content from", "pull the text from", "get the page at", "read this webpage", or needs clean text from web pages. Handles JavaScript-rendered pages, returns LLM-optimized markdown, and supports query-focused chunking for targeted extraction. Can process up to 20 URLs in a single call.

CLI Tools 356 2mo ago

playwright

by rcarmo

Use Playwright for browser automation in this workspace. Install locally and run scripts as needed.

CLI Tools 717 3mo ago

triage-ci-flake

by payloadcms

Use when CI tests fail on the main branch after a PR merge, or when investigating flaky test failures in CI environments.

Debugging 42.8K 4mo ago

playwright-cli

by sonofmagic

Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.

CLI Tools 1.8K 3mo ago

webapp-testing

by Project-N-E-K-O

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Automation 1.1K 4mo ago

pdf

by Project-N-E-K-O

Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use when Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.

CLI Tools 1.1K 4mo ago

webapp-testing

by openakita

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Automation 1.8K 4mo ago

browser-test

by langwatch

Validate that a feature works by driving a real browser with Playwright MCP. No test files; just interactive verification.

File Ops 3.3K 3mo ago

e2e

by langwatch

Generate and verify E2E tests for a feature. Explores the live app, creates a test plan, generates tests, then runs and fixes them until they pass.

Debugging 3.3K 4mo ago