Scraping

Web scraping and data extraction

read-bin-docs

by YPares

Straightforward text extraction from document files (text-based PDF only for now, no OCR or docx). Use when you just need to read/extract text from binary documents.

CLI Tools 28 7mo ago

earnings-call-insights

by OctagonAI

Analyze earnings call transcripts to extract key insights about future guidance, strategic priorities, management commentary, and market signals.

Code Review 130 5mo ago

sec-10q-analysis

by OctagonAI

Analyze 10-Q quarterly filings for public companies using Octagon MCP. Use when extracting quarterly performance metrics, revenue breakdown, operating margins, segment performance, and interim financial updates from SEC 10-Q filings.

Code Review 130 5mo ago

rtl-document-translation

by belumume

Translate DOCX to RTL languages (Arabic, Hebrew, Urdu) preserving exact formatting, tables, colors, layouts. Handles quote normalization and multi-pass matching.

Code Gen 47 7mo ago

rtl-document-translation

by belumume

Translate structured documents (DOCX) to RTL languages (Arabic, Hebrew, Urdu) while preserving exact formatting, table structures, colors, and layouts. Handles quote normalization, multi-pass translation matching, and RTL-specific formatting patterns.

Code Gen 47 8mo ago

firecrawl-scrape

by parcadei

Scrape web pages and extract content via Firecrawl MCP.

Processing 3.9K 6mo ago

earnings-call-analysis

by OctagonAI

Analyze earnings call transcripts to extract forward-looking guidance, strategic focus areas, supply chain insights, and generate follow-up questions for deeper analysis.

Code Review 130 5mo ago

sec-10k-analysis

by OctagonAI

Analyze 10-K annual filings for public companies using Octagon MCP. Use when extracting key financial metrics, risk factors, business overview, management discussion, and regulatory disclosures from SEC 10-K filings.

Agents 130 5mo ago

earnings-capital-allocation

by OctagonAI

Extract management's commentary on capital allocation, investment priorities, shareholder returns, and strategic investments from earnings call transcripts.

Finance 130 5mo ago

earnings-financial-guidance

by OctagonAI

Extract and analyze financial guidance and forward-looking statements from earnings transcripts, including segment guidance, risk factors, and guidance vs. actuals comparison.

Code Review 130 5mo ago

earnings-revenue-guidance

by OctagonAI

Extract specific revenue guidance and growth projections from earnings call transcripts, including segment breakdown, constant currency adjustments, and M&A contributions.

Scraping 130 5mo ago

honest-forget

by SimHacker

Graceful memory compression with integrity: summarize before forgetting, never fabricate.

File Ops 44 6mo ago

just-scrape

by ScrapeGraphAI

CLI tool for AI-powered web scraping, data extraction, search, and crawling via ScrapeGraph AI. Use when the user needs to scrape websites, extract structured data from URLs, convert pages to markdown, crawl multi-page sites, search the web for information, automate browser interactions (login, click, fill forms), get raw HTML, discover sitemaps, or generate JSON schemas. Triggers on tasks involving: (1) extracting data from websites, (2) web scraping or crawling, (3) converting webpages to markdown, (4) AI-powered web search with extraction, (5) browser automation, (6) generating output schemas for scraping. The CLI is just-scrape (npm package just-scrape).

Processing 37 5mo ago

youtube-transcript

by intellectronica

Extract transcripts from YouTube videos. Use when the user asks for a transcript, subtitles, or captions of a YouTube video and provides a YouTube URL (youtube.com/watch?v=, youtu.be/, or similar). Supports output with or without timestamps.

CLI Tools 279 5mo ago

youtube-transcript

by intellectronica

CLI Tools 279 6mo ago

crawl-websites-at-scale

by besoeasy

Scrape websites at scale using Scrapy, a Python web crawling and scraping framework. Use when: (1) Crawling multiple pages or entire sites, (2) Extracting structured data from HTML/XML, or (3) Building automated data pipelines from web sources.

Processing 127 4mo ago

phone-specs-scraper

by besoeasy

Scrape phone specifications from GSM Arena, PhoneDB, and alternative sites. Use when: (1) Comparing smartphone specs, (2) Researching device features, or (3) Building phone comparison tools.

Design 127 27mo ago

using-web-scraping

by besoeasy

Search and scrape public web content with headless Chrome and DuckDuckGo using safe practices.

Scraping 127 27mo ago

playwright-patterns

by ed3dai

Use when writing Playwright automation code, building web scrapers, or creating E2E tests. Provides best practices for selector strategies, waiting patterns, and robust automation that minimizes flakiness.

Processing 241 5mo ago

earnings-qa-analysis

by OctagonAI

Analyze the Q&A section of earnings call transcripts for strategic insights, analyst concerns, and management responses on key topics.

Agents 130 5mo ago

firecrawl

by tdimino

Firecrawl produces cleaner markdown than WebFetch, handles JavaScript-heavy pages, and avoids content truncation. This skill should be used when fetching URLs, scraping web pages, converting URLs to markdown, extracting web content, searching the web, crawling sites, mapping URLs, LLM-powered extraction, autonomous data gathering with the Agent API, or fetching AI-generated documentation for GitHub repos via DeepWiki. Provides complete coverage of Firecrawl v2.8.0 API endpoints including parallel agents, spark-1-fast model, and sitemap-only crawling.

API Dev 38 4mo ago

figma-mcp

by tdimino

Convert Figma designs into production-ready code using MCP server tools. Use this skill when users provide Figma URLs, request design-to-code conversion, ask to implement Figma mockups, or need to extract design tokens and system values from Figma files. Works with frames, components, and entire design files to generate HTML, CSS, React, or other frontend code.

Processing 38 6mo ago

4chan-reader

by openclaw

Browse 4chan boards and extract thread discussions into structured text files. Use when you need to fetch catalog information or specific thread content (including post text and file metadata) from 4chan boards like /a/, /vg/, /v/, etc.

Scraping 4.5K 5mo ago

Instructions

by openclaw

An archive of all versions of all skills on clawhub.com.

Processing 4.5K 5mo ago