AI-powered media understanding and analysis for images, videos, and audio. Use when users ask to describe, analyze, summarize, or extract text (OCR) from media files.
Install
npx skillscat add maxgent-ai/maxgent-plugin/media-understand
Install via the SkillsCat registry.
Media Understanding
Analyze multimedia content through the Maxgent FAL API proxy, using the default route.
Supported Formats
| Type | Formats | Max Size |
|---|---|---|
| Image | jpg, jpeg, png, gif, webp | 20MB |
| Video | mp4, mpeg, mov, webm, YouTube URL | 100MB |
| Audio | wav, mp3, aiff, aac, ogg, flac, m4a | 100MB |
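The limits in the table can be checked before anything is uploaded. The following is a minimal sketch, not the skill's actual implementation: the helper names `detectMediaType` and `validateMedia` are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and YouTube links are assumed to be recognised by hostname.

```javascript
// Limits copied from the Supported Formats table.
// Assumption: 1 MB = 1024 * 1024 bytes (the source does not say).
const MB = 1024 * 1024;
const MEDIA_RULES = {
  image: { exts: ["jpg", "jpeg", "png", "gif", "webp"], maxBytes: 20 * MB },
  video: { exts: ["mp4", "mpeg", "mov", "webm"], maxBytes: 100 * MB },
  audio: { exts: ["wav", "mp3", "aiff", "aac", "ogg", "flac", "m4a"], maxBytes: 100 * MB },
};

// Hypothetical helper: returns "image" | "video" | "audio", or null if unsupported.
function detectMediaType(media) {
  // Assumption: YouTube URLs are matched by hostname and treated as video.
  if (/^https?:\/\/(www\.)?(youtube\.com|youtu\.be)\//i.test(media)) return "video";
  const match = /\.([a-z0-9]+)$/i.exec(media);
  if (!match) return null;
  const ext = match[1].toLowerCase();
  for (const [type, rule] of Object.entries(MEDIA_RULES)) {
    if (rule.exts.includes(ext)) return type;
  }
  return null;
}

// Hypothetical helper: checks the format and the size limit for a local file.
function validateMedia(media, sizeBytes) {
  const type = detectMediaType(media);
  if (type === null) return { ok: false, reason: "unsupported format" };
  if (sizeBytes > MEDIA_RULES[type].maxBytes) {
    return { ok: false, reason: `${type} exceeds size limit` };
  }
  return { ok: true, type };
}
```

For a local file, the size could come from `fs.statSync(path).size`; for a YouTube URL there is no local size to check, so passing `0` is one option.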
Prerequisites
- MAX_API_KEY environment variable (auto-injected by Max)
- Bun 1.0+ (built into Max)
Routing
- default
  - Endpoint: openrouter/router/openai/v1/chat/completions
  - Model: DEFAULT_MM_MODEL, defaults to google/gemini-2.5-pro (override with --model)
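The model precedence described above (a `--model` override first, then `DEFAULT_MM_MODEL`, then `google/gemini-2.5-pro`) can be sketched as follows. This is an illustration under assumptions, not the script's actual code: `DEFAULT_MM_MODEL` is assumed to be an environment variable, the helper names are hypothetical, and the request body follows the general OpenAI-style chat completions shape for an image input only. The payload the skill really sends, especially for video and audio, may differ.

```javascript
// Fallback model named in the Routing section.
const FALLBACK_MODEL = "google/gemini-2.5-pro";

// Hypothetical helper. Precedence: --model flag, then DEFAULT_MM_MODEL, then the fallback.
// Assumption: DEFAULT_MM_MODEL is an environment variable; it could also be a constant in the script.
function resolveModel(cliModel, env) {
  return cliModel || env.DEFAULT_MM_MODEL || FALLBACK_MODEL;
}

// Hypothetical helper: builds an OpenAI-style chat completions body for one image.
// The defaults for maxTokens and temperature mirror the documented CLI defaults.
function buildImageRequest({ prompt, mediaUrl, model, maxTokens = 4096, temperature = 0.2 }) {
  return {
    model,
    max_tokens: maxTokens,
    temperature,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: mediaUrl } },
        ],
      },
    ],
  };
}
```

A call such as `buildImageRequest({ prompt: "extract all text", mediaUrl: uploadedUrl, model: resolveModel(undefined, process.env) })` would then yield the body to POST to the endpoint, where `uploadedUrl` stands for whatever URL the upload step returns.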
Usage
bun skills/media-understand/media-understand.js \
--media PATH_OR_URL --prompt "PROMPT" \
[--language chinese|english] [--model MODEL_ID] \
[--max-tokens N] [--temperature X]
Parameters:
- --media: local file path or YouTube URL
- --prompt: analysis question
- --language: chinese (default) or english
- --model: override the default model
- --max-tokens: max output tokens (default 4096)
- --temperature: sampling temperature (default 0.2)
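To make the defaults concrete, here is a minimal sketch of how these flags might be parsed. The function name `parseCliArgs` is hypothetical, and treating `--media` and `--prompt` as required is an assumption based on their appearing outside the square brackets in the usage line. The real script may parse its arguments differently.

```javascript
// Documented defaults from the Parameters list.
const DEFAULTS = { language: "chinese", maxTokens: 4096, temperature: 0.2 };

// Hypothetical parser: expects argv as alternating "--flag value" pairs.
function parseCliArgs(argv) {
  const opts = { ...DEFAULTS };
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (value === undefined) throw new Error(`missing value for ${flag}`);
    switch (flag) {
      case "--media": opts.media = value; break;
      case "--prompt": opts.prompt = value; break;
      case "--language":
        if (value !== "chinese" && value !== "english") {
          throw new Error(`unsupported language: ${value}`);
        }
        opts.language = value;
        break;
      case "--model": opts.model = value; break;
      case "--max-tokens": opts.maxTokens = Number.parseInt(value, 10); break;
      case "--temperature": opts.temperature = Number.parseFloat(value); break;
      default: throw new Error(`unknown flag: ${flag}`);
    }
  }
  // Assumption: both flags are mandatory, as the usage line suggests.
  if (!opts.media || !opts.prompt) throw new Error("--media and --prompt are required");
  return opts;
}
```

In a script this would typically be called as `parseCliArgs(process.argv.slice(2))`.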
Examples
# Image OCR
bun skills/media-understand/media-understand.js --media ./screenshot.png --prompt "extract all text from this image" --language english
# Video summary (YouTube)
bun skills/media-understand/media-understand.js --media "https://youtube.com/watch?v=xxx" --prompt "summarize this video" --language english
# Local audio analysis
bun skills/media-understand/media-understand.js --media ./meeting.m4a --prompt "summarize key points and list action items" --language english
Instructions
- Check MAX_API_KEY.
- Identify media type and validate size limits.
- Analyze using the default route; override the model with --model if needed.
- Local images/videos/audio are auto-uploaded via FAL upload proxy before analysis.
- On success, return readable text.
- On failure:
  - HTTP 402 (insufficient credits): Stop immediately. Do NOT retry. Tell the user their API credits are exhausted.
  - Other errors: retry once with a different model. If it fails again, stop and clearly indicate whether it's an upload / proxy / model parameter issue.
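The failure rules above can be sketched as a small wrapper. This is one possible shape, not the skill's implementation: `ApiError`, `analyzeWithFallback`, and the injected `analyze(model)` callback are all hypothetical, and errors are assumed to carry the HTTP status on a `status` property.

```javascript
// Hypothetical error type carrying the HTTP status of a failed request.
class ApiError extends Error {
  constructor(status, message) {
    super(message);
    this.status = status;
  }
}

const NO_CREDITS_MESSAGE = "API credits are exhausted; not retrying.";

// Hypothetical wrapper around an injected async analyze(model) function.
// 402 stops immediately; any other error triggers exactly one retry with fallbackModel.
async function analyzeWithFallback(analyze, primaryModel, fallbackModel) {
  try {
    return await analyze(primaryModel);
  } catch (err) {
    if (err.status === 402) throw new Error(NO_CREDITS_MESSAGE);
    try {
      return await analyze(fallbackModel);
    } catch (retryErr) {
      if (retryErr.status === 402) throw new Error(NO_CREDITS_MESSAGE);
      // Surface the underlying message so the caller can tell the user whether
      // the upload, the proxy, or a model parameter was at fault.
      throw new Error(`Analysis failed after one retry: ${retryErr.message}`);
    }
  }
}
```

Passing the analysis step in as a callback keeps the retry policy separate from the HTTP details, so the same wrapper applies whichever media type is being analyzed.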