Generate videos with Veo 3.1 (Google), Kling 3.0 (Kuaishou), or Sora 2 (OpenAI). Supports text-to-video AND image-to-video with customizable duration, aspect ratio, resolution, person generation, and multi-video batch output. Video extension/continuation up to ~3 minutes. Automatically uploads to Slack.
Install
Install via the SkillsCat registry:
npx skillscat add cgk-platform/cgk/veo-video-gen
Veo/Kling/Sora Video Generation
Generate high-quality videos using:
- Veo 3.1 (Google) — 1080p videos with native audio/dialogue/music
- Kling (Kuaishou) — 1080p videos with native audio in 5 languages, 5-10s, best for long-form via extension (default model: kling-v3-0)
- Sora 2 (OpenAI) — Variable duration videos (fallback)
Preferred backend order: veo-fast (primary) → kling-pro (secondary) → sora (tertiary)
Default backend: vertex-veo-fast (Veo 3.1 Fast on Vertex AI, auto-falls back to AI Studio)
NEVER call _generate_video_internal.py or _extend_video_internal.py directly; they WILL reject the call. Use generate_video_safe.sh and extend_video_safe.sh instead.
Quick Start
SLACK_CHANNEL_ID="<channel_id>" \
SLACK_THREAD_TS="<thread_ts>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--prompt "your video description" \
--filename "/Users/novarussell/.openclaw-rawdog/media/$(date +%Y-%m-%d-%H-%M-%S)-video.mp4"
All Parameters
| Parameter | Short | Required | Values | Default | Description |
|---|---|---|---|---|---|
| --prompt | -p | YES | text | — | Full video description including dialogue/voiceover |
| --filename | -f | YES | path | — | Output path (absolute, to ~/.openclaw-rawdog/media/) |
| --backend | -b | no | see below | vertex-veo-fast | Generation backend |
| --duration | -d | no | integer (seconds) | 8 | Video length |
| --aspect-ratio | -a | no | 9:16, 16:9, 1:1 | 9:16 | Aspect ratio |
| --resolution | -r | no | 720p, 1080p | 1080p | Output resolution |
| --num-videos | -n | no | 1, 2, 3, 4 | 1 | Number of videos to generate in one call |
| --person-generation | — | no | allow_all, allow_adult, dont_allow | unset | Person generation policy |
| --seed | — | no | integer | unset | Seed for reproducibility |
| --reference-image | -I | no | path | — | Reference image for image-to-video |
| --negative-prompt | — | no | text | — | Negative prompt (Kling V1 only, skipped on v2.5+) |
| --kling-model | — | no | model name | kling-v3-0 | Kling model override (kling-v1, kling-v2-6, kling-video-o1, kling-v3-0) |
| --cfg-scale | — | no | float 0-1 | 0.5 | CFG scale (Kling V1 only, ignored on v2+) |
| --camera | — | no | preset or JSON | — | Camera control for Kling text2video (V1 only, see presets below) |
| --multi-shot | — | no | JSON array | — | Multi-shot scenes for Kling (see below) |
| --end-image | — | no | path | — | End-frame image for Kling image-to-video (image_tail) |
| --no-upload | — | no | flag | — | Skip automatic Slack upload |
Backends
| Backend | Model | Route | Notes |
|---|---|---|---|
| vertex-veo-fast | veo-3.1-fast-generate-001 | Vertex AI → AI Studio fallback | Default. GA model, 50 RPM |
| vertex-veo-quality | veo-3.1-generate-001 | Vertex AI → AI Studio fallback | Higher quality, slower |
| veo-fast | veo-3.1-fast-generate-preview | AI Studio → Vertex fallback | AI Studio first |
| veo-quality | veo-3.1-generate-preview | AI Studio → Vertex fallback | AI Studio first |
| kling-pro | kling-v3-0 (pro mode) | Kling REST API | 1080p, ~60s gen, best quality. Override with --kling-model |
| kling-std | kling-v3-0 (std mode) | Kling REST API | 720p, ~30s gen, faster. Override with --kling-model |
| sora | Sora 2 | OpenAI via litellm | Fallback option |
Auto-fallback: vertex backends fall back to AI Studio on failure; AI Studio backends fall back to Vertex on rate limit.
Video Extension
Extend/continue a previously generated video using extend_video_safe.sh. Reads the .meta.json sidecar (auto-created by generate_video_safe.sh) to detect the backend and call the appropriate extension API.
Extension Command
SLACK_CHANNEL_ID="<channel_id>" \
SLACK_THREAD_TS="<thread_ts>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/extend_video_safe.sh \
--video "/Users/novarussell/.openclaw-rawdog/media/original-video.mp4" \
--prompt "continuation description" \
--extensions 3
Extension Parameters
| Parameter | Short | Required | Default | Description |
|---|---|---|---|---|
| --video | -v | YES | — | Path to video (must have .meta.json sidecar) |
| --prompt | -p | no | original prompt | Continuation prompt |
| --negative-prompt | — | no | — | Negative prompt (Kling only) |
| --extensions | -n | no | 1 | Number of extensions to chain |
| --filename | -f | no | {original}_extended.mp4 | Output path |
| --no-upload | — | no | — | Skip Slack upload |
Extension Behavior Per Backend
| Backend | Method | Per Extension | Max Total | Notes |
|---|---|---|---|---|
| Kling | Dedicated /video-extend endpoint | +5s | ~3 min | Best for long-form, chains via video_id |
| Veo 3.1 | SDK re-call with video ref | +7s | ~148s (2.5 min) | 720p only, requires AI Studio backend, 2-day URI expiry |
| Sora 2 | Frame extract + image-to-video + concat | configurable | ~60s practical | Coherence degrades, no audio continuity |
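The per-extension increments in the table make it easy to estimate a value for --extensions. The sketch below is plain integer arithmetic and assumes the increments are exact (+7s for Veo, +5s for Kling); real clips may come out slightly shorter or longer. The helper name extensions_needed is hypothetical and not part of the skill.

```shell
# Sketch: how many extensions are needed to reach a target length.
# Assumes each extension adds exactly the number of seconds listed in the table.
extensions_needed() {
    # $1 = current length (s), $2 = target length (s), $3 = seconds added per extension
    remaining=$(( $2 - $1 ))
    if [ "$remaining" -le 0 ]; then
        echo 0
        return 0
    fi
    # Integer ceiling division
    echo $(( (remaining + $3 - 1) / $3 ))
}

extensions_needed 8 148 7    # Veo: 8s clip up to the ~148s cap, prints 20
extensions_needed 10 180 5   # Kling: 10s clip up to ~3 minutes, prints 34
```

The result can be passed straight to --extensions, keeping in mind that the totals in the table are approximate.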
Extension Notes
- Prerequisite: The video must have a .meta.json sidecar (auto-created by generate_video_safe.sh).
- Veo: Only AI Studio backends (veo-fast, veo-quality) save the video URI needed for extension. Vertex-generated videos cannot be extended; regenerate with an AI Studio backend first. URIs expire after 2 days.
- Sora: Uses frame extraction + image-to-video + ffmpeg concat. Requires ffmpeg installed. Calls OpenAI directly (not litellm).
- Kling: Best for long-form. Dedicated /video-extend endpoint, chains via video_id (saved in .meta.json).
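Since extension depends on the sidecar, it can help to check for it before calling extend_video_safe.sh. This is a minimal sketch: the exact sidecar file name is not stated here, so the helper (hypothetical name find_sidecar) checks two plausible layouts, video.meta.json and video.mp4.meta.json, and both are assumptions.

```shell
# Sketch: confirm a sidecar exists before calling extend_video_safe.sh.
# The sidecar naming is an assumption; two plausible layouts are checked.
find_sidecar() {
    # $1 = path to the video
    for candidate in "${1%.mp4}.meta.json" "$1.meta.json"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no .meta.json sidecar found for $1" >&2
    return 1
}

# Example (not run here):
# find_sidecar "/Users/novarussell/.openclaw-rawdog/media/original-video.mp4" || exit 1
```

If no sidecar is found, the video was probably not produced by generate_video_safe.sh and would need to be regenerated before it can be extended.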
Multi-Video Output (--num-videos)
Generate 1–4 variations in one call:
SLACK_CHANNEL_ID="<channel>" SLACK_THREAD_TS="<thread>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--prompt "A serene beach at sunset with gentle waves" \
--filename "/Users/novarussell/.openclaw-rawdog/media/$(date +%Y-%m-%d-%H-%M-%S)-beach.mp4" \
--num-videos 4 \
--duration 8
When --num-videos > 1, files are saved as {stem}_1.mp4, {stem}_2.mp4, etc. All are uploaded to Slack individually.
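The naming rule above can be sketched as a small helper that lists the files a call is expected to produce. It assumes that {stem} is the --filename value with its .mp4 extension removed, and that a single-video call writes exactly the path given; expected_outputs is a hypothetical name, not part of the skill.

```shell
# Sketch: list the files a multi-video call is expected to write.
# Assumes {stem} is the --filename value without its .mp4 extension.
expected_outputs() {
    # $1 = value passed to --filename, $2 = value passed to --num-videos
    if [ "$2" -le 1 ]; then
        echo "$1"
        return 0
    fi
    stem="${1%.mp4}"
    i=1
    while [ "$i" -le "$2" ]; do
        echo "${stem}_${i}.mp4"
        i=$((i + 1))
    done
}

expected_outputs "/tmp/beach.mp4" 3
```

For the call shown, this prints /tmp/beach_1.mp4 through /tmp/beach_3.mp4, one per line.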
Person Generation
Controls whether and what age of people can appear:
--person-generation allow_all # All ages allowed (good for family/lifestyle content)
--person-generation allow_adult # Adults only
--person-generation dont_allow # No people (product shots, landscapes)
Default is unset (model decides). For ad content featuring people, use allow_adult or allow_all.
Resolution
--resolution 1080p # Default — full HD
--resolution 720p # Faster generation, smaller files
Seed (Reproducibility)
--seed 42 # Reproduce a specific generation
--seed 0 # Fixed seed from the docs example
Useful for iterating on a prompt while keeping similar composition.
Image-to-Video
SLACK_CHANNEL_ID="<channel>" SLACK_THREAD_TS="<thread>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--prompt "A woman relaxing in white bedding, camera slowly dollies forward, morning sunlight streaming through windows" \
--reference-image "/Users/novarussell/.openclaw-rawdog/media/remixed-frame.png" \
--filename "/Users/novarussell/.openclaw-rawdog/media/$(date +%Y-%m-%d-%H-%M-%S)-from-image.mp4" \
--duration 8 \
--aspect-ratio 9:16 \
--person-generation allow_adult
Kling Camera Control (--camera)
Control camera movement in Kling text-to-video generation. V1 models only (--kling-model kling-v1) — camera_control is not supported on v2+ models and will be ignored with a warning.
Named Presets (type-based, no config)
| Preset | API type | Effect |
|---|---|---|
| down-back | down_back | Camera moves down and back |
| forward-up | forward_up | Camera moves forward and up |
| right-turn | right_turn_forward | Camera turns right moving forward |
| left-turn | left_turn_forward | Camera turns left moving forward |
Simple Movement Presets (type="simple" with config)
| Preset | Effect |
|---|---|
| dolly-in | Zoom into scene (zoom=10) |
| dolly-out | Zoom out of scene (zoom=-10) |
| pan-left | Camera pans left (pan=-10) |
| pan-right | Camera pans right (pan=10) |
| tilt-up | Camera tilts upward (tilt=10) |
| tilt-down | Camera tilts downward (tilt=-10) |
| zoom-in | Zoom into scene (zoom=10) |
| zoom-out | Zoom out of scene (zoom=-10) |
SLACK_CHANNEL_ID="<channel>" SLACK_THREAD_TS="<thread>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--prompt "A cat sitting on a windowsill" \
--filename "/Users/novarussell/.openclaw-rawdog/media/cat-dolly.mp4" \
--backend kling-pro --kling-model kling-v1 --camera dolly-in
For custom camera values, pass raw JSON matching the official API structure:
--camera '{"type": "simple", "config": {"horizontal": 0, "vertical": 0, "pan": -5, "tilt": 0, "roll": 0, "zoom": 3}}'
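Hand-writing the camera JSON is error-prone, so a small helper can build it. The sketch below fills in only pan, tilt, and zoom and leaves the other axes at 0, mirroring the structure shown above. The helper name camera_json is hypothetical, and the accepted value range is not stated here, so no range check is attempted.

```shell
# Sketch: build a custom --camera value with the same structure as the example above.
# Only pan, tilt, and zoom are parameterized; the remaining axes stay at 0.
camera_json() {
    # $1 = pan, $2 = tilt, $3 = zoom (integers)
    printf '{"type": "simple", "config": {"horizontal": 0, "vertical": 0, "pan": %d, "tilt": %d, "roll": 0, "zoom": %d}}\n' "$1" "$2" "$3"
}

camera_json -5 0 3
```

The output can be passed as --camera "$(camera_json -5 0 3)" together with --kling-model kling-v1.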
Kling Multi-Shot (--multi-shot)
Generate multi-scene videos (3-15s total) in a single Kling call. Pass a JSON array of scenes:
SLACK_CHANNEL_ID="<channel>" SLACK_THREAD_TS="<thread>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--filename "/Users/novarussell/.openclaw-rawdog/media/multi-shot.mp4" \
--backend kling-pro \
--multi-shot '[{"prompt":"Wide shot of a beach at sunrise","duration":5},{"prompt":"Close-up of waves crashing on rocks","duration":5}]'
When --multi-shot is provided, --prompt is ignored. Each shot needs prompt and duration (seconds).
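Because the shots must add up to 3-15s in total, it may be worth summing the durations before building the JSON. A minimal sketch, assuming the limit applies to the simple sum of all shot durations; check_multishot_total is a hypothetical helper, not part of the skill.

```shell
# Sketch: verify that per-shot durations add up to the documented 3-15s range.
check_multishot_total() {
    # Arguments: one duration (seconds) per shot
    total=0
    for d in "$@"; do
        total=$((total + d))
    done
    if [ "$total" -lt 3 ] || [ "$total" -gt 15 ]; then
        echo "total ${total}s is outside the 3-15s range" >&2
        return 1
    fi
    echo "$total"
}

check_multishot_total 5 5   # the two-shot example above, prints 10
```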
Kling End-Frame (--end-image)
For Kling image-to-video, specify an end-frame image to create smooth start-to-end transitions:
SLACK_CHANNEL_ID="<channel>" SLACK_THREAD_TS="<thread>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--prompt "Smooth zoom transition" \
--reference-image start.png --end-image end.png \
--filename "/Users/novarussell/.openclaw-rawdog/media/transition.mp4" \
--backend kling-pro
Kling Model Override (--kling-model)
The default model is kling-v3-0. Override it to use a different model:
--kling-model kling-video-o1 # Reasoning model
--kling-model kling-v2-5-turbo # Turbo variant
--kling-model kling-v3-0 # V3 (if available on your account)
Audio / Voiceover
Veo 3.1 generates audio natively (dialogue, sound effects, ambient, music). Audio is always enabled on Vertex backends. Kling v2-6+ also generates native audio via enable_audio: true (sent automatically for v2+ models).
- Include spoken dialogue directly in the prompt in quotes
- Example: A woman says, "Your bedroom really says a lot about you."
- Keep spoken lines under ~7 words each for best lip-sync
- Kling: Audio is auto-enabled on v2-6+ models. Not available when using --end-image (image_tail is incompatible with enable_audio). V1 models do not support audio.
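The word-count guideline for spoken lines can be checked mechanically before a prompt is sent. The sketch below reports quoted spans that exceed a limit; it only reports and never changes the prompt. It assumes dialogue is wrapped in straight double quotes, so curly quotes are not detected, and long_dialogue_lines is a hypothetical helper name.

```shell
# Sketch: report quoted dialogue that exceeds a word limit (default 7).
# Read-only check: the prompt itself is never modified.
long_dialogue_lines() {
    # $1 = prompt text, $2 = word limit (optional)
    limit="${2:-7}"
    printf '%s\n' "$1" |
        awk -F'"' -v limit="$limit" '{
            # With a double quote as separator, even-numbered fields are the quoted spans
            for (i = 2; i <= NF; i += 2) {
                n = split($i, words, " ")
                if (n > limit) print n ": " $i
            }
        }'
}

long_dialogue_lines 'A man says, "I never thought this old phone would ring again tonight."'
```

Each reported line starts with the word count, which makes it easy to ask the prompt's author whether the line should be split. The prompt should still be passed to the script unmodified.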
Full-Featured Example
SLACK_CHANNEL_ID="<CHANNEL_ID>" \
SLACK_THREAD_TS="<THREAD_TS>" \
~/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh \
--prompt "The camera dollies to show a close up of a desperate man in a green trench coat making a call on a rotary wall phone with green neon light, cinematic movie scene" \
--filename "/Users/novarussell/.openclaw-rawdog/media/$(date +%Y-%m-%d-%H-%M-%S)-trenchcoat.mp4" \
--backend vertex-veo-quality \
--duration 8 \
--aspect-ratio 9:16 \
--resolution 1080p \
--num-videos 4 \
--person-generation allow_all \
--seed 0
Output
- Saves MP4(s) to specified path (or {stem}_1.mp4, {stem}_2.mp4 for multi-video)
- Saves .meta.json sidecar alongside each video (for extension support)
- Uploads each video to Slack automatically (unless --no-upload)
- Prints MEDIA:<full_path> for each video
- Prints SLACK_UPLOADED:true on successful upload
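The MEDIA: and SLACK_UPLOADED: markers lend themselves to simple parsing when the script output is captured. The sketch below assumes each marker starts at the beginning of a line with no trailing whitespace. The helper names media_paths and slack_uploaded are hypothetical, and the sample output is made up for illustration.

```shell
# Sketch: pull generated file paths and the upload flag out of captured output.
# Assumes the markers start at the beginning of a line.
media_paths() {
    # Reads script output on stdin, prints one path per MEDIA: line
    sed -n 's/^MEDIA://p'
}

slack_uploaded() {
    # Succeeds only if the output contains the SLACK_UPLOADED:true marker
    grep -q '^SLACK_UPLOADED:true$'
}

# Sample output for illustration only; real runs may print additional lines.
sample='MEDIA:/tmp/clip_1.mp4
MEDIA:/tmp/clip_2.mp4
SLACK_UPLOADED:true'

printf '%s\n' "$sample" | media_paths
printf '%s\n' "$sample" | slack_uploaded && echo "uploaded"
```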
Timing
- Veo: 60–120 seconds per generation
- Kling: ~30s (std) to ~60s (pro) per generation
- Sora: 60–90 seconds
- For --num-videos 4: allow up to 10 minutes (videos generate in parallel on Vertex)
- Use a yield/timeout of at least 600000ms (10 minutes) for multi-video or extension calls
CRITICAL RULES
- NEVER modify, shorten, or simplify the user's prompt. Pass it exactly as provided.
- NEVER remove dialogue, voiceover, or narration from the prompt.
- NEVER edit this script. If it fails, try a different backend or report the error.
- If one backend fails, try another (vertex-veo-fast → vertex-veo-quality → kling-pro → sora).
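The backend retry rule can be sketched as a loop over the documented order. This is an illustration under two assumptions: that generate_video_safe.sh exits non-zero when a backend fails, and that the Slack variables are already set in the environment. The GENERATE variable and the generate_with_fallback name are hypothetical. The prompt is passed through exactly as given.

```shell
# Sketch: try each backend in the documented order until one succeeds.
# Assumes the wrapper script exits non-zero on failure.
GENERATE="${GENERATE:-$HOME/.openclaw-rawdog/skills/veo-video-gen/scripts/generate_video_safe.sh}"

generate_with_fallback() {
    # $1 = prompt (passed through unmodified), $2 = absolute output path
    for backend in vertex-veo-fast vertex-veo-quality kling-pro sora; do
        if "$GENERATE" --prompt "$1" --filename "$2" --backend "$backend"; then
            echo "backend used: $backend"
            return 0
        fi
        echo "backend $backend failed, trying next" >&2
    done
    echo "all backends failed" >&2
    return 1
}

# Example (not run here):
# generate_with_fallback "your video description" "$HOME/.openclaw-rawdog/media/clip.mp4"
```

If every backend fails, the function returns non-zero so the error can be reported rather than retried indefinitely.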
API Keys / Auth
- Vertex: GOOGLE_APPLICATION_CREDENTIALS → ~/.config/gcloud/vertex-service-account.json
- AI Studio: GEMINI_API_KEY
- Kling: KLING_ACCESS_KEY + KLING_SECRET_KEY (JWT auth)
- Sora: OPENAI_API_KEY (via litellm proxy)
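A quick pre-flight check can catch missing credentials before a long generation call. The sketch below maps each backend to the variables listed above and prints any that are unset or empty. It checks only the primary route, so a Vertex backend that falls back to AI Studio may also need GEMINI_API_KEY. The helper names required_vars and missing_vars are hypothetical.

```shell
# Sketch: report which credential variables are missing for a backend.
# Only the primary route of each backend is considered.
required_vars() {
    case "$1" in
        vertex-veo-fast|vertex-veo-quality) echo "GOOGLE_APPLICATION_CREDENTIALS" ;;
        veo-fast|veo-quality) echo "GEMINI_API_KEY" ;;
        kling-pro|kling-std) echo "KLING_ACCESS_KEY KLING_SECRET_KEY" ;;
        sora) echo "OPENAI_API_KEY" ;;
        *) return 1 ;;
    esac
}

missing_vars() {
    # $1 = backend name; prints one missing variable name per line
    for name in $(required_vars "$1"); do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

missing_vars kling-pro
```

Empty output means every variable for that backend is set. It does not confirm that the credentials are valid.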