This skill provides comprehensive guidance for using the Replicate CLI to run AI models, create predictions, manage deployments, and fine-tune models. Use it when the user wants to work with Replicate's AI model platform from the command line: running image generation models, language models, or any other ML model hosted on Replicate, creating predictions, managing deployments, fine-tuning models, or calling the Replicate API through the CLI.
Install
npx skillscat add vladimirbrejcha/ios-ai-skills/replicate-cli
Install via the SkillsCat registry.
Replicate CLI
The Replicate CLI is a command-line tool for interacting with Replicate's AI model platform. It enables running predictions, managing models, creating deployments, and fine-tuning models directly from the terminal.
Authentication
Before using the Replicate CLI, set the API token:
export REPLICATE_API_TOKEN=<token-from-replicate.com/account>
Alternatively, authenticate interactively:
replicate auth login
Verify authentication:
replicate account current
Core Commands
Running Predictions
The primary use case is running predictions against hosted models.
Basic prediction:
replicate run <owner/model> input_key=value
Examples:
Image generation:
replicate run stability-ai/sdxl prompt="a studio photo of a rainbow colored corgi"
Text generation with streaming:
replicate run meta/llama-2-70b-chat --stream prompt="Tell me a joke"
Prediction flags:
- `--stream` - Stream output tokens in real time (for text models)
- `--no-wait` - Submit prediction without waiting for completion
- `--web` - Open prediction in browser
- `--json` - Output result as JSON
- `--save` - Save outputs to local directory
- `--output-directory <dir>` - Specify output directory (default: `./{prediction-id}`)
Seedance / Seedream Notes (Video Loops)
- Seedream is image-only on Replicate (`bytedance/seedream-3`). It does not generate video.
- Video models:
  - `bytedance/seedance-1-lite` → good for short loops; use `duration=3`, `fps=24`, `camera_fixed=true`.
  - `bytedance/seedance-1.5-pro` → 3s duration fails; use `duration=5` minimum. Output is a URL.
- Looping basics: set `last_frame_image` to the same image as `image`, and use `camera_fixed=true` to reduce camera motion.
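The looping notes above can be collected into a small helper. This is only a sketch: `seedance_loop_inputs` is a hypothetical function name, and the values it prints are the ones listed in the notes, not values checked against each model's schema. It prints the inputs rather than running anything.

```shell
#!/bin/sh
# Hypothetical helper: print the loop-related inputs for a Seedance model,
# using the durations and flags from the notes above.
seedance_loop_inputs() {
  model=$1
  image=$2
  case "$model" in
    bytedance/seedance-1-lite)  timing="duration=3 fps=24" ;;
    bytedance/seedance-1.5-pro) timing="duration=5" ;;
    *) echo "no Seedance notes for model: $model" >&2; return 1 ;;
  esac
  # Same image as first and last frame, fixed camera to reduce motion.
  echo "image=@$image last_frame_image=@$image $timing camera_fixed=true"
}

seedance_loop_inputs bytedance/seedance-1-lite frame.png
```

Append the printed inputs to a `replicate run <model>` command along with whatever other inputs the model schema requires.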
CLI output quirks & fixes
- `replicate run ... --json` can return `[]` even when the prediction succeeds.
  - Fix: poll predictions and fetch the output URL:
    replicate prediction list --json
    replicate prediction show <id> --json
- Seedance-1-lite often returns a data URI (`data:video/mp4;base64,...`) → base64-decode to MP4.
- Seedance-1.5-pro returns a direct URL in `output` → download with `curl -L`.
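The fixes above can be sketched as two shell functions. `wait_for_prediction` and `save_output` are hypothetical names, and the sketch assumes that `replicate prediction show <id> --json` prints a JSON object with a `status` field that ends in `succeeded`, `failed`, or `canceled`, as the Replicate HTTP API does. It needs `jq`, `curl`, and `base64`.

```shell
#!/bin/sh
# Poll until the prediction reaches a terminal status, then print that status.
# Assumption: the CLI's JSON output carries a `status` field like the HTTP API.
wait_for_prediction() {
  id=$1
  while :; do
    status=$(replicate prediction show "$id" --json | jq -r '.status')
    case "$status" in
      succeeded|failed|canceled) echo "$status"; return 0 ;;
      ""|null) echo "could not read status for $id" >&2; return 1 ;;
    esac
    sleep 2
  done
}

# Save one output value to a file, handling both shapes described above:
# a base64 data URI or a direct URL.
save_output() {
  value=$1
  dest=$2
  case "$value" in
    data:*\;base64,*)
      # Strip everything up to the first comma, then decode the payload.
      printf '%s' "${value#*,}" | base64 --decode > "$dest"
      ;;
    http://*|https://*)
      curl -L --fail --silent --show-error -o "$dest" "$value"
      ;;
    *)
      echo "unrecognized output value" >&2
      return 1
      ;;
  esac
}

# Example (not run here):
#   wait_for_prediction "$id" &&
#     save_output "$(replicate prediction show "$id" --json | jq -r '.output')" loop.mp4
```

If `output` turns out to be an array for a given model, select the element you need in the `jq` filter before passing it to `save_output`.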
Input Handling
File uploads: Prefix local file paths with @:
replicate run nightmareai/real-esrgan image=@photo.jpg
Output chaining: Use {{.output}} template syntax to chain predictions:
replicate run stability-ai/sdxl prompt="a corgi" | \
  replicate run nightmareai/real-esrgan image={{.output[0]}}
Model Operations
View model schema (see required inputs and outputs):
replicate model schema <owner/model>
replicate model schema stability-ai/sdxl --json
List models:
replicate model list
replicate model list --json
Show model details:
replicate model show <owner/model>
Create a new model:
replicate model create <owner/name> \
  --hardware gpu-a100-large \
  --private \
  --description "Model description"
Model creation flags:
- `--hardware <sku>` - Hardware SKU (see references/hardware.md)
- `--private` / `--public` - Visibility setting
- `--description <text>` - Model description
- `--github-url <url>` - Link to source repository
- `--license-url <url>` - License information
- `--cover-image-url <url>` - Cover image for model page
Training (Fine-tuning)
Fine-tune models using the training command:
replicate train <base-model> \
  --destination <owner/new-model> \
  input_key=value
Example - Fine-tune SDXL with DreamBooth:
replicate train stability-ai/sdxl \
  --destination myuser/custom-sdxl \
  --web \
  input_images=@training-images.zip \
  use_face_detection_instead=true
List trainings:
replicate training list
Show training details:
replicate training show <training-id>
Deployments
Deployments provide dedicated, always-on inference endpoints with predictable performance.
Create deployment:
replicate deployments create <name> \
  --model <owner/model> \
  --hardware <sku> \
  --min-instances 1 \
  --max-instances 3
Example:
replicate deployments create text-to-image \
  --model stability-ai/sdxl \
  --hardware gpu-a100-large \
  --min-instances 1 \
  --max-instances 5
Update deployment:
replicate deployments update <name> \
  --max-instances 10 \
  --version <version-id>
List deployments:
replicate deployments list
Show deployment details and schema:
replicate deployments show <name>
replicate deployments schema <name>
Hardware
List available hardware options:
replicate hardware list
See references/hardware.md for detailed hardware information and selection guidelines.
Scaffolding
Create a local development environment from an existing prediction:
replicate scaffold <prediction-id-or-url> --template=<node|python>
This generates a project with the prediction's model and inputs pre-configured.
Command Aliases
For convenience, these aliases are available:
| Alias | Equivalent Command |
|---|---|
| `replicate run` | `replicate prediction create` |
| `replicate stream` | `replicate prediction create --stream` |
| `replicate train` | `replicate training create` |
Short aliases for subcommands:
- `replicate m` = `replicate model`
- `replicate p` = `replicate prediction`
- `replicate t` = `replicate training`
- `replicate d` = `replicate deployments`
- `replicate hw` = `replicate hardware`
- `replicate a` = `replicate account`
Common Workflows
Image Generation Pipeline
Generate an image and upscale it:
replicate run stability-ai/sdxl \
  prompt="professional photo of a sunset" \
  negative_prompt="blurry, low quality" | \
  replicate run nightmareai/real-esrgan \
  image={{.output[0]}} \
  --save
Check Model Inputs Before Running
Always check the model schema to understand required inputs:
replicate model schema owner/model-name
Batch Processing
Run predictions and save outputs:
for prompt in "cat" "dog" "bird"; do
  replicate run stability-ai/sdxl prompt="$prompt" --save --output-directory "./outputs/$prompt"
done
Monitor Long-Running Tasks
Submit without waiting, then check status:
# Submit
replicate run owner/model input=value --no-wait --json > prediction.json
# Check status later
replicate prediction show "$(jq -r '.id' prediction.json)"
Best Practices
- Always check schema first - Run `replicate model schema <model>` to understand required and optional inputs before running predictions.
- Use streaming for text models - Add the `--stream` flag when running language models to see output in real time.
- Save outputs explicitly - Use `--save` and `--output-directory` to organize prediction outputs.
- Use JSON output for automation - Add the `--json` flag when parsing outputs programmatically.
- Open in web for debugging - Add the `--web` flag to view predictions in the Replicate dashboard for detailed logs.
- Chain predictions efficiently - Use the `{{.output}}` syntax to pass outputs between models without intermediate saves.
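For the "save outputs explicitly" practice, longer prompts make poor directory names. A minimal sketch, assuming a hypothetical `slugify` helper, turns each prompt into a safe directory name before passing it to `--output-directory`. The loop only prints the commands; remove the `echo` to run them.

```shell
#!/bin/sh
# Hypothetical helper: lowercase the prompt and collapse every run of
# characters outside a-z0-9 into a single dash.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-//; s/-$//'
}

# Dry run: print one command per prompt with a slugified output directory.
for prompt in "A cat in the rain" "dog, studio photo"; do
  dir="./outputs/$(slugify "$prompt")"
  echo replicate run stability-ai/sdxl "prompt=$prompt" --save --output-directory "$dir"
done
```

Two prompts that differ only in punctuation or case map to the same directory, so add a counter or timestamp to the name if that matters for your batch.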
Troubleshooting
Authentication errors:
- Verify `REPLICATE_API_TOKEN` is set correctly
- Run `replicate account current` to test authentication
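A script can run the first of these checks itself before calling the CLI. The sketch below uses a hypothetical `require_token` function and only checks that the variable is non-empty and free of stray whitespace; it does not contact Replicate, so `replicate account current` remains the real test.

```shell
#!/bin/sh
# Hypothetical preflight check: fail early when the token is missing or
# contains whitespace (a common copy-and-paste mistake).
require_token() {
  if [ -z "${REPLICATE_API_TOKEN:-}" ]; then
    echo "REPLICATE_API_TOKEN is not set; export it or run: replicate auth login" >&2
    return 1
  fi
  case "$REPLICATE_API_TOKEN" in
    *[[:space:]]*)
      echo "REPLICATE_API_TOKEN contains whitespace; re-copy the token" >&2
      return 1
      ;;
  esac
}

# Example (not run here):
#   require_token && replicate account current
```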
Model not found:
- Check model name format: `owner/model-name`
- Verify model exists at replicate.com
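The format check can be done locally before any request is sent. This sketch uses a hypothetical `is_model_ref` function that accepts exactly one slash with a non-empty name on each side; it says nothing about whether the model actually exists.

```shell
#!/bin/sh
# Hypothetical check: succeed only for names shaped like owner/model-name.
is_model_ref() {
  case "$1" in
    */*/*|/*|*/|*' '*) return 1 ;;  # extra slash, empty side, or a space
    */*) return 0 ;;
    *) return 1 ;;                  # no slash at all
  esac
}

for name in stability-ai/sdxl sdxl https://replicate.com/stability-ai/sdxl; do
  if is_model_ref "$name"; then
    echo "ok: $name"
  else
    echo "bad format: $name"
  fi
done
```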
Input validation errors:
- Run `replicate model schema <model>` to see required inputs
- Check input types (string, number, file)
File upload issues:
- Ensure `@` prefix is used for local files
- Verify file path is correct and file exists
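Both file checks can be automated over the same `key=value` arguments that are passed to `replicate run`. The sketch below assumes a hypothetical `check_file_inputs` function; it inspects only arguments of the form `key=@path` and ignores everything else.

```shell
#!/bin/sh
# Hypothetical check: for every key=@path argument, confirm the file exists.
check_file_inputs() {
  for arg in "$@"; do
    case "$arg" in
      *=@*)
        path=${arg#*=@}   # text after the first "=@"
        if [ ! -f "$path" ]; then
          echo "missing file for input '${arg%%=*}': $path" >&2
          return 1
        fi
        ;;
    esac
  done
}

# Example (not run here):
#   check_file_inputs image=@photo.jpg scale=2 &&
#     replicate run nightmareai/real-esrgan image=@photo.jpg scale=2
```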
Additional Resources
- Replicate documentation: https://replicate.com/docs
- Model explorer: https://replicate.com/explore
- API reference: https://replicate.com/docs/reference/http
- GitHub repository: https://github.com/replicate/cli