replicate-cli

This skill covers using the Replicate CLI to run AI models, create predictions, manage deployments, and fine-tune models. Use it when the user wants to work with Replicate's AI model platform from the command line: running image generation models, language models, or any other ML model hosted on Replicate, or working with the Replicate API through the CLI.

VladimirBrejcha 19 Updated 5mo ago

Install

npx skillscat add vladimirbrejcha/ios-ai-skills/replicate-cli

Install via the SkillsCat registry.

Replicate CLI

The Replicate CLI is a command-line tool for interacting with Replicate's AI model platform. It enables running predictions, managing models, creating deployments, and fine-tuning models directly from the terminal.

Authentication

Before using the Replicate CLI, set the API token:

export REPLICATE_API_TOKEN=<token-from-replicate.com/account>

Alternatively, authenticate interactively:

replicate auth login

Verify authentication:

replicate account current
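
In scripts that rely on the environment variable, it can help to fail early when the token is missing instead of letting the first prediction fail. A minimal sketch; the function name `require_replicate_token` is hypothetical and not part of the CLI:

```shell
# Hypothetical helper (not part of the Replicate CLI): fail fast when the
# REPLICATE_API_TOKEN environment variable is unset or empty.
require_replicate_token() {
  if [ -z "${REPLICATE_API_TOKEN:-}" ]; then
    echo "REPLICATE_API_TOKEN is not set; export it or run 'replicate auth login'" >&2
    return 1
  fi
}
```

Call `require_replicate_token || exit 1` at the top of a script. The check only looks at the environment variable, so it suits the `export` route rather than a session authenticated with `replicate auth login`.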

Core Commands

Running Predictions

The primary use case is running predictions against hosted models.

Basic prediction:

replicate run <owner/model> input_key=value

Examples:

Image generation:

replicate run stability-ai/sdxl prompt="a studio photo of a rainbow colored corgi"

Text generation with streaming:

replicate run meta/llama-2-70b-chat --stream prompt="Tell me a joke"

Prediction flags:

--stream - Stream output tokens in real time (for text models)
--no-wait - Submit prediction without waiting for completion
--web - Open prediction in browser
--json - Output result as JSON
--save - Save outputs to local directory
--output-directory <dir> - Specify output directory (default: ./{prediction-id})

Seedance / Seedream Notes (Video Loops)

Seedream is image-only on Replicate (bytedance/seedream-3). It does not generate video.
Video models:
- bytedance/seedance-1-lite → good for short loops; use duration=3, fps=24, camera_fixed=true.
- bytedance/seedance-1.5-pro → 3s duration fails; use duration=5 minimum. Output is a URL.
Looping basics: set last_frame_image to the same image as image, and use camera_fixed=true to reduce camera motion.
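
The loop settings above can be bundled into a small wrapper. This is a sketch, not a tested recipe: the function name `run_seedance_loop` is hypothetical, and it assumes the model accepts a local file via the `@` prefix for both `image` and `last_frame_image`.

```shell
# Hypothetical wrapper: run bytedance/seedance-1-lite with the loop settings
# described above. $1 is a local image path; any remaining arguments are
# passed through unchanged as extra model inputs or CLI flags.
run_seedance_loop() {
  local img="$1"
  shift
  replicate run bytedance/seedance-1-lite \
    image=@"$img" \
    last_frame_image=@"$img" \
    duration=3 fps=24 camera_fixed=true \
    "$@"
}
```

For example, `run_seedance_loop ./frame.png prompt="gentle waves"`; the `prompt` input here is an assumption, so check the model schema for the inputs it actually accepts.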

CLI output quirks & fixes

replicate run ... --json can return [] even when the prediction succeeds.
- Fix: poll predictions and fetch output URL:
```
replicate prediction list --json
replicate prediction show <id> --json
```
Seedance-1-lite often returns a data URI (data:video/mp4;base64,...) → base64-decode to MP4.
Seedance-1.5-pro returns a direct URL in output → download with curl -L.
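
Both output shapes can be handled by one helper. A minimal sketch, assuming the output value has already been extracted as a single string; the function name `save_prediction_output` is hypothetical.

```shell
# Hypothetical helper: write one prediction output to a file.
# $1 is the output string (a base64 data URI or an http(s) URL),
# $2 is the destination path.
save_prediction_output() {
  local output="$1" dest="$2"
  case "$output" in
    data:*)
      # Drop everything up to and including "base64," and decode the rest.
      printf '%s' "${output#*base64,}" | base64 --decode > "$dest"
      ;;
    http://*|https://*)
      # Follow redirects with -L, as noted above; --fail turns HTTP errors
      # into a non-zero exit status.
      curl -L --fail --silent --show-error -o "$dest" "$output"
      ;;
    *)
      echo "unrecognized output format" >&2
      return 1
      ;;
  esac
}
```

One possible way to feed it is `save_prediction_output "$(replicate prediction show <id> --json | jq -r '.output')" loop.mp4`. That assumes the JSON has a top-level `output` field holding a single string; if the model returns a list, `.output[0]` would be needed instead.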

Input Handling

File uploads: Prefix local file paths with @:

replicate run nightmareai/real-esrgan image=@photo.jpg

Output chaining: Use {{.output}} template syntax to chain predictions:

replicate run stability-ai/sdxl prompt="a corgi" | \
replicate run nightmareai/real-esrgan image='{{.output[0]}}'

Model Operations

View model schema (see required inputs and outputs):

replicate model schema <owner/model>
replicate model schema stability-ai/sdxl --json

List models:

replicate model list
replicate model list --json

Show model details:

replicate model show <owner/model>

Create a new model:

replicate model create <owner/name> \
  --hardware gpu-a100-large \
  --private \
  --description "Model description"

Model creation flags:

--hardware <sku> - Hardware SKU (see references/hardware.md)
--private / --public - Visibility setting
--description <text> - Model description
--github-url <url> - Link to source repository
--license-url <url> - License information
--cover-image-url <url> - Cover image for model page

Training (Fine-tuning)

Fine-tune models using the training command:

replicate train <base-model> \
  --destination <owner/new-model> \
  input_key=value

Example - Fine-tune SDXL with DreamBooth:

replicate train stability-ai/sdxl \
  --destination myuser/custom-sdxl \
  --web \
  input_images=@training-images.zip \
  use_face_detection_instead=true

List trainings:

replicate training list

Show training details:

replicate training show <training-id>

Deployments

Deployments provide dedicated, always-on inference endpoints with predictable performance.

Create deployment:

replicate deployments create <name> \
  --model <owner/model> \
  --hardware <sku> \
  --min-instances 1 \
  --max-instances 3

Example:

replicate deployments create text-to-image \
  --model stability-ai/sdxl \
  --hardware gpu-a100-large \
  --min-instances 1 \
  --max-instances 5

Update deployment:

replicate deployments update <name> \
  --max-instances 10 \
  --version <version-id>

List deployments:

replicate deployments list

Show deployment details and schema:

replicate deployments show <name>
replicate deployments schema <name>

Hardware

List available hardware options:

replicate hardware list

See references/hardware.md for detailed hardware information and selection guidelines.

Scaffolding

Create a local development environment from an existing prediction:

replicate scaffold <prediction-id-or-url> --template=<node|python>

This generates a project with the prediction's model and inputs pre-configured.

Command Aliases

For convenience, these aliases are available:

| Alias | Equivalent Command |
| --- | --- |
| `replicate run` | `replicate prediction create` |
| `replicate stream` | `replicate prediction create --stream` |
| `replicate train` | `replicate training create` |

Short aliases for subcommands:

replicate m = replicate model
replicate p = replicate prediction
replicate t = replicate training
replicate d = replicate deployments
replicate hw = replicate hardware
replicate a = replicate account

Common Workflows

Image Generation Pipeline

Generate an image and upscale it:

replicate run stability-ai/sdxl \
  prompt="professional photo of a sunset" \
  negative_prompt="blurry, low quality" | \
replicate run nightmareai/real-esrgan \
  image='{{.output[0]}}' \
  --save

Check Model Inputs Before Running

Always check the model schema to understand required inputs:

replicate model schema owner/model-name

Batch Processing

Run predictions and save outputs:

for prompt in "cat" "dog" "bird"; do
  replicate run stability-ai/sdxl prompt="$prompt" --save --output-directory "./outputs/$prompt"
done
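
Prompts with spaces or punctuation make awkward directory names. One option is to derive a filesystem-friendly name from each prompt first. A sketch under that assumption; `slugify` and `batch_generate` are hypothetical helper names, not CLI features.

```shell
# Hypothetical helper: lowercase a string and collapse every run of
# characters outside a-z and 0-9 into a single hyphen.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-*//; s/-*$//'
}

# Run one prediction per argument, saving each into ./outputs/<slug>.
batch_generate() {
  local prompt
  for prompt in "$@"; do
    replicate run stability-ai/sdxl prompt="$prompt" \
      --save --output-directory "./outputs/$(slugify "$prompt")"
  done
}
```

For example, `batch_generate "a red cat" "Dog, running!"` would target `./outputs/a-red-cat` and `./outputs/dog-running`.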

Monitor Long-Running Tasks

Submit without waiting, then check status:

# Submit
replicate run owner/model input=value --no-wait --json > prediction.json

# Check status later
replicate prediction show "$(jq -r '.id' prediction.json)"
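
The status check can be repeated until the prediction finishes. The sketch below assumes that the `--json` output carries a top-level `status` field with the values used by the Replicate API (`starting`, `processing`, `succeeded`, `failed`, `canceled`); the function name `wait_for_prediction` is hypothetical.

```shell
# Hypothetical helper: poll a prediction until it reaches a terminal status.
# $1 = prediction id, $2 = seconds between polls (default 5),
# $3 = maximum number of polls (default 120).
# Returns 0 on success, 1 on failure or cancellation, 2 on giving up.
wait_for_prediction() {
  local id="$1" interval="${2:-5}" max_polls="${3:-120}"
  local polls=0 status
  while [ "$polls" -lt "$max_polls" ]; do
    # Assumption: the JSON has a top-level "status" field.
    status="$(replicate prediction show "$id" --json | jq -r '.status')"
    case "$status" in
      succeeded) return 0 ;;
      failed|canceled)
        echo "prediction $id ended with status: $status" >&2
        return 1
        ;;
    esac
    polls=$((polls + 1))
    sleep "$interval"
  done
  echo "gave up waiting for prediction $id" >&2
  return 2
}
```

A typical call would be `wait_for_prediction "$(jq -r '.id' prediction.json)" 10`. The poll limit keeps the loop from running forever if the status can never be read.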

Best Practices

Always check schema first - Run replicate model schema <model> to understand required and optional inputs before running predictions.
Use streaming for text models - Add the --stream flag when running language models to see output in real time.
Save outputs explicitly - Use --save and --output-directory to organize prediction outputs.
Use JSON output for automation - Add --json flag when parsing outputs programmatically.
Open in web for debugging - Add --web flag to view predictions in the Replicate dashboard for detailed logs.
Chain predictions efficiently - Use the {{.output}} syntax to pass outputs between models without intermediate saves.

Troubleshooting

Authentication errors:

Verify REPLICATE_API_TOKEN is set correctly
Run replicate account current to test authentication

Model not found:

Check model name format: owner/model-name
Verify model exists at replicate.com

Input validation errors:

Run replicate model schema <model> to see required inputs
Check input types (string, number, file)

File upload issues:

Ensure @ prefix is used for local files
Verify file path is correct and file exists
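
Both file checks can be done before a prediction is submitted. A minimal sketch; `file_input` is a hypothetical helper name, not a CLI feature.

```shell
# Hypothetical helper: print "@<path>" for an existing, readable file,
# or fail with a message so the prediction is never submitted.
file_input() {
  local path="$1"
  if [ ! -f "$path" ] || [ ! -r "$path" ]; then
    echo "input file not found or unreadable: $path" >&2
    return 1
  fi
  printf '@%s' "$path"
}
```

For example, `img="$(file_input photo.jpg)" && replicate run nightmareai/real-esrgan image="$img"` only runs the model when the file is present.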

Additional Resources

Replicate documentation: https://replicate.com/docs
Model explorer: https://replicate.com/explore
API reference: https://replicate.com/docs/reference/http
GitHub repository: https://github.com/replicate/cli
