Fast Replicate model inference via the FGP daemon. Use when the user needs to run ML models (image generation, LLMs, audio processing, video generation) on Replicate. Triggers on "replicate run", "run model replicate", "stable diffusion", "SDXL", "replicate llama", "whisper replicate".
Install via the SkillsCat registry:
npx skillscat add fast-gateway-protocol/fgp-skills/replicate-daemon
FGP Replicate Daemon
Fast, persistent gateway to Replicate's model inference API. Run thousands of open-source ML models with minimal latency overhead.
Why FGP?
FGP daemons maintain persistent connections and avoid cold-start overhead. Instead of spawning a new API client for each request, the daemon stays warm and ready.
Benefits:
- No cold-start latency
- Connection pooling
- Persistent authentication
Installation
# Via Homebrew (recommended)
brew tap fast-gateway-protocol/fgp
brew install fgp-replicate
# Via npx
npx add-skill fgp-replicate
Quick Start
# Set your API token
export REPLICATE_API_TOKEN="r8_..."
# Start the daemon
fgp start replicate
# Run a model
fgp call replicate.run \
  --model "stability-ai/sdxl" \
  --input '{"prompt": "A photo of an astronaut riding a horse"}'
# Run Llama
fgp call replicate.run \
  --model "meta/llama-2-70b-chat" \
  --input '{"prompt": "Explain quantum computing in simple terms"}'
Methods
Predictions
- replicate.run - Run a model and wait for results
  - model (string, required): Model identifier (owner/name or version)
  - input (object, required): Model-specific input parameters
  - timeout (int, optional): Max wait time in seconds (default: 300)
- replicate.create - Create prediction without waiting
  - model (string, required): Model identifier
  - input (object, required): Model-specific input
  - webhook (string, optional): URL for completion callback
- replicate.get - Get prediction status/results
  - id (string, required): Prediction ID
- replicate.cancel - Cancel a running prediction
  - id (string, required): Prediction ID
- replicate.list - List recent predictions
  - cursor (string, optional): Pagination cursor
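The prediction methods above can be wrapped in small shell helpers. The sketch below is illustrative only: `run_model` and `cancel_prediction` are hypothetical names, and it assumes that `timeout` is passed as a `--timeout` flag in the same way the examples in this document pass `--model`, `--input`, and `--id`.

```shell
# Hypothetical helper: run a model and wait for the result.
# Assumption: replicate.run accepts --timeout like its other parameters.
run_model() {
  # $1 = model identifier, $2 = JSON input, $3 = max wait in seconds (optional)
  fgp call replicate.run --model "$1" --input "$2" --timeout "${3:-300}"
}

# Hypothetical helper: cancel a running prediction by its ID.
cancel_prediction() {
  fgp call replicate.cancel --id "$1"
}
```

For example, `run_model "stability-ai/sdxl" '{"prompt": "A sunset over mountains"}' 600` would ask the daemon to wait up to ten minutes instead of the default 300 seconds.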
Models
- replicate.models - Search/list available models
  - query (string, optional): Search query
  - owner (string, optional): Filter by owner
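Since both parameters of `replicate.models` are optional, a wrapper can add each flag only when a value is given. This is a rough sketch: `search_models` is a hypothetical name, and the `--query` and `--owner` flags are assumed to follow the same pattern as `--model` and `--input` elsewhere in this document.

```shell
# Hypothetical helper: search models, passing only the filters that are set.
# Assumption: parameters map to --query and --owner flags.
search_models() {
  # $1 = search query (optional), $2 = owner filter (optional)
  query="${1:-}"
  owner="${2:-}"
  # Rebuild the argument list so empty filters are omitted entirely.
  set -- call replicate.models
  if [ -n "$query" ]; then set -- "$@" --query "$query"; fi
  if [ -n "$owner" ]; then set -- "$@" --owner "$owner"; fi
  fgp "$@"
}
```

Calling `search_models "whisper" "openai"` would pass both filters, while `search_models "" "meta"` would filter by owner only.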
Popular Models
Image Generation
- stability-ai/sdxl - Stable Diffusion XL
- black-forest-labs/flux-schnell - FLUX.1 Schnell (fast)
- black-forest-labs/flux-dev - FLUX.1 Dev (high quality)
Language Models
- meta/llama-2-70b-chat - Llama 2 70B Chat
- meta/llama-3-70b-instruct - Llama 3 70B
- mistralai/mixtral-8x7b-instruct - Mixtral 8x7B
Audio
- openai/whisper - Speech-to-text
- suno-ai/bark - Text-to-speech
Video
- stability-ai/stable-video-diffusion - Image-to-video
Configuration
Environment variables:
- REPLICATE_API_TOKEN (required): Your Replicate API token
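Because the token is required, it can help to fail fast when it is missing rather than discover the problem on the first call. The following is a minimal sketch; `require_token` is a hypothetical helper and not part of the fgp CLI.

```shell
# Hypothetical guard: stop early if the required token is not set.
require_token() {
  if [ -z "${REPLICATE_API_TOKEN:-}" ]; then
    echo "REPLICATE_API_TOKEN is not set; export it before starting the daemon" >&2
    return 1
  fi
}

# Usage (left commented out so the sketch has no side effects):
# require_token && fgp start replicate
```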
Examples
Generate an image with SDXL
fgp call replicate.run \
  --model "stability-ai/sdxl" \
  --input '{
    "prompt": "A cyberpunk cityscape at night, neon lights, rain",
    "negative_prompt": "blurry, low quality",
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 30
  }'
Chat with Llama 3
fgp call replicate.run \
  --model "meta/llama-3-70b-instruct" \
  --input '{
    "prompt": "What are the key differences between Python and Rust?",
    "max_tokens": 500,
    "temperature": 0.7
  }'
Transcribe audio
fgp call replicate.run \
  --model "openai/whisper" \
  --input '{
    "audio": "https://example.com/audio.mp3",
    "model": "large-v3",
    "language": "en"
  }'
Async prediction with webhook
# Create prediction
fgp call replicate.create \
  --model "stability-ai/sdxl" \
  --input '{"prompt": "A sunset over mountains"}' \
  --webhook "https://your-server.com/webhook"
# Check status later
fgp call replicate.get --id "prediction_id_here"
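When no webhook endpoint is available, the "check status later" step can be repeated in a polling loop. The sketch below rests on an assumption that is not confirmed by this document: that `fgp call replicate.get` prints the prediction as JSON containing a `"status"` field with the values used by Replicate's HTTP API (`starting`, `processing`, `succeeded`, `failed`, `canceled`). The helper names `prediction_status` and `wait_for_prediction` are hypothetical.

```shell
# Hypothetical helper: read prediction JSON on stdin, print its first "status" value.
# Assumption: the daemon's output includes a "status": "..." field.
prediction_status() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 | sed 's/.*"\([^"]*\)"$/\1/'
}

# Hypothetical helper: poll replicate.get until the prediction finishes.
wait_for_prediction() {
  # $1 = prediction ID, $2 = seconds between polls (default 2), $3 = max attempts (default 150)
  id="$1"
  interval="${2:-2}"
  attempts="${3:-150}"
  while [ "$attempts" -gt 0 ]; do
    result=$(fgp call replicate.get --id "$id") || return 1
    case "$(printf '%s' "$result" | prediction_status)" in
      succeeded) printf '%s\n' "$result"; return 0 ;;
      failed|canceled) printf '%s\n' "$result" >&2; return 1 ;;
    esac
    # Any other status (e.g. starting, processing) means keep waiting.
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "timed out waiting for prediction $id" >&2
  return 1
}
```

On success the full prediction output is printed to stdout; on failure, cancellation, or timeout the function returns a non-zero status so callers can react in a script.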