"Manage Vast.ai autoscaling endpoints and worker groups for production deployments. Use when setting up auto-scaling GPU inference, managing worker pools, or deploying services."
Install
npx skillscat add liorz/vastai-claude-skill/autoscale Install via the SkillsCat registry.
SKILL.md
Vast.ai Autoscaling & Endpoints
Manage production deployments with auto-scaling worker pools.
User Request
$ARGUMENTS
Concepts
- Endpoint: A deployment target that manages load and scaling policy
- Worker Group: A pool of instances (workers) tied to an endpoint, auto-scaled based on load
Endpoints
Create
vastai create endpoint \
--endpoint_name '<NAME>' \
--target_util 0.9 \
--max_workers 20 \
--cold_workers 5 \
--cold_mult 2.5 \
--min_load 0.0| Option | Description | Default |
|---|---|---|
--endpoint_name |
Name for the endpoint | (required) |
--target_util |
Target utilization 0–1 | 0.9 |
--max_workers |
Max workers | 20 |
--cold_workers |
Min cold/standby workers | 5 |
--cold_mult |
Cold capacity multiplier | 2.5 |
--min_load |
Minimum floor load (perf units/s) | 0.0 |
--min_cold_load |
Minimum cold load | 0.0 |
Manage
vastai show endpoints
vastai update endpoint <ID> [--target_util 0.85 --max_workers 50 ...]
vastai delete endpoint <ID>
vastai get endpt-logs <ID> [--level 0-3 --tail N]Worker Groups
Create
vastai create workergroup \
--template_hash '<HASH>' \
--endpoint_name '<NAME>' \
--test_workers 3 \
--cold_workers 2 \
--target_util 0.9 \
--search_params 'gpu_name=RTX_4090 reliability>0.9'| Option | Description |
|---|---|
--template_hash |
Template for worker instances |
--template_id |
Template ID (alternative) |
--endpoint_name / --endpoint_id |
Target endpoint |
--test_workers |
Workers for perf estimation |
--cold_workers |
Min cold workers |
--target_util |
Target utilization |
--cold_mult |
Cold capacity multiplier |
--search_params |
Search query for selecting machines |
--gpu_ram |
Estimated GPU RAM requirement |
--launch_args |
Extra args for instance creation |
-n |
Disable default search params |
Manage
vastai show workergroups
vastai update workergroup <ID> [--target_util --cold_workers ...]
vastai delete workergroup <ID>
vastai get wrkgrp-logs <ID> [--level 0-3 --tail N]Typical Setup Flow
- Create a template with your Docker image and config
- Create an endpoint with scaling policy
- Create a worker group linking the template to the endpoint
- Monitor with
show endpointsandshow workergroups - Check logs with
get endpt-logsandget wrkgrp-logs