portolan-cli SKILL.md
This file helps AI agents assist users with Portolan CLI tasks.
What is Portolan?
Portolan is a CLI for publishing and managing cloud-native geospatial data catalogs. It orchestrates format conversion (GeoParquet, COG), versioning, and sync to object storage (S3, GCS, Azure)—no running servers, just static files.
Key concepts:
- STAC (SpatioTemporal Asset Catalog) — The catalog metadata spec
- GeoParquet — Cloud-optimized vector data (columnar, spatial indexing)
- COG (Cloud-Optimized GeoTIFF) — Cloud-optimized raster data (HTTP range requests)
- versions.json — Single source of truth for version history, sync state, and checksums
CLI Commands
portolan add
Track files in the catalog.
portolan add demographics/census.parquet
portolan add file1.geojson file2.geojson # Add multiple files
portolan add imagery/ # Add all files in directory
portolan add . # Add all files in catalog
portolan check
Validate a Portolan catalog or check files for cloud-native status.
portolan check # Validate all (metadata + geo-assets)
portolan check --metadata # Validate metadata only
portolan check --geo-assets # Check geo-assets only
portolan check --fix # Fix both metadata and geo-assets
portolan clean
Remove all Portolan metadata while preserving data files.
portolan clean # Remove all metadata
portolan clean --dry-run # Preview what would be removed
portolan clone
Clone a remote catalog to a local directory.
portolan clone s3://mybucket/my-catalog
portolan clone s3://mybucket/my-catalog .
portolan clone s3://mybucket/catalog -c demographics
portolan clone s3://mybucket/catalog ./local-copy
portolan config
Manage catalog configuration.
portolan config set backend iceberg
portolan config get remote
portolan config list
portolan extract
Extract data from external sources into Portolan catalogs.
portolan extract arcgis https://services.arcgis.com/.../FeatureServer ./output
portolan extract arcgis URL --layers "Census*" --dry-run
portolan extract arcgis URL --filter "sdn_*" --resume
portolan info
Show information about a file, collection, or catalog.
portolan info demographics/census.parquet # File info
portolan info demographics/ # Collection info
portolan info # Catalog info
portolan info demographics/census.parquet --json # JSON output
portolan init
Initialize a new Portolan catalog.
portolan init # Initialize in current directory
portolan init --auto # Skip prompts, use defaults
portolan init --title "My Catalog" # Set title
portolan init /path/to/data --auto # Initialize in specific directory
portolan list
List all files in the catalog with tracking status.
portolan list # List all files with status
portolan list --collection demographics # Filter by collection
portolan list --tracked-only # Show only tracked files
portolan list --untracked-only # Show only untracked files
portolan metadata
Manage catalog metadata for README generation.
portolan metadata init # Create template at catalog root
portolan metadata init demographics # Create template for collection
portolan metadata validate # Validate metadata.yaml
portolan partition
Partition a large GeoParquet file for better query performance.
portolan partition buildings.parquet --preview
portolan partition buildings.parquet output/
portolan partition buildings.parquet output/ --target-rows 50000
portolan pull
Pull updates from a remote catalog.
portolan pull s3://mybucket/my-catalog --collection demographics
portolan pull s3://mybucket/catalog -c imagery --dry-run
portolan pull s3://mybucket/catalog
portolan pull s3://mybucket/catalog --workers 4
portolan push
Push local catalog changes to cloud object storage.
portolan push s3://mybucket/catalog --collection demographics
portolan push gs://mybucket/catalog -c imagery --dry-run
portolan push s3://mybucket/catalog
portolan push --dry-run # Uses configured remote
portolan readme
Generate README.md from STAC metadata and metadata.yaml.
portolan readme # Generate for catalog and all collections
portolan readme climate # Generate under climate/
portolan readme --check # CI mode: exit 1 if any stale
portolan readme --no-recursive # Only at catalog root
portolan rm
Remove files from tracking.
portolan rm --keep imagery/old_data.tif # Safe: untrack only
portolan rm --dry-run vectors/ # Preview what would be removed
portolan rm -f demographics/census.parquet # Force delete and untrack
portolan rm -f vectors/ # Force remove entire directory
portolan scan
Scan a directory for geospatial files and potential issues.
portolan scan # Scan current directory
portolan scan --json # JSON output in current directory
portolan scan /data/geospatial
portolan scan /large/tree --max-depth=2
portolan skills
List and view AI skills for Portolan workflows.
portolan skills list # List available skills
portolan skills show sourcecoop # View Source Co-op upload skill
portolan stac-geoparquet
Generate items.parquet for efficient STAC queries.
portolan stac-geoparquet # Generate for ALL collections
portolan stac-geoparquet -c landsat # Generate for landsat collection
portolan stac-geoparquet -c imagery --dry-run # Preview without creating
portolan stac-geoparquet --json # JSON output for all collections
portolan status
Show local vs remote version state for collections.
portolan status # Status for all collections
portolan status -c demographics # Status for one collection
portolan status --offline # Skip remote check
portolan status --json # JSON output for agents
portolan sync
Sync local catalog with remote storage (pull + push).
portolan sync s3://mybucket/catalog --collection demographics
portolan sync s3://mybucket/catalog -c imagery --dry-run
portolan sync s3://mybucket/catalog -c data --fix --force
portolan sync s3://mybucket/catalog -c data --profile prod
portolan version
Version management commands.
Python API
Portolan exposes a Python API for programmatic access:
from portolan_cli import Catalog, FormatType, detect_format
# Initialize a catalog
catalog = Catalog("/path/to/data")
# Detect file format
format_type = detect_format("data.parquet") # Returns FormatType.GEOPARQUET
Public exports:
- Catalog: A Portolan catalog backed by a .portolan directory.
- CatalogExistsError: Raised when attempting to initialize a catalog that already exists.
- FormatType: Detected format type for routing to conversion library.
- cli: Portolan - Publish and manage cloud-native geospatial data catalogs.
- detect_format: Detect whether a file is vector, raster, or unknown.
Common Workflows
Publishing a New Catalog
Initialize the catalog structure:
portolan init --title "My Geospatial Data"
Scan directory for files and fix filename issues:
portolan scan /data/geospatial
# Fix filename issues (invalid chars, reserved names, long paths)
portolan scan /data/geospatial --fix
Check cloud-native compliance and convert:
portolan check --geo-assets --fix --dry-run # Preview
portolan check --geo-assets --fix # Convert
Track files in the catalog:
portolan add demographics/
portolan add imagery/
Push to cloud storage:
portolan push s3://mybucket/my-catalog --collection demographics
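For agents that script this workflow, the steps above can be sketched as an ordered list of commands run one after another. This is a minimal sketch, not part of Portolan: `PUBLISH_STEPS` and `run_steps` are hypothetical names, the paths and bucket are the placeholders used above, and it assumes `portolan` exits with a non-zero status when a step fails.

```python
import subprocess

# The publishing steps from this section, in order (preview steps omitted).
PUBLISH_STEPS = [
    ["portolan", "init", "--title", "My Geospatial Data"],
    ["portolan", "scan", "/data/geospatial", "--fix"],
    ["portolan", "check", "--geo-assets", "--fix"],
    ["portolan", "add", "demographics/"],
    ["portolan", "add", "imagery/"],
    ["portolan", "push", "s3://mybucket/my-catalog", "--collection", "demographics"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    `runner` is injectable so the sequence can be exercised without
    actually invoking the CLI.
    """
    completed = []
    for argv in steps:
        result = runner(argv)
        if result.returncode != 0:
            raise RuntimeError(f"step failed: {' '.join(argv)}")
        completed.append(argv)
    return completed
```

Calling `run_steps(PUBLISH_STEPS)` would invoke the real CLI. Stopping at the first failure keeps a broken scan or check from being followed by a push.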
Updating an Existing Catalog
Pull latest from remote:
portolan pull s3://mybucket/my-catalog --collection demographics
Make local changes (add/modify files)
Scan and check:
portolan scan .
portolan check
Push changes:
portolan push s3://mybucket/my-catalog --collection demographics
Full Sync Workflow (Recommended)
For ongoing synchronization, use sync which orchestrates the full workflow:
# Single command: pull → init → scan → check → push
portolan sync s3://mybucket/my-catalog --collection demographics
# With auto-fix for cloud-native conversion
portolan sync s3://mybucket/my-catalog -c demographics --fix
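When building a sync invocation programmatically, it can help to assemble the argument list from options rather than formatting a string. The sketch below uses only flags that appear in the `portolan sync` examples in this file. `build_sync_command` is a hypothetical helper, not part of Portolan's Python API.

```python
def build_sync_command(remote, collection=None, fix=False,
                       dry_run=False, force=False, profile=None):
    """Return an argv list for `portolan sync`, suitable for subprocess.run."""
    argv = ["portolan", "sync", remote]
    if collection:
        argv += ["--collection", collection]
    if dry_run:
        argv.append("--dry-run")
    if fix:
        argv.append("--fix")
    if force:
        argv.append("--force")
    if profile:
        argv += ["--profile", profile]
    return argv


# Example: sync one collection with auto-fix enabled.
print(build_sync_command("s3://mybucket/my-catalog",
                         collection="demographics", fix=True))
```

Passing the list directly to `subprocess.run` avoids shell quoting problems with bucket URLs or collection names.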
Troubleshooting
Common Errors
"Not inside a Portolan catalog"
Error: Not inside a Portolan catalog (no catalog.json found)
Solution: Either:
- Run portolan init to create a catalog
- Navigate into an existing catalog directory
- Use --portolan-dir to specify the catalog path
"Catalog already exists"
Error: Already a Portolan catalog at /path
Solution: The directory already has a catalog. If you want to reinitialize, remove catalog.json and .portolan/ first.
"Push conflict"
Error: Push conflict: remote has newer version
Solution: Either:
- Run portolan pull first to get remote changes
- Use --force to overwrite (careful: loses remote changes)
"Uncommitted changes"
Error: Pull blocked by uncommitted changes
Solution: Either:
- Commit or push your local changes first
- Use --force to discard local changes and pull anyway
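An agent can map these error messages to the remedies listed above with a simple substring lookup. This is a sketch only: `REMEDIES` and `suggest_remedy` are hypothetical names, and matching on message text assumes the wording shown in this section stays stable.

```python
# Error fragments taken from the messages above, mapped to the listed solutions.
REMEDIES = {
    "Not inside a Portolan catalog":
        "Run `portolan init`, cd into an existing catalog, or pass --portolan-dir.",
    "Already a Portolan catalog":
        "To reinitialize, remove catalog.json and .portolan/ first.",
    "Push conflict":
        "Run `portolan pull` first, or use --force (loses remote changes).",
    "Pull blocked by uncommitted changes":
        "Commit or push local changes first, or use --force (discards local changes).",
}


def suggest_remedy(message):
    """Return the remedy for a known error message, or None if unrecognized."""
    for fragment, remedy in REMEDIES.items():
        if fragment in message:
            return remedy
    return None


print(suggest_remedy("Error: Push conflict: remote has newer version"))
```

Both `--force` remedies are destructive, so an agent should confirm with the user before applying them.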
File Format Issues
Shapefile Missing Components
Warning: Shapefiles require .shp, .shx, and .dbf files together.
Solution: Ensure all required sidecar files are present. portolan scan will detect incomplete shapefiles.
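To check for missing sidecar files before running a scan, a few lines of standard-library Python are enough. This is an illustrative sketch, not Portolan's own detection logic. `missing_shapefile_parts` is a hypothetical helper, and it assumes the sidecars use lowercase extensions.

```python
from pathlib import Path

# The three components a shapefile needs, per the warning above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions with no file next to `shp_path`."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]
```

An empty list means all three components are present. A result such as `[".shx"]` names the file to restore before converting.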
Non-Cloud-Native Files
Warning: Files like GeoJSON or Shapefiles aren't cloud-optimized.
Solution: Use portolan check --fix to convert:
- Vectors → GeoParquet
- Rasters → COG (Cloud-Optimized GeoTIFF)
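The routing above (vector data to GeoParquet, raster data to COG) can be sketched as an extension lookup. The extension sets here are assumptions for illustration, and `conversion_target` is a hypothetical helper. Portolan's own format detection may inspect file contents, and a .tif file may already be a valid COG.

```python
from pathlib import Path

# Assumed extension lists for illustration only; not Portolan's actual rules.
VECTOR_EXTENSIONS = {".geojson", ".shp", ".gpkg"}
RASTER_EXTENSIONS = {".tif", ".tiff"}


def conversion_target(path):
    """Return the cloud-native target format for a file, or None if unknown."""
    suffix = Path(path).suffix.lower()
    if suffix in VECTOR_EXTENSIONS:
        return "GeoParquet"
    if suffix in RASTER_EXTENSIONS:
        return "COG"
    return None
```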
Getting JSON Output
All commands support --json or --format json for machine-readable output:
portolan scan . --json
portolan check --format json
portolan --format json init --auto
JSON output follows a consistent envelope format:
{
  "success": true,
  "command": "scan",
  "data": { ... },
  "errors": []
}
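A caller consuming this envelope might validate it and unwrap the payload as follows. This is a minimal sketch that assumes only the four keys shown above. `parse_envelope` and `PortolanCommandError` are hypothetical names, and the `"files"` key in the sample payload is a placeholder, not a documented field.

```python
import json


class PortolanCommandError(RuntimeError):
    """Hypothetical error raised when the envelope reports success: false."""


def parse_envelope(raw):
    """Parse a JSON envelope string and return its "data" payload."""
    envelope = json.loads(raw)
    if not isinstance(envelope, dict):
        raise ValueError("envelope must be a JSON object")
    missing = {"success", "command", "data", "errors"} - envelope.keys()
    if missing:
        raise ValueError(f"envelope is missing keys: {sorted(missing)}")
    if not envelope["success"]:
        raise PortolanCommandError(
            f"{envelope['command']} failed: {envelope['errors']}"
        )
    return envelope["data"]


# Placeholder payload; the real shape of "data" depends on the command.
sample = '{"success": true, "command": "scan", "data": {"files": []}, "errors": []}'
print(parse_envelope(sample))
```

Checking `success` before touching `data` means a failed command raises an error instead of passing along an empty payload.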