devcsde

oatda-generate-image

Use when the user wants to generate images using AI models through OATDA's unified API. Supports DALL-E 3, GPT-Image-1, Google Imagen, MiniMax, Qwen, and xAI image models.

devcsde 0 Updated 3mo ago

Install

npx skillscat add devcsde/oatda-skills/oatda-generate-image

Install via the SkillsCat registry.

SKILL.md

OATDA Image Generation

Generate images from text descriptions using AI models through OATDA's unified API.

When to Use

Use this skill when the user wants to:

  • Generate images from text descriptions via OATDA
  • Create AI artwork, illustrations, or designs
  • Generate product mockups or concept art
  • Use DALL-E, Imagen, or other image models through a single API

Prerequisites

The user needs an OATDA API key. Check in this order:

  1. $OATDA_API_KEY environment variable
  2. ~/.oatda/credentials.json config file

If neither exists, tell the user:

You need an OATDA API key. Get one at https://oatda.com, then set it:
export OATDA_API_KEY=your_key_here

Step-by-Step Instructions

1. Resolve the API key

# Check env var first; if empty, auto-load from credentials file
if [[ -z "$OATDA_API_KEY" ]]; then
  # "// empty" makes jq print nothing (instead of the string "null") when the field is missing
  export OATDA_API_KEY="$(jq -r '.profiles[.defaultProfile].apiKey // empty' ~/.oatda/credentials.json 2>/dev/null)"
fi

# Verify key exists (show first 8 chars only)
echo "${OATDA_API_KEY:0:8}"

If the output is empty or null, stop and ask the user to configure their API key.

IMPORTANT:

  • Never print the full API key. Only show the first 8 characters for verification.
  • The key resolution script and subsequent curl commands must run in the same shell session. Each separate bash/terminal invocation starts with an isolated environment where previously exported variables are lost. Either run all commands in one session, or chain them (e.g., export OATDA_API_KEY=... && curl ...).
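The two rules above (never print the full key, stop when no key is available) can be wrapped in small helpers. This is a minimal sketch: the function names are hypothetical and not part of OATDA, and it assumes a missing key shows up either as an empty value or as the literal string null.

```shell
# Hypothetical helpers; the names are illustrative, not part of OATDA.

# Print only the first 8 characters of the key, never the full value.
oatda_key_preview() {
  printf '%.8s\n' "${OATDA_API_KEY:-}"
}

# Succeed only if a usable key is set. An empty value or the literal
# string "null" (what jq -r prints for a missing field) counts as missing.
oatda_require_key() {
  case "${OATDA_API_KEY:-}" in
    ''|null)
      echo "You need an OATDA API key. Get one at https://oatda.com" >&2
      return 1
      ;;
  esac
  return 0
}
```

Chaining the check with the request, for example oatda_require_key && curl ..., keeps both in the same shell session.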

2. Determine the model

Map common aliases:

User says                   | Provider  | Model
dall-e, dall-e-3 (default)  | openai    | dall-e-3
gpt-image                   | openai    | gpt-image-1
imagen                      | google    | imagen-4.0-generate-001
seedream                    | bytedance | seedream-4-5-251128
wan                         | alibaba   | wan2.6-t2i
minimax image               | minimax   | image-01
grok image                  | xai       | grok-imagine-image

Default: openai / dall-e-3 if no model specified.
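The alias mapping can be expressed as a small lookup function. This is a sketch only: resolve_model is a hypothetical name, the case-insensitive matching is an added convenience, and returning a non-zero status for an unknown alias is a design choice rather than documented behavior.

```shell
# Hypothetical helper: map a user-facing alias to "provider model".
# An empty argument falls back to the default (openai / dall-e-3).
resolve_model() {
  case "$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')" in
    ''|dall-e|dall-e-3) echo "openai dall-e-3" ;;
    gpt-image)          echo "openai gpt-image-1" ;;
    imagen)             echo "google imagen-4.0-generate-001" ;;
    seedream)           echo "bytedance seedream-4-5-251128" ;;
    wan)                echo "alibaba wan2.6-t2i" ;;
    "minimax image")    echo "minimax image-01" ;;
    "grok image")       echo "xai grok-imagine-image" ;;
    *)                  return 1 ;;   # unknown alias: let the caller decide
  esac
}

# Usage: split the result into the two request fields.
# set -- $(resolve_model "imagen"); provider=$1; model=$2
```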

3. Discover model-specific parameters

IMPORTANT: Different models support different parameters (sizes, quality levels, styles, masks, watermarks, negative prompts, etc.). Before generating, discover what parameters a model supports:

curl -s -X GET "https://oatda.com/api/v1/llm/models?type=image" \
  -H "Authorization: Bearer $OATDA_API_KEY" | jq '.image_models[] | {id, supported_params}'

This returns each image model's supported_params with:

  • type: Parameter type (string, number, boolean, file)
  • values: Allowed values for enums
  • default: Default value
  • description: What the parameter does
  • optional: Whether the parameter may be omitted
  • accept: For file types, what's accepted (e.g., "image/*")

File-type parameters: Parameters like mask or reference images require publicly accessible URLs (https://...), not local file paths.

Pass model-specific parameters via the model_params object (see examples below).
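To narrow the discovery output to a single model and to catch local file paths before they reach the API, two helpers along these lines may be useful. Both names are hypothetical. The sketch assumes the id field of each entry matches the model name used in requests, and the URL check only inspects the scheme; it cannot confirm that a URL is actually reachable from the public internet.

```shell
# Hypothetical helper: read the models response on stdin and print the
# supported_params of one model. Assumes .id equals the model name.
params_for_model() {
  jq --arg id "$1" '.image_models[] | select(.id == $id) | .supported_params'
}

# Hypothetical helper: accept only https:// URLs for file-type parameters
# such as mask or reference images; local paths are rejected.
is_public_url() {
  case "${1:-}" in
    https://?*) return 0 ;;
    *)          return 1 ;;
  esac
}

# Usage (performs a network call, so it is left commented out):
# curl -s "https://oatda.com/api/v1/llm/models?type=image" \
#   -H "Authorization: Bearer $OATDA_API_KEY" | params_for_model "dall-e-3"
```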

4. Make the API call

curl -s -X POST "https://oatda.com/api/v1/llm/generate-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "<PROVIDER>",
    "model": "<MODEL>",
    "prompt": "<IMAGE_DESCRIPTION>",
    "size": "1024x1024",
    "quality": "standard",
    "n": 1,
    "numberOfImages": 1,
    "aspectRatio": "1:1",
    "style": "vivid",
    "personGeneration": "allow_adult"
  }'

Replace <PROVIDER>, <MODEL>, and <IMAGE_DESCRIPTION> with actual values.

CRITICAL: The endpoint is /api/v1/llm/generate-image (NOT /api/v1/llm/image — that's vision analysis).

Parameters:

  • prompt: Image description (1-4000 characters)
  • size: Dimensions — "1024x1024", "1792x1024", "1024x1792", or "2K", "4K" (model-dependent)
  • quality: "standard", "hd", "auto", "low", "medium", "high"
  • n and numberOfImages: Number of images (1-10), set both to the same value
  • aspectRatio: "1:1", "16:9", "9:16", "3:2", "2:3", "4:3", "3:4", etc.
  • style: "vivid" (dramatic, hyper-real) or "natural" (realistic)
  • background: "auto", "transparent", or "opaque"
  • outputFormat: "png", "jpeg", or "webp"
  • model_params: Model-specific parameters as key-value pairs. Query GET /api/v1/llm/models?type=image (see step 3) to discover the supported params per model. Examples:
    • DALL-E 3: { "style": "vivid", "quality": "hd" }
    • GPT-Image-1: { "quality": "high", "background": "transparent", "outputFormat": "png" }
    • Imagen 4: { "sampleImageSize": "2K", "personGeneration": "allow_adult" }
    • Seedream: { "size": "4K", "watermark": false }
    • Wan 2.6: { "seed": "42", "negative_prompt": "blurry", "prompt_extend": true }

5. Parse the response

{
  "success": true,
  "url": "https://cdn.example.com/generated-image.png",
  "all_images": [
    {"url": "https://cdn.example.com/image-1.png"},
    {"url": "https://cdn.example.com/image-2.png"}
  ],
  "revised_prompt": "A detailed cyberpunk cityscape at night with neon lights..."
}

  • Show the image URL(s) to the user: use the all_images array when it is present, otherwise the single url field
  • If revised_prompt is present, tell the user how the model rewrote the prompt
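The URL selection rule can be sketched as a jq filter. It assumes jq is installed and that the response has the shape shown in the sample; extract_image_urls is a hypothetical name.

```shell
# Hypothetical helper: read a generate-image response on stdin and print one
# URL per line. Prefers all_images; falls back to the single url field.
extract_image_urls() {
  jq -r '[.all_images[]?.url] as $urls
         | if ($urls | length) > 0 then $urls[] else (.url // empty) end'
}

# The revised prompt, if any, can be read the same way:
# jq -r '.revised_prompt // empty'
```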

6. Handle errors

HTTP Status             | Meaning                       | Action
401                     | Invalid API key               | Tell user to check their key
400                     | Bad request / prompt too long | Keep prompt within 4000 characters
429                     | Rate limited                  | Wait 5 seconds and retry once
400 with content_policy | Content policy violation      | Ask user to adjust the description
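The error table can be turned into a status-to-action lookup plus a wrapper that retries once on 429. This is a sketch under assumptions: the function names and action labels are hypothetical, and it assumes a content policy rejection can be recognized by the string content_policy appearing somewhere in the response body.

```shell
# Hypothetical helper: map an HTTP status (and optional response body)
# to one of the actions from the error table.
action_for_status() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "check-key" ;;
    429) echo "retry-once" ;;
    400)
      case "${2:-}" in
        *content_policy*) echo "adjust-description" ;;
        *)                echo "fix-request" ;;
      esac
      ;;
    *)   echo "report-error" ;;
  esac
}

# Hypothetical wrapper: POST a JSON body (first argument), retrying once
# after 5 seconds if the first attempt is rate limited.
generate_image() {
  local body="$1" out status attempt
  out=$(mktemp)
  for attempt in 1 2; do
    # -o writes the response body to a file; -w prints only the status code.
    status=$(curl -s -o "$out" -w '%{http_code}' -X POST \
      "https://oatda.com/api/v1/llm/generate-image" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OATDA_API_KEY" \
      -d "$body")
    [ "$(action_for_status "$status" "$(cat "$out")")" = "retry-once" ] || break
    [ "$attempt" -eq 1 ] && sleep 5
  done
  cat "$out"; rm -f "$out"
  # Succeed only on a 2xx status.
  [ "$(action_for_status "$status")" = "ok" ]
}
```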

Full Examples

DALL-E 3

User asks: "Generate an image of a cyberpunk city at night using DALL-E 3"

curl -s -X POST "https://oatda.com/api/v1/llm/generate-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "openai",
    "model": "dall-e-3",
    "prompt": "A cyberpunk city at night with neon lights reflecting on wet streets",
    "size": "1024x1024",
    "quality": "hd",
    "n": 1,
    "numberOfImages": 1,
    "model_params": {
      "style": "vivid"
    }
  }'

GPT-Image-1 with transparent background

User asks: "Generate a transparent PNG logo"

curl -s -X POST "https://oatda.com/api/v1/llm/generate-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "openai",
    "model": "gpt-image-1",
    "prompt": "A sleek minimalist logo of a mountain, clean vector style",
    "n": 1,
    "numberOfImages": 1,
    "model_params": {
      "quality": "high",
      "background": "transparent",
      "outputFormat": "png"
    }
  }'

Bytedance Seedream (no watermark)

User asks: "Generate a 4K image without watermark using Seedream"

curl -s -X POST "https://oatda.com/api/v1/llm/generate-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "bytedance",
    "model": "seedream-4-5-251128",
    "prompt": "A majestic dragon flying over a fantasy kingdom at sunset",
    "model_params": {
      "size": "4K",
      "watermark": false
    }
  }'

Alibaba Wan 2.6 with negative prompt

User asks: "Generate an image avoiding blurry elements"

curl -s -X POST "https://oatda.com/api/v1/llm/generate-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OATDA_API_KEY" \
  -d '{
    "provider": "alibaba",
    "model": "wan2.6-t2i",
    "prompt": "A sharp detailed portrait of a warrior in ornate armor",
    "model_params": {
      "negative_prompt": "blurry, low quality, distorted",
      "seed": "42"
    }
  }'

Tips

  • The endpoint is /api/v1/llm/generate-image — do NOT confuse with /api/v1/llm/image (that's vision)
  • DALL-E 3 costs ~$0.04/image (standard), ~$0.08/image (HD)
  • Set both n and numberOfImages to the same value for compatibility
  • Query /api/v1/llm/models?type=image to discover model-specific parameters before generating
  • Use model_params for model-specific options (watermark, negative_prompt, seed, etc.)
  • Image URLs may be temporary — recommend downloading promptly
  • Maximum prompt length is 4000 characters
  • NEVER expose the full API key in output
  • Related skills: /oatda:oatda-vision-analysis for analyzing images, /oatda:oatda-list-models for available image models
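Because image URLs may be temporary, saving a local copy right after generation is a reasonable default. The helpers below are a sketch with hypothetical names; they assume the last path segment of the URL is a usable file name.

```shell
# Hypothetical helper: derive a local file name from an image URL,
# ignoring any query string (for example signed-URL parameters).
image_filename() {
  local url="${1%%\?*}"
  basename "$url"
}

# Hypothetical helper: download one URL into a directory
# (second argument, default: current directory).
download_image() {
  curl -fsSL -o "${2:-.}/$(image_filename "$1")" "$1"
}

# Usage (performs a network call, so it is left commented out):
# download_image "https://cdn.example.com/generated-image.png" ./images
```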