Generate AI images using the Gemini image generation API. Use this skill when content needs images - thumbnails, social posts, blog headers, or creative visuals. Follows an iterative workflow - brainstorm concepts, select direction, generate in multiple styles, then produce via API.
Install
npx skillscat add cdeistopened/skill-stack/image-prompt-generator
Install via the SkillsCat registry.
Image Prompt Generator
Generate professional, non-generic images using Google's Gemini API for image generation.
Prerequisites & Setup
Getting Your Gemini API Key
- Go to Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the generated key
Configuring the API Key
Option 1: Environment file (recommended)
Create a .env file in your project root:
GEMINI_API_KEY=your_api_key_here
Option 2: Direct environment variable
export GEMINI_API_KEY=your_api_key_here
Install Dependencies
pip install google-generativeai python-dotenv pillow
Available Models
| Model | API Name | Best For |
|---|---|---|
| Flash | gemini-2.5-flash-image | Speed, drafts, iteration |
| Pro | gemini-3-pro-image-preview | Final assets, 16:9 aspect ratio, quality |
CRITICAL: Use gemini-3-pro-image-preview for:
- Thumbnails (need 16:9 aspect ratio)
- Final production images
- Any image where aspect_ratio config is needed
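The selection rule above can be sketched as a small helper. The model names are copied from the table; the function `pick_model` and its parameters are hypothetical and are not part of `scripts/generate_image.py`.

```python
from typing import Optional

# API names copied from the Available Models table.
MODELS = {
    "flash": "gemini-2.5-flash-image",
    "pro": "gemini-3-pro-image-preview",
}


def pick_model(final_asset: bool = False, aspect_ratio: Optional[str] = None) -> str:
    """Return the API model name for a job (illustrative helper).

    Pro is chosen for final production images and whenever an aspect
    ratio has to be set through config; everything else uses Flash.
    """
    if final_asset or aspect_ratio is not None:
        return MODELS["pro"]
    return MODELS["flash"]
```

For example, `pick_model(aspect_ratio="16:9")` selects the Pro model for a thumbnail, while `pick_model()` returns the Flash model for quick drafts.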
Workflow Overview
- Brainstorm Concepts - Generate 4-6 high-level visual ideas
- Select Direction - User picks the concept they like
- Optimize Prompt - Refine into a strong, detailed prompt
- Style Variations - Adapt to 2-3 different visual styles
- Generate Images - Run via Gemini API
Step 1: Brainstorm Concepts
When the user provides a topic or use case, generate 4-6 high-level visual concepts. Each concept should be:
- One sentence describing the visual idea
- Concrete and immediate - you can picture it instantly
- Conceptual but not abstract - a clear object/scene with meaning
- Non-generic - avoid cliches (no lightbulbs for ideas, no handshakes for partnership)
Format:
1. **[Short label]** - One sentence description of the visual concept and why it works.
2. **[Short label]** - One sentence description...
Example for "newsletter about personal productivity":
1. **Compass with coffee stain** - A vintage compass where the needle points toward a coffee ring stain on a map, suggesting direction emerges from daily rituals.
2. **Clock face with seasons** - A clock where the 12 hours show seasonal changes, suggesting time management over long arcs, not just hours.
3. **Empty desk with shadow** - A minimalist desk in morning light, but the shadow shows a cluttered desk - the gap between intention and reality.
4. **Single key on many keychains** - One small key attached to dozens of decorative keychains, suggesting we overcomplicate simple solutions.
Wait for user to select before proceeding.
Step 2: Optimize the Prompt
Once the user selects a concept, develop it into a full prompt. Structure:
Create a [style type] illustration of [subject].
CONCEPT: [Expand the one-sentence idea into a clear visual description]
STYLE: [Artistic approach - load from references/styles/ if brand-specific]
COMPOSITION: [Framing, focal point, negative space, balance]
COLORS: [Palette - describe by name, not hex codes which may render as text]
TEXTURE: [Surface qualities, analog/digital feel]
AVOID: [What should NOT appear - be specific]
FORMAT: [Aspect ratio]
Key principles:
- Natural language, full sentences - no tag soup
- Describe colors by name (burnt orange, sky blue, near-black) not hex codes
- Maximum 2-3 elements - if it feels busy, remove something
- Favor metaphor over literal depiction
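One way to keep prompts consistent with the structure above is to assemble them from named fields. This is a minimal sketch: `build_prompt` and the hex-code check are illustrative assumptions, not something the skill ships.

```python
import re

# Matches hex color codes such as #FF6600 or #f60, which the key
# principles warn may be rendered as literal text in the image.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def build_prompt(style_type, subject, concept, style, composition,
                 colors, texture, avoid, aspect_ratio):
    """Assemble the template sections into one prompt string (hypothetical helper)."""
    if HEX_COLOR.search(colors):
        raise ValueError("Describe colors by name, not hex codes")
    return "\n".join([
        f"Create a {style_type} illustration of {subject}.",
        f"CONCEPT: {concept}",
        f"STYLE: {style}",
        f"COMPOSITION: {composition}",
        f"COLORS: {colors}",
        f"TEXTURE: {texture}",
        f"AVOID: {avoid}",
        f"FORMAT: {aspect_ratio}",
    ])
```

Each field should still be written as full sentences; the helper only fixes the order of the sections and rejects hex codes in the palette.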
Step 3: Style Variations
Default style: Risograph - Use references/styles/risograph.md unless the content calls for something different.
Available styles in references/styles/:
- risograph.md - DEFAULT. Halftone dots, misregistration, indie printmaking aesthetic. Warm, tactile, analog.
- minimalist-ink.md - High-contrast black and white, crosshatching. For craft/mastery posts.
- watercolor-line.md - Ink linework with watercolor washes, warm. For organic topics.
- editorial-conceptual.md - Conceptual, sophisticated, editorial wit. For abstract/philosophical posts.
Present style options to user, recommending risograph as default.
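A style reference could be resolved like this, assuming each style is a Markdown file named after the style inside `references/styles/`. The function `load_style` is hypothetical; the risograph default mirrors the text above.

```python
from pathlib import Path

DEFAULT_STYLE = "risograph"  # default named in Step 3


def load_style(name=None, styles_dir="references/styles"):
    """Return the text of a style reference file, defaulting to risograph."""
    path = Path(styles_dir) / f"{name or DEFAULT_STYLE}.md"
    if not path.is_file():
        # List what is actually on disk so the caller can pick a valid style.
        available = sorted(p.stem for p in Path(styles_dir).glob("*.md"))
        raise FileNotFoundError(
            f"Unknown style {path.stem!r}; available: {available}"
        )
    return path.read_text(encoding="utf-8")
```

The returned text can then be used for the STYLE section of the prompt, for example `load_style("watercolor-line")` for an organic topic.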
Step 4: Generate via API
Running the Script
# Load key from .env and generate
export $(grep GEMINI_API_KEY .env) && \
python scripts/generate_image.py "prompt here" --model pro --aspect 16:9
# Save to specific folder
python scripts/generate_image.py "prompt" --output "./images" --name "my_image"
Options:
- --model flash (faster, cheaper) or --model pro (higher quality)
- --aspect 16:9, 1:1, or 9:16 (PRO MODEL ONLY - for flash, you MUST include ratio in prompt text)
- --variations N - generate N versions
- --output ./path - save location
- --name prefix - filename prefix
Output location: Save images alongside the content they belong to - not a generic images dump.
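When the script is driven from Python rather than the shell, the flag rules above can be encoded in a small command builder. This is a sketch under assumptions: `build_command` is a hypothetical wrapper, and the sentence it appends to the prompt for the flash model is only one possible wording.

```python
import sys

ASPECTS = ("16:9", "1:1", "9:16")  # ratios listed in the options above


def build_command(prompt, model="flash", aspect=None, variations=1,
                  output=None, name=None):
    """Build the argument list for scripts/generate_image.py (illustrative)."""
    if model not in ("flash", "pro"):
        raise ValueError("model must be 'flash' or 'pro'")
    if aspect is not None and aspect not in ASPECTS:
        raise ValueError(f"aspect must be one of {ASPECTS}")
    if aspect and model == "flash":
        # --aspect is pro-only, so state the ratio in the prompt text instead.
        prompt = f"{prompt} Aspect ratio: {aspect}."
    cmd = [sys.executable, "scripts/generate_image.py", prompt, "--model", model]
    if aspect and model == "pro":
        cmd += ["--aspect", aspect]
    if variations > 1:
        cmd += ["--variations", str(variations)]
    if output:
        cmd += ["--output", output]
    if name:
        cmd += ["--name", name]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `GEMINI_API_KEY` is set in the environment. Passing a list rather than a shell string avoids quoting problems with long prompts.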
Step 5: Iterate
After user reviews generated images:
- 80% good? Request specific edits conversationally rather than regenerating
- Composition off? Adjust framing or element placement in prompt
- Wrong style? Try a different style reference
- Too busy? Simplify to fewer elements
- Colors wrong? Be more explicit about palette
Prompting Principles
Write Like a Creative Director
Brief the model like a human artist. Use proper grammar, full sentences, and descriptive adjectives.
| Don't | Do |
|---|---|
| "Cool car, neon, city, night, 8k" | "A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis." |
Be specific about:
- Subject: Instead of "a woman," say "a sophisticated elderly woman wearing a vintage Chanel-style suit"
- Materiality: Describe textures - "matte finish," "brushed steel," "soft velvet," "crumpled paper"
- Setting: Define location, time of day, weather
- Lighting: Specify mood and light source
- Mood: Emotional tone of the image
Provide Context
Context helps the model make logical artistic decisions. Include the "why" or "for whom."
Example: "Create an image of a sandwich for a Brazilian high-end gourmet cookbook."
(Model infers: professional plating, shallow depth of field, perfect lighting)
Keep It Simple
- One clear focal point
- Maximum 2-3 elements total
- Generous negative space
- If it feels busy, remove something
Avoid the Generic
- No lightbulbs for "ideas"
- No handshakes for "partnership"
- No happy stock photo poses
- No glossy AI aesthetic
Resources
references/styles/
Aesthetic style definitions:
- risograph.md - DEFAULT - Halftone, misregistration, indie printmaking
- minimalist-ink.md - Black and white ink illustration
- watercolor-line.md - Ink with watercolor washes
- editorial-conceptual.md - Conceptual editorial style
scripts/
- generate_image.py - Gemini API image generation
Prompt Modifiers Reference
| Category | Examples |
|---|---|
| Lighting | golden hour, dramatic shadows, soft diffused light, neon glow, overcast |
| Style | cinematic, editorial, technical diagram, hand-drawn, photorealistic |
| Texture | matte finish, brushed steel, soft velvet, crumpled paper, weathered wood |
| Composition | wide shot, close-up, bird's eye view, dutch angle, symmetrical |
| Mood | energetic, serene, dramatic, playful, sophisticated |
| Quality | 4K, high-fidelity, pixel-perfect, professional grade |
Advanced Capabilities
Text Rendering & Infographics
Put exact text in quotes. Specify style: "polished editorial," "technical diagram," or "hand-drawn whiteboard."
Example prompts:
Earnings Report Infographic:
"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."
Whiteboard Summary:
"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."
Character Consistency & Thumbnails
Use reference images and state "Keep the person's facial features exactly the same as Image 1." Describe expression/action changes while maintaining identity.
Example prompt:
Viral Thumbnail:
"Design a viral video thumbnail using the person from Image 1.
Face Consistency: Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised.
Action: Pose the person on the left side, pointing their finger towards the right side of the frame.
Subject: On the right side, place a high-quality image of a delicious avocado toast.
Graphics: Add a bold yellow arrow connecting the person's finger to the toast.
Text: Overlay massive, pop-style text in the middle: 'Done in 3 mins!'. Use a thick white outline and drop shadow.
Background: A blurred, bright kitchen background. High saturation and contrast."
Image Reworking (Edit Existing Images)
The --input flag enables "rework mode" - pass an existing image to Gemini and describe the changes you want.
Key use cases:
- Small tweaks - Adjust colors, add/remove elements, change lighting
- Style transfer - Keep composition but change artistic style
- Object manipulation - Remove, add, or modify specific objects
- Seasonal/temporal changes - Same scene, different time/season
Running in rework mode:
# Basic edit - add something
python scripts/generate_image.py "Add snow to the roof and yard" \
--input ./house.png \
--model pro
# Color adjustment
python scripts/generate_image.py "Change the accent color from red to teal, keep everything else identical" \
--input ./thumbnail.png \
--model pro
# Style transfer - keep composition, change aesthetic
python scripts/generate_image.py "Convert this to risograph style with halftone dots and slight color misregistration" \
--input ./photo.png \
--model pro
# Generate variations of an edit
python scripts/generate_image.py "Make the lighting warmer, like golden hour" \
--input ./portrait.png \
--variations 3 \
--model pro
Prompting tips for rework mode:
Be specific about what to preserve:
- "Keep the person's facial features exactly the same"
- "Maintain the composition and framing"
- "Don't change the background"
Be explicit about what to change:
- "Change ONLY the color of the shirt from blue to red"
- "Add snow to the roof and nothing else"
- "Remove the text overlay"
Use comparative language:
- "Make the colors more vibrant"
- "Increase the contrast slightly"
- "Make the lighting softer and more diffused"
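The tips above amount to pairing one explicit change with a list of things to preserve. A minimal sketch of that pattern follows; `rework_prompt` is a hypothetical helper and the "Keep ... exactly the same" phrasing is borrowed from the examples in this section.

```python
def rework_prompt(change, preserve=()):
    """Combine one explicit change with the things that must stay untouched."""
    # Normalize the change to end in exactly one period.
    parts = [change.rstrip(".") + "."]
    for item in preserve:
        parts.append(f"Keep {item} exactly the same.")
    return " ".join(parts)
```

The result is meant to be passed as the prompt argument together with `--input`, so that the model is told both what to change and what to leave alone.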
Output naming: Files from rework mode are named {prefix}_{timestamp}_edit_{model}.png to distinguish from generated images (_gen_).
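The naming scheme can be illustrated with a short helper. Two points here are assumptions rather than facts from the script: the timestamp layout (`YYYYMMDD_HHMMSS`) and that generated images follow the same pattern with `gen` in place of `edit`. The function `output_filename` is hypothetical.

```python
from datetime import datetime


def output_filename(prefix, model, rework=False, now=None):
    """Build a name shaped like {prefix}_{timestamp}_edit_{model}.png."""
    # ASSUMPTION: the real script's timestamp format is not documented here;
    # YYYYMMDD_HHMMSS is used purely for illustration.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    mode = "edit" if rework else "gen"
    return f"{prefix}_{stamp}_{mode}_{model}.png"
```

Whatever the exact timestamp format turns out to be, checking a filename for `_edit_` or `_gen_` is enough to tell reworked images from freshly generated ones.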
Advanced Editing Examples
Object Removal:
python scripts/generate_image.py \
"Remove the tourists from the background and fill with matching cobblestones and storefronts" \
--input ./street-photo.png \
--model pro
Seasonal Control:
python scripts/generate_image.py \
"Turn this into winter. Add snow to the roof and yard. Change lighting to cold, overcast afternoon. Keep architecture identical." \
--input ./house-summer.png \
--model pro
Character Consistency (thumbnail series):
python scripts/generate_image.py \
"Keep the person's face exactly the same. Change expression to surprised. Add a pointing gesture toward the right side of the frame." \
--input ./person-reference.png \
--model pro
Related Skills
- youtube-title-creator - Pair generated images with optimized titles
- social-content-creation - Use images in platform-optimized posts
For custom brand styles, create new style files in references/styles/ following the existing format.