image-prompt-generator

Generate AI images using the Gemini image generation API. Use this skill when content needs images: thumbnails, social posts, blog headers, or creative visuals. It follows an iterative workflow: brainstorm concepts, select a direction, adapt the prompt to multiple styles, then generate via the API.

cdeistopened 21 6 Updated 5mo ago

Install

npx skillscat add cdeistopened/skill-stack/image-prompt-generator

Install via the SkillsCat registry.

SKILL.md

Image Prompt Generator

Generate professional, non-generic images using Google's Gemini API for image generation.

Prerequisites & Setup

Getting Your Gemini API Key

1. Go to Google AI Studio
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy the generated key

Configuring the API Key

Option 1: Environment file (recommended)

Create a .env file in your project root:

GEMINI_API_KEY=your_api_key_here

Option 2: Direct environment variable

export GEMINI_API_KEY=your_api_key_here
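The two options above can be combined in a small lookup helper. This is only a sketch using the standard library: the function name `load_gemini_api_key` is hypothetical, and the skill's own script may read the key differently (for example through python-dotenv). It assumes an exported variable takes precedence over the .env file.

```python
import os
from pathlib import Path


def load_gemini_api_key(env_file: str = ".env") -> str:
    """Return GEMINI_API_KEY from the environment, falling back to a .env file."""
    # Option 2: a variable exported in the shell is used first (assumed precedence).
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if key:
        return key

    # Option 1: look for a GEMINI_API_KEY=value line in the .env file.
    path = Path(env_file)
    if path.is_file():
        for raw_line in path.read_text(encoding="utf-8").splitlines():
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GEMINI_API_KEY":
                value = value.strip().strip("'\"")
                if value:
                    return value

    raise RuntimeError(f"GEMINI_API_KEY is not set in the environment or in {env_file}")
```

Failing early with a clear error is a deliberate choice here: a missing key is easier to diagnose before any API call is attempted.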

Install Dependencies

pip install google-generativeai python-dotenv pillow

Available Models

Model	API Name	Best For
Flash	`gemini-2.5-flash-image`	Speed, drafts, iteration
Pro	`gemini-3-pro-image-preview`	Final assets, 16:9 aspect ratio, quality

CRITICAL: Use gemini-3-pro-image-preview for:

Thumbnails (need 16:9 aspect ratio)
Final production images
Any image where aspect_ratio config is needed
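The selection rule above can be written down as a tiny helper. This is a sketch only: the model names come from the table, while `choose_model` and its parameters are hypothetical and not part of the skill's script.

```python
from typing import Optional

# Short names and API names as listed in the table above.
MODEL_API_NAMES = {
    "flash": "gemini-2.5-flash-image",
    "pro": "gemini-3-pro-image-preview",
}


def choose_model(final_asset: bool = False, aspect_ratio: Optional[str] = None) -> str:
    """Pick the API model name: Pro for final assets or any explicit aspect ratio."""
    if final_asset or aspect_ratio is not None:
        return MODEL_API_NAMES["pro"]
    # Drafts and quick iteration default to Flash.
    return MODEL_API_NAMES["flash"]
```

For example, `choose_model(aspect_ratio="16:9")` selects the Pro model, which matches the guidance for thumbnails.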

Workflow Overview

Brainstorm Concepts - Generate 4-6 high-level visual ideas
Select Direction - User picks the concept they like
Optimize Prompt - Refine into a strong, detailed prompt
Style Variations - Adapt to 2-3 different visual styles
Generate Images - Run via Gemini API

Step 1: Brainstorm Concepts

When the user provides a topic or use case, generate 4-6 high-level visual concepts. Each concept should be:

One sentence describing the visual idea
Concrete and immediate - you can picture it instantly
Conceptual but not abstract - a clear object/scene with meaning
Non-generic - avoid cliches (no lightbulbs for ideas, no handshakes for partnership)

Format:

1. **[Short label]** - One sentence description of the visual concept and why it works.

2. **[Short label]** - One sentence description...

Example for "newsletter about personal productivity":

1. **Compass with coffee stain** - A vintage compass where the needle points toward a coffee ring stain on a map, suggesting direction emerges from daily rituals.

2. **Clock face with seasons** - A clock where the 12 hours show seasonal changes, suggesting time management over long arcs, not just hours.

3. **Empty desk with shadow** - A minimalist desk in morning light, but the shadow shows a cluttered desk - the gap between intention and reality.

4. **Single key on many keychains** - One small key attached to dozens of decorative keychains, suggesting we overcomplicate simple solutions.

Wait for user to select before proceeding.
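If the concepts are assembled programmatically, the numbered format above could be rendered with something like the following. The helper name `format_concepts` is hypothetical; the 4-6 range check simply mirrors the guidance in this step.

```python
from typing import List, Tuple


def format_concepts(concepts: List[Tuple[str, str]]) -> str:
    """Render (label, description) pairs in the numbered format shown above."""
    if not 4 <= len(concepts) <= 6:
        raise ValueError(f"Step 1 asks for 4-6 concepts, got {len(concepts)}")
    lines = [
        f"{index}. **{label}** - {description}"
        for index, (label, description) in enumerate(concepts, start=1)
    ]
    # One blank line between entries, matching the example list.
    return "\n\n".join(lines)
```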

Step 2: Optimize the Prompt

Once the user selects a concept, develop it into a full prompt. Structure:

Create a [style type] illustration of [subject].

CONCEPT: [Expand the one-sentence idea into a clear visual description]

STYLE: [Artistic approach - load from references/styles/ if brand-specific]

COMPOSITION: [Framing, focal point, negative space, balance]

COLORS: [Palette - describe by name, not hex codes which may render as text]

TEXTURE: [Surface qualities, analog/digital feel]

AVOID: [What should NOT appear - be specific]

FORMAT: [Aspect ratio]

Key principles:

Natural language, full sentences - no tag soup
Describe colors by name (burnt orange, sky blue, near-black) not hex codes
Maximum 2-3 elements - if it feels busy, remove something
Favor metaphor over literal depiction
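The prompt structure above can be assembled from named fields. The sketch below is illustrative rather than part of the skill: `build_prompt` and its parameter names are hypothetical, and the hex-code check is one simple way to enforce the "colors by name" principle.

```python
import re

# Matches #RGB or #RRGGBB hex color codes, which may render as literal text.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def build_prompt(style_type, subject, concept, style, composition,
                 colors, texture, avoid, aspect_format):
    """Assemble the sectioned prompt in the order given by the template above."""
    if HEX_COLOR.search(colors):
        raise ValueError("Describe colors by name (e.g. 'burnt orange'), not hex codes")
    sections = [
        f"Create a {style_type} illustration of {subject}.",
        f"CONCEPT: {concept}",
        f"STYLE: {style}",
        f"COMPOSITION: {composition}",
        f"COLORS: {colors}",
        f"TEXTURE: {texture}",
        f"AVOID: {avoid}",
        f"FORMAT: {aspect_format}",
    ]
    # Sections are separated by blank lines, as in the template.
    return "\n\n".join(sections)
```

Keeping each section as a full sentence or short paragraph, rather than a list of tags, stays consistent with the "no tag soup" principle.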

Step 3: Style Variations

Default style: Risograph - Use references/styles/risograph.md unless the content calls for something different.

Available styles in references/styles/:

risograph.md - DEFAULT. Halftone dots, misregistration, indie printmaking aesthetic. Warm, tactile, analog.
minimalist-ink.md - High-contrast black and white, crosshatching. For craft/mastery posts.
watercolor-line.md - Ink linework with watercolor washes, warm. For organic topics.
editorial-conceptual.md - Conceptual, sophisticated, editorial wit. For abstract/philosophical posts.

Present style options to user, recommending risograph as default.
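Loading a style reference with risograph as the fallback might look like the sketch below. `load_style` is a hypothetical helper; it assumes each style is a Markdown file named after the style inside references/styles/, as the list above suggests, and it does not restrict names so that custom brand styles also work.

```python
from pathlib import Path

DEFAULT_STYLE = "risograph"


def load_style(name: str = DEFAULT_STYLE, styles_dir: str = "references/styles") -> str:
    """Return the text of a style file such as references/styles/risograph.md."""
    path = Path(styles_dir) / f"{name}.md"
    if not path.is_file():
        raise FileNotFoundError(f"No style file at {path}")
    return path.read_text(encoding="utf-8")
```

The returned text can then be placed in the STYLE section of the prompt.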

Step 4: Generate via API

Running the Script

# Load key from .env and generate
# (match only the GEMINI_API_KEY= line so comments or similar names are not exported)
export "$(grep '^GEMINI_API_KEY=' .env)" && \
python scripts/generate_image.py "prompt here" --model pro --aspect 16:9

# Save to specific folder
python scripts/generate_image.py "prompt" --output "./images" --name "my_image"

Options:

--model flash (faster, cheaper) or --model pro (higher quality)
--aspect 16:9, 1:1, or 9:16 (PRO MODEL ONLY - for flash, you MUST include ratio in prompt text)
--variations N - generate N versions
--output ./path - save location
--name prefix - filename prefix

Output location: Save images alongside the content they belong to - not a generic images dump.
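When calling the script from other Python code, the options above can be turned into an argument list. This is a sketch under stated assumptions: `build_command` is a hypothetical wrapper, it uses only the flags listed above, and the wording appended to the prompt for the flash model is an illustrative choice.

```python
from typing import List, Optional

VALID_ASPECTS = ("16:9", "1:1", "9:16")


def build_command(prompt: str, model: str = "flash", aspect: Optional[str] = None,
                  variations: Optional[int] = None, output: Optional[str] = None,
                  name: Optional[str] = None) -> List[str]:
    """Build the argument list for scripts/generate_image.py."""
    if model not in ("flash", "pro"):
        raise ValueError("model must be 'flash' or 'pro'")
    if aspect is not None and aspect not in VALID_ASPECTS:
        raise ValueError(f"aspect must be one of {VALID_ASPECTS}")

    extra: List[str] = []
    if aspect is not None:
        if model == "pro":
            extra += ["--aspect", aspect]
        else:
            # --aspect is pro-only, so for flash the ratio goes into the prompt text.
            prompt = f"{prompt} FORMAT: {aspect} aspect ratio."
    if variations is not None:
        extra += ["--variations", str(variations)]
    if output is not None:
        extra += ["--output", output]
    if name is not None:
        extra += ["--name", name]

    return ["python", "scripts/generate_image.py", prompt, "--model", model] + extra
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; using a list instead of a shell string avoids quoting problems with long prompts.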

Step 5: Iterate

After user reviews generated images:

80% good? Request specific edits conversationally rather than regenerating
Composition off? Adjust framing or element placement in prompt
Wrong style? Try a different style reference
Too busy? Simplify to fewer elements
Colors wrong? Be more explicit about palette

Prompting Principles

Write Like a Creative Director

Brief the model like a human artist. Use proper grammar, full sentences, and descriptive adjectives.

Don't	Do
"Cool car, neon, city, night, 8k"	"A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis."

Be specific about:

Subject: Instead of "a woman," say "a sophisticated elderly woman wearing a vintage Chanel-style suit"
Materiality: Describe textures - "matte finish," "brushed steel," "soft velvet," "crumpled paper"
Setting: Define location, time of day, weather
Lighting: Specify mood and light source
Mood: Emotional tone of the image

Provide Context

Context helps the model make logical artistic decisions. Include the "why" or "for whom."

Example: "Create an image of a sandwich for a Brazilian high-end gourmet cookbook."
(Model infers: professional plating, shallow depth of field, perfect lighting)

Keep It Simple

One clear focal point
Maximum 2-3 elements total
Generous negative space
If it feels busy, remove something

Avoid the Generic

No lightbulbs for "ideas"
No handshakes for "partnership"
No happy stock photo poses
No glossy AI aesthetic

Resources

references/styles/

Aesthetic style definitions:

risograph.md - DEFAULT - Halftone, misregistration, indie printmaking
minimalist-ink.md - Black and white ink illustration
watercolor-line.md - Ink with watercolor washes
editorial-conceptual.md - Conceptual editorial style

scripts/

generate_image.py - Gemini API image generation

Prompt Modifiers Reference

Category	Examples
Lighting	golden hour, dramatic shadows, soft diffused light, neon glow, overcast
Style	cinematic, editorial, technical diagram, hand-drawn, photorealistic
Texture	matte finish, brushed steel, soft velvet, crumpled paper, weathered wood
Composition	wide shot, close-up, bird's eye view, dutch angle, symmetrical
Mood	energetic, serene, dramatic, playful, sophisticated
Quality	4K, high-fidelity, pixel-perfect, professional grade

Advanced Capabilities

Text Rendering & Infographics

Put exact text in quotes. Specify style: "polished editorial," "technical diagram," or "hand-drawn whiteboard."

Example prompts:

Earnings Report Infographic:
"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."

Whiteboard Summary:
"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."

Character Consistency & Thumbnails

Use reference images and state "Keep the person's facial features exactly the same as Image 1." Describe expression/action changes while maintaining identity.

Example prompt:

Viral Thumbnail:
"Design a viral video thumbnail using the person from Image 1.
Face Consistency: Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised.
Action: Pose the person on the left side, pointing their finger towards the right side of the frame.
Subject: On the right side, place a high-quality image of a delicious avocado toast.
Graphics: Add a bold yellow arrow connecting the person's finger to the toast.
Text: Overlay massive, pop-style text in the middle: 'Done in 3 mins!'. Use a thick white outline and drop shadow.
Background: A blurred, bright kitchen background. High saturation and contrast."

Image Reworking (Edit Existing Images)

The --input flag enables "rework mode" - pass an existing image to Gemini and describe the changes you want.

Key use cases:

Small tweaks - Adjust colors, add/remove elements, change lighting
Style transfer - Keep composition but change artistic style
Object manipulation - Remove, add, or modify specific objects
Seasonal/temporal changes - Same scene, different time/season

Running in rework mode:

# Basic edit - add something
python scripts/generate_image.py "Add snow to the roof and yard" \
  --input ./house.png \
  --model pro

# Color adjustment
python scripts/generate_image.py "Change the accent color from red to teal, keep everything else identical" \
  --input ./thumbnail.png \
  --model pro

# Style transfer - keep composition, change aesthetic
python scripts/generate_image.py "Convert this to risograph style with halftone dots and slight color misregistration" \
  --input ./photo.png \
  --model pro

# Generate variations of an edit
python scripts/generate_image.py "Make the lighting warmer, like golden hour" \
  --input ./portrait.png \
  --variations 3 \
  --model pro

Prompting tips for rework mode:

Be specific about what to preserve:
- "Keep the person's facial features exactly the same"
- "Maintain the composition and framing"
- "Don't change the background"
Be explicit about what to change:
- "Change ONLY the color of the shirt from blue to red"
- "Add snow to the roof and nothing else"
- "Remove the text overlay"
Use comparative language:
- "Make the colors more vibrant"
- "Increase the contrast slightly"
- "Make the lighting softer and more diffused"

Output naming: Files from rework mode are named {prefix}_{timestamp}_edit_{model}.png to distinguish from generated images (_gen_).
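The naming pattern can be illustrated with a short helper. Several details here are assumptions, not taken from the script: the timestamp format (YYYYMMDD_HHMMSS), the use of the short model name in the `{model}` slot, and the guess that generated images follow the same pattern with `_gen_`. The function name `output_filename` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def output_filename(prefix: str, model: str, edited: bool,
                    now: Optional[datetime] = None) -> str:
    """Build a name like {prefix}_{timestamp}_edit_{model}.png."""
    # Assumed timestamp format; the source does not specify one.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # "edit" marks rework mode, "gen" marks a freshly generated image.
    kind = "edit" if edited else "gen"
    return f"{prefix}_{stamp}_{kind}_{model}.png"
```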

Advanced Editing Examples

Object Removal:

python scripts/generate_image.py \
  "Remove the tourists from the background and fill with matching cobblestones and storefronts" \
  --input ./street-photo.png \
  --model pro

Seasonal Control:

python scripts/generate_image.py \
  "Turn this into winter. Add snow to the roof and yard. Change lighting to cold, overcast afternoon. Keep architecture identical." \
  --input ./house-summer.png \
  --model pro

Character Consistency (thumbnail series):

python scripts/generate_image.py \
  "Keep the person's face exactly the same. Change expression to surprised. Add a pointing gesture toward the right side of the frame." \
  --input ./person-reference.png \
  --model pro

Related Skills

youtube-title-creator - Pair generated images with optimized titles
social-content-creation - Use images in platform-optimized posts

For custom brand styles, create new style files in references/styles/ following the existing format.
