Connor Holly


AI Image Generation Patterns

Content & Docs

images, gemini, prompts

What it does

Produces consistent, high-quality images from AI generation models using structured prompts and reference images. The key insight: reference images control output quality far more than text descriptions alone.

The pattern

Prompt structure (5 components, in order):

  1. Style. The visual treatment: "watercolor illustration," "3D render," "flat vector," "photorealistic." This anchors the entire output aesthetic.
  2. Subject. What is in the image: "a dashboard showing revenue metrics," "a person using a mobile app."
  3. Visual elements. Specific details: "blue and purple gradient background," "soft shadows," "grid layout."
  4. Typography. If text appears in the image, specify font style, placement, and the exact wording. Be explicit, or the text will render poorly.
  5. Mood. The emotional tone: "professional and clean," "playful and energetic," "minimal and calm."
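
The five-part structure can be sketched in code. This is a minimal illustration, assuming a hypothetical `ImagePrompt` dataclass that is not part of any image API. It joins the components in the order listed above and leaves out typography when the image contains no text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImagePrompt:
    """Hypothetical container for the five prompt components."""
    style: str
    subject: str
    visual_elements: str
    mood: str
    typography: Optional[str] = None  # only needed when text appears in the image

    def render(self) -> str:
        # Keep the documented order: style, subject, visual elements, typography, mood.
        parts = [self.style, self.subject, self.visual_elements, self.typography, self.mood]
        # Skip empty components and normalize trailing periods before joining.
        return ". ".join(p.strip().rstrip(".") for p in parts if p) + "."


prompt = ImagePrompt(
    style="Flat vector illustration",
    subject="a dashboard showing revenue metrics",
    visual_elements="blue and purple gradient background, soft shadows, grid layout",
    mood="professional and clean",
)
print(prompt.render())
```

Keeping the components as separate fields makes it easy to change one (say, the subject) while holding the style and mood fixed across a series.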

Reference images are the real lever.

Text prompts describe what you want. Reference images show what you want. The gap between those two is enormous. Upload 3-14 reference images that demonstrate:

  • Color palette (the dominant colors you want)
  • Composition style (how elements are arranged)
  • Visual density (sparse vs. detailed)
  • Typography treatment (if applicable)

Reference images anchor the model to a specific visual neighborhood. Without them, you get the model's "average" interpretation of your text, which is usually generic.
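
A reference set can be sanity-checked before uploading. The sketch below assumes a hypothetical `check_reference_set` helper and illustrative aspect labels. It only checks the 3-14 count from the text and whether each non-optional aspect is demonstrated by at least one image.

```python
MIN_REFERENCES = 3   # lower bound from the text above
MAX_REFERENCES = 14  # upper bound from the text above
ASPECTS = {"palette", "composition", "density", "typography"}  # illustrative labels


def check_reference_set(references: dict[str, set[str]]) -> list[str]:
    """Return a list of problems; an empty list means the set looks usable.

    `references` maps an image path to the aspects it is meant to demonstrate.
    """
    problems = []
    if not MIN_REFERENCES <= len(references) <= MAX_REFERENCES:
        problems.append(
            f"expected {MIN_REFERENCES}-{MAX_REFERENCES} images, got {len(references)}"
        )
    covered = set().union(*references.values())
    # Typography is optional ("if applicable"), so only the other aspects are required.
    for aspect in sorted(ASPECTS - {"typography"} - covered):
        problems.append(f"no reference demonstrates {aspect}")
    return problems


refs = {
    "palette.png": {"palette"},
    "layout.png": {"composition"},
    "dense.png": {"density", "typography"},
}
print(check_reference_set(refs))
```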

Batch generation workflow:

  1. Generate 4-8 variations from the same prompt
  2. Select the best 1-2 as new reference images
  3. Regenerate with the refined references
  4. Repeat until the output matches your intent

Each iteration narrows the style space. By round 3, outputs are usually consistent enough to use.
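
The loop above can be sketched as follows. Both `generate` and `select` are hypothetical callables supplied by the caller: the first stands in for whatever image model is in use, the second for a human (or scripted) choice. The sketch also assumes the selected images replace the previous references each round. The text does not say whether they replace or extend the set.

```python
from typing import Callable, Sequence

Image = str  # placeholder type; a real pipeline would use bytes or a file path


def refine(
    prompt: str,
    references: Sequence[Image],
    generate: Callable[[str, Sequence[Image], int], list[Image]],
    select: Callable[[list[Image]], list[Image]],
    rounds: int = 3,
    variations: int = 6,
) -> list[Image]:
    """Run the generate -> select -> regenerate loop for a fixed number of rounds."""
    refs = list(references)
    best: list[Image] = []
    for _ in range(rounds):
        candidates = generate(prompt, refs, variations)  # step 1: 4-8 variations
        picks = select(candidates)                       # step 2: keep the best 1-2
        if not picks:
            break                                        # nothing usable; keep earlier picks
        best = picks
        refs = list(picks)                               # step 3: picks become the references
    return best
```

The default of three rounds follows the observation that outputs are usually consistent by round 3. It is a starting point, not a guarantee.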

Resolution and aspect ratio selection:

  • Social media posts: 1:1 (1024x1024)
  • Stories/reels: 9:16 (1080x1920)
  • Blog headers: 16:9 (1920x1080)
  • Print: highest available resolution at the target aspect ratio
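
The size choices above fit in a small lookup table. In this sketch the key names and the `best_print_size` helper are illustrative assumptions. The list of sizes a given model supports must come from that model's documentation, so it is passed in rather than hard-coded.

```python
from math import gcd

# Sizes copied from the list above; the keys are illustrative names.
TARGET_SIZES = {
    "social_post": (1024, 1024),  # 1:1
    "story": (1080, 1920),        # 9:16
    "blog_header": (1920, 1080),  # 16:9
}


def aspect_ratio(width: int, height: int) -> str:
    """Reduce pixel dimensions to a ratio string such as '16:9'."""
    d = gcd(width, height)
    return f"{width // d}:{height // d}"


def best_print_size(available: list[tuple[int, int]], ratio: str) -> tuple[int, int]:
    """Pick the largest available size (by pixel count) at the target ratio."""
    matching = [size for size in available if aspect_ratio(*size) == ratio]
    if not matching:
        raise ValueError(f"no available size has aspect ratio {ratio}")
    return max(matching, key=lambda size: size[0] * size[1])


for name, size in TARGET_SIZES.items():
    print(name, size, aspect_ratio(*size))
```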

Key decisions

Reference images over long prompts. A 200-word prompt describing a color palette is less effective than one reference image showing those colors. Use text for content and structure. Use images for style.

Iterate with selection, not with prompt engineering. Rewriting prompts is slow and unpredictable. Generating variations and selecting the best ones is faster and converges more reliably.

Consistent reference sets for series. If you are producing multiple images that should look like they belong together (a blog series, a slide deck, a social media campaign), lock your reference images and reuse them across all generations.
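
One way to make "locking" checkable is to fingerprint the reference set and store the fingerprint alongside the series. This is a sketch under that assumption. The `reference_fingerprint` helper is hypothetical, and it treats the set as unordered.

```python
import hashlib
from typing import Iterable


def reference_fingerprint(images: Iterable[bytes]) -> str:
    """Hash a reference set so any change to its contents is detectable."""
    # Hash each image, then sort the digests so upload order does not matter.
    digests = sorted(hashlib.sha256(img).hexdigest() for img in images)
    return hashlib.sha256("".join(digests).encode("ascii")).hexdigest()


# In practice the bytes would come from reading the image files.
locked = reference_fingerprint([b"image-one-bytes", b"image-two-bytes"])
current = reference_fingerprint([b"image-two-bytes", b"image-one-bytes"])
print(locked == current)
```

Comparing the stored fingerprint before each generation run catches the case where a reference image was swapped or edited partway through a series.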

When to use it

Use it when you need images for content, marketing, or product work and want consistent visual quality without a designer; when you need multiple images in a cohesive style; or when you are iterating on visual concepts and need rapid feedback loops. It is not suitable when pixel-perfect precision or specific brand guidelines require manual design work.