Experiment with Gemini 2.0 Flash native image generation

1 2 minutes read

In December, we first introduced native image output in Gemini 2.0 Flash to trusted testers. Today, we’re making it available for developer experimentation in all regions currently supported by Google AI Studio. You can test this new capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API.

Gemini 2.0 Flash combines multimodal input, enhanced reasoning, and natural language understanding to create images.

Here are some examples of where 2.0 Flash’s multimodal output shines:

1. Text and images together

Use Gemini 2.0 Flash to tell a story and it will illustrate it with pictures, keeping the characters and settings consistent throughout. Give it feedback and the model will retell the story or change the style of its drawings.

Story and illustration generation in Google AI Studio

2. Conversational image editing

Gemini 2.0 Flash helps you edit images over many turns of natural language dialogue, which is great for iterating toward the perfect image or for exploring different ideas together.

Multi-turn conversational image editing that preserves context throughout the conversation in Google AI Studio
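Multi-turn editing can be sketched in code as a chat session in which each instruction is one turn and the session carries the earlier images and text forward. The sketch below assumes the google-genai Python SDK's chat interface (`client.chats.create` and `chat.send_message`). The helper names `open_image_chat` and `edit_in_turns` and the example prompts are hypothetical, not part of the SDK or of the original post.

```python
def open_image_chat(api_key, model="gemini-2.0-flash-exp"):
    """Create a chat session configured to return both text and images."""
    # Imported here so edit_in_turns below can be tried without the SDK.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=api_key)
    return client.chats.create(
        model=model,
        config=types.GenerateContentConfig(
            response_modalities=["Text", "Image"]
        ),
    )


def edit_in_turns(chat, instructions):
    """Send each instruction as one turn and return the responses in order.

    The chat object is assumed to keep the conversation history, so a later
    instruction can refer to an image produced by an earlier one.
    """
    return [chat.send_message(text) for text in instructions]
```

A call such as `edit_in_turns(open_image_chat("YOUR_API_KEY"), ["Draw a croissant on a plate", "Make it a chocolate croissant"])` would issue two turns against the same session, under the assumptions above.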

3. Understanding the world

Unlike many other image generation models, Gemini 2.0 Flash leverages world knowledge and enhanced reasoning to create the right image. This makes it ideal for creating detailed, realistic images, such as illustrations for a recipe. While it strives for accuracy, like all language models, its knowledge is broad and general, not absolute or complete.

Interleaved text and image output for a recipe in Google AI Studio

4. Text rendering

Most image generation models struggle to render long sequences of text accurately, often producing poorly formatted or illegible characters, or misspellings. Internal benchmarks show that 2.0 Flash has stronger text rendering than leading competitive models, which makes it great for creating ads, social posts or even invitations.

Image output with long text rendering in Google AI Studio

Start making images with Gemini today

Get started with Gemini 2.0 Flash via the Gemini API. Read more about creating images in our docs.

from google import genai
from google.genai import types

# Replace the placeholder with your own Gemini API key.
client = genai.Client(api_key="GEMINI_API_KEY")

# Ask for a story with one generated image per scene.
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=(
        "Generate a story about a cute baby turtle in a 3d digital art style. "
        "For each scene, generate an image."
    ),
    # Request both text and image parts in the response.
    config=types.GenerateContentConfig(
        response_modalities=["Text", "Image"]
    ),
)

Python
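The request above returns a response that mixes text and image parts. As a minimal sketch of one way to consume it, the helper below collects the text and writes each image to disk. It assumes the response exposes `candidates[0].content.parts` and that each part carries either `text` or `inline_data` (with raw `data` bytes and a `mime_type`), as in the google-genai Python SDK. The function name `save_parts` and the `scene_N` file naming are hypothetical.

```python
import mimetypes
from pathlib import Path


def save_parts(response, out_dir="story"):
    """Collect text parts and write image parts of a response to out_dir.

    Assumes response.candidates[0].content.parts is an iterable of parts,
    each with .text (str or None) and .inline_data (None, or an object
    with .data bytes and .mime_type).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    texts, image_paths = [], []
    for part in response.candidates[0].content.parts:
        if getattr(part, "text", None):
            texts.append(part.text)
        inline = getattr(part, "inline_data", None)
        if inline is not None and inline.data:
            # Derive a file extension from the MIME type, e.g. image/png.
            ext = mimetypes.guess_extension(inline.mime_type or "") or ".bin"
            path = out / f"scene_{len(image_paths) + 1}{ext}"
            path.write_bytes(inline.data)
            image_paths.append(path)
    return texts, image_paths
```

With the request above, `texts, image_paths = save_parts(response)` would leave the story text in `texts` and the saved image locations in `image_paths`, in the order the model returned them.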

Whether you’re building AI agents, developing applications with beautiful visuals like interactive storyboards, or brainstorming visual ideas in conversation, Gemini 2.0 Flash lets you add text and image generation with just a single model. We’re excited to see what developers create with native image output, and your feedback will help us finalize a production-ready version soon.

2025-03-12 14:58:00
