AI Image Generators Default to the Same 12 Photo Styles, Study Finds
AI-powered image generation models have huge sets of visual data that they can draw on to create unique outputs. However, researchers found that when models were pushed to produce images from a series of slowly changing prompts, they defaulted to a handful of visual motifs, producing an ultimately generic style.
A study published in the journal Patterns took two AI models, the image generator Stable Diffusion XL and the vision-language model LLaVA, and tested them by having them play a game of visual telephone. The game went like this: The Stable Diffusion XL model would be given a short prompt and asked to produce an image. One example prompt: “As I sat alone, surrounded by nature, I found an old book of exactly eight pages telling a story in a forgotten language waiting to be read and understood.” That image was presented to the LLaVA model, which was asked to describe it. The description was then sent back to Stable Diffusion, which was asked to create a new image based on it. This continued for 100 rounds.
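For readers who want to see the shape of that loop, here is a minimal Python sketch. It is not the researchers' code: the function name `visual_telephone` and the two toy stand-ins are hypothetical, and a real run would swap in actual calls to an image generator and a captioning model.

```python
from typing import Any, Callable, List, Tuple


def visual_telephone(
    seed_prompt: str,
    generate_image: Callable[[str], Any],
    describe_image: Callable[[Any], str],
    rounds: int = 100,
) -> List[Tuple[str, str]]:
    """Alternate between generating an image and describing it.

    Returns one (prompt, caption) pair per round. Each caption
    becomes the prompt for the following round.
    """
    history = []
    prompt = seed_prompt
    for _ in range(rounds):
        image = generate_image(prompt)    # text -> image
        caption = describe_image(image)   # image -> text
        history.append((prompt, caption))
        prompt = caption                  # feed the description back in
    return history


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_generate(prompt: str) -> dict:
    # Pretend the "image" simply carries the prompt that produced it.
    return {"rendered_from": prompt}


def toy_describe(image: dict) -> str:
    # A lossy "caption": keep only the first five words.
    return " ".join(image["rendered_from"].split()[:5])
```

Calling `visual_telephone("an old book of exactly eight pages in a forest", toy_generate, toy_describe, rounds=3)` returns three (prompt, caption) pairs. Even with these crude stand-ins, the detail dropped in the first caption never comes back, which is the basic dynamic the study describes.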
Just like in a human game of telephone, the original image was quickly lost. That’s not surprising, especially if you’ve ever watched one of those time-lapse videos where people ask an AI model to reproduce an image without making any changes, only for the image to quickly turn into something that doesn’t even remotely resemble the original. What did surprise the researchers was that the models converged on only a few generic-looking styles. Across 1,000 different runs of the telephone game, the researchers found that most image sequences would eventually settle into just one of 12 dominant motifs.
In most cases, the shift was gradual. Sometimes it happened suddenly. But it almost always happened. The researchers were not impressed. In the study, they referred to the popular image styles as “visual elevator music,” essentially the kind of pictures you see hanging in a hotel room. The most common scenes included lighthouses, formal interiors, urban night settings, and rural architecture.
Even when the researchers switched to different models for generating images and descriptions, the same kinds of trends emerged. When the game was extended to 1,000 turns, convergence on a style still occurred at about turn 100, but variations appeared in the additional turns, the researchers said. Interestingly, those variations were still usually drawn from one of the common visual motifs.

So what does all this mean? Mostly, that AI is not particularly creative. In a human game of telephone, you end up with extreme variance because each message is delivered and heard differently, and each person has their own internal biases and preferences that may influence the message they receive. AI has the opposite problem. No matter how exotic the original prompt is, it tends to default to a narrow set of styles.
Of course, the AI models draw on human-created material, so there’s something to be said about the data set and what humans are drawn to taking pictures of. If there is a lesson here, perhaps it is that copying styles is much easier than teaching taste.
2025-12-20 13:00:00



