thing 12: Generating Images

Have you ever thought, I’d like to find an image of a sunset beach? A quick internet search will usually do the trick. But what if what you really wanted was:

An impressionist painting of a sunset beach attended by Victorian-era ghosts having a tea party on jellyfish-shaped beach towels?

While a typical internet search is ideal for finding existing images or photographs of specific people or places, if your goal is to create custom illustrations, conceptual art, or visualizations of ideas that may not even exist, you’ll need a different tool (besides your own talent, of course).

AI image generators really do put the generative in generative AI. AI image generators don’t search the web or refer to a database of images. Instead, they generate entirely new visual content by drawing on patterns and relationships learned from millions of examples. The image doesn’t exist until your prompt activates a generative process that brings your idea to life.

So how does it do that?

Read

How does the technology work? (Univ. of Arizona Libraries)

When it comes to AI image generation you don’t find the image. You write it into existence. The prompt is the blueprint.

Image Prompting 101 (Microsoft)

Activity

AI Image Prompting Walkthrough: Yarn Garden Edition

We’ll be using Microsoft Copilot for this exercise, which uses the DALL·E model from OpenAI.
Heads up: If you’re seeing a message about daily limits while using Copilot’s image generator, it just means you’ve hit the usage cap for the day. This may also limit the complexity of the images Copilot delivers. You’re still doing the activity right! Try another image generator like ChatGPT, Ideogram, Gemini, or Krea!

Optional: Download this companion document for some hints on how you might remix the prompts in this activity. There’s also an additional, optional activity and some more prompts to try out.

Prompting Tip

If using Copilot, make sure it generates an image rather than a paragraph of text by starting your prompt with a visual action verb, such as:

  • Create, Draw, Generate, Illustrate, Design, or Visualize

Example:

❌ A beautiful mountain made of books → (Might return text)

✅ Generate an oil painting of a beautiful mountain made of books → (Image result!)
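For the scripting-inclined, the tip above can be sketched as a tiny prompt checker. The verb list and function name below are our own illustration for this exercise, not part of Copilot or any real image-generation API:

```python
# Toy check: does a prompt start with a visual action verb?
# The verb list mirrors the tip above; extend it as you like.
ACTION_VERBS = {"create", "draw", "generate", "illustrate", "design", "visualize"}

def starts_with_action_verb(prompt: str) -> bool:
    """Return True if the prompt's first word is a visual action verb."""
    words = prompt.strip().split()
    return bool(words) and words[0].lower().rstrip(",.:") in ACTION_VERBS

print(starts_with_action_verb("A beautiful mountain made of books"))      # False
print(starts_with_action_verb("Generate an oil painting of a mountain"))  # True
```

Running a draft prompt through a check like this before pasting it into the tool is one way to avoid the "paragraph of text instead of an image" surprise.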

Prompt Formula

Here is a rough formula we’ll use to craft increasingly descriptive image prompts:

Create a [image type or medium] of [subject(s)] in [setting or environment], [composition or layout], [style or mood], [lighting or texture], and optionally [camera or rendering detail].

Example: Create a watercolor painting of two golden retrievers lounging on a front porch in autumn, framed by a wooden railing and scattered leaves, in a cozy and nostalgic mood, with soft afternoon lighting and gentle brush textures.
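If you like to experiment systematically, the formula above can also be expressed as a fill-in-the-blanks template. This is just a sketch for assembling prompt text; the function and parameter names are our own invention, not part of any tool:

```python
def build_image_prompt(medium, subject, setting, composition, style,
                       lighting, detail=None):
    """Assemble a prompt following the rough formula above."""
    parts = [f"Create a {medium} of {subject} in {setting}",
             composition, style, lighting]
    if detail:  # optional camera or rendering detail
        parts.append(detail)
    return ", ".join(parts) + "."

# Rebuild the watercolor example from the formula's slots:
prompt = build_image_prompt(
    medium="watercolor painting",
    subject="two golden retrievers lounging on a front porch",
    setting="autumn",
    composition="framed by a wooden railing and scattered leaves",
    style="in a cozy and nostalgic mood",
    lighting="with soft afternoon lighting and gentle brush textures",
)
print(prompt)
```

Swapping out one slot at a time (just the lighting, just the style) makes it easy to see which part of the prompt is actually driving the changes in the generated image.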

Activity

Prompting in Action: The Yarn Garden

Before you start, head over to Microsoft Copilot (or ChatGPT, Ideogram, Gemini, or Krea!). You can copy and paste the prompts below and see what happens as you change the descriptive language of your prompt. These prompts will get you started, but feel free to try out your own versions and variations!

PROMPT 1: Basic Prompt

Create a digital illustration of a garden in the countryside with rows of vegetables and a sunset.

PROMPT 2: Add Texture and Lighting

Generate a whimsical digital illustration of a garden made entirely from yarn, set during golden hour, showing colorful vegetables, swaying sunflowers, and soft yarn textures.

PROMPT 3: Focus on Scene Details

Create an image of a scene where rows of tomatoes, peppers, and carrots made from colorful yarn are surrounded by yarn sunflowers and scattered yarn seeds, captured in a wide-angle view, lit by a glowing yarn sunset, evoking a cozy and enchanting atmosphere.

PROMPT 4: Advanced Variation

Design a high-resolution digital illustration of a magical yarn garden at sunset, with vegetables and flowers crafted from vibrant thread, styled like a stop-motion children’s book, using soft natural lighting and tactile fiber textures to create a handmade, heartwarming tone.

The prompt below was used to create the images at the top of this page. The images were all generated with the exact same prompt using different tools.

A high-resolution digital illustration in a stop-motion children’s book style depicts a whimsical yarn garden at sunset. At the heart of the scene sits a plump, vibrant tomato constructed entirely of scarlet thread, with delicate green tendrils woven to resemble leaves and supported by a spool base. Around the tomato are crocheted sunflowers with bright yellow petals and fuzzy brown centers, alongside miniature carrot sculptures crafted from orange yarn, all nestled within a bed of moss-like textured thread. Soft, warm sunlight filters through the scene, casting long shadows and illuminating the tactile fiber textures, creating a heartwarming and inviting atmosphere.

Discussion

Which part of your prompt had the greatest impact? Was it the subject, the material, the lighting, or the style? Or something else? How do you know?

17 replies on “thing 12: Generating Images”

In Copilot it only created the first prompt. It errored on every prompt after that. Very disappointing. I used ChatGPT after that, and it created a very similar first image, oddly enough. It took the prompt very literally, as it just had the listed vegetables placed on the ground, and not on a plant like you would expect. The third and fourth images look extremely similar. The greatest impact for my responses came from specific, defined details. It took everything very literally, and the “heartwarming tone” did not seem to mean anything to it; the AI didn’t know how to visualize heartwarming.

I went off in a totally different direction and ended up doing my own prompt. I asked: “Create an impressionist painting of Victorian ghosts drinking tea in a poison garden.” I wanted to see if it knew what a poison garden was or if it would totally go sideways in its interpretation. It didn’t. It knew what a poison garden was. I asked for the same thing, except to give the painting a gothic fairytale feel, which then took the impressionist style out for some reason. I then tried a third time to give the original prompt a golden hour glow. It started to give me error messages after that.

The image Copilot created, after following the prompts provided above, was very two-dimensional and didn’t look like the image above that was generated by Copilot. I asked to add farm animals and Copilot added a cute kitten, dog and some rabbits. And then when I tried additional prompts, I was told I had reached my daily limit. I would try again and try to be more specific and creative with prompts.

I think that the last three photos Copilot gave looked almost identical, especially the final two photos. The instructions on lighting seemed to have made the biggest difference, as there was a distinct difference in warmth once that was specified. I think it quickly hit a wall with how much it was going to change, and it got in the groove of making tiny, almost imperceptible changes.

I think ChatGPT (or other GenAI) could just write their own prompts for this stuff. I asked ChatGPT to define two concepts, “self compassion” and “psychological richness.” Then I asked it to “Describe a piece of visual art that would inspire the audience to imagine a reinforcing relationship between self compassion and psychological richness.” It came up with a description that was much richer and more beautiful than I could have composed (given that I am not much of an artist). Then I asked it to “translate that description to a large format acrylic painting,” where it outlined all the details about the visual structure, color palette, and interpretive statement for the gallery wall. The end product was actually quite nice.

I think the phrase “lit by a glowing yarn sunset” had the most impact. It resulted in the only image that had rays extending from the sun and a brighter, lighter image. It was interesting that the image created from the prompt using the word “glowing” was the only “yarn” image that had sunflowers but no purple flowers.

The material and style made the most difference from the first prompt. However, those differences weren’t THAT impactful. If I were going to use this image in a presentation, the differences were subtle. This activity demonstrated how the tool just reuses information and how human originality is crucial.

The last two images were quite similar. Interestingly, the first two images both had houses in them. I thought those might be generated by the “in the countryside” prompt, since the house was barn-like (except for the chimney), but it was in the second prompt as well. Adding the “from yarn” portion of the prompt flattened the images — the final three look almost like needlepoint to me.

I was only able to do the first prompt with Copilot before I hit the image generating cap. I tried the second prompt in ChatGPT, and the image seemed to me to convey malicious sunflowers moving in to attack unsuspecting vegetables, which I thought was kind of cool. Nothing changed in the following prompts except the sky was a bit darker. Since I’m not an illustrator, I found this exercise to be an impressive display of what GenAI can do. If I were an illustrator, I’d be worried about losing my job and sad about the flattening of creativity that is “good enough” for non-illustrators.

I tried rewording this prompt several ways: “Create a digital illustration of a forest setting with a river running through it. In the forest are unicorns and in the river are mermaids.” Each time it started to craft an image and then gave me an error: “Sorry, I can’t do that. However, I can help you create a detailed written description of the scene for an artist or suggest tools you can use to bring your vision to life. Would you like a description or some creative tool recommendations?”

I ended up having to change it to a 2 part prompt, where I first asked for: “Create a storybook image of unicorns in a forest at sunset. ” and then asked it to “Include mermaids in a river.” Interestingly it was able to add the mermaids but then it lost the sunset portion.

The images were fine but it took a lot of time and prompting. Not sure I’ll do that again unless I need something super specific.

I tried Prompt 1 and then went to Prompt 4. The first pic resembled an oil painting and was very like something you would see in a children’s book (I feel like). The 4th prompt led to an image very similar to the one shown at the top of the site, however the sunflowers were standing up straight and I had a square pattern on the ground (possibly resembling gardening beds). I would love to see the results from others and what differences there are in the results when using the same prompt and the same AI tool. I feel the more detail you can give the better, and tested that theory by taking some detail away from Prompt 4 (Create a high-resolution digital illustration in children’s book style of a yarn garden at sunset. At the heart of the scene sits a tomato constructed entirely of thread, with tendrils woven to resemble leaves and supported by a spool base. Around the tomato are crocheted sunflowers and miniature carrot sculptures. Sunlight filters through the scene) and the results were very similar (however it made a mistake and it looks like a sunflower grows out of a carrot lol). I then took that prompt and put it in ChatGPT as a comparison. The image was quite different. I then used the longer, more detailed prompt in ChatGPT to see if the image would be similar. The result was almost identical to the first one. So it seems more detail does not result in changes once we reach a certain level of information.

I used the example prompts and the biggest changes were from prompt one to prompt two and prompt three to prompt four. Prompt one was an actual realistic farm with rows of vegetables and realistic lighting while three, four, and five were just plants and a sun behind them. Adding the “digital illustration” modifier in prompt four made the image go from two-dimensional to three-dimensional. The shadows were still unrealistic and there was a weird plant that grew a sunflower, tomato, and bell pepper on the same plant.

I created a different image of a dog dancing as a ballerina on a beach. The picture was just ok, easily detected as a generated photo; but I did appreciate the shadow and the ocean waves crashing in the background.

I asked Copilot to “Create a pointillist painting of a garden in the countryside with rows of vegetables during golden hour” and then to shift the focus of the painting to the foreground. It was quirky: it shifted the frame of the ‘painting’ as you might expect, and it changed the size of the dots somewhat, larger in the background, more precise and pixelated in the foreground like a camera (not like pointillism). I am also not sure it understands what visual focus is. The garden rows in the image, while “close up,” draw the eye out into the distance of the second image, creating something where your eye is dwelling on the upper third of the frame and not the foreground.

The first picture created by CoPilot looked like a painting, but it had huge rows of carrots laying on their sides. It was quite comical. Then, it just got too tired to give me more for the next prompt. I do think mentioning style helps a lot. I have used CoPilot to create pictures for presentations. It has gotten much faster than a year ago. I do notice that it really doesn’t do human hands well. Usually too many fingers.

To me, step 3 had the most impact because I asked ChatGPT how to set up the specific scene (sunflowers, how they surround other things, etc.). It’s really, really slow, and every single time it creates a somewhat different picture. Instead of adding new things to an existing photo, it just kept creating completely new things.
