Table of Contents
- Prompt must contain the same keyword
- Batch issue
- Character and context matter
- Choose an AI
- Prompt refinement
DALL-E 2 can paint amazing pictures from words alone. However, when using the images this AI outputs as illustrations for books and comics, you need to keep the characters and style consistent.
It is currently not possible to upload a character to DALL-E 2 and instruct the AI to reuse that character in a new drawing. The following advice will give you at least some degree of consistency in your output images.
Prompt must contain the same keyword
Once you have decided on a style, repeat the same style keywords in every prompt.
When I did the art direction for the picture book Indecisive Chameleon, I used the keywords “oil painting, children’s drawing.” These keywords unified the style and the character design.
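This keyword-reuse habit can be sketched as a small helper that appends one fixed style string to every scene description (a minimal Python sketch; the scene descriptions below are illustrative examples, not the book’s actual prompts):

```python
# Reuse the same style keywords in every prompt so a series stays consistent.
STYLE = "oil painting, children's drawing"  # the fixed style keywords

def styled_prompt(subject: str, style: str = STYLE) -> str:
    """Append the fixed style keywords to a scene description."""
    return f"{subject}, {style}"

# Illustrative scenes for a chameleon picture book (not the book's real prompts).
scenes = [
    "a chameleon sitting on a branch",
    "a chameleon hiding among green leaves",
]
prompts = [styled_prompt(scene) for scene in scenes]
print(prompts[0])  # → a chameleon sitting on a branch, oil painting, children's drawing
```

Keeping the style in one place also makes it easy to regenerate the whole series if you later change your mind about the look.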
Batch issue
DALL-E 2 also seems to keep images generated in the same batch consistent. As an example, take a look at the chameleons I generated: each of the paintings above was made in the same batch, and the style appears to be the same.
A few days later, the following image was drawn with the same keywords (oil painting, children’s drawing). The painting style is consistent, but slightly different from last time.
Character and context matter
DALL-E 2 cannot (or will not) reproduce a specific character or setting exactly from one image to the next. The model should therefore be used in media that tolerate some degree of inconsistency. I used DALL-E 2 to create a picture book with a chameleon as the main character, and because every image is generated independently, the chameleon looks slightly different each time. But precisely because the medium is a children’s picture book about chameleons, the book still reads as consistent from beginning to end. Here’s a short video of the book in its entirety.
Choose an AI
Painting styles vary from one AI model to another. DALL-E 2 is the most “style agnostic” of them: the styles of its output images are so diverse that they cannot be traced back to a single source in the training data. Models like Midjourney, on the other hand, have a distinctive painting style. It is easy to maintain consistency with such models, but the style can quickly wear thin. Check out the awesome Midjourney video clip below to get a feel for its style.
(*Translation Note 2) For a report on the stylistic differences between DALL-E 2 and Midjourney, see the blog post “Craiyon, DALL-E 2, and Midjourney” by Michael, a digital analyst living in Canada, which compares the three models. The article includes images showing that DALL-E 2’s stylistic range is wider than Midjourney’s. The image below is the output for the prompt “Two Astronauts Exploring a Tomb on Mars”; the top is from DALL-E 2, the bottom from Midjourney.
Prompt refinement
DALL-E 2 has been trained on millions of images, much of its training material derived from stock photos.
It is therefore safe to assume its output reflects whatever biases those training images contain. Below is a DALL-E 2 image generated for “A portrait of a kindergarten teacher”. As you can see, only women are depicted (although there are, of course, male kindergarten teachers).
I ran into trouble trying to get DALL-E 2 to render “woman with her 3 months old baby in a cafe”; rephrasing it as “A young family with a 3-month-old baby in a cafe” gave a much more accurate output.
All in all, DALL-E 2 seems to do its job when the prompt makes sense.
Below is an example output for a monkey eating a banana. No problem.
However, when I typed “monkey-eating bananas”, things didn’t go as planned, as shown below.
If you want DALL-E 2 to draw a banana eating a monkey, you need to elaborate. For example, the following image was output for “A giant banana with a big mouth is taking a bite from a monkey”.
Furthermore, when I entered “a realistic photo of a llama playing basketball,” the output llama was more photoreal than the previous outputs. Adding a phrase like “photorealistic image” to the “monkey eating banana” prompt discussed in this article may therefore yield photorealistic bananas and monkeys.
The image above is close to what I had in mind, but it’s still not perfect.
You can get more accurate results by telling DALL-E 2 what style you want in the output and making sure the style matches the prompt. For example, the output for “A giant banana with a big mouth is taking a bite from a monkey, a surrealist digital art” is below.
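If you drive DALL-E 2 through the OpenAI API rather than the web UI, the same subject-plus-style pattern applies. A hedged sketch, assuming the `openai` Python package (v1 client) and an `OPENAI_API_KEY` in the environment; the helper names here are mine, not from the article:

```python
# Combine a subject with a matching style descriptor into one request.
SUBJECT = "A giant banana with a big mouth is taking a bite from a monkey"
STYLE = "a surrealist digital art"

def build_request(subject: str, style: str, n: int = 1) -> dict:
    """Build the parameters for an image-generation call."""
    return {
        "model": "dall-e-2",
        "prompt": f"{subject}, {style}",  # style appended so it matches the subject
        "n": n,
        "size": "1024x1024",
    }

def generate(request: dict):
    """Actually call the API; needs a valid OPENAI_API_KEY, so not invoked here."""
    from openai import OpenAI
    client = OpenAI()
    return client.images.generate(**request)

request = build_request(SUBJECT, STYLE)
```

Keeping the subject and style as separate pieces makes it easy to vary one while holding the other fixed, which is exactly what maintaining a consistent series requires.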
Now that the character and art style are established, we can continue the giant-banana batch while allowing DALL-E 2 a little more creativity. After all, no one likes to be directed in every detail, and that seems to apply to AI models as well. Below is the output for “A giant monster banana with a big mouth is the fear of all monkeys, a surrealist digital art”:
DALL-E 2 has received millions of prompts, and the community has been very generous with what it has learned. DALL-E’s official Discord is full of tips and tricks, and this dictionary describes how DALL-E 2 reacts to different filters, camera angles, and famous creators. These materials will help you predict how DALL-E 2 will react to your prompts without actually typing them.