Innovation is in our DNA at TheVentury. Our experts continuously learn about the newest trends and follow the latest advancements in technologies and their applications.
Big changes are rare, but when they appear they tend to revolutionize whole sectors. Let us take a look together at one of these big changes and let us jump into the topic of
AI image generators
If you haven’t yet heard about Stable Diffusion, DALL·E 2 or Midjourney, you might be amazed by the current state of AI image and art generators. It can hardly be overstated how quickly they have improved. And they evolve every week (since I wrote this article, I have had to update it three times to include all the developments that happened).
This technology is an absolute game-changer in photography, art creation, illustration, concept art, video creation and, soon, 3D.
If you have followed the development of AI-based visuals in recent years, you might agree that the possibilities were limited and the results were usually underwhelming.
Now we are witnessing a giant leap forward, and the best part is we have just scratched the surface. To cite the founder of Midjourney, David Holz: “We don’t even know yet what this technology will be fully capable of, but we already know there will be much more.”
What is an AI image generator, and how does it work?
Nowadays, AI is light years away from what you would imagine a terminator or a sci-fi robot to be like. Instead, machine learning models use neural networks and are trained on datasets containing millions of text-image pairs.
Diffusion models work by corrupting the training images with Gaussian noise over a set number of iterations, until each training image is approximately pure noise. The model is then trained to reverse this process and gradually denoise the image until a clean sample is produced. To generate a new picture, the trained model starts from random noise and denoises it step by step, guided by the text prompt. It’s a hard-to-imagine process, but you can check out this video for a good explanation:
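The noising half of this process can be sketched in a few lines of Python. This is a minimal illustration, not how Stable Diffusion or any other product is implemented: the "image" is just a list of 1,000 pixel values, and the function names (add_noise_step, forward_diffusion, remaining_signal) as well as the values of BETA and STEPS are assumptions chosen for the example. The denoising half, which requires a trained neural network, is not shown.

```python
import math
import random

BETA = 0.05   # assumed amount of noise mixed in per step (illustrative value)
STEPS = 200   # assumed number of noising iterations (illustrative value)


def add_noise_step(pixels, beta, rng):
    """One forward step: shrink the signal slightly and mix in Gaussian noise."""
    keep = math.sqrt(1.0 - beta)
    mix = math.sqrt(beta)
    return [keep * p + mix * rng.gauss(0.0, 1.0) for p in pixels]


def forward_diffusion(pixels, steps, beta, rng):
    """Apply the noising step repeatedly, as done to training images."""
    for _ in range(steps):
        pixels = add_noise_step(pixels, beta, rng)
    return pixels


def remaining_signal(steps, beta):
    """Fraction of the original pixel value that survives after `steps` steps."""
    return math.sqrt(1.0 - beta) ** steps


rng = random.Random(0)          # fixed seed so the run is repeatable
image = [1.0] * 1000            # toy "image": 1,000 identical bright pixels
noisy = forward_diffusion(image, STEPS, BETA, rng)

mean = sum(noisy) / len(noisy)
variance = sum((p - mean) ** 2 for p in noisy) / len(noisy)

print(f"signal left after {STEPS} steps: {remaining_signal(STEPS, BETA):.4f}")
print(f"pixel mean: {mean:.3f}, pixel variance: {variance:.3f}")
```

With these example values, less than one percent of the original pixel value is left after 200 steps, and the pixels are statistically close to pure Gaussian noise (mean near 0, variance near 1). A real model learns to undo these steps one at a time.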
What does this look like for the user? There is an interface, e.g. a bot in Discord, an environment in Google Colab or another application interface, where you can send messages with simple (or complex) text prompts describing the graphic or photo you want to create. Based on what it understands from your input, the machine learning model returns original and unique visuals.
There are many tricks and things you can do to get better results and create precisely what you want, but even with just a few words and no previous knowledge, you can get something extraordinary. In general, the more specific you are, the more likely you are to get a relevant result.
fire opal, illuminated by lightsource, mineral photography, minerals, dark background beautiful photography, 8k, highly detailed, intrinsic details, octane render
What can you do with it?
Pretty much everything. From concept art and architecture through drawings, paintings and photography to even logos or tattoos, these image generators let anything you can imagine become reality.
Need a concept for buildings?
© Stefan Huber
Or a photograph of a situation that would usually require a professional setup?
© Nico Angelone – https://www.facebook.com/groups/548208310207197
Do you want textures for your game? Do you want to illustrate a comic story? Do you need a movie, game, or interior design concept? Anything that can be created by digital painting, hand drawing or illustration can also be done by AI algorithms.
The best part about these endless possibilities is how conceptually they allow you to think.
Once you are free of the labour normally needed to create an image, the concept and idea become the focus, and stories emerge everywhere.
Filmmakers, comic book creators, artists or designers are looking into AI as support to create backgrounds, textures, storyboards & concepts, or even animations. Soon we will also see 3D images made with these tools.
Still, there are certain limitations. It is pretty hard to create things that were not covered well enough by the reference images in the training data. Also, keeping a character consistent across multiple images is almost impossible, and as a filmmaker or game developer, you always want reliable results.
Training your model
Stable Diffusion lets you tackle this challenge by training the model with your own images. You can even have your face painted as an AI-generated portrait. 🙂
One of the more recent developments is the progress of video creation with Stable Diffusion. The principle is the same as for static images, but an additional step lets you morph the images into one another, with control over transitions and other aspects.
Check out this great example by Maui Mauricio:
Repair and outpaint
DALL·E 2 can now repair damaged or old photos and even expand a picture to fill in the surroundings that have been cropped or never existed.
Original: Girl with a Pearl Earring by Johannes Vermeer
Outpainting: August Kamp
Due to the nature of the generation process, each output image is unique and can never be replicated fully. This also means there is currently no way of having editing control over the images (with the emphasis on currently).
If you want to change the outcome, you have to go through the process again, but you cannot manipulate parts of the finished image (of course, you can do post-processing in Photoshop or similar, but you can’t alter the original picture).
This also means that you rely on how well the application understands dependencies. For example, creating multiple objects that are related or depend on each other is much harder for the AI to understand than just creating a single person with a nice background.
What are the implications of this technology?
When following the communities around these applications, it becomes clear that users who try them out are very enthusiastic about the use cases and the amazing things they can suddenly create by imagination alone.
These tools give you access to complex art without needing years of experience. Artists and photographers who make their living by selling original art might have a different opinion about AI’s impact on their work, since it also means that certain tasks can be replaced entirely. There are already petitions to ban AI-generated artworks from different platforms.
The thing is, AI will not replace all manually crafted things, but it will take a considerable part of the market. For example, look at these amazingly realistic AI-generated food images.
Food photography © DALL·E 2 creators
Of course, we can only speculate about what will come next and how we will use AI image generators. What will the regulations be regarding copyright ownership (check out this podcast on AI copyrights on Spotify by Kristina Kashtanova)? How will we, as a society, deal with the issue that ML models train on other artists’ previous work without any compensation? And what happens when the internet is flooded with images that can’t be easily identified as fake anymore?
Also, there are philosophical questions: Who is creating art? The machine? The user? It is hard to answer since there is no clear definition of what art is and what skills are needed to fulfill this definition.
What is next?
Since Stable Diffusion is open source, many well-known companies, such as Adobe or Canva, are working eagerly to include these tools in their own programs. I expect that in a few months, all major design platforms will have integrations of some AI image generators.
The genie is out of the bottle.
The technology is there and it won’t be possible to reverse that. The best you can do, if you’re a photographer, an artist or an illustrator, is to dive deep into the subject, check out how it can be helpful for you and treat it as a new resource.
Check out the 3 AI Image Generators mentioned in this article:
Stable Diffusion is open source and can also run locally. It is meant to be free. It can be integrated into other applications and is, for example, used in Midjourney’s latest update. There are also Photoshop plugins that run on Stable Diffusion.
“Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.”
Midjourney is an application that you can access via Discord. It’s pretty simple to understand and easy to use.
DALL·E 2 – OpenAI
“DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language.”
DALL·E 2 was created by OpenAI, an AI research and deployment company. It is mostly praised for its photorealistic features.
© It’s hard to track the creator of every AI-generated image out there. If you made one of those awesome pictures and want to be credited, please reach out!