Wednesday, December 21, 2022

What I've Been Thinking About Generative Art

Image generated by computer from a text prompt

The field of text-to-image generative art has been drawing investor dollars, and has been growing at an impressive pace, with lots of startups creating services using Stable Diffusion, Midjourney, Dall-E2 and ChatGPT. One notable investor newsletter called Antler has just released an introduction to the field and a map of what's out there.

My brief quote in the article is part of a longer Q and A that I did with the author. If you're interested in where my thoughts are at the moment, read on for the full interview, and I invite you to share your thoughts in the comments:

December 19, 2022

Thoughts about Generative AI or AI Art

Questions from Ollie Forsyth of Antler Investments


1. What impact does Generative AI have on you as a creator? 

Not much. I've tried Dall-E2, Stable Diffusion, and ChatGPT but the results for me always fall too far short of what I have in mind to create. I have very specific ideas in mind when I want to make a picture, and I get the best results and the most satisfaction from making my pictures with paint and brushes. I have always made my art with physical tools, and use digital tools for photography and video to document the process and the results.

That said, I'm watching what's happening with great interest. I'm truly inspired by the results other people are getting with the help of generative models. The people who consistently get the best results often have a computer coding background as well as an aesthetic sense. Is it “art?” Sure, why not? It definitely stimulates my imagination and the good stuff strikes me as original.

You often hear artists refer to AI image models as "tools," but AI is so much more than a tool. It's a creative partner, a synthetic genie, or an inspirational ally. It’s a weird feeling to acknowledge that a machine can be an able creative partner, with the human acting as a kind of midwife or helper.

But I'm 100% old school. For me, imagination is a basic human process, like eating or walking. I don't think I would feel fully human if I had to rely on such a system as a creative partner.

Image generated by computer from a text prompt

Is this a good thing or is it going to have a negative impact on creators?

There's a lot of good about generative AI in art. For one thing, there has been a huge surge in appreciation for surrealism and fantasy, because that's what it does best. I've seen a lot of images that are new and exciting. The potentials of the medium are almost limitless. But it’s also scary and threatening, especially to digital artists who are part of corporate entertainment pipelines, such as concept artists.

Many artists are concerned about Generative AI “scraping” copyrighted images in their training data. Others worry about it displacing artists. Some people are trying to stop it, or at least shape it.

I'm not worried about either of those issues because AI art for the most part alters and transforms its source material, just as humans do. An artist’s style can’t and shouldn’t be copyrightable.

I don’t feel threatened by AI Art. There's no way it can take away my livelihood because of where I'm positioned as a painter in gouache. Mostly what I do is share behind-the-scenes demos of plein-air sketching on YouTube. No way can AI ever replace that.

I've been in the art business for almost 45 years, through a lot of technological changes. My career has had plenty of highs and lows during that span and I've had to reinvent myself several times. I am concerned not only for emerging artists, but also for almost all digital artists (top of their game or not, it doesn't matter).

What I try to keep in mind is that AI represents both a threat and an opportunity. If we focus only on the threat, it may kill our careers. If we focus on the opportunity, we might end up doing fine.

I would caution the alarmists to remember that applying some sort of digital rights management mechanism could have a chilling effect on the growth of this new art form, and end up helping the powerful entertainment corporations more than the little guys.

Looking a little further down the road into the future, I don't see any easy way around the pitfalls of potential misinformation, deep-faking, erosion of shared reality, and content moderation. Should we let people use these systems to create whatever they want? Should we ask the generative models to edit “dangerous” ideas at the level of generation or distribution? Who defines what is dangerous? Should the government define the limits of what we're allowed to imagine or should private companies do so? China is passing new laws requiring watermarking all AI works. Should other countries do that? What are the political ramifications? These are big societal questions.

The biggest negative impact from my point of view is an erosion of human hand-eye skills and a weakening of the artist's confidence in his or her own imagination. Just as the industrial revolution and the invention of the motorcar made us lazier and less physically active, the development of push-button creativity and synthetic writing partners will make us stupider, and dull our ability to dream. These are core human values, and the threat is very real.

Those concerns were at the forefront among my YouTube followers:

2. What are your most pressing concerns about the platforms?

By "platforms" I assume you mean generative AI models, such as Midjourney, Dall-E 2, and Stable Diffusion. The ease of creating images means that there has been such a cascade of computer-generated images and videos that they've glutted many arenas of the art world: portfolio sites, art print markets, contests, and freelance illustration. This leads to a cheapening of all the images, a dulling of appreciation, and a confusion about what's really created by humans. How do you run an art school when the students can solve any assignment with the push of a button?

People who worry about the threat of AI art focus on the production side, but we shouldn't forget about the distribution side. How will people consume these images, films, and writing? Remember that the prompt doesn't have to come only from the artist, creator or director. It can also come directly from consumers, acting as individuals or as fan groups.

Imagine an AI video-generator with personal biofeedback input that creates hallucinatory music or video to suit your current mood. Or let's suppose you have online groups who love Japanese anime style. Their preferences can shape the generation of every video they see. They can watch custom editions of their favorite media or games output with their favorite anime style. Or they can watch shorter or longer versions of films, or films customized for whatever language they speak.

If the viewer and the generative model drive the entire creative process, there's really no need for an "artist" or "director" at all. The idea that a work of art or film exists fixed in a single form at the moment of creation and then finds its audience may be a quaint idea that many of us will outgrow as a culture.

The other big concern is that all these models are based on existing, previously manifested styles of art and photography rather than on the direct human response to nature itself. Therefore their output is, by definition, "mannerist." In terms of Plato's Allegory of the Cave, we're talking about the realm of firelit shadows on the cave wall, not the light of the sun itself. Maybe someone will invent ambulatory robots exploring the real world, but without such empirical input, this will always be a profound limitation.

Image generated by computer from a text prompt

3. How will you be using Generative AI platforms? 

I won't condemn AI or vow never to use it, and I respect artists who choose to ally with computers to make art. This all didn't arrive out of the blue. Digital artists have been automating parts of their creative process for many years now, with 3D rendering, ray tracing, procedural effects, photobashing, etc. AI models have just moved the needle way beyond where it was.

Now, for anyone to succeed with generative AI, there's no halfway. They need to join the arms race, learn coding, train their models, learn about the secretive alchemy of prompt writing, and cultivate their awareness of past art styles.

It's a very different path from painting, drawing, and animating with physical materials. We all have to make our peace with the digital sphere. Even classical musicians playing original instruments use digital recording techniques and social media, and read their music off iPads in concerts.

4. Where is all this headed? Please provide examples.

AI art draws upon the collective unconscious of human expression to generate something new. Laws will be passed and new businesses will be formed that will shape the evolution. We will come to enjoy art forms that we can't currently imagine. Those forms may take market share away from traditional forms, just as the introduction of television took away from magazines and movies.

Because of the lack of friction to create these new art forms, there will be a lot of derivative junk out there. But let's assume we can develop algorithmic sorting techniques to allow the truly great stuff to surface. The art we'll see as a result of this technology will be surprising and fascinating. It will reflect many inputs: the prompter, the design of the generative model, the zeitgeist of the audience, and the immensity of the dataset it draws from.

But as you think about generative computer models, don't forget that there's going to be a healthy backlash to the idea of handing off human creativity to computers. That cultural countermovement hasn't coalesced yet, but it will, and it will be powerful. What will this movement be called: the Human Agency? Hand-Eyes? H&H (Hearts and Hands)? Authentics? Analoggers? As David Sax has argued, the future is analog, or at least a healthy portion of it will be.

The invention of the internet a few decades ago was a boon for knitting and hand-lettering. In a similar way, the growth of AI and the decline of social media will be a powerful stimulus for such uber-traditional forms as face-to-face storytelling, on-location sketching, and recitation. Whatever it ends up being called, I plan to be part of that cultural countermovement.

--James Gurney
Revised slightly in 2024 to clarify a few points.


nuum said...

Hand-eye coordination in the drawing process is very important for me to really feel like I'm creating art.

The ability to create a prompt that generates beautiful AI works will not make me experience the pleasure I get when I draw.

I don't know what will happen to the young people who are now entering art academies and will not have the chance to develop this very basic skill.

Your description of the situation we're in now with regard to this new technology was accurate, James.


Unknown said...
Thanks for keeping us less adventurous folks up to date on this NEW World. FASCINATING! My digital work is really just a direct extension of the physical process that I learned in the past and still use, so I have never felt uncomfortable. We know that through the ages, adding the latest thinking, tools, and techniques has been the hallmark of art, and there have always been objectors along the way! We still question Vermeer about his camera-lucida, or "lazy lucy". How do we draw the line? Even without the AI generation, I have been itching to work the analog more-- but gee all of those sheets of paper pile up under your bed!

Wes McBride said...

Thanks for posting the entire interview. The quote seems a bit out of context.

Recently, I had a conversation about generative art with a well-known and influential entrepreneur-tech blogger. He thought of all artists as resistant to advances in technology (at least since photography). In his view, successful artists tend to hold onto their position and actively resist any change.

Since that conversation, I've been noticing the same perception about artists among many coders and venture capitalists working in AI. They might be parroting the blogger I mentioned, or it might be a cultural attitude in the field. (Mind, I'm not personal friends with any of these people. Just reading what they post publicly. So, I might be getting a warped view of things.)

So, thank you for clearly demonstrating how natural it is for an artist to be open to technology, while at the same time committed to traditional methods and tools.

A few of my former students were recently let go from their jobs in the video game industry. Officially, it wasn't because of AI, but unofficially, automation does seem to be the reason they are no longer needed. They are young, and they have time to recover from the loss. But I'd be lying if I said I wasn't upset on their behalf.

So another thing I appreciate is your rationality. You haven't avoided talking about concerns, but you've laid out a landscape that still has some hope in it.

Actually, a lot of hope.

Kessie said...

Thanks for this balanced, rational opinion! I've seen so many people shrieking that AI generated art is the end of artists as we know them, so I've been wondering what you thought of it all. So far I've only used the tools as kind of a personalized image search. As you called it, the alchemy of prompt writing still eludes me. But I think it's an interesting resource, and I hope it doesn't get closed down by regulations too quickly.

Joel Fletcher said...

Wow, what a great insightful interview! Having looked deeply into this AI generated image phenomenon, I agree 100% with your take on this matter James.

Something that occurs to me is that there is already a huge amount of generated art on the internet. Considering that the AI is trained on scraped content from the web, that would mean that it will be training on generated content too. Essentially training on itself in addition to human-made content. I would think that would muddy and corrupt the outcome eventually.

Coffsartist said...

An interesting article. Although I have quite a few years of Photoshop etc up my sleeve I prefer using traditional painting and drawing methods to make images.
I like the "hand of the artist" being apparent in the work.
On the 'backlash' theory, a book was recently published entitled "The Future Is Analog: How to Create a More Human World"
by author David Sax. Haven't read it yet but it sounds interesting.

arenhaus said...

Some points for perspective, James:

1. Calling it "generative" is becoming misleading. It can only generate something by mixing what has been fed into it with random noise, in various combinations. In fact, the way it is built, it takes special effort to prevent it from simply reproducing the images that had been fed into it; that's what it does "naturally", and even when the programmers have taken special effort to force it to prefer states "in between" rather than states that correspond to source images, it often still closely reproduces the source images. There are plenty of examples of that, and the only reason it is not blatantly easy to spot is that there are millions of images in the starting set, so no human being would recognize each one on sight. Still, artists keep noticing parts or whole artworks of theirs in the "generated" output all the time. It happened more often before, happens less now - the programmers are keeping busy obfuscating the fact, and more and more images keep being fed into the NNs, making the set bigger and the copying harder to spot.

2. You wrote "AI art draws upon the collective unconscious of human expression to generate something new". Which one wished were so. What it actually does is: you feed a big database of tagged images into it, which are treated as point sets and converted to network node weights. Then you can give it a prompt that uses some tags, and the network reaches a state in which the nodes with more weight for that tag set activate, producing another point set - more or less near the original one. In the simplest case, and on a small starting set, you get a rather faithful reproduction of one of the original images, with the imprecisions distorting it, like the distortion via a wavy glass plate. In that way it behaves like a lossy compression algorithm - think of MP3; the MP3 file is nothing like the digitized sound waveform recorded on the CD, but it still can be decoded into a waveform, with more or less distortion but still very similar (to our hearing) to the original. If it does not hit a state with the clearly recorded image - it combines multiple images or produces noise. So they feed it millions of images and make it prefer inexact hits, so there is less chance you get a reproduction of a source image from it, but something in between. That's the extent of its "generation", and there is nothing "subconscious" about it. It makes noise-based collages from distorted pieces of the starting set. Start with another set, you get different images, but still variations on the starting set.


arenhaus said...

3. And here is the real problem. These "AI" startups all started with the pre-assembled tagged image databases compiled for academic use, most prominently LAION-5B. These databases contain innumerable pictures scraped off the internet, with no credit to the artists, no consent from the artist or copyright owner, etc. etc. For the intended academic use that can be viewed as "fair use". But these guys used them commercially, and kept feeding them scraped copyrighted work - in fact, with some artists leeched so extensively that the NN output for a while was mostly rehashes of their works. "AI art" is not art, it is art theft on an unprecedented scale, and not by dumb NNs - by people who feed them other people's stolen art by the millions. And the way the NNs are built - there is literally no way to get credit, or to opt out, or make them forget your copyrighted images. The only way it can be solved would be making them start from zero and only use images released to them for this purpose. Guess how likely that would be.

4. The starter sets are not the only problematic source involved. The recent flashmob that bombed the ArtStation front page with "No AI" signs resulted in at least one NN happily beginning to spew images spangled with pieces of black/red/white band. Which clearly revealed that 1) the thing maps the image in small square blocks and 2) the guys who run it are obviously actively scraping the internet for more and more images. They started out with art theft en masse, and are still busy with art theft en masse.

5. All this means that 1) any "originality" you see in "AI art" is due to the human brain being overwhelmed by the sheer amount of rehashed source images - virtually every meaningful bit in it is borrowed, and 2) you cannot ever be sure that any image it produces contains no stolen copyrighted material. Actually, the way these were trained ensures that there will be virtually no images produced that do NOT contain stolen copyrighted material. Which is an art director's nightmare in the making.

All this does not even mention the fact that these "AI bros" are taking the collective output of all artists available online, without consent, and are happily using it to profiteer - from clueless investors and clueless subscribers - all the while rapidly eroding the very environment that this art grew in. We are seeing a "tragedy of the commons" in action, with not a small dose of "sheep eating people" thrown in. A wave of irreparably inferior surrogate trying to supplant quality product and the very industry that produces it. And the crowd gathering around "AI art" is collecting an astonishing parcel of aggressive people with toxic notions, who attack artists for protesting the abuse of their work, to add insult to the injury. It is rapidly developing into a terrifying mess.

M. said...

I'm eager to see and be part of the countermovement that springs from this


Unknown said...

From what I can tell, there are two simple rules that would satisfy the majority of visual artists' complaints and fears about this application of AI, without hobbling the AI to an unreasonable degree:

"the AI should not be trained using an artist's copyrighted work (without explicit permission)"


"the AI should not accept an artist's working name in prompts (without explicit permission)"

Even one of these rules may be enough, though I think the first is more important. Even with these rules in place, the AI would have access to a very large amount of public domain artwork and photographs to be satisfactorily trained on; it just wouldn't make mimicking a currently working artist trivial.

Do you think these theoretical rules are draconian towards AI users or have the potential to backfire against small-time artists?

arenhaus said...

To "Unknown"/dfortune2: I wish people stopped propagating this tired myth. It is simply highly unlikely to be true, does not match historical context, is contrary to the way Vermeer composed his paintings as evident in X-ray photos, and it was started by a hack artist with a case of really sour green grapes.

You've just aggravated it by mentioning camera lucida in connection with Vermeer, too. Camera lucida, unlike camera obscura, is actually useful in drawing (but virtually useless for painting), but it was invented in the 19th century, two hundred years after Vermeer died.

Unknown said...

Thank you for sharing your thoughts on this very controversial topic! I'm honestly torn on this issue and have flip-flopped so many times I'm beginning to feel like a yo-yo. I think it could be a great tool during the ideation phase and would love to use it in that way, but wouldn't want it to create the whole final artwork for me. I've experimented with at least ten different AI art generators, and, while I am fascinated with the results they can give, the sense of accomplishment that you feel after actually creating something with your hands (be it through traditional or digital media) just isn't the same. I also haven't been able to shake off the legal and ethical concerns about how the AI art tech works (though that could just be because there are so many emotions flying about right now it's hard not to get caught up in them). I've tried to keep a level head about it and have been closely following all the arguments both for and against it and have heard things on both sides that make sense to me. For right now though, I've chosen to back off from using AI in my art until things have settled down more and there is more information about how these systems do or do not actually work. I'm also highly leery of government getting involved, but I do really hope that, somehow, a good solution to the current mess can be found for all parties involved.

llawrence said...

Mr. Gurney, thank you for providing a calm voice of reason in these histrionic times.

Lynnwood said...

Just saying... as an artist or just as a human being, A.I. is simply the least interesting subject that I can think... or not think of 😶

Bob said...

Hi James,
Thank you for such an informative blog post. Just this week, a member on the Dinotopia Message Board posted a Midjourney image of Dinotopian law enforcement -- the first time ever I've seen such a thing. This image showing a human and a saurian guard appears surprisingly plausible, at least to me. Since Midjourney "knows" what Dinotopia is and might look like, apparently its training includes some of your artwork! This is quite fascinating stuff. Your blog post helped me understand the pluses and minuses of AI art.