If you’d told me two years ago that I’d soon be able to type a short caption and get back fully-formed images of whatever I wanted in seconds, I wouldn’t have believed you. After all, how could a computer possibly learn to assemble shapes, lines, and colors into a visually compelling image without step-by-step instructions from a human artist? Yet this is precisely what is happening millions of times every day with the help of text-to-image generative artificial intelligence (TTIG AI) programs. Since the release of various TTIG AI models in 2022, such technology has garnered widespread attention across the internet in the form of both enthusiastic support and harsh criticism, but it’s caused significant backlash in the visual art community specifically. The heated discourse surrounding TTIG AI models stems from the direct challenge that AI poses to the identity of human artists as the central agent in the process of artistic creation. By disrupting the historically close relationship between art and artist, generative AI models also call into question the perceived importance of humanity to art, ultimately demanding careful reconsideration of how all humans, not just artists, attempt to define themselves.
Though visual generative AI is not a new concept, the recent explosion in high-quality text-to-image generators has made it more convenient than ever to create polished images on par with those made by human artists. A variety of powerful programs such as Midjourney, Stable Diffusion, DALL·E 2, and Disco Diffusion were made available to the public in 2021 and 2022, which allow millions of people to write simple descriptions like “an astronaut riding a horse in photorealistic style” and create fully-formed images like the one below in less than a minute (@midjourney; “Stable Diffusion”; “DALL·E Now Available”; “Disco Diffusion”).
Fig. 1. Images generated by DALL·E 2 with the prompt “an astronaut riding a horse in photorealistic style.” From “DALL·E 2,” OpenAI, openai.com/dall-e-2.
No additional expertise or specialized equipment is necessary; people simply have to navigate to the company website or download an app on their personal devices, which makes these TTIG programs incredibly easy to use. The convenience the programs offer paired with their admittedly fantastic visual results is, in fact, a point of pride cited by many of the companies behind them. Stability AI, for example, characterizes its model as a tool that empowers people to make “stunning” images “within seconds,” and on the page for the recently launched DALL·E 3 model, OpenAI emphasizes how “easily” users can “translate [their] ideas into exceptionally accurate images” (“DALL·E 3”; “Stable Diffusion”). These claims of high quality are not just good marketing, either. A survey of 504 Yale undergraduate students found that they could only accurately distinguish between human-made and AI-generated images about 54% of the time, with certain images confusing up to 86% of respondents (Yup). I’m not immune to confusion myself, even as a digital artist; when scrolling through my Instagram feed, I’ve frequently clicked on posts that I thought were drawn digitally, only to see #aiart or related tags in the caption. To be fair, AI generators still have room for improvement. They notoriously struggle with generating hands and fine details like text, for example, and they also tend to lack consistency within an image (Yup; McCormack). However, it’s clear that upon first glance, and sometimes even upon inspection, these generated images can hold their own against human-made art. For the first time, in other words, AI-generated images have attained a high enough aesthetic and creative value to allow them to be reasonably called “art,” placing them in competition with human-made images and consequently calling into question just how necessary human artists are in the creative process.
But while human-drawn images could compete with AI-generated ones in terms of quality, when it comes to the time and effort needed to create art, humans fall far short of the speed and effortlessness of AI. Traditional art not only requires the time needed to procure and prepare all the necessary materials but also often demands hours upon hours of work, with multiple time-consuming steps like preparing the canvas, laying down a sketch, properly mixing the paint, and more, depending on the exact medium being used. But even digital art—which involves hosts of useful tools like transformations and separate layers that improve artists’ efficiency and workflow—takes multiple hours and a considerable amount of skill to create. For example, as a digital artist myself, I’ve never been able to finish a single drawing in under five hours due to the time I spend sketching, laying down base colors, rendering everything, adding a background, and finally polishing the result. Of course, time spent on drawing does vary between artists and between pieces; while one artist I follow tends to spend only two to four hours on drawings, another artist consistently publishes pieces that take her over ten hours total, once spending more than fifty-seven hours on a single work of art (Loane; Ji, “fan art”). None, however, get anywhere close to the one or two minutes that AI generation takes. Add on the years that many artists spend learning fundamental skills like perspective, anatomy, scene composition, and color theory (among multiple other topics), and the total effort invested into drawing, even just digitally, jumps even further past the effort required to use TTIG AI.
Since this stark difference in the effort necessary to create an image with or without TTIG AI doesn’t correspond to much of a difference in output quality, it logically follows that using generative AI to do artists’ work would be much more efficient. This leads to one of the largest fears in the art community: that many artists will be replaced, or at the very least displaced, in favor of TTIG programs. A peer-reviewed study of Finnish video game industry creatives, for instance, found that the majority of them were concerned about potential job loss due to AI. One professional interviewed even went so far as to say that “every single artist [he’s] been talking to about this” has also shared his worries over losing their jobs (Vimpari). These fears are not merely speculative, either. Take freelance illustrator Zhang Wei as an example; he was originally contracted to create sixty-five character sketches for a company, but just days after sending over his first draft (which received a positive response), he was replaced by an AI model (Cheung). Similarly, Greg Rutkowski, a digital artist whose name has been used millions of times in AI prompts to generate works in his unique style, has noticed that he has recently received “far fewer” cover illustration requests from first-time authors than before (Hill). Though the art community extends far beyond just those who use it to make a living, the replacement of professional artists with AI sends a clear message to all artists, not only those whose livelihoods may be directly affected: artists and the skillset they provide are no longer entirely necessary to make art. That is to say, these TTIG AI models disrupt the previously almost inseparable relationship between artists and the creation of art, challenging the centrality of human artists to the very activity that defines their identity as artists in the first place.
Admittedly, TTIG models don’t always have to be in competition with artists. If used as a tool, AI could potentially enhance artists’ productivity and creativity, thus supporting instead of challenging their role in the creation of art. It is true that many artists have expressed optimism about how these programs could help them find inspiration or quickly try out several iterations of an idea (Vimpari). However, the use of these AI programs is likely not going to remain limited to just ideation and conceptual work, especially since they are capable of generating nearly finished products. After all, to fully take advantage of the efficiency these AI models provide, it would make the most sense to either replace artists entirely with AI, or have artists input prompts and then make final edits to AI-generated images rather than spending hours creating a drawing from scratch. This more liberal use of AI in the creative process is a reality that’s already becoming apparent with how nearly half of Finnish video game industry professionals interviewed said they used AI to make “production-ready art,” not to mention how two lead artists estimated that up to 80 or 90% of a final piece could be AI-generated (Vimpari). In this way, though artists may not be completely pushed out by AI, their role in the process of artistic creation is still at risk of being greatly reduced. As one narrative designer put it, artists will potentially be diminished to being merely an “assistant to the AI,” someone relegated to fixing errors in an AI-generated piece instead of being the central decision maker in creating a work of art (Vimpari). Control, in other words, would mostly be handed over to the AI, undermining both artists’ importance and agency within the creative process.
Because agency is such an essential and defining part of creating art for many artists, this reduction in control as a result of TTIG AI ultimately has the potential to diminish their identity as artists. In fact, many people categorically refuse to acknowledge those who use TTIG AI as true “artists,” in large part because of the perceived lack of agency that the human user has in creating the final product. As a case in point, several users on a Reddit community dedicated to digital art called it “annoying,” “unfair,” and even “outright false” when people who use TTIG AI try to claim the work is produced by them. One user commented that “I would never try to pass [AI-generated images] off as my own creation” because “all I did was tell something else what to do,” comparing it to how her manager couldn’t claim that they made an iced latte if all they did was tell her to make it (u/ValgosStygiansson et al.). Along the same lines, a reel with more than 1.5 million views and 235,000 likes on Instagram declared that “calling yourself [an] AI artist is almost exactly [like] calling yourself a chef for heating readymade food in a microwave” (Ji, video). The common reasoning behind these comments is that people who use TTIG AI programs don’t truly engage with or direct the process of creating the final image beyond the bare minimum of providing a description. In the end, it’s another agent that is determining what colors to put on the canvas, what the composition of the scene will look like, what details will be incorporated and how it’ll be done. That is to say, it’s another agent that performs the part of the artist. Thus, by potentially reducing the role that human artists play to simply giving textual descriptions and fixing output, TTIG AI models would force artists to surrender the very part of the process that, for many, defines who they are.
Yet this challenge to identity that generative AI has sparked is not confined solely to the artist community. TTIG models and the resulting discourse around them have exposed a fundamental assumption that underlies how people perceive their own humanity in relation to art: namely, that the two concepts are inextricably linked, each defining and being defined by the other. Art is often seen as something distinctly and uniquely human, born of the soul, emotions, and the lived experiences of the creator (Millet). Many artists like successful illustrator Karla Ortiz have expressed that their work is their “life” and their “identity,” and others have stressed the necessity of the human experience in being able to create art in the first place; several users on Reddit and Instagram, for example, have said in no uncertain terms that “art is human expression captured in imagery” and that art is “a field specifically inclined to senses and feelings,” with “the process and emotions” behind an image informing its value (Hill; u/arifterdarkly; Romanny; Glistica). This common belief that humanity is essential to the creation of art implies, in turn, that only humans can make art, which contributes to the designation of art as an act inherently special to humankind. In fact, some artists like illustrator Kelly McKernan have gone so far as to say that artistic creation is “what makes people human,” highlighting how art is frequently tied to the very “essence of being human” (Noveck; Millet). However, now that AI (arguably devoid of a soul, feelings, and human experience) can generate works that are nearly indistinguishable from those of human artists, the pervasive idea that humanity is somehow instrumental in creating art, and that art is therefore a hallmark of humanity, has been called into question.
Though many people continue to insist on this anthropocentric view of art, it’s undeniable that TTIG AI models have already deeply unsettled how humans attempt to define themselves through activities like artistic creation. The disturbance TTIG AI models pose to people’s worldviews would, according to psychological research, lead to a bias against AI-generated work, which is a phenomenon that has in fact been repeatedly observed. Indeed, one peer-reviewed study found that across four experiments and two different mediums, participants consistently rated works labeled as “AI-generated” lower in terms of experienced awe than those labeled as “human made,” regardless of the actual method of generation. Furthermore, those with stronger anthropocentric beliefs about the uniqueness of human creativity showed a stronger negative bias towards AI art, suggesting that it was, in fact, the perceived challenge to their viewpoints that resulted in this bias (Millet). Outside of the experimental setting, too, many artists have responded to the psychological threat of TTIG AI by refusing to consider computer-generated images as “true” art, thereby preserving their anthropocentric mentality. Famous songwriter Nick Cave, for instance, once called ChatGPT-generated lyrics a “grotesque mockery of what it is to be human” and “replication as travesty,” citing the AI’s inability to feel or “have an authentic human experience” as the reason why it could never create a “genuine song” (Cave). Echoing this thought, award-winning filmmaker Guillermo del Toro stated in an interview that “AI can interpolate information but it can never draw,” as it cannot “capture a feeling or a countenance or the softness of a human face” (Sharf). Clearly, TTIG AI models have upended the widely held belief that humanity can lay special claim to artistic creation.
Since art is seen as inherently human and thus not replicable by a mere machine (unlike more tedious or analytical tasks like solving equations, which already have been automated), the rise of TTIG AI models represents the “breaching of one of the last bastions” of human uniqueness (Millet). As a result, such generative AI models push us to reconsider how, exactly, we can define and differentiate ourselves from what surrounds us, and ultimately lead us to wonder whether we can even do so at all.
As with many of the implications raised by TTIG AI, that question has no clear answer. What is clear, though, is that new generative AI models are directly challenging the current importance that’s placed on humans, and on humanity, in the creative process. In potentially taking over work that has traditionally been reserved for human artists, TTIG AI not only disrupts a key part of artists’ identities but also upsets the common understanding of humans as uniquely artistic, which is a cornerstone of human identity as a whole. And though the disturbances caused by TTIG AI may not be resolved anytime soon, the larger societal shift towards AI and automation in our everyday lives means that similar questions raised by such technologies will become increasingly hard to ignore. As we look to a future where the lines between human and machine capabilities blur even further, leading to the inexorable displacement of what once was considered human, we will eventually have to confront several pressing questions to which we have yet to find a satisfying answer: what is the role humans will play in the world around us? What is it that makes us different from the technology we’ve made? And what, in the end, does it mean to be human?