In 10 years, gamers will use AI prompts to build what they play
To celebrate Polygon’s 10th anniversary, we’re rolling out a special issue: The Next 10, a consideration of what games and entertainment will become over the next decade from some of our favorite artists and writers. Here, Far Cry 2 and Watch Dogs: Legion creative director Clint Hocking looks at the future of AI-created games.
In the year 2032, we humans will have risen above our baser instincts, and the world will finally be at peace. There will be no more crime, no more violence, no more disagreement. We will be safe from dangerous toxins as there will be no more drugs or alcohol, and global warming and its associated problems will be long forgotten. Everything will be perfect until a dangerous supervillain named Phoenix escapes from cryo-prison and we have no choice but to also release a violent, insubordinate cop named Spartan from the same cryo-prison in order to save us all (don’t worry, he was innocent all along).
Of course, this is not what 2032 will look like — this is the plot of the 1993 movie Demolition Man, starring Sylvester Stallone, Wesley Snipes, and Sandra Bullock, and directed by Marco Brambilla. Unsurprisingly, the 2032 of the film looks nothing like what the actual 2032 will look like, though in fairness to the creators, that was probably not their goal. In 1993 it was unimaginable that today, someone sitting on a bus could tell their phone to play Demolition Man, and within five seconds the digitized film would be streaming from a data center, over cellular wireless networks, directly to their 5.8-inch HD touchscreen.
While this fact seems banal and almost insignificant to us, it is hard to overstate how much the chain of technologies that makes this possible has defined the entire structure of the world of 2022. The scale of the internet, data centers, fiber optic and wireless networks, the digitization of content, data compression and streaming, recommendation algorithms, machine learning, trillion-dollar corporations, media and telecommunications empires, and a supercomputer/shopping mall in everyone’s pocket — all of these things have been subsumed under the neoliberal imperative to keep our collapsing postindustrial economy on life support by sedating our entire species with the opiate of digital entertainment. So where does it all lead?
In 2032, someone sitting on a bus will be able to tell their phone to play Demolition Man, written and directed by Jordan Peele and starring Millie Bobby Brown as Spartan, Lil Nas X as Phoenix, and Tom Holland in the Sandra Bullock role. Machine learning will be able to analyze every word Jordan Peele has ever written while simultaneously examining his oeuvre as a director — looking not only at how to rewrite the 1993 script to make it more unsettling and odd, but also how to frame and light individual scenes and shots. 3D sets and locations will be generated, the stars will be aged to match the requirements of the script, and their performances and voices will all be synthesized, with licensing fees automatically dispatched to their bank accounts. The entire process of generating a two-hour film from a 40-year-old script as a pastiche of the works and performances of everyone involved will take several minutes, but just as with streaming services today, you’ll be able to start watching within a few seconds, with the film being generated on the fly in the cloud. The technologies that enable this future mostly exist now — they are just not robust or well-integrated enough yet. I believe we are close enough to this future today, and the technological foundation is well-enough established, that it is no longer a question of whether this will happen; it is merely a question of when. But the important question is, how will this new future transform us and our culture?
As a game developer, I sometimes feel that I have a front-row seat for this coming transformation. I already produce works that may be experienced in radically different ways by different members of the audience. This has always been the case with games; no two games of football or go are the same. In recent years, there have been many more attempts to offer this diversity of experience in stories as well. In Watch Dogs: Legion, we intentionally set out to make a game where we abdicated control over casting to the player: You could be anybody you wanted, and when you watched a cutscene, we had no idea who would be in it. Just as the generated Peele version of Demolition Man still has the same general plot as the 40-year-old original, we knew what each scene would be about and how the plot would move, but we didn’t know if you’d be playing Sylvester Stallone, Millie Bobby Brown, or an old lady named Helen.
It seems probable that as we refine our abilities to create in this new paradigm, and as machine learning and AI become more powerful and sophisticated, we will start to see these technologies used not just to generate linear content, but interactive content as well. Games like Flappy Bird consist of only a few hundred lines of code, and the speed with which people created clones of the game when it was released in 2013 was staggering. But computers are faster. It won’t be long before machine learning can be used to generate these kinds of simple games, and soon platform holders will release proprietary generators that allow users to create games on demand from prompts: “a side-scroller where I am an ostrich in a tuxedo trying to escape a robot uprising.” This may sound weird, and may be useless, but I also would not be surprised if this appeared tomorrow. And once it does, it will only get better.
Not far behind this coming capability to generate simple games will be the capability to generate complex games: “an open-world fantasy RPG in a steampunk version of early Napoleonic France.” Of course, these are extremely broad parameters (much broader than regenerating a film from a specified script), but nothing will stop you from being able to refine the parameters as you play: “make the progression more like Skyrim,” or “make the bosses more like Elden Ring.” The AI will also be learning about you and your preferences over time.
Of course, content this malleable raises the question: What is the difference between a generated film and a game? In the middle of Jordan Peele’s Demolition Man, could I not simply say “let me play now,” and immediately take control of Millie Bobby Brown as Officer Spartan? While I can’t tell you how to build this thing yet, I can tell you for sure that it will arrive as a simulation running a script, not as a linear sequence of authored images being rendered out one frame at a time. You will definitely be able to play it.
As we look at the rising tide of generated content that seems about to inundate our culture, it is easy to be fearful of a future where artists have been replaced by machines, but it is not so simple. There are two kinds of artists: creators who make nouns beautifully, and performers who do verbs beautifully (most artists are at least a bit of both). Over the last century, with the rise of broadcast media culture, the industrialization of entertainment, and centralized control over the economics of created artifacts, creators have benefited enormously compared to performers, but this is a historical anomaly. For the first 10,000-ish years of human culture, the balance was reversed, and the correction has been underway since Napster torpedoed the music industry below the waterline and the concept of albums intended to be listened to linearly on discs made of plastic stopped making sense.
In the future — potentially as soon as 2032 — the process of making digital nouns beautifully will be fully automated. This will not obviate our desire or compulsion to continue making digital nouns (though it may challenge our ability to do so profitably). Nor will it impact our desire to perform… to play. No computer need ever lose a game of chess or go against a human, but this does not prevent us from wanting to play, to play better, or to learn to play beautifully. As humans, we will always find beauty and artfulness in what other humans can do, and our fascination with delineating the boundaries of what we are, and what we can be, is inescapable.