Over the holidays I visited my 3- and 5-year-old nephews in Montreal. Being an uncle was fun, if exhausting, but the visit featured a surprising amount of AI.
One of my nephew's favourite things recently is the Ubisoft AI car trainer exhibit at the Montreal Science Centre. It lets you set up a few simple training parameters for an AI-driven car, such as length of training, how fast it drives, and how far it looks ahead, and then you can watch how it performs in a strikingly realistic simulation of downtown San Francisco. If you set it up with long training, medium speed, and looking ahead a great distance, it will drive around town quite well. Of course, my nephew discovered that it's much more fun to make it go very fast with little training and not looking ahead at all - this makes the car careen through the Mission District, knocking over fire hydrants and garbage cans and generally making a mess. After a few minutes of chaos, it calls you a "Public Menace":
It turns out "public menace" was a new phrase for my nephew, and it quickly became a favourite.
While hanging out with him, I often found myself turning to web image searches to show him things we were talking about, like Godzilla, Chinese dragons, and ninjas (he's a 5-year-old boy!). Then he started asking for pictures of things that didn't exist yet, such as "a menace that has three eyes and green fur". For work and research purposes, I'd recently signed up for OpenAI's monthly plan, which gives me access to their latest GPT-4 and Dall-E 3 models via their chat and app interfaces - so I decided to see how it could do with a 5-year-old's weird requests.
It turns out AI is perfect for realizing the random requests of a 5-year-old's imagination:
In between making paper airplanes, assembling Legos, and trying to get him to eat his actual lunch instead of just snacking on crackers, we ended up spending quite a bit of time deep-diving into his random thoughts and coming up with a lot of wacky art. As long as we avoided specific trademarked names like "Yoda", "Ninja Turtles", "Minions", and "Groot", ChatGPT/Dall-E was able to comply with nearly every request.
"Draw me a picture of a menace that looks like a Tyrannosaurus rex but coloured like a panda bear and with ears like a dog"
"Make a picture of a menace that is a mixture of a velociraptor and a duck"
"Make me a picture of a menace that is a mixture of a stegosaurus and a brontosaurus and a Tyrannosaurus rex and a triceratops and a wooly mammoth"
"How about a combination of a spider and a robot and a truck?"
These last few images might be the most 5-year-old boy things I've ever seen in my life, including when I was one myself. Personally, I preferred when things got more whimsical:
"How about a mixture of a toaster and a robot and a construction truck"
I especially love the toast popping out of the top in multiple directions.
My nephew is a big fan of the Disney+/Marvel "I Am Groot" animated shorts, which we watched nearly every day, and I managed to create some images in that direction while skirting explicit IP violations - we were both very fond of this wooden monkey, who became the subject of our own stories and spin-offs:
I even made a paper cut-out "puppet" of it for him - my drawing skills really improved after only a few days trying to keep a child entertained!
We eventually hit our transaction limit for the day, and even managed to befuddle Dall-E a few times with prompts that just returned errors. One morning he got fixated on the idea of various things swimming in coffee, which resulted in some of my favourite images of the whole visit:
I especially liked when things got extra surreal:
I was impressed by how well Dall-E managed to create images from some of the most abstract, Dadaist prompts. Here's "several rakes running away":
This led to a series of "running away" images, my favourite being this rather adorable one of milk bottles merrily fleeing a harried milkman:
We tried getting GPT-4 to write us a naptime story, and the results were, to be honest, pretty mediocre. Repeated requests mostly gave variations on the same somewhat meh structure. But the images were a hit.
GPT-4 and Dall-E 3 are among the most advanced examples of the current state of the art in AI, leveraging vast global computing resources to create amazing text and images on the fly that we could not even have imagined just a few years ago, accessed through high-speed internet on my powerful pocket computer. In many ways, this is the current peak of our technology, the culmination of everything that has come before in our civilization.
And it's a fun way to keep a child amused for a few hours.
It turns out "keeping kids amused" is actually a key use case: OpenAI's own website prominently features "concretize a child's wild imaginings" as an example of how to use ChatGPT/Dall-E 3:
There is a lot of debate right now about AI art and what it means for artists and creators, and even about what 'art' means. Personally, I find that much of the discussion echoes the debates in the early 1980s about electronic music, when drum machines, sequencers and the like were dismissed as not being "real music" and as a threat to "real musicians". Now there are entire genres dedicated to completely artificial, digitally generated sounds. I predict AI art will find its place as yet another set of tools for expression, requiring a different set of skills and perhaps a different way of experiencing it, but still in some way or another "art" (whatever that actually means).
In the meantime, it's still pretty fun for kids.