Landmarks within a dream

Infinite Craft as the TodoMVC of virtual worlds.

Sep 16, 2024

Testing of non-deterministic AI output was made possible by the Selfie JVM testing library.

The Infinite Craft game by Neal Agarwal is a landmark achievement in successfully navigating the jagged frontier of AI capabilities.

When it launched on January 31, 2024, it was quickly posted to Hacker News, where it soared to over 1000 upvotes. Top comment:

I couldn't find any information but does this use some kind of LLM to derive the combinations from? It makes a request to the backend every time you combine items which sometimes takes >500ms, and also supports some really wild combinations that I highly doubt someone has taken the time to come up with. It would also explain why the icons are emoji's, it would be fairly trivial to ask ChatGPT to give you the result of Fire + Water and an accompanying emoji.

Keep in mind that this was just 10 months after GPT-4 was unveiled. Even in the heart of the hype wave, the experience was so smooth it was hard to believe it wasn’t fully hand-crafted.
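To make the commenter's guess concrete, here is a minimal Python sketch of the loop it describes: hand a pair of items to a model, get back a name and an emoji. Nothing here is confirmed to be how Infinite Craft works. `Crafter`, `combine`, and `fake_model` are hypothetical names, `fake_model` is a canned stand-in for a real LLM call, and the cache (so that the same pair always gives the same answer) is our own assumption.

```python
from typing import Callable, Dict, Tuple

Item = Tuple[str, str]  # (emoji, name)


class Crafter:
    """Hypothetical pairwise combiner; not Infinite Craft's actual backend."""

    def __init__(self, ask_model: Callable[[str, str], Item]):
        self._ask_model = ask_model  # stand-in for a call to a language model
        self._cache: Dict[Tuple[str, ...], Item] = {}
        self.model_calls = 0

    def combine(self, a: str, b: str) -> Item:
        # Sort the pair so that Fire + Water and Water + Fire share one entry.
        key = tuple(sorted((a, b)))
        if key not in self._cache:
            # Only pairs we have never seen cost a (slow) model call.
            self.model_calls += 1
            self._cache[key] = self._ask_model(*key)
        return self._cache[key]


def fake_model(a: str, b: str) -> Item:
    # Assumption: a real version would prompt an LLM for a name and an emoji.
    canned = {("Fire", "Water"): ("💨", "Steam")}
    return canned.get((a, b), ("❓", f"{a} {b}"))


crafter = Crafter(fake_model)
print(crafter.combine("Fire", "Water"))
print(crafter.combine("Water", "Fire"))  # served from the cache
print(crafter.model_calls)
```

In this sketch only the first request for a pair reaches the model, which would fit the occasional slow response the commenter noticed, though that is a guess on our part.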

The game has spawned a subreddit (r/infinitecraft), where 20k people share screenshots of the things they find in this combinatoric dreamworld. One poster managed to combine the four starting elements into every snake species in the United States. Another discovered that 🧑‍🚀 Neil Armstrong + 💉 Vasectomy = ✂️ One Small Snip for Man.

You never really know how something works until you’ve built one yourself, so we built a knockoff. Many knockoffs, actually! And each week on this blog (and X and Facebook), we’re going to share some of these with you to see what we can learn about language, language models, and what kinds of worlds are coming. Welcome to… The Wordiverse.

Whether we live in a simulation or not is a boring question. Whether or not we can build an interesting simulation to play around in? Obviously yes, thanks Nintendo! Can we build an interesting simulation out of only single words interacting one at a time? I would not have guessed that this could be interesting, but Infinite Craft is yet another demonstration of how surprising language and language models really are!

In the eight months since Infinite Craft was released, the jagged frontier of AI has advanced, and it’s now possible to simulate the screen and controls of a 3D game, entirely within a neural net called GameNGen.

It’s interesting to compare these two worlds, Infinite Craft versus GameNGen. Infinite Craft is made of cartoonishly simple atoms, and lets the user explore a fractal that escapes its starting four elements to wrap around the entire fraction of the human experience which has been captured by language. GameNGen provides a relatively rich atom - a 160x120 pixel image - but trades away the expansive world hinted at by language for the sealed tunnels of Doom. Let’s place these worlds into a 2x2, confined/open and simple/rich.

Note the circular isocomputes rippling from the origin. These represent the computation required to run a meaningful simulation in that region. In the top left nearest the origin, we have the simple bounded world of Boggle and Sudoku, where mere hand calculations will do. In January 2024, Infinite Craft proved that we had reached enough compute to simulate a meaningful infinite world made out of very simple elements. By September 2024, GameNGen showed that if we traded the limitless domain of Infinite Craft for a few confined tunnels, then we had the capacity to simulate the relationship between pairs of video frames (hundreds of thousands of bytes per element) instead of only pairs of words (tens of bytes per element).
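For a rough feel of that gap, here is a back-of-the-envelope calculation in Python. It assumes an uncompressed 160x120 frame at 3 bytes per pixel (24-bit RGB), which is our assumption rather than anything GameNGen specifies, and it uses one word pair from earlier in this post as the example.

```python
FRAME_WIDTH, FRAME_HEIGHT = 160, 120
BYTES_PER_PIXEL = 3  # assumption: uncompressed 24-bit RGB

# One frame, then a pair of frames (the "element" on the rich side).
frame_bytes = FRAME_WIDTH * FRAME_HEIGHT * BYTES_PER_PIXEL
frame_pair_bytes = 2 * frame_bytes

# A pair of words (the "element" on the simple side), measured as UTF-8.
word_pair = ("Neil Armstrong", "Vasectomy")
word_pair_bytes = sum(len(word.encode("utf-8")) for word in word_pair)

ratio = frame_pair_bytes // word_pair_bytes

print(frame_bytes)       # 57600
print(frame_pair_bytes)  # 115200
print(word_pair_bytes)   # 23
print(ratio)             # 5008
```

Under these assumptions a pair of frames is a bit over a hundred thousand bytes and a pair of words is a couple dozen, so each element is a few thousand times larger.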

At the bottom right, furthest from the low compute world of the past, we have the future. Meaningful simulations of rich elements in open domains. YouTube is full of “AI infinite zoom” videos which are proto-entries in this category; Spongebob’s Existential Crisis is a good sneak peek.

The key word here is meaningful. A consumer laptop can run a couple of Nintendo 64 emulators at once, then huck them on simulated hyperbolic trajectories out of the galaxy, and you have a trivial simulation of rich elements in an open domain. But nobody with any interest in their own life would be interested in watching the output of such a simulation for more than a few seconds. People have been seeking and sharing discoveries in the pairwise combinatorics of Infinite Craft for several months!

It’s because the world of language models is deeply entangled with the world that we live in. When you explore this entangled world and find something interesting, you can bring it back with you - hero’s journey style. “Hey barkeep, what did Neil Armstrong say when he got his vasectomy?” Time well spent!

The genius of Infinite Craft is that it gives you real stories to tell - its weakness is that it does not give you items to show. You can take a screenshot, but that’s not the same thing. I don’t want a picture of the fountain of youth, I want to drink the water myself. At best, the picture is an existence proof and clue; at worst a forgery pointing to a dead end. The deep promise we make in the toys of The Wordiverse is that you can keep the items you find, and share them with others:

Here are a couple of interesting things. Drop other good stuff in the comments below!

The other change you might notice is that Wordiverse generates the differences between items, not their merger. The role of differences is a rich topic, but we’ll save that for next week!
