12 votes

Tool assisted speedrun: Chef Stef's NES Arkanoid "warpless" in 11:11.18

6 comments

  1. an_angry_tiger
    Link

    I've always been intrigued by brute forcing as an optimization strategy and tried to find a game where it might be possible without spending multiple lifetimes finishing it. After some research, I decided that NES Arkanoid was a good candidate.

    At first glance, brute-forcing an 11-minute TAS might seem completely impossible, with (2^8)^(60 * 60 * 11) possible input sequences to evaluate. But that assumes we actually want to try every combination of inputs; if we encode the rules of the game into the bot and don't bother looking for things like glitches or ACE exploits, we can actually get this into the realm of possibility. The input surface of the game is quite small: you only ever press left, right, or A, and never more than one of them at once. There also aren't all that many ways to bounce the ball around.
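
    (To put rough numbers on that estimate: the sketch below is mine, not the write-up's. It compares the naive bound of 2^8 controller states per frame against the reduced left/right/A/nothing alphabet. Both counts stay astronomical, which is why encoding the game's rules into the bot matters far more than shrinking the button set.)

    ```cpp
    // Illustrative only: order-of-magnitude size of the input search space.
    #include <cmath>
    #include <cstdio>

    int main() {
        const double frames = 60.0 * 60.0 * 11.0;          // ~11 minutes at 60 fps
        // Naive bound: 2^8 possible controller states per frame.
        const double naive_log10   = frames * 8.0 * std::log10(2.0);
        // Reduced alphabet: left, right, A, or nothing each frame.
        const double reduced_log10 = frames * std::log10(4.0);
        std::printf("naive:   ~10^%.0f input sequences\n", naive_log10);
        std::printf("reduced: ~10^%.0f input sequences\n", reduced_log10);
        return 0;
    }
    ```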

    ...

    The bottleneck was emulation speed. Could I improve performance within the emulator? I found a few settings (like disabling the display) that gave marginal improvements. But I was still a long way from the goal.

    And then I realized something - what if I could simulate the game state instead of emulating it? There's a lot of overhead in running BizHawk, running the Lua engine, and emulating the NES internals that isn't necessary if we're just trying to make a ball bounce around. Could I instead write an optimized, bare-bones program that mimicked Arkanoid's mechanics and then run my brute force bot on that?

    So I went back to the drawing board. I disassembled the game, analyzed its logic, pulled out the routines that mattered for gameplay, and rewrote them all in C++. Logic for graphics, sound, and such could be omitted since they don't affect the physics or any outcomes in the game.
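
    (The write-up doesn't reproduce its engine code, so the following is only a toy sketch of what "simulate instead of emulate" can look like; the struct fields, constants, and collision rules here are made up, not pulled from the Arkanoid disassembly.)

    ```cpp
    // Toy sketch: advance one frame of a bare-bones paddle/ball simulation with
    // plain integer math, instead of emulating the NES CPU, PPU, and APU.
    #include <cstdint>

    enum class Input : std::uint8_t { None, Left, Right, A };

    struct GameState {
        std::int16_t paddle_x = 120;
        std::int16_t ball_x = 128, ball_y = 100;
        std::int16_t ball_dx = 1,  ball_dy = -1;
    };

    inline void step(GameState& s, Input in) {
        if (in == Input::Left)  s.paddle_x -= 3;   // made-up paddle speed
        if (in == Input::Right) s.paddle_x += 3;

        s.ball_x += s.ball_dx;
        s.ball_y += s.ball_dy;

        // Made-up wall and paddle bounces; the real routines would come straight
        // from the disassembled gameplay code.
        if (s.ball_x <= 16 || s.ball_x >= 240) s.ball_dx = -s.ball_dx;
        if (s.ball_y <= 16)                    s.ball_dy = -s.ball_dy;
        if (s.ball_y >= 200 && s.ball_x >= s.paddle_x && s.ball_x <= s.paddle_x + 32)
            s.ball_dy = -s.ball_dy;
    }
    ```

    A step like this is just a handful of integer operations, so millions of candidate frames per second become plausible, which is the whole point of dropping the emulator.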

    With the replicated game engine in hand, the next step was proving it was equivalent to the original. I constructed a test harness that fed Baxter's TAS inputs into the engine and compared memory values against what BizHawk showed for any divergence. This uncovered many small issues and inconsistencies that I painstakingly fixed. Eventually I'd resolved everything and could "play back" Baxter's TAS perfectly.
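
    (Again not the author's code, but reusing the toy GameState/Input/step from the sketch above, a verification harness along the lines described might look like this; FrameDump and its field names are hypothetical stand-ins for the RAM addresses actually compared.)

    ```cpp
    // Toy sketch: replay a known-good TAS through the reimplemented engine and
    // report the first frame whose state disagrees with per-frame RAM values
    // dumped from the emulator.
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct FrameDump { std::int16_t ball_x, ball_y, paddle_x; };  // read from NES RAM

    bool matches_emulator(const std::vector<Input>& tas_inputs,
                          const std::vector<FrameDump>& ram_dumps) {
        GameState s;
        for (std::size_t frame = 0; frame < tas_inputs.size(); ++frame) {
            step(s, tas_inputs[frame]);
            const FrameDump& ram = ram_dumps[frame];
            if (s.ball_x != ram.ball_x || s.ball_y != ram.ball_y ||
                s.paddle_x != ram.paddle_x) {
                std::printf("divergence at frame %zu\n", frame);
                return false;
            }
        }
        return true;  // the simulation "plays back" the TAS identically
    }
    ```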

    Wow, that's impressive: they recreated the game's logic so they could run fast simulations of it, and then ran every possible combination of inputs to determine what the fastest speedrun could be. They turned the game into a problem they could run a brute-force algorithm on. The people in the TAS community constantly blow my mind with the ingenuity, creativity, and dedication they put in to do something impressive just for the fun of it.

    7 votes
  2. bhrgunatha
    Link

    So much to enjoy about this write-up.

    I like that it explains not just the standard game mechanics but also the logic of enemy placement and the rewards. Having lost plenty of coins to Arkanoid, I can appreciate the game even more.

    2 votes
  3. nothis
    (edited)
    Link

    I just saw a clip of the DeepMind guys (AlphaGo) using this game (or maybe an older/non-NES version of it?) for testing their AI. It showed the AI learning the trick of trapping the ball above the bricks after only a couple of iterations. I wonder if it could beat manual optimizations. Probably not; all this brute-forcing seems to be absurdly in-depth and should have captured almost all solutions.

    Also: Arkanoid has a "story"? Cute!

    2 votes
    1. DataWraith
      (edited)
      Link Parent

      I wonder if it could beat manual optimizations.

      I doubt it. The vanilla DQN algorithm you're talking about, and especially the improvements developed in the 8 or so years since, do play better than a human who is given 2 hours to learn a game, but (a) until very recently not in all tested games, and (b) only after experiencing hundreds to thousands of hours of simulated gameplay per game.

      This is also qualitatively very different from brute-force optimization, which, if done right, will eventually find the optimal route (though it can take a very long time). DQNs are somewhat difficult to train reliably and come with no such guarantee. A human playing frame by frame in an editor, rather than in real time, will definitely outperform a DQN.

      There is, however, an algorithm that does the equivalent of a tool-assisted speedrun, called Go-Explore. It plays randomly, but whenever something good happens (a new high score, a new level reached, etc.) it remembers the actions it took. The best segments of play it discovers this way are then stitched together into an actual speedrun automatically.
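
      (A rough sketch of that archive/return/explore loop; Env here is a made-up interface, not any real library binding, and the actual Go-Explore paper also coarsens states into "cells" and typically returns to them by restoring an emulator savestate rather than replaying actions.)

      ```cpp
      // Rough sketch of a Go-Explore-style archive / return / explore loop.
      // Env is a hypothetical interface: reset(), step(a), random_action(),
      // score(), cell_key().
      #include <cstdlib>
      #include <iterator>
      #include <map>
      #include <vector>

      struct Entry { std::vector<int> actions; double score; };

      template <typename Env>
      void go_explore(Env& env, int iterations, int explore_steps) {
          std::map<int, Entry> archive;                 // best trajectory per "cell"
          env.reset();
          archive[env.cell_key()] = Entry{ {}, env.score() };

          for (int it = 0; it < iterations; ++it) {
              // 1. Pick an archived cell and return to it by replaying its actions.
              auto chosen = std::next(archive.begin(),
                                      std::rand() % static_cast<int>(archive.size()));
              env.reset();
              std::vector<int> actions = chosen->second.actions;
              for (int a : actions) env.step(a);

              // 2. Explore randomly from that point.
              for (int i = 0; i < explore_steps; ++i) {
                  const int a = env.random_action();
                  env.step(a);
                  actions.push_back(a);

                  // 3. Whenever something good happens (a new cell, or a known cell
                  //    reached with a better score), remember how we got there.
                  const int key = env.cell_key();
                  auto found = archive.find(key);
                  if (found == archive.end() || env.score() > found->second.score)
                      archive[key] = Entry{ actions, env.score() };
              }
          }
          // The best archived trajectories are what get stitched into a "speedrun".
      }
      ```

      In this toy version, "returning" means replaying the stored actions, which only works because the game is deterministic; that same determinism is what makes TAS-style brute force possible in the first place.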

      4 votes
    2. whbboyd
      Link Parent

      By definition, no: no optimization technique can produce a better result than brute force, since brute force is guaranteed to find a globally optimal result.

      However, if your brute-force approach would run for billions of years, machine learning approaches can probably yield an adequately good result in a tractable amount of time.

      (Note that the author mentions they did perform some heuristic optimizations in order to keep the search space tractable instead of "barely intractable". See the section "A Word on Optimality" in the posted write-up. It's possible that better results were discarded by those heuristics, though the ones they mention are quite conservative.)
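
      (A toy illustration of that trade-off; State, apply, finished, and prune are placeholders, not anything from the write-up. The optimality guarantee survives only if prune never rejects a branch that contains the true optimum.)

      ```cpp
      // Toy exhaustive search over per-frame inputs with an optional pruning
      // heuristic. With prune() always returning false this is pure brute force
      // and is guaranteed to find the shortest solution within the frame budget;
      // any heuristic that can cut a branch holding the optimum forfeits that.
      #include <initializer_list>

      enum class Input { None, Left, Right, A };

      struct State { int placeholder; };        // stand-in for the simulated game state

      // Placeholder hooks, not from the write-up:
      State apply(const State& s, Input in);    // advance one frame
      bool  finished(const State& s);           // level/game complete?
      bool  prune(const State& s);              // "not worth exploring further"

      // Fewest frames to finish from s, or budget + 1 if none found within budget.
      int search(const State& s, int budget) {
          if (finished(s)) return 0;
          if (budget == 0 || prune(s)) return budget + 1;

          int best = budget + 1;
          for (Input in : { Input::None, Input::Left, Input::Right, Input::A }) {
              const int frames = 1 + search(apply(s, in), budget - 1);
              if (frames < best) best = frames;
          }
          return best;
      }
      ```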

      2 votes
  4. riQQ
    Link

    This post is an interesting write-up about how to optimize game/controller input to minimize the time required to complete a game, i.e. a tool-assisted speedrun (TAS).
    In addition to the standard rules of the game being played, additional artificial limitations (such as "warpless" here) can be introduced.

    1 vote