Google’s DeepMind has a strong track record of beating humans at board games, with the long-term aim of creating something that can be applied to a varied range of situations. In a new paper showing off its latest work, DeepMind describes how its AI program learned to play chess, along with shogi and Go.
With just 4 hours of training, DeepMind’s AlphaZero AI reached superhuman performance in chess. After 300k steps of “self-play reinforcement learning,” it outperformed Stockfish, the world’s best chess-playing program. It’s worth noting that the AI was given only the rules of chess; no game strategies were fed in.
The AlphaZero algorithm is a more generic version of the AlphaGo Zero algorithm that was used to play Go.
In a 100-game match against Stockfish, AlphaZero won 25 games as white, where it had the first-mover advantage, and 3 games as black. The remaining 72 games were drawn, and Stockfish wasn’t able to register a single win.
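The match result above can be worked through as a quick tally. This is only an illustrative sketch: the variable names are made up for this example, and the scoring of 1 point per win and half a point per draw is the usual chess convention, assumed here rather than taken from the paper.

```python
# Match tally from AlphaZero's perspective, using the figures reported above.
total_games = 100
wins_white = 25
wins_black = 3
losses = 0

wins = wins_white + wins_black
draws = total_games - wins - losses

# Assumed conventional chess scoring: win = 1, draw = 0.5, loss = 0.
alphazero_score = wins + 0.5 * draws
stockfish_score = losses + 0.5 * draws

print(f"Wins: {wins}, draws: {draws}, losses: {losses}")
print(f"Score: AlphaZero {alphazero_score} - Stockfish {stockfish_score}")
```

Under that assumed scoring, the draws account for most of the points on both sides, which is why the margin is narrower than the win count alone suggests.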
For the match, AlphaZero ran on a single machine with 4 TPUs. Stockfish played at its strongest skill level, using 64 threads and a 1GB hash.
Find more interesting technical details and game results in this paper.