Google's new algorithm AlphaZero learned three games without human intervention
Google's DeepMind has released a new version of its game-playing algorithm. Its predecessor, AlphaGo, was already a world champion at Go: it beat the world's strongest player 3:0 in May 2017. Now it is the turn of its successor, AlphaZero.
The algorithm topped the rankings in three disciplines: chess, Go and shogi (Japanese chess). AlphaZero trained on its own, without human intervention. The preliminary training games took up to 3 days, depending on the game. Using the principle of “reinforcement learning”, the algorithm played practice games against itself, rewarding itself for the moves that led to success. Because the only input was the basic rules of each game, the project's creators assert that AlphaZero is completely free of human ideas about tactics and strategy.
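The idea of learning from nothing but the rules can be illustrated with a toy sketch. This is not AlphaZero's method, which combines a deep neural network with tree search; it is a simple table-based learner, and the game is an assumption chosen for brevity: a pile of stones, each player removes 1, 2 or 3, and whoever takes the last stone wins. All names in the code (legal_moves, train, best_move) are hypothetical.

```python
import random
from collections import defaultdict

def legal_moves(stones):
    """The only knowledge given to the learner: the rules of the toy game."""
    return [m for m in (1, 2, 3) if m <= stones]

def train(episodes=50000, max_stones=10, epsilon=0.1, lr=0.05, seed=0):
    """Self-play with a value table (illustrative sketch, not AlphaZero itself)."""
    rng = random.Random(seed)
    # q[(stones, move)] estimates the outcome for the player making that move:
    # close to +1 if it tends to win, close to -1 if it tends to lose.
    q = defaultdict(float)
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)  # random start so every pile size is seen
        history = []
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # occasionally explore a random move
            else:
                move = max(moves, key=lambda m: q[(stones, m)])  # otherwise play the best known move
            history.append((stones, move))
            stones -= move
        # Whoever moved last took the final stone and won (+1).
        # Walking backwards, the sign flips because the two players alternate.
        reward = 1.0
        for state_action in reversed(history):
            q[state_action] += lr * (reward - q[state_action])
            reward = -reward
    return q

def best_move(q, stones):
    """Move the learned table currently rates highest for this pile size."""
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])

if __name__ == "__main__":
    table = train()
    for pile in (5, 6, 7):
        print(pile, "stones -> take", best_move(table, pile))
```

No strategy is programmed in: the same table plays both sides, and moves are reinforced only by the final result. With enough games it should come to prefer moves that leave the opponent a multiple of four stones, which is the known winning strategy for this game.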
This showed in its results. Over 1000 games in each discipline, AlphaZero came out ahead of its computer opponents: the programs Stockfish and Elmo, as well as the previous DeepMind version, AlphaGo Zero. Its worst result came when playing black in chess: there the algorithm won just 2% of the games, drew 97.2% and lost 0.8%. Its most convincing result was in shogi, where AlphaZero won 98.2% of its games as black.
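To put the chess figures in perspective, the win, draw and loss percentages can be folded into a single match score. The sketch below assumes conventional chess scoring (1 point for a win, half a point for a draw), which the article itself does not state; the function name expected_score is hypothetical.

```python
def expected_score(win_pct, draw_pct, loss_pct):
    """Average points per game, assuming 1 for a win and 0.5 for a draw."""
    total = win_pct + draw_pct + loss_pct
    if abs(total - 100.0) > 1e-6:
        raise ValueError("percentages must add up to 100")
    return (win_pct + 0.5 * draw_pct) / 100.0

# Figures quoted above for AlphaZero playing black in chess.
score_as_black = expected_score(2.0, 97.2, 0.8)
print(round(score_as_black, 3))
```

Under that assumption, even the weakest result quoted still comes out slightly above half a point per game.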
But, as the researchers write, AlphaZero's success is hard to transfer to the real world. Its training methods work well only under strictly bounded conditions with a finite number of variable parameters. Game rules are an ideal environment in that respect. In the real world, the algorithm has no clear application yet. The Google researchers now plan to move on and train algorithms to play poker, which is difficult for artificial intelligence because the players have only limited access to information.