Search results
Results from the WOW.Com Content Network
Self-play is used by the AlphaZero program to improve its performance in the games of chess, shogi and go. [2] Self-play is also used to train the Cicero AI system to outperform humans at the game of Diplomacy. The technique is also used in training the DeepNash system to play the game Stratego. [3] [4]
In 100 shogi games against Elmo (World Computer Shogi Championship 27 summer 2017 tournament version with YaneuraOu 4.73 search), AlphaZero won 90 times, lost 8 times and drew twice. [11] As in the chess games, each program got one minute per move, and Elmo was given 64 threads and a hash size of 1 GB. [2]
General video game playing (GVGP) is the concept of GGP adjusted to the purpose of playing video games. For video games, game rules have to be either learnt over multiple iterations by artificial players like TD-Gammon , [ 5 ] or are predefined manually in a domain-specific language and sent in advance to artificial players [ 6 ] [ 7 ] like in ...
Game playing was an area of research in AI from its inception. One of the first examples of AI is the computerized game of Nim made in 1951 and published in 1952. Despite being advanced technology in the year it was made, 20 years before Pong, the game took the form of a relatively small box and was able to regularly win games even against highly skilled players of the game. [1]
MuZero (MZ) is a combination of the high-performance planning of the AlphaZero (AZ) algorithm with approaches to model-free reinforcement learning. The combination allows for more efficient training in classical planning regimes, such as Go, while also handling domains with much more complex inputs at each stage, such as visual video games.
The AI engaged in reinforcement learning, playing against itself until it could anticipate its own moves and how those moves would affect the game's outcome. [10] In the first three days AlphaGo Zero played 4.9 million games against itself in quick succession. [11]
The 2014 research paper on "Variational Recurrent Auto-Encoders" attempted to generate music based on songs from 8 different video games. This project is one of the few conducted purely on video game music. The neural network in the project was able to generate data that was very similar to the data of the games it trained off of. [35]
TD-Gammon is a computer backgammon program developed in 1992 by Gerald Tesauro at IBM's Thomas J. Watson Research Center.Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning, specifically TD-Lambda.