Co-evolution in the successful learning of backgammon strategy


Author(s) : Alan D. Blair, Jordan B. Pollack
Publisher : N/A
Publication Date : 1998
ISSN : N/A
Abstract : Following Tesauro's work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hill-climbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a "meta-game" of self-learning.
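
The champion-versus-challenger procedure in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the mutation scale `sigma`, and the noisy `toy_match` game (a hypothetical stand-in for a full backgammon game between two networks) are all assumptions made for illustration. The sketch also assumes that "changing weights" means the winning challenger simply replaces the champion, which is only one possible reading of the abstract.

```python
import random


def mutate(weights, sigma, rng):
    """Return a copy of `weights` with small Gaussian noise added to each entry."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def toy_match(champion, challenger, rng):
    """Hypothetical stand-in for a backgammon game; True if the challenger wins.

    Strength is closeness to a hidden target vector, and the outcome is noisy,
    loosely imitating the randomness that dice add to a real game.
    """
    target = [1.0] * len(champion)

    def strength(weights):
        return -sum((w - t) ** 2 for w, t in zip(weights, target))

    noise = rng.gauss(0.0, 0.05)
    return strength(challenger) + noise > strength(champion)


def hill_climb(n_weights, generations, sigma=0.05, play_match=toy_match, seed=0):
    """Relative-fitness hill-climbing: champion against a mutated challenger."""
    rng = random.Random(seed)
    champion = [0.0] * n_weights  # initial champion: all-zero weights
    for _ in range(generations):
        challenger = mutate(champion, sigma, rng)
        if play_match(champion, challenger, rng):
            # Assumption: the winning challenger replaces the champion outright.
            champion = challenger
    return champion


if __name__ == "__main__":
    final = hill_climb(n_weights=8, generations=2000)
    print([round(w, 2) for w in final])
```

Fitness here is purely relative: a candidate is never scored against a fixed benchmark, only against the current champion. To experiment with a real game, `play_match` could be replaced by any function that plays the two weight vectors against each other and reports whether the challenger won.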