Contributor(s): The Pennsylvania State University CiteSeerX Archives
Abstract: This paper introduces a multiagent reinforcement learning algorithm that converges, to a given accuracy, to stationary Nash equilibria in general-sum discounted stochastic games. Under certain assumptions, we formally prove its convergence to a Nash equilibrium in self-play. To our knowledge, it is the first algorithm that converges to a stationary Nash equilibrium in the general case.
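The abstract's central object is a stationary Nash equilibrium: a joint strategy from which no player can profit by deviating unilaterally. The following sketch (not the paper's algorithm; the payoff matrix is a hypothetical prisoner's-dilemma example) illustrates this equilibrium condition in the simplest special case of a stochastic game, a one-state matrix game with pure strategies:

```python
def is_nash(payoffs, actions, joint):
    """Check the Nash condition for a pure joint action in a matrix game.

    payoffs: dict mapping a joint-action tuple to a tuple of per-player payoffs.
    actions: list of available action sets, one per player.
    joint:   candidate joint action (tuple, one action per player).
    Returns True iff no player gains by deviating unilaterally.
    """
    for i, a_set in enumerate(actions):
        current = payoffs[joint][i]
        for a in a_set:
            # Replace only player i's action, holding the others fixed.
            deviation = joint[:i] + (a,) + joint[i + 1:]
            if payoffs[deviation][i] > current:
                return False  # a profitable unilateral deviation exists
    return True


# Hypothetical general-sum example: the prisoner's dilemma.
A = ["C", "D"]  # cooperate, defect
pd = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

print(is_nash(pd, [A, A], ("D", "D")))  # True: mutual defection is the equilibrium
print(is_nash(pd, [A, A], ("C", "C")))  # False: each player would rather defect
```

In the full stochastic-game setting the check is analogous but applies to stationary policies and discounted long-run values rather than one-shot payoffs.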