NETWORKED POLICY GRADIENT PLAY IN MARKOV POTENTIAL GAMES

Sarper Aydin (Texas A&M University); Ceyhun A. Eksin (Texas A&M University)

07 Jun 2023

We propose a networked policy gradient play algorithm for solving Markov potential games. In a Markov game, each agent has a reward function that depends on the actions of all the agents and a common dynamic state. A differentiable Markov potential game admits a potential value function that has local gradients equal to the gradients of agents' local value functions. In the proposed algorithm, agents use parameterized policies that depend on the state and other agents' policies. Agents use stochastic gradients and local parameter values received from their neighbors to update their policies. We show that the joint policy parameters converge to a first-order stationary point of a Markov potential game in expectation for general action and state spaces. Numerical results on the lake game exemplify the convergence of the proposed method.
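To make the update rule concrete, here is a minimal numerical sketch of a consensus-plus-gradient step of the kind the abstract describes: each agent mixes the parameter values received from its neighbors and then takes a local stochastic gradient ascent step. The network weights, the toy quadratic potential, and the noise model are all illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

n_agents, dim = 4, 2

# Row-stochastic mixing matrix for a 4-agent ring network (hypothetical weights).
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])

# Each agent keeps its own copy of the policy parameters.
theta = rng.normal(size=(n_agents, dim))

def stochastic_grad(theta):
    """Noisy gradient of a toy potential Phi(theta) = -0.5 * ||theta||^2.

    This is a stand-in for the stochastic policy gradients of the
    agents' local value functions; here grad Phi = -theta plus noise.
    """
    return -theta + 0.01 * rng.normal(size=theta.shape)

step = 0.1
for _ in range(200):
    # Mix neighbors' parameters, then take a local gradient step.
    theta = W @ theta + step * stochastic_grad(theta)

# Agents' parameters reach approximate consensus near the stationary
# point of the toy potential (the origin).
```

Under these assumptions the mixing matrix drives the agents' parameter copies toward agreement while the gradient term drives the common value toward a first-order stationary point, mirroring the convergence statement in the abstract.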
