Stabilizing Multi-Agent Deep Reinforcement Learning By Implicitly Estimating Other Agents’ Behaviors

Yue Jin, Shuangqing Wei, Jian Yuan, Xudong Zhang, Chao Wang

Length: 14:12

04 May 2020

Deep reinforcement learning (DRL) can learn control policies for many complicated tasks, but its power has not yet been fully exploited in multi-agent settings. Independent learning, where each agent treats the others as part of the environment and learns its own policy without considering the others’ policies, is a simple way to apply DRL to multi-agent tasks. However, because the agents’ policies change as learning proceeds, the environment is non-stationary from the perspective of each agent, which makes conventional DRL methods inefficient. To cope with this challenge, we propose a novel approach in which each agent uses an implicit estimate of the others’ actions to guide its own policy learning. We demonstrate that, given this implicit estimate of the others’ actions, each agent can learn its policy in a relatively stationary environment. Extensive experiments show that our method significantly alleviates the non-stationarity and outperforms the state of the art in both convergence speed and policy performance.
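
The non-stationarity described in the abstract can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a hypothetical 2x2 coordination game (the `PAYOFF` matrix, the function names, and the two example policies are all invented for illustration) and conditions on the other agent's actual action rather than on an implicit estimate. It only shows that the expected reward of a fixed action drifts as the other agent's policy changes, while the reward conditioned on the other agent's action does not.

```python
import numpy as np

# Hypothetical coordination game (an assumption, not from the paper):
# PAYOFF[a_self, a_other] is the reward received by the learning agent.
PAYOFF = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

def marginal_reward(a_self, other_policy):
    """Expected reward of a_self when the other agent's action is unobserved.

    This is what an independent learner effectively sees: the other agent
    is folded into the environment, so the value depends on its policy.
    """
    return float(PAYOFF[a_self] @ other_policy)

def conditional_reward(a_self, a_other):
    """Reward of a_self once the other agent's action is known or estimated."""
    return float(PAYOFF[a_self, a_other])

# Two snapshots of the other agent's policy during learning (illustrative values).
early_policy = np.array([0.9, 0.1])
late_policy = np.array([0.2, 0.8])

# Same action, different expected reward: the environment looks non-stationary.
print(marginal_reward(0, early_policy), marginal_reward(0, late_policy))

# Conditioned on the other agent's action, the reward no longer depends
# on how that agent's policy has changed.
print(conditional_reward(0, 0), conditional_reward(0, 1))
```

In this toy setting the conditional reward is exactly stationary because the other agent's action is observed directly; with an estimate of that action, as the abstract describes, the environment would be only relatively stationary.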

Tags: SPS conference, ICASSP 2020 virtual conference, May 2020, ICASSP 2020
