04 May 2020

Automatic sub-policy discovery has recently received much attention in hierarchical reinforcement learning (HRL). Conventional approaches to learning sub-policies tend to collapse into a single sub-policy that dominates the whole task, because they lack mechanisms to ensure diversity among the sub-policies. In this paper, we formulate the discovery of diverse sub-policies as a trajectory inference problem. We then propose an information-theoretic objective based on action distributions to encourage diversity. Moreover, two simplifications are derived for discrete and continuous action spaces to reduce computation. Finally, experimental results on two different HRL domains show that the proposed approach further improves state-of-the-art approaches without modifying their existing hyperparameters, suggesting the wide applicability and robustness of our approach.
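
The abstract does not reproduce the paper's exact objective, so the snippet below is only an illustrative sketch of one common way to encourage diversity from action distributions in the discrete-action case: an average pairwise KL divergence between the sub-policies' action distributions at a given state, which can be added as a bonus to the HRL objective. The function name, the use of KL divergence, and the example numbers are assumptions, not the authors' formulation.

```python
import numpy as np

def diversity_bonus(action_probs):
    """Illustrative (not the paper's) diversity term for discrete actions:
    the average pairwise KL divergence between the action distributions of
    different sub-policies at the same state. Larger values mean the
    sub-policies act more distinctly, so maximizing this term alongside the
    task reward discourages collapse onto one dominant sub-policy.

    action_probs: array of shape (num_sub_policies, num_actions), each row
    a sub-policy's action distribution at the current state.
    """
    p = np.clip(action_probs, 1e-8, 1.0)   # avoid log(0)
    k = p.shape[0]
    total = 0.0
    for i in range(k):
        for j in range(k):
            if i != j:
                # KL(p_i || p_j) for discrete distributions
                total += np.sum(p[i] * (np.log(p[i]) - np.log(p[j])))
    return total / (k * (k - 1))

# Example: three sub-policies over four discrete actions.
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],   # sub-policy 0 prefers action 0
    [0.10, 0.70, 0.10, 0.10],   # sub-policy 1 prefers action 1
    [0.25, 0.25, 0.25, 0.25],   # sub-policy 2 is uniform
])
print(diversity_bonus(probs))   # higher when the rows differ more
```

In practice such a term is weighted against the task reward; the paper's contribution includes simplifications that make an information-theoretic objective of this kind cheaper to compute on both discrete and continuous action spaces.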
