Correlated Multi-Armed Bandits With A Latent Random Source
Samarth Gupta, Gauri Joshi, Osman Yagan
Multi-armed bandit models are widely studied sequential decision-making problems that exemplify the exploration-exploitation trade-off. We study a novel correlated multi-armed bandit model where the rewards obtained from the arms are functions of a common latent random variable. We propose and analyze the performance of the C-UCB algorithm, which leverages the correlations between arms to reduce the cumulative regret (i.e., to increase the total reward obtained after T rounds). Unlike the standard UCB algorithm, which pulls every sub-optimal arm O(log T) times, the C-UCB algorithm requires only O(1) pulls to identify that some arms, which we refer to as non-competitive arms, are sub-optimal. Thus, we effectively reduce a K-armed bandit problem to a (C+1)-armed bandit problem, where C is the number of competitive arms.
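As a rough illustration of the setup, the sketch below implements the standard UCB1 baseline that the abstract contrasts against, with every arm's reward generated as a noisy function of a shared latent sample. The reward functions, latent distribution, noise level, and horizon are hypothetical placeholders, and the competitive-set filtering that distinguishes C-UCB from plain UCB is not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reward functions of the shared latent variable X (illustrative only).
ARM_FUNCS = [lambda x: x, lambda x: 1.0 - x, lambda x: 0.5 * x + 0.2]

def pull(arm, x):
    # Each arm's reward is a noisy function of the same latent sample x.
    return ARM_FUNCS[arm](x) + rng.normal(0.0, 0.05)

def ucb1(T=5000, K=3):
    # Standard UCB baseline from the abstract: every sub-optimal arm is pulled O(log T) times.
    counts = np.zeros(K)
    sums = np.zeros(K)
    for t in range(T):
        x = rng.uniform()  # shared latent random source, here X ~ U[0, 1]
        if t < K:
            arm = t  # pull each arm once to initialise its estimate
        else:
            means = sums / counts
            bonus = np.sqrt(2.0 * np.log(t + 1) / counts)
            arm = int(np.argmax(means + bonus))  # optimism in the face of uncertainty
        reward = pull(arm, x)
        counts[arm] += 1
        sums[arm] += reward
    return counts  # number of pulls per arm after T rounds

if __name__ == "__main__":
    print(ucb1())

In this setting, C-UCB would additionally exploit the known functional relationship between arms and the latent source to rule out non-competitive arms after O(1) pulls; the sketch above makes no attempt at that step.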