  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 00:08:51
08 May 2022

A wide range of multi-agent decision-making problems can be abstracted as a federated multi-armed bandit (FMAB) problem. A key challenge of the FMAB problem is that the exploration-exploitation dilemma inherited from the multi-armed bandit aspect is compounded by data heterogeneity in federated learning, rendering the exploration and exploitation of different agents inherently entangled. This paper focuses on overcoming the difficulty of exploration in FMAB problems, and it proposes a novel federated upper confidence bound (UCB) algorithm that relies on uncoordinated exploration (UE) decisions by the agents. The major distinction of this algorithm, referred to as FedUCB-UE, from existing FMAB algorithms is that it allows the agents to explore non-optimal arms and make personalized arm-selection decisions without coordination. While such uncoordinated exploration makes the regret analysis non-trivial, it brings both theoretical and empirical benefits from diversity in exploration. Under certain mild assumptions, this paper establishes that FedUCB-UE has an O(log T) regret bound. Furthermore, experiments performed on synthetic datasets show that FedUCB-UE outperforms state-of-the-art algorithms.
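To make the setting concrete, the following is a minimal sketch of multiple bandit agents running UCB-style arm selection with independent, uncoordinated exploration. This is an illustrative toy, not the paper's FedUCB-UE algorithm: the federation step is omitted, and the decaying exploration probability and the `Agent` class are hypothetical choices made here for illustration.

```python
import math
import random

def ucb_index(mean, count, t, c=2.0):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    if count == 0:
        return float("inf")  # force each arm to be tried at least once
    return mean + math.sqrt(c * math.log(t) / count)

class Agent:
    """One bandit agent; explores on its own (uncoordinated) schedule."""
    def __init__(self, n_arms, seed):
        self.n_arms = n_arms
        self.counts = [0] * n_arms    # pulls per arm
        self.means = [0.0] * n_arms   # empirical mean reward per arm
        self.rng = random.Random(seed)  # private randomness: no coordination

    def select_arm(self, t, explore_prob):
        # Hypothetical uncoordinated exploration: with small probability,
        # each agent independently samples a uniformly random arm.
        if self.rng.random() < explore_prob:
            return self.rng.randrange(self.n_arms)
        scores = [ucb_index(self.means[a], self.counts[a], t)
                  for a in range(self.n_arms)]
        return max(range(self.n_arms), key=lambda a: scores[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

def run(n_agents=3, n_arms=5, horizon=2000):
    """Simulate agents on Bernoulli arms; return each agent's most-pulled arm."""
    true_means = [0.1, 0.3, 0.5, 0.7, 0.9]  # arm 4 is the best arm
    agents = [Agent(n_arms, seed=i) for i in range(n_agents)]
    env = random.Random(0)
    for t in range(1, horizon + 1):
        for ag in agents:
            arm = ag.select_arm(t, explore_prob=1.0 / t)  # decaying exploration
            reward = float(env.random() < true_means[arm])  # Bernoulli reward
            ag.update(arm, reward)
    return [max(range(n_arms), key=lambda a: ag.counts[a]) for ag in agents]
```

Because each agent uses its own random seed and exploration draws, the agents' arm-selection trajectories differ, giving the exploration diversity the abstract refers to, while each agent's UCB index still concentrates on high-reward arms.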
