Skip to main content

Generalized Linear Bandits With Safety Constraints

Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
    Length: 17:40
04 May 2020

The classical multi-armed bandit is a class of sequential decision making problems where selecting actions incurs costs that are sampled independently from an unknown underlying distribution. Bandit algorithms have many applications in safety critical systems, where several constraints must be respected during the run of the algorithm in spite of uncertainty about problem parameters. This paper formulates a generalized linear stochastic multi-armed bandit problem with generalized linear safety constraints that depend on an unknown parameter vector. In this setting, we propose a Safe UCB-GLM algorithm for which we provide general and problem-dependent regret bounds.

Value-Added Bundle(s) Including this Product

More Like This

  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00
  • SPS
    Members: $150.00
    IEEE Members: $250.00
    Non-members: $350.00