SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 18:28
Because the stochastic multi-armed bandit model has many important applications, understanding the impact of adversarial attacks on it is essential for its safe use. In this paper, we propose a new class of attacks, called action-manipulation attacks, in which an adversary can change the action signal selected by the user. We investigate this attack against a popular and widely used bandit algorithm, the Upper Confidence Bound (UCB) algorithm. Without knowledge of the mean rewards of the arms, our proposed attack scheme can force the user to pull a target arm very frequently while incurring only logarithmic cost.
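To make the setting concrete, the following is a minimal simulation sketch of an action-manipulation attack on a UCB user. It is an illustration under stated assumptions, not the scheme from the paper: the user runs the standard UCB1 index, the rewards are Bernoulli, and the attacker uses a simple lower-confidence-bound heuristic (whenever the user selects a non-target arm, the attacker swaps the action to the non-target arm that currently looks worst). The function name `simulate`, its parameters, the arm means, and the confidence widths are all hypothetical choices made for this example. The cost is counted as the number of rounds in which the executed arm differs from the arm the user selected.

```python
import math
import random


def simulate(means, target, horizon, attack=True, seed=0):
    """Run UCB1 for `horizon` rounds, optionally under an action-manipulation attack.

    Returns (user_cnt, cost): how often the user believes it pulled each arm,
    and how many rounds the attacker changed the action.
    """
    rng = random.Random(seed)
    k = len(means)
    user_sum = [0.0] * k  # rewards the user credits to each arm
    user_cnt = [0] * k    # pulls per arm as seen by the user
    true_sum = [0.0] * k  # attacker's statistics, indexed by the arm actually executed
    true_cnt = [0] * k
    cost = 0

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: the user tries every arm once, unmanipulated.
            chosen = t - 1
        else:
            # Standard UCB1 index on the user's (possibly corrupted) statistics.
            chosen = max(
                range(k),
                key=lambda i: user_sum[i] / user_cnt[i]
                + math.sqrt(2 * math.log(t) / user_cnt[i]),
            )

        executed = chosen
        if attack and t > k and chosen != target:
            # Assumed heuristic: swap to the non-target arm with the lowest
            # lower confidence bound, so the attacker needs no prior knowledge
            # of the true means.
            executed = min(
                (i for i in range(k) if i != target),
                key=lambda i: true_sum[i] / true_cnt[i]
                - math.sqrt(2 * math.log(t) / true_cnt[i]),
            )
            if executed != chosen:
                cost += 1

        # The reward comes from the executed arm, but the user credits it
        # to the arm it believes it pulled.
        reward = 1.0 if rng.random() < means[executed] else 0.0
        user_sum[chosen] += reward
        user_cnt[chosen] += 1
        true_sum[executed] += reward
        true_cnt[executed] += 1

    return user_cnt, cost


if __name__ == "__main__":
    means = [0.9, 0.5, 0.2]  # hypothetical arm means; arm 1 is the target
    for attack in (False, True):
        counts, cost = simulate(means, target=1, horizon=20000, attack=attack, seed=1)
        print(f"attack={attack}: pulls per arm={counts}, manipulations={cost}")
```

In this sketch the attacker only intervenes when the user selects a non-target arm, so the number of manipulations is bounded by the number of non-target selections. Since those arms are made to look worse than the target, a UCB user selects them increasingly rarely, which is the intuition behind a cost that grows slowly with the horizon.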