ADAPTER TUNING WITH TASK-AWARE ATTENTION MECHANISM
Jinliang Lu (Institute of Automation, Chinese Academy of Sciences); Feihu Jin (Institute of Automation, Chinese Academy of Sciences); Jiajun Zhang (Institute of Automation, Chinese Academy of Sciences)
Adapter tuning inserts simple feed-forward layers (adapters) into pre-trained language models (PLMs) and tunes only the adapters when transferring to downstream tasks, and it has become the state-of-the-art parameter-efficient tuning (PET) strategy. Although the adapters aim to learn task-related representations, their inputs are still produced by the task-independent, frozen multi-head attention (MHA) modules, leading to insufficient utilization of contextual information across downstream tasks. Intuitively, MHA should be task-dependent and attend to different contexts for different downstream tasks. This paper therefore proposes a task-aware attention mechanism (TAM) to enhance adapter tuning. Specifically, we first use the task-dependent adapter to generate a token-wise task embedding. We then apply the task embedding to modulate MHA so that it aggregates contextual information in a task-dependent manner. Experimental results on a wide range of natural language understanding and generation tasks demonstrate the effectiveness of our method. Furthermore, extensive analyses show that the generated task embeddings correspond to task difficulty.
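To make the described flow concrete, the following is a minimal PyTorch sketch of the idea summarized in the abstract: a bottleneck adapter that also emits a token-wise task embedding, which is then used to shift the (frozen) multi-head attention so that its attention pattern becomes task-dependent. The module names, dimensions, and the specific way the task embedding influences attention (adding it to the query input) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskAwareAdapter(nn.Module):
    """Bottleneck adapter that also emits a token-wise task embedding (assumed design)."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        # Hypothetical head producing the token-wise task embedding.
        self.task_head = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, hidden_states: torch.Tensor):
        z = F.relu(self.down(hidden_states))
        adapted = hidden_states + self.up(z)      # standard residual adapter path
        task_embedding = self.task_head(z)        # token-wise task embedding
        return adapted, task_embedding


class TaskAwareAttention(nn.Module):
    """Frozen MHA whose queries are shifted by the task embedding (one possible realization)."""

    def __init__(self, hidden_dim: int, num_heads: int = 8):
        super().__init__()
        self.mha = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        for p in self.mha.parameters():           # MHA stays frozen, as in adapter tuning
            p.requires_grad = False

    def forward(self, hidden_states: torch.Tensor, task_embedding: torch.Tensor):
        # Adding the task embedding to the query input changes the attention
        # weights per task, making the context aggregation task-dependent.
        query = hidden_states + task_embedding
        out, _ = self.mha(query, hidden_states, hidden_states)
        return out


if __name__ == "__main__":
    x = torch.randn(2, 16, 512)                   # (batch, seq_len, hidden_dim)
    adapter = TaskAwareAdapter(512)
    attn = TaskAwareAttention(512)
    adapted, task_emb = adapter(x)
    print(attn(adapted, task_emb).shape)          # torch.Size([2, 16, 512])
```

In this sketch only the adapter parameters are trainable, which preserves the parameter efficiency of adapter tuning while letting the attention output vary with the task.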