SUPERVISED CONTRASTIVE LEARNING AS MULTI-OBJECTIVE OPTIMIZATION FOR FINE-TUNING LARGE PRE-TRAINED LANGUAGE MODELS
Youness Moukafih (International University of Rabat); Mounir Ghogho (Université Internationale de Rabat); Kamel Smaïli (University of Lorraine)
Recently, Supervised Contrastive Learning (SCL) has been shown to significantly outperform standard cross-entropy-based learning on most classification tasks. In SCL, a neural network is trained to optimize two objectives: pull an anchor and its positive samples together in the embedding space, and push the anchor apart from the negatives. These two objectives may conflict with one another, so a trade-off between them is required during optimization. In this work, we formulate the SCL problem as a Multi-Objective Optimization (MOO) problem for the fine-tuning phase of the RoBERTa language model. We use two methods to solve the optimization problem: (i) the linear scalarization (LS) method, which minimizes a weighted linear combination of per-task losses; and (ii) the Exact Pareto Optimal (EPO) method, which finds the intersection of the Pareto front with a given preference vector. We evaluate our approach on several GLUE benchmark tasks, without using data augmentations, memory banks, or generated adversarial examples. The empirical results show that the proposed learning strategy significantly outperforms a strong contrastive learning baseline.
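
To make the two-objective view concrete, the following is a minimal sketch of linear scalarization applied to a "pull" term and a "push" term. The abstract does not specify how the paper splits the SCL loss, so the decomposition here is an assumption: the pull term is the negative mean temperature-scaled cosine similarity between an anchor and its positives, and the push term is the log-sum-exp of the anchor's similarities to its negatives. The function names, the temperature value, and the weights are all hypothetical, and the EPO method is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pull_push_losses(embeddings, labels, temperature=0.1):
    """Return (pull, push), each averaged over the anchors that have
    at least one positive (for pull) or one negative (for push).

    Assumed decomposition, not taken from the paper:
      pull_i = -mean over positives p of sim(i, p) / temperature
      push_i = log sum over negatives n of exp(sim(i, n) / temperature)
    """
    n = len(embeddings)
    pull_terms, push_terms = [], []
    for i in range(n):
        pos, neg = [], []
        for j in range(n):
            if j == i:
                continue
            s = cosine(embeddings[i], embeddings[j]) / temperature
            (pos if labels[j] == labels[i] else neg).append(s)
        if pos:
            pull_terms.append(-sum(pos) / len(pos))
        if neg:
            m = max(neg)  # subtract the max for a numerically stable log-sum-exp
            push_terms.append(m + math.log(sum(math.exp(s - m) for s in neg)))
    pull = sum(pull_terms) / len(pull_terms) if pull_terms else 0.0
    push = sum(push_terms) / len(push_terms) if push_terms else 0.0
    return pull, push


def linear_scalarization(losses, weights):
    """Weighted linear combination of the individual objective values."""
    return sum(w * l for w, l in zip(weights, losses))


# Toy batch: two classes with perfectly separated embeddings.
toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
toy_labels = [0, 0, 1, 1]
pull, push = pull_push_losses(toy_embeddings, toy_labels)
total = linear_scalarization((pull, push), weights=(0.5, 0.5))
```

In this sketch the weights play the role of a fixed preference between the two objectives; changing them moves the scalarized optimum toward tighter same-class clusters or toward wider separation between classes.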