On the Fairness of Multitask Representation Learning
Yingcong Li (University of California, Riverside); Samet Oymak (University of California, Riverside)
In the context of multitask learning (MTL), representation learning is often accomplished through a feature extractor \phi shared across all tasks. Intuitively, the statistical cost of learning \phi is then split collaboratively across the tasks, which enables sample efficiency. In this work, we consider a novel fairness scenario in which the T tasks split into majority and minority groups of sizes T_{maj} and T_{min}, respectively. The group assignments are unknown during MTL, and the ratio T_{min}/T_{maj} corresponds to the imbalance level of the problem. We further assume that the two groups admit r_0- and r_1-dimensional linear representations that are orthogonal to each other, so the groups would not benefit each other during MTL. Our main finding is that misspecification disproportionately hurts the minority tasks and that over-parameterization is key to ensuring the fairness of MTL representations. Specifically, we prove that when we fit a misspecified representation of dimension R = r_0, the MTL model achieves small task-averaged risk but has vanishing explanatory power on the minority tasks. Conversely, when we fit a well-specified representation of dimension R = r_0 + r_1, the MTL model achieves small risks on both the majority and minority tasks, on par with the oracle baseline that trains each group individually with hindsight knowledge of the assignments. Finally, we provide experimental results consistent with our theoretical findings.
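To make the setup concrete, here is a minimal numpy sketch of the phenomenon. All problem sizes (ambient dimension d, subspace dimensions r_0 and r_1, group sizes, sample count, noise level) are illustrative assumptions, not the paper's experimental settings, and the shared representation is learned with a simple spectral (method-of-moments style) estimator rather than the paper's exact procedure. The sketch draws majority and minority task vectors from two orthogonal subspaces and compares per-group risks for R = r_0 versus R = r_0 + r_1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes (illustrative, not the paper's settings).
d, r0, r1 = 50, 4, 2          # ambient dim, majority / minority subspace dims
T_maj, T_min = 40, 10         # imbalanced task groups
n = 100                       # samples per task

# Two orthogonal subspaces: B0 for majority tasks, B1 for minority tasks.
Q, _ = np.linalg.qr(rng.standard_normal((d, r0 + r1)))
B0, B1 = Q[:, :r0], Q[:, r0:]

# Ground-truth task vectors: each task lives in its group's subspace.
betas = np.column_stack(
    [B0 @ rng.standard_normal(r0) for _ in range(T_maj)]
    + [B1 @ rng.standard_normal(r1) for _ in range(T_min)]
)  # shape (d, T_maj + T_min)

# Per-task data: y_t = X_t beta_t + noise.
Xs = [rng.standard_normal((n, d)) for _ in range(T_maj + T_min)]
ys = [X @ betas[:, t] + 0.1 * rng.standard_normal(n) for t, X in enumerate(Xs)]

def mtl_risk(R):
    """Fit a rank-R shared representation; return (majority, minority) risks."""
    # Spectral estimate: per-task OLS, then the top-R subspace of the stacked
    # estimates serves as the shared representation.
    beta_hats = np.column_stack(
        [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(Xs, ys)]
    )
    U, _, _ = np.linalg.svd(beta_hats, full_matrices=False)
    B_hat = U[:, :R]                       # shared representation, shape (d, R)
    # Refit each task's head on the learned features; measure parameter risk.
    risks = []
    for t, (X, y) in enumerate(zip(Xs, ys)):
        w = np.linalg.lstsq(X @ B_hat, y, rcond=None)[0]
        risks.append(np.sum((B_hat @ w - betas[:, t]) ** 2))
    risks = np.array(risks)
    return risks[:T_maj].mean(), risks[T_maj:].mean()

for R in (r0, r0 + r1):
    maj, mnr = mtl_risk(R)
    print(f"R = {R}: majority risk {maj:.3f}, minority risk {mnr:.3f}")
```

Under this sketch, the misspecified choice R = r_0 typically captures only the majority subspace, so the minority risk stays near the norm of the minority task vectors (no explanatory power), while R = r_0 + r_1 drives both group risks down, mirroring the abstract's claim.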