Is Quality Enough? Integrating Energy Consumption in a Large-Scale Evaluation of Neural Audio Synthesis Models
Constance Douwes (IRCAM); Giovanni Bindi (IRCAM); Antoine Caillon (IRCAM); Philippe Esling (IRCAM); Jean-Pierre Briot (CNRS)
SPS
Deep learning models are now core components of modern audio synthesis, and their use has increased significantly in recent years, leading to highly accurate and successful solutions. However, the quest for quality comes at a tremendous computational cost, which incurs vast energy consumption and greenhouse gas emissions. At the heart of this problem are the measures we use as a scientific community to evaluate our work. In this paper, we suggest relying on a multi-objective metric based on Pareto optimality, which considers both a model's quality and its energy consumption. By applying our measure to current state-of-the-art generative audio models, we show that it can drastically change the significance of the results. We hope to raise awareness of the need to investigate models that are both energy-efficient and of high perceived quality, thus putting computational cost in the spotlight of deep learning research.
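To illustrate the idea of Pareto optimality over quality and energy, here is a minimal sketch in Python. It is not the paper's metric: the model names, quality scores, and energy figures are invented placeholders, and the helpers `dominates` and `pareto_front` are hypothetical names introduced only for this example. A model is kept if no other model is at least as good on both objectives and strictly better on one.

```python
from typing import List, NamedTuple


class ModelResult(NamedTuple):
    name: str
    quality: float     # higher is better (placeholder score, not from the paper)
    energy_kwh: float  # lower is better (placeholder value, not from the paper)


def dominates(a: ModelResult, b: ModelResult) -> bool:
    """True if `a` is at least as good as `b` on both objectives
    and strictly better on at least one."""
    no_worse = a.quality >= b.quality and a.energy_kwh <= b.energy_kwh
    strictly_better = a.quality > b.quality or a.energy_kwh < b.energy_kwh
    return no_worse and strictly_better


def pareto_front(results: List[ModelResult]) -> List[ModelResult]:
    """Keep every result that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Invented numbers, for illustration only.
results = [
    ModelResult("model_a", quality=4.5, energy_kwh=100.0),
    ModelResult("model_b", quality=4.0, energy_kwh=20.0),
    ModelResult("model_c", quality=3.9, energy_kwh=50.0),
    ModelResult("model_d", quality=3.0, energy_kwh=5.0),
]

front_names = [r.name for r in pareto_front(results)]
print(front_names)
```

With these placeholder values, `model_c` drops out because `model_b` reaches higher quality at lower energy, while the other three remain as distinct quality/energy trade-offs. A ranking by quality alone would instead place `model_c` ahead of `model_d`, which shows how adding the energy objective can change the reading of a comparison.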