Pyramid Dynamic Inference: Encouraging Faster Inference via Early Exit Boosting

Ershad Banijamali (Amazon Inc.); Pegah Kharazmi (Amazon); Sepehr Eghbali (Amazon); Jixuan Wang (Amazon); Clement Chung (Amazon); Samridhi Choudhary (Amazon)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
07 Jun 2023

Transformer-based models demonstrate state-of-the-art results on several natural language understanding tasks. However, their deployment comes at the cost of increased footprint and inference latency, limiting their adoption in real-time applications. Early exit strategies are designed to speed up inference by routing a subset of samples out at the earlier layers of the model. Exiting early, however, degrades model accuracy. To optimize the trade-off between model accuracy and latency, we propose Pyramid Dynamic Inference (PDI), a scheme that encourages fast inference by boosting the performance of early exit heads. PDI allows for more confident early inference by injecting stronger classifiers at earlier layers. It also prevents a significant increase in the model footprint by gradually shrinking the classifiers as the semantic capacity of the deeper transformer layers increases. Experimental results show that PDI outperforms the baselines on both accuracy and latency on the GLUE benchmark.
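The mechanism described in the abstract — stronger classifier heads at earlier layers that shrink with depth, plus confidence-based routing — can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the width schedule, the confidence threshold, and all function names are assumptions.

```python
import math

def head_widths(num_layers, base_width=512, min_width=32):
    # Hypothetical pyramid schedule: earlier layers get wider (stronger)
    # classifier heads, and the heads shrink geometrically with depth,
    # since deeper transformer layers carry more semantic capacity on
    # their own. The base width and halving factor are illustrative.
    return [max(base_width // (2 ** i), min_width) for i in range(num_layers)]

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit(per_layer_logits, threshold=0.9):
    """Route a sample out at the first layer whose head is sufficiently
    confident (max softmax probability >= threshold); otherwise fall
    back to the final layer's prediction. Returns (exit_layer, label)."""
    for layer, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        if max(probs) >= threshold:
            return layer, probs.index(max(probs))
    # No head was confident enough: use the last layer's prediction.
    return len(per_layer_logits) - 1, probs.index(max(probs))
```

For example, with four layers the schedule `head_widths(4)` yields `[512, 256, 128, 64]`, and a sample whose second-layer head is already confident exits there, saving the cost of the remaining layers.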
