
Processing Convolutional Neural Networks On Cache

João Vieira, Nuno Roma, Gabriel Falcao, Pedro Tomás

Length: 12:14
04 May 2020

With the advent of Big Data application domains, Machine Learning (ML) signal-processing algorithms such as Convolutional Neural Networks (CNNs) must process progressively larger datasets, at great cost in both compute power and memory bandwidth. Although dedicated accelerators have been developed to address this issue, they usually require either moving massive amounts of data across the memory hierarchy to the processing cores or, in the case of in-/near-memory processing solutions, low-level knowledge of how data is stored in the memory devices. In this paper, we propose and assess a novel mechanism that operates at the cache level, leveraging both data proximity and the parallel processing capabilities enabled by dedicated fully-digital vector Functional Units (FUs). We also demonstrate the integration of the mechanism with a conventional Central Processing Unit (CPU). The obtained results show that our engine provides performance improvements on CNNs ranging from 3.92× to 16.6×.
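For context, the workload such an engine targets is the convolution loop nest. The C sketch below (the function name, single-channel layout, stride 1, and no padding are illustrative assumptions, not the paper's implementation) makes the memory-traffic problem concrete: every output element rereads K×K inputs, so the nest is dominated by data movement that a cache-level vector unit can serve from lines already resident in the cache arrays instead of streaming them to the CPU cores.

```c
#include <stddef.h>

/* Illustrative direct 2D convolution over one input channel with a
 * K x K kernel (stride 1, no padding). Each output element performs
 * K*K multiply-accumulates but also K*K input reads, which is why
 * operating on data where it already sits (in cache) pays off. */
void conv2d(const float *in, const float *w, float *out,
            size_t H, size_t W, size_t K)
{
    size_t OH = H - K + 1, OW = W - K + 1;   /* output dimensions */
    for (size_t y = 0; y < OH; y++) {
        for (size_t x = 0; x < OW; x++) {
            float acc = 0.0f;
            for (size_t ky = 0; ky < K; ky++)
                for (size_t kx = 0; kx < K; kx++)
                    acc += in[(y + ky) * W + (x + kx)] * w[ky * K + kx];
            out[y * OW + x] = acc;
        }
    }
}
```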
