A Fifo Based Accelerator For Convolutional Neural Networks
Vineet Panchbhaiyye, Tokunbo Ogunfunmi
In recent years, Deep Neural Networks (DNNs) have achieved state-of-the-art results in fields such as computer vision, natural language processing, and speech recognition. Of all DNN architectures, Convolutional Neural Networks (CNNs) have been the most effective in tasks like image classification and object detection. The high performance of CNNs comes at the cost of computational complexity. Currently, Graphics Processing Units (GPUs) are used to accelerate CNN training and inference on workstations and data servers. Though popular, GPUs are not suitable for embedded applications because they are not energy efficient. ASIC and FPGA accelerators have the potential to run CNNs optimized for both energy and performance. In this paper we present an architecture that takes a novel approach to computing convolution results using row-wise inputs, as opposed to traditional tile-based processing. We exceed the results of state-of-the-art architectures when implemented on an inexpensive PYNQ-Z1 board running at 100 MHz. The total latency to run the convolution layers of the VGG16 benchmark is nearly 1.5x lower for our architecture than for state-of-the-art architectures.
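The row-wise idea described above can be illustrated with a minimal software sketch: input rows stream in one at a time, a FIFO retains only the most recent K rows, and a full output row is produced as soon as the FIFO is full. This is an assumption-laden toy model (single channel, stride 1, valid padding, function and variable names invented here), not the paper's hardware architecture.

```python
from collections import deque

def fifo_row_conv(rows, kernel):
    """Row-streaming 2-D convolution sketch (single channel,
    stride 1, valid padding). Illustrative only; the names and
    details here are not taken from the paper."""
    K = len(kernel)            # kernel is K x K
    fifo = deque(maxlen=K)     # FIFO of the most recent K input rows
    out = []
    for row in rows:           # rows arrive one at a time (streamed)
        fifo.append(row)       # oldest row is evicted automatically
        if len(fifo) == K:     # enough rows buffered for one output row
            W = len(row)
            out_row = []
            for x in range(W - K + 1):          # slide window across columns
                acc = 0
                for i in range(K):
                    for j in range(K):
                        acc += fifo[i][x + j] * kernel[i][j]
                out_row.append(acc)
            out.append(out_row)
    return out
```

Because only K rows are ever resident, the buffering cost grows with the image width rather than the full image area, which is the usual motivation for row/line-buffer schemes over tile-based processing.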