Abstract—Convolutional neural networks (CNNs) are widely used in modern artificial intelligence (AI) systems. Compared with classical methods, CNNs deliver superior performance in image classification, speech recognition, and object detection. However, CNNs are computationally intensive and require a large amount of data movement, so an efficient data-movement scheme is critical to both the performance and power efficiency of an accelerator design. In this paper, we propose a novel CNN accelerator architecture with a unique parallel loading scheme and a smart memory-addressing solution. Our design is 30% faster than comparable accelerators on AlexNet and achieves high efficiency on fully connected (FC) layers without image batching, making it well suited to edge applications.
Index Terms—Convolutional neural networks (CNNs), deep learning, energy-efficient accelerators.
The authors are with the Institute of Microelectronics, Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, #08-02 Innovis Tower A, Singapore (e-mail: firstname.lastname@example.org, email@example.com).
Cite: Kong Anmin and Zhao Bin, "A Parallel Loading Based Accelerator for Convolution Neural Network," International Journal of Machine Learning and Computing, vol. 10, no. 5, pp. 669-674, 2020. Copyright © 2020 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.