A Method for Accelerating YOLO by Hybrid Computing Based on ARM and FPGA
1. Key Laboratory of Agricultural Information Standardization, Ministry of Agriculture and Rural Affairs, China Agricultural University
2. Mathematical Department of Science and Key Laboratory of Agricultural Information Standardization, Ministry of Agriculture and Rural Affairs, China Agricultural University
Abstract: Convolutional neural networks (CNNs) have driven the rapid development of target recognition and detection technology. Compared with traditional machine learning methods, CNNs offer faster detection and greater robustness. However, deploying a CNN model often requires substantial computing resources, which hinders the application of artificial intelligence technology. In this paper, we use a hybrid ARM and FPGA architecture to deploy a You Only Look Once (YOLO) model on the FPGA, improving the efficiency of target recognition and detection under low resource and low power consumption. YOLO is a one-stage real-time detection model with high detection speed and remarkable accuracy. High-level Synthesis (HLS) is a technology for rapid FPGA development and verification based on C/C++. We use HLS to implement a pipeline mechanism and to parallelize the convolution computation, thereby constructing a forward inference model of YOLOv3-tiny. To accelerate the forward inference process of YOLO, we merge each convolution with its batch normalization. The FPGA used in this paper is the Xilinx Zynq-7035 system on chip (SoC). We build a software and hardware co-design architecture of ARM and FPGA on the Zynq-7035, which makes full use of the logic-control strengths of the ARM and the computing strengths of the FPGA. We achieve 28.99 GOP/s with only 3.715 W of power consumption. Compared with the Ryzen 5 3600, we achieve 41.3× the inference speed at a lower clock rate.
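To illustrate what merging convolution with batch normalization can look like at inference time, the following is a minimal C++ sketch, not the authors' implementation. It assumes the standard inference-mode formula y = gamma * (x - mean) / sqrt(var + eps) + beta applied per output channel, and a flattened weight layout of [output channel][weights per channel]. The names `BatchNormParams` and `fold_batch_norm`, and the default `eps`, are hypothetical choices made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for per-output-channel batch-normalization
// parameters in inference mode (fixed running statistics).
struct BatchNormParams {
    std::vector<float> gamma;  // learned scale
    std::vector<float> beta;   // learned shift
    std::vector<float> mean;   // running mean
    std::vector<float> var;    // running variance
};

// Folds batch normalization into the preceding convolution, in place.
// Assumed layout: weights are flattened as [out_ch][weights per out_ch],
// and bias holds one value per output channel.
// After folding, conv(weights, input) + bias equals BN(conv_original(input)).
void fold_batch_norm(std::vector<float>& weights, std::vector<float>& bias,
                     const BatchNormParams& bn, float eps = 1e-5f) {
    const std::size_t out_ch = bias.size();
    const std::size_t per_ch = weights.size() / out_ch;
    for (std::size_t c = 0; c < out_ch; ++c) {
        // Per-channel scale: gamma / sqrt(var + eps).
        const float scale = bn.gamma[c] / std::sqrt(bn.var[c] + eps);
        for (std::size_t i = 0; i < per_ch; ++i) {
            weights[c * per_ch + i] *= scale;
        }
        // Folded bias: (b - mean) * scale + beta.
        bias[c] = (bias[c] - bn.mean[c]) * scale + bn.beta[c];
    }
}
```

Because the folding is done once, offline, the accelerator only has to evaluate a single multiply-accumulate pass per layer, with no separate normalization stage and no square root or division at run time.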