Abstract
Hardware implementation of deep neural networks (DNNs) is a serious challenge in real-time artificial intelligence applications. Among the available methods, stochastic computing (SC)-based implementations have received considerable attention because of their low hardware overhead. However, slow convergence is a major problem in SC-based neural network implementations: millions of clock cycles are required to generate a relatively accurate output. The reconfigurability and parallel nature of field-programmable gate array (FPGA) chips make them a preferable platform for SC-based DNN implementation, but a fully or semi-parallel implementation of a DNN requires extensive hardware resources. In this paper, an efficient method for implementing DNNs on an FPGA chip is presented to address these problems. The reconfiguration feature of the FPGA chip allows DNNs with different neurons and topologies to be implemented on a single chip. Convergence time is significantly reduced by limiting the length of the stochastic bitstreams and by synchronizing the processing units through precise timing. Furthermore, because the number of input–output pins on the FPGA chip is limited, a sequential architecture is proposed in which the DNN inputs enter through only three 8-bit ports. This makes it possible to implement DNNs for image-processing applications. The proposed method is implemented in the Verilog hardware description language on the Xilinx Virtex-7 xc7v2000t FPGA chip. The results show a reduction of more than 82% in hardware resources and the lowest power consumption among the compared state-of-the-art methods. In addition, the average error rate of the implemented DNN is reduced by 2%.
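The trade-off between bitstream length and accuracy mentioned in the abstract can be illustrated with a minimal software sketch of unipolar stochastic multiplication. This is not the paper's Verilog design; the function names (`encode`, `decode`, `sc_multiply`), the unipolar encoding, and the use of a software pseudo-random generator are assumptions made only for illustration. A value in [0, 1] is represented by the fraction of ones in a random bitstream, and a bitwise AND of two independent streams approximates the product.

```python
import random


def encode(value, length, rng):
    """Unipolar encoding (assumed here): each bit is 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(bits):
    """Recover the represented value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)


def sc_multiply(a, b, length, seed=0):
    """Approximate a * b by AND-ing two independently generated bitstreams."""
    rng = random.Random(seed)          # hypothetical software stand-in for a hardware RNG
    stream_a = encode(a, length, rng)
    stream_b = encode(b, length, rng)
    product = [x & y for x, y in zip(stream_a, stream_b)]  # one AND gate per clock cycle
    return decode(product)


if __name__ == "__main__":
    exact = 0.5 * 0.6
    for length in (16, 256, 4096):
        estimate = sc_multiply(0.5, 0.6, length)
        print(f"length={length:5d}  estimate={estimate:.4f}  error={abs(estimate - exact):.4f}")
```

In this sketch each bit corresponds to one clock cycle, so longer streams tend to give a more accurate estimate at the cost of more cycles. That is the tension the abstract refers to when it limits the bitstream length to shorten convergence time.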