3D FPGAs have recently emerged as the next generation of the FPGA family, allowing more transistors to be integrated on a single chip. In this paper, we propose a complete CAD flow for implementing an arbitrary logic circuit on a 3D FPGA. The partitioning and placement stages of the flow are based on the simulated annealing algorithm, and the routing stage uses a modified version of the Pathfinder algorithm. Simulation results show that, compared with a 2D FPGA, a 2-tier 3D FPGA increases circuit speed by 28.66% and decreases minimum channel width by 29.92%, while total area rises by 8.86%. Comparing 2-tier and 4-tier 3D FPGAs, the 4-tier architecture increases circuit speed and minimum channel width by 15.95% and 15.92%, respectively, while total area increases by only 1.96%.