# lenet5_hls

**Repository Path**: biasbb/lenet5_hls

## Basic Information

- **Project Name**: lenet5_hls
- **Description**: FPGA Accelerator for CNN using Vivado HLS
- **Primary Language**: C++
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 1
- **Created**: 2019-08-11
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

LeNet-5 in HLS
===========
This repository accompanies my graduation report: an implementation of LeNet-5 using Vivado High-Level Synthesis (HLS) 2016.4 and SDSoC 2016.4.


![lenet5](https://world4jason.gitbooks.io/research-log/content/deepLearning/CNN/img/lenet.png "LeNet-5")


## Win 10 Test App
You can test the accelerator with your own handwritten digit images.  

### Youtube Video

[![Youtube Video Here](http://cfile21.uf.tistory.com/image/99C6A7335A1524F20AFF26)](https://youtu.be/C7MUhBBczss)

To test the app, follow these steps:

1. Check or configure the IP address of the ZedBoard with `ifconfig`.  
```
	username@Zedboard:~# ifconfig
```
2. Start the .elf file with the port number as its argument (here, 5555 is the port number).  
```
	username@Zedboard:~# lenet5_test.elf 5555
```
3. Start the Windows 10 test application and enter the IP address and port number.
4. Press Connect.
5. Open an image file.

I did not add a zoom in/out function to the app, so please resize the image to fit beforehand.

## Model description
The model is a LeNet-5-like deep CNN.  
Input : -1.0 to 1.0  
Conv1 : 1x32x32 -> 6x28x28, ksize = 1x6x5x5, stride = 1  
Pool1 : 6x28x28 -> 6x14x14, average pooling, window size = 2x2, stride = 2  
Conv2 : 6x14x14 -> 16x10x10, ksize = 6x16x5x5, stride = 1  
Pool2 : 16x10x10 -> 16x5x5, average pooling, window size = 2x2, stride = 2  
Conv3 : 16x5x5 -> 120x1x1, ksize = 16x120x5x5, stride = 1  
FC1 : 120x84  
FC2 : 84x10    
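
The layer shapes above follow from the usual no-padding rule `out = (in - window) / stride + 1`. The sketch below is only an illustration of that arithmetic; the names `Shape`, `conv`, `pool`, and `conv_weights` are hypothetical and do not come from this repository.

```cpp
#include <cassert>

// Feature-map shape as channels x height x width.
struct Shape {
    int channels;
    int height;
    int width;
};

// Output length along one axis for a sliding window without padding.
inline int out_size(int in, int window, int stride) {
    assert(in >= window && stride > 0);
    return (in - window) / stride + 1;
}

// Convolution with a square k x k kernel and no padding.
inline Shape conv(Shape in, int out_channels, int k, int stride) {
    return Shape{out_channels,
                 out_size(in.height, k, stride),
                 out_size(in.width, k, stride)};
}

// Pooling keeps the channel count and shrinks the spatial size.
inline Shape pool(Shape in, int window, int stride) {
    return Shape{in.channels,
                 out_size(in.height, window, stride),
                 out_size(in.width, window, stride)};
}

// Number of kernel weights in a convolution layer (biases not counted).
inline int conv_weights(int in_channels, int out_channels, int k) {
    return in_channels * out_channels * k * k;
}
```

Starting from `Shape{1, 32, 32}`, applying `conv(s, 6, 5, 1)`, `pool(s, 2, 2)`, `conv(s, 16, 5, 1)`, `pool(s, 2, 2)`, and `conv(s, 120, 5, 1)` in turn reproduces the 6x28x28, 6x14x14, 16x10x10, 16x5x5, and 120x1x1 shapes listed above.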

## Environments
I used a ZedBoard (Zynq 7Z020) for testing.  

HW functions: `CONVOLUTION_LAYER_1`, `CONVOLUTION_LAYER_2`, and `CONVOLUTION_LAYER_3`. The clock frequency is set to 100 MHz.
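
For readers unfamiliar with HLS-style C++, here is a minimal sketch of what a first convolution layer (1x32x32 -> 6x28x28, 5x5 kernels, stride 1) can look like as a plain loop nest. It is not the repository's `CONVOLUTION_LAYER_1`: the function name `conv_layer_1`, its signature, and the array layout are assumptions, and no activation function is applied because the model description does not specify one.

```cpp
// Dimensions taken from the Conv1 line of the model description.
constexpr int IN_SIZE = 32;   // input height and width
constexpr int K = 5;          // kernel height and width
constexpr int OUT_CH = 6;     // number of output feature maps
constexpr int OUT_SIZE = 28;  // (32 - 5) / 1 + 1

// Hypothetical signature for illustration only.
void conv_layer_1(const float input[IN_SIZE][IN_SIZE],
                  const float weights[OUT_CH][K][K],
                  const float bias[OUT_CH],
                  float output[OUT_CH][OUT_SIZE][OUT_SIZE]) {
    for (int oc = 0; oc < OUT_CH; ++oc) {
        for (int r = 0; r < OUT_SIZE; ++r) {
            for (int c = 0; c < OUT_SIZE; ++c) {
                // In Vivado HLS, a directive such as "#pragma HLS PIPELINE"
                // is commonly placed on a loop like this one.
                float acc = bias[oc];
                // Sliding-window multiply-accumulate (cross-correlation,
                // which is what CNN frameworks usually call convolution).
                for (int kr = 0; kr < K; ++kr) {
                    for (int kc = 0; kc < K; ++kc) {
                        acc += input[r + kr][c + kc] * weights[oc][kr][kc];
                    }
                }
                output[oc][r][c] = acc;
            }
        }
    }
}
```

Fixed loop bounds and statically sized arrays like these are what let an HLS tool unroll or pipeline the loops; which directives the repository actually uses is not stated here.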


## Accuracy  
	SW accuracy : 98.63% (single precision fp)    
	HW accuracy : 98.63% (single precision fp)  

## Runtime  
	# of images : 10,000, batch size : 1  
	
	SW runtime  : 59.4456 seconds  
	HW runtime  : 16.3954 seconds  

	speedup : 3.63x  
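
The speedup figure is the ratio of the two runtimes, and dividing each runtime by the number of images gives an average per-image latency. The helper names below are hypothetical and serve only to show the arithmetic.

```cpp
#include <cassert>

// Speedup of the hardware run relative to the software run.
inline double speedup(double sw_seconds, double hw_seconds) {
    assert(hw_seconds > 0.0);
    return sw_seconds / hw_seconds;
}

// Average time per image in milliseconds.
inline double ms_per_image(double total_seconds, int images) {
    assert(images > 0);
    return total_seconds * 1000.0 / images;
}
```

With the numbers above, `speedup(59.4456, 16.3954)` is about 3.63, and the per-image averages are about 5.94 ms in software and 1.64 ms in hardware for 10,000 images at batch size 1.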