Designing Control logic and Memories using Vitis HLS
In this part of the lab, we will replicate the FSM and LUT logic in Vitis HLS, while the GEMM operations will be performed in Lab 3. The flow remains the same as Lab 1.
AWS F2 HLS Flow
Source the AWS scripts and enter the lab directory:
cd ~/aws-fpga/
source hdk_setup
source sdk_setup
cd skeleton-lab-2/
- Enter the part 3 directory.
cd Lab2Part3BasicCondFSMAlt/src/
- Fill in the TODO sections in the source file, and answer the following question in lab2_submit.md:
  - By inspecting the module interface, what is the width of the bram port?
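As a rough illustration of the kind of control logic this part targets, the sketch below models a conditional FSM in plain C++: a loop whose state variable selects a branch and also drives the memory address. The function name `cond_fsm`, the constant `kNumStates`, and the values written are hypothetical and are not taken from the lab's vadd.cpp. Vitis HLS pragmas are left out so that the sketch also compiles with an ordinary C++ compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names and values throughout; this is not the lab's vadd.cpp.
constexpr int kNumStates = 32;  // assumed number of FSM states

// Software model of a conditional FSM: one loop iteration per state.
// The state both selects the branch and drives the memory address.
void cond_fsm(int32_t bram[kNumStates]) {
    assert(bram != nullptr);
    int state = 0;                    // initial state
    while (state < kNumStates) {      // "not done" condition
        if (state < 16) {
            // First half: fill the upper addresses from the top down.
            bram[kNumStates - 1 - state] = state;
        } else {
            // Second half: fill the lower addresses from the bottom up.
            bram[state - 16] = state * 2;
        }
        state = state + 1;            // next-state function
    }
}
```

Calling `cond_fsm` on a 32-entry array leaves 0 in the last entry and 32 in the first, which is a quick way to check the branch conditions before generating RTL.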
- Let's generate RTL from this source code. Run the following:
cd ../design_top/
source setup.sh
make gen_rtl
- The generated RTL is visible here: design_top/design/concat_top.sv. The vadd module definition can be seen at the bottom of the file, and includes the typical clock, reset, start, and done ports, the AXI master port, and the AXI slave control ports that you are familiar with.
- Take a look at the design_top.sv wrapper. Specifically, take a look at the code block starting at line 362, and answer the following question:
  - Why is this code block necessary, specifically for this vadd block? What does it do, and how? (Hint: you don't need to go in depth into how the AXI specifics unfold; just provide a high-level description of what is being done.)
- Proceed to the RTL simulation. Run the following:
make hw_sim
- The RTL testbench can be found under verif/test/design_top_base_test.sv. The test configures the block with the correct bram array pointer, and provides the start signal to the custom logic. When the block finishes, it checks the values written at the bram pointer. Answer the following:
  - What are the RTL Sim data transfer cycles?
  - What are the RTL Sim compute cycles?
  - Recall that the compute cycles include the time it takes for the block to interact with the DRAM. How many DRAM accesses does the custom logic perform? (Hint: count only the accesses made by the logic itself; ignore the testbench.)
  - Given the runtimes you've seen in this and Lab 1's simulations, as well as what you can infer from the FSM behaviour, provide an estimate of the cycles taken by the DRAM access, and the cycles taken by the computation.
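One possible way to organise the estimate asked for in the last question is to subtract an assumed DRAM cost from the measured compute cycles. The helper below is only a sketch of that arithmetic: `split_cycles`, `CycleSplit`, and the parameter names are hypothetical, and it assumes every DRAM access costs roughly the same number of cycles, which may not hold for burst transfers.

```cpp
#include <cassert>
#include <cstdint>

// All inputs are placeholders that you measure or estimate yourself;
// none of them come from the lab's logs.
struct CycleSplit {
    uint64_t dram_cycles;     // cycles attributed to DRAM accesses
    uint64_t compute_cycles;  // cycles left over for the logic itself
};

// Split a measured "compute cycles" figure into DRAM time and logic time,
// assuming each DRAM access costs about the same number of cycles.
CycleSplit split_cycles(uint64_t total_cycles,
                        uint64_t num_dram_accesses,
                        uint64_t cycles_per_access) {
    uint64_t dram = num_dram_accesses * cycles_per_access;
    assert(dram <= total_cycles);  // the estimate cannot exceed the total
    return CycleSplit{dram, total_cycles - dram};
}
```

For instance, with made-up inputs of 1000 total cycles, 2 accesses, and 300 cycles per access, the helper attributes 600 cycles to DRAM and 400 to computation. Substitute the figures from your own simulation logs.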
- We can now move on to the FPGA test. First, we need to perform synthesis and implementation by running the following command:
make fpga_build
- After the build finishes, generate the Amazon FPGA Image (AFI) as follows:
make generate_afi
- We now need to wait until the AFI becomes available. Run the following command. The AFI will most likely be listed as "Pending", and will take about 20 minutes to become available. Run the command periodically and only proceed when it shows "Available".
make check_afi_available
- Now that the AFI is available, program the FPGA, and run the FPGA test.
make program_fpga
make run_fpga_test
- The FPGA test loads our custom logic to the FPGA, and then executes the C testbench under software/src/design_top.c. This testbench mimics our RTL test. Answer the following questions:
  - What are the RTL Sim data transfer cycles?
  - What are the RTL Sim compute cycles?
Your Turn
- Fill in the LUT initialization code TODO in Lab2Part4LUT/src/vadd.cpp and go through the entire F2 HLS flow, answering the relevant questions in lab2_submit.md.
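For orientation, a lookup table in C++ is commonly written as a constant array initialized at its declaration, which an HLS tool can typically implement as read-only memory. The sketch below shows that pattern only. The table size, its values, and the names `kLut` and `lut_lookup` are hypothetical and do not reflect the contents expected in the lab's vadd.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 3x3 table; the sizes and values are illustrative only.
constexpr int kRows = 3;
constexpr int kCols = 3;

// A constant array initialized at declaration describes read-only data;
// its contents are fixed at compile time.
static const int32_t kLut[kRows][kCols] = {
    {10, 20, 30},
    {40, 50, 60},
    {70, 80, 90},
};

// Read one entry; both indices must lie inside the table.
int32_t lut_lookup(int row, int col) {
    assert(row >= 0 && row < kRows);
    assert(col >= 0 && col < kCols);
    return kLut[row][col];
}
```

Here `lut_lookup(1, 2)` reads the entry in the second row and third column of the table, which is 60 for the values above.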
Submission
Make sure all the files below are included in your GitHub Classroom repository.
Part 1
You should add the following implementation to each file and answer Part 1 of lab2_submit.md.
- In Lab2Part1.scala: Fill in the commented TODO section in Lab2Part2SimpleMemFold. For more details, refer to the MemReduce, MemFold section.
- In Lab2Part3.scala: Fill in the commented TODO section in Lab2Part3BasicCondFSMAlt. For more details, refer to the FSM section.
- In Lab2Part4.scala: Fill in the commented TODO section in Lab2Part4LUT. For more details, refer to the LUT section.
- In Lab2GEMM.scala: Fill in the commented TODO section in Lab2Part5GEMM and Lab2Part6GEMM. For more details, refer to Lab2Part5GEMM, Lab2Part6GEMM.
Part 2
You should add the following implementation to each file and answer Part 2 of lab2_submit.md.
- In Lab2Part3BasicCondFSMAlt/src/vadd.cpp: Fill in the commented TODO section.
- In Lab2Part4LUT/src/vadd.cpp: Fill in the commented TODO section.
- Make sure the logs directory of each part contains gen_rtl.log.txt, hw_sim.log.txt, and fpga_test.log.txt in your GitHub Classroom repository.
Gradescope
- Gradescope: submit a doc with your commit ID and repo (for the entire Lab 2). Be sure to push all the changes required for submission (Part 1 and Part 2).