
Spatial Development Route

When using Spatial and its supported simulation backends to design your hardware, you will follow the design flow shown in this diagram. If you are not familiar with concepts such as RTL, logic synthesis, and placement and routing, please read through the explanation of each step and the terminology in the diagram below.

spatial-design-flow

For the scope of this course, you are required to synthesize your design and run placement and routing on it. The following sections explain how to lower your design and run logic synthesis, placement, and routing, together with instructions on how to run the simulation at each step. (We will not require you to run it on an actual FPGA.)

Three backends are provided to help you check the correctness and improve the performance of your design. This page covers each of them, followed by a list of known issues.


  1. Scalasim
  2. VCS
  3. ZCU
  4. Known Issues

Scalasim

How to run Scalasim simulation

You can run the Scalasim simulation with the following commands (replace $PROJECT_DIRECTORY and $TEST_NAME with your own values):

cd $PROJECT_DIRECTORY  # Change this to your project directory
sbt -Dtest.CS217=true "; testOnly $TEST_NAME"

Simulation reports

Cycle Count

After running your application, artifacts will be generated into gen/CS217/$TEST_NAME. The most important files are:

Both files show the line of code for each controller, so you can match the reported numbers against your source. For example, in lab1’s Lab1Part2DramSramExample, you can use SimulatedExecutionSummary_*.log to see how long each part of your code took, like this:

src-code-cycle-count

perf-break-down

For more detailed information, look at info/PostExecution.html. It contains information such as the initiation interval and latency of the Foreach loops. (If you are not familiar with initiation interval or pipelining, read this.)
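As a rough way to reason about those two numbers, the textbook estimate for a pipelined loop is latency + (N - 1) × II cycles. The sketch below is that generic formula in Python, not Spatial's own performance model; the function name and the example numbers are made up for illustration.

```python
def pipelined_cycles(iterations: int, initiation_interval: int, latency: int) -> int:
    """Textbook cycle estimate for a pipelined loop.

    The first iteration needs `latency` cycles to finish; every later
    iteration starts `initiation_interval` cycles after the previous one.
    """
    if iterations <= 0:
        return 0
    return latency + (iterations - 1) * initiation_interval


# Hypothetical loop: 16 iterations with a 10-cycle body.
print(pipelined_cycles(16, 1, 10))   # fully pipelined (II = 1)   -> 25
print(pipelined_cycles(16, 10, 10))  # no overlap (II = latency)  -> 160
```

Comparing the estimate with the cycle count reported for a controller can hint at whether a loop is limited by its initiation interval or by its latency.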

Resource Utilization

Scalasim also gives you a rough estimate of the resource utilization of your application. Whenever you run a simulation, a Main.json file will be created under gen/CS217/$TEST_NAME/reports/. For example, the file for Lab1Part2DramSramExample in Lab 1 will look like:

{
	"bram": {
		"x224_b1_0": [32, [16], [0], 1],
		"x245_b2_0": [32, [16], [0], 1]
	},
	"fixed_ops": 5
}

You can also use the provided Python script computeResourceUtilization.py, which summarizes the resource utilization more concisely. Run it with the proper $file_name. (Note: memory sizes in its output are given in bits, not bytes.)

python computeResourceUtilization.py $file_name
# ex: python computeResourceUtilization.py gen/CS217/Lab1Part2DramSramExample/reports/Main.json
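If you want to inspect Main.json yourself, a minimal sketch is shown below. It is not the provided computeResourceUtilization.py. It also ASSUMES that each "bram" entry is laid out as [bit width, [dimensions], [padding], buffer depth]; that layout is a guess from the example above and is not documented on this page, so treat the provided script as the authoritative tool.

```python
import json
from math import prod


def bram_bits(entry):
    """Estimate the size of one memory in bits.

    ASSUMPTION: entry == [bit_width, [dims...], [padding...], buffer_depth].
    """
    width, dims, padding, depth = entry
    elements = prod(d + p for d, p in zip(dims, padding))
    return width * elements * depth


def summarize(report):
    """Collapse a parsed Main.json dictionary into per-memory and total bit counts."""
    per_memory = {name: bram_bits(e) for name, e in report.get("bram", {}).items()}
    return {
        "bram_bits": per_memory,
        "total_bram_bits": sum(per_memory.values()),
        "fixed_ops": report.get("fixed_ops", 0),
    }


# The Lab1Part2DramSramExample report shown above.
example = json.loads("""
{
    "bram": {
        "x224_b1_0": [32, [16], [0], 1],
        "x245_b2_0": [32, [16], [0], 1]
    },
    "fixed_ops": 5
}
""")
print(summarize(example))
```

Under that assumed layout, each of the two memories holds 16 elements of 32 bits, so the sketch reports 512 bits per memory.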

Test Results

Terminal: If the simulation succeeds, the terminal will report it.

SimulatedExecutionLog_*.log: You can find this file under the gen/CS217/$TEST_NAME folder. It captures the print statements in your test case.


VCS

How to run VCS simulation

You can run the VCS simulation by running the following command (you should replace $PROJECT_DIRECTORY and $TEST_NAME):

cd $PROJECT_DIRECTORY 
source exports.sh
sbt -Dtest.VCS=true "; testOnly $TEST_NAME" 

For the VCS and ZCU backends, you have to set several environment variables with the source exports.sh command. Use the exports.sh file placed in lab3’s skeleton repository.

Simulation reports

Cycle Count

To see the cycle count for the controllers, open the gen/VCS/$TEST_NAME/info/controller_tree.html.

Resource Utilization

N/A (To get the resource utilization information, you need to do placement and routing, which happens in the synthesis step.)

Test Results

Terminal: If the simulation succeeds, the terminal will report it.

run.log: To view the output of the print statements you inserted, open logs/VCS/$TEST_NAME/run.log.

Tip: If you’re using the Visual Studio Code IDE, hold Ctrl and click the file name in the line Backend run in $PROJECT_DIRECTORY/./logs/VCS/$TEST_NAME//run.log to open the file directly.

At the end of the file, you will see the print statements from your tests (lines 132-135 in the picture below).
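If you would rather pull the end of the log out from a script than scroll through it, a small sketch is shown below. The helper names and the default of 20 lines are made up for illustration; only the logs/VCS/$TEST_NAME/run.log location comes from this page.

```python
from collections import deque


def last_lines(lines, n=20):
    """Return the final `n` lines of an iterable of lines, without trailing newlines."""
    return [line.rstrip("\n") for line in deque(lines, maxlen=n)]


def tail_run_log(test_name, n=20, logs_root="logs/VCS"):
    """Read the last `n` lines of <logs_root>/<test_name>/run.log."""
    with open(f"{logs_root}/{test_name}/run.log", errors="replace") as f:
        return last_lines(f, n)
```

From $PROJECT_DIRECTORY, calling tail_run_log with your test name and printing the returned lines shows the tail of the log, which is where the print statements appear.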


ZCU

How to run the ZCU backend

cd $PROJECT_DIRECTORY 
source exports.sh
sbt -Dtest.ZCU=true "; testOnly $TEST_NAME" 

For the VCS and ZCU backends, you have to set several environment variables with the source exports.sh command. Use the exports.sh file placed in lab3’s skeleton repository. The synthesis process takes roughly 30 minutes to 1 hour to run.

Known issues during synthesis

Synthesis currently runs into some issues when building the host software, so you will see an error message in the terminal that looks like:

However, this does not affect the synthesis stage of your hardware design. Check if the end.log file has been properly created under gen/ZCU/$TEST_NAME/. The end.log contains the timestamp when the hardware synthesis is completed (this will usually be a 10-digit number). If you can find this file and can see the resource utilization report, then your hardware design has been successfully synthesized. The following section will explain how to see the resource utilization report.
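The end.log check can also be scripted. The sketch below ASSUMES that end.log contains nothing but the 10-digit number and that the number is a Unix timestamp in seconds; neither is confirmed here beyond the description above, and the function names are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def parse_end_log(text):
    """Convert the content of end.log into a UTC datetime.

    ASSUMPTION: the file holds a single Unix timestamp in seconds.
    """
    stamp = text.strip()
    if not stamp.isdigit():
        raise ValueError(f"unexpected end.log content: {stamp!r}")
    return datetime.fromtimestamp(int(stamp), tz=timezone.utc)


def synthesis_finished(test_name, gen_root="gen/ZCU"):
    """Return the completion time, or None if end.log has not been created."""
    end_log = Path(gen_root) / test_name / "end.log"
    if not end_log.is_file():
        return None
    return parse_end_log(end_log.read_text())
```

A return value of None means end.log is missing, in which case synthesis has probably not completed.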

Synthesis reports

Resource Utilization

After the synthesis finishes, you will have access to the report of your design’s resource utilization on the target FPGA. The report is located in gen/$TEST_NAME/verilog-zcu/. The resource utilization report is named par_utilization.rpt.
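For a quick summary without opening the report by hand, the sketch below pulls the Used and Available columns out of the report's tables. It ASSUMES the pipe-delimited table layout that Vivado utilization reports commonly use, with columns named "Used" and "Available"; check it against your own par_utilization.rpt. The sample text contains made-up numbers for illustration only.

```python
def parse_utilization_rows(report_text):
    """Map each resource name to (used, available) from pipe-delimited tables.

    ASSUMPTION: tables look like '| Site Type | Used | ... | Available | Util% |'.
    """
    rows = {}
    header = None
    for raw in report_text.splitlines():
        line = raw.strip()
        if line.startswith("+"):          # table border
            continue
        if not (line.startswith("|") and line.endswith("|")):
            header = None                 # left the table
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if "Used" in cells and "Available" in cells:
            header = cells                # start of a new table
            continue
        if header is None or len(cells) != len(header):
            continue
        record = dict(zip(header, cells))
        try:
            rows[cells[0]] = (float(record["Used"]), float(record["Available"]))
        except ValueError:
            continue                      # non-numeric row, skip it
    return rows


# Made-up sample in the assumed layout; these are NOT real results.
SAMPLE = """
+-----------+------+-------+-----------+-------+
| Site Type | Used | Fixed | Available | Util% |
+-----------+------+-------+-----------+-------+
| CLB LUTs  |  100 |     0 |      1000 | 10.00 |
| DSPs      |    5 |     0 |        50 | 10.00 |
+-----------+------+-------+-----------+-------+
"""

for name, (used, available) in parse_utilization_rows(SAMPLE).items():
    print(f"{name}: {used:.0f} of {available:.0f}")
```

To use it on a real run, read par_utilization.rpt into a string and pass that string to parse_utilization_rows.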

The figure below shows the basic FPGA structure that consists of an array of:

fpga-diagram (ref: https://www.sciencedirect.com/science/article/pii/S0065245820300899)

You can see how much of each resource your design uses by looking at the following sections of the report par_utilization.rpt:

For students that are trying to draw a roofline model for their application:


If you would like to learn more about the report, watching this video will be helpful. (The video uses ‘Slice’ instead of ‘CLB’, but you can think of them similarly.)

Difference between ‘Slice’ and ‘CLB’
In the context of FPGA design, particularly when using Xilinx FPGAs and the Vivado design suite, the terms “slice” and “Configurable Logic Block (CLB)” refer to specific components of the FPGA architecture.

In summary, the main difference lies in the hierarchy and scale of functionality: a CLB is a larger structural unit in an FPGA that contains multiple slices, which are the actual implementers of logic. The CLB coordinates the operations of its constituent slices to execute complex logic and storage operations. In Vivado, you’ll often deal with both terms when defining and analyzing the physical layout and logical implementation of your FPGA designs.


Known Issues

Scalasim: line buffers

Scalasim can simulate all of the building blocks we have used so far in the labs except for line buffers. If you want to use line buffers, you will have to simulate your design with VCS, as we did in lab3.

Using Math Functions

Math functions such as power, exponential, cos, sin, and random number generation are not provided for you. You have to either implement them manually in your accelerator design or compute the values in the host code region and read them into the accelerator.

If you would like to implement the math functions in your accelerator design, consider one of the following methods. Both algorithms are approximations: for CORDIC, you adjust the number of iterations, and for Taylor expansion, you adjust the number of terms to control accuracy. Therefore, determine what level of precision your application requires and check that your implementation does not affect the correctness or quality of your application:

Some pros and cons of each method:
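To get a feel for how the number of iterations or terms affects accuracy before committing to a hardware implementation, you can prototype both methods in software. The sketch below uses plain Python floats rather than Spatial code. A hardware CORDIC would normally use fixed-point shifts, adds, and a small table of precomputed angles, and the function names and default parameters here are made up for illustration.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC for theta in [-pi/2, pi/2]. Returns (sin, cos)."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Pre-scale by the CORDIC gain so the result comes out unit length.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0       # rotate toward a residual angle of zero
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


def taylor_sin(x, terms=5):
    """Sum the first `terms` terms of sin(x) = x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


# Compare both approximations against math.sin for a few settings.
angle = 0.5
for n in (4, 8, 16):
    print(n, "CORDIC iterations, error:", abs(cordic_sin_cos(angle, n)[0] - math.sin(angle)))
for n in (1, 3, 5):
    print(n, "Taylor terms, error:", abs(taylor_sin(angle, n) - math.sin(angle)))
```

Running the comparison over the input range your application actually uses is one way to choose the smallest setting that still meets your precision requirement.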