& write_stream,
int elements) {
int pixel = 0;
while(elements--) {
write_stream >> outFrame[pixel++];
}
}
```
This function is simple: it reads `elements` values from `write_stream`, one per loop iteration, and writes them to the `outFrame` buffer in global memory.
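To experiment with this drain loop on a host machine without the HLS headers, a minimal sketch along the following lines may help. It is not the tutorial's kernel code: `SimStream` is a hypothetical stand-in for `hls::stream` built on `std::queue`, and `drain_stream` and `round_trip` are illustrative names chosen here.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for hls::stream<T> (assumption: only the FIFO
// behavior and operator>> used by the kernel are modeled here).
template <typename T>
class SimStream {
public:
    void write(const T& value) { fifo_.push(value); }

    T read() {
        assert(!fifo_.empty() && "read from an empty stream");
        T value = fifo_.front();
        fifo_.pop();
        return value;
    }

    bool empty() const { return fifo_.empty(); }

    // Mirrors the `stream >> destination` form used in the kernel code.
    SimStream& operator>>(T& out) {
        out = read();
        return *this;
    }

private:
    std::queue<T> fifo_;
};

// Same loop shape as the output stage: pop `elements` values from the
// stream and store them consecutively in outFrame.
template <typename T>
void drain_stream(T* outFrame, SimStream<T>& write_stream, int elements) {
    int pixel = 0;
    while (elements--) {
        write_stream >> outFrame[pixel++];
    }
}

// Illustrative helper: push pixels into a stream, drain them back out.
std::vector<int> round_trip(const std::vector<int>& pixels) {
    SimStream<int> stream;
    for (int p : pixels) {
        stream.write(p);
    }
    std::vector<int> out(pixels.size());
    drain_stream(out.data(), stream, static_cast<int>(pixels.size()));
    return out;
}
```

Because the stream is a FIFO, `round_trip` returns the pixels in the order they were written, which is the property the output stage relies on.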
## Run Hardware Emulation for Dataflow
Go to the `makefile` directory and use the following command to run hardware emulation.
```
make run TARGET=hw_emu STEP=dataflow SOLUTION=1 NUM_FRAMES=1
```
You should see the following results.
```
Processed 0.02 MB in 29.728s (0.00 MBps)
INFO: [Vitis-EM 22] [Wall clock time: 23:03, Emulation time: 0.108257 ms] Data transfer between kernel(s) and global memory(s)
convolve_fpga_1:m_axi_gmem1-DDR[0] RD = 20.000 KB WR = 0.000 KB
convolve_fpga_1:m_axi_gmem2-DDR[0] RD = 0.000 KB WR = 20.000 KB
convolve_fpga_1:m_axi_gmem3-DDR[0] RD = 0.035 KB WR = 0.000 KB
```
## View the Profile Summary Report for Hardware Emulation
Use the following command to view the Profile Summary report.
```
make view_run_summary TARGET=hw_emu STEP=dataflow
```
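The kernel execution time is now reduced to 0.059 ms, down from 0.46 ms in the fixed-point version.
Here is the updated table. The factors in parentheses are relative to the previous step.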
| Step | Image Size | Time (HW-EM) (ms) | Reads (KB) | Writes (KB) | Avg. Read (KB) | Avg. Write (KB) | Bandwidth (MBps) |
| :--------------- | :--------- | ---------------: | --------------: | ---------: | -------------: | --------------: | ---------: |
| baseline | 512x10 | 3.903 | 344 | 20.0 | 0.004 | 0.004 | 5.2 |
| localbuf | 512x10 | 1.574 (2.48x) | 21 (0.12x) | 20.0 | 0.064 | 0.064 | 13 |
| fixed-point data | 512x10 | 0.46 (3.4x) | 21 | 20.0 | 0.064 | 0.064 | 44 |
| dataflow | 512x10 | 0.059 (7.8x) | 21 | 20.0 | 0.064 | 0.064 | 347 |
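As a sanity check on the table, the sketch below recomputes the parenthesized factors as the ratio of the preceding row's kernel time to the current one. It also estimates the bandwidth column under an assumption that is not stated in the text: that bandwidth is the bytes written for one 512x10 frame at 4 bytes per pixel (20480 bytes) divided by the kernel time, with 1 MB taken as 10^6 bytes. The function and constant names are illustrative.

```cpp
#include <cassert>
#include <cmath>

// Step-over-step speedup: previous kernel time divided by current kernel time.
double speedup(double previous_ms, double current_ms) {
    assert(current_ms > 0.0);
    return previous_ms / current_ms;
}

// Throughput in MBps (1 MB = 1e6 bytes) for `bytes` moved in `time_ms` milliseconds.
double bandwidth_mbps(double bytes, double time_ms) {
    assert(time_ms > 0.0);
    const double seconds = time_ms / 1000.0;
    return bytes / seconds / 1.0e6;
}

// Round to a fixed number of decimal places for comparison with the table.
double round_to(double value, int decimals) {
    const double scale = std::pow(10.0, decimals);
    return std::round(value * scale) / scale;
}

// Assumption: one 512x10 frame with 4 bytes per pixel, i.e. 20480 bytes.
constexpr double kFrameBytes = 512.0 * 10.0 * 4.0;
```

For example, `speedup(0.46, 0.059)` evaluates to about 7.8 and `bandwidth_mbps(kFrameBytes, 0.059)` to about 347, in line with the dataflow row. Multiplying the three step factors, or equivalently dividing 3.903 by 0.059, gives a cumulative speedup of roughly 66x over the baseline.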
## Next Step
You have now applied several optimizations to the hardware kernel to improve its performance. In the next section, you will look at host code optimizations, such as [using out-of-order queues and multiple compute units](./multi-CU.md).
[function_pipeline]: ./images/4_function_pipelining.png "Function Pipelining"
[dataflow]: ./images/dataflow.png "Dataflow"
[dataflow_hwemu_profilesummary]: ./images/191_dataflow_hwemu_pfsummary_new_2.jpg "Dataflow version hardware emulation profile summary"
Copyright© 2020 Xilinx