# 5. Faster ROS 2 publisher

|   | Source code |
|---|----------|
| [`faster_doublevadd_publisher`](https://github.com/ros-acceleration/acceleration_examples/tree/main/nodes/faster_doublevadd_publisher) | |
| kernel | [`vadd.cpp`](https://github.com/ros-acceleration/acceleration_examples/blob/main/nodes/faster_doublevadd_publisher/src/vadd.cpp) |
| publisher | [`faster_doublevadd_publisher.cpp`](https://github.com/ros-acceleration/acceleration_examples/blob/main/nodes/faster_doublevadd_publisher/src/faster_doublevadd_publisher.cpp) |

```eval_rst
.. sidebar:: Before you begin

    Past examples of this series include:

    - `4. Accelerated ROS 2 publisher - accelerated_doublevadd_publisher <4_accelerated_ros2_publisher.html>`_, which offloads and accelerates the `vadd` operation to the FPGA, optimizing the dataflow and leading to a deterministic vadd operation with an improved publishing rate of :code:`6.3 Hz`.
    - `3. Offloading ROS 2 publisher - offloaded_doublevadd_publisher <3_offloading_ros2_publisher.html>`_, which offloads the `vadd` operation to the FPGA and leads to a deterministic vadd operation, yet an insufficient overall publishing rate of :code:`1.935 Hz`.
    - `0. ROS 2 publisher - doublevadd_publisher <0_ros2_publisher.html>`_, which runs completely on the scalar quad-core Cortex-A53 Application Processing Units (APUs) of the KV260 and is only able to publish at :code:`2.2 Hz`.
```

This example is the last one of the *ROS 2 publisher series*. It features a trivial vector-add ROS 2 publisher, which adds two vector inputs in a loop and tries to publish the result at 10 Hz. This example leverages KRS to produce an acceleration kernel that a) optimizes the dataflow and b) exploits parallelism via loop unrolling, in order to meet the initial 10 Hz goal established by the `doublevadd_publisher`.

- [4.
Accelerated ROS 2 publisher - `accelerated_doublevadd_publisher`](4_accelerated_ros2_publisher/), which offloads and accelerates the `vadd` operation to the FPGA, optimizing the dataflow and leading to a deterministic vadd operation with an improved publishing rate of `6.3 Hz`.
- [3. Offloading ROS 2 publisher - `offloaded_doublevadd_publisher`](3_offloading_ros2_publisher/), which offloads the `vadd` operation to the FPGA and leads to a deterministic vadd operation, yet an insufficient overall publishing rate of `1.935 Hz`.
- [0. ROS 2 publisher - `doublevadd_publisher`](0_ros2_publisher/), which runs completely on the scalar quad-core Cortex-A53 Application Processing Units (APUs) of the KV260 and is only able to publish at `2.2 Hz`.

```eval_rst
.. important::
    The examples assume you've already installed KRS. If not, refer to `install <../install.html>`_.

.. note::
    Learn ROS 2 before trying this out.
```

## `faster_doublevadd_publisher`

Let's take a look at the [kernel source code](https://github.com/ros-acceleration/acceleration_examples/blob/main/nodes/faster_doublevadd_publisher/src/vadd.cpp) first:

```cpp
/*
      ____  ____
     /   /\/   /
    /___/  \  /   Copyright (c) 2021, Xilinx®.
    \   \   \/    Author: Víctor Mayoral Vilches
     \   \
     /   /
    /___/   /\
    \   \  /  \
     \___\/\___\

Inspired by the Vector-Add example.
See https://github.com/Xilinx/Vitis-Tutorials/blob/master/Getting_Started/Vitis

*/

#define DATA_SIZE 4096
// TRIPCOUNT identifier
const int c_size = DATA_SIZE;

extern "C" {
    void vadd(
            const unsigned int *in1,  // Read-Only Vector 1
            const unsigned int *in2,  // Read-Only Vector 2
            unsigned int *out,        // Output Result
            int size                  // Size in integer
            )
    {
#pragma HLS INTERFACE m_axi port=in1 bundle=aximm1
#pragma HLS INTERFACE m_axi port=in2 bundle=aximm2
#pragma HLS INTERFACE m_axi port=out bundle=aximm1

        for (int j = 0; j