How-to install and run a pre-built AVED design on an ALVEO card

This is a getting started guide for the Alveo™ Versal™ Example Design (AVED), intended for use with the provided pre-built AVED package file. It explains how to install the package on your server and run key commands to test and evaluate the example design. The pre-built example image includes the AVED base design with the xbtest IP integrated into it. This provides a turnkey card capable of running the canned tests in the xbtest suite. The AVED source files include only the AVED base design; the xbtest IP is not included, and a user application would take its place.


Before You Begin

Dependencies

The Example Design has been tested with the operating systems in the table below. AVED may work with other versions of these operating systems, but those versions have not been tested. All supported operating systems are tested with general access (GA) versions. Ubuntu Hardware Enablement (HWE) is not supported. By default, HWE is disabled in Ubuntu Server and enabled in Ubuntu Desktop.

Operating System | Architecture | Supported Versions | Kernel Version
---------------------------------------------------------------------
Ubuntu           | x86_64       | 22.04              | 5.4.0, 5.15.0
RHEL             | x86_64       | 8.3                | 4.18.0

Note: root/sudo permissions are required.
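
As a quick pre-check, the running kernel can be compared against the kernel versions listed in the table above. The following is a minimal sketch and is not part of the AVED package: the function name kernel_supported is hypothetical, and it compares only the kernel base version, not the distribution release or the HWE status.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with AVED): succeed if a kernel release
# string has one of the base versions listed in the dependency table.
kernel_supported() {
    base=${1%%-*}    # e.g. "4.18.0-240.el8.x86_64" -> "4.18.0"
    case "$base" in
        5.4.0|5.15.0|4.18.0) return 0 ;;
        *)                   return 1 ;;
    esac
}

# Report on the kernel that is currently running.
if kernel_supported "$(uname -r)"; then
    echo "Kernel $(uname -r) is in the tested list"
else
    echo "Kernel $(uname -r) is not in the tested list"
fi
```

A kernel outside the list does not necessarily mean AVED will fail, only that the combination has not been tested.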

Tools

  • Vivado™ 2024.1

Preparing for AVED Installation

  1. Confirm the server is running a supported OS. Follow the instructions provided in the Alveo V80 Data Center Accelerator Cards Installation Guide (UG1617) to physically install the card in your server.

  2. Get the latest AVED installation package for your Alveo card.

    1. Refer to the AVED release table to find the latest example design version. The AVED Git Release Tag will direct you to the tagged source code in GitHub. The AVED Deployment Archive is a zipped AVED installation package; it is pre-built from the tagged source code using the Vivado™ tools version noted in the table.

    2. Download the latest AVED Deployment archive to the server that contains your Alveo card.

    3. Unzip the archive, and find the aved_install.sh script, which will be used in the AVED installation. Review the AVED Deployment Archive for details on the extracted files and directory structure.

  3. Unlike prior Alveo platforms, AVED does not use the Xilinx RunTime Library (XRT). Instead, AVED includes the AVED Management Interface (AMI) as part of the example design. Before AVED installation, ensure that XRT has been removed. It’s not possible to have AMI and XRT running on the same host at the same time.

# --- Ubuntu -----------------------------
# Check if XRT is installed
$ apt list --installed | grep xrt

# Remove XRT if present
$ sudo apt remove xrt

# --- RHEL -------------------------------
# Check if XRT is installed
$ yum list installed | grep xrt

# Remove XRT if present
$ sudo yum remove xrt

AVED Installation

Installing AVED is a two-step process. First, you will install the pre-built AVED package onto the host. Second, you will program AVED into flash on your Alveo card.

To install the package onto the host, navigate to the directory with the aved_install.sh script, and execute this script.

$ sudo ./aved_install.sh

If the installation succeeds, the console displays “AVED Installation Complete.” For detailed instructions with complete example output, see Installing AVED Deployment package onto the Host Server.

To program AVED onto your Alveo card for the first time, update the Flash Partition Table (FPT) image in flash using Vivado HW Manager and JTAG. Refer to AVED Updating FPT Image in Flash for step-by-step instructions. If successful, Vivado HW Manager will report this with a pop-up message: “Flash programming completed successfully.” Cold boot the server to boot AVED from flash.


Testing AVED

Two sets of tools are provided to test the AVED design running in hardware: a suite of AMI commands (ami_tool <command>) and xbtest testcases (xbtest -d <Bus:Device.Function(BDF)> -c <testcase>). A subset of these tools is illustrated below.


ami_tool Overview

The overview command shows basic information including the AMI version, the BDF, the device name, the currently programmed design’s UUID, and the device state. Take note of the BDF of your card (21:00.0 in the example below), which will be used in subsequent commands.

$ ami_tool overview

AMI
-------------------------------------------------------------
Version          | 2.3.0  (0)
Branch
Hash             | 0bab29e568f64a25f17425c0ffd1c0e89609b6d1
Hash Date        | 20240307
Driver Version   | 2.3.0  (0)


BDF       | Device          | UUID                               | AMC          | State
-----------------------------------------------------------------------------------------
21:00.0   | ALVEO V80 ES3   | 23326fac27ccee0161619850bd90a9cc   | 2.3.0  (0)   | READY

If the device state is NOT_READY, cold reboot the system.

Confirm the AMI driver version matches the AMC version on the device before continuing. The versions are compatible when the Major and Minor numbers are the same in the Major.Minor.Patch version format. If they do not match, AMI behavior is undefined.
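
The two checks above (device state and Major.Minor version match) can be scripted. The sketch below assumes the overview output has the layout shown in the example and that piping ami_tool produces the same text as the console; the function names overview_summary and overview_ok are hypothetical and are not part of AMI.

```shell
#!/bin/sh
# Hypothetical helpers (not part of AMI). Both read `ami_tool overview`
# output on stdin, assuming the layout shown in the example above.

# Print one line per device:
#   <BDF> <state> <driver major.minor> <AMC major.minor>
overview_summary() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        function major_minor(v,   p) { split(v, p, "."); return p[1] "." p[2] }
        /^Driver Version/ { split(trim($2), d, " "); drv = major_minor(d[1]) }
        /^[0-9a-fA-F:]+\.[0-7][ \t]*\|/ {
            split(trim($4), a, " ")
            print trim($1), trim($5), drv, major_minor(a[1])
        }
    '
}

# Succeed only if at least one device is listed, every device is READY,
# and the driver and AMC Major.Minor versions agree.
overview_ok() {
    overview_summary | (
        rc=0
        n=0
        while read -r bdf state drv amc; do
            n=$((n + 1))
            if [ "$state" != "READY" ]; then
                echo "$bdf: state is $state" >&2
                rc=1
            fi
            if [ "$drv" != "$amc" ]; then
                echo "$bdf: AMI driver $drv does not match AMC $amc" >&2
                rc=1
            fi
        done
        [ "$n" -gt 0 ] || rc=1
        exit "$rc"
    )
}
```

With these functions loaded in the current shell, a possible invocation is:

$ ami_tool overview | overview_ok && echo "card ready"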

ami_tool mfg_info

The mfg_info command displays the card manufacturing information stored on an EEPROM. The values will vary from card to card.

$ ami_tool mfg_info -d <BDF>

Manufacturing Information
--------------------------------------------------------------------
Eeprom Version              | 4.0
Product Name                | ALVEO V80 PQ
Board Revision              | 1
Serial Number               | XFL1GHU4KQUW
Mac Address 1               | 00:0a:35:18:6a:70
Mac Address N               | 00:0a:35:18:6a:7f
Manufacturing Date          | Sat May  4 23:53:00 2024
UUID                        | 05a178af-1d5b-dd8c-8a44-f8dde589a218
Board Part Num              | A-V80-P64G-PQ-G
Mfg Part Num                | 043-05113-01

ami_tool sensors

The sensors command displays the available sensors and their values. The values may vary. For removable sensors (such as QSFP devices), the Status may appear as “invalid”; this indicates that no device is inserted. A * after a sensor status indicates that the value was retrieved from a cache, i.e., the value had already been updated within the refresh interval (1 second).

$ ami_tool sensors -d <BDF>

Name            |      Value | Status
----------------------------------------
1V2_GTXAVTT     |   33.000 C | valid
                |    8.000 A | valid
                |    1.199 V | valid
----------------------------------------
0V88_VCC_CPM5   |   30.000 C | valid*
                |    3.000 A | valid*
                |    0.880 V | valid*
----------------------------------------
PCB             |   31.000 C | valid*
----------------------------------------
Device          |   38.000 C | valid*
----------------------------------------
VCCINT          |   36.000 C | valid*
                |   31.000 A | valid*
                |    0.800 V | valid*
----------------------------------------
Module_0        |    0.000 C | invalid
----------------------------------------
Module_1        |    0.000 C | invalid
----------------------------------------
Module_2        |    0.000 C | invalid
----------------------------------------
Module_3        |    0.000 C | invalid
----------------------------------------
DIMM            |   29.000 C | valid*
----------------------------------------
1V2_VCC_HBM     |   36.000 C | valid*
                |    4.000 A | valid*
                |    1.200 V | valid*
----------------------------------------
Total_Power     |   61.615 W | valid
----------------------------------------
12V_AUX1        |    1.040 A | valid*
                |   12.200 V | valid*
----------------------------------------
12V_AUX2        |    1.420 A | valid*
                |   12.200 V | valid*
----------------------------------------
1V2_VCCO_DIMM   |    1.040 A | valid*
                |    1.208 V | valid*
----------------------------------------
3V3_PEX         |    1.579 A | valid*
                |    3.312 V | valid*
----------------------------------------
12V_PEX         |    2.179 A | valid*
                |   12.184 V | valid*
----------------------------------------
3V3_QSFP        |    0.020 A | valid*
                |    3.304 V | valid*
----------------------------------------
1V5_VCCAUX      |    1.497 V | valid*
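
Because a sensor name appears only on the first row of each group, the table is awkward to filter directly. The sketch below flattens it into one tab-separated record per reading and splits the trailing * into its own column. It assumes the layout shown above; the function name sensor_report and the labels "cached" and "fresh" are hypothetical and are not part of AMI.

```shell
#!/bin/sh
# Hypothetical helper (not part of AMI): flatten `ami_tool sensors` output
# read from stdin into tab-separated records:
#   <name> <value with unit> <status> <cached|fresh>
sensor_report() {
    awk -F'|' '
        BEGIN { OFS = "\t" }
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Data rows have three fields; skip the header and separator rows.
        NF == 3 && $1 !~ /^Name/ {
            n = trim($1)
            if (n != "") name = n          # blank name: reuse the group name
            status = trim($3)
            # A trailing * marks a value served from the cache.
            cached = sub(/\*$/, "", status) ? "cached" : "fresh"
            print name, trim($2), status, cached
        }
    '
}
```

For example, with the function loaded in the current shell, the readings reported as invalid could be listed with:

$ ami_tool sensors -d <BDF> | sensor_report | awk -F'\t' '$3 == "invalid"'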

xbtest -d <BDF> -c verify

Verify successful AVED installation by running the xbtest verify test, which checks basic communication between the host and xbtest IP’s scratch registers. In the following example, the BDF is 21:00.0; this may differ on your system.

Successful completion of this test is indicated by 0 Errors, 0 Failures, and the message “RESULT: ALL TESTS PASSED”. The Critical Warning ITF_138 “PCIe link speed (4) configuration running on the card does not match expected (5) according to device information” is an indication that the PCIe link speed is limited by the host capability, but is not a failure.

$ xbtest -d 21:00.0 -c verify
INFO      :: GEN_016 :: GENERAL      :: Scanning installed xbtest HW designs...
STATUS    :: MGT_002 :: TIMER        :: Start 1s tick function
INFO      :: GEN_016 :: GENERAL      :: 21:00.0 :: Getting card configuration
INFO      :: GEN_064 :: GENERAL      :: 21:00.0 :: Starting test 1 of 1. Executing: xbtest -d 21:00.0 -j /opt/amd/aved/amd_v80_gen5x8_24.1_xbtest_stress/xbtest/test/verify.json -i /opt/amd/aved/amd_v80_gen5x8_24.1_xbtest_stress/design.pdi -e /opt/amd/aved/amd_v80_gen5x8_24.1_xbtest_stress/xbtest/metadata/xbtest_pfm_def.json
INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
INFO      :: GEN_039 :: GENERAL      :: xbtest version: 7.0.0
INFO      :: GEN_016 :: GENERAL      ::          - SW Build    : 5009311 on Wed Aug 14 10:43:17 2024
INFO      :: GEN_016 :: GENERAL      ::          - Process ID  : 3315
INFO      :: GEN_016 :: GENERAL      :: #####################################################################################################
INFO      :: GEN_016 :: GENERAL      :: System:
INFO      :: GEN_016 :: GENERAL      ::          - Name          : Linux
INFO      :: GEN_016 :: GENERAL      ::          - Release       : 4.18.0-240.el8.x86_64
INFO      :: GEN_016 :: GENERAL      ::          - Version       : #1 SMP Wed Sep 23 05:13:10 EDT 2020
INFO      :: GEN_016 :: GENERAL      ::          - Machine       : x86_64
INFO      :: GEN_039 :: GENERAL      ::          - Running with AMI driver 2.3.0
INFO      :: GEN_016 :: GENERAL      :: #####################################################################################################
INFO      :: GEN_039 :: GENERAL      :: Start of xbtest session at: Wed Aug 21 11:54:09 2024 BST
INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
INFO      :: ITF_008 :: XBT_SW_CFG   :: Using card: ALVEO V80 ES3 (21:00.0)
Starting dynamic display mode...


Repeating last content of dynamic display mode:
  INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
  INFO      :: GEN_039 :: GENERAL      :: xbtest version: 7.0.0
  INFO      :: GEN_039 :: GENERAL      ::        - Running with AMI driver 2.3.0
  INFO      :: GEN_039 :: GENERAL      :: Start of xbtest session at: Wed Aug 21 11:54:09 2024 BST
  INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
  INFO      :: ITF_008 :: XBT_SW_CFG   :: Using card: ALVEO V80 ES3 (21:00.0)
  CRIT WARN :: ITF_138 :: DEVICE       :: PCIe link speed (4) configuration running on the card does not match expected (5) according to device information
  WARNING   :: VER_012 :: VERIFY       :: Verify xbtest HW IP has not access to the DNA

  +------------------+ +-----------------------------------------------------------+ +---------------------------------------------------------------------+
  |                  | |                          STATUS                           | |                            ONGOING TEST                             |
  |     TESTCASE     | |-----------------------------------------------------------| |---------------------------------------------------------------------|
  |                  | | Pending | Completed | Passed | Failed | Errors | Warnings | | Remaining time (s) | Parameters                                     |
  |------------------| |-----------------------------------------------------------| |---------------------------------------------------------------------|
  | verify           | |       0 |         3 |      3 |      0 |      0 |        1 | |                n/a | n/a                                            |
  +------------------+ +-----------------------------------------------------------+ +---------------------------------------------------------------------+

  Card status: Power: 63.3 W; Temperature: 38 C; Qty of measurements: 1

  Messages stats: 1 Warnings, 1 Critical Warnings, 27 Passes, 0 Errors, 0 Failures encountered

  Total elapsed: 7 s


INFO      :: GEN_040 :: GENERAL      :: ############################################## SUMMARY ##############################################
INFO      :: GEN_040 :: GENERAL      :: End of xbtest session at: Wed Aug 21 11:54:14 2024 BST
INFO      :: GEN_040 :: GENERAL      :: 1 Warnings, 1 Critical Warnings, 26 Passes, 0 Errors, 0 Failures encountered
INFO      :: GEN_040 :: GENERAL      :: #####################################################################################################
PASS      :: GEN_024 :: GENERAL      :: RESULT: ALL TESTS PASSED
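
The pass criteria described above (0 Errors, 0 Failures, and the final result message) can be checked against a saved copy of the console output. This is a minimal sketch that assumes the log has the format shown in the example; the function name xbtest_passed and the file name verify.log are hypothetical, and the sketch makes no assumption about the exit status of xbtest itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of xbtest): read an xbtest console log on
# stdin and succeed only if the GEN_040 summary line reports 0 Errors and
# 0 Failures and the final result message is present.
xbtest_passed() {
    awk '
        /GEN_040 .* 0 Errors, 0 Failures encountered/ { clean = 1 }
        /RESULT: ALL TESTS PASSED/                    { passed = 1 }
        END { exit !(clean && passed) }
    '
}
```

For example, with the console output saved to verify.log and the function loaded in the current shell:

$ xbtest_passed < verify.log && echo "verify passed"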

xbtest -d <BDF> -c memory

Verify the DDR and HBM memory performance by running the xbtest memory test, which checks communication between the xbtest IP and available memories. Read and write bandwidth and latency measurements are logged to .csv files for detailed review. In the following example, the BDF is 21:00.0. This may differ on your system.

Successful completion of this test is indicated by 0 Errors, 0 Failures, and the message “RESULT: ALL TESTS PASSED”. The Critical Warning ITF_138 “PCIe link speed (4) configuration running on the card does not match expected (5) according to device information” is an indication that the PCIe link speed is limited by the host capability, but is not a failure.

$ xbtest -d 21:00.0 -c memory
INFO      :: GEN_016 :: GENERAL      :: Scanning installed xbtest HW designs...
STATUS    :: MGT_002 :: TIMER        :: Start 1s tick function
INFO      :: GEN_016 :: GENERAL      :: 21:00.0 :: Getting card configuration
INFO      :: GEN_064 :: GENERAL      :: 21:00.0 :: Starting test 1 of 1. Executing: xbtest -d 21:00.0 -j /opt/amd/aved/amd_v80_gen5x8_24.1_xbtest_stress/xbtest/test/memory.json -i /opt/amd/aved/amd_v80_gen5x8_24.1_xbtest_stress/design.pdi -e /opt/amd/aved/amd_v80_gen5x8_24.1_xbtest_stress/xbtest/metadata/xbtest_pfm_def.json
INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
INFO      :: GEN_039 :: GENERAL      :: xbtest version: 7.0.0
INFO      :: GEN_016 :: GENERAL      ::          - SW Build    : 5009311 on Wed Aug 14 10:43:17 2024
INFO      :: GEN_016 :: GENERAL      ::          - Process ID  : 3614
INFO      :: GEN_016 :: GENERAL      :: #####################################################################################################
INFO      :: GEN_016 :: GENERAL      :: System:
INFO      :: GEN_016 :: GENERAL      ::          - Name          : Linux
INFO      :: GEN_016 :: GENERAL      ::          - Release       : 4.18.0-240.el8.x86_64
INFO      :: GEN_016 :: GENERAL      ::          - Version       : #1 SMP Wed Sep 23 05:13:10 EDT 2020
INFO      :: GEN_016 :: GENERAL      ::          - Machine       : x86_64
INFO      :: GEN_039 :: GENERAL      ::          - Running with AMI driver 2.3.0
INFO      :: GEN_016 :: GENERAL      :: #####################################################################################################
INFO      :: GEN_039 :: GENERAL      :: Start of xbtest session at: Wed Aug 21 12:01:00 2024 BST
INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
INFO      :: ITF_008 :: XBT_SW_CFG   :: Using card: ALVEO V80 ES3 (21:00.0)
Starting dynamic display mode...


Repeating last content of dynamic display mode:
  INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
  INFO      :: GEN_039 :: GENERAL      :: xbtest version: 7.0.0
  INFO      :: GEN_039 :: GENERAL      ::        - Running with AMI driver 2.3.0
  INFO      :: GEN_039 :: GENERAL      :: Start of xbtest session at: Wed Aug 21 12:01:00 2024 BST
  INFO      :: GEN_039 :: GENERAL      :: #####################################################################################################
  INFO      :: ITF_008 :: XBT_SW_CFG   :: Using card: ALVEO V80 ES3 (21:00.0)
  CRIT WARN :: ITF_138 :: DEVICE       :: PCIe link speed (4) configuration running on the card does not match expected (5) according to device information
  WARNING   :: VER_012 :: VERIFY       :: Verify xbtest HW IP has not access to the DNA

  +------------------+ +-----------------------------------------------------------+ +---------------------------------------------------------------------+
  |                  | |                          STATUS                           | |                            ONGOING TEST                             |
  |     TESTCASE     | |-----------------------------------------------------------| |---------------------------------------------------------------------|
  |                  | | Pending | Completed | Passed | Failed | Errors | Warnings | | Remaining time (s) | Parameters                                     |
  |------------------| |-----------------------------------------------------------| |---------------------------------------------------------------------|
  | memory                                                                                                                                                 |
  |              DDR | |       0 |         4 |      4 |      0 |      0 |        2 | |                n/a | n/a                                            |
  |              HBM | |       0 |         4 |      4 |      0 |      0 |        5 | |                n/a | n/a                                            |
  |------------------| |-----------------------------------------------------------| |---------------------------------------------------------------------|
  | verify           | |       0 |         3 |      3 |      0 |      0 |        1 | |                n/a | n/a                                            |
  +------------------+ +-----------------------------------------------------------+ +---------------------------------------------------------------------+

  Card status: Power: 64.5 W; Temperature: 43 C; Qty of measurements: 137

  Messages stats: 1 Warnings, 8 Critical Warnings, 67 Passes, 0 Errors, 0 Failures encountered

  Message history (limited to the 10 last ones)
  CRIT WARN :: MEM_047 :: MEMORY       :: HBM :: The host application was not able to read xbtest HW IP measurements every second. 3 measurements were done during 20 seconds of test
  CRIT WARN :: MEM_047 :: MEMORY       :: DDR :: The host application was not able to read xbtest HW IP measurements every second. 17 measurements were done during 20 seconds of test

  Total elapsed: 142 s


INFO      :: GEN_040 :: GENERAL      :: ############################################## SUMMARY ##############################################
INFO      :: GEN_040 :: GENERAL      :: End of xbtest session at: Wed Aug 21 12:03:20 2024 BST
PASS      :: GEN_022 :: GENERAL      :: memory DDR test passed
PASS      :: GEN_022 :: GENERAL      :: memory HBM test passed
INFO      :: GEN_040 :: GENERAL      :: 1 Warnings, 8 Critical Warnings, 66 Passes, 0 Errors, 0 Failures encountered
INFO      :: GEN_040 :: GENERAL      :: #####################################################################################################
PASS      :: GEN_024 :: GENERAL      :: RESULT: ALL TESTS PASSED
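
Since critical warnings such as ITF_138 do not fail a test, it can help to see which critical warning IDs a run actually produced. The sketch below counts them per message ID from a saved log, assuming the " :: "-separated message format shown above; the function name crit_warn_ids and the file name memory.log are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of xbtest): read an xbtest console log on
# stdin and print each critical warning message ID with the number of
# "CRIT WARN" lines that carry it, sorted by ID.
crit_warn_ids() {
    awk -F' :: ' '
        /^[ \t]*CRIT WARN/ {
            id = $2
            gsub(/[ \t]/, "", id)   # drop any padding around the ID
            count[id]++
        }
        END { for (id in count) print id, count[id] }
    ' | sort
}
```

For example, with the function loaded in the current shell:

$ crit_warn_ids < memory.log

The counts reflect only the lines present in the captured log. The message history in the console output is limited to the last ten messages, so the counts may be lower than the totals in the summary line.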

Troubleshooting

For debugging information, see AVED Debug Techniques.

Next Steps

This getting started guide covered AVED installation and basic hardware testing commands. If you would like to explore the full set of hardware test commands available, see:

Notably, some interesting commands to try are:

For a quick look at how to build AVED, refer to the getting started guide: How-to Rebuild an AVED Design for Yourself.


Page Revision: v. 49