SC Firmware Update

In the Alveo™ U30 Hyperscaler only SKU, the satellite controller supports out-of-band method of SC FW upgrade in Xilinx® Alveo™ cards. The out-of-band SC FW update is supported at I2C address 0x65 (0xCA in 8-bit). Server BMC is expected to initiate the FW upgrade process by sending I2C commands to the SC FW. After the initial handshake with the SC FW, the server BMC will need to communicate with the MSP432 boot loader (BSL) to transfer the FW into MSP Flash and complete the upgrade process.

Note: Currently, the SC FW upgrades are always force upgrades, there is no version check supported by MSP’s Bootloader. The old FW file will be overwritten by the new FW. Server BMC is expected to check and decide if the SC FW upgrade is needed. I2C speed of only 100 KHz is supported for all the commands mentioned in this chapter.

The following table lists the commands supported/needed for the FW upgrade.

Table: BMC to SC Commands

Command Code

Command Name

Description

0x04

GET_SC_FW_VER

Get SC FW version (xx.yy.zz format)

0x31

GET_SC_STATUS

Returns status about what is running in MSP, FW, or BSL.

0x32

ENABLE_BSL_MODE

OoB command to reboot the SC and invoke BSL.

Table: BMC to BSL Commands

Command Code

Command Name

Description

0x31

GET_SC_STATUS

Returns status weather SC is in SC FW mode or BSL mode

0x21

BSL_RX_PASSWORD

Sends 56 bytes password to unlock the BSL

0x15

BSL_ERASE_SC_FW

BSL erases old SC FW

0x20

BSL_RX_DATA_BLOCK

Sends 32-bit data block to write (256 bytes max)

0x26

BSL_CRC_CHECK

Ask BSL to perform CRC check for validation

0x27

BSL_LOAD_PC

Jump to the SC’s application FW , after FW upgrade

Table: SC flash write and read-back Commands

Command Code

Command Name

Description

0x34

GET_SC_FLASH_WRITE_STATUS

SC sends the status for SC sector writes

0x35

GET_SC_WRITE_SECTOR_RANGE

SC sends the SC flash sector range for writes

0x36

SC_FLASH_WRITE_DATA_BLOCK

BMC sends payload for writes into SC sectors

0x37

SC_FLASH_READ_DATA_BLOCK

SC sends entire SC flash data to BMC

0x31 - GET_SC_STATUS (SC firmware)

The GET_SC_STATUS command serves as the status command, revealing if the MSP432 processor is running in the application code (SC FW) or in BSL mode. Upon receiving this command, the SC FW responds with 0x02 in Byte 0.

Note: The same command is supported by BSL. BSL will respond saying ‘am in BSL mode’.

Table: GET_SC_STATUS Server BMC Request

Server BMC Request

Command code

0x31

Data bytes

N/A

Table: GET_SC_STATUS Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

Byte 0

0x02

Byte 1

N/A

0x32 - ENABLE_BSL_MODE

Upon receiving the ENABLE_BSL_MODE command, the SC FW configures FW update mode in the BSL and reboots itself. The next boot up takes the control to BSL mode. Absence of this step results in normal reboots, where the application code/FW will boot-up instead of staying in BSL to enable the FW update process.

Note: For this command, the SC FW will not be able to respond to the BMC with success or failure before rebooting itself.

Table: ENABLE_BSL_MODE Server BMC Request

Server BMC Request

Command code

0x32

Data bytes

N/A

Table: ENABLE_BSL_MODE Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

N/A

N/A

BSL Communication

IMPORTANT! The following is a recommendation from TI. Refer to TI’s MSP432P4xx SimpleLink Microcontrollers Bootloader user guide (BSL)for more information.

The I2C protocol used by the BSL is defined as:

  • The master must request data from the BSL slave.

  • 7-bit addressing mode is used. By default, the slave listens to address 0x65 (0xCA 8-bit).

  • In addition to the I2C protocol-based hardware ACK, handshake for commands is performed by an acknowledged character in the BSL core response format, as specified in the I2C BSL response table of MSP432P4xx SimpleLink Microcontrollers Bootloader (BSL).

  • Repeated starts are not required by the BSL, but can be used.

  • TI recommends waiting 1.2 ms after sending a command to the BSL and receiving the response. TI also recommends waiting 1.2 ms before sending the next command after a response was received.

  • The I2C BSL interface supports a maximum clock speed of 400 kHz.

CRC Calculation

For the purposes of CRC calculation in the BSL, the MSP432 device performs a 16-bit CRC check using the CRC16-CCITT standard. This CRC signature is based on the polynomial given in the CRC16-CCITT with the following function:

f (x) = x16 + x12 + x5 + 1

CRC Checksum Low, CRC Checksum High

The checksum is computed on bytes in the BSL core command section only. The BSL uses CRC16-CCITT for the checksum and computes it using the MSP432 CRC module. CRC bytes (CKL, CKH) are mandatory for all commands. The ACK, header, and length bytes must be ignored.

Length Low Byte, Length High Byte

Length low byte, length high byte is the number of bytes in the BSL core data packet, broken into high and low bytes. The number of bytes must include only core data packets, as detailed below, and does not include the length bytes and checksum bytes.

  • Command code

  • All address bytes (if applicable)

  • All data bytes (if applicable)

Note: All commands with prefix BSL_ are core commands supported by BSL. The request and response bytes are pre-defined by TI.

0x31 - GET_SC_STATUS (BSL)

The GET_SC_STATUS command serves as a status command telling whether the MSP432 processor is running the application code (SC FW) or in BSL mode. Upon receiving this command, BSL responds with 0x01 in byte 0 MSP BSL mode. Byte 1 serves as status byte.

Note: The same command is supported by the SC application FW, where the SC responds with SC FW mode.

Table: GET_SC_STATUS (BSL) Server BMC Request

Server BMC Request

Command code

0x31

Data bytes

N/A

Table: GET_SC_STATUS (BSL) Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

Byte 0

0x01

Byte 1 (status)

0x00: BSL_OK

0x01: BSL_CRC_CHECK_FAIL

0x02: BSL_PARTIAL_FW_UPGRADE

0x03: BSL_FLASH_WRITE_ERROR

0x21 - BSL_RX_PASSWORD

The BSL core receives the password contained in the packet and unlocks the BSL protected commands if the password matches the 56 bytes in the BSL. When an incorrect password is given, BSL responds with Password Error and subsequent commands sent to the BSL result in no-operation.

Note: Contact Xilinx® for the password information.

Table: BSL_RX_PASSWORD Server BMC Request

Server BMC Request

Header

0x80

Length (low byte)

0x39

Length (high byte)

0x00

Command code

0x21

Data bytes

D1…D56

D1-D56–Xilinx Password D57–D256–0xFF

Table: BSL_RX_PASSWORD Xilinx Alveo Card (BSL) Response

Xilinx Alveo Card (BSL) Response

Data bytes B0 … B7

B0: ACK

0x00

B1: Header

0x80

B2: Length

0x02

B3: Length

0x00

B4: CMD

0x3B

B5: Message

0x00 – Operation successful

0x04– BSL locked. Password incorrect resulted in BSL locking

0x05– BSL password error. Incorrect password sent to unlock BSL

0x07– Unknown Command

B6: CKL

0x60

B7: CKH

0xC4

Table: BSL_RX_PASSWORD BSL Command Response for a Successful Password

ACK

Header

Length

Length

CMD

MSG

CKL

CKH

0x00

0x80

0x02

0x00

0x3B

0x00

0x60

0xC4

Table: BSL_RX_PASSWORD Command Example

Header

Length

Length

CMD

D1

D2

D3

D4

D5

D6

0x80

0x01

0x01

0x21

0xFF

0xFF

0xFF

0xFF

0xFF

0xFF

D7

/././.

D251

D252

D253

D254

D255

D256

CKL

CKH

0xFF

0xFF

0xFF

0xFF

0xFF

0xFF

0xFF

0xFF

0xAD

0x08

0x15 - BSL_ERASE_SC_FW

The BSL_ERASE_SC_FW command erases the entire SC FW code in the MSP432 MCU flash. Other flash sectors will not be erased. This function does not erase RAM.

Note: Allow at least 1 second for the erase operation to complete before proceeding with next set of commands.

Table: BSL_ERASE_SC_FW Server BMC Request

Server BMC Request

Header

0x80

Length (low byte)

0x01

Length (high byte)

0x00

Command code

0x15

CKL

TBD

CKH

TBD

Table: BSL_ERASE_SC_FW Xilinx Alveo Card (BSL) Response

Xilinx Alveo Card (BSL) Response

Data bytes B0 … B7

B0: ACK

0x00

B1: Header

0x80

B2: Length

0x02

B3: Length

0x00

B4: CMD

0x3B

B5: Message

0x00 – Operation successful

0x04– BSL locked. Password incorrect resulted in BSL locking

0x05– BSL password error. Incorrect password sent to unlock BSL

0x07– Unknown Command

B6: CKL

0x60

B7: CKH

0xC4

Command Example

Table: BSL_ERASE_SC_FW Initiate Erase

Header

Length

Length

CMD

CKL

CKH

0x80

0x01

0x00

0x15

0x64

0xA3

Table: BSL_ERASE_SC_FW BSL Response (Successful Operation)

ACK

Header

Length

Length

CMD

MSG

CKL

CKH

0x00

0x80

0x02

0x00

0x3B

0x00

0x60

0xC4

0x20 - BSL_RX_DATA_BLOCK

The BSL core writes bytes data byte 1 (D1)–data byte n (Dn) starting from the location specified in the address fields. The BSL_RX_DATA_BLOCK command allows the BSL to address the device with the full 32-bit range.

Table: BSL_RX_DATA_BLOCK Server BMC Request

Server BMC Request

Header

0x80

Length (low byte)

0x05

Length (high byte)

0x01

Command code

0x20

Address bytes

A0, A1, A2, A3

Data bytes

D1, D2 … D256

CKL

TBD

CKH

TBD

Table: BSL_RX_DATA_BLOCK Xilinx Alveo Card (BSL) Response

Xilinx Alveo Card (BSL) Response

Data bytes B0 … B7

B0: ACK

0x00

B1: Header

0x80

B2: Length

0x02

B3: Length

0x00

B4: CMD

0x3B

B5: Message

0x00 – Operation successful

0x04– BSL locked. Password incorrect resulted in BSL locking

0x05– BSL password error. Incorrect password sent to unlock BSL

0x07– Unknown Command

B6: CKL

0x60

B7: CKH

0xC4

BSL_RX_DATA_BLOCK Command Example

Table: Write Data 0x76543210 to Address 0x0001:0000

Header

Length

Length

CMD

A0

A1

A2

A3

D1

D2

D3

D4

CKL

CKH

0x80

0x09

0x00

0x20

0x00

0x00

0x01

0x00

0x10

0x32

0x54

0x76

0x66

0x96

Table: BSL_RX_DATA_BLOCK BSL Response for a Successful Data Write

ACK

Header

Length

Length

CMD

MSG

CKL

CKH

0x00

0x80

0x02

0x00

0x3B

0x00

0x60

0xC4

Note: The BMC will need to parse through the SC FW file to identify the start location for each segment. To be specific, search for ‘@’ and use the following 4-byte address to frame and send the address bytes: A0, A1, A2, and A3 (LSB first).

Figure: Linux grep Command

_images/sc-segments.png

There are 4 segments in the following example:

  • @200– Segment starting at (0x00000200 A0 = 0x00; A1 = 0x02; A2 = 0x00; A3 = 0x00)

  • @1f780– Segment starting at 0x0001F780 (A0 = 0x80; A1 = 0xF7; A2 = 0x01; A3 = 0x00)

  • @20e58– Segment starting at 0x00020E58 (A0 = 0x58; A1 = 0x0E; A2 = 0x02; A3 = 0x00)

  • @0000– Segment starting at 0x00000000 (A0 = 0x00; A1 = 0x00; A2 = 0x00; A3 = 0x00)

This figure captures the linux grep command and response for the string ‘@’ within the FW file.

Note: The string ‘@’ represents the start of a new section in the flash memory.

Because the BSL_RX_DATA_BLOCK command’s maximum data size is 256 bytes, the address needs to be incremented by 256 or 0x100.

  • For the first packet in every segment, the BMC will send the 4-byte address as parsed above

    0x80 0x09 0x00 0x20 0x00 0x02 0x00 0x00 0x00 0x01 .. 0xFF 0x66 0x96.

  • For all subsequent packets, the BMC will increment the address by 0x100 while sending the commands 0x80 0x09 0x00 0x20 0x00 0x03 0x00 0x00 0x00 0x01 .. 0xFF 0x66 0x96 Header-Length-CMD-Address-Data-Checksum.

0x26 - BSL_CRC_CHECK

Note: The BSL_CRC_CHECK command is an optional command.

The MSP432 device performs a 16-bit CRC check using the CCITT standard. The address given is the first byte of the CRC check; 2 bytes are used for the length.

Table: BSL_CRC_CHECK Server BMC Request

Server BMC Request

Header

0x80

Length (low Byte)

TBD

Length (high Byte)

0x00

Command code

0x26

Address bytes

A0, A1, A2, A3

Data bytes

D1: length (low byte)

D2: length (high byte)

CKL

TBD

CKH

TBD

Table: BSL_CRC_CHECK Xilinx Alveo Card (BSL) Response

Xilinx Alveo Card (BSL) Response

Data bytes B0 … B8

B0: ACK

0x00

B1: Header

0x80

B2: Length

0x02

B3: Length

0x00

B4: CMD

0x3A

B5: Data1

TBD

B6: Data2

TBD

B7: CKL

TBD

B8: CKH

TBD

BSL_CRC_CHECK Command Example

Perform a CRC check from address 0x0000:4400 to 0x0000:47FF (size of 1024 bytes of data).

Table: BSL_CRC_CHECK Command Example

Header

Length

Length

CMD

A0

A1

A2

A3

D1

D2

CKL

CKH

0x80

0x07

0x00

0x26

0x00

0x44

0x00

0x00

0x00

0x04

0xF7

0xE6

The BSL response where 0x55 is the low byte of the calculated checksum and 0xAA is the high byte of the calculated checksum:

Table: BSL_CRC_CHECK Response for a Successful CRC Calculation

ACK

Header

Length

Length

CMD

D1

D2

CKL

CKH

0x00

0x80

0x03

0x00

0x3A

0x55

0xAA

0x12

0x2B

Note: As noted in the BSL_RX_DATA_BLOCK command, BMC will need to parse through the SC FW file to identify the start address for each command.

0x27 - BSL_LOAD_PC

The BSL_LOAD_PC command causes the BSL to jump and begin execution at the given address. The BSL responds with 0x00. In this case, the jump address is 0x0000:0201.

Table: BSL_LOAD_PC Server BMC Request

Server BMC Request

Header

0x80

Length (low byte)

0x05

Length (high byte)

0x00

Command code

0x27

Address bytes

A0, A1, A2, A3 A0: 0x01 A1: 0x02 A2: 0x00 A3: 0x00

CKL

TBD

CKH

TBD

Table: BSL_LOAD_PC Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

Byte 0

0x00–Success

Command Example

The program counter is set to 0x0000:0201. The server BMC must send the address bytes as A0=0x01, A1=0x02, A2=0x00, and A3=0x00.

Header

Length

Length

CMD

A0

A1

A2

A3

CKL

CKH

0x80

0x05

0x00

0x27

0x01

0x02

0x00

0x00

0x8E

0xBC

The BSL responds with 0x00.

Note: Functionality of the BSL core command has been modified to improve robustness around the SC FW upgrade process. When BMC issues this command to jump to SC application code, BSL checks the CRC of the entire SC FW image. If the CRC check is successful, BSL loads the new SC application code. If not, the MSP stays in BSL mode with the assumption that SC FW is corrupted/interrupted due to CRC failure.

Sample BSL Commands

The contents from the following table have been imported from TotalPhase Aardvark I2C adapter.

Figure: I2C Transaction captured using I2C Aardvark Tool

_images/aardvark_capture_SC_FW_update.PNG

Timing Diagram: Normal Flow of OoB SC FW Upgrade

  1. The BMC sends the 0x31 GET_SC_STATUS command to the SC, which responds with 0x02 - MSP SC FW mode.

  2. The BMC sends the 0x32 Enable_BSL_Mode command to the SC which configures the BSL parameters and reboots itself. The MSP enters BSL mode on the next boot up. No response is sent to BMC.

  3. The BMC waits 1 second and sends the 0x31 GET_SC_STATUS command to BSL and gets response 0x01 from the BSL MSP in BSL mode.

  4. The BMC unlocks the BSL by sending the password (0x21 BSL_RX_PASSWORD) and the BSL sends the status in response.

  5. The BMC sends the 0x15 BSL_ERASE_SC_FW command to the BSL asking that the entire SC FW image to be erased. BSL erases the FW and sends the response back to BMC.

  6. The BMC sends the entire SC FW via repeated 0x20 BSL_RX_Data command with the correct start address and BSL sends the status in response.

  7. The BMC (optionally) sends the 0x26 BSL_CRC_CHECK command with the correct start address and the BSL sends the status in response.

  8. The BMC sends the 0x27 BSL_Load_PC command and the BSL checks the CRC on the full FW. If CRC passes, the new SC FW loads. If not, it stays in BSL mode, enabling the BMC to restart the SC FW upgrade (see step 3).

Figure: Timing Diagram: Normal flow of Out-of-Band SC FW Upgrade

_images/sc-update-normal-flow.png

Timing Diagram: Interrupted Flow of the OoB SC FW Upgrade

  1. If the SC FW upgrade is interrupted mid-way due to power cycle (i.e., BMC reboot, MSP reboot, etc.,) the BSL takes corrective action by prohibiting the partial/corrupt FW to boot.

  2. The BSL disables the SC FW application code and hangs in BSL, waiting for a new SC FW upgrade process by BMC.

  3. The BMC will need to re-trigger the upgrade process from the start. This is done by sending a 0x31 GET_SC_STATUS command to get the status and following Timing Diagram: Normal Flow of OoB SC FW Upgrade.

Note: It is possible the I2C engine in the BSL can get stuck if the transaction got interrupted (as mentioned in step 1). Because the BSL does not have I2C recovery mechanisms, the only way to get back to BSL mode is to reboot the MSP. This can be only done by the AC power cycle of the server.

Figure: Interrupted flow of OoB SC FW Upgrade

_images/sc-update-interrupted-flow.png

SC flash write and read-back Commands

0x34 - GET_SC_FLASH_WRITE_STATUS

BMC sends this command to get the status for the SC flash sector write operations.

Table: GET_SC_FLASH_WRITE_STATUS Server BMC Request

Server BMC Request

Command code

0x34

Byte0

N/A

Table: GET_SC_FLASH_WRITE_STATUS Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

B0

0x01 - Operation success (No error)

0x02 - Operation failed

0x03 - Operation in progress

0x04 - Invalid input parameters

0x05 - Device busy, recheck later

0x06 - Invalid CRC (for I2C transaction)

0x07 - Data sector overflow

0x08 - SC flash write error

0x09 - Unwritable sector

0x35 - GET_SC_WRITE_SECTOR_RANGE

The BMC sends this command to get the valid SC flash sector range for the write operations. Based on the response code in Byte 0 and the sector range (B1 - B4), BMC can calculate the total number of bytes of writable data to be sent to SC flash. SC’s total flash size is 2 MB and there are 512 sectors (4 KB each) in total. BMC can write only in the writable sectors. Refer the table below for flash sector partition and valid writable sector range (156 - 511).

Note: BMC must send command 0x35 to get valid write sectors before sending the command 0x36 to transfer write data payload. Additionally, if BMC resends the command 0x35 in the middle of the write data payload transfer, SC will reset the entire write flow (i.e.) the start & end sectors. Optionally, BMC can also send the command 0x35 to reset the flow, in case it is needed.

Table: SC flash sector partition information

SC flash sector range

Usage

Writable to BMC

0 - 127

SC firmware

NO

128 - 129

Run-time config data

NO

130 - 147

BSL (Boot-loader) firmware

NO

148 - 155

Run-time config data & logs

NO

156 - 511

Unused

YES

Table: GET_SC_WRITE_SECTOR_RANGE Server BMC Request

Server BMC Request

Command code

0x35

Byte0

N/A

Table: GET_SC_WRITE_SECTOR_RANGE Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

B0

0x01 - Operation success (No error)

0x02 - Operation failed

0x03 - Unwritable sector range

B1

Start sector number (low byte)

B2

Start sector number (high byte)

B3

End sector number (low byte)

B4

End sector number (high byte)

0x36 - SC_FLASH_WRITE_DATA_BLOCK

BMC sends this command iteratively to send payload to write into SC flash sectors. Upon receiving the write payload, SC automatically starts writing the data from the 1st available writable flash sector as informed via command 0x35. Each transaction is limited to 251 bytes to accommodate 2 bytes of CRC. BMC must use the CRC16-CCITT signature based on the polynomial function below: f (x) = x16 + x12 + x5 + 1

In case of CRC mismatch (return code 0x05), BMC must resend the transaction. And it is BMC’s responsibility to keep track of the total number of bytes written/sent. After the completion of each transaction, BMC must check the write status via command 0x34 and proceed to next transaction only if SC returns 0x01 as response. SC performs the flash write operation in the background and will not be able to handle parallel/multiple transactions.

Note: BMC must send command 0x35 to set valid write sectors before sending the command 0x36 to send write data. Additionally, if BMC resends the command 0x35 in the middle of the write data payload transfer, SC will reset the entire write flow (i.e.) the start & end sectors. Optionally, BMC can also send the command 0x35 to reset the flow, in case it is needed.

Table: SC_FLASH_WRITE_DATA_BLOCK Server BMC Request

Server BMC Request

Command code

0x36

Data Bytes

D1, D2, … D251

CKL

CRC Low byte

CKH

CRC High byte

Table: SC_FLASH_WRITE_DATA_BLOCK Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data bytes

B0

0x01 - Operation success (No error)

0x02 - Operation failed

0x03 - Invalid CRC (for I2C transaction)

0x04 - Send command 0x35 first and retry 0x36

0x37 - SC_FLASH_READ_DATA_BLOCK

BMC sends this command iteratively to read the data from entire SC flash. Each transaction is limited to 251 bytes to accommodate 2 bytes of CRC. BMC must use the CRC16-CCITT signature based on the polynomial function below: f (x) = x16 + x12 + x5 + 1

At the end of each transaction, BMC must perform CRC check and then only proceed to read the next sector. In case of CRC mismatch, BMC must ask SC to resend the previous transaction. For the last transaction, SC may send a payload less than 251 bytes and it is BMC’s responsibility to keep track of the total number of bytes read.

Table: SC_FLASH_READ_DATA_BLOCK Server BMC Request

Server BMC Request

Command code

0x37

Byte0

0x00: Resend previous transaction

0x01: Send next transaction

Table: SC_FLASH_READ_DATA_BLOCK Xilinx Alveo Card Response

Xilinx Alveo Card Response

Data Bytes

D1, D2, … D251

CKL

CRC Low byte

CKH

CRC High byte

AMD Support

For support resources such as answers, documentation, downloads, and forums, see the Alveo Accelerator Cards AMD/Xilinx Community Forum.

License

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License.

You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

All images and documentation, including all debug and support documentation, are licensed under the Creative Commons (CC) Attribution 4.0 International License (the “CC-BY-4.0 License”); you may not use this file except in compliance with the CC-BY-4.0 License.

You may obtain a copy of the CC-BY-4.0 License at https://creativecommons.org/licenses/by/4.0/

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

XD038 | © Copyright 2023, Advanced Micro Devices Inc.