Release Notes 3.5¶
Version Compatibility¶
Vitis™ AI v3.5 and the DPU IP released with the v3.5 branch of this repository are verified as compatible with Vitis, Vivado™, and PetaLinux version 2023.1. If you are using a previous release of Vitis AI, you should review the version compatibility matrix for that release.
Documentation and Github Repository¶
Merged UG1333 into UG1414
Streamlined UG1414 to remove redundant content
Streamlined UG1414 to focus exclusively on core tool usage. Core tools such as the Optimizer, Quantizer, and Compiler are now used across multiple targets (e.g., Ryzen™ AI, EPYC™), and this change makes UG1414 more portable to those targets
Migrated Adaptive SoC- and Alveo-specific content from UG1414 to GitHub.IO
New GitHub.IO toctree structure
Integrated VART Runtime APIs in Doxygen format
Docker Containers and GPU Support¶
Removed the Anaconda dependency from the TensorFlow 2 and PyTorch containers to address Anaconda commercial license requirements
Updated the Docker containers to disable Ubuntu 18.04 support (previously available in Vitis AI but not officially supported) in order to address CVE-2021-3493
Model Zoo¶
Added more classic models without modification, such as the YOLO series and 2D UNet
Provided model info card for each model and Jupyter Notebook tutorials for new models
New copyleft repository for GPL-licensed models
ONNX CNN Quantizer¶
Initial release
This new quantizer supports direct post-training quantization (PTQ) of ONNX models for the DPU. It is a plugin built on the ONNX Runtime native quantizer.
Support for power-of-two quantization in both QDQ and QOP formats.
Support for the Non-overflow and Min-MSE quantization methods.
Support for various power-of-two quantization configurations in both QDQ and QOP formats.
Support for signed and unsigned configurations.
Support for symmetric and asymmetric configurations.
Support for per-tensor and per-channel configurations.
Support for ONNX models larger than 2 GB.
Support for using the CUDAExecutionProvider for calibration during quantization.
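The Non-overflow and Min-MSE methods listed above differ in how the power-of-two scale is chosen. A minimal, illustrative Python sketch of the two ideas (function names are hypothetical; this is not the quantizer's actual implementation):

```python
import math

def nonoverflow_scale(values, bits=8):
    """Pick the smallest power-of-two scale such that no value
    saturates the signed integer range (the 'non-overflow' idea)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values)
    # smallest k with max_abs / 2**k <= qmax, i.e. 2**k >= max_abs / qmax
    k = math.ceil(math.log2(max_abs / qmax)) if max_abs > 0 else 0
    return 2.0 ** k

def quantize(values, scale, bits=8):
    """Quantize to signed integers with clipping at the type range."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    return [min(max(round(v / scale), qmin), qmax) for v in values]

def min_mse_scale(values, bits=8, search=4):
    """Search nearby power-of-two scales and keep the one with the
    lowest reconstruction MSE (the 'Min-MSE' idea)."""
    base = nonoverflow_scale(values, bits)
    best, best_err = base, float("inf")
    for shift in range(-search, search + 1):
        s = base * (2.0 ** shift)
        q = quantize(values, s, bits)
        err = sum((v - qi * s) ** 2 for v, qi in zip(values, q))
        if err < best_err:
            best, best_err = s, err
    return best
```

Non-overflow guarantees no saturation; Min-MSE may accept a little clipping in exchange for finer resolution over the bulk of the distribution.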
PyTorch CNN Quantizer¶
Support for PyTorch 1.13 and 2.0
Support for mixed-precision quantization: float32, float16, bfloat16, and intx
Support for bit-wise accuracy cross-checking between the quantizer and ONNX Runtime
Split and chunk operators are now automatically converted to slicing
Support for dict inputs/outputs in the model forward function
Support for keyword arguments in the model forward function
Support for the matmul subroutine
Added support for BFP data type quantization
QAT supports training on multiple GPUs
QAT supports operations with multiple inputs or outputs
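The bfloat16 type mentioned in the mixed-precision item above is essentially float32 with the low 16 mantissa bits dropped. An illustrative stand-alone sketch (truncation, i.e. round-toward-zero, for brevity; real converters typically round to nearest even):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for a float32 value by
    keeping only the high 16 bits (sign, 8 exponent bits, 7 mantissa
    bits) and truncating the rest."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    """Expand a bfloat16 bit pattern back to float32 (exact, since
    every bfloat16 value is representable in float32)."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x
```

Because bfloat16 keeps the full float32 exponent, it preserves dynamic range while sacrificing precision, which is why it is attractive for mixed-precision quantization.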
TensorFlow 2 CNN Quantizer¶
Updated to support TensorFlow 2.12 and Python 3.8.
Support for quantizing subclass models.
Support for mixed precision with layer-wise data type configuration: float32, float16, bfloat16, and int quantization.
Support for BFP data types via a new quantize strategy named ‘bfp’.
Support for quantizing nested Keras models.
Experimental support for quantizing frozen pb models in TensorFlow 2.x.
Added a new ‘gpu’ quantize strategy that uses float-scale quantization for GPU deployment scenarios.
Support for exporting the quantized model to frozen pb or ONNX format.
Support for exporting the quantized model with power-of-two scales to frozen pb format with “FixNeuron” nodes inside, for compatibility with compilers that take pb-format input.
Support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
Bug fixes:
1. Fixed a gradient bug in the ‘pof2s_tqt’ quantize strategy.
2. Fixed a quantization position change introduced by the fast fine-tuning process after PTQ.
3. Fixed a graph transformation bug when a TFOpLambda op has multiple inputs.
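The pool-splitting item above works because average pooling over equal-sized groups is decomposable: a pool with a large kernel equals chained pools with smaller kernels when strides match the kernels. A toy 1-D illustration (not the quantizer's code):

```python
def avg_pool_1d(xs, k):
    """Non-overlapping 1-D average pool with kernel size and stride k."""
    return [sum(xs[i:i + k]) / k for i in range(0, len(xs) - k + 1, k)]

# A kernel-8 average pool equals a kernel-4 pool followed by a kernel-2
# pool: the mean of 8 values is the mean of the two 4-element means.
xs = list(range(16))
direct = avg_pool_1d(xs, 8)
split = avg_pool_1d(avg_pool_1d(xs, 4), 2)
```

The same decomposition holds for max pooling, since max is associative; splitting lets a hardware target with a limited maximum kernel size execute large-kernel pools exactly.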
TensorFlow 1 CNN Quantizer¶
Support for fast fine-tuning that improves PTQ accuracy.
Support for folding Reshape and ResizeNearestNeighbor operators.
Support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
Support for quantizing Sum, StridedSlice, and Maximum operators.
Support for setting the input shape of the model, which is useful in the deployment of models with undefined input shapes.
Support for setting the opset version when exporting to ONNX format.
Bug fix: Fixed a bug where the AddV2 operation was misinterpreted as a BiasAdd.
Compiler¶
New operators supported: Broadcast add/mul, Bilinear downsample, Trilinear downsample, Group conv2d, Strided-slice
Improved performance on XV2DPU
Improved error messages
Reduced compilation time
PyTorch Optimizer¶
Removed requirement for license purchase
Migrated to open source on GitHub
Support for PyTorch 1.11, 1.12 and 1.13
Support for pruning of grouped convolution
Support for setting the number of channels to be a multiple of the specified number after pruning
TensorFlow 2 Optimizer¶
Removed requirement for license purchase
Migrated to open source on GitHub
Support for TensorFlow 2.11 and 2.12
Support for pruning of tf.keras.layers.SeparableConv2D
Fixed tf.keras.layers.Conv2DTranspose pruning bug
Support for setting the number of channels to be a multiple of the specified number after pruning
Runtime¶
Support for the Versal AI Edge VEK280 evaluation kit
Optimized buffers for multi-batch operation to improve performance
Added a new tensor-buffer interface to enhance zero-copy support
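As a toy illustration of the zero-copy idea behind a tensor-buffer interface (a hypothetical Python sketch; the actual VART interface is a C++ API), the runtime can operate on a view of application-owned memory instead of copying it into a runner-owned buffer:

```python
import array

class TensorBuffer:
    """Hypothetical zero-copy buffer: wraps application memory in a
    view rather than copying it."""

    def __init__(self, backing: array.array):
        self._mv = memoryview(backing)  # a view, no copy made

    def data(self) -> memoryview:
        return self._mv

# The application owns the storage; writes through the buffer's view
# are visible in the original array without any copy.
buf = array.array("f", [0.0] * 4)
tb = TensorBuffer(buf)
tb.data()[0] = 3.5
```

Avoiding the copy matters most for large multi-batch inputs, where per-inference memcpy cost can rival the compute itself.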
Vitis ONNX Runtime Execution Provider (VOE)¶
Support for ONNX Opset version 18, ONNX Runtime 1.16.0 and ONNX version 1.13
Support for both C++ and Python APIs (Python 3)
Support for deploying a model with the Vitis AI EP working together with other EPs
Provided ONNX examples based on the C++ and Python APIs
Vitis AI EP is open sourced and upstreamed to the ONNX public repo on GitHub
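To illustrate how the Vitis AI EP can cooperate with other EPs, here is a toy partitioning sketch (hypothetical op sets and logic; the real assignment happens inside ONNX Runtime): each node goes to the highest-priority execution provider that supports it, with remaining ops falling back to the CPU EP.

```python
# Hypothetical capability tables for two execution providers.
SUPPORTED = {
    "VitisAIExecutionProvider": {"Conv", "Relu", "MaxPool"},
    "CPUExecutionProvider": {"Conv", "Relu", "MaxPool", "Softmax", "Shape"},
}

def assign(nodes, provider_priority):
    """Assign each graph node to the first provider, in priority
    order, whose op set contains the node's op type."""
    placement = {}
    for node in nodes:
        for ep in provider_priority:
            if node["op"] in SUPPORTED[ep]:
                placement[node["name"]] = ep
                break
    return placement

nodes = [{"name": "c1", "op": "Conv"}, {"name": "s1", "op": "Softmax"}]
placement = assign(nodes, ["VitisAIExecutionProvider", "CPUExecutionProvider"])
```

In this sketch the Conv lands on the DPU-backed EP while the Softmax, unsupported there, falls back to the CPU EP, which is the essence of mixed-EP deployment.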
Library¶
Added three new model libraries and support for five additional models
Model Inspector¶
Added support for DPUCV2DX8G
Profiler¶
Added Profiler support for DPUCV2DX8G
DPU IP - Versal AIE-ML Targets DPUCV2DX8G (Versal AI Edge / Core)¶
General access release for the Versal AI Edge device VE2802, Versal AI Core device VC2802 and Alveo V70 card
Configurable from C20B1 to C20B14
Supports most 2D operators used by models in the Model Zoo
DPU IP - Zynq UltraScale+ DPUCZDX8G¶
No DPU IP updates in 3.5 release
No DPU reference design updates in 3.5 release
No pre-built board image updates in 3.5 release
DPU IP - Versal AIE Targets DPUCVDX8G¶
No DPU IP updates in 3.5 release
No DPU reference design updates in 3.5 release
No pre-built board image updates in 3.5 release
DPU IP - CNN - Alveo Data Center DPUCVDX8H¶
No DPU IP updates in 3.5 release
No DPU reference design updates in 3.5 release
No pre-built board image updates in 3.5 release
WeGO¶
Support for Alveo V70 DPU GA release.
Support for PyTorch 1.13.1 and TensorFlow r2.12.
Enhanced WeGO-Torch to support PyTorch 2.0 as a preview feature.
Introduced a new C++ API for WeGO-Torch
Implemented WeGO-TF1 and WeGO-TF2 as out-of-tree plugins.
Known Issues¶
To be announced
AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc.