Release Notes 3.5
=================

Version Compatibility
---------------------

Vitis |trade| AI v3.5 and the DPU IP released with the v3.5 branch of this repository are verified as compatible with Vitis, Vivado |trade|, and PetaLinux version 2023.1. If you are using a previous release of Vitis AI, you should review the :ref:`version compatibility matrix ` for that release.

Documentation and GitHub Repository
-----------------------------------

- Merged UG1333 into UG1414
- Streamlined UG1414 to remove redundant content
- Streamlined UG1414 to focus exclusively on core tool usage. Core tools such as the Optimizer, Quantizer and Compiler are now used across multiple targets (i.e., Ryzen |trade| AI, EPYC |trade|), and this change seeks to make UG1414 more portable to these targets
- Migrated Adaptive SoC and Alveo specific content from UG1414 to GitHub.IO
- New GitHub.IO Toctree structure
- Integrated VART Runtime APIs in Doxygen format

Docker Containers and GPU Support
----------------------------------

- Removed Anaconda dependency from TensorFlow 2 and PyTorch containers in order to address Anaconda commercial license requirements
- Updated Docker container to disable Ubuntu 18.04 support (which was available in Vitis AI but not officially supported). This was done to address `CVE-2021-3493 `__.

Model Zoo
---------

- Added more classic models without modification, such as the YOLO series and 2D Unet
- Provided a model info card for each model and Jupyter Notebook tutorials for new models
- New copyleft repo for GPL license models

ONNX CNN Quantizer
------------------

- Initial release
- This is a new quantizer that supports the direct PTQ quantization of ONNX models for DPU. It is a plugin built for the ONNX Runtime native quantizer.
- Support for power-of-two quantization with both QDQ and QOP format.
- Support for Non-overflow and Min-MSE quantization methods.
- Support for various quantization configurations in power-of-two quantization in both QDQ and QOP format.

  - Support for signed and unsigned configurations.
  - Support for symmetric and asymmetric configurations.
  - Support for per-tensor and per-channel configurations.

- Support for ONNX models in excess of 2GB.
- Support for the use of the CUDAExecutionProvider for calibration in quantization.

PyTorch CNN Quantizer
---------------------

- Support for PyTorch 1.13 and 2.0
- Support for mixed precision quantization (float32, float16, bfloat16, intx)
- Support for bit-wise accuracy cross check between quantizer and ONNX Runtime
- Split and chunk operators are automatically converted to slicing
- Support for dict inputs and outputs in the model forward function
- Support for keyword arguments in the model forward function
- Support for matmul subroutine
- Added support for BFP data type quantization
- QAT supports training on multiple GPUs
- QAT supports operations with multiple inputs or outputs

TensorFlow 2 CNN Quantizer
--------------------------

- Updated to support TensorFlow 2.12 and Python 3.8.
- Support for quantizing subclass models.
- Support for mixed precision, with layer-wise data type configuration across float32, float16, bfloat16, and int quantization.
- Support for BFP data types, with a new quantize strategy called 'bfp'.
- Support for quantizing nested Keras models.
- Experimental support for quantizing the frozen pb format model in TensorFlow 2.x.
- Added a new 'gpu' quantize strategy which uses float scale quantization and is used in GPU deployment scenarios.
- Support for exporting the quantized model to frozen pb format or ONNX format.
- Support for exporting the quantized model with power-of-two scales to frozen pb format with "FixNeuron" inside, to be compatible with some compilers with pb format input.
- Support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.

Bug fixes:

1. Fixed a gradient bug in the 'pof2s_tqt' quantize strategy.
2. Fixed a bug of quantization position change introduced by the fast fine-tuning process after the PTQ.
3. Fixed a graph transformation bug when a TFOpLambda op has multiple inputs.

TensorFlow 1 CNN Quantizer
--------------------------

- Support for fast fine-tuning that improves PTQ accuracy.
- Support for folding Reshape and ResizeNearestNeighbor operators.
- Support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
- Support for quantizing Sum, StridedSlice, and Maximum operators.
- Support for setting the input shape of the model, which is useful in the deployment of models with undefined input shapes.
- Support for setting the opset version when exporting to ONNX format.

Bug fixes:

1. Fixed a bug where the AddV2 operation is misinterpreted as a BiasAdd.

Compiler
--------

- New operators supported: Broadcast add/mul, Bilinear downsample, Trilinear downsample, Group conv2d, Strided-slice
- Performance improved on XV2DPU
- Error messages improved
- Compilation time reduced

PyTorch Optimizer
-----------------

- Removed requirement for license purchase
- Migrated to GitHub open-source
- Support for PyTorch 1.11, 1.12 and 1.13
- Support for pruning of grouped convolution
- Support for setting the number of channels to be a multiple of the specified number after pruning

TensorFlow 2 Optimizer
----------------------

- Removed requirement for license purchase
- Migrated to GitHub open-source
- Support for TensorFlow 2.11 and 2.12
- Support for pruning of tf.keras.layers.SeparableConv2D
- Fixed tf.keras.layers.Conv2DTranspose pruning bug
- Support for setting the number of channels to be a multiple of the specified number after pruning

Runtime
-------

- Supports Versal AI Edge VEK280 evaluation kit
- Buffer optimized for multi-batches to improve performance
- Added new tensor buffer interface to enhance zero copy

Vitis ONNX Runtime Execution Provider (VOE)
-------------------------------------------

- Support for ONNX Opset version 18, ONNX Runtime 1.16.0 and ONNX version 1.13
- Support for both C++ and Python APIs (Python version 3)
- Support for Vitis AI EP and other EPs to work together to deploy the model
- Provided ONNX examples based on C++ and Python APIs
- Vitis AI EP is open sourced and upstreamed to the ONNX public repo on GitHub

Library
-------

- Added three new model libraries and support for five additional models

Model Inspector
---------------

- Added support for DPUCV2DX8G

Profiler
--------

- Added Profiler support for DPUCV2DX8G

DPU IP - Versal AIE-ML Targets DPUCV2DX8G (Versal AI Edge / Core)
-----------------------------------------------------------------------------

- General access release for the Versal AI Edge device VE2802, Versal AI Core device VC2802 and Alveo V70 card
- Configurable from C20B1 to C20B14
- Supports most 2D operators based on models in the Model Zoo

DPU IP - Zynq UltraScale+ DPUCZDX8G
-----------------------------------

- No DPU IP updates in 3.5 release
- No DPU reference design updates in 3.5 release
- No pre-built board image updates in 3.5 release

DPU IP - Versal AIE Targets DPUCVDX8G
-------------------------------------

- No DPU IP updates in 3.5 release
- No DPU reference design updates in 3.5 release
- No pre-built board image updates in 3.5 release

DPU IP - CNN - Alveo Data Center DPUCVDX8H
------------------------------------------

- No DPU IP updates in 3.5 release
- No DPU reference design updates in 3.5 release
- No pre-built board image updates in 3.5 release

WeGO
------------------------------------------

- Support for Alveo V70 DPU GA release.
- Support for PyTorch 1.13.1 and TensorFlow r2.12.
- Enhanced WeGO-Torch to support PyTorch 2.0 as a preview feature.
- Introduced a new C++ API for WeGO-Torch.
- Implemented WeGO-TF1 and WeGO-TF2 as out-of-tree plugins.

Known Issues
------------

- To be announced ASAP

.. |trade|  unicode:: U+02122 .. TRADEMARK SIGN
   :ltrim:
.. |reg|    unicode:: U+000AE .. REGISTERED TRADEMARK SIGN
   :ltrim:

AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc.