..
    Copyright 2021 Xilinx, Inc.
    Copyright 2022 Advanced Micro Devices, Inc.

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.

Performance
===========

MLCommons
---------

:github:`MLPerf ` is a standard benchmark suite for machine learning inference from `MLCommons `__.
The :amdinferTree:`MLCommons application ` uses the ``loadgen`` component of MLPerf to send requests to the inference server for a variety of scenarios and models.

SingleStream
^^^^^^^^^^^^

This scenario sends one inference request after the previous one completes, so it measures end-to-end latency.
Note that these tests are not run with the same duration/query count as specified in the official MLPerf test rules.

.. tabs::

    .. group-tab:: Fake

        .. chart:: charts/fake_SingleStream_protocols.json

            The Fake model does no work, so it shows the latency of the system excluding serialization and inference delays

        .. chart:: charts/fake_SingleStream_protocols_qps.json

            The Fake model does no work, so it shows the latency of the system excluding serialization and inference delays

    .. group-tab:: ResNet50

        .. chart:: charts/resnet50_SingleStream_protocols.json

            Using Vitis AI and a U250

        .. chart:: charts/resnet50_SingleStream_protocols_qps.json

            Using Vitis AI and a U250

Server
^^^^^^

This scenario sends inference requests following a Poisson distribution with a configurable number of outstanding requests.
Note that these tests are not run with the same duration/query count as specified in the official MLPerf test rules.

.. tabs::

    .. group-tab:: Fake

        .. chart:: charts/fake_Server_protocols.json

            The Fake model does no work, so it shows the latency of the system excluding serialization and inference delays

        .. chart:: charts/fake_Server_protocols_qps.json

            The Fake model does no work, so it shows the latency of the system excluding serialization and inference delays
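
The sketch below illustrates how ``loadgen`` can drive these two scenarios against a server.
It is a minimal example, not the MLCommons application shipped with the inference server: the ``send_request`` helper is a hypothetical stand-in for whatever client call forwards a sample to the server, and the exact ``mlperf_loadgen`` binding signatures may vary slightly between loadgen versions.

.. code-block:: python

    import mlperf_loadgen as lg

    def send_request(sample_index):
        # Hypothetical placeholder: forward the sample to the inference
        # server over the protocol under test and wait for the reply.
        pass

    def load_samples(sample_indices):
        # Load the referenced samples into memory before the test starts.
        pass

    def unload_samples(sample_indices):
        # Release the samples after the test finishes.
        pass

    def issue_queries(query_samples):
        # loadgen calls this with the next sample once the previous one
        # completes (SingleStream) or with samples whose arrival times
        # follow a Poisson distribution (Server).
        responses = []
        for sample in query_samples:
            send_request(sample.index)
            responses.append(lg.QuerySampleResponse(sample.id, 0, 0))
        lg.QuerySamplesComplete(responses)

    def flush_queries():
        pass

    settings = lg.TestSettings()
    settings.mode = lg.TestMode.PerformanceOnly
    settings.scenario = lg.TestScenario.SingleStream
    # For the Server scenario instead:
    # settings.scenario = lg.TestScenario.Server
    # settings.server_target_qps = 100  # mean arrival rate of the Poisson process

    sut = lg.ConstructSUT(issue_queries, flush_queries)
    qsl = lg.ConstructQSL(1024, 1024, load_samples, unload_samples)
    lg.StartTest(sut, qsl, settings)
    lg.DestroyQSL(qsl)
    lg.DestroySUT(sut)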