PtZenDNN

The PtZenDNN backend executes a PyTorch model on CPUs. On AMD CPUs, the ZenDNN library offers optimized performance.

Model support

The PtZenDNN backend currently supports only ResNet50.

To use PyTorch models with this backend, you need to convert downloaded PyTorch eager-mode models to TorchScript.
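The conversion to TorchScript can be sketched as follows. This is a minimal example under stated assumptions, not the project's own conversion script: the helper name export_to_torchscript and its arguments are hypothetical, and it uses tracing (torch.jit.trace), which records only the operations executed for the example input. Models with data-dependent control flow may need torch.jit.script instead.

```python
from pathlib import Path


def export_to_torchscript(model, example_input, output_path):
    """Trace an eager-mode model and save it as a TorchScript file.

    model: a torch.nn.Module in eager mode
    example_input: a tensor with the shape the model expects
    output_path: where to write the TorchScript archive (hypothetical name)
    """
    # Imported lazily so this helper can be defined without PyTorch installed.
    import torch

    model.eval()  # fix dropout and batch-norm behavior before tracing
    with torch.no_grad():
        traced = torch.jit.trace(model, example_input)
    traced.save(str(output_path))

    # Reload once to confirm the saved file is a readable TorchScript module.
    torch.jit.load(str(output_path))
    return Path(output_path)
```

For an image classifier with the default parameters of this backend, the example input would plausibly be torch.rand(1, 3, 224, 224). It is probably safest to export with a PyTorch version compatible with the one bundled in the ZenDNN package.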

Hardware support

Check the support matrix for compatible AMD CPUs.

Host setup

No special host setup is required to use the PtZenDNN backend.

Build an image

To build an image with the PtZenDNN backend enabled, download the ZenDNN library and then point the build script at the downloaded package.

You can download the PT+ZenDNN package from the ZenDNN developer downloads: PT_v1.12_ZenDNN_v4.0_C++_API.zip. Before downloading this package, you must read and agree to the EULA.

After downloading the package, place it in the root of the repository. To build an image with the backend enabled, add the --ptzendnn flag to the amdinfer dockerize command and pass it the path to the package:

# create the Dockerfile
python3 docker/generate.py

# build the development image $(whoami)/amdinfer-dev:latest
./amdinfer dockerize --ptzendnn=./PT_v1.12_ZenDNN_v4.0_C++_API.zip

# build the development image $(whoami)/amdinfer-dev-zendnn:latest
./amdinfer dockerize --ptzendnn=./PT_v1.12_ZenDNN_v4.0_C++_API.zip --suffix="-zendnn"

# build the deployment image $(whoami)/amdinfer-zendnn:latest
./amdinfer dockerize --ptzendnn=./PT_v1.12_ZenDNN_v4.0_C++_API.zip --suffix="-zendnn" --production

Note

The Docker build process uses the downloaded ZenDNN package, so it must be in the inference server repository directory and in a location that is not excluded by the .dockerignore file. These instructions suggest using the repository root, but any path that meets these criteria will work.
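The two criteria in the note can be checked before starting a build. The sketch below is an approximation under stated assumptions: both function names are hypothetical, and the matching uses Python's fnmatch rather than Docker's own pattern rules, so it does not handle `**` and lets `*` match across directory separators. Treat a positive result as a hint to inspect the .dockerignore file, not as a definitive answer.

```python
from fnmatch import fnmatch
from pathlib import Path


def is_in_build_context(package: Path, repo_root: Path) -> bool:
    """Return True if package lies inside repo_root after resolving symlinks."""
    try:
        package.resolve().relative_to(repo_root.resolve())
    except ValueError:
        return False
    return True


def is_dockerignored(rel_path: str, patterns: list[str]) -> bool:
    """Approximate .dockerignore matching; the last matching rule wins."""
    # Check the path and each parent directory, because ignoring a
    # directory also ignores everything beneath it.
    parts = rel_path.split("/")
    candidates = ["/".join(parts[: i + 1]) for i in range(len(parts))]
    ignored = False
    for raw in patterns:
        rule = raw.strip()
        if not rule or rule.startswith("#"):
            continue  # skip blank lines and comments
        negated = rule.startswith("!")  # "!" re-includes a path
        rule = rule.lstrip("!").strip("/")
        if any(fnmatch(candidate, rule) for candidate in candidates):
            ignored = not negated
    return ignored


# Illustrative rules only; read the real ones from the .dockerignore file.
rules = ["*.log", "build/", "!build/keep.txt"]
print(is_dockerignored("PT_v1.12_ZenDNN_v4.0_C++_API.zip", rules))  # False
```

In practice, the rule list would come from reading the repository's .dockerignore file line by line, and rel_path would be the package path relative to the repository root.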

Start a container

Depending on your use case and how you are using the server, you can start a container to use this backend in multiple ways.

Deployment

You can start a deployment container with something like:

$ docker run ...

Development

A development container can be started with:

$ ./amdinfer run --dev

This automatically publishes ports and mounts some convenient directories, such as your SSH directory, and drops you into a terminal in the container.

Get test assets

You can download the assets and models used with this backend for tests and examples with:

$ ./amdinfer get --ptzendnn --all-models

Loading the backend

There are multiple ways to load this backend to make it available for inference requests from clients. If you are using a client’s workerLoad() method:

// amdinfer::Client* client;
// amdinfer::ParameterMap parameters;
std::string endpoint = client->workerLoad("ptzendnn", parameters);

With a client’s modelLoad() method or using the repository approach, you need to create a model repository and put a model in it. To use this backend with your model, use pytorch_torchscript as the platform for your model.

Then, you can load the model from the server after setting up the path to the model repository. The server may be set to automatically load all models from the configured model repository or you can load it manually using modelLoad(). In this case, the endpoint is defined in the model’s configuration file in the repository and it is used as the argument to modelLoad().

// amdinfer::Client* client;
// amdinfer::ParameterMap parameters;
client->modelLoad(<model>, parameters);

Parameters

You can provide the following backend-specific parameters at load-time:

Parameter        Type      Usage
batch_size       integer   Requested batch size for incoming batches. Defaults to 1.
image_channels   integer   Number of channels in the input image. Defaults to 3.
input_size       integer   Assuming a square input image, the size of the image in pixels. Defaults to 224.
model            string    Full path to the model to load.
output_classes   integer   Number of output classes in the classification model. Defaults to 1000.
threads          integer   Number of threads to use in the thread pool for the backend. Defaults to 3.
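To make the defaults concrete, the following sketch merges user overrides onto the documented default values and derives the input tensor shape they imply. It is illustrative only: resolve_parameters and input_shape are hypothetical helpers, not part of the amdinfer API, the positive-integer check is an assumption, the NCHW layout is assumed from PyTorch convention, and the model path is a placeholder.

```python
from math import prod

# Defaults as documented above; "model" has no documented default.
DOCUMENTED_DEFAULTS = {
    "batch_size": 1,
    "image_channels": 3,
    "input_size": 224,
    "output_classes": 1000,
    "threads": 3,
}


def resolve_parameters(model: str, **overrides: int) -> dict:
    """Merge user overrides onto the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    for name, value in overrides.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return {"model": model, **DOCUMENTED_DEFAULTS, **overrides}


def input_shape(params: dict) -> tuple:
    """Input tensor shape, assuming PyTorch's usual NCHW layout."""
    size = params["input_size"]
    return (params["batch_size"], params["image_channels"], size, size)


# Placeholder path; only batch_size is overridden here.
params = resolve_parameters("/path/to/resnet50.pt", batch_size=4)
shape = input_shape(params)
print(shape)        # (4, 3, 224, 224)
print(prod(shape))  # 602112 values per batch
```

With every default left in place, the same calculation gives a shape of (1, 3, 224, 224), or 150528 values per request.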

Troubleshooting

If you run into problems, first check the general troubleshooting guide. Then continue on to this PtZenDNN-specific troubleshooting guide. You will need access to the machine where the inference server is running to debug.

Tune performance

To tune ZenDNN performance, refer to the PyTorch + ZenDNN user guide.