- Administrator (User)
a user that sets up and maintains the inference server deployment container(s)
- Chain
a linear ensemble in which all the output tensors of one stage are the inputs to a single next stage, with no loops, broadcasts, or concatenations
- Client (User)
a user that interacts with a running server using its APIs to send it inference requests
- Container (Docker)
a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another [1]
see also: image
- Developer (User)
a user that uses the development container to build and test the server executable
- Ensemble
a logical pipeline of workers that executes a graph of computations in which the output tensors of one model are passed as inputs to others
- EULA
End User License Agreement
- Image (Docker)
a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings [1]
see also: container
- Model repository
a directory on the host machine where the server container is running; it holds the models you want to serve and their associated metadata in a standard structure
- User
anything or anyone that uses the AMD Inference Server, i.e., clients, administrators, or developers
- XRT
Xilinx Runtime Library: an open-source, standardized software interface that facilitates communication between the application code and the accelerated kernels deployed on the reconfigurable portion of PCIe-based Alveo accelerator cards, Zynq-7000, Zynq UltraScale+ MPSoC-based embedded platforms, or Versal ACAPs
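
The difference between the Chain and Ensemble entries above can be made concrete with a small sketch. It is only an illustration of the definitions, not the server's own API: the ensemble is modeled as a plain dictionary that maps each stage to the stages that receive its output tensors, and `is_chain` plus the stage names (`preprocess`, `resnet`, `postprocess`) are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def is_chain(ensemble: dict[str, list[str]]) -> bool:
    """Return True if the stage graph is one linear sequence of stages.

    `ensemble` maps a stage name to the stages that consume its outputs.
    This representation is an assumption made for illustration only.
    """
    if not ensemble:
        return False

    # Every stage mentioned, either as a producer or as a consumer.
    stages = set(ensemble) | {dst for dsts in ensemble.values() for dst in dsts}

    # Broadcast: one stage feeds more than one next stage.
    if any(len(set(dsts)) > 1 for dsts in ensemble.values()):
        return False

    # Concatenation: one stage receives tensors from more than one stage.
    incoming = Counter(dst for dsts in ensemble.values() for dst in set(dsts))
    if any(count > 1 for count in incoming.values()):
        return False

    # A chain has exactly one entry stage. With no entry stage, every stage
    # has a predecessor, which means the graph contains a loop.
    heads = [stage for stage in stages if incoming[stage] == 0]
    if len(heads) != 1:
        return False

    # Walk from the entry stage; a chain visits every stage exactly once.
    seen = set()
    current = heads[0]
    while current is not None and current not in seen:
        seen.add(current)
        next_stages = ensemble.get(current, [])
        current = next_stages[0] if next_stages else None
    return seen == stages


chain = {"preprocess": ["resnet"], "resnet": ["postprocess"], "postprocess": []}
broadcast = {"preprocess": ["resnet", "yolo"], "resnet": [], "yolo": []}
concatenation = {"resnet": ["postprocess"], "yolo": ["postprocess"], "postprocess": []}
loop = {"resnet": ["postprocess"], "postprocess": ["resnet"]}
```

Under these assumptions, `is_chain(chain)` is true, while the `broadcast`, `concatenation`, and `loop` graphs are still ensembles in the general sense but do not qualify as chains.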