REST Endpoints

The REST endpoints are based on KServe’s v2 specification. Additional endpoints are driven by community adoption.

Health

GET v2/health/live: Check if the server is live
GET v2/health/ready: Check if the server is ready for inference requests
GET v2/models/{model}/ready: Check if a particular model is ready for inference requests
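
A minimal sketch of polling the health endpoints from Python, using only the standard library. The base URL is a placeholder, the helper names `health_url` and `is_ok` are hypothetical, and treating HTTP 200 as the success signal is an assumption to confirm against your server.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: placeholder address; substitute the host and port of your server.
BASE_URL = "http://127.0.0.1:8998"


def health_url(base_url, model=None, check="ready"):
    """Build a health URL from the paths listed above.

    With a model name, this targets v2/models/{model}/ready; otherwise it
    targets v2/health/live or v2/health/ready depending on `check`.
    """
    base = base_url.rstrip("/")
    if model is not None:
        name = urllib.parse.quote(model, safe="")
        return f"{base}/v2/models/{name}/ready"
    if check not in ("live", "ready"):
        raise ValueError("check must be 'live' or 'ready'")
    return f"{base}/v2/health/{check}"


def is_ok(url, timeout=5.0):
    """Return True if a GET to `url` answers with HTTP 200 (assumed success)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

For example, `is_ok(health_url(BASE_URL, check="live"))` checks liveness, and `is_ok(health_url(BASE_URL, model="my_model"))` checks a hypothetical model named `my_model`.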

Metadata

GET v2: Get AMD Inference Server’s metadata
GET v2/hardware: Get a string describing the number and type of kernels that are available
GET v2/models: Get a list of active models
GET v2/models/{model}: Get model metadata
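
The metadata endpoints can be queried the same way. The sketch below is illustrative: the helper names are hypothetical, and because the response formats are not described here, it parses the body as JSON when possible and otherwise returns the raw text (for instance, the hardware description is documented as a string).

```python
import json
import urllib.request


def metadata_url(base_url, resource=""):
    """Build a metadata URL: v2, v2/hardware, v2/models or v2/models/{model}."""
    base = base_url.rstrip("/")
    return f"{base}/v2/{resource}" if resource else f"{base}/v2"


def parse_body(text):
    """Return parsed JSON when the body is valid JSON, else the raw text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def get_metadata(base_url, resource="", timeout=5.0):
    """GET a metadata endpoint and decode the response body."""
    with urllib.request.urlopen(metadata_url(base_url, resource), timeout=timeout) as response:
        return parse_body(response.read().decode("utf-8"))
```

Under these assumptions, `get_metadata(base, "models")` would return the list of active models and `get_metadata(base, "models/my_model")` the metadata of a hypothetical model `my_model`.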

Inference

POST v2/repository/models/{model}/load: Load a model
POST v2/workers/{worker}/load: Load a worker. The body of the HTTP response contains the endpoint to use for subsequent requests
POST v2/repository/models/{model}/unload: Unload a model
POST v2/workers/{worker}/unload: Unload a worker
POST v2/models/{model}/infer: Make an inference request to a particular model
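
As a rough sketch, the load, unload and infer requests could be assembled as follows. The request body follows the general shape of the KServe v2 inference protocol (an `inputs` list of tensors with `name`, `shape`, `datatype` and `data`). The tensor name `input0`, the model name `my_model`, the address and the helper names are all hypothetical. Any parameters a load request may need in its body are not covered here.

```python
import json
import urllib.parse
import urllib.request


def lifecycle_url(base_url, kind, name, action):
    """Build a load/unload URL for a model or a worker."""
    if kind not in ("model", "worker"):
        raise ValueError("kind must be 'model' or 'worker'")
    if action not in ("load", "unload"):
        raise ValueError("action must be 'load' or 'unload'")
    base = base_url.rstrip("/")
    prefix = "v2/repository/models" if kind == "model" else "v2/workers"
    quoted = urllib.parse.quote(name, safe="")
    return f"{base}/{prefix}/{quoted}/{action}"


def build_infer_request(base_url, model, inputs):
    """Create (but do not send) a POST request to v2/models/{model}/infer."""
    base = base_url.rstrip("/")
    quoted = urllib.parse.quote(model, safe="")
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/v2/models/{quoted}/infer",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical input tensor and model name, for illustration only.
example_input = {"name": "input0", "shape": [1, 2], "datatype": "FP32", "data": [1.0, 2.0]}
request = build_infer_request("http://127.0.0.1:8998", "my_model", [example_input])
# To send it: urllib.request.urlopen(request, timeout=5.0), then decode the body.
```

Building the request separately from sending it keeps the payload easy to inspect before anything reaches the server.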

Observation

GET metrics: Get Prometheus metrics
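
Prometheus normally scrapes this endpoint itself, but the output can also be inspected by hand. The sketch below assumes the endpoint returns the standard Prometheus text exposition format and parses it into a dictionary. It is deliberately simplified (it ignores optional timestamps and does not split labels), and `parse_metrics` is a hypothetical helper name.

```python
def parse_metrics(text):
    """Map each sample (name plus label set) to its float value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped. An optional
    trailing timestamp is ignored.
    """
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        closing = line.rfind("}")
        if closing != -1:
            # Labels may contain spaces, so split after the closing brace.
            name = line[: closing + 1]
            fields = line[closing + 1 :].split()
        else:
            parts = line.split()
            name, fields = parts[0], parts[1:]
        if fields:
            samples[name] = float(fields[0])
    return samples
```

The text to parse can be fetched with `urllib.request.urlopen` against the `metrics` path, which, unlike the other endpoints, has no `v2` prefix.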