REST Endpoints
The REST endpoints are based on KServe’s v2 specification. Additional endpoints are driven by community adoption.
Health
GET v2/health/live: Check if the server is live
GET v2/health/ready: Check if the server is ready for inference requests
GET v2/models/{model}/ready: Check if a particular model is ready for inference requests
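
The health endpoints above can be polled before sending traffic. The following is a minimal sketch using only the Python standard library. The base URL `http://localhost:8998`, the model name `mnist`, and the helper names `health_urls` and `is_ok` are assumptions for illustration, as is the convention that HTTP 200 means healthy and any error status means not healthy.

```python
import urllib.error
import urllib.request


def health_urls(base_url, model=None):
    """Return the health-check URLs listed above, keyed by purpose."""
    base = base_url.rstrip("/")
    urls = {
        "live": f"{base}/v2/health/live",
        "ready": f"{base}/v2/health/ready",
    }
    if model is not None:
        # Per-model readiness is only added when a model name is given.
        urls["model_ready"] = f"{base}/v2/models/{model}/ready"
    return urls


def is_ok(url, timeout=2.0):
    """Return True if a GET to `url` answers with HTTP 200, else False.

    Assumption: a non-200 status or a connection failure means "not healthy".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx) is a subclass of URLError, so it lands here too.
        return False
```

With a server running, `is_ok(health_urls("http://localhost:8998", model="mnist")["model_ready"])` would report whether that (hypothetical) model is ready for inference.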
Metadata
GET v2: Get AMD Inference Server’s metadata
GET v2/hardware: Get a string describing the number and type of kernels that are available
GET v2/models: Get a list of active models
GET v2/models/{model}: Get model metadata
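
Model metadata is useful for discovering tensor names, datatypes, and shapes before building an inference request. The sketch below assumes the response of `v2/models/{model}` follows the KServe v2 model metadata layout (a JSON object with `inputs` and `outputs` lists whose entries carry `name`, `datatype`, and `shape`); the server's actual payload may differ. The helper names `fetch_json` and `describe_tensors` are hypothetical.

```python
import json
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET `url` and parse the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def describe_tensors(model_metadata):
    """Summarize inputs and outputs as 'kind name: datatype shape' lines.

    Assumption: `model_metadata` uses the KServe v2 field names.
    """
    lines = []
    for kind in ("inputs", "outputs"):
        for tensor in model_metadata.get(kind, []):
            label = kind[:-1]  # "inputs" -> "input", "outputs" -> "output"
            lines.append(
                f"{label} {tensor['name']}: {tensor['datatype']} {tensor['shape']}"
            )
    return lines
```

A call such as `describe_tensors(fetch_json("http://localhost:8998/v2/models/mnist"))` would then list the tensors of a model named `mnist`, where both the address and the model name are placeholders.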
Inference
POST v2/repository/models/{model}/load: Load a model
POST v2/workers/{worker}/load: Load a worker. The response body contains the endpoint to use for subsequent requests
POST v2/repository/models/{model}/unload: Unload a model
POST v2/workers/{worker}/unload: Unload a worker
POST v2/models/{model}/infer: Make an inference request to a particular model
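
An inference request is a POST with a JSON body. The sketch below builds such a request in the KServe v2 style, where each input tensor has a `name`, `shape`, `datatype`, and flat `data` list. It assumes the server accepts that layout unchanged; the tensor name `input0`, the model name `mnist`, the base URL, and the helper names are placeholders.

```python
import json
import urllib.request


def build_infer_request(base_url, model, name, shape, datatype, data):
    """Build (without sending) a POST request for v2/models/{model}/infer."""
    payload = {
        "inputs": [
            {
                "name": name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),  # flattened tensor contents
            }
        ]
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v2/models/{model}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10.0):
    """Send a prepared request and parse the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating construction from sending keeps the payload easy to inspect or log. With a model loaded, `send(build_infer_request("http://localhost:8998", "mnist", "input0", [1, 4], "FP32", [0.0, 1.0, 2.0, 3.0]))` would return the parsed response.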
Observation
GET metrics: Get Prometheus metrics
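
The `metrics` endpoint is normally scraped by Prometheus itself, but reading it directly can help with quick debugging. The following sketch fetches the text and parses simple samples from the Prometheus text exposition format. It is a simplified parser, not a full implementation of the format; the base URL and the helper names are assumptions.

```python
import urllib.request


def fetch_metrics_text(base_url, timeout=5.0):
    """GET the metrics endpoint and return the raw text body."""
    url = f"{base_url.rstrip('/')}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_prometheus_text(text):
    """Map each series (name plus labels) to its float value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped, and an
    optional trailing timestamp is ignored.
    """
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        closing_brace = line.rfind("}")
        if closing_brace != -1:
            # Labelled series: everything up to the last "}" is the key.
            series = line[: closing_brace + 1]
            rest = line[closing_brace + 1 :].split()
        else:
            series, *rest = line.split()
        if rest:
            samples[series] = float(rest[0])
    return samples
```

For instance, `parse_prometheus_text(fetch_metrics_text("http://localhost:8998"))` would return a dictionary of the server's current metric values, keyed by whatever series names the server exposes.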