Class Client

Inheritance Relationships

Derived Types

  • public amdinfer::GrpcClient

  • public amdinfer::HttpClient

  • public amdinfer::NativeClient

  • public amdinfer::WebSocketClient

Class Documentation

class Client

The base Client class defines the set of methods that all client implementations must provide. These methods are based on the API defined by KServe, with some extensions. This is an abstract class: its methods are pure virtual, so one of the derived types must be instantiated to use it.

Subclassed by amdinfer::GrpcClient, amdinfer::HttpClient, amdinfer::NativeClient, amdinfer::WebSocketClient
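
Since Client itself cannot be instantiated, usage always goes through one of the derived types. The sketch below is illustrative only: it assumes the umbrella amdinfer/amdinfer.hpp header and an HttpClient constructed from a server address, and the address itself is a placeholder.

   #include <iostream>
   #include <memory>

   #include "amdinfer/amdinfer.hpp"

   int main() {
     // HttpClient is one concrete Client implementation; the address and
     // port here are placeholders for wherever the server is running
     std::unique_ptr<amdinfer::Client> client =
       std::make_unique<amdinfer::HttpClient>("http://127.0.0.1:8998");

     // every call below this point goes through the abstract Client API
     std::cout << std::boolalpha << client->serverLive() << "\n";
     return 0;
   }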

Public Functions

virtual ~Client() = default

Destructor.

virtual ServerMetadata serverMetadata() const = 0

Returns the server metadata as a ServerMetadata object.

Returns

ServerMetadata
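
For illustration, a minimal sketch that prints the server's identity, assuming ServerMetadata exposes name and version members mirroring the KServe metadata response:

   // `client` is any concrete amdinfer::Client implementation
   amdinfer::ServerMetadata metadata = client.serverMetadata();
   std::cout << metadata.name << " " << metadata.version << "\n";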

virtual bool serverLive() const = 0

Checks if the server is live.

Returns

bool - true if server is live, false otherwise

virtual bool serverReady() const = 0

Checks if the server is ready.

Returns

bool - true if server is ready, false otherwise
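
Liveness and readiness are commonly polled together before issuing any requests. A minimal sketch, assuming a client constructed as in the earlier example:

   #include <chrono>
   #include <thread>

   // crude retry loop: block until the server is both live and ready
   void waitForServer(const amdinfer::Client& client) {
     while (!client.serverLive() || !client.serverReady()) {
       std::this_thread::sleep_for(std::chrono::milliseconds(100));
     }
   }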

virtual bool modelReady(const std::string &model) const = 0

Checks if a model/worker is ready.

Parameters

model – name of the model to check

Returns

bool - true if model is ready, false otherwise

virtual ModelMetadata modelMetadata(const std::string &model) const = 0

Returns the metadata associated with a ready model/worker.

Parameters

model – name of the model/worker to get metadata for

Returns

ModelMetadata
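
A typical pattern is to poll modelReady after loading and only then request the metadata. A sketch, reusing the client from above; the model name "mnist" is hypothetical:

   // block until the model reports ready, then fetch its metadata
   while (!client.modelReady("mnist")) {
     std::this_thread::sleep_for(std::chrono::milliseconds(100));
   }
   amdinfer::ModelMetadata metadata = client.modelMetadata("mnist");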

virtual void modelLoad(const std::string &model, RequestParameters *parameters) const = 0

Loads a model with the given name and load-time parameters. This method assumes that a directory with this model's name already exists in the server's model repository directory, containing the model and its metadata in the correct format.

Parameters
  • model – name of the model to load from the model repository directory

  • parameters – load-time parameters for the worker supporting the model

virtual void modelUnload(const std::string &model) const = 0

Unloads a previously loaded model and shuts it down. This is identical in functionality to workerUnload and is provided for symmetry.

Parameters

model – name of the model to unload
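
The two calls above bracket a model's lifetime. A sketch of the load/unload cycle; the model name and the "batch_size" parameter key are illustrative, since valid keys are worker-specific:

   // load-time parameters are worker-specific; "batch_size" is illustrative
   amdinfer::RequestParameters parameters;
   parameters.put("batch_size", 4);

   // assumes a directory named "mnist" already exists in the server's
   // model repository
   client.modelLoad("mnist", &parameters);

   // ... make inference requests against "mnist" ...

   client.modelUnload("mnist");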

virtual InferenceResponse modelInfer(const std::string &model, const InferenceRequest &request) const = 0

Makes a synchronous inference request to the given model/worker. The contents of the request depend on the model/worker that the request is for.

Parameters
  • model – name of the model/worker to request inference from

  • request – the request

Returns

InferenceResponse
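
A synchronous round trip might look like the following sketch. The tensor shape, data type, and model name are placeholders; addInputTensor is assumed to take a raw pointer, shape, and data type as in the project's examples, and spellings such as DataType::Fp32 may differ between versions.

   #include <vector>

   // build a request with a single dummy input tensor
   std::vector<float> data(28 * 28, 0.0F);
   amdinfer::InferenceRequest request;
   request.addInputTensor(data.data(), {28, 28}, amdinfer::DataType::Fp32);

   amdinfer::InferenceResponse response = client.modelInfer("mnist", request);
   if (!response.isError()) {
     // each output holds the data and metadata for one output tensor
     auto outputs = response.getOutputs();
   }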

virtual InferenceResponseFuture modelInferAsync(const std::string &model, const InferenceRequest &request) const = 0

Makes an asynchronous inference request to the given model/worker. The contents of the request depend on the model/worker that the request is for. The caller must save the returned future and use it to retrieve the results of the inference later.

Parameters
  • model – name of the model/worker to request inference from

  • request – the request

Returns

InferenceResponseFuture
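
The asynchronous variant returns immediately; the caller keeps the future and collects the result later. A sketch, reusing the request built in the previous example:

   // the returned future owns the eventual response
   amdinfer::InferenceResponseFuture future =
     client.modelInferAsync("mnist", request);

   // ... do other work while the inference runs ...

   // get() blocks until the response is available
   amdinfer::InferenceResponse response = future.get();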

virtual std::vector<std::string> modelList() const = 0

Gets a list of active models on the server, returning their names.

Returns

std::vector<std::string>
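
For example, to print every active model:

   for (const std::string& model : client.modelList()) {
     std::cout << model << "\n";
   }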

virtual std::string workerLoad(const std::string &worker, RequestParameters *parameters) const = 0

Loads a worker with the given name and load-time parameters.

Parameters
  • worker – name of the worker to load

  • parameters – load-time parameters for the worker

Returns

std::string

virtual void workerUnload(const std::string &worker) const = 0

Unloads a previously loaded worker and shuts it down. This is identical in functionality to modelUnload and is provided for symmetry.

Parameters

worker – name of the worker to unload
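
A sketch of the worker lifecycle. The "echo" worker name is illustrative, and treating the string returned by workerLoad as the endpoint for subsequent calls is an assumption based on the project's examples:

   amdinfer::RequestParameters parameters;
   // the returned string identifies the loaded worker (assumed here to be
   // usable as the endpoint for inference and unloading)
   std::string endpoint = client.workerLoad("echo", &parameters);

   amdinfer::InferenceResponse response = client.modelInfer(endpoint, request);

   client.workerUnload(endpoint);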

virtual bool hasHardware(const std::string &name, int num) const = 0

Checks if the server has the requested number of a specific hardware device.

Parameters
  • name – name of the hardware device to check

  • num – minimum number of such devices that should exist

Returns

bool - true if server has at least the requested number of the hardware device, false otherwise
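
This is useful for guarding device-specific work. The device name string below is hypothetical and depends on what the server reports:

   // only proceed if at least one such device is available
   if (client.hasHardware("DPUCADF8H", 1)) {
     // ... load a worker that requires that device ...
   }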

Protected Functions

Client()

Constructor. Protected, so it is accessible only to derived classes.