Class Client

Inheritance Relationships

Derived Types

  • public amdinfer::GrpcClient

  • public amdinfer::HttpClient

  • public amdinfer::NativeClient

  • public amdinfer::WebSocketClient

Class Documentation

class Client

The base Client class defines the set of methods that all client implementations must provide. These methods are based on the API defined by KServe, with some extensions. This is an abstract class: its methods are pure virtual, so one of the derived types must be instantiated to use it.

Subclassed by amdinfer::GrpcClient, amdinfer::HttpClient, amdinfer::NativeClient, amdinfer::WebSocketClient
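
Since Client itself cannot be instantiated, usage always goes through one of the derived types. The sketch below is illustrative only: it assumes the umbrella amdinfer/amdinfer.hpp header and an HttpClient constructed from a server address, and the address itself is a placeholder.

   #include <iostream>
   #include <memory>

   #include "amdinfer/amdinfer.hpp"

   int main() {
     // HttpClient is one concrete Client implementation; the address and
     // port here are placeholders for wherever the server is running
     std::unique_ptr<amdinfer::Client> client =
       std::make_unique<amdinfer::HttpClient>("http://127.0.0.1:8998");

     // every call below this point goes through the abstract Client API
     std::cout << std::boolalpha << client->serverLive() << "\n";
     return 0;
   }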

Public Functions

virtual ~Client() = default

Destructor.

virtual ServerMetadata serverMetadata() const = 0

Returns the server metadata as a ServerMetadata object.

Returns

ServerMetadata
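
For illustration, a minimal sketch that prints the server's identity, assuming ServerMetadata exposes name and version members mirroring the KServe metadata response:

   // `client` is any concrete amdinfer::Client implementation
   amdinfer::ServerMetadata metadata = client.serverMetadata();
   std::cout << metadata.name << " " << metadata.version << "\n";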

virtual bool serverLive() const = 0

Checks if the server is live.

Returns

bool - true if server is live, false otherwise

virtual bool serverReady() const = 0

Checks if the server is ready.

Returns

bool - true if server is ready, false otherwise
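
Liveness and readiness are commonly polled together before issuing any requests. A minimal sketch, assuming a client constructed as in the earlier example:

   #include <chrono>
   #include <thread>

   // crude retry loop: block until the server is both live and ready
   void waitForServer(const amdinfer::Client& client) {
     while (!client.serverLive() || !client.serverReady()) {
       std::this_thread::sleep_for(std::chrono::milliseconds(100));
     }
   }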

virtual bool modelReady(const std::string &model) const = 0

Checks if a model/worker is ready.

Parameters

model – name of the model to check

Returns

bool - true if model is ready, false otherwise

virtual ModelMetadata modelMetadata(const std::string &model) const = 0

Returns the metadata associated with a ready model/worker.

Parameters

model – name of the model/worker to get metadata for

Returns

ModelMetadata
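
A typical pattern is to poll modelReady after loading and only then request the metadata. A sketch, reusing the client from above; the model name "mnist" is hypothetical:

   // block until the model reports ready, then fetch its metadata
   while (!client.modelReady("mnist")) {
     std::this_thread::sleep_for(std::chrono::milliseconds(100));
   }
   amdinfer::ModelMetadata metadata = client.modelMetadata("mnist");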

virtual void modelLoad(const std::string &model, RequestParameters *parameters) const = 0

Loads a model with the given name and load-time parameters. This method assumes that a directory with this model's name already exists in the server's model repository directory, containing the model and its metadata in the correct format.

Parameters
  • model – name of the model to load from the model repository directory

  • parameters – load-time parameters for the worker supporting the model

virtual void modelUnload(const std::string &model) const = 0

Unloads a previously loaded model and shuts it down. This is identical in functionality to workerUnload and is provided for symmetry.

Parameters

model – name of the model to unload
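
The two calls above bracket a model's lifetime. A sketch of the load/unload cycle; the model name and the "batch_size" parameter key are illustrative, since valid keys are worker-specific:

   // load-time parameters are worker-specific; "batch_size" is illustrative
   amdinfer::RequestParameters parameters;
   parameters.put("batch_size", 4);

   // assumes a directory named "mnist" already exists in the server's
   // model repository
   client.modelLoad("mnist", &parameters);

   // ... make inference requests against "mnist" ...

   client.modelUnload("mnist");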

virtual InferenceResponse modelInfer(const std::string &model, const InferenceRequest &request) const = 0

Makes a synchronous inference request to the given model/worker. The contents of the request depend on the model/worker that the request is for.

Parameters
  • model – name of the model/worker to request inference from

  • request – the request

Returns

InferenceResponse
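
A synchronous round trip might look like the following sketch. The tensor shape, data type, and model name are placeholders; addInputTensor is assumed to take a raw pointer, shape, and data type as in the project's examples, and spellings such as DataType::Fp32 may differ between versions.

   #include <vector>

   // build a request with a single dummy input tensor
   std::vector<float> data(28 * 28, 0.0F);
   amdinfer::InferenceRequest request;
   request.addInputTensor(data.data(), {28, 28}, amdinfer::DataType::Fp32);

   amdinfer::InferenceResponse response = client.modelInfer("mnist", request);
   if (!response.isError()) {
     // each output holds the data and metadata for one output tensor
     auto outputs = response.getOutputs();
   }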

virtual InferenceResponseFuture modelInferAsync(const std::string &model, const InferenceRequest &request) const = 0

Makes an asynchronous inference request to the given model/worker. The contents of the request depend on the model/worker that the request is for. The caller must save the returned future and use it to retrieve the results of the inference later.

Parameters
  • model – name of the model/worker to request inference from

  • request – the request

Returns

InferenceResponseFuture
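
The asynchronous variant returns immediately; the caller keeps the future and collects the result later. A sketch, reusing the request built in the previous example:

   // the returned future owns the eventual response
   amdinfer::InferenceResponseFuture future =
     client.modelInferAsync("mnist", request);

   // ... do other work while the inference runs ...

   // get() blocks until the response is available
   amdinfer::InferenceResponse response = future.get();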

virtual std::vector<std::string> modelList() const = 0

Gets a list of active models on the server, returning their names.

Returns

std::vector<std::string>
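
For example, to print every active model:

   for (const std::string& model : client.modelList()) {
     std::cout << model << "\n";
   }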

virtual std::string workerLoad(const std::string &worker, RequestParameters *parameters) const = 0

Loads a worker with the given name and load-time parameters.

Parameters
  • worker – name of the worker to load

  • parameters – load-time parameters for the worker

Returns

std::string

virtual void workerUnload(const std::string &worker) const = 0

Unloads a previously loaded worker and shuts it down. This is identical in functionality to modelUnload and is provided for symmetry.

Parameters

worker – name of the worker to unload
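
A sketch of the worker lifecycle. The "echo" worker name is illustrative, and treating the string returned by workerLoad as the endpoint for subsequent calls is an assumption based on the project's examples:

   amdinfer::RequestParameters parameters;
   // the returned string identifies the loaded worker (assumed here to be
   // usable as the endpoint for inference and unloading)
   std::string endpoint = client.workerLoad("echo", &parameters);

   amdinfer::InferenceResponse response = client.modelInfer(endpoint, request);

   client.workerUnload(endpoint);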

virtual bool hasHardware(const std::string &name, int num) const = 0

Checks if the server has the requested number of a specific hardware device.

Parameters
  • name – name of the hardware device to check

  • num – minimum number of such devices that should exist

Returns

bool - true if server has at least the requested number of the hardware device, false otherwise
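
This is useful for guarding device-specific work. The device name string below is hypothetical and depends on what the server reports:

   // only proceed if at least one such device is available
   if (client.hasHardware("DPUCADF8H", 1)) {
     // ... load a worker that requires that device ...
   }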

Protected Functions

Client()

Constructor. Protected, so it is accessible only to derived classes.