Function amdinfer::inferAsyncOrderedBatched

Function Documentation

std::vector<InferenceResponse> amdinfer::inferAsyncOrderedBatched(Client *client, const std::string &model, const std::vector<InferenceRequest> &requests, size_t batch_size)

Makes inference requests to the specified model in parallel batches. Each batch of requests is sent in parallel and its responses are appended to a vector. Once all the responses have been received, the response vector is returned.

Parameters
  • client – a pointer to a client object

  • model – the model/worker to make inference requests to

  • requests – a vector of requests

  • batch_size – the number of requests to send in parallel in each batch

Returns

std::vector<InferenceResponse> – a vector of responses, one per request
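
Example

A minimal sketch of how this helper might be called. The concrete client type (HttpClient), the server address, the umbrella header name, and the model name "mnist" are assumptions for illustration and are not part of this page; any Client subclass should work since the function accepts a Client pointer.

#include <string>
#include <vector>

#include "amdinfer/amdinfer.hpp"  // assumed umbrella header for the client API

int main() {
  // Assumption: an HTTP client pointed at a locally running server;
  // swap in whichever Client implementation your deployment uses
  amdinfer::HttpClient client{"http://127.0.0.1:8998"};

  // Build the requests to send; "mnist" below is a hypothetical model name
  std::vector<amdinfer::InferenceRequest> requests(8);
  for (auto& request : requests) {
    // ... populate each request's input tensors here
    (void)request;
  }

  // Send the requests four at a time; the returned vector holds one
  // response per request
  std::vector<amdinfer::InferenceResponse> responses =
      amdinfer::inferAsyncOrderedBatched(&client, "mnist", requests, 4);

  for (const auto& response : responses) {
    // ... inspect each InferenceResponse
    (void)response;
  }
  return 0;
}

As the "Ordered" in the name suggests, the response at index i is expected to correspond to the request at index i, so results can be matched back to their inputs by position. The batch_size argument bounds how many requests are in flight at once, trading client-side parallelism against server load.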