File tensor_buffer.hpp

namespace xir

namespace vart

class TensorBuffer

#include <tensor_buffer.hpp>

Class of TensorBuffer.

Subclassed by TensorBufferExt

Public Types

enum class location_t

Values:

enumerator HOST_VIRT: Host only. data() should return a valid pair (0,Nonzero_u); data_phy() should return an invalid pair (0,0u).

enumerator HOST_PHY

Contiguous physical memory, shared between host and device.

Both data() and data_phy() should return a valid pair.

enumerator DEVICE_0

Only accessible by the device.

data() should return an invalid pair (0,0u); data_phy() should return a valid pair.

enumerator DEVICE_1: Only accessible by the device.

enumerator DEVICE_2: Only accessible by the device.

enumerator DEVICE_3: Only accessible by the device.

enumerator DEVICE_4: Only accessible by the device.

enumerator DEVICE_5: Only accessible by the device.

enumerator DEVICE_6: Only accessible by the device.

enumerator DEVICE_7: Only accessible by the device.
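
The access rules listed for each location can be summarised as two predicates. The following is only a sketch: it declares a local copy of the enum so that it compiles without the VART headers (real code would use vart::TensorBuffer::location_t), the helper names host_accessible and has_physical_address are hypothetical, and it assumes DEVICE_1 through DEVICE_7 follow the same rule that is spelled out for DEVICE_0.

```cpp
#include <cassert>

// Local stand-in for vart::TensorBuffer::location_t (assumption: declared
// here only so the sketch is self-contained; numeric values are not relied on).
enum class location_t {
  HOST_VIRT, HOST_PHY,
  DEVICE_0, DEVICE_1, DEVICE_2, DEVICE_3,
  DEVICE_4, DEVICE_5, DEVICE_6, DEVICE_7
};

// True when data() is expected to return a valid pair
// (host-only memory or memory shared between host and device).
constexpr bool host_accessible(location_t loc) {
  return loc == location_t::HOST_VIRT || loc == location_t::HOST_PHY;
}

// True when data_phy() is expected to return a valid pair
// (every location except host-only virtual memory).
constexpr bool has_physical_address(location_t loc) {
  return loc != location_t::HOST_VIRT;
}
```

A caller could branch on these predicates before deciding whether to use data() or data_phy().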

Public Functions

explicit TensorBuffer(const xir::Tensor *tensor)

virtual ~TensorBuffer() = default

virtual std::pair<std::uint64_t, std::size_t> data(const std::vector<std::int32_t> idx = {}) = 0

Get the data address of the index and the size of the data available for use.

Sample code:

vart::TensorBuffer* tb;
uint64_t data_addr = 0u;
size_t tensor_size = 0u;
std::tie(data_addr, tensor_size) = tb->data({0,0,0,0});

Parameters: idx – The index of the data to be accessed; it has the same number of dimensions as the tensor shape.
Returns: A pair of the data address at the index and the size, in bytes, of the data available for use.
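
Because data() reports the address as an integer, callers usually convert it to a pointer before use. The sketch below shows one way to do that. It assumes that a pair with size 0 is the invalid pair (0,0u) described under location_t, and the helper names is_valid and as_bytes are hypothetical, not part of VART.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

using addr_size_t = std::pair<std::uint64_t, std::size_t>;

// Assumption: a zero size marks the documented invalid pair (0,0u).
inline bool is_valid(const addr_size_t& p) { return p.second != 0u; }

// Convert the integer address returned by data() into a byte pointer.
// Returns nullptr for an invalid pair. Only meaningful for host virtual
// addresses; a physical address from data_phy() must not be dereferenced.
inline unsigned char* as_bytes(const addr_size_t& p) {
  if (!is_valid(p)) return nullptr;
  return reinterpret_cast<unsigned char*>(static_cast<std::uintptr_t>(p.first));
}
```

With a real buffer, the call would look like as_bytes(tb->data({0, 0, 0, 0})).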

inline virtual location_t get_location() const

Get the location of the tensor buffer.

Sample code:

vart::TensorBuffer* tb;
switch (tb->get_location()) {
  case vart::TensorBuffer::location_t::HOST_VIRT:
    // do nothing
    break;
  case vart::TensorBuffer::location_t::HOST_PHY:
    // do nothing
    break;
  default:
    // do nothing
    break;
}

Returns: The tensor buffer location, a location_t enum value: HOST_VIRT/HOST_PHY/DEVICE_*.

inline virtual std::pair<uint64_t, size_t> data_phy(const std::vector<std::int32_t> idx)

Get the data physical address of the index and the size of the data available for use.

Sample code:

vart::TensorBuffer* tb;
uint64_t phy_data = 0u;
size_t phy_size = 0u;
std::tie(phy_data, phy_size) = tb->data_phy({0, 0});

Parameters: idx – The index of the data to be accessed; it has the same number of dimensions as the tensor shape.
Returns: A pair of the physical data address at the index and the size, in bytes, of the data available for use.

inline virtual void sync_for_read(uint64_t offset, size_t size)

Invalidate the cache before reading. This is a no-op when get_location() returns a device-only location (DEVICE_*) or HOST_VIRT.

Sample code:

for (auto& output : output_tensor_buffers) {
    output->sync_for_read(0, output->get_tensor()->get_data_size() /
                                output->get_tensor()->get_shape()[0]);
}

Parameters:

offset – The start offset address.
size – The data size.

Returns:

void

inline virtual void sync_for_write(uint64_t offset, size_t size)

Flush the cache after writing. This is a no-op when get_location() returns a device-only location (DEVICE_*) or HOST_VIRT.

Sample code:

for (auto& input : input_tensor_buffers) {
    input->sync_for_write(0, input->get_tensor()->get_data_size() /
                              input->get_tensor()->get_shape()[0]);
}

Parameters:

offset – The start offset address.
size – The data size.

Returns:

void
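
Both sync samples pass get_data_size() / get_shape()[0] as the size, which is the byte count of a single batch element. A minimal sketch of that arithmetic follows, assuming shape[0] is the batch dimension and the data is split evenly across batches. The function name bytes_per_batch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes occupied by one batch element.
// total_bytes plays the role of get_tensor()->get_data_size() and
// shape the role of get_tensor()->get_shape() in the samples.
inline std::size_t bytes_per_batch(std::size_t total_bytes,
                                   const std::vector<std::int32_t>& shape) {
  // Guard against an empty shape or a non-positive batch dimension,
  // which would otherwise cause out-of-range access or division by zero.
  if (shape.empty() || shape[0] <= 0) return 0;
  return total_bytes / static_cast<std::size_t>(shape[0]);
}
```

For a tensor of shape {4, 224, 224, 3} holding 602112 bytes, this yields 150528 bytes per batch element.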

virtual void copy_from_host(size_t batch_idx, const void *buf, size_t size, size_t offset)

Copy data from a host source buffer into the tensor buffer.

Parameters:

batch_idx – the batch index.
buf – source buffer start address.
size – data size to be copied.
offset – the start offset to be copied.

Returns:

void

virtual void copy_to_host(size_t batch_idx, void *buf, size_t size, size_t offset)

Copy data from the tensor buffer to a host destination buffer.

Sample code:

vart::TensorBuffer* tb_from;
vart::TensorBuffer* tb_to;
uint64_t data = 0u;
size_t tensor_size = 0u;
for (auto batch = 0u; batch < batch_size; ++batch) {
    std::tie(data, tensor_size) = tb_to->data({(int)batch, 0, 0, 0});
    tb_from->copy_to_host(batch, reinterpret_cast<void*>(data),
                          tensor_size, 0u);
}

Parameters:

batch_idx – the batch index.
buf – destination buffer start address.
size – data size to be copied.
offset – the start offset to be copied.

Returns:

void
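
To make the roles of batch_idx, size and offset concrete, the sketch below models both copy functions on plain host memory. It is a hypothetical stand-in (MockHostBuffer is not a VART class) and rests on two assumptions that the reference does not state: batches are stored back to back, and offset is counted in bytes from the start of the selected batch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical host-memory model of a batched buffer, NOT the VART code.
class MockHostBuffer {
 public:
  MockHostBuffer(std::size_t batch, std::size_t batch_bytes)
      : batch_bytes_(batch_bytes), storage_(batch * batch_bytes) {}

  // Copy `size` bytes from `buf` into batch `batch_idx`, starting
  // `offset` bytes into that batch.
  void copy_from_host(std::size_t batch_idx, const void* buf,
                      std::size_t size, std::size_t offset) {
    std::memcpy(storage_.data() + locate(batch_idx, size, offset), buf, size);
  }

  // Copy `size` bytes out of batch `batch_idx`, starting `offset`
  // bytes into that batch, to `buf`.
  void copy_to_host(std::size_t batch_idx, void* buf, std::size_t size,
                    std::size_t offset) const {
    std::memcpy(buf, storage_.data() + locate(batch_idx, size, offset), size);
  }

 private:
  // Translate (batch, offset) into a position in storage_, rejecting
  // any range that would leave the selected batch.
  std::size_t locate(std::size_t batch_idx, std::size_t size,
                     std::size_t offset) const {
    if (batch_bytes_ == 0 || batch_idx >= storage_.size() / batch_bytes_ ||
        offset > batch_bytes_ || size > batch_bytes_ - offset) {
      throw std::out_of_range("copy outside the selected batch");
    }
    return batch_idx * batch_bytes_ + offset;
  }

  std::size_t batch_bytes_;
  std::vector<unsigned char> storage_;
};
```

Writing four bytes with copy_from_host(1, src, 4, 2) and reading them back with copy_to_host(1, dst, 4, 2) returns the same bytes, while batch 0 is left untouched.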

const xir::Tensor *get_tensor() const

Get tensor of TensorBuffer.

Returns: A pointer to the tensor.

virtual std::string to_string() const: for fancy log messages

Public Static Functions

static std::string to_string(location_t value): Returns a string describing a TensorBuffer location, for use in messages.

static void copy_tensor_buffer(vart::TensorBuffer *tb_from, vart::TensorBuffer *tb_to)

Copy the contents of one TensorBuffer to another.

Sample code:

vart::TensorBuffer* tb_from;
vart::TensorBuffer* tb_to;
vart::TensorBuffer::copy_tensor_buffer(tb_from, tb_to);

Parameters:

tb_from – the source TensorBuffer.
tb_to – the destination TensorBuffer.

Returns:

void

static std::unique_ptr<TensorBuffer> create_unowned_device_tensor_buffer(const xir::Tensor *tensor, uint64_t batch_addr[], size_t addr_arrsize)

Create an unowned device tensor buffer from device physical addresses for a tensor.

There are some limitations on the arguments:

The addr_arrsize must NOT be greater than the tensor batch.
The tensor must have attribute ddr_addr whose value must be 0.

Sample code:

auto runner = vart::RunnerExt::create_runner(subgraph, attrs);
auto input_tensors = runner->get_input_tensors();
auto output_tensors = runner->get_output_tensors();
std::vector<vart::TensorBuffer*> input_tensor_buffers;
std::vector<vart::TensorBuffer*> output_tensor_buffers;
uint64_t in_batch_addr[1];
uint64_t out_batch_addr[1];
in_batch_addr[0] = DEVICE_PHY_ADDRESS_IN;
out_batch_addr[0] = DEVICE_PHY_ADDRESS_OUT;
auto input_tb = vart::TensorBuffer::create_unowned_device_tensor_buffer(
      input_tensors[0], in_batch_addr, 1);
auto output_tb = vart::TensorBuffer::create_unowned_device_tensor_buffer(
      output_tensors[0], out_batch_addr, 1);
input_tensor_buffers.emplace_back(input_tb.get());
output_tensor_buffers.emplace_back(output_tb.get());
auto v = runner->execute_async(input_tensor_buffers, output_tensor_buffers);

Parameters:

tensor – XIR tensor pointer
batch_addr – Array which contains device physical address for each batch
addr_arrsize – The array size of batch_addr

Returns:

Unique pointer of created tensor buffer.
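
The two argument limitations can be checked before calling the factory. The sketch below is a hypothetical pre-flight check, not part of VART. It takes the tensor batch and the ddr_addr attribute value as plain arguments, because how they are read from the xir::Tensor is not covered here, and the integer type chosen for ddr_addr is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when both documented limitations hold:
//  - addr_arrsize is not greater than the tensor batch
//  - the tensor's ddr_addr attribute equals 0
// tensor_batch and ddr_addr are supplied by the caller (assumption: the
// attribute fits in a 64-bit signed integer).
inline bool unowned_args_ok(std::size_t tensor_batch,
                            std::size_t addr_arrsize,
                            std::int64_t ddr_addr) {
  return addr_arrsize <= tensor_batch && ddr_addr == 0;
}
```

A caller could skip create_unowned_device_tensor_buffer and report an error when this check fails.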

Protected Attributes

const xir::Tensor *tensor_

class TensorBufferExt : public TensorBuffer

Public Functions

virtual XclBo get_xcl_bo(int batch_index) const = 0

virtual std::string to_string() const: for fancy log messages

Protected Functions

inline explicit TensorBufferExt(const xir::Tensor *tensor)

Protected Attributes

const xir::Tensor *tensor_

struct XclBo

Public Members

void *xcl_handle

unsigned int bo_handle