Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

Added

Allow building Debian package (@930fab2)
Add modelInferAsync to the API (@2f4a6c2)
Add inferAsyncOrdered as a client operator for making inferences in parallel (#66)
Support building Python wheels with cibuildwheel (#71)
Support XModels with multiple output tensors (#74)
Add FP16 support (#76)
Add more documentation (#85, #90)
Add Python bindings for gRPC and Native clients (#88)
Add tests with KServe (#90)
Add batch size flag to examples (#94)
Add Kubernetes test for KServe (#95)
Use exhale to generate Python API documentation (#95)
Add OpenAPI spec for REST protocol (#100)
Use a timer for simpler time measurement (#104)
Allow building containers with custom backend versions (#107)

Removed

Mappings between XIR data types <-> inference server data types from public API (#102)
Web GUI (#110)

Fixed

Use input tensors in requests correctly (#61)
Fix bug with multiple input tensors (#74)
Align gRPC responses using non-gRPC-native data types with other input protocols (#81)
Fix the Manager’s destructor (#88)
Fix using --no-user-config with proteus run (#89)
Handle assigning user permissions if the host UID is the same as the UID in the container (#101)
Fix test discovery if some test assets are missing (#105)
Fix gRPC queue shutdown race condition (#111)

Changed

Use Pybind11 to create Python API (#20)
Two logs are created now: server and client
Logging macro is now PROTEUS_LOG_*
Loading workers is now case-insensitive (@14ed4ef and @90a51ae)
Build AKS from source (@e04890f)
Use consistent custom exceptions (#30)
Update Docker build commands to opt-in to all backends (#43)
Rename ‘modelLoad’ to ‘workerLoad’ and change the behavior of ‘modelLoad’ (#27)

Fixed

Get the right request size in the batcher when enqueuing with the C++ API (@d1ad81d)
Construct responses correctly in the XModel worker if there are multiple input buffers (@d1ad81d)
Populate the right number of offsets in the hard batcher (@6666142)
Calculate offset values correctly during batching (@8c7534b)
Get correct library dependencies for production container (@14ed4ef)
Correctly throw an exception if a worker gets an error during initialization (#29)
Detect errors in HTTP client during loading (@99ffc33)
Construct batches with the right sizes (#57)