Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased¶
Added¶
Changed¶
Refactor how global state is managed (#125)
Require the server as an argument for creating the NativeClient (#125)
Use a global memory pool to allocate memory for incoming requests (#149)
Resolve the request at the incoming server rather than the batcher (#164)
Add flags to run container-based tests in parallel (#168)
Bump up to Vitis AI 3.0 (#169)
Refactor inference request objects and tensors (#172)
Use const references throughout for ParameterMap (#172)
Update workers’ doRun method signature to produce and return a batch (#176)
Use TOML-based configuration files in the repository by default (#178)
Location of test model lists moved to tests directory (#180)
Deprecated¶
N/A
Removed¶
N/A
Fixed¶
Security¶
N/A
0.3.0 - 2023-02-01¶
Added¶
Allow building Debian package (@930fab2)
Add modelInferAsync to the API (@2f4a6c2)
Add inferAsyncOrdered as a client operator for making inferences in parallel (#66)
Support building Python wheels with cibuildwheel (#71)
Support XModels with multiple output tensors (#74)
Add FP16 support (#76)
Add Python bindings for gRPC and Native clients (#88)
Add tests with KServe (#90)
Add batch size flag to examples (#94)
Add Kubernetes test for KServe (#95)
Use exhale to generate Python API documentation (#95)
OpenAPI spec for REST protocol (#100)
Use a timer for simpler time measurement (#104)
Allow building containers with custom backend versions (#107)
Changed¶
Refactor pre- and post-processing functions in C++ (@42cf748)
Templatize Dockerfile for different base images (#71)
Use multiple HTTP clients internally for parallel HTTP requests (#66)
Update test asset downloading (#81)
Reimplement and align examples across platforms (#85)
Reorganize Python library (#88)
Rename ‘proteus’ to ‘amdinfer’ (#91)
Use Ubuntu 20.04 by default for Docker (#97)
Bump up to ROCm 5.4.1 (#99)
Some function names changed for style (#102)
Bump up to ZenDNN 4.0 (#113)
Deprecated¶
ALL_CAPS style enums for the DataType (#102)
Removed¶
Fixed¶
Use input tensors in requests correctly (#61)
Fix bug with multiple input tensors (#74)
Align gRPC responses using non-gRPC-native data types with other input protocols (#81)
Fix the Manager’s destructor (#88)
Fix using --no-user-config with proteus run (#89)
Handle assigning user permissions if the host UID is the same as the UID in the container (#101)
Fix test discovery if some test assets are missing (#105)
Fix gRPC queue shutdown race condition (#111)
0.2.0 - 2022-08-05¶
Added¶
HTTP/REST C++ client (@cbf33b8)
gRPC API based on KServe v2 API (@37a6aad and others)
‘ServerMetadata’ endpoint to the API (@7747911)
‘modelList’ endpoint to the API (@7477b7d)
Parse JSON data as string in HTTP body (@694800e)
Directory monitoring for model loading (@6459797)
‘ModelMetadata’ endpoint to the API (@22b9d1a)
MIGraphX backend (#34)
Pre-commit for style verification (@048bdd7)
Changed¶
Use Pybind11 to create Python API (#20)
Two logs are created now: server and client
Logging macro is now PROTEUS_LOG_*
Loading workers is now case-insensitive (@14ed4ef and @90a51ae)
Build AKS from source (@e04890f)
Use consistent custom exceptions (#30)
Update Docker build commands to opt-in to all backends (#43)
Renamed ‘modelLoad’ to ‘workerLoad’ and changed the behavior for ‘modelLoad’ (#27)
Fixed¶
Get the right request size in the batcher when enqueuing with the C++ API (@d1ad81d)
Construct responses correctly in the XModel worker if there are multiple input buffers (@d1ad81d)
Populate the right number of offsets in the hard batcher (@6666142)
Calculate offset values correctly during batching (@8c7534b)
Get correct library dependencies for production container (@14ed4ef)
Correctly throw an exception if a worker gets an error during initialization (#29)
Detect errors in HTTP client during loading (@99ffc33)
Construct batches with the right sizes (#57)
0.1.0 - 2022-02-08¶
Added¶
Core inference server functionality
Batching support
Support for running multiple workers simultaneously
Support for different batcher and buffer implementations
XModel support
Logging, metrics and tracing support
REST API based on KServe v2 API
C++ API
Python library for REST
Documentation, examples, and some tests
Experimental GUI