Roadmap¶
The AMD Inference Server is in active development, and this is a tentative, non-exhaustive roadmap of the features we would like to add. It is subject to change based on our own assessment and on feedback from the community, both of which may affect which features take priority. More detailed information about ongoing and completed work can be found in the change log and the GitHub roadmap.
2022¶
2022 Q1¶
gRPC support (series of commits starting in @37a6aad)
2022 Q2¶
2022 Q3¶
GPU support (#34)
2023¶
The theme for 2023 is ease-of-use and performance. These two prongs are related: both are ways of engaging users and driving development. Ease-of-use means improving documentation and expanding testing with different models and devices, so that we can provide guides showing users how to do the same. Making it easier to install and get started is a big part of that too. As test coverage expands, the question inevitably gets asked: how does it compare to the alternatives? Measuring performance and reporting results in a consistent, reproducible manner therefore becomes important. The quality of those results should then guide what internal changes are needed to improve performance. Having these results to compare against is also useful for maintaining the numbers.
2023 Q1¶
Benchmarking with MLPerf
2023 Q2¶
Refactor memory model
Enable installation without Docker
2023 Q3¶
Expanded testing with models in Vitis AI model zoo