Roadmap

The AMD Inference Server is in active development, and this is a tentative and non-exhaustive roadmap of the features we would like to add. It is subject to change based on our own assessment and on feedback from the community, both of which may affect which features take priority. More detailed information about ongoing and completed work can be found in the change log and the GitHub roadmap.

2022

2022 Q1

  • gRPC support (series of commits starting in @37a6aad)

2022 Q2

  • ZenDNN CPU support (#17 and #21)

  • Official integration with KServe (KServe website #179)

2022 Q3

  • GPU support (#34)

2023

The theme for 2023 is ease-of-use and performance. These two prongs are connected as two ways of engaging users and driving development. Ease-of-use means improving documentation and expanding testing with different models and devices, then providing guides so users can do the same. Making it easier to install and get started is a big part of that too. As test coverage expands, the question inevitably gets asked: how does it compare to the alternatives? Thus, measuring performance and reporting results in a consistent, reproducible manner becomes important. The quality of those results should then guide what internal changes are needed to improve performance. Having these results as a baseline is also useful for maintaining those numbers over time.

2023 Q1

  • Benchmarking with MLPerf

2023 Q2

  • Refactor memory model

  • Enable installation without Docker

2023 Q3

  • Expanded testing with models in Vitis AI model zoo

2023 Q4

Future