GitHub - llm-d/llm-d: Achieve state-of-the-art inference performance with modern accelerators on Kubernetes