ZeRO-Inference: Democratizing massive model inference

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.