Modular: A Fast, Scalable Gen AI Inference Platform

A high-performance inference engine for building, optimizing, and deploying AI apps quickly. Run open models, scale across GPUs, and tap into CPU+GPU performance with Mojo.