How to run gpt-oss with vLLM

vLLM is an open-source, high-throughput inference engine designed to efficiently serve large language models (LLMs) by optimizing memory usa

cookbook.openai.com