



## Feature

**vLLM TPU**

[vLLM TPU](https://cloud.google.com/vertex-ai/generative-ai/docs/open-models/vllm/use-vllm-tpu), a highly efficient serving framework for large language models (LLMs) that's optimized for [Cloud TPU](https://cloud.google.com/tpu) hardware, is available through Model Garden.