June 28th, 2024


## Feature

The following models have been added to [Model Garden](https://console.cloud.google.com/vertex-ai/model-garden):

* 36 Hugging Face embedding models with verified deployment settings such as [BAAI/bge-m3](https://console.cloud.google.com/vertex-ai/publishers/BAAI/model-garden/bge-m3;action=deploy;hfSource=true) and [intfloat/multilingual-e5-large-instruct](https://console.cloud.google.com/vertex-ai/publishers/intfloat/model-garden/multilingual-e5-large-instruct;action=deploy;hfSource=true).
* 35 Hugging Face PyTorch models with verified deployment settings such as [stabilityai/stable-diffusion-2-1](https://console.cloud.google.com/vertex-ai/publishers/stabilityai/model-garden/stable-diffusion-2-1;action=deploy;hfSource=true) and [HuggingFaceFW/fineweb-edu-classifier](https://console.cloud.google.com/vertex-ai/publishers/HuggingFaceFW/model-garden/fineweb-edu-classifier;action=deploy;hfSource=true).

For more information, see the [Hugging Face model deployment](http://console.cloud.google.com/vertex-ai/model-garden;action=deploy;hfSource=true) in the console.

## Feature

Launched [Hex-LLM](https://cloud.google.com/vertex-ai/generative-ai/docs/open-models/use-hex-llm) for high-efficiency large language model serving. This performant TPU serving solution is based on XLA and optimized kernels to achieve high throughput and low latency. Hex-LLM uses several parallelism strategies for multiple TPU chips, quantizations, dynamic LoRA, and more. Hex-LLM supports the following dense and sparse LLMs:

* Gemma 2B and 7B
* Gemma 2 9B and 27B
* Llama 2 7B, 13B and 70B
* Llama 3 8B and 70B
* Mistral 7B and Mixtral 8x7B

## Change

* Updated Docker images in [Llama 3 notebooks](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model%5Fgarden/model%5Fgarden%5Fpytorch%5Fllama3%5Ffinetuning.ipynb) that are more efficient at tuning.
* A notebook-based interactive workshop UI was added in [Model Garden](https://console.cloud.google.com/vertex-ai/model-garden) for image generative models such as [stable-diffusion-xl-base](https://console.cloud.google.com/vertex-ai/publishers/stability-ai/model-garden/stable-diffusion-xl-base), [image inpainting](https://console.cloud.google.com/vertex-ai/publishers/runwayml/model-garden/stable-diffusion-inpainting), and [controlnet](https://console.cloud.google.com/vertex-ai/publishers/lllyasviel/model-garden/control-net). You can find these models in the **Open Notebook** list.
* Colab Notebooks for frequently used models in Model Garden have been revised with no-code or low-code implementations to improve accessibility and user experience.
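
The four Hugging Face deployment links in the first Feature entry share a common shape: the Hugging Face publisher and model name are placed into a console path that ends in `;action=deploy;hfSource=true`. The sketch below builds such a link from a Hugging Face model ID. The pattern is inferred only from the links in this note and is not a documented, stable URL scheme; the helper name `hf_deploy_url` is hypothetical.

```python
def hf_deploy_url(model_id: str) -> str:
    """Build a console deploy link for a Hugging Face model ID such as 'BAAI/bge-m3'.

    Assumption: the URL shape is copied from the links in this release note
    and may not hold for every model or remain valid over time.
    """
    # Split 'publisher/model' at the first slash.
    publisher, sep, name = model_id.partition("/")
    # Reject IDs that are missing either part or contain extra slashes.
    if not sep or not publisher or not name or "/" in name:
        raise ValueError(f"expected 'publisher/model', got {model_id!r}")
    return (
        "https://console.cloud.google.com/vertex-ai/publishers/"
        f"{publisher}/model-garden/{name};action=deploy;hfSource=true"
    )


if __name__ == "__main__":
    print(hf_deploy_url("intfloat/multilingual-e5-large-instruct"))
```

Whether a given link opens a deployment page with verified settings depends on the model being among those added to Model Garden, which this helper cannot check.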