Kubernetes AI toolchain operator
You can now run specialized machine learning workloads like large language models (LLMs) on Azure Kubernetes Service (AKS) more cost-effectively and with less manual configuration.
The initial release of the **Kubernetes AI toolchain operator**, an open source project, automates LLM deployment on AKS by selecting optimally sized infrastructure from the available CPU and GPU resources for a given model. It makes it easy to split inferencing across multiple lower GPU-count VMs, which increases the number of Azure regions where workloads can run, eliminates wait times for higher GPU-count VMs, and lowers overall cost. You can also choose from preset models with images hosted by AKS, significantly reducing overall inference service setup time.
<https://aka.ms/aks/ai-toolchain-operator>
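With the operator installed on an AKS cluster, a deployment is declared as a custom resource that names a GPU VM size and a preset model; the operator then provisions the nodes and stands up the inference service. A minimal sketch of such a manifest is below — the `instanceType` value and the preset name `falcon-7b` are illustrative assumptions, so check the project documentation for the presets and API version available in your release:

```yaml
# Illustrative KAITO Workspace manifest (field names assumed from the
# open source project; verify against your installed CRD version).
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  # GPU VM size the operator should provision for inference
  # (example value; pick a size available in your region).
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    # Preset model image hosted for AKS (assumed preset name).
    name: "falcon-7b"
```

Applying this with `kubectl apply -f workspace.yaml` would ask the operator to create the GPU node(s) and expose the model behind a cluster service, rather than you sizing and wiring that infrastructure by hand.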
* Azure Kubernetes Service (AKS)
* Open Source
* Microsoft Ignite
* [ Azure Kubernetes Service (AKS)](https://azure.microsoft.com/en-gb/products/kubernetes-service/)