Introducing Amazon SageMaker Operators for Kubernetes
[Amazon SageMaker Operators for Kubernetes](https://github.com/aws/amazon-sagemaker-operator-for-k8s) make it easier for developers and data scientists using Kubernetes to train, tune, and deploy machine learning (ML) models in Amazon SageMaker.
Customers use Kubernetes, a general-purpose container orchestration system, to set up repeatable pipelines and maintain greater control and portability over their workloads. But when running ML workloads in Kubernetes, customers also have to manage and optimize the underlying ML infrastructure, ensure high availability and reliability, provide ML tools to make data scientists more productive, and comply with appropriate security and regulatory requirements. With Amazon SageMaker Operators for Kubernetes, customers can invoke SageMaker using the Kubernetes API or Kubernetes tools such as kubectl to create and interact with their ML jobs in SageMaker. This gives Kubernetes customers the portability and standardization benefits of Kubernetes and EKS, along with the benefits of fully managed ML services with Amazon SageMaker.
Customers can use Amazon SageMaker Operators for model training, model hyperparameter optimization, real-time inference, and batch inference. For model training, Kubernetes customers can now leverage all the benefits of fully managed ML model training in SageMaker, including [Managed Spot Training](https://aws.amazon.com/blogs/aws/managed-spot-training-save-up-to-90-on-your-amazon-sagemaker-training-jobs/) to save up to 90% in cost, and distributed training to reduce training time by scaling to multiple GPU nodes. Compute resources are only provisioned when requested, scaled as needed, and shut down automatically when jobs complete, ensuring near 100% utilization. For hyperparameter tuning, customers can use SageMaker’s Automatic Model Tuning, saving data scientists days or even weeks of time improving model accuracy. Customers can also use Spot instances for Automatic Model Tuning. For inference, customers can use SageMaker Operators to deploy trained models in SageMaker to fully managed auto-scaling clusters, spread across multiple Availability Zones to deliver high performance and availability for real-time or batch prediction.
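As a rough illustration of what creating a training job through the Kubernetes API can look like, the sketch below builds a `TrainingJob` custom resource as a Python dictionary and prints it as JSON. The helper name `build_training_job`, the image URI, role ARN, bucket paths, and instance type are all hypothetical placeholders. The `apiVersion` and field names are assumptions modeled on the samples in the GitHub repository and should be checked against the CRD version installed in your cluster.

```python
import json


def build_training_job(name, image, role_arn, region, s3_input, s3_output,
                       use_spot=True):
    """Return a TrainingJob custom resource as a plain dict.

    Hypothetical helper: field names are assumed from the operator's
    published samples, not guaranteed to match every CRD version.
    """
    spec = {
        "algorithmSpecification": {
            "trainingImage": image,
            "trainingInputMode": "File",
        },
        "roleArn": role_arn,
        "region": region,
        "inputDataConfig": [{
            "channelName": "train",
            "dataSource": {
                "s3DataSource": {
                    "s3DataType": "S3Prefix",
                    "s3Uri": s3_input,
                    "s3DataDistributionType": "FullyReplicated",
                }
            },
        }],
        "outputDataConfig": {"s3OutputPath": s3_output},
        "resourceConfig": {
            "instanceCount": 1,
            "instanceType": "ml.m5.xlarge",  # placeholder instance type
            "volumeSizeInGB": 5,
        },
        "stoppingCondition": {"maxRuntimeInSeconds": 3600},
    }
    if use_spot:
        # Managed Spot Training: the wait time must be at least as long
        # as the maximum runtime.
        spec["enableManagedSpotTraining"] = True
        spec["stoppingCondition"]["maxWaitTimeInSeconds"] = 7200
    return {
        "apiVersion": "sagemaker.aws.amazon.com/v1",  # assumed API group/version
        "kind": "TrainingJob",
        "metadata": {"name": name},
        "spec": spec,
    }


# All values below are placeholders for illustration only.
manifest = build_training_job(
    name="example-training-job",
    image="<training-image-uri>",
    role_arn="arn:aws:iam::123456789012:role/example-sagemaker-role",
    region="us-east-2",
    s3_input="s3://example-bucket/train/",
    s3_output="s3://example-bucket/output/",
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file such as `trainingjob.json` and submitted with `kubectl apply -f trainingjob.json`, assuming the operator and its CRDs are already installed in the cluster.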
Amazon SageMaker Operators for Kubernetes are generally available as of this writing in US East (Ohio), US East (N. Virginia), US West (Oregon), and EU (Ireland). You can get started with step-by-step tutorials in our [user guide](https://sagemaker.readthedocs.io/en/stable/amazon%5Fsagemaker%5Foperators%5Ffor%5Fkubernetes.html) and [GitHub repository](https://github.com/aws/amazon-sagemaker-operator-for-k8s).