September 25th, 2024

Customize your Amazon SageMaker model deployment software and driver versions


You can now choose the software and driver versions that best fit your needs for the instances used when deploying models on SageMaker. Amazon SageMaker makes it easier to deploy ML models, including foundation models (FMs), to make inference requests at the best price performance for any use case. Previously, you had to use the preset software and driver versions defined by SageMaker on the managed instances behind an endpoint. Now you can specify the “InferenceAmiVersion” parameter when configuring endpoints to select the combination of software and driver versions (such as the NVIDIA driver and CUDA version) that best meets your requirements. This allows you to tailor your hosting environment to the performance, compatibility, scalability, and operational requirements of your ML applications. By using this parameter, you can also upgrade or downgrade driver versions for your endpoints on your own schedule. This feature is available in all regions where SageMaker is available. You can learn more about deploying models on SageMaker [here](https://aws.amazon.com/sagemaker/deploy/) and more about this feature in [our documentation](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API%5FProductionVariant.html#sagemaker-Type-ProductionVariant-InferenceAmiVersion).
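As a rough illustration of where the parameter goes, the sketch below builds an endpoint configuration request with “InferenceAmiVersion” set on a production variant and passes it to the boto3 `create_endpoint_config` call. The config name, model name, instance type, and the AMI version string `al2-ami-sagemaker-inference-gpu-2` are placeholder assumptions for illustration; check the documentation linked above for the values valid for your instance type and region.

```python
def build_endpoint_config(config_name, model_name, instance_type,
                          ami_version, instance_count=1):
    """Build keyword arguments for SageMaker's CreateEndpointConfig call.

    All argument values are supplied by the caller; nothing here is
    validated against the list of AMI versions SageMaker accepts.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
                # Selects the software and driver combination for the
                # instances behind this variant.
                "InferenceAmiVersion": ami_version,
            }
        ],
    }


def create_endpoint_config(request):
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    client = boto3.client("sagemaker")
    return client.create_endpoint_config(**request)


# Placeholder names and values, assumed for illustration only.
request = build_endpoint_config(
    config_name="my-endpoint-config",
    model_name="my-model",
    instance_type="ml.g5.xlarge",
    ami_version="al2-ami-sagemaker-inference-gpu-2",
)
```

Calling `create_endpoint_config(request)` would submit the configuration. To move an existing endpoint to a different driver version, one approach is to create a new endpoint configuration with a different “InferenceAmiVersion” value and update the endpoint to use it.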