Amazon Elastic Inference
Amazon Elastic Inference is a service that lets you attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances, reducing the cost of running deep learning inference by up to 75%. Amazon Elastic Inference supports TensorFlow, Apache MXNet, and ONNX models (ONNX models run through MXNet).
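As a rough illustration of how an accelerator is attached at launch, the sketch below builds the keyword arguments that boto3's `ec2.run_instances` call accepts via its `ElasticInferenceAccelerators` parameter. The AMI ID, instance type, and accelerator size used here are placeholder assumptions, not values from this page:

```python
# Sketch: constructing an EC2 RunInstances request that attaches an
# Elastic Inference accelerator at launch. The AMI ID, instance type,
# and accelerator size are illustrative placeholders.

def build_run_instances_params(ami_id, instance_type, accelerator_type, count=1):
    """Return keyword arguments for boto3's ec2.run_instances call."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Requests `count` EI accelerators of the given size for the instance.
        "ElasticInferenceAccelerators": [
            {"Type": accelerator_type, "Count": count}
        ],
    }

params = build_run_instances_params("ami-0123456789abcdef0", "m5.large", "eia2.medium")
# With boto3 installed and AWS credentials configured, this would be used as:
#   boto3.client("ec2").run_instances(**params)
print(params["ElasticInferenceAccelerators"])
```

Building the parameters separately from the API call keeps the sketch runnable without AWS credentials; in practice the dict is passed straight to `run_instances`.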
Amazon announces new NVIDIA Triton Inference Server on Amazon SageMaker
November 12th, 2021
Amazon SageMaker now supports inference endpoint testing from SageMaker Studio
September 16th, 2021
Amazon EC2 Inf1 instances based on AWS Inferentia now available in 6 additional regions
November 19th, 2020
Attach multiple Elastic Inference accelerators to a single EC2 instance
December 12th, 2019