Amazon Bedrock expands support for Service Quotas
Amazon Bedrock is a fully managed service that provides secure, enterprise-grade access to high-performing foundation models from leading AI companies, enabling you to build and scale generative AI applications. Amazon Bedrock customers can now view inference quotas for the bedrock-mantle endpoint through AWS Service Quotas. This gives customers a familiar, consistent way to track limits for this endpoint, the same way they already do for the bedrock-runtime endpoint and other AWS services, and gives them clear visibility into the limits that apply to their workloads.
The bedrock-mantle endpoint supports the OpenAI Responses API, the OpenAI Chat Completions API, and the Anthropic Messages API, letting customers run existing OpenAI- or Anthropic-based applications on Amazon Bedrock with minimal code changes. AWS Service Quotas now exposes per-model input-tokens-per-minute and output-tokens-per-minute quotas for supported models on the endpoint.
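As a rough illustration of how the two per-model quotas interact, the sketch below estimates the request rate at which the first of them would be reached. The function name, the quota values, and the average token counts are all hypothetical, and the estimate assumes every request consumes the average number of tokens; actual token accounting on the endpoint may differ.

```python
def max_requests_per_minute(input_tpm, output_tpm, avg_input_tokens, avg_output_tokens):
    """Estimate the request rate at which the tighter of the two quotas is reached.

    input_tpm / output_tpm: quota values in tokens per minute (as shown in Service Quotas).
    avg_input_tokens / avg_output_tokens: assumed average tokens per request.
    """
    if avg_input_tokens <= 0 or avg_output_tokens <= 0:
        raise ValueError("average token counts must be positive")
    by_input = input_tpm / avg_input_tokens      # requests/min allowed by the input quota
    by_output = output_tpm / avg_output_tokens   # requests/min allowed by the output quota
    return min(by_input, by_output)


# Hypothetical quota values and traffic profile, for illustration only.
rate = max_requests_per_minute(
    input_tpm=200_000,
    output_tpm=20_000,
    avg_input_tokens=1_500,
    avg_output_tokens=400,
)
print(f"Estimated sustainable rate: {rate:.0f} requests per minute")
```

In this made-up profile the output-tokens-per-minute quota is the binding one, which is the kind of result that would suggest which quota to request an increase for.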
With this launch, customers can see the limits that apply to them on the bedrock-mantle endpoint and proactively plan for production scale. To get started, open the AWS Service Quotas console, choose Amazon Bedrock, and search for "Bedrock Mantle" to view your current quotas. To request an increase to any of these quotas, follow the standard Amazon Bedrock limit increase process. Service Quotas support for the bedrock-mantle endpoint is available in all AWS Regions where the endpoint is offered: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Mumbai, Tokyo, Sydney, Jakarta), Europe (Frankfurt, Ireland, London, Milan, Stockholm), and South America (São Paulo). To learn more, see [Quotas for Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/quotas-mantle.html).
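The same lookup can likely be scripted instead of done in the console. The following is a minimal sketch using the boto3 Service Quotas client; it assumes that `bedrock` is the service code for Amazon Bedrock and that the relevant quota names contain the text "Bedrock Mantle" (inferred from the console search term above, not confirmed here). The helper names are hypothetical.

```python
def filter_mantle_quotas(quotas, needle="Bedrock Mantle"):
    """Keep quota records whose name contains `needle` (case-insensitive).

    `quotas` is a list of dicts shaped like the entries under "Quotas" in a
    ListServiceQuotas response. The default needle is an assumption based on
    the console search term.
    """
    return [q for q in quotas if needle.lower() in q.get("QuotaName", "").lower()]


def list_mantle_quotas(region_name):
    """Fetch Amazon Bedrock quotas in one Region and return the matching ones."""
    import boto3  # third-party dependency, imported lazily; needs AWS credentials

    client = boto3.client("service-quotas", region_name=region_name)
    paginator = client.get_paginator("list_service_quotas")
    found = []
    # ServiceCode="bedrock" is assumed to be the code for Amazon Bedrock.
    for page in paginator.paginate(ServiceCode="bedrock"):
        found.extend(filter_mantle_quotas(page["Quotas"]))
    return found


# Example usage (requires credentials with Service Quotas read access):
# for quota in list_mantle_quotas("us-east-1"):
#     print(quota["QuotaCode"], quota["QuotaName"], quota["Value"])
```

Filtering is kept separate from the API call so the name-matching assumption can be adjusted or checked without network access.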