Public Preview: Azure OpenAI Semantic Caching policy in Azure API Management
We're excited to announce the Public Preview for the Azure OpenAI Semantic Caching policy in Azure API Management! This innovative feature empowers customers to optimize token usage by leveraging semantic caching, which intelligently stores completions for prompts with similar meanings.
With this policy, customers can easily configure semantic caching for their Azure OpenAI endpoints. The caching mechanism uses Azure Cache for Redis Enterprise or any other external cache that has been onboarded to API Management, providing flexibility in caching solutions.
By leveraging an Azure OpenAI embeddings model to calculate vectors for prompts, the semantic caching policy identifies semantically similar prompts and stores their respective completions in the cache. This allows completions to be reused efficiently, reducing token consumption and improving overall performance.
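For illustration, the sketch below shows how the lookup/store policy pair might be wired into an API's policy definition. It assumes an Azure OpenAI embeddings deployment has already been registered as a backend in API Management; the backend ID `embeddings-backend` and the threshold and duration values are placeholders to tune per workload:

```xml
<policies>
    <inbound>
        <base />
        <!-- Before forwarding the request, look for a semantically similar
             prompt in the external cache. score-threshold controls how close
             the prompt's embedding vector must be to a cached one for the
             stored completion to be returned. -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.05"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned" />
    </inbound>
    <outbound>
        <!-- On a cache miss, store the completion returned by Azure OpenAI
             for 60 seconds so that similar prompts can reuse it. -->
        <azure-openai-semantic-cache-store duration="60" />
        <base />
    </outbound>
</policies>
```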
Customers can configure semantic caching centrally for multiple API consumers, streamlining management and ensuring consistent caching behavior across their API ecosystem. This helps customers get the most out of caching, optimize token usage, and improve the scalability and efficiency of their Azure OpenAI integration.
[Learn more](https://aka.ms/apim/openai/semantic-caching) about the Azure OpenAI Semantic Caching policy.