Public Preview: Azure API Management now supports token metrics for all token types

As organizations adopt advanced AI models, understanding token consumption is becoming more complex. Traditional metrics that cover only prompt and completion tokens often miss newer token categories such as cached, reasoning, or thinking tokens, which limits cost visibility and operational insight.

With this update, Azure API Management lets organizations log metrics for a broader range of token types to Application Insights, extending observability beyond standard token tracking. This allows teams to:

* Track usage for cached, reasoning, thinking, and other token types, in addition to the previously supported prompt, completion, and total tokens
* Collect token metrics across model providers and API formats, including OpenAI Chat Completions API, OpenAI Responses API, and Anthropic Messages API models hosted on Microsoft Foundry, OpenAI, Amazon Bedrock, Google Vertex AI, and other providers
* Build more accurate dashboards for AI consumption and cost analysis
* Improve budget monitoring and alerting across AI workloads
* Gain deeper operational visibility into evolving model behaviors and token economics

By expanding token observability, Azure API Management helps organizations better manage AI usage, optimize costs, and improve governance across modern generative AI applications.

[Learn more](https://learn.microsoft.com/azure/api-management/llm-emit-token-metric-policy).
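To make the additional token types concrete, here is a minimal Python sketch of where they typically appear in a model response's `usage` object. It is illustrative only and does not show how Azure API Management extracts or emits the metrics. The helper name `flatten_usage` is hypothetical, the sample counts are invented, and the field names follow the publicly documented OpenAI Chat Completions and Anthropic Messages response shapes as best understood here. Check them against the provider references before relying on them.

```python
from typing import Any, Mapping


def flatten_usage(usage: Mapping[str, Any]) -> dict[str, int]:
    """Flatten a provider `usage` object into {token_type: count}.

    Nested detail objects are flattened one level, using the parent key
    as a prefix. Non-integer values are ignored. (Hypothetical helper,
    not part of any SDK or of Azure API Management.)
    """
    flat: dict[str, int] = {}
    for key, value in usage.items():
        if isinstance(value, Mapping):
            for sub_key, sub_value in value.items():
                if isinstance(sub_value, int) and not isinstance(sub_value, bool):
                    flat[f"{key}.{sub_key}"] = sub_value
        elif isinstance(value, int) and not isinstance(value, bool):
            flat[key] = value
    return flat


# Assumed shape of an OpenAI Chat Completions `usage` object (sample values).
chat_usage = {
    "prompt_tokens": 1200,
    "completion_tokens": 350,
    "total_tokens": 1550,
    "prompt_tokens_details": {"cached_tokens": 1024},
    "completion_tokens_details": {"reasoning_tokens": 256},
}

# Assumed shape of an Anthropic Messages `usage` object (sample values).
anthropic_usage = {
    "input_tokens": 40,
    "output_tokens": 120,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 900,
}

chat_metrics = flatten_usage(chat_usage)
anthropic_metrics = flatten_usage(anthropic_usage)

for name, count in {**chat_metrics, **anthropic_metrics}.items():
    print(f"{name}: {count}")
```

The sketch shows why prompt and completion counts alone understate what happened in a request: cached and reasoning tokens are reported in separate fields, and the field names differ between API formats.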