Amazon Bedrock AgentCore Evaluations is now generally available
Amazon Bedrock AgentCore Evaluations is now generally available, providing automated quality assessment for AI agents. Evaluations enables developers to monitor agent quality through continuous evaluation of production traffic, validate changes through testing workflows, and measure agent performance against defined expectations.
AgentCore Evaluations offers two evaluation types. Online evaluation continuously monitors agent performance in production by sampling and scoring live traces. On-demand evaluation enables teams to test agents programmatically, supporting regression testing in CI/CD pipelines and interactive development workflows.
Teams can evaluate agents using 13 built-in evaluators for response quality, safety, task completion, and tool usage. Developers can also use Ground Truth to measure agent performance against expectations, including reference answers for response validation, behavioral assertions for session-level goals, and expected tool execution sequences. For domain-specific requirements, teams can configure custom evaluators using their choice of prompts and model for LLM-based evaluation, or implement custom logic in Python or JavaScript through Lambda-hosted functions for code-based evaluation. Evaluations integrates with AgentCore Observability for unified monitoring and real-time alerts.
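To make the code-based option more concrete, here is a minimal sketch of what a Lambda-hosted Python evaluator for expected tool execution sequences might look like. The event fields `expected_tools` and `actual_tools`, the helper `score_tool_sequence`, and the returned `score`/`passed` keys are all hypothetical names chosen for illustration; they are not the documented AgentCore Evaluations payload, so check the documentation for the actual request and response schema.

```python
from typing import Any


def score_tool_sequence(expected: list[str], actual: list[str]) -> float:
    """Return the fraction of expected tool calls found, in order, in `actual`.

    Extra tool calls in `actual` are tolerated; only the relative order of
    the expected calls matters.
    """
    if not expected:
        return 1.0  # nothing was expected, so nothing can be missing
    matched = 0
    for tool in actual:
        if matched < len(expected) and tool == expected[matched]:
            matched += 1
    return matched / len(expected)


def lambda_handler(event: dict[str, Any], context: Any) -> dict[str, Any]:
    # ASSUMPTION: the event carries two lists of tool names under these keys.
    # The real AgentCore Evaluations event shape may differ.
    expected = event.get("expected_tools", [])
    actual = event.get("actual_tools", [])
    score = score_tool_sequence(expected, actual)
    # ASSUMPTION: a flat dict with a numeric score is an acceptable result.
    return {"score": score, "passed": score == 1.0}
```

Under these assumptions, an agent that calls `search`, then `lookup`, then `book` fully satisfies an expected sequence of `search` followed by `book`, while an agent that calls the same tools in the reverse order receives only partial credit.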
AgentCore Evaluations is available in [nine AWS Regions](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agentcore-regions.html): US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland).
Learn more about Amazon Bedrock AgentCore Evaluations through the [documentation](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/evaluations.html), and get started with the [AgentCore Starter Toolkit](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agentcore-get-started-toolkit.html).