AWS Glue announces AWS Glue Data Quality (Preview)
[AWS Glue](/glue/) announces the preview of AWS Glue Data Quality, a new capability that automatically measures and monitors data lake and data pipeline quality. AWS Glue is a serverless, scalable data integration service that makes it more efficient to discover, prepare, move, and integrate data from multiple sources. Managing data quality is manual and time-consuming: analysts must analyze the data by hand, write data quality rules, write code to implement those rules, validate the data against them on a recurring basis, and write more code to raise alerts when quality deteriorates.
AWS Glue Data Quality automatically analyzes your data to gather data statistics, then recommends data quality rules to get you started. You can update the recommended rules or add new rules using the provided data quality rules. You can also configure actions that alert users when data quality deteriorates. Data quality rules and actions can be configured on AWS Glue extract, transform, and load (ETL) jobs in data pipelines as well, where they can prevent “bad” data from entering data lakes and data warehouses. AWS Glue is serverless, so there is no infrastructure to manage, and AWS Glue Data Quality uses open-source Deequ to evaluate rules. AWS uses Deequ to measure and monitor data quality of petabyte-scale data lakes.
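The workflow described above (gather statistics, check them against rules, act on failures) can be illustrated with a minimal, plain-Python sketch. This is not the AWS Glue Data Quality API or Deequ. The `Rule` class, the `completeness` and `uniqueness` helpers, the thresholds, and the sample `orders` rows are all hypothetical names and values chosen only to show the idea.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[object]]


def completeness(rows: List[Row], column: str) -> float:
    """Fraction of rows whose value in `column` is not None."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(column) is not None) / len(rows)


def uniqueness(rows: List[Row], column: str) -> float:
    """Fraction of non-null values in `column` that occur exactly once."""
    values = [r.get(column) for r in rows if r.get(column) is not None]
    if not values:
        return 0.0
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


@dataclass
class Rule:
    """Hypothetical rule: a statistic must meet or exceed a threshold."""
    name: str
    metric: Callable[[List[Row]], float]
    threshold: float


def evaluate(rows: List[Row], rules: List[Rule]) -> Dict[str, bool]:
    """Return a pass/fail result per rule name."""
    return {rule.name: rule.metric(rows) >= rule.threshold for rule in rules}


def failed_rules(results: Dict[str, bool]) -> List[str]:
    """Names of rules that did not pass; an alerting action would use these."""
    return [name for name, passed in results.items() if not passed]


# Illustrative sample data: one missing email and one duplicated order_id.
orders: List[Row] = [
    {"order_id": 1, "email": "a@example.com"},
    {"order_id": 2, "email": None},
    {"order_id": 2, "email": "c@example.com"},
    {"order_id": 4, "email": "d@example.com"},
]

rules = [
    Rule("email_complete", lambda rows: completeness(rows, "email"), 0.7),
    Rule("order_id_unique", lambda rows: uniqueness(rows, "order_id"), 1.0),
]

results = evaluate(orders, rules)
failures = failed_rules(results)
if failures:
    # Stand-in for a configured alert action.
    print("Data quality rules failed:", ", ".join(failures))
```

In a pipeline, a non-empty `failures` list would be the point at which a job stops or notifies someone rather than loading the data downstream.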
AWS Glue Data Quality is available in preview in the following [AWS Regions](/about-aws/global-infrastructure/regional-product-services/): US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).
To learn more, review the [AWS Glue Data Quality documentation](https://docs.aws.amazon.com/glue/latest/dg/glue-data-quality.html) for data quality on data at rest, and the documentation for [data quality in data pipelines](https://docs.aws.amazon.com/glue/latest/ug/gs-data-quality-chapter.html).