AWS Glue FindMatches now provides match scores

The FindMatches ML transform in AWS Glue now includes an option to output match scores, which indicate how closely the records in each group match one another. FindMatches identifies duplicate or matching records in your dataset, even when the records share no common unique identifier and no fields match exactly. By finding partially matching records automatically, it helps automate complex data cleaning and deduplication tasks for use cases including linking customer records, deduplicating product catalogs, and fraud detection.

Use match scoring to understand your FindMatches models, to decide whether they are trained to your satisfaction, and to decide which records to merge.

This feature is available in the same [AWS Regions](/about-aws/global-infrastructure/regional-product-services/) as AWS Glue. To learn more, visit our [documentation](https://docs.aws.amazon.com/glue/latest/dg/match-scoring.html) and read the FindMatches [blog post](https://aws.amazon.com/blogs/big-data/integrate-and-deduplicate-datasets-using-aws-lake-formation-findmatches/).
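As a rough illustration of using match scores to decide which records to merge, the following Python sketch groups already-exported FindMatches output rows and keeps only the groups whose score clears a threshold. It does not call AWS Glue. The column names `match_id` and `match_confidence_score`, the helper `groups_to_merge`, the 0.9 threshold, and the sample rows are all assumptions made for this example; check the linked documentation for the actual output schema.

```python
from collections import defaultdict


def groups_to_merge(rows, threshold=0.9):
    """Return {match_id: [rows]} for groups with 2+ records and a score >= threshold.

    Assumes each row is a dict with hypothetical keys 'match_id' and
    'match_confidence_score' (names chosen for illustration only).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["match_id"]].append(row)

    selected = {}
    for match_id, members in groups.items():
        # Be conservative: use the lowest score seen within the group.
        score = min(m["match_confidence_score"] for m in members)
        if len(members) > 1 and score >= threshold:
            selected[match_id] = members
    return selected


# Invented sample data, not real FindMatches output.
rows = [
    {"id": 1, "name": "Jane Doe", "match_id": 10, "match_confidence_score": 0.97},
    {"id": 2, "name": "Jane M. Doe", "match_id": 10, "match_confidence_score": 0.97},
    {"id": 3, "name": "Jon Smith", "match_id": 11, "match_confidence_score": 0.62},
    {"id": 4, "name": "John Smyth", "match_id": 11, "match_confidence_score": 0.62},
    {"id": 5, "name": "Ana Li", "match_id": 12, "match_confidence_score": 1.0},
]

merge = groups_to_merge(rows, threshold=0.9)
print(sorted(merge))  # match_ids of the groups selected for merging
```

Groups that fall below the threshold are not discarded here; in practice they could be routed to manual review or used as a signal that the transform needs more labeled training data.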