AWS Glue crawlers enhance support for Delta Lake tables
[AWS Glue crawlers](https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html) now have enhanced support for Linux Foundation [Delta Lake](https://delta.io/) tables, making it more efficient to extract insights with analytics services such as [Amazon Athena](/athena/), [Amazon EMR](/emr/), and [AWS Glue](/glue/). This feature enables analytics services to scan Delta Lake tables directly, without requiring Glue crawlers to create manifest files. Newly cataloged data quickly becomes available for analysis with your preferred analytics and machine learning (ML) tools.
Previously, Glue crawlers supported Delta Lake tables by creating manifest files in [Amazon S3](/pm/serv-s3/) for different analytics services to consume. To include newer transactions from the original Delta Lake tables, the crawlers had to regenerate these manifest files periodically, which resulted in longer processing times.
With today’s launch, you can create and schedule a Glue crawler with the option to create native Delta Lake tables, then provide the Amazon S3 path where the Delta Lake tables are located. On each run, the crawler inspects the Delta Lake tables and catalogs their schema and partition information, including changes such as updates or deletes, in the Glue Data Catalog.
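As a minimal sketch of the setup described above, the following Python builds a request for the Glue `CreateCrawler` API using a Delta Lake target with native table creation turned on and manifest generation turned off. The helper names `build_delta_crawler_request` and `create_delta_crawler`, along with the crawler name, role ARN, database, S3 path, and schedule in the usage section, are hypothetical placeholders and not values from this announcement. It assumes `boto3` is installed and credentials are configured if you actually send the request.

```python
from typing import Any, Dict, List, Optional


def build_delta_crawler_request(
    name: str,
    role_arn: str,
    database: str,
    delta_table_paths: List[str],
    schedule: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the Glue CreateCrawler API (hypothetical helper)."""
    request: Dict[str, Any] = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {
            "DeltaTargets": [
                {
                    # S3 paths where the Delta Lake tables are located.
                    "DeltaTables": list(delta_table_paths),
                    # Do not generate manifest files.
                    "WriteManifest": False,
                    # Catalog the tables as native Delta Lake tables.
                    "CreateNativeDeltaTable": True,
                }
            ]
        },
    }
    if schedule is not None:
        # Glue schedules use cron syntax, e.g. "cron(0 12 * * ? *)".
        request["Schedule"] = schedule
    return request


def create_delta_crawler(request: Dict[str, Any]) -> None:
    """Send the request to AWS Glue. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    example = build_delta_crawler_request(
        name="example-delta-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
        database="example_db",
        delta_table_paths=["s3://example-bucket/delta/sales/"],
        schedule="cron(0 12 * * ? *)",
    )
    print(example)
    # create_delta_crawler(example)  # uncomment to call AWS
```

Keeping the request builder separate from the AWS call makes it possible to inspect or unit test the configuration before any crawler is created.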
AWS Glue crawler support for native Delta Lake tables is available in all commercial regions where AWS Glue is available; see the [AWS Region Table](/about-aws/global-infrastructure/regional-product-services/) for details. Enhanced Delta Lake support is available in Athena engine version 3.0 and Glue version 3.0 or later. To learn more, read the [blog](https://aws.amazon.com/blogs/big-data/introducing-native-delta-lake-table-support-with-aws-glue-crawlers/) and visit the AWS Glue crawler [documentation](https://docs.aws.amazon.com/glue/latest/dg/crawler-data-stores.html).