AWS Glue Data Catalog offers advanced automatic optimization for Apache Iceberg tables
[AWS Glue Data Catalog](https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html) now offers advanced automatic optimization for Apache Iceberg tables. The update adds compaction of delete files, support for nested data types, partial progress commits, and support for partition evolution, making it easier to keep transactional data lakes consistently performant. These features address a challenge customers face when streaming data is continuously ingested into Apache Iceberg tables: the ingestion produces a large number of delete files that track changes in data files.
With this new capability, Glue Data Catalog continuously monitors table partitions for positional and equality delete files, initiates the compaction process, and regularly commits partial progress to reduce conflicts. Glue Data Catalog optimizers now support schema evolution, such as reordering or renaming columns, as well as partition spec evolution. In addition, Glue Data Catalog has expanded support for heavily nested complex data and for the Parquet compression codecs zstd, brotli, lz4, gzip, and snappy. Enabling automatic compaction reduces delete files and metadata overhead on your Iceberg tables and improves query performance. These new features are automatically applied to existing and new Glue Data Catalog optimizers.
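To make the effect of delete-file compaction concrete, here is a minimal conceptual sketch in Python. It is not how Glue Data Catalog or Iceberg implements compaction; it assumes a toy model in which a data file is a list of rows, a positional delete is a row index, and an equality delete is a set of column values to match. Real Iceberg equality deletes also take sequence numbers into account, which this sketch ignores. The function and variable names are hypothetical.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def compact(rows: List[Row],
            positional_deletes: Set[int],
            equality_deletes: List[Row]) -> List[Row]:
    """Rewrite a toy data file with all deletes applied.

    After this step a reader no longer has to merge delete files
    at query time, which is the benefit compaction aims for.
    """
    def matches(row: Row, predicate: Row) -> bool:
        # An equality delete matches when every listed column has the given value.
        return all(row.get(col) == val for col, val in predicate.items())

    return [
        row
        for pos, row in enumerate(rows)
        if pos not in positional_deletes
        and not any(matches(row, pred) for pred in equality_deletes)
    ]


# Toy data file with four rows.
rows = [
    {"id": 1, "v": "a"},
    {"id": 2, "v": "b"},
    {"id": 3, "v": "c"},
    {"id": 4, "v": "d"},
]

# Delete the row at position 1 and any row whose id equals 4.
compacted = compact(rows, positional_deletes={1}, equality_deletes=[{"id": 4}])
print(compacted)
```

In this toy model the compacted file holds only the rows with `id` 1 and 3, and both delete sets can be discarded.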
In addition to the AWS console, customers can use the AWS CLI or AWS SDKs to automate optimization for Apache Iceberg tables. The feature is available in 14 AWS Regions: US East (N. Virginia, Ohio), US West (Oregon), Europe (Ireland, London, Frankfurt, Stockholm), Canada (Central), Asia Pacific (Tokyo, Seoul, Mumbai, Singapore, Sydney), and South America (São Paulo). To learn more, read the [blog](https://aws.amazon.com/blogs/big-data/accelerate-queries-on-apache-iceberg-tables-through-aws-glue-auto-compaction/) and visit the AWS Glue Data Catalog [documentation](https://docs.aws.amazon.com/glue/latest/dg/table-optimizers.html).
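As an illustration of the SDK route, the following sketch uses the AWS SDK for Python (boto3) and the Glue `create_table_optimizer` operation to enable a compaction optimizer on one table. The account ID, database name, table name, role ARN, and Region are placeholder assumptions, and the IAM role is assumed to already grant the permissions the optimizer needs. Treat it as a starting point rather than a complete setup.

```python
from typing import Any, Dict


def build_compaction_request(catalog_id: str, database: str,
                             table: str, role_arn: str) -> Dict[str, Any]:
    """Build the parameters for Glue's CreateTableOptimizer call."""
    return {
        "CatalogId": catalog_id,       # AWS account ID that owns the catalog
        "DatabaseName": database,
        "TableName": table,
        "Type": "compaction",
        "TableOptimizerConfiguration": {
            "roleArn": role_arn,       # role the optimizer assumes
            "enabled": True,
        },
    }


def enable_compaction(request: Dict[str, Any],
                      region: str = "us-east-1") -> Dict[str, Any]:
    """Send the request to AWS Glue. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the request can be built without boto3

    glue = boto3.client("glue", region_name=region)
    return glue.create_table_optimizer(**request)


# Placeholder values (assumptions for illustration only).
request = build_compaction_request(
    catalog_id="123456789012",
    database="sales_db",
    table="orders",
    role_arn="arn:aws:iam::123456789012:role/GlueOptimizerRole",
)

# Uncomment to call AWS with configured credentials:
# enable_compaction(request, region="us-east-1")
```

Separating the request builder from the call keeps the parameters easy to inspect or loop over for many tables before anything is sent to AWS.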