AWS Glue 4.0 now supports Streaming ETL
[AWS Glue](/glue/) now supports streaming ETL in version 4.0, a new version of AWS Glue that accelerates data integration workloads in AWS. AWS Glue 4.0 upgrades the data integration engines, including upgrades to [Apache Spark 3.3.0](https://spark.apache.org/releases/spark-release-3-3-0.html) and [Python 3.10](https://docs.python.org/3/whatsnew/3.10.html).
AWS Glue streaming ETL jobs continuously consume data from streaming sources, clean and transform the data in-flight, and make it available for analysis in seconds. This release includes an optimized [state-management store](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#state-store) to build efficient streaming solutions across micro-batches. This makes it easier to remove duplicates in a stream and to perform stream-based aggregations. You can also add a new column that indicates when a corresponding record was received by the stream for better data observability. This version also supports IAM authentication for Amazon Managed Streaming for Apache Kafka Serverless.
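To illustrate what state kept across micro-batches makes possible, here is a minimal plain-Python sketch of deduplication, a running aggregation, and a received-time column. It is a conceptual stand-in, not AWS Glue or Spark code: the names `StreamState`, `process_micro_batch`, `event_id`, `device`, and `received_at` are hypothetical. In Spark Structured Streaming, the comparable operations are typically expressed with `dropDuplicates` and grouped aggregations backed by the state store.

```python
from collections import defaultdict
from datetime import datetime, timezone


class StreamState:
    """Hypothetical stand-in for a state store that survives across micro-batches."""

    def __init__(self):
        self.seen_ids = set()           # keys already emitted, used for deduplication
        self.counts = defaultdict(int)  # running per-device record counts


def process_micro_batch(batch, state, now=None):
    """Drop duplicates, update the running counts, and stamp each new record.

    `batch` is a list of dicts with the assumed fields `event_id` and `device`.
    `now` is injectable so the timestamp is deterministic in tests.
    """
    received_at = (now or datetime.now(timezone.utc)).isoformat()
    output = []
    for record in batch:
        if record["event_id"] in state.seen_ids:
            continue  # duplicate within this batch or from an earlier one
        state.seen_ids.add(record["event_id"])
        state.counts[record["device"]] += 1
        # Add a column recording when the record was received.
        output.append({**record, "received_at": received_at})
    return output


if __name__ == "__main__":
    state = StreamState()
    first = [{"event_id": 1, "device": "a"}, {"event_id": 1, "device": "a"}]
    second = [{"event_id": 1, "device": "a"}, {"event_id": 2, "device": "b"}]
    for batch in (first, second):
        for row in process_micro_batch(batch, state):
            print(row)
    print(dict(state.counts))
```

This sketch lets the set of seen keys grow without limit. A production streaming engine usually bounds that state, for example by expiring keys older than a watermark.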
AWS Glue 4.0 streaming ETL is now available in the same [AWS Regions](/about-aws/global-infrastructure/regional-product-services/) as AWS Glue, except for the China and GovCloud Regions.
To learn more, read about Streaming ETL jobs in our [documentation](https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html).