Amazon EMR on EC2 Adds Apache Spark native FGAC and AWS Glue Data Catalog Views Support

Amazon EMR on EC2 announces two significant enhancements for governance: Apache Spark native fine-grained access control (FGAC) via AWS Lake Formation, and support for AWS Glue Data Catalog views. These features allow organizations to improve data security, simplify access management, and enhance data sharing capabilities across their analytics environments.

The Apache Spark native FGAC implementation allows customers to define granular access policies once in AWS Lake Formation and apply them consistently across EMR clusters. This reduces security risks and administrative overhead while providing a unified approach to data governance. Customers can now use familiar Lake Formation grant and revoke statements to manage access controls for their Spark jobs and interactive sessions on EMR on EC2, similar to how this works for other AWS analytics services.

AWS Glue Data Catalog views enable customers to create, manage, and query multi-engine SQL views across AWS Regions, accounts, and organizations. Administrators can create views from Spark jobs that can be queried from multiple engines, while controlling data access through Lake Formation permissions. These permissions include named resource grants, data filters, and tags, and all access requests are automatically logged in AWS CloudTrail for auditing.

Apache Spark native FGAC and Glue Data Catalog views are available with Amazon EMR release 7.10 in all AWS Regions where EMR on EC2 is available. To learn more, visit [Using AWS Lake Formation with Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-lake-formation.html) and [Working with AWS Glue Data Catalog Views](https://docs.aws.amazon.com/emr/latest/ManagementGuide/SECTION-jobs-glue-data-catalog-views-ec2.html) in the Amazon EMR documentation.
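
To make the "define once in Lake Formation" idea concrete, here is a minimal sketch of a column-level `SELECT` grant issued through the Lake Formation `GrantPermissions` API with boto3. It is an illustration under stated assumptions, not the procedure from the EMR documentation: the helper names `build_column_grant` and `grant_column_select`, the role ARN, and the database, table, and column names are all hypothetical, and the sketch does not cover the cluster-side setup that EMR requires to enforce the policy.

```python
from typing import Any, Dict, List


def build_column_grant(
    principal_arn: str, database: str, table: str, columns: List[str]
) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant.

    The dictionary follows the request shape of Lake Formation's
    GrantPermissions API (Principal, Resource, Permissions).
    """
    if not columns:
        raise ValueError("at least one column is required for a column-level grant")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def grant_column_select(
    principal_arn: str, database: str, table: str, columns: List[str], region: str
) -> None:
    """Send the grant to Lake Formation (requires boto3 and AWS credentials)."""
    # Imported lazily so the request builder above can be used and tested
    # on a machine without the AWS SDK installed.
    import boto3

    client = boto3.client("lakeformation", region_name=region)
    client.grant_permissions(
        **build_column_grant(principal_arn, database, table, columns)
    )


if __name__ == "__main__":
    # Hypothetical principal and table names, for illustration only.
    request = build_column_grant(
        "arn:aws:iam::111122223333:role/analyst",
        "sales_db",
        "orders",
        ["order_id", "order_date"],
    )
    print(request)
```

Separating the request builder from the API call keeps the policy definition inspectable and testable offline; the same request shape could be passed to `revoke_permissions` to withdraw the access.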