Amazon SageMaker Data Wrangler now supports over 40 third-party applications as data sources
Today, AWS announces the general availability of Amazon SageMaker Data Wrangler support for over 40 third-party applications as data sources for machine learning (ML) through integration with Amazon AppFlow. [Amazon SageMaker Data Wrangler](/sagemaker/data-wrangler/) reduces the time it takes to aggregate and prepare data for ML from weeks to minutes. Preparing high-quality data for ML is often complex and time-consuming because it requires aggregating data across various sources and formats using different tools. With SageMaker Data Wrangler, you can explore and import data from a variety of popular sources, such as Amazon S3, Amazon Athena, Amazon Redshift, Snowflake, Databricks, and Salesforce Customer Data Platform. Starting today, we are making it easier for customers to aggregate data for ML from over 40 third-party application data sources, including Salesforce Marketing, SAP, Google Analytics, LinkedIn, and more, via Amazon AppFlow.
[Amazon AppFlow](/appflow/) is a fully managed service that enables customers to securely transfer data from third-party applications to AWS services such as Amazon S3, and catalog the data in the AWS Glue Data Catalog in just a few clicks. Once the data sources are set up in AppFlow, you can browse tables and schemas from these data sources using the Data Wrangler SQL explorer. You can write Athena queries to preview the data and confirm that it is relevant for your use case, then import it to prepare for ML model training. You can also join data from multiple sources after import to create the right dataset for ML. Once the data is imported, you can quickly assess data quality, clean the data, and create features with 300+ built-in analyses and data transformations. You can also train and deploy models with SageMaker Autopilot, and operationalize the data preparation process in a feature engineering, training, or deployment pipeline using the integration with SageMaker Pipelines from Data Wrangler.
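For readers who want to try the same kind of preview outside the Data Wrangler UI, the following is a minimal sketch of running an Athena preview query with boto3 against a table that AppFlow has cataloged in the AWS Glue Data Catalog. It is not how Data Wrangler runs queries internally. The helper names `build_preview_query` and `preview_table`, and any database, table, or S3 output location you pass in, are hypothetical and chosen only for illustration. The sketch assumes AWS credentials and a default Region are already configured.

```python
import time


def build_preview_query(database: str, table: str, limit: int = 10) -> str:
    """Return an Athena SQL statement that previews the first rows of a table.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Double-quote identifiers so names with unusual characters stay valid.
    return f'SELECT * FROM "{database}"."{table}" LIMIT {limit}'


def preview_table(database: str, table: str, output_location: str, limit: int = 10):
    """Run the preview query in Athena and return the result rows as lists.

    Assumption: `output_location` is an S3 URI you own where Athena may
    write query results.
    """
    import boto3  # imported lazily so the query builder works without the SDK

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(
        QueryString=build_preview_query(database, table, limit),
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    # For SELECT queries, the first returned row usually holds column headers.
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    return [[col.get("VarCharValue") for col in row["Data"]] for row in rows]
```

Calling `preview_table` with your own Glue database name, table name, and S3 results location would return a small sample that you could inspect before deciding whether to import the table.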
Data Wrangler supports 40+ third-party data sources in [all the Regions currently supported by AppFlow](https://docs.aws.amazon.com/general/latest/gr/appflow.html). This feature is available at no additional charge beyond the standard Data Wrangler and AppFlow costs.
To get started, see the following resources:
* New — [Amazon SageMaker Data Wrangler supports SaaS applications as data sources](http://aws.amazon.com/blogs/aws/new-amazon-sagemaker-data-wrangler-supports-saas-applications-as-data-sources/)
* [Import data from third-party applications](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-import.html#data-wrangler-import-saas) in the AWS technical documentation