Amazon Comprehend announces support for classification and entity extraction directly from a variety of document formats
Amazon Comprehend announced single-step APIs that customers can now use to classify and extract entities of interest from PDF documents, Microsoft Word files, and images.
Amazon Comprehend is a Natural Language Processing (NLP) service that provides pre-trained and custom APIs to derive insights from textual data. The new capability simplifies document processing by adding support for common document types, such as PDF documents, Microsoft Word files, and images, to the Amazon Comprehend Custom Classification and Custom Entity Recognition APIs. Previously, to process such documents, customers had to pre-process and flatten them into machine-readable text, which can degrade the document's context. Now, with a single API call, customers can process both scanned and digital semi-structured documents (like PDFs, Microsoft Word documents, and images) in their native format, as well as plain-text documents, eliminating pre-processing overhead. The new capability applies to both batch processing and real-time use cases.
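As a rough illustration of the single-call flow described above, the following Python sketch sends a native-format file to a custom classification endpoint through the AWS SDK for Python (boto3). The endpoint ARN, the helper names `build_classify_request` and `classify_file`, and the chosen reader settings are assumptions made for this example, not values from the announcement; check the documentation for the options that suit your documents.

```python
from pathlib import Path
from typing import Any, Dict

# Hypothetical placeholder: substitute the ARN of your own custom classifier endpoint.
ENDPOINT_ARN = (
    "arn:aws:comprehend:us-east-1:111122223333:"
    "document-classifier-endpoint/example"
)


def build_classify_request(document: bytes, endpoint_arn: str) -> Dict[str, Any]:
    """Assemble keyword arguments for a ClassifyDocument call on a native-format file."""
    if not document:
        raise ValueError("document bytes must not be empty")
    return {
        "EndpointArn": endpoint_arn,
        # Raw bytes of the PDF, Word file, or image; no text flattening beforehand.
        "Bytes": document,
        # Assumed reader settings: let the service decide when text extraction is needed.
        "DocumentReaderConfig": {
            "DocumentReadAction": "TEXTRACT_DETECT_DOCUMENT_TEXT",
            "DocumentReadMode": "SERVICE_DEFAULT",
        },
    }


def classify_file(path: str, endpoint_arn: str = ENDPOINT_ARN) -> Dict[str, Any]:
    """Send one document to a custom classification endpoint in real time."""
    import boto3  # imported lazily so the request builder works without the SDK

    client = boto3.client("comprehend")
    request = build_classify_request(Path(path).read_bytes(), endpoint_arn)
    return client.classify_document(**request)
```

Calling `classify_file("invoice.pdf")` (a hypothetical file name) would return the service response for that document. The same `Bytes` and `DocumentReaderConfig` parameters should also be accepted by `detect_entities` when it is given the endpoint of a custom entity recognizer.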
Customers can process documents in English for contextual entity recognition, and in German (de), English (en), Spanish (es), French (fr), Italian (it), and Portuguese (pt) for document classification. These capabilities are available in all AWS Regions where Amazon Comprehend is available. To learn more and get started, visit the [Amazon Comprehend Intelligent Document Processing page](/comprehend/idp/), the [AWS News Blog](http://aws.amazon.com/blogs/aws/now-process-pdfs-word-documents-and-images-with-amazon-comprehend-for-idp), and our [documentation](https://docs.aws.amazon.com/comprehend/latest/dg/idp.html).