Use DocAI from Google Cloud to derive value from your documents faster and more efficiently, at lower cost.
Aug. 12, 2021 | By Aditi Hegde and Durga Pravallika Tulluru
Most, if not all, business processes begin and end with documents that contain data critical to business operations. But organizations struggle to organize document data, which is usually unstructured or in a format that is not machine-readable or readily available. Processing these documents manually is tedious and error-prone, which adds to processing cost and time.
Enter: Document AI from Google Cloud. The Document AI (DocAI) platform is a unified console for document processing that can automatically classify, extract and enrich data within your documents to unlock insights. If your organization deals with a large number of similar documents on a day-to-day basis, DocAI can help you process them quickly and efficiently.
How does Google DocAI work?
Google Document AI combines computer vision, optical character recognition (OCR) and natural language processing (NLP) in pretrained models that extract information from documents. DocAI provides a variety of parsers across industries. Lending DocAI and Procurement DocAI, for example, can help organizations process high volumes of documents and reduce processing time. DocAI also has general-purpose parsers, such as the OCR and form parsers, that give the data some structure and make values easier to extract. These parsers reside in a unified dashboard, where they can be tested by uploading a document directly in the console.
Figure 1
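Beyond the console, a processor can also be called from code. The following is a minimal sketch, assuming the `google-cloud-documentai` Python client library is installed and Google Cloud credentials are configured. The project ID, location and processor ID are placeholders you would supply, and `endpoint_for` and `process_file` are illustrative names, not part of the library.

```python
def endpoint_for(location: str) -> str:
    """Return the regional Document AI host, e.g. 'us-documentai.googleapis.com'."""
    return f"{location}-documentai.googleapis.com"


def process_file(project_id: str, location: str, processor_id: str,
                 path: str, mime_type: str = "application/pdf"):
    """Send one local file to a Document AI processor and return the parsed document."""
    # Imported inside the function so this module still loads when the
    # client library is not installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(api_endpoint=endpoint_for(location))
    )
    # Full resource name of the processor created in the console.
    name = client.processor_path(project_id, location, processor_id)

    with open(path, "rb") as fh:
        raw_document = documentai.RawDocument(content=fh.read(), mime_type=mime_type)

    request = documentai.ProcessRequest(name=name, raw_document=raw_document)
    result = client.process_document(request=request)
    return result.document
```

The returned document exposes the recognized text as `document.text` and any extracted fields as `document.entities`.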
TEKsystems’ approach to leveraging Google’s DocAI platform
We can help organizations build an end-to-end document solution that takes a raw document through our pipeline and extracts high-value information by converting unstructured data into structured data. Along with Google DocAI parser integration, our custom pipeline has a user-friendly interface so that documents and the extracted information can be integrated with existing business processes.
Figure 2
TEKsystems’ process:
- Upload document – Our custom UI lets users select one or more files and upload them for document processing.
- Document identification and classification – The uploaded documents go through DocAI classifiers and a custom-built classifier to identify and classify the type of document. This step is needed because each parser can handle only its own document type. If a document of the wrong type is sent to a DocAI parser, the parser returns an error.
- Lending DocAI – Some Document AI parsers are specific to forms such as W2, 1040 and W9, so the classification model uses the same classes to identify document types.
- DocAI processing – Once the document type is determined, the corresponding processor is invoked to extract the information. When no parser is available for the document type, custom parsers can be built with AutoML and/or Cloud Vision.
- Information selection and storage – The information extracted from the document is stored in a BigQuery table using the document name as the unique identifier. The extracted text is then analyzed and verified. Business processes can then use these values in downstream processing and decision-making.
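The classify, route and store steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual pipeline: the keyword-based `classify` function stands in for the DocAI and custom classifiers, the stub parsers stand in for real DocAI processors, and a plain dictionary keyed by document name stands in for the BigQuery table. All names here are hypothetical.

```python
from typing import Callable, Dict

Fields = Dict[str, str]
Parser = Callable[[bytes], Fields]

FORM_LABELS = ("W2", "1040", "W9")


def classify(content: bytes) -> str:
    """Stand-in classifier: look for a known form name in the raw bytes."""
    text = content.decode("utf-8", errors="ignore").upper()
    for label in FORM_LABELS:
        if label in text:
            return label
    return "UNKNOWN"


def make_stub_parser(label: str) -> Parser:
    """Build a placeholder parser; a real one would call a DocAI processor."""
    def parse(content: bytes) -> Fields:
        return {"form_type": label, "size_bytes": str(len(content))}
    return parse


# One parser per document class, plus a fallback for types DocAI does not cover.
PARSERS: Dict[str, Parser] = {label: make_stub_parser(label) for label in FORM_LABELS}
custom_parser: Parser = make_stub_parser("CUSTOM")


def run_pipeline(name: str, content: bytes, store: Dict[str, Fields]) -> Fields:
    """Classify a document, route it to the matching parser, store the result."""
    doc_type = classify(content)
    parser = PARSERS.get(doc_type, custom_parser)
    fields = parser(content)
    store[name] = fields  # the document name acts as the unique identifier
    return fields
```

Routing on the classifier's label before parsing is what keeps a document from reaching a parser built for a different form.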
Figure 3 | Example of a W2 parser output
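Parser output of this kind can be flattened into a single row before it is stored and verified. The sketch below assumes each extracted entity is available as a dictionary with `type`, `mention_text` and `confidence` keys, loosely mirroring the entity fields Document AI returns. The `to_row` helper, the 0.8 threshold and the sample field names are illustrative assumptions, not values from the actual solution.

```python
from typing import Dict, Iterable, List, Tuple, Union

Entity = Dict[str, Union[str, float]]


def to_row(document_name: str, entities: Iterable[Entity],
           min_confidence: float = 0.8) -> Tuple[Dict[str, str], List[str]]:
    """Flatten entities into one row keyed by document name.

    Fields below the confidence threshold are left out of the row and
    returned separately so they can be verified by a person.
    """
    row: Dict[str, str] = {"document_name": document_name}
    needs_review: List[str] = []
    for entity in entities:
        field = str(entity["type"])
        if float(entity["confidence"]) >= min_confidence:
            row[field] = str(entity["mention_text"])
        else:
            needs_review.append(field)
    return row, needs_review


# Invented sample values, for illustration only.
sample_entities = [
    {"type": "employee_name", "mention_text": "Jane Doe", "confidence": 0.97},
    {"type": "wages", "mention_text": "50000.00", "confidence": 0.55},
]
row, needs_review = to_row("w2_sample.pdf", sample_entities)
```

Keeping low-confidence fields out of the stored row is one way to support the verification step before downstream processes rely on the values.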