Wednesday 6 December 2023

Extract Tabular Data from PDF using AI Builder Form Processing Power Apps | Extract Field, Tabular Data

 


AI Builder plays an important role in bringing AI-enabled processes into real business scenarios. In this post I will explain how to extract field information and tabular data from PDF forms into Power Apps using the AI Builder Form Processing model.



Let us start. First, open Power Apps and navigate to AI Builder, then the Build option. Choose the Form Processing tile and give the model a name, let's say “Extract Invoice Lines”.

Once the name is given, you can see notes on what you will need. For form processing you need five or more documents with the same layout, which are used for the AI training.

I have sample PDF invoices that I can use to pull fields and some tabular data. Once we name the AI model, the model interface opens, where we supply more information.

First, we have to define which fields and tables we want to extract. Define the items below as I have done.

I have declared Invoice Date, Invoice Number, Invoice To, and Invoice Total Amount as fields. I have also declared a table for the invoice items. Keep repeating the process below until all fields are added.

Next, choose the Single Page Table option and click Next. Then give the table a name; I have named it “Line Items”.

Important Note: The Single Page Table option was in preview at the time of writing this blog, and the Multipage Table option was experimental.

Once the table is named, you can design it by adding the columns you want to extract from the form.

Finally, I have added all the fields and tables, as shown in the screenshot below.
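To keep track of what has been defined so far, the model definition can be written down as a small data structure. This is only a sketch for note-keeping, not something AI Builder consumes: the class names are made up, and the four table column names are assumptions, since the post does not list the actual columns.

```python
from dataclasses import dataclass, field


@dataclass
class TableDefinition:
    """One table to extract, with the columns added in the table designer."""
    name: str
    columns: list[str]


@dataclass
class ModelDefinition:
    """Everything the form processing model is asked to extract."""
    name: str
    fields: list[str]
    tables: list[TableDefinition] = field(default_factory=list)


model = ModelDefinition(
    name="Extract Invoice Lines",
    fields=["Invoice Date", "Invoice Number", "Invoice To", "Invoice Total Amount"],
    tables=[
        TableDefinition(
            name="Line Items",
            # Hypothetical column names; use whatever your invoices contain.
            columns=["Description", "Quantity", "Unit Price", "Amount"],
        )
    ],
)

for table in model.tables:
    print(f"{table.name}: {', '.join(table.columns)}")
```

Writing the list out like this makes it easier to confirm later that every item has been tagged in every document.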

Now click Next, add a collection, and upload the documents.

Once all files are uploaded, click Analyze. It takes some time to analyze the documents and then opens the tagging panel, where we can tag each field and table we defined to specific parts of each document for AI training.

Now click each PDF in the right panel, mark the specific data items, and tag them to the corresponding fields as shown in the image below. You have to map all fields and tables for each PDF.

Repeat these steps to map every field. To map a table, select the top-left corner of the item table in the PDF, drag to the end of the table, and tag the selection as Line Items.

After tagging, you can review the table by clicking the preview table option.

Repeat the same tagging for each document so that every field and table is tagged. Once all tagging is complete, click Next. Make sure there are no warnings or red circles.
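The rule behind the warnings is simple: every document needs every field and table tagged, and there must be at least five documents. A minimal sketch of that check, assuming you keep your own record of what you have tagged (the file names and the dictionary layout here are hypothetical, and AI Builder performs this check itself in the tagging screen):

```python
# Items defined earlier in the model: four fields plus the Line Items table.
REQUIRED_TAGS = {
    "Invoice Date",
    "Invoice Number",
    "Invoice To",
    "Invoice Total Amount",
    "Line Items",
}
MIN_DOCUMENTS = 5  # five or more documents of the same layout are needed


def missing_tags(tagged: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return, per document, the required tags that have not been applied yet."""
    gaps = {}
    for document, tags in tagged.items():
        missing = sorted(REQUIRED_TAGS - tags)
        if missing:
            gaps[document] = missing
    return gaps


def ready_to_train(tagged: dict[str, set[str]]) -> bool:
    """True when there are enough documents and none has an untagged item."""
    return len(tagged) >= MIN_DOCUMENTS and not missing_tags(tagged)


# Example: five invoices (made-up file names), one still missing a tag.
tagged_documents = {f"invoice_{i}.pdf": set(REQUIRED_TAGS) for i in range(1, 6)}
tagged_documents["invoice_3.pdf"].discard("Invoice To")

print(missing_tags(tagged_documents))
print(ready_to_train(tagged_documents))
```

Here the third invoice is reported as missing the Invoice To tag, so the set is not yet ready for training.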

Now click Train. Training takes a few minutes, and the system notifies you when it finishes. On the Model Details screen you can see the status of the AI model.

After the status changes to Trained, we can use the AI model in a canvas app or in Power Automate.

DO NOT FORGET TO PUBLISH THE MODEL, OTHERWISE YOU CANNOT USE IT IN POWER APPS OR POWER AUTOMATE.

After publishing, you can quickly test the model using the Quick Test option, or you can use it directly in a canvas app or in Power Automate. Let us use it in Power Automate. Open Power Automate and create an instant flow with a manual trigger that takes a file parameter, so that the file can be used as input for the form processing model we have built. Then add a new action called Extract information from forms. I want to insert all the line item table data returned by the AI Builder step into an Excel table stored in OneDrive for Business.

I have an Excel sheet in OneDrive with 4 columns.

In Power Automate I used the manual button trigger and added a file-type input parameter. I then called the Extract information from forms step with my model selected as the AI model, the form type set to PDF, and the form set to the file input parameter from the trigger step. In the last step I added a loop over the extracted line items and, inside it, the “Add a row into a table” action of the OneDrive Excel connector.
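The logic of that last loop can be sketched outside the flow designer. The sketch below is an illustration only: the shape of the `extraction` dictionary is a hypothetical stand-in and not the real output schema of the AI Builder action, the sample values are made up, and the column names are assumptions because the post does not name the four Excel columns.

```python
# Hypothetical mapping: model table column -> Excel table column.
COLUMN_MAP = {
    "Description": "Description",
    "Quantity": "Quantity",
    "Unit Price": "Unit Price",
    "Amount": "Amount",
}


def rows_for_excel(extraction: dict, table_name: str, column_map: dict[str, str]) -> list[dict[str, str]]:
    """Turn each extracted table entry into one row keyed by Excel column name.

    Missing cells become empty strings so that every row has all columns.
    """
    rows = []
    for entry in extraction.get("tables", {}).get(table_name, []):
        rows.append(
            {excel_col: str(entry.get(model_col, "")) for model_col, excel_col in column_map.items()}
        )
    return rows


# Stand-in for what the extraction step might return (made-up sample data).
extraction = {
    "fields": {"Invoice Number": "INV-001"},
    "tables": {
        "Line Items": [
            {"Description": "Widget", "Quantity": "2", "Unit Price": "10.00", "Amount": "20.00"},
            {"Description": "Gadget", "Quantity": "1", "Unit Price": "5.50"},  # Amount not extracted
        ]
    },
}

# Each row corresponds to one "Add a row into a table" call inside the loop.
for row in rows_for_excel(extraction, "Line Items", COLUMN_MAP):
    print(row)
```

In the actual flow, the loop iterates over the table entries returned by the AI Builder step, and each column of the Excel action is filled from the matching dynamic content value.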

Now when I test this it works as expected.

The selected file is analyzed by AI Builder, and the data is inserted into the Excel sheet.

The PDF chosen is given below.

And below is the flow run that filled the Excel sheet.

Hope this helps.
