Read Text from a specific area of a PDF with Workflows

By Vishnu Subramoniam | Automation

Many document processes are now automated. Do you want to copy text from a specific area of a PDF and use it at a later stage while processing that PDF? PDF4me Workflows actions cover this kind of document logic.

Workflows focus on automating your document processes. The Read Text action can read and copy text from a specific area of a PDF document. You define this area in pixels, so it can refer to any area inside the PDF.

How to Read Text from a PDF?

Let us walk through a sample workflow that splits a large PDF file and names each output file with a specific text read from it.

Start by launching the PDF4me Dashboard.

  • Select the Create Workflow button.
Create PDF4me workflow

Add a trigger to start your Workflow

Add a trigger to kick-start your automation. Workflows currently provide two triggers: Dropbox and Google Drive. For this example, let us create a Dropbox trigger.

Configure the connection and choose the folder where the input files are expected.

Configure Dropbox Trigger

Add a Split action

Add and configure a Split PDF action to separate the pages of the file as required. Here we use the Split Recurring action, which starts a new output file after every fixed number of pages.

Split action split PDF pages recurringly
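
To make the recurring split concrete, here is a minimal Python sketch of the page ranges such a split would produce. It only illustrates the idea; the function name `recurring_split_ranges` and its parameters are hypothetical and are not part of PDF4me.

```python
from __future__ import annotations


def recurring_split_ranges(total_pages: int, pages_per_file: int) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (first_page, last_page) ranges, one per output file."""
    if total_pages < 0 or pages_per_file < 1:
        raise ValueError("total_pages must be >= 0 and pages_per_file must be >= 1")
    return [
        # The last range is clipped so it never runs past the end of the document.
        (start, min(start + pages_per_file - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_file)
    ]


# A 10-page PDF split after every 3 pages gives four output files.
print(recurring_split_ranges(10, 3))  # [(1, 3), (4, 6), (7, 9), (10, 10)]
```

Each range corresponds to one document that the following steps of the workflow handle individually.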

Add a For Each Document Control

Since the Split Recurring action generates multiple documents, a For Each Document control is necessary to handle the output files one by one. The rest of the actions should be included inside this control.

For Each Document control to handle output files individually

Add the Read Text from PDF Action

Add and configure the Read Text action with all required parameters.

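  • X1 - X coordinate of the point on the left side of the area
  • Y1 - Y coordinate of the point on the left side of the area
  • X2 - X coordinate of the point on the right side of the area
  • Y2 - Y coordinate of the point on the right side of the area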
Read text from a specified location in PDF
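
As a rough illustration of how two points can select text, the Python sketch below keeps only the words whose boxes lie fully inside the rectangle spanned by (X1, Y1) and (X2, Y2). The `Word` type, the `read_text_in_area` function, and the assumption that coordinates are measured from the top-left corner of the page are all hypothetical; the article does not say how PDF4me implements the action or where its origin lies.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A hypothetical word box: text plus its position and size in pixels."""
    text: str
    x: float       # left edge
    y: float       # top edge (origin assumed at the top-left of the page)
    width: float
    height: float


def read_text_in_area(words, x1, y1, x2, y2):
    """Join the words that lie completely inside the rectangle (x1, y1)-(x2, y2)."""
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    inside = [
        w for w in words
        if w.x >= left and w.x + w.width <= right
        and w.y >= top and w.y + w.height <= bottom
    ]
    # Reading order: top to bottom, then left to right.
    inside.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


page_words = [
    Word("Invoice", 50, 40, 60, 12),
    Word("INV-1042", 115, 40, 70, 12),
    Word("Total", 50, 300, 40, 12),
]
print(read_text_in_area(page_words, 40, 30, 200, 60))  # Invoice INV-1042
```

In practice, choosing the four values comes down to measuring the area on a sample page and checking that the text you expect is returned.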

Add a Save to action

The output files need to be saved to cloud storage. For our use case, let us configure a Save to Dropbox action. The image for this step shows an expression that gets the text from the ‘Read text’ action. You can copy and paste the following expression into the Output File Name parameter:

${file.pages[0].PageText}.pdf

Save output files to dropbox after naming them with the read text

The expression takes the text read from the specified location of the PDF and passes it to the Output File Name parameter, so that each file is named after the text that was read.
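
Text read from a page can contain line breaks or characters that are not valid in file names. The Python sketch below shows one way such text could be turned into a safe `.pdf` file name. It illustrates the naming idea only; the function `output_file_name` and its fallback value are assumptions, and the article does not describe whether or how PDF4me cleans the text.

```python
import re


def output_file_name(page_text: str, fallback: str = "untitled") -> str:
    """Build a '<text>.pdf' file name from text read off a page."""
    # Replace characters that common file systems reject, including control characters.
    name = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", page_text)
    # Collapse runs of whitespace, then trim spaces and dots from both ends.
    name = re.sub(r"\s+", " ", name).strip(" .")
    # Use the fallback when nothing usable is left.
    return f"{name or fallback}.pdf"


print(output_file_name("  Invoice: INV/1042 \n"))  # Invoice INV 1042.pdf
print(output_file_name(""))                        # untitled.pdf
```

It is worth picking a read area whose text is short and unique for each split document, so the output files do not overwrite one another.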

Final Read Text from PDF workflow summary

To access Workflows, you need a PDF4me subscription. You can also get a Daypass and try out Workflows to see how they can help automate your document jobs.
