How can I get text from a table in a PDF file?

Question

I want to get text from a table in a PDF file (image attached).

I cannot get the cells in the table. I tried running the LEADTOOLS example, but it does not auto-detect the cells.

https://www.leadtools.com/help/leadtools/v20/dh/fo/iocrtablezonemanager.html

Can you give me some advice? Thanks, all.

See this reference: https://stackoverflow.com/questions/83152/reading-pdf-documents-in-net - Shusil Satyal, Jan 06 '20 at 10:47
@ShusilSatyal Thank you, but I want to use LEADTOOLS to get the table data because I am studying LEADTOOLS. - denis bui, Jan 07 '20 at 02:34

score 0 · Accepted Answer · answered Jan 09 '20 at 18:11

In tables similar to the image you posted, you should be able to find the cells using the IOcrPage.TableZoneManager.AutoDetectCells() method. This method is used in the OcrMultiEngineDemo project that’s shipped with the current version of LEADTOOLS.

Here’s how you can test it:

1. Run the OCR Multi-Engine Demo.
2. Select the OmniPage OCR Engine.
3. Open the image or PDF file that contains the table.
4. Draw a zone around the table.
5. Choose “Update Zones…” from the OCR->Zones menu.
6. In the “Update Zones” dialog, click “Detect Cells” as shown in the attached image.

If this doesn’t give you the result you’re expecting, send the actual files you’re testing with to support@leadtools.com and explain exactly how you tested.
