How to extract table text from PDFs using pdfminer in Python

Question

I am looking for a script to extract table text from PDFs using pdfminer. I have tried tabula, but I want to load both the normal text and the table text into a database. Any ideas on how to implement this are welcome.

score 0 · Answer 1 · answered Feb 07 '20 at 06:49


Maybe you can get some ideas from these links.

samuel161


I have tried both pdfminer and tabula. Each is good for a specific purpose: pdfminer extracts all the text in the document, whereas tabula extracts only the text inside tables. I need to get the table text from the document. – Aravind Feb 07 '20 at 08:18

score 0 · Answer 2 · answered Sep 14 '22 at 12:13


As many people suggested in the linked question, "How to extract tables from a pdf with PDFMiner?":

You can use Camelot, which is built on top of pdfminer, to extract tables from PDFs.

https://camelot-py.readthedocs.io/en/master/user/quickstart.html#read-the-pdf
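A minimal sketch of how this could feed a database, assuming `pdfminer.six` and `camelot-py` are installed. It uses `pdfminer.high_level.extract_text` for the normal text and `camelot.read_pdf` for the tables. The SQLite schema (`doc_text`, `doc_cells`) and the helper names are hypothetical, chosen only for illustration, and the body text will still contain the table text because pdfminer extracts everything.

```python
import sqlite3


def extract_plain_text(pdf_path):
    """Return all text in the PDF using pdfminer.six's high-level API."""
    from pdfminer.high_level import extract_text  # pip install pdfminer.six
    return extract_text(pdf_path)


def extract_tables(pdf_path, pages="1"):
    """Return each table Camelot detects as a list of rows (lists of cells)."""
    import camelot  # pip install camelot-py
    tables = camelot.read_pdf(pdf_path, pages=pages)
    # Each Camelot table exposes its contents as a pandas DataFrame in `.df`.
    return [table.df.values.tolist() for table in tables]


def store_document(conn, pdf_name, text, tables):
    """Store body text and table cells in two (hypothetical) SQLite tables.

    Returns the number of table cells written.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS doc_text (pdf TEXT, body TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_cells "
        "(pdf TEXT, table_no INTEGER, row_no INTEGER, col_no INTEGER, value TEXT)"
    )
    conn.execute("INSERT INTO doc_text VALUES (?, ?)", (pdf_name, text))
    # Flatten every table into one row per cell so the grid can be rebuilt later.
    cells = [
        (pdf_name, table_no, row_no, col_no, str(value))
        for table_no, rows in enumerate(tables)
        for row_no, row in enumerate(rows)
        for col_no, value in enumerate(row)
    ]
    conn.executemany("INSERT INTO doc_cells VALUES (?, ?, ?, ?, ?)", cells)
    conn.commit()
    return len(cells)
```

For a real file you would call something like `store_document(conn, "report.pdf", extract_plain_text("report.pdf"), extract_tables("report.pdf", pages="all"))`, where `report.pdf` is a placeholder name. Camelot only works on text-based PDFs, so scanned documents would need OCR first.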


Mohamed Nabil

