
I am trying to extract tables from a PDF file using Python.

Can someone suggest a good module that fetches only the required table? I have tried pypdf, pdf2html, OCR, and slate, but nothing works.

Thanks

user1369478

1 Answer


First, convert the PDF to HTML. See Converting PDF to HTML with Python.

Then parse the HTML generated from the PDF with an HTML parsing library. See BeautifulSoup HTML table parsing.
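As a rough illustration of the second step, here is a minimal sketch that pulls table rows out of HTML. It rests on a few assumptions: it uses the standard library's `html.parser` instead of BeautifulSoup so it runs without extra installs, the names `TableExtractor` and `extract_tables` are made up for this example, and `sample_html` is a hypothetical stand-in for converter output. It also assumes the converter emits real `<table>` markup, which not every PDF-to-HTML tool does, and it does not handle nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in an HTML document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables found in the HTML string as nested lists."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Hypothetical HTML standing in for the output of a PDF-to-HTML converter.
sample_html = """
<html><body>
<p>Quarterly report</p>
<table>
  <tr><th>Item</th><th>Qty</th></tr>
  <tr><td>Widget</td><td>4</td></tr>
</table>
</body></html>
"""

tables = extract_tables(sample_html)
for row in tables[0]:
    print(row)
```

To fetch only the required table, you could pick it by position (`tables[0]`) or by checking its header row against the column names you expect.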

Priyank Patel