
I am trying to parse some PDF files in order to extract some key information. Each PDF contains a number of tables that hold part of this information. I used Camelot to extract the tables and got good results, but I also want the title of each table so that I can map each table to its title. Can anyone tell me how to extract a table's title from a PDF using Python?

jessy
  • Currently, Camelot can't extract table titles (https://github.com/atlanhq/camelot/issues/247). If you post the PDF, we can analyze the problem better. – Stefano Fiorucci - anakin87 Sep 12 '19 at 06:26
  • @Anakin87 thanks. It is not just one PDF with a defined format but a number of PDF files from the financial field. I thought about using OCR, or converting the files to HTML in the hope that the tables can be detected via the `<table>` tag in HTML. – jessy Sep 13 '19 at 15:21
  • Does this answer your question? [Python PDF Parsing with Camelot and Extract the Table Title](https://stackoverflow.com/questions/58185404/python-pdf-parsing-with-camelot-and-extract-the-table-title) – Brian Wylie Feb 17 '21 at 02:04
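
Since Camelot does not return titles itself, one possible workaround is a geometric heuristic: take the bounding box of each extracted table and pick the closest text line that sits just above it. The sketch below shows only that matching step in plain Python. The `TextLine` type, the `find_title` helper, and the `max_gap` threshold are hypothetical names introduced for illustration, not part of Camelot. It assumes you can obtain the table bounding box and the text-line coordinates from your extraction tools, and that both use PDF coordinates (origin at the bottom-left, y increasing upward).

```python
from typing import List, NamedTuple, Optional, Tuple


class TextLine(NamedTuple):
    """Hypothetical container for one line of page text and its bounding box."""
    text: str
    x0: float  # left
    y0: float  # bottom
    x1: float  # right
    y1: float  # top


def find_title(
    table_bbox: Tuple[float, float, float, float],
    lines: List[TextLine],
    max_gap: float = 50.0,  # assumed threshold in PDF points; tune per document
) -> Optional[str]:
    """Return the text of the nearest line directly above the table, if any."""
    tx0, _ty0, tx1, ty1 = table_bbox  # (left, bottom, right, top)
    best: Optional[TextLine] = None
    best_gap: Optional[float] = None
    for line in lines:
        if not line.text.strip():
            continue  # ignore empty lines
        gap = line.y0 - ty1  # vertical distance from table top to line bottom
        if gap < 0 or gap > max_gap:
            continue  # line is below the table top, or too far above it
        if line.x1 < tx0 or line.x0 > tx1:
            continue  # no horizontal overlap with the table
        if best_gap is None or gap < best_gap:
            best, best_gap = line, gap
    return best.text.strip() if best is not None else None


# Made-up coordinates, for illustration only.
page_lines = [
    TextLine("Intro paragraph", 50, 600, 400, 612),
    TextLine("Table 1: Revenue", 50, 520, 300, 532),
    TextLine("Footer", 50, 20, 100, 30),
]
title = find_title((50, 300, 500, 510), page_lines)
```

This will not work for every layout. A title printed below the table, or a caption separated from the table by a large gap, needs a different rule, so treat the result as a candidate to check rather than a guaranteed match.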

0 Answers