I have tried many libraries to extract tables from PDFs, such as Camelot, Tabula, PDFPlumber, and PDFTabExtract, but none of them give good results. The main problem is that the table headers have complex layouts, and the header format varies from table to table.
With Camelot, I can't write a single script that works for all pages in my PDF. With Tabula, I get a confusing DataFrame when the table has a rotated text header. With PDFPlumber, I have problems with stream tables (it only works well for lattice tables), and with PDFTabExtract, rotated text is simply ignored.
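To make the per-page problem concrete, here is a rough sketch of a fallback loop that tries several extraction strategies on a page and keeps the first result that looks like a table. It is only an illustration: `Extractor`, `looks_usable`, and `extract_with_fallback` are hypothetical names, and the extractors are placeholder callables standing in for whatever wrappers one would write around Camelot, Tabula, or PDFPlumber, not real library calls.

```python
from typing import Callable, List, Optional, Sequence

# A table is a list of rows; each cell is a string or None (empty cell).
Table = List[List[Optional[str]]]
# Hypothetical extractor signature: page number -> table, or None if nothing found.
Extractor = Callable[[int], Optional[Table]]


def looks_usable(table: Optional[Table], min_rows: int = 2, min_cols: int = 2) -> bool:
    """Crude sanity check: enough rows and columns, and no completely empty row."""
    if not table or len(table) < min_rows:
        return False
    if any(len(row) < min_cols for row in table):
        return False
    return all(any((cell or "").strip() for cell in row) for row in table)


def extract_with_fallback(page: int, extractors: Sequence[Extractor]) -> Optional[Table]:
    """Try each extractor in order and return the first usable table, else None."""
    for extract in extractors:
        try:
            table = extract(page)
        except Exception:
            # Assumption: a failing extractor should not abort the whole page.
            continue
        if looks_usable(table):
            return table
    return None
```

The order of the extractors encodes a preference (for example, a lattice-style wrapper first and a stream-style wrapper second), and the usability check is deliberately simple; it would not catch a rotated header that was misread as ordinary cells.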
Is there any solution that can convert every table in my PDF, even though the tables come in different formats? I know there is no fully generic solution, but I would like at least something that gives decent results.
Should I use OCR? What would you recommend?