I have some PDFs laid out in columns that I need to scrape. The problem is that each column runs across multiple pages rather than following the typical column layout, so the text has to be read down column 1 on every page before moving on to column 2. For example:
******Column 1******************Column 2*************
Somebody once told me Finger and her thumb
The world was gonna In the shape of an "L"
Roll me. I ain't the On her forehead. Well
*******************NEXT PAGE**************************
Sharpest tool in the The years start coming
Shed. She was looking And they don't stop coming
Kind of dumb with her
I have tried standard PDF scrapers such as PDFMiner, but they just return a string that reads each page's columns in turn, like:
Somebody once told me
The world was gonna
Roll me. I ain't the
Finger and her thumb
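To make the order I am after concrete, here is a rough sketch of the reordering only, not a working scraper. It assumes each text line could somehow be obtained together with its page index and position; the `Line` tuple, the `split_x` column boundary, the `column_major` name, and the coordinates in `sample` are all made up for illustration.

```python
from typing import NamedTuple


class Line(NamedTuple):
    page: int    # 0-based page index (assumed to be available)
    x0: float    # left edge of the line on the page
    top: float   # distance from the top of the page
    text: str


def column_major(lines, split_x):
    """Order lines by column first, then page, then top to bottom.

    split_x is an assumed x coordinate separating column 1 from column 2.
    """
    def key(line):
        column = 0 if line.x0 < split_x else 1
        return (column, line.page, line.top)

    return [line.text for line in sorted(lines, key=key)]


# Hypothetical positions for the two sample pages above.
sample = [
    Line(0, 50, 100, "Somebody once told me"),
    Line(0, 300, 100, "Finger and her thumb"),
    Line(0, 50, 120, "The world was gonna"),
    Line(0, 300, 120, 'In the shape of an "L"'),
    Line(0, 50, 140, "Roll me. I ain't the"),
    Line(0, 300, 140, "On her forehead. Well"),
    Line(1, 50, 100, "Sharpest tool in the"),
    Line(1, 300, 100, "The years start coming"),
    Line(1, 50, 120, "Shed. She was looking"),
    Line(1, 300, 120, "And they don't stop coming"),
    Line(1, 50, 140, "Kind of dumb with her"),
]

print("\n".join(column_major(sample, split_x=200)))
```

With the sample positions, this prints all of column 1 from both pages, followed by all of column 2. What I do not know is how to get the page and position for each line out of the PDF in the first place.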
Any help would be appreciated!