I have a PDF file. It contains four columns, and none of the pages have grid lines. The data are students' marks.
I would like to run some analysis on this distribution (histograms, line graphs, etc.).
I want to convert this PDF file into a spreadsheet or an HTML file (which I can then parse very easily).
The link to the pdf is:
This is a public document and is openly available to anyone on this domain.
Note: I know that this can be done by exporting the file to text from Adobe Reader and then importing it into LibreOffice Calc or Excel, but I want to do this using a Python script.
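To make the goal concrete, here is a minimal sketch of the text-to-spreadsheet step I have in mind. It does not read the PDF itself; it only covers the part after the text has been extracted. It assumes the extracted text has one record per line with whitespace-separated fields, and that only the first column can contain spaces. The names `rows_from_text`, `write_csv`, `marks.txt` and `marks.csv` are hypothetical, and the real layout may need different rules.

```python
import csv
import sys


def rows_from_text(text, n_columns=4):
    """Split extracted text into rows of n_columns fields.

    Assumption: fields are separated by whitespace, and any extra
    tokens on a line belong to the first column (e.g. a name with
    spaces). Lines with too few tokens are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < n_columns:
            continue  # blank lines, page numbers, stray headers
        split_at = len(fields) - (n_columns - 1)
        first = " ".join(fields[:split_at])
        rows.append([first] + fields[split_at:])
    return rows


def write_csv(rows, path):
    """Write rows to a CSV file that Calc or Excel can open."""
    # Python 2's csv module expects a binary file; Python 3 expects
    # a text file opened with newline="".
    if sys.version_info[0] < 3:
        f = open(path, "wb")
    else:
        f = open(path, "w", newline="")
    with f:
        csv.writer(f).writerows(rows)


if __name__ == "__main__":
    # "marks.txt" is a hypothetical text export of the PDF.
    with open("marks.txt") as src:
        write_csv(rows_from_text(src.read()), "marks.csv")
```

What I am missing is the step that gets the text out of the PDF from within Python, in a way that keeps the four columns distinguishable.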
Kindly help me with this issue. Specs: Windows 7, Python 2.7.