I have used ghostscript to successfully extract text from PDFs that have tables.
This simple command works very well:
gswin64c -sDEVICE=txtwrite -o test.txt "c:\reports\sample.pdf"
However some words get joined together especially from tables, for example:
234801111111109-12-2014 16:17:04764030208117034 2883253100.00 Payment
234801111111109-12-2014 16:18:461088956908117033 2883253400.00 Payment
234801111111109-12-2014 16:19:48769948208117040 2883253750.00 Payment
should actually be:
2348011111111 09-12-2014 16:17:04 764030208117034 2883253 100.00 Payment
2348011111111 09-12-2014 16:18:46 1088956908117033 2883253 400.00 Payment
2348011111111 09-12-2014 16:19:48 769948208117040 2883253 750.00 Payment
Please is there a way to add a separator character at the end of each word.
That would solve this perfectly.