I'm facing a problem while parsing a PDF with Jpedal.
While reading the wordlist
from the Jpedal, I get garbled characters in the wordslist
. This also happens when using OCR, and when I copy the text from PDF and paste in Word or a simple text editor. What I understand is this PDF was generated by Quartz PDF context on MAC OS X 10.6.4, which is used to compress the file size, but iseasily viewable on PDF viewers. I searched for any Java API supporting for decoding this kind of PDF but was unsuccessful. I'm looking for any application or Java API which I can use to decode it; must be usable on a Linux machine.