I am using the logic below to extract text from PDFs with PDFBox. It gives good output for normal PDFs.
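// pdf is the already loaded PDDocument
PDFTextStripper stripper = new PDFTextStripper();
stripper.setSortByPosition(false); // keep content-stream order (the default)
stripper.setParagraphStart("$");   // marker written before each paragraph
stripper.setParagraphEnd("$$");    // marker written after each paragraph
String output = stripper.getText(pdf);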
However, in some PDFs the text is inclined at an angle, as shown in the attached image. For these PDFs, PDFBox produces the following output:
$ Image proc $$
$ essing is pr $$
$ ocessing of im $$
$ ages usin $$
$ g mathe $$
$ matical $$.....
I would like the output to be:
$ Image processing is processing of images using
mathematical...................................................
..........................techniques to the input $$
Please suggest how I can get good output from this type of PDF.
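For reference, here is a rough sketch of one possible direction, not a confirmed PDFBox solution: rotate the fragment positions back by the inclination angle so the baselines become horizontal, then group fragments that share a baseline and join them left to right. Everything in it is an assumption made for illustration: the `Fragment` class, `joinLines`, `at`, and `demo` are hypothetical names, the angle is assumed to be known and the same for all text, and the coordinates are made up. In PDFBox the positions would presumably come from `TextPosition` objects (for example by overriding `PDFTextStripper.writeString(String, List<TextPosition>)`) and the rotation from `TextPosition.getTextMatrix()`, but that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class InclinedTextSketch {

    /** Hypothetical stand-in for one extracted text chunk and its baseline origin (y grows upward, as in PDF user space). */
    static final class Fragment {
        final String text;
        final double x;
        final double y;

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Rotates all origins by -angleDegrees, groups fragments whose rotated y differs by at most tolerance, and joins each group left to right. */
    static List<String> joinLines(List<Fragment> fragments, double angleDegrees, double tolerance) {
        double theta = Math.toRadians(angleDegrees);
        double cos = Math.cos(theta);
        double sin = Math.sin(theta);

        // Rotate every origin by -theta so the inclined baselines become horizontal.
        List<Fragment> rotated = new ArrayList<>();
        for (Fragment f : fragments) {
            double rx = f.x * cos + f.y * sin;
            double ry = -f.x * sin + f.y * cos;
            rotated.add(new Fragment(f.text, rx, ry));
        }

        // Largest y first, because y grows upward and the top line should come first.
        rotated.sort(Comparator.comparingDouble((Fragment f) -> f.y).reversed());

        List<String> lines = new ArrayList<>();
        List<Fragment> current = new ArrayList<>();
        for (Fragment f : rotated) {
            // A jump in rotated y larger than the tolerance starts a new line.
            if (!current.isEmpty() && Math.abs(current.get(0).y - f.y) > tolerance) {
                lines.add(concat(current));
                current = new ArrayList<>();
            }
            current.add(f);
        }
        if (!current.isEmpty()) {
            lines.add(concat(current));
        }
        return lines;
    }

    /** Sorts one line by rotated x and concatenates the fragment texts. */
    private static String concat(List<Fragment> line) {
        line.sort(Comparator.comparingDouble((Fragment f) -> f.x));
        StringBuilder sb = new StringBuilder();
        for (Fragment f : line) {
            sb.append(f.text);
        }
        return sb.toString();
    }

    /** Builds made-up demo data: a fragment `along` units along the inclined baseline and `down` units below the first line. */
    static Fragment at(String text, double angleDegrees, double along, double down) {
        double theta = Math.toRadians(angleDegrees);
        double x = 50 + along * Math.cos(theta) + down * Math.sin(theta);
        double y = 400 + along * Math.sin(theta) - down * Math.cos(theta);
        return new Fragment(text, x, y);
    }

    /** Two lines inclined at 30 degrees, supplied in scrambled order. */
    static List<String> demo() {
        double angle = 30;
        List<Fragment> fragments = new ArrayList<>();
        fragments.add(at("matical techniques", angle, 60, 15));
        fragments.add(at("essing is pr", angle, 55, 0));
        fragments.add(at("Image proc", angle, 0, 0));
        fragments.add(at("using mathe", angle, 0, 15));
        fragments.add(at("ocessing of images", angle, 120, 0));
        return joinLines(fragments, angle, 2.0);
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
        // prints:
        // Image processing is processing of images
        // using mathematical techniques
    }
}
```

The tolerance of 2.0 units is arbitrary. A real document would likely need a value derived from the font size, plus separate handling for pages that mix several angles.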