If you weren't on Android, the answer would be easy: use iText 7. The output comes out much cleaner when parsing the document with iText 7. It is still not 100% correct, but at least it looks mostly readable to me (although I'd need a native speaker to confirm). This is for page 2:
मैत्रबधं अरुण कुळकणी
मैत्रबधं
अरुण कुळकणी
ई साहित्य प्रहिष्ठान
ई साहित्य प्रहिष्ठान
The results are similar for the next page, with some minor hiccups but nothing as distorted as in iText 5.
But yeah, unfortunately you're on Android. As of yet there is no Android version of iText 7, so you'd be stuck either waiting for one or manually porting iText 7 to the Android platform yourself (which will probably take forever if you're not intimately familiar with both Android and iText).
This is the iText 7 code I used:
// JDK and JUnit imports
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import org.junit.Test;
// iText imports
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;

public class HindiText {
    @Test
    public void go() throws Exception {
        try (PdfDocument doc = new PdfDocument(new PdfReader("input.pdf"))) {
            try (OutputStream os = new FileOutputStream("output.txt")) {
                // Extract the text of page 3 and write it out as UTF-16
                String result = PdfTextExtractor.getTextFromPage(doc.getPage(3));
                os.write(result.getBytes(StandardCharsets.UTF_16));
            }
        }
    }
}
FYI: as of 2016-12-02 you need to build iText 7 from source (https://github.com/itext/itext7) to get the quality I described above. This functionality will be included in iText 7.0.2 when it is released.
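One thing to keep in mind about the code above: it writes the extracted text as UTF-16, so whatever you use to open output.txt has to read it as UTF-16 too, or the Devanagari will look garbled even when the extraction itself was fine. Here is a minimal, JDK-only sketch of that round trip. It doesn't touch iText at all; the class name `EncodingRoundTrip`, the helper `roundTrip` and the sample string are just placeholders I made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class EncodingRoundTrip {

    // Hypothetical helper: writes the text as UTF-16 (same encoding as the
    // extraction code above), then reads the file back with the same charset.
    static String roundTrip(String text, Path file) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_16));
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_16);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("extracted", ".txt");
        try {
            // Placeholder sample, standing in for the extracted page text
            String sample = "ई साहित्य";
            String readBack = roundTrip(sample, tmp);
            System.out.println("Round trip preserved the text: " + readBack.equals(sample));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Java's UTF-16 encoder puts a byte order mark at the start of the output, which is what lets most editors detect the encoding on their own. If your viewer doesn't, switching both sides to UTF-8 should work just as well.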