I'm trying to extract text from a rotated PDF page: the page dictionary contains a "/Rotate 90" entry. This means the page is rotated when displayed, but it does not seem to be rotated when extracting text with PdfTextExtractor and LocationTextExtractionStrategy. I followed the example by Mr. Lowagie at this link.
I tried rotating the area instead of the page, but that seems to extract the whole text block as one piece instead of exactly the selected area.
I'm using iText 5.5.12 with Java 1.8.
How can I rotate the page for extraction?
Update
The code I use is like this:
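    import java.io.IOException;
    import com.itextpdf.text.Rectangle;
    import com.itextpdf.text.pdf.PdfReader;
    import com.itextpdf.text.pdf.parser.*;

    PdfReader reader = null;
    try {
        reader = new PdfReader("C:\\Temp\\rotated.pdf");
        // Region to extract, taken from the page as it is displayed (rotated)
        Rectangle rect = new Rectangle(480, 484, 576, 525);
        // Page size of page 1 (not used further below)
        final Rectangle pageRect = reader.getPageSize(1);
        // Keep only the text chunks that fall inside the region
        RenderFilter regionFilter = new RegionTextRenderFilter(rect);
        TextExtractionStrategy strategy = new FilteredTextRenderListener(
                new LocationTextExtractionStrategy(), regionFilter);
        System.out.println(">>" + PdfTextExtractor.getTextFromPage(reader, 1, strategy).trim());
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (reader != null) {
            reader.close();
        }
    }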
I can't find a way to upload an example PDF here, so I have attached an image taken from GIMP with the selected area marked. The PDF was created with LibreOffice's export function and then manually edited to add the /Rotate entry.
The given coordinates assume the origin is at the lower-right corner.
The program's output is an empty string.
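To make the coordinate question concrete, here is a minimal sketch of the mapping I think might be involved. It assumes that text extraction reports positions in the unrotated page coordinate system (origin at the lower-left corner of the unrotated page) and that /Rotate turns the page clockwise only for display. The class RotatedRegion and the method toUnrotated are hypothetical names made up for illustration; they are not part of iText.

    // Hypothetical helper, not part of iText: maps a rectangle measured on the
    // displayed (rotated) page back to unrotated page coordinates.
    final class RotatedRegion {

        /**
         * @param r        {llx, lly, urx, ury} in displayed coordinates,
         *                 origin at the lower-left corner of the displayed page
         * @param rotation value of /Rotate (a multiple of 90, clockwise)
         * @param w        width of the unrotated page
         * @param h        height of the unrotated page
         * @return         {llx, lly, urx, ury} in unrotated coordinates
         */
        static double[] toUnrotated(double[] r, int rotation, double w, double h) {
            int rot = ((rotation % 360) + 360) % 360; // normalise, e.g. -90 becomes 270
            switch (rot) {
                case 0:   // nothing to undo
                    return new double[] { r[0], r[1], r[2], r[3] };
                case 90:  // displayed (x', y') = (y, w - x)
                    return new double[] { w - r[3], r[0], w - r[1], r[2] };
                case 180: // displayed (x', y') = (w - x, h - y)
                    return new double[] { w - r[2], h - r[3], w - r[0], h - r[1] };
                case 270: // displayed (x', y') = (h - y, x)
                    return new double[] { r[1], h - r[2], r[3], h - r[0] };
                default:
                    throw new IllegalArgumentException("Rotation must be a multiple of 90: " + rotation);
            }
        }
    }

With iText 5, the inputs could presumably come from reader.getPageRotation(1) and reader.getPageSize(1) (the unrotated size), and the four resulting values, cast to float, could be passed to new Rectangle(llx, lly, urx, ury) before building the RegionTextRenderFilter. I have not confirmed that this is how the extraction coordinates behave; it is only my working assumption.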