I ran into something similar recently when extracting text from PDFs. The log showed warnings like this:
WARNING: No Unicode mapping for 112 (142) in font AEDNQJ+Palatino-BoldItalic+2
As a result, certain characters (such as á) were missing from the extracted text:
M_s alta que los cielos, m_s honda que la mar,
(Added underscores where the character <á> should have been in the text)
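If there are many such warnings, it can help to list which fonts they mention before regenerating the PDF. Below is a minimal sketch that scans captured log text for them. It assumes the warnings follow the exact format quoted above and that font names contain no whitespace. The class `UnmappedFontScanner` and its method `findFonts` are hypothetical names made up for illustration and are not part of any PDF library.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects font names from "No Unicode mapping" warnings.
public class UnmappedFontScanner {

    // Assumes the format "No Unicode mapping for <code> (<number>) in font <name>".
    private static final Pattern WARNING =
            Pattern.compile("No Unicode mapping for \\S+ \\(\\d+\\) in font (\\S+)");

    // Returns each affected font name once, in order of first appearance.
    public static Set<String> findFonts(String logText) {
        Set<String> fonts = new LinkedHashSet<>();
        Matcher matcher = WARNING.matcher(logText);
        while (matcher.find()) {
            fonts.add(matcher.group(1));
        }
        return fonts;
    }

    public static void main(String[] args) {
        String log =
                "WARNING: No Unicode mapping for 112 (142) in font AEDNQJ+Palatino-BoldItalic+2\n"
              + "WARNING: No Unicode mapping for 112 (142) in font AEDNQJ+Palatino-BoldItalic+2\n";
        System.out.println(findFonts(log)); // prints [AEDNQJ+Palatino-BoldItalic+2]
    }
}
```

Any font reported here is a candidate for the missing characters, so it is worth checking that it is embedded in the regenerated file.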
The fix is to regenerate the PDF with all fonts embedded (for example as PDF/A, which requires fonts to be embedded), so that every font is available at text-extraction time.
Example:
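    import java.io.IOException;
    import java.io.InputStream;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.text.PDFTextStripper; // package name as of PDFBox 2.x

    // ParsingException is the application's own exception (assumed unchecked).
    public String parsePdf(InputStream pdfStream) {
        // try-with-resources closes the document even if extraction fails
        try (PDDocument pdfDoc = PDDocument.load(pdfStream)) {
            PDFTextStripper textStripper = new PDFTextStripper();
            return textStripper.getText(pdfDoc);
        } catch (IOException e) {
            throw new ParsingException("Unable to load input pdf stream", e);
        }
    }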
With the fonts embedded, the same line is extracted correctly:
Más alta que los cielos, más honda que la mar,
You can convert an existing PDF to PDF/A using Adobe Acrobat or the Preview app on macOS.