
I have a PDF file whose font properties show up in Okular (or any other PDF viewer) like this:

Name: Helvetica 
Type: Type1
File: /usr/share/fonts/truetype/liberation2/LiberationSans-regular.ttf
Embedded: No

I want to embed Helvetica with PDFBox 2.x without modifying the file content (the text) itself, so that the font is always available with the file. Is this possible at all? I tried something like:

PDDocument document = PDDocument.load(myFile);

InputStream stream = new FileInputStream(new File("/home/user/fonts_temp/Helvetica.ttf"));
PDFont fontToEmbed = PDType0Font.load(document, stream, true);              
PDResources resources = document.getPage(pageNumber).getResources();
resources.add(fontToEmbed);
//or use the font from pdfbox:
resources.add(PDType1Font.HELVETICA);

document.save(somewhere);
document.close();

I also tried to call

COSName fontCosName = resources.add(PDType1Font.HELVETICA);
resources.put(fontCosName, font);

What am I doing wrong?

Edit:

@TilmanHausherr thank you for the clue! But I'm still missing something. Currently my code looks like:

PDFont helvetica = PDType0Font.load(document, new FileInputStream(new File("/path/Helvetica.ttf")), false);
...
PDResources resources = page.getResources();
for (COSName fontCosName : resources.getFontNames()){
    if(resources.getFont(fontCosName).getName().equals("Helvetica")) {
        resources.put(fontCosName, helvetica);
    }
}

The end result shows Helvetica as CID TrueType, fully embedded. But the font is not displayed in the PDF file at all now: the places where the font is used are literally empty, and the page is blank. Something is still missing. The font itself was downloaded from here

jeffest
  • You'd need to know the name that is currently used in the resources, so check these with `resources.getFontNames()`. Also don't subset, so the last parameter should be false. – Tilman Hausherr Nov 09 '20 at 14:16
  • @TilmanHausherr thanks a lot. But I have updated the post with the next problem: the font is not rendered in the document at all – jeffest Nov 10 '20 at 07:33
  • Please share the PDF before and after – Tilman Hausherr Nov 10 '20 at 08:48
  • I see... problem with codes starting at 0 ... try using PDTrueTypeFont instead of PDType0Font. `PDTrueTypeFont.load(document, file, WinAnsiEncoding.INSTANCE);` – Tilman Hausherr Nov 10 '20 at 08:58
  • It works! But what was the problem? Why PDType0Font didn't fit? If I currently call `resources.getFont(<"F1" means Helvetica which is used in an original file>).getSubType()` - it returns `Type0`. How to understand how to load fonts properly then? – jeffest Nov 10 '20 at 12:28
  • `System.out.println(doc.getPage(0).getResources().getFont(COSName.getPDFName("F1")).getSubType());` gets me "Type1". With the new file it gets "Type0". The problem is that this font starts the numbers with 1, while the "old" truetype class starts them with 32, which is also what the standard 14 font did. You can look at the font with PDFDebugger, then it's more clear. – Tilman Hausherr Nov 10 '20 at 14:45
  • @TilmanHausherr Thanks. Now it is more clear. But how to distinguish in code, that in document this font is TrueType and that one is Type1? My end goal is to embed fonts with a certain logic, where I need to know font type. Because currently `getSubType()` is `Type1` and it is very confusing, as you see I wasn't able to embed Helvetica properly.. – jeffest Nov 11 '20 at 07:33
  • It's complicated... this time it worked nicely. It's possible that it won't work for other files. If your task is to convert files to PDF/A then I'd rather suggest to buy a commercial product. The solution is likely to work only for type 1 non embedded. "how to distinguish in code" well you did, i.e. getSubType, or the class itself. – Tilman Hausherr Nov 11 '20 at 10:08
  • @TilmanHausherr I'm a bit confused now. :) The original file showed `Type1`, while the loading is done via `PDTrueTypeFont.load()`, not via `PDType0Font.load()`. But `getSubType()` returned `Type1`, not `TrueType`. Isn't the logic conflicting here? My task is just to embed fonts and send docs to the printing service. I don't think that PDF/A is what I need – jeffest Nov 11 '20 at 11:31
  • I tried it. I get "Type1", and after replacement, I get "TrueType". `System.out.println(doc.getPage(0).getResources().getFont(COSName.getPDFName("F1")).getSubType());` – Tilman Hausherr Nov 11 '20 at 14:39
  • @TilmanHausherr but shouldn't it normally be "TrueType" before replacement as well? Because Helvetica is a TTF font? Or I am still missing the logic here? – jeffest Nov 12 '20 at 08:12
  • Helvetica is available both as type1 and truetype. This applies to many fonts. The PDF standard 14 fonts (Times-Roman, Helvetica, Courier, Symbol, Times-Bold, Helvetica-Bold, Courier-Bold, ZapfDingbats, Times-Italic, Helvetica-Oblique, Courier-Oblique, Times-BoldItalic, Helvetica-BoldOblique, Courier-BoldOblique) are all type1, but the same fonts are available as TrueType on many computers (and in PDFs, but then they are not "standard 14" but just some font) – Tilman Hausherr Nov 12 '20 at 08:50
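The list of standard 14 font names in the last comment can be turned into a simple lookup. This is a minimal sketch in plain Java with no PDFBox dependency; the class name `Standard14Names` and the method `isStandard14` are hypothetical names chosen for illustration, and the input is assumed to be the base font name as reported by the PDF (for example by `getName()`).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class Standard14Names {

    // The 14 names exactly as listed in the comment above.
    private static final Set<String> NAMES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "Times-Roman", "Helvetica", "Courier", "Symbol",
                    "Times-Bold", "Helvetica-Bold", "Courier-Bold", "ZapfDingbats",
                    "Times-Italic", "Helvetica-Oblique", "Courier-Oblique",
                    "Times-BoldItalic", "Helvetica-BoldOblique", "Courier-BoldOblique")));

    /** Returns true if the given base font name is one of the standard 14 names. */
    public static boolean isStandard14(String baseFontName) {
        return baseFontName != null && NAMES.contains(baseFontName);
    }

    public static void main(String[] args) {
        System.out.println(isStandard14("Helvetica"));      // prints true
        System.out.println(isStandard14("LiberationSans")); // prints false
    }
}
```

A name match alone does not say whether the font is embedded or which subtype it has, so this check is only one input to the decision.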

1 Answer


1. You'd need to know the name that is currently used in the resources, so check these with `resources.getFontNames()`.

2. To replace a standard 14 font, use this font object:

PDTrueTypeFont.load(document, file, oldFont.getEncoding() /* or WinAnsiEncoding.INSTANCE which is usually right */ );

This ensures that the same encoding is used as the standard 14 font. (The encoding is different for the Zapf Dingbats and Symbol fonts.)
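The two steps above amount to a filter over the page's font resources: find the resource names that point at the font in question, then replace only those. Below is a minimal sketch of that filter in plain Java, without any PDFBox dependency. `FontReplacementPlan`, `FontInfo`, and `namesToReplace` are hypothetical names for illustration, and the values they hold are assumed to have been read beforehand from each font (resource name from `resources.getFontNames()`, subtype from `getSubType()`, plus the font name and embedded flag). Following the comment that this approach is likely to work only for non-embedded Type 1 fonts, everything else is left untouched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FontReplacementPlan {

    /** Hypothetical holder for the properties read from one font resource. */
    public static final class FontInfo {
        final String baseName;  // e.g. "Helvetica"
        final String subType;   // e.g. "Type1", "TrueType", "Type0"
        final boolean embedded;

        public FontInfo(String baseName, String subType, boolean embedded) {
            this.baseName = baseName;
            this.subType = subType;
            this.embedded = embedded;
        }
    }

    /**
     * Returns the resource names (e.g. "F1") whose font is a non-embedded
     * Type 1 font with the given base name. Only these are candidates for
     * replacement with an embedded TrueType font.
     */
    public static List<String> namesToReplace(Map<String, FontInfo> fonts,
                                              String targetBaseName) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, FontInfo> entry : fonts.entrySet()) {
            FontInfo font = entry.getValue();
            if (!font.embedded
                    && "Type1".equals(font.subType)
                    && targetBaseName.equals(font.baseName)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, FontInfo> fonts = new LinkedHashMap<>();
        fonts.put("F1", new FontInfo("Helvetica", "Type1", false));
        fonts.put("F2", new FontInfo("Helvetica", "TrueType", true));
        fonts.put("F3", new FontInfo("Courier", "Type1", false));
        System.out.println(namesToReplace(fonts, "Helvetica")); // prints [F1]
    }
}
```

Collecting the names first and replacing afterwards also avoids modifying the resources while iterating over them. The actual replacement for each returned name would then be the `PDTrueTypeFont.load(...)` call shown above followed by `resources.put(...)`.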

Tilman Hausherr
  • *"this ensures that the same encoding is used as the standard 14 font"* - Well... no. It does not ensure that, it merely makes it very probable. Standard 14 fonts can be used with other encodings, too, even custom ones; merely hardly anyone does so. Essentially you need to keep the original encoding to ensure the same encoding. – mkl Nov 10 '20 at 15:58
  • I marked that answer as the accepted one, but there is actually a great discussion in the comments after the question – jeffest Nov 12 '20 at 08:13