I have two PDFs: the first was created with libharu and the second with PDF::API2. Apart from the coordinates, the content is the same, but the first PDF is four times larger than the second. The only difference I have found is the type of font embedding shown in the Fonts tab of the document properties.

In the first:

Verdana (Embedded Subset) 
  Type: TrueType 
  Encoding: Custom

In the second:

Verdana 
  Type: TrueType
  Encoding: Custom
  Actual Font: Verdana
  Actual font Type: TrueType

How do I deal with that embedded subset?

Yola

2 Answers

This is an old question, but I had a similar issue.

Did you set libharu to compress your PDF?

In C++, from the documentation:

HPDF_SetCompressionMode(pdf, HPDF_COMP_ALL);

M J Barrow

There are many factors that affect the size of the PDF. Your problem may be in the way the PDF creation libraries handle font embedding, specifically:

  • "Embedded subset" means that part of the font's metrics, like glyph widths, are included in the file.
  • If the font is not embedded, presumably it is loaded by the reader from the system, reducing the size of the file.

If the PDF is already small (a single page, little text and no images), embedding fonts may make a relatively big difference to the size of the document. Still, in absolute terms, an embedded font shouldn't take a lot of space.

Another factor you should check is compression. PDF is mostly a plain-text stream, but it usually comes in compressed form. Try opening both PDFs in a plain text editor and see if it's readable or gibberish. The gibberish (compressed) form will naturally take less space.
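The text-editor check can also be scripted. The following is only a rough sketch, assuming the stream dictionaries are stored as plain text in the file (usually true, but not when objects are packed into object streams); the function name `stream_filter_report` is made up for illustration. It counts the stream objects in a PDF and how many of them declare the `/FlateDecode` filter, which is how zlib-compressed streams are marked.

```python
import re
import sys


def stream_filter_report(pdf_bytes: bytes) -> dict:
    """Count stream objects and how many of them declare /FlateDecode."""
    total = 0
    flate = 0
    # A stream's dictionary ends with ">>" right before the "stream" keyword.
    for match in re.finditer(rb">>\s*stream\r?\n", pdf_bytes):
        # Look back to the start of the enclosing object ("N G obj")
        # so that only this object's dictionary is inspected.
        obj_start = pdf_bytes.rfind(b" obj", 0, match.start())
        dictionary = pdf_bytes[max(obj_start, 0):match.start()]
        total += 1
        if b"/FlateDecode" in dictionary:
            flate += 1
    return {"streams": total, "flate": flate}


if __name__ == "__main__":
    # Usage: python check_streams.py first.pdf second.pdf
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, stream_filter_report(handle.read()))
```

If the libharu file reports streams with no `/FlateDecode` while the PDF::API2 file reports them as compressed, compression alone may explain much of the size difference.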

Finally, you can inspect the objects the PDF file is composed of using one of the many PDF inspectors out there, for example this one (I just googled it, no guarantees it'll work as expected).
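If no inspector is at hand, a crude look at the font entries can be scripted. This is a sketch under the same assumption that the font dictionaries appear as plain text in the file; the helper names are hypothetical. It relies on two conventions from the PDF specification: a subset font's name starts with a tag of six uppercase letters and a plus sign (for example `ABCDEF+Verdana`), and an embedded TrueType font program is referenced through a `/FontFile2` entry.

```python
import re

# Subset fonts are named like "ABCDEF+Verdana": six uppercase letters, then "+".
SUBSET_TAG = re.compile(rb"^[A-Z]{6}\+")


def list_base_fonts(pdf_bytes: bytes) -> list:
    """Return sorted (font name, is_subset) pairs for each /BaseFont entry."""
    names = set(re.findall(rb"/BaseFont\s*/([^\s/<>\[\]()]+)", pdf_bytes))
    return [
        (name.decode("latin-1"), bool(SUBSET_TAG.match(name)))
        for name in sorted(names)
    ]


def has_embedded_truetype(pdf_bytes: bytes) -> bool:
    """True if some font descriptor references an embedded TrueType program."""
    return b"/FontFile2" in pdf_bytes


if __name__ == "__main__":
    import sys

    # Usage: python check_fonts.py first.pdf second.pdf
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            data = handle.read()
        print(path, list_base_fonts(data), has_embedded_truetype(data))
```

Running it on both files should show whether either one actually carries a font program, and whether that program is a subset.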

Vladimir Gritsenko
  • Do you know of any PDF inspector for Linux? I searched a lot, but it looks like they exist only for Mac and Windows. – Yola Dec 15 '11 at 11:56
  • The embedded subset refers to the font only containing the glyphs that are used in the document. The font's metrics should always be in the file. – Jimmy Dec 15 '11 at 11:58
  • @Jimmy, after rereading the reference, you're right. (There are the 14 standard fonts which do not have to be embedded, but Verdana isn't one of them.) – Vladimir Gritsenko Dec 15 '11 at 13:02