6

I have a lot of PDF files, and some of them are quite large.

I see two alternatives:

  1. remove images and remove embedded fonts
  2. compress images

Is it possible to remove all objects such as images and fonts from a PDF, using either a PHP library or a command-line tool?

Alternatively, if I want to compress the images in the PDF, which PHP library or command-line tool do you recommend?

Environment: Debian, PHP.

clarkk

2 Answers

2

pdftk is the way to go, IMO.

It can uncompress and recompress the textual part of the PDF. You can also use it in a script to extract all the images, compress them with another tool, and then put them back into the original document.

I'm not sure whether it can remove embedded fonts.
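As a rough illustration of the uncompress/compress step mentioned above, here is a minimal sketch. The function name `pdftk_recompress`, the file names, and the `DRY_RUN` switch are made up for this example; only the `pdftk <in> output <out> compress` form comes from pdftk itself. It only touches page stream compression, so do not expect it to shrink images or fonts.

```shell
#!/bin/sh
# Sketch: re-apply page stream compression with pdftk.
# pdftk_recompress and DRY_RUN are conventions of this example, not pdftk features.

pdftk_recompress() {
    in=$1
    out=$2
    # "compress" restores page stream compression;
    # "uncompress" is the opposite operation and makes the streams readable in a text editor.
    set -- pdftk "$in" output "$out" compress
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of running it
        echo "$*"
    else
        "$@"
    fi
}
```

Usage would look like `pdftk_recompress big.pdf smaller.pdf`, or `DRY_RUN=1 pdftk_recompress big.pdf smaller.pdf` to see the command first.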

HTH

Daniele
  • I have looked at `pdftk`. I found a simple way to compress a PDF file: `pdf2ps file.pdf file.ps`, then `ps2pdf file.ps new_file.pdf`. Do you have a link on how to extract the images and afterwards replace them with compressed ones (pdftk)? – clarkk Apr 27 '12 at 16:51
  • @clarkk: [`QPDF`](http://qpdf.sourceforge.net/files/qpdf-manual.html) can extract images; see, for example, this answer at [Imagemagick: generate raw image data for PDF flate embedding?](http://stackoverflow.com/a/10935716/277826). I'm not sure about replacement; I'm also looking for a way of [re-encoding only images of a PDF](http://stackoverflow.com/questions/10936142/re-encoding-only-images-of-a-pdf-or-ghostscript-fails-on-8-bit-rgb-while-opti). Cheers! – sdaau Jun 07 '12 at 23:00
  • The `pdftk` utility does not do anything about fonts, though compressing the fonts to CFF (a.k.a. Type 1C) and keeping a subset is often the best way to make the PDF file smaller, e.g. when the PDF file has been obtained with `pdflatex`. The `ps2pdf` utility can do that on fonts, but be careful, as it may corrupt the text part, and the Ghostscript developers do not care very much about that; see [Ghostscript bug 704478](https://bugs.ghostscript.com/show_bug.cgi?id=704478). – vinc17 Oct 02 '21 at 23:33
-2

Lots of tools, including Acrobat, can reduce the file size by looking for optimizations such as:

  • dead (unreferenced) objects
  • raw image data stored at an unnecessarily high resolution

Have you looked at any of these tools?
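For the second point, images stored at a higher resolution than needed, one commonly used command-line option on Debian is Ghostscript's `pdfwrite` device. The sketch below is only an illustration: the function name `shrink_pdf_images`, the file names, and the `DRY_RUN` switch are assumptions of this example, while the `gs` options are standard Ghostscript ones. As noted in the comments on this page, rewriting a PDF through Ghostscript can occasionally damage the text, so check the output before discarding the original.

```shell
#!/bin/sh
# Sketch: downsample images by rewriting the PDF with Ghostscript's pdfwrite device.
# shrink_pdf_images and DRY_RUN are conventions of this example, not Ghostscript features.

shrink_pdf_images() {
    in=$1
    out=$2
    # Quality preset: /screen (lowest), /ebook, /printer, /prepress (highest)
    preset=${3:-/ebook}
    set -- gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        -dPDFSETTINGS="$preset" \
        -dNOPAUSE -dQUIET -dBATCH \
        -sOutputFile="$out" "$in"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of running it
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `shrink_pdf_images scan.pdf scan_small.pdf /screen` would request the most aggressive preset; leaving out the third argument falls back to `/ebook`.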

dplante
mark stephens