8

In a system I'm working on, we're generating thumbnails of PDF files as part of the workflow. Sometimes the files are quite large (print size 3 m²) and can contain huge bitmap images.

Are there programs capable of generating thumbnails that are optimized for a low memory footprint when handling such large PDF files?

The resulting thumbnail can be PNG or JPG.

Carl R

2 Answers

9

ImageMagick is what I use for all my CLI graphics, so maybe it can work for you:

convert foo.pdf foo-%d.png

For a three-page PDF, this produces three separate PNG files:

foo-0.png
foo-1.png
foo-2.png

To create only one thumbnail, treat the PDF as if it were an array ([0] is the first page, [1] is the second, etc.):

convert foo.pdf[0] foo-thumb.png

Since you're worried about memory, you can restrict memory usage with the -cache option (newer ImageMagick releases replace it with -limit memory):

-cache threshold: megabytes of memory available to the pixel cache.

Image pixels are stored in memory until threshold megabytes of memory have been consumed. Subsequent pixel operations are cached on disk. Operations to memory are significantly faster but if your computer does not have a sufficient amount of free memory you may want to adjust this threshold value.

So to thumbnail a PDF file and resize it, you could run this command, which should have a maximum memory usage of around 20 MB:

convert -cache 20 foo.pdf[0] -resize 10%x10% foo-thumb.png
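
If the thumbnailing runs inside a larger workflow, a command like the one above can be assembled and launched from a script. The following is a minimal sketch, not a definitive implementation: it assumes Python 3 and an ImageMagick `convert` binary on the PATH, the function names `build_thumbnail_command` and `make_thumbnail` are hypothetical, and it swaps in `-limit memory`, ImageMagick's resource-limit option, in place of `-cache`. Such a limit may not bound the memory used by the Ghostscript delegate that rasterizes the PDF.

```python
import subprocess


def build_thumbnail_command(pdf_path, out_path, page=0, memory_mib=20, scale_percent=10):
    """Return the argument list for an ImageMagick thumbnail call (hypothetical helper)."""
    return [
        "convert",
        # Cap the pixel cache held in RAM; pixels beyond this are cached on disk.
        "-limit", "memory", f"{memory_mib}MiB",
        # "[N]" selects a single page, counting from zero.
        f"{pdf_path}[{page}]",
        # A single percentage scales width and height by the same factor.
        "-resize", f"{scale_percent}%",
        out_path,
    ]


def make_thumbnail(pdf_path, out_path, **options):
    """Run the command; raises CalledProcessError if convert exits with an error."""
    subprocess.run(build_thumbnail_command(pdf_path, out_path, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without ImageMagick.
    print(" ".join(build_thumbnail_command("foo.pdf", "foo-thumb.png")))
```

Passing the arguments as a list avoids a shell, so the square brackets in `foo.pdf[0]` are not subject to shell globbing.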

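Or you could use -density, placed before the input file, to set the resolution at which the PDF is rasterized. The default is 72 dpi, so a lower value such as 20 scales the output down quite a lot:

convert -cache 20 -density 20 foo.pdf[0] foo-thumb.png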
Blender
  • Of course you can also specify a `-density` parameter if you have 3m² PDFs. Otherwise… default being 72 dots per inch you end up with… let me do some calculation… 24 megapixel image… – Benoit Mar 16 '11 at 16:39
  • That would be nice. I'll add in `-resize` too. – Blender Mar 16 '11 at 16:40
  • @Blender: Yes. But `-density` is what applies when invoking the delegate (be it Ghostscript) which is an external command. Maybe the delegate cannot be memory-bounded? – Benoit Mar 16 '11 at 16:41
  • I don't have any huge PDFs to try this on, but I would guess that Ghostscript has an option to restrict memory (and maybe `convert` passes it along). – Blender Mar 16 '11 at 16:45
  • Does it resize properly? I haven't worked with such huge PDF files, so maybe it would benefit others with the same problem. – Blender Mar 16 '11 at 20:27
  • You may want to add `-flatten` to avoid transparent background. For several pages that combines them into one, though. I have tried `-background white` instead, but that didn't work for me. But `-flatten` will do great for one-page thumbnails. – Titus Cieslewski Jun 02 '13 at 09:23
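
The estimate in the first comment can be reproduced with a few lines of arithmetic. This is a rough sketch assuming Python 3 and 3 bytes per pixel for uncompressed RGB; the function name `raster_estimate` is hypothetical, and a real rasterizer's working memory will likely be higher than the raw bitmap size.

```python
SQUARE_INCHES_PER_SQUARE_METRE = 1 / 0.0254 ** 2  # 1 inch = 0.0254 m


def raster_estimate(area_m2, dpi, bytes_per_pixel=3):
    """Return (megapixels, megabytes) for a page of the given area (hypothetical helper)."""
    pixels = area_m2 * SQUARE_INCHES_PER_SQUARE_METRE * dpi ** 2
    return pixels / 1e6, pixels * bytes_per_pixel / 1e6


if __name__ == "__main__":
    # Compare the 72 dpi default with two higher rasterization densities.
    for dpi in (72, 150, 300):
        megapixels, megabytes = raster_estimate(3, dpi)
        print(f"{dpi:>4} dpi: {megapixels:8.1f} MP, {megabytes:9.1f} MB")
```

At 72 dpi a 3 m² page comes out at roughly 24 megapixels, matching the comment, and the pixel count grows with the square of the density.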
0

Should you care? Current affordable servers have 512 GB of RAM. At 3 bytes per pixel, that is enough to hold a full-colour uncompressed bitmap roughly 350 inches (about 9 m) square at 1200 dpi, far larger than a 3 m² page. The performance hit you take from using disk is large.

Stephan Eggermont