How can I convert a series of images to a PDF from the command line on Linux?

Question

I have a scanning server I wrote in CGI and Bash. I want to be able to convert a bunch of images (all in one folder) to a PDF from the command line. How can that be done?

See also [How to generate a PDF from a series of images?](http://superuser.com/questions/687849/how-to-generate-a-pdf-from-a-series-of-images) on superuser. — zrajm, Dec 13 '13 at 10:21
Related: [Converting multiple image files from JPEG to PDF format](http://unix.stackexchange.com/q/29869/21471) at unix SE — kenorb, Feb 26 '15 at 15:59
Use [img2pdf](https://github.com/josch/img2pdf), not ImageMagick. ImageMagick decodes the JPEG, resulting in [generation loss](https://en.wikipedia.org/wiki/Generation_loss) and is 10–100 times slower than img2pdf. — Robert Fleming, Jan 19 '17 at 20:27
`img2pdf $(find . -iname '*.jpg' | sort -V) -o ./document.pdf` will give you `document.pdf` containing all images with jpg or JPG extension in the current dir - one image per page. `document.pdf` will have all images ordered as pages naturally (`-V` option for `sort`) so there is no need to add any leading zeros when numbering image files. — Jimmix, Apr 11 '20 at 18:52
I've asked and answered a [very similar question on SoftwareRecs.SX](https://softwarerecs.stackexchange.com/q/60187/15631). — einpoklum, Oct 15 '21 at 08:43
@philoopher97 Perhaps this is due to an unknown value in the Exif data that relates to the picture orientation (landscape/portrait). You may try to remove that value by stripping the whole Exif block ([link](https://linuxnightly.com/how-to-remove-exif-data-via-linux-command-line/)), or look for other software to edit that value. [Exif orientation values](https://sirv.com/help/articles/rotate-photos-to-be-upright/) — Jimmix, Oct 29 '21 at 22:49
See also: [Ask Ubuntu: Create a single pdf from multiple text, images or pdf files](https://askubuntu.com/questions/303849/create-a-single-pdf-from-multiple-text-images-or-pdf-files/1385947). I've added [an answer here](https://askubuntu.com/a/1385947/327339) which does OCR in the process. — Gabriel Staples, Jan 20 '22 at 07:21

score 503 · Accepted Answer · edited Sep 24 '22 at 20:05

Using ImageMagick, you can try:

`convert page.png page.pdf`

For multiple images:

`convert page*.png mydoc.pdf`
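
The shell expands `page*.png` in collation order, not numeric order, so `page_10.png` sorts ahead of `page_2.png`. One possible workaround is sketched below. It assumes GNU `sort` (for `-V`) and file names without embedded newlines; the helper names `natural_order` and `pages_to_pdf` are made up for illustration and are not part of ImageMagick.

```shell
# natural_order FILE...: print the given names one per line in "version"
# order, so that page_2.png comes before page_10.png. Needs GNU sort (-V).
natural_order() {
    printf '%s\n' "$@" | sort -V
}

# pages_to_pdf OUT.pdf FILE...: call ImageMagick's convert with the files
# in natural order and the output name last.
pages_to_pdf() {
    out=$1
    shift
    (
        sorted=$(natural_order "$@")
        # Split the sorted list on newlines only, so names containing
        # spaces stay intact; set -f prevents a second glob expansion.
        IFS='
'
        set -f
        convert $sorted "$out"
    )
}
```

With these defined, `pages_to_pdf mydoc.pdf page*.png` would pass the pages to `convert` in numeric page order.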

answered Jan 21 '12 at 18:22 by Marvin Pinto · edited Sep 24 '22 at 20:05 by Matthias Braun

What if `page*.png` does not sort the images the way you want? E.g. with page_1.png, page_2.png ... page_10.png, page_10 will appear before page_1. – vcarel Jul 17 '13 at 00:29
If the glob does not sort them the way you want, sort them yourself and pass your own list. – Andrzej Jozwik Sep 19 '13 at 07:24
That's nice, but how do I sort the files while making the PDF file? – Alsemany Jan 30 '14 at 03:32
To sort the files, you can use: `ls page*.png | sort -n | tr '\n' ' ' | sed 's/$/\ mydoc.pdf/' | xargs convert` – GaloisPlusPlus Feb 07 '14 at 13:01
FYI you *almost* never need to use `ls` for anything apart from displaying files, i.e. do not parse its output. `find` is a much more suitable tool. Here is an example: `convert $(find -maxdepth 1 -type f -name 'page*.png' | sort -n | paste -sd\ ) output.pdf`. Keep in mind that the aforementioned command will not work if your pathnames contain spaces. The addition of characters that need to be escaped makes things a little more complicated. – Six May 06 '15 at 12:49
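
For path names that do contain spaces, one option is to keep the file list NUL-delimited from end to end. The sketch below assumes GNU `find` (`-print0`), GNU `sort` (`-z`, `-V`) and GNU `xargs` (`-0`, `-r`); `list_pages_nul` and `pages_dir_to_pdf` are hypothetical helper names.

```shell
# list_pages_nul DIR: print the page*.png files directly inside DIR,
# NUL-delimited and in natural (version) order. Safe for names that
# contain spaces or newlines.
list_pages_nul() {
    find "$1" -maxdepth 1 -type f -name 'page*.png' -print0 | sort -zV
}

# pages_dir_to_pdf DIR OUT.pdf: hand the sorted list to convert.
# The inner sh -c is there only so the output name ($0) can follow the
# input files ("$@"); -r skips the call when no files were found.
pages_dir_to_pdf() {
    list_pages_nul "$1" | xargs -0 -r sh -c 'convert "$@" "$0"' "$2"
}
```

Note that `xargs` may split a very long list into several invocations, in which case a later `convert` run would overwrite the output of an earlier one; for large batches, converting in chunks and joining the parts afterwards is safer.
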
On my machine `convert` rapidly consumes all available memory (that's 8 GB of RAM), hangs the entire system and kills KDE. The only way out is `ctrl+alt+f1` and `sudo reboot`. Doesn't look like the best solution to me. – Pastafarianist Jun 23 '15 at 14:36
the order is not working for me – gal007 Sep 28 '15 at 15:15
This is simple and works very well, thank you! To avoid generating huge PDF files, use something like `convert -compress jpeg -quality 85 *.png out.pdf` – jlh Nov 18 '15 at 17:40
When converting multiple files, `convert` consumed all available disk space and then failed, so I converted them into separate PDF files and joined those using `pdfunite`. – Tereza Tomcova Dec 03 '16 at 19:49
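
The workaround described in this comment, converting each image to its own PDF and then joining the results, might look like the following sketch. It assumes `convert` (ImageMagick) and `pdfunite` (poppler-utils) are on `PATH`; the function name and the temporary file naming are illustrative only.

```shell
# images_to_pdf_stepwise OUT.pdf IMAGE...
# Convert every image to a one-page PDF in a temporary directory, then
# join the parts with pdfunite, so convert handles one image at a time.
images_to_pdf_stepwise() {
    out=$1
    shift
    workdir=$(mktemp -d) || return 1
    n=0
    for img in "$@"; do
        n=$((n + 1))
        # Zero-padded names keep the parts in input order when globbed.
        convert "$img" "$(printf '%s/part-%06d.pdf' "$workdir" "$n")" || {
            rm -rf "$workdir"
            return 1
        }
    done
    pdfunite "$workdir"/part-*.pdf "$out"
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

A call such as `images_to_pdf_stepwise mydoc.pdf page*.png` keeps the order in which the shell passes the files, so any sorting has to happen before the call.
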
ImageMagick decodes the JPEG, resulting in [generation loss](https://en.wikipedia.org/wiki/Generation_loss). Use [img2pdf](https://github.com/josch/img2pdf) instead; it's also 10–100 times faster. – Robert Fleming Jan 19 '17 at 20:29
`ls` [can do the sorting](https://stackoverflow.com/a/21279329/1959808). – 0 _ Jul 10 '17 at 22:05
@GaloisPlusPlus a more advanced sorting method: `sort --version-sort`, a _natural sort of (version) numbers within text_. That will correctly sort things like 1.2.3, 1.22.2, 1.222.0 – Orwellophile Mar 16 '20 at 11:27
How can I reduce the size of the output PDF file? The generated PDF is almost the size of all the images combined. – Mohith7548 Jun 15 '20 at 11:18
I got `PDF' @ error/constitute.c/IsCoderAuthorized/408.` – Lori Apr 16 '21 at 02:22
You can shorten @GaloisPlusPlus's command using: `ls page*.png | sort -n | xargs sh -c 'convert "$@" mydoc.pdf' sh` – jpap Mar 05 '23 at 02:42

score 58 · Answer 2 · edited Aug 30 '23 at 06:07

Use img2pdf instead of convert from ImageMagick. For example:

`img2pdf im1.png im2.jpg -o out.pdf`

To include all .jpg images in the current working directory:

`img2pdf *.jpg -o out.pdf`

Why img2pdf vs. convert?

To summarise comments, ImageMagick's `convert`:

- decodes the JPEG, resulting in generation loss;
- is slower than img2pdf;
- requires PDF creation to be enabled (off by default) due to security issues; and
- has issues/limitations when handling large/many images.
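
The "off by default" restriction mentioned above usually takes the form of a `coder` rule with `rights="none"` for the `PDF` pattern in ImageMagick's `policy.xml`. The sketch below only detects such a rule. The path `/etc/ImageMagick-6/policy.xml` is an assumption (it varies by distribution and ImageMagick version), and `pdf_coder_blocked` is a hypothetical helper name.

```shell
# pdf_coder_blocked [POLICY_FILE]: succeed (exit 0) if the policy file has
# a coder rule for PDF with rights="none", i.e. convert is likely to
# refuse to write PDFs. This is a rough line-based check: it skips
# single-line XML comments but does not parse the XML properly.
pdf_coder_blocked() {
    policy=${1:-/etc/ImageMagick-6/policy.xml}   # assumed default location
    [ -r "$policy" ] || return 1
    grep -v '<!--' "$policy" \
        | grep 'domain="coder"' \
        | grep 'rights="none"' \
        | grep -q 'pattern="[^"]*PDF'
}

# To lift the restriction, an administrator would typically change that
# rule to rights="read|write" (review the security implications first):
#   sudo sed -i '/pattern="PDF"/s/rights="none"/rights="read|write"/' /etc/ImageMagick-6/policy.xml
```

Running `pdf_coder_blocked && echo "PDF output is disabled by policy"` before a batch job gives an early hint instead of a failure part-way through.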

answered Jan 21 '12 at 18:21 by ziesemer · edited Aug 30 '23 at 06:07 by Kyle F Hartzenberg

ImageMagick decodes the JPEG, resulting in [generation loss](https://en.wikipedia.org/wiki/Generation_loss). Use [img2pdf](https://github.com/josch/img2pdf) instead; it's also 10–100 times faster. – Robert Fleming Jan 19 '17 at 20:30
Note: img2pdf has moved to https://gitlab.mister-muffin.de/josch/img2pdf. – kelvin May 17 '19 at 01:55
@RobertFleming, Kelvin, your suggestions are awesome, too bad we cannot add them as a proper answer to this thread. Cheers – Azurtree Apr 04 '22 at 11:44
To install: `sudo apt update && sudo apt install img2pdf` – Gabriel Staples Jun 28 '22 at 05:03
I have an [issue open](https://github.com/ElectricRCAircraftGuy/PDF2SearchablePDF/issues/23) in my [`pdf2searchablepdf`](https://github.com/ElectricRCAircraftGuy/PDF2SearchablePDF) project to allow images to be used as inputs too, so they can be converted to searchable PDFs via `tesseract`. Meanwhile, I've used your answer in my work-around. – Gabriel Staples Jun 28 '22 at 05:19
There are so many more problems with `convert`. Due to security problems, PDF creation is disabled by default and needs to be turned on in a config file. It even crashes when concatenating between 100 and 200 images (at about 10 MiB PDF size) due to "exhausted cache memory", which forces the user to implement a loop and concatenate the PDFs afterwards; it's very stupid. Also, the PDF produced by `convert` wastes more memory than the one from `img2pdf`. Still, it is unbelievable that `img2pdf` is not installed by default on Ubuntu. – ChrisoLosoph May 28 '23 at 04:42
