291

I have a scanning server I wrote in CGI and Bash. I want to be able to convert a bunch of images (all in one folder) to a PDF from the command line. How can that be done?

Matthias Braun
Jakob Weisblat
  • See also [How to generate a PDF from a series of images?](http://superuser.com/questions/687849/how-to-generate-a-pdf-from-a-series-of-images) on superuser. – zrajm Dec 13 '13 at 10:21
  • 2
    Related: [Converting multiple image files from JPEG to PDF format](http://unix.stackexchange.com/q/29869/21471) at unix SE – kenorb Feb 26 '15 at 15:59
  • 21
    Use [img2pdf](https://github.com/josch/img2pdf), not ImageMagick. ImageMagick decodes the JPEG, resulting in [generation loss](https://en.wikipedia.org/wiki/Generation_loss) and is 10–100 times slower than img2pdf. – Robert Fleming Jan 19 '17 at 20:27
  • 2
    `sudo apt-get install gscan2pdf` for simple and easy use. – Haziq Jan 18 '18 at 06:31
  • 4
    `img2pdf $(find . -iname '*.jpg' | sort -V) -o ./document.pdf` will give you `document.pdf` containing all images with jpg or JPG extension in the current dir - one image per page. `document.pdf` will have all images ordered as pages naturally (`-V` option for `sort`) so there is no need to add any leading zeros when numbering image files. – Jimmix Apr 11 '20 at 18:52
  • I've asked and answered a [very similar question on SoftwareRecs.SX](https://softwarerecs.stackexchange.com/q/60187/15631). – einpoklum Oct 15 '21 at 08:43
  • @Jimmix I got an error `invalid rotation(0)`. – philoopher97 Oct 29 '21 at 03:26
  • @philoopher97 Perhaps this is due to an unknown value in the Exif data that relates to the picture orientation (landscape/portrait). You may try to remove that value by removing the whole Exif block ([link](https://linuxnightly.com/how-to-remove-exif-data-via-linux-command-line/)) or look for other software to edit that value. See [Exif orientation values](https://sirv.com/help/articles/rotate-photos-to-be-upright/). – Jimmix Oct 29 '21 at 22:49
  • See also: [Ask Ubuntu: Create a single pdf from multiple text, images or pdf files](https://askubuntu.com/questions/303849/create-a-single-pdf-from-multiple-text-images-or-pdf-files/1385947). I've added [an answer here](https://askubuntu.com/a/1385947/327339) which does OCR in the process. – Gabriel Staples Jan 20 '22 at 07:21

2 Answers

503

Using ImageMagick, you can try:

    convert page.png page.pdf

For multiple images:

    convert page*.png mydoc.pdf
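
The glob `page*.png` expands in lexical order, so `page_10.png` lands before `page_2.png`. One possible way around that is sketched below. It assumes GNU `find`, `sort` and `xargs` (for `-print0`, `-z`/`-V` and `-0`/`-r`), and `natural_sorted` is only an illustrative name, not part of ImageMagick.

```shell
# Print the files directly inside DIR ($1) that match PATTERN ($2) in
# natural (version) order, NUL-delimited so that names containing spaces
# or newlines survive. Assumes GNU find and GNU sort.
natural_sorted() {
    find "$1" -maxdepth 1 -type f -name "$2" -print0 | sort -zV
}

# Usage: hand the ordered list to convert, with the output name last.
# natural_sorted . 'page*.png' | xargs -0 -r sh -c 'convert "$@" mydoc.pdf' sh
```

The inner `sh -c` is only there because `convert` expects the output file after the inputs. For extremely long file lists `xargs` may split the work into several invocations, so this sketch suits folders of a few hundred pages rather than unbounded input.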
Matthias Braun
Marvin Pinto
  • 8
    what if page*.png does not sort the images in the way you want ? e.g. page_1.png, page_2.png ... page_10.png -> page_10 will appear before page_1 – vcarel Jul 17 '13 at 00:29
  • If the files do not sort the way you want, sort them yourself and build your own list. – Andrzej Jozwik Sep 19 '13 at 07:24
  • 1
    That's nice but how to sort files while making the pdf file? – Alsemany Jan 30 '14 at 03:32
  • 48
    To sort the files, you can use: `ls page*.png | sort -n | tr '\n' ' ' | sed 's/$/\ mydoc.pdf/' | xargs convert` – GaloisPlusPlus Feb 07 '14 at 13:01
  • 39
    FYI you *almost* never need to use `ls` for anything apart from displaying files... i.e. do not parse its output. `find` is a much more suitable tool. Here is an example `convert $(find -maxdepth 1 -type f -name 'page*.png' | sort -n | paste -sd\ ) output.pdf`. Keep in mind that the aforementioned command will not work if your pathnames contain spaces. The addition of characters that need to be escaped makes things a little more complicated. – Six May 06 '15 at 12:49
  • 1
    On my machine `convert` rapidly consumes all available memory (that's 8 GB of RAM), hangs the entire system and kills KDE. The only way out is `ctrl+alt+f1` and `sudo reboot`. Doesn't look like the best solution to me. – Pastafarianist Jun 23 '15 at 14:36
  • the order is not working for me – gal007 Sep 28 '15 at 15:15
  • 25
    This is simple and works very well, thank you! To avoid generating huge PDF files, use something like `convert -compress jpeg -quality 85 *.png out.pdf` – jlh Nov 18 '15 at 17:40
  • 1
    When converting multiple files, `convert` consumed all available disk space then failed so I converted them into separate pdf files and joined using `pdfunite`. – Tereza Tomcova Dec 03 '16 at 19:49
  • 25
    ImageMagick decodes the JPEG, resulting in [generation loss](https://en.wikipedia.org/wiki/Generation_loss). Use [img2pdf](https://github.com/josch/img2pdf) instead; it's also 10–100 times faster. – Robert Fleming Jan 19 '17 at 20:29
  • 1
    `ls` [can do the sorting](https://stackoverflow.com/a/21279329/1959808). – 0 _ Jul 10 '17 at 22:05
  • 1
    @GaloisPlusPlus a more advanced sorting method: `sort --version-sort`, a _natural sort of (version) numbers within text_. That will correctly sort things like 1.2.3, 1.22.2, 1.222.0 – Orwellophile Mar 16 '20 at 11:27
  • How to reduce the size of the output pdf file? The generated pdf file is almost the size of all the images combined. – Mohith7548 Jun 15 '20 at 11:18
  • 5
    I got `PDF' @ error/constitute.c/IsCoderAuthorized/408.` – Lori Apr 16 '21 at 02:22
  • 1
    You can shorten @GaloisPlusPlus using: `ls page*.png | sort -n | xargs -I % convert % mydoc.pdf` – jpap Mar 05 '23 at 02:42
58

Use img2pdf instead of convert from ImageMagick. For example:

    img2pdf im1.png im2.jpg -o out.pdf

To include all .jpg images in the current working directory:

    img2pdf *.jpg -o out.pdf
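
Note that `*.jpg` is case-sensitive and expands in lexical order. A small wrapper along the following lines could pick up `.JPG` files as well and keep the pages in natural order. It is only a sketch: it assumes GNU `find`, `sort` and `xargs` plus `img2pdf` on the `PATH`, and `jpgs_to_pdf` is a made-up name.

```shell
# Build the PDF named by $2 from every .jpg/.JPG file directly inside the
# directory $1, one image per page, in natural filename order.
# Assumes GNU find/sort/xargs and an img2pdf executable on PATH.
jpgs_to_pdf() {
    find "$1" -maxdepth 1 -type f -iname '*.jpg' -print0 |
        sort -zV |
        xargs -0 -r img2pdf -o "$2"
}

# Usage:
# jpgs_to_pdf . out.pdf
```

Because `img2pdf` accepts `-o` before the input files, `xargs` can append the sorted list directly, with no inner shell needed.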

Why img2pdf vs. convert?

To summarise comments, ImageMagick's convert:

  • decodes the JPEG, resulting in generation loss;
  • is slower than img2pdf (10–100 times slower, according to the comments);
  • requires PDF creation to be enabled (off by default) due to security issues; and
  • has issues/limitations when handling large/many images.
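
For the last point, the workaround reported in the comments is to convert each image to its own PDF and then join the results with `pdfunite`. A rough sketch of that idea follows. It assumes GNU `find`, `sort` and `xargs`, ImageMagick's `convert`, and `pdfunite` from poppler-utils; `images_to_pdf_stepwise` is a hypothetical name.

```shell
# Convert every PNG directly inside SRC ($1) to its own one-page PDF, then
# join the pages into OUT ($2) with pdfunite in natural filename order.
# Assumes GNU find/sort/xargs, ImageMagick convert and poppler's pdfunite.
images_to_pdf_stepwise() {
    src=$1
    out=$2
    work=$(mktemp -d) || return 1

    for f in "$src"/*.png; do
        [ -e "$f" ] || continue    # the glob matched nothing
        convert "$f" "$work/$(basename "$f").pdf" || { rm -rf "$work"; return 1; }
    done

    # xargs appends the sorted page PDFs after "$out"; the inner shell
    # moves the output name to the end, where pdfunite expects it.
    find "$work" -type f -name '*.pdf' -print0 | sort -zV |
        xargs -0 -r sh -c 'out=$1; shift; pdfunite "$@" "$out"' sh "$out"
    status=$?

    rm -rf "$work"
    return "$status"
}

# Usage:
# images_to_pdf_stepwise ./scans mydoc.pdf
```

Each `convert` call only ever sees one image, which is the point of the workaround. The intermediate PDFs live in a temporary directory that is removed afterwards.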
Kyle F Hartzenberg
ziesemer
  • 50
    ImageMagick decodes the JPEG, resulting in [generation loss](https://en.wikipedia.org/wiki/Generation_loss). Use [img2pdf](https://github.com/josch/img2pdf) instead; it's also 10–100 times faster. – Robert Fleming Jan 19 '17 at 20:30
  • 18
    Note: img2pdf has moved to https://gitlab.mister-muffin.de/josch/img2pdf. – kelvin May 17 '19 at 01:55
  • @RobertFleming, Kelvin, your suggestions are awesome, too bad we cannot add them as a proper answer to this thread. Cheers – Azurtree Apr 04 '22 at 11:44
  • 2
    To install: `sudo apt update && sudo apt install img2pdf` – Gabriel Staples Jun 28 '22 at 05:03
  • I have an [issue open](https://github.com/ElectricRCAircraftGuy/PDF2SearchablePDF/issues/23) in my [`pdf2searchablepdf`](https://github.com/ElectricRCAircraftGuy/PDF2SearchablePDF) project to allow images to be used as inputs too, so they can be converted to searchable PDFs via `tesseract`. Meanwhile, I've used your answer in my work-around. – Gabriel Staples Jun 28 '22 at 05:19
  • 1
    There are so many more problems with `convert`. Due to security problems, PDF creation is disabled by default and needs to be turned on in a config file. It even crashes when concatenating between 100 and 200 images (at about 10 MiB PDF size) due to "exhausted cache memory", requiring the user to implement a loop and concatenate the PDFs afterwards, which is very stupid. Also, the resulting PDF with `convert` wastes more memory than `img2pdf`. Still, it is unbelievable that `img2pdf` is not installed by default on Ubuntu. – ChrisoLosoph May 28 '23 at 04:42