
This can concatenate PDFs (source):

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=out.pdf in1.pdf in2.pdf

but it's slow when concatenating hundreds of PDFs.

Is there a way to parallelize PDF concatenation, such as by using GNU Parallel somehow?

Geremia
  • Probably. Ignoring **GNU Parallel** for a moment, please show the first 3 `gs` commands you would want to run in parallel so we can discern the pattern of the parameters. – Mark Setchell Mar 15 '21 at 07:36
  • @MarkSetchell I'm sure some recursive, divide-and-conquer function could be constructed so that each execution of `gs` just concats two PDFs. – Geremia Mar 15 '21 at 21:08
  • Please give some indication of what the first 3 commands would be, without **GNU Parallel**. And also indicate whether the PDFs are all in the same directory. – Mark Setchell Mar 15 '21 at 21:15
  • @MarkSetchell Why 3? And, yes, they're all in the same directory. – Geremia Mar 15 '21 at 21:16
  • I still don't know if you want all 100+ concatenated into a single PDF or if you just want the number reduced by half by pairing them, or which ones to pair with which, or how the output files should be named... – Mark Setchell Mar 15 '21 at 21:17
  • @MarkSetchell I want to concatenate all PDFs in a directory into one single PDF. – Geremia Mar 15 '21 at 21:22
  • Have you tried timing how long it takes to concatenate say 10 PDFs into a single output PDF and then timing how long it takes to make 2 PDFs with 5 in each and then combine the result? – Mark Setchell Mar 15 '21 at 21:29
  • Have you tried `pdfunite` and/or `pdftk`? https://linoxide.com/merge-pdf-files-linux/ – Mark Setchell Mar 15 '21 at 21:30
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/229960/discussion-between-geremia-and-mark-setchell). – Geremia Mar 16 '21 at 04:39
  • I have deleted my answer as your experiments appear to show that the task is inherently sequential and therefore not parallelisable with this approach. – Mark Setchell Mar 17 '21 at 06:44
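
The divide-and-conquer idea raised in the comments could be sketched roughly as follows. This is only an illustration of the pattern, not a recommendation: the last comment reports that experiments showed no benefit from this approach. The function name `pair_round` and the round-prefix output naming are hypothetical, not from the thread. The function only prints the commands for one round of pairwise merging, so nothing is run until the output is piped somewhere.

```shell
#!/usr/bin/env bash
# Sketch: print the gs commands for ONE round of pairwise merging.
# Each printed line is an independent command, so a whole round can be
# handed to GNU Parallel, or to `sh` to run it serially.
# pair_round and the <prefix>_NNNN.pdf naming are hypothetical.
pair_round() {
  local prefix=$1; shift
  local i=0
  while [ "$#" -ge 2 ]; do
    # Zero-padded index keeps glob order equal to page order in later rounds.
    printf 'gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=%q_%04d.pdf %q %q\n' \
      "$prefix" "$i" "$1" "$2"
    shift 2
    i=$((i + 1))
  done
  # An odd file out is copied into the next round unchanged.
  if [ "$#" -eq 1 ]; then
    printf 'cp %q %q_%04d.pdf\n' "$1" "$prefix" "$i"
  fi
}
```

Assuming the directory holds only the input PDFs, a first round might be run as `pair_round r1 *.pdf | parallel`, then `pair_round r2 r1_*.pdf | parallel`, and so on until a single file remains. GNU Parallel treats each input line as a command when no command is given on its own command line.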

1 Answer


You are going to say this is cheating:

pdftk *.pdf cat output /tmp/my.pdf

But it is waaay faster than gs.

Ole Tange