Assuming you want to use ImageMagick (and only ImageMagick) for this: that can't be done. ImageMagick cannot process PDF input all by itself. It has to make use of Ghostscript anyway, so without a local Ghostscript installation it won't work. (You will not necessarily see Ghostscript at work while you feed PDF input to ImageMagick, unless you add -verbose to its command line, because ImageMagick's delegation of the job to Ghostscript happens behind your back.)
Your question has two parts:
- "Is there a way to "detect" extra wide pages, like the center spreads?"
- "Is there a way to crop the left and right parts from center spreads as two separate pages?"
Detect page sizes
You can use ImageMagick's identify to detect the page sizes of a PDF.
Just run the simplest command:
identify multipage.pdf
The output will be something like
multipage.pdf[0] PDF 595x792 595x792+0+0 16-bit Bilevel DirectClass 59.5KB 0.000u 0:00.000
multipage.pdf[1] PDF 595x792 595x792+0+0 16-bit Bilevel DirectClass 59.5KB 0.000u 0:00.000
multipage.pdf[2] PDF 595x792 595x792+0+0 16-bit Bilevel DirectClass 59.5KB 0.000u 0:00.000
multipage.pdf[3] PDF 595x792 595x792+0+0 16-bit Bilevel DirectClass 59.5KB 0.000u 0:00.000
The output's page count is 0-based, so [0] indicates the first page, [1] the second page, etc.
To customize the output a bit better, you could do this:
identify -format '%f, page %s + 1: %W x %H\n' multipage.pdf
and get
multipage.pdf, page 0 + 1: 595 x 792
multipage.pdf, page 1 + 1: 595 x 792
multipage.pdf, page 2 + 1: 595 x 792
multipage.pdf, page 3 + 1: 595 x 792
For a double-spread page the respective output should be 1190 x 792 or similar.
However, be warned: using ImageMagick to query the page sizes of PDF files is very slow. Therefore, better use a different tool for this sub-task: pdfinfo. This will be faster by several orders of magnitude:
pdfinfo -f 1 -l 1000 multipage.pdf
will output
Pages: 4
Page 1 size: 595 x 792 pts
Page 1 rot: 0
Page 2 size: 595 x 792 pts
Page 2 rot: 0
Page 3 size: 595 x 792 pts
Page 3 rot: 0
Page 4 size: 595 x 792 pts
Page 4 rot: 0
If you need additional info about the pages' ArtBox, TrimBox, BleedBox and CropBox values, just add -box to the command line.
As I said: pdfinfo is significantly faster in identifying page sizes for PDFs than ImageMagick is. Use the right tool for the job.
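To turn the pdfinfo output into a list of candidate spreads, a small filter along these lines may help. It is only a sketch: find_spreads is a hypothetical helper name, and it assumes that a spread is any page that is wider than it is tall, which holds for portrait documents like the 595 x 792 example but not for landscape ones.

```shell
# Hypothetical helper: reads the output of `pdfinfo -f 1 -l N file.pdf` on stdin
# and prints the 1-based number of every page that is wider than it is tall.
# Assumption: in a portrait document, only double spreads are landscape.
find_spreads() {
    awk '$1 == "Page" && $3 == "size:" && ($4 + 0) > ($6 + 0) { print $2 }'
}

# Usage:
#   pdfinfo -f 1 -l 1000 multipage.pdf | find_spreads
```

Remember that pdfinfo numbers pages from 1, while ImageMagick's [n] index starts at 0, so subtract 1 before passing a page number to ImageMagick.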
Crop left and right parts of a page
Now that you have identified the large double-spread page, you could use one of the following methods (based on Ghostscript) to split down the pages in the middle:
Adapting the method described in the links above will result in 2 PDF pages that still contain all their original vector and font info.
Alternatively, you can use ImageMagick. Assuming your 'double-spread' page is of dimension 1190x842 pt, based on A4 (595x842 pt), and assuming it is page 16 (which translates to [15] for ImageMagick) inside an original PDF, your convert commands could be something like:
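convert multipage.pdf[15] -crop 595x842+0+0 +repage page16-left.png
convert multipage.pdf[15] -crop 595x842+595+0 +repage page16-right.png
The result gives you two raster images. The +repage resets the virtual canvas, so the right half does not carry its crop offset into the PNG.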
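For spreads of other sizes, the two -crop geometries follow from the page width and height. The sketch below shows that arithmetic; half_geometries is a hypothetical helper name, and it assumes whole-number dimensions in pixels (at ImageMagick's default 72 dpi density for PDF input, one point maps to one pixel).

```shell
# Hypothetical helper: given the width and height of a spread (integers),
# print the -crop geometry for the left half, then for the right half.
half_geometries() {
    w=$1
    h=$2
    half=$((w / 2))                                     # width of the left half
    printf '%sx%s+0+0\n' "$half" "$h"                   # left: starts at x=0
    printf '%sx%s+%s+0\n' "$((w - half))" "$h" "$half"  # right: starts at x=half
}

# Usage:
#   half_geometries 1190 842
```

For an odd width, the right half ends up one pixel wider than the left, so no column is dropped.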