I would like to crop only some pages of a multipage PDF while keeping all pages: some cropped, others not. I tried the following, but it "deletes" the non-cropped pages...

gswin64.exe -o cropped.pdf -sDEVICE=pdfwrite  -dFirstPage=3 -dLastPage=4 -c "[/CropBox [24 72 1559 1794]" -c " /PAGES pdfmark" -f input.pdf

I've seen the posts on different cropping on odd and even pages, but I could not figure out how to apply this to a certain page in a multipage document.

gswin64.exe -o cropped.pdf -sDEVICE=pdfwrite -c "<</EndPage {0 eq {2 mod 0 eq {[/CropBox [0 0 1612 1792] /PAGE pdfmark true}{[/CropBox [500 500 612 792] /PAGE pdfmark true} ifelse}{false}ifelse}>> setpagedevice" -f input.pdf

This crops all pages according to the settings of the second CropBox. If anybody is wondering about the large margins: I apply this to large drawings. I have also tried substituting some operators so that the crop applies only to a certain page number: "sub 4" instead of "2 mod" was one attempt to satisfy the "0 eq" condition only when the current page number reaches 4.

JPF
  • Possible duplicate of [Cropping a PDF using Ghostscript 9.01](http://stackoverflow.com/questions/6183479/cropping-a-pdf-using-ghostscript-9-01) – ElChupacabra Oct 27 '16 at 19:14
  • My issue is not treated in the post you recommended, I have worked through it before asking. – JPF Oct 27 '16 at 19:32

1 Answer

OK first things first, Ghostscript and the pdfwrite device do not 'modify' an input PDF file. For regular readers; standard lecture here, if you've read it before you can skip the following paragraph.

The way this works is that the input file is completely interpreted into a sequence of graphics primitives which are sent to the device. Rendering devices then call the graphics library to render the primitives to a bitmap, which is then output. High level (vector) devices, such as pdfwrite, translate the primitives into equivalent operations in some high level page description language, and emit that.

So, when you select -dFirstPage and -dLastPage, those are the only pages of the input file you are choosing to process. pdfwrite isn't 'deleting' your pages; you never sent them to the device in the first place.

Now, Ghostscript is a PostScript interpreter, and therefore its action can be affected by writing PostScript programs. In your case you probably want to actually process all the pages (so drop -dFirstPage and -dLastPage), but only write the pdfmark on selected pages.

The way to do this is via a BeginPage or EndPage procedure. If you search here or in the PostScript tag you'll find a number of examples. Fundamentally, BeginPage is called with a count of pages so far, and EndPage is called with that count plus a reason code.

From memory you will want to check that the reason code is 0 (the page is ending because of showpage). If it is, then you want to check the count of pages, and if it matches your criteria (in the case here, count is 3 or 4), execute the /PAGE pdfmark. In any case you want to return 'true' so that the page is emitted.
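
A minimal sketch of such an EndPage procedure, assuming the operand order given in the PostScript Language Reference (page count underneath, reason code on top; 0 = showpage, 1 = copypage, 2 = device deactivation). It does no cropping at all. It only prints the page count it receives and emits every page, which is a cheap way to see what values actually arrive before adding any pdfmark logic:

```postscript
% Diagnostic EndPage sketch: print the page count, emit every showpage.
<<
  /EndPage {              % stack on entry: pagecount reason
    dup 0 eq {            % reason 0: page ended by showpage
      pop                 % drop reason, leaving pagecount
      (showpage, page count so far: ) print
      ==                  % print and consume pagecount
      flush
      true                % emit the page
    }{
      pop pop false       % other reasons: discard both operands
    } ifelse
  }
>> setpagedevice
```

Saved to a file and placed before the input PDF on the command line, this should write one line per page to the back channel.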

[EDIT added here]

Hmm, OK I see the problem. What's happening is that the PDF interpreter is calling 'setpagedevice' to set the page size for each page, in case the page size has altered. The problem is that this resets the page count back to 0 each time.

Now, I wouldn't normally suggest the following, because it relies on some undocumented aspects of Ghostscript's PDF interpreter. However, I happen to know that the PDF interpreter tracks the page number internally using a named object called /Page#.

So, if I take the code you wrote, and modify it slightly:

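<<
  /EndPage {                      % stack on entry: pagecount reason
    0 eq {                        % reason 0 = showpage
      pop /Page# where {          % discard pagecount, look for /Page#
        /Page# get
        3 eq {
          (page 3) == flush
          [/CropBox [0 0 1612 1792] /PAGE pdfmark
          true
        }
        {
          (not page 3) == flush
          [/CropBox [500 500 612 792] /PAGE pdfmark
          true
        } ifelse
      }{
        true                      % /Page# not defined: emit page unchanged
      } ifelse
    }
    {
      pop false                   % other reasons: discard pagecount
    }
    ifelse
  }
>> setpagedevice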

A couple of things to note: there's some debug in there. The lines with '== flush' print out some stuff on the back channel so you know how each page is being handled. If /Page# isn't defined, then the code simply leaves everything alone; this is just some basic safety-first stuff.

Rather than type all this on the command line (which also loses indenting and is hard to read) I stuck it in a file, called test.ps, then invoked GS as:

gswin32c -sDEVICE=pdfwrite -sOutputFile=out.pdf test.ps input.pdf

It's not the neatest solution in the world, but it works for me.
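
For the case in the question (crop pages 3 and 4, leave every other page untouched), the same idea could be adapted roughly as follows. This is a sketch only: it relies on the same undocumented /Page# object, so it may behave differently on other Ghostscript versions, and the CropBox values are simply the ones from the first command in the question.

```postscript
% Sketch: apply a CropBox to pages 3 and 4 only; all pages are emitted.
<<
  /EndPage {                      % stack on entry: pagecount reason
    0 eq {                        % reason 0 = showpage
      pop                         % pagecount is not used
      /Page# where {              % undocumented: PDF interpreter's page number
        /Page# get
        dup 3 eq exch 4 eq or {   % true when the page number is 3 or 4
          [/CropBox [24 72 1559 1794] /PAGE pdfmark
        } if
      } if
      true                        % emit the page, cropped or not
    }{
      pop false                   % other reasons: discard pagecount
    } ifelse
  }
>> setpagedevice
```

It would be invoked the same way as test.ps above, with the PostScript file ahead of the input PDF and without -dFirstPage or -dLastPage.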

KenS
  • I tried using the code provided by Laura as a base and it partially works: [here](http://stackoverflow.com/questions/30413904/pdf-crop-even-odd-pages-with-php-ghostscript) – JPF Oct 28 '16 at 08:23
  • Just that they are always cropped according to the second CropBox. I just can't figure out how one knows which variable is on the stack. Isn't there a way to simply hand over the page number? And then: how do you use these operators eq/mod? I guess this is PostScript-specific? – JPF Oct 28 '16 at 08:31
  • eq and mod are the equality and modulus operators. There's no way to tell what's on the stack, you just have to know, so stack tracking is a particular PostScript skill (there are other similar languages). There's no way to do what you want to do without some programming, and since it isn't a frequent requirement there's no likelihood the Ghostscript developers will add it for you, so you need to do it yourself. It would probably help if you posted the code you've written rather than pointing to another answer. You have been careful to use the /PAGE pdfmark, and not /PAGES haven't you ? – KenS Oct 28 '16 at 13:46
  • Thank You KenS. I have added my "work in progress" command in the original post. – JPF Oct 28 '16 at 16:32
  • KenS, I am inclined to call you my hero. Thank You so much! The whole structure of your approach makes me understand a number of things I was struggling with. It works for me as well, and I will try to continue on this basis. Thanks again for sharing your knowledge! – JPF Oct 31 '16 at 08:42
  • Pleased to hear it works for you as well. Like I said, not the nicest solution, but I can't think of a better one offhand. – KenS Oct 31 '16 at 13:39
  • Ken, since I am new to Stackoverflow I am unsure how to proceed: I would like to broaden the issue to using an array of values to be able to crop any page of a pdf to an individual size. Should I open a new question or edit the above one? – JPF Nov 10 '16 at 10:13
  • Probably best to open a new question, SO complains if you get extended comments. You may well want to tag it as 'PostScript' instead of 'PDF' since it's more likely a PostScript programming question than anything to do with PDF. (I know, you want to affect a PDF, but it's all done with PostScript). – KenS Nov 10 '16 at 11:22
  • Thank You, I will start a new question shortly. – JPF Nov 10 '16 at 15:33
  • I finally got to posting the question... [here](http://stackoverflow.com/questions/40803411/how-to-crop-a-multipage-pdf-using-ghostscript-with-an-array-of-page-specific-cro) – JPF Nov 25 '16 at 11:00