
In ghostscript-crop-pdf-not-correctly, I got a cropped PDF, but it only appears to be cropped. The content outside the crop area is in fact still present in the file.

I found no solution in ghostscript-removes-content-outside-the-crop-box, how-to-truly-crop-a-pdf-file, pdf-real-cropping, cropping-a-pdf-using-ghostscript-9-01 or itext-crop-out-a-part-of-pdf-file. Maybe a virtual PDF printer is the only way.

Using Ghostscript or iText, is there any way to really crop a PDF file, so that the content outside the crop area is removed?

Joris Schellekens
zhusp

2 Answers


A very straightforward (but perhaps not the most intelligent) way of solving your problem is to use pdfSweep.

pdfSweep is an iText 7 add-on that allows you to redact (remove) content.

It allows you to remove content by:

  • specifying a regular expression
  • specifying a rectangle (or rectangles)

In your case, you could calculate the rectangles that cover everything outside the area you want to keep, and then apply pdfSweep to them.

If you then crop the page to the remaining area, the content outside it would really be gone.

More information (including code samples) can be found here.
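The rectangle calculation mentioned above can be sketched as follows. This is only an illustration in plain Python, not pdfSweep code: `Rect` and `rectangles_outside` are hypothetical names, and the 596 x 841 point page size is an assumption based on the 298 x 841 crop used in the question. Rectangles are given as (x, y, width, height) with the origin at the bottom-left corner, which is how PDF user space is normally described.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle: lower-left corner plus size, in points."""
    x: float
    y: float
    width: float
    height: float


def rectangles_outside(page: Rect, keep: Rect) -> List[Rect]:
    """Return up to four strips that cover the page area outside `keep`.

    Hypothetical helper: the strips are the regions one would hand to a
    redaction tool before cropping the page to `keep`.
    """
    page_right = page.x + page.width
    page_top = page.y + page.height

    # Clamp the area to keep so that it lies inside the page.
    left = max(page.x, keep.x)
    bottom = max(page.y, keep.y)
    right = min(page_right, keep.x + keep.width)
    top = min(page_top, keep.y + keep.height)

    if right <= left or top <= bottom:
        # Nothing is kept, so the whole page is "outside".
        return [page]

    strips = [
        Rect(page.x, page.y, left - page.x, page.height),    # left, full height
        Rect(right, page.y, page_right - right, page.height),  # right, full height
        Rect(left, page.y, right - left, bottom - page.y),   # below the kept area
        Rect(left, top, right - left, page_top - top),       # above the kept area
    ]
    # Drop empty strips (kept area touches that page edge).
    return [r for r in strips if r.width > 0 and r.height > 0]


if __name__ == "__main__":
    page = Rect(0, 0, 596, 841)   # assumed page size
    keep = Rect(0, 0, 298, 841)   # left half, as in the question
    for strip in rectangles_outside(page, keep):
        print(strip)
```

For the left-half case only one strip comes back (the right half of the page), so a single redaction rectangle would be enough.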

Joris Schellekens

What leads you to believe that the content is still present?

Any object which is not at least partially contained within the page clip will not be forwarded on to the pdfwrite device by Ghostscript, so I'm doubtful that content is preserved.

Your original question related to cropping away white space, so that makes your example file less than useful in this case. You should post an example of the problem file, and the Ghostscript command line you have used.

Note that if you are trying to crop out an image then no, this won't do what you want. If any part of the image lies on the media, then the entire image will be included in the file. The pdfwrite device isn't equipped to extract sub-areas from images. This is true for all the PDF editors that I'm aware of.
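The behaviour described in this answer can be modelled as a bounding-box overlap test. The sketch below is only an illustration of that rule, not Ghostscript code: `Box`, `overlaps` and `survives_crop` are hypothetical names, and the coordinates are made-up values based on the 298 x 841 point crop from the question.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Bounding box given by its lower-left and upper-right corners, in points."""
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share some area (touching edges do not count)."""
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1


def survives_crop(obj: Box, clip: Box) -> bool:
    """Model of the rule above: an object is kept, whole, if any part of it
    lies inside the page clip; it is dropped only if it lies fully outside."""
    return overlaps(obj, clip)


if __name__ == "__main__":
    clip = Box(0, 0, 298, 841)             # left half of an assumed 596 x 841 page
    full_page_image = Box(0, 0, 596, 841)  # a scan that fills the page
    right_side_text = Box(400, 100, 500, 120)

    print(survives_crop(full_page_image, clip))
    print(survives_crop(right_side_text, clip))
```

Under this model, a page that consists of one full-page image keeps that entire image after cropping, which matches an output file that is no smaller than the input.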

KenS
  • As far as I know, the iText 7 pdfSweep add-on clips images in such a way that the cropped pixels are removed. – Bruno Lowagie Jul 16 '18 at 11:00
  • Fair enough, it's not code I'm familiar with. Ghostscript certainly doesn't do this. – KenS Jul 16 '18 at 14:47
  • If I open a cropped PDF with Adobe Illustrator and select Outline in the View menu (Ctrl+Y), I can see the whole content. In addition, the file size is not smaller. – zhusp Jul 17 '18 at 02:26
  • Here is the [sample](https://www.dropbox.com/s/0uz9n3iug58mndc/MUT.06.05.10.07.57.pdf?dl=0). I want to get the left half, and the gs command is: `gswin32c -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=1 -o croped.pdf -dDEVICEWIDTHPOINTS=298 -dDEVICEHEIGHTPOINTS=841 -dFIXEDMEDIA -c "<>setpagedevice" -f MUT.06.05.10.07.57.pdf` – zhusp Jul 17 '18 at 02:42
  • The screenshots: [img_croped](https://www.dropbox.com/s/0ccgg7nx9s93lw9/croped.png?dl=0), [img_with_outlines](https://www.dropbox.com/s/tbg6srz1rv86u5v/result_with_outline.png?dl=0) – zhusp Jul 17 '18 at 02:52
  • Yes, that's a PDF file whose contents are a single bitmap image. See the answer and comments above. Another solution would be to extract the image from the PDF, crop out the bit you want using an image editor, and make a new PDF. Or just use pdfSweep. – KenS Jul 17 '18 at 07:19