I was wondering whether I can compress or change the quality of my output PDF file with iTextSharp and C#, the way I can with Adobe Acrobat Pro or PDF24 Creator.

Using PDF24 Creator I can open the PDF, save the file again with "Quality of the PDF" set to "Low Quality", and my file size decreases from 88.6 MB to 12.5 MB while the quality is still good enough.

I am already using the following settings:

    writer = new PdfCopy(doc, fs);
    writer.SetPdfVersion(PdfCopy.PDF_VERSION_1_7);
    writer.CompressionLevel = PdfStream.BEST_COMPRESSION;
    writer.SetFullCompression();

but this only decreases the file size from about 92 MB to 88 MB.

Alternatively, can I run PDF24 Creator from my C# code using command-line arguments or startup parameters? Something like this:

    pdf24Creator.exe -save -Quality:low -inputfile -outputfile

Thanks for your help (Bruno)!

HideAndSeek

1 Answer

Short answer: no.

Long answer: yes but you must do a lot of the work yourself.

If you read the third and fourth paragraphs here you'll hopefully get a better understanding of what "compression" actually means from a PDF perspective.

Programs like Adobe Acrobat and PDF24 Creator allow you to reduce the size of a file by destroying data within the PDF. When you select a low quality setting, one of the most common changes these programs make is to extract all of the images, reduce their quality and replace the original images in the PDF. So a JPEG originally saved at maximum quality might be knocked down to 60% quality. And just to be clear, that 60% is non-reversible: it isn't zipping the file, it is literally destroying data in order to save space.
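To make the quality reduction concrete, here is a minimal sketch of the lossy re-encoding step on its own. It assumes the image bytes have already been pulled out of the PDF, and that `System.Drawing` is available (on .NET Core and later this needs the `System.Drawing.Common` package and Windows). `JpegRecompressor` and `Reencode` are hypothetical names for illustration, not part of iTextSharp.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

public static class JpegRecompressor
{
    // Re-encodes any image GDI+ can read as a JPEG at the given quality (0-100).
    // This is lossy: the discarded detail cannot be recovered from the result.
    public static byte[] Reencode(byte[] imageBytes, long quality)
    {
        // Look up the built-in JPEG encoder.
        ImageCodecInfo jpegCodec = ImageCodecInfo.GetImageEncoders()
            .First(c => c.FormatID == ImageFormat.Jpeg.Guid);

        using (var input = new MemoryStream(imageBytes))
        using (var image = Image.FromStream(input))
        using (var parameters = new EncoderParameters(1))
        using (var output = new MemoryStream())
        {
            // Encoder.Quality expects a long value.
            parameters.Param[0] = new EncoderParameter(Encoder.Quality, quality);
            image.Save(output, jpegCodec, parameters);
            return output.ToArray();
        }
    }
}
```

Calling `JpegRecompressor.Reencode(bytes, 60L)` would correspond to the 60% setting mentioned above. Putting the result back into the PDF is a separate step that this sketch does not cover.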

Another setting is to reduce the effective DPI of an image. A 500 pixel wide image placed into a 2 inch wide box is effectively 250 DPI. These programs will extract the image and reduce it to maybe 96 or 72 DPI, which means the 500 pixel image would be reduced to 192 or 144 pixels in width, and then replace the original image in the PDF. Once again, this is a destructive non-reversible change.

(By "destructive non-reversible" I don't mean that your original file is gone, since you probably still have it. I just want to be clear that this isn't true "compression" like ZIP.)
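The DPI arithmetic above can be written out as a small helper. This is only an illustrative sketch; `DpiMath` and its method names are hypothetical and not part of any library.

```csharp
using System;

public static class DpiMath
{
    // Effective DPI = pixel width divided by the width of the box (in inches)
    // that the image is placed into on the page.
    public static double EffectiveDpi(int pixelWidth, double placedWidthInches)
    {
        return pixelWidth / placedWidthInches;
    }

    // Pixel width an image needs in order to reach a target DPI in the same box.
    public static int PixelsForDpi(double targetDpi, double placedWidthInches)
    {
        return (int)Math.Round(targetDpi * placedWidthInches);
    }
}
```

`DpiMath.EffectiveDpi(500, 2.0)` gives 250, while `DpiMath.PixelsForDpi(96, 2.0)` and `DpiMath.PixelsForDpi(72, 2.0)` give 192 and 144, matching the figures in the paragraph above.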

However, if you really want to do it, you can look at code like this, which shows how you can use iText to perform the extraction and re-insertion of images. It is 100% up to you, however, to change the images, because iText won't make destructive changes to your data (and that's a good thing, I'd say!).

Chris Haas
  • Hey, thanks for the great answer. That seems like a lot of work to me. Is there any other pdf library you know that might have this implemented? (Might cost something). Maybe I can use the hOCR2pdf Library with only using the pdf Tools? – HideAndSeek Feb 25 '15 at 16:03
  • Additionally: if iTextSharp can't do it, there should be another library that can compress images in a PDF. So in my program, after I have created the PDF with iTextSharp, I would open it with that other library, "compress" it and then save it again. Would this be possible? – HideAndSeek Feb 25 '15 at 16:16
  • Instead of creating a PDF with iText *and then* compressing it, I would just compress your images beforehand *and then* add them to the PDF. The .Net runtime has built-in capabilities to do most of the image manipulations on the fly, and if you've got static resources then I would just use Photoshop or GIMP to compress the images. But if you don't want to do that, couldn't you just shell exec out to PDF24 Creator or [Ghostscript](http://stackoverflow.com/a/11851460/231316)? – Chris Haas Feb 25 '15 at 17:15
  • Alright, I think I'm going to try the iTextSharp part. I am merging about 800 PDF documents into one huge one. The images are in those PDF documents. I did some tests to see which order gives the best file size. For the example I merged 5 PDF documents into one. **First case:** I merged the 5 documents and then compressed the result (130 DPI) with PDF24 Creator: **740 KB** ---- **Second case:** I compressed the 5 documents at 130 DPI and then merged them: **870 KB** ---- **Third case:** I compressed (130 DPI) the resulting PDF file from the second case, file size: **780 KB** – HideAndSeek Mar 03 '15 at 16:46
  • Are these file sizes so close because I **only used 5** files? Would the file size only differ by ~1 MB if I used 700 files? I don't know how I can test this, since I compress the files manually with PDF24 Creator. – HideAndSeek Mar 03 '15 at 16:48
  • If you merge documents with `PdfSmartCopy` you should generally see a reduction in overall file size **if** the pages use the **exact same image**, and by "exact" I mean all the way down to the actual byte level, because iText is able to reuse images in this case. If you compress the pages first, there's a very good chance that images that were previously the same will have slightly different byte contents, so iText won't be able to reuse the image. So I would recommend performing all of your merging first and saving compression until the last possible step. – Chris Haas Mar 03 '15 at 22:39
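Following the comments above (merge first, compress last, and shell out to an external tool for the lossy step), here is a hedged sketch of calling Ghostscript from C# on the already merged file. It assumes Ghostscript is installed and that you pass the path to its console executable (for example `gswin64c.exe` on 64-bit Windows). `GhostscriptCompressor`, `BuildArguments` and `Compress` are hypothetical names. The switches used are standard Ghostscript `pdfwrite` options; the `preset` value is one of Ghostscript's `-dPDFSETTINGS` names such as `screen` or `ebook`. The exact size reduction for your files is something you would have to measure.

```csharp
using System;
using System.Diagnostics;

public static class GhostscriptCompressor
{
    // Builds the Ghostscript command line for rewriting a PDF with a quality preset.
    // Paths are wrapped in double quotes so that spaces survive argument parsing.
    public static string BuildArguments(string inputPath, string outputPath, string preset)
    {
        return "-sDEVICE=pdfwrite -dPDFSETTINGS=/" + preset +
               " -dNOPAUSE -dBATCH -dQUIET" +
               " -sOutputFile=\"" + outputPath + "\" \"" + inputPath + "\"";
    }

    // Runs Ghostscript and waits for it to finish; throws on a non-zero exit code.
    public static void Compress(string gsExePath, string inputPath, string outputPath, string preset)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = gsExePath,          // assumption: caller supplies the install path
            Arguments = BuildArguments(inputPath, outputPath, preset),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "Ghostscript exited with code " + process.ExitCode);
            }
        }
    }
}
```

Running this once on the final merged PDF, rather than on each input before merging, keeps identical images byte-identical during the merge so that `PdfSmartCopy` can still deduplicate them.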