80

I have around 1000 PDF files and I need to convert them to 300 dpi TIFF files. What is the best way to do this? An SDK or a tool that can be scripted would be ideal.

gyurisc
  • This is the solution that I use: Pdf to Tiff using Xpdf's pdftoppm and LibTIFF's ppm2tiff and tiffcp (optional, only if multipage), http://stackoverflow.com/a/12868254/551460 – Ronald Wang Oct 12 '12 at 23:15
  • Is there a final solution with a full source code sample? Maybe using a PowerShell script. – Kiquenet Dec 17 '13 at 19:48
  • @Kiquenet I posted one solution using powershell. See it below... – gyurisc Dec 18 '13 at 20:46
  • 1
    Use Ghostscript as `gs -q -dNOPAUSE -r300x300 -sDEVICE=tiff24nc -sOutputFile=output.tif input.pdf -c quit` (on Windows the command is `gswin32c`) to produce a 300x300 dpi, 24-bit color image – juanmirocks May 27 '16 at 11:19
  • Best way to convert PDF files to TIFF files? For sure: use `pdftoppm`, as follows: `mkdir images && pdftoppm -tiff -r 300 mypdf.pdf images/pg`. See here for details, usage, & more info: https://askubuntu.com/questions/150100/extracting-embedded-images-from-a-pdf/1187844#1187844. – Gabriel Staples Nov 11 '19 at 10:18
  • This is somewhat of a cop-out answer, but I tried Ghostscript and didn't have good success. On the other hand, Adobe Acrobat had an option to export the PDF as `.tif` files, and it worked perfectly for my needs. Its output settings can also be adjusted. – GDP2 Feb 01 '22 at 09:02
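
The `pdftoppm` suggestion in the comments above can be scripted for a whole folder. This is only a minimal sketch, assuming Poppler's `pdftoppm` is installed and on the PATH. The helper names `pdftoppm_command` and `convert_all` are made up for illustration; only the `-tiff` and `-r` flags come from the comment.

```python
import subprocess
from pathlib import Path


def pdftoppm_command(pdf, out_dir, dpi=300):
    """Build a pdftoppm call that writes page images under out_dir.

    The output prefix is <out_dir>/<pdf name without extension>;
    pdftoppm appends the page number and extension itself.
    """
    prefix = Path(out_dir) / Path(pdf).stem
    return ["pdftoppm", "-tiff", "-r", str(dpi), str(pdf), str(prefix)]


def convert_all(pdf_dir, out_dir, dpi=300):
    """Convert every PDF directly inside pdf_dir (assumes pdftoppm is on PATH)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if pdftoppm reports a failure.
        subprocess.run(pdftoppm_command(pdf, out_dir, dpi), check=True)
```

Calling `convert_all("pdfs", "images")` would then produce one TIFF per page for each PDF in the hypothetical `pdfs` folder.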

13 Answers

73

Use ImageMagick or, better yet, Ghostscript.

http://www.ibm.com/developerworks/library/l-graf2/#N101C2 has an example for ImageMagick:

convert foo.pdf pages-%03d.tiff

http://www.asmail.be/msg0055376363.html has an example for Ghostscript:

gs -q -dNOPAUSE -sDEVICE=tiffg4 -sOutputFile=a.tif foo.pdf -c quit

I would install Ghostscript, read the man page for gs to see exactly which options are needed, and experiment.

BAR
Aeon
  • 2
    Ghostscript works really well. As far as I understand, ImageMagick reuses Ghostscript for PDF operations. Is this correct? – gyurisc Sep 17 '08 at 11:11
  • that's what I hear, but I'm not an expert on ImageMagick internals ;) – Aeon Sep 17 '08 at 16:37
  • 2
    does imagemagick handle multipage pdf --> tiff properly? – David Shaked Feb 25 '16 at 00:13
  • 2
    wow, ghostscript really needs to clean up its command line interface! – CpILL Jul 15 '17 at 07:13
  • `imagemagick` worked well without configuration. I could not configure `ghostscript` properly to get a high resolution colour image. – Seanny123 Oct 16 '17 at 19:00
  • 1
    `convert foo.pdf pages-%03d.tiff` produces horribly-low-quality images. How do we increase the resolution to be what is already in the pdf, so no resolution is lost? – Gabriel Staples Nov 09 '19 at 21:14
  • 1
    After tons and tons and tons of research, here's what I've decided is best: `pdftoppm`: https://askubuntu.com/questions/150100/extracting-embedded-images-from-a-pdf/1187844#1187844. Also, in case the goal of making TIFFs here is to use `tesseract` to convert the PDF to a *searchable* PDF via OCR, I've done that now too and written an interface to do it in one step: `pdf2searchablepdf input.pdf`. See here: https://askubuntu.com/questions/473843/how-to-turn-a-pdf-into-a-text-searchable-pdf/1187881#1187881. – Gabriel Staples Nov 11 '19 at 10:22
46

Using GhostScript from the command line, I've used the following in the past:

on Windows:

gswin32c -dNOPAUSE -q -g300x300 -sDEVICE=tiffg4 -dBATCH -sOutputFile=output_file_name.tif input_file_name.pdf

on *nix:

gs -dNOPAUSE -q -g300x300 -sDEVICE=tiffg4 -dBATCH -sOutputFile=output_file_name.tif input_file_name.pdf

For a large number of files, a simple batch or shell script could loop over them and run this command once per file.

tomasso
  • 3
    +1. Useful command. But my color figure is outputting in black and white. Any idea why? – Faheem Mitha Apr 10 '11 at 08:36
  • 6
    `-sDEVICE=tiffg4` is a black and white fax compression model. See: http://pages.cs.wisc.edu/~ghost/doc/AFPL/8.00/Devices.htm#TIFF – HairyFotr Jun 18 '12 at 21:08
  • 23
    Most of the time you want to convert a pdf to TIFF images of 300x300 dpi, not 300x300 size. For this reason, replace `-g` switch with `-r`: `gswin32c -dNOPAUSE -q -r300x300 ...` – berezovskyi Dec 24 '13 at 01:52
  • 6
    Thanks @HairyFotr. For anyone else visiting, you should be using `-sDEVICE=tiff24nc` for 24-bit RGB, or `-sDEVICE=tiff12nc` for 12-bit (8/4 bits per channel, respectively). – naught101 Jan 22 '14 at 00:58
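
Putting the comments above together, here is a rough Python sketch that builds the Ghostscript command with `-r` (resolution in dpi) instead of `-g` (output size in pixels) and lets you pick the device. The helper names, the keys of the `DEVICES` mapping and the `.tif` output naming are assumptions for illustration; only the flags and device names come from this thread.

```python
from pathlib import Path

# Ghostscript TIFF devices mentioned in the comments above.
DEVICES = {
    "bw": "tiffg4",       # black-and-white, fax-style compression
    "rgb24": "tiff24nc",  # 24-bit RGB
    "rgb12": "tiff12nc",  # 12-bit RGB
}


def gs_command(pdf, tiff, dpi=300, mode="rgb24", gs="gs"):
    """Build the Ghostscript argument list for one PDF (hypothetical helper)."""
    return [
        gs, "-q", "-dNOPAUSE", "-dBATCH",
        f"-r{dpi}x{dpi}",              # resolution in dpi, not -g (pixel size)
        f"-sDEVICE={DEVICES[mode]}",
        f"-sOutputFile={tiff}",
        str(pdf),
    ]


def plan_conversions(root, **options):
    """Yield (pdf, tiff, command) for every PDF under root that has no .tif yet."""
    for pdf in sorted(Path(root).rglob("*.pdf")):
        tiff = pdf.with_suffix(".tif")
        if not tiff.exists():
            yield pdf, tiff, gs_command(pdf, tiff, **options)
```

Each yielded command is a plain argument list, so it can be handed to `subprocess.run(command, check=True)`; on Windows you would pass `gs="gswin32c"`.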
19

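I wrote a little PowerShell script to go through a directory structure and convert all PDF files to TIFF files using Ghostscript. Here is my script:

# Adjust this path to match the Ghostscript version installed on your system.
$tool = 'C:\Program Files\gs\gs8.63\bin\gswin32c.exe'
# Find every file with a .pdf extension below the current directory.
$pdfs = Get-ChildItem . -Recurse | Where-Object { $_.Extension -eq '.pdf' }

foreach ($pdf in $pdfs)
{
    # Swap the extension; splitting on '.' breaks on paths that contain other dots.
    $tiff = [System.IO.Path]::ChangeExtension($pdf.FullName, '.tiff')
    if (Test-Path $tiff)
    {
        "tiff file already exists " + $tiff
    }
    else
    {
        'Processing ' + $pdf.Name
        $param = "-sOutputFile=$tiff"
        # tiffg4 gives black-and-white output; -r300 sets 300 dpi.
        & $tool -q -dNOPAUSE -sDEVICE=tiffg4 $param -r300 $pdf.FullName -c quit
    }
}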
Jerod Venema
gyurisc
  • 2
    After 7 years, this continues to be helpful! I would only add that, if you have no PowerShell experience, you need to: 1. Edit the $tool value to match the path and version on your system. 2. Open PowerShell and cd to the directory where the PDFs are stored. 3. Paste the code into the PowerShell window. I needed to press Enter a couple of times afterwards to get it to run. Thanks gyurisc – Timothy Barmann Sep 04 '15 at 19:08
9

1) Install GhostScript

2) Install ImageMagick

3) Create "Convert-to-TIFF.bat" (Windows XP, Vista, 7) and use the following line:

for %%f in (%*) DO "C:\Program Files\ImageMagick-6.6.4-Q16\convert.exe" -density 300 -compress lzw %%f %%f.tiff

Dragging any number of single-page PDF files onto this file will convert them to compressed TIFFs, at 300 DPI.

Tyler
7

Using Python, this is what I ended up with:

import os
os.popen(' '.join([
                   self._ghostscriptPath + 'gswin32c.exe', 
                   '-q',
                   '-dNOPAUSE',
                   '-dBATCH',
                   '-r300',
                   '-sDEVICE=tiff12nc',
                   '-sPAPERSIZE=a4',
                   '-sOutputFile=%s %s' % (tifDest, pdfSource),
                   ]))
S.A.
Setori
  • 2
    Generally you'll want to use subprocess for this. os.popen is considered deprecated. The syntax is nearly the same. – mlissner Sep 01 '16 at 21:47
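
Following the comment above, a `subprocess` version of the same call might look like the sketch below. It keeps the flags from the answer; the function name `pdf_to_tiff` and the `gs_exe` parameter (an executable name or full path) are assumptions, not part of the original code.

```python
import subprocess


def pdf_to_tiff(pdf_source, tif_dest, gs_exe="gswin32c.exe"):
    """Run Ghostscript on one PDF; gs_exe is an assumed executable name or path."""
    args = [
        gs_exe,
        "-q", "-dNOPAUSE", "-dBATCH",
        "-r300",
        "-sDEVICE=tiff12nc",
        "-sPAPERSIZE=a4",
        "-sOutputFile=" + tif_dest,
        pdf_source,
    ]
    # Passing a list (no shell) avoids quoting problems with spaces in paths.
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("Ghostscript failed: " + result.stderr.strip())
    return tif_dest
```

Unlike `os.popen`, this waits for Ghostscript to finish and raises if it exits with a non-zero status, which is handy when looping over many files.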
4

PDF Focus .Net can do it this way:

1. PDF to TIFF

SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();    

string pdfPath = @"c:\My.pdf";

string imageFolder = @"c:\images\";

f.OpenPdf(pdfPath);

if (f.PageCount > 0)
{
    //Save all PDF pages to image folder as tiff images, 200 dpi
    int result = f.ToImage(imageFolder, "page",System.Drawing.Imaging.ImageFormat.Tiff, 200);
}

2. PDF to Multipage-TIFF

//Convert PDF file to Multipage TIFF file

SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

string pdfPath = @"c:\Document.pdf";
string tiffPath = @"c:\Result.tiff";

f.OpenPdf(pdfPath);

if (f.PageCount > 0)
{
    if (f.ToMultipageTiff(tiffPath, 120) == 0)
    {
        System.Diagnostics.Process.Start(tiffPath);
    }
}   
Unni Kris
k venkat
3

Requires Ghostscript and tiffcp. Tested on Ubuntu.

import glob
import os
import subprocess

def pdf2tiff(source, destination):
    # Strip the extension to get the base name for the per-page files.
    base = os.path.splitext(destination)[0]
    # Render every page to its own Group 4 (black-and-white) TIFF at 600 dpi.
    subprocess.check_call([
        'gs', '-q', '-dNOPAUSE', '-dBATCH',
        '-sDEVICE=tiffg4',
        '-r600', '-sPAPERSIZE=a4',
        '-sOutputFile=' + base + '__%03d.tiff',
        source,
    ])
    # Merge the pages into one multipage TIFF, then delete the per-page files.
    pages = sorted(glob.glob(base + '__*.tiff'))
    subprocess.check_call(['tiffcp'] + pages + [base + '.tiff'])
    for page in pages:
        os.remove(page)

pdf2tiff('abc.pdf', 'abc.tiff')
Russell Wong
3

How about pdf2tiff? http://python.net/~gherman/pdf2tiff.html

JBB
  • this does not handle multipage tiffs yet, so unfortunately this is no go for me. Thanks for the suggestion though. – gyurisc Sep 17 '08 at 09:23
3

ABCPDF can do so as well -- check out http://www.websupergoo.com/helppdf6net/default.html

Danimal
2

https://pypi.org/project/pdf2tiff/

You could also use pdf2ps, ps2image and then convert from the resulting image to TIFF with other utilities (I remember 'paul' [paul - Yet another image viewer (displays PNG, TIFF, GIF, JPG, etc.)]).

Саша Черных
INS
2

Maybe also try this? PDF Focus

This .Net library allows you to solve the problem :)

This code will help (Convert 1000 PDF files to 300-dpi TIFF files in C#):

    SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

    string[] pdfFiles = Directory.GetFiles(@"d:\Folder with 1000 pdfs\", "*.pdf");
    string folderWithTiffs = @"d:\Folder with TIFFs\";

    foreach (string pdffile in pdfFiles)
    {
        f.OpenPdf(pdffile);

        if (f.PageCount > 0)
        {
            //save all pages to tiff files with 300 dpi
            f.ToImage(folderWithTiffs, Path.GetFileNameWithoutExtension(pdffile), System.Drawing.Imaging.ImageFormat.Tiff, 300);
        }
        f.ClosePdf();
    }
sth
Sally
2

Disclaimer: I work for the company whose product I am recommending.

Atalasoft has a .NET library that can convert PDF to TIFF -- we are a partner of FOXIT, so the PDF rendering is very good.

Lou Franco
1

I like PDFTIFF.com for converting PDF to TIFF; it can handle unlimited pages.

John