23

I am looking for a way to 'outline' all text/fonts in a PDF file, i.e. convert them to curves.

I would prefer to do this without having to convert the PDF to PostScript and back. Also, I would like to use free lightweight cross-platform tools that can be automated from the command line, such as Ghostscript or MuPDF.

Kurt Pfeifle
Szabolcs
  • [LaTeXiT](http://www.chachatelier.fr/latexit/) can do this and I believe it uses GhostScript (not sure). I tried to dig through the source and find how it does it but didn't succeed. – Szabolcs Mar 01 '15 at 18:37
  • Ghostscript *can* do this, now, but it couldn't readily do so previously (you would have had to go via PostScript). I've added the information as an answer below. – KenS Mar 01 '15 at 19:47
  • [PDF-TEXT-To-Outlines](https://pdf-editor-free.com/PDF-TEXT-To-Outlines/) with adblocker seems to work well for one off privacy insensitive documents. – tejasvi88 Apr 22 '22 at 15:15
  • @tejasvi88 However, it is not a command line tool that can easily be automated, which is what I was looking for. – Szabolcs Apr 22 '22 at 15:16

3 Answers

44

Yes, you can use Ghostscript to achieve what you want.

I. For Ghostscript versions up to 9.14

You need to go through 2 steps:

  1. Convert the PDF to a PostScript file, relying on a side effect of the little-known -dNOCACHE parameter, which converts all fonts in use to outline shapes:

    gs -o somepdf.ps -dNOCACHE -sDEVICE=pswrite somepdf.pdf
    
  2. Convert the PostScript file back to PDF (and optionally delete the intermediate PS file):

    gs -o somepdf-with-outlines.pdf -sDEVICE=pdfwrite somepdf.ps
    
    rm somepdf.ps
    

This method is not reliable long-term, because the Ghostscript developers have stated that -dNOCACHE may not be present in future versions.
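
The two steps above could be wrapped in a small shell function so that the intermediate PostScript file is always cleaned up. This is only a sketch: `outline_legacy` and the `GS` variable are names made up for illustration, and the Ghostscript options are simply the ones from the commands above.

```shell
#!/bin/sh
# Sketch only: outline_legacy and GS are illustrative names, not part of Ghostscript.
GS=${GS:-gs}

# Usage: outline_legacy IN.pdf OUT.pdf
outline_legacy() {
    tmp_ps=$(mktemp "${TMPDIR:-/tmp}/outline.XXXXXX") || return 1
    # Step 1 (PDF -> PostScript; -dNOCACHE turns glyphs into outlines),
    # then step 2 (PostScript -> PDF), which only runs if step 1 succeeded.
    "$GS" -o "$tmp_ps" -dNOCACHE -sDEVICE=pswrite "$1" &&
        "$GS" -o "$2" -sDEVICE=pdfwrite "$tmp_ps"
    status=$?
    rm -f "$tmp_ps"   # remove the intermediate file whether or not gs succeeded
    return "$status"
}
```

With a Ghostscript of version 9.14 or older on the PATH, `outline_legacy somepdf.pdf somepdf-with-outlines.pdf` would run both conversions in sequence.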

Note: the resulting PDF will very likely be larger than the original. Also, unless you add further command-line parameters, all images in the original PDF will be processed according to Ghostscript's built-in defaults, which can cause unwanted side effects (for example, images may be resampled or recompressed). You can avoid these by adding parameters that override the defaults.


II. Ghostscript versions 9.15 or newer

Ghostscript version 9.15 (released in September 2014) supports a new command line parameter:

    -dNoOutputFonts

This will cause the output devices pdfwrite, ps2write and eps2write "to 'flatten' glyphs into 'basic' marking operations (rather than writing fonts to the output)".

This means: the two steps described for pre-9.15 GS versions can be avoided. The desired result can be achieved with a single command:

    gs -o file-with-outlines.pdf -dNoOutputFonts -sDEVICE=pdfwrite file.pdf
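
Because `-dNoOutputFonts` only exists from version 9.15 onwards, a script that may run on different machines could check the installed version first. The sketch below assumes that `gs --version` prints a bare version string such as `9.54.0`; `version_at_least`, `outline_pdf` and `GS` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: the function names and the GS variable are illustrative.
GS=${GS:-gs}

# version_at_least HAVE WANT: succeeds if HAVE >= WANT.
# Only the major and minor parts are compared, numerically.
version_at_least() {
    have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
    want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Usage: outline_pdf IN.pdf OUT.pdf
outline_pdf() {
    ver=$("$GS" --version) || return 1
    if version_at_least "$ver" 9.15; then
        "$GS" -o "$2" -dNoOutputFonts -sDEVICE=pdfwrite "$1"
    else
        echo "Ghostscript $ver lacks -dNoOutputFonts; use the two-step method" >&2
        return 1
    fi
}
```

Calling `outline_pdf file.pdf file-with-outlines.pdf` then either runs the single command or fails with a message pointing back to part I.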

Note: the caveat from part I applies here too. If your PDF includes images, the simple command line above may introduce unwanted side effects. To avoid them, add more specific parameters.
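
One possible set of "more specific parameters" is sketched below. It assumes the goal is to leave images at their original resolution and to store them losslessly instead of letting pdfwrite choose a filter. The switches are standard pdfwrite (distiller-style) parameters, but whether you need all of them, and whether they cover every side effect in your files, is untested here. `outline_keep_images` and `GS` are illustrative names.

```shell
#!/bin/sh
# Sketch only: outline_keep_images and GS are illustrative names.
GS=${GS:-gs}

# Usage: outline_keep_images IN.pdf OUT.pdf
outline_keep_images() {
    # Outline the text, switch off image downsampling, and force
    # lossless Flate compression for color and grayscale images.
    "$GS" -o "$2" -sDEVICE=pdfwrite -dNoOutputFonts \
        -dDownsampleColorImages=false \
        -dDownsampleGrayImages=false \
        -dDownsampleMonoImages=false \
        -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode \
        -dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode \
        "$1"
}
```

Keep in mind that Flate-compressed photographs can be considerably larger than JPEG-compressed ones, so this trades file size for fidelity.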

Kurt Pfeifle
  • Hey Kurt, I have created a photobook PDF with images, captions and emojis, and I need to print it. What is the ideal way to convert any photobook PDF to a "print-ready" PDF format? Which Ghostscript options should I use? Can you guide me or point me to some resources? Thanks a lot in advance. I tried outlining the fonts in my photobook PDF with the command you mention in this answer, and it works fine. But since this PDF contains images, emojis and text, I am not sure whether this is the exact command or whether I need extra options in the longer run. – Kaviraj Kanagaraj Jan 13 '16 at 07:16
  • @Kurt, nice answer, you really should add the link to another answer by you, about how to keep the raster image resolution: https://superuser.com/a/373740/207447 – Libin Wen Dec 06 '19 at 08:31
  • Add a [related document reference](https://www.ghostscript.com/doc/9.52/VectorDevices.htm#COMMON) for `-dNoOutputFonts`. But note the new output PDF created by Ghostscript is not necessarily much more "intelligent" (overall smaller, better optimized files from bloated input PDF) with default settings. See also [How to remove duplicate objects in PDF using ghostscript?](https://stackoverflow.com/questions/27295777/how-to-remove-duplicate-objects-in-pdf-using-ghostscript) – samm Jun 14 '20 at 08:37
11

This commit adds a new switch -dNoOutputFonts to the Ghostscript pdfwrite and ps2write devices which will produce a PDF file (or PostScript, depending on the selected device) where all the glyphs have been created as vectors, not as text.

You will need at least version 9.15 of Ghostscript to get this feature. Be aware that the PDF file will almost certainly be larger and copy/paste/search will (obviously) not work.
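
One way to confirm that no fonts remain in the output is to list them with `pdffonts` from Poppler's command-line utilities, a separate tool that this answer does not otherwise rely on. The sketch assumes that `pdffonts` prints a two-line header followed by one line per font; `font_count`, `is_outlined` and `PDFFONTS` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: assumes pdffonts (Poppler utilities) is installed.
PDFFONTS=${PDFFONTS:-pdffonts}

# font_count FILE.pdf: print the number of fonts pdffonts reports.
font_count() {
    # Skip the two header lines and count what is left.
    "$PDFFONTS" "$1" | awk 'NR > 2 { n++ } END { print n + 0 }'
}

# is_outlined FILE.pdf: succeeds if no fonts are reported.
is_outlined() {
    [ "$(font_count "$1")" -eq 0 ]
}
```

For example, `is_outlined file-with-outlines.pdf && echo "no fonts left"` would print the message only when the font list is empty.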

Kurt Pfeifle
KenS
  • Yes, I tested, and I found that the larger size was not caused only by converting fonts to outline shapes/vectors/curves. For example, I had a PDF with one watermark image embedded and referenced indirectly on each page. After running Ghostscript, I found with [itext-rups-7.1.11.jar](https://github.com/itext/i7j-rups/releases) that the output PDF contained duplicated images on each page. ``` Pages: ... Page 3 124 0 R => Image Stream Page 4 171 0 R => Image Stream ... XRef: ... 124 => Image Stream 171 => Image Stream ... ``` – samm Jun 14 '20 at 08:55
  • The comment above doesn't seem to have anything to do with the original question or answer. samm, if you have a problem, please start a new question. For other readers, Ghostscript's pdfwrite device (by default) will hash all images, and only use one if they are identical. Of course samm has not provided an input file, a command line, an output file or even information on which OS or version of Ghostscript, which makes it impossible to investigate or comment. – KenS Jun 14 '20 at 10:58
  • Well, it seems to have little to do with converting texts to curves without fonts embedded. I just wanted to add a note about larger size of the output PDF file if someone is concerned with the size. I used gs v9.52 on windows 10 by ` gs -o book.vectored.pdf -dNoOutputFonts -sDEVICE=pdfwrite book.optimized.pdf` and the pdf had 300+ of pages. I used the same optimization algorithm to book.vectored.pdf as was used to book.optimized.pdf, I could reduce the size by 10 MB. – samm Jun 15 '20 at 11:48
0

III. Ghostscript version 9.54.0 (Windows 10)

I found a method that preserves all fonts flawlessly as vectors without any visual errors and with just two printing steps, after Ghostscript is first installed and configured correctly.

(Note: you must add Ghostscript's bin and lib folders to your Windows PATH for Ghostscript to work.) Instructions here

  1. Using Acrobat Reader, print the PDF file that contains vector-based fonts or other vector elements to a file named YourFile.prn with the Microsoft PS Class Driver. To install this driver, go to Control Panel > Devices > Printers & Scanners > Add a printer or scanner and let Windows search for a connected printer. When the search stops, select "The printer that I want is not listed" > "Add a local printer or network printer with manual settings" > Next > "Use an existing port: FILE: (Print to File)" > Next > Microsoft: Microsoft PS Class Driver > Next.

  2. Open a Command Prompt, navigate to the folder containing YourFile.prn, and run:

    "C:\Program Files\gs\gs9.54.0\bin\gswin64c.exe" -dNOPAUSE -dNOCACHE -dBATCH -sDEVICE=eps2write -sOutputFile=YourFile.eps YourFile.prn

If you need to do this regularly, you can also create a prn2eps.bat file containing the following:

    "C:\Program Files\gs\gs9.54.0\bin\gswin64c.exe" -dNOPAUSE -dNOCACHE -dBATCH -sDEVICE=eps2write -sOutputFile=%1.eps %1.prn

To use the batch file, type: prn2eps YourFile. (Note: the batch file and YourFile.prn must be in the same directory.)
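
On Linux or macOS, the same conversion could be wrapped in a shell function instead of a batch file. This is an untested sketch that assumes the Ghostscript executable is called `gs` there and that a PostScript print file with the `.prn` extension already exists; `prn2eps` mirrors the batch file's name and `GS` is an illustrative variable.

```shell
#!/bin/sh
# Sketch only: a shell counterpart of the batch file, with illustrative names.
GS=${GS:-gs}

# Usage: prn2eps YourFile   (reads YourFile.prn, writes YourFile.eps)
prn2eps() {
    "$GS" -dNOPAUSE -dNOCACHE -dBATCH -sDEVICE=eps2write \
        -sOutputFile="$1.eps" "$1.prn"
}
```

Quoting `"$1.eps"` and `"$1.prn"` keeps the function working for base names that contain spaces.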

For some reason, the ps2epsi function in the newest Ghostscript did not work on Windows 10, and PDFs made with Adobe software showed minor but consistent errors in some font characters when I imported them as PDFs into non-Adobe design software. Over the years I have found EPS to be one of the most reliable formats when vectors must be preserved from one application to another. Often it is enough to print the PDF to PDF again through a different printer driver, or to do a single format conversion with Ghostscript, but not always.

Supernuija
  • Solution "II" from the accepted answer does work in Ghostscript 9.54 just as before (I use it regularly). The other answers did not rely on GSView. I am not sure what issue your answer is trying to address. – Szabolcs Jun 22 '21 at 21:17
  • I did try that solution, but for some reason some specific fonts still had errors (some deformed characters, as if some vertices or control vectors were missing), which were fixed only when I first printed to PS with Windows 10's own driver and then converted that to EPS. I have used Ghostscript for decades to fix all kinds of odd visual errors in vector file conversions; it's a great tool! GSView just made it super easy to use, since it had a graphical UI, and that's no longer available. – Supernuija Jun 23 '21 at 13:08
  • It will be helpful to readers if you explain (within the answer itself) what problem your solution is meant to address. – Szabolcs Jun 23 '21 at 13:12