6

I'm trying to convert PDFs to PCL (using Ghostscript, but I'd love to hear alternative suggestions), and every driver (Ghostscript device), including all of the built-ins and Gutenprint, generates PCL files many times larger than the input PDF. This is the problem: I need my PCL to be about as small as the input.

Given that the text doesn't show up in the PCL file, I guess that Ghostscript is rasterizing the text. Is there a way to prevent GS generally, or just Gutenprint, from doing that? I'd rather have it either embed the fonts or omit them entirely (leaving it to the printer to render the text).

Unfortunately, there doesn't seem to be any documentation on this point.

Marcin
  • In other questions around the same topic you specified you wanted PCL5 or PCL5e output, but you didn't want PCL-XL. I assume this is still the case? – Kurt Pfeifle May 27 '12 at 14:23

2 Answers

6

There are 3 (I think) types of font in PCL. There are rendered bitmaps, TrueType fonts (in later versions) and the HPGL stick font.

PDF and PostScript have Type 1, Type 2 (CFF), Type 3 and Type 42 (TrueType, but not the same as PCL) fonts, plus CIDFonts based on any of the preceding types.

The only font type the two have in common is TrueType, so in order to retain text, any font which was not TrueType would have to be converted into TrueType. This is not a simple task. So Ghostscript simply renders the text, which is guaranteed to work.

PDF is, in general, a much richer format than PCL; there are many PDF constructs (fonts, shading, stroke/fill in a single operation, transparency) which cannot be represented in PCL. So it's entirely possible that the increase in size has nothing to do with text and fonts.

In fact, I believe that the PXL drivers in Ghostscript simply render the entire page to a bitmap at the required resolution, and then wrap that up with enough PCL to be successfully sent to a printer. (I could be mistaken on this point though)

Basically, you are not going to get PCL of a similar size to your PDF out of Ghostscript.

KenS
  • Thanks for this. Nevertheless, it doesn't preclude ghostscript from omitting font data. Is there a way to do that? – Marcin May 27 '12 at 12:15
  • The only font data GS could produce would be TT or, as you have found, bitmap. It would be 'hard' to convert the PostScript/PDF font types to TrueType, and impossible to do a good job. Since the conversion can't be made 100% for PCL, GS doesn't even try. As I said, I think the whole page is simply rendered to a bitmap, but I could be wrong about that. To the best of my knowledge there is no way *with the PCL output device* of preventing GS from rendering the text. – KenS May 27 '12 at 15:42
  • Of course, you could take the existing device, and use it to develop a more fully-featured PCL output device. The pdfwrite and ps2write devices already embed and convert font data, so you could examine those for more code which might help a little. Ghostscript is open source after all, so although there is no device which meets your needs, you can always write one! – KenS May 27 '12 at 15:44
  • That's true - although I think I'll try to make some changes further up my tool chain to avoid generating PDFs as an intermediate step (not least because I've no real C, as opposed to C++ experience, and I bet those drivers are all written in C). – Marcin May 27 '12 at 15:50
3

Here is a way to 'prevent Ghostscript from rasterizing text'. But its output will be PostScript. You may, however, succeed in converting this PostScript to PCL5e in an additional step.

The method will convert all glyphs into outline shapes for its PostScript output, and it does not work for its PDF or PCL output. The key here is the -dNOCACHE parameter:

gs -o somepdf.ps -dNOCACHE -sDEVICE=pswrite somepdf.pdf

Of course, converting font glyphs to outlines will take more space than keeping the original fonts embedded, because "fonts" are a space-optimized concept to store, retrieve and render glyph shapes.

Once you have this PostScript, you may be able to convert it to PCL5e with the help of either of the methods you tried before for PDF input (including {Apache?} FOP).

However, I have no idea if the output will be much smaller than versions with rasterized fonts (or even wholly rasterized pages). But it may be worth a test.
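One way to run that test is sketched below. It is only a sketch under assumptions: the `ljet4` device is used here as an example of a Ghostscript PCL5e device (substitute whichever device you tried before), and the helper names `file_size` and `size_ratio` as well as the output names `direct.pcl` and `via_ps.pcl` are made up for illustration. It converts both the original PDF and the outlined PostScript to PCL and prints how many times larger each result is than the PDF.

```shell
#!/bin/sh
# Print the size of a file in bytes (wc -c reads from stdin, so no file name is echoed).
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Print how many times larger file $2 is than file $1, to two decimal places.
size_ratio() {
    awk -v a="$(file_size "$1")" -v b="$(file_size "$2")" \
        'BEGIN { if (a == 0) { print "inf" } else { printf "%.2f\n", b / a } }'
}

# Only run the comparison when Ghostscript and both input files are present.
if command -v gs >/dev/null 2>&1 && [ -f somepdf.pdf ] && [ -f somepdf.ps ]; then
    # Route A: PDF straight to PCL (assumption: ljet4 as the PCL5e device).
    gs -q -o direct.pcl -sDEVICE=ljet4 somepdf.pdf
    # Route B: the outlined PostScript from the command above, then to PCL.
    gs -q -o via_ps.pcl -sDEVICE=ljet4 somepdf.ps
    echo "direct.pcl is $(size_ratio somepdf.pdf direct.pcl)x the size of the PDF"
    echo "via_ps.pcl is $(size_ratio somepdf.pdf via_ps.pcl)x the size of the PDF"
fi
```

If both ratios come out about the same, the device is most likely rasterizing the whole page in either case, and the outlining step buys nothing for PCL output.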

Now vote down this answer too...


Update

Apparently, from version 9.15 (to be released during September/October 2014), Ghostscript will support a new command line parameter:

 -dNoOutputFonts

which will cause the output devices pdfwrite, ps2write and eps2write "to 'flatten' glyphs into 'basic' marking operations (rather than writing fonts to the output)".

That means that the above command should be replaced by this:

 gs -o somepdf.ps -dNoOutputFonts -sDEVICE=ps2write somepdf.pdf

Caveats: I've tested this with a few input files using a self-compiled Ghostscript based on current Git sources. It worked flawlessly in each case.
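If a script has to run against both older and newer Ghostscript installations, the choice between the two commands could be made from the reported version. This is a minimal sketch, not a tested recipe: the helper name `outline_flags` is hypothetical, and it assumes that `gs --version` prints a string of the form `9.15` and that 9.15 is indeed the first release with `-dNoOutputFonts`, as announced above.

```shell
#!/bin/sh
# Print the Ghostscript options for "text as outlines" PostScript output,
# given a version string such as "9.14" or "9.15".
outline_flags() {
    echo "$1" | awk -F. '{
        major = $1 + 0; minor = $2 + 0
        if (major > 9 || (major == 9 && minor >= 15))
            print "-dNoOutputFonts -sDEVICE=ps2write"
        else
            print "-dNOCACHE -sDEVICE=pswrite"
    }'
}

# Only invoke Ghostscript when it is installed and the sample file exists.
if command -v gs >/dev/null 2>&1 && [ -f somepdf.pdf ]; then
    flags=$(outline_flags "$(gs --version)")
    # $flags is left unquoted on purpose so that it splits into two arguments.
    gs -o somepdf.ps $flags somepdf.pdf
fi
```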

Kurt Pfeifle