34

How can I use Unicode characters for labels, titles and similar things in a PDF plot created with ggplot2?

Consider the following example:

library(ggplot2)
qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
ggsave("t.pdf")

The title of the plot uses Unicode characters (small caps), which in the output appear as .... The problem occurs only with pdf plots; if I replace the last line with ggsave("t.png"), then the output is as expected.

What am I doing wrong? The R script I have is in UTF-8 encoding. Some system information:

R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

When searching for a solution to this problem, I found some evidence that R uses a single-byte encoding rather than a multi-byte encoding such as UTF-8 for PDF or PostScript output. I also found suggestions for getting individual characters to work, for instance the Euro sign, but no general solution.

stefan
  • `cairo_pdf("t.pdf"); ...; dev.off()` works for me ... see http://stackoverflow.com/questions/5886018/using-unicode-dingbat-like-glyphs-in-r-graphics-across-devices-platforms-e – Ben Bolker Oct 07 '12 at 15:17

3 Answers

25

As Ben suggested, cairo_pdf() is your friend. It also allows you to embed non-postscript fonts (i.e. TTF/OTF) in the PDF via the family argument (crucial if you don't happen to have any postscript fonts that contain the glyphs you want to use). For example:

library(ggplot2)
cairo_pdf("example.pdf", family="DejaVu Sans")
qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
dev.off()

...gives a PDF that looks like this:

[image: ggplot2 graph with custom font family and non-ASCII characters in the title]

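See also this question (http://stackoverflow.com/questions/12378620/ttf-otf-font-selection-in-r-for-windows-onscreen-vs-pdf); though it doesn't look directly relevant from the title, there is a lot in there about getting fonts to do what you want in R.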

EDIT: Per request in the comments, here is the Windows-specific code:

library(ggplot2)
windowsFonts(myCustomWindowsFontName=windowsFont("DejaVu Sans"))
cairo_pdf("example.pdf", family="myCustomWindowsFontName")
qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
dev.off()

To use the base graphics command cairo_pdf() it should suffice to just define your font family with the windowsFonts() command first, as shown above. Of course, make sure you use a font that you actually have on your system, and that actually has all the glyphs that you need.
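To see which glyphs a font has to provide, it can help to list the code points in the title and compare them against the font's glyph table. A minimal sketch in base R, using only the first five characters of the example title (written with \u escapes so the snippet does not depend on the file encoding); the variable names are just for illustration:

```r
# First five characters of the example title ("Aʙᴄᴅᴇ"), written as escapes
title <- "A\u0299\u1d04\u1d05\u1d07"

# Convert the UTF-8 string to integer code points and format them as U+XXXX
code_points <- utf8ToInt(title)
labels <- sprintf("U+%04X", code_points)
print(labels)
# "U+0041" "U+0299" "U+1D04" "U+1D05" "U+1D07"

# Code points above U+00FF cannot be represented in Latin-1
outside_latin1 <- labels[code_points > 0xFF]
print(outside_latin1)
```

Everything in `outside_latin1` is a character that a single-byte Latin-1 encoding cannot carry, so the chosen font and device both need to handle it.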

The instructions about DLL files in the comments below are what I had to do to get the Cairo() and CairoPDF() commands in library(Cairo) to work on Windows. Then:

library(ggplot2)
library(Cairo)
windowsFonts(myCustomWindowsFontName=windowsFont("DejaVu Sans"))
CairoPDF("example.pdf")
par(family="myCustomWindowsFontName")
qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
dev.off()
drammock
  • Thanks, this works for me on Linux. I have not gotten it to work on Windows yet, neither with the code you provided nor with `CairoPDF`. – stefan Oct 09 '12 at 07:43
  • Getting the Cairo package to work on Windows is tricky. I wrote up a little tutorial for it [here](https://raw.github.com/drammock/phonR/master/installingCairo.txt). Does that help? – drammock Oct 09 '12 at 16:21
  • To get Cairo library to work on Windows, go to [this page](http://www.gtk.org/download/), click Windows (32/64bit), & under "Required third party dependencies" download the run-time files for zlib, cairo, libpng, fontconfig, freetype, and expat. Unzip and collect all DLLs and put them in: C:\Program Files\R\R-2.14.0\bin\i386 (or on 64bit systems C:\Program Files\R\R-2.14.0\bin\x64). Also move fonts.conf file from fontconfig zip file into C:\Program Files\R\R-2.14.0\etc\i386\fonts\. On 64bit systems replace "i386" with "x64" in pathnames; also sub 2.14.0 with whatever your version of R is. – drammock Oct 11 '12 at 16:21
  • Thanks for your help, but it still is not working for me. I followed all your steps, but when I execute the code from your example above, I still get garbage as a title. The axis labels work correctly (also when using a different font), just the title that uses the Unicode characters is messed up. The output is "null device", not sure if that is expected. – stefan Oct 12 '12 at 07:07
  • windows-specific code added, drawn from [this question](http://stackoverflow.com/questions/12378620/ttf-otf-font-selection-in-r-for-windows-onscreen-vs-pdf) already mentioned in the original answer. If **that** doesn't work, then probably your version of R was compiled without support for cairo-based graphics. – drammock Oct 13 '12 at 04:26
  • Thanks drammock, I appreciate you trying to help to get this working on Windows. However, the code you posted does not work for me, I still get garbage output in the PDF. No error is shown, except for "null device" mentioned earlier. – stefan Oct 15 '12 at 07:04
  • well, assuming you changed "DejaVu Sans" to some other font that you actually have on your windows system, you've exhausted my expertise. FYI, "null device" is not an error, it just tells you which graphical device is the currently active one. "null device" means you have closed the only graphical device that was open. – drammock Oct 15 '12 at 16:50
  • Ok, thanks. I really appreciate your help, and for now producing the graphs on Linux works fine for me. And indeed, I experimented with various fonts that are available on my system, without luck. – stefan Oct 15 '12 at 20:27
  • No problem. I made another edit adding the CairoPDF() code, check that against what you've tried just to make sure you didn't miss something small. – drammock Oct 17 '12 at 17:29
  • Still does not work, and I get a warning that CairoFonts() does not have an effect on Windows and one should use par(family=..) instead. – stefan Oct 18 '12 at 08:28
  • right, sorry, I don't use Windows often and forgot about that issue. I've corrected the answer now, but I'm out of ideas (other than "stick with Linux"). – drammock Oct 18 '12 at 17:42
  • ... No it isn't anymore; see [my update](https://stackoverflow.com/a/64471402/435004). – DomQ Oct 21 '20 at 20:55
  • cairo is also problematic due to its poor performance with small font spacing https://stackoverflow.com/questions/65188058/font-spacing-whern-using-cairo-pdf-device – kennyB Mar 04 '21 at 08:43
9

As of 2020 and R version 4.0.3, cairo_pdf() is not your friend anymore on Mac OS X, at least as far as Cyrillic is concerned — See Fail Gallery below.

TL;DR

If you must have Cyrillic, just go back to good ole png driver. (And kiss your antialiased diagrams goodbye.)

R -e 'png(filename = "ftw.png"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open ftw.png

What is old, is new again.

Or if you use Rmarkdown with knitr:

R -e 'rmarkdown::render("foo.Rmd", "pdf_document", output_file="foo.pdf", runtime = "static", output_options = list(dev = "png"))'

The Fail Gallery

The “modern” approach with Cairo fails in v4.0.3 as demonstrated below. Note that this is not (or not only) a font embedding or rendering problem, since selecting and pasting text out of the generated PDFs also produces garbled output.

Prep steps:

  1. install the latest R (version 4.0.3 or higher, with all capabilities() showing TRUE)
  2. R -e 'install.packages(c("Cairo", "ggplot2"), repos="https://cloud.r-project.org")'
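To confirm the first prep step, the relevant capabilities can be queried from within R. A minimal sketch, assuming only the "cairo" and "png" entries matter for the examples in this answer (the variable names are illustrative):

```r
# Query only the capabilities that the examples in this answer rely on
caps <- capabilities(c("cairo", "png"))
print(caps)

# Collect the names of requested capabilities that are not reported as TRUE
# (%in% is used so that a possible NA counts as "not available")
not_available <- names(caps)[!(caps %in% TRUE)]
if (length(not_available) > 0) {
  message("Not available in this build of R: ",
          paste(not_available, collapse = ", "))
}
```

If "cairo" is reported as FALSE, cairo_pdf() cannot be expected to work in that build, regardless of the font chosen.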

Vanilla config

R -e 'library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); ggsave("fail1.pdf")'
open fail1.pdf

[image: Fail Gallery: vanilla config]

Using cairo_pdf() alone

R -e 'cairo_pdf("fail2.pdf"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open fail2.pdf

[image: Fail Gallery: using cairo_pdf() alone]

Using cairo_pdf() with a custom (supposedly Unicode-compliant) font

R -e 'cairo_pdf("fail3.pdf", family = "Arial Unicode MS"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open fail3.pdf

This is as close as it gets to working with “modern” approaches.

Another attempt with Comic Sans for good measure:

R -e 'cairo_pdf("fail3bis.pdf", family = "Comic Sans MS"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open fail3bis.pdf

[image: Fail Gallery: using cairo_pdf() with family = "Comic Sans MS"]

A few more...

With the older "Dark and Stormy Night" version (3.6.2):

/Library/Frameworks/R.framework/Versions/3.6/Resources/bin/R -e 'cairo_pdf("fail4.pdf", family = "Arial Unicode MS"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open fail4.pdf


And with DejaVu Sans as suggested by @drammock:

R -e 'cairo_pdf("fail5.pdf", family = "DejaVu Sans"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open fail5.pdf


DejaVu Sans on older R:

/Library/Frameworks/R.framework/Versions/3.6/Resources/bin/R -e 'cairo_pdf("fail5bis.pdf", family = "DejaVu Sans"); library(ggplot2); qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"); dev.off()'
open fail5bis.pdf


DomQ
  • Just because a font is "unicode compliant" doesn't mean it contains glyphs at every single codepoint. Does it fail if you use the font shown in my answer (DejaVu Sans)? Many of the small-cap glyphs in my answer are in the "phonetic extensions" block of unicode which are most likely not present in Comic Sans or Arial MS. – drammock Oct 21 '20 at 22:38
  • @drammock DejaVu Sans is the one that comes closest to working indeed (tied with Arial Unicode MS as far as the number of successfully rendered glyphs is concerned). Updated the Fail Gallery. – DomQ Oct 21 '20 at 23:16
  • strange that it doesn't work even with DejaVu Sans. Page 22 of the DejaVu Sans glyph tables (http://dejavu.sourceforge.net/samples/DejaVuSans.pdf) suggests that it does indeed have glyphs for ᴄ,ᴅ,ᴇ at least, which aren't appearing for you (I didn't check every glyph). Notably, all the failing codepoints are `U+1xxx` while all the successful codepoints are `U+0xxx` so I suspect an encoding problem rather than a font problem. – drammock Oct 22 '20 at 14:09
  • I am glad I am not the only one with that problem. Makes me feel less alone – tjebo Feb 17 '21 at 10:22
2

If you are using ggsave(...), you can call ggsave(..., device=cairo_pdf).

You will need to first install and load the Cairo bindings.

install.packages("Cairo")
library(Cairo)

Here is a full example (not my work).
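A minimal self-contained sketch of the ggsave(..., device=cairo_pdf) call, assuming ggplot2 is installed and R was built with cairo support. The output path, plot size, and shortened title are illustrative choices, not taken from the question. Note that cairo_pdf() itself is part of grDevices, so this sketch does not load the Cairo package.

```r
library(ggplot2)

# Shortened version of the example title ("Aʙᴄᴅᴇ"), written with \u escapes
# so that the snippet does not depend on the script's file encoding
plot_title <- "A\u0299\u1d04\u1d05\u1d07"

p <- ggplot(iris, aes(Sepal.Length, Petal.Length)) +
  geom_point() +
  ggtitle(plot_title)

# Write to a temporary directory; replace with "t.pdf" for real use
out_file <- file.path(tempdir(), "t.pdf")

if (capabilities("cairo")) {
  # Pass the device function itself, not a string
  ggsave(out_file, plot = p, device = cairo_pdf, width = 7, height = 5)
} else {
  message("This build of R has no cairo support; cairo_pdf is unavailable.")
}
```

Whether the glyphs actually render still depends on the font that cairo selects; ggsave() forwards extra arguments to the device, so a family argument as in the other answers can be added if needed.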

vlopez