
Some of you may have seen my blog post on this topic, where I wrote the following code after wanting to help a friend produce half-filled circles as points on a graph:

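TestUnicode <- function(start="25a0", end="25ff", ...)
  {
    ## character arguments are read as hexadecimal code points
    nstart <- as.hexmode(start)
    nend <- as.hexmode(end)
    r <- nstart:nend
    ## side length of the smallest square grid that holds all symbols
    s <- ceiling(sqrt(length(r)))
    par(pty="s")
    plot(c(-1,(s)), c(-1,(s)), type="n", xlab="", ylab="",
         xaxs="i", yaxs="i")
    grid(s+1, s+1, lty=1)
    for(i in seq_along(r)) {
      ## a negative pch is interpreted as a Unicode code point
      try(points(i%%s, i%/%s, pch=-1*r[i],...))
    }
  }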

TestUnicode(9500,9900)  ## numeric arguments are decimal code points (U+251C to U+26AC)

This works (i.e. produces a nearly-full grid of cool dingbatty symbols):

  • on Ubuntu 10.04, in an X11 or PNG device
  • on Mandriva Linux, same devices, with a locally built R, once pango-devel was installed

It fails to varying degrees (i.e. produces a grid partly or entirely filled with dots or empty rectangles), either silently or with warnings:

  • on the same Ubuntu 10.04 machine in PDF or PostScript (tried setting font="NimbusSan" to use URW fonts, doesn't help)
  • on Mac OS X 10.6 (quartz, X11, Cairo, PDF)

For example, trying all the available PDF font families:

flist <- c("AvantGarde", "Bookman","Courier", "Helvetica", "Helvetica-Narrow",
        "NewCenturySchoolbook", "Palatino", "Times","URWGothic",
        "URWBookman", "NimbusMon", "NimbusSan", "NimbusSanCond",
        "CenturySch", "URWPalladio","NimbusRom")

for (f in flist) {
  fn <- paste("utest_",f,".pdf",sep="")
  pdf(fn,family=f)
  TestUnicode()
  title(main=f)
  dev.off()
  embedFonts(fn)
}

On Ubuntu, none of these files contains the symbols.

It would be nice to get it to work on as many combinations as possible, but especially in some vector format and double-especially in PDF.

Any suggestions about font/graphics device configurations that would make this work would be welcomed.

Ben Bolker

4 Answers


I think you are out of luck, Ben: according to some notes by Paul Murrell, pdf() can only handle single-byte encodings. Multi-byte encodings need to be converted to the single-byte equivalent, and therein lies the rub; by definition, a single-byte encoding cannot contain all the glyphs that can be represented in a multi-byte encoding such as UTF-8.
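To make the encoding point concrete, here is a minimal sketch (not taken from Paul's notes) that tries to convert one of the half-filled circles, U+25D0, from UTF-8 to Latin-1. The choice of Latin-1 as the single-byte target is just an assumption for illustration; what comes back for a character with no equivalent depends on the platform's iconv implementation, typically NA, sometimes a substitute character.

```r
# Sketch: a glyph outside a single-byte encoding cannot be converted to it.
glyph <- intToUtf8(0x25D0)            # U+25D0, a half-filled circle

# In UTF-8 this one character occupies three bytes.
n_bytes <- length(charToRaw(glyph))
print(n_bytes)

# Latin-1 (assumed here as the single-byte target) has no slot for it.
# R's iconv() usually returns NA for input it cannot convert; some
# platform iconv implementations substitute another character instead.
converted <- iconv(glyph, from = "UTF-8", to = "latin1")
print(converted)
```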

Paul's notes can be found here; in them he suggests a couple of solutions based on Cairo PDF devices: cairo_pdf() on suitably endowed Linux and Mac OS systems, or the Cairo package under MS Windows.
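A minimal sketch of the cairo_pdf() route, assuming R was built with cairo support and that some installed font actually contains the glyphs (the output path and the four code points are arbitrary choices for illustration). If no suitable font is found, the file is still written but the symbols may come out as empty boxes.

```r
# Sketch: draw four half-filled circles with the Cairo-based PDF device.
out <- file.path(tempdir(), "unicode_cairo.pdf")   # hypothetical output file
have_cairo <- capabilities("cairo")[[1]]

if (have_cairo) {
  cairo_pdf(out, width = 5, height = 5)
  codes <- 0x25D0:0x25D3                  # U+25D0 to U+25D3
  x <- seq_along(codes)
  plot(x, rep(1, length(codes)), type = "n", xlab = "", ylab = "",
       xlim = c(0, length(codes) + 1))
  # negative pch values are interpreted as Unicode code points
  points(x, rep(1, length(codes)), pch = -codes, cex = 3)
  dev.off()
} else {
  message("This R build lacks cairo support; cairo_pdf() is unavailable.")
}
```

The same device call can be wrapped around TestUnicode() from the question to check a whole range at once.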

Josh O'Brien
Gavin Simpson

I have found the cairo_pdf device to be completely insufficient: the output is markedly different from both pdf and on-screen rendering, and its plotmath support is sketchy.

However, there’s a rather simple workaround on OS X: Use the “normal” quartz device and set its type to pdf:

quartz(type = 'pdf', file = 'output.pdf')

Unfortunately, on my computer this ignores the font family and always uses Helvetica (although the documentation claims that the default is Arial).

There are at least two other gotchas:

  • pdf converts hyphens to minuses. This may not even always be what you want but it’s quite useful to properly typeset negative numbers. The linked thread describes workarounds for this.
  • It’s of course platform specific and only works on OS X.

(I realise that OP briefly mentions the Quartz device but this thread is frequently viewed and I think this solution needs more prominence.)
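For anyone who wants to try the quartz route on the symbols from the question, here is a guarded sketch. The output path and the code points are assumptions for illustration, and the guard simply skips the plot on anything that is not a macOS build of R with quartz available.

```r
# Sketch: write Unicode point symbols to PDF through the quartz device.
out <- file.path(tempdir(), "unicode_quartz.pdf")  # hypothetical output file
on_mac <- identical(Sys.info()[["sysname"]], "Darwin") &&
  capabilities("aqua")[[1]]

if (on_mac) {
  quartz(type = "pdf", file = out, width = 5, height = 5)
  codes <- 0x25D0:0x25D3                  # half-filled circles
  x <- seq_along(codes)
  plot(x, rep(1, length(codes)), type = "n", xlab = "", ylab = "",
       xlim = c(0, length(codes) + 1))
  # negative pch values are interpreted as Unicode code points
  points(x, rep(1, length(codes)), pch = -codes, cex = 3)
  dev.off()
} else {
  message("quartz is only available on macOS builds of R; skipping.")
}
```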

Konrad Rudolph
  • can you be more specific/give specific examples of the problems, especially those with plotmath rendering? It would probably help to give the results of `sessionInfo()` too ... – Ben Bolker Oct 26 '13 at 21:11
  • @Ben Hmm, since this post isn’t really about plotmath, maybe a comment suffices. Tell me if you feel otherwise. Here goes: Cairo only implements a subset of plotmath; it doesn’t implement some of the symbols (see `demo(plotmath)`), some of the spaces are off, and it doesn’t support italic text (with Helvetica at least). [R v3.0.1, x86_64-apple-darwin10.8.0 in case that’s relevant, but I doubt it] – Konrad Rudolph Oct 26 '13 at 22:02

Another solution might be to use tikzDevice which can now use XeLaTeX with Unicode characters. The resulting tex file can then be compiled to produce a pdf. The problem is still that you must have a font on your system that contains the characters.

library(tikzDevice)
options(tikzXelatexPackages=c(getOption('tikzXelatexPackages'),
    '\\setromanfont{Courier New}'))
tikz(engine='xetex',standAlone=TRUE)
TestUnicode(9500,9900)
dev.off()

The first time, this will take a LONG time.

cameron.bracken
  • Hmmm. Tip appreciated. I installed XeTeX (on Ubuntu, `apt-get install texlive-xetex`), but I don't seem to have "Courier New" on my (Ubuntu 10.04) system (or at least XeTeX can't find it: normally `apt-get install` runs all of the TeX updates necessary ...). Suggestions for how to guess/find an appropriate font? – Ben Bolker May 10 '11 at 13:11
  • `fc-list | grep -i ding` should show a list of fonts installed on your computer that contain the word "ding" in their names. XeTeX should be able to access these using `fontspec` commands. – Sharpie May 10 '11 at 17:30
  • `Courier New` was just an arbitrary choice, as Sharpie mentioned, you will need to find a font on your system that has the symbols. – cameron.bracken May 10 '11 at 18:00
  • Double hmmm. I pursued this a little way because it seems like a nice alternative, and might work on some systems where the cairo_pdf() solution does not, but ... all I know in my current state of cluelessness is that `cairo_pdf` seems to find a suitable font automatically (at least on Mandriva or Ubuntu systems with pango installed), whereas it would take me a while to dig through and figure out how to find the appropriate fonts, translate the names to their XeTeX equivalents, etc. – Ben Bolker May 10 '11 at 21:51
  • (Was running out of room in the previous comment.) The `fc-list` command given above returns `Dingbats:style=Regular` on my system. I'm not sure I know how to translate this to a valid XeTeX font specification ... naively changing `Courier New` to `Dingbats` in the code above fails. Bottom line: this seems interesting, but (given that I have a solution that works for me) probably not worth my effort pursuing at this point. Hopefully it will be useful to someone else. – Ben Bolker May 10 '11 at 21:59
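On the question of how to find a suitable font: rather than grepping font names, fontconfig can be asked which installed fonts cover a given code point. The sketch below is an assumption-laden illustration, not something tested with tikzDevice: it needs the `fc-list` utility on the PATH, uses U+25D0 as the probe character, and only lists family names. Whether XeTeX accepts a given name still has to be checked separately.

```r
# Sketch: list installed font families that contain U+25D0,
# using fontconfig's charset pattern (requires fc-list on the PATH).
fonts <- character(0)
fc <- Sys.which("fc-list")

if (nzchar(fc)) {
  # ":charset=25d0" restricts the listing to fonts covering that code point;
  # "family" asks fc-list to print only the family name of each match.
  fonts <- system2(fc, c(shQuote(":charset=25d0"), "family"), stdout = TRUE)
  fonts <- sort(unique(fonts))
  print(fonts)
} else {
  message("fc-list not found; fontconfig does not appear to be installed.")
}
```

Any family printed here is a candidate to try in place of `Courier New` in the tikzDevice options above.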

Have you tried embedding a font in the PDF, or including one for Mac users that would work?

Andrea
  • Thanks. Can you be slightly more specific? R has an `embedFonts()` function, but I believe that's intended to post-process a PDF/PostScript to make sure that fonts that are present on the current system get embedded in the file. This is (I think) a different situation, where the fonts used by R for the pdf don't include the glyphs in the first place. For example, `pdf("test.pdf", family="NimbusSan"); TestUnicode(); dev.off()` fails. – Ben Bolker May 04 '11 at 15:47