Save unicode characters to .pdf in R

Question

I would like to save specific unicode characters to a pdf file with ggsave.

Example code

library(ggplot2)

ggplot() +
  geom_point(data = data.frame(x=1, y=1), aes(x,y), shape = "\u2191") +
  geom_point(data = data.frame(x=2, y=2), aes(x,y), shape = "\u2020")

ggsave("test.pdf", plot = last_plot(), width = 40, height = 40, units = "mm")

However, when saving the .pdf the unicode characters are transformed to three dots...

Attempts to fix it

I tried to use the cairo_pdf device in ggsave -> didn't work.
Used this post to plot the unicode characters, but didn't quite understand it...

Question

How do I use both unicode characters in a pdf?

> sessionInfo()
R version 3.6.2
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.5

I hate to be the bearer of bad news but unfortunately this problem doesn’t have a proper solution. In principle you can set a different encoding with `pdf.options` but the supported encodings are platform dependent (see `dir(system.file('enc', package = 'grDevices'))`), and I couldn’t find any encoding that supported both symbols on macOS. You could write a *custom* encoding file. I don’t recommend it. — Konrad Rudolph, Jul 10 '20 at 15:51
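
To make the comment above concrete, here is a minimal sketch of how one might inspect the encodings that `pdf()` can use on a given machine. It relies only on base R (`grDevices`); the variable name `enc_files` is illustrative, and the list it returns differs between platforms, so no particular output is assumed.

```r
# List the encoding files shipped with grDevices on this platform.
# The exact set of files is platform dependent.
enc_files <- dir(system.file("enc", package = "grDevices"))
print(enc_files)

# Show the encoding that pdf() currently uses by default.
print(pdf.options()$encoding)
```

Any of the listed files can be passed to `pdf()` through its `encoding` argument, but as the comment notes, a single encoding that covers both symbols may not exist.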

score 6 · Accepted Answer · answered Jul 10 '20 at 17:21


This seems to work on my mac:

library(tidyverse)

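# Open a Quartz PDF device (macOS only)
quartz(type = 'pdf', file = 'test.pdf')

ggplot() +
    geom_point(data = data.frame(x=1, y=1), aes(x,y), shape = "\u2191") +
    geom_point(data = data.frame(x=2, y=2), aes(x,y), shape = "\u2020")

# Close the device so the file is written to disk
dev.off()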

Using the suggestion from here: https://stackoverflow.com/a/44548861/1827


Shabaz


As a Linux user, I simply use `dev.copy2pdf` with `out.type="cairo"` – user3236841 Aug 05 '22 at 16:45

score 4 · Answer 2 · edited May 19 '23 at 11:24


Using ggsave() with unicode characters and PDFs is a bit touchy. Try opening the device explicitly instead. It does not work for me with pdf(), but cairo_pdf() worked.

p <- ggplot() +
  geom_point(data = data.frame(x=1, y=1), aes(x,y), shape = "\u2191", size=4) +
  geom_point(data = data.frame(x=2, y=2), aes(x,y), shape = "\u2020", size=4)

Then compare these:

# using pdf() gives me warnings and does not work
pdf('test.pdf')
print(p)
dev.off()

# using cairo_pdf() works
cairo_pdf('test_cairo.pdf')
print(p)
dev.off()
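
For comparison, the same device can also be handed to ggsave() through its `device` argument, which is the variant the question mentions trying. The following is a sketch only: the output file name `test_ggsave_cairo.pdf` is made up for illustration, and the `capabilities("cairo")` guard is an added assumption so that the code does not fail on R builds without cairo support. Whether both symbols render correctly still depends on the platform and the installed fonts.

```r
library(ggplot2)

p <- ggplot() +
  geom_point(data = data.frame(x = 1, y = 1), aes(x, y), shape = "\u2191", size = 4) +
  geom_point(data = data.frame(x = 2, y = 2), aes(x, y), shape = "\u2020", size = 4)

out_file <- "test_ggsave_cairo.pdf"  # hypothetical file name

if (capabilities("cairo")) {
  # ggsave() opens cairo_pdf(), draws the plot, and closes the device
  ggsave(out_file, plot = p, device = cairo_pdf,
         width = 40, height = 40, units = "mm")
} else {
  message("This R build has no cairo support, so cairo_pdf() is unavailable.")
}
```

This should behave like the explicit cairo_pdf() / print() / dev.off() sequence, so it is a convenience rather than a different fix.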


user1329307


answered Jul 10 '20 at 18:01

chemdork123



Unfortunately `cairo_pdf` (at least on macOS) gives really bad output. :-( Text rendering is slightly but jarringly off (especially kerning). In addition, this does *not* actually fix the Cairo issue: only one of the two symbols is printed correctly on macOS. Using this instead of `ggsave(…, device = cairo_pdf)` makes absolutely no difference. – Konrad Rudolph Jul 10 '20 at 20:45

James Silva · Answer 3 · 2020-09-18T13:42:56.163


You are welcome to check my answer to a similar question here: https://stackoverflow.com/questions/12096152/plotting-symbols-fails-in-pdf/63214207?r=SearchResults&s=2|0.0000#63214207

But here is the solution for your problem.

#--- A function to install missing packages and load them all
myfxLoadPackages = function (PACKAGES) {
  lapply(PACKAGES, FUN = function(x) {
    if (suppressWarnings(!require(x, character.only = TRUE))) {
      install.packages(x, dependencies = TRUE, repos = "https://cran.rstudio.com/")
    }
  })
  lapply(PACKAGES, FUN = function(x) library(x, character.only = TRUE))
}

packages = c("ggplot2","gridExtra","grid","png")
myfxLoadPackages(packages)

#--- The trick to get unicode characters being printed on pdf files:
#--- 1. Create a temporary file, say "temp.png"
#--- 2. Create the pdf file using pdf() or cairo_pdf(), say "UnicodeToPDF.pdf"
#--- 3. Combine the use of grid.arrange (from gridExtra), rasterGrob (from grid), and readPNG (from png) to insert the
#       temp.png file into the UnicodeToPDF.pdf file
test.plot = ggplot() +
  geom_point(data = data.frame(x=1, y=1), aes(x,y), shape = "\u2191", size=3.5) +
  geom_point(data = data.frame(x=2, y=2), aes(x,y), shape = "\u2020", size=3.5) +
  geom_point(data = data.frame(x=1.2, y=1.2), aes(x,y), shape = -10122, size=3.5, color="#FF7F00") +
  geom_point(data = data.frame(x=1.4, y=1.4), aes(x,y), shape = -129322, size=3.5, color="#FB9A99") +
  geom_point(data = data.frame(x=1.7, y=1.7), aes(x,y), shape = -128515, size=5, color="#1F78B4")
ggsave("temp.png", plot = test.plot, width = 80, height = 80, units = "mm")
#--- Refer to http://xahlee.info/comp/unicode_index.html to see more unicode character integers

pdf("UnicodeToPDF.pdf")
grid.arrange(
  rasterGrob(
    readPNG(
      "temp.png",
      native = FALSE
    )
  )
)
dev.off()

file.remove("temp.png")

The following image has been added to follow up on Konrad Rudolph's comments.

edited Sep 18 '20 at 13:42

answered Sep 16 '20 at 21:36

James Silva


This is fundamentally not equivalent to, and not a substitute for, saving to PDF. You are rasterising the image to PNG and put that into a PDF. That isn’t the same at all (zoom in!), and in most cases you don’t need to bother about the PDF at all then — just stick with the PNG. – Konrad Rudolph Sep 17 '20 at 07:29
Hi Konrad. With all due respect, I think the one who needs to zoom in is you. If you scroll to the top you will see how the original issue was written: "I would like to save specific unicode characters to a pdf file with ggsave.". That is exactly what I addressed. Enjoy! – James Silva Sep 17 '20 at 13:01
You may be addressing a narrow reading of the question but, nevertheless, this is (unfortunately!) a *bad* solution, for the reason explained in my previous comment. The whole point of using PDF instead of PNG is to get vector graphics; your answer doesn’t accomplish this. – Konrad Rudolph Sep 17 '20 at 13:34
I understand your frustration and agree with you that the pdf() function has a fundamental bug. I apologize if my practical solution does not fit your needs and expectations. But I hope it helps other users until a fundamental and technically sound solution is published. – James Silva Sep 17 '20 at 15:04
Well, I spent a few minutes putting the two images side by side, zooming in as much as I could, and I could not spot any difference. From a practical perspective, it would be extremely difficult for the human eye to look at the two images, grasp any difference, and detect the lack of vector graphics; therefore, the decision whether or not the provided solution is as bad as you think for a very specific visualization should be left to the user. It is good that you have encouraged the community to zoom in first, before making a final call. Thanks – James Silva Sep 18 '20 at 12:27
For context, here’s the side by side zoomed in view: https://i.imgur.com/tLX7VAv.png – Konrad Rudolph Sep 18 '20 at 12:58
Interesting. Using Acrobat Reader to zoom in I still see the desired unicode characters. Perhaps it has to do with me using R-4.0.1 64 under Windows 10. Indeed, looking at what you got really clarified our concern. Thanks for your contribution. – James Silva Sep 18 '20 at 13:53
