
I have unicode text that includes emoji. I'd like to render them in a ggplot2 graphic with geom_text or geom_label in a way that includes the emoji's colour. I've looked at emojifont, emo and ggtext and none of these seem to allow this. The issue of course is that the colour of the text in geom_text is governed by the colour aesthetic. Is there any way I can get colours rendered in my text, either through geom_text or some other workaround?

Reproducible example:

library(ggplot2)

pets <- "I like \U0001f436 \U0001f431 \U0001f41f \U0001f422"

cat(pets)

ggplot() +
  theme_void() +
  annotate("text", x = 1, y = 1, label = pets, size = 15)

The cat(pets) works on screen in RStudio, but the graphic drawn with the last line looks like this:

[image: the plot renders the emoji as black-and-white glyphs]

Alternatively, with ggtext::geom_richtext() I get a similar black-and-white result and these warnings:

> library(ggtext)
> ggplot() +
+   theme_void() +
+   annotate("richtext", x = 1, y = 1, label = pets, size = 15)
Warning messages:
1: In text_info(label, fontkey, fontfamily, fontface, fontsize, cache) :
  unable to translate '<U+0001F436>RStudioGD142.6791338582677' to native encoding
2: In text_info(label, fontkey, fontfamily, fontface, fontsize, cache) :
  unable to translate '<U+0001F431>RStudioGD142.6791338582677' to native encoding
3: In text_info(label, fontkey, fontfamily, fontface, fontsize, cache) :
  unable to translate '<U+0001F41F>RStudioGD142.6791338582677' to native encoding
4: In text_info(label, fontkey, fontfamily, fontface, fontsize, cache) :
  unable to translate '<U+0001F422>RStudioGD142.6791338582677' to native encoding
5: In do.call(gList, grobs) :
  unable to translate 'I like <U+0001F436> <U+0001F431> <U+0001F41F> <U+0001F422>' to native encoding
Peter Ellis

1 Answer


OK, here's an answer to my own question.

Overall approach: we convert each emoji to an HTML img tag that points to an image of the emoji, and use ggtext to render the resulting combination of text and images.

First we need a vector of all emoji so down the track we will be able to recognise them:

library(tidyverse)
library(ggtext)
library(rvest)

# test vector
pets <- "I like \U0001f436 \U0001f431 \U0001f41f \U0001f422"

# the definitive web page with emoji:
unicode <- read_html("https://unicode.org/emoji/charts/full-emoji-list.html")

ut <- unicode %>%
  html_node("table") %>%
  html_table()

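# vector of all emoji - purely for recognition purposes
# [[3]] always returns a plain character vector; ut[,3] returns a one-column
# tibble when html_table() gives a tibble, which breaks the %in% test used later
all_emoji <- ut[[3]]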

Then I borrow with virtually no alteration several functions from this page by Emil Hvitfeldt. Emil had a similar challenge to me, but without the problem of the original emoji just being text.

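emoji_to_link <- function(x) {
  # look the emoji up on Emojipedia and take the first link found in a table cell
  paste0("https://emojipedia.org/emoji/",x) %>%
    xml2::read_html() %>%
    rvest::html_nodes("tr td a") %>%
    .[1] %>%
    rvest::html_attr("href") %>%
    # follow that link and return the src of the first vendor image on the page
    paste0("https://emojipedia.org/", .) %>%
    xml2::read_html() %>%
    rvest::html_node('div[class="vendor-image"] img') %>%
    rvest::html_attr("src")
}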

link_to_img <- function(x, size = 24) {
  paste0("<img src='", x, "' width='", size, "'/>")
}

Those functions take an emoji and convert it into an img tag pointing to an image of the emoji as rendered by the Apple Color Emoji font. So far so good, but I need to extract the emoji from my mixed text in the first place. To do this I wrote two more functions:

  • to convert an individual token (where a token might be an individual emoji) into an img tag if it is an emoji, or return it as unchanged text if it is not; and
  • to tokenize a text string, convert any emoji tokens to images, and then paste them all back together again.

Here are those two functions:

token_to_rt <- function(x){
  if(x %in% all_emoji){
    y <- link_to_img(emoji_to_link(x))
  } else {
    y <- x
  }
  return(y)
}

string_to_rt <- function(x){
  tokens <- str_split(x, " ", simplify = FALSE)[[1]]
  y <- lapply(tokens,  token_to_rt)
  z <- do.call(paste, y)
  return(z)
}
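To see what the tokenising step produces without touching the network, here is a minimal sketch using stand-ins. known_emoji and fake_link are hypothetical replacements for all_emoji and emoji_to_link, and the example.com URL is a placeholder, not a real image:

```r
library(stringr)

# Stand-ins, for illustration only (assumptions, not part of the workflow above)
known_emoji <- c("\U0001f436", "\U0001f431")
fake_link <- function(x) "https://example.com/emoji.png"

# Same definition as above, repeated so this block runs on its own
link_to_img <- function(x, size = 24) {
  paste0("<img src='", x, "' width='", size, "'/>")
}

# Split on spaces, exactly as string_to_rt() does
tokens <- str_split("I like \U0001f436", " ")[[1]]

# Replace emoji tokens with img tags and leave other tokens untouched
converted <- vapply(
  tokens,
  function(tok) if (tok %in% known_emoji) link_to_img(fake_link(tok)) else tok,
  character(1),
  USE.NAMES = FALSE
)

demo <- paste(converted, collapse = " ")
demo
# [1] "I like <img src='https://example.com/emoji.png' width='24'/>"
```

Only the emoji token is rewritten; the plain words pass through and the pieces are rejoined with single spaces.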

Now we have everything we need. First I convert my pets vector into pets2, then I can use ggplot2 and ggtext to render it on screen, in glorious colour:

pets2 <- string_to_rt(pets)

ggplot() +
  theme_void() +
  annotate("richtext", x = 1, y = 1, label = pets2, size = 15)

There we are:

[image: the plot shows "I like" followed by the dog, cat, fish and turtle emoji in full colour]

For completeness, here's how the key objects pets, pets2 and all_emoji look when just printed in the R console:

> pets
[1] "I like \U0001f436 \U0001f431 \U0001f41f \U0001f422"
> pets2
[1] "I like <img src='https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/120/apple/237/dog-face_1f436.png' width='24'/> <img src='https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/120/apple/237/cat-face_1f431.png' width='24'/> <img src='https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/120/apple/237/fish_1f41f.png' width='24'/> <img src='https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/120/apple/237/turtle_1f422.png' width='24'/>"
> all_emoji[1:10]
 [1] "face-smiling" "Browser"      "\U0001f600"            "\U0001f603"            "\U0001f604"            "\U0001f601"           
 [7] "\U0001f606"            "\U0001f605"            "\U0001f923"            "\U0001f602"  
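The printed all_emoji shows that the scraped column also carries table header text such as "face-smiling" and "Browser", so a plain word like "Browser" in the input would be treated as an emoji. One possible guard, sketched here on a hypothetical stand-in vector called scraped, is to keep only entries that contain at least one non-ASCII character:

```r
library(stringi)

# Hypothetical stand-in for the scraped column (the real one comes from ut)
scraped <- c("face-smiling", "Browser", "\U0001f600", "\U0001f603")

# stri_enc_isascii() is TRUE for pure-ASCII strings, i.e. the header text
emoji_only <- scraped[!stri_enc_isascii(scraped)]

length(emoji_only)
# [1] 2
```

This is an assumption about what the non-emoji rows look like; any header text containing non-ASCII characters would slip through.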
Close readers will note that this workflow depends on there being spaces on either side of the emoji. I imagine some extra work would be needed to harden this for use with real data. – Peter Ellis May 27 '20 at 11:02
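One way to drop the dependence on spaces, sketched under assumptions rather than tested on real data, is to split the string into user-perceived characters with stringi instead of splitting on spaces. string_to_rt2, demo_set and fake_img are hypothetical names introduced for this illustration; fake_img stands in for the network lookup so the block runs offline:

```r
library(stringi)

# Hypothetical variant of string_to_rt(): split on character (grapheme)
# boundaries, so emoji are found even when they touch other text
string_to_rt2 <- function(x, emoji_set, convert = identity) {
  chars <- stri_split_boundaries(x, type = "character")[[1]]
  is_emoji <- chars %in% emoji_set
  chars[is_emoji] <- vapply(chars[is_emoji], convert, character(1),
                            USE.NAMES = FALSE)
  paste(chars, collapse = "")
}

# Offline demonstration with a stand-in emoji set and converter
demo_set <- c("\U0001f436", "\U0001f431")
fake_img <- function(e) "<img/>"

hardened <- string_to_rt2("dogs\U0001f436and cats\U0001f431!", demo_set, fake_img)
hardened
# [1] "dogs<img/>and cats<img/>!"
```

With the functions from the answer, the call would presumably look like string_to_rt2(pets, all_emoji, function(e) link_to_img(emoji_to_link(e))). Emoji made of several code points are only matched if the same sequence appears in the recognition vector.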