
I see a lot of examples for getting subscripts and superscripts into plot labels, but I'm curious how to add a subscript to a character vector that can eventually be used for a plot.

Example data and plot:

 library(dplyr)    # %>%, mutate(), case_when()
 library(ggplot2)

 data.frame(site = LETTERS[1:10],
            a1 = c(rep(1, 7), rep(NA, 3)),
            a2 = c(rep(NA, 4), rep(2, 6)),
            minY = sample(1:9, 10, replace = TRUE),
            maxY = sample(10:19, 10, replace = TRUE)) %>% 
 mutate(label = case_when(is.na(a1) ~ paste(site, a2, sep = ""), 
                          is.na(a2) ~ paste(site, a1, sep = ""), 
                          TRUE ~ paste(site, paste(a1, a2, sep = ","), sep = ""))) %>% 
 ggplot() +
 geom_segment(aes(x = label, xend = label, y = minY, yend = maxY))


How can I make the 1's and 2's into subscripts for the plot?

tnt
  • Plain text doesn't really support subscripts. Do you want to use some other markup style? Is this plotting or some other output? – MrFlick Mar 10 '23 at 19:00
  • @MrFlick I do plan on using the label in a plot, but because there were a few steps to generate the label, I was trying to set it up as a character vector in advance. – tnt Mar 10 '23 at 19:07
  • The "how to add subscripts" depends on the rendering mechanism and target format, e.g., R plots (base or grid/ggplot2), or rmarkdown into pdf, html, or docx. What's your intent? – r2evans Mar 10 '23 at 19:11
  • @r2evans I've added a bit more detail about the downstream use of the subscripts in a ggplot and provided an example. – tnt Mar 10 '23 at 19:21

2 Answers


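Build the label programmatically as a plotmath expression string (e.g., A[2], the kind of thing you would otherwise write with expression() or bquote()), and then parse it with scales::label_parse().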

Note that with R's plotmath (which is what renders these labels internally), comma-separated subscripts have to be spelled out explicitly. A single subscript is simply the expression A[1], but A[1,2] won't work because the ,2 is dropped from view; instead we need A[1*','*2], which is the number 1 followed by (multiplied by, in math syntax) the string-literal comma, followed by the number 2.
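
As a rough illustration of why the two spellings behave differently, the base-R sketch below compares how R parses them. The variable names two_args and one_arg are made up for this example, and nothing here draws a plot; it only inspects the parsed calls.

```r
# Compare the parsed structure of the two spellings (base R only).
two_args <- quote(A[1, 2])         # `[` called with two separate index arguments
one_arg  <- quote(A[1 * "," * 2])  # `[` called with one index argument

print(length(two_args))     # function name, A, and two indices
print(length(one_arg))      # function name, A, and one index
print(class(one_arg[[3]]))  # the single index is itself a call to `*`
```

With the second spelling the whole "1 , 2" sequence travels as one subscript, which is the form the answer relies on.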

library(dplyr)    # %>%, mutate()
library(ggplot2)

data.frame(site = LETTERS[1:10],
            a1 = c(rep(1, 7), rep(NA, 3)),
            a2 = c(rep(NA, 4), rep(2, 6)),
            minY = sample(1:9, 10, replace = TRUE),
            maxY = sample(10:19, 10, replace = TRUE)) %>% 
  mutate(
     label = sprintf("%s[%s]", site,             # CHANGED
       mapply(function(x, y) paste(na.omit(c(x, y)), collapse = "*','*"), a1, a2))
  ) %>%
  ggplot() +
    geom_segment(aes(x = label, xend = label, y = minY, yend = maxY)) +
    scale_x_discrete(labels = scales::label_parse()) # NEW

ggplot grob with subscripted labels
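
To check the label strings before they reach ggplot, the label-building step can be run on its own. This is a minimal sketch on three hand-picked rows (the short site, a1, and a2 vectors are stand-ins for the question's columns, not the full data frame), using base R only:

```r
# Three representative rows: only a1, both, only a2.
site <- c("A", "E", "H")
a1   <- c(1, 1, NA)
a2   <- c(NA, 2, 2)

# Join the non-missing subscripts with *','* so the comma survives parsing.
subs   <- mapply(function(x, y) paste(na.omit(c(x, y)), collapse = "*','*"), a1, a2)
labels <- sprintf("%s[%s]", site, subs)
print(labels)

# Each label has to be valid R syntax, since it is parsed into an expression later.
exprs <- parse(text = labels)
print(length(exprs))
```

If parse() fails on labels built from a real dataset, that would point to the characters that need quoting or escaping before label_parse() can handle them.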

Note: this is nearly a duplicate of several questions such as How to use subscripts in ggplot2 legends [R], though I couldn't find quite the right verbiage to address this question.

r2evans
  • works perfectly in the example, but not with my actual dataset. syntax seems to be the same, so I can't figure out what's going on. – tnt Mar 10 '23 at 20:47

We convert each label to a LaTeX string and pass it to TeX() from the latex2exp package. DF is the data frame defined in the question, also shown in the Note at the end.

library(dplyr)
library(ggplot2)
library(latex2exp)

DF %>% 
  ggplot() +
  geom_segment(aes(x = label, xend = label, y = minY, yend = maxY)) +
  scale_x_discrete(labels = ~ TeX(sub("(.)([0-9,]+)", "$\\1_{\\2}$", .x)))


Note

This is as in the question except we have separated it out and named it DF.

DF <- data.frame(site = LETTERS[1:10],
            a1 = c(rep(1, 7), rep(NA, 3)),
            a2 = c(rep(NA, 4), rep(2, 6)),
            minY = sample(1:9, 10, replace = TRUE),
            maxY = sample(10:19, 10, replace = TRUE)) %>% 
 mutate(label = case_when(is.na(a1) ~ paste(site, a2, sep = ""), 
                          is.na(a2) ~ paste(site, a1, sep = ""), 
                          TRUE ~ paste(site, paste(a1, a2, sep = ","), sep = "")))
G. Grothendieck
  • works perfectly! Can you explain the syntax in the scale_x_discrete line? – tnt Mar 10 '23 at 20:58
  • `labels=` takes a function, which can be written in short form as a formula: the part to the right of the tilde is the body and `.x` is the argument. Each label is passed to that function, which uses `sub` to transform it into a LaTeX string, e.g. `$F_{1,2}$`; `TeX` then converts that to plotmath form, which is passed back to ggplot for final use as the label. The regular expression in the `sub` assumes that each label consists of a single character followed by a subscript that may contain digits and commas. – G. Grothendieck Mar 10 '23 at 21:46
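
The string-rewriting half of that line can be tried without ggplot2 or latex2exp. In the sketch below, to_latex is a hypothetical name for the long-form equivalent of the formula shorthand with the TeX() step left out, so it only shows what sub() hands over:

```r
# Hypothetical helper: same sub() call as in the answer, written as an ordinary function.
to_latex <- function(x) sub("(.)([0-9,]+)", "$\\1_{\\2}$", x)

labels <- c("A1", "E1,2", "H2")
latex  <- to_latex(labels)
print(latex)
```

Because the pattern captures exactly one character before the digits, site names longer than one character would need a different regular expression.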