How to create subscripts in the names of variables in R?

Question

I'm trying to create a plot with five variables, three of which are air pollutants: PM2.5, NO2, and O3. I want the numbers to be subscripts, which is how these are written in most literature, but I've been trying every variation I can think of using square brackets, expression(), bquote(), and others and nothing is working. This is the simplified code I have right now:

risks$variable <- factor(risks$variable,
                         levels=c("O3","PM2.5","NO2",
                                  "Diethyl","Dimethyl"),
                         labels=c(expression(O[3]),
                                  expression(PM[2.5]),
                                  expression(NO[2]),
                                  "Diethyl",
                                  "Dimethyl"))

ggplot(risks, aes(variable, est, ymin = est - 1.96*sd,
                          ymax = est + 1.96*sd, col = q.fixed)) +
  geom_pointrange(position = position_dodge(width = 0.75)) +
  coord_flip()

Unfortunately, with everything I've tried the plot either shows the variable names with the square brackets around them, or the labels without brackets but the numbers still aren't subscripts. Any thoughts here?

Please remember to accept an answer if it is satisfactory. It saves others from posting unnecessary solutions and lets readers see quickly that an answer worked. – socialscientist, Jul 17 '22 at 21:33

score 1 · Accepted Answer · answered Jul 17 '22 at 20:40

Obviously, we don't have your data, so here's a random set with the same names / structure:

risks <- data.frame(
    variable = c("O3","PM2.5","NO2", "Diethyl","Dimethyl"),
    est      = c(10, 12, 15, 6, 9),
    sd       = c(1.5, 3, 5, 1, 0.5),
    q.fixed  = c("A", "B", "C", "D", "E")
)

Now let's convert your variable column to a factor using your own code:

risks$variable <- factor(risks$variable,
                         levels=c("O3","PM2.5","NO2",
                                  "Diethyl","Dimethyl"),
                         labels=c(expression(O[3]),
                                  expression(PM[2.5]),
                                  expression(NO[2]),
                                  "Diethyl",
                                  "Dimethyl"))
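
As a quick check of why parsing is needed, here is a minimal sketch (the names `pollutants`, `lvl` and `parsed` are illustrative only, not part of your data). factor() keeps its labels as character strings, so the expressions end up as text such as "O[3]", and parse() can turn that text back into plotmath expressions:

```r
# Illustrative sketch: factor() coerces expression labels to character.
pollutants <- factor(c("O3", "PM2.5", "NO2"),
                     levels = c("O3", "PM2.5", "NO2"),
                     labels = c(expression(O[3]),
                                expression(PM[2.5]),
                                expression(NO[2])))

# The levels are now plain strings, which is why the brackets show up.
lvl <- levels(pollutants)
print(lvl)

# Parsing the strings recovers expressions that plotmath can render.
parsed <- parse(text = lvl)
print(is.expression(parsed))
```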

Now we can plot the result. The secret here is to parse the x axis labels (which look like y axis labels due to the coord_flip()). I've added a theme to make the labels more obvious.

library(ggplot2)

ggplot(risks, aes(variable, est, ymin = est - 1.96*sd,
                  ymax = est + 1.96*sd, col = q.fixed)) +
  geom_pointrange(position = position_dodge(width = 0.75)) +
  coord_flip() +
  # Parse each label string (e.g. "O[3]") into a plotmath expression
  scale_x_discrete(labels = ~parse(text = .x)) +
  scale_color_brewer(palette = "Set1") +
  theme_minimal(base_size = 18)
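
A possible alternative to the lambda is the labelling function label_parse() from the scales package, which ggplot2 depends on. The following is only a sketch under the assumption that the labels are already parseable strings; the data are the same made-up values as above, with the variable column written directly as plotmath text:

```r
library(ggplot2)

# Made-up data; the variable names are already plotmath strings.
risks <- data.frame(
  variable = c("O[3]", "PM[2.5]", "NO[2]", "Diethyl", "Dimethyl"),
  est      = c(10, 12, 15, 6, 9),
  sd       = c(1.5, 3, 5, 1, 0.5),
  q.fixed  = c("A", "B", "C", "D", "E")
)

# label_parse() returns a function that parses label text into expressions.
p <- ggplot(risks, aes(variable, est, ymin = est - 1.96 * sd,
                       ymax = est + 1.96 * sd, col = q.fixed)) +
  geom_pointrange(position = position_dodge(width = 0.75)) +
  coord_flip() +
  scale_x_discrete(labels = scales::label_parse())
p
```

In this case it should give the same labels as `~parse(text = .x)`.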

score 1 · Answer 2 · answered Jul 17 '22 at 20:42

One way is to assign the labels in the ggplot call. The factor labels are stored as plain strings and are never parsed, so the subscripts aren't rendered.

library(ggplot2)

# Example data
df <- data.frame(variable = rep(c("O","NO"), 10),
           est = rnorm(20))

# Make figure
ggplot(df, aes(x = variable, y = est)) + geom_point() +
         scale_x_discrete(labels = c("O" = expression("O"[2]),
                                     "NO" = expression("NO"[1])))
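
Applied to the variable names from the question, the same idea might look like the sketch below. The values in `risks` are made up and `pollutant_labels` is an illustrative name. Levels without an entry in the named vector (Diethyl, Dimethyl) should keep their default labels:

```r
library(ggplot2)

# Toy data with the variable names from the question (values are made up).
risks <- data.frame(
  variable = c("O3", "PM2.5", "NO2", "Diethyl", "Dimethyl"),
  est      = c(10, 12, 15, 6, 9)
)

# Named expression vector: names match the data values,
# values are the plotmath expressions to display.
pollutant_labels <- c("O3"    = expression(O[3]),
                      "PM2.5" = expression(PM[2.5]),
                      "NO2"   = expression(NO[2]))

p <- ggplot(risks, aes(x = variable, y = est)) +
  geom_point() +
  scale_x_discrete(labels = pollutant_labels)
p
```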

socialscientist


This is a decent solution (+1), but remember we can use a labelling _function_, so it is not necessary to do this inside ggplot if you already have parseable labels. – Allan Cameron Jul 17 '22 at 20:49

Certainly. I was trying to propose a solution that would let them understand they can change the labels to differ from the factor labels if, for example, working with strings is easier than factors or they want to change them in ggplot for whatever reason. They could also use the function there as well, see: https://stackoverflow.com/questions/72961962/r-programmatically-changing-ggplot-scale-labels-to-greek-letters-with-expressio – socialscientist Jul 17 '22 at 21:44

Yes, `label_parse` is probably the "proper" way to do it, though in effect here it does the same thing as `~parse(text = .x)` – Allan Cameron Jul 17 '22 at 21:48

Thanks a billion @AllanCameron and @socialscientist! I really appreciate both answers you gave, and both solutions worked for me! – Matt Jul 17 '22 at 23:14
