
Currently I have a data frame where I want to plot three variables into one boxplot:

  livingsetting factor outcome
1             1    CKD       2
2             1    CKD      13
3             1    CKD      23
4            13    CKD      12
5             7    CKD     -14

The livingsetting variable is a factor with levels "1", "7", and "13". The factor variable is a factor with levels "CKD", "HD", and "Transplant". The outcome variable is continuous.

This is my code for the boxplot:

ggplot(df, aes(x = interaction(livingsetting, factor), y = outcome)) +
    geom_boxplot(aes(fill = livingsetting)) +
    xlab("Factors") +
    ylab("Y")

And my plot looks like this:

[image: the resulting boxplot]

The x-axis labels show 1.CKD, 13.CKD, 7.CKD, 1.HD, 13.HD, etc. Is it possible to tweak the x-axis labels so that the boxplot shows only "CKD", "HD", and "Transplant" (so that the individual boxes are grouped in threes)?

For example, the first red, green, and blue plots will be labeled as "CKD" (as the group), the second red, green, and blue plots will be labeled as "HD", etc.

Luiance
  • What if you use `aes(x = factor, y = outcome, fill = livingsetting)`? This works only if `livingsetting` really is a factor; otherwise convert it to a factor first. If I understood correctly what you are looking for, you don't need the `interaction`. – Tino Feb 13 '18 at 17:31
  • There might be a way to do it with a regex (extracting one part of each label around the dot), but here's a manual method: https://stackoverflow.com/questions/5096538/customize-axis-labels – Robert Tan Feb 13 '18 at 17:34
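
The regex idea from the comment above could look roughly like the sketch below. It keeps the original `interaction()` on the x-axis and only rewrites the tick labels. The data is simulated for illustration, and `strip_prefix` and `replicate_id` are hypothetical names, not part of the question. Since the group name sits after the dot in labels such as `1.CKD`, the sketch drops the part before the dot.

```r
library(ggplot2)

# Simulated data with the same column names as the question (an assumption).
set.seed(1)
df <- expand.grid(
  livingsetting = factor(c(1, 7, 13)),
  factor = c("CKD", "HD", "Transplant"),
  replicate_id = 1:10
)
df$outcome <- rnorm(nrow(df), mean = 10, sd = 8)

# Hypothetical helper: drop everything up to and including the first dot,
# so "1.CKD" becomes "CKD".
strip_prefix <- function(x) sub("^[^.]*\\.", "", x)

p <- ggplot(df, aes(x = interaction(livingsetting, factor), y = outcome)) +
  geom_boxplot(aes(fill = livingsetting)) +
  scale_x_discrete(labels = strip_prefix) +
  xlab("Factors") +
  ylab("Y")

print(p)
```

Each box still gets its own tick, so every group name appears three times on the axis; mapping `x = factor` directly avoids that repetition.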

2 Answers


Here is an example illustrating my comment from above. You don't need interaction: mapping fill to livingsetting makes ggplot2 draw a separate box, side by side, for each livingsetting level within each value on the x-axis:

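library(ggplot2)

# Small example data; the columns match the question's data frame
df <- read.table(text = "  livingsetting        factor                  outcome
1      7                BLA                         2
2      1                BLA                        13
3      1                CKD                        23
4      13               CKD                        12
5      7                CKD                       -14", header = TRUE, row.names = 1)

# read.table() reads livingsetting as integer, so convert it to a factor
df$livingsetting <- as.factor(df$livingsetting)

# One box per livingsetting level within each level of factor
ggplot(data = df, aes(x = factor, y = outcome, fill = livingsetting)) +
    geom_boxplot()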
Tino
  • Your solution and Camille's solution both worked perfectly, thank you! It was exactly what I was looking for (outcome on the y-axis, the data grouped by the three factors and then by livingsetting). – Luiance Feb 14 '18 at 23:21

Is there a reason not to use facet_wrap or facet_grid? Unless I'm misunderstanding what you're looking for, this is a perfect use-case for faceting, and then you don't need interaction.

You should be able to change to this:

ggplot(df, aes(x = livingsetting, y = outcome)) +
    geom_boxplot(aes(fill = livingsetting)) +
    facet_wrap(~ factor)

This uses the dataframe as is, rather than getting the interaction, and adds labels for the factor variable to the tops of the facets, rather than on the tick labels (though you could do that if that's something you want).
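
If the group names should sit under the axis instead of on top of the panels, one possible variation is sketched here. It assumes simulated data with the question's column names (`replicate_id` is a hypothetical helper column) and uses the `strip.position` argument of `facet_wrap()` together with a few `theme()` settings.

```r
library(ggplot2)

# Simulated data with the same column names as the question (an assumption).
set.seed(1)
df <- expand.grid(
  livingsetting = factor(c(1, 7, 13)),
  factor = c("CKD", "HD", "Transplant"),
  replicate_id = 1:10
)
df$outcome <- rnorm(nrow(df), mean = 10, sd = 8)

p <- ggplot(df, aes(x = livingsetting, y = outcome)) +
  geom_boxplot(aes(fill = livingsetting)) +
  # Move the facet strips below the panels so they read like group labels.
  facet_wrap(~ factor, strip.position = "bottom") +
  theme(
    strip.placement = "outside",        # put strips outside the axis
    strip.background = element_blank(), # plain text instead of a grey box
    axis.text.x = element_blank(),      # the fill legend already identifies livingsetting
    axis.ticks.x = element_blank()
  ) +
  xlab("Factors") +
  ylab("Y")

print(p)
```

Hiding the x tick labels is a design choice: the legend carries the livingsetting levels, so only "CKD", "HD", and "Transplant" remain along the bottom.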

Hope that helps!

camille