I'm struggling a little with ggplot here. As the title says: is it possible to draw boxplots of data frame columns without a factor column, that is, using the names of the columns of interest as x?

Example 1 (graphics)

df <- data.frame(c(0.2, 0.3, 0.4), c(0.4, 0.2, 0.5))
colnames(df) <- c("A1", "A2")
rownames(df) <- c("001", "002", "003")
df

     A1  A2
001 0.2 0.4
002 0.3 0.2
003 0.4 0.5

boxplot(df[,"A1"], df[,"A2"], names=colnames(df))

[Figure: boxplot of A1 and A2 drawn with the graphics package]

Example 2 (ggplot2)

library(ggplot2)

df2 <- data.frame(c("A1", "A1", "A1", "A2", "A2", "A2"), c(0.2, 0.3, 0.4, 0.4, 0.2, 0.5))
colnames(df2) <- c("Series", "Value")
df2

  Series Value
1     A1   0.2
2     A1   0.3
3     A1   0.4
4     A2   0.4
5     A2   0.2
6     A2   0.5

p <- ggplot(df2, aes(as.factor(Series), Value)) + geom_boxplot()
p

[Figure: boxplot of A1 and A2 drawn with ggplot2]

In the second case I lose the row names, which cannot be duplicated, even though they are IDs I need to keep. Can I get this result with ggplot2 while keeping the first data structure? Thanks

rioualen
  • You'll have to process the first structure to obtain the second. `ggplot(reshape2::melt(df), aes(x = variable, y = value)) + geom_boxplot()` – d.b Aug 14 '19 at 18:14
  • Thanks, I've come across this syntax, but I don't quite understand where "value" and "variable" come from, and when I try to apply it to my actual data I get an "object 'variable' not found" error. ```data_to_plot <- my_data[, c("IC.2", "IC.3", "IC.4")] ggplot(reshape2::melt(data_to_plot), aes(x = variable, y = value)) + geom_boxplot() Using IC.2, IC.3, IC.4 as id variables Error in FUN(X[[i]], ...) : object 'variable' not found``` – rioualen Aug 14 '19 at 18:28
  • I'm expecting `reshape2::melt(df)` to output something like the accepted answer [here](https://stackoverflow.com/questions/14604439/plot-multiple-boxplot-in-one-graph), but it still shows my 3-column data, although it should group all values into 1 column, right? – rioualen Aug 14 '19 at 18:59
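
The message `Using IC.2, IC.3, IC.4 as id variables` hints at a likely cause: when neither `id.vars` nor `measure.vars` is given, `reshape2::melt` treats factor and character columns as id variables, so nothing is left to melt and no `variable` column is created. The sketch below assumes the real columns were read in as text. `my_data` here is made-up stand-in data, not the data from the comment.

```r
# Hypothetical stand-in for the real data: numbers stored as character strings
my_data <- data.frame(
  IC.2 = c("0.2", "0.3", "0.4"),
  IC.3 = c("0.4", "0.2", "0.5"),
  stringsAsFactors = FALSE
)

# Check how each column is stored; "character" or "factor" would explain the message
col_classes <- sapply(my_data, class)
print(col_classes)

# Convert every column to numeric (going through as.character() also covers factors)
data_to_plot <- as.data.frame(
  lapply(my_data, function(x) as.numeric(as.character(x)))
)
rownames(data_to_plot) <- rownames(my_data)

# With numeric columns there is something to measure, so melt() returns
# the 'variable' and 'value' columns that the ggplot call expects
if (requireNamespace("reshape2", quietly = TRUE)) {
  long <- reshape2::melt(data_to_plot, measure.vars = names(data_to_plot))
  print(head(long))
}
```

If `sapply(my_data, class)` already reports `numeric` for the real data, this is not the cause and the problem lies elsewhere.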

1 Answer

I couldn't get reshape2 to work, but I came up with a solution using the tidyr package:

library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(c(0.2, 0.3, 0.4), c(0.4, 0.2, 0.5))
colnames(df) <- c("A1", "A2")
rownames(df) <- c("001", "002", "003")
df
    A1  A2
001 0.2 0.4
002 0.3 0.2
003 0.4 0.5

tidy_df <- df %>% gather(variable, value, c("A1", "A2"))
p <- ggplot(tidy_df, aes(x = variable, y = value)) + geom_boxplot()
p

[Figure: boxplot of A1 and A2 drawn with tidyr and ggplot2]
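
One caveat: `gather()` does not carry row names over, so the IDs from the question are not in `tidy_df`. A minimal sketch of one way to keep them, using base R only for the reshaping: the row names are copied into an ordinary column before the columns are stacked. The column name `id` and the object name `long_df` are chosen here for illustration.

```r
df <- data.frame(A1 = c(0.2, 0.3, 0.4), A2 = c(0.4, 0.2, 0.5))
rownames(df) <- c("001", "002", "003")

# Build the long format by hand: one row per (id, column) pair
long_df <- data.frame(
  id       = rep(rownames(df), times = ncol(df)),  # IDs repeat once per column
  variable = rep(colnames(df), each = nrow(df)),   # column name for each value
  value    = unlist(df, use.names = FALSE),        # values, column by column
  stringsAsFactors = FALSE
)
print(long_df)

# Same plot as above; the IDs stay available in long_df$id
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long_df, aes(x = variable, y = value)) + geom_boxplot()
  print(p)
}
```

The original wide `df` is left untouched, so it can keep its row names while `long_df` is used only for plotting.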

rioualen