
I'd like to create a boxplot from two different data frames in R. In each data frame, the rows represent samples and the columns represent diseases. Each box should show the distribution of the values in one row. The plot is supposed to compare the row distributions between the two data frames (control and experimental group), so if there are 6 rows in each data frame, there should be 12 boxes.

It should look something like this: https://i.stack.imgur.com/17OIk.png

Both data frames have the same number of rows but a different number of columns, since the experimental conditions were different. I'd also like the boxes to be ordered by the row medians of only one of the data frames, with that same order applied to both groups across the whole plot.

Any ideas? I'm new to R and would appreciate any leads.

shwetaaaa
  • Please read [How to make a great reproducible example in R?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – M-- Jul 17 '17 at 18:08
  • In addition to posting an example of your data (try using `dput()` on the `head()` of the data frames, as suggested in that other thread), it will also help to change your data from wide format to long format. See [this page](http://seananderson.ca/2013/10/19/reshape.html) for details. This matters because a long-format table makes it easy to create the kind of boxplot you posted with a single `ggplot2` call. – user5359531 Jul 17 '17 at 18:37

1 Answer


Generate some sample data

df1 <- data.frame(disease.a = rnorm(10, 2),
                  disease.b = rnorm(10, 2),
                  disease.c = rnorm(10, 2)) # experimental group

df2 <- data.frame(disease.a = rnorm(10, 0),
                  disease.b = rnorm(10, 0),
                  disease.c = rnorm(10, 0)) # control group

Add a column to df1 and df2 to represent experimental condition

df1$condition <- "experimental"
df2$condition <- "control"

Bind your data frames together

df3 <- rbind(df1, df2)

Reshape the data

library(reshape2)
m.df <- melt(df3, id.vars = "condition") # id.vars is the full argument name
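
To see what the reshape step does, here is a small sketch on a made-up table. The `toy` data frame and its values are purely illustrative and not part of the question's data.

```r
library(reshape2)

# Hypothetical two-row table in wide format: one column per disease
toy <- data.frame(disease.a = c(1, 2),
                  disease.b = c(3, 4),
                  condition = "control")

# melt() stacks the disease columns into a name column and a value column
toy.long <- melt(toy, id.vars = "condition")
print(toy.long)
```

The result has one row per original cell, with the columns `condition`, `variable` (the former column name) and `value`, which is the layout `ggplot2` expects.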

Plot the data with ggplot as per your example

library(ggplot2)
ggplot(m.df, aes(x=condition, y=value)) + geom_boxplot(aes(fill=variable))
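
The plot above draws one box per disease column. If you want one box per sample row instead, ordered by the row medians of a single group as described in the question, something along these lines may work. This is only a sketch on made-up data: the names `experimental`, `control` and the helper `to_long` are my own, and I am assuming the control group is the one that defines the order.

```r
library(reshape2)
library(ggplot2)

set.seed(42) # only so the toy data is reproducible

# Toy data: 6 samples (rows) per group, different numbers of disease columns
experimental <- as.data.frame(matrix(rnorm(6 * 5, mean = 2), nrow = 6))
control      <- as.data.frame(matrix(rnorm(6 * 3, mean = 0), nrow = 6))

# Hypothetical helper: reshape one wide data frame to long format,
# keeping the row index as the sample identifier
to_long <- function(df, condition) {
  df$sample <- seq_len(nrow(df))
  long <- melt(df, id.vars = "sample")
  long$condition <- condition
  long[, c("sample", "condition", "value")]
}

long.df <- rbind(to_long(experimental, "experimental"),
                 to_long(control, "control"))

# Order the samples by the row medians of the control group only
ctrl.median  <- apply(control, 1, median)
sample.order <- order(ctrl.median)

# Factor levels fix the x-axis order for both groups
long.df$sample <- factor(long.df$sample, levels = sample.order)

p <- ggplot(long.df, aes(x = sample, y = value, fill = condition)) +
  geom_boxplot()
print(p)
```

Because the order is stored in the factor levels of `sample`, both the control and the experimental boxes follow it. Dropping the `variable` column before `rbind()` is what lets the two data frames have different numbers of columns.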
kipp