
I have 3 data frames showing the sizes of genetic sequences from 3 samples. Instead of listing the size of every sequence, my data are summarised as the number of sequences of each size. The data frames look like this:

> head(df1)
  Size Count
1   56     1
2   58     1
3   59     2
4   60     2
5   61     3
6   62     1

> head(df2)
  Size Count
1   53     1
2   55     1
3   57     2 
4   58     2
5   59     3
6   60     3

> head(df3)
  Size Count
1   53     1
2   56     1
3   57     3 
4   58     2
5   59     5
6   60    10

I would like to draw an overlapped density plot of these 3 samples, like this one:

[Image: example of an overlapped density plot]

How can I do this? The approach I came up with was to build new data frames that repeat each size as many times as its count, combine these 3 new data frames, and then use ggplot() + geom_density().

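library(ggplot2)

# Expand each summary table to one row per sequence, tagged with its sample
new_df1 <- data.frame(size=rep(df1$Size, df1$Count), sample="No_1")
new_df2 <- data.frame(size=rep(df2$Size, df2$Count), sample="No_2")
new_df3 <- data.frame(size=rep(df3$Size, df3$Count), sample="No_3")
# Stack the three expanded data frames into one long data frame
all_sample <- rbind(new_df1, new_df2, new_df3)

# One density curve per sample, distinguished by colour
ggplot(data=all_sample, aes(x=size)) + geom_density(aes(colour=sample))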

Is this the right way to do what I want? Is there a neater way to do it?

Any thoughts are welcome! Thank you.

  • Seems fine, does that get the output you want? – camille Dec 05 '18 at 19:52
  • This is fine if you have 3 data frames. See my answer at [How to make a list of data frames](https://stackoverflow.com/a/24376207/903061) for an upstream fix and fancier ways to do the combining if you ever need to scale up... if you had 10 data frames this would be very annoying. If you had 100 data frames it would be impractical. – Gregor Thomas Dec 05 '18 at 19:59
  • Thanks for your replies. Yes, I got the output I wanted. And yes, I do have more than 10 data frames. Thanks @Gregor for referring me to the useful post. – Michael Dec 20 '18 at 15:22
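
For readers with many data frames, the list-based idea from the comments could be sketched roughly as below. This is only a minimal illustration, not the method from the linked answer: the example data contain just the six rows shown by head() in the question, and the names `samples` and `expanded` are made up for this sketch. It uses base R's Map() and do.call(rbind, ...) to expand and stack the tables, and draws the plot only if ggplot2 is installed.

```r
# Example data: only the rows shown by head() in the question
df1 <- data.frame(Size = c(56, 58, 59, 60, 61, 62), Count = c(1, 1, 2, 2, 3, 1))
df2 <- data.frame(Size = c(53, 55, 57, 58, 59, 60), Count = c(1, 1, 2, 2, 3, 3))
df3 <- data.frame(Size = c(53, 56, 57, 58, 59, 60), Count = c(1, 1, 3, 2, 5, 10))

# Keep the data frames in a named list (hypothetical name `samples`)
# so the same code works for 3 data frames or for 100
samples <- list(No_1 = df1, No_2 = df2, No_3 = df3)

# Expand each summary table to one row per sequence and tag it with its sample name
expanded <- Map(function(df, name) {
  data.frame(size = rep(df$Size, df$Count), sample = name)
}, samples, names(samples))

# Stack all expanded data frames into one long data frame
all_sample <- do.call(rbind, expanded)
rownames(all_sample) <- NULL

# Draw the overlapped density plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(all_sample, aes(x = size, colour = sample)) + geom_density()
  print(p)
}
```

Adding another sample then only means adding one more element to the list; the expansion and plotting code stays unchanged.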
