I have a data frame (df1) with three columns (Y1, Y2, x). I managed to draw boxplots of Y1 against x and Y2 against x. I have a second data frame (df2) with two columns (A, x). I want to draw a line graph of A against x and add it to the boxplot. Note that x is the x-axis variable in both data frames, but it takes different values in each. I tried combining and reshaping both data frames and plotting against factor(x), but I got three boxplots in one graph. I need df2 drawn as a line and df1 as boxplots in the same graph.

df1 <- structure(list(Y1 = c(905L, 941L, 744L, 590L, 533L, 345L, 202L, 
369L, 200L, 80L, 200L, 80L, 50L, 30L, 60L, 20L, 30L, 30L), Y2 = c(774L, 
823L, 687L, 545L, 423L, 375L, 249L, 134L, 45L, 58L, 160L, 60L, 
20L, 40L, 20L, 26L, 19L, 27L), x = c(10L, 10L, 10L, 20L, 20L, 
20L, 40L, 40L, 40L, 50L, 50L, 50L, 70L, 70L, 70L, 90L, 90L, 90L
 )), .Names = c("Y1", "Y2", "x"), row.names = c(NA, -18L), class = "data.frame")

df2 <- structure(list(Y3Line = c(384L, 717L, 914L, 359L, 241L, 265L, 
240L, 174L, 114L, 165L, 184L, 96L, 59L, 60L, 127L, 54L, 31L, 
44L), x = c(36L, 36L, 36L, 56L, 56L, 56L, 65L, 65L, 65L, 75L, 
75L, 75L, 85L, 85L, 85L, 99L, 99L, 99L)), .Names = c("A", 
"x"), row.names = c(NA, -18L), class = "data.frame")

library(ggplot2)
library(reshape2)  # provides melt()

df_l <- melt(df1, id.vars = "x")

ggplot(df_l, aes(x = factor(x), y = value, fill = variable)) +
  geom_boxplot() +
  # here I'm trying to add the line graph from df2
  geom_line(data = df2, aes(x = x, y = A))

Any suggestions?

Julius Vainora
SimpleNEasy

1 Answer

In the second dataset you have three y values per x value. Do you want to draw separate lines per x value, or one line through the mean per x value? Both are shown below. The trick is to first convert the x variables in both datasets to factors that share the full set of levels from both.

df1 <-structure(list(Y1 = c(905L, 941L, 744L, 590L, 533L, 345L, 202L, 
369L, 200L, 80L, 200L, 80L, 50L, 30L, 60L, 20L, 30L, 30L), Y2 = c(774L, 
823L, 687L, 545L, 423L, 375L, 249L, 134L, 45L, 58L, 160L, 60L, 
20L, 40L, 20L, 26L, 19L, 27L), x = c(10L, 10L, 10L, 20L, 20L, 
20L, 40L, 40L, 40L, 50L, 50L, 50L, 70L, 70L, 70L, 90L, 90L, 90L
)), .Names = c("Y1", "Y2", "x"), row.names = c(NA, -18L), class = "data.frame")

df2 <- structure(list(Y3Line = c(384L, 717L, 914L, 359L, 241L, 265L, 
240L, 174L, 114L, 165L, 184L, 96L, 59L, 60L, 127L, 54L, 31L, 
44L), x = c(36L, 36L, 36L, 56L, 56L, 56L, 65L, 65L, 65L, 75L, 
75L, 75L, 85L, 85L, 85L, 99L, 99L, 99L)), .Names = c("A", 
"x"), row.names = c(NA, -18L), class = "data.frame")

library(ggplot2)
library(reshape2)

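# Reshape df1 to long format: one row per (x, variable, value)
df_l <- melt(df1, id.vars = "x")

# Build one shared, sorted set of levels from the x values of both data frames
allLevels <- levels(factor(c(df_l$x, df2$x)))
df_l$x <- factor(df_l$x, levels = allLevels)
df2$x <- factor(df2$x, levels = allLevels)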
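To make the shared-levels step concrete, here is a minimal base R sketch on a few of the x values from the two data frames. The names x_box, x_line, f_box and f_line are hypothetical and used only for illustration.

```r
# A few x values from each data frame (hypothetical variable names)
x_box  <- c(10L, 20L, 40L)   # x values used for the boxplots
x_line <- c(36L, 56L)        # x values used for the line

# factor() sorts the unique numeric values before turning them into levels
all_levels <- levels(factor(c(x_box, x_line)))

# Both factors now share the same ordered set of levels
f_box  <- factor(x_box,  levels = all_levels)
f_line <- factor(x_line, levels = all_levels)

print(all_levels)           # "10" "20" "36" "40" "56"
print(as.integer(f_line))   # 3 5: positions of 36 and 56 among all levels
```

Because both factors use one ordered set of levels, 36 sorts between 20 and 40 instead of being appended after the boxplot categories.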

Line per x category:

ggplot(data = df_l, aes(x = x, y = value)) +
  geom_line(data = df2, aes(x = factor(x), y = A)) +
  geom_boxplot(aes(fill = variable))

Connected means of x categories:

ggplot(data = df2, aes(x = factor(x), y = A)) +
  # `fun` replaces `fun.y`, which is deprecated as of ggplot2 3.3.0
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +
  geom_boxplot(data = df_l, aes(x = x, y = value, fill = variable))
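As a possible alternative to stat_summary(), the means can be computed up front and passed to geom_line(). This is only a sketch, not a replacement for the code above: line_df holds the first six rows of df2, and the names line_df, means and p are hypothetical. With a discrete x axis, each x level otherwise forms its own one-point group, so group = 1 is what puts all rows into a single group and lets the line connect across categories.

```r
library(ggplot2)

# First six rows of df2, kept small for illustration (hypothetical name)
line_df <- data.frame(
  A = c(384, 717, 914, 359, 241, 265),
  x = c(36, 36, 36, 56, 56, 56)
)

# One mean of A per x value
means <- aggregate(A ~ x, data = line_df, FUN = mean)
means$x <- factor(means$x)

# group = 1 treats every row as part of one line; without it, each
# discrete x level is a separate group with a single observation
p <- ggplot(means, aes(x = x, y = A, group = 1)) +
  geom_line() +
  geom_point()

print(means)
```

Printing p (or calling print(p)) draws the line; it can be combined with the boxplot layer in the same way as the stat_summary() version.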
Jonas Tundo
  • The second dataset contains two columns of new results. However, its second column, "x", is the same variable as in the first dataset. I managed to graph the boxplot of the first dataset. I need to draw a simple line graph (geom_line()) of the second dataset and combine it with the boxplot graph. – SimpleNEasy Mar 20 '13 at 13:42
  • It gave a vertical line for each x value. I need a continuous line. I know that is because I have 3 "A" values for each unique x value, but I guess I can still graph it (with the aid of the mean function). – SimpleNEasy Mar 20 '13 at 14:30
  • The code is not connecting the lines. I need the line to be separate from the boxplot. See the update for the required graph. – SimpleNEasy Mar 20 '13 at 15:34
  • You should check my answer better, the last plot does exactly that. The line starts at 36 (in contrast to your latest drawing), but that's due to the second dataset. – Jonas Tundo Mar 20 '13 at 15:46
  • I'm sorry, but I'm getting this error: "geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?" It seems to be due to the x variable? – SimpleNEasy Mar 20 '13 at 16:47
  • It works fine when I run my code so I don't know what the problem is. Try to run my whole code and if it still doesn't work then provide your sessionInfo(). – Jonas Tundo Mar 20 '13 at 17:21
  • It seems this is a bug in the new version of R! [link](https://groups.google.com/forum/#!msg/ggplot2/ECI59jsdOYc/hMJ1Uk4SW-gJ). What is your version? – SimpleNEasy Mar 20 '13 at 17:56
  • sessionInfo() R version 2.14.1 (2011-12-22) Platform: x86_64-pc-mingw32/x64 (64-bit) – SimpleNEasy Mar 20 '13 at 18:02
  • Not the version of R but of ggplot2. I'm using ggplot2 0.9.3.1 and R 2.15.2 and don't have the problem. Maybe you should upgrade. When you provide sessionInfo() you should include all the output in your question. – Jonas Tundo Mar 20 '13 at 18:06
  • I thought the bug was in R, which is why I hadn't provided all of the sessionInfo() output. I found that I was using ggplot2_0.9.3, so I upgraded to the latest version. Now it works. Thank you, perfect. – SimpleNEasy Mar 20 '13 at 20:53
  • @JT85: How does one interpret group=1 in this case? I'm using your answer in part of my code. – val Apr 05 '17 at 00:37
  • The answer is here: http://stackoverflow.com/questions/10357768/plotting-lines-and-the-group-aesthetic-in-ggplot2 – val Apr 05 '17 at 00:44