
Let's say I collect the posts on Stack Overflow and classify them into N categories. My goal is to plot, for every day, the percentage of posts in each of the N categories, plus a single line with the total number of posts per day.

To experiment, I'll use a toy data frame, where `day` stands in for the category and `temp` for the percentage. I can plot one line per category:

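library(MASS)     # provides the beav1 dataset
library(dplyr)    # filter()
library(ggplot2)

data(beav1)
beav1$day <- as.factor(beav1$day)
# Re-index time as 1, 2, 3, ... within each day so both days share the x axis
beav1[beav1$day==346,]$time <- 1:sum(beav1$day==346)
beav1[beav1$day==347,]$time <- 1:sum(beav1$day==347)
beav1 <- filter(beav1, time<23)
ggplot(beav1, aes(x=time, y=temp, group=day, fill=day, color=day)) + 
  geom_line()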


But how can I add the line with the total temperature? Or the mean?

Edit: The difference from the other question is that I would like one single line across all the groups, not one line per group.

dataset

dput(beav1)
structure(list(day = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("346", "347"), class = "factor"), 
    time = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 
    13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 1L, 2L, 
    3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
    16L, 17L, 18L, 19L, 20L, 21L, 22L), temp = c(36.33, 36.34, 
    36.35, 36.42, 36.55, 36.69, 36.71, 36.75, 36.81, 36.88, 36.89, 
    36.91, 36.85, 36.89, 36.89, 36.67, 36.5, 36.74, 36.77, 36.76, 
    36.78, 36.82, 36.93, 36.83, 36.8, 36.75, 36.71, 36.73, 36.75, 
    36.72, 36.76, 36.7, 36.82, 36.88, 36.94, 36.79, 36.78, 36.8, 
    36.82, 36.84, 36.86, 36.88, 36.93, 36.97), activ = c(0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-44L), .Names = c("day", "time", "temp", "activ"))
alberto
  • Possible duplicate of [How to add horizontal lines showing means for all groups in ggplot2?](http://stackoverflow.com/questions/32504313/how-to-add-horizontal-lines-showing-means-for-all-groups-in-ggplot2) –  Feb 16 '16 at 10:04
  • The linked post asks for "per group" means while I'm asking for overall means (I want one line, not N) – alberto Feb 16 '16 at 10:07
  • Possible duplicate of [Add a horizontal line to plot and legend in ggplot2](http://stackoverflow.com/questions/13254441/add-a-horizontal-line-to-plot-and-legend-in-ggplot2) –  Feb 16 '16 at 10:09
  • if I understand you want the temperature for each group and the total ? you can try `ggplot(beav1, aes(x=time, y=temp, group=day, fill=day, color=day)) + stat_summary(fun.y = sum, na.rm=TRUE, geom='line')` – Mamoun Benghezal Feb 16 '16 at 10:10
  • Thanks Mamoun. But your line does not draw the sum (or the mean) of the temperatures, does it? – alberto Feb 16 '16 at 10:13
  • @alberto just changed mean with sum – Mamoun Benghezal Feb 16 '16 at 10:14
  • yeah, sorry, my mistake, I see the idea, but I don't see the line yet :O – alberto Feb 16 '16 at 10:15
  • can you edit your question and add `dput(beav1)` or a part of it? – Mamoun Benghezal Feb 16 '16 at 10:16
  • done! I didn't know dput, amazing :) – alberto Feb 16 '16 at 10:18
  • @Pascal regarding the second link, I know how to add lines to a ggplot, but in my case it has to be a sum (or mean, and over the N groups). I guess it's something with `stat_summary`, don't know... – alberto Feb 16 '16 at 10:25

1 Answer


OK, notice that group, fill and color are no longer set in ggplot() but in geom_line(). This way the grouping applies only to the per-day lines, and you can use stat_summary without redefining the groups.

ggplot(beav1, aes(x=time, y=temp)) + 
    geom_line(aes(group=day, fill=day, color=day))+
    stat_summary(fun.y = mean, na.rm = TRUE, group = 3, color = 'black', geom ='line')

If you want the sum instead, just set fun.y = sum.
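
Since ggplot2 3.3.0, `fun.y` is deprecated in favour of `fun`. The following is a minimal sketch of the same idea with the newer argument name, not a replacement for the code above. It assumes a small data frame `toy` (a hypothetical name) holding only the first three readings of each day from the question's `dput()` output, and it uses `aes(group = 1)` to force a single group explicitly. The `overall` data frame is there only to cross-check what the black line represents.

```r
library(ggplot2)

# First three readings of each day, copied from the dput() output in the question.
# `toy` is an illustrative name, not part of the original post.
toy <- data.frame(
  day  = factor(rep(c("346", "347"), each = 3)),
  time = rep(1:3, times = 2),
  temp = c(36.33, 36.34, 36.35, 36.93, 36.83, 36.80)
)

# One coloured line per day, plus a single black line that summarises
# all days at each time step. `fun` is the current name for `fun.y`.
p <- ggplot(toy, aes(x = time, y = temp)) +
  geom_line(aes(group = day, color = day)) +
  stat_summary(aes(group = 1), fun = mean, geom = "line", color = "black")

# Cross-check: the same overall means computed directly in base R
overall <- aggregate(temp ~ time, data = toy, FUN = mean)
print(overall)
```

Printing `p` draws the plot. Swapping `mean` for `sum` in both `stat_summary()` and `aggregate()` should give the total instead; note that a total will usually sit well above the individual lines on the same y axis.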

Mamoun Benghezal