I want to plot the cumulative count of level OK of factor X (*) over time (column Date). I am not sure what the best strategy is: should I create a new data frame with a summary column, or is there a ggplot2 way of doing this?

Sample data

DF <- data.frame(
  Date = as.Date(c("2018-01-01", "2018-01-01", "2018-02-01", "2018-03-01", "2018-03-01", "2018-04-01") ),
  X = factor(rep("OK", 6), levels = c("OK", "NOK")),
  Group = factor(c(rep("A", 4), "B", "B"))
)
DF <- rbind(DF, list(as.Date("2018-02-01"), factor("NOK"), "A"))

From similar questions I tried this:

ggplot(DF, aes(Date, col = Group)) + geom_line(stat='bin')

Using stat='count' (as suggested in the answer to this question) is even worse:

ggplot(DF, aes(Date, col = Group)) + geom_line(stat='count')

which shows the counts for factor levels (*), but not the accumulation over time.

Desperate measure - count with table

I tried creating a new data frame with counts using table like this:

cum <- as.data.frame(table(DF$Date, DF$Group))
ggplot(cum, aes(Var1, cumsum(Freq), col = Var2, group = Var2)) +
  geom_line()

Is there a way to do this with ggplot2? Do I need to create a new column with cumsum? If so, how should I cumsum the factor levels, by date?

(*) Note: I could just filter the data frame to keep only the intended level with DF[DF$X == "OK", ], but I am sure someone can find a smarter solution.

philsf
It is probably not possible to get a cumulative value inside `aes`, since you have multiple groups; otherwise `cumsum` within `aes` would have worked. Fingers crossed, let's see if someone has an idea. In the meantime I have posted an answer that uses a `dplyr` chain to calculate the cumulative count and draw the line. Have a look. – MKR Jul 01 '18 at 17:57

1 Answer

One option, using dplyr and ggplot2:

library(dplyr)
library(ggplot2)

DF %>%
  group_by(Group) %>%
  arrange(Date) %>%                      # order rows by time
  mutate(Value = cumsum(X == "OK")) %>%  # running count of OK within each Group
  ggplot(aes(Date, y = Value, group = Group, col = Group)) +
  geom_line()

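The same running count can also be computed without dplyr. The following is only a minimal sketch in base R, assuming the sample data from the question; it rebuilds `DF` with explicit factor levels so that it runs on its own, and the plotting step is guarded in case ggplot2 is not installed. `ave()` applies `cumsum` separately within each `Group`, which a plain `cumsum` over the whole column would not do.

```r
# Sample data from the question, rebuilt so the sketch is self-contained
DF <- data.frame(
  Date = as.Date(c("2018-01-01", "2018-01-01", "2018-02-01",
                   "2018-03-01", "2018-03-01", "2018-04-01")),
  X = factor(rep("OK", 6), levels = c("OK", "NOK")),
  Group = factor(c(rep("A", 4), "B", "B"))
)
DF <- rbind(DF, data.frame(
  Date = as.Date("2018-02-01"),
  X = factor("NOK", levels = c("OK", "NOK")),
  Group = factor("A", levels = c("A", "B"))
))

# Sort by date so the running total follows time
DF <- DF[order(DF$Date), ]

# Running count of "OK" rows, computed separately within each Group
DF$Value <- ave(as.integer(DF$X == "OK"), DF$Group, FUN = cumsum)

# Plot one line per group (skipped if ggplot2 is not available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(DF, ggplot2::aes(Date, Value, col = Group, group = Group)) +
    ggplot2::geom_line()
  print(p)
}
```

Rows with `X == "NOK"` stay in the data but add 0 to the total, so the line is flat across them rather than rising.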

MKR