For each organization, you'll want to make sure you have at least one value for counts for the minimum and maximum years. This is so that ggplot2
will fill in the gaps. Also, you'll want to be careful with cumulating sums. So the solution I've shown below adds in a zero count if not value exists for the earliest and last year.
I've added some code so that you can automate the adding of rows for organizations that don't have data for the first and last all years of your data.
To incorporate this automated code, you'll want to merge in the tail_dat
complete_dat
data frame and change the variables dat
within the data.frame()
definition to suite your own data.
library(ggplot2)
library(dplyr)
library(tidyr)
# Create sample data
dat <- tribble(
~organization, ~year, ~count,
"a", 1990, 1,
"a", 1991, 1,
"b", 1991, 1,
"c", 1992, 1,
"c", 1993, 0,
"a", 1994, 1,
"b", 1995, 1
)
dat
#> # A tibble: 7 x 3
#> organization year count
#> <chr> <dbl> <dbl>
#> 1 a 1990 1
#> 2 a 1991 1
#> 3 b 1991 1
#> 4 c 1992 1
#> 5 c 1993 0
#> 6 a 1994 1
#> 7 b 1995 1
# NOTE incorrect results for comparison
dat %>%
group_by(organization, year) %>%
summarise(total = sum(count)) %>%
ggplot(aes(x = year, y = cumsum(total), fill = organization)) +
geom_area()
#> `summarise()` regrouping output by 'organization' (override with `.groups` argument)

# Fill out all years and organization combinations
complete_dat <- tidyr::expand(dat, organization, year = 1990:1995)
complete_dat
#> # A tibble: 18 x 2
#> organization year
#> <chr> <int>
#> 1 a 1990
#> 2 a 1991
#> 3 a 1992
#> 4 a 1993
#> 5 a 1994
#> 6 a 1995
#> 7 b 1990
#> 8 b 1991
#> 9 b 1992
#> 10 b 1993
#> 11 b 1994
#> 12 b 1995
#> 13 c 1990
#> 14 c 1991
#> 15 c 1992
#> 16 c 1993
#> 17 c 1994
#> 18 c 1995
# Update data so that counting works and fills in gaps
final_dat <- complete_dat %>%
left_join(dat, by = c("organization", "year")) %>%
replace_na(list(count = 0)) %>% # Replace NA with zeros
group_by(organization, year) %>%
arrange(organization, year) %>% # Arrange by year so adding works
group_by(organization) %>%
mutate(aggcount = cumsum(count))
final_dat
#> # A tibble: 18 x 4
#> # Groups: organization [3]
#> organization year count aggcount
#> <chr> <dbl> <dbl> <dbl>
#> 1 a 1990 1 1
#> 2 a 1991 1 2
#> 3 a 1992 0 2
#> 4 a 1993 0 2
#> 5 a 1994 1 3
#> 6 a 1995 0 3
#> 7 b 1990 0 0
#> 8 b 1991 1 1
#> 9 b 1992 0 1
#> 10 b 1993 0 1
#> 11 b 1994 0 1
#> 12 b 1995 1 2
#> 13 c 1990 0 0
#> 14 c 1991 0 0
#> 15 c 1992 1 1
#> 16 c 1993 0 1
#> 17 c 1994 0 1
#> 18 c 1995 0 1
# Plot results
final_dat %>%
ggplot(aes(x = year, y = aggcount, fill = organization)) +
geom_area()

Created on 2020-12-10 by the reprex package (v0.3.0)