This is related to "Plotting cumulative counts in ggplot2", but that question dealt with a continuous variable rather than a discrete one.

Here, I have a bar chart:

library(ggplot2)

set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))
ggplot(dat) + geom_bar(aes(x, ..count..))

I'm trying to plot a cumulative count with:

ggplot(dat) + geom_bar(aes(x, cumsum(..count..)))

There are gaps where x has no observations (i.e. when x is 5, 6, 8, 9), because a bar is only drawn for values that actually occur in the data.

Is there a quick and easy way to get a bar chart with the gaps filled with bars, i.e. one with 11 bars? I could manually create a data frame with the cumulative counts and plot it as usual, but I'm curious whether there's a more elegant way.

Kenyon Ng

3 Answers

1

You can convert the variable to a factor when plotting.

ggplot(dat) + geom_bar(aes(factor(x), cumsum(..count..)))

David P
  • Thanks for your answer David. I should have made my question clearer that I'm looking for a plot with the gaps filled with bars (i.e. a graph with 11 bars). – Kenyon Ng Feb 06 '21 at 16:22
1

I would not call this an "easy" approach, but it is the only one I could come up with to solve your problem:

  1. Pre-summarise your dataset using e.g. dplyr::count

  2. Fill in the missing categories using e.g. tidyr::complete (to this end, I first convert x to a factor whose levels cover the full range).

  3. Plot via geom_col

library(ggplot2)
library(dplyr)
library(tidyr)

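set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))
dat <- dat %>% 
  # one row per observed value of x, with its count in n
  count(x) %>% 
  # levels span every integer from min(x) to max(x), observed or not
  mutate(x = factor(x, levels = seq(min(x), max(x), by = 1))) %>% 
  # add a row with n = 0 for each level that has no observations
  tidyr::complete(x, fill = list(n = 0))

# rows are ordered by factor level, so cumsum(n) runs left to right
ggplot(dat) + geom_col(aes(x, cumsum(n)))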
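
For comparison, the same pre-summarising idea can be sketched in base R, without dplyr or tidyr. This is only an illustration of the three steps above, not a different technique. The names all_x, counts and cum_dat are made up for the example. It relies on table() reporting a count of 0 for factor levels that have no observations.

```r
library(ggplot2)

set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))

# Every integer from the smallest to the largest observed value
all_x <- seq(min(dat$x), max(dat$x))

# table() on a factor keeps empty levels, so unobserved values get a count of 0
counts <- table(factor(dat$x, levels = all_x))

# Hypothetical plotting data frame: one row per integer, with the running total
cum_dat <- data.frame(
  x = factor(all_x),
  cum_n = cumsum(as.vector(counts))
)

p <- ggplot(cum_dat) + geom_col(aes(x, cum_n))
p
```

Because the running total is computed before plotting, every integer in the range gets a bar, and x stays categorical.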

stefan
  • Thanks for your code, stefan. This is the approach I'm using now, but I do appreciate your code for creating the plotting dataframe. It's much cleaner than mine! Hopefully there's a quicker solution to do this with just `ggplot`. – Kenyon Ng Feb 07 '21 at 05:44
1

Using stat_bin instead of geom_bar may help:

ggplot(dat) + stat_bin(aes(x, cumsum(..count..)))
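
A possible refinement, sketched here assuming ggplot2 3.3.0 or later (for after_stat()): giving stat_bin explicit breaks at the half-integers yields exactly one bin per integer, so values with no observations still get a bar at the running total. The names brks and bars are made up for this example. Note that this treats x as continuous rather than categorical.

```r
library(ggplot2)

set.seed(2021)
dat <- data.frame(x = c(rpois(100, 1), 7, 10))

# One bin per integer: breaks sit half a unit either side of each value
brks <- seq(min(dat$x) - 0.5, max(dat$x) + 0.5, by = 1)

p <- ggplot(dat) +
  stat_bin(aes(x, after_stat(cumsum(count))), breaks = brks, colour = "white")

# layer_data() returns the computed bars, which is handy for checking the result
bars <- layer_data(p)
p
```

Inspecting bars should show one row per bin, with y holding the cumulative count.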

AnilGoyal
  • Thanks for your solution AnilGoyal. This works, although I prefer a solution that works on categorical `x`. I will probably also use `ggplot(dat) + stat_bin(aes(x, cumsum(..count..)), breaks = seq(-0.5, 10.5), col = 'white')` to make it more aesthetically pleasing. – Kenyon Ng Feb 07 '21 at 13:28
  • Yes, I was about to suggest that but after testing – AnilGoyal Feb 07 '21 at 13:30