
I would like to ask how to always have a fixed number of bars in a bar plot, no matter how many distinct values the variable has. It must be a bar plot, not a histogram.

For example:

library(ggplot2)
DF <- mtcars
ggplot(DF, aes(gear)) + geom_bar()

This produces three bars (for the values 3 to 5). I would also like to have bars for the values 1 and 2, with a count of zero, so that we end up with 5 bars: the first 2 equal to 0 and the last 3 equal to the counts in the dataset.

Petr
  • You'll have to create a data frame first *which contains data for those observations*. How would ggplot know which values are missing? In the toy example, count your gears first. Then you'd need empty rows with gear = 1 and 2 and the count values set to zero. Then use `geom_col` (which is nothing other than `geom_bar(stat = 'identity')`) to show your count values – tjebo Dec 06 '19 at 08:29
  • I know that I will always need to see exactly 5 bars, so I thought it could be done easily somehow – Petr Dec 06 '19 at 08:32
  • What I suggested is pretty straightforward? One problem beginners often have is that they hesitate to shape their data differently for different purposes. But it's not such a bad thing. It's actually absolutely essential to do so. You have to tell ggplot what you want for your x. If it's always 5 categories, your x column always needs to contain those five categories – tjebo Dec 06 '19 at 08:34
  • Possible duplicate of [ggplot2 keep unused levels barplot](https://stackoverflow.com/questions/10834382/ggplot2-keep-unused-levels-barplot)? – tjebo Dec 06 '19 at 09:33
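
The linked question is about keeping unused factor levels on the axis. A minimal sketch of that idea applied to the example here, assuming the five categories are the gear values 1 to 5 (that range is an assumption taken from the question, not something ggplot infers):

```r
library(ggplot2)

# Assumption: the five categories are the gear values 1 to 5
DF <- mtcars
DF$gear <- factor(DF$gear, levels = 1:5)

# drop = FALSE keeps the levels with no observations on the x-axis
p <- ggplot(DF, aes(gear)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE)
p
```

The x-axis then shows all five categories, and the positions for 1 and 2 stay empty because their count is zero.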

2 Answers


You need to include a count for every value of gear that you want to show, including the ones missing from the data. One way of achieving that is with `complete()` from tidyr:

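library(dplyr)
library(tidyr)
library(ggplot2)

# Count cars per gear, then add rows with n = 0 for the missing gear values
DF <- mtcars %>%
  group_by(gear) %>%
  tally() %>%
  complete(gear = 1:max(gear), fill = list(n = 0))

ggplot(DF, aes(x = gear, y = n)) + geom_bar(stat = 'identity')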
mrhd
  • I'd use `geom_col` instead, as suggested in my comment to the OP. This is essentially the same as `geom_bar(stat = 'identity')` – tjebo Dec 06 '19 at 08:37
  • Oh nice! I didn't know that one. Every day something new. Thanks, @Tjebo! I'll leave my answer as-is so your comment still applies. Anyway, there is some merit in using `geom_bar` as it is widely known and used. – mrhd Dec 06 '19 at 08:39
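
For reference, a sketch of the `geom_col` variant discussed in these comments. It assumes the range is fixed at `1:5` (taken from the question) rather than `1:max(gear)`, so that five bars are produced even if the largest gear value is absent from the data; `count()` is used here as shorthand for `group_by()` plus `tally()`:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Assumption: always show the gear values 1 to 5, whatever the data contains
DF <- mtcars %>%
  count(gear) %>%                                # one row per observed gear value
  complete(gear = 1:5, fill = list(n = 0))       # add the missing values with n = 0

p <- ggplot(DF, aes(x = gear, y = n)) + geom_col()
p
```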

You can edit the properties of the x-axis to include 1 and 2: add a `scale_x_continuous` and manually define the breaks and the limits. However, you will not really see a column for these values, because a bar of height zero is just a line...

library(tidyverse)

DF <- mtcars
ggplot(DF, aes(gear)) + geom_bar() +
    scale_x_continuous(breaks = 1:5, limits = c(0.5,5.5))

Created on 2019-12-06 by the reprex package (v0.3.0)

Does this help?

albgarre
  • Perfect answer as it doesn't need to modify data. Also, you don't need `tidyverse`, `library(ggplot2)` is enough and `ggplot(mtcars, aes(gear))` will work too. – pogibas Dec 06 '19 at 09:42