Why does the histogram look like that?

Question

I feel like I should know this and like I have a knot in my brain. I have data that looks like this:

There is a scale going 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, etc., and for each scale point there's a certain number of people who selected it. None of the scale points has a count of 0.

Yet, the histogram looks like this with the weird spaces in between. Why?

Thank you!

It would be helpful if you could add the code you tried. If you are not able to share your data, it would be helpful if you shared a similar fake example. See also this discussion: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — Max Teflon, Mar 15 '23 at 10:50
This is the code that I tried: ggplot(df.cs_overall, aes(x=cs_mean.score)) + geom_histogram(fill="#F0A9D7") — Sofia, Mar 15 '23 at 10:52
Without information on what your `df.cs_overall` contains, it is unfortunately a bit hard to reproduce your issue. Could you maybe add an excerpt (i.e. the one you get running `dput(head(df.cs_overall, 20))`) to your question? — Max Teflon, Mar 15 '23 at 10:54
cs_mean.score = c(6, 4.5, 5.75, 6, 4.5, 5.5, 4.5, 5.25, 7, 4, 6.5, 6, 4.75, 2.75, 5, 5.25, 5.75, 6, 5.25, 4.75) - the histogram only depicts this object from the dataframe - does that help? — Sofia, Mar 15 '23 at 11:01

score 1 · Answer 1 · answered Mar 15 '23 at 11:01

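It might be that you just need to set your binwidth appropriately. By default, geom_histogram() splits the range of the data into 30 bins, and those bins generally do not line up with a scale that moves in steps of 0.25, so some bins contain no scale point at all and show up as gaps. Here is an example using simulated data, first with the default binning, then with a binwidth of .25: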

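library(tibble)
library(ggplot2)

# Simulated scores on a 0.25-step scale, sampled with replacement
df.cs_overall <- tibble(cs_mean.score = sample(seq(0, 5, .25), 500, replace = TRUE))

# Default binning: 30 bins that do not line up with the 0.25 steps
ggplot(df.cs_overall, aes(x = cs_mean.score)) + geom_histogram(fill = "#F0A9D7")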
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.



# Bin width matched to the scale step: one bar per scale point
ggplot(df.cs_overall, aes(x = cs_mean.score)) + geom_histogram(fill = "#F0A9D7", binwidth = .25)
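
To see where the gaps come from without looking at the plots, you can inspect the bins ggplot2 actually computes. This is only a sketch: the `scores` vector is made-up data (every point of a hypothetical 1 to 7 scale in steps of 0.25, each chosen three times), not the asker's data, and `bins = 30` is passed explicitly to mirror the default.

```r
library(ggplot2)

# Hypothetical data: every scale point from 1 to 7 (step 0.25) occurs 3 times,
# so no scale point has a count of 0
scores <- rep(seq(1, 7, by = 0.25), times = 3)
df <- data.frame(cs_mean.score = scores)

# Same plot twice: 30 bins (the default) vs. a bin width equal to the scale step
p_default <- ggplot(df, aes(x = cs_mean.score)) + geom_histogram(bins = 30)
p_quarter <- ggplot(df, aes(x = cs_mean.score)) + geom_histogram(binwidth = 0.25)

# layer_data() returns the bins computed by stat_bin, one row per bin
bins_default <- layer_data(p_default)
bins_quarter <- layer_data(p_quarter)

# Number of empty bins in each version
empty_default <- sum(bins_default$count == 0)
empty_quarter <- sum(bins_quarter$count == 0)

print(empty_default)
print(empty_quarter)
```

With 30 bins there are more bins than distinct scale points, so some bins have to stay empty. With a binwidth of 0.25, each scale point should land in its own bin and the gaps disappear.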
