
In psychometrics, you might have discrete measurements (e.g. on a scale from 1 to 4), but still assume that those measurements represent an underlying continuous process.

I am trying to produce a plot that depicts these discrete measurements and the underlying distribution.

So far I haven't managed to get what I want. The best I have come up with is overlaying a density plot on a histogram, but the scale of the histogram densities does not match the scale of the density line:

library(ggplot2)

var1  <- c(rep(1, times = 50),
           rep(2, times = 60),
           rep(3, times = 40),
           rep(4, times = 30))

df <- as.data.frame(var1)

ggplot(df, aes(x = var1)) +
  geom_line(aes(y = after_stat(density)), stat = 'density') +
  geom_histogram(aes(y = after_stat(density)))
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

My understanding is that I am looking at two different density functions:

  • on the histogram, I oversample some parts of the distribution and undersample others, which produces "distorted" density estimates compared to
  • the density line, where the density estimate is computed over the whole (continuous) range of values on my interval.

... is there a way to get both functions onto the same scale? (If not, maybe someone has a clue why it makes no statistical sense to try.)
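One possible reading of the mismatch, offered as a sketch rather than a definitive answer: a histogram on the density scale draws each bar at count / (n * binwidth), so with the default 30 narrow bins the few non-empty bars become very tall. Assuming a bin width of 1 is acceptable for a 1 to 4 scale, each bar height reduces to the proportion of responses in that category, which is on the same order as the kernel density curve. The names `bar_heights` and `p_same_scale` are introduced here for illustration only.

```r
library(ggplot2)

var1 <- c(rep(1, times = 50),
          rep(2, times = 60),
          rep(3, times = 40),
          rep(4, times = 30))
df <- data.frame(var1 = var1)

# With binwidth = 1, density = count / (n * 1), i.e. the proportion per category
bar_heights <- as.numeric(table(var1)) / length(var1)

# Assumption: one bin per scale point (binwidth = 1) instead of the default 30 bins
p_same_scale <- ggplot(df, aes(x = var1)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 fill = "grey80", colour = "white") +
  geom_line(aes(y = after_stat(density)), stat = "density")

p_same_scale
```

With this choice both layers integrate to 1 over the x axis, although the curve is still smoothed and need not touch the top of each bar.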

Thanks!

raphael_ldl
  • the trick is to use `stat(density)` as y in `geom_histogram`, as in: https://stackoverflow.com/a/54700242/7941188 – tjebo Dec 02 '20 at 10:25
  • Thanks for the amazingly quick responses! I have another piece of work at hand right now, I will come back to this later. – raphael_ldl Dec 02 '20 at 12:08
  • I am still not getting the result I am looking for with this. I am trying to produce a plot where the "tip" of the histogram is at the same height as the "tip" of the density curve. Maybe you can share a reprex to see if we are really talking about the same thing? – raphael_ldl Dec 02 '20 at 13:12
  • to get the tip of the density curve at a desired point may not be straight forward - it is very much a matter of the smoothing parameter. (I think it is something like `bw = ...`) within the geom_density call – tjebo Dec 03 '20 at 15:11
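To illustrate the point about the smoothing parameter, here is a minimal sketch, assuming the `bw` argument of `geom_density()` (passed on to `stats::density()`) is the knob meant in the comment above. The bandwidth values and the helper `peak_height()` are hypothetical and chosen only to show the direction of the effect, not as recommended settings.

```r
library(ggplot2)

var1 <- rep(1:4, times = c(50, 60, 40, 30))
df <- data.frame(var1 = var1)

# Hypothetical helper: highest point of the kernel density for a given bandwidth
peak_height <- function(bw) max(density(var1, bw = bw)$y)

# Illustrative bandwidths only; a larger bw gives a smoother, flatter curve
bandwidths <- c(0.2, 0.4, 0.8)
peaks <- sapply(bandwidths, peak_height)

# Overlay one bandwidth on a histogram with one bin per scale point
p_bw <- ggplot(df, aes(x = var1)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 fill = "grey80", colour = "white") +
  geom_density(bw = 0.4)

p_bw
```

For this data the peak gets lower as the bandwidth grows, so lining up the "tip" of the curve with the tallest bar comes down to picking `bw` by trial rather than to any rescaling of the histogram.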
