
I am trying to use geom_dotplot. In general it works well for smaller datasets, but I'd also like to use it for larger amounts of data.

When running with the default binwidth settings (so not specifying binwidth at all), some of the points are simply cut off on the y-axis.

Take this example code:

library(ggplot2)

# 100 identical values at 3 plus 200 uniform random values in [0, 1]
df <- data.frame(input = c(rep(3, 100), runif(200)))

ggplot(data = df, aes(x = input)) +
  geom_dotplot(binwidth = 0.015)

If I specify binwidth this works perfectly and I can see all points in the plot.


But if I use the default settings, without specifying binwidth, like this:

ggplot(data = df, aes(x = input)) +
  geom_dotplot()

I get a plot in which some of the points are cut off on the y-axis.

Of course you could set binwidth manually each time. But I am looking for a solution that automatically adjusts the binwidth (or some other parameter) in such a way that all points are displayed on the y-axis.
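
To illustrate the kind of automation meant here, a minimal sketch in base R might look like the following. `fit_binwidth` is a hypothetical helper, not a ggplot2 function. It assumes fixed-width bins (closer to `method = "histodot"` than to the default dot-density binning), a known panel aspect ratio (height / width), and it ignores the scale expansion, so the result is only approximate.

```r
# Hypothetical helper (not part of ggplot2): find the largest candidate
# binwidth for which the tallest stack of dots still fits into the panel.
# Assumptions: fixed-width bins starting at min(x), panel_aspect is the
# panel height divided by its width, and scale expansion is ignored.
fit_binwidth <- function(x, panel_aspect = 1, dotsize = 1, stackratio = 1,
                         n_candidates = 200) {
  x <- x[is.finite(x)]
  x_range <- diff(range(x))
  if (x_range == 0) stop("x needs at least two distinct values")
  # Candidates shrink geometrically, starting at range / 30
  candidates <- x_range / 30 * 0.97^(seq_len(n_candidates) - 1)
  for (b in candidates) {
    # Number of observations in the fullest bin for this binwidth
    max_count <- max(tabulate(floor((x - min(x)) / b) + 1))
    # Height of the tallest stack, measured in x-axis data units
    stack_height <- b * dotsize * (1 + (max_count - 1) * stackratio)
    # Panel height expressed in the same units
    if (stack_height <= x_range * panel_aspect) return(b)
  }
  stop("no candidate binwidth fits; increase n_candidates")
}

set.seed(1)
df <- data.frame(input = c(rep(3, 100), runif(200)))
bw <- fit_binwidth(df$input)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = input)) +
    geom_dotplot(binwidth = bw, method = "histodot") +
    theme(aspect.ratio = 1)  # make the assumed panel shape explicit
}
```

The helper searches over candidates rather than solving for the binwidth directly, because the count in the fullest bin itself changes with the binwidth. ggplot2 may place its bin boundaries differently, so the tallest stack in the actual plot can differ slightly from the one computed here.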

One of the problems is that commands like ylim(0, 50) or scale_y_continuous(limits=c(0, 2)) have no effect.

This is probably related to the following note in the geom_dotplot documentation:

the numbers on y axis are not meaningful, due to technical limitations of ggplot2.

I was wondering whether I could use the "Computed variables" of geom_dotplot for this, something like count, the way you usually do with geom_histogram and after_stat:

ggplot(mpg, aes(displ)) +
  geom_histogram(aes(y = after_stat(count / max(count))))

But it seems this doesn't work as input for binwidth; after_stat seems to work only inside aes()?

Maybe there is an even more obvious solution I didn't think of.

tjebo
Steffen Moritz
  • You are right, probably the code was a little bit excessive ... tried to break it down to the key problem now. – Steffen Moritz Jun 14 '21 at 16:34
  • I guess the reason is that each point ends up with an "absolute" rather than a relative size. If you change the device size, the dots are displayed. It is still an interesting question how to automatically adjust the device size to the bin width. – tjebo Jun 16 '21 at 19:07
  • A workaround could be to not use `geom_dotplot`, but to calculate the bins manually and use `geom_point` or, even better, `ggforce::geom_ellipse`. See, for example, https://stackoverflow.com/a/61545633/7941188 or https://stackoverflow.com/a/61500224/7941188 – tjebo Jun 16 '21 at 19:12
  • Thanks for your help! I would of course have preferred to avoid calculating everything on my own, but on the other hand it solves all possible problems, as it gives me control. geom_dotplot somehow seems to be one of the ggplot2 functions that still has room for improvement. I also saw multiple issues with creating a meaningful y-axis that displays counts instead of arbitrary values from 0 to 1 that do not correspond to the dots. – Steffen Moritz Jun 17 '21 at 17:26
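
The workaround mentioned in the comments (calculate the bins manually, then draw with `geom_point`) could be sketched roughly as follows. `stack_dots` is a hypothetical helper, and its fixed-width binning is an assumption that differs from the default dot-density method of `geom_dotplot`.

```r
# Hypothetical helper: assign each observation to a fixed-width bin and give
# it a stack position (1, 2, 3, ...) within that bin.
stack_dots <- function(x, binwidth) {
  bin <- floor((x - min(x)) / binwidth)
  data.frame(
    x = min(x) + (bin + 0.5) * binwidth,         # bin centre
    y = ave(seq_along(x), bin, FUN = seq_along)  # running count within the bin
  )
}

set.seed(1)
df <- data.frame(input = c(rep(3, 100), runif(200)))
dots <- stack_dots(df$input, binwidth = 0.05)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # y now holds real counts, so the usual axis limits apply
  p <- ggplot(dots, aes(x = x, y = y)) +
    geom_point() +
    ylim(0, max(dots$y))
}
```

With this approach the y-axis shows actual counts, and the point size no longer depends on the binwidth, so it would have to be tuned separately (for example via the `size` argument of `geom_point`).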

0 Answers