
I have a question about the jitter plot function. It is a nice way to plot points for different categories while avoiding overlap. But what does the exact distance between the points tell us? I would appreciate an answer a lot. Thank you in advance!

click here for image of graph

Code:

# Visualisation in grid
#Workday alcohol consumption per school
DFV1 <- ggplot(data.source1, aes(x=dalc, y=school, color=sex))+geom_jitter(alpha=0.7)+scale_colour_manual(values=c("#ff7f50", "#468499"))+theme_bw()+xlab("Workday alcohol consumption")+ylab("School")+ggtitle("Workday alcohol consumption per school and sex")

#Weekend alcohol consumption per school
DFV2 <- ggplot(data.source1, aes(x=walc, y=school, color=sex))+ geom_jitter(alpha=0.7)+ scale_colour_manual(values=c("#ff7f50", "#468499"))+theme_bw()+xlab("Weekend alcohol consumption")+ylab("School")+ggtitle("Weekend alcohol consumption per school and sex")

# Set grid (package gridExtra is needed)
library(gridExtra)
grid.arrange(DFV1,DFV2, nrow=2)
tjebo
ALOtto95
    The horizontal point distance is arbitrary. The jitter is just random. This is why one should use a beeswarm plot here, instead. – danlooo Jun 13 '22 at 09:45

1 Answer


The maximum amount of jitter is specified by the width and height arguments of geom_jitter(); within that range, each point's offset is random. From the documentation:

If omitted, defaults to 40% of the resolution of the data: this means the jitter values will occupy 80% of the implied bins

So the distance does not tell you anything about the data.
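To see this for yourself, here is a minimal sketch, assuming ggplot2 is installed. The data frame `toy` is a hypothetical stand-in for `data.source1`, and the width, height and seed values are arbitrary choices for illustration. It plots the same data with two different random seeds and pulls out the jittered coordinates with `layer_data()`:

```r
library(ggplot2)

# Hypothetical toy data standing in for data.source1
toy <- data.frame(
  dalc   = rep(1:3, each = 20),
  school = rep(c("GP", "MS"), times = 30)
)

# Same data, same jitter range, two different seeds
p1 <- ggplot(toy, aes(x = dalc, y = school)) +
  geom_point(position = position_jitter(width = 0.2, height = 0.2, seed = 1))
p2 <- ggplot(toy, aes(x = dalc, y = school)) +
  geom_point(position = position_jitter(width = 0.2, height = 0.2, seed = 2))

# Extract the coordinates that would actually be drawn
d1 <- layer_data(p1)
d2 <- layer_data(p2)

# Offset of each drawn point from its true (integer) x value
offset1 <- d1$x - round(d1$x)
max_offset <- max(abs(offset1))

# The drawn positions change with the seed, although the data are identical
same_positions <- identical(d1$x, d2$x)
```

Here `max_offset` should stay within the chosen `width`, while `same_positions` should be `FALSE`: the point positions differ between the two plots only because the random draw differs. If you want the plot to be reproducible, fixing `seed` in `position_jitter()` is one way to do it.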

Andrea M