28

I have a variable ceroonce, which is the number of schools per county (integers) in 2011. When I plot it with boxplot(), only the ceroonce variable is required. The result is a boxplot in which the y axis is the number of schools and the x axis is... the "factor" ceroonce. In ggplot, however, geom_boxplot requires both an x and a y aesthetic, and I just want a boxplot of ceroonce. I have tried supplying ceroonce as both the x and the y axis, but then I get a strange boxplot in which the y axis is the number of schools and the x axis (which should be the factor variable) is also the number of schools. I assume this is very basic statistics, but I am confused. I am attaching the images in the hope that they clarify my question.

This is the code I am using:

ggplot(escuelas, aes(x=ceroonce, y=ceroonce))+geom_boxplot()
boxplot(escuelas$ceroonce)
uthark
manuelq

3 Answers

29
ggplot(escuelas, aes(x="ceroonce", y=ceroonce))+geom_boxplot()

ggplot will treat the character string "ceroonce" as a constant and recycle it to the same length as the ceroonce column, so every row falls into a single group and you get the single box you're looking for.
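To see the recycling in action, here is a minimal sketch. The escuelas data frame below is a hypothetical stand-in with made-up counts (the real data is not shown in the question); layer_data() is used only to confirm that a single box is computed.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: schools per county in 2011.
escuelas <- data.frame(ceroonce = c(3, 5, 8, 12, 7, 4, 20, 6, 9, 15))

# The constant string mapped to x is recycled to one value per row,
# so all observations land in a single discrete group.
p <- ggplot(escuelas, aes(x = "ceroonce", y = ceroonce)) +
  geom_boxplot()

# Inspect the statistics ggplot2 computed for the layer:
# one row in this data frame corresponds to one box.
stats <- layer_data(p)
nrow(stats)
```

Printing p draws the plot; the x axis shows the single label "ceroonce".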

yeedle
23

There are no fancy statistics happening here. boxplot() simply assumes that, since you've given it a single vector, you want a single box in your boxplot. ggplot and geom_boxplot simply don't make that assumption.

If you want a bit less typing, you can do this:

qplot(y=escuelas$ceroonce, x= 1, geom = "boxplot")

ggplot2 will automatically create a vector of 1s equal in length to escuelas$ceroonce.
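The same idea can be written with ggplot() directly, which may be preferable because qplot() is deprecated in recent ggplot2 releases. This is only a sketch: the escuelas data frame is a hypothetical stand-in with made-up counts, and hiding the x axis is an optional cosmetic choice that was not part of the original answer.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: schools per county in 2011.
escuelas <- data.frame(ceroonce = c(3, 5, 8, 12, 7, 4, 20, 6, 9, 15))

# Map x to the constant 1 so that all rows share one position,
# then hide the x axis text and ticks, which carry no information here.
p2 <- ggplot(escuelas, aes(x = 1, y = ceroonce)) +
  geom_boxplot() +
  labs(x = NULL) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())

# One row of computed statistics means one box.
box <- layer_data(p2)
nrow(box)
```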

Tommy O'Dell
  • thank you! yes, i found out that stat_boxplot and thus geom_boxplot require x and y arguments... and that you cannot simply define x (as with boxplot()) – manuelq Jul 31 '14 at 09:36
  • thanks for this response. i think this is a rather unintuitive behaviour from ggplot. A lot of people will likely not group their data. – joaoal Mar 20 '17 at 20:05
2

This could work for you:

ggplot(escuelas, aes(x = "", y = ceroonce)) + geom_boxplot()
Paul Roub
Stephi