
[Figure: Age vs. Medal Plot]

I made a box plot comparing the ages of male Olympic swimmers, grouped by whether or not they earned a medal. How do I get a five-number summary for each box, one for the no-medal group and one for the medal group? (I converted medal to a factor, medal.f.) I tried summary(age,medal.f) and summary(age~medal.f), but neither works, and I don't know how to get the statistics for each group separately. Any thoughts on how to do this?

Ben Bolker
user2821333

1 Answer


The easiest way to get this information is to save the result of your boxplot() call and extract the $stats component. Using the built-in ToothGrowth data set,

b <- boxplot(len~supp,data=ToothGrowth)
b$stats
##      [,1] [,2]
## [1,]  8.2  4.2
## [2,] 15.2 11.2
## [3,] 22.7 16.5
## [4,] 25.8 23.3
## [5,] 30.9 33.9
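
The rows of b$stats run from the lower whisker end through the lower hinge, median, and upper hinge to the upper whisker end, and the columns follow the order of b$names. A minimal sketch that labels the matrix so it is easier to read (the row labels and the variable name `stats` are my own additions, not something boxplot() returns; plot = FALSE only skips the drawing):

```r
# Compute the box plot statistics without drawing the plot
b <- boxplot(len ~ supp, data = ToothGrowth, plot = FALSE)

# Copy the statistics and attach descriptive labels (labels are illustrative)
stats <- b$stats
dimnames(stats) <- list(
  c("lower whisker", "lower hinge", "median", "upper hinge", "upper whisker"),
  b$names  # group names, in the same order as the columns
)
stats
```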

More generally, you can do this by hand with something like

with(data,lapply(split(age,medal),boxplot.stats))

There are many other solutions involving by() or the plyr, dplyr, data.table packages ...
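
As a rough sketch of one of those alternatives in base R, assuming you are happy with fivenum() (Tukey's five-number summary, which boxplot.stats() starts from; the two only differ in the whisker ends, and only when a group has outliers):

```r
# by() applies fivenum() to len within each level of supp
# and prints one block per group
by(ToothGrowth$len, ToothGrowth$supp, fivenum)

# aggregate() returns the same numbers as a data frame:
# one row per group, with the five values stored in a matrix column
res <- aggregate(len ~ supp, data = ToothGrowth, FUN = fivenum)
res
```

The name `res` is just a placeholder; the data frame form can be handier if you want to merge the summaries with other per-group results.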

Again using ToothGrowth:

(bps <- with(ToothGrowth,lapply(split(len,supp),boxplot.stats)))
$OJ
$OJ$stats
[1]  8.2 15.2 22.7 25.8 30.9

$OJ$n
[1] 30

$OJ$conf
[1] 19.64225 25.75775

$OJ$out
numeric(0)


$VC
$VC$stats
[1]  4.2 11.2 16.5 23.3 33.9

$VC$n
[1] 30

$VC$conf
[1] 13.00955 19.99045

$VC$out
numeric(0)

If you just want the 5-number summaries, you can extract them as follows:

 sapply(bps,"[[","stats")
       OJ   VC
[1,]  8.2  4.2
[2,] 15.2 11.2
[3,] 22.7 16.5
[4,] 25.8 23.3
[5,] 30.9 33.9
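
For the age/medal case from the question, the same pattern might look like the sketch below. It assumes the small sample vectors posted in the comments (not the original 444-value data), and the name `five` is only a placeholder:

```r
# Sample data copied from the asker's comment (not the full data set)
age   <- c(15, 15, 16, 17, 20, 21, 21, 21, 25)
medal <- c(0, 0, 1, 0, 0, 1, 1, 0, 0)
medal.f <- factor(medal, labels = c("No Medal", "Medal"))

# Split age by medal status, then keep only the $stats component of each group;
# the result has one column per group, rows running from lower to upper whisker
five <- sapply(split(age, medal.f), function(x) boxplot.stats(x)$stats)
five
```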
Ben Bolker
  • Thanks for the response! split(age,medal.f) listed everything in the two groups and separated like I needed it to but now I'm still confused on how to take the 5 number summary of those splits. I tried using with like you suggested but that didn't get to the result I needed. – user2821333 Dec 08 '15 at 18:25
  • if this doesn't work you definitely need to provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) ... – Ben Bolker Dec 08 '15 at 18:28
  • `library(coin); age <- c(15,15,16,17,20,21,21,21,25); medal <- c(0,0,1,0,0,1,1,0,0); medal.f <- factor(medal, labels = c("No Medal", "Medal")); wilcox_test(age~medal.f); boxplot(age~medal.f, main="Age vs. Medals", ylab="Age", col=(c("darkolivegreen1","lavender"))); split(age,medal.f)` – user2821333 Dec 08 '15 at 18:40
  • You should add this information to your question (although it doesn't match the boxplot you've shown) ... what's wrong with `lapply(split(age,medal.f), boxplot.stats)` ... ??? – Ben Bolker Dec 08 '15 at 18:46
  • I simplified the data because my original vectors had 444 numbers in them but the answer you just edited fixed my problem, thanks so much! – user2821333 Dec 08 '15 at 18:50
  • if this answered your question, you're encouraged to click on the check-mark to accept it ... – Ben Bolker Dec 08 '15 at 19:38