
Is there a way to create a box plot from one binary (yes/no) column and one numeric column of a data frame in R using the plot() command?

I tried:

    boxplot(college$Accept, college$Private, main = "Accepted Versus Private")

but the box for the binary-valued (Private) column is flat:

[Image: Box Plot of Students Accepted Versus Private School]

(I am specifically trying to use the plot() command but if this is not possible I am interested in learning how to do it with boxplot().)

Ben Bolker
  • By "attributes", do you mean columns of the data frame? Because "attribute" is something else altogether. – Dominic Comtois Jan 21 '18 at 05:34
  • What have you tried? Please read the [how-to-ask](https://stackoverflow.com/help/how-to-ask) page and produce a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of your problem. – Kevin Arseneau Jan 21 '18 at 05:34
  • Ok, I tried to add a minimal reproducible example. I apologize for my non-standard format. I'm still learning! – Aliza Miller Jan 21 '18 at 05:42
  • Too minimal. We know nothing about `mydata` except that it might have columns named `"var1"` and `"var2"`. Have you tried `boxplot(...)`? – r2evans Jan 21 '18 at 06:04
  • Okay, I'll try including more. I am specifically trying to use `plot()`. I did use `boxplot()` but it did not seem correct. – Aliza Miller Jan 21 '18 at 06:09
  • Please add some of the data to your question. Use `dput()` for a subset of the data, and add it to your post. – Len Greski Jan 21 '18 at 07:35

1 Answer


The second boxplot in the OP's image is "flat" because both boxes share one y-axis: Private has a range of 0 to 1, whereas Accept has a range of roughly 0 to 25,000. boxplot() is working correctly.
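
The scale mismatch can be checked directly. Here is a minimal sketch using made-up data; the `demo` data frame and its values are assumptions for illustration, not the OP's data.

```r
# Hypothetical stand-in for the OP's data:
# Accept is in the thousands, Private is coded 0/1
set.seed(1)
demo <- data.frame(
  Accept  = round(runif(100, min = 100, max = 25000)),
  Private = rbinom(100, size = 1, prob = 0.5)
)

# Range of each column: Accept spans thousands, Private spans 0 to 1
ranges <- sapply(demo, range)
print(ranges)

# boxplot(x, y) draws one box per argument on a single shared y-axis.
# plot = FALSE returns the summary statistics without drawing anything.
stats <- boxplot(demo$Accept, demo$Private, plot = FALSE)
print(stats$stats)  # five-number summary, one column per box
```

On an axis that has to reach the largest Accept value, a box whose whole summary lies between 0 and 1 is drawn as a flat line.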

A more useful chart would show two boxplots of the Accept variable, one for each value of Private, using the formula interface:

    boxplot(Accept ~ Private, data = college)

Since the OP did not include a Complete, Minimal, and Verifiable Example, here is an example boxplot with generated data.

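    # generate reproducible example data
    set.seed(100317)
    # acceptances at private colleges; replace any negative draws with 10
    acceptPrivate <- rnorm(500, mean = 5000, sd = 2000)
    acceptPrivate[acceptPrivate < 0] <- 10
    # acceptances at public colleges
    acceptPublic <- rnorm(500, mean = 15000, sd = 4000)
    acceptPublic[acceptPublic < 0] <- 10
    # Private: 1 = private, 0 = public
    Private <- c(rep(1, 500), rep(0, 500))
    college <- data.frame(Private, Accept = c(acceptPrivate, acceptPublic))
    boxplot(Accept ~ Private, data = college, main = "Accept by Private")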


One could make the chart easier to understand by converting the binary Private variable to a factor with descriptive labels.

    # convert Private to a factor: 0 = Public, 1 = Private
    college$Private <- factor(college$Private, levels = c(0, 1), labels = c("Public", "Private"))
    boxplot(Accept ~ Private, data = college, main = "Accept by College Type")
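
On the plot() part of the question: when the grouping variable is a factor and the response is numeric, plot() itself draws box plots. The following is a sketch with generated data in the same spirit as the example above. The values are made up, and `pmax()` is used here as a shorthand for flooring the draws at 10.

```r
# Generated data (values are made up for illustration)
set.seed(100317)
college <- data.frame(
  Private = factor(rep(c("Private", "Public"), each = 500)),
  Accept  = c(pmax(rnorm(500, mean = 5000, sd = 2000), 10),
              pmax(rnorm(500, mean = 15000, sd = 4000), 10))
)

# With a factor on the right-hand side of the formula, plot() draws box plots
plot(Accept ~ Private, data = college, main = "Accept by College Type")

# Same idea without the formula interface: x is a factor, y is numeric
plot(college$Private, college$Accept, xlab = "Private", ylab = "Accept")
```

If Private is left as a numeric 0/1 column, the same plot() call produces a scatterplot instead, so the conversion to a factor is what triggers the box plot.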


Len Greski