I am using the data found here: https://www.kaggle.com/cdc/behavioral-risk-factor-surveillance-system. In RStudio, I have loaded the CSV file as BRFSS2015. Below is the code I am trying to run. I remove outliers from PA1MIN_, then turn MARITAL into a factor, and finally try to create a boxplot. The result looks weird to me. Is there something wrong with my graph syntax?
PA1MIN_ <- BRFSS2015$PA1MIN_
upper_PA1MIN_ <- quantile(PA1MIN_, 0.997, na.rm=TRUE)
lower_PA1MIN_ <- quantile(PA1MIN_, 0.003, na.rm=TRUE)
out_PA1MIN_ <- which(PA1MIN_ > upper_PA1MIN_ | PA1MIN_ < lower_PA1MIN_)
BRFSS2015_noout <- subset(BRFSS2015,
                          PA1MIN_ > lower_PA1MIN_ & PA1MIN_ < upper_PA1MIN_)
MARITAL <- c('MARITAL')
BRFSS2015[MARITAL] <- lapply(BRFSS2015[MARITAL], factor)
library(ggplot2)
ggplot(BRFSS2015_noout) +
  geom_boxplot(aes(PA1MIN_, MARITAL), na.rm = TRUE)