-2

I am looking to find summary statistics (the mean, and potentially the standard deviation and other quantities) of a vector (column) in a data frame, computed separately for each level of another categorical variable.

I know that one can get a summary with

summary(data$rating)

However, I am not sure how to get the summary statistics for each gender separately.

I tried

summary(data$rating, data$gender)

but that gives me nothing other than the output of summary(data$rating).

marc_s
pkpkPPkafa

2 Answers

0

You could also use the by function:

by(data$rating, data$gender, summary)
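As a minimal sketch of what that call returns, assuming a small toy data frame (the values below are made up for illustration; only the column names come from the question), the result holds one summary per gender, and a single group can be pulled out by its level name:

```r
# Hypothetical toy data; only the column names match the question.
data <- data.frame(
  rating = c(10, 20, 30, 40, 50, 60),
  gender = c("female", "female", "female", "male", "male", "male")
)

# One summary() per level of gender.
res <- by(data$rating, data$gender, summary)
print(res)

# Each element is an ordinary summary object, so single
# statistics can be extracted by name.
female_mean <- res[["female"]][["Mean"]]
male_mean <- res[["male"]][["Mean"]]
```

Any other function of a numeric vector (mean, sd, quantile, ...) can be passed in place of summary.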
cimentadaj
-1

Use tapply() or aggregate():

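# Example data: 30 random ratings and a randomly assigned gender
data <- data.frame(rating = 100 * runif(30),
                   gender = sample(c("female", "male"), 30, replace = TRUE))

# Full summary() for each gender
tapply(data$rating, data$gender, summary)

# Selected statistics for each gender
aggregate(data$rating, by = list(gender = data$gender),
          FUN = function(x) c(mean = mean(x), median = median(x), sd = sd(x)))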
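One thing to be aware of: when FUN returns several values, aggregate() stores them in a single matrix column rather than in separate columns. A rough sketch of how that could be flattened, assuming made-up toy data and using the formula interface of aggregate() (the object names stats and flat are just illustrative):

```r
# Hypothetical toy data; only the column names match the question.
data <- data.frame(
  rating = c(10, 20, 30, 40, 50, 60),
  gender = c("female", "female", "female", "male", "male", "male")
)

# Several named statistics per gender via the formula interface.
stats <- aggregate(rating ~ gender, data = data,
                   FUN = function(x) c(mean = mean(x),
                                       median = median(x),
                                       sd = sd(x)))

# stats$rating is a matrix with columns mean, median and sd.
# do.call(data.frame, ...) spreads it into ordinary columns.
flat <- do.call(data.frame, stats)
print(flat)
```

After flattening, the statistics are available as flat$rating.mean, flat$rating.median and flat$rating.sd.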
Bernhard