I have a set of car sales data, which I subset into groups by car brand and sales year:

toyota=subset(car, brand=="Toyota")
toyota.yr = cut(toyota$date, "year")
honda=subset(car, brand=="Honda")
honda.yr = cut(honda$date, "year")

etc.

So now I have 6 subgroups, one per car brand, and I use tapply to get the mean sale price of each brand by year:

tapply(toyota$price, toyota.yr, mean, na.rm=TRUE)

I would like to do this for all 6 subgroups. Is there any way I can do it in one go rather than typing the tapply call 6 times?

I appreciate any help, thanks!

user2978129
  • maybe `aggregate(price ~ brand + year, FUN=mean, data=car)`, this is only a guess, please [make your question reproducible](http://stackoverflow.com/q/5963269/1315767) and you'll get better answers – Jilber Urbina Jan 16 '14 at 12:53
  • You can simply do this `tapply( car$price , list( car$brand , car$year ) , FUN = mean , na.rm = TRUE )` – Simon O'Hanlon Jan 16 '14 at 13:08
  • Thanks @SimonO'Hanlon, may I ask a follow-up question: how could I plot the result? I used the simplest function, plot(), but it gives a kind of 3x3 matrix of plots. I would like a plot with price on the y-axis and year on the x-axis, where the dots are the car brands in different colors. Thanks !! – user2978129 Jan 16 '14 at 13:13

1 Answer

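# assumes car has a year column, e.g. car$year <- format(car$date, "%Y")
tt <- by(car$price, list(car$brand, car$year), mean, na.rm = TRUE)
print(tt["Toyota", "1986"])  # first index is brand, second is year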

Jilber's suggestion is nicer if you want the result straight in a data.frame rather than a by object:

aggregate(price ~ brand + year, FUN = mean, data = car, na.rm = TRUE)
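
As a rough end-to-end sketch of the aggregate route: the data frame below is made-up toy data standing in for the asker's `car`, and the `year` column is an assumption, derived here from `date` on the assumption that `date` is of class Date.

```r
# Toy data standing in for the asker's `car` data frame (values are made up).
car <- data.frame(
  brand = c("Toyota", "Toyota", "Toyota", "Honda", "Honda", "Honda"),
  date  = as.Date(c("1986-03-01", "1986-09-15", "1987-05-20",
                    "1986-01-10", "1987-07-04", "1987-11-30")),
  price = c(10000, 12000, 11000, 9000, NA, 9500)
)

# Derive a year column from the date (assumes `date` is of class Date).
car$year <- format(car$date, "%Y")

# One row per brand/year combination, holding the mean price.
# Rows with a missing price are dropped by the formula interface's
# default na.action before mean() is called.
res <- aggregate(price ~ brand + year, data = car, FUN = mean, na.rm = TRUE)
print(res)
```

The result is a long-format data.frame with columns brand, year and price, which is usually the easiest shape to merge or plot from.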

Use Simon's suggestion if you wish to put it in a matrix and easily retrieve the results later:

tt <- tapply(car$price, list(car$brand, car$year), FUN = mean, na.rm = TRUE)
print(tt["Toyota", "1986"])  # rows are brands, columns are years
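
For the follow-up about plotting, one possible approach (a sketch only, not the only way) is to pass the brand-by-year matrix to base R's matplot, which draws one colored series per brand. The toy data and the derived `year` column are assumptions made so the example runs on its own.

```r
# Toy data standing in for the asker's `car` data frame (values are made up).
car <- data.frame(
  brand = c("Toyota", "Toyota", "Toyota", "Honda", "Honda", "Honda"),
  date  = as.Date(c("1986-03-01", "1986-09-15", "1987-05-20",
                    "1986-01-10", "1987-07-04", "1987-11-30")),
  price = c(10000, 12000, 11000, 9000, NA, 9500)
)
car$year <- format(car$date, "%Y")  # assumes `date` is of class Date

# Matrix of mean prices: one row per brand, one column per year.
tt <- tapply(car$price, list(car$brand, car$year), FUN = mean, na.rm = TRUE)

# matplot draws one series per column, so transpose to get one per brand.
years <- as.numeric(colnames(tt))
cols  <- seq_len(nrow(tt))
matplot(years, t(tt), type = "b", pch = 19, lty = 1, col = cols,
        xlab = "Year", ylab = "Mean price", xaxt = "n")
axis(1, at = years)
legend("topright", legend = rownames(tt), col = cols, pch = 19, lty = 1)
```

Year ends up on the x-axis and mean price on the y-axis, with each brand in its own color.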

Use dput(sample_data) to share a sample of your data so that the question is reproducible.

crogg01