Using tapply across subgroups of data

Question

I have a set of car sales data, which I subset into groups by car brand and sales year:

toyota=subset(car, brand=="Toyota")
toyota.yr = cut(toyota$date, "year")
honda=subset(car, brand=="Honda")
honda.yr = cut(honda$date, "year")

etc.

So now I have 6 subgroups, one per car brand, and I use tapply to get the mean sale price of each brand by year:

tapply(toyota$price, toyota.yr, mean, na.rm=TRUE)

I would like to do this for all 6 subgroups. Is there any way to do it in one call rather than typing the tapply function 6 times?

I appreciate any help, thanks!

maybe `aggregate(price ~ brand + year, FUN=mean, data=car)`, this is only a guess, please [make your question reproducible](http://stackoverflow.com/q/5963269/1315767) and you'll get better answers — Jilber Urbina, Jan 16 '14 at 12:53
You can simply do this `tapply( car$price , list( car$brand , car$year ) , FUN = mean , na.rm = TRUE )` — Simon O'Hanlon, Jan 16 '14 at 13:08
Thanks @SimonO'Hanlon, may I ask a follow up question: How could I plot the result? I used the simplest function plot() but it gives a kind of 3x3 matrix plots. I would like to make a plot of y-axis as the price and x-axis as the year, so the dots on the plot are the car brands in different colors. Thanks !! — user2978129, Jan 16 '14 at 13:13

crogg01 · Accepted Answer · 2014-01-16T21:57:30.240

6

# Assumes car has a year column; brands index the first dimension, years the second
tt <- by(car$price, list(car$brand, car$year), mean, na.rm = TRUE)
print(tt["Toyota", "1986"])

Jilber's suggestion is nicer if you want the result directly in a data.frame rather than a `by` object:

aggregate(price ~ brand + year, FUN = mean, data = car, na.rm = TRUE)
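
The question's data has a `date` column rather than a `year` column, so the calls here assume `year` has been derived first. A minimal sketch of that step, assuming `date` is of class `Date`; the toy `car` data frame and its values are made up purely for illustration:

```r
# Toy data standing in for the asker's `car` data frame (values are invented).
car <- data.frame(
  brand = c("Toyota", "Toyota", "Toyota", "Honda", "Honda", "Honda"),
  date  = as.Date(c("1986-03-01", "1986-09-15", "1987-02-10",
                    "1986-05-20", "1987-07-04", "1987-11-30")),
  price = c(10000, 12000, NA, 9000, 9500, 10500)
)

# Derive the year from the date; this assumes `date` is a Date (or POSIXct).
car$year <- format(car$date, "%Y")

# Mean price per brand and year, returned as a data frame.
# The formula interface drops rows with a missing price by default
# (na.action = na.omit), so a group with only NA prices does not appear.
res <- aggregate(price ~ brand + year, data = car, FUN = mean)
print(res)
```

Because rows with a missing price are removed before grouping, a brand/year combination whose prices are all `NA` is simply absent from the result rather than reported as `NaN`.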

Use Simon's suggestion if you wish to put it in a matrix and easily retrieve the results later:

# Rows are brands and columns are years, following the order of the grouping list
tt <- tapply(car$price, list(car$brand, car$year), FUN = mean, na.rm = TRUE)
print(tt["Toyota", "1986"])
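
To make the shape of the `tapply` result concrete, here is a small self-contained sketch. The toy `car` data frame, its values, and the names `toyota_1986` and `long` are assumptions for illustration only, not the asker's data:

```r
# Toy data (invented values); `year` is assumed to exist already.
car <- data.frame(
  brand = c("Toyota", "Toyota", "Toyota", "Honda", "Honda", "Honda"),
  year  = c("1986", "1986", "1987", "1986", "1987", "1987"),
  price = c(10000, 12000, 11000, 9000, NA, 10500)
)

# One call computes the mean price for every brand/year combination.
tt <- tapply(car$price, list(car$brand, car$year), FUN = mean, na.rm = TRUE)

# Dimensions follow the order of the grouping list:
# the first index is the brand, the second is the year.
print(dimnames(tt))
toyota_1986 <- tt["Toyota", "1986"]
print(toyota_1986)

# Optionally flatten the matrix into a long data frame,
# with one row per brand/year pair.
long <- as.data.frame(as.table(tt), responseName = "mean_price")
print(long)
```

The long form can be handier than the matrix when the next step expects one row per group, for example plotting price against year with one colour per brand.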

Use dput(sample_data) to share sample data and make your question reproducible.

edited Jan 16 '14 at 21:57

answered Jan 16 '14 at 12:53


`tapply` and `by` are essentially the same. Good answer. +1. – Simon O'Hanlon Jan 16 '14 at 13:14
`dplyr` style: `q4totalnetassets %>% filter(Country != "TOTAL") %>% group_by(Currency) %>% summarise(Sum=sum(Value))` – Vincent Bonhomme Mar 27 '16 at 12:04
