
I have a big data frame df, with columns named:

age, income, country

What I want to do is actually very simple: run

fitFunc<-function(thisCountry){
    subframe<-df[which(country==thisCountry)];
    fit<-lm(income~0+age, data=subframe);
    return(coef(fit));
}

for each individual country, then aggregate the results into a new data frame that looks like:

    countryname,  coeffname
1      USA         1.2
2      GB          1.0
3      France      1.1

I tried to do :

do.call("rbind", lapply(allRics[1:5], fitit))

but I don't know what to do next.

Can anyone help?

thanks!

James Bond
  • I didn't know this... apparently `lm` has a `subset` option: http://stackoverflow.com/questions/11328003/how-does-the-subset-argument-work-in-the-lm-function?rq=1 Look at the other "related links" to the right. – Frank May 19 '13 at 11:07
  • And what is the problem? A little tip: add a comma, as in `subframe<-df[which(country==thisCountry),]`, otherwise this line should return an error. – DrDom May 19 '13 at 11:21
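The `subset` argument of `lm` mentioned in the first comment can be sketched roughly as follows. This is only an illustration under assumptions: the toy data, the country labels, and the true slope of 1.1 are made up here, and the output column names `countryname` and `coeffname` are taken from the question's desired result.

```r
# Hypothetical toy data; only the column names follow the question
set.seed(1)
df <- data.frame(
    age     = runif(60, 18, 80),
    country = rep(c("USA", "GB", "France"), each = 20),
    stringsAsFactors = FALSE
)
df$income <- 1.1 * df$age + rnorm(60)

# Let lm() pick the rows itself through its `subset` argument
fitFunc <- function(thisCountry) {
    fit <- lm(income ~ 0 + age, data = df, subset = country == thisCountry)
    coef(fit)
}

# One coefficient per country, collected into a data frame
countries <- unique(df$country)
result <- data.frame(
    countryname = countries,
    coeffname   = unname(vapply(countries, fitFunc, numeric(1)))
)
result
```

Using `subset` avoids building the intermediate `subframe` by hand, so the missing-comma mistake pointed out in the second comment cannot occur.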

3 Answers


Does this work for you?

    set.seed(1)
    df <- data.frame(
        income  = rnorm(100, 100, 20),
        age     = rnorm(100, 40, 10),
        country = factor(sample(1:3, 100, replace = TRUE), levels = 1:3, labels = c("us", "gb", "france"))
    )

    # Fit income ~ 0 + age within each country and return one row per country
    out <- lapply(levels(df$country), function(z) {
        data.frame(country = z, age = coef(lm(income ~ 0 + age, data = df[df$country == z, ])), row.names = NULL)
    })
    do.call(rbind, out)
user20650

Using @user20650's example data, this seems to produce the same result:

library(data.table)
dt <- as.data.table(df)
# one no-intercept fit per country; coef() avoids partial matching on `$coef`
dt[, list(age = coef(lm(income ~ 0 + age))), by = country]

#    country      age
# 1:      gb 2.428830
# 2:      us 2.540879
# 3:  france 2.369560

You'll need to install the data.table package first.

Frank

Note that the plyr package was created for tasks like this. It applies a function to each subset of the data and returns the results in a prespecified form. Using ddply, we pass in a data frame and get a data frame of results back. See the plyr example sessions and help files to learn more. It is well worth the effort to get acquainted with this package! See http://plyr.had.co.nz/ for a start.

library(plyr)
age <- runif(1000, 18, 80)
income <- 2000 + age * 100 + rnorm(1000, 0, 2000)
country <- factor(sample(LETTERS[1:10], 1000, replace = TRUE))
dat <- data.frame(age, income, country)

# returns the single coefficient of the no-intercept fit for one subset
get.coef <- function(dat) coef(lm(income ~ 0 + age, data = dat))

ddply(dat, .(country), get.coef)
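
For comparison, the split / apply / combine steps that ddply performs can be sketched by hand in base R. This is a rough illustration only, not how plyr is implemented; the toy data, its size, and the names `pieces`, `coefs` and `result` are assumptions made for this example.

```r
# Hypothetical toy data mirroring the columns used above
set.seed(42)
age     <- runif(200, 18, 80)
income  <- 100 * age + rnorm(200, 0, 50)
country <- factor(sample(LETTERS[1:4], 200, replace = TRUE))
dat     <- data.frame(age, income, country)

# Split: one data frame per country
pieces <- split(dat, dat$country)

# Apply: fit the no-intercept model on each piece
coefs <- vapply(
    pieces,
    function(d) unname(coef(lm(income ~ 0 + age, data = d))),
    numeric(1)
)

# Combine: one row per country
result <- data.frame(country = names(coefs), age = unname(coefs))
result
```

ddply wraps these three steps in a single call, which is why the plyr version is so short.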
Edwin