
I have a panel dataset with the following columns/variables: Week(t), Custid(i), Activity(i,t), Initial(i), and a host of other variables. I need to create a new variable Experience(i,t) = alpha * Experience(i,t-1) + Activity(i,t-1), with the initial value Experience(i,0) = Initial(i). I am new to R and am just making the transition from SAS. How can I create this new variable Experience at the customer level and by week? That is, the value of this variable for the ith customer at week t depends on its lagged value from the previous week (t-1) plus the Activity of customer i in that previous week (t-1). Please help, I am in a serious time crunch to work this problem out. Any and all help is sincerely appreciated!

sharmi
    When asking a question, please include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Ideally you will have sample input and desired output. This will make it easier for people to help you; otherwise we have to build test datasets ourselves. – MrFlick Jul 11 '14 at 03:58

1 Answer


I'll assume this is your input data:

#sample data
set.seed(15)
dd<-data.frame(
     week=rep(1:5, 3), 
     cust=rep(1:3, each=5), 
     activity=runif(15)
)
init<-c(1,2,3)
alpha<-.5

and that looks like

   week cust  activity
1     1    1 0.6021140
2     2    1 0.1950439
3     3    1 0.9664587
4     4    1 0.6509055
5     5    1 0.3670719
6     1    2 0.9888592
7     2    2 0.8151934
8     3    2 0.2539684
9     4    2 0.6872308
10    5    2 0.8314290
11    1    3 0.1046694
12    2    3 0.6461509
13    3    3 0.5090904
14    4    3 0.7066286
15    5    3 0.8623137

Then we calculate Experience with

Experience <- Map(function(i,d) 
    Reduce(function(a,b) 
        {alpha*a + b}, d$activity, i, accumulate=TRUE),
    init, split(dd, dd$cust)
)

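We use the outer Map to iterate over the initial values and the per-customer subsets of the data created by split. Then the inner Reduce implements the recursion: it multiplies the running value a by alpha and adds the next activity value b, and accumulate=TRUE keeps every intermediate result, starting with the initial value.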
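To see what the inner Reduce does on its own, here is a minimal sketch for a single hypothetical customer. The activity and init values below are made up for easy arithmetic and are not part of the sample data; the explicit loop is included only to show the same recursion step by step.

```r
# Hypothetical single-customer inputs (not from the sample data above)
alpha <- 0.5
activity <- c(1, 2, 3)
init <- 4

# Reduce threads the running value `a` through each activity value `b`;
# accumulate = TRUE keeps all intermediate results, starting with `init`
via_reduce <- Reduce(function(a, b) alpha * a + b, activity, init,
                     accumulate = TRUE)

# The same recursion written as an explicit loop
via_loop <- numeric(length(activity) + 1)
via_loop[1] <- init
for (t in seq_along(activity)) {
  via_loop[t + 1] <- alpha * via_loop[t] + activity[t]
}

via_reduce
# 4.00 3.00 3.50 4.75
```

The result has one more element than activity because the initial value comes first.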

Then, to join it back to the table, we need to remove the initial value from each list element and stack the values back in the original row order. We can do that with

ExpCol <- unsplit(lapply(Experience, tail, -1), dd$cust)
cbind(dd, ExpCol)

which gives us

   week cust  activity    ExpCol
1     1    1 0.6021140 1.1021140
2     2    1 0.1950439 0.7461009
3     3    1 0.9664587 1.3395092
4     4    1 0.6509055 1.3206601
5     5    1 0.3670719 1.0274020
6     1    2 0.9888592 1.9888592
7     2    2 0.8151934 1.8096230
8     3    2 0.2539684 1.1587799
9     4    2 0.6872308 1.2666208
10    5    2 0.8314290 1.4647394
11    1    3 0.1046694 1.6046694
12    2    3 0.6461509 1.4484856
13    3    3 0.5090904 1.2333332
14    4    3 0.7066286 1.3232952
15    5    3 0.8623137 1.5239612
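Note that in the table above each row combines the previous Experience with the activity of the same week. If the strictly lagged form from the question (Activity(i,t-1)) is wanted, one possible variant is to drop the last accumulated value per customer instead of the first. This is only a sketch under the assumption that the first observed week of each customer should hold the initial value itself; ExpLag is a name introduced here for illustration.

```r
# Same sample data as above
set.seed(15)
dd <- data.frame(
  week = rep(1:5, 3),
  cust = rep(1:3, each = 5),
  activity = runif(15)
)
init <- c(1, 2, 3)
alpha <- 0.5

# Accumulated values per customer: init, then one value per week
Experience <- Map(function(i, d)
    Reduce(function(a, b) alpha * a + b, d$activity, i, accumulate = TRUE),
  init, split(dd, dd$cust)
)

# Assumption: week 1 holds the initial value, so drop the LAST element;
# row t then uses Experience and activity from row t-1
ExpLag <- unsplit(lapply(Experience, head, -1), dd$cust)
cbind(dd, ExpLag)
```

With this variant the last week's activity does not enter any Experience value, since it would only affect a week that is not in the data.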
MrFlick
  • Hi MrFlick, thanks so much for this. This works; however, I am getting the following warning: 50: In if (!accumulate) { ... : the condition has length > 1 and only the first element will be used.... I checked the first few observations and the results look fine. I'm not sure whether I need to worry about the warning; please confirm. Also, the init variable could just take one value, right? – sharmi Jul 11 '14 at 09:24
  • Alternatively, what if I don't have an initial value? How do I modify the code to just have Exp(i,t) = alpha*Exp(i,t-1) + Activity, without any initialization? – sharmi Jul 11 '14 at 10:10
  • Maybe you created a variable called `T`; I changed `accumulate=T` to `accumulate=TRUE` just to be safe. I don't understand how you can have a recursive definition without initialization. Do you just want to initialize everything to 0? Then you can just explicitly set `i` to 0 in the `Reduce` call. – MrFlick Jul 11 '14 at 12:40
  • Yes, changing T to TRUE works! There is no warning now. Thanks so much!!! And yes, I think I wanted everything to be initialized to 0, so I will change this accordingly in the Reduce call. Also, just to check since I am so new to R: are Map and Reduce user-defined functions, or are they built-in R functions? – sharmi Jul 11 '14 at 13:46
  • `Map` and `Reduce` are names of functions included in base R. You can see the documentation for both on the `?Map` help page. – MrFlick Jul 11 '14 at 14:20
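
Following up on the comments above, a minimal sketch of the zero-initialized case might look like the following. It assumes every customer starts from 0 and that rows are sorted by week within each customer; it uses base R's ave in place of the Map/split/unsplit combination, and ExpZero is a name introduced here for illustration.

```r
# Same sample data as in the answer
set.seed(15)
dd <- data.frame(
  week = rep(1:5, 3),
  cust = rep(1:3, each = 5),
  activity = runif(15)
)
alpha <- 0.5

# ave() applies FUN to the activity values of each customer and puts the
# results back in the original row order
ExpZero <- ave(dd$activity, dd$cust, FUN = function(act) {
  # Start the recursion at 0, then drop that starting value with [-1]
  Reduce(function(a, b) alpha * a + b, act, 0, accumulate = TRUE)[-1]
})
cbind(dd, ExpZero)
```

Because the starting value is 0, the first week of each customer simply equals that week's activity.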