I have a panel dataset with the following columns/variables: Week(t), Custid(i), Activity(i,t), Initial(i) and a host of other variables. I need to create a new variable Experience(i,t) = alpha * Experience(i,t-1) + Activity(i, t-1). The initial value for Experience(i,0)= Initial(i). I am new to R and just about making my transition from SAS to R. How can I create this new variable Experience at the customer level and by week. That is, the value of this variable for the ith customer at week t will depend on the lagged value of the variable in the past week (t-1) plus an Activity made by the customer I in the past week (t-1). Please help, I have a super time crunch to work this problem out. All and any help is sincerely appreciated!
Asked
Active
Viewed 219 times
0
-
2When asking a question, please include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Ideally you will have sample input and desired output. This will make it easier for people to help you otherwise we have to build test datasets ourselves. – MrFlick Jul 11 '14 at 03:58
1 Answers
1
I'll assume this is your input data
#sample data
set.seed(15)
dd<-data.frame(
week=rep(1:5, 3),
cust=rep(1:3, each=5),
activity=runif(15)
)
init<-c(1,2,3)
alpha<-.5
and that looks like
week cust activity
1 1 1 0.6021140
2 2 1 0.1950439
3 3 1 0.9664587
4 4 1 0.6509055
5 5 1 0.3670719
6 1 2 0.9888592
7 2 2 0.8151934
8 3 2 0.2539684
9 4 2 0.6872308
10 5 2 0.8314290
11 1 3 0.1046694
12 2 3 0.6461509
13 3 3 0.5090904
14 4 3 0.7066286
15 5 3 0.8623137
Then we calculate Experience with
Experience <- Map(function(i,d)
Reduce(function(a,b)
{alpha*a + b}, d$activity, i, accumulate=TRUE),
init, split(dd, dd$cust)
)
we use the outer Map
to iterate over the initial values and the subsets of the data for each customer created using split
. Then, the inner Reduce
implements the lagged algorithm as you've described it.
Then to join it back to the table, we need to remove the initial values form the list and re-stack the values back in order. We can do that with
ExpCol <- unsplit(lapply(Experience, tail, -1), dd$cust)
cbind(dd, ExpCol)
which gives us
week cust activity ExpCol
1 1 1 0.6021140 1.1021140
2 2 1 0.1950439 0.7461009
3 3 1 0.9664587 1.3395092
4 4 1 0.6509055 1.3206601
5 5 1 0.3670719 1.0274020
6 1 2 0.9888592 1.9888592
7 2 2 0.8151934 1.8096230
8 3 2 0.2539684 1.1587799
9 4 2 0.6872308 1.2666208
10 5 2 0.8314290 1.4647394
11 1 3 0.1046694 1.6046694
12 2 3 0.6461509 1.4484856
13 3 3 0.5090904 1.2333332
14 4 3 0.7066286 1.3232952
15 5 3 0.8623137 1.5239612

MrFlick
- 195,160
- 17
- 277
- 295
-
Hi MrFlick, Thanks so much for this. This works however getting the following warning: 50: In if (!accumulate) { ... : the condition has length > 1 and only the first element will be used....I checked the result for the first few observations and the results look fine. Not sure if I need to worry about the warning. Please do confirm. Also the init variable could just take 1 value right? – sharmi Jul 11 '14 at 09:24
-
Alternatively, what if I don't have an initial value. How do I modify the code to just have Exp(I,t) = alpha*Exp(I,t-1) + Activity...without any initialization? – sharmi Jul 11 '14 at 10:10
-
Maybe you created a variable called `T`; i changed `accumulate=T` to `accumulate=TRUE` just to be safe. I don't understand how you can have a recursive definition without initialization. Do you just want to initialize everything to 0? Then you can just explicitly set `i` to 0 in the `Reduce` call. – MrFlick Jul 11 '14 at 12:40
-
Yeah changing T to TRUE works! There is no warning now. Thanks so much!!! And yes I think I wanted everything to be initialized to 0, so will change this accordingly in the Reduce call. Also, just a check since I am so new in R, are map, Reduce names of user defined functions or are these R functions? – sharmi Jul 11 '14 at 13:46
-
`Map` and `Reduce` are names of functions included in base R. You can see the documentation for both on the `?Map` help page. – MrFlick Jul 11 '14 at 14:20