I'm facing the following problem in R. I have a data frame with values identifying customers, including a User ID column. I need to add another column with a counter that gives the occurrence number of each row for that particular customer. The data frame is sorted by User ID, so I have something like this:
> niekonwersyjne[c(57:62,72:77),1]
User_ID
AMsySZa--1Og4WwseZJKRyABTWdh
AMsySZa--1Og4WwseZJKRyABTWdh
AMsySZa--1Og4WwseZJKRyABTWdh
AMsySZa--1Og4WwseZJKRyABTWdh
AMsySZa--1Og4WwseZJKRyABTWdh
AMsySZa--1qZghdxj4gypoSQRt_F
AMsySZa--2gL6xRCZFUCOXtpYxNs
AMsySZa--2gL6xRCZFUCOXtpYxNs
AMsySZa--2gL6xRCZFUCOXtpYxNs
AMsySZa--2gL6xRCZFUCOXtpYxNs
AMsySZa--2gL6xRCZFUCOXtpYxNs
AMsySZa--2gL6xRCZFUCOXtpYxNs
But I need something like this:
> niekonwersyjne[c(57:62,72:77),c(1,11)]
User_ID Counter
AMsySZa--1Og4WwseZJKRyABTWdh 1
AMsySZa--1Og4WwseZJKRyABTWdh 2
AMsySZa--1Og4WwseZJKRyABTWdh 3
AMsySZa--1Og4WwseZJKRyABTWdh 4
AMsySZa--1Og4WwseZJKRyABTWdh 5
AMsySZa--1qZghdxj4gypoSQRt_F 1
AMsySZa--2gL6xRCZFUCOXtpYxNs 1
AMsySZa--2gL6xRCZFUCOXtpYxNs 2
AMsySZa--2gL6xRCZFUCOXtpYxNs 3
AMsySZa--2gL6xRCZFUCOXtpYxNs 4
AMsySZa--2gL6xRCZFUCOXtpYxNs 5
AMsySZa--2gL6xRCZFUCOXtpYxNs 6
I can do this with a loop, but the data frame has over 20 million observations, so the computation time is far too high. Is there some other way to achieve this result?
The loop that I am using right now looks like this:
niekonwersyjne$Counter <- 1
for (i in 2:nrow(niekonwersyjne)) {
  # Same customer as the previous row: continue the count
  if (niekonwersyjne[i - 1, "User_ID"] == niekonwersyjne[i, "User_ID"]) {
    niekonwersyjne[i, "Counter"] <- niekonwersyjne[i - 1, "Counter"] + 1
  } else {
    # New customer: restart the count
    niekonwersyjne[i, "Counter"] <- 1
  }
}
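For comparison, here is a minimal loop-free sketch of the same counter in base R. It is only an illustration on toy data: the data frame `df` and its shortened IDs are hypothetical stand-ins for `niekonwersyjne`, and I have not timed it on 20 million rows. The first variant uses `rle()` and `sequence()` and assumes the rows are already sorted (or at least grouped) by User_ID, as they are here; the second uses `ave()` and does not depend on the row order.

```r
# Toy data standing in for niekonwersyjne (hypothetical, shortened IDs).
df <- data.frame(
  User_ID = c("A", "A", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# Variant 1: run-length encoding.
# rle() finds the length of each run of identical consecutive IDs;
# sequence() then expands each length n into 1, 2, ..., n.
# This assumes identical IDs sit next to each other (sorted data).
# as.character() is used because rle() does not accept factors.
runs <- rle(as.character(df$User_ID))
df$Counter <- sequence(runs$lengths)

# Variant 2: group-wise numbering with ave().
# Within each User_ID group the row positions are replaced by
# 1, 2, ..., n, regardless of how the rows are ordered.
df$Counter_ave <- ave(seq_along(df$User_ID), df$User_ID, FUN = seq_along)

print(df)
```

On sorted data both columns should agree. The run-length variant does a single pass over the ID vector, so it is probably the better fit for a very large data frame, but that is an expectation rather than a measurement.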