
This is my first question so here goes...

My data set is:

person = c("a","a","a","a","b","b","b","b","c","c","d","d","d","d","d","e","e","e","f")

training = c("q1","q2","q7","q4","q1","q2","q3","q4","q3","q4","q3","q4","q5","q6","q99","q18","q1","q9","q99")

data = data.frame(person,training)

I want a COUNTIFS-style function that counts how many times each person has appeared so far, based on their position in the list. Normally I would do this in Excel with an absolute reference, but I have 93k rows of data, so Excel keeps crashing.

In Excel I would have:

[Image: Excel table]

How do I replicate this in R for data$id?

David Arenburg

2 Answers


Here is a solution with dplyr:

library(dplyr)
# Within each person, number the rows in order of appearance (1, 2, 3, ...)
data <- data %>% group_by(person) %>% mutate(id = row_number())

data

   person training    id
   (fctr)   (fctr) (int)
1       a       q1     1
2       a       q2     2
3       a       q7     3
4       a       q4     4
5       b       q1     1
6       b       q2     2
7       b       q3     3
8       b       q4     4
9       c       q3     1
10      c       q4     2
11      d       q3     1
12      d       q4     2
13      d       q5     3
14      d       q6     4
15      d      q99     5
16      e      q18     1
17      e       q1     2
18      e       q9     3
19      f      q99     1
scoa

Here's a possible solution:

# For each row r, count the rows up to and including r that have the same person
data$id <- sapply(seq_len(nrow(data)), function(r) sum(data$person[1:r] == data$person[r]))
digEmAll
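With 93k rows, the sapply approach compares every row against all earlier rows, so the work grows roughly quadratically with the number of rows. As an alternative, here is a minimal base R sketch that uses ave() with seq_along() to number each person's rows in a single grouped pass. It is not taken from either answer; it assumes the same data frame as in the question and that a 1-based running count per person is what data$id should hold.

```r
# Rebuild the example data from the question
person <- c("a","a","a","a","b","b","b","b","c","c","d","d","d","d","d","e","e","e","f")
training <- c("q1","q2","q7","q4","q1","q2","q3","q4","q3","q4","q3","q4","q5","q6","q99","q18","q1","q9","q99")
data <- data.frame(person, training)

# ave() splits the first argument by person, applies seq_along() to each
# group, and puts the results back in the original row positions
data$id <- ave(seq_along(data$person), data$person, FUN = seq_along)

print(data)
```

Because ave() writes each group's result back to that group's own row positions, this should also work when a person's rows are not adjacent.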