I would like to create a variable "Time" that counts how many times an ID has appeared within each day, minus 1. In other words, the count is lagged by 1: the first time an ID shows up on a given day, Time should be left blank; the second time, it should be 1; and so on.

Basically, I want to create the "Time" variable in the example below.

ID Day Time Value
1  1        0
1  1    1   0
1  1    2   0
1  2        0
1  2    1   0
1  2    2   0
1  2    3   1
2  1        0
2  1    1   0
2  1    2   0

Below is the code I have been working on, but it has not worked so far.

data$time<-data.frame(data$ID,count=ave(data$ID==data$ID, data$Day, FUN=cumsum))
Louis
Check out http://stackoverflow.com/questions/12925063/numbering-rows-within-groups-in-a-data-frame and http://stackoverflow.com/questions/28647954/add-an-index-or-counter-to-a-dataframe-by-group-in-r – chinsoon12 Apr 01 '16 at 03:20

1 Answer

We can do this with data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1); then, grouped by 'ID' and 'Day', take the lag of the row sequence (shift(seq_len(.N))) and assign it by reference (:=) as the "Time" column.

library(data.table)
setDT(df1)[, Time := shift(seq_len(.N)), .(ID, Day)]
df1
#    ID Day Value Time
# 1:  1   1     0   NA
# 2:  1   1     0    1
# 3:  1   1     0    2
# 4:  1   2     0   NA
# 5:  1   2     0    1
# 6:  1   2     0    2
# 7:  1   2     1    3
# 8:  2   1     0   NA
# 9:  2   1     0    1
#10:  2   1     0    2
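
To see what the grouped expression does, here is a minimal sketch on a single hypothetical group of four rows. The vectors idx and lagged are illustrative names, not part of the original data. With its default arguments, shift() lags by one position and pads the front with NA, which is why the first row of each group ends up blank.

```r
library(data.table)

# Hypothetical group with four rows; inside a data.table group, .N would be 4.
idx <- seq_len(4)      # 1 2 3 4

# Default arguments: lag by one position, pad with NA.
lagged <- shift(idx)
lagged
# [1] NA  1  2  3
```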

Or with base R

with(df1, ave(Day, Day, ID, FUN= function(x)
        ifelse(seq_along(x)!=1, seq_along(x)-1, NA)))
#[1] NA  1  2 NA  1  2  3 NA  1  2
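
The call above only prints the vector. As a sketch of how it could be stored as the "Time" column, the version below rebuilds the same example data and uses c(NA, ...) in place of ifelse, which is a swapped-in variant rather than the code shown in this answer. It also assumes the rows of each (ID, Day) group are already in the desired order.

```r
# Same values as the example data used in this answer.
df1 <- data.frame(
  ID    = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  Day   = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L),
  Value = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L)
)

# Within each (ID, Day) group: NA for the first row, then 1, 2, ...
df1$Time <- with(df1, ave(Day, ID, Day, FUN = function(x)
  c(NA, seq_along(x)[-1] - 1L)))
df1$Time
# [1] NA  1  2 NA  1  2  3 NA  1  2
```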

Or without the ifelse

with(df1, ave(Day, Day, ID, FUN= function(x) 
            NA^(seq_along(x)==1)*(seq_along(x)-1)))
#[1] NA  1  2 NA  1  2  3 NA  1  2
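
The NA^ trick relies on how R evaluates powers of NA: NA^0 is 1, while NA^1 stays NA. A small sketch on a hypothetical three-row group (the names first, mask and out are illustrative) shows the two steps separately.

```r
# TRUE marks the first row of a hypothetical three-row group.
first <- c(TRUE, FALSE, FALSE)

# TRUE is coerced to 1 and FALSE to 0, so NA^TRUE is NA and NA^FALSE is 1.
mask <- NA^first
mask
# [1] NA  1  1

# Multiplying by the zero-based position keeps NA first, then 1, 2, ...
out <- mask * (seq_along(first) - 1)
out
# [1] NA  1  2
```

Note that the result is a double vector rather than an integer one, which may matter if the column type is checked later.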

data

df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), 
Day = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L), Value = c(0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L)), .Names = c("ID", "Day", 
"Value"), row.names = c(NA, -10L), class = "data.frame")
akrun