1

So I have a table that looks like this currently:

data_wrong <- data.table(State = c("NY", "NY", "NY", "NY", "PA", "PA", "PA", 
"NJ", "NJ", "NJ"), Year = c("1973", "1974", "1975", "2005", "1992", "1993", 
"2001", "1930", "1931", "1932"), Consecutive_Yrs = c(1,2,3,1,1,6,1,1,9,10))

And I'd like it to look like this:

data <- data.table(State = c("NY", "NY", "NY", "NY", "PA", "PA", "PA", "NJ", 
"NJ", "NJ"), Year = c("1973", "1974", "1975", "2005", "1992", "1993", 
"2001", "1930", "1931", "1932"), Consecutive_Yrs = c(1,2,3,1,1,2,1,1,2,3))

This is the code I'm using right now to get my table:

data$diff <- NA

data <- data %>%
  group_by(State) %>%
  arrange(State) %>%
  mutate(diff = Year - lag(Year, default = first(Year)))

data$Consecutive_Yrs <- 1

data$Consecutive_Yrs <- ifelse(data$diff == 1, cumsum(data$Consecutive_Yrs), 
1)

Any help would be greatly appreciated :)

Sarah
  • 411
  • 4
  • 14
  • Both answers I have received do not work with my version of R. Can someone please suggest and ideas for me? Thanks! – Sarah Jul 29 '19 at 15:22

1 Answers1

5

As. it is a data.table, an option is to use data.table methods

library(data.table)
data_wrong[, grp := cumsum(c(TRUE, diff(as.numeric(Year)) > 1)), 
       .(State)][, Consecutive_Yrs := as.numeric(seq_len(.N)), .(State, grp)]
data_wrong
#    State Year Consecutive_Yrs grp
# 1:    NY 1973               1   1
# 2:    NY 1974               2   1
# 3:    NY 1975               3   1
# 4:    NY 2005               1   2
# 5:    PA 1992               1   1
# 6:    PA 1993               2   1
# 7:    PA 2001               1   2
# 8:    NJ 1930               1   1
# 9:    NJ 1931               2   1
#10:    NJ 1932               3   1

Or use rowid

data_wrong[, Consecutive_Yrs2 := rowid(rleid(as.numeric(Year) - 
        shift(as.numeric(Year), fill = as.numeric(Year[1])) >1)), .(State)]
akrun
  • 874,273
  • 37
  • 540
  • 662
  • I got this error: Error in `[<-.data.table`(x, j = name, value = value) : (list) object cannot be coerced to type 'integer' In addition: Warning message: In `[<-.data.table`(x, j = name, value = value) : Supplied 9 items to be assigned to 16510 items of column 'Consecutive_Yrs' (recycled leaving remainder of 4 items). – Sarah Jul 26 '19 at 16:34
  • Oh wait, I got an error but it actually ran and worked..is that ok? – Sarah Jul 26 '19 at 16:38
  • @Sarah The first error is because you already have a 'Consecutive_Yrs' column n the dataset which is of type `numeric`. The `seq_len(.N)` returns `integer`. Either, convert it to `numeric` or create a new column name would solve it – akrun Jul 26 '19 at 16:39
  • @Sarah Can you please run the updated code on the fresh data created – akrun Jul 26 '19 at 16:39
  • I'm still getting that error. I changed all of my variables to numeric. But it's still outputting what I want. – Sarah Jul 26 '19 at 16:46
  • @Sarah can you test the code on the example you showed. It is working fine for me – akrun Jul 26 '19 at 16:49
  • @Sarah I updated with another option. Please check if that works – akrun Jul 26 '19 at 16:56
  • It's saying could not find function "." – Sarah Jul 29 '19 at 14:39
  • @Sarah Have you loadeed `library(data.table)` and what is the version of data.table – akrun Jul 29 '19 at 14:41
  • I'm using version 1.12.2 and yes I loaded it. – Sarah Jul 29 '19 at 14:46
  • @Sarah I alsso have the same version on `R 3.6.0`. Sorry, i can't reproduce the issue and the datasets also is the same as you already showed it – akrun Jul 29 '19 at 14:47
  • Thanks. I'm just trying to figure out how to do this because the solution below also is not working properly.. – Sarah Jul 29 '19 at 14:51
  • @Sarah What is the R version you havee – akrun Jul 29 '19 at 14:52
  • I'm on 3.4.2. I am unable to update unfortunately – Sarah Jul 29 '19 at 14:53
  • @Sarah Okay, that may explain the issue – akrun Jul 29 '19 at 14:54
  • do you have any insight on how to fix the code the other person responded with to get it to work? It doesn't work for the 1 circumstance I explained below. – Sarah Jul 29 '19 at 15:01