Suppose we start with this data frame myDF generated by the code immediately beneath:

> myDF
  index
1     2
2     2
3     4
4     4
5     6
6     6
7     6

Generating code: myDF <- data.frame(index = c(2,2,4,4,6,6,6))

I'd like to add a column cumGrp to data frame myDF that gives a cumulative count of implicitly grouped elements, as illustrated below. Any suggestions for simple, concise base R or dplyr code to do this?

> myDF
  index cumGrp   cumGrp explained
1     2      1   1st grouping of same index numbers (2) adjacent to each other
2     2      1   Same as above
3     4      2   2nd grouping of same index numbers (4) adjacent to each other
4     4      2   Same as above
5     6      3   3rd grouping of same index numbers (6) adjacent to each other
6     6      3   Same as above
7     6      3   Same as above
Village.Idyot

1 Answer

Many possible ways:

dplyr::cur_group_id

library(dplyr)
myDF %>% 
  group_by(index) %>% 
  mutate(cumGrp = cur_group_id())

cumsum

library(dplyr)
myDF %>% 
  mutate(cumGrp = cumsum(index != lag(index, default = 0)))

as.numeric + factor

myDF |>
  transform(cumGrp = as.numeric(factor(index)))

data.table::.GRP

library(data.table)
setDT(myDF)[, cumGrp := .GRP, by = index]

match

myDF |>
  transform(cumGrp = match(index, unique(index))) 

collapse::group

library(collapse)
myDF |>
  settransform(cumGrp = group(index))
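
One point that may be worth checking before picking an approach: the question asks for groupings of equal values "adjacent to each other", and in the example data equal values are always adjacent, so a run-based id and a value-based id coincide. The sketch below uses a hypothetical vector idx (not from the question) in which the value 2 reappears after a 4, to show how the two ideas can differ in base R.

```r
# Hypothetical input: the value 2 reappears after a different value
idx <- c(2, 2, 4, 4, 2, 2)

# Run-based id: increments each time the value differs from the previous row
run_id <- cumsum(c(TRUE, idx[-1] != idx[-length(idx)]))

# Same idea via rle(): repeat each run's number by that run's length
r <- rle(idx)
run_id_rle <- rep(seq_along(r$lengths), r$lengths)

# Value-based id: a given value always maps to the same id
value_id <- match(idx, unique(idx))

run_id      # 1 1 2 2 3 3
run_id_rle  # 1 1 2 2 3 3
value_id    # 1 1 2 2 1 1
```

If repeated values separated by other values should count as a new group, a run-based id is probably what is wanted; if they should share an id, a value-based id fits better.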
Maël