How to count cumulative number of implied groupings in a single column of a dataframe in base R or dplyr?

Question

Suppose we start with this data frame myDF generated by the code immediately beneath:

Generating code: myDF <- data.frame(index = c(2,2,4,4,6,6,6))

I'd like to add a column cumGrp to data frame myDF that gives a cumulative count of the implicit groups, where a group is a run of adjacent rows sharing the same index value, as illustrated below. Any suggestions for simple, concise base R or dplyr code to do this?

> myDF
  index cumGrp   cumGrp explained
1     2      1   1st grouping of same index numbers (2) adjacent to each other
2     2      1   Same as above
3     4      2   2nd grouping of same index numbers (4) adjacent to each other
4     4      2   Same as above
5     6      3   3rd grouping of same index numbers (6) adjacent to each other
6     6      3   Same as above
7     6      3   Same as above

Maël · Accepted Answer · 2022-09-15T08:50:56.727

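There are many possible ways. Note that only the cumsum approach counts runs of adjacent equal values; the others assign one number per distinct value (cur_group_id and factor in sorted order; match, .GRP and group in order of first appearance). All of them give the same result here because index is sorted and each value forms a single run.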

dplyr::cur_group_id

library(dplyr)
myDF %>% 
  group_by(index) %>% 
  mutate(cumGrp = cur_group_id())

cumsum

library(dplyr)
myDF %>% 
  mutate(cumGrp = cumsum(index != lag(index, default = 0)))
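The same run-counting idea can be sketched in base R without dplyr. This is a minimal sketch, assuming index contains no NA values; the names isNewRun, r and cumGrpRle are illustrative and not part of the answer above.

```r
myDF <- data.frame(index = c(2, 2, 4, 4, 6, 6, 6))

# TRUE at row 1 and at every row whose value differs from the previous row
isNewRun <- c(TRUE, myDF$index[-1] != myDF$index[-nrow(myDF)])

# Running total of run starts gives the cumulative group number
myDF$cumGrp <- cumsum(isNewRun)

# Equivalent via run-length encoding: repeat each run's number by its length
r <- rle(myDF$index)
cumGrpRle <- rep(seq_along(r$lengths), r$lengths)
```

Unlike lag(index, default = 0), this version does not rely on a sentinel value, so the count still starts at 1 when the first index happens to be 0.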

as.numeric + factor

myDF |>
  transform(cumGrp = as.numeric(factor(index)))

data.table::.GRP

library(data.table)
setDT(myDF)[, cumGrp := .GRP, by = index]

match

myDF |>
  transform(cumGrp = match(index, unique(index)))

collapse::group

library(collapse)
myDF |>
  settransform(cumGrp = group(index))
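The approaches above agree on the example data, but they can diverge when a value reappears after a gap or when the values are unsorted. The base R sketch below illustrates this on a hypothetical vector x that is not part of the question; byRun, byFirstSeen and bySorted are illustrative names.

```r
# Hypothetical input: 6 reappears after a gap, and the values are unsorted
x <- c(6, 6, 2, 2, 6, 4)

# Run-based: a new group starts whenever the value differs from the previous one
byRun <- cumsum(c(TRUE, x[-1] != x[-length(x)]))
# 1 1 2 2 3 4

# Value-based, numbered in order of first appearance
byFirstSeen <- match(x, unique(x))
# 1 1 2 2 1 3

# Value-based, numbered in sorted order of the values
bySorted <- as.numeric(factor(x))
# 3 3 1 1 3 2

data.frame(x, byRun, byFirstSeen, bySorted)
```

If "adjacent to each other" is the requirement, as the explanation column in the question suggests, the run-based form is probably the safer choice for data that is not already sorted.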
