I am trying to add a column that numbers the rows for each name and restarts at 1 every time the name changes, like this:

NAME          ID
PIERRE         1
PIERRE         2
PIERRE         3
PIERRE         4
JACK           1
ALEXANDRE      1
ALEXANDRE      2

Reproducible data

df <- structure(list(NAME = structure(c(3L, 3L, 3L, 3L, 2L, 1L, 1L),
    .Label = c("ALEXANDRE", "JACK", "PIERRE"), class = "factor")),
    class = "data.frame", row.names = c(NA, -7L))
Ronak Shah
P. Vauclin

3 Answers

You could build a sequence along the elements in each group (= Name):

ave(1:nrow(df), df$NAME, FUN = seq_along)

Or, if a name may reoccur later on and each reoccurrence should count as a new group (that is, the counter restarts on every name change):

groups <- cumsum(c(FALSE, df$NAME[-1] != head(df$NAME, -1)))
ave(1:nrow(df), groups, FUN = seq_along)
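
To make the difference between the two variants concrete, here is a minimal sketch on a hypothetical vector in which PIERRE reappears after JACK. The data and the names `x`, `per_name`, `runs` and `per_run` are assumptions for illustration and do not come from the question:

```r
# Hypothetical data: PIERRE occurs in two separate runs
x <- c("PIERRE", "PIERRE", "JACK", "PIERRE")

# Variant 1: one counter per distinct name, continuing across runs
per_name <- ave(seq_along(x), x, FUN = seq_along)

# Variant 2: the counter restarts whenever the name changes
runs <- cumsum(c(FALSE, x[-1] != head(x, -1)))
per_run <- ave(seq_along(x), runs, FUN = seq_along)

print(per_name)  # 1 2 1 3
print(per_run)   # 1 2 1 1
```

On the question's data both variants give the same result, because no name reoccurs there.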
lukeA

Using dplyr and data.table:

library(dplyr)
library(data.table)  # provides rleid()

df %>%
  group_by(ID_temp = rleid(NAME)) %>%  # one group per run of identical names
  mutate(ID = seq_along(ID_temp)) %>%
  ungroup() %>%
  select(-ID_temp)

Or just data.table:

setDT(df)[, ID := seq_len(.N), by=rleid(NAME)]
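
If you would rather avoid extra packages, the same run-based numbering can probably be had in base R with `rle()` and `sequence()`. This is only a sketch: it rebuilds the question's data with `NAME` as a plain character vector, which is an assumption made here for simplicity (the original column is a factor):

```r
# The question's data, with NAME as a character vector (assumption for this sketch)
df <- data.frame(
  NAME = c("PIERRE", "PIERRE", "PIERRE", "PIERRE", "JACK", "ALEXANDRE", "ALEXANDRE"),
  stringsAsFactors = FALSE
)

# rle() gives the length of each run of identical consecutive names;
# sequence() expands each length n into 1..n
df$ID <- sequence(rle(df$NAME)$lengths)

print(df)
```

For a factor column, wrapping it in `as.character()` before calling `rle()` should work the same way.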
tmfmnk

Here's a quick way to do it.

First you can set up your data:

mydata <- data.frame("name"=c("PIERRE", "ALEX", "PIERRE", "PIERRE", "JACK", "PIERRE", "ALEX"))

Next, I add a placeholder column of 1s (this is the part that makes the solution inelegant):

mydata$placeholder <- 1

Finally, I add up the placeholder column (cumulative sum), grouped by the name column:

mydata$ID <- ave(mydata$placeholder, mydata$name, FUN=cumsum)

Since I started with unsorted names, my data frame is still unsorted, but that can be fixed with:

mydata <- mydata[order(mydata$name, mydata$ID),]
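
As a possible simplification, the placeholder column can be skipped by letting `ave()` count positions directly. The following is a sketch on the same example data, not a replacement for the steps above:

```r
mydata <- data.frame(name = c("PIERRE", "ALEX", "PIERRE", "PIERRE", "JACK", "PIERRE", "ALEX"))

# Count rows within each name directly, without a helper column
mydata$ID <- ave(seq_along(mydata$name), mydata$name, FUN = seq_along)

print(mydata$ID)  # 1 1 2 3 1 4 2

# Optional: sort by name, then by the counter
mydata <- mydata[order(mydata$name, mydata$ID), ]
```

Note that this numbers each name across the whole data frame, so the counter does not restart when a name reappears after a different one.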

afszcz