Count values in column then reset

Question

I am trying to create a column that counts the rows for each name and restarts from 1 each time the name changes, like this:

NAME          ID
PIERRE         1
PIERRE         2
PIERRE         3
PIERRE         4
JACK           1
ALEXANDRE      1
ALEXANDRE      2

Reproducible data

df <- structure(list(NAME = structure(c(3L, 3L, 3L, 3L, 2L, 1L, 1L),
    .Label = c("ALEXANDRE", "JACK", "PIERRE"), class = "factor")),
    class = "data.frame", row.names = c(NA, -7L))

My code is quite large, which is why I posted a simple example that illustrates the principle. What exactly do you want? — P. Vauclin, Oct 21 '18 at 12:09
@P.Vauclin Mark means to show what you have tried so far, and some code that we can just copy paste and work on. Look up the dput() function to help with this — Hector Haffenden, Oct 21 '18 at 12:19

score 1 · Accepted Answer · answered Oct 21 '18 at 12:08

You could build a sequence along the elements in each group (= Name):

ave(1:nrow(df), df$NAME, FUN = seq_along)
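As a minimal sketch, the call above can be applied to the question's data and stored as the ID column. The data frame is rebuilt here with data.frame() and factor() rather than the dput() output, purely so the snippet is self-contained:

```r
# Rebuild the question's data (NAME as a factor, as in the dput() output)
df <- data.frame(NAME = factor(c("PIERRE", "PIERRE", "PIERRE", "PIERRE",
                                 "JACK", "ALEXANDRE", "ALEXANDRE")))

# ave() splits the row indices by NAME and applies seq_along() to each piece,
# so every name gets its own counter 1, 2, 3, ...
df$ID <- ave(seq_len(nrow(df)), df$NAME, FUN = seq_along)

df$ID
# 1 2 3 4 1 1 2
```

seq_len(nrow(df)) is used instead of 1:nrow(df) only because it also behaves sensibly for a data frame with zero rows.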

Or, if a name may recur later on and each recurrence should count as a new group (i.e. the counter restarts at every name change):

groups <- cumsum(c(FALSE, df$NAME[-1]!=head(df$NAME, -1)))
ave(1:nrow(df), groups, FUN = seq_along)
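The difference between the two variants only shows up when a name comes back after an interruption. The sketch below uses a small hypothetical data frame (df2 and its values are made up for illustration, not taken from the question) to compare them side by side:

```r
# Hypothetical data in which PIERRE reappears after JACK
df2 <- data.frame(NAME = c("PIERRE", "PIERRE", "JACK",
                           "PIERRE", "PIERRE", "PIERRE"))

# Variant 1: one counter per distinct name, regardless of position
per_name <- ave(seq_len(nrow(df2)), df2$NAME, FUN = seq_along)

# Variant 2: a new group starts whenever NAME differs from the previous row;
# cumsum() over those change flags gives each run its own group id
groups  <- cumsum(c(FALSE, df2$NAME[-1] != head(df2$NAME, -1)))
per_run <- ave(seq_len(nrow(df2)), groups, FUN = seq_along)

groups    # 0 0 1 2 2 2
per_name  # 1 2 1 3 4 5
per_run   # 1 2 1 1 2 3
```

In the first variant the second block of PIERRE rows continues at 3, while in the second variant it restarts at 1.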

score 0 · Answer 2 · answered Oct 21 '18 at 12:07

Using dplyr and data.table:

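library(dplyr)       # group_by(), mutate(), ungroup(), select(), %>%
library(data.table)  # rleid()

df %>%
  group_by(ID_temp = rleid(NAME)) %>%   # one group per run of identical names
  mutate(ID = seq_along(ID_temp)) %>%   # count rows within each run
  ungroup() %>%
  select(-ID_temp)                      # drop the helper column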

Or just data.table:

library(data.table)
setDT(df)[, ID := seq_len(.N), by = rleid(NAME)]
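For readers without data.table installed, the same run-based counter can be sketched in base R. This swaps rleid() for rle() plus sequence(), so it is an alternative technique rather than what the answer itself uses, and the data frame df3 is a hypothetical example:

```r
# Hypothetical data: PIERRE appears in two separate runs
df3 <- data.frame(NAME = c("PIERRE", "PIERRE", "JACK", "PIERRE"))

# rle() returns the length of each run of identical consecutive values;
# as.character() is needed because rle() expects an atomic vector
# and would reject a factor
run_lengths <- rle(as.character(df3$NAME))$lengths

# sequence() expands each length n into 1, 2, ..., n and concatenates them
df3$ID <- sequence(run_lengths)

df3$ID
# 1 2 1 1
```

Like rleid(), this restarts the counter at every change of name, so a name that recurs later starts again from 1.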

tmfmnk

score 0 · Answer 3 · answered Oct 21 '18 at 12:08

Here's a quick way to do it.

First you can set up your data:

mydata <- data.frame("name"=c("PIERRE", "ALEX", "PIERRE", "PIERRE", "JACK", "PIERRE", "ALEX"))

Next, I add a dummy column of 1s (this is the part that makes the solution a bit inelegant):

mydata$placeholder <- 1

Finally, I add up the placeholder column (cumulative sum), grouped by the name column:

mydata$ID <- ave(mydata$placeholder, mydata$name, FUN=cumsum)

Since I started with unsorted names, my dataframe is currently unsorted, but that can be fixed with:

mydata <- mydata[order(mydata$name, mydata$ID),]
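Put together, the steps above can be run end to end as in the sketch below. The final line that drops the placeholder column is an addition for tidiness, not part of the steps as written. Note that this approach keeps one counter per name across the whole column, so it does not restart when a name recurs after a different one:

```r
# Unsorted example data from the steps above
mydata <- data.frame(name = c("PIERRE", "ALEX", "PIERRE", "PIERRE",
                              "JACK", "PIERRE", "ALEX"))

# Dummy column of 1s, then a cumulative sum of it within each name
mydata$placeholder <- 1
mydata$ID <- ave(mydata$placeholder, mydata$name, FUN = cumsum)

# Sort by name and counter so that rows with the same name sit together
mydata <- mydata[order(mydata$name, mydata$ID), ]

# Added for tidiness: the helper column is no longer needed
mydata$placeholder <- NULL

mydata$ID
# 1 2 1 1 2 3 4   (ALEX, ALEX, JACK, then four PIERRE rows)
```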
