How to assign a unique ID number to each group of identical values in a column

Question

I have a data frame with a number of columns. I would like to create a new column called “id” that gives a unique id number to each group of identical values in the “sample” column.

Example data:

df <- data.frame(
  index = 1:30,
  val = c(
    14L, 22L, 1L, 25L, 3L, 34L, 35L, 36L, 24L, 35L, 33L, 31L, 30L,
    30L, 29L, 28L, 26L, 12L, 41L, 36L, 32L, 37L, 56L, 34L, 23L, 24L,
    28L, 22L, 10L, 19L
  ),
  sample = c(
    5L, 6L, 6L, 7L, 7L, 7L, 8L, 9L, 10L, 11L, 11L, 12L, 13L, 14L,
    14L, 15L, 15L, 15L, 16L, 17L, 18L, 18L, 19L, 19L, 19L, 20L, 21L,
    22L, 23L, 23L
  )
)

What I would like to end up with:

  index val sample id
1     1  14      5  1
2     2  22      6  2
3     3   1      6  2
4     4  25      7  3
5     5   3      7  3
6     6  34      7  3

dplyr solution: `df$id <- group_indices(df, sample)`. – user3932000 Jul 29 '19 at 22:04

Ben Bolker · Accepted Answer · 2021-02-24T02:07:57.977

75

How about

df2 <- transform(df,id=as.numeric(factor(sample)))

?

I think this (cribbed from "Add ID column by group") should be slightly more efficient, although perhaps a little harder to remember:

df3 <- transform(df, id=match(sample, unique(sample)))
all.equal(df2,df3)  ## TRUE
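The two base R approaches agree here only because `sample` is already sorted. A minimal sketch of where they differ, using a hypothetical unsorted vector `x` that is not part of the question's data:

```r
# Hypothetical unsorted input (an assumption for illustration only)
x <- c(7L, 5L, 7L, 6L, 5L)

# factor(): ids follow the sorted order of the distinct values (5, 6, 7)
id_sorted <- as.integer(factor(x))

# match(): ids follow the order in which each value first appears (7, 5, 6)
id_first_seen <- match(x, unique(x))

print(id_sorted)      # 3 1 3 2 1
print(id_first_seen)  # 1 2 1 3 2
```

Both give one id per distinct value; which numbering is preferable depends on whether the ids should reflect value order or row order.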

If you want to do this in tidyverse:

library(dplyr)
df %>% group_by(sample) %>% mutate(id=cur_group_id())
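The pipeline above returns a grouped tibble and does not store it. A small sketch of one way to keep the result and drop the grouping, assuming dplyr 1.0.0 or later (where `cur_group_id()` is available) and using a hypothetical data frame `toy` rather than the question's `df`:

```r
library(dplyr)

# Hypothetical small data frame (an assumption for illustration only)
toy <- data.frame(sample = c(5L, 6L, 6L, 7L, 7L, 7L))

toy_id <- toy %>%
  group_by(sample) %>%
  mutate(id = cur_group_id()) %>%  # one integer id per group
  ungroup()                        # drop grouping so later verbs see the whole table

print(toy_id)
```

Without `ungroup()`, later calls such as `summarise()` or `mutate()` would still operate per `sample` group, which is easy to overlook.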

edited Feb 24 '21 at 02:07

answered Jun 09 '14 at 12:05


Love it: a use for `factors` that I can understand. :-) – Carl Witthoft Jun 09 '14 at 12:09

Just a small note here: the `as.numeric(factor(sample))` method will only give ids that increase in row order if `sample` is already sorted, because the ids follow the sorted factor levels. – David Arenburg May 05 '16 at 15:10

The nice thing about the `factor()` solution is that it ignores `NA` values: they get an `NA` id instead of a group of their own. – Will T-E Nov 17 '16 at 10:16
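The `NA` behaviour mentioned in the comment above can be sketched as follows, with a hypothetical vector `x` that contains missing values (not part of the question's data):

```r
# Hypothetical input with missing values (an assumption for illustration only)
x <- c(5L, NA, 5L, 6L, NA)

# factor() drops NA from its levels by default, so those rows get an NA id
id_factor <- as.integer(factor(x))

# match() treats NA as a value like any other, so the NA rows share an id
id_match <- match(x, unique(x))

print(id_factor)  # 1 NA 1 2 NA
print(id_match)   # 1 2 1 3 2
```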
@Ben Bolker, thanks! can you write your code with `dplyr`? – Alex Feb 23 '21 at 21:41
did you see the comment above https://stackoverflow.com/questions/24119599/how-to-assign-a-unique-id-number-to-each-group-of-identical-values-in-a-column/24119941?noredirect=1#comment101022995_24119599 ? – Ben Bolker Feb 23 '21 at 21:42
@Ben Bolker, I assumed I can write your code using `dplyr` – Alex Feb 24 '21 at 01:37

David Arenburg · Answer 2 · 2016-05-05T15:12:27.890

45

Here's a data.table solution

library(data.table)
setDT(df)[, id := .GRP, by = sample]

edited May 05 '16 at 15:12

answered Jun 09 '14 at 12:13

