dplyr: replace grouping values with 1 through N groups

Question

I have data where each row represents one observation from one person. For example:

library(dplyr)
dat <- tibble(ID = rep(sample(1111:9999, 3), each = 3),
              X = 1:9)

# A tibble: 9 x 2
     ID     X
  <int> <int>
1  9573     1
2  9573     2
3  9573     3
4  7224     4
5  7224     5
6  7224     6
7  7917     7
8  7917     8
9  7917     9

I want to replace these IDs with a different value. It can be anything, but the easiest (and preferred) solution is just to replace them with group numbers 1 to n. So the desired result would be:

# A tibble: 9 x 2
     ID     X
  <int> <int>
1     1     1
2     1     2
3     1     3
4     2     4
5     2     5
6     2     6
7     3     7
8     3     8
9     3     9

Probably something that starts with:

dat %>%
  group_by(ID) %>%
  ???

score 1 · Accepted Answer · answered May 14 '21 at 19:50

A fast option would be match(), applied against the unique IDs:

library(dplyr)
dat %>% 
    mutate(ID = match(ID, unique(ID)))

-output

# A tibble: 9 x 2
#     ID     X
#  <int> <int>
#1     1     1
#2     1     2
#3     1     3
#4     2     4
#5     2     5
#6     2     6
#7     3     7
#8     3     8
#9     3     9
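
To see why this works, here is a minimal base R sketch on a plain vector. The ID values are simply copied from the question's example output, so treat them as illustrative. unique() keeps the distinct IDs in order of first appearance, and match() returns each element's position in that vector:

```r
# IDs copied from the question's example output (illustrative values only)
ids <- rep(c(9573L, 7224L, 7917L), each = 3)

# unique() keeps the distinct values in order of first appearance
lookup <- unique(ids)

# match() returns, for each element of ids, its position in lookup
new_ids <- match(ids, lookup)

print(lookup)
print(new_ids)
```

Because the numbering follows first appearance, the first ID in the data always becomes 1, whatever its numeric value.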

Or use as.integer() on a factor whose levels follow the order of first appearance:

dat %>%
   mutate(ID = as.integer(factor(ID, levels = unique(ID))))
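
The levels = unique(ID) argument is what keeps the numbering in order of appearance. As a small base R sketch (again with ID values copied from the question's example, so purely illustrative), compare it with a plain factor(), which sorts its levels:

```r
# IDs copied from the question's example output (illustrative values only)
ids <- rep(c(9573L, 7224L, 7917L), each = 3)

# Explicit levels: codes follow the order in which IDs first appear
by_appearance <- as.integer(factor(ids, levels = unique(ids)))

# Default levels: codes follow the sorted ID values instead
by_sorted <- as.integer(factor(ids))

print(by_appearance)
print(by_sorted)
```

If any distinct code per ID is acceptable, the shorter factor(ID) form would also do; the explicit levels only matter when the first ID should map to 1.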

In tidyverse, we can also use cur_group_id()

dat %>%
   group_by(ID = factor(ID, levels = unique(ID))) %>% 
   mutate(ID = cur_group_id()) %>%
   ungroup()
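
The factor() wrapper inside group_by() matters here too. As a rough sketch (assuming dplyr >= 1.0.0, where cur_group_id() is available; the column names new_id and grp are made up for this example), grouping on the raw ID numbers the groups in sorted key order rather than order of appearance:

```r
library(dplyr)

# IDs copied from the question's example output (illustrative values only)
dat <- tibble(ID = rep(c(9573L, 7224L, 7917L), each = 3), X = 1:9)

# Grouping on the raw ID: group ids follow the sorted ID values
sorted_order <- dat %>%
  group_by(ID) %>%
  mutate(new_id = cur_group_id()) %>%  # new_id is a hypothetical column name
  ungroup() %>%
  pull(new_id)

# Grouping on a factor with explicit levels: group ids follow first appearance
appearance_order <- dat %>%
  group_by(grp = factor(ID, levels = unique(ID))) %>%  # grp is hypothetical too
  mutate(new_id = cur_group_id()) %>%
  ungroup() %>%
  pull(new_id)

print(sorted_order)
print(appearance_order)
```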
