I am trying to create a variable that identifies each unique subject and "clusters" their visits together. For example:

ID   Visit  Cluster
S101  0      1
S101  6      1
S101  12     1
S102  0      2
S105  0      3
S105  6      3

How can I create this new variable "Cluster"? I mostly use the dplyr package.

www
Hank Lin

1 Answer

Create a factor column first and then convert to integer.

library(dplyr)

dat2 <- dat %>%
  mutate(Cluster = as.integer(factor(ID)))

dat2
#     ID Visit Cluster
# 1 S101     0       1
# 2 S101     6       1
# 3 S101    12       1
# 4 S102     0       2
# 5 S105     0       3
# 6 S105     6       3
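
One thing to keep in mind with the factor approach: factor() sorts its levels, so the numbering follows the sorted order of ID rather than the order in which subjects first appear. The sketch below uses a made-up data frame (dat_unsorted is hypothetical, not the asker's data) to show the difference, with match() against unique() as a base R alternative when first-appearance order is wanted.

```r
# Hypothetical data where IDs do not appear in sorted order
dat_unsorted <- data.frame(
  ID    = c("S105", "S105", "S101", "S102", "S102"),
  Visit = c(0, 6, 0, 0, 6),
  stringsAsFactors = FALSE
)

# factor() sorts its levels, so S101 is numbered 1 even though S105 comes first
by_sorted <- as.integer(factor(dat_unsorted$ID))

# match() against unique() numbers the IDs by order of first appearance
by_appearance <- match(dat_unsorted$ID, unique(dat_unsorted$ID))

by_sorted
# [1] 3 3 1 2 2
by_appearance
# [1] 1 1 2 3 3
```

For the data in the question both approaches give the same result, because the IDs are already sorted.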

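Or use group_indices. (In dplyr 1.0.0 and later, passing the grouping column to group_indices this way gives a deprecation warning; group_by() followed by cur_group_id() is the replacement.)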

dat2 <- dat %>%
  mutate(Cluster = group_indices(., ID))
dat2
#     ID Visit Cluster
# 1 S101     0       1
# 2 S101     6       1
# 3 S101    12       1
# 4 S102     0       2
# 5 S105     0       3
# 6 S105     6       3
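
For dplyr 1.0.0 or later, a minimal sketch of the same idea with group_by() and cur_group_id() might look like this. It assumes a recent dplyr is installed; dat is rebuilt inline so the snippet runs on its own, and dat3 is just an illustrative name.

```r
library(dplyr)

# Same data as in the question, rebuilt here so the snippet is self-contained
dat <- data.frame(
  ID    = c("S101", "S101", "S101", "S102", "S105", "S105"),
  Visit = c(0, 6, 12, 0, 0, 6),
  stringsAsFactors = FALSE
)

dat3 <- dat %>%
  group_by(ID) %>%                     # one group per subject
  mutate(Cluster = cur_group_id()) %>% # integer id of the current group
  ungroup()                            # drop the grouping afterwards

dat3$Cluster
# [1] 1 1 1 2 3 3
```

As with factor(), the group ids follow the sorted order of ID.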

DATA

dat <- read.table(text = "ID   Visit
S101  0
                  S101  6 
                  S101  12
                  S102  0
                  S105  0
                  S105  6",
                  header = TRUE, stringsAsFactors = FALSE)
www