Tried finding a similar post, but couldn't.
I have a column in a data.table that looks like this:
x,x,x,x,y,y,y,c,c,c
I want to add an index in a separate column, like this:
1,1,1,1,2,2,2,3,3,3
How can I do this?
I'd go with this, which has the advantage of working with both data frames and data tables (and maybe tibbles, I haven't checked). The index numbers follow the order in which each col value first appears, and they do not depend on equal values sitting in adjacent rows (so if col goes x,x,x,x,y,y,y,x,x,x, all the x rows get index 1).
> library(data.table)
> dt <- data.table(col = c("x", "x", "x", "x", "y", "y", "y", "c", "c", "c"))
> dt$index <- as.numeric(factor(dt$col, levels = unique(dt$col)))
> dt
col index
1: x 1
2: x 1
3: x 1
4: x 1
5: y 2
6: y 2
7: y 2
8: c 3
9: c 3
10: c 3
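To illustrate the point about non-adjacent rows, here is a small sketch in base R using a plain data frame. The name df2 and its values are made up for this example; the same line should work unchanged on a data.table.

```r
# Hypothetical input in which "x" reappears after "y"
df2 <- data.frame(col = c("x", "x", "x", "x", "y", "y", "y", "x", "x", "x"))

# Factor levels follow the order of first appearance, so "x" -> 1 and "y" -> 2
df2$index <- as.numeric(factor(df2$col, levels = unique(df2$col)))

df2$index
# [1] 1 1 1 1 2 2 2 1 1 1
```

The later x rows get the same index as the first x rows, because the index is tied to the value and not to its position.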
A solution with data.table:
library(data.table)
dt <- data.table(col = c("x", "x", "x", "x", "y", "y", "y", "c", "c", "c"))
dt[ , idx := .GRP, by = col]
# col idx
# 1: x 1
# 2: x 1
# 3: x 1
# 4: x 1
# 5: y 2
# 6: y 2
# 7: y 2
# 8: c 3
# 9: c 3
# 10: c 3
A solution in base R:
dat <- data.frame(col = c("x", "x", "x", "x", "y", "y", "y", "c", "c", "c"))
dat <- transform(dat, idx = match(col, unique(col)))
# col idx
# 1 x 1
# 2 x 1
# 3 x 1
# 4 x 1
# 5 y 2
# 6 y 2
# 7 y 2
# 8 c 3
# 9 c 3
# 10 c 3
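Another base R option, which relies on equal values being in adjacent rows:
dt <- data.frame(a = c("x", "x", "x", "x", "y", "y", "y", "c", "c", "c"))
dt$index <- cumsum(!duplicated(dt$a))
dt
# a index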
# 1 x 1
# 2 x 1
# 3 x 1
# 4 x 1
# 5 y 2
# 6 y 2
# 7 y 2
# 8 c 3
# 9 c 3
# 10 c 3
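One caveat worth checking before choosing between these: cumsum(!duplicated(...)) agrees with match(..., unique(...)) only while equal values stay adjacent. A minimal sketch of the difference, using a made-up vector v (the names v, by_cumsum and by_match are just for illustration):

```r
# Hypothetical vector in which "x" reappears after "y"
v <- c("x", "x", "y", "y", "x")

# Running count of first appearances: a later repeat inherits the current count
by_cumsum <- cumsum(!duplicated(v))
by_cumsum
# [1] 1 1 2 2 2

# Position of each value among the unique values: repeats keep their original index
by_match <- match(v, unique(v))
by_match
# [1] 1 1 2 2 1
```

If the column is already grouped (or sorted), both give the same result; otherwise match (or the factor and .GRP approaches) is probably the safer choice.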