
I would like to create a new variable that indexes the values of another variable: each distinct value gets an integer ID, starting from 1, as shown below. Thanks.

  ColumnIHave ColumnIWant
            A           1
            A           1
            A           1
            B           2
            B           2
            B           2
            C           3
            C           3
            C           3
Matt
TJ87

2 Answers


One option using data.table:

Data:

DT <- read.table(header = TRUE, text = "ColumnIHave 
A
A
A
B
B
B
C
C
C")

Create column:

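library(data.table)
DT <- data.table(DT)  # convert the data.frame to a data.table
# .GRP is the group counter: 1 for the first group, 2 for the second, ...
DT[, ColumnIWant := .GRP, by = ColumnIHave]
DT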

Result:

   ColumnIHave ColumnIWant
1:           A           1
2:           A           1
3:           A           1
4:           B           2
5:           B           2
6:           B           2
7:           C           3
8:           C           3
9:           C           3
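
As a minimal sketch of how `.GRP` behaves when the input is not already sorted: with `by`, data.table numbers groups in order of first appearance. The table `DT2` and its values are hypothetical and are not taken from the question.

```r
library(data.table)

# Hypothetical unsorted input (an assumption for illustration)
DT2 <- data.table(ColumnIHave = c("B", "A", "B", "C", "A"))

# With `by`, groups keep their order of first appearance,
# so "B" becomes group 1, "A" group 2 and "C" group 3
DT2[, ColumnIWant := .GRP, by = ColumnIHave]

DT2$ColumnIWant
#> [1] 1 2 1 3 2
```

If the IDs should follow the sorted order of the values instead, sorting the table first (for example with `setorder(DT2, ColumnIHave)`) before assigning `.GRP` is one way to get that.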
Matt

You can convert the column to a factor and then to numeric; dplyr's mutate() adds the result as a new column.

With pipes, the code would read like this:

tbl1 %>% mutate(ColumnIWant = ColumnIHave %>% as.factor() %>% as.numeric())

If you're not familiar with pipes and are more used to nested function calls from other programming languages, the equivalent non-piped version is below.

tbl1 <- read.table(header = TRUE, text = "ColumnIHave 
A
A
A
B
B
B
C
C
C")
library(dplyr)
mutate(tbl1, ColumnIWant = as.numeric(as.factor(ColumnIHave)))
#>   ColumnIHave ColumnIWant
#> 1           A           1
#> 2           A           1
#> 3           A           1
#> 4           B           2
#> 5           B           2
#> 6           B           2
#> 7           C           3
#> 8           C           3
#> 9           C           3

Created on 2019-07-23 by the reprex package (v0.3.0)
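
One detail worth keeping in mind with the factor conversion: factor levels are sorted by default, so the IDs follow the sorted order of the values, which only matches order of appearance when the data is already sorted, as in the question. The base R sketch below uses a hypothetical unsorted vector `x` (not from the question) to show the difference and one possible way to number by first appearance instead.

```r
# Hypothetical unsorted input (an assumption for illustration)
x <- c("B", "A", "B", "C", "A")

# Default factor levels are sorted ("A", "B", "C"),
# so IDs follow the sorted order of the values
id_sorted <- as.integer(factor(x))
id_sorted
#> [1] 2 1 2 3 1

# Setting the levels to unique(x) numbers values by first appearance
id_appearance <- as.integer(factor(x, levels = unique(x)))
id_appearance
#> [1] 1 2 1 3 2

# match() against unique(x) gives the same first-appearance IDs
identical(id_appearance, match(x, unique(x)))
#> [1] TRUE
```

Either expression can be dropped into `mutate()` in place of `as.numeric(as.factor(ColumnIHave))`, depending on which ordering is wanted.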

Arthur Yip