I am trying to add a numeric key to my dataframe, but I want the number to correspond to the value in a specific column. This column has repeats, and I would like the numeric key to reflect that.

For example, this is a generalized look at my dataframe:

Gene   Type
A      1
A      2
B      1
C      1
C      1
C      2

I'm leaving a lot out of the table for simplicity's sake. I would then like the numeric key to reflect the value of the Gene column, so that it looks like this:

Gene   Type   Key
A      1      1
A      2      1
B      1      2
C      1      3
C      1      3
C      2      3

Any suggestions? I've hit a mental block and really have no idea what to do.

David Arenburg
BioBaker
  • Something like this could work: `SELECT gene, type, (ASCII(gene) - ASCII('A')) AS key FROM my_table`, or `SELECT gene, type, (CONV(gene, 36, 10) - 9) AS key ...` – tanner0101 Mar 23 '16 at 21:18

1 Answer

You can convert `Gene` to a factor and use its integer codes as the key:

df1$Key <- as.numeric(as.factor(df1$Gene))
#   Gene Type Key
# 1    A    1   1
# 2    A    2   1
# 3    B    1   2
# 4    C    1   3
# 5    C    1   3
# 6    C    2   3

data

df1 <- structure(list(Gene = c("A", "A", "B", "C", "C", "C"),
                      Type = c(1L, 2L, 1L, 1L, 1L, 2L)),
                 .Names = c("Gene", "Type"),
                 class = "data.frame", row.names = c(NA, -6L))
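
A hedged sketch of two variations, which are not part of the original answer. Factor codes follow the sorted order of the levels, and if `Gene` is already a factor that carries unused levels, the codes count those levels too. Assuming that is what is wanted, `match()` against `unique()` numbers the genes by first appearance, and `droplevels()` removes unused levels before taking the codes. The extra level `"Z0"` and the names `gene_f`, `key_codes` and `key_dense` are hypothetical, used only for illustration.

```r
# Same example data as in the question
df1 <- data.frame(Gene = c("A", "A", "B", "C", "C", "C"),
                  Type = c(1L, 2L, 1L, 1L, 1L, 2L),
                  stringsAsFactors = FALSE)

# Key by order of first appearance:
# position of each Gene within the vector of unique Genes
df1$Key <- match(df1$Gene, unique(df1$Gene))

# Hypothetical case: Gene stored as a factor with an unused level "Z0"
gene_f <- factor(df1$Gene, levels = c("Z0", "A", "B", "C"))

# Codes taken directly from the factor count the unused level as well
key_codes <- as.integer(gene_f)

# Dropping unused levels first gives consecutive keys starting at 1
key_dense <- as.integer(droplevels(gene_f))
```

With the example data, `match()` and the factor approach give the same keys because the genes already appear in alphabetical order; they differ as soon as the rows are ordered otherwise.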
RHertel
  • This would be great if each value didn't have multiple levels. As it stands, your method gives the number of levels for each output. Any suggestions? – BioBaker Mar 23 '16 at 23:49