Another option (for fun) using match:
match(Alphabet, Alphabet)
match only returns the position of the first occurrence, so identical values all get the same number. This works, though the numbers will not be consecutive (e.g. 1:26 for the letters). If they must absolutely be 1:26, and not just unique:
match(Alphabet, unique(Alphabet))
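To see the difference between the two calls, here is a minimal sketch. The vector is an assumption: it simply mirrors the Alphabet column from the printed output in this answer.

```r
# Assumed sample data, copied from the Alphabet column of the printed output
Alphabet <- c("A", "B", "A", "A", "C", "B", "C")

# Position of the first occurrence of each value in the vector itself
first_pos <- match(Alphabet, Alphabet)

# Position of each value among the distinct values, so the ids are consecutive
consecutive <- match(Alphabet, unique(Alphabet))

print(first_pos)    # 1 2 1 1 5 2 5
print(consecutive)  # 1 2 1 1 3 2 3
```

Both results identify the groups uniquely; only the second one has no gaps.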
To actually do what you want (adding a column to the data frame, etc.):
transform(DF, outcome=match(Alphabet, Alphabet))
Or
transform(DF, outcome=match(Alphabet, unique(Alphabet)))
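A self-contained sketch of the transform call, assuming DF looks like the data printed in this answer (the data.frame construction is my reconstruction, not taken from the question):

```r
# Assumed reconstruction of DF from the printed output
DF <- data.frame(
  No.      = 1:7,
  Alphabet = c("A", "B", "A", "A", "C", "B", "C"),
  stringsAsFactors = FALSE
)

# transform evaluates the expression inside DF, so Alphabet refers to the column
res <- transform(DF, outcome = match(Alphabet, unique(Alphabet)))
print(res)

# Equivalent without transform, which is often safer inside functions
DF$outcome <- match(DF$Alphabet, unique(DF$Alphabet))
```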
Or you can use a faster version of match, i.e. fmatch from the fastmatch package:
library(fastmatch)
transform(DF, outcome=fmatch(Alphabet, unique(Alphabet)))
#   No. Alphabet outcome
# 1   1        A       1
# 2   2        B       2
# 3   3        A       1
# 4   4        A       1
# 5   5        C       3
# 6   6        B       2
# 7   7        C       3
The match approach is actually a little faster than the as.numeric(factor(x)) version:
> x <- sample(letters, 1e5, replace=TRUE)
> library(microbenchmark)
> microbenchmark(as.numeric(factor(x)), match(x, x))
Unit: milliseconds
                  expr     min       lq     mean   median       uq      max neval
 as.numeric(factor(x)) 4.68927 4.792212 9.042732 4.915268 5.175275 64.65473   100
           match(x, x) 3.55855 3.617609 6.981944 3.731522 3.922048 53.07911   100
most likely because factor internally uses something like match(x, unique(x)) anyway.
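One thing to check before swapping one approach for the other: they can number the groups differently. factor sorts its levels by default, whereas match(x, unique(x)) numbers values in order of first appearance. A small sketch with a made-up vector:

```r
# Made-up example vector whose first-appearance order differs from sorted order
x <- c("C", "A", "C", "B")

# Levels are sorted ("A", "B", "C"), so "C" gets code 3
by_factor <- as.numeric(factor(x))

# unique(x) is "C", "A", "B", so "C" gets code 1
by_match <- match(x, unique(x))

print(by_factor)  # 3 1 3 2
print(by_match)   # 1 2 1 3
```

Both are valid group ids; they only coincide when the values first appear in sorted order.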