26

I have a bunch of letters, and cannot for the life of me figure out how to convert them to their number equivalent.

letters[1:4]

Is there a function

numbers['e']

which returns

5

or something user-defined (e.g. 1994)?

I want to convert all 26 letters to a specific value.

zx8754
frank

5 Answers

34

I don't know of a "pre-built" function, but such a mapping is pretty easy to set up using match. For the specific example you give, matching a letter to its position in the alphabet, we can use the following code:

myLetters <- letters[1:26]

match("a", myLetters)
#[1] 1

It is almost as easy to associate other values to the letters. The following is an example using a random selection of integers.

# assign values for each letter, here a sample from 1 to 2000
set.seed(1234)
myValues <- sample(1:2000, size=26)
names(myValues) <- myLetters

myValues[match("a", names(myValues))]
#  a
#228

Note also that this method can be extended to ordered collections of letters (strings) as well.
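
One way to read that extension, as a rough sketch: match is vectorised, so it accepts a whole vector of letters at once, and a string can be split into single letters first with strsplit. The inputs ("c", "a", "t" and the word "hello") are only illustrative.

```r
myLetters <- letters[1:26]

# match() accepts a vector and returns one position per element
match(c("c", "a", "t"), myLetters)
#[1]  3  1 20

# split a string into single characters, then look each one up
chars <- strsplit("hello", "")[[1]]
match(chars, myLetters)
#[1]  8  5 12 12 15

# anything not in the table (here an uppercase letter) gives NA
match("A", myLetters)
#[1] NA
```

Because unmatched input comes back as NA rather than an error, it may be worth checking the result with anyNA before using it.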

lmo
  • 3
    I tend to prefer this solution over `which(x == letters)` because it's vectorised (I can get the letter indices for a vector of characters). – jimjamslam Aug 31 '17 at 05:54
15

The which function seems appropriate here.

which(letters == 'e')
#[1] 5
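
One caveat, sketched below: the comparison letters == x is meant for a single letter. With a longer vector, == recycles x along letters instead of looking up each element, so a per-element wrapper (here the hypothetical helper whichLetter) or match is needed.

```r
# a single letter works as expected
which(letters == "e")
#[1] 5

# a longer vector is recycled along letters, not looked up element by element
which(letters == c("b", "e"))
#integer(0)

# hypothetical helper: apply the single-letter lookup to each element;
# it assumes every element is a lowercase letter
whichLetter <- function(x) {
  vapply(x, function(ch) which(letters == ch), integer(1), USE.NAMES = FALSE)
}
whichLetter(c("b", "e"))
#[1] 2 5
```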
Ronak Shah
MrAesthetic
14

You could try this function:

letter2number <- function(x) {utf8ToInt(x) - utf8ToInt("a") + 1L}

Here's a short test:

letter2number("e")
#[1] 5
set.seed(123)
myletters <- letters[sample(26,8)]
#[1] "h" "t" "j" "u" "w" "a" "k" "q"
unname(sapply(myletters, letter2number))
#[1]  8 20 10 21 23  1 11 17

The function takes the UTF-8 code point of the letter passed to it, subtracts the code point of the letter "a", and adds one. Adding one follows R's indexing convention, so the numbering of the letters starts at 1 and not at 0.

The code works because the code points of the letters a to z are consecutive and follow alphabetical order.
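
Since utf8ToInt returns one code point per character of a single string, the same arithmetic can also handle a whole word in one call. A minimal sketch, assuming lowercase ASCII input; word2numbers is a hypothetical name, not part of the answer above.

```r
# code points of lowercase ASCII letters are consecutive, starting at 97
utf8ToInt("a")
#[1] 97

# hypothetical helper: alphabet positions of every letter in one lowercase string
word2numbers <- function(word) {
  utf8ToInt(word) - utf8ToInt("a") + 1L
}
word2numbers("hello")
#[1]  8  5 12 12 15

# a vector of single letters can be pasted into one string first
word2numbers(paste(c("h", "t", "j"), collapse = ""))
#[1]  8 20 10
```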


For capital letters you could use, accordingly,

LETTER2num <- function(x) {utf8ToInt(x) - utf8ToInt("A") + 1L}
Blundering Ecologist
RHertel
12

Create a lookup vector and use simple subsetting:

x <- letters[1:4]
lookup <- setNames(seq_along(letters), letters)
lookup[x]
#a b c d 
#1 2 3 4 

Use unname if you want to remove the names.
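
The same pattern should cover user-defined values: any vector of 26 values can be named with the letters. A sketch under assumptions: the values below (ten times the alphabet position, with "e" overridden to 1994 as in the question) and the name customLookup are made up for illustration.

```r
# made-up values: 10, 20, ..., 260
customLookup <- setNames(seq_along(letters) * 10L, letters)
customLookup["e"] <- 1994L   # override a single letter

# look up several letters and drop the names
unname(customLookup[c("a", "e", "z")])
#[1]   10 1994  260

# [[ ]] extracts one value without its name
customLookup[["b"]]
#[1] 20
```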

Roland
-1

Thanks for all the ideas, but I am a dumdum.

Here's what I did. I made a data frame mapping each letter to a specific number, then looked up each letter by filtering on it:

df <- data.frame(L = letters[1:26], N = rnorm(26))
df[df$L == 'e', 2]
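
For several letters at once, the data frame can also be indexed through match, which keeps the order of the requested letters. A sketch with fixed, made-up values in place of rnorm so that the result is reproducible.

```r
# fixed, made-up values (0.5, 1.0, ..., 13.0) so the lookup is reproducible
df <- data.frame(L = letters, N = seq(0.5, 13, by = 0.5))

# single letter, as in the snippet above
df[df$L == "e", "N"]
#[1] 2.5

# several letters, returned in the order requested
df$N[match(c("e", "a", "z"), df$L)]
#[1]  2.5  0.5 13.0
```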
frank