-2

I have a lookup table (an indicator matrix), like this:

John   1
Ann   2
Ruby   3
Clair   4

I want to translate a vector of names into the corresponding numbers while keeping the original order, like this:

(John,Ann,John,Clair,John,Ruby,Ann,John,Ruby)->(1,2,1,4,1,3,2,1,3)

I don't know how to do this in R without a loop.

Please help me. Thanks.

Jame H
  • Do a `merge` http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right/1300618#1300618 – Hugh Jun 26 '14 at 07:49
  • @Hugh, the problem with merge is that it won't preserve the order, even when `sort = F` – David Arenburg Jun 26 '14 at 08:10
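For what it's worth, the ordering issue with `merge` mentioned above can be worked around by recording each element's position before merging and sorting on it afterwards. This is only a minimal sketch; the names `lookup`, `input`, `pos` and `id` are made up for illustration and are not from the question.

```r
# Hypothetical lookup table and input vector, mirroring the question
lookup <- data.frame(name = c("John", "Ann", "Ruby", "Clair"),
                     id = 1:4,
                     stringsAsFactors = FALSE)
x <- c("John", "Ann", "John", "Clair", "John",
       "Ruby", "Ann", "John", "Ruby")

# Record the original position of every element before merging
input <- data.frame(name = x, pos = seq_along(x),
                    stringsAsFactors = FALSE)
merged <- merge(input, lookup, by = "name")

# merge() does not keep the input row order, so restore it via pos
result <- merged$id[order(merged$pos)]
result
```

This assumes every name in `x` appears in the lookup table; with the default inner join, unmatched names would be dropped rather than returned as `NA`.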

5 Answers

2

You can use `factor` with the levels listed in the desired order; see the documentation with `?factor`.

x <- c('John','Ann','John','Clair','John','Ruby','Ann','John','Ruby')
y <- factor(x, levels = c("John", "Ann", "Ruby", "Clair"))

as.numeric(y)
## [1] 1 2 1 4 1 3 2 1 3
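If the names and numbers already live in a table, the levels probably don't need to be typed by hand. A rough sketch, assuming a hypothetical data frame `lookup` with columns `name` and `id` (not from the question), and using non-consecutive ids to show that the factor codes are only positions into the table:

```r
# Hypothetical lookup table; the ids are deliberately not 1:4
lookup <- data.frame(name = c("John", "Ann", "Ruby", "Clair"),
                     id = c(10, 20, 30, 40),
                     stringsAsFactors = FALSE)
x <- c("John", "Ann", "John", "Clair", "John",
       "Ruby", "Ann", "John", "Ruby")

# Integer codes give the row of each name in the lookup table
codes <- as.integer(factor(x, levels = lookup$name))

# Use the codes to pull the matching ids, in the order of x
ids <- lookup$id[codes]
ids
```

Names that are missing from `lookup$name` become `NA` in `codes`, and therefore `NA` in `ids` as well.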

Hope it helps,

alex

alko989
0
x <- c('A','B','A','D','A','C','B','A','C')
# match() returns the position of each element of x within LETTERS
y <- match(x, LETTERS)
## [1] 1 2 1 4 1 3 2 1 3

source

[Edit] Modified to take into account Shadow's comment

0

Even in your updated example, you can use `match` (see `?match`) just as @Pascal describes. Just match against your own table instead of LETTERS.

# generate example
df <- data.frame(input=c("John", "Ann", "Ruby", "Clair"), 
                 output=1:4)
x <- c("John", "Ann", "John", "Clair", "John", 
       "Ruby", "Ann", "John", "Ruby")
# if output is indeed just 1:nrow(df)
match(x, df[, "input"])
# if output is different
df[match(x, df[, "input"]), "output"]
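One thing worth checking with this approach is what happens to names that are not in the table: `match` returns `NA` for them, and indexing with `NA` gives `NA` again. A small sketch, where `x2` and the name "Bob" are invented purely for illustration:

```r
# Same lookup table as above
df <- data.frame(input = c("John", "Ann", "Ruby", "Clair"),
                 output = 1:4,
                 stringsAsFactors = FALSE)

# Hypothetical input containing a name that is not in the table
x2 <- c("John", "Bob", "Ruby")

# match() yields NA where no entry is found
idx <- match(x2, df$input)

# Indexing with NA propagates the NA into the result
res <- df$output[idx]
res

# Report which names could not be translated
unmatched <- x2[is.na(idx)]
unmatched
```

Checking `anyNA(res)` before using the result is a cheap way to catch misspelled or missing names.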
shadow
0

An alternative to @shadow's approach, using `recodeVar` from the doBy package:

library(doBy)

x <- c('John','Ann','John','Clair','John','Ruby','Ann','John','Ruby')
src <- c('John', 'Ann', 'Ruby', 'Clair')
tgt <- c(1,2,3,4)

x <- recodeVar(x, src, tgt)
x <- as.numeric(x)   
0

You could also do:

 # build the lookup table (should work with factors also)
 df <- data.frame(input = c("John", "Ann", "Ruby", "Clair"),
                  output = 1:4, stringsAsFactors = FALSE)
 x <- c("John", "Ann", "John", "Clair", "John",
        "Ruby", "Ann", "John", "Ruby")
 # named vector: names are the inputs, values are the outputs
 with(df, setNames(output, input))[x]
 #John   Ann  John Clair  John  Ruby   Ann  John  Ruby 
 # 1     2     1     4     1     3     2     1     3 

If you don't need the names:

 as.vector(with(df, setNames(output,input))[x])
 #[1] 1 2 1 4 1 3 2 1 3
akrun