1

Is there a way of creating a new vector of numerical values based on my vector of strings?

For example, I have this:

a<-c("A", "B", "A", "A")

From a, I want to make a new vector b in which "A" is replaced with 1 and "B" with -1, so that b is c(1, -1, 1, 1).

I tried something like factor(a, levels = c("A", "B"), labels = c(1, -1)), but this doesn't produce a numeric vector.

zx8754
lllook
  • `as.numeric(as.character(factor(a, levels = c("A", "B"), labels = c(1, -1))))` – Gregor Thomas Nov 27 '17 at 21:00
  • The `factor` method is good; you just have to convert it to numeric the right way [(possible dupe? Converting factor to numeric R-FAQ)](https://stackoverflow.com/q/3418128/903061) – Gregor Thomas Nov 27 '17 at 21:02
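
To illustrate the point made in the comments, here is a minimal sketch of why calling `as.numeric()` directly on the factor gives the wrong answer, and two ways to recover the labels instead. The variable names `f`, `codes`, `b`, and `b2` are only for illustration.

```r
a <- c("A", "B", "A", "A")
f <- factor(a, levels = c("A", "B"), labels = c(1, -1))

# as.numeric() on a factor returns the internal integer codes, not the labels
codes <- as.numeric(f)              # 1 2 1 1

# Converting to character first recovers the labels, which then parse as numbers
b <- as.numeric(as.character(f))    # 1 -1 1 1

# Equivalent: convert the levels once, then index them by the factor's codes
b2 <- as.numeric(levels(f))[f]      # 1 -1 1 1
```

The second form converts only the (few) levels rather than every element, which can matter on long vectors.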

3 Answers

4

No need for that; just use:

a[a=="A"] = 1
a[a=="B"] = -1
a = as.numeric(a)

If you want to keep a unchanged, use:

    b = a
    b[a=="A"] = 1
    b[a=="B"] = -1
    b = as.numeric(b)

Or, a better solution, as @joran suggested:

b = ifelse(a == "A",1,-1)
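
One thing to keep in mind with the `ifelse` form is that it sends every value other than "A" to -1, not only "B". The sketch below shows this on a hypothetical input `x` containing an unexpected "C" and a missing value, together with a stricter alternative based on `match()` (a swapped-in technique, not part of the answer above).

```r
x <- c("A", "B", "C", NA)

# Anything that is not "A" becomes -1; NA stays NA
loose <- ifelse(x == "A", 1, -1)                  # 1 -1 -1 NA

# Stricter: values outside c("A", "B") become NA instead of -1
strict <- c(1, -1)[match(x, c("A", "B"))]         # 1 -1 NA NA
```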
SirSaleh
3
# Packages
library(stringi)
library(microbenchmark)

# 1. Vector
# a <- c("A", "B", "A", "A") 
a <- stri_rand_strings(1e5, 1, pattern = "[A-B]")

# 2. The 'factor' solution
f1 <- function(){ as.numeric(as.character(factor(a, levels = c("A", "B"), labels = c(1, -1)))) }

# 3. The faster solution
f2 <- function(){ (-1)^(a != "A") }

# 4. Ifelse solution
f3 <- function(){ ifelse(a == "A", 1, -1) }

# 5. Case-insensitive solution (my own)
f4 <- function(){ ifelse(as.numeric(grepl("a", a, ignore.case = TRUE)) == 1, 1, -1) }

# 6. Code map solution from "Nathan Werth"
f5 <- function(){ c(A = 1, B = -1)[a] }

# 7. Test
microbenchmark(
  f1(), f2(), f3(), f4(), f5())

Unit: milliseconds
expr       min        lq      mean    median        uq       max neval cld
f1() 23.331763 23.648421 28.253174 24.235554 26.582799 123.49315   100  b 
f2()  5.808460  6.025908  6.421053  6.067174  6.200166  12.94342   100 a  
f3() 13.817060 14.926539 25.900652 16.388596 18.122837 129.67193   100  b 
f4() 28.772036 31.363670 39.185333 32.352557 34.388918 134.35915   100   c
f5()  4.577321  5.186689  8.727417  7.375286  7.895280 106.31922   100 a 
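
For readers wondering why `f2` works: R coerces logical values to 0 and 1 in arithmetic, so `(-1)^FALSE` is 1 and `(-1)^TRUE` is -1. A small sketch on the original four-element vector (the name `is_not_a` is only for illustration):

```r
a <- c("A", "B", "A", "A")

# The comparison gives a logical vector
is_not_a <- a != "A"        # FALSE TRUE FALSE FALSE

# In arithmetic, FALSE acts as 0 and TRUE as 1
b <- (-1)^is_not_a          # 1 -1 1 1
```

Like the `ifelse` version, this treats every value other than "A" as -1.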
Andrii
  • If you're going to post benchmark code, you should post the results so we can get a sense of relative rankings at least. (And if you're going to benchmark, do it on a big enough vector to matter, at least 1e5 elements.) – Gregor Thomas Nov 27 '17 at 21:46
  • @Gregor I've run the test on 1e5 random characters, as you suggested. Please have a look at the results above – Andrii Nov 27 '17 at 22:09
2
code_map <- c(A = 1, B = -1)
b <- code_map[a]
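
A short sketch of how this lookup behaves, assuming the original four-element `a`: indexing by name keeps the names on the result, `unname()` strips them if a plain numeric vector is wanted, and a value with no entry in the map (the "C" below is hypothetical) comes back as NA rather than raising an error.

```r
a <- c("A", "B", "A", "A")
code_map <- c(A = 1, B = -1)

# Indexing by name keeps the names of the looked-up elements
b_named <- code_map[a]

# unname() drops them, leaving a plain numeric vector
b <- unname(code_map[a])                      # 1 -1 1 1

# A value with no entry in the map yields NA
missing_key <- unname(code_map[c("A", "C")])  # 1 NA
```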
Nathan Werth