This code transforms a vector of nucleotides into a coded version with numbers from 1 to 4. However, I'd like something a bit more elegant, possibly in a single line. Is this possible?

vector2 <- c("c","a","g","g","c","g","g","g","a","t","t","t","c","t","c","t","t","g","t","t","g","a","c","a","g",  "a","a","t","c","c")
vector2[vector2=="a"]<-1
vector2[vector2=="c"]<-2
vector2[vector2=="g"]<-3
vector2[vector2=="t"]<-4
as.numeric(vector2)

Thanks

Sotos
zest16
  • A nested `ifelse` will do the trick here: `as.numeric(ifelse(vector2 == 'a', 1, ifelse(vector2 == 'c', 2, ifelse(...))))` and so on – Sotos May 16 '19 at 11:45
  • Another option: `match(vector2, sort(unique(vector2)))` – markus May 16 '19 at 11:58
  • Transforming to a factor is probably the way to go: `as.numeric(factor(vector2,levels=c('a','c','g','t')))` gives `2 1 3 3 2 3 3 3 1 4 4 4 2 4 2 4 4 3 4 4 3 1 2 1 3 1 1 4 2 2` – Tensibai May 16 '19 at 11:59
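
The `match` idea from the comments can also be written with the alphabet spelled out, so the codes do not depend on which letters happen to occur in the data. This is only a sketch: the shortened `vector2` and the name `alphabet` are illustrative assumptions, not part of the original question.

```r
# Shortened example input (assumption: lower-case nucleotides only)
vector2 <- c("c", "a", "g", "g", "c", "t")

# Hypothetical helper vector: position in `alphabet` is the numeric code
alphabet <- c("a", "c", "g", "t")

# match() returns, for each element, its position in `alphabet`
codes <- match(vector2, alphabet)
codes
```

Any symbol that is not in `alphabet` comes back as `NA`, which makes unexpected input easy to spot.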

2 Answers

Here is a solution in base R:

sapply(vector2, switch, a = 1, c = 2, g = 3, t = 4)

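A more efficient approach probably exists, but this one is elegant and fits on one line. Note that `sapply` keeps the input letters as names on the result; wrap the call in `unname()` if you want a plain numeric vector.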
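
One candidate for a vectorised alternative is indexing into a named lookup vector. This is a different technique from the `switch` call above, shown here only as a sketch; the name `lookup` and the shortened `vector2` are assumptions for illustration.

```r
# Shortened example input (assumption: lower-case nucleotides only)
vector2 <- c("c", "a", "g", "g", "c", "t")

# Hypothetical named lookup vector: names are nucleotides, values are codes
lookup <- c(a = 1, c = 2, g = 3, t = 4)

# Indexing by name is vectorised; unname() drops the letter labels
codes <- unname(lookup[vector2])
codes
```

Because the whole vector is indexed in one step, there is no per-element function call as there is with `sapply`.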

Clemsang

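This might work for you, provided every nucleotide occurs at least once in the vector, since `as.factor` numbers the levels in sorted order: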

as.numeric(as.factor(c("c","a","g","g","c","g","g","g","a","t","t")))
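
To see why the set of letters present matters, here is a small sketch comparing the `as.factor` call above with the explicit-levels variant suggested in the comments. The input `partial` is a made-up example that deliberately contains no "a".

```r
# Made-up input with no "a" in it (assumption for illustration)
partial <- c("c", "g", "t", "g")

# Implicit levels come from the data, so "c" becomes level 1
implicit <- as.numeric(as.factor(partial))

# Explicit levels fix the coding regardless of which letters occur
explicit <- as.numeric(factor(partial, levels = c("a", "c", "g", "t")))

implicit
explicit
```

With implicit levels the codes shift down by one here, while the explicit version keeps c = 2, g = 3, t = 4.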
Petr Matousu