I am creating a function that takes a list of user-specified words and labels each word with a number: its position in that list. The user can specify lists of different lengths.

For example:

myNotableWords<-c("No_IM","IM","LGD","HGD","T1a")

aa<-c("No_IM","IM","No_IM","HGD","T1a","HGD","T1a","IM","LGD")
aa<-data.frame(aa,stringsAsFactors=FALSE)

Intended output:

new<-c(1,2,1,4,5,4,5,2,3)

Is there a way to take the index of the original list, look up where each element of the target list sits in it, and replace that element with its index number?

Sebastian Zeki
  • `match(aa, myNotableWords)` – Henrik Oct 22 '18 at 18:52
  • [Is there an R function for finding the index of an element in a vector?](https://stackoverflow.com/questions/5577727/is-there-an-r-function-for-finding-the-index-of-an-element-in-a-vector) – Henrik Oct 22 '18 at 18:53
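
The `match()` suggestion in the comment above can be sketched as follows. Because `aa` is a data frame in the question, this sketch assumes the lookup should run on the column `aa$aa` rather than on the data frame itself; the column name `new` is only illustrative.

```r
myNotableWords <- c("No_IM", "IM", "LGD", "HGD", "T1a")
aa <- data.frame(
  aa = c("No_IM", "IM", "No_IM", "HGD", "T1a", "HGD", "T1a", "IM", "LGD"),
  stringsAsFactors = FALSE
)

# match() returns, for each element of aa$aa, its position in myNotableWords
# (NA for any word that is not in the list)
aa$new <- match(aa$aa, myNotableWords)
aa$new
# [1] 1 2 1 4 5 4 5 2 3
```

Since `match()` is vectorised, no loop is needed and the original row order is kept.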

3 Answers

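new <- c()
# aa is a data frame in the question, so loop over its column aa$aa
for (item in aa$aa) {
  # which() gives the position of item in myNotableWords
  new <- c(new, which(myNotableWords == item))
}
print(new)
#[1] 1 2 1 4 5 4 5 2 3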
12b345b6b78

You can do this using data.frame; the syntax shouldn't change. I prefer using data.table though.

library(data.table)
myWords <- c("No_IM","IM","LGD","HGD","T1a")
myIndex <- data.table(keywords = myWords, word_index = seq_along(myWords))

The third line builds a lookup table that pairs each word in myWords with its position in the vector.

aa <- data.table(keywords = c("No_IM","IM","No_IM","HGD","T1a",
                         "HGD","T1a","IM","LGD"))
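# sort = FALSE keeps the rows of aa in their original order
aa <- merge(aa, myIndex, by = "keywords", all.x = TRUE, sort = FALSE)

And now you have a table that shows each keyword next to its index in myWords. Without sort = FALSE, merge would reorder the rows by keyword.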
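
For the data.frame route mentioned at the top of this answer, here is a minimal base R sketch. Base `merge()` does not promise to keep the row order, so this sketch assumes a helper column (named `row_id` here purely for illustration) is acceptable for restoring it afterwards.

```r
myWords <- c("No_IM", "IM", "LGD", "HGD", "T1a")
myIndex <- data.frame(keywords = myWords,
                      word_index = seq_along(myWords),
                      stringsAsFactors = FALSE)

aa <- data.frame(keywords = c("No_IM", "IM", "No_IM", "HGD", "T1a",
                              "HGD", "T1a", "IM", "LGD"),
                 stringsAsFactors = FALSE)

# row_id is a hypothetical helper column that remembers the original order
aa$row_id <- seq_len(nrow(aa))

merged <- merge(aa, myIndex, by = "keywords", all.x = TRUE)

# put the rows back in their original order
merged <- merged[order(merged$row_id), ]
merged$word_index
# [1] 1 2 1 4 5 4 5 2 3
```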

Arturo Sbr

Why not just use the factor functionality of R?

A "factor data type" stores an integer that references a "level" (= character string) via the index number:

myNotableWords<-c("No_IM","IM","LGD","HGD","T1a")
aa<-c("No_IM","IM","No_IM","HGD","T1a","HGD","T1a","IM","LGD")

aa <- as.integer(factor(aa, myNotableWords, ordered = TRUE))

aa
# [1] 1 2 1 4 5 4 5 2 3
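
Since the question asks for a function, the factor approach could be wrapped roughly like this. The name `label_words` and its argument names are made up for this sketch; the example call also shows that a word missing from the level list comes back as `NA`.

```r
# Hypothetical helper: map each word to its position in notable_words
label_words <- function(words, notable_words) {
  # words that are not among the levels become NA
  as.integer(factor(words, levels = notable_words))
}

labels <- label_words(c("IM", "T1a", "unknown"),
                      c("No_IM", "IM", "LGD", "HGD", "T1a"))
labels
# [1]  2  5 NA
```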
R Yoda