I have two data frames:
word_table <-
word_9 word_1 word_3 ...word_random
word_2 na na ...word_random
word_5 word_3 na ...word_random
dictionary_words <-
word_2
word_3
word_4
word_6
word_7
word_8
word_9
.
.
.
word_n
What I am looking for is to match word_table against dictionary_words and replace each word with its position in the dictionary, like this:
result <-
7 na 2 ...
1 na na ...
na 2 na ...
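To make the expected mapping concrete, here is a minimal sketch on toy data. It assumes only the first three columns of word_table and the first seven dictionary entries shown above, and it uses exact matching with match(); the elided columns and entries are left out on purpose.

```r
# Toy data: only the columns and dictionary entries spelled out above (assumption)
word_table <- matrix(
  c("word_9", "word_1", "word_3",
    "word_2", NA,       NA,
    "word_5", "word_3", NA),
  nrow = 3, byrow = TRUE
)
dictionary_words <- c("word_2", "word_3", "word_4", "word_6",
                      "word_7", "word_8", "word_9")

# match() flattens the matrix to a vector, so restore the dimensions afterwards
result <- match(word_table, dictionary_words)
dim(result) <- dim(word_table)
print(result)
```

Words that are missing from the dictionary (word_1, word_5) and the NA cells both come back as NA.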
I have tried the pmatch, charmatch, and match functions. They return the correct result when dictionary_words is short, but when it is relatively long (more than 20000 words), only the first column of result is matched and the remaining columns all become NA, like this:
result <-
7 na na ...
1 na na ...
na na na ...
Is there any other way I can do the character matching, for example with one of the apply functions?
Sample data:
## name the columns with `=`; using `<-` inside data.frame() creates stray global variables
word_table <- data.frame(
  word_1 = c("conflict", "", "resolved", "", "", ""),
  word_2 = c("", "one", "tricky", "one", "", "one"),
  word_3 = c("thanks", "", "", "comments", "par", ""),
  word_4 = c("thanks", "", "", "comments", "par", ""),
  word_5 = c("", "one", "tricky", "one", "", "one"),
  stringsAsFactors = FALSE
)
## Targeted Words
dictionary_words <- data.frame(cbind(c("abovementioned","abundant","conflict", "thanks", "tricky", "one", "two", "three","four", "resolved")))
## convert into matrix (if needed)
word_table <- as.matrix(word_table)
dictionary_words <- as.matrix(dictionary_words)
## pmatch for each element of word_table
# matched_table <- pmatch(word_table, dictionary_words)
# dim(matched_table) <- dim(word_table)
# print(matched_table)
result <- `dim<-`(pmatch(word_table, dictionary_words, duplicates.ok=TRUE), dim(word_table))
print(result) # works fine here, but when dictionary_words is large it returns matches only for the first column of word_table
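
For reference, the apply-style variant I have in mind would look roughly like the sketch below. It assumes exact matches are acceptable, so it swaps pmatch for match, and it keeps dictionary_words as a plain character vector. The data is a trimmed copy of the sample above, and I have not checked how it behaves with the 20000-word dictionary.

```r
# Sketch only: exact matching with match(), applied column by column
word_table <- data.frame(
  word_1 = c("conflict", "", "resolved", "", "", ""),
  word_2 = c("", "one", "tricky", "one", "", "one"),
  word_3 = c("thanks", "", "", "comments", "par", ""),
  stringsAsFactors = FALSE
)
dictionary_words <- c("abovementioned", "abundant", "conflict", "thanks",
                      "tricky", "one", "two", "three", "four", "resolved")

# sapply() passes each column to match(); words not in the dictionary give NA
result <- sapply(word_table, match, table = dictionary_words)
print(result)
```

Because every column is matched separately, the result keeps the column names of word_table and no dim<- step is needed.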