I have a character vector of misspelled words and a matching vector of corrections:

wordswrong <- c("veh", "crrts", "ornges")
wordscorrect <- c("vehicle", "carrots", "oranges")

Here's a dataframe:

words <- data.frame(terms = c("crrts oranges",
                              "car is a veh",
                              "orngs bannas peas"))

How can I go through each word in words$terms and replace it based on my two vectors?

Doug Fir
  • Try `for(i in seq_along(wordswrong)) words$terms <- gsub(wordswrong[i], wordscorrect[i], words$terms)` or `library(qdap); words$terms <- mgsub(wordswrong, wordscorrect, words$terms)` – akrun Jun 30 '17 at 07:02
  • Thanks @akrun! I'm sure I have a memory of once seeing code where someone used a lookup table along the lines of df$wrongwords <- lut (lookuptable). Does this sound familiar? Or maybe it's the wrong context for a list? Or perhaps since each cell is not an exact lookup I cannot go this route – Doug Fir Jun 30 '17 at 07:04
  • Oh hang on, mgsub looks perfect! Cheers for the tip – Doug Fir Jun 30 '17 at 07:05

1 Answer

We can use `mgsub` from the qdap package, which takes the vector of patterns and the vector of replacements in one call:

library(qdap)
words$terms <- mgsub(wordswrong, wordscorrect, words$terms)
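
If installing qdap is not an option, the same replacement can be sketched in base R with a loop over the two vectors. This is a minimal sketch, not the only way to do it: the `\\b` word-boundary anchors and `perl = TRUE` are my assumptions, added so that a short pattern such as "veh" is only replaced when it stands as a whole word.

```r
wordswrong   <- c("veh", "crrts", "ornges")
wordscorrect <- c("vehicle", "carrots", "oranges")

words <- data.frame(terms = c("crrts oranges",
                              "car is a veh",
                              "orngs bannas peas"),
                    stringsAsFactors = FALSE)

# Replace each misspelling with the correction at the same position.
# Assumption: whole-word matching is wanted, hence the \\b anchors.
for (i in seq_along(wordswrong)) {
  pattern <- paste0("\\b", wordswrong[i], "\\b")
  words$terms <- gsub(pattern, wordscorrect[i], words$terms, perl = TRUE)
}

words$terms
```

With the sample data, the first two rows are corrected. The third row is left unchanged because "orngs" and "bannas" do not appear in `wordswrong`.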
akrun