I am trying to remove certain substrings from the string elements of a vector by replacing them with an empty string. These are the vectors involved:
test <- c("PALMA DE MALLORCA", "THE RICH AND THE POOR", "A CAMEL IN THE DESERT", "SANTANDER SL", "LA")
lista <- c("EL", "LA", "ES", "DE", "Y", "DEL", "LOS", "S.L.", "S.A.", "S.C.", "LAS",
"DEL", "THE", "OF", "AND", "BY", "S", "L", "A", "C", "SA", "SC", "SL")
If I apply the mgsub function from qdap as is, every occurrence is removed, including matches inside longer words:
library(qdap)
mgsub(lista, "", test)
# [1] "PM MOR" "RIH POOR" "M IN ERT" "NTER" ""
So I wrapped each entry in word boundaries (\\b) and ran it again, but now the strings come back unchanged:
lista <- paste("\\b", lista, "\\b", sep = "")
mgsub(lista, "", test)
# [1] "PALMA DE MALLORCA" "THE RICH AND THE POOR" "A CAMEL IN THE DESERT"
# [4] "SANTANDER SL" "LA"
I cannot get the word boundary regex to work with this function. How can I make it remove only whole-word matches?
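To make the goal concrete, here is a minimal base-R sketch of the result I am after, using gsub instead of mgsub. It rests on a few assumptions of mine: the names stopwords, pattern and cleaned are only illustrative, the dotted entries ("S.L.", "S.A.", "S.C.") are left out because "." is a regex metacharacter and would need escaping, and leftover whitespace is collapsed afterwards. I also suspect, without having confirmed it, that mgsub is matching my patterns literally rather than as regular expressions, which would explain why the \\b version changes nothing.

```r
test <- c("PALMA DE MALLORCA", "THE RICH AND THE POOR",
          "A CAMEL IN THE DESERT", "SANTANDER SL", "LA")

# Illustrative word list: same entries as lista, minus the dotted
# abbreviations and the duplicate "DEL" (assumption for this sketch).
stopwords <- c("EL", "LA", "ES", "DE", "Y", "DEL", "LOS", "LAS", "THE",
               "OF", "AND", "BY", "S", "L", "A", "C", "SA", "SC", "SL")

# One alternation wrapped in word boundaries, so only whole words match.
pattern <- paste0("\\b(", paste(stopwords, collapse = "|"), ")\\b")

# Remove the whole-word matches, then collapse and trim leftover spaces.
cleaned <- gsub(pattern, "", test, perl = TRUE)
cleaned <- trimws(gsub("\\s+", " ", cleaned))

cleaned
# [1] "PALMA MALLORCA"  "RICH POOR"       "CAMEL IN DESERT" "SANTANDER"
# [5] ""
```

This is the output I would like mgsub (or an equivalent) to produce for the full list, dotted abbreviations included.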