I am not good with regex, so I found an alternative. d is a vector of words that should be excluded from capitalization.
We split each string into words using strsplit and then check whether each word matches an entry in the vector d. If it doesn't, we capitalize it using the capitalize function from the Hmisc package.
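library(Hmisc)  # provides capitalize()
x <- c('I like the pizza', 'The water in the pool')
d <- c("the", "of", "in")  # words to leave unchanged
# Split each string on spaces, then capitalize only the words not found in d
lapply(strsplit(x, " "), function(x) ifelse(is.na(match(x, d)), capitalize(x), x))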
#[[1]]
#[1] "I" "Like" "the" "Pizza"
#[[2]]
#[1] "The" "Water" "in" "the" "Pool"
Further, you can use sapply along with paste to get the result back as a character vector:
a <- lapply(strsplit(x, " "), function(x) ifelse(is.na(match(x, d)), capitalize(x),x))
sapply(a, function(x) paste(x, collapse = ' '))
#[1] "I Like the Pizza" "The Water in the Pool"
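If you would rather not depend on Hmisc, the same steps can be wrapped in a small base R helper. This is only a sketch: cap_except is a hypothetical name of my own, and it assumes words are separated by single spaces and that matching against the excluded words is case-sensitive, as in the code above.

```r
# Hypothetical helper (not part of any package): capitalize every word
# except those listed in `skip`, using base R only.
cap_except <- function(strings, skip) {
  vapply(strsplit(strings, " ", fixed = TRUE), function(words) {
    keep <- words %in% skip  # TRUE for words that stay unchanged
    # Upper-case the first letter of the remaining words
    words[!keep] <- paste0(toupper(substring(words[!keep], 1, 1)),
                           substring(words[!keep], 2))
    paste(words, collapse = " ")  # glue the words back into one string
  }, character(1))
}

x <- c('I like the pizza', 'The water in the pool')
d <- c("the", "of", "in")
cap_except(x, d)
#[1] "I Like the Pizza"      "The Water in the Pool"
```

vapply is used instead of sapply here so the result is always a character vector, even for empty input.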