I have the Simpsons data set from kaggle.com, which includes the title of each episode. I want to count how many times each character's name is used in each title. I can find exact words in the titles, but my code misses variants such as "Homers" when I look for "Homer". Is there a way to match those as well?
Data example and my code:
text <- 'title
Homer\'s Night Out
Krusty Gets Busted
Bart Gets an "F"
Two Cars in Every Garage and Three Eyes on Every Fish
Dead Putting Society
Bart the Daredevil
Bart Gets Hit by a Car
Homer vs. Lisa and the 8th Commandment
Oh Brother, Where Art Thou?
Old Money
Lisa\'s Substitute
Blood Feud
Mr. Lisa Goes to Washington
Bart the Murderer
Like Father, Like Clown
Saturdays of Thunder
Burns Verkaufen der Kraftwerk
Radio Bart
Bart the Lover
Separate Vocations
Colonel Homer'
simpsons <- read.csv(text = text, stringsAsFactors = FALSE)
library(stringr)
titlewords <- paste(simpsons$title, collapse = " " )
words <- c('Homer')
titlewords <- gsub("[[:punct:]]", "", titlewords)
HomerCount <- str_count(titlewords, paste(words, collapse=" "))
HomerCount
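For reference, here is a minimal sketch of the per-title count I am after, assuming a word-boundary pattern is an acceptable way to catch variants such as "Homers" and "Homer's". The `titles` sample, the `names_to_find` vector, and the title "Homers Everywhere" are made up for illustration and are not part of the Kaggle data. Note that `\\w*` would also absorb unrelated longer words that merely start with a name, so this is only an approximation.

```r
library(stringr)

# Hypothetical sample; in practice this would be simpsons$title
titles <- c("Homer's Night Out", "Bart Gets an \"F\"",
            "Colonel Homer", "Homers Everywhere")

# Hypothetical list of character names to look for
names_to_find <- c("Homer", "Bart", "Lisa")

# \\b anchors the match at the start of a word; \\w* lets it absorb any
# trailing letters, so "Homer" and "Homers" are each counted once
patterns <- paste0("\\b", names_to_find, "\\w*")

# Build a matrix with one row per title and one column per name
counts <- sapply(patterns, function(p) {
  str_count(titles, regex(p, ignore_case = TRUE))
})
colnames(counts) <- names_to_find
rownames(counts) <- titles

counts           # per-title counts
colSums(counts)  # totals across all titles
```

Counting on the vector of titles, rather than on one collapsed string, keeps the result per title; `colSums()` still gives the overall total if that is needed.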