I want to filter specific rows from a data set I got from the Project Gutenberg R package (gutenbergr). I want to keep only the rows that contain a given word, but every row holds more than one word, so an exact-match filter() will not work.
For example, one value is "The Little Vanities of Mrs. Whittaker: A Novel". I want to select all the rows that contain the word "novel", but I cannot figure out how.
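library(dplyr)
library(gutenbergr)
library(tidytext)

# Join the English works with the full metadata, then attach the subjects
gutenberg_full_data <- left_join(gutenberg_works(language == "en"), gutenberg_metadata, by = "gutenberg_id")
gutenberg_full_data <- left_join(gutenberg_full_data, gutenberg_subjects, by = "gutenberg_id")
# Drop the duplicated columns produced by the first join
gutenberg_full_data <- subset(gutenberg_full_data, select = -c(rights.x, has_text.x, language.y, gutenberg_bookshelf.x, gutenberg_bookshelf.y, rights.y, has_text.y, gutenberg_author_id.y, title.y, author.y))
# Keep only rows with a known author (safe even when no author is NA)
gutenberg_full_data <- gutenberg_full_data[!is.na(gutenberg_full_data$author.x), ]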
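# Works whose subject is exactly "Drama"
novels <- gutenberg_full_data %>% filter(subject == "Drama")
# Download the texts, keeping each work's title next to its lines
original_books <- gutenberg_download(novels, meta_fields = "title")
original_books
# Split the text into one row per word
tidy_books <- original_books %>%
  unnest_tokens(word, text)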
The code above is what I used to build the data frame with the "gutenbergr" package.
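To make the goal concrete, here is a minimal sketch of the kind of filtering described above, using pattern matching instead of an exact comparison. It assumes the text to search is in a column called title. The small books data frame is a hypothetical stand-in for the real data: only the first title comes from the example, and the ids and other titles are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in data; only the first title is from the example above.
books <- tibble(
  gutenberg_id = c(1L, 2L, 3L),
  title = c(
    "The Little Vanities of Mrs. Whittaker: A Novel",
    "A Book of Plays",
    "Novelties and Other Stories"
  )
)

# grepl() returns TRUE for every string in which the pattern matches anywhere.
# \\b marks a word boundary, so "Novelties" is not counted as "novel";
# ignore.case = TRUE lets "novel" match "Novel".
novel_titles <- books %>%
  filter(grepl("\\bnovel\\b", title, ignore.case = TRUE, perl = TRUE))

novel_titles
```

On this toy data only the first row should be kept. On the tokenised data (one word per row), an exact comparison such as filter(word == "novel") would presumably be enough, since unnest_tokens() lowercases words by default.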