15

I am using grepl() in R to check whether any of the following genres appears in my text. I am doing it like this right now:

grepl("Action", my_text) |
grepl("Adventure", my_text) |
grepl("Animation", my_text) |
grepl("Biography", my_text) |
grepl("Comedy", my_text) |
grepl("Crime", my_text) |
grepl("Documentary", my_text) |
grepl("Drama", my_text) |
grepl("Family", my_text) |
grepl("Fantasy", my_text) |
grepl("Film-Noir", my_text) |
grepl("History", my_text) |
grepl("Horror", my_text) |
grepl("Music", my_text) |
grepl("Musical", my_text) |
grepl("Mystery", my_text) |
grepl("Romance", my_text) |
grepl("Sci-Fi", my_text) |
grepl("Sport", my_text) |
grepl("Thriller", my_text) |
grepl("War", my_text) |
grepl("Western", my_text)

Is there a better way to write this code? Can I put all the genres in an array and then somehow use grepl() on that?

Peter Mortensen
user3422637

2 Answers

39

You could paste the genres together with an "or" | separator and run that through grepl as a single regular expression.

x <- c("Action", "Adventure", "Animation", ...)
grepl(paste(x, collapse = "|"), my_text)

Here's an example.

x <- c("Action", "Adventure", "Animation")
my_text <- c("This one has Animation.", "This has none.", "Here is Adventure.")
grepl(paste(x, collapse = "|"), my_text)
# [1]  TRUE FALSE  TRUE
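
Applied to the full genre list from the question, the same idea might look like the sketch below. The genre strings are the ones in the question; the names `genres`, `pattern`, and `has_genre`, and the sample `my_text` values, are made up for illustration. None of these genre names contains a regex metacharacter, so they can be pasted into the pattern unescaped.

```r
# Genres taken from the question; the vector name `genres` is illustrative.
genres <- c("Action", "Adventure", "Animation", "Biography", "Comedy",
            "Crime", "Documentary", "Drama", "Family", "Fantasy",
            "Film-Noir", "History", "Horror", "Music", "Musical",
            "Mystery", "Romance", "Sci-Fi", "Sport", "Thriller",
            "War", "Western")

# Build one alternation pattern: "Action|Adventure|...|Western"
pattern <- paste(genres, collapse = "|")

# Hypothetical sample input, not from the question
my_text <- c("A Sci-Fi classic.", "No genre here.", "Film-Noir at its best.")

# One logical value per element of my_text
has_genre <- grepl(pattern, my_text)
has_genre
# [1]  TRUE FALSE  TRUE
```

Note that grepl() matches substrings and is case-sensitive by default, so "War" would also match a word such as "Warrior" but not "war".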
Rich Scriven
3

You can cycle through a list or vector of genres, as below:

genres <- c("Action",...,"Western")
sapply(genres, function(x) grepl(x, my_text))

To answer your question, if you just want to know if any element of the result is TRUE you can use the any() function.

any(sapply(genres, function(x) grepl(x, my_text)))

Quite simply, if any element of the result is TRUE, any() will return TRUE.
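
As a rough sketch of how this could feed an if statement, assuming `my_text` is a single string: the shortened genre list, the sample text, and the names `per_genre` and `found` are illustrative, not from the question.

```r
# Shortened genre list and sample text, both illustrative
genres <- c("Action", "Comedy", "Western")
my_text <- "A Comedy about two brothers."

# Named logical vector: one TRUE/FALSE per genre
per_genre <- sapply(genres, function(x) grepl(x, my_text))

# Collapse the per-genre results into a single TRUE/FALSE
found <- any(per_genre)

# A single logical value is what if() expects
if (found) {
  message("At least one genre matched")
}
```

If `my_text` holds more than one string, sapply() returns a matrix with one column per genre, and any() then reports whether any genre matched in any of the strings rather than giving one answer per string.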

Brandon Bertelsen
  • This gets me close to what I am looking for. But what I get here is TRUE/FALSE values for each genre. If I have an array of 20 genres, I get say 19 FALSE values and 1 TRUE value if one of the genres was contained in my_text. I want a final result from this saying 19 FALSE and 1 TRUE equates TRUE in the end. You get what I am saying? How would I do that? – user3422637 Oct 12 '14 at 01:32
  • I am doing an if statement on top of this to see if the condition returns true. – user3422637 Oct 12 '14 at 01:35