
I would like to create a dummy variable for action movies in my data set.

My code is:

imdb$action_movies <- ifelse(imdb$imdb.com_genres == "Action", 1,0)

Unfortunately, when I run this code, only movies tagged exclusively as Action get a 1. Movies with multiple tags, such as Action Adventure, do not.

How can I make it so that my dummy variable will include movies that have the action tag and multiple other genres?

Nad Pat
  • 1
    Please provide a sample dataset; it will make things clearer. – Nad Pat Sep 03 '21 at 09:31
  • 1
    More specifically, please have a look here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example So the easiest way would be to use `dput(your data)` and copy the result here, or, if the data is too large, use `dput(head(your data))`. – deschen Sep 03 '21 at 09:34
  • The simplest version is `grepl("Action", imdb$imdb.com_genres)`. NB: you don't need the `ifelse` in your code. Just the `==` or `grepl` will create a logical variable, which will be interpreted as 1/0 if used in a numeric context. – dash2 Sep 03 '21 at 10:10
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Sep 03 '21 at 16:25

1 Answer


This is a relatively simple problem that regex can solve.

Basically, we want to inspect every string to see whether it contains "Action". If it does, we give it a 1; if it does not, a 0.

We can use str_detect() from {stringr} to do this.

From there we throw our matches into an ifelse() statement as you had done above.

An example of what the final column will look like is shown below:

library(stringr)
movies <- c("Action", "Comedy, Action, Adventure", "Action, Adventure")
action_movies <- ifelse(str_detect(movies, "Action"), 1, 0)
action_movies

Which returns

[1] 1 1 1
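
If you would rather not load {stringr}, the same idea can be sketched in base R with grepl(), as suggested in the comments. The data frame below is a made-up stand-in for your data (only the column name imdb.com_genres comes from the question), and it includes a non-action row so the 0 case is visible. I am also assuming the genres are stored as a single string per movie.

```r
# Hypothetical stand-in for the real imdb data frame; only the column
# name is taken from the question, the rows are invented for illustration.
imdb <- data.frame(
  imdb.com_genres = c("Action", "Comedy, Action, Adventure", "Drama"),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE/FALSE for each row; fixed = TRUE treats "Action"
# as a literal substring rather than a regular expression.
# as.integer() turns TRUE/FALSE into 1/0, so no ifelse() is needed.
imdb$action_movies <- as.integer(grepl("Action", imdb$imdb.com_genres, fixed = TRUE))

imdb$action_movies
```

Note that this is a case-sensitive substring match, so it would miss "action" in lower case and would also flag any other tag that happens to contain "Action".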
Hansel Palencia