
I'm basically trying to mutate a data set and add a column based on the value of another column in that dataset. How do I do this?

Say I have a data set that looks like this:

movies
# A tibble: 651 x 32
                    title   title_type       genre runtime mpaa_rating                   studio
                    <chr>       <fctr>      <fctr>   <dbl>      <fctr>                   <fctr>
 1            Filly Brown Feature Film       Drama      80           R      Indomina Media Inc.
 2               The Dish Feature Film       Drama     101       PG-13    Warner Bros. Pictures
 3    Waiting for Guffman Feature Film      Comedy      84           R   Sony Pictures Classics
 4   The Age of Innocence Feature Film       Drama     139          PG        Columbia Pictures
 ... (more columns and more rows than shown)

Say it has a column (not shown) called thtr_release_month whose values are month names such as "October" or "January".

I want to add a column called oscar_season that is "yes" if the movie was released in November or December and "no" otherwise. How does one do this? I feel like this is close:

movies_with_oscar_season <- movies %>% mutate(oscar_season = ifelse(movies$thtr_release_month == 'November' | movies$thtr_release_month == 'December', 'yes', 'no'))

What am I missing? How can I improve the above code?

I actually get an error:

Column oscar_season must be length 651 (the number of rows) or one, not 0 Calls: <Anonymous> ... <Anonymous> -> mutate -> mutate.tbl_df

What am I doing wrong?

Is there also a shorter way to write that long "or" expression?

Alejandro Montilla
Jwan622

1 Answer

You can create a new vector with the result of evaluating your condition:

oscar_season <- ifelse(movies$thtr_release_month %in% c('November', 'December'), "yes", "no")

Edit: Based on the comments, the result should be "yes" when the condition is TRUE and "no" when it is FALSE.

And then call mutate with that new column:

movies_oscar_season <- mutate(movies, oscar_season)

That should give you the original dataset with the oscar_season column.
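
The two steps can also be combined into a single mutate call. The sketch below is only an illustration: it builds a small hypothetical tibble (the titles and months are made up, not taken from the real data) and assumes thtr_release_month holds month names as in the question. Inside mutate, columns are referred to by their bare names, so the movies$ prefix is not needed, and %in% replaces the chained | comparisons.

```r
library(dplyr)

# Hypothetical stand-in for the real movies data; only the columns used here.
movies <- tibble(
  title = c("Movie A", "Movie B", "Movie C", "Movie D"),
  thtr_release_month = c("April", "December", "January", "November")
)

# Refer to the column by its bare name inside mutate();
# %in% tests membership in the set of Oscar-season months.
movies_with_oscar_season <- movies %>%
  mutate(oscar_season = ifelse(thtr_release_month %in% c("November", "December"), "yes", "no"))

print(movies_with_oscar_season)
```

If the original error persists ("must be length 651 ... not 0"), it may be worth checking names(movies) for the exact spelling of the column, since movies$some_missing_column evaluates to NULL, which has length 0.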

Alejandro Montilla