
I have a feeling this is a pretty simple one. I have a data frame that looks something like this:

ID   Genre1         Genre2
1    Comedy         Comedy
2    Drama          Drama
3    Sport          Sport
4    Drama          Comedy
5    Documentary    Documentary
6    Entertainment  Entertainment
7    Film           Film
8    Drama          Crime Drama
9    Crime Drama    Drama

I want to identify which rows have the same value in both columns (e.g. "Comedy" and "Comedy") and create a new column called Match that labels them "Yes" (or "No" for those that don't match).

Based on the sample above, the expected output should look something like this:

ID   Genre1         Genre2          Match
1    Comedy         Comedy          Yes
2    Drama          Drama           Yes
3    Sport          Sport           Yes
4    Drama          Comedy          No
5    Documentary    Documentary     Yes
6    Entertainment  Entertainment   Yes
7    Film           Film            Yes
8    Drama          Crime Drama     No
9    Crime Drama    Drama           No

Any ideas how I could go about doing this and/or what package would be best? Thanks in advance!

Uwe Keim
Japes

1 Answer


Use ifelse:

df$Match <- ifelse(df$Genre1 == df$Genre2, 'Yes', 'No')
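
To try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same line. It assumes the genre columns are character vectors rather than factors (comparing factors with different level sets raises an error, so convert with `as.character()` first if needed), and note that a row with `NA` in either column would get `NA` rather than "No".

```r
# Rebuild the sample data from the question (values copied from the post)
df <- data.frame(
  ID = 1:9,
  Genre1 = c("Comedy", "Drama", "Sport", "Drama", "Documentary",
             "Entertainment", "Film", "Drama", "Crime Drama"),
  Genre2 = c("Comedy", "Drama", "Sport", "Comedy", "Documentary",
             "Entertainment", "Film", "Crime Drama", "Drama"),
  stringsAsFactors = FALSE  # keep genres as plain strings
)

# Element-wise comparison of the two columns, mapped to "Yes"/"No"
df$Match <- ifelse(df$Genre1 == df$Genre2, "Yes", "No")

print(df)
```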
Elle
  • Thanks for the quick reply! Just to check, I have some other datasets I need to run this on that have more than two columns; some have four. Could this solution be adapted to work for those? – Japes Apr 23 '21 at 12:34
  • As in, to check whether all four columns are the same? You could technically do that by altering the condition with something like `df$Genre1 == df$Genre2 & df$Genre2 == df$Genre3 & df$Genre3 == df$Genre4` but that's pretty ugly. You could maybe try this answer: https://stackoverflow.com/questions/54907539/row-wise-test-if-multiple-not-all-columns-are-equal – Elle Apr 23 '21 at 12:43
  • Nice, glad it helped! – Elle Apr 23 '21 at 13:32
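
For the four-column case raised in the comments, one possible alternative to chaining `==` is to count the distinct values in each row. This is only a sketch, not necessarily what the linked answer does: the data frame `df4`, its `Genre3` and `Genre4` columns, and the `genre_cols` vector are made up for illustration, and the columns are assumed to be character vectors.

```r
# Hypothetical four-column data, invented for illustration
df4 <- data.frame(
  ID = 1:3,
  Genre1 = c("Comedy", "Drama", "Film"),
  Genre2 = c("Comedy", "Drama", "Film"),
  Genre3 = c("Comedy", "Comedy", "Film"),
  Genre4 = c("Comedy", "Drama", "Sport"),
  stringsAsFactors = FALSE
)

# Columns to compare; adjust this vector for datasets with other widths
genre_cols <- c("Genre1", "Genre2", "Genre3", "Genre4")

# A row matches when all of its genre values collapse to one distinct value
all_same <- apply(df4[genre_cols], 1, function(x) length(unique(x)) == 1)

df4$Match <- ifelse(all_same, "Yes", "No")

print(df4)
```

Because the comparison is driven by `genre_cols`, the same code should work for two, three, or four columns by changing only that vector.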