1

I want to create an out_player_1 and out_player_2 column based on if the player name in Data frame 1 is not in the player name column of Data frame 2. Plus, I want it grouped by team and date in Dataframe 1.

Basically, I want to say which player is out for a certain team on a certain date. Dataframe 1 is a complete dataframe and Dataframe 2 is a complete list of all the players on each team.

dplyr is preferred by the way. Speed is critical and I am more familiar with dplyr than data.table.

Dataframe 1

date      name   team   
1/1/16    Bill   Hawks  
1/1/16    Bob    Hawks
1/1/16    Matt   Hawks
1/1/16    George Hawks
1/2/16    Bill   Hawks  
1/2/16    Bob    Hawks
1/2/16    Matt   Hawks
1/2/16    Josh   Hawks
1/3/16    Josh   Hawks
1/3/16    Josh   Browns
1/3/16    Bill   Browns
1/3/16    Bob    Browns
1/3/16    George Browns

Dataframe 2

player team 
Bill   Hawks
Bob    Hawks
Matt   Hawks
George Hawks
Josh   Hawks
Bill   Browns
Bob    Browns
Matt   Browns
George Browns
Josh   Browns

Desired Dataframe

date      name   team   out_player_1 out_player_2
1/1/16    Bill   Hawks   Josh           NA
1/1/16    Bob    Hawks   Josh           NA
1/1/16    Matt   Hawks   Josh           NA
1/1/16    George Hawks   Josh           NA
1/2/16    Bill   Hawks   George         NA
1/2/16    Bob    Hawks   George         NA
1/2/16    Matt   Hawks   George         NA
1/3/16    Josh   Hawks   George         Josh
1/3/16    Bob    Hawks   George         Josh
1/3/16    Matt   Hawks   George         Josh
1/3/16    Josh   Browns  Matt           NA
1/3/16    Bill   Browns  Matt           NA
1/3/16    Bob    Browns  Matt           NA
1/3/16    George Browns  Matt           NA
user6452857
  • 117
  • 2
  • 9

0 Answers0