I want to create an out_player_1 and out_player_2 column based on if the player name in Data frame 1 is not in the player name column of Data frame 2. Plus, I want it grouped by team and date in Dataframe 1.
Basically, I want to say which player is out for a certain team on a certain date. Dataframe 1 is a complete dataframe and Dataframe 2 is a complete list of all the players on each team.
dplyr
is preferred by the way. Speed is critical and I am more familiar with dplyr
than data.table
.
Dataframe 1
date name team
1/1/16 Bill Hawks
1/1/16 Bob Hawks
1/1/16 Matt Hawks
1/1/16 George Hawks
1/2/16 Bill Hawks
1/2/16 Bob Hawks
1/2/16 Matt Hawks
1/2/16 Josh Hawks
1/3/16 Josh Hawks
1/3/16 Josh Browns
1/3/16 Bill Browns
1/3/16 Bob Browns
1/3/16 George Browns
Dataframe 2
player team
Bill Hawks
Bob Hawks
Matt Hawks
George Hawks
Josh Hawks
Bill Browns
Bob Browns
Matt Browns
George Browns
Josh Browns
Desired Dataframe
date name team out_player_1 out_player_2
1/1/16 Bill Hawks Josh NA
1/1/16 Bob Hawks Josh NA
1/1/16 Matt Hawks Josh NA
1/1/16 George Hawks Josh NA
1/2/16 Bill Hawks George NA
1/2/16 Bob Hawks George NA
1/2/16 Matt Hawks George NA
1/3/16 Josh Hawks George Josh
1/3/16 Bob Hawks George Josh
1/3/16 Matt Hawks George Josh
1/3/16 Josh Browns Matt NA
1/3/16 Bill Browns Matt NA
1/3/16 Bob Browns Matt NA
1/3/16 George Browns Matt NA