I would like to combine two tables based on first name, last name, and year, and create a new binary variable indicating whether the row from table 1 was present in the 2nd table.
First table is a panel data set of some attributes of NBA players during a season:
firstname<-c("Michael","Michael","Michael","Magic","Magic","Magic","Larry","Larry")
lastname<-c("Jordan","Jordan","Jordan","Johnson","Johnson","Johnson","Bird","Bird")
year<-c("1991","1992","1993","1991","1992","1993","1992","1992")
season<-data.frame(firstname,lastname,year)
firstname lastname year
1 Michael Jordan 1991
2 Michael Jordan 1992
3 Michael Jordan 1993
4 Magic Johnson 1991
5 Magic Johnson 1992
6 Magic Johnson 1993
7 Larry Bird 1992
8 Larry Bird 1992
The second data.frame is a panel data set of some attributes of NBA players selected to the All-Star game:
firstname<-c("Michael","Michael","Michael","Magic","Magic","Magic")
lastname<-c("Jordan","Jordan","Jordan","Johnson","Johnson","Johnson")
year<-c("1991","1992","1993","1991","1992","1993")
ALLSTARS<-data.frame(firstname,lastname,year)
firstname lastname year
1 Michael Jordan 1991
2 Michael Jordan 1992
3 Michael Jordan 1993
4 Magic Johnson 1991
5 Magic Johnson 1992
6 Magic Johnson 1993
My desired result looks like:
firstname lastname year allstars
1 Michael Jordan 1991 1
2 Michael Jordan 1992 1
3 Michael Jordan 1993 1
4 Magic Johnson 1991 1
5 Magic Johnson 1992 1
6 Magic Johnson 1993 1
7 Larry Bird 1992 0
8 Larry Bird 1992 0
I tried to use a left join. But not sure whether that makes sense:
test<-join(season, ALLSTARS, by =c("lastname","firstname","year") , type = "left", match = "all")