I have one data set, xPR, with three columns (Player, Team, and xPR), and another, yPR, with three columns (Player, Team, and yPR). I want to combine the two so that each player's xPR and yPR line up in one row, with NA wherever a player has no value in one of the two.

I tried using rbind, but it did not work.

Here's the code:

    xPlayer <- x2017_CBB_Pitch$Player
    xTeam <- x2017_CBB_Pitch$Team
    xER <- x2017_CBB_Pitch$ERA
    xIP <- x2017_CBB_Pitch$IP
    xBB <- x2017_CBB_Pitch$BB
    xSO <- x2017_CBB_Pitch$SO
    xWP <- x2017_CBB_Pitch$WP
    xHBP <- x2017_CBB_Pitch$HP

    xPR.df <- data.frame(xPlayer, xTeam, xPR)

    yPlayer <- y2018_CBB_Pitch$Player
    yTeam <- y2018_CBB_Pitch$Team
    yER <- y2018_CBB_Pitch$ERA
    yIP <- y2018_CBB_Pitch$IP
    yBB <- y2018_CBB_Pitch$BB
    ySO <- y2018_CBB_Pitch$SO
    yWP <- y2018_CBB_Pitch$WP
    yHBP <- y2018_CBB_Pitch$HP

    yPR.df <- data.frame(yPlayer, yTeam, yPR)

    > head(xPR.df)
             xPlayer          xTeam    xPR
    1  Luke Heimlich   Oregon State 33.428
    2 Clarke Schmidt South Carolina 27.388
    3    Beau Sulser      Dartmouth 20.460
    4   Andrew Crane           Troy 27.348
    5 Steven Gingery     Texas Tech 33.108
    6   Miguel Ausua   Oral Roberts 34.096
    > head(yPR.df)
             yPlayer                 yTeam    yPR
    1   Nick Sandlin         Southern Miss 24.528
    2    John Rooney               Hofstra 33.240
    3    Carter Love College of Charleston 30.616
    4  Ryan Campbell      Illinois-Chicago 36.580
    5   Frank German         North Florida 28.708
    6 Andre Pallante             UC Irvine 31.188
ska5513
  • Please type `dput(head(xPR))` and `dput(head(yPR))` and paste the results into your question so that we have some data to work with. Also, please read [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask). – G5W Feb 25 '19 at 00:06
  • Sounds like you need `merge()` (base R) or a join from `dplyr`. Easier to answer with some example data. – neilfws Feb 25 '19 at 00:06
  • @G5W sorry this was a vague question when rereading it, sorry. Just edited it – ska5513 Feb 25 '19 at 05:26

1 Answer

The most basic way is to use an outer merge (a full outer join).

I suppose you want to merge based on both Team and Player.

    master.pitch.df <- merge(xPR.df, yPR.df, by = c("Player", "Team"), all = TRUE)
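Note that `by = c("Player", "Team")` only works if both data frames actually contain columns with those names. With the column names shown in the question (`xPlayer`/`xTeam` and `yPlayer`/`yTeam`), the key columns probably need to be given separately through `by.x` and `by.y`. Here is a minimal sketch, assuming those names. The toy rows are taken from the question's `head()` output, except the "Luke Heimlich" row in `yPR.df`, which is hypothetical and only added so that one player appears in both sets.

```r
# Toy data: rows come from the question's head() output, except the
# "Luke Heimlich" row in yPR.df, which is hypothetical (made-up yPR value).
xPR.df <- data.frame(
  xPlayer = c("Luke Heimlich", "Clarke Schmidt"),
  xTeam   = c("Oregon State", "South Carolina"),
  xPR     = c(33.428, 27.388),
  stringsAsFactors = FALSE
)
yPR.df <- data.frame(
  yPlayer = c("Nick Sandlin", "Luke Heimlich"),
  yTeam   = c("Southern Miss", "Oregon State"),
  yPR     = c(24.528, 30.000),  # 30.000 is a made-up value for illustration
  stringsAsFactors = FALSE
)

# Full outer join: by.x / by.y name the key columns in each data frame,
# and all = TRUE keeps unmatched rows from both sides, filling gaps with NA.
master.pitch.df <- merge(
  xPR.df, yPR.df,
  by.x = c("xPlayer", "xTeam"),
  by.y = c("yPlayer", "yTeam"),
  all  = TRUE
)

print(master.pitch.df)
```

In the result, the key columns keep the names from the first data frame (`xPlayer`, `xTeam`). Setting `all = FALSE` instead gives an inner join, which keeps only players present in both data frames, so anyone who appears in just one season is dropped.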
caden Hong
  • I did this, however received this error 'Error in fix.by(by.y, y) : 'by' must specify uniquely valid columns.' I changed all = FALSE, but then it left out a lot of players – ska5513 Feb 25 '19 at 05:44