
I have the following data

Name <- c("Kobe Bryant", "Kobe Bryant", "Kobe Bryant", 
          "Kobe Bryant", "Kobe Bryant", "Kobe Bryant", 
          "Lebron James", "Lebron James", "Lebron James", 
          "Lebron James", "Kevin Durant", "Kevin Durant",
          "Kevin Durant", "Kevin Durant", "Kevin Durant")

Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19", "2015-05-21", 
           "2015-05-24", "2015-05-28", "2015-05-14", "2015-05-20", 
           "2015-05-21", "2015-05-23", "2015-05-22", "2015-05-24", 
           "2015-05-28", "2015-06-02", "2015-06-04"))

df <- data.frame(Name, Date)

Desired_output <- c(1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0)

df2 <- data.frame(Name, Date, Desired_output)

I want to create a new column that identifies back-to-back (b2b) games, i.e. games played on two consecutive days by the same player.

The column should be 1 if the game is part of a b2b and 0 if not.

Both the first day and the second day of the b2b should have a 1.

lmo
Sburg13
  • These are not valid R vectors. See [how to create a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Be sure to include the desired output of your sample input. Describe what you've tried so far and exactly what doesn't work. – MrFlick Aug 07 '15 at 18:31
  • Welcome to SO. Please do fix your code: every element of `Name` needs to be quoted, like `"Kobe.."`, since they are character strings. – SabDeM Aug 07 '15 at 18:31
  • The vectors are of different length -- `data.frame(Name, Date)` gives an error. – user295691 Aug 07 '15 at 18:46
  • You now have a length-15 `Name` vector and a length-13 `Date` vector. – IRTFM Aug 07 '15 at 18:49

1 Answer


This is a split-apply-combine problem (since you need to handle each player separately), which you can do in base R (by(), aggregate(), ...) or with a variety of packages (plyr, dplyr, data.table). Here's a plyr solution.

Name <- rep(c("Kobe Bryant", "Lebron James", "Kevin Durant"),
            c(6,4,5))
Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19",
  "2015-05-21","2015-05-12", "2015-05-28", "2015-05-14",
  "2015-05-16","2015-05-17", "2015-05-21", "2015-05-22",
  "2015-05-24","2015-05-28","2015-06-02","2015-06-10"))
dd <- data.frame(Name,Date)
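b2b <- function(x,ind=FALSE) {
    x0 <- head(x,-1)  ## all but last
    x1 <- tail(x,-1)  ## all but first
    comp <- abs(x1-x0)==1  ## TRUE where adjacent games are one day apart
    res <- c(comp,FALSE) | c(FALSE,comp)  ## flag both games of each pair
    if (ind) {
        ## a flagged game that directly follows another flagged game gets a 2
        w <- res==1 & c(0,res[-length(res)])==1
        res[w] <- 2  ## this coerces res from logical to numeric
    }
    return(res)
}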
library("plyr")
ddply(dd,"Name",
      transform,
         b2b=as.numeric(b2b(Date)),
         b2b_ind=as.numeric(b2b(Date,ind=TRUE)))

My code has automatically reorganized the players by alphabetical order (because players got turned into a factor with levels in alphabetical order, and ddply returns the data in this rearranged order). If that's important you can make sure the factors are ordered the way you want before beginning.
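
If the original row order matters, a base R sketch along the following lines may be enough. It uses `ave()` rather than `ddply()`, so rows stay where they are. `b2b_flag` is a hypothetical helper name (it is not part of the code above); it applies the same one-day-gap test to dates that have been converted to plain numbers. Treat it as one possible approach under those assumptions rather than a drop-in replacement.

```r
# Same example data as in the answer above
Name <- rep(c("Kobe Bryant", "Lebron James", "Kevin Durant"),
            c(6, 4, 5))
Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19",
  "2015-05-21", "2015-05-12", "2015-05-28", "2015-05-14",
  "2015-05-16", "2015-05-17", "2015-05-21", "2015-05-22",
  "2015-05-24", "2015-05-28", "2015-06-02", "2015-06-10"))
dd <- data.frame(Name, Date)

# Hypothetical helper: d is a numeric vector of days for ONE player.
# Returns 1 for games one day apart from the previous or next game, else 0.
b2b_flag <- function(d) {
    gap1 <- abs(diff(d)) == 1
    as.numeric(c(gap1, FALSE) | c(FALSE, gap1))
}

# ave() applies b2b_flag within each player and writes the results
# back in the original row positions (no reordering of players).
dd$b2b <- ave(as.numeric(dd$Date), dd$Name, FUN = b2b_flag)
dd
```

With the example data this flags Kobe Bryant's first two games and Lebron James's second and third games, and leaves every other row at 0.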

Ben Bolker
  • I tried applying the formula and it works perfectly until the last line of code, when I get this message: – Sburg13 Aug 08 '15 at 21:57
  • `ddply(dd,Name,transform,b2b=b2b(Date))` gives `Error in parse(text = x) : <text>:1:6: unexpected symbol 1: Kobe Bryant ^` – Sburg13 Aug 08 '15 at 21:57
  • I am not sure I fully understand the last part of the formula. The output should be a 1 if it is a b2b game (e.g. "2015-05-15" and "2015-05-16"), and I should get a 1 for both days. If it is not a b2b game then it should return a 0. – Sburg13 Aug 08 '15 at 22:20
  • Like this: `Name <- c("Kobe Bryant", "Kobe Bryant", "Kobe Bryant", "Kobe Bryant", "Kobe Bryant", "Kobe Bryant", "Lebron James", "Lebron James", "Lebron James", "Lebron James", "Kevin Durant", "Kevin Durant", "Kevin Durant", "Kevin Durant", "Kevin Durant")` `Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19", "2015-05-21", "2015-05-12", "2015-05-28", "2015-05-14", "2015-05-16", "2015-05-17", "2015-05-21", "2015-05-22", "2015-05-24", "2015-05-28", "2015-06-02", "2015-06-10"))` – Sburg13 Aug 08 '15 at 22:48
  • `output <- c(1,1,0,0,0,0,0,1,1,0,0,0,0,0,0)` – Sburg13 Aug 08 '15 at 22:48
  • Thanks Ben. On my full data set the last edit works perfectly except when two b2bs come right after each other, for example 2014-05-21 2014-05-22 2014-05-26 2014-05-27. In that case I get 1 2 2 2 in the output. – Sburg13 Aug 12 '15 at 18:30
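
The 1 2 2 2 result comes from marking every flagged game that follows another flagged game, even when the two belong to different pairs. A minimal sketch of one way around it is below. It assumes that the wanted output for that example is 1 2 1 2 (the comment does not say so explicitly) and that each player's dates are sorted in ascending order. `b2b_ind` is a hypothetical name, not a function from the answer.

```r
# Hypothetical helper: 1 = first day of a b2b, 2 = second day, 0 = neither.
# Assumes d holds ONE player's game dates, sorted ascending.
b2b_ind <- function(d) {
    d <- as.numeric(d)
    gap1 <- diff(d) == 1          # TRUE where the next game is the following day
    res <- numeric(length(d))
    res[c(gap1, FALSE)] <- 1      # first game of each one-day pair
    res[c(FALSE, gap1)] <- 2      # second game of each pair (overrides a 1)
    res
}

b2b_ind(as.Date(c("2014-05-21", "2014-05-22", "2014-05-26", "2014-05-27")))
```

Note the design choice in the last assignment: with three games on three consecutive days, the middle game is both a second day and a first day, and this sketch labels it 2. It can be applied per player in the same way as `b2b()`, for example inside the `ddply()` call.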