3

This is what my dataframe looks like. The two rightmost columns are my desired columns.These two columns check the condition whether in the last 21 days there is an "Email" ActivityType and whether in the last 21 days there is a "Webinar" ActivityType.

 Name      ActivityType     ActivityDate  Email(last21days) Webinar(last21day)**             
John       Email            1/1/2014        TRUE                  NA   
John       Webinar          1/5/2014        TRUE                 TRUE
John       Sale             1/20/2014       TRUE                 TRUE
John       Webinar          3/25/2014       NA                   TRUE
John       Sale             4/1/2014        NA                   TRUE
John       Sale             7/1/2014        NA                   NA
Tom        Email            1/1/2015        TRUE                   NA   
Tom        Webinar          1/5/2015        TRUE                 TRUE
Tom        Sale             1/20/2015      TRUE                 TRUE
Tom        Webinar          3/25/2015       NA                   TRUE
Tom        Sale              4/1/2015        NA                   TRUE
Tom        Sale              7/1/2015       NA                   NA

Based on the help here: Extracting event types from last 21 day window I tried:

df$ActivityDate <- as.Date(df$ActivityDate)
library(data.table)
setDT(df)
setkey(df, Name,ActivityDate)
Elsetemp <- df[, .(Name, ActivityDate, ActivityType)]
df[Elsetemp, `:=`(Email21 = as.logical(which(i.ActivityType == "Email")), 
                        Webinar21 = as.logical(which(i.ActivityType == "Webinar"))), 
         roll = -21, by = .EACHI]

to no avail as I only get TRUEs for rows with "Sale". For example, second row where ActivityType = Webinar, both Email21 & Webinar21 should say TRUE. When I am defining last 21 days, I am trying to include that very day when the event happened too.

Community
  • 1
  • 1
gibbz00
  • 1,947
  • 1
  • 19
  • 31
  • 1
    You should describe your desired columns not just show them. David won't get your ping unless he's already part of the Q&A, e.g., by having commented here. Also, dates are messy things, so maybe you could put the structure of your data.frame not just a picture of it. CathG did that for you in his answer on your last question. – Frank Jun 10 '15 at 16:02

2 Answers2

4

How about this?

Using rolling joins from data.table:

require(data.table)
dt[, ActivityDate := as.Date(ActivityDate, format="%m/%d/%Y")]
setkey(dt, Name, ActivityDate)

roll_index <- function(x, types, roll=21) {
    lapply(types, function(type) {
         idx = x[ActivityType == type][x, roll=roll, which=TRUE]
         as.logical(idx)
    })
}
dt[, c("Email_21", "Webinar_21") := roll_index(dt, c("Email", "Webinar"))]

#     Name ActivityType ActivityDate Email_21 Webinar_21
#  1: John        Email   2014-01-01     TRUE         NA
#  2: John      Webinar   2014-01-05     TRUE       TRUE
#  3: John         Sale   2014-01-20     TRUE       TRUE
#  4: John      Webinar   2014-03-25       NA       TRUE
#  5: John         Sale   2014-04-01       NA       TRUE
#  6: John         Sale   2014-07-01       NA         NA
#  7:  Tom        Email   2015-01-01     TRUE         NA
#  8:  Tom      Webinar   2015-01-05     TRUE       TRUE
#  9:  Tom         Sale   2015-01-20     TRUE       TRUE
# 10:  Tom      Webinar   2015-03-25       NA       TRUE
# 11:  Tom         Sale   2015-04-01       NA       TRUE
# 12:  Tom         Sale   2015-07-01       NA         NA
Arun
  • 116,683
  • 26
  • 284
  • 387
  • Thank you so much. This works perfectly. If you kindly add some comments to your code, that would be awesome as I have no idea what is going on when you are defining roll_index. – gibbz00 Jun 10 '15 at 17:01
  • I'll leave that as an exercise for you. Have a look at `?data.table` and work through the examples. Also go through the [vignettes](https://github.com/Rdatatable/data.table/wiki/Getting-started). That'll help you in your future tasks. – Arun Jun 10 '15 at 17:32
  • Thank you. What if I want to look 21 days forward as opposed to 21 day prior(backward) as in this example? Do I use roll = -21? – gibbz00 Jun 10 '15 at 17:58
  • To look for 21 days forward, `roll=-21`, yes (check `roll` argument in `?data.table`). I don't know what you expect, and what you got. In any case, it's a new question. I once again suggest you take your time to learn a package if you are going to use it, instead of asking solution to every variation of your tasks. – Arun Jun 10 '15 at 21:51
  • Thanks a lot. I already signed up for the data camp course on data.table which is co presented by you. – gibbz00 Jun 10 '15 at 21:58
0

A base R solution:

#New type of sequence function that can accept vectors
seq2 <- function(v1) {
  res <- list()
  for(i in seq_along(v1)) {
    res[[i]] <- seq(v1[i], v1[i]+21, by='day')
  }
  as.Date(unlist(res), origin='1970-01-01')
}

df <- df[ ,1:3]
df$ActivityDate <- as.Date(df$ActivityDate, format='%m/%d/%Y')

#Email column
emailed <- df[df$ActivityType == 'Email', 'ActivityDate']
df$Email <- df$ActivityDate %in% seq2(emailed)

#Webinar column
webbed <- df[df$ActivityType == 'Webinar', 'ActivityDate']
df$Webinar <- df$ActivityDate %in% seq2(webbed)

First we subset the first three columns without the example output. Then convert the date factors with as.Date. The vector emailed looks for ActivityType with the Email string. The function seq2 was created to look for the date and 21 days after. It creates a sequence that can be checked against.

df
#    Name ActivityType ActivityDate Email Webinar
# 1  John        Email   2014-01-01  TRUE   FALSE
# 2  John      Webinar   2014-01-05  TRUE    TRUE
# 3  John         Sale   2014-01-20  TRUE    TRUE
# 4  John      Webinar   2014-03-25 FALSE    TRUE
# 5  John         Sale   2014-04-01 FALSE    TRUE
# 6  John         Sale   2014-07-01 FALSE   FALSE
# 7   Tom        Email   2015-01-01  TRUE   FALSE
# 8   Tom      Webinar   2015-01-05  TRUE    TRUE
# 9   Tom         Sale   2015-01-20  TRUE    TRUE
# 10  Tom      Webinar   2015-03-25 FALSE    TRUE
# 11  Tom         Sale   2015-04-01 FALSE    TRUE
# 12  Tom         Sale   2015-07-01 FALSE   FALSE

Data

df <- read.table(text=' Name      ActivityType     ActivityDate  Email(last21days) Webinar(last21day)**             
John       Email            1/1/2014        TRUE                  NA   
John       Webinar          1/5/2014        TRUE                 TRUE
John       Sale             1/20/2014       TRUE                 TRUE
John       Webinar          3/25/2014       NA                   TRUE
John       Sale             4/1/2014        NA                   TRUE
John       Sale             7/1/2014        NA                   NA
Tom        Email            1/1/2015        TRUE                   NA   
Tom        Webinar          1/5/2015        TRUE                 TRUE
Tom        Sale             1/20/2015      TRUE                 TRUE
Tom        Webinar          3/25/2015       NA                   TRUE
Tom        Sale              4/1/2015        NA                   TRUE
Tom        Sale              7/1/2015       NA                   NA', header=T)
Pierre L
  • 28,203
  • 6
  • 47
  • 69
  • I need to adjust the combination logical. As of now it matches all Dates less than or equal to the 21 days. I need to make it a range of 21. Will continue to solve. – Pierre L Jun 10 '15 at 16:12
  • I think you want to add 21 to the event date to find the relevant range of dates. – Frank Jun 10 '15 at 16:13