I have a dataframe like so:

sport     contract_start  contract_end  visits  spends  purchases
basket    2013-10-01      2014-10-01    12      14      23
basket    2014-02-12      2015-03-03    23      11      7
football  2015-02-12      2016-03-03    23      11      7
basket    2016-07-17      2013-09-09    12      7       13

I would like to conditionally replace the values in columns 4:6 with NA, based on the variables "sport" and "contract_start". For instance:

i1 <- which(df$sport =="basket" & df$contract_start>="2014-01-01")

indexes all the rows in which my conditions are met. Is there an easy piece of code to add to the above that will replace df[4:6] with NA for those rows? I would like to end up with something like this:

sport     contract_start  contract_end  visits  spends  purchases
basket    2013-10-01      2014-10-01    12      14      23
basket    2014-02-12      2015-03-03    NA      NA      NA
football  2015-02-12      2016-03-03    23      11      7
basket    2016-07-17      2013-09-09    NA      NA      NA

Thanks! A.

2 Answers

You can simply select the rows and columns whose values you want to replace, and assign NA to them:

df[df$sport =="basket" & df$contract_start>="2014-01-01", 4:6] <- NA

df
#      sport contract_start contract_end visits spends purchases
# 1   basket     2013-10-01   2014-10-01     12     14        23
# 2   basket     2014-02-12   2015-03-03     NA     NA        NA
# 3 football     2015-02-12   2016-03-03     23     11         7
# 4   basket     2016-07-17   2013-09-09     NA     NA        NA
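
As a self-contained sketch of the same idea, the snippet below rebuilds the example data and reuses the `i1` index from the question. The `Date` column types and the `target_cols` name are assumptions made for illustration; they are not part of the original post.

```r
# Rebuild the example data from the question (dates stored as Date, an assumption)
df <- data.frame(
  sport          = c("basket", "basket", "football", "basket"),
  contract_start = as.Date(c("2013-10-01", "2014-02-12", "2015-02-12", "2016-07-17")),
  contract_end   = as.Date(c("2014-10-01", "2015-03-03", "2016-03-03", "2013-09-09")),
  visits         = c(12, 23, 23, 12),
  spends         = c(14, 11, 11, 7),
  purchases      = c(23, 7, 7, 13)
)

# Row index from the question: baskets whose contract starts in 2014 or later
# (rows 2 and 4 here); which() also drops rows where the condition is NA
i1 <- which(df$sport == "basket" & df$contract_start >= as.Date("2014-01-01"))

# Hypothetical helper variable: naming the columns instead of using 4:6
# keeps the code working if the column order changes
target_cols <- c("visits", "spends", "purchases")

# Overwrite only the selected rows and columns
df[i1, target_cols] <- NA
```

Using `which()` rather than the raw logical vector can matter when the condition itself contains missing values, since a data frame assignment does not accept `NA` in the row subscript.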
Psidom
library("data.table")
setDT(df)
df[i = sport == "basket" & contract_start >= "2014-01-01", 
   j = c("visits", "spends", "purchases") := NA]

df
#       sport contract_start contract_end visits spends purchases
# 1:   basket     2013-10-01   2014-10-01     12     14        23
# 2:   basket     2014-02-12   2015-03-03     NA     NA        NA
# 3: football     2015-02-12   2016-03-03     23     11         7
# 4:   basket     2016-07-17   2013-09-09     NA     NA        NA

Variant of the above code using the my_cols variable:

my_cols <- names(df)[4:6]
df[i = sport == "basket" & contract_start >= "2014-01-01", 
   j = (my_cols) := .(NA)]
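
One point worth keeping in mind with this approach is that `:=` modifies the table by reference. The sketch below, which assumes the data.table package is installed, rebuilds the example data and works on a `copy()` so the original stays intact. The names `original` and `dt` and the `Date` column types are illustrative assumptions, not part of the answer above.

```r
library(data.table)

# Rebuild the example data from the question (dates stored as Date, an assumption)
original <- data.table(
  sport          = c("basket", "basket", "football", "basket"),
  contract_start = as.Date(c("2013-10-01", "2014-02-12", "2015-02-12", "2016-07-17")),
  contract_end   = as.Date(c("2014-10-01", "2015-03-03", "2016-03-03", "2013-09-09")),
  visits         = c(12, 23, 23, 12),
  spends         = c(14, 11, 11, 7),
  purchases      = c(23, 7, 7, 13)
)

# := updates in place, so take an explicit copy to leave `original` untouched
dt <- copy(original)

# Columns to blank out, referred to by name
my_cols <- c("visits", "spends", "purchases")

# Set the selected columns to NA for baskets starting in 2014 or later
dt[sport == "basket" & contract_start >= as.Date("2014-01-01"),
   (my_cols) := NA]
```

If the original table is no longer needed, the `copy()` call can be dropped and the update applied directly.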
Sathish