0

I have a dataframe with many ID's, each with two different columns: one defining an important cut-off date and another being many different dates. I need to add a column defining one months before the cut-off date as "-1", two months before the-cut off date as "-2", three months before the cut-off date as "-3" and also one months after the cut-off date as "1", two months after the cut-off date as "2" and so on until 6 months after the cut-off date. My Problem is, that all cut-off dates are different for each ID and I only know how to name specific date ranges with an ifelse() function.

Here a small set of my dataframe to understand the structure better:

df:

i.d.    registered_at         Day    
x        2013-12-20        2013-11-19       
x        2013-12-20        2014-02-20        
x        2013-12-20        2014-05-11        
y        2013-10-01        2013-08-05       
y        2013-10-01        2013-10-01       
z        2014-01-15        2013-10-25      

So for i.d. x for example: months "-1" is defined as 2013-11-20<= "-1" <2013-12-20 ; months "-2" is defined as 2013-10-20<= "-2" <2013-11-20 ; months "-3" is defined as 2013-09-20<= "-3" <2013-10-20 ; months "1" is defined as 2013-12-20<= "1" <2014-01-20 and so on up until months "6"

My issue is that I can not define these labels by giving specific dates as i.d. "x" for example will had "-1" months defined in a different months on a different day due to the registered at date being completely different to i.d. "y".

I have tried ifelse() function on this but could not figure this out.

My final data frame should look like this:

Newdf:

i.d.    registered_at         Day         MonthsNo
x        2013-12-20        2013-11-19       -2
x        2013-12-20        2014-02-20        3
x        2013-12-20        2014-05-11        5
y        2013-10-01        2013-08-05       -2
y        2013-10-01        2013-10-01        1
z        2014-01-15        2013-10-25       -3

Another issue that needs to be considered is, what if one i.d. was registered on the 2013-03-31, there is no 2013-02-31.. only 2013-02-28, how to make sure the code includes such a situation?

I hope someone can help me solve this :)

Fee
  • 89
  • 1
  • 9
  • 1
    how are you defining a month? 30 days? In a different month of the year? for 30 days, use `difftime`. for different month of the year, use (e.g.) `data.table::month` and subtract. – MichaelChirico Nov 11 '17 at 14:11
  • 1
    How come there's a `3` in the second row? Shouldn't it be `2`? And a `1` in the 5th instead of a zero? – Rui Barradas Nov 11 '17 at 14:34
  • Your logic that computes the month difference is not clear. Check here if you want the difference in calendar months: https://stackoverflow.com/questions/1995933/number-of-months-between-two-dates/1996404. You should be able to adjust some of the functions there to get what you want. – AntoniosK Nov 11 '17 at 17:19
  • @MichaelChirico I defined a timeframe to be for example 2013-03-10 to 2013-04-10. Thus, it is not exactly months but rather the same date in a different months. – Fee Nov 11 '17 at 19:37
  • @RuiBarradas No, it should be 2 as I had defined my time frame to be defined by date1 <= x < date2 . Thus, 2014-02-20 already belongs in the next timeframe. – Fee Nov 11 '17 at 19:37
  • But that's exactly what I am saying. You have `x 2013-12-20 2014-02-20 3` and `"it should be 2"`? And in the 5th row? Should it be a `1` or a `0` since the values of `registered_at` and `Day` are the same `2013-10-01`? – Rui Barradas Nov 11 '17 at 20:30

0 Answers0