I have a dataframe with many IDs, each with two date columns: one holding an important cut-off date (registered_at) and another holding many different dates (Day). I need to add a column that labels the first month before the cut-off date as "-1", the second month before it as "-2" and the third month before it as "-3", and likewise the first month after the cut-off date as "1", the second as "2", and so on up to 6 months after the cut-off date.
My problem is that the cut-off date is different for each ID, and I only know how to label specific, fixed date ranges with an ifelse() function.
Here is a small sample of my dataframe to show the structure:
df:
i.d. registered_at Day
x 2013-12-20 2013-11-19
x 2013-12-20 2014-02-20
x 2013-12-20 2014-05-11
y 2013-10-01 2013-08-05
y 2013-10-01 2013-10-01
z 2014-01-15 2013-10-25
So for i.d. x, for example: month "-1" is defined as 2013-11-20 <= Day < 2013-12-20; month "-2" as 2013-10-20 <= Day < 2013-11-20; month "-3" as 2013-09-20 <= Day < 2013-10-20; month "1" as 2013-12-20 <= Day < 2014-01-20; and so on up to month "6".
My issue is that I cannot define these labels with fixed dates: for i.d. "x", month "-1" falls in a different calendar month and on different days than for i.d. "y", because their registered_at dates are completely different.
I have tried the ifelse() function for this but could not figure it out.
My final data frame should look like this:
Newdf:
i.d. registered_at Day MonthsNo
x 2013-12-20 2013-11-19 -2
x 2013-12-20 2014-02-20 3
x 2013-12-20 2014-05-11 5
y 2013-10-01 2013-08-05 -2
y 2013-10-01 2013-10-01 1
z 2014-01-15 2013-10-25 -3
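To make the rule concrete, here is a minimal base R sketch of the labelling as I understand it. The helper names days_in_month() and month_label() are made up for illustration, and two things are assumptions on my part: a registration day that does not exist in a given month is clamped to that month's last day, and anything outside -3..6 is set to NA.

```r
# Sample data from the table above
df <- data.frame(
  i.d. = c("x", "x", "x", "y", "y", "z"),
  registered_at = as.Date(c("2013-12-20", "2013-12-20", "2013-12-20",
                            "2013-10-01", "2013-10-01", "2014-01-15")),
  Day = as.Date(c("2013-11-19", "2014-02-20", "2014-05-11",
                  "2013-08-05", "2013-10-01", "2013-10-25")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of days in the given month (month is 1..12)
days_in_month <- function(year, month) {
  first <- as.Date(sprintf("%04d-%02d-01", year, month))
  next_year  <- year + (month == 12L)
  next_month <- ifelse(month == 12L, 1L, month + 1L)
  next_first <- as.Date(sprintf("%04d-%02d-01", next_year, next_month))
  as.integer(next_first - first)
}

# Hypothetical helper: month label of `day` relative to `registered_at`
month_label <- function(registered_at, day) {
  ry <- as.integer(format(registered_at, "%Y"))
  rm <- as.integer(format(registered_at, "%m"))
  rd <- as.integer(format(registered_at, "%d"))
  dy <- as.integer(format(day, "%Y"))
  dm <- as.integer(format(day, "%m"))
  dd <- as.integer(format(day, "%d"))

  # Calendar months between the two dates, ignoring the day of month
  m <- (dy - ry) * 12L + (dm - rm)

  # Assumption: if the registration day does not exist in Day's month
  # (e.g. the 31st in February), use that month's last day instead
  anchor <- pmin(rd, days_in_month(dy, dm))

  # k = number of whole months from registered_at to day (floored)
  k <- ifelse(dd >= anchor, m, m - 1L)

  # There is no month "0": k = 0 becomes 1, k = 1 becomes 2, negatives stay
  label <- ifelse(k >= 0L, k + 1L, k)

  # Assumption: dates outside the -3..6 window get NA
  label[label < -3L | label > 6L] <- NA
  label
}

df$MonthsNo <- month_label(df$registered_at, df$Day)
print(df)
# MonthsNo for the six rows: -2 3 5 -2 1 -3
```

This avoids hard-coding any dates, because every comparison is made relative to each row's own registered_at.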
Another issue that needs to be considered: what if an i.d. was registered on 2013-03-31? There is no 2013-02-31, only 2013-02-28. How can I make sure the code handles such a situation?
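To illustrate what I mean, this is a small base R sketch of the month boundaries I would expect for such a registration date. The function add_months() is a made-up helper, and clamping to the last day of a shorter month is only my assumption about the desired behaviour.

```r
# Hypothetical helper: shift `date` by `n` calendar months, clamping the
# day of month to the last day of the target month when it does not exist
add_months <- function(date, n) {
  y <- as.integer(format(date, "%Y"))
  m <- as.integer(format(date, "%m"))
  d <- as.integer(format(date, "%d"))

  # Count months on one continuous scale so year changes need no special case
  total <- y * 12L + (m - 1L) + as.integer(n)
  first <- as.Date(sprintf("%04d-%02d-01", total %/% 12L, total %% 12L + 1L))

  # First day of the following month gives the length of the target month
  nxt <- total + 1L
  next_first <- as.Date(sprintf("%04d-%02d-01", nxt %/% 12L, nxt %% 12L + 1L))
  last_day <- as.integer(next_first - first)

  first + (pmin(d, last_day) - 1L)
}

# Interval boundaries for a registration on 2013-03-31, offsets -3..6
boundaries <- add_months(as.Date("2013-03-31"), -3:6)
print(boundaries)
# "2012-12-31" "2013-01-31" "2013-02-28" "2013-03-31" "2013-04-30"
# "2013-05-31" "2013-06-30" "2013-07-31" "2013-08-31" "2013-09-30"
```

Under this assumption month "-1" for that i.d. would be 2013-02-28 <= Day < 2013-03-31, and month "1" would be 2013-03-31 <= Day < 2013-04-30.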
I hope someone can help me solve this :)