I'm looking for a way, to determine "join", "churn", and "rejoin" dates for clients in a data frame (clients as row labels, month/year as column labels, earnings as values) where:
- The first month with a non-zero earnings value for a client = join date
- If there are 13 consecutive months with no earnings (zero value), then the 13th month is considered a "churn" month.
- If, after a "churn" month has occurred, there is a non-zero value for that client, then the corresponding month in that column is the "rejoin" month.
- If there is another 13 months with no earnings AFTER a rejoin month, then the last month in that block of 13 zero value columns is another churn month
I could probably hack this together with For loops (see pseudo-code below), but I'm wondering if there is a more elegant way to do it with apply functions or with the dplyr package.
So, data frame looks like this:
> clients
Jan08 Feb08 Mar08 Apr08 May08 ...
client 1 1 0 0 0 0 ...
client 2 0 -6 6 -6 6 ...
client 3 28 4 108 93 3.85 ...
............
and I'm looking to run logic like this:
For each client {
For each month {
If the value this month == 0 {
If the previous 12 months values all == 0* {
If there is a non-zero value exactly 13 months before {
then mark the current month as a churn month
}
}
}
Else {
If this is the first non zero value in the row {
then mark the current month as the join month
}
Else {
If this is the first non-zero value in 13 or more months {
then mark this as a rejoin month
}
}
}
}
*NOTE: As with client 2, I could have negative values, so I can't just sum up values to determine if proceeding values are all zero.
Any direction on this would be much appreciated.