Let's suppose I have the following data frame:

one =   c(0, 2, 1, 3, 0, 5)
two =   c(0, 2, 1, 3, 0, 0)
three = c(1, 0, 8, 0, 5, 0)
four =  c(3, 0, 0, 7, 9, 0)

df <- data.frame(one, two, three, four)
df

The criterion for each row is that a column contains a zero and is followed only by zeros up to the last column.

Row 1: no column fulfils the criterion.
Row 2: column 3, because it contains a zero and is followed by zeros until the last column (column 4).
Row 3: column 4.
Row 4: same as row 1.
Row 5: same as rows 1 and 4.
Row 6: column 2, because it contains the first zero and is followed only by zeros until the last column.

I want the result to be an additional column in my dataset containing the month of churn:

churn = c('', 'three', 'four', '', '', 'two')

My idea is to detect the column that contains a zero and check whether the sum of the following columns is zero.

1 Answer

You can take cumulative sums across each row from the right and use max.col() to find the first zero. Binding a column of zeros to the end puts the result out of range for rows that don't end in a zero, so those rows index to NA.

idx <- seq_along(df)[max.col(do.call(cbind, c(Reduce(`+`, df, accumulate = TRUE, right = TRUE), 0)) == 0, ties.method = "first")]

Which gives:

[1] NA  3  4 NA NA  2
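
To see why this works, the one-liner can be unpacked into separate steps. This is only a sketch: it assumes the `df` from the question and non-negative values (so that a suffix sum of zero means every remaining column is zero), and the intermediate names `suffix`, `is_zero` and `first_zero` are introduced here purely for illustration.

```r
one   <- c(0, 2, 1, 3, 0, 5)
two   <- c(0, 2, 1, 3, 0, 0)
three <- c(1, 0, 8, 0, 5, 0)
four  <- c(3, 0, 0, 7, 9, 0)
df <- data.frame(one, two, three, four)

# Suffix sums: element k holds, for every row, the sum of columns k..ncol(df).
suffix <- Reduce(`+`, df, accumulate = TRUE, right = TRUE)

# Bind the suffix sums column-wise and append a sentinel column of zeros,
# so every row has at least one TRUE after the comparison.
is_zero <- do.call(cbind, c(suffix, 0)) == 0

# Position of the first TRUE in each row; ncol(df) + 1 means "sentinel only".
first_zero <- max.col(is_zero, ties.method = "first")

# Indexing past the last column yields NA, marking rows with no trailing zeros.
idx <- seq_along(df)[first_zero]
```

Note that a row consisting entirely of zeros would return column 1 under this logic, since the very first suffix sum is already zero.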

Index against the column names:

transform(df, churn = names(df)[idx])

  one two three four churn
1   0   0     1    3  <NA>
2   2   2     0    0 three
3   1   1     8    0  four
4   3   3     0    7  <NA>
5   0   0     5    9  <NA>
6   5   0     0    0   two
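
The question asks for an empty string rather than NA in rows without churn. One possible way to get that, sketched here with `idx` hard-coded to the values shown earlier, is to replace the missing names before adding the column. The name `labels` is made up for this sketch.

```r
df <- data.frame(
  one   = c(0, 2, 1, 3, 0, 5),
  two   = c(0, 2, 1, 3, 0, 0),
  three = c(1, 0, 8, 0, 5, 0),
  four  = c(3, 0, 0, 7, 9, 0)
)
idx <- c(NA, 3, 4, NA, NA, 2)  # values taken from the max.col() step

labels <- names(df)[idx]       # NA where no churn column exists
labels[is.na(labels)] <- ""    # match the '' placeholders in the question
df$churn <- labels
```

Whether NA or an empty string is preferable depends on what happens downstream; NA is usually easier to filter on with is.na().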
Ritchie Sacramento