I have the data set below in R. I would like to see each customer's behavior from their first banana purchase onward. In other words, grouping by customer_id, once a customer has bought a banana, return that banana purchase and all of that customer's subsequent purchases, keeping customer_id, account_seq and product.

#   customer_id account_seq product
#1       1          1       apple
#2       1          2       banana
#3       2          1       apple
#4       2          3       banana
#5       2          4       orange
#6       3          1       banana
#7       3          3       apple

The outcome should be like:

#   customer_id account_seq product
#1       1          2       banana
#2       2          3       banana
#3       2          4       orange 
#4       3          1       banana
#5       3          3       apple

I have spent a lot of time trying to figure this out and would really appreciate any help.


Ivan

1 Answer

library(dplyr)

data <- data.frame(customer_id = c(1L, 1L, 2L, 2L, 2L, 3L, 3L), account_seq = c(1L, 
2L, 1L, 3L, 4L, 1L, 3L), product = c("apple", "banana", "apple", 
"banana", "orange", "banana", "apple"))


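# Within each customer, cumsum() counts the bananas seen so far, so it is 0
# before the first banana and at least 1 from that row onward.
# This relies on the rows already being in account_seq order.
data %>% group_by(customer_id) %>% 
  filter(cumsum(product == "banana") > 0)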
#> # A tibble: 5 × 3
#> # Groups:   customer_id [3]
#>   customer_id account_seq product
#>         <int>       <int> <chr>  
#> 1           1           2 banana 
#> 2           2           3 banana 
#> 3           2           4 orange 
#> 4           3           1 banana 
#> 5           3           3 apple
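
The same running-count idea can also be written without dplyr. The following is only a minimal base R sketch, not part of the original answer: it assumes the rows may arrive unsorted, so it orders them by customer_id and account_seq first, and the names `is_banana`, `bananas_so_far` and `result` are made up for illustration.

```r
# Same sample data as in the question
data <- data.frame(
  customer_id = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
  account_seq = c(1L, 2L, 1L, 3L, 4L, 1L, 3L),
  product = c("apple", "banana", "apple", "banana", "orange", "banana", "apple")
)

# Assumption: input order is not guaranteed, so sort before taking a running count
data <- data[order(data$customer_id, data$account_seq), ]

# 1 where the row is a banana purchase, 0 otherwise
is_banana <- as.integer(data$product == "banana")

# Running count of bananas within each customer (hypothetical helper column)
bananas_so_far <- ave(is_banana, data$customer_id, FUN = cumsum)

# Keep rows from each customer's first banana onward
result <- data[bananas_so_far > 0, ]
rownames(result) <- NULL
print(result)
```

Customers who never bought a banana have a running count of 0 on every row, so they drop out of `result` entirely.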
Ric