-1

I have a transaction table with the following columns:

TransactionId  UserId  YearMonth  Group

What I am trying to accomplish is to get unique users across different months. Eg:

YearMonth Group UsersCountMonth1 UsersCountMonth2 UsersCountMonth3
201301    A     1000             900              800
201301    B     1200             940              700
201302    B     1300             1140             900
201303    A     12e0             970              706

Basically Month1 and Month2 are the incremental months based on YearMonth value for the record.

I am using this result to perform retention analysis.

d_luffy_de
  • 967
  • 1
  • 9
  • 24
  • 1
    Unclear. Please include reproducible examples along with expected output and any code you have tried and failed. – Sotos Nov 22 '16 at 07:57

1 Answers1

1

I remember you were looking for a possibility to analyze subscription cohorts, yesterday. So I guess you can do

library(tidyverse)
set.seed(1)
n <- 100
df <- data.frame(
  user = sample(1:20, n, T), 
  transDate = sample(seq(as.Date("2016-01-01"), as.Date("2016-12-31"), "1 month"), n, T),
  group = sample(LETTERS[1:2], n, T)
)
diffmonth <- function(d1, d2) {
# http://stackoverflow.com/questions/1995933/number-of-months-between-two-dates
  monnb <- function(d) {
    lt <- as.POSIXlt(as.Date(d, origin="1900-01-01"))
    lt$year*12 + lt$mon
  }
  monnb(d2) - monnb(d1) + 1L
}
df %>% 
  group_by(user, group) %>%
  mutate(cohort = min(transDate), month = diffmonth(cohort, transDate)) %>%
  unite(cohort, cohort, group, remove = T) %>%
  group_by(month, cohort) %>% 
  summarise(n=n()) %>% 
  spread(month, n, fill = 0, drop = F) 
# # A tibble: 16 × 12
#          cohort   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`
# *         <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1  2016-01-01_A     5     1     0     1     1     1     1     0     2     0     0
# 2  2016-02-01_A     1     0     0     0     0     0     0     0     1     0     1
# 3  2016-02-01_B     4     1     2     1     0     1     2     0     1     1     0
# 4  2016-03-01_A     5     0     3     1     2     2     2     0     1     2     0
# 5  2016-03-01_B     4     0     0     0     2     0     1     0     0     0     0
# 6  2016-04-01_A     4     0     2     1     0     1     0     2     1     0     0
# 7  2016-04-01_B     1     0     0     0     0     0     0     0     0     0     0
# 8  2016-05-01_A     2     0     2     2     0     0     2     0     0     0     0
# 9  2016-05-01_B     1     0     0     1     0     0     2     0     0     0     0
# 10 2016-06-01_A     1     0     2     0     0     1     0     0     0     0     0
# 11 2016-06-01_B     4     0     0     0     0     1     1     0     0     0     0
# 12 2016-07-01_A     1     0     1     0     0     0     0     0     0     0     0
# 13 2016-08-01_B     4     1     1     0     0     0     0     0     0     0     0
# 14 2016-09-01_A     1     0     0     0     0     0     0     0     0     0     0
# 15 2016-10-01_B     1     0     0     0     0     0     0     0     0     0     0
# 16 2016-12-01_A     3     0     0     0     0     0     0     0     0     0     0
lukeA
  • 53,097
  • 5
  • 97
  • 100