2

client_id <- (2260419,2260412,2260413,2260415) transaction_date <- (2016-09-03, 2016-09-04, 2016-09-06, 2016-09-07) amount <- (350.0, 250.0,431.0,220.0)

month_ARPU AS
(SELECT
  visit_month,
  Avg(revenue) AS ARPU 
  FROM
  (SELECT
    Cust_id,
    Datediff(MONTH, ‘2010-01-01’, transaction_date) AS visit_month,
    Sum(transaction_size) AS revenue 
    FROM   transactions 
    WHERE  transaction_date &gt; Dateadd(‘year’, -1, CURRENT_DATE)
    GROUP BY
    1,
    2)
  GROUP BY 1)

I understand how SELECT, GROUP BY, COUNT(1), SUM() and AS() works individually but not as a whole like in the code above, mainly how COUNT(1) and SUM() are working.

OzanStats
  • 2,756
  • 1
  • 13
  • 26
Shubham P
  • 21
  • 2
  • Please read [(1)](http://stackoverflow.com/help/how-to-ask) how do I ask a good question, [(2)](http://stackoverflow.com/help/mcve) How to create a MCVE as well as [(3)](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610) how to provide a minimal reproducible example in R. Then edit and improve your question accordingly. I.e., abstract from your real problem... – Christoph Mar 06 '18 at 06:58
  • Doesn't group by use column names not numbers? – Marichyasana Mar 06 '18 at 07:00
  • Also, you can use the `sqldf` library and use your SQL code more or less as-is. – Mako212 Mar 06 '18 at 07:14
  • @Mako212 Library sqldf is not the requirement . – Shubham P Mar 06 '18 at 09:29
  • I'm trying to calculate LTV =ARPU/churn rate.Where ARPU is average monthly recurring revenue per user and the churn rate is the rate at which we are losing customers (so the inverse of retention). – Shubham P Mar 06 '18 at 09:31
  • @Christoph thank you for your guidance . – Shubham P Mar 06 '18 at 09:40
  • it seems you average over revenue but do not aggregate/ select visit_month per level of (1; the grouped variable). This will likely throw an error, do you need the min/max of the visit_month? – timfaber Mar 06 '18 at 09:47
  • Same goes for transaction_date, if you run the `Datediff` command you need to involve transaction_date in the group_by function. Can you run the SQL code without problems? – timfaber Mar 06 '18 at 12:14
  • @timfaber can we make avg revenue per month that will give some idea . – Shubham P Mar 07 '18 at 06:07

1 Answers1

0

Ok, this is an option but I'm making some assumptions. The problem is that there are 3 occasions where a value is selected which is not added in the group by function. Therefore I assume here that the value is unique per level of each variable in the group by clause.

library(tidyverse)

month_ARPU <- transactions %>% 
          group_by(1,2) %>%
          summarise(visit_month = length(seq(from = as.Date("2010-01-01"), to = as.Date(unique(transaction_date)), by = 'month')) - 1,
                    revenue = sum(transaction_size),
                    Cust_id = unique(Cust_id),
                    transaction_date = as.Date(unique(transaction_date)) %>% ungroup() %>%
          filter(transaction_date > as.Date(Sys.time())-365) %>%
          group_by(1,visit_month) %>% 
          summarise(ARPU = mean(revenue))
timfaber
  • 2,060
  • 1
  • 15
  • 17