0

I have a dataframe that has the following structure.

ProdID         Date                    class          price        set
PD-10011       2021-05-01 10:12:16     Regular        1000         ZR
PD-10712       2021-05-02 18:12:06     Premium        1000         ZR
PD-10213       2021-05-02 16:02:59     Premium        1000         ZR
PD-10814       2021-05-03 17:12:06     Premium        1000         RS
PD-10315       2021-05-04 19:10:11     Other          1000         RR
PD-10616       2021-05-04 13:18:14     Expired        1000         ZR
PD-10617       2021-05-04 15:14:19     Regular        1000         ZR

I need to create a funnel view by utilizing the data frame considering the following conditions:

May-21, the month should be auto-populating based on Date. If there are records of multiple months then it should be sequenced by month in descending month order. Here, Total is count of Total unique ProdID for that particular month. Regular is the count and sum of those ProdID having class as "Regular" and similar logic for the "Premium" row. The miscellaneous row is a count of ProdID having classes other than "Regular" and "Premium". RS (Premium) is count and sum of those where the class is "Premium" along with set as "RS".

May-21           Count of ProdID       Percentage Count       Sum of Price        Percentage Sum
Total            7                     100.00%                7000                100.00%
Regular          2                     28.57%                 2000                28.57%
Premium          3                     42.85%                 3000                42.85%
miscellaneous    2                     28.57%                 2000                28.57%
RS (Premium)     1                     33.33%                 1000                33.33%  

         
user9211845
  • 131
  • 1
  • 12

1 Answers1

0

On StackOverflow it is expected that you show your own attempts to solve the problem. Having said that, this should get you started:

library(tidyverse)
library(lubridate)
library(janitor)

df <- tibble::tribble(
                                      ~ProdID,                 ~Date,    ~class, ~price, ~set,
                                   "PD-10011", "2021-05-01 10:12:16", "Regular",  1000L, "ZR",
                                   "PD-10712", "2021-05-02 18:12:06", "Premium",  1000L, "ZR",
                                   "PD-10213", "2021-05-02 16:02:59", "Premium",  1000L, "ZR",
                                   "PD-10814", "2021-05-03 17:12:06", "Premium",  1000L, "RS",
                                   "PD-10315", "2021-05-04 19:10:11",   "Other",  1000L, "RR",
                                   "PD-10616", "2021-05-04 13:18:14", "Expired",  1000L, "ZR",
                                   "PD-10617", "2021-05-04 15:14:19", "Regular",  1000L, "ZR"
                                   )
df %>%
  mutate(Date = ymd_hms(Date)) %>% 
  filter(Date %within% interval("2021-05-01", "2021-05-31")) %>%
  mutate(class = case_when(class == "Premium" & set == "RS" ~ "RS (Premium)",
                           class != "Premium" & class != "Regular" ~ "miscellaneous",
                           class == "Premium" ~ "Premium",
                           class == "Regular" ~ "Regular")) %>% 
  group_by(class, price) %>% 
  summarise(count_of_prod_ID = n()) %>% 
  ungroup() %>% 
  mutate(`Sum of Price` = count_of_prod_ID * price,
         `Count (%)` = round((count_of_prod_ID / 7)*100, 2)) %>%
  janitor::adorn_totals() %>% 
  select(-c(price)) %>% 
  arrange(desc(count_of_prod_ID))
#>        class count_of_prod_ID Sum of Price Count (%)
#>         Total                7         7000    100.00
#> miscellaneous                2         2000     28.57
#>       Premium                2         2000     28.57
#>       Regular                2         2000     28.57
#>  RS (Premium)                1         1000     14.29
jared_mamrot
  • 22,354
  • 4
  • 21
  • 46
  • How to divide `RS (Premium)` count with `Premium` count in last row. – user9211845 May 06 '21 at 09:48
  • You want to count `RS (Premium)` as a `Premium`? But then all of your percentages will be off – jared_mamrot May 06 '21 at 10:13
  • Not for all, I just need it for last row. – user9211845 May 06 '21 at 10:15
  • I'm not sure how to do that. If you count the `RS (Premium)` as a `Premium` (i.e. count it twice) then nothing will add up, e.g. if `RS (Premium) = 1` and `Premium = 3`, then your total prod count will be 8... I'm not sure how to summarise a subset of rows, but then print all rows. Sorry. – jared_mamrot May 06 '21 at 10:26