I would like to get a dataframe with a cumulative count of each item sold, taken at the "end" of each day.

For example, I have the following data:

Item    Day Facility
Item_A  Day1    B
Item_B  Day1    A
Item_B  Day1    B
Item_C  Day2    B
Item_A  Day2    A
Item_C  Day2    C
Item_B  Day2    A
Item_B  Day2    A
Item_C  Day3    A

I would like a table which shows, for each item, the total sold at the end of that day plus all previous days. I've included a dummy column (Facility) because my data has additional columns which I don't need.

For example:

Item    Day Facility    Item_Total_Accumulative
Item_A  Day1    B   1
Item_B  Day1    A   2
Item_B  Day1    B   2
Item_C  Day2    B   2
Item_A  Day2    A   2
Item_C  Day2    C   2
Item_B  Day2    A   4
Item_B  Day2    A   4
Item_C  Day3    A   3
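For reference, the input data above can be reproduced as a data frame (a sketch with column names taken from the table, not the asker's actual `dput()`):

```r
# Sample input data from the question
df <- data.frame(
  Item     = c("Item_A", "Item_B", "Item_B", "Item_C", "Item_A",
               "Item_C", "Item_B", "Item_B", "Item_C"),
  Day      = c("Day1", "Day1", "Day1", "Day2", "Day2",
               "Day2", "Day2", "Day2", "Day3"),
  Facility = c("B", "A", "B", "B", "A", "C", "A", "A", "A"),
  stringsAsFactors = FALSE
)
```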
Blaze9
  • @Shree what about cumulative? I don't think this is a dupe of the thread you are pointing to. – M-- Jun 05 '19 at 18:17
  • @M-M I think you are right. This has got to be a duplicate though. – Shree Jun 05 '19 at 18:19
  • @DavidArenburg Please read the question carefully. It needs a cumulative count. This is not a dupe of that thread. – M-- Jun 05 '19 at 18:22
  • @DavidArenburg would you explain how `table(unique(df)$Item)[df$Item]` answers this question? – M-- Jun 05 '19 at 18:25
  • @M-M, Yes that's correct, I've edited the example data. Also I don't think this is a dupe; the answer in that question doesn't do a cumulative total, just a total by two groups. I want a "running total count" per se. – Blaze9 Jun 05 '19 at 18:25
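To illustrate why the one-liner from the comments is not a duplicate solution: `table(unique(df)$Item)[df$Item]` yields each item's overall total (after dropping fully duplicated rows), not a running count by day. A self-contained sketch using the question's sample data:

```r
# Build the question's sample data
df <- data.frame(
  Item     = c("Item_A", "Item_B", "Item_B", "Item_C", "Item_A",
               "Item_C", "Item_B", "Item_B", "Item_C"),
  Day      = c("Day1", "Day1", "Day1", "Day2", "Day2",
               "Day2", "Day2", "Day2", "Day3"),
  Facility = c("B", "A", "B", "B", "A", "C", "A", "A", "A"),
  stringsAsFactors = FALSE
)

# Flat per-item totals over de-duplicated rows -- not cumulative by day
as.vector(table(unique(df)$Item)[df$Item])
#> [1] 2 3 3 3 2 3 3 3 3
```

The desired `Item_Total_Accumulative` column is `1 2 2 2 2 2 4 4 3`, so the flat totals do not answer the question.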

2 Answers


You can use group_by() and mutate() (or count()) to calculate totals per day and per item; for the accumulation, the function cumsum() (cumulative sum) is appropriate.

Since your variable Facility is not numeric, one option is to add a helper column Value = 1 and sum it, but using count() is a better way (it gives the same result).

Then a left_join() with the original data gives you what you are looking for.

library(dplyr)

df %>%
  count(Item, Day) %>%
  group_by(Item) %>%
  mutate(Item_Total_Accumulative = cumsum(n)) %>% 
  select(-n) %>% 
  left_join(df,.)

#> Joining, by = c("Item", "Day")
#>     Item  Day Facility Item_Total_Accumulative
#> 1 Item_A Day1        B                       1
#> 2 Item_B Day1        A                       2
#> 3 Item_B Day1        B                       2
#> 4 Item_C Day2        B                       2
#> 5 Item_A Day2        A                       2
#> 6 Item_C Day2        C                       2
#> 7 Item_B Day2        A                       4
#> 8 Item_B Day2        A                       4
#> 9 Item_C Day3        A                       3
demarsylvain

There should be an easier way, but my quick doodle came out like this:

library(dplyr)
library(tidyr)

df %>%
  group_by(Day, Item) %>%
  count() %>%
  group_by(Item) %>%
  mutate(Item_Total_Accumulative = cumsum(n)) %>% # running total over all previous days
  group_by(Day, Item, Item_Total_Accumulative) %>% 
  expand(n=1:n) %>% 
  select(-n)

#> # A tibble: 9 x 3
#> # Groups:   Day, Item, Item_Total_Accumulative [6]
#>   Day   Item   Item_Total_Accumulative
#>   <fct> <fct>                    <dbl>
#> 1 Day1  Item_A                       1
#> 2 Day1  Item_B                       2
#> 3 Day1  Item_B                       2
#> 4 Day2  Item_A                       2
#> 5 Day2  Item_B                       4
#> 6 Day2  Item_B                       4
#> 7 Day2  Item_C                       2
#> 8 Day2  Item_C                       2
#> 9 Day3  Item_C                       3

Created on 2019-06-05 by the reprex package (v0.3.0)

M--
  • Hi M-M, this gets very close but for some reason my "Item_Total_Accumulative" column "resets" after 20, and starts over. – Blaze9 Jun 05 '19 at 18:30
  • @Blaze9 please share enough data that I can reproduce the error; avoid copy/pasting. Do `dput(head(df[, c("Day","Item","Facility")], 50))` – M-- Jun 05 '19 at 18:31