1

I'm using dplyr and rollmean mean to calculate a 13 Week Moving Average and Growth rates. The following works:

NEW_DATA <- DATA %>% 
    select(CAT, Inventory_Amount, Sales, Shipments, DATE)%>%
    group_by(CAT, DATE)%>%
    summarise(
            INVENTORY = sum(Inventory_Amount),
            SO = sum(Sale),
            SI = sum(Shipments)
    ) %>%
    arrange(CAT, DATE)%>%
    mutate(SO_13WK_AVG = rollmean(x = SO, 13, align = "right", fill = NA ),
           GROWTH = round(((SO - lag(SO, 52)) / lag(SO, 52)) *100,2))

This codes adds two new columns "SO_13WK_AVG" (the 13 week sales average) and Growth (YoY Growth Rate for Sales)

When I try to select an additional variable from the original dataframe to include in the new summarized dataframe, the values for the new variables being created all turn to NA's. The following code generates NA's for the SO_13WK_AVG and GROWTH (all I've done is selected the "WK" variable:

NEW_DATA <- DATA %>% 
    select(CAT, Inventory_Amount, Sales, Shipments, DATE, WK)%>%
    group_by(CAT, DATE, WK)%>%
    summarise(
            INVENTORY = sum(Inventory_Amount),
            SO = sum(Sale),
            SI = sum(Shipments)
    ) %>%
    arrange(CAT, DATE)%>%
    mutate(SO_13WK_AVG = rollmean(x = SO, 13, align = "right", fill = NA ),
           GROWTH = round(((SO - lag(SO, 52)) / lag(SO, 52)) *100,2))

I searched stackoverflow and one found one thread that seems related:

Group/Mutate only returns NA and not an average

This thread suggests using na.rm = TRUE to remove NA values from calculations. However as far as I can tell I don't have any missing values. Any help / commentary is appreciated.

  • 4
    Can you post a sample of data so we can reproduce this? – camille Jul 25 '18 at 18:34
  • 1
    Try this link for an explanation of how to write a [reproducible](https://stackoverflow.com/q/5963269/1422451) example that meets the [MCVE](https://stackoverflow.com/help/mcve) and [`r`](https://stackoverflow.com/tags/r/info) tag description, with the desired output. You can use `dput()`, `reprex::reprex()` or built-in data sets for reproducible data. – Hack-R Jul 25 '18 at 18:36
  • 1
    What happens when you do a `rollmean(..., k = 13)` and there are less than 13 data points? What happens when you `lag(SO, 52)` and there are less than 52 data points? My guess is an `NA` is produced, and when you add `WK` to the grouper you don't have enough data points in each group. But without seeing your data that's just a guess. – Gregor Thomas Jul 25 '18 at 18:55
  • Thanks Camille and Hack-R. Unfortunately I'm unable to post the data at this time. I realize this makes helping me extremely difficult. I'll need to work on masking the data before I can share an example. I appreciate the suggestions – Crash_Overide Jul 25 '18 at 19:09
  • @Gregor Yes. This is definitely possible. I will take a look and get back to you. Thank you! – Crash_Overide Jul 25 '18 at 19:18
  • 1
    I doubt this would solve your specific problem, but you should use `na.rm =TRUE` all the time anyway, just to be safe. – mmyoung77 Jul 25 '18 at 19:46
  • @mmyoung77. Agreed. Thanks – Crash_Overide Jul 25 '18 at 19:47

1 Answers1

0

I just resolved a very similar issue. Can't quite tell whether it will fix yours without spending more time thingking about it, but I was grouping by the two variables which accounted for all of the variation across my data set (location and week). Therefore, the rolling mean was either not able to calculate, or could only create the fill values. Not grouping by "week" solved the issue. Since "WK" is almost certainly 100% dependent on "Date", I expect you have the same issue. Remember, "summarise" drops the last grouping variable from the grouping. Try grouping by WK before your summarise, and then regrouping without week or date.

(BTW, I'm sure you've figured something out, since this was almost two years ago, but I imagine others will encounter this as well, after all, that's why I came to this question.)

Goose
  • 16
  • 2
  • 2
    Welcome to SO. Please include the code which your answer proposes. You may find this link helpful for answering questions [answer] – Peter May 11 '20 at 21:22