I'm applying a batch forecasting method to a data frame (named df5) that looks like this:

 Primary.Base.Product  Variable  Value
    A                     Aug '16    1 
    A                     Sep '16    4
    B                     Aug '16    10
    B                     Sep '16    2
    Z                     Aug '16    6
    Z                     Sep '16    12

I tried the dplyr code suggested by ramhiser in "For loop for forecasting several datasets at once in R":

library(dplyr)
library(smooth)
library(forecast)
library(tstools)
# Create a data frame
Primary.Base.Product <- c('A','A','B','B','C','C')
variable <- c('Aug16','Sep16','Aug16','Sep16','Aug16','Sep16')
value <- c(1,4,10,2,6,12)
df5 <- data.frame(Primary.Base.Product, variable, value)
# Do batch forecasting: fit one ETS model per product
model_fits2 <- group_by(df5, Primary.Base.Product) %>% do(fit=ets(.$value))
head(model_fits2)
forecast(model_fits2$fit[[1]])

It works fine, but how do I split the data into training and test sets and calculate the accuracy using the accuracy() function? Also, how do I calculate the accuracy of the fitted values versus the actual values?

Any sort of help is appreciated! Thanks in advance!

I've tried:

model_fits2 <- group_by(df5, Primary.Base.Product) %>%   
               do(fit=ets(.$value[1:(nrow(df5)-10)]))
model_acc2 <- group_by(df5, Primary.Base.Product) %>% 
              do(acc=accuracy(.$value[(nrow(df5)+1):nrow(df5)],                                
                              forecast(model_fits2$fit,h=10)))

The error was:

Error in ets(object, lambda = lambda, biasadj = biasadj, allow.multiplicative.trend = allow.multiplicative.trend, : y should be a univariate time series

A. Suliman
S Ne.
  • Hi, Please provide [complete reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), and surely someone will help. – A. Suliman Aug 19 '19 at 07:50
  • [This](https://www.r-bloggers.com/batch-forecasting-in-r-2/) may be helpful. – maydin Aug 19 '19 at 08:31
  • Thank you so much for editing my question, @A. Suliman. I've provided a complete example now. – S Ne. Aug 19 '19 at 08:42

1 Answer

Split each group 50/50: fit the model and forecast using the rows with test == 0, then compute the accuracy of that forecast against the rows with test == 1.

library(dplyr)
library(smooth)
library(forecast)
library(tstools)
model_fits2 <- group_by(df5, Primary.Base.Product) %>% 
               arrange(Primary.Base.Product) %>% 
               # n() is the number of observations in the current group;
               # the first half of each group is training (0), the rest is test (1)
               mutate(test=ifelse(row_number() <= n()/2, 0, 1)) %>% 
               do(acc=accuracy(forecast(ets(.$value[.$test==0])), .$value[.$test==1]))
               # If you need the output as a data frame, use tidyr::unnest
               # (rownames_to_column() comes from the tibble package):
               #do(acc=data.frame(accuracy(forecast(ets(.$value[.$test==0])), .$value[.$test==1])) %>% rownames_to_column(var = 'model')) 
               # %>% tidyr::unnest()

Data

df5 <- rbind(df5,df5) # Duplicate the rows so each group has 4 observations (2 train, 2 test)
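
The same idea can be written without dplyr, which also makes the holdout size explicit. The following is only a sketch under assumptions: the data frame `demo`, its columns `product` and `value`, the helper `evaluate_group()`, and the holdout size `h = 3` are all made up for illustration and are not part of the question. When `accuracy()` is given a forecast object together with held-out values, it reports a "Training set" row (fitted values versus actual values on the training part) and a "Test set" row (forecasts versus the held-out values), which should cover both halves of the question.

```r
library(forecast)

# Hypothetical data: three products with 12 observations each (illustration only)
demo <- data.frame(
  product = rep(c("A", "B", "C"), each = 12),
  value   = c(1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14,
              10, 2, 9, 3, 8, 4, 7, 5, 6, 6, 5, 7,
              6, 12, 7, 13, 8, 14, 9, 15, 10, 16, 11, 17)
)

h <- 3  # assumed number of observations held out per product

# Fit ETS on all but the last h values, forecast h steps ahead,
# and compare the forecast with the held-out values.
evaluate_group <- function(y, h) {
  train <- head(y, length(y) - h)
  test  <- tail(y, h)
  fc    <- forecast(ets(train), h = h)
  accuracy(fc, test)
}

# One accuracy matrix per product, stored in a named list
acc_by_product <- lapply(split(demo$value, demo$product), evaluate_group, h = h)

acc_by_product[["A"]]
```

Setting `h` in `forecast()` to the length of the test set keeps the forecast horizon and the held-out values aligned, rather than relying on the default horizon.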

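Note that nrow(df5)+1 : nrow(df5) returns indices outside df5, because the : operator binds more tightly than +, so the expression is evaluated as nrow(df5) + (1:nrow(df5)). Any subset using these indices will therefore return NAs, see below.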

> nrow(df5)+1 : nrow(df5)
[1] 13 14 15 16 17 18 19 20 21 22 23 24
> df5$value[nrow(df5)+1 : nrow(df5)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA
# See the difference between
> 4+1 : 4
[1] 5 6 7 8
> (4+1) : 4
[1] 5 4
# The first one is equivalent to
> 4 + (1:4)
[1] 5 6 7 8
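
To avoid the precedence trap when splitting a series into training and test parts, the index arithmetic can be parenthesised or replaced with `head()` and `tail()`. This is a minimal base R sketch; the vector `y` reuses the six values from the question, and the holdout size `h = 2` is an assumption chosen only for illustration.

```r
y <- c(1, 4, 10, 2, 6, 12)  # the six values from the question
n <- length(y)
h <- 2                      # assumed holdout size

# Buggy: `:` is evaluated before `+`, so this is n + (1:n), i.e. 7..12
bad_idx <- n + 1:n
y[bad_idx]                  # all NA, because every index is past the end of y

# Intended: parenthesise the arithmetic so the range is built last
train_idx <- seq_len(n - h)
test_idx  <- (n - h + 1):n

train <- y[train_idx]
test  <- y[test_idx]

# head()/tail() express the same split without any index arithmetic
identical(train, head(y, -h))
identical(test, tail(y, h))
```
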
A. Suliman
  • Thank you so much for your detailed answer! Yes, `nrow(df5)+1 : nrow(df5)` was a stupid mistake. I should know better. This dplyr method is not very straightforward, so I converted my dataframe into a matrix, transposed the said matrix and looped through each column, applying ETS to each. That being said, I will try your method as well. Thanks again! – S Ne. Aug 21 '19 at 16:56