I have around 4 years of US retail data. I aggregated it by (year, week of the year), built some models, and checked the quantity forecast. The performance was not up to the mark. Now I am trying to aggregate the data by week of the year only, ignoring the year (all years show almost the same behavior in the US market, and holidays and events fall on the same dates every year). So I end up with only 52 rows of data. I have around 35 features that I derived earlier, so `stepAIC` gives an infinity error. How do I deal with this issue? Can anyone suggest other good methods for choosing important features instead? Unfortunately, I cannot give more information about the data. Thanks in advance.
-
Don't throw out all that information by collapsing your data to summaries that cover all four years. There just isn't enough statistical power in 52 observations for your task. So keep the weekly time series and try some models that include parameters for seasonality (e.g., ARIMA, bsts). – ulfelder May 12 '17 at 10:30
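To make the suggestion above concrete, here is a minimal sketch in base R of fitting a seasonal model to the full weekly series instead of 52 collapsed rows. Everything in it is an assumption for illustration: the series `y` is synthetic placeholder data, the helper `fourier_terms()` is hypothetical, and the yearly cycle is approximated as exactly 52 weeks. Seasonality is handled with sine and cosine regressors passed to `arima()` through `xreg`, which is a simplification swapped in for a full seasonal ARIMA or a bsts model.

```r
set.seed(1)

n <- 208                       # roughly 4 years of weekly observations
t <- seq_len(n)

# Synthetic placeholder series: a yearly cycle plus noise (not real retail data)
y <- 100 + 10 * sin(2 * pi * t / 52) + rnorm(n, sd = 2)

# Hypothetical helper: one sine/cosine pair encoding a 52-week cycle
fourier_terms <- function(t) {
  cbind(s1 = sin(2 * pi * t / 52),
        c1 = cos(2 * pi * t / 52))
}

# Regression on the seasonal terms with AR(1) errors
fit <- arima(y, order = c(1, 0, 0), xreg = fourier_terms(t))

# Forecast the next 12 weeks; the regressors must be supplied for future weeks
h  <- 12
fc <- predict(fit, n.ahead = h, newxreg = fourier_terms(n + seq_len(h)))
fc$pred
```

This uses all 208 weekly observations to estimate four coefficients, rather than 52 rows to choose among 35 features. Additional regressors, such as holiday indicators, could be appended as further columns of `xreg`.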
-
The US way of numbering weeks may result in incomplete weeks with fewer than 7 days at the beginning or end of the year. This may cause severe problems for your model. I suggest using either `cut()` with `breaks = "week"` and `start.on.monday = FALSE`, or the ISO 8601 week numbering scheme. For more details, see the related question http://stackoverflow.com/questions/43813249/r-round-down-dates-to-first-day-of-the-week/43818261#43818261 and [this comparison](http://stackoverflow.com/a/43806987/3817004) – Uwe May 13 '17 at 08:15
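The difference between the numbering schemes can be seen on a handful of dates around a year boundary. The following is a small sketch in base R. The dates are arbitrary and chosen only because 1 January 2018 falls on a Monday. It assumes the platform's `format()` supports the `%V` and `%G` conversions, which most do.

```r
# Seven consecutive days spanning the 2017/2018 year boundary
d <- as.Date("2017-12-29") + 0:6

# US-style week of the year: weeks start on Sunday, and days before the
# first Sunday of the year fall into week "00"
us_week <- format(d, "%U")

# ISO 8601 week-based year and week number: weeks start on Monday and
# always have 7 days
iso_week <- format(d, "%G-W%V")

# Label each date with the first day (a Sunday) of the week it belongs to
week_start <- cut(d, breaks = "week", start.on.monday = FALSE)

data.frame(date = d, us_week, iso_week, week_start)
```

With `%U`, the last week of 2017 and the first week of 2018 are both partial, whereas the labels from `cut()` and the ISO scheme keep every week at 7 days regardless of the year change.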