I am trying to forecast a value 30 days ahead. I have time series data with several parameters. An example of the data is attached at the bottom.

The main idea is that Y is the target variable, to be predicted 30 days from today. The variables f1-f5 are values that influence Y. So I need to predict Y using the Date and f1-f5 columns. New data arrives every day.

Could you please recommend some ML and DL approaches for predicting Y?

My thoughts: as I understand it, this is time series data and the task is regression. But I am a bit disappointed, because time series approaches, as far as I understand, predict a value based only on the date, using seasonality and so on. And I am afraid that if I use XGBoost or linear regression, I will lose the time series effect in this data.

Date,f1,f2,f3,f4,f5,Y
2015-01-01,183,34,15,1166,50,3251
2015-01-02,364,173,5,739,32,8132
2015-01-03,83,72,38,551,49,6271
2015-01-04,183,81,7,937,32,3334
2015-01-05,324,61,73,554,71,3742
2015-01-06,183,97,15,337,17,5543
2015-01-07,38,152,83,883,32,9143
2015-01-08,78,72,5,551,11,6435
2015-01-09,183,30,21,443,92,4353
...,...,...,...,...,...,...
2018-06-08,924,9,53,897,88,7446

1 Answer

Time series are traditionally modeled with AR (auto-regression) and MA (moving average). Trend and seasonality should also be accounted for. So why not use ARIMA or Prophet? Here's some theory on the subject - https://otexts.com/fpp2/
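
To make the auto-regression idea concrete, here is a minimal sketch of an AR(p) model fitted by ordinary least squares and then rolled forward 30 steps. It is an illustration only, not a replacement for ARIMA or Prophet: the helper names `fit_ar` and `forecast_ar` are hypothetical, the data is simulated, and there is no MA term, trend or seasonality handling.

```python
import numpy as np


def fit_ar(y, p):
    """Least-squares fit of y[t] = c + a1*y[t-1] + ... + ap*y[t-p]."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Each row holds [1, y[t-1], ..., y[t-p]] for t = p .. n-1.
    X = np.column_stack([np.ones(n - p)] + [y[p - k:n - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [c, a1, ..., ap]


def forecast_ar(y, coef, steps):
    """Recursive forecast: each prediction is fed back in as a new lag."""
    p = len(coef) - 1
    history = list(np.asarray(y, dtype=float)[-p:])
    preds = []
    for _ in range(steps):
        lags = history[::-1][:p]  # most recent value first
        nxt = coef[0] + np.dot(coef[1:], lags)
        preds.append(nxt)
        history.append(nxt)
    return np.array(preds)


# Simulated AR(1) series (synthetic, only to show that the fit recovers phi).
rng = np.random.default_rng(42)
n, c, phi = 2000, 2.0, 0.8
y = np.empty(n)
y[0] = c / (1 - phi)
for t in range(1, n):
    y[t] = c + phi * y[t - 1] + rng.normal()

coef = fit_ar(y, p=1)
preds = forecast_ar(y, coef, steps=30)
print("estimated [c, phi]:", coef)
print("30-step-ahead forecast:", preds[-1])
```

With a pure AR model the long-horizon forecast drifts toward the series mean, which is one reason a 30-day forecast usually benefits from trend, seasonality or extra regressors.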

There are some ML/DL implementations based on RNN/LSTM, but they are really complex, often hard to explain, and tend to suffer from the vanishing gradient problem. If you must use ML/DL, you may want to have a look at LSTNet.
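
If an ML regressor is preferred, one common framing (a sketch under assumptions, not something this answer prescribes) is to turn the series into a supervised table: the features known on day t, plus lagged values of Y, are paired with Y on day t + 30. The column names follow the sample in the question; the data below is random stand-in data, and `make_supervised`, `design`, the lag choice `(1, 7)` and the day-of-week feature are hypothetical choices. Ordinary least squares is used only to keep the example dependency-free; a tree-based model such as XGBoost could be trained on the same table.

```python
import numpy as np
import pandas as pd

HORIZON = 30  # days ahead, as in the question


def make_supervised(df, horizon=HORIZON, lags=(1, 7)):
    """Pair the features known on day t with Y observed on day t + horizon."""
    out = df.copy()
    for lag in lags:
        out[f"Y_lag{lag}"] = out["Y"].shift(lag)  # past values of the target
    out["dayofweek"] = out.index.dayofweek         # simple calendar feature
    out["target"] = out["Y"].shift(-horizon)       # value to predict
    return out


# Synthetic stand-in data (random numbers, NOT the asker's real dataset).
rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", periods=400, freq="D")
df = pd.DataFrame(rng.integers(0, 1000, size=(400, 5)),
                  index=dates, columns=["f1", "f2", "f3", "f4", "f5"])
df["Y"] = rng.integers(3000, 9500, size=400)

table = make_supervised(df)
feature_cols = [c for c in table.columns if c != "target"]
train = table.dropna()  # drops rows lacking lag history or a known target


def design(frame):
    """Feature matrix with an intercept column."""
    X = frame[feature_cols].to_numpy(dtype=float)
    return np.column_stack([np.ones(len(X)), X])


# Ordinary least squares stands in for any regressor (e.g. XGBoost).
coef, *_ = np.linalg.lstsq(design(train), train["target"].to_numpy(), rcond=None)

# The most recent row has no target yet: this is the actual 30-day-ahead forecast.
latest = table.iloc[[-1]]
forecast = (design(latest) @ coef).item()
target_date = (latest.index[0] + pd.Timedelta(days=HORIZON)).date()
print(f"Forecast for {target_date}: {forecast:.1f}")
```

When evaluating such a model, split the table chronologically (train on earlier rows, validate on later ones) rather than randomly, otherwise future information leaks into training.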

Maxim Volgin
  • As I understand it, if I use ARIMA, I will lose the dependencies on the other features, am I right? – Nikita Belov Dec 12 '19 at 09:19
  • Ah, I see your point. So you have a multivariate time series. LSTNet can deal with it, but it won't be easy. – Maxim Volgin Dec 12 '19 at 09:23
  • Can you recommend any materials to read about it? I have never worked with time series data. Or maybe you could give me your social network link so I can ask some questions about time series analysis? – Nikita Belov Dec 12 '19 at 09:26
  • @NikitaBelov, [this](https://otexts.com/fpp2/) book uses R but all models are probably available in Python somewhere. The book will focus on using specific forecasting tools rather than ML models. – cimentadaj Dec 12 '19 at 09:29
  • The book I mentioned in the answer is a good start. For multivariate time series, there is no simple answer, just search on the web what people do about it. Solution depends a lot on your data, so maybe try a few and see how they perform. – Maxim Volgin Dec 12 '19 at 09:30
  • @cimentadaj hmm, okay. Thank you, I will check it. – Nikita Belov Dec 12 '19 at 09:31
  • I think https://stats.stackexchange.com would be a better fit for this question as it is not specifically about programming. – warped Dec 12 '19 at 09:31
  • @MaximVolgin ok, thank you for your recommendations. I hope that it will work:) – Nikita Belov Dec 12 '19 at 09:32
  • @cimentadaj Is this book suitable for multivariate time series? – Nikita Belov Dec 12 '19 at 09:41
  • @MaximVolgin is right, it is a multivariate time series. You can go for deep learning methods or a vector autoregression (VAR) model. – Akash Kumar Dec 12 '19 at 09:44
  • @NikitaBelov, you can find some multivariate time series approaches in Chapter 11 of the book but I haven't checked them thoroughly. This is just off the top of my head. – cimentadaj Dec 12 '19 at 09:50
  • @Akash_Kumar ok, thank you, I will try a VAR model and a recurrent NN. If there are more ideas, I will test them too:) – Nikita Belov Dec 12 '19 at 10:18
  • @cimentadaj I see. Thank you, I will check it, I hope it will be helpful:) – Nikita Belov Dec 12 '19 at 10:19
  • @NikitaBelov I would start with Prophet to establish a baseline, and try to improve from there - https://stackoverflow.com/questions/54544285/is-it-possible-to-do-multivariate-multi-step-forecasting-using-fb-prophet – Maxim Volgin Dec 12 '19 at 10:33
  • @MaximVolgin Would an XGBoost regression model be suitable for me if I convert the 1 min time series to a 30-day time series? Some data scientists have used it for time series data, and as I understand it, it takes my features into account, doesn't it? – Nikita Belov Dec 12 '19 at 10:37
  • @MaximVolgin This article is very helpful, thank you. I have one more question about multivariate time series. Should I apply some approach to make every time series stationary? I have read many articles, and nobody seems to consider whether the time series is stationary or not. – Nikita Belov Dec 12 '19 at 10:40
  • @NikitaBelov some methods (like auto-ARIMA) take care of it automatically, but in general you need to make the time series stationary (remove trend, seasonality, etc.) and reapply those components later to the model's forecast. Read the chapter on residuals analysis - https://otexts.com/fpp2/residuals.html – Maxim Volgin Dec 12 '19 at 10:51
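
The "remove, forecast, reapply" step from the last comment can be sketched for the simplest case, first-order differencing. This is a minimal illustration, assuming a trend-only adjustment: the Y values are the nine sample rows from the question, while the three-step horizon and the mean-difference placeholder model are assumptions made for the example. Seasonal adjustment is not shown.

```python
import numpy as np
import pandas as pd

# Y values from the sample rows in the question (2015-01-01 to 2015-01-09).
y = pd.Series([3251, 8132, 6271, 3334, 3742, 5543, 9143, 6435, 4353], dtype=float)

# 1) Remove the trend by first-order differencing: d[t] = y[t] - y[t-1].
d = y.diff().dropna()

# 2) Forecast the differenced series. Placeholder model: repeat the mean difference.
steps = 3
d_forecast = np.full(steps, d.mean())

# 3) Undo the differencing: add the cumulated forecast differences
#    to the last observed level.
y_forecast = y.iloc[-1] + np.cumsum(d_forecast)

# Sanity check: inverting the differences of the history reproduces the history.
restored = y.iloc[0] + d.cumsum()
assert np.allclose(restored.to_numpy(), y.iloc[1:].to_numpy())

print(y_forecast)
```

In practice the placeholder in step 2 would be replaced by the actual model fitted on the differenced series; step 3 stays the same.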