
I've built a model that predicts values using linear regression. Now I need it to forecast the years 2022-2024. How can I do that? Should I add rows for 2023-2024 to the DataFrame, and would that be correct? Data

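import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# `data` is the DataFrame loaded earlier (loading step not shown)
# use the Year column as a datetime index
data['Year'] = pd.to_datetime(data['Year'])
data = data.set_index('Year')

# fill missing values: backward fill first, then forward fill
data = data.bfill().ffill()
y = data['x4']
X = data[['x1','x3','x5','x6','x7','x8','x9','x10','x11','x14','x15','x17']]

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# fit the model
model = LinearRegression()
model.fit(X_train, y_train)
# predict on the test set
yhat = model.predict(X_test)
# evaluate predictions
mae = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % mae)
# R^2 on the train and test sets
print(model.score(X_train, y_train))
print(model.score(X_test, y_test))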
kiwi555

1 Answer


To predict, you just pass the new data to the trained model:

X_new = new_data[['x1','x3','x5','x6','x7','x8','x9','x10','x11','x14','x15','x17']]
y_new = model.predict(X_new)

Here model is your trained linear regression. You call predict on the new data, which must have the same columns, in the same order and format, as your X_train/X_test, and that's it.
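A minimal sketch of this, using synthetic numbers because the real data is not shown here. The column names match the question; `features`, `new_data`, the random values and the made-up target are all assumptions for illustration only. The point it shows is that `new_data` carries the same feature columns the model was fitted on, one row per year to predict.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

features = ['x1', 'x3', 'x5', 'x6', 'x7', 'x8', 'x9', 'x10', 'x11', 'x14', 'x15', 'x17']
rng = np.random.default_rng(0)

# Synthetic stand-in for the historical data (2000-2022), not the asker's data
years = pd.to_datetime([str(y) for y in range(2000, 2023)], format='%Y')
data = pd.DataFrame(rng.normal(size=(len(years), len(features))),
                    index=years, columns=features)
data['x4'] = data[features].sum(axis=1)  # made-up target for the demo

model = LinearRegression()
model.fit(data[features], data['x4'])

# Hypothetical feature values for 2023 and 2024 (in practice you must supply these)
future_years = pd.to_datetime(['2023', '2024'], format='%Y')
new_data = pd.DataFrame(rng.normal(size=(2, len(features))),
                        index=future_years, columns=features)

# Same columns, same order as during training
X_new = new_data[features]
y_new = pd.Series(model.predict(X_new), index=new_data.index, name='x4_pred')
print(y_new)
```

Selecting `new_data[features]` with the same list used for training keeps the column order consistent, which is what the prediction depends on.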

DataSciRookie
  • I need to use the existing data to get a prediction (forecast?) for 2022-2024, i.e. to get new data – kiwi555 Apr 28 '22 at 18:04
  • Your model is ready, so you have the data for 2022-2024, right? The only thing missing is y, the value to predict? Then you use the 2022-2024 data to predict y_new – DataSciRookie Apr 28 '22 at 18:06
  • I have only this data: https://i.stack.imgur.com/n855O.jpg (data only until 2022) – kiwi555 Apr 28 '22 at 18:10
  • You need the rows for 2023-2024 to predict on them – DataSciRookie Apr 28 '22 at 18:11
  • So how can I get them using linear regression on my data? – kiwi555 Apr 28 '22 at 18:15
  • To use your linear regression you need X, which is data[['x1','x3','x5','x6','x7','x8','x9','x10','x11','x14','x15','x17']]; from that you predict y, which is data['x4']. If you want x4 for 2023-2024, you need at least data[['x1','x3','x5','x6','x7','x8','x9','x10','x11','x14','x15','x17']] for 2023-2024 – DataSciRookie Apr 28 '22 at 18:18
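To make the last comment concrete: the regression maps the features to x4, so forecasting x4 first requires feature values for the future years. If no external projections of the features exist, one possible workaround (an assumption here, not something proposed in the thread) is to extrapolate each feature with its own linear trend over the year and feed those values to the fitted model. The sketch below uses synthetic data and a shortened feature list; `future`, `forecast` and all numbers are hypothetical, and the forecast can only be as good as the extrapolated features.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

features = ['x1', 'x3', 'x5']  # shortened list, for illustration only
rng = np.random.default_rng(1)

# Synthetic history: each feature follows a linear trend in the year plus noise
years = np.arange(2000, 2023)
data = pd.DataFrame(
    {f: (i + 1) * years + rng.normal(size=len(years)) for i, f in enumerate(features)},
    index=years,
)
data['x4'] = data[features].sum(axis=1)  # made-up target for the demo

# The main model: features -> x4
model = LinearRegression().fit(data[features], data['x4'])

# Step 1: extrapolate every feature to the future years with a per-feature trend
future_years = np.array([2023, 2024])
extrapolated = {}
for f in features:
    trend = LinearRegression().fit(years.reshape(-1, 1), data[f])
    extrapolated[f] = trend.predict(future_years.reshape(-1, 1))
future = pd.DataFrame(extrapolated, index=future_years)

# Step 2: predict x4 from the extrapolated features
forecast = pd.Series(model.predict(future[features]), index=future_years, name='x4_forecast')
print(forecast)
```

If the features do not follow a roughly linear trend, a dedicated time-series method on x4 itself may be a better fit than this two-step approach.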