I am building a model on the House Pricing dataset. My goal is to get the MSE and then predict a price from input values.
I scale the data with MinMaxScaler() and train the model with LinearRegression(). After that I get the score, MSE, MAE, and RMSE.
But when I try to predict a price, the result does not match the actual prices, because the data was scaled. How can I get the prediction as an actual price?
This is my script:
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split
train = pd.read_csv('train.csv')
column = ['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars', 'TotalBsmtSF', 'FullBath', 'YearBuilt']
train = train[column]
# Scale every column (features and target) with MinMaxScaler
scaler = MinMaxScaler()
train[column] = scaler.fit_transform(train[column])
X = train.drop('SalePrice', axis=1)
y = train['SalePrice']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=15)
# Calling LinearRegression
model = LinearRegression()
# Fit the linear regression model on the training data
model = model.fit(X_train, y_train)
y_pred = model.predict(X_test)
# Calculate MSE (Lower better)
mse = mean_squared_error(y_test, y_pred)
print("MSE of testing set:", mse)
# Calculate MAE
mae = mean_absolute_error(y_test, y_pred)
print("MAE of testing set:", mae)
# Calculate RMSE (Lower better)
rmse = np.sqrt(mse)
print("RMSE of testing set:", rmse)
# Predict the house price from input values:
overal_qual = 6
grlivarea = 1217
garage_cars = 1
totalbsmtsf = 626
fullbath = 1
year_built = 1980
predicted_price = model.predict([[overal_qual, grlivarea, garage_cars, totalbsmtsf, fullbath, year_built]])
print("Predicted price:", predicted_price)
The result:
MSE of testing set: 0.0022340806066149734
MAE of testing set: 0.0334447655149599
RMSE of testing set: 0.04726606189027147
Predicted price: [811.51843959]
The price should be something like 208500, 181500, or 121600, i.e. a dollar value in the hundreds of thousands.
What step did I miss here?
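
One possible direction, sketched under assumptions: the model above was trained on scaled features and a scaled target, but the prediction input is passed in raw units and the output is never converted back. A common pattern is to keep one scaler for the features and a separate one for the target, transform the new input with the feature scaler, and call inverse_transform on the prediction. The snippet below uses small synthetic data in place of train.csv and only two of the columns; the names x_scaler, y_scaler, and new_house are illustrative and not from the original script.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for train.csv (assumption: made-up values, two features only)
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "OverallQual": rng.integers(1, 11, size=200),
    "GrLivArea": rng.integers(500, 4000, size=200),
})
y = 20000 * X["OverallQual"] + 50 * X["GrLivArea"] + rng.normal(0, 5000, size=200)

# Separate scalers, so the target can be un-scaled on its own later
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
X_scaled = x_scaler.fit_transform(X)
y_scaled = y_scaler.fit_transform(y.to_numpy().reshape(-1, 1)).ravel()

model = LinearRegression().fit(X_scaled, y_scaled)

# Scale the new input with the SAME feature scaler used for training
new_house = pd.DataFrame({"OverallQual": [6], "GrLivArea": [1217]})
pred_scaled = model.predict(x_scaler.transform(new_house))

# Map the scaled prediction back to the original price units
pred_price = y_scaler.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
print("Predicted price:", pred_price)
```

With this layout, pred_scaled stays in the 0 to 1 range the model was trained on, and pred_price is back in dollars. The same idea applies to the six-column script above: fit one scaler on the feature columns and another on SalePrice instead of a single scaler on all columns.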