I am trying to solve a problem similar to Fisher's iris classification. The catch is that I can train the model on my own computer, but the fitted model then has to predict class membership on a machine where Python and scikit-learn cannot be installed. So I want to understand how, given the coefficients of a fitted logistic regression model, I can predict the class of an observation myself, without calling the model's predict method. Using the iris problem as an example, I do the following.
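As I understand it (this is my assumption about what scikit-learn does internally, which is exactly what I want to confirm), a multinomial logistic regression model computes the probability of class k for an observation x as a softmax over the linear scores:

p(y = k | x) = exp(w_k · x + b_k) / Σ_j exp(w_j · x + b_j)

where w_k is the k-th row of coef_ and b_k is the k-th entry of intercept_. This is the calculation I try to reproduce below.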
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score, f1_score

# data preparation
iris = load_iris()
data = pd.DataFrame(data=np.hstack([iris.data, iris.target[:, np.newaxis]]),
                    columns=iris.feature_names + ['target'])
names = data.columns
# split data
X_train, X_test, y_train, y_test = train_test_split(data[names[:-1]], data[names[-1]], random_state=42)
# train model
cls = make_pipeline(
    StandardScaler(),
    LogisticRegression(C=2, random_state=42)
)
cls = cls.fit(X_train.to_numpy(), y_train)
preds_train = cls.predict(X_train)
# prediction
preds_test = cls.predict(X_test)
# scores
train_score = accuracy_score(y_train, preds_train), f1_score(y_train, preds_train, average='macro')  # on train data
# train_score = (0.9642857142857143, 0.9653621232568601)
test_score = accuracy_score(y_test, preds_test), f1_score(y_test, preds_test, average='macro')  # on test data
# test_score = (1.0, 1.0)
# model coefficients
cls[1].coef_, cls[1].intercept_
>>> (array([[-1.13948079,  1.30623841, -2.21496793, -2.05617771],
            [ 0.66515676, -0.2541143 , -0.55819748, -0.86441227],
            [ 0.47432404, -1.05212411,  2.77316541,  2.92058998]]),
     array([-0.35860337,  2.43929019, -2.08068682]))
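As an aside on the deployment part: my plan is simply to dump these arrays to a plain text file, something like the sketch below, so that the other machine only needs to read numbers from a file (the file name iris_logreg.json is just a placeholder).

import json

# sketch: export the fitted coefficients as plain JSON so they can be
# read on a machine without Python/scikit-learn installed
with open("iris_logreg.json", "w") as f:
    json.dump({"coef": cls[1].coef_.tolist(),
               "intercept": cls[1].intercept_.tolist()}, f)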
Now I have the coefficients of the model, and I want to use them to make predictions. First, I get the predicted class probabilities from the predict_proba method for the first five observations of the test sample.
preds_test = cls.predict_proba(X_test)
preds_test[0:5]
>>>array([[5.66019001e-03, 9.18455687e-01, 7.58841233e-02],
[9.75854479e-01, 2.41455095e-02, 1.10881450e-08],
[1.18780156e-09, 6.53295166e-04, 9.99346704e-01],
[6.71574900e-03, 8.14174200e-01, 1.79110051e-01],
[6.98756622e-04, 8.09096425e-01, 1.90204818e-01]])
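As far as I know, for this model predict is just the argmax of predict_proba, and a quick check confirms it (cls.classes_ maps column indices back to class labels):

np.allclose(preds_test.sum(axis=1), 1)             # True: each row is a probability distribution
cls.classes_[np.argmax(preds_test[0:5], axis=1)]   # matches cls.predict(X_test)[0:5]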
Then I calculate the class probabilities for the same observations manually, using the coefficients of the model.
# define two functions for making predictions
def logit(x, w):
    return np.dot(x, w)

# from here: https://stackoverflow.com/questions/34968722/how-to-implement-the-softmax-function-in-python
def softmax(z):
    assert len(z.shape) == 2
    s = np.max(z, axis=1)
    s = s[:, np.newaxis]  # necessary step to do broadcasting
    e_x = np.exp(z - s)
    div = np.sum(e_x, axis=1)
    div = div[:, np.newaxis]  # ditto
    return e_x / div

n, k = X_test.shape
X_ = np.hstack((np.ones((n, 1)), X_test))  # add a column of ones for the intercept
weights = np.hstack((cls[1].intercept_[:, np.newaxis], cls[1].coef_))  # weights matrix: intercepts in the first column
results = softmax(logit(X_, weights.T))  # calculate probabilities
results[0:5]
>>>array([[3.67343725e-14, 4.63938438e-06, 9.99995361e-01],
[2.81976786e-05, 8.63083152e-01, 1.36888650e-01],
[1.24572182e-22, 5.47800683e-11, 1.00000000e+00],
[3.32990060e-14, 3.08352323e-06, 9.99996916e-01],
[2.66415118e-15, 1.78252465e-06, 9.99998217e-01]])
If you compare the two results (preds_test[0:5] and results[0:5]), you can see that they do not match at all. Please explain what I am doing wrong and how I can use the model's coefficients to calculate the predictions without calling the predict method.
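For what it's worth, I don't think the softmax implementation itself is the culprit: on random input it agrees with scipy.special.softmax (scipy is available on the training machine, since scikit-learn depends on it):

from scipy.special import softmax as scipy_softmax

z = np.random.randn(5, 3)
np.allclose(softmax(z), scipy_softmax(z, axis=1))  # True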