Let's take the following data:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
N=30
df1 =5*pd.DataFrame(np.random.randn(N, 3), columns=['A', 'B','C'])
df2 =10+10*pd.DataFrame(np.random.randn(N, 3), columns=['A', 'B','C'])
Data=np.concatenate((df1, df2), axis=0)
Data[:,2]=1
Data[0:N,2]=0
y=Data[:,2]
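# Stack the two feature columns side by side; passing Data[:,1] as the
# second positional argument would have made it the index instead.
df=pd.DataFrame(np.c_[Data[:,0],Data[:,1]])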
I want to fit two logistic regression models, given by the following equations:
and
I want to plot the fitted values for each model on one plot (a simple scatter plot). After that, I want to plot two decision boundaries: one for the two-dimensional case and one for the five-dimensional case.
Here is the problem. The two-dimensional case seems very easy: we just do some magic with the coefficients.
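To spell out that "magic": the boundary is the set of points where the fitted probability equals 0.5, which is where the linear predictor is zero. Using the names from the code below (b for the intercept, B1 and B2 for the two coefficients), and assuming B2 is not zero, solving for the second coordinate gives the slope m and offset c of the line:

```latex
\begin{aligned}
P(y = 1 \mid x_1, x_2) = \tfrac{1}{2}
  &\iff b + B_1 x_1 + B_2 x_2 = 0 \\
  &\iff x_2 = -\frac{B_1}{B_2}\,x_1 - \frac{b}{B_2} = m\,x_1 + c,
\end{aligned}
\qquad m = -\frac{B_1}{B_2}, \quad c = -\frac{b}{B_2}.
```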
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
N=30
df1 =5*pd.DataFrame(np.random.randn(N, 3), columns=['A', 'B','C'])
df2 =10+10*pd.DataFrame(np.random.randn(N, 3), columns=['A', 'B','C'])
Data=np.concatenate((df1, df2), axis=0)
Data[:,2]=1
Data[0:N,2]=0
y=Data[:,2]
df=pd.DataFrame(np.c_[Data[:,0],Data[:,1]])
modelL = LogisticRegression()
modelL = modelL.fit(df,y)
LogFit = modelL.predict(df)
b = modelL.intercept_
B1, B2 = modelL.coef_.T
c = -b/B2
m = -B1/B2
xmin, xmax = -20, 60
ymin, ymax = -20, 60
xd = np.array([xmin, xmax])
yd = m*xd + c
plt.plot(xd, yd, 'k', lw=1, ls='--')
plt.fill_between(xd, yd, ymin, color='tab:green', alpha= 1)
plt.fill_between(xd, yd, ymax, color='orange', alpha= 1)
plt.xlim(xmin, xmax)
plt.ylim(ymin, ymax)
plt.scatter(Data[:,0], Data[:,1], s = 5, c = LogFit)
plt.show()
To summarize:
I don't know how to add the second model (given by the second equation) to the plot above, together with its own decision boundary. The biggest problem is plotting this additional boundary: I have no intuition for how to draw it in the five-dimensional case. I want to do this with a logistic regression classifier only, not with a support vector machine or any other tool.
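One possible direction, as a sketch only: instead of solving for a line, evaluate each model's predicted probability on a grid over the two plotted axes and draw the contour at 0.5. This works for any feature map built from the two plotted coordinates. The five-feature map `expand` below (x1, x2, x1^2, x2^2, x1*x2) is a hypothetical stand-in for the second equation, not the actual one, and the seed and output filename are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to display interactively
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
N = 30
# Same two-cluster layout as in the question: class 0 around 0, class 1 around 10.
X = np.vstack([5 * rng.standard_normal((N, 2)),
               10 + 10 * rng.standard_normal((N, 2))])
y = np.r_[np.zeros(N), np.ones(N)]


def expand(X):
    """HYPOTHETICAL five-feature map; an assumption standing in for the second equation."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.c_[x1, x2, x1**2, x2**2, x1 * x2]


# Two logistic regressions: one on the raw 2 features, one on the 5 expanded features.
model2 = LogisticRegression().fit(X, y)
model5 = LogisticRegression(max_iter=5000).fit(expand(X), y)

# Evaluate P(y=1) for both models on a grid covering the plot window.
xx, yy = np.meshgrid(np.linspace(-20, 60, 300), np.linspace(-20, 60, 300))
grid = np.c_[xx.ravel(), yy.ravel()]
p2 = model2.predict_proba(grid)[:, 1].reshape(xx.shape)
p5 = model5.predict_proba(expand(grid))[:, 1].reshape(xx.shape)

fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], s=5, c=y)
# The decision boundary of each model is its 0.5 probability contour.
ax.contour(xx, yy, p2, levels=[0.5], colors="k", linestyles="--")
ax.contour(xx, yy, p5, levels=[0.5], colors="r")
ax.set_xlim(-20, 60)
ax.set_ylim(-20, 60)
fig.savefig("boundaries.png")
```

For the two-feature model the 0.5 contour coincides with the straight line obtained from the coefficients; for the expanded model it is generally a curve. If the five features are not all functions of the two plotted axes, this approach would need the remaining features fixed at chosen values, so the curve would only be a slice of the full boundary.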