I have a matrix of data X which is n x 2 and a corresponding array of binary labels y, saying whether the i-th person was a winner. I'm trying to create a scatter plot with a heatmap on top of it which shows the predicted probability of each point on the graph being a winner. Here is my code so far:
import matplotlib.pyplot as plt
import joblib  # sklearn.externals.joblib was removed in newer scikit-learn releases
from sklearn.linear_model import LogisticRegression
X = joblib.load('X.pkl')
y = joblib.load('y.pkl')
lr = LogisticRegression()
lr.fit(X, y)
plt.scatter(X[y == 1, 0], X[y == 1, 1], color='r', label='winners', s=1)
plt.scatter(X[y == 0, 0], X[y == 0, 1], color='b', label='losers', s=1)
plt.legend()
# Want to add a heatmap in the background for predicted probabilities here
plt.show()
Essentially, I want the background to be more red where the predicted probability is high and more blue where it is low. I can obtain the probabilities for a set of points using lr.predict_proba(X)[:, 1].
How do I color the background so that each point (x1, x2) in the graph is given a color based on its predicted probability of winning?
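To make the goal concrete, here is a minimal sketch of the kind of result I am after, not a definitive solution. It assumes the usual approach of evaluating the model on a regular grid covering the data and drawing that grid behind the scatter plot. The helper name probability_grid, the synthetic data standing in for X.pkl and y.pkl, and the resolution and padding values are all my own placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression


def probability_grid(model, X, resolution=200, pad=0.5):
    """Hypothetical helper: evaluate P(label == 1) on a grid covering the data."""
    x1 = np.linspace(X[:, 0].min() - pad, X[:, 0].max() + pad, resolution)
    x2 = np.linspace(X[:, 1].min() - pad, X[:, 1].max() + pad, resolution)
    xx1, xx2 = np.meshgrid(x1, x2)
    grid = np.column_stack([xx1.ravel(), xx2.ravel()])
    # Column 1 of predict_proba is the probability of the second entry
    # in model.classes_, which is label 1 for 0/1 labels.
    proba = model.predict_proba(grid)[:, 1].reshape(xx1.shape)
    return xx1, xx2, proba


# Synthetic stand-in for X.pkl / y.pkl (assumption: the real data is not available here).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

lr = LogisticRegression()
lr.fit(X, y)

xx1, xx2, proba = probability_grid(lr, X)

fig, ax = plt.subplots()
# RdBu_r maps low values to blue and high values to red; fixed levels span [0, 1].
background = ax.contourf(xx1, xx2, proba, levels=np.linspace(0, 1, 21),
                         cmap="RdBu_r", alpha=0.6)
fig.colorbar(background, ax=ax, label="predicted probability of winning")
ax.scatter(X[y == 1, 0], X[y == 1, 1], color="r", label="winners", s=1)
ax.scatter(X[y == 0, 0], X[y == 0, 1], color="b", label="losers", s=1)
ax.legend()
# In an interactive session, call plt.show() here.
```

With plain logistic regression the background comes out as a smooth gradient across a straight boundary, since the model is linear in x1 and x2. The same grid evaluation should work with any classifier that exposes predict_proba.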