I have an n x 2 data matrix X and a corresponding array of binary labels y, where y[i] says whether the i-th person was a winner. I'm trying to create a scatter plot with a heatmap behind it that shows the predicted probability of being a winner at each point on the graph. Here is my code so far:

import matplotlib.pyplot as plt
import joblib  # sklearn.externals.joblib was removed from recent scikit-learn releases
from sklearn.linear_model import LogisticRegression

X = joblib.load('X.pkl')
y = joblib.load('y.pkl')
lr = LogisticRegression()
lr.fit(X, y)
plt.scatter(X[y == 1, 0], X[y == 1, 1], color='r', label='winners', s=1)
plt.scatter(X[y == 0, 0], X[y == 0, 1], color='b', label='losers', s=1)
plt.legend()
# Want to add a heatmap in the background for predicted probabilities here
plt.show()

Essentially, I want the background to be more red where the predicted probability is high and more blue where it is low. I can obtain the probabilities for a set of points using lr.predict_proba(X)[:, 1].

How do I color the background so that each point (x1, x2) in the graph gets a color based on its predicted probability of winning?

michaelsnowden
  • Have you read the solution in this [question](http://stackoverflow.com/questions/2369492/generate-a-heatmap-in-matplotlib-using-a-scatter-data-set)? Scattering as a heatmap might not be the same as you want, but this might help you. – Vinícius Figueiredo May 19 '17 at 03:24

1 Answer

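To color each point by its predicted probability, pass the values through c and pick a colormap with cmap. The c array must have the same length as the points being plotted, so index the probabilities with the same mask as X:

props = lr.predict_proba(X)
# Winners: shade by P(y == 1), which is column 1 of predict_proba
plt.scatter(X[y == 1, 0], X[y == 1, 1], c=props[y == 1, 1], cmap='Reds', label='winners', s=1)
# Losers: shade by P(y == 0), which is column 0 of predict_proba
plt.scatter(X[y == 0, 0], X[y == 0, 1], c=props[y == 0, 0], cmap='Blues', label='losers', s=1)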

If you want different colormaps, check this link for a full list: https://matplotlib.org/examples/color/colormaps_reference.html

Gerges
  • This almost works for me, but I was a little unclear in the question. I want every single point on the graph (not just the ones in X) to be assigned a color (i.e. the entire plot is colored) – michaelsnowden May 19 '17 at 03:15
  • You can try plt.pcolormesh(x, y, props). You will need to reshape your arrays depending on your intention... – Gerges May 19 '17 at 03:22
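
Following up on the pcolormesh suggestion, here is a minimal sketch of one way to color the whole background. It builds a regular grid over the plot area with np.meshgrid, evaluates lr.predict_proba on every grid point, reshapes the result back to the grid shape, and draws it with pcolormesh underneath the scatter. The helper name probability_grid, the steps and pad parameters, and the synthetic X and y (standing in for X.pkl and y.pkl) are assumptions for illustration, not part of the original code. shading='auto' assumes matplotlib 3.3 or newer.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression


def probability_grid(model, X, steps=200, pad=0.5):
    """Evaluate P(y == 1) on a regular grid covering the range of X (hypothetical helper)."""
    x1 = np.linspace(X[:, 0].min() - pad, X[:, 0].max() + pad, steps)
    x2 = np.linspace(X[:, 1].min() - pad, X[:, 1].max() + pad, steps)
    xx1, xx2 = np.meshgrid(x1, x2)
    # Flatten the grid into an (steps * steps, 2) array so the model can score it.
    grid = np.column_stack([xx1.ravel(), xx2.ravel()])
    # Column 1 of predict_proba corresponds to model.classes_[1], i.e. label 1 here.
    probs = model.predict_proba(grid)[:, 1].reshape(xx1.shape)
    return xx1, xx2, probs


# Synthetic stand-in for the data loaded from X.pkl / y.pkl (assumption).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (200, 2)), rng.normal(-1.0, 1.0, (200, 2))])
y = np.array([1] * 200 + [0] * 200)

lr = LogisticRegression()
lr.fit(X, y)
xx1, xx2, probs = probability_grid(lr, X)

fig, ax = plt.subplots()
# 'RdBu_r' runs from blue (low probability) to red (high probability).
mesh = ax.pcolormesh(xx1, xx2, probs, cmap='RdBu_r', vmin=0, vmax=1,
                     shading='auto', alpha=0.4)
fig.colorbar(mesh, ax=ax, label='predicted probability of winning')
ax.scatter(X[y == 1, 0], X[y == 1, 1], color='r', label='winners', s=1)
ax.scatter(X[y == 0, 0], X[y == 0, 1], color='b', label='losers', s=1)
ax.legend()
```

Call plt.show() afterwards to display the figure. The alpha on the mesh is only there so the red and blue dots stay visible against a background that uses the same hues; ax.contourf(xx1, xx2, probs, ...) should work as a drop-in alternative if you prefer banded contours.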