I have the following dataframe (see below). I am trying to plot y vs y_pred, coloring the markers according to their interval size and also scaling the marker size by interval size, so that markers with a large interval size get a more vibrant color and a bigger size.

y         interval_size  y_pred
0.039268  2.414647       0.487695
0.049268  0.984652       0.326719
0.044148  1.268927       0.520769
0.050269  0.985148       0.415107
0.059282  0.965122       0.467267

I am using the following code to produce the plot:

Figure Code

plt.style.use("seaborn")
sns.set_style("darkgrid")
fig, ax = plt.subplots(figsize=(18,9))
plt.scatter(x = true_labels, y = predictions, c=interval_size, alpha=0.6,
            cmap='viridis', sizes=(20, 150), s = interval_size) 
cbar= plt.colorbar()
cbar.set_label("Interval Width", labelpad=+1, fontsize = 20)
plt.title("True vs Predicted Labels", fontsize = 36)
plt.xlabel("True Labels", fontsize = 25)
plt.ylabel("Predicted Labels", fontsize = 25)

I'm able to produce the plot with colors varying according to interval size, but I can't get the markers to actually be different sizes.

Additionally, is it possible to change the marker type per column, so that points belonging to column y and points belonging to column y_pred use different markers? I tried implementing what was done here, but had no success: stackoverflow_link

I get this warning:

/home/felicia/my_python_env/lib/python3.7/site-packages/matplotlib/collections.py:922: RuntimeWarning: invalid value encountered in sqrt
  scale = np.sqrt(self._sizes) * dpi / 72.0 * self._factor

1 Answer

After playing around a bit and googling some more, I was able to fix it. In case anyone is interested:

import matplotlib.pyplot as plt
import seaborn as sns

# Plot data
# Note: newer matplotlib releases renamed the "seaborn" style to "seaborn-v0_8"
plt.style.use("seaborn")
sns.set_style("darkgrid")
fig, ax = plt.subplots(figsize=(18, 9))
# s is squared so that marker diameter, not area, scales with interval_size
plt.scatter(x=true_labels, y=predictions, c=interval_size, alpha=0.65,
            cmap='viridis', s=(interval_size**2) * 20)

# Plot characteristics
cbar = plt.colorbar()
cbar.set_label("Interval Width", labelpad=+1, fontsize=20)
plt.title("True vs Predicted Labels", fontsize=36)
plt.xlabel("True Labels", fontsize=25)
plt.ylabel("Predicted Labels", fontsize=25)

Please note that the size argument s denotes the area of the markers (in points squared), not their diameter. So, to make the diameter proportional to the quantity being shown, that value has to be squared.
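
To make the area vs diameter point concrete, here is a minimal self-contained sketch using the five rows from the question. The variable names `marker_area`, `diameter_ratio` and `value_ratio` are just illustrative, the factor of 20 is the same arbitrary scale as above, and the Agg backend is only selected so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# The five rows from the question's dataframe
true_labels = np.array([0.039268, 0.049268, 0.044148, 0.050269, 0.059282])
predictions = np.array([0.487695, 0.326719, 0.520769, 0.415107, 0.467267])
interval_size = np.array([2.414647, 0.984652, 1.268927, 0.985148, 0.965122])

# s is an area, so squaring makes the diameter scale linearly with interval_size
marker_area = (interval_size ** 2) * 20

fig, ax = plt.subplots(figsize=(18, 9))
points = ax.scatter(true_labels, predictions, c=interval_size,
                    s=marker_area, cmap="viridis", alpha=0.65)
fig.colorbar(points, ax=ax, label="Interval Width")

# Diameter goes with the square root of the area, so the diameter ratio
# between two markers equals the ratio of their interval sizes
diameter_ratio = np.sqrt(marker_area[0] / marker_area[1])
value_ratio = interval_size[0] / interval_size[1]
print(diameter_ratio, value_ratio)
```

If the raw values were passed as s directly, the area would be proportional to interval_size instead, and the visible difference in diameter would be much smaller.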

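
As for the different marker per column: as far as I know, scatter accepts only a single marker style per call, so one possible approach is to call it once per column. The sketch below is only my guess at what was intended. It assumes y and y_pred are each plotted against the row index, and the marker choices ("o" and "^") and the name `row_index` are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# The five rows from the question's dataframe
y = np.array([0.039268, 0.049268, 0.044148, 0.050269, 0.059282])
y_pred = np.array([0.487695, 0.326719, 0.520769, 0.415107, 0.467267])
interval_size = np.array([2.414647, 0.984652, 1.268927, 0.985148, 0.965122])

row_index = np.arange(len(y))          # assumed x axis: one position per row
marker_area = (interval_size ** 2) * 20

fig, ax = plt.subplots(figsize=(18, 9))
# One scatter call per column, each with its own marker style
sc_true = ax.scatter(row_index, y, c=interval_size, s=marker_area,
                     cmap="viridis", alpha=0.65, marker="o", label="y")
sc_pred = ax.scatter(row_index, y_pred, c=interval_size, s=marker_area,
                     cmap="viridis", alpha=0.65, marker="^", label="y_pred")

# Both calls color by the same array, so one colorbar describes both
fig.colorbar(sc_pred, ax=ax, label="Interval Width")
ax.legend()
```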