
When I plot a grouped scatter plot from pandas (as described in the documentation) where the second group needs a color bar, I get the error `TypeError: You must first set_array for mappable`.

According to other questions about ungrouped scatter plots, this error occurs because `cmap` is only used if `c` is an array of floats. However, the same call works perfectly stand-alone, and the data is not modified between creating the two axes objects.

Here is the code that I am using:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(100, 5), columns=['A', 'B', 'C', 'D', 'E'])

# this works stand-alone
#df.plot(kind='scatter', x='A', y='B', c='C', cmap='Blues')

# why does this break?
ax = df.plot(kind='scatter', x='D', y='E', color='red', label='Other group')
df.plot(kind='scatter', x='A', y='B', c='C', cmap='Blues', ax=ax)
plt.show()

Both groups should be displayed in one plot. Note that it is important for me to plot columns D and E before plotting A, B and C on top of them, so the latter need to be in the second plot. The reverse order works, but it does not meet my requirements.

Does anyone know how to fix this and obtain the desired result?

Thanks in advance!

Cord Kaldemeyer

2 Answers


It seems pandas confuses itself when creating the colorbar internally. You always have the option of creating the colorbar with matplotlib, though.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(100, 5), columns=['A', 'B', 'C', 'D', 'E'])

ax = df.plot(kind='scatter', x='D', y='E', color='red', label='Other group')
df.plot(kind='scatter', x='A', y='B', c='C', cmap='Blues', ax=ax, colorbar=False)
ax.figure.colorbar(ax.collections[1])   # Note the index 1, which stands
                                        # for second scatter in the axes.
plt.show()

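
The colorbar created this way does not carry the label that pandas would otherwise add. A minimal sketch of how it could be labeled, assuming the color-mapped scatter is the last collection added to the axes (the `Agg` backend line and the fixed random seed are assumptions so the sketch runs headless and reproducibly; they are not part of the code above):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed for reproducibility
df = pd.DataFrame(np.random.rand(100, 5), columns=['A', 'B', 'C', 'D', 'E'])

# First group: plain red markers, no colormap involved.
ax = df.plot(kind='scatter', x='D', y='E', color='red', label='Other group')

# Second group: color-mapped, but with the pandas colorbar switched off.
df.plot(kind='scatter', x='A', y='B', c='C', cmap='Blues', ax=ax, colorbar=False)

# The color-mapped scatter is the most recently added collection.
mappable = ax.collections[-1]
cbar = ax.figure.colorbar(mappable, ax=ax)
cbar.set_label('C')  # label the colorbar with the column used for coloring
```

Using index `-1` instead of `1` is only a convenience: it picks the most recently added scatter regardless of how many were drawn before it.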

ImportanceOfBeingErnest
  • Thanks a lot! I have accepted the other answer because it was the first one and it still includes the colorbar label without additional code. – Cord Kaldemeyer Jan 08 '19 at 15:50

Reverse the order of your plotting. I think the colorbar is getting confused about which chart to apply to. Hence, we plot the color-mapped group with its colorbar first and then add the red scatter to the same axes.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(100, 5), columns=['A', 'B', 'C', 'D', 'E'])

# Original order, which raises the TypeError:
# ax = df.plot(kind='scatter', x='D', y='E', color='red', label='Other group')

# Plot the color-mapped group first so that pandas can create its colorbar,
# then add the red group to the same axes.
# The higher zorder keeps the A/B markers drawn on top of the D/E markers.
ax = df.plot(kind='scatter', x='A', y='B', c='C', cmap='Blues', zorder=10)
df.plot(kind='scatter', x='D', y='E', color='red', label='Other group', ax=ax, zorder=1)
plt.show()

Output: [image]

With zorder: [image]
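
For comparison, here is a sketch of the same idea in plain matplotlib, showing that `zorder` rather than call order decides which markers end up on top. The variable names `top` and `bottom`, the seeded random generator, and the `Agg` backend are assumptions made for this illustration only:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
data = rng.random((100, 5))     # columns play the roles of A, B, C, D, E

fig, ax = plt.subplots()

# Added first, but drawn on top because of the higher zorder.
top = ax.scatter(data[:, 0], data[:, 1], c=data[:, 2], cmap='Blues', zorder=10)

# Added second, but drawn underneath because of the lower zorder.
bottom = ax.scatter(data[:, 3], data[:, 4], color='red', label='Other group', zorder=1)

# The colorbar is attached explicitly to the color-mapped scatter.
fig.colorbar(top, ax=ax, label='C')
ax.legend()
```

Because the colorbar is tied explicitly to `top`, the two scatters could be added in either order here.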

Scott Boston
  • Thanks for your help. As I mentioned, I need to plot D and E before columns A, B and C. Put differently, the markers for A, B and C have to "cover" the markers for D and E. Is it somehow possible to get this result? – Cord Kaldemeyer Jan 08 '19 at 15:26
  • Try the `zorder` parameter. See the updated solution, the matplotlib docs, and this [example](https://matplotlib.org/examples/pylab_examples/zorder_demo.html). – Scott Boston Jan 08 '19 at 15:28
  • Perfect! `zorder` was the keyword and you saved my day ;-) Thanks a lot! – Cord Kaldemeyer Jan 08 '19 at 15:49