I have a pandas dataframe. I'd like to make a scatter plot of two different quantities and color the scatter plot points by category.

I have a dictionary that describes the mapping from categories to colors, and I'd like to use that dictionary.

Here's what I have:

import pandas as pd
import matplotlib.pyplot as plt

weights = [1.0, 1.1, 1.3, 2.6, 5.1] 
volumes = [2.1, 4.3, 2.6, 2.7, 9.6]
fruits = ['apple', 'banana', 'banana', 'apple', 'coconut']

data_dict = {'weight': weights, 'vol': volumes, 'class': fruits}

df = pd.DataFrame(data_dict)

color_dict = {'apple' : 'r', 'banana' : 'yellow', 'coconut' : 'lime'}

plt.scatter(df['weight'], df['vol'], c = [color_dict[i] for i in df['class'].iloc[:]])
plt.xlabel('Weight')
plt.ylabel('Volume')
plt.colorbar()
plt.show()

This:

Sample plot

is the result. I'd like

  • a discrete colorbar with the category colors and labels
  • with colors specified by a dictionary.

How can this be achieved? Can it be done in pandas, or do I have to use matplotlib commands?


Addition:

The following one-liner achieves the same result as above in pandas.

df.plot(kind="scatter", x="weight", y="vol", c=df['class'].map(color_dict), colorbar=True)

Doing:

df['class'] = df['class'].astype('category')
df.plot(kind="scatter", x="weight", y="vol", c='class')

gives a categorical colormap with labels, but how can I supply my custom colormap?
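
One possible approach, sketched here with plain matplotlib rather than the pandas plotting wrapper: convert the categories to integer codes in the dictionary's key order, build a `ListedColormap` from the dictionary's colors, and use a `BoundaryNorm` so that each code gets its own colorbar segment. The names `categories`, `codes`, `cmap` and `norm` are illustrative and not part of the code above, and this is only a minimal sketch under those assumptions, not necessarily the best way to do it.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm, ListedColormap

df = pd.DataFrame({
    'weight': [1.0, 1.1, 1.3, 2.6, 5.1],
    'vol': [2.1, 4.3, 2.6, 2.7, 9.6],
    'class': ['apple', 'banana', 'banana', 'apple', 'coconut'],
})
color_dict = {'apple': 'r', 'banana': 'yellow', 'coconut': 'lime'}

# Fix the category order to the dictionary's key order (illustrative choice).
categories = list(color_dict)
# Integer code per row: position of the row's category in `categories`.
codes = pd.Categorical(df['class'], categories=categories).codes

# One colormap entry per category, in the same order as the codes.
cmap = ListedColormap([color_dict[c] for c in categories])
# Bin edges at -0.5, 0.5, 1.5, ... so each integer code sits in its own bin.
norm = BoundaryNorm(np.arange(len(categories) + 1) - 0.5, cmap.N)

fig, ax = plt.subplots()
sc = ax.scatter(df['weight'], df['vol'], c=codes, cmap=cmap, norm=norm)
ax.set_xlabel('Weight')
ax.set_ylabel('Volume')

# Place one tick at the centre of each color segment and label it.
cbar = fig.colorbar(sc, ax=ax, ticks=range(len(categories)))
cbar.set_ticklabels(categories)
```

Calling `plt.show()` afterwards displays the figure. Because the colormap and the codes are both derived from `categories`, adding a new entry to `color_dict` extends the colorbar without further changes.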

Erik
user3517167
