I have a pandas dataframe. I'd like to make a scatter plot of two different quantities and color the scatter plot points by category.
I have a dictionary that describes the mapping from categories to colors, and I'd like to use that dictionary.
Here's what I have:
import pandas as pd
import matplotlib.pyplot as plt
weights = [1.0, 1.1, 1.3, 2.6, 5.1]
volumes = [2.1, 4.3, 2.6, 2.7, 9.6]
fruits = ['apple', 'banana', 'banana', 'apple', 'coconut']
data_dict = {'weight': weights, 'vol': volumes, 'class': fruits}
df = pd.DataFrame(data_dict)
color_dict = {'apple': 'r', 'banana': 'yellow', 'coconut': 'lime'}
plt.scatter(df['weight'], df['vol'], c=[color_dict[i] for i in df['class']])
plt.xlabel('Weight')
plt.ylabel('Volume')
plt.colorbar()
plt.show()
This is the result (image not shown). I'd like
- a discrete colorbar with the category colors and labels
- with colors specified by a dictionary.
How can this be achieved? Can it be done in pandas, or do I have to use matplotlib commands?
Addition:
The following one-liner achieves the same result as above in pandas.
df.plot(kind="scatter", x="weight", y="vol", c=df['class'].map(color_dict), colorbar=True)
Doing:
df['class'] = df['class'].astype('category')
df.plot(kind="scatter", x="weight", y="vol", c='class')
gives a categorical colormap with labels, but how can I supply my custom colormap?
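For reference, here is a minimal sketch of one direction that might work in plain matplotlib: build a `ListedColormap` from the dictionary, plot integer category codes, and label the colorbar ticks by hand. The names `categories`, `codes`, `cmap`, `norm` and `cbar` are my own, and the category order is assumed to be the dictionary's insertion order. I'm not sure whether pandas' `df.plot` offers a shorter route.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm, ListedColormap

df = pd.DataFrame({
    'weight': [1.0, 1.1, 1.3, 2.6, 5.1],
    'vol': [2.1, 4.3, 2.6, 2.7, 9.6],
    'class': ['apple', 'banana', 'banana', 'apple', 'coconut'],
})
color_dict = {'apple': 'r', 'banana': 'yellow', 'coconut': 'lime'}

# Assumption: the dictionary's insertion order defines the category order.
categories = list(color_dict)

# Integer code per row: 0 = apple, 1 = banana, 2 = coconut.
codes = pd.Categorical(df['class'], categories=categories).codes

# One colormap entry per category, taken from the dictionary.
cmap = ListedColormap([color_dict[c] for c in categories])

# Bin edges at -0.5, 0.5, 1.5, 2.5 so each integer code sits in its own bin.
norm = BoundaryNorm(np.arange(len(categories) + 1) - 0.5, cmap.N)

fig, ax = plt.subplots()
sc = ax.scatter(df['weight'], df['vol'], c=codes, cmap=cmap, norm=norm)
ax.set_xlabel('Weight')
ax.set_ylabel('Volume')

# Put one tick in the middle of each color band and label it with the category.
cbar = fig.colorbar(sc, ax=ax, ticks=range(len(categories)))
cbar.set_ticklabels(categories)
```

Calling `plt.show()` afterwards should display the figure. The `BoundaryNorm` is there so the colorbar is drawn as discrete bands rather than a continuous gradient.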