I have a dataset with about 9800 entries. One column contains user names (about 60 distinct user names). I want to generate a scatter plot in matplotlib and assign a different color to each user.
This is basically what I do:
import matplotlib.pyplot as plt
import pandas as pd
x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel', 'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']
# this is basically what the dataframe looks like
df = pd.DataFrame(dict(x=x, y=y, users=users))
# I then add a colors column to the df manually
# I'll just do it the easy, albeit slow, way here
colors = ['red', 'red', 'red', 'green', 'green', 'green', 'blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow']
# this is the dataframe I use for plotting
df1 = pd.DataFrame(dict(x=x, y=y, users=users, colors=colors))
plt.scatter(df1.x, df1.y, c=df1.colors, alpha=0.5)
plt.show()
However, I don't want to assign colors to the users manually. I have to do this many times in the coming weeks and the users are going to be different every time.
I have two questions:
(1) Is there a way to assign colors automatically to the individual users? (2) If so, is there a way to assign a color scheme or palette?
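For illustration, here is a minimal sketch of one possible approach to both questions, not a definitive solution: build a user-to-color mapping by sampling a matplotlib colormap evenly, one sample per distinct user. The helper name `assign_user_colors` and its `cmap_name` parameter are hypothetical names made up for this sketch; `plt.get_cmap` and `Series.map` are existing matplotlib and pandas calls. The choice of `"viridis"` is only an assumption and can be swapped for any registered colormap name.

```python
import matplotlib.pyplot as plt
import pandas as pd


def assign_user_colors(users, cmap_name="viridis"):
    """Hypothetical helper: map each distinct user to an RGBA color.

    Colors are sampled at evenly spaced positions of the named colormap,
    so the mapping adapts to however many users the data contains.
    """
    unique_users = sorted(set(users))
    cmap = plt.get_cmap(cmap_name)
    # Avoid division by zero when there is only one user.
    n = max(len(unique_users) - 1, 1)
    return {user: cmap(i / n) for i, user in enumerate(unique_users)}


x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel',
         'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']
df = pd.DataFrame(dict(x=x, y=y, users=users))

# Build the mapping once, then look up a color for every row.
user_colors = assign_user_colors(df["users"])
point_colors = list(df["users"].map(user_colors))

fig, ax = plt.subplots()
ax.scatter(df["x"], df["y"], c=point_colors, alpha=0.5)
```

Calling `plt.show()` afterwards displays the figure. Passing a different `cmap_name` (for example `"plasma"`) changes the palette. With around 60 users, neighboring colors sampled from a continuous colormap will likely be hard to tell apart, and qualitative maps such as `"tab20"` contain only 20 distinct colors, so some users would end up with similar or repeated colors.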