I have a set of data that I want to cluster by year, and in the final plot I want to show which color corresponds to which year. This is my code:

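# One year per point; used to color the scatter plot
col = [x[0].get_year() for x in vectors]
plt.scatter(X_train_pca[:, 0], X_train_pca[:, 1], c=col)

# Annotate every point with its title (_ is a helper defined elsewhere)
n = [x[0].title for x in vectors]
for i, txt in enumerate(n):
    plt.annotate(_(txt), (X_train_pca[i, 0], X_train_pca[i, 1]))

plt.title("Poem Clustering by year")
plt.savefig(newpath + "Clustering_by_year" + ".png", bbox_inches='tight')
print("DONE!")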

This is what I mean; I want something similar to this: (image). I don't know what to search for. I tried, but I couldn't find anything related.

Zahra Hosseini

1 Answer

Adapted from the answer found here, we can use seaborn to do this. Most seaborn plots have a hue parameter, which is what you're looking for: it colors the points by the values of a column and lists those values in the legend. In this example I've set hue to species; in your case it would be year, or whatever your column is named.

You will of course need to update the final part of the code, where the labels are generated, with your own variables.

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

df_iris = sns.load_dataset("iris")

plt.figure(figsize=(20, 10))
p1 = sns.scatterplot(x='sepal_length',  # Horizontal axis
                     y='sepal_width',   # Vertical axis
                     data=df_iris,      # Data source
                     legend=True,
                     hue='species')     # One color per species, listed in the legend

# Label every point with its species name, slightly to the right of the marker
for line in range(df_iris.shape[0]):
    p1.text(df_iris.sepal_length[line] + 0.01, df_iris.sepal_width[line],
            df_iris.species[line], horizontalalignment='left',
            size='medium', color='black', weight='semibold')

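If you would rather stay with plain matplotlib, as in the question, a legend that maps each color to its year can probably be built from the scatter object itself. The following is only a rough sketch: points, years and titles are made-up stand-ins for X_train_pca, the get_year() values and the poem titles, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the question's data (not the real poems):
# 2-D coordinates, one year per point, one title per point.
rng = np.random.default_rng(0)
points = rng.normal(size=(12, 2))
years = np.array([1990, 1995, 2000, 2005] * 3)
titles = [f"poem {i}" for i in range(len(points))]

fig, ax = plt.subplots(figsize=(8, 6))
scatter = ax.scatter(points[:, 0], points[:, 1], c=years, cmap="viridis")

# num=None asks for one legend handle per distinct value passed to c=,
# returned in ascending order of those values.
handles, _auto_labels = scatter.legend_elements(num=None)

# np.unique is also ascending, so these labels line up with the handles.
year_labels = [str(int(y)) for y in np.unique(years)]
ax.legend(handles, year_labels, title="Year")

# Annotate each point with its title, as the question's loop does.
for (px, py), title in zip(points, titles):
    ax.annotate(title, (px, py))

ax.set_title("Poem Clustering by year")
```

With many distinct years a colorbar (fig.colorbar(scatter)) may read better than a legend. Saving the figure works the same way as in the question, through savefig.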

Chris