
I have the following data:

   a  b  col lab
0  1  6    1   a
1  4  5    2   b
2  1  7    3   c
3  5  5    4   d
4  6  2    5   e

My goal is to make a scatter plot of [a] against [b] in which each value of col gets a different color, with a legend that maps each color to its label.

I tried the following, but it only shows the legend for the first element.

plt.scatter(df['a'], df['b'], c = df['col'])
plt.legend(df['lab'])


What am I doing wrong?

ben890
  • Check out this answer: https://stackoverflow.com/a/37813290/2943652 – Francisca Concha-Ramírez Nov 14 '19 at 16:36
  • it is giving you only one legend because the method `plt.scatter` is being called only once – learner Nov 14 '19 at 16:40
  • Ahh, gotcha. I'm more familiar with R, so I guess I expected matplotlib to work similarly to ggplot. So I must loop through each datapoint. – ben890 Nov 14 '19 at 16:42
  • My dataframe actually has about 8k datapoints. Is this prohibitive to plot in a loop? – ben890 Nov 14 '19 at 16:43
  • [This answer](https://stackoverflow.com/a/58000327/4124317); [This answer](https://stackoverflow.com/a/56507383/4124317); [This answer](https://stackoverflow.com/questions/56394204/pandas-groupby-scatter-plot-in-a-single-plot/56394972#56394972); [This answer](https://stackoverflow.com/a/58449842/4124317) – ImportanceOfBeingErnest Nov 14 '19 at 16:50
  • I'm getting the following error when replicating the first answer mentioned in the thread: Image size of 380x121747 pixels is too large. It must be less than 2^16 in each direction. – ben890 Nov 14 '19 at 19:31
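
For anyone landing here, a minimal sketch of the approach the comments point toward: call `scatter` once per label rather than once per datapoint, so each call contributes one legend entry. It assumes the sample data from the question rebuilt as a DataFrame, and it assumes the real 8k-row frame has only a handful of distinct `lab` values, so the loop runs a few times, not 8k times. Colors here come from matplotlib's default color cycle instead of the `col` column, which is a simplification and not necessarily what the linked answers do.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question (assumed layout of the real DataFrame).
df = pd.DataFrame({
    "a": [1, 4, 1, 5, 6],
    "b": [6, 5, 7, 5, 2],
    "col": [1, 2, 3, 4, 5],
    "lab": ["a", "b", "c", "d", "e"],
})

fig, ax = plt.subplots()

# One scatter call per label: the loop runs once per group, not once per row.
# Each call takes the next color from the default cycle and registers
# its own legend entry through the label argument.
for lab, group in df.groupby("lab"):
    ax.scatter(group["a"], group["b"], label=lab)

ax.legend(title="lab")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. Note that the default color cycle has ten colors, so with more than ten labels the colors start repeating unless you pass an explicit `color=` per group.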

0 Answers