Say I have this table in a dataframe:

     DATE          SUNHOUR    YEAR
---  ----------  ---------  ------
281  2018-10-09       11.1    2018
 29  2018-01-30        6.5    2018
266  2018-09-24        6.2    2018
115  2018-04-26       13.4    2018
 69  2018-03-11        7.3    2018
158  2019-06-08       13.7    2019
287  2019-10-15        8.5    2019
177  2019-06-27       15.9    2019
136  2019-05-17       11.5    2019
 59  2019-03-01       10.1    2019

This will give me a scatterplot:

df.plot.scatter(x='DATE', y='SUNHOUR')

[scatter plot image]

Now, when I look at the documentation, I read that the parameter c can take a column name or position whose values will be used to color the marker points according to a colormap. So I thought this would work to have a different color for each year:

df.plot.scatter(x='DATE', y='SUNHOUR', c='YEAR')

But this returns:

ValueError: 'c' argument must be a color, a sequence of colors, or a sequence of numbers, not ['2018' '2018' '2018' '2018' '2018' '2019' '2019' '2019' '2019' '2019']

What am I missing?

mrgou
  • I think the type of `YEAR` column is `object`. You could try to run this - `df['YEAR'] = df['YEAR'].astype(int)` and then run the `plot` command. – Sajan May 04 '20 at 12:51
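
The conversion suggested in this comment can be sketched as follows. This is a minimal example, not a confirmed fix for the asker's exact data: the small DataFrame is rebuilt by hand from a few rows of the table in the question, `YEAR` is assumed to hold strings (as the error message suggests), and parsing `DATE` with `pd.to_datetime` as well as the `viridis` colormap are illustrative choices that do not appear in the original post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import pandas as pd

# A few rows from the question's table; YEAR is deliberately stored as strings
# to mirror the values shown in the error message (assumption).
df = pd.DataFrame({
    "DATE": ["2018-10-09", "2018-01-30", "2019-06-08", "2019-10-15"],
    "SUNHOUR": [11.1, 6.5, 13.7, 8.5],
    "YEAR": ["2018", "2018", "2019", "2019"],
})

# Assumption: parse DATE so the x axis is treated as a time axis.
df["DATE"] = pd.to_datetime(df["DATE"])

# Convert YEAR to integers so its values can be mapped through a colormap.
df["YEAR"] = df["YEAR"].astype(int)

# With a numeric column, c='YEAR' colors each point by year.
ax = df.plot.scatter(x="DATE", y="SUNHOUR", c="YEAR", colormap="viridis")
```

Because the years are now numbers, pandas treats them as a continuous scale and draws a colorbar rather than a discrete legend.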

1 Answer

According to the documentation:

c : str, int or array_like, optional
    The color of each point. Possible values are:
    * A single color string referred to by name, RGB or RGBA code, for instance ‘red’ or ‘#a98d19’.
    * A sequence of color strings referred to by name, RGB or RGBA code, which will be used for each point’s color recursively. For instance [‘green’,’yellow’] all points will be filled in green or yellow, alternatively.
    * A column name or position whose values will be used to color the marker points according to a colormap.

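You can't pass just any values. As the error message says, `c` must be a color, a sequence of colors, or a sequence of numbers, so the column has to contain either numbers (which are mapped through a colormap) or valid colour names such as "green" and "red". The strings '2018' and '2019' are neither.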
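
One way to get a column of colour values is to map each year to a colour name and pass the result as `c`. The sketch below is only an illustration under stated assumptions: the `palette` dictionary, the `tab:blue` and `tab:orange` colour names, and the hand-built DataFrame (a few rows from the question, with `YEAR` stored as strings) are hypothetical choices, not something taken from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import pandas as pd

# A few rows from the question's table; YEAR is kept as strings (assumption).
df = pd.DataFrame({
    "DATE": ["2018-10-09", "2018-01-30", "2019-06-08", "2019-10-15"],
    "SUNHOUR": [11.1, 6.5, 13.7, 8.5],
    "YEAR": ["2018", "2018", "2019", "2019"],
})
df["DATE"] = pd.to_datetime(df["DATE"])

# Hypothetical palette: one valid matplotlib colour name per year.
palette = {"2018": "tab:blue", "2019": "tab:orange"}

# Build one colour per row and pass the sequence of colours as c.
colors = df["YEAR"].map(palette).tolist()
ax = df.plot.scatter(x="DATE", y="SUNHOUR", c=colors)
```

This keeps `YEAR` as text and gives each year one fixed colour, at the cost of maintaining the mapping by hand.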

For what you're trying to do, take a look here

Zionsof